Can someone take a look at running Test Big Linked for > 5 iterations
with the slowDeterministic chaos monkey on a distributed cluster? I'm
pretty concerned about HBASE-9338.

On Tue, Sep 3, 2013 at 6:57 AM, Jean-Marc Spaggiari
<[email protected]> wrote:
> There was a typo in my log4j.properties :(
>
> So it's working fine.
>
> The only INFO logs I still see are these ones:
> 2013-09-03 09:53:07,313 INFO  [M:0;t430s:45176] mortbay.log: Logging to
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> org.mortbay.log.Slf4jLog
> 2013-09-03 09:53:07,350 INFO  [M:0;t430s:45176] mortbay.log: jetty-6.1.26
> But there are only a few of them.
>
> Performance-wise, here are the numbers (the higher, the better: rows per
> second, except for scans where it's rows/min). As you will see, 0.96 is
> slower only for RandomSeekScanTest (way slower) and RandomScanWithRange10
> but is faster for everything else. I ran the tests with the default
> settings. So I think we should look at RandomSeekScanTest, but except for
> this one, everything else is pretty good.
>
> Also, I have been able to reproduce this exception:
> 2013-09-03 09:55:36,718 WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181]
> server.NIOServerCnxn: caught end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x140e4191edb0009, likely client has closed socket
>     at
> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
>     at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
>     at java.lang.Thread.run(Thread.java:662)
>
> Just had to run PE and kill it in the middle.
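For context on that warning: EndOfStreamException just means the ZooKeeper
server's read on the client connection returned end-of-stream, because the
peer (here, the killed PE client) closed its socket. A minimal sketch of that
mechanism with plain sockets, not ZooKeeper itself:

```python
import socket

# Simulate the server/client pair: when the client side closes abruptly
# (like a killed PE run), the server's next read returns end-of-stream.
server_side, client_side = socket.socketpair()
client_side.close()            # client goes away mid-session
data = server_side.recv(1024)  # server tries to read more from the client
print(data == b"")             # True: nothing left to read, stream ended
server_side.close()
```

ZooKeeper logs that condition as a WARN rather than an ERROR because an
abruptly departing client is an expected event, which matches what you saw.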
>
> JM
>
>
>
> Test                                                                       0.96.0RC0   0.94.11    ratio
> org.apache.hadoop.hbase.PerformanceEvaluation$FilteredScanTest                 10.28     10.17  101.12%
> org.apache.hadoop.hbase.PerformanceEvaluation$RandomReadTest                  966.01    810.58  119.18%
> org.apache.hadoop.hbase.PerformanceEvaluation$RandomSeekScanTest               98.50    255.71   38.52%
> org.apache.hadoop.hbase.PerformanceEvaluation$RandomWriteTest               39251.17  25682.11  152.83%
> org.apache.hadoop.hbase.PerformanceEvaluation$RandomScanWithRange10Test     25844.88  28715.29   90.00%
> org.apache.hadoop.hbase.PerformanceEvaluation$RandomScanWithRange100Test    20029.48  18022.39  111.14%
> org.apache.hadoop.hbase.PerformanceEvaluation$RandomScanWithRange1000Test    2692.16   2346.85  114.71%
> org.apache.hadoop.hbase.PerformanceEvaluation$SequentialReadTest             3002.18   2875.83  104.39%
> org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest           38995.50  26693.23  146.09%
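The ratio column above is just the 0.96.0RC0 number divided by the 0.94.11
number, expressed as a percentage. A quick sketch recomputing it for a few
rows, with the values copied from the table:

```python
# Recompute the ratio column of the table above:
# 0.96.0RC0 throughput divided by 0.94.11 throughput, as a percentage.
results = {
    "RandomReadTest": (966.01, 810.58),
    "RandomSeekScanTest": (98.50, 255.71),
    "RandomWriteTest": (39251.17, 25682.11),
}

for test, (v096, v094) in results.items():
    ratio = v096 / v094 * 100
    # Below 100% means 0.96 is slower (e.g. RandomSeekScanTest at 38.52%).
    print(f"{test}: {ratio:.2f}%")
```

Anything under 100% is a regression relative to 0.94.11, which is why
RandomSeekScanTest (38.52%) and RandomScanWithRange10 (90.00%) stand out.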
>
>
> 2013/9/3 Stack <[email protected]>
>
>> On Mon, Sep 2, 2013 at 10:51 AM, Jean-Marc Spaggiari
>> <[email protected]> wrote:
>>
>> > I have created:
>> >  HBASE-9412
>> >  HBASE-9413
>> >  HBASE-9414
>> >
>> > I have not been able yet to reproduce the ZK error. I'm trying.
>> >
>> >
>> Is it when you have a shell connection and then kill it?
>>
>>
>>
>> > Last, I tried, with no success, to set the log level to WARN to remove
>> > all DEBUG and INFO logs. Setting it to WARN removes the DEBUG lines,
>> > but I keep getting the INFO ones. It seems that something is setting
>> > the log level somewhere else, or it's not being read.
>> >
>> > Here is my log4j.properties file. I removed all the custom log levels
>> > to set WARN for org.apache. And it's still showing INFO...
>> >
>> >
>>
>> Did you do it by editing log4j and restarting, or in the UI?  I think the
>> UI log level setting is broken... (new issue!)
>>
>> Thanks for trying it out JMS,
>>
>> So everything is slower in 0.96?
>> St.Ack
>>
>>
>>
>> > JM
>> >
>> >
>> > # Define some default values that can be overridden by system properties
>> > hbase.root.logger=WARN,console
>> > hbase.security.logger=WARN,console
>> > hbase.log.dir=.
>> > hbase.log.file=hbase.log
>> >
>> > # Define the root logger to the system property "hbase.root.logger".
>> > log4j.rootLogger=${hbase.root.logger}
>> >
>> > # Logging Threshold
>> > log4j.threshold=ALL
>> >
>> > #
>> > # Daily Rolling File Appender
>> > #
>> > log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
>> > log4j.appender.DRFA.File=${hbase.log.dir}/${hbase.log.file}
>> >
>> > # Rollover at midnight
>> > log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
>> >
>> > # 30-day backup
>> > #log4j.appender.DRFA.MaxBackupIndex=30
>> > log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
>> >
>> > # Pattern format: Date LogLevel LoggerName LogMessage
>> > log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2}:
>> > %m%n
>> >
>> > # Rolling File Appender properties
>> > hbase.log.maxfilesize=256MB
>> > hbase.log.maxbackupindex=20
>> >
>> > # Rolling File Appender
>> > log4j.appender.RFA=org.apache.log4j.RollingFileAppender
>> > log4j.appender.RFA.File=${hbase.log.dir}/${hbase.log.file}
>> >
>> > log4j.appender.RFA.MaxFileSize=${hbase.log.maxfilesize}
>> > log4j.appender.RFA.MaxBackupIndex=${hbase.log.maxbackupindex}
>> >
>> > log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
>> > log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2}:
>> > %m%n
>> >
>> > #
>> > # Security audit appender
>> > #
>> > hbase.security.log.file=SecurityAuth.audit
>> > hbase.security.log.maxfilesize=256MB
>> > hbase.security.log.maxbackupindex=20
>> > log4j.appender.RFAS=org.apache.log4j.RollingFileAppender
>> > log4j.appender.RFAS.File=${hbase.log.dir}/${hbase.security.log.file}
>> > log4j.appender.RFAS.MaxFileSize=${hbase.security.log.maxfilesize}
>> > log4j.appender.RFAS.MaxBackupIndex=${hbase.security.log.maxbackupindex}
>> > log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout
>> > log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
>> > log4j.category.SecurityLogger=${hbase.security.logger}
>> > log4j.additivity.SecurityLogger=false
>> >
>> >
>> #log4j.logger.SecurityLogger.org.apache.hadoop.hbase.security.access.AccessController=TRACE
>> >
>> > #
>> > # Null Appender
>> > #
>> > log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender
>> >
>> > #
>> > # console
>> > # Add "console" to rootlogger above if you want to use this
>> > #
>> > log4j.appender.console=org.apache.log4j.ConsoleAppender
>> > log4j.appender.console.target=System.err
>> > log4j.appender.console.layout=org.apache.log4j.PatternLayout
>> > log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %-5p [%t]
>> > %c{2}: %m%n
>> >
>> > # Custom Logging levels
>> >
>> > log4j.logger.ore.apache=WARN
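The typo JM mentions fixing in his later reply is presumably this last line:
"ore.apache" instead of "org.apache", which matches no real logger, so every
org.apache.* logger stays at its inherited level and INFO keeps showing.
The corrected line would read:

```properties
log4j.logger.org.apache=WARN
```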
>> >
>> > 2013/9/2 Jean-Marc Spaggiari <[email protected]>:
>> > > Hi St.Ack,
>> > >
>> > > I will open the related JIRAs in a few minutes.
>> > >
>> > > Regarding performance, RandomSeekScanTest is way slower, and only
>> > > RandomScanWithRange100Test was faster. The others were similar. For
>> > > RandomScanWithRange100Test I suspect that I don't have the right
>> > > number for 0.94, so 0.94.11 tests are running right now on the same
>> > > server with the same configuration. I will start to have numbers by
>> > > the end of the day, or else tomorrow morning, but it will most
>> > > probably take about 24h to get all of them.
>> > >
>> > > JM
>> > >
>> > > 2013/9/2 Stack <[email protected]>:
>> > >> On Mon, Sep 2, 2013 at 9:41 AM, Jean-Marc Spaggiari
>> > >> <[email protected]> wrote:
>> > >>
>> > >>> Got it.
>> > >>>
>> > >>> I can't run the integration tests for now because I'm lacking some
>> > >>> servers :( I need to complete some HBase on RAID tests before I can
>> > >>> get those new servers.
>> > >>>
>> > >>> First thing: start-hbase.cmd has the execute flag set. I don't
>> > >>> think it's required, and un-setting it will help with tab completion.
>> > >>>
>> > >>> For 0.96.0RC0, here are my results:
>> > >>> First, I get two .out files each time I start the server, instead of
>> > >>> the usual one... with the same timestamp.
>> > >>>
>> > >>> -rw-r--r-- 1 jmspaggiari jmspaggiari     0 Aug 31 15:38
>> > >>> hbase-jmspaggiari-master-t430s.out
>> > >>> -rw-r--r-- 1 jmspaggiari jmspaggiari     0 Aug 31 15:38
>> > >>> hbase-jmspaggiari-master-t430s.out.1
>> > >>>
>> > >>>
>> > >>> In the UI, we say "The .META. table holds references to all User
>> > >>> Table regions" but the table name is "hbase:meta" and not ".META."
>> > >>>
>> > >>> In the logs, I found this exception that I did not have before:
>> > >>> 2013-08-31 18:45:05,490 WARN
>> > >>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxn:
>> > >>> caught end of stream exception
>> > >>> EndOfStreamException: Unable to read additional data from client
>> > >>> sessionid 0x140d68bb9d50004, likely client has closed socket
>> > >>>     at
>> > >>> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
>> > >>>     at
>> > >>> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
>> > >>>     at java.lang.Thread.run(Thread.java:662)
>> > >>>
>> > >>>
>> > >>> I ran PE over 48h. I don't have another 0.96 baseline to compare
>> > >>> with, so I compared with 0.94.
>> > >>> org.apache.hadoop.hbase.PerformanceEvaluation$RandomSeekScanTest is
>> > >>> about 3 times slower with 0.96.
>> > >>> org.apache.hadoop.hbase.PerformanceEvaluation$RandomScanWithRange100Test
>> > >>> seems to be 2 times faster.
>> > >>>
>> > >>> Writes also are faster but I changed my hard drive since I ran 0.94.
>> > >>> So I'm currently re-running 0.94 and will provide another more
>> > >>> accurate comparison soon.
>> > >>>
>> > >>> Ran LoadTestTool: Failed to write keys: 0
>> > >>>
>> > >>> I have not been able to run IntegrationTestLoadAndVerify nor
>> > >>> IntegrationTestBigLinkedList because of the lack of servers.
>> > >>>
>> > >>> So overall, it seems to be working fine, but I have not been able to
>> > >>> test this release as deeply as I usually test the 0.94 releases.
>> > >>>
>> > >>
>> > >> Thank you for taking it for a spin JMS -- you the man.  If you don't
>> > >> open issues for the teething stuff, I will.
>> > >>
>> > >> So 0.96 is slower doing RandomSeekScanTest but faster on your other
>> > >> tests?  I can take a look too.
>> > >>
>> > >> St.Ack
>> >
>>
