I agree, nothing new with this release. As for disk space, I have 370GB on the drive where the test is running and 20GB in the tmp folder. I monitored both during the run and the test used only 1GB on each disk, so I don't think it's space related. What's strange is that I tried on 2 different configurations (hardware, OS, etc.) and got the same result both times. I will retry on a 3rd computer to validate.
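For anyone who wants to repeat the disk check, here is a minimal sketch of how free space on both locations could be sampled while the test runs. It is not what I actually ran; the function names and the sampling approach are assumptions made for illustration, and the paths passed in would be whatever directories HBase and tmp use on your machine.

```python
import shutil
import time


def sample_free_gb(paths):
    """Return {path: free space in GiB} for each given path."""
    return {p: shutil.disk_usage(p).free / 2**30 for p in paths}


def max_consumed_gb(paths, samples, interval_s=0.0):
    """Sample free space `samples` times and report, per path, the largest
    drop in free space relative to the first (baseline) sample."""
    baseline = sample_free_gb(paths)
    worst = {p: 0.0 for p in paths}
    for _ in range(samples):
        current = sample_free_gb(paths)
        for p in paths:
            # A positive difference means space was consumed since the baseline.
            worst[p] = max(worst[p], baseline[p] - current[p])
        time.sleep(interval_s)
    return worst
```

Calling `max_consumed_gb` with the data directory and the tmp directory, a sample count that covers the test duration, and an interval of a few seconds gives the peak usage per location. A peak far below the available space would support the conclusion that the failure is not space related.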
2013/12/19 lars hofhansl <la...@apache.org>

> Thanks again, JM.
>
> Yep, both IntegrationTestLoadAndVerify and IntegrationTestBigLinkedList
> pass for me in local install every time I run them (many times by now).
> JDK 1.6.0_34-b04.
>
> One thing I found is that they do not clean up their data and fill up the
> disk. Once the disk is full the tests simply time out for me, but they
> could fail in more "interesting ways" too when that happens... Maybe that's
> what you see?
>
> In any case nothing new with this release, right? Need to double-check the
> tests.
>
> -- Lars
>
> ----- Original Message -----
> From: Jean-Marc Spaggiari <jean-m...@spaggiari.org>
> To: lars hofhansl <la...@apache.org>; dev <dev@hbase.apache.org>
> Cc:
> Sent: Thursday, December 19, 2013 10:45 AM
> Subject: Re: [VOTE] The 1st hbase 0.94.15 release candidate is available
> for download
>
> For the version, the issue is the alter command I used. Sorry about that.
> Forget it.
>
> For IntegrationTestLoadAndVerify I have already reported the issue with
> 0.94.10 on July 23rd.
>
> Just retried with 0.94.14 and 0.94.13 and it failed on both too. By failed
> I mean they give me REFERENCES_CHECKED=9855773 instead of 100000000. Are
> you getting 100000000?
>
> Single node is a local install on my laptop. No other HBase instances
> configured, using local file system. The 7 node cluster is using
> Hadoop 1.0.4.
>
> In local mode I'm running with jdk 1.6.0_45. On the 7 nodes I'm running
> 1.7.0_5.
>
> What's strange with the abstract issue is that IntegrationTestsDriver is
> not the only one using ToolRunner, but is the only one to fail. Strange.
>
> JM
>
> 2013/12/19 lars hofhansl <la...@apache.org>
>
> > The single node cluster was just a local install, right? I.e. using the
> > local file system, rather than HDFS...?
> > On the 7 node cluster, which version of HDFS did you use? If not 1.0.4 I
> > assume you recompiled HBase :)
> >
> > I definitely do not see the AbstractMethodError issue. That looks very
> > much like a classpath setup issue.
> >
> > Ran IntegrationTestLoadAndVerify and IntegrationTestBigLinkedList in a
> > loop in local mode. Didn't fail once.
> >
> > Let's chat offline and figure out if/where your setup is different from
> > mine.
> >
> > -- Lars
> >
> > ________________________________
> > From: lars hofhansl <la...@apache.org>
> > To: Jean-Marc Spaggiari <jean-m...@spaggiari.org>; dev <dev@hbase.apache.org>
> > Sent: Thursday, December 19, 2013 8:53 AM
> > Subject: Re: [VOTE] The 1st hbase 0.94.15 release candidate is available
> > for download
> >
> > Thanks JM.
> >
> > You did a "raw" scan below. It'll return to you exactly what is there, so
> > you'll see the 3 versions before you compact, that is by design.
> > java.lang.AbstractMethodError looks like an issue local to your install.
> > I'll check.
> >
> > IntegrationTestLoadAndVerify is interesting. Did that pass reliably in
> > older releases of 0.94 (0.94.14 or 0.94.13)?
> >
> > -- Lars
> >
> > ________________________________
> > From: Jean-Marc Spaggiari <jean-m...@spaggiari.org>
> > To: dev <dev@hbase.apache.org>; lars hofhansl <la...@apache.org>
> > Sent: Thursday, December 19, 2013 7:01 AM
> > Subject: Re: [VOTE] The 1st hbase 0.94.15 release candidate is available
> > for download
> >
> > tl;dr: see arrow below.
> >
> > Downloaded and checked signature for both vanilla and secured. Passed.
> > Randomly checked documentation and CHANGES.txt. Passed.
> >
> > On a single node cluster:
> > Ran the tests. All passed.
> > Ran IntegrationTestLoadAndVerify. Got REFERENCES_CHECKED=9855424,
> > expected 10000000? Failed?
> > Ran IntegrationTestBigLinkedList. Passed.
> > Ran HBCK after those tests and got many errors about _original-evil-name
> > and clone tables.
> > Cleared everything, restarted HBase. Re-ran IntegrationTestBigLinkedList,
> > HBCK ok. Re-ran IntegrationTestLoadAndVerify, failed again:
> > 13/12/18 21:24:24 ERROR test.IntegrationTestBigLinkedList$Verify: Expected
> > referenced count does not match with actual referenced count. expected
> > referenced=3000000 ,actual=9000000
> > Exception in thread "main" java.lang.RuntimeException: Verify.verify failed
> >     at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Loop.runVerify(IntegrationTestBigLinkedList.java:724)
> >     at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Loop.run(IntegrationTestBigLinkedList.java:757)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >     at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList.run(IntegrationTestBigLinkedList.java:1069)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >     at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList.main(IntegrationTestBigLinkedList.java:1073)
> >
> > But now HBCK is clean. I figured the HBCK issue is caused by some
> > leftovers from org.apache.hadoop.hbase.regionserver.TestStoreFile, which
> > writes to the same directory as the default standalone HBase.
> >
> > From the shell, create a table with 15 regions, put, compact, scan, etc.
> > Table definition is VERSIONS => 2. However, scan 't1', {RAW => true,
> > VERSIONS => 10} still returns 3 versions even after
> > flush/compact/major_compact:
> > hbase(main):034:0> scan 't1', {RAW => true, VERSIONS => 10}
> > ROW        COLUMN+CELL
> >  rowkey    column=f1:c1, timestamp=1387421969489, value=value
> >  rowkey    column=f1:c1, timestamp=1387421969337, value=value
> >  rowkey    column=f1:c1, timestamp=1387421969162, value=value
> > 1 row(s) in 0.0570 seconds
> >
> > I would have expected only 2 to be returned.
> >
> > Stopped HBase, checked the log, everything is fine.
> >
> > Now on a 7 node cluster:
> >
> > Deployed jars and did rolling restart on a 0.94.14 cluster. Passed.
> >
> > Configured default balancer, merged a 60 region table to a single region,
> > restarted the cluster, all fine.
> >
> > major_compact the table to get it split into 60 regions, balancer, all
> > fine except that the balancer needs to be run twice to get correct
> > balancing.
> >
> > Some "No serialized HRegionInfo in keyvalues" in the logs not related to
> > the tables I'm "playing" with.
> >
> > Restored customized balancer, restarted, rebalanced, all fine.
> > Ran IntegrationTestLoadAndVerify. Got REFERENCES_CHECKED=9855645,
> > expected 10000000? Failed?
> >
> > Ran IntegrationTestBigLinkedList. Passed.
> >
> > Last, I tried to run IntegrationTestsDriver but it failed. I need to look
> > at that.
> >
> > hbase@node3:~/hbase-0.94.3$ bin/hbase org.apache.hadoop.hbase.IntegrationTestsDriver
> > Exception in thread "main" java.lang.AbstractMethodError: org.apache.hadoop.hbase.util.AbstractHBaseTool.doWork()V
> >     at org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:103)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >     at org.apache.hadoop.hbase.IntegrationTestsDriver.main(IntegrationTestsDriver.java:47)
> >
> > =====> tl;dr:
> >
> > - Small issue with balancer when 60 regions are assigned to a single
> >   server. Need to run twice to get that correctly balanced;
> > - Leftover in the wrong place from
> >   org.apache.hadoop.hbase.regionserver.TestStoreFile;
> > - Table with VERSIONS => 2 returns 3 versions instead of 2;
> > - IntegrationTestsDriver not running.
> >
> > I don't think there is anything here to stop the release but there are
> > still a few things that need to be looked at.
> >
> > JM
> >
> > 2013/12/18 lars hofhansl <la...@apache.org>
> >
> > > The 1st 0.94.15 RC is available for download at
> > > http://people.apache.org/~larsh/hbase-0.94.15-rc0/
> > > Signed with my code signing key: C7CFE328
> > >
> > > HBase 0.94.15 is a bug fix release along with some performance
> > > improvements:
> > > [HBASE-7886] - [replication] hlog zk node will not be deleted if client roll hlog
> > > [HBASE-9485] - TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart
> > > [HBASE-9995] - Not stopping ReplicationSink when using custom implementation for the ReplicationSink
> > > [HBASE-10014] - HRegion#doMiniBatchMutation rollbacks the memstore even if there is nothing to rollback.
> > > [HBASE-10015] - Replace intrinsic locking with explicit locks in StoreScanner
> > > [HBASE-10026] - HBaseAdmin#createTable could fail if region splits too fast
> > > [HBASE-10046] - Unmonitored HBase service could accumulate Status objects and OOM
> > > [HBASE-10057] - TestRestoreFlushSnapshotFromClient and TestRestoreSnapshotFromClient fail to finish occasionally
> > > [HBASE-10061] - TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE
> > > [HBASE-10064] - AggregateClient.validateParameters can throw NPE
> > > [HBASE-10089] - Metrics intern table names cause eventual permgen OOM in 0.94
> > > [HBASE-10111] - Verify that a snapshot is not corrupted before restoring it
> > > [HBASE-10112] - Hbase rest query params for maxVersions and maxValues are not parsed
> > > [HBASE-10117] - Avoid synchronization in HRegionScannerImpl.isFilterDone
> > > [HBASE-10120] - start-hbase.sh doesn't respect --config in non-distributed mode
> > > [HBASE-10179] - HRegionServer underreports readRequestCounts by 1 under certain conditions
> > > [HBASE-10181] - HBaseObjectWritable.readObject catches DoNotRetryIOException and wraps it back in a regular IOException
> > > [HBASE-9931] - Optional setBatch for CopyTable to copy large rows in batches
> > > [HBASE-10001] - Add a coprocessor to help testing the performances without taking into account the i/o
> > > [HBASE-10007] - PerformanceEvaluation: Add sampling and latency collection to randomRead test
> > > [HBASE-10010] - eliminate the put latency spike on the new log file beginning
> > > [HBASE-10048] - Add hlog number metric in regionserver
> > > [HBASE-10049] - Small improvments in region_mover.rb
> > > [HBASE-10093] - Unregister ReplicationSource metric bean when the replication source thread is terminated
> > > [HBASE-9047] - Tool to handle finishing replication when the cluster is offline
> > > [HBASE-10119] - Allow HBase coprocessors to clean up when they fail
> > > [HBASE-9927] - ReplicationLogCleaner#stop() calls HConnectionManager#deleteConnection() unnecessarily
> > > [HBASE-9986] - Incorporate HTTPS support for HBase (0.94 port)
> > > [HBASE-10058] - Test for HBASE-9915 (avoid reading index blocks)
> > > [HBASE-10189] - Intermittent TestReplicationSyncUpTool failure
> > >
> > > The list of changes is also available here:
> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12325559
> > >
> > > Here're the jenkins runs for this RC:
> > > https://builds.apache.org/job/HBase-0.94.15/2/ and
> > > https://builds.apache.org/job/HBase-0.94.15-security/1/
> > >
> > > Please try out the RC, check out the doc, take it for a spin, etc, and
> > > vote +1/-1 by EOD December 27th on whether we should release this as
> > > 0.94.15. (9 days because of the holidays)
> > >
> > > Thanks.
> > >
> > > -- Lars