[jira] [Commented] (HBASE-14020) Unsafe based optimized write in ByteBufferOutputStream
[ https://issues.apache.org/jira/browse/HBASE-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613601#comment-14613601 ]

Hadoop QA commented on HBASE-14020:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12743540/HBASE-14020_v2.patch
against master branch at commit e640f1e76af8f32015f475629610da127897f01e.
ATTACHMENT ID: 12743540

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:red}-1 javac{color}. The applied patch generated 16 javac compiler warnings (more than the master's current 13 warnings).
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 1899 checkstyle errors (more than the master's current 1898 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
{color:red}-1 core zombie tests{color}. There are 1 zombie test(s):
    at org.apache.hadoop.hbase.util.TestDrainBarrier.testStopIsBlockedByOps(TestDrainBarrier.java:98)

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14655//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14655//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14655//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14655//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14655//console

This message is automatically generated.

Unsafe based optimized write in ByteBufferOutputStream
------------------------------------------------------

Key: HBASE-14020
URL: https://issues.apache.org/jira/browse/HBASE-14020
Project: HBase
Issue Type: Sub-task
Components: Scanners
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Fix For: 2.0.0
Attachments: HBASE-14020.patch, HBASE-14020_v2.patch

We use this class to build the cellblock at the RPC layer. The write operation does puts to a java ByteBuffer, which has a lot of overhead. Instead we can do an Unsafe based copy to the buffer.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
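The patch itself is not quoted in this thread. As a rough illustration of the idea it describes, replacing a per-byte ByteBuffer put loop with one bulk Unsafe memory copy, here is a minimal, self-contained sketch; the class and method names are invented for the example and are not from the patch:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeCopyDemo {
    private static final Unsafe UNSAFE;
    private static final long BYTE_BASE;
    static {
        try {
            // sun.misc.Unsafe has no public constructor; grab the singleton by reflection.
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            UNSAFE = (Unsafe) f.get(null);
            BYTE_BASE = UNSAFE.arrayBaseOffset(byte[].class);
        } catch (Exception e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** One bulk memory copy instead of a per-byte ByteBuffer put loop. */
    static void copy(byte[] src, int srcOff, byte[] dst, int dstOff, int len) {
        UNSAFE.copyMemory(src, BYTE_BASE + srcOff, dst, BYTE_BASE + dstOff, len);
    }

    public static void main(String[] args) {
        byte[] cell = "cellblock-bytes".getBytes();
        byte[] out = new byte[cell.length];
        copy(cell, 0, out, 0, cell.length);
        System.out.println(new String(out)); // cellblock-bytes
    }
}
```

Note that sun.misc.Unsafe is a JDK-internal API, so production code (HBase included) typically guards such paths with an availability check and falls back to plain ByteBuffer operations.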
[jira] [Commented] (HBASE-14018) RegionServer is aborted when flushing memstore.
[ https://issues.apache.org/jira/browse/HBASE-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613574#comment-14613574 ]

Dinh Duong Mai commented on HBASE-14018:
----------------------------------------

I did, and I hope someone can help me.

RegionServer is aborted when flushing memstore.
-----------------------------------------------

Key: HBASE-14018
URL: https://issues.apache.org/jira/browse/HBASE-14018
Project: HBase
Issue Type: Bug
Components: hadoop2, hbase
Affects Versions: 1.0.1.1
Environment: CentOS x64 Server
Reporter: Dinh Duong Mai
Attachments: hbase-hadoop-master-node1.vmcluster.log, hbase-hadoop-regionserver-node1.vmcluster.log, hbase-hadoop-zookeeper-node1.vmcluster.log

+ Pseudo-distributed Hadoop (2.6.0), ZK_HBASE_MANAGE = true (1 master, 1 regionserver).
+ Put data to OpenTSDB, 1000 records / s, for 2000 seconds.
+ RegionServer is aborted.

=== RegionServer logs ===

2015-07-03 16:37:37,332 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1623, hits=172, hitRatio=10.60%, cachingAccesses=177, cachingHits=151, cachingHitsRatio=85.31%, evictions=1139, evicted=21, evictedPerRun=0.018437225371599197
2015-07-03 16:37:37,898 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 2744, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 19207814
2015-07-03 16:42:37,331 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1624, hits=173, hitRatio=10.65%, cachingAccesses=178, cachingHits=152, cachingHitsRatio=85.39%, evictions=1169, evicted=21, evictedPerRun=0.01796407252550125
2015-07-03 16:42:37,899 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 3049, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 33026416
2015-07-03 16:43:27,217 INFO [MemStoreFlusher.1] regionserver.HRegion: Started memstore flush for tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d., current region memstore size 128.05 MB
2015-07-03 16:43:27,899 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server node1.vmcluster,16040,1435897652505: Replay of WAL required. Forcing server shutdown
org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d.
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2001)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1772)
    at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1704)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743
    at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478)
    at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448)
    at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165)
    at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146)
    at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263)
    at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:932)
    at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:121)
    at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:71)
    at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:879)
    at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2128)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1955)
    ... 7 more
2015-07-03 16:43:27,901 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: RegionServer
[jira] [Commented] (HBASE-8642) [Snapshot] List and delete snapshot by table
[ https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613603#comment-14613603 ]

Hadoop QA commented on HBASE-8642:
----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12743520/HBASE-8642-v4.patch
against master branch at commit e640f1e76af8f32015f475629610da127897f01e.
ATTACHMENT ID: 12743520

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors.
{color:red}-1 findbugs{color}. The patch appears to cause Findbugs (version 2.0.3) to fail.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
+        puts "No snapshots matched the table name regular expression #{tableNameregex.to_s} and the snapshot name regular expression #{snapshotNameRegex.to_s}" if count == 0
+          puts "Failed to delete snapshot: #{deleteSnapshot.getName}, due to below exception,\n" + $!
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14657//testReport/
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14657//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14657//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14657//console

This message is automatically generated.

[Snapshot] List and delete snapshot by table
--------------------------------------------

Key: HBASE-8642
URL: https://issues.apache.org/jira/browse/HBASE-8642
Project: HBase
Issue Type: Improvement
Components: snapshots
Affects Versions: 0.98.0, 0.95.0, 0.95.1, 0.95.2
Reporter: Julian Zhou
Assignee: Ashish Singhi
Fix For: 2.0.0
Attachments: 8642-trunk-0.95-v0.patch, 8642-trunk-0.95-v1.patch, 8642-trunk-0.95-v2.patch, HBASE-8642-v1.patch, HBASE-8642-v2.patch, HBASE-8642-v3.patch, HBASE-8642-v4.patch, HBASE-8642.patch

Support list and delete snapshots by table names. User scenario: a user wants to delete all the snapshots which were taken in the month of January for a table 't', where snapshot names start with 'Jan'.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
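The patch's shell and Admin changes are not quoted in this thread, but the user scenario (delete all snapshots of table 't' whose names start with 'Jan') reduces to filtering snapshot names by owning table plus a name regex. A minimal sketch of that matching logic follows; the data structure and names are hypothetical stand-ins, not the patch's API:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class SnapshotFilterDemo {
    /** Return snapshot names belonging to the given table that match the name regex. */
    static List<String> match(Map<String, String> snapshotToTable, String table, String nameRegex) {
        Pattern p = Pattern.compile(nameRegex);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : snapshotToTable.entrySet()) {
            // Keep only snapshots of the requested table whose name matches the regex.
            if (e.getValue().equals(table) && p.matcher(e.getKey()).matches()) {
                hits.add(e.getKey());
            }
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) {
        Map<String, String> snaps = new HashMap<>();
        snaps.put("Jan_backup_1", "t");
        snaps.put("Feb_backup_1", "t");
        snaps.put("Jan_other", "u");
        System.out.println(match(snaps, "t", "Jan.*")); // [Jan_backup_1]
    }
}
```

Against a real cluster the same filtering would go through the Admin snapshot APIs rather than a map, with the shell wrapping it as in the quoted Ruby lines.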
[jira] [Commented] (HBASE-14021) Quota table has a wrong description on the UI
[ https://issues.apache.org/jira/browse/HBASE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613612#comment-14613612 ]

Hadoop QA commented on HBASE-14021:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12743507/HBASE-14021.patch
against master branch at commit e640f1e76af8f32015f475629610da127897f01e.
ATTACHMENT ID: 12743507

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 1899 checkstyle errors (more than the master's current 1898 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:red}-1 site{color}. The patch appears to cause mvn post-site goal to fail.
{color:red}-1 core tests{color}. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14656//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14656//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14656//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14656//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14656//console

This message is automatically generated.

Quota table has a wrong description on the UI
---------------------------------------------

Key: HBASE-14021
URL: https://issues.apache.org/jira/browse/HBASE-14021
Project: HBase
Issue Type: Bug
Components: UI
Affects Versions: 1.1.0
Reporter: Ashish Singhi
Assignee: Ashish Singhi
Priority: Minor
Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1
Attachments: HBASE-14021.patch, error.png, fix.png

!error.png!

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613585#comment-14613585 ]

Hadoop QA commented on HBASE-14017:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12743530/HBASE-14017.v1-branch1.1.patch
against master branch at commit e640f1e76af8f32015f475629610da127897f01e.
ATTACHMENT ID: 12743530

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14653//console

This message is automatically generated.

Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
---------------------------------------------------------------------------------

Key: HBASE-14017
URL: https://issues.apache.org/jira/browse/HBASE-14017
Project: HBase
Issue Type: Sub-task
Components: proc-v2
Affects Versions: 2.0.0, 1.2.0, 1.1.1, 1.3.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
Fix For: 2.0.0, 1.2.0, 1.1.2
Attachments: HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch

[~syuanjiang] found a concurrency issue in the procedure queue delete, where we don't have an exclusive lock before deleting the table:
{noformat}
Thread 1: Create table is running - the queue is empty and wlock is false
Thread 2: markTableAsDeleted sees the queue empty and wlock == false
Thread 1: tryWrite() sets wlock=true; too late
Thread 2: deletes the queue
Thread 1: never able to release the lock - NPE when trying to get the queue
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
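The race above is a check-then-act problem: the "queue empty and wlock false" check and the queue deletion are not atomic. A minimal sketch of the shape of the fix, deleting the queue only under the same monitor that guards the lock flag, follows; the names are illustrative, not the actual MasterProcedureQueue code:

```java
import java.util.concurrent.ConcurrentHashMap;

public class QueueDeleteDemo {
    public static final class TableQueue {
        private boolean exclusiveLock;   // plays the role of wlock in the report
        private int pending;             // queued procedures (0 in this demo)

        // One monitor guards both fields, so "empty && unlocked" is checked
        // and acted on atomically; the reported race came from checking without it.
        public synchronized boolean tryExclusiveLock() {
            if (exclusiveLock) return false;
            exclusiveLock = true;
            return true;
        }
        public synchronized void releaseExclusiveLock() { exclusiveLock = false; }
        public synchronized boolean isSafeToDelete() { return pending == 0 && !exclusiveLock; }
    }

    public static final ConcurrentHashMap<String, TableQueue> queues = new ConcurrentHashMap<>();

    /** Delete the queue only while atomically verifying nobody holds the lock. */
    public static boolean markTableAsDeleted(String table) {
        TableQueue q = queues.get(table);
        if (q == null) return true;
        synchronized (q) {
            if (!q.isSafeToDelete()) return false; // a creator already grabbed wlock
            queues.remove(table);
            return true;
        }
    }

    public static void main(String[] args) {
        TableQueue q = new TableQueue();
        queues.put("t1", q);
        q.tryExclusiveLock();                         // "thread 1" wins the lock first
        System.out.println(markTableAsDeleted("t1")); // false: queue survives
        q.releaseExclusiveLock();
        System.out.println(markTableAsDeleted("t1")); // true: now safe to delete
    }
}
```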
[jira] [Commented] (HBASE-5878) Use getVisibleLength public api from HdfsDataInputStream from Hadoop-2.
[ https://issues.apache.org/jira/browse/HBASE-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613599#comment-14613599 ]

Hadoop QA commented on HBASE-5878:
----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12743500/HBASE-5878-v5.patch
against master branch at commit e640f1e76af8f32015f475629610da127897f01e.
ATTACHMENT ID: 12743500

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14652//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14652//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14652//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14652//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14652//console

This message is automatically generated.

Use getVisibleLength public api from HdfsDataInputStream from Hadoop-2.
-----------------------------------------------------------------------

Key: HBASE-5878
URL: https://issues.apache.org/jira/browse/HBASE-5878
Project: HBase
Issue Type: Bug
Components: wal
Reporter: Uma Maheswara Rao G
Assignee: Ashish Singhi
Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1, 1.0.3
Attachments: HBASE-5878-v2.patch, HBASE-5878-v3.patch, HBASE-5878-v4.patch, HBASE-5878-v5.patch, HBASE-5878.patch

SequenceFileLogReader: currently HBase uses the getFileLength api from the DFSInputStream class via reflection. DFSInputStream is not exposed as public, so this may change in future. HDFS now exposes HdfsDataInputStream as a public API. We can make use of it, falling back to finding the getFileLength api from DFSInputStream only in the else condition, so that we will not have any sudden surprise like we are facing today.

Also, the current code just logs one warn message and proceeds if it throws any exception while getting the length. I think we can re-throw the exception, because there is no point in continuing with data loss.
{code}
long adjust = 0;
try {
  Field fIn = FilterInputStream.class.getDeclaredField("in");
  fIn.setAccessible(true);
  Object realIn = fIn.get(this.in);
  // In hadoop 0.22, DFSInputStream is a standalone class.  Before this,
  // it was an inner class of DFSClient.
  if (realIn.getClass().getName().endsWith("DFSInputStream")) {
    Method getFileLength = realIn.getClass().
      getDeclaredMethod("getFileLength", new Class<?>[] {});
    getFileLength.setAccessible(true);
    long realLength = ((Long)getFileLength.
      invoke(realIn, new Object[] {})).longValue();
    assert(realLength >= this.length);
    adjust = realLength - this.length;
  } else {
    LOG.info("Input stream class: " + realIn.getClass().getName() +
      ", not adjusting length");
  }
} catch (Exception e) {
  SequenceFileLogReader.LOG.warn(
    "Error while trying to get accurate file length.  " +
    "Truncation / data loss may occur if
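The proposal, prefer the public API, fall back to reflection, and rethrow instead of logging and continuing, can be sketched as below. The stand-in interface substitutes for HdfsDataInputStream so the example is self-contained; it is not the actual HDFS type, and the method names on the fallback path are only illustrative:

```java
import java.io.IOException;
import java.lang.reflect.Method;

public class VisibleLengthDemo {
    /** Stand-in for org.apache.hadoop.hdfs.client.HdfsDataInputStream. */
    public interface HasVisibleLength { long getVisibleLength(); }

    /** Prefer the public API; fall back to reflection; rethrow on failure. */
    public static long visibleLength(Object in) throws IOException {
        if (in instanceof HasVisibleLength) {
            // Public, supported API: no reflection needed.
            return ((HasVisibleLength) in).getVisibleLength();
        }
        try {
            // Legacy fallback, analogous to the reflective getFileLength call above.
            Method m = in.getClass().getDeclaredMethod("getFileLength");
            m.setAccessible(true);
            return (Long) m.invoke(in);
        } catch (Exception e) {
            // Rethrow: continuing with a stale length risks silent data loss.
            throw new IOException("Cannot determine visible stream length", e);
        }
    }

    public static void main(String[] args) throws IOException {
        HasVisibleLength stream = () -> 42L;
        System.out.println(visibleLength(stream)); // 42
    }
}
```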
[jira] [Commented] (HBASE-13832) Procedure V2: master fail to start due to WALProcedureStore sync failures when HDFS data nodes count is low
[ https://issues.apache.org/jira/browse/HBASE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613622#comment-14613622 ]

Hadoop QA commented on HBASE-13832:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12743415/HBASE-13832-v4.patch
against master branch at commit e640f1e76af8f32015f475629610da127897f01e.
ATTACHMENT ID: 12743415

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 1899 checkstyle errors (more than the master's current 1898 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14661//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14661//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14661//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14661//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14661//console

This message is automatically generated.

Procedure V2: master fail to start due to WALProcedureStore sync failures when HDFS data nodes count is low
-----------------------------------------------------------------------------------------------------------

Key: HBASE-13832
URL: https://issues.apache.org/jira/browse/HBASE-13832
Project: HBase
Issue Type: Sub-task
Components: master, proc-v2
Affects Versions: 2.0.0, 1.1.0, 1.2.0
Reporter: Stephen Yuan Jiang
Assignee: Matteo Bertozzi
Priority: Critical
Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1
Attachments: HBASE-13832-v0.patch, HBASE-13832-v1.patch, HBASE-13832-v2.patch, HBASE-13832-v4.patch, HDFSPipeline.java, hbase-13832-test-hang.patch, hbase-13832-v3.patch

When the data node count is 3, we got a failure in WALProcedureStore#syncLoop() during master start. The failure prevents the master from starting.
{noformat}
2015-05-29 13:27:16,625 ERROR [WALProcedureStoreSyncThread] wal.WALProcedureStore: Sync slot failed, abort.
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[10.333.444.555:50010,DS-3ced-93f4-47b6-9c23-1426f7a6acdc,DISK], DatanodeInfoWithStorage[10.222.666.777:50010,DS-f9c983b4-1f10-4d5e-8983-490ece56c772,DISK]], original=[DatanodeInfoWithStorage[10.333.444.555:50010,DS-3ced-93f4-47b6-9c23-1426f7a6acdc,DISK], DatanodeInfoWithStorage[10.222.666.777:50010,DS-f9c983b4-1f10-4d5e-8983-490ece56c772,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:951)
{noformat}
One proposal is to implement logic similar to FSHLog: if an IOException is thrown during syncLoop in WALProcedureStore#start(), instead of aborting immediately, we could try to roll the log and see whether this resolves the issue; if the new log cannot be created, or rolling the log throws more exceptions, we then abort.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
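The proposed recovery, roll the log on a sync failure and abort only when rolling no longer helps, can be sketched as a bounded retry loop. The Log interface below is a stand-in, not the actual WALProcedureStore API:

```java
import java.io.IOException;

public class SyncWithRollDemo {
    public interface Log {
        void sync() throws IOException;
        boolean roll();   // true if a fresh log/pipeline was created
    }

    /** Try to sync; on failure roll to a fresh log and retry; give up when rolling stops helping. */
    public static boolean syncWithRoll(Log log, int maxRolls) {
        for (int attempt = 0; ; attempt++) {
            try {
                log.sync();
                return true;                 // synced successfully
            } catch (IOException e) {
                if (attempt >= maxRolls || !log.roll()) {
                    return false;            // out of options: caller aborts
                }
                // else: new writer/pipeline created, retry the sync
            }
        }
    }

    public static void main(String[] args) {
        int[] failuresLeft = {2};            // first two syncs fail, then recover
        Log flaky = new Log() {
            public void sync() throws IOException {
                if (failuresLeft[0]-- > 0) throw new IOException("bad datanode pipeline");
            }
            public boolean roll() { return true; }
        };
        System.out.println(syncWithRoll(flaky, 3)); // true
    }
}
```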
[jira] [Commented] (HBASE-12213) HFileBlock backed by Array of ByteBuffers
[ https://issues.apache.org/jira/browse/HBASE-12213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613620#comment-14613620 ]

Hadoop QA commented on HBASE-12213:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12743465/HBASE-12213_2.patch
against master branch at commit e640f1e76af8f32015f475629610da127897f01e.
ATTACHMENT ID: 12743465

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 47 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:red}-1 javac{color}. The applied patch generated 20 javac compiler warnings (more than the master's current 13 warnings).
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 5 warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 1914 checkstyle errors (more than the master's current 1898 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
+ * Fetches the short at the given index. Does not change position of the underlying ByteBuffers. The
+ * difference for this API from {@link #getShort(int)} is the caller is sure that the index will be
+ * Fetches the long at the given index. Does not change position of the underlying ByteBuffers. The
+ public static void copy(ByteBuffer src, int srcOffset, ByteBuffer dest, int destOffset, int length) {
+assertBuffersEqual(new MultiByteBuffer(expectedBuffer), actualBuffer, algo, encoding, pread);
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
{color:red}-1 core zombie tests{color}. There are 4 zombie test(s):
    at org.apache.hadoop.hbase.security.access.TestWithDisabledAuthorization.testPassiveMasterOperations(TestWithDisabledAuthorization.java:516)

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14654//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14654//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14654//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14654//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14654//console

This message is automatically generated.

HFileBlock backed by Array of ByteBuffers
-----------------------------------------

Key: HBASE-12213
URL: https://issues.apache.org/jira/browse/HBASE-12213
Project: HBase
Issue Type: Sub-task
Components: regionserver, Scanners
Reporter: Anoop Sam John
Assignee: ramkrishna.s.vasudevan
Attachments: HBASE-12213_1.patch, HBASE-12213_2.patch

In the L2 cache (offheap), an HFile block might have been cached into multiple chunks of buffers. If HFileBlock needs a single BB, we will end up recreating a bigger BB and copying. Instead, we can make HFileBlock serve data from an array of BBs.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
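Serving reads from an array of buffers without merging them means translating a logical offset into a (buffer, local offset) pair. A minimal sketch of such an offset-translating read follows; it is illustrative only, not the patch's MultiByteBuffer code:

```java
import java.nio.ByteBuffer;

public class MultiBufferReadDemo {
    /** Read one byte at a logical offset across an array of buffers, without copying. */
    public static byte getByte(ByteBuffer[] bufs, int offset) {
        for (ByteBuffer b : bufs) {
            if (offset < b.limit()) {
                return b.get(offset);   // absolute get: does not move position
            }
            offset -= b.limit();        // skip past this chunk
        }
        throw new IndexOutOfBoundsException("offset beyond total length");
    }

    public static void main(String[] args) {
        ByteBuffer b1 = ByteBuffer.wrap(new byte[]{10, 11, 12});
        ByteBuffer b2 = ByteBuffer.wrap(new byte[]{13, 14});
        System.out.println(getByte(new ByteBuffer[]{b1, b2}, 3)); // 13
    }
}
```

Multi-byte reads (short/int/long) follow the same translation but must also handle a value straddling two chunks, which is where most of the real implementation's complexity lives.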
[jira] [Commented] (HBASE-14011) MultiByteBuffer position based reads does not work correctly
[ https://issues.apache.org/jira/browse/HBASE-14011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613639#comment-14613639 ]

Hudson commented on HBASE-14011:
--------------------------------

FAILURE: Integrated in HBase-TRUNK #6626 (See [https://builds.apache.org/job/HBase-TRUNK/6626/])
HBASE-14011 - MultiByteBuffer position based reads does not work correctly (ramkrishna: rev 1b75fd2bd6fdae2b6a8634ff24492ff0b96c1f32)
* hbase-common/src/test/java/org/apache/hadoop/hbase/nio/TestMultiByteBuffer.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/nio/MultiByteBuffer.java

MultiByteBuffer position based reads does not work correctly
------------------------------------------------------------

Key: HBASE-14011
URL: https://issues.apache.org/jira/browse/HBASE-14011
Project: HBase
Issue Type: Bug
Affects Versions: 2.0.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 2.0.0
Attachments: HBASE-14011.patch

The position based reads in MBBs have some issues when we try to read the first element from the 2nd BB, when the MBB is formed with multiple BBs.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
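The boundary case named in the report, the first element of the 2nd BB, sits exactly at offset == limit of the first buffer, so an off-by-one in the item-selection comparison mis-reads it. A small sketch of the correct comparison (illustrative, not the actual fix):

```java
import java.nio.ByteBuffer;

public class BoundaryReadDemo {
    /**
     * Positional read of a byte. The first index of the second buffer
     * (logical offset == limit of the first) must map to item 1, index 0.
     */
    public static byte get(ByteBuffer[] items, int offset) {
        int i = 0;
        // Using '>' instead of '>=' here would wrongly keep offset == limit
        // in the first buffer: exactly the boundary case from this bug.
        while (i < items.length - 1 && offset >= items[i].limit()) {
            offset -= items[i].limit();
            i++;
        }
        return items[i].get(offset);
    }

    public static void main(String[] args) {
        ByteBuffer a = ByteBuffer.wrap(new byte[]{1, 2});
        ByteBuffer b = ByteBuffer.wrap(new byte[]{3, 4});
        System.out.println(get(new ByteBuffer[]{a, b}, 2)); // 3: first element of the 2nd buffer
    }
}
```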
[jira] [Commented] (HBASE-13977) Convert getKey and related APIs to Cell
[ https://issues.apache.org/jira/browse/HBASE-13977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613637#comment-14613637 ]

Hudson commented on HBASE-13977:
--------------------------------

FAILURE: Integrated in HBase-TRUNK #6626 (See [https://builds.apache.org/job/HBase-TRUNK/6626/])
HBASE-13977 - Convert getKey and related APIs to Cell (Ram) (ramkrishna: rev 74e82c64e5cd0b7e598997199fec432e51d8d267)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionFileSystem.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
* hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/PrefixTreeSeeker.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/CellUtil.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StripeMultiFileWriter.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileEncryption.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEditsReplaySink.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/CompressionTest.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionReplicas.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/BloomFilter.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMajorCompaction.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpl.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java

Convert getKey and related APIs to Cell
---------------------------------------

Key: HBASE-13977
URL: https://issues.apache.org/jira/browse/HBASE-13977
Project: HBase
Issue Type: Sub-task
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 2.0.0
Attachments: HBASE-13977.patch, HBASE-13977_1.patch, HBASE-13977_2.patch, HBASE-13977_3.patch, HBASE-13977_4.patch, HBASE-13977_4.patch, HBASE-13977_5.patch

During the course of changes for HBASE-11425, we felt that more APIs can be converted to return Cell instead of BB, like getKey and getLastKey. We can also rename getKeyValue to getCell.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HBASE-13975) add 1.2 RM to docs
[ https://issues.apache.org/jira/browse/HBASE-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613638#comment-14613638 ] Hudson commented on HBASE-13975: FAILURE: Integrated in HBase-TRUNK #6626 (See [https://builds.apache.org/job/HBase-TRUNK/6626/]) HBASE-13975 Add busbey as 1.2 RM to ref guide (busbey: rev 7ffd14986b689b91845813ce0321fef70778539a) * src/main/asciidoc/_chapters/developer.adoc add 1.2 RM to docs -- Key: HBASE-13975 URL: https://issues.apache.org/jira/browse/HBASE-13975 Project: HBase Issue Type: Sub-task Components: documentation Affects Versions: 1.2.0 Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 2.0.0 Attachments: HBASE-13975.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14013) Retry when RegionServerNotYetRunningException rather than go ahead with assign so for sure we don't skip WAL replay
[ https://issues.apache.org/jira/browse/HBASE-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613641#comment-14613641 ] Hudson commented on HBASE-14013: FAILURE: Integrated in HBase-TRUNK #6626 (See [https://builds.apache.org/job/HBase-TRUNK/6626/]) HBASE-14013 Retry when RegionServerNotYetRunningException rather than go ahead with assign so for sure we don't skip WAL replay (stack: rev 71a523a6197da0abe93469e13d644adb629529db) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java Retry when RegionServerNotYetRunningException rather than go ahead with assign so for sure we don't skip WAL replay --- Key: HBASE-14013 URL: https://issues.apache.org/jira/browse/HBASE-14013 Project: HBase Issue Type: Sub-task Components: Region Assignment Reporter: stack Assignee: Enis Soztutar Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0 Attachments: hbase-13895_addendum3-branch-1.1.patch, hbase-13895_addendum3-branch-1.patch, hbase-13895_addendum3-master.patch Patches are copied from parent. They were done by [~enis] +1 from. They continue the theme of the parent applying it to RegionServerNotYetRunningException as well as the new region aborting exception .. added in parent issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13970) NPE during compaction in trunk
[ https://issues.apache.org/jira/browse/HBASE-13970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613643#comment-14613643 ] Hudson commented on HBASE-13970: FAILURE: Integrated in HBase-TRUNK #6626 (See [https://builds.apache.org/job/HBase-TRUNK/6626/]) HBASE-13970 NPE during compaction in trunk (zhangduo: rev 28a035000f3cbf7c2b242d6dd9ba3bf84f4e2503) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java NPE during compaction in trunk -- Key: HBASE-13970 URL: https://issues.apache.org/jira/browse/HBASE-13970 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 0.98.13, 1.2.0, 1.1.1 Reporter: ramkrishna.s.vasudevan Assignee: Duo Zhang Fix For: 2.0.0, 0.98.14, 1.1.2, 1.3.0, 1.2.1 Attachments: HBASE-13970-v1.patch, HBASE-13970.patch Updated the trunk. Loaded the table with PE tool. Triggered a flush to ensure all data is flushed out to disk. When the first compaction is triggered we get an NPE, and this is very easy to reproduce {code} 2015-06-25 21:33:46,041 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received procedure start children changed event: /hbase/flush-table-proc/acquired 2015-06-25 21:33:46,051 INFO [rs(stobdtserver3,16040,1435248182301)-flush-proc-pool3-thread-1] regionserver.HRegion: Flushing 1/1 column families, memstore=76.91 MB 2015-06-25 21:33:46,159 ERROR [regionserver/stobdtserver3/10.224.54.70:16040-longCompactions-1435248183945] regionserver.CompactSplitThread: Compaction failed Request = regionName=TestTable,283887,1435248198798.028fb0324cd6eb03d5022eb8c147b7c4., storeName=info, fileCount=3, fileSize=343.4 M (114.5 M, 114.5 M, 114.5 M), priority=3, time=7536968291719985 java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController$ActiveCompaction.access$700(PressureAwareCompactionThroughputController.java:79) at 
org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController.finish(PressureAwareCompactionThroughputController.java:238) at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:306) at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:106) at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:112) at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1202) at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1792) at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:524) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) 2015-06-25 21:33:46,745 INFO [rs(stobdtserver3,16040,1435248182301)-flush-proc-pool3-thread-1] regionserver.DefaultStoreFlusher: Flushed, sequenceid=1534, memsize=76.9 M, hasBloomFilter=true, into tmp file hdfs://stobdtserver3:9010/hbase/data/default/TestTable/028fb0324cd6eb03d5022eb8c147b7c4/.tmp/942ba0831a0047a08987439e34361a0c 2015-06-25 21:33:46,772 INFO [rs(stobdtserver3,16040,1435248182301)-flush-proc-pool3-thread-1] regionserver.HStore: Added hdfs://stobdtserver3:9010/hbase/data/default/TestTable/028fb0324cd6eb03d5022eb8c147b7c4/info/942ba0831a0047a08987439e34361a0c, entries=68116, sequenceid=1534, filesize=68.7 M 2015-06-25 21:33:46,773 INFO [rs(stobdtserver3,16040,1435248182301)-flush-proc-pool3-thread-1] regionserver.HRegion: Finished memstore flush of ~76.91 MB/80649344, currentsize=0 B/0 for region TestTable,283887,1435248198798.028fb0324cd6eb03d5022eb8c147b7c4. 
in 723ms, sequenceid=1534, compaction requested=true 2015-06-25 21:33:46,780 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received created event:/hbase/flush-table-proc/reached/TestTable 2015-06-25 21:33:46,790 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received created event:/hbase/flush-table-proc/abort/TestTable 2015-06-25 21:33:46,791 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received procedure abort children changed event: /hbase/flush-table-proc/abort 2015-06-25 21:33:46,803 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received procedure start children changed event: /hbase/flush-table-proc/acquired 2015-06-25 21:33:46,818 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received procedure abort children changed event:
[jira] [Commented] (HBASE-13925) Use zookeeper multi to clear znodes in ZKProcedureUtil
[ https://issues.apache.org/jira/browse/HBASE-13925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613644#comment-14613644 ] Hudson commented on HBASE-13925: FAILURE: Integrated in HBase-TRUNK #6626 (See [https://builds.apache.org/job/HBase-TRUNK/6626/]) HBASE-13925 Use zookeeper multi to clear znodes in ZKProcedureUtil (apurtell: rev 5b75faefccee87b23344d53e65f7bf6efa957779) * hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java * hbase-server/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZKMulti.java * hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureUtil.java Use zookeeper multi to clear znodes in ZKProcedureUtil -- Key: HBASE-13925 URL: https://issues.apache.org/jira/browse/HBASE-13925 Project: HBase Issue Type: Improvement Affects Versions: 0.98.13 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-13925-v1-again.patch, HBASE-13925-v1.patch, HBASE-13925.patch Address the TODO in ZKProcedureUtil clearChildZNodes() and clearZNodes methods -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14002) Add --noReplicationSetup option to IntegrationTestReplication
[ https://issues.apache.org/jira/browse/HBASE-14002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613636#comment-14613636 ] Hudson commented on HBASE-14002: FAILURE: Integrated in HBase-TRUNK #6626 (See [https://builds.apache.org/job/HBase-TRUNK/6626/]) HBASE-14002 Add --noReplicationSetup option to IntegrationTestReplication (apurtell: rev e640f1e76af8f32015f475629610da127897f01e) * hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestReplication.java Add --noReplicationSetup option to IntegrationTestReplication - Key: HBASE-14002 URL: https://issues.apache.org/jira/browse/HBASE-14002 Project: HBase Issue Type: Improvement Components: integration tests Reporter: Dima Spivak Assignee: Dima Spivak Fix For: 2.0.0, 0.98.14, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14002_master.patch IntegrationTestReplication has been flaky for me on pre-1.1 versions of HBase because of not-actually-synchronous operations in HBaseAdmin/Admin, which hamper its setupTablesAndReplication method. To get around this, I'd like to add a -nrs/--noReplicationSetup option to the test to allow it to be run on clusters in which the necessary tables and replication have already been set up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14005) Set permission to .top hfile in LoadIncrementalHFiles
[ https://issues.apache.org/jira/browse/HBASE-14005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613635#comment-14613635 ] Hudson commented on HBASE-14005: FAILURE: Integrated in HBase-TRUNK #6626 (See [https://builds.apache.org/job/HBase-TRUNK/6626/]) HBASE-14005 Set permission to .top hfile in LoadIncrementalHFiles (Francesco MDE) (tedyu: rev 34dfd6c9b4d398074ba9dcc8423cf8a410c4d6b2) * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java Set permission to .top hfile in LoadIncrementalHFiles - Key: HBASE-14005 URL: https://issues.apache.org/jira/browse/HBASE-14005 Project: HBase Issue Type: Bug Reporter: Francesco MDE Assignee: Francesco MDE Priority: Trivial Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14005.patch Set the same -rwxrwxrwx permission to .top file as .bottom and _tmp -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14015) Allow setting a richer state value when toString a pv2
[ https://issues.apache.org/jira/browse/HBASE-14015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613642#comment-14613642 ] Hudson commented on HBASE-14015: FAILURE: Integrated in HBase-TRUNK #6626 (See [https://builds.apache.org/job/HBase-TRUNK/6626/]) HBASE-14015 Allow setting a richer state value when toString a pv2 (stack: rev 17703f03614e0803f46eadb70a2242060d04125c) * hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java * hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestProcedureToString.java * hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/StateMachineProcedure.java Allow setting a richer state value when toString a pv2 -- Key: HBASE-14015 URL: https://issues.apache.org/jira/browse/HBASE-14015 Project: HBase Issue Type: Improvement Components: proc-v2 Reporter: stack Assignee: stack Priority: Minor Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: 0001-HBASE-14015-Allow-setting-a-richer-state-value-when-.patch, 14015.addendum.to.fix.compile.issue.on.branch-1.branch-1.2.txt Debugging, my procedure after a crash was loaded out of the store and its state was RUNNING. It would help if I knew in which of the states of a StateMachineProcedure it was going to start RUNNING at. Chatting w/ Matteo, he suggested allowing Procedures customize the String. Here is patch that makes it so StateMachineProcedure will now print out the base state -- RUNNING, FINISHED -- followed by a ':' and then the StateMachineProcedure state: e.g. SimpleStateMachineProcedure state=RUNNABLE:SERVER_CRASH_ASSIGN -- This message was sent by Atlassian JIRA (v6.3.4#6332)
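The toString scheme described above (base procedure state, a ':', then the state-machine-specific state) can be sketched with a minimal, self-contained model. The class and enum names below are illustrative, not the actual Procedure v2 API:

```java
// Hypothetical sketch of the "base:machine" state string described above;
// these types are illustrative, not the real HBase Procedure v2 classes.
public class ProcedureStateSketch {
    enum ProcedureState { RUNNABLE, RUNNING, FINISHED }
    enum ServerCrashState { SERVER_CRASH_START, SERVER_CRASH_ASSIGN }

    // The base state alone is ambiguous after a crash-restart; suffixing
    // the state-machine state shows where execution will resume.
    static String toStringState(ProcedureState base, ServerCrashState machine) {
        return base + ":" + machine;
    }

    public static void main(String[] args) {
        System.out.println("SimpleStateMachineProcedure state="
            + toStringState(ProcedureState.RUNNABLE, ServerCrashState.SERVER_CRASH_ASSIGN));
    }
}
```

Running this prints the same shape as the example in the issue: `state=RUNNABLE:SERVER_CRASH_ASSIGN`.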
[jira] [Commented] (HBASE-14010) TestRegionRebalancing.testRebalanceOnRegionServerNumberChange fails; cluster not balanced
[ https://issues.apache.org/jira/browse/HBASE-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613640#comment-14613640 ] Hudson commented on HBASE-14010: FAILURE: Integrated in HBase-TRUNK #6626 (See [https://builds.apache.org/job/HBase-TRUNK/6626/]) HBASE-14010 TestRegionRebalancing.testRebalanceOnRegionServerNumberChange fails; cluster not balanced (stack: rev 90b51e85c4506089f39e6fe3bb27f338492bade6) * hbase-server/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java TestRegionRebalancing.testRebalanceOnRegionServerNumberChange fails; cluster not balanced - Key: HBASE-14010 URL: https://issues.apache.org/jira/browse/HBASE-14010 Project: HBase Issue Type: Bug Components: test Reporter: stack Assignee: stack Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: 14010.txt, 14010.txt, 14010.txt java.lang.AssertionError: null at org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange(TestRegionRebalancing.java:144) from recent build https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/14639/testReport/junit/org.apache.hadoop.hbase/TestRegionRebalancing/testRebalanceOnRegionServerNumberChange_0_/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14023) HBase Stores NULL Value from delimited File Input
Soumendra Kumar Mishra created HBASE-14023: -- Summary: HBase Stores NULL Value from delimited File Input Key: HBASE-14023 URL: https://issues.apache.org/jira/browse/HBASE-14023 Project: HBase Issue Type: Bug Reporter: Soumendra Kumar Mishra Data: 101,SMITH,41775,,1000,,100,10 102,ALLEN,,77597,2000,,,20 103,WARD,,,2000,500,,30 Result: ROW COLUMN+CELL 101 column=info:dept, timestamp=1435992182400, value=10 101 column=info:ename, timestamp=1435992182400, value=SMITH 101 column=pay:bonus, timestamp=1435992182400, value=100 101 column=pay:comm, timestamp=1435992182400, value= 101 column=pay:sal, timestamp=1435992182400, value=1000 101 column=tel:mobile, timestamp=1435992182400, value= 101 column=tel:telephone, timestamp=1435992182400, value=41775 I am using Pig to write data into HBase. The same issue occurs when data is inserted into HBase from a text file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
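The `value=` cells in the scan output above suggest the empty delimited fields are being stored as zero-length values rather than being skipped. A minimal, hypothetical Java illustration of that behavior (not the actual Pig/HBase load path):

```java
// Hypothetical illustration: empty tokens in a delimited row survive the
// split and would be written as zero-length cell values (value=).
public class EmptyFieldSketch {
    public static void main(String[] args) {
        // First input row from the report above.
        String line = "101,SMITH,41775,,1000,,100,10";
        // A negative limit keeps internal and trailing empty tokens; a naive
        // loader then stores each empty token as an empty value instead of
        // skipping the cell entirely.
        String[] fields = line.split(",", -1);
        System.out.println(fields.length);        // 8
        System.out.println(fields[3].isEmpty());  // true -> stored as value=
    }
}
```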
[jira] [Created] (HBASE-14024) ImportTsv is not loading hbase-default.xml
Ashish Singhi created HBASE-14024: - Summary: ImportTsv is not loading hbase-default.xml Key: HBASE-14024 URL: https://issues.apache.org/jira/browse/HBASE-14024 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Critical Fix For: 2.0.0 ImportTsv job is failing with below exception {noformat} Exception in thread main java.lang.IllegalArgumentException: Can not create a Path from a null string at org.apache.hadoop.fs.Path.checkPathArg(Path.java:123) at org.apache.hadoop.fs.Path.init(Path.java:135) at org.apache.hadoop.fs.Path.init(Path.java:89) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configurePartitioner(HFileOutputFormat2.java:591) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:441) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:406) at org.apache.hadoop.hbase.mapreduce.ImportTsv.createSubmittableJob(ImportTsv.java:555) at org.apache.hadoop.hbase.mapreduce.ImportTsv.run(ImportTsv.java:763) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.mapreduce.ImportTsv.main(ImportTsv.java:772) {noformat} {{hbase.fs.tmp.dir}} is set to a default value in hbase-default.xml. I found that hbase configuration resources from its xml are not loaded into conf object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
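The failure fits the description: configurePartitioner builds a Path from the value of {{hbase.fs.tmp.dir}}, which is null when hbase-default.xml was never added as a configuration resource. A minimal, hypothetical model of the resource layering follows; the default value shown is an assumption, and the loading logic is illustrative (in real code the fix is to build the job configuration so that hbase-default.xml and hbase-site.xml are layered in, e.g. via HBaseConfiguration.create()):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of layered configuration resources; the key mirrors
// the report, the default value and loading logic are illustrative.
public class ConfLayeringSketch {
    private final Map<String, String> props = new HashMap<>();

    // What loading hbase-default.xml as a resource would contribute.
    void addHBaseDefaultResource() {
        props.putIfAbsent("hbase.fs.tmp.dir", "/user/${user.name}/hbase-staging");
    }

    String get(String key) { return props.get(key); }

    public static void main(String[] args) {
        ConfLayeringSketch conf = new ConfLayeringSketch();
        // Without the default resource the lookup is null, and constructing
        // a Path from a null string throws IllegalArgumentException.
        System.out.println(conf.get("hbase.fs.tmp.dir"));
        conf.addHBaseDefaultResource();
        System.out.println(conf.get("hbase.fs.tmp.dir"));
    }
}
```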
[jira] [Work stopped] (HBASE-13867) Add endpoint coprocessor guide to HBase book
[ https://issues.apache.org/jira/browse/HBASE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-13867 stopped by Gaurav Bhardwaj. --- Add endpoint coprocessor guide to HBase book Key: HBASE-13867 URL: https://issues.apache.org/jira/browse/HBASE-13867 Project: HBase Issue Type: Task Components: Coprocessors, documentation Reporter: Vladimir Rodionov Assignee: Gaurav Bhardwaj Endpoint coprocessors are very poorly documented. Coprocessor section of HBase book must be updated either with its own endpoint coprocessors HOW-TO guide or, at least, with the link(s) to some other guides. There is good description here: http://www.3pillarglobal.com/insights/hbase-coprocessors -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13637) branch-1.1 does not build against hadoop-2.2.
[ https://issues.apache.org/jira/browse/HBASE-13637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13637: Fix Version/s: (was: 1.2.0) 2.0.0 branch-1.1 does not build against hadoop-2.2. - Key: HBASE-13637 URL: https://issues.apache.org/jira/browse/HBASE-13637 Project: HBase Issue Type: Bug Reporter: Nick Dimiduk Assignee: Duo Zhang Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13637-branch-1.1.01.patch, HBASE-13637-branch-1.1.patch From RC0 VOTE thread, {quote} The build is broken with Hadoop-2.2 because mini-kdc is not found: [ERROR] Failed to execute goal on project hbase-server: Could not resolve dependencies for project org.apache.hbase:hbase-server:jar:1.1.0: Could not find artifact org.apache.hadoop:hadoop-minikdc:jar:2.2 {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13623) TestSplitLogManager.testGetPreviousRecoveryMode is still flaky
[ https://issues.apache.org/jira/browse/HBASE-13623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13623: Fix Version/s: (was: 1.1.1) (was: 1.2.0) (was: 2.0.0) TestSplitLogManager.testGetPreviousRecoveryMode is still flaky -- Key: HBASE-13623 URL: https://issues.apache.org/jira/browse/HBASE-13623 Project: HBase Issue Type: Test Components: master Affects Versions: 1.1.0 Reporter: Nick Dimiduk Attachments: TEST-org.apache.hadoop.hbase.master.TestSplitLogManager.xml Even with retry failing tests, I'm seeing {noformat} org.apache.hadoop.hbase.master.TestSplitLogManager.testGetPreviousRecoveryMode(org.apache.hadoop.hbase.master.TestSplitLogManager) Run 1: TestSplitLogManager.testGetPreviousRecoveryMode:661 Mode4=LOG_SPLITTING Run 2: TestSplitLogManager.testGetPreviousRecoveryMode:661 Mode4=LOG_SPLITTING Run 3: TestSplitLogManager.testGetPreviousRecoveryMode:661 Mode4=LOG_SPLITTING java.lang.AssertionError: Mode4=LOG_SPLITTING at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.hadoop.hbase.master.TestSplitLogManager.testGetPreviousRecoveryMode(TestSplitLogManager.java:661) {noformat} Let me give [~Apache9]'s test procedure from HBASE-13136 a spin. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13904) TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode failing consistently on branch-1.1
[ https://issues.apache.org/jira/browse/HBASE-13904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13904: Fix Version/s: (was: 1.3.0) * branch-1: ec251bdd3649de7f30ece914c7930498e642527e * branch-1.0: cf1ccc30909bfb04326415e5a648605759d57360 * branch-1.1: ed62e08786273587378b86278fae452dfc817dfb * branch-1.2: 1eb8ac6fe9dd0c15cdb52f66ced4136316c06465 TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode failing consistently on branch-1.1 --- Key: HBASE-13904 URL: https://issues.apache.org/jira/browse/HBASE-13904 Project: HBase Issue Type: Bug Components: master, Region Assignment, test Affects Versions: 1.1.1 Reporter: Nick Dimiduk Assignee: Mikhail Antonov Priority: Critical Fix For: 1.0.2, 1.2.0, 1.1.1 Attachments: HBASE-13904-branch-1.1.1.patch, HBASE-13904-mantonov_running_whole_class.txt, org.apache.hadoop.hbase.master.TestAssignmentManager-output.txt {noformat} $ JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.79.x86_64 ../apache-maven-3.3.3/bin/mvn -PrunAllTests -DreuseForks=false clean install -Dmaven.test.redirectTestOutputToFile=true -Dsurefire.rerunFailingTestsCount=4 -Dit.test=noItTest ... Tests in error: org.apache.hadoop.hbase.master.TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode(org.apache.hadoop.hbase.master.TestAssignmentManager) Run 1: TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode:368 » Run 2: TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode:335 » Run 3: TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode:335 » Run 4: TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode:335 » Run 5: TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode:335 » {noformat} {noformat} --- Test set: org.apache.hadoop.hbase.master.TestAssignmentManager --- Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 393.384 sec FAILURE! 
- in org.apache.hadoop.hbase.master.TestAssignmentManager testBalanceOnMasterFailoverScenarioWithOfflineNode(org.apache.hadoop.hbase.master.TestAssignmentManager) Time elapsed: 57.873 sec ERROR! java.lang.Exception: test timed out after 6 milliseconds at java.lang.Thread.sleep(Native Method) at org.apache.hadoop.hbase.master.TestAssignmentManager.testBalanceOnMasterFailoverScenarioWithOfflineNode(TestAssignmentManager.java:335) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14012) Double Assignment and Dataloss when ServerCrashProcedure runs during Master failover
[ https://issues.apache.org/jira/browse/HBASE-14012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14012: -- Description: (Rewrite to be more explicit about what the problem is) ITBLL. Master comes up (It is being killed every 1-5 minutes or so). It is joining a running cluster (all servers up except Master with most regions assigned out on cluster). ProcedureStore has two ServerCrashProcedures unfinished (RUNNABLE state) for two separate servers. One SCP is in the middle of the assign step when master crashes (SERVER_CRASH_ASSIGN). This SCP step has this comment on it: {code} // Assign may not be idempotent. SSH used to requeue the SSH if we got an IOE assigning // which is what we are mimicing here but it looks prone to double assignment if assign // fails midway. TODO: Test. {code} This issue is 1.2+ only since it is ServerCrashProcedure (Added in HBASE-13616, post hbase-1.1.x). Looking at ServerShutdownHandler, how we used to do crash processing before we moved over to the Pv2 framework, SSH may have (accidentally) avoided this issue since it does its processing in a big blob starting over if killed mid-crash. In particular, post-crash, SSH scans hbase:meta to find servers that were on the downed server. SCP scans Meta in one step, saves off the regions it finds into the ProcedureStore, and then in the next step, does actual assign. In this case, we crashed post-meta scan and during assign. Assign is a bulk assign. It mostly succeeded but got this: {code} 809622 2015-06-09 20:05:28,576 INFO [ProcedureExecutorThread-9] master.GeneralBulkAssigner: Failed assigning 3 regions to server c2021.halxg.cloudera.com,16020,1433905510696, reassigning them {code} So, most regions actually made it to new locations except for a few stragglers. All of the successfully assigned regions then are reassigned on other side of master restart when we replay the SCP assign step. 
Let me put together the scan meta and assign steps in SCP; this should do until we redo all of assign to run on Pv2. A few other things I noticed: In SCP, we only check if failover in first step, not for every step, which means ServerCrashProcedure will run if on reload it is beyond the first step. {code} // Is master fully online? If not, yield. No processing of servers unless master is up if (!services.getAssignmentManager().isFailoverCleanupDone()) { throwProcedureYieldException("Waiting on master failover to complete"); } {code} This means we are assigning while Master is still coming up, a no-no (though it does not seem to have caused a problem here). Fix. Also, I see that over the 8 hours of this particular log, each time the master crashes and comes back up, we queue a ServerCrashProcedure for c2022 because an empty dir never gets cleaned up: {code} 39 2015-06-09 22:15:33,074 WARN [ProcedureExecutorThread-0] master.SplitLogManager: returning success without actually splitting and deleting all the log files in path hdfs://c2020.halxg.cloudera.com:8020/hbase/WALs/c2022.halxg.cloudera.com,16020,1433902151857-splitting {code} Fix this too. was: ITBLL. Master comes up. It is joining a running cluster (all servers up except Master with most regions assigned out on cluster). ProcedureStore has two ServerCrashProcedures unfinished (RUNNABLE state). In SCP, we only check if failover in first step, not for every step, which means ServerCrashProcedure will run if on reload it is beyond the first step. {code} // Is master fully online? If not, yield. No processing of servers unless master is up if (!services.getAssignmentManager().isFailoverCleanupDone()) { throwProcedureYieldException("Waiting on master failover to complete"); } {code} There is no definitive logging but it looks like we start running at the assign step. The regions to assign were persisted before master crash. The regions to assign may not make sense post crash: i.e. here we double-assign. Checking. 
We shouldn't run until master is fully up regardless. Double Assignment and Dataloss when ServerCrashProcedure runs during Master failover Key: HBASE-14012 URL: https://issues.apache.org/jira/browse/HBASE-14012 Project: HBase Issue Type: Bug Components: master, Region Assignment Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical (Rewrite to be more explicit about what the problem is) ITBLL. Master comes up (It is being killed every 1-5 minutes or so). It is joining a running cluster (all servers up except Master with most regions assigned out on cluster). ProcedureStore has two ServerCrashProcedures unfinished (RUNNABLE state) for two separate servers. One SCP is in the
[jira] [Updated] (HBASE-13910) add branch-1.2 to precommit branches
[ https://issues.apache.org/jira/browse/HBASE-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13910: Fix Version/s: (was: 1.2.0) add branch-1.2 to precommit branches Key: HBASE-13910 URL: https://issues.apache.org/jira/browse/HBASE-13910 Project: HBase Issue Type: Sub-task Components: build Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 2.0.0 Attachments: HBASE-13910.1.patch update the precommit test properties so that patches targeting branch-1.2 work -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614056#comment-14614056 ] Hadoop QA commented on HBASE-14017: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12743601/HBASE-14017.v1-branch1.1.patch against master branch at commit e640f1e76af8f32015f475629610da127897f01e. ATTACHMENT ID: 12743601 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14666//console This message is automatically generated. Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.1, 1.3.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue delete where we don't have an exclusive lock before deleting the table {noformat} Thread 1: Create table is running - the queue is empty and wlock is false Thread 2: markTableAsDeleted sees the queue empty and wlock=false Thread 1: tryWrite() sets wlock=true; too late Thread 2: delete the queue Thread 1: never able to release the lock - NPE when trying to get the queue {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
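The losing interleaving in the description can be replayed deterministically with a simplified model of the table queue. This is a hypothetical sketch, not the actual MasterProcedureQueue code; the names and structure are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the race described above; the two
// "threads" are replayed sequentially in the losing order.
public class TableQueueRaceSketch {
    static class TableQueue { boolean wlock = false; }
    static final Map<String, TableQueue> queues = new HashMap<>();

    // Thread 2's check: the queue looks deletable because wlock is read
    // before Thread 1 manages to set it.
    static void markTableAsDeleted(String table) {
        TableQueue q = queues.get(table);
        if (q != null && !q.wlock) {
            queues.remove(table); // deletes the queue out from under Thread 1
        }
    }

    public static void main(String[] args) {
        queues.put("t1", new TableQueue());
        TableQueue stale = queues.get("t1"); // Thread 1 holds a reference
        markTableAsDeleted("t1");            // Thread 2 wins the race
        stale.wlock = true;                  // Thread 1's tryWrite(): too late
        // Thread 1 later looks the queue up again to release the lock,
        // gets null, and hits the reported NPE path.
        System.out.println(queues.get("t1"));
    }
}
```

The fix described by the patch title, an exclusive lock taken before the delete, would make the wlock check and the removal a single atomic step, closing the window exploited here.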
[jira] [Commented] (HBASE-14024) ImportTsv is not loading hbase-default.xml
[ https://issues.apache.org/jira/browse/HBASE-14024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614060#comment-14614060 ] Hadoop QA commented on HBASE-14024: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12743597/HBASE-14024.patch against master branch at commit e640f1e76af8f32015f475629610da127897f01e. ATTACHMENT ID: 12743597 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14662//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14662//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14662//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14662//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14662//console This message is automatically generated. ImportTsv is not loading hbase-default.xml -- Key: HBASE-14024 URL: https://issues.apache.org/jira/browse/HBASE-14024 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Critical Fix For: 2.0.0 Attachments: HBASE-14024.patch ImportTsv job is failing with below exception {noformat} Exception in thread main java.lang.IllegalArgumentException: Can not create a Path from a null string at org.apache.hadoop.fs.Path.checkPathArg(Path.java:123) at org.apache.hadoop.fs.Path.init(Path.java:135) at org.apache.hadoop.fs.Path.init(Path.java:89) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configurePartitioner(HFileOutputFormat2.java:591) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:441) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:406) at org.apache.hadoop.hbase.mapreduce.ImportTsv.createSubmittableJob(ImportTsv.java:555) at org.apache.hadoop.hbase.mapreduce.ImportTsv.run(ImportTsv.java:763) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.mapreduce.ImportTsv.main(ImportTsv.java:772) {noformat} {{hbase.fs.tmp.dir}} is set to a default value in hbase-default.xml. 
I found that hbase configuration resources from its xml are not loaded into conf object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
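The usual fix for this class of bug is to route the tool's Configuration through HBaseConfiguration.create() / HBaseConfiguration.addHbaseResources(...), which layer hbase-default.xml (and hbase-site.xml) underneath the job's own settings. HBase itself is not on the classpath here, so the sketch below only models that layering with plain java.util.Properties; the key hbase.fs.tmp.dir comes from the report above, everything else is illustrative.

```java
import java.util.Properties;

class LayeredConf {
    // Stands in for hbase-default.xml: the shipped defaults layer.
    static Properties hbaseDefaults() {
        Properties p = new Properties();
        p.setProperty("hbase.fs.tmp.dir", "/user/${user.name}/hbase-staging");
        return p;
    }

    static String lookup(Properties conf, String key) {
        return conf.getProperty(key);
    }

    public static void main(String[] args) {
        // A bare conf (analogous to `new Configuration()`): no HBase resources,
        // so the lookup returns null -> "Can not create a Path from a null string".
        Properties bare = new Properties();
        // A conf with the defaults layered in (analogous to HBaseConfiguration.create()).
        Properties merged = new Properties(hbaseDefaults());
        System.out.println(lookup(bare, "hbase.fs.tmp.dir"));   // prints null
        System.out.println(lookup(merged, "hbase.fs.tmp.dir")); // resolved from the defaults layer
    }
}
```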
[jira] [Updated] (HBASE-14024) ImportTsv is not loading hbase-default.xml
[ https://issues.apache.org/jira/browse/HBASE-14024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-14024: -- Attachment: HBASE-14024.patch ImportTsv is not loading hbase-default.xml -- Key: HBASE-14024 URL: https://issues.apache.org/jira/browse/HBASE-14024 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Critical Fix For: 2.0.0 Attachments: HBASE-14024.patch ImportTsv job is failing with below exception {noformat} Exception in thread main java.lang.IllegalArgumentException: Can not create a Path from a null string at org.apache.hadoop.fs.Path.checkPathArg(Path.java:123) at org.apache.hadoop.fs.Path.init(Path.java:135) at org.apache.hadoop.fs.Path.init(Path.java:89) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configurePartitioner(HFileOutputFormat2.java:591) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:441) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:406) at org.apache.hadoop.hbase.mapreduce.ImportTsv.createSubmittableJob(ImportTsv.java:555) at org.apache.hadoop.hbase.mapreduce.ImportTsv.run(ImportTsv.java:763) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.mapreduce.ImportTsv.main(ImportTsv.java:772) {noformat} {{hbase.fs.tmp.dir}} is set to a default value in hbase-default.xml. I found that hbase configuration resources from its xml are not loaded into conf object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14024) ImportTsv is not loading hbase-default.xml
[ https://issues.apache.org/jira/browse/HBASE-14024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-14024: -- Status: Patch Available (was: Open) ImportTsv is not loading hbase-default.xml -- Key: HBASE-14024 URL: https://issues.apache.org/jira/browse/HBASE-14024 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Critical Fix For: 2.0.0 Attachments: HBASE-14024.patch ImportTsv job is failing with below exception {noformat} Exception in thread main java.lang.IllegalArgumentException: Can not create a Path from a null string at org.apache.hadoop.fs.Path.checkPathArg(Path.java:123) at org.apache.hadoop.fs.Path.init(Path.java:135) at org.apache.hadoop.fs.Path.init(Path.java:89) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configurePartitioner(HFileOutputFormat2.java:591) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:441) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:406) at org.apache.hadoop.hbase.mapreduce.ImportTsv.createSubmittableJob(ImportTsv.java:555) at org.apache.hadoop.hbase.mapreduce.ImportTsv.run(ImportTsv.java:763) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.mapreduce.ImportTsv.main(ImportTsv.java:772) {noformat} {{hbase.fs.tmp.dir}} is set to a default value in hbase-default.xml. I found that hbase configuration resources from its xml are not loaded into conf object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14025) Update CHANGES.txt for 1.2
[ https://issues.apache.org/jira/browse/HBASE-14025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14614032#comment-14614032 ] Sean Busbey commented on HBASE-14025: - (opinion): JIRA is the source of truth for what is fixed in a given version. The CHANGES.txt file in a release is meant as a convenience for downstream folks who might not have access to JIRA. A primary benefit of providing this convenience is making sure the RM does a pass at validating the fix versions set in JIRA. # Clean up JIRA for your release ## get the release notes from JIRA in text format and save it (i.e. as CHANGES_1.2.0_raw.txt) ### i.e. project home - versions - 1.2.0 - Release Notes - Configure Release Notes - format: text - Create (leads to [these text release notes for 1.2.0|https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12332062&styleName=Text&projectId=12310753]) ### the text box at the bottom will have the contents to copy ### filter the file down to just a list of JIRA IDs {code} # on OS X, just use pbpaste as the first stage of this pipeline instead of saving to a temp file $ cat CHANGES_1.2.0_raw.txt | grep -o -E '\[HBASE-[0-9]*\]' | grep -o -E 'HBASE-[0-9]*' | sort -u > CHANGES_1.2.0_jira.txt {code} ## get the set of JIRAs that are marked in git ### find the commit where your release line branched off from the previous release. {code} $ git merge-base 1.1.0 branch-1.2 8166142b2e815fc6ab30c14a5a546cd242bf3b4c {code} ### get the set of all jiras committed between that branch point and the previous release. 
{code} $ git log --oneline 8166142...1.1.0 | grep -o -E 'HBASE-[0-9]*' | awk '{$1=$1;print}' | sort -u > in_1.1.0.txt {code} ### get the set of all jiras committed between that branch point and the current release.{code} $ git log --oneline 8166142...branch-1.2 | grep -o -E 'HBASE-[0-9]*' | awk '{$1=$1;print}' | sort -u > in_1.2.0.txt {code} ### check the set of jiras that appear in the previous release after the branch but not in the current release.{code} $ comm -2 -3 in_1.1.0.txt in_1.2.0.txt {code} ###* these can be release specific changes (i.e. disabling DLR in the 1.1.0 release) ###* they can be changes that were reverted during the prior release's RC process that have not been reapplied ###* they can be changes that were mistakenly only applied in the branch for the previous release and need to be cherry picked ### find the jiras for the current release that were not applied to the previous release after branching. this should be all the jiras that are in git for the current release.{code} $ comm -1 -3 in_1.1.0.txt in_1.2.0.txt > CHANGES_1.2.0_git.txt {code} ## go through JIRAs that appear in the JIRA tracker but not in git {code} $ comm -1 -3 CHANGES_1.2.0_git.txt CHANGES_1.2.0_jira.txt {code} ##* the above process can give you false positives here, e.g. if a commit was in the prior release but then reverted and reapplied in your release; be sure to check the git history ##* sometimes these are issues that didn't get properly cleaned up when things made it into the previous minor/major release during the RC process ##* sometimes they're task or umbrella jiras that don't have a representation in git ##* sometimes they're documentation jiras that shouldn't have been marked for the release (doc tickets always are resolved against the major version that was in the _master_ git branch at the time) ##* sometimes they're improperly closed tickets (e.g. 
invalid, duplicate, and wontfix tickets should not have a fix version) ##* sometimes they were improperly labeled on commit and weren't reverted and reapplied with the correct commit message. ##* occasionally they're changes that should be in your branch but were missed. in this case reopen the ticket and cherry-pick to your branch ## go through JIRAs that appear in git but not in JIRA {code} $ comm -2 -3 CHANGES_1.2.0_git.txt CHANGES_1.2.0_jira.txt {code} ##* sometimes these were improperly labeled on commit. ideally they will have been reverted and reapplied with the correct commit message. ##* sometimes they weren't marked correctly in jira because they were pushed out of the previous minor/major release during the RC process. # remove entries in CHANGES.txt unrelated to your current release #* for a PATCH release, you should be starting from a file that only contains related changes #* for a MINOR release, you should reset the file so that you will only have changes for prior minor releases in the same major release (e.g. for the 1.2.0 release CHANGES.txt will start by copying the notes from 1.0.0 and 1.1.0) #* for a MAJOR release, you should reset the file so that you will only have changes for prior major releases (e.g. for the 2.0.0 release CHANGES.txt will start by copying the notes from 1.0.0) # generate a post-clean-up set of text release notes from JIRA and add it to the top of the CHANGES.txt file with the current date.
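The comm invocations in the recipe above are plain set differences over sorted lists of JIRA IDs. The same bookkeeping can be rendered in Java terms (file names and sample IDs below are illustrative, not from the actual 1.2 audit):

```java
import java.util.Set;
import java.util.TreeSet;

class JiraDiff {
    // comm -2 -3 a b : lines only in a.   comm -1 -3 a b : lines only in b.
    static Set<String> onlyInFirst(Set<String> a, Set<String> b) {
        Set<String> out = new TreeSet<>(a);
        out.removeAll(b);
        return out;
    }

    public static void main(String[] args) {
        Set<String> inPrev = new TreeSet<>(Set.of("HBASE-1", "HBASE-2")); // in_1.1.0.txt
        Set<String> inCur  = new TreeSet<>(Set.of("HBASE-2", "HBASE-3")); // in_1.2.0.txt
        // Only in the previous release line: release-specific or reverted changes.
        System.out.println(onlyInFirst(inPrev, inCur)); // prints [HBASE-1]
        // Only in the current release line: candidates for CHANGES_1.2.0_git.txt.
        System.out.println(onlyInFirst(inCur, inPrev)); // prints [HBASE-3]
    }
}
```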
[jira] [Updated] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-14017: Attachment: HBASE-14017-v0.patch reattaching patches now that jenkins is back. Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.1, 1.3.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue deletion where we don't take an exclusive lock before deleting the table queue {noformat} Thread 1: Create table is running - the queue is empty and wlock is false Thread 2: markTableAsDeleted sees the queue empty and wlock=false Thread 1: tryWrite() sets wlock=true; too late Thread 2: deletes the queue Thread 1: never able to release the lock - NPE when trying to get the queue {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-14017: Status: Patch Available (was: Open) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 1.1.1, 2.0.0, 1.2.0, 1.3.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue deletion where we don't take an exclusive lock before deleting the table queue {noformat} Thread 1: Create table is running - the queue is empty and wlock is false Thread 2: markTableAsDeleted sees the queue empty and wlock=false Thread 1: tryWrite() sets wlock=true; too late Thread 2: deletes the queue Thread 1: never able to release the lock - NPE when trying to get the queue {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14614055#comment-14614055 ] Hadoop QA commented on HBASE-14017: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12743601/HBASE-14017.v1-branch1.1.patch against master branch at commit e640f1e76af8f32015f475629610da127897f01e. ATTACHMENT ID: 12743601 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14665//console This message is automatically generated. Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.1, 1.3.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue deletion where we don't take an exclusive lock before deleting the table queue {noformat} Thread 1: Create table is running - the queue is empty and wlock is false Thread 2: markTableAsDeleted sees the queue empty and wlock=false Thread 1: tryWrite() sets wlock=true; too late Thread 2: deletes the queue Thread 1: never able to release the lock - NPE when trying to get the queue {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
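The interleaving in the report is a check-then-act race: markTableAsDeleted() decides the queue is empty and unlocked, and tryWrite() acquires the lock, without a common critical section. The sketch below shows the invariant a fix must restore: the emptiness/lock check and the queue deletion must be atomic with lock acquisition. Names are hypothetical, loosely echoing the trace, not the actual MasterProcedureQueue code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TableQueue {
    private final Deque<Runnable> queue = new ArrayDeque<>();
    private boolean wlock = false;   // the exclusive-lock flag from the scenario
    private boolean deleted = false;

    // Thread 1's lock acquisition. Because it shares a monitor with
    // markTableAsDeleted(), it cannot interleave between that method's
    // check and its delete.
    public synchronized boolean tryExclusiveLock() {
        if (deleted || wlock) return false;
        wlock = true;
        return true;
    }

    public synchronized void releaseExclusiveLock() { wlock = false; }

    // Thread 2's deletion: the queue is removed only if it is empty AND
    // unlocked, decided under the same monitor as tryExclusiveLock().
    public synchronized boolean markTableAsDeleted() {
        if (queue.isEmpty() && !wlock) {
            deleted = true;
            return true;
        }
        return false;
    }
}
```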
[jira] [Commented] (HBASE-14025) Update CHANGES.txt for 1.2
[ https://issues.apache.org/jira/browse/HBASE-14025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614065#comment-14614065 ] Andrew Purtell commented on HBASE-14025: Thanks for documenting this black art Update CHANGES.txt for 1.2 -- Key: HBASE-14025 URL: https://issues.apache.org/jira/browse/HBASE-14025 Project: HBase Issue Type: Sub-task Components: documentation Affects Versions: 1.2.0 Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 1.2.0 Since it's more effort than I expected, making a ticket to track actually updating CHANGES.txt so that new RMs have an idea what to expect. Maybe will make doc changes if there's enough here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-12707) TestMasterObserver#testTableOperations may fail due to race condition
[ https://issues.apache.org/jira/browse/HBASE-12707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-12707. Resolution: Cannot Reproduce TestMasterObserver#testTableOperations may fail due to race condition - Key: HBASE-12707 URL: https://issues.apache.org/jira/browse/HBASE-12707 Project: HBase Issue Type: Test Reporter: Ted Yu Priority: Minor Here was the failure I saw: {code} testTableOperations(org.apache.hadoop.hbase.coprocessor.TestMasterObserver) Time elapsed: 5.153 sec FAILURE! java.lang.AssertionError: Delete table handler should be called. at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.hadoop.hbase.coprocessor.TestMasterObserver.testTableOperations(TestMasterObserver.java:1302) {code} Here is relevant part of test output: {code} 2014-12-18 00:18:47,788 DEBUG [MASTER_TABLE_OPERATIONS-kiyo:56800-0] backup.HFileArchiver(438): Finished archiving from class org.apache.hadoop.hbase.backup.HFileArchiver$FileablePath, file:hdfs://localhost:49506/user/hortonzy/test-data/caaed63b-db6f-4dad-a455-3f63897555f7/.tmp/data/default/observed_table/294700c0c28e634c9e046d5368559666/recovered.edits/7.seqid, to hdfs://localhost:49506/user/hortonzy/test-data/caaed63b-db6f-4dad-a455-3f63897555f7/archive/data/default/observed_table/294700c0c28e634c9e046d5368559666/recovered.edits/7.seqid 2014-12-18 00:18:47,789 INFO [IPC Server handler 2 on 49506] blockmanagement.BlockManager(1074): BLOCK* addToInvalidates: blk_1073741879_1055 127.0.0.1:60154 127.0.0.1:38435 2014-12-18 00:18:47,789 DEBUG [MASTER_TABLE_OPERATIONS-kiyo:56800-0] backup.HFileArchiver(453): Deleted all region files in: hdfs://localhost:49506/user/hortonzy/test-data/caaed63b-db6f-4dad-a455-3f63897555f7/.tmp/data/default/observed_table/294700c0c28e634c9e046d5368559666 2014-12-18 00:18:47,790 INFO [IPC Server handler 0 on 49506] blockmanagement.BlockManager(1074): BLOCK* addToInvalidates: blk_1073741883_1059 127.0.0.1:60154 
127.0.0.1:38435 2014-12-18 00:18:47,791 DEBUG [MASTER_TABLE_OPERATIONS-kiyo:56800-0] handler.DeleteTableHandler(160): Table 'observed_table' archived! 2014-12-18 00:18:47,791 DEBUG [MASTER_TABLE_OPERATIONS-kiyo:56800-0] handler.DeleteTableHandler(110): Removing 'observed_table' descriptor. 2014-12-18 00:18:47,792 DEBUG [MASTER_TABLE_OPERATIONS-kiyo:56800-0] handler.DeleteTableHandler(116): Removing 'observed_table' from region states. 2014-12-18 00:18:47,792 DEBUG [MASTER_TABLE_OPERATIONS-kiyo:56800-0] handler.DeleteTableHandler(120): Marking 'observed_table' as deleted. 2014-12-18 00:18:47,874 DEBUG [B.defaultRpcServer.handler=2,queue=0,port=56800] util.FSTableDescriptors(177): Exception during readTableDecriptor. Current table name = observed_table org.apache.hadoop.hbase.TableInfoMissingException: No table descriptor file under hdfs://localhost:49506/user/hortonzy/test-data/caaed63b-db6f-4dad-a455-3f63897555f7/data/default/observed_table at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorFromFs(FSTableDescriptors.java:509) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorFromFs(FSTableDescriptors.java:487) at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:172) at org.apache.hadoop.hbase.master.HMaster.listTableDescriptors(HMaster.java:2166) at org.apache.hadoop.hbase.master.MasterRpcServices.getTableDescriptors(MasterRpcServices.java:788) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:42402) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2028) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:112) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:92) at java.lang.Thread.run(Thread.java:724) 2014-12-18 00:18:47,875 INFO [main] client.HBaseAdmin(738): Deleted observed_table 2014-12-18 00:18:47,880 WARN [main] 
hbase.MetaTableAccessor$1(344): No serialized HRegionInfo in keyvalues={observed_table,,1418861912253.8551edbe4ffb771a607ad362bfc6642d./info:seqnumDuringOpen/1418861917732/Put/vlen=8/seqid=0, observed_table,,1418861912253.8551edbe4ffb771a607ad362bfc6642d./info:server/1418861917732/Put/vlen=28/seqid=0, observed_table,,1418861912253.8551edbe4ffb771a607ad362bfc6642d./info:serverstartcode/1418861917732/Put/vlen=8/seqid=0} 2014-12-18 00:18:47,924 INFO [main] hbase.ResourceChecker(171): after: coprocessor.TestMasterObserver#testTableOperations Thread=600 (was 475) {code}
[jira] [Resolved] (HBASE-12784) TestFastFailWithoutTestUtil#testPreemptiveFastFailException50Times sometimes hangs
[ https://issues.apache.org/jira/browse/HBASE-12784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-12784. Resolution: Cannot Reproduce TestFastFailWithoutTestUtil#testPreemptiveFastFailException50Times sometimes hangs -- Key: HBASE-12784 URL: https://issues.apache.org/jira/browse/HBASE-12784 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: org.apache.hadoop.hbase.client.TestFastFailWithoutTestUtil-output.txt I was running test suite against hadoop 2.7.0-SNAPSHOT and saw this: {code} testPreemptiveFastFailException50Times(org.apache.hadoop.hbase.client.TestFastFailWithoutTestUtil) Time elapsed: 120.013 sec ERROR! java.lang.Exception: test timed out after 12 milliseconds at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303) at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:248) at java.util.concurrent.FutureTask.get(FutureTask.java:111) at org.apache.hadoop.hbase.client.TestFastFailWithoutTestUtil.testPreemptiveFastFailException(TestFastFailWithoutTestUtil.java:450) at org.apache.hadoop.hbase.client.TestFastFailWithoutTestUtil.testPreemptiveFastFailException50Times(TestFastFailWithoutTestUtil.java:338) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13970) NPE during compaction in trunk
[ https://issues.apache.org/jira/browse/HBASE-13970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613921#comment-14613921 ] Hudson commented on HBASE-13970: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #999 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/999/]) HBASE-13970 NPE during compaction in trunk (zhangduo: rev ebb476ba87ba3ce5894ed1ad9350c5c89b4e0f6c) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/Compactor.java NPE during compaction in trunk -- Key: HBASE-13970 URL: https://issues.apache.org/jira/browse/HBASE-13970 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 0.98.13, 1.2.0, 1.1.1 Reporter: ramkrishna.s.vasudevan Assignee: Duo Zhang Fix For: 2.0.0, 0.98.14, 1.1.2, 1.3.0, 1.2.1 Attachments: HBASE-13970-v1.patch, HBASE-13970.patch Updated to trunk. Loaded the table with the PE tool. Triggered a flush to ensure all data is flushed out to disk. When the first compaction is triggered we get an NPE, and this is very easy to reproduce {code} 2015-06-25 21:33:46,041 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received procedure start children changed event: /hbase/flush-table-proc/acquired 2015-06-25 21:33:46,051 INFO [rs(stobdtserver3,16040,1435248182301)-flush-proc-pool3-thread-1] regionserver.HRegion: Flushing 1/1 column families, memstore=76.91 MB 2015-06-25 21:33:46,159 ERROR [regionserver/stobdtserver3/10.224.54.70:16040-longCompactions-1435248183945] regionserver.CompactSplitThread: Compaction failed Request = regionName=TestTable,283887,1435248198798.028fb0324cd6eb03d5022eb8c147b7c4., storeName=info, fileCount=3, fileSize=343.4 M (114.5 M, 114.5 M, 114.5 M), priority=3, time=7536968291719985 java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController$ActiveCompaction.access$700(PressureAwareCompactionThroughputController.java:79) at 
org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController.finish(PressureAwareCompactionThroughputController.java:238) at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:306) at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:106) at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:112) at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1202) at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1792) at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:524) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) 2015-06-25 21:33:46,745 INFO [rs(stobdtserver3,16040,1435248182301)-flush-proc-pool3-thread-1] regionserver.DefaultStoreFlusher: Flushed, sequenceid=1534, memsize=76.9 M, hasBloomFilter=true, into tmp file hdfs://stobdtserver3:9010/hbase/data/default/TestTable/028fb0324cd6eb03d5022eb8c147b7c4/.tmp/942ba0831a0047a08987439e34361a0c 2015-06-25 21:33:46,772 INFO [rs(stobdtserver3,16040,1435248182301)-flush-proc-pool3-thread-1] regionserver.HStore: Added hdfs://stobdtserver3:9010/hbase/data/default/TestTable/028fb0324cd6eb03d5022eb8c147b7c4/info/942ba0831a0047a08987439e34361a0c, entries=68116, sequenceid=1534, filesize=68.7 M 2015-06-25 21:33:46,773 INFO [rs(stobdtserver3,16040,1435248182301)-flush-proc-pool3-thread-1] regionserver.HRegion: Finished memstore flush of ~76.91 MB/80649344, currentsize=0 B/0 for region TestTable,283887,1435248198798.028fb0324cd6eb03d5022eb8c147b7c4. 
in 723ms, sequenceid=1534, compaction requested=true 2015-06-25 21:33:46,780 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received created event:/hbase/flush-table-proc/reached/TestTable 2015-06-25 21:33:46,790 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received created event:/hbase/flush-table-proc/abort/TestTable 2015-06-25 21:33:46,791 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received procedure abort children changed event: /hbase/flush-table-proc/abort 2015-06-25 21:33:46,803 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received procedure start children changed event: /hbase/flush-table-proc/acquired 2015-06-25 21:33:46,818 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received procedure abort children
[jira] [Commented] (HBASE-7847) Use zookeeper multi to clear znodes
[ https://issues.apache.org/jira/browse/HBASE-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613920#comment-14613920 ] Hudson commented on HBASE-7847: --- FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #999 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/999/]) HBASE-7847 Use zookeeper multi to clear znodes (Rakesh R) (apurtell: rev 897e11da76aa42587e8b1857f24d2880f5d0064a) * hbase-server/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZKMulti.java * hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java Use zookeeper multi to clear znodes --- Key: HBASE-7847 URL: https://issues.apache.org/jira/browse/HBASE-7847 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Rakesh R Fix For: 2.0.0, 1.1.0, 0.98.14, 1.0.2 Attachments: 7847-v1.txt, 7847_v6.patch, 7847_v6.patch, HBASE-7847.patch, HBASE-7847.patch, HBASE-7847.patch, HBASE-7847_v4.patch, HBASE-7847_v5.patch, HBASE-7847_v6.patch, HBASE-7847_v7.1.patch, HBASE-7847_v7.patch, HBASE-7847_v9.patch In ZKProcedureUtil, clearChildZNodes() and clearZNodes(String procedureName) should utilize zookeeper multi so that they're atomic -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13925) Use zookeeper multi to clear znodes in ZKProcedureUtil
[ https://issues.apache.org/jira/browse/HBASE-13925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613922#comment-14613922 ] Hudson commented on HBASE-13925: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #999 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/999/]) HBASE-13925 Use zookeeper multi to clear znodes in ZKProcedureUtil (apurtell: rev 9465a16d83fd661cd8e1612eec5c02e34629e084) * hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureUtil.java * hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java * hbase-server/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZKMulti.java Use zookeeper multi to clear znodes in ZKProcedureUtil -- Key: HBASE-13925 URL: https://issues.apache.org/jira/browse/HBASE-13925 Project: HBase Issue Type: Improvement Affects Versions: 0.98.13 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-13925-v1-again.patch, HBASE-13925-v1.patch, HBASE-13925.patch Address the TODO in ZKProcedureUtil clearChildZNodes() and clearZNodes methods -- This message was sent by Atlassian JIRA (v6.3.4#6332)
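The point of ZooKeeper's multi() here is that it bundles several operations into one atomic request: either every op is applied or none is, which is what makes it suitable for clearing a procedure's znodes in one shot. ZooKeeper itself is not on the classpath in this sketch, so it only models that all-or-nothing contract against an in-memory node set; class and method names are illustrative.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class MultiDelete {
    private final Set<String> znodes = new TreeSet<>();

    MultiDelete(String... paths) { znodes.addAll(List.of(paths)); }

    // Models multi() with delete ops: validate every path first, then
    // apply all deletes, so a failure leaves no partial deletes behind.
    void multiDelete(List<String> paths) {
        for (String p : paths) {
            if (!znodes.contains(p)) {
                // Nothing has been applied yet, mirroring multi()'s rollback.
                throw new IllegalStateException("NoNode: " + p);
            }
        }
        for (String p : paths) {
            znodes.remove(p);
        }
    }

    Set<String> paths() { return znodes; }
}
```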
[jira] [Commented] (HBASE-13978) Variable never assigned in SimpleTotalOrderPartitioner.getPartition()
[ https://issues.apache.org/jira/browse/HBASE-13978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613918#comment-14613918 ] Hudson commented on HBASE-13978: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #999 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/999/]) HBASE-13978: Variable never assigned in SimpleTotalOrderPartitioner.getPartition() (apurtell: rev 64847a6bc973771c9d373511f97215d6a299a5ca) * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/SimpleTotalOrderPartitioner.java Variable never assigned in SimpleTotalOrderPartitioner.getPartition() -- Key: HBASE-13978 URL: https://issues.apache.org/jira/browse/HBASE-13978 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 1.1.0.1 Reporter: Lars George Assignee: Bhupendra Kumar Jain Labels: beginner Fix For: 2.0.0, 0.98.14, 1.2.0 Attachments: 0001-HBASE-13978-Variable-never-assigned-in-SimpleTotalOr.patch See https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/SimpleTotalOrderPartitioner.java#L104, which has an {{if}} statement that tries to limit the code to run only once, but since it does not assign {{this.lastReduces}} it will always trigger and recompute the splits (and log them). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
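The bug pattern here is a cache guard that never updates its cache field, so the "run only once" branch fires on every call. A simplified sketch with the missing assignment restored (the field name echoes the report's lastReduces; the rest is illustrative, not the actual partitioner code):

```java
class CachedSplits {
    private int lastReduces = -1; // sentinel: no splits computed yet
    private int recomputes = 0;

    // Returns how many times splits have been (re)computed. Without the
    // `lastReduces = reduces` line below, the guard never becomes false
    // and the splits are recomputed (and logged) on every call.
    int ensureSplits(int reduces) {
        if (reduces != lastReduces) {
            lastReduces = reduces; // the assignment HBASE-13978 adds
            recomputes++;          // stands in for recomputing the splits
        }
        return recomputes;
    }
}
```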
[jira] [Commented] (HBASE-14002) Add --noReplicationSetup option to IntegrationTestReplication
[ https://issues.apache.org/jira/browse/HBASE-14002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613919#comment-14613919 ] Hudson commented on HBASE-14002: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #999 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/999/]) HBASE-14002 Add --noReplicationSetup option to IntegrationTestReplication (apurtell: rev 41306efd35249d0ae278f0b254e6033ff137c326) * hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestReplication.java Add --noReplicationSetup option to IntegrationTestReplication - Key: HBASE-14002 URL: https://issues.apache.org/jira/browse/HBASE-14002 Project: HBase Issue Type: Improvement Components: integration tests Reporter: Dima Spivak Assignee: Dima Spivak Fix For: 2.0.0, 0.98.14, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14002_master.patch IntegrationTestReplication has been flaky for me on pre-1.1 versions of HBase because of not-actually-synchronous operations in HBaseAdmin/Admin, which hamper its setupTablesAndReplication method. To get around this, I'd like to add a \-nrs/--noReplicationSetup option to the test to allow it to be run on clusters in which the necessary tables and replication have already been setup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14005) Set permission to .top hfile in LoadIncrementalHFiles
[ https://issues.apache.org/jira/browse/HBASE-14005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613917#comment-14613917 ] Hudson commented on HBASE-14005: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #999 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/999/]) HBASE-14005 Set permission to .top hfile in LoadIncrementalHFiles (Francesco MDE) (tedyu: rev 8b9859b08d19b2c875d0800809b25efba08f2502) * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java Set permission to .top hfile in LoadIncrementalHFiles - Key: HBASE-14005 URL: https://issues.apache.org/jira/browse/HBASE-14005 Project: HBase Issue Type: Bug Reporter: Francesco MDE Assignee: Francesco MDE Priority: Trivial Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14005.patch Set the same -rwxrwxrwx permission to .top file as .bottom and _tmp -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14025) Update CHANGES.txt for 1.2
Sean Busbey created HBASE-14025: --- Summary: Update CHANGES.txt for 1.2 Key: HBASE-14025 URL: https://issues.apache.org/jira/browse/HBASE-14025 Project: HBase Issue Type: Sub-task Components: documentation Affects Versions: 1.2.0 Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 1.2.0 Since it's more effort than I expected, making a ticket to track actually updating CHANGES.txt so that new RMs have an idea what to expect. Maybe will make doc changes if there's enough here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HBASE-14025) Update CHANGES.txt for 1.2
[ https://issues.apache.org/jira/browse/HBASE-14025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-14025 started by Sean Busbey. --- Update CHANGES.txt for 1.2 -- Key: HBASE-14025 URL: https://issues.apache.org/jira/browse/HBASE-14025 Project: HBase Issue Type: Sub-task Components: documentation Affects Versions: 1.2.0 Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 1.2.0 Since it's more effort than I expected, making a ticket to track actually updating CHANGES.txt so that new RMs have an idea what to expect. Maybe will make doc changes if there's enough here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14024) ImportTsv is not loading hbase-default.xml
[ https://issues.apache.org/jira/browse/HBASE-14024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614007#comment-14614007 ] Ashish Singhi commented on HBASE-14024: --- Attached patch for master. HBASE-13728 was committed only to master branch. Tested manually, it works fine. ImportTsv is not loading hbase-default.xml -- Key: HBASE-14024 URL: https://issues.apache.org/jira/browse/HBASE-14024 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Critical Fix For: 2.0.0 Attachments: HBASE-14024.patch ImportTsv job is failing with below exception {noformat} Exception in thread main java.lang.IllegalArgumentException: Can not create a Path from a null string at org.apache.hadoop.fs.Path.checkPathArg(Path.java:123) at org.apache.hadoop.fs.Path.init(Path.java:135) at org.apache.hadoop.fs.Path.init(Path.java:89) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configurePartitioner(HFileOutputFormat2.java:591) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:441) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:406) at org.apache.hadoop.hbase.mapreduce.ImportTsv.createSubmittableJob(ImportTsv.java:555) at org.apache.hadoop.hbase.mapreduce.ImportTsv.run(ImportTsv.java:763) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.mapreduce.ImportTsv.main(ImportTsv.java:772) {noformat} {{hbase.fs.tmp.dir}} is set to a default value in hbase-default.xml. I found that hbase configuration resources from its xml are not loaded into conf object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
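The failure shape above — a missing hbase-default.xml layer leaving {{hbase.fs.tmp.dir}} null — can be illustrated with a self-contained sketch. This is a stand-in for Hadoop/HBase configuration layering, not the real Configuration/HBaseConfiguration classes, and the default value shown is only representative (the shipped default is parameterized by user name):

```java
import java.util.Properties;

// Self-contained stand-in for layered configuration (NOT the real
// Hadoop Configuration / HBaseConfiguration classes): default
// resources are applied first, the job's own settings override them.
// The bug was the equivalent of reading the bare job conf below, so
// hbase.fs.tmp.dir resolved to null and Path(null) threw.
class LayeredConf {
  private final Properties props = new Properties();

  void set(String key, String value) { props.setProperty(key, value); }

  String get(String key) { return props.getProperty(key); } // null if absent

  // Mimics the effect of HBaseConfiguration.create(conf): defaults
  // first, then the existing job settings layered on top.
  static LayeredConf withHBaseDefaults(LayeredConf jobConf) {
    LayeredConf merged = new LayeredConf();
    // representative default, standing in for the hbase-default.xml value
    merged.props.setProperty("hbase.fs.tmp.dir", "/user/hbase/hbase-staging");
    merged.props.putAll(jobConf.props); // job settings win over defaults
    return merged;
  }
}
```

In the real fix the job's Configuration would be routed through HBaseConfiguration (create or similar) so the xml defaults are loaded; the sketch only shows why the layering order matters.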
[jira] [Work started] (HBASE-13867) Add endpoint coprocessor guide to HBase book
[ https://issues.apache.org/jira/browse/HBASE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-13867 started by Gaurav Bhardwaj. --- Add endpoint coprocessor guide to HBase book Key: HBASE-13867 URL: https://issues.apache.org/jira/browse/HBASE-13867 Project: HBase Issue Type: Task Components: Coprocessors, documentation Reporter: Vladimir Rodionov Assignee: Gaurav Bhardwaj Endpoint coprocessors are very poorly documented. Coprocessor section of HBase book must be updated either with its own endpoint coprocessors HOW-TO guide or, at least, with the link(s) to some other guides. There is good description here: http://www.3pillarglobal.com/insights/hbase-coprocessors -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14021) Quota table has a wrong description on the UI
[ https://issues.apache.org/jira/browse/HBASE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614017#comment-14614017 ] Ashish Singhi commented on HBASE-14021: --- bq. -1 javadoc. The javadoc tool appears to have generated 1 warning messages. Not related to patch. bq. -1 checkstyle. The applied patch generated 1899 checkstyle errors (more than the master's current 1898 errors). It is from the generated code. {noformat} python checkstyle_report.py trunkCheckstyle.xml patchCheckstyle.xml hbase-server/target/generated-jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.java 28 29 {noformat} Quota table has a wrong description on the UI - Key: HBASE-14021 URL: https://issues.apache.org/jira/browse/HBASE-14021 Project: HBase Issue Type: Bug Components: UI Affects Versions: 1.1.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1 Attachments: HBASE-14021.patch, error.png, fix.png !error.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13637) branch-1.1 does not build against hadoop-2.2.
[ https://issues.apache.org/jira/browse/HBASE-13637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13637: Fix Version/s: (was: 2.0.0) branch-1.1 does not build against hadoop-2.2. - Key: HBASE-13637 URL: https://issues.apache.org/jira/browse/HBASE-13637 Project: HBase Issue Type: Bug Reporter: Nick Dimiduk Assignee: Duo Zhang Fix For: 1.1.0 Attachments: HBASE-13637-branch-1.1.01.patch, HBASE-13637-branch-1.1.patch From RC0 VOTE thread, {quote} The build is broken with Hadoop-2.2 because mini-kdc is not found: \[ERROR\] Failed to execute goal on project hbase-server: Could not resolve dependencies for project org.apache.hbase:hbase-server:jar:1.1.0: Could not find artifact org.apache.hadoop:hadoop-minikdc:jar:2.2 {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-14017: Attachment: HBASE-14017.v1-branch1.1.patch reattaching branch-1.1 patch for jenkins Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.1, 1.3.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue delete where we don't have an exclusive lock before deleting the table {noformat} Thread 1: Create table is running - the queue is empty and wlock is false Thread 2: markTableAsDeleted sees the queue empty and wlock=false Thread 1: tryWrite() set wlock=true; too late Thread 2: delete the queue Thread 1: never able to release the lock - NPE when trying to get the queue {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
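The interleaving above can be sketched in plain Java. This is a toy model with illustrative names, not the actual MasterProcedureQueue code; it shows the shape of the fix: the "queue is empty and unlocked" check in markTableAsDeleted and the exclusive-lock acquisition must run under the same monitor so they cannot interleave:

```java
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the table-queue race (illustrative names, not the real
// MasterProcedureQueue). Both the lock flip and the delete-if-empty
// check are guarded by one monitor, so Thread 2 can never delete the
// queue between Thread 1's check and its wlock=true.
class TableQueueRace {
  static final class TableQueue {
    boolean exclusiveLock; // "wlock" in the report
    int pending;           // procedures still queued
  }

  private final ConcurrentHashMap<String, TableQueue> queues = new ConcurrentHashMap<>();
  private final Object schedLock = new Object();

  void addProcedure(String table) {
    synchronized (schedLock) {
      queues.computeIfAbsent(table, t -> new TableQueue()).pending++;
    }
  }

  // Thread 1: dequeue a procedure and take the exclusive lock atomically.
  boolean tryExclusiveLock(String table) {
    synchronized (schedLock) {
      TableQueue q = queues.get(table);
      if (q == null || q.exclusiveLock || q.pending == 0) return false;
      q.pending--;
      q.exclusiveLock = true; // can no longer happen "too late"
      return true;
    }
  }

  void releaseExclusiveLock(String table) {
    synchronized (schedLock) {
      TableQueue q = queues.get(table);
      if (q != null) q.exclusiveLock = false;
    }
  }

  // Thread 2: delete only an empty AND unlocked queue, under the same lock.
  boolean markTableAsDeleted(String table) {
    synchronized (schedLock) {
      TableQueue q = queues.get(table);
      if (q == null || q.pending > 0 || q.exclusiveLock) return false;
      queues.remove(table);
      return true;
    }
  }
}
```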
[jira] [Updated] (HBASE-13970) NPE during compaction in trunk
[ https://issues.apache.org/jira/browse/HBASE-13970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13970: Fix Version/s: (was: 1.2.1) (was: 1.3.0) 1.2.0 NPE during compaction in trunk -- Key: HBASE-13970 URL: https://issues.apache.org/jira/browse/HBASE-13970 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 0.98.13, 1.2.0, 1.1.1 Reporter: ramkrishna.s.vasudevan Assignee: Duo Zhang Fix For: 2.0.0, 0.98.14, 1.2.0, 1.1.2 Attachments: HBASE-13970-v1.patch, HBASE-13970.patch Updated the trunk.. Loaded the table with PE tool. Trigger a flush to ensure all data is flushed out to disk. When the first compaction is triggered we get an NPE and this is very easy to reproduce {code} 2015-06-25 21:33:46,041 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received procedure start children changed event: /hbase/flush-table-proc/acquired 2015-06-25 21:33:46,051 INFO [rs(stobdtserver3,16040,1435248182301)-flush-proc-pool3-thread-1] regionserver.HRegion: Flushing 1/1 column families, memstore=76.91 MB 2015-06-25 21:33:46,159 ERROR [regionserver/stobdtserver3/10.224.54.70:16040-longCompactions-1435248183945] regionserver.CompactSplitThread: Compaction failed Request = regionName=TestTable,283887,1435248198798.028fb0324cd6eb03d5022eb8c147b7c4., storeName=info, fileCount=3, fileSize=343.4 M (114.5 M, 114.5 M, 114.5 M), priority=3, time=7536968291719985 java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController$ActiveCompaction.access$700(PressureAwareCompactionThroughputController.java:79) at org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController.finish(PressureAwareCompactionThroughputController.java:238) at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:306) at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:106) at 
org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:112) at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1202) at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1792) at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:524) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) 2015-06-25 21:33:46,745 INFO [rs(stobdtserver3,16040,1435248182301)-flush-proc-pool3-thread-1] regionserver.DefaultStoreFlusher: Flushed, sequenceid=1534, memsize=76.9 M, hasBloomFilter=true, into tmp file hdfs://stobdtserver3:9010/hbase/data/default/TestTable/028fb0324cd6eb03d5022eb8c147b7c4/.tmp/942ba0831a0047a08987439e34361a0c 2015-06-25 21:33:46,772 INFO [rs(stobdtserver3,16040,1435248182301)-flush-proc-pool3-thread-1] regionserver.HStore: Added hdfs://stobdtserver3:9010/hbase/data/default/TestTable/028fb0324cd6eb03d5022eb8c147b7c4/info/942ba0831a0047a08987439e34361a0c, entries=68116, sequenceid=1534, filesize=68.7 M 2015-06-25 21:33:46,773 INFO [rs(stobdtserver3,16040,1435248182301)-flush-proc-pool3-thread-1] regionserver.HRegion: Finished memstore flush of ~76.91 MB/80649344, currentsize=0 B/0 for region TestTable,283887,1435248198798.028fb0324cd6eb03d5022eb8c147b7c4. 
in 723ms, sequenceid=1534, compaction requested=true 2015-06-25 21:33:46,780 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received created event:/hbase/flush-table-proc/reached/TestTable 2015-06-25 21:33:46,790 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received created event:/hbase/flush-table-proc/abort/TestTable 2015-06-25 21:33:46,791 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received procedure abort children changed event: /hbase/flush-table-proc/abort 2015-06-25 21:33:46,803 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received procedure start children changed event: /hbase/flush-table-proc/acquired 2015-06-25 21:33:46,818 INFO [main-EventThread] procedure.ZKProcedureMemberRpcs: Received procedure abort children changed event: /hbase/flush-table-proc/abort {code} Will check this on what is the reason behind it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
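A minimal, self-contained sketch of the failure mode visible in the stack trace above (illustrative names, not the actual PressureAwareCompactionThroughputController code): active compactions are tracked per name, and an unguarded finish() dereferences the tracked entry, so a finish() with no matching start() — or a second finish() for the same compaction — hits an NPE unless guarded:

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the NPE shape from the stack trace (illustrative names,
// NOT the real throughput controller). The guarded finish() below
// tolerates the unmatched call that would otherwise dereference null.
class CompactionTracker {
  private static final class ActiveCompaction {
    final long startTime = System.currentTimeMillis();
  }

  private final ConcurrentHashMap<String, ActiveCompaction> active = new ConcurrentHashMap<>();

  void start(String compactionName) {
    active.put(compactionName, new ActiveCompaction());
  }

  // Returns elapsed millis, or 0 for an unmatched call where an
  // unguarded version would throw NullPointerException on c.startTime.
  long finish(String compactionName) {
    ActiveCompaction c = active.remove(compactionName);
    if (c == null) return 0L;
    return System.currentTimeMillis() - c.startTime;
  }
}
```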
[jira] [Updated] (HBASE-13930) Exclude Findbugs packages from shaded jars
[ https://issues.apache.org/jira/browse/HBASE-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13930: Fix Version/s: (was: 1.2.1) (was: 1.3.0) 1.2.0 Exclude Findbugs packages from shaded jars -- Key: HBASE-13930 URL: https://issues.apache.org/jira/browse/HBASE-13930 Project: HBase Issue Type: Bug Reporter: Nick Dimiduk Assignee: Gabor Liptak Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-13930.1.patch, HBASE-13930.2.patch Looking at 1.1.1RC0 shaded artifacts, looks like classes from Findbugs are under the {{edu}} prefix and are not shaded. We should exclude Findbugs from the shaded builds, and/or shade the {{edu}} prefix as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13927) Allow hbase-daemon.sh to conditionally redirect the log or not
[ https://issues.apache.org/jira/browse/HBASE-13927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13927: Fix Version/s: 1.2.0 2.0.0 +1 will apply later this weekend barring objections. Allow hbase-daemon.sh to conditionally redirect the log or not -- Key: HBASE-13927 URL: https://issues.apache.org/jira/browse/HBASE-13927 Project: HBase Issue Type: Bug Components: shell Affects Versions: 2.0.0, 1.2.0 Reporter: Elliott Clark Assignee: Elliott Clark Labels: shell Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13927.patch, HBASE-13927.patch Kind of like HBASE_NOEXEC allow hbase-daemon to skip redirecting to a log. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14024) ImportTsv is not loading hbase-default.xml
[ https://issues.apache.org/jira/browse/HBASE-14024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614008#comment-14614008 ] Ashish Singhi commented on HBASE-14024: --- [~busbey] can you take a look ? ImportTsv is not loading hbase-default.xml -- Key: HBASE-14024 URL: https://issues.apache.org/jira/browse/HBASE-14024 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Critical Fix For: 2.0.0 Attachments: HBASE-14024.patch ImportTsv job is failing with below exception {noformat} Exception in thread main java.lang.IllegalArgumentException: Can not create a Path from a null string at org.apache.hadoop.fs.Path.checkPathArg(Path.java:123) at org.apache.hadoop.fs.Path.init(Path.java:135) at org.apache.hadoop.fs.Path.init(Path.java:89) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configurePartitioner(HFileOutputFormat2.java:591) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:441) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:406) at org.apache.hadoop.hbase.mapreduce.ImportTsv.createSubmittableJob(ImportTsv.java:555) at org.apache.hadoop.hbase.mapreduce.ImportTsv.run(ImportTsv.java:763) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.mapreduce.ImportTsv.main(ImportTsv.java:772) {noformat} {{hbase.fs.tmp.dir}} is set to a default value in hbase-default.xml. I found that hbase configuration resources from its xml are not loaded into conf object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14025) Update CHANGES.txt for 1.2
[ https://issues.apache.org/jira/browse/HBASE-14025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614022#comment-14614022 ] stack commented on HBASE-14025: --- What you thinking [~busbey] If it helps: In past, you could not export more than 100 issues at a time from JIRA so making the CHANGES.txt content when 100 issues was painful. Often, just punted and pointed at the release notes report in JIRA. Even when an export, hackery coming up w/ the form that always screamed 'script it' but was always too lazy to do it; instead just bungled through regex'ing in editor. Update CHANGES.txt for 1.2 -- Key: HBASE-14025 URL: https://issues.apache.org/jira/browse/HBASE-14025 Project: HBase Issue Type: Sub-task Components: documentation Affects Versions: 1.2.0 Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 1.2.0 Since it's more effort than I expected, making a ticket to track actually updating CHANGES.txt so that new RMs have an idea what to expect. Maybe will make doc changes if there's enough here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13333) Renew Scanner Lease without advancing the RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13333: Fix Version/s: 1.2.0 Renew Scanner Lease without advancing the RegionScanner --- Key: HBASE-13333 URL: https://issues.apache.org/jira/browse/HBASE-13333 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 2.0.0, 0.98.13, 1.0.2, 1.2.0, 1.1.1 Attachments: 13333-0.98.txt, 13333-master.txt We have a usecase (for Phoenix) where we want to let the server know that the client is still around. Like a client-side heartbeat. Doing a full heartbeat is complicated, but we could add the ability to make a scanner call with caching set to 0. The server already does the right thing (it renews the lease, but does not advance the scanner). It looks like the client (ScannerCallable) also does the right thing. We cannot break ResultScanner before HBase 2.0, but we can add a renewLease() method to AbstractClientScanner. Phoenix (or any other caller) can then cast to ClientScanner and call that method to ensure we renew the lease on the server. It would be a simple and fully backwards compatible change. [~giacomotaylor] Comments? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11830) TestReplicationThrottler.testThrottling failed on virtual boxes
[ https://issues.apache.org/jira/browse/HBASE-11830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-11830: Fix Version/s: 1.2.0 TestReplicationThrottler.testThrottling failed on virtual boxes --- Key: HBASE-11830 URL: https://issues.apache.org/jira/browse/HBASE-11830 Project: HBase Issue Type: Bug Components: test Affects Versions: 2.0.0, 1.1.0, 1.2.0 Environment: kvm with Centos 6.5, openjdk1.7 Reporter: Sergey Soldatov Assignee: Stephen Yuan Jiang Priority: Minor Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.1 Attachments: HBASE-11830.patch during test runs TestReplicationThrottler.testThrottling sometimes fails with assertion testThrottling(org.apache.hadoop.hbase.replication.regionserver.TestReplicationThrottler) Time elapsed: 0.229 sec FAILURE! java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.hbase.replication.regionserver.TestReplicationThrottler.testThrottling(TestReplicationThrottler.java:69) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13646) HRegion#execService should not try to build incomplete messages
[ https://issues.apache.org/jira/browse/HBASE-13646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614072#comment-14614072 ] Andrew Purtell commented on HBASE-13646: Are you going to commit, [~busbey]? Lgtm. Should go into 0.98 too. If it's too much trouble, leave open with fix versions including 0.98.14 and I'll pick it up early next week. HRegion#execService should not try to build incomplete messages --- Key: HBASE-13646 URL: https://issues.apache.org/jira/browse/HBASE-13646 Project: HBase Issue Type: Bug Components: Coprocessors, regionserver Affects Versions: 2.0.0, 1.2.0, 1.1.1 Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 2.0.0 Attachments: HBASE-13646-branch-1.patch, HBASE-13646.patch, HBASE-13646.v2.patch, HBASE-13646.v2.patch If some RPC service called on a region throws an exception, execService still tries to build a Message. In the case of complex messages with required fields this complicates service code, because the service needs to pass fake protobuf objects just so they are buildable. To mitigate that, I propose to check whether the controller failed and return null from the call instead of failing with an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
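The proposal can be sketched in plain Java. These are simplified stand-in types, not the HBase or protobuf APIs; the point is only the control-flow change: consult the controller first, and return null for failed calls instead of forcing a build of a message whose required fields were never populated:

```java
// Stand-in types (NOT the HBase/protobuf APIs) showing the proposed
// control flow: failed controller => return null, never build.
class ExecServiceSketch {
  static final class Controller {        // stands in for the RpcController
    private String error;
    void setFailed(String msg) { error = msg; }
    boolean failed() { return error != null; }
    String errorText() { return error; }
  }

  static final class Message {           // stands in for a protobuf with required fields
    final String payload;
    Message(String payload) {
      if (payload == null) {
        // what the old path hit: building with required fields unset
        throw new IllegalStateException("required field not set");
      }
      this.payload = payload;
    }
  }

  static Message execService(Controller controller, String result) {
    if (controller.failed()) {
      return null; // caller surfaces controller.errorText(); no fake message needed
    }
    return new Message(result);
  }
}
```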
[jira] [Commented] (HBASE-13329) ArrayIndexOutOfBoundsException in CellComparator#getMinimumMidpointArray
[ https://issues.apache.org/jira/browse/HBASE-13329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613911#comment-14613911 ] Dinh Duong Mai commented on HBASE-13329: I have experienced exactly the same problem. I reported this issue as HBASE-14018 (https://issues.apache.org/jira/browse/HBASE-14018?jql=project%20%3D%20HBASE#) and on stackoverflow (http://stackoverflow.com/questions/31164505/hbase-regionserver-is-aborted-and-can-never-be-brought-up-after-that). Please help fix this. ArrayIndexOutOfBoundsException in CellComparator#getMinimumMidpointArray Key: HBASE-13329 URL: https://issues.apache.org/jira/browse/HBASE-13329 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 1.0.1 Environment: linux-debian-jessie ec2 - t2.micro instances Reporter: Ruben Aguiar Priority: Critical Attachments: 13329-asserts.patch, 13329-v1.patch, HBASE-13329.test.00.branch-1.1.patch While trying to benchmark my opentsdb cluster, I've created a script that always sends hbase the same value (in this case 1). After a few minutes, the whole region server crashes and the region itself becomes impossible to open again (cannot assign or unassign). After some investigation, what I saw on the logs is that when a Memstore flush is called on a large region (128 MB) the process errors, killing the regionserver. On restart, replaying the edits generates the same error, making the region unavailable. Tried to manually unassign, assign or close_region. That didn't work because the code that reads/replays it crashes. From my investigation this seems to be an overflow issue. The logs show that the function getMinimumMidpointArray tried to access index -32743 of an array, extremely close to the minimum short value in Java. Upon investigation of the source code, it seems a short index is used, being incremented as long as the two vectors are the same, probably making it overflow on large vectors with equal data. 
Changing it to int should solve the problem. Here follows the hadoop logs of when the regionserver went down. Any help is appreciated. Any other information you need please do tell me: 2015-03-24 18:00:56,187 INFO [regionserver//10.2.0.73:16020.logRoller] wal.FSHLog: Rolled WAL /hbase/WALs/10.2.0.73,16020,1427216382590/10.2.0.73%2C16020%2C1427216382590.default.1427220018516 with entries=143, filesize=134.70 MB; new WAL /hbase/WALs/10.2.0.73,16020,1427216382590/10.2.0.73%2C16020%2C1427216382590.default.1427220056140 2015-03-24 18:00:56,188 INFO [regionserver//10.2.0.73:16020.logRoller] wal.FSHLog: Archiving hdfs://10.2.0.74:8020/hbase/WALs/10.2.0.73,16020,1427216382590/10.2.0.73%2C16020%2C1427216382590.default.1427219987709 to hdfs://10.2.0.74:8020/hbase/oldWALs/10.2.0.73%2C16020%2C1427216382590.default.1427219987709 2015-03-24 18:04:35,722 INFO [MemStoreFlusher.0] regionserver.HRegion: Started memstore flush for tsdb,,1427133969325.52bc1994da0fea97563a4a656a58bec2., current region memstore size 128.04 MB 2015-03-24 18:04:36,154 FATAL [MemStoreFlusher.0] regionserver.HRegionServer: ABORTING region server 10.2.0.73,16020,1427216382590: Replay of WAL required. Forcing server shutdown org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1427133969325.52bc1994da0fea97563a4a656a58bec2. 
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1999) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1770) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1702) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743 at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478) at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263) at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) at
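The reporter's short-overflow diagnosis can be demonstrated in pure Java, independent of HBase. This mirrors the failure mode — a `short` loop index wrapping past Short.MAX_VALUE while scanning a long common prefix, then being used as a negative array index — and is not the actual CellComparator code:

```java
// Demonstration of the diagnosed overflow: a short index wraps to
// Short.MIN_VALUE once the common prefix exceeds 32767 bytes. Mirrors
// the failure mode only; not the real getMinimumMidpointArray.
class ShortIndexOverflow {
  // Buggy variant: short index. After i == 32767, i++ wraps negative;
  // indexing a[i] at that point would throw ArrayIndexOutOfBoundsException,
  // so we bail out and surface the bad index instead.
  static int commonPrefixShort(byte[] a, byte[] b) {
    short i = 0;
    while (i < a.length && i < b.length && a[i] == b[i]) {
      i++; // implicit narrowing: 32767 + 1 becomes -32768
      if (i < 0) return i; // the negative index the logs reported
    }
    return i;
  }

  // Fixed variant: plain int index, as the reporter suggests.
  static int commonPrefixInt(byte[] a, byte[] b) {
    int i = 0;
    while (i < a.length && i < b.length && a[i] == b[i]) {
      i++;
    }
    return i;
  }
}
```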
[jira] [Commented] (HBASE-13702) ImportTsv: Add dry-run functionality and log bad rows
[ https://issues.apache.org/jira/browse/HBASE-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613972#comment-14613972 ] Hudson commented on HBASE-13702: FAILURE: Integrated in HBase-1.3 #34 (See [https://builds.apache.org/job/HBase-1.3/34/]) HBASE-13702 ImportTsv: Add dry-run functionality and log bad rows (Apekshit Sharma) (tedyu: rev 9e54e195f60689bfde26279630f80825214d0219) * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterTextMapper.java * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java * hbase-it/src/test/java/org/apache/hadoop/hbase/mapreduce/IntegrationTestImportTsv.java ImportTsv: Add dry-run functionality and log bad rows - Key: HBASE-13702 URL: https://issues.apache.org/jira/browse/HBASE-13702 Project: HBase Issue Type: New Feature Reporter: Apekshit Sharma Assignee: Apekshit Sharma Fix For: 2.0.0, 1.3.0 Attachments: HBASE-13702-branch-1-v2.patch, HBASE-13702-branch-1-v3.patch, HBASE-13702-branch-1.patch, HBASE-13702-v2.patch, HBASE-13702-v3.patch, HBASE-13702-v4.patch, HBASE-13702-v5.patch, HBASE-13702.patch ImportTSV job skips bad records by default (keeps a count though). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. Being easily able to determine which rows in an input are corrupted, rather than failing on one row at a time, seems like a good feature to have. Moreover, there should be 'dry-run' functionality in such tools, which essentially does a quick run of the tool without making any changes but reports any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In the worst case, all rows will be logged and the size of the logs will be the same as the input size, which seems fine. 
However, the user might have to do some work figuring out where the logs are. Is there some link we can show to the user when the tool starts that can help them with that? For the dry run, we can simply use an if-else to skip over writing out KVs, and any other mutations, if present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13665) Fix docs and site building on branch-1
[ https://issues.apache.org/jira/browse/HBASE-13665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13665: Fix Version/s: (was: 1.2.0) Fix docs and site building on branch-1 -- Key: HBASE-13665 URL: https://issues.apache.org/jira/browse/HBASE-13665 Project: HBase Issue Type: Task Components: documentation, site Affects Versions: 1.1.0 Reporter: Nick Dimiduk Assignee: Nick Dimiduk Fix For: 1.1.0, 1.0.2 Attachments: HBASE-13665.00.branch-1.0.patch, HBASE-13665.00.branch-1.1.patch, HBASE-13665.00.branch-1.1.patch, HBASE-13665.00.branch-1.patch, HBASE-13665.00.branch-1.patch It was noticed during 1.1.0RC0 that the docs are built with the old docbook stuff. This should be fixed so we package the correct bits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13661) Correct binary compatibility issues discovered in 1.1.0RC0
[ https://issues.apache.org/jira/browse/HBASE-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13661: Fix Version/s: (was: 1.2.0) Correct binary compatibility issues discovered in 1.1.0RC0 -- Key: HBASE-13661 URL: https://issues.apache.org/jira/browse/HBASE-13661 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 Attachments: hbase-13661_v1-branch-1.1.patch, hbase-13661_v1-master.patch Over on the 1.1.0RC0 VOTE thread, Enis discovered some errors in InterfaceAudience annotations. Let's fix them. Filed by [~ndimiduk] on [~enis]'s behalf. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13747) Promote Java 8 to yes in support matrix
[ https://issues.apache.org/jira/browse/HBASE-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13747: Fix Version/s: (was: 1.3.0) Promote Java 8 to yes in support matrix - Key: HBASE-13747 URL: https://issues.apache.org/jira/browse/HBASE-13747 Project: HBase Issue Type: Umbrella Affects Versions: 1.2.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Fix For: 2.0.0, 1.2.0 Now that Java 7 is EOL, we need to move to formally supporting Java 8. Let's use this ticket to track efforts needed to stop caveating our support table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13646) HRegion#execService should not try to build incomplete messages
[ https://issues.apache.org/jira/browse/HBASE-13646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614066#comment-14614066 ] Sean Busbey commented on HBASE-13646: - Any chance this can go in this weekend? HRegion#execService should not try to build incomplete messages --- Key: HBASE-13646 URL: https://issues.apache.org/jira/browse/HBASE-13646 Project: HBase Issue Type: Bug Components: Coprocessors, regionserver Affects Versions: 2.0.0, 1.2.0, 1.1.1 Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 2.0.0 Attachments: HBASE-13646-branch-1.patch, HBASE-13646.patch, HBASE-13646.v2.patch, HBASE-13646.v2.patch If some RPC service called on a region throws an exception, execService still tries to build a Message. In the case of complex messages with required fields this complicates service code, because the service needs to pass fake protobuf objects just so they are buildable. To mitigate that, I propose to check whether the controller failed and return null from the call instead of failing with an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13849) Remove restore and clone snapshot from the WebUI
[ https://issues.apache.org/jira/browse/HBASE-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614067#comment-14614067 ] Sean Busbey commented on HBASE-13849: - Any objections to me applying this later this weekend? [~apurtell] would you want this in 0.98? Remove restore and clone snapshot from the WebUI Key: HBASE-13849 URL: https://issues.apache.org/jira/browse/HBASE-13849 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 1.0.1, 1.1.0, 0.98.13, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13849-v0.patch Remove the clone and restore snapshot buttons from the WebUI. The first reason is that the operation may be too long to have the user wait on the WebUI. The second reason is that an action from the WebUI does not play well with security, since it is going to be executed by the hbase user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13849) Remove restore and clone snapshot from the WebUI
[ https://issues.apache.org/jira/browse/HBASE-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13849: Fix Version/s: 1.2.0 2.0.0 Remove restore and clone snapshot from the WebUI Key: HBASE-13849 URL: https://issues.apache.org/jira/browse/HBASE-13849 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 1.0.1, 1.1.0, 0.98.13, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13849-v0.patch Remove the clone and restore snapshot buttons from the WebUI. The first reason is that the operation may be too long to have the user wait on the WebUI. The second reason is that an action from the WebUI does not play well with security, since it is going to be executed by the hbase user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13702) ImportTsv: Add dry-run functionality and log bad rows
[ https://issues.apache.org/jira/browse/HBASE-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613961#comment-14613961 ] Hudson commented on HBASE-13702: SUCCESS: Integrated in HBase-1.3-IT #19 (See [https://builds.apache.org/job/HBase-1.3-IT/19/]) HBASE-13702 ImportTsv: Add dry-run functionality and log bad rows (Apekshit Sharma) (tedyu: rev 9e54e195f60689bfde26279630f80825214d0219) * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterTextMapper.java * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java * hbase-it/src/test/java/org/apache/hadoop/hbase/mapreduce/IntegrationTestImportTsv.java * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java ImportTsv: Add dry-run functionality and log bad rows - Key: HBASE-13702 URL: https://issues.apache.org/jira/browse/HBASE-13702 Project: HBase Issue Type: New Feature Reporter: Apekshit Sharma Assignee: Apekshit Sharma Fix For: 2.0.0, 1.3.0 Attachments: HBASE-13702-branch-1-v2.patch, HBASE-13702-branch-1-v3.patch, HBASE-13702-branch-1.patch, HBASE-13702-v2.patch, HBASE-13702-v3.patch, HBASE-13702-v4.patch, HBASE-13702-v5.patch, HBASE-13702.patch The ImportTSV job skips bad records by default (though it keeps a count). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. Being easily able to determine which rows in an input are corrupted, rather than failing on one row at a time, seems like a good feature to have. Moreover, there should be 'dry-run' functionality in such tools, which essentially does a quick run of the tool without making any changes, reporting any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In the worst case, all rows will be logged and the size of the logs will be the same as the input size, which seems fine.
However, the user might have to do some work figuring out where the logs are. Is there some link we can show to the user when the tool starts which can help them with that? For the dry run, we can simply use an if-else to skip over writing out KVs, and any other mutations, if present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
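The dry-run shape described above (parse and validate every row as usual, gate only the write behind a flag) can be sketched as follows. This is a hedged illustration, not the actual ImportTsv mapper: the class name `TsvDryRun`, the two-column validity check, and the `written` list are all stand-ins invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed dry-run behavior: bad rows are
// counted (and would be logged) either way; only the write is skipped.
class TsvDryRun {
    final boolean dryRun;
    int badLines = 0;
    final List<String> written = new ArrayList<>();

    TsvDryRun(boolean dryRun) { this.dryRun = dryRun; }

    void map(String line) {
        String[] cols = line.split("\t");
        if (cols.length < 2) {   // a "bad row": record it instead of failing
            badLines++;
            return;
        }
        if (!dryRun) {           // the if-else around the write mentioned above
            written.add(cols[0]);
        }
    }
}
```

The point of the design is that a dry run exercises the full parse/validate path, so its error count matches what a real run would produce.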
[jira] [Commented] (HBASE-13879) Add hbase.hstore.compactionThreshold to HConstants
[ https://issues.apache.org/jira/browse/HBASE-13879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614024#comment-14614024 ] Gabor Liptak commented on HBASE-13879: -- [~anoop.hbase] Would some other changes be needed before this can be considered for commit? Thanks Add hbase.hstore.compactionThreshold to HConstants -- Key: HBASE-13879 URL: https://issues.apache.org/jira/browse/HBASE-13879 Project: HBase Issue Type: Improvement Reporter: Gabor Liptak Priority: Minor Attachments: HBASE-13879.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-14017: Status: Open (was: Patch Available) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 1.1.1, 2.0.0, 1.2.0, 1.3.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue deletion, where we don't have an exclusive lock before deleting the table {noformat} Thread 1: Create table is running - the queue is empty and wlock is false Thread 2: markTableAsDeleted sees the queue empty and wlock == false Thread 1: tryWrite() sets wlock=true; too late Thread 2: deletes the queue Thread 1: never able to release the lock - NPE when trying to get the queue {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
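The interleaving in the report is a classic check-then-act race: the emptiness/wlock check and the deletion happen in separate critical sections. A minimal sketch of the safe shape (this is a toy stand-in, not HBase's actual MasterProcedureQueue; the names `tryWrite`, `wlock`, and `markTableAsDeleted` are borrowed from the comment) performs the check and the delete atomically:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a per-table procedure queue. The fix being illustrated:
// markTableAsDeleted checks emptiness AND the write lock in the same
// synchronized block that marks the queue deleted, so tryWrite can never
// sneak in between the check and the delete.
class TableQueue {
    private final Deque<Object> queue = new ArrayDeque<>();
    private boolean wlock = false;
    private boolean deleted = false;

    // Thread 1: acquire the exclusive write lock if the queue is still alive.
    synchronized boolean tryWrite() {
        if (wlock || deleted) return false;
        wlock = true;
        return true;
    }

    synchronized void releaseWrite() {
        wlock = false;
    }

    // Thread 2: deletion succeeds only when no writer holds the lock.
    synchronized boolean markTableAsDeleted() {
        if (queue.isEmpty() && !wlock) {
            deleted = true;
            return true;
        }
        return false;
    }
}
```

Under this shape the reported NPE cannot happen: either `tryWrite` wins and `markTableAsDeleted` returns false, or the delete wins and the later `tryWrite` is rejected.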
[jira] [Comment Edited] (HBASE-14025) Update CHANGES.txt for 1.2
[ https://issues.apache.org/jira/browse/HBASE-14025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614061#comment-14614061 ] Sean Busbey edited comment on HBASE-14025 at 7/4/15 9:31 PM: - this pipeline does a better job of avoiding false positives for jiras that show up in git log messages:{code} $ git log --oneline 8166142...branch-1.2 | grep -o -E '^[a-z0-9]{7} HBASE-[0-9]* ' | grep -o -E 'HBASE-[0-9]*' | sort -u > in_1.2.0.txt {code} was (Author: busbey): this pipeline does a better job of avoiding false positives for jiras that show up in git log messages:{code} $ git log --oneline 8166142...branch-1.2 | grep -o -E '[a-z0-9]{7} HBASE-[0-9]* ' | grep -o -E 'HBASE-[0-9]*' | sort -u > in_1.2.0.txt {code} Update CHANGES.txt for 1.2 -- Key: HBASE-14025 URL: https://issues.apache.org/jira/browse/HBASE-14025 Project: HBase Issue Type: Sub-task Components: documentation Affects Versions: 1.2.0 Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 1.2.0 Since it's more effort than I expected, making a ticket to track actually updating CHANGES.txt so that new RMs have an idea what to expect. Maybe will make doc changes if there's enough here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14025) Update CHANGES.txt for 1.2
[ https://issues.apache.org/jira/browse/HBASE-14025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614061#comment-14614061 ] Sean Busbey commented on HBASE-14025: - this pipeline does a better job of avoiding false positives for jiras that show up in git log messages:{code} $ git log --oneline 8166142...branch-1.2 | grep -o -E '[a-z0-9]{7} HBASE-[0-9]* ' | grep -o -E 'HBASE-[0-9]*' | sort -u > in_1.2.0.txt {code} Update CHANGES.txt for 1.2 -- Key: HBASE-14025 URL: https://issues.apache.org/jira/browse/HBASE-14025 Project: HBase Issue Type: Sub-task Components: documentation Affects Versions: 1.2.0 Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 1.2.0 Since it's more effort than I expected, making a ticket to track actually updating CHANGES.txt so that new RMs have an idea what to expect. Maybe will make doc changes if there's enough here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
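The trick in the edited pipeline is the leading anchor: a JIRA id only counts when it immediately follows the 7-character abbreviated commit hash, so ids merely mentioned later in a subject line are ignored. Restated as a small Java sketch (the class and method names here are invented for illustration; only the regex mirrors the grep):

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Mirror of: git log --oneline | grep -o -E '^[a-z0-9]{7} HBASE-[0-9]* ' |
//            grep -o -E 'HBASE-[0-9]*' | sort -u
class JiraIds {
    static final Pattern LINE = Pattern.compile("^[a-z0-9]{7} (HBASE-[0-9]+) ");

    static Set<String> extract(String[] oneLineLog) {
        Set<String> ids = new TreeSet<>();   // sorted + deduped, like `sort -u`
        for (String line : oneLineLog) {
            Matcher m = LINE.matcher(line);
            if (m.find()) ids.add(m.group(1));
        }
        return ids;
    }
}
```

A subject like "abc1234 Revert change mentioning HBASE-99999 in passing" is correctly skipped, which is exactly the false positive the `^`-anchored version avoids.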
[jira] [Updated] (HBASE-13596) src assembly does not build
[ https://issues.apache.org/jira/browse/HBASE-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13596: Fix Version/s: (was: 1.2.0) src assembly does not build --- Key: HBASE-13596 URL: https://issues.apache.org/jira/browse/HBASE-13596 Project: HBase Issue Type: Bug Components: build Reporter: Nick Dimiduk Assignee: Nick Dimiduk Fix For: 2.0.0, 1.1.0 Attachments: 0001-HBASE-13596-src-assembly-does-not-build.patch Going through the RC motions, tried building from the src tgz. Looks like there's some missing pieces there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13781) Default to 700 for HDFS root dir permissions for secure deployments
[ https://issues.apache.org/jira/browse/HBASE-13781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13781: Assignee: (was: Enis Soztutar) Fix Version/s: (was: 1.2.0) (was: 2.0.0) Default to 700 for HDFS root dir permissions for secure deployments --- Key: HBASE-13781 URL: https://issues.apache.org/jira/browse/HBASE-13781 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Secure mode deployments should protect the files under the HDFS root dir. We should check and set the root dir's permissions on a kerberos setup so that users do not have to. We have {{hbase.data.umask.enable}} and {{hbase.data.umask}} for data files, but those are not that useful since we should protect directory listing, and access to WAL files, snapshot files, etc. See HBASE-13768 which has an integration test for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614054#comment-14614054 ] Hadoop QA commented on HBASE-14017: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12743600/HBASE-14017-v0.patch against master branch at commit e640f1e76af8f32015f475629610da127897f01e. ATTACHMENT ID: 12743600 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14664//console This message is automatically generated. Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.2.0, 1.1.1, 1.3.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue deletion, where we don't have an exclusive lock before deleting the table {noformat} Thread 1: Create table is running - the queue is empty and wlock is false Thread 2: markTableAsDeleted sees the queue empty and wlock == false Thread 1: tryWrite() sets wlock=true; too late Thread 2: deletes the queue Thread 1: never able to release the lock - NPE when trying to get the queue {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13777) Table fragmentation display triggers NPE on master status page
[ https://issues.apache.org/jira/browse/HBASE-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13777: Fix Version/s: 1.2.0 Table fragmentation display triggers NPE on master status page -- Key: HBASE-13777 URL: https://issues.apache.org/jira/browse/HBASE-13777 Project: HBase Issue Type: Bug Components: UI Affects Versions: 1.1.0 Reporter: Lars George Assignee: Lars George Labels: beginner Fix For: 2.0.0, 0.98.13, 1.0.2, 1.2.0, 1.1.1 Attachments: 0001-HBASE-13777-Table-fragmentation-display-triggers-NPE.patch Steps to reproduce: - Enable UI support for Fragmentation {noformat} <property> <name>hbase.master.ui.fragmentation.enabled</name> <value>true</value> </property> {noformat} Make sure to restart HBase. - Create NSes and table {noformat} hbase(main):004:0> create_namespace 'testqauat' 0 row(s) in 0.0370 seconds hbase(main):005:0> create_namespace 'financedept' 0 row(s) in 0.0100 seconds hbase(main):006:0> create_namespace 'engdept' 0 row(s) in 0.0090 seconds hbase(main):007:0> create 'testqauat:testtable', 'cf1' 0 row(s) in 1.2590 seconds => Hbase::Table - testqauat:testtable hbase(main):008:0> for i in 'a'..'z' do for j in 'a'..'z' do put 'testqauat:testtable', "row-#{i}#{j}", "cf1:#{j}", "#{j}" end end {noformat} - Reload the master UI page and you get: {noformat} HTTP ERROR 500 Problem accessing /master-status. Reason: INTERNAL_SERVER_ERROR Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmplImpl.__jamon_innerUnit__userTables(MasterStatusTmplImpl.java:685) at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmplImpl.renderNoFlush(MasterStatusTmplImpl.java:268) at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmpl.renderNoFlush(MasterStatusTmpl.java:377) at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmpl.render(MasterStatusTmpl.java:368) at org.apache.hadoop.hbase.master.MasterStatusServlet.doGet(MasterStatusServlet.java:81) ... 
{noformat} Note that the table.jsp page works fine, just the master page fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13703) ReplicateContext should not be a member of ReplicationSource
[ https://issues.apache.org/jira/browse/HBASE-13703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13703: Fix Version/s: 1.2.0 ReplicateContext should not be a member of ReplicationSource Key: HBASE-13703 URL: https://issues.apache.org/jira/browse/HBASE-13703 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 2.0.0, 0.98.13, 1.0.2, 1.2.0, 1.1.1 Attachments: 13703.txt The ReplicateContext object is created once per ReplicationSource and then reused when we have something to ship to the sinks. This is a misguided optimization. ReplicateContext is very lightweight (definitely compared to all the work and copying the ReplicationSource is doing) and, crucially, reusing it prevents the entries array from being collected after it was successfully copied to the sink, potentially wasting a lot of heap. The entries array itself holds references to WAL entries on the heap, which now also cannot be collected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13618) ReplicationSource is too eager to remove sinks
[ https://issues.apache.org/jira/browse/HBASE-13618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13618: Fix Version/s: 1.2.0 ReplicationSource is too eager to remove sinks -- Key: HBASE-13618 URL: https://issues.apache.org/jira/browse/HBASE-13618 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 2.0.0, 0.98.13, 1.0.2, 1.2.0, 1.1.1 Attachments: 13618-v2.txt, 13618.txt Looking at the replication for some other reason I noticed that the replication source might be a bit too eager to remove sinks from the list of valid sinks. The current logic allows a sink to fail N times (default 3) and then it will be removed from the sinks. But note that this failure count is never reduced, so given enough runtime and some network glitches _every_ sink will eventually be removed. When all sinks are removed the source picks new sinks and the counter is set to 0 for all of them. I think we should change this to reset the counter each time we successfully replicate something to the sink (which proves the sink isn't dead). Or we could decrease the counter each time we successfully replicate; that might be better - if we consistently fail more attempts than we succeed, the sink should be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
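The second proposal (pay down one failure per success, rather than resetting to zero) can be sketched in a few lines. This is a hypothetical stand-in, not HBase's actual ReplicationSinkManager; the class name `SinkTracker` and its methods are invented, and the threshold of 3 mirrors the default mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the "decrement on success" counter: a sink is only dropped when
// its net failures (failures minus successes, floored at 0) reach the limit.
class SinkTracker {
    static final int MAX_FAILURES = 3;
    private final Map<String, Integer> failures = new HashMap<>();

    void reportFailure(String sink) {
        failures.merge(sink, 1, Integer::sum);
    }

    void reportSuccess(String sink) {
        // A success proves the sink is alive, so forgive one failure.
        failures.computeIfPresent(sink, (k, v) -> v > 1 ? v - 1 : null);
    }

    boolean isBad(String sink) {
        return failures.getOrDefault(sink, 0) >= MAX_FAILURES;
    }
}
```

With this shape a sink is removed only when it consistently fails more often than it succeeds, which is the behavior argued for in the last sentence above.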
[jira] [Updated] (HBASE-13601) Connection leak during log splitting
[ https://issues.apache.org/jira/browse/HBASE-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13601: Fix Version/s: 1.1.1 1.2.0 2.0.0 Connection leak during log splitting Key: HBASE-13601 URL: https://issues.apache.org/jira/browse/HBASE-13601 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 0.98.10 Reporter: Abhishek Singh Chouhan Assignee: Abhishek Singh Chouhan Fix For: 2.0.0, 0.98.13, 1.0.2, 1.2.0, 1.1.1 Attachments: HBASE-13601-0.98.patch, HBASE-13601-1.0.0.patch Ran into an issue where Region server died with the following exception {noformat} 2015-04-29 17:10:11,856 WARN [nector@0.0.0.0:60030] mortbay.log - EXCEPTION java.io.IOException: Too many open files at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241) at org.mortbay.jetty.nio.SelectChannelConnector$1.acceptChannel(SelectChannelConnector.java:75) at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:686) at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:192) at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124) at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:708) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) {noformat} Realized that all the tcp sockets on the system were used out due to the regionserver trying to split the log and failing multiple times and leaving a connection open - {noformat} java.io.IOException: Got error for OP_READ_BLOCK, self=/10..99.3:50695, remote=/10.232.99.36:50010, for file /hbase/WALs/host1,60020,1425930917890-splitting/host1%2C60020%2C1425930917890.1429358890944, for pool BP-181199659-10.232.99.2-1411124363096 block 1074497051_756497 at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:432) at 
org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:397) at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:786) at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:665) at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:325) at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:567) at org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1446) at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:769) at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:799) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:840) at java.io.DataInputStream.read(DataInputStream.java:100) at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:124) at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:91) at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:660) at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:569) at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:282) at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:225) at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:143) at org.apache.hadoop.hbase.regionserver.handler.HLogSplitterHandler.process(HLogSplitterHandler.java:82) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13923) Loaded region coprocessors are not reported in shell status command
[ https://issues.apache.org/jira/browse/HBASE-13923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13923: Fix Version/s: (was: 1.2.1) (was: 1.3.0) 1.2.0 Loaded region coprocessors are not reported in shell status command --- Key: HBASE-13923 URL: https://issues.apache.org/jira/browse/HBASE-13923 Project: HBase Issue Type: Bug Components: regionserver, shell Affects Versions: 1.1.0.1 Reporter: Lars George Assignee: Ashish Singhi Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2 Attachments: 13923-addendum.txt, HBASE-13923-branch-1.0.patch, HBASE-13923-v1.patch, HBASE-13923-v2.patch, HBASE-13923.patch I added a CP to a table using the shell's alter command. Then I tried to check if it was loaded (short of resorting to parsing the logs). I recalled the refguide mentioned the {{status 'detailed'}} command, and tried that to no avail. The UI shows the loaded class in the Software Attributes section, so the info is there. But a shell status command (even after waiting 12+ hours) shows nothing. Here is an example of a server that has it loaded according to {{describe}} and the UI, but the shell lists this: {noformat} slave-1.internal.larsgeorge.com:16020 1434486031598 requestsPerSecond=0.0, numberOfOnlineRegions=5, usedHeapMB=278, maxHeapMB=941, numberOfStores=5, numberOfStorefiles=3, storefileUncompressedSizeMB=2454, storefileSizeMB=2454, compressionRatio=1., memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=32070, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=2086, totalStaticBloomSizeKB=480, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, coprocessors=[] testqauat:usertable,,1433747062257.4db0d7d73cbaac45cb8568d5b185e1f2. 
numberOfStores=1, numberOfStorefiles=0, storefileUncompressedSizeMB=0, lastMajorCompactionTimestamp=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, completeSequenceId=-1, dataLocality=0.0 testqauat:usertable,user0,1433747062257.f7c7fe3c7d26910010f40101b20f8d06. numberOfStores=1, numberOfStorefiles=0, storefileUncompressedSizeMB=0, lastMajorCompactionTimestamp=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, completeSequenceId=-1, dataLocality=0.0 testqauat:usertable,user1,1433747062257.dcd5395044732242dfed39b09aa05c36. numberOfStores=1, numberOfStorefiles=1, storefileUncompressedSizeMB=820, lastMajorCompactionTimestamp=1434173025593, storefileSizeMB=820, compressionRatio=1., memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=32070, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=699, totalStaticBloomSizeKB=160, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, completeSequenceId=-1, dataLocality=1.0 testqauat:usertable,user7,1433747062257.9277fd1d34909b0cb150707cbd7a3907. numberOfStores=1, numberOfStorefiles=1, storefileUncompressedSizeMB=816, lastMajorCompactionTimestamp=1434283025585, storefileSizeMB=816, compressionRatio=1., memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=690, totalStaticBloomSizeKB=160, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, completeSequenceId=-1, dataLocality=1.0 testqauat:usertable,user8,1433747062257.d930b52db8c7f07f3c3ab3e12e61a085. 
numberOfStores=1, numberOfStorefiles=1, storefileUncompressedSizeMB=818, lastMajorCompactionTimestamp=1433771950960, storefileSizeMB=818, compressionRatio=1., memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=697, totalStaticBloomSizeKB=160, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, completeSequenceId=-1, dataLocality=1.0 {noformat} The refguide shows an example of an older HBase version that has the CP class listed properly. Something is broken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion
[ https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-14017: Status: Open (was: Patch Available) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion - Key: HBASE-14017 URL: https://issues.apache.org/jira/browse/HBASE-14017 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 1.1.1, 2.0.0, 1.2.0, 1.3.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: HBASE-14017-v0.patch, HBASE-14017-v0.patch, HBASE-14017.v1-branch1.1.patch, HBASE-14017.v1-branch1.1.patch [~syuanjiang] found a concurrency issue in the procedure queue deletion, where we don't have an exclusive lock before deleting the table {noformat} Thread 1: Create table is running - the queue is empty and wlock is false Thread 2: markTableAsDeleted sees the queue empty and wlock == false Thread 1: tryWrite() sets wlock=true; too late Thread 2: deletes the queue Thread 1: never able to release the lock - NPE when trying to get the queue {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13849) Remove restore and clone snapshot from the WebUI
[ https://issues.apache.org/jira/browse/HBASE-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614071#comment-14614071 ] Andrew Purtell commented on HBASE-13849: Sure, Matteo asked for everywhere for good reason and we don't define UIs as API. Remove restore and clone snapshot from the WebUI Key: HBASE-13849 URL: https://issues.apache.org/jira/browse/HBASE-13849 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 1.0.1, 1.1.0, 0.98.13, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13849-v0.patch Remove the clone and restore snapshot buttons from the WebUI. The first reason is that the operation may take too long for the user to wait on the WebUI. The second reason is that an action from the WebUI does not play well with security, since it is going to be executed as the hbase user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13702) ImportTsv: Add dry-run functionality and log bad rows
[ https://issues.apache.org/jira/browse/HBASE-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-13702: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) ImportTsv: Add dry-run functionality and log bad rows - Key: HBASE-13702 URL: https://issues.apache.org/jira/browse/HBASE-13702 Project: HBase Issue Type: New Feature Reporter: Apekshit Sharma Assignee: Apekshit Sharma Fix For: 2.0.0, 1.3.0 Attachments: HBASE-13702-branch-1-v2.patch, HBASE-13702-branch-1-v3.patch, HBASE-13702-branch-1.patch, HBASE-13702-v2.patch, HBASE-13702-v3.patch, HBASE-13702-v4.patch, HBASE-13702-v5.patch, HBASE-13702.patch The ImportTSV job skips bad records by default (though it keeps a count). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. Being easily able to determine which rows in an input are corrupted, rather than failing on one row at a time, seems like a good feature to have. Moreover, there should be 'dry-run' functionality in such tools, which essentially does a quick run of the tool without making any changes, reporting any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In the worst case, all rows will be logged and the size of the logs will be the same as the input size, which seems fine. However, the user might have to do some work figuring out where the logs are. Is there some link we can show to the user when the tool starts which can help them with that? For the dry run, we can simply use an if-else to skip over writing out KVs, and any other mutations, if present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-5878) Use getVisibleLength public api from HdfsDataInputStream from Hadoop-2.
[ https://issues.apache.org/jira/browse/HBASE-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614013#comment-14614013 ] Ashish Singhi commented on HBASE-5878: -- -1's not related to patch. Build was successful {noformat} [INFO] [INFO] Reactor Summary: [INFO] [INFO] HBase . SUCCESS [ 1.480 s] [INFO] HBase - Checkstyle SUCCESS [ 0.052 s] [INFO] HBase - Annotations ... SUCCESS [ 0.076 s] [INFO] HBase - Protocol .. SUCCESS [ 0.121 s] [INFO] HBase - Common SUCCESS [ 0.219 s] [INFO] HBase - Procedure . SUCCESS [ 0.072 s] [INFO] HBase - Client SUCCESS [ 0.296 s] [INFO] HBase - Hadoop Compatibility .. SUCCESS [ 0.075 s] [INFO] HBase - Hadoop Two Compatibility .. SUCCESS [ 0.126 s] [INFO] HBase - Prefix Tree ... SUCCESS [ 0.097 s] [INFO] HBase - Server SUCCESS [ 0.964 s] [INFO] HBase - Testing Util .. SUCCESS [ 0.037 s] [INFO] HBase - Thrift SUCCESS [ 0.084 s] [INFO] HBase - Shell . SUCCESS [ 0.099 s] [INFO] HBase - Integration Tests . SUCCESS [ 0.097 s] [INFO] HBase - Examples .. SUCCESS [ 0.061 s] [INFO] HBase - Rest .. SUCCESS [ 0.104 s] [INFO] HBase - Assembly .. SUCCESS [ 0.042 s] [INFO] HBase - Shaded SUCCESS [ 0.036 s] [INFO] HBase - Shaded - Client ... SUCCESS [ 0.035 s] [INFO] HBase - Shaded - Server ... SUCCESS [ 0.035 s] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 6.747 s [INFO] Finished at: 2015-07-04T08:04:56+00:00 [INFO] Final Memory: 36M/913M [INFO] {noformat} Use getVisibleLength public api from HdfsDataInputStream from Hadoop-2. --- Key: HBASE-5878 URL: https://issues.apache.org/jira/browse/HBASE-5878 Project: HBase Issue Type: Bug Components: wal Reporter: Uma Maheswara Rao G Assignee: Ashish Singhi Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1, 1.0.3 Attachments: HBASE-5878-v2.patch, HBASE-5878-v3.patch, HBASE-5878-v4.patch, HBASE-5878-v5.patch, HBASE-5878.patch SequencFileLogReader: Currently Hbase using getFileLength api from DFSInputStream class by reflection. DFSInputStream is not exposed as public. 
So, this may change in the future. Now HDFS exposes HdfsDataInputStream as a public API. We can make use of it, and fall back to finding the getFileLength api from DFSInputStream in an else condition when we cannot, so that we will not have any sudden surprise like the one we are facing today. Also, the current code just logs one warn message and proceeds if any exception is thrown while getting the length. I think we can re-throw the exception because there is no point in continuing with data loss. {code}
long adjust = 0;
try {
  Field fIn = FilterInputStream.class.getDeclaredField("in");
  fIn.setAccessible(true);
  Object realIn = fIn.get(this.in);
  // In hadoop 0.22, DFSInputStream is a standalone class. Before this,
  // it was an inner class of DFSClient.
  if (realIn.getClass().getName().endsWith("DFSInputStream")) {
    Method getFileLength = realIn.getClass().
      getDeclaredMethod("getFileLength", new Class<?>[] {});
    getFileLength.setAccessible(true);
    long realLength = ((Long)getFileLength.
      invoke(realIn, new Object[] {})).longValue();
    assert(realLength >= this.length);
    adjust = realLength - this.length;
  } else {
    LOG.info("Input stream class: " + realIn.getClass().getName() +
      ", not adjusting length");
  }
} catch(Exception e) {
  SequenceFileLogReader.LOG.warn(
    "Error while trying to get accurate file length. " +
    "Truncation / data loss may occur if RegionServers die.", e);
}
return adjust + super.getPos();
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-8642) [Snapshot] List and delete snapshot by table
[ https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614023#comment-14614023 ] Ashish Singhi commented on HBASE-8642: -- {quote} -1 javadoc. The javadoc tool appears to have generated 1 warning messages -1 findbugs. The patch appears to cause Findbugs (version 2.0.3) to fail. {quote} Not related to the patch. I have seen the javadoc warning reported in other Hadoop QA reports as well. bq. -1 core tests. The patch failed these unit tests: {noformat} $:$ python findHangingTests.py https://builds.apache.org/job/PreCommit-HBASE-Build/14657//consoleFull Fetching the console output from the URL Printing hanging tests Hanging test : org.apache.hadoop.hbase.util.TestDrainBarrier Printing Failing tests {noformat} Not related to the patch. [Snapshot] List and delete snapshot by table Key: HBASE-8642 URL: https://issues.apache.org/jira/browse/HBASE-8642 Project: HBase Issue Type: Improvement Components: snapshots Affects Versions: 0.98.0, 0.95.0, 0.95.1, 0.95.2 Reporter: Julian Zhou Assignee: Ashish Singhi Fix For: 2.0.0 Attachments: 8642-trunk-0.95-v0.patch, 8642-trunk-0.95-v1.patch, 8642-trunk-0.95-v2.patch, HBASE-8642-v1.patch, HBASE-8642-v2.patch, HBASE-8642-v3.patch, HBASE-8642-v4.patch, HBASE-8642.patch Support listing and deleting snapshots by table name. User scenario: A user wants to delete all the snapshots which were taken in the month of January for a table 't', where the snapshot names start with 'Jan'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13796) ZKUtil doesn't clean quorum setting properly
[ https://issues.apache.org/jira/browse/HBASE-13796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614044#comment-14614044 ] Sean Busbey commented on HBASE-13796: - * commit e61bf1bf2582cad20c54585ceea21ec090984c1a on master * commit ffcd4d424f69b4ecac1bd9f5980c14bb4b61a3fa on branch-1 ZKUtil doesn't clean quorum setting properly Key: HBASE-13796 URL: https://issues.apache.org/jira/browse/HBASE-13796 Project: HBase Issue Type: Bug Affects Versions: 1.0.1, 1.1.0, 0.98.12 Reporter: Geoffrey Jacoby Assignee: Geoffrey Jacoby Priority: Minor Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13796.patch ZKUtil.getZooKeeperClusterKey is obviously trying to pull out the ZooKeeper quorum setting from the config object and remove several special characters from it. Due to a misplaced parenthesis, however, it's instead running the replace operation on the config setting _name_, HConstants.ZOOKEEPER_QUORUM, and not the config setting itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
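The bug shape described above can be illustrated with a plain Map standing in for the HBase Configuration (this is not the actual ZKUtil source): with the parenthesis misplaced, replaceAll runs on the setting *name*, so the looked-up value keeps its whitespace.

```java
import java.util.HashMap;
import java.util.Map;

public class MisplacedParen {
  static final String ZOOKEEPER_QUORUM = "hbase.zookeeper.quorum";
  static final String WHITESPACE = "[\\t\\n\\x0B\\f\\r ]";

  static String buggy(Map<String, String> conf) {
    // replaceAll applies to the constant key, so the value is never cleaned.
    return conf.get(ZOOKEEPER_QUORUM.replaceAll(WHITESPACE, ""));
  }

  static String fixed(Map<String, String> conf) {
    // replaceAll applies to the value that was looked up.
    return conf.get(ZOOKEEPER_QUORUM).replaceAll(WHITESPACE, "");
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(ZOOKEEPER_QUORUM, "host1:2181,\thost2:2181");
    System.out.println(buggy(conf)); // tab survives: host1:2181,<TAB>host2:2181
    System.out.println(fixed(conf)); // host1:2181,host2:2181
  }
}
```

Because the setting name itself contains no whitespace, the buggy form still finds the value, which is why the mistake is easy to miss in review.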
[jira] [Created] (HBASE-14026) Clarify Web API in version and compatibility docs
Sean Busbey created HBASE-14026: --- Summary: Clarify Web API in version and compatibility docs Key: HBASE-14026 URL: https://issues.apache.org/jira/browse/HBASE-14026 Project: HBase Issue Type: Task Components: documentation Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Fix For: 2.0.0 per discussion on HBASE-13861, update our version and compatibility section to clarify under operational compatibility that by Web page API we mean the /jmx endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14018) RegionServer is aborted when flushing memstore.
[ https://issues.apache.org/jira/browse/HBASE-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614104#comment-14614104 ] Lars Hofhansl commented on HBASE-14018: --- bq. java.lang.ArrayIndexOutOfBoundsException: -32743 Looks like a bug, I linked to HBASE-13329, let's fix it there. RegionServer is aborted when flushing memstore. --- Key: HBASE-14018 URL: https://issues.apache.org/jira/browse/HBASE-14018 Project: HBase Issue Type: Bug Components: hadoop2, hbase Affects Versions: 1.0.1.1 Environment: CentOS x64 Server Reporter: Dinh Duong Mai Attachments: hbase-hadoop-master-node1.vmcluster.log, hbase-hadoop-regionserver-node1.vmcluster.log, hbase-hadoop-zookeeper-node1.vmcluster.log + Pseudo-distributed Hadoop (2.6.0), ZK_HBASE_MANAGE = true (1 master, 1 regionserver). + Put data to OpenTSDB, 1000 records / s, for 2000 seconds. + RegionServer is aborted. === RegionServer logs === 2015-07-03 16:37:37,332 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1623, hits=172, hitRatio=10.60%, , cachingAccesses=177, cachingHits=151, cachingHitsRatio=85.31%, evictions=1139, evicted=21, evictedPerRun=0.018437225371599197 2015-07-03 16:37:37,898 INFO [node1:16040Replication Statistics #0] regionserver.Replication: Normal source for cluster 1: Total replicated edits: 2744, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 19207814 2015-07-03 16:42:37,331 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=371.27 KB, freeSize=181.41 MB, max=181.78 MB, blockCount=5, accesses=1624, hits=173, hitRatio=10.65%, , cachingAccesses=178, cachingHits=152, cachingHitsRatio=85.39%, evictions=1169, evicted=21, evictedPerRun=0.01796407252550125 2015-07-03 16:42:37,899 INFO [node1:16040Replication Statistics #0] 
regionserver.Replication: Normal source for cluster 1: Total replicated edits: 3049, currently replicating from: hdfs://node1.vmcluster:9000/hbase/WALs/node1.vmcluster,16040,1435897652505/node1.vmcluster%2C16040%2C1435897652505.default.1435908458590 at position: 33026416 2015-07-03 16:43:27,217 INFO [MemStoreFlusher.1] regionserver.HRegion: Started memstore flush for tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d., current region memstore size 128.05 MB 2015-07-03 16:43:27,899 FATAL [MemStoreFlusher.1] regionserver.HRegionServer: ABORTING region server node1.vmcluster,16040,1435897652505: Replay of WAL required. Forcing server shutdown org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1435897759785.2d49cd81fb6513f51af58bd0394c4e0d. at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1772) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1704) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743 at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478) at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263) at 
org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:932) at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:121) at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:71) at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:879) at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2128) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1955) ... 7 more 2015-07-03 16:43:27,901
[jira] [Commented] (HBASE-13329) ArrayIndexOutOfBoundsException in CellComparator#getMinimumMidpointArray
[ https://issues.apache.org/jira/browse/HBASE-13329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614103#comment-14614103 ] Lars Hofhansl commented on HBASE-13329: --- [~duong_dajgja], I linked the two issues together. Thank you for reporting the issue. ArrayIndexOutOfBoundsException in CellComparator#getMinimumMidpointArray Key: HBASE-13329 URL: https://issues.apache.org/jira/browse/HBASE-13329 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 1.0.1 Environment: linux-debian-jessie ec2 - t2.micro instances Reporter: Ruben Aguiar Priority: Critical Attachments: 13329-asserts.patch, 13329-v1.patch, HBASE-13329.test.00.branch-1.1.patch While trying to benchmark my opentsdb cluster, I've created a script that sends to hbase always the same value (in this case 1). After a few minutes, the whole region server crashes and the region itself becomes impossible to open again (cannot assign or unassign). After some investigation, what I saw on the logs is that when a Memstore flush is called on a large region (128mb) the process errors, killing the regionserver. On restart, replaying the edits generates the same error, making the region unavailable. Tried to manually unassign, assign or close_region. That didn't work because the code that reads/replays it crashes. From my investigation this seems to be an overflow issue. The logs show that the function getMinimumMidpointArray tried to access index -32743 of an array, extremely close to the minimum short value in Java. Upon investigation of the source code, it seems an index short is used, being incremented as long as the two vectors are the same, probably making it overflow on large vectors with equal data. Changing it to int should solve the problem. Here follows the hadoop logs of when the regionserver went down. Any help is appreciated. 
Any other information you need please do tell me: 2015-03-24 18:00:56,187 INFO [regionserver//10.2.0.73:16020.logRoller] wal.FSHLog: Rolled WAL /hbase/WALs/10.2.0.73,16020,1427216382590/10.2.0.73%2C16020%2C1427216382590.default.1427220018516 with entries=143, filesize=134.70 MB; new WAL /hbase/WALs/10.2.0.73,16020,1427216382590/10.2.0.73%2C16020%2C1427216382590.default.1427220056140 2015-03-24 18:00:56,188 INFO [regionserver//10.2.0.73:16020.logRoller] wal.FSHLog: Archiving hdfs://10.2.0.74:8020/hbase/WALs/10.2.0.73,16020,1427216382590/10.2.0.73%2C16020%2C1427216382590.default.1427219987709 to hdfs://10.2.0.74:8020/hbase/oldWALs/10.2.0.73%2C16020%2C1427216382590.default.1427219987709 2015-03-24 18:04:35,722 INFO [MemStoreFlusher.0] regionserver.HRegion: Started memstore flush for tsdb,,1427133969325.52bc1994da0fea97563a4a656a58bec2., current region memstore size 128.04 MB 2015-03-24 18:04:36,154 FATAL [MemStoreFlusher.0] regionserver.HRegionServer: ABORTING region server 10.2.0.73,16020,1427216382590: Replay of WAL required. Forcing server shutdown org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1427133969325.52bc1994da0fea97563a4a656a58bec2. 
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1999) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1770) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1702) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743 at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478) at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263) at org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:932) at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:121) at
[jira] [Commented] (HBASE-13329) ArrayIndexOutOfBoundsException in CellComparator#getMinimumMidpointArray
[ https://issues.apache.org/jira/browse/HBASE-13329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614102#comment-14614102 ] Lars Hofhansl commented on HBASE-13329: --- Thanks [~duong_dajgja], do you remember what exactly you did? From looking at the code I think we have an issue when exactly these two happen: # We have a row key of length Short.MAX_VALUE # All row keys in an HFile block are identical (i.e. we repeatedly insert Cells with the same key and that key is exactly Short.MAX_VALUE in size). Did that happen here? ArrayIndexOutOfBoundsException in CellComparator#getMinimumMidpointArray Key: HBASE-13329 URL: https://issues.apache.org/jira/browse/HBASE-13329 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 1.0.1 Environment: linux-debian-jessie ec2 - t2.micro instances Reporter: Ruben Aguiar Priority: Critical Attachments: 13329-asserts.patch, 13329-v1.patch, HBASE-13329.test.00.branch-1.1.patch While trying to benchmark my opentsdb cluster, I've created a script that sends to hbase always the same value (in this case 1). After a few minutes, the whole region server crashes and the region itself becomes impossible to open again (cannot assign or unassign). After some investigation, what I saw on the logs is that when a Memstore flush is called on a large region (128mb) the process errors, killing the regionserver. On restart, replaying the edits generates the same error, making the region unavailable. Tried to manually unassign, assign or close_region. That didn't work because the code that reads/replays it crashes. From my investigation this seems to be an overflow issue. The logs show that the function getMinimumMidpointArray tried to access index -32743 of an array, extremely close to the minimum short value in Java. 
Upon investigation of the source code, it seems an index short is used, being incremented as long as the two vectors are the same, probably making it overflow on large vectors with equal data. Changing it to int should solve the problem. Here follows the hadoop logs of when the regionserver went down. Any help is appreciated. Any other information you need please do tell me: 2015-03-24 18:00:56,187 INFO [regionserver//10.2.0.73:16020.logRoller] wal.FSHLog: Rolled WAL /hbase/WALs/10.2.0.73,16020,1427216382590/10.2.0.73%2C16020%2C1427216382590.default.1427220018516 with entries=143, filesize=134.70 MB; new WAL /hbase/WALs/10.2.0.73,16020,1427216382590/10.2.0.73%2C16020%2C1427216382590.default.1427220056140 2015-03-24 18:00:56,188 INFO [regionserver//10.2.0.73:16020.logRoller] wal.FSHLog: Archiving hdfs://10.2.0.74:8020/hbase/WALs/10.2.0.73,16020,1427216382590/10.2.0.73%2C16020%2C1427216382590.default.1427219987709 to hdfs://10.2.0.74:8020/hbase/oldWALs/10.2.0.73%2C16020%2C1427216382590.default.1427219987709 2015-03-24 18:04:35,722 INFO [MemStoreFlusher.0] regionserver.HRegion: Started memstore flush for tsdb,,1427133969325.52bc1994da0fea97563a4a656a58bec2., current region memstore size 128.04 MB 2015-03-24 18:04:36,154 FATAL [MemStoreFlusher.0] regionserver.HRegionServer: ABORTING region server 10.2.0.73,16020,1427216382590: Replay of WAL required. Forcing server shutdown org.apache.hadoop.hbase.DroppedSnapshotException: region: tsdb,,1427133969325.52bc1994da0fea97563a4a656a58bec2. 
at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1999) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1770) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1702) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743 at org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478) at org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263) at
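The overflow suspected in the reports above can be reproduced in miniature. This is a hedged sketch, not the CellComparator source: a short loop index scanning two identical byte arrays wraps past Short.MAX_VALUE to a negative value, producing an out-of-bounds access like the large negative index in the stack traces; widening the index to int fixes it.

```java
public class ShortIndexOverflow {
  static int firstDiffShort(byte[] a, byte[] b) {
    short i = 0;
    while (i < a.length && a[i] == b[i]) {
      i++;                  // wraps to -32768 after Short.MAX_VALUE
    }
    return i;
  }

  static int firstDiffInt(byte[] a, byte[] b) {
    int i = 0;
    while (i < a.length && a[i] == b[i]) {
      i++;                  // no wrap within array-length range
    }
    return i;
  }

  public static void main(String[] args) {
    byte[] a = new byte[40000];   // identical, longer than Short.MAX_VALUE
    byte[] b = new byte[40000];
    try {
      firstDiffShort(a, b);
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("short index overflowed: " + e.getMessage());
    }
    System.out.println(firstDiffInt(a, b)); // 40000
  }
}
```

The failure needs both conditions from the comment above: equal data and a run longer than Short.MAX_VALUE, which is why it only shows up on large blocks of identical keys.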
[jira] [Updated] (HBASE-13646) HRegion#execService should not try to build incomplete messages
[ https://issues.apache.org/jira/browse/HBASE-13646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13646: Fix Version/s: 1.2.0 0.98.14 sure. I'll push this later tonight / tomorrow unless someone objects. HRegion#execService should not try to build incomplete messages --- Key: HBASE-13646 URL: https://issues.apache.org/jira/browse/HBASE-13646 Project: HBase Issue Type: Bug Components: Coprocessors, regionserver Affects Versions: 2.0.0, 1.2.0, 1.1.1 Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 2.0.0, 0.98.14, 1.2.0 Attachments: HBASE-13646-branch-1.patch, HBASE-13646.patch, HBASE-13646.v2.patch, HBASE-13646.v2.patch If some RPC service called on a region throws an exception, execService still tries to build a Message. In the case of complex messages with required fields this complicates the service code, because the service needs to pass fake protobuf objects just so they are buildable. To mitigate that I propose to check whether the controller has failed and return null from the call instead of failing with an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
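The proposed control flow can be sketched as follows. The names are illustrative, not the actual HBase coprocessor API: when the controller has already been failed by the service, return null rather than building a response message whose required fields were never set.

```java
public class ExecServiceSketch {
  static class RpcController {
    private String errorText;
    void setFailed(String why) { this.errorText = why; }
    boolean failed() { return errorText != null; }
  }

  interface ResponseBuilder { Object build(); }

  static Object execService(RpcController controller, ResponseBuilder builder) {
    if (controller.failed()) {
      return null;  // skip building the incomplete message
    }
    return builder.build();
  }

  public static void main(String[] args) {
    RpcController ok = new RpcController();
    System.out.println(execService(ok, () -> "response"));  // response

    RpcController bad = new RpcController();
    bad.setFailed("service threw");
    System.out.println(execService(bad, () -> "response")); // null
  }
}
```

With real protobuf messages, calling build() on a builder with unset required fields throws, which is what forces services to fill in fake values today; short-circuiting on a failed controller removes that need.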
[jira] [Updated] (HBASE-14013) Retry when RegionServerNotYetRunningException rather than go ahead with assign so for sure we don't skip WAL replay
[ https://issues.apache.org/jira/browse/HBASE-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-14013: Fix Version/s: (was: 1.3.0) Retry when RegionServerNotYetRunningException rather than go ahead with assign so for sure we don't skip WAL replay --- Key: HBASE-14013 URL: https://issues.apache.org/jira/browse/HBASE-14013 Project: HBase Issue Type: Sub-task Components: Region Assignment Reporter: stack Assignee: Enis Soztutar Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: hbase-13895_addendum3-branch-1.1.patch, hbase-13895_addendum3-branch-1.patch, hbase-13895_addendum3-master.patch Patches are copied from parent. They were done by [~enis] +1 from. They continue the theme of the parent applying it to RegionServerNotYetRunningException as well as the new region aborting exception .. added in parent issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13895) DATALOSS: Region assigned before WAL replay when abort
[ https://issues.apache.org/jira/browse/HBASE-13895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13895: Fix Version/s: (was: 1.3.0) DATALOSS: Region assigned before WAL replay when abort -- Key: HBASE-13895 URL: https://issues.apache.org/jira/browse/HBASE-13895 Project: HBase Issue Type: Bug Affects Versions: 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: 13895.branch-1.2.txt, 13895.master.addendum2.txt, 13895.master.patch, hbase-13895_addendum-master.patch, hbase-13895_addendum.patch, hbase-13895_addendum3-branch-1.1.patch, hbase-13895_addendum3-branch-1.patch, hbase-13895_addendum3-master.patch, hbase-13895_v1-branch-1.1.patch Opening a place holder till finish analysis. I have dataloss running ITBLL at 3B (testing HBASE-13877). Most obvious culprit is the double-assignment that I can see. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13849) Remove restore and clone snapshot from the WebUI
[ https://issues.apache.org/jira/browse/HBASE-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13849: Fix Version/s: 0.98.14 Remove restore and clone snapshot from the WebUI Key: HBASE-13849 URL: https://issues.apache.org/jira/browse/HBASE-13849 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 1.0.1, 1.1.0, 0.98.13, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 2.0.0, 0.98.14, 1.2.0 Attachments: HBASE-13849-v0.patch Remove the clone and restore snapshot buttons from the WebUI. The first reason is that the operation may take too long for the user to wait on the WebUI. The second reason is that an action from the WebUI does not play well with security, since it is going to be executed by the hbase user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13585) HRegionFileSystem#splitStoreFile() finishes without closing the file handle in some situation
[ https://issues.apache.org/jira/browse/HBASE-13585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13585: Fix Version/s: (was: 1.2.0) HRegionFileSystem#splitStoreFile() finishes without closing the file handle in some situation - Key: HBASE-13585 URL: https://issues.apache.org/jira/browse/HBASE-13585 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 2.0.0, 1.1.0, 1.2.0 Environment: Windows Reporter: Stephen Yuan Jiang Assignee: Stephen Yuan Jiang Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13585-0.98.patch, HBASE-13585.v1-mater.patch HRegionFileSystem#splitStoreFile() opens a reader and does not close it in some situation. In Windows, TestSplitTransaction#testWholesomeSplit() consistently failed due to open file handle left at the end of test {noformat} Failed delete of C:/hbase/hbase-server/target/test-data/b470118c-978a-4915-8a12-b29b2c966beb/org.apache.hadoop.hbase.regionserver.TestSplitTransaction/data/default/table/a578b53b3c3a947c5f617c51ccb982cf Stacktrace java.io.IOException: Failed delete of C:/hbase/hbase-server/target/test-data/b470118c-978a-4915-8a12-b29b2c966beb/org.apache.hadoop.hbase.regionserver.TestSplitTransaction/data/default/table/a578b53b3c3a947c5f617c51ccb982cf at org.apache.hadoop.hbase.regionserver.TestSplitTransaction.teardown(TestSplitTransaction.java:106) Standard Error 2015-04-26 12:57:43,863 WARN [main] fs.FileUtil(187): Failed to delete file or dir [C:\hbase\hbase-server\target\test-data\b470118c-978a-4915-8a12-b29b2c966beb\org.apache.hadoop.hbase.regionserver.TestSplitTransaction\data\default\table\a578b53b3c3a947c5f617c51ccb982cf\info\.0519debf2b934245a9a8c3d7cfc0f51d.crc]: it still exists. 
2015-04-26 12:57:43,864 WARN [main] fs.FileUtil(187): Failed to delete file or dir [C:\hbase\hbase-server\target\test-data\b470118c-978a-4915-8a12-b29b2c966beb\org.apache.hadoop.hbase.regionserver.TestSplitTransaction\data\default\table\a578b53b3c3a947c5f617c51ccb982cf\info\.45fd4e817ce64759abc6e982a5c0b830.crc]: it still exists. 2015-04-26 12:57:43,865 WARN [main] fs.FileUtil(187): Failed to delete file or dir [C:\hbase\hbase-server\target\test-data\b470118c-978a-4915-8a12-b29b2c966beb\org.apache.hadoop.hbase.regionserver.TestSplitTransaction\data\default\table\a578b53b3c3a947c5f617c51ccb982cf\info\.8e6bc752c9fc414abfe085f8959dce94.crc]: it still exists. 2015-04-26 12:57:43,870 WARN [main] fs.FileUtil(187): Failed to delete file or dir [C:\hbase\hbase-server\target\test-data\b470118c-978a-4915-8a12-b29b2c966beb\org.apache.hadoop.hbase.regionserver.TestSplitTransaction\data\default\table\a578b53b3c3a947c5f617c51ccb982cf\info\0519debf2b934245a9a8c3d7cfc0f51d]: it still exists. 2015-04-26 12:57:43,870 WARN [main] fs.FileUtil(187): Failed to delete file or dir [C:\hbase\hbase-server\target\test-data\b470118c-978a-4915-8a12-b29b2c966beb\org.apache.hadoop.hbase.regionserver.TestSplitTransaction\data\default\table\a578b53b3c3a947c5f617c51ccb982cf\info\45fd4e817ce64759abc6e982a5c0b830]: it still exists. 2015-04-26 12:57:43,871 WARN [main] fs.FileUtil(187): Failed to delete file or dir [C:\hbase\hbase-server\target\test-data\b470118c-978a-4915-8a12-b29b2c966beb\org.apache.hadoop.hbase.regionserver.TestSplitTransaction\data\default\table\a578b53b3c3a947c5f617c51ccb982cf\info\8e6bc752c9fc414abfe085f8959dce94]: it still exists. {noformat} HBASE-8300 tried to fix this issue by adding 'StoreFile#closeReader()' call. However, after the open reader calls, there are 4 'return null' calls before the close call. When I run TestSplitTransaction#testWholesomeSplit(), 2 of the 'return null' calls were hit multiple times (which means the opened file is not closed). 
The fix needs to make sure that StoreFile#closeReader() is called in all situations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
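The leak pattern described above can be sketched with illustrative names (this is not the HRegionFileSystem code): an early "return null" between opening the reader and closing it leaks the handle, while a finally block closes it on every exit path.

```java
import java.io.Closeable;

public class CloseOnAllPaths {
  static class Reader implements Closeable {
    int closes = 0;
    boolean isEmpty() { return true; }
    @Override public void close() { closes++; }
  }

  static String splitLeaky(Reader r) {
    if (r.isEmpty()) {
      return null;          // leaks: close() is never reached
    }
    r.close();
    return "reference";
  }

  static String splitFixed(Reader r) {
    try {
      if (r.isEmpty()) {
        return null;        // finally still runs on this path
      }
      return "reference";
    } finally {
      r.close();            // closed on every exit path
    }
  }

  public static void main(String[] args) {
    Reader a = new Reader();
    splitLeaky(a);
    Reader b = new Reader();
    splitFixed(b);
    System.out.println(a.closes + " " + b.closes); // 0 1
  }
}
```

On Windows an open handle blocks deletion of the underlying file, which is why the leak surfaces as the "Failed delete" teardown errors quoted above.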
[jira] [Updated] (HBASE-13589) [WINDOWS] hbase.cmd script is broken
[ https://issues.apache.org/jira/browse/HBASE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13589: Fix Version/s: (was: 1.2.0) [WINDOWS] hbase.cmd script is broken Key: HBASE-13589 URL: https://issues.apache.org/jira/browse/HBASE-13589 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: hbase-13589_v1.patch It seems that after some recent changes, hbase.cmd no longer works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13594) MultiRowRangeFilter shouldn't call HBaseZeroCopyByteString.wrap() directly
[ https://issues.apache.org/jira/browse/HBASE-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13594: Fix Version/s: (was: 1.2.0) MultiRowRangeFilter shouldn't call HBaseZeroCopyByteString.wrap() directly -- Key: HBASE-13594 URL: https://issues.apache.org/jira/browse/HBASE-13594 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 2.0.0, 1.1.0 Attachments: 13594-v1.txt MultiRowRangeFilter calls HBaseZeroCopyByteString.wrap() directly. Instead it should call ByteStringer.wrap() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14010) TestRegionRebalancing.testRebalanceOnRegionServerNumberChange fails; cluster not balanced
[ https://issues.apache.org/jira/browse/HBASE-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613799#comment-14613799 ] Hudson commented on HBASE-14010: FAILURE: Integrated in HBase-1.3 #33 (See [https://builds.apache.org/job/HBase-1.3/33/]) HBASE-14010 TestRegionRebalancing.testRebalanceOnRegionServerNumberChange fails; cluster not balanced (stack: rev bfaf837049619417afa86231b52eed138c305254) * hbase-server/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java TestRegionRebalancing.testRebalanceOnRegionServerNumberChange fails; cluster not balanced - Key: HBASE-14010 URL: https://issues.apache.org/jira/browse/HBASE-14010 Project: HBase Issue Type: Bug Components: test Reporter: stack Assignee: stack Fix For: 2.0.0, 1.2.0, 1.1.2 Attachments: 14010.txt, 14010.txt, 14010.txt java.lang.AssertionError: null at org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange(TestRegionRebalancing.java:144) from recent build https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/14639/testReport/junit/org.apache.hadoop.hbase/TestRegionRebalancing/testRebalanceOnRegionServerNumberChange_0_/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14015) Allow setting a richer state value when toString a pv2
[ https://issues.apache.org/jira/browse/HBASE-14015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613801#comment-14613801 ] Hudson commented on HBASE-14015: FAILURE: Integrated in HBase-1.3 #33 (See [https://builds.apache.org/job/HBase-1.3/33/]) HBASE-14015 Allow setting a richer state value when toString a pv2 (stack: rev 72cc3baa91b92f8369be1ede701268ca7c707ff6) * hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/StateMachineProcedure.java * hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestProcedureToString.java * hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java HBASE-14015 Allow setting a richer state value when toString a pv2 -- ADDENDUM on branch-1 and derivatives (stack: rev 7b1f0b841b4d577f595dc3256919583aa1c745f1) * hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/TestProcedureToString.java Allow setting a richer state value when toString a pv2 -- Key: HBASE-14015 URL: https://issues.apache.org/jira/browse/HBASE-14015 Project: HBase Issue Type: Improvement Components: proc-v2 Reporter: stack Assignee: stack Priority: Minor Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: 0001-HBASE-14015-Allow-setting-a-richer-state-value-when-.patch, 14015.addendum.to.fix.compile.issue.on.branch-1.branch-1.2.txt Debugging, my procedure after a crash was loaded out of the store and its state was RUNNING. It would help if I knew in which of the states of a StateMachineProcedure it was going to start RUNNING at. Chatting w/ Matteo, he suggested allowing Procedures customize the String. Here is patch that makes it so StateMachineProcedure will now print out the base state -- RUNNING, FINISHED -- followed by a ':' and then the StateMachineProcedure state: e.g. SimpleStateMachineProcedure state=RUNNABLE:SERVER_CRASH_ASSIGN -- This message was sent by Atlassian JIRA (v6.3.4#6332)
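The toString scheme described above can be sketched with illustrative enums (not the actual Procedure/StateMachineProcedure classes): the base procedure state, a ':', then the state machine's own state.

```java
public class ProcedureStateString {
  enum BaseState { RUNNABLE, FINISHED }
  enum ServerCrashState { SERVER_CRASH_START, SERVER_CRASH_ASSIGN }

  // Render "name state=BASE:MACHINE", e.g. after a crash a loaded procedure
  // shows which internal step it will resume at, not just that it is RUNNABLE.
  static String render(String name, BaseState base, ServerCrashState machine) {
    return name + " state=" + base + ":" + machine;
  }

  public static void main(String[] args) {
    System.out.println(
        render("SimpleStateMachineProcedure",
               BaseState.RUNNABLE, ServerCrashState.SERVER_CRASH_ASSIGN));
    // SimpleStateMachineProcedure state=RUNNABLE:SERVER_CRASH_ASSIGN
  }
}
```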
[jira] [Commented] (HBASE-13895) DATALOSS: Region assigned before WAL replay when abort
[ https://issues.apache.org/jira/browse/HBASE-13895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613796#comment-14613796 ] Hudson commented on HBASE-13895: FAILURE: Integrated in HBase-1.3 #33 (See [https://builds.apache.org/job/HBase-1.3/33/]) HBASE-13895 DATALOSS: Region assigned before WAL replay when abort (Enis Soztutar) -- ADDENDUM (stack: rev 09846ff81a1af3cb68e19b7390df4424d45b5c42) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java * hbase-client/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerAbortedException.java HBASE-13895 DATALOSS: Region assigned before WAL replay when abort (Enis Soztutar) -- ADDENDUM (stack: rev 7b4febbc2b32ac09189af8508d40872fea46ad98) * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java DATALOSS: Region assigned before WAL replay when abort -- Key: HBASE-13895 URL: https://issues.apache.org/jira/browse/HBASE-13895 Project: HBase Issue Type: Bug Affects Versions: 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0 Attachments: 13895.branch-1.2.txt, 13895.master.addendum2.txt, 13895.master.patch, hbase-13895_addendum-master.patch, hbase-13895_addendum.patch, hbase-13895_addendum3-branch-1.1.patch, hbase-13895_addendum3-branch-1.patch, hbase-13895_addendum3-master.patch, hbase-13895_v1-branch-1.1.patch Opening a place holder till finish analysis. I have dataloss running ITBLL at 3B (testing HBASE-13877). Most obvious culprit is the double-assignment that I can see. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14013) Retry when RegionServerNotYetRunningException rather than go ahead with assign so for sure we don't skip WAL replay
[ https://issues.apache.org/jira/browse/HBASE-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14613800#comment-14613800 ] Hudson commented on HBASE-14013: FAILURE: Integrated in HBase-1.3 #33 (See [https://builds.apache.org/job/HBase-1.3/33/]) HBASE-14013 Retry when RegionServerNotYetRunningException rather than go ahead with assign so for sure we don't skip WAL replay (stack: rev e0e6a5f09d77d4fe6d60b8d0414b89e1853a5656) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java Retry when RegionServerNotYetRunningException rather than go ahead with assign so for sure we don't skip WAL replay --- Key: HBASE-14013 URL: https://issues.apache.org/jira/browse/HBASE-14013 Project: HBase Issue Type: Sub-task Components: Region Assignment Reporter: stack Assignee: Enis Soztutar Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0 Attachments: hbase-13895_addendum3-branch-1.1.patch, hbase-13895_addendum3-branch-1.patch, hbase-13895_addendum3-master.patch Patches are copied from parent. They were done by [~enis] +1 from. They continue the theme of the parent applying it to RegionServerNotYetRunningException as well as the new region aborting exception .. added in parent issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)