[jira] [Commented] (HBASE-11882) Row level consistency may not be maintained with bulk load and compaction
[ https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119417#comment-14119417 ] ramkrishna.s.vasudevan commented on HBASE-11882: I hope you see my concern. Previously, when a bulk load completed just after a scanner was created but before the scan started to seek, the KVs in the bulk-loaded file were also taken into consideration. After HBASE-11591, the bulk-loaded file is not taken into consideration. So if the test case expects some value from the bulk-loaded file, it may fail. Maybe it does not happen now, but it may happen. Anyway, I will check the test case closely. +1 on patch. Row level consistency may not be maintained with bulk load and compaction - Key: HBASE-11882 URL: https://issues.apache.org/jira/browse/HBASE-11882 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.99.0, 2.0.0 Reporter: Jerry He Assignee: Jerry He Priority: Critical Fix For: 0.99.0, 2.0.0 Attachments: HBASE-11882-master-v1.patch, HBASE-11882-master-v2.patch, TestHRegionServerBulkLoad.java.patch While looking into the TestHRegionServerBulkLoad failure for HBASE-11772, I found the root cause is that row-level atomicity may not be maintained with bulk load together with compaction. TestHRegionServerBulkLoad is used to test bulk load atomicity. The test uses multiple threads to do bulk loads and scans continuously and runs compactions periodically. It verifies that row-level data is always consistent across column families. After HBASE-11591, we added readpoint checks for bulk-loaded data using the seqId at the time of the bulk load. Now a scanner will not see the data from a bulk load if the scanner's readpoint is earlier than the bulk load seqId. Previously, the atomic bulk load result was visible immediately to all scanners. The problem is with compaction after bulk load. Compaction does not lock the region, and it is done one store (column family) at a time. 
It also compacts away the seqId marker of the bulk load. Here is an event sequence where the row-level consistency is broken. 1. A scanner is started to scan a region with cf1 and cf2. The readpoint is 10. 2. There is a bulk load that loads into cf1 and cf2. The bulk load seqId is 11. The bulk load is guarded by the region write lock, so it is atomic. 3. There is a compaction that compacts cf1. It compacts away the seqId marker of the bulk load. 4. The scanner tries to advance (next) to row-1001. It gets the bulk load data for cf1 since there is no seqId preventing it. It does not get the bulk load data for cf2 since the scanner's readpoint (10) is less than the bulk load seqId (11). The row-level consistency is broken in this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
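The broken sequence above can be sketched as a tiny standalone model (hypothetical names, not HBase's actual classes): a scanner sees a bulk-loaded cell only when its readpoint has reached the bulk load seqId, unless compaction has already discarded the seqId marker, in which case nothing hides the cell.

```java
// Minimal sketch of the visibility rule described in HBASE-11882.
// A negative bulkLoadSeqId models "the seqId marker was compacted away".
public class BulkLoadVisibilityDemo {
    static boolean visibleToScanner(long scannerReadpoint, long bulkLoadSeqId) {
        if (bulkLoadSeqId < 0) {
            // Compaction removed the marker: no seqId prevents the read.
            return true;
        }
        return scannerReadpoint >= bulkLoadSeqId;
    }

    public static void main(String[] args) {
        long readpoint = 10, seqId = 11;
        // cf1 was compacted (marker gone) while cf2 was not: the same
        // scanner sees the bulk load in one family but not the other.
        System.out.println("cf1 visible: " + visibleToScanner(readpoint, -1));
        System.out.println("cf2 visible: " + visibleToScanner(readpoint, seqId));
    }
}
```

With readpoint 10 and seqId 11, the compacted cf1 returns true and the uncompacted cf2 returns false, which is exactly the cross-family inconsistency the report describes.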
[jira] [Commented] (HBASE-11869) Support snapshot owner
[ https://issues.apache.org/jira/browse/HBASE-11869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119416#comment-14119416 ] Hadoop QA commented on HBASE-11869: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666126/HBASE-11869-trunk-v3.diff against trunk revision . ATTACHMENT ID: 12666126 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestRegionRebalancing Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10691//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10691//console This message is automatically generated. Support snapshot owner -- Key: HBASE-11869 URL: https://issues.apache.org/jira/browse/HBASE-11869 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-11869-trunk-v1.diff, HBASE-11869-trunk-v3.diff In the current codebase, the table snapshot operations can only be done by the global admin, not by the table admin. 
We have a multi-tenant HBase cluster where each table has different snapshot policies, e.g. a snapshot per week, or a snapshot after new data is imported. We want to delegate the snapshot permission to each table admin. Following [~mbertozzi]'s suggestion, we implement the snapshot owner feature. * A user with table admin permission can create a snapshot, and the owner of that snapshot is this user. * The owner of a snapshot can delete and restore the snapshot. * Only a user with global admin permission can clone a snapshot, since this operation creates a new table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
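A minimal sketch of the three rules above (illustrative names only; the real checks would live in HBase's AccessController, not in a standalone class like this):

```java
// Hedged model of the proposed snapshot-owner permission rules:
// table admins may snapshot, snapshot owners may delete/restore,
// and clone stays global-admin-only because it creates a new table.
public class SnapshotAclSketch {
    enum Op { SNAPSHOT, DELETE, RESTORE, CLONE }

    static boolean allowed(Op op, String user, boolean isGlobalAdmin,
                           boolean isTableAdmin, String snapshotOwner) {
        if (isGlobalAdmin) {
            return true; // global admin can do everything, as before
        }
        switch (op) {
            case SNAPSHOT:
                return isTableAdmin;
            case DELETE:
            case RESTORE:
                return user.equals(snapshotOwner);
            case CLONE:
                return false; // creating a new table needs global admin
        }
        return false;
    }
}
```

Note the compatibility point from the discussion: the change only grants additional permissions (it never takes a path that was previously allowed and forbids it), which is why it is not restricting.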
[jira] [Commented] (HBASE-11877) Make TableSplit more readable
[ https://issues.apache.org/jira/browse/HBASE-11877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119435#comment-14119435 ] Liu Shaohui commented on HBASE-11877: - [~jmspaggi] [~stack] {quote} Don't see any issue with that. Have you run it locally? Do you have a copy of the output? {quote} The split output from TestCopyTable: {code} 2014-09-03 13:42:55,180 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: HBase table split(table name: testStartStopRow1, start row: row1, end row: row2, region location: localhost) {code} {quote} Has it passed the tests? {quote} Yes, all tests passed on my dev machine. {quote} Any risk for those new fields (m_tableName, m_regionLocation) to be null? {quote} The append method of StringBuilder and Bytes.toStringBinary already handle a null object: if the object is null, they render it as the string "null". Make TableSplit more readable - Key: HBASE-11877 URL: https://issues.apache.org/jira/browse/HBASE-11877 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Attachments: HBASE-11877-trunk-v1.diff When debugging MR jobs reading from an hbase table, it's important to figure out which region a map task is reading from. But the table split object is hard to read, e.g.: {code} 2014-09-01 20:58:39,783 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: lg-hadoop-prc-st40.bj:,0 {code} See TableSplit.java: {code} @Override public String toString() { return m_regionLocation + ":" + Bytes.toStringBinary(m_startRow) + "," + Bytes.toStringBinary(m_endRow); } {code} We should make it more readable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
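One possible shape for the improved toString(), matching the sample output quoted in the comment (a sketch, not necessarily the exact committed patch; the field values stand in for TableSplit's m_* members):

```java
// Sketch of a self-describing TableSplit.toString(). StringBuilder.append
// renders null arguments as the string "null", which is why null
// tableName/regionLocation fields are safe here.
public class TableSplitToStringSketch {
    private final String tableName;
    private final String startRow;
    private final String endRow;
    private final String regionLocation;

    public TableSplitToStringSketch(String tableName, String startRow,
                                    String endRow, String regionLocation) {
        this.tableName = tableName;
        this.startRow = startRow;
        this.endRow = endRow;
        this.regionLocation = regionLocation;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append("HBase table split(");
        sb.append("table name: ").append(tableName);
        sb.append(", start row: ").append(startRow);
        sb.append(", end row: ").append(endRow);
        sb.append(", region location: ").append(regionLocation);
        sb.append(")");
        return sb.toString();
    }
}
```

Compared to the old `location:start,end` form, every field is labeled, so a log line immediately tells you which table and region a map task is reading.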
[jira] [Updated] (HBASE-11877) Make TableSplit more readable
[ https://issues.apache.org/jira/browse/HBASE-11877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-11877: Attachment: HBASE-11877-trunk-v2.diff Add unit test and update the patch. Make TableSplit more readable - Key: HBASE-11877 URL: https://issues.apache.org/jira/browse/HBASE-11877 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Attachments: HBASE-11877-trunk-v1.diff, HBASE-11877-trunk-v2.diff -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11882) Row level consistency may not be maintained with bulk load and compaction
[ https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HBASE-11882: - Status: Open (was: Patch Available) Row level consistency may not be maintained with bulk load and compaction - Key: HBASE-11882 URL: https://issues.apache.org/jira/browse/HBASE-11882 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.99.0, 2.0.0 Reporter: Jerry He Assignee: Jerry He Priority: Critical Fix For: 0.99.0, 2.0.0 Attachments: HBASE-11882-master-v1.patch, HBASE-11882-master-v2.patch, TestHRegionServerBulkLoad.java.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11882) Row level consistency may not be maintained with bulk load and compaction
[ https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119439#comment-14119439 ] Jerry He commented on HBASE-11882: -- Yes, the behavior change from the added readpoint checks for bulk-loaded data using the seqId is understood. I did run 'mvn test' after the v2 patch earlier today, and the entire run passed cleanly, but I lost the result. I've just kicked off another run and will paste the result here. I will also try to trigger a Hadoop QA run here. Row level consistency may not be maintained with bulk load and compaction - Key: HBASE-11882 URL: https://issues.apache.org/jira/browse/HBASE-11882 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.99.0, 2.0.0 Reporter: Jerry He Assignee: Jerry He Priority: Critical Fix For: 0.99.0, 2.0.0 Attachments: HBASE-11882-master-v1.patch, HBASE-11882-master-v2.patch, TestHRegionServerBulkLoad.java.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11882) Row level consistency may not be maintained with bulk load and compaction
[ https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HBASE-11882: - Status: Patch Available (was: Open) Row level consistency may not be maintained with bulk load and compaction - Key: HBASE-11882 URL: https://issues.apache.org/jira/browse/HBASE-11882 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.99.0, 2.0.0 Reporter: Jerry He Assignee: Jerry He Priority: Critical Fix For: 0.99.0, 2.0.0 Attachments: HBASE-11882-master-v1.patch, HBASE-11882-master-v2.patch, TestHRegionServerBulkLoad.java.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11869) Support snapshot owner
[ https://issues.apache.org/jira/browse/HBASE-11869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119441#comment-14119441 ] Liu Shaohui commented on HBASE-11869: - It seems that the failed test, TestRegionRebalancing, is unrelated to the patch. Support snapshot owner -- Key: HBASE-11869 URL: https://issues.apache.org/jira/browse/HBASE-11869 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-11869-trunk-v1.diff, HBASE-11869-trunk-v3.diff -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics
[ https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119443#comment-14119443 ] Lars Hofhansl commented on HBASE-11876: --- Sorry, I was traveling today. I was just finding some time to make a quick patch, and it's already done. :) Belated +1 on the patch then; that's almost exactly how I would have coded it up. Should be a nice improvement. RegionScanner.nextRaw(...) should not update metrics Key: HBASE-11876 URL: https://issues.apache.org/jira/browse/HBASE-11876 Project: HBase Issue Type: Bug Affects Versions: 0.98.6 Reporter: Lars Hofhansl Assignee: Andrew Purtell Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11876-0.98.patch, HBASE-11876.patch, HBASE-11876.patch I added RegionScanner.nextRaw(...) to allow smart clients to avoid some of the default work that HBase is doing, such as {start|stop}RegionOperation and synchronized(scanner) for each row. Metrics should follow the same approach. Collecting them per row is expensive, and a caller should have the option to collect them later or to avoid collecting them completely. We can also save some cycles in RSRpcServices.scan(...) if we update the metric only once per batch instead of for each row. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
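The once-per-batch idea from the description can be illustrated like this (hypothetical metric and method names; the real counters live in HBase's metrics subsystem, not in this standalone class):

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of updating a shared scan metric once per batch instead of once
// per row: the per-row variant pays one atomic operation per row, while
// the batched variant accumulates locally and publishes a single update.
public class ScanMetricsBatchingSketch {
    static final AtomicLong rowsScannedMetric = new AtomicLong();

    // Per-row style: one contended atomic increment for every row.
    static void recordPerRow(int rowsInBatch) {
        for (int i = 0; i < rowsInBatch; i++) {
            rowsScannedMetric.incrementAndGet();
        }
    }

    // Per-batch style: count locally, touch the shared metric once.
    static void recordPerBatch(int rowsInBatch) {
        long local = 0;
        for (int i = 0; i < rowsInBatch; i++) {
            local++; // stands in for "one row returned by nextRaw"
        }
        rowsScannedMetric.addAndGet(local);
    }
}
```

Both variants report the same total; the batched one simply moves the shared-state update out of the hot per-row loop, which is the saving the description points at for RSRpcServices.scan(...).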
[jira] [Commented] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics
[ https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119445#comment-14119445 ] Hudson commented on HBASE-11876: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #465 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/465/]) HBASE-11876 RegionScanner.nextRaw should not update metrics (apurtell: rev 23a4181d1ecc5f492c16dc579bff92eef7d209f1) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java RegionScanner.nextRaw(...) should not update metrics Key: HBASE-11876 URL: https://issues.apache.org/jira/browse/HBASE-11876 Project: HBase Issue Type: Bug Affects Versions: 0.98.6 Reporter: Lars Hofhansl Assignee: Andrew Purtell Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11876-0.98.patch, HBASE-11876.patch, HBASE-11876.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11863) WAL files are not archived and stays in the WAL directory after splitting
[ https://issues.apache.org/jira/browse/HBASE-11863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119444#comment-14119444 ] Hudson commented on HBASE-11863: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #465 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/465/]) HBASE-11863 WAL files are not archived and stays in the WAL directory after splitting (enis: rev 7f28fcf429242c549219502bfb7da0ad28753f4c) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java WAL files are not archived and stays in the WAL directory after splitting -- Key: HBASE-11863 URL: https://issues.apache.org/jira/browse/HBASE-11863 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Priority: Blocker Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: hbase-11863_v1-0.98.patch, hbase-11863_v1.patch, hbase-11863_v2.patch, hbase-11863_v3-0.98.patch, hbase-11863_v3-0.99.patch, hbase-11863_v3.patch In patch HBASE-11094, it seems that we changed the constructor we are using for SplitLogManager, which does not archive the log files to the archive folder after splitting is done. The log files stay in the splitting directory forever and are re-split every time the master restarts. It is surprising that our unit tests have been passing (since 0.94.4) without any issues. Part of the reason is that the split is actually carried out, but the WAL is not moved, and thus the -splitting directory never gets deleted. It seems critical to fix in 0.98.6, [~andrew.purt...@gmail.com] FYI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11868) Data loss in hlog when the hdfs is unavailable
[ https://issues.apache.org/jira/browse/HBASE-11868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119446#comment-14119446 ] Hudson commented on HBASE-11868: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #465 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/465/]) Revert HBASE-11868 Data loss in hlog when the hdfs is unavailable (Liu Shaohui) (apurtell: rev ee32706c5d93fb3de6f4aba09174d34ca3879f6d) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java Data loss in hlog when the hdfs is unavailable -- Key: HBASE-11868 URL: https://issues.apache.org/jira/browse/HBASE-11868 Project: HBase Issue Type: Bug Affects Versions: 0.98.5 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Blocker Fix For: 0.98.6 Attachments: HBASE-11868-0.98-v1.diff, HBASE-11868-0.98-v2.diff When using the new thread model in HBase 0.98, we found a bug which may cause data loss when HDFS is unavailable. When writing WAL edits to the hlog in doMiniBatchMutation of HRegion, the hlog first calls appendNoSync to write the edits and then calls sync with the txid. Assume that the txid of the current write is 10, the syncedTillHere in the hlog is 9, and the failedTxid is 0. When HDFS is unavailable, the AsyncWriter or AsyncSyncer will fail to append the edits or sync; they will then update syncedTillHere to 10 and failedTxid to 10. When the hlog calls sync with txid 10, failedTxid is never checked, because txid already equals syncedTillHere and the wait loop is never entered. The client thinks the write succeeded, but the data was only written to the memstore, not the hlog. If the regionserver goes down later, before the memstore is flushed, the data will be lost. 
See: FSHLog.java #1348 {code}
// sync all transactions upto the specified txid
private void syncer(long txid) throws IOException {
  synchronized (this.syncedTillHere) {
    while (this.syncedTillHere.get() < txid) {
      try {
        this.syncedTillHere.wait();
        if (txid <= this.failedTxid.get()) {
          assert asyncIOE != null :
            "current txid is among(under) failed txids, but asyncIOE is null!";
          throw asyncIOE;
        }
      } catch (InterruptedException e) {
        LOG.debug("interrupted while waiting for notification from AsyncNotifier");
      }
    }
  }
}
{code} We can fix this issue by moving the comparison of txid and failedTxid outside the while block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
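The proposed fix can be sketched as a standalone model (hypothetical class with fields mirroring FSHLog's): the failedTxid comparison moves outside the wait loop, so it also runs when syncedTillHere has already reached txid and the loop body never executes.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the fix suggested above: check failedTxid after the wait loop,
// not inside it, so a failed sync is detected even when the loop is skipped.
public class SyncerFixSketch {
    public final AtomicLong syncedTillHere = new AtomicLong(0);
    public final AtomicLong failedTxid = new AtomicLong(0);
    public volatile IOException asyncIOE;

    public void syncer(long txid) throws IOException {
        synchronized (syncedTillHere) {
            while (syncedTillHere.get() < txid) {
                try {
                    syncedTillHere.wait();
                } catch (InterruptedException e) {
                    // keep waiting, as the original code does
                }
            }
        }
        // Moved out of the while loop: runs unconditionally once the
        // background syncer has caught up (or had already caught up).
        if (txid <= failedTxid.get()) {
            throw asyncIOE != null ? asyncIOE
                : new IOException("txid " + txid + " is among the failed txids");
        }
    }
}
```

In the scenario from the description (syncedTillHere and failedTxid both advanced to 10 by a failing AsyncSyncer), syncer(10) now throws instead of returning success.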
[jira] [Commented] (HBASE-11728) Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING
[ https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119447#comment-14119447 ] ramkrishna.s.vasudevan commented on HBASE-11728: Should we do this for some of the previous() cases also, as done in the patch? Maybe that is the reason for the IT to fail. [~bdifn] Did you get an opportunity to use this patch, and did you still see some data loss while scanning? Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING -- Key: HBASE-11728 URL: https://issues.apache.org/jira/browse/HBASE-11728 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.96.1.1, 0.98.4 Environment: ubuntu12 hadoop-2.2.0 Hbase-0.96.1.1 SUN-JDK(1.7.0_06-b24) Reporter: wuchengzhi Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: 29cb562fad564b468ea9d61a2d60e8b0, HBASE-11728.patch, HBASE-11728_1.patch, HBASE-11728_2.patch, HBASE-11728_3.patch, HBASE-11728_4.patch, HFileAnalys.java, TestPrefixTree.java Original Estimate: 72h Remaining Estimate: 72h For the scan case, I prepared some data as below. Table desc (using the prefix-tree encoding): 'prefix_tree_test', {NAME => 'cf_1', DATA_BLOCK_ENCODING => 'PREFIX_TREE', TTL => '15552000'} and I put 5 rows as (RowKey, Qualifier, Value): 'a-b-0-0', 'qf_1', 'c1-value' 'a-b-A-1', 'qf_1', 'c1-value' 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value' 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2' 'a-b-B-2-1402397300-1402416535', 'qf_2', 'c2-value-3' So I tried to scan the row keys between 'a-b-A-1' and 'a-b-A-1:', and I got the correct result. Test 1: Scan scan = new Scan(); scan.setStartRow("a-b-A-1".getBytes()); scan.setStopRow("a-b-A-1:".getBytes()); -- 'a-b-A-1', 'qf_1', 'c1-value' 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value' 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2' Then I tried the next test, adding a column to the scan. Test 2: Scan scan = new Scan(); scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_2")); scan.setStartRow("a-b-A-1".getBytes()); scan.setStopRow("a-b-A-1:".getBytes()); -- expected: 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value' 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2' but actually I got nothing. Then I changed the addColumn to scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_1")); and I got the expected result 'a-b-A-1', 'qf_1', 'c1-value' as well. Then I did more testing and updated the case to make the startRow greater than 'a-b-A-1'. Test 3: Scan scan = new Scan(); scan.setStartRow("a-b-A-1-".getBytes()); scan.setStopRow("a-b-A-1:".getBytes()); -- expected: 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value' 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2' but actually I got nothing again. I then made the start row greater than 'a-b-A-1-1402329600-1402396277': Scan scan = new Scan(); scan.setStartRow("a-b-A-1-140239".getBytes()); scan.setStopRow("a-b-A-1:".getBytes()); and I got the expected row: 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2' So I think it may be a bug in the prefix-tree encoding. It happens after the data is flushed to the storefile; it's OK when the data is in the memstore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-11886: Status: Patch Available (was: Open) The creator of the table should have all permissions on the table - Key: HBASE-11886 URL: https://issues.apache.org/jira/browse/HBASE-11886 Project: HBase Issue Type: Bug Affects Versions: 0.98.3 Reporter: Devaraj Das Priority: Critical Fix For: 1.0.0, 0.98.6 Attachments: 11886-1.txt In our testing of 0.98.4 with security ON, we found that table creator doesn't have RWXCA on the created table. Instead, the user representing the HBase daemon gets all permissions. Due to this the table creator can't write to the table he just created. I am suspecting HBASE-11275 introduced the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-11886: Attachment: 11886-1.txt Thanks for the confirmation [~ram_krish]. The attached patch should fix the issue (although I haven't tested it with security ON). The creator of the table should have all permissions on the table - Key: HBASE-11886 URL: https://issues.apache.org/jira/browse/HBASE-11886 Project: HBase Issue Type: Bug Affects Versions: 0.98.3 Reporter: Devaraj Das Priority: Critical Fix For: 1.0.0, 0.98.6 Attachments: 11886-1.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11869) Support snapshot owner
[ https://issues.apache.org/jira/browse/HBASE-11869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119483#comment-14119483 ] Matteo Bertozzi commented on HBASE-11869: - +1 [~apurtell] do you want this in 98? It changes the ACL logic, but since it is not restricting, it should be compatible. Support snapshot owner -- Key: HBASE-11869 URL: https://issues.apache.org/jira/browse/HBASE-11869 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-11869-trunk-v1.diff, HBASE-11869-trunk-v3.diff -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11339) HBase MOB
[ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119489#comment-14119489 ] Lars Hofhansl commented on HBASE-11339: --- bq. Back in June, JingCheng's response to your comments never got feedback on how you'd manage the small files problem. To be fair, my comment itself addressed that by saying small blobs are stored by *value* in HBase, and only large blobs in HDFS. We can store a lot of >10MB blobs (in the worst case scenario it's 200m x 10mb = 2pb) in HDFS; if that's not enough, we can dial up the threshold. It seems nobody understood what I am suggesting. Depending on use case and data distribution you pick a threshold X. Blobs with a size < X are stored directly in HBase as a column value. Blobs >= X are stored in HDFS with a reference in HBase using the 3-phase approach. bq. there are two HDFS blob + HBase metadata solutions explicitly mentioned in section 4.1.2 (v4 design doc) with pros and cons True, but as I state, the store-small-blobs-by-value-and-only-large-ones-by-reference solution is not mentioned in there. bq. The solution you propose is actually the first described hdfs+hbase approach No, it's not... It says either all blobs go into HBase or all blobs go into HDFS... See above. Small blobs would be stored directly in HBase, not in HDFS. That's key; nobody wants to store 100k or 1mb files directly in HDFS. bq. We have total 3 +1s for that Jira after many rounds of review rework. Can get it committed tomorrow IST unless objections...? We won't get this committed until we finish this discussion. So consider this my -1 until we finish. Going by the comments the use case is only 1-5mb files (definitely less than 64mb), correct? That changes the discussion, but it looks to me that now the use case is limited to a single scenario and carefully constructed (200m x 500k files) so that this change might be useful. I.e. pick a blob size just right, and pick the size distribution of the files just right, and this makes sense. In my approach one can dial the threshold between by-value and by-reference storage up or down as needed. And I did not even realize the need for M/R. I do agree with all of the following: * snapshots are harder * bulk load is harder * backup/restore/replication is harder Yet, all that is possible to do with a client-only solution and could be abstracted there. I'll also admit that our blob storage tool is not finished yet, and that for its use case we don't need replication or backup, as it itself will be the backup solution for another very large data store. Are you guys absolutely... 100%... positive that this cannot be done in any other way and has to be done this way? That we cannot store files up to a certain size as values in HBase and larger files in HDFS? And there is no good threshold value for this? HBase MOB - Key: HBASE-11339 URL: https://issues.apache.org/jira/browse/HBASE-11339 Project: HBase Issue Type: Umbrella Components: regionserver, Scanners Reporter: Jingcheng Du Assignee: Jingcheng Du Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user guide_v2.docx, hbase-11339-in-dev.patch It's quite useful to save medium binary data like images and documents in Apache HBase. Unfortunately, directly saving the binary MOB (medium object) to HBase leads to worse performance due to frequent splits and compactions. In this design, the MOB data are stored in a more efficient way, which keeps high write/read performance and guarantees data consistency in Apache HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
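Lars's by-value vs. by-reference proposal boils down to one tunable cut-off. A minimal sketch, where the class, method names, and 10MB default are illustrative choices, not anything from the design doc:

```java
// Sketch of the threshold scheme: blobs smaller than X are written into HBase
// as ordinary column values; blobs of size >= X go to HDFS, with only a
// reference kept in HBase. X is tuned per use case and data distribution.
class BlobPlacementSketch {
    private final long thresholdBytes;

    BlobPlacementSketch(long thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    String placementFor(long blobSizeBytes) {
        return blobSizeBytes < thresholdBytes ? "hbase-value" : "hdfs-reference";
    }
}
```

Dialing the threshold up pushes more blobs into HBase by value; dialing it down pushes more into HDFS by reference, which is the knob Lars argues makes a client-only solution adaptable.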
[jira] [Reopened] (HBASE-8674) JUnit and Surefire TRUNK-HBASE-2 plugins need a new home
[ https://issues.apache.org/jira/browse/HBASE-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dima Spivak reopened HBASE-8674: JUnit and Surefire TRUNK-HBASE-2 plugins need a new home Key: HBASE-8674 URL: https://issues.apache.org/jira/browse/HBASE-8674 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.98.0, 0.94.8, 0.95.1 Reporter: Andrew Purtell people.apache.org cannot currently host personal or transient Maven repos. {noformat} $ curl --connect-timeout 60 -v http://people.apache.org/~garyh/mvn/org/apache/maven/plugins/maven-remote-resources-plugin/1.4/maven-remote-resources-plugin-1.4.pom * About to connect() to people.apache.org port 80 (#0) * Trying 140.211.11.9... * Connection timed out after 60064 milliseconds * Closing connection 0 curl: (28) Connection timed out after 60064 milliseconds {noformat} All builds are at the moment broken if the HBase custom junit or surefire jars are not already in cache. Even if this is a temporary condition, we should find a new home for these artifacts, upgrade to versions that include our submitted changes (if any), or fall back to release versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-8674) JUnit and Surefire TRUNK-HBASE-2 plugins need a new home
[ https://issues.apache.org/jira/browse/HBASE-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119497#comment-14119497 ] Dima Spivak commented on HBASE-8674: It might be worth revisiting this. My builds tonight were slowed to a crawl after I had to rebuild my local Maven repository; I can only assume that Gary's personal repo was sluggish, because simply removing references to it in the root POM got things speedy again. JUnit and Surefire TRUNK-HBASE-2 plugins need a new home Key: HBASE-8674 URL: https://issues.apache.org/jira/browse/HBASE-8674 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.98.0, 0.94.8, 0.95.1 Reporter: Andrew Purtell people.apache.org cannot currently host personal or transient Maven repos. {noformat} $ curl --connect-timeout 60 -v http://people.apache.org/~garyh/mvn/org/apache/maven/plugins/maven-remote-resources-plugin/1.4/maven-remote-resources-plugin-1.4.pom * About to connect() to people.apache.org port 80 (#0) * Trying 140.211.11.9... * Connection timed out after 60064 milliseconds * Closing connection 0 curl: (28) Connection timed out after 60064 milliseconds {noformat} All builds are at the moment broken if the HBase custom junit or surefire jars are not already in cache. Even if this is a temporary condition, we should find a new home for these artifacts, upgrade to versions that include our submitted changes (if any), or fall back to release versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11885) Provide a Dockerfile to easily build and run HBase from source
[ https://issues.apache.org/jira/browse/HBASE-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119501#comment-14119501 ] Dima Spivak commented on HBASE-11885: - I have a working Dockerfile that sets up the necessary HBase dependencies (i.e. Maven and Java), clones the repo, runs {{mvn assembly:single}}, and then starts HBase and the HBase shell. The main problem I've run into pertains to the Maven step due to [HBASE-8674|https://issues.apache.org/jira/browse/HBASE-8674]. Can someone with knowledge of why the POMs are in the state that they are in take a look there and chime in on whether there is any consequence to simply removing Gary H's repo as a dependency? Provide a Dockerfile to easily build and run HBase from source -- Key: HBASE-11885 URL: https://issues.apache.org/jira/browse/HBASE-11885 Project: HBase Issue Type: New Feature Reporter: Dima Spivak Assignee: Dima Spivak [A recent email to dev@|http://mail-archives.apache.org/mod_mbox/hbase-dev/201408.mbox/%3CCAAef%2BM4q%3Da8Dqxe_EHSFTueY%2BXxz%2BtTe%2BJKsWWbXjhB_Pz7oSA%40mail.gmail.com%3E] highlighted the difficulty that new users can face in getting HBase compiled from source and running locally. I'd like to provide a Dockerfile that would allow anyone with Docker running on a machine with a reasonably current Linux kernel to do so with ease. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11868) Data loss in hlog when the hdfs is unavailable
[ https://issues.apache.org/jira/browse/HBASE-11868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119502#comment-14119502 ] Hudson commented on HBASE-11868: FAILURE: Integrated in HBase-0.98 #493 (See [https://builds.apache.org/job/HBase-0.98/493/]) HBASE-11868 Data loss in hlog when the hdfs is unavailable (Liu Shaohui) (apurtell: rev 39771b8f73a6e6eae12e8b3bdb7dd1fe13edc83c) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java Data loss in hlog when the hdfs is unavailable -- Key: HBASE-11868 URL: https://issues.apache.org/jira/browse/HBASE-11868 Project: HBase Issue Type: Bug Affects Versions: 0.98.5 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Blocker Fix For: 0.98.6 Attachments: HBASE-11868-0.98-v1.diff, HBASE-11868-0.98-v2.diff When using the new thread model in HBase 0.98, we found a bug which may cause data loss when the hdfs is unavailable. When writing WAL edits to the hlog in doMiniBatchMutation of HRegion, the hlog first calls appendNoSync to write the edits and then calls sync with a txid. Assume that the txid of the current write is 10, the syncedTillHere in the hlog is 9, and the failedTxid is 0. When the hdfs is unavailable, the AsyncWriter or AsyncSyncer will fail to append the edits or sync, and they will then update the syncedTillHere to 10 and the failedTxid to 10. When the hlog calls sync with txid 10, the failedTxid will never be checked because txid equals syncedTillHere. The client thinks the write succeeded, but the data was written only to the memstore, not the hlog. If the regionserver goes down later, before the memstore is flushed, the data will be lost.
See: FSHLog.java #1348
{code}
// sync all transactions upto the specified txid
private void syncer(long txid) throws IOException {
  synchronized (this.syncedTillHere) {
    while (this.syncedTillHere.get() < txid) {
      try {
        this.syncedTillHere.wait();
        if (txid <= this.failedTxid.get()) {
          assert asyncIOE != null :
            "current txid is among(under) failed txids, but asyncIOE is null!";
          throw asyncIOE;
        }
      } catch (InterruptedException e) {
        LOG.debug("interrupted while waiting for notification from AsyncNotifier");
      }
    }
  }
}
{code}
We can fix this issue by moving the comparison of txid and failedTxid outside the while block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
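The proposed fix (checking failedTxid after the wait loop) can be sketched as below. This is a minimal model of the idea, with simplified stand-ins for FSHLog's fields, not the committed patch:

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

// Models the proposed fix: the failedTxid check runs AFTER the wait loop, so a
// failure surfaces even when syncedTillHere has already reached txid and the
// loop body (where the buggy version did the check) never executes.
class SyncerFixSketch {
    final AtomicLong syncedTillHere = new AtomicLong(0);
    final AtomicLong failedTxid = new AtomicLong(0);
    volatile IOException asyncIOE;

    void syncer(long txid) throws IOException {
        synchronized (this.syncedTillHere) {
            while (this.syncedTillHere.get() < txid) {
                try {
                    this.syncedTillHere.wait();
                } catch (InterruptedException e) {
                    // re-check the loop condition after an interrupt
                }
            }
        }
        // Moved out of the while block: with syncedTillHere == txid the loop is
        // skipped entirely, but a recorded failure must still throw to the caller.
        if (txid <= this.failedTxid.get()) {
            throw asyncIOE != null ? asyncIOE
                    : new IOException("txid " + txid + " is among failed txids");
        }
    }
}
```

In the bug scenario (syncedTillHere and failedTxid both advanced to 10 by the failing writer/syncer threads), a call to syncer(10) now throws instead of returning success to the client.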
[jira] [Created] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value
stack created HBASE-11887: - Summary: Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value Key: HBASE-11887 URL: https://issues.apache.org/jira/browse/HBASE-11887 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.99.0 Reporter: stack Assignee: stack Trying to test branch-1, I run out of mem pretty fast. Looking at dumps, I see too many instances of LiteralByteString. Seem to be 'qualifiers' and 'values' out of pb QualifierValue... and on up to the multi call into the server. Am having trouble finding how the retention is being done... Filing issue in meantime while work on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119513#comment-14119513 ] Mikhail Antonov commented on HBASE-11165: - A side question to folks who recently benchmarked it on big clusters.. what's the avg ratio of hdfs inodes per region you observed? Trying to estimate the load the proposed 1M or 50M regions setup puts on the NN. Scaling so cluster can host 1M regions and beyond (50M regions?) Key: HBASE-11165 URL: https://issues.apache.org/jira/browse/HBASE-11165 Project: HBase Issue Type: Brainstorming Reporter: stack Attachments: HBASE-11165.zip, Region Scalability test.pdf, zk_less_assignment_comparison_2.pdf This discussion issue comes out of Co-locate Meta And Master HBASE-10569 and comments on the doc posted there. A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M regions maybe even 50M later. This issue is about discussing how we will do that (or if not 50M on a cluster, how otherwise we can attain same end). More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119533#comment-14119533 ] Hadoop QA commented on HBASE-11886: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666169/11886-1.txt against trunk revision . ATTACHMENT ID: 12666169 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestRegionRebalancing Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10693//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10693//console This message is automatically generated. The creator of the table should have all permissions on the table - Key: HBASE-11886 URL: https://issues.apache.org/jira/browse/HBASE-11886 Project: HBase Issue Type: Bug Affects Versions: 0.98.3 Reporter: Devaraj Das Priority: Critical Fix For: 1.0.0, 0.98.6 Attachments: 11886-1.txt In our testing of 0.98.4 with security ON, we found that table creator doesn't have RWXCA on the created table. 
Instead, the user representing the HBase daemon gets all permissions. Due to this the table creator can't write to the table he just created. I am suspecting HBASE-11275 introduced the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11868) Data loss in hlog when the hdfs is unavailable
[ https://issues.apache.org/jira/browse/HBASE-11868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119664#comment-14119664 ] Hudson commented on HBASE-11868: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #466 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/466/]) HBASE-11868 Data loss in hlog when the hdfs is unavailable (Liu Shaohui) (apurtell: rev 39771b8f73a6e6eae12e8b3bdb7dd1fe13edc83c) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java Data loss in hlog when the hdfs is unavailable -- Key: HBASE-11868 URL: https://issues.apache.org/jira/browse/HBASE-11868 Project: HBase Issue Type: Bug Affects Versions: 0.98.5 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Blocker Fix For: 0.98.6 Attachments: HBASE-11868-0.98-v1.diff, HBASE-11868-0.98-v2.diff When using the new thread model in HBase 0.98, we found a bug which may cause data loss when the hdfs is unavailable. When writing WAL edits to the hlog in doMiniBatchMutation of HRegion, the hlog first calls appendNoSync to write the edits and then calls sync with a txid. Assume that the txid of the current write is 10, the syncedTillHere in the hlog is 9, and the failedTxid is 0. When the hdfs is unavailable, the AsyncWriter or AsyncSyncer will fail to append the edits or sync, and they will then update the syncedTillHere to 10 and the failedTxid to 10. When the hlog calls sync with txid 10, the failedTxid will never be checked because txid equals syncedTillHere. The client thinks the write succeeded, but the data was written only to the memstore, not the hlog. If the regionserver goes down later, before the memstore is flushed, the data will be lost.
See: FSHLog.java #1348
{code}
// sync all transactions upto the specified txid
private void syncer(long txid) throws IOException {
  synchronized (this.syncedTillHere) {
    while (this.syncedTillHere.get() < txid) {
      try {
        this.syncedTillHere.wait();
        if (txid <= this.failedTxid.get()) {
          assert asyncIOE != null :
            "current txid is among(under) failed txids, but asyncIOE is null!";
          throw asyncIOE;
        }
      } catch (InterruptedException e) {
        LOG.debug("interrupted while waiting for notification from AsyncNotifier");
      }
    }
  }
}
{code}
We can fix this issue by moving the comparison of txid and failedTxid outside the while block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11877) Make TableSplit more readable
[ https://issues.apache.org/jira/browse/HBASE-11877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119669#comment-14119669 ] Hadoop QA commented on HBASE-11877: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666163/HBASE-11877-trunk-v2.diff against trunk revision . ATTACHMENT ID: 12666163 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.client.TestMultiParallel org.apache.hadoop.hbase.TestRegionRebalancing Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10692//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10692//console This message is automatically generated. 
Make TableSplit more readable - Key: HBASE-11877 URL: https://issues.apache.org/jira/browse/HBASE-11877 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Attachments: HBASE-11877-trunk-v1.diff, HBASE-11877-trunk-v2.diff When debugging MR jobs reading from an hbase table, it's important to figure out which region a map task is reading from. But the table split object is hard to read, e.g.:
{code}
2014-09-01 20:58:39,783 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: lg-hadoop-prc-st40.bj:,0
{code}
See: TableSplit.java
{code}
@Override
public String toString() {
  return m_regionLocation + ":" +
    Bytes.toStringBinary(m_startRow) + "," + Bytes.toStringBinary(m_endRow);
}
{code}
We should make it more readable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
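A more self-describing toString in the spirit of this request might look like the sketch below; the class and the exact output format are illustrative, not what the attached patches implement:

```java
// Labels each field so the "Processing split: ..." log line is readable on its
// own; regionLocation/startRow/endRow stand in for TableSplit's
// m_regionLocation/m_startRow/m_endRow fields (rows pre-encoded as strings here).
class TableSplitSketch {
    private final String regionLocation;
    private final String startRow;
    private final String endRow;

    TableSplitSketch(String regionLocation, String startRow, String endRow) {
        this.regionLocation = regionLocation;
        this.startRow = startRow;
        this.endRow = endRow;
    }

    @Override
    public String toString() {
        return "TableSplit(regionLocation=" + regionLocation
                + ", startRow=" + startRow + ", endRow=" + endRow + ")";
    }
}
```

Even with empty start/end rows the labelled form tells the reader which field is which, unlike the bare "host:," output quoted in the report.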
[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119705#comment-14119705 ] Ted Yu commented on HBASE-11886: +1 The creator of the table should have all permissions on the table - Key: HBASE-11886 URL: https://issues.apache.org/jira/browse/HBASE-11886 Project: HBase Issue Type: Bug Affects Versions: 0.98.3 Reporter: Devaraj Das Priority: Critical Fix For: 1.0.0, 0.98.6 Attachments: 11886-1.txt In our testing of 0.98.4 with security ON, we found that table creator doesn't have RWXCA on the created table. Instead, the user representing the HBase daemon gets all permissions. Due to this the table creator can't write to the table he just created. I am suspecting HBASE-11275 introduced the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11339) HBase MOB
[ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119741#comment-14119741 ] Jingcheng Du commented on HBASE-11339: -- Thanks Lars for the comments. [~lhofhansl] bq. Going by the comments the use case is only 1-5mb files (definitely less than 64mb), correct? That changes the discussion, but it looks to me that now the use case is limited to a single scenario and carefully constructed (200m x 500k files) so that this change might be useful. I.e. pick a blob size just right, and pick the size distribution of the files just right and this makes sense. The client solution could work well in certain cases with bigger blobs, and we could try leveraging the current MOB design for smaller KV values. In some usage scenarios the value size is almost fixed, for example pictures taken by the traffic bureau's cameras, contracts between banks and customers, CT (Computed Tomography) records in hospitals, etc. This might be limited, but it's really useful. As mentioned, the client solution saves records larger than 10MB to HDFS and saves the others to HBase directly. Turning the threshold down would lead to insufficient use of HDFS in the client solution; those records would instead be saved directly in HBase in that case. And even with value sizes less than 10MB, the MOB implementation shows big performance improvements over directly saving those records in HBase. The MOB feature has a threshold as well; a MOB can be saved as either a value or a reference according to this threshold. We have a default value of 100KB for it now. Users can change it, and we also have a compactor to handle it (move the MOB file to HBase, and vice versa). As Jon said, we'll revamp the MOB compaction and get rid of the MR dependency. bq. Yet, all that is possible to do with a client only solution and could be abstracted there.
Implementing the snapshot and replication features in a client solution is harder and brings complexity to the client solution as well. Keeping consistency between HBase and HDFS files during replication is a problem. Implementing this on the server side is a little easier: the MOB work includes an implementation of snapshots, and it supports replication naturally because the MOB data are saved in the WAL. bq. (Subjectively) I do not like the complexity of this as seen by the various discussions here. That part is just my $0.02 of course. Yes, it's complex, but the features are meaningful and valuable. The patches provide read/write, compactions, snapshot, and sweep for MOB files. Even if HBase decides to implement a streaming feature in the future, the read, compaction, and snapshot parts would probably still be useful. Thanks! HBase MOB - Key: HBASE-11339 URL: https://issues.apache.org/jira/browse/HBASE-11339 Project: HBase Issue Type: Umbrella Components: regionserver, Scanners Reporter: Jingcheng Du Assignee: Jingcheng Du Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user guide_v2.docx, hbase-11339-in-dev.patch It's quite useful to save medium binary data like images and documents in Apache HBase. Unfortunately, directly saving the binary MOB (medium object) to HBase leads to worse performance due to frequent splits and compactions. In this design, the MOB data are stored in a more efficient way, which keeps high write/read performance and guarantees data consistency in Apache HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119832#comment-14119832 ] Andrey Stepachev commented on HBASE-11165: -- Just thinking, did anyone try to measure how meta uses memory, and to reduce that memory usage? It is interesting why the NN is able to handle much more data in memory while the HMaster can't. Scaling so cluster can host 1M regions and beyond (50M regions?) Key: HBASE-11165 URL: https://issues.apache.org/jira/browse/HBASE-11165 Project: HBase Issue Type: Brainstorming Reporter: stack Attachments: HBASE-11165.zip, Region Scalability test.pdf, zk_less_assignment_comparison_2.pdf This discussion issue comes out of Co-locate Meta And Master HBASE-10569 and comments on the doc posted there. A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M regions maybe even 50M later. This issue is about discussing how we will do that (or if not 50M on a cluster, how otherwise we can attain same end). More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119868#comment-14119868 ] Andrew Purtell commented on HBASE-11886: Thanks for finding this. +1 The creator of the table should have all permissions on the table - Key: HBASE-11886 URL: https://issues.apache.org/jira/browse/HBASE-11886 Project: HBase Issue Type: Bug Affects Versions: 0.98.3 Reporter: Devaraj Das Priority: Critical Fix For: 1.0.0, 0.98.6 Attachments: 11886-1.txt In our testing of 0.98.4 with security ON, we found that table creator doesn't have RWXCA on the created table. Instead, the user representing the HBase daemon gets all permissions. Due to this the table creator can't write to the table he just created. I am suspecting HBASE-11275 introduced the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11339) HBase MOB
[ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119894#comment-14119894 ] Andrew Purtell commented on HBASE-11339: bq. As Jon said, we'll revamp the mob compaction and get rid of the MR dependency. Please. I don't think we should ever ship a release with a dependency on MR for core function. Committing this to trunk in stages could be ok, as long as we do not attempt a release including the feature before MOB compaction is handled natively. HBase MOB - Key: HBASE-11339 URL: https://issues.apache.org/jira/browse/HBASE-11339 Project: HBase Issue Type: Umbrella Components: regionserver, Scanners Reporter: Jingcheng Du Assignee: Jingcheng Du Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user guide_v2.docx, hbase-11339-in-dev.patch It's quite useful to save medium binary data like images and documents in Apache HBase. Unfortunately, directly saving the binary MOB (medium object) to HBase leads to worse performance due to frequent splits and compactions. In this design, the MOB data are stored in a more efficient way, which keeps high write/read performance and guarantees data consistency in Apache HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119871#comment-14119871 ] Andrew Purtell commented on HBASE-11886: Any chance for a short unit test that confirms the fix? Then we won't regress here again. Thanks. The creator of the table should have all permissions on the table - Key: HBASE-11886 URL: https://issues.apache.org/jira/browse/HBASE-11886 Project: HBase Issue Type: Bug Affects Versions: 0.98.3 Reporter: Devaraj Das Priority: Critical Fix For: 1.0.0, 0.98.6 Attachments: 11886-1.txt In our testing of 0.98.4 with security ON, we found that table creator doesn't have RWXCA on the created table. Instead, the user representing the HBase daemon gets all permissions. Due to this the table creator can't write to the table he just created. I am suspecting HBASE-11275 introduced the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11878) TestVisibilityLabelsWithDistributedLogReplay#testAddVisibilityLabelsOnRSRestart sometimes fails due to VisibilityController not yet initialized
[ https://issues.apache.org/jira/browse/HBASE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119889#comment-14119889 ] Jean-Marc Spaggiari commented on HBASE-11878: - FYI, I re-ran that on 4 servers using the 0.98RC1 src package. This specific test passed on 3 servers and failed with a timeout on one of them: {code} Tests in error: testAddVisibilityLabelsOnRSRestart(org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDistributedLogReplay): test timed out after 6 milliseconds {code} TestVisibilityLabelsWithDistributedLogReplay#testAddVisibilityLabelsOnRSRestart sometimes fails due to VisibilityController not yet initialized --- Key: HBASE-11878 URL: https://issues.apache.org/jira/browse/HBASE-11878 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: 11878-v1.txt, 11878-v2.txt, 11878-v2.txt, 11878-v3.txt, 11878-v4.txt, 11878-v5.txt In the thread w.r.t. the first RC of 0.98.6, http://search-hadoop.com/m/DHED4p2rw81 , Jean-Marc reported that TestVisibilityLabelsWithDistributedLogReplay#testAddVisibilityLabelsOnRSRestart sometimes failed on his machines. 
From http://server.distparser.com:81/hbase/with_teds_patch2/hbasetest1/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDistributedLogReplay-output.txt : {code} result { exception { name: org.apache.hadoop.hbase.coprocessor.CoprocessorException value: org.apache.hadoop.hbase.coprocessor.CoprocessorException: VisibilityController not yet initialized
    at org.apache.hadoop.hbase.security.visibility.VisibilityController.addLabels(VisibilityController.java:638)
    at org.apache.hadoop.hbase.protobuf.generated.VisibilityLabelsProtos$VisibilityLabelsService$1.addLabels(VisibilityLabelsProtos.java:5014)
    at org.apache.hadoop.hbase.protobuf.generated.VisibilityLabelsProtos$VisibilityLabelsService.callMethod(VisibilityLabelsProtos.java:5178)
    at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:5794)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1608)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1590)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:30088)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2014)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
    at java.lang.Thread.run(Thread.java:744) } } {code} The above exception revealed a race condition: the writing of labels ABC and XYZ took place while the VisibilityController was not yet initialized. The test writes the labels only once, leading to the assertion failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
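One way to sidestep such an initialization race in a test is to retry the write until the service reports success, rather than writing exactly once. The sketch below is illustrative Java and not from the actual patch; the class and method names are made up:

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical retry helper: keep attempting an operation (e.g. addLabels
// against a coprocessor that may not be initialized yet) until it succeeds
// or the attempt budget is exhausted.
public class RetryUntilReady {
    public static boolean retry(BooleanSupplier op, int attempts, long sleepMs) {
        for (int i = 0; i < attempts; i++) {
            if (op.getAsBoolean()) {
                return true;                      // operation succeeded
            }
            try {
                TimeUnit.MILLISECONDS.sleep(sleepMs);  // back off, then retry
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;                             // still failing after all attempts
    }

    public static void main(String[] args) {
        // Simulate a service that only becomes ready on the third probe.
        int[] calls = {0};
        boolean ok = retry(() -> ++calls[0] >= 3, 5, 1);
        System.out.println(ok + " after " + calls[0] + " attempts"); // true after 3 attempts
    }
}
```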
[jira] [Reopened] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics
[ https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell reopened HBASE-11876: The trunk change passed precommit but the 0.98 version requires an addendum to fix a failing test, see https://builds.apache.org/job/HBase-0.98/493/testReport/org.apache.hadoop.hbase.regionserver/TestRegionServerMetrics/testScanNext/ RegionScanner.nextRaw(...) should not update metrics Key: HBASE-11876 URL: https://issues.apache.org/jira/browse/HBASE-11876 Project: HBase Issue Type: Bug Affects Versions: 0.98.6 Reporter: Lars Hofhansl Assignee: Andrew Purtell Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11876-0.98.patch, HBASE-11876.patch, HBASE-11876.patch I added RegionScanner.nextRaw(...) to allow smart clients to avoid some of the default work that HBase is doing, such as {start|stop}RegionOperation and synchronized(scanner) for each row. Metrics should follow the same approach. Collecting them per row is expensive and a caller should have the option to collect those later or to avoid collecting them completely. We can also save some cycles in RSRpcServices.scan(...) if we updated the metric only once per batch instead of once per row. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
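The once-per-batch idea from the description can be sketched as follows. This is illustrative Java, not the actual patch; the class and field names are made up:

```java
import java.util.concurrent.atomic.LongAdder;

// Illustrative sketch of "update metrics once per batch": instead of
// touching a shared counter for every row returned by nextRaw(), the
// caller accumulates locally and publishes a single delta per batch.
public class BatchedScanMetrics {
    static final LongAdder rowsScanned = new LongAdder(); // shared metric

    static void scanBatch(int batchSize) {
        int local = 0;
        for (int i = 0; i < batchSize; i++) {
            local++;             // per-row work touches no shared state
        }
        rowsScanned.add(local);  // one shared-metric update per batch
    }

    public static void main(String[] args) {
        scanBatch(100);
        scanBatch(50);
        System.out.println(rowsScanned.sum()); // 150
    }
}
```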
[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11886: --- Assignee: Devaraj Das The creator of the table should have all permissions on the table - Key: HBASE-11886 URL: https://issues.apache.org/jira/browse/HBASE-11886 Project: HBase Issue Type: Bug Affects Versions: 0.98.3 Reporter: Devaraj Das Assignee: Devaraj Das Priority: Critical Fix For: 1.0.0, 0.98.6 Attachments: 11886-1.txt In our testing of 0.98.4 with security ON, we found that table creator doesn't have RWXCA on the created table. Instead, the user representing the HBase daemon gets all permissions. Due to this the table creator can't write to the table he just created. I am suspecting HBASE-11275 introduced the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11885) Provide a Dockerfile to easily build and run HBase from source
[ https://issues.apache.org/jira/browse/HBASE-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119928#comment-14119928 ] Andrew Purtell commented on HBASE-11885: [~nkeywal] Provide a Dockerfile to easily build and run HBase from source -- Key: HBASE-11885 URL: https://issues.apache.org/jira/browse/HBASE-11885 Project: HBase Issue Type: New Feature Reporter: Dima Spivak Assignee: Dima Spivak [A recent email to dev@|http://mail-archives.apache.org/mod_mbox/hbase-dev/201408.mbox/%3CCAAef%2BM4q%3Da8Dqxe_EHSFTueY%2BXxz%2BtTe%2BJKsWWbXjhB_Pz7oSA%40mail.gmail.com%3E] highlighted the difficulty that new users can face in getting HBase compiled from source and running locally. I'd like to provide a Dockerfile that would allow anyone with Docker running on a machine with a reasonably current Linux kernel to do so with ease. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HBASE-11760) Tighten up region state transition
[ https://issues.apache.org/jira/browse/HBASE-11760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14112342#comment-14112342 ] Jimmy Xiang edited comment on HBASE-11760 at 9/3/14 3:28 PM: - Patch v2.1 is on RB: https://reviews.apache.org/r/25299/ was (Author: jxiang): Patch v2 is on RB: https://reviews.apache.org/r/25099/ Tighten up region state transition -- Key: HBASE-11760 URL: https://issues.apache.org/jira/browse/HBASE-11760 Project: HBase Issue Type: Improvement Components: Region Assignment Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 2.0.0 Attachments: hbase-11760.patch, hbase-11760_2.patch When a regionserver reports to master a region transition, we should check the current region state to be exactly what we expect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11760) Tighten up region state transition
[ https://issues.apache.org/jira/browse/HBASE-11760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-11760: Attachment: hbase-11760_2.1.patch Attached v2.1 that's rebased to master latest. RB: https://reviews.apache.org/r/25299/ Tighten up region state transition -- Key: HBASE-11760 URL: https://issues.apache.org/jira/browse/HBASE-11760 Project: HBase Issue Type: Improvement Components: Region Assignment Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 2.0.0 Attachments: hbase-11760.patch, hbase-11760_2.1.patch, hbase-11760_2.patch When a regionserver reports to master a region transition, we should check the current region state to be exactly what we expect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics
[ https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119973#comment-14119973 ] Andrew Purtell commented on HBASE-11876: There are two code paths in 0.98 which call nextRaw from scanning code. One isn't found by Eclipse reference search but grep did the trick. Will revert the previous commit and push the fixed version as soon as local tests check out. RegionScanner.nextRaw(...) should not update metrics Key: HBASE-11876 URL: https://issues.apache.org/jira/browse/HBASE-11876 Project: HBase Issue Type: Bug Affects Versions: 0.98.6 Reporter: Lars Hofhansl Assignee: Andrew Purtell Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11876-0.98.patch, HBASE-11876.patch, HBASE-11876.patch I added RegionScanner.nextRaw(...) to allow smart clients to avoid some of the default work that HBase is doing, such as {start|stop}RegionOperation and synchronized(scanner) for each row. Metrics should follow the same approach. Collecting them per row is expensive and a caller should have the option to collect those later or to avoid collecting them completely. We can also save some cycles in RSRpcServices.scan(...) if we updated the metric only once per batch instead of once per row. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics
[ https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11876: --- Attachment: HBASE-11876-0.98.patch RegionScanner.nextRaw(...) should not update metrics Key: HBASE-11876 URL: https://issues.apache.org/jira/browse/HBASE-11876 Project: HBase Issue Type: Bug Affects Versions: 0.98.6 Reporter: Lars Hofhansl Assignee: Andrew Purtell Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11876-0.98.patch, HBASE-11876-0.98.patch, HBASE-11876.patch, HBASE-11876.patch I added RegionScanner.nextRaw(...) to allow smart clients to avoid some of the default work that HBase is doing, such as {start|stop}RegionOperation and synchronized(scanner) for each row. Metrics should follow the same approach. Collecting them per row is expensive and a caller should have the option to collect those later or to avoid collecting them completely. We can also save some cycles in RSRpcServices.scan(...) if we updated the metric only once per batch instead of once per row. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11805) KeyValue to Cell Convert in WALEdit APIs
[ https://issues.apache.org/jira/browse/HBASE-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119994#comment-14119994 ] Anoop Sam John commented on HBASE-11805: Thanks Stack. {quote} {code} Do we have to do the below? - for (KeyValue kv : value.getKeyValues()) { + for (Cell cell : value.getCells()) { + KeyValue kv = KeyValueUtil.ensureKeyValue(cell); {code} {quote} This is in WALPlayer#HLogKeyValueMapper. The mapper o/p value is KeyValue. This also we can change to Cell. But I thought of doing all these as part of HBASE-11871. One sub task to remove ensureKeyValue from MR tools. Sounds ok? {quote} This one is a bit odd Anoop... {code} -for (KeyValue kv: kvs) { -size += kv.getLength(); + for (Cell cell: cells) { + size += KeyValueUtil.length(cell); } {code} Using a KeyValueUtils on a cell? {quote} We have CellUtil#estimatedSizeOf() but that is not the KV length. That is length + SIZEOF_INT, the bytes a Cell takes when serialized to the Encoder. The KVUtil method was already existing and is used by Prefix Tree also. Maybe we will add length() in CellUtil, the same way as in KVUtil, and make the code use that. KeyValue to Cell Convert in WALEdit APIs Key: HBASE-11805 URL: https://issues.apache.org/jira/browse/HBASE-11805 Project: HBase Issue Type: Improvement Components: wal Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 0.99.0, 2.0.0, 0.98.7 Attachments: HBASE-11805.patch In almost all other main interface classes/APIs we have changed KeyValue to Cell. But it is missing in WALEdit. This is marked public for Replication (well, it should be for CP also). These 2 APIs deal with KVs: add(KeyValue kv) ArrayList<KeyValue> getKeyValues() Suggest deprecate them and add for 0.98: add(Cell kv) List<Cell> getCells() And just replace from 1.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
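The size distinction discussed above (KeyValue length vs. CellUtil#estimatedSizeOf, which adds SIZEOF_INT) can be sketched with stand-in arithmetic. The methods below are illustrative stand-ins, not the real HBase utilities, assuming the KeyValue layout of two int length prefixes followed by the key and value bytes:

```java
// Illustrative model of the two sizes: a KeyValue's serialized length
// (key-length int + value-length int + key bytes + value bytes) versus the
// "estimated size", which adds one more int for the length prefix the
// encoder writes in front of the cell.
public class CellSizes {
    static final int SIZEOF_INT = 4;

    // stand-in for KeyValueUtil.length(cell)
    static int length(int keyLen, int valueLen) {
        return 2 * SIZEOF_INT + keyLen + valueLen;
    }

    // stand-in for CellUtil.estimatedSizeOf(cell): length plus one int
    static int estimatedSizeOf(int keyLen, int valueLen) {
        return length(keyLen, valueLen) + SIZEOF_INT;
    }

    public static void main(String[] args) {
        // 20-byte key, 100-byte value: 128 serialized, 132 estimated
        System.out.println(length(20, 100) + " vs " + estimatedSizeOf(20, 100));
    }
}
```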
[jira] [Created] (HBASE-11888) Add per region flush count
Elliott Clark created HBASE-11888: - Summary: Add per region flush count Key: HBASE-11888 URL: https://issues.apache.org/jira/browse/HBASE-11888 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Elliott Clark Debugging a workload that ran overnight it's hard to tell if a region flushed a lot and got compacted, or flushed fewer times and then got compacted. We should have a counter on that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
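A per-region flush counter could be as simple as a map of counters keyed by the region name. The sketch below is hypothetical (none of the names are HBase API), just to show the shape of the metric:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical per-region flush counter: each completed flush increments a
// counter keyed by encoded region name, so post-hoc debugging can
// distinguish many small flushes from a few large ones.
public class RegionFlushCounts {
    static final Map<String, LongAdder> flushes = new ConcurrentHashMap<>();

    static void onFlushCompleted(String encodedRegionName) {
        flushes.computeIfAbsent(encodedRegionName, k -> new LongAdder()).increment();
    }

    static long flushCount(String encodedRegionName) {
        LongAdder a = flushes.get(encodedRegionName);
        return a == null ? 0 : a.sum();
    }

    public static void main(String[] args) {
        onFlushCompleted("region-a");
        onFlushCompleted("region-a");
        onFlushCompleted("region-b");
        System.out.println(flushCount("region-a") + "," + flushCount("region-b")); // 2,1
    }
}
```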
[jira] [Commented] (HBASE-11882) Row level consistency may not be maintained with bulk load and compaction
[ https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120039#comment-14120039 ] Jerry He commented on HBASE-11882: -- There is still no Hadoop QA run. Here is my local 'mvn test' run result with patch v2: {code} [INFO] Reactor Summary: [INFO] [INFO] HBase . SUCCESS [ 1.793 s] [INFO] HBase - Common SUCCESS [ 50.103 s] [INFO] HBase - Protocol .. SUCCESS [ 0.073 s] [INFO] HBase - Client SUCCESS [01:02 min] [INFO] HBase - Hadoop Compatibility .. SUCCESS [ 8.157 s] [INFO] HBase - Hadoop Two Compatibility .. SUCCESS [ 5.973 s] [INFO] HBase - Prefix Tree ... SUCCESS [ 8.950 s] [INFO] HBase - Server SUCCESS [57:32 min] [INFO] HBase - Testing Util .. SUCCESS [ 1.101 s] [INFO] HBase - Thrift SUCCESS [02:08 min] [INFO] HBase - Shell . SUCCESS [ 1.586 s] [INFO] HBase - Integration Tests . SUCCESS [ 0.516 s] [INFO] HBase - Examples .. SUCCESS [ 1.916 s] [INFO] HBase - Assembly .. SUCCESS [ 1.045 s] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 01:02 h [INFO] Finished at: 2014-09-03T00:15:35-08:00 [INFO] Final Memory: 46M/273M {code} Row level consistency may not be maintained with bulk load and compaction - Key: HBASE-11882 URL: https://issues.apache.org/jira/browse/HBASE-11882 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.99.0, 2.0.0 Reporter: Jerry He Assignee: Jerry He Priority: Critical Fix For: 0.99.0, 2.0.0 Attachments: HBASE-11882-master-v1.patch, HBASE-11882-master-v2.patch, TestHRegionServerBulkLoad.java.patch While looking into the TestHRegionServerBulkLoad failure for HBASE-11772, I found the root cause is that row level atomicity may not be maintained with bulk load together with compaction. TestHRegionServerBulkLoad is used to test bulk load atomicity. The test uses multiple threads to do bulk load and scan continuously and do compactions periodically. It verifies row level data is always consistent across column families. 
After HBASE-11591, we added readpoint checks for bulkloaded data using the seqId at the time of bulk load. Now a scanner will not see the data from a bulk load if the scanner's readpoint is earlier than the bulk load seqId. Previously, the atomic bulk load result was visible immediately to all scanners. The problem is with compaction after bulk load. Compaction does not lock the region and it is done one store (column family) at a time. It also compacts away the seqId marker of the bulk load. Here is an event sequence where the row level consistency is broken. 1. A scanner is started to scan a region with cf1 and cf2. The readpoint is 10. 2. There is a bulk load that loads into cf1 and cf2. The bulk load seqId is 11. Bulk load is guarded by the region write lock, so it is atomic. 3. There is a compaction that compacts cf1. It compacts away the seqId marker of the bulk load. 4. The scanner tries to advance to row-1001. It gets the bulk load data for cf1 since there is no seqId preventing it. It does not get the bulk load data for cf2 since the scanner's readpoint (10) is less than the bulk load seqId (11). Now the row level consistency is broken in this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
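The readpoint check and the broken event sequence above can be modeled in a few lines. This is an illustrative sketch, not actual HBase code: a bulk-loaded cell carrying seqId s is visible to a scanner with readpoint r only when s <= r, and a compaction that drops the seqId marker makes the data look old (modeled here as seqId 0), so one family leaks through while the other stays hidden.

```java
// Minimal model of the readpoint check described in the event sequence.
public class BulkLoadVisibility {
    // A cell is visible to a scanner when its seqId is at or below the
    // scanner's readpoint. seqId == 0 models "marker compacted away".
    static boolean visible(long cellSeqId, long readPoint) {
        return cellSeqId <= readPoint;
    }

    public static void main(String[] args) {
        long readPoint = 10, bulkLoadSeqId = 11;
        boolean cf1 = visible(0, readPoint);             // cf1 compacted: marker gone, data visible
        boolean cf2 = visible(bulkLoadSeqId, readPoint); // cf2: readpoint 10 < seqId 11, data hidden
        System.out.println("cf1=" + cf1 + " cf2=" + cf2); // cf1=true cf2=false: row broken
    }
}
```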
[jira] [Commented] (HBASE-11853) Provide an alternative to the apache build for developers (like me) who aren't committers
[ https://issues.apache.org/jira/browse/HBASE-11853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120063#comment-14120063 ] Alex Newman commented on HBASE-11853: - [~apurtell] So another problem. It seems as though the tests in the compat directories aren't categorized at all. In fact they don't pull in hbase-common. Perhaps we could do packages which are not categorized at all in another commit? Provide an alternative to the apache build for developers (like me) who aren't committers - Key: HBASE-11853 URL: https://issues.apache.org/jira/browse/HBASE-11853 Project: HBase Issue Type: Bug Reporter: Alex Newman Assignee: Alex Newman Attachments: HBASE-11853-testing-v0.patch, HBASE-11853-testing-v1.patch, HBASE-11853-v3.patch Travis CI and Circle-CI now provide free builds for open source projects. I created the capability to run builds this way. Although they are closed source (and thus not a replacement for jenkins IMHO), they are super convenient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics
[ https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell resolved HBASE-11876. Resolution: Fixed Pushed to 0.98. All hbase-server tests pass locally. RegionScanner.nextRaw(...) should not update metrics Key: HBASE-11876 URL: https://issues.apache.org/jira/browse/HBASE-11876 Project: HBase Issue Type: Bug Affects Versions: 0.98.6 Reporter: Lars Hofhansl Assignee: Andrew Purtell Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11876-0.98.patch, HBASE-11876-0.98.patch, HBASE-11876.patch, HBASE-11876.patch I added RegionScanner.nextRaw(...) to allow smart clients to avoid some of the default work that HBase is doing, such as {start|stop}RegionOperation and synchronized(scanner) for each row. Metrics should follow the same approach. Collecting them per row is expensive and a caller should have the option to collect those later or to avoid collecting them completely. We can also save some cycles in RSRpcServices.scan(...) if we updated the metric only once per batch instead of once per row. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120070#comment-14120070 ] Andrew Purtell commented on HBASE-11886: bq. Any chance for a short unit test that confirms the fix? Then we won't regress here again. Thanks. I've got time to do this. Working on it now. The creator of the table should have all permissions on the table - Key: HBASE-11886 URL: https://issues.apache.org/jira/browse/HBASE-11886 Project: HBase Issue Type: Bug Affects Versions: 0.98.3 Reporter: Devaraj Das Assignee: Devaraj Das Priority: Critical Fix For: 1.0.0, 0.98.6 Attachments: 11886-1.txt In our testing of 0.98.4 with security ON, we found that table creator doesn't have RWXCA on the created table. Instead, the user representing the HBase daemon gets all permissions. Due to this the table creator can't write to the table he just created. I am suspecting HBASE-11275 introduced the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11805) KeyValue to Cell Convert in WALEdit APIs
[ https://issues.apache.org/jira/browse/HBASE-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120074#comment-14120074 ] stack commented on HBASE-11805: --- Sounds reasonable. +1 KeyValue to Cell Convert in WALEdit APIs Key: HBASE-11805 URL: https://issues.apache.org/jira/browse/HBASE-11805 Project: HBase Issue Type: Improvement Components: wal Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 0.99.0, 2.0.0, 0.98.7 Attachments: HBASE-11805.patch In almost all other main interface classes/APIs we have changed KeyValue to Cell. But it is missing in WALEdit. This is marked public for Replication (well, it should be for CP also). These 2 APIs deal with KVs: add(KeyValue kv) ArrayList<KeyValue> getKeyValues() Suggest deprecate them and add for 0.98: add(Cell kv) List<Cell> getCells() And just replace from 1.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120086#comment-14120086 ] stack commented on HBASE-11165: --- bq. I mean, seems tricky to make it be the same code path? Yeah. One of the code paths will 'suffer' neglect. [~mantonov] I like your cold backup - warm - hot - active-active let me try and put together a bit of a summary so far on findings and arguments. bq. ...and reduce usage of memory Yeah, we'll have to go this route if we are trying to keep state of a big cluster in heap. Could work on making the representation more compact. You arguing for single meta region [~octo47] then? There is also the on-hdfs size to consider (write-amplification) and the r/w i/os. Scaling so cluster can host 1M regions and beyond (50M regions?) Key: HBASE-11165 URL: https://issues.apache.org/jira/browse/HBASE-11165 Project: HBase Issue Type: Brainstorming Reporter: stack Attachments: HBASE-11165.zip, Region Scalability test.pdf, zk_less_assignment_comparison_2.pdf This discussion issue comes out of Co-locate Meta And Master HBASE-10569 and comments on the doc posted there. A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M regions maybe even 50M later. This issue is about discussing how we will do that (or if not 50M on a cluster, how otherwise we can attain same end). More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11888) Add per region flush count
[ https://issues.apache.org/jira/browse/HBASE-11888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-11888: -- Priority: Minor (was: Major) Add per region flush count -- Key: HBASE-11888 URL: https://issues.apache.org/jira/browse/HBASE-11888 Project: HBase Issue Type: Improvement Reporter: Elliott Clark Assignee: Elliott Clark Priority: Minor Debugging a workload that ran overnight it's hard to tell if a region flushed a lot and got compacted, or flushed fewer times and then got compacted. We should have a counter on that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11339) HBase MOB
[ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120094#comment-14120094 ] Jonathan Hsieh commented on HBASE-11339: Re: [~lhofhansl] bq. To be fair, my comment itself addressed that by saying small blobs are stored by value in HBase, and only large blobs in HDFS. We can store a lot of 10MB (in the worst case scenario it's 200m x 10mb = 2pb) in HDFS, if that's not enough, we can dial up the threshold. bq. It seems nobody understood what I am suggesting. Depending on use case and data distribution you pick a threshold X. Blobs with a size < X are stored directly in HBase as a column value. Blobs >= X are stored in HDFS with a reference in HBase using the 3-phase approach. The MOB solution we're espousing does not preclude the hybrid hdfs+hbase approach - that could still be used with objects that are larger than or approach the hdfs block size. Our claim is that the mob approach is complementary to a proper streaming api based hdfs+hbase mechanism for large objects. Operationally, the MOB design is similar -- Depending on use case and data distribution you pick a threshold X on each column family. Blobs with a size < X are stored directly in HBase as a column value. Blobs >= X are stored in the MOB area with a reference in HBase using the on-flush/on-compaction approach. If the blob is larger than the ~10MB default [1], it is rejected. With the MOB design, if the threshold X performs poorly, then you can alter table the X value and the next major compaction will shift values between the MOB area and the normal hbase regions. With the HDFS+HBase approach, would we need a new mechanism to shift data between hdfs and hbase? Is there a simple tuning/migration story? bq. True, but as I stated the store small blobs by value and only large ones by reference solution is not mentioned in there. bq. No it's not... It says either all blobs go into HBase or all blobs go into HDFS... 
See above. Small blobs would be stored directly in HBase, not in HDFS. That's key, nobody wants to store 100k or 1mb files directly in HDFS. I'm confused. In Section 4.1.2 this split was assumed, and the different mechanisms were for handling the large ones. The discussions earlier in the jira explicitly added threshold sizes to decide when the value or reference implementations are used. For people that want to put a lot of 100k or 1mb objects in hbase there are many problems that arise, and this mob feature is an approach to make this valid (according to the defaults) workload work better and more predictably. The mob design says store small blobs by value, moderate blobs by reference (with data in the mob area), and maintains that hbase is not for large objects [1]. bq. Yet, all that is possible to do with a client only solution and could be abstracted there. bq. I'll also admit that our blob storage tool is not finished, yet, and that for its use case we don't need replication or backup as it itself will be the backup solution for another very large data store. bq. Are you guys absolutely... 100%... positive that this cannot be done in any other way and has to be done this way? That we cannot store files up to a certain size as values in HBase and larger files in HDFS? And there is no good threshold value for this? I don't think that saying this is the only way something could be done is the right thing to ask. There are always many ways to get a piece of functionality -- we've presented a few other potential solutions, and have chosen and are justifying a design considering many of the tradeoffs. We presented a need, a design, an early implementation, and evidence of a deployment and other potential use cases. The hybrid hdfs-hbase approach is one of the alternatives. I believe we agree that there will be some complexity introduced with that approach dealing with atomicity, bulk load, security, backup, replication and potentially tuning. 
We have enough detail from the discussion to handle atomicity; there are open questions with the others. It is hard to claim a feature is production-ready if we don't have a relatively simple mechanism for backups and disaster recovery. In some future, when the hybrid hdfs+hbase system gets open sourced along with tooling that internalizes the operational complexities, I think it would be a fine addition to hbase. Rough thresholds would be 0-100k hbase by value, 100k-10MB hbase by mob, 10MB+ hbase by ref to hdfs. [1] Today the default Cell size max is ~10MB. https://github.com/apache/hbase/blob/master/hbase-common/src/main/resources/hbase-default.xml#L530 HBase MOB - Key: HBASE-11339 URL: https://issues.apache.org/jira/browse/HBASE-11339 Project: HBase Issue Type: Umbrella Components: regionserver, Scanners
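The rough thresholds above can be expressed as a small routing function. This is an illustrative sketch of the discussion, not an HBase API; all names and the specific threshold values are assumptions taken from the comment:

```java
// Illustrative blob routing per the discussed thresholds: values below the
// MOB threshold stay inline in the region, values up to the max cell size
// go to the MOB area by reference, and anything larger would need the
// hybrid hdfs-by-reference scheme.
public class BlobRouting {
    enum Store { INLINE, MOB_REF, HDFS_REF }

    static Store route(long size, long mobThreshold, long maxCellSize) {
        if (size < mobThreshold) return Store.INLINE;
        if (size <= maxCellSize) return Store.MOB_REF;
        return Store.HDFS_REF;
    }

    public static void main(String[] args) {
        long kb = 1024, mb = 1024 * kb;
        long mobThreshold = 100 * kb;   // ~100k, per the comment
        long maxCellSize = 10 * mb;     // ~10MB default cell size max
        System.out.println(route(50 * kb, mobThreshold, maxCellSize));  // INLINE
        System.out.println(route(1 * mb, mobThreshold, maxCellSize));   // MOB_REF
        System.out.println(route(50 * mb, mobThreshold, maxCellSize));  // HDFS_REF
    }
}
```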
[jira] [Updated] (HBASE-11888) Add per region flush count
[ https://issues.apache.org/jira/browse/HBASE-11888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-11888: -- Issue Type: Improvement (was: Bug) Add per region flush count -- Key: HBASE-11888 URL: https://issues.apache.org/jira/browse/HBASE-11888 Project: HBase Issue Type: Improvement Reporter: Elliott Clark Assignee: Elliott Clark Debugging a workload that ran overnight it's hard to tell if a region flushed a lot and got compacted, or flushed fewer times and then got compacted. We should have a counter on that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120106#comment-14120106 ] Devaraj Das commented on HBASE-11886: - Thanks, [~apurtell], you beat me to it. The creator of the table should have all permissions on the table - Key: HBASE-11886 URL: https://issues.apache.org/jira/browse/HBASE-11886 Project: HBase Issue Type: Bug Affects Versions: 0.98.3 Reporter: Devaraj Das Assignee: Devaraj Das Priority: Critical Fix For: 1.0.0, 0.98.6 Attachments: 11886-1.txt In our testing of 0.98.4 with security ON, we found that table creator doesn't have RWXCA on the created table. Instead, the user representing the HBase daemon gets all permissions. Due to this the table creator can't write to the table he just created. I am suspecting HBASE-11275 introduced the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11760) Tighten up region state transition
[ https://issues.apache.org/jira/browse/HBASE-11760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120108#comment-14120108 ] Hadoop QA commented on HBASE-11760: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666245/hbase-11760_2.1.patch against trunk revision . ATTACHMENT ID: 12666245 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.replication.regionserver.TestReplicationThrottler Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10694//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10694//console This message is automatically generated. 
Tighten up region state transition -- Key: HBASE-11760 URL: https://issues.apache.org/jira/browse/HBASE-11760 Project: HBase Issue Type: Improvement Components: Region Assignment Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 2.0.0 Attachments: hbase-11760.patch, hbase-11760_2.1.patch, hbase-11760_2.patch When a regionserver reports to master a region transition, we should check the current region state to be exactly what we expect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value
[ https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-11887: -- Attachment: Screen Shot 2014-09-03 at 10.18.58 AM.png Here is what it looks like. This is a small heap. It has 20M references to qualifier/value after running for a relatively short time. Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value --- Key: HBASE-11887 URL: https://issues.apache.org/jira/browse/HBASE-11887 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.99.0 Reporter: stack Assignee: stack Attachments: Screen Shot 2014-09-03 at 10.18.58 AM.png Trying to test branch-1, I run out of mem pretty fast. Looking at dumps, I see too many instances of LiteralByteString. Seem to be 'qualifiers' and 'values' out of pb QualifierValue... and on up to the multi call into the server. Am having trouble finding how the retention is being done... Filing issue in meantime while work on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11339) HBase MOB
[ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120118#comment-14120118 ] Jonathan Hsieh commented on HBASE-11339: re: [~apurtell] bq. Please. I don't think we should ever ship a release with a dependency on MR for core function. Committing this to trunk in stages could be ok, as long as we do not attempt a release including the feature before MOB compaction is handled natively. I agree -- moreover, ideally hbase should not need external processes except for hdfs/zk. However, there is what should be, and there is what has happened and what does happen. In those cases we have ended up marking features experimental. There are many examples of features in core hbase that shipped in stable releases, that still require external processes, and that may have no demonstrated users. You'd have to go back a bit to get one that explicitly depended on MR, but they did exist (e.g. pre dist log splitting we had an MR-based log replay -- useful in avoiding 10 hr recovery downtimes). This would be a good discussion topic for an upcoming PMC meeting. What is your definition of stages? -- do you mean a patch at a time, or something more like: stage one with external compactions, stage two with internal compactions? For this MOB feature, we would have the experimental tag while we had external compactions, and it would remain until we remove external dependencies and this compaction is hardened with fault testing. Given our current cadence, we should be able to have this completed as part of the hbase 1.99/2.0 line's timeframe. 
HBase MOB - Key: HBASE-11339 URL: https://issues.apache.org/jira/browse/HBASE-11339 Project: HBase Issue Type: Umbrella Components: regionserver, Scanners Reporter: Jingcheng Du Assignee: Jingcheng Du Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user guide_v2.docx, hbase-11339-in-dev.patch It's quite useful to save medium binary data like images and documents into Apache HBase. Unfortunately, directly saving the binary MOB (medium object) to HBase leads to worse performance because of frequent splits and compactions. In this design, the MOB data are stored in a more efficient way, which keeps high write/read performance and guarantees data consistency in Apache HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11885) Provide a Dockerfile to easily build and run HBase from source
[ https://issues.apache.org/jira/browse/HBASE-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120122#comment-14120122 ] Nicolas Liochon commented on HBASE-11885: - Since HBASE-4955 is fixed, it's possible and actually better to remove Gary's repo from the master version of HBase. Provide a Dockerfile to easily build and run HBase from source -- Key: HBASE-11885 URL: https://issues.apache.org/jira/browse/HBASE-11885 Project: HBase Issue Type: New Feature Reporter: Dima Spivak Assignee: Dima Spivak [A recent email to dev@|http://mail-archives.apache.org/mod_mbox/hbase-dev/201408.mbox/%3CCAAef%2BM4q%3Da8Dqxe_EHSFTueY%2BXxz%2BtTe%2BJKsWWbXjhB_Pz7oSA%40mail.gmail.com%3E] highlighted the difficulty that new users can face in getting HBase compiled from source and running locally. I'd like to provide a Dockerfile that would allow anyone with Docker running on a machine with a reasonably current Linux kernel to do so with ease. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value
[ https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-11887: -- Attachment: 11887.txt This seems to help. The BoundedPriorityBlockingQueue has all the handlers, which use CallRunners. The CallRunners hold on to Calls after they are done. All handlers can be holding on to big ycsb puts. With this in place, I run much longer. Heap character is completely different now. Let me dig more. This is a branch-1/master only issue (redo of rpc priority). Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value --- Key: HBASE-11887 URL: https://issues.apache.org/jira/browse/HBASE-11887 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.99.0 Reporter: stack Assignee: stack Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png Trying to test branch-1, I run out of mem pretty fast. Looking at dumps, I see too many instances of LiteralByteString. Seem to be 'qualifiers' and 'values' out of pb QualifierValue... and on up to the multi call into the server. Am having trouble finding how the retention is being done... Filing issue in meantime while work on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
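The retention pattern described in the comment above can be sketched with hypothetical stand-in classes (not the actual HBase RPC code): if a runner nulls its call reference once it has finished, a queued or pooled runner object no longer pins the request payload (e.g. a big put) in memory.

```java
// Minimal sketch, with hypothetical Call/CallRunner stand-ins, of releasing a
// finished call so the handler queue does not retain its payload.
public class CallRunnerSketch {
    static class Call {
        final byte[] payload; // stands in for a large request, e.g. a big ycsb put
        Call(byte[] payload) { this.payload = payload; }
    }

    static class CallRunner implements Runnable {
        private Call call;
        CallRunner(Call call) { this.call = call; }
        @Override
        public void run() {
            try {
                // ... process this.call (dispatch to the service, send the response) ...
            } finally {
                call = null; // drop the reference even if this runner object stays queued
            }
        }
        Call getCall() { return call; }
    }

    public static boolean releasedAfterRun() {
        CallRunner runner = new CallRunner(new Call(new byte[1024]));
        runner.run();
        return runner.getCall() == null; // payload is now unreachable through the runner
    }

    public static void main(String[] args) {
        System.out.println(releasedAfterRun()); // true
    }
}
```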
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120142#comment-14120142 ] Andrey Stepachev commented on HBASE-11165: -- bq.Yeah, we'll have to go this route if we are trying to keep state of a big cluster in heap. Could work on making the representation more compact. You arguing for single meta region Andrey Stepachev then? There is also the on-hdfs size to consider (write-amplification) and the r/w i/os. For sure, a compact representation doesn't imply a single meta. Compact meta means we only need to bother with splitting meta for really big installations. But how would HDFS handle that, as [~mantonov] mentioned above? As for compact META representations, we can use other techniques to reduce the HDFS impact of a big meta. Scaling so cluster can host 1M regions and beyond (50M regions?) Key: HBASE-11165 URL: https://issues.apache.org/jira/browse/HBASE-11165 Project: HBase Issue Type: Brainstorming Reporter: stack Attachments: HBASE-11165.zip, Region Scalability test.pdf, zk_less_assignment_comparison_2.pdf This discussion issue comes out of Co-locate Meta And Master HBASE-10569 and comments on the doc posted there. A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M regions maybe even 50M later. This issue is about discussing how we will do that (or if not 50M on a cluster, how otherwise we can attain same end). More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120144#comment-14120144 ] Andrey Stepachev commented on HBASE-11165: -- bq.But how HDFS would handle that, as Mikhail Antonov mentioned above? that should be a question Scaling so cluster can host 1M regions and beyond (50M regions?) Key: HBASE-11165 URL: https://issues.apache.org/jira/browse/HBASE-11165 Project: HBase Issue Type: Brainstorming Reporter: stack Attachments: HBASE-11165.zip, Region Scalability test.pdf, zk_less_assignment_comparison_2.pdf This discussion issue comes out of Co-locate Meta And Master HBASE-10569 and comments on the doc posted there. A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M regions maybe even 50M later. This issue is about discussing how we will do that (or if not 50M on a cluster, how otherwise we can attain same end). More detail to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11885) Provide a Dockerfile to easily build and run HBase from source
[ https://issues.apache.org/jira/browse/HBASE-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120158#comment-14120158 ] Dima Spivak commented on HBASE-11885: - +1 ^^^. As I said, I had no issues at all building master from source after removing that repository altogether (0.94 is a different story, because of the dependency on Surefire version 2.12-TRUNK-HBASE-2). Provide a Dockerfile to easily build and run HBase from source -- Key: HBASE-11885 URL: https://issues.apache.org/jira/browse/HBASE-11885 Project: HBase Issue Type: New Feature Reporter: Dima Spivak Assignee: Dima Spivak [A recent email to dev@|http://mail-archives.apache.org/mod_mbox/hbase-dev/201408.mbox/%3CCAAef%2BM4q%3Da8Dqxe_EHSFTueY%2BXxz%2BtTe%2BJKsWWbXjhB_Pz7oSA%40mail.gmail.com%3E] highlighted the difficulty that new users can face in getting HBase compiled from source and running locally. I'd like to provide a Dockerfile that would allow anyone with Docker running on a machine with a reasonably current Linux kernel to do so with ease. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11339) HBase MOB
[ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120174#comment-14120174 ] Andrew Purtell commented on HBASE-11339: bq. You'd have to go back a bit to get one that explicitly depended on MR, but they did exist (e.g. pre dist log splitting we had an MR-based log replay -- useful in avoiding 10 hr recovery downtimes). The master's built-in splitting was still available even if there was no MR runtime that could run the replay tool. bq. What is your definition of stages? -- do you mean a patch at a time, or something more like: stage one with external compactions, stage two with internal compactions? Stage = JIRA issue. bq. For this MOB feature, we would have the experimental tag while we had external compactions, and it would remain until we remove external dependencies and this compaction is hardened with fault testing. Whether or not the feature is tagged as experimental seems orthogonal to the compaction implementation question (at least to me). If I read the above correctly we are looking at 2.0 as a possible release for shipping this feature? I suggest we communicate the feature status as experimental for the whole release line, i.e. until 2.1, like what we have done with the cell security features in the 0.98 line. HBase MOB - Key: HBASE-11339 URL: https://issues.apache.org/jira/browse/HBASE-11339 Project: HBase Issue Type: Umbrella Components: regionserver, Scanners Reporter: Jingcheng Du Assignee: Jingcheng Du Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user guide_v2.docx, hbase-11339-in-dev.patch It's quite useful to save medium binary data like images and documents into Apache HBase. Unfortunately, directly saving the binary MOB (medium object) to HBase leads to worse performance because of frequent splits and compactions. 
In this design, the MOB data are stored in a more efficient way, which keeps high write/read performance and guarantees data consistency in Apache HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-11886: -- Fix Version/s: (was: 1.0.0) 2.0.0 0.99.0 The creator of the table should have all permissions on the table - Key: HBASE-11886 URL: https://issues.apache.org/jira/browse/HBASE-11886 Project: HBase Issue Type: Bug Affects Versions: 0.98.3 Reporter: Devaraj Das Assignee: Devaraj Das Priority: Critical Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: 11886-1.txt In our testing of 0.98.4 with security ON, we found that table creator doesn't have RWXCA on the created table. Instead, the user representing the HBase daemon gets all permissions. Due to this the table creator can't write to the table he just created. I am suspecting HBASE-11275 introduced the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value
[ https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120191#comment-14120191 ] Enis Soztutar commented on HBASE-11887: --- nice one. Raising this to critical. Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value --- Key: HBASE-11887 URL: https://issues.apache.org/jira/browse/HBASE-11887 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.99.0 Reporter: stack Assignee: stack Fix For: 0.99.0, 2.0.0 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png Trying to test branch-1, I run out of mem pretty fast. Looking at dumps, I see too many instances of LiteralByteString. Seem to be 'qualifiers' and 'values' out of pb QualifierValue... and on up to the multi call into the server. Am having trouble finding how the retention is being done... Filing issue in meantime while work on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value
[ https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-11887: -- Fix Version/s: 2.0.0 0.99.0 Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value --- Key: HBASE-11887 URL: https://issues.apache.org/jira/browse/HBASE-11887 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.99.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 0.99.0, 2.0.0 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png Trying to test branch-1, I run out of mem pretty fast. Looking at dumps, I see too many instances of LiteralByteString. Seem to be 'qualifiers' and 'values' out of pb QualifierValue... and on up to the multi call into the server. Am having trouble finding how the retention is being done... Filing issue in meantime while work on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value
[ https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-11887: -- Priority: Critical (was: Major) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value --- Key: HBASE-11887 URL: https://issues.apache.org/jira/browse/HBASE-11887 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.99.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 0.99.0, 2.0.0 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png Trying to test branch-1, I run out of mem pretty fast. Looking at dumps, I see too many instances of LiteralByteString. Seem to be 'qualifiers' and 'values' out of pb QualifierValue... and on up to the multi call into the server. Am having trouble finding how the retention is being done... Filing issue in meantime while work on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11805) KeyValue to Cell Convert in WALEdit APIs
[ https://issues.apache.org/jira/browse/HBASE-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-11805: --- Attachment: HBASE-11805_V2.patch V2 addressing Stack's comment. Added CellUtil#estimatedLengthOf(final Cell cell) When the passed cell is a KV, we return KV.length() to maintain the same result as in the past. When the cell is of another type, the return will be an estimate adding up the key, value and tags length parts. KeyValue to Cell Convert in WALEdit APIs Key: HBASE-11805 URL: https://issues.apache.org/jira/browse/HBASE-11805 Project: HBase Issue Type: Improvement Components: wal Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 0.99.0, 2.0.0, 0.98.7 Attachments: HBASE-11805.patch, HBASE-11805_V2.patch In almost all other main interface classes/APIs we have changed KeyValue to Cell. But it is missing in WALEdit. This is public marked for Replication (well, it should be for CPs also). These 2 APIs deal with KVs: add(KeyValue kv) and ArrayList<KeyValue> getKeyValues() Suggest deprecating them and adding for 0.98: add(Cell kv) and List<Cell> getCells() And just replace them from 1.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
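The estimation rule described in the update above can be sketched as follows. The signature here is a hypothetical simplification (the real CellUtil#estimatedLengthOf takes a Cell); it only shows the two-branch rule: exact length for a KeyValue-backed cell, a sum of the parts otherwise.

```java
// Sketch of the two-branch length rule, not the actual org.apache.hadoop.hbase.CellUtil API.
public class EstimatedLengthSketch {
    public static int estimatedLengthOf(boolean isKeyValue, int kvLength,
                                        int keyLength, int valueLength, int tagsLength) {
        if (isKeyValue) {
            return kvLength; // preserves the old KeyValue.length() result
        }
        // estimate: add up the key, value and tags length parts
        return keyLength + valueLength + tagsLength;
    }

    public static void main(String[] args) {
        System.out.println(estimatedLengthOf(true, 58, 0, 0, 0));    // 58
        System.out.println(estimatedLengthOf(false, 0, 24, 100, 4)); // 128
    }
}
```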
[jira] [Commented] (HBASE-11882) Row level consistency may not be maintained with bulk load and compaction
[ https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120216#comment-14120216 ] Hadoop QA commented on HBASE-11882: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666066/HBASE-11882-master-v2.patch against trunk revision . ATTACHMENT ID: 12666066 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 2 zombie test(s): at org.apache.hadoop.hbase.client.TestHCM.testClusterStatus(TestHCM.java:250) at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:3813) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10695//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10695//console This message is automatically generated. 
Row level consistency may not be maintained with bulk load and compaction - Key: HBASE-11882 URL: https://issues.apache.org/jira/browse/HBASE-11882 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.99.0, 2.0.0 Reporter: Jerry He Assignee: Jerry He Priority: Critical Fix For: 0.99.0, 2.0.0 Attachments: HBASE-11882-master-v1.patch, HBASE-11882-master-v2.patch, TestHRegionServerBulkLoad.java.patch While looking into the TestHRegionServerBulkLoad failure for HBASE-11772, I found the root cause is that row level atomicity may not be maintained with bulk load together with compaction. TestHRegionServerBulkLoad is used to test bulk load atomicity. The test uses multiple threads to do bulk load and scan continuously and do compactions periodically. It verifies row level data is always consistent across column families. After HBASE-11591, we added readpoint checks for bulkloaded data using the seqId at the time of bulk load. Now a scanner will not see the data from a bulk load if the scanner's readpoint is earlier than the bulk load seqId. Previously, the atomic bulk load result was visible immediately to all scanners. The problem is with compaction after bulk load. Compaction does not lock the region and it is done one store (column family) at a time. It also compacts away the bulk load's seqId marker. Here is an event sequence where the row level consistency is broken. 1. A scanner is started to scan a region with cf1 and cf2. The readpoint is 10.
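The readpoint check from HBASE-11591 that this scenario exercises can be sketched as follows (a hypothetical helper, not the actual store file scanner code; treating a missing seqId marker as "always visible" models what a per-store compaction produces once it rewrites the files):

```java
// Sketch of bulk-load visibility under the HBASE-11591 readpoint check.
public class BulkLoadVisibilitySketch {
    // A file carrying a bulk-load seqId is visible only to scanners whose
    // readpoint is at or past that seqId; a file without a marker (e.g. the
    // output of a compaction) is visible to everyone.
    public static boolean isVisible(long scannerReadPoint, Long bulkLoadSeqId) {
        return bulkLoadSeqId == null || scannerReadPoint >= bulkLoadSeqId;
    }

    public static void main(String[] args) {
        // Scanner opened at readpoint 10; a later bulk load got seqId 12.
        System.out.println(isVisible(10, 12L));  // false: marked file still hidden
        System.out.println(isVisible(10, null)); // true: compacted store lost the marker
    }
}
```

With cf1 compacted and cf2 not, the same scanner would see the bulk-loaded cells in cf1 but not in cf2, which is exactly the broken row-level consistency the issue describes.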
[jira] [Updated] (HBASE-11826) Split each tableOrRegionName admin methods into two targetted methods
[ https://issues.apache.org/jira/browse/HBASE-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-11826: -- Attachment: hbase-11826_v3.patch reattach. Split each tableOrRegionName admin methods into two targetted methods - Key: HBASE-11826 URL: https://issues.apache.org/jira/browse/HBASE-11826 Project: HBase Issue Type: Improvement Reporter: Carter Assignee: Carter Fix For: 0.99.0, 2.0.0 Attachments: HBASE_11826.patch, HBASE_11826_v2.patch, HBASE_11826_v2.patch, hbase-11826_v3.patch, hbase-11826_v3.patch Purpose of this is to implement [~enis]'s suggestion to strongly type the methods that take tableOrRegionName as an argument. For instance: {code} void compact(final String tableNameOrRegionName) void compact(final byte[] tableNameOrRegionName) {code} becomes {code} @Deprecated void compact(final String tableNameOrRegionName) @Deprecated void compact(final byte[] tableNameOrRegionName) void compact(TableName table) void compactRegion(final byte[] regionName) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-10841) Scan setters should consistently return this
[ https://issues.apache.org/jira/browse/HBASE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120230#comment-14120230 ] Anoop Sam John commented on HBASE-10841: Patch looks good. bq.Here is a patch which fixes all offenders in Put,Delete,Get,Scan,etc Just edit this issue's title and description. :) Scan setters should consistently return this Key: HBASE-10841 URL: https://issues.apache.org/jira/browse/HBASE-10841 Project: HBase Issue Type: Sub-task Components: Client, Usability Affects Versions: 0.99.0 Reporter: Nick Dimiduk Assignee: Enis Soztutar Priority: Minor Fix For: 0.99.0, 2.0.0 Attachments: hbase-10841_v1.patch While addressing review comments on HBASE-10818, I noticed that our {{Scan}} class is inconsistent with its setter methods. Some of them return {{this}}, others don't. They should be consistent. I suggest making them all return {{this}}, to support chained invocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
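The chained-invocation style the issue asks for can be sketched with a hypothetical stand-in class (not the real org.apache.hadoop.hbase.client.Scan): every setter returns this, so calls compose in one expression.

```java
// Sketch of fluent setters returning this, as the issue proposes for Scan.
public class ScanSketch {
    private int caching = -1;
    private boolean cacheBlocks = true;

    public ScanSketch setCaching(int caching) {
        this.caching = caching;
        return this; // returning this is what enables chaining
    }

    public ScanSketch setCacheBlocks(boolean cacheBlocks) {
        this.cacheBlocks = cacheBlocks;
        return this;
    }

    public int getCaching() { return caching; }
    public boolean getCacheBlocks() { return cacheBlocks; }

    public static void main(String[] args) {
        // Chained invocation: each setter hands back the same instance.
        ScanSketch scan = new ScanSketch().setCaching(100).setCacheBlocks(false);
        System.out.println(scan.getCaching()); // 100
    }
}
```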
[jira] [Updated] (HBASE-10841) Scan setters should consistently return this
[ https://issues.apache.org/jira/browse/HBASE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-10841: --- Status: Patch Available (was: Open) Scan setters should consistently return this Key: HBASE-10841 URL: https://issues.apache.org/jira/browse/HBASE-10841 Project: HBase Issue Type: Sub-task Components: Client, Usability Affects Versions: 0.99.0 Reporter: Nick Dimiduk Assignee: Enis Soztutar Priority: Minor Fix For: 0.99.0, 2.0.0 Attachments: hbase-10841_v1.patch While addressing review comments on HBASE-10818, I noticed that our {{Scan}} class is inconsistent with its setter methods. Some of them return {{this}}, others don't. They should be consistent. I suggest making them all return {{this}}, to support chained invocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11339) HBase MOB
[ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120250#comment-14120250 ] Jonathan Hsieh commented on HBASE-11339: bq. The master's built-in splitting was still available even if there was no MR runtime that could run the replay tool. If you were ok with 10 hr downtimes due to recovery (back then no meta first recovery), then sure. For large deployments, MR for this was critical and not really optional. bq. Stage = JIRA issue. sgtm. bq. If I read the above correctly we are looking at 2.0 as a possible release for shipping this feature? I suggest we communicate the feature status as experimental for the whole release line, i.e. until 2.1, like what we have done with the cell security features in the 0.98 line. Yes -- trunk is 2.0 and new features should only land in trunk, and yes, we would note it as experimental until all pieces are in and some hardening has taken place. Ideally, all major features would be experimental in their first release. If we follow through with having 2.0 - 2.1 be like 0.92 - 0.94 or 0.96 - 0.98, then following the cell security approach for experimental status sounds good to me. HBase MOB - Key: HBASE-11339 URL: https://issues.apache.org/jira/browse/HBASE-11339 Project: HBase Issue Type: Umbrella Components: regionserver, Scanners Reporter: Jingcheng Du Assignee: Jingcheng Du Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user guide_v2.docx, hbase-11339-in-dev.patch It's quite useful to save medium binary data like images and documents into Apache HBase. Unfortunately, directly saving the binary MOB (medium object) to HBase leads to worse performance because of frequent splits and compactions. 
In this design, the MOB data are stored in a more efficient way, which keeps high write/read performance and guarantees data consistency in Apache HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics
[ https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120280#comment-14120280 ] Hudson commented on HBASE-11876: FAILURE: Integrated in HBase-0.98 #494 (See [https://builds.apache.org/job/HBase-0.98/494/]) Revert HBASE-11876 RegionScanner.nextRaw should not update metrics (apurtell: rev c3882ed73a5c77c21ee5110ded9598f2f317cb55) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java HBASE-11876 RegionScanner.nextRaw should not update metrics (apurtell: rev cf843b196338cf2f2bd0eafbe88fda7d4386fba2) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java RegionScanner.nextRaw(...) should not update metrics Key: HBASE-11876 URL: https://issues.apache.org/jira/browse/HBASE-11876 Project: HBase Issue Type: Bug Affects Versions: 0.98.6 Reporter: Lars Hofhansl Assignee: Andrew Purtell Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11876-0.98.patch, HBASE-11876-0.98.patch, HBASE-11876.patch, HBASE-11876.patch I added RegionScanner.nextRaw(...) to allow smart clients to avoid some of the default work that HBase is doing, such as {start|stop}RegionOperation and synchronized(scanner) for each row. Metrics should follow the same approach. Collecting them per row is expensive and a caller should have the option to collect them later or to avoid collecting them completely. We can also save some cycles in RSRpcServices.scan(...) if we updated the metric only once per batch instead of for each row. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
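The once-per-batch metric update the description suggests can be sketched with hypothetical names (not the actual RSRpcServices/RegionScanner code): the raw scan loop does no per-row metric bookkeeping, and the caller flushes a single update for the whole batch.

```java
// Sketch of batching metric updates: one counter bump per batch, not per row.
public class BatchedMetricsSketch {
    static long metricUpdates = 0; // how many times the (expensive) metric path ran

    static void updateReadMetric(long rows) {
        metricUpdates++; // stands in for the shared metrics sink
    }

    public static long scanBatch(int batchSize) {
        long rows = 0;
        for (int i = 0; i < batchSize; i++) {
            rows++; // stands in for nextRaw(): no metric work inside the loop
        }
        updateReadMetric(rows); // single update covering the whole batch
        return rows;
    }

    public static void main(String[] args) {
        scanBatch(1000);
        System.out.println(metricUpdates); // 1
    }
}
```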
[jira] [Work started] (HBASE-11885) Provide a Dockerfile to easily build and run HBase from source
[ https://issues.apache.org/jira/browse/HBASE-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-11885 started by Dima Spivak. --- Provide a Dockerfile to easily build and run HBase from source -- Key: HBASE-11885 URL: https://issues.apache.org/jira/browse/HBASE-11885 Project: HBase Issue Type: New Feature Reporter: Dima Spivak Assignee: Dima Spivak [A recent email to dev@|http://mail-archives.apache.org/mod_mbox/hbase-dev/201408.mbox/%3CCAAef%2BM4q%3Da8Dqxe_EHSFTueY%2BXxz%2BtTe%2BJKsWWbXjhB_Pz7oSA%40mail.gmail.com%3E] highlighted the difficulty that new users can face in getting HBase compiled from source and running locally. I'd like to provide a Dockerfile that would allow anyone with Docker running on a machine with a reasonably current Linux kernel to do so with ease. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics
[ https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120310#comment-14120310 ] Hudson commented on HBASE-11876: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #467 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/467/]) Revert HBASE-11876 RegionScanner.nextRaw should not update metrics (apurtell: rev c3882ed73a5c77c21ee5110ded9598f2f317cb55) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java HBASE-11876 RegionScanner.nextRaw should not update metrics (apurtell: rev cf843b196338cf2f2bd0eafbe88fda7d4386fba2) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java RegionScanner.nextRaw(...) should not update metrics Key: HBASE-11876 URL: https://issues.apache.org/jira/browse/HBASE-11876 Project: HBase Issue Type: Bug Affects Versions: 0.98.6 Reporter: Lars Hofhansl Assignee: Andrew Purtell Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11876-0.98.patch, HBASE-11876-0.98.patch, HBASE-11876.patch, HBASE-11876.patch I added RegionScanner.nextRaw(...) to allow smart clients to avoid some of the default work that HBase is doing, such as {start|stop}RegionOperation and synchronized(scanner) for each row. Metrics should follow the same approach. Collecting them per row is expensive and a caller should have the option to collect them later or to avoid collecting them completely. We can also save some cycles in RSRpcServices.scan(...) if we updated the metric only once per batch instead of for each row. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120321#comment-14120321 ] Anoop Sam John commented on HBASE-11886: Thanks for the find DD. + this.activeUser = UserProvider.instantiate(conf).getCurrent(); Get the user here from RequestContext(?) BTW
{code}
private User getActiveUser() throws IOException {
  User user = RequestContext.getRequestUser();
  if (!RequestContext.isInRequestContext()) {
    // for non-rpc handling, fallback to system user
    user = userProvider.getCurrent();
  }
  return user;
}
{code}
Would using InheritableThreadLocal in RequestContext solve the issue without other changes? The creator of the table should have all permissions on the table - Key: HBASE-11886 URL: https://issues.apache.org/jira/browse/HBASE-11886 Project: HBase Issue Type: Bug Affects Versions: 0.98.3 Reporter: Devaraj Das Assignee: Devaraj Das Priority: Critical Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: 11886-1.txt In our testing of 0.98.4 with security ON, we found that the table creator doesn't have RWXCA on the created table. Instead, the user representing the HBase daemon gets all permissions. Due to this the table creator can't write to the table he just created. I suspect HBASE-11275 introduced the problem.
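The InheritableThreadLocal suggestion rests on a JDK behavior worth making concrete: a value set in the calling (RPC handler) thread is copied into threads spawned afterwards, while a plain ThreadLocal is not. A self-contained sketch (hypothetical names, not the real RequestContext class):

```java
// Hypothetical sketch of why InheritableThreadLocal was suggested: a value
// set in the RPC handler thread is visible in threads it spawns later
// (e.g. a table-creation handler), whereas a plain ThreadLocal is not.
public class InheritableUserSketch {
    static final ThreadLocal<String> plainUser = new ThreadLocal<>();
    static final InheritableThreadLocal<String> inheritedUser = new InheritableThreadLocal<>();

    // Returns what a freshly spawned thread observes for each variable.
    static String[] probe() throws InterruptedException {
        plainUser.set("alice");
        inheritedUser.set("alice");
        final String[] seen = new String[2];
        Thread handler = new Thread(() -> {
            seen[0] = plainUser.get();      // null: plain ThreadLocal is per-thread
            seen[1] = inheritedUser.get();  // "alice": copied when the thread is created
        });
        handler.start();
        handler.join();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        String[] seen = probe();
        System.out.println("plain=" + seen[0] + " inherited=" + seen[1]);
    }
}
```

Note the copy happens at thread construction time only; later changes in the parent are not propagated, which is part of why the thread-pool caveat discussed below matters.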
[jira] [Commented] (HBASE-11826) Split each tableOrRegionName admin methods into two targetted methods
[ https://issues.apache.org/jira/browse/HBASE-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120341#comment-14120341 ] Hadoop QA commented on HBASE-11826: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666279/hbase-11826_v3.patch against trunk revision . ATTACHMENT ID: 12666279 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 167 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10697//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10697//console This message is automatically generated. 
Split each tableOrRegionName admin methods into two targetted methods - Key: HBASE-11826 URL: https://issues.apache.org/jira/browse/HBASE-11826 Project: HBase Issue Type: Improvement Reporter: Carter Assignee: Carter Fix For: 0.99.0, 2.0.0 Attachments: HBASE_11826.patch, HBASE_11826_v2.patch, HBASE_11826_v2.patch, hbase-11826_v3.patch, hbase-11826_v3.patch The purpose of this is to implement [~enis]'s suggestion to strongly type the methods that take tableOrRegionName as an argument. For instance:
{code}
void compact(final String tableNameOrRegionName)
void compact(final byte[] tableNameOrRegionName)
{code}
becomes
{code}
@Deprecated
void compact(final String tableNameOrRegionName)
@Deprecated
void compact(final byte[] tableNameOrRegionName)
void compact(TableName table)
void compactRegion(final byte[] regionName)
{code}
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120349#comment-14120349 ] Mikhail Antonov commented on HBASE-11165: - [~stack], [~octo47] - on this compaction topic, I also mentioned early on in the thread: bq. I wonder if it makes sense to have a google doc linked to this jira to save various proposals, findings and estimates? That summarizes current usage as conservatively 3.5 GB in meta for 1M regions. So it seems like we're using 3-3.5 KB per region row? That should be compressible, looking at the data in meta rows. Also I think it would help if we can post some numbers here and capture them in the documents, so we have a baseline for our work. For example: - how many KB in memory per region in meta - how many HDFS inodes per region (depends on the number of store files, but some estimate?) To estimate: how big would a deployment be where meta doesn't fit in memory? How many RSs, how many petabytes of data? Scaling so cluster can host 1M regions and beyond (50M regions?) Key: HBASE-11165 URL: https://issues.apache.org/jira/browse/HBASE-11165 Project: HBase Issue Type: Brainstorming Reporter: stack Attachments: HBASE-11165.zip, Region Scalability test.pdf, zk_less_assignment_comparison_2.pdf This discussion issue comes out of Co-locate Meta And Master HBASE-10569 and comments on the doc posted there. A user -- our Francis Liu -- needs to be able to scale a cluster to 1M regions, maybe even 50M later. This issue is about discussing how we will do that (or, if not 50M on a cluster, how otherwise we can attain the same end). More detail to follow.
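The per-region estimate above is straightforward arithmetic, sketched here as a check (the 3.5 GB / 1M figures are the comment's assumptions, not measured values):

```java
// Back-of-the-envelope check of the numbers in the comment:
// ~3.5 GB of meta for 1M regions works out to ~3.5 KB per region row.
public class MetaFootprintEstimate {
    // Decimal units throughout, matching the rough figures quoted.
    static double kbPerRegion(long metaBytes, long regions) {
        return metaBytes / (double) regions / 1000.0;
    }

    public static void main(String[] args) {
        long metaBytes = 3_500_000_000L;  // assumed ~3.5 GB in meta
        long regions = 1_000_000L;        // assumed 1M regions
        System.out.printf("~%.1f KB per region row%n", kbPerRegion(metaBytes, regions));
    }
}
```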
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120360#comment-14120360 ] Andrey Stepachev commented on HBASE-11165: -- Also, it is very interesting: do big users of HBase with so many regions use NameNode federation, or an enormous machine to handle a NameNode with so many regions?
[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120366#comment-14120366 ] Andrew Purtell commented on HBASE-11886: Back from other stuff. Let me address those points [~anoop.hbase] since I'm working in this area on a test.
[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120396#comment-14120396 ] Devaraj Das commented on HBASE-11886: - Using the user from RequestContext sounds fine. I am not so sure about the InheritableThreadLocal though. Since the master does HDFS operations when operations like createTable are called, it might be an issue, no? What I did changes the identity only for postCreateTableHandler; the other operations done as part of the createTable call are executed under the master's identity.
[jira] [Updated] (HBASE-10841) Scan,Get,Put,Delete,etc setters should consistently return this
[ https://issues.apache.org/jira/browse/HBASE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-10841: -- Summary: Scan,Get,Put,Delete,etc setters should consistently return this (was: Scan setters should consistently return this) Scan,Get,Put,Delete,etc setters should consistently return this --- Key: HBASE-10841 URL: https://issues.apache.org/jira/browse/HBASE-10841 Project: HBase Issue Type: Sub-task Components: Client, Usability Affects Versions: 0.99.0 Reporter: Nick Dimiduk Assignee: Enis Soztutar Priority: Minor Fix For: 0.99.0, 2.0.0 Attachments: hbase-10841_v1.patch While addressing review comments on HBASE-10818, I noticed that our {{Scan}} class is inconsistent with its setter methods. Some of them return {{this}}, others don't. They should be consistent. I suggest making them all return {{this}}, to support chained invocation.
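The "setters return this" pattern the issue asks for can be shown in a minimal sketch (a hypothetical Scan-like class, not the real HBase client API):

```java
// Minimal sketch of fluent setters: each setter returns 'this' so calls
// can be chained on a single expression.
public class FluentScanSketch {
    static class Scan {
        private byte[] startRow = new byte[0];
        private int caching = -1;

        Scan setStartRow(byte[] startRow) {
            this.startRow = startRow;
            return this;  // returning 'this' enables chained invocation
        }

        Scan setCaching(int caching) {
            this.caching = caching;
            return this;
        }

        int getCaching() { return caching; }
    }

    public static void main(String[] args) {
        // Chained invocation in one expression, the style the issue wants.
        Scan scan = new Scan().setStartRow("row1".getBytes()).setCaching(100);
        System.out.println("caching=" + scan.getCaching());
    }
}
```

The consistency argument is the point: once some setters return `this` and others return `void`, a chain breaks unpredictably mid-expression, so all setters on a class should agree.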
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120408#comment-14120408 ] Francis Liu commented on HBASE-11165: - We're currently using huge NNs. We haven't looked into the number of inodes as that didn't seem to be an issue for the 1M case (We have a single NN running ~250M files). But we'll be watching it for the post 1M benchmarks. Will post results here.
[jira] [Commented] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value
[ https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120406#comment-14120406 ] Enis Soztutar commented on HBASE-11887: --- +1 on patch if no other findings. Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value --- Key: HBASE-11887 URL: https://issues.apache.org/jira/browse/HBASE-11887 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.99.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 0.99.0, 2.0.0 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png Trying to test branch-1, I run out of mem pretty fast. Looking at dumps, I see too many instances of LiteralByteString. Seem to be 'qualifiers' and 'values' out of pb QualifierValue... and on up to the multi call into the server. Am having trouble finding how the retention is being done... Filing issue in meantime while work on it.
[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120427#comment-14120427 ] Anoop Sam John commented on HBASE-11886: bq.Since the master does HDFS operations when operations like createTable are called, it might be an issue, no? I think there is no issue, because the op will be performed with the master identity only. RequestContext is used to know who the active user is. RequestContext is an HBase class; HDFS does not get the user from it. By changing the RequestContext ThreadLocal, we make sure that wherever in the HBase code we check for the user from RequestContext, it is the RPC user who initiated the flow. I am OK not doing this change if there is a risk factor and it needs more time for tests. Andy would like to get the next RC soon, I believe. +1 with just changing the part of getting activeUser from RequestContext. Mind adding a comment on why we do this, so that it will be easy for someone who reads the code later?
[jira] [Comment Edited] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120427#comment-14120427 ] Anoop Sam John edited comment on HBASE-11886 at 9/3/14 8:51 PM: bq.Since the master does HDFS operations when operations like createTable are called, it might be an issue, no? I think there is no issue, because the op will be performed with the master identity only. RequestContext is used to know who the active user is. RequestContext is an HBase class; HDFS does not get the user from it. By changing the RequestContext ThreadLocal, we make sure that wherever in the HBase code we check for the user from RequestContext, it is the RPC user who initiated the flow. I am OK not doing this change if there is a risk factor and it needs more time for tests. Andy would like to get the next RC soon, I believe. +1 with just changing the part of getting activeUser from RequestContext (instead of UserProvider.instantiate(conf).getCurrent()). Mind adding a comment on why we do this, so that it will be easy for someone who reads the code later?
[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120434#comment-14120434 ] Andrew Purtell commented on HBASE-11886: bq. +1 with just changing the part of getting activeUser from RequestContext instead Yes this is what I am doing
[jira] [Commented] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value
[ https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120466#comment-14120466 ] stack commented on HBASE-11887: --- OK. Let me commit. It's been running these last few hours w/o a full GC. I have it set so a full GC kills the RS using the below configs:
{code}
export SERVER_GC_OPTS="-verbose:gc -XX:PrintFLSStatistics=1 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+HeapDumpBeforeFullGC -XX:HeapDumpPath=/data/1/ -XX:+CMSDumpAtPromotionFailure -XX:+HeapDumpAfterFullGC -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime"
{code}
It used to run out of road -- full GC -- after 5 minutes or so.
[jira] [Commented] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value
[ https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120473#comment-14120473 ] stack commented on HBASE-11887: --- jmap -histo:
{code}
 num     #instances         #bytes  class name
----------------------------------------------
   1:       3069883      780472352  [B
   2:       3501587      112050784  org.apache.hadoop.hbase.KeyValue
   3:       3441167       82588008  java.util.concurrent.ConcurrentSkipListMap$Node
   4:       1077906       68985984  org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$QualifierValue
   5:       2372963       56951112  com.google.protobuf.LiteralByteString
   6:       1720444       41290656  java.util.concurrent.ConcurrentSkipListMap$Index
   7:          7024       18547960  [I
{code}
shows QualifierValue at 1/10th of the count it used to be, with no mention of LiteralByteString, which used to run near even with the [B instance count. Let me commit then, since it makes branch-1 YCSB-able (previously it was not). Will be back to try to make more improvements, but this is enough for one issue.
[jira] [Commented] (HBASE-11805) KeyValue to Cell Convert in WALEdit APIs
[ https://issues.apache.org/jira/browse/HBASE-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120475#comment-14120475 ] Hadoop QA commented on HBASE-11805: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666280/HBASE-11805_V2.patch against trunk revision . ATTACHMENT ID: 12666280 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 45 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestMasterFailover org.apache.hadoop.hbase.client.TestMultiParallel org.apache.hadoop.hbase.client.TestReplicaWithCluster org.apache.hadoop.hbase.replication.TestPerTableCFReplication {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): at org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScanBase.testScan(TestTableInputFormatScanBase.java:238) at org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1.testScanEmptyToBBA(TestTableInputFormatScan1.java:70) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10696//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10696//console This message is automatically generated. 
KeyValue to Cell Convert in WALEdit APIs Key: HBASE-11805 URL: https://issues.apache.org/jira/browse/HBASE-11805 Project: HBase Issue Type: Improvement Components: wal Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 0.99.0, 2.0.0, 0.98.7 Attachments: HBASE-11805.patch, HBASE-11805_V2.patch In almost all other main interface classes/APIs we have changed KeyValue to Cell, but it is missing in WALEdit, which is marked public for Replication (well, it should be for CPs also). These 2 APIs deal with KVs:
{code}
add(KeyValue kv)
ArrayList<KeyValue> getKeyValues()
{code}
Suggest deprecating them and adding, for 0.98:
{code}
add(Cell kv)
List<Cell> getCells()
{code}
And just replace from 1.0.
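The proposed evolution, old KeyValue-typed methods kept as deprecated wrappers over new Cell-typed ones, can be sketched with simplified stand-ins (hypothetical classes, not the real WALEdit/Cell/KeyValue):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the API migration: deprecated KeyValue-typed
// methods delegate to the new Cell-typed ones, so old callers keep
// working while new callers use the generalized interface.
public class WALEditSketch {
    interface Cell { byte[] getValue(); }

    static class KeyValue implements Cell {
        private final byte[] value;
        KeyValue(byte[] value) { this.value = value; }
        public byte[] getValue() { return value; }
    }

    private final List<Cell> cells = new ArrayList<>();

    // New Cell-typed APIs (the 0.98 additions proposed above).
    public WALEditSketch add(Cell cell) { cells.add(cell); return this; }
    public List<Cell> getCells() { return cells; }

    // Old KeyValue-typed API, kept but deprecated; delegates to the new one.
    @Deprecated
    public WALEditSketch add(KeyValue kv) { return add((Cell) kv); }

    public static void main(String[] args) {
        WALEditSketch edit = new WALEditSketch();
        edit.add(new KeyValue("v1".getBytes()));  // old-style caller still works
        System.out.println("cells=" + edit.getCells().size());
    }
}
```

Since KeyValue implements Cell, the deprecated overload is a one-line delegate, which is what makes the "deprecate in 0.98, replace in 1.0" plan low-risk.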
[jira] [Resolved] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value
[ https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-11887. --- Resolution: Fixed Fix Version/s: 0.98.8 Hadoop Flags: Reviewed Applied to 0.98+. 0.98 doesn't seem to need it but going by what it does in branch-1, probably no harm letting go of these references so they get cleaned up early and maybe less-likely promoted (hope that ok [~apurtell])
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120485#comment-14120485 ] Elliott Clark commented on HBASE-11165: --- If we can get into the same scaling range as HDFS's NameNode then I don't see the urgency to split meta. Num Files Num Regions So it would seem that addressing the in-memory representation of meta would mean the scaling bottleneck would be back at the NN. At some point there will be limits there, but that seems fine as long as they are the same limits as our underlying foundation (HDFS).
[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11886: --- Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11886: --- Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11886:
---
Attachment: HBASE-11886.patch

Updated patch with test and Anoop's feedback.
[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11886:
---
Attachment: HBASE-11886-0.98.patch

Patch for 0.98.
[jira] [Commented] (HBASE-11826) Split each tableOrRegionName admin methods into two targetted methods
[ https://issues.apache.org/jira/browse/HBASE-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120508#comment-14120508 ] Enis Soztutar commented on HBASE-11826:
---
It seems that some of the tests got killed during the jenkins run. I've run these tests, and they pass with this command:
{code}
mvn clean test -Dtest=TestIdLock,TestMiniClusterLoadSequential,TestMergeTool,TestRegionSplitter,TestCoprocessorScanPolicy,TestTableInputFormatScan1,TestHFileOutputFormat2,TestAcidGuarantees,TestMiniClusterLoadEncoded,TestFSHDFSUtils
{code}

Split each tableOrRegionName admin methods into two targetted methods
Key: HBASE-11826
URL: https://issues.apache.org/jira/browse/HBASE-11826
Project: HBase
Issue Type: Improvement
Reporter: Carter
Assignee: Carter
Fix For: 0.99.0, 2.0.0
Attachments: HBASE_11826.patch, HBASE_11826_v2.patch, HBASE_11826_v2.patch, hbase-11826_v3.patch, hbase-11826_v3.patch

The purpose of this is to implement [~enis]'s suggestion to strongly type the methods that take tableOrRegionName as an argument. For instance:
{code}
void compact(final String tableNameOrRegionName)
void compact(final byte[] tableNameOrRegionName)
{code}
becomes
{code}
@Deprecated
void compact(final String tableNameOrRegionName)
@Deprecated
void compact(final byte[] tableNameOrRegionName)

void compact(TableName table)
void compactRegion(final byte[] regionName)
{code}
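To illustrate the motivation, here is a hypothetical, self-contained sketch (not actual HBase client code; the TableName stand-in and the method bodies are invented for illustration) of how the split makes the table-vs-region distinction a compile-time property instead of a runtime guess:

```java
// Illustrative sketch of the typed-method split proposed in HBASE-11826.
// TableName here is a stand-in for org.apache.hadoop.hbase.TableName, and
// the methods return strings only so the example is self-contained.
public class TypedAdminSketch {

    // Dedicated type for table names; a raw byte[]/String can no longer be
    // accidentally interpreted as either a table or a region.
    static final class TableName {
        final String name;
        private TableName(String n) { this.name = n; }
        static TableName valueOf(String n) { return new TableName(n); }
    }

    // Unambiguously a table-level operation.
    static String compact(TableName table) {
        return "compact table " + table.name;
    }

    // Unambiguously a region-level operation.
    static String compactRegion(byte[] regionName) {
        return "compact region " + new String(regionName);
    }

    public static void main(String[] args) {
        System.out.println(compact(TableName.valueOf("t1")));
        System.out.println(compactRegion("t1,,1234.abcd".getBytes()));
    }
}
```

With the old overloads, compact("t1,,1234.abcd") would silently fall through name-resolution logic at runtime; with the split, the caller must state which operation is meant.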
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120511#comment-14120511 ] Francis Liu commented on HBASE-11165:
---
{quote}
Francis Liu, Virag Kothari - do you guys by chance have some recent numbers (or maybe an estimate) on how long a full master failover takes on the cluster with 300k or 3M regions? I didn't find those in the recent doc, eager to see that.
{quote}
[~mantonov] We don't have the numbers; we'll get them next time. Though failover recovery is essentially bounded by scanning meta and recovering dead servers, so without dead servers it would be just a fraction of the startup time.
[jira] [Commented] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value
[ https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120512#comment-14120512 ] Andrew Purtell commented on HBASE-11887:
---
Thanks!

Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value
Key: HBASE-11887
URL: https://issues.apache.org/jira/browse/HBASE-11887
Project: HBase
Issue Type: Bug
Components: Protobufs
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
Priority: Critical
Fix For: 0.99.0, 2.0.0, 0.98.8
Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png

Trying to test branch-1, I run out of memory pretty fast. Looking at dumps, I see too many instances of LiteralByteString. They seem to be 'qualifiers' and 'values' out of pb QualifierValue... and on up to the multi call into the server. Am having trouble finding how the retention is being done... Filing this issue in the meantime while I work on it.
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120516#comment-14120516 ] Mikhail Antonov commented on HBASE-11165:
---
[~toffer] thanks! I'd be really curious to look at those numbers. Is the NN you mentioned with 250M files solely dedicated to the HBase installation? I mean, could the assumption be made that an HBase cluster with 1M or more regions consumes about 250M files in HDFS, so roughly 250 files per region, or would that be too bold an assumption?

[~eclark] So if we take as a baseline that (num of files) (num regions), I wonder how close to NN limits we are? I mean, if we're talking about the case with 10M regions (or even 50M), with the same ratio of regions to files, 10M regions would give us 2.5B files in HDFS? How close is that to HDFS limits?
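The back-of-envelope math in the comment above can be sketched as follows. The 250-files-per-region ratio is an assumption drawn from the 250M-files / 1M-regions figure quoted in the thread, and the class and method names are invented for illustration:

```java
// Hypothetical back-of-envelope estimate: if each region accounts for roughly
// FILES_PER_REGION HDFS files, the NameNode must track about
// numRegions * FILES_PER_REGION file objects.
public class NameNodeLoadEstimate {

    // Assumed ratio, from the 250M files / 1M regions figure in the thread.
    static final long FILES_PER_REGION = 250;

    static long estimatedFiles(long numRegions) {
        return numRegions * FILES_PER_REGION;
    }

    public static void main(String[] args) {
        // 1M regions -> 250M files; 10M regions -> 2.5B files
        System.out.println(estimatedFiles(1_000_000L));   // 250000000
        System.out.println(estimatedFiles(10_000_000L));  // 2500000000
    }
}
```

Under that assumed ratio, 10M regions would imply roughly 2.5B NameNode file objects, which is the figure the comment asks to compare against HDFS limits.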
[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table
[ https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120517#comment-14120517 ] Hadoop QA commented on HBASE-11886:
---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666325/HBASE-11886-0.98.patch against trunk revision .
ATTACHMENT ID: 12666325
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests.
{color:red}-1 patch{color}. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10699//console
This message is automatically generated.