[jira] [Commented] (HBASE-11882) Row level consistency may not be maintained with bulk load and compaction

2014-09-03 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119417#comment-14119417
 ] 

ramkrishna.s.vasudevan commented on HBASE-11882:


I hope you got my concern.  Previously, when a bulk load completed just 
after a scanner was created and before the scan started its seek, the KVs in 
the bulk-loaded file were also taken into consideration. But after 
HBASE-11591, the bulk-loaded file is not taken into consideration.  So if 
the test case expects some value from the bulk-loaded file, it may fail.  
Maybe it does not happen now, but it could.  Anyway, I will check the test 
case closely.  +1 on patch. 

 Row level consistency may not be maintained with bulk load and compaction
 -

 Key: HBASE-11882
 URL: https://issues.apache.org/jira/browse/HBASE-11882
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.99.0, 2.0.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Critical
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE-11882-master-v1.patch, 
 HBASE-11882-master-v2.patch, TestHRegionServerBulkLoad.java.patch


 While looking into the TestHRegionServerBulkLoad failure for HBASE-11772, I 
 found the root cause is that row level atomicity may not be maintained with 
 bulk load together with compaction.
 TestHRegionServerBulkLoad is used to test bulk load atomicity. The test uses 
 multiple threads to do bulk load and scan continuously and do compactions 
 periodically. 
 It verifies row level data is always consistent across column families.
 After HBASE-11591, we added readpoint checks for bulkloaded data using the 
 seqId at the time of bulk load. Now a scanner will not see the data from a 
 bulk load if the scanner's readpoint is earlier than the bulk load seqId.
 Previously, the atomic bulk load result was visible immediately to all 
 scanners.
 The problem is with compaction after bulk load. Compaction does not lock the 
 region, and it is done one store (column family) at a time. It also compacts 
 away the seqId marker of the bulk load.
 Here is an event sequence where the row level consistency is broken.
 1. A scanner is started to scan a region with cf1 and cf2. The readpoint is 
 10.
 2. There is a bulk load that loads into cf1 and cf2. The bulk load seqId is 
 11. Bulk load is guarded by region write lock. So it is atomic.
 3. There is a compaction that compacts cf1. It compacts away the seqId marker 
 of the bulk load.
 4. The scanner calls next() to move to row-1001. It gets the bulk load data 
 for cf1, since there is no seqId preventing it.  It does not get the bulk 
 load data for cf2, since the scanner's readpoint (10) is less than the bulk 
 load seqId (11).
 Now the row level consistency is broken in this case.
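The sequence above can be sketched as a toy visibility check. This is illustrative only, not actual HBase code; it only models the HBASE-11591 rule that a cell from a bulk-loaded file is skipped when the scanner's readpoint is below that file's bulk-load seqId:

```java
// Toy model of the broken sequence: after compaction removes cf1's
// bulk-load seqId marker, the two families disagree for the same scanner.
public class ReadpointSketch {
    // A cell from a bulk-loaded file is visible if the file carries no
    // seqId marker, or the scanner's readpoint has reached that seqId.
    static boolean visible(long readPoint, Long bulkLoadSeqId) {
        return bulkLoadSeqId == null || readPoint >= bulkLoadSeqId;
    }

    public static void main(String[] args) {
        long readPoint = 10;   // scanner opened before the bulk load
        Long cf1SeqId = null;  // compaction erased cf1's marker
        Long cf2SeqId = 11L;   // cf2 still carries the bulk-load seqId
        System.out.println(visible(readPoint, cf1SeqId)); // true
        System.out.println(visible(readPoint, cf2SeqId)); // false: row is torn
    }
}
```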



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11869) Support snapshot owner

2014-09-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119416#comment-14119416
 ] 

Hadoop QA commented on HBASE-11869:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12666126/HBASE-11869-trunk-v3.diff
  against trunk revision .
  ATTACHMENT ID: 12666126

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.


{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestRegionRebalancing

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10691//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10691//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10691//console

This message is automatically generated.

 Support snapshot owner
 --

 Key: HBASE-11869
 URL: https://issues.apache.org/jira/browse/HBASE-11869
 Project: HBase
  Issue Type: Improvement
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Minor
 Fix For: 2.0.0

 Attachments: HBASE-11869-trunk-v1.diff, HBASE-11869-trunk-v3.diff


 In the current codebase, table snapshot operations can only be done by a 
 global admin, not by a table admin.
 In a multi-tenant HBase cluster, each table may have a different snapshot 
 policy, e.g. snapshot once per week, or snapshot after new data is 
 imported. 
 We want to delegate snapshot permission to each table admin.
 Following [~mbertozzi]'s suggestion, we implemented the snapshot owner 
 feature:
 * A user with table admin permission can create a snapshot and becomes the 
 owner of that snapshot.
 * The owner of a snapshot can delete and restore it.
 * Only a user with global admin permission can clone a snapshot, since that 
 operation creates a new table.
   
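The ownership rules above amount to a small permission predicate. The sketch below is hypothetical (the names `owner`, `canDelete`, and `canClone` are illustrative, not the actual HBase AccessController API):

```java
// Hypothetical model of the snapshot-owner rules in the description.
public class SnapshotAclSketch {
    // Owner or global admin may delete/restore a snapshot.
    static boolean canDelete(String user, String owner, boolean globalAdmin) {
        return globalAdmin || user.equals(owner);
    }

    // Clone creates a new table, so it stays global-admin only.
    static boolean canClone(boolean globalAdmin) {
        return globalAdmin;
    }

    public static void main(String[] args) {
        System.out.println(canDelete("alice", "alice", false)); // true: owner
        System.out.println(canDelete("bob", "alice", false));   // false
        System.out.println(canClone(false));                    // false
    }
}
```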





[jira] [Commented] (HBASE-11877) Make TableSplit more readable

2014-09-03 Thread Liu Shaohui (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119435#comment-14119435
 ] 

Liu Shaohui commented on HBASE-11877:
-

[~jmspaggi] [~stack]
{quote}
Don't see any issue with that. Have you run it locally? Do you have a copy of 
the output? 
{quote}
The split output from TestCopyTable
{code}
2014-09-03 13:42:55,180 INFO [main] org.apache.hadoop.mapred.MapTask: 
Processing split: HBase table split(table name: testStartStopRow1, start row: 
row1, end row: row2, region location: localhost)
{code}

{quote}
Has it passed the tests? 
{quote}
Yes, all tests passed on my dev machine.
{quote}
Isn't there any risk for those new fields (m_tableName, m_regionLocation) to be null?
{quote}
The append method of StringBuilder and Bytes.toStringBinary both handle a 
null object. If the object is null, the method renders it as the string 
"null".

 Make TableSplit more readable
 -

 Key: HBASE-11877
 URL: https://issues.apache.org/jira/browse/HBASE-11877
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Minor
 Attachments: HBASE-11877-trunk-v1.diff


 When debugging MR jobs reading from an HBase table, it's important to figure 
 out which region a map task is reading from.
 But the table split object is hard to read.
 eg:
 {code}
 2014-09-01 20:58:39,783 INFO [main] org.apache.hadoop.mapred.MapTask: 
 Processing split: lg-hadoop-prc-st40.bj:,0
 {code}
 See: TableSplit.java 
 {code}
   @Override
   public String toString() {
     return m_regionLocation + ":" +
       Bytes.toStringBinary(m_startRow) + "," + Bytes.toStringBinary(m_endRow);
   }
 {code}
 We should make it more readable.
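A more readable toString() could spell out each field, matching the log line quoted in the comments above. This is a sketch only: the field names follow the snippet, but the actual patch may differ, and new String() stands in for Bytes.toStringBinary to keep the example self-contained:

```java
// Illustrative stand-in for TableSplit with a self-describing toString().
public class TableSplitToStringSketch {
    // Sample values taken from the TestCopyTable output quoted above.
    String m_tableName = "testStartStopRow1";
    byte[] m_startRow = "row1".getBytes();
    byte[] m_endRow = "row2".getBytes();
    String m_regionLocation = "localhost";

    @Override
    public String toString() {
        // Name every field so the split is readable in MapTask logs.
        return "HBase table split(table name: " + m_tableName
            + ", start row: " + new String(m_startRow)
            + ", end row: " + new String(m_endRow)
            + ", region location: " + m_regionLocation + ")";
    }

    public static void main(String[] args) {
        System.out.println(new TableSplitToStringSketch());
    }
}
```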





[jira] [Updated] (HBASE-11877) Make TableSplit more readable

2014-09-03 Thread Liu Shaohui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Shaohui updated HBASE-11877:

Attachment: HBASE-11877-trunk-v2.diff

Added a unit test and updated the patch.

 Make TableSplit more readable
 -

 Key: HBASE-11877
 URL: https://issues.apache.org/jira/browse/HBASE-11877
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Minor
 Attachments: HBASE-11877-trunk-v1.diff, HBASE-11877-trunk-v2.diff







[jira] [Updated] (HBASE-11882) Row level consistency may not be maintained with bulk load and compaction

2014-09-03 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-11882:
-
Status: Open  (was: Patch Available)

 Row level consistency may not be maintained with bulk load and compaction
 -

 Key: HBASE-11882
 URL: https://issues.apache.org/jira/browse/HBASE-11882
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.99.0, 2.0.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Critical
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE-11882-master-v1.patch, 
 HBASE-11882-master-v2.patch, TestHRegionServerBulkLoad.java.patch







[jira] [Commented] (HBASE-11882) Row level consistency may not be maintained with bulk load and compaction

2014-09-03 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119439#comment-14119439
 ] 

Jerry He commented on HBASE-11882:
--

Yes, the behavior change from the added readpoint checks for bulkloaded data 
using the seqId is understood.

I did run 'mvn test' after the v2 patch earlier today. The entire run passed 
cleanly, but I lost the result.  I've just kicked off another run, and will 
paste the result here.
I'll also try to trigger a Hadoop QA run here.

 Row level consistency may not be maintained with bulk load and compaction
 -

 Key: HBASE-11882
 URL: https://issues.apache.org/jira/browse/HBASE-11882
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.99.0, 2.0.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Critical
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE-11882-master-v1.patch, 
 HBASE-11882-master-v2.patch, TestHRegionServerBulkLoad.java.patch







[jira] [Updated] (HBASE-11882) Row level consistency may not be maintained with bulk load and compaction

2014-09-03 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-11882:
-
Status: Patch Available  (was: Open)

 Row level consistency may not be maintained with bulk load and compaction
 -

 Key: HBASE-11882
 URL: https://issues.apache.org/jira/browse/HBASE-11882
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.99.0, 2.0.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Critical
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE-11882-master-v1.patch, 
 HBASE-11882-master-v2.patch, TestHRegionServerBulkLoad.java.patch







[jira] [Commented] (HBASE-11869) Support snapshot owner

2014-09-03 Thread Liu Shaohui (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119441#comment-14119441
 ] 

Liu Shaohui commented on HBASE-11869:
-

It seems that the failed test TestRegionRebalancing is unrelated to the 
patch.

 Support snapshot owner
 --

 Key: HBASE-11869
 URL: https://issues.apache.org/jira/browse/HBASE-11869
 Project: HBase
  Issue Type: Improvement
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Minor
 Fix For: 2.0.0

 Attachments: HBASE-11869-trunk-v1.diff, HBASE-11869-trunk-v3.diff







[jira] [Commented] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics

2014-09-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119443#comment-14119443
 ] 

Lars Hofhansl commented on HBASE-11876:
---

Sorry, I was traveling today. I was just finding some time to make a quick 
patch, and it's already done. :)

Belated +1 on the patch then; that's almost exactly how I would have coded it 
up. Should be a nice improvement.


 RegionScanner.nextRaw(...) should not update metrics
 

 Key: HBASE-11876
 URL: https://issues.apache.org/jira/browse/HBASE-11876
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.6
Reporter: Lars Hofhansl
Assignee: Andrew Purtell
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: HBASE-11876-0.98.patch, HBASE-11876.patch, 
 HBASE-11876.patch


 I added RegionScanner.nextRaw(...) to allow smart clients to avoid some 
 of the default work that HBase does, such as {start|stop}RegionOperation 
 and synchronized(scanner) for each row.
 Metrics should follow the same approach. Collecting them per row is expensive, 
 and a caller should have the option to collect them later or to avoid 
 collecting them completely.
 We can also save some cycles in RSRpcServices.scan(...) if we update the 
 metric only once per batch instead of for each row.
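The once-per-batch idea can be modeled with a toy counter. This is a sketch under stated assumptions: `updateReadMetric` is a hypothetical stand-in for a metrics call, not the actual HBase metrics API; it only shows how accumulating locally cuts metric updates from one per row to one per batch:

```java
public class BatchMetricsSketch {
    static long metricUpdates = 0; // how many times the "metric" is touched

    // Hypothetical stand-in for an expensive metrics update.
    static void updateReadMetric(long delta) { metricUpdates++; }

    public static void main(String[] args) {
        int batchSize = 100;

        // Per-row style: one metric update for every row in the batch.
        metricUpdates = 0;
        for (int row = 0; row < batchSize; row++) updateReadMetric(1);
        System.out.println(metricUpdates); // 100

        // Per-batch style: accumulate locally, publish once after the loop.
        metricUpdates = 0;
        long rows = 0;
        for (int row = 0; row < batchSize; row++) rows++;
        updateReadMetric(rows);
        System.out.println(metricUpdates); // 1
    }
}
```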





[jira] [Commented] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics

2014-09-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119445#comment-14119445
 ] 

Hudson commented on HBASE-11876:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #465 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/465/])
HBASE-11876 RegionScanner.nextRaw should not update metrics (apurtell: rev 
23a4181d1ecc5f492c16dc579bff92eef7d209f1)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java


 RegionScanner.nextRaw(...) should not update metrics
 

 Key: HBASE-11876
 URL: https://issues.apache.org/jira/browse/HBASE-11876
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.6
Reporter: Lars Hofhansl
Assignee: Andrew Purtell
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: HBASE-11876-0.98.patch, HBASE-11876.patch, 
 HBASE-11876.patch







[jira] [Commented] (HBASE-11863) WAL files are not archived and stays in the WAL directory after splitting

2014-09-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119444#comment-14119444
 ] 

Hudson commented on HBASE-11863:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #465 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/465/])
HBASE-11863 WAL files are not archived and stays in the WAL directory after 
splitting (enis: rev 7f28fcf429242c549219502bfb7da0ad28753f4c)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java


 WAL files are not archived and stays in the WAL directory after splitting 
 --

 Key: HBASE-11863
 URL: https://issues.apache.org/jira/browse/HBASE-11863
 Project: HBase
  Issue Type: Bug
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Priority: Blocker
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: hbase-11863_v1-0.98.patch, hbase-11863_v1.patch, 
 hbase-11863_v2.patch, hbase-11863_v3-0.98.patch, hbase-11863_v3-0.99.patch, 
 hbase-11863_v3.patch


 In patch HBASE-11094, it seems that we changed the constructor we use 
 for SplitLogManager to one that does not archive the log files to the archive 
 folder after splitting is done. The log files stay in the splitting directory 
 forever and are re-split every time the master restarts. 
 It is surprising that our unit tests have been passing (since 0.94.4) without 
 any issues. Part of the reason is that the split is actually carried out, but 
 the WAL is not moved, and thus the -splitting directory never gets deleted. 
 It seems critical to fix in 0.98.6, [~andrew.purt...@gmail.com] FYI. 





[jira] [Commented] (HBASE-11868) Data loss in hlog when the hdfs is unavailable

2014-09-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119446#comment-14119446
 ] 

Hudson commented on HBASE-11868:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #465 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/465/])
Revert HBASE-11868 Data loss in hlog when the hdfs is unavailable (Liu 
Shaohui) (apurtell: rev ee32706c5d93fb3de6f4aba09174d34ca3879f6d)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java


 Data loss in hlog when the hdfs is unavailable
 --

 Key: HBASE-11868
 URL: https://issues.apache.org/jira/browse/HBASE-11868
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.5
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Blocker
 Fix For: 0.98.6

 Attachments: HBASE-11868-0.98-v1.diff, HBASE-11868-0.98-v2.diff


 When using the new thread model in HBase 0.98, we found a bug which may cause 
 data loss when HDFS is unavailable.
 When writing WAL edits to the hlog in doMiniBatchMutation of HRegion, the 
 hlog first calls appendNoSync to write the edits and then calls sync with the 
 txid. 
 Assume the txid of the current write is 10, the syncedTillHere in the hlog 
 is 9, and the failedTxid is 0. When HDFS is unavailable, the 
 AsyncWriter or AsyncSyncer will fail to append the edits or sync; they 
 will then update syncedTillHere to 10 and failedTxid to 10.
 When the hlog calls sync with txid 10, failedTxid will never be 
 checked, because txid equals syncedTillHere.  The client thinks the write 
 succeeded, but the data was only written to the memstore, not the hlog. If the 
 regionserver goes down later, before the memstore is flushed, the data will be 
 lost.
 See: FSHLog.java #1348
 {code}
   // sync all transactions upto the specified txid
   private void syncer(long txid) throws IOException {
     synchronized (this.syncedTillHere) {
       while (this.syncedTillHere.get() < txid) {
         try {
           this.syncedTillHere.wait();
           if (txid <= this.failedTxid.get()) {
             assert asyncIOE != null :
               "current txid is among(under) failed txids, but asyncIOE is null!";
             throw asyncIOE;
           }
         } catch (InterruptedException e) {
           LOG.debug("interrupted while waiting for notification from AsyncNotifier");
         }
       }
     }
   }
 {code}
 We can fix this issue by moving the comparison of txid and failedTxid outside 
 the while block.
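The shape of the proposed fix can be modeled with a self-contained sketch. This is an illustration, not the actual FSHLog code: AtomicLong fields stand in for the real state, and the scenario hard-codes the case from the description where syncedTillHere already equals the failed txid, so the wait loop never runs:

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

public class SyncerFixSketch {
    final AtomicLong syncedTillHere = new AtomicLong(10); // already at txid
    final AtomicLong failedTxid = new AtomicLong(10);     // txid 10 failed
    final IOException asyncIOE = new IOException("async writer failed");

    // Failure check moved OUTSIDE the wait loop: when syncedTillHere already
    // covers txid, the loop body never runs, yet a failed txid must still throw.
    void syncer(long txid) throws IOException {
        synchronized (syncedTillHere) {
            while (syncedTillHere.get() < txid) {
                try {
                    syncedTillHere.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            if (txid <= failedTxid.get()) {
                throw asyncIOE; // surface the failure instead of silent success
            }
        }
    }

    public static void main(String[] args) {
        SyncerFixSketch s = new SyncerFixSketch();
        boolean threw = false;
        try {
            s.syncer(10);
        } catch (IOException e) {
            threw = true;
        }
        System.out.println(threw); // true: the failure is no longer swallowed
    }
}
```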





[jira] [Commented] (HBASE-11728) Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING

2014-09-03 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119447#comment-14119447
 ] 

ramkrishna.s.vasudevan commented on HBASE-11728:


Should we do this for some of the previous() cases also, as done in the patch? 
Maybe that is the reason for the IT to fail. 
[~bdifn]
Did you get an opportunity to use this patch, and did you still see some data 
loss while scanning?

 Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING
 --

 Key: HBASE-11728
 URL: https://issues.apache.org/jira/browse/HBASE-11728
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.96.1.1, 0.98.4
 Environment: ubuntu12 
 hadoop-2.2.0
 Hbase-0.96.1.1
 SUN-JDK(1.7.0_06-b24)
Reporter: wuchengzhi
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 29cb562fad564b468ea9d61a2d60e8b0, HBASE-11728.patch, 
 HBASE-11728_1.patch, HBASE-11728_2.patch, HBASE-11728_3.patch, 
 HBASE-11728_4.patch, HFileAnalys.java, TestPrefixTree.java

   Original Estimate: 72h
  Remaining Estimate: 72h

 In the Scan case, I prepared some data as below.
 Table desc (using the prefix-tree encoding):
 'prefix_tree_test', {NAME => 'cf_1', DATA_BLOCK_ENCODING => 'PREFIX_TREE', 
 TTL => '15552000'}
 and I put 5 rows:
 (RowKey , Qualifier, Value)
 'a-b-0-0', 'qf_1', 'c1-value'
 'a-b-A-1', 'qf_1', 'c1-value'
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 'a-b-B-2-1402397300-1402416535', 'qf_2', 'c2-value-3'
 So I scanned the row keys between 'a-b-A-1' and 'a-b-A-1:' and got the 
 correct result:
 Test 1: 
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 'a-b-A-1', 'qf_1', 'c1-value'
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 Then I tried adding a column to the scan.
 Test 2:
 Scan scan = new Scan();
 scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_2"));
 scan.setStartRow("a-b-A-1".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 expected:
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 but actually I got nothing. Then I changed the addColumn to 
 scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_1")); and I got the 
 expected result 'a-b-A-1', 'qf_1', 'c1-value' as well.
 Then I did more testing. I updated the case to make the startRow greater 
 than 'a-b-A-1':
 Test 3:
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1-".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 expected:
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 but actually I got nothing again. I made the start row greater than 
 'a-b-A-1-1402329600-1402396277':
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1-140239".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 and I got the expected row as well:
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 So I think it may be a bug in the prefix-tree encoding. It happens after the 
 data is flushed to the storefile; it's fine while the data is in the memstore.





[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-11886:

Status: Patch Available  (was: Open)

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Priority: Critical
 Fix For: 1.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that the table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Because of this, the table creator can't 
 write to the table he just created. I suspect HBASE-11275 introduced the 
 problem.





[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-11886:

Attachment: 11886-1.txt

Thanks for the confirmation [~ram_krish]. The attached patch should fix the 
issue (although I haven't tested it with security ON).






[jira] [Commented] (HBASE-11869) Support snapshot owner

2014-09-03 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119483#comment-14119483
 ] 

Matteo Bertozzi commented on HBASE-11869:
-

+1
[~apurtell] do you want this in 0.98? It changes the ACL logic, but since it 
is not restricting anything it should be compatible.

 Support snapshot owner
 --

 Key: HBASE-11869
 URL: https://issues.apache.org/jira/browse/HBASE-11869
 Project: HBase
  Issue Type: Improvement
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Minor
 Fix For: 2.0.0

 Attachments: HBASE-11869-trunk-v1.diff, HBASE-11869-trunk-v3.diff


 In the current codebase, table snapshot operations can only be done by the 
 global admin, not by the table admin.
 In a multi-tenant HBase cluster, each table may have a different snapshot 
 policy, e.g. snapshot once per week, or snapshot after new data are 
 imported. 
 We want to delegate the snapshot permission to each table admin.
 Following [~mbertozzi]'s suggestion, we implement a snapshot owner 
 feature:
 * A user with table admin permission can create a snapshot and becomes the 
 owner of that snapshot.
 * The owner of a snapshot can delete and restore it.
 * Only a user with global admin permission can clone a snapshot, since that 
 operation creates a new table.
   





[jira] [Commented] (HBASE-11339) HBase MOB

2014-09-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119489#comment-14119489
 ] 

Lars Hofhansl commented on HBASE-11339:
---

bq. Back in June, JingCheng's response to your comments never got feedback on 
how you'd manage the small files problem.
To be fair, my comment itself addressed that by saying small blobs are stored 
by *value* in HBase, and only large blobs in HDFS. We can store a lot of 10MB 
blobs (in the worst case scenario it's 200m x 10MB = 2PB) in HDFS; if that's 
not enough, we can dial up the threshold.

It seems nobody understood what I am suggesting. Depending on use case and data 
distribution you pick a threshold X. Blobs smaller than X are stored 
directly in HBase as a column value. Blobs of size X or larger are stored in 
HDFS with a reference in HBase using the 3-phase approach.
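The threshold routing described above can be sketched in a few lines (a minimal illustration only; the class, method names, and threshold value are hypothetical, not part of any HBase API):

```java
public class BlobRouter {
    // Blobs smaller than the threshold are stored by value in HBase;
    // blobs at or above it go to HDFS, with only a reference kept in HBase.
    static String storagePlan(long blobSize, long thresholdX) {
        return blobSize < thresholdX ? "hbase-by-value" : "hdfs-by-reference";
    }

    public static void main(String[] args) {
        long thresholdX = 10L * 1024 * 1024; // 10MB; dial up or down as needed
        System.out.println(storagePlan(500L * 1024, thresholdX));        // hbase-by-value
        System.out.println(storagePlan(64L * 1024 * 1024, thresholdX));  // hdfs-by-reference
    }
}
```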

bq. there are two HDFS blob + HBase metadata solutions are explicitly mentioned 
in section 4.1.2 (v4 design doc) with pros and cons
True, but as I state the store small blobs by value and only large ones by 
reference solution is not mentioned in there.

bq. The solution you propose is actually the first described hdfs+hbase approach
No it's not... It says either all blobs go into HBase or all blobs go into 
HDFS... See above. Small blobs would be stored directly in HBase, not in HDFS. 
That's key; nobody wants to store 100KB or 1MB files directly in HDFS.

bq. We have total 3 +1s for that Jira after many rounds of review rework. Can 
get it committed tomorrow IST unless objections...?
We won't get this committed until we finish this discussion. So consider this 
my -1 until we finish.

Going by the comments the use case is only 1-5mb files (definitely less than 
64mb), correct? That changes the discussion, but it looks to me that now the 
use case is limited to a single scenario and carefully constructed (200m x 500k 
files) so that this change might be useful. I.e. pick a blob size just right, 
and pick the size distribution of the files just right and this makes sense.

In my approach one can dial the threshold between by-value and by-reference 
storage up or down as needed. And I do not even see a need for M/R.

I do agree with all of following:
* snapshots are harder
* bulk load is harder
* backup/restore/replication is harder

Yet, all that is possible to do with a client only solution and could be 
abstracted there.

I'll also admit that our blob storage tool is not finished, yet, and that for 
its use case we don't need replication or backup as it itself will be the 
backup solution for another very large data store.

Are you guys absolutely... 100%... positive that this cannot be done in any 
other way and has to be done this way? That we cannot store files up to a 
certain size as values in HBase and larger files in HDFS? And there is no good 
threshold value for this?


 HBase MOB
 -

 Key: HBASE-11339
 URL: https://issues.apache.org/jira/browse/HBASE-11339
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Reporter: Jingcheng Du
Assignee: Jingcheng Du
 Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase 
 MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user 
 guide_v2.docx, hbase-11339-in-dev.patch


   It's quite useful to save medium binary data like images and documents 
 in Apache HBase. Unfortunately, directly saving binary MOBs (medium 
 objects) to HBase leads to worse performance because of the frequent splits 
 and compactions.
   In this design, the MOB data are stored in a more efficient way, which 
 keeps high write/read performance and guarantees data consistency in 
 Apache HBase.





[jira] [Reopened] (HBASE-8674) JUnit and Surefire TRUNK-HBASE-2 plugins need a new home

2014-09-03 Thread Dima Spivak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dima Spivak reopened HBASE-8674:


 JUnit and Surefire TRUNK-HBASE-2 plugins need a new home
 

 Key: HBASE-8674
 URL: https://issues.apache.org/jira/browse/HBASE-8674
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.98.0, 0.94.8, 0.95.1
Reporter: Andrew Purtell

 people.apache.org cannot currently host personal or transient Maven repos. 
 {noformat}
 $ curl --connect-timeout 60 -v  
 http://people.apache.org/~garyh/mvn/org/apache/maven/plugins/maven-remote-resources-plugin/1.4/maven-remote-resources-plugin-1.4.pom
 * About to connect() to people.apache.org port 80 (#0)
 *   Trying 140.211.11.9...
 * Connection timed out after 60064 milliseconds
 * Closing connection 0
 curl: (28) Connection timed out after 60064 milliseconds
 {noformat}
 All builds are at the moment broken if the HBase custom junit or surefire 
 jars are not already in cache. Even if this is a temporary condition, we 
 should find a new home for these artifacts, upgrade to versions that include 
 our submitted changes (if any), or fall back to release versions.





[jira] [Commented] (HBASE-8674) JUnit and Surefire TRUNK-HBASE-2 plugins need a new home

2014-09-03 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119497#comment-14119497
 ] 

Dima Spivak commented on HBASE-8674:


It might be worth revisiting this. My builds tonight were slowed to a crawl 
after I had to rebuild my local Maven repository, and I can only assume that 
Gary's personal repo was the culprit, because simply removing references to it 
in the root POM got things speedy again.






[jira] [Commented] (HBASE-11885) Provide a Dockerfile to easily build and run HBase from source

2014-09-03 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119501#comment-14119501
 ] 

Dima Spivak commented on HBASE-11885:
-

I have a working Dockerfile that sets up the necessary HBase dependencies (i.e. 
Maven and Java), clones in the repo, runs {{mvn assembly:single}}, and then 
starts HBase and the HBase shell. The main problem I've run into pertains to 
the Maven step due to 
[HBASE-8674|https://issues.apache.org/jira/browse/HBASE-8674]. Can someone with 
knowledge of why the POMs are in the state that they are in take a look there 
and chime in on whether there is any consequence to simply removing Gary H's 
repo as a dependency?

 Provide a Dockerfile to easily build and run HBase from source
 --

 Key: HBASE-11885
 URL: https://issues.apache.org/jira/browse/HBASE-11885
 Project: HBase
  Issue Type: New Feature
Reporter: Dima Spivak
Assignee: Dima Spivak

 [A recent email to 
 dev@|http://mail-archives.apache.org/mod_mbox/hbase-dev/201408.mbox/%3CCAAef%2BM4q%3Da8Dqxe_EHSFTueY%2BXxz%2BtTe%2BJKsWWbXjhB_Pz7oSA%40mail.gmail.com%3E]
  highlighted the difficulty that new users can face in getting HBase compiled 
 from source and running locally. I'd like to provide a Dockerfile that would 
 allow anyone with Docker running on a machine with a reasonably current Linux 
 kernel to do so with ease.





[jira] [Commented] (HBASE-11868) Data loss in hlog when the hdfs is unavailable

2014-09-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119502#comment-14119502
 ] 

Hudson commented on HBASE-11868:


FAILURE: Integrated in HBase-0.98 #493 (See 
[https://builds.apache.org/job/HBase-0.98/493/])
HBASE-11868 Data loss in hlog when the hdfs is unavailable (Liu Shaohui) 
(apurtell: rev 39771b8f73a6e6eae12e8b3bdb7dd1fe13edc83c)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java


 Data loss in hlog when the hdfs is unavailable
 --

 Key: HBASE-11868
 URL: https://issues.apache.org/jira/browse/HBASE-11868
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.5
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Blocker
 Fix For: 0.98.6

 Attachments: HBASE-11868-0.98-v1.diff, HBASE-11868-0.98-v2.diff


 When using the new thread model in HBase 0.98, we found a bug which may cause 
 data loss when HDFS is unavailable.
 When writing WAL edits to the hlog in doMiniBatchMutation of HRegion, the 
 hlog first calls appendNoSync to write the edits and then calls sync with 
 the txid. 
 Assume the txid of the current write is 10, the syncedTillHere in the hlog 
 is 9, and the failedTxid is 0. When HDFS is unavailable, the 
 AsyncWriter or AsyncSyncer will fail to append the edits or sync, and they 
 will update syncedTillHere to 10 and failedTxid to 10.
 When the hlog then calls sync with txid 10, the failedTxid will never be 
 checked because txid equals syncedTillHere.  The client thinks the write 
 succeeded, but the data was only written to the memstore, not the hlog. If 
 the regionserver goes down later, before the memstore is flushed, the data 
 will be lost.
 See: FSHLog.java #1348
 {code}
    // sync all transactions up to the specified txid
    private void syncer(long txid) throws IOException {
      synchronized (this.syncedTillHere) {
        while (this.syncedTillHere.get() < txid) {
          try {
            this.syncedTillHere.wait();
            if (txid <= this.failedTxid.get()) {
              assert asyncIOE != null :
                "current txid is among(under) failed txids, but asyncIOE is null!";
              throw asyncIOE;
            }
          } catch (InterruptedException e) {
            LOG.debug("interrupted while waiting for notification from AsyncNotifier");
          }
        }
      }
    }
 {code}
 We can fix this issue by moving the comparison of txid and failedTxid outside 
 the while block.
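A simplified, self-contained sketch of that fix follows (using plain java.util.concurrent types, not the actual FSHLog internals, so this is an illustration of the control-flow change rather than the real patch): the failedTxid check moves after the wait loop, so a failed transaction is detected even when syncedTillHere has already advanced to the txid and no waiting occurs.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

class SyncerSketch {
    final AtomicLong syncedTillHere = new AtomicLong(0);
    final AtomicLong failedTxid = new AtomicLong(0);
    volatile IOException asyncIOE; // set by the (hypothetical) async writer on failure

    void syncer(long txid) throws IOException {
        synchronized (syncedTillHere) {
            while (syncedTillHere.get() < txid) {
                try {
                    syncedTillHere.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            // Moved outside the while loop: even when no waiting was needed
            // (syncedTillHere already >= txid), a failed txid is still detected.
            if (txid <= failedTxid.get()) {
                throw asyncIOE != null ? asyncIOE
                    : new IOException("txid " + txid + " is among failed txids");
            }
        }
    }
}
```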





[jira] [Created] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value

2014-09-03 Thread stack (JIRA)
stack created HBASE-11887:
-

 Summary: Memory retention in branch-1; millions of instances of 
LiteralByteString for column qualifier and value
 Key: HBASE-11887
 URL: https://issues.apache.org/jira/browse/HBASE-11887
 Project: HBase
  Issue Type: Bug
  Components: Protobufs
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack


Trying to test branch-1, I run out of mem pretty fast.  Looking at dumps, I see 
too many instances of LiteralByteString.  Seem to be 'qualifiers' and 'values' 
out of pb QualifierValue... and on up to the multi call into the server. Am 
having trouble finding how the retention is being done... Filing issue in 
meantime while work on it.





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119513#comment-14119513
 ] 

Mikhail Antonov commented on HBASE-11165:
-

A side question to folks who recently benchmarked it on big clusters: what's 
the average ratio of HDFS inodes per region you observed? Trying to estimate 
the load the proposed 1M or 50M region setup puts on the NN.
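As a rough back-of-the-envelope sketch of that estimate (all per-region figures below are made-up assumptions for illustration, not measurements from any cluster):

```java
public class NnLoadEstimate {
    // inodes ~= regions * (1 region directory + stores * files per store).
    // The stores-per-region and files-per-store figures are assumptions.
    static long estimateInodes(long regions, long storesPerRegion, long filesPerStore) {
        return regions * (1 + storesPerRegion * filesPerStore);
    }

    public static void main(String[] args) {
        // 2 stores/region, 3 files/store (assumed):
        System.out.println(estimateInodes(1_000_000L, 2, 3));   // 1M regions -> 7000000
        System.out.println(estimateInodes(50_000_000L, 2, 3));  // 50M regions -> 350000000
    }
}
```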

 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
 regions maybe even 50M later.  This issue is about discussing how we will do 
 that (or if not 50M on a cluster, how otherwise we can attain same end).
 More detail to follow.





[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119533#comment-14119533
 ] 

Hadoop QA commented on HBASE-11886:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12666169/11886-1.txt
  against trunk revision .
  ATTACHMENT ID: 12666169

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestRegionRebalancing

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10693//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10693//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10693//console

This message is automatically generated.






[jira] [Commented] (HBASE-11868) Data loss in hlog when the hdfs is unavailable

2014-09-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119664#comment-14119664
 ] 

Hudson commented on HBASE-11868:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #466 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/466/])
HBASE-11868 Data loss in hlog when the hdfs is unavailable (Liu Shaohui) 
(apurtell: rev 39771b8f73a6e6eae12e8b3bdb7dd1fe13edc83c)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java







[jira] [Commented] (HBASE-11877) Make TableSplit more readable

2014-09-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119669#comment-14119669
 ] 

Hadoop QA commented on HBASE-11877:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12666163/HBASE-11877-trunk-v2.diff
  against trunk revision .
  ATTACHMENT ID: 12666163

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
  org.apache.hadoop.hbase.client.TestMultiParallel
  org.apache.hadoop.hbase.TestRegionRebalancing

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10692//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10692//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10692//console

This message is automatically generated.

 Make TableSplit more readable
 -

 Key: HBASE-11877
 URL: https://issues.apache.org/jira/browse/HBASE-11877
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Minor
 Attachments: HBASE-11877-trunk-v1.diff, HBASE-11877-trunk-v2.diff


 When debugging MR jobs reading from an HBase table, it's important to figure 
 out which region a map task is reading from.
 But the table split object is hard to read.
 eg:
 {code}
 2014-09-01 20:58:39,783 INFO [main] org.apache.hadoop.mapred.MapTask: 
 Processing split: lg-hadoop-prc-st40.bj:,0
 {code}
 See: TableSplit.java 
 {code}
    @Override
    public String toString() {
      return m_regionLocation + ":" +
        Bytes.toStringBinary(m_startRow) + "," + Bytes.toStringBinary(m_endRow);
    }
 {code}
 We should make it more readable.
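A sketch of what a more readable toString could look like (the class shape and field names here are illustrative and self-contained, not the actual patch, which would use Bytes.toStringBinary on the row byte arrays):

```java
public class TableSplitSketch {
    private final String regionLocation;
    private final String startRow;
    private final String endRow;

    public TableSplitSketch(String regionLocation, String startRow, String endRow) {
        this.regionLocation = regionLocation;
        this.startRow = startRow;
        this.endRow = endRow;
    }

    @Override
    public String toString() {
        // Label each field so log lines are self-describing.
        return "TableSplit{regionLocation=" + regionLocation
            + ", startRow=" + startRow
            + ", endRow=" + endRow + "}";
    }

    public static void main(String[] args) {
        // prints TableSplit{regionLocation=lg-hadoop-prc-st40.bj, startRow=row-a, endRow=row-z}
        System.out.println(new TableSplitSketch("lg-hadoop-prc-st40.bj", "row-a", "row-z"));
    }
}
```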





[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119705#comment-14119705
 ] 

Ted Yu commented on HBASE-11886:


+1






[jira] [Commented] (HBASE-11339) HBase MOB

2014-09-03 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119741#comment-14119741
 ] 

Jingcheng Du commented on HBASE-11339:
--

Thanks Lars for the comments. [~lhofhansl]

bq. Going by the comments the use case is only 1-5mb files (definitely less 
than 64mb), correct? That changes the discussion, but it looks to me that now 
the use case is limited to a single scenario and carefully constructed (200m x 
500k files) so that this change might be useful. I.e. pick a blob size just 
right, and pick the size distribution of the files just right and this makes 
sense.
The client solution could work well in certain cases with larger blobs, and we 
could try leveraging the current MOB design approach for smaller KV values.
In some usage scenarios the value size is almost fixed, for example pictures 
taken by traffic-bureau cameras, contracts between banks and customers, CT 
(Computed Tomography) records in hospitals, etc. This might be a limited set 
of cases, but it's really useful.
As mentioned, the client solution saves records larger than 10MB to HDFS and 
saves the others directly to HBase. Turning the threshold down makes poor use 
of HDFS in the client solution, so those records end up stored directly in 
HBase anyway.
And even with value sizes less than 10MB, the MOB implementation shows big 
performance improvements over saving those records directly into HBase.

MOB has a threshold as well: a value could be saved either by value or by 
reference depending on this threshold. The default is 100KB now. Users could 
change it, and we also have a compactor to handle it (move the MOB file to 
HBase, and vice versa).

As Jon said, we'll revamp the mob compaction and get rid of the MR dependency.

bq. Yet, all that is possible to do with a client only solution and could be 
abstracted there.
Implementing snapshots and replication in the client solution is harder and 
brings complexity to the client solution as well. Keeping consistency between 
HBase and the HDFS files during replication is a problem.
Implementing this on the server side is a little easier: MOB includes an 
implementation of snapshots, and it supports replication naturally because 
the MOB data are saved in the WAL.

bq. (Subjectively) I do not like the complexity of this as seen by the various 
discussions here. That part is just my $0.02 of course.
Yes, it's complex, but the features are meaningful and valuable.
The patches provide read/write, compaction, snapshot, and sweep features for 
MOB files. Even if HBase decides to implement a streaming feature in the 
future, the read, compaction, and snapshot parts would probably still be 
useful.

Thanks!







[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119832#comment-14119832
 ] 

Andrey Stepachev commented on HBASE-11165:
--

Just thinking, has anyone tried to measure how meta uses memory, and whether 
that usage could be reduced? It is interesting why the NN is able to handle 
much more data in memory while the HMaster can't.

 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to 1M 
 regions, maybe even 50M later.  This issue is about discussing how we will do 
 that (or, if not 50M on a cluster, how otherwise we can attain the same end).
 More detail to follow.





[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119868#comment-14119868
 ] 

Andrew Purtell commented on HBASE-11886:


Thanks for finding this. +1

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Priority: Critical
 Fix For: 1.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that the table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Because of this, the table creator can't 
 write to the table he just created. I suspect HBASE-11275 introduced the 
 problem.





[jira] [Commented] (HBASE-11339) HBase MOB

2014-09-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119894#comment-14119894
 ] 

Andrew Purtell commented on HBASE-11339:


bq. As Jon said, we'll revamp the mob compaction and get rid of the MR 
dependency.

Please. I don't think we should ever ship a release with a dependency on MR for 
core function. Committing this to trunk in stages could be ok, as long as we do 
not attempt a release including the feature before MOB compaction is handled 
natively. 

 HBase MOB
 -

 Key: HBASE-11339
 URL: https://issues.apache.org/jira/browse/HBASE-11339
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Reporter: Jingcheng Du
Assignee: Jingcheng Du
 Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase 
 MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user 
 guide_v2.docx, hbase-11339-in-dev.patch


   It's quite useful to save medium-sized binary data like images and 
 documents into Apache HBase. Unfortunately, directly saving binary MOBs 
 (medium objects) to HBase leads to worse performance because of frequent 
 splits and compactions.
   In this design, the MOB data are stored in a more efficient way, which 
 keeps high write/read performance and guarantees data consistency in 
 Apache HBase.





[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119871#comment-14119871
 ] 

Andrew Purtell commented on HBASE-11886:


Any chance for a short unit test that confirms the fix? Then we won't regress 
here again. Thanks. 

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Priority: Critical
 Fix For: 1.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that the table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Because of this, the table creator can't 
 write to the table he just created. I suspect HBASE-11275 introduced the 
 problem.





[jira] [Commented] (HBASE-11878) TestVisibilityLabelsWithDistributedLogReplay#testAddVisibilityLabelsOnRSRestart sometimes fails due to VisibilityController not yet initialized

2014-09-03 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119889#comment-14119889
 ] 

Jean-Marc Spaggiari commented on HBASE-11878:
-

FYI,

I re-ran that on 4 servers using the 0.98 RC1 src package.

This specific test passed on 3 servers and failed with a timeout on one of them:

{code}
Tests in error: 
  
testAddVisibilityLabelsOnRSRestart(org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDistributedLogReplay):
 test timed out after 6 milliseconds
{code}

 TestVisibilityLabelsWithDistributedLogReplay#testAddVisibilityLabelsOnRSRestart
  sometimes fails due to VisibilityController not yet initialized
 ---

 Key: HBASE-11878
 URL: https://issues.apache.org/jira/browse/HBASE-11878
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11878-v1.txt, 11878-v2.txt, 11878-v2.txt, 11878-v3.txt, 
 11878-v4.txt, 11878-v5.txt


 In the thread w.r.t. first RC of 0.98.6, 
 http://search-hadoop.com/m/DHED4p2rw81 , Jean-Marc reported that 
 TestVisibilityLabelsWithDistributedLogReplay#testAddVisibilityLabelsOnRSRestart
  sometimes failed on his machines.
 From 
 http://server.distparser.com:81/hbase/with_teds_patch2/hbasetest1/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDistributedLogReplay-output.txt
  :
 {code}
 result {
   exception {
 name: org.apache.hadoop.hbase.coprocessor.CoprocessorException
 value: org.apache.hadoop.hbase.coprocessor.CoprocessorException: 
 VisibilityController not yet initialized\n\tat 
 org.apache.hadoop.hbase.security.visibility.VisibilityController.addLabels(VisibilityController.java:638)\n\tat
  
 org.apache.hadoop.hbase.protobuf.generated.VisibilityLabelsProtos$VisibilityLabelsService$1.addLabels(VisibilityLabelsProtos.java:5014)\n\tat
  
 org.apache.hadoop.hbase.protobuf.generated.VisibilityLabelsProtos$VisibilityLabelsService.callMethod(VisibilityLabelsProtos.java:5178)\n\tat
  
 org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:5794)\n\tat
  
 org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1608)\n\tat
  
 org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1590)\n\tat
  
 org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:30088)\n\tat
  org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2014)\n\tat 
 org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)\n\tat 
 org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)\n\tat
  org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)\n\tat 
 java.lang.Thread.run(Thread.java:744)\n
   }
 }
 {code}
 The above exception revealed a race condition: writing of labels ABC and 
 XYZ took place when the VisibilityController was not yet initialized.
 The test writes the labels only once, leading to an assertion failure.





[jira] [Reopened] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics

2014-09-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell reopened HBASE-11876:


The trunk change passed precommit but the 0.98 version requires an addendum to 
fix a failing test, see 
https://builds.apache.org/job/HBase-0.98/493/testReport/org.apache.hadoop.hbase.regionserver/TestRegionServerMetrics/testScanNext/

 RegionScanner.nextRaw(...) should not update metrics
 

 Key: HBASE-11876
 URL: https://issues.apache.org/jira/browse/HBASE-11876
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.6
Reporter: Lars Hofhansl
Assignee: Andrew Purtell
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: HBASE-11876-0.98.patch, HBASE-11876.patch, 
 HBASE-11876.patch


 I added RegionScanner.nextRaw(...) to allow smart clients to avoid some 
 of the default work that HBase does, such as {start|stop}RegionOperation 
 and synchronized(scanner) for each row.
 Metrics should follow the same approach: collecting them per row is expensive, 
 and a caller should have the option to collect them later or to avoid 
 collecting them completely.
 We can also save some cycles in RSRpcServices.scan(...) if we updated the 
 metric only once per batch instead of for each row.





[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11886:
---
Assignee: Devaraj Das

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 1.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that the table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Because of this, the table creator can't 
 write to the table he just created. I suspect HBASE-11275 introduced the 
 problem.





[jira] [Commented] (HBASE-11885) Provide a Dockerfile to easily build and run HBase from source

2014-09-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119928#comment-14119928
 ] 

Andrew Purtell commented on HBASE-11885:


[~nkeywal]

 Provide a Dockerfile to easily build and run HBase from source
 --

 Key: HBASE-11885
 URL: https://issues.apache.org/jira/browse/HBASE-11885
 Project: HBase
  Issue Type: New Feature
Reporter: Dima Spivak
Assignee: Dima Spivak

 [A recent email to 
 dev@|http://mail-archives.apache.org/mod_mbox/hbase-dev/201408.mbox/%3CCAAef%2BM4q%3Da8Dqxe_EHSFTueY%2BXxz%2BtTe%2BJKsWWbXjhB_Pz7oSA%40mail.gmail.com%3E]
  highlighted the difficulty that new users can face in getting HBase compiled 
 from source and running locally. I'd like to provide a Dockerfile that would 
 allow anyone with Docker running on a machine with a reasonably current Linux 
 kernel to do so with ease.





[jira] [Comment Edited] (HBASE-11760) Tighten up region state transition

2014-09-03 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14112342#comment-14112342
 ] 

Jimmy Xiang edited comment on HBASE-11760 at 9/3/14 3:28 PM:
-

Patch v2.1 is on RB: https://reviews.apache.org/r/25299/


was (Author: jxiang):
Patch v2 is on RB: https://reviews.apache.org/r/25099/

 Tighten up region state transition
 --

 Key: HBASE-11760
 URL: https://issues.apache.org/jira/browse/HBASE-11760
 Project: HBase
  Issue Type: Improvement
  Components: Region Assignment
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 2.0.0

 Attachments: hbase-11760.patch, hbase-11760_2.patch


 When a regionserver reports to master a region transition, we should check 
 the current region state to be exactly what we expect.





[jira] [Updated] (HBASE-11760) Tighten up region state transition

2014-09-03 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-11760:

Attachment: hbase-11760_2.1.patch

Attached v2.1 that's rebased to master latest. RB: 
https://reviews.apache.org/r/25299/

 Tighten up region state transition
 --

 Key: HBASE-11760
 URL: https://issues.apache.org/jira/browse/HBASE-11760
 Project: HBase
  Issue Type: Improvement
  Components: Region Assignment
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 2.0.0

 Attachments: hbase-11760.patch, hbase-11760_2.1.patch, 
 hbase-11760_2.patch


 When a regionserver reports to master a region transition, we should check 
 the current region state to be exactly what we expect.





[jira] [Commented] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics

2014-09-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119973#comment-14119973
 ] 

Andrew Purtell commented on HBASE-11876:


There are two code paths in 0.98 which call nextRaw from scanning code. One 
isn't found by Eclipse reference search but grep did the trick. Will revert 
previous commit and push the fixed version as soon as local tests check out.

 RegionScanner.nextRaw(...) should not update metrics
 

 Key: HBASE-11876
 URL: https://issues.apache.org/jira/browse/HBASE-11876
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.6
Reporter: Lars Hofhansl
Assignee: Andrew Purtell
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: HBASE-11876-0.98.patch, HBASE-11876.patch, 
 HBASE-11876.patch


 I added RegionScanner.nextRaw(...) to allow smart clients to avoid some 
 of the default work that HBase does, such as {start|stop}RegionOperation 
 and synchronized(scanner) for each row.
 Metrics should follow the same approach: collecting them per row is expensive, 
 and a caller should have the option to collect them later or to avoid 
 collecting them completely.
 We can also save some cycles in RSRpcServices.scan(...) if we updated the 
 metric only once per batch instead of for each row.





[jira] [Updated] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics

2014-09-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11876:
---
Attachment: HBASE-11876-0.98.patch

 RegionScanner.nextRaw(...) should not update metrics
 

 Key: HBASE-11876
 URL: https://issues.apache.org/jira/browse/HBASE-11876
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.6
Reporter: Lars Hofhansl
Assignee: Andrew Purtell
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: HBASE-11876-0.98.patch, HBASE-11876-0.98.patch, 
 HBASE-11876.patch, HBASE-11876.patch


 I added RegionScanner.nextRaw(...) to allow smart clients to avoid some 
 of the default work that HBase does, such as {start|stop}RegionOperation 
 and synchronized(scanner) for each row.
 Metrics should follow the same approach: collecting them per row is expensive, 
 and a caller should have the option to collect them later or to avoid 
 collecting them completely.
 We can also save some cycles in RSRpcServices.scan(...) if we updated the 
 metric only once per batch instead of for each row.





[jira] [Commented] (HBASE-11805) KeyValue to Cell Convert in WALEdit APIs

2014-09-03 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119994#comment-14119994
 ] 

Anoop Sam John commented on HBASE-11805:


Thanks Stack.
{quote}
Do we have to do the below?
{code}
- for (KeyValue kv : value.getKeyValues()) {
+ for (Cell cell : value.getCells()) {
+   KeyValue kv = KeyValueUtil.ensureKeyValue(cell);
{code}
{quote}
This is in WALPlayer#HLogKeyValueMapper. The mapper output value is KeyValue. 
We can change this to Cell as well, but I thought of doing all of that as part 
of HBASE-11871, with one subtask to remove ensureKeyValue from the MR tools. 
Sounds ok?

{quote}
This one is a bit odd Anoop...
{code}
-for (KeyValue kv: kvs) {
-  size += kv.getLength();
+for (Cell cell: cells) {
+  size += KeyValueUtil.length(cell);
 }
{code}
Using a KeyValueUtil on a cell?
{quote}
We have CellUtil#estimatedSizeOf(), but that is not the KV length; it is 
length + SIZEOF_INT, the number of bytes a Cell takes when serialized to the 
Encoder. The KeyValueUtil method already existed and is used by Prefix Tree as 
well. Maybe we will add length() in CellUtil, the same way as in KeyValueUtil, 
and make the code use that.
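To make the distinction concrete, here is a minimal sketch of the two size notions. The arithmetic mirrors the "length + SIZEOF_INT" relationship described above, but the method names and the two-int KeyValue framing overhead are illustrative, not the exact HBase signatures.

```java
// Illustrative sketch of the two size notions: a KeyValue-style
// serialized length versus what the Encoder accounts for per Cell.
public class CellSizeSketch {
    static final int SIZEOF_INT = 4;

    // A KeyValue's serialized form starts with two int length fields
    // (key length, value length) followed by the key and value bytes.
    static int length(int keyLength, int valueLength) {
        return 2 * SIZEOF_INT + keyLength + valueLength;
    }

    // estimatedSizeOf-style accounting: the serialized bytes plus one
    // extra int prefix written when serializing to the Encoder.
    static int estimatedSizeOf(int keyLength, int valueLength) {
        return length(keyLength, valueLength) + SIZEOF_INT;
    }

    public static void main(String[] args) {
        System.out.println(length(24, 100));          // 132
        System.out.println(estimatedSizeOf(24, 100)); // 136
    }
}
```

The invariant is simply estimatedSizeOf = length + SIZEOF_INT, which is why the two must not be used interchangeably when summing WALEdit sizes.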


 KeyValue to Cell Convert in WALEdit APIs
 

 Key: HBASE-11805
 URL: https://issues.apache.org/jira/browse/HBASE-11805
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.99.0, 2.0.0, 0.98.7

 Attachments: HBASE-11805.patch


 In almost all other main interface classes/APIs we have changed KeyValue to 
 Cell, but it is missing in WALEdit. This is marked public for Replication 
 (well, it should be for CPs also). 
 These 2 APIs deal with KVs:
 add(KeyValue kv)
 ArrayList<KeyValue> getKeyValues()
 Suggest deprecating them and, for 0.98, adding
 add(Cell kv) 
 List<Cell> getCells()
 and just replacing them from 1.0.





[jira] [Created] (HBASE-11888) Add per region flush count

2014-09-03 Thread Elliott Clark (JIRA)
Elliott Clark created HBASE-11888:
-

 Summary: Add per region flush count
 Key: HBASE-11888
 URL: https://issues.apache.org/jira/browse/HBASE-11888
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Elliott Clark


When debugging a workload that ran overnight, it's hard to tell if a region 
flushed a lot and got compacted, or flushed fewer times and then got compacted. 
We should have a counter for that.





[jira] [Commented] (HBASE-11882) Row level consistency may not be maintained with bulk load and compaction

2014-09-03 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120039#comment-14120039
 ] 

Jerry He commented on HBASE-11882:
--

There is still no Hadoop QA run. 
Here is my local 'mvn test' run result with patch v2:
{code}
[INFO] Reactor Summary:
[INFO]
[INFO] HBase . SUCCESS [  1.793 s]
[INFO] HBase - Common  SUCCESS [ 50.103 s]
[INFO] HBase - Protocol .. SUCCESS [  0.073 s]
[INFO] HBase - Client  SUCCESS [01:02 min]
[INFO] HBase - Hadoop Compatibility .. SUCCESS [  8.157 s]
[INFO] HBase - Hadoop Two Compatibility .. SUCCESS [  5.973 s]
[INFO] HBase - Prefix Tree ... SUCCESS [  8.950 s]
[INFO] HBase - Server  SUCCESS [57:32 min]
[INFO] HBase - Testing Util .. SUCCESS [  1.101 s]
[INFO] HBase - Thrift  SUCCESS [02:08 min]
[INFO] HBase - Shell . SUCCESS [  1.586 s]
[INFO] HBase - Integration Tests . SUCCESS [  0.516 s]
[INFO] HBase - Examples .. SUCCESS [  1.916 s]
[INFO] HBase - Assembly .. SUCCESS [  1.045 s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 01:02 h
[INFO] Finished at: 2014-09-03T00:15:35-08:00
[INFO] Final Memory: 46M/273M
{code}

 Row level consistency may not be maintained with bulk load and compaction
 -

 Key: HBASE-11882
 URL: https://issues.apache.org/jira/browse/HBASE-11882
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.99.0, 2.0.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Critical
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE-11882-master-v1.patch, 
 HBASE-11882-master-v2.patch, TestHRegionServerBulkLoad.java.patch


 While looking into the TestHRegionServerBulkLoad failure for HBASE-11772, I 
 found the root cause is that row-level atomicity may not be maintained with 
 bulk load together with compaction.
 TestHRegionServerBulkLoad is used to test bulk load atomicity. The test uses 
 multiple threads to bulk load and scan continuously, and to run compactions 
 periodically. 
 It verifies that row-level data is always consistent across column families.
 After HBASE-11591, we added readpoint checks for bulk-loaded data using the 
 seqId at the time of the bulk load. Now a scanner will not see the data from a 
 bulk load if the scanner's readpoint is earlier than the bulk load seqId.
 Previously, the atomic bulk load result was visible immediately to all 
 scanners.
 The problem is with compaction after bulk load. Compaction does not lock the 
 region, and it is done one store (column family) at a time. It also compacts 
 away the seqId marker of the bulk load.
 Here is an event sequence where the row-level consistency is broken.
 1. A scanner is started to scan a region with cf1 and cf2. The readpoint is 
 10.
 2. There is a bulk load that loads into cf1 and cf2. The bulk load seqId is 
 11. Bulk load is guarded by the region write lock, so it is atomic.
 3. There is a compaction that compacts cf1. It compacts away the seqId marker 
 of the bulk load.
 4. The scanner advances (next) to row-1001. It gets the bulk load data for 
 cf1 since there is no seqId preventing it. It does not get the bulk load data 
 for cf2 since the scanner's readpoint (10) is less than the bulk load seqId 
 (11).
 Now the row-level consistency is broken in this case.
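The four steps above can be reduced to a toy visibility check. The class and helper below are illustrative only, not actual HBase internals; losing the seqId marker during compaction is modeled as the file's seqId dropping to 0.

```java
// Toy sketch (not real HBase code) of the readpoint-vs-seqId check
// described above, and of how per-family compaction breaks it.
public class BulkLoadVisibilitySketch {

    // A bulk-loaded file carries a seqId marker; after compaction the
    // marker is gone, modeled here as seqId 0 (always visible).
    static boolean isVisible(long scannerReadPoint, long fileSeqId) {
        return scannerReadPoint >= fileSeqId;
    }

    public static void main(String[] args) {
        long readPoint = 10;  // scanner opened before the bulk load
        long bulkSeqId = 11;  // seqId assigned to the bulk load

        // cf2 still has its marker: bulk data is correctly hidden.
        boolean cf2Visible = isVisible(readPoint, bulkSeqId);

        // cf1 was compacted, losing the marker: bulk data leaks through.
        boolean cf1Visible = isVisible(readPoint, 0);

        // cf1 shows new data while cf2 shows old data for the same row.
        System.out.println("cf1=" + cf1Visible + " cf2=" + cf2Visible);
    }
}
```

The inconsistency is exactly the mismatch between the two booleans: the same scanner sees post-bulk-load data in one family and pre-bulk-load data in the other.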





[jira] [Commented] (HBASE-11853) Provide an alternative to the apache build for developers (like me) who aren't committers

2014-09-03 Thread Alex Newman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120063#comment-14120063
 ] 

Alex Newman commented on HBASE-11853:
-

[~apurtell] So another problem. It seems as though the tests in the compat 
directories aren't categorized at all. In fact they don't pull in hbase-common. 
Perhaps we could do packages which are not categorized at all in another commit?

 Provide an alternative to the apache build for developers (like me) who 
 aren't committers
 -

 Key: HBASE-11853
 URL: https://issues.apache.org/jira/browse/HBASE-11853
 Project: HBase
  Issue Type: Bug
Reporter: Alex Newman
Assignee: Alex Newman
 Attachments: HBASE-11853-testing-v0.patch, 
 HBASE-11853-testing-v1.patch, HBASE-11853-v3.patch


 Travis CI and Circle-CI now provide free builds for open source projects. I 
 created the capability to run builds this way. Although they are closed 
 source (and thus not a replacement for jenkins IMHO), they are super 
 convenient. 





[jira] [Resolved] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics

2014-09-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-11876.

Resolution: Fixed

Pushed to 0.98. All hbase-server tests pass locally.

 RegionScanner.nextRaw(...) should not update metrics
 

 Key: HBASE-11876
 URL: https://issues.apache.org/jira/browse/HBASE-11876
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.6
Reporter: Lars Hofhansl
Assignee: Andrew Purtell
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: HBASE-11876-0.98.patch, HBASE-11876-0.98.patch, 
 HBASE-11876.patch, HBASE-11876.patch


 I added RegionScanner.nextRaw(...) to allow smart clients to avoid some 
 of the default work that HBase does, such as {start|stop}RegionOperation 
 and synchronized(scanner) for each row.
 Metrics should follow the same approach: collecting them per row is expensive, 
 and a caller should have the option to collect them later or to avoid 
 collecting them completely.
 We can also save some cycles in RSRpcServices.scan(...) if we updated the 
 metric only once per batch instead of for each row.





[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120070#comment-14120070
 ] 

Andrew Purtell commented on HBASE-11886:


bq. Any chance for a short unit test that confirms the fix? Then we won't 
regress here again. Thanks.

I've got time to do this. Working on it now. 

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 1.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that the table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Because of this, the table creator can't 
 write to the table he just created. I suspect HBASE-11275 introduced the 
 problem.





[jira] [Commented] (HBASE-11805) KeyValue to Cell Convert in WALEdit APIs

2014-09-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120074#comment-14120074
 ] 

stack commented on HBASE-11805:
---

Sounds reasonable. +1

 KeyValue to Cell Convert in WALEdit APIs
 

 Key: HBASE-11805
 URL: https://issues.apache.org/jira/browse/HBASE-11805
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.99.0, 2.0.0, 0.98.7

 Attachments: HBASE-11805.patch


 In almost all other main interface classes/APIs we have changed KeyValue to 
 Cell, but it is missing in WALEdit. This is marked public for Replication 
 (well, it should be for CPs also). 
 These 2 APIs deal with KVs:
 add(KeyValue kv)
 ArrayList<KeyValue> getKeyValues()
 Suggest deprecating them and, for 0.98, adding
 add(Cell kv) 
 List<Cell> getCells()
 and just replacing them from 1.0.





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120086#comment-14120086
 ] 

stack commented on HBASE-11165:
---

bq.  I mean, seems tricky to make it be the same code path?

Yeah. One of the code paths will 'suffer' neglect.

[~mantonov] I like your cold backup - warm - hot - active-active progression; 
let me try to put together a bit of a summary of the findings and arguments so far.


bq. ...and reduce usage of memory

Yeah, we'll have to go this route if we are trying to keep the state of a big 
cluster in heap.  We could work on making the representation more compact.  Are 
you arguing for a single meta region then, [~octo47]? There is also the on-HDFS 
size to consider (write amplification) and the r/w I/Os.

 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to 1M 
 regions, maybe even 50M later.  This issue is about discussing how we will do 
 that (or, if not 50M on a cluster, how otherwise we can attain the same end).
 More detail to follow.





[jira] [Updated] (HBASE-11888) Add per region flush count

2014-09-03 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-11888:
--
Priority: Minor  (was: Major)

 Add per region flush count
 --

 Key: HBASE-11888
 URL: https://issues.apache.org/jira/browse/HBASE-11888
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor

 When debugging a workload that ran overnight, it's hard to tell if a region 
 flushed a lot and got compacted, or flushed fewer times and then got 
 compacted.  We should have a counter for that.





[jira] [Commented] (HBASE-11339) HBase MOB

2014-09-03 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120094#comment-14120094
 ] 

Jonathan Hsieh commented on HBASE-11339:


Re: [~lhofhansl]

bq. To be fair, my comment itself addressed that by saying small blobs are 
stored by value in HBase, and only large blobs in HDFS. We can store a lot of 
10MB blobs (in the worst-case scenario it's 200m x 10mb = 2pb) in HDFS, and if 
that's not enough, we can dial up the threshold.
bq. It seems nobody understood what I am suggesting. Depending on use case and 
data distribution you pick a threshold X. Blobs with a size of < X are stored 
directly in HBase as a column value. Blobs >= X are stored in HDFS with a 
reference in HBase using the 3-phase approach.

The MOB solution we're espousing does not preclude the hybrid hdfs+hbase 
approach - that could still be used for objects that are larger than, or 
approach, the hdfs block size.  Our claim is that the mob approach is 
complementary to a proper streaming-api-based hdfs+hbase mechanism for large 
objects.  

Operationally, the MOB design is similar -- depending on use case and data 
distribution you pick a threshold X on each column family.  Blobs with a size 
of < X are stored directly in HBase as a column value.  Blobs >= X are stored 
in the MOB area with a reference in HBase using the on-flush/on-compaction 
approach. If the blob is larger than the ~10MB default [1], it is rejected. 

With the MOB design, if the threshold X performs poorly, then you can alter 
table the X value and the next major compaction will shift values between the 
MOB area and the normal hbase regions.  With the HDFS+HBase approach, would we 
need a new mechanism to shift data between hdfs and hbase? Is there a simple 
tuning/migration story?

bq. True, but as I state, the "store small blobs by value and only large ones 
by reference" solution is not mentioned in there.

bq. No it's not... It says either all blobs go into HBase or all blobs go into 
HDFS... See above. Small blobs would be stored directly in HBase, not in HDFS. 
That's key; nobody wants to store 100k or 1mb files directly in HDFS.

I'm confused.  In the Section 4.1.2 part, this split was assumed and the 
different mechanisms were for handling the large ones.  The discussions earlier 
in the jira explicitly added threshold sizes to separate them when the value or 
reference implementations are used.

For people that want to put a lot of 100k or 1mb objects in hbase there are 
many problems that arise, and this mob feature is an approach to make this 
valid (according to the defaults) workload work better and more predictably.  
The mob design says store small blobs by value, moderate blobs by reference 
(with the data in the mob area), and maintains that hbase is not for large 
objects [1]. 

bq. Yet, all that is possible to do with a client only solution and could be 
abstracted there.
bq. I'll also admit that our blob storage tool is not finished, yet, and that 
for its use case we don't need replication or backup as it itself will be the 
backup solution for another very large data store.
bq. Are you guys absolutely... 100%... positive that this cannot be done in any 
other way and has to be done this way? That we cannot store files up to a 
certain size as values in HBase and larger files in HDFS? And there is not good 
threshold value for this?

I don't think that asking whether this is the only way something could be done 
is the right question.  There are always many ways to get a piece of 
functionality -- we've presented a few other potential solutions, and have 
chosen and are justifying a design considering many of the tradeoffs.  We 
presented a need, a design, an early implementation, and evidence of a 
deployment and other potential use cases.

The hybrid hdfs-hbase approach is one of the alternatives. I believe we agree 
that there will be some complexity introduced with that approach dealing with 
atomicity, bulk load, security, backup, replication and potentially tuning.  We 
have enough detail from the discussion to handle atomicity, there are open 
questions with the others.  It is hard to claim a feature is production-ready 
if we don't have a relatively simple mechanism for backups and disaster 
recovery.  At some future point, when the hybrid hdfs+hbase system gets open 
sourced along with the tooling that internalizes its operational complexities, 
I think it would be a fine addition to hbase. 

Rough thresholds would be: 0-100k, hbase by value; 100k-10MB, hbase by MOB; 
10MB+, hbase by reference to hdfs.
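That size-based routing can be sketched as a simple check. This is an 
illustrative sketch only: the class and constant names are assumptions, and in 
the real MOB design the threshold is a per-column-family setting rather than a 
static constant.

```java
class BlobRouter {
    // X: blobs below this size are stored directly in HBase as column values.
    static final long MOB_THRESHOLD = 100L * 1024;
    // ~10MB default maximum Cell size [1]; anything at or above is rejected.
    static final long MAX_CELL_SIZE = 10L * 1024 * 1024;

    enum Placement { HBASE_VALUE, MOB_AREA, REJECT }

    static Placement place(long blobSize) {
        if (blobSize >= MAX_CELL_SIZE) {
            return Placement.REJECT;       // too large even for the MOB area
        } else if (blobSize < MOB_THRESHOLD) {
            return Placement.HBASE_VALUE;  // small blob: stored by value
        } else {
            return Placement.MOB_AREA;     // moderate blob: reference in HBase,
        }                                  // data in the MOB area
    }
}
```

Altering X on the table just moves the boundary in this check; the next major 
compaction then migrates existing values to the matching side.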

[1] Today the default Cell size max is ~10MB. 
https://github.com/apache/hbase/blob/master/hbase-common/src/main/resources/hbase-default.xml#L530


 HBase MOB
 -

 Key: HBASE-11339
 URL: https://issues.apache.org/jira/browse/HBASE-11339
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
  

[jira] [Updated] (HBASE-11888) Add per region flush count

2014-09-03 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-11888:
--
Issue Type: Improvement  (was: Bug)

 Add per region flush count
 --

 Key: HBASE-11888
 URL: https://issues.apache.org/jira/browse/HBASE-11888
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark

 Debugging a workload that ran overnight, it's hard to tell if a region flushed 
 a lot and got compacted, or flushed fewer times and then got compacted.  We 
 should have a counter for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120106#comment-14120106
 ] 

Devaraj Das commented on HBASE-11886:
-

Thanks, [~apurtell], you beat me to it.

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 1.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that the table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I suspect HBASE-11275 introduced the 
 problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11760) Tighten up region state transition

2014-09-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120108#comment-14120108
 ] 

Hadoop QA commented on HBASE-11760:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12666245/hbase-11760_2.1.patch
  against trunk revision .
  ATTACHMENT ID: 12666245

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.replication.regionserver.TestReplicationThrottler

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10694//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10694//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10694//console

This message is automatically generated.

 Tighten up region state transition
 --

 Key: HBASE-11760
 URL: https://issues.apache.org/jira/browse/HBASE-11760
 Project: HBase
  Issue Type: Improvement
  Components: Region Assignment
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 2.0.0

 Attachments: hbase-11760.patch, hbase-11760_2.1.patch, 
 hbase-11760_2.patch


 When a regionserver reports to master a region transition, we should check 
 the current region state to be exactly what we expect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value

2014-09-03 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11887:
--
Attachment: Screen Shot 2014-09-03 at 10.18.58 AM.png

Here is what it looks like.  This is a small heap.  It has 20M references to 
qualifier/value after running a relatively short time.

 Memory retention in branch-1; millions of instances of LiteralByteString for 
 column qualifier and value
 ---

 Key: HBASE-11887
 URL: https://issues.apache.org/jira/browse/HBASE-11887
 Project: HBase
  Issue Type: Bug
  Components: Protobufs
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
 Attachments: Screen Shot 2014-09-03 at 10.18.58 AM.png


 Trying to test branch-1, I run out of mem pretty fast.  Looking at dumps, I 
 see too many instances of LiteralByteString.  Seem to be 'qualifiers' and 
 'values' out of pb QualifierValue... and on up to the multi call into the 
 server. Am having trouble finding how the retention is being done... Filing 
 issue in the meantime while I work on it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11339) HBase MOB

2014-09-03 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120118#comment-14120118
 ] 

Jonathan Hsieh commented on HBASE-11339:


re: [~apurtell]

bq. Please. I don't think we should ever ship a release with a dependency on MR 
for core function. Committing this to trunk in stages could be ok, as long as 
we do not attempt a release including the feature before MOB compaction is 
handled natively.

I agree -- moreover, ideally hbase should not need external processes except 
for hdfs/zk.  

However, there is what should be, what has happened, and what does happen.  
In these cases we have ended up marking features experimental.  There are many 
examples of features in core hbase that shipped in stable releases and that 
still require external processes and may have no demonstrated users.  You'd 
have to go back a bit to find one that explicitly depended on MR, but they did 
exist (e.g. pre dist log splitting we had an MR based log replay -- useful in 
avoiding 10 hr recovery downtimes).  This would be a good discussion topic for 
an upcoming PMC meeting.

What is your definition of stages? Do you mean a patch at a time, or something 
more like: stage one with external compactions, stage two with internal 
compactions?  For this MOB feature, we would have the experimental tag while we 
had external compactions, and it would remain until we remove the external 
dependencies and the compaction hardens with fault testing.  Given our current 
cadence, we should be able to have this completed as part of the hbase 
1.99/2.0 line's timeframe.

 HBase MOB
 -

 Key: HBASE-11339
 URL: https://issues.apache.org/jira/browse/HBASE-11339
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Reporter: Jingcheng Du
Assignee: Jingcheng Du
 Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase 
 MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user 
 guide_v2.docx, hbase-11339-in-dev.patch


   It's quite useful to save medium-sized binary data like images and documents 
  into Apache HBase. Unfortunately, directly saving binary MOB (medium 
  object) data to HBase leads to worse performance because of the frequent 
  splits and compactions.
   In this design, the MOB data are stored in a more efficient way, which 
  keeps high write/read performance and guarantees data consistency in 
  Apache HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11885) Provide a Dockerfile to easily build and run HBase from source

2014-09-03 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120122#comment-14120122
 ] 

Nicolas Liochon commented on HBASE-11885:
-

Since HBASE-4955 is fixed, it's possible and actually better to remove Gary's 
repo from the master version of HBase.

 Provide a Dockerfile to easily build and run HBase from source
 --

 Key: HBASE-11885
 URL: https://issues.apache.org/jira/browse/HBASE-11885
 Project: HBase
  Issue Type: New Feature
Reporter: Dima Spivak
Assignee: Dima Spivak

 [A recent email to 
 dev@|http://mail-archives.apache.org/mod_mbox/hbase-dev/201408.mbox/%3CCAAef%2BM4q%3Da8Dqxe_EHSFTueY%2BXxz%2BtTe%2BJKsWWbXjhB_Pz7oSA%40mail.gmail.com%3E]
  highlighted the difficulty that new users can face in getting HBase compiled 
 from source and running locally. I'd like to provide a Dockerfile that would 
 allow anyone with Docker running on a machine with a reasonably current Linux 
 kernel to do so with ease.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value

2014-09-03 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11887:
--
Attachment: 11887.txt

This seems to help.  BoundedPriorityBlockingQueue feeds all the handlers, which 
use CallRunners.  The CallRunners hold on to Calls after they are done, so all 
handlers can be holding on to big ycsb puts.  With this in place, I run much 
longer.  The heap character is completely different now.  Let me dig more.  
This is a branch-1/master only issue (redo of rpc priority).
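A minimal sketch of the retention pattern and the fix described above; the 
class shapes here are simplified stand-ins for the real rpc classes, not the 
actual HBase implementation.

```java
class Call {
    byte[] payload;                    // e.g. a big ycsb put
    Call(byte[] payload) { this.payload = payload; }
}

class CallRunner {
    private Call call;
    CallRunner(Call call) { this.call = call; }

    void run() {
        try {
            // ... dispatch this.call to the service ...
        } finally {
            // Drop the reference when done, so a runner parked on the
            // handler queue cannot pin the request payload in the heap.
            this.call = null;
        }
    }

    boolean retainsCall() { return call != null; }
}
```

Without the `finally` block, every idle handler would keep its last Call -- and 
its payload -- reachable until the next request arrived.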

 Memory retention in branch-1; millions of instances of LiteralByteString for 
 column qualifier and value
 ---

 Key: HBASE-11887
 URL: https://issues.apache.org/jira/browse/HBASE-11887
 Project: HBase
  Issue Type: Bug
  Components: Protobufs
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png


 Trying to test branch-1, I run out of mem pretty fast.  Looking at dumps, I 
 see too many instances of LiteralByteString.  Seem to be 'qualifiers' and 
 'values' out of pb QualifierValue... and on up to the multi call into the 
 server. Am having trouble finding how the retention is being done... Filing 
 issue in the meantime while I work on it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120142#comment-14120142
 ] 

Andrey Stepachev commented on HBASE-11165:
--

bq.Yeah, we'll have to go this route if we are trying to keep state of a big 
cluster in heap. Could work on making the representation more compact. You 
arguing for single meta region Andrey Stepachev then? There is also the on-hdfs 
size to consider (write-amplification) and the r/w i/os.

For sure, a compact representation doesn't imply a single meta. A compact meta 
means we only need to bother with splitting meta for really big installations. 
But how would HDFS handle that, as [~mantonov] mentioned above?

As for compact META representations, we can use other techniques to reduce the 
HDFS impact of a big meta.

 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
 regions, maybe even 50M later.  This issue is about discussing how we will do 
 that (or if not 50M on a cluster, how otherwise we can attain same end).
 More detail to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120144#comment-14120144
 ] 

Andrey Stepachev commented on HBASE-11165:
--

bq.But how HDFS would handle that, as Mikhail Antonov mentioned above?
That was meant as a question.

 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
 regions, maybe even 50M later.  This issue is about discussing how we will do 
 that (or if not 50M on a cluster, how otherwise we can attain same end).
 More detail to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11885) Provide a Dockerfile to easily build and run HBase from source

2014-09-03 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120158#comment-14120158
 ] 

Dima Spivak commented on HBASE-11885:
-

+1 ^^^. As I said, I had no issues at all building master from source after 
removing that repository altogether (0.94 is a different story, because of its 
dependency on the 2.12-TRUNK-HBASE-2 version of Surefire).

 Provide a Dockerfile to easily build and run HBase from source
 --

 Key: HBASE-11885
 URL: https://issues.apache.org/jira/browse/HBASE-11885
 Project: HBase
  Issue Type: New Feature
Reporter: Dima Spivak
Assignee: Dima Spivak

 [A recent email to 
 dev@|http://mail-archives.apache.org/mod_mbox/hbase-dev/201408.mbox/%3CCAAef%2BM4q%3Da8Dqxe_EHSFTueY%2BXxz%2BtTe%2BJKsWWbXjhB_Pz7oSA%40mail.gmail.com%3E]
  highlighted the difficulty that new users can face in getting HBase compiled 
 from source and running locally. I'd like to provide a Dockerfile that would 
 allow anyone with Docker running on a machine with a reasonably current Linux 
 kernel to do so with ease.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11339) HBase MOB

2014-09-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120174#comment-14120174
 ] 

Andrew Purtell commented on HBASE-11339:


bq. You'd have to go back a bit to get one that explicitly depended on MR but 
they did exist. (e.g. pre dist log splitting we had a MR based log replay – 
useful in avoiding 10 hr recovery downtimes).

The master's built-in splitting was still available even if there was no MR 
runtime that could run the replay tool.

bq. What is your definition of stages? -- do you mean patch a time or something 
more like: stage one with external compactions, stage 2 with internal 
compactions? 

Stage = JIRA issue.

bq. For this MOB feature, we would have the experimental tag while we had 
external compactions and it would remain until we remove external dependencies 
and this compaction harden with fault testing.

Whether or not the feature is tagged as experimental seems orthogonal to the 
compaction implementation question (at least to me).

If I read the above correctly we are looking at 2.0 as a possible release for 
shipping this feature? I suggest we communicate the feature status as 
experimental for the whole release line, i.e. until 2.1, like what we have done 
with the cell security features in the 0.98 line. 

 HBase MOB
 -

 Key: HBASE-11339
 URL: https://issues.apache.org/jira/browse/HBASE-11339
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Reporter: Jingcheng Du
Assignee: Jingcheng Du
 Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase 
 MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user 
 guide_v2.docx, hbase-11339-in-dev.patch


   It's quite useful to save medium-sized binary data like images and documents 
  into Apache HBase. Unfortunately, directly saving binary MOB (medium 
  object) data to HBase leads to worse performance because of the frequent 
  splits and compactions.
   In this design, the MOB data are stored in a more efficient way, which 
  keeps high write/read performance and guarantees data consistency in 
  Apache HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-11886:
--
Fix Version/s: (was: 1.0.0)
   2.0.0
   0.99.0

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that the table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I suspect HBASE-11275 introduced the 
 problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value

2014-09-03 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120191#comment-14120191
 ] 

Enis Soztutar commented on HBASE-11887:
---

nice one. Raising this to critical. 

 Memory retention in branch-1; millions of instances of LiteralByteString for 
 column qualifier and value
 ---

 Key: HBASE-11887
 URL: https://issues.apache.org/jira/browse/HBASE-11887
 Project: HBase
  Issue Type: Bug
  Components: Protobufs
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
 Fix For: 0.99.0, 2.0.0

 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png


 Trying to test branch-1, I run out of mem pretty fast.  Looking at dumps, I 
 see too many instances of LiteralByteString.  Seem to be 'qualifiers' and 
 'values' out of pb QualifierValue... and on up to the multi call into the 
 server. Am having trouble finding how the retention is being done... Filing 
 issue in the meantime while I work on it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value

2014-09-03 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-11887:
--
Fix Version/s: 2.0.0
   0.99.0

 Memory retention in branch-1; millions of instances of LiteralByteString for 
 column qualifier and value
 ---

 Key: HBASE-11887
 URL: https://issues.apache.org/jira/browse/HBASE-11887
 Project: HBase
  Issue Type: Bug
  Components: Protobufs
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.99.0, 2.0.0

 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png


 Trying to test branch-1, I run out of mem pretty fast.  Looking at dumps, I 
 see too many instances of LiteralByteString.  Seem to be 'qualifiers' and 
 'values' out of pb QualifierValue... and on up to the multi call into the 
 server. Am having trouble finding how the retention is being done... Filing 
 issue in the meantime while I work on it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value

2014-09-03 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-11887:
--
Priority: Critical  (was: Major)

 Memory retention in branch-1; millions of instances of LiteralByteString for 
 column qualifier and value
 ---

 Key: HBASE-11887
 URL: https://issues.apache.org/jira/browse/HBASE-11887
 Project: HBase
  Issue Type: Bug
  Components: Protobufs
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.99.0, 2.0.0

 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png


 Trying to test branch-1, I run out of mem pretty fast.  Looking at dumps, I 
 see too many instances of LiteralByteString.  Seem to be 'qualifiers' and 
 'values' out of pb QualifierValue... and on up to the multi call into the 
 server. Am having trouble finding how the retention is being done... Filing 
 issue in the meantime while I work on it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11805) KeyValue to Cell Convert in WALEdit APIs

2014-09-03 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11805:
---
Attachment: HBASE-11805_V2.patch

V2 addressing Stack's comment.
Added CellUtil#estimatedLengthOf(final Cell cell)
When the passed cell is a KV we return KV.length() to maintain the same result 
as in the past.  When the cell is of another type, the return will be an 
estimate adding up the key, value and tags length parts.
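The non-KeyValue branch of that estimate could look roughly like the sketch 
below. The `Cell` interface here is a simplified stand-in exposing only the 
length accessors the estimate needs, and the sum is deliberately rough (it 
ignores fixed serialization overhead such as timestamp and type bytes).

```java
// Simplified stand-in for the HBase Cell interface.
interface Cell {
    int getRowLength();
    int getFamilyLength();
    int getQualifierLength();
    int getValueLength();
    int getTagsLength();
}

class CellUtilSketch {
    // For a real backing-array KeyValue the method would return kv.getLength()
    // to keep the historical result; for any other Cell type, sum the key,
    // value and tags component lengths as an estimate.
    static long estimatedLengthOf(Cell cell) {
        return (long) cell.getRowLength() + cell.getFamilyLength()
                + cell.getQualifierLength() + cell.getValueLength()
                + cell.getTagsLength();
    }
}
```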


 KeyValue to Cell Convert in WALEdit APIs
 

 Key: HBASE-11805
 URL: https://issues.apache.org/jira/browse/HBASE-11805
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.99.0, 2.0.0, 0.98.7

 Attachments: HBASE-11805.patch, HBASE-11805_V2.patch


 In almost all other main interface classes/APIs we have changed KeyValue to 
 Cell, but it is missing in WALEdit. This is marked public for Replication 
 (well, it should be for CPs also). 
 These 2 APIs deal with KVs:
 add(KeyValue kv)
 ArrayList&lt;KeyValue&gt; getKeyValues()
 Suggest deprecating them and adding, for 0.98:
 add(Cell kv) 
 List&lt;Cell&gt; getCells()
 And just replace from 1.0
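The deprecate-then-replace pattern proposed above could be sketched as follows; 
`WALEditSketch` and the `Cell`/`KeyValue` stand-ins are illustrative only, not 
the real WALEdit class.

```java
import java.util.ArrayList;
import java.util.List;

interface Cell { }                  // stand-in for org.apache.hadoop.hbase.Cell
class KeyValue implements Cell { }  // in HBase, KeyValue implements Cell

class WALEditSketch {
    private final ArrayList<Cell> cells = new ArrayList<>();

    /** @deprecated use {@link #add(Cell)} instead. */
    @Deprecated
    void add(KeyValue kv) { add((Cell) kv); }  // old API delegates to the new one

    void add(Cell cell) { cells.add(cell); }

    List<Cell> getCells() { return cells; }
}
```

Delegating the deprecated overload to the new one keeps 0.98 source 
compatibility while letting callers migrate before the old signatures are 
dropped in 1.0.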



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11882) Row level consistency may not be maintained with bulk load and compaction

2014-09-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120216#comment-14120216
 ] 

Hadoop QA commented on HBASE-11882:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12666066/HBASE-11882-master-v2.patch
  against trunk revision .
  ATTACHMENT ID: 12666066

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 2 zombie test(s):   
at org.apache.hadoop.hbase.client.TestHCM.testClusterStatus(TestHCM.java:250)
at 
org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:3813)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10695//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10695//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10695//console

This message is automatically generated.

 Row level consistency may not be maintained with bulk load and compaction
 -

 Key: HBASE-11882
 URL: https://issues.apache.org/jira/browse/HBASE-11882
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.99.0, 2.0.0
Reporter: Jerry He
Assignee: Jerry He
Priority: Critical
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE-11882-master-v1.patch, 
 HBASE-11882-master-v2.patch, TestHRegionServerBulkLoad.java.patch


 While looking into the TestHRegionServerBulkLoad failure for HBASE-11772, I 
 found the root cause is that row level atomicity may not be maintained with 
 bulk load together with compaction.
 TestHRegionServerBulkLoad is used to test bulk load atomicity. The test uses 
 multiple threads to do bulk load and scan continuously and do compactions 
 periodically. 
 It verifies row level data is always consistent across column families.
 After HBASE-11591, we added readpoint checks for bulkloaded data using the 
 seqId at the time of bulk load. Now a scanner will not see the data from a 
 bulk load if the scanner's readpoint is earlier than the bulk load seqId.
 Previously, the atomic bulk load result is visible immediately to all 
 scanners.
 The problem is with compaction after bulk load. Compaction does not lock the 
 region, and it is done one store (column family) at a time. It also compacts 
 away the seqId marker of the bulk load.
 Here is an event sequence where the row level consistency is broken.
 1. A scanner is started to scan a region with cf1 and cf2. The readpoint is 
 10.

[jira] [Updated] (HBASE-11826) Split each tableOrRegionName admin methods into two targetted methods

2014-09-03 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-11826:
--
Attachment: hbase-11826_v3.patch

reattach.

 Split each tableOrRegionName admin methods into two targetted methods
 -

 Key: HBASE-11826
 URL: https://issues.apache.org/jira/browse/HBASE-11826
 Project: HBase
  Issue Type: Improvement
Reporter: Carter
Assignee: Carter
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE_11826.patch, HBASE_11826_v2.patch, 
 HBASE_11826_v2.patch, hbase-11826_v3.patch, hbase-11826_v3.patch


 Purpose of this is to implement [~enis]'s suggestion to strongly type the 
 methods that take tableOrRegionName as an argument.
 For instance:
 {code}
 void compact(final String tableNameOrRegionName)
 void compact(final byte[] tableNameOrRegionName)
 {code}
 becomes
 {code}
 @Deprecated
 void compact(final String tableNameOrRegionName)
 @Deprecated
 void compact(final byte[] tableNameOrRegionName)
 void compact(TableName table)
 void compactRegion(final byte[] regionName)
 {code}
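A minimal sketch of the deprecation pattern being proposed, with stand-in types (the real Admin and TableName classes carry much more than this):

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for org.apache.hadoop.hbase.TableName; illustrative only.
class TableNameSketch {
    private final String name;
    private TableNameSketch(String name) { this.name = name; }
    static TableNameSketch valueOf(String name) { return new TableNameSketch(name); }
    @Override public String toString() { return name; }
}

// Sketch of splitting an ambiguous String-based method into typed ones.
class AdminSketch {
    final List<String> operations = new ArrayList<>();

    /** @deprecated ambiguous: the argument may name a table or a region */
    @Deprecated
    void compact(String tableNameOrRegionName) {
        // Old entry point kept for compatibility; real code would first
        // resolve whether the string names a table or a region.
        compact(TableNameSketch.valueOf(tableNameOrRegionName));
    }

    void compact(TableNameSketch table) {
        operations.add("compact table " + table);
    }

    void compactRegion(byte[] regionName) {
        operations.add("compact region " + new String(regionName));
    }
}
```

Callers migrate from `compact("t1")` to `compact(TableNameSketch.valueOf("t1"))` or `compactRegion(regionName)`, and the compiler then documents which kind of name each call expects.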



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10841) Scan setters should consistently return this

2014-09-03 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120230#comment-14120230
 ] 

Anoop Sam John commented on HBASE-10841:


Patch looks good.
bq.Here is a patch which fixes all offenders in Put,Delete,Get,Scan,etc
Just edit this issue's title and description. :)


 Scan setters should consistently return this
 

 Key: HBASE-10841
 URL: https://issues.apache.org/jira/browse/HBASE-10841
 Project: HBase
  Issue Type: Sub-task
  Components: Client, Usability
Affects Versions: 0.99.0
Reporter: Nick Dimiduk
Assignee: Enis Soztutar
Priority: Minor
 Fix For: 0.99.0, 2.0.0

 Attachments: hbase-10841_v1.patch


 While addressing review comments on HBASE-10818, I noticed that our {{Scan}} 
 class is inconsistent with its setter methods. Some of them return {{this}}, 
 others don't. They should be consistent. I suggest making them all return 
 {{this}}, to support chained invocation.
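The chained-invocation style this enables looks like the following (ScanLike is a stand-in class, not the real {{Scan}}):

```java
// Minimal stand-in showing the "setters return this" convention.
class ScanLike {
    private int caching = -1;
    private boolean reversed = false;

    ScanLike setCaching(int caching) { this.caching = caching; return this; }
    ScanLike setReversed(boolean reversed) { this.reversed = reversed; return this; }

    int getCaching() { return caching; }
    boolean isReversed() { return reversed; }
}
```

This allows `new ScanLike().setCaching(100).setReversed(true)` as a single expression instead of three statements.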





[jira] [Updated] (HBASE-10841) Scan setters should consistently return this

2014-09-03 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-10841:
---
Status: Patch Available  (was: Open)

 Scan setters should consistently return this
 

 Key: HBASE-10841
 URL: https://issues.apache.org/jira/browse/HBASE-10841
 Project: HBase
  Issue Type: Sub-task
  Components: Client, Usability
Affects Versions: 0.99.0
Reporter: Nick Dimiduk
Assignee: Enis Soztutar
Priority: Minor
 Fix For: 0.99.0, 2.0.0

 Attachments: hbase-10841_v1.patch


 While addressing review comments on HBASE-10818, I noticed that our {{Scan}} 
 class is inconsistent with its setter methods. Some of them return {{this}}, 
 others don't. They should be consistent. I suggest making them all return 
 {{this}}, to support chained invocation.





[jira] [Commented] (HBASE-11339) HBase MOB

2014-09-03 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120250#comment-14120250
 ] 

Jonathan Hsieh commented on HBASE-11339:


bq. The master's built in splitting was still available even if there was no MR 
runtime that could run the replay tool.

If you were ok with 10 hr downtimes due to recovery (back then there was no 
meta-first recovery), then sure. For large deployments, the MR runtime for this 
was critical and not really optional.

bq. Stage = JIRA issue.

sgtm.

bq. If I read the above correctly we are looking at 2.0 as a possible release 
for shipping this feature? I suggest we communicate the feature status as 
experimental for the whole release line, i.e. until 2.1, like what we have done 
with the cell security features in the 0.98 line.

Yes -- trunk is 2.0 and new features should only land in trunk, and yes, we 
would note it as experimental until all pieces are in and some hardening has 
taken place. Ideally, all major features would be experimental in their first 
release. If we follow through with having 2.0 - 2.1 be like 0.92 - 0.94 or 
0.96 - 0.98, then following the cell security approach for experimental status 
sounds good to me.



 HBase MOB
 -

 Key: HBASE-11339
 URL: https://issues.apache.org/jira/browse/HBASE-11339
 Project: HBase
  Issue Type: Umbrella
  Components: regionserver, Scanners
Reporter: Jingcheng Du
Assignee: Jingcheng Du
 Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase 
 MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user 
 guide_v2.docx, hbase-11339-in-dev.patch


   It's quite useful to save medium-sized binary data like images and 
  documents into Apache HBase. Unfortunately, directly saving binary MOB 
  (medium object) data to HBase leads to worse performance because of frequent 
  splits and compactions.
   In this design, the MOB data is stored in a more efficient way, which 
  keeps high write/read performance and guarantees data consistency in 
  Apache HBase.





[jira] [Commented] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics

2014-09-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120280#comment-14120280
 ] 

Hudson commented on HBASE-11876:


FAILURE: Integrated in HBase-0.98 #494 (See 
[https://builds.apache.org/job/HBase-0.98/494/])
Revert HBASE-11876 RegionScanner.nextRaw should not update metrics (apurtell: 
rev c3882ed73a5c77c21ee5110ded9598f2f317cb55)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
HBASE-11876 RegionScanner.nextRaw should not update metrics (apurtell: rev 
cf843b196338cf2f2bd0eafbe88fda7d4386fba2)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java


 RegionScanner.nextRaw(...) should not update metrics
 

 Key: HBASE-11876
 URL: https://issues.apache.org/jira/browse/HBASE-11876
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.6
Reporter: Lars Hofhansl
Assignee: Andrew Purtell
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: HBASE-11876-0.98.patch, HBASE-11876-0.98.patch, 
 HBASE-11876.patch, HBASE-11876.patch


 I added RegionScanner.nextRaw(...) to allow smart clients to avoid some 
 of the default work that HBase does, such as {start|stop}RegionOperation 
 and synchronized(scanner) for each row.
 Metrics should follow the same approach. Collecting them per row is expensive, 
 and a caller should have the option to collect them later or to avoid 
 collecting them completely.
 We can also save some cycles in RSRpcServices.scan(...) if we updated the 
 metric only once per batch instead of for each row.
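The once-per-batch metrics idea can be sketched as follows (a toy model; the real code paths in HRegion and RSRpcServices are more involved):

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model: tally rows locally during a batch, then touch the shared
// metric once, instead of incrementing it for every row scanned.
class BatchScanMetrics {
    static final AtomicLong readRequestCount = new AtomicLong();

    static void scanBatch(int rowsInBatch) {
        long rowsSeen = 0;
        for (int i = 0; i < rowsInBatch; i++) {
            rowsSeen++; // per-row work stays off the shared counter
        }
        readRequestCount.addAndGet(rowsSeen); // one metrics update per batch
    }
}
```

The shared counter ends up with the same total, but contention on it drops from once per row to once per batch.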





[jira] [Work started] (HBASE-11885) Provide a Dockerfile to easily build and run HBase from source

2014-09-03 Thread Dima Spivak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-11885 started by Dima Spivak.
---
 Provide a Dockerfile to easily build and run HBase from source
 --

 Key: HBASE-11885
 URL: https://issues.apache.org/jira/browse/HBASE-11885
 Project: HBase
  Issue Type: New Feature
Reporter: Dima Spivak
Assignee: Dima Spivak

 [A recent email to 
 dev@|http://mail-archives.apache.org/mod_mbox/hbase-dev/201408.mbox/%3CCAAef%2BM4q%3Da8Dqxe_EHSFTueY%2BXxz%2BtTe%2BJKsWWbXjhB_Pz7oSA%40mail.gmail.com%3E]
  highlighted the difficulty that new users can face in getting HBase compiled 
 from source and running locally. I'd like to provide a Dockerfile that would 
 allow anyone with Docker running on a machine with a reasonably current Linux 
 kernel to do so with ease.





[jira] [Commented] (HBASE-11876) RegionScanner.nextRaw(...) should not update metrics

2014-09-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120310#comment-14120310
 ] 

Hudson commented on HBASE-11876:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #467 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/467/])
Revert HBASE-11876 RegionScanner.nextRaw should not update metrics (apurtell: 
rev c3882ed73a5c77c21ee5110ded9598f2f317cb55)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
HBASE-11876 RegionScanner.nextRaw should not update metrics (apurtell: rev 
cf843b196338cf2f2bd0eafbe88fda7d4386fba2)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


 RegionScanner.nextRaw(...) should not update metrics
 

 Key: HBASE-11876
 URL: https://issues.apache.org/jira/browse/HBASE-11876
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.6
Reporter: Lars Hofhansl
Assignee: Andrew Purtell
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: HBASE-11876-0.98.patch, HBASE-11876-0.98.patch, 
 HBASE-11876.patch, HBASE-11876.patch


 I added RegionScanner.nextRaw(...) to allow smart clients to avoid some 
 of the default work that HBase does, such as {start|stop}RegionOperation 
 and synchronized(scanner) for each row.
 Metrics should follow the same approach. Collecting them per row is expensive, 
 and a caller should have the option to collect them later or to avoid 
 collecting them completely.
 We can also save some cycles in RSRpcServices.scan(...) if we updated the 
 metric only once per batch instead of for each row.





[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120321#comment-14120321
 ] 

Anoop Sam John commented on HBASE-11886:


Thanks for the find DD.

+  this.activeUser = UserProvider.instantiate(conf).getCurrent();
Get user here from RequestContext(?)

BTW
{code}
private User getActiveUser() throws IOException {
  User user = RequestContext.getRequestUser();
  if (!RequestContext.isInRequestContext()) {
    // for non-rpc handling, fallback to system user
    user = userProvider.getCurrent();
  }
  return user;
}
{code}
Would using InheritableThreadLocal in RequestContext solve the issue without 
other changes?
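For context, what InheritableThreadLocal buys over a plain ThreadLocal is that a value set in the RPC handler thread is also visible in threads it spawns. RequestContextSketch below is a self-contained stand-in, not the real HBase RequestContext:

```java
// Demonstrates value inheritance from a parent thread into a child thread,
// which a plain ThreadLocal does not provide.
class RequestContextSketch {
    static final InheritableThreadLocal<String> REQUEST_USER =
        new InheritableThreadLocal<>();

    // Returns what a freshly spawned child thread observes.
    static String userSeenByChildThread() throws InterruptedException {
        final String[] seen = new String[1];
        Thread child = new Thread(() -> seen[0] = REQUEST_USER.get());
        child.start();
        child.join();
        return seen[0];
    }
}
```

The inheritance happens once, at child-thread construction time, so handler threads created before the value was set (e.g. from a pre-existing thread pool) would not see it.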

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I am suspecting HBASE-11275 introduced the 
 problem.





[jira] [Commented] (HBASE-11826) Split each tableOrRegionName admin methods into two targetted methods

2014-09-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120341#comment-14120341
 ] 

Hadoop QA commented on HBASE-11826:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12666279/hbase-11826_v3.patch
  against trunk revision .
  ATTACHMENT ID: 12666279

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 167 
new or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10697//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10697//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10697//console

This message is automatically generated.

 Split each tableOrRegionName admin methods into two targetted methods
 -

 Key: HBASE-11826
 URL: https://issues.apache.org/jira/browse/HBASE-11826
 Project: HBase
  Issue Type: Improvement
Reporter: Carter
Assignee: Carter
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE_11826.patch, HBASE_11826_v2.patch, 
 HBASE_11826_v2.patch, hbase-11826_v3.patch, hbase-11826_v3.patch


 Purpose of this is to implement [~enis]'s suggestion to strongly type the 
 methods that take tableOrRegionName as an argument.
 For instance:
 {code}
 void compact(final String tableNameOrRegionName)
 void compact(final byte[] tableNameOrRegionName)
 {code}
 becomes
 {code}
 @Deprecated
 void compact(final String tableNameOrRegionName)
 @Deprecated
 void compact(final byte[] tableNameOrRegionName)
 void compact(TableName table)
 void compactRegion(final byte[] regionName)
 {code}





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120349#comment-14120349
 ] 

Mikhail Antonov commented on HBASE-11165:
-

[~stack],  [~octo47] - on this compaction topic,  I also mentioned early on in 
the thread:

bq. I wonder if it makes sense to have google doc linked to this jira to save 
various proposals, findings and estimates? Like that summarizes current usage 
to be conservatively 3.5Gb in meta / 1M regions.

So it seems like we're using 3-3.5 KB per region row? That should be 
compressible, looking at the data in the meta rows. Also, I think it would help 
if we post some numbers here and capture them in the documents, so we have a 
baseline for our work. For example:

 - how many kb in memory per-region in meta
 - how many HDFS inodes per region (depends on the number of store files, but 
some estimate?)

That would let us estimate how big a deployment would have to be before meta no 
longer fits in memory: how many RSs, how many petabytes of data?
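The arithmetic behind the 3.5 GB figure quoted above is simple multiplication (the helper below is illustrative, using the numbers from this thread):

```java
// Back-of-envelope meta sizing: bytes of meta state for a given region count.
class MetaSizingEstimate {
    static long metaBytes(long regionCount, long bytesPerRegion) {
        return regionCount * bytesPerRegion;
    }
}
```

One million regions at ~3,500 bytes each comes to ~3.5 GB of meta, matching the conservative estimate mentioned earlier in the thread; at 50M regions the same rate would be ~175 GB, which is why compression of meta rows matters.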

 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
 regions maybe even 50M later.  This issue is about discussing how we will do 
 that (or if not 50M on a cluster, how otherwise we can attain same end).
 More detail to follow.





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120360#comment-14120360
 ] 

Andrey Stepachev commented on HBASE-11165:
--

Also, it is very interesting: did big users of HBase with so many regions use 
NameNode federation, or an enormous machine to handle the NameNode with so many 
regions?

 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
 regions maybe even 50M later.  This issue is about discussing how we will do 
 that (or if not 50M on a cluster, how otherwise we can attain same end).
 More detail to follow.





[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120366#comment-14120366
 ] 

Andrew Purtell commented on HBASE-11886:


Back from other stuff. Let me address those points [~anoop.hbase] since I'm 
working in this area on a test.

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I am suspecting HBASE-11275 introduced the 
 problem.





[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120396#comment-14120396
 ] 

Devaraj Das commented on HBASE-11886:
-

Using the user from RequestContext sounds fine.
I am not so sure about the InheritableThreadLocal though. Since the master does 
HDFS operations when operations like createTable are called, it might be an 
issue, no? What I did changes the identity only for postCreateTableHandler, but 
the other operations done as part of the createTable call are executed under 
the master's identity.

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I am suspecting HBASE-11275 introduced the 
 problem.





[jira] [Updated] (HBASE-10841) Scan,Get,Put,Delete,etc setters should consistently return this

2014-09-03 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-10841:
--
Summary: Scan,Get,Put,Delete,etc setters should consistently return this  
(was: Scan setters should consistently return this)

 Scan,Get,Put,Delete,etc setters should consistently return this
 ---

 Key: HBASE-10841
 URL: https://issues.apache.org/jira/browse/HBASE-10841
 Project: HBase
  Issue Type: Sub-task
  Components: Client, Usability
Affects Versions: 0.99.0
Reporter: Nick Dimiduk
Assignee: Enis Soztutar
Priority: Minor
 Fix For: 0.99.0, 2.0.0

 Attachments: hbase-10841_v1.patch


 While addressing review comments on HBASE-10818, I noticed that our {{Scan}} 
 class is inconsistent with its setter methods. Some of them return {{this}}, 
 others don't. They should be consistent. I suggest making them all return 
 {{this}}, to support chained invocation.





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120408#comment-14120408
 ] 

Francis Liu commented on HBASE-11165:
-

We're currently using huge NNs. 

We haven't looked into the number of inodes, as that didn't seem to be an issue 
for the 1M case (we have a single NN running ~250M files). But we'll be 
watching it for the post-1M benchmarks. Will post results here.



 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
 regions maybe even 50M later.  This issue is about discussing how we will do 
 that (or if not 50M on a cluster, how otherwise we can attain same end).
 More detail to follow.





[jira] [Commented] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value

2014-09-03 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120406#comment-14120406
 ] 

Enis Soztutar commented on HBASE-11887:
---

+1 on patch if no other findings. 

 Memory retention in branch-1; millions of instances of LiteralByteString for 
 column qualifier and value
 ---

 Key: HBASE-11887
 URL: https://issues.apache.org/jira/browse/HBASE-11887
 Project: HBase
  Issue Type: Bug
  Components: Protobufs
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.99.0, 2.0.0

 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png


 Trying to test branch-1, I run out of mem pretty fast.  Looking at dumps, I 
 see too many instances of LiteralByteString.  Seem to be 'qualifiers' and 
 'values' out of pb QualifierValue... and on up to the multi call into the 
 server. Am having trouble finding how the retention is being done... Filing 
 issue in meantime while work on it.





[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120427#comment-14120427
 ] 

Anoop Sam John commented on HBASE-11886:


bq. Since the master does HDFS operations when operations like createTable are 
called, it might be an issue, no? 
I think there is no issue, because the op will be performed with the master 
identity only. RequestContext is used to know who the active user is. 
RequestContext is an HBase class, and in HDFS we will not be getting the user 
from it. By changing the RequestContext ThreadLocal, we make sure that wherever 
in the HBase code flow we check for the user from RequestContext, it is the RPC 
user who initiated the flow.

I am OK with not doing this change if there is a risk factor and it needs more 
time for tests. Andy would like to get the next RC out soon, I believe.

+1 with just changing the part that gets activeUser from RequestContext. 
Mind adding a comment on why we do this, so that it will be easy for someone 
who reads the code later.

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I am suspecting HBASE-11275 introduced the 
 problem.





[jira] [Comment Edited] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120427#comment-14120427
 ] 

Anoop Sam John edited comment on HBASE-11886 at 9/3/14 8:51 PM:


bq. Since the master does HDFS operations when operations like createTable are 
called, it might be an issue, no? 
I think there is no issue, because the op will be performed with the master 
identity only. RequestContext is used to know who the active user is. 
RequestContext is an HBase class, and in HDFS we will not be getting the user 
from it. By changing the RequestContext ThreadLocal, we make sure that wherever 
in the HBase code flow we check for the user from RequestContext, it is the RPC 
user who initiated the flow.

I am OK with not doing this change if there is a risk factor and it needs more 
time for tests. Andy would like to get the next RC out soon, I believe.

+1 with just changing the part that gets activeUser from RequestContext 
(instead of UserProvider.instantiate(conf).getCurrent()). Mind adding a comment 
on why we do this, so that it will be easy for someone who reads the code 
later.


was (Author: anoop.hbase):
bq.Since the master does HDFS operations when operations like createTable are 
called, it might be an issue, no? 
I think no issue. Because the op will be performed with master identity only. 
RequestContext  is used to know who is the active user. RequestContext is HBase 
class and in HDFS we will be getting the user not from this.   By a change in 
RequestContext ThreadLocal, we make sure in the flow wherever in HBase code, we 
check for the user from RequestContext , it is the RPC user who initiated the 
flow.

Am ok not to do this change if there is a risk factor and need more time for 
tests. Andy would like to get the next RC soon I believe.

+1 with just changing the part of getting activeUser  from RequestContext.  
Mind adding a comment why we do this so that it will be easy for some one who 
read the code later.

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I am suspecting HBASE-11275 introduced the 
 problem.





[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120434#comment-14120434
 ] 

Andrew Purtell commented on HBASE-11886:


bq. +1 with just changing the part of getting activeUser  from RequestContext 
instead 

Yes, this is what I am doing.

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11886-1.txt


 In our testing of 0.98.4 with security ON, we found that table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I am suspecting HBASE-11275 introduced the 
 problem.





[jira] [Commented] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value

2014-09-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120466#comment-14120466
 ] 

stack commented on HBASE-11887:
---

OK. Let me commit. It's been running these last few hours w/o a full GC. I have 
it set so a full GC kills the RS, using the below configs:

export SERVER_GC_OPTS="-verbose:gc -XX:PrintFLSStatistics=1 -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:+HeapDumpBeforeFullGC -XX:HeapDumpPath=/data/1/ 
-XX:+CMSDumpAtPromotionFailure -XX:+HeapDumpAfterFullGC 
-XX:+PrintTenuringDistribution -XX:+PrintGCApplicationConcurrentTime 
-XX:+PrintGCApplicationStoppedTime"

It used to run out of road -- full GC -- after 5 minutes or so.

 Memory retention in branch-1; millions of instances of LiteralByteString for 
 column qualifier and value
 ---

 Key: HBASE-11887
 URL: https://issues.apache.org/jira/browse/HBASE-11887
 Project: HBase
  Issue Type: Bug
  Components: Protobufs
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.99.0, 2.0.0

 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png


 Trying to test branch-1, I run out of mem pretty fast.  Looking at dumps, I 
 see too many instances of LiteralByteString.  Seem to be 'qualifiers' and 
 'values' out of pb QualifierValue... and on up to the multi call into the 
 server. Am having trouble finding how the retention is being done... Filing 
 issue in meantime while work on it.





[jira] [Commented] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value

2014-09-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120473#comment-14120473
 ] 

stack commented on HBASE-11887:
---

jmap -histo:

{code}
 num #instances #bytes  class name
--
   1:   3069883  780472352  [B
   2:   3501587  112050784  org.apache.hadoop.hbase.KeyValue
   3:   3441167   82588008  java.util.concurrent.ConcurrentSkipListMap$Node
   4:   1077906   68985984  org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MutationProto$ColumnValue$QualifierValue
   5:   2372963   56951112  com.google.protobuf.LiteralByteString
   6:   1720444   41290656  java.util.concurrent.ConcurrentSkipListMap$Index
   7:  7024   18547960  [I
{code}

This shows QualifierValue at 1/10th of the numbers it used to be, with no 
mention of LiteralByteString, which used to have near-equal pegging with [B 
instances.  Let me commit then, since it makes branch-1 YCSB-able (previously it 
was not).  Will be back to try and make more improvements, but this is enough 
for one issue.

 Memory retention in branch-1; millions of instances of LiteralByteString for 
 column qualifier and value
 ---

 Key: HBASE-11887
 URL: https://issues.apache.org/jira/browse/HBASE-11887
 Project: HBase
  Issue Type: Bug
  Components: Protobufs
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.99.0, 2.0.0

 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png


 Trying to test branch-1, I run out of mem pretty fast.  Looking at dumps, I 
 see too many instances of LiteralByteString.  Seem to be 'qualifiers' and 
 'values' out of pb QualifierValue... and on up to the multi call into the 
 server. Am having trouble finding how the retention is being done... Filing 
 issue in meantime while work on it.





[jira] [Commented] (HBASE-11805) KeyValue to Cell Convert in WALEdit APIs

2014-09-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120475#comment-14120475
 ] 

Hadoop QA commented on HBASE-11805:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12666280/HBASE-11805_V2.patch
  against trunk revision .
  ATTACHMENT ID: 12666280

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 45 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestMasterFailover
  org.apache.hadoop.hbase.client.TestMultiParallel
  org.apache.hadoop.hbase.client.TestReplicaWithCluster
  org.apache.hadoop.hbase.replication.TestPerTableCFReplication

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScanBase.testScan(TestTableInputFormatScanBase.java:238)
at 
org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1.testScanEmptyToBBA(TestTableInputFormatScan1.java:70)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10696//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10696//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10696//console

This message is automatically generated.

 KeyValue to Cell Convert in WALEdit APIs
 

 Key: HBASE-11805
 URL: https://issues.apache.org/jira/browse/HBASE-11805
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.99.0, 2.0.0, 0.98.7

 Attachments: HBASE-11805.patch, HBASE-11805_V2.patch


 In almost all other main interface classes/APIs we have changed KeyValue to 
 Cell, but it is missing in WALEdit. This is marked public for Replication (well, 
 it should be for CP also).
 These 2 APIs deal with KVs:
 add(KeyValue kv)
 ArrayList<KeyValue> getKeyValues()
 Suggest we deprecate them and, for 0.98, add
 add(Cell kv) 
 List<Cell> getCells()
 and just replace from 1.0.
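
 A minimal sketch of that deprecation pattern (the type and method bodies here 
 are hypothetical stand-ins, not the real WALEdit, Cell, or KeyValue from the 
 HBase codebase):

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for the real HBase types (KeyValue implements Cell in HBase).
interface Cell {}
class KeyValue implements Cell {}

// Hypothetical sketch of the proposed WALEdit API transition.
class WALEditSketch {
    private final List<Cell> cells = new ArrayList<>();

    /** @deprecated use {@link #add(Cell)} instead */
    @Deprecated
    public void add(KeyValue kv) {
        add((Cell) kv);  // delegate to the Cell-based method
    }

    public void add(Cell cell) {
        cells.add(cell);
    }

    /** @deprecated use {@link #getCells()} instead */
    @Deprecated
    public ArrayList<KeyValue> getKeyValues() {
        // Narrowing copy kept only for source compatibility.
        ArrayList<KeyValue> kvs = new ArrayList<>();
        for (Cell c : cells) {
            kvs.add((KeyValue) c);
        }
        return kvs;
    }

    public List<Cell> getCells() {
        return cells;
    }
}
```

 Old callers keep compiling against the deprecated KeyValue methods through 
 0.98, and the deprecated pair can then be dropped in 1.0.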





[jira] [Resolved] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value

2014-09-03 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-11887.
---
   Resolution: Fixed
Fix Version/s: 0.98.8
 Hadoop Flags: Reviewed

Applied to 0.98+. 0.98 doesn't seem to need it, but going by what it does in 
branch-1 there is probably no harm in letting go of these references so they get 
cleaned up early and are maybe less likely to be promoted (hope that's ok 
[~apurtell]).

 Memory retention in branch-1; millions of instances of LiteralByteString for 
 column qualifier and value
 ---

 Key: HBASE-11887
 URL: https://issues.apache.org/jira/browse/HBASE-11887
 Project: HBase
  Issue Type: Bug
  Components: Protobufs
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.8

 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png


 Trying to test branch-1, I run out of mem pretty fast.  Looking at dumps, I 
 see too many instances of LiteralByteString.  Seem to be 'qualifiers' and 
 'values' out of pb QualifierValue... and on up to the multi call into the 
 server. Am having trouble finding how the retention is being done... Filing 
 issue in meantime while work on it.





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120485#comment-14120485
 ] 

Elliott Clark commented on HBASE-11165:
---

If we can get into the same scaling range as HDFS's namenode then I don't see 
the urgency to split meta.

Num Files > Num Regions

So it would seem that addressing the in-memory representation of meta would 
mean that the scaling bottleneck would be back to the NN. At some point there 
will be limits there, but that seems fine as long as they are the same limits 
as our underlying foundation (HDFS).

 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
 regions maybe even 50M later.  This issue is about discussing how we will do 
 that (or if not 50M on a cluster, how otherwise we can attain same end).
 More detail to follow.





[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11886:
---
Status: Open  (was: Patch Available)

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11886-1.txt, HBASE-11886.patch


 In our testing of 0.98.4 with security ON, we found that table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I am suspecting HBASE-11275 introduced the 
 problem.





[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11886:
---
Status: Patch Available  (was: Open)

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11886-1.txt, HBASE-11886.patch


 In our testing of 0.98.4 with security ON, we found that table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I am suspecting HBASE-11275 introduced the 
 problem.





[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11886:
---
Attachment: HBASE-11886.patch

Updated patch with a test and Anoop's feedback.

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11886-1.txt, HBASE-11886.patch


 In our testing of 0.98.4 with security ON, we found that table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I am suspecting HBASE-11275 introduced the 
 problem.





[jira] [Updated] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11886:
---
Attachment: HBASE-11886-0.98.patch

Patch for 0.98

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11886-1.txt, HBASE-11886-0.98.patch, HBASE-11886.patch


 In our testing of 0.98.4 with security ON, we found that table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I am suspecting HBASE-11275 introduced the 
 problem.





[jira] [Commented] (HBASE-11826) Split each tableOrRegionName admin methods into two targetted methods

2014-09-03 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120508#comment-14120508
 ] 

Enis Soztutar commented on HBASE-11826:
---

It seems that some of the tests got killed during the Jenkins run. I've run 
these tests myself, and they pass:
{code} 
mvn clean test 
-Dtest=TestIdLock,TestMiniClusterLoadSequential,TestMergeTool,TestRegionSplitter,TestCoprocessorScanPolicy,TestTableInputFormatScan1,TestHFileOutputFormat2,TestAcidGuarantees,TestMiniClusterLoadEncoded,TestFSHDFSUtils
{code}

 Split each tableOrRegionName admin methods into two targetted methods
 -

 Key: HBASE-11826
 URL: https://issues.apache.org/jira/browse/HBASE-11826
 Project: HBase
  Issue Type: Improvement
Reporter: Carter
Assignee: Carter
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE_11826.patch, HBASE_11826_v2.patch, 
 HBASE_11826_v2.patch, hbase-11826_v3.patch, hbase-11826_v3.patch


 The purpose of this is to implement [~enis]'s suggestion to strongly type the 
 methods that take tableOrRegionName as an argument.
 For instance:
 {code}
 void compact(final String tableNameOrRegionName)
 void compact(final byte[] tableNameOrRegionName)
 {code}
 becomes
 {code}
 @Deprecated
 void compact(final String tableNameOrRegionName)
 @Deprecated
 void compact(final byte[] tableNameOrRegionName)
 void compact(TableName table)
 void compactRegion(final byte[] regionName)
 {code}
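
 The quoted split could be sketched like this (simplified hypothetical stubs, 
 not the real Admin or TableName classes; the lastAction field exists only to 
 make the sketch observable):

```java
import java.nio.charset.StandardCharsets;

// Minimal stand-in for org.apache.hadoop.hbase.TableName.
class TableName {
    private final String name;
    private TableName(String name) { this.name = name; }
    static TableName valueOf(String name) { return new TableName(name); }
    @Override public String toString() { return name; }
}

// Hypothetical sketch of splitting an ambiguous tableOrRegionName method
// into strongly typed variants.
class AdminSketch {
    String lastAction;  // recorded only so the sketch can be inspected

    /** @deprecated ambiguous; use {@link #compact(TableName)} or
     *  {@link #compactRegion(byte[])} */
    @Deprecated
    void compact(String tableNameOrRegionName) {
        // The old method had to guess what the string named; the sketch
        // simply assumes a table name.
        compact(TableName.valueOf(tableNameOrRegionName));
    }

    void compact(TableName table) {
        lastAction = "compact table " + table;
    }

    void compactRegion(byte[] regionName) {
        lastAction = "compact region "
            + new String(regionName, StandardCharsets.UTF_8);
    }
}
```

 The typed overloads remove the table-or-region guesswork from each call site, 
 while the deprecated String variant keeps old clients compiling.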





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120511#comment-14120511
 ] 

Francis Liu commented on HBASE-11165:
-

{quote}
Francis Liu, Virag Kothari - do you guys have by chance some recent numbers (or 
maybe estimate) on how long does full master failover take on the cluster with 
300k or 3M regions? I didn't find those in the recent doc, eager to see that.
{quote}
[~mantonov] We don't have the numbers; we'll get them next time. Though failover 
recovery is essentially bounded by scanning meta and recovering dead servers, 
so without dead servers it would just be a fraction of the startup time.



 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
 regions maybe even 50M later.  This issue is about discussing how we will do 
 that (or if not 50M on a cluster, how otherwise we can attain same end).
 More detail to follow.





[jira] [Commented] (HBASE-11887) Memory retention in branch-1; millions of instances of LiteralByteString for column qualifier and value

2014-09-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120512#comment-14120512
 ] 

Andrew Purtell commented on HBASE-11887:


Thanks!

 Memory retention in branch-1; millions of instances of LiteralByteString for 
 column qualifier and value
 ---

 Key: HBASE-11887
 URL: https://issues.apache.org/jira/browse/HBASE-11887
 Project: HBase
  Issue Type: Bug
  Components: Protobufs
Affects Versions: 0.99.0
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.8

 Attachments: 11887.txt, Screen Shot 2014-09-03 at 10.18.58 AM.png


 Trying to test branch-1, I run out of mem pretty fast.  Looking at dumps, I 
 see too many instances of LiteralByteString.  Seem to be 'qualifiers' and 
 'values' out of pb QualifierValue... and on up to the multi call into the 
 server. Am having trouble finding how the retention is being done... Filing 
 issue in meantime while work on it.





[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-09-03 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120516#comment-14120516
 ] 

Mikhail Antonov commented on HBASE-11165:
-

[~toffer] thanks! I'd be really curious to look at those numbers.

Is the NN you mentioned with 250M files solely dedicated to the HBase 
installation? I mean, could the assumption be made that an HBase cluster with 
1M or more regions consumes about 250M files in HDFS, so roughly 250 files 
per region, or would that be too bold an assumption?

[~eclark] so if we take as a baseline that (num files) > (num regions), I 
wonder how close to NN limits we are? I mean, if we're talking about a case with 
10M regions (or even 50M), with the same ratio of regions to files, 10M regions 
would give us 2.5B files in HDFS? How close is that to HDFS limits?
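
As a back-of-the-envelope check on that ratio (all numbers are the comment's 
assumptions, not measurements):

```java
// If ~1M regions correspond to ~250M NameNode files (~250 files/region),
// then 10M regions at the same ratio would mean ~2.5B files.
class RegionFileEstimate {
    static long estimatedFiles(long regions, long filesPerRegion) {
        return regions * filesPerRegion;
    }

    public static void main(String[] args) {
        long filesPerRegion = 250_000_000L / 1_000_000L;  // ~250 files/region
        System.out.println(estimatedFiles(10_000_000L, filesPerRegion));  // 2500000000
    }
}
```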

 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
 regions maybe even 50M later.  This issue is about discussing how we will do 
 that (or if not 50M on a cluster, how otherwise we can attain same end).
 More detail to follow.





[jira] [Commented] (HBASE-11886) The creator of the table should have all permissions on the table

2014-09-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120517#comment-14120517
 ] 

Hadoop QA commented on HBASE-11886:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12666325/HBASE-11886-0.98.patch
  against trunk revision .
  ATTACHMENT ID: 12666325

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10699//console

This message is automatically generated.

 The creator of the table should have all permissions on the table
 -

 Key: HBASE-11886
 URL: https://issues.apache.org/jira/browse/HBASE-11886
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 11886-1.txt, HBASE-11886-0.98.patch, HBASE-11886.patch


 In our testing of 0.98.4 with security ON, we found that table creator 
 doesn't have RWXCA on the created table. Instead, the user representing the 
 HBase daemon gets all permissions. Due to this the table creator can't write 
 to the table he just created. I am suspecting HBASE-11275 introduced the 
 problem.




