[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file

2014-08-18 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100310#comment-14100310
 ] 

Anoop Sam John commented on HBASE-11591:


Sure. Some quick comments after a glance at the patch:

isBulkLoadResult() -> isBulkLoaded()? For the setter also?
I see this isBulkLoadResult() at the StoreFile.java level also. It would have 
been better to know this status from StoreFile rather than from StoreFileReader.

Also, what about compacting a flushed file and a bulk loaded one? Will we have 
issues then? Will this patch handle that also? Mind adding tests around that 
as well.

compareWithoutMvcc(Cell left, Cell right)
We have now deprecated the *mvcc() methods. Suggest a name change here also.

bq. // TODO : While doing cells this is should be avoided in the read path.
IMHO we should not do this KeyValueUtil.ensureKeyValue() stuff from now on 
(in the read path mainly). In the near future we will want Cells in the read 
path. How can we solve this particular issue then? (We cannot add a setter in 
Cell.java, I believe.) Or do we need an extension interface for Cell *on the 
server side* which has the setter?
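The extension-interface idea could look something like the minimal sketch below. Everything here is hypothetical: the interface and class names are made up for illustration and are not from the patch or from HBase itself.

```java
// Hypothetical sketch: a server-side extension of a Cell-like type that
// exposes a sequence-id setter, so the read path would not need
// KeyValueUtil.ensureKeyValue() conversions just to stamp a seq id.
interface CellLike {
    long getSequenceId();
}

interface SettableSequenceId extends CellLike {
    void setSequenceId(long seqId);
}

// Minimal implementation, only to show that server-side code could stamp
// the seq id through the interface without knowing the concrete cell class.
class SimpleCell implements SettableSequenceId {
    private long seqId;
    public long getSequenceId() { return seqId; }
    public void setSequenceId(long seqId) { this.seqId = seqId; }
}
```

The point of the extension interface is that client-facing code keeps seeing the read-only Cell, while server internals can downcast to the settable variant when they need to assign the id.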

Taking a deeper look, Ram. Sorry for being late.

 Scanner fails to retrieve KV  from bulk loaded file with highest sequence id 
 than the cell's mvcc in a non-bulk loaded file
 ---

 Key: HBASE-11591
 URL: https://issues.apache.org/jira/browse/HBASE-11591
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0

 Attachments: HBASE-11591.patch, HBASE-11591_1.patch, 
 HBASE-11591_2.patch, TestBulkload.java


 See discussion in HBASE-11339.
 We have a case where the same KVs are present in two files: one produced by 
 flush/compaction and the other through a bulk load.
 Both files have some identical KVs that match even in timestamp.
 Steps:
 Add some rows with a specific timestamp and flush the same.
 Bulk load a file with the same data. Ensure that the assign-seqnum property 
 is set.
 The bulk load should use HFileOutputFormat2 (or ensure that we write the 
 bulk_time_output key).
 This ensures that the bulk loaded file has the highest seq num.
 Assume the cell in the flushed/compacted store file is 
 row1,cf,cq,ts1,value1 and the cell in the bulk loaded file is 
 row1,cf,cq,ts1,value2.
 (There are no parallel scans.)
 Issue a scan on the table in 0.96. The retrieved value is 
 row1,cf,cq,ts1,value2.
 But the same scan in 0.98 will retrieve row1,cf,cq,ts1,value1.
 This is a behaviour change. It is because of this code:
 {code}
 public int compare(KeyValueScanner left, KeyValueScanner right) {
   int comparison = compare(left.peek(), right.peek());
   if (comparison != 0) {
     return comparison;
   } else {
     // Since both the keys are exactly the same, we break the tie in favor
     // of the key which came latest.
     long leftSequenceID = left.getSequenceID();
     long rightSequenceID = right.getSequenceID();
     if (leftSequenceID > rightSequenceID) {
       return -1;
     } else if (leftSequenceID < rightSequenceID) {
       return 1;
     } else {
       return 0;
     }
   }
 }
 {code}
 Here, in the 0.96 case, the mvcc of the cell in both files will be 0, so 
 the comparison falls through to the else condition, where the seq id of the 
 bulk loaded file is greater; that file sorts first, ensuring that the scan 
 happens from the bulk loaded file.
 In 0.98+, as we are retaining the mvcc/seqid, the mvcc is not reset to 0 
 (it remains a non-zero positive value). Hence compare() sorts the cell in 
 the flushed/compacted file first, which means that though we know the 
 latest file is the bulk loaded one, we don't scan its data.
 This seems to be a behaviour change. Will check other corner cases also, but 
 we are trying to understand the behaviour of bulk load because we are 
 evaluating whether it can be used for the MOB design.
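To make the behaviour difference concrete, here is a minimal, self-contained model of the two-level comparison (plain Java with illustrative stand-in classes, not HBase code; the numbers are made up):

```java
public class SeqIdTieBreak {
    // Minimal stand-ins for Cell and KeyValueScanner (illustrative only).
    static class Cell {
        final String key; final long mvcc;
        Cell(String key, long mvcc) { this.key = key; this.mvcc = mvcc; }
    }
    static class Scanner {
        final Cell peek; final long sequenceId;
        Scanner(Cell peek, long sequenceId) { this.peek = peek; this.sequenceId = sequenceId; }
    }

    // Cells: lower key first; for equal keys, the higher mvcc (newer) first.
    static int compareCells(Cell l, Cell r) {
        int c = l.key.compareTo(r.key);
        return c != 0 ? c : Long.compare(r.mvcc, l.mvcc);
    }

    // Scanners: fall back to the file-level sequence id only on a full tie.
    static int compareScanners(Scanner l, Scanner r) {
        int c = compareCells(l.peek, r.peek);
        return c != 0 ? c : Long.compare(r.sequenceId, l.sequenceId);
    }

    public static void main(String[] args) {
        // 0.96-style: both mvccs were reset to 0, so the tie is broken by the
        // file seq id and the bulk loaded file (seq id 9 > 5) sorts first.
        Scanner flushed96 = new Scanner(new Cell("row1", 0), 5);
        Scanner bulk      = new Scanner(new Cell("row1", 0), 9);
        System.out.println(compareScanners(bulk, flushed96) < 0);  // true

        // 0.98-style: the flushed cell keeps a non-zero mvcc, so it wins the
        // cell comparison outright; the file seq id is never consulted.
        Scanner flushed98 = new Scanner(new Cell("row1", 7), 5);
        System.out.println(compareScanners(flushed98, bulk) < 0);  // true
    }
}
```

Same keys, same timestamps, yet the winner flips depending on whether the mvcc was zeroed: that is exactly the behaviour change described above.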



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11728) Some data miss when scan using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11728:
---

Status: Open  (was: Patch Available)

 Some data miss when scan using PREFIX_TREE DATA-BLOCK-ENCODING
 --

 Key: HBASE-11728
 URL: https://issues.apache.org/jira/browse/HBASE-11728
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.98.4, 0.96.1.1
 Environment: ubuntu12 
 hadoop-2.2.0
 Hbase-0.96.1.1
 SUN-JDK(1.7.0_06-b24)
Reporter: wuchengzhi
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 29cb562fad564b468ea9d61a2d60e8b0, HBASE-11728.patch, 
 HBASE-11728_1.patch, HBASE-11728_2.patch, HFileAnalys.java, 
 TestPrefixTree.java

   Original Estimate: 72h
  Remaining Estimate: 72h

 In a Scan case, I prepared some data as below.
 Table desc (using the prefix-tree encoding):
 'prefix_tree_test', {NAME => 'cf_1', DATA_BLOCK_ENCODING => 'PREFIX_TREE', 
 TTL => '15552000'}
 and I put 5 rows as:
 (RowKey, Qualifier, Value)
 'a-b-0-0', 'qf_1', 'c1-value'
 'a-b-A-1', 'qf_1', 'c1-value'
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 'a-b-B-2-1402397300-1402416535', 'qf_2', 'c2-value-3'
 So I scan the row keys between 'a-b-A-1' and 'a-b-A-1:' and get the 
 correct result.
 Test 1:
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 'a-b-A-1', 'qf_1', 'c1-value'
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 Then I try the next scan, with addColumn.
 Test 2:
 Scan scan = new Scan();
 scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_2"));
 scan.setStartRow("a-b-A-1".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 expected:
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 but actually I got nothing. Then I changed the addColumn to 
 scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_1")); and I got the 
 expected result 'a-b-A-1', 'qf_1', 'c1-value' as well.
 Then I did more testing. I updated the case to make the startRow greater 
 than 'a-b-A-1'.
 Test 3:
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1-".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 expected:
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 but actually I got nothing again. I then made the start row greater than 
 'a-b-A-1-1402329600-1402396277':
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1-140239".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 and I got the expected row:
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 So I think it may be a bug in the prefix-tree encoding. It happens after the 
 data is flushed to the storefile, and it's fine while the data is in the 
 memstore.
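An aside on the scan bounds used above: the ':' stop row works as a prefix scan because of plain byte ordering. A tiny self-contained check (plain Java; for ASCII keys, String comparison matches HBase's unsigned byte comparison):

```java
public class StopRowDemo {
    public static void main(String[] args) {
        // ':' (0x3A) sorts after '-' (0x2D) and after the digits '0'-'9',
        // so every row key beginning with "a-b-A-1" (including the bare key
        // and all "a-b-A-1-..." extensions) is lexicographically below the
        // stop row "a-b-A-1:", while later rows are not.
        String stop = "a-b-A-1:";
        System.out.println("a-b-A-1".compareTo(stop) < 0);                        // true: in range
        System.out.println("a-b-A-1-1402329600-1402396277".compareTo(stop) < 0);  // true: in range
        System.out.println("a-b-B-2-1402397300-1402416535".compareTo(stop) < 0);  // false: excluded
    }
}
```

So all three tests above define the same key range; only the addColumn and start-row variations differ, which is what points at the prefix-tree seek logic rather than the range itself.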





[jira] [Updated] (HBASE-11728) Some data miss when scan using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11728:
---

Status: Patch Available  (was: Open)



[jira] [Updated] (HBASE-11728) Some data miss when scan using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11728:
---

Attachment: HBASE-11728_3.patch

Updated the category on the test patch. Also removed the System.out.println 
calls in the test case and converted them to assertEquals.



[jira] [Created] (HBASE-11768) Register region server in zookeeper by ip address

2014-08-18 Thread Cheney Sun (JIRA)
Cheney Sun created HBASE-11768:
--

 Summary: Register region server in zookeeper by ip address
 Key: HBASE-11768
 URL: https://issues.apache.org/jira/browse/HBASE-11768
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 2.0.0
Reporter: Cheney Sun


An HBase cluster isn't always set up along with a DNS server, but 
regionservers currently register their hostnames in zookeeper, which brings 
some inconvenience when a regionserver isn't in a DNS server. In such a 
situation, clients have to maintain the ip/hostname mapping in their 
/etc/hosts files in order to resolve the hostname returned from zookeeper to 
the right address. 
This causes a lot of pain for clients, especially when adding new machines to 
the cluster or when some machines' addresses change for some reason: all 
clients need to update their host mapping files. 

This issue is to address the problem above by adding an option to let each 
regionserver register itself by ip address, instead of hostname only.





[jira] [Updated] (HBASE-11768) Register region server in zookeeper by ip address

2014-08-18 Thread Cheney Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheney Sun updated HBASE-11768:
---

Description: 
HBase cluster isn't always setup along with a DNS server. But regionservers now 
register their hostnames in zookeeper, which bring some inconvenience when 
regionserver isn't in one DNS server. In such situation, clients have to 
maintain the ip/hostname mapping in their /etc/hosts files in order to resolve 
the hostname returned from zookeeper to the right address. 

However, this causes a lot of pain for clients to maintain the mapping, 
especially when adding new machines to the cluster, or some machines' address 
changed due to some reason. All clients need to update their host mapping 
files. 

The issue is to address this problem above, and try to add an option to let 
each regionserver record themself by ip address, instead of hostname only.

  was:
HBase cluster isn't always setup along with a DNS server. But regionservers now 
register their hostnames in zookeeper, which bring some inconvenience when 
regionserver isn't in one DNS server. In such situation, clients have to 
maintain the ip/hostname mapping in their /etc/hosts files in order to resolve 
the hostname returned from zookeeper to the right address. 
This causes a lot of pain for clients to maintain the mapping, especially when 
adding new machines to the cluster, or some machines' address changed due to 
some reason. All clients need to update their host mapping files. 

The issue is to address this problem above, and try to add an option to let 
each regionserver record themself by ip address, instead of hostname only.




[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file

2014-08-18 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100361#comment-14100361
 ] 

Anoop Sam John commented on HBASE-11591:


{code}
+  if (bulkLoad) {
+    // TODO : While doing cells this should be avoided in the read path.
+    KeyValue leftKV = KeyValueUtil.ensureKeyValue(left.peek());
+    KeyValue rightKV = KeyValueUtil.ensureKeyValue(right.peek());
+    if (leftKV.getSequenceId() == 0) {
+      leftKV.setSequenceId(rightKV.getSequenceId());
+    } else {
+      rightKV.setSequenceId(leftKV.getSequenceId());
+    }
+  }
{code}

So what do we do here, Ram? 
I think we need to set the KV seqId for KVs from the bulk loaded file to the 
file seqId (which we get from that file name). So instead of setting the 
seqId of one KV to the other (which looks hacky IMO), can we do the set using 
the seqId of the file?
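Getting the file seqId from the name could be a small helper along these lines. This is a hedged sketch: the `_SeqId_<n>_` suffix convention for bulk loaded store files and the helper name are assumptions for illustration, not taken from the patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BulkLoadSeqId {
    // Assumed convention: a bulk loaded store file carries its assigned
    // sequence id embedded in the file name as "_SeqId_<n>_". Extracting it
    // would let every KV read from that file be stamped with the file-level
    // seq id, instead of borrowing the seq id from the opposing KV.
    static long seqIdFromFileName(String name) {
        Matcher m = Pattern.compile("_SeqId_(\\d+)_").matcher(name);
        return m.find() ? Long.parseLong(m.group(1)) : 0L;
    }

    public static void main(String[] args) {
        System.out.println(seqIdFromFileName("3a2f1c_SeqId_42_")); // 42
        System.out.println(seqIdFromFileName("3a2f1c"));           // 0, i.e. not bulk loaded
    }
}
```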



[jira] [Updated] (HBASE-11768) Register region server in zookeeper by ip address

2014-08-18 Thread Cheney Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheney Sun updated HBASE-11768:
---

Attachment: HBASE_11768.patch

I'd like to provide a patch for review.

The patch is rather straightforward: it adds one option, 
hbase.regionserver.use.ip, to control whether to use the ip or the hostname 
in zookeeper. 

By default the value is false, leaving the current behavior unchanged. If the 
value is set to true, the regionserver's ip instead of its hostname is 
registered under HBASE_ROOT/rs/ip.xx.xxx.
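If the option ships as described, enabling it would be a one-property change in hbase-site.xml. Note the property name below comes from the comment above and this patch only; it is not a released HBase setting.

```xml
<!-- Hypothetical: hbase.regionserver.use.ip is proposed in this patch
     and is not part of any released HBase version. -->
<property>
  <name>hbase.regionserver.use.ip</name>
  <value>true</value>
</property>
```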





[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file

2014-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100372#comment-14100372
 ] 

Hadoop QA commented on HBASE-11591:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12662425/HBASE-11591_2.patch
  against trunk revision .
  ATTACHMENT ID: 12662425

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100 characters.

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10473//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10473//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10473//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10473//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10473//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10473//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10473//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10473//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10473//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10473//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10473//console

This message is automatically generated.


[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100384#comment-14100384
 ] 

ramkrishna.s.vasudevan commented on HBASE-11591:


I got a clean QA run.
bq. isBulkLoadResult - isBulkLoaded()? For setter also?
Okie, fine with that.
bq. I see this isBulkLoadResult () in StoreFile.java level also. I would have 
been better to know this status from StoreFile rather than from StoreFileReader.
I spent some time on doing it that way but later decided on this approach. 
The first thing is that only the reader is passed to the StoreFileScanner, 
and the StoreFileScanner only has a reader associated with it. So if we need 
this information from the StoreFile then I need to change the constructor of 
StoreFileScanner or use a setter, which I thought was making the patch 
heavier. Also, in this case the information of whether it is a bulk load has 
to be passed from the reader (because the reader reads the file info) and 
then set on the StoreFile. Currently the reader is also an inner class of 
StoreFile. Considering all this, I just kept the new getter/setter at the 
Reader level. 
bq. compareWithoutMvcc
Okie.
bq. IMHO we should not do this KeyValueUtil.ensureKeyValue() stuff from now
Yes, but I think we should do that in a separate JIRA, in fact, to avoid this 
setSeqId being done via KeyValueUtil.ensureKeyValue().
bq. I think we need to set KV seqId for KVs, from bulk loaded file, to the 
file seqId
Yes. I set the other KV's sequence id because I wanted to ensure that we 
return one of the two KVs that are contesting here, and that we return a KV 
like what would have been returned if there were no clash and the latest one 
was from the flushed file.
Anyway, before changing this let me check some more cases; then I will update 
the patch accordingly. In fact I had earlier set the sequenceId of the file 
and later changed it to this way.

 Scanner fails to retrieve KV  from bulk loaded file with highest sequence id 
 than the cell's mvcc in a non-bulk loaded file
 ---

 Key: HBASE-11591
 URL: https://issues.apache.org/jira/browse/HBASE-11591
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0

 Attachments: HBASE-11591.patch, HBASE-11591_1.patch, 
 HBASE-11591_2.patch, TestBulkload.java


 See discussion in HBASE-11339.
 When we have a case where there are same KVs in two files one produced by 
 flush/compaction and the other thro the bulk load.
 Both the files have some same kvs which matches even in timestamp.
 Steps:
 Add some rows with a specific timestamp and flush the same.  
 Bulk load a file with the same data.. Enusre that assign seqnum property is 
 set.
 The bulk load should use HFileOutputFormat2 (or ensure that we write the 
 bulk_time_output key).
 This would ensure that the bulk loaded file has the highest seq num.
 Assume the cell in the flushed/compacted store file is 
 row1,cf,cq,ts1, value1  and the cell in the bulk loaded file is
 row1,cf,cq,ts1,value2 
 (There are no parallel scans).
 Issue a scan on the table in 0.96. The retrieved value is 
 row1,cf1,cq,ts1,value2
 But the same in 0.98 will retrieve row1,cf1,cq,ts2,value1. 
 This is a behaviour change.  This is because of this code 
 {code}
 public int compare(KeyValueScanner left, KeyValueScanner right) {
   int comparison = compare(left.peek(), right.peek());
   if (comparison != 0) {
     return comparison;
   } else {
     // Since both the keys are exactly the same, we break the tie in favor
     // of the key which came latest.
     long leftSequenceID = left.getSequenceID();
     long rightSequenceID = right.getSequenceID();
     if (leftSequenceID > rightSequenceID) {
       return -1;
     } else if (leftSequenceID < rightSequenceID) {
       return 1;
     } else {
       return 0;
     }
   }
 }
 {code}
 Here in the 0.96 case the mvcc of the cell in both files will be 0, so the 
 comparison will happen in the else condition, where the seq id of the bulk 
 loaded file is greater and sorts first, ensuring that the scan happens from 
 that bulk loaded file.
 In the 0.98+ case, as we are retaining the mvcc+seqid, we are not making the 
 mvcc 0 (it remains a non-zero positive value).  Hence the compare() sorts the 
 cell in the flushed/compacted file first.  Which means though we know the 
 latest file is the bulk loaded file, we don't scan its data.
 Seems to be a behaviour change.  Will check on other corner cases also, but we 
 are trying to know the behaviour of bulk load because we are evaluating if it 
 can be used for MOB design.
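The tie-break above can be sketched in isolation. The toy `SimpleScanner` type and key format below are illustrative assumptions, not HBase code; the sketch only shows that, for equal keys, the scanner with the higher sequence id must sort first:

```java
import java.util.Comparator;

// Toy stand-in for a store file scanner: just the two fields the tie-break
// needs. Illustrative only -- not the real KeyValueScanner interface.
class SimpleScanner {
    final String peekKey;   // key currently at the head of this scanner
    final long sequenceId;  // file sequence id (a bulk loaded file gets the highest)

    SimpleScanner(String peekKey, long sequenceId) {
        this.peekKey = peekKey;
        this.sequenceId = sequenceId;
    }
}

public class TieBreakSketch {
    // Mirrors the compare() quoted above: equal keys fall through to the
    // sequence-id tie-break, where the HIGHER sequence id sorts first.
    static final Comparator<SimpleScanner> COMPARATOR = (left, right) -> {
        int comparison = left.peekKey.compareTo(right.peekKey);
        if (comparison != 0) {
            return comparison;
        }
        // Higher seq id -> negative result -> sorts earlier in the heap.
        return Long.compare(right.sequenceId, left.sequenceId);
    };

    public static void main(String[] args) {
        SimpleScanner flushed = new SimpleScanner("row1/cf:cq/ts1", 15L);
        SimpleScanner bulkLoaded = new SimpleScanner("row1/cf:cq/ts1", 21L);
        // Same key: the bulk loaded scanner (seq id 21) must win the tie.
        System.out.println(COMPARATOR.compare(bulkLoaded, flushed) < 0); // true
    }
}
```

The bug described above is exactly a failure of this tie-break: once a retained non-zero mvcc makes the key comparison itself unequal, the sequence-id branch is never reached.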

[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100392#comment-14100392
 ] 

ramkrishna.s.vasudevan commented on HBASE-11591:


bq.Also what abt compacting a flush file and a bulk loaded one? Will we have 
issues then? This patch will handle that also? Mind adding tests around that 
also.
The current test is also compacting the flushed files.  Behaviour-wise both 
would be the same in 0.99+.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11553) Abstract visibility label related services into an interface

2014-08-18 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11553:
---

Attachment: (was: HBASE-11553_V5.patch)

 Abstract visibility label related services into an interface
 

 Key: HBASE-11553
 URL: https://issues.apache.org/jira/browse/HBASE-11553
 Project: HBase
  Issue Type: Improvement
  Components: security
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: HBASE-11553.patch, HBASE-11553.patch, 
 HBASE-11553_V2.patch, HBASE-11553_V3.patch, HBASE-11553_V4.patch, 
 HBASE-11553_V5.patch


 - storage and retrieval of label dictionary and authentication sets 
 - marshalling and unmarshalling of visibility expression representations 
 in operation attributes and cell tags
 - management of assignment of authorizations to principals
 This will allow us to introduce additional serde implementations for 
 visibility expressions, for example storing as strings in some places and 
 compressed/tokenized representation in others in order to support additional 
 use cases.
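A minimal sketch of what such a pluggable service could look like, derived only from the three responsibilities listed above. All names and signatures here are hypothetical, not the actual interface added by this patch; the in-memory implementation stores expressions as plain UTF-8 strings, standing in for the "strings in some places" serde, and a compressed/tokenized implementation could be swapped in behind the same interface:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shape of a pluggable visibility-label service (names are
// illustrative only; the real HBASE-11553 interface differs).
interface VisibilityLabelService {
    // Storage and retrieval of the label dictionary and authorization sets.
    void addLabels(List<String> labels);
    List<String> listLabels();

    // Marshalling/unmarshalling of visibility expressions between the form
    // used in operation attributes and the serialized form kept in cell tags.
    byte[] encodeVisibilityExpression(String expression);
    String decodeVisibilityExpression(byte[] serialized);

    // Management of assignment of authorizations to principals.
    void setAuths(String user, List<String> authLabels);
    List<String> getAuths(String user);
}

// Trivial in-memory implementation using a plain UTF-8 string serde.
class InMemoryVisibilityLabelService implements VisibilityLabelService {
    private final List<String> labels = new ArrayList<>();
    private final Map<String, List<String>> auths = new HashMap<>();

    public void addLabels(List<String> newLabels) { labels.addAll(newLabels); }
    public List<String> listLabels() { return new ArrayList<>(labels); }

    public byte[] encodeVisibilityExpression(String expression) {
        return expression.getBytes(StandardCharsets.UTF_8);
    }
    public String decodeVisibilityExpression(byte[] serialized) {
        return new String(serialized, StandardCharsets.UTF_8);
    }
    public void setAuths(String user, List<String> authLabels) {
        auths.put(user, new ArrayList<>(authLabels));
    }
    public List<String> getAuths(String user) {
        return auths.getOrDefault(user, new ArrayList<>());
    }
}
```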





[jira] [Updated] (HBASE-11553) Abstract visibility label related services into an interface

2014-08-18 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11553:
---

Status: Open  (was: Patch Available)



[jira] [Updated] (HBASE-11553) Abstract visibility label related services into an interface

2014-08-18 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11553:
---

Status: Patch Available  (was: Open)



[jira] [Updated] (HBASE-11553) Abstract visibility label related services into an interface

2014-08-18 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-11553:
---

Attachment: HBASE-11553_V5.patch



[jira] [Commented] (HBASE-11553) Abstract visibility label related services into an interface

2014-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100457#comment-14100457
 ] 

Hadoop QA commented on HBASE-11553:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12662447/HBASE-11553_V5.patch
  against trunk revision .
  ATTACHMENT ID: 12662447

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 15 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.regionserver.wal.TestLogRollingNoCluster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10475//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10475//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10475//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10475//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10475//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10475//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10475//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10475//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10475//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10475//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10475//console

This message is automatically generated.



[jira] [Updated] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11591:
---

Status: Open  (was: Patch Available)



[jira] [Updated] (HBASE-11757) Provide a common base abstract class for both RegionObserver and MasterObserver

2014-08-18 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-11757:


Attachment: HBASE-11757-v0.patch
HBASE-11757-0.98-v0.patch

 Provide a common base abstract class for both RegionObserver and 
 MasterObserver
 ---

 Key: HBASE-11757
 URL: https://issues.apache.org/jira/browse/HBASE-11757
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: Matteo Bertozzi
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: HBASE-11757-0.98-v0.patch, HBASE-11757-v0.patch


 Some security coprocessors extend both RegionObserver and MasterObserver, 
 unfortunately only one of the two can use the available base abstract class 
 implementations. Provide a common base abstract class for both the 
 RegionObserver and MasterObserver interfaces. Update current coprocessors 
 that extend both interfaces to use the new common base abstract class.
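The idea can be sketched with simplified stand-in interfaces (the real RegionObserver/MasterObserver have many more hooks, and everything below is illustrative): a single base class supplies no-op implementations of both interfaces, so a coprocessor such as a security controller extends one class and overrides only the hooks it needs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for the two observer interfaces (illustrative only).
interface RegionObserver {
    void preGet(String row);
}
interface MasterObserver {
    void preCreateTable(String tableName);
}

// The common base class: no-op defaults for both interfaces, so a subclass
// needing region AND master hooks no longer has to pick one base class.
abstract class BaseMasterAndRegionObserver implements RegionObserver, MasterObserver {
    public void preGet(String row) {}
    public void preCreateTable(String tableName) {}
}

// Example coprocessor extending the single common base.
class RecordingSecurityObserver extends BaseMasterAndRegionObserver {
    final List<String> events = new ArrayList<>();
    @Override public void preGet(String row) { events.add("get:" + row); }
    @Override public void preCreateTable(String t) { events.add("create:" + t); }
}

public class ObserverBaseSketch {
    public static void main(String[] args) {
        RecordingSecurityObserver obs = new RecordingSecurityObserver();
        obs.preGet("row1");
        obs.preCreateTable("t1");
        System.out.println(obs.events.equals(Arrays.asList("get:row1", "create:t1"))); // true
    }
}
```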





[jira] [Updated] (HBASE-11757) Provide a common base abstract class for both RegionObserver and MasterObserver

2014-08-18 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-11757:


Status: Patch Available  (was: Open)



[jira] [Updated] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11591:
---

Status: Patch Available  (was: Open)



[jira] [Updated] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11591:
---

Attachment: HBASE-11591_3.patch

Updated patch.  Tries to set the sequenceId of the bulk loaded file on the kv 
that is retrieved from the bulk loaded file.
The other thing to be noted is that in KVScannerComparator.compare() I think 
the code would never reach this branch
{code}
} else if (leftSequenceID < rightSequenceID) {
{code}
because the list of StoreFiles is always sorted based on the seqId.  So if we 
have the seqIds of the storefiles as 15, 19, 21 then while creating the KVHeap
{code}
for (KeyValueScanner scanner : scanners) {
  if (scanner.peek() != null) {
    this.heap.add(scanner);
  } else {
    scanner.close();
  }
}
{code}
it will try to add 15, 19 and then 21.  The compare() in KVScannerComparator 
will be called from PriorityQueue
{code}
private void siftUpUsingComparator(int k, E x) {
  while (k > 0) {
    int parent = (k - 1) >>> 1;
    Object e = queue[parent];
    if (comparator.compare(x, (E) e) >= 0)
      break;
    queue[k] = e;
    k = parent;
  }
  queue[k] = x;
}
{code}
Here we can see that the left hand side is always the element that we are 
trying to add and the right hand side is the existing one in the heap.  Since 
the list is always sorted (15, 19 and 21) the compare will compare LHS=19 
against RHS=15 and then LHS=21 against RHS=19.  So I think the leftSequenceID 
will always be bigger.  Anyway, added the condition of setting the sequenceId 
on the rightKV also.
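The claim about which side of compare() each scanner lands on can be checked directly against java.util.PriorityQueue with a recording comparator. The comparator below mimics the seq-id tie-break (higher seq id sorts first), and the seqIds 15, 19, 21 are the ones from the example above; this is a standalone sketch, not HBase code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class HeapCompareOrder {
    // Offers the given "file seqIds" to a PriorityQueue whose comparator sorts
    // higher seqIds first (as KVScannerComparator does for equal keys), and
    // records every (left, right) pair siftUpUsingComparator passes in.
    static List<long[]> recordComparisons(long... seqIds) {
        List<long[]> calls = new ArrayList<>();
        PriorityQueue<Long> heap = new PriorityQueue<>((left, right) -> {
            calls.add(new long[] { left, right });
            return Long.compare(right, left); // higher seq id sorts first
        });
        for (long id : seqIds) {
            heap.add(id);
        }
        return calls;
    }

    public static void main(String[] args) {
        // Adding scanners in seq-id order 15, 19, 21 (the first add makes no
        // comparison since the heap is empty):
        for (long[] call : recordComparisons(15, 19, 21)) {
            System.out.println("compare(left=" + call[0] + ", right=" + call[1] + ")");
        }
        // prints compare(left=19, right=15) then compare(left=21, right=19)
    }
}
```

So for scanners added in ascending seq-id order, the newly added (higher seq id) scanner is indeed always the left argument during sift-up, matching the observation above.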



[jira] [Commented] (HBASE-11553) Abstract visibility label related services into an interface

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100501#comment-14100501
 ] 

ramkrishna.s.vasudevan commented on HBASE-11553:


Just 2 minor nits in RB. Rest looks great.  +1 from me.



[jira] [Created] (HBASE-11769) Truncate table shouldn't revoke user privileges

2014-08-18 Thread hongyu bi (JIRA)
hongyu bi created HBASE-11769:
-

 Summary: Truncate table shouldn't revoke user privileges
 Key: HBASE-11769
 URL: https://issues.apache.org/jira/browse/HBASE-11769
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.94.15
Reporter: hongyu bi


hbase(main):002:0> create 'a','cf'
0 row(s) in 0.2500 seconds

=> Hbase::Table - a
hbase(main):003:0> grant 'usera','R','a'
0 row(s) in 0.2080 seconds

hbase(main):007:0> user_permission 'a'
User                  Table,Family,Qualifier:Permission
 usera                a,,: [Permission: actions=READ]

hbase(main):004:0> truncate 'a'
Truncating 'a' table (it may take a while):
 - Disabling table...
 - Dropping table...
 - Creating table...
0 row(s) in 1.5320 seconds

hbase(main):005:0> user_permission 'a'
User                  Table,Family,Qualifier:Permission



[jira] [Commented] (HBASE-11757) Provide a common base abstract class for both RegionObserver and MasterObserver

2014-08-18 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100519#comment-14100519
 ] 

Anoop Sam John commented on HBASE-11757:


+1



[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file

2014-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100539#comment-14100539
 ] 

Hadoop QA commented on HBASE-11591:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12662463/HBASE-11591_3.patch
  against trunk revision .
  ATTACHMENT ID: 12662463

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestRegionRebalancing

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10477//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10477//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10477//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10477//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10477//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10477//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10477//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10477//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10477//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10477//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10477//console

This message is automatically generated.

 Scanner fails to retrieve KV  from bulk loaded file with highest sequence id 
 than the cell's mvcc in a non-bulk loaded file
 ---

 Key: HBASE-11591
 URL: https://issues.apache.org/jira/browse/HBASE-11591
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0

 Attachments: HBASE-11591.patch, HBASE-11591_1.patch, 
 HBASE-11591_2.patch, HBASE-11591_3.patch, TestBulkload.java


 See discussion in HBASE-11339.
 This is the case where the same KVs appear in two files, one produced by 
 flush/compaction and the other through bulk load.
 Both files contain some identical KVs, matching even in timestamp.
 Steps:
 Add some rows with a specific timestamp and flush them.
 Bulk load a file with the same data. Ensure that the assign-seqnum property 
 is set.
 The bulk load should use HFileOutputFormat2 (or ensure that we write the 
 bulk_time_output key).
 This ensures that the bulk loaded file has the highest seq num.
 Assume the cell in the flushed/compacted store file is 
 row1,cf,cq,ts1,value1 and the cell in the bulk loaded file is 
 row1,cf,cq,ts1,value2 
 (There are no parallel scans).
 Issue a scan on the table in 0.96. The retrieved value is 
 row1,cf1,cq,ts1,value2.
 But the same scan in 0.98 retrieves row1,cf1,cq,ts2,value1. 
 This is a behaviour change. It is because of this code:
 {code}
 public int compare(KeyValueScanner left, KeyValueScanner right) {
   int comparison = compare(left.peek(), right.peek());
   if (comparison != 0) {
     return comparison;
   } else {
     // Since both the keys are exactly the same, we break the tie 
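The tie-break the truncated snippet refers to can be sketched outside HBase. The sketch below is illustrative Python, not HBase's actual comparator: when two scanners peek at byte-identical keys, ordering by the backing file's sequence id (higher first) makes the bulk-loaded file with the top seqnum win, which is the 0.96 behaviour described above.

```python
# Illustrative sketch (plain Python, NOT HBase source): ordering scanners when
# their peeked cells compare as equal. The scanner backed by the file with the
# higher sequence id (e.g. a bulk-loaded file assigned the top seqnum) sorts
# first, so its cell is the one the scan returns.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Cell = Tuple[str, str, str, int, str]  # (row, cf, cq, ts, value)

@dataclass
class FileScanner:
    seq_id: int                      # sequence id of the backing store file
    cells: List[Cell] = field(default_factory=list)

    def peek(self) -> Optional[Cell]:
        return self.cells[0] if self.cells else None

def scanner_order(s: FileScanner):
    row, cf, cq, ts, _ = s.peek()
    # Key order first (row, cf, cq, newest ts first); when the keys are
    # exactly the same, the negated seq_id breaks the tie so the
    # higher-sequence (newer) file sorts first.
    return (row, cf, cq, -ts, -s.seq_id)

flushed = FileScanner(seq_id=5, cells=[("row1", "cf", "cq", 1, "value1")])
bulk    = FileScanner(seq_id=9, cells=[("row1", "cf", "cq", 1, "value2")])

winner = min([flushed, bulk], key=scanner_order)
print(winner.peek()[4])  # prints "value2": the bulk-loaded cell wins
```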

[jira] [Updated] (HBASE-11728) Some data miss when scan using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11728:
---

Attachment: HBASE-11728_4.patch

Retry QA.

 Some data miss when scan using PREFIX_TREE DATA-BLOCK-ENCODING
 --

 Key: HBASE-11728
 URL: https://issues.apache.org/jira/browse/HBASE-11728
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.96.1.1, 0.98.4
 Environment: ubuntu12 
 hadoop-2.2.0
 Hbase-0.96.1.1
 SUN-JDK(1.7.0_06-b24)
Reporter: wuchengzhi
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 29cb562fad564b468ea9d61a2d60e8b0, HBASE-11728.patch, 
 HBASE-11728_1.patch, HBASE-11728_2.patch, HBASE-11728_3.patch, 
 HBASE-11728_4.patch, HFileAnalys.java, TestPrefixTree.java

   Original Estimate: 72h
  Remaining Estimate: 72h

 In a scan case, I prepare some data as below:
 Table desc (using the prefix-tree encoding):
 'prefix_tree_test', {NAME => 'cf_1', DATA_BLOCK_ENCODING => 'PREFIX_TREE', 
 TTL => '15552000'}
 and I put 5 rows as:
 (RowKey, Qualifier, Value)
 'a-b-0-0', 'qf_1', 'c1-value'
 'a-b-A-1', 'qf_1', 'c1-value'
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 'a-b-B-2-1402397300-1402416535', 'qf_2', 'c2-value-3'
 So I scan the row keys between 'a-b-A-1' and 'a-b-A-1:' and get the 
 correct result:
 Test 1: 
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 'a-b-A-1', 'qf_1', 'c1-value'
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 Next I try a scan with addColumn:
 Test 2:
 Scan scan = new Scan();
 scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_2"));
 scan.setStartRow("a-b-A-1".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 expected:
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 but actually I got nothing. When I change the addColumn to 
 scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_1")); I get the 
 expected result 'a-b-A-1', 'qf_1', 'c1-value' as well.
 Then I do more testing: I modify the startRow to be greater than 'a-b-A-1'.
 Test 3:
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1-".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 expected:
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 but actually I got nothing again. If I make the start row greater than 
 'a-b-A-1-1402329600-1402396277':
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1-140239".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 I get the expected row:
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 So I think it may be a bug in the prefix-tree encoding. It happens after 
 the data is flushed to the store file; everything is fine while the data 
 is in the memstore.
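As a plain reference for what the three tests above should return (no prefix-tree or HBase involved, just the reported row set and scan bounds), a scan is a half-open key range with an optional qualifier filter:

```python
# Reference model (plain Python, not HBase): what each scan in the report
# should return. Rows are (row_key, qualifier, value); a scan returns rows
# with start_row <= row_key < stop_row, optionally filtered by qualifier.
ROWS = [
    ("a-b-0-0", "qf_1", "c1-value"),
    ("a-b-A-1", "qf_1", "c1-value"),
    ("a-b-A-1-1402329600-1402396277", "qf_2", "c2-value"),
    ("a-b-A-1-1402397227-1402415999", "qf_2", "c2-value-2"),
    ("a-b-B-2-1402397300-1402416535", "qf_2", "c2-value-3"),
]

def scan(start, stop, qualifier=None):
    return [r for r in ROWS
            if start <= r[0] < stop
            and (qualifier is None or r[1] == qualifier)]

# Test 1: no column filter -> three rows (matches the correct result above)
assert len(scan("a-b-A-1", "a-b-A-1:")) == 3
# Test 2: qf_2 only -> the two long rows (the report got nothing here)
assert len(scan("a-b-A-1", "a-b-A-1:", "qf_2")) == 2
# Test 3: start row 'a-b-A-1-' -> still the two long rows
assert len(scan("a-b-A-1-", "a-b-A-1:", "qf_2")) == 2
```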





[jira] [Updated] (HBASE-11728) Some data miss when scan using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11728:
---

Status: Open  (was: Patch Available)



[jira] [Updated] (HBASE-11728) Some data miss when scan using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11728:
---

Status: Patch Available  (was: Open)



[jira] [Updated] (HBASE-11728) Some data miss when scan using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11728:
---

Fix Version/s: (was: 0.94.23)



[jira] [Updated] (HBASE-11728) Some data miss when scan using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11728:
---

Fix Version/s: 0.94.23



[jira] [Commented] (HBASE-11769) Truncate table shouldn't revoke user privileges

2014-08-18 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100546#comment-14100546
 ] 

Jean-Marc Spaggiari commented on HBASE-11769:
-

Makes sense to me.

There is also a truncate variant which preserves the splits (truncate_preserve). 
You might want to modify that one too.

 Truncate table shouldn't revoke user privileges
 ---

 Key: HBASE-11769
 URL: https://issues.apache.org/jira/browse/HBASE-11769
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.94.15
Reporter: hongyu bi

 hbase(main):002:0> create 'a','cf'
 0 row(s) in 0.2500 seconds
 => Hbase::Table - a
 hbase(main):003:0> grant 'usera','R','a'
 0 row(s) in 0.2080 seconds
 hbase(main):007:0> user_permission 'a'
 User    Table,Family,Qualifier:Permission
  usera  a,,: [Permission: actions=READ]
 hbase(main):004:0> truncate 'a'
 Truncating 'a' table (it may take a while):
  - Disabling table...
  - Dropping table...
  - Creating table...
 0 row(s) in 1.5320 seconds
 hbase(main):005:0> user_permission 'a'
 User    Table,Family,Qualifier:Permission






[jira] [Commented] (HBASE-11768) Register region server in zookeeper by ip address

2014-08-18 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100548#comment-14100548
 ] 

Jean-Marc Spaggiari commented on HBASE-11768:
-

Indeed, pretty simple patch. Have you tested it in a real cluster?


 Register region server in zookeeper by ip address
 -

 Key: HBASE-11768
 URL: https://issues.apache.org/jira/browse/HBASE-11768
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 2.0.0
Reporter: Cheney Sun
 Attachments: HBASE_11768.patch


 An HBase cluster isn't always set up along with a DNS server, but region 
 servers currently register their hostnames in ZooKeeper, which brings some 
 inconvenience when a region server isn't covered by one DNS server. In that 
 situation, clients have to maintain the IP/hostname mapping in their 
 /etc/hosts files in order to resolve the hostname returned from ZooKeeper 
 to the right address. 
 This causes a lot of pain for clients maintaining the mapping, especially 
 when adding new machines to the cluster, or when some machines' addresses 
 change for some reason: all clients need to update their host mapping 
 files. 
 This issue is to address the problem above by adding an option to let each 
 region server register itself by IP address instead of hostname only.





[jira] [Commented] (HBASE-11757) Provide a common base abstract class for both RegionObserver and MasterObserver

2014-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100553#comment-14100553
 ] 

Hadoop QA commented on HBASE-11757:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12662462/HBASE-11757-v0.patch
  against trunk revision .
  ATTACHMENT ID: 12662462

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
  org.apache.hadoop.hbase.client.TestMultiParallel
  org.apache.hadoop.hbase.TestRegionRebalancing
  org.apache.hadoop.hbase.regionserver.TestHRegionBusyWait

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10476//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10476//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10476//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10476//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10476//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10476//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10476//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10476//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10476//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10476//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10476//console

This message is automatically generated.



[jira] [Work started] (HBASE-11743) Add unit test for the fix that sorts custom value of BUCKET_CACHE_BUCKETS_KEY

2014-08-18 Thread Gustavo Anatoly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-11743 started by Gustavo Anatoly.

 Add unit test for the fix that sorts custom value of BUCKET_CACHE_BUCKETS_KEY
 -

 Key: HBASE-11743
 URL: https://issues.apache.org/jira/browse/HBASE-11743
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu
Assignee: Gustavo Anatoly
Priority: Minor

 HBASE-11550 sorts the custom value of BUCKET_CACHE_BUCKETS_KEY such that 
 there is no wastage in bucket allocation.
 This JIRA is to add unit test for the fix so that there is no regression in 
 the future.
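The fix under test can be illustrated with a small standalone sketch. This is plain Python, not HBase's BucketAllocator, and the helper name is made up for illustration: a custom comma-separated list of bucket sizes is parsed and sorted ascending before allocation, so the size tiers stay monotonic regardless of the order the user wrote them in.

```python
# Illustrative sketch (plain Python, NOT HBase source): normalize a custom
# list of bucket sizes before bucket allocation. Sorting ascending keeps the
# size tiers monotonic even when the configured list is out of order.
def parse_bucket_sizes(config_value: str) -> list:
    """Parse a comma-separated size list and return it sorted ascending."""
    sizes = [int(s.strip()) for s in config_value.split(",") if s.strip()]
    return sorted(sizes)

# A user-supplied value with sizes out of order:
custom = "65536,9216,17408,5120,33792"
print(parse_bucket_sizes(custom))  # [5120, 9216, 17408, 33792, 65536]
```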





[jira] [Work stopped] (HBASE-11743) Add unit test for the fix that sorts custom value of BUCKET_CACHE_BUCKETS_KEY

2014-08-18 Thread Gustavo Anatoly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-11743 stopped by Gustavo Anatoly.



[jira] [Commented] (HBASE-11769) Truncate table shouldn't revoke user privileges

2014-08-18 Thread chendihao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100559#comment-14100559
 ] 

chendihao commented on HBASE-11769:
---

Agree with [~jmspaggi]. Truncate_preserve works well without removing the 
privileges.

Won't fix, right?



[jira] [Commented] (HBASE-11769) Truncate table shouldn't revoke user privileges

2014-08-18 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100564#comment-14100564
 ] 

Matteo Bertozzi commented on HBASE-11769:
-

truncate_preserve only preserves the set of region splits.
Since the shell does a delete table + create table, it will always remove the 
ACLs.
HBASE-8332 fixed the problem by adding a truncate API which bypasses deleting 
the table's ACLs.



[jira] [Commented] (HBASE-11769) Truncate table shouldn't revoke user privileges

2014-08-18 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100575#comment-14100575
 ] 

Jean-Marc Spaggiari commented on HBASE-11769:
-

Just to be clear, I was not saying that truncate_preserve did or did not 
preserve privileges, just that we might want to look at it too.

So for the purpose of this patch, should Hongyu simply update the ruby scripts 
to call the new API provided by HBASE-8332? That might be cleaner than having 
two implementations (one in Ruby, one in Java) of the same feature.



[jira] [Commented] (HBASE-11761) Add a FAQ item for updating a maven-managed application from 0.94 - 0.96+

2014-08-18 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100648#comment-14100648
 ] 

Sean Busbey commented on HBASE-11761:
-

I was thinking the FAQ for a couple of reasons:

1) It also applies to 0.94 -> 0.96 upgrades

2) I suspect developers of downstream clients are more likely to notice a FAQ 
item geared towards them than an addition to the upgrade docs.

Maybe a FAQ item with a pointer within both the 0.94 -> 0.96 and 0.94 -> 0.98 
upgrade sections?

 Add a FAQ item for updating a maven-managed application from 0.94 - 0.96+
 --

 Key: HBASE-11761
 URL: https://issues.apache.org/jira/browse/HBASE-11761
 Project: HBase
  Issue Type: Task
  Components: documentation
Reporter: Sean Busbey
  Labels: beginner

 In 0.96 we changed artifact structure, so that clients need to rely on an 
 artifact specific to some module (hopefully hbase-client) instead of a single 
 fat jar.
 We should add a FAQ item that points people towards hbase-client, to ease 
 those updating downstream applications from 0.94 to 0.98+.
 Showing an example pom entry for e.g. org.apache.hbase:hbase:0.94.22 and one 
 for e.g. org.apache.hbase:hbase-client:0.98.5 should be sufficient.
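 Such a FAQ entry might show the two pom entries side by side (versions taken 
 from the examples in this description; illustrative only):

```xml
<!-- 0.94 and earlier: one monolithic artifact -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.94.22</version>
</dependency>

<!-- 0.96+: depend on the client module instead -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>0.98.5</version>
</dependency>
```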





[jira] [Created] (HBASE-11770) TestBlockCacheReporting.testBucketCache is not stable

2014-08-18 Thread Sergey Soldatov (JIRA)
Sergey Soldatov created HBASE-11770:
---

 Summary: TestBlockCacheReporting.testBucketCache is not stable 
 Key: HBASE-11770
 URL: https://issues.apache.org/jira/browse/HBASE-11770
 Project: HBase
  Issue Type: Bug
  Components: test
 Environment: kvm box with Ubuntu 12.04 Desktop 64bit. 
java version 1.7.0_65
Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)

Reporter: Sergey Soldatov
Assignee: Sergey Soldatov


Depending on the machine and OS, TestBlockCacheReporting.testBucketCache may 
fail with an NPE:
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getBlock(BucketCache.java:417)
at 
org.apache.hadoop.hbase.io.hfile.CombinedBlockCache.getBlock(CombinedBlockCache.java:80)
at 
org.apache.hadoop.hbase.io.hfile.TestBlockCacheReporting.addDataAndHits(TestBlockCacheReporting.java:67)
at 
org.apache.hadoop.hbase.io.hfile.TestBlockCacheReporting.testBucketCache(TestBlockCacheReporting.java:86)







[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-18 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100778#comment-14100778
 ] 

Jimmy Xiang commented on HBASE-11165:
-

I agree with Matteo on this. One more benefit of having meta and master together 
is that meta/master recovery will be much simpler (I mean there won't be a 
scenario where the master is recovering while the meta regionserver may be down).

 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
 regions maybe even 50M later.  This issue is about discussing how we will do 
 that (or if not 50M on a cluster, how otherwise we can attain same end).
 More detail to follow.





[jira] [Commented] (HBASE-11610) Enhance remote meta updates

2014-08-18 Thread Virag Kothari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100826#comment-14100826
 ] 

Virag Kothari commented on HBASE-11610:
---

[~jxiang] Any other comments before we can get this in?

 Enhance remote meta updates
 ---

 Key: HBASE-11610
 URL: https://issues.apache.org/jira/browse/HBASE-11610
 Project: HBase
  Issue Type: Sub-task
Reporter: Jimmy Xiang
Assignee: Virag Kothari
 Attachments: HBASE-11610.patch


 Currently, if the meta region is on a regionserver instead of the master, 
 meta update is synchronized on one HTable instance. We should be able to do 
 better.





[jira] [Commented] (HBASE-11610) Enhance remote meta updates

2014-08-18 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100841#comment-14100841
 ] 

Jimmy Xiang commented on HBASE-11610:
-

I have no more comments. I am OK with a patch that shares just one HConnection 
and one execution pool, and closes the meta htable instance after each use. I am 
not sure about the current patch. Perhaps [~larsh]/[~nkeywal] can take a look?
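The pattern being proposed (shared heavyweight resources, short-lived per-use 
handle closed after each update) looks roughly like this in a generic, 
self-contained sketch. `Connection` and `Table` here are toy stand-ins, not the 
HBase client classes:

```java
// Sketch: share one heavyweight connection and one executor process-wide,
// and make the lightweight per-use handle (the "meta table" analogue)
// short-lived, closed after each use -- instead of synchronizing every
// update on a single long-lived instance.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedConnectionSketch {
    static final ExecutorService POOL = Executors.newFixedThreadPool(4); // shared pool
    static final Connection CONN = new Connection();                     // shared connection

    static class Connection {
        final AtomicInteger opens = new AtomicInteger();
        Table getTable(String name) { opens.incrementAndGet(); return new Table(name); }
    }

    static class Table implements AutoCloseable {
        final String name;
        Table(String name) { this.name = name; }
        void put(String row) { /* real code would submit work via POOL */ }
        public void close() { /* releases only per-table state, not CONN */ }
    }

    static void updateMeta(String row) {
        // Lightweight handle per update; no synchronization on one instance.
        try (Table meta = CONN.getTable("hbase:meta")) {
            meta.put(row);
        }
    }

    public static void main(String[] args) {
        updateMeta("region-a");
        updateMeta("region-b");
        System.out.println("handles opened: " + CONN.opens.get());
        POOL.shutdown();
    }
}
```

The per-use handles are cheap because all the expensive state lives in the 
shared connection and pool; that is the trade-off Jimmy's comment points at.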

 Enhance remote meta updates
 ---

 Key: HBASE-11610
 URL: https://issues.apache.org/jira/browse/HBASE-11610
 Project: HBase
  Issue Type: Sub-task
Reporter: Jimmy Xiang
Assignee: Virag Kothari
 Attachments: HBASE-11610.patch




[jira] [Updated] (HBASE-11728) Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11728:
---

Summary: Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING  
(was: Some data miss when scan using PREFIX_TREE DATA-BLOCK-ENCODING)

 Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING
 --

 Key: HBASE-11728
 URL: https://issues.apache.org/jira/browse/HBASE-11728
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.96.1.1, 0.98.4
 Environment: ubuntu12 
 hadoop-2.2.0
 Hbase-0.96.1.1
 SUN-JDK(1.7.0_06-b24)
Reporter: wuchengzhi
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 29cb562fad564b468ea9d61a2d60e8b0, HBASE-11728.patch, 
 HBASE-11728_1.patch, HBASE-11728_2.patch, HBASE-11728_3.patch, 
 HBASE-11728_4.patch, HFileAnalys.java, TestPrefixTree.java

   Original Estimate: 72h
  Remaining Estimate: 72h

 In the Scan case, I prepared some data as below:
 Table Desc (using the prefix-tree encoding):
 'prefix_tree_test', {NAME => 'cf_1', DATA_BLOCK_ENCODING => 'PREFIX_TREE', 
 TTL => '15552000'}
 and I put 5 rows as:
 (RowKey, Qualifier, Value)
 'a-b-0-0', 'qf_1', 'c1-value'
 'a-b-A-1', 'qf_1', 'c1-value'
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 'a-b-B-2-1402397300-1402416535', 'qf_2', 'c2-value-3'
 Then I tried to scan the row keys between 'a-b-A-1' and 'a-b-A-1:', and got 
 the correct result:
 Test 1: 
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 'a-b-A-1', 'qf_1', 'c1-value'
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 Next I tried a scan with addColumn:
 Test 2:
 Scan scan = new Scan();
 scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_2"));
 scan.setStartRow("a-b-A-1".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 expected:
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 but actually I got nothing. Then I changed the addColumn to 
 scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_1")); and I got the 
 expected result 'a-b-A-1', 'qf_1', 'c1-value' as well.
 Then I did more testing... I updated the case to make the startRow greater 
 than 'a-b-A-1':
 Test 3:
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1-".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 expected:
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 but again I actually got nothing. When I made the start row greater than 
 'a-b-A-1-1402329600-1402396277':
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1-140239".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 I got the expected row:
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 So I think it may be a bug in the prefix-tree encoding. It happens after the 
 data is flushed to the storefile, and everything is fine while the data is 
 still in the memstore.
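 As an aside, the 'a-b-A-1:' stop row in these tests works because ':' (0x3A) 
 sorts after '-' (0x2D) in byte order, so it upper-bounds every row with the 
 'a-b-A-1' prefix. A small plain-Java check (ASCII-only keys, so String order 
 matches HBase's lexicographic byte order):

```java
// Demonstrates why "a-b-A-1:" is a valid exclusive stop row for the
// "a-b-A-1" prefix: ':' sorts after '-' and after digits, so every key
// starting with "a-b-A-1" falls below it in lexicographic order.
import java.util.Arrays;
import java.util.List;

public class StopRowSketch {
    // [start, stop) semantics, like an HBase scan range.
    static boolean inRange(String row, String start, String stop) {
        return row.compareTo(start) >= 0 && row.compareTo(stop) < 0;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "a-b-0-0", "a-b-A-1",
            "a-b-A-1-1402329600-1402396277",
            "a-b-A-1-1402397227-1402415999",
            "a-b-B-2-1402397300-1402416535");
        for (String r : rows) {
            System.out.println(r + " -> " + inRange(r, "a-b-A-1", "a-b-A-1:"));
        }
        // Only 'a-b-A-1' and the two 'a-b-A-1-...' rows are in range.
    }
}
```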





[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file

2014-08-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100860#comment-14100860
 ] 

Ted Yu commented on HBASE-11591:


{code}
+ * Compares two cells without mvcc
+ *
+ * @param left
+ * @param right
+ * @return less than 0 if left is smaller, 0 if equal etc..
+ */
+public int compareWithoutSeqId(Cell left, Cell right) {
{code}
Change javadoc to match the method name.

Cell is marked @InterfaceStability.Evolving
setSequenceId() should be added to Cell interface - in another issue.
{code}
+public class TestScannerWithBulkload {
+  private final static HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+  private final static String tableName = "testBulkload";
{code}
Please change tableName to match test name.

 Scanner fails to retrieve KV  from bulk loaded file with highest sequence id 
 than the cell's mvcc in a non-bulk loaded file
 ---

 Key: HBASE-11591
 URL: https://issues.apache.org/jira/browse/HBASE-11591
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0

 Attachments: HBASE-11591.patch, HBASE-11591_1.patch, 
 HBASE-11591_2.patch, HBASE-11591_3.patch, TestBulkload.java


 See discussion in HBASE-11339.
 When we have a case where the same KVs are in two files, one produced by 
 flush/compaction and the other through bulk load, both files contain some 
 identical kvs which match even in timestamp.
 Steps:
 Add some rows with a specific timestamp and flush the same.  
 Bulk load a file with the same data. Ensure that the assign-seqnum property 
 is set.
 The bulk load should use HFileOutputFormat2 (or ensure that we write the 
 bulk_time_output key).
 This would ensure that the bulk loaded file has the highest seq num.
 Assume the cell in the flushed/compacted store file is 
 row1,cf,cq,ts1, value1  and the cell in the bulk loaded file is
 row1,cf,cq,ts1,value2 
 (There are no parallel scans).
 Issue a scan on the table in 0.96. The retrieved value is 
 row1,cf,cq,ts1,value2.
 But the same in 0.98 will retrieve row1,cf,cq,ts1,value1. 
 This is a behaviour change.  This is because of this code 
 {code}
 public int compare(KeyValueScanner left, KeyValueScanner right) {
   int comparison = compare(left.peek(), right.peek());
   if (comparison != 0) {
     return comparison;
   } else {
     // Since both the keys are exactly the same, we break the tie in favor
     // of the key which came latest.
     long leftSequenceID = left.getSequenceID();
     long rightSequenceID = right.getSequenceID();
     if (leftSequenceID > rightSequenceID) {
       return -1;
     } else if (leftSequenceID < rightSequenceID) {
       return 1;
     } else {
       return 0;
     }
   }
 }
 {code}
 Here in the 0.96 case the mvcc of the cell in both files will be 0, so the 
 comparison falls through to the else condition, where the seq id of the bulk 
 loaded file is greater and so sorts first, ensuring that the scan happens 
 from that bulk loaded file.
 In case of 0.98+, as we are retaining the mvcc+seqid we are not making the 
 mvcc 0 (it remains a non-zero positive value).  Hence the compare() sorts 
 the cell in the flushed/compacted file first.  Which means though we know 
 the latest file is the bulk loaded file, we don't scan its data.
 Seems to be a behaviour change.  Will check other corner cases also, but we 
 are trying to understand the behaviour of bulk load because we are 
 evaluating whether it can be used for the MOB design.
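 The 0.96-vs-0.98 difference described above can be modeled in a few lines of 
 plain Java (a toy sketch; `FileScanner` is a stand-in for HBase's 
 `KeyValueScanner`, not the real class):

```java
// Minimal model of the KeyValueHeap tie-break: when two scanners peek at
// equal keys, the one with the higher file sequence id sorts first (newest
// wins). In 0.98+ the per-cell mvcc participates in the key comparison, so
// the tie-break is never reached for the bulk-loaded cell (mvcc 0).
public class SeqIdTieBreak {
    static class FileScanner {
        final String peekKey;   // simplified "cell key"
        final long mvcc;        // per-cell mvcc carried in the file
        final long sequenceId;  // per-file sequence id
        FileScanner(String key, long mvcc, long seqId) {
            this.peekKey = key; this.mvcc = mvcc; this.sequenceId = seqId;
        }
    }

    // Key comparison first (optionally including mvcc, as in 0.98),
    // then file sequence id as the tie-break.
    static int compare(FileScanner l, FileScanner r, boolean includeMvcc) {
        int c = l.peekKey.compareTo(r.peekKey);
        if (c == 0 && includeMvcc) c = Long.compare(r.mvcc, l.mvcc); // higher mvcc first
        if (c != 0) return c;
        return Long.compare(r.sequenceId, l.sequenceId); // higher seq id first
    }

    public static void main(String[] args) {
        FileScanner flushed = new FileScanner("row1/cf/cq/ts1", 5L, 10L); // non-zero mvcc retained
        FileScanner bulk    = new FileScanner("row1/cf/cq/ts1", 0L, 20L); // mvcc 0, highest seq id
        // 0.96-style (mvcc ignored): tie broken by seq id, bulk file sorts first.
        System.out.println(compare(bulk, flushed, false) < 0); // true
        // 0.98-style (mvcc retained): flushed cell's mvcc wins the key compare.
        System.out.println(compare(bulk, flushed, true) > 0);  // true
    }
}
```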





[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100875#comment-14100875
 ] 

ramkrishna.s.vasudevan commented on HBASE-11591:


bq.setSequenceId() should be added to Cell interface - in another issue.
I don't think we can add setSequenceId() in Cell.  We can discuss on that. Will 
update the patch.
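For the record, one possible shape of the server-side setter idea being 
discussed (purely a sketch; names like `SettableSequenceId` and `SimpleCell` 
are illustrative, not a committed API):

```java
// Sketch of a server-side-only extension interface, so the public Cell
// contract stays read-only for clients while server-side implementations
// opt into a settable sequence id.
public class CellSetterSketch {
    interface Cell {                      // simplified stand-in for o.a.h.hbase.Cell
        long getSequenceId();
    }

    interface SettableSequenceId {        // server-side extension, not on Cell itself
        void setSequenceId(long seqId);
    }

    static class SimpleCell implements Cell, SettableSequenceId {
        private long seqId;
        public long getSequenceId() { return seqId; }
        public void setSequenceId(long seqId) { this.seqId = seqId; }
    }

    // Server code can stamp the file's seq id without widening Cell.
    static void stampSeqId(Cell c, long fileSeqId) {
        if (c instanceof SettableSequenceId) {
            ((SettableSequenceId) c).setSequenceId(fileSeqId);
        } // else: fall back (e.g. copy into a KeyValue), omitted here
    }

    public static void main(String[] args) {
        SimpleCell c = new SimpleCell();
        stampSeqId(c, 42L);
        System.out.println(c.getSequenceId()); // 42
    }
}
```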

 Scanner fails to retrieve KV  from bulk loaded file with highest sequence id 
 than the cell's mvcc in a non-bulk loaded file
 ---

 Key: HBASE-11591
 URL: https://issues.apache.org/jira/browse/HBASE-11591
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0

 Attachments: HBASE-11591.patch, HBASE-11591_1.patch, 
 HBASE-11591_2.patch, HBASE-11591_3.patch, TestBulkload.java




[jira] [Commented] (HBASE-11657) Put HTable region methods in an interface

2014-08-18 Thread Carter (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100876#comment-14100876
 ] 

Carter commented on HBASE-11657:


That would certainly solve the problem with {{TableInputFormatBase}}.  We 
should also probably add a _getTableName_ method to {{RegionLocator}}, 
regardless.  Then passing the RL interface instead of a raw HTable object would 
provide everything that it needs for sharding the MR.

A more philosophical question is, why is {{HRegionLocation}} 
InterfaceAudience.Private to begin with?  It is a POJO that wraps 
{{HRegionInfo}} (InterfaceAudience.Public), {{ServerName}} 
(InterfaceAudience.Public), and _seqNum_ (an immutable long).  It seems to me 
that either the internal fields should be private too, or HRegionLocation 
should be public.  Unless there is some correlation of that information that 
shouldn't be exposed.  Thoughts, [~stack], [~ndimiduk], [~enis]?


 Put HTable region methods in an interface
 -

 Key: HBASE-11657
 URL: https://issues.apache.org/jira/browse/HBASE-11657
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.99.0
Reporter: Carter
Assignee: Carter
 Fix For: 0.99.0

 Attachments: HBASE_11657.patch, HBASE_11657_v2.patch, 
 HBASE_11657_v3.patch, HBASE_11657_v3.patch, HBASE_11657_v4.patch


 Most of the HTable methods are now abstracted by HTableInterface, with the 
 notable exception of the following methods that pertain to region metadata:
 {code}
 HRegionLocation getRegionLocation(final String row)
 HRegionLocation getRegionLocation(final byte [] row)
 HRegionLocation getRegionLocation(final byte [] row, boolean reload)
 byte [][] getStartKeys()
 byte[][] getEndKeys()
 Pair<byte[][],byte[][]> getStartEndKeys()
 void clearRegionCache()
 {code}
 and a default scope method which maybe should be bundled with the others:
 {code}
 List<RegionLocations> listRegionLocations()
 {code}
 Since the consensus seems to be that these would muddy HTableInterface with 
 non-core functionality, where should it go?  MapReduce looks up the region 
 boundaries, so it needs to be exposed somewhere.
 Let me throw out a straw man to start the conversation.  I propose:
 {code}
 org.apache.hadoop.hbase.client.HRegionInterface
 {code}
 Have HTable implement this interface.  Also add these methods to HConnection:
 {code}
 HRegionInterface getTableRegion(TableName tableName)
 HRegionInterface getTableRegion(TableName tableName, ExecutorService pool)
 {code}
 [~stack], [~ndimiduk], [~enis], thoughts?





[jira] [Comment Edited] (HBASE-11728) Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100881#comment-14100881
 ] 

ramkrishna.s.vasudevan edited comment on HBASE-11728 at 8/18/14 5:21 PM:
-

Committed to master, branch-1 and 0.98. Thanks for the review [~mcorgan]


was (Author: ram_krish):
Committed to master, branch-1 and 0.98. Thanks for the review @mcorgan.

 Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING
 --

 Key: HBASE-11728
 URL: https://issues.apache.org/jira/browse/HBASE-11728
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.96.1.1, 0.98.4
 Environment: ubuntu12 
 hadoop-2.2.0
 Hbase-0.96.1.1
 SUN-JDK(1.7.0_06-b24)
Reporter: wuchengzhi
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 29cb562fad564b468ea9d61a2d60e8b0, HBASE-11728.patch, 
 HBASE-11728_1.patch, HBASE-11728_2.patch, HBASE-11728_3.patch, 
 HBASE-11728_4.patch, HFileAnalys.java, TestPrefixTree.java

   Original Estimate: 72h
  Remaining Estimate: 72h



[jira] [Commented] (HBASE-11728) Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100881#comment-14100881
 ] 

ramkrishna.s.vasudevan commented on HBASE-11728:


Committed to master, branch-1 and 0.98. Thanks for the review @mcorgan.

 Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING
 --

 Key: HBASE-11728
 URL: https://issues.apache.org/jira/browse/HBASE-11728
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.96.1.1, 0.98.4
 Environment: ubuntu12 
 hadoop-2.2.0
 Hbase-0.96.1.1
 SUN-JDK(1.7.0_06-b24)
Reporter: wuchengzhi
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 29cb562fad564b468ea9d61a2d60e8b0, HBASE-11728.patch, 
 HBASE-11728_1.patch, HBASE-11728_2.patch, HBASE-11728_3.patch, 
 HBASE-11728_4.patch, HFileAnalys.java, TestPrefixTree.java

   Original Estimate: 72h
  Remaining Estimate: 72h



[jira] [Updated] (HBASE-11728) Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-11728:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING
 --

 Key: HBASE-11728
 URL: https://issues.apache.org/jira/browse/HBASE-11728
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.96.1.1, 0.98.4
 Environment: ubuntu12 
 hadoop-2.2.0
 Hbase-0.96.1.1
 SUN-JDK(1.7.0_06-b24)
Reporter: wuchengzhi
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 29cb562fad564b468ea9d61a2d60e8b0, HBASE-11728.patch, 
 HBASE-11728_1.patch, HBASE-11728_2.patch, HBASE-11728_3.patch, 
 HBASE-11728_4.patch, HFileAnalys.java, TestPrefixTree.java

   Original Estimate: 72h
  Remaining Estimate: 72h



[jira] [Updated] (HBASE-11766) Backdoor CoprocessorHConnection is no longer being used for local writes

2014-08-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11766:
---

Fix Version/s: 0.98.6
   2.0.0
   0.99.0
 Assignee: Andrew Purtell

 Backdoor CoprocessorHConnection is no longer being used for local writes
 

 Key: HBASE-11766
 URL: https://issues.apache.org/jira/browse/HBASE-11766
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.4
Reporter: James Taylor
Assignee: Andrew Purtell
  Labels: Phoenix
 Fix For: 0.99.0, 2.0.0, 0.98.6


 There's a backdoor CoprocessorHConnection used to ensure that a batched 
 mutation does not go over the wire and back, but executes immediately 
 locally. This is leveraged by Phoenix during secondary index maintenance (for 
 an ~20% perf improvement). It looks to me like it's no longer used, as the 
 following function is never invoked:
   public 
 org.apache.hadoop.hbase.protobuf.generated.ClientProtos.ClientService.BlockingInterface
   getClient(ServerName serverName) throws IOException {
 It'd be good if feasible to add an HBase unit test to prevent further 
 regressions. For more info, see PHOENIX-1166.





[jira] [Commented] (HBASE-11232) Region fail to release the updatelock for illegal CF in multi row mutations

2014-08-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100891#comment-14100891
 ] 

Andrew Purtell commented on HBASE-11232:


Ping [~lhofhansl]. Shall we get this in for the next 0.94 release?

 Region fail to release the updatelock for illegal CF in multi row mutations
 ---

 Key: HBASE-11232
 URL: https://issues.apache.org/jira/browse/HBASE-11232
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.19
Reporter: Liu Shaohui
Assignee: Liu Shaohui
 Attachments: HBASE-11232-0.94.diff


 The failback code in processRowsWithLocks did not check the column family. If 
 there is an illegal CF in the mutation, it will throw a NullPointerException 
 and the update lock will not be released, so the region cannot be flushed or 
 compacted. 
 HRegion #4946
 {code}
 if (!mutations.isEmpty() && !walSyncSuccessful) {
   LOG.warn("Wal sync failed. Roll back " + mutations.size() +
       " memstore keyvalues for row(s):" +
       processor.getRowsToLock().iterator().next() + "...");
   for (KeyValue kv : mutations) {
     stores.get(kv.getFamily()).rollback(kv);
   }
 }
 // 11. Roll mvcc forward
 if (writeEntry != null) {
   mvcc.completeMemstoreInsert(writeEntry);
   writeEntry = null;
 }
 if (locked) {
   this.updatesLock.readLock().unlock();
   locked = false;
 }
 {code}
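A minimal, HBase-free sketch of the failure mode and the guard the patch needs. Store is replaced by a Runnable stand-in and all names here are illustrative, not the actual HRegion code:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class RollbackGuard {
    // Stand-in for the region's family -> Store map.
    static final Map<String, Runnable> stores = new HashMap<>();

    // Without the null check, an unknown family makes stores.get(...) return
    // null and the loop throws NPE while the updates lock is still held.
    static int rollback(Iterable<String> families) {
        int rolledBack = 0;
        for (String family : families) {
            Runnable store = stores.get(family);
            if (store == null) {
                continue;  // illegal CF: skip rather than NPE mid-rollback
            }
            store.run();
            rolledBack++;
        }
        return rolledBack;
    }

    public static void main(String[] args) {
        stores.put("cf", () -> { /* rollback work */ });
        // One known family and one illegal one: only "cf" is rolled back.
        System.out.println(rollback(Arrays.asList("cf", "bogus")));  // prints 1
    }
}
```

Alternatively (or additionally), wrapping the unlock in a try/finally would guarantee the updates lock is released even if the rollback itself throws.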



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HBASE-11596) [Shell] Recreate table grants after truncate

2014-08-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-11596.


Resolution: Duplicate

I'm going to resolve this as a dup of HBASE-8332 and HBASE-11769

 [Shell] Recreate table grants after truncate
 

 Key: HBASE-11596
 URL: https://issues.apache.org/jira/browse/HBASE-11596
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell
Priority: Minor
  Labels: beginner

 The shell's truncate command disables, drops, and creates a replacement 
 table. When the AccessController is active it observes the drop and cleans up 
 any grants made on the table. The shell does not take any action to preserve 
 the grants but could. Would make a nice improvement. If security is active 
 and running with administrative privilege, the shell could retrieve the 
 table- and CF-level grants before dropping the table and replay them on the 
 new table after creating it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11769) Truncate table shouldn't revoke user privileges

2014-08-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100964#comment-14100964
 ] 

Andrew Purtell commented on HBASE-11769:


Dup of HBASE-11596 (which itself is at least partially a dup of HBASE-8332, but 
would need a backport to 0.94)

 Truncate table shouldn't revoke user privileges
 ---

 Key: HBASE-11769
 URL: https://issues.apache.org/jira/browse/HBASE-11769
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.94.15
Reporter: hongyu bi

 hbase(main):002:0> create 'a','cf'
 0 row(s) in 0.2500 seconds
 => Hbase::Table - a
 hbase(main):003:0> grant 'usera','R','a'
 0 row(s) in 0.2080 seconds
 hbase(main):007:0> user_permission 'a'
 User    Table,Family,Qualifier:Permission
  usera  a,,: [Permission: actions=READ]
 hbase(main):004:0> truncate 'a'
 Truncating 'a' table (it may take a while):
  - Disabling table...
  - Dropping table...
  - Creating table...
 0 row(s) in 1.5320 seconds
 hbase(main):005:0> user_permission 'a'
 User    Table,Family,Qualifier:Permission


--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HBASE-11771) Move to log4j 2

2014-08-18 Thread Alex Newman (JIRA)
Alex Newman created HBASE-11771:
---

 Summary: Move to log4j 2
 Key: HBASE-11771
 URL: https://issues.apache.org/jira/browse/HBASE-11771
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Assignee: Alex Newman
Priority: Minor


It seems much faster. Any objections?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HBASE-11771) Move to log4j 2

2014-08-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-11771.


Resolution: Duplicate
  Assignee: (was: Alex Newman)

Dup of HBASE-10092. Want to take that one over [~posix4e]?

 Move to log4j 2
 ---

 Key: HBASE-11771
 URL: https://issues.apache.org/jira/browse/HBASE-11771
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Priority: Minor

 It seems much faster. Any objections?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11762) Record the class name of Codec in WAL header

2014-08-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100976#comment-14100976
 ] 

Ted Yu commented on HBASE-11762:


[~apurtell]:
What do you think of patch v4 ?

 Record the class name of Codec in WAL header
 

 Key: HBASE-11762
 URL: https://issues.apache.org/jira/browse/HBASE-11762
 Project: HBase
  Issue Type: Task
  Components: wal
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 1.0.0, 2.0.0, 0.98.6

 Attachments: 11762-v1.txt, 11762-v2.txt, 11762-v4.txt


 In follow-up discussion to HBASE-11620, Enis brought up this point:
 Related to this, should not we also write the CellCodec that we use in the 
 WAL header. Right now, the codec comes from the configuration which means 
 that you cannot read back the WAL files if you change the codec.
 This JIRA is to implement the above suggestion.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

2014-08-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100953#comment-14100953
 ] 

Andrew Purtell commented on HBASE-11165:


bq. I agree with Matteo on this. One more benefit to have meta and master 
together is the meta/master recovery will be much simpler

Do we need to split this conversation into what to do on master and what to do 
with 0.98? We could for example file two separate subtasks that approach the 
meta scaling problem in different ways for the respective branches. They are 
divergent enough that this would be a good idea, IMHO.

 Scaling so cluster can host 1M regions and beyond (50M regions?)
 

 Key: HBASE-11165
 URL: https://issues.apache.org/jira/browse/HBASE-11165
 Project: HBase
  Issue Type: Brainstorming
Reporter: stack
 Attachments: HBASE-11165.zip, Region Scalability test.pdf, 
 zk_less_assignment_comparison_2.pdf


 This discussion issue comes out of Co-locate Meta And Master HBASE-10569 
 and comments on the doc posted there.
 A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M 
 regions maybe even 50M later.  This issue is about discussing how we will do 
 that (or if not 50M on a cluster, how otherwise we can attain same end).
 More detail to follow.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11771) Move to log4j 2

2014-08-18 Thread Alex Newman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100981#comment-14100981
 ] 

Alex Newman commented on HBASE-11771:
-

Love to

 Move to log4j 2
 ---

 Key: HBASE-11771
 URL: https://issues.apache.org/jira/browse/HBASE-11771
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Newman
Priority: Minor

 It seems much faster. Any objections?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-10092) Move up on to log4j2

2014-08-18 Thread Alex Newman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100982#comment-14100982
 ] 

Alex Newman commented on HBASE-10092:
-

Mind if I hop on this patch?

 Move up on to log4j2
 

 Key: HBASE-10092
 URL: https://issues.apache.org/jira/browse/HBASE-10092
 Project: HBase
  Issue Type: Sub-task
Reporter: stack
Assignee: stack
 Attachments: 10092.txt, 10092v2.txt, HBASE-10092.patch


 Allows logging with less friction.  See http://logging.apache.org/log4j/2.x/  
 This rather radical transition can be done w/ minor change given they have an 
 adapter for apache's logging, the one we use.  They also have an adapter for 
 slf4j so we likely can remove at least some of the 4 versions of this module 
 our dependencies make use of.
 I made a start in the attached patch but am currently stuck in maven dependency 
 resolve hell courtesy of our slf4j.  Fixing will take some concentration and 
 a good net connection, an item I currently lack.  Other TODOs are that we will 
 need to fix our little log level setting jsp page -- will likely have to undo 
 our use of hadoop's tool here -- and the config system changes a little.
 I will return to this project soon.  Will bring numbers.
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-10092) Move up on to log4j2

2014-08-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10092:
---

Assignee: Alex Newman  (was: stack)

 Move up on to log4j2
 

 Key: HBASE-10092
 URL: https://issues.apache.org/jira/browse/HBASE-10092
 Project: HBase
  Issue Type: Sub-task
Reporter: stack
Assignee: Alex Newman
 Attachments: 10092.txt, 10092v2.txt, HBASE-10092.patch


 Allows logging with less friction.  See http://logging.apache.org/log4j/2.x/  
 This rather radical transition can be done w/ minor change given they have an 
 adapter for apache's logging, the one we use.  They also have an adapter for 
 slf4j so we likely can remove at least some of the 4 versions of this module 
 our dependencies make use of.
 I made a start in the attached patch but am currently stuck in maven dependency 
 resolve hell courtesy of our slf4j.  Fixing will take some concentration and 
 a good net connection, an item I currently lack.  Other TODOs are that we will 
 need to fix our little log level setting jsp page -- will likely have to undo 
 our use of hadoop's tool here -- and the config system changes a little.
 I will return to this project soon.  Will bring numbers.
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-10092) Move up on to log4j2

2014-08-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100994#comment-14100994
 ] 

Andrew Purtell commented on HBASE-10092:


Reassigned to [~posix4e]. Good luck!

 Move up on to log4j2
 

 Key: HBASE-10092
 URL: https://issues.apache.org/jira/browse/HBASE-10092
 Project: HBase
  Issue Type: Sub-task
Reporter: stack
Assignee: Alex Newman
 Attachments: 10092.txt, 10092v2.txt, HBASE-10092.patch


 Allows logging with less friction.  See http://logging.apache.org/log4j/2.x/  
 This rather radical transition can be done w/ minor change given they have an 
 adapter for apache's logging, the one we use.  They also have an adapter for 
 slf4j so we likely can remove at least some of the 4 versions of this module 
 our dependencies make use of.
 I made a start in the attached patch but am currently stuck in maven dependency 
 resolve hell courtesy of our slf4j.  Fixing will take some concentration and 
 a good net connection, an item I currently lack.  Other TODOs are that we will 
 need to fix our little log level setting jsp page -- will likely have to undo 
 our use of hadoop's tool here -- and the config system changes a little.
 I will return to this project soon.  Will bring numbers.
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11610) Enhance remote meta updates

2014-08-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101043#comment-14101043
 ] 

stack commented on HBASE-11610:
---

This patch is a bit of a hack. We are doing a one-off inside RegionStateStore 
to put up multiple HConnection instances (are we sure we are creating many 
distinct instances?). I doubt anyone but you fellas will know of its 
existence. (Needs a release note on the new config, 
hbase.statestore.meta.connection, and the new config should probably be called 
hbase.regionstatestore.meta.connection.) It would be nice if this connection 
setup were off in a separate class so that, should anyone else want to do this 
trick, they'll not duplicate your effort.  This is just a nit though.  I'm also 
fine with adding in stuff that is custom for you fellas (custom for now) just 
as long as it is well doc'd.

When would this code trigger?

+  if (hConnectionPool == null) {
+    hConnectionPool = new HConnection[]{
+        HConnectionManager.createConnection(server.getConfiguration())};
+  }

i.e. when would hConnectionPool be null?

Should this be private? +  private ThreadLocal<HTableInterface> 
threadLocalHTable =

It should have a comment on when this thread local gets instantiated -- what 
the current thread is at the time.
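For context on the question: a bare null check on a shared field is only safe if initialization is confined to one thread. A generic sketch (not the patch's code; the pool type is a stand-in) of making such lazy init race-free:

```java
public class LazyPool {
    private final Object lock = new Object();
    private volatile String[] pool;  // stand-in for HConnection[]

    // Double-checked locking: the volatile field plus the synchronized
    // re-check ensure the pool is created exactly once across threads.
    String[] getPool() {
        String[] p = pool;
        if (p == null) {
            synchronized (lock) {
                p = pool;
                if (p == null) {
                    p = new String[] {"connection-0"};  // expensive init here
                    pool = p;
                }
            }
        }
        return p;
    }

    public static void main(String[] args) {
        LazyPool lp = new LazyPool();
        // Both calls return the same array instance.
        System.out.println(lp.getPool() == lp.getPool());  // prints true
    }
}
```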

 Enhance remote meta updates
 ---

 Key: HBASE-11610
 URL: https://issues.apache.org/jira/browse/HBASE-11610
 Project: HBase
  Issue Type: Sub-task
Reporter: Jimmy Xiang
Assignee: Virag Kothari
 Attachments: HBASE-11610.patch


 Currently, if the meta region is on a regionserver instead of the master, 
 meta update is synchronized on one HTable instance. We should be able to do 
 better.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-10092) Move up on to log4j2

2014-08-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101046#comment-14101046
 ] 

stack commented on HBASE-10092:
---

I think this a hbase 2.0 issue now.  Log config format changes which will be 
too much to take on in a 1.0 hbase (IMO).

 Move up on to log4j2
 

 Key: HBASE-10092
 URL: https://issues.apache.org/jira/browse/HBASE-10092
 Project: HBase
  Issue Type: Sub-task
Reporter: stack
Assignee: Alex Newman
 Attachments: 10092.txt, 10092v2.txt, HBASE-10092.patch


 Allows logging with less friction.  See http://logging.apache.org/log4j/2.x/  
 This rather radical transition can be done w/ minor change given they have an 
 adapter for apache's logging, the one we use.  They also have an adapter for 
 slf4j so we likely can remove at least some of the 4 versions of this module 
 our dependencies make use of.
 I made a start in the attached patch but am currently stuck in maven dependency 
 resolve hell courtesy of our slf4j.  Fixing will take some concentration and 
 a good net connection, an item I currently lack.  Other TODOs are that we will 
 need to fix our little log level setting jsp page -- will likely have to undo 
 our use of hadoop's tool here -- and the config system changes a little.
 I will return to this project soon.  Will bring numbers.
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11761) Add a FAQ item for updating a maven-managed application from 0.94 - 0.96+

2014-08-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101061#comment-14101061
 ] 

stack commented on HBASE-11761:
---

bq. Maybe a FAQ item with a pointer within both the 0.94 - 0.96 and 0.94 - 
0.98 upgrade sections?

Sounds great.

 Add a FAQ item for updating a maven-managed application from 0.94 - 0.96+
 --

 Key: HBASE-11761
 URL: https://issues.apache.org/jira/browse/HBASE-11761
 Project: HBase
  Issue Type: Task
  Components: documentation
Reporter: Sean Busbey
  Labels: beginner

 In 0.96 we changed artifact structure, so that clients need to rely on an 
 artifact specific to some module (hopefully hbase-client) instead of a single 
 fat jar.
 We should add a FAQ item that points people towards hbase-client, to ease 
 those updating downstream applications from 0.94 to 0.98+.
 Showing an example pom entry for e.g. org.apache.hbase:hbase:0.94.22 and one 
 for e.g. org.apache.hbase:hbase-client:0.98.5 should be sufficient.
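As a sketch, the two pom entries the item describes might look like the following (coordinates are taken from the examples above; treat the versions as illustrative):

```xml
<!-- 0.94 and earlier: a single monolithic artifact -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.94.22</version>
</dependency>

<!-- 0.96+: depend on the client module instead -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>0.98.5</version>
</dependency>
```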



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11770) TestBlockCacheReporting.testBucketCache is not stable

2014-08-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101057#comment-14101057
 ] 

stack commented on HBASE-11770:
---

Want to assign this to me, [~sergey.soldatov]? It's a test of my writing.

 TestBlockCacheReporting.testBucketCache is not stable 
 --

 Key: HBASE-11770
 URL: https://issues.apache.org/jira/browse/HBASE-11770
 Project: HBase
  Issue Type: Bug
  Components: test
 Environment: kvm box with Ubuntu 12.04 Desktop 64bit. 
 java version 1.7.0_65
 Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
 Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
Reporter: Sergey Soldatov
Assignee: Sergey Soldatov

 Depending on the machine and OS TestBlockCacheReporting.testBucketCache may 
 fail with NPE:
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getBlock(BucketCache.java:417)
 at 
 org.apache.hadoop.hbase.io.hfile.CombinedBlockCache.getBlock(CombinedBlockCache.java:80)
 at 
 org.apache.hadoop.hbase.io.hfile.TestBlockCacheReporting.addDataAndHits(TestBlockCacheReporting.java:67)
 at 
 org.apache.hadoop.hbase.io.hfile.TestBlockCacheReporting.testBucketCache(TestBlockCacheReporting.java:86)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-10092) Move up on to log4j2

2014-08-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101073#comment-14101073
 ] 

Andrew Purtell commented on HBASE-10092:


bq. Log config format changes which will be too much to take on in a 1.0 hbase 
(IMO).

Was thinking about this also. So let me just propose it then.. What about 
putting in a log configuration file adapter so we don't have to change our 
log4j properties files until later? This would be needed if we ever wanted to 
backport async logging improvements to something like 0.98. 

 Move up on to log4j2
 

 Key: HBASE-10092
 URL: https://issues.apache.org/jira/browse/HBASE-10092
 Project: HBase
  Issue Type: Sub-task
Reporter: stack
Assignee: Alex Newman
 Attachments: 10092.txt, 10092v2.txt, HBASE-10092.patch


 Allows logging with less friction.  See http://logging.apache.org/log4j/2.x/  
 This rather radical transition can be done w/ minor change given they have an 
 adapter for apache's logging, the one we use.  They also have an adapter for 
 slf4j so we likely can remove at least some of the 4 versions of this module 
 our dependencies make use of.
 I made a start in the attached patch but am currently stuck in maven dependency 
 resolve hell courtesy of our slf4j.  Fixing will take some concentration and 
 a good net connection, an item I currently lack.  Other TODOs are that we will 
 need to fix our little log level setting jsp page -- will likely have to undo 
 our use of hadoop's tool here -- and the config system changes a little.
 I will return to this project soon.  Will bring numbers.
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file

2014-08-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101080#comment-14101080
 ] 

Andrew Purtell commented on HBASE-11591:


bq. But setting the seqId on the read path would prevent us from using Cell 
based impl because Cell does not have it.

What prevents us from adding seqID accessors as an additional interface 
extending Cell in hbase-server as Anoop proposed above?

 Scanner fails to retrieve KV  from bulk loaded file with highest sequence id 
 than the cell's mvcc in a non-bulk loaded file
 ---

 Key: HBASE-11591
 URL: https://issues.apache.org/jira/browse/HBASE-11591
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0

 Attachments: HBASE-11591.patch, HBASE-11591_1.patch, 
 HBASE-11591_2.patch, HBASE-11591_3.patch, TestBulkload.java


 See discussion in HBASE-11339.
 When we have a case where there are same KVs in two files, one produced by 
 flush/compaction and the other through the bulk load,
 both the files have some same kvs which match even in timestamp.
 Steps:
 Add some rows with a specific timestamp and flush the same.  
 Bulk load a file with the same data. Ensure that the assign seqnum property is 
 set.
 The bulk load should use HFileOutputFormat2 (or ensure that we write the 
 bulk_time_output key).
 This would ensure that the bulk loaded file has the highest seq num.
 Assume the cell in the flushed/compacted store file is 
 row1,cf,cq,ts1, value1  and the cell in the bulk loaded file is
 row1,cf,cq,ts1,value2 
 (There are no parallel scans).
 Issue a scan on the table in 0.96. The retrieved value is 
 row1,cf1,cq,ts1,value2
 But the same in 0.98 will retrieve row1,cf1,cq,ts2,value1. 
 This is a behaviour change.  This is because of this code 
 {code}
 public int compare(KeyValueScanner left, KeyValueScanner right) {
   int comparison = compare(left.peek(), right.peek());
   if (comparison != 0) {
     return comparison;
   } else {
     // Since both the keys are exactly the same, we break the tie in favor
     // of the key which came latest.
     long leftSequenceID = left.getSequenceID();
     long rightSequenceID = right.getSequenceID();
     if (leftSequenceID > rightSequenceID) {
       return -1;
     } else if (leftSequenceID < rightSequenceID) {
       return 1;
     } else {
       return 0;
     }
   }
 }
 {code}
 Here, in the 0.96 case, the mvcc of the cell in both the files will be 0 and so 
 the comparison will happen from the else condition.  There the seq id of the 
 bulk loaded file is greater and it would sort out first, ensuring that the scan 
 happens from that bulk loaded file.
 In case of 0.98+, as we are retaining the mvcc+seqid, we are not making the 
 mvcc 0 (it remains a non-zero positive value).  Hence the compare() sorts out 
 the cell in the flushed/compacted file.  Which means that though we know the 
 latest file is the bulk loaded file, we don't scan its data.
 Seems to be a behaviour change.  Will check on other corner cases also, but we 
 are trying to understand the behaviour of bulk load because we are evaluating 
 if it can be used for the MOB design.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11657) Put HTable region methods in an interface

2014-08-18 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101086#comment-14101086
 ] 

Enis Soztutar commented on HBASE-11657:
---

I've made HRL private in an earlier patch that introduced RegionLocations 
class. The idea was to make regions transparent to users because they can 
change, not specific to HRL per se. With the introduction of RegionLocations 
and HBASE-10070 work, there can be more than one location for a region together 
with different replica_ids associated with regions. I did not want to expose 
those as the public API, but we can revisit that decision if we want.  

 Put HTable region methods in an interface
 -

 Key: HBASE-11657
 URL: https://issues.apache.org/jira/browse/HBASE-11657
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.99.0
Reporter: Carter
Assignee: Carter
 Fix For: 0.99.0

 Attachments: HBASE_11657.patch, HBASE_11657_v2.patch, 
 HBASE_11657_v3.patch, HBASE_11657_v3.patch, HBASE_11657_v4.patch


 Most of the HTable methods are now abstracted by HTableInterface, with the 
 notable exception of the following methods that pertain to region metadata:
 {code}
 HRegionLocation getRegionLocation(final String row)
 HRegionLocation getRegionLocation(final byte [] row)
 HRegionLocation getRegionLocation(final byte [] row, boolean reload)
 byte [][] getStartKeys()
 byte[][] getEndKeys()
 Pair<byte[][],byte[][]> getStartEndKeys()
 void clearRegionCache()
 {code}
 and a default scope method which maybe should be bundled with the others:
 {code}
 List<RegionLocations> listRegionLocations()
 {code}
 Since the consensus seems to be that these would muddy HTableInterface with 
 non-core functionality, where should it go?  MapReduce looks up the region 
 boundaries, so it needs to be exposed somewhere.
 Let me throw out a straw man to start the conversation.  I propose:
 {code}
 org.apache.hadoop.hbase.client.HRegionInterface
 {code}
 Have HTable implement this interface.  Also add these methods to HConnection:
 {code}
 HRegionInterface getTableRegion(TableName tableName)
 HRegionInterface getTableRegion(TableName tableName, ExecutorService pool)
 {code}
 [~stack], [~ndimiduk], [~enis], thoughts?
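To make the straw man concrete, a compile-only sketch of the proposed split (the interface name follows the proposal; the nested types and method subset are stand-ins so the sketch is self-contained):

```java
import java.util.Collections;
import java.util.List;

public class RegionInterfaceSketch {
    // Stand-ins for the real HBase types, for illustration only.
    static class HRegionLocation {}
    static class RegionLocations {}

    // The straw-man interface collecting HTable's region-metadata methods.
    interface HRegionInterface {
        HRegionLocation getRegionLocation(byte[] row, boolean reload);
        byte[][] getStartKeys();
        byte[][] getEndKeys();
        void clearRegionCache();
        List<RegionLocations> listRegionLocations();
    }

    // An HTable-like class would simply add "implements HRegionInterface";
    // MapReduce callers could then depend on the narrow interface alone.
    static class FakeTable implements HRegionInterface {
        public HRegionLocation getRegionLocation(byte[] row, boolean reload) { return new HRegionLocation(); }
        public byte[][] getStartKeys() { return new byte[0][]; }
        public byte[][] getEndKeys() { return new byte[0][]; }
        public void clearRegionCache() {}
        public List<RegionLocations> listRegionLocations() { return Collections.emptyList(); }
    }

    public static void main(String[] args) {
        HRegionInterface t = new FakeTable();
        System.out.println(t.getStartKeys().length);  // prints 0
    }
}
```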



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11753) Document HBASE_SHELL_OPTS environment variable

2014-08-18 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101093#comment-14101093
 ] 

Jonathan Hsieh commented on HBASE-11753:


lgtm. +1

 Document HBASE_SHELL_OPTS environment variable
 --

 Key: HBASE-11753
 URL: https://issues.apache.org/jira/browse/HBASE-11753
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
 Fix For: 0.99.0, 0.96.3, 0.98.5, 0.94.22, 2.0.0

 Attachments: HBASE-11753.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11753) Document HBASE_SHELL_OPTS environment variable

2014-08-18 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-11753:
---

   Resolution: Fixed
Fix Version/s: (was: 0.94.22)
   (was: 0.98.5)
   (was: 0.96.3)
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed the docs updates to branch-1 and trunk since those will be the 
long lived docs branches.

 Document HBASE_SHELL_OPTS environment variable
 --

 Key: HBASE-11753
 URL: https://issues.apache.org/jira/browse/HBASE-11753
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE-11753.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11757) Provide a common base abstract class for both RegionObserver and MasterObserver

2014-08-18 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-11757:


   Resolution: Fixed
Fix Version/s: (was: 0.99.0)
   1.0.0
   Status: Resolved  (was: Patch Available)

 Provide a common base abstract class for both RegionObserver and 
 MasterObserver
 ---

 Key: HBASE-11757
 URL: https://issues.apache.org/jira/browse/HBASE-11757
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: Matteo Bertozzi
 Fix For: 1.0.0, 2.0.0, 0.98.6

 Attachments: HBASE-11757-0.98-v0.patch, HBASE-11757-v0.patch


 Some security coprocessors extend both RegionObserver and MasterObserver, 
 unfortunately only one of the two can use the available base abstract class 
 implementations. Provide a common base abstract class for both the 
 RegionObserver and MasterObserver interfaces. Update current coprocessors 
 that extend both interfaces to use the new common base abstract class.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11550) Custom value for BUCKET_CACHE_BUCKETS_KEY should be sorted

2014-08-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101124#comment-14101124
 ] 

stack commented on HBASE-11550:
---

bq. I don't think it fair to the new contributor to have their contributions 
get caught up in the project politicking.

[~gustavoanatoly] To be clear, ill-defined JIRAs consume the attention of those 
who are trying to follow along. Lack of clarity in the definition means we 
need to keep an active eye out. When the issue is trivial, this is particularly 
irksome.

This issue is a good example.  It starts out without provenance -- does the 
issue come from 'code-reading'? testing? a user-reported issue? an attempt at 
setting bucket sizes in configs? -- and it has 'shoulds' and 'supposed to' in 
the subject and original description, but there is no justification as to why. 
Nick, a third party altogether, has to do detective work to elicit that there 
is an actual problem here.

For another example, see the follow-on, filed again by Ted and assigned to you, 
HBASE-11743.  Look at it. It says this issue, HBASE-11550, makes it ...such 
that there is no wastage in bucket allocation. But Nick resolves this issue 
with the comment that your patch ensures default and user-supplied config 
align, punting the wastage question to, he 'guesses', HBASE-11743. There is a 
lack of alignment here. The mess that is this issue looks set to repeat over in 
HBASE-11743.

To avoid any crossfire in the future, I'd suggest filing your own issues, 
especially if you are trying to build yourself a bit of a track record. Also 
work on non-trivial issues, as said before.  You will find it easier getting 
reviewers if the issue is non-trivial.

 Custom value for BUCKET_CACHE_BUCKETS_KEY should be sorted
 --

 Key: HBASE-11550
 URL: https://issues.apache.org/jira/browse/HBASE-11550
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0, 0.98.4, 0.98.5
Reporter: Ted Yu
Assignee: Gustavo Anatoly
Priority: Trivial
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: HBASE-11550-v1.patch, HBASE-11550-v2.patch, 
 HBASE-11550-v3.patch, HBASE-11550-v4-0.98.patch, HBASE-11550-v4.patch, 
 HBASE-11550.patch


 User can pass bucket sizes through hbase.bucketcache.bucket.sizes config 
 entry.
 The sizes are supposed to be in increasing order. Validation should be added 
 in CacheConfig#getL2().
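A sketch of the validation the description asks for, detached from CacheConfig (the method name and error text here are made up for illustration):

```java
import java.util.Arrays;

public class BucketSizeCheck {
    // Parses a comma-separated size list and rejects any list that is not
    // strictly increasing, as hbase.bucketcache.bucket.sizes expects.
    static int[] parseBucketSizes(String configValue) {
        String[] parts = configValue.split(",");
        int[] sizes = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            sizes[i] = Integer.parseInt(parts[i].trim());
            if (i > 0 && sizes[i] <= sizes[i - 1]) {
                throw new IllegalArgumentException(
                    "Bucket sizes must be in increasing order: " + configValue);
            }
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseBucketSizes("4096, 8192, 16384")));
        // prints [4096, 8192, 16384]
    }
}
```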



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-10092) Move up on to log4j2

2014-08-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101127#comment-14101127
 ] 

stack commented on HBASE-10092:
---

bq. What about putting in a log configuration file adapter so we don't have to 
change our log4j properties files until later?

That'd make it palatable.

 Move up on to log4j2
 

 Key: HBASE-10092
 URL: https://issues.apache.org/jira/browse/HBASE-10092
 Project: HBase
  Issue Type: Sub-task
Reporter: stack
Assignee: Alex Newman
 Attachments: 10092.txt, 10092v2.txt, HBASE-10092.patch


 Allows logging with less friction.  See http://logging.apache.org/log4j/2.x/  
 This rather radical transition can be done w/ minor change given they have an 
 adapter for apache's logging, the one we use.  They also have an adapter for 
 slf4j so we likely can remove at least some of the 4 versions of this module 
 our dependencies make use of.
 I made a start in the attached patch but am currently stuck in maven dependency 
 resolve hell courtesy of our slf4j.  Fixing will take some concentration and 
 a good net connection, an item I currently lack.  Other TODOs are that we will 
 need to fix our little log level setting jsp page -- will likely have to undo 
 our use of hadoop's tool here -- and the config system changes a little.
 I will return to this project soon.  Will bring numbers.
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11728) Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101138#comment-14101138
 ] 

Hudson commented on HBASE-11728:


SUCCESS: Integrated in HBase-0.98 #455 (See 
[https://builds.apache.org/job/HBase-0.98/455/])
HBASE-11728 - Data loss while scanning using PREFIX_TREE (ramkrishna: rev 
e07cf3554d628bb061aa51b9b83fd81783463e1d)
* 
hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/decode/PrefixTreeArrayScanner.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/encoding/TestPrefixTree.java
* 
hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/PrefixTreeSeeker.java


 Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING
 --

 Key: HBASE-11728
 URL: https://issues.apache.org/jira/browse/HBASE-11728
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.96.1.1, 0.98.4
 Environment: ubuntu12 
 hadoop-2.2.0
 Hbase-0.96.1.1
 SUN-JDK(1.7.0_06-b24)
Reporter: wuchengzhi
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 29cb562fad564b468ea9d61a2d60e8b0, HBASE-11728.patch, 
 HBASE-11728_1.patch, HBASE-11728_2.patch, HBASE-11728_3.patch, 
 HBASE-11728_4.patch, HFileAnalys.java, TestPrefixTree.java

   Original Estimate: 72h
  Remaining Estimate: 72h

 In the Scan case, I prepared some data as below.
 Table desc (using the prefix-tree encoding):
 'prefix_tree_test', {NAME => 'cf_1', DATA_BLOCK_ENCODING => 'PREFIX_TREE', 
 TTL => '15552000'}
 and I put 5 rows as:
 (RowKey, Qualifier, Value)
 'a-b-0-0', 'qf_1', 'c1-value'
 'a-b-A-1', 'qf_1', 'c1-value'
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 'a-b-B-2-1402397300-1402416535', 'qf_2', 'c2-value-3'
 So I tried to scan the row keys between 'a-b-A-1' and 'a-b-A-1:', and I got 
 the correct result.
 Test 1: 
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 'a-b-A-1', 'qf_1', 'c1-value'
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 Then I tried a scan with addColumn.
 Test 2:
 Scan scan = new Scan();
 scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_2"));
 scan.setStartRow("a-b-A-1".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 expected:
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 but actually I got nothing. Then I changed the addColumn to 
 scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_1")); and I got the 
 expected result 'a-b-A-1', 'qf_1', 'c1-value' as well.
 Then I did more testing. I updated the case to make the startRow greater 
 than 'a-b-A-1'.
 Test 3:
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1-".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 expected:
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 but actually I got nothing again. When I made the start row greater than 
 'a-b-A-1-1402329600-1402396277':
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1-140239".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 I got the expected row:
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 So I think it may be a bug in the prefix-tree encoding. It happens after the 
 data is flushed to the storefile, and it is OK while the data is in the 
 memstore.





[jira] [Updated] (HBASE-11512) Write region open/close events to WAL

2014-08-18 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-11512:
--

Attachment: hbase-11512_v3.patch

v3 patch from RB. 

 Write region open/close events to WAL
 -

 Key: HBASE-11512
 URL: https://issues.apache.org/jira/browse/HBASE-11512
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: hbase-11512_v1.patch, hbase-11512_v2.patch, 
 hbase-11512_v3.patch


 Similar to writing flush events to the WAL (HBASE-11511) and compaction events 
 to the WAL (HBASE-2231), we should write region open and close events to the 
 WAL. 
 This is especially important for secondary region replicas, since we can use 
 this information to pick up primary regions' files from secondary replicas.
 However, we may need this for regular inter-cluster replication as well; see 
 issues HBASE-10343 and HBASE-9465. 
 A design doc for secondary replica replication can be found at HBASE-11183. 





[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2014-08-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101149#comment-14101149
 ] 

stack commented on HBASE-4920:
--

I like #2 and #3.  I think they are fine as they are as suggestions. We can 
tidy up later.

bq. We're just down to deciding on an embodiment for the logo.

I'd like to suggest voting on the representation of the orca only, not an 
'apache hbase' + orca combination. Trying to vote on the latter will have us in 
the weeds: No, it should be on the left! No, on top!  If we can 
decide on the orca representation, this will move us another step on.  It would 
allow us to deploy the representation now.  Work on how the two are combined can 
come later. It will also vary with context (orca above, orca to the side, orca 
big or orca small).

 We need a mascot, a totem
 -

 Key: HBASE-4920
 URL: https://issues.apache.org/jira/browse/HBASE-4920
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: Apache_HBase_Orca_Logo_1.jpg, 
 Apache_HBase_Orca_Logo_Mean_version-3.pdf, 
 Apache_HBase_Orca_Logo_Mean_version-4.pdf, Apache_HBase_Orca_Logo_round5.pdf, 
 HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 2011-11-30 at 4.06.17 
 PM.png, apache hbase orca logo_Proof 3.pdf, apache logo_Proof 8.pdf, 
 jumping-orca_rotated.xcf, jumping-orca_rotated_right.png, krake.zip, 
 more_orcas.png, more_orcas2.png, orca_clipart_freevector_lhs.jpeg, 
 orca_free_vector_on_top_66percent_levelled.png, 
 orca_free_vector_sheared_rotated_rhs.png, 
 orca_free_vector_some_selections.png, photo (2).JPG, plus_orca.png, 
 proposal_1_logo.png, proposal_1_logo.xcf, proposal_2_logo.png, 
 proposal_2_logo.xcf, proposal_3_logo.png, proposal_3_logo.xcf


 We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
 Clydesdale.  We need something else.
 We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
 and we could order boxes of them from some off-shore sweatshop that 
 subcontracts to a contractor who employs child labor only.
 Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
 Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
 here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
 translation, bigdata).





[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2014-08-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101154#comment-14101154
 ] 

Andrew Purtell commented on HBASE-4920:
---

bq. If we can decide on the orca representation, this will move us another step 
on.  It would allow us to deploy the representation now. 

JM's proposals all have the same stylized Orca representation. lgtm, +1. Let's 
move forward. As you say, we can tweak the positioning later. 

 We need a mascot, a totem
 -

 Key: HBASE-4920
 URL: https://issues.apache.org/jira/browse/HBASE-4920
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: Apache_HBase_Orca_Logo_1.jpg, 
 Apache_HBase_Orca_Logo_Mean_version-3.pdf, 
 Apache_HBase_Orca_Logo_Mean_version-4.pdf, Apache_HBase_Orca_Logo_round5.pdf, 
 HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 2011-11-30 at 4.06.17 
 PM.png, apache hbase orca logo_Proof 3.pdf, apache logo_Proof 8.pdf, 
 jumping-orca_rotated.xcf, jumping-orca_rotated_right.png, krake.zip, 
 more_orcas.png, more_orcas2.png, orca_clipart_freevector_lhs.jpeg, 
 orca_free_vector_on_top_66percent_levelled.png, 
 orca_free_vector_sheared_rotated_rhs.png, 
 orca_free_vector_some_selections.png, photo (2).JPG, plus_orca.png, 
 proposal_1_logo.png, proposal_1_logo.xcf, proposal_2_logo.png, 
 proposal_2_logo.xcf, proposal_3_logo.png, proposal_3_logo.xcf


 We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
 Clydesdale.  We need something else.
 We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
 and we could order boxes of them from some off-shore sweatshop that 
 subcontracts to a contractor who employs child labor only.
 Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
 Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
 here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
 translation, bigdata).





[jira] [Updated] (HBASE-11762) Record the class name of Codec in WAL header

2014-08-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-11762:
---

Attachment: 11762-v5.txt

Patch v5 addresses Enis' comment about the WALCellCodec.create() method

 Record the class name of Codec in WAL header
 

 Key: HBASE-11762
 URL: https://issues.apache.org/jira/browse/HBASE-11762
 Project: HBase
  Issue Type: Task
  Components: wal
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 1.0.0, 2.0.0, 0.98.6

 Attachments: 11762-v1.txt, 11762-v2.txt, 11762-v4.txt, 11762-v5.txt


 In follow-up discussion to HBASE-11620, Enis brought up this point:
 Related to this, should we not also write the CellCodec that we use in the 
 WAL header? Right now, the codec comes from the configuration, which means 
 that you cannot read back the WAL files if you change the codec.
 This JIRA is to implement the above suggestion.
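The idea can be sketched with the WAL header modeled as a plain metadata map. The key name `cellCodecClass` and both helper methods below are hypothetical stand-ins, not HBase's actual API:

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Illustrative sketch only: record the codec class in the WAL header at
// creation time, and prefer it over the configured codec at read time, so
// changing the configuration does not break reading existing WAL files.
public class WalCodecHeader {
    static final String CODEC_KEY = "cellCodecClass"; // hypothetical key name

    // At WAL creation time, record the codec class actually in use.
    static void writeCodecToHeader(Map<String, byte[]> header, String codecClassName) {
        header.put(CODEC_KEY, codecClassName.getBytes(StandardCharsets.UTF_8));
    }

    // At read time, prefer the codec recorded in the header; fall back to the
    // configured codec only for older files written without the header entry.
    static String resolveCodec(Map<String, byte[]> header, String configuredCodec) {
        byte[] recorded = header.get(CODEC_KEY);
        return recorded != null ? new String(recorded, StandardCharsets.UTF_8) : configuredCodec;
    }
}
```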





[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2014-08-18 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101165#comment-14101165
 ] 

Jean-Marc Spaggiari commented on HBASE-4920:


Perfect then.

Let's start a vote on the mailing list. 

2 options. 
[ ] You are fine with an Orca
[ ] You are not fine with an Orca

User mailing list? Or dev only?

 We need a mascot, a totem
 -

 Key: HBASE-4920
 URL: https://issues.apache.org/jira/browse/HBASE-4920
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: Apache_HBase_Orca_Logo_1.jpg, 
 Apache_HBase_Orca_Logo_Mean_version-3.pdf, 
 Apache_HBase_Orca_Logo_Mean_version-4.pdf, Apache_HBase_Orca_Logo_round5.pdf, 
 HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 2011-11-30 at 4.06.17 
 PM.png, apache hbase orca logo_Proof 3.pdf, apache logo_Proof 8.pdf, 
 jumping-orca_rotated.xcf, jumping-orca_rotated_right.png, krake.zip, 
 more_orcas.png, more_orcas2.png, orca_clipart_freevector_lhs.jpeg, 
 orca_free_vector_on_top_66percent_levelled.png, 
 orca_free_vector_sheared_rotated_rhs.png, 
 orca_free_vector_some_selections.png, photo (2).JPG, plus_orca.png, 
 proposal_1_logo.png, proposal_1_logo.xcf, proposal_2_logo.png, 
 proposal_2_logo.xcf, proposal_3_logo.png, proposal_3_logo.xcf


 We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
 Clydesdale.  We need something else.
 We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
 and we could order boxes of them from some off-shore sweatshop that 
 subcontracts to a contractor who employs child labor only.
 Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
 Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
 here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
 translation, bigdata).





[jira] [Commented] (HBASE-11657) Put HTable region methods in an interface

2014-08-18 Thread Carter (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101189#comment-14101189
 ] 

Carter commented on HBASE-11657:


I mainly wanted to make sure that the same person who made HRL private is okay 
with whatever we come up with for this interface.  Since you are one and the 
same person, I am less concerned.  ;-)

I double-checked and there is actually still a problem with the 
(byte[][]/byte[][]) -> List<ServerName>s.  In short, TableInputFormatBase wants 
the following from the HTable that is being passed to it:
# The hostname and port of the regionserver for a row (handled by ServerName)
# The name of the table (we can add getTableName to RegionLocator)
# The region name itself, which it uses to look up the region size in the 
RegionSizeCalculator (handled by HRegionInfo)

I see the following alternatives:
* Make HRL public.  It contains ServerName and HRegionInfo, which are both 
required by the current implementation of TableInputFormatBase.
* Return ServerName and the region name in some new POJO
* Find a new way to do what TableInputFormatBase wants to accomplish

Sorry to open up this can of worms, but that's part of the fun of retrofitting 
an interface.


 Put HTable region methods in an interface
 -

 Key: HBASE-11657
 URL: https://issues.apache.org/jira/browse/HBASE-11657
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.99.0
Reporter: Carter
Assignee: Carter
 Fix For: 0.99.0

 Attachments: HBASE_11657.patch, HBASE_11657_v2.patch, 
 HBASE_11657_v3.patch, HBASE_11657_v3.patch, HBASE_11657_v4.patch


 Most of the HTable methods are now abstracted by HTableInterface, with the 
 notable exception of the following methods that pertain to region metadata:
 {code}
 HRegionLocation getRegionLocation(final String row)
 HRegionLocation getRegionLocation(final byte [] row)
 HRegionLocation getRegionLocation(final byte [] row, boolean reload)
 byte [][] getStartKeys()
 byte[][] getEndKeys()
 Pair<byte[][],byte[][]> getStartEndKeys()
 void clearRegionCache()
 {code}
 and a default-scope method which should maybe be bundled with the others:
 {code}
 List<RegionLocations> listRegionLocations()
 {code}
 Since the consensus seems to be that these would muddy HTableInterface with 
 non-core functionality, where should it go?  MapReduce looks up the region 
 boundaries, so it needs to be exposed somewhere.
 Let me throw out a straw man to start the conversation.  I propose:
 {code}
 org.apache.hadoop.hbase.client.HRegionInterface
 {code}
 Have HTable implement this interface.  Also add these methods to HConnection:
 {code}
 HRegionInterface getTableRegion(TableName tableName)
 HRegionInterface getTableRegion(TableName tableName, ExecutorService pool)
 {code}
 [~stack], [~ndimiduk], [~enis], thoughts?
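For illustration, the straw man could be sketched roughly as below. `HRegionLocation` here is a minimal placeholder so the snippet is self-contained, and the interface shape is only one reading of the proposal, not committed API:

```java
import java.util.List;

// Hypothetical sketch of the proposed region-metadata interface, with the
// generics restored. The real HRegionLocation/Pair types from HBase are
// replaced by a tiny stand-in so the sketch compiles on its own.
public class RegionInterfaceSketch {
    static class HRegionLocation {           // placeholder for o.a.h.hbase.HRegionLocation
        final String hostnamePort;
        HRegionLocation(String hostnamePort) { this.hostnamePort = hostnamePort; }
    }

    // Straw-man interface holding the region methods split out of HTable.
    interface HRegionInterface {
        HRegionLocation getRegionLocation(byte[] row, boolean reload);
        byte[][] getStartKeys();
        byte[][] getEndKeys();
        List<HRegionLocation> listRegionLocations();
    }
}
```

HTable would implement this interface, and HConnection would hand out per-table instances of it, which is what TableInputFormatBase would consume instead of a concrete HTable.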





[jira] [Created] (HBASE-11772) Bulk load mvcc and seqId issues with native hfiles

2014-08-18 Thread Jerry He (JIRA)
Jerry He created HBASE-11772:


 Summary: Bulk load mvcc and seqId issues with native hfiles
 Key: HBASE-11772
 URL: https://issues.apache.org/jira/browse/HBASE-11772
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.5
Reporter: Jerry He


There are mvcc and seqId issues when bulk loading native hfiles -- meaning 
hfiles that are direct file copy-outs from HBase, not from an HFileOutputFormat 
job.

There are differences between these two types of hfiles.
Native hfiles can have a non-zero MAX_MEMSTORE_TS_KEY value and non-zero 
mvcc values in cells. 
Native hfiles also have MAX_SEQ_ID_KEY.
Native hfiles do not have BULKLOAD_TIME_KEY.

Here are a couple of problems I observed when bulk loading native hfiles.

1.  Cells in newly bulk loaded hfiles can be invisible to scans.

It is easy to re-create.
Bulk load a native hfile that has a larger mvcc value in its cells, e.g. 10.
If the current readpoint when initiating a scan is less than 10, the cells in 
the new hfile are skipped and thus become invisible.
We don't reset the readpoint of a region after bulk load.

2. The current StoreFile.isBulkLoadResult() is implemented as:
{code}
return metadataMap.containsKey(BULKLOAD_TIME_KEY)
{code}
which does not detect bulk loaded native hfiles.

3. Another observed problem is possible data loss during log recovery. 
It is similar to HBASE-10958 reported by [~jdcryans]. Borrowing the re-create 
steps from HBASE-10958:

1) Create an empty table
2) Put one row in it (let's say it gets seqId 1)
3) Bulk load one native hfile with a large seqId (e.g. 100). The native hfile 
can be obtained by copying out from an existing table.
4) Kill the region server that holds the table's region.
Scan the table once the region is made available again. The first row, at seqId 
1, will be missing since the HFile with seqId 100 makes us believe that 
everything that came before it was flushed. 

Problem 3 is probably related to 2. We will be OK if we use the seqId appended 
during bulk load instead of the 100 from inside the file.
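Problem 1 can be illustrated with a toy model of the readpoint check. The real logic lives in StoreFileScanner.skipKVsNewerThanReadpoint(); this simplified sketch only mirrors the mvcc-vs-readpoint comparison:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model: a scan at readpoint 5 skips every cell whose mvcc is
// above 5, so cells from a bulk loaded native hfile carrying mvcc 10 are
// invisible until the region's readpoint catches up.
public class ReadpointSkipDemo {
    static List<Long> visibleCells(List<Long> cellMvccs, long readPoint) {
        List<Long> visible = new ArrayList<>();
        for (long mvcc : cellMvccs) {
            if (mvcc <= readPoint) {  // cells "newer than" the readpoint are skipped
                visible.add(mvcc);
            }
        }
        return visible;
    }
}
```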





[jira] [Commented] (HBASE-11772) Bulk load mvcc and seqId issues with native hfiles

2014-08-18 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101202#comment-14101202
 ] 

Jerry He commented on HBASE-11772:
--

The issues were observed in the 0.98 branch.
There have been changes in the master branch, e.g. HBASE-8763 combined mvcc and 
seqId, but I suspect the issues still exist there.

 Bulk load mvcc and seqId issues with native hfiles
 --

 Key: HBASE-11772
 URL: https://issues.apache.org/jira/browse/HBASE-11772
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.5
Reporter: Jerry He

 There are mvcc and seqId issues when bulk loading native hfiles -- meaning 
 hfiles that are direct file copy-outs from HBase, not from an HFileOutputFormat 
 job.
 There are differences between these two types of hfiles.
 Native hfiles can have a non-zero MAX_MEMSTORE_TS_KEY value and non-zero 
 mvcc values in cells. 
 Native hfiles also have MAX_SEQ_ID_KEY.
 Native hfiles do not have BULKLOAD_TIME_KEY.
 Here are a couple of problems I observed when bulk loading native hfiles.
 1.  Cells in newly bulk loaded hfiles can be invisible to scans.
 It is easy to re-create.
 Bulk load a native hfile that has a larger mvcc value in its cells, e.g. 10.
 If the current readpoint when initiating a scan is less than 10, the cells in 
 the new hfile are skipped and thus become invisible.
 We don't reset the readpoint of a region after bulk load.
 2. The current StoreFile.isBulkLoadResult() is implemented as:
 {code}
 return metadataMap.containsKey(BULKLOAD_TIME_KEY)
 {code}
 which does not detect bulk loaded native hfiles.
 3. Another observed problem is possible data loss during log recovery. 
 It is similar to HBASE-10958 reported by [~jdcryans]. Borrowing the re-create 
 steps from HBASE-10958:
 1) Create an empty table
 2) Put one row in it (let's say it gets seqId 1)
 3) Bulk load one native hfile with a large seqId (e.g. 100). The native hfile 
 can be obtained by copying out from an existing table.
 4) Kill the region server that holds the table's region.
 Scan the table once the region is made available again. The first row, at 
 seqId 1, will be missing since the HFile with seqId 100 makes us believe that 
 everything that came before it was flushed. 
 Problem 3 is probably related to 2. We will be OK if we use the seqId appended 
 during bulk load instead of the 100 from inside the file.





[jira] [Commented] (HBASE-11728) Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING

2014-08-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101209#comment-14101209
 ] 

Hudson commented on HBASE-11728:


FAILURE: Integrated in HBase-1.0 #108 (See 
[https://builds.apache.org/job/HBase-1.0/108/])
HBASE-11728 - Data loss while scanning using PREFIX_TREE (ramkrishna: rev 
f8eb1962dc9e92122d00cccfede819014a1cc8f6)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/encoding/TestPrefixTree.java
* 
hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/decode/PrefixTreeArrayScanner.java
* 
hbase-prefix-tree/src/main/java/org/apache/hadoop/hbase/codec/prefixtree/PrefixTreeSeeker.java


 Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING
 --

 Key: HBASE-11728
 URL: https://issues.apache.org/jira/browse/HBASE-11728
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.96.1.1, 0.98.4
 Environment: ubuntu12 
 hadoop-2.2.0
 Hbase-0.96.1.1
 SUN-JDK(1.7.0_06-b24)
Reporter: wuchengzhi
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: 29cb562fad564b468ea9d61a2d60e8b0, HBASE-11728.patch, 
 HBASE-11728_1.patch, HBASE-11728_2.patch, HBASE-11728_3.patch, 
 HBASE-11728_4.patch, HFileAnalys.java, TestPrefixTree.java

   Original Estimate: 72h
  Remaining Estimate: 72h

 In the Scan case, I prepared some data as below.
 Table desc (using the prefix-tree encoding):
 'prefix_tree_test', {NAME => 'cf_1', DATA_BLOCK_ENCODING => 'PREFIX_TREE', 
 TTL => '15552000'}
 and I put 5 rows as:
 (RowKey, Qualifier, Value)
 'a-b-0-0', 'qf_1', 'c1-value'
 'a-b-A-1', 'qf_1', 'c1-value'
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 'a-b-B-2-1402397300-1402416535', 'qf_2', 'c2-value-3'
 So I tried to scan the row keys between 'a-b-A-1' and 'a-b-A-1:', and I got 
 the correct result.
 Test 1: 
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 'a-b-A-1', 'qf_1', 'c1-value'
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 Then I tried a scan with addColumn.
 Test 2:
 Scan scan = new Scan();
 scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_2"));
 scan.setStartRow("a-b-A-1".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 expected:
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 but actually I got nothing. Then I changed the addColumn to 
 scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_1")); and I got the 
 expected result 'a-b-A-1', 'qf_1', 'c1-value' as well.
 Then I did more testing. I updated the case to make the startRow greater 
 than 'a-b-A-1'.
 Test 3:
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1-".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 --
 expected:
 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 but actually I got nothing again. When I made the start row greater than 
 'a-b-A-1-1402329600-1402396277':
 Scan scan = new Scan();
 scan.setStartRow("a-b-A-1-140239".getBytes());
 scan.setStopRow("a-b-A-1:".getBytes());
 I got the expected row:
 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
 So I think it may be a bug in the prefix-tree encoding. It happens after the 
 data is flushed to the storefile, and it is OK while the data is in the 
 memstore.





[jira] [Commented] (HBASE-11772) Bulk load mvcc and seqId issues with native hfiles

2014-08-18 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101214#comment-14101214
 ] 

Jerry He commented on HBASE-11772:
--

Here is the proposed fix:

1) Better detection of bulk loaded files.  We can use the loaded file name with 
'_SeqId_' in it, since we already use that as a marker to get the load-time 
seqId.
2) Treat bulk loaded files as always having mvcc 0.  Don't call 
StoreFileScanner.skipKVsNewerThanReadpoint() during a scan if it is a bulk 
loaded file, regardless of whether the file has mvcc values in it.
3) Problem 3 will probably be fixed by 1).
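Point 1) could be sketched as follows. The method shape and the metadata key string are illustrative stand-ins for the StoreFile internals, not the actual implementation:

```java
import java.util.Map;

// Sketch of the proposed detection: treat a file as bulk loaded if it either
// carries the BULKLOAD_TIME_KEY metadata entry (HFileOutputFormat output) or
// its name contains the '_SeqId_' marker appended at load time, which covers
// native hfiles copied straight out of HBase.
public class BulkLoadDetection {
    static final String BULKLOAD_TIME_KEY = "BULKLOAD_TIMESTAMP"; // illustrative key name

    static boolean isBulkLoadResult(Map<String, byte[]> metadataMap, String fileName) {
        return metadataMap.containsKey(BULKLOAD_TIME_KEY)
            || fileName.contains("_SeqId_");
    }
}
```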

 Bulk load mvcc and seqId issues with native hfiles
 --

 Key: HBASE-11772
 URL: https://issues.apache.org/jira/browse/HBASE-11772
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.5
Reporter: Jerry He

 There are mvcc and seqId issues when bulk loading native hfiles -- meaning 
 hfiles that are direct file copy-outs from HBase, not from an HFileOutputFormat 
 job.
 There are differences between these two types of hfiles.
 Native hfiles can have a non-zero MAX_MEMSTORE_TS_KEY value and non-zero 
 mvcc values in cells. 
 Native hfiles also have MAX_SEQ_ID_KEY.
 Native hfiles do not have BULKLOAD_TIME_KEY.
 Here are a couple of problems I observed when bulk loading native hfiles.
 1.  Cells in newly bulk loaded hfiles can be invisible to scans.
 It is easy to re-create.
 Bulk load a native hfile that has a larger mvcc value in its cells, e.g. 10.
 If the current readpoint when initiating a scan is less than 10, the cells in 
 the new hfile are skipped and thus become invisible.
 We don't reset the readpoint of a region after bulk load.
 2. The current StoreFile.isBulkLoadResult() is implemented as:
 {code}
 return metadataMap.containsKey(BULKLOAD_TIME_KEY)
 {code}
 which does not detect bulk loaded native hfiles.
 3. Another observed problem is possible data loss during log recovery. 
 It is similar to HBASE-10958 reported by [~jdcryans]. Borrowing the re-create 
 steps from HBASE-10958:
 1) Create an empty table
 2) Put one row in it (let's say it gets seqId 1)
 3) Bulk load one native hfile with a large seqId (e.g. 100). The native hfile 
 can be obtained by copying out from an existing table.
 4) Kill the region server that holds the table's region.
 Scan the table once the region is made available again. The first row, at 
 seqId 1, will be missing since the HFile with seqId 100 makes us believe that 
 everything that came before it was flushed. 
 Problem 3 is probably related to 2. We will be OK if we use the seqId appended 
 during bulk load instead of the 100 from inside the file.





[jira] [Updated] (HBASE-10092) Move up on to log4j2

2014-08-18 Thread Alex Newman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-10092:


Fix Version/s: 2.0.0

 Move up on to log4j2
 

 Key: HBASE-10092
 URL: https://issues.apache.org/jira/browse/HBASE-10092
 Project: HBase
  Issue Type: Sub-task
Reporter: stack
Assignee: Alex Newman
 Fix For: 2.0.0

 Attachments: 10092.txt, 10092v2.txt, HBASE-10092.patch


 Allows logging with less friction.  See http://logging.apache.org/log4j/2.x/  
 This rather radical transition can be done w/ minor change given they have an 
 adapter for apache's logging, the one we use.  They also have an adapter for 
 slf4j, so we can likely remove at least some of the 4 versions of this module 
 our dependencies make use of.
 I made a start in the attached patch but am currently stuck in maven dependency 
 resolution hell courtesy of our slf4j.  Fixing will take some concentration and 
 a good net connection, an item I currently lack.  Other TODOs: we will need to 
 fix our little log-level-setting jsp page -- we will likely have to undo our 
 use of hadoop's tool here -- and the config system changes a little.
 I will return to this project soon.  Will bring numbers.
  





[jira] [Commented] (HBASE-11734) Document changed behavior of hbase.hstore.time.to.purge.deletes

2014-08-18 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101252#comment-14101252
 ] 

Jonathan Hsieh commented on HBASE-11734:


Thanks Misty.

Minor nit fix: are purge -> are *purged*

{quote}
+<description>The amount of time to delay purging of delete markers with 
future timestamps. If 
+  unset, or set to 0, all delete markers, including those with future 
timestamps, are purge 
+  during the next major compaction. Otherwise, a delete marker is kept 
until the major compaction 
{quote}

I fixed it when I committed to master and branch-1.  


 Document changed behavior of hbase.hstore.time.to.purge.deletes
 ---

 Key: HBASE-11734
 URL: https://issues.apache.org/jira/browse/HBASE-11734
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
 Fix For: 0.99.0, 0.98.2, 0.96.3

 Attachments: HBASE-11734.patch








[jira] [Updated] (HBASE-11734) Document changed behavior of hbase.hstore.time.to.purge.deletes

2014-08-18 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-11734:
---

Attachment: hbase-11734.v2.branch1.patch
hbase-11734.v2.patch

I've committed the v2 versions of the patch.

 Document changed behavior of hbase.hstore.time.to.purge.deletes
 ---

 Key: HBASE-11734
 URL: https://issues.apache.org/jira/browse/HBASE-11734
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE-11734.patch, hbase-11734.v2.branch1.patch, 
 hbase-11734.v2.patch








[jira] [Updated] (HBASE-11734) Document changed behavior of hbase.hstore.time.to.purge.deletes

2014-08-18 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-11734:
---

Fix Version/s: (was: 0.96.3)
   (was: 0.98.2)
   2.0.0

 Document changed behavior of hbase.hstore.time.to.purge.deletes
 ---

 Key: HBASE-11734
 URL: https://issues.apache.org/jira/browse/HBASE-11734
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE-11734.patch, hbase-11734.v2.branch1.patch, 
 hbase-11734.v2.patch








[jira] [Updated] (HBASE-11734) Document changed behavior of hbase.hstore.time.to.purge.deletes

2014-08-18 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-11734:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 Document changed behavior of hbase.hstore.time.to.purge.deletes
 ---

 Key: HBASE-11734
 URL: https://issues.apache.org/jira/browse/HBASE-11734
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
 Fix For: 0.99.0, 2.0.0

 Attachments: HBASE-11734.patch, hbase-11734.v2.branch1.patch, 
 hbase-11734.v2.patch








[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2014-08-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101276#comment-14101276
 ] 

stack commented on HBASE-4920:
--

[~jmspaggi] We've already run the orca vote and that vote passed. See 
http://search-hadoop.com/m/DHED4yIYZl1  If you check out the thread, it seemed 
to want to move naturally to the next stage: the vote on the orca 
representation.

 We need a mascot, a totem
 -

 Key: HBASE-4920
 URL: https://issues.apache.org/jira/browse/HBASE-4920
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: Apache_HBase_Orca_Logo_1.jpg, 
 Apache_HBase_Orca_Logo_Mean_version-3.pdf, 
 Apache_HBase_Orca_Logo_Mean_version-4.pdf, Apache_HBase_Orca_Logo_round5.pdf, 
 HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 2011-11-30 at 4.06.17 
 PM.png, apache hbase orca logo_Proof 3.pdf, apache logo_Proof 8.pdf, 
 jumping-orca_rotated.xcf, jumping-orca_rotated_right.png, krake.zip, 
 more_orcas.png, more_orcas2.png, orca_clipart_freevector_lhs.jpeg, 
 orca_free_vector_on_top_66percent_levelled.png, 
 orca_free_vector_sheared_rotated_rhs.png, 
 orca_free_vector_some_selections.png, photo (2).JPG, plus_orca.png, 
 proposal_1_logo.png, proposal_1_logo.xcf, proposal_2_logo.png, 
 proposal_2_logo.xcf, proposal_3_logo.png, proposal_3_logo.xcf


 We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
 Clydesdale.  We need something else.
 We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
 and we could order boxes of them from some off-shore sweatshop that 
 subcontracts to a contractor who employs child labor only.
 Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
 Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
 here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
 translation, bigdata).





[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2014-08-18 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101299#comment-14101299
 ] 

Jean-Marc Spaggiari commented on HBASE-4920:


From the above comments, it sounds like the consensus was page 1 of the PDF. Do 
we want to add http://www.vectorfree.com/jumping-orca into the vote too? Or stay 
with page 1?

Like, do you agree on this orca (page 1) as the logo, yes/no?

Or 

Rate those 4 orcas from 1 to 4 (and then we compile the results)?

Is the orca on page 1 of the PDF free of rights?

 We need a mascot, a totem
 -

 Key: HBASE-4920
 URL: https://issues.apache.org/jira/browse/HBASE-4920
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: Apache_HBase_Orca_Logo_1.jpg, 
 Apache_HBase_Orca_Logo_Mean_version-3.pdf, 
 Apache_HBase_Orca_Logo_Mean_version-4.pdf, Apache_HBase_Orca_Logo_round5.pdf, 
 HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 2011-11-30 at 4.06.17 
 PM.png, apache hbase orca logo_Proof 3.pdf, apache logo_Proof 8.pdf, 
 jumping-orca_rotated.xcf, jumping-orca_rotated_right.png, krake.zip, 
 more_orcas.png, more_orcas2.png, orca_clipart_freevector_lhs.jpeg, 
 orca_free_vector_on_top_66percent_levelled.png, 
 orca_free_vector_sheared_rotated_rhs.png, 
 orca_free_vector_some_selections.png, photo (2).JPG, plus_orca.png, 
 proposal_1_logo.png, proposal_1_logo.xcf, proposal_2_logo.png, 
 proposal_2_logo.xcf, proposal_3_logo.png, proposal_3_logo.xcf


 We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
 Clydesdale.  We need something else.
 We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
 and we could order boxes of them from some off-shore sweatshop that 
 subcontracts to a contractor who employs child labor only.
 Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
 Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
 here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
 translation, bigdata).





[jira] [Created] (HBASE-11773) Wrong field used for protobuf construction in RegionStates.

2014-08-18 Thread Andrey Stepachev (JIRA)
Andrey Stepachev created HBASE-11773:


 Summary: Wrong field used for protobuf construction in 
RegionStates.
 Key: HBASE-11773
 URL: https://issues.apache.org/jira/browse/HBASE-11773
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Reporter: Andrey Stepachev
Assignee: Andrey Stepachev


Protobuf <-> Java POJO converter uses the wrong field for converted enum 
construction (actually, the default value of the protobuf message is used).





[jira] [Commented] (HBASE-11682) Explain hotspotting

2014-08-18 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101308#comment-14101308
 ] 

Jonathan Hsieh commented on HBASE-11682:


{code}
+  <para>Salting in this sense has nothing to do with cryptography, but refers to adding random
+data to the start of a row key. In this case, salting refers to adding a prefix to the row
+key to cause it to sort differently than it otherwise would. Salting can be helpful if you
+have a few keys that come up over and over, along with other rows that don't fit those keys.
+In that case, the regions holding rows with the hot keys would be overloaded, compared to
+the other regions. Salting completely removes ordering, so is often a poorer choice than
+hashing. Using totally random row keys for data which is accessed sequentially would remove
+the benefit of HBase's row-sorting algorithm and cause very poor performance, as each get or
+scan would need to query all regions.</para>
{code}

I don't think this salting example is correct about the ramifications. Both 
Nick and I agree that salting is putting some random value in front of the 
actual value. This means that instead of one sorted list of entries, we'd have 
n sorted lists of entries if the cardinality of the salt is n.

Example:  naively we have rowkeys like this:

foo0001
foo0002
foo0003
foo0004

if we use a 4-way salt (a, b, c, d), we could end up with the data resorted like this:

a-foo0003
b-foo0001
c-foo0004
d-foo0002

Let's say we add some new values to row foo0003. It could get salted with a new 
salt, let's say 'c'.

a-foo0003
b-foo0001
*c-foo0003*
c-foo0004
d-foo0002

To read, we could still get things back in the original order, but we'd have to 
have a reader starting from each salt in parallel to get the rows back in 
order (and we'd likely need to do some coalescing of foo0003 to combine the 
a-foo0003 and c-foo0003 rows back into one). The effect in this situation is 
that we could be writing with 4x the throughput, since we would be on 4 
different machines (assuming that the a, b, c, d salts are balanced onto 
different machines).

Nick's point of view (please correct me if I am wrong) is that you could 
salt the original row key with a one-way hash, so that foo0003 would always 
get salted with 'a'. This would spread rowkeys that are lexicographically 
close (foo0001 and foo0002) to different machines, which could help reduce 
contention and increase overall throughput, but would never allow a single 
row to have 4x the throughput like the other approach.
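The two prefixing schemes under discussion can be contrasted in a small sketch. This is a hypothetical illustration, not HBase API: the names `randomSalt` and `hashSalt`, the four-bucket salt alphabet, and the `writeAttempt` parameter (standing in for a random salt pick) are all assumptions made for the example.

```java
// Sketch: random salting vs. deterministic hash-based salting of a row key.
// All names here are illustrative, not part of HBase.
public class SaltSketch {
    static final String[] SALTS = {"a", "b", "c", "d"};

    // Random salting: the same row key may land under different prefixes
    // on different writes (writeAttempt stands in for the random choice),
    // so reads must fan out to every salt bucket and coalesce.
    static String randomSalt(String rowKey, int writeAttempt) {
        return SALTS[writeAttempt % SALTS.length] + "-" + rowKey;
    }

    // Hash-based salting: a deterministic one-way function of the key, so a
    // given row always maps to the same bucket and a single-row read needs
    // only one lookup -- but a single hot row stays on one bucket.
    static String hashSalt(String rowKey) {
        int bucket = Math.abs(rowKey.hashCode() % SALTS.length);
        return SALTS[bucket] + "-" + rowKey;
    }

    public static void main(String[] args) {
        // foo0003 always hashes to the same bucket...
        System.out.println(hashSalt("foo0003").equals(hashSalt("foo0003")));
        // ...but random salting can scatter it across buckets.
        System.out.println(randomSalt("foo0003", 0).equals(randomSalt("foo0003", 2)));
    }
}
```

The trade-off in the comments above falls out directly: random salting lets one hot row ride all four buckets (4x write throughput, read fan-out), while hash salting keeps each row on one bucket (cheap reads, no per-row speedup).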

{code}
+  <para>Hashing refers to applying a random one-way function to the row key, such that a
+particular row always gets the same arbitrary value applied. This preserves the sort order
+so that scans are effective, but spreads out load across a region. One example where hashing
+is the right strategy would be if for some reason, a large proportion of rows started with
+the same letter. Normally, these would all be sorted into the same region. You can apply a
+hash to artificially differentiate them and spread them out.</para>
{code}

Hashing actually totally trashes the sort order -- in fact, the goal of hashing 
is to disperse entries that are lexicographically near each other as evenly as 
possible.

 Explain hotspotting
 ---

 Key: HBASE-11682
 URL: https://issues.apache.org/jira/browse/HBASE-11682
 Project: HBase
  Issue Type: Task
  Components: documentation
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
 Attachments: HBASE-11682-1.patch, HBASE-11682.patch, HBASE-11682.patch








[jira] [Updated] (HBASE-11773) Wrong field used for protobuf construction in RegionStates.

2014-08-18 Thread Andrey Stepachev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Stepachev updated HBASE-11773:
-

Affects Version/s: 2.0.0
   1.0.0

 Wrong field used for protobuf construction in RegionStates.
 ---

 Key: HBASE-11773
 URL: https://issues.apache.org/jira/browse/HBASE-11773
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: 1.0.0, 2.0.0
Reporter: Andrey Stepachev
Assignee: Andrey Stepachev

 Protobuf <-> Java POJO converter uses the wrong field for converted enum 
 construction (actually, the default value of the protobuf message is used).





[jira] [Updated] (HBASE-11739) Document blockCache contents report in the UI

2014-08-18 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11739:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Nice. Committed.

 Document blockCache contents report in the UI
 -

 Key: HBASE-11739
 URL: https://issues.apache.org/jira/browse/HBASE-11739
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
 Fix For: 0.99.0

 Attachments: HBASE-11739.patch, bc_basic.png, bc_basic.png, 
 bc_config.png, bc_l1.png, bc_l2_buckets.png, bc_stats.png








[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2014-08-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101323#comment-14101323
 ] 

stack commented on HBASE-11747:
---

Good one. Every RS sending 100MB of 'status' to the master every second or so 
is just obnoxious, especially when much of this info is already duplicated on 
our metrics 'channel'. Thanks for bringing this one up, Virag. We need a bit of 
fixup in here.

 ClusterStatus is too bulky 
 ---

 Key: HBASE-11747
 URL: https://issues.apache.org/jira/browse/HBASE-11747
 Project: HBase
  Issue Type: Sub-task
Reporter: Virag Kothari
 Attachments: exceptiontrace


 The following exception occurred on 0.98 with 1M regions on a cluster with 160 region servers:
 {code}
 Caused by: java.io.IOException: Call to regionserverhost:port failed on local 
 exception: com.google.protobuf.InvalidProtocolBufferException: Protocol 
 message was too large.  May be malicious.  Use 
 CodedInputStream.setSizeLimit() to increase the size limit.
   at 
 org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1482)
   at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1454)
   at 
 org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654)
   at 
 org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712)
   at 
 org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.getClusterStatus(MasterProtos.java:42555)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$5.getClusterStatus(HConnectionManager.java:2132)
   at 
 org.apache.hadoop.hbase.client.HBaseAdmin$16.call(HBaseAdmin.java:2166)
   at 
 org.apache.hadoop.hbase.client.HBaseAdmin$16.call(HBaseAdmin.java:2162)
   at 
 org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
   ... 43 more
 Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol 
 message was too large.  May be malicious.  Use 
 CodedInputStream.setSizeLimit() to increase the size limit.
   at 
 com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
 {code}





[jira] [Updated] (HBASE-11773) Wrong field used for protobuf construction in RegionStates.

2014-08-18 Thread Andrey Stepachev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Stepachev updated HBASE-11773:
-

Attachment: HBASE-11773.patch

 Wrong field used for protobuf construction in RegionStates.
 ---

 Key: HBASE-11773
 URL: https://issues.apache.org/jira/browse/HBASE-11773
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: 1.0.0, 2.0.0
Reporter: Andrey Stepachev
Assignee: Andrey Stepachev
 Attachments: HBASE-11773.patch


 Protobuf <-> Java POJO converter uses the wrong field for converted enum 
 construction (actually, the default value of the protobuf message is used).





[jira] [Updated] (HBASE-11232) Region fail to release the updatelock for illegal CF in multi row mutations

2014-08-18 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-11232:
--

Fix Version/s: 0.94.23

 Region fail to release the updatelock for illegal CF in multi row mutations
 ---

 Key: HBASE-11232
 URL: https://issues.apache.org/jira/browse/HBASE-11232
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.19
Reporter: Liu Shaohui
Assignee: Liu Shaohui
 Fix For: 0.94.23

 Attachments: HBASE-11232-0.94.diff


 The failback code in processRowsWithLocks did not check the column family. If 
 there is an illegal CF in the mutation, it will throw a NullPointerException 
 and the updates lock will not be released, so the region cannot be flushed or 
 compacted. 
 HRegion #4946
 {code}
 if (!mutations.isEmpty() && !walSyncSuccessful) {
   LOG.warn("Wal sync failed. Roll back " + mutations.size() +
       " memstore keyvalues for row(s):" +
       processor.getRowsToLock().iterator().next() + "...");
   for (KeyValue kv : mutations) {
     stores.get(kv.getFamily()).rollback(kv);
   }
 }
 // 11. Roll mvcc forward
 if (writeEntry != null) {
   mvcc.completeMemstoreInsert(writeEntry);
   writeEntry = null;
 }
 if (locked) {
   this.updatesLock.readLock().unlock();
   locked = false;
 }
 {code}
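The guard the report implies can be sketched in simplified form. This is a hypothetical illustration, not the actual HBase fix: `Store` is stood in for by a plain `Runnable`, and the method and field names (`rollbackSafely`, `rolledBack`) are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the missing null check: skip families with no matching store
// instead of letting stores.get(...) return null and NPE while the updates
// lock is still held. Types are simplified stand-ins, not HBase classes.
public class RollbackSketch {
    static int rolledBack = 0;

    static void rollbackSafely(Map<String, Runnable> stores, String[] families) {
        for (String family : families) {
            Runnable store = stores.get(family);
            if (store == null) {
                // Illegal CF: nothing to roll back. Crucially, do not throw
                // here -- the caller must still reach the unlock path.
                continue;
            }
            store.run(); // stands in for store.rollback(kv)
        }
    }

    public static void main(String[] args) {
        Map<String, Runnable> stores = new HashMap<>();
        stores.put("cf1", () -> rolledBack++);
        // "bogusCF" has no store; with the guard this no longer NPEs.
        rollbackSafely(stores, new String[]{"cf1", "bogusCF"});
        System.out.println(rolledBack);
    }
}
```

The point is only the control flow: whatever the real patch does with an illegal CF, the rollback loop must not throw before `updatesLock.readLock().unlock()` runs.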





[jira] [Updated] (HBASE-11773) Wrong field used for protobuf construction in RegionStates.

2014-08-18 Thread Andrey Stepachev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Stepachev updated HBASE-11773:
-

Status: Patch Available  (was: Open)

 Wrong field used for protobuf construction in RegionStates.
 ---

 Key: HBASE-11773
 URL: https://issues.apache.org/jira/browse/HBASE-11773
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: 1.0.0, 2.0.0
Reporter: Andrey Stepachev
Assignee: Andrey Stepachev
 Attachments: HBASE-11773.patch


 Protobuf  Java Pojo converter uses wrong field for converted enum 
 construction (actually default value of protobuf message used).




