[jira] [Commented] (HBASE-13199) Some small improvements on canary tool
[ https://issues.apache.org/jira/browse/HBASE-13199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366820#comment-14366820 ] Liu Shaohui commented on HBASE-13199: - OK. Thanks Some small improvements on canary tool -- Key: HBASE-13199 URL: https://issues.apache.org/jira/browse/HBASE-13199 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Liu Shaohui Assignee: Liu Shaohui Fix For: 2.0.0 Attachments: HBASE-13199-v1.diff, HBASE-13199-v2.diff, HBASE-13199-v3.diff, HBASE-13199-v4.diff Improvements - Make the sniffing of regions and regionservers parallel, using a thread pool, to support large clusters with 1+ region and 500+ regionservers. - Set cacheBlocks to false in Get and Scan to avoid polluting the block cache. - Add a FirstKeyOnlyFilter to Get and Scan to avoid reading and transferring too much data from HBase; there may be many columns under a column family in a flat-wide table. - Select the region randomly when sniffing a regionserver. - Make the sink class of the canary configurable. [~stack] Suggestions are welcome. Thanks~ Another question: why check each column family with a separate request when sniffing a region? Can we just check one column family of a region? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
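The first improvement (sniffing regions in parallel via a thread pool) can be sketched as follows. This is a minimal, self-contained model using only the JDK, not the actual patch: the class and method names are illustrative, and the per-region probe is a stand-in for the Get the real canary tool would issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSniffSketch {
  // Stand-in for the per-region probe; the real canary would issue a Get
  // against the region here. The returned value models a latency reading.
  public static long sniffRegion(String region) {
    return region.length(); // simulated latency
  }

  public static void main(String[] args) throws Exception {
    List<String> regions = List.of("r1", "r22", "r333");
    // Thread pool: probes run concurrently instead of one region at a time,
    // which is what makes the canary viable on large clusters.
    ExecutorService pool = Executors.newFixedThreadPool(4);
    List<Future<Long>> probes = new ArrayList<>();
    for (String region : regions) {
      probes.add(pool.submit(() -> sniffRegion(region)));
    }
    long total = 0;
    for (Future<Long> probe : probes) {
      total += probe.get(); // wait for every probe to finish
    }
    pool.shutdown();
    System.out.println("total simulated latency: " + total); // prints 9
  }
}
```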
[jira] [Commented] (HBASE-13241) Add tests for group level grants
[ https://issues.apache.org/jira/browse/HBASE-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366897#comment-14366897 ] Matteo Bertozzi commented on HBASE-13241: - I see the assert on the scan result in only one place {code} + Scan s1 = new Scan(); + try (ResultScanner scanner1 = table.getScanner(s1);) { + Result[] next1 = scanner1.next(5); + assertTrue(next1.length == 3); + } {code} all the other checks seem to just verify whether the AccessDeniedException was received or not, so verifyAllowed()/verifyDenied() should be enough. If not, why? What is the difference from the other scanAction we already have? {code} + try (ResultScanner scanner1 = table.getScanner(s1);) { + fail("Access should be denied as the user " + USER1_TESTGROUP_QUALIFIER + + " read privilege has been revoked on column family qualifier " + + Bytes.toString(TEST_FAMILY) + ':' + Bytes.toString(Q1)); + } catch (AccessDeniedException ignore) { + } {code} Add tests for group level grants Key: HBASE-13241 URL: https://issues.apache.org/jira/browse/HBASE-13241 Project: HBase Issue Type: Improvement Components: security, test Reporter: Sean Busbey Assignee: Ashish Singhi Priority: Critical Attachments: HBASE-13241-v1.patch, HBASE-13241-v2.patch, HBASE-13241-v3.patch, HBASE-13241-v4.patch, HBASE-13241-v5.patch, HBASE-13241.patch We need to have tests for group-level grants for various scopes. ref: HBASE-13239 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
[ https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-11425: --- Attachment: HBASE-11425-E2E-NotComplete.patch Attaching an E2E patch for reference. We are still doing some more cleanups, and there is some code duplication still left in the patch. Cell/DBB end-to-end on the read-path Key: HBASE-11425 URL: https://issues.apache.org/jira/browse/HBASE-11425 Project: HBase Issue Type: Umbrella Components: regionserver, Scanners Affects Versions: 0.99.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: HBASE-11425-E2E-NotComplete.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf Umbrella jira to make sure we can have blocks cached in an offheap-backed cache. In the entire read path, we can refer to this offheap buffer and avoid onheap copying. The high level items I can identify as of now are 1. Avoid the array() call on BB in the read path (this is there in many classes; we can handle it class by class). 2. Support buffer-based getter APIs in Cell. In the read path we will create a new Cell backed by a BB. Will be needed in CellComparator, Filter (like SCVF), CPs etc. 3. Avoid KeyValue.ensureKeyValue() calls in the read path - these make byte copies. 4. Remove all CP hooks (which are already deprecated) which deal with KVs (in the read path). Will add subtasks under this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
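Item 1 above (avoiding array() on a ByteBuffer) matters because a direct (offheap) buffer has no backing array at all. A minimal JDK-only sketch of the idea, with an illustrative helper name, not code from the patch:

```java
import java.nio.ByteBuffer;

public class BufferBackedReadSketch {
  // Copy len bytes starting at offset out of a ByteBuffer without touching
  // array(): a direct (offheap) buffer has no backing array, so array()
  // throws UnsupportedOperationException. Absolute get(int) works for both
  // heap and direct buffers and never moves the buffer's position.
  public static byte[] copyRange(ByteBuffer buf, int offset, int len) {
    byte[] out = new byte[len];
    for (int i = 0; i < len; i++) {
      out[i] = buf.get(offset + i);
    }
    return out;
  }

  public static void main(String[] args) {
    // Model of a block cached offheap: a direct buffer, no backing array.
    ByteBuffer offheap = ByteBuffer.allocateDirect(4);
    offheap.put(new byte[] { 10, 20, 30, 40 }).flip();
    byte[] copy = copyRange(offheap, 1, 2);
    System.out.println(copy[0] + "," + copy[1]); // prints 20,30
  }
}
```

A Cell with buffer-based getters (item 2) would expose exactly this kind of positional access instead of handing out a byte[].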
[jira] [Commented] (HBASE-13241) Add tests for group level grants
[ https://issues.apache.org/jira/browse/HBASE-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366957#comment-14366957 ] Ashish Singhi commented on HBASE-13241: --- Thanks [~mbertozzi] for taking a look. bq. I see only in one place the assert on the scan result We have it at three different places with different expected values. 1. {code} + Scan s1 = new Scan(); + try (ResultScanner scanner1 = table.getScanner(s1);) { + Result[] next1 = scanner1.next(5); + assertTrue(next1.length == 3); + } {code} 2. {code} + Scan s1 = new Scan(); + try (ResultScanner scanner1 = table.getScanner(s1);) { + Result[] next1 = scanner1.next(5); + assertTrue(next1.length == 2); + } {code} 3. {code} + Scan s1 = new Scan(); + try (ResultScanner scanner1 = table.getScanner(s1);) { + Result[] next1 = scanner1.next(5); + assertTrue(next1.length == 1); + } {code} bq. all the other checks seem to just verify if the AccessDeniedException was received or not, so verifyAllowed()/verifyDenied() should be enough. if not why? I tried it that way when [~srikanth235] suggested it offline, but here at each level we have different results. For example, when we grant a group table-level access, a user from it can also perform a scan at family level, but it is not the same when we grant the group access at qualifier level. So I would have to create many actions to have it all in one test, which I did somewhat in my first patch, but [~busbey] had another thought and I felt it was reasonable, so I broke this test up at the different levels. Also, verifyAllowed() and verifyDenied() internally use the user.runAs API. bq. what is the difference with the other scanAction we have already? If you are pointing at scanAction in TestAccessController#testRead, there we are not asserting the scan result; we are checking whether users with READ access are able to scan the table or not. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13241) Add tests for group level grants
[ https://issues.apache.org/jira/browse/HBASE-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366796#comment-14366796 ] Hadoop QA commented on HBASE-13241: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705281/HBASE-13241-v5.patch against master branch at commit f9a17edc252a88c5a1a2c7764e3f9f65623e0ced. ATTACHMENT ID: 12705281 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13292//console This message is automatically generated. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11425) Cell/DBB end-to-end on the read-path
[ https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-11425: --- Attachment: Offheap reads in HBase using BBs_V2.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366894#comment-14366894 ] Liu Shaohui commented on HBASE-12636: - [~lhofhansl] [~stack] Any suggestions about this patch? With it, the write operations to zookeeper from replication in our cluster decreased from about five thousand to several hundred per second. Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.1.0 Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff In our production cluster, we found about 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source writes the log position to zookeeper for every edit shipment. If the WAL currently being replicated is the one the regionserver is writing to, each shipment will be very small but very frequent, which causes many write operations on zookeeper. A simple solution is to write the log position to zookeeper only when the position diff or the shipped edit count exceeds a threshold, instead of on every edit shipment. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
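The proposed throttling can be modeled in a few lines. This is a simplified, JDK-only sketch of the idea (persist only when the position diff or shipped-edit count crosses a threshold), not the actual patch; the class name, method names, and threshold values are illustrative, and a counter stands in for the zookeeper write.

```java
public class ThrottledPositionLog {
  private final long positionDiffThreshold;
  private final long editCountThreshold;
  private long lastPersistedPosition = 0;
  private long editsSincePersist = 0;
  private long zkWrites = 0; // stand-in counter for zookeeper setData() calls

  public ThrottledPositionLog(long positionDiffThreshold, long editCountThreshold) {
    this.positionDiffThreshold = positionDiffThreshold;
    this.editCountThreshold = editCountThreshold;
  }

  // Called after every edit shipment with the new WAL position; persists the
  // position only when one of the two thresholds is crossed, instead of on
  // every shipment.
  public void onShipped(long walPosition, int shippedEdits) {
    editsSincePersist += shippedEdits;
    if (walPosition - lastPersistedPosition >= positionDiffThreshold
        || editsSincePersist >= editCountThreshold) {
      lastPersistedPosition = walPosition;
      editsSincePersist = 0;
      zkWrites++;
    }
  }

  public long getZkWrites() { return zkWrites; }

  public static void main(String[] args) {
    // 500 tiny shipments of 10 bytes / 1 edit each; without throttling this
    // would be 500 zookeeper writes, with it only one per 1000 bytes.
    ThrottledPositionLog log = new ThrottledPositionLog(1000, 100);
    for (int i = 1; i <= 500; i++) {
      log.onShipped(i * 10L, 1);
    }
    System.out.println("zookeeper writes: " + log.getZkWrites()); // prints 5
  }
}
```

The trade-off is that on a regionserver crash, replication may re-ship the edits accumulated since the last persisted position, which is safe because replication is already idempotent about re-shipping.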
[jira] [Updated] (HBASE-13071) Hbase Streaming Scan Feature
[ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-13071: -- Attachment: HBASE-13071_trunk_10.patch Hbase Streaming Scan Feature Key: HBASE-13071 URL: https://issues.apache.org/jira/browse/HBASE-13071 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: 99.eshcar.png, HBASE-13071_98_1.patch, HBASE-13071_trunk_1.patch, HBASE-13071_trunk_10.patch, HBASE-13071_trunk_10.patch, HBASE-13071_trunk_2.patch, HBASE-13071_trunk_3.patch, HBASE-13071_trunk_4.patch, HBASE-13071_trunk_5.patch, HBASE-13071_trunk_6.patch, HBASE-13071_trunk_7.patch, HBASE-13071_trunk_8.patch, HBASE-13071_trunk_9.patch, HBaseStreamingScanDesign.pdf, HbaseStreamingScanEvaluation.pdf, HbaseStreamingScanEvaluationwithMultipleClients.pdf, gc.eshcar.png, hits.eshcar.png, network.png A scan operation iterates over all rows of a table or a subrange of the table. The synchronous nature in which the data is served at the client side hinders the speed at which the application traverses the data: it increases the overall processing time, and may cause a great variance in the times the application waits for the next piece of data. The scanner next() method at the client side invokes an RPC to the regionserver and then stores the results in a cache. The application can specify how many rows will be transmitted per RPC; by default this is set to 100 rows. The cache can be considered as a producer-consumer queue, where the hbase client pushes the data to the queue and the application consumes it. Currently this queue is synchronous, i.e., blocking. More specifically, when the application has consumed all the data from the cache --- so the cache is empty --- the hbase client retrieves additional data from the server and re-fills the cache with new data. During this time the application is blocked. 
Under the assumption that the application processing time can be balanced by the time it takes to retrieve the data, an asynchronous approach can reduce the time the application is waiting for data. We attach a design document. We also have a patch that is based on a private branch, and some evaluation results of this code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
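The producer-consumer queue described above can be sketched with a JDK BlockingQueue. This is a minimal model of the asynchronous variant (a background thread refills the cache while the application consumes), not the attached patch; the class name is illustrative and a list of batches stands in for successive RPC responses.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PrefetchingScanCacheSketch {
  private static final String END = "\u0000END"; // sentinel for scan exhaustion
  private final BlockingQueue<String> cache = new ArrayBlockingQueue<>(100);
  private final Thread fetcher;

  // batches stands in for the per-RPC row batches from the regionserver.
  public PrefetchingScanCacheSketch(List<List<String>> batches) {
    fetcher = new Thread(() -> {
      try {
        for (List<String> batch : batches) {
          for (String row : batch) {
            cache.put(row); // refill while the application keeps consuming
          }
        }
        cache.put(END);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    fetcher.start();
  }

  // Blocks only when the background fetcher has not produced the next row yet,
  // instead of blocking for a whole RPC round-trip whenever the cache empties.
  public String next() throws InterruptedException {
    String row = cache.take();
    return END.equals(row) ? null : row;
  }

  public static void main(String[] args) throws Exception {
    PrefetchingScanCacheSketch scanner = new PrefetchingScanCacheSketch(
        List.of(List.of("row1", "row2"), List.of("row3")));
    for (String row; (row = scanner.next()) != null; ) {
      System.out.println(row); // prints row1, row2, row3 in order
    }
  }
}
```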
[jira] [Updated] (HBASE-13071) Hbase Streaming Scan Feature
[ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-13071: -- Attachment: (was: HBASE-13071_trunk_10.patch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13071) Hbase Streaming Scan Feature
[ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367141#comment-14367141 ] Hadoop QA commented on HBASE-13071: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705332/HBASE-13071_trunk_10.patch against master branch at commit f9a17edc252a88c5a1a2c7764e3f9f65623e0ced. ATTACHMENT ID: 12705332 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13294//console This message is automatically generated. 
[jira] [Commented] (HBASE-13090) Progress heartbeats for long running scanners
[ https://issues.apache.org/jira/browse/HBASE-13090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367025#comment-14367025 ] Eshcar Hillel commented on HBASE-13090: --- Could be useful to return a *non* empty result array even when the region is not exhausted. For example, if the scanner is async (HBASE-13071) the application can start iterating over the results instead of waiting for the server to collect the entire batch. Progress heartbeats for long running scanners - Key: HBASE-13090 URL: https://issues.apache.org/jira/browse/HBASE-13090 Project: HBase Issue Type: New Feature Reporter: Andrew Purtell Assignee: Jonathan Lawlor Attachments: HBASE-13090-v1.patch, HBASE-13090-v2.patch, HBASE-13090-v3.patch, HBASE-13090-v3.patch It can be necessary to set very long timeouts for clients that issue scans over large regions when all data in the region might be filtered out depending on scan criteria. This is a usability concern because it can be hard to identify what worst case timeout to use until scans are occasionally/intermittently failing in production, depending on variable scan criteria. It would be better if the client-server scan protocol can send back periodic progress heartbeats to clients as long as server scanners are alive and making progress. This is related but orthogonal to streaming scan (HBASE-13071). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13241) Add tests for group level grants
[ https://issues.apache.org/jira/browse/HBASE-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367274#comment-14367274 ] Matteo Bertozzi commented on HBASE-13241: - ok, let's look at it in a different way. When you look at the other tests written with verifyAllowed()/verifyDenied(), it is clear what the behavior is without even looking at the action implementation. When you have everything in a single block like the USER1_TESTGROUP_QUALIFIER.runAs(), you have to look at the code and figure out what it is doing. And then there is the question: why are we testing for denied for just a single user? What about the others? Yes, it may result in more code, because you have to break things down into more actions, but in my opinion it is easier to read and extend. It is also easier for someone who wants to add a new test to decide what to do: if every test is using verifyAllowed/denied, I should do that too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
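The verifyAllowed()/verifyDenied() pattern Matteo argues for can be sketched generically. This is a JDK-only stand-in, not HBase's actual test helper API: the real helpers run the action via User.runAs() and catch AccessDeniedException, while here a user is just a label and SecurityException models denial.

```java
import java.util.concurrent.Callable;

public class AccessVerifySketch {
  // Run the action for each user and assert it succeeds.
  public static void verifyAllowed(Callable<?> action, String... users) throws Exception {
    for (String user : users) {
      try {
        action.call();
      } catch (SecurityException e) {
        throw new AssertionError("user " + user + " should have been allowed", e);
      }
    }
  }

  // Run the action for each user and assert it is denied.
  public static void verifyDenied(Callable<?> action, String... users) {
    for (String user : users) {
      try {
        action.call();
        throw new AssertionError("user " + user + " should have been denied");
      } catch (SecurityException expected) {
        // access denied, as the test requires
      } catch (Exception e) {
        throw new AssertionError("unexpected failure for user " + user, e);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    Callable<String> allowedScan = () -> "3 rows";
    Callable<String> deniedScan = () -> { throw new SecurityException("no READ at qualifier"); };
    verifyAllowed(allowedScan, "userInGroupWithTableGrant");
    verifyDenied(deniedScan, "userInGroupWithoutQualifierGrant");
    System.out.println("both checks passed");
  }
}
```

The readability win is that each test becomes one line naming the action, who is allowed, and who is denied, which is exactly the property Matteo describes.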
[jira] [Commented] (HBASE-13071) Hbase Streaming Scan Feature
[ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367166#comment-14367166 ] Eshcar Hillel commented on HBASE-13071: --- Hi everyone, What would be the next thing to do to get this patch in (now that all the lights are green ;) )? Thanks, Eshcar -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13241) Add tests for group level grants
[ https://issues.apache.org/jira/browse/HBASE-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367291#comment-14367291 ] Srikanth Srungarapu commented on HBASE-13241: - Completely agree with Matteo. I had similar concerns when I posted my previous comment. Let's do one thing: I'll try to create a sample patch for verifying only at the qualifier level (and will get feedback from Sean and Matteo too), and attach it here. If you like it, you can build upon it. What say, [~ashish singhi]? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13090) Progress heartbeats for long running scanners
[ https://issues.apache.org/jira/browse/HBASE-13090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367402#comment-14367402 ] Jonathan Lawlor commented on HBASE-13090: - [~eshcar] Actually, that is how it works (sorry, I was not explicitly clear). When the time limit is reached the server will return to the client whatever it has accumulated thus far in a heartbeat message. What I meant by #2 is that it is possible (in the case of aggressive filtering) that when the time limit is reached, the server hasn't had a chance to accumulate ANY Results. In such a case, the Result array returned to the client would be empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
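The server-side behavior described above, accumulate until the batch is full or the time limit fires, then ship whatever you have, can be sketched as follows. This is a deterministic JDK-only model, not the patch: the names are illustrative, the clock is injected so the deadline is reproducible, and a Predicate stands in for the scan filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

public class HeartbeatScanSketch {
  // Accumulate filtered rows until the batch is full or the time limit is
  // reached; on the time limit, return whatever has been accumulated so far
  // (possibly nothing, if the filter rejected every row seen) so the client
  // receives a heartbeat instead of timing out.
  public static List<String> scanBatch(List<String> rows, Predicate<String> filter,
      int batchSize, long timeLimitMs, LongSupplier clock) {
    List<String> results = new ArrayList<>();
    long start = clock.getAsLong();
    for (String row : rows) {
      if (clock.getAsLong() - start >= timeLimitMs) {
        break; // time limit: ship a partial (maybe empty) batch as a heartbeat
      }
      if (filter.test(row)) {
        results.add(row);
      }
      if (results.size() >= batchSize) {
        break; // normal case: a full batch
      }
    }
    return results;
  }

  public static void main(String[] args) {
    // Fake clock advancing 1ms per call, so the 3ms limit fires mid-scan.
    long[] now = { 0 };
    LongSupplier clock = () -> now[0]++;
    List<String> batch = scanBatch(List.of("a1", "b1", "a2", "a3", "a4"),
        row -> row.startsWith("a"), 100, 3, clock);
    System.out.println(batch); // prints [a1]: partial results at the deadline
  }
}
```

With a filter that rejects everything, the same deadline produces an empty list, which is the empty-Result-array heartbeat case Jonathan describes.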
[jira] [Updated] (HBASE-13006) Document visibility label support for groups
[ https://issues.apache.org/jira/browse/HBASE-13006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HBASE-13006: - Attachment: shell-update-only.patch Attached a patch that updates the shell commands only, with no doc update. Since we only commit doc updates to master/2.0, this patch is for the other versions. Document visibility label support for groups Key: HBASE-13006 URL: https://issues.apache.org/jira/browse/HBASE-13006 Project: HBase Issue Type: Sub-task Reporter: Jerry He Assignee: Jerry He Priority: Minor Fix For: 2.0.0 Attachments: HBASE-13006-v2.patch, HBASE-13006-v3.patch, HBASE-13006.patch, shell-update-only.patch This is to document the changes added from HBASE-12745. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13090) Progress heartbeats for long running scanners
[ https://issues.apache.org/jira/browse/HBASE-13090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367423#comment-14367423 ] Jonathan Lawlor commented on HBASE-13090: - edit: was not* clear Progress heartbeats for long running scanners - Key: HBASE-13090 URL: https://issues.apache.org/jira/browse/HBASE-13090 Project: HBase Issue Type: New Feature Reporter: Andrew Purtell Assignee: Jonathan Lawlor Attachments: HBASE-13090-v1.patch, HBASE-13090-v2.patch, HBASE-13090-v3.patch, HBASE-13090-v3.patch It can be necessary to set very long timeouts for clients that issue scans over large regions when all data in the region might be filtered out depending on scan criteria. This is a usability concern because it can be hard to identify what worst case timeout to use until scans are occasionally/intermittently failing in production, depending on variable scan criteria. It would be better if the client-server scan protocol can send back periodic progress heartbeats to clients as long as server scanners are alive and making progress. This is related but orthogonal to streaming scan (HBASE-13071). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13241) Add tests for group level grants
[ https://issues.apache.org/jira/browse/HBASE-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367427#comment-14367427 ] Ashish Singhi commented on HBASE-13241: --- Thanks Matteo for the detailed explanation and Srikanth for the offer. If everyone is ok with what Matteo says I will prepare a patch by tomorrow morning as per IST. Thanks again. Add tests for group level grants Key: HBASE-13241 URL: https://issues.apache.org/jira/browse/HBASE-13241 Project: HBase Issue Type: Improvement Components: security, test Reporter: Sean Busbey Assignee: Ashish Singhi Priority: Critical Attachments: HBASE-13241-v1.patch, HBASE-13241-v2.patch, HBASE-13241-v3.patch, HBASE-13241-v4.patch, HBASE-13241-v5.patch, HBASE-13241.patch We need to have tests for group-level grants for various scopes. ref: HBASE-13239 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11425) Cell/DBB end-to-end on the read-path
[ https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367500#comment-14367500 ] ramkrishna.s.vasudevan commented on HBASE-11425: Not able to add to RB. The RB tool hangs when we try to add a patch. Cell/DBB end-to-end on the read-path Key: HBASE-11425 URL: https://issues.apache.org/jira/browse/HBASE-11425 Project: HBase Issue Type: Umbrella Components: regionserver, Scanners Affects Versions: 0.99.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: HBASE-11425-E2E-NotComplete.patch, Offheap reads in HBase using BBs_V2.pdf, Offheap reads in HBase using BBs_final.pdf Umbrella jira to make sure we can have blocks cached in offheap backed cache. In the entire read path, we can refer to this offheap buffer and avoid onheap copying. The high level items I can identify as of now are 1. Avoid the array() call on BB in read path.. (This is there in many classes. We can handle class by class) 2. Support Buffer based getter APIs in cell. In read path we will create a new Cell with backed by BB. Will need in CellComparator, Filter (like SCVF), CPs etc. 3. Avoid KeyValue.ensureKeyValue() calls in read path - This make byte copy. 4. Remove all CP hooks (which are already deprecated) which deal with KVs. (In read path) Will add subtasks under this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13270) Setter for Result#getStats is #addResults; confusing!
stack created HBASE-13270: - Summary: Setter for Result#getStats is #addResults; confusing! Key: HBASE-13270 URL: https://issues.apache.org/jira/browse/HBASE-13270 Project: HBase Issue Type: Improvement Reporter: stack Below is our [~larsgeorge] on a finding he made reviewing our API: Result class having getStats() and addResults(Stats) makes little sense... ...the naming is just weird. You have a getStats() getter and an addResults(Stats) setter??? ...Especially in the Result class and addResult() is plain misleading... This issue is about deprecating addResults and replacing it with addStats in its place. The getStats/addResult is recent. It came in with: {code} commit a411227b0ebf78b4ee8ae7179e162b54734e77de Author: Jesse Yates jesse.k.ya...@gmail.com Date: Tue Oct 28 16:14:16 2014 -0700 HBASE-5162 Basic client pushback mechanism ... {code} RegionLoadStats don't belong in Result if you ask me but better in the enveloping on invocations... but that is another issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13006) Document visibility label support for groups
[ https://issues.apache.org/jira/browse/HBASE-13006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367630#comment-14367630 ] Hadoop QA commented on HBASE-13006: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705379/shell-update-only.patch against master branch at commit f9a17edc252a88c5a1a2c7764e3f9f65623e0ced. ATTACHMENT ID: 12705379 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13295//console This message is automatically generated. 
Document visibility label support for groups Key: HBASE-13006 URL: https://issues.apache.org/jira/browse/HBASE-13006 Project: HBase Issue Type: Sub-task Reporter: Jerry He Assignee: Jerry He Priority: Minor Fix For: 2.0.0 Attachments: HBASE-13006-v2.patch, HBASE-13006-v3.patch, HBASE-13006.patch, shell-update-only.patch This is to document the changes added from HBASE-12745. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12972) Region, a supportable public/evolving subset of HRegion
[ https://issues.apache.org/jira/browse/HBASE-12972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367464#comment-14367464 ] stack commented on HBASE-12972: --- I went through the patch in rb. Do an edit and let's commit. Can do fine tuning in followups. Nice work [~apurtell] Region, a supportable public/evolving subset of HRegion --- Key: HBASE-12972 URL: https://issues.apache.org/jira/browse/HBASE-12972 Project: HBase Issue Type: New Feature Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0 Attachments: HBASE-12972-0.98.patch, HBASE-12972.patch On HBASE-12566, [~lhofhansl] proposed: {quote} Maybe we can have a {{Region}} interface that is to {{HRegion}} what {{Store}} is to {{HStore}}. Store marked with {{@InterfaceAudience.Private}} but used in some coprocessor hooks. {quote} By example, now coprocessors have to reach into HRegion in order to participate in row and region locking protocols, this is one area where the functionality is legitimate for coprocessors but not for users, so an in-between interface makes sense. In addition we should promote {{Store}}'s interface audience to LimitedPrivate(COPROC). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12975) Supportable SplitTransaction and RegionMergeTransaction interfaces
[ https://issues.apache.org/jira/browse/HBASE-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367480#comment-14367480 ] Rajeshbabu Chintaguntla commented on HBASE-12975: - [~apurtell] I am ok to proceed with the current patch. It's very clean now. {noformat} 1. Instantiate N SplitTransactions 2. Run each SplitTransaction up to PONR. Can be done in parallel. If there's a failure, invoke the rollback method on all and try again and/or do some other remediation. 3. Run each SplitTransaction past PONR. Can be done in parallel. If there's a failure, the server must abort. {noformat} This is the way I am also suggesting earlier to split multiple regions in a transaction. Supportable SplitTransaction and RegionMergeTransaction interfaces -- Key: HBASE-12975 URL: https://issues.apache.org/jira/browse/HBASE-12975 Project: HBase Issue Type: Improvement Reporter: Rajeshbabu Chintaguntla Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0 Attachments: HBASE-12975.patch, HBASE-12975.patch Making SplitTransaction, RegionMergeTransaction limited private is required to support local indexing feature in Phoenix to ensure regions colocation. We can ensure region split, regions merge in the coprocessors in few method calls without touching internals like creating zk's, file layout changes or assignments. 1) stepsBeforePONR, stepsAfterPONR we can ensure split. 2) meta entries can pass through coprocessors to atomically update with the normal split/merge. 3) rollback on failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
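The three-step protocol quoted above can be sketched as a toy model (all names hypothetical; this is not the real SplitTransaction API): run every transaction up to the point of no return (PONR), roll all of them back on any pre-PONR failure, and only then run the irreversible post-PONR steps.

```java
import java.util.List;

// Toy model of the two-phase protocol: pre-PONR steps are rollback-able,
// post-PONR steps are irreversible (a real failure there aborts the server).
public class SplitProtocolSketch {
    public enum State { INIT, PRE_PONR, ROLLED_BACK, COMMITTED }

    public static class Txn {
        public final boolean failsBeforePonr;
        public State state = State.INIT;
        public Txn(boolean failsBeforePonr) { this.failsBeforePonr = failsBeforePonr; }
        public boolean stepsBeforePONR() {
            if (failsBeforePonr) return false;
            state = State.PRE_PONR;
            return true;
        }
        public void rollback() { state = State.ROLLED_BACK; }
        public void stepsAfterPONR() { state = State.COMMITTED; }
    }

    // Returns true if all transactions committed, false if all were rolled back.
    public static boolean runAll(List<Txn> txns) {
        for (Txn t : txns) {
            if (!t.stepsBeforePONR()) {
                for (Txn u : txns) u.rollback(); // pre-PONR failure: undo everything
                return false;
            }
        }
        for (Txn t : txns) t.stepsAfterPONR(); // past PONR: must complete
        return true;
    }
}
```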
[jira] [Commented] (HBASE-13241) Add tests for group level grants
[ https://issues.apache.org/jira/browse/HBASE-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367631#comment-14367631 ] Srikanth Srungarapu commented on HBASE-13241: - bq. If everyone is ok with what Matteo says I will prepare a patch by tomorrow morning as per IST. Sure, go for it. Add tests for group level grants Key: HBASE-13241 URL: https://issues.apache.org/jira/browse/HBASE-13241 Project: HBase Issue Type: Improvement Components: security, test Reporter: Sean Busbey Assignee: Ashish Singhi Priority: Critical Attachments: HBASE-13241-v1.patch, HBASE-13241-v2.patch, HBASE-13241-v3.patch, HBASE-13241-v4.patch, HBASE-13241-v5.patch, HBASE-13241.patch We need to have tests for group-level grants for various scopes. ref: HBASE-13239 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13273) Make Result.EMPTY_RESULT read-only; currently it can be modified
[ https://issues.apache.org/jira/browse/HBASE-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367841#comment-14367841 ] Mikhail Antonov commented on HBASE-13273: - Two options I can see: make the EMPTY_RESULT object private and provide a method getEmptyResult() returning a new cloned empty Result, or make the Result object immutable (which means the copyFrom() method should be removed). Neither looks backward-compatible, though? Make Result.EMPTY_RESULT read-only; currently it can be modified Key: HBASE-13273 URL: https://issues.apache.org/jira/browse/HBASE-13273 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 1.0.0 Reporter: stack Labels: beginner Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.12 Again from [~larsgeorge] Result result2 = Result.EMPTY_RESULT; System.out.println(result2); result2.copyFrom(result1); System.out.println(result2); What do you think happens when result1 has cells? Yep, you just modified the shared public EMPTY_RESULT to be not empty anymore. Fix. Result should be non-modifiable post-construction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
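One possible route toward what the summary asks for (a Result that is non-modifiable post-construction) is a read-only flag that makes copyFrom refuse to mutate the shared instance. A simplified stand-in follows (hypothetical class; the real Result holds Cells, not strings):

```java
// Simplified model of a fix: a shared EMPTY_RESULT that refuses in-place
// mutation, so copyFrom() cannot corrupt the public constant.
public class ResultSketch {
    public static final ResultSketch EMPTY_RESULT = new ResultSketch(new String[0], true);

    private String[] cells;
    private final boolean readonly;

    public ResultSketch(String[] cells, boolean readonly) {
        this.cells = cells;
        this.readonly = readonly;
    }

    public void copyFrom(ResultSketch other) {
        if (readonly) {
            throw new UnsupportedOperationException(
                "Attempting to modify a read-only Result instance");
        }
        this.cells = other.cells.clone();
    }

    public int size() { return cells.length; }
}
```

Lars's example then fails fast instead of silently emptying (or filling) the shared constant.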
[jira] [Commented] (HBASE-11195) Potentially improve block locality during major compaction for old regions
[ https://issues.apache.org/jira/browse/HBASE-11195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367845#comment-14367845 ] Lars Hofhansl commented on HBASE-11195: --- Pretty sure this is wrong. We return true (please run the major compaction) if the current locality index is above the min requested one. As it stands we'll now - by default - _always_ compact all the old files, which we did not do before. This looks like a critical issue to me. Potentially improve block locality during major compaction for old regions -- Key: HBASE-11195 URL: https://issues.apache.org/jira/browse/HBASE-11195 Project: HBase Issue Type: Improvement Affects Versions: 1.0.0, 2.0.0, 0.94.26, 0.98.10 Reporter: churro morales Assignee: churro morales Fix For: 1.0.0, 2.0.0, 0.98.10, 0.94.27 Attachments: HBASE-11195-0.94.patch, HBASE-11195-0.98.patch, HBASE-11195.patch, HBASE-11195.patch This might be a specific use case. But we have some regions which are no longer written to (due to the key). Those regions have 1 store file and they are very old, they haven't been written to in a while. We still use these regions to read from so locality would be nice. I propose putting a configuration option: something like hbase.hstore.min.locality.to.skip.major.compact [between 0 and 1] such that you can decide whether or not to skip major compaction for an old region with a single store file. I'll attach a patch, let me know what you guys think. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11195) Potentially improve block locality during major compaction for old regions
[ https://issues.apache.org/jira/browse/HBASE-11195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367854#comment-14367854 ] churro morales commented on HBASE-11195: [~lhofhansl] Oh my, I don't know how that happened, but it looks to me that the 98 patch is incorrect; a quick look at the other patches shows they are correct. I have no idea how this happened. I am so sorry; I can get a proper patch to you guys asap. Potentially improve block locality during major compaction for old regions -- Key: HBASE-11195 URL: https://issues.apache.org/jira/browse/HBASE-11195 Project: HBase Issue Type: Improvement Affects Versions: 1.0.0, 2.0.0, 0.94.26, 0.98.10 Reporter: churro morales Assignee: churro morales Fix For: 1.0.0, 2.0.0, 0.98.10, 0.94.27 Attachments: HBASE-11195-0.94.patch, HBASE-11195-0.98.patch, HBASE-11195.patch, HBASE-11195.patch This might be a specific use case. But we have some regions which are no longer written to (due to the key). Those regions have 1 store file and they are very old, they haven't been written to in a while. We still use these regions to read from so locality would be nice. I propose putting a configuration option: something like hbase.hstore.min.locality.to.skip.major.compact [between 0 and 1] such that you can decide whether or not to skip major compaction for an old region with a single store file. I'll attach a patch, let me know what you guys think. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11195) Potentially improve block locality during major compaction for old regions
[ https://issues.apache.org/jira/browse/HBASE-11195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367859#comment-14367859 ] churro morales commented on HBASE-11195: do you want me to create a new ticket with a patch, or just add it here? It should be blockLocalityIndex < comConf.getMinLocalityToForceCompact(). Looks to be fine for trunk and 94 after looking at the patches. Potentially improve block locality during major compaction for old regions -- Key: HBASE-11195 URL: https://issues.apache.org/jira/browse/HBASE-11195 Project: HBase Issue Type: Improvement Affects Versions: 1.0.0, 2.0.0, 0.94.26, 0.98.10 Reporter: churro morales Assignee: churro morales Fix For: 1.0.0, 2.0.0, 0.98.10, 0.94.27 Attachments: HBASE-11195-0.94.patch, HBASE-11195-0.98.patch, HBASE-11195.patch, HBASE-11195.patch This might be a specific use case. But we have some regions which are no longer written to (due to the key). Those regions have 1 store file and they are very old, they haven't been written to in a while. We still use these regions to read from so locality would be nice. I propose putting a configuration option: something like hbase.hstore.min.locality.to.skip.major.compact [between 0 and 1] such that you can decide whether or not to skip major compaction for an old region with a single store file. I'll attach a patch, let me know what you guys think. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
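The intended condition amounts to a one-line predicate; a self-contained sketch follows (simplified signatures; in the real code the check lives in the compaction policy and reads comConf.getMinLocalityToForceCompact()):

```java
// Minimal version of the intended check: force a major compaction only
// when the store's current block locality is BELOW the configured minimum.
public class LocalityCheckSketch {
    public static boolean shouldForceMajorCompaction(double blockLocalityIndex,
                                                     double minLocalityToForceCompact) {
        return blockLocalityIndex < minLocalityToForceCompact;
    }
}
```

With the default minimum of 0, this predicate is never true, so the feature stays off unless explicitly configured — the opposite of the always-compact behavior Lars flagged.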
[jira] [Commented] (HBASE-11195) Potentially improve block locality during major compaction for old regions
[ https://issues.apache.org/jira/browse/HBASE-11195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367874#comment-14367874 ] Lars Hofhansl commented on HBASE-11195: --- I'm happy to do that (new ticket), just wanted to confirm that I did not miss anything. I'll do the fix, and then we'll just release 0.98.12 quickly (right [~apurtell] :) ) Potentially improve block locality during major compaction for old regions -- Key: HBASE-11195 URL: https://issues.apache.org/jira/browse/HBASE-11195 Project: HBase Issue Type: Improvement Affects Versions: 1.0.0, 2.0.0, 0.94.26, 0.98.10 Reporter: churro morales Assignee: churro morales Fix For: 1.0.0, 2.0.0, 0.98.10 Attachments: HBASE-11195-0.94.patch, HBASE-11195-0.98.patch, HBASE-11195.patch, HBASE-11195.patch This might be a specific use case. But we have some regions which are no longer written to (due to the key). Those regions have 1 store file and they are very old, they haven't been written to in a while. We still use these regions to read from so locality would be nice. I propose putting a configuration option: something like hbase.hstore.min.locality.to.skip.major.compact [between 0 and 1] such that you can decide whether or not to skip major compaction for an old region with a single store file. I'll attach a patch, let me know what you guys think. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11195) Potentially improve block locality during major compaction for old regions
[ https://issues.apache.org/jira/browse/HBASE-11195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-11195: -- Fix Version/s: (was: 0.94.27) Potentially improve block locality during major compaction for old regions -- Key: HBASE-11195 URL: https://issues.apache.org/jira/browse/HBASE-11195 Project: HBase Issue Type: Improvement Affects Versions: 1.0.0, 2.0.0, 0.94.26, 0.98.10 Reporter: churro morales Assignee: churro morales Fix For: 1.0.0, 2.0.0, 0.98.10 Attachments: HBASE-11195-0.94.patch, HBASE-11195-0.98.patch, HBASE-11195.patch, HBASE-11195.patch This might be a specific use case. But we have some regions which are no longer written to (due to the key). Those regions have 1 store file and they are very old, they haven't been written to in a while. We still use these regions to read from so locality would be nice. I propose putting a configuration option: something like hbase.hstore.min.locality.to.skip.major.compact [between 0 and 1] such that you can decide whether or not to skip major compaction for an old region with a single store file. I'll attach a patch, let me know what you guys think. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11195) Potentially improve block locality during major compaction for old regions
[ https://issues.apache.org/jira/browse/HBASE-11195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367880#comment-14367880 ] Lars Hofhansl commented on HBASE-11195: --- Actually lemme apply this to 0.94 (I hadn't before). Potentially improve block locality during major compaction for old regions -- Key: HBASE-11195 URL: https://issues.apache.org/jira/browse/HBASE-11195 Project: HBase Issue Type: Improvement Affects Versions: 1.0.0, 2.0.0, 0.94.26, 0.98.10 Reporter: churro morales Assignee: churro morales Fix For: 1.0.0, 2.0.0, 0.98.10 Attachments: HBASE-11195-0.94.patch, HBASE-11195-0.98.patch, HBASE-11195.patch, HBASE-11195.patch This might be a specific use case. But we have some regions which are no longer written to (due to the key). Those regions have 1 store file and they are very old, they haven't been written to in a while. We still use these regions to read from so locality would be nice. I propose putting a configuration option: something like hbase.hstore.min.locality.to.skip.major.compact [between 0 and 1] such that you can decide whether or not to skip major compaction for an old region with a single store file. I'll attach a patch, let me know what you guys think. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13273) Make Result.EMPTY_RESULT read-only; currently it can be modified
[ https://issues.apache.org/jira/browse/HBASE-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13273: Fix Version/s: 0.98.12 1.1.0 1.0.1 2.0.0 Make Result.EMPTY_RESULT read-only; currently it can be modified Key: HBASE-13273 URL: https://issues.apache.org/jira/browse/HBASE-13273 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 1.0.0 Reporter: stack Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.12 Again from [~larsgeorge] Result result2 = Result.EMPTY_RESULT; System.out.println(result2); result2.copyFrom(result1); System.out.println(result2); What do you think happens when result1 has cells? Yep, you just modified the shared public EMPTY_RESULT to be not empty anymore. Fix. Result should be non-modifiable post-construction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HBASE-13269) Limit result array preallocation to avoid OOME with large scan caching values
[ https://issues.apache.org/jira/browse/HBASE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367673#comment-14367673 ] Andrew Purtell edited comment on HBASE-13269 at 3/18/15 6:59 PM: - Since both [~stack] and [~lhofhansl] chimed in with regret about not presizing, I will update this to do a max of 100. New patch coming shortly. Please let me know if you like that better. was (Author: apurtell): Since both [~stack] and [~lhofhansl] chimed in with regret about not presizing, I will update this to do a min of 100. New patch coming shortly. Please let me know if you like that better. Limit result array preallocation to avoid OOME with large scan caching values - Key: HBASE-13269 URL: https://issues.apache.org/jira/browse/HBASE-13269 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 1.0.1, 0.98.12 Attachments: HBASE-13269-0.98.patch, HBASE-13269-1.0.patch Scan#setCaching(Integer.MAX_VALUE) will likely terminate the regionserver with an OOME due to preallocation of the result array according to this parameter. We should limit the preallocation to some sane value. Definitely affects 0.98 (fix needed to HRegionServer) and 1.0.x (fix needed to RsRPCServices), not sure about later versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
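The cap discussed above is simple arithmetic; a sketch follows (hypothetical constant name standing in for the bound of 100):

```java
// Sketch of the preallocation cap: never presize the server-side result
// array beyond a sane bound, regardless of the client-requested caching.
public class PreallocSketch {
    static final int MAX_PREALLOCATED_RESULTS = 100; // hypothetical constant name

    public static int initialCapacity(int scanCaching) {
        return Math.min(scanCaching, MAX_PREALLOCATED_RESULTS);
    }
}
```

A client sending Scan#setCaching(Integer.MAX_VALUE) then costs a 100-element array instead of an OOME-sized one; the list still grows normally if more results arrive.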
[jira] [Commented] (HBASE-13271) Table#puts(List&lt;Put&gt;) operation is indeterminate; remove!
[ https://issues.apache.org/jira/browse/HBASE-13271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367678#comment-14367678 ] stack commented on HBASE-13271: --- Related, HTableInterface deprecates getWriteBufferSize and setWriteBufferSize. These methods are in the sub-interface Table, only here they are not deprecated. So, user may be getting wrong message -- especially if flush comes back into Table. Needs clean up in alignment with how we deal with List&lt;Put&gt; Table#puts(List&lt;Put&gt;) operation is indeterminate; remove! - Key: HBASE-13271 URL: https://issues.apache.org/jira/browse/HBASE-13271 Project: HBase Issue Type: Improvement Components: API Affects Versions: 1.0.0 Reporter: stack Another API issue found by [~larsgeorge]: Table.put(List&lt;Put&gt;) is questionable after the API change. {code} [Mar-17 9:21 AM] Lars George: Table.put(List&lt;Put&gt;) is weird since you cannot flush partial lists [Mar-17 9:21 AM] Lars George: Say out of 5 the third is broken, then the put() call returns with a local exception (say empty Put) and then you have 2 that are in the buffer [Mar-17 9:21 AM] Lars George: but how do you force commit them? [Mar-17 9:22 AM] Lars George: In the past you would call flushCache(), but that is gone now [Mar-17 9:22 AM] Lars George: and flush() is not available on a Table [Mar-17 9:22 AM] Lars George: And you cannot access the underlying BufferedMutator either [Mar-17 9:23 AM] Lars George: You can *only* add more Puts if you can, or call close() [Mar-17 9:23 AM] Lars George: that is just weird to explain {code} So, Table needs to get flush back or we deprecate this method or it flushes immediately and does not return until complete in the implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
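The chat log above reduces to a toy model of the problem (hypothetical class; the real Table buffers Put objects via an internal mutator): a failure mid-list leaves earlier entries stranded in the buffer with no public flush to commit them.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal model of the indeterminate put(List) behavior: each element is
// validated as it is buffered, so a bad element mid-list throws while the
// earlier elements stay buffered -- and there is no public flush() to
// commit them.
public class BufferedTableSketch {
    public final List<String> buffer = new ArrayList<>();

    public void put(List<String> puts) {
        for (String p : puts) {
            if (p.isEmpty()) { // stand-in for an "empty Put" client-side error
                throw new IllegalArgumentException("empty put");
            }
            buffer.add(p);
        }
    }
}
```

After put() throws on the third element of a five-element list, the first two sit in the buffer exactly as Lars describes: the caller can only add more elements or close.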
[jira] [Assigned] (HBASE-13274) Fix misplaced deprecation in Delete#addXYZ
[ https://issues.apache.org/jira/browse/HBASE-13274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov reassigned HBASE-13274: --- Assignee: Mikhail Antonov Fix misplaced deprecation in Delete#addXYZ -- Key: HBASE-13274 URL: https://issues.apache.org/jira/browse/HBASE-13274 Project: HBase Issue Type: Bug Components: API Affects Versions: 1.0.0 Reporter: stack Assignee: Mikhail Antonov Found by [~larsgeorge] {code} All deleteXYZ() were deprecated in Delete in favour of the matching addXYZ() (to mirror Put, Get, etc.) - _but_ for deleteFamilyVersion(). What is worse is, the @deprecated for it was added to the addFamilyVersion() replacement! Oh man. * @deprecated Since hbase-1.0.0. Use {@link #addFamilyVersion(byte[], long)}
 */
 @Deprecated
 public Delete addFamilyVersion(final byte [] family, final long timestamp) { The deprecated message is right, but on the wrong method (areyoukiddingme) Well, I presume it was done right, and will steer clear of deleteXYZ() in favor of addXYZ() {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
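The intended arrangement — deprecation on the old method, pointing at the new one — would look like this simplified sketch (hypothetical class standing in for Delete):

```java
// Sketch of the correct placement: the old deleteXYZ() method carries the
// @Deprecated annotation and delegates to the new addXYZ() replacement,
// which itself stays undeprecated.
public class DeleteSketch {
    public int familyVersionCalls = 0;

    /** @deprecated Since hbase-1.0.0. Use {@link #addFamilyVersion(byte[], long)} instead. */
    @Deprecated
    public DeleteSketch deleteFamilyVersion(byte[] family, long timestamp) {
        return addFamilyVersion(family, timestamp);
    }

    public DeleteSketch addFamilyVersion(byte[] family, long timestamp) {
        familyVersionCalls++; // both entry points funnel to the same logic
        return this;
    }
}
```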
[jira] [Updated] (HBASE-13274) Fix misplaced deprecation in Delete#addXYZ
[ https://issues.apache.org/jira/browse/HBASE-13274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13274: Attachment: HBASE-13274.patch patch for master Fix misplaced deprecation in Delete#addXYZ -- Key: HBASE-13274 URL: https://issues.apache.org/jira/browse/HBASE-13274 Project: HBase Issue Type: Bug Components: API Affects Versions: 1.0.0 Reporter: stack Assignee: Mikhail Antonov Attachments: HBASE-13274.patch Found by [~larsgeorge] {code} All deleteXYZ() were deprecated in Delete in favour of the matching addXYZ() (to mirror Put, Get, etc.) - _but_ for deleteFamilyVersion(). What is worse is, the @deprecated for it was added to the addFamilyVersion() replacement! Oh man. * @deprecated Since hbase-1.0.0. Use {@link #addFamilyVersion(byte[], long)}
 */
 @Deprecated
 public Delete addFamilyVersion(final byte [] family, final long timestamp) { The deprecated message is right, but on the wrong method (areyoukiddingme) Well, I presume it was done right, and will steer clear of deleteXYZ() in favor of addXYZ() {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13269) Limit result array preallocation to avoid OOME with large scan caching values
[ https://issues.apache.org/jira/browse/HBASE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367715#comment-14367715 ] Hadoop QA commented on HBASE-13269: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705413/HBASE-13269-1.0.patch against master branch at commit f9a17edc252a88c5a1a2c7764e3f9f65623e0ced. ATTACHMENT ID: 12705413 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13296//console This message is automatically generated. Limit result array preallocation to avoid OOME with large scan caching values - Key: HBASE-13269 URL: https://issues.apache.org/jira/browse/HBASE-13269 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 1.0.1, 0.98.12 Attachments: HBASE-13269-0.98.patch, HBASE-13269-0.98.patch, HBASE-13269-1.0.patch, HBASE-13269-1.0.patch Scan#setCaching(Integer.MAX_VALUE) will likely terminate the regionserver with an OOME due to preallocation of the result array according to this parameter. We should limit the preallocation to some sane value. Definitely affects 0.98 (fix needed to HRegionServer) and 1.0.x (fix needed to RsRPCServices), not sure about later versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13274) Fix misplaced deprecation in Delete#addXYZ
[ https://issues.apache.org/jira/browse/HBASE-13274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367694#comment-14367694 ] stack commented on HBASE-13274: --- [~mantonov] Yes sir, as per usual. Thanks. Fix misplaced deprecation in Delete#addXYZ -- Key: HBASE-13274 URL: https://issues.apache.org/jira/browse/HBASE-13274 Project: HBase Issue Type: Bug Components: API Affects Versions: 1.0.0 Reporter: stack Found by [~larsgeorge] {code} All deleteXYZ() were deprecated in Delete in favour of the matching addXYZ() (to mirror Put, Get, etc.) - _but_ for deleteFamilyVersion(). What is worse is, the @deprecated for it was added to the addFamilyVersion() replacement! Oh man. * @deprecated Since hbase-1.0.0. Use {@link #addFamilyVersion(byte[], long)}
 */
 @Deprecated
 public Delete addFamilyVersion(final byte [] family, final long timestamp) { The deprecated message is right, but on the wrong method (areyoukiddingme) Well, I presume it was done right, and will steer clear of deleteXYZ() in favor of addXYZ() {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
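The intended layout — the @Deprecated annotation on the old deleteFamilyVersion(), delegating to the addFamilyVersion() replacement — looks like this minimal sketch (a stand-in class, not the real org.apache.hadoop.hbase.client.Delete):

```java
// Stand-in for the real Delete class, showing where the deprecation
// belongs: on the old deleteFamilyVersion(), not on its replacement.
public class DeleteSketch {
    /** @deprecated Since hbase-1.0.0. Use {@link #addFamilyVersion(byte[], long)} */
    @Deprecated
    public DeleteSketch deleteFamilyVersion(byte[] family, long timestamp) {
        return addFamilyVersion(family, timestamp);
    }

    public DeleteSketch addFamilyVersion(byte[] family, long timestamp) {
        // The real implementation records a family-version delete marker here.
        return this;
    }
}
```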
[jira] [Commented] (HBASE-13071) Hbase Streaming Scan Feature
[ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366767#comment-14366767 ] Eshcar Hillel commented on HBASE-13071: --- Yes it's all about setting the delays, but I don't want to change them to make the results look better. They are there just to make the point. From: Edward Bortnikov (JIRA) j...@apache.org To: esh...@yahoo-inc.com Sent: Monday, March 16, 2015 7:52 AM Subject: [jira] [Commented] (HBASE-13071) Hbase Streaming Scan Feature [ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14362777#comment-14362777 ] Edward Bortnikov commented on HBASE-13071: -- Eshcar, Do you have an idea why there are still steps in the async graph? This probably means that our delays are not long enough. Eddie On Monday, March 16, 2015 1:14 AM, Eshcar Hillel (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-13071: -- Attachment: HBASE-13071_trunk_10.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) Hbase Streaming Scan Feature Key: HBASE-13071 URL: https://issues.apache.org/jira/browse/HBASE-13071 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: 99.eshcar.png, HBASE-13071_98_1.patch, HBASE-13071_trunk_1.patch, HBASE-13071_trunk_10.patch, HBASE-13071_trunk_2.patch, HBASE-13071_trunk_3.patch, HBASE-13071_trunk_4.patch, HBASE-13071_trunk_5.patch, HBASE-13071_trunk_6.patch, HBASE-13071_trunk_7.patch, HBASE-13071_trunk_8.patch, HBASE-13071_trunk_9.patch, HBaseStreamingScanDesign.pdf, HbaseStreamingScanEvaluation.pdf, HbaseStreamingScanEvaluationwithMultipleClients.pdf, gc.eshcar.png, hits.eshcar.png, network.png A scan operation iterates over all rows of a table or a subrange of the
table. The synchronous nature in which the data is served at the client side hinders the speed at which the application traverses the data: it increases the overall processing time, and may cause a great variance in the times the application waits for the next piece of data. The scanner next() method at the client side invokes an RPC to the regionserver and then stores the results in a cache. The application can specify how many rows will be transmitted per RPC; by default this is set to 100 rows. The cache can be considered as a producer-consumer queue, where the HBase client pushes the data to the queue and the application consumes it. Currently this queue is synchronous, i.e., blocking. More specifically, when the application has consumed all the data from the cache --- so the cache is empty --- the HBase client retrieves additional data from the server and re-fills the cache with new data. During this time the application is blocked. Under the assumption that the application processing time can be balanced by the time it takes to retrieve the data, an asynchronous approach can reduce the time the application is waiting for data. We attach a design document. We also have a patch that is based on a private branch, and some evaluation results of this code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
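The producer-consumer cache described above can be sketched with a JDK blocking queue. Names, the integer "rows", and the sentinel convention are illustrative, not the HBASE-13071 implementation:

```java
import java.util.Iterator;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy model of the async scanner cache: a background "client" thread
// refills the bounded queue while the application consumes from it,
// instead of blocking the application on every refill.
public class PrefetchingScanner {
    private static final Integer END = Integer.MIN_VALUE; // sentinel; assumes data never uses it
    private final BlockingQueue<Integer> cache;

    PrefetchingScanner(Iterator<Integer> source, int capacity) {
        cache = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                while (source.hasNext()) {
                    cache.put(source.next()); // blocks when the cache is full
                }
                cache.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.setDaemon(true);
        producer.start();
    }

    /** Returns the next row, or null at end of scan. */
    Integer next() {
        try {
            Integer v = cache.take(); // blocks only if the producer is behind
            return v.equals(END) ? null : v;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

With processing time roughly matching retrieval time, the consumer rarely finds the queue empty, which is the latency win the design document argues for.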
[jira] [Commented] (HBASE-13271) Table#puts(List<Put>) operation is indeterminate; remove!
[ https://issues.apache.org/jira/browse/HBASE-13271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368052#comment-14368052 ] Nicolas Liochon commented on HBASE-13271: - Oh ok. Thanks for the explanation. Then the call to batch seems to be the perfect solution. Table#puts(List<Put>) operation is indeterminate; remove! - Key: HBASE-13271 URL: https://issues.apache.org/jira/browse/HBASE-13271 Project: HBase Issue Type: Improvement Components: API Affects Versions: 1.0.0 Reporter: stack Another API issue found by [~larsgeorge]: Table.put(List<Put>) is questionable after the API change. {code} [Mar-17 9:21 AM] Lars George: Table.put(List<Put>) is weird since you cannot flush partial lists [Mar-17 9:21 AM] Lars George: Say out of 5 the third is broken, then the put() call returns with a local exception (say empty Put) and then you have 2 that are in the buffer [Mar-17 9:21 AM] Lars George: but how do you force commit them? [Mar-17 9:22 AM] Lars George: In the past you would call flushCache(), but that is gone now [Mar-17 9:22 AM] Lars George: and flush() is not available on a Table [Mar-17 9:22 AM] Lars George: And you cannot access the underlying BufferedMutator either [Mar-17 9:23 AM] Lars George: You can *only* add more Puts if you can, or call close() [Mar-17 9:23 AM] Lars George: that is just weird to explain {code} So, Table needs to get flush back or we deprecate this method or it flushes immediately and does not return until complete in the implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
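The buffering hazard Lars George describes can be modeled with a toy buffered writer (illustrative names, not the HBase API): a validation failure partway through the list leaves the earlier edits stranded with no way to force them out unless an explicit flush() exists.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (not the HBase API) of the hazard: put(list) validates and
// buffers one item at a time, so a failure on the 3rd item leaves the
// first 2 buffered; without an explicit flush() they are unreachable.
public class BufferedPuts {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> committed = new ArrayList<>();

    void put(List<String> puts) {
        for (String p : puts) {
            if (p.isEmpty()) {
                throw new IllegalArgumentException("empty Put"); // local validation failure
            }
            buffer.add(p);
        }
    }

    // The operation the old API exposed (flushCommits) and Table dropped.
    void flush() {
        committed.addAll(buffer);
        buffer.clear();
    }

    int buffered() { return buffer.size(); }
    int committedCount() { return committed.size(); }
}
```

Nicolas's comment points at batch() as the resolution: a call that submits everything it is given and reports per-operation outcomes, so nothing lingers in a hidden buffer.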
[jira] [Updated] (HBASE-12751) Allow RowLock to be reader writer
[ https://issues.apache.org/jira/browse/HBASE-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HBASE-12751: -- Attachment: HBASE-12751.patch Workflow failure, sorry -- clicked submit patch before adding attachment. Allow RowLock to be reader writer - Key: HBASE-12751 URL: https://issues.apache.org/jira/browse/HBASE-12751 Project: HBase Issue Type: Bug Components: regionserver Reporter: Elliott Clark Assignee: Nate Edel Attachments: HBASE-12751.patch Right now every write operation grabs a row lock. This is to prevent values from changing during a read modify write operation (increment or check and put). However it limits parallelism in several different scenarios. If there are several puts to the same row but different columns or stores then this is very limiting. If there are puts to the same column then mvcc number should ensure a consistent ordering. So locking is not needed. However locking for check and put or increment is still needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12751) Allow RowLock to be reader writer
[ https://issues.apache.org/jira/browse/HBASE-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HBASE-12751: -- Status: Patch Available (was: Open) Allow RowLock to be reader writer - Key: HBASE-12751 URL: https://issues.apache.org/jira/browse/HBASE-12751 Project: HBase Issue Type: Bug Components: regionserver Reporter: Elliott Clark Assignee: Nate Edel Attachments: HBASE-12751.patch Right now every write operation grabs a row lock. This is to prevent values from changing during a read modify write operation (increment or check and put). However it limits parallelism in several different scenarios. If there are several puts to the same row but different columns or stores then this is very limiting. If there are puts to the same column then mvcc number should ensure a consistent ordering. So locking is not needed. However locking for check and put or increment is still needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
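The direction proposed above — plain puts share the row lock while read-modify-write operations take it exclusively — can be sketched with a JDK ReentrantReadWriteLock; the per-row lock table and method names are illustrative, not the patch:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the proposal (illustrative, not the HBASE-12751 patch):
// independent puts to a row take the lock in shared mode, so they can
// run in parallel with MVCC providing ordering; increment and
// checkAndPut still take it exclusively.
public class RowLockTable {
    private final ConcurrentHashMap<String, ReadWriteLock> locks = new ConcurrentHashMap<>();

    private ReadWriteLock lockFor(String row) {
        return locks.computeIfAbsent(row, r -> new ReentrantReadWriteLock());
    }

    // Plain put: shared mode, multiple writers to the same row proceed together.
    boolean tryShared(String row) { return lockFor(row).readLock().tryLock(); }
    void releaseShared(String row) { lockFor(row).readLock().unlock(); }

    // Increment / checkAndPut: exclusive mode, still serialized per row.
    boolean tryExclusive(String row) { return lockFor(row).writeLock().tryLock(); }
    void releaseExclusive(String row) { lockFor(row).writeLock().unlock(); }
}
```

Note that ReentrantReadWriteLock does not permit upgrading a shared hold to an exclusive one, which matches the issue's split: an operation must decide up front which mode it needs.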
[jira] [Commented] (HBASE-13272) Get.setClosestRowBefore() breaks specific column Get
[ https://issues.apache.org/jira/browse/HBASE-13272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368077#comment-14368077 ] Nick Dimiduk commented on HBASE-13272: -- Sounds like it's best to deprecate/remove. Get.setClosestRowBefore() breaks specific column Get Key: HBASE-13272 URL: https://issues.apache.org/jira/browse/HBASE-13272 Project: HBase Issue Type: Bug Reporter: stack Priority: Trivial Via [~larsgeorge] Get.setClosestRowBefore() is breaking a specific Get that specifies a column. If you set setClosestRowBefore() to true it will return the _entire_ row! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13276) Fix incorrect condition for minimum block locality in 0.98
[ https://issues.apache.org/jira/browse/HBASE-13276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368102#comment-14368102 ] Andrew Purtell commented on HBASE-13276: +1 Fix incorrect condition for minimum block locality in 0.98 -- Key: HBASE-13276 URL: https://issues.apache.org/jira/browse/HBASE-13276 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.98.12 Attachments: HBASE-11195-0.98.v1.patch 0.98 only. Parent somehow was incorrect. One-liner to fix it. But it's critical as we perform potentially _way_ more compactions now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11195) Potentially improve block locality during major compaction for old regions
[ https://issues.apache.org/jira/browse/HBASE-11195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368123#comment-14368123 ] Hudson commented on HBASE-11195: FAILURE: Integrated in HBase-0.94-JDK7 #234 (See [https://builds.apache.org/job/HBase-0.94-JDK7/234/]) HBASE-11195 Addendum for TestHeapSize. (larsh: rev 260f2137bdb8b4ae839f5cc285509f34e31a006b) * src/main/java/org/apache/hadoop/hbase/regionserver/Store.java Potentially improve block locality during major compaction for old regions -- Key: HBASE-11195 URL: https://issues.apache.org/jira/browse/HBASE-11195 Project: HBase Issue Type: Improvement Affects Versions: 1.0.0, 2.0.0, 0.94.26, 0.98.10 Reporter: churro morales Assignee: churro morales Fix For: 1.0.0, 2.0.0, 0.98.10, 0.94.27 Attachments: HBASE-11195-0.94.patch, HBASE-11195-0.98.patch, HBASE-11195-0.98.v1.patch, HBASE-11195.patch, HBASE-11195.patch This might be a specific use case. But we have some regions which are no longer written to (due to the key). Those regions have 1 store file and they are very old, they haven't been written to in a while. We still use these regions to read from so locality would be nice. I propose putting a configuration option: something like hbase.hstore.min.locality.to.skip.major.compact [between 0 and 1] such that you can decide whether or not to skip major compaction for an old region with a single store file. I'll attach a patch, let me know what you guys think. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
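The check proposed in the description — skip major compaction for an old region with a single store file whose block locality already meets the configured minimum — can be sketched as a pure function. Only the config key hbase.hstore.min.locality.to.skip.major.compact comes from the issue; the function shape is an assumption:

```java
// Illustrative sketch of the HBASE-11195 condition (not the patch itself).
// minLocality is the value of hbase.hstore.min.locality.to.skip.major.compact,
// expected to be between 0 and 1.
public class CompactionPolicySketch {
    static boolean skipMajorCompaction(int storeFileCount, double blockLocality, double minLocality) {
        // A lone, already-local store file gains nothing from rewriting.
        return storeFileCount == 1 && blockLocality >= minLocality;
    }
}
```

HBASE-13276 above is exactly about this condition having been inverted by a typo in the 0.98 backport, which is why a one-liner there is marked Critical.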
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368154#comment-14368154 ] zhangduo commented on HBASE-13259: -- I mean could we test it with a size much larger than available memory? i.e., 100G RAM, 500G bucket cache on SSD? If we only test it with a size smaller than available memory, then I think we need to beat the offheap engine, not the file engine (it is good if you can beat both of them :)) mmap() based BucketCache IOEngine - Key: HBASE-13259 URL: https://issues.apache.org/jira/browse/HBASE-13259 Project: HBase Issue Type: New Feature Components: BlockCache Affects Versions: 0.98.10 Reporter: Zee Chen Fix For: 2.2.0 Attachments: HBASE-13259-v2.patch, HBASE-13259.patch, ioread-1.svg, mmap-0.98-v1.patch, mmap-1.svg, mmap-trunk-v1.patch Of the existing BucketCache IOEngines, FileIOEngine uses pread() to copy data from kernel space to user space. This is a good choice when the total working set size is much bigger than the available RAM and the latency is dominated by IO access. However, when the entire working set is small enough to fit in the RAM, using mmap() (and subsequent memcpy()) to move data from kernel space to user space is faster. I have run some short keyval get tests and the results indicate a reduction of 2%-7% of kernel CPU on my system, depending on the load. On the gets, the latency histograms from mmap() are identical to those from pread(), but peak throughput is close to 40% higher. This patch modifies ByteBufferArray to allow it to specify a backing file. Example for using this feature: set hbase.bucketcache.ioengine to mmap:/dev/shm/bucketcache.0 in hbase-site.xml. Attached perf-measured CPU usage breakdown in flame graphs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
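Spelled out as an hbase-site.xml fragment, the example configuration from the issue description is:

```xml
<!-- hbase-site.xml: back the BucketCache with an mmap'ed file.
     Path taken from the issue description (a tmpfs mount); any
     file on tmpfs or SSD would follow the same mmap:<path> form. -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>mmap:/dev/shm/bucketcache.0</value>
</property>
```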
[jira] [Issue Comment Deleted] (HBASE-13262) ResultScanner doesn't return all rows in Scan
[ https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13262: --- Comment: was deleted (was: (ugh sorry for spam/early-post)) ResultScanner doesn't return all rows in Scan - Key: HBASE-13262 URL: https://issues.apache.org/jira/browse/HBASE-13262 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0, 1.1.0 Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT Reporter: Josh Elser Assignee: Josh Elser Priority: Blocker Fix For: 2.0.0, 1.1.0 Attachments: testrun_0.98.txt, testrun_branch1.0.txt Tried to write a simple Java client against 1.1.0-SNAPSHOT. * Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), for a total of 10M cells written * Read back the data from the table, ensure I saw 10M cells Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of the actual rows. Running against 1.0.0, returns all 10M records as expected. [Code I was running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java] for the curious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HBASE-13262) ResultScanner doesn't return all rows in Scan
[ https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13262: --- Comment: was deleted (was: Ok, been a while since I posted some progress, here's my current understanding of things and hopefully an easier to grok statement of the problem: When clients request a batch of rows which is larger than the server is configured to return) ResultScanner doesn't return all rows in Scan - Key: HBASE-13262 URL: https://issues.apache.org/jira/browse/HBASE-13262 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0, 1.1.0 Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT Reporter: Josh Elser Assignee: Josh Elser Priority: Blocker Fix For: 2.0.0, 1.1.0 Attachments: testrun_0.98.txt, testrun_branch1.0.txt Tried to write a simple Java client against 1.1.0-SNAPSHOT. * Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), for a total of 10M cells written * Read back the data from the table, ensure I saw 10M cells Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of the actual rows. Running against 1.0.0, returns all 10M records as expected. [Code I was running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java] for the curious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13262) ResultScanner doesn't return all rows in Scan
[ https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368225#comment-14368225 ] Andrew Purtell commented on HBASE-13262: No problem, I will delete these now. Post again when ready. ResultScanner doesn't return all rows in Scan - Key: HBASE-13262 URL: https://issues.apache.org/jira/browse/HBASE-13262 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0, 1.1.0 Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT Reporter: Josh Elser Assignee: Josh Elser Priority: Blocker Fix For: 2.0.0, 1.1.0 Attachments: testrun_0.98.txt, testrun_branch1.0.txt Tried to write a simple Java client against 1.1.0-SNAPSHOT. * Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), for a total of 10M cells written * Read back the data from the table, ensure I saw 10M cells Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of the actual rows. Running against 1.0.0, returns all 10M records as expected. [Code I was running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java] for the curious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13276) Fix incorrect condition for minimum block locality in 0.98
[ https://issues.apache.org/jira/browse/HBASE-13276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368082#comment-14368082 ] Andrew Purtell commented on HBASE-13276: Like [~churromorales] says on parent, I don't see how this happened either. Looks like a very unfortunate and hard to spot typo. I can call a RC for 0.98.12 on Monday March 23 so we can have a release at the end of the month, which is roughly on target with the 0.98 release cadence. Is that soon enough? Fix incorrect condition for minimum block locality in 0.98 -- Key: HBASE-13276 URL: https://issues.apache.org/jira/browse/HBASE-13276 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.98.12 Attachments: HBASE-11195-0.98.v1.patch 0.98 only. Parent somehow was incorrect. One-liner to fix it. But it's critical as we perform potentially _way_ more compactions now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12972) Region, a supportable public/evolving subset of HRegion
[ https://issues.apache.org/jira/browse/HBASE-12972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368128#comment-14368128 ] Hadoop QA commented on HBASE-12972: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705439/HBASE-12972.patch against master branch at commit f9a17edc252a88c5a1a2c7764e3f9f65623e0ced. ATTACHMENT ID: 12705439 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 355 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 7 warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1919 checkstyle errors (more than the master's current 1917 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. 
The patch introduces the following lines longer than 100: +LOG.trace("High priority because region=" + region.getRegionInfo().getRegionNameAsString()); +.abort("Exception during region " + getRegionInfo().getRegionNameAsString() + " initialization."); +LOG.info(getName() + " requesting flush for region " + r.getRegionInfo().getRegionNameAsString() + +Region region = TEST_UTIL.getRSForFirstRegionInTable(tableName).getFromOnlineRegions(regionName); +testRegionWithFamilies(family1).bulkLoadHFiles(new ArrayList<Pair<byte[], String>>(), false, null); + private Region initHRegion(HTableDescriptor htd, byte[] startKey, byte[] stopKey, int replicaId) throws IOException { + private void putData(Region region, int startRow, int numRows, byte[] qf, byte[]... families) throws IOException { {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.TestIOFencing org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint {color:red}-1 core zombie tests{color}.
There are 1 zombie test(s): at org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint.testRegionReplicaReplication(TestRegionReplicaReplicationEndpoint.java:195) at org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint.testRegionReplicaReplicationWith3Replicas(TestRegionReplicaReplicationEndpoint.java:255) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13299//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13299//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13299//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13299//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13299//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13299//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13299//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13299//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13299//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13299//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13299//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13299//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13299//artifact/patchprocess/checkstyle-aggregate.html
[jira] [Commented] (HBASE-13257) Show coverage report on jenkins
[ https://issues.apache.org/jira/browse/HBASE-13257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368164#comment-14368164 ] zhangduo commented on HBASE-13257: -- I ran it several times, seems not much worse than original TRUNK build, the failed test also fails on TRUNK sometimes. https://builds.apache.org/job/HBase-TRUNK-jacoco/ So what do you guys suggest now? [~stack], [~busbey] Thanks. Show coverage report on jenkins --- Key: HBASE-13257 URL: https://issues.apache.org/jira/browse/HBASE-13257 Project: HBase Issue Type: Task Reporter: zhangduo Assignee: zhangduo Priority: Minor Think of showing jacoco coverage report on https://builds.apache.org . And there is an advantage of showing it on jenkins that the jenkins jacoco plugin can handle cross module coverage. Can not do it locally since https://github.com/jacoco/jacoco/pull/97 is still pending. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13277) add mob_threshold option to load test tool
[ https://issues.apache.org/jira/browse/HBASE-13277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368175#comment-14368175 ] Hadoop QA commented on HBASE-13277: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705444/HBASE-13277.hbase-11339.patch against hbase-11339 branch at commit f9a17edc252a88c5a1a2c7764e3f9f65623e0ced. ATTACHMENT ID: 12705444 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 4 warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestDistributedLogSplitting {color:red}-1 core zombie tests{color}. 
There are 2 zombie test(s): at org.apache.hadoop.hbase.coprocessor.TestMasterObserver.testRegionTransitionOperations(TestMasterObserver.java:1604) at org.apache.hadoop.hbase.TestAcidGuarantees.testScanAtomicity(TestAcidGuarantees.java:376) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13300//console This message is automatically generated. add mob_threshold option to load test tool -- Key: HBASE-13277 URL: https://issues.apache.org/jira/browse/HBASE-13277 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13277.hbase-11339.patch This adds '-mob_threshold value' option to the load test tool to simplify mob load testing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13188) java.lang.ArithmeticException issue in BoundedByteBufferPool.putBuffer
[ https://issues.apache.org/jira/browse/HBASE-13188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13188: --- Fix Version/s: (was: 0.98.12) 0.98.13 java.lang.ArithmeticException issue in BoundedByteBufferPool.putBuffer -- Key: HBASE-13188 URL: https://issues.apache.org/jira/browse/HBASE-13188 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: HBASE-13188.patch Running a range scan with PE tool with 25 threads getting this error {code} java.lang.ArithmeticException: / by zero at org.apache.hadoop.hbase.io.BoundedByteBufferPool.putBuffer(BoundedByteBufferPool.java:104) at org.apache.hadoop.hbase.ipc.RpcServer$Call.done(RpcServer.java:325) at org.apache.hadoop.hbase.ipc.RpcServer$Responder.processResponse(RpcServer.java:1078) at org.apache.hadoop.hbase.ipc.RpcServer$Responder.processAllResponses(RpcServer.java:1103) at org.apache.hadoop.hbase.ipc.RpcServer$Responder.doAsyncWrite(RpcServer.java:1036) at org.apache.hadoop.hbase.ipc.RpcServer$Responder.doRunLoop(RpcServer.java:956) at org.apache.hadoop.hbase.ipc.RpcServer$Responder.run(RpcServer.java:891) {code} I checked in the trunk code also. I think the comment in the code suggests that the size will not be exact so there is a chance that it could be even 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
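Per the reporter's reading, the size the pool tracks is admittedly "not exact" and can reach zero, so the division in putBuffer() needs a guard. A hedged sketch of that guard (field names are assumptions, not the actual BoundedByteBufferPool internals):

```java
// Illustrative sketch, not the HBASE-13188 patch: guard the running-average
// divisor so a zero count/size cannot raise ArithmeticException the way
// BoundedByteBufferPool.putBuffer() does in the reported stack trace.
public class RunningAverageSketch {
    private long totalSize;
    private int count;

    void record(int size) {
        totalSize += size;
        count++;
    }

    int averageSize() {
        // Without the ternary guard this is the "/ by zero" in the report.
        return count == 0 ? 0 : (int) (totalSize / count);
    }
}
```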
[jira] [Updated] (HBASE-13268) Backport the HBASE-7781 security test updates to use the MiniKDC
[ https://issues.apache.org/jira/browse/HBASE-13268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13268: --- Fix Version/s: (was: 0.98.12) 0.98.13 Backport the HBASE-7781 security test updates to use the MiniKDC Key: HBASE-13268 URL: https://issues.apache.org/jira/browse/HBASE-13268 Project: HBase Issue Type: Task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.98.13 Consider backport of the security test updates to use the MiniKDC that are subtasks of HBASE-7781. Would be good to improve test coverage of security code in the 0.98 branch, as long as none of the following hold: - The changes are a PITA to backport - The changes break a compatibility requirement - The changes introduce test instability Investigate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13273) Make Result.EMPTY_RESULT read-only; currently it can be modified
[ https://issues.apache.org/jira/browse/HBASE-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368178#comment-14368178 ] Andrew Purtell commented on HBASE-13273: I'm rolling the 0.98.12 RC tonight. Did you want to get this in ahead of that? Or I can move it out. Make Result.EMPTY_RESULT read-only; currently it can be modified Key: HBASE-13273 URL: https://issues.apache.org/jira/browse/HBASE-13273 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 1.0.0 Reporter: stack Assignee: Mikhail Antonov Labels: beginner Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.12 Attachments: HBASE-13273.patch, HBASE-13273.patch Again from [~larsgeorge] Result result2 = Result.EMPTY_RESULT; System.out.println(result2); result2.copyFrom(result1); System.out.println(result2); What do you think happens when result1 has cells? Yep, you just modified the shared public EMPTY_RESULT to be not empty anymore. Fix. Result should be non-modifiable post-construction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
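The shape of the fix the issue asks for — a shared empty instance that cannot be mutated after construction — can be shown with a stand-in class (not the real org.apache.hadoop.hbase.client.Result):

```java
// Stand-in sketch of the HBASE-13273 fix shape: the shared EMPTY_RESULT
// marks itself read-only and rejects copyFrom(), so Lars George's example
// can no longer silently make the shared constant non-empty.
public class ResultSketch {
    public static final ResultSketch EMPTY_RESULT = new ResultSketch(true);

    private final boolean readOnly;
    private Object[] cells = new Object[0];

    ResultSketch(boolean readOnly) {
        this.readOnly = readOnly;
    }

    public void copyFrom(ResultSketch other) {
        if (readOnly) {
            throw new UnsupportedOperationException("EMPTY_RESULT is read-only");
        }
        this.cells = other.cells;
    }

    public int size() {
        return cells.length;
    }
}
```

Failing fast on mutation beats the silent corruption in the report: the shared constant stays empty and the caller gets an immediate, diagnosable exception.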
[jira] [Updated] (HBASE-13267) Deprecate or remove isFileDeletable from SnapshotHFileCleaner
[ https://issues.apache.org/jira/browse/HBASE-13267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13267: --- Fix Version/s: (was: 0.98.12) 0.98.13 Deprecate or remove isFileDeletable from SnapshotHFileCleaner - Key: HBASE-13267 URL: https://issues.apache.org/jira/browse/HBASE-13267 Project: HBase Issue Type: Task Reporter: Andrew Purtell Priority: Minor Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 The isFileDeletable method in SnapshotHFileCleaner became vestigial after HBASE-12627; let's remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13221) HDFS Transparent Encryption breaks WAL writing
[ https://issues.apache.org/jira/browse/HBASE-13221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13221: --- Fix Version/s: (was: 0.98.12) 0.98.13 HDFS Transparent Encryption breaks WAL writing -- Key: HBASE-13221 URL: https://issues.apache.org/jira/browse/HBASE-13221 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.98.0, 1.0.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 We need to detect when HDFS Transparent Encryption (Hadoop 2.6.0+) is enabled and fall back to more synchronization in the WAL to prevent catastrophic failure under load. See HADOOP-11708 for more details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12273) Generate .tabledesc file during upgrading if missing
[ https://issues.apache.org/jira/browse/HBASE-12273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12273: --- Fix Version/s: (was: 0.98.12) 0.98.13 Status: Open (was: Patch Available) Generate .tabledesc file during upgrading if missing Key: HBASE-12273 URL: https://issues.apache.org/jira/browse/HBASE-12273 Project: HBase Issue Type: Sub-task Components: Admin Affects Versions: 0.98.7, 1.0.0 Reporter: Yi Deng Assignee: Yi Deng Labels: upgrade Fix For: 1.1.0, 0.98.13 Attachments: 1.0-0001-HBASE-12273-Add-a-tool-for-fixing-missing-TableDescr.patch, 1.0-0001-INTERNAL-Add-a-tool-for-fixing-missing-TableDescript.patch Generate .tabledesc file during upgrading if missing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11819) Unit test for CoprocessorHConnection
[ https://issues.apache.org/jira/browse/HBASE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11819: --- Fix Version/s: (was: 0.98.12) 0.98.13 Unit test for CoprocessorHConnection - Key: HBASE-11819 URL: https://issues.apache.org/jira/browse/HBASE-11819 Project: HBase Issue Type: Test Reporter: Andrew Purtell Assignee: Talat UYARER Priority: Minor Labels: newbie++ Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: HBASE-11819v4-master.patch, HBASE-11819v5-master (1).patch, HBASE-11819v5-master.patch, HBASE-11819v5-master.patch, HBASE-11819v5-v0.98.patch, HBASE-11819v5-v1.0.patch Add a unit test to hbase-server that exercises CoprocessorHConnection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11290) Unlock RegionStates
[ https://issues.apache.org/jira/browse/HBASE-11290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11290: --- Fix Version/s: (was: 0.98.12) 0.98.13 Unlock RegionStates --- Key: HBASE-11290 URL: https://issues.apache.org/jira/browse/HBASE-11290 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Virag Kothari Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: HBASE-11290-0.98.patch, HBASE-11290-0.98_v2.patch, HBASE-11290.draft.patch RegionStates is a highly accessed data structure in HMaster, yet most of its methods are synchronized, which limits concurrency. Even simply making some of the getters non-synchronized by using concurrent data structures has helped with region assignments. We can go with an approach as simple as that, or create locks per region, or a bucket lock per region bucket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12148) Remove TimeRangeTracker as point of contention when many threads writing a Store
[ https://issues.apache.org/jira/browse/HBASE-12148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12148: --- Fix Version/s: (was: 0.98.12) 0.98.13 Remove TimeRangeTracker as point of contention when many threads writing a Store Key: HBASE-12148 URL: https://issues.apache.org/jira/browse/HBASE-12148 Project: HBase Issue Type: Sub-task Components: Performance Affects Versions: 2.0.0, 0.99.1 Reporter: stack Assignee: stack Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: 0001-In-AtomicUtils-change-updateMin-and-updateMax-to-ret.patch, 12148.addendum.txt, 12148.txt, 12148.txt, 12148v2.txt, 12148v2.txt, Screen Shot 2014-10-01 at 3.39.46 PM.png, Screen Shot 2014-10-01 at 3.41.07 PM.png -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13262) ResultScanner doesn't return all rows in Scan
[ https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368219#comment-14368219 ] Josh Elser commented on HBASE-13262: Ok, it's been a while since I posted some progress; here's my current understanding of things and hopefully an easier-to-grok statement of the problem: When clients request a batch of rows which is larger than the server is configured to return ResultScanner doesn't return all rows in Scan - Key: HBASE-13262 URL: https://issues.apache.org/jira/browse/HBASE-13262 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0, 1.1.0 Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT Reporter: Josh Elser Assignee: Josh Elser Priority: Blocker Fix For: 2.0.0, 1.1.0 Attachments: testrun_0.98.txt, testrun_branch1.0.txt Tried to write a simple Java client against 1.1.0-SNAPSHOT. * Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), for a total of 10M cells written * Read back the data from the table, ensure I saw 10M cells Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of the actual rows. Running against 1.0.0, returns all 10M records as expected. [Code I was running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java] for the curious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HBASE-13262) ResultScanner doesn't return all rows in Scan
[ https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13262: --- Comment: was deleted (was: No problem, I will delete these now. Post again when ready) ResultScanner doesn't return all rows in Scan - Key: HBASE-13262 URL: https://issues.apache.org/jira/browse/HBASE-13262 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0, 1.1.0 Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT Reporter: Josh Elser Assignee: Josh Elser Priority: Blocker Fix For: 2.0.0, 1.1.0 Attachments: testrun_0.98.txt, testrun_branch1.0.txt Tried to write a simple Java client against 1.1.0-SNAPSHOT. * Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), for a total of 10M cells written * Read back the data from the table, ensure I saw 10M cells Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of the actual rows. Running against 1.0.0, returns all 10M records as expected. [Code I was running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java] for the curious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13096) NPE from SecureWALCellCodec$EncryptedKvEncoder#write when using WAL encryption and Phoenix secondary indexes
[ https://issues.apache.org/jira/browse/HBASE-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13096: --- Fix Version/s: 0.98.13 Assignee: Andrew Purtell On the board for .13 NPE from SecureWALCellCodec$EncryptedKvEncoder#write when using WAL encryption and Phoenix secondary indexes Key: HBASE-13096 URL: https://issues.apache.org/jira/browse/HBASE-13096 Project: HBase Issue Type: Bug Affects Versions: 0.98.6 Reporter: Andrew Purtell Assignee: Andrew Purtell Labels: phoenix Fix For: 0.98.13 On user@phoenix Dhavi Rami reported: {quote} I tried using Phoenix in HBase with Transparent Encryption of Data At Rest enabled (AES encryption). This works fine for a table with a primary key column, but it doesn't work if I create a secondary index on that table. I tried to dig deep into the problem and found that WAL file encryption throws an exception when I have a Global Secondary Index created on my mutable table. Following is the error I was getting on one of the region servers.
{noformat}
2015-02-20 10:44:48,768 ERROR org.apache.hadoop.hbase.regionserver.wal.FSHLog: UNEXPECTED
java.lang.NullPointerException
 at org.apache.hadoop.hbase.util.Bytes.toInt(Bytes.java:767)
 at org.apache.hadoop.hbase.util.Bytes.toInt(Bytes.java:754)
 at org.apache.hadoop.hbase.KeyValue.getKeyLength(KeyValue.java:1253)
 at org.apache.hadoop.hbase.regionserver.wal.SecureWALCellCodec$EncryptedKvEncoder.write(SecureWALCellCodec.java:194)
 at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.append(ProtobufLogWriter.java:117)
 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncWriter.run(FSHLog.java:1137)
 at java.lang.Thread.run(Thread.java:745)
2015-02-20 10:44:48,776 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: regionserver60020-WAL.AsyncWriter exiting
{noformat}
I had to disable WAL encryption, and it started working fine with the secondary index. So HFile encryption works with a secondary index but WAL encryption doesn't work. 
{quote} Parking this here for later investigation. For now I'm going to assume this is something in SecureWALCellCodec that needs looking at, but if it turns out to be a Phoenix indexer issue I will move this JIRA there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13200) Improper configuration can lead to endless lease recovery during failover
[ https://issues.apache.org/jira/browse/HBASE-13200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-13200: Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the patch. [~heliangliang] Improper configuration can lead to endless lease recovery during failover -- Key: HBASE-13200 URL: https://issues.apache.org/jira/browse/HBASE-13200 Project: HBase Issue Type: Bug Components: MTTR Reporter: He Liangliang Assignee: He Liangliang Fix For: 2.0.0 Attachments: HBASE-13200.patch When a node (DN+RS) has a machine/OS level failure, another RS will try to do lease recovery for the log file. It will retry every hbase.lease.recovery.dfs.timeout (default 61s) from the second attempt onward. When the hdfs configuration is not properly configured (e.g. socket connection timeout) and patch HDFS-4721 is absent, the lease recovery time can exceed the timeout specified by hbase.lease.recovery.dfs.timeout. This leads to endless retries and preemptions until the final timeout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
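The failure mode described above — per-attempt recovery time outrunning hbase.lease.recovery.dfs.timeout — can be sketched as a toy timing model. This is only an illustration of the arithmetic, not the real recovery loop; the method name and parameters are hypothetical:

```java
// Toy model of the lease-recovery retry budget (names hypothetical): each
// recovery attempt is re-issued every dfsTimeoutMs, and the loop must fit
// inside an overall deadline rather than preempting itself forever. If a
// single attempt costs more than dfsTimeoutMs (bad socket timeouts, no
// HDFS-4721), each preemption fires before recovery completes.
class LeaseRecoveryRetrySketch {
    static int attemptsWithinDeadline(long dfsTimeoutMs, long overallTimeoutMs,
                                      long perAttemptCostMs) {
        long elapsed = 0;
        int attempts = 0;
        while (elapsed + perAttemptCostMs <= overallTimeoutMs) {
            attempts++;
            // An attempt occupies at least one full dfs-timeout interval.
            elapsed += Math.max(perAttemptCostMs, dfsTimeoutMs);
        }
        return attempts;
    }

    public static void main(String[] args) {
        // 61s dfs timeout, 10-minute overall budget, 90s per real attempt.
        System.out.println(attemptsWithinDeadline(61_000, 600_000, 90_000) + " attempts fit in the budget");
    }
}
```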
[jira] [Commented] (HBASE-13114) [UNITTEST] TestEnableTableHandler.testDeleteForSureClearsAllTableRowsFromMeta
[ https://issues.apache.org/jira/browse/HBASE-13114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368222#comment-14368222 ] Esteban Gutierrez commented on HBASE-13114: --- Dug into this today; the problem is very similar to HBASE-13182, which is basically that Admin is not really synchronous and can return early when getTableDescriptorByTableName() is null. If we add a sleep of a few hundred ms before scanning META after deleting the table, the problem is less frequent; however, that only masks the real problem. The best option for now seems to be to use a latch, in the same way [~mbertozzi] did in HBASE-13179 and HBASE-13182. [UNITTEST] TestEnableTableHandler.testDeleteForSureClearsAllTableRowsFromMeta - Key: HBASE-13114 URL: https://issues.apache.org/jira/browse/HBASE-13114 Project: HBase Issue Type: Bug Components: test Reporter: stack Assignee: stack Attachments: 13114.txt I've seen this fail a few times. It just happened now on an internal rig. Looking into it.
{code}
REGRESSION: org.apache.hadoop.hbase.master.handler.TestEnableTableHandler.testDeleteForSureClearsAllTableRowsFromMeta

Error Message:
expected:<0> but was:<1>

Stack Trace:
java.lang.AssertionError: expected:<0> but was:<1>
 at org.junit.Assert.fail(Assert.java:88)
 at org.junit.Assert.failNotEquals(Assert.java:743)
 at org.junit.Assert.assertEquals(Assert.java:118)
 at org.junit.Assert.assertEquals(Assert.java:555)
 at org.junit.Assert.assertEquals(Assert.java:542)
 at org.apache.hadoop.hbase.master.handler.TestEnableTableHandler.testDeleteForSureClearsAllTableRowsFromMeta(TestEnableTableHandler.java:151)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
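The latch idea proposed above replaces a timing-dependent sleep with a deterministic wait. A generic sketch of that pattern follows; it uses plain java.util.concurrent rather than the real HBase coprocessor hooks, and all names here are hypothetical:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Generic sketch of the latch idea from the comment: instead of sleeping a few
// hundred ms and hoping the async table delete has finished, the test waits on
// a latch that the (simulated) master-side completion hook counts down. The
// real fix would wire the latch into a coprocessor, as in HBASE-13179/13182.
class DeleteLatchSketch {
    static final CountDownLatch deleteCompleted = new CountDownLatch(1);

    static void simulatedAsyncDelete() {
        new Thread(() -> {
            // ... table delete work would happen here ...
            deleteCompleted.countDown();   // signal: META rows are really gone
        }).start();
    }

    public static void main(String[] args) throws InterruptedException {
        simulatedAsyncDelete();
        // Deterministic wait, rather than a sleep that merely masks the race.
        boolean done = deleteCompleted.await(10, TimeUnit.SECONDS);
        System.out.println(done ? "safe to scan META" : "timed out");
    }
}
```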
[jira] [Commented] (HBASE-13269) Limit result array preallocation to avoid OOME with large scan caching values
[ https://issues.apache.org/jira/browse/HBASE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368272#comment-14368272 ] Hudson commented on HBASE-13269: FAILURE: Integrated in HBase-1.0 #812 (See [https://builds.apache.org/job/HBase-1.0/812/]) HBASE-13269 Limit result array preallocation to avoid OOME with large scan caching values (apurtell: rev d443c7096fced912b8d0cc5c63f9f013762e6122) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java Limit result array preallocation to avoid OOME with large scan caching values - Key: HBASE-13269 URL: https://issues.apache.org/jira/browse/HBASE-13269 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 1.0.1, 0.98.12 Attachments: HBASE-13269-0.98.patch, HBASE-13269-0.98.patch, HBASE-13269-1.0.patch, HBASE-13269-1.0.patch Scan#setCaching(Integer.MAX_VALUE) will likely terminate the regionserver with an OOME due to preallocation of the result array according to this parameter. We should limit the preallocation to some sane value. Definitely affects 0.98 (fix needed in HRegionServer) and 1.0.x (fix needed in RSRpcServices); not sure about later versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
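The idea behind the fix — never size a server-side buffer from an unbounded client-supplied value — can be sketched in a few lines. The constant name and cap value below are hypothetical, not the actual choices in the committed patch:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the fix's idea (constant name and value are hypothetical): clamp
// the client-supplied scan caching value before using it to preallocate the
// result array. Scan#setCaching(Integer.MAX_VALUE) would otherwise request a
// 2^31-element array up front and OOME the regionserver.
class PreallocCapSketch {
    static final int MAX_PREALLOCATION = 1000;

    static int safeInitialCapacity(int clientCaching) {
        // The list still grows past the cap if needed; only the upfront
        // allocation is bounded.
        return Math.min(clientCaching, MAX_PREALLOCATION);
    }

    public static void main(String[] args) {
        List<Object> results = new ArrayList<>(safeInitialCapacity(Integer.MAX_VALUE));
        System.out.println("preallocation capped at " + MAX_PREALLOCATION);
    }
}
```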
[jira] [Updated] (HBASE-13279) Add src/main/asciidoc/asciidoctor.css to RAT exclusion list in POM
[ https://issues.apache.org/jira/browse/HBASE-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13279: --- Status: Patch Available (was: Open) Add src/main/asciidoc/asciidoctor.css to RAT exclusion list in POM -- Key: HBASE-13279 URL: https://issues.apache.org/jira/browse/HBASE-13279 Project: HBase Issue Type: Bug Components: documentation Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 2.0.0 Attachments: 0001-Add-src-main-asciidoc-asciidoctor.css-to-RAT-exclusi.patch After copying back the latest doc updates from trunk to 0.98 branch for a release, the release audit failed due to src/main/asciidoc/asciidoctor.css, which is MIT licensed but only by reference. Exclude it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13279) Add src/main/asciidoc/asciidoctor.css to RAT exclusion list in POM
[ https://issues.apache.org/jira/browse/HBASE-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13279: --- Attachment: 0001-Add-src-main-asciidoc-asciidoctor.css-to-RAT-exclusi.patch Add src/main/asciidoc/asciidoctor.css to RAT exclusion list in POM -- Key: HBASE-13279 URL: https://issues.apache.org/jira/browse/HBASE-13279 Project: HBase Issue Type: Bug Components: documentation Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 2.0.0 Attachments: 0001-Add-src-main-asciidoc-asciidoctor.css-to-RAT-exclusi.patch After copying back the latest doc updates from trunk to 0.98 branch for a release, the release audit failed due to src/main/asciidoc/asciidoctor.css, which is MIT licensed but only by reference. Exclude it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13279) Add src/main/asciidoc/asciidoctor.css to RAT exclusion list in POM
Andrew Purtell created HBASE-13279: -- Summary: Add src/main/asciidoc/asciidoctor.css to RAT exclusion list in POM Key: HBASE-13279 URL: https://issues.apache.org/jira/browse/HBASE-13279 Project: HBase Issue Type: Bug Components: documentation Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 2.0.0 Attachments: 0001-Add-src-main-asciidoc-asciidoctor.css-to-RAT-exclusi.patch After copying back the latest doc updates from trunk to 0.98 branch for a release, the release audit failed due to src/main/asciidoc/asciidoctor.css, which is MIT licensed but only by reference. Exclude it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13262) ResultScanner doesn't return all rows in Scan
[ https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368375#comment-14368375 ] Jonathan Lawlor commented on HBASE-13262: - bq. The client ultimately requests the server return a batch of size 'hbase.client.scanner.max.result.size' and then believes that the server returned less data than that limit. Exactly correct. The client looks at the Results returned from the server and, from its point of view, sees that neither the maxResultSize nor the caching limit has been reached. The only explanation it can come up with as to why the server would return these Results is that it must have exhausted the region (otherwise it has no reason to stop accumulating Results). But the server stopped because, from its PoV, the size limit was reached. There is a miscommunication. bq. I still don't completely understand what is causing the difference on the server-side in the first place (over 0.98) Ya, it's a little cryptic because the exact same function is used to calculate the size server side and client side. I would recommend adding some logs that allow you to see the estimatedHeapSize of a cell server side versus client side and see where they differ. My guess would be that somehow the Cell on the client side returns a slightly lower heap size estimation than the SAME Cell on the server (I don't believe it's related to the NextState size bubbling up, since NextState is only in branch-1+ and the issue is in branch-1.0+). Maybe the Cells/Results are serialized in such a way that these calculations are slightly different? Somehow the server's size calculation is larger than the client's. However, even when we do understand why the server's size calculation differs from the client's, it may not help (of course we can only know once the issue has been identified). 
Like you said, the underlying problem is that the client shouldn't even be performing a size calculation but rather should be told by the server why the Results were returned. As long as there is a possibility for the server and client to disagree on why the Results were returned, it is possible to incorrectly jump between regions. Fixing the size calculation may be sufficient for resolving this issue, but going forward I think your idea of passing information back to the client in the ScanResult will be the best way to go. bq. Ultimately, the underlying problem is likely best addressed from the stance that a scanner shouldn't be performing special logic based on the size of the batch of data returned from a server Agreed bq. The server already maintains a nice enum of the reason for which it returns a batch of results to a client via NextState$State Just a note: NextState was introduced with HBASE-11544, which has only been backported to branch-1+ at this point. Since this issue appears in branch-1.0+, returning the NextState$State enum would require backporting that feature further. bq. I'm currently of the opinion that it's ideal to pass this information back to the client via the ScanResult I agree that we somehow need to communicate the reasoning behind why these Results were returned to the client, rather than looking at the Result[] and making an educated guess. bq. 0.98 clients running against 1.x could see this problem, although I have not tested that to confirm it happens. I suspect you're correct ResultScanner doesn't return all rows in Scan - Key: HBASE-13262 URL: https://issues.apache.org/jira/browse/HBASE-13262 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0, 1.1.0 Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT Reporter: Josh Elser Assignee: Josh Elser Priority: Blocker Fix For: 2.0.0, 1.1.0 Attachments: testrun_0.98.txt, testrun_branch1.0.txt Tried to write a simple Java client against 1.1.0-SNAPSHOT. 
* Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), for a total of 10M cells written * Read back the data from the table, ensure I saw 10M cells Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of the actual rows. Running against 1.0.0, returns all 10M records as expected. [Code I was running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java] for the curious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
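The fix direction discussed in the comment above — have the server state explicitly why a batch ended, instead of letting the client infer it from heap-size math — can be sketched as follows. The enum and field names are hypothetical, loosely modeled on the NextState$State idea, not the actual wire format:

```java
// Sketch of the proposed protocol change: the server ships an explicit reason
// with each scan batch, so the client never has to re-derive it from a heap
// size estimate that may disagree with the server's. Names are hypothetical.
class ScanReasonSketch {
    enum BatchEndReason { SIZE_LIMIT_REACHED, CACHING_LIMIT_REACHED, REGION_EXHAUSTED }

    static class ScanBatch {
        final int resultCount;
        final BatchEndReason reason;   // server-side truth, not a client guess
        ScanBatch(int resultCount, BatchEndReason reason) {
            this.resultCount = resultCount;
            this.reason = reason;
        }
    }

    // Client side: only move to the next region when the server says so.
    static boolean moveToNextRegion(ScanBatch batch) {
        return batch.reason == BatchEndReason.REGION_EXHAUSTED;
    }

    public static void main(String[] args) {
        // Server stopped because its size estimate hit the limit; the client
        // must keep scanning the same region instead of skipping rows.
        ScanBatch batch = new ScanBatch(42, BatchEndReason.SIZE_LIMIT_REACHED);
        System.out.println("advance region? " + moveToNextRegion(batch));
    }
}
```

With this shape, a client/server disagreement over estimatedHeapSize becomes harmless: the size comparison is no longer part of the region-advance decision at all.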
[jira] [Commented] (HBASE-13273) Make Result.EMPTY_RESULT read-only; currently it can be modified
[ https://issues.apache.org/jira/browse/HBASE-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368396#comment-14368396 ] Mikhail Antonov commented on HBASE-13273: - bq. 0 failures (±0), 24 skipped (+17) I guess not related. Make Result.EMPTY_RESULT read-only; currently it can be modified Key: HBASE-13273 URL: https://issues.apache.org/jira/browse/HBASE-13273 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 1.0.0 Reporter: stack Assignee: Mikhail Antonov Labels: beginner Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-13273.patch, HBASE-13273.patch Again from [~larsgeorge]:
{code}
Result result2 = Result.EMPTY_RESULT;
System.out.println(result2);
result2.copyFrom(result1);
System.out.println(result2);
{code}
What do you think happens when result1 has cells? Yep, you just modified the shared public EMPTY_RESULT to be not empty anymore. Fix. Result should be non-modifiable post-construction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13269) Limit result array preallocation to avoid OOME with large scan caching values
[ https://issues.apache.org/jira/browse/HBASE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368100#comment-14368100 ] Lars Hofhansl commented on HBASE-13269: --- +1 Limit result array preallocation to avoid OOME with large scan caching values - Key: HBASE-13269 URL: https://issues.apache.org/jira/browse/HBASE-13269 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 1.0.1, 0.98.12 Attachments: HBASE-13269-0.98.patch, HBASE-13269-0.98.patch, HBASE-13269-1.0.patch, HBASE-13269-1.0.patch Scan#setCaching(Integer.MAX_VALUE) will likely terminate the regionserver with an OOME due to preallocation of the result array according to this parameter. We should limit the preallocation to some sane value. Definitely affects 0.98 (fix needed in HRegionServer) and 1.0.x (fix needed in RSRpcServices); not sure about later versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13271) Table#puts(List<Put>) operation is indeterminate; remove!
[ https://issues.apache.org/jira/browse/HBASE-13271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368158#comment-14368158 ] Mikhail Antonov commented on HBASE-13271: - bq. That leaves the possibility of some puts being left over in the bufferedMutator's buffer, and the user will have no way of knowing. Wondering if BufferedMutator should have a method to retrieve the number of mutations in the buffer (writeAsyncBuffer.size() in the current impl)? Table#puts(List<Put>) operation is indeterminate; remove! - Key: HBASE-13271 URL: https://issues.apache.org/jira/browse/HBASE-13271 Project: HBase Issue Type: Improvement Components: API Affects Versions: 1.0.0 Reporter: stack Another API issue found by [~larsgeorge]: Table.put(List<Put>) is questionable after the API change.
{code}
[Mar-17 9:21 AM] Lars George: Table.put(List<Put>) is weird since you cannot flush partial lists
[Mar-17 9:21 AM] Lars George: Say out of 5 the third is broken, then the put() call returns with a local exception (say empty Put) and then you have 2 that are in the buffer
[Mar-17 9:21 AM] Lars George: but how do you force commit them?
[Mar-17 9:22 AM] Lars George: In the past you would call flushCache(), but that is gone now
[Mar-17 9:22 AM] Lars George: and flush() is not available on a Table
[Mar-17 9:22 AM] Lars George: And you cannot access the underlying BufferedMutator either
[Mar-17 9:23 AM] Lars George: You can *only* add more Puts if you can, or call close()
[Mar-17 9:23 AM] Lars George: that is just weird to explain
{code}
So, Table needs to get flush back, or we deprecate this method, or it flushes immediately and does not return until complete in the implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
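The API gap in the discussion above — a partial-list failure strands puts in a buffer the caller can neither inspect nor flush — is easiest to see with a stand-in. This sketch is not the real BufferedMutator interface; the method names (bufferedCount(), flush()) are hypothetical stand-ins for the introspection and explicit-commit methods being proposed:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in sketch for the API being discussed: a buffered writer that, unlike
// the current Table.put(List<Put>) path, exposes both an explicit flush() and
// the buffered-mutation count Mikhail asks about. Names are hypothetical.
class BufferedWriterSketch {
    private final List<String> buffer = new ArrayList<>();

    void put(String mutation) {
        if (mutation.isEmpty()) {
            throw new IllegalArgumentException("empty Put");   // local validation failure
        }
        buffer.add(mutation);
    }

    int bufferedCount() { return buffer.size(); }   // the proposed introspection method

    void flush() { buffer.clear(); }                // the explicit commit the API lost

    public static void main(String[] args) {
        BufferedWriterSketch w = new BufferedWriterSketch();
        try {
            w.put("put-1"); w.put("put-2"); w.put("");   // third one is broken
        } catch (IllegalArgumentException e) {
            // Two puts are now stranded in the buffer; with bufferedCount()
            // and flush() the caller can detect them and commit them anyway.
            System.out.println(w.bufferedCount() + " buffered; flushing");
            w.flush();
        }
        System.out.println("remaining: " + w.bufferedCount());
    }
}
```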
[jira] [Commented] (HBASE-13235) Revisit the security auditing semantics.
[ https://issues.apache.org/jira/browse/HBASE-13235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368198#comment-14368198 ] Hadoop QA commented on HBASE-13235: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705448/HBASE-13235_v4.patch against master branch at commit f9a17edc252a88c5a1a2c7764e3f9f65623e0ced. ATTACHMENT ID: 12705448 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13301//console This message is automatically generated. Revisit the security auditing semantics. 
Key: HBASE-13235 URL: https://issues.apache.org/jira/browse/HBASE-13235 Project: HBase Issue Type: Improvement Reporter: Srikanth Srungarapu Assignee: Srikanth Srungarapu Attachments: HBASE-13235.patch, HBASE-13235_v2.patch, HBASE-13235_v2.patch, HBASE-13235_v3.patch, HBASE-13235_v4.patch More specifically, the following things need a closer look. (Will include more based on feedback and/or suggestions) * Table name (say test) instead of fully qualified table name (default:test) being used. * Right now, we're using the scope to be similar to the arguments for the operation. It would be better to decouple the arguments for the operation from the scope involved in checking. For example, for createTable we have the following audit log {code} Access denied for user esteban; reason: Insufficient permissions; remote address: /10.20.30.1; request: createTable; context: (user=srikanth@XXX, scope=default, action=CREATE) {code} The
[jira] [Commented] (HBASE-11195) Potentially improve block locality during major compaction for old regions
[ https://issues.apache.org/jira/browse/HBASE-11195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368199#comment-14368199 ] Hudson commented on HBASE-11195: SUCCESS: Integrated in HBase-0.94 #1466 (See [https://builds.apache.org/job/HBase-0.94/1466/]) HBASE-11195 Addendum for TestHeapSize. (larsh: rev 260f2137bdb8b4ae839f5cc285509f34e31a006b) * src/main/java/org/apache/hadoop/hbase/regionserver/Store.java Potentially improve block locality during major compaction for old regions -- Key: HBASE-11195 URL: https://issues.apache.org/jira/browse/HBASE-11195 Project: HBase Issue Type: Improvement Affects Versions: 1.0.0, 2.0.0, 0.94.26, 0.98.10 Reporter: churro morales Assignee: churro morales Fix For: 1.0.0, 2.0.0, 0.98.10, 0.94.27 Attachments: HBASE-11195-0.94.patch, HBASE-11195-0.98.patch, HBASE-11195-0.98.v1.patch, HBASE-11195.patch, HBASE-11195.patch This might be a specific use case. But we have some regions which are no longer written to (due to the key). Those regions have 1 store file and they are very old, they haven't been written to in a while. We still use these regions to read from so locality would be nice. I propose putting a configuration option: something like hbase.hstore.min.locality.to.skip.major.compact [between 0 and 1] such that you can decide whether or not to skip major compaction for an old region with a single store file. I'll attach a patch, let me know what you guys think. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
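The config key comes from the issue text; the decision it gates can be sketched in one predicate. The method shape below is hypothetical, not the patch's actual code:

```java
// Sketch of the proposed check for HBASE-11195 (method shape hypothetical):
// an old region with a single store file is only exempted from major
// compaction when its block locality already meets the configured bar, i.e.
// hbase.hstore.min.locality.to.skip.major.compact, a value between 0 and 1.
class LocalitySkipSketch {
    static boolean skipMajorCompaction(int storeFileCount, double blockLocality,
                                       double minLocalityToSkip) {
        // Below the threshold we still compact, because rewriting the file is
        // exactly what restores locality for these cold regions.
        return storeFileCount == 1 && blockLocality >= minLocalityToSkip;
    }

    public static void main(String[] args) {
        // Single old store file, 95% local blocks, threshold 0.9: skip the
        // major compaction and keep the locality we already have.
        System.out.println(skipMajorCompaction(1, 0.95, 0.9));
    }
}
```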
[jira] [Commented] (HBASE-13275) Setting hbase.security.authorization to false does not disable authorization when AccessController is in the coprocessor class list
[ https://issues.apache.org/jira/browse/HBASE-13275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368238#comment-14368238 ] Andrew Purtell commented on HBASE-13275: We'll need a companion change in the VisibilityController too. The presence or absence of the coprocessors in the system or table coprocessor list has been serving as the authorization toggle. I suppose an argument against any fix beyond documentation is that there is no utility in having the coprocessors installed but inactive. Setting hbase.security.authorization to false does not disable authorization when AccessController is in the coprocessor class list --- Key: HBASE-13275 URL: https://issues.apache.org/jira/browse/HBASE-13275 Project: HBase Issue Type: Bug Reporter: William Watson Assignee: Andrew Purtell According to the docs provided by Cloudera (we're not running Cloudera, BTW), this is the list of configs to enable authorization in HBase:
{code}
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
{code}
We then wanted to disable authorization, but simply setting hbase.security.authorization to false did not disable it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13200) Improper configuration can lead to endless lease recovery during failover
[ https://issues.apache.org/jira/browse/HBASE-13200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-13200: Fix Version/s: 2.0.0 Improper configuration can lead to endless lease recovery during failover -- Key: HBASE-13200 URL: https://issues.apache.org/jira/browse/HBASE-13200 Project: HBase Issue Type: Bug Components: MTTR Reporter: He Liangliang Assignee: He Liangliang Fix For: 2.0.0 Attachments: HBASE-13200.patch When a node (DN+RS) has a machine/OS level failure, another RS will try to do lease recovery for the log file. It will retry every hbase.lease.recovery.dfs.timeout (default 61s) from the second attempt onward. When the HDFS configuration is not properly tuned (e.g. socket connection timeout) and patch HDFS-4721 is absent, the lease recovery time can exceed the timeout specified by hbase.lease.recovery.dfs.timeout. This leads to endless retries and preemptions until the final timeout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13216) Add version info in RPC connection header
[ https://issues.apache.org/jira/browse/HBASE-13216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-13216: Resolution: Fixed Status: Resolved (was: Patch Available) Add version info in RPC connection header - Key: HBASE-13216 URL: https://issues.apache.org/jira/browse/HBASE-13216 Project: HBase Issue Type: Improvement Components: Client, rpc Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-13216-v1.diff, HBASE-13216-v2.diff, HBASE-13216-v3.diff, HBASE-13216-v4.diff When operating a cluster, we usually want to know which clients are using an HBase client version with critical bugs, or one too old to be supported in the future. By adding version info to the RPC connection header, we can get this information from the audit log and prompt those clients to upgrade before a deadline. Discussion and suggestions are welcome. Thanks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13273) Make Result.EMPTY_RESULT read-only; currently it can be modified
[ https://issues.apache.org/jira/browse/HBASE-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13273: Attachment: HBASE-13273.patch patch (testing on object identity in fact, as Result doesn't implement equals(), which is what we need in this case) Make Result.EMPTY_RESULT read-only; currently it can be modified Key: HBASE-13273 URL: https://issues.apache.org/jira/browse/HBASE-13273 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 1.0.0 Reporter: stack Labels: beginner Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.12 Attachments: HBASE-13273.patch Again from [~larsgeorge] Result result2 = Result.EMPTY_RESULT; System.out.println(result2); result2.copyFrom(result1); System.out.println(result2); What do you think happens when result1 has cells? Yep, you just modified the shared public EMPTY_RESULT to be not empty anymore. Fix. Result should be non-modifiable post-construction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
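The read-only guard this issue asks for can be sketched without HBase on the classpath. The class, the `readonly` flag, and the thrown exception below are stand-ins to illustrate the idea, not the actual patch contents:

```java
// Sketch: a Result-like class whose shared EMPTY instance rejects
// mutation, so copyFrom() can no longer corrupt the public constant.
public class ReadOnlyResultSketch {
    static class Result {
        private Object[] cells;
        private final boolean readonly;

        Result(boolean readonly) { this.readonly = readonly; }

        // The shared constant is constructed read-only.
        static final Result EMPTY_RESULT = new Result(true);

        void copyFrom(Result other) {
            if (readonly) {
                throw new UnsupportedOperationException(
                        "Attempting to modify read-only EMPTY_RESULT");
            }
            this.cells = other.cells;
        }
    }

    // Returns true iff the shared EMPTY_RESULT cannot be mutated.
    public static boolean emptyResultRejectsMutation() {
        try {
            Result.EMPTY_RESULT.copyFrom(new Result(false));
            return false; // mutation slipped through
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        assert emptyResultRejectsMutation();
    }
}
```

With the guard in place, the snippet from the issue description would throw on `result2.copyFrom(result1)` instead of silently modifying the shared constant.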
[jira] [Assigned] (HBASE-13273) Make Result.EMPTY_RESULT read-only; currently it can be modified
[ https://issues.apache.org/jira/browse/HBASE-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov reassigned HBASE-13273: --- Assignee: Mikhail Antonov Make Result.EMPTY_RESULT read-only; currently it can be modified Key: HBASE-13273 URL: https://issues.apache.org/jira/browse/HBASE-13273 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 1.0.0 Reporter: stack Assignee: Mikhail Antonov Labels: beginner Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.12 Attachments: HBASE-13273.patch, HBASE-13273.patch Again from [~larsgeorge] Result result2 = Result.EMPTY_RESULT; System.out.println(result2); result2.copyFrom(result1); System.out.println(result2); What do you think happens when result1 has cells? Yep, you just modified the shared public EMPTY_RESULT to be not empty anymore. Fix. Result should be non-modifiable post-construction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13273) Make Result.EMPTY_RESULT read-only; currently it can be modified
[ https://issues.apache.org/jira/browse/HBASE-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13273: Status: Patch Available (was: Open) Make Result.EMPTY_RESULT read-only; currently it can be modified Key: HBASE-13273 URL: https://issues.apache.org/jira/browse/HBASE-13273 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 0.98.0 Reporter: stack Assignee: Mikhail Antonov Labels: beginner Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.12 Attachments: HBASE-13273.patch, HBASE-13273.patch Again from [~larsgeorge] Result result2 = Result.EMPTY_RESULT; System.out.println(result2); result2.copyFrom(result1); System.out.println(result2); What do you think happens when result1 has cells? Yep, you just modified the shared public EMPTY_RESULT to be not empty anymore. Fix. Result should be non-modifiable post-construction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13270) Setter for Result#getStats is #addResults; confusing!
[ https://issues.apache.org/jira/browse/HBASE-13270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13270: Attachment: HBASE-13270.patch trivial patch Setter for Result#getStats is #addResults; confusing! - Key: HBASE-13270 URL: https://issues.apache.org/jira/browse/HBASE-13270 Project: HBase Issue Type: Improvement Reporter: stack Labels: beginner Attachments: HBASE-13270.patch Below is our [~larsgeorge] on a finding he made reviewing our API: Result class having getStats() and addResults(Stats) makes little sense... ...the naming is just weird. You have a getStats() getter and an addResults(Stats) setter??? ...Especially in the Result class and addResult() is plain misleading... This issue is about deprecating addResults and replacing it with addStats. The getStats/addResults pairing is recent. It came in with: {code} commit a411227b0ebf78b4ee8ae7179e162b54734e77de Author: Jesse Yates jesse.k.ya...@gmail.com Date: Tue Oct 28 16:14:16 2014 -0700 HBASE-5162 Basic client pushback mechanism ... {code} RegionLoadStats don't belong in Result if you ask me but better in the enveloping on invocations... but that is another issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13035) [0.98] Backport HBASE-12867 - Shell does not support custom replication endpoint specification
[ https://issues.apache.org/jira/browse/HBASE-13035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13035: --- Fix Version/s: (was: 0.98.12) 0.98.13 [0.98] Backport HBASE-12867 - Shell does not support custom replication endpoint specification -- Key: HBASE-13035 URL: https://issues.apache.org/jira/browse/HBASE-13035 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Fix For: 1.0.1, 0.98.13 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HBASE-12816) GC logs are lost upon Region Server restart if GCLogFileRotation is enabled
[ https://issues.apache.org/jira/browse/HBASE-12816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12816: --- Comment: was deleted (was: Moving to 0.98.11) GC logs are lost upon Region Server restart if GCLogFileRotation is enabled --- Key: HBASE-12816 URL: https://issues.apache.org/jira/browse/HBASE-12816 Project: HBase Issue Type: Bug Components: scripts Reporter: Abhishek Singh Chouhan Assignee: Abhishek Singh Chouhan Priority: Minor Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-12816.patch When -XX:+UseGCLogFileRotation is used gc log files end with .gc.0 instead of .gc. hbase_rotate_log () in hbase-daemon.sh does not handle this correctly and hence when a RS is restarted old gc logs are lost(overwritten). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
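The fix the report implies is to match the rotated `.gc.0` name rather than only the exact `.gc` suffix when backing up logs at restart. The real fix lives in `hbase_rotate_log()` in hbase-daemon.sh; the plain-Java sketch below illustrates the same matching logic, with the file names and the `.old` backup scheme as assumptions:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch: back up "<base>.gc*" (which covers the "<base>.gc.0" name
// produced by -XX:+UseGCLogFileRotation) before a restart, so the
// previous process's GC logs are not overwritten.
public class GcLogRotateSketch {
    // Returns the number of log files moved out of the way.
    public static int rotate(Path dir, String base) throws IOException {
        int rotated = 0;
        // Glob "<base>.gc*" matches both "<base>.gc" and "<base>.gc.0".
        try (DirectoryStream<Path> logs =
                 Files.newDirectoryStream(dir, base + ".gc*")) {
            for (Path log : logs) {
                Files.move(log, dir.resolve(log.getFileName() + ".old"),
                        StandardCopyOption.REPLACE_EXISTING);
                rotated++;
            }
        }
        return rotated;
    }

    // Self-contained demo: a rotated-style log survives the restart step.
    public static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("gc");
            Files.createFile(dir.resolve("rs.gc.0"));
            return rotate(dir, "rs") == 1
                    && Files.exists(dir.resolve("rs.gc.0.old"));
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        assert demo();
    }
}
```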
[jira] [Updated] (HBASE-12891) Parallel execution for Hbck checkRegionConsistency
[ https://issues.apache.org/jira/browse/HBASE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12891: --- Fix Version/s: (was: 0.98.12) 0.98.13 Parallel execution for Hbck checkRegionConsistency -- Key: HBASE-12891 URL: https://issues.apache.org/jira/browse/HBASE-12891 Project: HBase Issue Type: Improvement Components: hbck Affects Versions: 2.0.0, 0.98.10, 1.1.0 Reporter: churro morales Assignee: churro morales Labels: performance, scalability Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: HBASE-12891-v1.patch, HBASE-12891.98.patch, HBASE-12891.patch, HBASE-12891.patch, hbase-12891-addendum1.patch We have a lot of regions on our cluster ~500k and noticed that hbck took quite some time in checkAndFixConsistency(). [~davelatham] patched our cluster to do this check in parallel to speed things up. I'll attach the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
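The parallelization described above can be sketched with a plain thread pool: fan the per-region check out over a fixed number of workers instead of iterating ~500k regions serially. The region names and the stand-in check below are illustrative; hbck's real `checkRegionConsistency` does far more:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: run a per-region consistency check concurrently and count
// the regions that fail it.
public class ParallelConsistencyCheckSketch {
    public static int checkAll(List<String> regions, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (String region : regions) {
                results.add(pool.submit(() -> checkRegionConsistency(region)));
            }
            int inconsistent = 0;
            for (Future<Boolean> f : results) {
                try {
                    if (!f.get()) inconsistent++; // collect each verdict
                } catch (InterruptedException | ExecutionException e) {
                    throw new RuntimeException(e);
                }
            }
            return inconsistent;
        } finally {
            pool.shutdown();
        }
    }

    // Stand-in for the real per-region check.
    static boolean checkRegionConsistency(String region) {
        return !region.endsWith("bad");
    }

    public static void main(String[] args) {
        assert checkAll(Arrays.asList("r1", "r2", "r3bad"), 4) == 1;
    }
}
```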
[jira] [Updated] (HBASE-12816) GC logs are lost upon Region Server restart if GCLogFileRotation is enabled
[ https://issues.apache.org/jira/browse/HBASE-12816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12816: --- Fix Version/s: (was: 0.98.12) 0.98.13 GC logs are lost upon Region Server restart if GCLogFileRotation is enabled --- Key: HBASE-12816 URL: https://issues.apache.org/jira/browse/HBASE-12816 Project: HBase Issue Type: Bug Components: scripts Reporter: Abhishek Singh Chouhan Assignee: Abhishek Singh Chouhan Priority: Minor Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-12816.patch When -XX:+UseGCLogFileRotation is used gc log files end with .gc.0 instead of .gc. hbase_rotate_log () in hbase-daemon.sh does not handle this correctly and hence when a RS is restarted old gc logs are lost(overwritten). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13262) ResultScanner doesn't return all rows in Scan
[ https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368251#comment-14368251 ] Josh Elser commented on HBASE-13262: Ok, been a while since I posted some progress, here's my current understanding of things and hopefully an easier-to-grok statement of the problem: When clients request a batch of rows larger than the server is configured to return (often, when the client does not explicitly limit the results to be returned from the server), the client will incorrectly treat this as meaning that all data in the current region has been exhausted. This goes back to what [~jonathan.lawlor] pointed out about clients and servers needing to stay in sync WRT the size of a batch of {{Result}}s. The client ultimately requests that the server return a batch of size 'hbase.client.scanner.max.result.size' and then believes that the server returned less data than that limit. A client-side workaround is to reduce the number of rows requested on the {{Scan}} via {{Scan#setCaching(int)}}. Setting this value sufficiently low (for my test code, anything less than 1000 seems to do the trick) will cause the server to flush the results back to the client before it gets close to the size limit that would cause the client to do the wrong thing. I still don't completely understand what is causing the difference on the server side in the first place (over 0.98). I need to dig more there to understand things. I'm not sure if I'm just missing somewhere that {{CellUtil#estimatedHeapSizeOf(Cell)}} isn't being used, or if some size is bubbling up through the {{NextState}} via the {{KeyValueHeap}} (and thus MemStores or StoreFiles), or something entirely different. Ultimately, the underlying problem is likely best addressed from the stance that a scanner shouldn't be performing special logic based on the size of the batch of data returned from a server. 
In other words, the client should not be making logic decisions based solely on the size or length of the {{Result[]}} it receives. The server already maintains a nice enum of the reason it returns a batch of results to a client via {{NextState$State}}. The server has the answer to our question when it returns a batch: was this batch returned due to a limit on its size (either length or bytes)? I'm currently of the opinion that it's ideal to pass this information back to the client via the {{ScanResult}}. Ignoring wire-version issues for the moment, this means that clients would rely on this new enum to determine when there is more data to read from a Region and when a Region is exhausted (instead of the size and length checks of the {{Result[]}}). This approach wouldn't break 0.98 clients against 1.x; however, it also wouldn't address the underlying problem of the client guessing at what to do based on the characteristics of the {{Result[]}} when it is unaware of the existence of this new field in the protobuf. Given my understanding of the problem, 0.98 clients running against 1.x *could* see this problem, although I have not tested that to confirm it happens. Obviously, I need to do some more digging as to where the mismatch in size is coming from (unless I missed it from Jonathan earlier on) before I get a patch. Thoughts/comments welcome meanwhile. ResultScanner doesn't return all rows in Scan - Key: HBASE-13262 URL: https://issues.apache.org/jira/browse/HBASE-13262 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0, 1.1.0 Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT Reporter: Josh Elser Assignee: Josh Elser Priority: Blocker Fix For: 2.0.0, 1.1.0 Attachments: testrun_0.98.txt, testrun_branch1.0.txt Tried to write a simple Java client against 1.1.0-SNAPSHOT. 
* Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), for a total of 10M cells written * Read back the data from the table, ensure I saw 10M cells Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of the actual rows. Running against 1.0.0, returns all 10M records as expected. [Code I was running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java] for the curious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
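The size-versus-caching mismatch described in the comment above can be reproduced with a toy model, no HBase required. All numbers, the per-row size, and the "short batch means region exhausted" heuristic below are illustrative:

```java
// Toy simulation: the client asks for `caching` rows per batch, the
// server also caps each batch by byte size, and the client wrongly
// treats any batch shorter than `caching` as "region exhausted".
public class ScannerHeuristicSketch {
    static final int ROW_BYTES = 100; // assumed uniform row size

    // Server returns up to `caching` rows, but no more than maxBytes worth.
    static int nextBatch(int remaining, int caching, int maxBytes) {
        int bySize = maxBytes / ROW_BYTES;
        return Math.min(remaining, Math.min(caching, bySize));
    }

    // Client using the size-based heuristic; returns rows actually seen.
    static int scanWithHeuristic(int totalRows, int caching, int maxBytes) {
        int seen = 0;
        while (true) {
            int batch = nextBatch(totalRows - seen, caching, maxBytes);
            seen += batch;
            if (batch < caching) break; // wrong when the server capped by size
        }
        return seen;
    }

    public static void main(String[] args) {
        // Caching above the size cap: the scan stops after one batch.
        assert scanWithHeuristic(1000, 500, 10_000) == 100;
        // Caching below the size cap (the setCaching workaround): all rows arrive.
        assert scanWithHeuristic(1000, 50, 10_000) == 1000;
    }
}
```

The second assertion mirrors the workaround in the comment: keeping the per-batch row count under the size limit prevents the server from ever truncating a batch, so the heuristic never misfires.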
[jira] [Commented] (HBASE-13262) ResultScanner doesn't return all rows in Scan
[ https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368268#comment-14368268 ] Andrew Purtell commented on HBASE-13262: bq. This approach wouldn't break 0.98 clients against 1.x; however, it also wouldn't address the underlying problem of the client guessing at what to do based on the characteristics of the {{Result[]}} when it is unaware of the existence of this new field in the protobuf. Given my understanding of the problem, 0.98 clients running against 1.x *could* see this problem, although I have not tested that to confirm it happens. Wire compatibility and a default configuration in 1.0.x that mitigates the problem until a rolling upgrade is completed could be good enough. Additional comment reserved until you come back with results from more digging. ResultScanner doesn't return all rows in Scan - Key: HBASE-13262 URL: https://issues.apache.org/jira/browse/HBASE-13262 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0, 1.1.0 Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT Reporter: Josh Elser Assignee: Josh Elser Priority: Blocker Fix For: 2.0.0, 1.1.0 Attachments: testrun_0.98.txt, testrun_branch1.0.txt Tried to write a simple Java client against 1.1.0-SNAPSHOT. * Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), for a total of 10M cells written * Read back the data from the table, ensure I saw 10M cells Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of the actual rows. Running against 1.0.0, returns all 10M records as expected. [Code I was running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java] for the curious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13276) Fix incorrect condition for minimum block locality in 0.98
[ https://issues.apache.org/jira/browse/HBASE-13276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368266#comment-14368266 ] Hudson commented on HBASE-13276: SUCCESS: Integrated in HBase-0.98 #907 (See [https://builds.apache.org/job/HBase-0.98/907/]) HBASE-13276 Fix incorrect condition for minimum block locality in 0.98. (churro morales and larsh) (larsh: rev dfb015d68288d090682308ffcc61badd8a821bb7) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/RatioBasedCompactionPolicy.java Fix incorrect condition for minimum block locality in 0.98 -- Key: HBASE-13276 URL: https://issues.apache.org/jira/browse/HBASE-13276 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.98.12 Attachments: HBASE-11195-0.98.v1.patch 0.98 only. Parent somehow was incorrect. One-liner to fix it. But it's critical as we perform potentially _way_ more compactions now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13114) [UNITTEST] TestEnableTableHandler.testDeleteForSureClearsAllTableRowsFromMeta
[ https://issues.apache.org/jira/browse/HBASE-13114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez updated HBASE-13114: -- Attachment: 0001-UNITTEST-TestEnableTableHandler.testDeleteForSureCle.patch [UNITTEST] TestEnableTableHandler.testDeleteForSureClearsAllTableRowsFromMeta - Key: HBASE-13114 URL: https://issues.apache.org/jira/browse/HBASE-13114 Project: HBase Issue Type: Bug Components: test Reporter: stack Assignee: stack Attachments: 0001-UNITTEST-TestEnableTableHandler.testDeleteForSureCle.patch, 13114.txt I've seen this fail a few times. It just happened now on internal rig. Looking into it {code} REGRESSION: org.apache.hadoop.hbase.master.handler.TestEnableTableHandler.testDeleteForSureClearsAllTableRowsFromMeta Error Message: expected:0 but was:1 Stack Trace: java.lang.AssertionError: expected:0 but was:1 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.hbase.master.handler.TestEnableTableHandler.testDeleteForSureClearsAllTableRowsFromMeta(TestEnableTableHandler.java:151) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13114) [UNITTEST] TestEnableTableHandler.testDeleteForSureClearsAllTableRowsFromMeta
[ https://issues.apache.org/jira/browse/HBASE-13114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez updated HBASE-13114: -- Status: Patch Available (was: Open) [UNITTEST] TestEnableTableHandler.testDeleteForSureClearsAllTableRowsFromMeta - Key: HBASE-13114 URL: https://issues.apache.org/jira/browse/HBASE-13114 Project: HBase Issue Type: Bug Components: test Reporter: stack Assignee: stack Attachments: 0001-UNITTEST-TestEnableTableHandler.testDeleteForSureCle.patch, 13114.txt I've seen this fail a few times. It just happened now on internal rig. Looking into it {code} REGRESSION: org.apache.hadoop.hbase.master.handler.TestEnableTableHandler.testDeleteForSureClearsAllTableRowsFromMeta Error Message: expected:0 but was:1 Stack Trace: java.lang.AssertionError: expected:0 but was:1 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.hbase.master.handler.TestEnableTableHandler.testDeleteForSureClearsAllTableRowsFromMeta(TestEnableTableHandler.java:151) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13273) Make Result.EMPTY_RESULT read-only; currently it can be modified
[ https://issues.apache.org/jira/browse/HBASE-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368346#comment-14368346 ] Hadoop QA commented on HBASE-13273: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705471/HBASE-13273.patch against master branch at commit f9a17edc252a88c5a1a2c7764e3f9f65623e0ced. ATTACHMENT ID: 12705471 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13303//console This message is automatically generated. 
Make Result.EMPTY_RESULT read-only; currently it can be modified Key: HBASE-13273 URL: https://issues.apache.org/jira/browse/HBASE-13273 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 1.0.0 Reporter: stack Assignee: Mikhail Antonov Labels: beginner Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13 Attachments: HBASE-13273.patch, HBASE-13273.patch Again from [~larsgeorge] Result result2 = Result.EMPTY_RESULT; System.out.println(result2); result2.copyFrom(result1); System.out.println(result2); What do you think happens when result1 has cells? Yep, you just modified the shared public EMPTY_RESULT to be not empty anymore. Fix. Result should be non-modifiable post-construction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13271) Table#puts(List<Put>) operation is indeterminate; remove!
[ https://issues.apache.org/jira/browse/HBASE-13271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368044#comment-14368044 ] Solomon Duskis commented on HBASE-13271: {quote} But it seems that Lars' issue is with autoflush=false? {quote} I certainly don't want to put words in lars' mouth, but I think that the case at hand is the default autoflush=true. In the default case, if there's an exception in bufferedMutator.put(puts), then the bufferedMutator.flush() method is never invoked. That leaves the possibility of some puts being left over in the bufferedMutator's buffer, and the user will have no way of knowing. After that initial exception, there's no good way to clear the buffer. If one calls Table.put(put) after that initial put(puts) failure, there still might be remnants of the previous call. That might cause additional exceptions unrelated to the current put(put) operation. I probably should add a test case for this scenario... Table#puts(List<Put>) operation is indeterminate; remove! - Key: HBASE-13271 URL: https://issues.apache.org/jira/browse/HBASE-13271 Project: HBase Issue Type: Improvement Components: API Affects Versions: 1.0.0 Reporter: stack Another API issue found by [~larsgeorge]: Table.put(List<Put>) is questionable after the API change. {code} [Mar-17 9:21 AM] Lars George: Table.put(List<Put>) is weird since you cannot flush partial lists [Mar-17 9:21 AM] Lars George: Say out of 5 the third is broken, then the put() call returns with a local exception (say empty Put) and then you have 2 that are in the buffer [Mar-17 9:21 AM] Lars George: but how to you force commit them? 
[Mar-17 9:22 AM] Lars George: In the past you would call flushCache(), but that is gone now [Mar-17 9:22 AM] Lars George: and flush() is not available on a Table [Mar-17 9:22 AM] Lars George: And you cannot access the underlying BufferedMutation neither [Mar-17 9:23 AM] Lars George: You can *only* add more Puts if you can, or call close() [Mar-17 9:23 AM] Lars George: that is just weird to explain {code} So, Table needs to get flush back or we deprecate this method or it flushes immediately and does not return until complete in the implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
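The leftover-buffer failure mode Solomon describes can be modeled with a toy buffered writer. No HBase types are involved; the "empty entry" validation below is a stand-in for the local "empty Put" exception in the chat transcript:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model: put(List) validates each item as it buffers, so a bad
// entry mid-list throws and strands the earlier entries in the buffer,
// with no flush() exposed to force them out.
public class BufferedPutSketch {
    final List<String> buffer = new ArrayList<>();

    void put(List<String> puts) {
        for (String p : puts) {
            if (p.isEmpty()) {
                throw new IllegalArgumentException("empty Put");
            }
            buffer.add(p); // buffered, but never flushed after the throw
        }
    }

    // Demo: out of 5 puts the third is broken; two are left stranded.
    static int strandedAfterFailure() {
        BufferedPutSketch t = new BufferedPutSketch();
        try {
            t.put(Arrays.asList("a", "b", "", "d", "e"));
        } catch (IllegalArgumentException expected) {
            // caller sees the exception but cannot drain the buffer
        }
        return t.buffer.size();
    }

    public static void main(String[] args) {
        assert strandedAfterFailure() == 2;
    }
}
```

This matches the transcript's "then you have 2 that are in the buffer": the exception surfaces, but the two valid entries sit in the buffer with no way to commit or discard them.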
[jira] [Commented] (HBASE-13262) ResultScanner doesn't return all rows in Scan
[ https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368073#comment-14368073 ] Josh Elser commented on HBASE-13262: The concern is definitely noted, and thank you for being clear. ResultScanner doesn't return all rows in Scan - Key: HBASE-13262 URL: https://issues.apache.org/jira/browse/HBASE-13262 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0, 1.1.0 Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT Reporter: Josh Elser Assignee: Josh Elser Priority: Blocker Fix For: 2.0.0, 1.1.0 Attachments: testrun_0.98.txt, testrun_branch1.0.txt Tried to write a simple Java client against 1.1.0-SNAPSHOT. * Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), for a total of 10M cells written * Read back the data from the table, ensure I saw 10M cells Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of the actual rows. Running against 1.0.0, returns all 10M records as expected. [Code I was running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java] for the curious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13273) Make Result.EMPTY_RESULT read-only; currently it can be modified
[ https://issues.apache.org/jira/browse/HBASE-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13273: Attachment: HBASE-13273.patch added trivial test case Make Result.EMPTY_RESULT read-only; currently it can be modified Key: HBASE-13273 URL: https://issues.apache.org/jira/browse/HBASE-13273 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 1.0.0 Reporter: stack Labels: beginner Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.12 Attachments: HBASE-13273.patch, HBASE-13273.patch Again from [~larsgeorge] Result result2 = Result.EMPTY_RESULT; System.out.println(result2); result2.copyFrom(result1); System.out.println(result2); What do you think happens when result1 has cells? Yep, you just modified the shared public EMPTY_RESULT to be not empty anymore. Fix. Result should be non-modifiable post-construction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12945) Port: New master API to track major compaction completion to 0.98
[ https://issues.apache.org/jira/browse/HBASE-12945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12945: --- Fix Version/s: (was: 0.98.12) 0.98.13 Port: New master API to track major compaction completion to 0.98 - Key: HBASE-12945 URL: https://issues.apache.org/jira/browse/HBASE-12945 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Fix For: 0.98.13 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12938) Upgrade HTrace to a recent supportable incubating version
[ https://issues.apache.org/jira/browse/HBASE-12938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12938: --- Fix Version/s: (was: 0.98.12) 0.98.13 Upgrade HTrace to a recent supportable incubating version - Key: HBASE-12938 URL: https://issues.apache.org/jira/browse/HBASE-12938 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Fix For: 0.98.13 In 0.98 we have an old htrace (still using the org.cloudera.htrace package) and since the introduction of htrace code, htrace itself first moved to org.htrace, then became an incubating project. I filed this as a bug because the HTrace version we reference in 0.98 is of little to no use going forward. Unfortunately we must make a disruptive change, although it looks to be mostly fixing up imports, we expose no HTrace classes to HBase configuration, and where we extend HTrace classes in our code, those HBase classes are in hbase-server and not tagged for public consumption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)