[jira] [Commented] (HBASE-11261) Handle splitting/merging of regions that have region_replication greater than one

2017-08-29 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145869#comment-16145869
 ] 

Devaraj Das commented on HBASE-11261:
-

What Enis said is right. We simplified the approach and let the master do the 
split/merge of replicas (which essentially means closing old replicas and 
creating new replicas) after the split/merge of the parent is complete.

> Handle splitting/merging of regions that have region_replication greater than 
> one
> -
>
> Key: HBASE-11261
> URL: https://issues.apache.org/jira/browse/HBASE-11261
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 11261-1.1.txt, 11261-1.2.txt, 11261-2.txt, 11261-3.txt, 
> 11261-committed.txt, 11261-with-merge-2.txt, 11261-with-merge-3.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18424) Fix TestAsyncTableGetMultiThreaded

2017-07-21 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096918#comment-16096918
 ] 

Devaraj Das commented on HBASE-18424:
-

TestAsyncTableGetMultiThreaded timed out on the hadoopqa run, [~vrodionov]?

> Fix TestAsyncTableGetMultiThreaded
> --
>
> Key: HBASE-18424
> URL: https://issues.apache.org/jira/browse/HBASE-18424
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Attachments: HBASE-18424-v1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster

2017-07-20 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095141#comment-16095141
 ] 

Devaraj Das commented on HBASE-18086:
-

bq. Since load-client is supposed to verify large amount of data, I think it 
makes more sense to adopt this approach instead of issuing scan and get in two 
rounds of verification.
The idea is to put "stress" on the various code paths. Let's get the behavior 
where one can say what to use via an option - scan, get, or multi-get. By 
default it can do scan or multi-get...
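
For illustration, here is a rough sketch of how such an option could be plumbed 
through the native load client; the mode names and all identifiers below are 
hypothetical, not from the attached patches:
{code}
#include <stdexcept>
#include <string>

// Hypothetical read-mode switch for the verification phase.
enum class ReadMode { SCAN, GET, MULTI_GET };

// Parse the value of a hypothetical command-line option.
ReadMode ParseReadMode(const std::string& s) {
  if (s == "scan") return ReadMode::SCAN;
  if (s == "get") return ReadMode::GET;
  if (s == "multi-get") return ReadMode::MULTI_GET;
  throw std::invalid_argument("unknown read mode: " + s);
}

// Dispatch the verification pass on the selected mode.
void Verify(ReadMode mode) {
  switch (mode) {
    case ReadMode::SCAN:      /* verify with a single scan pass */   break;
    case ReadMode::GET:       /* verify with one get per row */      break;
    case ReadMode::MULTI_GET: /* verify with batched (multi) gets */ break;
  }
}
{code}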

> Create native client which creates load on selected cluster
> ---
>
> Key: HBASE-18086
> URL: https://issues.apache.org/jira/browse/HBASE-18086
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 18086.v11.txt, 18086.v12.txt, 18086.v14.txt, 
> 18086.v17.txt, 18086.v1.txt, 18086.v3.txt, 18086.v4.txt, 18086.v5.txt, 
> 18086.v6.txt, 18086.v7.txt, 18086.v8.txt
>
>
> This task is to create a client which uses multiple threads to conduct Puts 
> followed by Gets against a selected cluster.
> The default is to run the tool against the local cluster.
> This would give us some idea of the characteristics of the native client in 
> terms of handling high load.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18354) Fix TestMasterMetrics that were disabled by Proc-V2 AM in HBASE-14614

2017-07-14 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088227#comment-16088227
 ] 

Devaraj Das commented on HBASE-18354:
-

The test works fine with the patch. But could you please clarify why 
totalNumberBefore could be non-zero?

> Fix TestMasterMetrics that were disabled by Proc-V2 AM in HBASE-14614
> -
>
> Key: HBASE-18354
> URL: https://issues.apache.org/jira/browse/HBASE-18354
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.0-alpha-1
>Reporter: Stephen Yuan Jiang
>Assignee: Vladimir Rodionov
> Attachments: HBASE-18354-v1.patch
>
>
> With the core Proc-V2 AM change in HBASE-14614, stuff is different now around 
> startup, which messes up the TestMasterMetrics test. HBASE-14614 disabled two 
> of the three tests.
> This JIRA tracks the work to fix the disabled tests.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-14135) HBase Backup/Restore Phase 3: Merge backup images

2017-07-06 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16077479#comment-16077479
 ] 

Devaraj Das commented on HBASE-14135:
-

bq. Originally it was Service, then - Task, now is Job. What do you suggest, 
stack?
How about _process_? Seems generic enough to capture MR Job among other 
possibilities.. Stack?

> HBase Backup/Restore Phase 3: Merge backup images
> -
>
> Key: HBASE-14135
> URL: https://issues.apache.org/jira/browse/HBASE-14135
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Blocker
>  Labels: backup
> Fix For: HBASE-7912
>
> Attachments: HBASE-14135-v1.patch, HBASE-14135-v3.patch
>
>
> A user can merge incremental backup images into a single incremental backup image.
> # Merge supports only incremental images
> # Merge supports only images for the same backup destination
> Command:
> {code}
> hbase backup merge image1,image2,..imageK
> {code}
> Example:
> {code}
> hbase backup merge backup_143126764557,backup_143126764456 
> {code}
> When the operation is complete, only the most recent backup image will be kept 
> (in the above example, backup_143126764557) as the merged backup image; all 
> other images will be deleted from both the file system and the backup system 
> tables. The corresponding backup manifest for the merged backup image will be 
> updated to remove dependencies on the deleted images. The merged backup image 
> will contain all the data from the original image and from the deleted images.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-18214) Replace the folly::AtomicHashMap usage in the RPC layer

2017-06-23 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16061660#comment-16061660
 ] 

Devaraj Das edited comment on HBASE-18214 at 6/23/17 11:58 PM:
---

Looks fine but I think the lock inside {code}mapped_type& operator[](const 
key_type& key) {code} (in the class concurrent_map) should be the unique_lock 
since we are mutating the map when the key doesn't exist, no?
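
For illustration, a minimal sketch of the locking pattern under discussion 
(assuming a map guarded by a shared_timed_mutex; the actual concurrent_map from 
the patch is not reproduced here):
{code}
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

template <typename K, typename V>
class concurrent_map_sketch {
 public:
  // operator[] may insert a default-constructed value, i.e. it can
  // mutate the map, so it must take the exclusive (unique) lock.
  V& operator[](const K& key) {
    std::unique_lock<std::shared_timed_mutex> lock(mutex_);
    return map_[key];
  }

  // A pure lookup, by contrast, can share the lock with other readers.
  bool contains(const K& key) const {
    std::shared_lock<std::shared_timed_mutex> lock(mutex_);
    return map_.find(key) != map_.end();
  }

 private:
  mutable std::shared_timed_mutex mutex_;
  std::unordered_map<K, V> map_;
};
{code}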


was (Author: devaraj):
Looks fine but I think the lock inside {code}mapped_type& operator[](const 
key_type& key) {code} should be the unique_lock since we are mutating the map 
when the key doesn't exist, no?

> Replace the folly::AtomicHashMap usage in the RPC layer
> ---
>
> Key: HBASE-18214
> URL: https://issues.apache.org/jira/browse/HBASE-18214
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: 18214-1-1.txt, 18214-1-2.txt, hbase-18214_v3.patch, 
> hbase-18214_v4.patch
>
>
> In my tests, I saw that folly::AtomicHashMap usage is not appropriate for 
> one, rather common use case. It'd become sort of unusable (inserts would 
> hang) after a bunch of inserts and erases. This hashmap is used to keep track 
> of call-Id after a connection is set up in the RPC layer (insert a 
> call-id/msg pair when an RPC is sent, and erase the pair when the 
> corresponding response is received). Here is a simple program that will 
> demonstrate the issue:
> {code}
> folly::AtomicHashMap<int, int> f(100);  // template args assumed <int, int>; stripped in the archive
> int i = 0;
> while (i < 100000) {  // loop bound assumed; truncated in the archive
>   try {
>     f.insert(i, 100);
>     LOG(INFO) << "Inserted " << i << "  " << f.size();
>     f.erase(i);
>     LOG(INFO) << "Deleted " << i << "  " << f.size();
>     i++;
>   } catch (const std::exception& e) {
>     LOG(INFO) << "Exception " << e.what();
>     break;
>   }
> }
> {code}
> After poking around a little bit, it is indeed called out as a limitation 
> here 
> https://github.com/facebook/folly/blob/master/folly/docs/AtomicHashMap.md 
> (grep for 'erase'). The proposal is to replace this with something that fits 
> the above use case (thinking of using std::unordered_map).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18214) Replace the folly::AtomicHashMap usage in the RPC layer

2017-06-23 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16061660#comment-16061660
 ] 

Devaraj Das commented on HBASE-18214:
-

Looks fine but I think the lock inside {code}mapped_type& operator[](const 
key_type& key) {code} should be the unique_lock since we are mutating the map 
when the key doesn't exist, no?

> Replace the folly::AtomicHashMap usage in the RPC layer
> ---
>
> Key: HBASE-18214
> URL: https://issues.apache.org/jira/browse/HBASE-18214
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: 18214-1-1.txt, 18214-1-2.txt, hbase-18214_v3.patch, 
> hbase-18214_v4.patch
>
>
> In my tests, I saw that folly::AtomicHashMap usage is not appropriate for 
> one, rather common use case. It'd become sort of unusable (inserts would 
> hang) after a bunch of inserts and erases. This hashmap is used to keep track 
> of call-Id after a connection is set up in the RPC layer (insert a 
> call-id/msg pair when an RPC is sent, and erase the pair when the 
> corresponding response is received). Here is a simple program that will 
> demonstrate the issue:
> {code}
> folly::AtomicHashMap<int, int> f(100);  // template args assumed <int, int>; stripped in the archive
> int i = 0;
> while (i < 100000) {  // loop bound assumed; truncated in the archive
>   try {
>     f.insert(i, 100);
>     LOG(INFO) << "Inserted " << i << "  " << f.size();
>     f.erase(i);
>     LOG(INFO) << "Deleted " << i << "  " << f.size();
>     i++;
>   } catch (const std::exception& e) {
>     LOG(INFO) << "Exception " << e.what();
>     break;
>   }
> }
> {code}
> After poking around a little bit, it is indeed called out as a limitation 
> here 
> https://github.com/facebook/folly/blob/master/folly/docs/AtomicHashMap.md 
> (grep for 'erase'). The proposal is to replace this with something that fits 
> the above use case (thinking of using std::unordered_map).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-18214) Replace the folly::AtomicHashMap usage in the RPC layer

2017-06-19 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054931#comment-16054931
 ] 

Devaraj Das edited comment on HBASE-18214 at 6/19/17 11:23 PM:
---

bq. Instead of std::unique_ptr<std::shared_timed_mutex> mutex_; use 
std::shared_timed_mutex mutex_;
The issue is that shared_timed_mutex is (rightfully) not copyable. So one would 
have to define explicit copy constructors in the classes where it is used... 
It's not clear what such a copy constructor would do anyway (mutexes are 
fundamentally not safely copyable). Treating it as a pointer in class 
declarations works around this, since all the instances in question would refer 
to the same underlying mutex. But maybe we should declare it to be a shared_ptr 
as opposed to a unique_ptr. In the usecase we have, we probably will not run 
into issues with either anyway, I guess.
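
For illustration, a minimal sketch of the shared_ptr approach (the class below 
is made up, not from the patch):
{code}
#include <memory>
#include <mutex>
#include <shared_mutex>

// A shared_timed_mutex member would make Holder non-copyable; holding the
// mutex behind a shared_ptr keeps the default copy constructor, and every
// copy locks the same underlying mutex.
class Holder {
 public:
  Holder() : mutex_(std::make_shared<std::shared_timed_mutex>()) {}

  void Mutate() {
    // Exclusive lock on the shared underlying mutex.
    std::unique_lock<std::shared_timed_mutex> lock(*mutex_);
    // ... mutate the state guarded by the mutex ...
  }

 private:
  std::shared_ptr<std::shared_timed_mutex> mutex_;
};
{code}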


was (Author: devaraj):
bq. Instead of std::unique_ptr<std::shared_timed_mutex> mutex_; use 
std::shared_timed_mutex mutex_;
The issue is the shared_timed_mutex is (rightfully) not copyable. So one would 
have to define explicit copy constructors in the classes where this is used... 
It's not clear as to what such a copy constructor would do anyway (mutexes are 
fundamentally not copyable safely). Treating it as a pointer in class 
declarations works around such things since all the instances in question would 
refer to the same underlying mutex. But maybe, we should declare it to be 
shared_ptr as opposed to a shared_ptr. In the usecase we have, we probably will 
not run into issues with either anyway I guess.

> Replace the folly::AtomicHashMap usage in the RPC layer
> ---
>
> Key: HBASE-18214
> URL: https://issues.apache.org/jira/browse/HBASE-18214
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: 18214-1-1.txt, 18214-1-2.txt
>
>
> In my tests, I saw that folly::AtomicHashMap usage is not appropriate for 
> one, rather common use case. It'd become sort of unusable (inserts would 
> hang) after a bunch of inserts and erases. This hashmap is used to keep track 
> of call-Id after a connection is set up in the RPC layer (insert a 
> call-id/msg pair when an RPC is sent, and erase the pair when the 
> corresponding response is received). Here is a simple program that will 
> demonstrate the issue:
> {code}
> folly::AtomicHashMap<int, int> f(100);  // template args assumed <int, int>; stripped in the archive
> int i = 0;
> while (i < 100000) {  // loop bound assumed; truncated in the archive
>   try {
>     f.insert(i, 100);
>     LOG(INFO) << "Inserted " << i << "  " << f.size();
>     f.erase(i);
>     LOG(INFO) << "Deleted " << i << "  " << f.size();
>     i++;
>   } catch (const std::exception& e) {
>     LOG(INFO) << "Exception " << e.what();
>     break;
>   }
> }
> {code}
> After poking around a little bit, it is indeed called out as a limitation 
> here 
> https://github.com/facebook/folly/blob/master/folly/docs/AtomicHashMap.md 
> (grep for 'erase'). The proposal is to replace this with something that fits 
> the above use case (thinking of using std::unordered_map).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18214) Replace the folly::AtomicHashMap usage in the RPC layer

2017-06-19 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054931#comment-16054931
 ] 

Devaraj Das commented on HBASE-18214:
-

bq. Instead of std::unique_ptr<std::shared_timed_mutex> mutex_; use 
std::shared_timed_mutex mutex_;
The issue is that shared_timed_mutex is (rightfully) not copyable. So one would 
have to define explicit copy constructors in the classes where it is used... 
It's not clear what such a copy constructor would do anyway (mutexes are 
fundamentally not safely copyable). Treating it as a pointer in class 
declarations works around this, since all the instances in question would refer 
to the same underlying mutex. But maybe we should declare it to be a shared_ptr 
as opposed to a unique_ptr. In the usecase we have, we probably will not run 
into issues with either anyway, I guess.

> Replace the folly::AtomicHashMap usage in the RPC layer
> ---
>
> Key: HBASE-18214
> URL: https://issues.apache.org/jira/browse/HBASE-18214
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: 18214-1-1.txt, 18214-1-2.txt
>
>
> In my tests, I saw that folly::AtomicHashMap usage is not appropriate for 
> one, rather common use case. It'd become sort of unusable (inserts would 
> hang) after a bunch of inserts and erases. This hashmap is used to keep track 
> of call-Id after a connection is set up in the RPC layer (insert a 
> call-id/msg pair when an RPC is sent, and erase the pair when the 
> corresponding response is received). Here is a simple program that will 
> demonstrate the issue:
> {code}
> folly::AtomicHashMap<int, int> f(100);  // template args assumed <int, int>; stripped in the archive
> int i = 0;
> while (i < 100000) {  // loop bound assumed; truncated in the archive
>   try {
>     f.insert(i, 100);
>     LOG(INFO) << "Inserted " << i << "  " << f.size();
>     f.erase(i);
>     LOG(INFO) << "Deleted " << i << "  " << f.size();
>     i++;
>   } catch (const std::exception& e) {
>     LOG(INFO) << "Exception " << e.what();
>     break;
>   }
> }
> {code}
> After poking around a little bit, it is indeed called out as a limitation 
> here 
> https://github.com/facebook/folly/blob/master/folly/docs/AtomicHashMap.md 
> (grep for 'erase'). The proposal is to replace this with something that fits 
> the above use case (thinking of using std::unordered_map).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18214) Replace the folly::AtomicHashMap usage in the RPC layer

2017-06-19 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-18214:

Attachment: 18214-1-2.txt

Attached patch makes the unordered_map usage slightly abstract, since there are 
two places where similar functionality is required. A wrapper class MapUtil 
encapsulates the behavior we need. This doesn't compile yet; the C++ template 
syntax needs to be dealt with.
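
For illustration, a rough shape such a wrapper could take (the MapUtil in the 
attached patch may well differ):
{code}
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

// Sketch: an unordered_map guarded by a shared_timed_mutex, exposing just
// the operations the RPC layer needs (insert on send, erase on response).
template <typename K, typename V>
class MapUtil {
 public:
  void insert(const K& k, const V& v) {
    std::unique_lock<std::shared_timed_mutex> lock(mutex_);
    map_.emplace(k, v);
  }

  void erase(const K& k) {
    std::unique_lock<std::shared_timed_mutex> lock(mutex_);
    map_.erase(k);
  }

  bool find(const K& k, V* out) const {
    std::shared_lock<std::shared_timed_mutex> lock(mutex_);
    auto it = map_.find(k);
    if (it == map_.end()) return false;
    *out = it->second;
    return true;
  }

 private:
  mutable std::shared_timed_mutex mutex_;
  std::unordered_map<K, V> map_;
};
{code}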

> Replace the folly::AtomicHashMap usage in the RPC layer
> ---
>
> Key: HBASE-18214
> URL: https://issues.apache.org/jira/browse/HBASE-18214
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: 18214-1-1.txt, 18214-1-2.txt
>
>
> In my tests, I saw that folly::AtomicHashMap usage is not appropriate for 
> one, rather common use case. It'd become sort of unusable (inserts would 
> hang) after a bunch of inserts and erases. This hashmap is used to keep track 
> of call-Id after a connection is set up in the RPC layer (insert a 
> call-id/msg pair when an RPC is sent, and erase the pair when the 
> corresponding response is received). Here is a simple program that will 
> demonstrate the issue:
> {code}
> folly::AtomicHashMap<int, int> f(100);  // template args assumed <int, int>; stripped in the archive
> int i = 0;
> while (i < 100000) {  // loop bound assumed; truncated in the archive
>   try {
>     f.insert(i, 100);
>     LOG(INFO) << "Inserted " << i << "  " << f.size();
>     f.erase(i);
>     LOG(INFO) << "Deleted " << i << "  " << f.size();
>     i++;
>   } catch (const std::exception& e) {
>     LOG(INFO) << "Exception " << e.what();
>     break;
>   }
> }
> {code}
> After poking around a little bit, it is indeed called out as a limitation 
> here 
> https://github.com/facebook/folly/blob/master/folly/docs/AtomicHashMap.md 
> (grep for 'erase'). The proposal is to replace this with something that fits 
> the above use case (thinking of using std::unordered_map).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18214) Replace the folly::AtomicHashMap usage in the RPC layer

2017-06-19 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054658#comment-16054658
 ] 

Devaraj Das commented on HBASE-18214:
-

Yeah, you are right [~enis]. I need to make a similar change for {{requests_}}.. 
Patch coming up in 5-10.

> Replace the folly::AtomicHashMap usage in the RPC layer
> ---
>
> Key: HBASE-18214
> URL: https://issues.apache.org/jira/browse/HBASE-18214
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: 18214-1-1.txt
>
>
> In my tests, I saw that folly::AtomicHashMap usage is not appropriate for 
> one, rather common use case. It'd become sort of unusable (inserts would 
> hang) after a bunch of inserts and erases. This hashmap is used to keep track 
> of call-Id after a connection is set up in the RPC layer (insert a 
> call-id/msg pair when an RPC is sent, and erase the pair when the 
> corresponding response is received). Here is a simple program that will 
> demonstrate the issue:
> {code}
> folly::AtomicHashMap<int, int> f(100);  // template args assumed <int, int>; stripped in the archive
> int i = 0;
> while (i < 100000) {  // loop bound assumed; truncated in the archive
>   try {
>     f.insert(i, 100);
>     LOG(INFO) << "Inserted " << i << "  " << f.size();
>     f.erase(i);
>     LOG(INFO) << "Deleted " << i << "  " << f.size();
>     i++;
>   } catch (const std::exception& e) {
>     LOG(INFO) << "Exception " << e.what();
>     break;
>   }
> }
> {code}
> After poking around a little bit, it is indeed called out as a limitation 
> here 
> https://github.com/facebook/folly/blob/master/folly/docs/AtomicHashMap.md 
> (grep for 'erase'). The proposal is to replace this with something that fits 
> the above use case (thinking of using std::unordered_map).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18214) Replace the folly::AtomicHashMap usage in the RPC layer

2017-06-17 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-18214:

Attachment: 18214-1-1.txt

Still running tests .. but I think it should be fine.

> Replace the folly::AtomicHashMap usage in the RPC layer
> ---
>
> Key: HBASE-18214
> URL: https://issues.apache.org/jira/browse/HBASE-18214
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: 18214-1-1.txt
>
>
> In my tests, I saw that folly::AtomicHashMap usage is not appropriate for 
> one, rather common use case. It'd become sort of unusable (inserts would 
> hang) after a bunch of inserts and erases. This hashmap is used to keep track 
> of call-Id after a connection is set up in the RPC layer (insert a 
> call-id/msg pair when an RPC is sent, and erase the pair when the 
> corresponding response is received). Here is a simple program that will 
> demonstrate the issue:
> {code}
> folly::AtomicHashMap<int, int> f(100);  // template args assumed <int, int>; stripped in the archive
> int i = 0;
> while (i < 100000) {  // loop bound assumed; truncated in the archive
>   try {
>     f.insert(i, 100);
>     LOG(INFO) << "Inserted " << i << "  " << f.size();
>     f.erase(i);
>     LOG(INFO) << "Deleted " << i << "  " << f.size();
>     i++;
>   } catch (const std::exception& e) {
>     LOG(INFO) << "Exception " << e.what();
>     break;
>   }
> }
> {code}
> After poking around a little bit, it is indeed called out as a limitation 
> here 
> https://github.com/facebook/folly/blob/master/folly/docs/AtomicHashMap.md 
> (grep for 'erase'). The proposal is to replace this with something that fits 
> the above use case (thinking of using std::unordered_map).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18214) Replace the folly::AtomicHashMap usage in the RPC layer

2017-06-13 Thread Devaraj Das (JIRA)
Devaraj Das created HBASE-18214:
---

 Summary: Replace the folly::AtomicHashMap usage in the RPC layer
 Key: HBASE-18214
 URL: https://issues.apache.org/jira/browse/HBASE-18214
 Project: HBase
  Issue Type: Sub-task
Reporter: Devaraj Das
Assignee: Devaraj Das


In my tests, I saw that folly::AtomicHashMap usage is not appropriate for one, 
rather common use case. It'd become sort of unusable (inserts would hang) after 
a bunch of inserts and erases. This hashmap is used to keep track of call-Id 
after a connection is set up in the RPC layer (insert a call-id/msg pair when 
an RPC is sent, and erase the pair when the corresponding response is 
received). Here is a simple program that will demonstrate the issue:
{code}
folly::AtomicHashMap<int, int> f(100);  // template args assumed <int, int>; stripped in the archive
int i = 0;
while (i < 100000) {  // loop bound assumed; truncated in the archive
  try {
    f.insert(i, 100);
    LOG(INFO) << "Inserted " << i << "  " << f.size();
    f.erase(i);
    LOG(INFO) << "Deleted " << i << "  " << f.size();
    i++;
  } catch (const std::exception& e) {
    LOG(INFO) << "Exception " << e.what();
    break;
  }
}
{code}
After poking around a little bit, it is indeed called out as a limitation here 
https://github.com/facebook/folly/blob/master/folly/docs/AtomicHashMap.md (grep 
for 'erase'). The proposal is to replace this with something that fits 
the above use case (thinking of using std::unordered_map).
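
For contrast, the same insert/erase workload against a mutex-guarded 
std::unordered_map runs to completion, since erased entries are genuinely 
reclaimed (a sketch, not code from a patch):
{code}
#include <iostream>
#include <mutex>
#include <unordered_map>

int main() {
  std::unordered_map<int, int> f;
  std::mutex m;  // the real RPC layer is concurrent, so guard the map
  for (int i = 0; i < 100000; i++) {
    std::lock_guard<std::mutex> lock(m);
    f.insert({i, 100});
    f.erase(i);
  }
  std::cout << "done, final size " << f.size() << std::endl;
  return 0;
}
{code}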



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-18005) read replica: handle the case that region server hosting both primary replica and meta region is down

2017-05-31 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031577#comment-16031577
 ] 

Devaraj Das edited comment on HBASE-18005 at 5/31/17 5:43 PM:
--

LGTM [~huaxiang]. Nice work.


was (Author: devaraj):
LGTM [~h...@cloudera.com]. Nice work.

> read replica: handle the case that region server hosting both primary replica 
> and meta region is down
> -
>
> Key: HBASE-18005
> URL: https://issues.apache.org/jira/browse/HBASE-18005
> Project: HBase
>  Issue Type: Bug
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: HBASE-18005-master-001.patch, 
> HBASE-18005-master-002.patch, HBASE-18005-master-003.patch, 
> HBASE-18005-master-004.patch, HBASE-18005-master-005.patch
>
>
> Identified one corner case in testing: when the region server hosting 
> both the primary replica and the meta region is down, and the client tries to 
> reload the primary replica's location from the meta table, it is supposed to 
> clean up only the cached location for the specific replicaId, but it clears 
> the caches for all replicas. Please see
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L813
> Since it takes some time for regions to be reassigned (including the meta 
> region), the following may throw an exception:
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L173
> This exception needs to be caught, and the client needs to use the cached 
> locations (in this case, the primary replica's location is not available). If 
> there are cached locations for other replicas, it can still go ahead and read 
> possibly stale values from the secondary replicas.
> With meta replicas, it still helps not to clean up the caches for all replicas, 
> as the info from the primary meta replica is up-to-date.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18005) read replica: handle the case that region server hosting both primary replica and meta region is down

2017-05-31 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031577#comment-16031577
 ] 

Devaraj Das commented on HBASE-18005:
-

LGTM [~h...@cloudera.com]. Nice work.

> read replica: handle the case that region server hosting both primary replica 
> and meta region is down
> -
>
> Key: HBASE-18005
> URL: https://issues.apache.org/jira/browse/HBASE-18005
> Project: HBase
>  Issue Type: Bug
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: HBASE-18005-master-001.patch, 
> HBASE-18005-master-002.patch, HBASE-18005-master-003.patch, 
> HBASE-18005-master-004.patch, HBASE-18005-master-005.patch
>
>
> Identified one corner case in testing: when the region server hosting 
> both the primary replica and the meta region is down, and the client tries to 
> reload the primary replica's location from the meta table, it is supposed to 
> clean up only the cached location for the specific replicaId, but it clears 
> the caches for all replicas. Please see
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L813
> Since it takes some time for regions to be reassigned (including the meta 
> region), the following may throw an exception:
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L173
> This exception needs to be caught, and the client needs to use the cached 
> locations (in this case, the primary replica's location is not available). If 
> there are cached locations for other replicas, it can still go ahead and read 
> possibly stale values from the secondary replicas.
> With meta replicas, it still helps not to clean up the caches for all replicas, 
> as the info from the primary meta replica is up-to-date.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18075) Support namespaces and tables with non-latin alphabetical characters

2017-05-19 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018182#comment-16018182
 ] 

Devaraj Das commented on HBASE-18075:
-

+1 .. Tested this on a real cluster

> Support namespaces and tables with non-latin alphabetical characters
> 
>
> Key: HBASE-18075
> URL: https://issues.apache.org/jira/browse/HBASE-18075
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 2.0.0
>
> Attachments: HBASE-18075.001.patch, HBASE-18075.002.patch
>
>
> On the heels of HBASE-18067, it would be nice to support namespaces and 
> tables with names that fall outside of Latin alphabetical characters and 
> numbers.
> Our current regex for allowable characters is approximately 
> {{\[a-zA-Z0-9\]+}}.
> It would be nice to replace {{a-zA-Z}} with Java's {{\p\{IsAlphabetic\}}} 
> which will naturally restrict the unicode character space down to just those 
> that are part of the alphabet for each script (e.g. latin, cyrillic, greek).
> Technically, our possible scope of allowable characters is, best as I can 
> tell, only limited by the limitations of ZooKeeper itself 
> https://zookeeper.apache.org/doc/r3.4.10/zookeeperProgrammers.html#ch_zkDataModel
>  (as both table and namespace are created as znodes).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18067) Support a default converter for data read shell commands

2017-05-19 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018176#comment-16018176
 ] 

Devaraj Das commented on HBASE-18067:
-

+1 .. Tested this on a real cluster

> Support a default converter for data read shell commands
> 
>
> Key: HBASE-18067
> URL: https://issues.apache.org/jira/browse/HBASE-18067
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-18067.001.patch, HBASE-18067.002.patch, 
> HBASE-18067.003.patch
>
>
> The {{get}} and {{scan}} shell commands have the ability to specify some 
> complicated syntax on how to encode the bytes read from HBase on a per-column 
> basis. By default, bytes falling outside of a limited range of ASCII are just 
> printed as hex.
> It seems like the intent of these converters was to support rendering 
> certain numeric columns as readable strings (e.g. 1234).
> However, if non-ascii encoded bytes are stored in the table (e.g. UTF-8 
> encoded bytes), we may want to treat all data we read as UTF-8 instead (e.g. 
> if row+column+value are in Chinese). It would be onerous to require users to 
> enumerate every column they're reading to parse as UTF-8 instead of the 
> limited ascii range. We can provide an option to encode all values retrieved 
> by the command.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17938) General fault - tolerance framework for backup/restore operations

2017-05-01 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991883#comment-15991883
 ] 

Devaraj Das commented on HBASE-17938:
-

I agree.. Let's keep it simple for now. Once we get more experience, we can 
enhance as needed.

> General fault - tolerance framework for backup/restore operations
> -
>
> Key: HBASE-17938
> URL: https://issues.apache.org/jira/browse/HBASE-17938
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
> Attachments: HBASE-17938-v1.patch, HBASE-17938-v2.patch, 
> HBASE-17938-v3.patch
>
>
> The framework must take care of all general types of failures during 
> backup/restore and restore the system to its original state in case of a 
> failure.
> That won't solve all the possible issues, but we have separate JIRAs for 
> them as sub-tasks of HBASE-15277.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939131#comment-15939131
 ] 

Devaraj Das commented on HBASE-17287:
-

The approach seems brittle - doing string checks on exceptions. I am hoping 
there is a better way to address it?

> Master becomes a zombie if filesystem object closes
> ---
>
> Key: HBASE-17287
> URL: https://issues.apache.org/jira/browse/HBASE-17287
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Clay B.
>Assignee: Ted Yu
> Attachments: 17287.master.v2.txt, 17287.v2.txt
>
>
> We have seen an issue whereby if HDFS is unstable and the HBase master's 
> HDFS client is unable to stabilize before 
> {{dfs.client.failover.max.attempts}}, then the master's filesystem object 
> closes. This seems to result in an HBase master which will continue to run 
> (the process and znode exist) but no meaningful work can be done (e.g. 
> assigning meta). What we saw in our HBase master logs was:
> {code}
> 2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
> java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
>   at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Filesystem closed
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-14123) HBase Backup/Restore Phase 2

2017-02-28 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1574#comment-1574
 ] 

Devaraj Das commented on HBASE-14123:
-

Thanks Stack :)
Hopefully, it is close to commit and thanks for the patience on seeing this 
through. [~vrodionov] please let's address the last set of comments/questions 
from Stack soon, and get onto the other blockers on this topic.

> HBase Backup/Restore Phase 2
> 
>
> Key: HBASE-14123
> URL: https://issues.apache.org/jira/browse/HBASE-14123
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Blocker
> Fix For: HBASE-7912
>
> Attachments: 14123-master.v14.txt, 14123-master.v15.txt, 
> 14123-master.v16.txt, 14123-master.v17.txt, 14123-master.v18.txt, 
> 14123-master.v19.txt, 14123-master.v20.txt, 14123-master.v21.txt, 
> 14123-master.v24.txt, 14123-master.v25.txt, 14123-master.v27.txt, 
> 14123-master.v28.txt, 14123-master.v29.full.txt, 14123-master.v2.txt, 
> 14123-master.v30.txt, 14123-master.v31.txt, 14123-master.v32.txt, 
> 14123-master.v33.txt, 14123-master.v34.txt, 14123-master.v35.txt, 
> 14123-master.v36.txt, 14123-master.v37.txt, 14123-master.v38.txt, 
> 14123.master.v39.patch, 14123-master.v3.txt, 14123.master.v40.patch, 
> 14123.master.v41.patch, 14123.master.v42.patch, 14123.master.v44.patch, 
> 14123.master.v45.patch, 14123.master.v46.patch, 14123.master.v48.patch, 
> 14123.master.v49.patch, 14123.master.v50.patch, 14123.master.v51.patch, 
> 14123.master.v52.patch, 14123.master.v54.patch, 14123.master.v56.patch, 
> 14123.master.v57.patch, 14123.master.v58.patch, 14123-master.v5.txt, 
> 14123-master.v6.txt, 14123-master.v7.txt, 14123-master.v8.txt, 
> 14123-master.v9.txt, 14123-v14.txt, Backup-restoreinHBase2.0 (1).pdf, 
> Backup-restoreinHBase2.0.pdf, HBASE-14123-for-7912-v1.patch, 
> HBASE-14123-for-7912-v6.patch, HBASE-14123-v10.patch, HBASE-14123-v11.patch, 
> HBASE-14123-v12.patch, HBASE-14123-v13.patch, HBASE-14123-v15.patch, 
> HBASE-14123-v16.patch, HBASE-14123-v1.patch, HBASE-14123-v2.patch, 
> HBASE-14123-v3.patch, HBASE-14123-v4.patch, HBASE-14123-v5.patch, 
> HBASE-14123-v6.patch, HBASE-14123-v7.patch, HBASE-14123-v9.patch
>
>
> Phase 2 umbrella JIRA. See HBASE-7912 for design document and description. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16850) Run large scale correctness tests for HBASE-14918 (in-memory flushes/compactions)

2017-02-27 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886794#comment-15886794
 ] 

Devaraj Das commented on HBASE-16850:
-

Hi [~ebortnik], this jira was about correctness rather than benchmark 
performance. Have you folks run the ITBLL tests yet? I was planning to but 
haven't been able to yet, and am not sure when I will be able to... 

> Run large scale correctness tests for HBASE-14918 (in-memory 
> flushes/compactions)
> -
>
> Key: HBASE-16850
> URL: https://issues.apache.org/jira/browse/HBASE-16850
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Devaraj Das
>Assignee: Devaraj Das
>Priority: Blocker
>
> As discussed here - 
> https://issues.apache.org/jira/browse/HBASE-16608?focusedCommentId=15577213&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15577213
> [~stack] [~anastas] [~ram_krish] [~anoop.hbase]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-14123) HBase Backup/Restore Phase 2

2017-02-27 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886600#comment-15886600
 ] 

Devaraj Das commented on HBASE-14123:
-

In the absence of backup configurations on the server side, we will be missing 
the relevant coprocessors, and hence if a client makes an RPC, it will probably 
be rejected by HBase. The question is how the backup client reacts to that, and 
whether it does so gracefully. [~vrodionov], that will be something to check. 
That seems to be the main sticking point in Stack's review.

> HBase Backup/Restore Phase 2
> 
>
> Key: HBASE-14123
> URL: https://issues.apache.org/jira/browse/HBASE-14123
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Blocker
> Fix For: HBASE-7912
>
> Attachments: 14123-master.v14.txt, 14123-master.v15.txt, 
> 14123-master.v16.txt, 14123-master.v17.txt, 14123-master.v18.txt, 
> 14123-master.v19.txt, 14123-master.v20.txt, 14123-master.v21.txt, 
> 14123-master.v24.txt, 14123-master.v25.txt, 14123-master.v27.txt, 
> 14123-master.v28.txt, 14123-master.v29.full.txt, 14123-master.v2.txt, 
> 14123-master.v30.txt, 14123-master.v31.txt, 14123-master.v32.txt, 
> 14123-master.v33.txt, 14123-master.v34.txt, 14123-master.v35.txt, 
> 14123-master.v36.txt, 14123-master.v37.txt, 14123-master.v38.txt, 
> 14123.master.v39.patch, 14123-master.v3.txt, 14123.master.v40.patch, 
> 14123.master.v41.patch, 14123.master.v42.patch, 14123.master.v44.patch, 
> 14123.master.v45.patch, 14123.master.v46.patch, 14123.master.v48.patch, 
> 14123.master.v49.patch, 14123.master.v50.patch, 14123.master.v51.patch, 
> 14123.master.v52.patch, 14123.master.v54.patch, 14123.master.v56.patch, 
> 14123.master.v57.patch, 14123.master.v58.patch, 14123-master.v5.txt, 
> 14123-master.v6.txt, 14123-master.v7.txt, 14123-master.v8.txt, 
> 14123-master.v9.txt, 14123-v14.txt, Backup-restoreinHBase2.0 (1).pdf, 
> Backup-restoreinHBase2.0.pdf, HBASE-14123-for-7912-v1.patch, 
> HBASE-14123-for-7912-v6.patch, HBASE-14123-v10.patch, HBASE-14123-v11.patch, 
> HBASE-14123-v12.patch, HBASE-14123-v13.patch, HBASE-14123-v15.patch, 
> HBASE-14123-v16.patch, HBASE-14123-v1.patch, HBASE-14123-v2.patch, 
> HBASE-14123-v3.patch, HBASE-14123-v4.patch, HBASE-14123-v5.patch, 
> HBASE-14123-v6.patch, HBASE-14123-v7.patch, HBASE-14123-v9.patch
>
>
> Phase 2 umbrella JIRA. See HBASE-7912 for design document and description. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17680) Run mini cluster through JNI in tests

2017-02-23 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882164#comment-15882164
 ] 

Devaraj Das commented on HBASE-17680:
-

I think for now it's fine to make the whole thing work via Buck (which assumes 
docker). For the Makefile-based builds, we can read JAVA_HOME, etc. It makes 
sense to write the Java wrapper for the HTU that combines various operations, 
and to write a thinner layer for the JNI.
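
For illustration, the native side of such a thin JNI layer might look roughly 
like this (the wrapper class name and method below are hypothetical; error 
handling is elided):
{code}
#include <jni.h>

// Call a hypothetical Java-side helper that wraps HBaseTestingUtility
// behind a single static entry point.
void StartMiniCluster(JNIEnv* env) {
  jclass cls = env->FindClass("org/apache/hadoop/hbase/MiniClusterWrapper");
  jmethodID mid = env->GetStaticMethodID(cls, "startMiniCluster", "()V");
  env->CallStaticVoidMethod(cls, mid);
}
{code}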

> Run mini cluster through JNI in tests
> -
>
> Key: HBASE-17680
> URL: https://issues.apache.org/jira/browse/HBASE-17680
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 17680.v1.txt, 17680.v3.txt, 17680.v8.txt
>
>
> Currently tests start local hbase cluster through hbase shell.
> There is less control over the configuration of the local cluster this way.
> This issue would replace hbase shell with JNI interface to mini cluster.
> We would have full control over the cluster behavior.
> Thanks to [~devaraj] who started this initiative.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17620) Move table to another group (add -migrateTertiary)

2017-02-09 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15860189#comment-15860189
 ] 

Devaraj Das commented on HBASE-17620:
-

I'd say let's look at this in the future..

> Move table to another group (add -migrateTertiary)
> --
>
> Key: HBASE-17620
> URL: https://issues.apache.org/jira/browse/HBASE-17620
> Project: HBase
>  Issue Type: Sub-task
>  Components: FavoredNodes
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0
>
>
> As part of the design document in HBASE-15531, we mentioned an approach 
> to move tables to a new group. First, only one favored node would be moved to 
> the new group using something like an "rpm -migrateTertiary" command (this 
> will be something else, since RPM will be deprecated). Once enough locality 
> has built up on the tertiary nodes, the table can be moved to the group.
> In my experience, the likelihood of tables moving across groups is rare, and 
> there is a brief amount of time when one of the FNs will belong to another 
> group, and stuff like that. When regions split, we also have to consider this 
> situation and generate one FN from the target (or tertiary's) group.
> Is this feature required? Do we need this as a start? I can attach tentative 
> patches, and we could reconsider this in the future if we don't need it as a 
> start.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17018) Spooling BufferedMutator

2016-12-19 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15762388#comment-15762388
 ] 

Devaraj Das commented on HBASE-17018:
-

Folks, let's consider putting the Spool* implementations outside HBase. The 
refactor of the Buffered* classes is fine IMO but I'd hesitate to open this 
feature up for general consumption yet.

> Spooling BufferedMutator
> 
>
> Key: HBASE-17018
> URL: https://issues.apache.org/jira/browse/HBASE-17018
> Project: HBase
>  Issue Type: New Feature
>Reporter: Joep Rottinghuis
> Attachments: HBASE-17018.master.001.patch, 
> HBASE-17018.master.002.patch, HBASE-17018.master.003.patch, 
> HBASE-17018SpoolingBufferedMutatorDesign-v1.pdf, YARN-4061 HBase requirements 
> for fault tolerant writer.pdf
>
>
> For Yarn Timeline Service v2 we use HBase as a backing store.
> A big concern we would like to address is what to do if HBase is 
> (temporarily) down, for example in case of an HBase upgrade.
> Most of the high volume writes will be mostly on a best-effort basis, but 
> occasionally we do a flush. Mainly during application lifecycle events, 
> clients will call a flush on the timeline service API. In order to handle the 
> volume of writes we use a BufferedMutator. When flush gets called on our API, 
> we in turn call flush on the BufferedMutator.
> We would like our interface to HBase to be able to spool the mutations to a 
> filesystem in case of HBase errors. If we use the Hadoop filesystem 
> interface, this can then be HDFS, GCS, S3, or any other distributed storage. 
> The mutations can then later be re-played, for example through a MapReduce 
> job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17294) External Configuration for Memory Compaction

2016-12-19 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15762377#comment-15762377
 ] 

Devaraj Das commented on HBASE-17294:
-

[~eshcar], from your release note, it seems BASIC is beneficial in all cases. 
Can you please elaborate? Do you have benchmarks around this to prove it? And 
are there benchmarks for the EAGER policy as well? Pardon me if I have missed 
it, but it doesn't show up here: 
https://docs.google.com/document/d/10k4hqi4mCCpVrPodp1Q4rV0XsZ4TZtULOa-FOoe-_hw/edit
This is a lot of code (and thanks for the deep work), and I'd definitely want 
to see benchmarks / results / benefits before we can ship this in a release.

> External Configuration for Memory Compaction 
> -
>
> Key: HBASE-17294
> URL: https://issues.apache.org/jira/browse/HBASE-17294
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
> Fix For: 2.0.0
>
> Attachments: HBASE-17294-V01.patch, HBASE-17294-V02.patch, 
> HBASE-17294-V03.patch
>
>
> We would like to have a single external knob to control memstore compaction.
> Possible memstore compaction policies are none, basic, and eager.
> This sub-task allows setting this property at the column family level at table 
> creation time:
> {code}
> create '<table_name>',
>   {NAME => '<cf_name>', IN_MEMORY_COMPACTION => '<NONE|BASIC|EAGER>'}
> {code}
> or to set this at the global configuration level by setting the property in 
> hbase-site.xml, with BASIC being the default value:
> {code}
> <property>
>   <name>hbase.hregion.compacting.memstore.type</name>
>   <value>NONE|BASIC|EAGER</value>
> </property>
> {code}
> The values used in this property can change as memstore compaction policies 
> evolve over time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17333) HBASE-17294 always ensures CompactingMemstore is default

2016-12-19 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15762082#comment-15762082
 ] 

Devaraj Das commented on HBASE-17333:
-

Guys, wondering if we have perf tests that prove BASIC is performant/beneficial 
in all cases (as release-noted in HBASE-17294). If so, why isn't it the 
default? Also, this doc 
https://docs.google.com/document/d/10k4hqi4mCCpVrPodp1Q4rV0XsZ4TZtULOa-FOoe-_hw/edit
 should be brought up to date w.r.t. the developments happening.

> HBASE-17294 always ensures CompactingMemstore is default
> 
>
> Key: HBASE-17333
> URL: https://issues.apache.org/jira/browse/HBASE-17333
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-17333.patch, HBASE-17333_1.patch, 
> HBASE-17333_2.patch
>
>
> Was the purpose of HBASE-17294 to make CompactingMemstore the default? I am 
> not sure about that. But that patch makes the DefaultMemstore a no-op. This 
> JIRA is to discuss and revert back to the default memstore unless the family 
> is configured for in-memory flush/compaction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17101) FavoredNodes should not apply to system tables

2016-12-13 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15746379#comment-15746379
 ] 

Devaraj Das commented on HBASE-17101:
-

+1

> FavoredNodes should not apply to system tables
> --
>
> Key: HBASE-17101
> URL: https://issues.apache.org/jira/browse/HBASE-17101
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0
>
> Attachments: HBASE-17101.master.001.patch, 
> HBASE-17101.master.002.patch, HBASE_17101_rough_draft.patch
>
>
> As described in the doc (see HBASE-15531), we would like to start with user 
> tables for favored nodes. This task ensures FN does not apply to system 
> tables.
> System tables are in memory and won't benefit from favored nodes. Since we 
> also maintain FN information for user regions in meta, it helps to keep the 
> implementation simpler by ignoring system tables for the first iterations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17107) FN info should be cleaned up on region/table cleanup

2016-12-13 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15746355#comment-15746355
 ] 

Devaraj Das commented on HBASE-17107:
-

+1

> FN info should be cleaned up on region/table cleanup
> 
>
> Key: HBASE-17107
> URL: https://issues.apache.org/jira/browse/HBASE-17107
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0
>
> Attachments: HBASE-17107.master.001.patch, HBASE_17107.draft.patch
>
>
> FN info should be cleaned up when a table is deleted and when regions are GCed 
> (i.e. CatalogJanitor).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17100) Implement Chore to sync FN info from Master to RegionServers

2016-12-08 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734624#comment-15734624
 ] 

Devaraj Das commented on HBASE-17100:
-

[~thiruvel] which RPC failure are you worried about? I just checked the design 
doc as well, and didn't see a reference to this chore.

> Implement Chore to sync FN info from Master to RegionServers
> 
>
> Key: HBASE-17100
> URL: https://issues.apache.org/jira/browse/HBASE-17100
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0
>
> Attachments: HBASE-17100.master.001.patch, HBASE_17100_draft.patch
>
>
> The master will have a repair chore which will periodically sync FN information 
> from the master to all the region servers. This will protect against RPC failures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14123) HBase Backup/Restore Phase 2

2016-12-08 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734494#comment-15734494
 ] 

Devaraj Das commented on HBASE-14123:
-

[~stack], I just did a skim of the RB - your comments and [~vrodionov]'s 
responses. They seemed okay to me. You had also left a bunch of comments on this 
jira (above), and it seems [~vrodionov] opened follow-up jiras for those and 
responded to your comments with the jira numbers inline. I didn't cover each and 
every comment, but overall it seemed okay to me.
So, in summary, what changed in the megapatch is responded to inline in the RB 
and in the comments above. Does this work for you, [~stack]?

> HBase Backup/Restore Phase 2
> 
>
> Key: HBASE-14123
> URL: https://issues.apache.org/jira/browse/HBASE-14123
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: 14123-master.v14.txt, 14123-master.v15.txt, 
> 14123-master.v16.txt, 14123-master.v17.txt, 14123-master.v18.txt, 
> 14123-master.v19.txt, 14123-master.v2.txt, 14123-master.v20.txt, 
> 14123-master.v21.txt, 14123-master.v24.txt, 14123-master.v25.txt, 
> 14123-master.v27.txt, 14123-master.v28.txt, 14123-master.v29.full.txt, 
> 14123-master.v3.txt, 14123-master.v30.txt, 14123-master.v31.txt, 
> 14123-master.v32.txt, 14123-master.v33.txt, 14123-master.v34.txt, 
> 14123-master.v35.txt, 14123-master.v36.txt, 14123-master.v37.txt, 
> 14123-master.v38.txt, 14123-master.v5.txt, 14123-master.v6.txt, 
> 14123-master.v7.txt, 14123-master.v8.txt, 14123-master.v9.txt, 14123-v14.txt, 
> 14123.master.v39.patch, 14123.master.v40.patch, 
> HBASE-14123-for-7912-v1.patch, HBASE-14123-for-7912-v6.patch, 
> HBASE-14123-v1.patch, HBASE-14123-v10.patch, HBASE-14123-v11.patch, 
> HBASE-14123-v12.patch, HBASE-14123-v13.patch, HBASE-14123-v15.patch, 
> HBASE-14123-v16.patch, HBASE-14123-v2.patch, HBASE-14123-v3.patch, 
> HBASE-14123-v4.patch, HBASE-14123-v5.patch, HBASE-14123-v6.patch, 
> HBASE-14123-v7.patch, HBASE-14123-v9.patch
>
>
> Phase 2 umbrella JIRA. See HBASE-7912 for design document and description. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14123) HBase Backup/Restore Phase 2

2016-12-08 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734140#comment-15734140
 ] 

Devaraj Das commented on HBASE-14123:
-

Assuming Stack made all the comments on RB, Vladimir, can you just respond to 
them there, so all the responses are in one place?

> HBase Backup/Restore Phase 2
> 
>
> Key: HBASE-14123
> URL: https://issues.apache.org/jira/browse/HBASE-14123
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: 14123-master.v14.txt, 14123-master.v15.txt, 
> 14123-master.v16.txt, 14123-master.v17.txt, 14123-master.v18.txt, 
> 14123-master.v19.txt, 14123-master.v2.txt, 14123-master.v20.txt, 
> 14123-master.v21.txt, 14123-master.v24.txt, 14123-master.v25.txt, 
> 14123-master.v27.txt, 14123-master.v28.txt, 14123-master.v29.full.txt, 
> 14123-master.v3.txt, 14123-master.v30.txt, 14123-master.v31.txt, 
> 14123-master.v32.txt, 14123-master.v33.txt, 14123-master.v34.txt, 
> 14123-master.v35.txt, 14123-master.v36.txt, 14123-master.v37.txt, 
> 14123-master.v38.txt, 14123-master.v5.txt, 14123-master.v6.txt, 
> 14123-master.v7.txt, 14123-master.v8.txt, 14123-master.v9.txt, 14123-v14.txt, 
> 14123.master.v39.patch, 14123.master.v40.patch, 
> HBASE-14123-for-7912-v1.patch, HBASE-14123-for-7912-v6.patch, 
> HBASE-14123-v1.patch, HBASE-14123-v10.patch, HBASE-14123-v11.patch, 
> HBASE-14123-v12.patch, HBASE-14123-v13.patch, HBASE-14123-v15.patch, 
> HBASE-14123-v16.patch, HBASE-14123-v2.patch, HBASE-14123-v3.patch, 
> HBASE-14123-v4.patch, HBASE-14123-v5.patch, HBASE-14123-v6.patch, 
> HBASE-14123-v7.patch, HBASE-14123-v9.patch
>
>
> Phase 2 umbrella JIRA. See HBASE-7912 for design document and description. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16956) Refactor FavoredNodePlan to use regionNames as keys

2016-11-17 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-16956:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed. Thanks, Thiruvel.

> Refactor FavoredNodePlan to use regionNames as keys
> ---
>
> Key: HBASE-16956
> URL: https://issues.apache.org/jira/browse/HBASE-16956
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-16956.branch-1.001.patch, 
> HBASE-16956.master.001.patch, HBASE-16956.master.002.patch, 
> HBASE-16956.master.003.patch, HBASE-16956.master.004.patch, 
> HBASE-16956.master.005.patch, HBASE-16956.master.006.patch, 
> HBASE-16956.master.007.patch
>
>
> We would like to rely on the FNPlan cache whether a region is offline or not. 
> Sticking to regionNames as keys makes that possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16956) Refactor FavoredNodePlan to use regionNames as keys

2016-11-15 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15669261#comment-15669261
 ] 

Devaraj Das commented on HBASE-16956:
-

[~thiruvel] the last patch seems to be unrelated to this jira. But I took a 
look at the one you uploaded before that. Unless you want to update it and 
resubmit, and if there are no objections, I'll commit it tomorrow.

> Refactor FavoredNodePlan to use regionNames as keys
> ---
>
> Key: HBASE-16956
> URL: https://issues.apache.org/jira/browse/HBASE-16956
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-16956.branch-1.001.patch, 
> HBASE-16956.master.001.patch, HBASE-16956.master.002.patch, 
> HBASE-16956.master.003.patch, HBASE-16956.master.004.patch, 
> HBASE-16956.master.005.patch, HBASE-16956.master.006.patch
>
>
> We would like to rely on the FNPlan cache whether a region is offline or not. 
> Sticking to regionNames as keys makes that possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14141) HBase Backup/Restore Phase 3: Filter WALs on backup to include only edits from backup tables

2016-11-08 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15650007#comment-15650007
 ] 

Devaraj Das commented on HBASE-14141:
-

[~vrodionov] is there a way in which we can reliably identify the WALs that 
need to be kept track of when a request for an incremental backup comes in? 
For example, if there are multiple WALs and some of them are part of a backup 
set while others are not, can we make that distinction and store the metadata 
about it? We could stay out of the business of doing anything WAL-specific 
altogether, but we should just be able to query some metadata state about 
which specific WALs to back up when an incremental backup request arrives... 
The default implementation could return all the WALs currently present in the 
WAL directory, and they would all be backed up (this is what we have today).
I guess it's an extension to what you say above, but I am trying to see if 
things can be simplified. So I am proposing we flip it around and say that if 
someone wants to take backups, they need to plan for it and set up the WALs 
appropriately.
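As a rough sketch of the kind of pluggable hook this could be (the interface 
and class names here are hypothetical, not from any patch):
{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical hook: the backup code asks which WALs an incremental backup
 * should copy, and stays out of anything WAL-specific itself.
 */
interface BackupWalSelector {
  List<Path> selectWalsForBackup(String backupSet) throws IOException;
}

/** Default behavior, matching what we have today: back up every WAL present. */
class AllWalsSelector implements BackupWalSelector {
  private final FileSystem fs;
  private final Path walDir;

  AllWalsSelector(FileSystem fs, Path walDir) {
    this.fs = fs;
    this.walDir = walDir;
  }

  @Override
  public List<Path> selectWalsForBackup(String backupSet) throws IOException {
    List<Path> wals = new ArrayList<>();
    for (FileStatus status : fs.listStatus(walDir)) {
      wals.add(status.getPath()); // no filtering: every WAL gets backed up
    }
    return wals;
  }
}
{code}
A backup-set-aware implementation would consult the stored metadata instead of 
listing the whole directory.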

> HBase Backup/Restore Phase 3: Filter WALs on backup to include only edits 
> from backup tables
> 
>
> Key: HBASE-14141
> URL: https://issues.apache.org/jira/browse/HBASE-14141
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading

2016-11-04 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638081#comment-15638081
 ] 

Devaraj Das commented on HBASE-14417:
-

A summary of some internal discussions on the high-level flow that doesn't use 
ZK...
1. Client updates the hbase:backup table with the set of paths that are to be 
bulkloaded (if the tables in question have been fully backed up at least once 
in the past).
2. Client performs the bulkload of the data. If the client fails before the 
bulkload is fully complete, the cleaner chore in (5) takes care of cleaning up 
the unneeded entries from hbase:backup.
3. There is an HFileCleaner that makes sure the paths that came about due to 
(1) are held until the next incremental backup (see the sketch below).
4. As part of the incremental backup, the hbase:backup table is updated to 
reflect the location where the earlier bulkloaded file got copied to.
5. A chore runs periodically (in the BackupController) that eliminates entries 
from the hbase:backup table whose corresponding paths still don't exist in the 
filesystem after a configured time period (default, say, 24 hours; the 
bulkload timeout is assumed to be much smaller than this, and hence all 
bulkloads that are meant to complete successfully would have completed).
Thoughts?
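To make step 3 concrete, here is a minimal sketch of such a cleaner plugin; 
the metadata lookup against hbase:backup is an assumed interface made up for 
illustration, not a real API:
{code}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.master.cleaner.BaseHFileCleanerDelegate;

/**
 * Sketch of step 3: keep bulkloaded HFiles that hbase:backup still lists as
 * pending, so the archiver cannot remove them before the next incremental
 * backup copies them out.
 */
public class PendingBackupHFileCleaner extends BaseHFileCleanerDelegate {

  /** Assumed view over the hbase:backup entries written in step 1. */
  interface BackupMeta {
    boolean isPendingBackup(Path hfile);
  }

  private BackupMeta meta;

  void setBackupMeta(BackupMeta meta) {
    this.meta = meta;
  }

  @Override
  public boolean isFileDeletable(FileStatus fStat) {
    // Deletable only if no pending hbase:backup entry references the file.
    return meta == null || !meta.isPendingBackup(fStat.getPath());
  }
}
{code}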

> Incremental backup and bulk loading
> ---
>
> Key: HBASE-14417
> URL: https://issues.apache.org/jira/browse/HBASE-14417
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: Vladimir Rodionov
>Assignee: Ted Yu
>Priority: Critical
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: 14417.v1.txt, 14417.v11.txt, 14417.v13.txt, 
> 14417.v2.txt, 14417.v21.txt, 14417.v23.txt, 14417.v24.txt, 14417.v25.txt, 
> 14417.v6.txt
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading 
> bypasses WALs for obvious reasons, breaking incremental backups. The only way 
> to continue backups after bulk loading is to create new full backup of a 
> table. This may not be feasible for customers who do bulk loading regularly 
> (say, every day).
> Google doc for design:
> https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16956) Refactor FavoredNodePlan to use regionNames as keys

2016-10-27 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613500#comment-15613500
 ] 

Devaraj Das commented on HBASE-16956:
-

Sorry, just read the description... Yeah, the patch looks fine to me (pending QA).

> Refactor FavoredNodePlan to use regionNames as keys
> ---
>
> Key: HBASE-16956
> URL: https://issues.apache.org/jira/browse/HBASE-16956
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-16956.master.001.patch
>
>
> We would like to rely on the FNPlan cache whether a region is offline or not. 
> Sticking to regionNames as keys makes that possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16956) Refactor FavoredNodePlan to use regionNames as keys

2016-10-27 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613300#comment-15613300
 ] 

Devaraj Das commented on HBASE-16956:
-

[~thiruvel] not sure I understand the motivation for this change. Could you 
please shed some light? Thanks!

> Refactor FavoredNodePlan to use regionNames as keys
> ---
>
> Key: HBASE-16956
> URL: https://issues.apache.org/jira/browse/HBASE-16956
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-16956.master.001.patch
>
>
> We would like to rely on the FNPlan cache whether a region is offline or not. 
> Sticking to regionNames as keys makes that possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16414) Improve performance for RPC encryption with Apache Common Crypto

2016-10-21 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594277#comment-15594277
 ] 

Devaraj Das commented on HBASE-16414:
-

Np [~ram_krish]. Fine by me to commit the patch if everyone else is okay with 
it.

> Improve performance for RPC encryption with Apache Common Crypto
> 
>
> Key: HBASE-16414
> URL: https://issues.apache.org/jira/browse/HBASE-16414
> Project: HBase
>  Issue Type: Improvement
>  Components: IPC/RPC
>Affects Versions: 2.0.0
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HBASE-16414.001.patch, HBASE-16414.002.patch, 
> HBASE-16414.003.patch, HBASE-16414.004.patch, HBASE-16414.005.patch, 
> HBASE-16414.006.patch, HBASE-16414.007.patch, HBASE-16414.008.patch, 
> HBASE-16414.009.patch, HbaseRpcEncryptionWithCrypoto.docx
>
>
> Hbase RPC encryption is enabled by setting “hbase.rpc.protection” to 
> "privacy". With the token authentication, it utilized DIGEST-MD5 mechanisms 
> for secure authentication and data protection. For DIGEST-MD5, it uses DES, 
> 3DES or RC4 to do encryption and it is very slow, especially for Scan. This 
> will become the bottleneck of the RPC throughput.
> Apache Commons Crypto is a cryptographic library optimized with AES-NI. It 
> provides Java API for both cipher level and Java stream level. Developers can 
> use it to implement high performance AES encryption/decryption with the 
> minimum code and effort. Compare with the current implementation of 
> org.apache.hadoop.hbase.io.crypto.aes.AES, Crypto supports both JCE Cipher 
> and OpenSSL Cipher which is better performance than JCE Cipher. User can 
> configure the cipher type and the default is JCE Cipher.
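For context, a minimal sketch of enabling this protection level on the client 
side (the same key goes into hbase-site.xml on the servers; the cipher-type 
key added by this patch is not named in this thread, so it is omitted):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

/**
 * Sketch: "privacy" enables authentication plus wire encryption, which is
 * the path this patch speeds up.
 */
public class RpcPrivacyExample {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rpc.protection", "privacy");
  }
}
{code}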



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15532) core favored nodes enhancements

2016-10-20 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593764#comment-15593764
 ] 

Devaraj Das commented on HBASE-15532:
-

Thanks for the patch, [~thiruvel]. I started reviewing it but realized that it's 
somewhat difficult to keep track of the various changes and why something is 
done the way it is. Can you please help here? I'd like the patch to be broken 
up into chunks for review:
1. The split and merge code paths
2. The Balancer related enhancements
3. The various tools (and here also it should be broken up).


> core favored nodes enhancements
> ---
>
> Key: HBASE-15532
> URL: https://issues.apache.org/jira/browse/HBASE-15532
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Francis Liu
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0
>
> Attachments: HBASE-15532.master.000.patch, 
> HBASE-15532.master.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16850) Run large scale correctness tests for HBASE-14918 (in-memory flushes/compactions)

2016-10-14 Thread Devaraj Das (JIRA)
Devaraj Das created HBASE-16850:
---

 Summary: Run large scale correctness tests for HBASE-14918 
(in-memory flushes/compactions)
 Key: HBASE-16850
 URL: https://issues.apache.org/jira/browse/HBASE-16850
 Project: HBase
  Issue Type: Sub-task
Reporter: Devaraj Das
Priority: Blocker


As discussed here - 
https://issues.apache.org/jira/browse/HBASE-16608?focusedCommentId=15577213=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15577213
[~stack] [~anastas] [~ram_krish] [~anoop.hbase]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16849) Document for HBASE-14918 (in-memory flushes/compactions)

2016-10-14 Thread Devaraj Das (JIRA)
Devaraj Das created HBASE-16849:
---

 Summary: Document for HBASE-14918 (in-memory flushes/compactions)
 Key: HBASE-16849
 URL: https://issues.apache.org/jira/browse/HBASE-16849
 Project: HBase
  Issue Type: Sub-task
Reporter: Devaraj Das
Assignee: Anastasia Braginsky
Priority: Blocker


As discussed here - 
https://issues.apache.org/jira/browse/HBASE-16608?focusedCommentId=15577213=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15577213
[~stack] [~anoop.hbase] [~ram_krish]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16608) Introducing the ability to merge ImmutableSegments without copy-compaction or SQM usage

2016-10-14 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577213#comment-15577213
 ] 

Devaraj Das commented on HBASE-16608:
-

Thanks for the clarifications, [~stack]. Echoing your thoughts that we really 
need to summarize all the great jira commentaries into user level stuff that 
people in the community can grok. Please please let's get that done with a high 
priority [~anastas]. I'll open a jira. The other thing we need to get to is the 
"correctness" test. I am sure the team is already doing that but need to run 
the usual ITBLL, etc. as well. I have the rig and could get to it - need a 
helping hand to make sure that the code path is exercised.

> Introducing the ability to merge ImmutableSegments without copy-compaction or 
> SQM usage
> ---
>
> Key: HBASE-16608
> URL: https://issues.apache.org/jira/browse/HBASE-16608
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>Assignee: Anastasia Braginsky
> Attachments: HBASE-16417-V02.patch, HBASE-16417-V04.patch, 
> HBASE-16417-V06.patch, HBASE-16417-V07.patch, HBASE-16417-V08.patch, 
> HBASE-16417-V10.patch, HBASE-16608-V01.patch, HBASE-16608-V03.patch, 
> HBASE-16608-V04.patch, HBASE-16608-V08.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16608) Introducing the ability to merge ImmutableSegments without copy-compaction or SQM usage

2016-10-14 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574526#comment-15574526
 ] 

Devaraj Das commented on HBASE-16608:
-

[~anastas] I have asked elsewhere as well. Asking here too... Are there any 
user-facing docs on this? Looking at all the patches that happened in this 
area, I wish there was a dev branch for this work before committing to master. 
I haven't followed this work closely enough, but looking at [~anoop.hbase]'s 
concern, I'd be -1 on getting this patch in.

> Introducing the ability to merge ImmutableSegments without copy-compaction or 
> SQM usage
> ---
>
> Key: HBASE-16608
> URL: https://issues.apache.org/jira/browse/HBASE-16608
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>Assignee: Anastasia Braginsky
> Attachments: HBASE-16417-V02.patch, HBASE-16417-V04.patch, 
> HBASE-16417-V06.patch, HBASE-16417-V07.patch, HBASE-16417-V08.patch, 
> HBASE-16417-V10.patch, HBASE-16608-V01.patch, HBASE-16608-V03.patch, 
> HBASE-16608-V04.patch, HBASE-16608-V08.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16727) Refactoring: move MR dependencies from HMaster

2016-09-28 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530653#comment-15530653
 ] 

Devaraj Das commented on HBASE-16727:
-

Right [~busbey]... The title is probably a tad misleading. It's about removing 
the dependency on the MR runtime from the master that the code in the backup 
branch introduced.

> Refactoring: move MR dependencies from HMaster
> --
>
> Key: HBASE-16727
> URL: https://issues.apache.org/jira/browse/HBASE-16727
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>
> * No MR jobs in HMaster
> * No proc2 implementation
> * client-driven backup-restore
> * basic security: only super user is allowed to run backup/restore



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HBASE-16727) Refactoring: move MR dependencies from HMaster

2016-09-28 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-16727:

Comment: was deleted

(was: Right [~busbey]... The title is probably a tad misleading. It's about 
removing the dependency on the MR runtime from the master that the code in the 
backup branch introduced.)

> Refactoring: move MR dependencies from HMaster
> --
>
> Key: HBASE-16727
> URL: https://issues.apache.org/jira/browse/HBASE-16727
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>
> * No MR jobs in HMaster
> * No proc2 implementation
> * client-driven backup-restore
> * basic security: only super user is allowed to run backup/restore



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16727) Refactoring: move MR dependencies from HMaster

2016-09-28 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530652#comment-15530652
 ] 

Devaraj Das commented on HBASE-16727:
-

Right [~busbey]... The title is probably a tad misleading. It's about removing 
the dependency on the MR runtime from the master that the code in the backup 
branch introduced.

> Refactoring: move MR dependencies from HMaster
> --
>
> Key: HBASE-16727
> URL: https://issues.apache.org/jira/browse/HBASE-16727
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>
> * No MR jobs in HMaster
> * No proc2 implementation
> * client-driven backup-restore
> * basic security: only super user is allowed to run backup/restore



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16721) Concurrency issue in WAL unflushed seqId tracking

2016-09-28 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530602#comment-15530602
 ] 

Devaraj Das commented on HBASE-16721:
-

In one occurrence of this issue, the region was actually non-existent. It was a 
region that was split into two regions, and the daughters were opened fine. But 
the flush continued to be blocked.

> Concurrency issue in WAL unflushed seqId tracking
> -
>
> Key: HBASE-16721
> URL: https://issues.apache.org/jira/browse/HBASE-16721
> Project: HBase
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 1.2.4
>
>
> I'm inspecting an interesting case where in a production cluster, some 
> regionservers ends up accumulating hundreds of WAL files, even with force 
> flushes going on due to max logs. This happened multiple times on the 
> cluster, but not on other clusters. The cluster has periodic memstore flusher 
> disabled, however, this still does not explain why the force flush of regions 
> due to max limit is not working. I think the periodic memstore flusher just 
> masks the underlying problem, which is why we do not see this in other 
> clusters. 
> The problem starts like this: 
> {code}
> 2016-09-21 17:49:18,272 INFO  [regionserver//10.2.0.55:16020.logRoller] 
> wal.FSHLog: Too many wals: logs=33, maxlogs=32; forcing flush of 1 
> regions(s): d4cf39dc40ea79f5da4d0cf66d03cb1f
> 2016-09-21 17:49:18,273 WARN  [regionserver//10.2.0.55:16020.logRoller] 
> regionserver.LogRoller: Failed to schedule flush of 
> d4cf39dc40ea79f5da4d0cf66d03cb1f, region=null, requester=null
> {code}
> then, it continues until the RS is restarted: 
> {code}
> 2016-09-23 17:43:49,356 INFO  [regionserver//10.2.0.55:16020.logRoller] 
> wal.FSHLog: Too many wals: logs=721, maxlogs=32; forcing flush of 1 
> regions(s): d4cf39dc40ea79f5da4d0cf66d03cb1f
> 2016-09-23 17:43:49,357 WARN  [regionserver//10.2.0.55:16020.logRoller] 
> regionserver.LogRoller: Failed to schedule flush of 
> d4cf39dc40ea79f5da4d0cf66d03cb1f, region=null, requester=null
> {code}
> The problem is that region {{d4cf39dc40ea79f5da4d0cf66d03cb1f}} is already 
> split some time ago, and was able to flush its data and split without any 
> problems. However, the FSHLog still thinks that there is some unflushed data 
> for this region. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16727) Refactoring: move MR dependencies from HMaster

2016-09-28 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530588#comment-15530588
 ] 

Devaraj Das commented on HBASE-16727:
-

Yeah, let's preserve the current hbase-7912 branch.

> Refactoring: move MR dependencies from HMaster
> --
>
> Key: HBASE-16727
> URL: https://issues.apache.org/jira/browse/HBASE-16727
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>
> * No MR jobs in HMaster
> * No proc2 implementation
> * client-driven backup-restore
> * basic security: only super user is allowed to run backup/restore



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16414) Improve performance for RPC encryption with Apache Common Crypto

2016-09-28 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15528705#comment-15528705
 ] 

Devaraj Das commented on HBASE-16414:
-

Nice work, [~colinma]. Quick question - are you handling the cases where the 
client and server are on different configs, and/or where only one side has the 
functionality introduced by this patch? Wondering about the compatibility 
story when the client and server sides are on different versions of hbase (one 
has the code and the other doesn't), and/or have different configurations with 
respect to this feature.

> Improve performance for RPC encryption with Apache Common Crypto
> 
>
> Key: HBASE-16414
> URL: https://issues.apache.org/jira/browse/HBASE-16414
> Project: HBase
>  Issue Type: Improvement
>  Components: IPC/RPC
>Affects Versions: 2.0.0
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HBASE-16414.001.patch, HBASE-16414.002.patch, 
> HBASE-16414.003.patch, HBASE-16414.004.patch, HBASE-16414.005.patch, 
> HbaseRpcEncryptionWithCrypoto.docx
>
>
> Hbase RPC encryption is enabled by setting “hbase.rpc.protection” to 
> "privacy". With the token authentication, it utilized DIGEST-MD5 mechanisms 
> for secure authentication and data protection. For DIGEST-MD5, it uses DES, 
> 3DES or RC4 to do encryption and it is very slow, especially for Scan. This 
> will become the bottleneck of the RPC throughput.
> Apache Commons Crypto is a cryptographic library optimized with AES-NI. It 
> provides Java API for both cipher level and Java stream level. Developers can 
> use it to implement high performance AES encryption/decryption with the 
> minimum code and effort. Compare with the current implementation of 
> org.apache.hadoop.hbase.io.crypto.aes.AES, Crypto supports both JCE Cipher 
> and OpenSSL Cipher which is better performance than JCE Cipher. User can 
> configure the cipher type and the default is JCE Cipher.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16604) Scanner retries on IOException can cause the scans to miss data

2016-09-19 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504754#comment-15504754
 ] 

Devaraj Das commented on HBASE-16604:
-

LGTM

> Scanner retries on IOException can cause the scans to miss data 
> 
>
> Key: HBASE-16604
> URL: https://issues.apache.org/jira/browse/HBASE-16604
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, Scanners
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 1.2.4
>
> Attachments: hbase-16604_v1.patch, hbase-16604_v2.patch, 
> hbase-16604_v3.patch
>
>
> Debugging an ITBLL failure, where the Verify did not "see" all the data in 
> the cluster, I've noticed that if we end up getting a generic IOException 
> from the HFileReader level, we may end up missing the rest of the data in the 
> region. I was able to manually test this, and this stack trace helps to 
> understand what is going on: 
> {code}
> 2016-09-09 16:27:15,633 INFO  [hconnection-0x71ad3d8a-shared--pool21-t9] 
> client.ScannerCallable(376): Open scanner=1 for 
> scan={"loadColumnFamiliesOnDemand":null,"startRow":"","stopRow":"","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":2097152,"families":{"testFamily":["testFamily"]},"caching":100,"maxVersions":1,"timeRange":[0,9223372036854775807]}
>  on region 
> region=testScanThrowsException,,1473463632707.b2adfb618e5d0fe225c1dc40c0eabfee.,
>  hostname=hw10676,51833,1473463626529, seqNum=2
> 2016-09-09 16:27:15,634 INFO  
> [B.fifo.QRpcServer.handler=5,queue=0,port=51833] 
> regionserver.RSRpcServices(2196): scan request:scanner_id: 1 number_of_rows: 
> 100 close_scanner: false next_call_seq: 0 client_handles_partials: true 
> client_handles_heartbeats: true renew: false
> 2016-09-09 16:27:15,635 INFO  
> [B.fifo.QRpcServer.handler=5,queue=0,port=51833] 
> regionserver.RSRpcServices(2510): Rolling back next call seqId
> 2016-09-09 16:27:15,635 INFO  
> [B.fifo.QRpcServer.handler=5,queue=0,port=51833] 
> regionserver.RSRpcServices(2565): Throwing new 
> ServiceExceptionjava.io.IOException: Could not reseek 
> StoreFileScanner[HFileScanner for reader 
> reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c,
>  compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, 
> currentSize=1567264, freeSize=1525578848, maxSize=1527146112, 
> heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, 
> multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false, firstKey=aaa/testFamily:testFamily/1473463633859/Put, 
> lastKey=zzz/testFamily:testFamily/1473463634271/Put, avgKeyLen=35, 
> avgValueLen=3, entries=17576, length=866998, 
> cur=/testFamily:/OLDEST_TIMESTAMP/Minimum/vlen=0/seqid=0] to key 
> /testFamily:testFamily/LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
> 2016-09-09 16:27:15,635 DEBUG 
> [B.fifo.QRpcServer.handler=5,queue=0,port=51833] ipc.CallRunner(110): 
> B.fifo.QRpcServer.handler=5,queue=0,port=51833: callId: 26 service: 
> ClientService methodName: Scan size: 26 connection: 192.168.42.75:51903
> java.io.IOException: Could not reseek StoreFileScanner[HFileScanner for 
> reader 
> reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c,
>  compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, 
> currentSize=1567264, freeSize=1525578848, maxSize=1527146112, 
> heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, 
> multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false, firstKey=aaa/testFamily:testFamily/1473463633859/Put, 
> lastKey=zzz/testFamily:testFamily/1473463634271/Put, avgKeyLen=35, 
> avgValueLen=3, entries=17576, length=866998, 
> cur=/testFamily:/OLDEST_TIMESTAMP/Minimum/vlen=0/seqid=0] to key 
> /testFamily:testFamily/LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:224)
>   at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55)
>   at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:312)
>   at 
> 

[jira] [Commented] (HBASE-16604) Scanner retries on IOException can cause the scans to miss data

2016-09-12 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486292#comment-15486292
 ] 

Devaraj Das commented on HBASE-16604:
-

Very good find, [~enis]. On the patch, just a thought - should we treat the 
ScannerResetException the same way as the UnknownScannerException (in terms of 
checking the timeout) in ClientScanner? That way, if the client's scanner 
timeout has expired, the client gets back an exception. Saying this because, 
if the IOException happened due to an underlying filesystem issue, the data 
might be unavailable for a longer duration (which might cause other, bigger 
issues, but still), and multiple retries may or may not help...
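Roughly the kind of check being suggested, as a sketch (the field names 
approximate ClientScanner's, and DoNotRetryIOException stands in for the 
actual timeout exception the client would raise; this is not the patch):
{code}
import org.apache.hadoop.hbase.DoNotRetryIOException;

/** Sketch: decide whether a scanner retry is still allowed. */
final class ScannerLeaseCheck {
  private ScannerLeaseCheck() {}

  /**
   * Invoked when a ScannerResetException (or, as today, an
   * UnknownScannerException) comes back: throw instead of retrying when the
   * client-side scanner lease has already expired.
   */
  static void throwIfLeaseExpired(Exception cause, long lastNextMillis,
      long scannerTimeoutMillis) throws DoNotRetryIOException {
    long elapsed = System.currentTimeMillis() - lastNextMillis;
    if (elapsed > scannerTimeoutMillis) {
      throw new DoNotRetryIOException(elapsed
          + "ms passed since the last invocation; scanner timeout is "
          + scannerTimeoutMillis + "ms", cause);
    }
    // Still within the lease: the caller may retry from the last row seen.
  }
}
{code}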

> Scanner retries on IOException can cause the scans to miss data 
> 
>
> Key: HBASE-16604
> URL: https://issues.apache.org/jira/browse/HBASE-16604
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, Scanners
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 1.2.4
>
> Attachments: hbase-16604_v1.patch
>
>
> Debugging an ITBLL failure, where the Verify did not "see" all the data in 
> the cluster, I've noticed that if we end up getting a generic IOException 
> from the HFileReader level, we may end up missing the rest of the data in the 
> region. I was able to manually test this, and this stack trace helps to 
> understand what is going on: 
> {code}
> 2016-09-09 16:27:15,633 INFO  [hconnection-0x71ad3d8a-shared--pool21-t9] 
> client.ScannerCallable(376): Open scanner=1 for 
> scan={"loadColumnFamiliesOnDemand":null,"startRow":"","stopRow":"","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":2097152,"families":{"testFamily":["testFamily"]},"caching":100,"maxVersions":1,"timeRange":[0,9223372036854775807]}
>  on region 
> region=testScanThrowsException,,1473463632707.b2adfb618e5d0fe225c1dc40c0eabfee.,
>  hostname=hw10676,51833,1473463626529, seqNum=2
> 2016-09-09 16:27:15,634 INFO  
> [B.fifo.QRpcServer.handler=5,queue=0,port=51833] 
> regionserver.RSRpcServices(2196): scan request:scanner_id: 1 number_of_rows: 
> 100 close_scanner: false next_call_seq: 0 client_handles_partials: true 
> client_handles_heartbeats: true renew: false
> 2016-09-09 16:27:15,635 INFO  
> [B.fifo.QRpcServer.handler=5,queue=0,port=51833] 
> regionserver.RSRpcServices(2510): Rolling back next call seqId
> 2016-09-09 16:27:15,635 INFO  
> [B.fifo.QRpcServer.handler=5,queue=0,port=51833] 
> regionserver.RSRpcServices(2565): Throwing new 
> ServiceExceptionjava.io.IOException: Could not reseek 
> StoreFileScanner[HFileScanner for reader 
> reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c,
>  compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, 
> currentSize=1567264, freeSize=1525578848, maxSize=1527146112, 
> heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, 
> multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false, firstKey=aaa/testFamily:testFamily/1473463633859/Put, 
> lastKey=zzz/testFamily:testFamily/1473463634271/Put, avgKeyLen=35, 
> avgValueLen=3, entries=17576, length=866998, 
> cur=/testFamily:/OLDEST_TIMESTAMP/Minimum/vlen=0/seqid=0] to key 
> /testFamily:testFamily/LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
> 2016-09-09 16:27:15,635 DEBUG 
> [B.fifo.QRpcServer.handler=5,queue=0,port=51833] ipc.CallRunner(110): 
> B.fifo.QRpcServer.handler=5,queue=0,port=51833: callId: 26 service: 
> ClientService methodName: Scan size: 26 connection: 192.168.42.75:51903
> java.io.IOException: Could not reseek StoreFileScanner[HFileScanner for 
> reader 
> reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c,
>  compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, 
> currentSize=1567264, freeSize=1525578848, maxSize=1527146112, 
> heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, 
> multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, 
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, 
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, 
> prefetchOnOpen=false, firstKey=aaa/testFamily:testFamily/1473463633859/Put, 
> lastKey=zzz/testFamily:testFamily/1473463634271/Put, avgKeyLen=35, 
> avgValueLen=3, entries=17576, length=866998, 
> 

[jira] [Commented] (HBASE-16255) Backup/Restore IT

2016-08-31 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451467#comment-15451467
 ] 

Devaraj Das commented on HBASE-16255:
-

No idea [~dspivak]. It could be various factors including the memory used by 
the underlying libraries. But I think we should move on at this point. 

> Backup/Restore IT
> -
>
> Key: HBASE-16255
> URL: https://issues.apache.org/jira/browse/HBASE-16255
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Attachments: 16255-addendum.3.txt, 16255.addendum, 16255.addendum2, 
> 16255.addendum4, 16255.addendum5, 16255.addendum6, HBASE-16255-v1.patch, 
> HBASE-16255-v2.patch, HBASE-16255-v3.patch, HBASE-16255-v4.patch, 
> HBASE-16255-v5.patch, HBASE-16255-v6.patch, backup-it-7912-8-30.out, 
> backup-it-8-30.out, backup-it-success.out
>
>
> Integration test for backup restore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16255) Backup/Restore IT

2016-08-31 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451449#comment-15451449
 ] 

Devaraj Das commented on HBASE-16255:
-

And the test came out successful. Mission accomplished.

> Backup/Restore IT
> -
>
> Key: HBASE-16255
> URL: https://issues.apache.org/jira/browse/HBASE-16255
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Attachments: 16255-addendum.3.txt, 16255.addendum, 16255.addendum2, 
> 16255.addendum4, 16255.addendum5, 16255.addendum6, HBASE-16255-v1.patch, 
> HBASE-16255-v2.patch, HBASE-16255-v3.patch, HBASE-16255-v4.patch, 
> HBASE-16255-v5.patch, HBASE-16255-v6.patch, backup-it-7912-8-30.out, 
> backup-it-8-30.out, backup-it-success.out
>
>
> Integration test for backup restore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-16255) Backup/Restore IT

2016-08-31 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451445#comment-15451445
 ] 

Devaraj Das edited comment on HBASE-16255 at 8/31/16 7:30 AM:
--

Sorry, I was off on my previous guess at the issue. I dug deeper, and this 
seems like a YARN issue to do with the minimum memory for the containers. The 
default heap size (-Xmx200m) for the tasks was too low. ([~dspivak], curious 
whether you ran other IT tests that do mapreduce, and did/didn't see this 
issue.)
I added the following in yarn-site.xml:
{noformat}
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
{noformat}

And in mapred-site.xml, added the following:
{noformat}
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
{noformat}
At the time of this writing, the test was still running (it proceeded beyond 
your failure point [~dspivak]). Fingers crossed.


was (Author: devaraj):
Sorry, I was off on my previous guess at the issue. I dug deeper and this seems 
like a yarn issue to do with the minimum memory for the containers.
The default heap size (-Xmx200m) for the tasks was too low ([~dspivak], 
curious, if you ran other IT tests that do mapreduce, and did/didn't see this 
issue)
I added the following in yarn-site.xml:
{noformat}
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
{noformat}

And in mapred-site.xml, added the following:
{noformat}
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
{noformat}
At the time of this writing, the test was still running (it proceeded beyond 
your failure point [~dspivak]). Fingers crossed.

> Backup/Restore IT
> -
>
> Key: HBASE-16255
> URL: https://issues.apache.org/jira/browse/HBASE-16255
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Attachments: 16255-addendum.3.txt, 16255.addendum, 16255.addendum2, 
> 16255.addendum4, 16255.addendum5, 16255.addendum6, HBASE-16255-v1.patch, 
> HBASE-16255-v2.patch, HBASE-16255-v3.patch, HBASE-16255-v4.patch, 
> HBASE-16255-v5.patch, HBASE-16255-v6.patch, backup-it-7912-8-30.out, 
> backup-it-8-30.out, backup-it-success.out
>
>
> Integration test for backup restore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16255) Backup/Restore IT

2016-08-31 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451445#comment-15451445
 ] 

Devaraj Das commented on HBASE-16255:
-

Sorry, I was off on my previous guess at the issue. I dug deeper, and this 
seems like a YARN issue to do with the minimum memory for the containers. The 
default heap size (-Xmx200m) for the tasks was too low. ([~dspivak], curious 
whether you ran other IT tests that do mapreduce, and did/didn't see this 
issue.)
I added the following in yarn-site.xml:
{noformat}
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
{noformat}

And in mapred-site.xml, added the following:
{noformat}
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
{noformat}
At the time of this writing, the test was still running (it proceeded beyond 
your failure point [~dspivak]). Fingers crossed.

> Backup/Restore IT
> -
>
> Key: HBASE-16255
> URL: https://issues.apache.org/jira/browse/HBASE-16255
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Attachments: 16255-addendum.3.txt, 16255.addendum, 16255.addendum2, 
> 16255.addendum4, 16255.addendum5, 16255.addendum6, HBASE-16255-v1.patch, 
> HBASE-16255-v2.patch, HBASE-16255-v3.patch, HBASE-16255-v4.patch, 
> HBASE-16255-v5.patch, HBASE-16255-v6.patch, backup-it-7912-8-30.out, 
> backup-it-8-30.out, backup-it-success.out
>
>
> Integration test for backup restore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-16255) Backup/Restore IT

2016-08-30 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451197#comment-15451197
 ] 

Devaraj Das edited comment on HBASE-16255 at 8/31/16 5:33 AM:
--

Ah, I think the issue is that in the docker env, all the hbase processes run as 
'root', and hence the backup dirs end up being created as 'root'. But at some 
point in the snapshot copy or something, the user switches to hbase and fails 
to perform the filesystem operations.
I bet if things are run as 'hbase' things will work. [~dspivak] can you please 
check the docker stuff to run the server processes in the more natural 
deployment style - hbase regionserver processes as hbase, etc..


was (Author: devaraj):
Ah, I think the issue is that in the docker env, all the hbase processes run as 
'root', and hence the backup dirs end up being created as 'root'. But at some 
point in the snapshot copy or something, the user switches to hbase and fails 
to perform the filesystem operations.
I bet if things are run as 'hbase' things will work. [~dspivak] can you please 
check the docker stuff to run the processes in the more natural deployment 
style - hbase processes as hbase, etc..

> Backup/Restore IT
> -
>
> Key: HBASE-16255
> URL: https://issues.apache.org/jira/browse/HBASE-16255
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Attachments: 16255-addendum.3.txt, 16255.addendum, 16255.addendum2, 
> 16255.addendum4, 16255.addendum5, 16255.addendum6, HBASE-16255-v1.patch, 
> HBASE-16255-v2.patch, HBASE-16255-v3.patch, HBASE-16255-v4.patch, 
> HBASE-16255-v5.patch, HBASE-16255-v6.patch, backup-it-7912-8-30.out, 
> backup-it-8-30.out, backup-it-success.out
>
>
> Integration test for backup restore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-16255) Backup/Restore IT

2016-08-30 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451197#comment-15451197
 ] 

Devaraj Das edited comment on HBASE-16255 at 8/31/16 5:33 AM:
--

Ah, I think the issue is that in the docker env, all the hbase processes run as 
'root', and hence the backup dirs end up being created as 'root'. But at some 
point in the snapshot copy or something, the user switches to hbase and fails 
to perform the filesystem operations.
I bet if things are run as 'hbase' things will work. [~dspivak] can you please 
check the docker stuff to run the server processes in the more natural 
deployment style - hbase regionserver/master processes as hbase, etc..


was (Author: devaraj):
Ah, I think the issue is that in the docker env, all the hbase processes run as 
'root', and hence the backup dirs end up being created as 'root'. But at some 
point in the snapshot copy or something, the user switches to hbase and fails 
to perform the filesystem operations.
I bet if things are run as 'hbase' things will work. [~dspivak] can you please 
check the docker stuff to run the server processes in the more natural 
deployment style - hbase regionserver processes as hbase, etc..

> Backup/Restore IT
> -
>
> Key: HBASE-16255
> URL: https://issues.apache.org/jira/browse/HBASE-16255
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Attachments: 16255-addendum.3.txt, 16255.addendum, 16255.addendum2, 
> 16255.addendum4, 16255.addendum5, 16255.addendum6, HBASE-16255-v1.patch, 
> HBASE-16255-v2.patch, HBASE-16255-v3.patch, HBASE-16255-v4.patch, 
> HBASE-16255-v5.patch, HBASE-16255-v6.patch, backup-it-7912-8-30.out, 
> backup-it-8-30.out, backup-it-success.out
>
>
> Integration test for backup restore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16255) Backup/Restore IT

2016-08-30 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451197#comment-15451197
 ] 

Devaraj Das commented on HBASE-16255:
-

Ah, I think the issue is that in the docker env, all the hbase processes run as 
'root', and hence the backup dirs end up being created as 'root'. But at some 
point in the snapshot copy or something, the user switches to hbase and fails 
to perform the filesystem operations.
I bet if things are run as 'hbase' things will work. [~dspivak] can you please 
check the docker stuff to run the processes in the more natural deployment 
style - hbase processes as hbase, etc..

> Backup/Restore IT
> -
>
> Key: HBASE-16255
> URL: https://issues.apache.org/jira/browse/HBASE-16255
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Attachments: 16255-addendum.3.txt, 16255.addendum, 16255.addendum2, 
> 16255.addendum4, 16255.addendum5, 16255.addendum6, HBASE-16255-v1.patch, 
> HBASE-16255-v2.patch, HBASE-16255-v3.patch, HBASE-16255-v4.patch, 
> HBASE-16255-v5.patch, HBASE-16255-v6.patch, backup-it-7912-8-30.out, 
> backup-it-8-30.out, backup-it-success.out
>
>
> Integration test for backup restore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16255) Backup/Restore IT

2016-08-30 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451179#comment-15451179
 ] 

Devaraj Das commented on HBASE-16255:
-

Interesting. The test creates the directory as "root" even if the test is run 
as "hbase". But later on, the snapshot fails with permission issues.
{noformat}
2016-08-31 04:34:56,399 INFO  [main] hbase.IntegrationTestBackupRestore: create 
full backup image for all tables
2016-08-31 04:34:56,584 INFO  [main] util.BackupClientUtil: Backup root dir 
hdfs://node-1.network21368:8020/user/hbase/test-data/6a446bec-cfae-409c-96c4-099a70bddc24/backupIT
 does not exist. Will be created.
{noformat}
When I run the command
{noformat}
[hbase@node-1 root]$ hadoop dfs -ls /user/hbase
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

16/08/31 04:35:18 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x   - root supergroup  0 2016-08-31 04:35 /user/hbase/test-data
{noformat}
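A minimal sketch of the manual workaround, assuming HDFS superuser privileges; 
the path and owner/group come from the log excerpt above and may differ per 
setup (note that setOwner is not recursive):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Sketch: hand the root-created test-data directory over to the hbase user. */
public class FixBackupDirOwner {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Must run as an HDFS superuser for setOwner to succeed.
    fs.setOwner(new Path("/user/hbase/test-data"), "hbase", "supergroup");
  }
}
{code}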

> Backup/Restore IT
> -
>
> Key: HBASE-16255
> URL: https://issues.apache.org/jira/browse/HBASE-16255
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Attachments: 16255-addendum.3.txt, 16255.addendum, 16255.addendum2, 
> 16255.addendum4, 16255.addendum5, 16255.addendum6, HBASE-16255-v1.patch, 
> HBASE-16255-v2.patch, HBASE-16255-v3.patch, HBASE-16255-v4.patch, 
> HBASE-16255-v5.patch, HBASE-16255-v6.patch, backup-it-7912-8-30.out, 
> backup-it-8-30.out, backup-it-success.out
>
>
> Integration test for backup restore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16255) Backup/Restore IT

2016-08-30 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450544#comment-15450544
 ] 

Devaraj Das commented on HBASE-16255:
-

OK, I am trying this thing out on my laptop. Will report back (if my laptop 
doesn't melt :) )

> Backup/Restore IT
> -
>
> Key: HBASE-16255
> URL: https://issues.apache.org/jira/browse/HBASE-16255
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Attachments: 16255-addendum.3.txt, 16255.addendum, 16255.addendum2, 
> 16255.addendum4, 16255.addendum5, 16255.addendum6, HBASE-16255-v1.patch, 
> HBASE-16255-v2.patch, HBASE-16255-v3.patch, HBASE-16255-v4.patch, 
> HBASE-16255-v5.patch, HBASE-16255-v6.patch, backup-it-7912-8-30.out, 
> backup-it-8-30.out, backup-it-success.out
>
>
> Integration test for backup restore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16255) Backup/Restore IT

2016-08-30 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450413#comment-15450413
 ] 

Devaraj Das commented on HBASE-16255:
-

[~dspivak] Pardon me for belaboring this point - can you please upload the 
logs (presumably you have them already), and check whether the necessary 
directories exist on the hdfs side (like /home/root, etc.) and have enough 
permissions? Given that Docker setup and such might take some time, I am 
requesting this of you. Thanks a bunch again :-)

> Backup/Restore IT
> -
>
> Key: HBASE-16255
> URL: https://issues.apache.org/jira/browse/HBASE-16255
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Attachments: 16255-addendum.3.txt, 16255.addendum, 16255.addendum2, 
> 16255.addendum4, 16255.addendum5, 16255.addendum6, HBASE-16255-v1.patch, 
> HBASE-16255-v2.patch, HBASE-16255-v3.patch, HBASE-16255-v4.patch, 
> HBASE-16255-v5.patch, HBASE-16255-v6.patch, backup-it-7912-8-30.out, 
> backup-it-8-30.out, backup-it-success.out
>
>
> Integration test for backup restore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16255) Backup/Restore IT

2016-08-30 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450355#comment-15450355
 ] 

Devaraj Das commented on HBASE-16255:
-

[~dspivak] appreciate your efforts in running the IT with Docker. Out of 
curiosity, does 
hdfs://node-1.network3783:8020/user/root/test-data/0919b2b6-c990-4d3e-9aa4-c0d249014545/backupIT/backup_1472586845117/default/IntegrationTestBackupRestore.table2/
 exist, and are the permissions sufficient? I mean, this seems like something 
we should be able to figure out from the logs. Mind attaching the logs, 
please, [~dspivak]? Thanks a bunch.

> Backup/Restore IT
> -
>
> Key: HBASE-16255
> URL: https://issues.apache.org/jira/browse/HBASE-16255
> Project: HBase
>  Issue Type: Task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Attachments: 16255-addendum.3.txt, 16255.addendum, 16255.addendum2, 
> 16255.addendum4, 16255.addendum5, 16255.addendum6, HBASE-16255-v1.patch, 
> HBASE-16255-v2.patch, HBASE-16255-v3.patch, HBASE-16255-v4.patch, 
> HBASE-16255-v5.patch, HBASE-16255-v6.patch, backup-it-7912-8-30.out, 
> backup-it-8-30.out, backup-it-success.out
>
>
> Integration test for backup restore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16527) IOExceptions from DFS client still can cause CatalogJanitor to delete referenced files

2016-08-29 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15448112#comment-15448112
 ] 

Devaraj Das commented on HBASE-16527:
-

[~vrodionov] any chance of adding a test for this issue?

> IOExceptions from DFS client still can cause CatalogJanitor to delete 
> referenced files
> --
>
> Key: HBASE-16527
> URL: https://issues.apache.org/jira/browse/HBASE-16527
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
> Attachments: HBASE-16527-v1.patch
>
>
> that was fixed partially in HBASE-13331, but issue still exists , now a 
> little bit deeper in the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16529) PathFilter accept implementations must be exception free

2016-08-29 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15448105#comment-15448105
 ] 

Devaraj Das commented on HBASE-16529:
-

Excellent find, [~vrodionov]!

> PathFilter accept implementations must be exception free
> 
>
> Key: HBASE-16529
> URL: https://issues.apache.org/jira/browse/HBASE-16529
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Vladimir Rodionov
>
> As an example of a wrong PathFilter implementation:
> FSUtils.ReferenceFileFilter
> {code}
> @Override
> protected boolean accept(Path p, @CheckForNull Boolean isDir) {
>   if (!StoreFileInfo.isReference(p)) {
> return false;
>   }
>   try {
> // only files can be references.
> return isFile(fs, isDir, p);
>   } catch (IOException ioe) {
> // Maybe the file was moved or the fs was disconnected.
> LOG.warn("Skipping file " + p +" due to IOException", ioe);
> return false;
>   }
> }
> {code}
> That is wrong. We can't say if path passes the filter or not if Exception is 
> thrown. The general rule: DO NOT USE ANY CALLS WHICH MAY THROW EXCEPTION 
> INSIDE ACCEPT METHOD IMPLEMENTATION.
> See HBASE-16527.
> FSUtils contains several path filters for starter.
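As an illustration of the rule, a sketch of a filter whose accept() does only 
pure, non-throwing work; the name test is a simplified stand-in for 
StoreFileInfo.isReference(p), and the IO-dependent "is it a file?" check moves 
to the caller, which can propagate the IOException instead of mis-classifying 
the path:
{code}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

/** Sketch: an exception-free accept() restricted to string inspection. */
public class ReferenceNameOnlyFilter implements PathFilter {
  @Override
  public boolean accept(Path p) {
    String name = p.getName();
    // Reference file names embed the parent region after a dot; a plain
    // string test cannot throw. (Simplified stand-in, not the real check.)
    int dot = name.indexOf('.');
    return dot > 0 && dot < name.length() - 1;
  }
}
{code}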



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-16492) Setting timeout on blocking operations in read/write path

2016-08-25 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das reassigned HBASE-16492:
---

Assignee: Devaraj Das  (was: Phil Yang)

> Setting timeout on blocking operations in read/write path
> -
>
> Key: HBASE-16492
> URL: https://issues.apache.org/jira/browse/HBASE-16492
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Phil Yang
>Assignee: Devaraj Das
>
> After HBASE-15593, we can use rpc timeout provided by client to prevent 
> wasting time on requests that have been dropped by client. In 
> request-handling path there are some points that are suitable to check if we 
> can finish this request in time.
> We can do this work in several sub-tasks to make sure each one is simple and 
> easy to maintain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14921) Memory optimizations

2016-08-22 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431686#comment-15431686
 ] 

Devaraj Das commented on HBASE-14921:
-

I am trying to get hold of a rig for running itbll. But that might take some 
time, and if others feel comfortable getting this patch in, it's fine with me. 

> Memory optimizations
> 
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Eshcar Hillel
>Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf, 
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, 
> HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA-V02.patch, 
> HBASE-14921-V04-CA.patch, HBASE-14921-V05-CAO.patch, 
> HBASE-14921-V06-CAO.patch, HBASE-14921-V08-CAO.patch, 
> HBASE-14921-V09-CAO.patch, HBASE-14921-V10-CAO.patch, 
> HBASE-14921-V11-CAO.patch, HBASE-14921-V11-CAO.patch, 
> HBASE-14921-V12-CAO.patch, InitialCellArrayMapEvaluation.pdf, 
> IntroductiontoNewFlatandCompactMemStore.pdf, MemStoreSizes.pdf, 
> MemstoreItrCountissue.patch, NewCompactingMemStoreFlow.pptx
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14921) Memory optimizations

2016-08-22 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431204#comment-15431204
 ] 

Devaraj Das commented on HBASE-14921:
-

Great work, [~anastas]. I'd like to know whether you ran ITBLL and other such 
"correctness" benchmarks using this family of patches. Given the magnitude of 
the changes, I was thinking we should get some ITBLL runs on these. Pardon me 
if you have already covered that aspect earlier.

> Memory optimizations
> 
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Eshcar Hillel
>Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf, 
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, 
> HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA-V02.patch, 
> HBASE-14921-V04-CA.patch, HBASE-14921-V05-CAO.patch, 
> HBASE-14921-V06-CAO.patch, HBASE-14921-V08-CAO.patch, 
> HBASE-14921-V09-CAO.patch, HBASE-14921-V10-CAO.patch, 
> HBASE-14921-V11-CAO.patch, HBASE-14921-V11-CAO.patch, 
> HBASE-14921-V12-CAO.patch, InitialCellArrayMapEvaluation.pdf, 
> IntroductiontoNewFlatandCompactMemStore.pdf, MemStoreSizes.pdf, 
> MemstoreItrCountissue.patch, NewCompactingMemStoreFlow.pptx
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading

2016-08-09 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414450#comment-15414450
 ] 

Devaraj Das commented on HBASE-14417:
-

The question is whether we should do this first or HBASE-14141 first. We need 
both in reality. We could put in a short-term solution for backing up 
bulk-loaded data, but I am wondering whether we should bite the bullet and do 
HBASE-14141 first, and then this.

> Incremental backup and bulk loading
> ---
>
> Key: HBASE-14417
> URL: https://issues.apache.org/jira/browse/HBASE-14417
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: backup
> Fix For: 2.0.0
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading 
> bypasses WALs for obvious reasons, breaking incremental backups. The only way 
> to continue backups after bulk loading is to create new full backup of a 
> table. This may not be feasible for customers who do bulk loading regularly 
> (say, every day).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11574) hbase:meta's regions can be replicated

2016-07-27 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395143#comment-15395143
 ] 

Devaraj Das commented on HBASE-11574:
-

Done adding the release note. [~stack] I don't have a cluster handy, but from 
what I recollect, the master webUI should show the replicas of the meta if you 
poke around. Also, the master logs would have lines showing that replicas of 
meta are getting created. Finally, look at the test testShutdownHandling() in 
TestMetaWithReplicas... it kills the primary meta replica server, and then 
makes sure that the client can still reach the secondary locations of the meta 
and get the locations of the regions of his table.

> hbase:meta's regions can be replicated
> --
>
> Key: HBASE-11574
> URL: https://issues.apache.org/jira/browse/HBASE-11574
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Devaraj Das
> Fix For: 1.1.0
>
> Attachments: 11574-1.txt, 11574-2.txt, 11574-3.txt, 11574-6.txt, 
> 11574-7.txt, 11574-addendum-1.0.txt, 11574-addendum.txt, 
> meta-replicas-0.98.zip
>
>
> As mentioned elsewhere, we can leverage hbase-10070 features to create 
> replicas for the meta tables regions so that: 
>  1. meta hotspotting can be circumvented 
>  2. meta becomes highly available for reading 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11574) hbase:meta's regions can be replicated

2016-07-27 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-11574:

Release Note: 
On the server side, set hbase.meta.replica.count to the number of replicas of 
meta that you want to have in the cluster (defaults to 1). 
hbase.regionserver.meta.storefile.refresh.period should be set to a non-zero 
number in milliseconds - something like 30000 (defaults to 0).
On the client/user side, set hbase.meta.replicas.use to true.
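As a concrete illustration (a minimal sketch; the property names come from the 
release note above, while the replica count of 3 and the 30-second refresh are 
just example values), the equivalent hbase-site.xml entries would look like:
{code}
<!-- server side -->
<property>
  <name>hbase.meta.replica.count</name>
  <value>3</value>
</property>
<property>
  <name>hbase.regionserver.meta.storefile.refresh.period</name>
  <value>30000</value>
</property>
<!-- client side -->
<property>
  <name>hbase.meta.replicas.use</name>
  <value>true</value>
</property>
{code}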

> hbase:meta's regions can be replicated
> --
>
> Key: HBASE-11574
> URL: https://issues.apache.org/jira/browse/HBASE-11574
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Devaraj Das
> Fix For: 1.1.0
>
> Attachments: 11574-1.txt, 11574-2.txt, 11574-3.txt, 11574-6.txt, 
> 11574-7.txt, 11574-addendum-1.0.txt, 11574-addendum.txt, 
> meta-replicas-0.98.zip
>
>
> As mentioned elsewhere, we can leverage hbase-10070 features to create 
> replicas for the meta tables regions so that: 
>  1. meta hotspotting can be circumvented 
>  2. meta becomes highly available for reading 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy

2016-07-01 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359493#comment-15359493
 ] 

Devaraj Das commented on HBASE-16132:
-

So if you look at the RpcRetryingCallerWithReadReplicas.call() implementation, 
it first does a poll (to wait for a certain timeout) -
{code}
  try {
// wait for the timeout to see whether the primary responds back
Future<Result> f = cs.poll(timeBeforeReplicas, TimeUnit.MICROSECONDS); 
// Yes, microseconds
if (f != null) {
  return f.get(); //great we got a response
}
  }
{code}
After that, it does a take() / get()
{code}
try {
  try {
Future<Result> f = cs.take();
return f.get();
  } catch (ExecutionException e) {
throwEnrichedException(e, retries);
  }
} catch (CancellationException e) {
{code}
In ScannerCallableWithReplicas.call(), it does poll() in both places. But 
after the second poll(), it might be better to do a get(). That should take 
care of throwing the exception (look at the implementation of get()). On a 
related note, should the second call to poll() be replaced with a call to 
take()? There is a difference between poll() and take(); I haven't analyzed 
the side effects of doing that...
I am okay with your patch but wanted to bring the above up and see if it makes 
sense.
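
For reference, here is a minimal sketch of the semantic difference being 
discussed, assuming a CompletionService<Result> named cs (the names and the 
timeout are illustrative, not the actual HBase code):
{code}
Future<Result> f = cs.poll(timeout, TimeUnit.MILLISECONDS);
if (f == null) {
  // poll() timed out with no completed task: the caller gets nothing back,
  // and an exception from a failed task is never surfaced.
} else {
  Result r = f.get(); // get() rethrows a task failure as ExecutionException
}

Future<Result> g = cs.take(); // take() blocks until some task completes,
Result r2 = g.get();          // so a failure always surfaces here as an
                              // ExecutionException
{code}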

> Scan does not return all the result when regionserver is busy
> -
>
> Key: HBASE-16132
> URL: https://issues.apache.org/jira/browse/HBASE-16132
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: binlijin
> Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case when a regionserver is busy for a long time: a 
> scanner may return null even though it has not scanned all the data.
> We found a case in ScannerCallableWithReplicas that is not handled correctly: 
> when cs.poll times out and does not return any result, a null result is 
> returned, so the scan gets a null result and ends. 
>  {code}
> try {
> Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, 
> TimeUnit.MILLISECONDS);
>   if (f != null) {
> Pair<Result[], ScannerCallable> r = f.get(timeout, 
> TimeUnit.MILLISECONDS);
> if (r != null && r.getSecond() != null) {
>   updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, 
> pool);
> }
> return r == null ? null : r.getFirst(); // great we got an answer
>   }
> } catch (ExecutionException e) {
>   RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (TimeoutException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get there because we were interrupted or because one or more of 
> the
>   // calls succeeded or failed. In all case, we stop all our tasks.
>   cs.cancelAll();
> }
> return null; // unreachable
>  {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-16132) Scan does not return all the result when regionserver is busy

2016-06-30 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357669#comment-15357669
 ] 

Devaraj Das edited comment on HBASE-16132 at 6/30/16 7:13 PM:
--

Folks, I am wondering if we should change the call from a poll() (referring to 
the code snippet in the jira description) to take()/get(), like what's done in 
RpcRetryingCallerWithReadReplicas.call(). Would that be better or worse?


was (Author: devaraj):
Folks, I am wondering if we should change the call from a poll() to take/get 
like what's done in the RpcRetryingCallerWithReadReplicas.call(). Would that be 
better or worse?

> Scan does not return all the result when regionserver is busy
> -
>
> Key: HBASE-16132
> URL: https://issues.apache.org/jira/browse/HBASE-16132
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: binlijin
> Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case when a regionserver is busy for a long time: a 
> scanner may return null even though it has not scanned all the data.
> We found a case in ScannerCallableWithReplicas that is not handled correctly: 
> when cs.poll times out and does not return any result, a null result is 
> returned, so the scan gets a null result and ends. 
>  {code}
> try {
> Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, 
> TimeUnit.MILLISECONDS);
>   if (f != null) {
> Pair<Result[], ScannerCallable> r = f.get(timeout, 
> TimeUnit.MILLISECONDS);
> if (r != null && r.getSecond() != null) {
>   updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, 
> pool);
> }
> return r == null ? null : r.getFirst(); // great we got an answer
>   }
> } catch (ExecutionException e) {
>   RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (TimeoutException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get there because we were interrupted or because one or more of 
> the
>   // calls succeeded or failed. In all case, we stop all our tasks.
>   cs.cancelAll();
> }
> return null; // unreachable
>  {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy

2016-06-30 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357669#comment-15357669
 ] 

Devaraj Das commented on HBASE-16132:
-

Folks, I am wondering if we should change the call from a poll() to take/get 
like what's done in the RpcRetryingCallerWithReadReplicas.call(). Would that be 
better or worse?

> Scan does not return all the result when regionserver is busy
> -
>
> Key: HBASE-16132
> URL: https://issues.apache.org/jira/browse/HBASE-16132
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: binlijin
> Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case when a regionserver is busy for a long time: a 
> scanner may return null even though it has not scanned all the data.
> We found a case in ScannerCallableWithReplicas that is not handled correctly: 
> when cs.poll times out and does not return any result, a null result is 
> returned, so the scan gets a null result and ends. 
>  {code}
> try {
> Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, 
> TimeUnit.MILLISECONDS);
>   if (f != null) {
> Pair<Result[], ScannerCallable> r = f.get(timeout, 
> TimeUnit.MILLISECONDS);
> if (r != null && r.getSecond() != null) {
>   updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, 
> pool);
> }
> return r == null ? null : r.getFirst(); // great we got an answer
>   }
> } catch (ExecutionException e) {
>   RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (TimeoutException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get there because we were interrupted or because one or more of 
> the
>   // calls succeeded or failed. In all case, we stop all our tasks.
>   cs.cancelAll();
> }
> return null; // unreachable
>  {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16115) Missing security context in RegionObserver coprocessor when a compaction/split is triggered manually

2016-06-28 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15354410#comment-15354410
 ] 

Devaraj Das commented on HBASE-16115:
-

bq. I think the fix is for Phoenix to save off the result of User.getCurrent() 
at init time and do a doAs() with that UGI whenever attempting RPC.
Either that, or do the Phoenix RPC within a User.runAsLoginUser() context... 
The latter should be simpler.
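
As a rough sketch of the second option, assuming the User.runAsLoginUser() 
hook is available (the Phoenix call inside is a placeholder, not actual 
Phoenix code):
{code}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.hbase.security.User;

// Run the cross-server RPC as the login user (the regionserver principal),
// not as the end user whose context triggered the coprocessor hook.
User.runAsLoginUser(new PrivilegedExceptionAction<Void>() {
  @Override
  public Void run() throws Exception {
    // placeholder: issue the Phoenix system-table update here
    return null;
  }
});
{code}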

> Missing security context in RegionObserver coprocessor when a 
> compaction/split is triggered manually
> 
>
> Key: HBASE-16115
> URL: https://issues.apache.org/jira/browse/HBASE-16115
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.20
>Reporter: Lars Hofhansl
>
> We ran into an interesting phenomenon which can easily render a cluster 
> unusable.
> We loaded some test data into a test table and forced a manual compaction 
> through the UI. We have some compaction hooks implemented in a region 
> observer, which writes back to another HBase table when the compaction 
> finishes. We noticed that this coprocessor is not set up correctly; it seems 
> the security context is missing.
> The interesting part is that this _only_ happens when the compaction is 
> triggered through the UI. Automatic compactions (major or minor), or ones 
> triggered via the HBase shell (following a kinit), work fine. Only the 
> UI-triggered compactions cause these issues and lead to essentially 
> never-ending compactions, immovable regions, etc.
> Not sure what exactly the issue is, but I wanted to make sure I capture this.
> [~apurtell], [~ghelmling], FYI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16115) Missing security context in RegionObserver coprocessor when a compaction/split is triggered manually

2016-06-28 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15353994#comment-15353994
 ] 

Devaraj Das commented on HBASE-16115:
-

Yeah, HBASE-14655 might be the cause of the issue at hand. Thinking about it, 
one regionserver wouldn't be able to communicate with another without valid 
credentials. HBASE-14655 makes it so that the preCompact hook would run as the 
end user submitting the compaction request, and that wouldn't work for 
authentication purposes. Regarding the issue we faced earlier, the way we 
fixed it was to simply run everything in the compaction as the login user 
(which is the hbase regionserver user). We somehow thought that HBASE-14655 
would fix it in the long run, but let me check that hypothesis...

> Missing security context in RegionObserver coprocessor when a 
> compaction/split is triggered manually
> 
>
> Key: HBASE-16115
> URL: https://issues.apache.org/jira/browse/HBASE-16115
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.20
>Reporter: Lars Hofhansl
>
> We ran into an interesting phenomenon which can easily render a cluster 
> unusable.
> We loaded some test data into a test table and forced a manual compaction 
> through the UI. We have some compaction hooks implemented in a region 
> observer, which writes back to another HBase table when the compaction 
> finishes. We noticed that this coprocessor is not set up correctly; it seems 
> the security context is missing.
> The interesting part is that this _only_ happens when the compaction is 
> triggered through the UI. Automatic compactions (major or minor), or ones 
> triggered via the HBase shell (following a kinit), work fine. Only the 
> UI-triggered compactions cause these issues and lead to essentially 
> never-ending compactions, immovable regions, etc.
> Not sure what exactly the issue is, but I wanted to make sure I capture this.
> [~apurtell], [~ghelmling], FYI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16115) Missing security context in RegionObserver coprocessor when a compaction is triggered through the UI

2016-06-27 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15351446#comment-15351446
 ] 

Devaraj Das commented on HBASE-16115:
-

We saw a similar issue with Phoenix in the picture - what we observed was that 
for user-triggered compactions, the compaction in the regionserver would run as 
the user, and after the compaction, the regionserver would try to update a 
system table in Phoenix. That'd fail with an authentication failure because 
there are no credentials for reaching out to a remote server from within the 
user's context. HBASE-14655 should handle that case.

> Missing security context in RegionObserver coprocessor when a compaction is 
> triggered through the UI
> 
>
> Key: HBASE-16115
> URL: https://issues.apache.org/jira/browse/HBASE-16115
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.20
>Reporter: Lars Hofhansl
>
> We ran into an interesting phenomenon which can easily render a cluster 
> unusable.
> We loaded some test data into a test table and forced a manual compaction 
> through the UI. We have some compaction hooks implemented in a region 
> observer, which writes back to another HBase table when the compaction 
> finishes. We noticed that this coprocessor is not set up correctly; it seems 
> the security context is missing.
> The interesting part is that this _only_ happens when the compaction is 
> triggered through the UI. Automatic compactions (major or minor), or ones 
> triggered via the HBase shell (following a kinit), work fine. Only the 
> UI-triggered compactions cause these issues and lead to essentially 
> never-ending compactions, immovable regions, etc.
> Not sure what exactly the issue is, but I wanted to make sure I capture this.
> [~apurtell], [~ghelmling], FYI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15862) Backup - Delete- Restore does not restore deleted data

2016-06-06 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317094#comment-15317094
 ] 

Devaraj Das commented on HBASE-15862:
-

I'd say that we punt on it, and let the full restore fail if the table already 
exists. Then the user can do the needful and either delete or not, take a 
snapshot, etc.

> Backup - Delete- Restore does not restore deleted data
> --
>
> Key: HBASE-15862
> URL: https://issues.apache.org/jira/browse/HBASE-15862
> Project: HBase
>  Issue Type: Bug
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Attachments: HBASE-15862-v1.patch
>
>
> This was discovered during testing. If we delete row after full backup and 
> perform immediately restore, the deleted row still remains deleted. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15806) An endpoint-based export tool

2016-05-23 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297729#comment-15297729
 ] 

Devaraj Das commented on HBASE-15806:
-

Good catch there, [~mbertozzi]. If what you are saying is true, it introduces 
a security hole. [~yuzhih...@gmail.com], we shouldn't add more security holes 
to the codebase if we can avoid it. Examples of earlier issues being present 
in the codebase don't automatically justify introducing a new one.

> An endpoint-based export tool
> -
>
> Key: HBASE-15806
> URL: https://issues.apache.org/jira/browse/HBASE-15806
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: Experiment.png, HBASE-15806.patch
>
>
> The time for exporting a table can be reduced if we use the endpoint 
> technique to have the region servers export the hdfs files rather than the 
> hbase client.
> In my experiments, the elapsed time of the endpoint-based export can be less 
> than half that of the current export tool (with hdfs compression enabled).
> But the shortcoming is that we need to alter the table to deploy the endpoint.
> Any comments about this? Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15806) An endpoint-based export tool

2016-05-23 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296885#comment-15296885
 ] 

Devaraj Das commented on HBASE-15806:
-

Any quantification on how much computational overhead it brings to the 
regionservers vis-a-vis the MR approach?

> An endpoint-based export tool
> -
>
> Key: HBASE-15806
> URL: https://issues.apache.org/jira/browse/HBASE-15806
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: Experiment.png, HBASE-15806.patch
>
>
> The time for exporting a table can be reduced if we use the endpoint 
> technique to have the region servers export the hdfs files rather than the 
> hbase client.
> In my experiments, the elapsed time of the endpoint-based export can be less 
> than half that of the current export tool (with hdfs compression enabled).
> But the shortcoming is that we need to alter the table to deploy the endpoint.
> Any comments about this? Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HBASE-15519) Add per-user metrics

2016-04-20 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das reopened HBASE-15519:
-

> Add per-user metrics 
> -
>
> Key: HBASE-15519
> URL: https://issues.apache.org/jira/browse/HBASE-15519
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 1.2.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-15519_v0.patch, hbase-15519_v1.patch, 
> hbase-15519_v1.patch, hbase-15519_v2.patch
>
>
> Per-user metrics will be useful in multi-tenant cases where we can emit the 
> number of requests, operations, RPCs, etc. at a per-user aggregate level per 
> regionserver. We currently have throttles per user, but no way to monitor 
> resource usage per user. 
> Looking at these metrics, operators can adjust throttles, do capacity 
> planning, etc. per user. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15519) Add per-user metrics

2016-04-20 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-15519:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Add per-user metrics 
> -
>
> Key: HBASE-15519
> URL: https://issues.apache.org/jira/browse/HBASE-15519
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 1.2.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Attachments: hbase-15519_v0.patch, hbase-15519_v1.patch, 
> hbase-15519_v1.patch, hbase-15519_v2.patch
>
>
> Per-user metrics will be useful in multi-tenant cases where we can emit the 
> number of requests, operations, RPCs, etc. at a per-user aggregate level per 
> regionserver. We currently have throttles per user, but no way to monitor 
> resource usage per user. 
> Looking at these metrics, operators can adjust throttles, do capacity 
> planning, etc. per user. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-04-12 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238048#comment-15238048
 ] 

Devaraj Das edited comment on HBASE-7912 at 4/12/16 9:43 PM:
-

bq. We are not repeating, ns is upper directory level, similar to hbase layout. 
I am not sure I am following you here
I think [~enis] is referring to the fact that you have the namespace repeated 
in the path twice. For example, in this path the 'default' namespace appears twice:
{code}ROOT/default/test-1459375580152/backup_1459375618126/archive/data/default/test-1459375580152/543f8c02c388dd931fb9bcd1c38e7372/f/a6ce4789f9b444d89bbc755254afd27d{code}


was (Author: devaraj):
bq. We are not repeating, ns is upper directory level, similar to hbase layout. 
I am not sure I am following you here
I think [~enis] is referring to the fact that you have the namespace repeated 
in the path twice. For example, in this:
{code}ROOT/default/test-1459375580152/backup_1459375618126/archive/data/default/test-1459375580152/543f8c02c388dd931fb9bcd1c38e7372/f/a6ce4789f9b444d89bbc755254afd27d{code}

> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBaseBackupandRestore.pdf, 
> HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this jira. 
> We leverage the existing hbase snapshot feature and provide a general 
> solution for common users. Our full backup uses snapshots to capture 
> metadata locally and exportsnapshot to move data to another cluster; the 
> incremental backup uses an offline WALPlayer to back up HLogs; we also 
> leverage distributed log roll and distributed flush to improve performance, 
> plus added-on value such as convert, merge, progress reporting, and CLI 
> commands, so that a common user can back up hbase data without in-depth 
> knowledge of hbase. Our solution also contains some usability features for 
> enterprise users. 
> The detailed design document and CLI commands will be attached in this jira. 
> We plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup*: provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert* incremental backup WAL files into hfiles
> * *Merge* several backup images into one(like merge weekly into monthly)
> * *add and remove* table to and from Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-*
> *Below is the original description, to keep here as the history for the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618). Recently, there have been many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store 
> massive amounts of data. The combination of full backups with incremental 
> backups has tremendous benefit for HBase as well. The following is a typical 
> scenario for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodical incremental backups to capture the changes 
> from the full backup, or from last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would 

[jira] [Commented] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-04-12 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238048#comment-15238048
 ] 

Devaraj Das commented on HBASE-7912:


bq. We are not repeating, ns is upper directory level, similar to hbase layout. 
I am not sure I am following you here
I think [~enis] is referring to the fact that you have the namespace repeated 
in the path twice. For example, in this:
{code}ROOT/default/test-1459375580152/backup_1459375618126/archive/data/default/test-1459375580152/543f8c02c388dd931fb9bcd1c38e7372/f/a6ce4789f9b444d89bbc755254afd27d{code}

> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBaseBackupandRestore.pdf, 
> HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this jira. 
> We leverage the existing hbase snapshot feature and provide a general 
> solution for common users. Our full backup uses snapshots to capture 
> metadata locally and exportsnapshot to move data to another cluster; the 
> incremental backup uses an offline WALPlayer to back up HLogs; we also 
> leverage distributed log roll and distributed flush to improve performance, 
> plus added-on value such as convert, merge, progress reporting, and CLI 
> commands, so that a common user can back up hbase data without in-depth 
> knowledge of hbase. Our solution also contains some usability features for 
> enterprise users. 
> The detailed design document and CLI commands will be attached in this jira. 
> We plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup*: provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert* incremental backup WAL files into hfiles
> * *Merge* several backup images into one(like merge weekly into monthly)
> * *add and remove* table to and from Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-*
> *Below is the original description, to keep here as the history for the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618). Recently, there have been many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store 
> massive amounts of data. The combination of full backups with incremental 
> backups has tremendous benefit for HBase as well. The following is a typical 
> scenario for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodical incremental backups to capture the changes 
> from the full backup, or from last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Use HBase WALs to capture incremental changes, but we use bulk load of 
> HFiles for fast incremental restore.
> * Support single table or a set of tables, and column family level backup and 
> restore.
> * Restore to different table names.
> * Support adding additional tables or CF to backup set without interruption 
> of incremental backup schedule.
> * Support rollup/combining of incremental backups into 

[jira] [Commented] (HBASE-15383) Load distribute across secondary read replicas for meta

2016-03-03 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178565#comment-15178565
 ] 

Devaraj Das commented on HBASE-15383:
-

True that... Nice idea.

> Load distribute across secondary read replicas for meta
> ---
>
> Key: HBASE-15383
> URL: https://issues.apache.org/jira/browse/HBASE-15383
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Ashu Pachauri
>Assignee: Ashu Pachauri
> Fix For: 2.0.0, 1.3.0
>
>
> Right now, we always hit the primary replica for meta and fall back to the 
> secondary replicas in case of a timeout. This can hamper performance in 
> scenarios where meta becomes a hot region, e.g. cluster ramp-up, clients 
> dropping connections, etc.
> It's good to have a load-distribution approach on meta's secondary replicas, 
> with fallback to the primary if we read stale data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15383) Load distribute across secondary read replicas for meta

2016-03-03 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178403#comment-15178403
 ] 

Devaraj Das commented on HBASE-15383:
-

The point to note is that responses from secondaries are always flagged as 
"stale", even if the secondary does have the latest updates... Without 
addressing that, it's not easy to address this jira.
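
For context, a minimal sketch of how the stale flag surfaces to a client today 
(table and rowKey are illustrative placeholders):
{code}
import org.apache.hadoop.hbase.client.Consistency;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;

Get get = new Get(rowKey);
get.setConsistency(Consistency.TIMELINE); // allow secondary replicas to answer
Result result = table.get(get);
if (result.isStale()) {
  // set whenever a secondary replica served the read, even if that replica
  // actually had the latest updates - the problem described above
}
{code}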

> Load distribute across secondary read replicas for meta
> ---
>
> Key: HBASE-15383
> URL: https://issues.apache.org/jira/browse/HBASE-15383
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Ashu Pachauri
>Assignee: Ashu Pachauri
> Fix For: 2.0.0, 1.3.0
>
>
> Right now, we always hit the primary replica for meta and fall back to the 
> secondary replicas in case of a timeout. This can hamper performance in 
> scenarios where meta becomes a hot region, e.g. cluster ramp-up, clients 
> dropping connections, etc.
> It's good to have a load-distribution approach on meta's secondary replicas, 
> with fallback to the primary if we read stale data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15156) Support first assignment of split daughters to non-parent RS

2016-02-01 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127799#comment-15127799
 ] 

Devaraj Das commented on HBASE-15156:
-

bq. If not, patch will work, for branch-1 at least.
[~stack] quick question - mind telling me why branch-1 and not both master and 
branch-1? 

> Support first assignment of split daughters to non-parent RS
> 
>
> Key: HBASE-15156
> URL: https://issues.apache.org/jira/browse/HBASE-15156
> Project: HBase
>  Issue Type: New Feature
>Reporter: Francis Liu
>Assignee: Francis Liu
> Attachments: HBASE-15156.patch
>
>
> On region split, the region's daughters are always opened by the same region 
> server hosting the parent region. In some cases this is not ideal.
> This feature was mainly needed for favored nodes, to allow for more freedom 
> when selecting favored nodes for daughter regions, i.e. the daughter doesn't 
> always have to select the regionserver hosting the split as a favored node, 
> which should allow for better favored-node distribution.
> This feature is also useful in cases where region splits occur much more 
> often than the balancer is run. It is also a bit more efficient, as the major 
> compaction that occurs after daughter assignment does not go to waste (i.e. 
> cancelled half-way, loss of locality due to a move, etc.). We actually run it 
> this way in some of our clusters even without favored nodes enabled. Hence I 
> am supplying a patch which is independent of favored nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15156) Support first assignment of split daughters to non-parent RS

2016-02-01 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126392#comment-15126392
 ] 

Devaraj Das commented on HBASE-15156:
-

[~stack] mind taking a look at this, please?

> Support first assignment of split daughters to non-parent RS
> 
>
> Key: HBASE-15156
> URL: https://issues.apache.org/jira/browse/HBASE-15156
> Project: HBase
>  Issue Type: New Feature
>Reporter: Francis Liu
>Assignee: Francis Liu
> Attachments: HBASE-15156.patch
>
>
> On region split, the region's daughters are always opened by the same region 
> server hosting the parent region. In some cases this is not ideal.
> This feature was mainly needed for favored nodes, to allow for more freedom 
> when selecting favored nodes for daughter regions, i.e. the daughter doesn't 
> always have to select the regionserver hosting the split as a favored node, 
> which should allow for better favored-node distribution.
> This feature is also useful in cases where region splits occur much more 
> often than the balancer is run. It is also a bit more efficient, as the major 
> compaction that occurs after daughter assignment does not go to waste (i.e. 
> cancelled half-way, loss of locality due to a move, etc.). We actually run it 
> this way in some of our clusters even without favored nodes enabled. Hence I 
> am supplying a patch which is independent of favored nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15181) A simple implementation of date based tiered compaction

2016-01-27 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120571#comment-15120571
 ] 

Devaraj Das commented on HBASE-15181:
-

dup of https://issues.apache.org/jira/browse/HBASE-14477? 

> A simple implementation of date based tiered compaction
> ---
>
> Key: HBASE-15181
> URL: https://issues.apache.org/jira/browse/HBASE-15181
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Clara Xiong
>Assignee: Clara Xiong
> Fix For: 2.0.0
>
>
> This is a simple implementation of date-based tiered compaction similar to 
> Cassandra's for the following benefits:
> 1. Improve date-range-based scan by structuring store files in date-based 
> tiered layout.
> 2. Reduce compaction overhead.
> 3. Improve TTL efficiency.
> A perfect fit for use cases that:
> 1. have mostly date-based data writes and scans, with a focus on the most 
> recent data, and
> 2. never or rarely delete data.
> Out-of-order writes are handled gracefully, so the data will still get to the 
> right store file for time-range scans, and re-compaction with an existing 
> store file in the same time window is handled by ExploringCompactionPolicy.
> Time range overlapping among store files is tolerated and the performance 
> impact is minimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14963) Remove Guava dependency from HBase client code

2016-01-22 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-14963:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed this.

> Remove Guava dependency from HBase client code
> --
>
> Key: HBASE-14963
> URL: https://issues.apache.org/jira/browse/HBASE-14963
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 2.0.0
>
> Attachments: no-stopwatch.txt
>
>
> We ran into an issue where an application bundled its own Guava (and that 
> happened to be in the classpath first) and HBase's MetaTableLocator threw an 
> exception due to the fact that Stopwatch's constructor wasn't compatible... 
> Might be better to not depend on Stopwatch at all in MetaTableLocator since 
> the functionality is easily doable without.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2016-01-21 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111314#comment-15111314
 ] 

Devaraj Das commented on HBASE-6721:


[~eclark] ping. Please respond when you have a moment.

> RegionServer Group based Assignment
> ---
>
> Key: HBASE-6721
> URL: https://issues.apache.org/jira/browse/HBASE-6721
> Project: HBase
>  Issue Type: New Feature
>Reporter: Francis Liu
>Assignee: Francis Liu
>  Labels: hbase-6721
> Attachments: 6721-master-webUI.patch, HBASE-6721 
> GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
> HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
> HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
> HBASE-6721_12.patch, HBASE-6721_13.patch, HBASE-6721_14.patch, 
> HBASE-6721_15.patch, HBASE-6721_8.patch, HBASE-6721_9.patch, 
> HBASE-6721_9.patch, HBASE-6721_94.patch, HBASE-6721_94.patch, 
> HBASE-6721_94_2.patch, HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, 
> HBASE-6721_94_4.patch, HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, 
> HBASE-6721_94_7.patch, HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
> HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
> HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
> HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 
> hbase-6721-v15-branch-1.1.patch, hbase-6721-v16.patch, hbase-6721-v17.patch, 
> hbase-6721-v18.patch, hbase-6721-v19.patch, hbase-6721-v20.patch, 
> hbase-6721-v21.patch, hbase-6721-v22.patch, hbase-6721-v23.patch, 
> hbase-6721-v25.patch, immediateAssignments Sequence Diagram.svg, 
> randomAssignment Sequence Diagram.svg, retainAssignment Sequence Diagram.svg, 
> roundRobinAssignment Sequence Diagram.svg
>
>
> In multi-tenant deployments of HBase, it is likely that a RegionServer will 
> be serving out regions from a number of different tables owned by various 
> client applications. Being able to group a subset of running RegionServers 
> and assign specific tables to them provides a client application with a level 
> of isolation and resource allocation.
> The proposal essentially is to have an AssignmentManager which is aware of 
> RegionServer groups and assigns tables to region servers based on groupings. 
> Load balancing will occur on a per group basis as well. 
> This is essentially a simplification of the approach taken in HBASE-4120. See 
> attached document.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14987) Compaction marker whose region name doesn't match current region's needs to be handled

2015-12-22 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15068878#comment-15068878
 ] 

Devaraj Das commented on HBASE-14987:
-

Good writeup, [~enis].
bq. In the log split case, we want to skip the edits (due to HBCK case), but 
secondary region replication we still want to throw the exception if regions do 
not match.
In the region replication case, we'd replay this on a replica (and not the 
primary region) and so the validation there would be slightly different, right?

bq. Idea on how compaction marker with mismatching region name can be generated 
/ replayed is welcome.
I guess you could write the hand-coded edits into the file using the WALEdit 
classes?

> Compaction marker whose region name doesn't match current region's needs to 
> be handled
> --
>
> Key: HBASE-14987
> URL: https://issues.apache.org/jira/browse/HBASE-14987
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Stephen Yuan Jiang
> Attachments: 14987-suggest.txt, 14987-v1.txt, 14987-v2.txt, 
> 14987-v2.txt
>
>
> One customer encountered the following error when replaying recovered edits, 
> leading to region open failure:
> {code}
> region=table1,d6b-2282-9223370590058224807-U-9856557-
> EJ452727-16313786400171,1449616291799.fa8a526f2578eb3630bb08a4b1648f5d., 
> starting to roll back the global memstore   size.
> org.apache.hadoop.hbase.regionserver.WrongRegionException: Compaction marker 
> from WAL table_name: "table1"
> encoded_region_name: "d389c70fde9ec07971d0cfd20ef8f575"
> ...
> region_name: 
> "table1,d6b-2282-9223370590058224807-U-9856557-EJ452727-16313786400171,1449089609367.d389c70fde9ec07971d0cfd20ef8f575."
>  targetted for region d389c70fde9ec07971d0cfd20ef8f575 does not match this 
> region: {ENCODED => fa8a526f2578eb3630bb08a4b1648f5d, NAME => 
> 'table1,d6b-2282-
> 9223370590058224807-U-9856557-EJ452727-16313786400171,1449616291799.fa8a526f2578eb3630bb08a4b1648f5d.',
>  STARTKEY => 'd6b-2282-9223370590058224807-U-9856557-EJ452727- 
> 16313786400171', ENDKEY => 
> 'd76-2553-9223370588576178807-U-7416904-EK875822-1766218060'}
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkTargetRegion(HRegion.java:4592)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayWALCompactionMarker(HRegion.java:3831)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:3747)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:3601)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionStores(HRegion.java:911)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:789)
>   at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:762)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5774)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5744)
> {code}
> This was likely caused by the following action of hbck:
> {code}
> 15/12/08 18:11:34 INFO util.HBaseFsck: [hbasefsck-pool1-t37] Moving files 
> from 
> hdfs://Zealand/hbase/data/default/table1/d389c70fde9ec07971d0cfd20ef8f575/recovered.edits
>  into containing region 
> hdfs://Zealand/hbase/data/default/table1/fa8a526f2578eb3630bb08a4b1648f5d/recovered.edits
> {code}
> The recovered.edits for d389c70fde9ec07971d0cfd20ef8f575 contained a 
> compaction marker which couldn't be replayed against 
> fa8a526f2578eb3630bb08a4b1648f5d



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HBASE-8549) Integrate Favored Nodes into StochasticLoadBalancer

2015-12-22 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das reopened HBASE-8549:


I was discussing this offline with [~toffer], and there is a patch that Yahoo 
has developed that includes this and other fixes for running HBase with 
FavoredNodes enabled. Reopening this. Speaking for Francis, but we should have 
something up on the jira soon...

> Integrate Favored Nodes into StochasticLoadBalancer
> ---
>
> Key: HBASE-8549
> URL: https://issues.apache.org/jira/browse/HBASE-8549
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Reporter: Elliott Clark
> Attachments: HBASE-8549-0.patch
>
>
> Right now we have a FavoredNodeLoadBalancer.  It would be pretty easy to 
> integrate the favored node list into the strochastic balancer.  Then we would 
> have the best of both worlds.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14987) HBaseFsck#mergeRegionDirs() needs to handle compaction marker in recovered edits

2015-12-20 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15065964#comment-15065964
 ] 

Devaraj Das commented on HBASE-14987:
-

bq. When region names don't match, skip replaying compaction marker.
What about the other markers? It'll be good to document what we do with FLUSH, 
etc.

> HBaseFsck#mergeRegionDirs() needs to handle compaction marker in recovered 
> edits
> 
>
> Key: HBASE-14987
> URL: https://issues.apache.org/jira/browse/HBASE-14987
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Stephen Yuan Jiang
> Attachments: 14987-suggest.txt, 14987-v1.txt, 14987-v2.txt
>
>
> One customer encountered the following error when replaying recovered edits, 
> leading to region open failure:
> {code}
> region=table1,d6b-2282-9223370590058224807-U-9856557-
> EJ452727-16313786400171,1449616291799.fa8a526f2578eb3630bb08a4b1648f5d., 
> starting to roll back the global memstore   size.
> org.apache.hadoop.hbase.regionserver.WrongRegionException: Compaction marker 
> from WAL table_name: "table1"
> encoded_region_name: "d389c70fde9ec07971d0cfd20ef8f575"
> ...
> region_name: 
> "table1,d6b-2282-9223370590058224807-U-9856557-EJ452727-16313786400171,1449089609367.d389c70fde9ec07971d0cfd20ef8f575."
>  targetted for region d389c70fde9ec07971d0cfd20ef8f575 does not match this 
> region: {ENCODED => fa8a526f2578eb3630bb08a4b1648f5d, NAME => 
> 'table1,d6b-2282-
> 9223370590058224807-U-9856557-EJ452727-16313786400171,1449616291799.fa8a526f2578eb3630bb08a4b1648f5d.',
>  STARTKEY => 'd6b-2282-9223370590058224807-U-9856557-EJ452727- 
> 16313786400171', ENDKEY => 
> 'd76-2553-9223370588576178807-U-7416904-EK875822-1766218060'}
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkTargetRegion(HRegion.java:4592)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayWALCompactionMarker(HRegion.java:3831)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:3747)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:3601)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionStores(HRegion.java:911)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:789)
>   at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:762)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5774)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5744)
> {code}
> This was likely caused by the following action of hbck:
> {code}
> 15/12/08 18:11:34 INFO util.HBaseFsck: [hbasefsck-pool1-t37] Moving files 
> from 
> hdfs://Zealand/hbase/data/default/table1/d389c70fde9ec07971d0cfd20ef8f575/recovered.edits
>  into containing region 
> hdfs://Zealand/hbase/data/default/table1/fa8a526f2578eb3630bb08a4b1648f5d/recovered.edits
> {code}
> The recovered.edits for d389c70fde9ec07971d0cfd20ef8f575 contained a 
> compaction marker which couldn't be replayed against 
> fa8a526f2578eb3630bb08a4b1648f5d



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14963) Remove Guava dependency from HBase client code

2015-12-14 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057300#comment-15057300
 ] 

Devaraj Das commented on HBASE-14963:
-

Thanks [~enis] for the note. [~busbey], is that good enough justification for 
committing this patch?

> Remove Guava dependency from HBase client code
> --
>
> Key: HBASE-14963
> URL: https://issues.apache.org/jira/browse/HBASE-14963
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: no-stopwatch.txt
>
>
> We ran into an issue where an application bundled its own Guava (and that 
> happened to be in the classpath first) and HBase's MetaTableLocator threw an 
> exception due to the fact that Stopwatch's constructor wasn't compatible... 
> Might be better to not depend on Stopwatch at all in MetaTableLocator since 
> the functionality is easily doable without.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14963) Remove Guava dependency from HBase client code

2015-12-14 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-14963:

Fix Version/s: 2.0.0
   Status: Patch Available  (was: Open)

> Remove Guava dependency from HBase client code
> --
>
> Key: HBASE-14963
> URL: https://issues.apache.org/jira/browse/HBASE-14963
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 2.0.0
>
> Attachments: no-stopwatch.txt
>
>
> We ran into an issue where an application bundled its own Guava (and that 
> happened to be in the classpath first) and HBase's MetaTableLocator threw an 
> exception due to the fact that Stopwatch's constructor wasn't compatible... 
> Might be better to not depend on Stopwatch at all in MetaTableLocator since 
> the functionality is easily doable without.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14963) Remove Guava dependency from HBase client code

2015-12-10 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15051880#comment-15051880
 ] 

Devaraj Das commented on HBASE-14963:
-

Yes [~stack], that would work for sure. For now, we saw the issue with the 
Stopwatch class only, hence the patch to handle only that... But yeah, I agree 
that shading is a better approach overall.

> Remove Guava dependency from HBase client code
> --
>
> Key: HBASE-14963
> URL: https://issues.apache.org/jira/browse/HBASE-14963
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: no-stopwatch.txt
>
>
> We ran into an issue where an application bundled its own Guava (and that 
> happened to be in the classpath first) and HBase's MetaTableLocator threw an 
> exception due to the fact that Stopwatch's constructor wasn't compatible... 
> Might be better to not depend on Stopwatch at all in MetaTableLocator since 
> the functionality is easily doable without.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14963) Remove Guava dependency from HBase client code

2015-12-10 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-14963:

Attachment: no-stopwatch.txt

> Remove Guava dependency from HBase client code
> --
>
> Key: HBASE-14963
> URL: https://issues.apache.org/jira/browse/HBASE-14963
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: no-stopwatch.txt
>
>
> We ran into an issue where an application bundled its own Guava (and that 
> happened to be in the classpath first) and HBase's MetaTableLocator threw an 
> exception due to the fact that Stopwatch's constructor wasn't compatible... 
> Might be better to not depend on Stopwatch at all in MetaTableLocator since 
> the functionality is easily doable without.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14963) Remove Guava dependency from HBase client code

2015-12-10 Thread Devaraj Das (JIRA)
Devaraj Das created HBASE-14963:
---

 Summary: Remove Guava dependency from HBase client code
 Key: HBASE-14963
 URL: https://issues.apache.org/jira/browse/HBASE-14963
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Devaraj Das
Assignee: Devaraj Das


We ran into an issue where an application bundled its own Guava (and that 
happened to be in the classpath first) and HBase's MetaTableLocator threw an 
exception due to the fact that Stopwatch's constructor wasn't compatible... 
Might be better to not depend on Stopwatch at all in MetaTableLocator since the 
functionality is easily doable without.
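
A minimal sketch of the idea (illustrative only; the actual change is in the 
attached no-stopwatch.txt): replace the Guava Stopwatch with plain 
System.currentTimeMillis() bookkeeping, so no Guava type is needed on this 
client path:
{code}
// Wait up to timeoutMillis for the meta location, without Guava's Stopwatch.
long startMillis = System.currentTimeMillis();
while (System.currentTimeMillis() - startMillis < timeoutMillis) {
  // probe for the meta region location here; back off briefly between attempts
  Thread.sleep(200); // throws InterruptedException; handle or propagate
}
{code}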



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14963) Remove Guava dependency from HBase client code

2015-12-10 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052227#comment-15052227
 ] 

Devaraj Das commented on HBASE-14963:
-

[~busbey] I may have overlooked something. Let me get back to you (I will 
resolve this issue if it has been addressed already)...

> Remove Guava dependency from HBase client code
> --
>
> Key: HBASE-14963
> URL: https://issues.apache.org/jira/browse/HBASE-14963
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: no-stopwatch.txt
>
>
> We ran into an issue where an application bundled its own Guava (and that 
> happened to be in the classpath first) and HBase's MetaTableLocator threw an 
> exception due to the fact that Stopwatch's constructor wasn't compatible... 
> Might be better to not depend on Stopwatch at all in MetaTableLocator since 
> the functionality is easily doable without.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

