[jira] [Commented] (HBASE-11811) Use binary search for seeking into a block

2015-06-27 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604001#comment-14604001
 ] 

Lars Hofhansl commented on HBASE-11811:
---

Sure, go ahead.


 Use binary search for seeking into a block
 --

 Key: HBASE-11811
 URL: https://issues.apache.org/jira/browse/HBASE-11811
 Project: HBase
  Issue Type: Brainstorming
Reporter: Lars Hofhansl
Assignee: Vladimir Rodionov
 Attachments: 11811-wip-v2.txt, 11811-wip-v4.txt, block_index-v2.txt


 Currently upon every seek (including Gets) we need to linearly look through 
 the block from the beginning until we find the Cell we are looking for.
 It should be possible to build a simple cache of offsets of Cells for each 
 block as it is loaded and then use binary search to find the Cell in question.
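
A minimal sketch of the idea described above, assuming a simplified block whose entries are stored as a 4-byte key length followed by the key bytes, sorted by key. The class BlockSeekSketch, its methods, and this layout are hypothetical illustrations, not HBase's HFile block format or API. The offsets are collected in one linear pass when the block is loaded, and each later seek is a binary search over them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the class, its methods and the entry layout are
// illustrative assumptions, not HBase's HFile block format or API.
final class BlockSeekSketch {
    private final ByteBuffer block; // entries: [int keyLength][key bytes], sorted by key
    private final int[] offsets;    // start offset of every entry, built once per block

    BlockSeekSketch(ByteBuffer block) {
        this.block = block;
        List<Integer> starts = new ArrayList<>();
        int pos = 0;
        while (pos < block.limit()) {            // single linear pass at load time
            starts.add(pos);
            pos += Integer.BYTES + block.getInt(pos);
        }
        offsets = new int[starts.size()];
        for (int i = 0; i < offsets.length; i++) {
            offsets[i] = starts.get(i);
        }
    }

    // Index of the first entry whose key is >= target, or entryCount() if none.
    int seek(byte[] target) {
        int lo = 0;
        int hi = offsets.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (compareKeyAt(offsets[mid], target) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    int entryCount() {
        return offsets.length;
    }

    // Unsigned lexicographic comparison of the stored key with the target.
    private int compareKeyAt(int offset, byte[] target) {
        int length = block.getInt(offset);
        int start = offset + Integer.BYTES;
        int common = Math.min(length, target.length);
        for (int i = 0; i < common; i++) {
            int diff = (block.get(start + i) & 0xff) - (target[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return length - target.length;
    }

    // Helper for experiments: packs already-sorted keys into the assumed layout.
    static ByteBuffer encode(String... sortedKeys) {
        int size = 0;
        for (String key : sortedKeys) {
            size += Integer.BYTES + key.getBytes(StandardCharsets.UTF_8).length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(size);
        for (String key : sortedKeys) {
            byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
            buffer.putInt(bytes.length).put(bytes);
        }
        buffer.flip();
        return buffer;
    }
}
```

The sketch trades one int of memory per Cell for O(log n) seeks. It assumes every entry can be compared on its own, which may not hold for blocks whose keys are encoded relative to the previous key.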



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13639) SyncTable - rsync for HBase tables

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604045#comment-14604045
 ] 

Hudson commented on HBASE-13639:


FAILURE: Integrated in HBase-0.98 #1042 (See 
[https://builds.apache.org/job/HBase-0.98/1042/])
Amend HBASE-13639 SyncTable - rsync for HBase tables (apurtell: rev 
df7ac74745ab881800d01d48a3a7f05c6a7992f4)
* hbase-hadoop1-compat/src/main/java/org/apache/hadoop/mapreduce/lib/output/MapFileOutputFormat.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHashTable.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HashTable.java


 SyncTable - rsync for HBase tables
 --

 Key: HBASE-13639
 URL: https://issues.apache.org/jira/browse/HBASE-13639
 Project: HBase
  Issue Type: New Feature
Reporter: Dave Latham
Assignee: Dave Latham
 Fix For: 2.0.0, 0.98.14, 1.2.0

 Attachments: HBASE-13639-0.98-addendum-hadoop-1.patch, 
 HBASE-13639-0.98.patch, HBASE-13639-v1.patch, HBASE-13639-v2.patch, 
 HBASE-13639-v3-0.98.patch, HBASE-13639-v3.patch, HBASE-13639.patch


 Given HBase tables in remote clusters with similar but not identical data, 
 efficiently update a target table such that the data in question is identical 
 to a source table.  Efficiency in this context means using far less network 
 traffic than would be required to ship all the data from one cluster to the 
 other.  Takes inspiration from rsync.
 Design doc: 
 https://docs.google.com/document/d/1-2c9kJEWNrXf5V4q_wBcoIXfdchN7Pxvxv1IO6PW0-U/
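
One way to picture the rsync-like saving is to compare a hash per key range instead of the rows themselves, and ship only the ranges whose hashes differ. The sketch below is an assumption-laden illustration of that general technique, not the design in the linked doc: the class RangeHashSketch, the in-memory maps standing in for tables, and the fixed range boundaries are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical sketch: sorted in-memory maps stand in for the two tables.
final class RangeHashSketch {
    // Digest of every row key and value in [startInclusive, stopExclusive).
    static byte[] hashRange(NavigableMap<String, String> table,
                            String startInclusive, String stopExclusive) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        for (Map.Entry<String, String> row
                : table.subMap(startInclusive, true, stopExclusive, false).entrySet()) {
            digest.update(row.getKey().getBytes(StandardCharsets.UTF_8));
            digest.update((byte) 0); // separator so key/value boundaries stay unambiguous
            digest.update(row.getValue().getBytes(StandardCharsets.UTF_8));
            digest.update((byte) 0);
        }
        return digest.digest();
    }

    // Start keys of the ranges whose contents differ; only these need to be shipped.
    static List<String> rangesToSync(NavigableMap<String, String> source,
                                     NavigableMap<String, String> target,
                                     List<String> boundaries) {
        List<String> differing = new ArrayList<>();
        for (int i = 0; i + 1 < boundaries.size(); i++) {
            String start = boundaries.get(i);
            String stop = boundaries.get(i + 1);
            if (!Arrays.equals(hashRange(source, start, stop),
                               hashRange(target, start, stop))) {
                differing.add(start);
            }
        }
        return differing;
    }
}
```

In this model only the digests cross the network for matching ranges, so traffic grows with the amount of divergence rather than with table size.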





[jira] [Commented] (HBASE-13356) HBase should provide an InputFormat supporting multiple scans in mapreduce jobs over snapshots

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604044#comment-14604044
 ] 

Hudson commented on HBASE-13356:


FAILURE: Integrated in HBase-0.98 #1042 (See 
[https://builds.apache.org/job/HBase-0.98/1042/])
Amend HBASE-13356 HBase should provide an InputFormat supporting multiple scans 
in mapreduce jobs over snapshots (Andrew Mains) (apurtell: rev 
cfb4827326b6743cb732b92580152bcf46647b2c)
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/ConfigurationUtil.java


 HBase should provide an InputFormat supporting multiple scans in mapreduce 
 jobs over snapshots
 --

 Key: HBASE-13356
 URL: https://issues.apache.org/jira/browse/HBASE-13356
 Project: HBase
  Issue Type: New Feature
  Components: mapreduce
Reporter: Andrew Mains
Assignee: Andrew Mains
Priority: Minor
 Fix For: 2.0.0, 0.98.14, 1.2.0

 Attachments: HBASE-13356-0.98-addendum-hadoop-1.patch, 
 HBASE-13356-0.98.patch, HBASE-13356-branch-1.patch, HBASE-13356.2.patch, 
 HBASE-13356.3.patch, HBASE-13356.4.patch, HBASE-13356.patch


 Currently, HBase supports the pushing of multiple scans to mapreduce jobs 
 over live tables (via MultiTableInputFormat) but only supports a single scan 
 for mapreduce jobs over table snapshots. It would be handy to support 
 multiple scans over snapshots as well, probably through another input format 
 (MultiTableSnapshotInputFormat?). To mimic the functionality present in 
 MultiTableInputFormat, the new input format would likely have to take in the 
 names of all snapshots used in addition to the scans.





[jira] [Commented] (HBASE-13959) Region splitting takes too long because it uses a single thread in most common cases

2015-06-27 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603999#comment-14603999
 ] 

Lars Hofhansl commented on HBASE-13959:
---

Heh... See 13959-suggest.txt that I attached with the same comment. :)
Basically your patch, but it defaults the max to the max # of storefiles. So in 
typical setups one does not have to worry about this setting.

 Region splitting takes too long because it uses a single thread in most 
 common cases
 

 Key: HBASE-13959
 URL: https://issues.apache.org/jira/browse/HBASE-13959
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.12
Reporter: Hari Krishna Dara
Assignee: Hari Krishna Dara
Priority: Critical
 Fix For: 0.98.14

 Attachments: 13959-suggest.txt, HBASE-13959-2.patch, 
 HBASE-13959-3.patch, HBASE-13959-4.patch, HBASE-13959.patch, 
 region-split-durations-compared.png


 When storefiles need to be split as part of a region split, the current logic 
 uses a threadpool whose size is set to the number of stores. 
 Since the most common table setup involves only a single column family, this 
 translates to having a single store, and so the threadpool runs with a 
 single thread. However, in a write-heavy workload, there could be several 
 tens of storefiles in a store at the time of splitting, and with a threadpool 
 size of one, these files end up getting split sequentially.
 With a bit of tracing, I noticed that it takes an average of 350ms to 
 create a single reference file, and splitting each storefile involves 
 creating two of these, so with a storefile count of 20, it takes about 14s 
 (20 storefiles x 2 reference files x 350ms) just to get through this phase, 
 pushing the total time the region is offline to 18s or more. For environments 
 that are set up to fail fast, this makes the client exhaust all retries and 
 fail with NotServingRegionException.
 The fix should increase the concurrency of this operation.
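
The arithmetic above, and the effect of sizing the pool by storefile count, can be sketched as follows. This is a rough model under stated assumptions: SplitConcurrencySketch, its method names, and the placeholder createReferences are hypothetical and are not the code in any of the attached patches. The estimate assumes work spreads evenly over the threads and ignores filesystem contention.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not the HBase split implementation.
final class SplitConcurrencySketch {
    // Each storefile needs two reference files; storefiles are processed in "waves".
    static long estimateMillis(int storefiles, int threads, long millisPerReference) {
        int waves = (storefiles + threads - 1) / threads; // ceiling division
        return waves * 2L * millisPerReference;
    }

    // Splits all storefiles with a pool capped by both maxThreads and the file count.
    static int splitAll(List<String> storefiles, int maxThreads) {
        int threads = Math.max(1, Math.min(maxThreads, storefiles.size()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (String storefile : storefiles) {
                Callable<Integer> task = () -> createReferences(storefile);
                pending.add(pool.submit(task));
            }
            int created = 0;
            for (Future<Integer> result : pending) {
                created += result.get();
            }
            return created;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("split failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Placeholder for the real work: pretend two reference files were written.
    static int createReferences(String storefile) {
        return 2;
    }
}
```

With the numbers from the description, estimateMillis(20, 1, 350) gives 14000 ms, while estimateMillis(20, 20, 350) gives 700 ms, which is the best case this simple model allows.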





[jira] [Commented] (HBASE-13814) AssignmentManager does not write the correct server name into Zookeeper when unassign region

2015-06-27 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604003#comment-14604003
 ] 

Lars Hofhansl commented on HBASE-13814:
---

+1 on v2.

 AssignmentManager does not write the correct server name into Zookeeper when 
 unassign region
 

 Key: HBASE-13814
 URL: https://issues.apache.org/jira/browse/HBASE-13814
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: 0.94.27
Reporter: cuijianwei
Priority: Minor
 Attachments: HBASE-13814-0.94-v1.patch, HBASE-13814-0.94-v2.patch


 When moving a region, the region is first unassigned from the corresponding 
 region server by the method AssignmentManager#unassign(). AssignmentManager 
 writes the region info and the server name into Zookeeper with the 
 following code:
 {code}
   versionOfClosingNode = ZKAssign.createNodeClosing(
 master.getZooKeeper(), region, master.getServerName());
 {code}
 It seems that the AssignmentManager misuses the master's name as the server 
 name. Suppose the ROOT region is being moved and the region server holding the 
 ROOT region has just crashed. The Master will try to start a 
 MetaServerShutdownHandler if the server is judged to be holding a meta region. The 
 judgment is made by the method AssignmentManager#isCarryingRegion, which 
 first checks the server name in Zookeeper:
 {code}
 ServerName addressFromZK = (data != null && data.getOrigin() != null) ?
   data.getOrigin() : null;
 if (addressFromZK != null) {
   // if we get something from ZK, we will use the data
   boolean matchZK = (addressFromZK != null &&
     addressFromZK.equals(serverName));
 {code}
 The wrong server name from Zookeeper means the server is not judged to be 
 holding the ROOT region. The master will then start a ServerShutdownHandler. 
 Unlike MetaServerShutdownHandler, the ServerShutdownHandler won't assign the ROOT 
 region first, so the ROOT region will never be assigned. In our test 
 environment, we encountered this problem when moving the ROOT region and stopping 
 the region server concurrently.
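
The mismatch described above can be reproduced with a tiny model of the ZooKeeper branch of the check. This is only a sketch: plain strings stand in for ServerName, the class ClosingNodeCheckSketch is hypothetical, and the server names used with it are made up.

```java
// Hypothetical model of the ZooKeeper branch of the check; plain strings stand
// in for ServerName, and none of this is the actual AssignmentManager code.
final class ClosingNodeCheckSketch {
    // Mirrors the quoted matchZK expression: the origin stored in ZooKeeper
    // must equal the name of the server that crashed.
    static boolean matchesOrigin(String originFromZk, String crashedServer) {
        return originFromZk != null && originFromZk.equals(crashedServer);
    }
}
```

If the closing node was written with the master's name, say "master1,60000,1", then matchesOrigin("master1,60000,1", "rs7,60020,1") is false and the crashed region server is not treated as carrying the region. Had the region server's own name been written, the same call with "rs7,60020,1" on both sides would be true.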





[jira] [Commented] (HBASE-8642) [Snapshot] List and delete snapshot by table

2015-06-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604273#comment-14604273
 ] 

Hadoop QA commented on HBASE-8642:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12742324/HBASE-8642-v2.patch
  against master branch at commit 7dbb2e69776bae8c2f2781f36528c0e784f93a06.
  ATTACHMENT ID: 12742324

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+puts "No snapshots matched the table name regular expression #{tableNameregex.to_s} and the snapshot name regular expression #{snapshotNameRegex.to_s}" if count == 0
+puts "#{successfullyDeleted} snapshots successfully deleted." unless successfullyDeleted == 0

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14593//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14593//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14593//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14593//console

This message is automatically generated.

 [Snapshot] List and delete snapshot by table
 

 Key: HBASE-8642
 URL: https://issues.apache.org/jira/browse/HBASE-8642
 Project: HBase
  Issue Type: Improvement
  Components: snapshots
Affects Versions: 0.98.0, 0.95.0, 0.95.1, 0.95.2
Reporter: Julian Zhou
Assignee: Ashish Singhi
 Fix For: 2.0.0

 Attachments: 8642-trunk-0.95-v0.patch, 8642-trunk-0.95-v1.patch, 
 8642-trunk-0.95-v2.patch, HBASE-8642-v1.patch, HBASE-8642-v2.patch, 
 HBASE-8642.patch


 Support listing and deleting snapshots by table name.
 User scenario:
 A user wants to delete all the snapshots that were taken in January 
 for a table 't', where the snapshot names start with 'Jan'.
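
The user scenario amounts to filtering snapshots with two regular expressions, one for the table name and one for the snapshot name. A minimal sketch of that filter follows. The class SnapshotFilterSketch and its Snapshot type are hypothetical stand-ins, not the API added by the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: Snapshot here is a plain value class, not an HBase type.
final class SnapshotFilterSketch {
    static final class Snapshot {
        final String name;
        final String table;

        Snapshot(String name, String table) {
            this.name = name;
            this.table = table;
        }
    }

    // Returns the snapshots whose table and name both fully match the given regexes.
    static List<Snapshot> matching(List<Snapshot> all, String tableRegex, String nameRegex) {
        Pattern tablePattern = Pattern.compile(tableRegex);
        Pattern namePattern = Pattern.compile(nameRegex);
        List<Snapshot> selected = new ArrayList<>();
        for (Snapshot snapshot : all) {
            if (tablePattern.matcher(snapshot.table).matches()
                    && namePattern.matcher(snapshot.name).matches()) {
                selected.add(snapshot);
            }
        }
        return selected;
    }
}
```

For the scenario above, matching(all, "t", "Jan.*") would select the January snapshots of table 't', and a delete operation would then iterate over that list.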





[jira] [Commented] (HBASE-8642) [Snapshot] List and delete snapshot by table

2015-06-27 Thread Ashish Singhi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604189#comment-14604189
 ] 

Ashish Singhi commented on HBASE-8642:
--

Patch addressing Matteo's concern.
Please review.






[jira] [Updated] (HBASE-8642) [Snapshot] List and delete snapshot by table

2015-06-27 Thread Ashish Singhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Singhi updated HBASE-8642:
-
Attachment: HBASE-8642-v2.patch






[jira] [Commented] (HBASE-13964) Skip region normalization for tables under namespace quota

2015-06-27 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604229#comment-14604229
 ] 

Ted Yu commented on HBASE-13964:


I see.
Let's wait till we hear some feedback from users who enable namespace quota on 
how normalization should be done.

 Skip region normalization for tables under namespace quota
 --

 Key: HBASE-13964
 URL: https://issues.apache.org/jira/browse/HBASE-13964
 Project: HBase
  Issue Type: Task
  Components: Balancer, Usability
Reporter: Mikhail Antonov
Assignee: Ted Yu
 Fix For: 2.0.0, 1.2.0, 1.3.0

 Attachments: 13964-branch-1-v2.txt, 13964-branch-1-v3.txt, 
 13964-v1.txt


 As [~te...@apache.org] pointed out in HBASE-13103, we need to discuss how to 
 normalize regions of tables under namespace control. What was proposed is to 
 disable normalization of such tables.





[jira] [Commented] (HBASE-13964) Skip region normalization for tables under namespace quota

2015-06-27 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604417#comment-14604417
 ] 

Ted Yu commented on HBASE-13964:


It seems metadata associated with split / merge requests would allow the server 
side to distinguish between the ones initiated by the normalizer and the ones 
triggered through other means.
As long as the net effect of splitting / merging initiated by the normalizer 
doesn't increase the number of regions, normalization should be allowed when 
namespace quota is in effect.







[jira] [Commented] (HBASE-13936) Improve configuration framework

2015-06-27 Thread Apekshit Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604404#comment-14604404
 ] 

Apekshit Sharma commented on HBASE-13936:
-

I think that Hadoop-style site files are a good design and should be left as 
such. The scope of this project and the design changes we have thought of so far 
(in the doc) will be invisible to users and will only impact devs.

bq. Moving from Configuration to ConfigurationManager

So the basic idea is to encapsulate Configuration within ConfigurationManager 
and provide a better API for handling configurations. That will help in 
building a better framework for dynamic configurations, type-checking 
configuration values, and getting rid of a few other bad patterns.
Since the aim here is to promote the right patterns (and possibly design the 
framework so that it's not possible to go otherwise), I will highlight major 
issues here and get everyone's opinions.
[~apurtell] On that note, what do you think about the issue of set*() functions (my 
last post)?

 Improve configuration framework
 ---

 Key: HBASE-13936
 URL: https://issues.apache.org/jira/browse/HBASE-13936
 Project: HBase
  Issue Type: Umbrella
Reporter: Apekshit Sharma
 Attachments: DynamicConfigs.v01.docx, design.png


 Here's the design doc: 
 https://docs.google.com/document/d/1WiO2bqguR2DaVT-J2SZTCONbQ3pEhpbOI_bbLMaXRjE/edit#
 Main changes:
 get*("foo.bar", default_value)  ---> get*(HConfig.FOO_BAR)  // using enums
 Robust framework and better documentation for dynamic configurations.
 Basic overview of new design:
 !design.png!
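
A minimal sketch of the enum-based lookup named under "Main changes", assuming each enum constant carries its property name and default value so that call sites cannot disagree on either. HConfig.FOO_BAR and "foo.bar" come from the description. Everything else (the class TypedConfigSketch, the second constant, the default values, the method bodies) is a hypothetical illustration and not the design in the doc.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of enum-keyed configuration access.
final class TypedConfigSketch {
    // Each key owns its property name and its default (defaults here are made up).
    enum HConfig {
        FOO_BAR("foo.bar", "42"),
        FOO_ENABLED("foo.enabled", "false");

        final String key;
        final String defaultValue;

        HConfig(String key, String defaultValue) {
            this.key = key;
            this.defaultValue = defaultValue;
        }
    }

    private final Map<String, String> raw = new HashMap<>();

    void set(HConfig config, String value) {
        raw.put(config.key, value);
    }

    // The default comes from the enum, so callers never pass one.
    int getInt(HConfig config) {
        return Integer.parseInt(raw.getOrDefault(config.key, config.defaultValue));
    }

    boolean getBoolean(HConfig config) {
        return Boolean.parseBoolean(raw.getOrDefault(config.key, config.defaultValue));
    }
}
```

Under these assumptions a typo in a key becomes a compile error instead of a silently applied default, and each default is defined in exactly one place.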


