[jira] [Created] (HBASE-10077) Per family WAL encryption
Andrew Purtell created HBASE-10077: -- Summary: Per family WAL encryption Key: HBASE-10077 URL: https://issues.apache.org/jira/browse/HBASE-10077 Project: HBase Issue Type: Improvement Affects Versions: 0.98.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.98.1 HBASE-7544 introduces WAL encryption to prevent the leakage of protected data to disk by way of WAL files. However it is currently enabled globally for the regionserver. Encryption of WAL entries should depend on whether or not an entry in the WAL is to be stored within an encrypted column family. -- This message was sent by Atlassian JIRA (v6.1#6144)
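To make the idea concrete, here is a minimal sketch of the kind of per-entry check the request implies, assuming a hypothetical helper on the WAL append path (walEntryNeedsEncryption is made up and is not the HBASE-7544/HBASE-10077 code; HColumnDescriptor.getEncryptionType() is the 0.98 per-family encryption attribute): encrypt a WAL entry only if at least one family it touches has encryption configured.
{code:title=WalEncryptionSketch.java|borderStyle=solid}
// Hedged sketch: decide per WAL entry whether encryption is needed, based on
// the column families the edit touches. The helper name is hypothetical.
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

public class WalEncryptionSketch {
  /** Returns true if any KeyValue in the edit belongs to an encrypted family. */
  static boolean walEntryNeedsEncryption(HTableDescriptor htd, WALEdit edit) {
    for (KeyValue kv : edit.getKeyValues()) {
      HColumnDescriptor hcd = htd.getFamily(kv.getFamily());
      if (hcd != null && hcd.getEncryptionType() != null) {
        return true; // at least one touched family is encrypted
      }
    }
    return false; // a plaintext WAL entry is fine
  }
}
{code}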
[jira] [Commented] (HBASE-9883) Support Tags in TColumnValue in Thrift
[ https://issues.apache.org/jira/browse/HBASE-9883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838711#comment-13838711 ] Anoop Sam John commented on HBASE-9883: --- Is this fixed already as part of HBASE-9884 Ram? Support Tags in TColumnValue in Thrift -- Key: HBASE-9883 URL: https://issues.apache.org/jira/browse/HBASE-9883 Project: HBase Issue Type: Improvement Components: Thrift Affects Versions: 0.98.0 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.98.1 Suggested by Anoop; this JIRA was raised to handle it separately. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9883) Support Tags in TColumnValue in Thrift
[ https://issues.apache.org/jira/browse/HBASE-9883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838730#comment-13838730 ] ramkrishna.s.vasudevan commented on HBASE-9883: --- No. It has to be fixed as part of this JIRA. Support Tags in TColumnValue in Thrift -- Key: HBASE-9883 URL: https://issues.apache.org/jira/browse/HBASE-9883 Project: HBase Issue Type: Improvement Components: Thrift Affects Versions: 0.98.0 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.98.1 Suggested by Anoop; this JIRA was raised to handle it separately. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10031) Add a section on the transparent CF encryption feature to the manual
[ https://issues.apache.org/jira/browse/HBASE-10031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-10031: --- Attachment: 10031.patch Add a section on the transparent CF encryption feature to the manual Key: HBASE-10031 URL: https://issues.apache.org/jira/browse/HBASE-10031 Project: HBase Issue Type: Task Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Blocker Fix For: 0.98.0 Attachments: 10031.patch Document HBASE-7544 in detail in the manual. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-10031) Add a section on the transparent CF encryption feature to the manual
[ https://issues.apache.org/jira/browse/HBASE-10031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell resolved HBASE-10031. Resolution: Fixed Attached what I committed to trunk. Documentation updates have been committed using both RTC and CTR. Opting for CTR for expediency. Add a section on the transparent CF encryption feature to the manual Key: HBASE-10031 URL: https://issues.apache.org/jira/browse/HBASE-10031 Project: HBase Issue Type: Task Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Blocker Fix For: 0.98.0 Attachments: 10031.patch Document HBASE-7544 in detail in the manual. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Comment Edited] (HBASE-10031) Add a section on the transparent CF encryption feature to the manual
[ https://issues.apache.org/jira/browse/HBASE-10031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838759#comment-13838759 ] Andrew Purtell edited comment on HBASE-10031 at 12/4/13 9:17 AM: - Attached what I committed to trunk. Documentation updates have been committed using both RTC and CTR. Opting for CTR for expediency. Edit: I ran mvn pre-site and eyeballed the resulting HTML output. was (Author: apurtell): Attached what I committed to trunk. Documentation updates have been committed using both RTC and CTR. Opting for CTR for expediency. Add a section on the transparent CF encryption feature to the manual Key: HBASE-10031 URL: https://issues.apache.org/jira/browse/HBASE-10031 Project: HBase Issue Type: Task Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Blocker Fix For: 0.98.0 Attachments: 10031.patch Document HBASE-7544 in detail in the manual. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9682) Bulk loading fails after ClassLoader is updated on OSGi client
[ https://issues.apache.org/jira/browse/HBASE-9682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amit Sela updated HBASE-9682: - Affects Version/s: 0.94.12 Bulk loading fails after ClassLoader is updated on OSGi client -- Key: HBASE-9682 URL: https://issues.apache.org/jira/browse/HBASE-9682 Project: HBase Issue Type: Bug Components: Client, HFile, io Affects Versions: 0.94.2, 0.94.12 Reporter: Amit Sela In an OSGi (Felix) client environment, running with a bundled HBase (built with the bnd tool), the ClassLoader can be updated - for instance when the client bundle is updated without updating the HBase bundle. The Algorithm class in HBase is a static enum, so its Configuration member still holds on to the old CL. This causes an NPE when trying to bulk load using LoadIncrementalHFiles. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9682) Bulk loading fails after ClassLoader is updated on OSGi client
[ https://issues.apache.org/jira/browse/HBASE-9682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amit Sela updated HBASE-9682: - Attachment: HBASE-9682.patch Patch fixing the issue for HBase 0.94.12. I removed the Configuration member and added a createConfiguration method that creates a new Configuration (with the hadoop.native.lib boolean true) on demand. This way, there is no cached Configuration with a stale CL after a bundle update (in an OSGi client environment). Bulk loading fails after ClassLoader is updated on OSGi client -- Key: HBASE-9682 URL: https://issues.apache.org/jira/browse/HBASE-9682 Project: HBase Issue Type: Bug Components: Client, HFile, io Affects Versions: 0.94.2, 0.94.12 Reporter: Amit Sela Attachments: HBASE-9682.patch In an OSGi (Felix) client environment, running with a bundled HBase (built with the bnd tool), the ClassLoader can be updated - for instance when the client bundle is updated without updating the HBase bundle. The Algorithm class in HBase is a static enum, so its Configuration member still holds on to the old CL. This causes an NPE when trying to bulk load using LoadIncrementalHFiles. -- This message was sent by Atlassian JIRA (v6.1#6144)
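A minimal sketch of the approach described in the comment above, with a hypothetical class name (this is not the attached patch): drop the cached Configuration from the static enum and build a fresh one per call, so class resolution always goes through the current ClassLoader.
{code:title=CompressionConfSketch.java|borderStyle=solid}
// Hedged sketch of the fix described above, not the exact HBASE-9682.patch.
import org.apache.hadoop.conf.Configuration;

final class CompressionConfSketch {
  // Before (problematic): a Configuration cached in the static Algorithm enum
  // pins whatever ClassLoader was current when the enum was first initialized.
  // private static final Configuration conf = new Configuration();

  /** After: create a Configuration on demand, as the patch comment describes. */
  static Configuration createConfiguration() {
    Configuration conf = new Configuration();
    // matches the "hadoop.native.lib boolean true" mentioned in the comment
    conf.setBoolean("hadoop.native.lib", true);
    return conf;
  }
}
{code}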
[jira] [Commented] (HBASE-5273) Provide a coprocessor template for fast development and testing
[ https://issues.apache.org/jira/browse/HBASE-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838799#comment-13838799 ] takeshi.miao commented on HBASE-5273: - I think this ticket could be closed because the related example code is already in trunk: {code} find -name example ./hbase-examples/src/main/java/org/apache/hadoop/hbase/coprocessor/example ./hbase-examples/src/test/java/org/apache/hadoop/hbase/coprocessor/example ./hbase-server/src/main/java/org/apache/hadoop/hbase/backup/example ./hbase-server/src/test/java/org/apache/hadoop/hbase/backup/example {code} Provide a coprocessor template for fast development and testing --- Key: HBASE-5273 URL: https://issues.apache.org/jira/browse/HBASE-5273 Project: HBase Issue Type: Improvement Components: Coprocessors Affects Versions: 0.92.0 Reporter: Mingjie Lai Priority: Minor While reworking the coprocessor blog, I started to realize that we should have a coprocessor template that helps users quickly start developing and testing a customized coprocessor. Currently there are some built-in coprocessors, but they are scattered all over the code base, and a user has to search around the code to see how to develop a new one. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-7057) Store Server Load in a Table
[ https://issues.apache.org/jira/browse/HBASE-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838803#comment-13838803 ] Elliott Clark commented on HBASE-7057: -- Nope sounds great thanks [~apurtell] Store Server Load in a Table Key: HBASE-7057 URL: https://issues.apache.org/jira/browse/HBASE-7057 Project: HBase Issue Type: Improvement Components: metrics, UI Affects Versions: 0.95.2 Reporter: Elliott Clark Priority: Critical Labels: noob Currently the server heartbeat sends over server load and region loads. This is used to display and store metrics about a region server. It is also used to remember the sequence id of flushes. This should be moved into an HBase table. * Allow the last sequence id to persist over a master restart. * That would allow the balancer to have a more complete picture of what's happened in the past. * Allow tools to be created to monitor HBase using HBase. * Simplify/remove the heartbeat. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9986) Incorporate HTTPS support for HBase (0.94 port)
[ https://issues.apache.org/jira/browse/HBASE-9986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838840#comment-13838840 ] Hudson commented on HBASE-9986: --- SUCCESS: Integrated in HBase-0.94-security #351 (See [https://builds.apache.org/job/HBase-0.94-security/351/]) HBASE-9986 Incorporate HTTPS support for HBase (0.94 port) (Aditya Kishore) (larsh: rev 1547706) * /hbase/branches/0.94/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/BackupMasterStatusTmpl.jamon * /hbase/branches/0.94/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon * /hbase/branches/0.94/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon * /hbase/branches/0.94/src/main/resources/hbase-webapps/master/table.jsp Incorporate HTTPS support for HBase (0.94 port) --- Key: HBASE-9986 URL: https://issues.apache.org/jira/browse/HBASE-9986 Project: HBase Issue Type: Task Affects Versions: 0.94.13 Reporter: Aditya Kishore Assignee: Aditya Kishore Fix For: 0.94.15 Attachments: HBASE-9954_0.94.patch In various classes, "http://" is hard-coded. This JIRA adds support for using the HBase web UI via HTTPS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9261) Add cp hooks after {start|close}RegionOperation in batchMutate
[ https://issues.apache.org/jira/browse/HBASE-9261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838843#comment-13838843 ] ramkrishna.s.vasudevan commented on HBASE-9261: --- In postCompleteBatchMutate() can we pass a flag to indicate whether it is getting called out of success or failure in the finally block? Add cp hooks after {start|close}RegionOperation in batchMutate -- Key: HBASE-9261 URL: https://issues.apache.org/jira/browse/HBASE-9261 Project: HBase Issue Type: Sub-task Reporter: rajeshbabu Assignee: rajeshbabu Attachments: HBASE-9261.patch, HBASE-9261_v2.patch, HBASE-9261_v3.patch, HBASE-9261_v4.patch These hooks helps for checking Resources(blocking memstore size) and necessary locking on index region while performing batch of mutations. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader whe using FilterList
Federico Gaule created HBASE-10078: -- Summary: Dynamic Filter - Not using DynamicClassLoader whe using FilterList Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Priority: Minor I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and 
fails without checking out for dynamic jars. I think the issue is related to FilterList#readFields: public void readFields(final DataInput in) throws IOException { byte opByte = in.readByte(); operator = Operator.values()[opByte]; int size = in.readInt(); if (size > 0) { filters = new ArrayList<Filter>(size); for (int i = 0; i < size; i++) { Filter filter = (Filter)HbaseObjectWritable.readObject(in, conf); filters.add(filter); } } } HbaseObjectWritable#readObject uses a conf (created by calling HBaseConfiguration.create()) which I suppose doesn't include a DynamicClassLoader instance. -- This message was sent by Atlassian JIRA (v6.1#6144)
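One possible direction, sketched here purely as an assumption (it is not the committed fix and targets the 0.94 Writable code path): inject the deserialization-time Configuration, whose class lookups could be made to go through the DynamicClassLoader, instead of building one with HBaseConfiguration.create(). The Configurable wiring and class name below are illustrative.
{code:title=DynamicAwareFilterListSketch.java|borderStyle=solid}
// Hedged sketch: let the RPC layer hand the filter a Configuration before
// readFields() runs, so nested filters resolve through that conf's ClassLoader
// (assumed to be the DynamicClassLoader) rather than a freshly created conf.
import java.io.DataInput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.io.HbaseObjectWritable;

public class DynamicAwareFilterListSketch implements Configurable {
  private Configuration conf;                       // injected, not created here
  private List<Filter> filters = new ArrayList<Filter>();

  @Override
  public void setConf(Configuration conf) { this.conf = conf; }

  @Override
  public Configuration getConf() { return conf; }

  /** Deserialize nested filters through the injected conf's class loading. */
  public void readFields(final DataInput in) throws IOException {
    int size = in.readInt();
    for (int i = 0; i < size; i++) {
      filters.add((Filter) HbaseObjectWritable.readObject(in, conf));
    }
  }
}
{code}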
[jira] [Updated] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader whe using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Federico Gaule updated HBASE-10078: --- Description: I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and fails without checking out for dynamic jars. 
I think the issue is related to FilterList#readFields {code:title=FilterList.java|borderStyle=solid} public void readFields(final DataInput in) throws IOException { byte opByte = in.readByte(); operator = Operator.values()[opByte]; int size = in.readInt(); if (size > 0) { filters = new ArrayList<Filter>(size); for (int i = 0; i < size; i++) { Filter filter = (Filter)HbaseObjectWritable.readObject(in, conf); filters.add(filter); } } } {code} HbaseObjectWritable#readObject uses a conf (created by calling HBaseConfiguration.create()) which I suppose doesn't include a DynamicClassLoader instance. was: I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG
[jira] [Updated] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader whe using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Federico Gaule updated HBASE-10078: --- Description: I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. {noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and fails without checking out for dynamic jars. 
I think the issue is related to FilterList#readFields {code:title=FilterList.java|borderStyle=solid} public void readFields(final DataInput in) throws IOException { byte opByte = in.readByte(); operator = Operator.values()[opByte]; int size = in.readInt(); if (size > 0) { filters = new ArrayList<Filter>(size); for (int i = 0; i < size; i++) { Filter filter = (Filter)HbaseObjectWritable.readObject(in, conf); filters.add(filter); } } } {code} HbaseObjectWritable#readObject uses a conf (created by calling HBaseConfiguration.create()) which I suppose doesn't include a DynamicClassLoader instance. was: I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG
[jira] [Updated] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList
[ https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Federico Gaule updated HBASE-10078: --- Summary: Dynamic Filter - Not using DynamicClassLoader when using FilterList (was: Dynamic Filter - Not using DynamicClassLoader whe using FilterList) Dynamic Filter - Not using DynamicClassLoader when using FilterList --- Key: HBASE-10078 URL: https://issues.apache.org/jira/browse/HBASE-10078 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.13 Reporter: Federico Gaule Priority: Minor I've tried to use dynamic jar load (https://issues.apache.org/jira/browse/HBASE-1936) but seems to have an issue with FilterList. Here is some log from my app where i send a Get with a FilterList containing AFilter and other with BFilter. {noformat} 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found - using dynamical class loader 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter 2013-12-02 13:55:42,564 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any 2013-12-02 13:55:42,677 DEBUG org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: d.p.AFilter 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Can't find class d.p.BFilter java.lang.ClassNotFoundException: d.p.BFilter at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820) at org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324) at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594) at org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514) at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {noformat} AFilter is not found so it tries with DynamicClassLoader, but when it tries to load AFilter, it uses URLClassLoader and fails without checking out for dynamic jars. I think the issue is related to FilterList#readFields {code:title=FilterList.java|borderStyle=solid} public void readFields(final DataInput in) throws IOException { byte opByte = in.readByte(); operator = Operator.values()[opByte]; int size = in.readInt(); if (size > 0) { filters = new ArrayList<Filter>(size); for (int i = 0; i < size; i++) { Filter filter = (Filter)HbaseObjectWritable.readObject(in, conf); filters.add(filter); } } } {code} HbaseObjectWritable#readObject uses a conf (created by calling HBaseConfiguration.create()) which I suppose doesn't include a DynamicClassLoader instance. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9986) Incorporate HTTPS support for HBase (0.94 port)
[ https://issues.apache.org/jira/browse/HBASE-9986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838851#comment-13838851 ] Hudson commented on HBASE-9986: --- SUCCESS: Integrated in HBase-0.94 #1217 (See [https://builds.apache.org/job/HBase-0.94/1217/]) HBASE-9986 Incorporate HTTPS support for HBase (0.94 port) (Aditya Kishore) (larsh: rev 1547706) * /hbase/branches/0.94/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/BackupMasterStatusTmpl.jamon * /hbase/branches/0.94/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon * /hbase/branches/0.94/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon * /hbase/branches/0.94/src/main/resources/hbase-webapps/master/table.jsp Incorporate HTTPS support for HBase (0.94 port) --- Key: HBASE-9986 URL: https://issues.apache.org/jira/browse/HBASE-9986 Project: HBase Issue Type: Task Affects Versions: 0.94.13 Reporter: Aditya Kishore Assignee: Aditya Kishore Fix For: 0.94.15 Attachments: HBASE-9954_0.94.patch In various classes, "http://" is hard-coded. This JIRA adds support for using the HBase web UI via HTTPS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-5273) Provide a coprocessor template for fast development and testing
[ https://issues.apache.org/jira/browse/HBASE-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5273. -- Resolution: Won't Fix Resolving as won't fix as per suggestion above by [~takeshi.miao] Provide a coprocessor template for fast development and testing --- Key: HBASE-5273 URL: https://issues.apache.org/jira/browse/HBASE-5273 Project: HBase Issue Type: Improvement Components: Coprocessors Affects Versions: 0.92.0 Reporter: Mingjie Lai Priority: Minor While reworking the coprocessor blog, I started to realize that we should have a coprocessor template that helps users quickly start developing and testing a customized coprocessor. Currently there are some built-in coprocessors, but they are scattered all over the code base, and a user has to search around the code to see how to develop a new one. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10031) Add a section on the transparent CF encryption feature to the manual
[ https://issues.apache.org/jira/browse/HBASE-10031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838889#comment-13838889 ] stack commented on HBASE-10031: --- The doc is great. Add a section on the transparent CF encryption feature to the manual Key: HBASE-10031 URL: https://issues.apache.org/jira/browse/HBASE-10031 Project: HBase Issue Type: Task Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Blocker Fix For: 0.98.0 Attachments: 10031.patch Document HBASE-7544 in detail in the manual. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9485) TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart
[ https://issues.apache.org/jira/browse/HBASE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838927#comment-13838927 ] Ted Yu commented on HBASE-9485: --- Integrated to 0.96, 0.98 and trunk. Thanks for the reviews, Devaraj, Nick and Vinod. TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart -- Key: HBASE-9485 URL: https://issues.apache.org/jira/browse/HBASE-9485 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0, 0.96.2 Attachments: 9485-v2.txt HBase extends OutputCommitter which turns recovery off. Meaning all completed maps are lost on RM restart and job starts from scratch. FileOutputCommitter implements recovery so we should look at that to see what is potentially needed for recovery. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9485) TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart
[ https://issues.apache.org/jira/browse/HBASE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-9485: -- Fix Version/s: 0.96.2 0.98.0 TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart -- Key: HBASE-9485 URL: https://issues.apache.org/jira/browse/HBASE-9485 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0, 0.96.2 Attachments: 9485-v2.txt HBase extends OutputCommitter which turns recovery off. Meaning all completed maps are lost on RM restart and job starts from scratch. FileOutputCommitter implements recovery so we should look at that to see what is potentially needed for recovery. -- This message was sent by Atlassian JIRA (v6.1#6144)
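A hedged sketch of what "implement recovery" could look like for a committer that writes straight to an HBase table (this is not necessarily the attached 9485-v2.txt patch): since commitTask() is already a no-op for table output, it may be enough to declare recovery supported so the MR framework keeps completed maps across a ResourceManager restart.
{code:title=RecoverableTableOutputCommitterSketch.java|borderStyle=solid}
// Sketch only: an OutputCommitter that opts in to recovery. Writes go directly
// to the table, so recoverTask() has nothing to replay.
import java.io.IOException;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.OutputCommitter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

public class RecoverableTableOutputCommitterSketch extends OutputCommitter {
  @Override public void setupJob(JobContext c) { }
  @Override public void setupTask(TaskAttemptContext c) { }
  @Override public boolean needsTaskCommit(TaskAttemptContext c) { return false; }
  @Override public void commitTask(TaskAttemptContext c) { }
  @Override public void abortTask(TaskAttemptContext c) { }

  // Declaring recovery supported is what lets the job resume instead of
  // restarting from scratch after an RM restart.
  @Override public boolean isRecoverySupported() { return true; }
  @Override public void recoverTask(TaskAttemptContext c) throws IOException { }
}
{code}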
[jira] [Updated] (HBASE-9892) Add info port to ServerName to support multi instances in a node
[ https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-9892: - Attachment: HBASE-9892-v5.txt First cut at a trunk patch (I just saw Liu Shaohui sleeping at his desk; obviously a man who is working too hard -- smile). Liu wants this patch so he can run two HBases on a single node. Add info port to ServerName to support multi instances in a node Key: HBASE-9892 URL: https://issues.apache.org/jira/browse/HBASE-9892 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, HBASE-9892-0.94-v3.diff, HBASE-9892-0.94-v4.diff, HBASE-9892-0.94-v5.diff, HBASE-9892-v5.txt The full GC time of a regionserver with a big heap (> 30G) usually cannot be kept under 30s. At the same time, servers with 64G of memory are now normal. So we try to deploy multiple RS instances (2-3) on a single node, with a heap of about 20G ~ 24G for each RS. Most things work fine, except the HBase web UI. The master gets the RS info port from the conf, which is not suitable for the situation of multiple RS instances on a node. So we add the info port to ServerName. a. At startup, the RS reports its info port to HMaster. b. For the root region, the RS writes the servername with info port to the zookeeper root-region-server node. c. For meta regions, the RS writes the servername with info port to the root region. d. For user regions, the RS writes the servername with info port to the meta regions. So HMaster and clients can get the info port from the servername. To test this feature, I changed the RS num from 1 to 3 in standalone mode, so we can test it in standalone mode. I think Hoya (HBase on YARN) will encounter the same problem. Does anyone know how Hoya handles this problem? PS: There are different formats for the servername in the zk node and the meta table; I think we need to unify them and refactor the code. -- This message was sent by Atlassian JIRA (v6.1#6144)
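As an illustration only (the attached patches may encode this differently), the description amounts to carrying an extra info-port field alongside the usual host, port and startcode when a regionserver registers itself. A toy encoding might look like the following, with the fourth field purely hypothetical.
{code:title=ServerNameWithInfoPortSketch.java|borderStyle=solid}
// Hedged sketch: append the info port to the registered server name so the
// master and clients can link to the right web UI when several regionservers
// share one host. The on-wire format here is made up for illustration.
public class ServerNameWithInfoPortSketch {
  /** e.g. "host1.example.com,60020,1386144000000,60030" (last field hypothetical). */
  static String toRegistrationString(String host, int rpcPort, long startcode, int infoPort) {
    return host + "," + rpcPort + "," + startcode + "," + infoPort;
  }

  static int parseInfoPort(String registration) {
    String[] parts = registration.split(",");
    // fall back when an old-format name without the extra field is read
    return parts.length >= 4 ? Integer.parseInt(parts[3]) : -1;
  }
}
{code}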
[jira] [Created] (HBASE-10079) Increments lost after flush
Jonathan Hsieh created HBASE-10079: -- Summary: Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839042#comment-13839042 ] Jonathan Hsieh commented on HBASE-10079: Here's a link to the test programs I used to pull out this bug. It needs to be polished and turned in to an IT test as well as a perf test probably in a separate issue. https://github.com/jmhsieh/hbase/tree/increval Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
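The linked test programs are not reproduced here; a minimal sketch of what the verification step amounts to, with made-up table, row and column names and using the 0.96-era client API, would be:
{code:title=IncrementVerifierSketch.java|borderStyle=solid}
// Hedged sketch of the verify step only; the linked IncrementBlaster and
// IncrementVerifier are the real rig. Table, row and column names are made up.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class IncrementVerifierSketch {
  public static void main(String[] args) throws Exception {
    long expected = Long.parseLong(args[0]); // total increments the blaster was configured to send
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "incr_test");               // hypothetical table name
    try {
      Get get = new Get(Bytes.toBytes("the-row"));              // hypothetical row
      get.addColumn(Bytes.toBytes("f"), Bytes.toBytes("c"));    // hypothetical family/qualifier
      Result r = table.get(get);
      long count = Bytes.toLong(r.getValue(Bytes.toBytes("f"), Bytes.toBytes("c")));
      System.out.println(count == expected
          ? "count OK: " + count
          : "increments lost: " + count + " != " + expected);
    } finally {
      table.close();
    }
  }
}
{code}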
[jira] [Updated] (HBASE-10080) Unnecessary call to locateRegion when creating an HTable instance
[ https://issues.apache.org/jira/browse/HBASE-10080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10080: Attachment: 10080.v1.patch Unnecessary call to locateRegion when creating an HTable instance - Key: HBASE-10080 URL: https://issues.apache.org/jira/browse/HBASE-10080 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.98.0, 0.96.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Trivial Fix For: 0.96.2, 0.98.1 Attachments: 10080.v1.patch It's more or less in contradiction with the objective of having lightweight HTable objects and the data may be stale when we will use it -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-10080) Unnecessary call to locateRegion when creating an HTable instance
Nicolas Liochon created HBASE-10080: --- Summary: Unnecessary call to locateRegion when creating an HTable instance Key: HBASE-10080 URL: https://issues.apache.org/jira/browse/HBASE-10080 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.96.0, 0.98.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Trivial Fix For: 0.96.2, 0.98.1 Attachments: 10080.v1.patch It's more or less in contradiction with the objective of having lightweight HTable objects and the data may be stale when we will use it -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10080) Unnecessary call to locateRegion when creating an HTable instance
[ https://issues.apache.org/jira/browse/HBASE-10080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10080: Status: Patch Available (was: Open) Unnecessary call to locateRegion when creating an HTable instance - Key: HBASE-10080 URL: https://issues.apache.org/jira/browse/HBASE-10080 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.96.0, 0.98.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Trivial Fix For: 0.96.2, 0.98.1 Attachments: 10080.v1.patch It's more or less in contradiction with the objective of having lightweight HTable objects and the data may be stale when we will use it -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10080) Unnecessary call to locateRegion when creating an HTable instance
[ https://issues.apache.org/jira/browse/HBASE-10080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839088#comment-13839088 ] Lars Hofhansl commented on HBASE-10080: --- This is verifying essentially that the table exists. It is light weight if: # the location is already cached, or # you end up accessing the first region of the table later anyway Unnecessary call to locateRegion when creating an HTable instance - Key: HBASE-10080 URL: https://issues.apache.org/jira/browse/HBASE-10080 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.98.0, 0.96.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Trivial Fix For: 0.96.2, 0.98.1 Attachments: 10080.v1.patch It's more or less in contradiction with the objective of having lightweight HTable objects and the data may be stale when we will use it -- This message was sent by Atlassian JIRA (v6.1#6144)
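For callers that relied on the constructor-time lookup as an implicit existence check, an explicit check keeps that behaviour. The sketch below is illustrative (the table name is made up) and uses the long-standing HBaseAdmin.tableExists call rather than anything from the attached patch.
{code:title=ExplicitExistenceCheckSketch.java|borderStyle=solid}
// Hedged sketch: verify the table exists up front, then construct the HTable.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;

public class ExplicitExistenceCheckSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      if (!admin.tableExists("my_table")) {   // hypothetical table name
        throw new IllegalStateException("table is missing");
      }
    } finally {
      admin.close();
    }
    // without an eager region lookup, constructing the HTable stays lightweight
    HTable table = new HTable(conf, "my_table");
    table.close();
  }
}
{code}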
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839085#comment-13839085 ] Jonathan Hsieh commented on HBASE-10079: In 0.96.0: * flush: Not able to reproduce data loss * with kill: Not able to reproduce data loss. had an overcount of 1. * with kill -9: Not able to reproduce data loss. had an overcount of 1. The overcount of 1 is likely a different bug that I think that I'll let slide. Likely the client thought it failed and retried, but it actually made it to the log and increments not being idempotent. So the bug is somewhere between 0.96.0 and 0.96.1rc1. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839099#comment-13839099 ] Sergey Shelukhin commented on HBASE-10079: -- does the writer check for exceptions? can you try disabling nonces, to see if there could be collisions (although I would expect the client to receive exceptions in such cases) Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839102#comment-13839102 ] Sergey Shelukhin commented on HBASE-10079: -- hbase.regionserver.nonces.enabled is the server config setting. Although, during replay, the updates are never blocked if nonces collide. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839103#comment-13839103 ] Jonathan Hsieh commented on HBASE-10079: Do you the increval rig with the github link in the first comment? No, that a was a quick and dirty program to duplicate a customer issue. I'm in the process of adding flushes to the TestAtomicOperation unit tests that [~lhofhansl] mentioned in the mailing list. I'll be able to bisect find the bug that way. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839114#comment-13839114 ] Jonathan Hsieh commented on HBASE-10079: This was the issue that fixed the problem in 0.94 and 0.95 branches (at the time). It added a test to TestHRegion called testParallelIncrementWithMemStoreFlush that tests the situation. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Comment Edited] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839103#comment-13839103 ] Jonathan Hsieh edited comment on HBASE-10079 at 12/4/13 5:49 PM: - Does the increval rig with the github link in the first comment check for exceptions? No, it was a quick and dirty program to duplicate a customer issue. I'm in the process of adding flushes to the TestAtomicOperation unit tests that [~lhofhansl] mentioned in the mailing list. I'll be able to bisect find the bug that way. was (Author: jmhsieh): Do you the increval rig with the github link in the first comment? No, that a was a quick and dirty program to duplicate a customer issue. I'm in the process of adding flushes to the TestAtomicOperation unit tests that [~lhofhansl] mentioned in the mailing list. I'll be able to bisect find the bug that way. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-7091) support custom GC options in hbase-env.sh
[ https://issues.apache.org/jira/browse/HBASE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839122#comment-13839122 ] Nicolas Liochon commented on HBASE-7091: I understand. Jira created :-). support custom GC options in hbase-env.sh - Key: HBASE-7091 URL: https://issues.apache.org/jira/browse/HBASE-7091 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.4 Reporter: Jesse Yates Assignee: Jesse Yates Labels: newbie Fix For: 0.94.4, 0.95.0 Attachments: hbase-7091-v1.patch When running things like bin/start-hbase and bin/hbase-daemon.sh start [master|regionserver|etc] we end up setting the HBASE_OPTS property a couple of times via calling hbase-env.sh. This is generally not a problem for most cases, but when you want to set your own GC log properties, one would think you should set HBASE_GC_OPTS, which gets added to HBASE_OPTS. NOPE! That would make too much sense. Running bin/hbase-daemons.sh will run bin/hbase-daemon.sh with the daemons it needs to start. Each time through hbase-daemon.sh we also call bin/hbase. This isn't a big deal except that for each call to hbase-daemon.sh, we also source hbase-env.sh twice (once in the script and once in bin/hbase). This is important for my next point. Note that to turn on GC logging, you uncomment: {code} # export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps $HBASE_GC_OPTS" {code} and then to log to a gc file for each server, you then uncomment: {code} # export HBASE_USE_GC_LOGFILE=true {code} in hbase-env.sh. On the first pass through hbase-daemon.sh, HBASE_GC_OPTS isn't set, so HBASE_OPTS doesn't get anything funky, but we set HBASE_USE_GC_LOGFILE, which then sets HBASE_GC_OPTS to the log file (-Xloggc:...). Then in bin/hbase we again run hbase-env.sh, which now has HBASE_GC_OPTS set, adding the GC file. This isn't a general problem because HBASE_OPTS is set without prefixing the existing HBASE_OPTS (e.g. HBASE_OPTS=$HBASE_OPTS ...), allowing easy updating. However, GC OPTS don't work the same and this is really odd behavior when you want to set your own GC opts, which can include turning on GC log rolling (yes, yes, they really are jvm opts, but they ought to support their own param, to help minimize clutter). The simple version of this patch will just add an idempotent GC option to hbase-env.sh and some comments that uncommenting {code} # export HBASE_USE_GC_LOGFILE=true {code} will lead to a custom gc log file per server (along with an example name), so you don't need to set -Xloggc. The more complex solution does the above and also solves the multiple calls to hbase-env.sh so we can be sane about how all this works. Note that to fix this, hbase-daemon.sh just needs to read in HBASE_USE_GC_LOGFILE after sourcing hbase-env.sh and then update HBASE_OPTS. Oh and also not source hbase-env.sh in bin/hbase. Even further, we might want to consider adding options just for cases where we don't need gc logging - i.e. the shell, the config reading tool, hbck, etc. This is the hardest version to handle since the first couple will willy-nilly apply the gc options. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-10081) Since HBASE-7091, HBASE_OPTS cannot be set on the command line
Nicolas Liochon created HBASE-10081: --- Summary: Since HBASE-7091, HBASE_OPTS cannot be set on the command line Key: HBASE-10081 URL: https://issues.apache.org/jira/browse/HBASE-10081 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.96.0, 0.98.0 Reporter: Nicolas Liochon Priority: Minor Discussed in HBASE-7091. It's not critical, but a little bit surprising, as the comments in bin/hbase doesn't say anything about this. If you create your own hbase-env then it's not an issue... -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10074) consolidate and improve capacity/sizing documentation
[ https://issues.apache.org/jira/browse/HBASE-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839128#comment-13839128 ] Sergey Shelukhin commented on HBASE-10074: -- [~ndimiduk] bq. Mind adding some JIRA references here? Actually, do you have particular JIRAs in mind? [~stack] thanks! consolidate and improve capacity/sizing documentation - Key: HBASE-10074 URL: https://issues.apache.org/jira/browse/HBASE-10074 Project: HBase Issue Type: Improvement Components: documentation Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10074.patch Region count description is in config section; region size description is in architecture sections; both of these have a lot of good technical details, but imho we could do better in terms of admin-centric advice. Currently, there's a nearly-empty capacity section; I'd like to rewrite it to consolidate capacity planning/sizing/region sizing information, and some basic configuration pertaining to it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Comment Edited] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839114#comment-13839114 ] Jonathan Hsieh edited comment on HBASE-10079 at 12/4/13 5:47 PM: - HBASE-6195 was the issue that fixed the problem in 0.94 and 0.95 branches (at the time). It added a test to TestHRegion called testParallelIncrementWithMemStoreFlush that tests the situation. was (Author: jmhsieh): This was the issue that fixed the problem in 0.94 and 0.95 branches (at the time). It added a test to TestHRegion called testParallelIncrementWithMemStoreFlush that tests the situation. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment a single col. We flush or do kills/kill-9 and data is lost. Flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row and single col with various numbers of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25 on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 != 25. Correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878 != 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839153#comment-13839153 ] Jonathan Hsieh commented on HBASE-10079: TestHRegion#testParallelIncrementWithMemStoreFlush passes on the 0.96 tip. The test actually waits for all the increments to be done before flushing (instead of while other increments are happening), so my bet is that it doesn't actually test the race condition. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment a single col. We flush or do kills/kill-9 and data is lost. Flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row and single col with various numbers of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25 on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 != 25. Correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878 != 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
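For reference, a minimal client-side sketch of the kind of rig described above, where flushes are interleaved with increments that are still in flight rather than issued after they finish. The table, family, and qualifier names and the thread/iteration counts are illustrative assumptions, not the actual IncrementBlaster/IncrementVerifier code:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class ConcurrentIncrementFlushCheck {

  public static void main(String[] args) throws Exception {
    final Configuration conf = HBaseConfiguration.create();
    final byte[] table = Bytes.toBytes("incr_test");  // assumed, pre-created table
    final byte[] row = Bytes.toBytes("r1");
    final byte[] fam = Bytes.toBytes("f");            // assumed family name
    final byte[] qual = Bytes.toBytes("q");
    final int nThreads = 4;
    final int perThread = 1000;

    Thread[] workers = new Thread[nThreads];
    for (int i = 0; i < nThreads; i++) {
      workers[i] = new Thread(new Runnable() {
        public void run() {
          try {
            // HTable is not thread safe, so each worker gets its own instance.
            HTable t = new HTable(conf, table);
            for (int j = 0; j < perThread; j++) {
              t.incrementColumnValue(row, fam, qual, 1L);
            }
            t.close();
          } catch (Exception e) {
            throw new RuntimeException(e);
          }
        }
      });
      workers[i].start();
    }

    // Flush repeatedly while increments are still in flight: this is the
    // interleaving the unit test above does not exercise.
    HBaseAdmin admin = new HBaseAdmin(conf);
    while (anyAlive(workers)) {
      admin.flush(table);
      Thread.sleep(200);
    }
    admin.close();

    // Verify the final counter value against the number of increments issued.
    HTable verify = new HTable(conf, table);
    Result r = verify.get(new Get(row).addColumn(fam, qual));
    long count = Bytes.toLong(r.getValue(fam, qual));
    System.out.println("expected " + (nThreads * perThread) + ", got " + count);
    verify.close();
  }

  private static boolean anyAlive(Thread[] threads) {
    for (Thread t : threads) {
      if (t.isAlive()) return true;
    }
    return false;
  }
}
{code}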
[jira] [Commented] (HBASE-9524) Multi row get does not return any results even if any one of the rows specified in the query is missing and improve exception handling
[ https://issues.apache.org/jira/browse/HBASE-9524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839186#comment-13839186 ] Vandana Ayyalasomayajula commented on HBASE-9524: - Both the issues found in the Hadoop QA run seem to be unrelated. https://builds.apache.org/job/PreCommit-HBASE-Build/8053//artifact/trunk/patchprocess/patchSiteOutput.txt https://builds.apache.org/job/PreCommit-HBASE-Build/8053//artifact/trunk/patchprocess/patchJavadocWarnings.txt [~ndimiduk] Can you please take a look at the latest patch when you have time? Thanks! Multi row get does not return any results even if any one of the rows specified in the query is missing and improve exception handling -- Key: HBASE-9524 URL: https://issues.apache.org/jira/browse/HBASE-9524 Project: HBase Issue Type: Improvement Components: REST Reporter: Vandana Ayyalasomayajula Assignee: Vandana Ayyalasomayajula Priority: Minor Attachments: HBASE-9524_trunk.01.patch, hbase-9524_trunk.00.patch When a client tries to retrieve multiple rows using the REST API, even if one of the specified rows does not exist, 404 is returned. The correct way should be to return the result for the found rows and ignore the non-existent ones. Also, in the current code base, only some exceptions are handled; if some exception like access denied or no column found is thrown by the APIs, 500 (internal server error) is returned to the user. This leaves the end user wondering what caused the REST command to fail. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-10082) Describe 'table' output is all on one line, could use better formatting
Maxime C Dumas created HBASE-10082: -- Summary: Describe 'table' output is all on one line, could use better formatting Key: HBASE-10082 URL: https://issues.apache.org/jira/browse/HBASE-10082 Project: HBase Issue Type: Improvement Environment: 0.94.2-cdh4.2.1 Reporter: Maxime C Dumas If you describe 'table' from the HBase shell, you get an output like this for a very simple table: hbase(main):023:0 describe 'movie' DESCRIPTION ENABLED {NAME = 'movie', FAMILIES = [{NAME = 'info', DATA_BLOCK_ENCODING = 'NONE', B true LOOMFILTER = 'NONE', REPLICATION_SCOPE = '0', VERSIONS = '3', COMPRESSION = 'NONE', MIN_VERSIONS = '0', TTL = '2147483647', KEEP_DELETED_CELLS = 'false', BLOCKSIZE = '65536', IN_MEMORY = 'false', ENCODE_ON_DISK = 'true', BLOCKCACH E = 'true'}, {NAME = 'media', DATA_BLOCK_ENCODING = 'NONE', BLOOMFILTER = 'N ONE', REPLICATION_SCOPE = '0', VERSIONS = '1', COMPRESSION = 'NONE', MIN_VERS IONS = '0', TTL = '2147483647', KEEP_DELETED_CELLS = 'false', BLOCKSIZE = '6 5536', IN_MEMORY = 'false', ENCODE_ON_DISK = 'true', BLOCKCACHE = 'true'}]} 1 row(s) in 0.0250 seconds Not only everything is on one row, but also it seems to be limited in width (82 chars). I suggest we do a line return on each column family, or format it into a JSON (lint) format, or anything more readable! Thanks! -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10082) Describe 'table' output is all on one line, could use better formatting
[ https://issues.apache.org/jira/browse/HBASE-10082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxime C Dumas updated HBASE-10082: --- Priority: Minor (was: Major) Describe 'table' output is all on one line, could use better formatting --- Key: HBASE-10082 URL: https://issues.apache.org/jira/browse/HBASE-10082 Project: HBase Issue Type: Improvement Environment: 0.94.2-cdh4.2.1 Reporter: Maxime C Dumas Priority: Minor If you describe 'table' from the HBase shell, you get an output like this for a very simple table: hbase(main):023:0 describe 'movie' DESCRIPTION ENABLED {NAME = 'movie', FAMILIES = [{NAME = 'info', DATA_BLOCK_ENCODING = 'NONE', B true LOOMFILTER = 'NONE', REPLICATION_SCOPE = '0', VERSIONS = '3', COMPRESSION = 'NONE', MIN_VERSIONS = '0', TTL = '2147483647', KEEP_DELETED_CELLS = 'false', BLOCKSIZE = '65536', IN_MEMORY = 'false', ENCODE_ON_DISK = 'true', BLOCKCACH E = 'true'}, {NAME = 'media', DATA_BLOCK_ENCODING = 'NONE', BLOOMFILTER = 'N ONE', REPLICATION_SCOPE = '0', VERSIONS = '1', COMPRESSION = 'NONE', MIN_VERS IONS = '0', TTL = '2147483647', KEEP_DELETED_CELLS = 'false', BLOCKSIZE = '6 5536', IN_MEMORY = 'false', ENCODE_ON_DISK = 'true', BLOCKCACHE = 'true'}]} 1 row(s) in 0.0250 seconds Not only everything is on one row, but also it seems to be limited in width (82 chars). I suggest we do a line return on each column family, or format it into a JSON (lint) format, or anything more readable! Thanks! -- This message was sent by Atlassian JIRA (v6.1#6144)
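One way to get the per-family line breaks suggested above, without waiting for a shell change, is to print the table descriptor from the Java client. This is only a rough sketch, assuming a table named 'movie':
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class DescribePerFamily {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      HTableDescriptor desc = admin.getTableDescriptor(Bytes.toBytes("movie"));
      System.out.println("TABLE: " + desc.getNameAsString());
      // Print each column family on its own line instead of one wrapped blob.
      for (HColumnDescriptor family : desc.getFamilies()) {
        System.out.println("  " + family.toString());
      }
    } finally {
      admin.close();
    }
  }
}
{code}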
[jira] [Commented] (HBASE-10080) Unnecessary call to locateRegion when creating an HTable instance
[ https://issues.apache.org/jira/browse/HBASE-10080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839194#comment-13839194 ] Hadoop QA commented on HBASE-10080: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12617015/10080.v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestAdmin Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8055//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8055//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8055//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8055//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8055//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8055//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8055//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8055//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8055//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8055//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8055//console This message is automatically generated. 
Unnecessary call to locateRegion when creating an HTable instance - Key: HBASE-10080 URL: https://issues.apache.org/jira/browse/HBASE-10080 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.98.0, 0.96.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Trivial Fix For: 0.96.2, 0.98.1 Attachments: 10080.v1.patch It's more or less in contradiction with the objective of having lightweight HTable objects, and the data may be stale by the time we use it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10074) consolidate and improve capacity/sizing documentation
[ https://issues.apache.org/jira/browse/HBASE-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-10074: - Attachment: HBASE-10074.01.patch incorporated feedback, some spelling fixes and rephrases. I'd assume +1 stands, will commit in the afternoon consolidate and improve capacity/sizing documentation - Key: HBASE-10074 URL: https://issues.apache.org/jira/browse/HBASE-10074 Project: HBase Issue Type: Improvement Components: documentation Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10074.01.patch, HBASE-10074.patch Region count description is in config section; region size description is in architecture sections; both of these have a lot of good technical details, but imho we could do better in terms of admin-centric advice. Currently, there's a nearly-empty capacity section; I'd like to rewrite it to consolidate capacity planning/sizing/region sizing information, and some basic configuration pertaining to it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9648) collection one expired storefile causes it to be replaced by another expired storefile
[ https://issues.apache.org/jira/browse/HBASE-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839201#comment-13839201 ] Sergey Shelukhin commented on HBASE-9648: - stumbled upon this jira (not bug :)) again... do you want to go with either patch collection one expired storefile causes it to be replaced by another expired storefile -- Key: HBASE-9648 URL: https://issues.apache.org/jira/browse/HBASE-9648 Project: HBase Issue Type: Bug Components: Compaction Reporter: Sergey Shelukhin Assignee: Jean-Marc Spaggiari Attachments: HBASE-9648-v0-0.94.patch, HBASE-9648-v0-trunk.patch, HBASE-9648-v1-trunk.patch, HBASE-9648.patch There's a shortcut in compaction selection that causes the selection of expired store files to quickly delete. However, there's also the code that ensures we write at least one file to preserve seqnum. This new empty file is expired, because it has no data, presumably. So it's collected again, etc. This affects 94, probably also 96. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10074) consolidate and improve capacity/sizing documentation
[ https://issues.apache.org/jira/browse/HBASE-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839204#comment-13839204 ] Sergey Shelukhin commented on HBASE-10074: -- [~stack] ok for 96? consolidate and improve capacity/sizing documentation - Key: HBASE-10074 URL: https://issues.apache.org/jira/browse/HBASE-10074 Project: HBase Issue Type: Improvement Components: documentation Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10074.01.patch, HBASE-10074.patch Region count description is in config section; region size description is in architecture sections; both of these have a lot of good technical details, but imho we could do better in terms of admin-centric advice. Currently, there's a nearly-empty capacity section; I'd like to rewrite it to consolidate capacity planning/sizing/region sizing information, and some basic configuration pertaining to it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-8929) IntegrationTestBigLinkedList reuses old data in some cases
[ https://issues.apache.org/jira/browse/HBASE-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-8929. - Resolution: Not A Problem IntegrationTestBigLinkedList reuses old data in some cases -- Key: HBASE-8929 URL: https://issues.apache.org/jira/browse/HBASE-8929 Project: HBase Issue Type: Bug Components: test Reporter: Sergey Shelukhin Priority: Minor When running the test repeatedly on the same cluster one can sometimes see an unexpected reference count, where the number found is (in the observed case) a multiple of the number expected, so instead of 2.5m nodes it finds 12.5m, for example. It looks like it's reading the data from the old run. Setup should delete that (not cleanup, as the data may be used for debugging after the test). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-8777) HBase client should determine the destination server after retry time
[ https://issues.apache.org/jira/browse/HBASE-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-8777. - Resolution: Won't Fix probably too late for 94 HBase client should determine the destination server after retry time - Key: HBASE-8777 URL: https://issues.apache.org/jira/browse/HBASE-8777 Project: HBase Issue Type: Improvement Components: Client Affects Versions: 0.94.9 Reporter: Sergey Shelukhin HBase currently determines which server to go to, then creates delayed callable with pre-determined server and goes there. For later 16-32-... second retries this approach is suboptimal, the cluster could have seen massive changes in the meantime, so retry might be completely useless. We should re-locate regions after the delay, at least for longer retries. Given how grouping is currently done it would be a bit of a refactoring. The effect of this is alleviated (to a degree) on trunk by server-based retries (if we fail going to the pre-delay server after delay and then determine the server has changed, we will go to the new server immediately, so we only lose the failed round-trip time); on 94, if the region is opened on some other server during the delay, we'd go to the old one, fail, then find out it's on different server, wait a bunch more time because it's a late-stage retry and THEN go to the new one, as far as I see. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839233#comment-13839233 ] Jonathan Hsieh commented on HBASE-10079: I tweaked the test and wasn't able to duplicate it at the unit test level. I'm looking into reverting a few patches touching the memstore/flush area and testing on the cluster (HBASE-9963 and HBASE-10014 seem like candidates) to see if they caused the problem. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment a single col. We flush or do kills/kill-9 and data is lost. Flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row and single col with various numbers of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25 on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 != 25. Correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878 != 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10040) Fix Potential Resource Leak in HRegion
[ https://issues.apache.org/jira/browse/HBASE-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839243#comment-13839243 ] Sergey Shelukhin commented on HBASE-10040: -- There is no good reason... probably added like that to allow user scanners to throw. It could be removed. Fix Potential Resource Leak in HRegion -- Key: HBASE-10040 URL: https://issues.apache.org/jira/browse/HBASE-10040 Project: HBase Issue Type: Sub-task Affects Versions: 0.98.0, 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10040) Fix Potential Resource Leak in HRegion
[ https://issues.apache.org/jira/browse/HBASE-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839244#comment-13839244 ] Sergey Shelukhin commented on HBASE-10040: -- Although technically that would be an API change. Fix Potential Resource Leak in HRegion -- Key: HBASE-10040 URL: https://issues.apache.org/jira/browse/HBASE-10040 Project: HBase Issue Type: Sub-task Affects Versions: 0.98.0, 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-4811) Support reverse Scan
[ https://issues.apache.org/jira/browse/HBASE-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839271#comment-13839271 ] Hudson commented on HBASE-4811: --- SUCCESS: Integrated in HBase-TRUNK #4711 (See [https://builds.apache.org/job/HBase-TRUNK/4711/]) HBASE-10072. Regenerate ClientProtos after HBASE-4811 (apurtell: rev 1547720) * /hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java Support reverse Scan Key: HBASE-4811 URL: https://issues.apache.org/jira/browse/HBASE-4811 Project: HBase Issue Type: New Feature Components: Client Affects Versions: 0.20.6, 0.94.7 Reporter: John Carrino Assignee: chunhui shen Fix For: 0.98.0 Attachments: 4811-0.94-v22.txt, 4811-0.94-v23.txt, 4811-0.94-v3.txt, 4811-trunk-v10.txt, 4811-trunk-v29.patch, 4811-trunk-v5.patch, HBase-4811-0.94-v2.txt, HBase-4811-0.94.3modified.txt, hbase-4811-0.94 v21.patch, hbase-4811-0.94-v24.patch, hbase-4811-trunkv1.patch, hbase-4811-trunkv11.patch, hbase-4811-trunkv12.patch, hbase-4811-trunkv13.patch, hbase-4811-trunkv14.patch, hbase-4811-trunkv15.patch, hbase-4811-trunkv16.patch, hbase-4811-trunkv17.patch, hbase-4811-trunkv18.patch, hbase-4811-trunkv19.patch, hbase-4811-trunkv20.patch, hbase-4811-trunkv21.patch, hbase-4811-trunkv24.patch, hbase-4811-trunkv24.patch, hbase-4811-trunkv25.patch, hbase-4811-trunkv26.patch, hbase-4811-trunkv27.patch, hbase-4811-trunkv28.patch, hbase-4811-trunkv4.patch, hbase-4811-trunkv6.patch, hbase-4811-trunkv7.patch, hbase-4811-trunkv8.patch, hbase-4811-trunkv9.patch Reversed scan means scan the rows backward. And StartRow bigger than StopRow in a reversed scan. For example, for the following rows: aaa/c1:q1/value1 aaa/c1:q2/value2 bbb/c1:q1/value1 bbb/c1:q2/value2 ccc/c1:q1/value1 ccc/c1:q2/value2 ddd/c1:q1/value1 ddd/c1:q2/value2 eee/c1:q1/value1 eee/c1:q2/value2 you could do a reversed scan from 'ddd' to 'bbb'(exclude) like this: Scan scan = new Scan(); scan.setStartRow('ddd'); scan.setStopRow('bbb'); scan.setReversed(true); for(Result result:htable.getScanner(scan)){ System.out.println(result); } Aslo you could do the reversed scan with shell like this: hbase scan 'table',{REVERSED = true,STARTROW='ddd', STOPROW='bbb'} And the output is: ddd/c1:q1/value1 ddd/c1:q2/value2 ccc/c1:q1/value1 ccc/c1:q2/value2 All the documentation I find about HBase says that if you want forward and reverse scans you should just build 2 tables and one be ascending and one descending. Is there a fundamental reason that HBase only supports forward Scan? It seems like a lot of extra space overhead and coding overhead (to keep them in sync) to support 2 tables. I am assuming this has been discussed before, but I can't find the discussions anywhere about it or why it would be infeasible. -- This message was sent by Atlassian JIRA (v6.1#6144)
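For convenience, here is the Java snippet from the issue description above in compilable form. It is only a sketch: it assumes a table named 'table' with family 'c1' populated with the rows listed in the description, running on a release that includes the reversed-scan support added by this issue:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ReverseScanExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable htable = new HTable(conf, "table");        // assumed table name
    Scan scan = new Scan();
    scan.setStartRow(Bytes.toBytes("ddd"));           // reversed: start row > stop row
    scan.setStopRow(Bytes.toBytes("bbb"));            // stop row is exclusive
    scan.setReversed(true);
    ResultScanner scanner = htable.getScanner(scan);
    try {
      for (Result result : scanner) {
        System.out.println(result);                   // prints ddd then ccc rows
      }
    } finally {
      scanner.close();
      htable.close();
    }
  }
}
{code}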
[jira] [Commented] (HBASE-9485) TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart
[ https://issues.apache.org/jira/browse/HBASE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839274#comment-13839274 ] Hudson commented on HBASE-9485: --- SUCCESS: Integrated in HBase-TRUNK #4711 (See [https://builds.apache.org/job/HBase-TRUNK/4711/]) HBASE-9485 TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart (tedyu: rev 1547803) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputCommitter.java TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart -- Key: HBASE-9485 URL: https://issues.apache.org/jira/browse/HBASE-9485 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0, 0.96.2 Attachments: 9485-v2.txt HBase extends OutputCommitter which turns recovery off. Meaning all completed maps are lost on RM restart and job starts from scratch. FileOutputCommitter implements recovery so we should look at that to see what is potentially needed for recovery. -- This message was sent by Atlassian JIRA (v6.1#6144)
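For readers unfamiliar with the MapReduce side of this: recovery hinges on the Hadoop 2 OutputCommitter hooks shown below. This is only an illustrative sketch of a committer with no side output that can safely claim recoverability, not the patch that was committed here:
{code}
import java.io.IOException;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.OutputCommitter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

// Sketch only: a committer with no side output, so it can claim that its
// completed tasks survive an RM/AM restart without any replay work.
public class RecoverableNoOpOutputCommitter extends OutputCommitter {
  @Override public void setupJob(JobContext jobContext) throws IOException { }
  @Override public void setupTask(TaskAttemptContext taskContext) throws IOException { }
  @Override public boolean needsTaskCommit(TaskAttemptContext taskContext) throws IOException {
    return false;  // nothing to commit; writes already went straight to the table
  }
  @Override public void commitTask(TaskAttemptContext taskContext) throws IOException { }
  @Override public void abortTask(TaskAttemptContext taskContext) throws IOException { }

  // Hadoop 2 recovery hooks: the default isRecoverySupported() returns false,
  // which is why completed maps are re-run from scratch after an RM restart.
  @Override public boolean isRecoverySupported() {
    return true;
  }
  @Override public void recoverTask(TaskAttemptContext taskContext) throws IOException {
    // No per-task state to restore.
  }
}
{code}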
[jira] [Commented] (HBASE-10072) Regenerate ClientProtos after HBASE-4811
[ https://issues.apache.org/jira/browse/HBASE-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839273#comment-13839273 ] Hudson commented on HBASE-10072: SUCCESS: Integrated in HBase-TRUNK #4711 (See [https://builds.apache.org/job/HBase-TRUNK/4711/]) HBASE-10072. Regenerate ClientProtos after HBASE-4811 (apurtell: rev 1547720) * /hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java Regenerate ClientProtos after HBASE-4811 Key: HBASE-10072 URL: https://issues.apache.org/jira/browse/HBASE-10072 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.98.0 Attachments: 10072.patch While running 'mvn compile -Pcompile-protobuf' I noticed generated/ClientProtos.java changed. Looks like the message descriptor for Scan has changed, and its FieldAccessorTable. Attaching the diff. Difference in protoc version maybe? I'm using protoc 2.5.0. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10031) Add a section on the transparent CF encryption feature to the manual
[ https://issues.apache.org/jira/browse/HBASE-10031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839272#comment-13839272 ] Hudson commented on HBASE-10031: SUCCESS: Integrated in HBase-TRUNK #4711 (See [https://builds.apache.org/job/HBase-TRUNK/4711/]) HBASE-10031. Add a section on the transparent CF encryption feature to the manual (apurtell: rev 1547739) * /hbase/trunk/src/main/docbkx/security.xml Add a section on the transparent CF encryption feature to the manual Key: HBASE-10031 URL: https://issues.apache.org/jira/browse/HBASE-10031 Project: HBase Issue Type: Task Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Blocker Fix For: 0.98.0 Attachments: 10031.patch Document HBASE-7544 in detail in the manual. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839291#comment-13839291 ] Jonathan Hsieh commented on HBASE-10079: Seems like reverting either HBASE-9963 or HBASE-10014 gets rid of the jagged losses due to flushes. However, when testing on the tip of 0.96 with the reverts I seem to be losing some threads as they initialize because of some sort of race. I'm going to try from the exact point where 0.96.1rc1 was cut to see if it is in a happy place and will investigate the HTable initialization problem afterwards. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment a single col. We flush or do kills/kill-9 and data is lost. Flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row and single col with various numbers of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25 on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 != 25. Correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878 != 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9485) TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart
[ https://issues.apache.org/jira/browse/HBASE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839304#comment-13839304 ] Hudson commented on HBASE-9485: --- SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #863 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/863/]) HBASE-9485 TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart (tedyu: rev 1547803) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputCommitter.java TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart -- Key: HBASE-9485 URL: https://issues.apache.org/jira/browse/HBASE-9485 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0, 0.96.2 Attachments: 9485-v2.txt HBase extends OutputCommitter which turns recovery off. Meaning all completed maps are lost on RM restart and job starts from scratch. FileOutputCommitter implements recovery so we should look at that to see what is potentially needed for recovery. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10031) Add a section on the transparent CF encryption feature to the manual
[ https://issues.apache.org/jira/browse/HBASE-10031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839302#comment-13839302 ] Hudson commented on HBASE-10031: SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #863 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/863/]) HBASE-10031. Add a section on the transparent CF encryption feature to the manual (apurtell: rev 1547739) * /hbase/trunk/src/main/docbkx/security.xml Add a section on the transparent CF encryption feature to the manual Key: HBASE-10031 URL: https://issues.apache.org/jira/browse/HBASE-10031 Project: HBase Issue Type: Task Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Blocker Fix For: 0.98.0 Attachments: 10031.patch Document HBASE-7544 in detail in the manual. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-4811) Support reverse Scan
[ https://issues.apache.org/jira/browse/HBASE-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839301#comment-13839301 ] Hudson commented on HBASE-4811: --- SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #863 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/863/]) HBASE-10072. Regenerate ClientProtos after HBASE-4811 (apurtell: rev 1547720) * /hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java Support reverse Scan Key: HBASE-4811 URL: https://issues.apache.org/jira/browse/HBASE-4811 Project: HBase Issue Type: New Feature Components: Client Affects Versions: 0.20.6, 0.94.7 Reporter: John Carrino Assignee: chunhui shen Fix For: 0.98.0 Attachments: 4811-0.94-v22.txt, 4811-0.94-v23.txt, 4811-0.94-v3.txt, 4811-trunk-v10.txt, 4811-trunk-v29.patch, 4811-trunk-v5.patch, HBase-4811-0.94-v2.txt, HBase-4811-0.94.3modified.txt, hbase-4811-0.94 v21.patch, hbase-4811-0.94-v24.patch, hbase-4811-trunkv1.patch, hbase-4811-trunkv11.patch, hbase-4811-trunkv12.patch, hbase-4811-trunkv13.patch, hbase-4811-trunkv14.patch, hbase-4811-trunkv15.patch, hbase-4811-trunkv16.patch, hbase-4811-trunkv17.patch, hbase-4811-trunkv18.patch, hbase-4811-trunkv19.patch, hbase-4811-trunkv20.patch, hbase-4811-trunkv21.patch, hbase-4811-trunkv24.patch, hbase-4811-trunkv24.patch, hbase-4811-trunkv25.patch, hbase-4811-trunkv26.patch, hbase-4811-trunkv27.patch, hbase-4811-trunkv28.patch, hbase-4811-trunkv4.patch, hbase-4811-trunkv6.patch, hbase-4811-trunkv7.patch, hbase-4811-trunkv8.patch, hbase-4811-trunkv9.patch Reversed scan means scan the rows backward. And StartRow bigger than StopRow in a reversed scan. For example, for the following rows: aaa/c1:q1/value1 aaa/c1:q2/value2 bbb/c1:q1/value1 bbb/c1:q2/value2 ccc/c1:q1/value1 ccc/c1:q2/value2 ddd/c1:q1/value1 ddd/c1:q2/value2 eee/c1:q1/value1 eee/c1:q2/value2 you could do a reversed scan from 'ddd' to 'bbb'(exclude) like this: Scan scan = new Scan(); scan.setStartRow('ddd'); scan.setStopRow('bbb'); scan.setReversed(true); for(Result result:htable.getScanner(scan)){ System.out.println(result); } Aslo you could do the reversed scan with shell like this: hbase scan 'table',{REVERSED = true,STARTROW='ddd', STOPROW='bbb'} And the output is: ddd/c1:q1/value1 ddd/c1:q2/value2 ccc/c1:q1/value1 ccc/c1:q2/value2 All the documentation I find about HBase says that if you want forward and reverse scans you should just build 2 tables and one be ascending and one descending. Is there a fundamental reason that HBase only supports forward Scan? It seems like a lot of extra space overhead and coding overhead (to keep them in sync) to support 2 tables. I am assuming this has been discussed before, but I can't find the discussions anywhere about it or why it would be infeasible. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10072) Regenerate ClientProtos after HBASE-4811
[ https://issues.apache.org/jira/browse/HBASE-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839303#comment-13839303 ] Hudson commented on HBASE-10072: SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #863 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/863/]) HBASE-10072. Regenerate ClientProtos after HBASE-4811 (apurtell: rev 1547720) * /hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java Regenerate ClientProtos after HBASE-4811 Key: HBASE-10072 URL: https://issues.apache.org/jira/browse/HBASE-10072 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.98.0 Attachments: 10072.patch While running 'mvn compile -Pcompile-protobuf' I noticed generated/ClientProtos.java changed. Looks like the message descriptor for Scan has changed, and its FieldAccessorTable. Attaching the diff. Difference in protoc version maybe? I'm using protoc 2.5.0. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9485) TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart
[ https://issues.apache.org/jira/browse/HBASE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839313#comment-13839313 ] Lars Hofhansl commented on HBASE-9485: -- We can add this to 0.94 as well, no? If it is built against Hadoop 2.x it should just work. TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart -- Key: HBASE-9485 URL: https://issues.apache.org/jira/browse/HBASE-9485 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0, 0.96.2 Attachments: 9485-v2.txt HBase extends OutputCommitter which turns recovery off. Meaning all completed maps are lost on RM restart and job starts from scratch. FileOutputCommitter implements recovery so we should look at that to see what is potentially needed for recovery. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10074) consolidate and improve capacity/sizing documentation
[ https://issues.apache.org/jira/browse/HBASE-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839315#comment-13839315 ] Hadoop QA commented on HBASE-10074: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12617043/HBASE-10074.01.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8056//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8056//console This message is automatically generated. consolidate and improve capacity/sizing documentation - Key: HBASE-10074 URL: https://issues.apache.org/jira/browse/HBASE-10074 Project: HBase Issue Type: Improvement Components: documentation Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10074.01.patch, HBASE-10074.patch Region count description is in config section; region size description is in architecture sections; both of these have a lot of good technical details, but imho we could do better in terms of admin-centric advice. 
Currently, there's a nearly-empty capacity section; I'd like to rewrite it to consolidate capacity planning/sizing/region sizing information, and some basic configuration pertaining to it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-4811) Support reverse Scan
[ https://issues.apache.org/jira/browse/HBASE-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-4811: - Attachment: 4811-0.94-v25.txt Here's the rebased 0.94 patch. Support reverse Scan Key: HBASE-4811 URL: https://issues.apache.org/jira/browse/HBASE-4811 Project: HBase Issue Type: New Feature Components: Client Affects Versions: 0.20.6, 0.94.7 Reporter: John Carrino Assignee: chunhui shen Fix For: 0.98.0 Attachments: 4811-0.94-v22.txt, 4811-0.94-v23.txt, 4811-0.94-v25.txt, 4811-0.94-v3.txt, 4811-trunk-v10.txt, 4811-trunk-v29.patch, 4811-trunk-v5.patch, HBase-4811-0.94-v2.txt, HBase-4811-0.94.3modified.txt, hbase-4811-0.94 v21.patch, hbase-4811-0.94-v24.patch, hbase-4811-trunkv1.patch, hbase-4811-trunkv11.patch, hbase-4811-trunkv12.patch, hbase-4811-trunkv13.patch, hbase-4811-trunkv14.patch, hbase-4811-trunkv15.patch, hbase-4811-trunkv16.patch, hbase-4811-trunkv17.patch, hbase-4811-trunkv18.patch, hbase-4811-trunkv19.patch, hbase-4811-trunkv20.patch, hbase-4811-trunkv21.patch, hbase-4811-trunkv24.patch, hbase-4811-trunkv24.patch, hbase-4811-trunkv25.patch, hbase-4811-trunkv26.patch, hbase-4811-trunkv27.patch, hbase-4811-trunkv28.patch, hbase-4811-trunkv4.patch, hbase-4811-trunkv6.patch, hbase-4811-trunkv7.patch, hbase-4811-trunkv8.patch, hbase-4811-trunkv9.patch Reversed scan means scan the rows backward. And StartRow bigger than StopRow in a reversed scan. For example, for the following rows: aaa/c1:q1/value1 aaa/c1:q2/value2 bbb/c1:q1/value1 bbb/c1:q2/value2 ccc/c1:q1/value1 ccc/c1:q2/value2 ddd/c1:q1/value1 ddd/c1:q2/value2 eee/c1:q1/value1 eee/c1:q2/value2 you could do a reversed scan from 'ddd' to 'bbb'(exclude) like this: Scan scan = new Scan(); scan.setStartRow('ddd'); scan.setStopRow('bbb'); scan.setReversed(true); for(Result result:htable.getScanner(scan)){ System.out.println(result); } Aslo you could do the reversed scan with shell like this: hbase scan 'table',{REVERSED = true,STARTROW='ddd', STOPROW='bbb'} And the output is: ddd/c1:q1/value1 ddd/c1:q2/value2 ccc/c1:q1/value1 ccc/c1:q2/value2 All the documentation I find about HBase says that if you want forward and reverse scans you should just build 2 tables and one be ascending and one descending. Is there a fundamental reason that HBase only supports forward Scan? It seems like a lot of extra space overhead and coding overhead (to keep them in sync) to support 2 tables. I am assuming this has been discussed before, but I can't find the discussions anywhere about it or why it would be infeasible. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839333#comment-13839333 ] Nicolas Liochon commented on HBASE-10079: - I guess the error is in HBASE-9963. It seems there is an issue in HStore#StoreFlusherImpl#prepare: there is no lock there. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
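To illustrate the kind of race being pointed at (a toy model only, not HBase code): if the flush prepare step swaps the active memstore out without excluding in-flight writers, an increment can be applied to a map that is already being snapshotted and flushed, and its update is then dropped or double counted. Holding a region-level write lock across the swap, as sketched below, closes that window:
{code}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model of the writer-vs-flush interaction only -- not HBase code.
public class ToyMemStore {
  private final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();
  private volatile Map<String, Long> active = new HashMap<String, Long>();

  // Writers hold the read lock so many of them can proceed together, but none
  // can overlap with the snapshot swap in prepareSnapshot(). Same-row writers
  // are serialized here with 'synchronized'; HBase uses per-row locks for that.
  public synchronized void increment(String row, long delta) {
    updatesLock.readLock().lock();
    try {
      Map<String, Long> map = active;
      Long old = map.get(row);
      map.put(row, (old == null ? 0L : old) + delta);
    } finally {
      updatesLock.readLock().unlock();
    }
  }

  // Flush "prepare" step: the write lock excludes writers while the active map
  // is swapped for an empty one. Without it, an increment could land in the map
  // that is about to be flushed and discarded, losing the update.
  public Map<String, Long> prepareSnapshot() {
    updatesLock.writeLock().lock();
    try {
      Map<String, Long> snapshot = active;
      active = new HashMap<String, Long>();
      return snapshot;
    } finally {
      updatesLock.writeLock().unlock();
    }
  }
}
{code}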
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839334#comment-13839334 ] Jonathan Hsieh commented on HBASE-10079: Actually, the current tip of 0.96 (HBASE-9485) doesn't seem to have the flush problem, but does have the HTable initialization problem. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Testing 0.96.1rc1. With one process incrementing a row in a table, we increment a single col. We flush or do kills/kill-9 and data is lost. Flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row and single col with various numbers of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25 on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 != 25. Correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878 != 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10079: Attachment: 10079.v1.patch Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Attachments: 10079.v1.patch Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839351#comment-13839351 ] Nicolas Liochon commented on HBASE-10079: - That's strange. We should lock, and we don't do it in trunk or 0.96 head... Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Attachments: 10079.v1.patch Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9485) TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart
[ https://issues.apache.org/jira/browse/HBASE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-9485: -- Fix Version/s: 0.94.15 Status: Open (was: Patch Available) TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart -- Key: HBASE-9485 URL: https://issues.apache.org/jira/browse/HBASE-9485 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0, 0.94.15, 0.96.2 Attachments: 9485-v2.txt HBase extends OutputCommitter which turns recovery off. Meaning all completed maps are lost on RM restart and job starts from scratch. FileOutputCommitter implements recovery so we should look at that to see what is potentially needed for recovery. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10079: Status: Patch Available (was: Open) Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Attachments: 10079.v1.patch Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9485) TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart
[ https://issues.apache.org/jira/browse/HBASE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-9485: -- Attachment: 9485-0.94.txt TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart -- Key: HBASE-9485 URL: https://issues.apache.org/jira/browse/HBASE-9485 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0, 0.94.15, 0.96.2 Attachments: 9485-0.94.txt, 9485-v2.txt HBase extends OutputCommitter which turns recovery off. Meaning all completed maps are lost on RM restart and job starts from scratch. FileOutputCommitter implements recovery so we should look at that to see what is potentially needed for recovery. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9485) TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart
[ https://issues.apache.org/jira/browse/HBASE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839362#comment-13839362 ] Ted Yu commented on HBASE-9485: --- Integrated to 0.94 as well. TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart -- Key: HBASE-9485 URL: https://issues.apache.org/jira/browse/HBASE-9485 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0, 0.94.15, 0.96.2 Attachments: 9485-0.94.txt, 9485-v2.txt HBase extends OutputCommitter which turns recovery off. Meaning all completed maps are lost on RM restart and job starts from scratch. FileOutputCommitter implements recovery so we should look at that to see what is potentially needed for recovery. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839431#comment-13839431 ] Hadoop QA commented on HBASE-10079: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12617058/10079.v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8057//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8057//console This message is automatically generated. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Attachments: 10079.v1.patch Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. 
flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-4163) Create Split Strategy for YCSB Benchmark
[ https://issues.apache.org/jira/browse/HBASE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839489#comment-13839489 ] Luke Lu commented on HBASE-4163: Tried to figure this out for somebody today, here is an hbase shell one-liner to save some more people's time before the feature is implemented: {code} create 'usertable', 'family', {SPLITS => (1..200).map {|i| "user#{1000+i*(-1000)/200}"}, MAX_FILESIZE => 4*1024**3} {code} Create Split Strategy for YCSB Benchmark Key: HBASE-4163 URL: https://issues.apache.org/jira/browse/HBASE-4163 Project: HBase Issue Type: Improvement Components: util Affects Versions: 0.90.3, 0.92.0 Reporter: Nicolas Spiegelberg Assignee: Lars George Priority: Minor Labels: benchmark Talked with Lars about how we can make it easier for users to run the YCSB benchmarks against HBase and get realistic results. Currently, HBase is optimized for the random/uniform read/write case, which is the YCSB load. The initial reason why we perform badly when users test against us is because they do not presplit regions and have the split ratio really low. We need a one-line way for a user to create a table that is pre-split to 200 regions (or some decent number) and by default disable splitting. Realistically, this is how a uniform load cluster should scale, so it's not a hack. This will also give us a good use case to point to for how users should pre-split regions. -- This message was sent by Atlassian JIRA (v6.1#6144)
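Editor's note: the same pre-split can be done from the Java client API. A minimal sketch follows, assuming a table named 'usertable' with one family 'family' and 200 regions; the key layout (user-prefixed keys with a fixed step) is only an illustration of YCSB-style row keys, not a project-provided helper.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitUsertable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor desc = new HTableDescriptor("usertable");
    desc.addFamily(new HColumnDescriptor("family"));
    // 199 split keys give 200 regions; the numeric step is illustrative only.
    byte[][] splits = new byte[199][];
    for (int i = 1; i < 200; i++) {
      splits[i - 1] = Bytes.toBytes(String.format("user%04d", 1000 + i * 45));
    }
    admin.createTable(desc, splits);
    admin.close();
  }
}
{code}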
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839491#comment-13839491 ] stack commented on HBASE-10079: --- Patch is good. Nice work Jon. Makes sense this missing lock was exposed by hbase-9963. Pity we didn't catch it in tests previous. Any chance of a test? Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Attachments: 10079.v1.patch Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
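Editor's note: for readers following the "missing lock" discussion, here is a rough sketch of the kind of guard involved: increments apply their edits under a read lock while a flush takes the write side, so a flush cannot snapshot the memstore while an increment is half applied. The names below (updatesLock, applyIncrementToMemstore, snapshotMemstore) are hypothetical, not the actual HRegion code or the attached patch.
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative only; the real logic lives in HRegion.
class IncrementGuardSketch {
  private final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();

  long increment(byte[] row, byte[] qualifier, long amount) {
    updatesLock.readLock().lock();   // many increments may proceed concurrently
    try {
      return applyIncrementToMemstore(row, qualifier, amount);
    } finally {
      updatesLock.readLock().unlock();
    }
  }

  void flush() {
    updatesLock.writeLock().lock();  // waits for in-flight increments to drain
    try {
      snapshotMemstore();
    } finally {
      updatesLock.writeLock().unlock();
    }
  }

  private long applyIncrementToMemstore(byte[] r, byte[] q, long amt) { return amt; }
  private void snapshotMemstore() { }
}
{code}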
[jira] [Commented] (HBASE-9931) Optional setBatch for CopyTable to copy large rows in batches
[ https://issues.apache.org/jira/browse/HBASE-9931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839493#comment-13839493 ] stack commented on HBASE-9931: -- +1 Thanks for adding to 0.96. Optional setBatch for CopyTable to copy large rows in batches - Key: HBASE-9931 URL: https://issues.apache.org/jira/browse/HBASE-9931 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Dave Latham Assignee: Nick Dimiduk Fix For: 0.98.0, 0.96.1, 0.94.15 Attachments: HBASE-9931.00.patch, HBASE-9931.01.patch We've had CopyTable jobs fail because a small number of rows are wide enough to not fit into memory. If we could specify the batch size for CopyTable scans that shoud be able to break those large rows up into multiple iterations to save the heap. -- This message was sent by Atlassian JIRA (v6.1#6144)
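Editor's note: the batching described above corresponds to Scan.setBatch() on the scan that CopyTable drives. A minimal client-side sketch of the same idea is below; the table and family names are made up for illustration.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedWideRowScan {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "wide_table");   // hypothetical table name
    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("f"));
    // Return at most 1000 columns of a row per Result, so a very wide row
    // arrives as several partial Results instead of one heap-busting one.
    scan.setBatch(1000);
    ResultScanner scanner = table.getScanner(scan);
    for (Result partial : scanner) {
      // each Result may be only a slice of a row when the row is wider than the batch
      System.out.println(partial.size() + " cells");
    }
    scanner.close();
    table.close();
  }
}
{code}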
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839495#comment-13839495 ] Sergey Shelukhin commented on HBASE-10079: -- +1 Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Attachments: 10079.v1.patch Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9485) TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart
[ https://issues.apache.org/jira/browse/HBASE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839500#comment-13839500 ] Hudson commented on HBASE-9485: --- SUCCESS: Integrated in hbase-0.96 #213 (See [https://builds.apache.org/job/hbase-0.96/213/]) HBASE-9485 TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart (tedyu: rev 1547802) * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputCommitter.java TableOutputCommitter should implement recovery if we don't want jobs to start from 0 on RM restart -- Key: HBASE-9485 URL: https://issues.apache.org/jira/browse/HBASE-9485 Project: HBase Issue Type: Bug Components: mapreduce Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.0, 0.94.15, 0.96.2 Attachments: 9485-0.94.txt, 9485-v2.txt HBase extends OutputCommitter which turns recovery off. Meaning all completed maps are lost on RM restart and job starts from scratch. FileOutputCommitter implements recovery so we should look at that to see what is potentially needed for recovery. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839529#comment-13839529 ] stack commented on HBASE-8763: -- bq. 1) Memstore insert using long.max as the initial write number What will we do if two edits arrive with the same coordinates? How will we distinguish them if both have long.max during the time it takes to sync and convert long.max to a legit seqid? bq. Currently, we maintain an internal queue which might defer the read point bump up if transactions complete order is different than that of MVCC internal write queue. A reason to unify MVCC and WAL seqid (smile). bq. By doing above, it's possible to remove the logics maintaining writeQueue ... We need the writeQueue for performance reasons, right? We need to add edits in bulk under a lock and this lock is expensive to obtain (maybe I am missing something?) bq. ...so it means we can remove two locking and one queue loop in write code path. What are the two locks J? Otherwise, sounds great. Will look at patches... [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-10083) [89-fb] Better error handling for the compound bloom filter
Liyin Tang created HBASE-10083: -- Summary: [89-fb] Better error handling for the compound bloom filter Key: HBASE-10083 URL: https://issues.apache.org/jira/browse/HBASE-10083 Project: HBase Issue Type: Improvement Affects Versions: 0.89-fb Reporter: Liyin Tang Assignee: Liyin Tang When a RegionServer failed to load a bloom block from HDFS due to a timeout or other reasons, it threw out the exception and disabled the entire bloom filter for this HFile. This behavior does not make too much sense, especially for the compound bloom filter. Instead of disabling the bloom filter for the entire file, it could just return a potentially false positive result (true) and keep the bloom filter available. -- This message was sent by Atlassian JIRA (v6.1#6144)
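Editor's note: a rough sketch of the behavior being proposed, kept deliberately generic since the actual compound bloom filter classes are not shown here: on a failed bloom-block load, fall back to "maybe present" for that single query instead of marking the whole filter broken. All names are hypothetical.
{code}
import java.io.IOException;

// Hypothetical wrapper; real HBase bloom filter classes and method names differ.
class ForgivingBloomCheck {
  interface BloomBlockLoader {
    boolean mightContain(byte[] key) throws IOException;
  }

  private final BloomBlockLoader loader;

  ForgivingBloomCheck(BloomBlockLoader loader) {
    this.loader = loader;
  }

  boolean contains(byte[] key) {
    try {
      return loader.mightContain(key);
    } catch (IOException e) {
      // A transient HDFS error should not disable the filter for the whole HFile.
      // Answer "maybe present"; a false positive is always safe for a bloom filter.
      return true;
    }
  }
}
{code}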
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839554#comment-13839554 ] Jonathan Hsieh commented on HBASE-10079: Here's the dropped threads stack dump -- each one of these corresponds to a thread that didn't run. {code} Exception in thread Thread-58 java.lang.IllegalStateException: test was supposed to be in the cache at org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:337) at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:385) at org.apache.hadoop.hbase.client.HTable.init(HTable.java:165) at org.apache.hadoop.hbase.client.HTableFactory.createHTableInterface(HTableFactory.java:39) at org.apache.hadoop.hbase.client.HTablePool.createHTable(HTablePool.java:271) at org.apache.hadoop.hbase.client.HTablePool.findOrCreateTable(HTablePool.java:201) at org.apache.hadoop.hbase.client.HTablePool.getTable(HTablePool.java:180) at IncrementBlaster$1.run(IncrementBlaster.java:131) {code} Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Attachments: 10079.v1.patch Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-10084) [WINDOWS] bin\hbase.cmd should allow whitespaces in java.library.path and classpath
Enis Soztutar created HBASE-10084: - Summary: [WINDOWS] bin\hbase.cmd should allow whitespaces in java.library.path and classpath Key: HBASE-10084 URL: https://issues.apache.org/jira/browse/HBASE-10084 Project: HBase Issue Type: Bug Components: scripts Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.2 In case CLASSPATH or java.library.path from hadoop or HBASE_HOME contains directories with names containing whitespaces, the bin script spits out errors. We can fix the ws handling hopefully once and for all (or not) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839634#comment-13839634 ] Jonathan Hsieh commented on HBASE-10079: I'm having a hard time recreating the jagged counts. I tried reverting patches, and before and after the patch nkeywal provided. I think the flush problem was a red herring where I was biased by the customer problem I was recently working on. When I changed my tests to do 10 increments the pattern I saw really jumped out. Looking at the original numbers from this morning I see the same pattern present with the 25 increments. 80 threads, 25 increments == 3125 increments / thread. count = 246875 != 25 (flush) // one thread failed to start. count = 243750 != 25 (kill) // two threads failed to start. count = 246878 != 25 (kill -9) // one thread failed to start and we had 3 threads that sent increments that succeeded and retried but didn't get an ack because of kill -9). The last one threw me off because it wasn't regular but I think the explanation I have makes sense. I'm looking into seeing if my test code is bad (is there TableName documentation I ignored that says that the race in the stacktrace is my fault) or if we need to add some synchronization to this createTableNameIfNecessary method. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Attachments: 10079.v1.patch Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Comment Edited] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839634#comment-13839634 ] Jonathan Hsieh edited comment on HBASE-10079 at 12/5/13 1:05 AM: - I'm having a hard time recreating the jagged counts. I tried reverting patches, and before and after the patch nkeywal provided. I think the flush problem was a red herring where I was biased by the customer problem I was recently working on. When I changed my tests to do 10 increments the pattern I saw really jumped out. Looking at the original numbers from this morning I see the same pattern present with the 25 increments. 80 threads, 25 increments == 3125 increments / thread. count = 246875 != 25 (flush) // one thread failed to start. count = 243750 != 25 (kill) // two threads failed to start. count = 246878 != 25 (kill -9) // one thread failed to start and we had 3 threads that sent increments that succeeded and retried but didn't get an ack because of kill -9). The last one through me off because it wasn't regular but I think the explanation I have makes sense. I'm looking into seeing if my test code is bad (is there TableName documentation I ignoredthat says that the race in the stacktrace is my fault) or if we need to add some synchronization to this createTableNameIfNecessary method. was (Author: jmhsieh): I'm having a hard time recreating the jagged counts. I tried reverting patches, and before and after the patch nkeywal provided. I think the flush problem was a red herring where I was biased by the customer problem I was recently working on. When I changed my tests to do 10 increments the pattern I saw really jumped out. Looking at the original numbers from this morning I see the same pattern present with the 25 increments. 80 threads, 25 increments == 3125 increments / thread. count = 246875 != 25 (flush) // one thread failed to start. count = 243750 != 25 (kill) // two threads failed to start. count = 246878 != 25 (kill -9) // one thread failed to start and we had 3 threads that sent increments that succeeded and retried but didn't get an ack because of kill -9). The last one through we off because it wasn't regular but I think the explanation I have makes sense. I'm looking into seeing if my test code is bad (is there TableName documentation I ignoredthat says that the race in the stacktrace is my fault) or if we need to add some synchronization to this createTableNameIfNecessary method. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Attachments: 10079.v1.patch Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. 
Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839643#comment-13839643 ] Jonathan Hsieh commented on HBASE-10079: Hm.. HBASE-6580 deprecates HTablePool and happened when I wasn't paying attention. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Attachments: 10079.v1.patch Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-10085) Some regions aren't re-assigned after a mater restarts
Jeffrey Zhong created HBASE-10085: - Summary: Some regions aren't re-assigned after a mater restarts Key: HBASE-10085 URL: https://issues.apache.org/jira/browse/HBASE-10085 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 0.96.0 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.96.1 We see this issue happened in a cluster restart: 1) when shutdown a cluster, some regions are in offline state because no Region servers are available(stop RS and then Master) 2) When the cluster restarts, the offlined regions are forced to be offline again and SSH skip re-assigning them by function AM.processServerShutdown as shown below. {code} 2013-12-03 10:41:56,686 INFO [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE 2013-12-03 10:41:56,686 DEBUG [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on deadserver; forcing offline ... 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} ... 2013-12-03 10:41:57,223 WARN [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10084) [WINDOWS] bin\hbase.cmd should allow whitespaces in java.library.path and classpath
[ https://issues.apache.org/jira/browse/HBASE-10084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839645#comment-13839645 ] Jean-Marc Spaggiari commented on HBASE-10084: - I don't think this is specific to window$. Even under Linux I think we should allow whitespaces in the different paths. [WINDOWS] bin\hbase.cmd should allow whitespaces in java.library.path and classpath --- Key: HBASE-10084 URL: https://issues.apache.org/jira/browse/HBASE-10084 Project: HBase Issue Type: Bug Components: scripts Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.98.0, 0.96.2 In case CLASSPATH or java.library.path from hadoop or HBASE_HOME contains directories with names containing whitespaces, the bin script splits out errors. We can fix the ws handling hopefully once and for all (or not) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a master restarts
[ https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839656#comment-13839656 ] Jimmy Xiang commented on HBASE-10085: - Do you see this issue in 0.96.0 or 0.96.1? Some regions aren't re-assigned after a mater restarts -- Key: HBASE-10085 URL: https://issues.apache.org/jira/browse/HBASE-10085 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 0.96.0 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.96.1 We see this issue happened in a cluster restart: 1) when shutdown a cluster, some regions are in offline state because no Region servers are available(stop RS and then Master) 2) When the cluster restarts, the offlined regions are forced to be offline again and SSH skip re-assigning them by function AM.processServerShutdown as shown below. {code} 2013-12-03 10:41:56,686 INFO [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE 2013-12-03 10:41:56,686 DEBUG [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on deadserver; forcing offline ... 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} ... 2013-12-03 10:41:57,223 WARN [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput
[ https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839662#comment-13839662 ] stack commented on HBASE-8755: -- Here is more review on the patch. Make the changes suggested below and I'll +1 it. (Discussion off-line w/ Feng on this issue helped me better understand this patch and put to rest any notion that there is an easier 'fix' than the one proposed here. That said. There is much room for improvement but this can be done in a follow-on) Remove these asserts rather than comment them out given they depended on a facility this patch removes. Leaving them in will only make the next reader of the code -- very likely lacking the context you have -- feel uneasy thinking someone removed asserts just to get tests to pass. 8 -assertTrue(Should have an outstanding WAL edit, ((FSHLog) log).hasDeferredEntries()); 9 +//assertTrue(Should have an outstanding WAL edit, ((FSHLog) log).hasDeferredEntries()); On the below... +import java.util.Random; ... using a Random for choosing an arbitrary thread for a list of 4 is heavyweight. Can you not take last digit of timestamp or nano timestamp or some attribute of the edit instead? Something more lightweight? Please remove all mentions of AsyncFlush since it no longer exists: // all writes pending on AsyncWrite/AsyncFlush thread with Leaving it in will confuse readers when they can't find any such thread class. Is this comment right? // txid = failedTxid will fail by throwing asyncIOE Should it be = failedTxid? This should be volatile since it is set by AsyncSync and then used by the main FSHLog thread (you have an assert to check it not null -- maybe you ran into an issue here already?): + private IOException asyncIOE = null; bq. + private final Object bufferLock = new Object(); 'bufferLock' if a very generic name. Could it be more descriptive? It is a lock held for a short while while AsyncWriter moves queued edits off the globally seen queue to a local queue just before we send the edits to the WAL. You add a method named getPendingWrites that requires this lock be held. Could we tie the method and the lock together better? Name it pendingWritesLock? (The name of the list to hold the pending writes is pendingWrites). bq. ...because the HDFS write-method is pretty heavyweight as far as locking is concerned. I think the heavyweight referred to in the above is hbase locking, not hdfs locking as the comment would imply. If you agree (you know this code better than I), please adjust the comment. Comments on what these threads do will help the next code reader. AsyncWriter does adding of edits to HDFS. AsyncSyncer needs a comment because it is oxymoronic (though it makes sense in this context). In particular, a comment would draw out why we need so many instances of a syncer thread because everyone's first thought here is going to be why do we need this? Ditto on the AsyncNotifier. In the reviews above, folks have asked why we need this thread at all and a code reader will likely think similar on a first pass. Bottom-line, your patch raised questions from reviewers; it would be cool if the questions were answered in code comments where possible so the questions do not come up again. 4 + private final AsyncWriter asyncWriter; 5 + private final AsyncSyncer[] asyncSyncers = new AsyncSyncer[5]; 6 + private final AsyncNotifier asyncNotifier; You remove the LogSyncer facility in this patch. That is good (need to note this in release notes). 
Your patch should remove the optional flush config from hbase-default.xml too since it no longer is relevant. 3 -this.optionalFlushInterval = 4 - conf.getLong(hbase.regionserver.optionallogflushinterval, 1 * 1000); I see it here... hbase-common/src/main/resources/hbase-default.xml: namehbase.regionserver.optionallogflushinterval/name A small nit is you might look at other threads in hbase and see how they are named... 3 +asyncWriter = new AsyncWriter(AsyncHLogWriter); Ditto here: + asyncSyncers[i] = new AsyncSyncer(AsyncHLogSyncer + i); Probably make the number of asyncsyncers a configuration (you don't have to put the option out in hbase-default.xml.. just make it so that if someone is reading the code and trips over this issue, they can change it by adding to hbase-site.xml w/o having to change code -- lets not reproduce the hard-coded '80' that is in the head of dfsclient we discussed yesterday -- smile). ... and here: asyncNotifier = new AsyncNotifier(AsyncHLogNotifier); Not important but check out how other threads are named in hbase. It might be good if these better align. Maybe make a method for shutting down all these thread or use the Threads#shutdown method in Threads.java? bq. LOG.error(Exception while waiting for AsyncNotifier threads to die, e); Do LOG.error(Exception
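Editor's note: a small sketch tying together three of the review points above — pick a syncer thread from the txid instead of a java.util.Random, keep the shared IOException field volatile, and size the syncer pool from configuration. The names here (asyncSyncers, asyncIOE, the config key) are illustrative, not the patch itself.
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;

class AsyncSyncerPoolSketch {
  private final Thread[] asyncSyncers;
  private volatile IOException asyncIOE;   // written by syncer threads, read by the log thread

  AsyncSyncerPoolSketch(Configuration conf) {
    // Hypothetical config key: tunable from hbase-site.xml without a code change.
    int n = conf.getInt("hbase.regionserver.hlog.asyncsyncer.count", 5);
    asyncSyncers = new Thread[n];
    for (int i = 0; i < n; i++) {
      asyncSyncers[i] = new Thread(new Runnable() {
        public void run() { /* drain pending syncs */ }
      }, "AsyncHLogSyncer-" + i);
      asyncSyncers[i].setDaemon(true);
      asyncSyncers[i].start();
    }
  }

  // Cheaper than a Random: derive the syncer index from the txid itself.
  Thread syncerFor(long txid) {
    return asyncSyncers[(int) (txid % asyncSyncers.length)];
  }
}
{code}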
[jira] [Updated] (HBASE-10085) Some regions aren't re-assigned after a master restarts
[ https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-10085: -- Affects Version/s: (was: 0.96.0) 0.96.1 Some regions aren't re-assigned after a mater restarts -- Key: HBASE-10085 URL: https://issues.apache.org/jira/browse/HBASE-10085 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.96.1 We see this issue happened in a cluster restart: 1) when shutdown a cluster, some regions are in offline state because no Region servers are available(stop RS and then Master) 2) When the cluster restarts, the offlined regions are forced to be offline again and SSH skip re-assigning them by function AM.processServerShutdown as shown below. {code} 2013-12-03 10:41:56,686 INFO [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE 2013-12-03 10:41:56,686 DEBUG [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on deadserver; forcing offline ... 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} ... 2013-12-03 10:41:57,223 WARN [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a master restarts
[ https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839667#comment-13839667 ] Sergey Shelukhin commented on HBASE-10085: -- 0.96.1, as of last Tuesday Some regions aren't re-assigned after a mater restarts -- Key: HBASE-10085 URL: https://issues.apache.org/jira/browse/HBASE-10085 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.96.1 We see this issue happened in a cluster restart: 1) when shutdown a cluster, some regions are in offline state because no Region servers are available(stop RS and then Master) 2) When the cluster restarts, the offlined regions are forced to be offline again and SSH skip re-assigning them by function AM.processServerShutdown as shown below. {code} 2013-12-03 10:41:56,686 INFO [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE 2013-12-03 10:41:56,686 DEBUG [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on deadserver; forcing offline ... 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} ... 2013-12-03 10:41:57,223 WARN [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839677#comment-13839677 ] Jonathan Hsieh commented on HBASE-10079: Removed HTablePool code and still got a race. {code} Exception in thread Thread-1 java.lang.IllegalStateException: test was supposed to be in the cache at org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:337) at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:412) at org.apache.hadoop.hbase.client.HTable.init(HTable.java:150) at IncrementBlaster$1.run(IncrementBlaster.java:130) {code} This table cache is the root cause of the race. The testing program has n threads which waits until a rendezvous point before creating independent HTable instances with the same name. It is unreasonable for separate HTable constructors that just so happen to try to open the same table to race like this. Fix should be in the TableName cache. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Attachments: 10079.v1.patch Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
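Editor's note: a sketch of the kind of synchronization being suggested for the cache race, using putIfAbsent so concurrent constructors resolve to the same cached instance instead of one of them failing the "was supposed to be in the cache" check. This is illustrative, not the actual TableName implementation, which carries more validation.
{code}
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the TableName cache.
final class TableNameCacheSketch {
  private static final ConcurrentHashMap<String, TableNameCacheSketch> CACHE =
      new ConcurrentHashMap<String, TableNameCacheSketch>();

  private final String name;

  private TableNameCacheSketch(String name) {
    this.name = name;
  }

  static TableNameCacheSketch valueOf(String name) {
    TableNameCacheSketch existing = CACHE.get(name);
    if (existing != null) {
      return existing;
    }
    // Two threads may both reach here; putIfAbsent guarantees both get the same
    // winner rather than one of them observing a half-populated cache.
    TableNameCacheSketch created = new TableNameCacheSketch(name);
    TableNameCacheSketch raced = CACHE.putIfAbsent(name, created);
    return raced != null ? raced : created;
  }

  public String toString() { return name; }
}
{code}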
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839681#comment-13839681 ] Enis Soztutar commented on HBASE-10017: --- bq. I have reproduced data loss during bulk load. This happens under the same conditions as initial bug. 16 regions per table, I think it's not the only case. Again, partitioner wrongly maps last region data and resulting region HFile contains keys that shall not appear there. This partitioner is not intended to be used by bulk load. It is already there in the javadoc. TotalOrderPartitioner should be used instead. If there are changes to regions, LoadIncrementalFiles checks the boundaries (although not sure whether it handles multiple splits to the same range or merges). Other than that, the changes seem ok. However, I think we should get the region boundaries at the start, and treat the range as immutable for the lifetime of the partitioner. Although the table regions might undergo changes, we can at least guarantee a consistent mapping for key ranges. We can do a table.getStartKeys() and do a binary search for the key range considering the special region boundaries (empty start and stop rows). HRegionPartitioner, rows directed to last partition are wrongly mapped. --- Key: HBASE-10017 URL: https://issues.apache.org/jira/browse/HBASE-10017 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.94.6 Reporter: Roman Nikitchenko Priority: Critical Attachments: HBASE-10017-r1544633.patch, HBASE-10017-r1544633.patch, patchSiteOutput.txt Inside HRegionPartitioner class there is getPartition() method which should map first numPartitions regions to appropriate partitions 1:1. But based on condition last region is hashed which could lead to last reducer not having any data. This is considered serious issue. I reproduced this only starting from 16 regions per table. Original defect was found in 0.94.6 but at least today's trunk and 0.91 branch head have the same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
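Editor's note: a minimal sketch of the approach described in the comment above — take the start keys once, treat that snapshot as immutable, and binary-search each row key against it. Class and method names here are illustrative and this is not the attached HRegionPartitioner patch.
{code}
import java.io.IOException;
import java.util.Arrays;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

class StartKeyPartitionerSketch {
  private final byte[][] startKeys;   // snapshot taken once; treated as immutable afterwards

  StartKeyPartitionerSketch(HTable table) throws IOException {
    this.startKeys = table.getStartKeys();
  }

  int getPartition(byte[] rowKey, int numPartitions) {
    // binarySearch returns (-(insertionPoint) - 1) when there is no exact match;
    // the owning region is the one whose start key immediately precedes the row.
    int idx = Arrays.binarySearch(startKeys, rowKey, Bytes.BYTES_COMPARATOR);
    int region = idx >= 0 ? idx : -(idx + 1) - 1;
    if (region < 0) {
      region = 0;   // row sorts before the first start key (the empty start row)
    }
    // Map regions onto reducers 1:1 up to numPartitions, spill the rest round-robin.
    return region < numPartitions ? region : region % numPartitions;
  }
}
{code}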
[jira] [Updated] (HBASE-10085) Some regions aren't re-assigned after a master restarts
[ https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-10085: -- Status: Patch Available (was: Open) Some regions aren't re-assigned after a mater restarts -- Key: HBASE-10085 URL: https://issues.apache.org/jira/browse/HBASE-10085 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.96.1 Attachments: hbase-10085.patch We see this issue happened in a cluster restart: 1) when shutdown a cluster, some regions are in offline state because no Region servers are available(stop RS and then Master) 2) When the cluster restarts, the offlined regions are forced to be offline again and SSH skip re-assigning them by function AM.processServerShutdown as shown below. {code} 2013-12-03 10:41:56,686 INFO [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE 2013-12-03 10:41:56,686 DEBUG [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on deadserver; forcing offline ... 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} ... 2013-12-03 10:41:57,223 WARN [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10085) Some regions aren't re-assigned after a master restarts
[ https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-10085: -- Attachment: hbase-10085.patch Though we see this issue in the latest 0.96 code, it seems should happen in 0.96.0 code base from the code. Some regions aren't re-assigned after a mater restarts -- Key: HBASE-10085 URL: https://issues.apache.org/jira/browse/HBASE-10085 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.96.1 Attachments: hbase-10085.patch We see this issue happened in a cluster restart: 1) when shutdown a cluster, some regions are in offline state because no Region servers are available(stop RS and then Master) 2) When the cluster restarts, the offlined regions are forced to be offline again and SSH skip re-assigning them by function AM.processServerShutdown as shown below. {code} 2013-12-03 10:41:56,686 INFO [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE 2013-12-03 10:41:56,686 DEBUG [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on deadserver; forcing offline ... 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} ... 2013-12-03 10:41:57,223 WARN [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10079) Increments lost after flush
[ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839689#comment-13839689 ] Jonathan Hsieh commented on HBASE-10079: [~nkeywal] HBASE-9976 introduces the TableName cache which is the root cause. Increments lost after flush Key: HBASE-10079 URL: https://issues.apache.org/jira/browse/HBASE-10079 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.1 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.1 Attachments: 10079.v1.patch Testing 0.96.1rc1. With one process incrementing a row in a table, we increment single col. We flush or do kills/kill-9 and data is lost. flush and kill are likely the same problem (kill would flush), kill -9 may or may not have the same root cause. 5 nodes hadoop 2.1.0 (a pre cdh5b1 hdfs). hbase 0.96.1 rc1 Test: 25 increments on a single row an single col with various number of client threads (IncrementBlaster). Verify we have a count of 25 after the run (IncrementVerifier). Run 1: No fault injection. 5 runs. count = 25. on multiple runs. Correctness verified. 1638 inc/s throughput. Run 2: flushes table with incrementing row. count = 246875 !=25. correctness failed. 1517 inc/s throughput. Run 3: kill of rs hosting incremented row. count = 243750 != 25. Correctness failed. 1451 inc/s throughput. Run 4: one kill -9 of rs hosting incremented row. 246878.!= 25. Correctness failed. 1395 inc/s (including recovery) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a master restarts
[ https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839700#comment-13839700 ] Jimmy Xiang commented on HBASE-10085: - In step 2, do you mean the whole cluster restarts (both master + rs)? Is it easy to add a unit test? Some regions aren't re-assigned after a mater restarts -- Key: HBASE-10085 URL: https://issues.apache.org/jira/browse/HBASE-10085 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.96.1 Attachments: hbase-10085.patch We see this issue happened in a cluster restart: 1) when shutdown a cluster, some regions are in offline state because no Region servers are available(stop RS and then Master) 2) When the cluster restarts, the offlined regions are forced to be offline again and SSH skip re-assigning them by function AM.processServerShutdown as shown below. {code} 2013-12-03 10:41:56,686 INFO [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE 2013-12-03 10:41:56,686 DEBUG [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on deadserver; forcing offline ... 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} ... 2013-12-03 10:41:57,223 WARN [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839703#comment-13839703 ] Nick Dimiduk commented on HBASE-10017: -- Multiple splits are handled through retrying. Splits are made and the halves rewritten as independent HFiles with each pass, so this should be okay. [~rn] I'm very concerned about the bulkload data loss issue, but I cannot reproduce it using our existing unit tests (TestHRegionServerBulkLoad). Are you able to demonstrate the loss in a test? As [~enis] said, TOP should be used for generating HFiles files. Bulkload itself isn't performed inside a mapreduce job, so I'm confused about how the HRegionPartitioner comes into play in this scenario. HRegionPartitioner, rows directed to last partition are wrongly mapped. --- Key: HBASE-10017 URL: https://issues.apache.org/jira/browse/HBASE-10017 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.94.6 Reporter: Roman Nikitchenko Priority: Critical Attachments: HBASE-10017-r1544633.patch, HBASE-10017-r1544633.patch, patchSiteOutput.txt Inside HRegionPartitioner class there is getPartition() method which should map first numPartitions regions to appropriate partitions 1:1. But based on condition last region is hashed which could lead to last reducer not having any data. This is considered serious issue. I reproduced this only starting from 16 regions per table. Original defect was found in 0.94.6 but at least today's trunk and 0.91 branch head have the same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839702#comment-13839702 ] Enis Soztutar commented on HBASE-10017: --- bq. although not sure whether it handles multiple splits to the same range or merges Nick pointed out that we are actually splitting those files by re-writing those files. I thought that we were creating actual reference files. HRegionPartitioner, rows directed to last partition are wrongly mapped. --- Key: HBASE-10017 URL: https://issues.apache.org/jira/browse/HBASE-10017 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.94.6 Reporter: Roman Nikitchenko Priority: Critical Attachments: HBASE-10017-r1544633.patch, HBASE-10017-r1544633.patch, patchSiteOutput.txt Inside HRegionPartitioner class there is getPartition() method which should map first numPartitions regions to appropriate partitions 1:1. But based on condition last region is hashed which could lead to last reducer not having any data. This is considered serious issue. I reproduced this only starting from 16 regions per table. Original defect was found in 0.94.6 but at least today's trunk and 0.91 branch head have the same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a master restarts
[ https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839707#comment-13839707 ] Jeffrey Zhong commented on HBASE-10085: --- I mean whole cluster(master + rs). I'll add a unit test to cover this. Thanks. Some regions aren't re-assigned after a mater restarts -- Key: HBASE-10085 URL: https://issues.apache.org/jira/browse/HBASE-10085 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.96.1 Attachments: hbase-10085.patch We see this issue happened in a cluster restart: 1) when shutdown a cluster, some regions are in offline state because no Region servers are available(stop RS and then Master) 2) When the cluster restarts, the offlined regions are forced to be offline again and SSH skip re-assigning them by function AM.processServerShutdown as shown below. {code} 2013-12-03 10:41:56,686 INFO [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE 2013-12-03 10:41:56,686 DEBUG [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on deadserver; forcing offline ... 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} ... 2013-12-03 10:41:57,223 WARN [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} {code} -- This message was sent by Atlassian JIRA (v6.1#6144)