[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448512#comment-13448512
 ] 

stack commented on HBASE-6649:
--

This patch makes sense to me.  We replicate all up to the exception and then 
next time in, we should pick up the IOE again.  Want me to commit this DD?
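
The retry behavior described above can be sketched as follows; this is a hypothetical illustration (`ShipSketch` and its method names are made up, not the actual ReplicationSource code): entries are shipped one at a time, an IOException ends the pass without advancing past the failure, and the next pass resumes from the recorded position and re-hits the IOE until it clears.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical sketch, not HBase's real replication code: ship edits until an
// IOException, keeping the position at the last successfully shipped entry so
// the next pass picks up (and re-hits) the error.
class ShipSketch {
    static int shipUntilError(List<String> entries, int failAt) {
        int shipped = 0;
        try {
            for (String entry : entries) {
                if (shipped == failAt) {
                    // simulate the remote cluster throwing an IOE mid-batch
                    throw new IOException("remote error at " + entry);
                }
                shipped++;                 // this entry replicated fine
            }
        } catch (IOException ioe) {
            // stop here; position stays at `shipped`, next pass retries from it
        }
        return shipped;                    // position to resume from
    }
}
```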

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: 6649-1.patch, 6649-2.txt, HBase-0.92 #495 test - 
 queueFailover [Jenkins].html, HBase-0.92 #502 test - queueFailover 
 [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB  
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6514) unknown metrics type: org.apache.hadoop.hbase.metrics.histogram.MetricsHistogram

2012-09-05 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448515#comment-13448515
 ] 

Elliott Clark commented on HBASE-6514:
--

Thanks Stack.  Always nice to have a double check.

 unknown metrics type: 
 org.apache.hadoop.hbase.metrics.histogram.MetricsHistogram
 

 Key: HBASE-6514
 URL: https://issues.apache.org/jira/browse/HBASE-6514
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.92.2, 0.94.0
 Environment: MacOS 10.8
 Oracle JDK 1.7
Reporter: Archimedes Trajano
Assignee: Elliott Clark
 Fix For: 0.92.2, 0.96.0, 0.94.2

 Attachments: FrameworkTest.java, FrameworkTest.java, 
 HBASE-6514-94-0.patch, HBASE-6514-trunk-0.patch, out.txt


 When trying to run a unit test that just starts up and shuts down the server, 
 the following errors occur in System.out:
 01:10:59,874 ERROR MetricsUtil:116 - unknown metrics type: 
 org.apache.hadoop.hbase.metrics.histogram.MetricsHistogram
 01:10:59,874 ERROR MetricsUtil:116 - unknown metrics type: 
 org.apache.hadoop.hbase.metrics.histogram.MetricsHistogram
 01:10:59,875 ERROR MetricsUtil:116 - unknown metrics type: 
 org.apache.hadoop.hbase.metrics.histogram.MetricsHistogram
 01:10:59,875 ERROR MetricsUtil:116 - unknown metrics type: 
 org.apache.hadoop.hbase.metrics.histogram.MetricsHistogram



[jira] [Commented] (HBASE-3976) Disable Block Cache On Compactions

2012-09-05 Thread Mikhail Bautin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448521#comment-13448521
 ] 

Mikhail Bautin commented on HBASE-3976:
---

Lars: I agree, cache-on-flush is definitely the most useful. This is what we 
are now using in production for some workloads.
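
The caching policy under discussion can be sketched as below; the names are hypothetical (this is not HBase's actual CacheConfig API): blocks written by flushes or read by user scans are cached, while blocks read during a compaction bypass the LRU cache so they cannot evict hot blocks.

```java
// Hypothetical sketch of the block-cache policy discussed above; the enum and
// method are illustrative, not HBase's real API.
class CachePolicySketch {
    enum Access { USER_SCAN, FLUSH, COMPACTION }

    static boolean shouldCacheBlock(boolean familyCachingEnabled, Access access) {
        if (!familyCachingEnabled) {
            return false;                  // family has caching turned off
        }
        // Never populate the LRU from compaction reads; keep hot blocks hot.
        return access != Access.COMPACTION;
    }
}
```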

 Disable Block Cache On Compactions
 --

 Key: HBASE-3976
 URL: https://issues.apache.org/jira/browse/HBASE-3976
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.90.3
Reporter: Karthick Sankarachary
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: HBASE-3976.patch, HBASE-3976-unconditional.patch, 
 HBASE-3976-V3.patch


 Is there a good reason to believe that caching blocks during compactions is 
 beneficial? Currently, if block cache is enabled on a certain family, then 
 every time it's compacted, we load all of its blocks into the (LRU) cache, at 
 the expense of the legitimately hot ones.
 As a matter of fact, this concern was raised earlier in HBASE-1597, which 
 rightly points out that we should not bog down the LRU with unnecessary 
 blocks during compaction. Even though that issue has been marked as fixed, 
 it looks like it ought to be reopened.
 Should we err on the side of caution and not cache blocks during compactions, 
 period (as illustrated in the attached patch)? Or can we be selectively 
 aggressive about which blocks get cached during compaction (e.g., only 
 cache blocks from the most recent files)?



[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-09-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448525#comment-13448525
 ] 

Hudson commented on HBASE-4050:
---

Integrated in HBase-TRUNK #3304 (See 
[https://builds.apache.org/job/HBase-TRUNK/3304/])
HBASE-4050 Clean up BaseMetricsSourceImpl (Revision 1381008)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSourceImpl.java


 Update HBase metrics framework to metrics2 framework
 

 Key: HBASE-4050
 URL: https://issues.apache.org/jira/browse/HBASE-4050
 Project: HBase
  Issue Type: New Feature
  Components: metrics
Affects Versions: 0.90.4
 Environment: Java 6
Reporter: Eric Yang
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.96.0

 Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
 HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
 HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
 HBASE-4050-7.patch, HBASE-4050-8_1.patch, HBASE-4050-8.patch, HBASE-4050.patch


 The metrics framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
 and it might get removed in a future Hadoop release.  Hence, HBase needs to 
 revise its dependency on MetricsContext to use the metrics2 framework.



[jira] [Commented] (HBASE-6533) [replication] replication will be block if WAL compress set differently in master and slave configuration

2012-09-05 Thread terry zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448537#comment-13448537
 ] 

terry zhang commented on HBASE-6533:


This is because the master sends the HLog entry in compressed form, but the slave 
does not know about it. So when the slave's IPC HBaseServer deserializes the 
buffer and reads the HLog entry fields, an error occurs. We can have the master 
send the buffer in uncompressed form; then the slave works fine whether or not 
the master uses HLog compression.
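
A minimal sketch of that proposal follows; the names are invented for illustration (this is not the real HBase wire code): whatever the on-disk WAL format is, the entry is expanded to its plain form before shipping, so the slave never needs the master's compression dictionary.

```java
// Hypothetical sketch of "always ship uncompressed", not HBase's actual API:
// the master decodes a (possibly dictionary-compressed) WAL entry to its plain
// form before serializing it for the slave.
class WireFormatSketch {
    // stand-in for dictionary-based WAL decompression
    static String decompress(String compressed) {
        return compressed.replace("#", "row");
    }

    static String toWireFormat(String entry, boolean walCompressed) {
        // If the WAL stored the entry compressed, expand it first;
        // the bytes on the wire are always uncompressed.
        return walCompressed ? decompress(entry) : entry;
    }
}
```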

 [replication] replication will be block if WAL compress set differently in 
 master and slave configuration
 -

 Key: HBASE-6533
 URL: https://issues.apache.org/jira/browse/HBASE-6533
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.94.0
Reporter: terry zhang
Priority: Critical

 As we know, in HBase 0.94.0 we have the configuration below:
 <property>
   <name>hbase.regionserver.wal.enablecompression</name>
   <value>true</value>
 </property>
 If we enable it in the master cluster and disable it in the slave cluster, then 
 replication will not work. It will throw unwrapRemoteException again and 
 again in the master cluster:
 2012-08-09 12:49:55,892 WARN 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Can't 
 replicate because of an error
  on the remote cluster: 
 java.io.IOException: IPC server unable to read call parameters: Error in 
 readFields
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
 at 
 org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
 at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:635)
 at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:365)
 Caused by: org.apache.hadoop.ipc.RemoteException: IPC server unable to read 
 call parameters: Error in readFields
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:921)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:151)
 at $Proxy13.replicateLogEntries(Unknown Source)
 at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:616)
 ... 1 more 
 This is because the slave cluster cannot parse the HLog entry:
 2012-08-09 14:46:05,891 WARN org.apache.hadoop.ipc.HBaseServer: Unable to 
 read call parameters for client 10.232.98.89
 java.io.IOException: Error in readFields
 at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:685)
 at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:586)
 at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:635)
 at 
 org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1292)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1207)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:735)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:524)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:499)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.EOFException
 at java.io.DataInputStream.readFully(DataInputStream.java:180)
 at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:2254)
 at 
 org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFields(WALEdit.java:146)
 at 
 org.apache.hadoop.hbase.regionserver.wal.HLog$Entry.readFields(HLog.java:1767)
 at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:682)
 ... 11 more 


[jira] [Updated] (HBASE-6533) [replication] replication will be block if WAL compress set differently in master and slave configuration

2012-09-05 Thread terry zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

terry zhang updated HBASE-6533:
---

Priority: Critical  (was: Major)

 [replication] replication will be block if WAL compress set differently in 
 master and slave configuration
 -

 Key: HBASE-6533
 URL: https://issues.apache.org/jira/browse/HBASE-6533
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.94.0
Reporter: terry zhang
Priority: Critical




[jira] [Updated] (HBASE-6533) [replication] replication will be block if WAL compress set differently in master and slave configuration

2012-09-05 Thread terry zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

terry zhang updated HBASE-6533:
---

Attachment: hbase-6533.patch

 [replication] replication will be block if WAL compress set differently in 
 master and slave configuration
 -

 Key: HBASE-6533
 URL: https://issues.apache.org/jira/browse/HBASE-6533
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.94.0
Reporter: terry zhang
Priority: Critical
 Attachments: hbase-6533.patch





[jira] [Updated] (HBASE-6592) [shell] Add means of custom formatting output by column

2012-09-05 Thread Jie Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Huang updated HBASE-6592:
-

Attachment: hbase-6592.patch

 [shell] Add means of custom formatting output by column
 ---

 Key: HBASE-6592
 URL: https://issues.apache.org/jira/browse/HBASE-6592
 Project: HBase
  Issue Type: New Feature
  Components: shell
Reporter: stack
Priority: Minor
  Labels: noob
 Attachments: hbase-6592.patch


 See Jacques' suggestion toward the end of this thread for how we should allow 
 adding a custom formatter per column to use when outputting column content in 
 the shell: 
 http://search-hadoop.com/m/2WxUB1fuxL11/Printing+integers+in+the+Hbase+shell&subj=Printing+integers+in+the+Hbase+shell
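
The per-column formatter idea could look roughly like this in spirit; everything here is hypothetical (the shell is actually implemented in JRuby, and these names are made up): a formatter is registered per column, with a raw-bytes fallback for columns that have none.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a per-column output formatter; not the HBase shell's
// actual code, just an illustration of the requested feature.
class ColumnFormatterSketch {
    interface Formatter { String format(byte[] value); }

    static final Map<String, Formatter> FORMATTERS = new HashMap<>();

    static String render(String column, byte[] value) {
        Formatter f = FORMATTERS.get(column);
        if (f == null) {
            return new String(value);      // default: render raw bytes as text
        }
        return f.format(value);            // user-supplied per-column formatter
    }
}
```

For example, registering `v -> Integer.toString(ByteBuffer.wrap(v).getInt())` for a counter column would print integers instead of the raw byte rendering.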



[jira] [Updated] (HBASE-6592) [shell] Add means of custom formatting output by column

2012-09-05 Thread Jie Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Huang updated HBASE-6592:
-

Attachment: (was: hbase-6592.patch)

 [shell] Add means of custom formatting output by column
 ---

 Key: HBASE-6592
 URL: https://issues.apache.org/jira/browse/HBASE-6592
 Project: HBase
  Issue Type: New Feature
  Components: shell
Reporter: stack
Priority: Minor
  Labels: noob
 Attachments: hbase-6592.patch





[jira] [Commented] (HBASE-6592) [shell] Add means of custom formatting output by column

2012-09-05 Thread Jie Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448548#comment-13448548
 ] 

Jie Huang commented on HBASE-6592:
--

Added a unit test for this new feature. Any ideas?

 [shell] Add means of custom formatting output by column
 ---

 Key: HBASE-6592
 URL: https://issues.apache.org/jira/browse/HBASE-6592
 Project: HBase
  Issue Type: New Feature
  Components: shell
Reporter: stack
Priority: Minor
  Labels: noob
 Attachments: hbase-6592.patch





[jira] [Updated] (HBASE-6533) [replication] replication will be block if WAL compress set differently in master and slave configuration

2012-09-05 Thread terry zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

terry zhang updated HBASE-6533:
---

Fix Version/s: 0.94.3

 [replication] replication will be block if WAL compress set differently in 
 master and slave configuration
 -

 Key: HBASE-6533
 URL: https://issues.apache.org/jira/browse/HBASE-6533
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.94.0
Reporter: terry zhang
Priority: Critical
 Fix For: 0.94.3

 Attachments: hbase-6533.patch





[jira] [Updated] (HBASE-6533) [replication] replication will be block if WAL compress set differently in master and slave configuration

2012-09-05 Thread terry zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

terry zhang updated HBASE-6533:
---

Assignee: terry zhang

 [replication] replication will be block if WAL compress set differently in 
 master and slave configuration
 -

 Key: HBASE-6533
 URL: https://issues.apache.org/jira/browse/HBASE-6533
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.94.0
Reporter: terry zhang
Assignee: terry zhang
Priority: Critical
 Fix For: 0.94.3

 Attachments: hbase-6533.patch


 as we know in hbase 0.94.0 we have a configuration below
   property
 namehbase.regionserver.wal.enablecompression/name
  valuetrue/value
   /property
 if we enable it in master cluster and disable it in slave cluster . Then 
 replication will not work. It will throw unwrapRemoteException again and 
 again in master cluster.
 2012-08-09 12:49:55,892 WARN 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Can't 
 replicate because of an error
  on the remote cluster: 
 java.io.IOException: IPC server unable to read call parameters: Error in 
 readFields
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
 at 
 org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
 at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:635)
 at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:365)
 Caused by: org.apache.hadoop.ipc.RemoteException: IPC server unable to read 
 call parameters: Error in readFields
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:921)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:151)
 at $Proxy13.replicateLogEntries(Unknown Source)
 at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:616)
 ... 1 more 
 This is because Slave cluster can not parse the hlog entry .
 2012-08-09 14:46:05,891 WARN org.apache.hadoop.ipc.HBaseServer: Unable to 
 read call parameters for client 10.232.98.89
 java.io.IOException: Error in readFields
 at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:685)
 at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:586)
 at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:635)
 at 
 org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1292)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1207)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:735)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:524)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:499)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.EOFException
 at java.io.DataInputStream.readFully(DataInputStream.java:180)
 at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:2254)
 at 
 org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFields(WALEdit.java:146)
 at 
 org.apache.hadoop.hbase.regionserver.wal.HLog$Entry.readFields(HLog.java:1767)
 at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:682)
 ... 11 more 
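As an aside, the EOFException at the bottom of that trace is the generic symptom of reading a length-prefixed record that was truncated in flight. A minimal, HBase-independent sketch of the same failure mode (the toy record format here is invented for illustration; it is not KeyValue's real wire format):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TruncatedRecordDemo {
    // Serialize a toy length-prefixed record: [int length][payload bytes].
    static byte[] serialize(byte[] payload) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeInt(payload.length);
            out.write(payload);
            out.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e); // cannot happen for in-memory streams
        }
    }

    // Deserialize with readFully, as readFields-style code does: a truncated
    // buffer makes readFully hit end-of-stream and throw EOFException.
    static byte[] deserialize(byte[] buf) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf));
        byte[] payload = new byte[in.readInt()];
        in.readFully(payload);
        return payload;
    }

    // True when deserializing the (possibly truncated) buffer fails with EOF.
    static boolean failsWithEof(byte[] buf) {
        try {
            deserialize(buf);
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] full = serialize("edit".getBytes(StandardCharsets.UTF_8));
        byte[] truncated = Arrays.copyOf(full, full.length - 2); // cut mid-payload
        System.out.println("full ok=" + !failsWithEof(full)
                + ", truncated EOF=" + failsWithEof(truncated));
    }
}
```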

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-6719) [replication] Data will lose if open a Hlog failed more than maxRetriesMultiplier

2012-09-05 Thread terry zhang (JIRA)
terry zhang created HBASE-6719:
--

 Summary: [replication] Data will lose if open a Hlog failed more 
than maxRetriesMultiplier
 Key: HBASE-6719
 URL: https://issues.apache.org/jira/browse/HBASE-6719
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.94.1
Reporter: terry zhang
Assignee: terry zhang
Priority: Critical
 Fix For: 0.94.2


Please take a look at the code below.

{code:title=ReplicationSource.java|borderStyle=solid}
  protected boolean openReader(int sleepMultiplier) {
    ...
    catch (IOException ioe) {
      LOG.warn(peerClusterZnode + " Got: ", ioe);
      // TODO Need a better way to determine if a file is really gone but
      // TODO without scanning all logs dir
      if (sleepMultiplier == this.maxRetriesMultiplier) {
        LOG.warn("Waited too long for this file, considering dumping");
        // Opening the file failed more than maxRetriesMultiplier (default 10) times
        return !processEndOfFile();
      }
    }
    return true;
    ...
  }

  protected boolean processEndOfFile() {
    if (this.queue.size() != 0) {     // Skips this HLog: data loss
      this.currentPath = null;
      this.position = 0;
      return true;
    } else if (this.queueRecovered) { // Terminates the failover replication
                                      // source thread: data loss
      this.manager.closeRecoveredQueue(this);
      LOG.info("Finished recovering the queue");
      this.running = false;
      return true;
    }
    return false;
  }
{code}


Sometimes HDFS runs into a problem while the HLog file itself is fine, so after 
HDFS comes back, some data is lost and cannot be found in the slave cluster.
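To make the failure mode concrete, here is a simplified, hypothetical model of the retry logic (the names mirror the snippet above, but this is an illustration, not the real ReplicationSource):

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class RetrySkipDemo {
    static final int MAX_RETRIES_MULTIPLIER = 10; // default in 0.94

    // Pretend the open succeeds only after `transientFailures` failed attempts.
    static boolean openSucceeds(int attempt, int transientFailures) {
        return attempt > transientFailures;
    }

    // Returns how many logs remain queued after trying to open the head log
    // while HDFS fails `transientFailures` times in a row.
    static int simulate(int transientFailures) {
        Queue<String> queue = new ArrayDeque<>();
        queue.add("hlog-000001"); // the log we are currently replicating
        queue.add("hlog-000002");

        int sleepMultiplier = 1;
        while (!openSucceeds(sleepMultiplier, transientFailures)) {
            if (sleepMultiplier == MAX_RETRIES_MULTIPLIER) {
                // Mirrors processEndOfFile() with a non-empty queue:
                // the current log is dropped and its edits are never shipped.
                queue.poll();
                break;
            }
            sleepMultiplier++;
        }
        return queue.size();
    }

    public static void main(String[] args) {
        // A 15-attempt outage exceeds the retry budget: one log is lost.
        System.out.println("remaining after long outage: " + simulate(15));
        // A 3-attempt outage recovers in time: nothing is lost.
        System.out.println("remaining after short outage: " + simulate(3));
    }
}
```

The point of the simulation: a transient HDFS outage longer than the retry budget silently drops a whole HLog from the replication queue.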



[jira] [Updated] (HBASE-6719) [replication] Data will lose if open a Hlog failed more than maxRetriesMultiplier

2012-09-05 Thread terry zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

terry zhang updated HBASE-6719:
---

Attachment: hbase-6719.patch

 [replication] Data will lose if open a Hlog failed more than 
 maxRetriesMultiplier
 -

 Key: HBASE-6719
 URL: https://issues.apache.org/jira/browse/HBASE-6719
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.94.1
Reporter: terry zhang
Assignee: terry zhang
Priority: Critical
 Fix For: 0.94.2

 Attachments: hbase-6719.patch





[jira] [Commented] (HBASE-6719) [replication] Data will lose if open a Hlog failed more than maxRetriesMultiplier

2012-09-05 Thread terry zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448578#comment-13448578
 ] 

terry zhang commented on HBASE-6719:


I think we need to handle the IOException carefully and better not skip the 
HLog unless it is really corrupted. We can log this failure as fatal in the 
regionserver's log and skip the HLog (by deleting the HLog's zk node manually) 
if we have to.

 [replication] Data will lose if open a Hlog failed more than 
 maxRetriesMultiplier
 -

 Key: HBASE-6719
 URL: https://issues.apache.org/jira/browse/HBASE-6719
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.94.1
Reporter: terry zhang
Assignee: terry zhang
Priority: Critical
 Fix For: 0.94.2

 Attachments: hbase-6719.patch





[jira] [Commented] (HBASE-6719) [replication] Data will lose if open a Hlog failed more than maxRetriesMultiplier

2012-09-05 Thread terry zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448584#comment-13448584
 ] 

terry zhang commented on HBASE-6719:


Now we can handle it as below:

hlog size = 0, HLog queue = 0, recovery thread = yes: terminate the recovery 
thread (return !processEndOfFile())
hlog size = 0, HLog queue = 0, recovery thread = no: continue the loop (return 
!processEndOfFile())
hlog size = 0, HLog queue != 0, recovery thread = yes: skip the HLog (return 
!processEndOfFile())
hlog size = 0, HLog queue != 0, recovery thread = no: skip the HLog (return 
!processEndOfFile())

hlog size = 1, HLog queue = 0, recovery thread = yes: log a fatal error in the 
regionserver's log
hlog size = 1, HLog queue = 0, recovery thread = no: log a fatal error in the 
regionserver's log
hlog size = 1, HLog queue != 0, recovery thread = yes: log a fatal error in the 
regionserver's log
hlog size = 1, HLog queue != 0, recovery thread = no: log a fatal error in the 
regionserver's log
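The eight cases above collapse to three conditions. A hedged sketch of how the decision could be encoded (the enum and method names are invented here, not taken from the attached patch):

```java
public class EndOfFileDecision {
    enum Action { TERMINATE_RECOVERY, CONTINUE_LOOP, SKIP_HLOG, LOG_FATAL }

    // hlogEmpty: the HLog size is 0; queueEmpty: no more HLogs are queued;
    // recovered: this is a failover (recovered-queue) replication source.
    static Action decide(boolean hlogEmpty, boolean queueEmpty, boolean recovered) {
        if (!hlogEmpty) {
            // A non-empty log must never be skipped silently: surface a
            // fatal error in the regionserver's log instead.
            return Action.LOG_FATAL;
        }
        if (!queueEmpty) {
            return Action.SKIP_HLOG; // empty log, more work queued: safe to skip
        }
        return recovered ? Action.TERMINATE_RECOVERY : Action.CONTINUE_LOOP;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true, true));    // TERMINATE_RECOVERY
        System.out.println(decide(true, true, false));   // CONTINUE_LOOP
        System.out.println(decide(true, false, true));   // SKIP_HLOG
        System.out.println(decide(false, false, false)); // LOG_FATAL
    }
}
```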

 [replication] Data will lose if open a Hlog failed more than 
 maxRetriesMultiplier
 -

 Key: HBASE-6719
 URL: https://issues.apache.org/jira/browse/HBASE-6719
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.94.1
Reporter: terry zhang
Assignee: terry zhang
Priority: Critical
 Fix For: 0.94.2

 Attachments: hbase-6719.patch





[jira] [Commented] (HBASE-6719) [replication] Data will lose if open a Hlog failed more than maxRetriesMultiplier

2012-09-05 Thread terry zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448586#comment-13448586
 ] 

terry zhang commented on HBASE-6719:


hlog size = 1 means the HLog size is not 0 (hlog size != 0).

 [replication] Data will lose if open a Hlog failed more than 
 maxRetriesMultiplier
 -

 Key: HBASE-6719
 URL: https://issues.apache.org/jira/browse/HBASE-6719
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.94.1
Reporter: terry zhang
Assignee: terry zhang
Priority: Critical
 Fix For: 0.94.2

 Attachments: hbase-6719.patch





[jira] [Commented] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.

2012-09-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448616#comment-13448616
 ] 

ramkrishna.s.vasudevan commented on HBASE-6299:
---

[~maryannxue]
Do you have an updated patch for this? Can we provide one updated patch for 
this issue?

 RS starts region open while fails ack to HMaster.sendRegionOpen() causes 
 inconsistency in HMaster's region state and a series of successive problems.
 -

 Key: HBASE-6299
 URL: https://issues.apache.org/jira/browse/HBASE-6299
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.94.0
Reporter: Maryann Xue
Assignee: Maryann Xue
Priority: Critical
 Attachments: HBASE-6299.patch, HBASE-6299-v2.patch


 1. HMaster tries to assign a region to an RS.
 2. HMaster creates a RegionState for this region and puts it into 
 regionsInTransition.
 3. In the first assign attempt, HMaster calls RS.openRegion(). The RS 
 receives the open region request and starts to proceed, with success 
 eventually. However, due to network problems, HMaster fails to receive the 
 response for the openRegion() call, and the call times out.
 4. HMaster attempts to assign for a second time, choosing another RS. 
 5. But since HMaster's OpenedRegionHandler has already been triggered by the 
 region open on the previous RS, and the RegionState has already been removed 
 from regionsInTransition, HMaster considers the unassigned ZK node 
 RS_ZK_REGION_OPENING updated by the second attempt invalid and ignores it.
 6. The unassigned ZK node remains, and a later unassign fails because 
 RS_ZK_REGION_CLOSING cannot be created.
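 The race in steps 3-5 can be modeled in a few lines. This is a hypothetical, heavily simplified stand-in for the master's in-transition bookkeeping, not the real AssignmentManager:

```java
import java.util.HashMap;
import java.util.Map;

public class AssignRaceDemo {
    // Simplified model of the master's regionsInTransition map.
    final Map<String, String> regionsInTransition = new HashMap<>();

    void startAssign(String region) {
        regionsInTransition.put(region, "PENDING_OPEN");
    }

    // OpenedRegionHandler: the region opened on some RS, so the master
    // removes it from transition -- even though its openRegion() RPC timed out.
    void handleOpened(String region) {
        regionsInTransition.remove(region);
    }

    // A later RS_ZK_REGION_OPENING event (the second assign attempt) is
    // ignored when the region is no longer in transition, leaving the
    // stale unassigned ZK node behind. Returns whether the event was accepted.
    boolean handleOpening(String region) {
        if (!regionsInTransition.containsKey(region)) {
            return false; // ignored: this is the bug's trigger
        }
        regionsInTransition.put(region, "OPENING");
        return true;
    }

    public static void main(String[] args) {
        AssignRaceDemo master = new AssignRaceDemo();
        master.startAssign("b713fd655fa02395496c5a6e39ddf568");
        // RS #1 opened the region despite the master's RPC timing out:
        master.handleOpened("b713fd655fa02395496c5a6e39ddf568");
        // RS #2 (second assign attempt) posts OPENING to ZK -- ignored:
        boolean accepted = master.handleOpening("b713fd655fa02395496c5a6e39ddf568");
        System.out.println("second-attempt OPENING accepted: " + accepted);
    }
}
```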
 {code}
 2012-06-29 07:03:38,870 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
 region 
 CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.;
  
 plan=hri=CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.,
  src=swbss-hadoop-004,60020,1340890123243, 
 dest=swbss-hadoop-006,60020,1340890678078
 2012-06-29 07:03:38,870 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
  to swbss-hadoop-006,60020,1340890678078
 2012-06-29 07:03:38,870 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=swbss-hadoop-002:6, 
 region=b713fd655fa02395496c5a6e39ddf568
 2012-06-29 07:06:28,882 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, 
 region=b713fd655fa02395496c5a6e39ddf568
 2012-06-29 07:06:32,291 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, 
 region=b713fd655fa02395496c5a6e39ddf568
 2012-06-29 07:06:32,299 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENED, server=swbss-hadoop-006,60020,1340890678078, 
 region=b713fd655fa02395496c5a6e39ddf568
 2012-06-29 07:06:32,299 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
 event for 
 CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
  from serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=518945, 
 regions=575, usedHeap=15282, maxHeap=31301); deleting unassigned node
 2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x2377fee2ae80007 Deleting existing unassigned node for 
 b713fd655fa02395496c5a6e39ddf568 that is in expected state RS_ZK_REGION_OPENED
 2012-06-29 07:06:32,301 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x2377fee2ae80007 Successfully deleted unassigned node for 
 region b713fd655fa02395496c5a6e39ddf568 in expected state RS_ZK_REGION_OPENED
 2012-06-29 07:06:32,301 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: The master has 
 opened the region 
 CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
  that was online on serverName=swbss-hadoop-006,60020,1340890678078, 
 load=(requests=518945, regions=575, usedHeap=15282, maxHeap=31301)
 2012-06-29 07:07:41,140 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of 
 CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
  to serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=0, 
 regions=575, usedHeap=0, maxHeap=0), 

[jira] [Commented] (HBASE-3866) Script to add regions gradually to a new regionserver.

2012-09-05 Thread Aravind Gottipati (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448647#comment-13448647
 ] 

Aravind Gottipati commented on HBASE-3866:
--

I will defer to you folks regarding including this script with the 
distribution.  Stack's suggestion of closing the JIRA is a fine one, like he 
said - this would leave the script here for others to use.

I would however like to note a few things.

1. The script attached here is outdated.  A newer version of the script that 
worked with 0.92 is here 
(https://github.com/aravind/hbase-utils/blob/master/region_mover.rb).  I 
haven't been keeping up with the latest, so there is a very good chance it 
might not work with versions after 0.92.

2. The script is pretty inefficient in how it moves and balances regions.  It 
maintains an internal hashmap (two of them, even) mapping servers to region 
counts, to keep the region count balanced.

3. It is as portable as the original region mover script, since it re-uses most 
of the same mechanisms.


 Script to add regions gradually to a new regionserver.
 --

 Key: HBASE-3866
 URL: https://issues.apache.org/jira/browse/HBASE-3866
 Project: HBase
  Issue Type: Improvement
  Components: scripts
Affects Versions: 0.90.2
Reporter: Aravind Gottipati
Priority: Minor
 Attachments: 3866-max-regions-per-iteration.patch, slow_balancer.rb, 
 slow_balancer.rb


 When a new region server is brought online, the current balancer kicks off a 
whole bunch of region moves and causes a lot of regions to be unavailable 
 right away.  A slower balancer that gradually balances the cluster is 
 probably a good script to have.  I have an initial version that mooches off 
 the region_mover script to do this.
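 A minimal sketch of the "slow balancer" idea -- move one region per step from the most-loaded to the least-loaded server, sleeping between steps. The map-of-counts model and method names are stand-ins for illustration; the actual tool is the attached slow_balancer.rb JRuby script:

```java
import java.util.HashMap;
import java.util.Map;

public class SlowBalancerDemo {
    // One balancing step: move a single region from the most-loaded server
    // to the least-loaded one. Returns false once the cluster is balanced
    // (max - min <= 1), so the caller can sleep between steps.
    static boolean balanceStep(Map<String, Integer> regionCounts) {
        String max = null, min = null;
        for (Map.Entry<String, Integer> e : regionCounts.entrySet()) {
            if (max == null || e.getValue() > regionCounts.get(max)) max = e.getKey();
            if (min == null || e.getValue() < regionCounts.get(min)) min = e.getKey();
        }
        if (regionCounts.get(max) - regionCounts.get(min) <= 1) return false;
        // In the real script this is where a region would actually be moved.
        regionCounts.put(max, regionCounts.get(max) - 1);
        regionCounts.put(min, regionCounts.get(min) + 1);
        return true;
    }

    // Drive steps until balanced; the real tool would sleep between moves
    // so regions stay available throughout.
    static int stepsToBalance(Map<String, Integer> regionCounts) {
        int steps = 0;
        while (balanceStep(regionCounts)) steps++;
        return steps;
    }

    public static void main(String[] args) {
        Map<String, Integer> cluster = new HashMap<>();
        cluster.put("rs1", 100);
        cluster.put("rs2", 100);
        cluster.put("rs3", 0); // freshly started regionserver
        System.out.println("moves: " + stepsToBalance(cluster));
        System.out.println(cluster);
    }
}
```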



[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-09-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448650#comment-13448650
 ] 

Hudson commented on HBASE-4050:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #160 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/160/])
HBASE-4050 Clean up BaseMetricsSourceImpl (Revision 1381008)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSourceImpl.java


 Update HBase metrics framework to metrics2 framework
 

 Key: HBASE-4050
 URL: https://issues.apache.org/jira/browse/HBASE-4050
 Project: HBase
  Issue Type: New Feature
  Components: metrics
Affects Versions: 0.90.4
 Environment: Java 6
Reporter: Eric Yang
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.96.0

 Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
 HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
 HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
 HBASE-4050-7.patch, HBASE-4050-8_1.patch, HBASE-4050-8.patch, HBASE-4050.patch


 Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
 and it might get removed in future Hadoop release.  Hence, HBase needs to 
 revise the dependency of MetricsContext to use Metrics2 framework.



[jira] [Updated] (HBASE-5631) hbck should handle case where .tableinfo file is missing.

2012-09-05 Thread Jie Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Huang updated HBASE-5631:
-

Attachment: (was: hbase-5631-trunk.patch)

 hbck should handle case where .tableinfo file is missing.
 -

 Key: HBASE-5631
 URL: https://issues.apache.org/jira/browse/HBASE-5631
 Project: HBase
  Issue Type: Improvement
  Components: hbck
Affects Versions: 0.92.2, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jie Huang

 0.92+ branches have a .tableinfo file which could be missing from hdfs.  hbck 
 should be able to detect and repair this properly.
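 A hedged sketch of the detection half, using plain java.nio.file against a generic directory layout. The layout and the ".tableinfo" name are assumptions for illustration; hbck itself works against HDFS through the Hadoop FileSystem API, and repair (rewriting the table descriptor) would follow detection:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class TableinfoCheckDemo {
    // Scan each table directory under rootDir and report the ones that
    // have no .tableinfo file.
    static List<String> tablesMissingTableinfo(Path rootDir) {
        List<String> missing = new ArrayList<>();
        try (DirectoryStream<Path> tables = Files.newDirectoryStream(rootDir)) {
            for (Path tableDir : tables) {
                if (!Files.isDirectory(tableDir)) continue;
                if (!Files.exists(tableDir.resolve(".tableinfo"))) {
                    missing.add(tableDir.getFileName().toString());
                }
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return missing;
    }

    // Build a throwaway layout: t1 has a .tableinfo, t2 does not.
    static List<String> selfTest() {
        try {
            Path root = Files.createTempDirectory("hbase-root");
            Files.createDirectory(root.resolve("t1"));
            Files.createFile(root.resolve("t1").resolve(".tableinfo"));
            Files.createDirectory(root.resolve("t2"));
            return tablesMissingTableinfo(root);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("missing: " + selfTest());
    }
}
```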



[jira] [Updated] (HBASE-5631) hbck should handle case where .tableinfo file is missing.

2012-09-05 Thread Jie Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Huang updated HBASE-5631:
-

Attachment: hbase-5631.patch

Attached is the patch file for this feature.

 hbck should handle case where .tableinfo file is missing.
 -

 Key: HBASE-5631
 URL: https://issues.apache.org/jira/browse/HBASE-5631
 Project: HBase
  Issue Type: Improvement
  Components: hbck
Affects Versions: 0.92.2, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jie Huang
 Attachments: hbase-5631.patch


 0.92+ branches have a .tableinfo file which could be missing from hdfs.  hbck 
 should be able to detect and repair this properly.



[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation

2012-09-05 Thread Priyadarshini (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Priyadarshini updated HBASE-6698:
-

Attachment: HBASE-6698_2.patch

 Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
 --

 Key: HBASE-6698
 URL: https://issues.apache.org/jira/browse/HBASE-6698
 Project: HBase
  Issue Type: Improvement
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.96.0

 Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, HBASE-6698.patch


 Currently the checkAndPut and checkAndDelete APIs internally call 
 internalPut and internalDelete.  Maybe we can just call doMiniBatchMutation
 only.  This will help in the future: if we have some hooks and the CP
 handles certain cases in doMiniBatchMutation, the same can be done while
 doing a put through checkAndPut or a delete through checkAndDelete.
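 The refactor direction can be sketched generically: evaluate the check, and on success route the mutation through the same single batch path everything else uses, so coprocessor hooks fire in one place. All names below are invented for illustration, not HBase's API:

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

public class CheckAndMutateSketch {
    // Stand-in for doMiniBatchMutation: one code path that applies a batch
    // of mutations (in HBase, coprocessor hooks would run around it).
    static void miniBatchMutate(List<Runnable> mutations) {
        for (Runnable m : mutations) m.run();
    }

    // checkAndPut / checkAndDelete both reduce to: evaluate the check,
    // then hand a single-element batch to the common path.
    static boolean checkAndMutate(Supplier<Boolean> check, Runnable mutation) {
        if (!check.get()) return false;
        miniBatchMutate(Collections.singletonList(mutation));
        return true;
    }

    public static void main(String[] args) {
        StringBuilder row = new StringBuilder("v1");
        boolean applied = checkAndMutate(
                () -> row.toString().equals("v1"),              // expected value matches
                () -> { row.setLength(0); row.append("v2"); }); // the "put"
        System.out.println("applied=" + applied + " row=" + row);
    }
}
```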



[jira] [Updated] (HBASE-6592) [shell] Add means of custom formatting output by column

2012-09-05 Thread Jie Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Huang updated HBASE-6592:
-

Status: Patch Available  (was: Open)

 [shell] Add means of custom formatting output by column
 ---

 Key: HBASE-6592
 URL: https://issues.apache.org/jira/browse/HBASE-6592
 Project: HBase
  Issue Type: New Feature
  Components: shell
Reporter: stack
Priority: Minor
  Labels: noob
 Attachments: hbase-6592.patch


 See Jacques suggestion toward end of this thread for how we should allow 
 adding a custom formatter per column to use outputting column content in 
 shell: 
 http://search-hadoop.com/m/2WxUB1fuxL11/Printing+integers+in+the+Hbase+shellsubj=Printing+integers+in+the+Hbase+shell



[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation

2012-09-05 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6698:
--

Status: Open  (was: Patch Available)

 Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
 --

 Key: HBASE-6698
 URL: https://issues.apache.org/jira/browse/HBASE-6698
 Project: HBase
  Issue Type: Improvement
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.96.0

 Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, HBASE-6698.patch


 Currently the checkAndPut and checkAndDelete api internally calls the 
 internalPut and internalDelete.  May be we can just call doMiniBatchMutation
 only.  This will help in future like if we have some hooks and the CP
 handles certain cases in the doMiniBatchMutation the same can be done while
 doing a put thro checkAndPut or while doing a delete thro checkAndDelete.



[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation

2012-09-05 Thread Priyadarshini (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448697#comment-13448697
 ] 

Priyadarshini commented on HBASE-6698:
--

Refactored internalPut() and internalDelete().



 Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
 --

 Key: HBASE-6698
 URL: https://issues.apache.org/jira/browse/HBASE-6698
 Project: HBase
  Issue Type: Improvement
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.96.0

 Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, HBASE-6698.patch


 Currently the checkAndPut and checkAndDelete api internally calls the 
 internalPut and internalDelete.  May be we can just call doMiniBatchMutation
 only.  This will help in future like if we have some hooks and the CP
 handles certain cases in the doMiniBatchMutation the same can be done while
 doing a put thro checkAndPut or while doing a delete thro checkAndDelete.



[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation

2012-09-05 Thread Priyadarshini (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Priyadarshini updated HBASE-6698:
-

Status: Patch Available  (was: Open)

 Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
 --

 Key: HBASE-6698
 URL: https://issues.apache.org/jira/browse/HBASE-6698
 Project: HBase
  Issue Type: Improvement
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.96.0

 Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, HBASE-6698.patch


 Currently the checkAndPut and checkAndDelete api internally calls the 
 internalPut and internalDelete.  May be we can just call doMiniBatchMutation
 only.  This will help in future like if we have some hooks and the CP
 handles certain cases in the doMiniBatchMutation the same can be done while
 doing a put thro checkAndPut or while doing a delete thro checkAndDelete.



[jira] [Commented] (HBASE-6286) Upgrade maven-compiler-plugin to 2.5.1

2012-09-05 Thread Michael Drzal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448727#comment-13448727
 ] 

Michael Drzal commented on HBASE-6286:
--

+1 seems like a win to me

 Upgrade maven-compiler-plugin to 2.5.1
 --

 Key: HBASE-6286
 URL: https://issues.apache.org/jira/browse/HBASE-6286
 Project: HBase
  Issue Type: Improvement
  Components: build
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
 Attachments: HBASE-6286.patch


 time mvn -PlocalTests clean install -DskipTests 
 With 2.5.1:
 |user|1m35.634s|1m31.178s|1m31.366s|
 |sys|0m06.540s|0m05.376s|0m05.488s|
 With 2.0.2 (current):
 |user|2m01.168s|1m54.027s|1m57.799s|
 |sys|0m05.896s|0m05.912s|0m06.032s|



[jira] [Commented] (HBASE-6288) In hbase-daemons.sh, description of the default backup-master file path is wrong

2012-09-05 Thread Michael Drzal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448732#comment-13448732
 ] 

Michael Drzal commented on HBASE-6288:
--

+1 looks good [~benkimkimben]

 In hbase-daemons.sh, description of the default backup-master file path is 
 wrong
 

 Key: HBASE-6288
 URL: https://issues.apache.org/jira/browse/HBASE-6288
 Project: HBase
  Issue Type: Task
  Components: master, scripts, shell
Affects Versions: 0.92.0, 0.92.1, 0.94.0
Reporter: Benjamin Kim
 Attachments: HBASE-6288-92-1.patch, HBASE-6288-92.patch, 
 HBASE-6288-94.patch, HBASE-6288-trunk.patch


 In hbase-daemons.sh, description of the default backup-master file path is 
 wrong
 {code}
 #   HBASE_BACKUP_MASTERS File naming remote hosts.
 # Default is ${HADOOP_CONF_DIR}/backup-masters
 {code}
 It says the default backup-masters file path is under HADOOP_CONF_DIR, but 
 shouldn't this be HBASE_CONF_DIR?
 Also, adding the following lines to conf/hbase-env.sh would be helpful:
 {code}
 # File naming hosts on which backup HMaster will run.  
 $HBASE_HOME/conf/backup-masters by default.
 export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
 {code}



[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation

2012-09-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448737#comment-13448737
 ] 

Hadoop QA commented on HBASE-6698:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12543846/HBASE-6698_2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 108 warning 
messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 7 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2787//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2787//console

This message is automatically generated.

 Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
 --

 Key: HBASE-6698
 URL: https://issues.apache.org/jira/browse/HBASE-6698
 Project: HBase
  Issue Type: Improvement
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.96.0

 Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, HBASE-6698.patch


 Currently the checkAndPut and checkAndDelete APIs internally call 
 internalPut and internalDelete.  Maybe we can just call doMiniBatchMutation 
 instead.  This will help in the future: if we have some hooks and a CP
 handles certain cases in doMiniBatchMutation, the same can be done while
 doing a put through checkAndPut or a delete through checkAndDelete.
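The refactor idea above — expressing a conditional single mutation as a check plus a one-element batch, so that every mutation flows through one code path and any hooks live in one place — can be sketched in miniature. The class and method names below are illustrative only, not the actual HRegion code:

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MiniStore {
    private final Map<String, String> data = new ConcurrentHashMap<>();

    // The single batch path: every mutation, batched or not, goes through
    // here, so coprocessor-style hooks would need to exist in only one place.
    void miniBatchMutate(List<String[]> mutations) {
        for (String[] kv : mutations) {
            if (kv[1] == null) {
                data.remove(kv[0]);
            } else {
                data.put(kv[0], kv[1]);
            }
        }
    }

    // checkAndPut expressed as a guard plus a one-element batch, instead of
    // a separate internalPut-style code path.
    boolean checkAndPut(String key, String expected, String value) {
        String current = data.get(key);
        boolean matches =
            (expected == null) ? current == null : expected.equals(current);
        if (matches) {
            miniBatchMutate(
                Collections.singletonList(new String[] { key, value }));
        }
        return matches;
    }

    public static void main(String[] args) {
        MiniStore s = new MiniStore();
        if (!s.checkAndPut("row", null, "v1")) throw new AssertionError();
        if (s.checkAndPut("row", "wrong", "v2")) throw new AssertionError();
        System.out.println("ok");
    }
}
```

checkAndDelete would be the same guard with a one-element batch whose value is null.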



[jira] [Commented] (HBASE-6302) Document how to run integration tests

2012-09-05 Thread Michael Drzal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448755#comment-13448755
 ] 

Michael Drzal commented on HBASE-6302:
--

Patch looks good, with the exception of the points that Andrew made.

 Document how to run integration tests
 -

 Key: HBASE-6302
 URL: https://issues.apache.org/jira/browse/HBASE-6302
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: stack
Assignee: Enis Soztutar
Priority: Blocker
 Fix For: 0.96.0

 Attachments: HBASE-6302_v1.patch


 HBASE-6203 has attached the old IT doc with some mods.  When we figure how 
 ITs are to be run, update it and apply the documentation under this issue.  
 Making a blocker against 0.96.



[jira] [Commented] (HBASE-6651) Thread safety of HTablePool is doubtful

2012-09-05 Thread Hiroshi Ikeda (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448770#comment-13448770
 ] 

Hiroshi Ikeda commented on HBASE-6651:
--

* I think ThreadLocalPool is useless and dangerous. A value stored in a 
ThreadLocal can never be accessed from other threads, so if you need 
information from the contents to dispose of the container object (or for 
anything else), you must collect that information from every thread that ever 
accessed the pool.

* RoundRobinPool might give the same object to different threads.

* It is bad to rely on concurrent collections here. We should explicitly lock 
larger sections to keep consistency, or remove synchronization concerns from 
PoolMap altogether and use explicit locks from outside of PoolMap.

* PoolMap breaks the contract of Map; the actual behaviors of its methods are 
vague. Filling out all the methods of Map also makes the code dirty. We should 
simplify the code by removing the needless implementation from the start.


 Thread safety of HTablePool is doubtful
 ---

 Key: HBASE-6651
 URL: https://issues.apache.org/jira/browse/HBASE-6651
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.94.1
Reporter: Hiroshi Ikeda
Priority: Minor

 There are some operations in HTablePool that access PoolMap multiple 
 times without any explicit synchronization. 
 For example, HTablePool.closeTablePool() calls PoolMap.values() and then 
 PoolMap.remove(). If other threads add new instances to the pool in the 
 middle of those calls, the newly added instances might be dropped. 
 (HTablePool.closeTablePool() also has another problem: calling it from 
 multiple threads causes HTable to be accessed by multiple threads.)
 Moreover, PoolMap itself is not thread safe for the same reason.
 For example, PoolMap.put() calls ConcurrentMap.get() and then calls 
 ConcurrentMap.put(). If another thread adds a new instance to the concurrent 
 map in the middle of those calls, the new instance might be dropped.
 The implementations of Pool have the same problems.
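The get-then-put race described above can be closed with ConcurrentMap.putIfAbsent, which makes the insert atomic. A minimal, self-contained sketch of both the racy pattern and the safe one (class names here are illustrative, not the actual PoolMap API):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

public class SafePoolMap<K, V> {
    private final ConcurrentMap<K, Queue<V>> pools = new ConcurrentHashMap<>();

    // The pattern from the report: get() followed by put() is not atomic.
    // Two threads can both observe null, and one thread's queue (with the
    // value it just added) is silently overwritten.
    public void putRacy(K key, V value) {
        Queue<V> pool = pools.get(key);
        if (pool == null) {
            pool = new ConcurrentLinkedQueue<>();
            pools.put(key, pool); // may clobber another thread's queue
        }
        pool.add(value);
    }

    // Safe variant: putIfAbsent installs the queue atomically, so every
    // thread ends up adding to the single winning queue.
    public void putSafe(K key, V value) {
        Queue<V> pool = pools.get(key);
        if (pool == null) {
            Queue<V> fresh = new ConcurrentLinkedQueue<>();
            Queue<V> existing = pools.putIfAbsent(key, fresh);
            pool = (existing != null) ? existing : fresh;
        }
        pool.add(value);
    }

    public int size(K key) {
        Queue<V> pool = pools.get(key);
        return pool == null ? 0 : pool.size();
    }

    public static void main(String[] args) {
        SafePoolMap<String, Integer> m = new SafePoolMap<>();
        m.putSafe("t", 1);
        m.putSafe("t", 2);
        if (m.size("t") != 2) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Note that putIfAbsent only fixes this single check-then-act; compound operations such as values()-then-remove() still need an external lock, as the report says.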



[jira] [Commented] (HBASE-5631) hbck should handle case where .tableinfo file is missing.

2012-09-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448773#comment-13448773
 ] 

Hadoop QA commented on HBASE-5631:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12543845/hbase-5631.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 108 warning 
messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 7 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2788//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2788//console

This message is automatically generated.

 hbck should handle case where .tableinfo file is missing.
 -

 Key: HBASE-5631
 URL: https://issues.apache.org/jira/browse/HBASE-5631
 Project: HBase
  Issue Type: Improvement
  Components: hbck
Affects Versions: 0.92.2, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jie Huang
 Attachments: hbase-5631.patch


 0.92+ branches have a .tableinfo file which could be missing from hdfs.  hbck 
 should be able to detect and repair this properly.



[jira] [Resolved] (HBASE-6288) In hbase-daemons.sh, description of the default backup-master file path is wrong

2012-09-05 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-6288.
--

   Resolution: Fixed
Fix Version/s: 0.94.2
   0.92.3
 Hadoop Flags: Reviewed

Committed to 0.92, 0.94 and to trunk.

 In hbase-daemons.sh, description of the default backup-master file path is 
 wrong
 

 Key: HBASE-6288
 URL: https://issues.apache.org/jira/browse/HBASE-6288
 Project: HBase
  Issue Type: Task
  Components: master, scripts, shell
Affects Versions: 0.92.0, 0.92.1, 0.94.0
Reporter: Benjamin Kim
 Fix For: 0.92.3, 0.94.2

 Attachments: HBASE-6288-92-1.patch, HBASE-6288-92.patch, 
 HBASE-6288-94.patch, HBASE-6288-trunk.patch


 In hbase-daemons.sh, description of the default backup-master file path is 
 wrong
 {code}
 #   HBASE_BACKUP_MASTERS File naming remote hosts.
 # Default is ${HADOOP_CONF_DIR}/backup-masters
 {code}
 It says the default backup-masters file path is under HADOOP_CONF_DIR, but 
 shouldn't this be HBASE_CONF_DIR?
 Also, adding the following lines to conf/hbase-env.sh would be helpful:
 {code}
 # File naming hosts on which backup HMaster will run.  
 $HBASE_HOME/conf/backup-masters by default.
 export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
 {code}



[jira] [Commented] (HBASE-5631) hbck should handle case where .tableinfo file is missing.

2012-09-05 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448823#comment-13448823
 ] 

Jonathan Hsieh commented on HBASE-5631:
---

Have you tried shutting down the cluster and then restarting it?  I have a 
suspicion that this may not work if the HTD isn't cached.  Could you modify the 
test (add a few lines) from HBASE-6516 to verify that this patch fixes the 
table?  

{code}
+  HTableDescriptor[] htds = getHTableDescriptors(tmpList); // this goes to 
master which goes to the filesystem 
{code}



Nits: 
instead of this:
{code}
+  Path hbaseRoot = new Path(conf.get(HConstants.HBASE_DIR));
{code}
use this:
{code}
FSUtils.getRootDir(conf);
{code}

Are we purposely updating the passed-in list?  Could we just use tmpList?
{code}
+  List<String> tmpList = new ArrayList<String>();
+  tmpList.addAll(orphanTableDirs);
+  HTableDescriptor[] htds = getHTableDescriptors(tmpList);
+  Iterator iter = orphanTableDirs.iterator();
+  int j = 0;
+  while (iter.hasNext()) {
+    String tableName = (String) iter.next();
+ 
{code}

I wasn't consistent with errors.print vs log.  I think I prefer log.  Any 
reason you picked this one vs the other?
{code}
+    errors.print("Try to fix orphan table: " + tableName);
..
+    errors.print("fixing table: " + tableName);
..
+    errors.report("Failed to fix orphan table: " + tableName);
{code}

typo/reword: hfsck -> hbck; suggest "It is strongly recommended that you re-run 
hbck manually since orphan table dirs have been fixed"
{code}
+    LOG.warn("Strongly recommend to re-run manually hfsck after all 
orphanTableDirs being fixed");
{code}


 hbck should handle case where .tableinfo file is missing.
 -

 Key: HBASE-5631
 URL: https://issues.apache.org/jira/browse/HBASE-5631
 Project: HBase
  Issue Type: Improvement
  Components: hbck
Affects Versions: 0.92.2, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jie Huang
 Attachments: hbase-5631.patch


 0.92+ branches have a .tableinfo file which could be missing from hdfs.  hbck 
 should be able to detect and repair this properly.



[jira] [Commented] (HBASE-6288) In hbase-daemons.sh, description of the default backup-master file path is wrong

2012-09-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448824#comment-13448824
 ] 

Hudson commented on HBASE-6288:
---

Integrated in HBase-0.94 #449 (See 
[https://builds.apache.org/job/HBase-0.94/449/])
HBASE-6288 In hbase-daemons.sh, description of the default backup-master 
file path is wrong (Revision 1381219)

 Result = FAILURE
stack : 
Files : 
* /hbase/branches/0.94/bin/master-backup.sh
* /hbase/branches/0.94/conf/hbase-env.sh


 In hbase-daemons.sh, description of the default backup-master file path is 
 wrong
 

 Key: HBASE-6288
 URL: https://issues.apache.org/jira/browse/HBASE-6288
 Project: HBase
  Issue Type: Task
  Components: master, scripts, shell
Affects Versions: 0.92.0, 0.92.1, 0.94.0
Reporter: Benjamin Kim
 Fix For: 0.92.3, 0.94.2

 Attachments: HBASE-6288-92-1.patch, HBASE-6288-92.patch, 
 HBASE-6288-94.patch, HBASE-6288-trunk.patch


 In hbase-daemons.sh, description of the default backup-master file path is 
 wrong
 {code}
 #   HBASE_BACKUP_MASTERS File naming remote hosts.
 # Default is ${HADOOP_CONF_DIR}/backup-masters
 {code}
 It says the default backup-masters file path is under HADOOP_CONF_DIR, but 
 shouldn't this be HBASE_CONF_DIR?
 Also, adding the following lines to conf/hbase-env.sh would be helpful:
 {code}
 # File naming hosts on which backup HMaster will run.  
 $HBASE_HOME/conf/backup-masters by default.
 export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
 {code}



[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies

2012-09-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448843#comment-13448843
 ] 

ramkrishna.s.vasudevan commented on HBASE-6438:
---

@Stack
Sorry for missing this review comment all these days.  
Actually we would like to get HBASE-6299 in as well, along with this patch.  As 
you mentioned, can we give a combined patch for 0.94 and 0.92?
We faced HBASE-6299 recently in one of our tests.  Both should be useful.


 RegionAlreadyInTransitionException needs to give more info to avoid 
 assignment inconsistencies
 --

 Key: HBASE-6438
 URL: https://issues.apache.org/jira/browse/HBASE-6438
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: rajeshbabu
 Attachments: HBASE-6438_trunk.patch


 Seeing some of the recent issues in region assignment, 
 RegionAlreadyInTransitionException is one reason after which the region 
 assignment may or may not happen (in the sense that we need to wait for the 
 TM to assign).
 In HBASE-6317 we hit one problem due to RegionAlreadyInTransitionException on 
 master restart.
 Consider the following case: due to some reason, like a master restart or an 
 external assign call, we try to assign a region that is already being 
 opened on an RS.
 Now the next call to assign has already changed the state of the znode, so 
 the assign currently going on at the RS is affected and fails.  The 
 second assignment that started also fails, getting a RAITE exception.  
 Finally, neither assignment carries on.  The idea is to find out whether any 
 such RAITE exception can be retried or not.
 Here again we have the following cases:
 - The znode is yet to be transitioned from OFFLINE to OPENING on the RS.
 - The RS may be in the openRegion step.
 - The RS may be trying to transition OPENING to OPENED.
 - The region is yet to be added to the online regions on the RS side.
 On any failure in openRegion() and updateMeta() we move the znode to 
 FAILED_OPEN, so in these cases getting a RAITE should be ok.  But in the 
 other cases the assignment is stopped.
 The idea is to just add the current state of the region assignment to the RIT 
 map on the RS side; using that info we can determine whether the 
 assignment can be retried or not on getting a RAITE.
 Considering the current work going on in the AM, please do share whether this 
 is needed at least in the 0.92/0.94 versions.



[jira] [Updated] (HBASE-6288) In hbase-daemons.sh, description of the default backup-master file path is wrong

2012-09-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6288:
-

Fix Version/s: 0.96.0

 In hbase-daemons.sh, description of the default backup-master file path is 
 wrong
 

 Key: HBASE-6288
 URL: https://issues.apache.org/jira/browse/HBASE-6288
 Project: HBase
  Issue Type: Task
  Components: master, scripts, shell
Affects Versions: 0.92.0, 0.92.1, 0.94.0
Reporter: Benjamin Kim
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: HBASE-6288-92-1.patch, HBASE-6288-92.patch, 
 HBASE-6288-94.patch, HBASE-6288-trunk.patch


 In hbase-daemons.sh, description of the default backup-master file path is 
 wrong
 {code}
 #   HBASE_BACKUP_MASTERS File naming remote hosts.
 # Default is ${HADOOP_CONF_DIR}/backup-masters
 {code}
 It says the default backup-masters file path is under HADOOP_CONF_DIR, but 
 shouldn't this be HBASE_CONF_DIR?
 Also, adding the following lines to conf/hbase-env.sh would be helpful:
 {code}
 # File naming hosts on which backup HMaster will run.  
 $HBASE_HOME/conf/backup-masters by default.
 export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
 {code}



[jira] [Commented] (HBASE-3866) Script to add regions gradually to a new regionserver.

2012-09-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448871#comment-13448871
 ] 

Lars Hofhansl commented on HBASE-3866:
--

In my comment above I was referring to Ted's patch to HMaster.
I agree the scripts tend to rot (because we do not have a good test framework 
for them), but they are useful to keep here.

So... What about Ted's attached patch?


 Script to add regions gradually to a new regionserver.
 --

 Key: HBASE-3866
 URL: https://issues.apache.org/jira/browse/HBASE-3866
 Project: HBase
  Issue Type: Improvement
  Components: scripts
Affects Versions: 0.90.2
Reporter: Aravind Gottipati
Priority: Minor
 Attachments: 3866-max-regions-per-iteration.patch, slow_balancer.rb, 
 slow_balancer.rb


 When a new region server is brought online, the current balancer kicks off a 
 whole bunch of region moves and causes a lot of regions to be unavailable 
 right away.  A slower balancer that gradually balances the cluster is 
 probably a good script to have.  I have an initial version that mooches off 
 the region_mover script to do this.



[jira] [Commented] (HBASE-6651) Thread safety of HTablePool is doubtful

2012-09-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448874#comment-13448874
 ] 

stack commented on HBASE-6651:
--

@Hiroshi Thank you for digging in here.  ThreadLocalPool was added by 
HBASE-2938 a while back.  On #1, what do you see as the implications?  If it's 
a pool of threads and all are using thread-locals, why would they need to 
share info?  Can you say more on points #2 and #3 above?  What do you suggest 
we do?  Purge ThreadLocalPool?  Thanks.

 Thread safety of HTablePool is doubtful
 ---

 Key: HBASE-6651
 URL: https://issues.apache.org/jira/browse/HBASE-6651
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.94.1
Reporter: Hiroshi Ikeda
Priority: Minor

 There are some operations in HTablePool that access PoolMap multiple 
 times without any explicit synchronization. 
 For example, HTablePool.closeTablePool() calls PoolMap.values() and then 
 PoolMap.remove(). If other threads add new instances to the pool in the 
 middle of those calls, the newly added instances might be dropped. 
 (HTablePool.closeTablePool() also has another problem: calling it from 
 multiple threads causes HTable to be accessed by multiple threads.)
 Moreover, PoolMap itself is not thread safe for the same reason.
 For example, PoolMap.put() calls ConcurrentMap.get() and then calls 
 ConcurrentMap.put(). If another thread adds a new instance to the concurrent 
 map in the middle of those calls, the new instance might be dropped.
 The implementations of Pool have the same problems.



[jira] [Commented] (HBASE-3866) Script to add regions gradually to a new regionserver.

2012-09-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448875#comment-13448875
 ] 

stack commented on HBASE-3866:
--

Patch looks good to me.  Commit under a new issue named "Add 
max-regions-per-balance-iteration" (or some such) -- (Hey Aravind!)

 Script to add regions gradually to a new regionserver.
 --

 Key: HBASE-3866
 URL: https://issues.apache.org/jira/browse/HBASE-3866
 Project: HBase
  Issue Type: Improvement
  Components: scripts
Affects Versions: 0.90.2
Reporter: Aravind Gottipati
Priority: Minor
 Attachments: 3866-max-regions-per-iteration.patch, slow_balancer.rb, 
 slow_balancer.rb


 When a new region server is brought online, the current balancer kicks off a 
 whole bunch of region moves and causes a lot of regions to be unavailable 
 right away.  A slower balancer that gradually balances the cluster is 
 probably a good script to have.  I have an initial version that mooches off 
 the region_mover script to do this.



[jira] [Commented] (HBASE-6302) Document how to run integration tests

2012-09-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448878#comment-13448878
 ] 

stack commented on HBASE-6302:
--

@Enis Want to have a go at addressing Andrew's comments?  Or just paste a CLI 
example here and I'll take care of getting the above committed.

 Document how to run integration tests
 -

 Key: HBASE-6302
 URL: https://issues.apache.org/jira/browse/HBASE-6302
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: stack
Assignee: Enis Soztutar
Priority: Blocker
 Fix For: 0.96.0

 Attachments: HBASE-6302_v1.patch


 HBASE-6203 has attached the old IT doc with some mods.  When we figure how 
 ITs are to be run, update it and apply the documentation under this issue.  
 Making a blocker against 0.96.



[jira] [Commented] (HBASE-6288) In hbase-daemons.sh, description of the default backup-master file path is wrong

2012-09-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448901#comment-13448901
 ] 

Hudson commented on HBASE-6288:
---

Integrated in HBase-0.92 #556 (See 
[https://builds.apache.org/job/HBase-0.92/556/])
HBASE-6288 In hbase-daemons.sh, description of the default backup-master 
file path is wrong (Revision 1381220)

 Result = SUCCESS
stack : 
Files : 
* /hbase/branches/0.92/bin/master-backup.sh
* /hbase/branches/0.92/conf/hbase-env.sh


 In hbase-daemons.sh, description of the default backup-master file path is 
 wrong
 

 Key: HBASE-6288
 URL: https://issues.apache.org/jira/browse/HBASE-6288
 Project: HBase
  Issue Type: Task
  Components: master, scripts, shell
Affects Versions: 0.92.0, 0.92.1, 0.94.0
Reporter: Benjamin Kim
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: HBASE-6288-92-1.patch, HBASE-6288-92.patch, 
 HBASE-6288-94.patch, HBASE-6288-trunk.patch


 In hbase-daemons.sh, description of the default backup-master file path is 
 wrong
 {code}
 #   HBASE_BACKUP_MASTERS File naming remote hosts.
 # Default is ${HADOOP_CONF_DIR}/backup-masters
 {code}
 It says the default backup-masters file path is under HADOOP_CONF_DIR, but 
 shouldn't this be HBASE_CONF_DIR?
 Also, adding the following lines to conf/hbase-env.sh would be helpful:
 {code}
 # File naming hosts on which backup HMaster will run.  
 $HBASE_HOME/conf/backup-masters by default.
 export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
 {code}



[jira] [Commented] (HBASE-6398) Print a warning if there is no local datanode

2012-09-05 Thread Sameer Vaishampayan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448902#comment-13448902
 ] 

Sameer Vaishampayan commented on HBASE-6398:


Will work on this.

 Print a warning if there is no local datanode
 -

 Key: HBASE-6398
 URL: https://issues.apache.org/jira/browse/HBASE-6398
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
  Labels: noob

 When starting up a RS HBase should print out a warning if there is no 
 datanode locally.  Lots of optimizations are only available if the data is 
 machine local.



[jira] [Assigned] (HBASE-6288) In hbase-daemons.sh, description of the default backup-master file path is wrong

2012-09-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-6288:
-

Assignee: Benjamin Kim

 In hbase-daemons.sh, description of the default backup-master file path is 
 wrong
 

 Key: HBASE-6288
 URL: https://issues.apache.org/jira/browse/HBASE-6288
 Project: HBase
  Issue Type: Task
  Components: master, scripts, shell
Affects Versions: 0.92.0, 0.92.1, 0.94.0
Reporter: Benjamin Kim
Assignee: Benjamin Kim
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: HBASE-6288-92-1.patch, HBASE-6288-92.patch, 
 HBASE-6288-94.patch, HBASE-6288-trunk.patch


 In hbase-daemons.sh, description of the default backup-master file path is 
 wrong
 {code}
 #   HBASE_BACKUP_MASTERS File naming remote hosts.
 # Default is ${HADOOP_CONF_DIR}/backup-masters
 {code}
 It says the default backup-masters file path is under HADOOP_CONF_DIR, but 
 shouldn't this be HBASE_CONF_DIR?
 Also, adding the following lines to conf/hbase-env.sh would be helpful:
 {code}
 # File naming hosts on which backup HMaster will run.  
 $HBASE_HOME/conf/backup-masters by default.
 export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
 {code}



[jira] [Created] (HBASE-6720) Optionally limit number of regions balanced in each balancer run

2012-09-05 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-6720:


 Summary: Optionally limit number of regions balanced in each 
balancer run
 Key: HBASE-6720
 URL: https://issues.apache.org/jira/browse/HBASE-6720
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Fix For: 0.96.0, 0.94.3


See discussion on HBASE-3866



[jira] [Updated] (HBASE-3866) Script to add regions gradually to a new regionserver.

2012-09-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-3866:
-

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Filed HBASE-6720

 Script to add regions gradually to a new regionserver.
 --

 Key: HBASE-3866
 URL: https://issues.apache.org/jira/browse/HBASE-3866
 Project: HBase
  Issue Type: Improvement
  Components: scripts
Affects Versions: 0.90.2
Reporter: Aravind Gottipati
Priority: Minor
 Attachments: 3866-max-regions-per-iteration.patch, slow_balancer.rb, 
 slow_balancer.rb


 When a new region server is brought online, the current balancer kicks off a 
 whole bunch of region moves and causes a lot of regions to be unavailable 
 right away.  A slower balancer that gradually balances the cluster is 
 probably a good script to have.  I have an initial version that mooches off 
 the region_mover script to do this.



[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448907#comment-13448907
 ] 

Devaraj Das commented on HBASE-6649:


[~zhi...@ebaysf.com] This patch fixes a specific problem to do with 
replication missing rows, which, in my observations, leads to somewhat 
frequent TestReplication.queueFailover failures. On trunk, do you know which 
test hangs? There are probably more issues to fix in the replication area, and 
we should have follow-up jiras (this jira is part-1 :)).

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: 6649-1.patch, 6649-2.txt, HBase-0.92 #495 test - 
 queueFailover [Jenkins].html, HBase-0.92 #502 test - queueFailover 
 [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB & 
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.



[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448914#comment-13448914
 ] 

Ted Yu commented on HBASE-6649:
---

target/surefire-reports/org.apache.hadoop.hbase.replication.TestReplication.txt 
was 0 length.
There was no JVM left from TestReplication by the time I got back to my computer.

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: 6649-1.patch, 6649-2.txt, 6649-trunk.patch, HBase-0.92 
 #495 test - queueFailover [Jenkins].html, HBase-0.92 #502 test - 
 queueFailover [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB & 
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.



[jira] [Created] (HBASE-6721) RegionServer Group based Assignment

2012-09-05 Thread Francis Liu (JIRA)
Francis Liu created HBASE-6721:
--

 Summary: RegionServer Group based Assignment
 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0


In multi-tenant deployments of HBase, it is likely that a RegionServer will be 
serving out regions from a number of different tables owned by various client 
applications. Being able to group a subset of running RegionServers and assign 
specific tables to it provides a client application with a level of isolation and 
resource allocation.

The proposal essentially is to have an AssignmentManager which is aware of 
RegionServer groups and assigns tables to region servers based on groupings. 
Load balancing will occur on a per group basis as well. 

This is essentially a simplification of the approach taken in HBASE-4120. See 
attached document.




[jira] [Updated] (HBASE-6721) RegionServer Group based Assignment

2012-09-05 Thread Vandana Ayyalasomayajula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vandana Ayyalasomayajula updated HBASE-6721:


Attachment: HBASE-6721-DesigDoc.pdf

Design document for HBase region server grouping feature.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it provides a client application with a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Commented] (HBASE-3976) Disable Block Cache On Compactions

2012-09-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448920#comment-13448920
 ] 

Lars Hofhansl commented on HBASE-3976:
--

Hmm... Looking at the code in trunk, this is (mostly) what is currently 
happening anyway.
HStore.createWriterInTmp uses the configured cacheOnWrite setting unless this 
is a compaction (in which case cacheOnWrite is set to false).
There is also a test for this in TestCacheOnWrite.
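The rule described above can be sketched as a tiny decision helper; the names here are illustrative, not HStore's actual createWriterInTmp signature:

```java
public class CompactionCacheConf {
    // Hypothetical helper mirroring the rule above: honor the family's
    // cache-on-write setting for ordinary writers (e.g. flushes), but
    // force it off when the writer is created for a compaction.
    public static boolean cacheOnWrite(boolean familyCacheOnWrite,
                                       boolean isCompaction) {
        return !isCompaction && familyCacheOnWrite;
    }

    public static void main(String[] args) {
        System.out.println(cacheOnWrite(true, false)); // flush writer: true
        System.out.println(cacheOnWrite(true, true));  // compaction writer: false
    }
}
```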

I think we can close this issue. Agreed?


 Disable Block Cache On Compactions
 --

 Key: HBASE-3976
 URL: https://issues.apache.org/jira/browse/HBASE-3976
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.90.3
Reporter: Karthick Sankarachary
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: HBASE-3976.patch, HBASE-3976-unconditional.patch, 
 HBASE-3976-V3.patch


 Is there a good reason to believe that caching blocks during compactions is 
 beneficial? Currently, if block cache is enabled on a certain family, then 
 every time it's compacted, we load all of its blocks into the (LRU) cache, at 
 the expense of the legitimately hot ones.
 As a matter of fact, this concern was raised earlier in HBASE-1597, which 
 rightly points out that we should not bog down the LRU with unnecessary 
 blocks during compaction. Even though that issue has been marked as fixed, 
 it looks like it ought to be reopened.
 Should we err on the side of caution and not cache blocks during compactions 
 period (as illustrated in the attached patch)? Or, can we be selectively 
 aggressive about what blocks do get cached during compaction (e.g., only 
 cache those blocks from the recent files)?



[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448924#comment-13448924
 ] 

Lars Hofhansl commented on HBASE-6649:
--

Patch looks good to me.
(As Ted points out, there might be other issues as well.)

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: 6649-1.patch, 6649-2.txt, 6649-trunk.patch, HBase-0.92 
 #495 test - queueFailover [Jenkins].html, HBase-0.92 #502 test - 
 queueFailover [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB & 
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.



[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6649:
-

Fix Version/s: 0.94.2
   0.96.0

I'd also like this in 0.94. The 0.92 patch will probably just apply cleanly; if 
not, I'll make one.

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: 6649-1.patch, 6649-2.txt, 6649-trunk.patch, HBase-0.92 
 #495 test - queueFailover [Jenkins].html, HBase-0.92 #502 test - 
 queueFailover [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB & 
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.



[jira] [Resolved] (HBASE-3976) Disable Block Cache On Compactions

2012-09-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-3976.
--

Resolution: Fixed

Closing... Please reopen if this should be kept open.

 Disable Block Cache On Compactions
 --

 Key: HBASE-3976
 URL: https://issues.apache.org/jira/browse/HBASE-3976
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.90.3
Reporter: Karthick Sankarachary
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: HBASE-3976.patch, HBASE-3976-unconditional.patch, 
 HBASE-3976-V3.patch


 Is there a good reason to believe that caching blocks during compactions is 
 beneficial? Currently, if block cache is enabled on a certain family, then 
 every time it's compacted, we load all of its blocks into the (LRU) cache, at 
 the expense of the legitimately hot ones.
 As a matter of fact, this concern was raised earlier in HBASE-1597, which 
 rightly points out that we should not bog down the LRU with unnecessary 
 blocks during compaction. Even though that issue has been marked as fixed, 
 it looks like it ought to be reopened.
 Should we err on the side of caution and not cache blocks during compactions 
 period (as illustrated in the attached patch)? Or, can we be selectively 
 aggressive about what blocks do get cached during compaction (e.g., only 
 cache those blocks from the recent files)?



[jira] [Updated] (HBASE-3861) MiniZooKeeperCluster.startup() should refer to hbase.zookeeper.property.maxClientCnxns

2012-09-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-3861:
-


Looking at MiniZooKeeperCluster in trunk, this is already done:
{code}
NIOServerCnxnFactory standaloneServerFactory;
while (true) {
  try {
    standaloneServerFactory = new NIOServerCnxnFactory();
    standaloneServerFactory.configure(
        new InetSocketAddress(tentativePort),
        configuration.getInt(HConstants.ZOOKEEPER_MAX_CLIENT_CNXNS, 1000));
  } catch (BindException e) {
{code}

Closing.

 MiniZooKeeperCluster.startup() should refer to 
 hbase.zookeeper.property.maxClientCnxns
 --

 Key: HBASE-3861
 URL: https://issues.apache.org/jira/browse/HBASE-3861
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.3
Reporter: Eugene Koontz
Assignee: Eugene Koontz
 Attachments: HBASE-3861.patch, HBASE-3861.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 Currently the number of the client connections is hard-wired to 1000:
 {noformat}
 standaloneServerFactory = new NIOServerCnxnFactory();
 standaloneServerFactory.configure(
     new InetSocketAddress(clientPort), 1000);
 } catch (BindException e) {
 {noformat}
 This should be set according to the test environment's hbase configuration. 
 The property in 
 question is : hbase.zookeeper.property.maxClientCnxns.
 Currently some tests such as org.apache.hadoop.hbase.client.TestHCM fail 
 because the number of connections used by the HBase client exceeds 1000. 
 Recently MAX_CACHED_HBASE_INSTANCES increased from 31 to 2000 on 0.90 branch:
 http://svn.apache.org/viewvc/hbase/branches/0.90/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java?p2=%2Fhbase%2Fbranches%2F0.90%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fhbase%2Fclient%2FHConnectionManager.java&p1=%2Fhbase%2Fbranches%2F0.90%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fhbase%2Fclient%2FHConnectionManager.java&r1=1096818&r2=1096817&view=diff&pathrev=1096818
 and correspondingly the hbase config on the Zookeeper server-side also 
 increased in hbase-default.xml:
 http://svn.apache.org/viewvc/hbase/branches/0.90/src/main/resources/hbase-default.xml?p2=%2Fhbase%2Fbranches%2F0.90%2Fsrc%2Fmain%2Fresources%2Fhbase-default.xml&p1=%2Fhbase%2Fbranches%2F0.90%2Fsrc%2Fmain%2Fresources%2Fhbase-default.xml&r1=1091594&r2=1091593&view=diff&pathrev=1091594
 So if MiniZKCluster looks at this setting, the test won't have this failure.



[jira] [Resolved] (HBASE-3861) MiniZooKeeperCluster.startup() should refer to hbase.zookeeper.property.maxClientCnxns

2012-09-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-3861.
--

Resolution: Fixed

 MiniZooKeeperCluster.startup() should refer to 
 hbase.zookeeper.property.maxClientCnxns
 --

 Key: HBASE-3861
 URL: https://issues.apache.org/jira/browse/HBASE-3861
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.3
Reporter: Eugene Koontz
Assignee: Eugene Koontz
 Attachments: HBASE-3861.patch, HBASE-3861.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 Currently the number of the client connections is hard-wired to 1000:
 {noformat}
 standaloneServerFactory = new NIOServerCnxnFactory();
 standaloneServerFactory.configure(
     new InetSocketAddress(clientPort), 1000);
 } catch (BindException e) {
 {noformat}
 This should be set according to the test environment's hbase configuration. 
 The property in 
 question is : hbase.zookeeper.property.maxClientCnxns.
 Currently some tests such as org.apache.hadoop.hbase.client.TestHCM fail 
 because the number of connections used by the HBase client exceeds 1000. 
 Recently MAX_CACHED_HBASE_INSTANCES increased from 31 to 2000 on 0.90 branch:
 http://svn.apache.org/viewvc/hbase/branches/0.90/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java?p2=%2Fhbase%2Fbranches%2F0.90%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fhbase%2Fclient%2FHConnectionManager.java&p1=%2Fhbase%2Fbranches%2F0.90%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fhbase%2Fclient%2FHConnectionManager.java&r1=1096818&r2=1096817&view=diff&pathrev=1096818
 and correspondingly the hbase config on the Zookeeper server-side also 
 increased in hbase-default.xml:
 http://svn.apache.org/viewvc/hbase/branches/0.90/src/main/resources/hbase-default.xml?p2=%2Fhbase%2Fbranches%2F0.90%2Fsrc%2Fmain%2Fresources%2Fhbase-default.xml&p1=%2Fhbase%2Fbranches%2F0.90%2Fsrc%2Fmain%2Fresources%2Fhbase-default.xml&r1=1091594&r2=1091593&view=diff&pathrev=1091594
 So if MiniZKCluster looks at this setting, the test won't have this failure.



[jira] [Commented] (HBASE-6302) Document how to run integration tests

2012-09-05 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448933#comment-13448933
 ] 

Enis Soztutar commented on HBASE-6302:
--

Sorry guys, I was waiting for HBASE-6241 to be resolved first before updating 
the patch. Without HBASE-6241 finalized, if we commit the doc, it might be 
confusing.

 Document how to run integration tests
 -

 Key: HBASE-6302
 URL: https://issues.apache.org/jira/browse/HBASE-6302
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: stack
Assignee: Enis Soztutar
Priority: Blocker
 Fix For: 0.96.0

 Attachments: HBASE-6302_v1.patch


 HBASE-6203 has attached the old IT doc with some mods.  When we figure how 
 ITs are to be run, update it and apply the documentation under this issue.  
 Making a blocker against 0.96.



[jira] [Commented] (HBASE-3859) Increment a counter when a Scanner lease expires

2012-09-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448935#comment-13448935
 ] 

Lars Hofhansl commented on HBASE-3859:
--

Patch looks good. Should we commit?
I'm a bit fuzzy on the current state of Metrics in HBase (v1 vs v2, etc)

 Increment a counter when a Scanner lease expires
 

 Key: HBASE-3859
 URL: https://issues.apache.org/jira/browse/HBASE-3859
 Project: HBase
  Issue Type: Improvement
  Components: metrics, regionserver
Affects Versions: 0.90.2
Reporter: Benoit Sigoure
Assignee: Mubarak Seyed
Priority: Minor
 Attachments: HBASE-3859.trunk.v1.patch


 Whenever a Scanner lease expires, the RegionServer will close it 
 automatically and log a message to complain.  I would like the RegionServer 
 to increment a counter whenever this happens and expose this counter through 
 the metrics system, so we can plug this into our monitoring system (OpenTSDB) 
 and keep track of how frequently this happens.  It's not supposed to happen 
 frequently so it's good to keep an eye on it.



[jira] [Commented] (HBASE-3976) Disable Block Cache On Compactions

2012-09-05 Thread Mikhail Bautin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448936#comment-13448936
 ] 

Mikhail Bautin commented on HBASE-3976:
---

Lars: thanks for double-checking this!

 Disable Block Cache On Compactions
 --

 Key: HBASE-3976
 URL: https://issues.apache.org/jira/browse/HBASE-3976
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.90.3
Reporter: Karthick Sankarachary
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: HBASE-3976.patch, HBASE-3976-unconditional.patch, 
 HBASE-3976-V3.patch


 Is there a good reason to believe that caching blocks during compactions is 
 beneficial? Currently, if block cache is enabled on a certain family, then 
 every time it's compacted, we load all of its blocks into the (LRU) cache, at 
 the expense of the legitimately hot ones.
 As a matter of fact, this concern was raised earlier in HBASE-1597, which 
 rightly points out that we should not bog down the LRU with unnecessary 
 blocks during compaction. Even though that issue has been marked as fixed, 
 it looks like it ought to be reopened.
 Should we err on the side of caution and not cache blocks during compactions 
 period (as illustrated in the attached patch)? Or, can we be selectively 
 aggressive about what blocks do get cached during compaction (e.g., only 
 cache those blocks from the recent files)?



[jira] [Commented] (HBASE-3854) broken examples

2012-09-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448938#comment-13448938
 ] 

Lars Hofhansl commented on HBASE-3854:
--

Is this still an issue? (I don't know anything about thrift, so I can't really tell.)

 broken examples
 ---

 Key: HBASE-3854
 URL: https://issues.apache.org/jira/browse/HBASE-3854
 Project: HBase
  Issue Type: Bug
  Components: thrift
Affects Versions: 0.20.0
Reporter: Alexey Diomin
Priority: Minor

 We introduce NotFound exception in HBASE-1292, but we drop it in HBASE-1367.
 In result:
 1. incorrect doc in Hbase.thrift in as result in generated java and java-doc
 2. broken examples in src/examples/thrift/



[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448942#comment-13448942
 ] 

Ted Yu commented on HBASE-6649:
---

@J-D:
What do you think ?

nit:
{code}
+  } catch (IOException ie) {
+    break;
{code}
A log statement is desirable before break.
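The shape of the change under review, with the suggested log statement added, might look like this minimal sketch; the method and entry types are illustrative stand-ins, not the actual replication source code:

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;

public class ReplicateLoop {
    private static final Logger LOG = Logger.getLogger("ReplicateLoop");

    // Ship entries until an IOException; log it before breaking out of the
    // loop so the failure is visible rather than silent. The next invocation
    // picks up again from where this batch stopped.
    public static int replicateUntilError(List<String> entries) {
        int shipped = 0;
        for (String entry : entries) {
            try {
                ship(entry);
                shipped++;
            } catch (IOException ie) {
                // the log statement suggested in the review nit
                LOG.warning("stopping batch after " + shipped + " entries: " + ie);
                break;
            }
        }
        return shipped;
    }

    // Hypothetical shipping step that fails on marked entries, for the demo.
    private static void ship(String entry) throws IOException {
        if (entry.startsWith("bad")) {
            throw new IOException("simulated failure on " + entry);
        }
    }

    public static void main(String[] args) {
        System.out.println(replicateUntilError(Arrays.asList("a", "b", "bad-c", "d")));
    }
}
```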

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: 6649-1.patch, 6649-2.txt, 6649-trunk.patch, HBase-0.92 
 #495 test - queueFailover [Jenkins].html, HBase-0.92 #502 test - 
 queueFailover [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB  
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.



[jira] [Resolved] (HBASE-3840) Add sanity checks on Configurations to make sure hbase confs have been loaded

2012-09-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-3840.
--

Resolution: Won't Fix

There appears to be no interest in this for over a year. Closing. Please reopen 
if you disagree.

 Add sanity checks on Configurations to make sure hbase confs have been loaded
 -

 Key: HBASE-3840
 URL: https://issues.apache.org/jira/browse/HBASE-3840
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.0
Reporter: Todd Lipcon

 A common user error (and even hbase dev error) is to pass a vanilla Hadoop 
 Configuration into HBase methods that expect to see all of the relevant hbase 
 defaults from hbase-default.xml. This often results in NPE or issues locating 
 ZK.
 We should add a method like HBaseConfiguration.verify(conf) which ensures 
 that the conf has incorporated hbase-default.xml. We can do this by checking 
 for existence of hbase.defaults.for.version.
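A minimal sketch of such a check, using a stand-in Conf class in place of Hadoop's Configuration; the property name comes from the description above, everything else is illustrative:

```java
import java.util.HashMap;
import java.util.Map;

public class HBaseConfigurationCheck {
    // Minimal stand-in for Hadoop's Configuration, for illustration only.
    public static class Conf {
        private final Map<String, String> props = new HashMap<>();
        public void set(String k, String v) { props.put(k, v); }
        public String get(String k) { return props.get(k); }
    }

    // Hypothetical verify(conf): a Configuration that has incorporated
    // hbase-default.xml should carry hbase.defaults.for.version.
    public static boolean hasHBaseDefaults(Conf conf) {
        return conf.get("hbase.defaults.for.version") != null;
    }

    public static void main(String[] args) {
        Conf vanilla = new Conf();                     // plain Hadoop conf
        Conf hbase = new Conf();
        hbase.set("hbase.defaults.for.version", "0.92.0");
        System.out.println(hasHBaseDefaults(vanilla)); // false: would be rejected
        System.out.println(hasHBaseDefaults(hbase));   // true
    }
}
```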



[jira] [Resolved] (HBASE-3851) A Random-Access Column Object Model

2012-09-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-3851.
--

Resolution: Won't Fix

Closing, as suggested.
@Karthik: Do you want to attach the github link you mentioned?

 A Random-Access Column Object Model
 ---

 Key: HBASE-3851
 URL: https://issues.apache.org/jira/browse/HBASE-3851
 Project: HBase
  Issue Type: New Feature
  Components: client
Affects Versions: 0.92.0
Reporter: Karthick Sankarachary
Assignee: Karthick Sankarachary
Priority: Minor
  Labels: HBase, Mapping, Object
 Attachments: HBASE-3851.patch


 By design, a value in HBase is an opaque and atomic byte array. In theory, 
 any arbitrary type can potentially be represented in terms of such 
 unstructured yet indivisible units. However, as the complexity of the type 
 increases, so does the need to access it in parts rather than in whole. That 
 way, one can update parts of a value without reading the whole first. This 
 calls for transparency in the type of data being accessed.
 To that end, we introduce here a simple object model where each part maps to 
 a {{HTable}} column and value thereof. Specifically, we define a 
 {{ColumnObject}} interface that denotes an arbitrary type comprising 
 properties, where each property is a {{name, value}} tuple of byte arrays. 
 In essence, each property maps to a distinct HBase {{KeyValue}}. In 
 particular, the property's name maps to a column, prefixed by the qualifier 
 and the object's identifier (assumed to be unique within a column family), 
 and the property's value maps to the {{KeyValue#getValue()}} of the 
 corresponding column. Furthermore, the {{ColumnObject}} is marked as a 
 {{RandomAccess}} type to underscore the fact that its properties can be 
 accessed in and of themselves.
 For starters, we provide three concrete objects - a {{ColumnMap}}, 
 {{ColumnList}} and {{ColumnSet}} that implement the {{Map}}, {{List}} and 
 {{Set}} interfaces respectively. The {{ColumnMap}} treats each {{Map.Entry}} 
 as an object property, the {{ColumnList}} stores each element against its 
 ordinal position, and the {{ColumnSet}} considers each element as the 
 property name (as well as its value). For the sake of convenience, we also 
 define extensions to the {{Get}}, {{Put}}, {{Delete}} and {{Result}} classes 
 that are aware of and know how to deal with such {{ColumnObject}} types.
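A minimal sketch of the ColumnMap idea, assuming string property names for brevity; the class name echoes the proposal but this is not the attached patch's code:

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// An object is a bag of (name, value) byte-array properties, each of which
// would map onto one HBase KeyValue, so a single property can be read or
// updated without touching the whole object.
public class ColumnMapSketch {
    private final Map<String, byte[]> props = new LinkedHashMap<>();

    // each Map.Entry becomes one independently addressable property
    public void put(String name, String value) {
        props.put(name, value.getBytes(StandardCharsets.UTF_8));
    }

    public String get(String name) {
        byte[] v = props.get(name);
        return v == null ? null : new String(v, StandardCharsets.UTF_8);
    }

    public int size() { return props.size(); }
}
```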



[jira] [Commented] (HBASE-3834) Store ignores checksum errors when opening files

2012-09-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448950#comment-13448950
 ] 

Lars Hofhansl commented on HBASE-3834:
--

Should we close this? There appears to be little interest in it.

 Store ignores checksum errors when opening files
 

 Key: HBASE-3834
 URL: https://issues.apache.org/jira/browse/HBASE-3834
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.2
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.90.8


 If you corrupt one of the storefiles in a region (eg using vim to muck up 
 some bytes), the region will still open, but that storefile will just be 
 ignored with a log message. We should probably not do this in general - 
 better to keep that region unassigned and force an admin to make a decision 
 to remove the bad storefile.



[jira] [Commented] (HBASE-3828) region server stuck in waitOnAllRegionsToClose

2012-09-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448964#comment-13448964
 ] 

Lars Hofhansl commented on HBASE-3828:
--

I assume with all the recent work this has been fixed.
@Ram, @Stack: Would you agree with that? If so, we can just close this.

 region server stuck in waitOnAllRegionsToClose
 --

 Key: HBASE-3828
 URL: https://issues.apache.org/jira/browse/HBASE-3828
 Project: HBase
  Issue Type: Bug
Reporter: Prakash Khemani





[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-09-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448970#comment-13448970
 ] 

Ted Yu commented on HBASE-6721:
---

More details should be added to the design.
Have you considered introducing an interface for AssignmentManager so that 
existing and new managers can be easily swapped?
Have you considered storing group information in ZooKeeper instead of on HDFS?

Please explain more about RegionServerGroupProtocol.

Thanks for the initiative.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it provides a client application with a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Commented] (HBASE-3859) Increment a counter when a Scanner lease expires

2012-09-05 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448971#comment-13448971
 ] 

Elliott Clark commented on HBASE-3859:
--

Wouldn't it be better to use MetricsTimeVaryingLong rather than a 
MetricsLongValue and an AtomicLong?

[~lhofhansl] We're getting close to finishing the move to metrics2; the 
HRegionServer is the last part that needs to be moved over.  My plan is to 
move it over and clean up stuff in HBASE-4050 in the coming weeks.  With that 
said, I still think this can be a useful issue, and having it in the Metrics1 
version will make sure that it's ported over when the time comes.
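For reference, the counter itself can be as simple as this sketch built on AtomicLong; the real patch wires it into HBase's metrics classes, which are not reproduced here:

```java
import java.util.concurrent.atomic.AtomicLong;

public class ScannerLeaseMetrics {
    // Illustrative counter in the spirit of the patch under discussion; the
    // actual metrics types (MetricsLongValue, MetricsTimeVaryingLong) differ.
    private final AtomicLong expiredScannerLeases = new AtomicLong();

    // called whenever the RegionServer expires a scanner lease
    public void onLeaseExpired() {
        expiredScannerLeases.incrementAndGet();
    }

    // exposed through the metrics system for scraping (e.g. by OpenTSDB)
    public long getExpiredScannerLeases() {
        return expiredScannerLeases.get();
    }
}
```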

 Increment a counter when a Scanner lease expires
 

 Key: HBASE-3859
 URL: https://issues.apache.org/jira/browse/HBASE-3859
 Project: HBase
  Issue Type: Improvement
  Components: metrics, regionserver
Affects Versions: 0.90.2
Reporter: Benoit Sigoure
Assignee: Mubarak Seyed
Priority: Minor
 Attachments: HBASE-3859.trunk.v1.patch


 Whenever a Scanner lease expires, the RegionServer will close it 
 automatically and log a message to complain.  I would like the RegionServer 
 to increment a counter whenever this happens and expose this counter through 
 the metrics system, so we can plug this into our monitoring system (OpenTSDB) 
 and keep track of how frequently this happens.  It's not supposed to happen 
 frequently so it's good to keep an eye on it.
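For illustration, a minimal sketch of such a counter. A plain AtomicLong stands in for the Hadoop metrics classes discussed above, and all names here are made up, not taken from the actual patch:

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch: count expired scanner leases so a monitoring system
// (e.g. OpenTSDB) can poll the value. The real change would register the
// counter with the Hadoop metrics framework; AtomicLong stands in here.
public class LeaseExpiryCounter {
    private static final AtomicLong expiredScannerLeases = new AtomicLong();

    // Called from the lease-expiry path when a scanner lease times out.
    public static void onScannerLeaseExpired(String scannerId) {
        expiredScannerLeases.incrementAndGet();
        System.out.println("Scanner lease expired: " + scannerId);
    }

    // Exposed to the metrics system as a monotonically increasing counter.
    public static long getExpiredScannerLeases() {
        return expiredScannerLeases.get();
    }

    public static void main(String[] args) {
        onScannerLeaseExpired("scanner-1");
        onScannerLeaseExpired("scanner-2");
        System.out.println("expired=" + getExpiredScannerLeases());
    }
}
```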

--


[jira] [Resolved] (HBASE-3814) force regionserver to halt

2012-09-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-3814.
--

Resolution: Won't Fix

There appears to be no interest in this one.
Please revive if you think we should do this.

 force regionserver to halt
 --

 Key: HBASE-3814
 URL: https://issues.apache.org/jira/browse/HBASE-3814
 Project: HBase
  Issue Type: Bug
Reporter: Prakash Khemani

 Once abort() on a regionserver is called we should have a timeout thread that 
 does Runtime.halt() if the rs gets stuck somewhere during abort processing.
 ===
 Pumahbase132 has the following logs: the dfsclient is not able to set up a 
 write pipeline successfully ... it tries to abort ... but while aborting it 
 gets stuck. I know there is a check that if we are aborting because the 
 filesystem is closed then we should not try to flush the logs while aborting. 
 But in this case the fs is up and running, just that it is not functioning.
 2011-04-21 23:48:07,082 INFO org.apache.hadoop.hdfs.DFSClient: Exception in 
 createBlockOutputStream 10.38.131.53:50010  for file 
 /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280java.io.IOException:
  Bad connect ack with firstBadLink 10.38.133.33:50010
 2011-04-21 23:48:07,082 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning 
 block blk_-8967376451767492285_6537229 for file 
 /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
 2011-04-21 23:48:07,125 INFO org.apache.hadoop.hdfs.DFSClient: Exception in 
 createBlockOutputStream 10.38.131.53:50010  for file 
 /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280java.io.IOException:
  Bad connect ack with firstBadLink 10.38.134.59:50010
 2011-04-21 23:48:07,125 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning 
 block blk_7172251852699100447_6537229 for file 
 /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
  
 2011-04-21 23:48:07,169 INFO org.apache.hadoop.hdfs.DFSClient: Exception in 
 createBlockOutputStream 10.38.131.53:50010  for file 
 /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280java.io.IOException:
  Bad connect ack with firstBadLink 10.38.134.53:50010
 2011-04-21 23:48:07,169 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning 
 block blk_-9153204772467623625_6537229 for file 
 /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
 2011-04-21 23:48:07,213 INFO org.apache.hadoop.hdfs.DFSClient: Exception in 
 createBlockOutputStream 10.38.131.53:50010  for file 
 /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280java.io.IOException:
  Bad connect ack with firstBadLink 10.38.134.49:50010
 2011-04-21 23:48:07,213 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning 
 block blk_-2513098940934276625_6537229 for file 
 /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
 2011-04-21 23:48:07,214 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
 Exception: java.io.IOException: Unable to create new block.
 at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3560)
 at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2700(DFSClient.java:2720)
 at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2977)
 2011-04-21 23:48:07,214 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery 
 for block blk_-2513098940934276625_6537229 bad datanode[1] nodes == null
 2011-04-21 23:48:07,214 WARN org.apache.hadoop.hdfs.DFSClient: Could not get 
 block locations. Source file 
 /PUMAHBASE002-SNC5-HBASE/.logs/pumahbase132.snc5.facebook.com,60020,1303450732026/pumahbase132.snc5.facebook.com%3A60020.1303450732280
  - Aborting...
 2011-04-21 23:48:07,216 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: 
 Could not append. Requesting close of hlog
 And then the RS gets stuck trying to roll the logs ...
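A minimal sketch of the watchdog this issue asks for: once abort() starts, a daemon thread force-terminates the process if the graceful abort gets stuck. The class and method names are illustrative, and the halt action is injectable here only so the pattern can be shown without killing the JVM; real code would call Runtime.getRuntime().halt(1) directly:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch: arm a daemon watchdog thread when abort() begins; if abort
// processing does not finish within the timeout (e.g. a log roll hung on
// a bad HDFS pipeline), force-terminate, skipping shutdown hooks.
public class AbortWatchdog {
    private final CountDownLatch abortDone = new CountDownLatch(1);

    public void arm(long timeoutMillis, Runnable haltAction) {
        Thread watchdog = new Thread(() -> {
            try {
                if (!abortDone.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    haltAction.run();  // real code: Runtime.getRuntime().halt(1)
                }
            } catch (InterruptedException ignored) { }
        }, "abort-watchdog");
        watchdog.setDaemon(true);
        watchdog.start();
    }

    // Call when abort processing completes normally, disarming the watchdog.
    public void abortCompleted() {
        abortDone.countDown();
    }
}
```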

--


[jira] [Commented] (HBASE-6720) Optionally limit number of regions balanced in each balancer run

2012-09-05 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448975#comment-13448975
 ] 

Elliott Clark commented on HBASE-6720:
--

When this is put in we need to make sure to change the StochasticLoadBalancer 
as well.  Right now it has a setting, 
hbase.master.balancer.stochastic.maxMoveRegions, that sets the maximum number 
of regions to move at a time.
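The mechanism amounts to truncating the balancer's plan list. A hedged sketch under assumed names (the real balancer works on RegionPlan objects; a generic list stands in here):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the limit discussed above: after the balancer computes its
// full set of region move plans, cap the list to a configured maximum
// per run, analogous to what the StochasticLoadBalancer does via
// hbase.master.balancer.stochastic.maxMoveRegions.
public class MovePlanLimiter {
    public static <T> List<T> capMoves(List<T> plans, int maxMovesPerRun) {
        if (maxMovesPerRun < 0 || plans.size() <= maxMovesPerRun) {
            return plans;  // negative means unlimited; or already under the cap
        }
        return plans.subList(0, maxMovesPerRun);
    }

    public static void main(String[] args) {
        List<String> plans = Arrays.asList("r1->rs2", "r2->rs3", "r3->rs1");
        System.out.println(capMoves(plans, 2));  // first two plans only
    }
}
```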

 Optionally limit number of regions balanced in each balancer run
 

 Key: HBASE-6720
 URL: https://issues.apache.org/jira/browse/HBASE-6720
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Fix For: 0.96.0, 0.94.3


 See discussion on HBASE-3866

--


[jira] [Assigned] (HBASE-6715) TestFromClientSide.testCacheOnWriteEvictOnClose is flaky

2012-09-05 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HBASE-6715:
--

Assignee: Jimmy Xiang

 TestFromClientSide.testCacheOnWriteEvictOnClose is flaky
 

 Key: HBASE-6715
 URL: https://issues.apache.org/jira/browse/HBASE-6715
 Project: HBase
  Issue Type: Test
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor

 Occasionally, this test fails:
 {noformat}
 expected:2049 but was:2069
 Stacktrace
 java.lang.AssertionError: expected:2049 but was:2069
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at org.junit.Assert.assertEquals(Assert.java:456)
 at 
 org.apache.hadoop.hbase.client.TestFromClientSide.testCacheOnWriteEvictOnClose(TestFromClientSide.java:4248)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 {noformat}
 It could be because another thread is still accessing the cache.

--


[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448987#comment-13448987
 ] 

Himanshu Vashishtha commented on HBASE-6649:


lgtm. 
The exception will be re-thrown in the next try, so +0 on adding a log 
statement before break.

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: 6649-1.patch, 6649-2.txt, 6649-trunk.patch, HBase-0.92 
 #495 test - queueFailover [Jenkins].html, HBase-0.92 #502 test - 
 queueFailover [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB  
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.

--


[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448988#comment-13448988
 ] 

stack commented on HBASE-6649:
--

J-D is on vacation.  Let me commit this.  Will add the log message Ted 
suggests, though my sense is it's overkill; let's see.  Would suggest a new 
issue for the other 'parts' DD.

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: 6649-1.patch, 6649-2.txt, 6649-trunk.patch, HBase-0.92 
 #495 test - queueFailover [Jenkins].html, HBase-0.92 #502 test - 
 queueFailover [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB  
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.

--


[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-6649:
---

Attachment: 6649-0.92.patch
6649-trunk.patch

I don't mind adding a few comments around the exception handling.

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, 
 6649-trunk.patch, 6649-trunk.patch, HBase-0.92 #495 test - queueFailover 
 [Jenkins].html, HBase-0.92 #502 test - queueFailover [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB  
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.

--


[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6649:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to trunk, 0.92, and 0.94.  Thanks for the reviews lads and DD for the 
patch.

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, 
 6649-trunk.patch, 6649-trunk.patch, HBase-0.92 #495 test - queueFailover 
 [Jenkins].html, HBase-0.92 #502 test - queueFailover [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB  
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.

--


[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6649:
-

Attachment: 6649.txt

Here is what I applied.  Includes Ted's suggested logging.  I applied this same 
patch to 0.94 and 0.92 w/ -p1

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, 
 6649-trunk.patch, 6649-trunk.patch, 6649.txt, HBase-0.92 #495 test - 
 queueFailover [Jenkins].html, HBase-0.92 #502 test - queueFailover 
 [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB  
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.

--


[jira] [Commented] (HBASE-3834) Store ignores checksum errors when opening files

2012-09-05 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13448998#comment-13448998
 ] 

Todd Lipcon commented on HBASE-3834:


It's still a somewhat scary bug, if it still exists. It causes data to be 
silently missing from a table. So I hope someone will take interest in it :)

 Store ignores checksum errors when opening files
 

 Key: HBASE-3834
 URL: https://issues.apache.org/jira/browse/HBASE-3834
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.2
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.90.8


 If you corrupt one of the storefiles in a region (eg using vim to muck up 
 some bytes), the region will still open, but that storefile will just be 
 ignored with a log message. We should probably not do this in general - 
 better to keep that region unassigned and force an admin to make a decision 
 to remove the bad storefile.

--


[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-09-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13449007#comment-13449007
 ] 

Ted Yu commented on HBASE-6721:
---

Another aspect is fault tolerance.
Say the smallest group consists of 6 region servers.  The impact of a majority 
of those 6 servers going down at the same time is much higher than the impact 
of 6 servers going down out of the whole cluster when there is only one group.
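To make the intuition concrete, a rough back-of-the-envelope calculation, under the simplifying assumption (mine, not the comment's) that servers fail independently with some probability p:

```java
// Rough illustration of the fault-tolerance point above: the chance that
// a majority of a small 6-server group is down simultaneously, assuming
// independent per-server failure probability p. This is a toy model,
// not taken from the issue.
public class GroupFailureOdds {
    // Binomial coefficient C(n, k), computed incrementally to stay exact.
    static long choose(int n, int k) {
        long r = 1;
        for (int i = 1; i <= k; i++) r = r * (n - i + 1) / i;
        return r;
    }

    // P(at least k of n servers down) for independent failure probability p.
    static double pAtLeast(int n, int k, double p) {
        double total = 0;
        for (int i = k; i <= n; i++) {
            total += choose(n, i) * Math.pow(p, i) * Math.pow(1 - p, n - i);
        }
        return total;
    }

    public static void main(String[] args) {
        // Majority (4 or more) of a 6-server group down at once, p = 1%:
        System.out.printf("P = %.2e%n", pAtLeast(6, 4, 0.01));
    }
}
```

The point being that a group of 6 loses its majority after only 4 failures, whereas the same 4 failures spread over a large single-group cluster barely register.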

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.

--


[jira] [Created] (HBASE-6722) fixHdfsOrphans won't work for first/end regions

2012-09-05 Thread Adrien Mogenet (JIRA)
Adrien Mogenet created HBASE-6722:
-

 Summary: fixHdfsOrphans won't work for first/end regions
 Key: HBASE-6722
 URL: https://issues.apache.org/jira/browse/HBASE-6722
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 0.94.1
Reporter: Adrien Mogenet


When a .regioninfo is missing on the first (or final) region, it will try to 
determine the startKey (or endKey) based on what has been seen on HDFS.

However, for these special cases an empty key should be considered instead.
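A sketch of the fix described above. The names here are illustrative, not the actual hbck code: when reconstructing a missing .regioninfo, the table's first region must get an empty start key and the last region an empty end key, rather than keys inferred from the store files on HDFS:

```java
import java.util.Arrays;

// Sketch: special-case the boundary regions of a table when rebuilding
// region metadata, instead of trusting keys inferred from HDFS contents.
public class OrphanRegionKeys {
    static final byte[] EMPTY_KEY = new byte[0];

    public static byte[] startKeyFor(boolean isFirstRegion, byte[] inferredFromHdfs) {
        return isFirstRegion ? EMPTY_KEY : inferredFromHdfs;
    }

    public static byte[] endKeyFor(boolean isLastRegion, byte[] inferredFromHdfs) {
        return isLastRegion ? EMPTY_KEY : inferredFromHdfs;
    }

    public static void main(String[] args) {
        byte[] inferred = "row-0042".getBytes();
        // First region keeps an empty start key even if HDFS suggests one.
        System.out.println(Arrays.toString(startKeyFor(true, inferred)));
        System.out.println(Arrays.toString(startKeyFor(false, inferred)));
    }
}
```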

--


[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13449035#comment-13449035
 ] 

Hudson commented on HBASE-6649:
---

Integrated in HBase-0.94 #450 (See 
[https://builds.apache.org/job/HBase-0.94/450/])
HBASE-6649 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally 
fails [Part-1] (Revision 1381289)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java


 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, 
 6649-trunk.patch, 6649-trunk.patch, 6649.txt, HBase-0.92 #495 test - 
 queueFailover [Jenkins].html, HBase-0.92 #502 test - queueFailover 
 [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB  
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.

--


[jira] [Commented] (HBASE-6669) Add BigDecimalColumnInterpreter for doing aggregations using AggregationClient

2012-09-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13449067#comment-13449067
 ] 

Ted Yu commented on HBASE-6669:
---

Since there are two BigDecimal fields in BigDecimalColumnInterpreter, you need 
to implement readFields() and write() for serialization.
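A hedged sketch of what that serialization could look like. The real class would implement Hadoop's Writable interface, but the readFields()/write() bodies only need java.io's DataInput/DataOutput; writing a BigDecimal as its unscaled bytes plus the scale round-trips it exactly:

```java
import java.io.*;
import java.math.BigDecimal;
import java.math.BigInteger;

// Sketch: lossless BigDecimal (de)serialization via unscaled value + scale.
// Static methods stand in for the instance methods a Writable would have.
public class BigDecimalSerde {
    public static void write(DataOutput out, BigDecimal value) throws IOException {
        byte[] unscaled = value.unscaledValue().toByteArray();
        out.writeInt(unscaled.length);
        out.write(unscaled);
        out.writeInt(value.scale());
    }

    public static BigDecimal readFields(DataInput in) throws IOException {
        byte[] unscaled = new byte[in.readInt()];
        in.readFully(unscaled);
        int scale = in.readInt();
        return new BigDecimal(new BigInteger(unscaled), scale);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        write(new DataOutputStream(bos), new BigDecimal("123.456"));
        BigDecimal back = readFields(
            new DataInputStream(new ByteArrayInputStream(bos.toByteArray())));
        System.out.println(back);  // 123.456
    }
}
```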

 Add BigDecimalColumnInterpreter for doing aggregations using AggregationClient
 --

 Key: HBASE-6669
 URL: https://issues.apache.org/jira/browse/HBASE-6669
 Project: HBase
  Issue Type: New Feature
  Components: client, coprocessors
Reporter: Anil Gupta
Priority: Minor
  Labels: client, coprocessors
 Attachments: BigDecimalColumnInterpreter.java, 
 BigDecimalColumnInterpreter.patch, BigDecimalColumnInterpreter.patch


 I recently created a Class for doing aggregations(sum,min,max,std) on values 
 stored as BigDecimal in HBase. I would like to commit the 
 BigDecimalColumnInterpreter into HBase. In my opinion this class can be used 
 by a wide variety of users. Please let me know if its not appropriate to add 
 this class in HBase.
 Thanks,
 Anil Gupta
 Software Engineer II, Intuit, Inc 

--


[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13449069#comment-13449069
 ] 

Hudson commented on HBASE-6649:
---

Integrated in HBase-TRUNK #3307 (See 
[https://builds.apache.org/job/HBase-TRUNK/3307/])
HBASE-6649 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally 
fails [Part-1] (Revision 1381287)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java


 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, 
 6649-trunk.patch, 6649-trunk.patch, 6649.txt, HBase-0.92 #495 test - 
 queueFailover [Jenkins].html, HBase-0.92 #502 test - queueFailover 
 [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB  
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.

--


[jira] [Commented] (HBASE-3803) Make the load balancer run with a gentle hand

2012-09-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13449085#comment-13449085
 ] 

Lars Hofhansl commented on HBASE-3803:
--

Can we subsume this in HBASE-6720.

 Make the load balancer run with a gentle hand
 -

 Key: HBASE-3803
 URL: https://issues.apache.org/jira/browse/HBASE-3803
 Project: HBase
  Issue Type: Improvement
Reporter: stack

 We need 'smoothing' of balancer region moves.  Yesterday we brought a 
 regionserver back online into a smallish cluster that was under load, and the 
 balancer run unloaded a bunch of regions all in one go, which put a dent in 
 the throughput when a bunch of regions went offline at the one time.  It'd be 
 sweet if the balancer ran at a context-appropriate 'rate'; when under load, 
 it should move regions 'gently' rather than all as a big bang (the 
 decommission script will move a region at a time, verifying it deployed in 
 its new location before moving another... this can take ages to complete but 
 it's proven minimally disruptive to loadings)

--


[jira] [Resolved] (HBASE-879) When dfs restarts or moves blocks around, hbase regionservers don't notice

2012-09-05 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates resolved HBASE-879.
---

Resolution: Fixed

I think this is fixed in the current versions of HBase. Reopen if I'm mistaken.

 When dfs restarts or moves blocks around, hbase regionservers don't notice
 --

 Key: HBASE-879
 URL: https://issues.apache.org/jira/browse/HBASE-879
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.18.1, 0.19.0
Reporter: Michael Bieniosek

 Since the hbase regionservers use a DFSClient to keep handles open to the 
 dfs, if the dfs blocks move around (typically because of a dfs restart, but 
 can also happen if datanodes die or blocks get shuffled around), the 
 regionserver will be unable to service the region.  It would be nice if the 
 DFSClient that the regionservers use could notice this case and refresh the 
 block list. 

--


[jira] [Resolved] (HBASE-6017) TestReplication fails occasionally

2012-09-05 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates resolved HBASE-6017.


Resolution: Duplicate

DUP of HBASE-6649

 TestReplication fails occasionally
 --

 Key: HBASE-6017
 URL: https://issues.apache.org/jira/browse/HBASE-6017
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1
Reporter: Devaraj Das

 I see occasional failures in TestReplication on the 0.92 branch.
 Running org.apache.hadoop.hbase.replication.TestReplication
 Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 240.118 sec 
  FAILURE!
 Results :
 Failed tests:   
 queueFailover(org.apache.hadoop.hbase.replication.TestReplication): Waited 
 too much time for queueFailover replication

--


[jira] [Resolved] (HBASE-917) filesystem intensive operations such as compaction should be load aware

2012-09-05 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates resolved HBASE-917.
---

Resolution: Won't Fix

I think this can be handled via coprocessor hooks - closing as won't fix.

 filesystem intensive operations such as compaction should be load aware
 ---

 Key: HBASE-917
 URL: https://issues.apache.org/jira/browse/HBASE-917
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Andrew Purtell

 If the underlying filesystem is already severely stressed, running intensive 
 operations such as compaction is asking for trouble. Ideally, such actions 
 should be deferred until load is observed to lessen. 

--


[jira] [Resolved] (HBASE-1042) OOME but we don't abort

2012-09-05 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates resolved HBASE-1042.


Resolution: Fixed

This is fixed against trunk, according to the comments. Reopen if still an 
issue.

 OOME but we don't abort
 ---

 Key: HBASE-1042
 URL: https://issues.apache.org/jira/browse/HBASE-1042
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Attachments: 1042-committed.patch, 1042.patch, 1042-v2.patch


 On the streamy cluster we saw a case where graceful shutdown had been 
 triggered rather than an abort on OOME.  On graceful shutdown, we wait on 
 leases to expire or be closed.  The server wouldn't go down because it was 
 waiting on leases to expire, but an OOME in Leases had killed the thread, so 
 it wasn't ever going to expire anything.  The node was stuck for four hours 
 till someone noticed it.

--


[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13449147#comment-13449147
 ] 

Hudson commented on HBASE-6649:
---

Integrated in HBase-0.92 #557 (See 
[https://builds.apache.org/job/HBase-0.92/557/])
HBASE-6649 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally 
fails [Part-1] (Revision 1381291)

 Result = SUCCESS
stack : 
Files : 
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java


 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
 ---

 Key: HBASE-6649
 URL: https://issues.apache.org/jira/browse/HBASE-6649
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.96.0, 0.92.3, 0.94.2

 Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, 
 6649-trunk.patch, 6649-trunk.patch, 6649.txt, HBase-0.92 #495 test - 
 queueFailover [Jenkins].html, HBase-0.92 #502 test - queueFailover 
 [Jenkins].html


 Have seen it twice in the recent past: http://bit.ly/MPCykB  
 http://bit.ly/O79Dq7 .. 
 Looking briefly at the logs hints at a pattern - in both the failed test 
 instances, there was an RS crash while the test was running.

--


[jira] [Updated] (HBASE-6610) HFileLink: Hardlink alternative for snapshot restore

2012-09-05 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-6610:
---

Attachment: HBASE-6610-v3.patch

 HFileLink: Hardlink alternative for snapshot restore
 

 Key: HBASE-6610
 URL: https://issues.apache.org/jira/browse/HBASE-6610
 Project: HBase
  Issue Type: Sub-task
  Components: io
Affects Versions: 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
  Labels: snapshot
 Fix For: 0.96.0

 Attachments: HBASE-6610-v1.patch, HBASE-6610-v2.patch, 
 HBASE-6610-v3.patch


 To avoid copying data during snapshot restore we need to introduce an 
 HFileLink that allows referencing a file that can be in the original path 
 (/hbase/table/region/cf/hfile) or, if the file is archived, in the archive 
 directory (/hbase/.archive/table/region/cf/hfile).
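The lookup described above amounts to probing the original location and falling back to the archive. A hedged sketch with illustrative paths and names, using java.nio in place of the Hadoop FileSystem API:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: resolve an HFileLink by checking the original hfile location
// first and falling back to the archive directory, which mirrors the
// same table/region/cf layout under <root>/.archive.
public class HFileLinkResolver {
    public static Path resolve(Path root, String table, String region,
                               String cf, String hfile) {
        Path original = root.resolve(Paths.get(table, region, cf, hfile));
        if (Files.exists(original)) {
            return original;
        }
        // The file was moved by a compaction/delete; look in the archive.
        return root.resolve(Paths.get(".archive", table, region, cf, hfile));
    }

    public static void main(String[] args) {
        Path p = resolve(Paths.get("/hbase"), "t1", "r1", "cf", "hf1");
        System.out.println(p);
    }
}
```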



[jira] [Reopened] (HBASE-917) filesystem intensive operations such as compaction should be load aware

2012-09-05 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reopened HBASE-917:
-


Reopening.  The vehicle by which we achieve this issue may be a coprocessor but 
the actual work still needs to be done.  I'd say we should leave this issue 
open.  You might argue the issue is without sufficient detail.  You might get 
away w/ closing it with that justification.

 filesystem intensive operations such as compaction should be load aware
 ---

 Key: HBASE-917
 URL: https://issues.apache.org/jira/browse/HBASE-917
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Andrew Purtell

 If the underlying filesystem is already severely stressed, running intensive 
 operations such as compaction is asking for trouble. Ideally, such actions 
 should be deferred until load is observed to lessen. 



[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-09-05 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13449185#comment-13449185
 ] 

Francis Liu commented on HBASE-6721:


 Have you considered introducing an interface for AssignmentManager so that 
 existing and new managers can be easily swapped?
Yes, part of the proposal is to make AssignmentManager pluggable. I'll add that 
as a subtask for this.

 Have you considered storing group information in zookeeper instead of on hdfs 
 ?
Correct me if I'm wrong, but it seems the approach HBase has taken for its 
usage of ZK is more towards storing transient data for coordination, while the 
real source of truth lives on HDFS or in tables. We decided to follow the same 
approach. 

 Please explain more about RegionServerGroupProtocol.
RegionServerGroupProtocol exposes APIs to manage grouping (see the API in the doc). The 
current plan is that these APIs will be used and exposed via CLI 
commands. 

 Another aspect is fault tolerance.
 Say the smallest group consists of 6 region servers; the impact of a majority 
 of those 6 servers going down at the same time is much higher than that of 6 
 servers going down out of the whole cluster when there is only one group.
This is similar to HBase cluster sizing for fault tolerance. Let's play around 
with it and later on document best practices.




 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.
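The grouping idea in the proposal above can be illustrated with a toy assignment plan. The names below are hypothetical and this is not the HBase API; it only shows the core constraint that a table's candidate servers are restricted to its group:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of group-based assignment (hypothetical names):
// each table may only be placed on servers belonging to its group,
// so one tenant's load cannot spill onto another group's servers.
public class GroupAssignmentSketch {
    public static Map<String, List<String>> plan(
            Map<String, String> tableToGroup,
            Map<String, List<String>> groupToServers) {
        Map<String, List<String>> out = new HashMap<>();
        for (Map.Entry<String, String> e : tableToGroup.entrySet()) {
            // Candidate servers for a table are exactly its group's members.
            out.put(e.getKey(), groupToServers.get(e.getValue()));
        }
        return out;
    }
}
```

A real group-aware AssignmentManager would additionally balance load within each group, but the isolation property is the part shown here.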



[jira] [Created] (HBASE-6723) Make AssignmentManager pluggable

2012-09-05 Thread Francis Liu (JIRA)
Francis Liu created HBASE-6723:
--

 Summary: Make AssignmentManager pluggable
 Key: HBASE-6723
 URL: https://issues.apache.org/jira/browse/HBASE-6723
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu






[jira] [Resolved] (HBASE-5065) wrong IllegalArgumentException thrown when creating an 'HServerAddress' with an un-reachable hostname

2012-09-05 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo resolved HBASE-5065.
-

Resolution: Invalid

This looks like it is now fixed in trunk and 0.90. 
checkBindAddressCanBeResolved() now has a null check and throws an 
IllegalArgumentException with an appropriate message. This class is also 
deprecated.

Please reopen if you think differently.

 wrong IllegalArgumentException thrown when creating an 'HServerAddress' with 
 an un-reachable hostname
 -

 Key: HBASE-5065
 URL: https://issues.apache.org/jira/browse/HBASE-5065
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 0.90.4
Reporter: Eran Hirsch
Priority: Trivial

 When trying to build an 'HServerAddress' object with an unresolvable hostname,
 e.g. new HServerAddress("www.IAMUNREACHABLE.com:80"),
 a call to 'getResolvedAddress' would cause the 'InetSocketAddress' c'tor to 
 throw an IllegalArgumentException because it is called with a null 'hostname' 
 parameter.
 This happens because there is no null check after the static 
 'getBindAddressInternal' method returns a null value when the hostname is 
 unresolved.
 This is a trivial bug because HServerAddress is expected to throw 
 this kind of exception when this error occurs, but it is thrown for the 
 wrong reason. The method 'checkBindAddressCanBeResolved' should be the one 
 throwing the exception (and giving a slightly different reason). Because of 
 this, the method call itself is redundant in the current flow: it will always 
 succeed, since the case it checks has already been checked by the preceding 
 getResolvedAddress call.
 In short:
 an IllegalArgumentException is thrown with reason "hostname can't be null" 
 from the InetSocketAddress c'tor
 INSTEAD OF
 an IllegalArgumentException with reason "Could not resolve the DNS name of 
 [BADHOSTNAME]:[PORT]" from HServerAddress's checkBindAddressCanBeResolved method.
 Stack trace:
 java.lang.IllegalArgumentException: hostname can't be null
   at java.net.InetSocketAddress.<init>(InetSocketAddress.java:139) 
 ~[na:1.7.0_02]
   at 
 org.apache.hadoop.hbase.HServerAddress.getResolvedAddress(HServerAddress.java:108)
  ~[hbase-0.90.4.jar:0.90.4]
   at 
 org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:64) 
 ~[hbase-0.90.4.jar:0.90.4]
   at 
 org.apache.hadoop.hbase.zookeeper.RootRegionTracker.dataToHServerAddress(RootRegionTracker.java:82)
  ~[hbase-0.90.4.jar:0.90.4]
   at 
 org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:73)
  ~[hbase-0.90.4.jar:0.90.4]
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:579)
  ~[hbase-0.90.4.jar:0.90.4]
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559)
  ~[hbase-0.90.4.jar:0.90.4]
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:688)
  ~[hbase-0.90.4.jar:0.90.4]
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590)
  ~[hbase-0.90.4.jar:0.90.4]
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559)
  ~[hbase-0.90.4.jar:0.90.4]
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:688)
  ~[hbase-0.90.4.jar:0.90.4]
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:594)
  ~[hbase-0.90.4.jar:0.90.4]
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559)
  ~[hbase-0.90.4.jar:0.90.4]
   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:173) 
 ~[hbase-0.90.4.jar:0.90.4]
   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:147) 
 ~[hbase-0.90.4.jar:0.90.4]
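The fix described in the resolution (resolve first, then throw an IllegalArgumentException with a descriptive message) can be sketched as a standalone illustration. This is not the actual HBase patch; the class name is hypothetical:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative standalone version of the fix described above (not the
// actual HBase code): resolve the hostname up front and throw a
// descriptive IllegalArgumentException, instead of letting the
// InetSocketAddress constructor fail later on a null hostname.
public class ResolveCheckSketch {
    public static void checkBindAddressCanBeResolved(String host, int port) {
        try {
            InetAddress.getByName(host);  // DNS lookup; throws if unresolvable
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(
                "Could not resolve the DNS name of " + host + ":" + port, e);
        }
    }
}
```

The caller now gets an exception that names the bad host and port, rather than the generic "hostname can't be null" from deep inside InetSocketAddress.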



[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-09-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13449238#comment-13449238
 ] 

Ted Yu commented on HBASE-6721:
---

Looking at the current doc, GroupInfo would be passed to the (new) AssignmentManager.
Do you plan to reference GroupInfo in the AssignmentManager interface ?

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Commented] (HBASE-6165) Replication can overrun .META. scans on cluster re-start

2012-09-05 Thread Jeff Whiting (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13449245#comment-13449245
 ] 

Jeff Whiting commented on HBASE-6165:
-

I may be a little late to the party, but why is replication using any kind of 
higher than normal priority handlers? 

It looks like we all agree that they shouldn't be using the high priority 
handlers.  It looks like they now have their own medium priority handlers. But 
I don't see an argument as to why they don't just use the normal priority 
handlers.

 Replication can overrun .META. scans on cluster re-start
 

 Key: HBASE-6165
 URL: https://issues.apache.org/jira/browse/HBASE-6165
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.2

 Attachments: 6165-v6.txt, HBase-6165-94-v1.patch, 
 HBase-6165-94-v2.patch, HBase-6165-v1.patch, HBase-6165-v2.patch, 
 HBase-6165-v3.patch, HBase-6165-v4.patch, HBase-6165-v5.patch


 When restarting a large set of regions on a reasonably small cluster the 
 replication from another cluster tied up every xceiver meaning nothing could 
 be onlined.



[jira] [Commented] (HBASE-6659) Port HBASE-6508 Filter out edits at log split time

2012-09-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13449251#comment-13449251
 ] 

Ted Yu commented on HBASE-6659:
---

For last flushed sequence Id, another option is to embed it in HRegionInfo.
This way, there is no need to modify RegionLoad.

 Port HBASE-6508 Filter out edits at log split time
 --

 Key: HBASE-6659
 URL: https://issues.apache.org/jira/browse/HBASE-6659
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu
 Fix For: 0.96.0

 Attachments: 6508-v2.txt, 6508-v3.txt, 6508-v4.txt, 6508-v5.txt, 
 6508-v7.txt, 6508-v7.txt


 HBASE-6508 is for 0.89-fb branch.
 This JIRA ports the feature to trunk.



[jira] [Commented] (HBASE-6723) Make AssignmentManager pluggable

2012-09-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13449262#comment-13449262
 ] 

stack commented on HBASE-6723:
--

One thought is that AM as is should not be pluggable.  It's way too fat, doing 
too many things, such as actual rpcs inside AM.  My guess is you don't want 
your AM replacement doing rpcs and handling zk callbacks directly; that should 
be done by a wrapper class, and what you want to replace is some nugget core 
that makes the assignment decisions, something we don't yet have but that we 
badly need, if only to make AM decision making more testable.  Go easy Francis.
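The "nugget core" suggested above could look something like this hypothetical interface (not existing HBase code): a pure decision function with no rpc or zookeeper dependencies, so it is trivially unit-testable:

```java
import java.util.List;

// Hypothetical sketch of an extracted assignment core (not an existing
// HBase interface): the wrapper class would own rpcs and zk callbacks,
// while this policy only makes the placement decision.
public class AssignmentCoreSketch {
    interface AssignmentPolicy {
        // Decide which live server should host the given region.
        String chooseServer(String regionName, List<String> liveServers);
    }

    // A trivially testable example policy: deterministic hash-based placement.
    static final AssignmentPolicy HASH_POLICY =
        (region, servers) ->
            servers.get(Math.floorMod(region.hashCode(), servers.size()));
}
```

With such a seam, a group-aware implementation (HBASE-6721) could be swapped in without replacing the surrounding rpc/zk machinery.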

 Make AssignmentManager pluggable
 

 Key: HBASE-6723
 URL: https://issues.apache.org/jira/browse/HBASE-6723
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu





[jira] [Updated] (HBASE-6715) TestFromClientSide.testCacheOnWriteEvictOnClose is flaky

2012-09-05 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6715:
---

Status: Patch Available  (was: Open)

 TestFromClientSide.testCacheOnWriteEvictOnClose is flaky
 

 Key: HBASE-6715
 URL: https://issues.apache.org/jira/browse/HBASE-6715
 Project: HBase
  Issue Type: Test
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: trunk-6715.patch


 Occasionally, this test fails:
 {noformat}
 expected:<2049> but was:<2069>
 Stacktrace
 java.lang.AssertionError: expected:<2049> but was:<2069>
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at org.junit.Assert.assertEquals(Assert.java:456)
 at 
 org.apache.hadoop.hbase.client.TestFromClientSide.testCacheOnWriteEvictOnClose(TestFromClientSide.java:4248)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 {noformat}
 It could be because another thread is still accessing the cache.



[jira] [Updated] (HBASE-6715) TestFromClientSide.testCacheOnWriteEvictOnClose is flaky

2012-09-05 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6715:
---

Attachment: trunk-6715.patch

 TestFromClientSide.testCacheOnWriteEvictOnClose is flaky
 

 Key: HBASE-6715
 URL: https://issues.apache.org/jira/browse/HBASE-6715
 Project: HBase
  Issue Type: Test
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: trunk-6715.patch


 Occasionally, this test fails:
 {noformat}
 expected:<2049> but was:<2069>
 Stacktrace
 java.lang.AssertionError: expected:<2049> but was:<2069>
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at org.junit.Assert.assertEquals(Assert.java:456)
 at 
 org.apache.hadoop.hbase.client.TestFromClientSide.testCacheOnWriteEvictOnClose(TestFromClientSide.java:4248)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 {noformat}
 It could be because another thread is still accessing the cache.



[jira] [Commented] (HBASE-6715) TestFromClientSide.testCacheOnWriteEvictOnClose is flaky

2012-09-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13449266#comment-13449266
 ] 

stack commented on HBASE-6715:
--

Is this a fix or more debug to find why the fail?  I'm +1 on commit in either 
case.

 TestFromClientSide.testCacheOnWriteEvictOnClose is flaky
 

 Key: HBASE-6715
 URL: https://issues.apache.org/jira/browse/HBASE-6715
 Project: HBase
  Issue Type: Test
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: trunk-6715.patch


 Occasionally, this test fails:
 {noformat}
 expected:<2049> but was:<2069>
 Stacktrace
 java.lang.AssertionError: expected:<2049> but was:<2069>
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at org.junit.Assert.assertEquals(Assert.java:456)
 at 
 org.apache.hadoop.hbase.client.TestFromClientSide.testCacheOnWriteEvictOnClose(TestFromClientSide.java:4248)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 {noformat}
 It could be because another thread is still accessing the cache.


