[jira] [Updated] (HBASE-6316) Confirm can upgrade to 0.96 from 0.94 by just stopping and restarting

2012-10-03 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6316:
-

Attachment: 6316.txt

Here is a fix for the failed parse of Reference files.

 Confirm can upgrade to 0.96 from 0.94 by just stopping and restarting
 -

 Key: HBASE-6316
 URL: https://issues.apache.org/jira/browse/HBASE-6316
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 6316.txt


 Over in HBASE-6294, LarsH says you currently have to clear zk to get a 0.96 
 to start over data written by a 0.94.  Need to fix it so we don't have to do 
 this, i.e. so that leftover zk state gets auto-migrated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6316) Confirm can upgrade to 0.96 from 0.94 by just stopping and restarting

2012-10-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13468366#comment-13468366
 ] 

stack commented on HBASE-6316:
--

I also get this when I try to look at the UIs:

{code}
HTTP ERROR 500

Problem accessing /rs-status. Reason:

Unresolved compilation problems: 
The import org.apache.hadoop.hbase.tmpl cannot be resolved
RSStatusTmpl cannot be resolved to a type
RSStatusTmpl cannot be resolved to a type
Caused by:

java.lang.Error: Unresolved compilation problems: 
The import org.apache.hadoop.hbase.tmpl cannot be resolved
RSStatusTmpl cannot be resolved to a type
RSStatusTmpl cannot be resolved to a type

at 
org.apache.hadoop.hbase.regionserver.RSStatusServlet.init(RSStatusServlet.java:29)
...
{code}

 Confirm can upgrade to 0.96 from 0.94 by just stopping and restarting
 -

 Key: HBASE-6316
 URL: https://issues.apache.org/jira/browse/HBASE-6316
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 6316.txt




[jira] [Created] (HBASE-6931) Refine WAL interface

2012-10-03 Thread Flavio Junqueira (JIRA)
Flavio Junqueira created HBASE-6931:
---

 Summary: Refine WAL interface
 Key: HBASE-6931
 URL: https://issues.apache.org/jira/browse/HBASE-6931
 Project: HBase
  Issue Type: Improvement
Reporter: Flavio Junqueira


We have transformed HLog into an interface and created FSHLog to contain the 
current implementation of HLog in HBASE-5937. In that patch, we have 
essentially exposed the public methods, moved method implementations to FSHLog, 
created a factory for HLog, and moved static methods to HLogUtil. 

In this umbrella jira, the idea is to refine the WAL interface, making it not 
dependent upon a file system as it is currently. The high-level idea is to 
revisit the methods in HLog and HLogUtil and come up with an interface that can 
accommodate other backends, such as BookKeeper. Another major task here is to 
decide what to do with the splitter.
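
As a rough illustration of what "not dependent upon a file system" could mean, the sketch below defines a log contract that mentions no paths or files. Every name in it (`WriteAheadLog`, `InMemoryWal`, `append`, `sync`) is hypothetical and is not the HLog or HLogUtil API; the in-memory backend only stands in for a real one such as FSHLog or a BookKeeper-based log.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical names, not the HBase HLog API. */
final class WalSketch {
  /** Backend-neutral contract: no Path, FileSystem, or file names appear here. */
  interface WriteAheadLog extends AutoCloseable {
    /** Appends an edit and returns its sequence number. */
    long append(byte[] edit);

    /** Marks every edit up to and including the given sequence number as durable. */
    void sync(long upToSequence);

    @Override
    void close();
  }

  /** Toy backend that keeps edits in memory. */
  static final class InMemoryWal implements WriteAheadLog {
    private final List<byte[]> edits = new ArrayList<>();
    private long synced = -1;
    private boolean closed;

    @Override
    public synchronized long append(byte[] edit) {
      if (closed) {
        throw new IllegalStateException("WAL is closed");
      }
      edits.add(edit.clone());
      return edits.size() - 1;
    }

    @Override
    public synchronized void sync(long upToSequence) {
      if (upToSequence >= edits.size()) {
        throw new IllegalArgumentException("unknown sequence " + upToSequence);
      }
      synced = Math.max(synced, upToSequence);
    }

    @Override
    public synchronized void close() {
      closed = true;
    }

    synchronized long lastSynced() {
      return synced;
    }
  }

  public static void main(String[] args) {
    InMemoryWal wal = new InMemoryWal();
    long seq = wal.append("put row1".getBytes());
    wal.sync(seq);
    System.out.println("last synced sequence: " + wal.lastSynced());
    wal.close();
  }
}
```

Callers written against such an interface would not need to change when the backend does, which is the property the umbrella issue is after.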



[jira] [Created] (HBASE-6933) Revisit methods of HLogUtil

2012-10-03 Thread Flavio Junqueira (JIRA)
Flavio Junqueira created HBASE-6933:
---

 Summary: Revisit methods of HLogUtil
 Key: HBASE-6933
 URL: https://issues.apache.org/jira/browse/HBASE-6933
 Project: HBase
  Issue Type: Sub-task
Reporter: Flavio Junqueira






[jira] [Created] (HBASE-6932) Revisit methods of HLog

2012-10-03 Thread Flavio Junqueira (JIRA)
Flavio Junqueira created HBASE-6932:
---

 Summary: Revisit methods of HLog
 Key: HBASE-6932
 URL: https://issues.apache.org/jira/browse/HBASE-6932
 Project: HBase
  Issue Type: Sub-task
Reporter: Flavio Junqueira






[jira] [Created] (HBASE-6934) Revisit methods of HLogMetrics

2012-10-03 Thread Flavio Junqueira (JIRA)
Flavio Junqueira created HBASE-6934:
---

 Summary: Revisit methods of HLogMetrics
 Key: HBASE-6934
 URL: https://issues.apache.org/jira/browse/HBASE-6934
 Project: HBase
  Issue Type: Sub-task
Reporter: Flavio Junqueira






[jira] [Created] (HBASE-6935) Rename HLog interface to WAL

2012-10-03 Thread Flavio Junqueira (JIRA)
Flavio Junqueira created HBASE-6935:
---

 Summary: Rename HLog interface to WAL
 Key: HBASE-6935
 URL: https://issues.apache.org/jira/browse/HBASE-6935
 Project: HBase
  Issue Type: Sub-task
Reporter: Flavio Junqueira






[jira] [Created] (HBASE-6936) Remove splitter from the wal interface

2012-10-03 Thread Flavio Junqueira (JIRA)
Flavio Junqueira created HBASE-6936:
---

 Summary: Remove splitter from the wal interface
 Key: HBASE-6936
 URL: https://issues.apache.org/jira/browse/HBASE-6936
 Project: HBase
  Issue Type: Sub-task
Reporter: Flavio Junqueira






[jira] [Updated] (HBASE-6937) Remove synchronization around closeLogSyncer (findbugs warning)

2012-10-03 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated HBASE-6937:


Priority: Minor  (was: Major)

 Remove synchronization around closeLogSyncer (findbugs warning)
 ---

 Key: HBASE-6937
 URL: https://issues.apache.org/jira/browse/HBASE-6937
 Project: HBase
  Issue Type: Sub-task
Reporter: Flavio Junqueira
Priority: Minor





[jira] [Created] (HBASE-6937) Remove synchronization around closeLogSyncer (findbugs warning)

2012-10-03 Thread Flavio Junqueira (JIRA)
Flavio Junqueira created HBASE-6937:
---

 Summary: Remove synchronization around closeLogSyncer (findbugs 
warning)
 Key: HBASE-6937
 URL: https://issues.apache.org/jira/browse/HBASE-6937
 Project: HBase
  Issue Type: Sub-task
Reporter: Flavio Junqueira






[jira] [Created] (HBASE-6938) Move resetLogReaderClass to TestHLogSplit

2012-10-03 Thread Flavio Junqueira (JIRA)
Flavio Junqueira created HBASE-6938:
---

 Summary: Move resetLogReaderClass to TestHLogSplit
 Key: HBASE-6938
 URL: https://issues.apache.org/jira/browse/HBASE-6938
 Project: HBase
  Issue Type: Sub-task
Reporter: Flavio Junqueira






[jira] [Updated] (HBASE-6938) Move resetLogReaderClass to TestHLogSplit

2012-10-03 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated HBASE-6938:


Priority: Minor  (was: Major)

 Move resetLogReaderClass to TestHLogSplit
 -

 Key: HBASE-6938
 URL: https://issues.apache.org/jira/browse/HBASE-6938
 Project: HBase
  Issue Type: Sub-task
Reporter: Flavio Junqueira
Priority: Minor





[jira] [Commented] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13468439#comment-13468439
 ] 

ramkrishna.s.vasudevan commented on HBASE-6912:
---

Lazy seeking scenarios will be broken, right Lars?
bq. I have a patch, which fixes RowFilter.
Can you upload this patch?

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.3, 0.96.0

 Attachments: minimalTest.java


 Steps to reproduce:
 Create a table, load data into it. Flush the table.
 Do a scan with
 1. Some filter which should not match the first entry in the scan
 2. Where one specifies a family and column.
 You will notice that the first entry is returned even though it doesn't match 
 the filter.
 It looks like the first KeyValue of a scan in the column, from the point of 
 view of this code in
 HRegion.java
 {code}
 } else if (kv != null && !kv.isInternal() && filterRowKey(currentRow)) {
 {code}
 is generated by
 {code}
 public static KeyValue createLastOnRow(final byte [] row,
     final int roffset, final int rlength, final byte [] family,
     final int foffset, final int flength, final byte [] qualifier,
     final int qoffset, final int qlength) {
   return new KeyValue(row, roffset, rlength, family, foffset, flength,
       qualifier, qoffset, qlength, HConstants.OLDEST_TIMESTAMP,
       Type.Minimum, null, 0, 0);
 }
 {code}
 So it is always internal at that point in the code.
 Only later, from within
 StoreScanner.java
 {code}
 public synchronized boolean next(List<KeyValue> outResult, int limit,
     String metric) throws IOException {
 
 LOOP: while((kv = this.heap.peek()) != null) {
 {code}
 (the second time through) do we get the actual kv, with a proper type and 
 timestamp. This seems to mess with filtering.
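
The effect described above can be modelled without HBase at all. The sketch below is a simplified reading of the report, not the real code path: `Kv`, `filterRowKey`, and `scan` are hypothetical stand-ins, and the "internal" flag plays the role of the fake key produced by createLastOnRow. It shows how a row whose first peeked key is internal bypasses the row-key filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified model of the quoted check; none of these types are the real HBase classes. */
final class FilterSkipSketch {
  /** Stand-in for KeyValue: only the row and the "internal" (fake seek key) flag matter. */
  static final class Kv {
    final String row;
    final boolean internal;

    Kv(String row, boolean internal) {
      this.row = row;
      this.internal = internal;
    }
  }

  /** Stand-in for a row filter: returns true when the row should be filtered out. */
  static boolean filterRowKey(String row) {
    return !row.equals("row2");
  }

  /** Returns the rows emitted when each row's filter check sees the given peeked key. */
  static List<String> scan(List<Kv> peeked) {
    List<String> out = new ArrayList<>();
    for (Kv kv : peeked) {
      // Same shape as the quoted condition (null check omitted):
      // an internal kv never reaches filterRowKey.
      if (!kv.internal && filterRowKey(kv.row)) {
        continue; // row filtered out
      }
      out.add(kv.row);
    }
    return out;
  }

  public static void main(String[] args) {
    // The first peek yields a fake internal key; later peeks yield real cells.
    List<Kv> peeked = Arrays.asList(
        new Kv("row1", true), new Kv("row2", false), new Kv("row3", false));
    System.out.println(scan(peeked)); // row1 slips through although the filter rejects it
  }
}
```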



[jira] [Commented] (HBASE-6928) TestStoreFile sometimes fails with 'Column family prefix used twice'

2012-10-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13468497#comment-13468497
 ] 

Hudson commented on HBASE-6928:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #205 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/205/])
HBASE-6928 TestStoreFile sometimes fails with 'Column family prefix used 
twice' (Revision 1393284)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java


 TestStoreFile sometimes fails with 'Column family prefix used twice'
 

 Key: HBASE-6928
 URL: https://issues.apache.org/jira/browse/HBASE-6928
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
 Attachments: 6928-debug.txt


 In build #3406, I saw:
 {code}
 java.lang.AssertionError: Column family prefix used twice: 
 cf.cf.bt.Data.fsReadnumops
   at 
 org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.validateMetricChanges(SchemaMetrics.java:822)
   at 
 org.apache.hadoop.hbase.regionserver.TestStoreFile.tearDown(TestStoreFile.java:89)
 {code}



[jira] [Commented] (HBASE-6388) Avoid potential data loss if the flush fails during regionserver shutdown

2012-10-03 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13468532#comment-13468532
 ] 

ramkrishna.s.vasudevan commented on HBASE-6388:
---

{code}
if (!this.killed && this.fsOk) {
  waitOnAllRegionsToClose(abortRequested);
  LOG.info("stopping server " + this.serverNameFromMasterPOV +
    "; all regions closed.");
}

//fsOk flag may be changed when closing regions throws exception.
if (!this.killed && this.fsOk) {
  closeWAL(abortRequested ? false : true);
}
{code}
I think the WAL closing is fine, but the closing is not done in parallel here.  
Do we need to address parallelizing the closes separately then? What do you 
think, Stack?
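
For discussion, here is a minimal sketch of what parallel closing could look like, with the result deciding whether it is safe to remove the logs. It is not taken from the attached patch: `Region`, `closeAll`, and the thread count are hypothetical, and a real region server would have more state to handle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: hypothetical names, not the HRegionServer code. */
final class ParallelCloseSketch {
  /** Stand-in for a region whose close() may fail, for example on a failed flush. */
  interface Region {
    void close() throws Exception;
  }

  /** Closes all regions on a small pool; returns true only if every close succeeded. */
  static boolean closeAll(List<Region> regions, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Void>> results = new ArrayList<>();
      for (Region r : regions) {
        results.add(pool.submit((Callable<Void>) () -> {
          r.close();
          return null;
        }));
      }
      boolean allClosed = true;
      for (Future<Void> f : results) {
        try {
          f.get();
        } catch (ExecutionException e) {
          allClosed = false; // remember the failure but keep waiting on the others
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // keep the interrupt visible to callers
          return false;
        }
      }
      return allClosed;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<Region> regions = new ArrayList<>();
    regions.add(() -> { });
    regions.add(() -> { throw new java.io.IOException("flush failed"); });
    boolean safeToDeleteLogs = closeAll(regions, 2);
    System.out.println("delete WAL files: " + safeToDeleteLogs);
  }
}
```

The point of returning a single boolean is that the caller can keep the logs whenever any close failed, which is the behaviour this issue asks for.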


 Avoid potential data loss if the flush fails during regionserver shutdown
 -

 Key: HBASE-6388
 URL: https://issues.apache.org/jira/browse/HBASE-6388
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Critical
 Fix For: 0.96.0

 Attachments: 
 0001-HBASE-6388-89-fb-parallelize-close-and-avoid-deletin.patch


 During a controlled shutdown, Regionserver deletes HLogs even if 
 HRegion.close() fails. We should not be doing this.



[jira] [Commented] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13468593#comment-13468593
 ] 

Lars Hofhansl commented on HBASE-6912:
--

I am inclined to revert HBASE-6562 for now, and add Alex' test at the same time 
to guard against this in the future.


 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.3, 0.96.0

 Attachments: minimalTest.java




[jira] [Commented] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13468595#comment-13468595
 ] 

Ted Yu commented on HBASE-6912:
---

+1 to Lars' plan.

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.3, 0.96.0

 Attachments: minimalTest.java




[jira] [Created] (HBASE-6939) Add the possibility to set the ZK port in HBaseTestingUtility

2012-10-03 Thread nkeywal (JIRA)
nkeywal created HBASE-6939:
--

 Summary: Add the possibility to set the ZK port in 
HBaseTestingUtility
 Key: HBASE-6939
 URL: https://issues.apache.org/jira/browse/HBASE-6939
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial


It's useful when embedding the HBaseTestingUtility into another test server: 
fixing the ZK port makes it simple to put it into a shared instance.



[jira] [Commented] (HBASE-6738) Too aggressive task resubmission from the distributed log manager

2012-10-03 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13468611#comment-13468611
 ] 

nkeywal commented on HBASE-6738:


Committed revision 1393537.

 Too aggressive task resubmission from the distributed log manager
 -

 Key: HBASE-6738
 URL: https://issues.apache.org/jira/browse/HBASE-6738
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.1, 0.96.0
 Environment: 3 nodes cluster test, but can occur as well on a much 
 bigger one. It's all luck!
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6738.v1.patch


 With default settings for hbase.splitlog.manager.timeout = 25s and 
 hbase.splitlog.max.resubmit = 3.
 On the tests mentioned in HBASE-5843, I see variations around this scenario, 
 0.94 + HDFS 1.0.3:
 The regionserver in charge of the split does not answer in less than 25s, so 
 it gets interrupted but actually continues. Sometimes we run out of retries, 
 sometimes not; sometimes we're out of retries but, as the interrupts were 
 ignored, we finish nicely. In the meantime, the same single task is executed 
 in parallel by multiple nodes, increasing the probability of race 
 conditions.
 Details:
 t0: unplug a box with DN+RS
 t + x: other boxes are already connected, so their connections start to die. 
 Nevertheless, they don't consider this node as suspect.
 t + 180s: zookeeper - master detects the node as dead. Recovery starts. It 
 can be less than 180s; sometimes it is around 150s.
 t + 180s: distributed split starts. There is only 1 task; it's immediately 
 acquired by one RS.
 t + 205s: the RS has multiple errors when splitting, because a datanode is 
 missing as well. The master decides to give the task to someone else. But 
 often the task continues in the first RS. Interrupts are often ignored, as 
 is well stated in the code (// TODO interrupt often gets swallowed, do 
 what else?)
 {code}
2012-09-04 18:27:30,404 INFO 
 org.apache.hadoop.hbase.regionserver.SplitLogWorker: Sending interrupt to 
 stop the worker thread
 {code}
 t + 211s: two regionservers are processing the same task. They fight for the 
 leases:
 {code}
 2012-09-04 18:27:32,004 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
 Exception: org.apache.hadoop.ipc.RemoteException:  
 org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: Lease mismatch 
 on

 /hbase/TABLE/4d1c1a4695b1df8c58d13382b834332e/recovered.edits/037.temp
  owned by DFSClient_hb_rs_BOX2,60020,1346775882980 but is accessed by 
 DFSClient_hb_rs_BOX1,60020,1346775719125
 {code}
  They can fight like this for many files, until the tasks finally get 
 interrupted or finished.
  The task on the second box can be cancelled as well. In this case, the 
 task is created again for a new box.
  The master seems to stop after 3 attempts. It can also give up on 
 splitting the files. Sometimes the tasks were not cancelled on the RS side, so 
 the split is finished despite what the master thinks and logs. In this case, 
 the assignment starts. In the other case, we've got a problem.
 {code}
 2012-09-04 18:43:52,724 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 Skipping resubmissions of task 
 /hbase/splitlog/hdfs%3A%2F%2FBOX1%3A9000%2Fhbase%2F.logs%2FBOX0%2C60020%2C1346776587640-splitting%2FBOX0%252C60020%252C1346776587640.1346776587832
  because threshold 3 reached 
 {code}
 t + 300s: split is finished. Assignment starts
 t + 330s: assignment is finished, regions are available again.
 There are a lot of subcases possible depending on the number of log files, 
 region servers and so on.
 The issues are:
 1) it's difficult, especially in HBase but not only, to interrupt a task. The 
 pattern is often
 {code}
 void f() throws IOException {
   try {
     // whatever throws InterruptedException
   } catch (InterruptedException e) {
     throw new InterruptedIOException();
   }
 }
 boolean g() {
   int nbRetry = 0;
   for (;;) {
     try {
       f();
       return true;
     } catch (IOException e) {
       nbRetry++;
       if (nbRetry > maxRetry) return false;
     }
   }
 }
 {code}
 This typically swallows the interrupt. There are other variations, but this 
 one seems to be the standard.
 Even if we fix this in HBase, we need the other layers to be interruptible as 
 well. That's not proven.
 2) 25s is very aggressive, considering that we have a default timeout of 180s 
 for zookeeper. In other words, we give 180s to a regionserver before acting, 
 but when it comes to split, it's 25s only. There may be reasons for this, but 
 it seems dangerous, as during a 
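
For issue 1) above, one possible shape of a retry loop that does not swallow the interrupt is sketched here. It is an assumption about how the pattern could be changed, not the committed fix: `IoTask` and `runWithRetries` are hypothetical names, and the loop simply treats InterruptedIOException as a stop signal and restores the thread's interrupt flag.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

/** Illustrative only: hypothetical names, not code from the patch. */
final class InterruptAwareRetry {
  /** Unit of work that may fail with an IOException. */
  interface IoTask {
    void run() throws IOException;
  }

  /**
   * Retries on ordinary IOExceptions but stops at once on InterruptedIOException,
   * restoring the interrupt flag so callers can see the interruption.
   */
  static boolean runWithRetries(IoTask task, int maxRetry) {
    int nbRetry = 0;
    for (;;) {
      try {
        task.run();
        return true;
      } catch (InterruptedIOException e) {
        Thread.currentThread().interrupt(); // do not swallow the interrupt
        return false;
      } catch (IOException e) {
        nbRetry++;
        if (nbRetry > maxRetry) {
          return false;
        }
      }
    }
  }

  public static void main(String[] args) {
    int[] calls = {0};
    boolean done = runWithRetries(() -> {
      calls[0]++;
      throw new InterruptedIOException("interrupted while splitting");
    }, 3);
    System.out.println("done=" + done + " calls=" + calls[0]
        + " interrupted=" + Thread.interrupted());
  }
}
```

As the description notes, this only helps if the layers underneath also surface interrupts instead of ignoring them.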

[jira] [Updated] (HBASE-6738) Too aggressive task resubmission from the distributed log manager

2012-10-03 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6738:
---

  Resolution: Fixed
Release Note: 
The Split Log Manager now takes into account the state of the region server 
doing the split. If this region server is marked as dead (i.e. its ZooKeeper 
connection expires), its task is immediately resubmitted. If the region server 
is still in the alive state, then we wait for 2 minutes before resubmitting, 
instead of 25 seconds previously. This delay can be changed with the parameter 
hbase.splitlog.manager.timeout (milliseconds, new default since 0.96: 120000).


  was:
The Split Log Manager now takes into account the state of the region server 
doing the split. If this region server is marked as dead (i.e. its ZooKeeper 
connection expires), its task is immediately resubmitted. If the region server 
is still in the alive state, then we wait for 2 minutes before resubmitting, 
instead of 25 seconds previously. This delay can be changed with the parameter 
hbase.splitlog.manager.timeout (milliseconds, new default: 120000).


  Status: Resolved  (was: Patch Available)
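
The rule in the release note can be written as a small decision function. This is only a sketch of the stated behaviour, not the SplitLogManager code: `shouldResubmit`, `workerIsDead`, and `elapsedMs` are hypothetical names, and the 2-minute wait from the note is expressed as 120,000 ms.

```java
/** Illustrative only: hypothetical names, not the SplitLogManager implementation. */
final class ResubmitPolicySketch {
  /** Wait for a live worker: 2 minutes, as stated in the release note. */
  static final long DEFAULT_TIMEOUT_MS = 120_000L;

  /**
   * Decides whether a split task should be handed to another worker.
   *
   * @param workerIsDead true if the worker's ZooKeeper connection has expired
   * @param elapsedMs    time since the task was last heard from, in milliseconds
   * @param timeoutMs    configured wait for a live worker, in milliseconds
   */
  static boolean shouldResubmit(boolean workerIsDead, long elapsedMs, long timeoutMs) {
    if (workerIsDead) {
      return true; // dead worker: resubmit immediately
    }
    return elapsedMs > timeoutMs; // live worker: wait out the timeout first
  }

  public static void main(String[] args) {
    System.out.println(shouldResubmit(true, 1_000L, DEFAULT_TIMEOUT_MS));
    System.out.println(shouldResubmit(false, 30_000L, DEFAULT_TIMEOUT_MS));
    System.out.println(shouldResubmit(false, 121_000L, DEFAULT_TIMEOUT_MS));
  }
}
```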

 Too aggressive task resubmission from the distributed log manager
 -

 Key: HBASE-6738
 URL: https://issues.apache.org/jira/browse/HBASE-6738
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.1, 0.96.0
 Environment: 3 nodes cluster test, but can occur as well on a much 
 bigger one. It's all luck!
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6738.v1.patch



[jira] [Commented] (HBASE-6738) Too aggressive task resubmission from the distributed log manager

2012-10-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13468642#comment-13468642
 ] 

Hudson commented on HBASE-6738:
---

Integrated in HBase-TRUNK #3413 (See 
[https://builds.apache.org/job/HBase-TRUNK/3413/])
HBASE-6738  Too aggressive task resubmission from the distributed log 
manager (Revision 1393537)

 Result = FAILURE
nkeywal : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java


 Too aggressive task resubmission from the distributed log manager
 -

 Key: HBASE-6738
 URL: https://issues.apache.org/jira/browse/HBASE-6738
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.1, 0.96.0
 Environment: 3 nodes cluster test, but can occur as well on a much 
 bigger one. It's all luck!
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6738.v1.patch


 With default settings for hbase.splitlog.manager.timeout = 25s and 
 hbase.splitlog.max.resubmit = 3.
 On tests mentionned on HBASE-5843, I have variations around this scenario, 
 0.94 + HDFS 1.0.3:
 The regionserver in charge of the split does not answer in less than 25s, so 
 it gets interrupted but actually continues. Sometimes, we go out of the 
 number of retry, sometimes not, sometimes we're out of retry, but the as the 
 interrupts were ignored we finish nicely. In the mean time, the same single 
 task is executed in parallel by multiple nodes, increasing the probability to 
 get into race conditions.
 Details:
 t0: unplug a box with DN+RS
 t + x: other boxes are already connected, to their connection starts to dies. 
 Nevertheless, they don't consider this node as suspect.
 t + 180s: zookeeper - master detects the node as dead. recovery start. It 
 can be less than 180s sometimes it around 150s.
 t + 180s: distributed split starts. There is only 1 task, it's immediately 
 acquired by a one RS.
 t + 205s: the RS has multiple errors when splitting, because a datanode is 
 missing as well. The master decides to give the task to someone else. But 
 often the task continues in the first RS. Interrupts are often ignored, as 
 it's well stated in the code (// TODO interrupt often gets swallowed, do 
 what else?)
 {code}
2012-09-04 18:27:30,404 INFO 
 org.apache.hadoop.hbase.regionserver.SplitLogWorker: Sending interrupt to 
 stop the worker thread
 {code}
 t + 211s: two regionservers are processing the same task. They fight for the 
 leases:
 {code}
 2012-09-04 18:27:32,004 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
 Exception: org.apache.hadoop.ipc.RemoteException:  
 org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: Lease mismatch 
 on

 /hbase/TABLE/4d1c1a4695b1df8c58d13382b834332e/recovered.edits/037.temp
  owned by DFSClient_hb_rs_BOX2,60020,1346775882980 but is accessed by 
 DFSClient_hb_rs_BOX1,60020,1346775719125
 {code}
  They can fight like this over many files, until the tasks finally get 
 interrupted or finish.
  The task on the second box can be cancelled as well. In this case, the 
 task is created again for a new box.
  The master seems to stop after 3 attempts. It can also give up on 
 splitting the files. Sometimes the tasks were not cancelled on the RS side, so 
 the split finishes despite what the master thinks and logs. In this case, 
 the assignment starts. In the other case, we've got a problem.
 {code}
 2012-09-04 18:43:52,724 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 Skipping resubmissions of task 
 /hbase/splitlog/hdfs%3A%2F%2FBOX1%3A9000%2Fhbase%2F.logs%2FBOX0%2C60020%2C1346776587640-splitting%2FBOX0%252C60020%252C1346776587640.1346776587832
  because threshold 3 reached 
 {code}
 t + 300s: split is finished. Assignment starts.
 t + 330s: assignment is finished, regions are available again.
 There are a lot of possible subcases, depending on the number of log files, 
 of region servers and so on.
 The issues are:
 1) It's difficult, especially in HBase but not only there, to interrupt a 
 task. The pattern is often:
 {code}
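 void f() throws IOException {
   try {
     // whatever throws InterruptedException
   } catch (InterruptedException e) {
     // the interrupt is converted into an IOException subclass
     throw new InterruptedIOException();
   }
 }

 boolean g() {
   int nbRetry = 0;
   for (;;) {
     try {
       f();
       return true;
     } catch (IOException e) {
       // InterruptedIOException lands here too, so an interrupt is
       // treated as just another failed attempt and retried
       nbRetry++;
       if (nbRetry > maxRetry) return false;
     }
   }
 }
 {code}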
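
A minimal sketch of one way out of the retry pattern quoted above, assuming the caller is allowed to stop on interrupt: catch InterruptedIOException before the generic IOException, restore the thread's interrupt flag, and leave the loop. The class name, the Task interface and the Outcome enum are hypothetical names introduced for illustration; none of this is HBase code or the attached patch.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

public class InterruptAwareRetry {

    /** Hypothetical unit of work, standing in for f() in the pattern above. */
    public interface Task {
        void run() throws IOException;
    }

    /** How a retry loop ended; the names are illustrative only. */
    public enum Outcome { DONE, RETRIES_EXHAUSTED, INTERRUPTED }

    /** Converts an interrupt into an InterruptedIOException, as f() does. */
    public static void interruptedTask() throws IOException {
        try {
            throw new InterruptedException("simulated interrupt");
        } catch (InterruptedException e) {
            throw new InterruptedIOException(e.getMessage());
        }
    }

    /** Retries on ordinary IOExceptions but stops as soon as an interrupt is seen. */
    public static Outcome runWithRetries(Task task, int maxRetry) {
        int nbRetry = 0;
        for (;;) {
            try {
                task.run();
                return Outcome.DONE;
            } catch (InterruptedIOException e) {
                // Caught before IOException: restore the flag and stop retrying.
                Thread.currentThread().interrupt();
                return Outcome.INTERRUPTED;
            } catch (IOException e) {
                nbRetry++;
                if (nbRetry > maxRetry) {
                    return Outcome.RETRIES_EXHAUSTED;
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithRetries(() -> { }, 3));
        System.out.println(runWithRetries(() -> { throw new IOException("disk error"); }, 3));
        System.out.println(runWithRetries(InterruptAwareRetry::interruptedTask, 3));
        Thread.interrupted(); // clear the flag set by the last call
    }
}
```

One caveat on this design: java.net.SocketTimeoutException is itself a subclass of InterruptedIOException, so real code that wants to keep retrying on socket timeouts would need to tell the two apart before giving up.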
 

[jira] [Updated] (HBASE-6939) Add the possibility to set the ZK port in HBaseTestingUtility

2012-10-03 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6939:
---

Attachment: 6939.v1.patch

 Add the possibility to set the ZK port in HBaseTestingUtility
 -

 Key: HBASE-6939
 URL: https://issues.apache.org/jira/browse/HBASE-6939
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 6939.v1.patch


 It's useful when embedding HBaseTestingUtility in another test server: 
 fixing the ZK port makes it simple to put it into a shared instance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6939) Add the possibility to set the ZK port in HBaseTestingUtility

2012-10-03 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6939:
---

Status: Patch Available  (was: Open)

 Add the possibility to set the ZK port in HBaseTestingUtility
 -

 Key: HBASE-6939
 URL: https://issues.apache.org/jira/browse/HBASE-6939
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 6939.v1.patch




[jira] [Updated] (HBASE-6939) Add the possibility to set the ZK port in HBaseTestingUtility

2012-10-03 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6939:
---

Attachment: 6939.094.v1.patch

 Add the possibility to set the ZK port in HBaseTestingUtility
 -

 Key: HBASE-6939
 URL: https://issues.apache.org/jira/browse/HBASE-6939
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 6939.094.v1.patch, 6939.v1.patch




[jira] [Commented] (HBASE-6939) Add the possibility to set the ZK port in HBaseTestingUtility

2012-10-03 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468660#comment-13468660
 ] 

nkeywal commented on HBASE-6939:


There's one patch for trunk and one for 0.94...

 Add the possibility to set the ZK port in HBaseTestingUtility
 -

 Key: HBASE-6939
 URL: https://issues.apache.org/jira/browse/HBASE-6939
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 6939.094.v1.patch, 6939.v1.patch




[jira] [Commented] (HBASE-6939) Add the possibility to set the ZK port in HBaseTestingUtility

2012-10-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468664#comment-13468664
 ] 

Hadoop QA commented on HBASE-6939:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12547552/6939.094.v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2992//console

This message is automatically generated.

 Add the possibility to set the ZK port in HBaseTestingUtility
 -

 Key: HBASE-6939
 URL: https://issues.apache.org/jira/browse/HBASE-6939
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 6939.094.v1.patch, 6939.v1.patch




[jira] [Commented] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468663#comment-13468663
 ] 

ramkrishna.s.vasudevan commented on HBASE-6912:
---

OK Lars, sounds good. Alex's test needs a few small changes, such as starting 
and stopping the mini cluster.

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.3, 0.96.0

 Attachments: minimalTest.java


 Steps to reproduce:
 Create a table, load data into it. Flush the table.
 Do a scan with
 1. Some filter which should not match the first entry in the scan
 2. Where one specifies a family and column.
 You will notice that the first entry is returned even though it doesn't match 
 the filter.
 It looks like the first KeyValue of a scan in the column, from the point of 
 view of this code in
 HRegion.java
 {code}
 } else if (kv != null && !kv.isInternal() && filterRowKey(currentRow)) {
 {code}
 is generated by
 {code}
 public static KeyValue createLastOnRow(final byte [] row,
     final int roffset, final int rlength, final byte [] family,
     final int foffset, final int flength, final byte [] qualifier,
     final int qoffset, final int qlength) {
   return new KeyValue(row, roffset, rlength, family, foffset, flength,
       qualifier, qoffset, qlength,
       HConstants.OLDEST_TIMESTAMP, Type.Minimum, null, 0, 0);
 }
 {code}
 So it is always internal from that point of the code.
 Only later, from within
 StoreScanner.java
 {code}
 public synchronized boolean next(List<KeyValue> outResult, int limit, String 
 metric) throws IOException {
 
 LOOP: while((kv = this.heap.peek()) != null) {
 {code}
 (the second time through)
 do we get the actual kv, with a proper type and timestamp. This seems to mess 
 with filtering.
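
The short-circuit described above can be sketched with a toy model. This is not HBase code: Kv, filterRowKey and scan below are hypothetical stand-ins, and the only thing carried over from the report is the shape of the HRegion condition. When the first KV seen for a row is an internal (fake) one, the row filter is never consulted, so the row gets through.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FakeKvFilterSketch {

    /** Toy stand-in for KeyValue: a row key plus an "internal" (fake seek key) flag. */
    public static final class Kv {
        public final String row;
        public final boolean internal;

        public Kv(String row, boolean internal) {
            this.row = row;
            this.internal = internal;
        }

        public boolean isInternal() {
            return internal;
        }
    }

    /** Toy row filter: true means "filter this row out". Only rows starting with "keep" pass. */
    public static boolean filterRowKey(String row) {
        return !row.startsWith("keep");
    }

    /** Same shape as the HRegion condition quoted above. */
    public static boolean skipRow(Kv kv) {
        return kv != null && !kv.isInternal() && filterRowKey(kv.row);
    }

    /** Returns the rows that survive, given the first KV seen for each row. */
    public static List<String> scan(List<Kv> firstKvPerRow) {
        List<String> returned = new ArrayList<>();
        for (Kv kv : firstKvPerRow) {
            if (kv == null) {
                continue; // nothing to return for an empty slot
            }
            if (!skipRow(kv)) {
                returned.add(kv.row);
            }
        }
        return returned;
    }

    public static void main(String[] args) {
        List<Kv> rows = Arrays.asList(
            new Kv("drop-1", true),   // first row: the scanner still holds the fake seek key
            new Kv("drop-2", false),  // real KV: the row filter is applied
            new Kv("keep-3", false));
        System.out.println(scan(rows)); // prints [drop-1, keep-3]
    }
}
```

In this model "drop-1" is returned although the filter rejects it, which matches the symptom in the steps to reproduce: only the first entry of the scan slips past the filter.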



[jira] [Commented] (HBASE-6388) Avoid potential data loss if the flush fails during regionserver shutdown

2012-10-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468666#comment-13468666
 ] 

stack commented on HBASE-6388:
--

Then I must be thinking of another issue, Ram.  Do you find merit in this patch? 
 If so, let's forward port and get it in.  Thanks.

 Avoid potential data loss if the flush fails during regionserver shutdown
 -

 Key: HBASE-6388
 URL: https://issues.apache.org/jira/browse/HBASE-6388
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Critical
 Fix For: 0.96.0

 Attachments: 
 0001-HBASE-6388-89-fb-parallelize-close-and-avoid-deletin.patch


 During a controlled shutdown, Regionserver deletes HLogs even if 
 HRegion.close() fails. We should not be doing this.



[jira] [Updated] (HBASE-6939) Add the possibility to set the ZK port in HBaseTestingUtility

2012-10-03 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6939:
---

Attachment: 6939.v1.patch

 Add the possibility to set the ZK port in HBaseTestingUtility
 -

 Key: HBASE-6939
 URL: https://issues.apache.org/jira/browse/HBASE-6939
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 6939.094.v1.patch, 6939.v1.patch, 6939.v1.patch




[jira] [Updated] (HBASE-6939) Add the possibility to set the ZK port in HBaseTestingUtility

2012-10-03 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6939:
---

Status: Open  (was: Patch Available)

 Add the possibility to set the ZK port in HBaseTestingUtility
 -

 Key: HBASE-6939
 URL: https://issues.apache.org/jira/browse/HBASE-6939
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 6939.094.v1.patch, 6939.v1.patch, 6939.v1.patch




[jira] [Updated] (HBASE-6939) Add the possibility to set the ZK port in HBaseTestingUtility

2012-10-03 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6939:
---

Status: Patch Available  (was: Open)

 Add the possibility to set the ZK port in HBaseTestingUtility
 -

 Key: HBASE-6939
 URL: https://issues.apache.org/jira/browse/HBASE-6939
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 6939.094.v1.patch, 6939.v1.patch, 6939.v1.patch




[jira] [Commented] (HBASE-6939) Add the possibility to set the ZK port in HBaseTestingUtility

2012-10-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468691#comment-13468691
 ] 

Hadoop QA commented on HBASE-6939:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12547551/6939.v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
83 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2991//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2991//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2991//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2991//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2991//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2991//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2991//console

This message is automatically generated.

 Add the possibility to set the ZK port in HBaseTestingUtility
 -

 Key: HBASE-6939
 URL: https://issues.apache.org/jira/browse/HBASE-6939
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 6939.094.v1.patch, 6939.v1.patch, 6939.v1.patch




[jira] [Commented] (HBASE-6785) Convert AggregateProtocol to protobuf defined coprocessor service

2012-10-03 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468697#comment-13468697
 ] 

Ted Yu commented on HBASE-6785:
---

The castToReturnType(Object) signature and its javadoc aren't shown above.

FYI

 Convert AggregateProtocol to protobuf defined coprocessor service
 -

 Key: HBASE-6785
 URL: https://issues.apache.org/jira/browse/HBASE-6785
 Project: HBase
  Issue Type: Sub-task
  Components: Coprocessors
Reporter: Gary Helmling
Assignee: Devaraj Das
 Fix For: 0.96.0

 Attachments: Aggregate.proto, Aggregate.proto


 With coprocessor endpoints now exposed as protobuf defined services, we 
 should convert over all of our built-in endpoints to PB services.



[jira] [Commented] (HBASE-6939) Add the possibility to set the ZK port in HBaseTestingUtility

2012-10-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468717#comment-13468717
 ] 

Hadoop QA commented on HBASE-6939:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12547555/6939.v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
83 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestAtomicOperation

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2993//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2993//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2993//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2993//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2993//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2993//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2993//console

This message is automatically generated.

 Add the possibility to set the ZK port in HBaseTestingUtility
 -

 Key: HBASE-6939
 URL: https://issues.apache.org/jira/browse/HBASE-6939
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 6939.094.v1.patch, 6939.v1.patch, 6939.v1.patch




[jira] [Commented] (HBASE-6758) [replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file

2012-10-03 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468728#comment-13468728
 ] 

Jean-Daniel Cryans commented on HBASE-6758:
---

I really don't like that we have to pass down another instance of HRS (through 
RegionServerServices). The fact that we're now doing this:

{code}
-new Replication(this, this.fs, logdir, oldLogDir): null;
+new Replication(this, this.fs, logdir, oldLogDir, this): null;
{code}

is making me sad. Also it leaks all over the code. It seems to me that there 
should be another way to handle this just in ReplicationSource.

At the moment I'd be +1 for commit only to trunk, and on commit this logging 
will need to be cleaned up:

{code}
LOG.info("File " + getCurrentPath() + " in use");
{code}

Is that OK with you, [~devaraj]?

 [replication] The replication-executor should make sure the file that it is 
 replicating is closed before declaring success on that file
 ---

 Key: HBASE-6758
 URL: https://issues.apache.org/jira/browse/HBASE-6758
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6758-1-0.92.patch, 6758-2-0.92.patch, 
 6758-trunk-1.patch, 
 TEST-org.apache.hadoop.hbase.replication.TestReplication.xml


 I have seen cases where the replication-executor would lose data to replicate 
 since the file hasn't been closed yet. Upon closing, the new data becomes 
 visible. Before that happens the ZK node shouldn't be deleted in 
 ReplicationSourceManager.logPositionAndCleanOldLogs. Changes need to be made 
 in ReplicationSource.processEndOfFile as well (currentPath related).



[jira] [Commented] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468729#comment-13468729
 ] 

Lars Hofhansl commented on HBASE-6912:
--

Yeah, I modified the test and integrated it into TestFromClientSide.

OK... So. I'll reopen HBASE-6562, revert that change, and move to 0.94.3 or 
even 0.96.
As part of this jira I'll just commit Alex' test.

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.3, 0.96.0

 Attachments: minimalTest.java




[jira] [Reopened] (HBASE-6562) Fake KVs are sometimes passed to filters

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-6562:
--


See HBASE-6912. I am going to revert this change.

 Fake KVs are sometimes passed to filters
 

 Key: HBASE-6562
 URL: https://issues.apache.org/jira/browse/HBASE-6562
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.3, 0.96.0

 Attachments: 6562.txt, 6562-v2.txt, 6562-v3.txt, minimalTest.java


 In internal tests at Salesforce we found that fake row keys are sometimes 
 passed to filters (Filter.filterRowKey(...) specifically).
 The KVs are eventually filtered by the StoreScanner/ScanQueryMatcher, but the 
 row key is passed to filterRowKey in RegionScannerImpl *before* that happens.



[jira] [Updated] (HBASE-6562) Fake KVs are sometimes passed to filters

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6562:
-

Fix Version/s: (was: 0.94.2)
   0.94.3

 Fake KVs are sometimes passed to filters
 

 Key: HBASE-6562
 URL: https://issues.apache.org/jira/browse/HBASE-6562
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.3, 0.96.0

 Attachments: 6562.txt, 6562-v2.txt, 6562-v3.txt, minimalTest.java




[jira] [Updated] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6912:
-

Attachment: 6912-0.94.txt

0.94 revert of HBASE-6562, including Alex' test.
(Leaving isInternal() on KeyValue, though, because that is useful to have)

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.3, 0.96.0

 Attachments: 6912-0.94.txt, minimalTest.java




[jira] [Updated] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6912:
-

Attachment: 6912-0.96.txt

Same for 0.96.

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.3, 0.96.0

 Attachments: 6912-0.94.txt, minimalTest.java


 Steps to reproduce:
 Create a table, load data into it. Flush the table.
 Do a scan with
 1. Some filter which should not match the first entry in the scan
 2. Where one specifies a family and column.
 You will notice that the first entry is returned even though it doesn't match 
 the filter.
 It looks like the when the first KeyValue of a scan in the column from the 
 point of view of the code
 HRegion.java
 {code}
 } else if (kv != null  !kv.isInternal()  filterRowKey(currentRow)) {
 {code}
 Is generated by
 {code}
 public static KeyValue createLastOnRow(final byte [] row,
 final int roffset, final int rlength, final byte [] family,
 final int foffset, final int flength, final byte [] qualifier,
 final int qoffset, final int qlength) { return new KeyValue(row, roffset, 
 rlength, family, foffset, flength, qualifier, qoffset, qlength, 
 HConstants.OLDEST_TIMESTAMP, Type.Minimum, null, 0, 0); }
 {code}
 So it is always internal from that point of the code.
 Only later from within
 StoreScanner.java
 {code}
 public synchronized boolean next(ListKeyValue outResult, int limit, String 
 metric) throws IOException {
 
 LOOP: while((kv = this.heap.peek()) != null) {
 {code}
 ( The second time through)
 Do we get the actual kv, with a proper type and timestamp. This seems to mess 
 with filtering.



[jira] [Updated] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6912:
-

Attachment: (was: 6912-0.94.txt)

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.3, 0.96.0

 Attachments: 6912-0.94.txt, minimalTest.java





[jira] [Updated] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6912:
-

Attachment: 6912-0.94.txt

Right patch.

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.3, 0.96.0

 Attachments: 6912-0.94.txt, minimalTest.java





[jira] [Updated] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6912:
-

Attachment: (was: 6912-0.96.txt)

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.3, 0.96.0

 Attachments: 6912-0.94.txt, minimalTest.java





[jira] [Updated] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6912:
-

Fix Version/s: (was: 0.94.3)
   0.94.2
   Status: Patch Available  (was: Open)

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.2, 0.96.0

 Attachments: 6912-0.94.txt, 6912-0.96.txt, minimalTest.java





[jira] [Updated] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6912:
-

Attachment: 6912-0.96.txt

real 0.96 version

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.2, 0.96.0

 Attachments: 6912-0.94.txt, 6912-0.96.txt, minimalTest.java





[jira] [Created] (HBASE-6940) Enable GC logging by default

2012-10-03 Thread stack (JIRA)
stack created HBASE-6940:


 Summary: Enable GC logging by default
 Key: HBASE-6940
 URL: https://issues.apache.org/jira/browse/HBASE-6940
 Project: HBase
  Issue Type: Improvement
  Components: Admin
Reporter: stack
Priority: Critical
 Fix For: 0.96.0


I think we should enable GC logging by default.  It's pretty frictionless 
apparently and could help in the case where folks are getting off the ground.



[jira] [Updated] (HBASE-6871) HFileBlockIndex Write Error in HFile V2 due to incorrect split into intermediate index blocks

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6871:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 HFileBlockIndex Write Error in HFile V2 due to incorrect split into 
 intermediate index blocks
 -

 Key: HBASE-6871
 URL: https://issues.apache.org/jira/browse/HBASE-6871
 Project: HBase
  Issue Type: Bug
  Components: HFile
Affects Versions: 0.94.1
 Environment: redhat 5u4
Reporter: Fenng Wang
Assignee: Mikhail Bautin
Priority: Critical
 Fix For: 0.92.3, 0.94.2, 0.96.0

 Attachments: 428a400628ae412ca45d39fce15241fd.hfile, 
 6871.094.addendum2.txt, 6871.094.addendum.txt, 6871-0.94.txt, 
 6871-0.94v2.txt, 6871-hfile-index-0.92.txt, 6871-hfile-index-0.92-v2.txt, 
 6871.txt, 6871v2.txt, 787179746cc347ce9bb36f1989d17419.hfile, 
 960a026ca370464f84903ea58114bc75.hfile, 
 d0026fa8d59b4df291718f59dd145aad.hfile, D5703.1.patch, D5703.2.patch, 
 D5703.3.patch, D5703.4.patch, D5703.5.patch, hbase-6871-0.94.patch, 
 ImportHFile.java, test_hfile_block_index.sh


 After writing some data, compaction and scan operations both fail; the 
 exception message is below:
 2012-09-18 06:32:26,227 ERROR 
 org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: 
 Compaction failed 
 regionName=hfile_test,,1347778722498.d220df43fb9d8af4633bd7f547613f9e., 
 storeName=page_info, fileCount=7, fileSize=1.3m (188.0k, 188.0k, 188.0k, 
 188.0k, 188.0k, 185.8k, 223.3k), priority=9, 
 time=45826250816757428
 java.io.IOException: Could not reseek 
 StoreFileScanner[HFileScanner for reader 
 reader=hdfs://hadoopdev1.cm6:9000/hbase/hfile_test/d220df43fb9d8af4633bd7f547613f9e/page_info/b0f6118f58de47ad9d87cac438ee0895,
  compression=lzo, cacheConf=CacheConfig:enabled [cacheDataOnRead=true] 
 [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] 
 [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false], 
 firstKey=http://com.truereligionbrandjeans.www/Womens_Dresses/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Sweaters/pl/c/Womens_Sweaters/pl/c/Womens_Sweaters/pl/c/Womens_Sweaters/pl/c/Womens_Sweaters/pl/c/Womens_Sweaters/pl/c/Womens_Shirts/pl/c/Womens_Sweaters/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Sweaters/pl/c/Womens_Sweaters/pl/c/Womens_Shirts/pl/c/Womens_Shirts/pl/c/Womens_Sweaters/pl/c/Womens_Sweaters/pl/c/Womens_Sweaters/pl/c/Womens_Sweaters/pl/c/Womens_Sweaters/pl/c/4010.html/page_info:anchor_sig/1347764439449/DeleteColumn,
  lastKey=http://com.trura.www//page_info:page_type/1347763395089/Put, 
 avgKeyLen=776, avgValueLen=4, entries=12853, length=228611, 
 cur=http://com.truereligionbrandjeans.www/Womens_Exclusive_Details/pl/c/4970.html/page_info:is_deleted/1347764003865/Put/vlen=1/ts=0]
  to key 
 http://com.truereligionbrandjeans.www/Womens_Exclusive_Details/pl/c/4970.html/page_info:is_deleted/OLDEST_TIMESTAMP/Minimum/vlen=0/ts=0
 at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:178)
 
 at 
 org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
 
 at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:299)
 at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:244)
 
 at 
 org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:521)
 
 at 
 org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:402)
 at 
 org.apache.hadoop.hbase.regionserver.Store.compactStore(Store.java:1570)  
   
 at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:997) 

 at 
 org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1216)
 at 
 org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest.run(CompactionRequest.java:250)
 
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.IOException: Expected block type LEAF_INDEX, but got 
 INTERMEDIATE_INDEX: blockType=INTERMEDIATE_INDEX, 
 onDiskSizeWithoutHeader=8514, uncompressedSizeWithoutHeader=131837, 
 prevBlockOffset=-1, 
 

[jira] [Updated] (HBASE-6906) TestHBaseFsck#testQuarantine* tests are flakey due to TableNotEnabledException

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6906:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 TestHBaseFsck#testQuarantine* tests are flakey due to TableNotEnabledException
 --

 Key: HBASE-6906
 URL: https://issues.apache.org/jira/browse/HBASE-6906
 Project: HBase
  Issue Type: Bug
  Components: hbck, test
Affects Versions: 0.92.3, 0.94.2, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Fix For: 0.92.3, 0.94.2, 0.96.0

 Attachments: hbase-6906-94.patch, hbase-6906.patch


 This test fails periodically (about 1 out of 10 times) on our internal Jenkins 
 instance.
 {code}
 FAILED TESTS
 
 1 tests failed.
 REGRESSION: 
 org.apache.hadoop.hbase.util.TestHBaseFsck.testQuarantineMissingRegionDir
 Error Message:
 org.apache.hadoop.hbase.TableNotEnabledException: 
 testQuarantineMissingRegionDir at 
 org.apache.hadoop.hbase.master.handler.DisableTableHandler.init(DisableTableHandler.java:75)
  at org.apache.hadoop.hbase.master.HMaster.disableTable(HMaster.java:1170) at 
 sun.reflect.GeneratedMethodAccessor68.invoke(Unknown Source) at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597) at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
  at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1345)
 Stack Trace:
 org.apache.hadoop.hbase.TableNotEnabledException: 
 org.apache.hadoop.hbase.TableNotEnabledException: 
 testQuarantineMissingRegionDir
 at 
 org.apache.hadoop.hbase.master.handler.DisableTableHandler.init(DisableTableHandler.java:75)
 at org.apache.hadoop.hbase.master.HMaster.disableTable(HMaster.java:1170)
 at sun.reflect.GeneratedMethodAccessor68.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1345)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
 at 
 org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.disableTableAsync(HBaseAdmin.java:766)
 at 
 org.apache.hadoop.hbase.util.TestHBaseFsck.deleteTable(TestHBaseFsck.java:344)
 at 
 org.apache.hadoop.hbase.util.TestHBaseFsck.doQuarantineTest(TestHBaseFsck.java:1351)
 at 
 org.apache.hadoop.hbase.util.TestHBaseFsck.testQuarantineMissingRegionDir(TestHBaseFsck.java:1433)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
 at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
 at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at 
 org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.TableNotEnabledException):
  org.apache.hadoop.hbase.TableNotEnabledException: 
 testQuarantineMissingRegionDir
 at 
 org.apache.hadoop.hbase.master.handler.DisableTableHandler.init(DisableTableHandler.java:75)
 at org.apache.hadoop.hbase.master.HMaster.disableTable(HMaster.java:1170)
 at sun.reflect.GeneratedMethodAccessor68.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1345)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:918)
 at 
 

[jira] [Updated] (HBASE-6854) Deletion of SPLITTING node on split rollback should clear the region from RIT

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6854:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 Deletion of SPLITTING node on split rollback should clear the region from RIT
 -

 Key: HBASE-6854
 URL: https://issues.apache.org/jira/browse/HBASE-6854
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.94.2

 Attachments: HBASE-6854.patch, HBASE-6854.patch


 If a failure happens in a split before OFFLINING_PARENT, we roll back 
 the split, including deleting the znodes created.
 On deletion of the RS_ZK_SPLITTING node we get a callback but do not 
 remove the region from RIT. We need to remove it from RIT; the SSH logic is 
 well guarded anyway in case the delete event is due to an RS going down.



[jira] [Updated] (HBASE-6679) RegionServer aborts due to race between compaction and split

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6679:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 RegionServer aborts due to race between compaction and split
 

 Key: HBASE-6679
 URL: https://issues.apache.org/jira/browse/HBASE-6679
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3, 0.94.2, 0.96.0

 Attachments: 6679-1.094.patch, 6679-1.patch, 
 rs-crash-parallel-compact-split.log


 In our nightlies, we have seen RS aborts due to compaction and split racing. 
 Original parent file gets deleted after the compaction, and hence, the 
 daughters don't find the parent data file. The RS kills itself when this 
 happens. Will attach a snippet of the relevant RS logs.



[jira] [Updated] (HBASE-4565) Maven HBase build broken on cygwin with copynativelib.sh call.

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4565:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 Maven HBase build broken on cygwin with copynativelib.sh call.
 --

 Key: HBASE-4565
 URL: https://issues.apache.org/jira/browse/HBASE-4565
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.92.0
 Environment: cygwin (on xp and win7)
Reporter: Suraj Varma
Assignee: Suraj Varma
  Labels: build, maven
 Fix For: 0.92.3, 0.94.2

 Attachments: HBASE-4565-0.92.patch, HBASE-4565.patch, 
 HBASE-4565-v2.patch, HBASE-4565-v3-0.92.patch, HBASE-4565-v3.patch, 
 HBASE-4565-v4-0.92.patch, HBASE-4565-v4-0.94.patch


 This is broken in both 0.92 as well as trunk pom.xml
 Here's a sample maven log snippet from trunk (from Mayuresh on user mailing 
 list)
 [INFO] [antrun:run {execution: package}]
 [INFO] Executing tasks
 main:
[mkdir] Created dir: 
 D:\workspace\mkshirsa\hbase-trunk\target\hbase-0.93-SNAPSHOT\hbase-0.93-SNAPSHOT\lib\native\${build.platform}
 [exec] ls: cannot access D:workspacemkshirsahbase-trunktarget/nativelib: 
 No such file or directory
 [exec] tar (child): Cannot connect to D: resolve failed
 [INFO] 
 
 [ERROR] BUILD ERROR
 [INFO] 
 
 [INFO] An Ant BuildException has occured: exec returned: 3328
 There are two issues: 
 1) The antrun task below doesn't handle the Windows file separator returned 
 by project.build.directory; this causes the "resolve failed" error above.
 <!-- Using Unix cp to preserve symlinks, using script to handle wildcards -->
 <echo file="${project.build.directory}/copynativelibs.sh">
 if [ `ls ${project.build.directory}/nativelib | wc -l` -ne 0]; then
 2) The tar argument value below has a similar issue: the path arg 
 doesn't resolve correctly.
 <!-- Using Unix tar to preserve symlinks -->
 <exec executable="tar" failonerror="yes" 
 dir="${project.build.directory}/${project.artifactId}-${project.version}">
 <arg value="czf"/>
 <arg 
 value="/cygdrive/c/workspaces/hbase-0.92-svn/target/${project.artifactId}-${project.version}.tar.gz"/>
 <arg value="."/>
 </exec>
 In both cases, the fix would probably be to use a cross-platform way to 
 handle the directory locations. 
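
As an illustration of the path problem only (this is not the attached patch), a Windows-style build directory can be rewritten into a form that Cygwin tools accept. The sketch below assumes the usual `/cygdrive/<drive>` mount convention; `CygwinPathSketch` and `toCygdrivePath` are hypothetical names.

```java
// Illustrative helper, not part of the HBase build.
final class CygwinPathSketch {
    // Turns "D:\a\b" into "/cygdrive/d/a/b". Paths without a drive letter
    // only get their backslashes replaced by forward slashes.
    static String toCygdrivePath(String path) {
        String forward = path.replace('\\', '/');
        if (forward.length() >= 2 && isAsciiLetter(forward.charAt(0)) && forward.charAt(1) == ':') {
            char drive = Character.toLowerCase(forward.charAt(0));
            String rest = forward.substring(2);
            if (!rest.startsWith("/")) {
                rest = "/" + rest;
            }
            return "/cygdrive/" + drive + rest;
        }
        return forward;
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    public static void main(String[] args) {
        System.out.println(toCygdrivePath("D:\\workspace\\mkshirsa\\hbase-trunk\\target"));
    }
}
```

With no backslashes left, the shell has nothing to strip, and without the `D:` prefix tar no longer mistakes the drive letter for a remote host name.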



[jira] [Updated] (HBASE-6299) RS starting region open while failing ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6299:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 RS starting region open while failing ack to HMaster.sendRegionOpen() causes 
 inconsistency in HMaster's region state and a series of successive problems
 

 Key: HBASE-6299
 URL: https://issues.apache.org/jira/browse/HBASE-6299
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.94.0
Reporter: Maryann Xue
Assignee: Maryann Xue
Priority: Critical
 Fix For: 0.92.3, 0.94.2, 0.96.0

 Attachments: 6299v4.txt, 6299v4.txt, 6299v4.txt, HBASE-6299.patch, 
 HBASE-6299-v2.patch, HBASE-6299-v3.patch


 1. HMaster tries to assign a region to an RS.
 2. HMaster creates a RegionState for this region and puts it into 
 regionsInTransition.
 3. In the first assign attempt, HMaster calls RS.openRegion(). The RS 
 receives the open region request and starts processing it, eventually 
 succeeding. However, due to network problems, HMaster fails to receive the 
 response for the openRegion() call, and the call times out.
 4. HMaster attempts to assign a second time, choosing another RS. 
 5. But since the HMaster's OpenedRegionHandler has been triggered by the 
 region open on the previous RS, and the RegionState has already been removed 
 from regionsInTransition, HMaster treats the unassigned ZK node 
 RS_ZK_REGION_OPENING updated by the second attempt as invalid and ignores it.
 6. The unassigned ZK node stays, and a later unassign fails because 
 RS_ZK_REGION_CLOSING cannot be created.
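
The sequence above can be replayed against a toy in-memory model. This is a sketch under strong simplifying assumptions: a set stands in for regionsInTransition, a map stands in for the unassigned ZK nodes, and `StaleUnassignedNodeSketch` and its methods are hypothetical names rather than HBase APIs.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the master's bookkeeping; not HBase code.
final class StaleUnassignedNodeSketch {
    final Set<String> regionsInTransition = new HashSet<>();
    final Map<String, String> unassignedNodes = new HashMap<>(); // region -> node state

    // Step 2: the master records a RegionState for the region.
    void startAssign(String region) {
        regionsInTransition.add(region);
    }

    // Region server side: create or update the unassigned node.
    void rsSetsNode(String region, String state) {
        unassignedNodes.put(region, state);
    }

    // Master side: returns true if the node event was handled, false if ignored.
    boolean masterHandles(String region) {
        if (!regionsInTransition.contains(region)) {
            return false; // no RegionState: the event is treated as invalid
        }
        if ("RS_ZK_REGION_OPENED".equals(unassignedNodes.get(region))) {
            unassignedNodes.remove(region);      // delete the unassigned node
            regionsInTransition.remove(region);  // the region is no longer in transition
        }
        return true;
    }

    // A later unassign needs a fresh CLOSING node; it fails if a node is still there.
    boolean createClosingNode(String region) {
        return unassignedNodes.putIfAbsent(region, "RS_ZK_REGION_CLOSING") == null;
    }

    public static void main(String[] args) {
        StaleUnassignedNodeSketch m = new StaleUnassignedNodeSketch();
        m.startAssign("r1");
        m.rsSetsNode("r1", "RS_ZK_REGION_OPENED");   // first RS succeeded despite the timeout
        System.out.println(m.masterHandles("r1"));   // handled, state cleared
        m.rsSetsNode("r1", "RS_ZK_REGION_OPENING");  // second attempt on another RS
        System.out.println(m.masterHandles("r1"));   // ignored, node left behind
        System.out.println(m.createClosingNode("r1"));
    }
}
```

In this model the second OPENING event is ignored and the leftover node makes the later CLOSING creation fail, mirroring steps 5 and 6.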
 {code}
 2012-06-29 07:03:38,870 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
 region 
 CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.;
  
 plan=hri=CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.,
  src=swbss-hadoop-004,60020,1340890123243, 
 dest=swbss-hadoop-006,60020,1340890678078
 2012-06-29 07:03:38,870 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
  to swbss-hadoop-006,60020,1340890678078
 2012-06-29 07:03:38,870 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=swbss-hadoop-002:6, 
 region=b713fd655fa02395496c5a6e39ddf568
 2012-06-29 07:06:28,882 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, 
 region=b713fd655fa02395496c5a6e39ddf568
 2012-06-29 07:06:32,291 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, 
 region=b713fd655fa02395496c5a6e39ddf568
 2012-06-29 07:06:32,299 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENED, server=swbss-hadoop-006,60020,1340890678078, 
 region=b713fd655fa02395496c5a6e39ddf568
 2012-06-29 07:06:32,299 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
 event for 
 CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
  from serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=518945, 
 regions=575, usedHeap=15282, maxHeap=31301); deleting unassigned node
 2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x2377fee2ae80007 Deleting existing unassigned node for 
 b713fd655fa02395496c5a6e39ddf568 that is in expected state RS_ZK_REGION_OPENED
 2012-06-29 07:06:32,301 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x2377fee2ae80007 Successfully deleted unassigned node for 
 region b713fd655fa02395496c5a6e39ddf568 in expected state RS_ZK_REGION_OPENED
 2012-06-29 07:06:32,301 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: The master has 
 opened the region 
 CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
  that was online on serverName=swbss-hadoop-006,60020,1340890678078, 
 load=(requests=518945, regions=575, usedHeap=15282, maxHeap=31301)
 2012-06-29 07:07:41,140 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of 
 CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
  to serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=0, 
 regions=575, usedHeap=0, maxHeap=0), trying to assign 

[jira] [Updated] (HBASE-6901) Store file compactSelection throws ArrayIndexOutOfBoundsException

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6901:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 Store file compactSelection throws ArrayIndexOutOfBoundsException
 -

 Key: HBASE-6901
 URL: https://issues.apache.org/jira/browse/HBASE-6901
 Project: HBase
  Issue Type: Bug
  Components: HFile
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.94.2, 0.96.0

 Attachments: trunk-6901.patch


 When hbase.mapreduce.hfileoutputformat.compaction.exclude is set to true, 
 running a compaction that excludes bulk loaded files can throw 
 ArrayIndexOutOfBoundsException when all files end up excluded.
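
A minimal sketch of the kind of guard that avoids this class of failure, assuming the selection code indexes into an array built from the remaining files. It is not the attached patch; `CompactSelectionSketch`, `FileSketch`, and `totalSizeOfSelection` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a stand-in for store file selection, not HBase code.
final class CompactSelectionSketch {
    static final class FileSketch {
        final long size;
        final boolean excludedFromCompaction; // e.g. a bulk loaded file marked for exclusion

        FileSketch(long size, boolean excludedFromCompaction) {
            this.size = size;
            this.excludedFromCompaction = excludedFromCompaction;
        }
    }

    // Sums the sizes of the files that remain after exclusion.
    static long totalSizeOfSelection(List<FileSketch> candidates) {
        List<FileSketch> remaining = new ArrayList<>();
        for (FileSketch f : candidates) {
            if (!f.excludedFromCompaction) {
                remaining.add(f);
            }
        }
        // Guard: when every file is excluded there is nothing to index into.
        if (remaining.isEmpty()) {
            return 0L;
        }
        long[] sizes = new long[remaining.size()];
        for (int i = 0; i < sizes.length; i++) {
            sizes[i] = remaining.get(i).size;
        }
        long total = sizes[0]; // would throw ArrayIndexOutOfBoundsException without the guard
        for (int i = 1; i < sizes.length; i++) {
            total += sizes[i];
        }
        return total;
    }

    public static void main(String[] args) {
        List<FileSketch> allExcluded = new ArrayList<>();
        allExcluded.add(new FileSketch(100L, true));
        allExcluded.add(new FileSketch(200L, true));
        System.out.println(totalSizeOfSelection(allExcluded));
    }
}
```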



[jira] [Updated] (HBASE-6688) folder referred by thrift demo app instructions is outdated

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6688:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 folder referred by thrift demo app instructions is outdated
 ---

 Key: HBASE-6688
 URL: https://issues.apache.org/jira/browse/HBASE-6688
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jimmy Xiang
Assignee: stack
Priority: Minor
 Fix For: 0.94.2, 0.96.0

 Attachments: thrift094.txt, thrift.txt


 Due to the source tree module change for 0.96, the instructions in the thrift 
 demo example no longer match the folder structure.
 The instructions refer to:
 ../../../src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift
 but should refer to:
 ../../hbase-server/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift



[jira] [Updated] (HBASE-6888) HBase scripts ignore any HBASE_OPTS set in the environment

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6888:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 HBase scripts ignore any HBASE_OPTS set in the environment
 --

 Key: HBASE-6888
 URL: https://issues.apache.org/jira/browse/HBASE-6888
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Minor
 Fix For: 0.94.2, 0.96.0

 Attachments: HBASE-6888_trunk.patch


 hbase-env.sh, which is sourced by hbase-config.sh, which is eventually sourced 
 by the main 'hbase' script, defines HBASE_OPTS from scratch, ignoring any 
 previous value set in the environment.
 This prevents passing additional JVM parameters to HBase programs 
 (shell, hbck, etc.) launched through these scripts.



[jira] [Updated] (HBASE-6914) Scans/Gets/Mutations don't give a good error if the table is disabled.

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6914:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 Scans/Gets/Mutations don't give a good error if the table is disabled.
 --

 Key: HBASE-6914
 URL: https://issues.apache.org/jira/browse/HBASE-6914
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.92.3, 0.94.2, 0.96.0

 Attachments: HBASE-6914-092-3.patch, HBASE-6914-092-ADD.patch, 
 HBASE-6914-094-3.patch, HBASE-6914-0.patch, HBASE-6914-1.patch, 
 HBASE-6914-2.patch, HBASE-6914-3.patch


 Scanning a disabled table makes the client retry multiple times and 
 then error out with NotServingRegionException.  If the table is disabled 
 there's no need to retry, and the message should be more explicit.
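
 The requested behaviour can be sketched roughly as below. Everything here is hypothetical and self-contained: TableDisabledSketchException, TableStateLookup and FailFastRetrySketch are illustrative names, not HBase client classes, and the attached patches may do this differently. The idea is simply to check the table state when an attempt fails and to stop retrying with an explicit error if the table is disabled.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical exception type used only in this sketch.
final class TableDisabledSketchException extends IOException {
  TableDisabledSketchException(String table) {
    super("Table " + table + " is disabled");
  }
}

// Hypothetical lookup of the table state.
interface TableStateLookup {
  boolean isDisabled(String table);
}

final class FailFastRetrySketch {
  // Runs 'op' up to maxAttempts times, but gives up at once with an
  // explicit error if the table turns out to be disabled.
  static <T> T callWithRetries(String table, TableStateLookup state,
      Callable<T> op, int maxAttempts) throws Exception {
    Exception last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return op.call();
      } catch (Exception e) {
        if (state.isDisabled(table)) {
          // Retrying cannot help: no region of a disabled table is served.
          throw new TableDisabledSketchException(table);
        }
        last = e; // otherwise remember the failure and try again
      }
    }
    throw last != null ? last
        : new IllegalArgumentException("maxAttempts must be > 0");
  }
}
```

 A caller then sees a single "Table ... is disabled" error after the first failed attempt rather than a NotServingRegionException after all retries are exhausted.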



[jira] [Updated] (HBASE-6927) WrongFS using HRegionInfo.getTableDesc() and different fs for hbase.root and fs.defaultFS

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6927:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

 WrongFS using HRegionInfo.getTableDesc() and different fs for hbase.root and 
 fs.defaultFS
 -

 Key: HBASE-6927
 URL: https://issues.apache.org/jira/browse/HBASE-6927
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2, 0.94.1, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.94.2, 0.96.0

 Attachments: 6927.094.txt, HBASE-6927-v0.patch


 Calling HRegionInfo.getTableDesc() with a different fs scheme for hbase.root 
 and fs.defaultFS raises "IllegalArgumentException: Wrong FS".
 HRegionInfo.getTableDesc() is called only by bin/region_mover.rb to get the 
 table name and can easily be replaced; getTableDesc() is also deprecated.
 The main problem is that getTableDesc() doesn't replace fs.defaultFS with 
 hbase.root, as all the other HBase code does:
 {code}
 Configuration c = HBaseConfiguration.create();
 c.set("fs.defaultFS", c.get(HConstants.HBASE_DIR));
 c.set("fs.default.name", c.get(HConstants.HBASE_DIR));
 {code}
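
 Why the mismatch surfaces as "Wrong FS" can be illustrated with a small stand-alone model. This is a simplified sketch, not Hadoop's implementation: WrongFsSketch and checkPath are hypothetical names, the host name nn is made up, and only the URI scheme is compared here.

```java
import java.net.URI;

final class WrongFsSketch {
  // Rejects a path whose scheme differs from the file system's own scheme.
  // Paths without a scheme are treated as belonging to the file system.
  static void checkPath(URI fsUri, URI path) {
    String scheme = path.getScheme();
    if (scheme != null && !scheme.equalsIgnoreCase(fsUri.getScheme())) {
      throw new IllegalArgumentException(
          "Wrong FS: " + path + ", expected: " + fsUri);
    }
  }

  public static void main(String[] args) {
    URI defaultFs = URI.create("file:///");           // stands in for fs.defaultFS
    URI tableDir = URI.create("hdfs://nn:8020/hbase"); // stands in for the HBase root dir
    try {
      checkPath(defaultFs, tableDir);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

 Pointing the default file system at the HBase root directory, as in the snippet above, makes both sides use the same scheme, so the check passes.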



[jira] [Updated] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6920:
-

Fix Version/s: 0.94.2

 On timeout connecting to master, client can get stuck and never make progress
 -

 Key: HBASE-6920
 URL: https://issues.apache.org/jira/browse/HBASE-6920
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.2
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.94.2

 Attachments: HBASE-6920.patch, HBASE-6920-v2.patch


 HBASE-5058 appears to have introduced an issue where a timeout in 
 HConnection.getMaster() can cause the client to never be able to connect to 
 the master.  So, for example, an HBaseAdmin object can never successfully be 
 initialized.
 The issue is here:
 {code}
 if (tryMaster.isMasterRunning()) {
   this.master = tryMaster;
   this.masterLock.notifyAll();
   break;
 }
 {code}
 If isMasterRunning times out, it throws an UndeclaredThrowableException, 
 which is already not ideal, because it can be returned to the application.
  But if the first call to getMaster succeeds, it will set masterChecked = 
 true, which makes us never try to reconnect; that is, we will set this.master 
 = null and just throw MasterNotRunningExceptions, without even trying to 
 connect.
 I tried out a 94 client (actually a 92 client with some 94 patches) on a 
 cluster with some network issues, and it would constantly get stuck as 
 described above.
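
 One way the stuck state described above could arise is modelled below. This is a heavily simplified, hypothetical sketch (MasterProbe and MasterConnectionSketch are made-up names, and IllegalStateException stands in for MasterNotRunningException); it is not the HConnectionManager code, and the attached patch may fix the problem differently. The retryAfterFailure switch shows one possible remedy: always attempt a fresh connection when the cached master stops answering.

```java
// Hypothetical probe; a timeout is modelled as a thrown exception.
interface MasterProbe {
  boolean isMasterRunning() throws Exception;
}

final class MasterConnectionSketch {
  private MasterProbe master;     // cached handle, null when unknown
  private boolean masterChecked;  // set once a connection has succeeded
  private final MasterProbe candidate;
  private final boolean retryAfterFailure; // true = reconnect after a failure

  MasterConnectionSketch(MasterProbe candidate, boolean retryAfterFailure) {
    this.candidate = candidate;
    this.retryAfterFailure = retryAfterFailure;
  }

  MasterProbe getMaster() {
    // Fast path: reuse the cached handle while it still answers.
    if (master != null && probe(master)) {
      return master;
    }
    master = null;
    if (masterChecked && !retryAfterFailure) {
      // Stuck: an earlier success set the flag, so no reconnect is attempted.
      throw new IllegalStateException("master not running (no reconnect attempted)");
    }
    if (probe(candidate)) {
      master = candidate;
      masterChecked = true;
      return master;
    }
    throw new IllegalStateException("master not running");
  }

  // Treats any exception (for example a timeout) as "not running".
  private static boolean probe(MasterProbe p) {
    try {
      return p.isMasterRunning();
    } catch (Exception e) {
      return false;
    }
  }
}
```

 In this model a single failed probe after a successful connection leaves the non-retrying variant failing forever, while the retrying variant recovers as soon as the master answers again.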



[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468745#comment-13468745
 ] 

Lars Hofhansl commented on HBASE-6920:
--

Actually, let's just fix the issue you discovered here. +1 on your patch.
We can think about the other change I suggest for 0.94.3. It is time to get 
0.94.2 out the door.

 On timeout connecting to master, client can get stuck and never make progress
 -

 Key: HBASE-6920
 URL: https://issues.apache.org/jira/browse/HBASE-6920
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.2
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.94.2

 Attachments: HBASE-6920.patch, HBASE-6920-v2.patch


 HBASE-5058 appears to have introduced an issue where a timeout in 
 HConnection.getMaster() can cause the client to never be able to connect to 
 the master.  So, for example, an HBaseAdmin object can never successfully be 
 initialized.
 The issue is here:
 {code}
 if (tryMaster.isMasterRunning()) {
   this.master = tryMaster;
   this.masterLock.notifyAll();
   break;
 }
 {code}
 If isMasterRunning times out, it throws an UndeclaredThrowableException, 
 which is already not ideal, because it can be returned to the application.
  But if the first call to getMaster succeeds, it will set masterChecked = 
 true, which makes us never try to reconnect; that is, we will set this.master 
 = null and just throw MasterNotRunningExceptions, without even trying to 
 connect.
 I tried out a 94 client (actually a 92 client with some 94 patches) on a 
 cluster with some network issues, and it would constantly get stuck as 
 described above.



[jira] [Commented] (HBASE-6758) [replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file

2012-10-03 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468751#comment-13468751
 ] 

Devaraj Das commented on HBASE-6758:


Thanks, [~jdcryans] for looking at the patch. Actually, upon looking at the 
RegionServerServices interface closely, I see that it extends the Server 
interface. So the problem you pointed out could be addressed by making the 
affected constructors and methods (the ones that I changed to have the new 
RegionServerServices argument) to have only RegionServerServices instead of 
Server/Stoppable instances.

Will submit a patch soon. Hope that will look better.

 [replication] The replication-executor should make sure the file that it is 
 replicating is closed before declaring success on that file
 ---

 Key: HBASE-6758
 URL: https://issues.apache.org/jira/browse/HBASE-6758
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6758-1-0.92.patch, 6758-2-0.92.patch, 
 6758-trunk-1.patch, 
 TEST-org.apache.hadoop.hbase.replication.TestReplication.xml


 I have seen cases where the replication-executor would lose data to replicate 
 since the file hasn't been closed yet. Upon closing, the new data becomes 
 visible. Before that happens the ZK node shouldn't be deleted in 
 ReplicationSourceManager.logPositionAndCleanOldLogs. Changes need to be made 
 in ReplicationSource.processEndOfFile as well (currentPath related).



[jira] [Commented] (HBASE-6758) [replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file

2012-10-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468762#comment-13468762
 ] 

stack commented on HBASE-6758:
--

Can we not pass down RegionServerServices?  Can we pass a narrow Interface 
instead?



 [replication] The replication-executor should make sure the file that it is 
 replicating is closed before declaring success on that file
 ---

 Key: HBASE-6758
 URL: https://issues.apache.org/jira/browse/HBASE-6758
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6758-1-0.92.patch, 6758-2-0.92.patch, 
 6758-trunk-1.patch, 
 TEST-org.apache.hadoop.hbase.replication.TestReplication.xml


 I have seen cases where the replication-executor would lose data to replicate 
 since the file hasn't been closed yet. Upon closing, the new data becomes 
 visible. Before that happens the ZK node shouldn't be deleted in 
 ReplicationSourceManager.logPositionAndCleanOldLogs. Changes need to be made 
 in ReplicationSource.processEndOfFile as well (currentPath related).



[jira] [Commented] (HBASE-6316) Confirm can upgrade to 0.96 from 0.94 by just stopping and restarting

2012-10-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468764#comment-13468764
 ] 

stack commented on HBASE-6316:
--

Hmm... trying again I don't get the 500 building on this machine.

 Confirm can upgrade to 0.96 from 0.94 by just stopping and restarting
 -

 Key: HBASE-6316
 URL: https://issues.apache.org/jira/browse/HBASE-6316
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 6316.txt


 Over in HBASE-6294, LarsH says you have to currently clear zk to get a 0.96 
 to start over data written by a 0.94.  Need to fix it so don't have to do 
 this -- that zk state left over gets auto-migrated.



[jira] [Commented] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468789#comment-13468789
 ] 

Hadoop QA commented on HBASE-6912:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12547577/6912-0.96.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
83 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2994//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2994//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2994//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2994//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2994//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2994//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2994//console

This message is automatically generated.

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
 Fix For: 0.94.2, 0.96.0

 Attachments: 6912-0.94.txt, 6912-0.96.txt, minimalTest.java


 Steps to reproduce:
 Create a table, load data into it. Flush the table.
 Do a scan with
 1. Some filter which should not match the first entry in the scan
 2. Where one specifies a family and column.
 You will notice that the first entry is returned even though it doesn't match 
 the filter.
 It looks like the first KeyValue of a scan on the column, from the 
 point of view of this code in
 HRegion.java
 {code}
 } else if (kv != null && !kv.isInternal() && filterRowKey(currentRow)) {
 {code}
 is generated by
 {code}
 public static KeyValue createLastOnRow(final byte [] row,
 final int roffset, final int rlength, final byte [] family,
 final int foffset, final int flength, final byte [] qualifier,
 final int qoffset, final int qlength) {
   return new KeyValue(row, roffset, rlength, family, foffset, flength,
       qualifier, qoffset, qlength, HConstants.OLDEST_TIMESTAMP,
       Type.Minimum, null, 0, 0);
 }
 {code}
 so it is always internal at that point in the code.
 Only later, from within
 StoreScanner.java
 {code}
 public synchronized boolean next(List<KeyValue> outResult, int limit, String 
 metric) throws IOException {
 
 LOOP: while((kv = this.heap.peek()) != null) {
 {code}
 (the second time through), do we get the actual kv, with a proper type and 
 timestamp. This seems to mess with filtering.



[jira] [Commented] (HBASE-6758) [replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file

2012-10-03 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468800#comment-13468800
 ] 

Devaraj Das commented on HBASE-6758:


bq. Can we not pass down RegionServerServices? Can we pass a narrow Interface 
instead?

I think we can (I can pull out the getWAL() method from the interface 
RegionServerServices into a new interface and have RegionServerServices extend 
that..). But in that case we will pass two instances of HRS still (as pointed 
out by JD earlier). But thinking about it, that probably makes downstream 
methods' abstractions cleaner (when compared with the approach of having them 
accept a fat interface).
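
The narrow-interface idea could look roughly like this. It is only a sketch under assumed names: WalSketch, WalProviderSketch, RegionServerServicesSketch and ReplicationSourceSketch are hypothetical, and only getWAL() and RegionServerServices come from the discussion above.

```java
// Hypothetical stand-in for the write-ahead log type.
interface WalSketch {
  String name();
}

// The narrow interface: exposes just the WAL accessor.
interface WalProviderSketch {
  WalSketch getWAL();
}

// The "fat" interface keeps its other methods and extends the narrow one,
// so every existing implementation is automatically a WalProviderSketch.
interface RegionServerServicesSketch extends WalProviderSketch {
  boolean isStopping();
}

// A downstream class that only needs the WAL asks only for the narrow type.
final class ReplicationSourceSketch {
  private final WalProviderSketch walProvider;

  ReplicationSourceSketch(WalProviderSketch walProvider) {
    this.walProvider = walProvider;
  }

  String currentWalName() {
    return walProvider.getWAL().name();
  }
}
```

Downstream code then depends only on what it uses, and a test can supply a one-method stub instead of a full region server.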

 [replication] The replication-executor should make sure the file that it is 
 replicating is closed before declaring success on that file
 ---

 Key: HBASE-6758
 URL: https://issues.apache.org/jira/browse/HBASE-6758
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6758-1-0.92.patch, 6758-2-0.92.patch, 
 6758-trunk-1.patch, 
 TEST-org.apache.hadoop.hbase.replication.TestReplication.xml


 I have seen cases where the replication-executor would lose data to replicate 
 since the file hasn't been closed yet. Upon closing, the new data becomes 
 visible. Before that happens the ZK node shouldn't be deleted in 
 ReplicationSourceManager.logPositionAndCleanOldLogs. Changes need to be made 
 in ReplicationSource.processEndOfFile as well (currentPath related).



[jira] [Assigned] (HBASE-6912) Filters are not properly applied in certain cases

2012-10-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reassigned HBASE-6912:


Assignee: Lars Hofhansl

 Filters are not properly applied in certain cases
 -

 Key: HBASE-6912
 URL: https://issues.apache.org/jira/browse/HBASE-6912
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Alex Newman
Assignee: Lars Hofhansl
 Fix For: 0.94.2, 0.96.0

 Attachments: 6912-0.94.txt, 6912-0.96.txt, minimalTest.java


 Steps to reproduce:
 Create a table, load data into it. Flush the table.
 Do a scan with
 1. Some filter which should not match the first entry in the scan
 2. Where one specifies a family and column.
 You will notice that the first entry is returned even though it doesn't match 
 the filter.
 It looks like the first KeyValue of a scan on the column, from the 
 point of view of this code in
 HRegion.java
 {code}
 } else if (kv != null && !kv.isInternal() && filterRowKey(currentRow)) {
 {code}
 is generated by
 {code}
 public static KeyValue createLastOnRow(final byte [] row,
 final int roffset, final int rlength, final byte [] family,
 final int foffset, final int flength, final byte [] qualifier,
 final int qoffset, final int qlength) {
   return new KeyValue(row, roffset, rlength, family, foffset, flength,
       qualifier, qoffset, qlength, HConstants.OLDEST_TIMESTAMP,
       Type.Minimum, null, 0, 0);
 }
 {code}
 so it is always internal at that point in the code.
 Only later, from within
 StoreScanner.java
 {code}
 public synchronized boolean next(List<KeyValue> outResult, int limit, String 
 metric) throws IOException {
 
 LOOP: while((kv = this.heap.peek()) != null) {
 {code}
 (the second time through), do we get the actual kv, with a proper type and 
 timestamp. This seems to mess with filtering.



[jira] [Commented] (HBASE-6667) TestCatalogJanitor occasionally fails

2012-10-03 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468820#comment-13468820
 ] 

Jesse Yates commented on HBASE-6667:


This test hasn't been failing on trunk for the last 100+ builds and has been 
running under a second in every build (generally under 100ms, with some 
variance).  See 
https://builds.apache.org/job/HBase-TRUNK/3414/testReport/junit/org.apache.hadoop.hbase.master/TestCatalogJanitor/testArchiveOldRegion/history/

I'd like to move to close this as won't fix. I have no idea what went wrong 
with the original test - it's all single threaded and a fairly simple test. It 
might have been some weird GC issue where the cleanup ran early or bled over 
from another test running cleanup. However, I ran the test again 20x on trunk 
locally without issue.

 TestCatalogJanitor occasionally fails
 -

 Key: HBASE-6667
 URL: https://issues.apache.org/jira/browse/HBASE-6667
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: java_6667-v0.txt, testCatalogJanitor-output.txt


 Here is the OS:
 Linux sea0 2.6.38-11-generic #48-Ubuntu SMP Fri Jul 29 19:02:55 UTC 2011 
 x86_64 x86_64 x86_64 GNU/Linux
 {code}
 testArchiveOldRegion(org.apache.hadoop.hbase.master.TestCatalogJanitor)  Time 
 elapsed: 0.007 sec   FAILURE!
 java.lang.AssertionError: Not the same number of current files
 Expected (2):  Gotten (0):
 Not Found:
 _store0
 _store1
 Extra:
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.assertTrue(Assert.java:43)
   at org.junit.Assert.assertNull(Assert.java:551)
   at 
 org.apache.hadoop.hbase.util.HFileArchiveTestingUtil.assertArchiveEqualToOriginal(HFileArchiveTestingUtil.java:132)
   at 
 org.apache.hadoop.hbase.util.HFileArchiveTestingUtil.assertArchiveEqualToOriginal(HFileArchiveTestingUtil.java:95)
   at 
 org.apache.hadoop.hbase.master.TestCatalogJanitor.testArchiveOldRegion(TestCatalogJanitor.java:623)
 {code}



[jira] [Commented] (HBASE-6667) TestCatalogJanitor occasionally fails

2012-10-03 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468824#comment-13468824
 ] 

Ted Yu commented on HBASE-6667:
---

We can resolve this and tackle more recent test failures.

 TestCatalogJanitor occasionally fails
 -

 Key: HBASE-6667
 URL: https://issues.apache.org/jira/browse/HBASE-6667
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: java_6667-v0.txt, testCatalogJanitor-output.txt


 Here is the OS:
 Linux sea0 2.6.38-11-generic #48-Ubuntu SMP Fri Jul 29 19:02:55 UTC 2011 
 x86_64 x86_64 x86_64 GNU/Linux
 {code}
 testArchiveOldRegion(org.apache.hadoop.hbase.master.TestCatalogJanitor)  Time 
 elapsed: 0.007 sec   FAILURE!
 java.lang.AssertionError: Not the same number of current files
 Expected (2):  Gotten (0):
 Not Found:
 _store0
 _store1
 Extra:
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.assertTrue(Assert.java:43)
   at org.junit.Assert.assertNull(Assert.java:551)
   at 
 org.apache.hadoop.hbase.util.HFileArchiveTestingUtil.assertArchiveEqualToOriginal(HFileArchiveTestingUtil.java:132)
   at 
 org.apache.hadoop.hbase.util.HFileArchiveTestingUtil.assertArchiveEqualToOriginal(HFileArchiveTestingUtil.java:95)
   at 
 org.apache.hadoop.hbase.master.TestCatalogJanitor.testArchiveOldRegion(TestCatalogJanitor.java:623)
 {code}



[jira] [Resolved] (HBASE-6667) TestCatalogJanitor occasionally fails

2012-10-03 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-6667.
---

Resolution: Cannot Reproduce

 TestCatalogJanitor occasionally fails
 -

 Key: HBASE-6667
 URL: https://issues.apache.org/jira/browse/HBASE-6667
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: java_6667-v0.txt, testCatalogJanitor-output.txt


 Here is the OS:
 Linux sea0 2.6.38-11-generic #48-Ubuntu SMP Fri Jul 29 19:02:55 UTC 2011 
 x86_64 x86_64 x86_64 GNU/Linux
 {code}
 testArchiveOldRegion(org.apache.hadoop.hbase.master.TestCatalogJanitor)  Time 
 elapsed: 0.007 sec   FAILURE!
 java.lang.AssertionError: Not the same number of current files
 Expected (2):  Gotten (0):
 Not Found:
 _store0
 _store1
 Extra:
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.assertTrue(Assert.java:43)
   at org.junit.Assert.assertNull(Assert.java:551)
   at 
 org.apache.hadoop.hbase.util.HFileArchiveTestingUtil.assertArchiveEqualToOriginal(HFileArchiveTestingUtil.java:132)
   at 
 org.apache.hadoop.hbase.util.HFileArchiveTestingUtil.assertArchiveEqualToOriginal(HFileArchiveTestingUtil.java:95)
   at 
 org.apache.hadoop.hbase.master.TestCatalogJanitor.testArchiveOldRegion(TestCatalogJanitor.java:623)
 {code}



[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-03 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468835#comment-13468835
 ] 

Gregory Chanan commented on HBASE-6920:
---

Sounds good.  I'm doing some cluster testing today, I'll commit if all looks 
good.

 On timeout connecting to master, client can get stuck and never make progress
 -

 Key: HBASE-6920
 URL: https://issues.apache.org/jira/browse/HBASE-6920
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.2
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.94.2

 Attachments: HBASE-6920.patch, HBASE-6920-v2.patch


 HBASE-5058 appears to have introduced an issue where a timeout in 
 HConnection.getMaster() can cause the client to never be able to connect to 
 the master.  So, for example, an HBaseAdmin object can never successfully be 
 initialized.
 The issue is here:
 {code}
 if (tryMaster.isMasterRunning()) {
   this.master = tryMaster;
   this.masterLock.notifyAll();
   break;
 }
 {code}
 If isMasterRunning times out, it throws an UndeclaredThrowableException, 
 which is already not ideal, because it can be returned to the application.
  But if the first call to getMaster succeeds, it will set masterChecked = 
 true, which makes us never try to reconnect; that is, we will set this.master 
 = null and just throw MasterNotRunningExceptions, without even trying to 
 connect.
 I tried out a 94 client (actually a 92 client with some 94 patches) on a 
 cluster with some network issues, and it would constantly get stuck as 
 described above.



[jira] [Commented] (HBASE-6733) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]

2012-10-03 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468868#comment-13468868
 ] 

Jean-Daniel Cryans commented on HBASE-6733:
---

The log switching code really needs to be cleaned up, but my understanding is 
that this patch won't do anything. {{processEndOfFile}} always sets the 
{{currentPath}} to {{null}} so this:

{code}
+  Path oldPath = getCurrentPath();
{code}

would always return null in the case where we're switching logs?

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]
 ---

 Key: HBASE-6733
 URL: https://issues.apache.org/jira/browse/HBASE-6733
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: 6733-1.patch, 6733-2.patch


 The failure is in TestReplication.queueFailover (fails due to unreplicated 
 rows). I have come across two problems:
 1. The sleepMultiplier is not properly reset when the currentPath is changed 
 (in ReplicationSource.java).
 2. ReplicationExecutor sometimes removes files to replicate from the queue too 
 early, resulting in the corresponding edits going missing. Here the problem is 
 that the log-file length the replication executor finds is not the most 
 up-to-date one, and hence it doesn't read anything from the file; 
 ultimately, when there is a log roll, the replication queue gets a new entry 
 and the executor drops the old entry out of the queue.



[jira] [Updated] (HBASE-3793) HBASE-3468 Broke checkAndPut with null value

2012-10-03 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-3793:
---

Attachment: D5835.1.patch

mbautin requested code review of [jira] [HBASE-3793] [89-fb] Fix TestHRegion 
failure with zero-byte expected array in compare-and-put.
Reviewers: Liyin, Kannan, JIRA

  Passing a zero-byte expected value to checkAndPut and similar methods now 
means we are expecting to see a zero-byte value, not a non-existent value. This 
should have been part of rHBASEEIGHTNINEFBBRANCH1391219.

TEST PLAN
  TestHRegion

REVISION DETAIL
  https://reviews.facebook.net/D5835

AFFECTED FILES
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/13821/

To: Liyin, Kannan, JIRA, mbautin


 HBASE-3468 Broke checkAndPut with null value
 

 Key: HBASE-3793
 URL: https://issues.apache.org/jira/browse/HBASE-3793
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Lars George
Assignee: Ming Ma
Priority: Blocker
 Fix For: 0.92.0

 Attachments: D5835.1.patch, HBASE-3793.patch, HBASE-3793-TRUNK.patch


 The previous code called Bytes.equals(), which checks for null on the 
 left or right argument. Now the comparator calls Bytes.compareTo(), which 
 has no check for null. But null is a valid input and is used to check for 
 existence. I noticed this while running 
 https://github.com/larsgeorge/hbase-book/blob/master/ch04/src/main/java/client/CheckAndPutExample.java
 This used to work; now it throws an NPE:
 {noformat}
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.hbase.util.Bytes.compareTo(Bytes.java:854)
   at org.apache.hadoop.hbase.filter.WritableByteArrayComparable.compareTo(WritableByteArrayComparable.java:63)
   at org.apache.hadoop.hbase.regionserver.HRegion.checkAndMutate(HRegion.java:1681)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.checkAndMutate(HRegionServer.java:1693)
   ... 6 more
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1026)
   at org.apache.hadoop.hbase.client.HTable.checkAndPut(HTable.java:750)
   at client.CheckAndPutExample.main(CheckAndPutExample.java:33)
 {noformat}
 Easily fixable: the null value just needs to be handled before even calling 
 comparator.compareTo().
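 The null guard suggested above might look roughly like the sketch below. It uses only the JDK; `ExpectedValueCheck` and `matches` are hypothetical names, not HBase's API. It assumes a null expected value means "no value stored", and that a zero-length expected value only matches a stored zero-length value.

```java
import java.util.Arrays;

/** Hypothetical helper, not HBase code: guards against null before comparing. */
final class ExpectedValueCheck {
    private ExpectedValueCheck() {}

    /**
     * @param expected value the caller expects; null is assumed to mean "no value stored"
     * @param actual   value currently stored; null when the cell does not exist
     */
    static boolean matches(byte[] expected, byte[] actual) {
        if (expected == null || actual == null) {
            // Handle null up front so the byte-wise comparison below never sees it.
            return expected == null && actual == null;
        }
        // Both non-null: a zero-length array only matches another zero-length array.
        return Arrays.equals(expected, actual);
    }
}
```

 With this shape, `matches(null, null)` succeeds while `matches(null, new byte[0])` does not, so "absent" and "present but empty" stay distinct.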



[jira] [Commented] (HBASE-3793) HBASE-3468 Broke checkAndPut with null value

2012-10-03 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468879#comment-13468879
 ] 

Phabricator commented on HBASE-3793:


Kannan has accepted the revision [jira] [HBASE-3793] [89-fb] Fix TestHRegion 
failure with zero-length expected value in compare-and-put.

REVISION DETAIL
  https://reviews.facebook.net/D5835

BRANCH
  fix_test_hregion

To: Liyin, Kannan, JIRA, mbautin


 HBASE-3468 Broke checkAndPut with null value
 

 Key: HBASE-3793
 URL: https://issues.apache.org/jira/browse/HBASE-3793
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Lars George
Assignee: Ming Ma
Priority: Blocker
 Fix For: 0.92.0

 Attachments: D5835.1.patch, HBASE-3793.patch, HBASE-3793-TRUNK.patch


 The previous code called Bytes.equals(), which checks for null on the 
 left or right argument. Now the comparator calls Bytes.compareTo(), which 
 has no check for null. But null is a valid input and is used to check for 
 existence. I noticed this while running 
 https://github.com/larsgeorge/hbase-book/blob/master/ch04/src/main/java/client/CheckAndPutExample.java
 This used to work; now it throws an NPE:
 {noformat}
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.hbase.util.Bytes.compareTo(Bytes.java:854)
   at org.apache.hadoop.hbase.filter.WritableByteArrayComparable.compareTo(WritableByteArrayComparable.java:63)
   at org.apache.hadoop.hbase.regionserver.HRegion.checkAndMutate(HRegion.java:1681)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.checkAndMutate(HRegionServer.java:1693)
   ... 6 more
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1026)
   at org.apache.hadoop.hbase.client.HTable.checkAndPut(HTable.java:750)
   at client.CheckAndPutExample.main(CheckAndPutExample.java:33)
 {noformat}
 Easily fixable: the null value just needs to be handled before even calling 
 comparator.compareTo().



[jira] [Commented] (HBASE-6733) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]

2012-10-03 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468895#comment-13468895
 ] 

Devaraj Das commented on HBASE-6733:


bq. would always return null in the case where we're switching log?
That's true, but the patch still works :-) The check _if (getCurrentPath() != 
null && !getCurrentPath().equals(oldPath))_ would evaluate to true (after a call to 
getNextPath()) and the sleepMultiplier would be reset.
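
The check quoted above can be sketched in isolation as follows. This is a stand-alone illustration, not the patch itself: `PathTracker`, `advance`, `backOff`, and the use of `String` for paths are assumptions; only `sleepMultiplier`, `getCurrentPath`, and `oldPath` come from the thread.

```java
/** Hypothetical stand-in for the relevant state in ReplicationSource. */
final class PathTracker {
    private String currentPath;      // null while switching logs
    private int sleepMultiplier = 1; // grows while there is nothing to read

    String getCurrentPath() { return currentPath; }

    int getSleepMultiplier() { return sleepMultiplier; }

    /** Simulates a fruitless read attempt that backs off a little more. */
    void backOff() { sleepMultiplier++; }

    /** Moves to the next log and resets the back-off if the path really changed. */
    void advance(String nextPath) {
        String oldPath = currentPath;
        currentPath = nextPath;
        // equals(null) is false, so a null oldPath also counts as a change.
        if (getCurrentPath() != null && !getCurrentPath().equals(oldPath)) {
            sleepMultiplier = 1;
        }
    }
}
```

Re-advancing to the same path leaves the multiplier alone, while moving from a null or different `oldPath` to a non-null path resets it.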

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]
 ---

 Key: HBASE-6733
 URL: https://issues.apache.org/jira/browse/HBASE-6733
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: 6733-1.patch, 6733-2.patch


 The failure is in TestReplication.queueFailover (fails due to unreplicated 
 rows). I have come across two problems:
 1. The sleepMultiplier is not properly reset when the currentPath is changed 
 (in ReplicationSource.java).
 2. ReplicationExecutor sometimes removes files to replicate from the queue too 
 early, so the corresponding edits go missing. The problem is that the 
 log-file length the replication executor finds is not the most recent one, 
 so it doesn't read anything from there, and 
 ultimately, when there is a log roll, the replication-queue gets a new entry, 
 and the executor drops the old entry out of the queue.



[jira] [Created] (HBASE-6941) LoadIncrementalHFiles uses the Tool interface incorrectly for loading configs

2012-10-03 Thread Harsh J (JIRA)
Harsh J created HBASE-6941:
--

 Summary: LoadIncrementalHFiles uses the Tool interface incorrectly 
for loading configs
 Key: HBASE-6941
 URL: https://issues.apache.org/jira/browse/HBASE-6941
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.90.6
Reporter: Harsh J
Assignee: Harsh J


The LoadIncrementalHFiles tool has fairly complex config loading built into it, 
which seems unnecessary and also causes problems, since it ignores any 
settings passed to it via Tool's -Dprop=value parameters.

This makes integration with tools such as Oozie harder, as it doesn't accept 
different addresses of ZK, etc. unless there's a hbase-site.xml on the 
classpath to load from (which is painful to achieve on Oozie).



[jira] [Commented] (HBASE-6733) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]

2012-10-03 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468909#comment-13468909
 ] 

Devaraj Das commented on HBASE-6733:


The patch should continue to work if, at some point, the log-switching 
behavior is changed so that the currentPath always points to a valid non-null 
path... But for now, yeah, null works as well (and I have checked in the Hadoop 
code that the equals implementation handles a null argument).

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]
 ---

 Key: HBASE-6733
 URL: https://issues.apache.org/jira/browse/HBASE-6733
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: 6733-1.patch, 6733-2.patch


 The failure is in TestReplication.queueFailover (fails due to unreplicated 
 rows). I have come across two problems:
 1. The sleepMultiplier is not properly reset when the currentPath is changed 
 (in ReplicationSource.java).
 2. ReplicationExecutor sometimes removes files to replicate from the queue too 
 early, so the corresponding edits go missing. The problem is that the 
 log-file length the replication executor finds is not the most recent one, 
 so it doesn't read anything from there, and 
 ultimately, when there is a log roll, the replication-queue gets a new entry, 
 and the executor drops the old entry out of the queue.



[jira] [Updated] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

2012-10-03 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6228:
---

Resolution: Implemented
Status: Resolved  (was: Patch Available)

Yes, I think the issue is not there any more.

Since we are using SSH to handle dead servers in failover mode, this piece of 
code in HMaster to fixup daughter is not needed any more.  I will remove it in 
HBASE-6611.

 Fixup daughters twice  cause daughter region assigned twice
 ---

 Key: HBASE-6228
 URL: https://issues.apache.org/jira/browse/HBASE-6228
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: HBASE-6228.patch, HBASE-6228v2.patch, 
 HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch


 First, how does fixing up daughters twice happen?
 1. We fixupDaughters at the end of HMaster#finishInitialization.
 2. ServerShutdownHandler will fixupDaughters when reassigning regions through 
 ServerShutdownHandler#processDeadRegion.
 When we fixupDaughters, we add the daughters to .META., but that can't 
 prevent the above case, because of FindDaughterVisitor.
 The details are as follows:
 Suppose region A is a split parent region, and its daughter region B is 
 missing.
 1. First, the ServerShutdownHandler thread fixes up the daughter, so it adds 
 daughter region B to .META. with serverName=null, and assigns the daughter.
 2. Then, the Master's initialization thread also finds that daughter region B 
 is missing and assigns it, because FindDaughterVisitor considers a daughter 
 missing if its serverName=null.



[jira] [Updated] (HBASE-6930) [89-fb] Avoid acquiring the same row lock repeatedly

2012-10-03 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-6930:
---

Attachment: D5841.1.patch

mbautin requested code review of [jira] [HBASE-6930] [89-fb] Fix 
TestThriftServerLegacy: notifyAll should be inside synchronized block.
Reviewers: Kannan, Liyin, Karthik, JIRA

  There were a couple of reasons why TestThriftServerLegacy has been failing 
recently in the HBase 89-fb branch:
  - rHBASEEIGHTNINEFBBRANCH1393468 was calling notifyAll outside a synchronized 
block
  - rHBASEEIGHTNINEFBBRANCH1391219 changed the meaning of a null expected value 
passed to checkAndMutate but that was not reflected in the Thrift handler
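
On the first point: Java requires the calling thread to own the object's monitor when it calls `notifyAll`; otherwise the call throws `IllegalMonitorStateException`. A minimal sketch with a hypothetical `Signal` class (not the HBase code in question):

```java
/** Hypothetical example class; shows correct and incorrect use of notifyAll. */
final class Signal {
    private final Object lock = new Object();
    private boolean ready;

    /** Correct: the monitor is held while the state changes and waiters are woken. */
    void markReady() {
        synchronized (lock) {
            ready = true;
            lock.notifyAll();
        }
    }

    /** Incorrect usage, kept only to demonstrate the failure mode. */
    boolean notifyOutsideLockFails() {
        try {
            lock.notifyAll(); // monitor not held by this thread
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    boolean isReady() {
        synchronized (lock) {
            return ready;
        }
    }
}
```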

TEST PLAN
  Run TestThriftServerLegacy

REVISION DETAIL
  https://reviews.facebook.net/D5841

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/13833/

To: Kannan, Liyin, Karthik, JIRA, mbautin


 [89-fb] Avoid acquiring the same row lock repeatedly
 

 Key: HBASE-6930
 URL: https://issues.apache.org/jira/browse/HBASE-6930
 Project: HBase
  Issue Type: Bug
Reporter: Liyin Tang
 Attachments: D5841.1.patch, D5841.2.patch


 When processing the multiPut, multiMutations or multiDelete operations, each 
 IPC handler thread tries to acquire a lock for each row key in these batches. 
 If there are duplicated row keys in these batches, previously the IPC handler 
 thread would acquire the same row lock again and again.
 So the optimization is to sort each batch operation by row key on 
 the client side, and skip acquiring the same row lock repeatedly on the 
 server side.
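 The sort-then-skip idea can be sketched with plain JDK collections. Everything named here (`RowLockBatch`, `sortByRow`, `rowsToLock`) is hypothetical, and row keys are modeled as `String` rather than byte arrays; the actual 89-fb change may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; row keys are Strings here rather than byte arrays. */
final class RowLockBatch {
    private RowLockBatch() {}

    /** Client side: order the batch so duplicate row keys become adjacent. */
    static List<String> sortByRow(List<String> rowKeys) {
        List<String> sorted = new ArrayList<>(rowKeys);
        Collections.sort(sorted);
        return sorted;
    }

    /** Server side: return the rows to lock, one entry per run of equal keys. */
    static List<String> rowsToLock(List<String> sortedRowKeys) {
        List<String> toLock = new ArrayList<>();
        String previous = null;
        for (String row : sortedRowKeys) {
            if (!row.equals(previous)) {
                toLock.add(row); // first time this row appears in the batch
            }
            previous = row;
        }
        return toLock;
    }
}
```

 Because the batch is sorted first, comparing each key with its predecessor is enough to detect duplicates; no extra set of held locks is needed for this step.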



[jira] [Updated] (HBASE-6930) [89-fb] Avoid acquiring the same row lock repeatedly

2012-10-03 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-6930:
---

Attachment: D5841.2.patch

mbautin updated the revision [jira] [HBASE-6930] [89-fb] Fix 
TestThriftServerLegacy: notifyAll should be inside synchronized block, and a 
null checkAndMutate expected value should be handled correctly.
Reviewers: Kannan, Liyin, Karthik, JIRA

  Adding ThriftServerRunner fixes

REVISION DETAIL
  https://reviews.facebook.net/D5841

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java
  src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServerLegacy.java

To: Kannan, Liyin, Karthik, JIRA, mbautin


 [89-fb] Avoid acquiring the same row lock repeatedly
 

 Key: HBASE-6930
 URL: https://issues.apache.org/jira/browse/HBASE-6930
 Project: HBase
  Issue Type: Bug
Reporter: Liyin Tang
 Attachments: D5841.1.patch, D5841.2.patch


 When processing the multiPut, multiMutations or multiDelete operations, each 
 IPC handler thread tries to acquire a lock for each row key in these batches. 
 If there are duplicated row keys in these batches, previously the IPC handler 
 thread would acquire the same row lock again and again.
 So the optimization is to sort each batch operation by row key on 
 the client side, and skip acquiring the same row lock repeatedly on the 
 server side.



[jira] [Commented] (HBASE-6733) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]

2012-10-03 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468920#comment-13468920
 ] 

Jean-Daniel Cryans commented on HBASE-6733:
---

You are right, but I'd rather have the code expose what it's really doing.

Also, reading more, this looks weird:

{code}
+  boolean pathNull = getNextPath();
...
-  if (!getNextPath()) {
+  if (!pathNull) {
{code}

{{getNextPath}} returns true if the path was not null so shouldn't the variable 
be named pathNotNull or hasCurrentPath and then remove the exclamation point? 

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]
 ---

 Key: HBASE-6733
 URL: https://issues.apache.org/jira/browse/HBASE-6733
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: 6733-1.patch, 6733-2.patch


 The failure is in TestReplication.queueFailover (fails due to unreplicated 
 rows). I have come across two problems:
 1. The sleepMultiplier is not properly reset when the currentPath is changed 
 (in ReplicationSource.java).
 2. ReplicationExecutor sometimes removes files to replicate from the queue too 
 early, so the corresponding edits go missing. The problem is that the 
 log-file length the replication executor finds is not the most recent one, 
 so it doesn't read anything from there, and 
 ultimately, when there is a log roll, the replication-queue gets a new entry, 
 and the executor drops the old entry out of the queue.



[jira] [Updated] (HBASE-6439) Ignore .archive directory as a table

2012-10-03 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-6439:
---

Status: Patch Available  (was: Open)

 Ignore .archive directory as a table
 

 Key: HBASE-6439
 URL: https://issues.apache.org/jira/browse/HBASE-6439
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
  Labels: newbie
 Attachments: hbase-6439-r0.patch


 From a recent test run:
 {quote}
 2012-07-22 02:27:30,699 WARN  [IPC Server handler 0 on 47087] 
 util.FSTableDescriptors(168): The following folder is in HBase's root 
 directory and doesn't contain a table descriptor, do consider deleting it: 
 .archive
 {quote}
 With the addition of HBASE-5547, table-level folders are no longer all table 
 folders. FSTableDescriptors then needs a 'gold-list' that we can 
 update with directories that aren't tables, so we don't have this kind of 
 warning showing up in the logs.
 Currently, we have the following block:
 {quote}
 invocations++;
 if (HTableDescriptor.ROOT_TABLEDESC.getNameAsString().equals(tablename)) {
   cachehits++;
   return HTableDescriptor.ROOT_TABLEDESC;
 }
 if (HTableDescriptor.META_TABLEDESC.getNameAsString().equals(tablename)) {
   cachehits++;
   return HTableDescriptor.META_TABLEDESC;
 }
 {quote}
 to handle special cases, but that's a bit clunky and not clean in terms of 
 table-level directories that need to be ignored.
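 One way to picture such a list is a small set of directory names consulted before any table-descriptor lookup. The sketch below is an assumption, not the attached patch: `NonTableDirs` and `isIgnored` are made-up names, and `.archive` is the only entry because it is the only directory the issue mentions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of an ignore list for non-table directories. */
final class NonTableDirs {
    private NonTableDirs() {}

    // Assumed contents: only ".archive" is named in the issue.
    private static final Set<String> IGNORED =
        Collections.unmodifiableSet(new HashSet<>(Arrays.asList(".archive")));

    /** Returns true when the directory should not be treated as a table. */
    static boolean isIgnored(String dirName) {
        return IGNORED.contains(dirName);
    }
}
```

 A single set keeps the special cases in one place, so adding another non-table directory later is a one-line change.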



[jira] [Commented] (HBASE-6930) [89-fb] Avoid acquiring the same row lock repeatedly

2012-10-03 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468929#comment-13468929
 ] 

Phabricator commented on HBASE-6930:


Liyin has accepted the revision [jira] [HBASE-6930] [89-fb] Fix 
TestThriftServerLegacy: notifyAll should be inside synchronized block, and a 
null checkAndMutate expected value should be handled correctly.

  Thanks Mikhail !

REVISION DETAIL
  https://reviews.facebook.net/D5841

BRANCH
  fix_locked_rows_v2

To: Kannan, Liyin, Karthik, JIRA, mbautin


 [89-fb] Avoid acquiring the same row lock repeatedly
 

 Key: HBASE-6930
 URL: https://issues.apache.org/jira/browse/HBASE-6930
 Project: HBase
  Issue Type: Bug
Reporter: Liyin Tang
 Attachments: D5841.1.patch, D5841.2.patch


 When processing the multiPut, multiMutations or multiDelete operations, each 
 IPC handler thread tries to acquire a lock for each row key in these batches. 
 If there are duplicated row keys in these batches, previously the IPC handler 
 thread would acquire the same row lock again and again.
 So the optimization is to sort each batch operation by row key on 
 the client side, and skip acquiring the same row lock repeatedly on the 
 server side.



[jira] [Updated] (HBASE-6941) LoadIncrementalHFiles uses the Tool interface incorrectly for loading configs

2012-10-03 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HBASE-6941:
---

Attachment: HBASE-6941.patch

 LoadIncrementalHFiles uses the Tool interface incorrectly for loading configs
 -

 Key: HBASE-6941
 URL: https://issues.apache.org/jira/browse/HBASE-6941
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.90.6
Reporter: Harsh J
Assignee: Harsh J
 Attachments: HBASE-6941.patch


 The LoadIncrementalHFiles tool has fairly complex config loading built into 
 it, which seems unnecessary and also causes problems, since it ignores 
 any settings passed to it via Tool's -Dprop=value parameters.
 This makes integration with tools such as Oozie harder, as it doesn't accept 
 different addresses of ZK, etc. unless there's a hbase-site.xml on the 
 classpath to load from (which is painful to achieve on Oozie).



[jira] [Commented] (HBASE-6941) LoadIncrementalHFiles uses the Tool interface incorrectly for loading configs

2012-10-03 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468932#comment-13468932
 ] 

Harsh J commented on HBASE-6941:


- Unified the configuration to use the getConf() from Configured alone. Added 
HBase configs to it upon construction. This is the right way to use Tool + 
HBaseConfiguration.
- Moved 3 of the config params used in the tool into HConstants as constants, and 
updated their references in the tool.
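
The behavior the fix is after, command-line -Dprop=value settings overriding whatever was loaded by default, can be pictured with a JDK-only stand-in. This deliberately does not use Hadoop's classes: `ConfSketch` and its fields are hypothetical, and the parsing here is a simplification of what a real launcher does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for "defaults plus -D overrides"; not Hadoop's API. */
final class ConfSketch {
    final Map<String, String> conf;
    final List<String> remainingArgs = new ArrayList<>();

    ConfSketch(Map<String, String> defaults, String[] args) {
        conf = new HashMap<>(defaults); // start from site/default values
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("-D") && eq > 2) {
                // -Dkey=value on the command line wins over the defaults
                conf.put(arg.substring(2, eq), arg.substring(eq + 1));
            } else {
                remainingArgs.add(arg); // left for the tool itself
            }
        }
    }
}
```

The tool then reads every setting from the one merged map, which is the role getConf() plays in the comment above.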

 LoadIncrementalHFiles uses the Tool interface incorrectly for loading configs
 -

 Key: HBASE-6941
 URL: https://issues.apache.org/jira/browse/HBASE-6941
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.90.6
Reporter: Harsh J
Assignee: Harsh J
 Attachments: HBASE-6941.patch


 The LoadIncrementalHFiles tool has fairly complex config loading built into 
 it, which seems unnecessary and also causes problems, since it ignores 
 any settings passed to it via Tool's -Dprop=value parameters.
 This makes integration with tools such as Oozie harder, as it doesn't accept 
 different addresses of ZK, etc. unless there's a hbase-site.xml on the 
 classpath to load from (which is painful to achieve on Oozie).



[jira] [Updated] (HBASE-6941) LoadIncrementalHFiles uses the Tool interface incorrectly for loading configs

2012-10-03 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HBASE-6941:
---

Status: Patch Available  (was: Open)

 LoadIncrementalHFiles uses the Tool interface incorrectly for loading configs
 -

 Key: HBASE-6941
 URL: https://issues.apache.org/jira/browse/HBASE-6941
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.90.6
Reporter: Harsh J
Assignee: Harsh J
 Attachments: HBASE-6941.patch


 The LoadIncrementalHFiles tool has fairly complex config loading built into 
 it, which seems unnecessary and also causes problems, since it ignores 
 any settings passed to it via Tool's -Dprop=value parameters.
 This makes integration with tools such as Oozie harder, as it doesn't accept 
 different addresses of ZK, etc. unless there's a hbase-site.xml on the 
 classpath to load from (which is painful to achieve on Oozie).



[jira] [Commented] (HBASE-6733) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]

2012-10-03 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468933#comment-13468933
 ] 

Devaraj Das commented on HBASE-6733:


bq. but I'd rather have the code expose what it's really doing.

Do you want me to put a comment or something?

bq. the variable be named pathNotNull or hasCurrentPath and then remove the 
exclamation point?

Agree. I'll rename pathNull to hasCurrentPath (but the check will remain the 
same - if (!hasCurrentPath) ..)
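
The rename can be sketched like this. It is a stand-alone illustration rather than the patch: `RenameSketch`, `enqueue`, `step`, and the queue-backed body of `getNextPath()` are assumptions; only the names `getNextPath`, `pathNull`, and `hasCurrentPath` come from the thread.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch of the rename; the queue and getNextPath() body are stand-ins. */
final class RenameSketch {
    private final Queue<String> queue = new ArrayDeque<>();
    private String currentPath;

    void enqueue(String path) { queue.add(path); }

    /** Returns true if a current path is available after trying the queue. */
    boolean getNextPath() {
        if (currentPath == null) {
            currentPath = queue.poll(); // poll() returns null when the queue is empty
        }
        return currentPath != null;
    }

    /** One loop iteration: false means "nothing to read yet, sleep and retry". */
    boolean step() {
        boolean hasCurrentPath = getNextPath(); // was: pathNull
        if (!hasCurrentPath) {
            return false;
        }
        return true;
    }
}
```

With the variable named for what `getNextPath()` actually returns, the negated check reads naturally as "if there is no current path".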

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]
 ---

 Key: HBASE-6733
 URL: https://issues.apache.org/jira/browse/HBASE-6733
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: 6733-1.patch, 6733-2.patch


 The failure is in TestReplication.queueFailover (fails due to unreplicated 
 rows). I have come across two problems:
 1. The sleepMultiplier is not properly reset when the currentPath is changed 
 (in ReplicationSource.java).
 2. ReplicationExecutor sometimes removes files to replicate from the queue too 
 early, so the corresponding edits go missing. The problem is that the 
 log-file length the replication executor finds is not the most recent one, 
 so it doesn't read anything from there, and 
 ultimately, when there is a log roll, the replication-queue gets a new entry, 
 and the executor drops the old entry out of the queue.



[jira] [Comment Edited] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

2012-10-03 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468913#comment-13468913
 ] 

Jimmy Xiang edited comment on HBASE-6228 at 10/4/12 9:34 AM:
-

Yes, I think the issue is not there any more.


  was (Author: jxiang):
Yes, I think the issue is not there any more.

Since we are using SSH to handle dead servers in failover mode, this piece of 
code in HMaster to fixup daughter is not needed any more.  I will remove it in 
HBASE-6611.
  
 Fixup daughters twice  cause daughter region assigned twice
 ---

 Key: HBASE-6228
 URL: https://issues.apache.org/jira/browse/HBASE-6228
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: HBASE-6228.patch, HBASE-6228v2.patch, 
 HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch


 First, how does fixing up daughters twice happen?
 1. We fixupDaughters at the end of HMaster#finishInitialization.
 2. ServerShutdownHandler will fixupDaughters when reassigning regions through 
 ServerShutdownHandler#processDeadRegion.
 When we fixupDaughters, we add the daughters to .META., but that can't 
 prevent the above case, because of FindDaughterVisitor.
 The details are as follows:
 Suppose region A is a split parent region, and its daughter region B is 
 missing.
 1. First, the ServerShutdownHandler thread fixes up the daughter, so it adds 
 daughter region B to .META. with serverName=null, and assigns the daughter.
 2. Then, the Master's initialization thread also finds that daughter region B 
 is missing and assigns it, because FindDaughterVisitor considers a daughter 
 missing if its serverName=null.



[jira] [Commented] (HBASE-6733) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]

2012-10-03 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468936#comment-13468936
 ] 

Jean-Daniel Cryans commented on HBASE-6733:
---

bq.  (but the check will remain the same - if (!hasCurrentPath) ..)

Ah geez yeah keep that. Damn double negations.

bq. Do you want me to put a comment or something?

Check for null if that's what you expect I'd say.

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]
 ---

 Key: HBASE-6733
 URL: https://issues.apache.org/jira/browse/HBASE-6733
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: 6733-1.patch, 6733-2.patch


 The failure is in TestReplication.queueFailover (fails due to unreplicated 
 rows). I have come across two problems:
 1. The sleepMultiplier is not properly reset when the currentPath is changed 
 (in ReplicationSource.java).
 2. ReplicationExecutor sometimes removes files to replicate from the queue too 
 early, so the corresponding edits go missing. The problem is that the 
 log-file length the replication executor finds is not the most recent one, 
 so it doesn't read anything from there, and 
 ultimately, when there is a log roll, the replication-queue gets a new entry, 
 and the executor drops the old entry out of the queue.



[jira] [Updated] (HBASE-6941) LoadIncrementalHFiles uses the Tool interface incorrectly for loading configs

2012-10-03 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HBASE-6941:
---

Status: Open  (was: Patch Available)

Missed test-reliance. Cancelling, until I complete that.

 LoadIncrementalHFiles uses the Tool interface incorrectly for loading configs
 -

 Key: HBASE-6941
 URL: https://issues.apache.org/jira/browse/HBASE-6941
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.90.6
Reporter: Harsh J
Assignee: Harsh J
 Attachments: HBASE-6941.patch


 The LoadIncrementalHFiles tool has fairly complex config loading built into 
 it, which seems unnecessary and also causes problems, since it ignores 
 any settings passed to it via Tool's -Dprop=value parameters.
 This makes integration with tools such as Oozie harder, as it doesn't accept 
 different addresses of ZK, etc. unless there's a hbase-site.xml on the 
 classpath to load from (which is painful to achieve on Oozie).



[jira] [Updated] (HBASE-6733) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]

2012-10-03 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-6733:
---

Attachment: 6733-3.patch

This should address your comments, [~jdcryans]

 [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]
 ---

 Key: HBASE-6733
 URL: https://issues.apache.org/jira/browse/HBASE-6733
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.92.3

 Attachments: 6733-1.patch, 6733-2.patch, 6733-3.patch


 The failure is in TestReplication.queueFailover (fails due to unreplicated 
 rows). I have come across two problems:
 1. The sleepMultiplier is not properly reset when the currentPath is changed 
 (in ReplicationSource.java).
 2. ReplicationExecutor sometimes removes files to replicate from the queue too 
 early, resulting in the corresponding edits going missing. The problem here is 
 that the log-file length the replication executor finds is not the 
 most up-to-date one, so it doesn't read anything from the file; 
 ultimately, when there is a log roll, the replication queue gets a new entry, 
 and the executor drops the old entry from the queue.
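
As an illustration of the first problem, the following is a small sketch of a backoff whose multiplier restarts whenever the file being read changes. `PathBackoff` and its method names are hypothetical and heavily simplified; this is not the code in ReplicationSource.java, only a model of the intended behaviour.

```java
/** Hypothetical, simplified model of a per-file retry backoff. */
final class PathBackoff {
    private final long baseSleepMs;
    private final int maxMultiplier;
    private String currentPath;
    private int sleepMultiplier = 1;

    PathBackoff(long baseSleepMs, int maxMultiplier) {
        this.baseSleepMs = baseSleepMs;
        this.maxMultiplier = maxMultiplier;
    }

    /** Switches to a new file and restarts the backoff for it. */
    void setCurrentPath(String path) {
        if (!path.equals(currentPath)) {
            currentPath = path;
            sleepMultiplier = 1; // the new file must not inherit the old file's penalty
        }
    }

    /** Called when nothing could be read: returns the sleep time, then grows the multiplier up to the cap. */
    long nextSleepMs() {
        long sleep = baseSleepMs * sleepMultiplier;
        if (sleepMultiplier < maxMultiplier) {
            sleepMultiplier++;
        }
        return sleep;
    }

    public static void main(String[] args) {
        PathBackoff backoff = new PathBackoff(100L, 3);
        backoff.setCurrentPath("log.1");
        System.out.println(backoff.nextSleepMs());
        System.out.println(backoff.nextSleepMs());
        backoff.setCurrentPath("log.2");
        System.out.println(backoff.nextSleepMs());
    }
}
```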



[jira] [Commented] (HBASE-6738) Too aggressive task resubmission from the distributed log manager

2012-10-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468966#comment-13468966
 ] 

Hudson commented on HBASE-6738:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #206 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/206/])
HBASE-6738  Too aggressive task resubmission from the distributed log 
manager (Revision 1393537)

 Result = FAILURE
nkeywal : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java


 Too aggressive task resubmission from the distributed log manager
 -

 Key: HBASE-6738
 URL: https://issues.apache.org/jira/browse/HBASE-6738
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.1, 0.96.0
 Environment: 3 nodes cluster test, but can occur as well on a much 
 bigger one. It's all luck!
Reporter: nkeywal
Assignee: nkeywal
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6738.v1.patch


 With the default settings, hbase.splitlog.manager.timeout = 25s and 
 hbase.splitlog.max.resubmit = 3.
 In the tests mentioned in HBASE-5843, I see variations on this scenario 
 (0.94 + HDFS 1.0.3):
 The regionserver in charge of the split does not answer in less than 25s, so 
 it gets interrupted but actually continues. Sometimes we run out of 
 retries, sometimes not; sometimes we run out of retries but, because the 
 interrupts were ignored, the split finishes fine anyway. In the meantime, the 
 same single task is executed in parallel by multiple nodes, increasing the 
 probability of race conditions.
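
The "interrupted but actually continues" behaviour typically comes from a retry loop that treats an interrupt like any other I/O failure. Below is a minimal sketch of a loop that does honour the interrupt. It is an assumption-laden illustration, not HBase code: `InterruptAwareRetry`, `Step`, and `runWithRetry` are hypothetical names.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

final class InterruptAwareRetry {
    /** A unit of work that may fail; hypothetical interface for this sketch. */
    interface Step {
        void run() throws IOException;
    }

    /**
     * Retries on ordinary IOExceptions, but stops at once when the failure is
     * an interrupt: the thread's interrupt flag is restored and the exception
     * is rethrown. Returns true on success, false if all attempts failed.
     */
    static boolean runWithRetry(Step step, int maxRetry) throws InterruptedIOException {
        for (int attempt = 0; attempt < maxRetry; attempt++) {
            try {
                step.run();
                return true;
            } catch (InterruptedIOException e) {
                // Must be caught before IOException, which is its superclass.
                Thread.currentThread().interrupt();
                throw e;
            } catch (IOException e) {
                // Ordinary failure: fall through and try again.
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        int[] calls = {0};
        boolean ok = runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("transient failure");
            }
        }, 5);
        System.out.println(ok + " after " + calls[0] + " calls");
    }
}
```

One caveat for real code: java.net.SocketTimeoutException is also an InterruptedIOException, so a production version would likely need to tell a timeout apart from a genuine interrupt.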
 Details:
 t0: unplug a box with DN+RS.
 t + x: other boxes are already connected, so their connections start to die. 
 Nevertheless, they don't consider this node as suspect.
 t + 180s: zookeeper - master detects the node as dead. Recovery starts. It 
 can be less than 180s; sometimes it is around 150s.
 t + 180s: distributed split starts. There is only 1 task; it's immediately 
 acquired by one RS.
 t + 205s: the RS has multiple errors when splitting, because a datanode is 
 missing as well. The master decides to give the task to someone else. But 
 often the task continues in the first RS. Interrupts are often ignored, as 
 is clearly stated in the code (// TODO interrupt often gets swallowed, do 
 what else?)
 {code}
2012-09-04 18:27:30,404 INFO 
 org.apache.hadoop.hbase.regionserver.SplitLogWorker: Sending interrupt to 
 stop the worker thread
 {code}
 t + 211s: two regionservers are processing the same task. They fight for the 
 leases:
 {code}
 2012-09-04 18:27:32,004 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
 Exception: org.apache.hadoop.ipc.RemoteException:  
 org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: Lease mismatch 
 on

 /hbase/TABLE/4d1c1a4695b1df8c58d13382b834332e/recovered.edits/037.temp
  owned by DFSClient_hb_rs_BOX2,60020,1346775882980 but is accessed by 
 DFSClient_hb_rs_BOX1,60020,1346775719125
 {code}
  They can fight like this for many files, until the tasks finally get 
 interrupted or finished.
  The task on the second box can be cancelled as well. In this case, the 
 task is created again for a new box.
  The master seems to stop after 3 attempts. It can also give up on 
 splitting the files. Sometimes the tasks were not cancelled on the RS side, so 
 the split finishes despite what the master thinks and logs. In this case, 
 the assignment starts. In the other case, we've got a problem.
 {code}
 2012-09-04 18:43:52,724 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 Skipping resubmissions of task 
 /hbase/splitlog/hdfs%3A%2F%2FBOX1%3A9000%2Fhbase%2F.logs%2FBOX0%2C60020%2C1346776587640-splitting%2FBOX0%252C60020%252C1346776587640.1346776587832
  because threshold 3 reached 
 {code}
 t + 300s: split is finished. Assignment starts.
 t + 330s: assignment is finished, regions are available again.
 There are a lot of possible subcases, depending on the number of log files, 
 regionservers, and so on.
 The issues are:
 1) it's difficult, especially in HBase but not only, to interrupt a task. The 
 pattern is often
 {code}
 void f() throws IOException {
   try {
     // whatever throws InterruptedException
   } catch (InterruptedException e) {
     throw new InterruptedIOException();
   }
 }
 boolean g() {
   int nbRetry = 0;
   for (;;)
     try {
       f();
       return true;
     } catch (IOException e) {
       nbRetry++;
       if (nbRetry > maxRetry) return 

[jira] [Commented] (HBASE-6439) Ignore .archive directory as a table

2012-10-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468969#comment-13468969
 ] 

Hadoop QA commented on HBASE-6439:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12547613/hbase-6439-r0.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 12 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
81 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.backup.example.TestZooKeeperTableArchiveClient

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2995//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2995//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2995//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2995//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2995//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2995//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2995//console

This message is automatically generated.

 Ignore .archive directory as a table
 

 Key: HBASE-6439
 URL: https://issues.apache.org/jira/browse/HBASE-6439
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
  Labels: newbie
 Attachments: hbase-6439-r0.patch


 From a recent test run:
 {quote}
 2012-07-22 02:27:30,699 WARN  [IPC Server handler 0 on 47087] 
 util.FSTableDescriptors(168): The following folder is in HBase's root 
 directory and doesn't contain a table descriptor, do consider deleting it: 
 .archive
 {quote}
 With the addition of HBASE-5547, not all table-level folders are table 
 folders any longer. FSTableDescriptors therefore needs a 'gold-list' that we 
 can update with directories that aren't tables, so we don't have this kind of 
 thing showing up in the logs.
 Currently, we have the following block:
 {quote}
 invocations++;
 if (HTableDescriptor.ROOT_TABLEDESC.getNameAsString().equals(tablename)) {
   cachehits++;
   return HTableDescriptor.ROOT_TABLEDESC;
 }
 if (HTableDescriptor.META_TABLEDESC.getNameAsString().equals(tablename)) {
   cachehits++;
   return HTableDescriptor.META_TABLEDESC;
 }
 {quote}
 to handle special cases, but that's a bit clunky and doesn't cleanly cover 
 table-level directories that need to be ignored.
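
A 'gold-list' of this kind could be as simple as a fixed set of directory names that is checked before any table-descriptor lookup. The sketch below is a hypothetical illustration only: `NonTableDirs` and `isTableDir` are invented names, and the set contents are examples (".archive" from the warning above, ".logs" as another non-table directory under the HBase root), not the list the patch uses.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class NonTableDirs {
    /** Directories under the HBase root that are known not to be tables (illustrative list). */
    private static final Set<String> NON_TABLE_DIRS =
        Collections.unmodifiableSet(new HashSet<>(Arrays.asList(".archive", ".logs")));

    /** Returns false for directories on the list, so callers can skip them quietly. */
    static boolean isTableDir(String dirName) {
        return !NON_TABLE_DIRS.contains(dirName);
    }

    public static void main(String[] args) {
        for (String dir : new String[] {".archive", "mytable"}) {
            System.out.println(dir + " -> " + (isTableDir(dir) ? "load descriptor" : "skip"));
        }
    }
}
```

Keeping the names in one set means a new non-table directory only has to be added in one place, rather than growing the chain of special-case `if` blocks.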



[jira] [Updated] (HBASE-6916) HBA logs at info level errors that won't show in the shell

2012-10-03 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-6916:
--

Attachment: HBASE-6916.patch

Attaching the trunk patch.

 HBA logs at info level errors that won't show in the shell
 --

 Key: HBASE-6916
 URL: https://issues.apache.org/jira/browse/HBASE-6916
 Project: HBase
  Issue Type: Bug
  Components: shell
Affects Versions: 0.90.6, 0.92.1, 0.94.1, 0.96.0
Reporter: Jean-Daniel Cryans
Priority: Minor
 Fix For: 0.92.2, 0.94.3, 0.96.0

 Attachments: HBASE-6916-0.94.patch, HBASE-6916.patch


 There is a weird interaction between the shell and HBA. When you try to close 
 a region that doesn't exist, it doesn't throw any error:
 {noformat}
 hbase(main):029:0> close_region 'thisisaninvalidregion'
 0 row(s) in 0.0580 seconds
 {noformat}
 Normally one should get UnknownRegionException. Starting the shell with -d 
 I see what a non-shell user would see along with a ton of logging from ZK 
 (skipped here):
 {noformat}
 INFO client.HBaseAdmin: No server in .META. for thisisaninvalidregion; 
 pair=null
 {noformat}
 But again, this is not the right message; it should have shown
 {noformat}
 INFO client.HBaseAdmin: No server in .META. for thisisaninvalidregion; 
 pair=null
 {noformat}
 This is because that part of the code treats UnknownRegionException 
 and NoServerForRegionException as if they were the same thing.
 There is also some ugliness in flush, compact, and split, but it normally 
 doesn't show since the code treats everything as if it were a table and throws 
 a TableNotFoundException.
 This jira is about making sure that the correct exceptions come out.
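
The distinction being asked for can be sketched as follows. This is a self-contained toy model, not HBaseAdmin: `RegionLookup` is a hypothetical class, and the two nested exception types are local stand-ins that merely share their names with the HBase exceptions mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

final class RegionLookup {
    /** Local stand-in: the region name is not known at all. */
    static class UnknownRegionException extends Exception {
        UnknownRegionException(String region) {
            super("Unknown region: " + region);
        }
    }

    /** Local stand-in: the region exists but no server currently hosts it. */
    static class NoServerForRegionException extends Exception {
        NoServerForRegionException(String region) {
            super("No server for region: " + region);
        }
    }

    /** Region name mapped to its hosting server; a null value means "unassigned". */
    private final Map<String, String> regionToServer = new HashMap<>();

    void addRegion(String region, String serverOrNull) {
        regionToServer.put(region, serverOrNull);
    }

    /** Reports the two failure cases separately instead of folding them into one. */
    String serverFor(String region)
            throws UnknownRegionException, NoServerForRegionException {
        if (!regionToServer.containsKey(region)) {
            throw new UnknownRegionException(region);
        }
        String server = regionToServer.get(region);
        if (server == null) {
            throw new NoServerForRegionException(region);
        }
        return server;
    }

    public static void main(String[] args) {
        RegionLookup lookup = new RegionLookup();
        lookup.addRegion("r2", null);
        for (String region : new String[] {"r2", "thisisaninvalidregion"}) {
            try {
                System.out.println(lookup.serverFor(region));
            } catch (Exception e) {
                System.out.println(e.getMessage());
            }
        }
    }
}
```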



[jira] [Updated] (HBASE-6916) HBA logs at info level errors that won't show in the shell

2012-10-03 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-6916:
--

Fix Version/s: (was: 0.90.7)
 Assignee: Jean-Daniel Cryans
   Status: Patch Available  (was: Open)

 HBA logs at info level errors that won't show in the shell
 --

 Key: HBASE-6916
 URL: https://issues.apache.org/jira/browse/HBASE-6916
 Project: HBase
  Issue Type: Bug
  Components: shell
Affects Versions: 0.94.1, 0.92.1, 0.90.6, 0.96.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Minor
 Fix For: 0.94.3, 0.96.0, 0.92.2

 Attachments: HBASE-6916-0.94.patch, HBASE-6916.patch





[jira] [Resolved] (HBASE-6883) CleanerChore treats .archive as a table and throws TableInfoMissingException

2012-10-03 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang resolved HBASE-6883.


Resolution: Duplicate

Duplicate of HBASE-6439

 CleanerChore treats .archive as a table and throws TableInfoMissingException
 

 Key: HBASE-6883
 URL: https://issues.apache.org/jira/browse/HBASE-6883
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang

 {noformat}
 2012-09-25 14:52:21,902 DEBUG 
 org.apache.hadoop.hbase.util.FSTableDescriptors: Exception during 
 readTableDecriptor. Current table name = .archive
 org.apache.hadoop.hbase.TableInfoMissingException: No .tableinfo file under 
 hdfs://c0322.hal.cloudera.com:56020/hbase/.archive
 at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptor(FSTableDescriptors.java:417)
 at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptor(FSTableDescriptors.java:408)
 at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:170)
 at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:201)
 at 
 org.apache.hadoop.hbase.master.HMaster.getTableDescriptors(HMaster.java:2205)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.hbase.ipc.ProtobufRpcEngine$Server.call(ProtobufRpcEngine.java:357)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1816)
 {noformat}



[jira] [Commented] (HBASE-6439) Ignore .archive directory as a table

2012-10-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468982#comment-13468982
 ] 

stack commented on HBASE-6439:
--

[~jesse_yates] Did this test 
org.apache.hadoop.hbase.backup.example.TestZooKeeperTableArchiveClient fail?

 Ignore .archive directory as a table
 

 Key: HBASE-6439
 URL: https://issues.apache.org/jira/browse/HBASE-6439
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
  Labels: newbie
 Attachments: hbase-6439-r0.patch





[jira] [Commented] (HBASE-6758) [replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file

2012-10-03 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468986#comment-13468986
 ] 

Devaraj Das commented on HBASE-6758:


In the trunk case, I think something better can be done (and the interface 
changes can be avoided). Replication.postLogRoll could enqueue the 
new path in the ReplicationSource's queue. Replication.preLogRoll would do 
everything else (creating ZK entries, etc.) except enqueuing the path in 
the queue.

The postLogRoll is currently called before the writer is reset (to 
_nextWriter_) in FSHLog.rollWriter. I propose that it be called after the 
writer is reset. That, in my opinion, is a more precise place for 
calling postLogRoll.

Thoughts?

 [replication] The replication-executor should make sure the file that it is 
 replicating is closed before declaring success on that file
 ---

 Key: HBASE-6758
 URL: https://issues.apache.org/jira/browse/HBASE-6758
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6758-1-0.92.patch, 6758-2-0.92.patch, 
 6758-trunk-1.patch, 
 TEST-org.apache.hadoop.hbase.replication.TestReplication.xml


 I have seen cases where the replication-executor would lose data to replicate 
 since the file hasn't been closed yet. Upon closing, the new data becomes 
 visible. Before that happens the ZK node shouldn't be deleted in 
 ReplicationSourceManager.logPositionAndCleanOldLogs. Changes need to be made 
 in ReplicationSource.processEndOfFile as well (currentPath related).



[jira] [Commented] (HBASE-6916) HBA logs at info level errors that won't show in the shell

2012-10-03 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468988#comment-13468988
 ] 

Jimmy Xiang commented on HBASE-6916:


+1

 HBA logs at info level errors that won't show in the shell
 --

 Key: HBASE-6916
 URL: https://issues.apache.org/jira/browse/HBASE-6916
 Project: HBase
  Issue Type: Bug
  Components: shell
Affects Versions: 0.90.6, 0.92.1, 0.94.1, 0.96.0
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Minor
 Fix For: 0.92.2, 0.94.3, 0.96.0

 Attachments: HBASE-6916-0.94.patch, HBASE-6916.patch





[jira] [Updated] (HBASE-6872) Number of records written/read per second on regionserver level

2012-10-03 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-6872:
---

Attachment: D5853.1.patch

mbautin requested code review of [jira] [HBASE-6872] [89-fb] Fix 
TestRegionServerMetrics.testNumReadsAndWrites.
Reviewers: Kannan, Karthik, JIRA

  rHBASEEIGHTNINEFBBRANCH1389841 introduced an unstable test in 
TestRegionServerMetrics: testNumReadsAndWrites. Read and write counters should 
be reset to zero before starting the test.

TEST PLAN
  Run TestRegionServerMetrics

REVISION DETAIL
  https://reviews.facebook.net/D5853

AFFECTED FILES
  
src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/13863/

To: Kannan, Karthik, JIRA, mbautin


 Number of records written/read per second on regionserver level
 ---

 Key: HBASE-6872
 URL: https://issues.apache.org/jira/browse/HBASE-6872
 Project: HBase
  Issue Type: New Feature
  Components: regionserver
Reporter: Adela Maznikar
Priority: Minor
 Attachments: D5853.1.patch


 Regionserver-level metrics that show the number of records written/read per 
 second.
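
One simple way to derive such a per-second figure is to count operations and divide by the length of the reporting window, resetting the counter each time it is read. The sketch below assumes exactly that and nothing more: `OpsRate` and its methods are hypothetical names, not the metrics classes used in the 89-fb branch.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical per-window rate counter; safe to increment from many threads. */
final class OpsRate {
    private final LongAdder count = new LongAdder();

    /** Records one read or write. */
    void increment() {
        count.increment();
    }

    /**
     * Returns operations per second over a window of the given length and
     * resets the counter so the next window starts from zero.
     */
    double ratePerSecondAndReset(long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        long n = count.sumThenReset();
        return n * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        OpsRate writes = new OpsRate();
        for (int i = 0; i < 50; i++) {
            writes.increment();
        }
        System.out.println(writes.ratePerSecondAndReset(500L) + " writes/s");
    }
}
```

Resetting on read also matters for tests: a test that asserts on the counts needs the counters to start from zero, otherwise earlier activity leaks into the result.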



[jira] [Commented] (HBASE-6439) Ignore .archive directory as a table

2012-10-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468985#comment-13468985
 ] 

stack commented on HBASE-6439:
--

Otherwise, +1 on patch.  Does this get rid of this issue, Jesse, that I see 
when I start up hbase?

{code}
2012-10-03 12:34:28,515 DEBUG org.apache.hadoop.hbase.util.FSTableDescriptors: 
Exception during readTableDecriptor. Current table name = .archive
org.apache.hadoop.hbase.TableInfoMissingException: No .tableinfo file under 
file:/Users/Stack/Downloads/hbase-stack/hbase/.archive
{code}

What do you reckon about the test failure?



 Ignore .archive directory as a table
 

 Key: HBASE-6439
 URL: https://issues.apache.org/jira/browse/HBASE-6439
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
  Labels: newbie
 Attachments: hbase-6439-r0.patch





[jira] [Commented] (HBASE-6872) Number of records written/read per second on regionserver level

2012-10-03 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468995#comment-13468995
 ] 

Phabricator commented on HBASE-6872:


Kannan has added CCs to the revision [jira] [HBASE-6872] [89-fb] Fix 
TestRegionServerMetrics.testNumReadsAndWrites.
Added CCs: adela, Liyin, aaiyer, avf

REVISION DETAIL
  https://reviews.facebook.net/D5853

To: Kannan, Karthik, JIRA, mbautin
Cc: adela, Liyin, aaiyer, avf


 Number of records written/read per second on regionserver level
 ---

 Key: HBASE-6872
 URL: https://issues.apache.org/jira/browse/HBASE-6872
 Project: HBase
  Issue Type: New Feature
  Components: regionserver
Reporter: Adela Maznikar
Priority: Minor
 Attachments: D5853.1.patch




