[jira] [Commented] (HBASE-6900) RegionScanner.reseek() creates NPE when a flush or compaction happens before the reseek.

2012-10-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470046#comment-13470046
 ] 

ramkrishna.s.vasudevan commented on HBASE-6900:
---

@Lars
It's better if we get it into 0.94.2.  I checked the lastTop behaviour once 
again.  Even if updateReaders happens twice, I believe lastTop will not be 
null.  I verified this with a testcase.
{code}
// All public synchronized API calls will call 'checkReseek' which will cause
// the scanner stack to reseek if this.heap==null && this.lastTop != null.
// But if two calls to updateReaders() happen without a 'next' or 'peek' then we
// will end up calling this.peek() which would cause a reseek in the middle of a
// updateReaders which is NOT what we want, not to mention could cause an NPE.
// So we early out here.
if (this.heap == null) return;
{code}
So I think the patch that you gave should be fine without additional checks.  +1 
on it.
Lars, if you can commit it that would be great, because I cannot commit from 
the office and I may not be able to do it today.
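
To illustrate the early-out, here is a minimal sketch. It is not the StoreScanner code: `ToyScanner`, its String keys and its array-backed "heap" are hypothetical stand-ins, and only the names heap, lastTop, updateReaders and checkReseek are taken from the comment above.

```java
// Toy model of the heap/lastTop handshake described above; not HBase code.
class ToyScanner {
    private final String[] source;  // sorted keys, stand-in for the store files
    private String[] heap;          // null means "rebuild before the next read"
    private int pos;                // index of the current top of the heap
    private String lastTop;         // top key saved when the heap was dropped

    ToyScanner(String... sortedKeys) {
        this.source = sortedKeys.clone();
        this.heap = this.source;
    }

    // Called when a flush or compaction changes the set of store files.
    synchronized void updateReaders() {
        if (heap == null) {
            return;  // early out: a second call must not overwrite lastTop
        }
        lastTop = top();
        heap = null;
    }

    synchronized String peek() {
        checkReseek();
        return top();
    }

    synchronized String next() {
        checkReseek();
        String result = top();
        if (result != null) {
            pos++;
        }
        return result;
    }

    synchronized String lastTop() {
        return lastTop;
    }

    // Rebuilds the heap and repositions it at the saved key, if needed.
    private void checkReseek() {
        if (heap == null && lastTop != null) {
            heap = source;
            pos = 0;
            while (pos < heap.length && heap[pos].compareTo(lastTop) < 0) {
                pos++;
            }
            lastTop = null;
        }
    }

    private String top() {
        return (heap != null && pos < heap.length) ? heap[pos] : null;
    }

    public static void main(String[] args) {
        ToyScanner scanner = new ToyScanner("a", "b", "c");
        scanner.next();            // consume "a"
        scanner.updateReaders();   // e.g. a flush
        scanner.updateReaders();   // e.g. a compaction, with no next()/peek() in between
        System.out.println(scanner.lastTop());
        System.out.println(scanner.peek());
    }
}
```

In this toy, the second updateReaders() returns immediately, so the saved key survives and the following peek() resumes at the same position.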


 RegionScanner.reseek() creates NPE when a flush or compaction happens before 
 the reseek.
 

 Key: HBASE-6900
 URL: https://issues.apache.org/jira/browse/HBASE-6900
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.94.2, 0.96.0

 Attachments: 6900-test.txt, HBASE-6900_1.patch, HBASE-6900.patch


 HBASE-5520 introduced reseek() on the RegionScanner.
 When a scanner is created we have the StoreScanner heap.  If a flush or 
 compaction then happens in parallel, all the StoreScannerObservers are 
 cleared, so that the next next() call recreates the scanner based on the 
 latest store files.
 The reseek() in StoreScanner expects the heap not to be null, because reseek 
 was always called from next():
 {code}
 public synchronized boolean reseek(KeyValue kv) throws IOException {
   //Heap cannot be null, because this is only called from next() which
   //guarantees that heap will never be null before this call.
   if (explicitColumnQuery && lazySeekEnabledGlobally) {
     return heap.requestSeek(kv, true, useRowColBloom);
   } else {
     return heap.reseek(kv);
   }
 }
 {code}
 Now when we call RegionScanner.reseek() directly from coprocessors (CPs) we 
 can get an NPE.  In our case it happened while a major compaction was going 
 on.  I will also attach a testcase to show the problem.
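
The failure mode can be reproduced in miniature. The following is only a sketch under stated assumptions: `ReseekDemoScanner`, its long keys and both reseek variants are hypothetical, and the guarded variant shows one plausible shape of a fix rather than the attached patch.

```java
// Toy model of a scanner whose heap can be dropped by a concurrent flush; not HBase code.
class ReseekDemoScanner {
    private final long[] source;  // sorted keys, stand-in for the store files
    private long[] heap;          // null after updateReaders() until it is rebuilt
    private int pos;
    private Long lastTop;         // key that was on top when the heap was dropped

    ReseekDemoScanner(long... sortedKeys) {
        this.source = sortedKeys.clone();
        this.heap = this.source;
    }

    // Called when a flush or compaction changes the store files.
    synchronized void updateReaders() {
        if (heap == null) {
            return;
        }
        lastTop = pos < heap.length ? Long.valueOf(heap[pos]) : null;
        heap = null;
    }

    // Mirrors the "heap cannot be null" assumption: throws NullPointerException if it is.
    synchronized boolean reseekUnguarded(long key) {
        return seekForward(key);
    }

    // Guarded variant: rebuild the heap first, the way next() would.
    synchronized boolean reseek(long key) {
        checkReseek();
        if (heap == null) {
            return false;  // the scanner was already exhausted
        }
        return seekForward(key);
    }

    private void checkReseek() {
        if (heap == null && lastTop != null) {
            heap = source;
            pos = 0;
            while (pos < heap.length && heap[pos] < lastTop) {
                pos++;
            }
            lastTop = null;
        }
    }

    // Moves forward to the first key >= the requested key; true if one exists.
    private boolean seekForward(long key) {
        while (pos < heap.length && heap[pos] < key) {
            pos++;
        }
        return pos < heap.length;
    }

    public static void main(String[] args) {
        ReseekDemoScanner scanner = new ReseekDemoScanner(1, 5, 9);
        scanner.updateReaders();  // simulate a compaction finishing
        try {
            scanner.reseekUnguarded(5);
        } catch (NullPointerException e) {
            System.out.println("unguarded reseek hit a null heap");
        }
        System.out.println(scanner.reseek(5));
    }
}
```

The point of the sketch is only that any entry point reachable without going through next() has to do the same heap check that next() does.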

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470050#comment-13470050
 ] 

Gregory Chanan commented on HBASE-6920:
---

Haven't been able to get a good run.  Is this holding up the new RC for you?  I 
should be able to tell by tomorrow.

 On timeout connecting to master, client can get stuck and never make progress
 -

 Key: HBASE-6920
 URL: https://issues.apache.org/jira/browse/HBASE-6920
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.2
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.94.2

 Attachments: HBASE-6920.patch, HBASE-6920-v2.patch


 HBASE-5058 appears to have introduced an issue where a timeout in 
 HConnection.getMaster() can cause the client to never be able to connect to 
 the master.  So, for example, an HBaseAdmin object can never successfully be 
 initialized.
 The issue is here:
 {code}
 if (tryMaster.isMasterRunning()) {
   this.master = tryMaster;
   this.masterLock.notifyAll();
   break;
 }
 {code}
 If isMasterRunning times out, it throws an UndeclaredThrowableException, 
 which is already not ideal, because it can be returned to the application.
  But if the first call to getMaster succeeds, it will set masterChecked = 
 true, which makes us never try to reconnect; that is, we will set this.master 
 = null and just throw MasterNotRunningExceptions, without even trying to 
 connect.
 I tried out a 94 client (actually a 92 client with some 94 patches) on a 
 cluster with some network issues, and it would constantly get stuck as 
 described above.
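
The symptom described above can be modelled in a few lines. This is a rough sketch, not the HConnection implementation: `ToyMasterConnection`, `MasterProbe`, the `retryAfterFailure` switch and the use of IllegalStateException in place of MasterNotRunningException are all assumptions made for illustration. Only the names masterChecked and isMasterRunning come from the description.

```java
// Toy model of the "checked once, never retried" symptom; not the HConnection code.
class ToyMasterConnection {
    // Hypothetical stand-in for the master RPC proxy.
    interface MasterProbe {
        boolean isMasterRunning();
    }

    private final MasterProbe probe;
    private final boolean retryAfterFailure;
    private boolean masterChecked;  // set once a connection attempt has been made
    private boolean connected;      // stand-in for this.master != null
    int connectAttempts;            // exposed so the behaviour can be observed

    ToyMasterConnection(MasterProbe probe, boolean retryAfterFailure) {
        this.probe = probe;
        this.retryAfterFailure = retryAfterFailure;
    }

    // Throws IllegalStateException (stand-in for MasterNotRunningException) on failure.
    synchronized void getMaster() {
        if (connected) {
            if (probeQuietly()) {
                return;                 // cached connection is still good
            }
            connected = false;          // like setting this.master = null
            if (retryAfterFailure) {
                masterChecked = false;  // allow a fresh connection attempt
            }
        }
        if (!masterChecked) {
            connectAttempts++;
            connected = probeQuietly();
            // The sticky variant remembers the attempt even when it failed.
            masterChecked = connected || !retryAfterFailure;
        }
        if (!connected) {
            throw new IllegalStateException("master not running");
        }
    }

    // Treats a timeout (modelled as a RuntimeException) as "not running".
    private boolean probeQuietly() {
        try {
            return probe.isMasterRunning();
        } catch (RuntimeException timeout) {
            return false;
        }
    }
}
```

With retryAfterFailure set to false the toy stays stuck after the first dropped connection, even once the master is reachable again. With it set to true, each later getMaster() call makes a new attempt.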



[jira] [Updated] (HBASE-6900) RegionScanner.reseek() creates NPE when a flush or compaction happens before the reseek.

2012-10-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6900:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to 0.94 and 0.96.
Thanks for your patience Ram.



[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470053#comment-13470053
 ] 

Lars Hofhansl commented on HBASE-6920:
--

Yeah, it's the last (for now :) ) open jira for the next RC.
Thanks Gregory.



[jira] [Commented] (HBASE-6900) RegionScanner.reseek() creates NPE when a flush or compaction happens before the reseek.

2012-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470102#comment-13470102
 ] 

Hudson commented on HBASE-6900:
---

Integrated in HBase-TRUNK #3429 (See 
[https://builds.apache.org/job/HBase-TRUNK/3429/])
HBASE-6900 RegionScanner.reseek() creates NPE when a flush or compaction 
happens before the reseek. (Revision 1394377)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java




[jira] [Commented] (HBASE-6900) RegionScanner.reseek() creates NPE when a flush or compaction happens before the reseek.

2012-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470101#comment-13470101
 ] 

Hudson commented on HBASE-6900:
---

Integrated in HBase-0.94 #507 (See 
[https://builds.apache.org/job/HBase-0.94/507/])
HBASE-6900 RegionScanner.reseek() creates NPE when a flush or compaction 
happens before the reseek. (Revision 1394378)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java




[jira] [Created] (HBASE-6956) Do not return back to HTablePool closed connections

2012-10-05 Thread Igor Yurinok (JIRA)
Igor Yurinok created HBASE-6956:
---

 Summary: Do not return back to HTablePool closed connections
 Key: HBASE-6956
 URL: https://issues.apache.org/jira/browse/HBASE-6956
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.90.6
Reporter: Igor Yurinok


Sometimes we see a lot of exceptions about closed connections:
{code}
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@553fd068 closed
org.apache.hadoop.hbase.client.ClosedConnectionException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@553fd068 closed
{code}

After investigation we assumed that this occurs because a closed connection is 
returned to HTablePool. 

In our opinion the best solution is to check in HTablePool.putTable whether 
the table is closed and, if so, not add it to the queue and release that 
HTableInterface instead.

Unfortunately, there is currently no access to the HTable#closed field through 
HTableInterface.
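
A rough sketch of the proposed check follows. It is not HTablePool code: `ToyTablePool`, `PooledTable` and in particular the `isClosed()` accessor are hypothetical, since (as noted above) HTableInterface does not currently expose the closed state.

```java
// Toy pool that refuses to take back closed tables; not HTablePool code.
class ToyTablePool {
    // Hypothetical table handle; isClosed() is the accessor the issue says is missing.
    interface PooledTable {
        boolean isClosed();
        void release();
    }

    private final java.util.ArrayDeque<PooledTable> queue = new java.util.ArrayDeque<>();
    private final int maxSize;

    ToyTablePool(int maxSize) {
        this.maxSize = maxSize;
    }

    // Returns true if the table went back into the pool, false if it was released instead.
    synchronized boolean putTable(PooledTable table) {
        if (table.isClosed() || queue.size() >= maxSize) {
            table.release();  // do not hand a dead table to the next caller
            return false;
        }
        queue.addLast(table);
        return true;
    }

    // Returns a pooled table, or null when the pool is empty.
    synchronized PooledTable getTable() {
        return queue.pollFirst();
    }

    synchronized int size() {
        return queue.size();
    }
}
```

The design choice here is to do the check on the way in, so that getTable() callers never need to validate what they receive.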



[jira] [Commented] (HBASE-6950) TestAcidGuarantees system test now flushes too aggressively

2012-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470243#comment-13470243
 ] 

Hudson commented on HBASE-6950:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #210 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/210/])
HBASE-6950 TestAcidGuarantees system test now flushes too aggressively 
(Revision 1394335)

 Result = FAILURE
gchanan : 
Files : 
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java


 TestAcidGuarantees system test now flushes too aggressively
 ---

 Key: HBASE-6950
 URL: https://issues.apache.org/jira/browse/HBASE-6950
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.92.2, 0.94.2, 0.96.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-6950.patch


 HBASE-6552 caused the TestAcidGuarantees system test to flush more 
 aggressively, because flushes are where ACID problems have occurred in the 
 past.
 After some more cluster testing, it seems like this is too aggressive; my 
 clusters eventually can't keep up with the number of flushes/compactions and 
 start getting SocketTimeoutExceptions.  We could try to optimize the 
 flushes/compactions, but since this workload would never occur in practice, I 
 don't think it is worth the effort.  Instead, let's flush only once a 
 minute.  This is arbitrary, but seems to work.
 Here is my comment in the (upcoming) patch:
 {code}
 // Flushing has been a source of ACID violations previously (see HBASE-2856),
 // so ideally, we would flush as often as possible.  On a running cluster,
 // this isn't practical:
 // (1) we will cause a lot of load due to all the flushing and compacting
 // (2) we cannot change the flushing/compacting related Configuration options
 //     to try to alleviate this
 // (3) it is an unrealistic workload, since no one would actually flush that
 //     often.
 // Therefore, let's flush every minute to have more flushes than usual, but
 // not overload the running cluster.
 {code}
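
For illustration, a once-a-minute limit can be expressed as a small throttle. This is a sketch only: `FlushThrottle` and `shouldFlush` are hypothetical names and are not taken from the TestAcidGuarantees patch.

```java
// Minimal time-based throttle; a sketch, not the TestAcidGuarantees code.
class FlushThrottle {
    private final long intervalMillis;
    private long lastFlushMillis;
    private boolean flushedOnce;

    FlushThrottle(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    // Returns true, and records the time, if a flush is due at nowMillis.
    synchronized boolean shouldFlush(long nowMillis) {
        if (flushedOnce && nowMillis - lastFlushMillis < intervalMillis) {
            return false;  // too soon since the previous flush
        }
        flushedOnce = true;
        lastFlushMillis = nowMillis;
        return true;
    }

    public static void main(String[] args) {
        FlushThrottle throttle = new FlushThrottle(60_000L);
        // A test loop would pass System.currentTimeMillis() and flush when this returns true.
        System.out.println(throttle.shouldFlush(System.currentTimeMillis()));
    }
}
```

Passing the clock value in as a parameter keeps the throttle easy to test without sleeping.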



[jira] [Assigned] (HBASE-6804) [replication] lower the amount of logging to a more human-readable level

2012-10-05 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans reassigned HBASE-6804:
-

Assignee: Jean-Daniel Cryans

 [replication] lower the amount of logging to a more human-readable level
 

 Key: HBASE-6804
 URL: https://issues.apache.org/jira/browse/HBASE-6804
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-6804-0.94.patch, HBASE-6804-0.94-v2.patch


 We need to stop logging every time replication decides to do something. It used 
 to be extremely useful when the code base was younger, but now it should be 
 possible to bring it down while keeping it relevant.



[jira] [Updated] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Kevin Odell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Odell updated HBASE-5582:
---

Attachment: HBASE-5582.patch

Changed the LOG level and reworded the message a bit, since we no longer have 
HServerInfo. 

 No HServerInfo found for should be a WARNING message
 --

 Key: HBASE-5582
 URL: https://issues.apache.org/jira/browse/HBASE-5582
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
Reporter: Shrijeet Paliwal
Assignee: Kevin Odell
Priority: Trivial
  Labels: newbie
 Attachments: HBASE-5582.patch


 The message from RegionServerTracker "No HServerInfo found for..." is easy to 
 miss. It should not be INFO. 
 From irc chat 
 {noformat}
 <jdcryans> JohnP789: can you grep for "No HServerInfo found for" in that log?
 <jdcryans> wait I see it
 <jdcryans> ok there's your problem
 <shrijeet_> Yes it is there
 <shrijeet_> jdcryans: it should be INFO, why?
 <jdcryans> it shouldn't be INFO, it's so easy to miss
 <jdcryans> it's not the first time we have to look super closely to figure this one out
 <shrijeet_> yes , I will file a jira
 <jdcryans> in any case it's a mismatch in that machine's DNS config
 <shrijeet_> anyways JohnP789 is waiting :) go on
 <JohnP789> haha!
 <JohnP789> yes...  ???  :-)
 <jdcryans> the master is expecting a RS called localhost.localdomain,53875,1328924863478
 17:26 <jdcryans> but the RS calls itself localhost,53875,1328924863478
 {noformat}
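
The mismatch at the end of the chat can be spotted mechanically. This is a sketch under assumptions: `ServerNameCheck` and its methods are hypothetical helpers, not part of RegionServerTracker, and they rely only on the host,port,startcode layout visible in the two names above.

```java
// Hypothetical helper for spotting a hostname-only mismatch between two server names.
class ServerNameCheck {
    // Splits "host,port,startcode" into its three fields.
    static String[] parse(String serverName) {
        String[] parts = serverName.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host,port,startcode: " + serverName);
        }
        return parts;
    }

    // True when port and startcode agree but the hostnames differ,
    // which is the pattern a DNS misconfiguration would produce.
    static boolean looksLikeHostnameMismatch(String expected, String reported) {
        String[] e = parse(expected);
        String[] r = parse(reported);
        return e[1].equals(r[1]) && e[2].equals(r[2]) && !e[0].equals(r[0]);
    }

    public static void main(String[] args) {
        System.out.println(looksLikeHostnameMismatch(
            "localhost.localdomain,53875,1328924863478",
            "localhost,53875,1328924863478"));
    }
}
```

A check like this could let the warning say outright that only the hostname differs, instead of leaving the reader to compare the two strings by eye.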



[jira] [Commented] (HBASE-6597) Block Encoding Size Estimation

2012-10-05 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470475#comment-13470475
 ] 

Phabricator commented on HBASE-6597:


tedyu has commented on the revision [jira] [HBASE-6597] [89-fb] Incremental 
data block encoding.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:38
  Length of prefix is returned.
  Name this method getCommonPrefixLength ?
  src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java:93
  Do we need to consider memstoreTS ?
  src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java:340
  Check / assert that skipLastBytes is not negative ?
  src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java:422
  prevKey stores the previous key, right ?
  src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java:497
  Name this variable negativeDiffTimestamp ?
  src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:411
  This class can be private, right ?
  src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java:420
  This class can be private, right ?

REVISION DETAIL
  https://reviews.facebook.net/D5895

To: Kannan, Karthik, Liyin, aaiyer, avf, JIRA, mbautin
Cc: tedyu
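
For readers unfamiliar with the first inline comment: a method with the suggested name would typically compute something like the following. This is a generic sketch, not the code under review in D5895, and `PrefixUtil` is a hypothetical holder class.

```java
// Generic common-prefix computation; a sketch, not the BufferedDataBlockEncoder code.
class PrefixUtil {
    // Returns the number of leading bytes that left and right have in common.
    static int getCommonPrefixLength(byte[] left, byte[] right) {
        int limit = Math.min(left.length, right.length);
        int i = 0;
        while (i < limit && left[i] == right[i]) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        byte[] previousKey = {'r', 'o', 'w', '1'};
        byte[] currentKey = {'r', 'o', 'w', '2'};
        System.out.println(getCommonPrefixLength(previousKey, currentKey));
    }
}
```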


 Block Encoding Size Estimation
 --

 Key: HBASE-6597
 URL: https://issues.apache.org/jira/browse/HBASE-6597
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.89-fb
Reporter: Brian Nixon
Priority: Minor
 Attachments: D5895.1.patch


 Block boundaries as created by current writers are determined by the size of 
 the unencoded data. However, blocks in memory are kept encoded. By using an 
 estimate of the encoded size of the block, we can get greater consistency in 
 size.
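
One way to picture the idea is a writer that scales the unencoded byte count by the compression ratio seen on the previous block. This is purely an illustrative sketch: `EncodedSizeEstimator` and the ratio-based estimate are assumptions, and the estimator actually used in D5895 is not shown in this thread.

```java
// Hypothetical ratio-based estimator; not the estimator from the patch.
class EncodedSizeEstimator {
    // Encoded/unencoded ratio; assumed to be 1.0 until a block has been measured.
    private double ratio = 1.0;

    // Records the measured sizes of a finished block.
    void observe(long unencodedBytes, long encodedBytes) {
        if (unencodedBytes > 0) {
            ratio = (double) encodedBytes / unencodedBytes;
        }
    }

    // Estimates how large the given amount of unencoded data will be once encoded.
    long estimateEncoded(long unencodedBytes) {
        return Math.round(unencodedBytes * ratio);
    }

    // Decides the block boundary on the estimated encoded size, not the raw size.
    boolean shouldCloseBlock(long unencodedBytesSoFar, long targetEncodedBytes) {
        return estimateEncoded(unencodedBytesSoFar) >= targetEncodedBytes;
    }

    public static void main(String[] args) {
        EncodedSizeEstimator estimator = new EncodedSizeEstimator();
        estimator.observe(65536, 16384);  // previous block encoded to a quarter of its size
        System.out.println(estimator.estimateEncoded(65536));
    }
}
```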



[jira] [Updated] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Kevin Odell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Odell updated HBASE-5582:
---

Fix Version/s: 0.96.0
   Status: Patch Available  (was: Open)

Changed the LOG.info to LOG.warn, and expanded the error message a bit.

 No HServerInfo found for should be a WARNING message
 --

 Key: HBASE-5582
 URL: https://issues.apache.org/jira/browse/HBASE-5582
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
Reporter: Shrijeet Paliwal
Assignee: Kevin Odell
Priority: Trivial
  Labels: newbie
 Fix For: 0.96.0

 Attachments: HBASE-5582.patch




[jira] [Commented] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470484#comment-13470484
 ] 

Ted Yu commented on HBASE-5582:
---

Line too long:
{code}
+LOG.warn(serverName.toString() + " is not online or isn't known to the master. The latter could be caused by a DNS misconfiguration.");
{code}
Wrap the second sentence.

Otherwise looks good.



[jira] [Updated] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Kevin Odell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Odell updated HBASE-5582:
---

Attachment: HBASE-5582.patch1

 No HServerInfo found for should be a WARNING message
 --

 Key: HBASE-5582
 URL: https://issues.apache.org/jira/browse/HBASE-5582
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
Reporter: Shrijeet Paliwal
Assignee: Kevin Odell
Priority: Trivial
  Labels: newbie
 Fix For: 0.96.0

 Attachments: HBASE-5582.patch, HBASE-5582.patch1




[jira] [Updated] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Kevin Odell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Odell updated HBASE-5582:
---

Status: Open  (was: Patch Available)



[jira] [Updated] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Kevin Odell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Odell updated HBASE-5582:
---

Status: Patch Available  (was: Open)



[jira] [Commented] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Kevin Odell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470490#comment-13470490
 ] 

Kevin Odell commented on HBASE-5582:


Resubmitted the patch


 No HServerInfo found for should be a WARNING message
 --

 Key: HBASE-5582
 URL: https://issues.apache.org/jira/browse/HBASE-5582
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
Reporter: Shrijeet Paliwal
Assignee: Kevin Odell
Priority: Trivial
  Labels: newbie
 Fix For: 0.96.0

 Attachments: HBASE-5582.patch, HBASE-5582.patch1




[jira] [Updated] (HBASE-6948) shell create table script cannot handle split key which is expressed in raw bytes

2012-10-05 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang updated HBASE-6948:
--

Attachment: HBASE-6948.patch

The patch fixes the split key so that it is converted to the correct byte[].
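
As a rough illustration of what "correct byte[]" can mean here, the sketch below assumes the split key arrives as a string using \xNN escapes for raw bytes. That notation and every name in the sketch are assumptions; the contents of the attached patch are not shown in this thread, and the shell itself is not written in Java. HBase's own Bytes.toBytesBinary utility serves a similar purpose.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration; not taken from the HBASE-6948 patch.
final class SplitKeyParser {
    /** Converts a string with \xNN escapes (for example "a\x00\xFF") to raw bytes. */
    static byte[] parse(String key) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < key.length()) {
            char c = key.charAt(i);
            boolean isEscape = c == '\\'
                && i + 3 < key.length()
                && key.charAt(i + 1) == 'x'
                && isHex(key.charAt(i + 2))
                && isHex(key.charAt(i + 3));
            if (isEscape) {
                // Two hex digits become one byte.
                out.write(Integer.parseInt(key.substring(i + 2, i + 4), 16));
                i += 4;
            } else {
                // Assumes plain single-byte characters outside of escapes.
                out.write(c & 0xFF);
                i++;
            }
        }
        return out.toByteArray();
    }

    static boolean isHex(char c) {
        return Character.digit(c, 16) >= 0;
    }
}
```

Treating the escaped text as ordinary characters instead would yield a four-byte key for each \xNN sequence, so regions would be split at the wrong boundaries.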

 shell create table script cannot handle split key which is expressed in raw 
 bytes
 -

 Key: HBASE-6948
 URL: https://issues.apache.org/jira/browse/HBASE-6948
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2
Reporter: Ted Yu
 Attachments: HBASE-6948.patch






[jira] [Assigned] (HBASE-6797) TestHFileCleaner#testHFileCleaning sometimes fails in trunk

2012-10-05 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates reassigned HBASE-6797:
--

Assignee: Jesse Yates

 TestHFileCleaner#testHFileCleaning sometimes fails in trunk
 ---

 Key: HBASE-6797
 URL: https://issues.apache.org/jira/browse/HBASE-6797
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Jesse Yates
 Attachments: hbase-6797-v0.patch


 In build #3334, I saw:
 {code}
 java.lang.AssertionError: expected:<1> but was:<0>
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at org.apache.hadoop.hbase.master.cleaner.TestHFileCleaner.testHFileCleaning(TestHFileCleaner.java:88)
 {code}



[jira] [Updated] (HBASE-6797) TestHFileCleaner#testHFileCleaning sometimes fails in trunk

2012-10-05 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-6797:
---

Status: Patch Available  (was: Open)

 TestHFileCleaner#testHFileCleaning sometimes fails in trunk
 ---

 Key: HBASE-6797
 URL: https://issues.apache.org/jira/browse/HBASE-6797
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Jesse Yates
 Attachments: hbase-6797-v0.patch




[jira] [Commented] (HBASE-6889) Ignore source control files with apache-rat

2012-10-05 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470504#comment-13470504
 ] 

Jesse Yates commented on HBASE-6889:


[~lhofhansl] wanna commit it?

 Ignore source control files with apache-rat
 ---

 Key: HBASE-6889
 URL: https://issues.apache.org/jira/browse/HBASE-6889
 Project: HBase
  Issue Type: Bug
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: hbase-6889-mvn-v0.patch


 Running 'mvn apache-rat:check' locally causes a failure because it finds the 
 source control files, making it hard to check that you didn't include a file 
 without a source header.



[jira] [Commented] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470515#comment-13470515
 ] 

Hadoop QA commented on HBASE-5582:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12548006/HBASE-5582.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
81 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests:
  org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher
  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3013//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3013//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3013//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3013//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3013//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3013//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3013//console

This message is automatically generated.

 No HServerInfo found for should be a WARNING message
 --

 Key: HBASE-5582
 URL: https://issues.apache.org/jira/browse/HBASE-5582
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
Reporter: Shrijeet Paliwal
Assignee: Kevin Odell
Priority: Trivial
  Labels: newbie
 Fix For: 0.96.0

 Attachments: HBASE-5582.patch, HBASE-5582.patch1




[jira] [Commented] (HBASE-6476) Replace all occurrances of System.currentTimeMillis() with EnvironmentEdge equivalent

2012-10-05 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470516#comment-13470516
 ] 

Chris Trezzo commented on HBASE-6476:
-

A few curious things (if anyone has any thoughts on these that would be great):

1. Check out the Test history: 
https://builds.apache.org/job/HBase-TRUNK/3392/testReport/junit/org.apache.hadoop.hbase.util/TestThreads/testSleepWithoutInterrupt/history/

The System.currentTimeMillis() change might not have caused the test failure 
(it has failed twice for the same reason since the patch was reverted). In 
addition, there is something wonky going on because the test duration for all 
the failed runs is 1~2ms. The error message for the failed runs stated that the 
test timed out after 6000ms, which is the timeout set in the test tag.

2. The method sleepWithoutInterrupt is only called from test code, except 
during thrift server shutdown. Look at 
TBoundedThreadPoolServer.shutdownServer(). Anyone have any thoughts why we do 
the extra wait which calls sleepWithoutInterrupt? My thoughts are that we could 
remove the extra wait since we are already calling awaitTermination on the 
executor service in a loop above. Either that, or we could just replace the 
call with another call to awaitTermination and keep the second wait loop. This 
would limit sleepWithoutInterrupt calls to just test code.

3. Another tricky part with the EdgeManager is dealing with small tests that 
run in parallel within the same JVM. If test1 uses a non-default 
EnvironmentEdge, and test2 is relying on the DefaultEdge and doesn't set it 
explicitly, test2 would unknowingly be using whatever EnvironmentEdge test1 
set. This would create flapping tests that might be tricky to debug.

Thoughts?

Thanks,
Chris
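
The leak described in point 3 can be sketched as follows. Edge and EdgeHolder are hypothetical stand-ins for EnvironmentEdge and its manager, written only for this illustration; they are not the HBase classes. The sketch shows a test body restoring the default clock in a finally block, which should help with tests that run one after another in the same JVM. It does not address tests that run concurrently, which is the harder case raised above.

```java
// Hypothetical stand-ins for EnvironmentEdge and its manager; not HBase code.
interface Edge {
    long currentTimeMillis();
}

final class EdgeHolder {
    private static final Edge DEFAULT = System::currentTimeMillis;
    // Process-wide state: every test in the JVM reads the same field.
    private static volatile Edge current = DEFAULT;

    static void inject(Edge edge) { current = edge; }
    static void reset() { current = DEFAULT; }
    static long now() { return current.currentTimeMillis(); }
}

final class EdgeIsolationDemo {
    /** Reads the time under a fixed clock and always restores the default afterwards. */
    static long readWithFixedClock(long fixedMillis) {
        EdgeHolder.inject(() -> fixedMillis);
        try {
            return EdgeHolder.now();
        } finally {
            // Without this reset, later tests in the same JVM would see the fixed clock.
            EdgeHolder.reset();
        }
    }
}
```

Because the clock lives in a static field, a test that relies on the default without setting it explicitly still depends on every earlier test having cleaned up.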

 Replace all occurrances of System.currentTimeMillis() with EnvironmentEdge 
 equivalent
 -

 Key: HBASE-6476
 URL: https://issues.apache.org/jira/browse/HBASE-6476
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Chris Trezzo
Priority: Minor
 Fix For: 0.96.0

 Attachments: 6476.txt, 6476-v2.txt, 6476-v2.txt, 6476v3.txt


 There are still some areas where System.currentTimeMillis() is used in HBase. 
 In order to make all parts of the code base testable and (potentially) to be 
 able to configure HBase's notion of time, this should generally be 
 replaced with EnvironmentEdgeManager.currentTimeMillis().
 How hard would it be to add a Maven task that checks for that, so we do not 
 reintroduce System.currentTimeMillis() in the future?
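
One possible shape for such a check is sketched below as a standalone program that scans a source tree for the call and exits with a non-zero status when it finds one. The class name is hypothetical, this is not an existing Maven plugin, and wiring it into the build is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical standalone check; not an existing HBase or Maven component.
final class CurrentTimeMillisCheck {
    static final String FORBIDDEN = "System.currentTimeMillis(";

    /** Returns "lineNumber: text" for every line that contains the forbidden call. */
    static List<String> findViolations(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).contains(FORBIDDEN)) {
                hits.add((i + 1) + ": " + lines.get(i).trim());
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(root)) {
            sources = walk.filter(p -> p.toString().endsWith(".java"))
                .collect(Collectors.toList());
        }
        boolean failed = false;
        for (Path source : sources) {
            for (String hit : findViolations(Files.readAllLines(source))) {
                System.out.println(source + ":" + hit);
                failed = true;
            }
        }
        if (failed) {
            System.exit(1); // a non-zero exit status lets a build step fail
        }
    }
}
```

The match is purely textual, so it also flags comments and string literals (including its own constant); a real check would need an exclusion list for the default edge implementation, which has to call the system clock.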



[jira] [Updated] (HBASE-6597) Block Encoding Size Estimation

2012-10-05 Thread Mikhail Bautin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-6597:
--

Assignee: Mikhail Bautin

 Block Encoding Size Estimation
 --

 Key: HBASE-6597
 URL: https://issues.apache.org/jira/browse/HBASE-6597
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.89-fb
Reporter: Brian Nixon
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D5895.1.patch


 Block boundaries, as created by the current writers, are determined by the size of 
 the unencoded data. However, blocks in memory are kept encoded. By using an 
 estimate of the encoded size of the block, we can get more consistent block 
 sizes.
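
A minimal sketch of the idea, under stated assumptions: the writer tracks the unencoded bytes appended to the current block, scales them by the ratio observed for the previous block, and closes the block once the estimate reaches the target. The class, its methods and the ratio-based estimate are all hypothetical; the approach taken in D5895.1.patch is not shown in this thread.

```java
// Hypothetical sketch; the names and the ratio-based estimate are assumptions.
final class EncodedSizeEstimator {
    private final int targetEncodedBytes;
    // Encoded bytes per unencoded byte; starts at 1.0, meaning no savings assumed.
    private double encodedPerUnencoded = 1.0;
    private long unencodedInBlock = 0;

    EncodedSizeEstimator(int targetEncodedBytes) {
        this.targetEncodedBytes = targetEncodedBytes;
    }

    /** Adds one entry of the given unencoded size; returns true if the block should be closed. */
    boolean append(int unencodedBytes) {
        unencodedInBlock += unencodedBytes;
        return unencodedInBlock * encodedPerUnencoded >= targetEncodedBytes;
    }

    /** Call after the block has been encoded, to refine the ratio for the next block. */
    void blockFinished(long actualEncodedBytes) {
        if (unencodedInBlock > 0 && actualEncodedBytes > 0) {
            encodedPerUnencoded = (double) actualEncodedBytes / unencodedInBlock;
        }
        unencodedInBlock = 0;
    }
}
```

With a target of 100 bytes and an observed ratio of 0.5, a block is closed after roughly 200 unencoded bytes, so the encoded blocks stay near the target instead of shrinking to about half of it.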



[jira] [Work started] (HBASE-6597) Block Encoding Size Estimation

2012-10-05 Thread Mikhail Bautin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-6597 started by Mikhail Bautin.

 Block Encoding Size Estimation
 --

 Key: HBASE-6597
 URL: https://issues.apache.org/jira/browse/HBASE-6597
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.89-fb
Reporter: Brian Nixon
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D5895.1.patch




[jira] [Commented] (HBASE-6797) TestHFileCleaner#testHFileCleaning sometimes fails in trunk

2012-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470554#comment-13470554
 ] 

Hadoop QA commented on HBASE-6797:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12547179/hbase-6797-v0.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3015//console

This message is automatically generated.

 TestHFileCleaner#testHFileCleaning sometimes fails in trunk
 ---

 Key: HBASE-6797
 URL: https://issues.apache.org/jira/browse/HBASE-6797
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Jesse Yates
 Attachments: hbase-6797-v0.patch




[jira] [Updated] (HBASE-6948) shell create table script cannot handle split key which is expressed in raw bytes

2012-10-05 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang updated HBASE-6948:
--

Attachment: (was: HBASE-6948-trunk.patch)

 shell create table script cannot handle split key which is expressed in raw 
 bytes
 -

 Key: HBASE-6948
 URL: https://issues.apache.org/jira/browse/HBASE-6948
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2
Reporter: Ted Yu
 Attachments: HBASE-6948.patch






[jira] [Updated] (HBASE-6948) shell create table script cannot handle split key which is expressed in raw bytes

2012-10-05 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang updated HBASE-6948:
--

Attachment: HBASE-6948-trunk.patch

Patch for trunk attached.

 shell create table script cannot handle split key which is expressed in raw 
 bytes
 -

 Key: HBASE-6948
 URL: https://issues.apache.org/jira/browse/HBASE-6948
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2
Reporter: Ted Yu
 Attachments: HBASE-6948.patch






[jira] [Updated] (HBASE-6948) shell create table script cannot handle split key which is expressed in raw bytes

2012-10-05 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang updated HBASE-6948:
--

Attachment: HBASE-6948-trunk.patch

 shell create table script cannot handle split key which is expressed in raw 
 bytes
 -

 Key: HBASE-6948
 URL: https://issues.apache.org/jira/browse/HBASE-6948
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2
Reporter: Ted Yu
 Attachments: HBASE-6948.patch, HBASE-6948-trunk.patch






[jira] [Updated] (HBASE-6948) shell create table script cannot handle split key which is expressed in raw bytes

2012-10-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-6948:
--

Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)

 shell create table script cannot handle split key which is expressed in raw 
 bytes
 -

 Key: HBASE-6948
 URL: https://issues.apache.org/jira/browse/HBASE-6948
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2
Reporter: Ted Yu
 Attachments: HBASE-6948.patch, HBASE-6948-trunk.patch






[jira] [Commented] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470568#comment-13470568
 ] 

Hadoop QA commented on HBASE-5582:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12548008/HBASE-5582.patch1
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
81 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3014//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3014//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3014//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3014//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3014//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3014//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3014//console

This message is automatically generated.

 No HServerInfo found for should be a WARNING message
 --

 Key: HBASE-5582
 URL: https://issues.apache.org/jira/browse/HBASE-5582
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
Reporter: Shrijeet Paliwal
Assignee: Kevin Odell
Priority: Trivial
  Labels: newbie
 Fix For: 0.96.0

 Attachments: HBASE-5582.patch, HBASE-5582.patch1




[jira] [Updated] (HBASE-6889) Ignore source control files with apache-rat

2012-10-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6889:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
   0.94.2
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to 0.94 and 0.96. Thanks for the patch, Jesse.

 Ignore source control files with apache-rat
 ---

 Key: HBASE-6889
 URL: https://issues.apache.org/jira/browse/HBASE-6889
 Project: HBase
  Issue Type: Bug
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.94.2, 0.96.0

 Attachments: hbase-6889-mvn-v0.patch




[jira] [Commented] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470600#comment-13470600
 ] 

Jean-Daniel Cryans commented on HBASE-5582:
---

+1, going to commit.

 No HServerInfo found for should be a WARNING message
 --

 Key: HBASE-5582
 URL: https://issues.apache.org/jira/browse/HBASE-5582
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
Reporter: Shrijeet Paliwal
Assignee: Kevin Odell
Priority: Trivial
  Labels: newbie
 Fix For: 0.96.0

 Attachments: HBASE-5582.patch, HBASE-5582.patch1




[jira] [Updated] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-5582:
--

   Resolution: Fixed
Fix Version/s: 0.94.2
   0.92.3
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to 0.92, 0.94, and trunk. Thanks for the patch Kevin!

 No HServerInfo found for should be a WARNING message
 --

 Key: HBASE-5582
 URL: https://issues.apache.org/jira/browse/HBASE-5582
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
Reporter: Shrijeet Paliwal
Assignee: Kevin Odell
Priority: Trivial
  Labels: newbie
 Fix For: 0.92.3, 0.94.2, 0.96.0

 Attachments: HBASE-5582.patch, HBASE-5582.patch1




[jira] [Assigned] (HBASE-6724) Port HBASE-6165 'Replication can overrun .META. scans on cluster re-start' to 0.92

2012-10-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-6724:
-

Assignee: Ted Yu

 Port HBASE-6165 'Replication can overrun .META. scans on cluster re-start' to 
 0.92
 --

 Key: HBASE-6724
 URL: https://issues.apache.org/jira/browse/HBASE-6724
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.92.3

 Attachments: 6165.92






[jira] [Commented] (HBASE-6948) shell create table script cannot handle split key which is expressed in raw bytes

2012-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470673#comment-13470673
 ] 

Hadoop QA commented on HBASE-6948:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12548026/HBASE-6948-trunk.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
81 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests:
  org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3016//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3016//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3016//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3016//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3016//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3016//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3016//console

This message is automatically generated.

 shell create table script cannot handle split key which is expressed in raw 
 bytes
 -

 Key: HBASE-6948
 URL: https://issues.apache.org/jira/browse/HBASE-6948
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2
Reporter: Ted Yu
 Attachments: HBASE-6948.patch, HBASE-6948-trunk.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6948) shell create table script cannot handle split key which is expressed in raw bytes

2012-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470675#comment-13470675
 ] 

Ted Yu commented on HBASE-6948:
---

{code}
Running org.apache.hadoop.hbase.client.TestShell
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 97.498 sec
{code}
Failed test is not related to this patch.



[jira] [Updated] (HBASE-6239) [replication] ReplicationSink uses the ts of the first KV for the other KVs in the same row

2012-10-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-6239:
--

Fix Version/s: (was: 0.90.8)

 [replication] ReplicationSink uses the ts of the first KV for the other KVs 
 in the same row
 ---

 Key: HBASE-6239
 URL: https://issues.apache.org/jira/browse/HBASE-6239
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6, 0.92.1
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Critical
  Labels: corruption
 Fix For: 0.92.2

 Attachments: HBASE-6239-0.92-v1.patch


 ReplicationSink assumes that all the KVs for the same row inside a WALEdit 
 will have the same timestamp, which is not necessarily the case.
 This only affects 0.90 and 0.92 since HBASE-5203 fixes it in 0.94
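
A minimal sketch of the point above, not the HBASE-5203 change itself: when edits are grouped by row, each KV should keep its own timestamp rather than inheriting the first one seen for that row. The `Kv` class and the `groupByRow` helper are hypothetical stand-ins, not HBase types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowGrouping {
    /** Hypothetical stand-in for a KeyValue: row key, timestamp, value. */
    static final class Kv {
        final String row;
        final long ts;
        final String value;

        Kv(String row, long ts, String value) {
            this.row = row;
            this.ts = ts;
            this.value = value;
        }
    }

    /** Groups edits by row, preserving insertion order and each KV's own timestamp. */
    static Map<String, List<Kv>> groupByRow(List<Kv> edits) {
        Map<String, List<Kv>> byRow = new LinkedHashMap<>();
        for (Kv kv : edits) {
            byRow.computeIfAbsent(kv.row, r -> new ArrayList<>()).add(kv);
        }
        return byRow;
    }

    public static void main(String[] args) {
        List<Kv> edits = new ArrayList<>();
        edits.add(new Kv("row1", 100L, "a"));
        edits.add(new Kv("row1", 200L, "b")); // same row, different timestamp
        for (Kv kv : groupByRow(edits).get("row1")) {
            System.out.println(kv.row + " ts=" + kv.ts + " value=" + kv.value);
        }
    }
}
```

The grouping step only decides which KVs travel together; the timestamp stays a per-KV property throughout.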



[jira] [Resolved] (HBASE-6239) [replication] ReplicationSink uses the ts of the first KV for the other KVs in the same row

2012-10-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-6239.
---

Resolution: Fixed



[jira] [Updated] (HBASE-6330) TestImportExport has been failing against hadoop 0.23/2.0 profile [Part2]

2012-10-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-6330:
--

Fix Version/s: (was: 0.96.0)

TestImportExport is no longer failing in HBase-TRUNK-on-Hadoop-2.0.0
See build #210.

 TestImportExport has been failing against hadoop 0.23/2.0 profile [Part2]
 -

 Key: HBASE-6330
 URL: https://issues.apache.org/jira/browse/HBASE-6330
 Project: HBase
  Issue Type: Sub-task
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Critical
  Labels: hadoop-2.0
 Fix For: 0.94.3

 Attachments: hbase-6330-94.patch, hbase-6330-trunk.patch, 
 hbase-6330-v2.patch


 See HBASE-5876.  I'm going to commit the v3 patches under this name since 
 it has been two months (my bad) since the first half was committed and 
 found to be incomplete.



[jira] [Commented] (HBASE-6824) Introduce ${hbase.local.dir} and save coprocessor jars there

2012-10-05 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470708#comment-13470708
 ] 

Enis Soztutar commented on HBASE-6824:
--

[~apurtell] Could you please take a look at this when you find time.  

 Introduce ${hbase.local.dir} and save coprocessor jars there
 

 Key: HBASE-6824
 URL: https://issues.apache.org/jira/browse/HBASE-6824
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: hbase-6824_v1-0.94.patch, hbase-6824_v1-trunk.patch, 
 hbase-6824_v2-0.94.patch, hbase-6824_v2-trunk.patch


 We need to make the temp directory where coprocessor jars are saved 
 configurable. For this we will add the hbase.local.dir configuration parameter. 
 Windows tests are failing due to path problems for coprocessor jars:
 Two HBase TestClassLoading unit tests failed due to a failure in loading the 
 test file from HDFS:
 {code}
 testClassLoadingFromHDFS(org.apache.hadoop.hbase.coprocessor.TestClassLoading):
  Class TestCP1 was missing on a region
 testClassLoadingFromLibDirInJar(org.apache.hadoop.hbase.coprocessor.TestClassLoading):
  Class TestCP1 was missing on a region
 {code}
 The problem is that CoprocessorHost.load() copies the jar file locally, and 
 schedules the local file to be deleted on exit by calling 
 FileSystem.deleteOnExit(). However, the file system used is not that of 
 the local file; it is the distributed file system, so on Windows the Path 
 fails.
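
 One way to read the direction described above, as a sketch only: copy the jar into a configurable local directory and register the cleanup through the local file API rather than through a distributed file system handle. To stay self-contained this uses plain `java.nio.file` and `java.io.File` instead of Hadoop's `FileSystem`; `LocalJarCopy`, `copyToLocalDir`, and the directory argument are hypothetical names, and nothing here is taken from the attached patches.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class LocalJarCopy {
    /**
     * Copies sourceJar into localDir (created if missing) and schedules the
     * copy for deletion when the JVM exits, using the local file API.
     */
    static File copyToLocalDir(Path sourceJar, String localDir) throws IOException {
        Path dir = Paths.get(localDir);
        Files.createDirectories(dir);
        Path target = dir.resolve(sourceJar.getFileName());
        Files.copy(sourceJar, target, StandardCopyOption.REPLACE_EXISTING);
        File localFile = target.toFile();
        localFile.deleteOnExit(); // local path, local API: no cross-file-system Path handling
        return localFile;
    }

    public static void main(String[] args) throws IOException {
        Path jar = Files.createTempFile("coprocessor", ".jar");
        Path localDir = Files.createTempDirectory("hbase-local");
        System.out.println(copyToLocalDir(jar, localDir.toString()));
    }
}
```

 Passing the directory in as a parameter mirrors the idea of making the location configurable instead of hard-coding a temp path.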



[jira] [Created] (HBASE-6957) TestRowCounter consistently fails against hadoop-2.0

2012-10-05 Thread Ted Yu (JIRA)
Ted Yu created HBASE-6957:
-

 Summary: TestRowCounter consistently fails against hadoop-2.0
 Key: HBASE-6957
 URL: https://issues.apache.org/jira/browse/HBASE-6957
 Project: HBase
  Issue Type: Sub-task
Reporter: Ted Yu
 Fix For: 0.96.0


In 
https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/210/testReport/org.apache.hadoop.hbase.mapreduce/TestRowCounter/testRowCounterHiddenColumn/
 , we can see:
{code}
java.lang.AssertionError
  at org.junit.Assert.fail(Assert.java:92)
  at org.junit.Assert.assertTrue(Assert.java:43)
  at org.junit.Assert.assertTrue(Assert.java:54)
  at org.apache.hadoop.hbase.mapreduce.TestRowCounter.runRowCount(TestRowCounter.java:135)
  at org.apache.hadoop.hbase.mapreduce.TestRowCounter.testRowCounterHiddenColumn(TestRowCounter.java:118)
...
2012-10-05 11:24:17,355 WARN  [ContainersLauncher #1] launcher.ContainerLaunch(246): Failed to launch container.
java.lang.ArithmeticException: / by zero
  at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:355)
  at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
  at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLogPathForWrite(LocalDirsHandlerService.java:268)
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:126)
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662)
2012-10-05 11:24:17,356 WARN  [DeletionService #1] nodemanager.DefaultContainerExecutor(276): delete returned false for path: [/home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-2.0.0/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-1_0/usercache/jenkins/appcache/application_1349436189156_0003/container_1349436189156_0003_01_02]
{code}
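
The trace above does not say why the division failed. One plausible reading, stated here as an assumption and not verified against the Hadoop source, is that a directory is chosen with a modulo whose divisor (a directory count or a total of available space) is zero when no local directory is usable. The sketch below shows that failure mode and a guard for it; `DirPicker` and its methods are hypothetical and are not Hadoop's LocalDirAllocator.

```java
import java.util.Random;

class DirPicker {
    /** Unguarded form: throws ArithmeticException ("/ by zero") when total is 0. */
    static long unguardedPosition(long randomValue, long total) {
        return randomValue % total;
    }

    /**
     * Picks a directory index weighted by available bytes, failing with a
     * descriptive error instead of a division by zero when nothing is usable.
     */
    static int pick(long[] availableBytes, Random random) {
        long total = 0;
        for (long available : availableBytes) {
            total += available;
        }
        if (total == 0) {
            throw new IllegalStateException("no usable local directory");
        }
        long position = Math.floorMod(random.nextLong(), total); // in [0, total)
        for (int i = 0; i < availableBytes.length; i++) {
            if (position < availableBytes[i]) {
                return i;
            }
            position -= availableBytes[i];
        }
        throw new AssertionError("unreachable: position is always below total");
    }

    public static void main(String[] args) {
        System.out.println(pick(new long[] {0L, 10L}, new Random()));
    }
}
```

If this reading is right, the test environment (for example unwritable or full local directories on the build machine) would matter more than the test itself.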



[jira] [Updated] (HBASE-6957) TestRowCounter consistently fails against hadoop-2.0

2012-10-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-6957:
--

Priority: Critical  (was: Major)



[jira] [Commented] (HBASE-6957) TestRowCounter consistently fails against hadoop-2.0

2012-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470722#comment-13470722
 ] 

Ted Yu commented on HBASE-6957:
---

I ran the test on:
{code}
Linux s0 2.6.38-11-generic #48-Ubuntu SMP Fri Jul 29 19:02:55 UTC 2011 x86_64 
x86_64 x86_64 GNU/Linux
{code}
and it passed:
{code}
Running org.apache.hadoop.hbase.mapreduce.TestRowCounter
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 79.768 sec
{code}



[jira] [Updated] (HBASE-6797) TestHFileCleaner#testHFileCleaning sometimes fails in trunk

2012-10-05 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-6797:
---

Attachment: hbase-6797-v1.patch

Updating patch onto trunk.

 TestHFileCleaner#testHFileCleaning sometimes fails in trunk
 ---

 Key: HBASE-6797
 URL: https://issues.apache.org/jira/browse/HBASE-6797
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Jesse Yates
 Attachments: hbase-6797-v0.patch, hbase-6797-v1.patch


 In build #3334, I saw:
 {code}
 java.lang.AssertionError: expected:1 but was:0
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at 
 org.apache.hadoop.hbase.master.cleaner.TestHFileCleaner.testHFileCleaning(TestHFileCleaner.java:88)
 {code}



[jira] [Commented] (HBASE-6940) Enable GC logging by default

2012-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470745#comment-13470745
 ] 

Ted Yu commented on HBASE-6940:
---

To do this, we need to specify the path to GC log:
{code}
-Xloggc:/apache/hbase/logs/gc-hbase.log
{code}
Should a variable, such as HBASE_GC_LOG_PATH, be introduced so that users can 
specify the location?
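
A small sketch of how such a setting could resolve, written in Java only for illustration (the real change would live in the startup scripts). `HBASE_GC_LOG_PATH` is just the name floated in this comment, `HBASE_LOG_DIR` is the existing log directory variable, and the `logs` fallback and `GcLogOption` class are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

class GcLogOption {
    /**
     * Builds the -Xloggc flag: an explicit HBASE_GC_LOG_PATH wins, otherwise
     * the file is placed under HBASE_LOG_DIR (or "logs" if that is unset).
     */
    static String gcLogFlag(Map<String, String> env) {
        String path = env.get("HBASE_GC_LOG_PATH");
        if (path == null || path.isEmpty()) {
            String logDir = env.getOrDefault("HBASE_LOG_DIR", "logs");
            path = logDir + "/gc-hbase.log";
        }
        return "-Xloggc:" + path;
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        env.put("HBASE_LOG_DIR", "/apache/hbase/logs");
        System.out.println(gcLogFlag(env)); // prints -Xloggc:/apache/hbase/logs/gc-hbase.log
    }
}
```

Falling back to the existing log directory means no new variable is strictly required for the default case.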

 Enable GC logging by default
 

 Key: HBASE-6940
 URL: https://issues.apache.org/jira/browse/HBASE-6940
 Project: HBase
  Issue Type: Improvement
  Components: Admin
Reporter: stack
Priority: Critical
 Fix For: 0.96.0


 I think we should enable GC logging by default.  It's pretty frictionless apparently 
 and could help in the case where folks are getting off the ground.



[jira] [Commented] (HBASE-6940) Enable GC logging by default

2012-10-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470750#comment-13470750
 ] 

stack commented on HBASE-6940:
--

We have a logging dir defined already.  We'll reuse that.



[jira] [Commented] (HBASE-5995) Fix and reenable TestLogRolling.testLogRollOnPipelineRestart

2012-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470752#comment-13470752
 ] 

Ted Yu commented on HBASE-5995:
---

When I ran the test against hadoop 2.0, it failed with:
{code}
testLogRollOnPipelineRestart(org.apache.hadoop.hbase.regionserver.wal.TestLogRolling)  Time elapsed: 0.243 sec   ERROR!
java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1150895311-10.249.196.101-1349476630606:blk_7782056094701760427_1026; getBlockSize()=1472; corrupt=false; offset=0; locs=[127.0.0.1:44729, 127.0.0.1:38785]}
  at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:232)
  at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:177)
  at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:119)
  at org.apache.hadoop.hdfs.DFSInputStream.init(DFSInputStream.java:112)
  at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:966)
  at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:212)
  at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:75)
  at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1768)
  at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.openFile(SequenceFileLogReader.java:63)
  at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1688)
  at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1709)
  at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.init(SequenceFileLogReader.java:56)
  at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:176)
  at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:82)
  at org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.testLogRollOnPipelineRestart(TestLogRolling.java:501)
{code}

 Fix and reenable TestLogRolling.testLogRollOnPipelineRestart
 

 Key: HBASE-5995
 URL: https://issues.apache.org/jira/browse/HBASE-5995
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.96.0
Reporter: stack
Priority: Blocker
 Fix For: 0.96.0


 HBASE-5984 disabled this flakey test (See the issue for more).  This issue is 
 about getting it enabled again.  Made a blocker on 0.96.0 so it gets 
 attention.



[jira] [Updated] (HBASE-6940) Enable GC logging by default

2012-10-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-6940:
--

Attachment: 6940-trunk.txt

How about this?



[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470760#comment-13470760
 ] 

Lars Hofhansl commented on HBASE-6920:
--

ping

 On timeout connecting to master, client can get stuck and never make progress
 -

 Key: HBASE-6920
 URL: https://issues.apache.org/jira/browse/HBASE-6920
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.2
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.94.2

 Attachments: HBASE-6920.patch, HBASE-6920-v2.patch


 HBASE-5058 appears to have introduced an issue where a timeout in 
 HConnection.getMaster() can cause the client to never be able to connect to 
 the master.  So, for example, an HBaseAdmin object can never successfully be 
 initialized.
 The issue is here:
 {code}
 if (tryMaster.isMasterRunning()) {
   this.master = tryMaster;
   this.masterLock.notifyAll();
   break;
 }
 {code}
 If isMasterRunning times out, it throws an UndeclaredThrowableException, 
 which is already not ideal, because it can be returned to the application.
  But if the first call to getMaster succeeds, it will set masterChecked = 
 true, which makes us never try to reconnect; that is, we will set this.master 
 = null and just throw MasterNotRunningExceptions, without even trying to 
 connect.
 I tried out a 94 client (actually a 92 client with some 94 patches) on a 
 cluster with some network issues, and it would constantly get stuck as 
 described above.
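
 A toy model of the stuck state described above, as a sketch and not the committed patch: once a "checked" flag has been set, dropping the cached master without clearing the flag means later calls fail without ever attempting a connection. `MasterCache`, `invalidate`, and the `clearFlagOnInvalidate` switch are hypothetical names chosen for illustration; the real logic lives in the HConnection implementation and is more involved.

```java
import java.util.function.Supplier;

class MasterCache {
    private final boolean clearFlagOnInvalidate;
    private String master;
    private boolean masterChecked;
    int connectAttempts;

    MasterCache(boolean clearFlagOnInvalidate) {
        this.clearFlagOnInvalidate = clearFlagOnInvalidate;
    }

    /** Returns the cached master, or connects if no attempt is recorded yet. */
    String getMaster(Supplier<String> connect) {
        if (master != null) {
            return master;
        }
        if (masterChecked) {
            // Sticky flag: we give up without trying to connect again.
            throw new IllegalStateException("master not running; no connect attempted");
        }
        connectAttempts++;
        master = connect.get();
        masterChecked = true;
        return master;
    }

    /** Called when the cached master is found unusable, e.g. after a timeout. */
    void invalidate() {
        master = null;
        if (clearFlagOnInvalidate) {
            masterChecked = false; // allow the next getMaster() to reconnect
        }
    }

    public static void main(String[] args) {
        MasterCache cache = new MasterCache(true);
        cache.getMaster(() -> "master-1");
        cache.invalidate();
        System.out.println(cache.getMaster(() -> "master-2")); // prints master-2
    }
}
```

 With `clearFlagOnInvalidate` set to false the second `getMaster` call throws instead of reconnecting, which is the shape of the hang reported here.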



[jira] [Commented] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470761#comment-13470761
 ] 

Hudson commented on HBASE-5582:
---

Integrated in HBase-TRUNK #3431 (See 
[https://builds.apache.org/job/HBase-TRUNK/3431/])
HBASE-5582  No HServerInfo found for should be a WARNING message (Kevin 
Odell via JD) (Revision 1394768)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/RegionServerTracker.java


 No HServerInfo found for should be a WARNING message
 --

 Key: HBASE-5582
 URL: https://issues.apache.org/jira/browse/HBASE-5582
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
Reporter: Shrijeet Paliwal
Assignee: Kevin Odell
Priority: Trivial
  Labels: newbie
 Fix For: 0.92.3, 0.94.2, 0.96.0

 Attachments: HBASE-5582.patch, HBASE-5582.patch1


 The message from RegionServerTracker, "No HServerInfo found for...", is easy to 
 miss. It should not be INFO. 
 From irc chat 
 {noformat}
 jdcryans
 JohnP789: can you grep for No HServerInfo found for in that log?
 jdcryans
 wait I see it
 jdcryans
 ok there's your problem
 shrijeet_
 Yes it is there
 shrijeet_
 jdcryans: it should be INFO, why?
 jdcryans
 it shouldn't be INFO, it's so easy to miss
 jdcryans
 it's not the first time we have to look super closely to figure this one out
 shrijeet_
 yes , I will file a jira
 jdcryans
 in any case it's a mismatch in that machine's DNS config
 shrijeet_
 anyways JohnP789 is waiting :) go on
 JohnP789
 haha!
 JohnP789
 yes...  ???  :-)
 jdcryans
 the master is expecting a RS called 
 localhost.localdomain,53875,1328924863478
 17:26 jdcryans
 but the RS calls itself localhost,53875,1328924863478
 {noformat}
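
 The mismatch in the chat above comes down to two server names that differ only in the host part. A small sketch of spotting that, assuming the `host,port,startcode` layout visible in the excerpt; `ServerNameParts` and its methods are hypothetical and this is not HBase's own ServerName class.

```java
class ServerNameParts {
    final String host;
    final int port;
    final long startCode;

    private ServerNameParts(String host, int port, long startCode) {
        this.host = host;
        this.port = port;
        this.startCode = startCode;
    }

    /** Parses a name of the form host,port,startcode. */
    static ServerNameParts parse(String name) {
        String[] parts = name.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host,port,startcode: " + name);
        }
        return new ServerNameParts(parts[0], Integer.parseInt(parts[1]), Long.parseLong(parts[2]));
    }

    /** True when port and start code match but the host names differ. */
    boolean differsOnlyInHost(ServerNameParts other) {
        return port == other.port && startCode == other.startCode && !host.equals(other.host);
    }

    public static void main(String[] args) {
        ServerNameParts expected = parse("localhost.localdomain,53875,1328924863478");
        ServerNameParts reported = parse("localhost,53875,1328924863478");
        System.out.println(expected.differsOnlyInHost(reported)); // prints true
    }
}
```

 A check like this points at name resolution (the DNS or hosts configuration on that machine) rather than at a second region server process.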



[jira] [Commented] (HBASE-6889) Ignore source control files with apache-rat

2012-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470762#comment-13470762
 ] 

Hudson commented on HBASE-6889:
---

Integrated in HBase-TRUNK #3431 (See 
[https://builds.apache.org/job/HBase-TRUNK/3431/])
HBASE-6889 Ignore source control files with apache-rat (Jesse Yates) 
(Revision 1394735)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/pom.xml


 Ignore source control files with apache-rat
 ---

 Key: HBASE-6889
 URL: https://issues.apache.org/jira/browse/HBASE-6889
 Project: HBase
  Issue Type: Bug
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.94.2, 0.96.0

 Attachments: hbase-6889-mvn-v0.patch


 Running 'mvn apache-rat:check' locally causes a failure because it finds the 
 source control files, making it hard to check that you didn't include a file 
 without a source header.



[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470770#comment-13470770
 ] 

Gregory Chanan commented on HBASE-6920:
---

Sorry, looks good, going to check in to 0.94.2 soon.

I need to investigate if this is a problem with 0.96



[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470779#comment-13470779
 ] 

Gregory Chanan commented on HBASE-6920:
---

Thanks for the reviews, Ted and Lars.  Committed to 0.94.

Not closing until I investigate trunk.



[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470784#comment-13470784
 ] 

Lars Hofhansl commented on HBASE-6920:
--

Thanks Gregory! ... Will need to resolve temporarily to get correct release 
notes for 0.94.2; will reopen soon after (or should 0.96 receive a separate 
issue now?)

 On timeout connecting to master, client can get stuck and never make progress
 -

 Key: HBASE-6920
 URL: https://issues.apache.org/jira/browse/HBASE-6920
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.2
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.94.2

 Attachments: HBASE-6920.patch, HBASE-6920-v2.patch


 HBASE-5058 appears to have introduced an issue where a timeout in 
 HConnection.getMaster() can cause the client to never be able to connect to 
 the master.  So, for example, an HBaseAdmin object can never successfully be 
 initialized.
 The issue is here:
 {code}
 if (tryMaster.isMasterRunning()) {
   this.master = tryMaster;
   this.masterLock.notifyAll();
   break;
 }
 {code}
 If isMasterRunning times out, it throws an UndeclaredThrowableException, 
 which is already not ideal, because it can be returned to the application.
  But if the first call to getMaster succeeds, it will set masterChecked = 
 true, which makes us never try to reconnect; that is, we will set this.master 
 = null and just throw MasterNotRunningExceptions, without even trying to 
 connect.
 I tried out a 94 client (actually a 92 client with some 94 patches) on a 
 cluster with some network issues, and it would constantly get stuck as 
 described above.



[jira] [Commented] (HBASE-6797) TestHFileCleaner#testHFileCleaning sometimes fails in trunk

2012-10-05 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470792#comment-13470792
 ] 

Jesse Yates commented on HBASE-6797:


[~lhofhansl] if you can dig it, can you commit it? I ran the test locally (jdk 
1.7) 20x, no failures.

 TestHFileCleaner#testHFileCleaning sometimes fails in trunk
 ---

 Key: HBASE-6797
 URL: https://issues.apache.org/jira/browse/HBASE-6797
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Jesse Yates
 Attachments: hbase-6797-v0.patch, hbase-6797-v1.patch


 In build #3334, I saw:
 {code}
 java.lang.AssertionError: expected:<1> but was:<0>
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at 
 org.apache.hadoop.hbase.master.cleaner.TestHFileCleaner.testHFileCleaning(TestHFileCleaner.java:88)
 {code}



[jira] [Commented] (HBASE-6797) TestHFileCleaner#testHFileCleaning sometimes fails in trunk

2012-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470794#comment-13470794
 ] 

Hadoop QA commented on HBASE-6797:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12548056/hbase-6797-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
81 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.backup.example.TestZooKeeperTableArchiveClient

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3017//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3017//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3017//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3017//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3017//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3017//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3017//console

This message is automatically generated.




[jira] [Commented] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470806#comment-13470806
 ] 

Hudson commented on HBASE-5582:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #211 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/211/])
HBASE-5582  No HServerInfo found for should be a WARNING message (Kevin 
Odell via JD) (Revision 1394768)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/RegionServerTracker.java


 No HServerInfo found for should be a WARNING message
 --

 Key: HBASE-5582
 URL: https://issues.apache.org/jira/browse/HBASE-5582
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
Reporter: Shrijeet Paliwal
Assignee: Kevin Odell
Priority: Trivial
  Labels: newbie
 Fix For: 0.92.3, 0.94.2, 0.96.0

 Attachments: HBASE-5582.patch, HBASE-5582.patch1


 The message from RegionServerTracker "No HServerInfo found for..." is easy to 
 miss. It should not be INFO. 
 From irc chat 
 {noformat}
 jdcryans
 JohnP789: can you grep for No HServerInfo found for in that log?
 jdcryans
 wait I see it
 jdcryans
 ok there's your problem
 shrijeet_
 Yes it is there
 shrijeet_
 jdcryans: it should be INFO, why?
 jdcryans
 it shouldn't be INFO, it's so easy to miss
 jdcryans
 it's not the first time we have to look super closely to figure this one out
 shrijeet_
 yes , I will file a jira
 jdcryans
 in any case it's a mismatch in that machine's DNS config
 shrijeet_
 anyways JohnP789 is waiting :) go on
 JohnP789
 haha!
 JohnP789
 yes...  ???  :-)
 jdcryans
 the master is expecting a RS called 
 localhost.localdomain,53875,1328924863478
 17:26 jdcryans
 but the RS calls itself localhost,53875,1328924863478
 {noformat}
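The mismatch at the end of the chat can be illustrated with a short sketch. It assumes only the comma-separated `host,port,startcode` layout visible in the two names quoted above; the class `ServerNameSketch` and its methods are hypothetical helpers, not part of HBase.

```java
// Hypothetical helper for illustration; not the HBase ServerName class.
final class ServerNameSketch {
    /** Splits "host,port,startcode" into its three fields. */
    static String[] parse(String serverName) {
        String[] parts = serverName.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host,port,startcode: " + serverName);
        }
        return parts;
    }

    /** Describes how the name the master expects differs from the one reported. */
    static String describeMismatch(String expected, String reported) {
        if (expected.equals(reported)) {
            return "match";
        }
        String[] e = parse(expected);
        String[] r = parse(reported);
        boolean samePortAndStartcode = e[1].equals(r[1]) && e[2].equals(r[2]);
        if (samePortAndStartcode) {
            // Same process, different hostname: typically a DNS or hosts-file issue.
            return "hostname mismatch: " + e[0] + " vs " + r[0];
        }
        return "different server";
    }
}
```

Called with the two names from the chat, `describeMismatch` reports a hostname mismatch between `localhost.localdomain` and `localhost`, which is the DNS configuration problem jdcryans points to.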



[jira] [Commented] (HBASE-6889) Ignore source control files with apache-rat

2012-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470807#comment-13470807
 ] 

Hudson commented on HBASE-6889:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #211 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/211/])
HBASE-6889 Ignore source control files with apache-rat (Jesse Yates) 
(Revision 1394735)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/pom.xml


 Ignore source control files with apache-rat
 ---

 Key: HBASE-6889
 URL: https://issues.apache.org/jira/browse/HBASE-6889
 Project: HBase
  Issue Type: Bug
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.94.2, 0.96.0

 Attachments: hbase-6889-mvn-v0.patch


 Running 'mvn apache-rat:check' locally causes a failure because it finds the 
 source control files, making it hard to check that you didn't include a file 
 without a source header.



[jira] [Commented] (HBASE-5582) No HServerInfo found for should be a WARNING message

2012-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470818#comment-13470818
 ] 

Hudson commented on HBASE-5582:
---

Integrated in HBase-0.92 #595 (See 
[https://builds.apache.org/job/HBase-0.92/595/])
HBASE-5582  No HServerInfo found for should be a WARNING message (Kevin 
Odell via JD) (Revision 1394766)
HBASE-5582  No HServerInfo found for should be a WARNING message (Revision 
1394765)

 Result = FAILURE
jdcryans : 
Files : 
* /hbase/branches/0.92/CHANGES.txt

jdcryans : 
Files : 
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/RegionServerTracker.java





[jira] [Updated] (HBASE-6330) TestImportExport has been failing against hadoop 0.23/2.0 profile [Part2]

2012-10-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-6330:
--

Fix Version/s: 0.96.0

TestImportExport failed in HBase-TRUNK-on-Hadoop-2.0.0 #211

 TestImportExport has been failing against hadoop 0.23/2.0 profile [Part2]
 -

 Key: HBASE-6330
 URL: https://issues.apache.org/jira/browse/HBASE-6330
 Project: HBase
  Issue Type: Sub-task
  Components: test
Affects Versions: 0.94.1, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Critical
  Labels: hadoop-2.0
 Fix For: 0.94.3, 0.96.0

 Attachments: hbase-6330-94.patch, hbase-6330-trunk.patch, 
 hbase-6330-v2.patch


 See HBASE-5876.  I'm going to commit the v3 patches under this name, since 
 it has been two months (my bad) since the first half was committed and 
 found to be incomplete.



[jira] [Commented] (HBASE-6797) TestHFileCleaner#testHFileCleaning sometimes fails in trunk

2012-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470826#comment-13470826
 ] 

Ted Yu commented on HBASE-6797:
---

Integrated to trunk.

Thanks for the patch, Jesse.

Thanks for the review, Lars.




[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470859#comment-13470859
 ] 

Hudson commented on HBASE-6920:
---

Integrated in HBase-0.94 #509 (See 
[https://builds.apache.org/job/HBase-0.94/509/])
HBASE-6920 On timeout connecting to master, client can get stuck and never 
make progress (Revision 1394857)

 Result = FAILURE
gchanan : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/RpcEngine.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestClientTimeouts.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/ipc/RandomTimeoutRpcEngine.java





[jira] [Created] (HBASE-6958) TestAssignmentManager fails in trunk

2012-10-05 Thread Ted Yu (JIRA)
Ted Yu created HBASE-6958:
-

 Summary: TestAssignmentManager fails in trunk
 Key: HBASE-6958
 URL: https://issues.apache.org/jira/browse/HBASE-6958
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


From 
https://builds.apache.org/job/HBase-TRUNK/3432/testReport/junit/org.apache.hadoop.hbase.master/TestAssignmentManager/testBalanceOnMasterFailoverScenarioWithOpenedNode/
 :
{code}
Stacktrace

java.lang.Exception: test timed out after 5000 milliseconds
at java.lang.System.arraycopy(Native Method)
at java.lang.ThreadGroup.remove(ThreadGroup.java:969)
at java.lang.ThreadGroup.threadTerminated(ThreadGroup.java:942)
at java.lang.Thread.exit(Thread.java:732)
...
2012-10-06 00:46:12,521 DEBUG [MASTER_CLOSE_REGION-mockedAMExecutor-0] 
zookeeper.ZKUtil(1141): mockedServer-0x13a33892de7000e Retrieved 81 byte(s) of 
data from znode /hbase/unassigned/dc01abf9cd7fd0ea256af4df02811640 and set 
watcher; region=t,,1349484359011.dc01abf9cd7fd0ea256af4df02811640., 
state=M_ZK_REGION_OFFLINE, servername=master,1,1, createTime=1349484372509, 
payload.length=0
2012-10-06 00:46:12,522 ERROR [MASTER_CLOSE_REGION-mockedAMExecutor-0] 
executor.EventHandler(205): Caught throwable while processing event 
RS_ZK_REGION_CLOSED
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.master.TestAssignmentManager$MockedLoadBalancer.randomAssignment(TestAssignmentManager.java:773)
at 
org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:1709)
at 
org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:1666)
at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1435)
at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1155)
at 
org.apache.hadoop.hbase.master.TestAssignmentManager$AssignmentManagerWithExtrasForTesting.assign(TestAssignmentManager.java:1035)
at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1130)
at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1125)
at 
org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:106)
at 
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:202)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
2012-10-06 00:46:12,522 DEBUG [pool-1-thread-1-EventThread] 
master.AssignmentManager(670): Handling transition=M_ZK_REGION_OFFLINE, 
server=master,1,1, region=dc01abf9cd7fd0ea256af4df02811640, current state from 
region state map ={t,,1349484359011.dc01abf9cd7fd0ea256af4df02811640. 
state=OFFLINE, ts=1349484372508, server=null}
{code}
Looks like the NPE happened on this line:
{code}
  this.gate.set(true);
{code}
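One way such an NPE can arise is if `gate` has not been assigned yet when `randomAssignment` runs. The sketch below is an assumption, not the test's actual code: it supposes that `gate` is an `AtomicBoolean` installed by the test through a setter, and the class `GatedBalancerSketch` and its method names are hypothetical. It shows a null-safe way to signal the gate.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a mocked balancer; not TestAssignmentManager code.
final class GatedBalancerSketch {
    // Assumed to be installed by the test after construction, so it may
    // still be null when another thread calls signalGate().
    private volatile AtomicBoolean gate;

    void setGateVariable(AtomicBoolean gate) {
        this.gate = gate;
    }

    /** Signals the gate if one is installed; returns whether it was signalled. */
    boolean signalGate() {
        AtomicBoolean current = this.gate; // read once to avoid a race with the setter
        if (current == null) {
            return false; // nothing to signal yet, instead of throwing an NPE
        }
        current.set(true);
        return true;
    }
}
```

Whether a guard like this or a change in initialization order is the right fix depends on what the test is meant to verify; the sketch only shows the guard.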



[jira] [Updated] (HBASE-6958) TestAssignmentManager fails in trunk

2012-10-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-6958:
--

Fix Version/s: 0.96.0




[jira] [Updated] (HBASE-5995) Fix and reenable TestLogRolling.testLogRollOnPipelineRestart

2012-10-05 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-5995:
--

Issue Type: Sub-task  (was: Task)
Parent: HBASE-6891

 Fix and reenable TestLogRolling.testLogRollOnPipelineRestart
 

 Key: HBASE-5995
 URL: https://issues.apache.org/jira/browse/HBASE-5995
 Project: HBase
  Issue Type: Sub-task
  Components: test
Affects Versions: 0.96.0
Reporter: stack
Priority: Blocker
 Fix For: 0.96.0


 HBASE-5984 disabled this flakey test (See the issue for more).  This issue is 
 about getting it enabled again.  Made a blocker on 0.96.0 so it gets 
 attention.



[jira] [Commented] (HBASE-5995) Fix and reenable TestLogRolling.testLogRollOnPipelineRestart

2012-10-05 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470872#comment-13470872
 ] 

Andrew Purtell commented on HBASE-5995:
---

This fails consistently against Hadoop 2.




[jira] [Updated] (HBASE-6958) TestAssignmentManager sometimes fails in trunk

2012-10-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-6958:
--

Summary: TestAssignmentManager sometimes fails in trunk  (was: 
TestAssignmentManager fails in trunk)




[jira] [Updated] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6920:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)




[jira] [Commented] (HBASE-6958) TestAssignmentManager fails in trunk

2012-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470874#comment-13470874
 ] 

Ted Yu commented on HBASE-6958:
---

After making the timeout longer, I found that there was trouble scanning .META.
Here is the jstack output:
{code}
RunAmJoinCluster prio=5 tid=0x7ffc32041000 nid=0x6d03 waiting on 
condition [0x000113fd6000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
  at java.lang.Thread.sleep(Native Method)
  at 
org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:194)
  at org.apache.hadoop.hbase.client.ClientScanner.close(ClientScanner.java:371)
  at 
org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:218)
  at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:127)
  at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:668)
  at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:567)
  at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:181)
  at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:141)
  at 
org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:2163)
  at 
org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:320)
  at 
org.apache.hadoop.hbase.master.TestAssignmentManager$2.run(TestAssignmentManager.java:1087)
{code}

 TestAssignmentManager fails in trunk
 

 Key: HBASE-6958
 URL: https://issues.apache.org/jira/browse/HBASE-6958
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
 Fix For: 0.96.0


 From 
 https://builds.apache.org/job/HBase-TRUNK/3432/testReport/junit/org.apache.hadoop.hbase.master/TestAssignmentManager/testBalanceOnMasterFailoverScenarioWithOpenedNode/
  :
 {code}
 Stacktrace
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.System.arraycopy(Native Method)
   at java.lang.ThreadGroup.remove(ThreadGroup.java:969)
   at java.lang.ThreadGroup.threadTerminated(ThreadGroup.java:942)
   at java.lang.Thread.exit(Thread.java:732)
 ...
 2012-10-06 00:46:12,521 DEBUG [MASTER_CLOSE_REGION-mockedAMExecutor-0] zookeeper.ZKUtil(1141): mockedServer-0x13a33892de7000e Retrieved 81 byte(s) of data from znode /hbase/unassigned/dc01abf9cd7fd0ea256af4df02811640 and set watcher; region=t,,1349484359011.dc01abf9cd7fd0ea256af4df02811640., state=M_ZK_REGION_OFFLINE, servername=master,1,1, createTime=1349484372509, payload.length=0
 2012-10-06 00:46:12,522 ERROR [MASTER_CLOSE_REGION-mockedAMExecutor-0] executor.EventHandler(205): Caught throwable while processing event RS_ZK_REGION_CLOSED
 java.lang.NullPointerException
   at org.apache.hadoop.hbase.master.TestAssignmentManager$MockedLoadBalancer.randomAssignment(TestAssignmentManager.java:773)
   at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:1709)
   at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:1666)
   at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1435)
   at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1155)
   at org.apache.hadoop.hbase.master.TestAssignmentManager$AssignmentManagerWithExtrasForTesting.assign(TestAssignmentManager.java:1035)
   at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1130)
   at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1125)
   at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:106)
   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:202)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 2012-10-06 00:46:12,522 DEBUG [pool-1-thread-1-EventThread] master.AssignmentManager(670): Handling transition=M_ZK_REGION_OFFLINE, server=master,1,1, region=dc01abf9cd7fd0ea256af4df02811640, current state from region state map ={t,,1349484359011.dc01abf9cd7fd0ea256af4df02811640. state=OFFLINE, ts=1349484372508, server=null}
 {code}
 Looks like the NPE happened on this line:
 {code}
   this.gate.set(true);
 {code}
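
 An NPE on a line like the one above suggests that the gate field itself was null when randomAssignment() ran. The sketch below shows one defensive shape for such a mock. It is only an illustration under assumptions: the class name SketchMockedBalancer, the setGate() method, and the choice of returning the first server are all hypothetical and are not taken from TestAssignmentManager.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical mock for illustration; names are assumptions, not the HBase test code.
class SketchMockedBalancer {
  // Stays null until a test installs a gate (assumed to be how the NPE can arise).
  private volatile AtomicBoolean gate;

  void setGate(AtomicBoolean gate) {
    this.gate = gate;
  }

  // Picks the first server for determinism; the real method's choice is not modeled.
  String randomAssignment(List<String> servers) {
    AtomicBoolean g = this.gate; // read the field once so the null check is reliable
    if (g != null) {
      g.set(true);               // signal a waiting test thread, if a gate was installed
    }
    return servers.isEmpty() ? null : servers.get(0);
  }
}
```

 With this shape, a call that arrives before the gate is installed no longer throws; whether silently skipping the signal is acceptable depends on what the test is waiting for.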



[jira] [Commented] (HBASE-6900) RegionScanner.reseek() creates NPE when a flush or compaction happens before the reseek.

2012-10-05 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470875#comment-13470875 ]

Hudson commented on HBASE-6900:
---

Integrated in HBase-0.94-security #60 (See 
[https://builds.apache.org/job/HBase-0.94-security/60/])
HBASE-6900 RegionScanner.reseek() creates NPE when a flush or compaction 
happens before the reseek. (Revision 1394378)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java


 RegionScanner.reseek() creates NPE when a flush or compaction happens before 
 the reseek.
 

 Key: HBASE-6900
 URL: https://issues.apache.org/jira/browse/HBASE-6900
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.94.2, 0.96.0

 Attachments: 6900-test.txt, HBASE-6900_1.patch, HBASE-6900.patch


 HBASE-5520 introduced reseek() on the RegionScanner.
 When a scanner is created, we have the StoreScanner heap.  If a flush or
 compaction then happens in parallel, all the StoreScannerObservers are
 cleared, so that the next next() call recreates the scanner based on the
 latest store files.
 The reseek() in StoreScanner expects the heap not to be null, because
 reseek() would always be called from next():
 {code}
 public synchronized boolean reseek(KeyValue kv) throws IOException {
   // Heap cannot be null, because this is only called from next() which
   // guarantees that heap will never be null before this call.
   if (explicitColumnQuery && lazySeekEnabledGlobally) {
     return heap.requestSeek(kv, true, useRowColBloom);
   } else {
     return heap.reseek(kv);
   }
 }
 {code}
 When RegionScanner.reseek() is called directly from coprocessors (CPs), we
 can get an NPE.  In our case it happened while a major compaction was
 running.  I will also attach a test case to show the problem.
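
 The failure mode described above, and one defensive way around it, can be sketched with a small self-contained model. This is not the HBase code and not the committed patch: SketchScanner, its TreeSet-backed heap, and the checkReseek() helper are hypothetical stand-ins chosen for illustration. The only idea taken from the description is that a flush or compaction drops the heap, so reseek() must restore it before use.

```java
import java.util.List;
import java.util.TreeSet;

// Hypothetical, simplified stand-in for a store scanner; not the HBase classes.
class SketchScanner {
  private final List<Integer> storeData; // stands in for the current store files
  private TreeSet<Integer> heap;         // stands in for the scanner heap
  private Integer lastTop;               // key the scanner was positioned at

  SketchScanner(List<Integer> storeData) {
    this.storeData = storeData;
    this.heap = new TreeSet<>(storeData);
  }

  // Models a flush or compaction: remember the position and drop the heap.
  synchronized void updateReaders() {
    if (heap == null) {
      return; // already reset by an earlier call; keep the remembered position
    }
    lastTop = heap.isEmpty() ? null : heap.first();
    heap = null;
  }

  // Rebuilds the heap from the latest data if it was dropped.
  private void checkReseek() {
    if (heap == null) {
      heap = new TreeSet<>(storeData);
      if (lastTop != null) {
        heap = new TreeSet<>(heap.tailSet(lastTop, true));
      }
      lastTop = null;
    }
  }

  // Safe when called directly: the heap is restored before it is used.
  synchronized boolean reseek(int key) {
    checkReseek();
    heap = new TreeSet<>(heap.tailSet(key, true));
    return !heap.isEmpty();
  }

  synchronized Integer peek() {
    checkReseek();
    return heap.isEmpty() ? null : heap.first();
  }
}
```

 Without the checkReseek() call at the top of reseek(), a direct reseek() after updateReaders() would dereference a null heap, which is the NPE reported here.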



[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470876#comment-13470876 ]

Hudson commented on HBASE-6920:
---

Integrated in HBase-0.94-security #60 (See 
[https://builds.apache.org/job/HBase-0.94-security/60/])
HBASE-6920 On timeout connecting to master, client can get stuck and never 
make progress (Revision 1394857)

 Result = FAILURE
gchanan : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/RpcEngine.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestClientTimeouts.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/ipc/RandomTimeoutRpcEngine.java


 On timeout connecting to master, client can get stuck and never make progress
 -

 Key: HBASE-6920
 URL: https://issues.apache.org/jira/browse/HBASE-6920
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.2
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.94.2

 Attachments: HBASE-6920.patch, HBASE-6920-v2.patch


 HBASE-5058 appears to have introduced an issue where a timeout in 
 HConnection.getMaster() can cause the client to never be able to connect to 
 the master.  So, for example, an HBaseAdmin object can never successfully be 
 initialized.
 The issue is here:
 {code}
 if (tryMaster.isMasterRunning()) {
   this.master = tryMaster;
   this.masterLock.notifyAll();
   break;
 }
 {code}
 If isMasterRunning times out, it throws an UndeclaredThrowableException, 
 which is already not ideal, because it can be returned to the application.
  But if the first call to getMaster succeeds, it will set masterChecked = 
 true, which makes us never try to reconnect; that is, we will set this.master 
 = null and just throw MasterNotRunningExceptions, without even trying to 
 connect.
 I tried out a 94 client (actually a 92 client with some 94 patches) on a 
 cluster with some network issues, and it would constantly get stuck as 
 described above.
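
 The stuck-client behaviour described above can be contrasted with a retry shape that never relies on a one-time "checked" flag. The sketch below is a simplified model under assumptions, not the HBase client and not the attached patch: MasterStub, MasterNotRunning, SketchConnection, and the connector supplier are hypothetical names. It treats any failure of isMasterRunning(), including a timeout surfacing as a runtime exception, as a reason to drop the cached proxy and try to connect again.

```java
import java.util.function.Supplier;

// Hypothetical stand-ins for illustration; none of these are HBase client types.
interface MasterStub {
  boolean isMasterRunning() throws Exception;
}

class MasterNotRunning extends Exception {
  MasterNotRunning(String message, Throwable cause) {
    super(message, cause);
  }
}

class SketchConnection {
  private final Supplier<MasterStub> connector; // creates a fresh proxy per attempt
  private final int maxAttempts;
  private MasterStub master;                    // cached proxy; null when unknown or stale

  SketchConnection(Supplier<MasterStub> connector, int maxAttempts) {
    this.connector = connector;
    this.maxAttempts = maxAttempts;
  }

  synchronized MasterStub getMaster() throws MasterNotRunning {
    // Re-validate the cached proxy; any failure marks it stale instead of final.
    if (master != null) {
      try {
        if (master.isMasterRunning()) {
          return master;
        }
      } catch (Exception e) {
        // Fall through and reconnect below.
      }
      master = null;
    }
    Exception last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        MasterStub tryMaster = connector.get();
        if (tryMaster.isMasterRunning()) {
          master = tryMaster;
          return master;
        }
      } catch (Exception e) {
        last = e; // remember the cause, but keep trying
      }
    }
    // Every call gets a fresh round of attempts, so a later call can still succeed.
    throw new MasterNotRunning("master unreachable after " + maxAttempts + " attempts", last);
  }
}
```

 The design choice here is that a failed call leaves no state behind that would block a later call: the next getMaster() starts a new round of connection attempts rather than throwing immediately.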



[jira] [Commented] (HBASE-5582) "No HServerInfo found for" should be a WARNING message

2012-10-05 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470877#comment-13470877 ]

Hudson commented on HBASE-5582:
---

Integrated in HBase-0.94-security #60 (See 
[https://builds.apache.org/job/HBase-0.94-security/60/])
HBASE-5582  "No HServerInfo found for" should be a WARNING message (Kevin 
Odell via JD) (Revision 1394767)

 Result = FAILURE
jdcryans : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/RegionServerTracker.java


 "No HServerInfo found for" should be a WARNING message
 --

 Key: HBASE-5582
 URL: https://issues.apache.org/jira/browse/HBASE-5582
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
Reporter: Shrijeet Paliwal
Assignee: Kevin Odell
Priority: Trivial
  Labels: newbie
 Fix For: 0.92.3, 0.94.2, 0.96.0

 Attachments: HBASE-5582.patch, HBASE-5582.patch1


 The message from RegionServerTracker, "No HServerInfo found for...", is easy to 
 miss. It should not be INFO. 
 From irc chat 
 {noformat}
 <jdcryans> JohnP789: can you grep for "No HServerInfo found for" in that log?
 <jdcryans> wait I see it
 <jdcryans> ok there's your problem
 <shrijeet_> Yes it is there
 <shrijeet_> jdcryans: it should be INFO, why?
 <jdcryans> it shouldn't be INFO, it's so easy to miss
 <jdcryans> it's not the first time we have to look super closely to figure this one out
 <shrijeet_> yes , I will file a jira
 <jdcryans> in any case it's a mismatch in that machine's DNS config
 <shrijeet_> anyways JohnP789 is waiting :) go on
 <JohnP789> haha!
 <JohnP789> yes...  ???  :-)
 <jdcryans> the master is expecting a RS called localhost.localdomain,53875,1328924863478
 17:26 <jdcryans> but the RS calls itself localhost,53875,1328924863478
 {noformat}
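
 The mismatch at the end of the chat is easier to spot when the two server names are compared field by field. The helper below is a hypothetical illustration, not an HBase class: ServerNameDiff and describeMismatch() are assumed names, and the only thing taken from the chat is the "host,port,startcode" shape of the two strings.

```java
// Hypothetical helper for illustration; not part of HBase.
class ServerNameDiff {
  // Splits "host,port,startcode" names and reports which components differ.
  static String describeMismatch(String expected, String actual) {
    String[] e = expected.split(",");
    String[] a = actual.split(",");
    if (e.length != 3 || a.length != 3) {
      return "unparseable server name";
    }
    String[] labels = {"hostname", "port", "startcode"};
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 3; i++) {
      if (!e[i].equals(a[i])) {
        if (sb.length() > 0) {
          sb.append("; ");
        }
        sb.append(labels[i]).append(" differs: ").append(e[i]).append(" vs ").append(a[i]);
      }
    }
    return sb.length() == 0 ? "match" : sb.toString();
  }

  public static void main(String[] args) {
    // The two names quoted in the chat: only the hostname part differs.
    System.out.println(describeMismatch(
        "localhost.localdomain,53875,1328924863478",
        "localhost,53875,1328924863478"));
  }
}
```

 For the two names in the chat, port and startcode agree and only the hostname differs, which is consistent with the DNS configuration mismatch jdcryans points to.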



[jira] [Commented] (HBASE-6889) Ignore source control files with apache-rat

2012-10-05 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470878#comment-13470878 ]

Hudson commented on HBASE-6889:
---

Integrated in HBase-0.94-security #60 (See 
[https://builds.apache.org/job/HBase-0.94-security/60/])
HBASE-6889 Ignore source control files with apache-rat (Jesse Yates) 
(Revision 1394736)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/pom.xml


 Ignore source control files with apache-rat
 ---

 Key: HBASE-6889
 URL: https://issues.apache.org/jira/browse/HBASE-6889
 Project: HBase
  Issue Type: Bug
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.94.2, 0.96.0

 Attachments: hbase-6889-mvn-v0.patch


 Running 'mvn apache-rat:check' locally causes a failure because it finds the 
 source control files, making it hard to check that you didn't include a file 
 without a source header.



[jira] [Commented] (HBASE-6797) TestHFileCleaner#testHFileCleaning sometimes fails in trunk

2012-10-05 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470883#comment-13470883 ]

Hudson commented on HBASE-6797:
---

Integrated in HBase-TRUNK #3433 (See 
[https://builds.apache.org/job/HBase-TRUNK/3433/])
HBASE-6797 TestHFileCleaner#testHFileCleaning sometimes fails in trunk 
(Jesse) (Revision 1394875)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/TimeToLiveHFileCleaner.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestHFileCleaner.java


 TestHFileCleaner#testHFileCleaning sometimes fails in trunk
 ---

 Key: HBASE-6797
 URL: https://issues.apache.org/jira/browse/HBASE-6797
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Jesse Yates
 Attachments: hbase-6797-v0.patch, hbase-6797-v1.patch


 In build #3334, I saw:
 {code}
 java.lang.AssertionError: expected:<1> but was:<0>
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at org.apache.hadoop.hbase.master.cleaner.TestHFileCleaner.testHFileCleaning(TestHFileCleaner.java:88)
 {code}



[jira] [Commented] (HBASE-6958) TestAssignmentManager sometimes fails in trunk

2012-10-05 Thread Ted Yu (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470884#comment-13470884 ]

Ted Yu commented on HBASE-6958:
---

Here is my environment:

Darwin L0032 11.4.2 Darwin Kernel Version 11.4.2: Thu Aug 23 16:25:48 PDT 2012; 
root:xnu-1699.32.7~1/RELEASE_X86_64 x86_64

java version "1.7.0_07"
Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)

 TestAssignmentManager sometimes fails in trunk
 --

 Key: HBASE-6958
 URL: https://issues.apache.org/jira/browse/HBASE-6958
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
 Fix For: 0.96.0





[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Lars Hofhansl (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470885#comment-13470885 ]

Lars Hofhansl commented on HBASE-6920:
--

Looks like this is breaking the security build.

 On timeout connecting to master, client can get stuck and never make progress
 -

 Key: HBASE-6920
 URL: https://issues.apache.org/jira/browse/HBASE-6920
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.2
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.94.2

 Attachments: HBASE-6920.patch, HBASE-6920-v2.patch





[jira] [Updated] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6920:
-

Attachment: 6920-addendum.txt

Addendum to fix the SecureRpcEngine

 On timeout connecting to master, client can get stuck and never make progress
 -

 Key: HBASE-6920
 URL: https://issues.apache.org/jira/browse/HBASE-6920
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.2
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.94.2

 Attachments: 6920-addendum.txt, HBASE-6920.patch, HBASE-6920-v2.patch





[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Lars Hofhansl (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470888#comment-13470888 ]

Lars Hofhansl commented on HBASE-6920:
--

Committed addendum to 0.94

 On timeout connecting to master, client can get stuck and never make progress
 -

 Key: HBASE-6920
 URL: https://issues.apache.org/jira/browse/HBASE-6920
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.2
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.94.2

 Attachments: 6920-addendum.txt, HBASE-6920.patch, HBASE-6920-v2.patch





[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470892#comment-13470892 ]

Hudson commented on HBASE-6920:
---

Integrated in HBase-0.94-security #62 (See 
[https://builds.apache.org/job/HBase-0.94-security/62/])
HBASE-6920 Addendum - fix SecureRpcEngine (Revision 1394908)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/security/src/main/java/org/apache/hadoop/hbase/ipc/SecureRpcEngine.java


 On timeout connecting to master, client can get stuck and never make progress
 -

 Key: HBASE-6920
 URL: https://issues.apache.org/jira/browse/HBASE-6920
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.2
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.94.2

 Attachments: 6920-addendum.txt, HBASE-6920.patch, HBASE-6920-v2.patch





[jira] [Commented] (HBASE-6920) On timeout connecting to master, client can get stuck and never make progress

2012-10-05 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470894#comment-13470894 ]

Hudson commented on HBASE-6920:
---

Integrated in HBase-0.94 #511 (See 
[https://builds.apache.org/job/HBase-0.94/511/])
HBASE-6920 Addendum - fix SecureRpcEngine (Revision 1394908)

 Result = SUCCESS
larsh : 
Files : 
* /hbase/branches/0.94/security/src/main/java/org/apache/hadoop/hbase/ipc/SecureRpcEngine.java


 On timeout connecting to master, client can get stuck and never make progress
 -

 Key: HBASE-6920
 URL: https://issues.apache.org/jira/browse/HBASE-6920
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.2
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.94.2

 Attachments: 6920-addendum.txt, HBASE-6920.patch, HBASE-6920-v2.patch


