from:"Zhihong Yu \(JIRA\)"


[ 
https://issues.apache.org/jira/browse/HBASE-5908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265527#comment-13265527
 ] 

Zhihong Yu commented on HBASE-5908:
---

@Gregory:
I would suggest referencing other JIRAs by their names only, such as 
HADOOP-8230.

This way we would easily see whether the JIRA has been resolved.

 TestHLogSplit.testTralingGarbageCorruptionFileSkipErrorsPasses should not use 
 append to corrupt the HLog
 

 Key: HBASE-5908
 URL: https://issues.apache.org/jira/browse/HBASE-5908
 Project: HBase
  Issue Type: Bug
  Components: test, wal
Affects Versions: 0.96.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Attachments: HBASE-5908-trunk.patch


 TestHLogSplit.testTralingGarbageCorruptionFileSkipErrorsPasses fails against 
 a version of hadoop with https://issues.apache.org/jira/browse/HADOOP-8230
 The failure:
 java.io.IOException: Append is not supported. Please see the 
 dfs.support.append configuration parameter.
 Instead of using append, we can probably just:
 - copy over the contents to a new file
 - append the garbage to the new file
 - copy back to the old file
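
The three steps above could be sketched roughly as follows. This is only an illustration against the local filesystem using java.nio.file; the real test would presumably go through Hadoop's FileSystem API instead. The class and method names (CorruptWithoutAppend, appendGarbage) and the temporary file suffix are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: adds trailing garbage to a log file without ever
// opening the original file in append mode.
class CorruptWithoutAppend {
  static void appendGarbage(Path log, byte[] garbage) throws IOException {
    Path tmp = log.resolveSibling(log.getFileName() + ".corrupt.tmp");
    // Steps 1 and 2: write the original contents followed by the garbage
    // into a freshly created file, using a single output stream.
    try (OutputStream out = Files.newOutputStream(tmp)) {
      Files.copy(log, out);
      out.write(garbage);
    }
    // Step 3: put the corrupted file back under the old name
    // (a move is used here instead of a second copy).
    Files.move(tmp, log, StandardCopyOption.REPLACE_EXISTING);
  }
}
```

Writing the contents and the garbage through one newly created stream means the sketch never depends on append support at all.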

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5548) Add ability to get a table in the shell


[ 
https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265529#comment-13265529
 ] 

Zhihong Yu commented on HBASE-5548:
---

@Jesse:
I checked the two outstanding Hadoop QA jobs around 23:27 - they were not for 
this JIRA.

 Add ability to get a table in the shell
 ---

 Key: HBASE-5548
 URL: https://issues.apache.org/jira/browse/HBASE-5548
 Project: HBase
  Issue Type: Improvement
  Components: shell
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: ruby_HBASE-5528-v0.patch, ruby_HBASE-5548-v1.patch, 
 ruby_HBASE-5548-v2.patch, ruby_HBASE-5548-v3.patch, ruby_HBASE-5548-v5.patch


 Currently, all the commands that operate on a table in the shell first have 
 to take the table name as input. 
 There are two main considerations:
 * It is annoying to have to write the table name every time, when you should 
 just be able to get a reference to a table
 * the current implementation is very wasteful - it creates a new HTable for 
 each call (but reuses the connection since it uses the same configuration)
 We should be able to get a handle to a single HTable and then operate on that.


[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer


[ 
https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265606#comment-13265606
 ] 

Zhihong Yu commented on HBASE-5699:
---

bq. to one HLog object, which might have more than one underlying stream.
The above can be a (sub-)task by itself.


 Run with > 1 WAL in HRegionServer
 -

 Key: HBASE-5699
 URL: https://issues.apache.org/jira/browse/HBASE-5699
 Project: HBase
  Issue Type: Improvement
Reporter: binlijin
Assignee: Li Pi




[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer


[ 
https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265620#comment-13265620
 ] 

Zhihong Yu commented on HBASE-5699:
---

Currently we maintain one sequence number per region per HLog. From append():
{code}
  this.lastSeqWritten.putIfAbsent(regionInfo.getEncodedNameAsBytes(),
    Long.valueOf(seqNum));
{code}
If the WALEdits from a particular region can be spread across multiple streams, 
the accounting would be more complex.
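
To illustrate the bookkeeping in question, here is a minimal sketch. It is not the HLog code: SeqTracker and its methods are hypothetical, a String stands in for the encoded region name bytes, and the assumption is that the recorded number is the oldest sequence number a region has not yet flushed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the per-region accounting done in append().
class SeqTracker {
  // Region name -> sequence number of the first edit written since the
  // region's last flush.
  private final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();

  void append(String encodedRegionName, long seqNum) {
    // putIfAbsent keeps the first value, so later appends for the same
    // region do not overwrite the oldest unflushed sequence number.
    lastSeqWritten.putIfAbsent(encodedRegionName, Long.valueOf(seqNum));
  }

  void flushed(String encodedRegionName) {
    // After a flush the region has no unflushed edits left in the log.
    lastSeqWritten.remove(encodedRegionName);
  }

  // Smallest sequence number that is still needed by any region,
  // or Long.MAX_VALUE when nothing is outstanding.
  long oldestUnflushed() {
    long min = Long.MAX_VALUE;
    for (long v : lastSeqWritten.values()) {
      min = Math.min(min, v);
    }
    return min;
  }
}
```

With a single stream one map like this is enough. If a region's edits were spread over several streams, each stream would presumably need its own view of the oldest outstanding edit, which is the extra complexity mentioned above.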





[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode


[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265627#comment-13265627
 ] 

Zhihong Yu commented on HBASE-5547:
---

@Jesse:
Do you want to attach a patch so that Hadoop QA can run the test suite?

 Don't delete HFiles when in backup mode
 -

 Key: HBASE-5547
 URL: https://issues.apache.org/jira/browse/HBASE-5547
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Jesse Yates

 This came up in a discussion I had with Stack.
 It would be nice if HBase could be notified that a backup is in progress (via 
 a znode for example) and in that case either:
 1. rename HFiles to be deleted to file.bck
 2. rename the HFiles into a special directory
 3. rename them to a general trash directory (which would not need to be tied 
 to backup mode).
 That way it should be possible to get a consistent backup based on HFiles (HDFS 
 snapshots or hard links would be better options here, but we do not have 
 those).
 #1 makes cleanup a bit harder.
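
Option 2 might look roughly like the sketch below. It uses java.nio.file on the local filesystem purely for illustration; HFileRemover, the archive directory and the backupInProgress flag (which would come from the znode mentioned above) are all hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: deletes an HFile normally, but moves it into a
// special directory while a backup is in progress.
class HFileRemover {
  // Returns the new location of the file, or null if it was deleted.
  static Path remove(Path hfile, Path archiveDir, boolean backupInProgress)
      throws IOException {
    if (!backupInProgress) {
      Files.delete(hfile);
      return null;
    }
    Files.createDirectories(archiveDir);
    // Keep the file name so a backup tool can still find the file.
    return Files.move(hfile, archiveDir.resolve(hfile.getFileName()));
  }
}
```

Compared to option 1, cleanup after the backup is a matter of emptying one directory.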


[jira] [Commented] (HBASE-5897) prePut coprocessor hook causing substantial CPU usage

2012-04-29 Thread Zhihong Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264538#comment-13264538
 ] 

Zhihong Yu commented on HBASE-5897:
---

+1 on Todd's patch.

 prePut coprocessor hook causing substantial CPU usage
 -

 Key: HBASE-5897
 URL: https://issues.apache.org/jira/browse/HBASE-5897
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.94.0, 0.96.0

 Attachments: 5897-simple.txt, hbase-5897.txt


 I was running an insert workload against trunk under oprofile and saw that a 
 significant portion of CPU usage was going to calling the prePut 
 coprocessor hook inside doMiniBatchPut, even though I don't have any 
 coprocessors installed. I ran a million-row insert and collected CPU time 
 spent in the RS after commenting out the prePut hook, and found CPU usage 
 reduced by 33%.


[jira] [Updated] (HBASE-5342) Grant/Revoke global permissions

2012-04-29 Thread Zhihong Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5342:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12525018/HBASE-5342-v0.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1684//console

This message is automatically generated.)

 Grant/Revoke global permissions
 ---

 Key: HBASE-5342
 URL: https://issues.apache.org/jira/browse/HBASE-5342
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Matteo Bertozzi
 Attachments: HBASE-5342-draft.patch, HBASE-5342-v0.patch


 HBASE-3025 introduced simple ACLs based on coprocessors. It defines 
 global/table/cf/cq level permissions. However, there is no way to 
 grant/revoke global level permissions, other than the hbase.superuser conf 
 setting. 


[jira] [Commented] (HBASE-5342) Grant/Revoke global permissions

2012-04-29 Thread Zhihong Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264539#comment-13264539
 ] 

Zhihong Yu commented on HBASE-5342:
---

@Matteo:
Can you run the new patch against the security profile and let us know the result?

Thanks



[jira] [Commented] (HBASE-5894) Delete table failed but HBaseAdmin#deletetable report it as success


[ 
https://issues.apache.org/jira/browse/HBASE-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264290#comment-13264290
 ] 

Zhihong Yu commented on HBASE-5894:
---

{code}
+  throw new RegionException("Retries exhausted, it took too long to wait"+
{code}
I think IOException should be thrown above.

If a test can be added to verify the fix, that would be nice.

 Delete table failed but HBaseAdmin#deletetable report it as success
 ---

 Key: HBASE-5894
 URL: https://issues.apache.org/jira/browse/HBASE-5894
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.7, 0.92.2, 0.94.0
 Environment: all versions
Reporter: xufeng
Assignee: xufeng
Priority: Minor
 Attachments: HBASE-5894_trunk_patch_v1.patch, 
 HBASE-5894_trunk_patch_v1_surefire-report.html


 To reproduce this issue, I added the following code in 
 DeleteTableHandler#handleTableOperation():
 {noformat}
   LOG.debug("Deleting region " + region.getRegionNameAsString() +
     " from META and FS");
 +if (true) {
 +  throw new IOException("ERROR");
 +}
   // Remove region from META
   MetaEditor.deleteRegion(this.server.getCatalogTracker(), region);
 {noformat}
 Step 1: create a table and disable it.
 Step 2: delete it through the HBaseAdmin#deleteTable() API.
 Result: after a long time, the log says the table has been deleted, but in 
 fact, if we run list in the shell, the table still exists.


[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size


 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5611:
--

Attachment: 5611-94.addendum

Addendum for 0.94

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5611-94.addendum, HBASE-5611-92.patch, 
 HBASE-5611-94-minorchange.patch, HBASE-5611-trunk-v2-minorchange.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and none of them has 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed for entry null
 java.lang.IllegalStateException
   at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
   at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it is still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.
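
A minimal sketch of the kind of accounting fix described above, under the assumption that the global size is a simple counter. GlobalMemStoreAccounting, openRegion and the openSucceeds flag are hypothetical and only model the idea; they are not the HBase classes or the attached patch.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of the global MemStore size counter.
class GlobalMemStoreAccounting {
  private final AtomicLong globalSize = new AtomicLong();

  long get() {
    return globalSize.get();
  }

  // Replays recovered edits of the given sizes, then finishes opening the
  // region. If the open does not complete, the replayed bytes are removed
  // from the global counter again.
  boolean openRegion(long[] replayedEditSizes, boolean openSucceeds) {
    long replayed = 0;
    boolean opened = false;
    try {
      for (long size : replayedEditSizes) {
        globalSize.addAndGet(size);
        replayed += size;
      }
      opened = openSucceeds; // stand-in for the rest of the open
      return opened;
    } finally {
      if (!opened) {
        // The region's MemStores are dropped, so undo their contribution.
        globalSize.addAndGet(-replayed);
      }
    }
  }
}
```

Without the rollback in the finally block, a failed open would leave the counter above zero with no region holding the corresponding edits, which matches the symptom in the log above.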


[jira] [Commented] (HBASE-5894) Delete table failed but HBaseAdmin#deletetable report it as success


[ 
https://issues.apache.org/jira/browse/HBASE-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264343#comment-13264343
 ] 

Zhihong Yu commented on HBASE-5894:
---

You can use the following annotation to limit the duration of a particular 
test (the timeout value is in milliseconds):
{code}
  @Test(timeout = 12)
{code}
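
For readers unfamiliar with the annotation, the effect of a test timeout can be approximated in plain Java as below. This is not JUnit's implementation, only a sketch of the idea using java.util.concurrent; TimeLimit and runWithTimeout are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs a body on another thread and fails if it does
// not finish within the given number of milliseconds.
class TimeLimit {
  static void runWithTimeout(Runnable body, long timeoutMillis) throws Exception {
    ExecutorService exec = Executors.newSingleThreadExecutor();
    try {
      Future<?> result = exec.submit(body);
      // Throws java.util.concurrent.TimeoutException if the body is too slow.
      result.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } finally {
      // Interrupts the body if it is still running.
      exec.shutdownNow();
    }
  }
}
```

A test for the deleteTable fix could use such a limit so that a hang shows up as a failure rather than a stuck build.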



[jira] [Reopened] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size


 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu reopened HBASE-5611:
---


Changes for 0.94 reverted due to TestChangingEncoding failure.



[jira] [Commented] (HBASE-5342) Grant/Revoke global permissions


[ 
https://issues.apache.org/jira/browse/HBASE-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264415#comment-13264415
 ] 

Zhihong Yu commented on HBASE-5342:
---

I got the following when applying the draft patch:
{code}
2 out of 9 hunks FAILED -- saving rejects to file 
security/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java.rej
{code}
{code}
+  private void updateGlobalCache(ListMultimap<String,TablePermission> userPerms) {
{code}
I would expect Permission in the method signature above. Can the following 
method be changed to return ListMultimap<String, Permission>?
{code}
   ListMultimap<String,TablePermission> perms = 
     AccessControlLists.readPermissions(in, conf);
{code}
{code}
-Set<String> tableSet = new HashSet<String>();
+Set<byte[]> tableSet = new HashSet<byte[]>();
{code}
HashSet is backed by HashMap: see line 93 of 
http://www.docjar.com/html/api/java/util/HashSet.java.html
I think a proper comparator should be used above.
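
A small sketch of the concern: Java arrays inherit identity-based hashCode() and equals() from Object, so a HashSet of byte[] treats two arrays with the same contents as different elements, while a TreeSet with a byte-wise comparator does not. In HBase, Bytes.BYTES_COMPARATOR would be the natural choice; the hand-written comparator and the ByteArraySets class below are only there to keep the example self-contained.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class comparing the two set implementations.
class ByteArraySets {
  // Lexicographic comparison of the bytes, treated as unsigned values.
  static final Comparator<byte[]> LEXICOGRAPHIC = (a, b) -> {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int cmp = (a[i] & 0xff) - (b[i] & 0xff);
      if (cmp != 0) {
        return cmp;
      }
    }
    return a.length - b.length;
  };

  // Counts distinct table names using identity-based hashing.
  static int distinctWithHashSet(byte[]... tables) {
    Set<byte[]> set = new HashSet<>();
    for (byte[] t : tables) {
      set.add(t);
    }
    return set.size();
  }

  // Counts distinct table names by comparing array contents.
  static int distinctWithTreeSet(byte[]... tables) {
    Set<byte[]> set = new TreeSet<>(LEXICOGRAPHIC);
    for (byte[] t : tables) {
      set.add(t);
    }
    return set.size();
  }
}
```

Passing two separately allocated arrays that both hold the bytes of the same table name gives a size of 2 from the HashSet variant and 1 from the TreeSet variant.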
{code}
+   * Returns true if this permission describe a user global permission.
{code}
Should read 'describes a global user permission'
{code}
+  raise(ArgumentError, "Can't find a family: #{family}") unless htd.hasFamily(family.to_java_bytes)
{code}
Line exceeds 100 chars. Remove the 'a' before 'family' or replace it with 'the'.
{code}
+user_permission = 
org.apache.hadoop.hbase.security.access.UserPermission.new(user.to_java_bytes, 
table_name.to_java_bytes, fambytes, qualbytes, .to_java_bytes)
{code}
Above line is too long. Length should be no longer than 100 chars. Same with 
the assignment in the else block below.

 Grant/Revoke global permissions
 ---

 Key: HBASE-5342
 URL: https://issues.apache.org/jira/browse/HBASE-5342
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Matteo Bertozzi
 Attachments: HBASE-5342-draft.patch


 HBASE-3025 introduced simple ACLs based on coprocessors. It defines 
 global/table/cf/cq level permissions. However, there is no way to 
 grant/revoke global level permissions, other than the hbase.superuser conf 
 setting. 


[jira] [Commented] (HBASE-5885) Invalid HFile block magic on Local file System


[ 
https://issues.apache.org/jira/browse/HBASE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264441#comment-13264441
 ] 

Zhihong Yu commented on HBASE-5885:
---

HBASE-5611 has been taken out of 0.94, yet we still see the following test 
failure:
{code}
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.94/159/testReport/org.apache.hadoop.hbase.io.encoding/TestChangingEncoding/testFlippingEncodeOnDisk/
{code}
I am starting to think that this JIRA is related to the failure above.

 Invalid HFile block magic on Local file System
 --

 Key: HBASE-5885
 URL: https://issues.apache.org/jira/browse/HBASE-5885
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: 5885-trunk-v2.txt, HBASE-5885-94-0.patch, 
 HBASE-5885-94-1.patch, HBASE-5885-trunk-0.patch, HBASE-5885-trunk-1.patch


 ERROR: java.lang.RuntimeException: 
 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
 attempts=7, exceptions:
 Thu Apr 26 11:19:18 PDT 2012, 
 org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: 
 java.io.IOException: Could not iterate StoreFileScanner[HFileScanner for 
 reader 
 reader=file:/tmp/hbase-eclark/hbase/TestTable/e2d1c846363c75262cbfd85ea278b342/info/bae2681d63734066957b58fe791a0268,
  compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true] 
 [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] 
 [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false], 
 firstKey=01/info:data/1335463981520/Put, 
 lastKey=0002588100/info:data/1335463902296/Put, avgKeyLen=30, 
 avgValueLen=1000, entries=1215085, length=1264354417, 
 cur=000248/info:data/1335463994457/Put/vlen=1000/ts=0]
   at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:135)
   at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:95)
   at 
 org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:368)
   at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3323)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3279)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3296)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2393)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376)
 Caused by: java.io.IOException: Invalid HFile block magic: 
 \xEC\xD5\x9D\xB4\xC2bfo
   at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:153)
   at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:164)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock.init(HFileBlock.java:254)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1779)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1637)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:327)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:555)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:651)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:130)
   ... 12 more
 Thu Apr 26 11:19:19 PDT 2012, 
 org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: 
 java.io.IOException: java.lang.IllegalArgumentException
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1132)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1121)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2420)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at

[jira] [Commented] (HBASE-5885) Invalid HFile block magic on Local file System


[ 
https://issues.apache.org/jira/browse/HBASE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264452#comment-13264452
 ] 

Zhihong Yu commented on HBASE-5885:
---

Here is another one:
https://builds.apache.org/job/HBase-0.94/157/testReport/org.apache.hadoop.hbase.io.encoding/TestChangingEncoding/testFlippingEncodeOnDisk/

 Invalid HFile block magic on Local file System
 --

 Key: HBASE-5885
 URL: https://issues.apache.org/jira/browse/HBASE-5885
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: 5885-trunk-v2.txt, HBASE-5885-94-0.patch, 
 HBASE-5885-94-1.patch, HBASE-5885-trunk-0.patch, HBASE-5885-trunk-1.patch


 ERROR: java.lang.RuntimeException: 
 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
 attempts=7, exceptions:
 Thu Apr 26 11:19:18 PDT 2012, 
 org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: 
 java.io.IOException: Could not iterate StoreFileScanner[HFileScanner for 
 reader 
 reader=file:/tmp/hbase-eclark/hbase/TestTable/e2d1c846363c75262cbfd85ea278b342/info/bae2681d63734066957b58fe791a0268,
  compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true] 
 [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] 
 [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false], 
 firstKey=01/info:data/1335463981520/Put, 
 lastKey=0002588100/info:data/1335463902296/Put, avgKeyLen=30, 
 avgValueLen=1000, entries=1215085, length=1264354417, 
 cur=000248/info:data/1335463994457/Put/vlen=1000/ts=0]
   at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:135)
   at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:95)
   at 
 org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:368)
   at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3323)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3279)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3296)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2393)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376)
 Caused by: java.io.IOException: Invalid HFile block magic: 
 \xEC\xD5\x9D\xB4\xC2bfo
   at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:153)
   at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:164)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock.init(HFileBlock.java:254)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1779)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1637)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:327)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:555)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:651)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:130)
   ... 12 more
 Thu Apr 26 11:19:19 PDT 2012, 
 org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: 
 java.io.IOException: java.lang.IllegalArgumentException
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1132)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1121)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2420)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376)
 Caused by: java.lang.IllegalArgumentException
   at

[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock


[ 
https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264456#comment-13264456
 ] 

Zhihong Yu commented on HBASE-5898:
---

Interesting idea.
Minor comments:
The indentation for while (true) loop is off.
Changes to conf/hbase-site.xml belong to another JIRA.

 Consider double-checked locking for block cache lock
 

 Key: HBASE-5898
 URL: https://issues.apache.org/jira/browse/HBASE-5898
 Project: HBase
  Issue Type: Improvement
  Components: performance
Affects Versions: 0.94.1
Reporter: Todd Lipcon
 Attachments: hbase-5898.txt


 Running a workload with a high query rate against a dataset that fits in 
 cache, I saw a lot of CPU being used in IdLock.getLockEntry, being called by 
 HFileReaderV2.readBlock. Even though it was all cache hits, it was wasting a 
 lot of CPU doing lock management here. I wrote a quick patch to switch to a 
 double-checked locking and it improved throughput substantially for this 
 workload.
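
A minimal sketch of the double-checked pattern described above, assuming a map-backed cache. It is not the attached patch: the class name, the per-key lock map, and the loader callback are hypothetical stand-ins for HFileReaderV2.readBlock and IdLock. The point it illustrates is that a cache hit returns before any lock is touched.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical reader illustrating double-checked locking; not HBase code. */
final class DoubleCheckedBlockReader {
  private final ConcurrentHashMap<String, byte[]> cache = new ConcurrentHashMap<>();
  // Per-key lock objects. Entries are never reclaimed here to keep the sketch
  // short; a real implementation would release them.
  private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();
  final AtomicInteger lockAcquisitions = new AtomicInteger();
  final AtomicInteger loads = new AtomicInteger();

  byte[] readBlock(String key, Function<String, byte[]> loader) {
    // First check: no lock, so cache hits pay nothing for lock management.
    byte[] block = cache.get(key);
    if (block != null) {
      return block;
    }
    Object lock = locks.computeIfAbsent(key, k -> new Object());
    synchronized (lock) {
      lockAcquisitions.incrementAndGet();
      // Second check: another thread may have loaded the block while we waited.
      block = cache.get(key);
      if (block == null) {
        loads.incrementAndGet();
        block = loader.apply(key);
        cache.put(key, block);
      }
      return block;
    }
  }
}
```

With this shape, only the first read of a key takes the lock; later reads of the same key return from the unlocked check.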

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock


[ 
https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13264470#comment-13264470
 ] 

Zhihong Yu commented on HBASE-5898:
---

Consider the case where the off-heap cache is enabled.
Suppose getBlock() is executed without the lock (the first pass in the new loop 
of readBlock) and doesn't find cacheKey in onHeapCache but finds it in 
offHeapCache. It will then call onHeapCache.cacheBlock(), as in this excerpt 
from DoubleBlockCache:
{code}
  public Cacheable getBlock(BlockCacheKey cacheKey, boolean caching) {
    Cacheable cachedBlock;

    if ((cachedBlock = onHeapCache.getBlock(cacheKey, caching)) != null) {
      stats.hit(caching);
      return cachedBlock;

    } else if ((cachedBlock = offHeapCache.getBlock(cacheKey, caching)) != null) {
      if (caching) {
        onHeapCache.cacheBlock(cacheKey, cachedBlock);
      }
{code}
Another thread calls cacheBlock() around the same time and executes 
onHeapCache.cacheBlock() for the same cacheKey:
{code}
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf) {
    onHeapCache.cacheBlock(cacheKey, buf);
    offHeapCache.cacheBlock(cacheKey, buf);
  }
{code}
I think there is a race condition which didn't exist before the proposed 
change: the entries for the same cacheKey in onHeapCache and offHeapCache would 
diverge.

If the off-heap cache is disabled, I don't see a problem with the proposed 
optimization.
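
The interleaving described above can be replayed step by step on a single thread. This is a simplified model, not DoubleBlockCache: both levels are plain maps that overwrite on put, and the key and values are made up. Whether the real caches accept a second cacheBlock() for an existing key is not established here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-level cache model used only to replay the interleaving. */
final class TwoLevelCacheRace {
  final Map<String, String> onHeap = new HashMap<>();
  final Map<String, String> offHeap = new HashMap<>();

  /** Returns true if the two levels hold different values for the key at the end. */
  static boolean replayInterleaving() {
    TwoLevelCacheRace c = new TwoLevelCacheRace();
    String key = "block-1";
    c.offHeap.put(key, "v1"); // initial state: block cached off heap only

    // Thread A, unlocked getBlock(): misses on heap, hits off heap.
    String seenByA = c.onHeap.get(key);
    if (seenByA == null) {
      seenByA = c.offHeap.get(key);
    }

    // Thread B, cacheBlock(): writes a new buffer to both levels.
    c.onHeap.put(key, "v2");
    c.offHeap.put(key, "v2");

    // Thread A resumes and promotes the value it read earlier.
    c.onHeap.put(key, seenByA);

    return !c.onHeap.get(key).equals(c.offHeap.get(key));
  }
}
```

In this model the on-heap level ends with the older value while the off-heap level holds the newer one, which is the divergence the comment is concerned about.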

 Consider double-checked locking for block cache lock
 

 Key: HBASE-5898
 URL: https://issues.apache.org/jira/browse/HBASE-5898
 Project: HBase
  Issue Type: Improvement
  Components: performance
Affects Versions: 0.94.1
Reporter: Todd Lipcon
 Attachments: hbase-5898.txt



[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size


[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262648#comment-13262648
 ] 

Zhihong Yu commented on HBASE-5611:
---

@Jieshan:
When you have multiple patches for different branches, attach the patch for 
trunk separately from the other patches.
Otherwise Hadoop QA may pick up the wrong patch.

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and finds that none of them 
 have edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed for entry null
 java.lang.IllegalStateException
 at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it's still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.
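
The fix direction stated above can be sketched as follows, assuming a single shared counter for the global MemStore size. The class and method names are hypothetical and do not come from the attached patches; the sketch only shows the accounting being rolled back when an open fails.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical accounting helper; not the HBase RegionServerAccounting class. */
final class GlobalMemStoreAccounting {
  private final AtomicLong globalSize = new AtomicLong();

  long get() {
    return globalSize.get();
  }

  /**
   * Replays recovered edits for an opening region. If the open fails, the
   * bytes added during replay are subtracted again so the global size does
   * not count edits that were dropped with the region.
   */
  boolean openRegion(long[] replayedEditSizes, boolean openSucceeds) {
    long replayed = 0;
    for (long size : replayedEditSizes) {
      globalSize.addAndGet(size);
      replayed += size;
    }
    if (!openSucceeds) {
      globalSize.addAndGet(-replayed); // roll back the failed region's share
      return false;
    }
    return true;
  }
}
```

Tracking the replayed total per region is what makes the rollback possible; without it the flusher sees memory pressure that no region can relieve.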


[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size


 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5611:
--

Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch



[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size


[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262652#comment-13262652
 ] 

Zhihong Yu commented on HBASE-5611:
---

Patch v2 looks good in general.
Comment on formatting:
{code}
+   * @param regionName
+   *  region name.
{code}
The line length is 100 chars. Please put javadoc for param on the same line as 
param name.

You can wait for Hadoop QA result to come back before attaching new patches.

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch



[jira] [Issue Comment Edited] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size


[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262652#comment-13262652
 ] 

Zhihong Yu edited comment on HBASE-5611 at 4/26/12 2:58 PM:


Patch v2 looks good in general.
Comment on formatting:
{code}
+   * @param regionName
+   *  region name.
{code}
The line length limit is 100 chars. Please put javadoc for param on the same 
line as param name.

You can wait for Hadoop QA result to come back before attaching new patches.

  was (Author: zhi...@ebaysf.com):
Patch v2 looks good in general.
Comment on formatting:
{code}
+   * @param regionName
+   *  region name.
{code}
The line length is 100 chars. Please put javadoc for param on the same line as 
param name.

You can wait for Hadoop QA result to come back before attaching new patches.
  
 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch



[jira] [Commented] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

[
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262698#comment-13262698
]

Zhihong Yu commented on HBASE-5875:
---

bq. Or can we update the root region node in the RS side after updating the
online server list?
Let's try this approach first.

The other approach would involve retry count, sleep interval, etc.

Process RIT and Master restart may remove an online server considering it as
a dead server
--

Key: HBASE-5875
URL: https://issues.apache.org/jira/browse/HBASE-5875
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.94.1

If on master restart it finds the ROOT/META to be in RIT state, master tries
to assign the ROOT region through ProcessRIT.
Master will trigger the assignment and next will try to verify the Root
Region Location.
Root region location verification is done seeing if the RS has the region in
its online list.
If the master triggered assignment has not yet been completed in RS then the
verify root region location will fail.
Because it failed
{code}
splitLogAndExpireIfOnline(currentRootServer);
{code}
we split the log and also remove the server from the online server list. Ideally
there is nothing to do in splitlog here, as no region server was restarted.
So although the server is online, the master invalidates the region
server.
In the special case where I have only one RS, my cluster becomes
inoperative.


[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size


[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262702#comment-13262702
 ] 

Zhihong Yu commented on HBASE-5611:
---

Tests were clear.

@Jieshan:
Please address formatting and prepare patches for each branch.

We should also run test suite for 0.90 and 0.92 once patches are available.

Good job.

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch



[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

[
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262710#comment-13262710
]

Zhihong Yu commented on HBASE-5877:
---

@N:
I think the following validation in real cluster would illustrate the benefit
of this feature:
For given table, select a region server and note the row key ranges hosted by
the region server. Direct client load to this server.
Kill the server at time T.

Difference in client response to region migration around time T with and
without the patch would be interesting.

When a query fails because the region has moved, let the regionserver return
the new address to the client
--

Key: HBASE-5877
URL: https://issues.apache.org/jira/browse/HBASE-5877
Project: HBase
Issue Type: Improvement
Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.96.0

Attachments: 5877.v1.patch

This is mainly useful when we do a rolling restart. This will decrease the
load on the master and the network load.
Note that a region is not immediately opened after a close. So:
- it seems preferable to wait before retrying on the other server. An
optimisation would be to have a heuristic depending on when the region was
closed.
- during a rolling restart, the server moves the regions then stops. So we
may have failures when the server is stopped, and this patch won't help.
The implementation in the first patch does:
- on the region move, there is an added parameter on the regionserver#close
to say where we are sending the region
- the regionserver keeps a list of what was moved. Each entry is kept 100
seconds.
- the regionserver sends a specific exception when it receives a query on a
moved region. This exception contains the new address.
- the client analyses the exceptions and updates its cache accordingly...
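
The server-side bookkeeping in the list above can be sketched like this. It is not the code from 5877.v1.patch: the class names, the exception type, and the explicit clock parameter are assumptions made for illustration. Only the 100-second retention comes from the description.

```java
import java.io.IOException;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical exception carrying the new location; not the class from the patch. */
final class RegionMovedSketchException extends IOException {
  final String newServer;

  RegionMovedSketchException(String region, String newServer) {
    super("Region " + region + " moved to " + newServer);
    this.newServer = newServer;
  }
}

/** Hypothetical tracker of regions this server has moved away. */
final class MovedRegionTracker {
  static final long TTL_MILLIS = 100_000L; // each entry is kept 100 seconds

  private static final class Entry {
    final String destination;
    final long movedAtMillis;

    Entry(String destination, long movedAtMillis) {
      this.destination = destination;
      this.movedAtMillis = movedAtMillis;
    }
  }

  private final Map<String, Entry> moved = new ConcurrentHashMap<>();

  /** Called on close, when the destination server is known. */
  void recordMove(String encodedRegionName, String destination, long nowMillis) {
    moved.put(encodedRegionName, new Entry(destination, nowMillis));
  }

  /** Returns the destination if the region moved within the retention window. */
  Optional<String> newLocation(String encodedRegionName, long nowMillis) {
    Entry e = moved.get(encodedRegionName);
    if (e == null) {
      return Optional.empty();
    }
    if (nowMillis - e.movedAtMillis >= TTL_MILLIS) {
      moved.remove(encodedRegionName, e); // expired: forget the move
      return Optional.empty();
    }
    return Optional.of(e.destination);
  }

  /** Called when a query arrives for a region that is not online here. */
  void checkNotMoved(String encodedRegionName, long nowMillis)
      throws RegionMovedSketchException {
    Optional<String> dest = newLocation(encodedRegionName, nowMillis);
    if (dest.isPresent()) {
      throw new RegionMovedSketchException(encodedRegionName, dest.get());
    }
  }
}
```

A client that catches such an exception could update its region location cache from newServer instead of asking the master. The clock is passed in explicitly only to keep the sketch easy to exercise.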

[jira] [Updated] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

[
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Zhihong Yu updated HBASE-5877:
--

Comment: was deleted

(was: @N:
I think the following validation in real cluster would illustrate the benefit
of this feature:
For given table, select a region server and note the row key ranges hosted by
the region server. Direct client load to this server.
Kill the server at time T.

Difference in client response to region migration around time T with and
without the patch would be interesting.)

When a query fails because the region has moved, let the regionserver return
the new address to the client
--

Attachments: 5877.v1.patch

[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

[
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262714#comment-13262714
]

Zhihong Yu commented on HBASE-5877:
---

@N:
I think the following validation in a real cluster would illustrate the benefit
of this feature:
For a given table, select a region server and note the row key range hosted by
one region on that region server. Direct client load to this region.
Issue the following command in shell:
{code}
hbase> move 'ENCODED_REGIONNAME', 'SERVER_NAME'
{code}
at time T.

Difference in client response to region migration around time T with and
without the patch would be interesting.

When a query fails because the region has moved, let the regionserver return
the new address to the client
--

Attachments: 5877.v1.patch

[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

[
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262731#comment-13262731
]

Zhihong Yu commented on HBASE-5877:
---

@N:
If the testing result is favorable, I think Lars may want it in 0.94 as well.
I think making this feature functional in 0.94 cluster would be a good start.

bq. We could have this by adding the info in zk
A separate discussion should be started w.r.t. the above. This would shift load
imposed by clients from master to zk quorum.

When a query fails because the region has moved, let the regionserver return
the new address to the client
--

Attachments: 5877.v1.patch

[jira] [Commented] (HBASE-5620) Convert the client protocol of HRegionInterface to PB


[ 
https://issues.apache.org/jira/browse/HBASE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262742#comment-13262742
 ] 

Zhihong Yu commented on HBASE-5620:
---

Is there a plan to adopt measures to counter this 8% performance dip?

 Convert the client protocol of HRegionInterface to PB
 -

 Key: HBASE-5620
 URL: https://issues.apache.org/jira/browse/HBASE-5620
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5620-sec.patch, hbase-5620_v3.patch, 
 hbase-5620_v4.patch, hbase-5620_v4.patch





[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.


[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262748#comment-13262748
 ] 

Zhihong Yu commented on HBASE-5862:
---

@Stack:
Do you have suggestions on further improvement for the latest patch ?

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-94-3.patch


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.


[jira] [Updated] (HBASE-5862) After Region Close remove the Operation Metrics.

[
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Zhihong Yu updated HBASE-5862:
--

Comment: was deleted

(was: -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12524468/TSD.png
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified
tests.
Please justify why no new tests are needed for this
patch.
Also please list what manual steps were performed to
verify this patch.

-1 patch. The patch command could not apply the patch.

Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1657//console

This message is automatically generated.)

After Region Close remove the Operation Metrics.

Key: HBASE-5862
URL: https://issues.apache.org/jira/browse/HBASE-5862
Project: HBase
Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
Fix For: 0.94.0, 0.96.0

Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch,
HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-94-3.patch, TSD.png

If a region is closed then Hadoop metrics shouldn't still be reporting about
that region.

[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager


[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262798#comment-13262798
 ] 

Zhihong Yu commented on HBASE-5829:
---

The latest patch is good to go.
Useless statement can be addressed elsewhere.

 Inconsistency between the regions map and the servers map in 
 AssignmentManager
 --

 Key: HBASE-5829
 URL: https://issues.apache.org/jira/browse/HBASE-5829
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1
Reporter: Maryann Xue
 Attachments: HBASE-5829-0.90.patch, HBASE-5829-trunk.patch


 There are occurrences in AM where this.servers is not kept consistent with 
 this.regions. This might cause balancer to offline a region from the RS that 
 already returned NotServingRegionException at a previous offline attempt.
 In AssignmentManager.unassign(HRegionInfo, boolean)
 try {
   // TODO: We should consider making this look more like it does for the
   // region open where we catch all throwables and never abort
   if (serverManager.sendRegionClose(server, state.getRegion(),
       versionOfClosingNode)) {
     LOG.debug("Sent CLOSE to " + server + " for region " +
       region.getRegionNameAsString());
     return;
   }
   // This never happens. Currently regionserver close always return true.
   LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
     region.getRegionNameAsString());
 } catch (NotServingRegionException nsre) {
   LOG.info("Server " + server + " returned " + nsre + " for " +
     region.getRegionNameAsString());
   // Presume that master has stale data.  Presume remote side just split.
   // Presume that the split message when it comes in will fix up the master's
   // in memory cluster state.
 } catch (Throwable t) {
   if (t instanceof RemoteException) {
     t = ((RemoteException)t).unwrapRemoteException();
     if (t instanceof NotServingRegionException) {
       if (checkIfRegionBelongsToDisabling(region)) {
         // Remove from the regionsinTransition map
         LOG.info("While trying to recover the table "
             + region.getTableNameAsString()
             + " to DISABLED state the region " + region
             + " was offlined but the table was in DISABLING state");
         synchronized (this.regionsInTransition) {
           this.regionsInTransition.remove(region.getEncodedName());
         }
         // Remove from the regionsMap
         synchronized (this.regions) {
           this.regions.remove(region);
         }
         deleteClosingOrClosedNode(region);
       }
     }
     // RS is already processing this region, only need to update the timestamp
     if (t instanceof RegionAlreadyInTransitionException) {
       LOG.debug("update " + state + " the timestamp.");
       state.update(state.getState());
     }
   }
 In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, boolean)
   synchronized (this.regions) {
     this.regions.put(plan.getRegionInfo(), plan.getDestination());
   }


[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.


[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262808#comment-13262808
 ] 

Zhihong Yu commented on HBASE-5862:
---

{code}
+//per hfile.  Figuring out which cfs, hfiles, ...
{code}
Should cfs be in expanded form (column families) ?
{code}
+//and on the next tick of the metrics everything that is still relevant will be
+//re-added.
{code}
're-added' -> 'added' or 'added again'

The initialization work in clear() should be moved to the 
RegionServerDynamicMetrics ctor because it is a one-time operation.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-4.patch, 
 HBASE-5862-94-3.patch, TSD.png


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.


[jira] [Commented] (HBASE-5887) Make TestAcidGuarantees usable for system testing.


[ 
https://issues.apache.org/jira/browse/HBASE-5887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263261#comment-13263261
 ] 

Zhihong Yu commented on HBASE-5887:
---

{code}
+int millis = c.getInt("millis", 5000);
+int numWriters = c.getInt("numWriters", 50);
+int numGetters = c.getInt("numGetters", 2);
+int numScanners = c.getInt("numScanners", 2);
+int numUniqueRows = c.getInt("numUniqueRows", 3);
{code}
Can the user specify these config parameters from the command line?
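
One possible way to allow this is to let `-Dname=value` arguments override the defaults. The sketch below is plain Java and deliberately does not use Hadoop's Configuration class; `CliIntOptions` and its methods are hypothetical names introduced only for illustration, and the patch itself may take a different route.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: collects -Dkey=value arguments and reads ints with defaults. */
final class CliIntOptions {
  private final Map<String, String> values = new HashMap<>();

  CliIntOptions(String[] args) {
    for (String arg : args) {
      // Only arguments of the form -Dkey=value are considered; others are ignored.
      if (arg.startsWith("-D")) {
        int eq = arg.indexOf('=');
        if (eq > 2) { // require a non-empty key between "-D" and "="
          values.put(arg.substring(2, eq), arg.substring(eq + 1));
        }
      }
    }
  }

  /** Returns the parsed value for key, or defaultValue if absent or not an integer. */
  int getInt(String key, int defaultValue) {
    String raw = values.get(key);
    if (raw == null) {
      return defaultValue;
    }
    try {
      return Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      return defaultValue;
    }
  }
}
```

With such a helper, a call like `c.getInt("numWriters", 50)` would pick up `-DnumWriters=10` from the command line and fall back to 50 otherwise.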

 Make TestAcidGuarantees usable for system testing.
 --

 Key: HBASE-5887
 URL: https://issues.apache.org/jira/browse/HBASE-5887
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.0, 0.92.1, 0.94.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-5887.patch


 Currently, the TestAcidGuarantees run via main() will always abort with an 
 NPE because it digs into a non-existent HBaseTestingUtility for a flusher 
 thread.  We should tool this up so that it works properly from the command 
 line.  This would be a very useful long running test when used in conjunction 
 with fault injections to verify row ACID properties.


[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size


 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5611:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12524808/HBASE-5611-94.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1663//console

This message is automatically generated.)

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed for entry null
 java.lang.IllegalStateException
     at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
     at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it is still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.


[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size


[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263327#comment-13263327
 ] 

Zhihong Yu commented on HBASE-5611:
---

{code}
+   * Roll back the global MemStore size when a region can't open.
{code}
The above is not accurate: we're only rolling back the replayed edits size of the 
specified region from the global MemStore size.



[jira] [Commented] (HBASE-5874) The HBase do not configure the 'fs.default.name' attribute, the hbck tool and Merge tool throw IllegalArgumentException.


[ 
https://issues.apache.org/jira/browse/HBASE-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263335#comment-13263335
 ] 

Zhihong Yu commented on HBASE-5874:
---

+1 on patch.

 The HBase do not configure the 'fs.default.name' attribute, the hbck tool and 
 Merge tool throw IllegalArgumentException.
 

 Key: HBASE-5874
 URL: https://issues.apache.org/jira/browse/HBASE-5874
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 0.90.6
Reporter: fulin wang
 Attachments: HBASE-5874-0.90.patch, HBASE-5874-trunk.patch


 When HBase does not configure the 'fs.default.name' attribute, the hbck tool and 
 Merge tool throw IllegalArgumentException.
 For the hbck tool and Merge tool, we should add the 'fs.default.name' attribute in 
 the code.
 hbck exception:
 Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://160.176.0.101:9000/hbase/.META./1028785192/.regioninfo, expected: file:///
   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:412)
   at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:59)
   at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:382)
   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:285)
   at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:128)
   at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:301)
   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:489)
   at org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegioninfo(HBaseFsck.java:565)
   at org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegionInfos(HBaseFsck.java:596)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:332)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:360)
   at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2907)
 
 Merge exception:  
 [2012-05-05 10:48:24,830] [ERROR] [main] [org.apache.hadoop.hbase.util.Merge 381] exiting due to error
 java.lang.IllegalArgumentException: Wrong FS: hdfs://160.176.0.101:9000/hbase/.META./1028785192/.regioninfo, expected: file:///
   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:412)
   at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:59)
   at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:382)
   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:285)
   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:823)
   at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:415)
   at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
   at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2679)
   at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2665)
   at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2634)
   at org.apache.hadoop.hbase.util.MetaUtils.openMetaRegion(MetaUtils.java:276)
   at org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:261)
   at org.apache.hadoop.hbase.util.Merge.run(Merge.java:115)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
   at org.apache.hadoop.hbase.util.Merge.main(Merge.java:379)


[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort


[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261648#comment-13261648
 ] 

Zhihong Yu commented on HBASE-5848:
---

Looks like the addendum wasn't applied to trunk.

 Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
 abort
 

 Key: HBASE-5848
 URL: https://issues.apache.org/jira/browse/HBASE-5848
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Lars Hofhansl
Assignee: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.94.0, 0.96.0

 Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 
 5848-addendum-v4.txt, 5848-addendum-v5.txt, 5848-addendum-v6.txt, 
 5848-addendum-v7.txt, 5848-addendum-v7.txt, HBASE-5848.patch, 
 HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch


 A coworker of mine just had this scenario. It does not make sense to pass 
 EMPTY_START_ROW as a splitKey (since the region with the empty start key is 
 implicit), but it should not cause the HMaster to abort.
 The abort happens because it tries to bulk assign the same region twice and 
 then runs into race conditions with ZK.
 The same would (presumably) happen when two identical split keys are passed, 
 but the client blocks that. The simplest solution here is to have the client 
 also reject a null or EMPTY_START_ROW split key.
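
 A client-side check along these lines could be sketched as follows. This is an illustration in plain Java, not the attached patch: `SplitKeyCheck.validate` is a hypothetical name, and the sketch assumes that EMPTY_START_ROW is a zero-length byte array.

```java
import java.nio.ByteBuffer;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical client-side validation of split keys before a create-table request. */
final class SplitKeyCheck {
  /** Throws IllegalArgumentException if any split key is null, empty, or duplicated. */
  static void validate(byte[][] splitKeys) {
    if (splitKeys == null) {
      return; // no pre-splitting requested
    }
    // ByteBuffer compares by content, so it can be used to detect duplicate keys.
    Set<ByteBuffer> seen = new HashSet<>();
    for (byte[] key : splitKeys) {
      if (key == null || key.length == 0) {
        // An empty key stands in for EMPTY_START_ROW (assumption, see lead-in).
        throw new IllegalArgumentException("Split key must not be null or empty");
      }
      if (!seen.add(ByteBuffer.wrap(key))) {
        throw new IllegalArgumentException("Duplicate split key");
      }
    }
  }
}
```

 Rejecting the request on the client means the master never sees two assignments for the same region in this case.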


[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager


[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261655#comment-13261655
 ] 

Zhihong Yu commented on HBASE-5829:
---

Patch makes sense.
w.r.t. this.servers, I found a useless statement (at least in trunk):
{code}
  void unassignCatalogRegions() {
    this.servers.entrySet();
{code}
that should be removed.

 Inconsistency between the regions map and the servers map in 
 AssignmentManager
 --

 Key: HBASE-5829
 URL: https://issues.apache.org/jira/browse/HBASE-5829
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1
Reporter: Maryann Xue
 Attachments: HBASE-5829-0.90.patch, HBASE-5829-trunk.patch


 There are occurrences in AM where this.servers is not kept consistent with 
 this.regions. This might cause balancer to offline a region from the RS that 
 already returned NotServingRegionException at a previous offline attempt.
 In AssignmentManager.unassign(HRegionInfo, boolean)
 try {
   // TODO: We should consider making this look more like it does for the
   // region open where we catch all throwables and never abort
   if (serverManager.sendRegionClose(server, state.getRegion(),
       versionOfClosingNode)) {
     LOG.debug("Sent CLOSE to " + server + " for region " +
       region.getRegionNameAsString());
     return;
   }
   // This never happens. Currently regionserver close always return true.
   LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
     region.getRegionNameAsString());
 } catch (NotServingRegionException nsre) {
   LOG.info("Server " + server + " returned " + nsre + " for " +
     region.getRegionNameAsString());
   // Presume that master has stale data.  Presume remote side just split.
   // Presume that the split message when it comes in will fix up the master's
   // in memory cluster state.
 } catch (Throwable t) {
   if (t instanceof RemoteException) {
     t = ((RemoteException)t).unwrapRemoteException();
     if (t instanceof NotServingRegionException) {
       if (checkIfRegionBelongsToDisabling(region)) {
         // Remove from the regionsinTransition map
         LOG.info("While trying to recover the table "
           + region.getTableNameAsString()
           + " to DISABLED state the region " + region
           + " was offlined but the table was in DISABLING state");
         synchronized (this.regionsInTransition) {
           this.regionsInTransition.remove(region.getEncodedName());
         }
         // Remove from the regionsMap
         synchronized (this.regions) {
           this.regions.remove(region);
         }
         deleteClosingOrClosedNode(region);
       }
     }
     // RS is already processing this region, only need to update the timestamp
     if (t instanceof RegionAlreadyInTransitionException) {
       LOG.debug("update " + state + " the timestamp.");
       state.update(state.getState());
     }
   }
 In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, 
 boolean)
   synchronized (this.regions) {
 this.regions.put(plan.getRegionInfo(), plan.getDestination());
   }
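
 One way to keep the two maps from drifting apart is to update both inside a single method, under one lock. The following is a minimal sketch in plain Java, not the AssignmentManager code or the attached patch: `RegionAssignments` and its methods are hypothetical, and regions and servers are represented as plain strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical holder that updates the region->server and server->regions maps together. */
final class RegionAssignments {
  private final Map<String, String> regions = new HashMap<>();      // region -> server
  private final Map<String, Set<String>> servers = new HashMap<>(); // server -> regions

  /** Records that region is hosted on server, dropping any stale entry first. */
  synchronized void assign(String region, String server) {
    remove(region);
    regions.put(region, server);
    servers.computeIfAbsent(server, s -> new TreeSet<>()).add(region);
  }

  /** Removes region from both maps so they cannot disagree. */
  synchronized void remove(String region) {
    String server = regions.remove(region);
    if (server != null) {
      Set<String> hosted = servers.get(server);
      if (hosted != null) {
        hosted.remove(region);
      }
    }
  }

  synchronized String serverOf(String region) {
    return regions.get(region);
  }

  /** Returns a copy of the regions currently recorded for server. */
  synchronized Set<String> regionsOn(String server) {
    Set<String> copy = new TreeSet<>();
    Set<String> hosted = servers.get(server);
    if (hosted != null) {
      copy.addAll(hosted);
    }
    return copy;
  }
}
```

 With both updates behind one method, a region removed after a NotServingRegionException can no longer remain listed under its old server.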


[jira] [Commented] (HBASE-5732) Remove the SecureRPCEngine and merge the security-related logic in the core engine


[ 
https://issues.apache.org/jira/browse/HBASE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261738#comment-13261738
 ] 

Zhihong Yu commented on HBASE-5732:
---

Since AccessController and TokenProvider coprocessors remain after this merge, 
my point was that we need to keep the security profile for running the unit tests 
related to these coprocessors.

 Remove the SecureRPCEngine and merge the security-related logic in the core 
 engine
 --

 Key: HBASE-5732
 URL: https://issues.apache.org/jira/browse/HBASE-5732
 Project: HBase
  Issue Type: Improvement
Reporter: Devaraj Das
Assignee: Devaraj Das
 Attachments: rpcengine-merge.3.patch, rpcengine-merge.patch


 Remove the SecureRPCEngine and merge the security-related logic in the core 
 engine. Follow up to HBASE-5727.


[jira] [Assigned] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


 [ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu reassigned HBASE-5870:
-

Assignee: Zhihong Yu

 Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
 is not found
 -

 Key: HBASE-5870
 URL: https://issues.apache.org/jira/browse/HBASE-5870
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Zhihong Yu
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5870.txt


 After HBASE-5861 on 0.94 we are left with this issue on trunk.
 {code}
 $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
 ...
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure
 [ERROR] 
 /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] -> [Help 1]
 {code}


[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261747#comment-13261747
 ] 

Zhihong Yu commented on HBASE-5870:
---

I ran the patch against 0.23 profile.
I got one test failure:
{code}
testSimpleCase(org.apache.hadoop.hbase.mapreduce.TestImportExport)  Time elapsed: 2.583 sec  <<< ERROR!
java.io.FileNotFoundException: File does not exist: /Users/zhihyu/.m2/repository/org/apache/hadoop/hadoop-common/0.23.2-SNAPSHOT/hadoop-common-0.23.2-SNAPSHOT.jar
{code}
But the jar was there:
{code}
-rw-r--r--  1 zhihyu  110088321  1768854 Apr 24 11:23 /Users/zhihyu/.m2/repository/org/apache/hadoop/hadoop-common/0.23.2-SNAPSHOT/hadoop-common-0.23.2-SNAPSHOT.jar
{code}



[jira] [Commented] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.


[ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261763#comment-13261763
 ] 

Zhihong Yu commented on HBASE-5873:
---

+1 if tests pass.

 TimeOut Monitor thread should be started after atleast one region server 
 registers.
 ---

 Key: HBASE-5873
 URL: https://issues.apache.org/jira/browse/HBASE-5873
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: ramkrishna.s.vasudevan
Assignee: rajeshbabu
Priority: Minor
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5873-trunk.txt, HBASE-5873.patch


 Currently timeout monitor thread is started even before the region server has 
 registered with the master.
 In timeout monitor we depend on the region server to be online 
 {code}
 boolean allRSsOffline = this.serverManager.getOnlineServersList().
 isEmpty();
 {code}
 Now when the master starts up it sees there are no online servers and hence 
 sets 
 allRSsOffline to true.
 {code}
 setAllRegionServersOffline(allRSsOffline);
 {code}
 So this.allRegionServersOffline is also true.
 By this time an RS has come up.
 When the timeout monitor runs again in the next cycle (after 10 secs), it sees 
 allRSsOffline as false.
 Hence 
 {code}
 else if (this.allRegionServersOffline && !allRSsOffline) {
 // if some RSs just came back online, we can start the
 // the assignment right away
 actOnTimeOut(regionState);
 {code}
 This condition makes it take action based on the timeout.
 Because of this, even if an assignment of ROOT is already in progress, this piece 
 of code triggers another assignment and thus we get RegionAlreadyInTransitionException. 
 Later we need to wait 30 mins for ROOT itself to be assigned.
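
 The flag transition described above can be reproduced with a small stand-alone model. This is a simplified sketch, not the TimeoutMonitor code: `TimeoutMonitorSketch` and `chore(int)` are hypothetical names, and only the "servers just came back online" branch is modelled.

```java
/** Hypothetical model of the monitor's "all region servers offline" flag across cycles. */
final class TimeoutMonitorSketch {
  // Value remembered from the previous cycle; false until the first cycle has run.
  private boolean allRegionServersOffline;

  /**
   * Runs one monitor cycle.
   * Returns true if this cycle would act on timeouts right away because
   * region servers appear to have just come back online.
   */
  boolean chore(int onlineServers) {
    boolean allRSsOffline = onlineServers == 0;
    boolean act = this.allRegionServersOffline && !allRSsOffline;
    this.allRegionServersOffline = allRSsOffline;
    return act;
  }
}
```

 If the first cycle runs with zero online servers, the second cycle (with one server online) acts immediately; if the first cycle only runs after a server has registered, the branch is never taken during startup.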


[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261773#comment-13261773
 ] 

Zhihong Yu commented on HBASE-5870:
---

The failure is consistent:
{code}
testSimpleCase(org.apache.hadoop.hbase.mapreduce.TestImportExport)  Time elapsed: 2.552 sec  <<< ERROR!
java.io.FileNotFoundException: File does not exist: /Users/zhihyu/.m2/repository/org/apache/hadoop/hadoop-common/0.23.2-SNAPSHOT/hadoop-common-0.23.2-SNAPSHOT.jar
  at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:729)
  at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
  at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
  at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:246)
  at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:284)
  at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:355)
  at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1221)
  at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
  at org.apache.hadoop.mapreduce.Job.submit(Job.java:1218)
  at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1239)
  at org.apache.hadoop.hbase.mapreduce.TestImportExport.testSimpleCase(TestImportExport.java:114)
{code}



[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size


[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261799#comment-13261799
 ] 

Zhihong Yu commented on HBASE-5611:
---

{code}
+  // global memstore size once a region opening failed.
{code}
'region opening failed' -> 'region failed opening'.
{code}
+  private final ConcurrentMap<HRegionInfo, AtomicLong> replayEditsPerRegion = 
{code}
Do we need HRegionInfo as the key to the Map ? Can we use region name ?
For rollbackRegionReplayEditsSize():
{code}
+  addAndGetGlobalMemstoreSize(-replayEdistsSize.get());
+  clearRegionReplayEditsSize(hri);
{code}
I suggest remembering the value of -replayEdistsSize.get() in a variable so 
that we can exchange the order of the two statements above and return directly 
from the if block.
If replayEdistsSize is null, would that indicate a race condition?
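
The reordering suggested above could look roughly like the following. This is a minimal sketch in plain Java, not the patch itself: the class `ReplayEditsAccounting`, its method names, and the use of a String region name as the map key are assumptions made for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical accounting of replayed-edits size per region against a global counter. */
final class ReplayEditsAccounting {
  private final AtomicLong globalMemstoreSize = new AtomicLong();
  // Keyed by region name (a String here) instead of HRegionInfo; this is an assumption.
  private final ConcurrentMap<String, AtomicLong> replayEditsPerRegion =
      new ConcurrentHashMap<>();

  /** Adds size to both the per-region counter and the global counter. */
  void addReplayEdits(String regionName, long size) {
    replayEditsPerRegion.computeIfAbsent(regionName, k -> new AtomicLong()).addAndGet(size);
    globalMemstoreSize.addAndGet(size);
  }

  /**
   * Clears the region's entry first, remembers its value in a local variable,
   * then subtracts that value from the global size. Returns the amount rolled back.
   */
  long rollbackReplayEdits(String regionName) {
    AtomicLong replayed = replayEditsPerRegion.remove(regionName);
    if (replayed == null) {
      return 0L; // nothing recorded for this region
    }
    long size = replayed.get();
    globalMemstoreSize.addAndGet(-size);
    return size;
  }

  long getGlobalMemstoreSize() {
    return globalMemstoreSize.get();
  }
}
```

Because `remove` hands back the entry atomically, a second rollback for the same region finds nothing and leaves the global size untouched.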

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed for entry null
 java.lang.IllegalStateException
     at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
     at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it is still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.


[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


 [ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5870:
--

Attachment: 5870-v2.txt

Patch v2 fills in obtainJobConf() for MapreduceV2Shim.

getJobTrackerConf() creates a new JobConf, so setting a config param on the 
returned JobConf has no effect.
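
The effect can be illustrated with a generic copy-on-get holder. This is only an analogy in plain Java: `CopyOnGetSketch` is a hypothetical stand-in and does not use or reproduce the MiniMRCluster or JobConf API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder whose getter returns a fresh copy of its settings on every call. */
final class CopyOnGetSketch {
  private final Map<String, String> conf = new HashMap<>();

  /** Returns a new map each time, so changes to the result never reach the holder. */
  Map<String, String> getConfCopy() {
    return new HashMap<>(conf);
  }

  /** Changes the holder's own settings. */
  void set(String key, String value) {
    conf.put(key, value);
  }

  String get(String key) {
    return conf.get(key);
  }
}
```

A caller that writes `holder.getConfCopy().put("k", "v")` changes only the throwaway copy; the value has to be set on the object that actually owns the configuration.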

 Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
 is not found
 -

 Key: HBASE-5870
 URL: https://issues.apache.org/jira/browse/HBASE-5870
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Zhihong Yu
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5870-v2.txt, 5870.txt


 After HBASE-5861 on 0.94 we are left with this issue on trunk.
 {code}
 $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
 ...
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure
 [ERROR] 
 /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] - [Help 1]
 {code}


[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261845#comment-13261845
 ] 

Zhihong Yu commented on HBASE-5870:
---

Even in build #136 TestImportExport failed, due to a different exception:
https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK-on-Hadoop-23/136/testReport/org.apache.hadoop.hbase.mapreduce/TestImportExport/org_apache_hadoop_hbase_mapreduce_TestImportExport/

I suggest checking in patch v2 and investigating TestImportExport in a 
separate JIRA.

 Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
 is not found
 -

 Key: HBASE-5870
 URL: https://issues.apache.org/jira/browse/HBASE-5870
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Zhihong Yu
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5870-v2.txt, 5870.txt


 After HBASE-5861 on 0.94 we are left with this issue on trunk.
 {code}
 $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
 ...
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure
 [ERROR] 
 /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] - [Help 1]
 {code}


[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261934#comment-13261934
 ] 

Zhihong Yu commented on HBASE-5870:
---

The two failed tests passed locally.

 Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
 is not found
 -

 Key: HBASE-5870
 URL: https://issues.apache.org/jira/browse/HBASE-5870
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Zhihong Yu
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5870-v2.txt, 5870.txt


 After HBASE-5861 on 0.94 we are left with this issue on trunk.
 {code}
 $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
 ...
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure
 [ERROR] 
 /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] - [Help 1]
 {code}


[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


 [ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5870:
--

Attachment: 5870-v2.txt

 Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
 is not found
 -

 Key: HBASE-5870
 URL: https://issues.apache.org/jira/browse/HBASE-5870
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Zhihong Yu
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5870-v2.txt, 5870.txt


 After HBASE-5861 on 0.94 we are left with this issue on trunk.
 {code}
 $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
 ...
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure
 [ERROR] 
 /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] - [Help 1]
 {code}


[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


 [ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5870:
--

Attachment: (was: 5870-v2.txt)

 Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
 is not found
 -

 Key: HBASE-5870
 URL: https://issues.apache.org/jira/browse/HBASE-5870
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Zhihong Yu
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5870-v2.txt, 5870.txt


 After HBASE-5861 on 0.94 we are left with this issue on trunk.
 {code}
 $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
 ...
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure
 [ERROR] 
 /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] - [Help 1]
 {code}


[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.


[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261948#comment-13261948
 ] 

Zhihong Yu commented on HBASE-5862:
---

+1 on patch v3.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.


[jira] [Updated] (HBASE-5862) After Region Close remove the Operation Metrics.


 [ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5862:
--

Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.


[jira] [Created] (HBASE-5876) TestImportExport has been failing against hadoop 0.23 profile

Zhihong Yu created HBASE-5876:
-

 Summary: TestImportExport has been failing against hadoop 0.23 
profile
 Key: HBASE-5876
 URL: https://issues.apache.org/jira/browse/HBASE-5876
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Yu


TestImportExport has been failing against hadoop 0.23 profile


[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261998#comment-13261998
 ] 

Zhihong Yu commented on HBASE-5870:
---

Will integrate later this afternoon if there is no objection.

 Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
 is not found
 -

 Key: HBASE-5870
 URL: https://issues.apache.org/jira/browse/HBASE-5870
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Zhihong Yu
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5870-v2.txt, 5870.txt


 After HBASE-5861 on 0.94 we are left with this issue on trunk.
 {code}
 $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
 ...
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure
 [ERROR] 
 /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] - [Help 1]
 {code}


[jira] [Commented] (HBASE-5876) TestImportExport has been failing against hadoop 0.23 profile


[ 
https://issues.apache.org/jira/browse/HBASE-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262024#comment-13262024
 ] 

Zhihong Yu commented on HBASE-5876:
---

I face a different issue on a MacBook. See:
https://issues.apache.org/jira/browse/HBASE-5870?focusedCommentId=13261773&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13261773

 TestImportExport has been failing against hadoop 0.23 profile
 -

 Key: HBASE-5876
 URL: https://issues.apache.org/jira/browse/HBASE-5876
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Yu
Assignee: Uma Maheswara Rao G

 TestImportExport has been failing against hadoop 0.23 profile


[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262104#comment-13262104
 ] 

Zhihong Yu commented on HBASE-5870:
---

I ran the test suite and TestSplitTransactionOnCluster passed:
{code}
Running org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 50.737 sec
{code}
I ran it again standalone and it passed.

 Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
 is not found
 -

 Key: HBASE-5870
 URL: https://issues.apache.org/jira/browse/HBASE-5870
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Zhihong Yu
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5870-v2.txt, 5870.txt


 After HBASE-5861 on 0.94 we are left with this issue on trunk.
 {code}
 $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
 ...
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure
 [ERROR] 
 /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] - [Help 1]
 {code}


[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262118#comment-13262118
 ] 

Zhihong Yu commented on HBASE-5870:
---

Integrated to trunk.

Thanks for the review, Lars and Jon.

 Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
 is not found
 -

 Key: HBASE-5870
 URL: https://issues.apache.org/jira/browse/HBASE-5870
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Zhihong Yu
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5870-v2.txt, 5870.txt


 After HBASE-5861 on 0.94 we are left with this issue on trunk.
 {code}
 $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
 ...
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure
 [ERROR] 
 /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] - [Help 1]
 {code}


[jira] [Updated] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client


 [ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5877:
--

Fix Version/s: 0.96.0
  Summary: When a query fails because the region has moved, let the 
regionserver return the new address to the client  (was: When a query fails 
because the region has moved, let the regionserver returns the new address to 
the client)

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have a heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exceptions and updates its cache accordingly...
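
 The "list of what was moved" with a fixed retention time could be sketched roughly as below. This is an illustration only: MovedRegionsTable, recordMove() and lookup() are hypothetical names, the destination is a plain string, and the clock is passed in explicitly so the expiry is easy to follow; the actual patch may be organised quite differently.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of a "recently moved regions" table with a time-to-live.
class MovedRegionsTable {
  private static final class Entry {
    final String destination;
    final long movedAtMillis;

    Entry(String destination, long movedAtMillis) {
      this.destination = destination;
      this.movedAtMillis = movedAtMillis;
    }
  }

  private final ConcurrentMap<String, Entry> moved = new ConcurrentHashMap<String, Entry>();
  private final long ttlMillis;

  MovedRegionsTable(long ttlMillis) {
    this.ttlMillis = ttlMillis;
  }

  /** Remembers where a region was sent when it was closed for a move. */
  void recordMove(String encodedRegionName, String destination, long nowMillis) {
    moved.put(encodedRegionName, new Entry(destination, nowMillis));
  }

  /** Returns the destination, or null if the region is unknown or the entry has expired. */
  String lookup(String encodedRegionName, long nowMillis) {
    Entry e = moved.get(encodedRegionName);
    if (e == null) {
      return null;
    }
    if (nowMillis - e.movedAtMillis > ttlMillis) {
      moved.remove(encodedRegionName, e); // drop the stale entry lazily
      return null;
    }
    return e.destination;
  }
}
```

 A server holding such a table could answer a query on a moved region with an exception carrying the value returned by lookup(), and fall back to its usual "not serving" response once the entry has expired.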


[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client


[ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262199#comment-13262199
 ] 

Zhihong Yu commented on HBASE-5877:
---

For RegionMovedException.java:
{code}
+String tmpHostname = "nohostname";
{code}
I think the above could potentially be a host name :-)
{code}
+} catch (Exception ignored) {
+  LOG.warn("Can't parse the hostname and the port from this string: " + s + ", " +
+"Continuing");
+}
{code}
Can we mark the failure and make this RegionMovedException behave the same as 
NotServingRegionException ?
For updateCachedLocations(), please put the explanation for each parameter on 
the same line as the parameter:
{code}
+ * @param row  - row and tableName can be null id hrl is not null.
{code}
{code}
+LOG.warn("Failed all from " + loc, e);
{code}
'Failed all' -> 'Failed call'
{code}
+  if (resp == null) {
+// Entire server failed
+LOG.fatal("Failed all for server: " + loc.getHostnamePort() +
+  ", removing from cache");
+continue;
+  }
{code}
How is the server removed from cache since I see 'continue' above ?
{code}
+  } else {
+if (numRetries == 1)
+  LOG.fatal("step 4 got result " + regionResult.getFirst() + " " 
 + regionResult.getSecond());
{code}
Why is the above fatal (regionResult != null) ? Step 4 appears in a comment 
below the above code. Should the above say step 3 ?

Please increase the VERSION of HRegionInterface.
{code}
+   * @param destServerName: server name on which the server will be moved
{code}
'which the server' -> 'which the region'

For ServerManager.sendRegionClose(), please add javadoc for destServerName 
param.
For HRegionServer.java:
{code}
+LOG.info("Closing region "+region.getRegionName()+", moving to "
+sn.getServerName() );
{code}
Is it possible that destServerName is null ?
{code}
+  private ServerName getMovedRegion(String encodedRegionName) {
+LOG.fatal("Called getMovedRegion for "+encodedRegionName+" "+ 
movedRegions.size()+" "+movedRegions);
{code}
Please change the above to debug log.
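
On the hostname-parsing comments at the top of this review, one option would be to report a parse failure explicitly instead of keeping a placeholder host name. The following is only a sketch of that idea; HostPort and parse() are hypothetical names and are not taken from the patch.

```java
// Hypothetical helper; names are illustrative only.
class HostPort {
  final String host;
  final int port;

  HostPort(String host, int port) {
    this.host = host;
    this.port = port;
  }

  /** Parses "host:port"; returns null on failure rather than a placeholder host name. */
  static HostPort parse(String s) {
    if (s == null) {
      return null;
    }
    int idx = s.lastIndexOf(':');
    if (idx <= 0 || idx == s.length() - 1) {
      return null; // no separator, empty host, or empty port
    }
    try {
      int port = Integer.parseInt(s.substring(idx + 1));
      if (port < 0 || port > 65535) {
        return null;
      }
      return new HostPort(s.substring(0, idx), port);
    } catch (NumberFormatException e) {
      return null;
    }
  }
}
```

A caller that gets null back can then treat the exception like a plain NotServingRegionException and skip the cache update.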


 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have a heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exceptions and updates its cache accordingly...


[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


 [ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5870:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
 is not found
 -

 Key: HBASE-5870
 URL: https://issues.apache.org/jira/browse/HBASE-5870
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Zhihong Yu
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5870-v2.txt, 5870.txt


 After HBASE-5861 on 0.94 we are left with this issue on trunk.
 {code}
 $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
 ...
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure
 [ERROR] 
 /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] - [Help 1]
 {code}


[jira] [Commented] (HBASE-5880) TestImportExport fails on trunk when build/running against hadoop 23.


[ 
https://issues.apache.org/jira/browse/HBASE-5880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262250#comment-13262250
 ] 

Zhihong Yu commented on HBASE-5880:
---

I logged HBASE-5876 for this issue already.

 TestImportExport fails on trunk when build/running against hadoop 23.
 -

 Key: HBASE-5880
 URL: https://issues.apache.org/jira/browse/HBASE-5880
 Project: HBase
  Issue Type: Bug
  Components: mapreduce, test
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh

 After fixing trunk against hadoop 23 compilation problems with HBASE-5870 and 
 HBASE-5861, we have one remaining problem -- TestImportExport consistently 
 fails unit test run.


[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262266#comment-13262266
 ] 

Zhihong Yu commented on HBASE-5864:
---

Latest patch looks good. Minor comments:
{code}
+  // after reading the root index the check sum bytes has to
{code}
'check sum bytes has to' -> 'checksum bytes have to'
{code}
+  // be subracted to know if the mid key exists.
{code}
'subracted' -> 'subtracted'


 Error while reading from hfile in 0.94
 --

 Key: HBASE-5864
 URL: https://issues.apache.org/jira/browse/HBASE-5864
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
 HBASE-5864_3.patch, HBASE-5864_test.patch


 Got the following stacktrace during region split.
 {noformat}
 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
 Failed getting store size for value
 java.io.IOException: Requested block is out of range: 2906737606134037404, 
 lastDataBlockOffset: 84764558
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
   at 
 org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
   at 
 org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
 {noformat}


[jira] [Commented] (HBASE-5860) splitlogmanager should not unnecessarily resubmit tasks when zk unavailable


[ 
https://issues.apache.org/jira/browse/HBASE-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262318#comment-13262318
 ] 

Zhihong Yu commented on HBASE-5860:
---

Patch makes sense.
{code}
+  static boolean isAnyCreateZNodePending() {
{code}
This method can be made private, right ?
Would isAnyZNodeCreationPending be a better name ?
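
The check discussed above could be backed by a simple counter of in-flight create requests. The following is a minimal sketch under that assumption; PendingCreateTracker, createSubmitted(), createCompleted() and shouldResubmit() are hypothetical names and do not come from the patch.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the SplitLogManager implementation.
class PendingCreateTracker {
  private final AtomicInteger pendingCreates = new AtomicInteger();

  /** Called when an asynchronous znode create is issued. */
  void createSubmitted() {
    pendingCreates.incrementAndGet();
  }

  /** Called from the create callback, whether it succeeded or finally failed. */
  void createCompleted() {
    pendingCreates.decrementAndGet();
  }

  boolean isAnyZNodeCreationPending() {
    return pendingCreates.get() > 0;
  }

  /** Resubmit unassigned tasks only when no create is still in flight. */
  boolean shouldResubmit(int unassignedTasks) {
    return unassignedTasks > 0 && !isAnyZNodeCreationPending();
  }
}
```

With a guard like this, a timeout that fires while creates are still outstanding would not trigger another round of resubmissions.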

 splitlogmanager should not unnecessarily resubmit tasks when zk unavailable
 ---

 Key: HBASE-5860
 URL: https://issues.apache.org/jira/browse/HBASE-5860
 Project: HBase
  Issue Type: Improvement
Reporter: Prakash Khemani
Assignee: Prakash Khemani
 Attachments: 
 0001-HBASE-5860-splitlogmanager-should-not-unnecessarily-.patch


 (Doesn't really impact the run time or correctness of log splitting)
 Say the master has lost connection to zk. splitlogmanager's timeoutmanager 
 will realize that all the tasks that were submitted are still unassigned. It 
 will resubmit those tasks (i.e. create dummy znodes).
 splitlogmanager should realize that the tasks are unassigned but their znodes 
 have not been created.
 2012-04-20 13:11:20,516 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 dead splitlog worker msgstore295.snc4.facebook.com,60020,1334948757026
 2012-04-20 13:11:20,517 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
 Scheduling batch of logs to split
 2012-04-20 13:11:20,517 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 started splitting logs in 
 [hdfs://msgstore215.snc4.facebook.com:9000/MSGSTORE215-SNC4-HBASE/.logs/msgstore295.snc4.facebook.com,60020,1334948757026-splitting]
 2012-04-20 13:11:20,565 INFO org.apache.zookeeper.ClientCnxn: Opening socket 
 connection to server msgstore235.snc4.facebook.com/10.30.222.186:2181
 2012-04-20 13:11:20,566 INFO org.apache.zookeeper.ClientCnxn: Socket 
 connection established to msgstore235.snc4.facebook.com/10.30.222.186:2181, 
 initiating session
 2012-04-20 13:11:20,575 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 total tasks = 4 unassigned = 4
 2012-04-20 13:11:20,576 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
 resubmitting unassigned task(s) after timeout
 2012-04-20 13:11:21,577 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
 resubmitting unassigned task(s) after timeout
 2012-04-20 13:11:21,683 INFO org.apache.zookeeper.ClientCnxn: Unable to read 
 additional data from server sessionid 0x36ccb0f8010002, likely server has 
 closed socket, closing socket connection and attempting reconnect
 2012-04-20 13:11:21,683 INFO org.apache.zookeeper.ClientCnxn: Unable to read 
 additional data from server sessionid 0x136ccb0f489, likely server has 
 closed socket, closing socket connection and attempting reconnect
 2012-04-20 13:11:21,786 WARN 
 org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback: create rc 
 =CONNECTIONLOSS for 
 /hbase/splitlog/hdfs%3A%2F%2Fmsgstore215.snc4.facebook.com%3A9000%2FMSGSTORE215-SNC4-HBASE%2F.logs%2Fmsgstore295.snc4.facebook.com%2C60020%2C1334948757026-splitting%2F10.30.251.186%253A60020.1334951586677
  retry=3
 2012-04-20 13:11:21,786 WARN 
 org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback: create rc 
 =CONNECTIONLOSS for 
 /hbase/splitlog/hdfs%3A%2F%2Fmsgstore215.snc4.facebook.com%3A9000%2FMSGSTORE215-SNC4-HBASE%2F.logs%2Fmsgstore295.snc4.facebook.com%2C60020%2C1334948757026-splitting%2F10.30.251.186%253A60020.1334951920332
  retry=3


[jira] [Commented] (HBASE-4393) Implement a canary monitoring program


[ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260557#comment-13260557
 ] 

Zhihong Yu commented on HBASE-4393:
---

This checkin might be related to:
{code}
[ERROR] Failed to execute goal org.apache.rat:apache-rat-plugin:0.8:check 
(default) on project hbase: Too many unapproved licenses: 1 - [Help 1]
{code}

 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Matteo Bertozzi
 Fix For: 0.94.0, 0.96.0

 Attachments: Canary-v0.java, HBASE-4393-v0.patch, HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.


[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260697#comment-13260697
 ] 

Zhihong Yu commented on HBASE-5864:
---

{code}
-DataInputStream nextBlockAsStream(BlockType blockType) throws IOException;
+HFileBlock nextBlockAsStream(BlockType blockType) throws IOException;
{code}
The method should be named nextBlock() because a stream is no longer returned.
{code}
+ * Read in the root-level index from the given input stream. Must match
{code}
'input stream' is no longer the input. HFileBlock is.
Please add @return to the javadoc.
For TestHFileWriterV2.java:
{code}
-final Compression.Algorithm COMPRESS_ALGO = Compression.Algorithm.GZ;
+final Compression.Algorithm COMPRESS_ALGO = Compression.Algorithm.NONE;
{code}
We should exercise both compression algorithms. Refactoring is needed.
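One way the refactoring could look, as a rough standalone sketch: the body of the test becomes a helper that takes the algorithm as a parameter, and the test loops over every algorithm. The class, enum and method names here are hypothetical, and java.util.zip stands in for the HBase Compression.Algorithm codecs, so this is not the TestHFileWriterV2 code itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch: one write/read helper exercised for every algorithm. */
final class CompressionRoundTripSketch {
  /** Stand-in for Compression.Algorithm (assumption for illustration). */
  enum Algo { NONE, GZ }

  /** Writes data through the codec selected by algo and returns the stored bytes. */
  static byte[] write(Algo algo, byte[] data) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (OutputStream out = algo == Algo.GZ ? new GZIPOutputStream(sink) : sink) {
      out.write(data);
    }
    return sink.toByteArray();
  }

  /** Reads the stored bytes back through the same codec. */
  static byte[] read(Algo algo, byte[] stored) throws IOException {
    InputStream raw = new ByteArrayInputStream(stored);
    try (InputStream in = algo == Algo.GZ ? new GZIPInputStream(raw) : raw) {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      byte[] chunk = new byte[4096];
      int n;
      while ((n = in.read(chunk)) != -1) {
        buf.write(chunk, 0, n);
      }
      return buf.toByteArray();
    }
  }

  /** The parameterized check: true if data survives a write/read cycle. */
  static boolean roundTrips(Algo algo, byte[] data) throws IOException {
    return Arrays.equals(data, read(algo, write(algo, data)));
  }

  public static void main(String[] args) throws IOException {
    byte[] data = "some key/value payload".getBytes("UTF-8");
    for (Algo algo : Algo.values()) {
      System.out.println(algo + " round trip ok: " + roundTrips(algo, data));
    }
  }
}
```

With this shape, adding another algorithm to the test only means adding it to the loop.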

 Error while reading from hfile in 0.94
 --

 Key: HBASE-5864
 URL: https://issues.apache.org/jira/browse/HBASE-5864
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.0

 Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
 HBASE-5864_test.patch


 Got the following stacktrace during region split.
 {noformat}
 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
 Failed getting store size for value
 java.io.IOException: Requested block is out of range: 2906737606134037404, 
 lastDataBlockOffset: 84764558
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
   at 
 org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
   at 
 org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
 {noformat}


[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260709#comment-13260709
 ] 

Zhihong Yu commented on HBASE-5864:
---

Patch v2 passes the new test.
{code}
+  private void writeDataAndReadFromHFile(Path hfilePath,
+  Algorithm COMPRESS_ALGO, int ENTRY_COUNT, boolean findMidKey) throws 
IOException {
{code}
Please don't use all upper case parameter names.

Please refactor the new readRootIndex() to re-use the existing method.


[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260743#comment-13260743
 ] 

Zhihong Yu commented on HBASE-5864:
---

The following computation assumes checksum is on:
{code}
+  int numBytes = (int) ChecksumUtil.numBytes(blk
+  .getOnDiskDataSizeWithHeader(), blk.getBytesPerChecksum());
{code}
If checksums are off, bytesPerChecksum is 0 and we would get a divide-by-zero exception.

I suggest using HFileBlock.totalChecksumBytes() in place of the above.
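A minimal sketch of the kind of guard this is asking for, written as standalone Java rather than the actual HBase code. The class and method names are hypothetical, and the 4-byte checksum size is an assumption for illustration. The only idea taken from the comment above is that zero bytes-per-checksum (checksums off) must short-circuit before any division.

```java
/** Hypothetical helper; not the HBase ChecksumUtil or HFileBlock code. */
final class ChecksumSizeSketch {
  /** Assumed size in bytes of one checksum value (illustrative only). */
  static final int CHECKSUM_SIZE = 4;

  /**
   * Returns how many bytes of checksums cover dataSize bytes of data.
   * When bytesPerChecksum is zero or negative, checksums are treated as
   * disabled and the result is 0 instead of a division by zero.
   */
  static long numChecksumBytes(long dataSize, int bytesPerChecksum) {
    if (bytesPerChecksum <= 0) {
      return 0; // checksums off: nothing to account for
    }
    long chunks = dataSize / bytesPerChecksum;
    if (dataSize % bytesPerChecksum != 0) {
      chunks++; // a partial trailing chunk still needs its own checksum
    }
    return chunks * CHECKSUM_SIZE;
  }

  public static void main(String[] args) {
    System.out.println(numChecksumBytes(16384, 0));
    System.out.println(numChecksumBytes(16385, 16384));
  }
}
```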


[jira] [Commented] (HBASE-5861) Hadoop 23 compile broken due to tests introduced in HBASE-5064


[ 
https://issues.apache.org/jira/browse/HBASE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260785#comment-13260785
 ] 

Zhihong Yu commented on HBASE-5861:
---

I got the following with v3 using 0.23 profile:
{code}
[ERROR] 
/Users/zhihyu/trunk-hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
 cannot find symbol
[ERROR] symbol  : method getJobTracker()
[ERROR] location: class org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
{code}

 Hadoop 23 compile broken due to tests introduced in HBASE-5064 
 ---

 Key: HBASE-5861
 URL: https://issues.apache.org/jira/browse/HBASE-5861
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: 5861.txt, hbase-5861-jon.patch, hbase-5861-v2.patch, 
 hbase-5861-v3.patch


 When attempting to compile HBase 0.94rc1 against hadoop 23, I got this set of 
 compilation error messages:
 {code}
 jon@swoop:~/proj/hbase-0.94$ mvn clean test -Dhadoop.profile=23 -DskipTests
 ...
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 18.926s
 [INFO] Finished at: Mon Apr 23 10:38:47 PDT 2012
 [INFO] Final Memory: 55M/555M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure: Compilation 
 failure:
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[147,46]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[153,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[194,46]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[206,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[213,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[226,29]
  org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be 
 instantiated
 [ERROR] - [Help 1]
 {code}
 Upon further investigation this issue is due to code introduced in HBASE-5064 
 and is also present in trunk.  


[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort


[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260809#comment-13260809
 ] 

Zhihong Yu commented on HBASE-5848:
---

TestFullLogReconstruction#testReconstruction gave the following based on 
addendum:
{code}
2012-04-24 11:35:48,409 WARN  [Thread-189] 
client.HConnectionManager$HConnectionImplementation(1020): Encountered problems 
when prefetch META table:
org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for 
table: tabletest, row=tabletest,aaa,99
  at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:158)
  at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:52)
  at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
  at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:127)
  at 
org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:385)
  at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:127)
  at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:103)
  at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1017)
  at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1071)
  at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:959)
  at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1849)
  at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1733)
  at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1020)
  at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:832)
  at org.apache.hadoop.hbase.client.HTable.put(HTable.java:807)
  at 
org.apache.hadoop.hbase.HBaseTestingUtility.loadTable(HBaseTestingUtility.java:992)
  at 
org.apache.hadoop.hbase.TestFullLogReconstruction.testReconstruction(TestFullLogReconstruction.java:102)
{code}
Will try to come up with new addendum.

 Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
 abort
 

 Key: HBASE-5848
 URL: https://issues.apache.org/jira/browse/HBASE-5848
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Lars Hofhansl
Assignee: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5848.patch, HBASE-5848.patch, 
 HBASE-5848_0.94.patch, HBASE-5848_addendum.patch


 A coworker of mine just ran into this scenario. It does not make sense to pass 
 EMPTY_START_ROW as a splitKey (since the region with the empty start key is 
 implicit), but it should not cause the HMaster to abort.
 The abort happens because the master tries to bulk assign the same region 
 twice and then runs into race conditions with ZK.
 The same would (presumably) happen when two identical split keys are passed, 
 but the client blocks that. The simplest solution here is to have the client 
 also reject null or EMPTY_START_ROW passed as a split key.
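The client-side check described above could be sketched roughly as follows. This is a standalone illustration, not the HBaseAdmin code: the class and method names are hypothetical, and the choice of IllegalArgumentException is an assumption. It rejects null or empty split keys and duplicate split keys before anything is sent to the master.

```java
import java.util.Arrays;

/** Hypothetical sketch of client-side split key validation. */
final class SplitKeyCheckSketch {
  /** Throws IllegalArgumentException for null, empty, or duplicate split keys. */
  static void validateSplitKeys(byte[][] splitKeys) {
    if (splitKeys == null) {
      return; // no pre-splitting requested
    }
    for (byte[] key : splitKeys) {
      if (key == null || key.length == 0) {
        throw new IllegalArgumentException(
            "Null or empty split key: the region with the empty start key is implicit");
      }
    }
    // Sort a shallow copy so the caller's array order is left untouched.
    byte[][] sorted = splitKeys.clone();
    Arrays.sort(sorted, SplitKeyCheckSketch::compare);
    for (int i = 1; i < sorted.length; i++) {
      if (Arrays.equals(sorted[i - 1], sorted[i])) {
        throw new IllegalArgumentException("Duplicate split key");
      }
    }
  }

  /** Unsigned lexicographic comparison of two byte arrays. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }
}
```

A check like this only protects callers that go through the client. It does not stop a caller that talks to the master directly, so the master should still handle such input without aborting.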


[jira] [Updated] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort


 [ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5848:
--

Attachment: 5848-addendum-v2.txt

Addendum v2 passes the following tests:
{code}
  mt -Dtest=TestFullLogReconstruction#testReconstruction
  mt -Dtest=TestRegionRebalancing#testRebalanceOnRegionServerNumberChange
{code}

 Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
 abort
 

 Key: HBASE-5848
 URL: https://issues.apache.org/jira/browse/HBASE-5848
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Lars Hofhansl
Assignee: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.94.0, 0.96.0

 Attachments: 5848-addendum-v2.txt, HBASE-5848.patch, 
 HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch



[jira] [Updated] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort


 [ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5848:
--

Attachment: 5848-addendum-v3.txt

Addendum v2 didn't address the root cause of this issue.

Addendum v3 treats NodeExistsException specially in 
asyncSetOfflineInZooKeeper().

 Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
 abort
 

 Key: HBASE-5848
 URL: https://issues.apache.org/jira/browse/HBASE-5848
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Lars Hofhansl
Assignee: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.94.0, 0.96.0

 Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 
 HBASE-5848.patch, HBASE-5848.patch, HBASE-5848_0.94.patch, 
 HBASE-5848_addendum.patch



[jira] [Commented] (HBASE-5715) Revert 'Instant schema alter' for now, HBASE-4213


[ 
https://issues.apache.org/jira/browse/HBASE-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261026#comment-13261026
 ] 

Zhihong Yu commented on HBASE-5715:
---

@Subbu:
Thanks for following up.

I think this work should be discussed under HBASE-5713. I will provide review 
comments there.

Have you run all unit tests on the instant_schema_alter branch?

 Revert 'Instant schema alter' for now, HBASE-4213
 -

 Key: HBASE-5715
 URL: https://issues.apache.org/jira/browse/HBASE-5715
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
 Fix For: 0.94.0

 Attachments: patch1.patch, revert.txt, revert.v2.txt, revert.v3.txt, 
 revert.v4.txt, revert094.v4.txt


 See this discussion: 
 http://search-hadoop.com/m/NxCQh1KlSxR1/Pull+instant+schema+updating+out%253Fsubj=Pull+instant+schema+updating+out+
 Pull out hbase-4213 for now.  Can add it back later.


[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort


[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261068#comment-13261068
 ] 

Zhihong Yu commented on HBASE-5848:
---

I modified testCreateTableWithEmptyRowInTheSplitKeys using the above pattern 
and master didn't crash (with addendum):
{code}
Index: src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
===
--- src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java (revision 
1330037)
+++ src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java (working copy)
@@ -733,15 +733,13 @@
   @Test
   public void testCreateTableWithEmptyRowInTheSplitKeys() throws IOException{
 byte[] tableName = 
Bytes.toBytes("testCreateTableWithEmptyRowInTheSplitKeys");
-byte[][] splitKeys = new byte[3][];
-splitKeys[0] = region1.getBytes();
-splitKeys[1] = HConstants.EMPTY_BYTE_ARRAY;
-splitKeys[2] = region2.getBytes();
+byte[][] splitKeys = new byte[2][];
+splitKeys[0] = HConstants.EMPTY_BYTE_ARRAY;
+splitKeys[1] = region2.getBytes();
 HTableDescriptor desc = new HTableDescriptor(tableName);
 desc.addFamily(new HColumnDescriptor(col));
 try {
   admin.createTable(desc, splitKeys);
-  fail("Test case should fail as empty split key is passed.");
 } catch (IllegalArgumentException e) {
 }
   }
{code}

 Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
 abort
 

 Key: HBASE-5848
 URL: https://issues.apache.org/jira/browse/HBASE-5848
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Lars Hofhansl
Assignee: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.94.0, 0.96.0

 Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 
 HBASE-5848.patch, HBASE-5848.patch, HBASE-5848_0.94.patch, 
 HBASE-5848_addendum.patch



[jira] [Updated] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort


 [ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5848:
--

Attachment: 5848-addendum-v4.txt

Addendum v4 passes TestAdmin

testCreateTableWithEmptyRowInTheSplitKeys is removed since 
IllegalArgumentException wouldn't be thrown.

 Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
 abort
 

 Key: HBASE-5848
 URL: https://issues.apache.org/jira/browse/HBASE-5848
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Lars Hofhansl
Assignee: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.94.0, 0.96.0

 Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 
 5848-addendum-v4.txt, HBASE-5848.patch, HBASE-5848.patch, 
 HBASE-5848_0.94.patch, HBASE-5848_addendum.patch



[jira] [Updated] (HBASE-5713) Introduce throttling during Instant schema change process to throttle opening/closing regions.


 [ 
https://issues.apache.org/jira/browse/HBASE-5713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5713:
--

Attachment: 5713.txt

Subbu's patch.

 Introduce throttling during Instant schema change process to throttle 
 opening/closing regions. 
 ---

 Key: HBASE-5713
 URL: https://issues.apache.org/jira/browse/HBASE-5713
 Project: HBase
  Issue Type: Bug
  Components: client, master, regionserver, shell
Reporter: Subbu M Iyer
Assignee: Subbu M Iyer
Priority: Minor
 Attachments: 5713.txt


 There is a potential for region open/close stampede during instant schema 
 change process as the process attempts to close/open impacted regions in 
 rapid succession. We need to introduce some kind of throttling to eliminate 
 the race condition.


[jira] [Commented] (HBASE-5713) Introduce throttling during Instant schema change process to throttle opening/closing regions.


[ 
https://issues.apache.org/jira/browse/HBASE-5713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261097#comment-13261097
 ] 

Zhihong Yu commented on HBASE-5713:
---

For SchemaChangeTracker.java:
{code}
+  Throwable exception
+  ) {
{code}
Please move the second line to the end of the first line.

For CompactSplitThread.java:
{code}
+import java.util.concurrent.*;
{code}
Please restore the individual imports from java.util.concurrent
{code}
 while (this.server.getSchemaChangeTracker()
 .isSchemaChangeInProgress(tableName)) {
   try {
-Thread.sleep(100);
+Thread.sleep(500);
{code}
Why is the sleep interval longer ?
{code}
+  <name>hbase.instant.schema.throttle.time</name>
+  <value>500</value>
+  <description>Throttle time in millis while closing/re opening impacted 
regions
{code}
're opening' should be 're-opening'.
Since the user may choose a longer throttle interval, 
'hbase.instant.schema.alter.timeout' should be made longer.



[jira] [Updated] (HBASE-5861) Hadoop 23 compile broken due to tests introduced in HBASE-5064


 [ 
https://issues.apache.org/jira/browse/HBASE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5861:
--

Attachment: 5861-v4.patch

Patch v4 adds support for obtaining JobConf.

TestHLogRecordReader passes using either hadoop 1.0 or 0.23 profile.

 Hadoop 23 compile broken due to tests introduced in HBASE-5064 
 ---

 Key: HBASE-5861
 URL: https://issues.apache.org/jira/browse/HBASE-5861
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: 5861-v4.patch, 5861.txt, hbase-5861-jon.patch, 
 hbase-5861-v2.patch, hbase-5861-v3.patch



[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort


[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261153#comment-13261153
 ] 

Zhihong Yu commented on HBASE-5848:
---

What about the root cause Ram identified: 
https://issues.apache.org/jira/browse/HBASE-5848?focusedCommentId=13259411&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13259411
 ?

A hacker can call master.createTable(desc, splitKeys) directly, bypassing 
HBaseAdmin. Right ?

 Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
 abort
 

 Key: HBASE-5848
 URL: https://issues.apache.org/jira/browse/HBASE-5848
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Lars Hofhansl
Assignee: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.94.0, 0.96.0

 Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 
 5848-addendum-v4.txt, 5848-addendum-v5.txt, HBASE-5848.patch, 
 HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch



[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.


[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261177#comment-13261177
 ] 

Zhihong Yu commented on HBASE-5862:
---

{code}
  @SuppressWarnings("unused")
  private RegionServerDynamicMetrics dynamicMetrics;
{code}
I tried to find out how HRegionServer.dynamicMetrics is used but wasn't able to.
{code}
+//Clear all of the dynamic metrics as they are now probably useless
+this.dynamicMetrics.clear();
{code}
Only encodedName is removed. Why do we clear dynamicMetrics ?
{code}
+  } catch (SecurityException e) {
+LOG.debug("Unable to clear metricsRecord");
{code}
We don't need to stumble over the same exception(s) again and again. Why not 
set a boolean to indicate that reflection shouldn't be used in the future ?
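
A minimal sketch of the boolean-flag idea, under the assumption that the reflective access either keeps working or keeps failing for the life of the process. ReflectiveClearer, clearField and Sample are hypothetical names for illustration; they are not taken from the patch.

```java
import java.lang.reflect.Field;

// Hypothetical illustration; names are not from the HBASE-5862 patch.
class ReflectiveClearer {
    // Set once reflection has failed, so later calls skip it entirely.
    private volatile boolean reflectionDisabled = false;
    private int attempts = 0;

    /** Small stand-in for an object whose private state we want to reset. */
    static class Sample {
        Object payload = new Object();
    }

    /** Tries to null out the named field via reflection; returns true on success. */
    boolean clearField(Object target, String fieldName) {
        if (reflectionDisabled) {
            return false; // a previous failure told us not to try again
        }
        attempts++;
        try {
            Field f = target.getClass().getDeclaredField(fieldName);
            f.setAccessible(true);
            f.set(target, null);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Covers NoSuchFieldException, IllegalAccessException, SecurityException, etc.
            reflectionDisabled = true;
            return false;
        }
    }

    /** Number of times reflection was actually attempted. */
    int attempts() {
        return attempts;
    }
}
```

After the first failure, later calls return immediately, so the same exception is neither raised nor logged repeatedly.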
{code}
+if (this.recordMetricMapField != null || this.registryMetricMapField != null) {
+  try {
{code}
Please separate the above two conditions into two if blocks.
{code}
+import com.google.common.collect.Multiset.Entry;
{code}
Is the above import used ?
It would be nice to have a test.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.


[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort


[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261182#comment-13261182
 ] 

Zhihong Yu commented on HBASE-5848:
---

Addendum v6 looks good.
BTW this is the highest numbered addendum I have ever worked with :-)

 Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
 abort
 

 Key: HBASE-5848
 URL: https://issues.apache.org/jira/browse/HBASE-5848
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Lars Hofhansl
Assignee: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.94.0, 0.96.0

 Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 
 5848-addendum-v4.txt, 5848-addendum-v5.txt, 5848-addendum-v6.txt, 
 HBASE-5848.patch, HBASE-5848.patch, HBASE-5848_0.94.patch, 
 HBASE-5848_addendum.patch


 A coworker of mine just ran into this scenario. It does not make sense to pass 
 EMPTY_START_ROW as a splitKey (since the region with the empty start key is 
 implicit), but it should not cause the HMaster to abort.
 The abort happens because the master tries to bulk assign the same region twice 
 and then runs into race conditions with ZK.
 The same would (presumably) happen when two identical split keys are passed, 
 but the client blocks that. The simplest solution here is to have the client 
 also reject null or EMPTY_START_ROW as a split key.


[jira] [Commented] (HBASE-5872) Improve hadoopqa script to include checks for hadoop 0.23 build


[ 
https://issues.apache.org/jira/browse/HBASE-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261210#comment-13261210
 ] 

Zhihong Yu commented on HBASE-5872:
---

{code}
+  echo "$MVN clean test -DskipTests -D${PROJECT_NAME}PatchProcess > $PATCH_DIR/trunkJavacWarnings.txt 2>&1"
{code}
I am confused by the above: if tests are skipped, why is the test target specified ?

 Improve hadoopqa script to include checks for hadoop 0.23 build
 ---

 Key: HBASE-5872
 URL: https://issues.apache.org/jira/browse/HBASE-5872
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-5872.patch


 There have been a few patches that have made it into hbase trunk that have 
 broken the compile of hbase against hadoop 0.23.x, without being known for a 
 few days.
 We could have the bot do a few things:
 1) verify that patch compiles against hadoop 23
 2) verify that unit tests pass against hadoop 23


[jira] [Commented] (HBASE-5872) Improve hadoopqa script to include checks for hadoop 0.23 build


[ 
https://issues.apache.org/jira/browse/HBASE-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261213#comment-13261213
 ] 

Zhihong Yu commented on HBASE-5872:
---

Actually 
https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK-on-Hadoop-23 is 
built daily.
We just need to observe the compilation error there.

 Improve hadoopqa script to include checks for hadoop 0.23 build
 ---

 Key: HBASE-5872
 URL: https://issues.apache.org/jira/browse/HBASE-5872
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-5872.patch


 There have been a few patches that have made it into hbase trunk that have 
 broken the compile of hbase against hadoop 0.23.x, without being known for a 
 few days.
 We could have the bot do a few things:
 1) verify that patch compiles against hadoop 23
 2) verify that unit tests pass against hadoop 23


[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.


[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261238#comment-13261238
 ] 

Zhihong Yu commented on HBASE-5862:
---

MetricsRecord has become an interface in MRv2.

Please introduce a shim so that the solution works with both Hadoop 1.0 and 2.0.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.


[jira] [Updated] (HBASE-5861) Hadoop 23 compilation broken due to tests introduced in HBASE-5604


 [ 
https://issues.apache.org/jira/browse/HBASE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5861:
--

Summary: Hadoop 23 compilation broken due to tests introduced in HBASE-5604 
 (was: Hadoop 23 compile broken due to tests introduced in HBASE-5064 )

 Hadoop 23 compilation broken due to tests introduced in HBASE-5604
 --

 Key: HBASE-5861
 URL: https://issues.apache.org/jira/browse/HBASE-5861
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: 5861-v4.patch, 5861.txt, hbase-5861-jon.patch, 
 hbase-5861-v2.patch, hbase-5861-v3.patch


 When attempting to compile HBase 0.94rc1 against hadoop 23, I got this set of 
 compilation error messages:
 {code}
 jon@swoop:~/proj/hbase-0.94$ mvn clean test -Dhadoop.profile=23 -DskipTests
 ...
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 18.926s
 [INFO] Finished at: Mon Apr 23 10:38:47 PDT 2012
 [INFO] Final Memory: 55M/555M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure: Compilation 
 failure:
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[147,46]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[153,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[194,46]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[206,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[213,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[226,29]
  org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be 
 instantiated
 [ERROR] - [Help 1]
 {code}
 Upon further investigation, this issue is due to code introduced in HBASE-5604 
 and is also present in trunk.


[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


 [ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5870:
--

Summary: Hadoop 23 compilation broken because 
JobTrackerRunner#getJobTracker() method is not found  (was: Hadoop 23 compile 
broken because can't find HBaseTestingUtility#getJobTracker() method)

 Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
 is not found
 -

 Key: HBASE-5870
 URL: https://issues.apache.org/jira/browse/HBASE-5870
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Priority: Blocker
 Fix For: 0.96.0


 After HBASE-5861 on 0.94 we are left with this issue on trunk.
 {code}
 $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
 ...
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure
 [ERROR] 
 /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] - [Help 1]
 {code}


[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found


[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261254#comment-13261254
 ] 

Zhihong Yu commented on HBASE-5870:
---

https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK-on-Hadoop-23/136/console
 was the last build which didn't show this compilation error.

 Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
 is not found
 -

 Key: HBASE-5870
 URL: https://issues.apache.org/jira/browse/HBASE-5870
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Priority: Blocker
 Fix For: 0.96.0


 After HBASE-5861 on 0.94 we are left with this issue on trunk.
 {code}
 $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
 ...
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure
 [ERROR] 
 /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] - [Help 1]
 {code}


[jira] [Commented] (HBASE-5872) Improve hadoopqa script to include checks for hadoop 0.23 build


[ 
https://issues.apache.org/jira/browse/HBASE-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261291#comment-13261291
 ] 

Zhihong Yu commented on HBASE-5872:
---

Thanks for the explanation, Jon.

I think the patch cannot be tested by Hadoop QA. Otherwise the QA report should 
have contained a compilation error instead of failed tests.
I am fine with checking in the patch after HBASE-5870 is fixed - otherwise all 
Hadoop QA reports would only contain the compilation error :-)

 Improve hadoopqa script to include checks for hadoop 0.23 build
 ---

 Key: HBASE-5872
 URL: https://issues.apache.org/jira/browse/HBASE-5872
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-5872.patch


 There have been a few patches that have made it into hbase trunk that have 
 broken the compile of hbase against hadoop 0.23.x, without being known for a 
 few days.
 We could have the bot do a few things:
 1) verify that patch compiles against hadoop 23
 2) verify that unit tests pass against hadoop 23


[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available


[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261308#comment-13261308
 ] 

Zhihong Yu commented on HBASE-5849:
---

I ran TestClusterBootOrder in a loop 5 times with patch v4 and didn't see the 
test hang.

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.92.2, 0.94.0

 Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, 
 HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, 
 HBASE-5849_v4.patch


 When launching a fresh cluster, the master has to be started first, which 
 can create a race condition when the master and RS are started at the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master, /hbase/master
 Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the znode /hbase/master
 So the problem is that on the very first launch of the cluster, the RS aborts 
 on startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase is not deleted on cluster shutdown, 
 on subsequent cluster starts it does not matter in which order the servers are 
 started. So this affects only first launches.
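
One way to avoid the abort is for the region server to wait for the root znode instead of shutting down when it is missing. The following is only a sketch of that idea, not the approach taken in the attached patches: ZnodeWaitSketch and waitFor are hypothetical names, and the ZooKeeper existence check is abstracted behind a BooleanSupplier so the example stays self-contained.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical illustration; the real check would query ZooKeeper for /hbase.
class ZnodeWaitSketch {
    /**
     * Polls until exists reports true or the timeout elapses.
     * Returns true if the znode showed up, false on timeout or interruption.
     */
    static boolean waitFor(BooleanSupplier exists, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            if (exists.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up only after the full timeout
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] polls = {0};
        // Pretend the master creates the znode just before the third poll.
        boolean found = waitFor(() -> ++polls[0] >= 3, 1000, 10);
        System.out.println("root znode found: " + found);
    }
}
```

With such a wait in place, the start order of master and region server would no longer matter on first launch, at the cost of a bounded startup delay.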


[jira] [Commented] (HBASE-5732) Remove the SecureRPCEngine and merge the security-related logic in the core engine


[ 
https://issues.apache.org/jira/browse/HBASE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261318#comment-13261318
 ] 

Zhihong Yu commented on HBASE-5732:
---

Now that the security profile is gone in patch v2, would the build intrinsically 
be secure HBase ?

 Remove the SecureRPCEngine and merge the security-related logic in the core 
 engine
 --

 Key: HBASE-5732
 URL: https://issues.apache.org/jira/browse/HBASE-5732
 Project: HBase
  Issue Type: Improvement
Reporter: Devaraj Das
Assignee: Devaraj Das
 Attachments: rpcengine-merge.3.patch, rpcengine-merge.patch


 Remove the SecureRPCEngine and merge the security-related logic in the core 
 engine. Follow up to HBASE-5727.


[jira] [Resolved] (HBASE-5787) Table owner can't disable/delete its own table


 [ 
https://issues.apache.org/jira/browse/HBASE-5787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu resolved HBASE-5787.
---

Resolution: Fixed

 Table owner can't disable/delete its own table
 --

 Key: HBASE-5787
 URL: https://issues.apache.org/jira/browse/HBASE-5787
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5787-tests-wrong-names.patch, HBASE-5787-v0.patch, 
 HBASE-5787-v1.patch


 A user with CREATE privileges can create a table, but cannot disable it, 
 because the disable operation requires ADMIN privileges. Also, if a table is 
 already disabled, anyone can remove it.
 {code}
 public void preDeleteTable(ObserverContext<MasterCoprocessorEnvironment> c,
     byte[] tableName) throws IOException {
   requirePermission(Permission.Action.CREATE);
 }
 public void preDisableTable(ObserverContext<MasterCoprocessorEnvironment> c,
     byte[] tableName) throws IOException {
   /* TODO: Allow for users with global CREATE permission and the table owner */
   requirePermission(Permission.Action.ADMIN);
 }
 {code}
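
The rule the TODO hints at could be sketched as below. This is an assumption-laden illustration, not the logic of the attached patches: TableAccessSketch, Action and mayDisableOrDelete are hypothetical names, and the sketch only covers "global ADMIN or table owner", leaving open how global CREATE should be treated.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical illustration; not the HBase AccessController API.
class TableAccessSketch {
    enum Action { CREATE, ADMIN }

    /** Allows the operation for global admins and for the owner of the table. */
    static boolean mayDisableOrDelete(String user, String tableOwner,
                                      Set<Action> globalPermissions) {
        if (globalPermissions.contains(Action.ADMIN)) {
            return true; // admins may act on any table
        }
        // Otherwise only the table owner may disable or delete the table.
        return user != null && user.equals(tableOwner);
    }

    public static void main(String[] args) {
        Set<Action> createOnly = EnumSet.of(Action.CREATE);
        System.out.println(mayDisableOrDelete("alice", "alice", createOnly));
        System.out.println(mayDisableOrDelete("bob", "alice", createOnly));
    }
}
```

Applying the same check to both the disable and the delete hooks would also close the gap where anyone can remove an already disabled table.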


[jira] [Updated] (HBASE-5787) Table owner can't disable/delete his/her own table


 [ 
https://issues.apache.org/jira/browse/HBASE-5787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5787:
--

Summary: Table owner can't disable/delete his/her own table  (was: Table 
owner can't disable/delete its own table)

 Table owner can't disable/delete his/her own table
 --

 Key: HBASE-5787
 URL: https://issues.apache.org/jira/browse/HBASE-5787
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.92.1, 0.94.0, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
  Labels: acl, security
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5787-tests-wrong-names.patch, HBASE-5787-v0.patch, 
 HBASE-5787-v1.patch


 A user with CREATE privileges can create a table, but cannot disable it, 
 because the disable operation requires ADMIN privileges. Also, if a table is 
 already disabled, anyone can remove it.
 {code}
 public void preDeleteTable(ObserverContext<MasterCoprocessorEnvironment> c,
     byte[] tableName) throws IOException {
   requirePermission(Permission.Action.CREATE);
 }
 public void preDisableTable(ObserverContext<MasterCoprocessorEnvironment> c,
     byte[] tableName) throws IOException {
   /* TODO: Allow for users with global CREATE permission and the table owner */
   requirePermission(Permission.Action.ADMIN);
 }
 {code}


[jira] [Commented] (HBASE-5621) Convert admin protocol of HRegionInterface to PB


[ 
https://issues.apache.org/jira/browse/HBASE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259709#comment-13259709
 ] 

Zhihong Yu commented on HBASE-5621:
---

Can you list the tests that failed on your machine ?

For the security profile, TestProcessBasedCluster.testProcessBasedCluster fails 
consistently and is tracked by HBASE-5851.
Any failure other than that test deserves a careful look.

 Convert admin protocol of HRegionInterface to PB
 

 Key: HBASE-5621
 URL: https://issues.apache.org/jira/browse/HBASE-5621
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5621_v3.patch, hbase_5621_v4.patch, 
 hbase_5621_v4.patch, hbase_5621_v5.patch





[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort


[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259740#comment-13259740
 ] 

Zhihong Yu commented on HBASE-5848:
---

According to 
http://docs.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29:
"This method can only be used to measure elapsed time and is not related to any 
other notion of system or wall-clock time."

So we shouldn't use nanoTime().
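
Assuming the value in question is meant as a timestamp rather than a duration, the distinction can be sketched as follows. ClockSketch and its methods are hypothetical names for illustration; only System.nanoTime() and System.currentTimeMillis() are real JDK calls.

```java
// Hypothetical illustration of when each JDK clock is appropriate.
class ClockSketch {
    /** Elapsed time of a task in nanoseconds; only differences of nanoTime() are meaningful. */
    static long elapsedNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    /** Wall-clock timestamp in milliseconds since the epoch, comparable across processes. */
    static long wallClockMillis() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        long took = elapsedNanos(() -> Math.sqrt(42.0));
        System.out.println("task took " + took + " ns");
        System.out.println("timestamp: " + wallClockMillis());
    }
}
```

A raw nanoTime() value has an arbitrary origin, so it should not be stored or compared as if it were a point in time.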

 Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
 abort
 

 Key: HBASE-5848
 URL: https://issues.apache.org/jira/browse/HBASE-5848
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor

 A coworker of mine just ran into this scenario. It does not make sense to pass 
 EMPTY_START_ROW as a splitKey (since the region with the empty start key is 
 implicit), but it should not cause the HMaster to abort.
 The abort happens because the master tries to bulk assign the same region twice 
 and then runs into race conditions with ZK.
 The same would (presumably) happen when two identical split keys are passed, 
 but the client blocks that. The simplest solution here is to have the client 
 also reject null or EMPTY_START_ROW as a split key.


[jira] [Updated] (HBASE-5826) Improve sync of HLog edits


 [ 
https://issues.apache.org/jira/browse/HBASE-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5826:
--

Attachment: 5826.txt

Todd's patch, for trunk.

 Improve sync of HLog edits
 --

 Key: HBASE-5826
 URL: https://issues.apache.org/jira/browse/HBASE-5826
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu
 Attachments: 5826.txt


 HBASE-5782 solved the correctness issue for the sync of HLog edits.
 Todd provided a patch that would achieve higher throughput.
 This JIRA is a continuation of Todd's work submitted there.


[jira] [Updated] (HBASE-5826) Improve sync of HLog edits


 [ 
https://issues.apache.org/jira/browse/HBASE-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5826:
--

Status: Patch Available  (was: Open)

 Improve sync of HLog edits
 --

 Key: HBASE-5826
 URL: https://issues.apache.org/jira/browse/HBASE-5826
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu
 Attachments: 5826.txt


 HBASE-5782 solved the correctness issue for the sync of HLog edits.
 Todd provided a patch that would achieve higher throughput.
 This JIRA is a continuation of Todd's work submitted there.


[jira] [Commented] (HBASE-5851) TestProcessBasedCluster sometimes fails


[ 
https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259810#comment-13259810
 ] 

Zhihong Yu commented on HBASE-5851:
---

The test sometimes fails in the trunk build as well:
https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2793/

 TestProcessBasedCluster sometimes fails
 ---

 Key: HBASE-5851
 URL: https://issues.apache.org/jira/browse/HBASE-5851
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
Assignee: Jimmy Xiang

 TestProcessBasedCluster failed in 
 https://builds.apache.org/job/HBase-TRUNK-security/178
 Looks like cluster failed to start:
 {code}
 2012-04-21 14:22:32,666 INFO  [Thread-1] util.ProcessBasedLocalHBaseCluster(176): Waiting for HBase to startup. Retries left: 2
 java.io.IOException: Giving up trying to location region in meta: thread is interrupted.
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1173)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:956)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:917)
   at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:252)
   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:192)
   at org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:174)
   at org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
 java.lang.InterruptedException: sleep interrupted
   at java.lang.Thread.sleep(Native Method)
   at org.apache.hadoop.hbase.util.Threads.sleep(Threads.java:134)
   at org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:178)
   at org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
 {code}


[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort


[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259829#comment-13259829
 ] 

Zhihong Yu commented on HBASE-5848:
---

+1 on patch.

 Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
 abort
 

 Key: HBASE-5848
 URL: https://issues.apache.org/jira/browse/HBASE-5848
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Attachments: HBASE-5848.patch


 A coworker of mine just ran into this scenario. It does not make sense to pass 
 EMPTY_START_ROW as a splitKey (since the region with the empty start key is 
 implicit), but it should not cause the HMaster to abort.
 The abort happens because the master tries to bulk assign the same region twice 
 and then runs into race conditions with ZK.
 The same would (presumably) happen when two identical split keys are passed, 
 but the client blocks that. The simplest solution here is to have the client 
 also reject null or EMPTY_START_ROW as a split key.


[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash


[ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259914#comment-13259914
 ] 

Zhihong Yu commented on HBASE-5844:
---

{code}
+  "(Environment variable HBASE_ZNODE_FILE is no set).");
{code}
'is no set' -> 'is not set'

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 5844.v1.patch, 5844.v2.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper, 
 so the recovery process starts only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and recovery 
 starts immediately.


[jira] [Updated] (HBASE-5699) Run with > 1 WAL in HRegionServer


 [ 
https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5699:
--

Comment: was deleted

(was: This seems interesting. I'll take a look at doing this.)

 Run with > 1 WAL in HRegionServer
 -

 Key: HBASE-5699
 URL: https://issues.apache.org/jira/browse/HBASE-5699
 Project: HBase
  Issue Type: Improvement
Reporter: binlijin
Assignee: Li Pi




[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer


[ 
https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260008#comment-13260008
 ] 

Zhihong Yu commented on HBASE-5699:
---

It was a duplicate message.

 Run with > 1 WAL in HRegionServer
 -

 Key: HBASE-5699
 URL: https://issues.apache.org/jira/browse/HBASE-5699
 Project: HBase
  Issue Type: Improvement
Reporter: binlijin
Assignee: Li Pi




[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.