[jira] [Updated] (HBASE-7568) [replication] Create an interface for replication queues

2013-03-18 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated HBASE-7568:


Attachment: (was: HBASE-7568.trunkv1)

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7568) [replication] Create an interface for replication queues

2013-03-18 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated HBASE-7568:


Attachment: HBASE-7568-trunk-v1.patch

Renaming patch file so hadoopqa will pick it up.

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0

 Attachments: HBASE-7568-trunk-v1.patch






[jira] [Updated] (HBASE-7803) Look into REST API performance

2013-03-18 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7803:
---

Attachment: trunk-7803.patch

 Look into REST API performance
 --

 Key: HBASE-7803
 URL: https://issues.apache.org/jira/browse/HBASE-7803
 Project: HBase
  Issue Type: Task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-7803.patch


 I have a YCSB client using the REST API.  My testing shows the performance 
 for scan with REST API is much worse than that with the java client API.  We 
 need to look into it and find out the root cause, either the test issue, or 
 our REST API issue.



[jira] [Updated] (HBASE-7803) Look into REST API performance

2013-03-18 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7803:
---

Status: Patch Available  (was: Open)

 Look into REST API performance
 --

 Key: HBASE-7803
 URL: https://issues.apache.org/jira/browse/HBASE-7803
 Project: HBase
  Issue Type: Task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-7803.patch


 I have a YCSB client using the REST API.  My testing shows the performance 
 for scan with REST API is much worse than that with the java client API.  We 
 need to look into it and find out the root cause, either the test issue, or 
 our REST API issue.



[jira] [Commented] (HBASE-7803) Look into REST API performance

2013-03-18 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605520#comment-13605520
 ] 

Jimmy Xiang commented on HBASE-7803:


Attached a patch to make REST support caching.

 Look into REST API performance
 --

 Key: HBASE-7803
 URL: https://issues.apache.org/jira/browse/HBASE-7803
 Project: HBase
  Issue Type: Task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-7803.patch


 I have a YCSB client using the REST API.  My testing shows the performance 
 for scan with REST API is much worse than that with the java client API.  We 
 need to look into it and find out the root cause, either the test issue, or 
 our REST API issue.



[jira] [Updated] (HBASE-8097) MetaServerShutdownHandler may potentially keep bumping up DeadServer.numProcessing

2013-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8097:
--

Fix Version/s: (was: 0.96.0)
   0.98.0
   0.95.0
 Hadoop Flags: Reviewed

Integrated to 0.95 and trunk.

Thanks for the patch, Jeffrey.

Thanks for the reviews, Jimmy, Nicolas and Chunhui.

 MetaServerShutdownHandler may potentially keep bumping up 
 DeadServer.numProcessing
 --

 Key: HBASE-8097
 URL: https://issues.apache.org/jira/browse/HBASE-8097
 Project: HBase
  Issue Type: Bug
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.95.0, 0.98.0

 Attachments: 8097.txt, hbase-8097_1.patch, hbase-8097_v2.patch, 
 hbase-8097_v3.patch


 {code}
 } catch (IOException ioe) {
   this.services.getExecutorService().submit(this);
   this.deadServers.add(serverName);
   throw new IOException("failed log splitting for " +
   serverName + ", will retry", ioe);
 }
 {code}
 Each retry calls this.deadServers.add(serverName) again, which keeps 
 incrementing DeadServer.numProcessing.
 We can't get rid of numProcessing by just checking deadServers.size() because 
 deadServers is also used to report some historically failed RSs.
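
 The counter drift described above can be illustrated with a minimal sketch. 
 DeadServerSketch is a hypothetical stand-in written for this note, not the 
 real DeadServer class; it only assumes that add() increments numProcessing 
 and that finish() decrements it once when handling completes.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for DeadServer; names and behaviour are assumptions. */
final class DeadServerSketch {
    private final Set<String> deadServers = new HashSet<>();
    private int numProcessing = 0;

    /** Every call bumps the counter, even if the server is already known. */
    void add(String serverName) {
        numProcessing++;
        deadServers.add(serverName);
    }

    /** Called once when shutdown handling for a server finally succeeds. */
    void finish(String serverName) {
        numProcessing--;
    }

    int getNumProcessing() {
        return numProcessing;
    }

    /**
     * Simulates one initial registration, a number of failed log-splitting
     * attempts that each re-add the server, and one successful completion.
     */
    static int simulate(int failedAttempts) {
        DeadServerSketch ds = new DeadServerSketch();
        ds.add("rs1");                      // server first reported dead
        for (int i = 0; i < failedAttempts; i++) {
            ds.add("rs1");                  // the catch block re-adds on retry
        }
        ds.finish("rs1");                   // handling completes exactly once
        return ds.getNumProcessing();       // leftover count that never drains
    }
}
```

 In this sketch the leftover count equals the number of failed attempts, so 
 any check of the form "numProcessing != 0" would stay true after a retry.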



[jira] [Commented] (HBASE-8128) HTable#put improvements

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605527#comment-13605527
 ] 

nkeywal commented on HBASE-8128:


Committed in 0.94

 HTable#put improvements
 ---

 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0

 Attachments: 8128.v1.patch


 3 points:
  - When doing a single put, we're creating an object by calling Arrays.asList
  - we're doing a size check every 10 puts. Not doing it seems simpler and 
 better, and allows sharing some code between a single put and a list of puts.
  - we could call flushCommits on empty write buffer, especially for someone 
 using a lot of HTable instead of using a pool, as it's called in close().



[jira] [Commented] (HBASE-7803) Look into REST API performance

2013-03-18 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605528#comment-13605528
 ] 

Jimmy Xiang commented on HBASE-7803:


I did some testing on my 4-node cluster with YCSB, and here is the scan 
throughput I got with the REST API:

With caching, and using batch:  8.83
With caching, but no batch: 0.99
No caching, but using batch: 1.85
No caching, no batch: 0.68

On the same cluster, using the HBase Java client API, the throughput I got is 
29.04.
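
To make the gaps between these figures easier to compare, here is a small sketch that computes the ratios. The class and constant names are made up for illustration; the numbers are the ones reported above, and no unit is assumed because none was given.

```java
/** Ratios between the reported scan throughput figures (names are illustrative). */
final class RestScanThroughput {
    static final double CACHING_BATCH = 8.83;
    static final double CACHING_NO_BATCH = 0.99;
    static final double NO_CACHING_BATCH = 1.85;
    static final double NO_CACHING_NO_BATCH = 0.68;
    static final double JAVA_CLIENT = 29.04;

    /** How many times larger the first figure is than the second. */
    static double ratio(double numerator, double denominator) {
        return numerator / denominator;
    }

    public static void main(String[] args) {
        System.out.printf("batch only vs. baseline:    %.2fx%n",
            ratio(NO_CACHING_BATCH, NO_CACHING_NO_BATCH));
        System.out.printf("caching only vs. baseline:  %.2fx%n",
            ratio(CACHING_NO_BATCH, NO_CACHING_NO_BATCH));
        System.out.printf("both vs. baseline:          %.2fx%n",
            ratio(CACHING_BATCH, NO_CACHING_NO_BATCH));
        System.out.printf("Java client vs. best REST:  %.2fx%n",
            ratio(JAVA_CLIENT, CACHING_BATCH));
    }
}
```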



 Look into REST API performance
 --

 Key: HBASE-7803
 URL: https://issues.apache.org/jira/browse/HBASE-7803
 Project: HBase
  Issue Type: Task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-7803.patch


 I have a YCSB client using the REST API.  My testing shows the performance 
 for scan with REST API is much worse than that with the java client API.  We 
 need to look into it and find out the root cause, either the test issue, or 
 our REST API issue.



[jira] [Updated] (HBASE-8128) HTable#put improvements

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8128:
---

Fix Version/s: 0.94.8

 HTable#put improvements
 ---

 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0, 0.94.8

 Attachments: 8128.v1.patch


 3 points:
  - When doing a single put, we're creating an object by calling Arrays.asList
  - we're doing a size check every 10 puts. Not doing it seems simpler and 
 better, and allows sharing some code between a single put and a list of puts.
  - we could call flushCommits on empty write buffer, especially for someone 
 using a lot of HTable instead of using a pool, as it's called in close().



[jira] [Commented] (HBASE-7803) Look into REST API performance

2013-03-18 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605535#comment-13605535
 ] 

Jimmy Xiang commented on HBASE-7803:


Batch means fewer REST HTTP trips. Caching means fewer trips to region servers.  
Based on the results, it seems both kinds of round trips are performance 
killers, and HTTP overhead has more impact.
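
As a rough way to reason about this, the sketch below counts round trips under a deliberately simplified model. It assumes that "batch" is the number of rows returned per HTTP request and "caching" is the number of rows fetched per region server call; the class and method names are hypothetical and this is not how the REST gateway computes anything.

```java
/** Simplified round-trip model; the per-trip row counts are assumptions. */
final class ScanTripModel {
    /** Number of trips needed to move 'total' rows at 'perTrip' rows each. */
    static long ceilDiv(long total, long perTrip) {
        if (perTrip <= 0) {
            throw new IllegalArgumentException("perTrip must be positive");
        }
        return (total + perTrip - 1) / perTrip;
    }

    /** HTTP requests between the client and the REST gateway. */
    static long httpTrips(long rows, long batch) {
        return ceilDiv(rows, batch);
    }

    /** Calls between the REST gateway and the region servers. */
    static long regionServerTrips(long rows, long caching) {
        return ceilDiv(rows, caching);
    }
}
```

Under this model, scanning 1000 rows with batch 1 costs 1000 HTTP requests, while batch 100 costs 10; the same arithmetic applies to caching on the region server side.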

 Look into REST API performance
 --

 Key: HBASE-7803
 URL: https://issues.apache.org/jira/browse/HBASE-7803
 Project: HBase
  Issue Type: Task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-7803.patch


 I have a YCSB client using the REST API.  My testing shows the performance 
 for scan with REST API is much worse than that with the java client API.  We 
 need to look into it and find out the root cause, either the test issue, or 
 our REST API issue.



[jira] [Reopened] (HBASE-7597) testRegionShouldNotBeDeployed seems to be flaky

2013-03-18 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reopened HBASE-7597:


  Assignee: Jimmy Xiang

It failed again:  https://builds.apache.org/job/HBase-0.95/82/
Let me reopen it and take a look if I can do something about it.

 testRegionShouldNotBeDeployed seems to be flaky
 ---

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang

 I ran the entire test suite many times and it always failed on, at least, 
 testRegionShouldNotBeDeployed.
 Results below. I will attach more results when the current tests are done.
 Failed tests:
   testDeleteExpiredStoreFiles(org.apache.hadoop.hbase.regionserver.TestStore): expected:<2> but was:<4>
   testAcquireTaskAtStartup(org.apache.hadoop.hbase.regionserver.TestSplitLogWorker): Waiting timed out after [1 000] msec
   testRegionShouldNotBeDeployed(org.apache.hadoop.hbase.util.TestHBaseFsck): expected:<[SHOULD_NOT_BE_DEPLOYED]> but was:<[]>
   testPermissionsWatcher(org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher)



[jira] [Updated] (HBASE-8097) MetaServerShutdownHandler may potentially keep bumping up DeadServer.numProcessing

2013-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8097:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 MetaServerShutdownHandler may potentially keep bumping up 
 DeadServer.numProcessing
 --

 Key: HBASE-8097
 URL: https://issues.apache.org/jira/browse/HBASE-8097
 Project: HBase
  Issue Type: Bug
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.95.0, 0.98.0

 Attachments: 8097.txt, hbase-8097_1.patch, hbase-8097_v2.patch, 
 hbase-8097_v3.patch


 {code}
 } catch (IOException ioe) {
   this.services.getExecutorService().submit(this);
   this.deadServers.add(serverName);
   throw new IOException("failed log splitting for " +
   serverName + ", will retry", ioe);
 }
 {code}
 Each retry calls this.deadServers.add(serverName) again, which keeps 
 incrementing DeadServer.numProcessing.
 We can't get rid of numProcessing by just checking deadServers.size() because 
 deadServers is also used to report some historically failed RSs.



[jira] [Updated] (HBASE-7597) TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky

2013-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7597:
--

Summary: TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky  
(was: testRegionShouldNotBeDeployed seems to be flaky)

 TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky
 -

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang

 I ran the entire test suite many times and it always failed on, at least, 
 testRegionShouldNotBeDeployed.
 Results below. I will attach more results when the current tests are done.
 Failed tests:
   testDeleteExpiredStoreFiles(org.apache.hadoop.hbase.regionserver.TestStore): expected:<2> but was:<4>
   testAcquireTaskAtStartup(org.apache.hadoop.hbase.regionserver.TestSplitLogWorker): Waiting timed out after [1 000] msec
   testRegionShouldNotBeDeployed(org.apache.hadoop.hbase.util.TestHBaseFsck): expected:<[SHOULD_NOT_BE_DEPLOYED]> but was:<[]>
   testPermissionsWatcher(org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher)



[jira] [Commented] (HBASE-7568) [replication] Create an interface for replication queues

2013-03-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605554#comment-13605554
 ] 

Ted Yu commented on HBASE-7568:
---

From https://builds.apache.org/job/PreCommit-HBASE-Build/4871/console, it 
looks like the patch doesn't compile (against hadoop 2.0, at least)

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0

 Attachments: HBASE-7568-trunk-v1.patch






[jira] [Updated] (HBASE-8128) HTable#put improvements

2013-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8128:
--

Fix Version/s: (was: 0.94.8)
   0.94.7

 HTable#put improvements
 ---

 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0, 0.94.7

 Attachments: 8128.v1.patch


 3 points:
  - When doing a single put, we're creating an object by calling Arrays.asList
  - we're doing a size check every 10 puts. Not doing it seems simpler and 
 better, and allows sharing some code between a single put and a list of puts.
  - we could call flushCommits on empty write buffer, especially for someone 
 using a lot of HTable instead of using a pool, as it's called in close().



[jira] [Commented] (HBASE-7679) implement store file management for stripe compactions

2013-03-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605572#comment-13605572
 ] 

Sergey Shelukhin commented on HBASE-7679:
-

bq. ConcatenatedLists should have unit test.
Added.

bq. Should this define be in the Interface or do you think it is 
implementation specific?
bq. + public static final String BLOCKING_STOREFILES_KEY = 
"hbase.hstore.blockingStoreFiles";
Not certain; having things in HStore seems to be the convention. Store isn't a 
real interface that invites different implementations :)

bq. On StripeStoreFileManager, do we know if this approach has merit? Have we 
run models or actual test runs and can see it saves i/o? Would be interesting 
to know. Do we have to commit it to figure this out? I can see committing all 
the refactorings which allow different compaction policies but would think a 
compaction engine would need to have proven merit before it goes in? What you 
think Sergey?
I have an integration test in HBASE-8000, but have only run it for correctness 
so far.
I plan to make a bigger test for perf, and move to commit after having some 
numbers.

bq. Do we have to have a L0? Can we not flush multiple files when we flush, one 
per boundary in the region? Was that thought just too much work flushing?
I was concerned about many small files, and scope creep into memstore, as 
discussed.
Let me do a write-up on this (probably useful anyway), and discuss on dev list.
After integration tests on tiny files (not a target scenario for this, but 
still), I wonder whether the impact of L0 files on the number of files to be 
read for gets is indeed worth it.
On the other hand, for scans and for the overall situation, a large number of 
small files is not good.



 implement store file management for stripe compactions
 --

 Key: HBASE-7679
 URL: https://issues.apache.org/jira/browse/HBASE-7679
 Project: HBase
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7667-and-7603-v0-incomplete.patch, 
 HBASE-7667-and-7603-v0-incomplete.patch, HBASE-7667-and-7603-v1.patch, 
 HBASE-7667-and-7603-v1.patch, HBASE-7667-v1.patch, HBASE-7667-v1.patch, 
 HBASE-7667-v2.patch, HBASE-7667-v2.patch, HBASE-7667-v3.patch, 
 HBASE-7679-v4.patch, HBASE-7679-v5.patch, HBASE-7679-v6.patch, 
 HBASE-7679-v7-.patch, HBASE-7679-v7.patch, HBASE-7679-v8.patch, 
 HBASE-7679-v9.patch






[jira] [Updated] (HBASE-7679) implement store file management for stripe compactions

2013-03-18 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7679:


Attachment: HBASE-7679-v10.patch

updated patch

 implement store file management for stripe compactions
 --

 Key: HBASE-7679
 URL: https://issues.apache.org/jira/browse/HBASE-7679
 Project: HBase
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7667-and-7603-v0-incomplete.patch, 
 HBASE-7667-and-7603-v0-incomplete.patch, HBASE-7667-and-7603-v1.patch, 
 HBASE-7667-and-7603-v1.patch, HBASE-7667-v1.patch, HBASE-7667-v1.patch, 
 HBASE-7667-v2.patch, HBASE-7667-v2.patch, HBASE-7667-v3.patch, 
 HBASE-7679-v10.patch, HBASE-7679-v4.patch, HBASE-7679-v5.patch, 
 HBASE-7679-v6.patch, HBASE-7679-v7-.patch, HBASE-7679-v7.patch, 
 HBASE-7679-v8.patch, HBASE-7679-v9.patch






[jira] [Commented] (HBASE-8127) Region of a disabling or disabled table could be stuck in transition state when RS dies during Master initialization

2013-03-18 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605582#comment-13605582
 ] 

rajeshbabu commented on HBASE-8127:
---

bq. One time when I saw the opening RIT stuck is due to the 
offlineDisabledRegion function in assignment manager. As you can see we don't 
handle opening RIT inside the function.

If I am not wrong, the HBASE-7824 patch was applied at that time, right?

One problem I suspect with the HBASE-7824 patch is:
{code}
+if (preMetaServer != null && failedServers.contains(preMetaServer)) {
+  // create recovered edits file for .META. server
+  this.fileSystemManager.splitLog(preMetaServer);
+  failedServers.remove(preMetaServer);
+}
{code}

If an RS carrying ROOT or META went down, we are not calling SSH for that RS 
(not even adding it to deadservers). We handle regions in transition to the 
dead server through processRegionsInTransitions, which can leave an RIT stuck 
in the OPENING state. If the znode is in the RS_ZK_REGION_OPENING state, we 
just add the region to RIT and wait for the TM to handle it. 
{code}
   regionsInTransition.put(encodedRegionName, new RegionState(regionInfo,
RegionState.State.OPENING, data.getStamp(), data.getOrigin()));
failoverProcessedRegions.put(encodedRegionName, regionInfo);
{code}
Whenever the TM handles it, we will assign the region; in that case the RIT can 
get stuck because the table is seen as DISABLING/DISABLED. If the RS is really 
alive, this case won't happen, because unassign will be called after assignment.

For the HBASE-7824 patch we can make the change below, which avoids an RIT 
getting stuck in the OPENING state.
If the meta RS is down before or during master restart, we can add it to 
deadservers and start SSH with shouldSplitHlog set to false, because the logs 
have already been split.
{code}
this.deadservers.add(serverName);
this.services.getExecutorService().submit(
  new ServerShutdownHandler(this.master, this.services, this.deadservers, 
serverName, false));
{code}

Anyway, the actual problem you gave in the description can be handled on the 
SSH side. I am working on it.

One more thing about your feedback patch:
{code}
+// delete RITs if exists in any state of disabling or disabled tables during master starts
+// up
+if (!hri.isMetaTable()) {
+  String tableName = hri.getTableNameAsString();
+  boolean disabled = this.zkTable.isDisabledTable(tableName);
+  if (disabled || this.zkTable.isDisablingTable(tableName)) {
+ZKAssign.deleteNodeFailSilent(watcher, hri);
+regionOffline(hri);
+continue;
+  }
+}
{code}

We don't know whether a region of a DISABLING table is already closed on the 
RS, so we should not offline the region directly. In SSH we can do that because 
the RS has gone down.


 Region of a disabling or disabled table could be stuck in transition state 
 when RS dies during Master initialization
 

 Key: HBASE-8127
 URL: https://issues.apache.org/jira/browse/HBASE-8127
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.5
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: HBASE-8127_feedback.patch, HBASE-8127.patch, 
 hbase-8127_v1.patch, reproduce-hang.patch


 The issue happens when an RS dies while a master is starting up. If the RS 
 reports open to the new master instance and dies immediately thereafter, the 
 regions of disabling (or disabled) tables on the dead RS will stay in RIT 
 state forever.
 I attached a patch to simulate the situation and you can run the following 
 command to reproduce the issue:
 {code}mvn test -PlocalTests 
 -Dtest=TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS{code}
 Basically, we skip regions of a dead server inside 
 AM.processDeadServersAndRecoverLostRegions, as in the following code, and 
 rely on SSH to process those skipped regions:
 {code}
   for (Pair<HRegionInfo, Result> deadRegion : deadServer.getValue()) {
 nodes.remove(deadRegion.getFirst().getEncodedName());
   }
 {code} 
 Then, in SSH, we skip regions of disabling (or disabled) tables again in the 
 function processDeadRegion. The result is that RITs of disabling (or 
 disabled) tables are stuck there forever.
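
 The two skips described above can be modelled with a small sketch. It is a 
 simplified illustration with hypothetical names (StuckRitSketch, 
 remainingRits), not the AssignmentManager or ServerShutdownHandler code: it 
 only assumes that master initialization leaves every region of a dead server 
 to SSH, and that SSH in turn ignores regions of disabling or disabled tables.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Simplified model of the two skips; all names here are hypothetical. */
final class StuckRitSketch {
    /**
     * @param ritRegionToTable regions in transition, mapped to their table
     * @param regionsOnDeadServer regions that were hosted on the dead RS
     * @param disabledOrDisablingTables tables in DISABLED or DISABLING state
     * @return regions that no code path clears from the transition state
     */
    static Set<String> remainingRits(Map<String, String> ritRegionToTable,
                                     Set<String> regionsOnDeadServer,
                                     Set<String> disabledOrDisablingTables) {
        Set<String> stuck = new TreeSet<>();
        for (Map.Entry<String, String> entry : ritRegionToTable.entrySet()) {
            String region = entry.getKey();
            String table = entry.getValue();
            if (!regionsOnDeadServer.contains(region)) {
                continue; // master initialization handles this RIT itself
            }
            // Skip 1: master initialization leaves this region to SSH.
            // Skip 2: SSH ignores regions of disabling or disabled tables.
            if (disabledOrDisablingTables.contains(table)) {
                stuck.add(region); // neither path clears it
            }
        }
        return stuck;
    }
}
```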
  



[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk (with changes)

2013-03-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605583#comment-13605583
 ] 

Sergey Shelukhin commented on HBASE-7055:
-

Ping?

 port HBASE-6371 tier-based compaction from 0.89-fb to trunk (with changes)
 --

 Key: HBASE-7055
 URL: https://issues.apache.org/jira/browse/HBASE-7055
 Project: HBase
  Issue Type: Task
  Components: Compaction
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.95.0

 Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
 HBASE-6371-v3-refactor-only-squashed.patch, 
 HBASE-6371-v4-refactor-only-squashed.patch, 
 HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, 
 HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, 
 HBASE-7055-v4.patch, HBASE-7055-v5.patch, HBASE-7055-v6.patch, 
 HBASE-7055-v7.patch, HBASE-7055-v7.patch


 See HBASE-6371 for details.



[jira] [Commented] (HBASE-7295) Contention in HBaseClient.getConnection

2013-03-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605586#comment-13605586
 ] 

Ted Yu commented on HBASE-7295:
---

I ran TestRowProcessorEndpoint with trunk patch v4 and it passed.

+1

 Contention in HBaseClient.getConnection
 ---

 Key: HBASE-7295
 URL: https://issues.apache.org/jira/browse/HBASE-7295
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.3
Reporter: Varun Sharma
Assignee: Varun Sharma
 Fix For: 0.95.0

 Attachments: 7295-0.94.txt, 7295-0.94-v2.txt, 7295-0.94-v3.txt, 
 7295-0.94-v4.txt, 7295-0.94-v5.txt, 7295-trunk.txt, 7295-trunk.txt, 
 7295-trunk-v2.txt, 7295-trunk-v3.txt, 7295-trunk-v3.txt, 7295-trunk-v4.txt


 HBaseClient.getConnection() synchronizes on the connections object. We found 
 severe contention on a thrift gateway which was fanning out roughly 3000+ 
 calls per second to hbase region servers. The thrift gateway had 2000+ 
 threads for handling incoming connections. Threads were blocked on the 
 synchronized block; we set ipc.pool.size to 200. Since we are using the 
 RoundRobin/ThreadLocal pool only, it's not necessary to synchronize on 
 connections. It might lead to cases where we go slightly over the 
 ipc.max.pool.size(), but the additional connections would time out after 
 maxIdleTime. The underlying PoolMap connections object is thread safe.
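
 The trade-off described above (no lock on the lookup path, at the cost of 
 occasionally creating a connection that is not needed) can be sketched as 
 follows. This is an illustration only: it swaps in a plain ConcurrentHashMap 
 where the description refers to PoolMap, and ConnectionCacheSketch and its 
 nested Connection type are hypothetical names, not HBaseClient code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Lock-free connection lookup; a stand-in, not the HBaseClient code. */
final class ConnectionCacheSketch {
    /** Placeholder for a real connection object. */
    static final class Connection {
        final String remoteId;

        Connection(String remoteId) {
            this.remoteId = remoteId;
        }
    }

    private final ConcurrentMap<String, Connection> connections =
        new ConcurrentHashMap<>();
    private final AtomicInteger created = new AtomicInteger();

    /** Returns the cached connection, creating one without taking a lock. */
    Connection getConnection(String remoteId) {
        Connection existing = connections.get(remoteId);
        if (existing != null) {
            return existing;                 // common case: no contention
        }
        Connection fresh = new Connection(remoteId);
        created.incrementAndGet();           // may overshoot under a race
        Connection raced = connections.putIfAbsent(remoteId, fresh);
        return raced != null ? raced : fresh; // loser of a race reuses winner
    }

    int createdCount() {
        return created.get();
    }
}
```

 Two threads racing on the same remoteId may both construct a Connection here, 
 but only one is published; the extra object is the kind of small overshoot 
 the description accepts in exchange for removing the synchronized block.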



[jira] [Commented] (HBASE-7992) provide pre/post region offline hooks for HMaster.offlineRegion()

2013-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605590#comment-13605590
 ] 

Hudson commented on HBASE-7992:
---

Integrated in HBase-TRUNK #3969 (See 
[https://builds.apache.org/job/HBase-TRUNK/3969/])
HBASE-7992 provide pre/post region offline hooks for 
HMaster.offlineRegion() (Rajeshbabu) (Revision 1457854)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java


 provide pre/post region offline hooks for HMaster.offlineRegion()
 -

 Key: HBASE-7992
 URL: https://issues.apache.org/jira/browse/HBASE-7992
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors
Affects Versions: 0.95.0
Reporter: rajeshbabu
Assignee: rajeshbabu
 Fix For: 0.98.0

 Attachments: 7992_trunk_3.patch, HBASE-7992_trunk_2.patch, 
 HBASE-7992_trunk.patch


 Presently there are no hooks to provide access control for offlining a region in the master.



[jira] [Commented] (HBASE-7679) implement store file management for stripe compactions

2013-03-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605592#comment-13605592
 ] 

stack commented on HBASE-7679:
--

Agree, let's get numbers before saying L0 is bad.  Ditto, get numbers before 
commit, and yes, a write-up would be helpful.  Smarter compaction could make 
for big wins all around, Sergey.  Thanks for persisting.

 implement store file management for stripe compactions
 --

 Key: HBASE-7679
 URL: https://issues.apache.org/jira/browse/HBASE-7679
 Project: HBase
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7667-and-7603-v0-incomplete.patch, 
 HBASE-7667-and-7603-v0-incomplete.patch, HBASE-7667-and-7603-v1.patch, 
 HBASE-7667-and-7603-v1.patch, HBASE-7667-v1.patch, HBASE-7667-v1.patch, 
 HBASE-7667-v2.patch, HBASE-7667-v2.patch, HBASE-7667-v3.patch, 
 HBASE-7679-v10.patch, HBASE-7679-v4.patch, HBASE-7679-v5.patch, 
 HBASE-7679-v6.patch, HBASE-7679-v7-.patch, HBASE-7679-v7.patch, 
 HBASE-7679-v8.patch, HBASE-7679-v9.patch






[jira] [Commented] (HBASE-7481) throw IOExceptions from Filter methods?

2013-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605593#comment-13605593
 ] 

Hadoop QA commented on HBASE-7481:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12574069/HBASE-7481-1.0.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//console

This message is automatically generated.

 throw IOExceptions from Filter methods?
 ---

 Key: HBASE-7481
 URL: https://issues.apache.org/jira/browse/HBASE-7481
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Fix For: 0.95.0, 0.98.0

 Attachments: HBASE-7481-1.0.txt


 Currently there is no way to throw custom IOExceptions from any of the filter 
 methods.
 For implementers of custom filters that presents a problem.
 For example there are scenarios where the filter would want to indicate to 
 the client that it should not retry. Currently there is no way of doing 
 that.
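
 The request above can be illustrated with a minimal, self-contained sketch. 
 It is not the attached patch and does not use the HBase Filter class: 
 SketchFilter, NoRetryException and scanOnce are hypothetical names, assumed 
 here only to show how a filter method declared with "throws IOException" 
 could tell a client-side retry loop to give up (HBase's own 
 DoNotRetryIOException plays a comparable role).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for an exception that the client treats as "do not retry".
class NoRetryException extends IOException {
  NoRetryException(String message) {
    super(message);
  }
}

// Hypothetical, simplified filter contract: the method may throw IOException.
interface SketchFilter {
  boolean filterRowKey(byte[] rowKey) throws IOException;
}

class FilterIoSketch {
  // Rejects empty row keys with a non-retriable error;
  // otherwise filters out keys that start with '#'.
  static final SketchFilter STRICT = rowKey -> {
    if (rowKey.length == 0) {
      throw new NoRetryException("empty row key; retrying will not help");
    }
    return rowKey[0] == '#';
  };

  // Simulated client loop: retries plain IOExceptions, stops on NoRetryException.
  static String scanOnce(SketchFilter filter, byte[] rowKey, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return filter.filterRowKey(rowKey) ? "filtered" : "included";
      } catch (NoRetryException e) {
        return "failed without retry";
      } catch (IOException e) {
        // Treated as transient: loop around and try again.
      }
    }
    return "failed after retries";
  }

  public static void main(String[] args) {
    System.out.println(scanOnce(STRICT, "row1".getBytes(StandardCharsets.UTF_8), 3));
    System.out.println(scanOnce(STRICT, new byte[0], 3));
  }
}
```

 With a contract like this, the custom filter decides whether a failure is 
 worth retrying, instead of the client guessing.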



[jira] [Commented] (HBASE-8067) TestHFileArchiving.testArchiveOnTableDelete sometimes fails

2013-03-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605596#comment-13605596
 ] 

Ted Yu commented on HBASE-8067:
---

Looks like this test failed again in trunk build #3969

 TestHFileArchiving.testArchiveOnTableDelete sometimes fails
 ---

 Key: HBASE-8067
 URL: https://issues.apache.org/jira/browse/HBASE-8067
 Project: HBase
  Issue Type: Bug
  Components: Admin, master, test
Affects Versions: 0.96.0, 0.94.6
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.95.0, 0.94.7

 Attachments: HBASE-8067-debug.patch, HBASE-8067-v0.patch


 it seems that testArchiveOnTableDelete() fails because the archiving in 
 DeleteTableHandler is still in progress when admin.deleteTable() returns.
 {code}
 Error Message
 Archived files are missing some of the store files!
 Stacktrace
 java.lang.AssertionError: Archived files are missing some of the store files!
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.assertTrue(Assert.java:41)
   at 
 org.apache.hadoop.hbase.backup.TestHFileArchiving.testArchiveOnTableDelete(TestHFileArchiving.java:262)
 {code}
 (Looking at the problem in a more generic way, we don't have any way to 
 inform the client when an async operation is completed)
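
 Until the client can be told that an async operation has completed, a test 
 can only poll for the expected state. The following is a minimal sketch of 
 such a helper, assuming nothing about the eventual fix: waitFor and the 
 "archived" flag are hypothetical names, and the background thread merely 
 stands in for archiving that finishes after the delete call has returned.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

class AsyncWaitSketch {
  // Polls the condition until it holds or the timeout expires.
  // Returns true if the condition became true in time, false otherwise.
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    AtomicBoolean archived = new AtomicBoolean(false);
    // Stand-in for archiving that completes some time after the call returns.
    Thread archiver = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      archived.set(true);
    });
    archiver.start();
    // Assert only after the asynchronous work has been observed to finish.
    System.out.println(waitFor(2000, 10, archived::get));
  }
}
```

 Polling hides the race in the test but does not remove it, which is why the 
 note above about informing the client still applies.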



[jira] [Commented] (HBASE-7568) [replication] Create an interface for replication queues

2013-03-18 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605603#comment-13605603
 ] 

Chris Trezzo commented on HBASE-7568:
-

Hmm, it compiled locally. Will investigate. Thanks Ted.

Chris

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0

 Attachments: HBASE-7568-trunk-v1.patch






[jira] [Commented] (HBASE-7803) Look into REST API performance

2013-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605605#comment-13605605
 ] 

Hadoop QA commented on HBASE-7803:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12574197/trunk-7803.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//console

This message is automatically generated.

 Look into REST API performance
 --

 Key: HBASE-7803
 URL: https://issues.apache.org/jira/browse/HBASE-7803
 Project: HBase
  Issue Type: Task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-7803.patch


 I have a YCSB client using the REST API.  My testing shows that scan 
 performance with the REST API is much worse than with the Java client API.  
 We need to look into it and find the root cause: either an issue with the 
 test or an issue with our REST API.



[jira] [Commented] (HBASE-7568) [replication] Create an interface for replication queues

2013-03-18 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605609#comment-13605609
 ] 

Chris Trezzo commented on HBASE-7568:
-

Woops. Posted the wrong file when I renamed it.

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0






[jira] [Updated] (HBASE-7568) [replication] Create an interface for replication queues

2013-03-18 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated HBASE-7568:


Attachment: (was: HBASE-7568-trunk-v1.patch)

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0






[jira] [Commented] (HBASE-8127) Region of a disabling or disabled table could be stuck in transition state when RS dies during Master initialization

2013-03-18 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605616#comment-13605616
 ] 

Jeffrey Zhong commented on HBASE-8127:
--

[~rajesh23] Thanks for the detailed comments.

{quote}
If I am not wrong HBASE-7824 patch applied at that time right?
{quote}
No. Actually, with the {code}failedServers.remove(preMetaServer);{code} we don't 
see any issue at all. The only problem is when we have non-empty dead servers, 
which are simulated by the reproduce-hang patch.

Anyway, the opening RIT of the disabled table that is causing issues is on the 
live RS, not the one that dies (or is aborted) in the test. So the changes in 
SSH should not have any impact, IMHO. 



 Region of a disabling or disabled table could be stuck in transition state 
 when RS dies during Master initialization
 

 Key: HBASE-8127
 URL: https://issues.apache.org/jira/browse/HBASE-8127
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.5
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: HBASE-8127_feedback.patch, HBASE-8127.patch, 
 hbase-8127_v1.patch, reproduce-hang.patch


 The issue happens when an RS dies while a master is starting up. After the RS 
 reports open to the new master instance and dies immediately thereafter, the 
 RITs of disabling (or disabled) tables on the dead RS stay in RIT state 
 forever.
 I attached a patch to simulate the situation and you can run the following 
 command to reproduce the issue:
 {code}mvn test -PlocalTests 
 -Dtest=TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS{code}
 Basically, we skip regions of a dead server inside 
 AM.processDeadServersAndRecoverLostRegions, as in the following code, and rely 
 on SSH to process those skipped regions:
 {code}
   for (Pair<HRegionInfo, Result> deadRegion : deadServer.getValue()) {
 nodes.remove(deadRegion.getFirst().getEncodedName());
   }
 {code} 
 In SSH, the processDeadRegion function then skips regions of disabling (or 
 disabled) tables again. As a result, the RITs of disabling (or disabled) 
 tables are stuck there forever.
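
 The two skips described above can be modelled with a toy sketch. It is not 
 HBase code: unhandledRegions and the region names are hypothetical, and the 
 map value simply records whether the region's table is disabling or 
 disabled. It only shows how a region ends up handled by neither stage.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class StuckRitSketch {
  // Key: region name. Value: true if the region's table is disabling or disabled.
  // Returns the regions that neither stage handles, i.e. those left in transition.
  static Set<String> unhandledRegions(Map<String, Boolean> regionsOnDeadServer) {
    Set<String> stillInTransition = new TreeSet<>(regionsOnDeadServer.keySet());

    // Stage 1 (master initialization): every region of the dead server is
    // skipped and left to the shutdown handler, so nothing is removed here.

    // Stage 2 (shutdown handler): only regions of enabled tables are processed;
    // regions of disabling or disabled tables are skipped a second time.
    for (Map.Entry<String, Boolean> region : regionsOnDeadServer.entrySet()) {
      boolean tableDisablingOrDisabled = region.getValue();
      if (!tableDisablingOrDisabled) {
        stillInTransition.remove(region.getKey());
      }
    }
    return stillInTransition;
  }

  public static void main(String[] args) {
    Map<String, Boolean> regions = new LinkedHashMap<>();
    regions.put("regionOfEnabledTable", false);
    regions.put("regionOfDisabledTable", true);
    // Only the region of the disabled table is left over.
    System.out.println(unhandledRegions(regions));
  }
}
```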
  



[jira] [Commented] (HBASE-7295) Contention in HBaseClient.getConnection

2013-03-18 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605621#comment-13605621
 ] 

Lars Hofhansl commented on HBASE-7295:
--

I know we went through this before, but just making the PoolMap volatile does 
not make the implementation thread safe.
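
The general point is that a volatile field only publishes the reference; it 
does not make compound operations on the map behind it (check for an entry, 
then create and store one) atomic. As a minimal sketch of one alternative, 
and not what any of the attached patches do, the following uses the JDK's 
ConcurrentHashMap.computeIfAbsent so that at most one value is created per 
key. ConnectionCacheSketch, getConnection and the key string are hypothetical 
stand-ins, not HBaseClient or PoolMap.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ConnectionCacheSketch {
  private final ConcurrentHashMap<String, Object> connections = new ConcurrentHashMap<>();
  final AtomicInteger created = new AtomicInteger();

  // computeIfAbsent is atomic per key, so the factory runs at most once per key
  // even when many threads ask for the same connection at the same time.
  Object getConnection(String key) {
    return connections.computeIfAbsent(key, k -> {
      created.incrementAndGet();
      return new Object(); // placeholder for a real connection
    });
  }

  // Starts the given number of threads, releases them together, and returns
  // how many connections were created for a single shared key.
  static int createdUnderContention(int threads) {
    ConnectionCacheSketch cache = new ConnectionCacheSketch();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    for (int i = 0; i < threads; i++) {
      pool.execute(() -> {
        try {
          start.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
        cache.getConnection("regionserver-1");
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return cache.created.get();
  }

  public static void main(String[] args) {
    System.out.println(createdUnderContention(16));
  }
}
```

Whether this fits a pooled map with several connections per key is a separate 
question; the sketch only covers the single-value case.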


 Contention in HBaseClient.getConnection
 ---

 Key: HBASE-7295
 URL: https://issues.apache.org/jira/browse/HBASE-7295
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.3
Reporter: Varun Sharma
Assignee: Varun Sharma
 Fix For: 0.95.0

 Attachments: 7295-0.94.txt, 7295-0.94-v2.txt, 7295-0.94-v3.txt, 
 7295-0.94-v4.txt, 7295-0.94-v5.txt, 7295-trunk.txt, 7295-trunk.txt, 
 7295-trunk-v2.txt, 7295-trunk-v3.txt, 7295-trunk-v3.txt, 7295-trunk-v4.txt


 HBaseClient.getConnection() synchronizes on the connections object. We found 
 severe contention on a thrift gateway which was fanning out roughly 3000+ 
 calls per second to hbase region servers. The thrift gateway had 2000+ 
 threads for handling incoming connections. Threads were blocked on the 
 synchronized block - we set ipc.pool.size to 200. Since we are using 
 RoundRobin/ThreadLocal pool only - it's not necessary to synchronize on 
 connections - it might lead to cases where we might go slightly over the 
 ipc.max.pool.size() but the additional connections would timeout after 
 maxIdleTime - underlying PoolMap connections object is thread safe.



[jira] [Commented] (HBASE-6915) String and ConcurrentHashMap sizes change on jdk7; makes TestHeapSize fail

2013-03-18 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605631#comment-13605631
 ] 

Lars Hofhansl commented on HBASE-6915:
--

+1 for 0.94 as well.

 String and ConcurrentHashMap sizes change on jdk7; makes TestHeapSize fail
 --

 Key: HBASE-6915
 URL: https://issues.apache.org/jira/browse/HBASE-6915
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.95.0

 Attachments: jdk7.txt






[jira] [Commented] (HBASE-8014) Backport HBASE-6915 to 0.94.

2013-03-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605633#comment-13605633
 ] 

Ted Yu commented on HBASE-8014:
---

Here is Lars' confirmation: 
https://issues.apache.org/jira/browse/HBASE-6915?focusedCommentId=13605631&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13605631

 Backport HBASE-6915 to 0.94.
 

 Key: HBASE-8014
 URL: https://issues.apache.org/jira/browse/HBASE-8014
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jean-Marc Spaggiari
Priority: Critical
 Attachments: HBASE-8014-v0-0.94.patch


 JDK 1.7 changed some data sizes. The goal of this JIRA is to backport 
 HBASE-6915 to 0.94.



[jira] [Created] (HBASE-8141) Remove accidental uses of org.mortbay.log

2013-03-18 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-8141:
-

 Summary: Remove accidental uses of org.mortbay.log
 Key: HBASE-8141
 URL: https://issues.apache.org/jira/browse/HBASE-8141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.0, 0.96.0, 0.94.6
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial


Remove accidental uses of org.mortbay.log.Log. Eclipse autocomplete is probably 
the culprit.



[jira] [Commented] (HBASE-8014) Backport HBASE-6915 to 0.94.

2013-03-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605639#comment-13605639
 ] 

Ted Yu commented on HBASE-8014:
---

Integrated to 0.94

Thanks for the patch, Jean-Marc.

Thanks for the confirmation, Lars.

 Backport HBASE-6915 to 0.94.
 

 Key: HBASE-8014
 URL: https://issues.apache.org/jira/browse/HBASE-8014
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jean-Marc Spaggiari
Priority: Critical
 Attachments: HBASE-8014-v0-0.94.patch


 JDK 1.7 changed some data sizes. The goal of this JIRA is to backport 
 HBASE-6915 to 0.94.



[jira] [Updated] (HBASE-8014) Backport HBASE-6915 to 0.94.

2013-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8014:
--

Fix Version/s: 0.94.7

 Backport HBASE-6915 to 0.94.
 

 Key: HBASE-8014
 URL: https://issues.apache.org/jira/browse/HBASE-8014
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jean-Marc Spaggiari
Priority: Critical
 Fix For: 0.94.7

 Attachments: HBASE-8014-v0-0.94.patch


 JDK 1.7 changed some data sizes. The goal of this JIRA is to backport 
 HBASE-6915 to 0.94.



[jira] [Updated] (HBASE-8141) Remove accidental uses of org.mortbay.log.Log

2013-03-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-8141:
--

Summary: Remove accidental uses of org.mortbay.log.Log  (was: Remove 
accidental uses of org.mortbay.log)

 Remove accidental uses of org.mortbay.log.Log
 -

 Key: HBASE-8141
 URL: https://issues.apache.org/jira/browse/HBASE-8141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.0, 0.96.0, 0.94.6
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial

 Remove accidental uses of org.mortbay.log.Log. Eclipse autocomplete is 
 probably the culprit.



[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk (with changes)

2013-03-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605644#comment-13605644
 ] 

Ted Yu commented on HBASE-7055:
---

I am going over the patch.

Can you update the Release Notes?
There are a lot of config parameters introduced in this patch.

 port HBASE-6371 tier-based compaction from 0.89-fb to trunk (with changes)
 --

 Key: HBASE-7055
 URL: https://issues.apache.org/jira/browse/HBASE-7055
 Project: HBase
  Issue Type: Task
  Components: Compaction
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.95.0

 Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
 HBASE-6371-v3-refactor-only-squashed.patch, 
 HBASE-6371-v4-refactor-only-squashed.patch, 
 HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, 
 HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, 
 HBASE-7055-v4.patch, HBASE-7055-v5.patch, HBASE-7055-v6.patch, 
 HBASE-7055-v7.patch, HBASE-7055-v7.patch


 See HBASE-6371 for details.



[jira] [Resolved] (HBASE-8141) Remove accidental uses of org.mortbay.log.Log

2013-03-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-8141.
---

   Resolution: Fixed
Fix Version/s: 0.94.6
   0.96.0
   0.95.0

 Remove accidental uses of org.mortbay.log.Log
 -

 Key: HBASE-8141
 URL: https://issues.apache.org/jira/browse/HBASE-8141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.0, 0.96.0, 0.94.6
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 0.95.0, 0.96.0, 0.94.6

 Attachments: 8141-0.94.patch, 8141-trunk.patch


 Remove accidental uses of org.mortbay.log.Log. Eclipse autocomplete is 
 probably the culprit.



[jira] [Updated] (HBASE-8141) Remove accidental uses of org.mortbay.log.Log

2013-03-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-8141:
--

Attachment: 8141-0.94.patch
8141-trunk.patch

Trivial patches committed.

 Remove accidental uses of org.mortbay.log.Log
 -

 Key: HBASE-8141
 URL: https://issues.apache.org/jira/browse/HBASE-8141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.0, 0.96.0, 0.94.6
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 0.95.0, 0.96.0, 0.94.6

 Attachments: 8141-0.94.patch, 8141-trunk.patch


 Remove accidental uses of org.mortbay.log.Log. Eclipse autocomplete is 
 probably the culprit.



[jira] [Commented] (HBASE-7295) Contention in HBaseClient.getConnection

2013-03-18 Thread Varun Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605664#comment-13605664
 ] 

Varun Sharma commented on HBASE-7295:
-

Lars,

I may be forgetting, but is it because of the edge cases with PoolMap thread 
safety, the Connection object thread safety, or the double-checked locking 
issue in general?

Thanks
Varun

 Contention in HBaseClient.getConnection
 ---

 Key: HBASE-7295
 URL: https://issues.apache.org/jira/browse/HBASE-7295
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.3
Reporter: Varun Sharma
Assignee: Varun Sharma
 Fix For: 0.95.0

 Attachments: 7295-0.94.txt, 7295-0.94-v2.txt, 7295-0.94-v3.txt, 
 7295-0.94-v4.txt, 7295-0.94-v5.txt, 7295-trunk.txt, 7295-trunk.txt, 
 7295-trunk-v2.txt, 7295-trunk-v3.txt, 7295-trunk-v3.txt, 7295-trunk-v4.txt


 HBaseClient.getConnection() synchronizes on the connections object. We found 
 severe contention on a thrift gateway which was fanning out roughly 3000+ 
 calls per second to hbase region servers. The thrift gateway had 2000+ 
 threads for handling incoming connections. Threads were blocked on the 
 synchronized block - we set ipc.pool.size to 200. Since we are using 
 RoundRobin/ThreadLocal pool only - it's not necessary to synchronize on 
 connections - it might lead to cases where we might go slightly over the 
 ipc.max.pool.size() but the additional connections would timeout after 
 maxIdleTime - underlying PoolMap connections object is thread safe.



[jira] [Commented] (HBASE-7679) implement store file management for stripe compactions

2013-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605678#comment-13605678
 ] 

Hadoop QA commented on HBASE-7679:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12574205/HBASE-7679-v10.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//console

This message is automatically generated.

 implement store file management for stripe compactions
 --

 Key: HBASE-7679
 URL: https://issues.apache.org/jira/browse/HBASE-7679
 Project: HBase
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7667-and-7603-v0-incomplete.patch, 
 HBASE-7667-and-7603-v0-incomplete.patch, HBASE-7667-and-7603-v1.patch, 
 HBASE-7667-and-7603-v1.patch, HBASE-7667-v1.patch, HBASE-7667-v1.patch, 
 HBASE-7667-v2.patch, HBASE-7667-v2.patch, HBASE-7667-v3.patch, 
 HBASE-7679-v10.patch, HBASE-7679-v4.patch, HBASE-7679-v5.patch, 
 HBASE-7679-v6.patch, HBASE-7679-v7-.patch, HBASE-7679-v7.patch, 
 HBASE-7679-v8.patch, HBASE-7679-v9.patch






[jira] [Updated] (HBASE-8108) Add m2eclispe lifecycle mapping to hbase-common

2013-03-18 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-8108:
---

Summary: Add m2eclispe lifecycle mapping to hbase-common  (was: Add 
m2eclispe lifecycle mapping to hbase-commn)

 Add m2eclispe lifecycle mapping to hbase-common
 ---

 Key: HBASE-8108
 URL: https://issues.apache.org/jira/browse/HBASE-8108
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.95.0, 0.98.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: hbase-8108.patch, hbase-8108-v2.patch


 The maven-antrun-plugin execution doesn't have a default mapping in 
 m2eclipse, so if you import the project into eclipse, you will get an error 
 that the mapping is undefined. All that's needed is to define an execution 
 via the org.eclipse.m2 lifecycle-mapping plugin - it doesn't actually affect 
 the usual maven build at all.



[jira] [Resolved] (HBASE-8108) Add m2eclispe lifecycle mapping to hbase-common

2013-03-18 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates resolved HBASE-8108.


   Resolution: Fixed
Fix Version/s: 0.98.0
   0.95.0

committed to trunk and 0.95. Thanks for the reviews!

 Add m2eclispe lifecycle mapping to hbase-common
 ---

 Key: HBASE-8108
 URL: https://issues.apache.org/jira/browse/HBASE-8108
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.95.0, 0.98.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.95.0, 0.98.0

 Attachments: hbase-8108.patch, hbase-8108-v2.patch





[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Open  (was: Patch Available)

 Add a costless notifications mechanism from master to regionservers  clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
 7590.v3.patch, 7590.v5.patch, 7590.v5.patch


 It would be very useful to add a mechanism to distribute some information to 
 the clients and regionservers. In particular, it would be useful to know 
 globally (regionservers + client apps) that some regionservers are dead. This 
 would allow us:
 - to lower the load on the system, since clients would stop using stale 
 information and contacting dead machines
 - to make recovery faster from a client point of view. It's common to use 
 large timeouts on the client side, so the client may need a lot of time 
 before declaring a region server dead and trying another one. If the client 
 receives information about a region server's state separately, it can make 
 the right decision and continue or stop waiting accordingly.
 We can also send more information, for example instructions like 'slow down' 
 to tell the client to increase the retry delay and so on.
 Technically, the master could send this information. To lower the load on 
 the system, we should:
 - use multicast communication (i.e. the master does not have to connect to 
 all servers by TCP), with one packet every 10 seconds or so.
 - make sure receivers do not depend on it: if the information is available, 
 great. If not, nothing should break.
 - make it optional.
 So in the end we would have a thread in the master sending a protobuf message 
 about the dead servers on a multicast socket. If the socket is not 
 configured, it does nothing. On the client side, when we receive 
 information that a node is dead, we refresh the cache for it.
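
A minimal sketch of the sender side described above, assuming a hypothetical
DeadServerNotifier class (it is not taken from the attached patches). A
newline-separated UTF-8 list stands in for the protobuf message, and the group
address and port are placeholders chosen by the caller. The send is
best-effort: when no multicast group is configured, or the send fails, nothing
else is affected.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of an optional, best-effort dead-server publisher. */
class DeadServerNotifier {
  private final InetAddress group; // null means "multicast not configured"
  private final int port;

  DeadServerNotifier(InetAddress group, int port) {
    this.group = group;
    this.port = port;
  }

  /** Stand-in for the protobuf message: one server name per line. */
  static byte[] encode(List<String> deadServers) {
    return String.join("\n", deadServers).getBytes(StandardCharsets.UTF_8);
  }

  /** Receiver side: turn a received payload back into a list of server names. */
  static List<String> decode(byte[] payload, int length) {
    String text = new String(payload, 0, length, StandardCharsets.UTF_8);
    if (text.isEmpty()) {
      return Collections.emptyList();
    }
    return Arrays.asList(text.split("\n"));
  }

  /**
   * Sends one datagram to the group. Returns false, without throwing, when the
   * notifier is not configured or the send fails, so callers never depend on it.
   */
  boolean publish(List<String> deadServers) {
    if (group == null) {
      return false;
    }
    byte[] payload = encode(deadServers);
    try (DatagramSocket socket = new DatagramSocket()) {
      socket.send(new DatagramPacket(payload, payload.length, group, port));
      return true;
    } catch (IOException e) {
      return false;
    }
  }
}
```

On the master, a timer thread could call publish every 10 seconds or so;
receivers would call decode and refresh their cache for each listed server.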



[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Attachment: 7590.v13.patch

 Add a costless notifications mechanism from master to regionservers  clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
 7590.v3.patch, 7590.v5.patch, 7590.v5.patch





[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605698#comment-13605698
 ] 

nkeywal commented on HBASE-7590:


Maybe 13 is going to be my lucky number :-) ?

 Add a costless notifications mechanism from master to regionservers  clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
 7590.v3.patch, 7590.v5.patch, 7590.v5.patch





[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Patch Available  (was: Open)

 Add a costless notifications mechanism from master to regionservers  clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
 7590.v3.patch, 7590.v5.patch, 7590.v5.patch





[jira] [Commented] (HBASE-7965) Port table locking to 0.94 (HBASE-7305, HBASE-7546, HBASE-7933)

2013-03-18 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605707#comment-13605707
 ] 

Jonathan Hsieh commented on HBASE-7965:
---

I think it is unfair to claim that the ability to change schema without 
disabling the table is a feature that is required for HBase to be production 
ready.

The feature is off by default and essentially documented as experimental 
({{Its off by default. Enable it at your own risk.}} [1]), so in my eyes 
fixing it essentially feels like a new feature.

[1] http://hbase.apache.org/book.html#d1949e2910

(Sorry for the delay, I was away for 2 weeks.)

 Port table locking to 0.94 (HBASE-7305, HBASE-7546, HBASE-7933)
 ---

 Key: HBASE-7965
 URL: https://issues.apache.org/jira/browse/HBASE-7965
 Project: HBase
  Issue Type: New Feature
  Components: master, Zookeeper
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.94.7


 Port table locking to 0.94 (HBASE-7305, HBASE-7546, HBASE-7933). This is a 
 new feature, but there has been some interest, and it is necessary for 
 snapshots, and online merge, which are also candidates for backport. 
 If we port snapshots, we might need HBASE-7848 as well.
 We can also make it disabled by default.



[jira] [Commented] (HBASE-7295) Contention in HBaseClient.getConnection

2013-03-18 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605711#comment-13605711
 ] 

Lars Hofhansl commented on HBASE-7295:
--

Double-checked locking is fine when the variable being checked is declared 
volatile (i.e. ensuring proper read/write memory barriers).
Here PoolMap itself would have to be thread-safe, which - as far as I know - it 
is not.

Also in the uncontended case an access to a volatile is not significantly 
cheaper than a synchronized statement, so I doubt that even if it was correct 
it would actually improve the situation ... Unless you see extremely high 
contention on this lock.

Do you have sample code that can reproduce the problem? Until then I'm -1 on 
this change. (sorry)
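
For reference, a minimal sketch of the double-checked locking idiom with a
volatile field, using a hypothetical LazyRef holder (this is not HBaseClient
code). The idiom only guards publication of the reference; as noted above, the
object it points to (here, PoolMap) would still have to be thread-safe on its
own.

```java
import java.util.function.Supplier;

/** Hypothetical holder that creates its value at most once, on first use. */
class LazyRef<T> {
  private final Supplier<T> factory;
  // volatile gives the memory barriers that make the unsynchronized read safe
  private volatile T value;

  LazyRef(Supplier<T> factory) {
    this.factory = factory;
  }

  T get() {
    T local = value;            // first check: a single volatile read, no lock
    if (local == null) {
      synchronized (this) {
        local = value;          // second check: another thread may have won
        if (local == null) {
          local = factory.get();
          value = local;        // volatile write publishes the built object
        }
      }
    }
    return local;
  }
}
```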


 Contention in HBaseClient.getConnection
 ---

 Key: HBASE-7295
 URL: https://issues.apache.org/jira/browse/HBASE-7295
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.3
Reporter: Varun Sharma
Assignee: Varun Sharma
 Fix For: 0.95.0

 Attachments: 7295-0.94.txt, 7295-0.94-v2.txt, 7295-0.94-v3.txt, 
 7295-0.94-v4.txt, 7295-0.94-v5.txt, 7295-trunk.txt, 7295-trunk.txt, 
 7295-trunk-v2.txt, 7295-trunk-v3.txt, 7295-trunk-v3.txt, 7295-trunk-v4.txt


 HBaseClient.getConnection() synchronizes on the connections object. We found 
 severe contention on a thrift gateway which was fanning out roughly 3000+ 
 calls per second to hbase region servers. The thrift gateway had 2000+ 
 threads for handling incoming connections. Threads were blocked on the 
 synchronized block - we set ipc.pool.size to 200. Since we are using 
 RoundRobin/ThreadLocal pool only - it's not necessary to synchronize on 
 connections - it might lead to cases where we might go slightly over the 
 ipc.max.pool.size() but the additional connections would time out after 
 maxIdleTime - underlying PoolMap connections object is thread safe.



[jira] [Updated] (HBASE-7597) TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky

2013-03-18 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7597:
---

Attachment: trunk-7597.patch

 TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky
 -

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang
 Attachments: trunk-7597.patch


 I ran the entire test suite many times and it always failed on, at least, 
 testRegionShouldNotBeDeployed.
 Results below. I will attach more results when the current tests are done.
 Failed tests:
   testDeleteExpiredStoreFiles(org.apache.hadoop.hbase.regionserver.TestStore): expected:2 but was:4
   testAcquireTaskAtStartup(org.apache.hadoop.hbase.regionserver.TestSplitLogWorker): Waiting timed out after [1 000] msec
   testRegionShouldNotBeDeployed(org.apache.hadoop.hbase.util.TestHBaseFsck): expected:[SHOULD_NOT_BE_DEPLOYED] but was:[]
   testPermissionsWatcher(org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher)



[jira] [Updated] (HBASE-7597) TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky

2013-03-18 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7597:
---

Status: Patch Available  (was: Reopened)

The log file shows that hbck reports the region as not deployed although it 
is.  I added a check (done the same way as in hbck) to make sure the region 
is deployed before running hbck.
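
A minimal sketch of that kind of pre-check, assuming a hypothetical
WaitUtil.waitFor helper (this is not the code in trunk-7597.patch): poll a
condition such as "the region is deployed" until it holds or a timeout
expires, and only then run hbck.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper for tests that must wait on cluster state. */
class WaitUtil {
  /**
   * Polls the condition every intervalMs until it returns true or timeoutMs
   * has elapsed. Returns true if the condition held, false on timeout or
   * interruption.
   */
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
  }
}
```

In a test, the condition would wrap whatever deployment check the test already
has, and the test would fail fast if waitFor returns false.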

 TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky
 -

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang
 Attachments: trunk-7597.patch





[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8135:
--

Attachment: 8135-v3.txt

Patch v3 makes TestHeapSize pass.

Put has already been covered in TestHeapSize#testSizes()

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt


 Code is there already.
 Doing so would allow sharing some code when doing client side buffering.
 Patch compiles locally and should not impact tests, but has not been tested 
 locally.



[jira] [Commented] (HBASE-7965) Port table locking to 0.94 (HBASE-7305, HBASE-7546, HBASE-7933)

2013-03-18 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605718#comment-13605718
 ] 

Lars Hofhansl commented on HBASE-7965:
--

Welcome back Jon :)
I do not think it is a question of fair vs. unfair.

It is a fact that you cannot safely do online schema changes in 0.94.
When we have an actual patch against 0.94 we can weigh that deficiency against 
the risk introduced by the patch.


 Port table locking to 0.94 (HBASE-7305, HBASE-7546, HBASE-7933)
 ---

 Key: HBASE-7965
 URL: https://issues.apache.org/jira/browse/HBASE-7965
 Project: HBase
  Issue Type: New Feature
  Components: master, Zookeeper
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.94.7





[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8135:
--

Status: Patch Available  (was: Open)

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.5, 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt





[jira] [Updated] (HBASE-7568) [replication] Create an interface for replication queues

2013-03-18 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated HBASE-7568:


Attachment: HBASE-7568-trunk-v1.patch

Attached re-based patch to incorporate new test in TestReplicationSourceManager.

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0

 Attachments: HBASE-7568-trunk-v1.patch






[jira] [Commented] (HBASE-7295) Contention in HBaseClient.getConnection

2013-03-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605729#comment-13605729
 ] 

stack commented on HBASE-7295:
--

This doesn't make sense:

{code}
-  protected final PoolMap<ConnectionId, Connection> connections;
+  protected volatile PoolMap<ConnectionId, Connection> connections;
{code}

This is http://en.wikipedia.org/wiki/Double-checked_locking

No weird errors/connection fails in your thrift gateway?

PoolMap looks like it is backed by a concurrent hash map which would be fine on 
the gets, etc., but the iterations are not synchronized (I don't see 
connections being iterated but they probably are someplace if I looked more).

We committed double-checked locking around the block cache a while ago: 
https://issues.apache.org/jira/secure/attachment/12553266/5898-v4.txt



 Contention in HBaseClient.getConnection
 ---

 Key: HBASE-7295
 URL: https://issues.apache.org/jira/browse/HBASE-7295
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.3
Reporter: Varun Sharma
Assignee: Varun Sharma
 Fix For: 0.95.0

 Attachments: 7295-0.94.txt, 7295-0.94-v2.txt, 7295-0.94-v3.txt, 
 7295-0.94-v4.txt, 7295-0.94-v5.txt, 7295-trunk.txt, 7295-trunk.txt, 
 7295-trunk-v2.txt, 7295-trunk-v3.txt, 7295-trunk-v3.txt, 7295-trunk-v4.txt





[jira] [Commented] (HBASE-7597) TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky

2013-03-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605732#comment-13605732
 ] 

stack commented on HBASE-7597:
--

+1

IMO, exploratory/debug code is fine to commit while trying to figure out what's 
up on Jenkins (since it is hard to reproduce its context elsewhere).

 TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky
 -

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang
 Attachments: trunk-7597.patch





[jira] [Updated] (HBASE-7905) Add passing of optional cell blocks over rpc

2013-03-18 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7905:
-

Status: Open  (was: Patch Available)

 Add passing of optional cell blocks over rpc
 

 Key: HBASE-7905
 URL: https://issues.apache.org/jira/browse/HBASE-7905
 Project: HBase
  Issue Type: Sub-task
  Components: IPC/RPC
Reporter: stack
Assignee: stack
 Fix For: 0.95.0

 Attachments: 7900v12-depends-on-8101.txt, 7905.txt, 7905v13.txt, 
 7905v14.txt, 7905v15.txt, 7905v16.txt, 7905v3.txt, 7905v4.txt, 7905v6.txt, 
 7905v8.txt, 7905v9.txt, testipc_for_pre_cellblocks.txt


 Make it so we can pass Cells/data w/o having to bury it all in protobuf to 
 get it over the wire.



[jira] [Updated] (HBASE-7905) Add passing of optional cell blocks over rpc

2013-03-18 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7905:
-

Attachment: testipc_for_pre_cellblocks.txt

Add main to testipc for current trunk, before this patch goes in.

 Add passing of optional cell blocks over rpc
 

 Key: HBASE-7905
 URL: https://issues.apache.org/jira/browse/HBASE-7905
 Project: HBase
  Issue Type: Sub-task
  Components: IPC/RPC
Reporter: stack
Assignee: stack
 Fix For: 0.95.0

 Attachments: 7900v12-depends-on-8101.txt, 7905.txt, 7905v13.txt, 
 7905v14.txt, 7905v15.txt, 7905v16.txt, 7905v3.txt, 7905v4.txt, 7905v6.txt, 
 7905v8.txt, 7905v9.txt, testipc_for_pre_cellblocks.txt





[jira] [Commented] (HBASE-7305) ZK based Read/Write locks for table operations

2013-03-18 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605746#comment-13605746
 ] 

Jonathan Hsieh commented on HBASE-7305:
---

The doc is great -- I'm most curious about why different operations get the 
read or the write aspect of the lock, and what the locks protect.  I'm trying 
to justify this to myself now based on the docs.  So, do I have this right?

Affected operations:
* create, delete, disable, enable, alter, modify table (add/del/mod col, mod 
table), splits
* Other candidates: merge, snapshot, ... balancer, am, ssh, hbck

current rationale: 
* want to allow safe table mods (disable, enable, alter)
* want to allow concurrent splits
* want snapshots operations to be safe

Implementation:
* Read locks on splits.  
* Exclusive write lock on all other table mods.

Questions/Observations:
* This primarily protects operations that clash with table level 
enable/disable/alter, but not region level operations, right?
* This doesn't guard meta from individual changes, right?  It only protects 
meta from bulk adds (create/delete table).  Thus this shouldn't affect region 
moves or region closes/opens.
* Protecting split with a read table lock only prevents alter/enable/disable 
table ops from happening.  If an overlapping merge and split were issued, some 
other mechanism is in place to keep this sane right? This doesn't protect 
multiple merge requests with overlapping regions right? 
* Merges will likely want the read lock? (allowing multiple concurrent merges, 
and assuming some overlap sanity protection from a different mechanism).
* With snapshots, this mechanism doesn't prevent regions from moving so it only 
protects snapshots from concurrently happening with enable/disable/alter table 
ops. Snapshot will still fail if it gets caught while the balancer is running.
* These locks don't really help hbck -- except for the cases where 
enable/disable/alter operations are going on as hbck repairs things.  (It 
wouldn't protect hbck from the balancer).

As a strawman (for follow-on work), I'm thinking that for assignment-dependent 
operations (splits/balancer/ssh/snapshots/merge) we might want another lock (I 
believe regions-in-transition kind of serves this purpose already).  

* Does having a table lock (and then having individual region locks that 
require a table read lock being held) make sense?  Maybe this makes sense for 
merges and splits?
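
A minimal in-process sketch of the read/write split summarized above, assuming
a hypothetical TableLocks helper backed by java.util.concurrent locks rather
than ZooKeeper. Splits would take the shared lock; table-level modifications
(create, delete, disable, enable, alter) would take the exclusive lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical per-table read/write lock registry (local stand-in for ZK). */
class TableLocks {
  private final ConcurrentMap<String, ReentrantReadWriteLock> locks =
      new ConcurrentHashMap<>();

  private ReentrantReadWriteLock lockFor(String table) {
    // one lock object per table name, created on first use
    return locks.computeIfAbsent(table, t -> new ReentrantReadWriteLock());
  }

  /** Shared lock: many holders at once, e.g. concurrent splits. */
  Lock sharedLock(String table) {
    return lockFor(table).readLock();
  }

  /** Exclusive lock: a single holder, e.g. alter/enable/disable. */
  Lock exclusiveLock(String table) {
    return lockFor(table).writeLock();
  }
}
```

While any shared lock on a table is held, an attempt to take that table's
exclusive lock fails or blocks; locks on other tables are unaffected. Region
moves and overlapping merge/split requests would need a separate mechanism, as
the questions above point out.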



 ZK based Read/Write locks for table operations
 --

 Key: HBASE-7305
 URL: https://issues.apache.org/jira/browse/HBASE-7305
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Zookeeper
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0

 Attachments: 130228-zkrwlocks.pdf, 7305-v11.txt, hbase-7305_v0.patch, 
 hbase-7305_v10.patch, hbase-7305_v13.patch, hbase-7305_v14.patch, 
 hbase-7305_v15.patch, hbase-7305_v1-based-on-curator.patch, 
 hbase-7305_v2.patch, hbase-7305_v4.patch, hbase-7305_v9.patch, 
 HBaseTableLocks.pdf


 This started as a forward port of HBASE-5494 and HBASE-5991 from the 
 89-fb branch to trunk, but diverged enough to have its own issue. 
 The idea is to implement a zk based read/write lock per table. Master-
 initiated operations should get the write lock, and region operations (region 
 split, moving, balance?, etc) acquire a shared read lock. 
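
 As a rough in-process analogy of the semantics described above (this is not 
 the ZooKeeper-based implementation from the attached patches), the sketch 
 below keeps one java.util.concurrent ReentrantReadWriteLock per table name. 
 The class and method names (TableLockSketch, withReadLock, withWriteLock, 
 tryWriteLock) are made up for illustration only.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical single-JVM stand-in for a per-table read/write lock.
// The proposal keeps lock state in ZooKeeper; this only shows the semantics.
final class TableLockSketch {
  private final ConcurrentMap<String, ReentrantReadWriteLock> locks =
      new ConcurrentHashMap<>();

  private ReentrantReadWriteLock lockFor(String table) {
    return locks.computeIfAbsent(table, t -> new ReentrantReadWriteLock());
  }

  // Region-level operations (split, move, ...) share the table lock.
  void withReadLock(String table, Runnable regionOp) {
    ReentrantReadWriteLock.ReadLock lock = lockFor(table).readLock();
    lock.lock();
    try {
      regionOp.run();
    } finally {
      lock.unlock();
    }
  }

  // Master-initiated operations (alter, enable, disable) are exclusive.
  void withWriteLock(String table, Runnable tableOp) {
    ReentrantReadWriteLock.WriteLock lock = lockFor(table).writeLock();
    lock.lock();
    try {
      tableOp.run();
    } finally {
      lock.unlock();
    }
  }

  // True if the write lock could be taken right now without waiting.
  boolean tryWriteLock(String table) {
    ReentrantReadWriteLock.WriteLock lock = lockFor(table).writeLock();
    if (lock.tryLock()) {
      lock.unlock();
      return true;
    }
    return false;
  }
}
```

 While any region operation holds the read lock on a table, tryWriteLock on 
 that table returns false, and locks on other tables are unaffected. Note that 
 readers never exclude each other here, which is why overlapping splits and 
 merges would still need a separate mechanism.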



[jira] [Updated] (HBASE-8081) Backport HBASE-7213 (separate hlog for meta tables) to 0.94

2013-03-18 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-8081:
---

Attachment: 7213-0.94-with-config-1.patch

This is essentially the same patch as the last one with one minor change - 
added the new config in hbase-default.xml. 

Passes the unit tests with the option enabled. Also ran manual tests on a 
cluster with the config on/off. Things looked good.

 Backport HBASE-7213 (separate hlog for meta tables) to 0.94
 ---

 Key: HBASE-8081
 URL: https://issues.apache.org/jira/browse/HBASE-8081
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.5
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.94.7

 Attachments: 7213-0.94-2.patch, 7213-0.94-3.patch, 7213-0.94.patch, 
 7213-0.94-with-config-1.patch, 7213-0.94-with-config.patch


 I am interested in backporting HBASE-7213 to 0.94. Helps to address more of 
 the MTTR story. Offline discussion with Lars indicated he is interested as 
 well.



[jira] [Commented] (HBASE-7305) ZK based Read/Write locks for table operations

2013-03-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605761#comment-13605761
 ] 

Sergey Shelukhin commented on HBASE-7305:
-

bq. having individual region locks that require a table read lock being held 
I wonder if the region lock approach would scale. I can accept that 
splits are infrequent enough not to introduce too much delay to table 
operations, but if every AM action blocks every table operation, I think it will 
not scale beyond small or medium clusters. I think we should be able to use 
a better approach... table updates on modified regions can be done after 
modification.



[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605762#comment-13605762
 ] 

Hadoop QA commented on HBASE-7590:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12574232/7590.v13.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 32 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4873//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4873//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4873//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4873//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4873//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4873//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4873//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4873//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4873//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4873//console

This message is automatically generated.

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
 7590.v3.patch, 7590.v5.patch, 7590.v5.patch


 It would be very useful to add a mechanism to distribute some information to 
 the clients and regionservers. In particular, it would be useful to know globally 
 (regionservers + client apps) that some regionservers are dead. This would 
 allow:
 - to lower the load on the system, without clients using stale information 
 and going to dead machines
 - to make the recovery faster from a client point of view. It's common to use 
 large timeouts on the client side, so the client may need a lot of time 
 before declaring a region server dead and trying another one. If the client 
 receives the information about a region server's state separately, it can make 
 the right decision, and continue or stop waiting accordingly.
 We can also send more information, for example instructions like 'slow down' 
 to instruct the client to increase the retry delay and so on.
  Technically, the master could send this information. To lower the load on 
 the system, we should:
 - have a multicast communication (i.e. the master does not have to connect to 
 all servers by tcp), with one packet every 10 seconds or so.
 - receivers should not depend on this: if the information is available great. 
 If not, it should not break anything.
 - it should be optional.
 So at the end we would have a thread in the master sending a protobuf message 
 about the dead servers on a multicast socket. If the socket is not 
 configured, it does not do anything. On the client side, when we receive an 
 information that 
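
 A minimal sketch of the sender side described above, under stated 
 assumptions: the payload here is a newline-separated UTF-8 list rather than 
 the protobuf message the description mentions, and the names 
 (DeadServerNotifierSketch, encode, decode, publish) as well as the group 
 address and port passed by the caller are hypothetical.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the code from the attached patches.
final class DeadServerNotifierSketch {

  // Assumption: newline-separated UTF-8 names stand in for the protobuf message.
  static byte[] encode(List<String> deadServers) {
    return String.join("\n", deadServers).getBytes(StandardCharsets.UTF_8);
  }

  // Receivers treat the message as a hint only; an empty payload means "no news".
  static List<String> decode(byte[] payload, int length) {
    String text = new String(payload, 0, length, StandardCharsets.UTF_8);
    if (text.isEmpty()) {
      return new ArrayList<>();
    }
    return new ArrayList<>(Arrays.asList(text.split("\n")));
  }

  // Fire-and-forget: one UDP datagram to the group, no per-receiver TCP connection.
  static void publish(List<String> deadServers, String group, int port)
      throws IOException {
    byte[] payload = encode(deadServers);
    try (DatagramSocket socket = new DatagramSocket()) {
      InetAddress address = InetAddress.getByName(group);
      socket.send(new DatagramPacket(payload, payload.length, address, port));
    }
  }
}
```

 Because a datagram may simply be lost, nothing on the receiving side should 
 depend on it arriving, which matches the "should not break anything" 
 requirement in the list above.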

[jira] [Commented] (HBASE-7597) TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky

2013-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605772#comment-13605772
 ] 

Hadoop QA commented on HBASE-7597:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12574237/trunk-7597.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4874//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4874//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4874//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4874//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4874//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4874//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4874//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4874//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4874//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4874//console

This message is automatically generated.

 TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky
 -

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang
 Attachments: trunk-7597.patch


 I ran the entire test suite many times and it always failed on, at least, 
 testRegionShouldNotBeDeployed.
 Results below. I will attach more results when the current tests are done.
 Failed tests:
   testDeleteExpiredStoreFiles(org.apache.hadoop.hbase.regionserver.TestStore):
     expected:<2> but was:<4>
   testAcquireTaskAtStartup(org.apache.hadoop.hbase.regionserver.TestSplitLogWorker):
     Waiting timed out after [1 000] msec
   testRegionShouldNotBeDeployed(org.apache.hadoop.hbase.util.TestHBaseFsck):
     expected:<[SHOULD_NOT_BE_DEPLOYED]> but was:<[]>
   testPermissionsWatcher(org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher)



[jira] [Commented] (HBASE-7905) Add passing of optional cell blocks over rpc

2013-03-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605783#comment-13605783
 ] 

stack commented on HBASE-7905:
--

Ran a dumb test to compare before and after.

The test is too dumb, though, because for the current trunk it does not include 
the cost of building the protobuf, whereas it does include the cost of building 
the CellBlock that this patch adds.

Given that, here is what I have:

I added a main to TestIPC and then did nothing but pass KVs, comparing 
current trunk to what this patch adds.  The main difference between before and 
after is that before takes a single Message param into which all data has 
already been serialized.  The after -- i.e. cellblocks -- takes a param and a 
CellScanner from which it then internally composes a CellBlock to pass over the 
wire... so the new code iterates over all Cells to compose in 
memory a block to send behind the rpc (this is part of what is being measured, 
whereas with the before, the building of the object is not measured).

The server puts what it receives back on the wire as a return.

Running a test sending a single KV there and back 1M times has before and after 
taking about the same time.

BEFORE: 13/03/18 14:50:38 INFO ipc.TestIPC: Cycled 100 time(s) with 1 
cell(s) in 101236ms
AFTER: 13/03/18 14:25:15 INFO ipc.TestIPC: Cycled 100 time(s) with 1 
cell(s) in 103746ms

If I do more Cells, say 100, they diverge more:

BEFORE: 13/03/18 13:31:09 INFO ipc.TestIPC: Cycled 100 time(s) with 100 
cell(s) in 113950ms
AFTER: 13/03/18 13:40:58 INFO ipc.TestIPC: Cycled 100 time(s) with 100 
cell(s) in 128230ms

~8%

We should add another ~8% for server-side iteration undoing the cellblock which 
this test does not do.

If I do 1000 cells, we go up to about 60% (double that if server is doing 
iterations on its side).

Let me redo the test so it's a bit more of a fair comparison.
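
To make the extra work being measured concrete, here is a minimal sketch of 
composing a block by iterating over cells and undoing it on the other side. 
It is an assumption-laden illustration, not the codec from the patch: cells 
are plain byte arrays, the framing is a 4-byte length prefix per cell, and the 
names (CellBlockSketch, compose, decompose) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical framing: [int length][bytes] repeated, big-endian lengths.
final class CellBlockSketch {

  // Sender side: one pass to size the block, one pass to copy every cell in.
  static byte[] compose(List<byte[]> cells) {
    int size = 0;
    for (byte[] cell : cells) {
      size += Integer.BYTES + cell.length;
    }
    ByteBuffer out = ByteBuffer.allocate(size);
    for (byte[] cell : cells) {
      out.putInt(cell.length).put(cell);
    }
    return out.array();
  }

  // Receiver side: iterate the block and rebuild the individual cells.
  static List<byte[]> decompose(byte[] block) {
    List<byte[]> cells = new ArrayList<>();
    ByteBuffer in = ByteBuffer.wrap(block);
    while (in.hasRemaining()) {
      byte[] cell = new byte[in.getInt()];
      in.get(cell);
      cells.add(cell);
    }
    return cells;
  }
}
```

Both directions touch every cell once, so the per-call cost grows with the 
number of cells, which is consistent with the runs above diverging as the cell 
count goes up.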


 Add passing of optional cell blocks over rpc
 

 Key: HBASE-7905
 URL: https://issues.apache.org/jira/browse/HBASE-7905
 Project: HBase
  Issue Type: Sub-task
  Components: IPC/RPC
Reporter: stack
Assignee: stack
 Fix For: 0.95.0

 Attachments: 7900v12-depends-on-8101.txt, 7905.txt, 7905v13.txt, 
 7905v14.txt, 7905v15.txt, 7905v16.txt, 7905v3.txt, 7905v4.txt, 7905v6.txt, 
 7905v8.txt, 7905v9.txt, testipc_for_pre_cellblocks.txt


 Make it so we can pass Cells/data w/o having to bury it all in protobuf to 
 get it over the wire.



[jira] [Commented] (HBASE-7597) TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky

2013-03-18 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605787#comment-13605787
 ] 

Jimmy Xiang commented on HBASE-7597:


I ran the test several times locally and it is green.  I checked it in for 
trunk and 0.95.  Let's keep this open for a while to see if the problem happens 
again.



[jira] [Updated] (HBASE-7597) TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky

2013-03-18 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7597:
---

Fix Version/s: 0.98.0
   0.95.0

 TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky
 -

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang
 Fix For: 0.95.0, 0.98.0

 Attachments: trunk-7597.patch


 I ran the entire test suite many times and it always failed on, at least, 
 testRegionShouldNotBeDeployed.
 Results below. I will attach more results when the current tests are done.
 Failed tests:
   testDeleteExpiredStoreFiles(org.apache.hadoop.hbase.regionserver.TestStore):
     expected:<2> but was:<4>
   testAcquireTaskAtStartup(org.apache.hadoop.hbase.regionserver.TestSplitLogWorker):
     Waiting timed out after [1 000] msec
   testRegionShouldNotBeDeployed(org.apache.hadoop.hbase.util.TestHBaseFsck):
     expected:<[SHOULD_NOT_BE_DEPLOYED]> but was:<[]>
   testPermissionsWatcher(org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher)



[jira] [Commented] (HBASE-7255) KV size metric went missing from StoreScanner.

2013-03-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605790#comment-13605790
 ] 

Ted Yu commented on HBASE-7255:
---

If I understand the current implementation correctly, a MetricsStoreSource 
should be added 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver and 
implementations would be added:

./hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsStoreSourceImpl.java
./hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsStoreSourceImpl.java

 KV size metric went missing from StoreScanner.
 --

 Key: HBASE-7255
 URL: https://issues.apache.org/jira/browse/HBASE-7255
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.95.0


 In trunk due to the metric refactor, at least the KV size metric went missing.
 See this code in StoreScanner.java:
 {code}
 } finally {
    if (cumulativeMetric > 0 && metric != null) {
   }
 }
 {code}
 Just an empty if statement, where the metric used to be collected.
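
 For illustration only, a sketch of what collecting the metric in that 
 finally block could look like. The counter store used here 
 (KvSizeMetricSketch with incr and get, backed by LongAdder) is a hypothetical 
 stand-in and not the HBase metrics API; only the shape of the finally block 
 follows the snippet quoted in the issue.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical counter store; stands in for whatever metrics source is wired in.
final class KvSizeMetricSketch {
  private final ConcurrentMap<String, LongAdder> counters = new ConcurrentHashMap<>();

  void incr(String metric, long delta) {
    counters.computeIfAbsent(metric, k -> new LongAdder()).add(delta);
  }

  long get(String metric) {
    LongAdder adder = counters.get(metric);
    return adder == null ? 0L : adder.sum();
  }

  // Accumulate KV sizes while scanning, then publish once in the finally block.
  long scan(long[] kvSizes, String metric) {
    long cumulativeMetric = 0;
    try {
      for (long size : kvSizes) {
        cumulativeMetric += size;
      }
      return cumulativeMetric;
    } finally {
      if (cumulativeMetric > 0 && metric != null) {
        incr(metric, cumulativeMetric); // the step that went missing
      }
    }
  }
}
```

 Publishing once per scan call rather than once per KV keeps the hot loop free 
 of map lookups.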



[jira] [Comment Edited] (HBASE-7255) KV size metric went missing from StoreScanner.

2013-03-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605790#comment-13605790
 ] 

Ted Yu edited comment on HBASE-7255 at 3/18/13 11:23 PM:
-

If I understand the current implementation correctly, a MetricsStoreSource 
should be added under 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver and 
implementations would be added under:

./hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsStoreSourceImpl.java
./hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsStoreSourceImpl.java

  was (Author: yuzhih...@gmail.com):
If I understand the current implementation correctly, a MetricsStoreSource 
should be added 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver and 
implementations would be added:

./hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsStoreSourceImpl.java
./hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsStoreSourceImpl.java
  


[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605793#comment-13605793
 ] 

Hadoop QA commented on HBASE-8135:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12574238/8135-v3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4875//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4875//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4875//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4875//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4875//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4875//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4875//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4875//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4875//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4875//console

This message is automatically generated.

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt


 The code is there already.
 Doing so would allow sharing some code when doing client-side buffering.
 The patch compiles locally and should not impact tests, but was not tested locally.



[jira] [Updated] (HBASE-7680) implement compaction policy for stripe compactions

2013-03-18 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7680:


Attachment: HBASE-7680-v3-with-7679.patch
HBASE-7680-v3.patch

After discussion in HBASE-8034, redid all the reasonable places to use KV count 
instead of size for splitting. Also rebased.

 implement compaction policy for stripe compactions
 --

 Key: HBASE-7680
 URL: https://issues.apache.org/jira/browse/HBASE-7680
 Project: HBase
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7680-v0.patch, 
 HBASE-7680-v0-with-7679-and-7935.patch, HBASE-7680-v1.patch, 
 HBASE-7680-v1-with-7679.patch, HBASE-7680-v2.patch, 
 HBASE-7680-v2-with-7679-and-8034.patch, HBASE-7680-v3.patch, 
 HBASE-7680-v3-with-7679.patch






[jira] [Created] (HBASE-8142) Sporadic TestZKProcedureControllers failures on trunk

2013-03-18 Thread stack (JIRA)
stack created HBASE-8142:


 Summary: Sporadic TestZKProcedureControllers failures on trunk
 Key: HBASE-8142
 URL: https://issues.apache.org/jira/browse/HBASE-8142
 Project: HBase
  Issue Type: Bug
Reporter: stack


See 
https://builds.apache.org/job/PreCommit-HBASE-Build/4865//artifact/trunk/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.procedure.TestZKProcedureControllers.txt
 and
https://builds.apache.org/job/PreCommit-HBASE-Build/4865//artifact/trunk/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.procedure.TestZKProcedureControllers-output.txt

I see this in the output:

{code}
2013-03-18 17:30:46,672 DEBUG [Thread-2-EventThread] zookeeper.ZKUtil(1682): 
testing utility-0x13d7e8da759 Retrieved 0 byte(s) of data from znode 
/hbase/testSimple/acquired/instanceTest; data=empty
2013-03-18 17:30:46,672 DEBUG [Thread-2-EventThread] 
procedure.ZKProcedureMemberRpcs(206): start proc data length is 0
2013-03-18 17:30:46,672 ERROR [Thread-2-EventThread] 
procedure.ZKProcedureMemberRpcs(210): Data in for starting procuedure 
instanceTest is illegally formatted. Killing the procedure.
2013-03-18 17:30:46,673 ERROR [Thread-2-EventThread] 
procedure.ZKProcedureMemberRpcs(218): Illegal argument exception
java.lang.IllegalArgumentException: Data in for starting procuedure 
instanceTest is illegally formatted. Killing the procedure.
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:211)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:175)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$100(ZKProcedureMemberRpcs.java:56)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:109)
at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:312)
at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2013-03-18 17:30:46,675 ERROR [Thread-2-EventThread] 
procedure.ZKProcedureMemberRpcs(281): Failed due to null subprocedure
java.lang.IllegalArgumentException via 
expected:java.lang.IllegalArgumentException: Data in for starting procuedure 
instanceTest is illegally formatted. Killing the procedure.
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:219)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:175)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$100(ZKProcedureMemberRpcs.java:56)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:109)
at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:312)
at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
Caused by: java.lang.IllegalArgumentException: Data in for starting procuedure 
instanceTest is illegally formatted. Killing the procedure.
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:211)
... 6 more
{code}

The znode has zero data (usually it has 7 bytes when the test runs fine).  Is 
the latch being triggered on node create before the data is written?  Pointers 
appreciated.
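
To illustrate the suspected ordering (this is a toy model of the question 
asked above, not a confirmed diagnosis and not ZooKeeper itself), the sketch 
below uses a hypothetical in-memory node store whose watcher fires as soon as 
a node is created. All names (ZnodeRaceSketch, watchCreates, observedLength) 
are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Toy in-memory model: the create listener runs before any later setData call.
final class ZnodeRaceSketch {
  private final Map<String, byte[]> nodes = new HashMap<>();
  private Consumer<String> onCreate = path -> { };

  void watchCreates(Consumer<String> listener) {
    onCreate = listener;
  }

  void create(String path, byte[] data) {
    nodes.put(path, data);
    onCreate.accept(path); // watcher is notified as soon as the node exists
  }

  void setData(String path, byte[] data) {
    nodes.put(path, data);
  }

  byte[] getData(String path) {
    return nodes.get(path);
  }

  // Returns how many bytes the watcher saw at the moment it was notified.
  static int observedLength(boolean createWithData) {
    ZnodeRaceSketch zk = new ZnodeRaceSketch();
    int[] seen = {-1};
    zk.watchCreates(path -> seen[0] = zk.getData(path).length);
    String path = "/hbase/testSimple/acquired/instanceTest";
    byte[] payload = new byte[7];
    if (createWithData) {
      zk.create(path, payload);       // node and data appear together
    } else {
      zk.create(path, new byte[0]);   // node appears empty ...
      zk.setData(path, payload);      // ... data arrives after the watcher ran
    }
    return seen[0];
  }
}
```

In this model the two-step writer lets the watcher read 0 bytes, while 
creating the node with its payload in one call lets it read all 7. The real 
ZooKeeper create call also accepts the initial data, so whether the failing 
path writes in two steps is the thing to check.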



[jira] [Resolved] (HBASE-8013) TestZKProcedureControllers fails intermittently in trunk builds

2013-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-8013.
---

Resolution: Duplicate

Duplicate of HBASE-8142

 TestZKProcedureControllers fails intermittently in trunk builds
 ---

 Key: HBASE-8013
 URL: https://issues.apache.org/jira/browse/HBASE-8013
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu

 See 
 https://builds.apache.org/job/HBase-TRUNK/3918/testReport/org.apache.hadoop.hbase.procedure/TestZKProcedureControllers/testSimpleZKCohortMemberController/
 This seems to be the reason:
 {code}
 2013-03-06 10:35:31,088 ERROR [Thread-2-EventThread] procedure.ZKProcedureMemberRpcs(218): Illegal argument exception
 java.lang.IllegalArgumentException: Data in for starting procuedure instanceTest is illegally formatted. Killing the procedure.
   at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:211)
   at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:175)
   at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$100(ZKProcedureMemberRpcs.java:56)
   at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:109)
   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:312)
   at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2013-03-06 10:35:31,090 ERROR [Thread-2-EventThread] procedure.ZKProcedureMemberRpcs(281): Failed due to null subprocedure
 java.lang.IllegalArgumentException via expected:java.lang.IllegalArgumentException: Data in for starting procuedure instanceTest is illegally formatted. Killing the procedure.
   at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:219)
   at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:175)
   at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$100(ZKProcedureMemberRpcs.java:56)
   at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:109)
   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:312)
   at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 Caused by: java.lang.IllegalArgumentException: Data in for starting procuedure instanceTest is illegally formatted. Killing the procedure.
   at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:211)
   ... 6 more
 {code}



[jira] [Commented] (HBASE-7568) [replication] Create an interface for replication queues

2013-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605802#comment-13605802
 ] 

Hadoop QA commented on HBASE-7568:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12574242/HBASE-7568-trunk-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4876//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4876//console

This message is automatically generated.

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0

 Attachments: HBASE-7568-trunk-v1.patch






[jira] [Updated] (HBASE-8119) Optimize StochasticLoadBalancer

2013-03-18 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-8119:
-

Summary: Optimize StochasticLoadBalancer  (was: StochasticLoadBalancer does 
not take into account per table balance)

 Optimize StochasticLoadBalancer
 ---

 Key: HBASE-8119
 URL: https://issues.apache.org/jira/browse/HBASE-8119
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: 0.95.0
Reporter: Enis Soztutar
 Fix For: 0.95.0


 On a 5 node trunk cluster, I ran into a weird problem with 
 StochasticLoadBalancer:
 server1   Thu Mar 14 03:42:50 UTC 2013   0.0       33
 server2   Thu Mar 14 03:47:53 UTC 2013   0.0       34
 server3   Thu Mar 14 03:46:53 UTC 2013   465.0     42
 server4   Thu Mar 14 03:47:53 UTC 2013   11455.0   282
 server5   Thu Mar 14 03:47:53 UTC 2013   0.0       34
 Total:5                                  11920     425
 Notice that server4 has 282 regions, while the others have far fewer. In 
 addition, one table with 260 regions is badly imbalanced:
 {code}
 Regions by Region Server
 Region Server Region Count
 http://server3:60030/ 10
 http://server4:60030/ 250
 {code}



[jira] [Updated] (HBASE-7905) Add passing of optional cell blocks over rpc

2013-03-18 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7905:
-

Attachment: 7905v17.txt

Fix TestHCM at least.

 Add passing of optional cell blocks over rpc
 

 Key: HBASE-7905
 URL: https://issues.apache.org/jira/browse/HBASE-7905
 Project: HBase
  Issue Type: Sub-task
  Components: IPC/RPC
Reporter: stack
Assignee: stack
 Fix For: 0.95.0

 Attachments: 7900v12-depends-on-8101.txt, 7905.txt, 7905v13.txt, 
 7905v14.txt, 7905v15.txt, 7905v16.txt, 7905v17.txt, 7905v3.txt, 7905v4.txt, 
 7905v6.txt, 7905v8.txt, 7905v9.txt, testipc_for_pre_cellblocks.txt


 Make it so we can pass Cells/data w/o having to bury it all in protobuf to 
 get it over the wire.



[jira] [Updated] (HBASE-7905) Add passing of optional cell blocks over rpc

2013-03-18 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7905:
-

Status: Patch Available  (was: Open)



[jira] [Updated] (HBASE-8138) Using [packed=true] for repeated field of primitive numeric types (types which use the varint, 32-bit, or 64-bit wire types)

2013-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8138:
--

Status: Patch Available  (was: Open)

 Using [packed=true] for repeated field of primitive numeric types (types 
 which use the varint, 32-bit, or 64-bit wire types)
 

 Key: HBASE-8138
 URL: https://issues.apache.org/jira/browse/HBASE-8138
 Project: HBase
  Issue Type: Bug
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
Priority: Trivial
 Fix For: 0.98.0

 Attachments: hbase-8138.patch


 It's recommended to do the following for numeric primitive types
 {quote}
 For historical reasons, repeated fields of basic numeric types aren't encoded 
 as efficiently as they could be. New code should use the special option 
 [packed=true] to get a more efficient encoding
 {quote}
 See details at https://developers.google.com/protocol-buffers/docs/proto
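
To make the saving concrete, here is a small back-of-the-envelope sketch that computes the wire size of a repeated varint field with and without [packed=true]. It follows the documented protobuf wire format (one tag per element when unpacked; one tag, one length prefix and the raw varints when packed). The class and method names are made up for illustration and are not part of protobuf or HBase.

```java
/** Size calculator for protobuf repeated int32 fields; names are illustrative. */
final class PackedSizeSketch {
  /** Number of bytes a value occupies as a base-128 varint. */
  static int varintSize(long value) {
    int size = 1;
    while ((value & ~0x7FL) != 0) {
      value >>>= 7;
      size++;
    }
    return size;
  }

  /** Unpacked: every element carries its own tag (wire type 0, varint). */
  static int unpackedSize(int fieldNumber, int[] values) {
    int tagSize = varintSize((long) fieldNumber << 3);
    int total = 0;
    for (int v : values) {
      // Negative int32 values are sign-extended and take 10 bytes as varints.
      total += tagSize + varintSize(v);
    }
    return total;
  }

  /** Packed: one tag (wire type 2), one length prefix, then the raw varints. */
  static int packedSize(int fieldNumber, int[] values) {
    if (values.length == 0) {
      return 0; // an empty packed field does not appear on the wire
    }
    int payload = 0;
    for (int v : values) {
      payload += varintSize(v);
    }
    int tagSize = varintSize(((long) fieldNumber << 3) | 2);
    return tagSize + varintSize(payload) + payload;
  }
}
```

For field number 4 holding the values 3, 270 and 86942, the unpacked form takes 9 bytes and the packed form 8. The gap grows with the number of elements, since the per-element tag disappears.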



[jira] [Commented] (HBASE-8013) TestZKProcedureControllers fails intermittently in trunk builds

2013-03-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605811#comment-13605811
 ] 

stack commented on HBASE-8013:
--

Thanks Ted.  I should have noticed this one.



[jira] [Commented] (HBASE-8119) Optimize StochasticLoadBalancer

2013-03-18 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605827#comment-13605827
 ] 

Enis Soztutar commented on HBASE-8119:
--

bq. Per table load balancing runs the balancer once per table. 
The issue turned out not to be in per-table load balancing, which already 
defaults to false. The issue is that for 500 regions the load balancer takes 
15 min, which makes it unusable. In its current form, StochasticLoadBalancer 
can only work with clusters of ~20 nodes and low hundreds of regions. 
bq. There's a lot of hashmap manipulation that should be optimized out if we 
wanted to worry about perf.
If the balancer takes more than 15 min, a bug in HMaster.balance() makes it 
break out prematurely while assigning the region plans from the balancer.
Another issue is that we do not bulk assign the regions in the plan generated 
by the load balancer. 



[jira] [Commented] (HBASE-7255) KV size metric went missing from StoreScanner.

2013-03-18 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605832#comment-13605832
 ] 

Elliott Clark commented on HBASE-7255:
--

I don't think that complexity is needed.  I think we can add this metric to one 
of the other regionserver mbeans.

 KV size metric went missing from StoreScanner.
 --

 Key: HBASE-7255
 URL: https://issues.apache.org/jira/browse/HBASE-7255
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.95.0


 In trunk due to the metric refactor, at least the KV size metric went missing.
 See this code in StoreScanner.java:
 {code}
 } finally {
    if (cumulativeMetric > 0 && metric != null) {
   }
 }
 {code}
 Just an empty if statement, where the metric used to be collected.



[jira] [Commented] (HBASE-7295) Contention in HBaseClient.getConnection

2013-03-18 Thread Varun Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605838#comment-13605838
 ] 

Varun Sharma commented on HBASE-7295:
-

We are not seeing any issues on the thrift gateway anymore.

@lars : I can try comparing volatile and synchronized accesses to the PoolMap 
of type ReusablePool
@stack : We do iterate over connections in HBaseClient when we try to close 
down the HBaseClient or stop it



 Contention in HBaseClient.getConnection
 ---

 Key: HBASE-7295
 URL: https://issues.apache.org/jira/browse/HBASE-7295
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.3
Reporter: Varun Sharma
Assignee: Varun Sharma
 Fix For: 0.95.0

 Attachments: 7295-0.94.txt, 7295-0.94-v2.txt, 7295-0.94-v3.txt, 
 7295-0.94-v4.txt, 7295-0.94-v5.txt, 7295-trunk.txt, 7295-trunk.txt, 
 7295-trunk-v2.txt, 7295-trunk-v3.txt, 7295-trunk-v3.txt, 7295-trunk-v4.txt


 HBaseClient.getConnection() synchronizes on the connections object. We found 
 severe contention on a thrift gateway which was fanning out roughly 3000+ 
 calls per second to hbase region servers. The thrift gateway had 2000+ 
 threads for handling incoming connections. Threads were blocked on the 
 synchronized block - we set ipc.pool.size to 200. Since we are using 
 RoundRobin/ThreadLocal pool only - it's not necessary to synchronize on 
 connections - it might lead to cases where we might go slightly over the 
 ipc.max.pool.size() but the additional connections would time out after 
 maxIdleTime - the underlying PoolMap connections object is thread safe.
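
The trade-off described above (no lock on the lookup path, at the price of occasionally creating one connection too many) can be sketched with a plain ConcurrentHashMap. This is a simplified illustration, not the attached patch: ConnectionCacheSketch and Conn are hypothetical names, and the real code works on a PoolMap with RoundRobin/ThreadLocal pools rather than on one connection per remote address.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical lock-free connection cache; not HBaseClient itself. */
final class ConnectionCacheSketch {
  /** Hypothetical stand-in for an RPC connection. */
  static final class Conn {
    final String remote;

    Conn(String remote) {
      this.remote = remote;
    }
  }

  private final ConcurrentMap<String, Conn> connections = new ConcurrentHashMap<>();

  /** Counts every connection object built, including ones that lose a race. */
  final AtomicInteger created = new AtomicInteger();

  Conn getConnection(String remote) {
    Conn existing = connections.get(remote); // fast path, no monitor held
    if (existing != null) {
      return existing;
    }
    Conn fresh = new Conn(remote);
    created.incrementAndGet();
    // Two threads may both get here; only one putIfAbsent wins.
    Conn winner = connections.putIfAbsent(remote, fresh);
    return winner != null ? winner : fresh;
  }
}
```

Under contention, created can briefly exceed the number of cached entries, which mirrors the "slightly over the pool size" behaviour mentioned in the description. In a real client the losing connection would have to be closed or left to idle out.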



[jira] [Commented] (HBASE-8119) Optimize StochasticLoadBalancer

2013-03-18 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605847#comment-13605847
 ] 

Enis Soztutar commented on HBASE-8119:
--

Quoting review at https://reviews.apache.org/r/9998/: 
Attaching a patch that improves the running time of StochasticLoadBalancer by 
roughly 200x. 

TestStochasticLoadBalancer#testMidCluster() Current impl:
//2013-03-15 17:28:25,495 DEBUG [main] balancer.StochasticLoadBalancer(256): 
Finished computing new laod balance plan.  Computation took 172526ms to try 
15000 different iterations.  Found a solution that moves 600 regions; Going 
from a computed cost of 35.850001 to a new cost of 23.481578947368426
With patch:
//2013-03-18 14:56:13,541 DEBUG [Thread-2] 
balancer.StochasticLoadBalancer(436): Finished computing new laod balance plan. 
 Computation took 941ms to try 15000 different iterations.  Found a solution 
that moves 600 regions; Going from a computed cost of 35.85 to a new cost of 
23.48157894736842

The improvements come from: 
 - Optimized array based data structures in Cluster class
 - Getting rid of hashmaps 
 - Optimized region move and swap ops 
 - Moving most of the computation to cluster initialization and to cluster 
state changes, thus avoiding computing the same results over and over
 - Some profiling

There should be further optimizations, but this should be a good start. If we 
run into more problems, we can investigate further. There are a lot of TODOs 
added in this patch. I'll create a jira for collecting some thoughts, but I 
won't have the time to work on those for now. 

There are (hopefully) minor semantic changes in the algo. I had to bump up 
loadMultiplier, and decrease moveCostMultiplier. See comments at 
TestStochasticLoadBalancer#testLargeCluster(). Please review carefully. 

As noted in testLargeCluster(), this does not work for large clusters  10 
regions, 1000 nodes. This can be solved by something like 
http://en.wikipedia.org/wiki/Simulated_annealing instead of random walk with 
eager selection. 
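
As a rough illustration of two of the ideas above (array-based cluster state, and patching the cost on each state change instead of recomputing it), here is a minimal sketch. It is not the code from the patch. The class name, the seed parameter and the cost function (sum of squared region counts per server) are assumptions chosen only to keep the example small. The walk accepts a random move unless it makes the cost worse, which is the "random walk with eager selection" scheme mentioned above, not simulated annealing.

```java
import java.util.Random;

/** Hypothetical array-backed cluster model; not StochasticLoadBalancer's Cluster. */
final class ArrayClusterSketch {
  private final int[] regionToServer;   // region index -> server index
  private final int[] regionsPerServer; // server index -> hosted region count
  private long cost;                    // cached sum of squared region counts

  ArrayClusterSketch(int numServers, int[] initialAssignment) {
    regionToServer = initialAssignment.clone();
    regionsPerServer = new int[numServers];
    for (int server : regionToServer) {
      regionsPerServer[server]++;
    }
    cost = recomputeCost();
  }

  private static long sq(int x) {
    return (long) x * x;
  }

  /** Full recomputation, O(servers); only needed at initialization. */
  long recomputeCost() {
    long total = 0;
    for (int count : regionsPerServer) {
      total += sq(count);
    }
    return total;
  }

  long cost() {
    return cost;
  }

  /** Moves one region and patches the cached cost in O(1). */
  void moveRegion(int region, int toServer) {
    int from = regionToServer[region];
    if (from == toServer) {
      return;
    }
    cost -= sq(regionsPerServer[from]) + sq(regionsPerServer[toServer]);
    regionsPerServer[from]--;
    regionsPerServer[toServer]++;
    cost += sq(regionsPerServer[from]) + sq(regionsPerServer[toServer]);
    regionToServer[region] = toServer;
  }

  /** Random walk: try a random move, undo it if the cost got worse. */
  long randomWalk(int steps, long seed) {
    Random random = new Random(seed);
    for (int i = 0; i < steps; i++) {
      int region = random.nextInt(regionToServer.length);
      int target = random.nextInt(regionsPerServer.length);
      int previous = regionToServer[region];
      long before = cost;
      moveRegion(region, target);
      if (cost > before) {
        moveRegion(region, previous); // undo a worsening move
      }
    }
    return cost;
  }
}
```

Because each candidate move touches only two counters, the per-iteration work does not depend on cluster size, which is the kind of saving the list above attributes to moving computation into initialization and state changes.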



[jira] [Updated] (HBASE-7568) [replication] Create an interface for replication queues

2013-03-18 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated HBASE-7568:


Attachment: HBASE-7568-trunk-v2.patch

Missed one line that is over 100. Attaching new patch.

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0

 Attachments: HBASE-7568-trunk-v1.patch, HBASE-7568-trunk-v2.patch






[jira] [Commented] (HBASE-7680) implement compaction policy for stripe compactions

2013-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605860#comment-13605860
 ] 

Hadoop QA commented on HBASE-7680:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12574261/HBASE-7680-v3-with-7679.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 22 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestJoinedScanners

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4877//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4877//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4877//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4877//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4877//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4877//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4877//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4877//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4877//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4877//console

This message is automatically generated.

 implement compaction policy for stripe compactions
 --

 Key: HBASE-7680
 URL: https://issues.apache.org/jira/browse/HBASE-7680
 Project: HBase
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7680-v0.patch, 
 HBASE-7680-v0-with-7679-and-7935.patch, HBASE-7680-v1.patch, 
 HBASE-7680-v1-with-7679.patch, HBASE-7680-v2.patch, 
 HBASE-7680-v2-with-7679-and-8034.patch, HBASE-7680-v3.patch, 
 HBASE-7680-v3-with-7679.patch






[jira] [Commented] (HBASE-7905) Add passing of optional cell blocks over rpc

2013-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605880#comment-13605880
 ] 

Hadoop QA commented on HBASE-7905:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12574267/7905v17.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 55 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 javac{color}.  The applied patch generated 6 javac compiler 
warnings (more than the trunk's current 4 warnings).

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestJoinedScanners
  
org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor
  org.apache.hadoop.hbase.client.TestFromClientSide

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4878//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4878//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4878//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4878//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4878//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4878//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4878//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4878//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4878//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4878//console

This message is automatically generated.



[jira] [Commented] (HBASE-7305) ZK based Read/Write locks for table operations

2013-03-18 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605886#comment-13605886
 ] 

Enis Soztutar commented on HBASE-7305:
--

Thanks Jon, I was in the process of updating the doc, but got carried away with 
more pressing issues. I'll attach an updated version. 
Your overview seems about right. 
bq. This primarily protects operations that clash with table level 
enable/disable/alter, but not region level operations, right?.
If you mean assign, etc., no, it does not. 
bq. This doesn't guard meta from individual changes, right? It only protects 
meta from bulk adds (create/delete table). Thus this shouldn't affect region 
moves or region closes/opens.
There is no guard against changes to META. Region moves and opens/closes do 
not acquire a lock. 
bq. If an overlapping merge and split were issued, some other mechanism is in 
place to keep this sane right? This doesn't protect multiple merge requests 
with overlapping regions right?
bq. Merges will likely want the read lock? (allowing multiple concurrent 
merges, and assuming some overlap sanity protection from a different mechanism).
Merges can be designed to acquire a read lock or a write lock. With a read 
lock, there is no guarantee against a merge and a concurrent split, but merges 
for different ranges can happen at the same time. With a write lock, we guard 
against the concurrent merge / split problem, but we cannot do multiple merges 
at the same time. 
The recent patch for HBASE-7403 moves the regions to be merged to the same 
region server. We might be able to do in-memory locking for merge and split in 
the RS, so that we might be able to use read locking for merges. 
bq. With snapshots, this mechanism doesn't prevent regions from moving so it 
only protects snapshots from concurrently happening with enable/disable/alter 
table ops. Snapshot will still fail if it gets caught while the balancer is 
running.
Yes, there is no protection against that right now. I have to look up why 
region move causes snapshot to fail. 
bq. These locks don't really help hbck – except for the cases where 
enable/disable/alter operations are going on as hbck repairs things. (It 
wouldn't protect hbck from the balancer).
hbck as it is relies too much on knowing about the filesystem layout, and META. 
It is hard to sync between balancer and hbck. 
bq. Does having a table lock (and then having individual region locks that 
require a table read lock being held) make sense? Maybe this makes sense for 
merges and splits?
If we have per-region locks, we might reevaluate table locks. But I would 
imagine so, since it will prevent concurrent master operations as well. We can 
achieve the same thing with acquiring all the region locks, but table locks 
would be faster. 

bq. having individual region locks that require a table read lock being held
I think we have to evaluate whether this is feasible. I guess it should be, but 
we should be able to scale to millions of regions. If we had per-region locks, 
assignment would become much easier (current RIT is similar, but we need this 
even for assigned regions)
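
To make the read/write trade-off above concrete, here is a minimal in-process 
sketch. It assumes a plain java.util.concurrent lock as a stand-in for the 
ZK based lock; the class and method names are hypothetical and not taken from 
the patch.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** In-process stand-in for a per-table read/write lock (the real one would live in ZooKeeper). */
final class TableLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Region-level operation (e.g. a merge under the read-lock option): shared access. */
    boolean tryRegionOp() {
        return lock.readLock().tryLock();
    }

    void finishRegionOp() {
        lock.readLock().unlock();
    }

    /** Master-initiated table operation (enable/disable/alter): exclusive access. */
    boolean tryTableOp() {
        return lock.writeLock().tryLock();
    }

    void finishTableOp() {
        lock.writeLock().unlock();
    }
}
```

With this model, two region operations can hold the lock together, while a 
table operation is refused until both have finished. Nothing in it orders a 
merge against a split, which is the gap discussed above.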

 ZK based Read/Write locks for table operations
 --

 Key: HBASE-7305
 URL: https://issues.apache.org/jira/browse/HBASE-7305
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Zookeeper
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0

 Attachments: 130228-zkrwlocks.pdf, 7305-v11.txt, hbase-7305_v0.patch, 
 hbase-7305_v10.patch, hbase-7305_v13.patch, hbase-7305_v14.patch, 
 hbase-7305_v15.patch, hbase-7305_v1-based-on-curator.patch, 
 hbase-7305_v2.patch, hbase-7305_v4.patch, hbase-7305_v9.patch, 
 HBaseTableLocks.pdf


 This has started as forward porting of HBASE-5494 and HBASE-5991 from the 
 89-fb branch to trunk, but diverged enough to have its own issue. 
 The idea is to implement a zk based read/write lock per table. Master 
 initiated operations should get the write lock, and region operations (region 
 split, moving, balance?, etc) acquire a shared read lock. 



[jira] [Reopened] (HBASE-8067) TestHFileArchiving.testArchiveOnTableDelete sometimes fails

2013-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-8067:
---


 TestHFileArchiving.testArchiveOnTableDelete sometimes fails
 ---

 Key: HBASE-8067
 URL: https://issues.apache.org/jira/browse/HBASE-8067
 Project: HBase
  Issue Type: Bug
  Components: Admin, master, test
Affects Versions: 0.96.0, 0.94.6
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.95.0, 0.94.7

 Attachments: HBASE-8067-debug.patch, HBASE-8067-v0.patch


 it seems that testArchiveOnTableDelete() fails because the archiving in 
 DeleteTableHandler is still in progress when admin.deleteTable() returns.
 {code}
 Error Message
 Archived files are missing some of the store files!
 Stacktrace
 java.lang.AssertionError: Archived files are missing some of the store files!
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.assertTrue(Assert.java:41)
   at 
 org.apache.hadoop.hbase.backup.TestHFileArchiving.testArchiveOnTableDelete(TestHFileArchiving.java:262)
 {code}
 (Looking at the problem in a more generic way, we don't have any way to 
 inform the client when an async operation is completed)
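
 One way to picture the missing completion signal is a future that the client 
 can wait on. The sketch below is only an illustration under that assumption: 
 deleteTableAsync is a hypothetical method, not an HBase API, and the loop is 
 a stand-in for the archiving work.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncDeleteSketch {
    /** Hypothetical async delete: the returned future completes only after archiving finishes. */
    static CompletableFuture<Integer> deleteTableAsync(int storeFiles) {
        return CompletableFuture.supplyAsync(() -> {
            int archived = 0;
            for (int i = 0; i < storeFiles; i++) {
                archived++; // stand-in for moving one store file to the archive
            }
            return archived;
        });
    }
}
```

 A test would then call join() on the returned future before checking the 
 archive directory, instead of checking right after the call returns.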



[jira] [Commented] (HBASE-8138) Using [packed=true] for repeated field of primitive numeric types (types which use the varint, 32-bit, or 64-bit wire types)

2013-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605909#comment-13605909
 ] 

Hadoop QA commented on HBASE-8138:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12574187/hbase-8138.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4879//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4879//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4879//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4879//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4879//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4879//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4879//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4879//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4879//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4879//console

This message is automatically generated.

 Using [packed=true] for repeated field of primitive numeric types (types 
 which use the varint, 32-bit, or 64-bit wire types)
 

 Key: HBASE-8138
 URL: https://issues.apache.org/jira/browse/HBASE-8138
 Project: HBase
  Issue Type: Bug
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
Priority: Trivial
 Fix For: 0.98.0

 Attachments: hbase-8138.patch


 It's recommended to do the following for numeric primitive types
 {quote}
 For historical reasons, repeated fields of basic numeric types aren't encoded 
 as efficiently as they could be. New code should use the special option 
 [packed=true] to get a more efficient encoding
 {quote}
 See details at https://developers.google.com/protocol-buffers/docs/proto
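
 As a rough illustration of why packing helps, the sketch below computes the 
 encoded size of a repeated varint field both ways. It follows the wire format 
 as described in the protobuf encoding documentation (one tag per element when 
 unpacked; one tag plus a length prefix when packed). The class and method 
 names are made up for this example.

```java
final class PackedSizeSketch {
    /** Number of bytes needed to encode value as a base-128 varint. */
    static int varintSize(long value) {
        int size = 1;
        while ((value & ~0x7FL) != 0) {
            value >>>= 7;
            size++;
        }
        return size;
    }

    /** The tag is (fieldNumber << 3) | wireType; the wire type bits do not change its size. */
    static int tagSize(int fieldNumber) {
        return varintSize(((long) fieldNumber) << 3);
    }

    /** Unpacked: every element carries its own tag. */
    static int unpackedSize(int fieldNumber, long[] values) {
        int total = 0;
        for (long v : values) {
            total += tagSize(fieldNumber) + varintSize(v);
        }
        return total;
    }

    /** Packed: one tag, one length prefix, then the raw varints back to back. */
    static int packedSize(int fieldNumber, long[] values) {
        if (values.length == 0) {
            return 0; // an empty packed field is not written at all
        }
        int payload = 0;
        for (long v : values) {
            payload += varintSize(v);
        }
        return tagSize(fieldNumber) + varintSize(payload) + payload;
    }
}
```

 For field number 4 holding the values 3, 270 and 86942, this gives 9 bytes 
 unpacked and 8 bytes packed. The saving grows with the number of elements, 
 since the per-element tag is paid only once.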



[jira] [Commented] (HBASE-7568) [replication] Create an interface for replication queues

2013-03-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605917#comment-13605917
 ] 

Hadoop QA commented on HBASE-7568:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12574274/HBASE-7568-trunk-v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4880//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4880//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4880//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4880//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4880//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4880//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4880//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4880//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4880//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4880//console

This message is automatically generated.

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0

 Attachments: HBASE-7568-trunk-v1.patch, HBASE-7568-trunk-v2.patch






[jira] [Commented] (HBASE-8131) Create table handler needs to handle failure cases.

2013-03-18 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605918#comment-13605918
 ] 

chunhui shen commented on HBASE-8131:
-

{code}
+for (int i = 0; i < 100; i++) {
+  if (!TEST_UTIL.getHBaseAdmin().isTableAvailable(TABLENAME)) {
+Thread.sleep(200);
+  }
+}
{code}
Add an assert that the table is available?
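
A possible shape for that, as a sketch only: stop polling as soon as the 
condition holds and fail loudly if it never does. The BooleanSupplier stands in 
for the isTableAvailable call, and the helper name is hypothetical.

```java
import java.util.function.BooleanSupplier;

final class WaitAndAssertSketch {
    /** Polls up to `attempts` times, then fails if the condition still does not hold. */
    static void waitThenAssert(BooleanSupplier available, int attempts, long sleepMs) {
        for (int i = 0; i < attempts && !available.getAsBoolean(); i++) {
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
        }
        if (!available.getAsBoolean()) {
            throw new AssertionError("table not available after " + attempts + " attempts");
        }
    }
}
```

Without the final check, the loop in the patch would fall through silently 
after 100 attempts even if the table never became available.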

 Create table handler needs to handle failure cases.
 ---

 Key: HBASE-8131
 URL: https://issues.apache.org/jira/browse/HBASE-8131
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.98.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-8131_trunk_1.patch, HBASE-8131_trunk_2.patch, 
 HBASE-8131_trunk.patch


 In CreateTable Handler there are number of failure cases.  
 IOExceptions are common while creation of regioninfos, htableDescriptors etc.
 After this exception if i try to recreate the table using admin, we need to 
 remove the acquired table lock and also clear the ZKTable in memory cache so 
 that the operation can be retried.



[jira] [Comment Edited] (HBASE-8119) Optimize StochasticLoadBalancer

2013-03-18 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605847#comment-13605847
 ] 

Enis Soztutar edited comment on HBASE-8119 at 3/19/13 1:40 AM:
---

Quoting review at https://reviews.apache.org/r/9998/ 
Attaching a patch that improves the running time of StochasticLoadBalancer by 
roughly 200x. 

TestStochasticLoadBalancer#testMidCluster() Current impl:
//2013-03-15 17:28:25,495 DEBUG [main] balancer.StochasticLoadBalancer(256): 
Finished computing new laod balance plan.  Computation took 172526ms to try 
15000 different iterations.  Found a solution that moves 600 regions; Going 
from a computed cost of 35.850001 to a new cost of 23.481578947368426
With patch:
//2013-03-18 14:56:13,541 DEBUG [Thread-2] 
balancer.StochasticLoadBalancer(436): Finished computing new laod balance plan. 
 Computation took 941ms to try 15000 different iterations.  Found a solution 
that moves 600 regions; Going from a computed cost of 35.85 to a new cost of 
23.48157894736842

The improvements come from: 
 - Optimized array based data structures in Cluster class
 - Getting rid of hashmaps 
 - Optimized region move and swap ops 
 - Moving most of the computation to cluster initialization and cluster state 
changes, thus avoiding computing the same results over and over
 - Some profiling

There should be further optimizations, but this should be a good start. If we 
run into more problems, we can investigate further. There are a lot of TODOs 
added in this patch. I'll create a jira for collecting some thoughts, but I 
won't have the time to work on those for now. 

There are (hopefully) minor semantic changes in the algo. I had to bump up 
loadMultiplier, and decrease moveCostMultiplier. See comments at 
TestStochasticLoadBalancer#testLargeCluster(). Please review carefully. 

As noted in testLargeCluster(), this does not work for large clusters  10 
regions, 1000 nodes. This can be solved by something like 
http://en.wikipedia.org/wiki/Simulated_annealing instead of random walk with 
eager selection. 
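
To show what the simulated annealing alternative might look like, here is a 
toy sketch. It is not the balancer's code: the cost function (squared deviation 
of region counts from the mean), the cooling rate, and all names are 
assumptions chosen for illustration.

```java
import java.util.Random;

final class AnnealingSketch {
    private static final double COOLING = 0.999; // assumed cooling rate

    /** Toy cost: sum of squared differences between each server's region count and the mean. */
    static double cost(int[] regionsPerServer) {
        double mean = 0;
        for (int r : regionsPerServer) {
            mean += r;
        }
        mean /= regionsPerServer.length;
        double c = 0;
        for (int r : regionsPerServer) {
            c += (r - mean) * (r - mean);
        }
        return c;
    }

    /** Moves one region at a time; returns the best assignment seen during the walk. */
    static int[] balance(int[] start, int steps, long seed) {
        Random rnd = new Random(seed);
        int[] current = start.clone();
        int[] best = start.clone();
        double currentCost = cost(current);
        double bestCost = currentCost;
        double temperature = Math.max(1.0, currentCost);
        for (int i = 0; i < steps; i++) {
            int from = rnd.nextInt(current.length);
            int to = rnd.nextInt(current.length);
            if (from != to && current[from] > 0) {
                current[from]--;
                current[to]++;
                double delta = cost(current) - currentCost;
                // Eager selection would accept only delta <= 0; annealing also accepts
                // some worse moves, with a probability that shrinks as it cools.
                if (delta <= 0 || rnd.nextDouble() < Math.exp(-delta / temperature)) {
                    currentCost += delta;
                    if (currentCost < bestCost) {
                        bestCost = currentCost;
                        best = current.clone();
                    }
                } else {
                    current[from]++; // rejected: undo the move
                    current[to]--;
                }
            }
            temperature *= COOLING;
        }
        return best;
    }
}
```

The only difference from a random walk with eager selection is the acceptance 
test: occasionally taking a worse move is what lets the search leave a local 
minimum.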

 Optimize StochasticLoadBalancer
 ---

 Key: HBASE-8119
 URL: https://issues.apache.org/jira/browse/HBASE-8119
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: 0.95.0
Reporter: Enis Soztutar
 Fix For: 0.95.0


 On a 5 node trunk cluster, I ran into a weird problem with 
 StochasticLoadBalancer:
 server1   Thu Mar 14 03:42:50 UTC 2013       0.0    33
 server2   Thu Mar 14 03:47:53 UTC 2013       0.0    34
 server3   Thu Mar 14 03:46:53 UTC 2013     465.0    42
 server4   Thu Mar 14 03:47:53 UTC 2013   11455.0   282
 server5   Thu Mar 14 03:47:53 UTC 2013       0.0    34
 Total: 5                                 11920     425
 Notice that server4 has 282 regions, while the others have much less. Plus 
 for one table 

[jira] [Commented] (HBASE-7305) ZK based Read/Write locks for table operations

2013-03-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605930#comment-13605930
 ] 

Sergey Shelukhin commented on HBASE-7305:
-

What I had in mind in the comments in HBASE-5487 was a persistent central state 
machine + version per region (in ZK or a table), and per table. It should 
allow multiple operations to proceed in parallel as long as it's logically 
feasible (e.g. if split is opening daughters and alter table comes you just 
bump the version on the node and server has to reopen, etc.). 
For table-wide ops like alters I am +1 on locking (it could be done via 
versions too though, e.g. why not allow parallel alter-s during region opening 
- but this is not important probably).

 ZK based Read/Write locks for table operations
 --

 Key: HBASE-7305
 URL: https://issues.apache.org/jira/browse/HBASE-7305
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Zookeeper
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0

 Attachments: 130228-zkrwlocks.pdf, 7305-v11.txt, hbase-7305_v0.patch, 
 hbase-7305_v10.patch, hbase-7305_v13.patch, hbase-7305_v14.patch, 
 hbase-7305_v15.patch, hbase-7305_v1-based-on-curator.patch, 
 hbase-7305_v2.patch, hbase-7305_v4.patch, hbase-7305_v9.patch, 
 HBaseTableLocks.pdf


 This has started as forward porting of HBASE-5494 and HBASE-5991 from the 
 89-fb branch to trunk, but diverged enough to have its own issue. 
 The idea is to implement a zk based read/write lock per table. Master 
 initiated operations should get the write lock, and region operations (region 
 split, moving, balance?, etc) acquire a shared read lock. 



[jira] [Commented] (HBASE-8097) MetaServerShutdownHandler may potentially keep bumping up DeadServer.numProcessing

2013-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605931#comment-13605931
 ] 

Hudson commented on HBASE-8097:
---

Integrated in hbase-0.95 #83 (See 
[https://builds.apache.org/job/hbase-0.95/83/])
HBASE-8097 MetaServerShutdownHandler may potentially keep bumping up 
DeadServer.numProcessing (Jeffrey Zhong) (Revision 1457934)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/DeadServer.java
* 
/hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java


 MetaServerShutdownHandler may potentially keep bumping up 
 DeadServer.numProcessing
 --

 Key: HBASE-8097
 URL: https://issues.apache.org/jira/browse/HBASE-8097
 Project: HBase
  Issue Type: Bug
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.95.0, 0.98.0

 Attachments: 8097.txt, hbase-8097_1.patch, hbase-8097_v2.patch, 
 hbase-8097_v3.patch


 {code}
 } catch (IOException ioe) {
   this.services.getExecutorService().submit(this);
   this.deadServers.add(serverName);
   throw new IOException("failed log splitting for " +
   serverName + ", will retry", ioe);
 }
 {code}
 this.deadServers.add(serverName); will keep incrementing 
 DeadServer.numProcessing
 We can't get rid of numProcessing by just checking deadServers.size() because 
 deadServers is also used to report some historically failed RSs.
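
 The drift can be reproduced with a small model. This is a simplified sketch 
 based only on the description above; the field and method names mirror the 
 ones mentioned in this issue, but the bodies are assumptions, not the actual 
 DeadServer source.

```java
import java.util.HashSet;
import java.util.Set;

final class DeadServerSketch {
    private final Set<String> deadServers = new HashSet<>();
    private int numProcessing = 0;

    /** As described above: every add() bumps the counter, even for an already known server. */
    void add(String serverName) {
        numProcessing++;
        deadServers.add(serverName);
    }

    /** Called once when processing of a dead server completes. */
    void finish(String serverName) {
        numProcessing--;
    }

    int getNumProcessing() {
        return numProcessing;
    }

    int size() {
        return deadServers.size();
    }
}
```

 Calling add() for the same server on the initial attempt and again on a 
 retry, followed by a single finish(), leaves numProcessing at 1 while the set 
 still holds one entry, so the counter never returns to zero.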



[jira] [Commented] (HBASE-8141) Remove accidental uses of org.mortbay.log.Log

2013-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605932#comment-13605932
 ] 

Hudson commented on HBASE-8141:
---

Integrated in hbase-0.95 #83 (See 
[https://builds.apache.org/job/hbase-0.95/83/])
HBASE-8141. Remove accidental uses of org.mortbay.log.Log (Revision 1458002)

 Result = FAILURE
apurtell : 
Files : 
* 
/hbase/branches/0.95/hbase-prefix-tree/src/test/java/org/apache/hadoop/hbase/codec/prefixtree/builder/TestTreeDepth.java
* 
/hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java
* 
/hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java


 Remove accidental uses of org.mortbay.log.Log
 -

 Key: HBASE-8141
 URL: https://issues.apache.org/jira/browse/HBASE-8141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.0, 0.96.0, 0.94.6
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 0.95.0, 0.96.0, 0.94.6

 Attachments: 8141-0.94.patch, 8141-trunk.patch


 Remove accidental uses of org.mortbay.log.Log. Eclipse autocomplete is 
 probably the culprit.



[jira] [Commented] (HBASE-8108) Add m2eclispe lifecycle mapping to hbase-common

2013-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605933#comment-13605933
 ] 

Hudson commented on HBASE-8108:
---

Integrated in hbase-0.95 #83 (See 
[https://builds.apache.org/job/hbase-0.95/83/])
HBASE-8108: Add m2eclispe lifecycle mapping to hbase-common (Revision 
1458019)

 Result = FAILURE
jyates : 
Files : 
* /hbase/branches/0.95/hbase-common/pom.xml


 Add m2eclispe lifecycle mapping to hbase-common
 ---

 Key: HBASE-8108
 URL: https://issues.apache.org/jira/browse/HBASE-8108
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.95.0, 0.98.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.95.0, 0.98.0

 Attachments: hbase-8108.patch, hbase-8108-v2.patch


 The maven-antrun-plugin execution doesn't have a default mapping in 
 m2eclipse, so if you import the project into eclipse, you will get an error 
 that the mapping is undefined. All that's needed is to define an execution 
 via the org.eclipse.m2 lifecycle-mapping plugin - it doesn't actually affect 
 the usual maven build at all.



[jira] [Commented] (HBASE-7597) TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky

2013-03-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605934#comment-13605934
 ] 

Hudson commented on HBASE-7597:
---

Integrated in hbase-0.95 #83 (See 
[https://builds.apache.org/job/hbase-0.95/83/])
HBASE-7597 TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky 
(Revision 1458061)

 Result = FAILURE
jxiang : 
Files : 
* 
/hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java


 TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky
 -

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang
 Fix For: 0.95.0, 0.98.0

 Attachments: trunk-7597.patch


 I ran the entire test suite many times, and it always failed on at least 
 testRegionShouldNotBeDeployed.
 Results below. I will attach more results when the current tests are done.
 Failed tests:
 testDeleteExpiredStoreFiles(org.apache.hadoop.hbase.regionserver.TestStore):
 expected:<2> but was:<4>
 testAcquireTaskAtStartup(org.apache.hadoop.hbase.regionserver.TestSplitLogWorker):
 Waiting timed out after [1 000] msec
 testRegionShouldNotBeDeployed(org.apache.hadoop.hbase.util.TestHBaseFsck):
 expected:<[SHOULD_NOT_BE_DEPLOYED]> but was:<[]>
 testPermissionsWatcher(org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher)



[jira] [Updated] (HBASE-7878) recoverFileLease does not check return value of recoverLease

2013-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7878:
--

Attachment: 7878-trunk-v10.txt

 recoverFileLease does not check return value of recoverLease
 

 Key: HBASE-7878
 URL: https://issues.apache.org/jira/browse/HBASE-7878
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 0.95.0, 0.94.6
Reporter: Eric Newton
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: 7878.94, 7878-94.addendum, 7878-94.addendum2, 
 7878-trunk.addendum, 7878-trunk.addendum2, 7878-trunk-v10.txt, 
 7878-trunk-v2.txt, 7878-trunk-v3.txt, 7878-trunk-v4.txt, 7878-trunk-v5.txt, 
 7878-trunk-v6.txt, 7878-trunk-v7.txt, 7878-trunk-v8.txt, 7878-trunk-v9.txt, 
 7878-trunk-v9.txt


 I think this is a problem, so I'm opening a ticket so an HBase person takes a 
 look.
 Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease 
 recovery for Accumulo after HBase's lease recovery.  During testing, we 
 experienced data loss.  I found it is necessary to wait until recoverLease 
 returns true to know that the file has been truly closed.  In FSHDFSUtils, 
 the return result of recoverLease is not checked. In the unit tests created 
 to check lease recovery in HBASE-2645, the return result of recoverLease is 
 always checked.
 I think FSHDFSUtils should be modified to check the return result, and wait 
 until it returns true.
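
 A minimal sketch of the suggested change, assuming a bounded retry: keep 
 calling recoverLease until it reports true. The LeaseRecoverer interface is a 
 hypothetical stand-in for the file system call, and the attempt limit and 
 pause are illustrative values, not ones taken from any patch.

```java
final class LeaseRecoverySketch {
    /** Hypothetical stand-in for a file system call that returns true once the file is closed. */
    interface LeaseRecoverer {
        boolean recoverLease();
    }

    /** Retries until recoverLease() returns true; returns the number of calls made, or -1 on give-up. */
    static int recoverUntilClosed(LeaseRecoverer fs, int maxAttempts, long pauseMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (fs.recoverLease()) {
                return attempt; // only now is the file known to be closed
            }
            try {
                Thread.sleep(pauseMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }
}
```

 The caller should treat -1 as "not yet safe to read the log" rather than 
 proceed, since ignoring a false return is what led to the data loss described 
 above.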



[jira] [Updated] (HBASE-7878) recoverFileLease does not check return value of recoverLease

2013-03-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7878:
--

Attachment: (was: 7878-trunk-v10.txt)

 recoverFileLease does not check return value of recoverLease
 

 Key: HBASE-7878
 URL: https://issues.apache.org/jira/browse/HBASE-7878
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 0.95.0, 0.94.6
Reporter: Eric Newton
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: 7878.94, 7878-94.addendum, 7878-94.addendum2, 
 7878-trunk.addendum, 7878-trunk.addendum2, 7878-trunk-v10.txt, 
 7878-trunk-v2.txt, 7878-trunk-v3.txt, 7878-trunk-v4.txt, 7878-trunk-v5.txt, 
 7878-trunk-v6.txt, 7878-trunk-v7.txt, 7878-trunk-v8.txt, 7878-trunk-v9.txt, 
 7878-trunk-v9.txt


 I think this is a problem, so I'm opening a ticket so an HBase person takes a 
 look.
 Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease 
 recovery for Accumulo after HBase's lease recovery.  During testing, we 
 experienced data loss.  I found it is necessary to wait until recoverLease 
 returns true to know that the file has been truly closed.  In FSHDFSUtils, 
 the return result of recoverLease is not checked. In the unit tests created 
 to check lease recovery in HBASE-2645, the return result of recoverLease is 
 always checked.
 I think FSHDFSUtils should be modified to check the return result, and wait 
 until it returns true.



[jira] [Updated] (HBASE-8142) Sporadic TestZKProcedureControllers failures on trunk

2013-03-18 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-8142:
-

Attachment: hase-8142_v1.patch

The test failure occurs because ZKUtil.createSetData, as called in 
TestZKProcedureControllers.java, is NOT an atomic operation. As a result, a 
znode is sometimes visible before its data has been set: 
{code}
ZKUtil.createSetData(watcher, prepare, ProtobufUtil.prependPBMagic(data));
{code}

I changed ZKUtil.createSetData to create the znode and set its data atomically, 
which fixes the test case. Jonathan may still need to double-check whether the 
production code needs to handle this case.

[~jmhsieh] Do you need to patch startNewSubprocedure in order to handle the 
possible non-atomic scenario? 

Thanks,
-Jeffrey
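
The race can be pictured with a toy in-memory store. This sketch does not use 
the ZooKeeper client or ZKUtil; the class, its methods, and the betweenSteps 
hook (a stand-in for a watcher firing) are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ZNodeStoreSketch {
    private final Map<String, byte[]> nodes = new ConcurrentHashMap<>();

    /** Two steps: anything that runs between them sees the node with no data. */
    void createThenSetData(String path, byte[] data, Runnable betweenSteps) {
        nodes.put(path, new byte[0]);  // step 1: node exists, no data yet
        betweenSteps.run();            // stand-in for a watcher firing at this point
        nodes.put(path, data.clone()); // step 2: data arrives
    }

    /** One step: the node never exists without its data. */
    void createWithData(String path, byte[] data) {
        nodes.put(path, data.clone());
    }

    /** Returns the data length, or -1 if the node does not exist. */
    int dataLength(String path) {
        byte[] d = nodes.get(path);
        return d == null ? -1 : d.length;
    }
}
```

In the two-step variant, an observer that runs between the steps reads 0 bytes, 
which matches the "Retrieved 0 byte(s) of data" line in the test output. In 
the one-step variant there is no such window.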

 Sporadic TestZKProcedureControllers failures on trunk
 -

 Key: HBASE-8142
 URL: https://issues.apache.org/jira/browse/HBASE-8142
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Attachments: hase-8142_v1.patch


 See 
 https://builds.apache.org/job/PreCommit-HBASE-Build/4865//artifact/trunk/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.procedure.TestZKProcedureControllers.txt
  and
 https://builds.apache.org/job/PreCommit-HBASE-Build/4865//artifact/trunk/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.procedure.TestZKProcedureControllers-output.txt
 I see this in the output:
 {code}
 2013-03-18 17:30:46,672 DEBUG [Thread-2-EventThread] zookeeper.ZKUtil(1682): 
 testing utility-0x13d7e8da759 Retrieved 0 byte(s) of data from znode 
 /hbase/testSimple/acquired/instanceTest; data=empty
 2013-03-18 17:30:46,672 DEBUG [Thread-2-EventThread] 
 procedure.ZKProcedureMemberRpcs(206): start proc data length is 0
 2013-03-18 17:30:46,672 ERROR [Thread-2-EventThread] 
 procedure.ZKProcedureMemberRpcs(210): Data in for starting procuedure 
 instanceTest is illegally formatted. Killing the procedure.
 2013-03-18 17:30:46,673 ERROR [Thread-2-EventThread] 
 procedure.ZKProcedureMemberRpcs(218): Illegal argument exception
 java.lang.IllegalArgumentException: Data in for starting procuedure 
 instanceTest is illegally formatted. Killing the procedure.
   at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:211)
   at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:175)
   at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$100(ZKProcedureMemberRpcs.java:56)
   at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:109)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:312)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2013-03-18 17:30:46,675 ERROR [Thread-2-EventThread] 
 procedure.ZKProcedureMemberRpcs(281): Failed due to null subprocedure
 java.lang.IllegalArgumentException via 
 expected:java.lang.IllegalArgumentException: Data in for starting procuedure 
 instanceTest is illegally formatted. Killing the procedure.
   at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:219)
   at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:175)
   at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$100(ZKProcedureMemberRpcs.java:56)
   at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:109)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:312)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 Caused by: java.lang.IllegalArgumentException: Data in for starting 
 procuedure instanceTest is illegally formatted. Killing the procedure.
   at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:211)
   ... 6 more
 {code}
 The znode has zero bytes of data (it usually has 7 bytes when the test runs 
 fine). Is the latch being triggered on node creation, before the data is 
 written? Pointers appreciated.


