date:20130318


 [ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated HBASE-7568:


Attachment: (was: HBASE-7568.trunkv1)

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-7568) [replication] Create an interface for replication queues


 [ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated HBASE-7568:


Attachment: HBASE-7568-trunk-v1.patch

Renaming patch file so hadoopqa will pick it up.

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0

 Attachments: HBASE-7568-trunk-v1.patch





[jira] [Updated] (HBASE-7803) Look into REST API performance


 [ 
https://issues.apache.org/jira/browse/HBASE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7803:
---

Attachment: trunk-7803.patch

 Look into REST API performance
 --

 Key: HBASE-7803
 URL: https://issues.apache.org/jira/browse/HBASE-7803
 Project: HBase
  Issue Type: Task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-7803.patch


 I have a YCSB client using the REST API.  My testing shows that scan
 performance with the REST API is much worse than with the Java client API.  We
 need to look into it and find the root cause: either a test issue or an issue
 in our REST API.


[jira] [Updated] (HBASE-7803) Look into REST API performance


 [ 
https://issues.apache.org/jira/browse/HBASE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7803:
---

Status: Patch Available  (was: Open)

 Look into REST API performance
 --

 Key: HBASE-7803
 URL: https://issues.apache.org/jira/browse/HBASE-7803
 Project: HBase
  Issue Type: Task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-7803.patch


 I have a YCSB client using the REST API.  My testing shows that scan
 performance with the REST API is much worse than with the Java client API.  We
 need to look into it and find the root cause: either a test issue or an issue
 in our REST API.


[jira] [Commented] (HBASE-7803) Look into REST API performance


[ 
https://issues.apache.org/jira/browse/HBASE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605520#comment-13605520
 ] 

Jimmy Xiang commented on HBASE-7803:


Attached a patch to make REST support caching.

 Look into REST API performance
 --

 Key: HBASE-7803
 URL: https://issues.apache.org/jira/browse/HBASE-7803
 Project: HBase
  Issue Type: Task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-7803.patch


 I have a YCSB client using the REST API.  My testing shows that scan
 performance with the REST API is much worse than with the Java client API.  We
 need to look into it and find the root cause: either a test issue or an issue
 in our REST API.


[jira] [Updated] (HBASE-8097) MetaServerShutdownHandler may potentially keep bumping up DeadServer.numProcessing


 [ 
https://issues.apache.org/jira/browse/HBASE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8097:
--

Fix Version/s: (was: 0.96.0)
   0.98.0
   0.95.0
 Hadoop Flags: Reviewed

Integrated to 0.95 and trunk.

Thanks for the patch, Jeffrey.

Thanks for the reviews, Jimmy, Nicolas and Chunhui.

 MetaServerShutdownHandler may potentially keep bumping up 
 DeadServer.numProcessing
 --

 Key: HBASE-8097
 URL: https://issues.apache.org/jira/browse/HBASE-8097
 Project: HBase
  Issue Type: Bug
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.95.0, 0.98.0

 Attachments: 8097.txt, hbase-8097_1.patch, hbase-8097_v2.patch, 
 hbase-8097_v3.patch


 {code}
 } catch (IOException ioe) {
   this.services.getExecutorService().submit(this);
   this.deadServers.add(serverName);
   throw new IOException("failed log splitting for " +
   serverName + ", will retry", ioe);
 }
 {code}
 Because the handler resubmits itself, this.deadServers.add(serverName); runs
 again on every retry and will keep incrementing DeadServer.numProcessing.
 We can't get rid of numProcessing by just checking deadServers.size(), because
 deadServers is also used to report some historically failed RSs.
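
 The counting problem described above can be illustrated with a small, self-contained sketch. This is not the HBase DeadServer class and not the committed patch; the class name DeadServerSketch, its methods, and the idea of tracking in-progress servers in a set are assumptions made only for illustration. It contrasts a counter that is bumped on every add with one that ignores repeated adds of the same server.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a dead-server tracker; not the HBase DeadServer class. */
public class DeadServerSketch {
    // Kept after processing finishes so historically failed servers can still be reported.
    private final Set<String> deadServers = new HashSet<>();
    // Servers whose shutdown handling is currently in progress.
    private final Set<String> inProgress = new HashSet<>();
    // Naive counter: bumped on every add, even when the same server is re-added on retry.
    private int naiveNumProcessing = 0;

    public synchronized void add(String serverName) {
        naiveNumProcessing++;          // the behavior described in the issue
        deadServers.add(serverName);
        inProgress.add(serverName);    // set semantics: a retry does not double count
    }

    public synchronized void finish(String serverName) {
        naiveNumProcessing--;
        inProgress.remove(serverName); // the server stays in deadServers for reporting
    }

    public synchronized int naiveNumProcessing() { return naiveNumProcessing; }
    public synchronized int numProcessing() { return inProgress.size(); }
    public synchronized int historicalSize() { return deadServers.size(); }

    public static void main(String[] args) {
        DeadServerSketch tracker = new DeadServerSketch();
        tracker.add("rs1");    // first attempt
        tracker.add("rs1");    // resubmitted after a failed log split
        tracker.add("rs1");    // resubmitted again
        tracker.finish("rs1"); // processing finally succeeds
        System.out.println("naive counter:     " + tracker.naiveNumProcessing());
        System.out.println("set-based counter: " + tracker.numProcessing());
        System.out.println("historical size:   " + tracker.historicalSize());
    }
}
```

 In this toy model the naive counter never returns to zero after retries, while the historical set still holds the server, which is why its size alone cannot replace a processing count.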


[jira] [Commented] (HBASE-8128) HTable#put improvements


[ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605527#comment-13605527
 ] 

nkeywal commented on HBASE-8128:


Committed in 0.94

 HTable#put improvements
 ---

 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0

 Attachments: 8128.v1.patch


 3 points:
  - When doing a single put, we're creating an object by calling Arrays.asList.
  - We're doing a size check every 10 puts. Not doing it seems simpler and better,
 and allows sharing some code between a single put and a list of puts.
  - We may call flushCommits on an empty write buffer, especially when someone
 uses a lot of HTable instances instead of a pool, as it's called in close().
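
 The three points above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the HTable code or the attached patch: WriteBufferSketch, its String "puts", and the byte accounting by string length are all hypothetical. The single put and the list put share one helper, no wrapper list is allocated for a single put, and flushing an empty buffer returns immediately.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical write buffer; names and size accounting are illustrative only. */
public class WriteBufferSketch {
    private final List<String> buffer = new ArrayList<>();
    private final long flushThresholdBytes;
    private long bufferedBytes = 0;
    private int flushCount = 0;

    public WriteBufferSketch(long flushThresholdBytes) {
        this.flushThresholdBytes = flushThresholdBytes;
    }

    /** Single put: uses the shared helper directly, with no Arrays.asList wrapper. */
    public void put(String put) {
        bufferOne(put);
        flushIfNeeded();
    }

    /** List of puts: reuses the same helper and checks the size once at the end. */
    public void put(List<String> puts) {
        for (String p : puts) {
            bufferOne(p);
        }
        flushIfNeeded();
    }

    private void bufferOne(String put) {
        buffer.add(put);
        bufferedBytes += put.length(); // stand-in for a real heap-size estimate
    }

    private void flushIfNeeded() {
        if (bufferedBytes > flushThresholdBytes) {
            flushCommits();
        }
    }

    /** Returns immediately when nothing is buffered, so close() stays cheap. */
    public void flushCommits() {
        if (buffer.isEmpty()) {
            return;
        }
        buffer.clear();
        bufferedBytes = 0;
        flushCount++;
    }

    public void close() { flushCommits(); }
    public int flushCount() { return flushCount; }
    public int pending() { return buffer.size(); }

    public static void main(String[] args) {
        WriteBufferSketch table = new WriteBufferSketch(10);
        table.close();                                  // empty buffer: no flush happens
        table.put("abc");                               // 3 bytes buffered
        table.put(Arrays.asList("defgh", "ijklm"));     // 13 bytes total, crosses threshold
        System.out.println("flushes: " + table.flushCount());
        System.out.println("pending: " + table.pending());
    }
}
```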


[jira] [Commented] (HBASE-7803) Look into REST API performance


[ 
https://issues.apache.org/jira/browse/HBASE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605528#comment-13605528
 ] 

Jimmy Xiang commented on HBASE-7803:


I did some testing on my 4-node cluster with YCSB, and here is the scan
throughput I got with the REST API:

With caching, and using batch:  8.83
With caching, but no batch: 0.99
No caching, but using batch: 1.85
No caching, no batch: 0.68

On the same cluster, using the HBase Java client API, the throughput I got is
29.04.
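
To put these figures in proportion, a small sketch can express each REST result as a fraction of the Java client result. The numbers are taken from the comment above; the class name RestScanRatios and the helper relative() are hypothetical, and the throughput unit is whatever YCSB reported (it is not stated in the comment).

```java
/** Expresses the reported REST scan throughputs relative to the Java client figure. */
public class RestScanRatios {
    /** Throughput of a REST configuration as a fraction of the Java client's throughput. */
    static double relative(double restThroughput, double javaClientThroughput) {
        return restThroughput / javaClientThroughput;
    }

    public static void main(String[] args) {
        double javaClient = 29.04; // Java client API result from the comment
        String[] labels = {
            "caching + batch", "caching, no batch", "no caching, batch", "no caching, no batch"
        };
        double[] rest = {8.83, 0.99, 1.85, 0.68}; // REST API results from the comment
        for (int i = 0; i < rest.length; i++) {
            System.out.printf("%-22s %5.1f%% of Java client%n",
                labels[i], 100.0 * relative(rest[i], javaClient));
        }
    }
}
```

Even the best REST configuration (caching plus batch) reaches only about 30% of the Java client throughput in this test.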



 Look into REST API performance
 --

 Key: HBASE-7803
 URL: https://issues.apache.org/jira/browse/HBASE-7803
 Project: HBase
  Issue Type: Task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-7803.patch


 I have a YCSB client using the REST API.  My testing shows that scan
 performance with the REST API is much worse than with the Java client API.  We
 need to look into it and find the root cause: either a test issue or an issue
 in our REST API.


[jira] [Updated] (HBASE-8128) HTable#put improvements


 [ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8128:
---

Fix Version/s: 0.94.8

 HTable#put improvements
 ---

 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0, 0.94.8

 Attachments: 8128.v1.patch


 3 points:
  - When doing a single put, we're creating an object by calling Arrays.asList.
  - We're doing a size check every 10 puts. Not doing it seems simpler and better,
 and allows sharing some code between a single put and a list of puts.
  - We may call flushCommits on an empty write buffer, especially when someone
 uses a lot of HTable instances instead of a pool, as it's called in close().


[jira] [Commented] (HBASE-7803) Look into REST API performance


[ 
https://issues.apache.org/jira/browse/HBASE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605535#comment-13605535
 ] 

Jimmy Xiang commented on HBASE-7803:


Batch means fewer REST HTTP trips. Caching means fewer trips to region servers.
Based on the results, both kinds of round trips are performance killers, and
HTTP overhead has more impact.
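
The distinction between the two kinds of round trips can be sketched with a very simple counting model. This is an assumption-laden illustration, not a description of how the REST gateway actually works: it assumes each HTTP response carries up to `batch` rows and each region server call returns up to `caching` rows, and the class ScanTripModel and its methods are hypothetical.

```java
/** Toy model counting round trips for a scan; all parameters are illustrative assumptions. */
public class ScanTripModel {
    /** Integer ceiling division for positive operands. */
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    /** HTTP round trips between client and REST gateway, assuming `batch` rows per response. */
    static long httpTrips(long rows, long batch) {
        return ceilDiv(rows, batch);
    }

    /** Round trips between gateway and region servers, assuming `caching` rows per call. */
    static long regionServerTrips(long rows, long caching) {
        return ceilDiv(rows, caching);
    }

    public static void main(String[] args) {
        long rows = 1000;
        System.out.println("no batch, no caching: http=" + httpTrips(rows, 1)
            + " rs=" + regionServerTrips(rows, 1));
        System.out.println("batch=100, caching=100: http=" + httpTrips(rows, 100)
            + " rs=" + regionServerTrips(rows, 100));
    }
}
```

Under this model, turning on only one of the two settings still leaves the other hop with one trip per row, which is consistent with the combined setting performing far better than either alone.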

 Look into REST API performance
 --

 Key: HBASE-7803
 URL: https://issues.apache.org/jira/browse/HBASE-7803
 Project: HBase
  Issue Type: Task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-7803.patch


 I have a YCSB client using the REST API.  My testing shows that scan
 performance with the REST API is much worse than with the Java client API.  We
 need to look into it and find the root cause: either a test issue or an issue
 in our REST API.


[jira] [Reopened] (HBASE-7597) testRegionShouldNotBeDeployed seems to be flaky


 [ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reopened HBASE-7597:


  Assignee: Jimmy Xiang

It failed again:  https://builds.apache.org/job/HBase-0.95/82/
Let me reopen it and see whether I can do something about it.

 testRegionShouldNotBeDeployed seems to be flaky
 ---

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang

 I ran the entire test suite many times and it always failed on, at least,
 testRegionShouldNotBeDeployed.
 Results below. I will attach more results when the current tests are done.
 Failed tests:
   testDeleteExpiredStoreFiles(org.apache.hadoop.hbase.regionserver.TestStore):
 expected:2 but was:4
   testAcquireTaskAtStartup(org.apache.hadoop.hbase.regionserver.TestSplitLogWorker):
 Waiting timed out after [1 000] msec
   testRegionShouldNotBeDeployed(org.apache.hadoop.hbase.util.TestHBaseFsck):
 expected:[SHOULD_NOT_BE_DEPLOYED] but was:[]
   testPermissionsWatcher(org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher)


[jira] [Updated] (HBASE-8097) MetaServerShutdownHandler may potentially keep bumping up DeadServer.numProcessing


 [ 
https://issues.apache.org/jira/browse/HBASE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8097:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 MetaServerShutdownHandler may potentially keep bumping up 
 DeadServer.numProcessing
 --

 Key: HBASE-8097
 URL: https://issues.apache.org/jira/browse/HBASE-8097
 Project: HBase
  Issue Type: Bug
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.95.0, 0.98.0

 Attachments: 8097.txt, hbase-8097_1.patch, hbase-8097_v2.patch, 
 hbase-8097_v3.patch


 {code}
 } catch (IOException ioe) {
   this.services.getExecutorService().submit(this);
   this.deadServers.add(serverName);
   throw new IOException("failed log splitting for " +
   serverName + ", will retry", ioe);
 }
 {code}
 Because the handler resubmits itself, this.deadServers.add(serverName); runs
 again on every retry and will keep incrementing DeadServer.numProcessing.
 We can't get rid of numProcessing by just checking deadServers.size(), because
 deadServers is also used to report some historically failed RSs.


[jira] [Updated] (HBASE-7597) TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky


 [ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7597:
--

Summary: TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky  
(was: testRegionShouldNotBeDeployed seems to be flaky)

 TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky
 -

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang

 I ran the entire test suite many times and it always failed on, at least,
 testRegionShouldNotBeDeployed.
 Results below. I will attach more results when the current tests are done.
 Failed tests:
   testDeleteExpiredStoreFiles(org.apache.hadoop.hbase.regionserver.TestStore):
 expected:2 but was:4
   testAcquireTaskAtStartup(org.apache.hadoop.hbase.regionserver.TestSplitLogWorker):
 Waiting timed out after [1 000] msec
   testRegionShouldNotBeDeployed(org.apache.hadoop.hbase.util.TestHBaseFsck):
 expected:[SHOULD_NOT_BE_DEPLOYED] but was:[]
   testPermissionsWatcher(org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher)


[jira] [Commented] (HBASE-7568) [replication] Create an interface for replication queues


[ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605554#comment-13605554
 ] 

Ted Yu commented on HBASE-7568:
---

From https://builds.apache.org/job/PreCommit-HBASE-Build/4871/console, it 
looks like the patch doesn't compile (against hadoop 2.0, at least)

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0

 Attachments: HBASE-7568-trunk-v1.patch





[jira] [Updated] (HBASE-8128) HTable#put improvements


 [ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8128:
--

Fix Version/s: (was: 0.94.8)
   0.94.7

 HTable#put improvements
 ---

 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0, 0.94.7

 Attachments: 8128.v1.patch


 3 points:
  - When doing a single put, we're creating an object by calling Arrays.asList.
  - We're doing a size check every 10 puts. Not doing it seems simpler and better,
 and allows sharing some code between a single put and a list of puts.
  - We may call flushCommits on an empty write buffer, especially when someone
 uses a lot of HTable instances instead of a pool, as it's called in close().


[jira] [Commented] (HBASE-7679) implement store file management for stripe compactions

[
https://issues.apache.org/jira/browse/HBASE-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605572#comment-13605572
]

Sergey Shelukhin commented on HBASE-7679:
-

bq. ConcatenatedLists should have unit test.
Added.

bq. Should this define be in the Interface or do you think it implementation
specific?
bq. + public static final String BLOCKING_STOREFILES_KEY =
"hbase.hstore.blockingStoreFiles";
Not certain; having things in HStore seems to be the convention. Store isn't a
real interface that invites different implementations :)

bq. On StripeStoreFileManager, do we know if this approach has merit? Have we
run models or actual test runs and can see it saves i/o? Would be interesting
to know. Do we have to commit it to figure this out? I can see committing all
the refactorings which allow different compaction policies but would think a
compaction engine would need to have proven merit before it goes in? What you
think Sergey?
I have an integration test in HBASE-8000, but have only run it for correctness
so far.
I plan to make a bigger test for perf, and move to commit after having some
numbers.

bq. Do we have to have a L0? Can we not flush multiple files when we flush, one
per boundary in the region? Was that thought just too much work flushing?
I was concerned about many small files, and scope creep into memstore, as
discussed.
Let me do a write-up on this (probably useful anyway), and discuss on the dev
list.
After integration tests on tiny files (not a target scenario for this, but
still), I wonder whether the impact of L0 files on the number of files to be
read for gets is indeed worth it.
On the other hand, for scans and for the overall situation, a large number of
small files is not good.

implement store file management for stripe compactions
--

Key: HBASE-7679
URL: https://issues.apache.org/jira/browse/HBASE-7679
Project: HBase
Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Attachments: HBASE-7667-and-7603-v0-incomplete.patch,
HBASE-7667-and-7603-v0-incomplete.patch, HBASE-7667-and-7603-v1.patch,
HBASE-7667-and-7603-v1.patch, HBASE-7667-v1.patch, HBASE-7667-v1.patch,
HBASE-7667-v2.patch, HBASE-7667-v2.patch, HBASE-7667-v3.patch,
HBASE-7679-v4.patch, HBASE-7679-v5.patch, HBASE-7679-v6.patch,
HBASE-7679-v7-.patch, HBASE-7679-v7.patch, HBASE-7679-v8.patch,
HBASE-7679-v9.patch


[jira] [Updated] (HBASE-7679) implement store file management for stripe compactions


 [ 
https://issues.apache.org/jira/browse/HBASE-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7679:


Attachment: HBASE-7679-v10.patch

updated patch

 implement store file management for stripe compactions
 --

 Key: HBASE-7679
 URL: https://issues.apache.org/jira/browse/HBASE-7679
 Project: HBase
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7667-and-7603-v0-incomplete.patch, 
 HBASE-7667-and-7603-v0-incomplete.patch, HBASE-7667-and-7603-v1.patch, 
 HBASE-7667-and-7603-v1.patch, HBASE-7667-v1.patch, HBASE-7667-v1.patch, 
 HBASE-7667-v2.patch, HBASE-7667-v2.patch, HBASE-7667-v3.patch, 
 HBASE-7679-v10.patch, HBASE-7679-v4.patch, HBASE-7679-v5.patch, 
 HBASE-7679-v6.patch, HBASE-7679-v7-.patch, HBASE-7679-v7.patch, 
 HBASE-7679-v8.patch, HBASE-7679-v9.patch





[jira] [Commented] (HBASE-8127) Region of a disabling or disabled table could be stuck in transition state when RS dies during Master initialization

2013-03-18 Thread rajeshbabu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605582#comment-13605582
 ] 

rajeshbabu commented on HBASE-8127:
---

bq. One time when I saw the opening RIT stuck is due to the 
offlineDisabledRegion function in assignment manager. As you can see we don't 
handle opening RIT inside the function.

If I am not wrong, the HBASE-7824 patch was applied at that time, right?

One problem I suspect with the HBASE-7824 patch is:
{code}
+if (preMetaServer != null && failedServers.contains(preMetaServer)) {
+  // create recovered edits file for .META. server
+  this.fileSystemManager.splitLog(preMetaServer);
+  failedServers.remove(preMetaServer);
+}
{code}

If an RS carrying ROOT or META went down, we are not calling SSH for that RS
(not even adding it to deadservers). We are handling regions in transition to
the dead server through processRegionsInTransitions, which can leave a RIT
stuck in the OPENING state. If the znode is in RS_ZK_REGION_OPENING state, then
we will just add it to RIT and wait for TM to handle it.
{code}
   regionsInTransition.put(encodedRegionName, new RegionState(regionInfo,
RegionState.State.OPENING, data.getStamp(), data.getOrigin()));
failoverProcessedRegions.put(encodedRegionName, regionInfo);
{code}
Whenever TM handles it, we will assign; in that case the RIT can get stuck
because it sees the table as DISABLING/DISABLED. If the RS is really ALIVE this
case won't happen, because unassign will be called after assignment.

For the HBASE-7824 patch we can make the change below, which avoids a RIT
getting stuck in the OPENING state.
If the meta RS is down before/during master restart, we can add it to
deadservers and start SSH, passing shouldSplitHlog as false because the logs
have already been split.
{code}
this.deadservers.add(serverName);
this.services.getExecutorService().submit(
  new ServerShutdownHandler(this.master, this.services, this.deadservers, 
serverName, false));
{code}

Anyway, the actual problem you have given in the description can be handled on
the SSH side. I am working on it.

One more thing about your feedback patch:
{code}
+// delete RITs if exists in any state of disabling or disabled tables
+// during master starts up
+if (!hri.isMetaTable()) {
+  String tableName = hri.getTableNameAsString();
+  boolean disabled = this.zkTable.isDisabledTable(tableName);
+  if (disabled || this.zkTable.isDisablingTable(tableName)) {
+ZKAssign.deleteNodeFailSilent(watcher, hri);
+regionOffline(hri);
+continue;
+  }
+}
{code}

We don't know whether the DISABLING table's region is already closed on the RS,
so we should not offline the region directly. In SSH we can do that because the
RS has gone down.


 Region of a disabling or disabled table could be stuck in transition state 
 when RS dies during Master initialization
 

 Key: HBASE-8127
 URL: https://issues.apache.org/jira/browse/HBASE-8127
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.5
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: HBASE-8127_feedback.patch, HBASE-8127.patch, 
 hbase-8127_v1.patch, reproduce-hang.patch


 The issue happens when an RS dies while a master is starting up. After the RS
 reports open to the new master instance and dies immediately thereafter, the
 RITs of disabling (or disabled) tables on the dead RS will stay in RIT
 state forever.
 I attached a patch to simulate the situation, and you can run the following
 command to reproduce the issue:
 {code}mvn test -PlocalTests 
 -Dtest=TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS{code}
 Basically, we skip regions of a dead server inside
 AM.processDeadServersAndRecoverLostRegions, as in the following code, and rely
 on SSH to process those skipped regions:
 {code}
   for (Pair<HRegionInfo, Result> deadRegion : deadServer.getValue()) {
 nodes.remove(deadRegion.getFirst().getEncodedName());
   }
 {code} 
 While in SSH, we skip regions of disabling (or disabled) tables again in
 function processDeadRegion. The end result is that RITs of disabling (or
 disabled) tables are stuck there forever.
  


[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk (with changes)


[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605583#comment-13605583
 ] 

Sergey Shelukhin commented on HBASE-7055:
-

Ping?

 port HBASE-6371 tier-based compaction from 0.89-fb to trunk (with changes)
 --

 Key: HBASE-7055
 URL: https://issues.apache.org/jira/browse/HBASE-7055
 Project: HBase
  Issue Type: Task
  Components: Compaction
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.95.0

 Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
 HBASE-6371-v3-refactor-only-squashed.patch, 
 HBASE-6371-v4-refactor-only-squashed.patch, 
 HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, 
 HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, 
 HBASE-7055-v4.patch, HBASE-7055-v5.patch, HBASE-7055-v6.patch, 
 HBASE-7055-v7.patch, HBASE-7055-v7.patch


 See HBASE-6371 for details.


[jira] [Commented] (HBASE-7295) Contention in HBaseClient.getConnection


[ 
https://issues.apache.org/jira/browse/HBASE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605586#comment-13605586
 ] 

Ted Yu commented on HBASE-7295:
---

I ran TestRowProcessorEndpoint with trunk patch v4 and it passed.

+1

 Contention in HBaseClient.getConnection
 ---

 Key: HBASE-7295
 URL: https://issues.apache.org/jira/browse/HBASE-7295
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.3
Reporter: Varun Sharma
Assignee: Varun Sharma
 Fix For: 0.95.0

 Attachments: 7295-0.94.txt, 7295-0.94-v2.txt, 7295-0.94-v3.txt, 
 7295-0.94-v4.txt, 7295-0.94-v5.txt, 7295-trunk.txt, 7295-trunk.txt, 
 7295-trunk-v2.txt, 7295-trunk-v3.txt, 7295-trunk-v3.txt, 7295-trunk-v4.txt


 HBaseClient.getConnection() synchronizes on the connections object. We found
 severe contention on a thrift gateway which was fanning out roughly 3000+
 calls per second to hbase region servers. The thrift gateway had 2000+
 threads for handling incoming connections. Threads were blocked on the
 synchronized block - we set ipc.pool.size to 200. Since we are using the
 RoundRobin/ThreadLocal pool only, it's not necessary to synchronize on
 connections - it might lead to cases where we go slightly over the
 ipc.max.pool.size() but the additional connections would time out after
 maxIdleTime - the underlying PoolMap connections object is thread safe.
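
 The trade-off described above (dropping the lock at the cost of occasionally creating an extra connection) can be sketched with a plain ConcurrentHashMap. This is a minimal illustration under stated assumptions, not the HBaseClient code or any of the attached patches: ConnectionCacheSketch, its nested Connection class, and the string remote id are hypothetical, and the real PoolMap is replaced by a simple one-connection-per-key map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical lock-free connection lookup; not the HBaseClient implementation. */
public class ConnectionCacheSketch {
    /** Placeholder for a real connection object. */
    static final class Connection {
        final String remoteId;
        Connection(String remoteId) { this.remoteId = remoteId; }
    }

    // Thread-safe map, so lookups do not need an enclosing synchronized block.
    private final ConcurrentMap<String, Connection> connections = new ConcurrentHashMap<>();
    // Counts every Connection object constructed, including ones that lose a race.
    private final AtomicInteger created = new AtomicInteger();

    public Connection getConnection(String remoteId) {
        Connection existing = connections.get(remoteId);
        if (existing != null) {
            return existing; // fast path: no lock taken
        }
        // Two threads may both reach this point and both construct a connection.
        Connection fresh = new Connection(remoteId);
        created.incrementAndGet();
        Connection raced = connections.putIfAbsent(remoteId, fresh);
        // Only one instance is published; the loser's object is simply dropped here.
        return raced != null ? raced : fresh;
    }

    public int createdCount() { return created.get(); }
    public int cachedCount() { return connections.size(); }

    public static void main(String[] args) {
        ConnectionCacheSketch cache = new ConnectionCacheSketch();
        Connection a = cache.getConnection("rs1:60020");
        Connection b = cache.getConnection("rs1:60020");
        System.out.println("same instance: " + (a == b));
        System.out.println("created: " + cache.createdCount());
    }
}
```

 Under heavy concurrency the created count can exceed the cached count by a small amount, which mirrors the "slightly over the pool size" caveat in the description.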


[jira] [Commented] (HBASE-7992) provide pre/post region offline hooks for HMaster.offlineRegion()

2013-03-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605590#comment-13605590
 ] 

Hudson commented on HBASE-7992:
---

Integrated in HBase-TRUNK #3969 (See 
[https://builds.apache.org/job/HBase-TRUNK/3969/])
HBASE-7992 provide pre/post region offline hooks for 
HMaster.offlineRegion() (Rajeshbabu) (Revision 1457854)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java


 provide pre/post region offline hooks for HMaster.offlineRegion()
 -

 Key: HBASE-7992
 URL: https://issues.apache.org/jira/browse/HBASE-7992
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors
Affects Versions: 0.95.0
Reporter: rajeshbabu
Assignee: rajeshbabu
 Fix For: 0.98.0

 Attachments: 7992_trunk_3.patch, HBASE-7992_trunk_2.patch, 
 HBASE-7992_trunk.patch


 Presently there are no hooks to provide access control for offlining a region in the master.
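
 The idea of pre/post hooks used for access control can be sketched as below. This is a simplified, self-contained illustration and not the HBase coprocessor API: the RegionOfflineObserver interface, its two-string method signatures, and the AdminOnly checker are hypothetical stand-ins chosen for brevity. The pre hook can veto the operation by throwing, and the post hook runs only after the region has been taken offline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy master with pre/post hooks around offlining a region; names are hypothetical. */
public class OfflineHookSketch {
    /** Simplified observer; a real coprocessor API would take richer arguments. */
    public interface RegionOfflineObserver {
        void preRegionOffline(String user, String regionName);
        void postRegionOffline(String user, String regionName);
    }

    /** Toy access controller: only users in the admin set may offline a region. */
    public static class AdminOnly implements RegionOfflineObserver {
        private final Set<String> admins;
        public final List<String> audit = new ArrayList<>();

        public AdminOnly(Set<String> admins) { this.admins = admins; }

        @Override
        public void preRegionOffline(String user, String regionName) {
            if (!admins.contains(user)) {
                throw new SecurityException(user + " may not offline " + regionName);
            }
        }

        @Override
        public void postRegionOffline(String user, String regionName) {
            audit.add(user + " offlined " + regionName);
        }
    }

    private final List<RegionOfflineObserver> observers = new ArrayList<>();
    private final Set<String> offline = new HashSet<>();

    public void register(RegionOfflineObserver observer) { observers.add(observer); }

    public void offlineRegion(String user, String regionName) {
        // Pre hooks run first; an exception here aborts the operation.
        for (RegionOfflineObserver o : observers) o.preRegionOffline(user, regionName);
        offline.add(regionName);
        // Post hooks run only after the region is marked offline.
        for (RegionOfflineObserver o : observers) o.postRegionOffline(user, regionName);
    }

    public boolean isOffline(String regionName) { return offline.contains(regionName); }

    public static void main(String[] args) {
        OfflineHookSketch master = new OfflineHookSketch();
        master.register(new AdminOnly(new HashSet<>(Arrays.asList("admin"))));
        master.offlineRegion("admin", "region-a");
        try {
            master.offlineRegion("guest", "region-b");
        } catch (SecurityException e) {
            System.out.println("denied: " + e.getMessage());
        }
        System.out.println("region-a offline: " + master.isOffline("region-a"));
        System.out.println("region-b offline: " + master.isOffline("region-b"));
    }
}
```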


[jira] [Commented] (HBASE-7679) implement store file management for stripe compactions


[ 
https://issues.apache.org/jira/browse/HBASE-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605592#comment-13605592
 ] 

stack commented on HBASE-7679:
--

Agreed, let's get numbers before saying L0 is bad.  Ditto, get numbers before
commit, and yes, a write-up would be helpful.  Smarter compaction could make for
big wins all around, Sergey.  Thanks for persisting.

 implement store file management for stripe compactions
 --

 Key: HBASE-7679
 URL: https://issues.apache.org/jira/browse/HBASE-7679
 Project: HBase
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7667-and-7603-v0-incomplete.patch, 
 HBASE-7667-and-7603-v0-incomplete.patch, HBASE-7667-and-7603-v1.patch, 
 HBASE-7667-and-7603-v1.patch, HBASE-7667-v1.patch, HBASE-7667-v1.patch, 
 HBASE-7667-v2.patch, HBASE-7667-v2.patch, HBASE-7667-v3.patch, 
 HBASE-7679-v10.patch, HBASE-7679-v4.patch, HBASE-7679-v5.patch, 
 HBASE-7679-v6.patch, HBASE-7679-v7-.patch, HBASE-7679-v7.patch, 
 HBASE-7679-v8.patch, HBASE-7679-v9.patch





[jira] [Commented] (HBASE-7481) throw IOExceptions from Filter methods?


[ 
https://issues.apache.org/jira/browse/HBASE-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605593#comment-13605593
 ] 

Hadoop QA commented on HBASE-7481:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12574069/HBASE-7481-1.0.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4869//console

This message is automatically generated.

 throw IOExceptions from Filter methods?
 ---

 Key: HBASE-7481
 URL: https://issues.apache.org/jira/browse/HBASE-7481
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Fix For: 0.95.0, 0.98.0

 Attachments: HBASE-7481-1.0.txt


 Currently there is no way to throw custom IOExceptions from any of the filter 
 methods.
 For implementers of custom filters that presents a problem.
 For example there are scenarios where the filter would want to indicate to 
 the client that it should not retry. Currently there is no way of doing 
 that.
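
 The scenario above can be sketched in plain Java. This is only a minimal
 illustration under assumptions, not the HBase Filter API: RowFilter,
 NoRetryException and FilterRetryDemo are hypothetical names invented here,
 and the sketch assumes filter methods were allowed to declare IOException.

```java
import java.io.IOException;

/** Hypothetical marker exception: the caller should give up instead of retrying. */
class NoRetryException extends IOException {
    NoRetryException(String message) {
        super(message);
    }
}

/** Hypothetical filter contract whose method is allowed to throw IOException. */
interface RowFilter {
    boolean accept(String row) throws IOException;
}

class FilterRetryDemo {
    /**
     * Applies the filter, retrying ordinary IOExceptions up to maxAttempts times,
     * but rethrowing NoRetryException immediately.
     */
    static boolean acceptWithRetries(RowFilter filter, String row, int maxAttempts)
            throws IOException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return filter.accept(row);
            } catch (NoRetryException e) {
                throw e; // the filter asked the caller not to retry
            } catch (IOException e) {
                last = e; // treated as transient: try again
            }
        }
        throw last != null ? last : new IOException("no attempts were made");
    }
}
```

 With such a contract a custom filter could fail fast on a condition that a
 retry cannot fix, while transient failures would still be retried.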


[jira] [Commented] (HBASE-8067) TestHFileArchiving.testArchiveOnTableDelete sometimes fails


[ 
https://issues.apache.org/jira/browse/HBASE-8067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605596#comment-13605596
 ] 

Ted Yu commented on HBASE-8067:
---

Looks like this test failed again in trunk build #3969

 TestHFileArchiving.testArchiveOnTableDelete sometimes fails
 ---

 Key: HBASE-8067
 URL: https://issues.apache.org/jira/browse/HBASE-8067
 Project: HBase
  Issue Type: Bug
  Components: Admin, master, test
Affects Versions: 0.96.0, 0.94.6
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.95.0, 0.94.7

 Attachments: HBASE-8067-debug.patch, HBASE-8067-v0.patch


 It seems that testArchiveOnTableDelete() fails because the archiving in 
 DeleteTableHandler is still in progress when admin.deleteTable() returns.
 {code}
 Error Message
 Archived files are missing some of the store files!
 Stacktrace
 java.lang.AssertionError: Archived files are missing some of the store files!
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.assertTrue(Assert.java:41)
   at 
 org.apache.hadoop.hbase.backup.TestHFileArchiving.testArchiveOnTableDelete(TestHFileArchiving.java:262)
 {code}
 (Looking at the problem in a more generic way, we don't have any way to 
 inform the client when an async operation is completed)
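
 Without a completion notification, one common workaround in a test is to
 poll for the expected state with a timeout instead of asserting right after
 the call returns. The helper below is a hypothetical sketch (PollingWait and
 waitFor are names made up for this example and are not taken from the
 attached patches):

```java
import java.util.function.BooleanSupplier;

class PollingWait {
    /**
     * Polls the condition until it becomes true or the timeout elapses.
     * Returns true if the condition was observed to be true, false on timeout.
     */
    static boolean waitFor(long timeoutMillis, long intervalMillis, BooleanSupplier condition)
            throws InterruptedException {
        final long timeoutNanos = timeoutMillis * 1_000_000L;
        final long start = System.nanoTime();
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - start >= timeoutNanos) {
                return false; // gave up: the async operation did not finish in time
            }
            Thread.sleep(intervalMillis);
        }
        return true;
    }
}
```

 A test could then wait for "all store files are archived" to become true
 before asserting, which tolerates archiving that finishes shortly after
 deleteTable() returns.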


[jira] [Commented] (HBASE-7568) [replication] Create an interface for replication queues


[ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605603#comment-13605603
 ] 

Chris Trezzo commented on HBASE-7568:
-

Hmm, it compiled locally. Will investigate. Thanks Ted.

Chris

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0

 Attachments: HBASE-7568-trunk-v1.patch





[jira] [Commented] (HBASE-7803) Look into REST API performance


[ 
https://issues.apache.org/jira/browse/HBASE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605605#comment-13605605
 ] 

Hadoop QA commented on HBASE-7803:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12574197/trunk-7803.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4870//console

This message is automatically generated.

 Look into REST API performance
 --

 Key: HBASE-7803
 URL: https://issues.apache.org/jira/browse/HBASE-7803
 Project: HBase
  Issue Type: Task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-7803.patch


 I have a YCSB client using the REST API.  My testing shows the performance 
 for scan with REST API is much worse than that with the java client API.  We 
 need to look into it and find out the root cause: either an issue with the 
 test or an issue with our REST API.


[jira] [Commented] (HBASE-7568) [replication] Create an interface for replication queues


[ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605609#comment-13605609
 ] 

Chris Trezzo commented on HBASE-7568:
-

Woops. Posted the wrong file when I renamed it.

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0





[jira] [Updated] (HBASE-7568) [replication] Create an interface for replication queues


 [ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated HBASE-7568:


Attachment: (was: HBASE-7568-trunk-v1.patch)

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0





[jira] [Commented] (HBASE-8127) Region of a disabling or disabled table could be stuck in transition state when RS dies during Master initialization

2013-03-18 Thread Jeffrey Zhong (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605616#comment-13605616
 ] 

Jeffrey Zhong commented on HBASE-8127:
--

[~rajesh23] Thanks for the detailed comments.

{quote}
If I am not wrong HBASE-7824 patch applied at that time right?
{quote}
No. Actually with the {code}failedServers.remove(preMetaServer);{code} we don't 
see any issue at all. The only problem is when we have non-empty dead servers, 
which are simulated by the reproduce-hang patch.

Anyway, the opening RIT of the disabled table that causes the issue is on the 
live RS, not the one that dies (or is aborted) in the test. So the changes in 
SSH should not have any impact IMHO. 



 Region of a disabling or disabled table could be stuck in transition state 
 when RS dies during Master initialization
 

 Key: HBASE-8127
 URL: https://issues.apache.org/jira/browse/HBASE-8127
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.5
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: HBASE-8127_feedback.patch, HBASE-8127.patch, 
 hbase-8127_v1.patch, reproduce-hang.patch


 The issue happens when an RS dies while a master is starting up. After the RS 
 reports open to the new master instance and dies immediately thereafter, the 
 RITs of disabling (or disabled) tables on the dead RS will stay in RIT 
 state forever.
 I attached a patch to simulate the situation and you can run the following 
 command to reproduce the issue:
 {code}mvn test -PlocalTests 
 -Dtest=TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS{code}
 Basically, we skip regions of a dead server inside 
 AM.processDeadServersAndRecoverLostRegions, as in the following code, and rely 
 on SSH to process those skipped regions:
 {code}
   for (Pair<HRegionInfo, Result> deadRegion : deadServer.getValue()) {
     nodes.remove(deadRegion.getFirst().getEncodedName());
   }
 {code} 
 While in SSH, we skip regions of disabling (or disabled) tables again in the 
 function processDeadRegion. The end result is that RITs of disabling (or 
 disabled) tables are stuck there forever.
  


[jira] [Commented] (HBASE-7295) Contention in HBaseClient.getConnection


[ 
https://issues.apache.org/jira/browse/HBASE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605621#comment-13605621
 ] 

Lars Hofhansl commented on HBASE-7295:
--

I know we went through this before, but just making the PoolMap volatile does 
not make the implementation thread safe.
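
The general point can be illustrated with a small sketch. It is not the 
HBaseClient or PoolMap code; FakeConnection, VolatileMapCache and 
ConcurrentCache are hypothetical names. A volatile field only makes the 
reference to the map visible across threads; it does not make a 
check-then-create sequence on the map atomic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an expensive connection object. */
class FakeConnection {
    static final AtomicInteger CREATED = new AtomicInteger();
    final String remote;

    FakeConnection(String remote) {
        this.remote = remote;
        CREATED.incrementAndGet();
    }
}

/** NOT thread safe: volatile publishes the reference, not the map's contents. */
class VolatileMapCache {
    private volatile Map<String, FakeConnection> connections = new HashMap<>();

    FakeConnection get(String remote) {
        FakeConnection c = connections.get(remote); // check ...
        if (c == null) {
            c = new FakeConnection(remote);
            connections.put(remote, c); // ... then act: two threads can both get here
        }
        return c;
    }
}

/** Thread safe without one global lock: creation is atomic per key. */
class ConcurrentCache {
    private final ConcurrentHashMap<String, FakeConnection> connections =
            new ConcurrentHashMap<>();

    FakeConnection get(String remote) {
        return connections.computeIfAbsent(remote, FakeConnection::new);
    }
}
```

Whether an approach like the second class fits the pool semantics discussed 
in this issue is a separate question; the sketch only shows why volatile 
alone is not enough.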


 Contention in HBaseClient.getConnection
 ---

 Key: HBASE-7295
 URL: https://issues.apache.org/jira/browse/HBASE-7295
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.3
Reporter: Varun Sharma
Assignee: Varun Sharma
 Fix For: 0.95.0

 Attachments: 7295-0.94.txt, 7295-0.94-v2.txt, 7295-0.94-v3.txt, 
 7295-0.94-v4.txt, 7295-0.94-v5.txt, 7295-trunk.txt, 7295-trunk.txt, 
 7295-trunk-v2.txt, 7295-trunk-v3.txt, 7295-trunk-v3.txt, 7295-trunk-v4.txt


 HBaseClient.getConnection() synchronizes on the connections object. We found 
 severe contention on a thrift gateway which was fanning out roughly 3000+ 
 calls per second to hbase region servers. The thrift gateway had 2000+ 
 threads for handling incoming connections. Threads were blocked on the 
 synchronized block - we set ipc.pool.size to 200. Since we are using 
 RoundRobin/ThreadLocal pool only - it's not necessary to synchronize on 
 connections - it might lead to cases where we might go slightly over the 
 ipc.max.pool.size() but the additional connections would timeout after 
 maxIdleTime - underlying PoolMap connections object is thread safe.


[jira] [Commented] (HBASE-6915) String and ConcurrentHashMap sizes change on jdk7; makes TestHeapSize fail


[ 
https://issues.apache.org/jira/browse/HBASE-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605631#comment-13605631
 ] 

Lars Hofhansl commented on HBASE-6915:
--

+1 for 0.94 as well.

 String and ConcurrentHashMap sizes change on jdk7; makes TestHeapSize fail
 --

 Key: HBASE-6915
 URL: https://issues.apache.org/jira/browse/HBASE-6915
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.95.0

 Attachments: jdk7.txt





[jira] [Commented] (HBASE-8014) Backport HBASE-6915 to 0.94.


[ 
https://issues.apache.org/jira/browse/HBASE-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605633#comment-13605633
 ] 

Ted Yu commented on HBASE-8014:
---

Here is Lars' confirmation: 
https://issues.apache.org/jira/browse/HBASE-6915?focusedCommentId=13605631&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13605631

 Backport HBASE-6915 to 0.94.
 

 Key: HBASE-8014
 URL: https://issues.apache.org/jira/browse/HBASE-8014
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jean-Marc Spaggiari
Priority: Critical
 Attachments: HBASE-8014-v0-0.94.patch


 JDK 1.7 changed some data sizes. The goal of this JIRA is to backport 
 HBASE-6915 to 0.94.


[jira] [Created] (HBASE-8141) Remove accidental uses of org.mortbay.log

Andrew Purtell created HBASE-8141:
-

 Summary: Remove accidental uses of org.mortbay.log
 Key: HBASE-8141
 URL: https://issues.apache.org/jira/browse/HBASE-8141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.0, 0.96.0, 0.94.6
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial


Remove accidental uses of org.mortbay.log.Log. Eclipse autocomplete is probably 
the culprit.


[jira] [Commented] (HBASE-8014) Backport HBASE-6915 to 0.94.


[ 
https://issues.apache.org/jira/browse/HBASE-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605639#comment-13605639
 ] 

Ted Yu commented on HBASE-8014:
---

Integrated to 0.94

Thanks for the patch, Jean-Marc.

Thanks for the confirmation, Lars.

 Backport HBASE-6915 to 0.94.
 

 Key: HBASE-8014
 URL: https://issues.apache.org/jira/browse/HBASE-8014
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jean-Marc Spaggiari
Priority: Critical
 Attachments: HBASE-8014-v0-0.94.patch


 JDK 1.7 changed some data sizes. The goal of this JIRA is to backport 
 HBASE-6915 to 0.94.


[jira] [Updated] (HBASE-8014) Backport HBASE-6915 to 0.94.


 [ 
https://issues.apache.org/jira/browse/HBASE-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8014:
--

Fix Version/s: 0.94.7

 Backport HBASE-6915 to 0.94.
 

 Key: HBASE-8014
 URL: https://issues.apache.org/jira/browse/HBASE-8014
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jean-Marc Spaggiari
Priority: Critical
 Fix For: 0.94.7

 Attachments: HBASE-8014-v0-0.94.patch


 JDK 1.7 changed some data sizes. The goal of this JIRA is to backport 
 HBASE-6915 to 0.94.


[jira] [Updated] (HBASE-8141) Remove accidental uses of org.mortbay.log.Log


 [ 
https://issues.apache.org/jira/browse/HBASE-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-8141:
--

Summary: Remove accidental uses of org.mortbay.log.Log  (was: Remove 
accidental uses of org.mortbay.log)

 Remove accidental uses of org.mortbay.log.Log
 -

 Key: HBASE-8141
 URL: https://issues.apache.org/jira/browse/HBASE-8141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.0, 0.96.0, 0.94.6
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial

 Remove accidental uses of org.mortbay.log.Log. Eclipse autocomplete is 
 probably the culprit.


[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk (with changes)


[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605644#comment-13605644
 ] 

Ted Yu commented on HBASE-7055:
---

I am going over the patch.

Can you update the Release Notes?
There're a lot of config parameters introduced in this patch.

 port HBASE-6371 tier-based compaction from 0.89-fb to trunk (with changes)
 --

 Key: HBASE-7055
 URL: https://issues.apache.org/jira/browse/HBASE-7055
 Project: HBase
  Issue Type: Task
  Components: Compaction
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.95.0

 Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
 HBASE-6371-v3-refactor-only-squashed.patch, 
 HBASE-6371-v4-refactor-only-squashed.patch, 
 HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, 
 HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, 
 HBASE-7055-v4.patch, HBASE-7055-v5.patch, HBASE-7055-v6.patch, 
 HBASE-7055-v7.patch, HBASE-7055-v7.patch


 See HBASE-6371 for details.


[jira] [Resolved] (HBASE-8141) Remove accidental uses of org.mortbay.log.Log


 [ 
https://issues.apache.org/jira/browse/HBASE-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-8141.
---

   Resolution: Fixed
Fix Version/s: 0.94.6
   0.96.0
   0.95.0

 Remove accidental uses of org.mortbay.log.Log
 -

 Key: HBASE-8141
 URL: https://issues.apache.org/jira/browse/HBASE-8141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.0, 0.96.0, 0.94.6
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 0.95.0, 0.96.0, 0.94.6

 Attachments: 8141-0.94.patch, 8141-trunk.patch


 Remove accidental uses of org.mortbay.log.Log. Eclipse autocomplete is 
 probably the culprit.


[jira] [Updated] (HBASE-8141) Remove accidental uses of org.mortbay.log.Log


 [ 
https://issues.apache.org/jira/browse/HBASE-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-8141:
--

Attachment: 8141-0.94.patch
8141-trunk.patch

Trivial patches committed.

 Remove accidental uses of org.mortbay.log.Log
 -

 Key: HBASE-8141
 URL: https://issues.apache.org/jira/browse/HBASE-8141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.0, 0.96.0, 0.94.6
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 0.95.0, 0.96.0, 0.94.6

 Attachments: 8141-0.94.patch, 8141-trunk.patch


 Remove accidental uses of org.mortbay.log.Log. Eclipse autocomplete is 
 probably the culprit.


[jira] [Commented] (HBASE-7295) Contention in HBaseClient.getConnection

2013-03-18 Thread Varun Sharma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605664#comment-13605664
 ] 

Varun Sharma commented on HBASE-7295:
-

Lars,

I may be forgetting, but is it because of the edge cases with PoolMap thread 
safety, the Connection object thread safety, or the double-checked locking 
issue in general?

Thanks
Varun

 Contention in HBaseClient.getConnection
 ---

 Key: HBASE-7295
 URL: https://issues.apache.org/jira/browse/HBASE-7295
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.3
Reporter: Varun Sharma
Assignee: Varun Sharma
 Fix For: 0.95.0

 Attachments: 7295-0.94.txt, 7295-0.94-v2.txt, 7295-0.94-v3.txt, 
 7295-0.94-v4.txt, 7295-0.94-v5.txt, 7295-trunk.txt, 7295-trunk.txt, 
 7295-trunk-v2.txt, 7295-trunk-v3.txt, 7295-trunk-v3.txt, 7295-trunk-v4.txt


 HBaseClient.getConnection() synchronizes on the connections object. We found 
 severe contention on a thrift gateway which was fanning out roughly 3000+ 
 calls per second to hbase region servers. The thrift gateway had 2000+ 
 threads for handling incoming connections. Threads were blocked on the 
 synchronized block - we set ipc.pool.size to 200. Since we are using 
 RoundRobin/ThreadLocal pool only - it's not necessary to synchronize on 
 connections - it might lead to cases where we might go slightly over the 
 ipc.max.pool.size() but the additional connections would timeout after 
 maxIdleTime - underlying PoolMap connections object is thread safe.


[jira] [Commented] (HBASE-7679) implement store file management for stripe compactions

[
https://issues.apache.org/jira/browse/HBASE-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605678#comment-13605678
]

Hadoop QA commented on HBASE-7679:
--

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12574205/HBASE-7679-v10.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 9 new
or modified tests.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop
2.0 profile.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any
warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100

{color:red}-1 site{color}. The patch appears to cause mvn site goal to
fail.

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/4872//console

This message is automatically generated.

implement store file management for stripe compactions
--

Key: HBASE-7679
URL: https://issues.apache.org/jira/browse/HBASE-7679
Project: HBase
Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Attachments: HBASE-7667-and-7603-v0-incomplete.patch,
HBASE-7667-and-7603-v0-incomplete.patch, HBASE-7667-and-7603-v1.patch,
HBASE-7667-and-7603-v1.patch, HBASE-7667-v1.patch, HBASE-7667-v1.patch,
HBASE-7667-v2.patch, HBASE-7667-v2.patch, HBASE-7667-v3.patch,
HBASE-7679-v10.patch, HBASE-7679-v4.patch, HBASE-7679-v5.patch,
HBASE-7679-v6.patch, HBASE-7679-v7-.patch, HBASE-7679-v7.patch,
HBASE-7679-v8.patch, HBASE-7679-v9.patch

[jira] [Updated] (HBASE-8108) Add m2eclispe lifecycle mapping to hbase-common

2013-03-18 Thread Jesse Yates (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-8108:
---

Summary: Add m2eclispe lifecycle mapping to hbase-common  (was: Add 
m2eclispe lifecycle mapping to hbase-commn)

 Add m2eclispe lifecycle mapping to hbase-common
 ---

 Key: HBASE-8108
 URL: https://issues.apache.org/jira/browse/HBASE-8108
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.95.0, 0.98.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: hbase-8108.patch, hbase-8108-v2.patch


 The maven-antrun-plugin execution doesn't have a default mapping in 
 m2eclipse, so if you import the project into eclipse, you will get an error 
 that the mapping is undefined. All that's needed is to define an execution 
 via the org.eclipse.m2 lifecycle-mapping plugin - it doesn't actually affect 
 the usual maven build at all.


[jira] [Resolved] (HBASE-8108) Add m2eclispe lifecycle mapping to hbase-common

2013-03-18 Thread Jesse Yates (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates resolved HBASE-8108.


   Resolution: Fixed
Fix Version/s: 0.98.0
   0.95.0

committed to trunk and 0.95. Thanks for the reviews!

 Add m2eclispe lifecycle mapping to hbase-common
 ---

 Key: HBASE-8108
 URL: https://issues.apache.org/jira/browse/HBASE-8108
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.95.0, 0.98.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.95.0, 0.98.0

 Attachments: hbase-8108.patch, hbase-8108-v2.patch


 The maven-antrun-plugin execution doesn't have a default mapping in 
 m2eclipse, so if you import the project into eclipse, you will get an error 
 that the mapping is undefined. All that's needed is to define an execution 
 via the org.eclipse.m2 lifecycle-mapping plugin - it doesn't actually affect 
 the usual maven build at all.


[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients

[
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

nkeywal updated HBASE-7590:
---

Status: Open (was: Patch Available)

Add a costless notifications mechanism from master to regionservers clients
-

Key: HBASE-7590
URL: https://issues.apache.org/jira/browse/HBASE-7590
Project: HBase
Issue Type: Bug
Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch,
7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch,
7590.v3.patch, 7590.v5.patch, 7590.v5.patch

It would be very useful to add a mechanism to distribute some information to
the clients and regionservers. In particular, it would be useful to know globally
(regionservers + client apps) that some regionservers are dead. This would
allow:
- lowering the load on the system, since clients would stop using stale
information and going to dead machines
- making recovery faster from a client's point of view. It's common to use
large timeouts on the client side, so the client may need a lot of time
before declaring a region server dead and trying another one. If the client
receives information about a region server's state separately, it can make
the right decision and continue or stop waiting accordingly.
We can also send more information, for example instructions like 'slow down'
to tell the client to increase the retry delay and so on.
Technically, the master could send this information. To lower the load on
the system, we should:
- use multicast communication (i.e. the master does not have to connect to
all servers by TCP), with one packet every 10 seconds or so.
- make sure receivers do not depend on it: if the information is available,
great. If not, it should not break anything.
- make it optional.
So in the end we would have a thread in the master sending a protobuf message
about the dead servers on a multicast socket. If the socket is not
configured, it does nothing. On the client side, when we receive
information that a node is dead, we refresh the cache for it.
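
To make the client-side half of this concrete, here is a minimal sketch of what "refresh the cache" could look like when a dead-server notification arrives. It is only an illustration: RegionLocationCache, onDeadServers and the string-based region and server names are hypothetical and are not taken from the patch or from the HBase client. The multicast transport and protobuf decoding are deliberately left out.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side location cache; not HBase's actual class. */
class RegionLocationCache {
    // region name -> server currently believed to host it
    private final Map<String, String> locations = new ConcurrentHashMap<>();

    void put(String region, String server) {
        locations.put(region, server);
    }

    /** Returns the cached server, or null if the region must be re-resolved. */
    String lookup(String region) {
        return locations.get(region);
    }

    /**
     * Called when a best-effort dead-server notification arrives.
     * Drops every cached entry that points at a dead server, so the next
     * lookup misses and the client goes back to the authoritative source.
     * The returned count is approximate if other threads modify the cache
     * at the same time.
     */
    int onDeadServers(Set<String> deadServers) {
        int before = locations.size();
        locations.values().removeIf(deadServers::contains);
        return before - locations.size();
    }
}
```

If no notification ever arrives, the cache simply behaves as it does today, which matches the requirement that receivers must not depend on the mechanism.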

[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients

[
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

nkeywal updated HBASE-7590:
---

Attachment: 7590.v13.patch

Add a costless notifications mechanism from master to regionservers clients
-

[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients

[
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605698#comment-13605698
]

nkeywal commented on HBASE-7590:

Maybe 13 is going to be my lucky number :-) ?

Add a costless notifications mechanism from master to regionservers clients
-

[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients

[
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

nkeywal updated HBASE-7590:
---

Status: Patch Available (was: Open)

Add a costless notifications mechanism from master to regionservers clients
-

[jira] [Commented] (HBASE-7965) Port table locking to 0.94 (HBASE-7305, HBASE-7546, HBASE-7933)

2013-03-18 Thread Jonathan Hsieh (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605707#comment-13605707
 ] 

Jonathan Hsieh commented on HBASE-7965:
---

I think it is unfair to claim that the ability to change schema without 
disabling the table is a feature that is required for HBase to be production 
ready.

The feature is off by default and essentially documented as experimental 
({{Its off by default. Enable it at your own risk.}} [1]), so in my eyes 
fixing it essentially feels like a new feature.

[1] http://hbase.apache.org/book.html#d1949e2910

(Sorry for the delay, I was away for 2 weeks.)

 Port table locking to 0.94 (HBASE-7305, HBASE-7546, HBASE-7933)
 ---

 Key: HBASE-7965
 URL: https://issues.apache.org/jira/browse/HBASE-7965
 Project: HBase
  Issue Type: New Feature
  Components: master, Zookeeper
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.94.7


 Port table locking to 0.94 (HBASE-7305, HBASE-7546, HBASE-7933). This is a 
 new feature, but there has been some interest, and it is necessary for 
 snapshots, and online merge, which are also candidates for backport. 
 If we port snapshots, we might need HBASE-7848 as well.
 We can also make it disabled by default.


[jira] [Commented] (HBASE-7295) Contention in HBaseClient.getConnection

[
https://issues.apache.org/jira/browse/HBASE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605711#comment-13605711
]

Lars Hofhansl commented on HBASE-7295:
--

Double checked locking is fine when the variable checked is declared volatile
(i.e. this ensures proper read/write memory barriers).
Here PoolMap itself would have to be thread-safe, which - as far as I know - it
is not.

Also in the uncontended case an access to a volatile is not significantly
cheaper than a synchronized statement, so I doubt that even if it was correct
it would actually improve the situation ... Unless you see extremely high
contention on this lock.

Do you have sample code that can reproduce the problem? Until then I'm -1 on
this change. (sorry)
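
For reference, a minimal sketch of double-checked locking done correctly with a volatile field. LazyHolder and its creations counter are hypothetical names used only for illustration; this is not the code from the patch. Note that volatile only covers the reference itself: the object it points to (PoolMap in this issue) still has to be thread-safe on its own.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Lazily creates one shared instance using double-checked locking. */
class LazyHolder {
    // volatile is what makes the unsynchronized first check safe under the
    // Java memory model (Java 5 and later).
    private volatile Object instance;

    // Counts how many times the instance was created; for demonstration only.
    final AtomicInteger creations = new AtomicInteger();

    Object get() {
        Object local = instance;          // first check, no lock taken
        if (local == null) {
            synchronized (this) {
                local = instance;         // second check, under the lock
                if (local == null) {
                    local = new Object();
                    creations.incrementAndGet();
                    instance = local;     // volatile write publishes the object
                }
            }
        }
        return local;
    }
}
```

Without the volatile modifier, a thread could observe a non-null reference to a partially constructed object, which is the classic failure mode of this idiom.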

Contention in HBaseClient.getConnection
---

Key: HBASE-7295
URL: https://issues.apache.org/jira/browse/HBASE-7295
Project: HBase
Issue Type: Improvement
Affects Versions: 0.94.3
Reporter: Varun Sharma
Assignee: Varun Sharma
Fix For: 0.95.0

Attachments: 7295-0.94.txt, 7295-0.94-v2.txt, 7295-0.94-v3.txt,
7295-0.94-v4.txt, 7295-0.94-v5.txt, 7295-trunk.txt, 7295-trunk.txt,
7295-trunk-v2.txt, 7295-trunk-v3.txt, 7295-trunk-v3.txt, 7295-trunk-v4.txt

HBaseClient.getConnection() synchronizes on the connections object. We found
severe contention on a thrift gateway which was fanning out roughly 3000+
calls per second to hbase region servers. The thrift gateway had 2000+
threads for handling incoming connections. Threads were blocked on the
synchronized block - we set ipc.pool.size to 200. Since we are using
RoundRobin/ThreadLocal pool only - it's not necessary to synchronize on
connections - it might lead to cases where we might go slightly over the
ipc.max.pool.size() but the additional connections would time out after
maxIdleTime - underlying PoolMap connections object is thread safe.

[jira] [Updated] (HBASE-7597) TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky


 [ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7597:
---

Attachment: trunk-7597.patch

 TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky
 -

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang
 Attachments: trunk-7597.patch


 I ran the entire test suite many times and it always failed on, at least, 
 testRegionShouldNotBeDeployed.
 Results below. I will attach more results when the current tests are done.
 Failed tests:
   testDeleteExpiredStoreFiles(org.apache.hadoop.hbase.regionserver.TestStore): expected:2 but was:4
   testAcquireTaskAtStartup(org.apache.hadoop.hbase.regionserver.TestSplitLogWorker): Waiting timed out after [1 000] msec
   testRegionShouldNotBeDeployed(org.apache.hadoop.hbase.util.TestHBaseFsck): expected:[SHOULD_NOT_BE_DEPLOYED] but was:[]
   testPermissionsWatcher(org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher)


[jira] [Updated] (HBASE-7597) TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky


 [ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7597:
---

Status: Patch Available  (was: Reopened)

In the log file, it shows the region is not deployed according to hbck although 
it is.  I added some checking (the same way as in hbck) to make sure the region 
is deployed before running hbck.
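
A generic sketch of the "check until the precondition holds, then run the tool" pattern described above. It does not reproduce the attached patch: WaitFor and waitFor are hypothetical helper names, and the condition passed in stands in for whatever deployment check the test actually performs.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical test helper: poll a condition until it holds or a timeout expires. */
class WaitFor {
    static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
        final long start = System.nanoTime();
        final long timeoutNanos = timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;               // e.g. "the region is deployed"
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;              // give up; the caller should fail the test
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A test would call something like WaitFor.waitFor(30000, 100, () -> regionIsDeployed()) before invoking hbck, where regionIsDeployed() is a placeholder for the real check.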

 TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky
 -

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang
 Attachments: trunk-7597.patch


 I ran the entire test suite many times and it always failed on, at least, 
 testRegionShouldNotBeDeployed.
 Results below. I will attach more results when the current tests are done.
 Failed tests:
   testDeleteExpiredStoreFiles(org.apache.hadoop.hbase.regionserver.TestStore): expected:2 but was:4
   testAcquireTaskAtStartup(org.apache.hadoop.hbase.regionserver.TestSplitLogWorker): Waiting timed out after [1 000] msec
   testRegionShouldNotBeDeployed(org.apache.hadoop.hbase.util.TestHBaseFsck): expected:[SHOULD_NOT_BE_DEPLOYED] but was:[]
   testPermissionsWatcher(org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher)


[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize


 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8135:
--

Attachment: 8135-v3.txt

Patch v3 makes TestHeapSize pass.

Put has already been covered in TestHeapSize#testSizes()

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt


 Code is there already.
 Doing so would allow sharing some code when doing client-side buffering.
 Patch compiles locally; it should not impact tests, but has not been tested locally.


[jira] [Commented] (HBASE-7965) Port table locking to 0.94 (HBASE-7305, HBASE-7546, HBASE-7933)


[ 
https://issues.apache.org/jira/browse/HBASE-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605718#comment-13605718
 ] 

Lars Hofhansl commented on HBASE-7965:
--

Welcome back Jon :)
I do not think it is a question of fair vs. unfair.

It is a fact that you cannot safely do online schema changes in 0.94.
When we have an actual patch against 0.94 we can weigh that deficiency against 
the risk introduced by the patch.


 Port table locking to 0.94 (HBASE-7305, HBASE-7546, HBASE-7933)
 ---

 Key: HBASE-7965
 URL: https://issues.apache.org/jira/browse/HBASE-7965
 Project: HBase
  Issue Type: New Feature
  Components: master, Zookeeper
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.94.7


 Port table locking to 0.94 (HBASE-7305, HBASE-7546, HBASE-7933). This is a 
 new feature, but there has been some interest, and it is necessary for 
 snapshots, and online merge, which are also candidates for backport. 
 If we port snapshots, we might need HBASE-7848 as well.
 We can also make it disabled by default.


[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize


 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-8135:
--

Status: Patch Available  (was: Open)

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.5, 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt


 Code is there already.
 Doing so would allow sharing some code when doing client-side buffering.
 Patch compiles locally; it should not impact tests, but has not been tested locally.


[jira] [Updated] (HBASE-7568) [replication] Create an interface for replication queues


 [ 
https://issues.apache.org/jira/browse/HBASE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated HBASE-7568:


Attachment: HBASE-7568-trunk-v1.patch

Attached re-based patch to incorporate new test in TestReplicationSourceManager.

 [replication] Create an interface for replication queues
 

 Key: HBASE-7568
 URL: https://issues.apache.org/jira/browse/HBASE-7568
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 0.95.0, 0.96.0, 0.98.0

 Attachments: HBASE-7568-trunk-v1.patch





[jira] [Commented] (HBASE-7295) Contention in HBaseClient.getConnection


[ 
https://issues.apache.org/jira/browse/HBASE-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605729#comment-13605729
 ] 

stack commented on HBASE-7295:
--

This doesn't make sense:

{code}
-  protected final PoolMap<ConnectionId, Connection> connections;
+  protected volatile PoolMap<ConnectionId, Connection> connections;
{code}

This is http://en.wikipedia.org/wiki/Double-checked_locking

No weird errors/connection fails in your thrift gateway?

PoolMap looks like it is backed by a concurrent hash map which would be fine on 
the gets, etc., but the iterations are not synchronized (I don't see 
connections being iterated but they probably are someplace if I looked more).

We committed double-checked locking around the block cache a while ago: 
https://issues.apache.org/jira/secure/attachment/12553266/5898-v4.txt



 Contention in HBaseClient.getConnection
 ---

 Key: HBASE-7295
 URL: https://issues.apache.org/jira/browse/HBASE-7295
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.3
Reporter: Varun Sharma
Assignee: Varun Sharma
 Fix For: 0.95.0

 Attachments: 7295-0.94.txt, 7295-0.94-v2.txt, 7295-0.94-v3.txt, 
 7295-0.94-v4.txt, 7295-0.94-v5.txt, 7295-trunk.txt, 7295-trunk.txt, 
 7295-trunk-v2.txt, 7295-trunk-v3.txt, 7295-trunk-v3.txt, 7295-trunk-v4.txt


 HBaseClient.getConnection() synchronizes on the connections object. We found 
 severe contention on a thrift gateway which was fanning out roughly 3000+ 
 calls per second to hbase region servers. The thrift gateway had 2000+ 
 threads for handling incoming connections. Threads were blocked on the 
 synchronized block - we set ipc.pool.size to 200. Since we are using 
 RoundRobin/ThreadLocal pool only - it's not necessary to synchronize on 
 connections - it might lead to cases where we might go slightly over the 
 ipc.max.pool.size() but the additional connections would time out after 
 maxIdleTime - underlying PoolMap connections object is thread safe.


[jira] [Commented] (HBASE-7597) TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky


[ 
https://issues.apache.org/jira/browse/HBASE-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605732#comment-13605732
 ] 

stack commented on HBASE-7597:
--

+1

IMO, exploratory/debug is fine to commit while trying to figure out what's up on 
jenkins (since it is hard to reproduce its context elsewhere).

 TestHBaseFsck#testRegionShouldNotBeDeployed seems to be flaky
 -

 Key: HBASE-7597
 URL: https://issues.apache.org/jira/browse/HBASE-7597
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jimmy Xiang
 Attachments: trunk-7597.patch


 I ran the entire test suite many times and it always failed on, at least, 
 testRegionShouldNotBeDeployed.
 Results below. I will attach more results when the current tests are done.
 Failed tests:
   testDeleteExpiredStoreFiles(org.apache.hadoop.hbase.regionserver.TestStore): expected:2 but was:4
   testAcquireTaskAtStartup(org.apache.hadoop.hbase.regionserver.TestSplitLogWorker): Waiting timed out after [1 000] msec
   testRegionShouldNotBeDeployed(org.apache.hadoop.hbase.util.TestHBaseFsck): expected:[SHOULD_NOT_BE_DEPLOYED] but was:[]
   testPermissionsWatcher(org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher)


[jira] [Updated] (HBASE-7905) Add passing of optional cell blocks over rpc


 [ 
https://issues.apache.org/jira/browse/HBASE-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7905:
-

Status: Open  (was: Patch Available)

 Add passing of optional cell blocks over rpc
 

 Key: HBASE-7905
 URL: https://issues.apache.org/jira/browse/HBASE-7905
 Project: HBase
  Issue Type: Sub-task
  Components: IPC/RPC
Reporter: stack
Assignee: stack
 Fix For: 0.95.0

 Attachments: 7900v12-depends-on-8101.txt, 7905.txt, 7905v13.txt, 
 7905v14.txt, 7905v15.txt, 7905v16.txt, 7905v3.txt, 7905v4.txt, 7905v6.txt, 
 7905v8.txt, 7905v9.txt, testipc_for_pre_cellblocks.txt


 Make it so we can pass Cells/data w/o having to bury it all in protobuf to 
 get it over the wire.


[jira] [Updated] (HBASE-7905) Add passing of optional cell blocks over rpc


 [ 
https://issues.apache.org/jira/browse/HBASE-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7905:
-

Attachment: testipc_for_pre_cellblocks.txt

Add main to testipc for current trunk, before this patch goes in.

 Add passing of optional cell blocks over rpc
 

 Key: HBASE-7905
 URL: https://issues.apache.org/jira/browse/HBASE-7905
 Project: HBase
  Issue Type: Sub-task
  Components: IPC/RPC
Reporter: stack
Assignee: stack
 Fix For: 0.95.0

 Attachments: 7900v12-depends-on-8101.txt, 7905.txt, 7905v13.txt, 
 7905v14.txt, 7905v15.txt, 7905v16.txt, 7905v3.txt, 7905v4.txt, 7905v6.txt, 
 7905v8.txt, 7905v9.txt, testipc_for_pre_cellblocks.txt


 Make it so we can pass Cells/data w/o having to bury it all in protobuf to 
 get it over the wire.


[jira] [Commented] (HBASE-7305) ZK based Read/Write locks for table operations

2013-03-18 Thread Jonathan Hsieh (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605746#comment-13605746
]

Jonathan Hsieh commented on HBASE-7305:
---

The doc is great -- I'm really most curious about why different operations
get the read or the write aspect of the lock, and what the locks protect. I'm
trying to justify this to myself now based on the docs. So, do I have this
right?

Affected operations:
* create, delete, disable, enable, alter, modify table (add/del/mod col, mod
table), splits
* Other candidates: merge, snapshot, ... balancer, am, ssh, hbck

Current rationale:
* want to allow safe table mods (disable, enable, alter)
* want to allow concurrent splits
* want snapshot operations to be safe

Implementation:
* Read locks on splits.
* Exclusive write lock on all other table mods.

Questions/Observations:
* This primarily protects operations that clash with table level
enable/disable/alter, but not region level operations, right?
* This doesn't guard meta from individual changes, right? It only protects
meta from bulk adds (create/delete table). Thus this shouldn't affect region
moves or region closes/opens.
* Protecting split with a read table lock only prevents alter/enable/disable
table ops from happening. If an overlapping merge and split were issued, some
other mechanism is in place to keep this sane right? This doesn't protect
multiple merge requests with overlapping regions right?
* Merges will likely want the read lock? (allowing multiple concurrent merges,
and assuming some overlap sanity protection from a different mechanism).
* With snapshots, this mechanism doesn't prevent regions from moving so it only
protects snapshots from concurrently happening with enable/disable/alter table
ops. Snapshot will still fail if it gets caught while the balancer is running.
* These locks don't really help hbck -- except for the cases where
enable/disable/alter operations are going on as hbck repairs things. (It
wouldn't protect hbck from the balancer).

As a strawman (for follow-on work), I'm thinking that for assignment-dependent
operations (splits/balancer/ssh/snapshots/merge) we might want another lock (I
believe regions-in-transition kind of serves this purpose already).

* Does having a table lock (and then having individual region locks that
require a table read lock being held) make sense? Maybe this makes sense for
merges and splits?
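
To illustrate the semantics summarized under "Implementation" above (shared read lock for splits, exclusive write lock for table-level changes), here is a small in-process analogy built on java.util.concurrent. It is only a sketch of the locking semantics: the real locks in this issue live in ZooKeeper, and TableLocks and its method names are hypothetical. Also note that ReentrantReadWriteLock is owned per thread and is reentrant, so this does not model ownership across processes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** In-process stand-in for per-table read/write locks (hypothetical). */
class TableLocks {
    private final Map<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();

    private ReentrantReadWriteLock lockFor(String table) {
        return locks.computeIfAbsent(table, t -> new ReentrantReadWriteLock());
    }

    /** Shared lock: several splits on the same table may hold it at once. */
    boolean tryAcquireForSplit(String table) {
        return lockFor(table).readLock().tryLock();
    }

    void releaseForSplit(String table) {
        lockFor(table).readLock().unlock();
    }

    /** Exclusive lock: table-level changes such as alter, enable, disable. */
    boolean tryAcquireForTableChange(String table) {
        return lockFor(table).writeLock().tryLock();
    }

    void releaseForTableChange(String table) {
        lockFor(table).writeLock().unlock();
    }
}
```

While any split holds the read lock on a table, a table change on that table cannot acquire the write lock, but table changes on other tables are unaffected. As the questions above point out, nothing here prevents two overlapping region-level operations from clashing; that would need a separate mechanism.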

ZK based Read/Write locks for table operations
--

Key: HBASE-7305
URL: https://issues.apache.org/jira/browse/HBASE-7305
Project: HBase
Issue Type: Bug
Components: Client, master, Zookeeper
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.95.0

Attachments: 130228-zkrwlocks.pdf, 7305-v11.txt, hbase-7305_v0.patch,
hbase-7305_v10.patch, hbase-7305_v13.patch, hbase-7305_v14.patch,
hbase-7305_v15.patch, hbase-7305_v1-based-on-curator.patch,
hbase-7305_v2.patch, hbase-7305_v4.patch, hbase-7305_v9.patch,
HBaseTableLocks.pdf

This started as a forward port of HBASE-5494 and HBASE-5991 from the
89-fb branch to trunk, but diverged enough to have its own issue.
The idea is to implement a zk based read/write lock per table. Master
initiated operations should get the write lock, and region operations (region
split, moving, balance?, etc) acquire a shared read lock.

[jira] [Updated] (HBASE-8081) Backport HBASE-7213 (separate hlog for meta tables) to 0.94

2013-03-18 Thread Devaraj Das (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-8081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-8081:
---

Attachment: 7213-0.94-with-config-1.patch

This is essentially the same patch as the last one with one minor change - 
added the new config in hbase-default.xml. 

Passes the unit tests with the option enabled. Also ran manual tests on a 
cluster with the config on/off. Things looked good.

 Backport HBASE-7213 (separate hlog for meta tables) to 0.94
 ---

 Key: HBASE-8081
 URL: https://issues.apache.org/jira/browse/HBASE-8081
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.5
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.94.7

 Attachments: 7213-0.94-2.patch, 7213-0.94-3.patch, 7213-0.94.patch, 
 7213-0.94-with-config-1.patch, 7213-0.94-with-config.patch


 I am interested in backporting HBASE-7213 to 0.94. Helps to address more of 
 the MTTR story. Offline discussion with Lars indicated he is interested as 
 well.


[jira] [Commented] (HBASE-7305) ZK based Read/Write locks for table operations


[ 
https://issues.apache.org/jira/browse/HBASE-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605761#comment-13605761
 ] 

Sergey Shelukhin commented on HBASE-7305:
-

bq. having individual region locks that require a table read lock being held 
I wonder if the region lock approach would scale. Though wary, I can accept that 
splits are infrequent enough to not introduce too much delay to table 
operations, but if every AM action blocks every table operation I think it will 
not scale beyond small or medium clusters. I think we should be able to use 
a better approach... table updates on modified regions can be done after 
modification.

 ZK based Read/Write locks for table operations
 --

 Key: HBASE-7305
 URL: https://issues.apache.org/jira/browse/HBASE-7305
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Zookeeper
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0

 Attachments: 130228-zkrwlocks.pdf, 7305-v11.txt, hbase-7305_v0.patch, 
 hbase-7305_v10.patch, hbase-7305_v13.patch, hbase-7305_v14.patch, 
 hbase-7305_v15.patch, hbase-7305_v1-based-on-curator.patch, 
 hbase-7305_v2.patch, hbase-7305_v4.patch, hbase-7305_v9.patch, 
 HBaseTableLocks.pdf


 This started as a forward port of HBASE-5494 and HBASE-5991 from the 
 89-fb branch to trunk, but diverged enough to have its own issue. 
 The idea is to implement a zk based read/write lock per table. Master 
 initiated operations should get the write lock, and region operations (region 
 split, moving, balance?, etc) acquire a shared read lock. 


[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients