[jira] Commented: (HBASE-3405) Allow HBaseRpcMetrics to register custom interface methods

2011-01-06 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978229#action_12978229
 ] 

HBase Review Board commented on HBASE-3405:
---

Message from: Gary Helmling ghelml...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1414/
---

Review request for hbase.


Summary
---

The patch allows alternate HRegionServer subclasses (and in the future 
coprocessors) to access HBaseRpcMetrics in order to register additional RPC 
methods into the metrics registry.

The changes are:
 * add getRpcMetrics() to HBaseServer to allow access to the HBaseRpcMetrics 
instance
 * add getRpcMetrics() to RegionServerServices (and HRegionServer) to allow for 
future coprocessor accounting
 * add HBaseRpcMetrics.createMetrics(Class[], boolean) overload -- if boolean 
is true, the registered method names will be prefixed with the class name.  
This should help clarify origin for custom metrics and help prevent collisions.
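
To illustrate the class-name prefix option described above, here is a minimal, self-contained sketch. It is not the code from the patch: MethodCounterRegistry and TransactionalProtocol are hypothetical names, and the use of the simple class name with a "." separator is an assumption made only for this example.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a metrics registry; not the HBaseRpcMetrics code. */
class MethodCounterRegistry {
    private final Map<String, AtomicLong> counters = new TreeMap<>();

    /** Registers one counter per method declared on each interface. */
    void createMetrics(Class<?>[] ifaces, boolean prefixWithClass) {
        for (Class<?> iface : ifaces) {
            for (Method m : iface.getDeclaredMethods()) {
                // Assumption: simple class name plus "." as the prefix format.
                String name = prefixWithClass
                        ? iface.getSimpleName() + "." + m.getName()
                        : m.getName();
                counters.putIfAbsent(name, new AtomicLong());
            }
        }
    }

    /** Increments a counter; returns false if the name was never registered. */
    boolean inc(String name) {
        AtomicLong c = counters.get(name);
        if (c == null) {
            return false; // this is where a real registry might log a warning
        }
        c.incrementAndGet();
        return true;
    }

    /** Current value of a counter, or 0 if it was never registered. */
    long get(String name) {
        AtomicLong c = counters.get(name);
        return c == null ? 0L : c.get();
    }
}

/** Hypothetical custom RPC interface added by a region server subclass. */
interface TransactionalProtocol {
    void beginTransaction(long id);
    void commit(long id);
}
```

With the prefix enabled, two interfaces that both declare commit() would get separate counters, which is the collision case the prefix is meant to avoid.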


This addresses bug HBASE-3405.
http://issues.apache.org/jira/browse/HBASE-3405


Diffs
-

  src/main/java/org/apache/hadoop/hbase/ipc/HBaseRpcMetrics.java 19dbf2b 
  src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 867a059 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d7147b5 
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java 
1309f93 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestRpcMetrics.java 
PRE-CREATION 

Diff: http://review.cloudera.org/r/1414/diff


Testing
---

Created new org.apache.hadoop.hbase.regionserver.TestRpcMetrics test case to 
verify registration and incrementing of metrics from HRegionServer subclasses.


Thanks,

Gary




 Allow HBaseRpcMetrics to register custom interface methods
 --

 Key: HBASE-3405
 URL: https://issues.apache.org/jira/browse/HBASE-3405
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: Gary Helmling
Priority: Minor

 Opened from comments on HBASE-2997.  James Kennedy notes:
 {quote}
 HBaseRpcMetrics is now logging a WARN message every time it encounters an 
 unregistered RPC method.
 In my case I now get huge log files filled with these warnings because the 
 hbase-trx transactional extension of HBase uses a subclass of HRegionServer 
 that adds new interface methods.
 It's easy enough to tell log4j to ignore HBaseRpcMetrics output.
 However, it would be nice if the Server/HRegionServer HBaseRpcMetrics 
 mechanism was more extensible so I could pass down new interfaces or grab the 
 HBaseRpcMetrics object to add interfaces from up top...
 {quote}
 {{HBaseRpcMetrics}} already has a public method {{createMetrics(Class)}} to 
 register method counters.  We just need a way to expose the metrics class to 
 allow the region server subclass to call it -- add a {{getMetrics()}} method 
 to {{RpcServer}} and {{HBaseServer}}.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3400) Coprocessor Support for Generic Interfaces

2010-12-31 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976355#action_12976355
 ] 

HBase Review Board commented on HBASE-3400:
---

Message from: ekohl...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1405/
---

Review request for hbase.


Summary
---

Coprocessors currently do not support generic interfaces because type erasure 
makes their generic parameters appear as Objects to Invocation.java.
This can be overcome by writing out the parameters using their own types 
(rather than the type parameters), and then separately writing the class names 
for the type parameters. While it would be ideal to implement this in 
Invocation.java, some other code seems to be relying on its write order and 
doing so breaks other RPC code. The modification can, however, be implemented 
in Exec.java instead.

The included patch modifies Invocation.java's fields so that they have 
protected scope, and fully implements the read and write methods for Exec rather 
than using the parent method for the parent fields. ExecResult is also modified 
to accommodate generic returns in the same way.
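
For readers unfamiliar with the erasure problem described above, the following standalone sketch shows what reflection reports for a generic interface method, and how the argument's runtime class can still be recovered. EchoProtocol and ErasureDemo are hypothetical names for this illustration only; the actual wire format used by Exec is not reproduced here.

```java
import java.lang.reflect.Method;

/** Hypothetical generic protocol, used only to illustrate erasure. */
interface EchoProtocol<T> {
    T echo(T value);
}

class ErasureDemo {
    /** Parameter type as seen by reflection: the erased bound of T. */
    static Class<?> declaredParamType() {
        try {
            // The method can only be looked up with Object.class, because T is erased.
            Method m = EchoProtocol.class.getMethod("echo", Object.class);
            return m.getParameterTypes()[0];
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Class name of the argument actually passed, which survives erasure. */
    static String runtimeClassName(Object arg) {
        return arg.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(declaredParamType().getName()); // java.lang.Object
        System.out.println(runtimeClassName("row-1"));     // java.lang.String
    }
}
```

The declared type alone is not enough to serialize the value, which is why the parameter's own class has to be used when writing it out.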


This addresses bug HBASE-3400.
http://issues.apache.org/jira/browse/HBASE-3400


Diffs
-

  src/main/java/org/apache/hadoop/hbase/client/coprocessor/Exec.java c127ea3 
  src/main/java/org/apache/hadoop/hbase/client/coprocessor/ExecResult.java 
be46cd2 
  src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java 9609652 
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java 
75f76e8 

Diff: http://review.cloudera.org/r/1405/diff


Testing
---

Integration test included in patch. Demonstrates generic interface using 
objects, arrays, and primitives, and checks that all primitive classes work as 
well.


Thanks,

ekohlwey




 Coprocessor Support for Generic Interfaces
 --

 Key: HBASE-3400
 URL: https://issues.apache.org/jira/browse/HBASE-3400
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ed Kohlwey
Assignee: Ed Kohlwey
 Attachments: HBASE-3400-2.patch, HBASE-3400.patch


 Coprocessors currently do not support generic interfaces because type erasure 
 makes their generic parameters appear as Objects to Invocation.java.
 This can be overcome by writing out the parameters using their own types 
 (rather than the type parameters), and then separately writing the class 
 names for the type parameters. While it would be ideal to implement this in 
 Invocation.java, some other code seems to be relying on its write order and 
 doing so breaks other RPC code. The modification can, however, be implemented 
 in Exec.java instead.
 The included patch modifies Invocation.java's fields so that they have 
 protected scope, and fully implements the read and write methods for Exec 
 rather than using the parent method for the parent fields. ExecResult is also 
 modified to accommodate generic returns in the same way.




[jira] Commented: (HBASE-3400) Coprocessor Support for Generic Interfaces

2010-12-31 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976376#action_12976376
 ] 

HBase Review Board commented on HBASE-3400:
---

Message from: ekohl...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1405/
---

(Updated 2010-12-31 17:45:56.197941)


Review request for hbase.


Changes
---

Added classes that were missing for tests.


Summary
---

Coprocessors currently do not support generic interfaces because type erasure 
makes their generic parameters appear as Objects to Invocation.java.
This can be overcome by writing out the parameters using their own types 
(rather than the type parameters), and then separately writing the class names 
for the type parameters. While it would be ideal to implement this in 
Invocation.java, some other code seems to be relying on its write order and 
doing so breaks other RPC code. The modification can, however, be implemented 
in Exec.java instead.

The included patch modifies Invocation.java's fields so that they have 
protected scope, and fully implements the read and write methods for Exec rather 
than using the parent method for the parent fields. ExecResult is also modified 
to accommodate generic returns in the same way.


This addresses bug HBASE-3400.
http://issues.apache.org/jira/browse/HBASE-3400


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/client/coprocessor/Exec.java c127ea3 
  src/main/java/org/apache/hadoop/hbase/client/coprocessor/ExecResult.java 
be46cd2 
  src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java 9609652 
  src/test/java/org/apache/hadoop/hbase/coprocessor/GenericEndpoint.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/coprocessor/GenericProtocol.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java 
75f76e8 

Diff: http://review.cloudera.org/r/1405/diff


Testing
---

Integration test included in patch. Demonstrates generic interface using 
objects, arrays, and primitives, and checks that all primitive classes work as 
well.


Thanks,

ekohlwey




 Coprocessor Support for Generic Interfaces
 --

 Key: HBASE-3400
 URL: https://issues.apache.org/jira/browse/HBASE-3400
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ed Kohlwey
Assignee: Ed Kohlwey
 Attachments: HBASE-3400-2.patch, HBASE-3400-3.patch, HBASE-3400.patch





[jira] Commented: (HBASE-2312) Possible data loss when RS goes into GC pause while rolling HLog

2010-12-22 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974396#action_12974396
 ] 

HBase Review Board commented on HBASE-2312:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/396/#review2146
---

Ship it!


Below looks good.  Doesn't work w/o the hadoop issues?  I still need to review 
those?

- stack





 Possible data loss when RS goes into GC pause while rolling HLog
 

 Key: HBASE-2312
 URL: https://issues.apache.org/jira/browse/HBASE-2312
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.90.0
Reporter: Karthik Ranganathan
Assignee: Nicolas Spiegelberg
Priority: Critical
 Fix For: 0.90.1


 There is a rare corner case in which bad things (i.e., data loss) could happen:
 1) RS #1 is going to roll its HLog - it has not yet created the new one, and 
 the old one will get no more writes
 2) RS #1 enters GC Pause of Death
 3) Master lists the HLog files of RS #1 that it has to split as RS #1 is dead, 
 and starts splitting
 4) RS #1 wakes up, creates the new HLog (the previous one was rolled) and 
 appends an edit - which is lost
 The following seems like a possible solution:
 1) Master detects RS #1 is dead
 2) The master renames the /hbase/.logs/<regionserver name> directory to 
 something else (say /hbase/.logs/<regionserver name>-dead)
 3) Add mkdir support (as opposed to mkdirs) to HDFS - so that a file 
 create fails if the directory doesn't exist. Dhruba tells me this is very 
 doable.
 4) RS #1 comes back up and is not able to create the new hlog. It restarts 
 itself.
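
 The fencing idea in the proposed solution can be sketched on a local filesystem. This is only an analogy under stated assumptions: it uses java.nio.file rather than the HDFS API, the directory and file names are made up, and LogDirFencingSketch is a hypothetical class. The point is that once the directory has been renamed, a create call that does not recreate missing parents fails instead of silently writing a log nobody will split.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

class LogDirFencingSketch {
    /**
     * Returns true if creating a new log file fails because the log
     * directory was renamed away in the meantime.
     */
    static boolean newLogCreationIsFenced() {
        try {
            Path root = Files.createTempDirectory("logs");
            Path rsDir = Files.createDirectory(root.resolve("rs1"));
            Files.createFile(rsDir.resolve("hlog.1")); // the old, already rolled log

            // "Master" side: the server is considered dead, so rename its directory.
            Files.move(rsDir, root.resolve("rs1-dead"));

            // "Region server" side: wakes up and tries to create the next log.
            // Files.createFile does not create missing parent directories.
            try {
                Files.createFile(rsDir.resolve("hlog.2"));
                return false; // the create succeeded, so nothing fenced the writer
            } catch (NoSuchFileException e) {
                return true;  // parent directory is gone; the server should restart
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("fenced: " + newLogCreationIsFenced());
    }
}
```

 If the create call did recreate parent directories (the mkdirs behavior), the late writer would succeed and its edit would be lost, which is the race described above.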




[jira] Commented: (HBASE-2312) Possible data loss when RS goes into GC pause while rolling HLog

2010-12-21 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974029#action_12974029
 ] 

HBase Review Board commented on HBASE-2312:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/396/
---

(Updated 2010-12-21 19:06:32.166768)


Review request for hbase.


Changes
---

Version for 0.90.  This version utilizes a new HDFS patch to forcibly recover a 
file lease (forthcoming).  TestZooKeeper will fail without this patch because 
it needs to wait until the soft lease expires otherwise.


Summary
---

There is a rare corner case in which bad things (i.e., data loss) could happen:

1) RS #1 is going to roll its HLog - it has not yet created the new one, and 
the old one will get no more writes
2) RS #1 enters GC Pause of Death
3) Master lists the HLog files of RS #1 that it has to split as RS #1 is dead, 
and starts splitting
4) RS #1 wakes up, creates the new HLog (the previous one was rolled) and 
appends an edit - which is lost

Note that this fix requires a healthy dose of HDFS prerequisites: HDFS-617, 
HADOOP-6840, HADOOP-6886.  I encourage you to review those as well, give 
feedback, and hopefully give +1s so we can push the changes through.


This addresses bug HBASE-2312.
http://issues.apache.org/jira/browse/HBASE-2312


Diffs (updated)
-

  trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java 1051398 
  trunk/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java 
1051398 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 
1051398 
  
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java
 1051398 
  trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java 1051398 
  
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java 
1051398 

Diff: http://review.cloudera.org/r/396/diff


Testing
---

mvn test;
bin/start-hbase.sh
bin/hbase shell  scan '.META.', get, put, etc


Thanks,

Nicolas







[jira] Commented: (HBASE-3256) Coprocessors: Coprocessor host and observer for HMaster

2010-12-20 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973452#action_12973452
 ] 

HBase Review Board commented on HBASE-3256:
---

Message from: Gary Helmling ghelml...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1321/
---

Review request for hbase, stack, Andrew Purtell, and Jonathan Gray.


Summary
---

This patch adds a new MasterObserver interface with pre/post hooks provided for 
operations defined in org.apache.hadoop.hbase.ipc.HMasterInterface.

In order to accommodate the new MasterObserver interface, I've also refactored 
out common coprocessor base code, with subclasses providing for region-specific 
and master-specific behavior.

The new code structure is (excuse my poor ascii art):

CoprocessorEnvironment - base interface for common facilities provided to CP 
                         implementations
|
|- RegionCoprocessorEnvironment - adds access to current HRegion and 
|    RegionServerServices (for RegionObservers)
|
|- MasterCoprocessorEnvironment - adds access to MasterServerServices (for 
     MasterObservers)

CoprocessorHost - abstract base providing core CP loading and invocation code 
                  and the base CoprocessorEnvironment implementation
|
|- RegionCoprocessorHost - provides hooks for invoking RegionObserver 
|    pre/post methods and RegionCoprocessorEnvironment implementation
|
|- MasterCoprocessorHost - provides hooks for invoking MasterObserver 
     pre/post methods and MasterCoprocessorEnvironment implementation

Also added:
 - org.apache.hadoop.hbase.coprocessor.BaseMasterObserver - stubs out full 
MasterObserver interface with empty methods for convenience
 - org.apache.hadoop.hbase.coprocessor.TestMasterObserver - tests that 
MasterObserver pre/post methods are called during master operations.
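
As a rough picture of the pre/post hook pattern and the empty-method base class described above, here is a simplified sketch. It does not reproduce the real MasterObserver signatures: all Sketch* names, the single createTable operation, and the String table argument are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical observer interface (not the real MasterObserver). */
interface SketchMasterObserver {
    void preCreateTable(String table);
    void postCreateTable(String table);
}

/** Convenience base with empty methods, so subclasses override only what they need. */
class SketchBaseMasterObserver implements SketchMasterObserver {
    @Override public void preCreateTable(String table) { }
    @Override public void postCreateTable(String table) { }
}

/** Host that calls every loaded observer around the master's own operation. */
class SketchMasterHost {
    private final List<SketchMasterObserver> observers = new ArrayList<>();
    /** Record of what happened, kept only so the call order is visible. */
    final List<String> events = new ArrayList<>();

    void load(SketchMasterObserver observer) {
        observers.add(observer);
    }

    void createTable(String table) {
        for (SketchMasterObserver o : observers) {
            o.preCreateTable(table);
        }
        events.add("create " + table); // stands in for the master's real work
        for (SketchMasterObserver o : observers) {
            o.postCreateTable(table);
        }
    }
}
```

An observer that only cares about the pre hook extends the base class and overrides preCreateTable, leaving the post hook as the inherited no-op.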

In particular, please let me know if the MasterObserver method inputs and 
outputs are sufficient for whatever you anticipate doing with it.  It should 
meet our needs for security checks, but more input would be helpful.


This addresses bug HBASE-3256.
http://issues.apache.org/jira/browse/HBASE-3256


Diffs
-

  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java 
PRE-CREATION 
  
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java
 1ffead0 
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java 
c4fa526 
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 
PRE-CREATION 
  
src/main/java/org/apache/hadoop/hbase/coprocessor/MasterCoprocessorEnvironment.java
 PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java 
PRE-CREATION 
  
src/main/java/org/apache/hadoop/hbase/coprocessor/RegionCoprocessorEnvironment.java
 PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 97198ec 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java 18f7787 
  src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/master/MasterServices.java 593254b 
  src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java 
f71fea6 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1d48131 
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationEndpoint.java
 43569f1 
  src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java 
902a60f 
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java 
5434d01 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java 
PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java
 5f5fc9a 
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java
 3193abf 
  src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java 5be8daa 

Diff: http://review.cloudera.org/r/1321/diff


Testing
---

Added a new test (org.apache.hadoop.hbase.coprocessor.TestMasterObserver) to 
cover pre/post hook invocation.

All existing coprocessor tests still pass.


Thanks,

Gary




 Coprocessors: Coprocessor host and observer for HMaster
 ---

 Key: HBASE-3256
 URL: https://issues.apache.org/jira/browse/HBASE-3256
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Gary Helmling
 Fix For: 0.92.0

 Attachments: HBASE-3256_initial.patch


 Implement a coprocessor host for HMaster. Hook observers into administrative 
 operations 

[jira] Commented: (HBASE-3256) Coprocessors: Coprocessor host and observer for HMaster

2010-12-20 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973457#action_12973457
 ] 

HBase Review Board commented on HBASE-3256:
---

Message from: Andrew Purtell apurt...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1321/#review2127
---

Ship it!



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1321/#comment6609

If we were to allow override of assignments, it would have to happen here. 
If the cp calls bypass() then return immediately.
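
A minimal sketch of what "return immediately on bypass" could look like. The bypass() and shouldBypass() names come from this review thread; everything else (BypassEnv, AssignObserverSketch, AssignMasterSketch, the String region argument, and resetting the flag on read) is an assumption for the example.

```java
/** Hypothetical environment carrying the bypass flag set by an observer. */
class BypassEnv {
    private boolean bypass;

    /** Called by an observer that wants to replace the default behavior. */
    void bypass() {
        bypass = true;
    }

    /** Reads and resets the flag (resetting here is an assumption of this sketch). */
    boolean shouldBypass() {
        boolean current = bypass;
        bypass = false;
        return current;
    }
}

/** Simplified pre-hook; the real observer signature is not shown in this thread. */
interface AssignObserverSketch {
    void preAssign(BypassEnv env, String region);
}

class AssignMasterSketch {
    private final BypassEnv env = new BypassEnv();
    private final AssignObserverSketch observer;
    /** Counts how often the default assignment path ran. */
    int defaultAssignments = 0;

    AssignMasterSketch(AssignObserverSketch observer) {
        this.observer = observer;
    }

    void assign(String region) {
        observer.preAssign(env, region);
        if (env.shouldBypass()) {
            return; // the observer took over; skip the default assignment
        }
        defaultAssignments++; // stands in for the master's normal assignment
    }
}
```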



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1321/#comment6610

Likewise if we were to allow overriding assignment, we need a symmetrical 
operation here.


- Andrew





 Coprocessors: Coprocessor host and observer for HMaster
 ---

 Key: HBASE-3256
 URL: https://issues.apache.org/jira/browse/HBASE-3256
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Gary Helmling
 Fix For: 0.92.0

 Attachments: HBASE-3256_initial.patch


 Implement a coprocessor host for HMaster. Hook observers into administrative 
 operations performed on tables: create, alter, assignment, load balance, and 
 allow observers to modify base master behavior. Support automatic loading of 
 coprocessor implementation. 
 Consider refactoring the master coprocessor host and regionserver coprocessor 
 host into a common base class. 




[jira] Commented: (HBASE-3256) Coprocessors: Coprocessor host and observer for HMaster

2010-12-20 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973460#action_12973460
 ] 

HBase Review Board commented on HBASE-3256:
---

Message from: Jonathan Gray jg...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1321/#review2126
---

Ship it!


great work!  just a few small comments but otherwise +1


src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
http://review.cloudera.org/r/1321/#comment6607

does DEFAULT really mean REGION/REGIONSERVER?  or is it both?

not a big deal if it's just variable names but since it's a config param, 
we should nail it now before it gets out in a release.



src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
http://review.cloudera.org/r/1321/#comment6608

this code might have been in other earlier patches but could there be false 
positives with this?  it'd be silly to load FancyCoprocessor and then 
MyFancyCoprocessor but i guess this is to cover the package?  maybe parse out 
the class name?



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1321/#comment6616

doesn't preBalance() return a void?  it's preBalanceSwitch that returns 
boolean



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1321/#comment6617

and here we should get the boolean return value (and base class should 
return the input value)
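
To make the suggestion concrete, here is a small sketch in which the pre hook returns the value to apply and the base implementation simply returns its input. The preBalanceSwitch name is from the review; the class names, the boolean-only signature, and the "returns previous value" convention are assumptions.

```java
/** Hypothetical hook: returns the value that should actually be applied. */
interface BalanceSwitchObserverSketch {
    boolean preBalanceSwitch(boolean requested);
}

/** Base implementation passes the input through unchanged. */
class BaseBalanceSwitchObserverSketch implements BalanceSwitchObserverSketch {
    @Override
    public boolean preBalanceSwitch(boolean requested) {
        return requested;
    }
}

class BalancerSwitchSketch {
    private final BalanceSwitchObserverSketch observer;
    private boolean enabled = true;

    BalancerSwitchSketch(BalanceSwitchObserverSketch observer) {
        this.observer = observer;
    }

    /** Applies the (possibly modified) value and returns the previous one. */
    boolean balanceSwitch(boolean requested) {
        boolean effective = observer.preBalanceSwitch(requested);
        boolean previous = enabled;
        enabled = effective;
        return previous;
    }

    boolean isEnabled() {
        return enabled;
    }
}
```

An observer that wants to keep the balancer on regardless of the request would return true from preBalanceSwitch; the base class leaves the request untouched.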



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1321/#comment6618

would we ever want to override default assign behavior?  it's feasible... 
might want to be future proof w/ the api?



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1321/#comment6619

same here


- Jonathan








[jira] Commented: (HBASE-3256) Coprocessors: Coprocessor host and observer for HMaster

2010-12-20 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973477#action_12973477
 ] 

HBase Review Board commented on HBASE-3256:
---

Message from: Gary Helmling ghelml...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1321/#review2130
---



src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
http://review.cloudera.org/r/1321/#comment6621

It actually means region.  That conf key is only used for the system 
coprocessors loaded on regions.

I'll change the name (and config property).



src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
http://review.cloudera.org/r/1321/#comment6622

Yes, I'm not sure what the original intent was here, obtaining the CP 
without the full package name?

Maybe getClass().getSimpleName().equals() would be better?
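
The false positive raised in the review is easy to reproduce. In this sketch FancyCoprocessor and MyFancyCoprocessor are the hypothetical classes from the comment, and CoprocessorNameMatch is a made-up helper; only the two String comparisons mirror what is being discussed.

```java
/** Hypothetical coprocessor classes taken from the review comment. */
class FancyCoprocessor { }

class MyFancyCoprocessor { }

class CoprocessorNameMatch {
    /** Suffix match on the full class name: also matches longer class names. */
    static boolean matchesBySuffix(Object cp, String name) {
        return cp.getClass().getName().endsWith(name);
    }

    /** Exact match on the simple class name (package stripped). */
    static boolean matchesBySimpleName(Object cp, String name) {
        return cp.getClass().getSimpleName().equals(name);
    }

    public static void main(String[] args) {
        Object cp = new MyFancyCoprocessor();
        System.out.println(matchesBySuffix(cp, "FancyCoprocessor"));     // true
        System.out.println(matchesBySimpleName(cp, "FancyCoprocessor")); // false
    }
}
```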



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1321/#comment6623

This is the result of env.shouldBypass(), in order to allow a 
MasterObserver to bypass the normal balance() processing.



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1321/#comment6624

Right, that's the only way to modify the input.  Will change.



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1321/#comment6628

I can add in env.shouldBypass() handling here to allow overriding.  
Combined with access to ServerManager through MasterServices, this should allow 
custom assignment policies.



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1321/#comment6629

Yes, will add in env.shouldBypass() handling here too.


- Gary





 Coprocessors: Coprocessor host and observer for HMaster
 ---

 Key: HBASE-3256
 URL: https://issues.apache.org/jira/browse/HBASE-3256
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Gary Helmling
 Fix For: 0.92.0

 Attachments: HBASE-3256_initial.patch


 Implement a coprocessor host for HMaster. Hook observers into administrative 
 operations performed on tables: create, alter, assignment, load balance, and 
 allow observers to modify base master behavior. Support automatic loading of 
 coprocessor implementation. 
 Consider refactoring the master coprocessor host and regionserver coprocessor 
 host into a common base class. 




[jira] Commented: (HBASE-3256) Coprocessors: Coprocessor host and observer for HMaster

2010-12-20 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973491#action_12973491
 ] 

HBase Review Board commented on HBASE-3256:
---

Message from: Gary Helmling ghelml...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1321/
---

(Updated 2010-12-20 22:31:38.965609)


Review request for hbase, stack, Andrew Purtell, and Jonathan Gray.


Changes
---

Changes in response to review comments:
 - DEFAULT -> REGION in var name and property
 - CoprocessorHost.findCoprocessor(): use getClass().getSimpleName().equals() 
instead of getClass().getName().endsWith() for fallback
 - add bypass handling for MasterCoprocessorHost.preAssign() and preUnassign()
 - use return value from MasterObserver.preBalanceSwitch() to allow modifying 
input


Summary
---

This patch adds a new MasterObserver interface with pre/post hooks provided for 
operations defined in org.apache.hadoop.hbase.ipc.HMasterInterface.

In order to accommodate the new MasterObserver interface, I've also refactored 
out common coprocessor base code, with subclasses providing for region-specific 
and master-specific behavior.

The new code structure is (excuse my poor ascii art):

CoprocessorEnvironment - base interface for common facilities provided to CP 
                         implementations
|
|- RegionCoprocessorEnvironment - adds access to current HRegion and 
|    RegionServerServices (for RegionObservers)
|
|- MasterCoprocessorEnvironment - adds access to MasterServerServices (for 
     MasterObservers)

CoprocessorHost - abstract base providing core CP loading and invocation code 
                  and the base CoprocessorEnvironment implementation
|
|- RegionCoprocessorHost - provides hooks for invoking RegionObserver 
|    pre/post methods and RegionCoprocessorEnvironment implementation
|
|- MasterCoprocessorHost - provides hooks for invoking MasterObserver 
     pre/post methods and MasterCoprocessorEnvironment implementation

Also added:
 - org.apache.hadoop.hbase.coprocessor.BaseMasterObserver - stubs out full 
MasterObserver interface with empty methods for convenience
 - org.apache.hadoop.hbase.coprocessor.TestMasterObserver - tests that 
MasterObserver pre/post methods are called during master operations.

In particular, please let me know if the MasterObserver method inputs and 
outputs are sufficient for whatever you anticipate doing with it.  It should 
meet our needs for security checks, but more input would be helpful.


This addresses bug HBASE-3256.
http://issues.apache.org/jira/browse/HBASE-3256


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java 1ffead0 
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java c4fa526 
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/coprocessor/MasterCoprocessorEnvironment.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionCoprocessorEnvironment.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 97198ec 
  src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java 1b7918c 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java 18f7787 
  src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/master/MasterServices.java 593254b 
  src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java f71fea6 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1d48131 
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java PRE-CREATION 
  src/main/resources/hbase-default.xml f1cc4ae 
  src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationEndpoint.java 43569f1 
  src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java 902a60f 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java 8eb2787 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java 5434d01 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java 5f5fc9a 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java 3193abf 
  src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java 5be8daa 

Diff: http://review.cloudera.org/r/1321/diff


Testing
---

Added a new test 

[jira] Commented: (HBASE-3362) If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible

2010-12-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971997#action_12971997
 ] 

HBase Review Board commented on HBASE-3362:
---

Message from: Jonathan Gray jg...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1298/#review2083
---

Ship it!


a few small comments.  i think the loop should change as described in my 
comment (busy loop w/ call to currentTimeMillis as i read it).  otherwise +1, 
good stuff.  we need some tickle util class soon :)


trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
http://review.cloudera.org/r/1298/#comment6529

on this server should probably be left in comment to be clear what this 
is checking



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
http://review.cloudera.org/r/1298/#comment6530

We were not previously but we should probably log this condition



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
http://review.cloudera.org/r/1298/#comment6531

This is a busy wait loop?

Should we add a wait/notify on something passed to the thread and w/ a 
timeout of the period?

And then we should probably also have some kind of max timeout.  Even if 
minutes, there could be weird cluster state where the RS misses META 
availability but someone else might handle it properly, so max timeout might be 
good?



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
http://review.cloudera.org/r/1298/#comment6533

whitespace



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
http://review.cloudera.org/r/1298/#comment6532

maybe this should be warn.  i think i'd want to see it and also logging of 
stack trace (i don't see logging of it elsewhere)


- Jonathan





 If .META. offline between OPENING and OPENED, then wrong server location in 
 .META. is possible
 --

 Key: HBASE-3362
 URL: https://issues.apache.org/jira/browse/HBASE-3362
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.90.0


 This is a good one.  It happened to me testing OOME in split logging.
 * Balancer moves region to new location, regionserver X.
 * New location regionserver X successfully opens the region and then goes to 
 update .META.
 * At this point, the server carrying .META. crashes.
 * Regionserver X is stuck waiting on .META. to come back online.  It takes so 
 long that the master times out the region-in-transition.
 * Master assigns the region elsewhere to regionserver Y
 * It opens successfully on regionserver Y and then it also parks waiting on 
 .META. coming online
 * .META. comes online
 * The two servers X and Y race to update .META.
 I saw a case where server X's edit went in after server Y's edit, which means 
 that lookups in .META. get the wrong server.  HBCK can detect this situation.
 RegionServer X, when it wakes up, correctly notices that it has lost control of 
 the region but the damage is done -- where damage is the .META. edit.
 Chatting with Jon, he suggested that regionserver X should 'rollback' the 
 .META. edit -- do explicit delete of what it added.  This would work, I think, 
 but after chatting more, I'll make a fix that keeps updating the ZooKeeper 
 OPENING state while the edit goes on in a separate thread.  Our continuous 
 setting of OPENING will make it so region-in-transition does not time out.
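
The effect of re-asserting OPENING on the master's timeout can be sketched as 
follows.  This is only an illustration under stated assumptions, not HBase 
code: TransitionTimer, tickle and timedOut are hypothetical names, and time is 
passed in explicitly so the behaviour is easy to follow.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the master's region-in-transition timer. */
class TransitionTimer {
  private final long timeoutMillis;
  private final AtomicLong lastUpdate;

  TransitionTimer(long timeoutMillis, long now) {
    this.timeoutMillis = timeoutMillis;
    this.lastUpdate = new AtomicLong(now);
  }

  /** Called whenever the region server re-asserts the OPENING state. */
  void tickle(long now) {
    lastUpdate.set(now);
  }

  /** True once no update has been seen for longer than the timeout. */
  boolean timedOut(long now) {
    return now - lastUpdate.get() > timeoutMillis;
  }
}

class TransitionTimerDemo {
  public static void main(String[] args) {
    TransitionTimer timer = new TransitionTimer(100, 0);
    // No tickle since t=0, so at t=150 the transition looks dead.
    System.out.println(timer.timedOut(150)); // prints true
    // A tickle at t=90 moves the deadline out to t=190.
    timer.tickle(90);
    System.out.println(timer.timedOut(150)); // prints false
  }
}
```

As long as the server that is parked on .META. keeps tickling, the master has 
no reason to reassign the region, which is what removes the race.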

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3362) If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible

2010-12-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972162#action_12972162
 ] 

HBase Review Board commented on HBASE-3362:
---

Message from: st...@duboce.net


bq.  On 2010-12-16 00:14:36, Jonathan Gray wrote:
bq.   
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java,
 line 96
bq.   http://review.cloudera.org/r/1298/diff/1/?file=18309#file18309line96
bq.  
bq.   We were not previously but we should probably log this condition

We do in the method?


bq.  On 2010-12-16 00:14:36, Jonathan Gray wrote:
bq.   
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java,
 line 173
bq.   http://review.cloudera.org/r/1298/diff/1/?file=18309#file18309line173
bq.  
bq.   This is a busy wait loop?
bq.   
bq.   Should we add a wait/notify on something passed to the thread and w/ 
a timeout of the period?
bq.   
bq.   And then we should probably also have some kind of max timeout.  
Even if minutes, there could be weird cluster state where the RS misses META 
availability but someone else might handle it properly, so max timeout might be 
good?

I need to add a small sleep.  I'd rather do this than wait/notify.  t.isAlive 
should be enough.  Regards max timeout, I should add check if server is stopped 
... and for max timeout, what you think?  Ten minutes?  Then abort?


bq.  On 2010-12-16 00:14:36, Jonathan Gray wrote:
bq.   
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java,
 line 238
bq.   http://review.cloudera.org/r/1298/diff/1/?file=18309#file18309line238
bq.  
bq.   maybe this should be warn.  i think i'd want to see it and also 
logging of stack trace (i don't see logging of it elsewhere)

For sure.


- stack


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1298/#review2083
---








[jira] Commented: (HBASE-3362) If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible

2010-12-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972172#action_12972172
 ] 

HBase Review Board commented on HBASE-3362:
---

Message from: st...@duboce.net


bq.  On 2010-12-16 00:14:36, Jonathan Gray wrote:
bq.   
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java,
 line 173
bq.   http://review.cloudera.org/r/1298/diff/1/?file=18309#file18309line173
bq.  
bq.   This is a busy wait loop?
bq.   
bq.   Should we add a wait/notify on something passed to the thread and w/ 
a timeout of the period?
bq.   
bq.   And then we should probably also have some kind of max timeout.  
Even if minutes, there could be weird cluster state where the RS misses META 
availability but someone else might handle it properly, so max timeout might be 
good?
bq.  
bq.  stack wrote:
bq.  I need to add a small sleep.  I'd rather do this than wait/notify.  
t.isAlive should be enough.  Regards max timeout, I should add check if server 
is stopped ... and for max timeout, what you think?  Ten minutes?  Then abort?
bq.  
bq.  Jonathan Gray wrote:
bq.  I was thinking 5 minutes.
bq.  
bq.  How long you going to sleep for?  That seems like an unideal way to do 
this.  I would prefer wait/notify and have timeout on wait be this 1/3 period, 
but small sleep could work.  If really small, we're in busy loop again.  If too 
big, we increase how long we have to wait.  This is on critical path of every 
single region open.
bq.  
bq.  If we go down path of threads doing work, I don't see why we don't 
want to use wait/notify to let the blocked thread know when it's done.

5 minutes is not enough.  IIRC, it was  5 minutes before the region came back 
online.  Let me see.

I want to avoid mother thread depending on daughter thread signaling it to 
stop... seems redundant when I'm watching the daughter with the isAlive already.

The sleep would be short.  1ms or so.  Normally we'd not trip into the sleep.  
The operation will have completed before we have a chance to sleep.  It'd only 
sleep when no progress can be made.

I'll add wait/notify for you to get this patch cleared past review, np.
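
For reference, one way to keep the "watch the daughter with isAlive" approach 
without either a busy loop or explicit wait/notify is Thread.join with a 
timeout.  The sketch below assumes that design and is not the code in the 
patch; WorkerWaiter, awaitWithTickle and the tickle callback are hypothetical 
names.

```java
/** Hypothetical helper (not from the patch): waits on a worker thread, tickling as it goes. */
class WorkerWaiter {
  /**
   * Blocks until the worker finishes or maxWaitMillis elapses.
   * Runs tickle once per period while the worker is still alive.
   * Returns true if the worker finished, false on timeout or interrupt.
   */
  static boolean awaitWithTickle(Thread worker, long periodMillis,
      long maxWaitMillis, Runnable tickle) {
    if (periodMillis <= 0) {
      // join(0) would wait forever, so insist on a positive period.
      throw new IllegalArgumentException("periodMillis must be positive");
    }
    long deadline = System.currentTimeMillis() + maxWaitMillis;
    while (worker.isAlive()) {
      long remaining = deadline - System.currentTimeMillis();
      if (remaining <= 0) {
        return false; // max timeout reached; the caller decides whether to abort
      }
      try {
        // Sleeps until the worker dies or the period passes; no spinning.
        worker.join(Math.min(periodMillis, remaining));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
      if (worker.isAlive()) {
        tickle.run(); // e.g. re-assert OPENING so the master does not time out
      }
    }
    return true;
  }
}

class WorkerWaiterDemo {
  public static void main(String[] args) {
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(50); // stand-in for the post-open work
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    worker.start();
    boolean finished = WorkerWaiter.awaitWithTickle(
        worker, 10, 1000, () -> System.out.println("tickle"));
    System.out.println("finished=" + finished);
  }
}
```

The waiting thread wakes as soon as the worker exits, so the common case adds 
no sleep latency to a region open, and the max timeout covers the case where 
.META. never comes back.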


- stack


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1298/#review2083
---








[jira] Commented: (HBASE-3260) Coprocessors: Lifecycle management

2010-12-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972312#action_12972312
 ] 

HBase Review Board commented on HBASE-3260:
---

Message from: Gary Helmling ghelml...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1306/
---

Review request for hbase, stack and Andrew Purtell.


Summary
---

This patch adds explicit start() and stop() methods for lifecycle management to 
the Coprocessor interface and refactors some of the Coprocessor/RegionObserver 
distinction, moving the region-related pre/post hooks that were previously in 
Coprocessor to RegionObserver.

Coprocessor is now the base interface, containing only:
 - start()
 - stop()
 - Priority enum
 - State enum

RegionObserver extends Coprocessor, and now contains the additional pre/post 
hooks, moved from Coprocessor:
 - pre/postOpen
 - pre/postClose
 - pre/postFlush
 - pre/postCompact
 - pre/postSplit

This will allow cleaner extension in the future, to allow addition of a 
MasterObserver interface, for example.

As shown above, I've also added a new Coprocessor.State enum consisting of the 
states:
UNINSTALLED - INSTALLED - STARTING - ACTIVE - STOPPING - STOPPED

However, the UNINSTALLED/INSTALLED distinction is not particularly useful at 
the moment.  I'd appreciate other feedback on what's necessary here.  The 
current handling could make do with:
UNINSTALLED - STARTING - ACTIVE - STOPPING - UNINSTALLED (4 total states)

However, the UNINSTALLED/INSTALLED distinction may be useful if we want to add 
class level initialization in the future...
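
To make the proposed ordering concrete, here is a minimal sketch of the six 
states as a linear state machine.  It assumes each state may only advance to 
the one listed after it; LifecycleState and canTransitionTo are hypothetical 
names for illustration and are not the Coprocessor.State enum in the patch.

```java
/** Hypothetical rendering of the six states listed above, in the order given. */
enum LifecycleState {
  UNINSTALLED, INSTALLED, STARTING, ACTIVE, STOPPING, STOPPED;

  /** Assumes a strictly linear lifecycle: only the next listed state is reachable. */
  boolean canTransitionTo(LifecycleState target) {
    return target.ordinal() == this.ordinal() + 1;
  }
}

class LifecycleStateDemo {
  public static void main(String[] args) {
    System.out.println(LifecycleState.STARTING.canTransitionTo(LifecycleState.ACTIVE));   // prints true
    System.out.println(LifecycleState.ACTIVE.canTransitionTo(LifecycleState.STARTING));   // prints false
    System.out.println(LifecycleState.STOPPED.canTransitionTo(LifecycleState.UNINSTALLED)); // prints false
  }
}
```

Collapsing to the four-state variant would mean dropping INSTALLED and letting 
STOPPING return to UNINSTALLED, which the ordinal check above cannot express; 
an explicit transition table would be needed in that case.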


This addresses bug HBASE-3260.
http://issues.apache.org/jira/browse/HBASE-3260


Diffs
-

  
  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java b81a465 
  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java f022598 
  src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java 7ea1c5e 
  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 1792290 
  src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java f028525 
  src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java 3db4c36 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java 81cb75d 

Diff: http://review.cloudera.org/r/1306/diff


Testing
---

Added tests for start() and stop() method invocation in 
org.apache.hadoop.hbase.coprocessor.TestCoprocessorInterface

The existing TestCoprocessorEndpoint, TestCoprocessorInterface, 
TestRegionObserverInterface, TestRegionObserverStacking tests continue to work.  
I'm not seeing any new failures in the rest of the tests, but TestReplication 
is timing out for me, preventing all tests from executing.


Thanks,

Gary




 Coprocessors: Lifecycle management
 --

 Key: HBASE-3260
 URL: https://issues.apache.org/jira/browse/HBASE-3260
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
 Fix For: 0.92.0

 Attachments: statechart.png


 Considering extending CPs to the master, we have no equivalent to 
 pre/postOpen and pre/postClose as on the regionserver. We also should 
 consider how to resolve dependencies and initialization ordering if loading 
 coprocessors that depend on others. 
 OSGi (http://en.wikipedia.org/wiki/OSGi) has a lifecycle API and is familiar 
 to many Java programmers, so we propose to borrow its terminology and state 
 machine.
 A lifecycle layer manages coprocessors as they are dynamically installed, 
 started, stopped, updated and uninstalled. Coprocessors rely on the framework 
 for dependency resolution and class loading. In turn, the framework calls up 
 to lifecycle management methods in the coprocessor as needed.
 A coprocessor transitions between the below states over its lifetime:
 ||State||Description||
 |UNINSTALLED|The coprocessor implementation is not installed. This is the 
 default implicit state.|
 |INSTALLED|The coprocessor implementation has been successfully installed|
 |STARTING|A coprocessor instance is being started.|
 |ACTIVE|The coprocessor instance has been successfully activated and is 
 running.|
 |STOPPING|A coprocessor instance is being stopped.|
 See attached state diagram. Transitions to STOPPING will only happen as the 
 region is being closed. If a coprocessor throws an unhandled exception, this 
 will cause the RegionServer to close the region, stopping all coprocessor 
 instances on it. 
 Transitions from INSTALLED-STARTING and ACTIVE-STOPPING would go through 
 upcall methods into the coprocessor via the CoprocessorLifecycle interface:
 {code:java}
 public interface 

[jira] Commented: (HBASE-3362) If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible

2010-12-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972321#action_12972321
 ] 

HBase Review Board commented on HBASE-3362:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1298/
---

(Updated 2010-12-16 17:01:04.757304)


Review request for hbase and Jonathan Gray.


Changes
---

I implemented Jon's suggestions and then some.  Not pretty but works in my 
local and cluster testing.


Summary
---

M 
src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
 Removed stale comments and TODOs.

 Added a 'version' data member, the znode edit version, which we keep across the 
 open process.

 Refactored the setting of OPENING out into a new tickleOpening method that is 
 now used in multiple places rather than repeating code.

 Added new PostOpenDeployTasksThread which we run to do the postOpenDeployTasks.
 While it's running, we update the OPENING state if it has been running a while.


This addresses bug hbase-3362.
http://issues.apache.org/jira/browse/hbase-3362


Diffs (updated)
-

  
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
 1050086 

Diff: http://review.cloudera.org/r/1298/diff


Testing
---

Ran it on my cluster. Seems to work as the old code did.


Thanks,

stack







[jira] Commented: (HBASE-3260) Coprocessors: Lifecycle management

2010-12-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972328#action_12972328
 ] 

HBase Review Board commented on HBASE-3260:
---

Message from: Andrew Purtell apurt...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1306/#review2102
---

Ship it!



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java
http://review.cloudera.org/r/1306/#comment6557

What are these arguments about?



src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java
http://review.cloudera.org/r/1306/#comment6558

Should be a WARN?



src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java
http://review.cloudera.org/r/1306/#comment6559

Should be a WARN?



src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java
http://review.cloudera.org/r/1306/#comment6560

Since you are committing a change set in this area, Ryan suggested no need 
for AtomicBoolean here, could just be plain volatile boolean. I think that's 
right.


- Andrew





 Coprocessors: Lifecycle management
 --

 Key: HBASE-3260
 URL: https://issues.apache.org/jira/browse/HBASE-3260
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
 Fix For: 0.92.0

 Attachments: statechart.png


 Considering extending CPs to the master, we have no equivalent to 
 pre/postOpen and pre/postClose as on the regionserver. We also should 
 consider how to resolve dependencies and initialization ordering if loading 
 coprocessors that depend on others. 
 OSGi (http://en.wikipedia.org/wiki/OSGi) has a lifecycle API and is familiar 
 to many Java programmers, so we propose to borrow its terminology and state 
 machine.
 A lifecycle layer manages coprocessors as they are dynamically installed, 
 started, stopped, updated and uninstalled. Coprocessors rely on the framework 
 for dependency resolution and class loading. In turn, the framework calls up 
 to lifecycle management methods in the coprocessor as needed.
 A coprocessor transitions between the below states over its lifetime:
 ||State||Description||
 |UNINSTALLED|The coprocessor implementation is not installed. This is the 
 default implicit state.|
 |INSTALLED|The coprocessor implementation has been successfully installed|
 |STARTING|A coprocessor instance is being started.|
 |ACTIVE|The coprocessor instance has been successfully activated and is 
 running.|
 |STOPPING|A coprocessor instance is being stopped.|
 See attached state diagram. Transitions to STOPPING will only happen as the 
 region is being closed. If a coprocessor throws an unhandled exception, this 
 will cause the RegionServer to close the region, stopping all coprocessor 
 instances on it. 
 Transitions from INSTALLED-STARTING and ACTIVE-STOPPING would go through 
 upcall methods into the coprocessor via the CoprocessorLifecycle interface:
 {code:java}
 public interface CoprocessorLifecycle {
   void start(CoprocessorEnvironment env) throws IOException; 
   void stop(CoprocessorEnvironment env) throws IOException;
 }
 {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3362) If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible

2010-12-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972331#action_12972331
 ] 

HBase Review Board commented on HBASE-3362:
---

Message from: Jonathan Gray jg...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1298/#review2103
---

Ship it!


it's getting pretty crazy but this looks good.

it's unfortunate we have all these extra node transitioning methods inside this 
class.  this pattern of doing node transitions and tracking expected version is 
very common and we'll probably have more of it so we should look at doing some 
kind of generic abstraction for that pattern soon.
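
As a rough idea of what such an abstraction might look like, here is a minimal 
in-memory sketch of a transition guarded by an expected version.  It is an 
assumption-laden illustration, not HBase or ZooKeeper code: VersionedNode and 
its methods are hypothetical, and the node is a plain Java object standing in 
for a znode.

```java
/** Hypothetical in-memory stand-in for a znode: a state plus an edit version. */
class VersionedNode {
  private String state;
  private int version = 0;

  VersionedNode(String initialState) {
    this.state = initialState;
  }

  /**
   * Moves to newState only if the caller's expected version still matches.
   * Returns the new version, or -1 if another writer changed the node first.
   */
  synchronized int transition(int expectedVersion, String newState) {
    if (expectedVersion != version) {
      return -1;
    }
    state = newState;
    version++;
    return version;
  }

  synchronized String state() {
    return state;
  }
}

class VersionedNodeDemo {
  public static void main(String[] args) {
    VersionedNode node = new VersionedNode("OPENING");
    // Server Y re-asserts OPENING first and now holds version 1.
    int yVersion = node.transition(0, "OPENING");
    // Server X still believes the version is 0, so its update is rejected.
    int xResult = node.transition(0, "OPENED");
    System.out.println(yVersion); // prints 1
    System.out.println(xResult);  // prints -1
    System.out.println(node.transition(yVersion, "OPENED")); // prints 2
  }
}
```

Each caller only has to carry the last version it was handed, which is the 
bookkeeping that is currently repeated inside the handler.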

+1 for commit, thanks for the changes


trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
http://review.cloudera.org/r/1298/#comment6561

typo 'initalizes' but good comment



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
http://review.cloudera.org/r/1298/#comment6562

interesting thing is... we only use this progressable if we do a log 
replay.  in that case, a region open is not really idempotent as we treat it 
here.

outside scope of this jira but something to think about.


- Jonathan








[jira] Commented: (HBASE-3260) Coprocessors: Lifecycle management

2010-12-16 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972336#action_12972336
 ] 

HBase Review Board commented on HBASE-3260:
---

Message from: Gary Helmling ghelml...@gmail.com


bq.  On 2010-12-16 17:09:44, Andrew Purtell wrote:
bq.   
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java, 
line 66
bq.   http://review.cloudera.org/r/1306/diff/1/?file=18379#file18379line66
bq.  
bq.   What are these arguments about?

Those are:
- String protocol
- long clientVersion

from org.apache.hadoop.ipc.VersionedProtocol.

Will fix these up.


bq.  On 2010-12-16 17:09:44, Andrew Purtell wrote:
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java, 
line 289
bq.   http://review.cloudera.org/r/1306/diff/1/?file=18383#file18383line289
bq.  
bq.   Should be a WARN?

Yeah, agree.  Will fix.


bq.  On 2010-12-16 17:09:44, Andrew Purtell wrote:
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java, 
line 305
bq.   http://review.cloudera.org/r/1306/diff/1/?file=18383#file18383line305
bq.  
bq.   Should be a WARN?

Yeah, will fix.


bq.  On 2010-12-16 17:09:44, Andrew Purtell wrote:
bq.   src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java, 
line 385
bq.   http://review.cloudera.org/r/1306/diff/1/?file=18383#file18383line385
bq.  
bq.   Since you are committing a change set in this area, Ryan suggested 
no need for AtomicBoolean here, could just be plain volatile boolean. I think 
that's right.

Ok will change this to a volatile boolean and repost.
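
For context, a plain volatile boolean is enough when the flag is only written 
by one side and read by the other, with no compare-and-set needed.  A minimal 
sketch, assuming that usage (StoppableWorker is a hypothetical class, not the 
field in CoprocessorHost):

```java
/** Hypothetical worker with a stop flag; illustrative only. */
class StoppableWorker implements Runnable {
  // volatile makes the write in stop() visible to the loop in run().
  private volatile boolean stopped = false;

  void stop() {
    stopped = true;
  }

  boolean isStopped() {
    return stopped;
  }

  @Override
  public void run() {
    while (!stopped) {
      Thread.yield(); // stand-in for real work
    }
  }
}

class StoppableWorkerDemo {
  public static void main(String[] args) throws InterruptedException {
    StoppableWorker worker = new StoppableWorker();
    Thread thread = new Thread(worker);
    thread.start();
    worker.stop();
    thread.join(2000);
    System.out.println("alive=" + thread.isAlive());
  }
}
```

AtomicBoolean would only earn its keep if the code needed an atomic 
check-then-set, such as compareAndSet(false, true) to guarantee that exactly 
one caller performs the shutdown.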


- Gary


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1306/#review2102
---








[jira] Commented: (HBASE-3360) ReplicationLogCleaner is enabled by default in 0.90 -- causes NPE

2010-12-15 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971771#action_12971771
 ] 

HBase Review Board commented on HBASE-3360:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1293/#review2074
---

Ship it!


+1


/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1293/#comment6510

Don't need this (found by J-D reviewing this over my shoulder)



/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java
http://review.cloudera.org/r/1293/#comment6511

Call this 'decorateMasterConfiguration' or something other than instrument.


- stack





 ReplicationLogCleaner is enabled by default in 0.90 -- causes NPE
 -

 Key: HBASE-3360
 URL: https://issues.apache.org/jira/browse/HBASE-3360
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Jean-Daniel Cryans
 Fix For: 0.90.0


 {code}
 2010-12-15 00:33:17,706 ERROR org.apache.hadoop.hbase.master.LogCleaner: 
 Caught exception
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.isLogDeletable(ReplicationLogCleaner.java:59)
 at 
 org.apache.hadoop.hbase.master.LogCleaner.chore(LogCleaner.java:138)
 at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
 at org.apache.hadoop.hbase.master.LogCleaner.run(LogCleaner.java:165)
 {code}
 Assigning J-D at his request.




[jira] Commented: (HBASE-3362) If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible

2010-12-15 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971896#action_12971896
 ] 

HBase Review Board commented on HBASE-3362:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1298/
---

Review request for hbase and Jonathan Gray.


Summary
---

M 
src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
 Removed stale comments and TODOs.

 Added a 'version' data member, the znode edit version, which we keep across the 
open process.

 Refactored the setting of OPENING out into a new tickleOpening method that is 
 now used in multiple places rather than repeating code.

 Added new PostOpenDeployTasksThread, which we run to do the postOpenDeployTasks.
 While it is running, we keep updating the OPENING state if it takes a while.
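
A rough sketch of the "tickle while a worker thread runs" idea, assuming a plain
Thread and a periodic callback. The class and method names below are
hypothetical, and the callback stands in for whatever re-sets OPENING in
ZooKeeper; this is not the patch's code.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: keep refreshing a state marker while a slow task runs. */
final class TickleWhileWorking {

  /**
   * Runs task on its own thread and invokes tickle every periodMs (must be
   * greater than zero) until the task finishes. Returns the number of tickles.
   */
  static int runWithTickle(Runnable task, Runnable tickle, long periodMs)
      throws InterruptedException {
    Thread worker = new Thread(task, "post-open-deploy-tasks");
    worker.start();
    int tickles = 0;
    while (worker.isAlive()) {
      // Wait up to one period for the worker; returns early if it finishes.
      worker.join(periodMs);
      if (worker.isAlive()) {
        tickle.run(); // e.g. re-set OPENING so the master's timeout restarts
        tickles++;
      }
    }
    return tickles;
  }

  public static void main(String[] args) throws InterruptedException {
    AtomicInteger refreshed = new AtomicInteger();
    Runnable slowCatalogEdit = () -> {
      try {
        Thread.sleep(250); // pretend the catalog edit is blocked for a while
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    };
    int n = runWithTickle(slowCatalogEdit, refreshed::incrementAndGet, 50);
    System.out.println("tickled " + n + " times while the edit was pending");
  }
}
```

The waiting thread does the tickling here so the worker stays a single blocking
call; a real handler would also need to stop if a tickle itself fails.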


This addresses bug hbase-3362.
http://issues.apache.org/jira/browse/hbase-3362


Diffs
-

  
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
 1049707 

Diff: http://review.cloudera.org/r/1298/diff


Testing
---

Ran it on my cluster. Seems to work as the old code did.


Thanks,

stack




 If .META. offline between OPENING and OPENED, then wrong server location in 
 .META. is possible
 --

 Key: HBASE-3362
 URL: https://issues.apache.org/jira/browse/HBASE-3362
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.90.0


 This is a good one.  It happened to me testing OOME in split logging.
 * Balancer moves region to new location, regionserver X.
 * New location regionserver X successfully opens the region and then goes to 
 update .META.
 * At this point, the server carrying .META. crashes.
 * Regionserver X is stuck waiting on .META. to come back online.  It takes so 
 long that the master times out the region-in-transition
 * Master assigns the region elsewhere to regionserver Y
 * It opens successfully on regionserver Y and then it also parks waiting on 
 .META. coming online
 * .META. comes online
 * The two servers X and Y race to update .META.
 I saw a case where server X's edit went in after server Y's edit, which means 
 that lookups in .META. get the wrong server.  HBCK can detect this situation.
 RegionServer X, when it wakes up, correctly notices that it has lost control of 
 the region, but the damage is done -- where the damage is the .META. edit.
 Chatting with Jon, he suggested that regionserver X should 'rollback' the 
 .META. edit -- do explicit delete of what it added.  This would work I think 
 but chatting more, I'll make a fix that keeps updating the zookeeper OPENING 
 state while edit goes on in a separate thread.  Our continuous setting of 
 OPENING will make it so region-in-transition does not timeout.




[jira] Commented: (HBASE-3360) ReplicationLogCleaner is enabled by default in 0.90 -- causes NPE

2010-12-14 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971517#action_12971517
 ] 

HBase Review Board commented on HBASE-3360:
---

Message from: Jean-Daniel Cryans jdcry...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1293/
---

Review request for hbase.


Summary
---

Patch that removes ReplicationLogCleaner from hbase-default.xml and instead 
injects from the Replication class. There's also some cleanup on how HConstants 
are used.
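
A minimal sketch of injecting the cleaner from code instead of listing it in the
default configuration. java.util.Properties stands in for Hadoop's Configuration,
and the two key names are assumptions for illustration; only the
ReplicationLogCleaner class name is taken from the stack trace in the issue.

```java
import java.util.Properties;

/** Hypothetical sketch; java.util.Properties stands in for Hadoop's Configuration. */
final class CleanerPluginInjector {
  // Key names are assumptions for illustration.
  static final String REPLICATION_ENABLED = "hbase.replication";
  static final String CLEANER_PLUGINS = "hbase.master.logcleaner.plugins";
  static final String REPLICATION_CLEANER =
      "org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner";

  /** Appends the replication cleaner to the plugin list only when replication is on. */
  static void decorate(Properties conf) {
    if (!Boolean.parseBoolean(conf.getProperty(REPLICATION_ENABLED, "false"))) {
      return; // replication off: leave the default cleaner chain untouched
    }
    String plugins = conf.getProperty(CLEANER_PLUGINS, "");
    if (plugins.contains(REPLICATION_CLEANER)) {
      return; // already registered, keep the call idempotent
    }
    conf.setProperty(
        CLEANER_PLUGINS,
        plugins.isEmpty() ? REPLICATION_CLEANER : plugins + "," + REPLICATION_CLEANER);
  }
}
```

With this shape a cluster that never enables replication never loads the
cleaner, which is the situation the NPE in the issue came from.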


This addresses bug HBASE-3360.
http://issues.apache.org/jira/browse/HBASE-3360


Diffs
-

  /trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java 1049375 
  /trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1049375 
  /trunk/src/main/java/org/apache/hadoop/hbase/master/LogCleaner.java 1049375 
  
/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java
 1049375 
  /trunk/src/main/resources/hbase-default.xml 1049375 

Diff: http://review.cloudera.org/r/1293/diff


Testing
---


Thanks,

Jean-Daniel




 ReplicationLogCleaner is enabled by default in 0.90 -- causes NPE
 -

 Key: HBASE-3360
 URL: https://issues.apache.org/jira/browse/HBASE-3360
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Jean-Daniel Cryans
 Fix For: 0.90.0


 {code}
 2010-12-15 00:33:17,706 ERROR org.apache.hadoop.hbase.master.LogCleaner: 
 Caught exception
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.isLogDeletable(ReplicationLogCleaner.java:59)
 at 
 org.apache.hadoop.hbase.master.LogCleaner.chore(LogCleaner.java:138)
 at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
 at org.apache.hadoop.hbase.master.LogCleaner.run(LogCleaner.java:165)
 {code}
 Assigning J-D at his request.




[jira] Commented: (HBASE-3348) Allow Observers to completely override base function

2010-12-14 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971531#action_12971531
 ] 

HBase Review Board commented on HBASE-3348:
---

Message from: Andrew Purtell apurt...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1295/
---

Review request for hbase, Jonathan Gray and Mingjie Lai.


Summary
---

Currently an observer can act as a filter or translator but cannot stop a 
subsequent call down to the base method for get, put, delete, etc. This patch 
allows observers to 1) keep any subsequently chained observer from executing, 
or 2) prevent default behavior from executing. This latter option allows a 
preXXX hook to completely reimplement something.
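
The two options can be sketched as a pair of per-call flags checked by the
framework. Everything below (ObserverChain, Context, the bypass and complete
fields, the string-based get) is a hypothetical illustration, not the API in
the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of pre-hook chaining; none of these names come from the patch. */
final class ObserverChain {
  /** Per-call flags an observer can set. */
  static final class Context {
    boolean bypass;   // skip the default (base) behavior
    boolean complete; // skip the remaining observers in the chain
  }

  interface Observer {
    /** May add to result and set flags on ctx. */
    void preGet(Context ctx, String row, List<String> result);
  }

  private final List<Observer> observers = new ArrayList<>();

  void add(Observer o) {
    observers.add(o);
  }

  List<String> get(String row) {
    Context ctx = new Context();
    List<String> result = new ArrayList<>();
    for (Observer o : observers) {
      o.preGet(ctx, row, result);
      if (ctx.complete) {
        break; // option 1: later observers do not run
      }
    }
    if (!ctx.bypass) {
      result.add("base:" + row); // default behavior runs unless option 2 was taken
    }
    return result;
  }
}
```

An observer that sets both flags effectively reimplements the operation: its
result is returned as-is and nothing downstream executes.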

I also found and fixed some logic bugs in coprocessor framework integration in 
HRegion.

I will squelch the added extraneous whitespace upon commit.


This addresses bug HBASE-3348.
http://issues.apache.org/jira/browse/HBASE-3348


Diffs
-

  
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java
 134ed2f 
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java 
654b179 
  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 10dfff4 
  src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java 
c57ca0c 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java cf9cad0 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 8248f5f 
  src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java 
345790f 
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java
 9ef3562 

Diff: http://review.cloudera.org/r/1295/diff


Testing
---

All coprocessor unit tests pass. No failures of other unit tests observed that 
might be related to these changes.


Thanks,

Andrew




 Allow Observers to completely override base function
 

 Key: HBASE-3348
 URL: https://issues.apache.org/jira/browse/HBASE-3348
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.92.0

 Attachments: HBASE-3348.patch


 Currently an observer can act as a filter or translator but cannot stop a 
 subsequent call down to the base method for get, put, delete, etc. This means 
 an observer cannot completely override the base function. To deal with this 
 we can:
 - Change the preXXX methods to return the same type as the postXXX methods, 
 the same return type of the base method. 
 - Extend {{Coprocessor.Environment}} with methods that get/set a should 
 continue flag. 
 The framework should check the should continue flag before calling the base 
 method. If not, just return what was returned by the preXXX method. 




[jira] Commented: (HBASE-3348) Allow Observers to completely override base function

2010-12-14 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971534#action_12971534
 ] 

HBase Review Board commented on HBASE-3348:
---

Message from: Ryan Rawson ryano...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1295/#review2064
---

Ship it!


- Ryan





 Allow Observers to completely override base function
 

 Key: HBASE-3348
 URL: https://issues.apache.org/jira/browse/HBASE-3348
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.92.0

 Attachments: HBASE-3348.patch


 Currently an observer can act as a filter or translator but cannot stop a 
 subsequent call down to the base method for get, put, delete, etc. This means 
 an observer cannot completely override the base function. To deal with this 
 we can:
 - Change the preXXX methods to return the same type as the postXXX methods, 
 the same return type of the base method. 
 - Extend {{Coprocessor.Environment}} with methods that get/set a should 
 continue flag. 
 The framework should check the should continue flag before calling the base 
 method. If not, just return what was returned by the preXXX method. 




[jira] Commented: (HBASE-3348) Allow Observers to completely override base function

2010-12-14 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971533#action_12971533
 ] 

HBase Review Board commented on HBASE-3348:
---

Message from: Ryan Rawson ryano...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1295/#review2063
---



src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java
http://review.cloudera.org/r/1295/#comment6478

presumably a co-processor could modify the Get object to implement policy?  
Another consideration is replacing the Get query with an alternate query, for 
example we have InternalGet subclasses for additional functionality, I'm just 
winging this though.



src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java
http://review.cloudera.org/r/1295/#comment6479

unless you need CAS semantics, you can just use volatile here.  We are 
over-using the Atomic* stuff sometimes.
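
A small sketch of the distinction, with hypothetical field names: volatile is
enough for a flag that is only written and read, while an Atomic* type is
needed when a check-then-set must happen as one step.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Sketch contrasting a plain visibility flag with a compare-and-set guard. */
final class Flags {
  // volatile is enough when threads only write and read the latest value.
  private volatile boolean stopRequested;

  // An Atomic* type earns its keep when a check-then-act must be one step.
  private final AtomicBoolean initialized = new AtomicBoolean(false);

  void requestStop() {
    stopRequested = true;
  }

  boolean isStopRequested() {
    return stopRequested;
  }

  /** Returns true for exactly one caller, even under contention. */
  boolean tryInitialize() {
    return initialized.compareAndSet(false, true);
  }
}
```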


- Ryan





 Allow Observers to completely override base function
 

 Key: HBASE-3348
 URL: https://issues.apache.org/jira/browse/HBASE-3348
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.92.0

 Attachments: HBASE-3348.patch


 Currently an observer can act as a filter or translator but cannot stop a 
 subsequent call down to the base method for get, put, delete, etc. This means 
 an observer cannot completely override the base function. To deal with this 
 we can:
 - Change the preXXX methods to return the same type as the postXXX methods, 
 the same return type of the base method. 
 - Extend {{Coprocessor.Environment}} with methods that get/set a should 
 continue flag. 
 The framework should check the should continue flag before calling the base 
 method. If not, just return what was returned by the preXXX method. 




[jira] Commented: (HBASE-3328) Admin API: Explicit Split Points

2010-12-13 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971071#action_12971071
 ] 

HBase Review Board commented on HBASE-3328:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1281/
---

(Updated 2010-12-13 15:08:04.932875)


Review request for hbase.


Changes
---

refactored HRegion::forceSplit() api


Summary
---

Add the ability to explicitly split an existing region at a user-specified 
point. Currently, you can disable automated splitting and can presplit a 
newly-created table at explicit boundaries, but cannot explicitly bound a split 
of an existing region.
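
One check an explicit split needs is that the requested point actually falls
inside the region. The helper below is a hypothetical sketch of that check, not
code from the patch; it assumes row keys compare as unsigned bytes and that an
empty start or end key means the region is unbounded on that side.

```java
/** Hypothetical helper; not taken from the patch. */
final class SplitPoints {
  /** Unsigned lexicographic comparison of two row keys. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  /**
   * A split point is usable only if startKey < splitPoint < endKey.
   * An empty key is treated as "unbounded" on that side (an assumption).
   */
  static boolean isValidSplitPoint(byte[] startKey, byte[] endKey, byte[] splitPoint) {
    if (splitPoint.length == 0) {
      return false;
    }
    boolean afterStart = startKey.length == 0 || compare(splitPoint, startKey) > 0;
    boolean beforeEnd = endKey.length == 0 || compare(splitPoint, endKey) < 0;
    return afterStart && beforeEnd;
  }
}
```

A split point equal to the start key is rejected because it would leave one
daughter region empty.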


This addresses bug HBASE-3328.
http://issues.apache.org/jira/browse/HBASE-3328


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 2fba18e 
  src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 2c109ae 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java cf9cad0 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 26b4c10 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 4717938 
  src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 8560d22 

Diff: http://review.cloudera.org/r/1281/diff


Testing
---

- mvn test -Dtest=TestAdmin
- mvn test (underway)
- cluster testing
Note: this was primarily cluster-tested with 0.89 master.


Thanks,

Nicolas




 Admin API: Explicit Split Points
 

 Key: HBASE-3328
 URL: https://issues.apache.org/jira/browse/HBASE-3328
 Project: HBase
  Issue Type: Improvement
  Components: client, ipc
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor

 Add the ability to explicitly split an existing region at a user-specified 
 point.  Currently, you can disable automated splitting and can presplit a 
 newly-created table at explicit boundaries, but cannot explicitly bound a 
 split of an existing region.




[jira] Commented: (HBASE-3328) Admin API: Explicit Split Points

2010-12-11 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12970583#action_12970583
 ] 

HBase Review Board commented on HBASE-3328:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1281/
---

(Updated 2010-12-11 23:21:55.350465)


Review request for hbase.


Changes
---

I was porting this from my 0.89 diff.  In 0.90, we can just directly add an 
HRegionInterface RPC and not worry about incrementing the HRegionInfo VERSION.  
Much cleaner and allows for rolling upgrades / mixed version environments.


Summary
---

Add the ability to explicitly split an existing region at a user-specified 
point. Currently, you can disable automated splitting and can presplit a 
newly-created table at explicit boundaries, but cannot explicitly bound a split 
of an existing region.


This addresses bug HBASE-3328.
http://issues.apache.org/jira/browse/HBASE-3328


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 2fba18e 
  src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 2c109ae 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java cf9cad0 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 26b4c10 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 4717938 
  src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 8560d22 

Diff: http://review.cloudera.org/r/1281/diff


Testing
---

- mvn test -Dtest=TestAdmin
- mvn test (underway)
- cluster testing
Note: this was primarily cluster-tested with 0.89 master.


Thanks,

Nicolas




 Admin API: Explicit Split Points
 

 Key: HBASE-3328
 URL: https://issues.apache.org/jira/browse/HBASE-3328
 Project: HBase
  Issue Type: Improvement
  Components: client, ipc
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor

 Add the ability to explicitly split an existing region at a user-specified 
 point.  Currently, you can disable automated splitting and can presplit a 
 newly-created table at explicit boundaries, but cannot explicitly bound a 
 split of an existing region.




[jira] Commented: (HBASE-3328) Admin API: Explicit Split Points

2010-12-09 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969941#action_12969941
 ] 

HBase Review Board commented on HBASE-3328:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1281/
---

Review request for hbase.


Summary
---

Add the ability to explicitly split an existing region at a user-specified 
point. Currently, you can disable automated splitting and can presplit a 
newly-created table at explicit boundaries, but cannot explicitly bound a split 
of an existing region.


This addresses bug HBASE-3328.
http://issues.apache.org/jira/browse/HBASE-3328


Diffs
-

  src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 2e601e1 
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 2fba18e 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java cf9cad0 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 26b4c10 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 4717938 
  src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 8560d22 

Diff: http://review.cloudera.org/r/1281/diff


Testing
---

- mvn test -Dtest=TestAdmin
- mvn test (underway)
- cluster testing
Note: this was primarily cluster-tested with 0.89 master.


Thanks,

Nicolas




 Admin API: Explicit Split Points
 

 Key: HBASE-3328
 URL: https://issues.apache.org/jira/browse/HBASE-3328
 Project: HBase
  Issue Type: Improvement
  Components: client, ipc
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor

 Add the ability to explicitly split an existing region at a user-specified 
 point.  Currently, you can disable automated splitting and can presplit a 
 newly-created table at explicit boundaries, but cannot explicitly bound a 
 split of an existing region.




[jira] Commented: (HBASE-3328) Admin API: Explicit Split Points

2010-12-09 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969942#action_12969942
 ] 

HBase Review Board commented on HBASE-3328:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1281/#review2056
---



src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
http://review.cloudera.org/r/1281/#comment6463

note that this means you should not do a rolling upgrade with this patch.


- Nicolas





 Admin API: Explicit Split Points
 

 Key: HBASE-3328
 URL: https://issues.apache.org/jira/browse/HBASE-3328
 Project: HBase
  Issue Type: Improvement
  Components: client, ipc
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor

 Add the ability to explicitly split an existing region at a user-specified 
 point.  Currently, you can disable automated splitting and can presplit a 
 newly-created table at explicit boundaries, but cannot explicitly bound a 
 split of an existing region.




[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions

2010-12-07 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969080#action_12969080
 ] 

HBase Review Board commented on HBASE-3305:
---

Message from: Jonathan Gray jg...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1271/#review2042
---


Almost there.  Some spacing-only changes are still in here, and the logic needs 
to move out into an AM method.


trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1271/#comment6445

still tabbing changes here and next method signature as well



trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1271/#comment6446

Same as stack's original comment: this logic should be in 
AssignmentManager.  I wouldn't reuse the method 'assignAllUserRegions' because 
it says "all" in its name.  A method 'assignUserRegions' which takes a list and 
does a bulk assign w/ round-robin would make sense. 'assignAllUserRegions' could 
then call it once it makes a list of regions.
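
A sketch of the suggested split between the two methods, using plain strings for
regions and servers. Only the two method names come from the comment above; the
class name, signatures and data shapes are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a bulk round-robin plan; regions and servers are plain strings. */
final class RoundRobinAssigner {

  /** Deals the given regions out to the servers in turn. */
  static Map<String, List<String>> assignUserRegions(
      List<String> regions, List<String> servers) {
    if (servers.isEmpty()) {
      throw new IllegalArgumentException("no servers available");
    }
    Map<String, List<String>> plan = new LinkedHashMap<>();
    for (String server : servers) {
      plan.put(server, new ArrayList<>());
    }
    for (int i = 0; i < regions.size(); i++) {
      plan.get(servers.get(i % servers.size())).add(regions.get(i));
    }
    return plan;
  }

  /** The "all" variant only builds the full list, then delegates. */
  static Map<String, List<String>> assignAllUserRegions(
      Map<String, List<String>> regionsByTable, List<String> servers) {
    List<String> all = new ArrayList<>();
    for (List<String> tableRegions : regionsByTable.values()) {
      all.addAll(tableRegions);
    }
    return assignUserRegions(all, servers);
  }
}
```

Table creation would call assignUserRegions with just the new table's regions,
so the per-server counts differ by at most one.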


- Jonathan





 Allow round-robin distribution for table created with multiple regions
 --

 Key: HBASE-3305
 URL: https://issues.apache.org/jira/browse/HBASE-3305
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.20.6
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: hbase-3305-array.patch, 
 hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, 
 hbase-3305.patch


 We can distribute the initial regions created for a new table in round-robin 
 fashion.




[jira] Commented: (HBASE-1861) Multi-Family support for bulk upload tools (HFileOutputFormat / loadtable.rb)

2010-12-07 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969086#action_12969086
 ] 

HBase Review Board commented on HBASE-1861:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1272/
---

Review request for hbase.


Summary
---

support writing to multiple column families for HFileOutputFormat.  also, added 
a max threshold for PutSortReducer because we had some pathological row cases.
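
One plausible reading of the max threshold, sketched below under that
assumption: instead of sorting every value for a row in memory at once, sort
and emit in batches once a size limit is reached. The class name, the use of
strings, and string length as the size measure are all hypothetical; this is
not the reducer's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch of threshold-bounded sorting; not the patch's code. */
final class BoundedSorter {
  /**
   * Sorts values in batches. A batch is emitted once its accumulated size
   * reaches maxSize, so one pathological row never has to fit in memory whole.
   * Each returned batch is sorted; batches are not merged with each other.
   */
  static List<List<String>> sortInBatches(Iterable<String> values, long maxSize) {
    List<List<String>> batches = new ArrayList<>();
    TreeSet<String> current = new TreeSet<>();
    long size = 0;
    for (String v : values) {
      if (current.add(v)) {
        size += v.length(); // string length stands in for a byte count
      }
      if (size >= maxSize) {
        batches.add(new ArrayList<>(current));
        current.clear();
        size = 0;
      }
    }
    if (!current.isEmpty()) {
      batches.add(new ArrayList<>(current));
    }
    return batches;
  }
}
```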


This addresses bug HBASE-1861.
http://issues.apache.org/jira/browse/HBASE-1861


Diffs
-

  src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 
8ccdf4d 
  src/main/java/org/apache/hadoop/hbase/mapreduce/PutSortReducer.java 5fb3e83 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 
c5d56cc 

Diff: http://review.cloudera.org/r/1272/diff


Testing
---

mvn test -Dtest=TestHFileOutputFormat
internal MR testing


Thanks,

Nicolas




 Multi-Family support for bulk upload tools (HFileOutputFormat / loadtable.rb)
 -

 Key: HBASE-1861
 URL: https://issues.apache.org/jira/browse/HBASE-1861
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Affects Versions: 0.20.0
Reporter: Jonathan Gray
Assignee: Nicolas Spiegelberg
 Fix For: 0.92.0

 Attachments: HBASE1861-incomplete.patch


 Add multi-family support to bulk upload tools from HBASE-48.




[jira] Commented: (HBASE-3308) SplitTransaction.splitStoreFiles slows splits a lot

2010-12-07 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969098#action_12969098
 ] 

HBase Review Board commented on HBASE-3308:
---

Message from: Jean-Daniel Cryans jdcry...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1273/
---

Review request for hbase.


Summary
---

Patch that parallelizes the splitting of the files using ThreadPoolExecutor and 
Futures. The code is a bit ugly, but does the job really well as shown during 
cluster testing (which also uncovered HBASE-3318).

One new behavior this patch adds is that it's now possible to rollback a split 
because it took too long to split the files. I did some testing with a timeout 
of 5 secs on my cluster; even though each machine did a few rollbacks, the import 
went fine. The default is 30 seconds and isn't in hbase-default.xml as I don't 
think anyone would really want to change that.
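
A rough sketch of the approach using an ExecutorService and Futures, where a
timeout is reported as an IOException so the caller can roll the split back.
The class, the Function parameter standing in for reference-file creation, and
the pool sizing (one thread per file) are assumptions for illustration, not the
patch's code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical sketch of splitting store files in parallel with a deadline. */
final class ParallelSplitter {

  /**
   * Applies makeReferences to every store file in parallel. Throws IOException
   * if the work does not finish within timeoutMs, so the caller can roll back.
   */
  static List<String> splitAll(
      List<String> storeFiles, Function<String, String> makeReferences, long timeoutMs)
      throws IOException {
    List<String> results = new ArrayList<>();
    if (storeFiles.isEmpty()) {
      return results;
    }
    ExecutorService pool = Executors.newFixedThreadPool(storeFiles.size());
    List<Future<String>> futures = new ArrayList<>();
    for (String file : storeFiles) {
      Callable<String> task = () -> makeReferences.apply(file);
      futures.add(pool.submit(task));
    }
    pool.shutdown(); // accept no new tasks; submitted ones keep running
    try {
      if (!pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
        pool.shutdownNow(); // interrupt whatever is still running
        throw new IOException("Took too long to split the files; roll back");
      }
      for (Future<String> f : futures) {
        results.add(f.get()); // all tasks are done, so this does not block
      }
      return results;
    } catch (InterruptedException e) {
      pool.shutdownNow();
      Thread.currentThread().interrupt();
      throw new IOException("Interrupted while splitting", e);
    } catch (ExecutionException e) {
      throw new IOException("Split task failed", e.getCause());
    }
  }
}
```

Sizing the pool to the number of files keeps the sketch short; a cap on the
thread count would be a one-line change to the newFixedThreadPool argument.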


This addresses bug HBASE-3308.
http://issues.apache.org/jira/browse/HBASE-3308


Diffs
-

  
/branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
 1043188 

Diff: http://review.cloudera.org/r/1273/diff


Testing
---


Thanks,

Jean-Daniel




 SplitTransaction.splitStoreFiles slows splits a lot
 ---

 Key: HBASE-3308
 URL: https://issues.apache.org/jira/browse/HBASE-3308
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.0


 Recently I've been seeing some slow splits in our production environment 
 triggering timeouts, so I decided to take a closer look into the issue.
 According to my debugging, we spend almost all the time it takes to split on 
 creating the reference files. Each file in my testing takes at least 300ms to 
 create, and averages around 600ms. Since we create two references per store 
 file, it means that a region with 4 store files can easily take up to 5 
 seconds to split just to create those references.
 An intuitive improvement would be to create those files in parallel, so at 
 least it wouldn't be much slower when we're splitting a higher number of 
 files. Stack left the following comment in the code:
 {noformat}
 // TODO: If the below were multithreaded would we complete steps in less
 // elapsed time?  St.Ack 20100920
 {noformat}




[jira] Commented: (HBASE-3308) SplitTransaction.splitStoreFiles slows splits a lot

2010-12-07 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969113#action_12969113
 ] 

HBase Review Board commented on HBASE-3308:
---

Message from: Jean-Daniel Cryans jdcry...@apache.org


bq.  On 2010-12-07 17:02:49, stack wrote:
bq.   /branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java, line 400
bq.   http://review.cloudera.org/r/1273/diff/1/?file=17980#file17980line400
bq.  
bq.   Why not have an upper bound?  If there are 100 files, that's 100 threads 
bq.   doing FS operations.  I bet if you had an upper bound of 10 on the 
bq.   ExecutorService, it would complete faster than an unbounded ExecutorService.

I think we are already bounded by hbase.hstore.blockingStoreFiles


- Jean-Daniel


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1273/#review2043
---





 SplitTransaction.splitStoreFiles slows splits a lot
 ---

 Key: HBASE-3308
 URL: https://issues.apache.org/jira/browse/HBASE-3308
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.0


 Recently I've been seeing some slow splits in our production environment 
 triggering timeouts, so I decided to take a closer look into the issue.
 According to my debugging, we spend almost all the time it takes to split on 
 creating the reference files. Each file in my testing takes at least 300ms to 
 create, and averages around 600ms. Since we create two references per store 
 file, it means that a region with 4 store files can easily take up to 5 
 seconds to split just to create those references.
 An intuitive improvement would be to create those files in parallel, so at 
 least it wouldn't be much slower when we're splitting a higher number of 
 files. Stack left the following comment in the code:
 {noformat}
 // TODO: If the below were multithreaded would we complete steps in less
 // elapsed time?  St.Ack 20100920
 {noformat}




[jira] Commented: (HBASE-1861) Multi-Family support for bulk upload tools (HFileOutputFormat / loadtable.rb)

2010-12-07 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969118#action_12969118
 ] 

HBase Review Board commented on HBASE-1861:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1272/#review2044
---

Ship it!


+1  Excellent.


src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
http://review.cloudera.org/r/1272/#comment6448

Should this behavior be documented in method javadoc?


- stack





 Multi-Family support for bulk upload tools (HFileOutputFormat / loadtable.rb)
 -

 Key: HBASE-1861
 URL: https://issues.apache.org/jira/browse/HBASE-1861
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Affects Versions: 0.20.0
Reporter: Jonathan Gray
Assignee: Nicolas Spiegelberg
 Fix For: 0.92.0

 Attachments: HBASE1861-incomplete.patch


 Add multi-family support to bulk upload tools from HBASE-48.




[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions

2010-12-07 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969148#action_12969148
 ] 

HBase Review Board commented on HBASE-3305:
---

Message from: Ted Yu ted...@yahoo.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1271/#review2048
---



trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1271/#comment6455

I wrap InterruptedException in IOException.
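
For reference, a minimal sketch of that wrapping pattern. The class name, the
method and the latch are hypothetical; restoring the interrupt flag before
rethrowing is a common convention and an addition here, not something the
comment above states.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of surfacing an interrupt through an IOException-only API. */
final class InterruptWrapping {
  /** Waits for the latch, reporting interruption as an IOException. */
  static boolean awaitAssignment(CountDownLatch done, long timeoutMs) throws IOException {
    try {
      return done.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // keep the interrupt visible to callers
      throw new IOException("Interrupted while waiting for assignment", e);
    }
  }
}
```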


- Ted





 Allow round-robin distribution for table created with multiple regions
 --

 Key: HBASE-3305
 URL: https://issues.apache.org/jira/browse/HBASE-3305
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.20.6
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: hbase-3305-array.patch, 
 hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, 
 hbase-3305.patch


 We can distribute the initial regions created for a new table in round-robin 
 fashion.




[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions

2010-12-07 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969149#action_12969149
 ] 

HBase Review Board commented on HBASE-3305:
---

Message from: Ted Yu ted...@yahoo.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1271/
---

(Updated 2010-12-07 18:25:05.129171)


Review request for hbase, stack and Jonathan Gray.


Changes
---

I used Organize Imports in Eclipse for AssignmentManager


Summary
---

Adopted round-robin assignment as default for regions specified when table is 
created.


This addresses bug HBASE-3305.
http://issues.apache.org/jira/browse/HBASE-3305


Diffs (updated)
-

  trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 
1043216 
  trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1043216 
  trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 1043216 
  trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 1043216 

Diff: http://review.cloudera.org/r/1271/diff


Testing
---

Put unit tests for this change inside TestAdmin.testCreateTableWithRegions()
They passed.


Thanks,

Ted







[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions

2010-12-07 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969153#action_12969153
 ] 

HBase Review Board commented on HBASE-3305:
---

Message from: Ted Yu ted...@yahoo.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1271/
---

(Updated 2010-12-07 18:28:46.368066)


Review request for hbase, stack and Jonathan Gray.


Changes
---

Reverted movement of imports


Summary
---

Adopted round-robin assignment as default for regions specified when table is 
created.


This addresses bug HBASE-3305.
http://issues.apache.org/jira/browse/HBASE-3305


Diffs (updated)
-

  trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 
1043216 
  trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1043216 
  trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 1043216 
  trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 1043216 

Diff: http://review.cloudera.org/r/1271/diff


Testing
---

Put unit tests for this change inside TestAdmin.testCreateTableWithRegions()
They passed.


Thanks,

Ted







[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions

2010-12-07 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969162#action_12969162
 ] 

HBase Review Board commented on HBASE-3305:
---

Message from: Jonathan Gray jg...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1271/#review2049
---

Ship it!


looks good, thanks ted!  i will commit the final patch to trunk.


trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
http://review.cloudera.org/r/1271/#comment6456

line is > 80 chars but will fix on commit, don't worry


- Jonathan








[jira] Commented: (HBASE-3308) SplitTransaction.splitStoreFiles slows splits a lot

2010-12-07 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969190#action_12969190
 ] 

HBase Review Board commented on HBASE-3308:
---

Message from: st...@duboce.net


bq.  On 2010-12-07 17:02:49, stack wrote:
bq.   /branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java, line 400
bq.   http://review.cloudera.org/r/1273/diff/1/?file=17980#file17980line400
bq.  
bq.   Why not have an upper bound?  If there are 100 files, that's 100 threads doing FS operations.  I bet that with an upper bound of 10 on the ExecutorService, it would complete faster than with an unbounded ExecutorService.
bq.  
bq.  Jean-Daniel Cryans wrote:
bq.  I think we are already bounded by hbase.hstore.blockingStoreFiles

That'll do.  +1 on commit.
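
For reference, the explicit upper bound discussed above could look roughly like this. It is a sketch under assumptions: the class name, the runAll helper and the string results are hypothetical, and the patch itself relies on hbase.hstore.blockingStoreFiles to limit the number of files rather than on a fixed pool size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BoundedSplitSketch {
  // Runs one task per store file on a pool capped at maxThreads and returns
  // the results in submission order.
  static List<String> runAll(List<Callable<String>> tasks, int maxThreads) {
    int threads = Math.max(1, Math.min(maxThreads, tasks.size()));
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<String> results = new ArrayList<String>();
      for (Future<String> f : pool.invokeAll(tasks)) {
        results.add(f.get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while splitting store files", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("A split task failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```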


- stack


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1273/#review2043
---





 SplitTransaction.splitStoreFiles slows splits a lot
 ---

 Key: HBASE-3308
 URL: https://issues.apache.org/jira/browse/HBASE-3308
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.0


 Recently I've been seeing some slow splits in our production environment 
 triggering timeouts, so I decided to take a closer look into the issue.
 According to my debugging, we spend almost all the time it takes to split on 
 creating the reference files. Each file in my testing takes at least 300ms to 
 create, and averages around 600ms. Since we create two references per store 
 file, it means that a region with 4 store files can easily take up to 5 
 seconds to split just to create those references.
 An intuitive improvement would be to create those files in parallel, so at 
 least it wouldn't be much slower when we're splitting a higher number of 
 files. Stack left the following comment in the code:
 {noformat}
 // TODO: If the below were multithreaded would we complete steps in less
 // elapsed time?  St.Ack 20100920
 {noformat}
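
 The arithmetic behind the "up to 5 seconds" figure can be checked with a small sketch. The class and method names are hypothetical, and the parallel estimate is idealized: it assumes every reference takes the same time and that the filesystem does not become the bottleneck.

```java
class SplitCostEstimate {
  // Serial cost: two reference files per store file, created one after another.
  static long serialMillis(int storeFiles, long millisPerReference) {
    return 2L * storeFiles * millisPerReference;
  }

  // Idealized parallel cost: references are created in "waves" of at most
  // `threads` at a time (threads must be positive).
  static long parallelMillis(int storeFiles, long millisPerReference, int threads) {
    long references = 2L * storeFiles;
    long waves = (references + threads - 1) / threads;
    return waves * millisPerReference;
  }
}
```

 With 4 store files at the 600ms average, serialMillis gives 4800ms, which matches the roughly 5 seconds reported above.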




[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions

2010-12-06 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968354#action_12968354
 ] 

HBase Review Board commented on HBASE-3305:
---

Message from: Ted Yu ted...@yahoo.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1271/
---

(Updated 2010-12-06 10:42:26.792838)


Review request for hbase, stack and Jonathan Gray.


Changes
---

Add hbase group as reviewer


Summary
---

Adopted round-robin assignment as default for regions specified when table is 
created.


This addresses bug HBASE-3305.
http://issues.apache.org/jira/browse/HBASE-3305


Diffs
-

  trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1042725 
  trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 1042725 
  trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 1042725 

Diff: http://review.cloudera.org/r/1271/diff


Testing
---

Put unit tests for this change inside TestAdmin.testCreateTableWithRegions()
They passed.


Thanks,

Ted







[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions

2010-12-06 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968579#action_12968579
 ] 

HBase Review Board commented on HBASE-3305:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1271/#review2037
---

Ship it!


Looks good Ted.  Below are a few pointers mostly on formatting and then a few 
questions.  Thanks for making the patch.


trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1271/#comment6430

Do you need to pollute HMaster with this AssignmentManager inner class?



trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1271/#comment6431

FYI, don't make these kinda formatting changes in a patch... it's 
distracting and the change you are making is against the convention used in the 
rest of this file.  Just FYI.  No biggie.



trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1271/#comment6432

Yeah, maybe these lines belong inside a method that is inside 
AssignmentManager?  What you think Ted?



trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1271/#comment6433

What changed on this line?  White space?



trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java
http://review.cloudera.org/r/1271/#comment6434

Convention is two spaces for tab in hbase and hadoop.  This seems like 
something else?



trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
http://review.cloudera.org/r/1271/#comment6435

FYI, tab is two spaces... we indent in multiples of two spaces.



trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
http://review.cloudera.org/r/1271/#comment6437

Good.  Nice test.


- stack








[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions

2010-12-06 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968608#action_12968608
 ] 

HBase Review Board commented on HBASE-3305:
---

Message from: Ted Yu ted...@yahoo.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1271/#review2038
---



trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1271/#comment6438

A new patch will be uploaded that reverts such changes.



trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1271/#comment6439

I think you're implying rewriting
AssignmentManager.assignAllUserRegions().

How about creating this method:
assignAllUserRegions(List<HRegionInfo> regions).

finishInitialization() would pass null to the above method to indicate that 
all user regions should be assigned.
createTable() would pass the list of regions for the new table.

This way, BulkStartupAssigner doesn't appear in HMaster.
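
The proposal above can be sketched as follows. Regions are plain strings and the AssignmentSketch class is a hypothetical stand-in for AssignmentManager; the method returns the regions it would assign instead of performing any real assignment.

```java
import java.util.ArrayList;
import java.util.List;

class AssignmentSketch {
  // Stand-in for the full set of user regions the master knows about.
  private final List<String> allUserRegions;

  AssignmentSketch(List<String> allUserRegions) {
    this.allUserRegions = allUserRegions;
  }

  // null means "assign every user region" (the finishInitialization() path);
  // a non-null list restricts assignment to those regions (the createTable() path).
  List<String> assignAllUserRegions(List<String> regions) {
    List<String> toAssign = (regions == null) ? allUserRegions : regions;
    return new ArrayList<String>(toAssign);
  }
}
```

Using null as a sentinel keeps a single entry point, at the cost of a less self-describing call site; two overloads would be an alternative.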



trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1271/#comment6440

Yes. I prefer space between if and left parenthesis.

I will revert anyway.


- Ted








[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions

2010-12-06 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968610#action_12968610
 ] 

HBase Review Board commented on HBASE-3305:
---

Message from: Ted Yu ted...@yahoo.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1271/
---

(Updated 2010-12-06 23:22:29.259676)


Review request for hbase, stack and Jonathan Gray.


Changes
---

Removes tabs.
Format code using multiple of two spaces.


Summary
---

Adopted round-robin assignment as default for regions specified when table is 
created.


This addresses bug HBASE-3305.
http://issues.apache.org/jira/browse/HBASE-3305


Diffs (updated)
-

  trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1042922 
  trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 1042922 
  trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 1042922 

Diff: http://review.cloudera.org/r/1271/diff


Testing
---

Put unit tests for this change inside TestAdmin.testCreateTableWithRegions()
They passed.


Thanks,

Ted







[jira] Commented: (HBASE-3290) Max Compaction Size

2010-12-01 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965795#action_12965795
 ] 

HBase Review Board commented on HBASE-3290:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1263/#review2018
---

Ship it!


This looks great.  I love the test.  There are some comments below.  See what 
you think.  I did not dig in deep on the algo but looks good.


trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
http://review.cloudera.org/r/1263/#comment6361

Good.  I like the way you keep around old name.

FYI, there's white space on end of some of these lines of yours.



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
http://review.cloudera.org/r/1263/#comment6364

Is this right? We check all storefiles for references where before we only 
checked the subset of candidate compaction files for references?


(Hmm.. maybe the old stuff was wrong?)



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
http://review.cloudera.org/r/1263/#comment6362

White space



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
http://review.cloudera.org/r/1263/#comment6365

Good



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
http://review.cloudera.org/r/1263/#comment6366

I don't grok this comment



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
http://review.cloudera.org/r/1263/#comment6367

So, it's ok to mess w/ file order?  We won't get ourselves into trouble if 
we don't respect the order in which files were written?  We do a merge sort 
when we read all compaction candidates in so should be fine I suppose -- since 
it's the same as how the scanner merges them.

Just asking because in old days order was important but I suppose we let go 
of that a while back?



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
http://review.cloudera.org/r/1263/#comment6368

Is this a good name for this method?  We're compacting a Store, not Stores, 
right?



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
http://review.cloudera.org/r/1263/#comment6369

Nice



trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
http://review.cloudera.org/r/1263/#comment6370

Excellent!  I love you mocking up StoreFiles rather than fire up minicluster

FYI... loads of white space in here.


- stack





 Max Compaction Size
 ---

 Key: HBASE-3290
 URL: https://issues.apache.org/jira/browse/HBASE-3290
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor

 Add ability to specify a maximum storefile size for compaction.  After this 
 limit, we will not include this file in compactions.  This is useful for 
 large object stores and clusters that pre-split regions.




[jira] Commented: (HBASE-3290) Max Compaction Size

2010-12-01 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965983#action_12965983
 ] 

HBase Review Board commented on HBASE-3290:
---

Message from: Nicolas nspiegelb...@facebook.com


bq.  On 2010-12-01 10:49:59, stack wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 639
bq.   http://review.cloudera.org/r/1263/diff/1/?file=17923#file17923line639
bq.  
bq.   Is this right? We check all storefiles for references where before we only checked the subset of candidate compaction files for references?
bq.   
bq.   
bq.   (Hmm.. maybe the old stuff was wrong?)

references == split files.  we currently don't support splitting split files 
(into quarter pieces?), so we need to ensure no files are split.


bq.  On 2010-12-01 10:49:59, stack wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 926
bq.   http://review.cloudera.org/r/1263/diff/1/?file=17923#file17923line926
bq.  
bq.   I don't grok this comment

references == split files.  The current algorithm is to split a StoreFile, then 
immediately use compaction after splitting to break them into 2 StoreFiles.  If 
you don't compact reference files that are past the max threshold:

1) you won't be able to split the region again
2) you don't actually even know that the StoreFile is too large.  
HalfStoreFileReader.length() returns the whole StoreFile's length, not the 
length of the StoreFile related to your region
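
A rough sketch of the selection rule described here: files above the maximum size are left out of compaction, except reference files, which are always kept because their reported length cannot be trusted. The FileInfo type, the method name and the inclusive comparison are assumptions for illustration, not the code in Store.java.

```java
import java.util.ArrayList;
import java.util.List;

class CompactionFilterSketch {
  // Minimal stand-in for a store file: a name, a reported length, and a flag
  // saying whether it is a reference (half of a split parent).
  static final class FileInfo {
    final String name;
    final long length;
    final boolean reference;

    FileInfo(String name, long length, boolean reference) {
      this.name = name;
      this.length = length;
      this.reference = reference;
    }
  }

  // Keeps files at or under maxSize, plus every reference file regardless of
  // its reported length.
  static List<String> candidates(List<FileInfo> files, long maxSize) {
    List<String> picked = new ArrayList<String>();
    for (FileInfo f : files) {
      if (f.reference || f.length <= maxSize) {
        picked.add(f.name);
      }
    }
    return picked;
  }
}
```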


bq.  On 2010-12-01 10:49:59, stack wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 954
bq.   http://review.cloudera.org/r/1263/diff/1/?file=17923#file17923line954
bq.  
bq.   So, it's ok to mess w/ file order?  We won't get ourselves into trouble if we don't respect the order in which files were written?  We do a merge sort when we read all compaction candidates in so should be fine I suppose -- since it's the same as how the scanner merges them.
bq.   
bq.   Just asking because in old days order was important but I suppose we let go of that a while back?

so, technically, order is important for optimizations like the TimeStamp 
filter.  However, realistically this isn't a problem because our normal skew 
always decreases in filesize over time.  The only place where our skew doesn't 
decrease is for files that have been recently flushed.  However, all those will 
be unconditionally compacted because they will be lower than 
hbase.hstore.compaction.min.size.  

The sorting is to handle an interesting issue that popped up for us during 
migration: we're bulk loading StoreFiles of extremely variable size (are we 
migrating 1k users or 10M?) and they will all appear at the end of the 
StoreFile list.  How do we determine when it is efficient to compact them?  The 
easiest option was to sort the compact list and handle bulk files by relative 
size instead of making some custom compaction selection algorithm just for bulk 
inclusion.  It seems like any other companies that will incrementally migrate 
data into HBase would hit the same issue.
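
As a sketch of the sorting step only: the candidate sizes are ordered by size so that bulk-loaded files fall in next to files of similar size. Largest-first ordering and the names below are assumptions; the thread does not state the direction used in the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SizeOrderSketch {
  // Returns a copy of the file sizes ordered largest first, leaving the
  // caller's list untouched.
  static List<Long> largestFirst(List<Long> sizes) {
    List<Long> sorted = new ArrayList<Long>(sizes);
    Collections.sort(sorted, Collections.reverseOrder());
    return sorted;
  }
}
```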


bq.  On 2010-12-01 10:49:59, stack wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 1024
bq.   http://review.cloudera.org/r/1263/diff/1/?file=17923#file17923line1024
bq.  
bq.   Is this a good name for this method?  We're compacting a Store, not Stores, right?

true.  I mainly wanted to change the name from the public compact() api.  I 
kept annoyingly clicking on the wrong function in Eclipse.  Do you want to 
refactor it to compactFiles() right before commit?


- Nicolas


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1263/#review2018
---








[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

2010-11-30 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965309#action_12965309
 ] 

HBase Review Board commented on HBASE-3287:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/#review2009
---

Ship it!


Looks good to me.  Some comments below.


branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java
http://review.cloudera.org/r/1261/#comment6343

This looks like useful addition.



branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
http://review.cloudera.org/r/1261/#comment6346

Why the flush?



branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
http://review.cloudera.org/r/1261/#comment6344

Does this create new byte array?



branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
http://review.cloudera.org/r/1261/#comment6345

I wonder if we have to have full path here?  Anything less could cause 
clashes?  But small optimization would strip the hbase.root at least?



branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
http://review.cloudera.org/r/1261/#comment6347

Can you presize the BAOS?  What's the default?  4k?  If so, and our default 
block size is 64k, that'd be a bit of expensive array resizing going on?  Just 
guessing.
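
A sketch of the presizing idea, assuming the BAOS in question is java.io.ByteArrayOutputStream. Its no-argument constructor starts with a 32-byte buffer and grows by reallocating and copying, so passing the expected block size up front avoids the repeated resizing. The class name, the constant and the writeBlock helper are hypothetical.

```java
import java.io.ByteArrayOutputStream;

class PresizedBufferSketch {
  // Assumed block size, matching the 64k default mentioned in the review.
  static final int BLOCK_SIZE = 64 * 1024;

  // Copies the payload through a stream whose buffer is allocated once at
  // the expected block size instead of growing from the small default.
  static byte[] writeBlock(byte[] payload) {
    ByteArrayOutputStream out = new ByteArrayOutputStream(BLOCK_SIZE);
    out.write(payload, 0, payload.length);
    return out.toByteArray();
  }
}
```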



branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
http://review.cloudera.org/r/1261/#comment6348

Surround with if debug?
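
The guard being suggested looks roughly like this. The sketch uses java.util.logging so it stays self-contained (its FINE level stands in for debug); HBase's own logger and the actual log message are not shown in this thread, so the names below are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class DebugGuardSketch {
  private static final Logger LOG = Logger.getLogger("DebugGuardSketch");

  // Counts how often the (potentially expensive) message is actually built.
  static int messagesBuilt = 0;

  static String describe(String blockName) {
    messagesBuilt++;
    return "Evicted block " + blockName;
  }

  // The message is only built when debug-level logging is enabled.
  static void onEvict(String blockName) {
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine(describe(blockName));
    }
  }
}
```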


- stack





 Add option to cache blocks on hfile write and evict blocks on hfile close
 -

 Key: HBASE-3287
 URL: https://issues.apache.org/jira/browse/HBASE-3287
 Project: HBase
  Issue Type: Improvement
  Components: io, regionserver
Affects Versions: 0.90.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.0


 This issue is about adding configuration options to add/remove from the block 
 cache when creating/closing files.  For use cases with lots of flushing and 
 compacting, this might be desirable to prevent cache misses and maximize the 
 effective utilization of total block cache capacity.
 The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we 
 pre-cache blocks as we are writing out new files.
 The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict 
 blocks when files are closed.
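
 As an illustration of how the two flags might be read, here is a sketch that uses java.util.Properties as a stand-in for the HBase configuration object. The helper name and the default of false are assumptions; only the two option names come from this issue.

```java
import java.util.Properties;

class CacheOptionsSketch {
  static final String CACHE_ON_WRITE = "hbase.rs.cacheblocksonwrite";
  static final String EVICT_ON_CLOSE = "hbase.rs.evictblocksonclose";

  // Reads a boolean option, falling back to defaultValue when it is unset.
  static boolean flag(Properties conf, String key, boolean defaultValue) {
    String value = conf.getProperty(key);
    return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
  }
}
```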




[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

2010-11-30 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965310#action_12965310
 ] 

HBase Review Board commented on HBASE-3287:
---

Message from: Ryan Rawson ryano...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/#review2010
---



branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
http://review.cloudera.org/r/1261/#comment6349

why would you not want to evict blocks from the cache on close?


- Ryan








[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

2010-11-30 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965311#action_12965311
 ] 

HBase Review Board commented on HBASE-3287:
---

Message from: st...@duboce.net


bq.  On 2010-11-30 09:57:27, Ryan Rawson wrote:
bq.   branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java, line 765
bq.   http://review.cloudera.org/r/1261/diff/1/?file=17902#file17902line765
bq.  
bq.   why would you not want to evict blocks from the cache on close?

I think this is a good point.  It's different behavior, but it's behavior we 
should always have had?  One less option too.


- stack


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/#review2010
---








[jira] Commented: (HBASE-3286) Master passes IP and not hostname back to region server

2010-11-30 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965381#action_12965381
 ] 

HBase Review Board commented on HBASE-3286:
---

Message from: Jean-Daniel Cryans jdcry...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1262/
---

Review request for hbase.


Summary
---

Changes:
 - In HMaster, instead of passing an IP as a String, we now pass the whole HSA 
object.
 - In HRegionServer, I cleared a bunch of crufty comments and now handle the HSA 
passed by the master.
 - In HServerInfo, I saw that the hostname wasn't reset when setting the HSA. 
Fixed.
 - In HServerAddress, I fixed a few places that weren't explicitly using 
hostnames and changed the serialization to pass a hostname instead of an IP 
address.
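
To make the hostname-versus-IP distinction concrete, here is a small sketch using java.net.InetAddress. The class and method names are hypothetical, and this is not the HServerAddress serialization itself; it only shows the two string forms that the same address can produce.

```java
import java.net.InetAddress;

class HostnameVsIpSketch {
  // Builds "host:port" from the hostname, the form the patch moves towards.
  static String byHostname(InetAddress addr, int port) {
    return addr.getHostName() + ":" + port;
  }

  // Builds "ip:port" from the numeric address, the form being replaced.
  static String byIp(InetAddress addr, int port) {
    return addr.getHostAddress() + ":" + port;
  }
}
```

If the two sides of a comparison use different forms, the same server no longer matches itself by name, which would be one plausible way for retained assignments to end up as skewed as in the logs on the issue.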


This addresses bug HBASE-3286.
http://issues.apache.org/jira/browse/HBASE-3286


Diffs
-

  /trunk/src/main/java/org/apache/hadoop/hbase/HServerAddress.java 1040669 
  /trunk/src/main/java/org/apache/hadoop/hbase/HServerInfo.java 1040669 
  /trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1040669 
  /trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
1040669 

Diff: http://review.cloudera.org/r/1262/diff


Testing
---

Works on my MBP (I was seeing the same issue but since there's only 1 RS it 
didn't have any bad effect) and my 10-machine Ubuntu cluster.


Thanks,

Jean-Daniel




 Master passes IP and not hostname back to region server
 ---

 Key: HBASE-3286
 URL: https://issues.apache.org/jira/browse/HBASE-3286
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Daniel Cryans
 Fix For: 0.90.0


 Starting my little test cluster on the latest from 0.90, I see:
 {noformat}
 2010-11-29 23:21:34,131 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 1024 
 region(s) across 9 server(s), retainAssignment=true
 2010-11-29 23:21:34,134 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 22 region(s) 
 to sv2borg181,61020,1291072886282
 2010-11-29 23:21:34,135 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 24 region(s) 
 to sv2borg182,61020,1291072885473
 2010-11-29 23:21:34,135 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 37 region(s) 
 to sv2borg183,61020,1291072885646
 2010-11-29 23:21:34,135 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 25 region(s) 
 to sv2borg184,61020,1291072886734
 2010-11-29 23:21:34,135 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 26 region(s) 
 to sv2borg185,61020,1291072886606
 2010-11-29 23:21:34,136 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 70 region(s) 
 to sv2borg186,61020,1291072885486
 2010-11-29 23:21:34,136 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 30 region(s) 
 to sv2borg187,61020,1291072886355
 2010-11-29 23:21:34,136 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 89 region(s) 
 to sv2borg188,61020,1291072885926
 2010-11-29 23:21:34,136 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 701 
 region(s) to sv2borg189,61020,1291072886739
 {noformat}
 After another restart:
 {noformat}
 2010-11-30 00:03:38,100 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 1024 
 region(s) across 9 server(s), retainAssignment=true
 2010-11-30 00:03:38,103 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 71 region(s) 
 to sv2borg181,61020,1291075409984
 2010-11-30 00:03:38,103 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 82 region(s) 
 to sv2borg182,61020,1291075409956
 2010-11-30 00:03:38,104 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 61 region(s) 
 to sv2borg183,61020,1291075409952
 2010-11-30 00:03:38,104 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 122 
 region(s) to sv2borg184,61020,1291075409957
 2010-11-30 00:03:38,104 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 59 region(s) 
 to sv2borg185,61020,1291075409955
 2010-11-30 00:03:38,104 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 71 region(s) 
 to sv2borg186,61020,1291075409963
 2010-11-30 00:03:38,105 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 52 region(s) 
 to sv2borg187,61020,1291075411049
 2010-11-30 00:03:38,105 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 254 
 region(s) to sv2borg188,61020,1291075410360
 2010-11-30 00:03:38,105 DEBUG 
 

[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

2010-11-30 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965455#action_12965455
 ] 

HBase Review Board commented on HBASE-3287:
---

Message from: Ryan Rawson ryano...@gmail.com


bq.  On 2010-11-30 09:57:27, Ryan Rawson wrote:
bq.   branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java, 
line 765
bq.   http://review.cloudera.org/r/1261/diff/1/?file=17902#file17902line765
bq.  
bq.   why would you not want to evict blocks from the cache on close?
bq.  
bq.  stack wrote:
bq.  I think this is a good point.  It's different behavior, but it's behavior we 
should have always had?  One less option too.

I'm still confused about why we are adding config for something that we should 
always be doing.  While we'll never be zero conf, I am not seeing the reason why 
we'd want to keep things in the LRU.  

It would make more sense not to evict on a split, but evict every other time, 
since a split will probably reopen the same hfiles and need those blocks again.


- Ryan


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/#review2010
---





 Add option to cache blocks on hfile write and evict blocks on hfile close
 -

 Key: HBASE-3287
 URL: https://issues.apache.org/jira/browse/HBASE-3287
 Project: HBase
  Issue Type: New Feature
  Components: io, regionserver
Affects Versions: 0.90.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.0

 Attachments: HBASE-3287-FINAL-trunk.patch


 This issue is about adding configuration options to add/remove from the block 
 cache when creating/closing files.  For use cases with lots of flushing and 
 compacting, this might be desirable to prevent cache misses and maximize the 
 effective utilization of total block cache capacity.
 The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we 
 pre-cache blocks as we are writing out new files.
 The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict 
 blocks when files are closed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

2010-11-30 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965487#action_12965487
 ] 

HBase Review Board commented on HBASE-3287:
---

Message from: Jonathan Gray jg...@apache.org


bq.  On 2010-11-30 09:57:27, Ryan Rawson wrote:
bq.   branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java, 
line 765
bq.   http://review.cloudera.org/r/1261/diff/1/?file=17902#file17902line765
bq.  
bq.   why would you not want to evict blocks from the cache on close?
bq.  
bq.  stack wrote:
bq.  I think this is a good point.  It's different behavior, but it's behavior we 
should have always had?  One less option too.
bq.  
bq.  Ryan Rawson wrote:
bq.  I'm still confused about why we are adding config for something that we 
should always be doing.  While we'll never be zero conf, I am not seeing the 
reason why we'd want to keep things in the LRU.  
bq.  
bq.  It would make more sense not to evict on a split, but evict every 
other time, since a split will probably reopen the same hfiles and need those 
blocks again.

I think it makes sense to have undocumented configuration parameters.  The 
default behavior is then the way things work, but having a config option checked 
in the code at least gives the opportunity to turn something on/off without 
making a code change and redeploying completely.  In the unit test, I'm turning 
it on/off with the config parameter so I can verify it works as expected.

And although I've changed the default to true, I'm not convinced that it makes 
sense in all cases.

Ryan came up with the example of the split, though that would override the config 
parameter.  But I think there could be other situations where you don't want to 
evict as well.

In any case, I want to keep it configurable so I can turn it on/off between 
test runs and see what, if any, difference these optimizations make and IMO 
there's very little cost associated with using 
conf.getBoolean(some.undocumented.thing, true) vs. a hard-coded true (if 
there's any possibility you might want to change the behavior).
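
To make the trade-off concrete, here is a minimal sketch of reading a boolean switch with a default instead of hard-coding true. SimpleConf is a hypothetical stand-in so the example stays self-contained; real code would presumably call Hadoop's Configuration.getBoolean(name, defaultValue). The key name is the one from the issue description, and the default of true follows the comment above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Hadoop Configuration object.
final class SimpleConf {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) {
        values.put(key, value);
    }

    // Returns the parsed value, or defaultValue when the key is unset or malformed.
    boolean getBoolean(String key, boolean defaultValue) {
        String raw = values.get(key);
        if (raw == null) {
            return defaultValue;
        }
        String v = raw.trim();
        if (v.equalsIgnoreCase("true")) {
            return true;
        }
        if (v.equalsIgnoreCase("false")) {
            return false;
        }
        return defaultValue;
    }
}

final class EvictOnCloseExample {
    // Key name taken from the issue description.
    static final String EVICT_ON_CLOSE_KEY = "hbase.rs.evictblocksonclose";

    // Defaults to true, but can be flipped between test runs without a code change.
    static boolean shouldEvictOnClose(SimpleConf conf) {
        return conf.getBoolean(EVICT_ON_CLOSE_KEY, true);
    }
}
```

A test can then set the key to "false" on one run and leave it unset on the next, which is the on/off comparison described above.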


- Jonathan


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/#review2010
---





 Add option to cache blocks on hfile write and evict blocks on hfile close
 -

 Key: HBASE-3287
 URL: https://issues.apache.org/jira/browse/HBASE-3287
 Project: HBase
  Issue Type: New Feature
  Components: io, regionserver
Affects Versions: 0.90.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.0

 Attachments: HBASE-3287-FINAL-trunk.patch


 This issue is about adding configuration options to add/remove from the block 
 cache when creating/closing files.  For use cases with lots of flushing and 
 compacting, this might be desirable to prevent cache misses and maximize the 
 effective utilization of total block cache capacity.
 The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we 
 pre-cache blocks as we are writing out new files.
 The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict 
 blocks when files are closed.




[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

2010-11-30 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965489#action_12965489
 ] 

HBase Review Board commented on HBASE-3287:
---

Message from: Jonathan Gray jg...@apache.org


bq.  On 2010-11-30 09:57:27, Ryan Rawson wrote:
bq.   branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java, 
line 765
bq.   http://review.cloudera.org/r/1261/diff/1/?file=17902#file17902line765
bq.  
bq.   why would you not want to evict blocks from the cache on close?
bq.  
bq.  stack wrote:
bq.  I think this is a good point.  It's different behavior, but it's behavior we 
should have always had?  One less option too.
bq.  
bq.  Ryan Rawson wrote:
bq.  I'm still confused about why we are adding config for something that we 
should always be doing.  While we'll never be zero conf, I am not seeing the 
reason why we'd want to keep things in the LRU.  
bq.  
bq.  It would make more sense not to evict on a split, but evict every 
other time, since a split will probably reopen the same hfiles and need those 
blocks again.
bq.  
bq.  Jonathan Gray wrote:
bq.  I think it makes sense to have undocumented configuration parameters.  
The default behavior is then the way things work, but having a config option 
checked in the code at least gives the opportunity to turn something on/off 
without making a code change and redeploying completely.  In the unit test, I'm 
turning it on/off with the config parameter so I can verify it works as expected.
bq.  
bq.  And although I've changed the default to true, I'm not convinced that 
it makes sense in all cases.
bq.  
bq.  Ryan came up with the example of the split, though that would override the 
config parameter.  But I think there could be other situations where you don't 
want to evict as well.
bq.  
bq.  In any case, I want to keep it configurable so I can turn it on/off 
between test runs and see what, if any, difference these optimizations make and 
IMO there's very little cost associated with using 
conf.getBoolean(some.undocumented.thing, true) vs. a hard-coded true (if 
there's any possibility you might want to change the behavior).

Filed HBASE-3289 to disable eviction on close of parent files during a split.  I 
looked at the code and it's a fairly significant change since we'll need to 
pass a boolean in to all of the close() methods (there are several levels of 
them).

Also, figuring out when we do want to evict these blocks (once both children 
have closed the file) is tricky.
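
One possible way to decide when eviction is safe is to count open readers per file and evict only when the last one closes. The sketch below is purely illustrative; OpenFileTracker is a hypothetical helper and is not what HBASE-3289 proposes or implements.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: tracks how many readers still have a given file open.
final class OpenFileTracker {
    private final Map<String, Integer> openCounts = new HashMap<>();

    // Record one more reader (for example, a child region) opening the file.
    synchronized void opened(String fileName) {
        openCounts.merge(fileName, 1, Integer::sum);
    }

    // Record a close. Returns true only when the last reader has closed the
    // file, which is the point where evicting its blocks would be safe.
    synchronized boolean closed(String fileName) {
        Integer count = openCounts.get(fileName);
        if (count == null) {
            throw new IllegalStateException("close without open: " + fileName);
        }
        if (count == 1) {
            openCounts.remove(fileName);
            return true;
        }
        openCounts.put(fileName, count - 1);
        return false;
    }
}
```

With two children referencing the same parent file, the first close() returns false and the second returns true, so the blocks stay cached until both are done.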


- Jonathan


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/#review2010
---





 Add option to cache blocks on hfile write and evict blocks on hfile close
 -

 Key: HBASE-3287
 URL: https://issues.apache.org/jira/browse/HBASE-3287
 Project: HBase
  Issue Type: New Feature
  Components: io, regionserver
Affects Versions: 0.90.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.0

 Attachments: HBASE-3287-FINAL-trunk.patch


 This issue is about adding configuration options to add/remove from the block 
 cache when creating/closing files.  For use cases with lots of flushing and 
 compacting, this might be desirable to prevent cache misses and maximize the 
 effective utilization of total block cache capacity.
 The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we 
 pre-cache blocks as we are writing out new files.
 The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict 
 blocks when files are closed.




[jira] Commented: (HBASE-3286) Master passes IP and not hostname back to region server

2010-11-30 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965521#action_12965521
 ] 

HBase Review Board commented on HBASE-3286:
---

Message from: Jean-Daniel Cryans jdcry...@apache.org


bq.  On 2010-11-30 12:27:06, Jonathan Gray wrote:
bq.   A little confused by the discrepancy between String host / int port and 
the Address.  But does seem fine given we don't actually access the string/int 
values and always use the address object.
bq.   
bq.   Do we need some tests on this stuff?  Seems like we always have issues 
here but tests don't catch anything.
bq.   
bq.   Looks better than what we have though so I'm +1 regardless.

Regarding tests, I'm not sure what they would catch... 


bq.  On 2010-11-30 12:27:06, Jonathan Gray wrote:
bq.   /trunk/src/main/java/org/apache/hadoop/hbase/HServerAddress.java, line 65
bq.   http://review.cloudera.org/r/1262/diff/1/?file=17919#file17919line65
bq.  
bq.   Why does stringValue not necessarily equal the host:port we store in 
those Strings?  Shouldn't they be the same?

I'm trying to keep it more consistent with the rest of the code, else when 
looking at the code you ask yourself the question you just asked me :)


bq.  On 2010-11-30 12:27:06, Jonathan Gray wrote:
bq.   /trunk/src/main/java/org/apache/hadoop/hbase/HServerAddress.java, line 
177
bq.   http://review.cloudera.org/r/1262/diff/1/?file=17919#file17919line177
bq.  
bq.   But on serialization, we use the address hostname not the thing we 
actually store in hostname/port variables, so after serialized it's different?
bq.   
bq.   Shouldn't we set the hostname/port variables on construction 
according to address.getAddress/getPort rather than the passed values, if the 
address values are what we want to use?

I'm... not following you. You're saying that we shouldn't store the 
InetSocketAddress?


bq.  On 2010-11-30 12:27:06, Jonathan Gray wrote:
bq.   /trunk/src/main/java/org/apache/hadoop/hbase/HServerInfo.java, line 116
bq.   http://review.cloudera.org/r/1262/diff/1/?file=17920#file17920line116
bq.  
bq.   I guess we never actually use the String host / int port?  Why do we 
store them in HServerAddress then?

Here I'm just making sure that after updating the address we also update the 
hostname, since it could have changed.


- Jean-Daniel


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1262/#review2012
---





 Master passes IP and not hostname back to region server
 ---

 Key: HBASE-3286
 URL: https://issues.apache.org/jira/browse/HBASE-3286
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Daniel Cryans
 Fix For: 0.90.0


 Starting my little test cluster on the latest from 0.90, I see:
 {noformat}
 2010-11-29 23:21:34,131 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 1024 
 region(s) across 9 server(s), retainAssignment=true
 2010-11-29 23:21:34,134 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 22 region(s) 
 to sv2borg181,61020,1291072886282
 2010-11-29 23:21:34,135 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 24 region(s) 
 to sv2borg182,61020,1291072885473
 2010-11-29 23:21:34,135 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 37 region(s) 
 to sv2borg183,61020,1291072885646
 2010-11-29 23:21:34,135 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 25 region(s) 
 to sv2borg184,61020,1291072886734
 2010-11-29 23:21:34,135 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 26 region(s) 
 to sv2borg185,61020,1291072886606
 2010-11-29 23:21:34,136 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 70 region(s) 
 to sv2borg186,61020,1291072885486
 2010-11-29 23:21:34,136 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 30 region(s) 
 to sv2borg187,61020,1291072886355
 2010-11-29 23:21:34,136 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 89 region(s) 
 to sv2borg188,61020,1291072885926
 2010-11-29 23:21:34,136 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 701 
 region(s) to sv2borg189,61020,1291072886739
 {noformat}
 After another restart:
 {noformat}
 2010-11-30 00:03:38,100 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 1024 
 region(s) across 9 server(s), retainAssignment=true
 2010-11-30 00:03:38,103 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 71 region(s) 
 to sv2borg181,61020,1291075409984
 2010-11-30 00:03:38,103 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: 

[jira] Commented: (HBASE-3290) Max Compaction Size

2010-11-30 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965587#action_12965587
 ] 

HBase Review Board commented on HBASE-3290:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1263/
---

(Updated 2010-11-30 23:21:26.259598)


Review request for hbase.


Summary
---

Add ability to specify a maximum storefile size for compaction. Files larger 
than this limit will not be included in compactions. This is useful for large 
object stores and clusters that pre-split regions.
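
The selection rule can be sketched as a simple size filter. This is only an illustration under assumptions: the class and method names are hypothetical, files are represented by their sizes in bytes, and treating a file exactly at the limit as still eligible is a guess rather than something stated in the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the selection rule; not the code in Store.java.
final class CompactionSelectionSketch {

    // Keeps only the store files whose size does not exceed maxCompactSize.
    // Input order is preserved.
    static List<Long> selectCandidates(List<Long> storeFileSizes, long maxCompactSize) {
        List<Long> candidates = new ArrayList<>();
        for (long size : storeFileSizes) {
            if (size <= maxCompactSize) {
                candidates.add(size);
            }
        }
        return candidates;
    }
}
```

Large files are skipped rather than rewritten, which is why this helps when regions are pre-split and a few big store files are expected to stay as they are.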


This addresses bug HBASE-3290.
http://issues.apache.org/jira/browse/HBASE-3290


Diffs
-

  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1040878 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1040878 
  
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
 PRE-CREATION 

Diff: http://review.cloudera.org/r/1263/diff


Testing
---

mvn test -Dtest=TestCompactSelection
mvn test -Dtest=TestCompaction
mvn test -Dtest=TestFromClientSide
mvn test

cluster testing


Thanks,

Nicolas




 Max Compaction Size
 ---

 Key: HBASE-3290
 URL: https://issues.apache.org/jira/browse/HBASE-3290
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor

 Add ability to specify a maximum storefile size for compaction.  Files larger 
 than this limit will not be included in compactions.  This is useful for 
 large object stores and clusters that pre-split regions.




[jira] Commented: (HBASE-3282) Need to retain DeadServers to ensure we don't allow previously expired RS instances to rejoin cluster

2010-11-29 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12964879#action_12964879
 ] 

HBase Review Board commented on HBASE-3282:
---

Message from: Jonathan Gray jg...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1259/
---

Review request for hbase and stack.


Summary
---

We currently let go of dead servers once we finish their shutdown.  We should 
hang on to them longer to deal with things like network partitions.

I'm not a fan of SoftReferences so I decided on another approach.  DeadServers 
now has a maximum number of servers to hold on to in the set (default 100).  
Once it reaches the max, it evicts the oldest.
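
A minimal sketch of that bounded-set idea follows, assuming servers are identified by their name strings. BoundedDeadServers is a hypothetical class written for illustration and is not the DeadServer code in the patch.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch of a size-bounded set that evicts its oldest entry.
final class BoundedDeadServers {
    private final int maxSize;
    // LinkedHashSet iterates in insertion order, so the first element is the oldest.
    private final Set<String> servers = new LinkedHashSet<>();

    BoundedDeadServers(int maxSize) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be >= 1");
        }
        this.maxSize = maxSize;
    }

    // Adds a server, evicting the oldest entry first if the set is full.
    synchronized void add(String serverName) {
        if (servers.contains(serverName)) {
            return;
        }
        if (servers.size() >= maxSize) {
            Iterator<String> oldest = servers.iterator();
            oldest.next();
            oldest.remove();
        }
        servers.add(serverName);
    }

    synchronized boolean isDeadServer(String serverName) {
        return servers.contains(serverName);
    }

    synchronized int size() {
        return servers.size();
    }
}
```

Unlike SoftReferences, the memory bound here is explicit and does not depend on garbage collector behavior, at the cost of forgetting the oldest dead server once the limit is reached.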

More code than I had hoped but nothing too crazy.


This addresses bug HBASE-3282.
http://issues.apache.org/jira/browse/HBASE-3282


Diffs
-

  branches/0.90/src/main/java/org/apache/hadoop/hbase/master/DeadServer.java 
1040221 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 
1040221 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 
1040221 
  
branches/0.90/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
 1040221 

Diff: http://review.cloudera.org/r/1259/diff


Testing
---

Running unit tests now.


Thanks,

Jonathan




 Need to retain DeadServers to ensure we don't allow previously expired RS 
 instances to rejoin cluster
 -

 Key: HBASE-3282
 URL: https://issues.apache.org/jira/browse/HBASE-3282
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.90.0, 0.92.0


 Currently we clear a server from the deadserver set once we finish processing 
 its shutdown.  However, certain circumstances (network partitions, race 
 conditions) could lead to the RS not doing a check-in until after the 
 shutdown has been processed.  As-is, this RS will now be let back in to the 
 cluster rather than rejected with YouAreDeadException.
 We should hang on to the dead servers so we always reject them.
 One concern is that the set will grow indefinitely.  One recommendation by 
 stack is to use SoftReferences.




[jira] Commented: (HBASE-3282) Need to retain DeadServers to ensure we don't allow previously expired RS instances to rejoin cluster

2010-11-29 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12964891#action_12964891
 ] 

HBase Review Board commented on HBASE-3282:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1259/#review2004
---

Ship it!



branches/0.90/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
http://review.cloudera.org/r/1259/#comment6320

You can make this private now that it's no longer referenced by Master?


- stack





 Need to retain DeadServers to ensure we don't allow previously expired RS 
 instances to rejoin cluster
 -

 Key: HBASE-3282
 URL: https://issues.apache.org/jira/browse/HBASE-3282
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.90.0, 0.92.0


 Currently we clear a server from the deadserver set once we finish processing 
 its shutdown.  However, certain circumstances (network partitions, race 
 conditions) could lead to the RS not doing a check-in until after the 
 shutdown has been processed.  As-is, this RS will now be let back in to the 
 cluster rather than rejected with YouAreDeadException.
 We should hang on to the dead servers so we always reject them.
 One concern is that the set will grow indefinitely.  One recommendation by 
 stack is to use SoftReferences.




[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close

2010-11-29 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965134#action_12965134
 ] 

HBase Review Board commented on HBASE-3287:
---

Message from: Jonathan Gray jg...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1261/
---

Review request for hbase, stack and khemani.


Summary
---

This issue is about adding configuration options to add/remove from the block 
cache when creating/closing files. For use cases with lots of flushing and 
compacting, this might be desirable to prevent cache misses and maximize the 
effective utilization of total block cache capacity.

The first option, hbase.rs.cacheblocksonwrite, will make it so we pre-cache 
blocks as we are writing out new files.

The second option, hbase.rs.evictblocksonclose, will make it so we evict blocks 
when files are closed.


This addresses bug HBASE-3287.
http://issues.apache.org/jira/browse/HBASE-3287


Diffs
-

  
branches/0.90/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java 
1040422 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 
1040422 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 
1040422 
  
branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 
1040422 
  
branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java
 1040422 
  
branches/0.90/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
 1040422 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 
1040422 
  
branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1040422 
  branches/0.90/src/main/java/org/apache/hadoop/hbase/util/CompressionTest.java 
1040422 
  
branches/0.90/src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
 1040422 
  
branches/0.90/src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
 1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/RandomSeek.java 
1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java 
1040422 
  
branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java
 1040422 
  
branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java 
1040422 
  
branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java 
1040422 
  branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java 
1040422 
  
branches/0.90/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java
 1040422 
  
branches/0.90/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
 1040422 

Diff: http://review.cloudera.org/r/1261/diff


Testing
---

Added a unit test to TestStoreFile.  That passes.

Need to do perf testing on a cluster.


Thanks,

Jonathan




 Add option to cache blocks on hfile write and evict blocks on hfile close
 -

 Key: HBASE-3287
 URL: https://issues.apache.org/jira/browse/HBASE-3287
 Project: HBase
  Issue Type: Improvement
  Components: io, regionserver
Affects Versions: 0.90.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
 Fix For: 0.92.0


 This issue is about adding configuration options to add/remove from the block 
 cache when creating/closing files.  For use cases with lots of flushing and 
 compacting, this might be desirable to prevent cache misses and maximize the 
 effective utilization of total block cache capacity.
 The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we 
 pre-cache blocks as we are writing out new files.
 The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict 
 blocks when files are closed.




[jira] Commented: (HBASE-3279) [rest] Filter for gzip/deflate content encoding that wraps both input and output side

2010-11-26 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935965#action_12935965
 ] 

HBase Review Board commented on HBASE-3279:
---

Message from: Andrew Purtell apurt...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1254/
---

Review request for hbase.


Summary
---

After HBASE-3275 the REST gateway uses Jetty's GzipFilter to return gzip- or 
deflate-encoded content to the client if the client requested it using the 
appropriate Accept-Encoding header. However, Jetty's GzipFilter only wraps 
output side processing.

This patch implements a filter that also wraps input side processing, so 
clients can submit compressed PUT or POST bodies.
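
The input-side half can be sketched without any servlet machinery: inspect the Content-Encoding value and, if it is gzip, decompress the body before handing it on. GzipBodySketch and its method names are hypothetical and only illustrate the idea; the patch itself wraps the servlet request and response streams, and deflate handling is omitted here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch of input-side decoding; not the filter from the patch.
final class GzipBodySketch {

    // Compress a body the way a client would before sending
    // "Content-Encoding: gzip".
    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decode the body only when the header says gzip; otherwise pass it through.
    static byte[] decodeBody(String contentEncoding, byte[] body) {
        if (contentEncoding == null || !contentEncoding.trim().equalsIgnoreCase("gzip")) {
            return body;
        }
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                plain.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return plain.toByteArray();
    }
}
```

A request without the header is returned untouched, so uncompressed PUT and POST bodies keep working as before.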


This addresses bug HBASE-3279.
http://issues.apache.org/jira/browse/HBASE-3279


Diffs
-

  src/main/java/org/apache/hadoop/hbase/rest/Main.java 54866b6 
  src/main/java/org/apache/hadoop/hbase/rest/filter/GZIPRequestStream.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/rest/filter/GZIPRequestWrapper.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/rest/filter/GZIPResponseStream.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/rest/filter/GZIPResponseWrapper.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/rest/filter/GzipFilter.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/rest/HBaseRESTTestingUtility.java 
5e943ec 
  src/test/java/org/apache/hadoop/hbase/rest/TestGzipFilter.java PRE-CREATION 

Diff: http://review.cloudera.org/r/1254/diff


Testing
---

New unit test, passes.


Thanks,

Andrew




 [rest] Filter for gzip/deflate content encoding that wraps both input and 
 output side
 -

 Key: HBASE-3279
 URL: https://issues.apache.org/jira/browse/HBASE-3279
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.20.7, 0.90.0


 After HBASE-3275 the REST gateway will return gzip or deflate encoded content 
 to the client if the client requested it using the appropriate 
 Accept-Encoding header. However Jetty's GzipFilter only wraps output side 
 processing. A client can submit gzip or deflate encoded requests (i.e. 
 Content-Encoding: gzip ; Content-Type: ...) but the data is not decoded, it 
 is simply passed through. 
 Implement a filter that also wraps input side processing, so clients can 
 submit compressed PUT or POST bodies. 




[jira] Commented: (HBASE-3279) [rest] Filter for gzip/deflate content encoding that wraps both input and output side

2010-11-26 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936015#action_12936015
 ] 

HBase Review Board commented on HBASE-3279:
---

Message from: Lars George larsgeo...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1254/#review1981
---

Ship it!


Looks great! RB did show some whitespace added unnecessarily; I assume you 
could remove that on commit.

- Lars





 [rest] Filter for gzip/deflate content encoding that wraps both input and 
 output side
 -

 Key: HBASE-3279
 URL: https://issues.apache.org/jira/browse/HBASE-3279
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.20.7, 0.90.0


 After HBASE-3275 the REST gateway will return gzip or deflate encoded content 
 to the client if the client requested it using the appropriate 
 Accept-Encoding header. However Jetty's GzipFilter only wraps output side 
 processing. A client can submit gzip or deflate encoded requests (i.e. 
 Content-Encoding: gzip ; Content-Type: ...) but the data is not decoded, it 
 is simply passed through. 
 Implement a filter that also wraps input side processing, so clients can 
 submit compressed PUT or POST bodies. 




[jira] Commented: (HBASE-3276) delete followed by a put with the same timestamp

2010-11-26 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936040#action_12936040
 ] 

HBase Review Board commented on HBASE-3276:
---

Message from: Pranav Khaitan pranavkhai...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1252/
---

Review request for hbase, Jonathan Gray and Kannan Muthukkaruppan.


Summary
---

This is a design change suggested in HBASE-3276 so adequate thought should be 
given before proceeding. 

The main code change is just one line which is to ignore key type while doing 
KV comparisons. When the key type is ignored, then all the keys for the same 
timestamp are sorted according to the order in which they were inserted. It is 
still ensured that the delete family and delete column will be at the top 
because they have the default column name and default timestamp.
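
To make the effect of ignoring the key type concrete, here is a small sketch on a simplified key model. It is not the KeyValue code from the diff: the Key class, the type codes, the seq field standing in for insertion order, and the direction of each tiebreak are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Simplified model of a key; not HBase's KeyValue.
final class IgnoreTypeSketch {
    // Illustrative type codes; when type is compared, the larger code sorts first.
    static final int PUT = 4;
    static final int DELETE = 8;

    static final class Key {
        final String row;
        final String col;
        final long ts;
        final int type;
        final long seq; // stands in for insertion order

        Key(String row, String col, long ts, int type, long seq) {
            this.row = row;
            this.col = col;
            this.ts = ts;
            this.type = type;
            this.seq = seq;
        }

        String label() {
            return (type == PUT ? "Put" : "Delete") + "@" + seq;
        }
    }

    static Comparator<Key> comparator(final boolean ignoreType) {
        return (a, b) -> {
            int c = a.row.compareTo(b.row);
            if (c != 0) return c;
            c = a.col.compareTo(b.col);
            if (c != 0) return c;
            c = Long.compare(b.ts, a.ts); // newer timestamps first
            if (c != 0) return c;
            if (!ignoreType) {
                c = Integer.compare(b.type, a.type); // deletes ahead of puts
                if (c != 0) return c;
            }
            return Long.compare(a.seq, b.seq); // fall back to insertion order
        };
    }

    // Sort put, delete, put at the same row/col/timestamp and report the order.
    static List<String> sortedLabels(boolean ignoreType) {
        List<Key> keys = new ArrayList<>();
        keys.add(new Key("row", "col", 100, PUT, 1));
        keys.add(new Key("row", "col", 100, DELETE, 2));
        keys.add(new Key("row", "col", 100, PUT, 3));
        keys.sort(comparator(ignoreType));
        List<String> out = new ArrayList<>();
        for (Key k : keys) {
            out.add(k.label());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortedLabels(false)); // [Delete@2, Put@1, Put@3]
        System.out.println(sortedLabels(true));  // [Put@1, Delete@2, Put@3]
    }
}
```

With the type compared, the delete sorts ahead of both puts regardless of when it arrived; with the type ignored, the three keys stay in the order they were inserted.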


This addresses bug HBASE-3276.
http://issues.apache.org/jira/browse/HBASE-3276


Diffs
-

  trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1039233 
  
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/KeyValueScanFixture.java
 1039233 
  
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 
1039233 

Diff: http://review.cloudera.org/r/1252/diff


Testing
---

Test cases added. Since there is a change in semantics, some previous tests 
were failing because of this change. Those tests have been modified to test the 
newer behavior.


Thanks,

Pranav




 delete followed by a put with the same timestamp
 

 Key: HBASE-3276
 URL: https://issues.apache.org/jira/browse/HBASE-3276
 Project: HBase
  Issue Type: Bug
Reporter: Kannan Muthukkaruppan
Assignee: Kannan Muthukkaruppan

 [Note: This issue is relevant only for cases that don't use the default 
 time based versions, but provide/manage versions explicitly.]
 The fix for HBASE-1485 ensures that if there are multiple puts with the same 
 timestamp the later one wins.
 However, if there is a delete for a specific timestamp, then the later put 
 doesn't win. 
 Say for example the following is the sequence of operations:
 put row/col/v1 - value1
 deleteColumn row/col/v1
 put row/col/v1 - value2
 Without the deleteColumn(), HBASE-1485 ensures that value2 is the winner.
 However, with the deleteColumn() thrown into the mix, the delete wins, and 
 one cannot insert a new value at that version. [The only, unsatisfactory, 
 workaround at this point seems to be to trigger a major compaction. The major 
 compaction would clear the delete marker, and allow new cells to be created with 
 that version again.] 
 ---
 Seems like it might not be too complicated to extend the fix for HBASE-1485 
 to also respect ordering between delete/put operations. I'll look into this 
 further.
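
The reported behavior can be modeled in a few lines. This is only a toy model of the read path under the current semantics, assuming that a delete marker at a version hides every put at that version no matter when the put arrived; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of the behavior described in this issue; not HBase code.
final class DeleteMasksPutSketch {

    static final class Op {
        final boolean isDelete;
        final long version;
        final String value;

        private Op(boolean isDelete, long version, String value) {
            this.isDelete = isDelete;
            this.version = version;
            this.value = value;
        }

        static Op put(long version, String value) {
            return new Op(false, version, value);
        }

        static Op deleteColumn(long version) {
            return new Op(true, version, null);
        }
    }

    // Returns the value visible at the given version, or null if none is visible.
    static String read(List<Op> opsInArrivalOrder, long version) {
        String latestPut = null;
        for (Op op : opsInArrivalOrder) {
            if (op.version != version) {
                continue;
            }
            if (op.isDelete) {
                return null; // the marker wins even over puts that arrive later
            }
            latestPut = op.value; // among puts, the later one wins (HBASE-1485)
        }
        return latestPut;
    }

    public static void main(String[] args) {
        List<Op> putsOnly = Arrays.asList(Op.put(1, "value1"), Op.put(1, "value2"));
        List<Op> withDelete = Arrays.asList(
            Op.put(1, "value1"), Op.deleteColumn(1), Op.put(1, "value2"));
        System.out.println(read(putsOnly, 1));   // value2
        System.out.println(read(withDelete, 1)); // null
    }
}
```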




[jira] Commented: (HBASE-3276) delete followed by a put with the same timestamp

2010-11-26 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936119#action_12936119
 ] 

HBase Review Board commented on HBASE-3276:
---

Message from: Ryan Rawson ryano...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1252/#review1993
---



trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
http://review.cloudera.org/r/1252/#comment6297

What are all the consequences of not sorting by type when using 
KVComparator?  Does this mean we might create HFiles that are not sorted properly, 
because the HFile comparator uses the KeyComparator directly with ignoreType = 
false? 

While in memstore we can rely on memstoreTS to roughly order by insertion 
time, and the Put/Delete should probably work in that situation, you are 
talking about modifying a pretty core and important concept in how we sort 
things.

There are other ways to reconcile bugs like this; one of them is to extend 
the memstoreTS concept into the HFile and use that to reconcile during reads.  
There is another JIRA where I proposed this.  

If we are talking about 0.92 and beyond I'd prefer building a solid base 
rather than dangerous hacks like this.  Our unit tests are not extremely 
extensive, so while they might pass, that doesn't guarantee lack of bad 
behaviour later on.



- Ryan








[jira] Commented: (HBASE-3276) delete followed by a put with the same timestamp

2010-11-26 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936123#action_12936123
 ] 

HBase Review Board commented on HBASE-3276:
---

Message from: Pranav Khaitan pranavkhai...@gmail.com


bq.  On 2010-11-26 14:54:45, Ryan Rawson wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1373
bq.   http://review.cloudera.org/r/1252/diff/1/?file=17712#file17712line1373
bq.  
bq.   what are all the consequences for not sorting by type when using 
KVComparator?  Does this mean we might create HFiles that not sorted properly, 
because the HFile comparator uses the KeyComparator directly with ignoreType = 
false. 
bq.   
bq.   While in memstore we can rely on memstoreTS to roughly order by 
insertion time, and the Put/Delete should probably work in that situation, you 
are talking about modifiying a pretty core and important concept in how we sort 
things.
bq.   
bq.   There are other ways to reconcile bugs like this, one of them is to 
extend the memstoreTS concept into the HFile and use that to reconcile during 
reads.  There is another JIRA where I proposed this.  
bq.   
bq.   If we are talking about 0.92 and beyond I'd prefer building a solid 
base rather than dangerous hacks like this.  Our unit tests are not extremely 
extensive, so while they might pass, that doesnt guarantee lack of bad 
behaviour later on.
bq.  

Agree. As I mentioned, this is a major change and more thought needs to be 
given to it.

However, to resolve issues like HBASE-3276, we need either such a change or 
extend the memstoreTS concept to HFile as you mentioned.

About consequences, I don't see anything negative here. This change only 
affects the sorting of keys having same row, col, timestamp. After this change, 
all keys with the same row, col, ts will be sorted purely based on the order in 
which they were inserted. When a memstore is flushed to HFile, the memstoreTS 
takes care of ordering. During compactions, the KeyValueHeap breaks ties by 
using the sequence ids of storefiles. 
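
As a rough illustration of breaking ties by storefile sequence id during a merge, here is a sketch with a PriorityQueue over sorted sources. It is not the KeyValueHeap implementation; the Source class is hypothetical, and the direction of the tiebreak (lower sequence id first, so that the merged output keeps insertion order) is an assumption chosen to match the wording above. The real heap may order ties the other way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Sketch of a k-way merge with a sequence-id tiebreak; not HBase's KeyValueHeap.
final class MergeTieBreakSketch {

    static final class Source {
        final long sequenceId;
        final List<String> sortedKeys;
        int pos = 0;

        Source(long sequenceId, List<String> sortedKeys) {
            this.sequenceId = sequenceId;
            this.sortedKeys = sortedKeys;
        }

        String peek() {
            return pos < sortedKeys.size() ? sortedKeys.get(pos) : null;
        }
    }

    static List<String> merge(List<Source> sources) {
        PriorityQueue<Source> heap = new PriorityQueue<>((a, b) -> {
            int c = a.peek().compareTo(b.peek());
            if (c != 0) return c;
            // Equal keys: assumed rule is that the older file goes first.
            return Long.compare(a.sequenceId, b.sequenceId);
        });
        for (Source s : sources) {
            if (s.peek() != null) heap.add(s);
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Source s = heap.poll();
            out.add(s.peek() + "@file" + s.sequenceId);
            s.pos++; // advance while the source is outside the heap
            if (s.peek() != null) heap.add(s);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Source> files = Arrays.asList(
            new Source(2, Arrays.asList("k", "z")),
            new Source(1, Arrays.asList("a", "k")));
        System.out.println(merge(files)); // [a@file1, k@file1, k@file2, z@file2]
    }
}
```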


- Pranav


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1252/#review1993
---








[jira] Commented: (HBASE-3276) delete followed by a put with the same timestamp

2010-11-26 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936128#action_12936128
 ] 

HBase Review Board commented on HBASE-3276:
---

Message from: Ryan Rawson ryano...@gmail.com


bq.  On 2010-11-26 14:54:45, Ryan Rawson wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1373
bq.   http://review.cloudera.org/r/1252/diff/1/?file=17712#file17712line1373
bq.  
bq.   what are all the consequences for not sorting by type when using 
KVComparator?  Does this mean we might create HFiles that not sorted properly, 
because the HFile comparator uses the KeyComparator directly with ignoreType = 
false. 
bq.   
bq.   While in memstore we can rely on memstoreTS to roughly order by 
insertion time, and the Put/Delete should probably work in that situation, you 
are talking about modifiying a pretty core and important concept in how we sort 
things.
bq.   
bq.   There are other ways to reconcile bugs like this, one of them is to 
extend the memstoreTS concept into the HFile and use that to reconcile during 
reads.  There is another JIRA where I proposed this.  
bq.   
bq.   If we are talking about 0.92 and beyond I'd prefer building a solid 
base rather than dangerous hacks like this.  Our unit tests are not extremely 
extensive, so while they might pass, that doesnt guarantee lack of bad 
behaviour later on.
bq.  
bq.  
bq.  Pranav Khaitan wrote:
bq.  Agree. As I mentioned, this is a major change and more thought needs 
to be given to it.
bq.  
bq.  However, to resolve issues like HBASE-3276, we need either such a 
change or extend the memstoreTS concept to HFile as you mentioned.
bq.  
bq.  About consequences, I don't see anything negative here. This change 
only affects the sorting of keys having same row, col, timestamp. After this 
change, all keys with the same row, col, ts will be sorted purely based on the 
order in which they were inserted. When a memstore is flushed to HFile, the 
memstoreTS takes care of ordering. During compactions, the KeyValueHeap breaks 
ties by using the sequence ids of storefiles.

The problem is you are now changing how things are ordered sometimes but not 
all the time.  HFile directly uses the rawcomparator, instantiating it directly 
rather than getting it via the code path you changed.  So now you create a 
memstore in this order:

row,col,100,Put  (memstoreTS=1)
row,col,100,Delete (memstoreTS=2)
row,col,100,Put (memstoreTS=3)

But the HFile comparator will consider this out of order since it doesn't know 
about memstoreTS and it still expects things to be in a certain order.

I'm a little wary of having implicit ordering in the HFiles... in your new 
scheme, Put,Delete,Put are in that order 'just because they are', and the 
comparator cannot put them back in order, and must rely on scanner order.  
During compactions we would place keys in order based on which files they came 
from, but they wouldn't themselves have an order.  Basically we should get rid 
of 'type sorting' and use memstoreTS sorting in memory and implicit sorting in 
the HFiles.  


- Ryan


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1252/#review1993
---






[jira] Commented: (HBASE-3276) delete followed by a put with the same timestamp

2010-11-26 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936131#action_12936131
 ] 

HBase Review Board commented on HBASE-3276:
---

Message from: Pranav Khaitan pranavkhai...@gmail.com


bq.  On 2010-11-26 14:54:45, Ryan Rawson wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1373
bq.   http://review.cloudera.org/r/1252/diff/1/?file=17712#file17712line1373
bq.  
bq.   what are all the consequences for not sorting by type when using 
KVComparator?  Does this mean we might create HFiles that not sorted properly, 
because the HFile comparator uses the KeyComparator directly with ignoreType = 
false. 
bq.   
bq.   While in memstore we can rely on memstoreTS to roughly order by 
insertion time, and the Put/Delete should probably work in that situation, you 
are talking about modifiying a pretty core and important concept in how we sort 
things.
bq.   
bq.   There are other ways to reconcile bugs like this, one of them is to 
extend the memstoreTS concept into the HFile and use that to reconcile during 
reads.  There is another JIRA where I proposed this.  
bq.   
bq.   If we are talking about 0.92 and beyond I'd prefer building a solid 
base rather than dangerous hacks like this.  Our unit tests are not extremely 
extensive, so while they might pass, that doesnt guarantee lack of bad 
behaviour later on.
bq.  
bq.  
bq.  Pranav Khaitan wrote:
bq.  Agree. As I mentioned, this is a major change and more thought needs 
to be given to it.
bq.  
bq.  However, to resolve issues like HBASE-3276, we need either such a 
change or extend the memstoreTS concept to HFile as you mentioned.
bq.  
bq.  About consequences, I don't see anything negative here. This change 
only affects the sorting of keys having same row, col, timestamp. After this 
change, all keys with the same row, col, ts will be sorted purely based on the 
order in which they were inserted. When a memstore is flushed to HFile, the 
memstoreTS takes care of ordering. During compactions, the KeyValueHeap breaks 
ties by using the sequence ids of storefiles.
bq.  
bq.  Ryan Rawson wrote:
bq.  the problem is you are now changing how things are ordered sometimes 
but not all the time.  HFile directly uses the rawcomparator, instantiating it 
directly rather than getting it via the code path you changed.  So now you 
create a memstore in this order:
bq.  
bq.  row,col,100,Put  (memstoreTS=1)
bq.  row,col,100,Delete (memstoreTS=2)
bq.  row,col,100,Put (memstoreTS=3)
bq.  
bq.  But the HFile comparator will consider this out of order since it 
doesnt know about memstoreTS and it still expects things to be in a certain 
order.
bq.  
bq.  I'm a little wary of having implicit ordering in the HFiles... in your 
new scheme, Put,Delete,Put are in that order 'just because they are', and the 
comparator cannot put them back in order, and must rely on scanner order.  
During compactions we would place keys in order based on which files they came 
from, but they wouldn't themselves have an order.  Basically we should get rid 
of 'type sorting' and use memstoreTS sorting in memory and implicit sorting in 
the HFiles.  
bq.  
bq. 

Right. I see that HFile does an extra check and throws an IOException when it 
gets data out of order. So if we go forward with this change, we will have 
to ensure that the comparator used by HFile knows about this change. This 
can be achieved in two ways: first, by setting the default value of 
ignoreType = true; alternatively, the HFile can explicitly set ignoreType = true.
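
A sketch of the kind of append-time order check being discussed, on a simplified key of {timestamp, typeCode}. The writer class, the key layout, the type codes and the rule that equal keys are accepted are assumptions for illustration, not the HFile writer itself.

```java
import java.io.IOException;
import java.util.Comparator;

// Simplified stand-in for a file writer that verifies key order on append.
final class OrderCheckingWriterSketch {
    private final Comparator<int[]> comparator;
    private int[] lastKey;

    OrderCheckingWriterSketch(Comparator<int[]> comparator) {
        this.comparator = comparator;
    }

    // Keys are {timestamp, typeCode}; typeCode 4 = Put, 8 = Delete (illustrative).
    static Comparator<int[]> comparator(final boolean ignoreType) {
        return (a, b) -> {
            int c = Integer.compare(b[0], a[0]); // newer timestamp first
            if (c != 0 || ignoreType) return c;
            return Integer.compare(b[1], a[1]);  // larger type code first
        };
    }

    void append(int[] key) throws IOException {
        if (lastKey != null && comparator.compare(lastKey, key) > 0) {
            throw new IOException("key appended out of order");
        }
        lastKey = key;
    }

    // True if the writer accepts the keys in the given order.
    static boolean accepts(boolean ignoreType, int[]... keys) {
        OrderCheckingWriterSketch writer =
            new OrderCheckingWriterSketch(comparator(ignoreType));
        try {
            for (int[] key : keys) {
                writer.append(key);
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int[] put = {100, 4};
        int[] delete = {100, 8};
        System.out.println(accepts(false, put, delete, put)); // false
        System.out.println(accepts(true, put, delete, put));  // true
    }
}
```

The Put, Delete, Put sequence from the memstore is rejected by a type-aware check and accepted once the check uses a comparator that ignores the type, which is why the file-level comparator would have to change together with the memstore one.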


- Pranav


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1252/#review1993
---






[jira] Commented: (HBASE-3276) delete followed by a put with the same timestamp

2010-11-26 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936134#action_12936134
 ] 

HBase Review Board commented on HBASE-3276:
---

Message from: Pranav Khaitan pranavkhai...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1252/
---

(Updated 2010-11-26 16:02:47.462170)


Review request for hbase, Jonathan Gray and Kannan Muthukkaruppan.


Summary (updated)
---

This is a design change suggested in HBASE-3276 so adequate thought should be 
given before proceeding. 

The main code change is just one line which is to ignore key type while doing 
KV comparisons. When the key type is ignored, then all the keys for the same 
timestamp are sorted according to the order in which they were inserted. It is 
still ensured that the delete family and delete column will be at the top 
because they have the default column name and default timestamp.


This addresses bug HBASE-3276.
http://issues.apache.org/jira/browse/HBASE-3276


Diffs
-

  trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1039233 
  
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/KeyValueScanFixture.java
 1039233 
  
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 
1039233 

Diff: http://review.cloudera.org/r/1252/diff


Testing
---

Test cases added. Since there is a change in semantics, some previous tests 
were failing because of this change. Those tests have been modified to test the 
newer behavior.


Thanks,

Pranav







[jira] Commented: (HBASE-3267) close_region shell command breaks region

2010-11-24 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935566#action_12935566
 ] 

HBase Review Board commented on HBASE-3267:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1250/
---

Review request for hbase and Jonathan Gray.


Summary
---

So, things are different in the new master.  Close region should close region.  
Not close and then reopen.  To close and reopen elsewhere, that's an unassign or 
a move (both of which were missing from shell but which are added in this 
patch).  I fixed the close so that it's a close that does not touch zk... the 
region is just closed on the regionserver.  Not going to zk means the 
close no longer causes complaints.  Close is dangerous though in that the 
region is now permanently offline (I updated the close help to explain this is 
so).   To address it being permanently offline, I added a new assign to the 
shell. 

While in here, I removed commands that no longer make sense such as 
enable_region and disable_region. 

M src/main/java/org/apache/hadoop/hbase/master/HMaster.java
  Change move implementation so can pass an empty host.
  Empty host means move to random location rather than
  explicit server.
  Added assign, unassign
M src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
  (clearRegionPlan): Added.
M src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java
  Improved move javadoc.  Added assign, unassign.
M src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
  Improved javadoc. Added assign and unassign.
M src/main/ruby/hbase/admin.rb
  Added balancer, balance_switch, assign, unassign, removed
  zk, enable_region and disable_region (the latter make no sense
  anymore now disable/enable is done differently).
D src/main/ruby/shell/commands/zk.rb
A src/main/ruby/shell/commands/assign.rb
A src/main/ruby/shell/commands/balance_switch.rb
D src/main/ruby/shell/commands/disable_region.rb
A src/main/ruby/shell/commands/balancer.rb
A src/main/ruby/shell/commands/unassign.rb
D src/main/ruby/shell/commands/enable_region.rb
A src/main/ruby/shell/commands/move.rb
M src/main/ruby/shell/commands/close_region.rb
  Fixed up help
M src/main/ruby/shell.rb
  Added and removed commands.  


This addresses bug hbase-3267.
http://issues.apache.org/jira/browse/hbase-3267


Diffs
-

  trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1038768 
  trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 1038768 
  trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 
1038768 
  trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1038768 
  trunk/src/main/ruby/hbase/admin.rb 1038768 
  trunk/src/main/ruby/shell.rb 1038768 
  trunk/src/main/ruby/shell/commands/assign.rb PRE-CREATION 
  trunk/src/main/ruby/shell/commands/balance_switch.rb PRE-CREATION 
  trunk/src/main/ruby/shell/commands/balancer.rb PRE-CREATION 
  trunk/src/main/ruby/shell/commands/close_region.rb 1038768 
  trunk/src/main/ruby/shell/commands/disable_region.rb 1038768 
  trunk/src/main/ruby/shell/commands/enable_region.rb 1038768 
  trunk/src/main/ruby/shell/commands/move.rb PRE-CREATION 
  trunk/src/main/ruby/shell/commands/unassign.rb PRE-CREATION 
  trunk/src/main/ruby/shell/commands/zk.rb 1038768 

Diff: http://review.cloudera.org/r/1250/diff


Testing
---

I tested shell here on my little cluster.


Thanks,

stack




 close_region shell command breaks region
 

 Key: HBASE-3267
 URL: https://issues.apache.org/jira/browse/HBASE-3267
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, shell
Affects Versions: 0.90.0
Reporter: Todd Lipcon
Assignee: stack
Priority: Critical
 Fix For: 0.90.0


 It used to be that you could use the close_region command from the shell to 
 close a region on one server and have the master reassign it elsewhere. Now 
 if you close a region, you get the following errors in the master log:
 2010-11-23 00:46:34,090 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSING for region 
 ffaa7999e909dbd6544688cc8ab303bd from server 
 haus01.sf.cloudera.com,12020,1290501789693 but region was in  the state null 
 and not in expected PENDI
 2010-11-23 00:46:34,530 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
 master:6-0x12c537d84e10062 Received ZooKeeper Event, 
 type=NodeDataChanged, state=SyncConnected, 
 path=/hbase/unassigned/ffaa7999e909dbd6544688cc8ab303bd
 2010-11-23 00:46:34,531 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: 
 

[jira] Commented: (HBASE-3267) close_region shell command breaks region

2010-11-24 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935579#action_12935579
 ] 

HBase Review Board commented on HBASE-3267:
---

Message from: Jonathan Gray jg...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1250/#review1975
---


This is great.  I like this much better than hacking up the master transition 
code.

My main concern is around the exact semantics of assign/unassign (and close).  
I think we need to do good javadoc on the HBA methods to describe how you would 
use these or at least a bit about their behavior.  assign() just does an 
assign, but unassign() actually clears stuff out.  It seems doing a close() 
behind the master's back, then asking the master to assign that region, should 
not work... but it does?


trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
http://review.cloudera.org/r/1250/#comment6230

Is there an open_region?  This assign() goes through the master, so what is 
the opposite of close_region which doesn't go through the master?

Doesn't close_region now put the master in a bad state, so it won't expect 
an assignment to be done on a region which it thinks is already assigned?  
There is a force on unassign() but not on assign().

In the old master, for HBCK, I added a hook in to the master to clear the 
in-memory state for a region.  To deal with dupe assignment, I did silent 
close_regions and then cleared the in-memory state.  Then I triggered a new 
assignment.



trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java
http://review.cloudera.org/r/1250/#comment6231

this is awesome javadoc.  is there somewhere else we can put this rather 
than in just the move() API?  Maybe in the HBA class comment or something?  
Somewhere we can reference in other javadocs about what a regionname is



trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1250/#comment6232

So you're supposed to call move instead of open_region?  Or why the change 
in move() though this looks good.



trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1250/#comment6233

Why META and not in-memory state?  Once you hit assign() you rely on the 
in-memory state anyways?



trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1250/#comment6238

On assign we just do the assignment, but below on unassign() we first clear 
existing plans and clear from RIT.  Why the difference?



trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1250/#comment6234

is this necessary?  should the unassign method taking force deal with 
anything needed to force it?



trunk/src/main/ruby/hbase/admin.rb
http://review.cloudera.org/r/1250/#comment6235

zk didn't work?  why is this removed?



trunk/src/main/ruby/shell/commands/assign.rb
http://review.cloudera.org/r/1250/#comment6236

whitespace.  and what exactly are the semantics of this?  what if region is 
already assigned?

we should document somewhere more specifically what the behavior is of 
these methods if we're going to expose them to the client and the shell.  
neither place really describes what this means and i can imagine users will be 
doing lots of foot shooting with tools like this.

more importantly, though, i'm trying to understand the use cases for these. 
 if it's to unbreak stuff, it's not clear to me how exactly you would use it 
given that the master will reject certain operations in the wrong order.



trunk/src/main/ruby/shell/commands/close_region.rb
http://review.cloudera.org/r/1250/#comment6237

Why would you use close and not unassign/assign/move?  It's because close 
is done silently?  Should say that if that's the distinction.

Is this comment saying you can use unassign or move after you issue close?  
or instead of?



trunk/src/main/ruby/shell/commands/unassign.rb
http://review.cloudera.org/r/1250/#comment6239

this doesn't use encoded region name?

is move then different from the other methods?


- Jonathan






[jira] Commented: (HBASE-3267) close_region shell command breaks region

2010-11-24 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935590#action_12935590
 ] 

HBase Review Board commented on HBASE-3267:
---

Message from: st...@duboce.net


bq.  On 2010-11-24 15:45:32, Jonathan Gray wrote:
bq.   This is great.  I like this much better than hacking up the master 
transition code.
bq.   
bq.   My main concern is around the exact semantics of assign/unassign (and 
close).  I think we need to do good javadoc on the HBA methods to describe how 
you would use these or at least a bit about their behavior.  assign() just does 
an assign, but unassign() actually clears stuff out.  It seems doing a close() 
behind the masters back, then asking the master to assign that region, should 
not work... but it does?

Well, my notion is that users shouldn't be doing these manual messings any more.  
Fixup stuff is now for hbck to do.

Yes, close of a region is done w/o the master's involvement.  You'd rarely do it.  
Yes, an assign will assign a region EVEN IF ALREADY assigned.  Messing in here 
can get you in trouble.  I was able to manufacture some ugly conditions -- a 
stuck region trying to assign to the same server over and over -- but unassign 
with a force now clears out RIT and does the right thing, i.e. we have 
enough tools to hang ourselves on the new master but also the tools to undo.


bq.  On 2010-11-24 15:45:32, Jonathan Gray wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java, line 
980
bq.   http://review.cloudera.org/r/1250/diff/1/?file=17648#file17648line980
bq.  
bq.   Is there an open_region?  This assign() goes through the master, so 
what is the opposite of close_region which doesn't go through the master?
bq.   
bq.   Doesn't close_region now put the master in a bad state, so it won't 
expect an assignment to be done on a region which it thinks is already 
assigned?  There is a force on unassign() but not on assign().
bq.   
bq.   In the old master, for HBCK, I added a hook in to the master to 
clear the in-memory state for a region.  To deal with dupe assignment, I did 
silent close_regions and then cleared the in-memory state.  Then I triggered a 
new assignment.

No open_region.  Someone can add that later if wanted.  Otherwise, use move to 
place region on specific server.

On close_region, yes, puts master in bad state but you'd only do close_region 
when doing fix up of some whack condition.  I was tempted to just remove these 
commands but since we don't know what states new master could put us in, I'll 
leave them in for now.

I'll add force to assign so same as unassign.


Regarding what you did for old master hbck: you could call close_region, then an 
unassign with a force would clear the in-memory state and get the region 
assigned elsewhere.

But hbck should be doing this.  Not a user manually, not unless things are 
really hosed.


bq.  On 2010-11-24 15:45:32, Jonathan Gray wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java, 
line 138
bq.   http://review.cloudera.org/r/1250/diff/1/?file=17649#file17649line138
bq.  
bq.   this is awesome javadoc.  is there somewhere else we can put this 
rather than in just the move() API?  Maybe in the HBA class comment or 
something?  Somewhere we can reference in other javadocs about what a 
regionname is

I moved the interface doc out to HBA as per your suggestion.


bq.  On 2010-11-24 15:45:32, Jonathan Gray wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 709
bq.   http://review.cloudera.org/r/1250/diff/1/?file=17651#file17651line709
bq.  
bq.   So you're supposed to call move instead of open_region?  Or why the 
change in move() though this looks good.

Just added it as something you might want to do.  unassign does same thing 
really.  I could back it out.


bq.  On 2010-11-24 15:45:32, Jonathan Gray wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 994
bq.   http://review.cloudera.org/r/1250/diff/1/?file=17651#file17651line994
bq.  
bq.   Why META and not in-memory state?  Once you hit assign() you rely on 
the in-memory state anyways?

I only have a region server name, not an HRI, which is what the in-memory state 
is keyed by.  I could iterate the Map I suppose, but then I'm thinking it may 
have been cleared from the in-memory state.


bq.  On 2010-11-24 15:45:32, Jonathan Gray wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 996
bq.   http://review.cloudera.org/r/1250/diff/1/?file=17651#file17651line996
bq.  
bq.   on assign we just do the assignment, but below on unassign() we 
first clear existing plans and clear from RIT.  why the difference.

I made it so we only clear state if force is added to the unassign.


bq.  On 2010-11-24 15:45:32, Jonathan Gray wrote:
bq.   

[jira] Commented: (HBASE-3267) close_region shell command breaks region

2010-11-24 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935598#action_12935598
 ] 

HBase Review Board commented on HBASE-3267:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1250/
---

(Updated 2010-11-24 16:44:36.709870)


Review request for hbase and Jonathan Gray.


Changes
---

Addresses Jon's comments, mostly by way of bulking up help in shell with 
warnings and pulling into HBA the javadoc that was out on the HMasterInterface.  
Also did stuff like make assign and unassign symmetric, both taking a force 
param.


Summary
---

So, things are different in the new master.  Close region should close region.  
Not close and then reopen.  To close and reopen elsewhere, that's an unassign or 
a move (both of which were missing from shell but which are added in this 
patch).  I fixed the close so that it's a close that does not touch zk... the 
region is just closed on the regionserver.  Not going to zk means the 
close no longer provokes complaints from the master.  Close is dangerous though, 
in that the region is now permanently offline (I updated the close help to 
explain this).  To address it being permanently offline, I added a new assign 
to the shell. 
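
The difference between close, assign and unassign described above can be sketched as a toy model. This is not HMaster or AssignmentManager code: the class and method names (RegionStateSketch, markStuck, masterBelieves and so on) are hypothetical, and the rule that a non-forced unassign leaves a stuck region alone is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the semantics discussed in this thread; not HBase code. */
final class RegionStateSketch {
    // What the master believes: region -> server.
    private final Map<String, String> masterView = new HashMap<>();
    // Regions the master considers in transition (RIT).
    private final Set<String> inTransition = new HashSet<>();
    // What is actually open on a regionserver: region -> server.
    private final Map<String, String> actuallyOpen = new HashMap<>();

    /** assign: the master assigns the region even if it is already assigned. */
    void assign(String region, String server) {
        masterView.put(region, server);
        actuallyOpen.put(region, server);
    }

    /** close_region: closes on the regionserver only; the master is not told. */
    void closeRegion(String region) {
        actuallyOpen.remove(region);
    }

    /** Simulates a region stuck in transition on the master. */
    void markStuck(String region) {
        inTransition.add(region);
    }

    /**
     * unassign: close and reopen on another server. With force, the RIT entry
     * is cleared first. Assumption: without force, a stuck region is left alone.
     * Returns true if the region was reassigned.
     */
    boolean unassign(String region, boolean force, String reopenOn) {
        if (force) {
            inTransition.remove(region);
        }
        if (inTransition.contains(region)) {
            return false;
        }
        assign(region, reopenOn);
        return true;
    }

    boolean isOnline(String region) {
        return actuallyOpen.containsKey(region);
    }

    String masterBelieves(String region) {
        return masterView.get(region);
    }

    public static void main(String[] args) {
        RegionStateSketch s = new RegionStateSketch();
        s.assign("r1", "rs1");
        s.closeRegion("r1");
        System.out.println("online after close: " + s.isOnline("r1"));
        System.out.println("master still believes: " + s.masterBelieves("r1"));
        s.markStuck("r1");
        System.out.println("plain unassign: " + s.unassign("r1", false, "rs2"));
        System.out.println("forced unassign: " + s.unassign("r1", true, "rs2"));
        System.out.println("online again: " + s.isOnline("r1"));
    }
}
```

The point of the model is only that a silent close leaves the master's view and the actual state out of step until an assign or a forced unassign brings them back together.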

While in here, I removed commands that no longer make sense, such as 
enable_region and disable_region. 

M src/main/java/org/apache/hadoop/hbase/master/HMaster.java
  Change move implementation so can pass an empty host.
  Empty host means move to random location rather than
  explicit server.
  Added assign, unassign
M src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
  (clearRegionPlan): Added.
M src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java
  Improved move javadoc.  Added assign, unassign.
M src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
  Improved javadoc. Added assign and unassign.
M src/main/ruby/hbase/admin.rb
  Added balancer, balance_switch, assign, unassign, removed
  zk, enable_region and disable_region (the latter make no sense
  anymore now disable/enable is done differently).
D src/main/ruby/shell/commands/zk.rb
A src/main/ruby/shell/commands/assign.rb
A src/main/ruby/shell/commands/balance_switch.rb
D src/main/ruby/shell/commands/disable_region.rb
A src/main/ruby/shell/commands/balancer.rb
A src/main/ruby/shell/commands/unassign.rb
D src/main/ruby/shell/commands/enable_region.rb
A src/main/ruby/shell/commands/move.rb
M src/main/ruby/shell/commands/close_region.rb
  Fixed up help
M src/main/ruby/shell.rb
  Added and removed commands.  


This addresses bug hbase-3267.
http://issues.apache.org/jira/browse/hbase-3267


Diffs (updated)
-

  trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1038878 
  trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 1038878 
  trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 
1038878 
  trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1038878 
  trunk/src/main/ruby/hbase/admin.rb 1038878 
  trunk/src/main/ruby/shell.rb 1038878 
  trunk/src/main/ruby/shell/commands/assign.rb PRE-CREATION 
  trunk/src/main/ruby/shell/commands/balance_switch.rb PRE-CREATION 
  trunk/src/main/ruby/shell/commands/balancer.rb PRE-CREATION 
  trunk/src/main/ruby/shell/commands/close_region.rb 1038878 
  trunk/src/main/ruby/shell/commands/disable_region.rb 1038878 
  trunk/src/main/ruby/shell/commands/enable_region.rb 1038878 
  trunk/src/main/ruby/shell/commands/move.rb PRE-CREATION 
  trunk/src/main/ruby/shell/commands/unassign.rb PRE-CREATION 
  trunk/src/main/ruby/shell/commands/zk.rb 1038878 

Diff: http://review.cloudera.org/r/1250/diff


Testing
---

I tested shell here on my little cluster.


Thanks,

stack





[jira] Commented: (HBASE-3267) close_region shell command breaks region

2010-11-24 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935608#action_12935608
 ] 

HBase Review Board commented on HBASE-3267:
---

Message from: Jonathan Gray jg...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1250/#review1977
---

Ship it!


Thanks for making changes, this looks great.

I completely understand the need for these, especially for HBCK, I guess I just 
think of adding things to the shell as stuff people will try to use.  With all 
the added doc I think it's fine.  We'll for sure make changes to this stuff as 
we see what happens in the wild with the new master.

+1 for commit

- Jonathan





 close_region shell command breaks region
 

 Key: HBASE-3267
 URL: https://issues.apache.org/jira/browse/HBASE-3267
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, shell
Affects Versions: 0.90.0
Reporter: Todd Lipcon
Assignee: stack
Priority: Critical
 Fix For: 0.90.0


 It used to be that you could use the close_region command from the shell to 
 close a region on one server and have the master reassign it elsewhere. Now 
 if you close a region, you get the following errors in the master log:
 2010-11-23 00:46:34,090 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSING for region 
 ffaa7999e909dbd6544688cc8ab303bd from server 
 haus01.sf.cloudera.com,12020,1290501789693 but region was in  the state null 
 and not in expected PENDI
 2010-11-23 00:46:34,530 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
 master:6-0x12c537d84e10062 Received ZooKeeper Event, 
 type=NodeDataChanged, state=SyncConnected, 
 path=/hbase/unassigned/ffaa7999e909dbd6544688cc8ab303bd
 2010-11-23 00:46:34,531 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: 
 master:6-0x12c537d84e10062 Retrieved 128 byte(s) of data from znode 
 /hbase/unassigned/ffaa7999e909dbd6544688cc8ab303bd and set watcher; 
 region=usertable,user1951957302,1290501969
 2010-11-23 00:46:34,531 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSED, 
 server=haus01.sf.cloudera.com,12020,1290501789693, 
 region=ffaa7999e909dbd6544688cc8ab303bd
 2010-11-23 00:46:34,531 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 
 ffaa7999e909dbd6544688cc8ab303bd from server 
 haus01.sf.cloudera.com,12020,1290501789693 but region was in  the state null 
 and not in expected PENDIN
 and the region just gets stuck closed

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3227) Edit of log messages before branching...

2010-11-23 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934996#action_12934996
 ] 

HBase Review Board commented on HBASE-3227:
---

Message from: st...@duboce.net


bq.  On 2010-11-22 17:29:45, Nicolas wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, 
line 739
bq.   http://review.cloudera.org/r/1212/diff/1/?file=17170#file17170line739
bq.  
bq.   I'd suggest keeping the store name in this debug message since we're 
considering thread pools for compactions...

Won't the store name be part of the path on the next line when we do 
sf.toString() where sf is the file we're compacting all into?


- stack


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1212/#review1971
---





 Edit of log messages before branching...
 

 Key: HBASE-3227
 URL: https://issues.apache.org/jira/browse/HBASE-3227
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Fix For: 0.90.0







[jira] Commented: (HBASE-3227) Edit of log messages before branching...

2010-11-22 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934688#action_12934688
 ] 

HBase Review Board commented on HBASE-3227:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1212/#review1971
---



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
http://review.cloudera.org/r/1212/#comment6227

I'd suggest keeping the store name in this debug message since we're 
considering thread pools for compactions...


- Nicolas








[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-11-19 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934033#action_12934033
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Mingjie Lai mjla...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/
---

(Updated 2010-11-19 14:39:18.378849)


Review request for hbase, stack, Andrew Purtell, and Jonathan Gray.


Changes
---

Final patch, ready to be checked in:
- Truncated white space at line end
- Rebuilt the patch after HBase-2002 checked in. 


Summary
---

The diff actually contains 2 separate patches: HBase-2001 and the one for 
(HBASE-2002+HBASE-2321). The reason is that HBase-2001's CommandTarget relies 
on HBASE-2002 + HBASE-2321, whose patches are still under review. I have to 
include Gary's HBASE-2002 and HBASE-2321 with this diff, since reviewboard is so 
powerful :) that it does not allow my diff to be based on a patch that has not 
been checked in. 

Eventually the patch here should be committed after 2001 and 2321. I will make 
another patch after they get checked in. 

Both HBase-2001 and the dynamic RPC stuff are quite big patches. The total 
number of lines is more than 7k. I went back and forth, but still don't have a 
good idea of how to split the patch to reduce the review pain. So for now 
I'm posting the whole patch for all 3 issues. Here is the list of files which 
are only related to coprocessors:

src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java
src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorException.java
src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java
src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java
src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationEndpoint.java
src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationProtocol.java
src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java


==

(Here is a brief description. Please find much more details at the 
package-info.java in the diff. I also post the package-info.html to 
https://issues.apache.org/jira/browse/HBASE-2001 as an attachment.)


Coprocessors are code that runs in-process on each region server. Regions 
contain references to the coprocessor implementation classes associated with 
them. Coprocessor classes will be loaded either from local jars on the region 
server's classpath or via the HDFS classloader.

Multiple types of coprocessors are provided to give sufficient flexibility 
for potential use cases. Right now there are:

* Coprocessor: provides region lifecycle management hooks, e.g., region 
open/close/split/flush/compact operations.
* RegionObserver: provides hooks for monitoring table operations from the 
client side, such as table get/put/scan/delete, etc.
* Endpoint: provides on-demand triggers for any arbitrary function executed at 
a region. One use case is column aggregation at the region server.

Coprocessor:
A coprocessor is required to implement the Coprocessor interface so that the 
coprocessor framework can manage it internally.

Another design goal of this interface is to provide simple features for making 
coprocessors useful, while exposing no more internal state or control actions 
of the region server than necessary and not exposing them directly. 

RegionObserver:
If the coprocessor implements the RegionObserver interface it can observe and 
mediate client actions on the region. 

Endpoint:
Coprocessor and RegionObserver provide certain hooks for injecting user code 
that runs at each region. This code will be triggered by existing HTable and 
HBaseAdmin operations at certain hook points.

Through Endpoint and the dynamic RPC protocol, you can define your own 
interface between client and region server, i.e., you can create a new 
method and specify the parameters and return types for the method. The new 
Endpoint methods can be triggered by calling the client side dynamic RPC 
functions -- HTable.exec(...). 
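
To make the observer idea concrete, here is a minimal sketch of hooks that run before and after a put and can reject it. The names (ObserverHookSketch, PutObserver, prePut, postPut, Region) are hypothetical and chosen for illustration; this is not the RegionObserver interface from the patch, whose actual signatures are in package-info.java.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ObserverHookSketch {
    /** Hypothetical observer; NOT the HBase RegionObserver signature. */
    interface PutObserver {
        /** Return false to reject the put before it is applied. */
        boolean prePut(String row, String value);

        /** Called after the put has been applied. */
        void postPut(String row, String value);
    }

    /** Stand-in for a region that runs registered observers around each put. */
    static final class Region {
        private final Map<String, String> store = new HashMap<>();
        private final List<PutObserver> observers = new ArrayList<>();

        void register(PutObserver observer) {
            observers.add(observer);
        }

        boolean put(String row, String value) {
            // Pre-hooks run in registration order; any one of them can veto.
            for (PutObserver o : observers) {
                if (!o.prePut(row, value)) {
                    return false;
                }
            }
            store.put(row, value);
            // Post-hooks only see puts that were actually applied.
            for (PutObserver o : observers) {
                o.postPut(row, value);
            }
            return true;
        }

        String get(String row) {
            return store.get(row);
        }
    }

    public static void main(String[] args) {
        Region region = new Region();
        final List<String> audit = new ArrayList<>();
        region.register(new PutObserver() {
            public boolean prePut(String row, String value) {
                return !row.isEmpty(); // reject empty row keys
            }

            public void postPut(String row, String value) {
                audit.add(row);
            }
        });
        System.out.println(region.put("row1", "v"));
        System.out.println(region.put("", "v"));
        System.out.println(audit);
    }
}
```

Registering more than one observer on the same Region gives a simple form of stacking: every pre-hook has to agree before the put is applied.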

Coprocessor loading
A customized coprocessor can be loaded in two different ways: by configuration, 
or by 

[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-11-19 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934039#action_12934039
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Andrew Purtell apurt...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/#review1961
---

Ship it!


Will commit after running unit tests and verifying all pass.

- Andrew





 Coprocessors: Colocate user code with regions
 -

 Key: HBASE-2001
 URL: https://issues.apache.org/jira/browse/HBASE-2001
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Mingjie Lai
 Fix For: 0.92.0

 Attachments: asm-transformations.pdf, HBase-2001-final.patch, 
 HBASE-2001-RegionObserver-2.patch, HBASE-2001-RegionObserver.patch, 
 HBASE-2001.patch.gz, packge-info.html, packge-info.html, packge-info.html


 Support user code that runs next to each region in a table. As regions 
 split and move, coprocessor code should automatically move also.
 Use classloader which looks on HDFS.
 Associate a list of classes to load with each table. Put this in HRI so it 
 inherits from table but can be changed on a per region basis (so then those 
 region specific changes can be inherited by daughters). 
 Not completely arbitrary code, should require implementation of an interface 
 with callbacks for:
 * Open
 * Close
 * Split
 * Compact
 * (Multi)get and scanner next()
 * (Multi)put
 * (Multi)delete
 Add method to HTableInterface for invoking coprocessor methods and retrieving 
 results.  
 Add methods in o.a.h.h.regionserver or subpackage which implement convenience 
 functions for coprocessor methods and consistent/controlled access to 
 internals: store access, threading, persistent and ephemeral state, scratch 
 storage, etc. 
 GitHub: https://github.com/trendmicro/hbase/tree/coprocessor
 Please see the latest attached package-info.html for updated description.




[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-11-17 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12933068#action_12933068
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: Andrew Purtell apurt...@apache.org


bq.  On 2010-11-15 16:51:18, stack wrote:
bq.   +1 on commit to TRUNK.  I think all below can be cleaned up on commit 
(Andrew, you going to commit?)

Stack, Yes I plan to commit the patches for HBASE-2001/HBASE-2002/HBASE-2321 
onto trunk this week. The dynamic RPC and coprocessor framework changes are 
largely independent and will go in separately to make the change history in the 
commit log more informative. We will address your comments before doing so.


- Andrew


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/#review1930
---








[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-11-15 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932285#action_12932285
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: st...@duboce.net


bq.  On 2010-10-05 23:10:58, stack wrote:
bq.   src/main/java/org/apache/hadoop/hbase/client/Action.java, line 30
bq.   http://review.cloudera.org/r/876/diff/7/?file=14158#file14158line30
bq.  
bq.   I took a look at the package-info.html.  Very nice doc.  One thought 
though was that the batch methods do not seem to be instrumented.  Are they?  
The bulk of inserts are done by multiput now.
bq.   
bq.   Maybe link to the wiki page when you say this in package-info.html 
'implement role-based access control for HBase'
bq.   
bq.   Fix this 'These code will be triggerd with existing...'
bq.   
bq.   BaseRegionObserver as the name of the class that implements BOTH 
Coprocessor and RegionObserver with sensible defaults seems off... it'd make 
sense as the name of an implementation of RegionObserver but not of both.  Is 
there a better name to give it -- even BaseRegionObserverCoprocessor?  Unless 
BaseObserver already implements Coprocessor?
bq.   
bq.   Should this also say that methods can be new also?  '...i.e., you 
can specify new passed parameters and return types for a method. '
bq.   
bq.   CommandTarget is a strange name for an host of arbitrary 
user-designed methods.  Can we come up w/ something more telling?   Notions 
that come to mind are Substrate, Platform -- i.e. stuff you build up on.
bq.   
bq.   Minor.. fix '...the actually implemention class running...'
bq.   
bq.   Fix this '...How is the client side example of calling...'
bq.   
bq.   The example is missing a bit of code that would help along its 
illustration; a few comments would help too, but this is a minor 
criticism.  Not important.  I get the gist (Folks interested in CP need to 
start with this page -- it makes grokking the code the easier).
bq.   
bq.   This page would seem to indicate CPs can be chained.  Am I reading 
that wrong?  (See 'Load from configuration')  Over in Gary's review, he was 
saying one CP per region only.
bq.   
bq.   
bq.   Usually attribute names are upper-cased.  Here we have 
'Coprocessor$1' (that $1 is intentional, right?)
bq.   
bq.   This functionality, if its working, is amazing.
bq.   
bq.   
bq.  
bq.  
bq.  Mingjie Lai wrote:
bq.  @stack:
bq.  I didn't realize you posted a comment until last week, since your 
comments here didn't get pushed to jira, nor were emails sent to d...@hbase. 
bq.  
bq.  Thanks for your comments. I will address them very soon. But before 
that I'd like to finalize the name of ``CommandTarget'':
bq.  
bq.  You said, ``CommandTarget is a strange name for an host of arbitrary 
user-designed methods.  Can we come up w/ something more telling?   Notions 
that come to mind are Substrate, Platform -- i.e. stuff you build up on.''
bq.  
bq.  Some of us suggested to use ``Endpoint'' instead of CommandTarget. Do 
you like it better? (I'm not really good at naming stuff)
bq.  
bq.  After finalizing the name, I will make the changes to both source code 
and package-info. And post a patch here.
bq.  
bq.  Thanks,
bq.  Mingjie
bq. 

I'm not good at naming either.  Endpoint seems more 'generic', less loaded 
than 'CommandTarget'.  If you fellas working with this stuff think that's a 
better name, then that's good by me.


- stack


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/#review1438
---






[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions

2010-11-15 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932292#action_12932292
 ] 

HBase Review Board commented on HBASE-2001:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/#review1930
---

Ship it!


+1 on commit to TRUNK.  I think all below can be cleaned up on commit (Andrew, 
you going to commit?)


src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
http://review.cloudera.org/r/876/#comment6139

Check in here.  Looks like tabs?  review board reporting it as whitespace.



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
http://review.cloudera.org/r/876/#comment6140

Usually in hbase code base there are spaces around operations; e.g. around 
'+'.



src/main/java/org/apache/hadoop/hbase/client/coprocessor/ExecResult.java
http://review.cloudera.org/r/876/#comment6142

Be careful.  In hbase lines are 80 characters long normally.  Fix on commit?



src/main/java/org/apache/hadoop/hbase/client/coprocessor/ExecResult.java
http://review.cloudera.org/r/876/#comment6143

I think it's ok if these lines are > 80 characters



src/main/java/org/apache/hadoop/hbase/client/coprocessor/package-info.java
http://review.cloudera.org/r/876/#comment6144

Excellent



src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java
http://review.cloudera.org/r/876/#comment6146

Lots of white space in here.


- stack








[jira] Commented: (HBASE-2002) Coprocessors: Client side support

2010-11-15 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932296#action_12932296
 ] 

HBase Review Board commented on HBASE-2002:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/816/#review1933
---

Ship it!


I did a quick pass over this. Most I'd seen already over in the Mingjie patch. 
I'm +1 getting it into TRUNK now early in the release cycle so probs. surface 
before release (You going to commit Andrew?).

- stack





 Coprocessors: Client side support
 -

 Key: HBASE-2002
 URL: https://issues.apache.org/jira/browse/HBASE-2002
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Gary Helmling
 Fix For: 0.92.0


 High-level call interface for clients. Unlike RPC, calls addressed to rows 
 or ranges of rows. Coprocessor client library resolves to actual locations. 
 Calls across multiple rows automatically split into multiple parallelized 
 RPCs.
 Generic multicall RPC facility which incorporates this and 
 multiget/multiput/multidelete and parallel scanners.
 Group and batch RPCs by region server. Track and retry outstanding RPCs. Ride 
 over region relocations. 
 Support addressing by explicit region identifier or by row key or row key 
 range. 
 Include a facility for merging results client side. 
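
 The "resolve rows to locations, then group and batch by region server" step 
 could look roughly like the sketch below. It is an illustration only: 
 RowCallRouter is a hypothetical class, row keys are strings rather than 
 byte[], and region boundaries are kept in a sorted map keyed by region start 
 key (the empty string standing for the first region).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical client-side router: maps rows to servers and batches them. */
class RowCallRouter {
    // Region start key -> hosting server. "" is the start key of the first region.
    private final NavigableMap<String, String> serverByRegionStart = new TreeMap<>();

    void addRegion(String startKey, String server) {
        serverByRegionStart.put(startKey, server);
    }

    /** The region holding a row is the one with the greatest start key <= row. */
    String serverFor(String row) {
        Map.Entry<String, String> e = serverByRegionStart.floorEntry(row);
        if (e == null) {
            throw new IllegalStateException("no region covers row " + row);
        }
        return e.getValue();
    }

    /** Groups rows into one batch per server, so each server gets a single RPC. */
    Map<String, List<String>> groupByServer(Collection<String> rows) {
        Map<String, List<String>> batches = new TreeMap<>();
        for (String row : rows) {
            batches.computeIfAbsent(serverFor(row), s -> new ArrayList<>()).add(row);
        }
        return batches;
    }
}
```

 Retrying after a region relocation would then amount to refreshing the map and 
 regrouping only the rows whose RPC failed.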




[jira] Commented: (HBASE-3235) Intermittent incrementColumnValue failure in TestHRegion

2010-11-15 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932357#action_12932357
 ] 

HBase Review Board commented on HBASE-3235:
---

Message from: Gary Helmling ghelml...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1224/
---

Review request for hbase and Ryan Rawson.


Summary
---

Fix for MemStore.upsert(KeyValue): start the kvset.tailSet() of potential KVs 
to remove at the beginning of the entries for the row/family/qualifier 
combination, ignoring timestamp. This prevents Puts from being skipped based on 
timestamp alone and masking the ICV.
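
The idea can be sketched as follows. This is not the MemStore code: Cell and 
UpsertSketch are hypothetical stand-ins, keys are strings, and the sketch 
removes the old entries before adding the new one for simplicity. What it 
illustrates is only the ordering point: entries sort newest-timestamp-first 
within a column, so the tailSet() must start from a key carrying the largest 
possible timestamp to reach every existing entry for that column.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.concurrent.ConcurrentSkipListSet;

/** Hypothetical stand-in for a KeyValue holding a counter value. */
final class Cell {
    final String row;
    final String family;
    final String qualifier;
    final long timestamp;
    final long value;

    Cell(String row, String family, String qualifier, long timestamp, long value) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
        this.value = value;
    }

    boolean sameColumn(Cell o) {
        return row.equals(o.row) && family.equals(o.family) && qualifier.equals(o.qualifier);
    }
}

class UpsertSketch {
    /** Row, family, qualifier ascending; timestamp descending (newest first). */
    static final Comparator<Cell> ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c != 0) return c;
        c = a.family.compareTo(b.family);
        if (c != 0) return c;
        c = a.qualifier.compareTo(b.qualifier);
        if (c != 0) return c;
        return Long.compare(b.timestamp, a.timestamp);
    };

    /** Replaces every existing entry for the column, whatever its timestamp. */
    static void upsert(ConcurrentSkipListSet<Cell> kvset, Cell update) {
        // Long.MAX_VALUE sorts before any real timestamp within the column,
        // so the tail set starts at the column's first entry.
        Cell firstOnColumn =
            new Cell(update.row, update.family, update.qualifier, Long.MAX_VALUE, 0L);
        Iterator<Cell> it = kvset.tailSet(firstOnColumn).iterator();
        while (it.hasNext()) {
            Cell cur = it.next();
            if (!cur.sameColumn(update)) {
                break; // walked past the column
            }
            it.remove();
        }
        kvset.add(update);
    }
}
```

Starting the tail set at the update itself would skip any entry with a newer 
timestamp, which is the masking effect described above.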


This addresses bug HBASE-3235.
http://issues.apache.org/jira/browse/HBASE-3235


Diffs
-

  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java b7409b0 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java 7640997 

Diff: http://review.cloudera.org/r/1224/diff


Testing
---

Added a new test: 
TestHRegion.testIncrementColumnValue_UpdatingInPlace_TimestampClobber() to 
recreate the existing failure condition: 1) put to a row/family/qualifier, 2) 
ICV to the same row/family/qualifier with the same timestamp.  This test fails 
consistently without the patch to MemStore.

With the patch to MemStore, the new test case consistently passes.  I also ran 
TestHRegion 15+ times and saw no more intermittent failures of 
testIncrementColumnValue_UpdatingInPlace().  Previously this was failing every 
5 or so test runs, so this seems a pretty good indication it's fixed.

I also ran through the full test suite on 0.90 and all passed except for an 
error in TestHLog...


Thanks,

Gary




 Intermittent incrementColumnValue failure in TestHRegion
 

 Key: HBASE-3235
 URL: https://issues.apache.org/jira/browse/HBASE-3235
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.0
Reporter: Gary Helmling

 I first saw this in a Hudson build, but can reproduce locally with enough 
 test runs (5-10 times):
 {noformat}
 ---
 Test set: org.apache.hadoop.hbase.regionserver.TestHRegion
 ---
 Tests run: 51, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 39.413 sec 
 <<< FAILURE!
 testIncrementColumnValue_UpdatingInPlace(org.apache.hadoop.hbase.regionserver.TestHRegion)
   Time elapsed: 0.079 sec  <<< FAILURE!
 junit.framework.AssertionFailedError: expected:<1> but was:<2>
 at junit.framework.Assert.fail(Assert.java:47)
 at junit.framework.Assert.failNotEquals(Assert.java:283)
 at junit.framework.Assert.assertEquals(Assert.java:64)
 at junit.framework.Assert.assertEquals(Assert.java:195)
 at junit.framework.Assert.assertEquals(Assert.java:201)
 at 
 org.apache.hadoop.hbase.regionserver.TestHRegion.testIncrementColumnValue_UpdatingInPlace(TestHRegion.java:1889)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 {noformat}
 Alternately, the failure can also show up in 
 testIncrementColumnValue_UpdatingInPlace_Negative():
 {noformat}
 testIncrementColumnValue_UpdatingInPlace_Negative(org.apache.hadoop.hbase.regionserver.TestHRegion)
   Time elapsed: 0.03 sec  <<< FAILURE!
 junit.framework.AssertionFailedError: expected:<2> but was:<3>
 at junit.framework.Assert.fail(Assert.java:47)
 at junit.framework.Assert.failNotEquals(Assert.java:283)
 at junit.framework.Assert.assertEquals(Assert.java:64)
 at junit.framework.Assert.assertEquals(Assert.java:130)
 at junit.framework.Assert.assertEquals(Assert.java:136)
 at
 org.apache.hadoop.hbase.regionserver.TestHRegion.assertICV(TestHRegion.java:2081)
 at
 org.apache.hadoop.hbase.regionserver.TestHRegion.testIncrementColumnValue_UpdatingInPlace_Negative(TestHRegion.java:1990)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 {noformat}




[jira] Commented: (HBASE-3232) Fix KeyOnlyFilter + Add Value Length

2010-11-14 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931962#action_12931962
 ] 

HBase Review Board commented on HBASE-3232:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1213/
---

(Updated 2010-11-14 18:28:50.196433)


Review request for hbase.


Changes
---

Because I didn't implement write/readFields for KeyOnlyFilter when I added the 
param, client -> server serialization didn't work and the default value of 
false was always used.  Fixed + added associated unit test
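
To illustrate why the missing methods caused this, here is a small sketch of 
Writable-style serialization. KeyOnlyOptions is a hypothetical class, and the 
field name lenAsVal is an assumption; only java.io is used. A field that is 
not written in write() and read back in readFields() silently keeps its 
default on the receiving side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical holder for the filter's single option. */
class KeyOnlyOptions {
    boolean lenAsVal; // defaults to false, which is what the server saw before the fix

    /** Every field that must survive the trip has to be written here... */
    void write(DataOutput out) throws IOException {
        out.writeBoolean(lenAsVal);
    }

    /** ...and read back here, in the same order. */
    void readFields(DataInput in) throws IOException {
        lenAsVal = in.readBoolean();
    }

    /** Simulates the client -> server hop through a byte array. */
    static KeyOnlyOptions roundTrip(KeyOnlyOptions sent) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            sent.write(new DataOutputStream(wire));
            KeyOnlyOptions received = new KeyOnlyOptions();
            received.readFields(
                new DataInputStream(new ByteArrayInputStream(wire.toByteArray())));
            return received;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A unit test in this style only needs to set the flag, round-trip the object, 
and check the flag on the received copy.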


Summary
---

HBASE-3211 altered filter code to mutate KeyValues. What could go wrong? Well, 
your scan could mess up because the KVHeap compare functions don't work 
properly. If we're going to soft mutate KVs in filter code, we also need to 
soft copy the KV before filtering. This was found while adding the ability to 
have KeyOnlyFilter have the option to return the Value's length. This is useful 
for grouping your reduce tasks into equal-sized blocks.
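
A rough sketch of "soft copy, then soft mutate", under stated assumptions: 
MiniKv is a hypothetical class, not KeyValue, and it uses a simplified layout 
of a 4-byte key length, a 4-byte value length, the key bytes, then the value 
bytes. The copy shares the backing array; converting it to key-only allocates 
a new array for the copy and leaves the original bytes untouched.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical miniature key/value: [keyLen:int][valLen:int][key][value]. */
class MiniKv {
    static final int HEADER = 8;

    byte[] bytes;
    int offset;
    int length;

    MiniKv(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    static MiniKv of(byte[] key, byte[] value) {
        ByteBuffer b = ByteBuffer.allocate(HEADER + key.length + value.length);
        b.putInt(key.length).putInt(value.length).put(key).put(value);
        return new MiniKv(b.array(), 0, b.capacity());
    }

    int keyLength() {
        return ByteBuffer.wrap(bytes, offset, 4).getInt();
    }

    int valueLength() {
        return ByteBuffer.wrap(bytes, offset + 4, 4).getInt();
    }

    byte[] value() {
        int start = offset + HEADER + keyLength();
        return Arrays.copyOfRange(bytes, start, start + valueLength());
    }

    /** New object, same backing array: cheap, and safe to re-point later. */
    MiniKv shallowCopy() {
        return new MiniKv(bytes, offset, length);
    }

    /**
     * Re-points this object at a fresh array holding only the key. If lenAsVal
     * is true, the new value is the original value's length as a 4-byte int.
     */
    void convertToKeyOnly(boolean lenAsVal) {
        int keyLen = keyLength();
        int newValLen = lenAsVal ? 4 : 0;
        ByteBuffer b = ByteBuffer.allocate(HEADER + keyLen + newValLen);
        b.putInt(keyLen).putInt(newValLen).put(bytes, offset + HEADER, keyLen);
        if (lenAsVal) {
            b.putInt(valueLength()); // still reads the old array at this point
        }
        bytes = b.array();
        offset = 0;
        length = b.capacity();
    }
}
```

Because the filter works on the copy, whatever still holds the original object 
(a heap comparing entries, for instance) keeps seeing the full key and value.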


This addresses bug HBASE-3232.
http://issues.apache.org/jira/browse/HBASE-3232


Diffs (updated)
-

  trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1034646 
  trunk/src/main/java/org/apache/hadoop/hbase/filter/KeyOnlyFilter.java 1034646 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 
1034646 
  trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 1034646 
  trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1034646 
  trunk/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java 1034646 

Diff: http://review.cloudera.org/r/1213/diff


Testing
---

mvn clean test


Thanks,

Nicolas




 Fix KeyOnlyFilter + Add Value Length
 

 Key: HBASE-3232
 URL: https://issues.apache.org/jira/browse/HBASE-3232
 Project: HBase
  Issue Type: Bug
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Blocker
 Fix For: 0.90.0


 HBASE-3211 altered filter code to mutate KeyValues.  What could go wrong?  
 Well, your scan could mess up because the KVHeap compare functions don't work 
 properly.  If we're going to soft mutate KVs in filter code, we also need to 
 soft copy the KV before filtering.  This was found while adding the ability 
 to have KeyOnlyFilter have the option to return the Value's length.  This is 
 useful for grouping your reduce tasks into equal-sized blocks.




[jira] Commented: (HBASE-3232) Fix KeyOnlyFilter + Add Value Length

2010-11-14 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931967#action_12931967
 ] 

HBase Review Board commented on HBASE-3232:
---

Message from: Ryan Rawson ryano...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1213/#review1922
---


looks great, i just committed it

- Ryan





 Fix KeyOnlyFilter + Add Value Length
 

 Key: HBASE-3232
 URL: https://issues.apache.org/jira/browse/HBASE-3232
 Project: HBase
  Issue Type: Bug
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Blocker
 Fix For: 0.90.0


 HBASE-3211 altered filter code to mutate KeyValues.  What could go wrong?  
 Well, your scan could mess up because the KVHeap compare functions don't work 
 properly.  If we're going to soft mutate KVs in filter code, we also need to 
 soft copy the KV before filtering.  This was found while adding the ability 
 to have KeyOnlyFilter have the option to return the Value's length.  This is 
 useful for grouping your reduce tasks into equal-sized blocks.




[jira] Commented: (HBASE-3232) Fix KeyOnlyFilter + Add Value Length

2010-11-12 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931630#action_12931630
 ] 

HBase Review Board commented on HBASE-3232:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1213/
---

Review request for hbase.


Summary
---

HBASE-3211 altered filter code to mutate KeyValues. What could go wrong? Well, 
your scan could mess up because the KVHeap compare functions don't work 
properly. If we're going to soft mutate KVs in filter code, we also need to 
soft copy the KV before filtering. This was found while adding the ability to 
have KeyOnlyFilter have the option to return the Value's length. This is useful 
for grouping your reduce tasks into equal-sized blocks.


This addresses bug HBASE-3232.
http://issues.apache.org/jira/browse/HBASE-3232


Diffs
-

  trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1034646 
  trunk/src/main/java/org/apache/hadoop/hbase/filter/KeyOnlyFilter.java 1034646 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 
1034646 
  trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 1034646 
  trunk/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java 1034646 

Diff: http://review.cloudera.org/r/1213/diff


Testing
---

mvn clean test


Thanks,

Nicolas




 Fix KeyOnlyFilter + Add Value Length
 

 Key: HBASE-3232
 URL: https://issues.apache.org/jira/browse/HBASE-3232
 Project: HBase
  Issue Type: Bug
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Blocker
 Fix For: 0.90.0


 HBASE-3211 altered filter code to mutate KeyValues.  What could go wrong?  
 Well, your scan could mess up because the KVHeap compare functions don't work 
 properly.  If we're going to soft mutate KVs in filter code, we also need to 
 soft copy the KV before filtering.  This was found while adding the ability 
 to have KeyOnlyFilter have the option to return the Value's length.  This is 
 useful for grouping your reduce tasks into equal-sized blocks.




[jira] Commented: (HBASE-3227) Edit of log messages before branching...

2010-11-11 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931095#action_12931095
 ] 

HBase Review Board commented on HBASE-3227:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1212/
---

Review request for hbase, Jean-Daniel Cryans and Jonathan Gray.


Summary
---

Removed redundancy, corrected some of the English in log messages, changed at 
least one to DEBUG.


This addresses bug hbase-3227.
http://issues.apache.org/jira/browse/hbase-3227


Diffs
-

  trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 
1033977 
  
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
 1033977 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1033979 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1033977 

Diff: http://review.cloudera.org/r/1212/diff


Testing
---


Thanks,

stack




 Edit of log messages before branching...
 

 Key: HBASE-3227
 URL: https://issues.apache.org/jira/browse/HBASE-3227
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Fix For: 0.90.0







[jira] Commented: (HBASE-3227) Edit of log messages before branching...

2010-11-11 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931096#action_12931096
 ] 

HBase Review Board commented on HBASE-3227:
---

Message from: Jean-Daniel Cryans jdcry...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1212/#review1911
---



trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
http://review.cloudera.org/r/1212/#comment6126

I still don't know what this message means :(


- Jean-Daniel





 Edit of log messages before branching...
 

 Key: HBASE-3227
 URL: https://issues.apache.org/jira/browse/HBASE-3227
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Fix For: 0.90.0







[jira] Commented: (HBASE-3227) Edit of log messages before branching...

2010-11-11 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931097#action_12931097
 ] 

HBase Review Board commented on HBASE-3227:
---

Message from: Jean-Daniel Cryans jdcry...@apache.org


bq.  On 2010-11-11 09:31:16, Jean-Daniel Cryans wrote:
bq.  

Ooops meant to say, +1


- Jean-Daniel


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1212/#review1911
---





 Edit of log messages before branching...
 

 Key: HBASE-3227
 URL: https://issues.apache.org/jira/browse/HBASE-3227
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Fix For: 0.90.0







[jira] Commented: (HBASE-3227) Edit of log messages before branching...

2010-11-11 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931106#action_12931106
 ] 

HBase Review Board commented on HBASE-3227:
---

Message from: Jonathan Gray jg...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1212/#review1913
---

Ship it!


lgtm

- Jonathan





 Edit of log messages before branching...
 

 Key: HBASE-3227
 URL: https://issues.apache.org/jira/browse/HBASE-3227
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Fix For: 0.90.0







[jira] Commented: (HBASE-3161) Provide option for Stargate to only serve GET requests

2010-11-10 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930646#action_12930646
 ] 

HBase Review Board commented on HBASE-3161:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1204/#review1885
---

Ship it!


This patch looks great to me.  Andrew or Ted, you want to take a look?  Bennett 
do all hbase tests pass locally for you?

- stack





 Provide option for Stargate to only serve GET requests
 --

 Key: HBASE-3161
 URL: https://issues.apache.org/jira/browse/HBASE-3161
 Project: HBase
  Issue Type: Improvement
  Components: rest
Affects Versions: 0.20.6
Reporter: Ted Yu

 Provide option for Stargate to only serve GET requests. Hbase health check 
 can utilize this option.
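
 One way such an option could behave, sketched with a hypothetical 
 ReadOnlyGate class. The flag and the choice of 403 for rejected requests are 
 assumptions made for illustration, not taken from the patch.

```java
/** Hypothetical gate consulted before a REST request is dispatched. */
class ReadOnlyGate {
    static final int OK = 200;
    static final int FORBIDDEN = 403;

    private final boolean readOnly;

    ReadOnlyGate(boolean readOnly) {
        this.readOnly = readOnly;
    }

    /** In read-only mode anything other than GET is refused. */
    int check(String httpMethod) {
        if (readOnly && !"GET".equals(httpMethod)) {
            return FORBIDDEN;
        }
        return OK;
    }
}
```

 A health check would then point at an instance started with the flag on, 
 knowing it cannot modify data by accident.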




[jira] Commented: (HBASE-3161) Provide option for Stargate to only serve GET requests

2010-11-10 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930662#action_12930662
 ] 

HBase Review Board commented on HBASE-3161:
---

Message from: Ted Yu ted...@yahoo.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1204/#review1886
---

Ship it!


- Ted





 Provide option for Stargate to only serve GET requests
 --

 Key: HBASE-3161
 URL: https://issues.apache.org/jira/browse/HBASE-3161
 Project: HBase
  Issue Type: Improvement
  Components: rest
Affects Versions: 0.20.6
Reporter: Ted Yu

 Provide option for Stargate to only serve GET requests. Hbase health check 
 can utilize this option.




[jira] Commented: (HBASE-3211) Key (Index) Only Fetches

2010-11-10 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930830#action_12930830
 ] 

HBase Review Board commented on HBASE-3211:
---

Message from: Kannan Muthukkaruppan kan...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1208/#review1893
---

Ship it!


Neat-O! Thanks for cranking this out so quickly.

- Kannan





 Key (Index) Only Fetches
 

 Key: HBASE-3211
 URL: https://issues.apache.org/jira/browse/HBASE-3211
 Project: HBase
  Issue Type: Improvement
Reporter: Kannan Muthukkaruppan
Assignee: Jonathan Gray

 When you retrieve data from HBase you get Key (Row+Column+Timestamp) + 
 Values. 
 It would be nice to have a mode where we only fetch the keys (i.e. the index) 
 but not the values.




[jira] Commented: (HBASE-3211) Key (Index) Only Fetches

2010-11-10 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930831#action_12930831
 ] 

HBase Review Board commented on HBASE-3211:
---

Message from: Jonathan Gray jg...@apache.org


bq.  On 2010-11-10 15:24:52, stack wrote:
bq.   Looks fine to me.  That kv copy is ugly but what else can you do?

Definitely can't modify the original buffer, so it's the only choice.

In this case, it's not a huge deal because we'll do these allocations, return 
the result, and then immediately be done with the memory and will have no 
references to it.  Should be okay on GC.

One potential optimization would be to do one big rewrite of the KVs at the end 
rather than as we go.  Instead of allocating individual byte[] for each KV, you 
could potentially do one big byte[] behind all the key-only KVs.  This gets way 
more complicated and I'm not sure it's worth it.  Was going for minimal 
approach.

In the filter unit test, I'm also going to add an additional assert on commit 
(and verify it still passes).  The test verifies the values are not the same but 
we should actually explicitly also assert that the value is 0 length.

Thanks!


- Jonathan


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1208/#review1890
---





 Key (Index) Only Fetches
 

 Key: HBASE-3211
 URL: https://issues.apache.org/jira/browse/HBASE-3211
 Project: HBase
  Issue Type: Improvement
Reporter: Kannan Muthukkaruppan
Assignee: Jonathan Gray

 When you retrieve data from HBase you get Key (Row+Column+Timestamp) + 
 Values. 
 It would be nice to have a mode where we only fetch the keys (i.e. the index) 
 but not the values.




[jira] Commented: (HBASE-3211) Key (Index) Only Fetches

2010-11-10 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930833#action_12930833
 ] 

HBase Review Board commented on HBASE-3211:
---

Message from: Jonathan Gray jg...@apache.org


bq.  On 2010-11-10 15:25:07, Ryan Rawson wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1199
bq.   http://review.cloudera.org/r/1208/diff/1/?file=17147#file17147line1199
bq.  
bq.   I'm torn here, this is making the implementation easy, but KeyValues 
have been immutable to date.  While no one shares KeyValues between threads or 
scanners, and ideally no one should, this seems dangerous.
bq.   
bq.  

It doesn't actually touch the original byte[] so does not actually 
destroy/mutate the underlying data in any way.  Agreed it's still potentially 
dangerous but that's why I've added the nice warning message in javadoc :)


- Jonathan


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1208/#review1891
---





 Key (Index) Only Fetches
 

 Key: HBASE-3211
 URL: https://issues.apache.org/jira/browse/HBASE-3211
 Project: HBase
  Issue Type: Improvement
Reporter: Kannan Muthukkaruppan
Assignee: Jonathan Gray

 When you retrieve data from HBase you get Key (Row+Column+Timestamp) + 
 Values. 
 It would be nice to have a mode where we only fetch the keys (i.e. the index) 
 but not the values.




[jira] Commented: (HBASE-3211) Key (Index) Only Fetches

2010-11-10 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930849#action_12930849
 ] 

HBase Review Board commented on HBASE-3211:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1208/#review1896
---



trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
http://review.cloudera.org/r/1208/#comment6119

Would it be more straightforward to have a ReturnCode.MODIFY, that signals 
an include but suggests that the program must call modifyKeyValue() to get the 
transformed data.  Maybe this is too much of a one-off case...


- Nicolas





 Key (Index) Only Fetches
 

 Key: HBASE-3211
 URL: https://issues.apache.org/jira/browse/HBASE-3211
 Project: HBase
  Issue Type: Improvement
Reporter: Kannan Muthukkaruppan
Assignee: Jonathan Gray
 Fix For: 0.90.0

 Attachments: HBASE-3211-v2.patch, HBASE-3211-v3.patch


 When you retrieve data from HBase you get Key (Row+Column+Timestamp) + 
 Values. 
 It would be nice to have a mode where we only fetch the keys (i.e. the index) 
 but not the values.




[jira] Commented: (HBASE-3211) Key (Index) Only Fetches

2010-11-10 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930852#action_12930852
 ] 

HBase Review Board commented on HBASE-3211:
---

Message from: Jonathan Gray jg...@apache.org


bq.  On 2010-11-10 16:01:22, Nicolas wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1199
bq.   http://review.cloudera.org/r/1208/diff/1/?file=17147#file17147line1199
bq.  
bq.   Would it be more straightforward to have a ReturnCode.MODIFY, that 
signals an include but suggests that the program must call modifyKeyValue() to 
get the transformed data.  Maybe this is too much of a one-off case...

Not sure I completely follow.  You're saying the modification would happen 
outside the filter?  No one needs to call modifyKeyValue() to get the 
transformed data, it's done in the filter.

In any case, yeah, I would not be for adding another ReturnCode just for this.


- Jonathan


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1208/#review1896
---





 Key (Index) Only Fetches
 

 Key: HBASE-3211
 URL: https://issues.apache.org/jira/browse/HBASE-3211
 Project: HBase
  Issue Type: Improvement
Reporter: Kannan Muthukkaruppan
Assignee: Jonathan Gray
 Fix For: 0.90.0

 Attachments: HBASE-3211-v2.patch, HBASE-3211-v3.patch


 When you retrieve data from HBase you get Key (Row+Column+Timestamp) + 
 Values. 
 It would be nice to have a mode where we only fetch the keys (i.e. the index) 
 but not the values.




[jira] Commented: (HBASE-3211) Key (Index) Only Fetches

2010-11-10 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930880#action_12930880
 ] 

HBase Review Board commented on HBASE-3211:
---

Message from: Nicolas nspiegelb...@facebook.com


bq.  On 2010-11-10 16:01:22, Nicolas wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1199
bq.   http://review.cloudera.org/r/1208/diff/1/?file=17147#file17147line1199
bq.  
bq.   Would it be more straightforward to have a ReturnCode.MODIFY, that 
signals an include but suggests that the program must call modifyKeyValue() to 
get the transformed data.  Maybe this is too much of a one-off case...
bq.  
bq.  Jonathan Gray wrote:
bq.  Not sure I completely follow.  You're saying the modification would 
happen outside the filter?  No one needs to call modifyKeyValue() to get the 
transformed data, it's done in the filter.
bq.  
bq.  In any case, yeah, I would not be for adding another ReturnCode just 
for this.

I suggested this alternative because users normally expect filters to do 
immutable operations on the data itself, and you're introducing side effects.  
If we stay with this paradigm, it's probably best to add a note in 
Filter.filterKeyValue() that the KeyValue may be modified.


- Nicolas


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1208/#review1896
---





 Key (Index) Only Fetches
 

 Key: HBASE-3211
 URL: https://issues.apache.org/jira/browse/HBASE-3211
 Project: HBase
  Issue Type: Improvement
Reporter: Kannan Muthukkaruppan
Assignee: Jonathan Gray
 Fix For: 0.90.0

 Attachments: HBASE-3211-v2.patch, HBASE-3211-v3.patch


 When you retrieve data from HBase you get Key (Row+Column+Timestamp) + 
 Values. 
 It would be nice to have a mode where we only fetch the keys (i.e. the index) 
 but not the values.




[jira] Commented: (HBASE-3223) Get VersionInfo for Running HBase Process

2010-11-10 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930881#action_12930881
 ] 

HBase Review Board commented on HBASE-3223:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1209/
---

Review request for hbase.


Summary
---

bin/hbase VersionInfo is a great existing utility to provide version info about 
Hbase jar files. Unfortunately, there is no way to currently get this 
information for the running process. For this jira, add an easy/quick way to 
verify the rev of the running jar.
We got recently bit internally because our running jar was a different version 
from the jar that we had recently pushed and caused havoc on our cluster. This 
problem is more important to fix now that we have rolling upgrades and will 
regularly have cluster scenarios with mixed-version RSs.
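
As an illustration of exposing build information from a live process over JMX, 
here is a sketch using only the standard javax.management API. It is not the 
code in this patch: RunningVersionInfo, the attribute names and the object 
name "sketch:type=RunningVersionInfo" are all hypothetical, and the version 
strings are passed in by the caller rather than read from the jar.

```java
import java.lang.management.ManagementFactory;
import javax.management.Attribute;
import javax.management.AttributeList;
import javax.management.AttributeNotFoundException;
import javax.management.DynamicMBean;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.ReflectionException;

/** Hypothetical read-only MBean publishing two string attributes. */
class RunningVersionInfo implements DynamicMBean {
    private final String version;
    private final String revision;

    RunningVersionInfo(String version, String revision) {
        this.version = version;
        this.revision = revision;
    }

    @Override
    public Object getAttribute(String name) throws AttributeNotFoundException {
        switch (name) {
            case "version": return version;
            case "revision": return revision;
            default: throw new AttributeNotFoundException(name);
        }
    }

    @Override
    public AttributeList getAttributes(String[] names) {
        AttributeList list = new AttributeList();
        for (String n : names) {
            try {
                list.add(new Attribute(n, getAttribute(n)));
            } catch (AttributeNotFoundException e) {
                // Unknown names are simply left out of the result.
            }
        }
        return list;
    }

    @Override
    public void setAttribute(Attribute a) throws AttributeNotFoundException {
        throw new AttributeNotFoundException("read-only: " + a.getName());
    }

    @Override
    public AttributeList setAttributes(AttributeList attrs) {
        return new AttributeList(); // nothing is writable
    }

    @Override
    public Object invoke(String action, Object[] params, String[] signature)
            throws ReflectionException {
        throw new ReflectionException(new NoSuchMethodException(action));
    }

    @Override
    public MBeanInfo getMBeanInfo() {
        MBeanAttributeInfo[] attrs = {
            new MBeanAttributeInfo("version", "java.lang.String",
                "Version of the running jar", true, false, false),
            new MBeanAttributeInfo("revision", "java.lang.String",
                "Source revision of the running jar", true, false, false)
        };
        return new MBeanInfo(getClass().getName(), "Build info of this process",
            attrs, null, null, null);
    }

    /** Registers the bean on the platform server and reads "version" back. */
    static String readBack(String version, String revision) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("sketch:type=RunningVersionInfo");
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            server.registerMBean(new RunningVersionInfo(version, revision), name);
            return (String) server.getAttribute(name, "version");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Once registered, the attributes can be read from any JMX client attached to 
the process, which is what makes a mismatch with the pushed jar visible.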


This addresses bug HBASE-3223.
http://issues.apache.org/jira/browse/HBASE-3223


Diffs
-

  trunk/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetrics.java 
1033788 
  trunk/src/main/java/org/apache/hadoop/hbase/metrics/HBaseInfo.java 
PRE-CREATION 
  trunk/src/main/java/org/apache/hadoop/hbase/metrics/MetricsMBeanBase.java 
1033788 
  trunk/src/main/java/org/apache/hadoop/hbase/metrics/MetricsString.java 
PRE-CREATION 
  
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
 1033788 

Diff: http://review.cloudera.org/r/1209/diff


Testing
---

loaded on dev cluster and verified that this was exported via JMX


Thanks,

Nicolas




 Get VersionInfo for Running HBase Process
 -

 Key: HBASE-3223
 URL: https://issues.apache.org/jira/browse/HBASE-3223
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
 Fix For: 0.90.1


 bin/hbase VersionInfo is a great existing utility to provide version info 
 about Hbase jar files.  Unfortunately, there is no way to currently get this 
 information for the running process.  For this jira, add an easy/quick way to 
 verify the rev of the running jar.
 We got recently bit internally because our running jar was a different 
 version from the jar that we had recently pushed and caused havoc on our 
 cluster.  This problem is more important to fix now that we have rolling 
 upgrades and will regularly have cluster scenarios with mixed-version RSs.




[jira] Commented: (HBASE-3223) Get VersionInfo for Running HBase Process

2010-11-10 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930885#action_12930885
 ] 

HBase Review Board commented on HBASE-3223:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1209/#review1899
---



trunk/src/main/java/org/apache/hadoop/hbase/metrics/MetricsMBeanBase.java
http://review.cloudera.org/r/1209/#comment6122

both LOG info messages can be taken out.  this was from debug


- Nicolas








[jira] Commented: (HBASE-3211) Key (Index) Only Fetches

2010-11-10 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930891#action_12930891
 ] 

HBase Review Board commented on HBASE-3211:
---

Message from: Jonathan Gray jg...@apache.org


bq.  On 2010-11-10 16:01:22, Nicolas wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1199
bq.   http://review.cloudera.org/r/1208/diff/1/?file=17147#file17147line1199
bq.  
bq.   Would it be more straightforward to have a ReturnCode.MODIFY, that 
bq.   signals an include but suggests that the program must call modifyKeyValue() to 
bq.   get the transformed data.  Maybe this is too much of a one-off case...
bq.  
bq.  Jonathan Gray wrote:
bq.  Not sure I completely follow.  You're saying the modification would 
bq.  happen outside the filter?  No one needs to call modifyKeyValue() to get the 
bq.  transformed data, it's done in the filter.
bq.  
bq.  In any case, yeah, I would not be for adding another ReturnCode just 
bq.  for this.
bq.  
bq.  Nicolas wrote:
bq.  I suggested this alternative because users normally expect filters to 
bq.  do immutable operations on the data itself, and you're introducing side 
bq.  effects.  If we stay with this paradigm, it's probably best to add a note in 
bq.  Filter.filterKeyValue() that the KeyValue may be modified.

But a user would have to knowingly use this filter, right?  And the filter's only 
purpose is to mutate the KVs.  I do agree with what you're saying at 
some level, but I'm not sure what a note in the interface would do.  Is it so that, 
when writing other filters, you would know that other filters in the chain could 
modify the KV?  How would you behave differently in that case?
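
The side-effect concern can be made concrete with a small sketch. It uses simplified stand-in types (Cell, CellFilter, KeyOnly, HasValue are all hypothetical and are not HBase's KeyValue or Filter classes) and only illustrates that, once one filter in a chain mutates the cell in place, the outcome depends on filter order.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins; these are not HBase's real KeyValue or Filter types.
final class FilterSideEffectSketch {

    /** A mutable cell: a key plus a value buffer. */
    static final class Cell {
        final String key;
        byte[] value;

        Cell(String key, byte[] value) {
            this.key = key;
            this.value = value;
        }
    }

    interface CellFilter {
        /** Returns true to include the cell. Implementations may mutate it. */
        boolean include(Cell cell);
    }

    /** Mutating filter: always includes, but strips the value in place. */
    static final class KeyOnly implements CellFilter {
        @Override public boolean include(Cell cell) {
            cell.value = new byte[0];
            return true;
        }
    }

    /** Read-only filter: includes only cells that still carry a value. */
    static final class HasValue implements CellFilter {
        @Override public boolean include(Cell cell) {
            return cell.value.length > 0;
        }
    }

    /** Runs two fresh cells through the chain and counts the survivors. */
    static int countIncluded(List<CellFilter> chain) {
        List<Cell> cells = Arrays.asList(
            new Cell("row1/cf:a/1", new byte[] {1, 2}),
            new Cell("row2/cf:a/1", new byte[] {3}));
        int included = 0;
        for (Cell cell : cells) {
            boolean keep = true;
            for (CellFilter filter : chain) {
                if (!filter.include(cell)) {
                    keep = false;
                    break;
                }
            }
            if (keep) {
                included++;
            }
        }
        return included;
    }

    public static void main(String[] args) {
        // Value check first, then strip: both cells survive.
        System.out.println(countIncluded(Arrays.asList(new HasValue(), new KeyOnly())));
        // Strip first: the value check now sees empty values and drops both.
        System.out.println(countIncluded(Arrays.asList(new KeyOnly(), new HasValue())));
    }
}
```

In this sketch a filter author who knows that an upstream filter may mutate the cell would either avoid depending on the value or require being placed earlier in the chain, which is roughly what a note in the interface documentation would prompt.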


- Jonathan


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1208/#review1896
---





 Key (Index) Only Fetches
 

 Key: HBASE-3211
 URL: https://issues.apache.org/jira/browse/HBASE-3211
 Project: HBase
  Issue Type: Improvement
Reporter: Kannan Muthukkaruppan
Assignee: Jonathan Gray
 Fix For: 0.90.0

 Attachments: HBASE-3211-v2.patch, HBASE-3211-v3.patch


 When you retrieve data from HBase you get Key (Row+Column+Timestamp) + 
 Values. 
 It would be nice to have a mode where we only fetch the keys (i.e. the index) 
 but not the values.




[jira] Commented: (HBASE-3223) Get VersionInfo for Running HBase Process

2010-11-10 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930935#action_12930935
 ] 

HBase Review Board commented on HBASE-3223:
---

Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1209/#review1905
---

Ship it!


This looks like a nice utility.

- stack









[jira] Commented: (HBASE-3112) Enable and disable of table needs a bit of loving in new master

2010-11-09 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930285#action_12930285
 ] 

HBase Review Board commented on HBASE-3112:
---

Message from: st...@duboce.net


bq.  On 2010-11-09 11:05:46, Jean-Daniel Cryans wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java, line 444
bq.   http://review.cloudera.org/r/1187/diff/5/?file=17034#file17034line444
bq.  
bq.   I gave you my comments in person. Short version, I think that those 
bq.   methods' method shouldn't change and that we should have methods clearly marked 
bq.   as async, and then do a job of educating people towards using them.
bq.  
bq.  Jean-Daniel Cryans wrote:
bq.  I meant method's behavior

Yeah, I agree with you after chatting.  Will fix (And you spotted prob. w/ way 
async was running anyways).
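
The split being agreed on here, a blocking method whose behavior stays the same plus a clearly named asynchronous one, might look like the sketch below. It is only an illustration under assumptions: AsyncAdminSketch and its method names are hypothetical, and CompletableFuture is used for brevity rather than because the HBase client uses it.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch; not the HBaseAdmin API.
final class AsyncAdminSketch {
    private volatile boolean enabled = true;

    /** Clearly marked async: returns at once with a handle to wait on. */
    CompletableFuture<Void> disableTableAsync() {
        return CompletableFuture.runAsync(() -> enabled = false);
    }

    /** Blocking variant: returns only after the table is disabled. */
    void disableTable() {
        disableTableAsync().join();
    }

    boolean isEnabled() {
        return enabled;
    }

    public static void main(String[] args) {
        AsyncAdminSketch admin = new AsyncAdminSketch();
        admin.disableTable();
        System.out.println("enabled after blocking call: " + admin.isEnabled());
    }
}
```

Callers who use the blocking method never observe a half-finished disable, while callers who opt in to the async method have to wait on the returned handle themselves.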


bq.  On 2010-11-09 11:05:46, Jean-Daniel Cryans wrote:
bq.   trunk/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java, line 135
bq.   http://review.cloudera.org/r/1187/diff/5/?file=17043#file17043line135
bq.  
bq.   Looks an awful lot like BulkDisabler

I disagree.  The overrides each differ substantially (They look similar if you 
don't look close -- smile).


- stack


---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1187/#review1866
---





 Enable and disable of table needs a bit of loving in new master
 ---

 Key: HBASE-3112
 URL: https://issues.apache.org/jira/browse/HBASE-3112
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.90.0

 Attachments: 3112-v2.txt, 3112-v3.txt, 3112.txt


 The tools are in place to do a more reliable enable/disable of tables.  Some 
 work has been done to hack in a basic enable/disable but it's not enough -- 
 see the avro/thrift tests where a disable/enable/disable switchback can 
 confuse the table state (and has been disabled until this issue is addressed).
 This issue is about finishing off enable/disable in the new master.   I think 
 we need to add to the table znode an enabling/disabling state rather than 
 have them binary, with a watcher that will stop an enable (or disable) 
 starting until the previous completes (Currently we atomically switch the 
 state though the region close/open lags -- some work in enable/disable 
 handlers helps in that they won't complete till all regions have 
 transitioned.. but it's not enough).
 Need to add tests too.
 Marking issue critical bug because loads of the questions we get on lists are 
 about enable/disable probs.
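
The four-state idea described above can be sketched as a small state machine. This is an in-memory illustration only, assuming hypothetical names (TableStateSketch, State); the issue proposes keeping the state in the table znode with a watcher, which this sketch does not attempt.

```java
import java.util.concurrent.atomic.AtomicReference;

// In-memory stand-in for the table state; the real proposal stores it in a znode.
final class TableStateSketch {

    enum State { ENABLED, DISABLING, DISABLED, ENABLING }

    private final AtomicReference<State> state =
        new AtomicReference<>(State.ENABLED);

    /** Starts a disable; refused unless the table is fully ENABLED. */
    boolean beginDisable() {
        return state.compareAndSet(State.ENABLED, State.DISABLING);
    }

    /** Called once all regions have closed. */
    boolean finishDisable() {
        return state.compareAndSet(State.DISABLING, State.DISABLED);
    }

    /** Starts an enable; refused unless the table is fully DISABLED. */
    boolean beginEnable() {
        return state.compareAndSet(State.DISABLED, State.ENABLING);
    }

    /** Called once all regions have opened. */
    boolean finishEnable() {
        return state.compareAndSet(State.ENABLING, State.ENABLED);
    }

    State current() {
        return state.get();
    }

    public static void main(String[] args) {
        TableStateSketch table = new TableStateSketch();
        System.out.println(table.beginDisable());   // disable starts
        System.out.println(table.beginEnable());    // refused: still DISABLING
        System.out.println(table.finishDisable());  // regions closed
        System.out.println(table.beginEnable());    // now allowed
    }
}
```

With only a binary flag, the second call would have flipped the state back while regions were still closing; the intermediate states make that request fail until the earlier transition has finished.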




[jira] Commented: (HBASE-3209) New Compaction Heuristic

2010-11-09 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930347#action_12930347
 ] 

HBase Review Board commented on HBASE-3209:
---

Message from: Nicolas nspiegelb...@facebook.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1192/
---

Review request for hbase.


Summary
---

We have a whole bunch of compaction awesome in our internal 0.89 branch. 
Porting this to 0.90:

1) don't unconditionally compact 4 files. have a min threshold
2) intelligently upgrade minors to majors
3) new compaction algo (derived in HBASE-2462 )
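
To illustrate what "have a min threshold" could mean in practice, here is a minimal sketch. It is an assumption, not the algorithm in the patch or in HBASE-2462: the ratio rule, the parameter names (minFiles, ratio) and the sample sizes are all hypothetical. Files are listed oldest first; leading files that are much larger than everything newer are skipped, and nothing is compacted unless enough files remain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical selection rule for illustration; not the HBASE-3209 patch.
final class CompactionSelectSketch {

    /**
     * Picks store files to compact from sizes ordered oldest to newest.
     * Returns an empty list when fewer than minFiles files qualify.
     */
    static List<Long> select(List<Long> sizesOldestFirst, int minFiles, double ratio) {
        int n = sizesOldestFirst.size();
        int start = 0;
        while (start < n) {
            long newerSum = 0;
            for (int i = start + 1; i < n; i++) {
                newerSum += sizesOldestFirst.get(i);
            }
            // Keep this file if it is not much bigger than all newer files combined.
            if (sizesOldestFirst.get(start) <= ratio * newerSum) {
                break;
            }
            start++;
        }
        List<Long> picked = new ArrayList<>(sizesOldestFirst.subList(start, n));
        if (picked.size() < minFiles) {
            return Collections.<Long>emptyList();
        }
        return picked;
    }

    public static void main(String[] args) {
        List<Long> sizes = Arrays.asList(1000L, 10L, 12L, 11L);
        System.out.println(select(sizes, 3, 1.2));
        System.out.println(select(sizes, 4, 1.2));
    }
}
```

With the sample sizes, the large oldest file is left alone and the three small files are selected when minFiles is 3; raising minFiles to 4 means no compaction runs at all.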


This addresses bug HBASE-3209.
http://issues.apache.org/jira/browse/HBASE-3209


Diffs
-

  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1033278 

Diff: http://review.cloudera.org/r/1192/diff


Testing
---

Has been running on our primary cluster for the past couple weeks.


Thanks,

Nicolas




 New Compaction Heuristic
 

 Key: HBASE-3209
 URL: https://issues.apache.org/jira/browse/HBASE-3209
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg

 We have a whole bunch of compaction awesome in our internal 0.89 branch.  
 Porting this to 0.90:
 1) don't unconditionally compact 4 files. have a min threshold
 2) intelligently upgrade minors to majors
 3) new compaction algo (derived in HBASE-2462 )




[jira] Commented: (HBASE-3168) Sanity date and time check when a region server joins the cluster

2010-11-09 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930386#action_12930386
 ] 

HBase Review Board commented on HBASE-3168:
---

Message from: Jonathan Gray jg...@apache.org

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1193/
---

Review request for hbase and stack.


Summary
---

This is the patch from Jeff Whiting.  I then did a bit of polish and slimmed 
down the unit test.

I uncovered very odd coupling of LogsCleaner being instantiated within 
ServerManager, though we don't use it there and it doesn't use SM.  So that's 
refactored out into HMaster and is started up/shut down with 
start/stopServiceThreads().

Changes from Jeff patch:
- Moved pulling maxSkew from config into constructor rather than doing it on 
each call
- Cleaned up the logging message a bit and changed from DEBUG to WARN
- HRS side, use EnvironmentEdgeManager rather than System.currentTimeMillis 
directly
- Changed the test to operate directly on ServerManager. I had to do a bit of 
refactoring of ServerManager to get this to work, and it's not something 
anyone new would have pulled the trigger on (moving stuff into another class 
instead of the weird unnecessary coupling to ServerManager).
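
The reason for reading time through an indirection rather than System.currentTimeMillis directly is testability, which the following sketch tries to show. The names (TimeSourceSketch, TimeSource, inject, reset) are hypothetical and are not claimed to match EnvironmentEdgeManager's API.

```java
// Hypothetical swappable time source, in the spirit of an environment edge.
final class TimeSourceSketch {

    interface TimeSource {
        long currentTimeMillis();
    }

    // Defaults to the system clock; tests can replace it.
    private static volatile TimeSource source = System::currentTimeMillis;

    static long now() {
        return source.currentTimeMillis();
    }

    static void inject(TimeSource replacement) {
        source = replacement;
    }

    static void reset() {
        source = System::currentTimeMillis;
    }

    /** Example of production code that reads time only through now(). */
    static String startupReport() {
        return "started at " + now();
    }

    public static void main(String[] args) {
        inject(() -> 42L);            // a test pins the clock
        System.out.println(startupReport());
        reset();                      // back to the system clock
    }
}
```

A test can pin or shift the clock on one side only, which is what a clock-skew test needs in order to simulate a region server whose time differs from the master's.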


This addresses bug HBASE-3168.
http://issues.apache.org/jira/browse/HBASE-3168


Diffs
-

  trunk/src/main/java/org/apache/hadoop/hbase/ClockOutOfSyncException.java PRE-CREATION 
  trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPCProtocolVersion.java 1033288 
  trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterRegionInterface.java 1033288 
  trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1033288 
  trunk/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 1033288 
  trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1033288 
  trunk/src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java PRE-CREATION 

Diff: http://review.cloudera.org/r/1193/diff


Testing
---

Newly added test passes.


Thanks,

Jonathan




 Sanity date and time check when a region server joins the cluster
 -

 Key: HBASE-3168
 URL: https://issues.apache.org/jira/browse/HBASE-3168
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.89.20100924
 Environment: RHEL 5.5 64bit, 1 Master 4 Region Servers
Reporter: Jeff Whiting
Assignee: Jeff Whiting
 Fix For: 0.90.0

 Attachments: HBASE-3168-trunk-v1.txt, HBASE-3168-trunk-v2.txt, 
 HBASE-3168-trunk-v3.txt, HBASE-3168-v4.patch


 Introduce a sanity check when a RS joins the cluster to make sure its clock 
 isn't too far out of sync with the rest of the cluster.  If the RS's time is 
 skewed too far, the master would prevent it from joining and the RS 
 would die and log the error. 
 Having a RS with even small differences in time can cause huge problems due 
 to how HBase stores values with timestamps.
 According to J-D in ServerManager we are already doing: 
 {code}
 HServerInfo info = new HServerInfo(serverInfo);
 checkIsDead(info.getServerName(), STARTUP);
 checkAlreadySameHostPort(info);
 recordNewServer(info, false, null);
 {code}
 And that the new check would fit in nicely there.
 JG suggests we add a ClockOutOfSync-like exception
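
 The check itself could be sketched as follows. This is a sketch under assumptions, not the patch: the class name, method signature and message text are hypothetical, the exception is unchecked here for brevity, and both timestamps are passed in explicitly so the logic can be exercised without touching a real clock.

```java
// Hypothetical sketch of a master-side skew check; not the HBASE-3168 patch.
final class ClockSkewSketch {

    /** Stand-in for a ClockOutOfSync-like exception (unchecked in this sketch). */
    static final class ClockOutOfSyncException extends RuntimeException {
        ClockOutOfSyncException(String message) {
            super(message);
        }
    }

    // Read once at construction instead of on every call.
    private final long maxSkewMillis;

    ClockSkewSketch(long maxSkewMillis) {
        this.maxSkewMillis = maxSkewMillis;
    }

    /** Rejects a joining server whose reported time is too far from the master's. */
    void checkClockSkew(String serverName, long reportedMillis, long masterMillis) {
        long skew = Math.abs(masterMillis - reportedMillis);
        if (skew > maxSkewMillis) {
            throw new ClockOutOfSyncException("Server " + serverName
                + " rejected; reported time is out of sync with master by "
                + skew + "ms (max allowed " + maxSkewMillis + "ms)");
        }
    }

    public static void main(String[] args) {
        ClockSkewSketch check = new ClockSkewSketch(30_000L);
        check.checkClockSkew("rs1", 1_000L, 20_000L);      // within tolerance
        try {
            check.checkClockSkew("rs2", 1_000L, 40_000L);  // 39s apart
        } catch (ClockOutOfSyncException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

 Using the absolute difference means a region server that is ahead of the master is rejected just like one that is behind.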



