[jira] Commented: (HBASE-3405) Allow HBaseRpcMetrics to register custom interface methods
[ https://issues.apache.org/jira/browse/HBASE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978229#action_12978229 ]

HBase Review Board commented on HBASE-3405:
---
Message from: Gary Helmling ghelml...@gmail.com
---
This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1414/
---
Review request for hbase.

Summary
---
The patch allows alternate HRegionServer subclasses (and in the future coprocessors) to access HBaseRpcMetrics in order to register additional RPC methods into the metrics registry. The changes are:
* add getRpcMetrics() to HBaseServer to allow access to the HBaseRpcMetrics instance
* add getRpcMetrics() to RegionServerServices (and HRegionServer) to allow for future coprocessor accounting
* add HBaseRpcMetrics.createMetrics(Class[], boolean) overload -- if boolean is true, the registered method names will be prefixed with the class name. This should help clarify origin for custom metrics and help prevent collisions.

This addresses bug HBASE-3405.
http://issues.apache.org/jira/browse/HBASE-3405

Diffs
---
src/main/java/org/apache/hadoop/hbase/ipc/HBaseRpcMetrics.java 19dbf2b
src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 867a059
src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d7147b5
src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java 1309f93
src/test/java/org/apache/hadoop/hbase/regionserver/TestRpcMetrics.java PRE-CREATION

Diff: http://review.cloudera.org/r/1414/diff

Testing
---
Created new org.apache.hadoop.hbase.regionserver.TestRpcMetrics test case to verify registration and incrementing of metrics from HRegionServer subclasses.

Thanks,
Gary

Allow HBaseRpcMetrics to register custom interface methods
---
Key: HBASE-3405
URL: https://issues.apache.org/jira/browse/HBASE-3405
Project: HBase
Issue Type: Improvement
Components: ipc
Reporter: Gary Helmling
Priority: Minor

Opened from comments on HBASE-2997.
James Kennedy notes: {quote} HBaseRpcMetrics is now logging a WARN message every time it encounters an unregistered RPC method. In my case I now get huge log files filled with these warnings because the hbase-trx transactional extension of HBase uses a subclass of HRegionServer that adds new interface methods. It's easy enough to tell log4j to ignore HBaseRpcMetrics output. However, it would be nice if the Server/HRegionServer HBaseRpcMetrics mechanism was more extensible so I could pass down new interfaces or grab the HBaseRpcMetrics object to add interfaces from up top... {quote} {{HBaseRpcMetrics}} already has a public method {{createMetrics(Class)}} to register method counters. We just need a way to expose the metrics class to allow the region server subclass to call it -- add a {{getMetrics()}} method to {{RpcServer}} and {{HBaseServer}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
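For illustration, the createMetrics(Class[], boolean) overload described in the review summary could behave roughly like the sketch below. This is not the HBaseRpcMetrics code from the patch: RpcMetricsSketch, TransactionalProtocolSketch, inc() and get() are hypothetical names, and using the simple class name plus a "." separator as the prefix is an assumption.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical RPC interface added by a region server subclass.
interface TransactionalProtocolSketch {
    void beginTransaction();
    void commit();
}

// Hypothetical stand-in for a metrics registry; NOT the real HBaseRpcMetrics.
class RpcMetricsSketch {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // Registers one counter per interface method. With prefixWithClass set,
    // the counter name carries the interface name (assumed format: Iface.method).
    void createMetrics(Class<?>[] interfaces, boolean prefixWithClass) {
        for (Class<?> iface : interfaces) {
            for (Method m : iface.getMethods()) {
                String name = prefixWithClass
                        ? iface.getSimpleName() + "." + m.getName()
                        : m.getName();
                counters.putIfAbsent(name, new AtomicLong());
            }
        }
    }

    // Returns false for an unregistered method; that is the case in which,
    // according to the issue, the real class logs a WARN message.
    boolean inc(String name) {
        AtomicLong counter = counters.get(name);
        if (counter == null) {
            return false;
        }
        counter.incrementAndGet();
        return true;
    }

    // Current count, or -1 if the name was never registered.
    long get(String name) {
        AtomicLong counter = counters.get(name);
        return counter == null ? -1L : counter.get();
    }
}
```

A subclass would call createMetrics once at startup with its extra interfaces. With the prefix enabled, two interfaces that both declare commit() end up with distinct counters.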
[jira] Commented: (HBASE-3400) Coprocessor Support for Generic Interfaces
[ https://issues.apache.org/jira/browse/HBASE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976355#action_12976355 ]

HBase Review Board commented on HBASE-3400:
---
Message from: ekohl...@gmail.com
---
This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1405/
---
Review request for hbase.

Summary
---
Coprocessors currently do not support generic interfaces because type erasure makes their generic parameters appear as Objects to Invocation.java. This can be overcome by writing out the parameters using their own types (rather than the type parameters), and then separately writing the class names for the type parameters. While it would be ideal to implement this in Invocation.java, some other code seems to be relying on its write order, and doing so breaks other RPC code. The modification can, however, be implemented in Exec.java instead. The included patch modifies Invocation.java's fields so that they have protected scope, and fully implements the read and write methods for Exec rather than using the parent method for the parent fields. ExecResult is also modified to accommodate generic returns in the same way.

This addresses bug HBASE-3400.
http://issues.apache.org/jira/browse/HBASE-3400

Diffs
---
src/main/java/org/apache/hadoop/hbase/client/coprocessor/Exec.java c127ea3
src/main/java/org/apache/hadoop/hbase/client/coprocessor/ExecResult.java be46cd2
src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java 9609652
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java 75f76e8

Diff: http://review.cloudera.org/r/1405/diff

Testing
---
Integration test included in patch. Demonstrates generic interface using objects, arrays, and primitives, and checks that all primitive classes work as well.

Thanks,
ekohlwey

Coprocessor Support for Generic Interfaces
---
Key: HBASE-3400
URL: https://issues.apache.org/jira/browse/HBASE-3400
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ed Kohlwey
Assignee: Ed Kohlwey
Attachments: HBASE-3400-2.patch, HBASE-3400.patch

Coprocessors currently do not support generic interfaces because type erasure makes their generic parameters appear as Objects to Invocation.java. This can be overcome by writing out the parameters using their own types (rather than the type parameters), and then separately writing the class names for the type parameters. While it would be ideal to implement this in Invocation.java, some other code seems to be relying on its write order, and doing so breaks other RPC code. The modification can, however, be implemented in Exec.java instead. The included patch modifies Invocation.java's fields so that they have protected scope, and fully implements the read and write methods for Exec rather than using the parent method for the parent fields. ExecResult is also modified to accommodate generic returns in the same way.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
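The erasure problem described in the summary can be reproduced with plain reflection. The sketch below is only an illustration, not code from the patch: GenericProtocolSketch and ErasureSketch are hypothetical names. It shows that the declared parameter type of a generic method is Object, while the runtime class of the actual argument is still available and could be written out alongside the value.

```java
import java.lang.reflect.Method;

// Hypothetical generic protocol interface, standing in for a coprocessor endpoint.
interface GenericProtocolSketch<T> {
    T echo(T value);
}

class ErasureSketch {
    // The parameter type that reflection reports for echo(T): after erasure,
    // T has become Object.
    static Class<?> declaredParamType() {
        try {
            Method m = GenericProtocolSketch.class.getMethod("echo", Object.class);
            return m.getParameterTypes()[0];
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // The class of the argument actually passed, which a serializer could
    // record separately so the receiver knows how to read the value back.
    static Class<?> runtimeParamType(Object argument) {
        return argument.getClass();
    }
}
```

Here ErasureSketch.declaredParamType() yields Object.class, whereas ErasureSketch.runtimeParamType("row") yields String.class; the second is the kind of information the summary proposes to serialize separately.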
[jira] Commented: (HBASE-3400) Coprocessor Support for Generic Interfaces
[ https://issues.apache.org/jira/browse/HBASE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976376#action_12976376 ]

HBase Review Board commented on HBASE-3400:
---
Message from: ekohl...@gmail.com
---
This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1405/
---
(Updated 2010-12-31 17:45:56.197941)

Review request for hbase.

Changes
---
Added classes that were missing for tests.

Summary
---
Coprocessors currently do not support generic interfaces because type erasure makes their generic parameters appear as Objects to Invocation.java. This can be overcome by writing out the parameters using their own types (rather than the type parameters), and then separately writing the class names for the type parameters. While it would be ideal to implement this in Invocation.java, some other code seems to be relying on its write order, and doing so breaks other RPC code. The modification can, however, be implemented in Exec.java instead. The included patch modifies Invocation.java's fields so that they have protected scope, and fully implements the read and write methods for Exec rather than using the parent method for the parent fields. ExecResult is also modified to accommodate generic returns in the same way.

This addresses bug HBASE-3400.
http://issues.apache.org/jira/browse/HBASE-3400

Diffs (updated)
---
src/main/java/org/apache/hadoop/hbase/client/coprocessor/Exec.java c127ea3
src/main/java/org/apache/hadoop/hbase/client/coprocessor/ExecResult.java be46cd2
src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java 9609652
src/test/java/org/apache/hadoop/hbase/coprocessor/GenericEndpoint.java PRE-CREATION
src/test/java/org/apache/hadoop/hbase/coprocessor/GenericProtocol.java PRE-CREATION
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java 75f76e8

Diff: http://review.cloudera.org/r/1405/diff

Testing
---
Integration test included in patch. Demonstrates generic interface using objects, arrays, and primitives, and checks that all primitive classes work as well.

Thanks,
ekohlwey

Coprocessor Support for Generic Interfaces
---
Key: HBASE-3400
URL: https://issues.apache.org/jira/browse/HBASE-3400
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ed Kohlwey
Assignee: Ed Kohlwey
Attachments: HBASE-3400-2.patch, HBASE-3400-3.patch, HBASE-3400.patch

Coprocessors currently do not support generic interfaces because type erasure makes their generic parameters appear as Objects to Invocation.java. This can be overcome by writing out the parameters using their own types (rather than the type parameters), and then separately writing the class names for the type parameters. While it would be ideal to implement this in Invocation.java, some other code seems to be relying on its write order, and doing so breaks other RPC code. The modification can, however, be implemented in Exec.java instead. The included patch modifies Invocation.java's fields so that they have protected scope, and fully implements the read and write methods for Exec rather than using the parent method for the parent fields. ExecResult is also modified to accommodate generic returns in the same way.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-2312) Possible data loss when RS goes into GC pause while rolling HLog
[ https://issues.apache.org/jira/browse/HBASE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974396#action_12974396 ]

HBase Review Board commented on HBASE-2312:
---
Message from: st...@duboce.net
---
This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/396/#review2146
---
Ship it!

Below looks good. Doesn't work w/o the hadoop issues? I still need to review those?

- stack

Possible data loss when RS goes into GC pause while rolling HLog
---
Key: HBASE-2312
URL: https://issues.apache.org/jira/browse/HBASE-2312
Project: HBase
Issue Type: Bug
Components: master, regionserver
Affects Versions: 0.90.0
Reporter: Karthik Ranganathan
Assignee: Nicolas Spiegelberg
Priority: Critical
Fix For: 0.90.1

There is a rare corner case in which bad things could happen (i.e. data loss):
1) RS #1 is going to roll its HLog - it has not yet created the new one, and the old one will get no more writes
2) RS #1 enters GC Pause of Death
3) Master lists the HLog files of RS #1 that it has to split, as RS #1 is dead, and starts splitting
4) RS #1 wakes up, creates the new HLog (the previous one was rolled) and appends an edit - which is lost

The following seems like a possible solution:
1) Master detects RS #1 is dead
2) The master renames the /hbase/.logs/regionserver name directory to something else (say /hbase/.logs/regionserver name-dead)
3) Add mkdir support (as opposed to mkdirs) to HDFS - so that a file create fails if the directory doesn't exist. Dhruba tells me this is very doable.
4) RS #1 comes back up and is not able to create the new hlog. It restarts itself.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-2312) Possible data loss when RS goes into GC pause while rolling HLog
[ https://issues.apache.org/jira/browse/HBASE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974029#action_12974029 ]

HBase Review Board commented on HBASE-2312:
---
Message from: Nicolas nspiegelb...@facebook.com
---
This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/396/
---
(Updated 2010-12-21 19:06:32.166768)

Review request for hbase.

Changes
---
Version for 0.90. This version utilizes a new HDFS patch to forcibly recover a file lease (forthcoming). TestZooKeeper will fail without this patch because it otherwise needs to wait until the soft lease expires.

Summary
---
There is a rare corner case in which bad things could happen (i.e. data loss):
1) RS #1 is going to roll its HLog - it has not yet created the new one, and the old one will get no more writes
2) RS #1 enters GC Pause of Death
3) Master lists the HLog files of RS #1 that it has to split, as RS #1 is dead, and starts splitting
4) RS #1 wakes up, creates the new HLog (the previous one was rolled) and appends an edit - which is lost

Note that this fix requires a healthy dose of HDFS prerequisites: HDFS-617, HADOOP-6840, HADOOP-6886. I encourage you to review those as well, give feedback, and hopefully give +1s so we can push the changes through.

This addresses bug HBASE-2312.
http://issues.apache.org/jira/browse/HBASE-2312

Diffs (updated)
---
trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java 1051398
trunk/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java 1051398
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 1051398
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java 1051398
trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java 1051398
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java 1051398

Diff: http://review.cloudera.org/r/396/diff

Testing
---
mvn test;
bin/start-hbase.sh
bin/hbase shell
scan '.META.', get, put, etc

Thanks,
Nicolas

Possible data loss when RS goes into GC pause while rolling HLog
---
Key: HBASE-2312
URL: https://issues.apache.org/jira/browse/HBASE-2312
Project: HBase
Issue Type: Bug
Components: master, regionserver
Affects Versions: 0.90.0
Reporter: Karthik Ranganathan
Assignee: Nicolas Spiegelberg
Priority: Critical
Fix For: 0.90.1

There is a rare corner case in which bad things could happen (i.e. data loss):
1) RS #1 is going to roll its HLog - it has not yet created the new one, and the old one will get no more writes
2) RS #1 enters GC Pause of Death
3) Master lists the HLog files of RS #1 that it has to split, as RS #1 is dead, and starts splitting
4) RS #1 wakes up, creates the new HLog (the previous one was rolled) and appends an edit - which is lost

The following seems like a possible solution:
1) Master detects RS #1 is dead
2) The master renames the /hbase/.logs/regionserver name directory to something else (say /hbase/.logs/regionserver name-dead)
3) Add mkdir support (as opposed to mkdirs) to HDFS - so that a file create fails if the directory doesn't exist. Dhruba tells me this is very doable.
4) RS #1 comes back up and is not able to create the new hlog. It restarts itself.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
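The rename-based fencing proposed in the issue description can be tried out on a local filesystem. The sketch below is an analogy only: it uses java.nio.file instead of the HDFS API, LogFencingSketch and the file names are hypothetical, and it illustrates the proposal from the issue text rather than the lease-recovery approach that the patch's change note mentions. The point is that a non-recursive create fails once the parent directory has been renamed away.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

class LogFencingSketch {
    // Returns true if the "region server" could still create a new log file
    // after the "master" renamed its log directory; the proposal relies on
    // this returning false.
    static boolean canCreateLogAfterFencing() {
        try {
            Path logsRoot = Files.createTempDirectory("hbase-logs-sketch");
            Path rsDir = Files.createDirectory(logsRoot.resolve("rs1"));
            Path deadDir = logsRoot.resolve("rs1-dead");

            // Master side: fence the presumed-dead server by renaming its directory.
            Files.move(rsDir, deadDir);

            // Region server side: wakes up and tries to roll to a new log.
            // createFile does not create missing parents, so this should fail.
            boolean created;
            try {
                Files.createFile(rsDir.resolve("hlog.new"));
                created = true;
            } catch (NoSuchFileException parentMissing) {
                created = false;
            }

            // Clean up the temporary directories.
            Files.delete(deadDir);
            Files.delete(logsRoot);
            return created;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On a local filesystem the create is rejected because the parent is gone, which is the behaviour step 3 of the proposal asks HDFS to provide; a recursive create in the style of mkdirs would instead silently recreate the directory and hide the fencing.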
[jira] Commented: (HBASE-3256) Coprocessors: Coprocessor host and observer for HMaster
[ https://issues.apache.org/jira/browse/HBASE-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973452#action_12973452 ]

HBase Review Board commented on HBASE-3256:
---
Message from: Gary Helmling ghelml...@gmail.com
---
This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1321/
---
Review request for hbase, stack, Andrew Purtell, and Jonathan Gray.

Summary
---
This patch adds a new MasterObserver interface with pre/post hooks provided for operations defined in org.apache.hadoop.hbase.ipc.HMasterInterface. In order to accommodate the new MasterObserver interface, I've also refactored out common coprocessor base code, with subclasses providing for region-specific and master-specific behavior.

The new code structure is (excuse my poor ascii art):

CoprocessorEnvironment - base interface for common facilities provided to CP implementations
|
|- RegionCoprocessorEnvironment - adds access to current HRegion and RegionServerServices (for RegionObservers)
|
|- MasterCoprocessorEnvironment - adds access to MasterServerServices (for MasterObservers)

CoprocessorHost - abstract base providing core CP loading and invocation code and the base CoprocessorEnvironment implementation
|
|- RegionCoprocessorHost - provides hooks for invoking RegionObserver pre/post methods and RegionCoprocessorEnvironment implementation
|
|- MasterCoprocessorHost - provides hooks for invoking MasterObserver pre/post methods and MasterCoprocessorEnvironment implementation

Also added:
- org.apache.hadoop.hbase.coprocessor.BaseMasterObserver - stubs out full MasterObserver interface with empty methods for convenience
- org.apache.hadoop.hbase.coprocessor.TestMasterObserver - tests that MasterObserver pre/post methods are called during master operations.

In particular, please let me know if the MasterObserver method inputs and outputs are sufficient for whatever you anticipate doing with it.
It should meet our needs for security checks, but more input would be helpful. This addresses bug HBASE-3256. http://issues.apache.org/jira/browse/HBASE-3256 Diffs - src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java 1ffead0 src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java c4fa526 src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/coprocessor/MasterCoprocessorEnvironment.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/coprocessor/RegionCoprocessorEnvironment.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 97198ec src/main/java/org/apache/hadoop/hbase/master/HMaster.java 18f7787 src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/master/MasterServices.java 593254b src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java f71fea6 src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1d48131 src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationEndpoint.java 43569f1 src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java 902a60f src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java 5434d01 src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java 5f5fc9a src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java 3193abf src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java 5be8daa Diff: http://review.cloudera.org/r/1321/diff Testing 
--- Added a new test (org.apache.hadoop.hbase.coprocessor.TestMasterObserver) to cover pre/post hook invocation. All existing coprocessor tests still pass. Thanks, Gary Coprocessors: Coprocessor host and observer for HMaster --- Key: HBASE-3256 URL: https://issues.apache.org/jira/browse/HBASE-3256 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Gary Helmling Fix For: 0.92.0 Attachments: HBASE-3256_initial.patch Implement a coprocessor host for HMaster. Hook observers into administrative operations
[jira] Commented: (HBASE-3256) Coprocessors: Coprocessor host and observer for HMaster
[ https://issues.apache.org/jira/browse/HBASE-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973457#action_12973457 ]

HBase Review Board commented on HBASE-3256:
---
Message from: Andrew Purtell apurt...@apache.org
---
This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1321/#review2127
---
Ship it!

src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1321/#comment6609
If we were to allow override of assignments, it would have to happen here. If the cp calls bypass() then return immediately.

src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/1321/#comment6610
Likewise if we were to allow overriding assignment, we need a symmetrical operation here.

- Andrew

Coprocessors: Coprocessor host and observer for HMaster
---
Key: HBASE-3256
URL: https://issues.apache.org/jira/browse/HBASE-3256
Project: HBase
Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Gary Helmling
Fix For: 0.92.0
Attachments: HBASE-3256_initial.patch

Implement a coprocessor host for HMaster. Hook observers into administrative operations performed on tables: create, alter, assignment, load balance, and allow observers to modify base master behavior. Support automatic loading of coprocessor implementation. Consider refactoring the master coprocessor host and regionserver coprocessor host into a common base class.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
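The bypass behaviour suggested in this review ("if the cp calls bypass() then return immediately") might look roughly like the following. It is a minimal sketch, not the HMaster or MasterCoprocessorHost code from the patch: every class and interface name ending in Sketch is hypothetical, and regions are plain strings for brevity. Only the names bypass(), shouldBypass() and preAssign() come from the thread.

```java
import java.util.ArrayList;
import java.util.List;

class BypassSketch {
    // Hypothetical per-call environment handed to an observer.
    static class EnvSketch {
        private boolean bypass;
        void bypass() { bypass = true; }
        boolean shouldBypass() { return bypass; }
    }

    // Hypothetical observer with a single pre-hook.
    interface ObserverSketch {
        void preAssign(EnvSketch env, String region);
    }

    // Hypothetical host: runs every observer and reports whether any asked to bypass.
    static class HostSketch {
        final List<ObserverSketch> observers = new ArrayList<>();

        boolean preAssign(String region) {
            boolean bypass = false;
            for (ObserverSketch observer : observers) {
                EnvSketch env = new EnvSketch();
                observer.preAssign(env, region);
                bypass |= env.shouldBypass();
            }
            return bypass;
        }
    }

    // Hypothetical master: skips its default assignment when an observer bypasses.
    static class MasterSketch {
        final HostSketch host = new HostSketch();
        final List<String> assigned = new ArrayList<>();

        void assign(String region) {
            if (host.preAssign(region)) {
                return; // an observer took over; default assignment is skipped
            }
            assigned.add(region);
        }
    }
}
```

An observer registered as `(env, region) -> { if (region.equals("r2")) env.bypass(); }` would leave r2 out of the master's default assignment while other regions proceed normally, which is the hook a custom assignment policy would need.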
[jira] Commented: (HBASE-3256) Coprocessors: Coprocessor host and observer for HMaster
[ https://issues.apache.org/jira/browse/HBASE-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12973460#action_12973460 ] HBase Review Board commented on HBASE-3256: --- Message from: Jonathan Gray jg...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1321/#review2126 --- Ship it! great work! just a few small comments but otherwise +1 src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java http://review.cloudera.org/r/1321/#comment6607 does DEFAULT really mean REGION/REGIONSERVER? or is it both? not a big deal if it's just variable names but since it's a config param, we should nail it now before it gets out in a release. src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java http://review.cloudera.org/r/1321/#comment6608 this code might have been in other earlier patches but could there be false positives with this? it'd be silly to load FancyCoprocessor and then MyFancyCoprocessor but i guess this is to cover the package? maybe parse out the class name? src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1321/#comment6616 doesn't preBalance() return a void? it's preBalanceSwitch that returns boolean src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1321/#comment6617 and here we should get the boolean return value (and base class should return the input value) src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1321/#comment6618 would we ever want to override default assign behavior? it's feasible... might want to be future proof w/ the api? 
src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1321/#comment6619 same here - Jonathan Coprocessors: Coprocessor host and observer for HMaster --- Key: HBASE-3256 URL: https://issues.apache.org/jira/browse/HBASE-3256 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Gary Helmling Fix For: 0.92.0 Attachments: HBASE-3256_initial.patch Implement a coprocessor host for HMaster. Hook observers into administrative operations performed on tables: create, alter, assignment, load balance, and allow observers to modify base master behavior. Support automatic loading of coprocessor implementation. Consider refactoring the master coprocessor host and regionserver coprocessor host into a common base class. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3256) Coprocessors: Coprocessor host and observer for HMaster
[ https://issues.apache.org/jira/browse/HBASE-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12973477#action_12973477 ] HBase Review Board commented on HBASE-3256: --- Message from: Gary Helmling ghelml...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1321/#review2130 --- src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java http://review.cloudera.org/r/1321/#comment6621 It actually means region. That conf key is only used for the system coprocessors loaded on regions. I'll change the name (and config property). src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java http://review.cloudera.org/r/1321/#comment6622 Yes, I'm not sure what the original intent was here, obtaining the CP without the full package name? Maybe getClass().getSimpleName().equals() would be better? src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1321/#comment6623 This is the result of env.shouldBypass(), in order to allow a MasterObserver to bypass the normal balance() processing. src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1321/#comment6624 Right, that's the only way to modify the input. Will change. src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1321/#comment6628 I can add in env.shouldBypass() handling here to allow overriding. Combined with access to ServerManager through MasterServices, this should allow custom assignment policies. src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1321/#comment6629 Yes, will add in env.shouldBypass() handling here too. 
- Gary Coprocessors: Coprocessor host and observer for HMaster --- Key: HBASE-3256 URL: https://issues.apache.org/jira/browse/HBASE-3256 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Gary Helmling Fix For: 0.92.0 Attachments: HBASE-3256_initial.patch Implement a coprocessor host for HMaster. Hook observers into administrative operations performed on tables: create, alter, assignment, load balance, and allow observers to modify base master behavior. Support automatic loading of coprocessor implementation. Consider refactoring the master coprocessor host and regionserver coprocessor host into a common base class. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
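The false positive raised in the review (FancyCoprocessor versus MyFancyCoprocessor) and the getSimpleName() alternative suggested in the reply can be compared directly. The sketch below is illustrative only; NameMatchSketch and its nested classes are hypothetical and are not the CoprocessorHost.findCoprocessor() code.

```java
class NameMatchSketch {
    // Two hypothetical coprocessor classes whose names share a suffix.
    static class FancyCoprocessor {}
    static class MyFancyCoprocessor {}

    // Suffix match on the fully qualified name: tolerates a missing package,
    // but also matches any class whose name merely ends with the given text.
    static boolean matchesBySuffix(Object coprocessor, String name) {
        return coprocessor.getClass().getName().endsWith(name);
    }

    // Exact match on the simple class name: still package-agnostic, but
    // MyFancyCoprocessor no longer matches "FancyCoprocessor".
    static boolean matchesBySimpleName(Object coprocessor, String name) {
        return coprocessor.getClass().getSimpleName().equals(name);
    }
}
```

With these two helpers, matchesBySuffix(new MyFancyCoprocessor(), "FancyCoprocessor") is true, which is the false positive in question, while matchesBySimpleName rejects it and still accepts a real FancyCoprocessor instance.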
[jira] Commented: (HBASE-3256) Coprocessors: Coprocessor host and observer for HMaster
[ https://issues.apache.org/jira/browse/HBASE-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12973491#action_12973491 ] HBase Review Board commented on HBASE-3256: --- Message from: Gary Helmling ghelml...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1321/ --- (Updated 2010-12-20 22:31:38.965609) Review request for hbase, stack, Andrew Purtell, and Jonathan Gray. Changes --- Changes in response to review comments: - DEFAULT - REGION in var name and property - CoprocessorHost.findCoprocessor(): use getClass().getSimpleName().equals() instead of getClass().getName().endsWith() for fallback - add bypass handling for MasterCoprocessorHost.preAssign() and preUnassign() - use return value from MasterObserver.preBalanceSwitch() to allow modifying input Summary --- This patch adds a new MasterObserver interface with pre/post hooks provided for operations defined in org.apache.hadoop.hbase.ipc.HMasterInterface. In order to accommodate the new MasterObserver interface, I've also refactored out common coprocessor base code, with subclasses providing for region-specific and master-specific behavior. 
The new code structure is (excuse my poor ascii art):

CoprocessorEnvironment - base interface for common facilities provided to CP implementations
|
|- RegionCoprocessorEnvironment - adds access to current HRegion and RegionServerServices (for RegionObservers)
|
|- MasterCoprocessorEnvironment - adds access to MasterServerServices (for MasterObservers)

CoprocessorHost - abstract base providing core CP loading and invocation code and the base CoprocessorEnvironment implementation
|
|- RegionCoprocessorHost - provides hooks for invoking RegionObserver pre/post methods and RegionCoprocessorEnvironment implementation
|
|- MasterCoprocessorHost - provides hooks for invoking MasterObserver pre/post methods and MasterCoprocessorEnvironment implementation

Also added:
- org.apache.hadoop.hbase.coprocessor.BaseMasterObserver - stubs out full MasterObserver interface with empty methods for convenience
- org.apache.hadoop.hbase.coprocessor.TestMasterObserver - tests that MasterObserver pre/post methods are called during master operations.

In particular, please let me know if the MasterObserver method inputs and outputs are sufficient for whatever you anticipate doing with it. It should meet our needs for security checks, but more input would be helpful.

This addresses bug HBASE-3256.
http://issues.apache.org/jira/browse/HBASE-3256 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java 1ffead0 src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java c4fa526 src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/coprocessor/MasterCoprocessorEnvironment.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/coprocessor/RegionCoprocessorEnvironment.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 97198ec src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java 1b7918c src/main/java/org/apache/hadoop/hbase/master/HMaster.java 18f7787 src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/master/MasterServices.java 593254b src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java f71fea6 src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1d48131 src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java PRE-CREATION src/main/resources/hbase-default.xml f1cc4ae src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationEndpoint.java 43569f1 src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java 902a60f src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java 8eb2787 src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java 5434d01 src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java 5f5fc9a src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java 3193abf 
src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java 5be8daa Diff: http://review.cloudera.org/r/1321/diff Testing --- Added a new test
[jira] Commented: (HBASE-3362) If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible
[ https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971997#action_12971997 ] HBase Review Board commented on HBASE-3362: --- Message from: Jonathan Gray jg...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1298/#review2083 --- Ship it! a few small comments. i think the loop should change as described in my comment (busy loop w/ call to currentTimeMillis as i read it). otherwise +1, good stuff. we need some tickle util class soon :) trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java http://review.cloudera.org/r/1298/#comment6529 on this server should probably be left in comment to be clear what this is checking trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java http://review.cloudera.org/r/1298/#comment6530 We were not previously but we should probably log this condition trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java http://review.cloudera.org/r/1298/#comment6531 This is a busy wait loop? Should we add a wait/notify on something passed to the thread and w/ a timeout of the period? And then we should probably also have some kind of max timeout. Even if minutes, there could be weird cluster state where the RS misses META availability but someone else might handle it properly, so max timeout might be good? trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java http://review.cloudera.org/r/1298/#comment6533 whitespace trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java http://review.cloudera.org/r/1298/#comment6532 maybe this should be warn. i think i'd want to see it and also logging of stack trace (i don't see logging of it elsewhere) - Jonathan If .META. offline between OPENING and OPENED, then wrong server location in .META. 
is possible
Key: HBASE-3362
URL: https://issues.apache.org/jira/browse/HBASE-3362
Project: HBase
Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Critical
Fix For: 0.90.0

This is a good one. It happened to me testing OOME in split logging.
* Balancer moves region to new location, regionserver X.
* New location regionserver X successfully opens the region and then goes to update .META.
* At this point, the server carrying .META. crashes.
* Regionserver X is stuck waiting on .META. to come back online. It takes so long that master times out the region-in-transition.
* Master assigns the region elsewhere, to regionserver Y.
* It opens successfully on regionserver Y and then it also parks waiting on .META. coming online.
* .META. comes online.
* The two servers X and Y race to update .META.

I saw a case where server X's edit went in after server Y's edit, which means that lookups in .META. get the wrong server. HBCK can detect this situation. RegionServer X, when it wakes up, correctly notices that it has lost control of the region, but the damage is done -- where the damage is the .META. edit.

Chatting with Jon, he suggested that regionserver X should 'rollback' the .META. edit -- do an explicit delete of what it added. This would work, I think, but after chatting more, I'll make a fix that keeps updating the zookeeper OPENING state while the edit goes on in a separate thread. Our continuous setting of OPENING will make it so region-in-transition does not time out.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3362) If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible
[ https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972162#action_12972162 ] HBase Review Board commented on HBASE-3362: --- Message from: st...@duboce.net bq. On 2010-12-16 00:14:36, Jonathan Gray wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java, line 96 bq. http://review.cloudera.org/r/1298/diff/1/?file=18309#file18309line96 bq. bq. We were not previously but we should probably log this condition We do in the method? bq. On 2010-12-16 00:14:36, Jonathan Gray wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java, line 173 bq. http://review.cloudera.org/r/1298/diff/1/?file=18309#file18309line173 bq. bq. This is a busy wait loop? bq. bq. Should we add a wait/notify on something passed to the thread and w/ a timeout of the period? bq. bq. And then we should probably also have some kind of max timeout. Even if minutes, there could be weird cluster state where the RS misses META availability but someone else might handle it properly, so max timeout might be good? I need to add a small sleep. I'd rather do this than wait/notify. t.isAlive should be enough. Regards max timeout, I should add check if server is stopped ... and for max timeout, what you think? Ten minutes? Then abort? bq. On 2010-12-16 00:14:36, Jonathan Gray wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java, line 238 bq. http://review.cloudera.org/r/1298/diff/1/?file=18309#file18309line238 bq. bq. maybe this should be warn. i think i'd want to see it and also logging of stack trace (i don't see logging of it elsewhere) For sure. - stack --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1298/#review2083 --- If .META. offline between OPENING and OPENED, then wrong server location in .META. 
is possible
[jira] Commented: (HBASE-3362) If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible
[ https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972172#action_12972172 ] HBase Review Board commented on HBASE-3362: --- Message from: st...@duboce.net bq. On 2010-12-16 00:14:36, Jonathan Gray wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java, line 173 bq. http://review.cloudera.org/r/1298/diff/1/?file=18309#file18309line173 bq. bq. This is a busy wait loop? bq. bq. Should we add a wait/notify on something passed to the thread and w/ a timeout of the period? bq. bq. And then we should probably also have some kind of max timeout. Even if minutes, there could be weird cluster state where the RS misses META availability but someone else might handle it properly, so max timeout might be good? bq. bq. stack wrote: bq. I need to add a small sleep. I'd rather do this than wait/notify. t.isAlive should be enough. Regards max timeout, I should add check if server is stopped ... and for max timeout, what you think? Ten minutes? Then abort? bq. bq. Jonathan Gray wrote: bq. I was thinking 5 minutes. bq. bq. How long you going to sleep for? That seems like an unideal way to do this. I would prefer wait/notify and have timeout on wait be this 1/3 period, but small sleep could work. If really small, we're in busy loop again. If too big, we increase how long we have to wait. This is on critical path of every single region open. bq. bq. If we go down path of threads doing work, I don't see why we don't want to use wait/notify to let the blocked thread know when it's done. 5 minute is not enough. IIRC, it was 5 minutes before the region came back online. Let me see. I want to avoid mother thread depending on daughter thread signaling it to stop... seems redundant when I'm watching the daughter with the isAlive already. The sleep would be short. 1ms or so. Normally we'd not trip into the sleep. 
The operation will have completed before we have a chance to sleep. It'd only sleep when no progress can be made. I'll add wait/notify for you to get this patch cleared past review, np. - stack --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1298/#review2083 --- If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible
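The two waiting strategies debated in this thread (a short sleep while polling isAlive, versus wait/notify with a timeout) can be compared against a third option. The sketch below is only an illustration and is not the code from the patch: it uses Thread.join with a timeout, which blocks for at most one period but returns as soon as the worker exits, so there is neither a busy loop nor an explicit signal from the daughter thread. The class name, the method runWithTickle and its parameters are hypothetical; the tickle callback stands in for whatever re-asserts the OPENING state.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; not the code from the HBASE-3362 patch. */
final class TickleLoopSketch {

  /**
   * Runs work in its own thread and calls tickle roughly once per periodMs
   * while the work is still running. Returns true if the work finished,
   * false if maxWaitMs elapsed first or the caller was interrupted.
   */
  static boolean runWithTickle(Runnable work, Runnable tickle,
                               long periodMs, long maxWaitMs) {
    if (periodMs <= 0) {
      // join(0) would wait forever, so insist on a real period.
      throw new IllegalArgumentException("periodMs must be positive");
    }
    Thread worker = new Thread(work, "post-open-deploy-tasks");
    worker.setDaemon(true);
    worker.start();
    long start = System.nanoTime();
    while (worker.isAlive()) {
      try {
        // Timed wait: returns when the worker exits or after periodMs.
        worker.join(periodMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        worker.interrupt();
        return false;
      }
      if (!worker.isAlive()) {
        break;
      }
      long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
      if (elapsedMs >= maxWaitMs) {
        worker.interrupt(); // give up; ask the worker to stop
        return false;
      }
      tickle.run(); // e.g. re-assert the OPENING state
    }
    return true;
  }

  public static void main(String[] args) {
    AtomicInteger tickles = new AtomicInteger();
    Runnable slowWork = () -> {
      try {
        Thread.sleep(120);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    };
    boolean finished = runWithTickle(slowWork, tickles::incrementAndGet, 20, 5_000);
    System.out.println("finished=" + finished + ", tickles=" + tickles.get());
  }
}
```

In the common case the work completes before the first period expires and the tickle callback never runs, which matches the point made above that the wait should only matter when no progress can be made.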
[jira] Commented: (HBASE-3260) Coprocessors: Lifecycle management
[ https://issues.apache.org/jira/browse/HBASE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972312#action_12972312 ] HBase Review Board commented on HBASE-3260: --- Message from: Gary Helmling ghelml...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1306/ --- Review request for hbase, stack and Andrew Purtell. Summary --- This patch adds explicit start() and stop() methods for lifecycle management to the Coprocessor interface and refactors some of the Coprocessor/RegionObserver distinction, moving the region-related pre/post hooks that were previously in Coprocessor to RegionObserver. Coprocessor is now the base interface, containing only: - start() - stop() - Priority enum - State enum RegionObserver extends Coprocessor, and now contains the additional pre/post hooks, moved from Coprocessor: - pre/postOpen - pre/postClose - pre/postFlush - pre/postCompact - pre/postSplit This will allow cleaner extension in the future, to allow addition of a MasterObserver interface, for example. As shown above, I've also added a new Coprocessor.State enum consisting of the states: UNINSTALLED - INSTALLED - STARTING - ACTIVE - STOPPING - STOPPED However, the UNINSTALLED/INSTALLED distinction is not particularly useful at the moment. I'd appreciate other feedback on what's necessary here. The current handling could make do with: UNINSTALLED - STARTING - ACTIVE - STOPPING - UNINSTALLED (4 total states) However, the UNINSTALLED/INSTALLED distinction may be useful if we want to add class level initialization in the future... This addresses bug HBASE-3260. 
http://issues.apache.org/jira/browse/HBASE-3260 Diffs - src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java b81a465 src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java f022598 src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java 7ea1c5e src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 1792290 src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java f028525 src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java 3db4c36 src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java 81cb75d Diff: http://review.cloudera.org/r/1306/diff Testing --- Added tests for start() and stop() method invocation in org.apache.hadoop.hbase.coprocessor.TestCoprocessorInterface The existing TestCoprocessorEndpoint, TestCoprocessorInterface, TestRegionObserverInterface, TestRegionObserverStacking tests continue to work. I'm not seeing any new failures in the rest of the tests, but TestReplication is timing out for me, preventing all tests from executing. Thanks, Gary Coprocessors: Lifecycle management -- Key: HBASE-3260 URL: https://issues.apache.org/jira/browse/HBASE-3260 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Fix For: 0.92.0 Attachments: statechart.png Considering extending CPs to the master, we have no equivalent to pre/postOpen and pre/postClose as on the regionserver. We also should consider how to resolve dependencies and initialization ordering if loading coprocessors that depend on others. OSGi (http://en.wikipedia.org/wiki/OSGi) has a lifecycle API and is familiar to many Java programmers, so we propose to borrow its terminology and state machine. A lifecycle layer manages coprocessors as they are dynamically installed, started, stopped, updated and uninstalled. Coprocessors rely on the framework for dependency resolution and class loading. 
In turn, the framework calls up to lifecycle management methods in the coprocessor as needed. A coprocessor transitions between the below states over its lifetime:
||State||Description||
|UNINSTALLED|The coprocessor implementation is not installed. This is the default implicit state.|
|INSTALLED|The coprocessor implementation has been successfully installed|
|STARTING|A coprocessor instance is being started.|
|ACTIVE|The coprocessor instance has been successfully activated and is running.|
|STOPPING|A coprocessor instance is being stopped.|
See attached state diagram. Transitions to STOPPING will only happen as the region is being closed. If a coprocessor throws an unhandled exception, this will cause the RegionServer to close the region, stopping all coprocessor instances on it. Transitions from INSTALLED-STARTING and ACTIVE-STOPPING would go through upcall methods into the coprocessor via the CoprocessorLifecycle interface: {code:java} public interface
[jira] Commented: (HBASE-3362) If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible
[ https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972321#action_12972321 ] HBase Review Board commented on HBASE-3362: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1298/ --- (Updated 2010-12-16 17:01:04.757304) Review request for hbase and Jonathan Gray.
Changes --- I implemented Jon's suggestions and then some. Not pretty but works in my local and cluster testing.
Summary --- M src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java Removed stale comments and TODOs. Added a 'version' data member, the znode edit version, which we keep across the open process. Refactored the setting of OPENING out into a new tickleOpening method that is now used in multiple places rather than repeating code. Added new PostOpenDeployTasksThread which we run to do the postOpenDeployTasks. While it's running, we update the OPENING state if it has been running a while. This addresses bug hbase-3362. http://issues.apache.org/jira/browse/hbase-3362
Diffs (updated) - trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java 1050086 Diff: http://review.cloudera.org/r/1298/diff
Testing --- Ran it on my cluster. Seems to work as the old code did. Thanks, stack
If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible
[jira] Commented: (HBASE-3260) Coprocessors: Lifecycle management
[ https://issues.apache.org/jira/browse/HBASE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972328#action_12972328 ] HBase Review Board commented on HBASE-3260: --- Message from: Andrew Purtell apurt...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1306/#review2102 --- Ship it! src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java http://review.cloudera.org/r/1306/#comment6557 What are these arguments about? src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java http://review.cloudera.org/r/1306/#comment6558 Should be a WARN? src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java http://review.cloudera.org/r/1306/#comment6559 Should be a WARN? src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java http://review.cloudera.org/r/1306/#comment6560 Since you are committing a change set in this area, Ryan suggested no need for AtomicBoolean here, could just be plain volatile boolean. I think that's right. - Andrew Coprocessors: Lifecycle management -- Key: HBASE-3260 URL: https://issues.apache.org/jira/browse/HBASE-3260 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Fix For: 0.92.0 Attachments: statechart.png Considering extending CPs to the master, we have no equivalent to pre/postOpen and pre/postClose as on the regionserver. We also should consider how to resolve dependencies and initialization ordering if loading coprocessors that depend on others. OSGi (http://en.wikipedia.org/wiki/OSGi) has a lifecycle API and is familiar to many Java programmers, so we propose to borrow its terminology and state machine. A lifecycle layer manages coprocessors as they are dynamically installed, started, stopped, updated and uninstalled. Coprocessors rely on the framework for dependency resolution and class loading. 
In turn, the framework calls up to lifecycle management methods in the coprocessor as needed. A coprocessor transitions between the below states over its lifetime:
||State||Description||
|UNINSTALLED|The coprocessor implementation is not installed. This is the default implicit state.|
|INSTALLED|The coprocessor implementation has been successfully installed|
|STARTING|A coprocessor instance is being started.|
|ACTIVE|The coprocessor instance has been successfully activated and is running.|
|STOPPING|A coprocessor instance is being stopped.|
See attached state diagram. Transitions to STOPPING will only happen as the region is being closed. If a coprocessor throws an unhandled exception, this will cause the RegionServer to close the region, stopping all coprocessor instances on it. Transitions from INSTALLED-STARTING and ACTIVE-STOPPING would go through upcall methods into the coprocessor via the CoprocessorLifecycle interface:
{code:java}
public interface CoprocessorLifecycle {
  void start(CoprocessorEnvironment env) throws IOException;
  void stop(CoprocessorEnvironment env) throws IOException;
}
{code}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
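The state table in the issue description can be read as a small state machine. The following is a minimal sketch of that reading, not the CoprocessorHost implementation: the five state names come from the table, but the set of allowed transitions is an assumption loosely modeled on the OSGi lifecycle (the attached statechart is not reproduced in this thread), and the class and method names are hypothetical.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the lifecycle states described in HBASE-3260. */
final class LifecycleSketch {

  /** State names taken from the table in the issue description. */
  enum State { UNINSTALLED, INSTALLED, STARTING, ACTIVE, STOPPING }

  // Assumed transition edges; the real statechart may differ.
  private static final Map<State, Set<State>> ALLOWED = new EnumMap<>(State.class);
  static {
    ALLOWED.put(State.UNINSTALLED, EnumSet.of(State.INSTALLED));
    ALLOWED.put(State.INSTALLED, EnumSet.of(State.STARTING, State.UNINSTALLED));
    ALLOWED.put(State.STARTING, EnumSet.of(State.ACTIVE));
    ALLOWED.put(State.ACTIVE, EnumSet.of(State.STOPPING));
    ALLOWED.put(State.STOPPING, EnumSet.of(State.INSTALLED));
  }

  // UNINSTALLED is described as the default implicit state.
  private State state = State.UNINSTALLED;

  State state() {
    return state;
  }

  /** Moves to the next state, rejecting edges that are not in the table. */
  void transition(State next) {
    if (!ALLOWED.get(state).contains(next)) {
      throw new IllegalStateException(state + " -> " + next + " is not allowed");
    }
    state = next;
  }

  public static void main(String[] args) {
    LifecycleSketch cp = new LifecycleSketch();
    cp.transition(State.INSTALLED);
    cp.transition(State.STARTING); // the framework would call start(env) here
    cp.transition(State.ACTIVE);
    cp.transition(State.STOPPING); // the framework would call stop(env) here
    System.out.println("final state: " + cp.state());
  }
}
```

Keeping the edges in one table makes it easy to adjust if, for example, a separate STOPPED state or class-level initialization is added later, as discussed in the review request.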
[jira] Commented: (HBASE-3362) If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible
[ https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972331#action_12972331 ] HBase Review Board commented on HBASE-3362: --- Message from: Jonathan Gray jg...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1298/#review2103 --- Ship it! it's getting pretty crazy but this looks good. it's unfortunate we have all these extra node transitioning methods inside this class. this pattern of doing node transitions and tracking expected version is very common and we'll probably have more of it so we should look at doing some kind of generic abstraction for that pattern soon. +1 for commit, thanks for the changes trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java http://review.cloudera.org/r/1298/#comment6561 typo 'initalizes' but good comment trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java http://review.cloudera.org/r/1298/#comment6562 interesting thing is... we only use this progressable if we do a log replay. in that case, a region open is not really idempotent as we treat it here. outside scope of this jira but something to think about. - Jonathan If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible
[jira] Commented: (HBASE-3260) Coprocessors: Lifecycle management
[ https://issues.apache.org/jira/browse/HBASE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972336#action_12972336 ] HBase Review Board commented on HBASE-3260: --- Message from: Gary Helmling ghelml...@gmail.com bq. On 2010-12-16 17:09:44, Andrew Purtell wrote: bq. src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java, line 66 bq. http://review.cloudera.org/r/1306/diff/1/?file=18379#file18379line66 bq. bq. What are these arguments about? Those are: - String protocol - long clientVersion from org.apache.hadoop.ipc.VersionedProtocol. Will fix these up. bq. On 2010-12-16 17:09:44, Andrew Purtell wrote: bq. src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java, line 289 bq. http://review.cloudera.org/r/1306/diff/1/?file=18383#file18383line289 bq. bq. Should be a WARN? Yeah, agree. Will fix. bq. On 2010-12-16 17:09:44, Andrew Purtell wrote: bq. src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java, line 305 bq. http://review.cloudera.org/r/1306/diff/1/?file=18383#file18383line305 bq. bq. Should be a WARN? Yeah, will fix. bq. On 2010-12-16 17:09:44, Andrew Purtell wrote: bq. src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java, line 385 bq. http://review.cloudera.org/r/1306/diff/1/?file=18383#file18383line385 bq. bq. Since you are committing a change set in this area, Ryan suggested no need for AtomicBoolean here, could just be plain volatile boolean. I think that's right. Ok will change this to a volatile boolean and repost. - Gary --- This is an automatically generated e-mail. 
To reply, visit: http://review.cloudera.org/r/1306/#review2102 --- Coprocessors: Lifecycle management
[jira] Commented: (HBASE-3360) ReplicationLogCleaner is enabled by default in 0.90 -- causes NPE
[ https://issues.apache.org/jira/browse/HBASE-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971771#action_12971771 ] HBase Review Board commented on HBASE-3360: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1293/#review2074 --- Ship it! +1
/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1293/#comment6510 Don't need this (found by J-D reviewing this over my shoulder)
/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java http://review.cloudera.org/r/1293/#comment6511 Call this 'decorateMasterConfiguration' or something other than instrument.
- stack
ReplicationLogCleaner is enabled by default in 0.90 -- causes NPE
Key: HBASE-3360
URL: https://issues.apache.org/jira/browse/HBASE-3360
Project: HBase
Issue Type: Bug
Reporter: stack
Assignee: Jean-Daniel Cryans
Fix For: 0.90.0
{code}
2010-12-15 00:33:17,706 ERROR org.apache.hadoop.hbase.master.LogCleaner: Caught exception
java.lang.NullPointerException
    at org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.isLogDeletable(ReplicationLogCleaner.java:59)
    at org.apache.hadoop.hbase.master.LogCleaner.chore(LogCleaner.java:138)
    at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
    at org.apache.hadoop.hbase.master.LogCleaner.run(LogCleaner.java:165)
{code}
Assigning J-D at his request.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3362) If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible
[ https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971896#action_12971896 ] HBase Review Board commented on HBASE-3362: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1298/ --- Review request for hbase and Jonathan Gray.
Summary --- M src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java Removed stale comments and TODOs. Added a 'version' data member, the znode edit version, which we keep across the open process. Refactored the setting of OPENING out into a new tickleOpening method that is now used in multiple places rather than repeating code. Added new PostOpenDeployTasksThread which we run to do the postOpenDeployTasks. While it's running, we update the OPENING state if it has been running a while. This addresses bug hbase-3362. http://issues.apache.org/jira/browse/hbase-3362
Diffs - trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java 1049707 Diff: http://review.cloudera.org/r/1298/diff
Testing --- Ran it on my cluster. Seems to work as the old code did. Thanks, stack
If .META. offline between OPENING and OPENED, then wrong server location in .META. is possible
[jira] Commented: (HBASE-3360) ReplicationLogCleaner is enabled by default in 0.90 -- causes NPE
[ https://issues.apache.org/jira/browse/HBASE-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971517#action_12971517 ] HBase Review Board commented on HBASE-3360: --- Message from: Jean-Daniel Cryans jdcry...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1293/ --- Review request for hbase. Summary --- Patch that removes ReplicationLogCleaner from hbase-default.xml and instead injects from the Replication class. There's also some cleanup on how HConstants are used. This addresses bug HBASE-3360. http://issues.apache.org/jira/browse/HBASE-3360 Diffs - /trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java 1049375 /trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1049375 /trunk/src/main/java/org/apache/hadoop/hbase/master/LogCleaner.java 1049375 /trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java 1049375 /trunk/src/main/resources/hbase-default.xml 1049375 Diff: http://review.cloudera.org/r/1293/diff Testing --- Thanks, Jean-Daniel ReplicationLogCleaner is enabled by default in 0.90 -- causes NPE - Key: HBASE-3360 URL: https://issues.apache.org/jira/browse/HBASE-3360 Project: HBase Issue Type: Bug Reporter: stack Assignee: Jean-Daniel Cryans Fix For: 0.90.0 {code} 2010-12-15 00:33:17,706 ERROR org.apache.hadoop.hbase.master.LogCleaner: Caught exception java.lang.NullPointerException at org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.isLogDeletable(ReplicationLogCleaner.java:59) at org.apache.hadoop.hbase.master.LogCleaner.chore(LogCleaner.java:138) at org.apache.hadoop.hbase.Chore.run(Chore.java:66) at org.apache.hadoop.hbase.master.LogCleaner.run(LogCleaner.java:165) {code} Assigning J-D at his request. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
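The approach in the summary (drop the cleaner from hbase-default.xml and inject it from the Replication class) amounts to appending a class name to a comma-separated configuration value at startup. The sketch below illustrates only that idea. It uses java.util.Properties instead of the Hadoop Configuration class to stay self-contained, the property name is an assumption, and the helper is hypothetical rather than the method from the patch. A caller would presumably invoke it only when replication is enabled, which is also an assumption here.

```java
import java.util.Properties;

/** Hypothetical sketch; not the method added by the HBASE-3360 patch. */
final class LogCleanerConfigSketch {

  // Assumed property name, used here only for illustration.
  static final String PLUGINS_KEY = "hbase.master.logcleaner.plugins";

  /** Appends className to the comma-separated plugin list if it is absent. */
  static void addPlugin(Properties conf, String className) {
    String current = conf.getProperty(PLUGINS_KEY, "");
    for (String plugin : current.split(",")) {
      if (plugin.trim().equals(className)) {
        return; // already registered, keep the list free of duplicates
      }
    }
    String updated = current.isEmpty() ? className : current + "," + className;
    conf.setProperty(PLUGINS_KEY, updated);
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    // Class name as it appears in the stack trace quoted in the issue.
    addPlugin(conf,
        "org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner");
    System.out.println(conf.getProperty(PLUGINS_KEY));
  }
}
```

Registering the cleaner in code rather than in the default XML means a master without replication never instantiates it, which is the condition under which the NullPointerException above was reported.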
[jira] Commented: (HBASE-3348) Allow Observers to completely override base function
[ https://issues.apache.org/jira/browse/HBASE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971531#action_12971531 ] HBase Review Board commented on HBASE-3348: --- Message from: Andrew Purtell apurt...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1295/ --- Review request for hbase, Jonathan Gray and Mingjie Lai. Summary --- Currently an observer can act as a filter or translator but cannot stop a subsequent call down to the base method for get, put, delete, etc. This patch allows observers to 1) keep any subsequently chained observer from executing, or 2) prevent default behavior from executing. This latter option allows a preXXX hook to completely reimplement something. I also found and fixed some logic bugs in coprocessor framework integration in HRegion. I will squelch the added extraneous whitespace upon commit. This addresses bug HBASE-3348. http://issues.apache.org/jira/browse/HBASE-3348 Diffs - src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java 134ed2f src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java 654b179 src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 10dfff4 src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java c57ca0c src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java cf9cad0 src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 8248f5f src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java 345790f src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java 9ef3562 Diff: http://review.cloudera.org/r/1295/diff Testing --- All coprocessor unit tests pass. No failures of other unit tests observed that might be related to these changes. 
Thanks, Andrew Allow Observers to completely override base function Key: HBASE-3348 URL: https://issues.apache.org/jira/browse/HBASE-3348 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.0 Attachments: HBASE-3348.patch Currently an observer can act as a filter or translator but cannot stop a subsequent call down to the base method for get, put, delete, etc. This means an observer cannot completely override the base function. To deal with this we can: - Change the preXXX methods to return the same type as the postXXX methods, the same return type of the base method. - Extend {{Coprocessor.Environment}} with methods that get/set a should continue flag. The framework should check the should continue flag before calling the base method. If not, just return what was returned by the preXXX method. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
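A minimal sketch of the two behaviors the review request describes: an observer stopping later observers in the chain, and an observer preventing the default behavior. The names `Context`, `PreGetHook`, `bypass()`, `complete()` and `runGet` are hypothetical and chosen for illustration; they are not taken from the HBASE-3348 patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these types come from the HBASE-3348 patch.
class ObserverChainSketch {

  /** Per-call state an observer can set to influence the rest of the chain. */
  static final class Context {
    private boolean bypass;    // skip the default (base) behavior
    private boolean complete;  // skip any observers later in the chain

    void bypass() { bypass = true; }
    void complete() { complete = true; }
  }

  /** A pre-hook that may add results and set flags on the context. */
  interface PreGetHook {
    void preGet(Context ctx, String row, List<String> results);
  }

  /** Runs the pre-hooks in order, then the base get unless an observer bypassed it. */
  static List<String> runGet(List<PreGetHook> hooks, String row) {
    Context ctx = new Context();
    List<String> results = new ArrayList<>();
    for (PreGetHook hook : hooks) {
      hook.preGet(ctx, row, results);
      if (ctx.complete) {
        break; // remaining observers do not run
      }
    }
    if (!ctx.bypass) {
      results.add("base:" + row); // stand-in for the real region read
    }
    return results;
  }

  public static void main(String[] args) {
    // This observer fully reimplements the get and stops the chain.
    PreGetHook override = (ctx, row, results) -> {
      results.add("override:" + row);
      ctx.bypass();
      ctx.complete();
    };
    PreGetHook neverReached = (ctx, row, results) -> results.add("unreached");

    System.out.println(runGet(Arrays.asList(override, neverReached), "r1"));
    System.out.println(runGet(new ArrayList<>(), "r1"));
  }
}
```

With no observers registered the base behavior runs as before, so the flags only change the outcome when an observer sets them.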
[jira] Commented: (HBASE-3348) Allow Observers to completely override base function
[ https://issues.apache.org/jira/browse/HBASE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971534#action_12971534 ] HBase Review Board commented on HBASE-3348: --- Message from: Ryan Rawson ryano...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1295/#review2064 --- Ship it! - Ryan Allow Observers to completely override base function Key: HBASE-3348 URL: https://issues.apache.org/jira/browse/HBASE-3348 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.0 Attachments: HBASE-3348.patch Currently an observer can act as a filter or translator but cannot stop a subsequent call down to the base method for get, put, delete, etc. This means an observer cannot completely override the base function. To deal with this we can: - Change the preXXX methods to return the same type as the postXXX methods, the same return type of the base method. - Extend {{Coprocessor.Environment}} with methods that get/set a should continue flag. The framework should check the should continue flag before calling the base method. If not, just return what was returned by the preXXX method. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3348) Allow Observers to completely override base function
[ https://issues.apache.org/jira/browse/HBASE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971533#action_12971533 ] HBase Review Board commented on HBASE-3348: --- Message from: Ryan Rawson ryano...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1295/#review2063 --- src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java http://review.cloudera.org/r/1295/#comment6478 presumably a co-processor could modify the Get object to implement policy? Another consideration is replacing the Get query with an alternate query, for example we have InternalGet subclasses for additional functionality, I'm just winging this though. src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java http://review.cloudera.org/r/1295/#comment6479 unless you need CAS semantics, you can just use volatile here. We are over-using the Atomic* stuff sometimes. - Ryan Allow Observers to completely override base function Key: HBASE-3348 URL: https://issues.apache.org/jira/browse/HBASE-3348 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.0 Attachments: HBASE-3348.patch Currently an observer can act as a filter or translator but cannot stop a subsequent call down to the base method for get, put, delete, etc. This means an observer cannot completely override the base function. To deal with this we can: - Change the preXXX methods to return the same type as the postXXX methods, the same return type of the base method. - Extend {{Coprocessor.Environment}} with methods that get/set a should continue flag. The framework should check the should continue flag before calling the base method. If not, just return what was returned by the preXXX method. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
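To illustrate Ryan's second comment, here is a small sketch of when `volatile` is enough and when an `AtomicBoolean` is actually needed. The class and field names are hypothetical; the review does not show the field in question.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration; the field under review in CoprocessorHost is not shown in the e-mail.
class FlagSketch {

  // Plain visibility flag: written by one thread, read by others.
  // No read-modify-write happens, so volatile is sufficient.
  private volatile boolean stopped;

  // Claim-once flag: several threads may race and exactly one must win.
  // That is a compare-and-set, so an AtomicBoolean is warranted.
  private final AtomicBoolean claimed = new AtomicBoolean(false);

  void stop() { stopped = true; }

  boolean isStopped() { return stopped; }

  /** Returns true only for the first caller. */
  boolean tryClaim() { return claimed.compareAndSet(false, true); }

  public static void main(String[] args) {
    FlagSketch flags = new FlagSketch();
    flags.stop();
    System.out.println(flags.isStopped());
    System.out.println(flags.tryClaim());
    System.out.println(flags.tryClaim());
  }
}
```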
[jira] Commented: (HBASE-3328) Admin API: Explicit Split Points
[ https://issues.apache.org/jira/browse/HBASE-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971071#action_12971071 ] HBase Review Board commented on HBASE-3328: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1281/ --- (Updated 2010-12-13 15:08:04.932875) Review request for hbase. Changes --- refactored HRegion::forceSplit() api Summary --- Add the ability to explicitly split an existing region at a user-specified point. Currently, you can disable automated splitting and can presplit a newly-created table at explicit boundaries, but cannot explicitly bound a split of an existing region. This addresses bug HBASE-3328. http://issues.apache.org/jira/browse/HBASE-3328 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 2fba18e src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 2c109ae src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java cf9cad0 src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 26b4c10 src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 4717938 src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 8560d22 Diff: http://review.cloudera.org/r/1281/diff Testing --- - mvn test -Dtest=TestAdmin - mvn test (underway) - cluster testing Note: this was primarily cluster-tested with 0.89 master. Thanks, Nicolas Admin API: Explicit Split Points Key: HBASE-3328 URL: https://issues.apache.org/jira/browse/HBASE-3328 Project: HBase Issue Type: Improvement Components: client, ipc Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor Add the ability to explicitly split an existing region at a user-specified point. Currently, you can disable automated splitting and can presplit a newly-created table at explicit boundaries, but cannot explicitly bound a split of an existing region. 
[jira] Commented: (HBASE-3328) Admin API: Explicit Split Points
[ https://issues.apache.org/jira/browse/HBASE-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12970583#action_12970583 ] HBase Review Board commented on HBASE-3328: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1281/ --- (Updated 2010-12-11 23:21:55.350465) Review request for hbase. Changes --- I was porting this from my 0.89 diff. In 0.90, we can just directly add an HRegionInterface RPC and not worry about incrementing the HRegionInfo VERSION. Much cleaner and allows for rolling upgrades / mixed version environments. Summary --- Add the ability to explicitly split an existing region at a user-specified point. Currently, you can disable automated splitting and can presplit a newly-created table at explicit boundaries, but cannot explicitly bound a split of an existing region. This addresses bug HBASE-3328. http://issues.apache.org/jira/browse/HBASE-3328 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 2fba18e src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 2c109ae src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java cf9cad0 src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 26b4c10 src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 4717938 src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 8560d22 Diff: http://review.cloudera.org/r/1281/diff Testing --- - mvn test -Dtest=TestAdmin - mvn test (underway) - cluster testing Note: this was primarily cluster-tested with 0.89 master. Thanks, Nicolas Admin API: Explicit Split Points Key: HBASE-3328 URL: https://issues.apache.org/jira/browse/HBASE-3328 Project: HBase Issue Type: Improvement Components: client, ipc Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor Add the ability to explicitly split an existing region at a user-specified point. 
Currently, you can disable automated splitting and can presplit a newly-created table at explicit boundaries, but cannot explicitly bound a split of an existing region.
[jira] Commented: (HBASE-3328) Admin API: Explicit Split Points
[ https://issues.apache.org/jira/browse/HBASE-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969941#action_12969941 ] HBase Review Board commented on HBASE-3328: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1281/ --- Review request for hbase. Summary --- Add the ability to explicitly split an existing region at a user-specified point. Currently, you can disable automated splitting and can presplit a newly-created table at explicit boundaries, but cannot explicitly bound a split of an existing region. This addresses bug HBASE-3328. http://issues.apache.org/jira/browse/HBASE-3328 Diffs - src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 2e601e1 src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 2fba18e src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java cf9cad0 src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 26b4c10 src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 4717938 src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 8560d22 Diff: http://review.cloudera.org/r/1281/diff Testing --- - mvn test -Dtest=TestAdmin - mvn test (underway) - cluster testing Note: this was primarily cluster-tested with 0.89 master. Thanks, Nicolas Admin API: Explicit Split Points Key: HBASE-3328 URL: https://issues.apache.org/jira/browse/HBASE-3328 Project: HBase Issue Type: Improvement Components: client, ipc Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor Add the ability to explicitly split an existing region at a user-specified point. Currently, you can disable automated splitting and can presplit a newly-created table at explicit boundaries, but cannot explicitly bound a split of an existing region. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
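One thing an explicit split point implies is a check that the requested key actually falls inside the region being split. The sketch below is an assumption about what such a check could look like, not code from the patch: `isValidSplitPoint` and `compare` are hypothetical helpers, keys are compared as unsigned bytes in lexicographic order, and an empty end key is taken to mean the region is unbounded at the top.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not the validation used by the HBASE-3328 patch.
class SplitPointSketch {

  /** Unsigned lexicographic comparison of two byte arrays. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int x = a[i] & 0xff;
      int y = b[i] & 0xff;
      if (x != y) {
        return x - y;
      }
    }
    return a.length - b.length;
  }

  /**
   * True if splitPoint lies strictly after startKey and before endKey.
   * An empty endKey is treated as "no upper bound" (assumption).
   */
  static boolean isValidSplitPoint(byte[] startKey, byte[] endKey, byte[] splitPoint) {
    if (splitPoint == null || splitPoint.length == 0) {
      return false;
    }
    boolean afterStart = compare(splitPoint, startKey) > 0;
    boolean beforeEnd = endKey.length == 0 || compare(splitPoint, endKey) < 0;
    return afterStart && beforeEnd;
  }

  static byte[] bytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.println(isValidSplitPoint(bytes("a"), bytes("z"), bytes("m")));
    System.out.println(isValidSplitPoint(bytes("a"), bytes("m"), bytes("z")));
  }
}
```

Requiring the split point to be strictly greater than the start key avoids producing an empty daughter region; whether the real implementation enforces this is not stated in the thread.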
[jira] Commented: (HBASE-3328) Admin API: Explicit Split Points
[ https://issues.apache.org/jira/browse/HBASE-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969942#action_12969942 ] HBase Review Board commented on HBASE-3328: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1281/#review2056 --- src/main/java/org/apache/hadoop/hbase/HRegionInfo.java http://review.cloudera.org/r/1281/#comment6463 note that this means you should not do a rolling upgrade with this patch. - Nicolas Admin API: Explicit Split Points Key: HBASE-3328 URL: https://issues.apache.org/jira/browse/HBASE-3328 Project: HBase Issue Type: Improvement Components: client, ipc Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor Add the ability to explicitly split an existing region at a user-specified point. Currently, you can disable automated splitting and can presplit a newly-created table at explicit boundaries, but cannot explicitly bound a split of an existing region. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions
[ https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969080#action_12969080 ] HBase Review Board commented on HBASE-3305: --- Message from: Jonathan Gray jg...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1271/#review2042 --- Almost there. Some spacing only changes still in here and need to move out logic into AM method. trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1271/#comment6445 still tabbing changes here and next method signature as well trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1271/#comment6446 same as stack's original comment. this logic should be in AssignmentManager. I wouldn't reuse the method 'assignAllUserRegions' because it says all in it. A method 'assignUserRegions' which takes a list and does a bulk assign w/ round-robin would make sense . 'assignAllUserRegions' could then call it once it makes a list of regions. - Jonathan Allow round-robin distribution for table created with multiple regions -- Key: HBASE-3305 URL: https://issues.apache.org/jira/browse/HBASE-3305 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.20.6 Reporter: Ted Yu Assignee: Ted Yu Attachments: hbase-3305-array.patch, hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, hbase-3305.patch We can distribute the initial regions created for a new table in round-robin fashion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
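The method Jonathan suggests, one that takes a list of regions and assigns them in bulk with round-robin, could be sketched roughly as follows. Regions and servers are plain strings here and `assignRoundRobin` is a hypothetical name; the real AssignmentManager types and signatures are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not the AssignmentManager code from the HBASE-3305 patch.
class RoundRobinSketch {

  /** Builds a server-to-regions plan by handing out regions one server at a time. */
  static Map<String, List<String>> assignRoundRobin(List<String> regions, List<String> servers) {
    if (servers.isEmpty()) {
      throw new IllegalArgumentException("no servers available");
    }
    Map<String, List<String>> plan = new LinkedHashMap<>();
    for (String server : servers) {
      plan.put(server, new ArrayList<>());
    }
    for (int i = 0; i < regions.size(); i++) {
      String server = servers.get(i % servers.size());
      plan.get(server).add(regions.get(i));
    }
    return plan;
  }

  public static void main(String[] args) {
    List<String> regions = Arrays.asList("r0", "r1", "r2", "r3", "r4");
    List<String> servers = Arrays.asList("rs1", "rs2");
    System.out.println(assignRoundRobin(regions, servers));
  }
}
```

A caller such as `assignAllUserRegions` would then only need to build the full region list and delegate to this method, which is the split of responsibilities proposed in the review.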
[jira] Commented: (HBASE-1861) Multi-Family support for bulk upload tools (HFileOutputFormat / loadtable.rb)
[ https://issues.apache.org/jira/browse/HBASE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969086#action_12969086 ] HBase Review Board commented on HBASE-1861: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1272/ --- Review request for hbase. Summary --- Support writing to multiple column families for HFileOutputFormat. Also added a max threshold for PutSortReducer because we had some pathological row cases. This addresses bug HBASE-1861. http://issues.apache.org/jira/browse/HBASE-1861 Diffs - src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 8ccdf4d src/main/java/org/apache/hadoop/hbase/mapreduce/PutSortReducer.java 5fb3e83 src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java c5d56cc Diff: http://review.cloudera.org/r/1272/diff Testing --- mvn test -Dtest=TestHFileOutputFormat; internal MR testing. Thanks, Nicolas Multi-Family support for bulk upload tools (HFileOutputFormat / loadtable.rb) - Key: HBASE-1861 URL: https://issues.apache.org/jira/browse/HBASE-1861 Project: HBase Issue Type: Improvement Components: mapreduce Affects Versions: 0.20.0 Reporter: Jonathan Gray Assignee: Nicolas Spiegelberg Fix For: 0.92.0 Attachments: HBASE1861-incomplete.patch Add multi-family support to bulk upload tools from HBASE-48. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3308) SplitTransaction.splitStoreFiles slows splits a lot
[ https://issues.apache.org/jira/browse/HBASE-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969098#action_12969098 ] HBase Review Board commented on HBASE-3308: --- Message from: Jean-Daniel Cryans jdcry...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1273/ --- Review request for hbase. Summary --- Patch that parallelizes the splitting of the files using ThreadPoolExecutor and Futures. The code is a bit ugly, but does the job really well as shown during cluster testing (which also uncovered HBASE-3318). One new behavior this patch adds is that it's now possible to rollback a split because it took too long to split the files. I did some testing with a timeout of 5 secs on my cluster, even tho each machine did a few rollbacks the import went fine. The default is 30 seconds and isn't in hbase-default.xml as I don't think anyone would really want to change that. This addresses bug HBASE-3308. http://issues.apache.org/jira/browse/HBASE-3308 Diffs - /branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java 1043188 Diff: http://review.cloudera.org/r/1273/diff Testing --- Thanks, Jean-Daniel SplitTransaction.splitStoreFiles slows splits a lot --- Key: HBASE-3308 URL: https://issues.apache.org/jira/browse/HBASE-3308 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.0 Recently I've been seeing some slow splits in our production environment triggering timeouts, so I decided to take a closer look into the issue. According to my debugging, we spend almost all the time it takes to split on creating the reference files. Each file in my testing takes at least 300ms to create, and averages around 600ms. Since we create two references per store file, it means that a region with 4 store file can easily take up to 5 seconds to split just to create those references. 
An intuitive improvement would be to create those files in parallel, so at least it wouldn't be much slower when we're splitting a higher number of files. Stack left the following comment in the code: {noformat} // TODO: If the below were multithreaded would we complete steps in less // elapsed time? St.Ack 20100920 {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
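The approach described in the review request (a thread pool, Futures, and a timeout after which the split can be rolled back) might look roughly like the sketch below. It is an illustration under assumptions, not the patch itself: `splitStoreFiles` and `createReferences` are hypothetical names, store files are plain strings, and the pool is given an explicit upper bound as a parameter.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not the code from the HBASE-3308 patch.
class ParallelSplitSketch {

  /** Stand-in for writing the two reference files of one store file. */
  static List<String> createReferences(String storeFile) {
    return Arrays.asList(storeFile + ".bottom", storeFile + ".top");
  }

  /** Creates references for all store files in parallel, failing if it takes too long. */
  static List<String> splitStoreFiles(List<String> storeFiles, int maxThreads, long timeoutMs)
      throws IOException {
    int threads = Math.max(1, Math.min(maxThreads, storeFiles.size()));
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List<Future<List<String>>> futures = new ArrayList<>();
    for (final String storeFile : storeFiles) {
      Callable<List<String>> task = () -> createReferences(storeFile);
      futures.add(pool.submit(task));
    }
    pool.shutdown(); // accept no new tasks; submitted ones still run
    try {
      if (!pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
        pool.shutdownNow();
        throw new IOException("Took too long to split the files; the split should be rolled back");
      }
      List<String> references = new ArrayList<>();
      for (Future<List<String>> future : futures) {
        references.addAll(future.get());
      }
      return references;
    } catch (InterruptedException e) {
      pool.shutdownNow();
      Thread.currentThread().interrupt(); // preserve the interrupt for callers
      throw new IOException("Interrupted while splitting store files", e);
    } catch (ExecutionException e) {
      throw new IOException("Failed to create a reference file", e.getCause());
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(splitStoreFiles(Arrays.asList("sf1", "sf2"), 10, 30_000L));
  }
}
```

The 30 second value in `main` mirrors the default mentioned in the summary; the thread cap of 10 is only an example value.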
[jira] Commented: (HBASE-3308) SplitTransaction.splitStoreFiles slows splits a lot
[ https://issues.apache.org/jira/browse/HBASE-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969113#action_12969113 ] HBase Review Board commented on HBASE-3308: --- Message from: Jean-Daniel Cryans jdcry...@apache.org bq. On 2010-12-07 17:02:49, stack wrote: bq. /branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java, line 400 bq. http://review.cloudera.org/r/1273/diff/1/?file=17980#file17980line400 bq. bq. Why not have an upper bound? If 100 files thats 100 threads doing FS operations. I bet if you had upper bound of 10 on the executorservice, it complete faster than an unbounded executorservice? I think we are already bounded by hbase.hstore.blockingStoreFiles - Jean-Daniel --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1273/#review2043 --- SplitTransaction.splitStoreFiles slows splits a lot --- Key: HBASE-3308 URL: https://issues.apache.org/jira/browse/HBASE-3308 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.0 Recently I've been seeing some slow splits in our production environment triggering timeouts, so I decided to take a closer look into the issue. According to my debugging, we spend almost all the time it takes to split on creating the reference files. Each file in my testing takes at least 300ms to create, and averages around 600ms. Since we create two references per store file, it means that a region with 4 store file can easily take up to 5 seconds to split just to create those references. An intuitive improvement would be to create those files in parallel, so at least it wouldn't be much slower when we're splitting a higher number of files. Stack left the following comment in the code: {noformat} // TODO: If the below were multithreaded would we complete steps in less // elapsed time? St.Ack 20100920 {noformat} -- This message is automatically generated by JIRA. 
[jira] Commented: (HBASE-1861) Multi-Family support for bulk upload tools (HFileOutputFormat / loadtable.rb)
[ https://issues.apache.org/jira/browse/HBASE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969118#action_12969118 ] HBase Review Board commented on HBASE-1861: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1272/#review2044 --- Ship it! +1 Excellent. src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java http://review.cloudera.org/r/1272/#comment6448 Should this behavior be documented in method javadoc? - stack Multi-Family support for bulk upload tools (HFileOutputFormat / loadtable.rb) - Key: HBASE-1861 URL: https://issues.apache.org/jira/browse/HBASE-1861 Project: HBase Issue Type: Improvement Components: mapreduce Affects Versions: 0.20.0 Reporter: Jonathan Gray Assignee: Nicolas Spiegelberg Fix For: 0.92.0 Attachments: HBASE1861-incomplete.patch Add multi-family support to bulk upload tools from HBASE-48. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions
[ https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969148#action_12969148 ] HBase Review Board commented on HBASE-3305: --- Message from: Ted Yu ted...@yahoo.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1271/#review2048 --- trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1271/#comment6455 I wrap InterruptedException in IOException. - Ted Allow round-robin distribution for table created with multiple regions -- Key: HBASE-3305 URL: https://issues.apache.org/jira/browse/HBASE-3305 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.20.6 Reporter: Ted Yu Assignee: Ted Yu Attachments: hbase-3305-array.patch, hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, hbase-3305.patch We can distribute the initial regions created for a new table in round-robin fashion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions
[ https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969149#action_12969149 ] HBase Review Board commented on HBASE-3305: --- Message from: Ted Yu ted...@yahoo.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1271/ --- (Updated 2010-12-07 18:25:05.129171) Review request for hbase, stack and Jonathan Gray. Changes --- I used Organize Imports in Eclipse for AssignmentManager Summary --- Adopted round-robin assignment as default for regions specified when table is created. This addresses bug HBASE-3305. http://issues.apache.org/jira/browse/HBASE-3305 Diffs (updated) - trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1043216 trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1043216 trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 1043216 trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 1043216 Diff: http://review.cloudera.org/r/1271/diff Testing --- Put unit tests for this change inside TestAdmin.testCreateTableWithRegions() They passed. Thanks, Ted Allow round-robin distribution for table created with multiple regions -- Key: HBASE-3305 URL: https://issues.apache.org/jira/browse/HBASE-3305 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.20.6 Reporter: Ted Yu Assignee: Ted Yu Attachments: hbase-3305-array.patch, hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, hbase-3305.patch We can distribute the initial regions created for a new table in round-robin fashion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions
[ https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969153#action_12969153 ] HBase Review Board commented on HBASE-3305: --- Message from: Ted Yu ted...@yahoo.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1271/ --- (Updated 2010-12-07 18:28:46.368066) Review request for hbase, stack and Jonathan Gray. Changes --- Reverted movement of imports Summary --- Adopted round-robin assignment as default for regions specified when table is created. This addresses bug HBASE-3305. http://issues.apache.org/jira/browse/HBASE-3305 Diffs (updated) - trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1043216 trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1043216 trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 1043216 trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 1043216 Diff: http://review.cloudera.org/r/1271/diff Testing --- Put unit tests for this change inside TestAdmin.testCreateTableWithRegions() They passed. Thanks, Ted Allow round-robin distribution for table created with multiple regions -- Key: HBASE-3305 URL: https://issues.apache.org/jira/browse/HBASE-3305 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.20.6 Reporter: Ted Yu Assignee: Ted Yu Attachments: hbase-3305-array.patch, hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, hbase-3305.patch We can distribute the initial regions created for a new table in round-robin fashion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions
[ https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969162#action_12969162 ] HBase Review Board commented on HBASE-3305: --- Message from: Jonathan Gray jg...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1271/#review2049 --- Ship it! looks good, thanks ted! i will commit the final patch to trunk. trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java http://review.cloudera.org/r/1271/#comment6456 line is 80 chars but will fix on commit, don't worry - Jonathan Allow round-robin distribution for table created with multiple regions -- Key: HBASE-3305 URL: https://issues.apache.org/jira/browse/HBASE-3305 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.20.6 Reporter: Ted Yu Assignee: Ted Yu Attachments: hbase-3305-array.patch, hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, hbase-3305.patch We can distribute the initial regions created for a new table in round-robin fashion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3308) SplitTransaction.splitStoreFiles slows splits a lot
[ https://issues.apache.org/jira/browse/HBASE-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969190#action_12969190 ] HBase Review Board commented on HBASE-3308: --- Message from: st...@duboce.net bq. On 2010-12-07 17:02:49, stack wrote: bq. /branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java, line 400 bq. http://review.cloudera.org/r/1273/diff/1/?file=17980#file17980line400 bq. bq. Why not have an upper bound? If 100 files thats 100 threads doing FS operations. I bet if you had upper bound of 10 on the executorservice, it complete faster than an unbounded executorservice? bq. bq. Jean-Daniel Cryans wrote: bq. I think we are already bounded by hbase.hstore.blockingStoreFiles That'll do. +1 on commit. - stack --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1273/#review2043 --- SplitTransaction.splitStoreFiles slows splits a lot --- Key: HBASE-3308 URL: https://issues.apache.org/jira/browse/HBASE-3308 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.0 Recently I've been seeing some slow splits in our production environment triggering timeouts, so I decided to take a closer look into the issue. According to my debugging, we spend almost all the time it takes to split on creating the reference files. Each file in my testing takes at least 300ms to create, and averages around 600ms. Since we create two references per store file, it means that a region with 4 store file can easily take up to 5 seconds to split just to create those references. An intuitive improvement would be to create those files in parallel, so at least it wouldn't be much slower when we're splitting a higher number of files. Stack left the following comment in the code: {noformat} // TODO: If the below were multithreaded would we complete steps in less // elapsed time? 
St.Ack 20100920 {noformat}
[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions
[ https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968354#action_12968354 ] HBase Review Board commented on HBASE-3305: --- Message from: Ted Yu ted...@yahoo.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1271/ --- (Updated 2010-12-06 10:42:26.792838) Review request for hbase, stack and Jonathan Gray. Changes --- Add hbase group as reviewer Summary --- Adopted round-robin assignment as default for regions specified when table is created. This addresses bug HBASE-3305. http://issues.apache.org/jira/browse/HBASE-3305 Diffs - trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1042725 trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 1042725 trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 1042725 Diff: http://review.cloudera.org/r/1271/diff Testing --- Put unit tests for this change inside TestAdmin.testCreateTableWithRegions() They passed. Thanks, Ted Allow round-robin distribution for table created with multiple regions -- Key: HBASE-3305 URL: https://issues.apache.org/jira/browse/HBASE-3305 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.20.6 Reporter: Ted Yu Assignee: Ted Yu Attachments: hbase-3305-array.patch, hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, hbase-3305.patch We can distribute the initial regions created for a new table in round-robin fashion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions
[ https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968579#action_12968579 ] HBase Review Board commented on HBASE-3305: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1271/#review2037 --- Ship it! Looks good, Ted. Below are a few pointers, mostly on formatting, and then a few questions. Thanks for making the patch. trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1271/#comment6430 Do you need to pollute HMaster with this AssignmentManager inner class? trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1271/#comment6431 FYI, don't make these kinds of formatting changes in a patch... it's distracting, and the change you are making is against the convention used in the rest of this file. Just FYI. No biggie. trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1271/#comment6432 Yeah, maybe these lines belong inside a method that is inside AssignmentManager? What do you think, Ted? trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1271/#comment6433 What changed on this line? White space? trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java http://review.cloudera.org/r/1271/#comment6434 Convention is two spaces for tab in hbase and hadoop. This seems like something else? trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java http://review.cloudera.org/r/1271/#comment6435 FYI, tab is two spaces... we indent in multiples of two spaces. trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java http://review.cloudera.org/r/1271/#comment6437 Good. Nice test. 
- stack Allow round-robin distribution for table created with multiple regions -- Key: HBASE-3305 URL: https://issues.apache.org/jira/browse/HBASE-3305 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.20.6 Reporter: Ted Yu Assignee: Ted Yu Attachments: hbase-3305-array.patch, hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, hbase-3305.patch We can distribute the initial regions created for a new table in round-robin fashion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions
[ https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968608#action_12968608 ] HBase Review Board commented on HBASE-3305: --- Message from: Ted Yu ted...@yahoo.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1271/#review2038 --- trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1271/#comment6438 A new patch will be uploaded that reverts such changes. trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1271/#comment6439 I think you're implying rewriting AssignmentManager.assignAllUserRegions(). How about creating this method: assignAllUserRegions(List<HRegionInfo> regions). finishInitialization() would pass null to the above method to indicate that all user regions should be assigned. createTable() would pass the list of regions for the new table. This way, BulkStartupAssigner doesn't appear in HMaster. trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1271/#comment6440 Yes. I prefer a space between if and left parenthesis. I will revert anyway. - Ted Allow round-robin distribution for table created with multiple regions -- Key: HBASE-3305 URL: https://issues.apache.org/jira/browse/HBASE-3305 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.20.6 Reporter: Ted Yu Assignee: Ted Yu Attachments: hbase-3305-array.patch, hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, hbase-3305.patch We can distribute the initial regions created for a new table in round-robin fashion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
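The method shape proposed in this reply, where a null argument means "assign everything" and a non-null list restricts the assignment to a new table's regions, might be sketched as follows. Plain strings stand in for HRegionInfo, and the class name AssignSketch is hypothetical; this is not the AssignmentManager code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed signature. Strings stand in for
// HRegionInfo; the real AssignmentManager is not modelled here.
class AssignSketch {
  private final List<String> allUserRegions;

  AssignSketch(List<String> allUserRegions) {
    this.allUserRegions = new ArrayList<>(allUserRegions);
  }

  // null      -> every known user region (the startup case)
  // non-null  -> only the given regions (the create-table case)
  // Returns the regions that would be handed to the bulk assigner.
  List<String> assignAllUserRegions(List<String> regions) {
    List<String> toAssign = (regions == null) ? allUserRegions : regions;
    return new ArrayList<>(toAssign); // defensive copy for the caller
  }
}
```

A null sentinel keeps the call sites short, but it is easy to pass by accident; two separately named methods would be an equally reasonable design.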
[jira] Commented: (HBASE-3305) Allow round-robin distribution for table created with multiple regions
[ https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968610#action_12968610 ] HBase Review Board commented on HBASE-3305: --- Message from: Ted Yu ted...@yahoo.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1271/ --- (Updated 2010-12-06 23:22:29.259676) Review request for hbase, stack and Jonathan Gray. Changes --- Removes tabs. Format code using multiple of two spaces. Summary --- Adopted round-robin assignment as default for regions specified when table is created. This addresses bug HBASE-3305. http://issues.apache.org/jira/browse/HBASE-3305 Diffs (updated) - trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1042922 trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 1042922 trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 1042922 Diff: http://review.cloudera.org/r/1271/diff Testing --- Put unit tests for this change inside TestAdmin.testCreateTableWithRegions() They passed. Thanks, Ted Allow round-robin distribution for table created with multiple regions -- Key: HBASE-3305 URL: https://issues.apache.org/jira/browse/HBASE-3305 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.20.6 Reporter: Ted Yu Assignee: Ted Yu Attachments: hbase-3305-array.patch, hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, hbase-3305.patch We can distribute the initial regions created for a new table in round-robin fashion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
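For readers unfamiliar with the term, round-robin distribution of a new table's initial regions can be illustrated with the short sketch below. It is a generic example, not the LoadBalancer change in the patch: the class name and the use of plain strings for regions and servers are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Generic round-robin sketch: region i goes to server (i mod serverCount).
class RoundRobinSketch {
  static Map<String, List<String>> assign(List<String> regions, List<String> servers) {
    if (servers.isEmpty()) {
      throw new IllegalArgumentException("no servers available");
    }
    // LinkedHashMap keeps servers in the order they were given.
    Map<String, List<String>> plan = new LinkedHashMap<>();
    for (String server : servers) {
      plan.put(server, new ArrayList<>());
    }
    for (int i = 0; i < regions.size(); i++) {
      String server = servers.get(i % servers.size());
      plan.get(server).add(regions.get(i));
    }
    return plan;
  }
}
```

With five regions and two servers, the first server receives regions 0, 2 and 4 and the second receives regions 1 and 3, so per-server counts never differ by more than one.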
[jira] Commented: (HBASE-3290) Max Compaction Size
[ https://issues.apache.org/jira/browse/HBASE-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965795#action_12965795 ] HBase Review Board commented on HBASE-3290: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1263/#review2018 --- Ship it! This looks great. I love the test. There are some comments below. See what you think. I did not dig in deep on the algo but looks good. trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java http://review.cloudera.org/r/1263/#comment6361 Good. I like the way you keep around old name. FYI, there's white space on end of some of these lines of yours. trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java http://review.cloudera.org/r/1263/#comment6364 Is this right? We check all storefiles for references where before we only checked the subset of candidate compaction files for references? (Hmm.. maybe the old stuff was wrong?) trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java http://review.cloudera.org/r/1263/#comment6362 White space trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java http://review.cloudera.org/r/1263/#comment6365 Good trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java http://review.cloudera.org/r/1263/#comment6366 I don't grok this comment trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java http://review.cloudera.org/r/1263/#comment6367 So, its ok to mess w/ file order? We won't get ourselves into trouble if we don't respect the order in which files were written? We do a merge sort when we read all compaction candidates in so should be fine I suppose -- since its same as how scanner merges them.. Just asking because in old days order was important but I suppose we let go of that a while back? 
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java http://review.cloudera.org/r/1263/#comment6368 Is this a good name for this method? We're compacting a Store, not Stores, right? trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java http://review.cloudera.org/r/1263/#comment6369 Nice trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java http://review.cloudera.org/r/1263/#comment6370 Excellent! I love you mocking up StoreFiles rather than fire up minicluster FYI... loads of white space in here. - stack Max Compaction Size --- Key: HBASE-3290 URL: https://issues.apache.org/jira/browse/HBASE-3290 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor Add ability to specify a maximum storefile size for compaction. After this limit, we will not include this file in compactions. This is useful for large object stores and clusters that pre-split regions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3290) Max Compaction Size
[ https://issues.apache.org/jira/browse/HBASE-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965983#action_12965983 ] HBase Review Board commented on HBASE-3290: --- Message from: Nicolas nspiegelb...@facebook.com bq. On 2010-12-01 10:49:59, stack wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 639 bq. http://review.cloudera.org/r/1263/diff/1/?file=17923#file17923line639 bq. bq. Is this right? We check all storefiles for references where before we only checked the subset of candidate compaction files for references? bq. bq. bq. (Hmm.. maybe the old stuff was wrong?) references == split files. we currently don't support splitting split files (into quarter pieces?), so we need to ensure no files are split. bq. On 2010-12-01 10:49:59, stack wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 926 bq. http://review.cloudera.org/r/1263/diff/1/?file=17923#file17923line926 bq. bq. I don't grok this comment references == split files. The current algorithm is to split a StoreFile, then immediately use compaction after splitting to break them into 2 StoreFiles. If you don't compact reference files that are past the max threshold: 1) you won't be able to split the region again 2) you don't actually even know that the StoreFile is too large. HalfStoreFileReader.length() returns the whole StoreFile's length, not the length of the StoreFile related to your region bq. On 2010-12-01 10:49:59, stack wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 954 bq. http://review.cloudera.org/r/1263/diff/1/?file=17923#file17923line954 bq. bq. So, its ok to mess w/ file order? We won't get ourselves into trouble if we don't respect the order in which files were written? We do a merge sort when we read all compaction candidates in so should be fine I suppose -- since its same as how scanner merges them.. bq. bq. 
Just asking because in old days order was important but I suppose we let go of that a while back? so, technically, order is important for optimizations like the TimeStamp filter. However, realistically this isn't a problem because our normal skew always decreases in filesize over time. The only place where our skew doesn't decrease is for files that have been recently flushed. However, all those will be unconditionally compacted because they will be lower than hbase.hstore.compaction.min.size. The sorting is to handle an interesting issue that popped up for us during migration: we're bulk loading StoreFiles of extremely variable size (are we migrating 1k users or 10M?) and they will all appear at the end of the StoreFile list. How do we determine when it is efficient to compact them? The easiest option was to sort the compact list and handle bulk files by relative size instead of making some custom compaction selection algorithm just for bulk inclusion. It seems like any other companies that will incrementally migrate data into HBase would hit the same issue. bq. On 2010-12-01 10:49:59, stack wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 1024 bq. http://review.cloudera.org/r/1263/diff/1/?file=17923#file17923line1024 bq. bq. Is this a good name for this method? We're compacting a Store, not Stores, right? true. I mainly wanted to change the name from the public compact() api. I kept annoyingly clicking on the wrong function in Eclipse. Do you want to refactor it to compactFiles() right before commit? - Nicolas --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1263/#review2018 --- Max Compaction Size --- Key: HBASE-3290 URL: https://issues.apache.org/jira/browse/HBASE-3290 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor Add ability to specify a maximum storefile size for compaction. 
After this limit, we will not include this file in compactions. This is useful for large object stores and clusters that pre-split regions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
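The selection rules discussed in this exchange (skip files above a maximum size, never skip reference files, and sort candidates by size so bulk-loaded files are handled by relative size) can be sketched as below. This is an interpretation of the discussion, not the Store.java patch: the class and field names are hypothetical, and the largest-first ordering is an assumption chosen to mirror the "size decreases over time" skew described above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of compaction candidate selection with a size cap.
class CompactSelectionSketch {

  static final class FileInfo {
    final String name;
    final long size;         // reported length in bytes
    final boolean reference; // true for a post-split reference file

    FileInfo(String name, long size, boolean reference) {
      this.name = name;
      this.size = size;
      this.reference = reference;
    }
  }

  static List<FileInfo> select(List<FileInfo> files, long maxSize) {
    List<FileInfo> candidates = new ArrayList<>();
    for (FileInfo f : files) {
      // References are always kept: their reported length is that of the
      // whole parent file, and leaving them uncompacted blocks later splits.
      if (f.reference || f.size <= maxSize) {
        candidates.add(f);
      }
    }
    // Assumed ordering: largest first.
    candidates.sort(Comparator.comparingLong((FileInfo f) -> f.size).reversed());
    return candidates;
  }
}
```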
[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close
[ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965309#action_12965309 ] HBase Review Board commented on HBASE-3287: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1261/#review2009 --- Ship it! Looks good to me. Some comments below. branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java http://review.cloudera.org/r/1261/#comment6343 This looks like useful addition. branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java http://review.cloudera.org/r/1261/#comment6346 Why the flush? branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java http://review.cloudera.org/r/1261/#comment6344 Does this create new byte array? branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java http://review.cloudera.org/r/1261/#comment6345 I wonder if we have to have full path here? Anything less could cause clashes? But small optimization would strip the hbase.root at least? branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java http://review.cloudera.org/r/1261/#comment6347 Can you presize the BAOS? Whats the default? 4k? If so, and our default block size is 64k, that'd be a bit of expensive array resizing going on? Just guessing. branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java http://review.cloudera.org/r/1261/#comment6348 Surround with if debug? - stack Add option to cache blocks on hfile write and evict blocks on hfile close - Key: HBASE-3287 URL: https://issues.apache.org/jira/browse/HBASE-3287 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.90.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.0 This issue is about adding configuration options to add/remove from the block cache when creating/closing files. 
For use cases with lots of flushing and compacting, this might be desirable to prevent cache misses and maximize the effective utilization of total block cache capacity. The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we pre-cache blocks as we are writing out new files. The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict blocks when files are closed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
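On the question of presizing the buffer: in the JDK, a ByteArrayOutputStream created with the no-argument constructor starts at 32 bytes and grows as needed, so writing a 64k block into it does involve several resize-and-copy steps. A minimal sketch of presizing is shown below; the class name, the BLOCK_SIZE constant, and the writeBlock helper are illustrative and not taken from HFile.java.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch: allocate the buffer once at the expected block size
// instead of letting it grow from the 32-byte default.
class PresizedBufferSketch {
  // 64k is the default block size mentioned in the review comment.
  static final int BLOCK_SIZE = 64 * 1024;

  static byte[] writeBlock(byte[] payload) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream(BLOCK_SIZE);
    baos.write(payload, 0, payload.length);
    // toByteArray() returns a copy sized to the bytes actually written.
    return baos.toByteArray();
  }
}
```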
[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close
[ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965310#action_12965310 ] HBase Review Board commented on HBASE-3287: --- Message from: Ryan Rawson ryano...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1261/#review2010 --- branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java http://review.cloudera.org/r/1261/#comment6349 why would you not want to evict blocks from the cache on close? - Ryan Add option to cache blocks on hfile write and evict blocks on hfile close - Key: HBASE-3287 URL: https://issues.apache.org/jira/browse/HBASE-3287 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.90.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.0 This issue is about adding configuration options to add/remove from the block cache when creating/closing files. For use cases with lots of flushing and compacting, this might be desirable to prevent cache misses and maximize the effective utilization of total block cache capacity. The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we pre-cache blocks as we are writing out new files. The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict blocks when files are closed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close
[ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965311#action_12965311 ] HBase Review Board commented on HBASE-3287: --- Message from: st...@duboce.net bq. On 2010-11-30 09:57:27, Ryan Rawson wrote: bq. branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java, line 765 bq. http://review.cloudera.org/r/1261/diff/1/?file=17902#file17902line765 bq. bq. why would you not want to evict blocks from the cache on close? I think this a good point. Its different behavior but its behavior we should have always had? One less option too. - stack --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1261/#review2010 --- Add option to cache blocks on hfile write and evict blocks on hfile close - Key: HBASE-3287 URL: https://issues.apache.org/jira/browse/HBASE-3287 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.90.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.0 This issue is about adding configuration options to add/remove from the block cache when creating/closing files. For use cases with lots of flushing and compacting, this might be desirable to prevent cache misses and maximize the effective utilization of total block cache capacity. The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we pre-cache blocks as we are writing out new files. The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict blocks when files are closed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3286) Master passes IP and not hostname back to region server
[ https://issues.apache.org/jira/browse/HBASE-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965381#action_12965381 ] HBase Review Board commented on HBASE-3286: --- Message from: Jean-Daniel Cryans jdcry...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1262/ --- Review request for hbase. Summary --- Changes: - In HMaster, instead of passing an IP as a String, we now pass the complete HSA object. - In HRegionServer, I cleared a bunch of crufty comments and now handle the HSA passed by the master. - In HServerInfo, I saw that the hostname wasn't reset when setting the HSA. Fixed. - In HServerAddress, I fixed a few places that weren't explicitly using hostnames and changed the serialization to pass a hostname instead of an IP address. This addresses bug HBASE-3286. http://issues.apache.org/jira/browse/HBASE-3286 Diffs - /trunk/src/main/java/org/apache/hadoop/hbase/HServerAddress.java 1040669 /trunk/src/main/java/org/apache/hadoop/hbase/HServerInfo.java 1040669 /trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1040669 /trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1040669 Diff: http://review.cloudera.org/r/1262/diff Testing --- Works on my MBP (I was seeing the same issue, but since there's only 1 RS it didn't have any bad effect) and on my 10-machine Ubuntu cluster. 
Thanks, Jean-Daniel Master passes IP and not hostname back to region server --- Key: HBASE-3286 URL: https://issues.apache.org/jira/browse/HBASE-3286 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Fix For: 0.90.0 Starting my little test cluster on the latest from 0.90, I see: {noformat} 2010-11-29 23:21:34,131 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 1024 region(s) across 9 server(s), retainAssignment=true 2010-11-29 23:21:34,134 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 22 region(s) to sv2borg181,61020,1291072886282 2010-11-29 23:21:34,135 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 24 region(s) to sv2borg182,61020,1291072885473 2010-11-29 23:21:34,135 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 37 region(s) to sv2borg183,61020,1291072885646 2010-11-29 23:21:34,135 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 25 region(s) to sv2borg184,61020,1291072886734 2010-11-29 23:21:34,135 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 26 region(s) to sv2borg185,61020,1291072886606 2010-11-29 23:21:34,136 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 70 region(s) to sv2borg186,61020,1291072885486 2010-11-29 23:21:34,136 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 30 region(s) to sv2borg187,61020,1291072886355 2010-11-29 23:21:34,136 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 89 region(s) to sv2borg188,61020,1291072885926 2010-11-29 23:21:34,136 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 701 region(s) to sv2borg189,61020,1291072886739 {noformat} After another restart: {noformat} 2010-11-30 00:03:38,100 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 1024 region(s) across 9 server(s), retainAssignment=true 2010-11-30 00:03:38,103 DEBUG 
org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 71 region(s) to sv2borg181,61020,1291075409984 2010-11-30 00:03:38,103 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 82 region(s) to sv2borg182,61020,1291075409956 2010-11-30 00:03:38,104 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 61 region(s) to sv2borg183,61020,1291075409952 2010-11-30 00:03:38,104 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 122 region(s) to sv2borg184,61020,1291075409957 2010-11-30 00:03:38,104 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 59 region(s) to sv2borg185,61020,1291075409955 2010-11-30 00:03:38,104 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 71 region(s) to sv2borg186,61020,1291075409963 2010-11-30 00:03:38,105 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 52 region(s) to sv2borg187,61020,1291075411049 2010-11-30 00:03:38,105 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 254 region(s) to sv2borg188,61020,1291075410360 2010-11-30 00:03:38,105 DEBUG
[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close
[ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965455#action_12965455 ] HBase Review Board commented on HBASE-3287: --- Message from: Ryan Rawson ryano...@gmail.com bq. On 2010-11-30 09:57:27, Ryan Rawson wrote: bq. branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java, line 765 bq. http://review.cloudera.org/r/1261/diff/1/?file=17902#file17902line765 bq. bq. why would you not want to evict blocks from the cache on close? bq. bq. stack wrote: bq. I think this a good point. Its different behavior but its behavior we should have always had? One less option too. I'm still confused why we are adding config for something that we should always be doing it. While we'll never be zero conf, I am not seeing the reason why we'd want to keep things in the LRU. It would make more sense not to evict on a split, but evict every other time, since a split will probably reopen the same hfiles and need those blocks again. - Ryan --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1261/#review2010 --- Add option to cache blocks on hfile write and evict blocks on hfile close - Key: HBASE-3287 URL: https://issues.apache.org/jira/browse/HBASE-3287 Project: HBase Issue Type: New Feature Components: io, regionserver Affects Versions: 0.90.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.0 Attachments: HBASE-3287-FINAL-trunk.patch This issue is about adding configuration options to add/remove from the block cache when creating/closing files. For use cases with lots of flushing and compacting, this might be desirable to prevent cache misses and maximize the effective utilization of total block cache capacity. The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we pre-cache blocks as we are writing out new files. 
The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict blocks when files are closed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close
[ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965487#action_12965487 ] HBase Review Board commented on HBASE-3287: --- Message from: Jonathan Gray jg...@apache.org bq. On 2010-11-30 09:57:27, Ryan Rawson wrote: bq. branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java, line 765 bq. http://review.cloudera.org/r/1261/diff/1/?file=17902#file17902line765 bq. bq. why would you not want to evict blocks from the cache on close? bq. bq. stack wrote: bq. I think this a good point. Its different behavior but its behavior we should have always had? One less option too. bq. bq. Ryan Rawson wrote: bq. I'm still confused why we are adding config for something that we should always be doing it. While we'll never be zero conf, I am not seeing the reason why we'd want to keep things in the LRU. bq. bq. It would make more sense not to evict on a split, but evict every other time, since a split will probably reopen the same hfiles and need those blocks again. I think it makes sense to have undocumented configuration parameters. The default behavior is then the way but having a config option checked in the code at least gives the opportunity to turn something on/off without making a code change and redeploying completely. In the unit test, I'm turning it on/off with the config parameter so I can verify it works as expected. And although I've changed the default to true, I'm not convinced that it makes sense in all cases. Ryan came up with the example of the split, though that would override the config parameter. But I think there could be other situations where you don't want to evict as well. In any case, I want to keep it configurable so I can turn it on/off between test runs and see what, if any, difference these optimizations make, and IMO there's very little cost associated with using conf.getBoolean("some.undocumented.thing", true) vs. 
a hard-coded true (if there's any possibility you might want to change the behavior). - Jonathan --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1261/#review2010 --- Add option to cache blocks on hfile write and evict blocks on hfile close - Key: HBASE-3287 URL: https://issues.apache.org/jira/browse/HBASE-3287 Project: HBase Issue Type: New Feature Components: io, regionserver Affects Versions: 0.90.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.0 Attachments: HBASE-3287-FINAL-trunk.patch This issue is about adding configuration options to add/remove from the block cache when creating/closing files. For use cases with lots of flushing and compacting, this might be desirable to prevent cache misses and maximize the effective utilization of total block cache capacity. The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we pre-cache blocks as we are writing out new files. The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict blocks when files are closed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
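The "config flag with a default" pattern argued for in this message can be illustrated without Hadoop on the classpath. The sketch below uses java.util.Properties as a stand-in for the HBase configuration object; the helper and class names are hypothetical, and only the key hbase.rs.evictblocksonclose comes from the issue text.

```java
import java.util.Properties;

// Illustrative stand-in for reading an undocumented boolean option with a
// default. java.util.Properties replaces the real configuration class here.
class ConfFlagSketch {
  // Key taken from the issue description.
  static final String EVICT_ON_CLOSE_KEY = "hbase.rs.evictblocksonclose";

  static boolean getBoolean(Properties conf, String key, boolean defaultValue) {
    String value = conf.getProperty(key);
    // Unset keys fall back to the default, so behaviour is unchanged
    // unless someone deliberately overrides it.
    return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
  }

  static boolean evictOnClose(Properties conf) {
    return getBoolean(conf, EVICT_ON_CLOSE_KEY, true);
  }
}
```

A test can then flip the behaviour by setting the property to "false" before opening the file, which is the use described above, while production deployments that never set the key get the default.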
[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close
[ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965489#action_12965489 ] HBase Review Board commented on HBASE-3287: --- Message from: Jonathan Gray jg...@apache.org bq. On 2010-11-30 09:57:27, Ryan Rawson wrote: bq. branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java, line 765 bq. http://review.cloudera.org/r/1261/diff/1/?file=17902#file17902line765 bq. bq. why would you not want to evict blocks from the cache on close? bq. bq. stack wrote: bq. I think this a good point. Its different behavior but its behavior we should have always had? One less option too. bq. bq. Ryan Rawson wrote: bq. I'm still confused why we are adding config for something that we should always be doing it. While we'll never be zero conf, I am not seeing the reason why we'd want to keep things in the LRU. bq. bq. It would make more sense not to evict on a split, but evict every other time, since a split will probably reopen the same hfiles and need those blocks again. bq. bq. Jonathan Gray wrote: bq. I think it makes sense to have undocumented configuration parameters. The default behavior is then the way but having a config option checked in the code at least gives the opportunity to turn something on/off without making a code change and redeploying completely. In the unit test, I'm turning it on/off with the config parameter so I can verify it works as expected. bq. bq. And although I've changed the default to true, I'm not convinced that it always makes sense in all cases. bq. bq. Ryan came up with example of the split, though that would override the config parameter. But I think there could be other situations where you don't want to as well. bq. bq. 
In any case, I want to keep it configurable so I can turn it on/off between test runs and see what, if any, difference these optimizations make and IMO there's very little cost associated with using conf.getBoolean(some.undocumented.thing, true) vs. a hard-coded true (if there's any possibility you might want to change the behavior). Filed HBASE-3289 to disable them on close of parent files during split. I looked at the code and it's a fairly significant change since we'll need to pass a boolean in to all of the close() methods (there are several levels of them). Also, figuring out when we do want to evict these blocks (once both children have closed the file) is tricky. - Jonathan --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1261/#review2010 --- Add option to cache blocks on hfile write and evict blocks on hfile close - Key: HBASE-3287 URL: https://issues.apache.org/jira/browse/HBASE-3287 Project: HBase Issue Type: New Feature Components: io, regionserver Affects Versions: 0.90.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.0 Attachments: HBASE-3287-FINAL-trunk.patch This issue is about adding configuration options to add/remove from the block cache when creating/closing files. For use cases with lots of flushing and compacting, this might be desirable to prevent cache misses and maximize the effective utilization of total block cache capacity. The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we pre-cache blocks as we are writing out new files. The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict blocks when files are closed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
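One way to approach the "evict only once both children have closed the file" problem is to count open readers per file and evict when the count reaches zero. The sketch below shows that idea in isolation; it is a guess at a possible design, not what HBASE-3289 implemented, and every name in it is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: track how many readers share a file and report when
// the last one closes, which is the point where eviction becomes safe.
class SharedFileEvictionSketch {
  private final Map<String, Integer> openCount = new HashMap<>();

  synchronized void open(String path) {
    openCount.merge(path, 1, Integer::sum);
  }

  // Returns true when the caller was the last reader and should evict the
  // file's blocks from the cache.
  synchronized boolean close(String path) {
    Integer count = openCount.get(path);
    if (count == null) {
      throw new IllegalStateException("not open: " + path);
    }
    if (count == 1) {
      openCount.remove(path);
      return true;
    }
    openCount.put(path, count - 1);
    return false;
  }
}
```

In the split case, each daughter region would call open on the parent's file; the first close returns false and leaves the cached blocks in place, and the second returns true.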
[jira] Commented: (HBASE-3286) Master passes IP and not hostname back to region server
[ https://issues.apache.org/jira/browse/HBASE-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965521#action_12965521 ] HBase Review Board commented on HBASE-3286: --- Message from: Jean-Daniel Cryans jdcry...@apache.org bq. On 2010-11-30 12:27:06, Jonathan Gray wrote: bq. A little confused by the discrepancy between String host / int port and the Address. But does seem fine given we don't actually access the string/int values and always use the address object. bq. bq. Do we need some tests on this stuff? Seems like we always have issues here but tests don't catch anything. bq. bq. Looks better than what we have though so I'm +1 regardless. Regarding tests, I'm not sure what they would catch... bq. On 2010-11-30 12:27:06, Jonathan Gray wrote: bq. /trunk/src/main/java/org/apache/hadoop/hbase/HServerAddress.java, line 65 bq. http://review.cloudera.org/r/1262/diff/1/?file=17919#file17919line65 bq. bq. Why does stringValue not necessarily equal the host:port we store in those Strings? Shouldn't they be the same? I'm trying to keep it more consistent with the rest of the code, else when looking at the code you ask yourself the question you just asked me :) bq. On 2010-11-30 12:27:06, Jonathan Gray wrote: bq. /trunk/src/main/java/org/apache/hadoop/hbase/HServerAddress.java, line 177 bq. http://review.cloudera.org/r/1262/diff/1/?file=17919#file17919line177 bq. bq. But on serialization, we use the address hostname not the thing we actually store in hostname/port variables, so after serialized it's different? bq. bq. Shouldn't we set the hostname/port variables on construction according to address.getAddress/getPort rather than the passed values, if the address values are what we want to use? I'm... not following you. You're saying that we shouldn't store the InetSocketAddress? bq. On 2010-11-30 12:27:06, Jonathan Gray wrote: bq. /trunk/src/main/java/org/apache/hadoop/hbase/HServerInfo.java, line 116 bq. 
http://review.cloudera.org/r/1262/diff/1/?file=17920#file17920line116 bq. bq. I guess we never actually use the String host / int port? Why do we store them in HServerAddress then? Here I'm just making sure that after updating the address we also update the hostname, since it could have changed. - Jean-Daniel --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1262/#review2012 --- Master passes IP and not hostname back to region server --- Key: HBASE-3286 URL: https://issues.apache.org/jira/browse/HBASE-3286 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Fix For: 0.90.0 Starting my little test cluster on the latest from 0.90, I see: {noformat} 2010-11-29 23:21:34,131 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 1024 region(s) across 9 server(s), retainAssignment=true 2010-11-29 23:21:34,134 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 22 region(s) to sv2borg181,61020,1291072886282 2010-11-29 23:21:34,135 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 24 region(s) to sv2borg182,61020,1291072885473 2010-11-29 23:21:34,135 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 37 region(s) to sv2borg183,61020,1291072885646 2010-11-29 23:21:34,135 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 25 region(s) to sv2borg184,61020,1291072886734 2010-11-29 23:21:34,135 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 26 region(s) to sv2borg185,61020,1291072886606 2010-11-29 23:21:34,136 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 70 region(s) to sv2borg186,61020,1291072885486 2010-11-29 23:21:34,136 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 30 region(s) to sv2borg187,61020,1291072886355 2010-11-29 23:21:34,136 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 89 region(s) to sv2borg188,61020,1291072885926 
2010-11-29 23:21:34,136 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 701 region(s) to sv2borg189,61020,1291072886739 {noformat} After another restart: {noformat} 2010-11-30 00:03:38,100 INFO org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 1024 region(s) across 9 server(s), retainAssignment=true 2010-11-30 00:03:38,103 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 71 region(s) to sv2borg181,61020,1291075409984 2010-11-30 00:03:38,103 DEBUG org.apache.hadoop.hbase.master.AssignmentManager:
[jira] Commented: (HBASE-3290) Max Compaction Size
[ https://issues.apache.org/jira/browse/HBASE-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965587#action_12965587 ] HBase Review Board commented on HBASE-3290: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1263/ --- (Updated 2010-11-30 23:21:26.259598) Review request for hbase. Summary --- Add ability to specify a maximum storefile size for compaction. After this limit, we will not include this file in compactions. This is useful for large object stores and clusters that pre-split regions. This addresses bug HBASE-3290. http://issues.apache.org/jira/browse/HBASE-3290 Diffs - trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1040878 trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1040878 trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java PRE-CREATION Diff: http://review.cloudera.org/r/1263/diff Testing --- mvn test -Dtest=TestCompactSelection mvn test -Dtest=TestCompaction mvn test -Dtest=TestFromClientSide mvn test cluster testing Thanks, Nicolas Max Compaction Size --- Key: HBASE-3290 URL: https://issues.apache.org/jira/browse/HBASE-3290 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor Add ability to specify a maximum storefile size for compaction. After this limit, we will not include this file in compactions. This is useful for large object stores and clusters that pre-split regions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
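The size cutoff described in HBASE-3290 might look something like the sketch below. It assumes file sizes are plain long values and that a file exactly at the limit is still eligible; neither detail is stated in the review request, and the class and method names are hypothetical rather than the ones used in Store.java.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a maximum-size filter for compaction selection.
final class MaxCompactionSize {
    /** Returns the sizes of the files that remain eligible for compaction. */
    static List<Long> eligible(List<Long> fileSizes, long maxCompactSize) {
        List<Long> out = new ArrayList<>();
        for (long size : fileSizes) {
            // Assumption: files strictly larger than the limit are skipped.
            if (size <= maxCompactSize) {
                out.add(size);
            }
        }
        return out;
    }
}
```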
[jira] Commented: (HBASE-3282) Need to retain DeadServers to ensure we don't allow previously expired RS instances to rejoin cluster
[ https://issues.apache.org/jira/browse/HBASE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12964879#action_12964879 ] HBase Review Board commented on HBASE-3282: --- Message from: Jonathan Gray jg...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1259/ --- Review request for hbase and stack. Summary --- We currently let go of dead servers once we finish their shutdown. We should hang on to them longer to deal with things like network partitions. I'm not a fan of SoftReferences so I decided on another approach. DeadServers now has a maximum number of servers to hold on to in the set (default 100). Once it reaches the max, it evicts the oldest. More code than I had hoped but nothing too crazy. This addresses bug HBASE-3282. http://issues.apache.org/jira/browse/HBASE-3282 Diffs - branches/0.90/src/main/java/org/apache/hadoop/hbase/master/DeadServer.java 1040221 branches/0.90/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1040221 branches/0.90/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 1040221 branches/0.90/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 1040221 Diff: http://review.cloudera.org/r/1259/diff Testing --- Running unit tests now. Thanks, Jonathan Need to retain DeadServers to ensure we don't allow previously expired RS instances to rejoin cluster - Key: HBASE-3282 URL: https://issues.apache.org/jira/browse/HBASE-3282 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.90.0, 0.92.0 Currently we clear a server from the deadserver set once we finish processing it's shutdown. However, certain circumstances (network partitions, race conditions) could lead to the RS not doing a check-in until after the shutdown has been processed. 
As it stands, this RS will be let back into the cluster rather than rejected with YouAreDeadException. We should hang on to the dead servers so we always reject them. One concern is that the set will grow indefinitely. One recommendation by stack is to use SoftReferences. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
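The bounded set described in the review request (a maximum number of dead servers, evicting the oldest once the limit is reached) can be approximated as below. This is a sketch under assumptions, not the DeadServer class from the patch: server names are plain strings, the class name is hypothetical, and eviction is done with LinkedHashMap.removeEldestEntry.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a size-capped dead server set.
final class BoundedDeadServers {
    private final Set<String> servers;

    BoundedDeadServers(final int maxSize) {
        // An insertion-ordered map that drops its eldest entry once it grows past maxSize.
        this.servers = Collections.newSetFromMap(new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxSize;
            }
        });
    }

    synchronized void add(String serverName) {
        servers.add(serverName);
    }

    synchronized boolean isDead(String serverName) {
        return servers.contains(serverName);
    }

    synchronized int size() {
        return servers.size();
    }
}
```

The trade-off is the one noted in the summary: memory use stays bounded, but a server evicted from the set would no longer be rejected if it checked in later.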
[jira] Commented: (HBASE-3282) Need to retain DeadServers to ensure we don't allow previously expired RS instances to rejoin cluster
[ https://issues.apache.org/jira/browse/HBASE-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12964891#action_12964891 ] HBase Review Board commented on HBASE-3282: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1259/#review2004 --- Ship it! branches/0.90/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java http://review.cloudera.org/r/1259/#comment6320 You can make this private now that it's no longer referenced by Master? - stack Need to retain DeadServers to ensure we don't allow previously expired RS instances to rejoin cluster - Key: HBASE-3282 URL: https://issues.apache.org/jira/browse/HBASE-3282 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.90.0, 0.92.0 Currently we clear a server from the deadserver set once we finish processing its shutdown. However, certain circumstances (network partitions, race conditions) could lead to the RS not doing a check-in until after the shutdown has been processed. As it stands, this RS will be let back into the cluster rather than rejected with YouAreDeadException. We should hang on to the dead servers so we always reject them. One concern is that the set will grow indefinitely. One recommendation by stack is to use SoftReferences. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3287) Add option to cache blocks on hfile write and evict blocks on hfile close
[ https://issues.apache.org/jira/browse/HBASE-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965134#action_12965134 ] HBase Review Board commented on HBASE-3287: --- Message from: Jonathan Gray jg...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1261/ --- Review request for hbase, stack and khemani. Summary --- This issue is about adding configuration options to add/remove from the block cache when creating/closing files. For use cases with lots of flushing and compacting, this might be desirable to prevent cache misses and maximize the effective utilization of total block cache capacity. The first option, hbase.rs.cacheblocksonwrite, will make it so we pre-cache blocks as we are writing out new files. The second option, hbase.rs.evictblocksonclose, will make it so we evict blocks when files are closed. This addresses bug HBASE-3287. http://issues.apache.org/jira/browse/HBASE-3287 Diffs - branches/0.90/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java 1040422 branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 1040422 branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 1040422 branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java 1040422 branches/0.90/src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 1040422 branches/0.90/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java 1040422 branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1040422 branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1040422 branches/0.90/src/main/java/org/apache/hadoop/hbase/util/CompressionTest.java 1040422 branches/0.90/src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java 1040422 branches/0.90/src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java 1040422 
branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/RandomSeek.java 1040422 branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java 1040422 branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFilePerformance.java 1040422 branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileSeek.java 1040422 branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestReseekTo.java 1040422 branches/0.90/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java 1040422 branches/0.90/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java 1040422 branches/0.90/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java 1040422 Diff: http://review.cloudera.org/r/1261/diff Testing --- Added a unit test to TestStoreFile. That passes. Need to do perf testing on a cluster. Thanks, Jonathan Add option to cache blocks on hfile write and evict blocks on hfile close - Key: HBASE-3287 URL: https://issues.apache.org/jira/browse/HBASE-3287 Project: HBase Issue Type: Improvement Components: io, regionserver Affects Versions: 0.90.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.92.0 This issue is about adding configuration options to add/remove from the block cache when creating/closing files. For use cases with lots of flushing and compacting, this might be desirable to prevent cache misses and maximize the effective utilization of total block cache capacity. The first option, {{hbase.rs.cacheblocksonwrite}}, will make it so we pre-cache blocks as we are writing out new files. The second option, {{hbase.rs.evictblocksonclose}}, will make it so we evict blocks when files are closed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3279) [rest] Filter for gzip/deflate content encoding that wraps both input and output side
[ https://issues.apache.org/jira/browse/HBASE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935965#action_12935965 ] HBase Review Board commented on HBASE-3279: --- Message from: Andrew Purtell apurt...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1254/ --- Review request for hbase. Summary --- After HBASE-3275 the REST gateway uses Jetty's GzipFilter to return gzip or deflate encoded content to the client if the client requested it using the appropriate Accept-Encoding header. However, Jetty's GzipFilter only wraps output side processing. This patch implements a filter that also wraps input side processing, so clients can submit compressed PUT or POST bodies. This addresses bug HBASE-3279. http://issues.apache.org/jira/browse/HBASE-3279 Diffs - src/main/java/org/apache/hadoop/hbase/rest/Main.java 54866b6 src/main/java/org/apache/hadoop/hbase/rest/filter/GZIPRequestStream.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/rest/filter/GZIPRequestWrapper.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/rest/filter/GZIPResponseStream.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/rest/filter/GZIPResponseWrapper.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/rest/filter/GzipFilter.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/rest/HBaseRESTTestingUtility.java 5e943ec src/test/java/org/apache/hadoop/hbase/rest/TestGzipFilter.java PRE-CREATION Diff: http://review.cloudera.org/r/1254/diff Testing --- New unit test, passes. 
Thanks, Andrew [rest] Filter for gzip/deflate content encoding that wraps both input and output side - Key: HBASE-3279 URL: https://issues.apache.org/jira/browse/HBASE-3279 Project: HBase Issue Type: Improvement Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.20.7, 0.90.0 After HBASE-3275 the REST gateway will return gzip or deflate encoded content to the client if the client requested it using the appropriate Accept-Encoding header. However Jetty's GzipFilter only wraps output side processing. A client can submit gzip or deflate encoded requests (i.e. Content-Encoding: gzip ; Content-Type: ...) but the data is not decoded, it is simply passed through. Implement a filter that also wraps input side processing, so clients can submit compressed PUT or POST bodies. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
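Input side decoding of the kind the issue asks for could be sketched like this. It deliberately avoids the servlet API, so it is not the GZIPRequestWrapper from the patch: the class and method names are hypothetical, and the request body is assumed to be already buffered as a byte array. It only uses java.util.zip to show the decode step keyed on the Content-Encoding value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper showing request-body decoding by Content-Encoding.
final class RequestBodyDecoder {
    /** Decodes a body according to its Content-Encoding; other encodings pass through unchanged. */
    static byte[] decode(String contentEncoding, byte[] body) {
        try {
            InputStream in = new ByteArrayInputStream(body);
            if ("gzip".equalsIgnoreCase(contentEncoding)) {
                in = new GZIPInputStream(in);
            } else if ("deflate".equalsIgnoreCase(contentEncoding)) {
                in = new InflaterInputStream(in);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            in.close();
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Client-side counterpart: gzip a plain body before sending it. */
    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

A real filter would wrap the request's input stream lazily rather than buffering the whole body, which matters for large PUT or POST payloads.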
[jira] Commented: (HBASE-3279) [rest] Filter for gzip/deflate content encoding that wraps both input and output side
[ https://issues.apache.org/jira/browse/HBASE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936015#action_12936015 ] HBase Review Board commented on HBASE-3279: --- Message from: Lars George larsgeo...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1254/#review1981 --- Ship it! Looks great! RB did show some white spaces added unnecessarily, I assume you could remove those on commit. - Lars [rest] Filter for gzip/deflate content encoding that wraps both input and output side - Key: HBASE-3279 URL: https://issues.apache.org/jira/browse/HBASE-3279 Project: HBase Issue Type: Improvement Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.20.7, 0.90.0 After HBASE-3275 the REST gateway will return gzip or deflate encoded content to the client if the client requested it using the appropriate Accept-Encoding header. However Jetty's GzipFilter only wraps output side processing. A client can submit gzip or deflate encoded requests (i.e. Content-Encoding: gzip ; Content-Type: ...) but the data is not decoded, it is simply passed through. Implement a filter that also wraps input side processing, so clients can submit compressed PUT or POST bodies. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3276) delete followed by a put with the same timestamp
[ https://issues.apache.org/jira/browse/HBASE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936040#action_12936040 ] HBase Review Board commented on HBASE-3276: --- Message from: Pranav Khaitan pranavkhai...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1252/ --- Review request for hbase, Jonathan Gray and Kannan Muthukkaruppan. Summary --- This is a design change suggested in HBASE-3276, so adequate thought should be given before proceeding. The main code change is just one line, which is to ignore key type while doing KV comparisons. When the key type is ignored, all the keys for the same timestamp are sorted according to the order in which they were inserted. It is still ensured that the delete family and delete column will be at the top because they have the default column name and default timestamp. This addresses bug HBASE-3276. http://issues.apache.org/jira/browse/HBASE-3276 Diffs - trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1039233 trunk/src/test/java/org/apache/hadoop/hbase/regionserver/KeyValueScanFixture.java 1039233 trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1039233 Diff: http://review.cloudera.org/r/1252/diff Testing --- Test cases added. Since there is a change in semantics, some previous tests were failing because of this change. Those tests have been modified to test the newer behavior. Thanks, Pranav delete followed by a put with the same timestamp Key: HBASE-3276 URL: https://issues.apache.org/jira/browse/HBASE-3276 Project: HBase Issue Type: Bug Reporter: Kannan Muthukkaruppan Assignee: Kannan Muthukkaruppan [Note: This issue is relevant only for cases that don't use the default time based versions, but provide/manage versions explicitly.] The fix for HBASE-1485 ensures that if there are multiple puts with the same timestamp the later one wins. 
However, if there is a delete for a specific timestamp, then the later put doesn't win. Say, for example, the following is the sequence of operations: (1) put row/col/v1 - value1, (2) deleteColumn row/col/v1, (3) put row/col/v1 - value2. Without the deleteColumn(), HBASE-1485 ensures that value2 is the winner. However, with the deleteColumn() thrown into the mix, the delete wins, and one cannot insert a new value at that version. [The only, unsatisfactory, workaround at this point seems to be to trigger a major compaction. The major compaction would clear the delete marker, and allow new cells to be created with that version again.] --- Seems like it might not be too complicated to extend the fix for HBASE-1485 to also respect ordering between delete/put operations. I'll look into this further. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
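The put/delete/put sequence from the issue description can be modelled with a toy comparator to show why the delete wins today and what ignoring the key type would change. This is a simplified sketch, not HBase's KeyValue or KVComparator: the Cell class, the seq field (standing in for insertion order, in the spirit of memstoreTS), and the one-line read rule are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of same-timestamp ordering on a single row/column.
final class SameTimestampOrdering {
    /** Simplified stand-in for a key/value cell; not the HBase class. */
    static final class Cell {
        final long ts;        // explicit version supplied by the client
        final boolean delete; // true for a delete marker, false for a put
        final long seq;       // hypothetical insertion counter
        final String value;

        Cell(long ts, boolean delete, long seq, String value) {
            this.ts = ts;
            this.delete = delete;
            this.seq = seq;
            this.value = value;
        }
    }

    /** Newest timestamp first; at equal timestamps, deletes sort ahead of puts. */
    static final Comparator<Cell> TYPE_ORDER = (a, b) -> {
        int c = Long.compare(b.ts, a.ts);
        return c != 0 ? c : Boolean.compare(b.delete, a.delete);
    };

    /** Newest timestamp first; at equal timestamps, the most recent write comes first. */
    static final Comparator<Cell> INSERTION_ORDER = (a, b) -> {
        int c = Long.compare(b.ts, a.ts);
        return c != 0 ? c : Long.compare(b.seq, a.seq);
    };

    /** The first cell in sorted order decides the read; a delete marker means nothing is visible. */
    static String read(List<Cell> cells, Comparator<Cell> order) {
        List<Cell> sorted = new ArrayList<>(cells);
        sorted.sort(order);
        if (sorted.isEmpty() || sorted.get(0).delete) {
            return null;
        }
        return sorted.get(0).value;
    }
}
```

Feeding in put(value1), delete, put(value2) at the same timestamp, TYPE_ORDER yields no visible value because the delete marker sorts first, while INSERTION_ORDER yields value2.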
[jira] Commented: (HBASE-3276) delete followed by a put with the same timestamp
[ https://issues.apache.org/jira/browse/HBASE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936119#action_12936119 ] HBase Review Board commented on HBASE-3276: --- Message from: Ryan Rawson ryano...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1252/#review1993 --- trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java http://review.cloudera.org/r/1252/#comment6297 What are all the consequences of not sorting by type when using KVComparator? Does this mean we might create HFiles that are not sorted properly, because the HFile comparator uses the KeyComparator directly with ignoreType = false? While in memstore we can rely on memstoreTS to roughly order by insertion time, and the Put/Delete should probably work in that situation, you are talking about modifying a pretty core and important concept in how we sort things. There are other ways to reconcile bugs like this; one of them is to extend the memstoreTS concept into the HFile and use that to reconcile during reads. There is another JIRA where I proposed this. If we are talking about 0.92 and beyond I'd prefer building a solid base rather than dangerous hacks like this. Our unit tests are not extremely extensive, so while they might pass, that doesn't guarantee lack of bad behaviour later on. - Ryan delete followed by a put with the same timestamp Key: HBASE-3276 URL: https://issues.apache.org/jira/browse/HBASE-3276 Project: HBase Issue Type: Bug Reporter: Kannan Muthukkaruppan Assignee: Kannan Muthukkaruppan [Note: This issue is relevant only for cases that don't use the default time based versions, but provide/manage versions explicitly.] The fix for HBASE-1485 ensures that if there are multiple puts with the same timestamp the later one wins. However, if there is a delete for a specific timestamp, then the later put doesn't win. 
Say, for example, the following is the sequence of operations: (1) put row/col/v1 - value1, (2) deleteColumn row/col/v1, (3) put row/col/v1 - value2. Without the deleteColumn(), HBASE-1485 ensures that value2 is the winner. However, with the deleteColumn() thrown into the mix, the delete wins, and one cannot insert a new value at that version. [The only, unsatisfactory, workaround at this point seems to be to trigger a major compaction. The major compaction would clear the delete marker, and allow new cells to be created with that version again.] --- Seems like it might not be too complicated to extend the fix for HBASE-1485 to also respect ordering between delete/put operations. I'll look into this further. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3276) delete followed by a put with the same timestamp
[ https://issues.apache.org/jira/browse/HBASE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936123#action_12936123 ] HBase Review Board commented on HBASE-3276: --- Message from: Pranav Khaitan pranavkhai...@gmail.com bq. On 2010-11-26 14:54:45, Ryan Rawson wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1373 bq. http://review.cloudera.org/r/1252/diff/1/?file=17712#file17712line1373 bq. bq. what are all the consequences for not sorting by type when using KVComparator? Does this mean we might create HFiles that not sorted properly, because the HFile comparator uses the KeyComparator directly with ignoreType = false. bq. bq. While in memstore we can rely on memstoreTS to roughly order by insertion time, and the Put/Delete should probably work in that situation, you are talking about modifiying a pretty core and important concept in how we sort things. bq. bq. There are other ways to reconcile bugs like this, one of them is to extend the memstoreTS concept into the HFile and use that to reconcile during reads. There is another JIRA where I proposed this. bq. bq. If we are talking about 0.92 and beyond I'd prefer building a solid base rather than dangerous hacks like this. Our unit tests are not extremely extensive, so while they might pass, that doesnt guarantee lack of bad behaviour later on. bq. Agree. As I mentioned, this is a major change and more thought needs to be given to it. However, to resolve issues like HBASE-3276, we need either such a change or extend the memstoreTS concept to HFile as you mentioned. About consequences, I don't see anything negative here. This change only affects the sorting of keys having same row, col, timestamp. After this change, all keys with the same row, col, ts will be sorted purely based on the order in which they were inserted. When a memstore is flushed to HFile, the memstoreTS takes care of ordering. 
During compactions, the KeyValueHeap breaks ties by using the sequence ids of storefiles. - Pranav --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1252/#review1993 --- delete followed by a put with the same timestamp Key: HBASE-3276 URL: https://issues.apache.org/jira/browse/HBASE-3276 Project: HBase Issue Type: Bug Reporter: Kannan Muthukkaruppan Assignee: Kannan Muthukkaruppan [Note: This issue is relevant only for cases that don't use the default time based versions, but provide/manage versions explicitly.] The fix for HBASE-1485 ensures that if there are multiple puts with the same timestamp the later one wins. However, if there is a delete for a specific timestamp, then the later put doesn't win. Say for example the following is the sequence of operations: put row/col/v1 - value1 deleteColumn row/col/v1 put row/col/v1 - value2 Without the deleteColumn(), HBASE-1485 ensures that value2 is the winner. However, with the deleteColumn() thrown into the mix, the delete wins, and one cannot insert a new value at that version. [The only, unsatisfactory, workaround at this point seems to be trigger a major compaction. The major compact would clear the delete marker, and allow new cells to be created with that version again.] --- Seems like it might not be too complicated to extend the fix for HBASE-1485 to also respect ordering between delete/put operations. I'll look into this further. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
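The tie-breaking by storefile sequence id mentioned above can be illustrated with a small merge. This is a sketch under assumptions, not the KeyValueHeap implementation: keys are plain strings, every entry carries the sequence id of the file it came from, the file with the higher sequence id is assumed to be the newer one and to sort first, and for brevity all entries are pushed onto one heap rather than merged stream by stream. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Toy merge that breaks key ties using the originating file's sequence id.
final class SequenceIdTieBreak {
    static final class Entry {
        final String key;
        final String value;
        final long fileSeqId; // sequence id of the file this entry was read from

        Entry(String key, String value, long fileSeqId) {
            this.key = key;
            this.value = value;
            this.fileSeqId = fileSeqId;
        }
    }

    /** Orders by key; equal keys are ordered newest file (highest sequence id) first. */
    static final Comparator<Entry> ORDER = (a, b) -> {
        int c = a.key.compareTo(b.key);
        return c != 0 ? c : Long.compare(b.fileSeqId, a.fileSeqId);
    };

    /** Merges the entries of several files into one ordered list. */
    static List<Entry> merge(List<List<Entry>> files) {
        PriorityQueue<Entry> heap = new PriorityQueue<>(ORDER);
        for (List<Entry> file : files) {
            heap.addAll(file);
        }
        List<Entry> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            out.add(heap.poll());
        }
        return out;
    }
}
```

Note that the order among equal keys here lives in the merge step, not in the keys themselves, which is the "implicit ordering" concern raised in the follow-up comment.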
[jira] Commented: (HBASE-3276) delete followed by a put with the same timestamp
[ https://issues.apache.org/jira/browse/HBASE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936128#action_12936128 ] HBase Review Board commented on HBASE-3276: --- Message from: Ryan Rawson ryano...@gmail.com bq. On 2010-11-26 14:54:45, Ryan Rawson wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1373 bq. http://review.cloudera.org/r/1252/diff/1/?file=17712#file17712line1373 bq. bq. what are all the consequences for not sorting by type when using KVComparator? Does this mean we might create HFiles that not sorted properly, because the HFile comparator uses the KeyComparator directly with ignoreType = false. bq. bq. While in memstore we can rely on memstoreTS to roughly order by insertion time, and the Put/Delete should probably work in that situation, you are talking about modifiying a pretty core and important concept in how we sort things. bq. bq. There are other ways to reconcile bugs like this, one of them is to extend the memstoreTS concept into the HFile and use that to reconcile during reads. There is another JIRA where I proposed this. bq. bq. If we are talking about 0.92 and beyond I'd prefer building a solid base rather than dangerous hacks like this. Our unit tests are not extremely extensive, so while they might pass, that doesnt guarantee lack of bad behaviour later on. bq. bq. bq. Pranav Khaitan wrote: bq. Agree. As I mentioned, this is a major change and more thought needs to be given to it. bq. bq. However, to resolve issues like HBASE-3276, we need either such a change or extend the memstoreTS concept to HFile as you mentioned. bq. bq. About consequences, I don't see anything negative here. This change only affects the sorting of keys having same row, col, timestamp. After this change, all keys with the same row, col, ts will be sorted purely based on the order in which they were inserted. When a memstore is flushed to HFile, the memstoreTS takes care of ordering. 
During compactions, the KeyValueHeap breaks ties by using the sequence ids of storefiles. the problem is you are now changing how things are ordered sometimes but not all the time. HFile directly uses the rawcomparator, instantiating it directly rather than getting it via the code path you changed. So now you create a memstore in this order: row,col,100,Put (memstoreTS=1) row,col,100,Delete (memstoreTS=2) row,col,100,Put (memstoreTS=3) But the HFile comparator will consider this out of order since it doesnt know about memstoreTS and it still expects things to be in a certain order. I'm a little wary of having implicit ordering in the HFiles... in your new scheme, Put,Delete,Put are in that order 'just because they are', and the comparator cannot put them back in order, and must rely on scanner order. During compactions we would place keys in order based on which files they came from, but they wouldn't themselves have an order. Basically we should get rid of 'type sorting' and use memstoreTS sorting in memory and implicit sorting in the HFiles. - Ryan --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1252/#review1993 --- delete followed by a put with the same timestamp Key: HBASE-3276 URL: https://issues.apache.org/jira/browse/HBASE-3276 Project: HBase Issue Type: Bug Reporter: Kannan Muthukkaruppan Assignee: Kannan Muthukkaruppan [Note: This issue is relevant only for cases that don't use the default time based versions, but provide/manage versions explicitly.] The fix for HBASE-1485 ensures that if there are multiple puts with the same timestamp the later one wins. However, if there is a delete for a specific timestamp, then the later put doesn't win. Say for example the following is the sequence of operations: put row/col/v1 - value1 deleteColumn row/col/v1 put row/col/v1 - value2 Without the deleteColumn(), HBASE-1485 ensures that value2 is the winner. 
However, with the deleteColumn() thrown into the mix, the delete wins, and one cannot insert a new value at that version. [The only, unsatisfactory, workaround at this point seems to be trigger a major compaction. The major compact would clear the delete marker, and allow new cells to be created with that version again.] --- Seems like it might not be too complicated to extend the fix for HBASE-1485 to also respect ordering between delete/put operations. I'll look into this further. -- This message is automatically generated by JIRA. - You can reply to
[jira] Commented: (HBASE-3276) delete followed by a put with the same timestamp
[ https://issues.apache.org/jira/browse/HBASE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936131#action_12936131 ] HBase Review Board commented on HBASE-3276: --- Message from: Pranav Khaitan pranavkhai...@gmail.com bq. On 2010-11-26 14:54:45, Ryan Rawson wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1373 bq. http://review.cloudera.org/r/1252/diff/1/?file=17712#file17712line1373 bq. bq. what are all the consequences for not sorting by type when using KVComparator? Does this mean we might create HFiles that not sorted properly, because the HFile comparator uses the KeyComparator directly with ignoreType = false. bq. bq. While in memstore we can rely on memstoreTS to roughly order by insertion time, and the Put/Delete should probably work in that situation, you are talking about modifiying a pretty core and important concept in how we sort things. bq. bq. There are other ways to reconcile bugs like this, one of them is to extend the memstoreTS concept into the HFile and use that to reconcile during reads. There is another JIRA where I proposed this. bq. bq. If we are talking about 0.92 and beyond I'd prefer building a solid base rather than dangerous hacks like this. Our unit tests are not extremely extensive, so while they might pass, that doesnt guarantee lack of bad behaviour later on. bq. bq. bq. Pranav Khaitan wrote: bq. Agree. As I mentioned, this is a major change and more thought needs to be given to it. bq. bq. However, to resolve issues like HBASE-3276, we need either such a change or extend the memstoreTS concept to HFile as you mentioned. bq. bq. About consequences, I don't see anything negative here. This change only affects the sorting of keys having same row, col, timestamp. After this change, all keys with the same row, col, ts will be sorted purely based on the order in which they were inserted. When a memstore is flushed to HFile, the memstoreTS takes care of ordering. 
During compactions, the KeyValueHeap breaks ties by using the sequence ids of storefiles.
bq.
bq. Ryan Rawson wrote:
bq. the problem is you are now changing how things are ordered sometimes but not all the time. HFile directly uses the rawcomparator, instantiating it directly rather than getting it via the code path you changed. So now you create a memstore in this order:
bq.
bq. row,col,100,Put (memstoreTS=1)
bq. row,col,100,Delete (memstoreTS=2)
bq. row,col,100,Put (memstoreTS=3)
bq.
bq. But the HFile comparator will consider this out of order since it doesnt know about memstoreTS and it still expects things to be in a certain order.
bq.
bq. I'm a little wary of having implicit ordering in the HFiles... in your new scheme, Put,Delete,Put are in that order 'just because they are', and the comparator cannot put them back in order, and must rely on scanner order. During compactions we would place keys in order based on which files they came from, but they wouldn't themselves have an order. Basically we should get rid of 'type sorting' and use memstoreTS sorting in memory and implicit sorting in the HFiles.
bq.

Right. I see that HFile does an extra check and throws an IOException when it gets data out of order. So if we go forward with this change, we will have to ensure that the comparator used by HFile knows about it. This can be achieved in two ways: first, by setting the default value of ignoreType = true; alternatively, HFile can explicitly set ignoreType = true.

- Pranav

---
This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1252/#review1993
---
[jira] Commented: (HBASE-3276) delete followed by a put with the same timestamp
[ https://issues.apache.org/jira/browse/HBASE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12936134#action_12936134 ]

HBase Review Board commented on HBASE-3276:
---
Message from: Pranav Khaitan pranavkhai...@gmail.com

---
This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1252/
---

(Updated 2010-11-26 16:02:47.462170)

Review request for hbase, Jonathan Gray and Kannan Muthukkaruppan.

Summary (updated)
---
This is a design change suggested in HBASE-3276, so adequate thought should be given before proceeding. The main code change is just one line, which ignores the key type when doing KV comparisons. When the key type is ignored, all the keys for the same timestamp are sorted according to the order in which they were inserted. It is still ensured that the delete family and delete column will be at the top because they have the default column name and default timestamp.

This addresses bug HBASE-3276.
http://issues.apache.org/jira/browse/HBASE-3276

Diffs
-
trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1039233
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/KeyValueScanFixture.java 1039233
trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java 1039233

Diff: http://review.cloudera.org/r/1252/diff

Testing
---
Test cases added. Since there is a change in semantics, some previous tests were failing because of this change. Those tests have been modified to test the newer behavior.

Thanks,
Pranav

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3267) close_region shell command breaks region
[ https://issues.apache.org/jira/browse/HBASE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935566#action_12935566 ]

HBase Review Board commented on HBASE-3267:
---
Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1250/
---

Review request for hbase and Jonathan Gray.

Summary
---
So, things are different in the new master. Close region should close region. Not close and then reopen. To close and reopen elsewhere, that's an unassign or a move (both of which were missing from the shell but are added in this patch). I fixed the close so that it's a close that does not touch zk... the region is just closed on the regionserver. Not going to zk means the close no longer provokes complaints. Close is dangerous though, in that the region is now permanently offline (I updated the close help to explain this is so). To address it being permanently offline, I added a new assign to the shell. While in here, I removed commands that no longer make sense, such as enable_region and disable_region.

M src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Changed move implementation so you can pass an empty host. Empty host means move to a random location rather than an explicit server. Added assign, unassign.
M src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
(clearRegionPlan): Added.
M src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java
Improved move javadoc. Added assign, unassign.
M src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
Improved javadoc. Added assign and unassign.
M src/main/ruby/hbase/admin.rb
Added balancer, balance_switch, assign, unassign; removed zk, enable_region and disable_region (the latter two make no sense anymore now that disable/enable is done differently).
D src/main/ruby/shell/commands/zk.rb A src/main/ruby/shell/commands/assign.rb A src/main/ruby/shell/commands/balance_switch.rb D src/main/ruby/shell/commands/disable_region.rb A src/main/ruby/shell/commands/balancer.rb A src/main/ruby/shell/commands/unassign.rb D src/main/ruby/shell/commands/enable_region.rb A src/main/ruby/shell/commands/move.rb M src/main/ruby/shell/commands/close_region.rb Fixed up help M src/main/ruby/shell.rb Added and removed commands. This addresses bug hbase-3267. http://issues.apache.org/jira/browse/hbase-3267 Diffs - trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1038768 trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 1038768 trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1038768 trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1038768 trunk/src/main/ruby/hbase/admin.rb 1038768 trunk/src/main/ruby/shell.rb 1038768 trunk/src/main/ruby/shell/commands/assign.rb PRE-CREATION trunk/src/main/ruby/shell/commands/balance_switch.rb PRE-CREATION trunk/src/main/ruby/shell/commands/balancer.rb PRE-CREATION trunk/src/main/ruby/shell/commands/close_region.rb 1038768 trunk/src/main/ruby/shell/commands/disable_region.rb 1038768 trunk/src/main/ruby/shell/commands/enable_region.rb 1038768 trunk/src/main/ruby/shell/commands/move.rb PRE-CREATION trunk/src/main/ruby/shell/commands/unassign.rb PRE-CREATION trunk/src/main/ruby/shell/commands/zk.rb 1038768 Diff: http://review.cloudera.org/r/1250/diff Testing --- I tested shell here on my little cluster. 
Thanks, stack close_region shell command breaks region Key: HBASE-3267 URL: https://issues.apache.org/jira/browse/HBASE-3267 Project: HBase Issue Type: Bug Components: master, regionserver, shell Affects Versions: 0.90.0 Reporter: Todd Lipcon Assignee: stack Priority: Critical Fix For: 0.90.0 It used to be that you could use the close_region command from the shell to close a region on one server and have the master reassign it elsewhere. Now if you close a region, you get the following errors in the master log: 2010-11-23 00:46:34,090 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSING for region ffaa7999e909dbd6544688cc8ab303bd from server haus01.sf.cloudera.com,12020,1290501789693 but region was in the state null and not in expected PENDI 2010-11-23 00:46:34,530 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:6-0x12c537d84e10062 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/unassigned/ffaa7999e909dbd6544688cc8ab303bd 2010-11-23 00:46:34,531 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil:
[jira] Commented: (HBASE-3267) close_region shell command breaks region
[ https://issues.apache.org/jira/browse/HBASE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935579#action_12935579 ] HBase Review Board commented on HBASE-3267: --- Message from: Jonathan Gray jg...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1250/#review1975 --- This is great. I like this much better than hacking up the master transition code. My main concern is around the exact semantics of assign/unassign (and close). I think we need to do good javadoc on the HBA methods to describe how you would use these or at least a bit about their behavior. assign() just does an assign, but unassign() actually clears stuff out. It seems doing a close() behind the masters back, then asking the master to assign that region, should not work... but it does? trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java http://review.cloudera.org/r/1250/#comment6230 Is there an open_region? This assign() goes through the master, so what is the opposite of close_region which doesn't go through the master? Doesn't close_region now put the master in a bad state, so it won't expect an assignment to be done on a region which it thinks is already assigned? There is a force on unassign() but not on assign(). In the old master, for HBCK, I added a hook in to the master to clear the in-memory state for a region. To deal with dupe assignment, I did silent close_regions and then cleared the in-memory state. Then I triggered a new assignment. trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java http://review.cloudera.org/r/1250/#comment6231 this is awesome javadoc. is there somewhere else we can put this rather than in just the move() API? Maybe in the HBA class comment or something? 
Somewhere we can reference in other javadocs about what a regionname is trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1250/#comment6232 So you're supposed to call move instead of open_region? Or why the change in move() though this looks good. trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1250/#comment6233 Why META and not in-memory state? Once you hit assign() you rely on the in-memory state anyways? trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1250/#comment6238 on assign we just do the assignment, but below on unassign() we first clear existing plans and clear from RIT. why the difference. trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java http://review.cloudera.org/r/1250/#comment6234 is this necessary? should the unassign method taking force deal with anything needed to force it? trunk/src/main/ruby/hbase/admin.rb http://review.cloudera.org/r/1250/#comment6235 zk didn't work? why is this removed? trunk/src/main/ruby/shell/commands/assign.rb http://review.cloudera.org/r/1250/#comment6236 whitespace. and what exactly are the semantics of this? what if region is already assigned? we should document somewhere more specifically what the behavior is of these methods if we're going to expose them to the client and the shell. neither place really describes what this means and i can imagine users will be doing lots of foot shooting with tools like this. more importantly, though, i'm trying to understand the use cases for these. if it's to unbreak stuff, it's not clear to me how exactly you would use it given that the master will reject certain operations in the wrong order. trunk/src/main/ruby/shell/commands/close_region.rb http://review.cloudera.org/r/1250/#comment6237 Why would you use close and not unassign/assign/move? It's because close is done silently? Should say that if that's the distinction. 
Is this comment saying you can use unassign or move after you issue close? or instead of? trunk/src/main/ruby/shell/commands/unassign.rb http://review.cloudera.org/r/1250/#comment6239 this doesn't use encoded region name? is move then different from the other methods? - Jonathan close_region shell command breaks region Key: HBASE-3267 URL: https://issues.apache.org/jira/browse/HBASE-3267 Project: HBase Issue Type: Bug Components: master, regionserver, shell Affects Versions: 0.90.0 Reporter: Todd Lipcon Assignee: stack Priority: Critical Fix For: 0.90.0 It used to be that you could use the close_region command from the shell to close a region on one server and have
[jira] Commented: (HBASE-3267) close_region shell command breaks region
[ https://issues.apache.org/jira/browse/HBASE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935590#action_12935590 ] HBase Review Board commented on HBASE-3267: --- Message from: st...@duboce.net bq. On 2010-11-24 15:45:32, Jonathan Gray wrote: bq. This is great. I like this much better than hacking up the master transition code. bq. bq. My main concern is around the exact semantics of assign/unassign (and close). I think we need to do good javadoc on the HBA methods to describe how you would use these or at least a bit about their behavior. assign() just does an assign, but unassign() actually clears stuff out. It seems doing a close() behind the masters back, then asking the master to assign that region, should not work... but it does? Well, my notion is that user shouldn't be doing these manual messings any more. Fixup stuff is now for hbck to do. Yes, close of a region is done w/o master's involvement. Rare would you do it. Yes, an assign will assign a region EVEN IF ALREADY assigned. Messing in here can get you in trouble. I was able to manufacture some ugly conditions -- a stuck region trying to assign same server over and over -- but then unassign with a force now clears out RIT and does the right thing i.e. we have enough tools to hang ourselves on new master but also the tools to undo. bq. On 2010-11-24 15:45:32, Jonathan Gray wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java, line 980 bq. http://review.cloudera.org/r/1250/diff/1/?file=17648#file17648line980 bq. bq. Is there an open_region? This assign() goes through the master, so what is the opposite of close_region which doesn't go through the master? bq. bq. Doesn't close_region now put the master in a bad state, so it won't expect an assignment to be done on a region which it thinks is already assigned? There is a force on unassign() but not on assign(). bq. bq. 
In the old master, for HBCK, I added a hook in to the master to clear the in-memory state for a region. To deal with dupe assignment, I did silent close_regions and then cleared the in-memory state. Then I triggered a new assignment. No open_region. Someone can add that later if wanted. Otherwise, use move to place region on specific server. On close_region, yes, puts master in bad state but you'd only do close_region when doing fix up of some whack condition. I was tempted to just remove these commands but since we don't know what states new master could put us in, I'll leave them in for now. I'll add force to assign so same as unassign. Regards what you did for old master hbck, you could call close_regions then an unassign with a force would clear memory and get the region assigned elsewhere. But hbck should be doing this. Not a user manually, not unless things are really hosed. bq. On 2010-11-24 15:45:32, Jonathan Gray wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java, line 138 bq. http://review.cloudera.org/r/1250/diff/1/?file=17649#file17649line138 bq. bq. this is awesome javadoc. is there somewhere else we can put this rather than in just the move() API? Maybe in the HBA class comment or something? Somewhere we can reference in other javadocs about what a regionname is I moved the interface doc out to HBA as per your suggestion. bq. On 2010-11-24 15:45:32, Jonathan Gray wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 709 bq. http://review.cloudera.org/r/1250/diff/1/?file=17651#file17651line709 bq. bq. So you're supposed to call move instead of open_region? Or why the change in move() though this looks good. Just added it as something you might want to do. unassign does same thing really. I could back it out. bq. On 2010-11-24 15:45:32, Jonathan Gray wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 994 bq. 
http://review.cloudera.org/r/1250/diff/1/?file=17651#file17651line994 bq. bq. Why META and not in-memory state? Once you hit assign() you rely on the in-memory state anyways? I only have a region server name, not an HRI which is what the inmemory state is keyed by. I could iterate the Map I suppose but then I'm thinking it may have been cleared from inmemory state. bq. On 2010-11-24 15:45:32, Jonathan Gray wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 996 bq. http://review.cloudera.org/r/1250/diff/1/?file=17651#file17651line996 bq. bq. on assign we just do the assignment, but below on unassign() we first clear existing plans and clear from RIT. why the difference. I made it so we only clear state if force is added to the unassign. bq. On 2010-11-24 15:45:32, Jonathan Gray wrote: bq.
[jira] Commented: (HBASE-3267) close_region shell command breaks region
[ https://issues.apache.org/jira/browse/HBASE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935598#action_12935598 ]

HBase Review Board commented on HBASE-3267:
---
Message from: st...@duboce.net

---
This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1250/
---

(Updated 2010-11-24 16:44:36.709870)

Review request for hbase and Jonathan Gray.

Changes
---
Addresses Jon's comments, mostly by way of bulking up help in the shell with warnings and pulling into HBA the javadoc that was out on the HMasterInterface. Also did stuff like make assign and unassign symmetric, both taking a force param.

Summary
---
So, things are different in the new master. Close region should close region. Not close and then reopen. To close and reopen elsewhere, that's an unassign or a move (both of which were missing from the shell but are added in this patch). I fixed the close so that it's a close that does not touch zk... the region is just closed on the regionserver. Not going to zk means the close no longer provokes complaints. Close is dangerous though, in that the region is now permanently offline (I updated the close help to explain this is so). To address it being permanently offline, I added a new assign to the shell. While in here, I removed commands that no longer make sense, such as enable_region and disable_region.

M src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Changed move implementation so you can pass an empty host. Empty host means move to a random location rather than an explicit server. Added assign, unassign.
M src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
(clearRegionPlan): Added.
M src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java
Improved move javadoc. Added assign, unassign.
M src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
Improved javadoc. Added assign and unassign.
M src/main/ruby/hbase/admin.rb Added balancer, balance_switch, assign, unassign, removed zk, enable_region and disable_region (the latter make no sense anymore now disable/enable is done differently). D src/main/ruby/shell/commands/zk.rb A src/main/ruby/shell/commands/assign.rb A src/main/ruby/shell/commands/balance_switch.rb D src/main/ruby/shell/commands/disable_region.rb A src/main/ruby/shell/commands/balancer.rb A src/main/ruby/shell/commands/unassign.rb D src/main/ruby/shell/commands/enable_region.rb A src/main/ruby/shell/commands/move.rb M src/main/ruby/shell/commands/close_region.rb Fixed up help M src/main/ruby/shell.rb Added and removed commands. This addresses bug hbase-3267. http://issues.apache.org/jira/browse/hbase-3267 Diffs (updated) - trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1038878 trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 1038878 trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1038878 trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1038878 trunk/src/main/ruby/hbase/admin.rb 1038878 trunk/src/main/ruby/shell.rb 1038878 trunk/src/main/ruby/shell/commands/assign.rb PRE-CREATION trunk/src/main/ruby/shell/commands/balance_switch.rb PRE-CREATION trunk/src/main/ruby/shell/commands/balancer.rb PRE-CREATION trunk/src/main/ruby/shell/commands/close_region.rb 1038878 trunk/src/main/ruby/shell/commands/disable_region.rb 1038878 trunk/src/main/ruby/shell/commands/enable_region.rb 1038878 trunk/src/main/ruby/shell/commands/move.rb PRE-CREATION trunk/src/main/ruby/shell/commands/unassign.rb PRE-CREATION trunk/src/main/ruby/shell/commands/zk.rb 1038878 Diff: http://review.cloudera.org/r/1250/diff Testing --- I tested shell here on my little cluster. 
Thanks, stack close_region shell command breaks region Key: HBASE-3267 URL: https://issues.apache.org/jira/browse/HBASE-3267 Project: HBase Issue Type: Bug Components: master, regionserver, shell Affects Versions: 0.90.0 Reporter: Todd Lipcon Assignee: stack Priority: Critical Fix For: 0.90.0 It used to be that you could use the close_region command from the shell to close a region on one server and have the master reassign it elsewhere. Now if you close a region, you get the following errors in the master log: 2010-11-23 00:46:34,090 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSING for region ffaa7999e909dbd6544688cc8ab303bd from server haus01.sf.cloudera.com,12020,1290501789693 but region was in the state null and not in expected PENDI 2010-11-23 00:46:34,530 DEBUG
[jira] Commented: (HBASE-3267) close_region shell command breaks region
[ https://issues.apache.org/jira/browse/HBASE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935608#action_12935608 ] HBase Review Board commented on HBASE-3267: --- Message from: Jonathan Gray jg...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1250/#review1977 --- Ship it! Thanks for making changes, this looks great. I completely understand the need for these, especially for HBCK, I guess I just think of adding things to the shell as stuff people will try to use. With all the added doc I think it's fine. We'll for sure make changes to this stuff as we see what happens in the wild with the new master. +1 for commit - Jonathan close_region shell command breaks region Key: HBASE-3267 URL: https://issues.apache.org/jira/browse/HBASE-3267 Project: HBase Issue Type: Bug Components: master, regionserver, shell Affects Versions: 0.90.0 Reporter: Todd Lipcon Assignee: stack Priority: Critical Fix For: 0.90.0 It used to be that you could use the close_region command from the shell to close a region on one server and have the master reassign it elsewhere. 
Now if you close a region, you get the following errors in the master log: 2010-11-23 00:46:34,090 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSING for region ffaa7999e909dbd6544688cc8ab303bd from server haus01.sf.cloudera.com,12020,1290501789693 but region was in the state null and not in expected PENDI 2010-11-23 00:46:34,530 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:6-0x12c537d84e10062 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/unassigned/ffaa7999e909dbd6544688cc8ab303bd 2010-11-23 00:46:34,531 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:6-0x12c537d84e10062 Retrieved 128 byte(s) of data from znode /hbase/unassigned/ffaa7999e909dbd6544688cc8ab303bd and set watcher; region=usertable,user1951957302,1290501969 2010-11-23 00:46:34,531 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=haus01.sf.cloudera.com,12020,1290501789693, region=ffaa7999e909dbd6544688cc8ab303bd 2010-11-23 00:46:34,531 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region ffaa7999e909dbd6544688cc8ab303bd from server haus01.sf.cloudera.com,12020,1290501789693 but region was in the state null and not in expected PENDIN and the region just gets stuck closed -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
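
The assign/unassign/move behaviour discussed in this review can be modelled roughly as below. This is a toy sketch, not HMaster or AssignmentManager code: the class `ToyMaster`, its maps, and every method body are hypothetical. It models only what the thread states (assign proceeds even if the region is already assigned, unassign clears region plans and regions-in-transition only when force is set, move with an empty host picks a random server). Two further points are assumptions made for the sketch: an unforced unassign of a region stuck in transition is refused, and unassign reassigns immediately.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

final class ToyMaster {
    private final List<String> servers;
    private final Random random;

    // Hypothetical in-memory state of the master.
    final Map<String, String> assignments = new HashMap<>();   // region -> server
    final Map<String, String> regionPlans = new HashMap<>();   // region -> planned server
    final Set<String> regionsInTransition = new HashSet<>();

    ToyMaster(List<String> servers, long seed) {
        this.servers = new ArrayList<>(servers);
        this.random = new Random(seed);
    }

    private String randomServer() {
        return servers.get(random.nextInt(servers.size()));
    }

    // Assigns the region even if it is already assigned, as described in the thread.
    void assign(String region) {
        assignments.put(region, randomServer());
    }

    // With force, clears the region plan and the in-transition entry first.
    // Without force, a region stuck in transition is refused (assumption).
    boolean unassign(String region, boolean force) {
        if (force) {
            regionPlans.remove(region);
            regionsInTransition.remove(region);
        } else if (regionsInTransition.contains(region)) {
            return false;
        }
        assignments.remove(region);
        assign(region); // close and reopen somewhere, modelled as an immediate reassign
        return true;
    }

    // An empty host means "move to a random location rather than an explicit server".
    void move(String region, String host) {
        String destination = host.isEmpty() ? randomServer() : host;
        assignments.put(region, destination);
    }

    public static void main(String[] args) {
        ToyMaster master = new ToyMaster(Arrays.asList("rs1", "rs2", "rs3"), 1L);
        master.move("regionA", "rs2");
        master.regionsInTransition.add("regionA"); // simulate a stuck region
        System.out.println("unforced unassign accepted: " + master.unassign("regionA", false));
        System.out.println("forced unassign accepted: " + master.unassign("regionA", true));
        System.out.println("regionA hosted on: " + master.assignments.get("regionA"));
    }
}
```

The point of the model is the asymmetry questioned in the review: only the forced path touches the plan and in-transition state, so it is the one that can undo a stuck region.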
[jira] Commented: (HBASE-3227) Edit of log messages before branching...
[ https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934996#action_12934996 ] HBase Review Board commented on HBASE-3227: --- Message from: st...@duboce.net bq. On 2010-11-22 17:29:45, Nicolas wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 739 bq. http://review.cloudera.org/r/1212/diff/1/?file=17170#file17170line739 bq. bq. I'd suggest keeping the store name in this debug message since we're considering thread pools for compactions... Won't the store name be part of the path on the next line when we do sf.toString() where sf is the file we're compacting all into? - stack --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1212/#review1971 --- Edit of log messages before branching... Key: HBASE-3227 URL: https://issues.apache.org/jira/browse/HBASE-3227 Project: HBase Issue Type: Improvement Reporter: stack Fix For: 0.90.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3227) Edit of log messages before branching...
[ https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934688#action_12934688 ] HBase Review Board commented on HBASE-3227: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1212/#review1971 --- trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java http://review.cloudera.org/r/1212/#comment6227 I'd suggest keeping the store name in this debug message since we're considering thread pools for compactions... - Nicolas Edit of log messages before branching... Key: HBASE-3227 URL: https://issues.apache.org/jira/browse/HBASE-3227 Project: HBase Issue Type: Improvement Reporter: stack Fix For: 0.90.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions
[ https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934033#action_12934033 ]

HBase Review Board commented on HBASE-2001:
---
Message from: Mingjie Lai mjla...@gmail.com

---
This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/876/
---

(Updated 2010-11-19 14:39:18.378849)

Review request for hbase, stack, Andrew Purtell, and Jonathan Gray.

Changes
---
Final patch, ready to be checked in:
- Truncated white space at line end
- Rebuilt the patch after HBase-2002 checked in.

Summary
---
The diff actually contains 2 separate patches: HBase-2001 and the one for (HBASE-2002+HBASE-2321). The reason is that HBase-2001's CommandTarget relies on HBASE-2002 + HBASE-2321, whose patches are still under review. I have to include Gary's HBASE-2002 and HBASE-2321 with this diff, since reviewboard is so powerful :) that it does not allow my diff to be based on a patch that has not been checked in. Eventually the patch here should be committed after 2001 and 2321. I will make another patch after they get checked in.

Both HBase-2001 and the dynamic RPC stuff are quite big patches. The total number of lines is more than 7k. I went back and forth, but still don't have a good idea of how to split the patch to reduce the review pain. So for now I'm putting up the whole patch for all 3 issues.
Here is the list of files that relate only to coprocessors: src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorException.java src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationEndpoint.java src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationProtocol.java src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java == (Here is a brief description. Much more detail is in package-info.java in the diff. I also posted package-info.html to https://issues.apache.org/jira/browse/HBASE-2001 as an attachment.) Coprocessors are code that runs in-process on each region server. Regions contain references to the coprocessor implementation classes associated with them. Coprocessor classes will be loaded either from local jars on the region server's classpath or via the HDFS classloader. Several types of coprocessors are provided to give sufficient flexibility for potential use cases. Right now there are: * Coprocessor: provides region lifecycle management hooks, e.g., region open/close/split/flush/compact operations. 
* RegionObserver: provides hooks for monitoring table operations from the client side, such as table get/put/scan/delete, etc. * Endpoint: provides on-demand triggers for any arbitrary function executed at a region. One use case is column aggregation at the region server. Coprocessor: A coprocessor is required to implement the Coprocessor interface so that the coprocessor framework can manage it internally. Another design goal of this interface is to provide simple features for making coprocessors useful, while exposing no more internal state or control actions of the region server than necessary and not exposing them directly. RegionObserver: If the coprocessor implements the RegionObserver interface it can observe and mediate client actions on the region. Endpoint: Coprocessor and RegionObserver provide certain hooks for injecting user code running at each region. This code will be triggered by existing HTable and HBaseAdmin operations at certain hook points. Through Endpoint and the dynamic RPC protocol, you can define your own interface for communication between client and region server, i.e., you can create a new method and specify the passed parameters and return types for the method. The new Endpoint methods can be triggered by calling client-side dynamic RPC functions -- HTable.exec(...). Coprocessor loading: A customized coprocessor can be loaded in two different ways, by configuration, or by
[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions
[ https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934039#action_12934039 ] HBase Review Board commented on HBASE-2001: --- Message from: Andrew Purtell apurt...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/876/#review1961 --- Ship it! Will commit after running unit tests and verifying all pass. - Andrew Coprocessors: Colocate user code with regions - Key: HBASE-2001 URL: https://issues.apache.org/jira/browse/HBASE-2001 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Mingjie Lai Fix For: 0.92.0 Attachments: asm-transformations.pdf, HBase-2001-final.patch, HBASE-2001-RegionObserver-2.patch, HBASE-2001-RegionObserver.patch, HBASE-2001.patch.gz, packge-info.html, packge-info.html, packge-info.html Support user code that runs next to each region in a table. As regions split and move, coprocessor code should automatically move also. Use a classloader which looks on HDFS. Associate a list of classes to load with each table. Put this in HRI so it inherits from the table but can be changed on a per-region basis (so that those region-specific changes can be inherited by daughters). Not completely arbitrary code; it should require implementation of an interface with callbacks for: * Open * Close * Split * Compact * (Multi)get and scanner next() * (Multi)put * (Multi)delete Add a method to HTableInterface for invoking coprocessor methods and retrieving results. Add methods in o.a.h.h.regionserver or a subpackage which implement convenience functions for coprocessor methods and consistent/controlled access to internals: store access, threading, persistent and ephemeral state, scratch storage, etc. GitHub: https://github.com/trendmicro/hbase/tree/coprocessor Please see the latest attached package-info.html for updated description. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
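The observer-style callbacks discussed in this thread can be pictured with a small, self-contained Java sketch. It deliberately does not use the HBase API: `SketchObserver`, `SketchRegion`, `CountingObserver`, and the `beforePut`/`afterPut` hook names are hypothetical stand-ins, and calling several observers in registration order is an assumption of the sketch, not something the patch is known to do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical observer interface for illustration; NOT the HBase RegionObserver API.
interface SketchObserver {
    // Runs before a put is applied; returning false vetoes the put.
    boolean beforePut(String row, String value);

    // Runs after a put has been applied.
    void afterPut(String row, String value);
}

// Hypothetical stand-in for a region that hosts user-supplied observers.
class SketchRegion {
    private final Map<String, String> rows = new HashMap<>();
    private final List<SketchObserver> observers = new ArrayList<>();

    void register(SketchObserver observer) {
        observers.add(observer);
    }

    // Applies the put only if every registered observer allows it.
    boolean put(String row, String value) {
        for (SketchObserver o : observers) {
            if (!o.beforePut(row, value)) {
                return false;
            }
        }
        rows.put(row, value);
        for (SketchObserver o : observers) {
            o.afterPut(row, value);
        }
        return true;
    }

    String get(String row) {
        return rows.get(row);
    }
}

// Example observer: rejects empty row keys and counts applied puts.
class CountingObserver implements SketchObserver {
    int applied = 0;

    @Override
    public boolean beforePut(String row, String value) {
        return !row.isEmpty();
    }

    @Override
    public void afterPut(String row, String value) {
        applied++;
    }
}

class ObserverSketchDemo {
    public static void main(String[] args) {
        SketchRegion region = new SketchRegion();
        CountingObserver observer = new CountingObserver();
        region.register(observer);
        region.put("row1", "v1"); // accepted
        region.put("", "v2");     // vetoed by beforePut
        System.out.println("applied puts: " + observer.applied);
    }
}
```

The point of the sketch is only the shape of the contract: user code implements an interface, and the hosting region invokes it at fixed hook points around an operation.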
[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions
[ https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12933068#action_12933068 ] HBase Review Board commented on HBASE-2001: --- Message from: Andrew Purtell apurt...@apache.org bq. On 2010-11-15 16:51:18, stack wrote: bq. +1 on commit to TRUNK. I think all below can be cleaned up on commit (Andrew, you going to commit?) Stack, Yes I plan to commit the patches for HBASE-2001/HBASE-2002/HBASE-2321 onto trunk this week. The dynamic RPC and coprocessor framework changes are largely independent and will go in separately to make the change history in the commit log more informative. We will address your comments before doing so. - Andrew --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/876/#review1930 ---
[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions
[ https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932285#action_12932285 ] HBase Review Board commented on HBASE-2001: --- Message from: st...@duboce.net bq. On 2010-10-05 23:10:58, stack wrote: bq. src/main/java/org/apache/hadoop/hbase/client/Action.java, line 30 bq. http://review.cloudera.org/r/876/diff/7/?file=14158#file14158line30 bq. bq. I took a look at the package-info.html. Very nice doc. One thought though was that the batch methods do not seem to be instrumented. Are they? The bulk of inserts are done by multiput now. bq. bq. Maybe link to the wiki page when you say this in package-info.html 'implement role-based access control for HBase' bq. bq. Fix this 'These code will be triggerd with existing...' bq. bq. BaseRegionObserver as the name of the class that implements BOTH Coprocessor and RegionObserver with sensible defaults seems off... it'd make sense as the name of an implemenation of RegionObserver but not of both. Is there a better name to give it -- even BaseRegionObserverCoprocessor? Unless BaseObserver already implements Coprocessor? bq. bq. Should this also say that methods can be new also? '...i.e., you can specify new passed parameters and return types for a method. ' bq. bq. CommandTarget is a strange name for an host of arbitrary user-designed methods. Can we come up w/ something more telling? Notions that come to mind are Substrate, Platform -- i.e. stuff you build up on. bq. bq. Minor.. fix '...the actually implemention class running...' bq. bq. Fix this '...How is the client side example of calling...' bq. bq. The example is missing a bit of code that would help along its illustration a few comments would help too but this is a minor criticism. Not important. I get the gist (Folks interested in CP need to start with this page -- it makes grokking the code the easier). bq. bq. This page would seem to indicate CPs can be chained. 
Am I reading that wrong? (See 'Load from configuration') Over in Gary review, he was saying on CP per region only. bq. bq. bq. Usually attribute names are upper-cased. Here we have 'Coprocessor$1' (that $1 is intentional right?) bq. bq. This functionality, if it's working, is amazing. bq. bq. bq. bq. bq. Mingjie Lai wrote: bq. @stack: bq. I didn't realize you posted a comment until last week, since your comments here didn't get pushed to jira, neither emails sent to d...@hbase. bq. bq. Thanks for your comments. I will address them very soon. But before that I'd like to finalize the name of ``CommandTarget'': bq. bq. You said, ``CommandTarget is a strange name for an host of arbitrary user-designed methods. Can we come up w/ something more telling? Notions that come to mind are Substrate, Platform -- i.e. stuff you build up on.'' bq. bq. Some of us suggested to use ``Endpoint'' instead of CommandTarget. Do you like it better? (I'm not really good at naming stuff) bq. bq. After finalizing the name, I will make the changes to both source code and package-info. And post a patch here. bq. bq. Thanks, bq. Mingjie bq. I'm not good at naming either. Endpoint seems more 'generic', less loaded than 'CommandTarget'. If you fellas working with this stuff think that a better name then that's good by me. - stack --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/876/#review1438 ---
[jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions
[ https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932292#action_12932292 ] HBase Review Board commented on HBASE-2001: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/876/#review1930 --- Ship it! +1 on commit to TRUNK. I think all below can be cleaned up on commit (Andrew, you going to commit?) src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java http://review.cloudera.org/r/876/#comment6139 Check in here. Looks like tabs? review board reporting it as whitespace. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java http://review.cloudera.org/r/876/#comment6140 Usually in hbase code base there are spaces around operations; e.g. around '+'. src/main/java/org/apache/hadoop/hbase/client/coprocessor/ExecResult.java http://review.cloudera.org/r/876/#comment6142 Be careful. In hbase lines are 80 characters long normally. Fix on commit? src/main/java/org/apache/hadoop/hbase/client/coprocessor/ExecResult.java http://review.cloudera.org/r/876/#comment6143 I think its ok if these lines 80 characters src/main/java/org/apache/hadoop/hbase/client/coprocessor/package-info.java http://review.cloudera.org/r/876/#comment6144 Excellent src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java http://review.cloudera.org/r/876/#comment6146 Lots of white space in here. - stack
[jira] Commented: (HBASE-2002) Coprocessors: Client side support
[ https://issues.apache.org/jira/browse/HBASE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932296#action_12932296 ] HBase Review Board commented on HBASE-2002: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/816/#review1933 --- Ship it! I did a quick pass over this. Most of it I'd seen already over in the Mingjie patch. I'm +1 on getting it into TRUNK now, early in the release cycle, so probs. surface before release (You going to commit Andrew?). - stack Coprocessors: Client side support - Key: HBASE-2002 URL: https://issues.apache.org/jira/browse/HBASE-2002 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Gary Helmling Fix For: 0.92.0 High-level call interface for clients. Unlike RPC, calls are addressed to rows or ranges of rows. Coprocessor client library resolves to actual locations. Calls across multiple rows are automatically split into multiple parallelized RPCs. Generic multicall RPC facility which incorporates this and multiget/multiput/multidelete and parallel scanners. Group and batch RPCs by region server. Track and retry outstanding RPCs. Ride over region relocations. Support addressing by explicit region identifier or by row key or row key range. Include a facility for merging results client side. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3235) Intermittent incrementColumnValue failure in TestHRegion
[ https://issues.apache.org/jira/browse/HBASE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932357#action_12932357 ] HBase Review Board commented on HBASE-3235: --- Message from: Gary Helmling ghelml...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1224/ --- Review request for hbase and Ryan Rawson. Summary --- Fix for MemStore.upsert(KeyValue) to start the kvset.tailSet() of potential KVs to remove at the beginning of entries for the row/family/qualifier combination, ignoring timestamp to prevent Puts being skipped based on timestamp alone and masking the ICV. This addresses bug HBASE-3235. http://issues.apache.org/jira/browse/HBASE-3235 Diffs - src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java b7409b0 src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java 7640997 Diff: http://review.cloudera.org/r/1224/diff Testing --- Added a new test: TestHRegion.testIncrementColumnValue_UpdatingInPlace_TimestampClobber() to recreate the existing failure condition: 1) put to a row/family/qualifier, 2) ICV to the same row/family/qualifier with the same timestamp. This test fails consistently without the patch to MemStore. With the patch to MemStore, the new test case consistently passes. I also ran TestHRegion 15+ times and saw no more intermittent failures of testIncrementColumnValue_UpdatingInPlace(). Previously this was failing every 5 or so test runs, so this seems a pretty good indication it's fixed. I also ran through the full test suite on 0.90 and all passed except for an error in TestHLog... 
Thanks, Gary Intermittent incrementColumnValue failure in TestHRegion Key: HBASE-3235 URL: https://issues.apache.org/jira/browse/HBASE-3235 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.0 Reporter: Gary Helmling I first saw this in a Hudson build, but can reproduce locally with enough test runs (5-10 times): {noformat} --- Test set: org.apache.hadoop.hbase.regionserver.TestHRegion --- Tests run: 51, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 39.413 sec FAILURE! testIncrementColumnValue_UpdatingInPlace(org.apache.hadoop.hbase.regionserver.TestHRegion) Time elapsed: 0.079 sec FAILURE! junit.framework.AssertionFailedError: expected:1 but was:2 at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:195) at junit.framework.Assert.assertEquals(Assert.java:201) at org.apache.hadoop.hbase.regionserver.TestHRegion.testIncrementColumnValue_UpdatingInPlace(TestHRegion.java:1889) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {noformat} Alternately, the failure can also show up in testIncrementColumnValue_UpdatingInPlace_Negative(): {noformat} testIncrementColumnValue_UpdatingInPlace_Negative(org.apache.hadoop.hbase.regionserver.TestHRegion) Time elapsed: 0.03 sec FAILURE! 
junit.framework.AssertionFailedError: expected:2 but was:3 at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:130) at junit.framework.Assert.assertEquals(Assert.java:136) at org.apache.hadoop.hbase.regionserver.TestHRegion.assertICV(TestHRegion.java:2081) at org.apache.hadoop.hbase.regionserver.TestHRegion.testIncrementColumnValue_UpdatingInPlace_Negative(TestHRegion.java:1990) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
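The idea behind the MemStore.upsert fix, starting the scan at the first entry for a row/family/qualifier regardless of timestamp, can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the patched HBase code: `SketchCell`, `UpsertSketch`, and the ordering (column ascending, then newest timestamp first) are hypothetical, and the sketch simply drops every older entry for the column, which the real method may not do. It also does not reproduce the exact failure from the test report.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical simplified cell; NOT the real HBase KeyValue.
final class SketchCell {
    final String column;  // stands in for the row/family/qualifier combination
    final long timestamp;
    final long value;

    SketchCell(String column, long timestamp, long value) {
        this.column = column;
        this.timestamp = timestamp;
        this.value = value;
    }
}

class UpsertSketch {
    // Assumed ordering: column ascending, then timestamp descending (newest first).
    static final Comparator<SketchCell> ORDER = (a, b) -> {
        int c = a.column.compareTo(b.column);
        return c != 0 ? c : Long.compare(b.timestamp, a.timestamp);
    };

    // Replaces every existing entry for the cell's column with the new cell.
    static void upsert(ConcurrentSkipListSet<SketchCell> set, SketchCell cell) {
        // Probe key that sorts at or before any entry for this column,
        // so the scan does not depend on the new cell's timestamp.
        SketchCell first = new SketchCell(cell.column, Long.MAX_VALUE, 0L);
        Iterator<SketchCell> it = set.tailSet(first, true).iterator();
        while (it.hasNext()) {
            SketchCell cur = it.next();
            if (!cur.column.equals(cell.column)) {
                break; // moved past this column
            }
            it.remove();
        }
        set.add(cell);
    }

    // Returns the value a reader would see first for the column, or -1 if absent.
    static long read(ConcurrentSkipListSet<SketchCell> set, String column) {
        SketchCell first = new SketchCell(column, Long.MAX_VALUE, 0L);
        SketchCell hit = set.ceiling(first);
        return (hit != null && hit.column.equals(column)) ? hit.value : -1L;
    }

    public static void main(String[] args) {
        ConcurrentSkipListSet<SketchCell> set = new ConcurrentSkipListSet<>(ORDER);
        set.add(new SketchCell("row/f:q", 10L, 1L));    // earlier put with a newer timestamp
        upsert(set, new SketchCell("row/f:q", 5L, 2L)); // increment result with an older timestamp
        System.out.println(read(set, "row/f:q"));       // prints 2
    }
}
```

Had the scan started at the new cell's own position instead of the probe key, the entry with timestamp 10 would sort ahead of it, survive the upsert, and be the first thing a reader sees.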
[jira] Commented: (HBASE-3232) Fix KeyOnlyFilter + Add Value Length
[ https://issues.apache.org/jira/browse/HBASE-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931962#action_12931962 ] HBase Review Board commented on HBASE-3232: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1213/ --- (Updated 2010-11-14 18:28:50.196433) Review request for hbase. Changes --- Because I didn't implement write/readFields for KeyOnlyFilter when I added the param, client - server serialization didn't work and the default value of false was always used. Fixed + added associated unit test Summary --- HBASE-3211 altered filter code to mutate KeyValues. What could go wrong? Well, your scan could mess up because the KVHeap compare functions don't work properly. If we're going to soft mutate KVs in filter code, we also need to soft copy the KV before filtering. This was found while adding the ability to have KeyOnlyFilter have the option to return the Value's length. This is useful for grouping your reduce tasks into equal-sized blocks. This addresses bug HBASE-3232. 
http://issues.apache.org/jira/browse/HBASE-3232 Diffs (updated) - trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1034646 trunk/src/main/java/org/apache/hadoop/hbase/filter/KeyOnlyFilter.java 1034646 trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1034646 trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 1034646 trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1034646 trunk/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java 1034646 Diff: http://review.cloudera.org/r/1213/diff Testing --- mvn clean test Thanks, Nicolas Fix KeyOnlyFilter + Add Value Length Key: HBASE-3232 URL: https://issues.apache.org/jira/browse/HBASE-3232 Project: HBase Issue Type: Bug Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Blocker Fix For: 0.90.0 HBASE-3211 altered filter code to mutate KeyValues. What could go wrong? Well, your scan could mess up because the KVHeap compare functions don't work properly. If we're going to soft mutate KVs in filter code, we also need to soft copy the KV before filtering. This was found while adding the ability to have KeyOnlyFilter have the option to return the Value's length. This is useful for grouping your reduce tasks into equal-sized blocks. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
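The serialization problem mentioned under "Changes" (a new parameter that was never written or read, so the server always saw the default) is easy to picture with a self-contained round trip. The sketch below uses only java.io; `SketchKeyOnlyFilter` and the field name `lenAsVal` are hypothetical and do not claim to match the real KeyOnlyFilter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical filter with one option; NOT the real KeyOnlyFilter.
class SketchKeyOnlyFilter {
    private boolean lenAsVal; // defaults to false

    SketchKeyOnlyFilter() { }

    SketchKeyOnlyFilter(boolean lenAsVal) {
        this.lenAsVal = lenAsVal;
    }

    boolean isLenAsVal() {
        return lenAsVal;
    }

    // If the field is not written here and read back below, the receiving
    // side only ever sees the default value.
    void write(DataOutput out) throws IOException {
        out.writeBoolean(lenAsVal);
    }

    void readFields(DataInput in) throws IOException {
        lenAsVal = in.readBoolean();
    }

    // Simulates the client-to-server hop through a byte array.
    static SketchKeyOnlyFilter roundTrip(SketchKeyOnlyFilter original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            SketchKeyOnlyFilter copy = new SketchKeyOnlyFilter();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new SketchKeyOnlyFilter(true)).isLenAsVal()); // prints true
    }
}
```

A round-trip unit test of this shape is the kind of check that would catch the missing field, since the in-process object looks correct until it crosses the wire.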
[jira] Commented: (HBASE-3232) Fix KeyOnlyFilter + Add Value Length
[ https://issues.apache.org/jira/browse/HBASE-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931967#action_12931967 ] HBase Review Board commented on HBASE-3232: --- Message from: Ryan Rawson ryano...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1213/#review1922 --- looks great, i just committed it - Ryan
[jira] Commented: (HBASE-3232) Fix KeyOnlyFilter + Add Value Length
[ https://issues.apache.org/jira/browse/HBASE-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931630#action_12931630 ] HBase Review Board commented on HBASE-3232: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1213/ --- Review request for hbase. Summary --- HBASE-3211 altered filter code to mutate KeyValues. What could go wrong? Well, your scan could mess up because the KVHeap compare functions don't work properly. If we're going to soft mutate KVs in filter code, we also need to soft copy the KV before filtering. This was found while adding the ability to have KeyOnlyFilter have the option to return the Value's length. This is useful for grouping your reduce tasks into equal-sized blocks. This addresses bug HBASE-3232. http://issues.apache.org/jira/browse/HBASE-3232 Diffs - trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java 1034646 trunk/src/main/java/org/apache/hadoop/hbase/filter/KeyOnlyFilter.java 1034646 trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 1034646 trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 1034646 trunk/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java 1034646 Diff: http://review.cloudera.org/r/1213/diff Testing --- mvn clean test Thanks, Nicolas Fix KeyOnlyFilter + Add Value Length Key: HBASE-3232 URL: https://issues.apache.org/jira/browse/HBASE-3232 Project: HBase Issue Type: Bug Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Blocker Fix For: 0.90.0 HBASE-3211 altered filter code to mutate KeyValues. What could go wrong? Well, your scan could mess up because the KVHeap compare functions don't work properly. If we're going to soft mutate KVs in filter code, we also need to soft copy the KV before filtering. This was found while adding the ability to have KeyOnlyFilter have the option to return the Value's length. 
This is useful for grouping your reduce tasks into equal-sized blocks. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
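The "copy before filtering" point in the summary can be illustrated without HBase: the transformation returns a new object and leaves the original untouched, so anything still comparing the original sees consistent data. In this sketch `SketchKv`, `KeyOnlySketch`, and the `lenAsVal` flag are hypothetical; it makes a defensive copy of the key, whereas the "soft copy" in the thread may well be a shallower copy, and the 4-byte big-endian length encoding is an assumption.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical simplified key/value pair; NOT the real HBase KeyValue.
final class SketchKv {
    final byte[] key;
    final byte[] value;

    SketchKv(byte[] key, byte[] value) {
        this.key = key;
        this.value = value;
    }
}

class KeyOnlySketch {
    // Returns a new SketchKv and leaves the input untouched, so code that
    // still compares or orders the original sees consistent data.
    static SketchKv toKeyOnly(SketchKv original, boolean lenAsVal) {
        byte[] keyCopy = Arrays.copyOf(original.key, original.key.length);
        byte[] newValue = lenAsVal
            ? ByteBuffer.allocate(4).putInt(original.value.length).array() // 4-byte big-endian length
            : new byte[0];
        return new SketchKv(keyCopy, newValue);
    }

    public static void main(String[] args) {
        SketchKv kv = new SketchKv(new byte[] {1, 2}, new byte[] {9, 9, 9});
        SketchKv keyOnly = toKeyOnly(kv, true);
        System.out.println(ByteBuffer.wrap(keyOnly.value).getInt()); // prints 3
        System.out.println(kv.value.length);                         // prints 3, original unchanged
    }
}
```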
[jira] Commented: (HBASE-3227) Edit of log messages before branching...
[ https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931095#action_12931095 ] HBase Review Board commented on HBASE-3227: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1212/ --- Review request for hbase, Jean-Daniel Cryans and Jonathan Gray. Summary --- Removed redundancy, corrected some of the english in log messages, changed at least one to DEBUG. This addresses bug hbase-3227. http://issues.apache.org/jira/browse/hbase-3227 Diffs - trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 1033977 trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 1033977 trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1033979 trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1033977 Diff: http://review.cloudera.org/r/1212/diff Testing --- Thanks, stack Edit of log messages before branching... Key: HBASE-3227 URL: https://issues.apache.org/jira/browse/HBASE-3227 Project: HBase Issue Type: Improvement Reporter: stack Fix For: 0.90.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3227) Edit of log messages before branching...
[ https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931096#action_12931096 ] HBase Review Board commented on HBASE-3227: --- Message from: Jean-Daniel Cryans jdcry...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1212/#review1911 --- trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java http://review.cloudera.org/r/1212/#comment6126 I still don't know what this message means :( - Jean-Daniel Edit of log messages before branching... Key: HBASE-3227 URL: https://issues.apache.org/jira/browse/HBASE-3227 Project: HBase Issue Type: Improvement Reporter: stack Fix For: 0.90.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3227) Edit of log messages before branching...
[ https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931097#action_12931097 ] HBase Review Board commented on HBASE-3227: --- Message from: Jean-Daniel Cryans jdcry...@apache.org bq. On 2010-11-11 09:31:16, Jean-Daniel Cryans wrote: bq. Ooops meant to say, +1 - Jean-Daniel --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1212/#review1911 --- Edit of log messages before branching... Key: HBASE-3227 URL: https://issues.apache.org/jira/browse/HBASE-3227 Project: HBase Issue Type: Improvement Reporter: stack Fix For: 0.90.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3227) Edit of log messages before branching...
[ https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12931106#action_12931106 ] HBase Review Board commented on HBASE-3227: --- Message from: Jonathan Gray jg...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1212/#review1913 --- Ship it! lgtm - Jonathan Edit of log messages before branching... Key: HBASE-3227 URL: https://issues.apache.org/jira/browse/HBASE-3227 Project: HBase Issue Type: Improvement Reporter: stack Fix For: 0.90.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3161) Provide option for Stargate to only serve GET requests
[ https://issues.apache.org/jira/browse/HBASE-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930646#action_12930646 ] HBase Review Board commented on HBASE-3161: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1204/#review1885 --- Ship it! This patch looks great to me. Andrew or Ted, you want to take a look? Bennett do all hbase tests pass locally for you? - stack Provide option for Stargate to only serve GET requests -- Key: HBASE-3161 URL: https://issues.apache.org/jira/browse/HBASE-3161 Project: HBase Issue Type: Improvement Components: rest Affects Versions: 0.20.6 Reporter: Ted Yu Provide option for Stargate to only serve GET requests. Hbase health check can utilize this option. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3161) Provide option for Stargate to only serve GET requests
[ https://issues.apache.org/jira/browse/HBASE-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930662#action_12930662 ] HBase Review Board commented on HBASE-3161: --- Message from: Ted Yu ted...@yahoo.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1204/#review1886 --- Ship it! - Ted Provide option for Stargate to only serve GET requests -- Key: HBASE-3161 URL: https://issues.apache.org/jira/browse/HBASE-3161 Project: HBase Issue Type: Improvement Components: rest Affects Versions: 0.20.6 Reporter: Ted Yu Provide option for Stargate to only serve GET requests. Hbase health check can utilize this option. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3211) Key (Index) Only Fetches
[ https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930830#action_12930830 ] HBase Review Board commented on HBASE-3211: --- Message from: Kannan Muthukkaruppan kan...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1208/#review1893 --- Ship it! Neat-O! Thanks for cranking this out so quickly. - Kannan Key (Index) Only Fetches Key: HBASE-3211 URL: https://issues.apache.org/jira/browse/HBASE-3211 Project: HBase Issue Type: Improvement Reporter: Kannan Muthukkaruppan Assignee: Jonathan Gray When you retrieve data from HBase you get Key (Row+Column+Timestamp) + Values. It would be nice to have a mode where we only fetch the keys (i.e. the index) but not the values. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3211) Key (Index) Only Fetches
[ https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930831#action_12930831 ] HBase Review Board commented on HBASE-3211: --- Message from: Jonathan Gray jg...@apache.org bq. On 2010-11-10 15:24:52, stack wrote: bq. Looks fine to me. That kv copy is ugly but what else can you do? Definitely can't modify the original buffer, so it's the only choice. In this case, it's not a huge deal because we'll do these allocations, return the result, and then immediately be done with the memory and will have no references to it. Should be okay on GC. One potential optimization would be to do one big rewrite of the KVs at the end rather than as we go. Instead of allocating an individual byte[] for each KV, you could potentially do one big byte[] behind all the key-only KVs. This gets way more complicated and I'm not sure it's worth it. Was going for a minimal approach. In the filter unit test, I'm also going to add an additional assert on commit (and verify that it still passes). The test verifies the values are not the same but we should also explicitly assert that the value is 0 length. Thanks! - Jonathan --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1208/#review1890 --- Key (Index) Only Fetches Key: HBASE-3211 URL: https://issues.apache.org/jira/browse/HBASE-3211 Project: HBase Issue Type: Improvement Reporter: Kannan Muthukkaruppan Assignee: Jonathan Gray When you retrieve data from HBase you get Key (Row+Column+Timestamp) + Values. It would be nice to have a mode where we only fetch the keys (i.e. the index) but not the values. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
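The copy-instead-of-mutate approach discussed above can be illustrated with a minimal sketch. It is not the code from the patch: it assumes a simplified serialized layout of [int key length][int value length][key bytes][value bytes], and the class and method names (KeyOnlySketch, serialize, toKeyOnly, valueLength) are hypothetical. The original buffer is never written to; a new byte[] is allocated that carries the key and a value length of 0.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class KeyOnlySketch {
    // Assumed layout (a simplification for illustration):
    // [int keyLength][int valueLength][key bytes][value bytes]
    public static byte[] serialize(byte[] key, byte[] value) {
        ByteBuffer buf = ByteBuffer.allocate(8 + key.length + value.length);
        buf.putInt(key.length).putInt(value.length).put(key).put(value);
        return buf.array();
    }

    // Builds a new buffer holding only the key, with value length 0.
    // The input array is read but never modified.
    public static byte[] toKeyOnly(byte[] kv) {
        int keyLength = ByteBuffer.wrap(kv).getInt();
        byte[] out = new byte[8 + keyLength];
        ByteBuffer.wrap(out).putInt(keyLength).putInt(0).put(kv, 8, keyLength);
        return out;
    }

    // Reads the value-length field at offset 4.
    public static int valueLength(byte[] kv) {
        return ByteBuffer.wrap(kv).getInt(4);
    }

    public static void main(String[] args) {
        byte[] original = serialize(
            "row1/cf:q/123".getBytes(StandardCharsets.UTF_8),
            "payload".getBytes(StandardCharsets.UTF_8));
        byte[] keyOnly = toKeyOnly(original);
        System.out.println("original value length: " + valueLength(original));
        System.out.println("key-only value length: " + valueLength(keyOnly));
    }
}
```

The per-KV allocation here corresponds to the "minimal approach" in the message; the single shared byte[] mentioned as a possible optimization would replace the `new byte[8 + keyLength]` with a slice of one larger buffer.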
[jira] Commented: (HBASE-3211) Key (Index) Only Fetches
[ https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930833#action_12930833 ] HBase Review Board commented on HBASE-3211: --- Message from: Jonathan Gray jg...@apache.org bq. On 2010-11-10 15:25:07, Ryan Rawson wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1199 bq. http://review.cloudera.org/r/1208/diff/1/?file=17147#file17147line1199 bq. bq. I'm torn here, this is making the implementation easy, but KeyValues have been immutable to date. While no one shares KeyValues between threads or scanners, and ideally no one should, this seems dangerous. bq. bq. It doesn't actually touch the original byte[] so does not actually destroy/mutate the underlying data in any way. Agreed it's still potentially dangerous but that's why I've added the nice warning message in javadoc :) - Jonathan --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1208/#review1891 --- Key (Index) Only Fetches Key: HBASE-3211 URL: https://issues.apache.org/jira/browse/HBASE-3211 Project: HBase Issue Type: Improvement Reporter: Kannan Muthukkaruppan Assignee: Jonathan Gray When you retrieve data from HBase you get Key (Row+Column+Timestamp) + Values. It would be nice to have a mode where we only fetch the keys (i.e. the index) but not the values. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3211) Key (Index) Only Fetches
[ https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930849#action_12930849 ] HBase Review Board commented on HBASE-3211: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1208/#review1896 --- trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java http://review.cloudera.org/r/1208/#comment6119 Would it be more straightforward to have a ReturnCode.MODIFY, that signals an include but suggests that the program must call modifyKeyValue() to get the transformed data. Maybe this is too much of a one-off case... - Nicolas Key (Index) Only Fetches Key: HBASE-3211 URL: https://issues.apache.org/jira/browse/HBASE-3211 Project: HBase Issue Type: Improvement Reporter: Kannan Muthukkaruppan Assignee: Jonathan Gray Fix For: 0.90.0 Attachments: HBASE-3211-v2.patch, HBASE-3211-v3.patch When you retrieve data from HBase you get Key (Row+Column+Timestamp) + Values. It would be nice to have a mode where we only fetch the keys (i.e. the index) but not the values. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3211) Key (Index) Only Fetches
[ https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930852#action_12930852 ] HBase Review Board commented on HBASE-3211: --- Message from: Jonathan Gray jg...@apache.org bq. On 2010-11-10 16:01:22, Nicolas wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1199 bq. http://review.cloudera.org/r/1208/diff/1/?file=17147#file17147line1199 bq. bq. Would it be more straightforward to have a ReturnCode.MODIFY, that signals an include but suggests that the program must call modifyKeyValue() to get the transformed data. Maybe this is too much of a one-off case... Not sure I completely follow. You're saying the modification would happen outside the filter? No one needs to call modifyKeyValue() to get the transformed data, it's done in the filter. In any case, yeah, I would not be for adding another ReturnCode just for this. - Jonathan --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1208/#review1896 --- Key (Index) Only Fetches Key: HBASE-3211 URL: https://issues.apache.org/jira/browse/HBASE-3211 Project: HBase Issue Type: Improvement Reporter: Kannan Muthukkaruppan Assignee: Jonathan Gray Fix For: 0.90.0 Attachments: HBASE-3211-v2.patch, HBASE-3211-v3.patch When you retrieve data from HBase you get Key (Row+Column+Timestamp) + Values. It would be nice to have a mode where we only fetch the keys (i.e. the index) but not the values. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3211) Key (Index) Only Fetches
[ https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930880#action_12930880 ] HBase Review Board commented on HBASE-3211: --- Message from: Nicolas nspiegelb...@facebook.com bq. On 2010-11-10 16:01:22, Nicolas wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1199 bq. http://review.cloudera.org/r/1208/diff/1/?file=17147#file17147line1199 bq. bq. Would it be more straightforward to have a ReturnCode.MODIFY, that signals an include but suggests that the program must call modifyKeyValue() to get the transformed data. Maybe this is too much of a one-off case... bq. bq. Jonathan Gray wrote: bq. Not sure I completely follow. You're saying the modification would happen outside the filter? No one needs to call modifyKeyValue() to get the transformed data, it's done in the filter. bq. bq. In any case, yeah, I would not be for adding another ReturnCode just for this. I suggested this alternative because users normally expect filters to do immutable operations on the data itself, and you're introducing side effects. If we stay with this paradigm, it's probably best to add a note in Filter.filterKeyValue() that the KeyValue may be modified. - Nicolas --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1208/#review1896 --- Key (Index) Only Fetches Key: HBASE-3211 URL: https://issues.apache.org/jira/browse/HBASE-3211 Project: HBase Issue Type: Improvement Reporter: Kannan Muthukkaruppan Assignee: Jonathan Gray Fix For: 0.90.0 Attachments: HBASE-3211-v2.patch, HBASE-3211-v3.patch When you retrieve data from HBase you get Key (Row+Column+Timestamp) + Values. It would be nice to have a mode where we only fetch the keys (i.e. the index) but not the values. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3223) Get VersionInfo for Running HBase Process
[ https://issues.apache.org/jira/browse/HBASE-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930881#action_12930881 ] HBase Review Board commented on HBASE-3223: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1209/ --- Review request for hbase. Summary --- bin/hbase VersionInfo is a great existing utility to provide version info about HBase jar files. Unfortunately, there is currently no way to get this information for the running process. For this jira, add an easy/quick way to verify the rev of the running jar. We recently got bitten internally: our running jar was a different version from the jar that we had recently pushed, which caused havoc on our cluster. This problem is more important to fix now that we have rolling upgrades and will regularly have cluster scenarios with mixed-version RSs. This addresses bug HBASE-3223. http://issues.apache.org/jira/browse/HBASE-3223 Diffs - trunk/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetrics.java 1033788 trunk/src/main/java/org/apache/hadoop/hbase/metrics/HBaseInfo.java PRE-CREATION trunk/src/main/java/org/apache/hadoop/hbase/metrics/MetricsMBeanBase.java 1033788 trunk/src/main/java/org/apache/hadoop/hbase/metrics/MetricsString.java PRE-CREATION trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java 1033788 Diff: http://review.cloudera.org/r/1209/diff Testing --- Loaded on dev cluster and verified that this was exported via JMX. Thanks, Nicolas Get VersionInfo for Running HBase Process - Key: HBASE-3223 URL: https://issues.apache.org/jira/browse/HBASE-3223 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Fix For: 0.90.1 bin/hbase VersionInfo is a great existing utility to provide version info about HBase jar files. 
Unfortunately, there is currently no way to get this information for the running process. For this jira, add an easy/quick way to verify the rev of the running jar. We recently got bitten internally: our running jar was a different version from the jar that we had recently pushed, which caused havoc on our cluster. This problem is more important to fix now that we have rolling upgrades and will regularly have cluster scenarios with mixed-version RSs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
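For readers unfamiliar with exporting a value over JMX, the idea of publishing the running process's version can be sketched with the standard javax.management API. This is only an illustration, not the HBaseInfo or MetricsString classes from the patch: the bean interface, the ObjectName "sketch.hbase:type=VersionInfo", and the version strings are all hypothetical.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class RunningVersionSketch {
    // Hypothetical management interface; each getter becomes a JMX attribute.
    public interface VersionInfoMBean {
        String getVersion();
        String getRevision();
    }

    public static final class VersionInfoBean implements VersionInfoMBean {
        private final String version;
        private final String revision;

        public VersionInfoBean(String version, String revision) {
            this.version = version;
            this.revision = revision;
        }

        @Override public String getVersion() { return version; }
        @Override public String getRevision() { return revision; }
    }

    // Registers the bean with the platform MBean server of the running JVM.
    // The ObjectName is an assumption made for this sketch.
    public static ObjectName register(String version, String revision) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("sketch.hbase:type=VersionInfo");
        if (server.isRegistered(name)) {
            server.unregisterMBean(name);
        }
        server.registerMBean(
            new StandardMBean(new VersionInfoBean(version, revision), VersionInfoMBean.class),
            name);
        return name;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder values; a real process would read them from its build info.
        ObjectName name = register("0.0.0-example", "rev-example");
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        System.out.println("Version attribute: " + server.getAttribute(name, "Version"));
    }
}
```

Because the values are captured when the process starts, a JMX client attached to the JVM sees the version of the jar that is actually running, not the one most recently pushed to disk.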
[jira] Commented: (HBASE-3223) Get VersionInfo for Running HBase Process
[ https://issues.apache.org/jira/browse/HBASE-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930885#action_12930885 ] HBase Review Board commented on HBASE-3223: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1209/#review1899 --- trunk/src/main/java/org/apache/hadoop/hbase/metrics/MetricsMBeanBase.java http://review.cloudera.org/r/1209/#comment6122 both LOG info messages can be taken out. this was from debug - Nicolas Get VersionInfo for Running HBase Process - Key: HBASE-3223 URL: https://issues.apache.org/jira/browse/HBASE-3223 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Fix For: 0.90.1 bin/hbase VersionInfo is a great existing utility to provide version info about Hbase jar files. Unfortunately, there is no way to currently get this information for the running process. For this jira, add an easy/quick way to see verify the rev of the running jar. We got recently bit internally because our running jar was a different version from the jar that we had recently pushed and caused havoc on our cluster. This problem is more important to fix now that we have rolling upgrades and will regularly have cluster scenarios with mixed-version RSs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3211) Key (Index) Only Fetches
[ https://issues.apache.org/jira/browse/HBASE-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930891#action_12930891 ] HBase Review Board commented on HBASE-3211: --- Message from: Jonathan Gray jg...@apache.org bq. On 2010-11-10 16:01:22, Nicolas wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1199 bq. http://review.cloudera.org/r/1208/diff/1/?file=17147#file17147line1199 bq. bq. Would it be more straightforward to have a ReturnCode.MODIFY, that signals an include but suggests that the program must call modifyKeyValue() to get the transformed data. Maybe this is too much of a one-off case... bq. bq. Jonathan Gray wrote: bq. Not sure I completely follow. You're saying the modification would happen outside the filter? No one needs to call modifyKeyValue() to get the transformed data, it's done in the filter. bq. bq. In any case, yeah, I would not be for adding another ReturnCode just for this. bq. bq. Nicolas wrote: bq. I suggested this alternative because users normally expect filters to do immutable operations on the data itself, and you're introducing side effects. If we stay with this paradigm, it's probably best to add a note in Filter.filterKeyValue() that the KeyValue may be modified. But a user would have to knowingly use this filter, right? And the filter only has one purpose of mutating the KVs. I do agree with what you're saying at some level but not sure what a note in the interface would do. This is so if writing other filters, you would know that other filters in the chain could modify the KV? How would you behave differently then in that case? - Jonathan --- This is an automatically generated e-mail. 
To reply, visit: http://review.cloudera.org/r/1208/#review1896 --- Key (Index) Only Fetches Key: HBASE-3211 URL: https://issues.apache.org/jira/browse/HBASE-3211 Project: HBase Issue Type: Improvement Reporter: Kannan Muthukkaruppan Assignee: Jonathan Gray Fix For: 0.90.0 Attachments: HBASE-3211-v2.patch, HBASE-3211-v3.patch When you retrieve data from HBase you get Key (Row+Column+Timestamp) + Values. It would be nice to have a mode where we only fetch the keys (i.e. the index) but not the values. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3223) Get VersionInfo for Running HBase Process
[ https://issues.apache.org/jira/browse/HBASE-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930935#action_12930935 ] HBase Review Board commented on HBASE-3223: --- Message from: st...@duboce.net --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1209/#review1905 --- Ship it! This looks like nice utility. - stack Get VersionInfo for Running HBase Process - Key: HBASE-3223 URL: https://issues.apache.org/jira/browse/HBASE-3223 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Fix For: 0.90.1 bin/hbase VersionInfo is a great existing utility to provide version info about Hbase jar files. Unfortunately, there is no way to currently get this information for the running process. For this jira, add an easy/quick way to see verify the rev of the running jar. We got recently bit internally because our running jar was a different version from the jar that we had recently pushed and caused havoc on our cluster. This problem is more important to fix now that we have rolling upgrades and will regularly have cluster scenarios with mixed-version RSs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3112) Enable and disable of table needs a bit of loving in new master
[ https://issues.apache.org/jira/browse/HBASE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930285#action_12930285 ] HBase Review Board commented on HBASE-3112: --- Message from: st...@duboce.net bq. On 2010-11-09 11:05:46, Jean-Daniel Cryans wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java, line 444 bq. http://review.cloudera.org/r/1187/diff/5/?file=17034#file17034line444 bq. bq. I gave you my comments in person. Short version, I think that those methods' method shouldn't change and that we should have methods clearly marked as async, and then do a job of educating people towards using them. bq. bq. Jean-Daniel Cryans wrote: bq. I meant method's behavior Yeah, I agree with you after chatting. Will fix (And you spotted prob. w/ way async was running anyways). bq. On 2010-11-09 11:05:46, Jean-Daniel Cryans wrote: bq. trunk/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java, line 135 bq. http://review.cloudera.org/r/1187/diff/5/?file=17043#file17043line135 bq. bq. Looks an awful lot like BulkDisabler I disagree. The overrides each differ substantially (They look similar if you don't look close -- smile). - stack --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1187/#review1866 --- Enable and disable of table needs a bit of loving in new master --- Key: HBASE-3112 URL: https://issues.apache.org/jira/browse/HBASE-3112 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Priority: Critical Fix For: 0.90.0 Attachments: 3112-v2.txt, 3112-v3.txt, 3112.txt The tools are in place to do a more reliable enable/disable of tables. Some work has been done to hack in a basic enable/disable but its not enough -- see the test avro/thrift tests where a disable/enable/disable switchback can confuse the table state (and has been disabled until this issue addressed). This issue is about finishing off enable/disable in the new master. 
I think we need to add to the table znode an enabling/disabling state rather than have them binary with a watcher that will stop an enable (or disable) starting until the previous completes (Currently we atomically switch the state though the region close/open lags -- some work in enable/disable handlers helps in that they won't complete till all regions have transitioned... but it's not enough). Need to add tests too. Marking issue critical bug because loads of the questions we get on lists are about enable/disable probs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
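The proposal to add intermediate enabling/disabling states can be sketched as a small state machine. This is an in-memory stand-in under stated assumptions, not the ZooKeeper-backed implementation: the class TableStateSketch and its method names are hypothetical, and an AtomicReference takes the place of the table znode.

```java
import java.util.concurrent.atomic.AtomicReference;

public class TableStateSketch {
    // Four states instead of a binary enabled/disabled flag.
    public enum State { ENABLED, DISABLING, DISABLED, ENABLING }

    // Stand-in for the table znode; the real state would live in ZooKeeper.
    private final AtomicReference<State> state = new AtomicReference<>(State.ENABLED);

    // A disable may only start from ENABLED; returns false if another
    // enable or disable is still in flight.
    public boolean beginDisable() {
        return state.compareAndSet(State.ENABLED, State.DISABLING);
    }

    // Called once all regions have closed.
    public boolean finishDisable() {
        return state.compareAndSet(State.DISABLING, State.DISABLED);
    }

    // An enable may only start from DISABLED.
    public boolean beginEnable() {
        return state.compareAndSet(State.DISABLED, State.ENABLING);
    }

    // Called once all regions have opened.
    public boolean finishEnable() {
        return state.compareAndSet(State.ENABLING, State.ENABLED);
    }

    public State current() {
        return state.get();
    }

    public static void main(String[] args) {
        TableStateSketch table = new TableStateSketch();
        System.out.println("begin disable: " + table.beginDisable());
        // Rejected: the disable has not completed yet.
        System.out.println("begin enable while disabling: " + table.beginEnable());
        System.out.println("finish disable: " + table.finishDisable());
        System.out.println("begin enable: " + table.beginEnable());
        System.out.println("state: " + table.current());
    }
}
```

With the intermediate states, a disable/enable/disable switchback cannot overlap: each request is refused until the previous transition has reached its terminal state.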
[jira] Commented: (HBASE-3209) New Compaction Heuristic
[ https://issues.apache.org/jira/browse/HBASE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930347#action_12930347 ] HBase Review Board commented on HBASE-3209: --- Message from: Nicolas nspiegelb...@facebook.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1192/ --- Review request for hbase. Summary --- We have a whole bunch of compaction awesome in our internal 0.89 branch. Porting this to 0.90: 1) don't unconditionally compact 4 files. have a min threshold 2) intelligently upgrade minors to majors 3) new compaction algo (derived in HBASE-2462 ) This addresses bug HBASE-3209. http://issues.apache.org/jira/browse/HBASE-3209 Diffs - trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1033278 Diff: http://review.cloudera.org/r/1192/diff Testing --- Has been running on our primary cluster for the past couple weeks. Thanks, Nicolas New Compaction Heuristic Key: HBASE-3209 URL: https://issues.apache.org/jira/browse/HBASE-3209 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg We have a whole bunch of compaction awesome in our internal 0.89 branch. Porting this to 0.90: 1) don't unconditionally compact 4 files. have a min threshold 2) intelligently upgrade minors to majors 3) new compaction algo (derived in HBASE-2462 ) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3168) Sanity date and time check when a region server joins the cluster
[ https://issues.apache.org/jira/browse/HBASE-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930386#action_12930386 ] HBase Review Board commented on HBASE-3168: --- Message from: Jonathan Gray jg...@apache.org --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1193/ --- Review request for hbase and stack. Summary --- This is the patch from Jeff Whiting. I then did little bits of polish and slimmed down the unit test. I uncovered a very odd coupling of LogsCleaner being instantiated within ServerManager, though we don't use it there and it doesn't use SM. So that's refactored out into HMaster and is started up/shut down with start/stopServiceThreads(). Changes from Jeff's patch:
- Moved pulling maxSkew from config into constructor rather than doing it on each call
- Cleaned up the logging message a bit and changed from DEBUG to WARN
- HRS side, use EnvironmentEdgeManager rather than System.currentTimeMillis directly
- Changed test to operate directly on ServerManager. I had to do a bit of refactoring of ServerManager to get this to work and it's not something anyone new would have pulled the trigger on (moving stuff into another class instead of the weird unnecessary coupling to ServerManager).
This addresses bug HBASE-3168. 
http://issues.apache.org/jira/browse/HBASE-3168 Diffs - trunk/src/main/java/org/apache/hadoop/hbase/ClockOutOfSyncException.java PRE-CREATION trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPCProtocolVersion.java 1033288 trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterRegionInterface.java 1033288 trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1033288 trunk/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 1033288 trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1033288 trunk/src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java PRE-CREATION Diff: http://review.cloudera.org/r/1193/diff Testing --- New added test passes. Thanks, Jonathan Sanity date and time check when a region server joins the cluster - Key: HBASE-3168 URL: https://issues.apache.org/jira/browse/HBASE-3168 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.89.20100924 Environment: RHEL 5.5 64bit, 1 Master 4 Region Servers Reporter: Jeff Whiting Assignee: Jeff Whiting Fix For: 0.90.0 Attachments: HBASE-3168-trunk-v1.txt, HBASE-3168-trunk-v2.txt, HBASE-3168-trunk-v3.txt, HBASE-3168-v4.patch Introduce a sanity check when a RS joins the cluster to make sure its clock isn't too far out of skew with the rest of the cluster. If the RS's time is too far out of skew then the master would prevent it from joining and the RS would die and log the error. Having a RS with even small differences in time can cause huge problems due to how HBase stores values with timestamps. According to J-D, in ServerManager we are already doing:
{code}
HServerInfo info = new HServerInfo(serverInfo);
checkIsDead(info.getServerName(), STARTUP);
checkAlreadySameHostPort(info);
recordNewServer(info, false, null);
{code}
and the new check would fit in nicely there. JG suggests we add a ClockOutOfSync-like exception -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
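The clock sanity check described in HBASE-3168 can be sketched as follows. This is a simplified stand-in, not the ServerManager code from the patch: the class ClockSkewSketch, the checkClockSkew signature, and the nested exception are hypothetical, and both timestamps are passed in explicitly so the check is easy to test. As in the review summary, the maximum skew is fixed once in the constructor rather than read on each call.

```java
public class ClockSkewSketch {
    // Stand-in for the ClockOutOfSyncException added by the patch.
    public static class ClockOutOfSyncException extends Exception {
        public ClockOutOfSyncException(String message) {
            super(message);
        }
    }

    private final long maxSkewMs;

    // The threshold is resolved once here (in the real code, from configuration).
    public ClockSkewSketch(long maxSkewMs) {
        this.maxSkewMs = maxSkewMs;
    }

    // Rejects a joining server whose reported time differs from the master's
    // time by more than the allowed skew, in either direction.
    public void checkClockSkew(String serverName, long serverTimeMs, long masterTimeMs)
            throws ClockOutOfSyncException {
        long skew = Math.abs(masterTimeMs - serverTimeMs);
        if (skew > maxSkewMs) {
            throw new ClockOutOfSyncException(
                "Server " + serverName + " rejected: reported time is out of sync by "
                + skew + "ms, max allowed is " + maxSkewMs + "ms");
        }
    }

    public static void main(String[] args) {
        ClockSkewSketch check = new ClockSkewSketch(30_000L); // hypothetical threshold
        long now = System.currentTimeMillis();
        try {
            check.checkClockSkew("rs-ok", now - 1_000L, now);
            System.out.println("rs-ok accepted");
            check.checkClockSkew("rs-skewed", now - 60_000L, now);
            System.out.println("rs-skewed accepted");
        } catch (ClockOutOfSyncException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Taking the absolute difference means a region server running ahead of the master is rejected just like one running behind, which matters because either direction can misorder timestamped values.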