[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT
[ https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tianying Chang updated HBASE-6070:

@ram, I am reading the code related to region split. The code below in AssignmentManager seems to be dead code, because:

1) I don't see any place that updates the regionState to State.SPLIT.
2) For the scenario where the region has already been split and the RS crashed, ServerShutdownHandler should already have taken care of it.

Am I missing something here? Thanks.

    if (rs.isSplit()) {
      LOG.debug("Ephemeral node deleted, regionserver crashed?, " +
        "clearing from RIT; rs=" + rs);
      regionOffline(rs.getRegion());

AM.nodeDeleted and SSH races creating problems for regions under SPLIT

Key: HBASE-6070
URL: https://issues.apache.org/jira/browse/HBASE-6070
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.92.2, 0.94.1, 0.96.0
Attachments: HBASE-6070_0.92_1.patch, HBASE-6070_0.92.patch, HBASE-6070_0.94_1.patch, HBASE-6070_0.94.patch, HBASE-6070_trunk_1.patch, HBASE-6070_trunk.patch

We tried to address the problems in Master restart and RS restart while a region SPLIT is in progress as part of HBASE-5806. While doing some more work on this, we found that one race condition remains.

1. Split has just started and the znode is in RS_SPLIT state.
2. RS goes down.
3. First callback for SSH comes.
4. As part of the fix for HBASE-5806, SSH knows that some region is in RIT.
5. But now the nodeDeleted event comes for the SPLIT node, and there we try to delete the RIT.
6. After this we try to see in the SSH whether any node is in RIT. As we don't find the region in RIT, the region is never assigned.

When we fixed HBASE-5806, step 6 happened first and then step 5 happened, so we missed it. Now we have found that. Will come up with a patch shortly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
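The ordering problem in the six steps of the report can be illustrated with a small single-threaded model. This is only a sketch under assumptions: `RitModel` and its methods are hypothetical stand-ins, not the AssignmentManager or ServerShutdownHandler code, and the attached patches may well resolve the race differently.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of regions-in-transition (RIT) bookkeeping; not HBase code. */
public class RitModel {
    private final Set<String> regionsInTransition = new HashSet<>();
    private final List<String> assigned = new ArrayList<>();

    /** A split starts: the region enters RIT. */
    public void splitStarted(String region) {
        regionsInTransition.add(region);
    }

    /** Models the nodeDeleted callback: the region is cleared from RIT. */
    public void nodeDeleted(String region) {
        regionsInTransition.remove(region);
    }

    /** Models the SSH check: only a region still found in RIT is reassigned. */
    public void serverShutdown(String region) {
        if (regionsInTransition.remove(region)) {
            assigned.add(region);
        }
    }

    public boolean isAssigned(String region) {
        return assigned.contains(region);
    }

    public static void main(String[] args) {
        // Order seen while fixing HBASE-5806: SSH check first, then nodeDeleted.
        RitModel sshFirst = new RitModel();
        sshFirst.splitStarted("r1");
        sshFirst.serverShutdown("r1");
        sshFirst.nodeDeleted("r1");
        System.out.println("SSH first, assigned: " + sshFirst.isAssigned("r1"));

        // Order reported in this issue: nodeDeleted first, then the SSH check.
        RitModel deleteFirst = new RitModel();
        deleteFirst.splitStarted("r1");
        deleteFirst.nodeDeleted("r1");
        deleteFirst.serverShutdown("r1");
        System.out.println("nodeDeleted first, assigned: " + deleteFirst.isAssigned("r1"));
    }
}
```

In the second ordering the region has already left the RIT set when the shutdown check runs, so nothing reassigns it, which matches the symptom described in step 6.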
[jira] [Commented] (HBASE-6843) loading lzo error when using coprocessor
[ https://issues.apache.org/jira/browse/HBASE-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483003#comment-13483003 ]

Hudson commented on HBASE-6843:

Integrated in HBase-TRUNK #3479 (See [https://builds.apache.org/job/HBase-TRUNK/3479/])
HBASE-6843 loading lzo error when using coprocessor (Andy) (Revision 1401551)

Result = FAILURE
larsh :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorClassLoader.java

loading lzo error when using coprocessor

Key: HBASE-6843
URL: https://issues.apache.org/jira/browse/HBASE-6843
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.94.1
Reporter: Zhou wenjian
Assignee: Zhou wenjian
Priority: Critical
Fix For: 0.94.3, 0.96.0
Attachments: HBASE-6843-trunk.patch

After applying HBASE-6308, we found the following error:

2012-09-06 00:44:38,341 DEBUG org.apache.hadoop.hbase.coprocessor.CoprocessorClassLoader: Finding class: com.hadoop.compression.lzo.LzoCodec
2012-09-06 00:44:38,351 ERROR com.hadoop.compression.lzo.GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: Native Library /home/zhuzhuang/hbase/0.94.0-ali-1.0/lib/native/Linux-amd64-64/libgplcompression.so already loaded in another classloader
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1772)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1732)
    at java.lang.Runtime.loadLibrary0(Runtime.java:823)
    at java.lang.System.loadLibrary(System.java:1028)
    at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
    at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:67)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:113)
    at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm$1.getCodec(Compression.java:107)
    at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:243)
    at org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:85)
    at org.apache.hadoop.hbase.regionserver.HRegion.checkCompressionCodecs(HRegion.java:3793)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3782)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3732)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2012-09-06 00:44:38,355 DEBUG org.apache.hadoop.hbase.coprocessor.CoprocessorClassLoader: Skipping exempt class java.io.PrintWriter - delegating directly to parent
2012-09-06 00:44:38,355 ERROR com.hadoop.compression.lzo.LzoCodec: Cannot load native-lzo without native-hadoop
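The UnsatisfiedLinkError above occurs because the JVM allows a given native library to be loaded by only one class loader, and the "Skipping exempt class ... delegating directly to parent" log line shows that the coprocessor loader already hands some classes to its parent. A minimal sketch of that kind of prefix-based delegation follows. It rests on assumptions: `ExemptingClassLoader` and the prefixes used with it are hypothetical, and the message does not say whether the committed patch uses this approach or these prefixes.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of prefix-based delegation; not the CoprocessorClassLoader code. */
public class ExemptingClassLoader extends ClassLoader {
    // Illustrative prefixes only; the real exemption list is not given in this thread.
    private final List<String> exemptPrefixes;

    public ExemptingClassLoader(ClassLoader parent, String... exemptPrefixes) {
        super(parent);
        this.exemptPrefixes = Arrays.asList(exemptPrefixes);
    }

    /** True if the named class should always come from the parent loader. */
    public boolean isExempt(String className) {
        for (String prefix : exemptPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    public Class<?> loadClass(String name) throws ClassNotFoundException {
        if (isExempt(name)) {
            // Delegating keeps a single copy of the class, so its static
            // initializer (and any System.loadLibrary call in it) runs once.
            return getParent().loadClass(name);
        }
        // A real coprocessor loader would search its own jar first;
        // this sketch simply falls back to the default behaviour.
        return super.loadClass(name);
    }
}
```

With a loader built as `new ExemptingClassLoader(ClassLoader.getSystemClassLoader(), "java.", "com.hadoop.compression.")`, a codec class under the second prefix would be resolved by the parent rather than defined a second time in the coprocessor loader.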
[jira] [Updated] (HBASE-7037) ReplicationPeer logs at WARN level aborting server instead of at FATAL
[ https://issues.apache.org/jira/browse/HBASE-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-7037:

Resolution: Fixed
Fix Version/s: 0.96.0, 0.94.3
Assignee: liang xie
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Committed to trunk and 0.94 branch. Thanks for the patch Liang.

ReplicationPeer logs at WARN level aborting server instead of at FATAL

Key: HBASE-7037
URL: https://issues.apache.org/jira/browse/HBASE-7037
Project: HBase
Issue Type: Bug
Components: Replication
Reporter: stack
Assignee: liang xie
Labels: noob
Fix For: 0.94.3, 0.96.0
Attachments: HBASE-7037.txt
[jira] [Commented] (HBASE-6843) loading lzo error when using coprocessor
[ https://issues.apache.org/jira/browse/HBASE-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483023#comment-13483023 ]

Hudson commented on HBASE-6843:

Integrated in HBase-0.94 #551 (See [https://builds.apache.org/job/HBase-0.94/551/])
HBASE-6843 loading lzo error when using coprocessor (Andy) (Revision 1401550)

Result = FAILURE
larsh :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorClassLoader.java

loading lzo error when using coprocessor

Key: HBASE-6843
URL: https://issues.apache.org/jira/browse/HBASE-6843
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.94.1
Reporter: Zhou wenjian
Assignee: Zhou wenjian
Priority: Critical
Fix For: 0.94.3, 0.96.0
Attachments: HBASE-6843-trunk.patch
[jira] [Updated] (HBASE-4676) Prefix Compression - Trie data block encoding
[ https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Corgan updated HBASE-4676:

Attachment: HBASE-4676-prefix-tree-trunk-v2.patch

attaching full patch with fixes: HBASE-4676-prefix-tree-trunk-v2.patch

Prefix Compression - Trie data block encoding

Key: HBASE-4676
URL: https://issues.apache.org/jira/browse/HBASE-4676
Project: HBase
Issue Type: New Feature
Components: io, Performance, regionserver
Affects Versions: 0.96.0
Reporter: Matt Corgan
Assignee: Matt Corgan
Attachments: HBASE-4676-0.94-v1.patch, HBASE-4676-prefix-tree-trunk-v1.patch, HBASE-4676-prefix-tree-trunk-v2.patch, hbase-prefix-trie-0.1.jar, PrefixTrie_Format_v1.pdf, PrefixTrie_Performance_v1.pdf, SeeksPerSec by blockSize.png

The HBase data block format has room for 2 significant improvements for applications that have high block cache hit ratios.

First, there is no prefix compression, and the current KeyValue format is somewhat metadata heavy, so there can be tremendous memory bloat for many common data layouts, specifically those with long keys and short values.

Second, there is no random access to KeyValues inside data blocks. This means that every time you double the datablock size, average seek time (or average cpu consumption) goes up by a factor of 2. The standard 64KB block size is ~10x slower for random seeks than a 4KB block size, but block sizes as small as 4KB cause problems elsewhere. Using block sizes of 256KB or 1MB or more may be more efficient from a disk access and block-cache perspective in many big-data applications, but doing so is infeasible from a random seek perspective.

The PrefixTrie block encoding format attempts to solve both of these problems. Some features:

* Trie format for row key encoding completely eliminates duplicate row keys and encodes similar row keys into a standard trie structure, which also saves a lot of space.
* The column family is currently stored once at the beginning of each block. This could easily be modified to allow multiple family names per block.
* All qualifiers in the block are stored in their own trie format, which caters nicely to wide rows. Duplicate qualifiers between rows are eliminated. The size of this trie determines the width of the block's qualifier fixed-width-int.
* The minimum timestamp is stored at the beginning of the block, and deltas are calculated from that. The maximum delta determines the width of the block's timestamp fixed-width-int.

The block is structured with metadata at the beginning, then a section for the row trie, then the column trie, then the timestamp deltas, and then all the values. Most work is done in the row trie, where every leaf node (corresponding to a row) contains a list of offsets/references corresponding to the cells in that row. Each cell is fixed-width to enable binary searching and is represented by [1 byte operationType, X bytes qualifier offset, X bytes timestamp delta offset].

If all operation types are the same for a block, there will be zero per-cell overhead. Same for timestamps. Same for qualifiers when I get a chance. So, the compression aspect is very strong, but makes a few small sacrifices on VarInt size to enable faster binary searches in trie fan-out nodes.

A more compressed but slower version might build on this by also applying further (suffix, etc) compression on the trie nodes at the cost of slower write speed. Even further compression could be obtained by using all VInts instead of FInts with a sacrifice on random seek speed (though not huge).

One current drawback is the current write speed. While programmed with good constructs like TreeMaps, ByteBuffers, binary searches, etc, it's not programmed with the same level of optimization as the read path. Work will need to be done to optimize the data structures used for encoding and could probably show a 10x increase. It will still be slower than delta encoding, but with a much higher decode speed. I have not yet created a thorough benchmark for write speed nor sequential read speed.

Though the trie is reaching a point where it is internally very efficient (probably within half or a quarter of its max read speed), the way that hbase currently uses it is far from optimal. The KeyValueScanner and related classes that iterate through the trie will eventually need to be smarter and have methods to do things like skipping to the next row of results without scanning every cell in between. When that is accomplished it will also allow much faster compactions because the full row key will not have to be compared as often as it is now.

Current code is on github. The trie code is in a separate project than the slightly modified hbase. There is an hbase project there as
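The timestamp handling in the feature list (store the block's minimum timestamp once, keep per-cell deltas, and size the fixed-width int from the largest delta) can be sketched as below. This is an illustration under assumptions: `TimestampDeltas` and its methods are hypothetical names, and the actual PrefixTrie encoder in the attached patch may lay the data out differently.

```java
/** Hypothetical sketch of the timestamp section described above; not the PrefixTrie code. */
public class TimestampDeltas {

    /** Smallest timestamp in the block; stored once at the start of the block. */
    public static long min(long[] timestamps) {
        long min = timestamps[0];
        for (long t : timestamps) {
            min = Math.min(min, t);
        }
        return min;
    }

    /** Per-cell deltas relative to the block minimum. */
    public static long[] deltas(long[] timestamps) {
        long min = min(timestamps);
        long[] out = new long[timestamps.length];
        for (int i = 0; i < timestamps.length; i++) {
            out[i] = timestamps[i] - min;
        }
        return out;
    }

    /**
     * Bytes needed to hold the largest delta as a fixed-width unsigned int.
     * Returns 0 when every timestamp is identical, i.e. no per-cell bytes at all.
     */
    public static int widthBytes(long[] timestamps) {
        long maxDelta = 0;
        for (long d : deltas(timestamps)) {
            maxDelta = Math.max(maxDelta, d);
        }
        int bits = 64 - Long.numberOfLeadingZeros(maxDelta);
        return (bits + 7) / 8;
    }

    public static void main(String[] args) {
        long[] ts = {1350000000000L, 1350000000500L, 1350000070000L};
        System.out.println("min = " + min(ts));
        System.out.println("width in bytes = " + widthBytes(ts));
    }
}
```

For the three sample timestamps the largest delta is 70000, which needs 17 bits, so each cell would carry a 3-byte delta instead of an 8-byte timestamp. A block whose cells all share one timestamp gets a width of zero, in line with the "zero per-cell overhead" remark in the description.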
[jira] [Updated] (HBASE-4676) Prefix Compression - Trie data block encoding
[ https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Corgan updated HBASE-4676:

Status: Open (was: Patch Available)

Prefix Compression - Trie data block encoding

Key: HBASE-4676
URL: https://issues.apache.org/jira/browse/HBASE-4676
Project: HBase
Issue Type: New Feature
Components: io, Performance, regionserver
Affects Versions: 0.96.0
Reporter: Matt Corgan
Assignee: Matt Corgan
Attachments: HBASE-4676-0.94-v1.patch, HBASE-4676-prefix-tree-trunk-v1.patch, HBASE-4676-prefix-tree-trunk-v2.patch, hbase-prefix-trie-0.1.jar, PrefixTrie_Format_v1.pdf, PrefixTrie_Performance_v1.pdf, SeeksPerSec by blockSize.png
[jira] [Updated] (HBASE-4676) Prefix Compression - Trie data block encoding
[ https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Corgan updated HBASE-4676:

Status: Patch Available (was: Open)

Prefix Compression - Trie data block encoding

Key: HBASE-4676
URL: https://issues.apache.org/jira/browse/HBASE-4676
Project: HBase
Issue Type: New Feature
Components: io, Performance, regionserver
Affects Versions: 0.96.0
Reporter: Matt Corgan
Assignee: Matt Corgan
Attachments: HBASE-4676-0.94-v1.patch, HBASE-4676-prefix-tree-trunk-v1.patch, HBASE-4676-prefix-tree-trunk-v2.patch, hbase-prefix-trie-0.1.jar, PrefixTrie_Format_v1.pdf, PrefixTrie_Performance_v1.pdf, SeeksPerSec by blockSize.png
[jira] [Commented] (HBASE-4676) Prefix Compression - Trie data block encoding
[ https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483041#comment-13483041 ]

Hadoop QA commented on HBASE-4676:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12550594/HBASE-4676-prefix-tree-trunk-v2.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 143 new or modified tests.

    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

    -1 javadoc. The javadoc tool appears to have generated 99 warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs. The patch appears to introduce 49 new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:
                   org.apache.hadoop.hbase.io.hfile.TestHFileDataBlockEncoder

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3134//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3134//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3134//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3134//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3134//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3134//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3134//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3134//console

This message is automatically generated.

Prefix Compression - Trie data block encoding

Key: HBASE-4676
URL: https://issues.apache.org/jira/browse/HBASE-4676
Project: HBase
Issue Type: New Feature
Components: io, Performance, regionserver
Affects Versions: 0.96.0
Reporter: Matt Corgan
Assignee: Matt Corgan
Attachments: HBASE-4676-0.94-v1.patch, HBASE-4676-prefix-tree-trunk-v1.patch, HBASE-4676-prefix-tree-trunk-v2.patch, hbase-prefix-trie-0.1.jar, PrefixTrie_Format_v1.pdf, PrefixTrie_Performance_v1.pdf, SeeksPerSec by blockSize.png
[jira] [Created] (HBASE-7042) Master Coprocessor Endpoint
Francis Liu created HBASE-7042:

Summary: Master Coprocessor Endpoint
Key: HBASE-7042
URL: https://issues.apache.org/jira/browse/HBASE-7042
Project: HBase
Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu
Attachments: HBASE-7042_94.patch

Having support for a master coprocessor endpoint would enable developers to easily extend HMaster functionality/features, as is the case for region server grouping.
[jira] [Updated] (HBASE-7042) Master Coprocessor Endpoint
[ https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francis Liu updated HBASE-7042:

Attachment: HBASE-7042_94.patch

Master Coprocessor Endpoint

Key: HBASE-7042
URL: https://issues.apache.org/jira/browse/HBASE-7042
Project: HBase
Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu
Fix For: 0.96.0
Attachments: HBASE-7042_94.patch
[jira] [Commented] (HBASE-7042) Master Coprocessor Endpoint
[ https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483045#comment-13483045 ]

Francis Liu commented on HBASE-7042:

Putting up a 0.94 patch for initial comments.

Master Coprocessor Endpoint

Key: HBASE-7042
URL: https://issues.apache.org/jira/browse/HBASE-7042
Project: HBase
Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu
Fix For: 0.96.0
Attachments: HBASE-7042_94.patch
[jira] [Updated] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francis Liu updated HBASE-6721:

Attachment: HBASE-6721_94.patch

0.94 patch for initial review

RegionServer Group based Assignment

Key: HBASE-6721
URL: https://issues.apache.org/jira/browse/HBASE-6721
Project: HBase
Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
Fix For: 0.96.0
Attachments: HBASE-6721_94.patch, HBASE-6721-DesigDoc.pdf

In multi-tenant deployments of HBase, it is likely that a RegionServer will be serving out regions from a number of different tables owned by various client applications. Being able to group a subset of running RegionServers and assign specific tables to that group provides a client application a level of isolation and resource allocation.

The proposal essentially is to have an AssignmentManager which is aware of RegionServer groups and assigns tables to region servers based on groupings. Load balancing will occur on a per-group basis as well.

This is essentially a simplification of the approach taken in HBASE-4120. See attached document.
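The idea of group-aware assignment in the proposal can be pictured with a small sketch. Everything here is an assumption made for illustration: `GroupAssigner`, its method names, and the round-robin policy are hypothetical and are not taken from the attached patch or design document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of group-aware region assignment; not the HBASE-6721 patch. */
public class GroupAssigner {
    private final Map<String, List<String>> serversByGroup = new HashMap<>();
    private final Map<String, String> groupByTable = new HashMap<>();

    /** Registers a region server as a member of a group. */
    public void addServer(String group, String server) {
        serversByGroup.computeIfAbsent(group, g -> new ArrayList<>()).add(server);
    }

    /** Pins a table to a group. */
    public void assignTable(String table, String group) {
        groupByTable.put(table, group);
    }

    /** Spreads the table's regions round-robin over the servers of its group only. */
    public Map<String, String> assignRegions(String table, List<String> regions) {
        List<String> servers = serversByGroup.get(groupByTable.get(table));
        if (servers == null || servers.isEmpty()) {
            throw new IllegalStateException("no servers in group for table " + table);
        }
        Map<String, String> plan = new HashMap<>();
        for (int i = 0; i < regions.size(); i++) {
            plan.put(regions.get(i), servers.get(i % servers.size()));
        }
        return plan;
    }
}
```

Because the plan for a table draws only on its own group's servers, balancing one group never moves regions onto servers that belong to another tenant, which is the isolation property the description is after.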
[jira] [Created] (HBASE-7043) Region Server Group CLI commands
Francis Liu created HBASE-7043: -- Summary: Region Server Group CLI commands Key: HBASE-7043 URL: https://issues.apache.org/jira/browse/HBASE-7043 Project: HBase Issue Type: Sub-task Reporter: Francis Liu -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7043) Region Server Group CLI commands
[ https://issues.apache.org/jira/browse/HBASE-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francis Liu updated HBASE-7043: --- Attachment: HBASE-7043_94.patch Region Server Group CLI commands Key: HBASE-7043 URL: https://issues.apache.org/jira/browse/HBASE-7043 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Fix For: 0.96.0 Attachments: HBASE-7043_94.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-7043) Region Server Group CLI commands
[ https://issues.apache.org/jira/browse/HBASE-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francis Liu reassigned HBASE-7043: -- Assignee: Francis Liu Region Server Group CLI commands Key: HBASE-7043 URL: https://issues.apache.org/jira/browse/HBASE-7043 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Fix For: 0.96.0 Attachments: HBASE-7043_94.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7043) Region Server Group CLI commands
[ https://issues.apache.org/jira/browse/HBASE-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483056#comment-13483056 ] Francis Liu commented on HBASE-7043: 0.94 initial patch for review Region Server Group CLI commands Key: HBASE-7043 URL: https://issues.apache.org/jira/browse/HBASE-7043 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Fix For: 0.96.0 Attachments: HBASE-7043_94.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francis Liu updated HBASE-6721: --- Status: Patch Available (was: Open) 0.94 patch for initial review, We decided to combine the patch of two subtasks into one. RegionServer Group based Assignment --- Key: HBASE-6721 URL: https://issues.apache.org/jira/browse/HBASE-6721 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Vandana Ayyalasomayajula Fix For: 0.96.0 Attachments: HBASE-6721_94.patch, HBASE-6721-DesigDoc.pdf In multi-tenant deployments of HBase, it is likely that a RegionServer will be serving out regions from a number of different tables owned by various client applications. Being able to group a subset of running RegionServers and assign specific tables to it, provides a client application a level of isolation and resource allocation. The proposal essentially is to have an AssignmentManager which is aware of RegionServer groups and assigns tables to region servers based on groupings. Load balancing will occur on a per group basis as well. This is essentially a simplification of the approach taken in HBASE-4120. See attached document. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-6723) Implement RegionServer Group Based Balancer
[ https://issues.apache.org/jira/browse/HBASE-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francis Liu resolved HBASE-6723. Resolution: Invalid Implement RegionServer Group Based Balancer --- Key: HBASE-6723 URL: https://issues.apache.org/jira/browse/HBASE-6723 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Vandana Ayyalasomayajula Re-purposing this Jira after the discussion last week. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-6837) RegionServer Groups corpcoessor apis
[ https://issues.apache.org/jira/browse/HBASE-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francis Liu resolved HBASE-6837. Resolution: Invalid RegionServer Groups corpcoessor apis Key: HBASE-6837 URL: https://issues.apache.org/jira/browse/HBASE-6837 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7042) Master Coprocessor Endpoint
[ https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francis Liu updated HBASE-7042: --- Status: Patch Available (was: Open) Master Coprocessor Endpoint --- Key: HBASE-7042 URL: https://issues.apache.org/jira/browse/HBASE-7042 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Fix For: 0.96.0 Attachments: HBASE-7042_94.patch Having support for a master coprocessor endpoint would enable developers to easily extend HMaster functionality/features, as is the case for region server grouping.
[jira] [Updated] (HBASE-7043) Region Server Group CLI commands
[ https://issues.apache.org/jira/browse/HBASE-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francis Liu updated HBASE-7043: --- Status: Patch Available (was: Open) Region Server Group CLI commands Key: HBASE-7043 URL: https://issues.apache.org/jira/browse/HBASE-7043 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Fix For: 0.96.0 Attachments: HBASE-7043_94.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7042) Master Coprocessor Endpoint
[ https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483062#comment-13483062 ] Hadoop QA commented on HBASE-7042: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550596/HBASE-7042_94.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3136//console This message is automatically generated. Master Coprocessor Endpoint --- Key: HBASE-7042 URL: https://issues.apache.org/jira/browse/HBASE-7042 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Fix For: 0.96.0 Attachments: HBASE-7042_94.patch Having support for a master coprocessor endpoint would enable developers to easily extended HMaster functionality/features. As is the case for region server grouping. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7043) Region Server Group CLI commands
[ https://issues.apache.org/jira/browse/HBASE-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483063#comment-13483063 ] Hadoop QA commented on HBASE-7043: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550600/HBASE-7043_94.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3135//console This message is automatically generated. Region Server Group CLI commands Key: HBASE-7043 URL: https://issues.apache.org/jira/browse/HBASE-7043 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Fix For: 0.96.0 Attachments: HBASE-7043_94.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483065#comment-13483065 ] Hadoop QA commented on HBASE-6721: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550599/HBASE-6721_94.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3137//console This message is automatically generated. RegionServer Group based Assignment --- Key: HBASE-6721 URL: https://issues.apache.org/jira/browse/HBASE-6721 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Vandana Ayyalasomayajula Fix For: 0.96.0 Attachments: HBASE-6721_94.patch, HBASE-6721-DesigDoc.pdf In multi-tenant deployments of HBase, it is likely that a RegionServer will be serving out regions from a number of different tables owned by various client applications. Being able to group a subset of running RegionServers and assign specific tables to it, provides a client application a level of isolation and resource allocation. The proposal essentially is to have an AssignmentManager which is aware of RegionServer groups and assigns tables to region servers based on groupings. Load balancing will occur on a per group basis as well. This is essentially a simplification of the approach taken in HBASE-4120. See attached document. -- This message is automatically generated by JIRA. 
[jira] [Updated] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francis Liu updated HBASE-6721: --- Attachment: HBASE-6721_94.patch RegionServer Group based Assignment --- Key: HBASE-6721 URL: https://issues.apache.org/jira/browse/HBASE-6721 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Vandana Ayyalasomayajula Fix For: 0.96.0 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, HBASE-6721-DesigDoc.pdf In multi-tenant deployments of HBase, it is likely that a RegionServer will be serving out regions from a number of different tables owned by various client applications. Being able to group a subset of running RegionServers and assign specific tables to it, provides a client application a level of isolation and resource allocation. The proposal essentially is to have an AssignmentManager which is aware of RegionServer groups and assigns tables to region servers based on groupings. Load balancing will occur on a per group basis as well. This is essentially a simplification of the approach taken in HBASE-4120. See attached document. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483069#comment-13483069 ] Hadoop QA commented on HBASE-6721: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550601/HBASE-6721_94.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 11 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3138//console This message is automatically generated. RegionServer Group based Assignment --- Key: HBASE-6721 URL: https://issues.apache.org/jira/browse/HBASE-6721 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Vandana Ayyalasomayajula Fix For: 0.96.0 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, HBASE-6721-DesigDoc.pdf In multi-tenant deployments of HBase, it is likely that a RegionServer will be serving out regions from a number of different tables owned by various client applications. Being able to group a subset of running RegionServers and assign specific tables to it, provides a client application a level of isolation and resource allocation. The proposal essentially is to have an AssignmentManager which is aware of RegionServer groups and assigns tables to region servers based on groupings. Load balancing will occur on a per group basis as well. This is essentially a simplification of the approach taken in HBASE-4120. See attached document. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7037) ReplicationPeer logs at WARN level aborting server instead of at FATAL
[ https://issues.apache.org/jira/browse/HBASE-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483077#comment-13483077 ] Hudson commented on HBASE-7037: --- Integrated in HBase-0.94 #552 (See [https://builds.apache.org/job/HBase-0.94/552/]) HBASE-7037 ReplicationPeer logs at WARN level aborting server instead of at FATAL (Revision 1401564) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeer.java ReplicationPeer logs at WARN level aborting server instead of at FATAL -- Key: HBASE-7037 URL: https://issues.apache.org/jira/browse/HBASE-7037 Project: HBase Issue Type: Bug Components: Replication Reporter: stack Assignee: liang xie Labels: noob Fix For: 0.94.3, 0.96.0 Attachments: HBASE-7037.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7037) ReplicationPeer logs at WARN level aborting server instead of at FATAL
[ https://issues.apache.org/jira/browse/HBASE-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483087#comment-13483087 ] Hudson commented on HBASE-7037: --- Integrated in HBase-TRUNK #3480 (See [https://builds.apache.org/job/HBase-TRUNK/3480/]) HBASE-7037 ReplicationPeer logs at WARN level aborting server instead of at FATAL (Revision 1401563) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeer.java ReplicationPeer logs at WARN level aborting server instead of at FATAL -- Key: HBASE-7037 URL: https://issues.apache.org/jira/browse/HBASE-7037 Project: HBase Issue Type: Bug Components: Replication Reporter: stack Assignee: liang xie Labels: noob Fix For: 0.94.3, 0.96.0 Attachments: HBASE-7037.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7024) TableMapReduceUtil.initTableMapperJob unnecessarily limits the types of outputKeyClass and outputValueClass
[ https://issues.apache.org/jira/browse/HBASE-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483096#comment-13483096 ] Dave Beech commented on HBASE-7024: ---
Thanks Ted, Stack. Stack - you are right that keys and values have to be serializable, but they don't have to be Serializable in the Java interface sense. The Job/JobConf classes in Hadoop accept absolutely any class. Map tasks use Hadoop's SerializationFactory to work out which serializer class to use (WritableSerialization is the default, but you can specify custom ones, like AvroSerialization, through the io.serializations job setting).
The point is that Hadoop doesn't care at all what type your map output key and value classes are, so long as you have provided a serializer which works with them. If you haven't, the job dies horribly (no surprise there).
I haven't tested with Hadoop 2 yet, no, but I'd be very surprised if this patch broke anything. If they'd changed this behaviour in Hadoop I'm sure there'd be tons of regression problems with mapreduce jobs that need custom serializers.
TableMapReduceUtil.initTableMapperJob unnecessarily limits the types of outputKeyClass and outputValueClass --- Key: HBASE-7024 URL: https://issues.apache.org/jira/browse/HBASE-7024 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Dave Beech Priority: Minor Attachments: HBASE-7024.patch The various initTableMapperJob methods in TableMapReduceUtil take outputKeyClass and outputValueClass parameters which need to extend WritableComparable and Writable respectively. Because of this, it is not convenient to use an alternative serialization like Avro. (I wanted to set these parameters to AvroKey and AvroValue). The methods in the MapReduce API to set map output key and value types do not impose this restriction, so is there a reason to do it here?
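The argument in the comment above (any class is acceptable as a map output type provided some registered serializer accepts it) can be sketched without Hadoop on the classpath. Everything below is a hypothetical stand-in: `SerializerLookupSketch`, `WritableLike`, `TextLike` and `AvroLikeKey` are illustration names, and the lookup only imitates the idea of asking each registered serialization whether it accepts a class.

```java
import java.util.ArrayList;
import java.util.List;

interface WritableLike {}                        // stand-in for Writable
final class TextLike implements WritableLike {}  // a type the default serializer accepts
final class AvroLikeKey {}                       // a type that is not a WritableLike

// Hypothetical stand-in for the kind of lookup a serialization factory performs.
final class SerializerLookupSketch {
  // One registered serialization: a name plus the base type it accepts.
  private static final class Entry {
    final String name;
    final Class<?> accepts;
    Entry(String name, Class<?> accepts) {
      this.name = name;
      this.accepts = accepts;
    }
  }

  private final List<Entry> entries = new ArrayList<Entry>();

  void register(String name, Class<?> accepts) {
    entries.add(new Entry(name, accepts));
  }

  // Any class is allowed as a key or value type; it only needs a matching serializer.
  String serializerFor(Class<?> type) {
    for (Entry e : entries) {
      if (e.accepts.isAssignableFrom(type)) {
        return e.name;
      }
    }
    throw new IllegalStateException("no serializer registered for " + type.getName());
  }
}
```

With only the Writable-style entry registered, looking up `AvroLikeKey` fails at lookup time, which corresponds to the "job dies" case; registering a second entry for it makes the same lookup succeed without the type ever implementing `WritableLike`.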
[jira] [Created] (HBASE-7044) verifyRegionLocation in CatalogTracker.java didn't check if regionserver is in the cluster
wonderyl created HBASE-7044: --- Summary: verifyRegionLocation in CatalogTracker.java didn't check if regionserver is in the cluster Key: HBASE-7044 URL: https://issues.apache.org/jira/browse/HBASE-7044 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0 Reporter: wonderyl
Initially there was a single HBase cluster. I then decided to split it into two clusters, one for offline mining and one for online service. The online cluster was stripped out of the original one, and the offline cluster kept the original master. Unfortunately, the META region of the original cluster was assigned to one of the stripped machines, and because META locations are cached, the offline cluster kept accessing the META hosted on the stripped machine.
After inspecting the code, I found that verifyRegionLocation in CatalogTracker.java checks whether the region server still hosts the region, but does not check whether the region server is still part of the cluster. That check would be very easy: just see whether the server is registered in ZooKeeper.
In the end I had to shut down the online cluster and restart the offline one so that META was reassigned. After that, everything was back to normal.
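The extra condition suggested in the report can be sketched as follows. This is only an illustration under stated assumptions: `LocationCheckSketch` and its methods are hypothetical names rather than CatalogTracker's API, and the set of registered servers stands in for whatever the cluster's ZooKeeper ensemble currently lists.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the names below are not CatalogTracker's API.
final class LocationCheckSketch {
  // servers currently registered with this cluster (for example via ZooKeeper)
  private final Set<String> registeredServers = new HashSet<String>();
  // region -> server that answers "yes, I host this region"
  private final Map<String, String> hostingServer = new HashMap<String, String>();

  void registerServer(String server) {
    registeredServers.add(server);
  }

  void setHost(String region, String server) {
    hostingServer.put(region, server);
  }

  // Existing style of check: only asks whether the server still hosts the region.
  boolean hostsRegion(String region, String server) {
    return server.equals(hostingServer.get(region));
  }

  // Suggested extra condition: the server must also belong to this cluster.
  boolean verifyLocation(String region, String server) {
    return registeredServers.contains(server) && hostsRegion(region, server);
  }
}
```

In the scenario from the report, the stripped server still answers that it hosts META, so `hostsRegion` alone passes, while `verifyLocation` rejects the cached location because that server is no longer registered with the offline cluster.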
[jira] [Commented] (HBASE-7017) Backport [replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483109#comment-13483109 ] Devaraj Das commented on HBASE-7017: I should be able to give it a shot in a few days (traveling currently) Backport [replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file to 0.94 -- Key: HBASE-7017 URL: https://issues.apache.org/jira/browse/HBASE-7017 Project: HBase Issue Type: Bug Components: Replication Reporter: Devaraj Das Fix For: 0.94.3 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6480) If callQueueSize exceed maxQueueSize, all call will be rejected, do not reject priorityCall
[ https://issues.apache.org/jira/browse/HBASE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483112#comment-13483112 ] binlijin commented on HBASE-6480: -
@stack I agree that there should be some limit for the priority queue; Lars's suggestion is good.
If callQueueSize exceed maxQueueSize, all call will be rejected, do not reject priorityCall Key: HBASE-6480 URL: https://issues.apache.org/jira/browse/HBASE-6480 Project: HBase Issue Type: Bug Reporter: binlijin Fix For: 0.96.0, 0.94.4 Attachments: HBASE-6480-94.patch, HBASE-6480-trunk.patch
Currently, if callQueueSize exceeds maxQueueSize, all calls are rejected. Should we let priority calls pass through?
Current:
{code}
if ((callSize + callQueueSize.get()) > maxQueueSize) {
  Call callTooBig = xxx
  return;
}
if (priorityCallQueue != null && getQosLevel(param) > highPriorityLevel) {
  priorityCallQueue.put(call);
  updateCallQueueLenMetrics(priorityCallQueue);
} else {
  callQueue.put(call); // queue the call; maybe blocked here
  updateCallQueueLenMetrics(callQueue);
}
{code}
Should we change it to:
{code}
if (priorityCallQueue != null && getQosLevel(param) > highPriorityLevel) {
  priorityCallQueue.put(call);
  updateCallQueueLenMetrics(priorityCallQueue);
} else {
  if ((callSize + callQueueSize.get()) > maxQueueSize) {
    Call callTooBig = xxx
    return;
  }
  callQueue.put(call); // queue the call; maybe blocked here
  updateCallQueueLenMetrics(callQueue);
}
{code}
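The reordering proposed in the issue can be tried out in isolation with a small model. This is a sketch, not the attached patch: `CallSketch` and `CallAdmissionSketch` are hypothetical names, the queues are plain `LinkedBlockingQueue`s, and it is an assumption of the sketch that only normal calls count toward `callQueueSize`.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for an RPC call; not HBaseServer.Call.
final class CallSketch {
  final long size;
  final int qosLevel;
  CallSketch(long size, int qosLevel) {
    this.size = size;
    this.qosLevel = qosLevel;
  }
}

// Sketch of the proposed ordering: priority check first, size limit only for normal calls.
final class CallAdmissionSketch {
  private final BlockingQueue<CallSketch> callQueue = new LinkedBlockingQueue<CallSketch>();
  private final BlockingQueue<CallSketch> priorityCallQueue = new LinkedBlockingQueue<CallSketch>();
  private final AtomicLong callQueueSize = new AtomicLong();
  private final long maxQueueSize;
  private final int highPriorityLevel;

  CallAdmissionSketch(long maxQueueSize, int highPriorityLevel) {
    this.maxQueueSize = maxQueueSize;
    this.highPriorityLevel = highPriorityLevel;
  }

  // Returns false when the call is rejected because the normal queue is too full.
  boolean admit(CallSketch call) {
    if (call.qosLevel > highPriorityLevel) {
      priorityCallQueue.add(call);   // priority calls bypass the size limit
      return true;
    }
    if (call.size + callQueueSize.get() > maxQueueSize) {
      return false;                  // "call too big" path for normal calls only
    }
    callQueueSize.addAndGet(call.size);
    callQueue.add(call);
    return true;
  }

  int queuedNormal() { return callQueue.size(); }
  int queuedPriority() { return priorityCallQueue.size(); }
}
```

Note that the priority queue in this sketch is unbounded, which is exactly the concern acknowledged in the comment above; a separate limit for priority calls would need its own check.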
[jira] [Commented] (HBASE-6184) HRegionInfo was null or empty in Meta
[ https://issues.apache.org/jira/browse/HBASE-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483122#comment-13483122 ] binlijin commented on HBASE-6184: -
This can happen during a region split. In 0.94.x the write path is: write memstore, write hlog, update mvcc.
Client:
{code}
metaTable = new HTable(configuration, HConstants.META_TABLE_NAME);
Result startRowResult = metaTable.getRowOrBefore(searchRow, HConstants.CATALOG_FAMILY);
if (startRowResult == null) {
  throw new TableNotFoundException("Cannot find row in .META. for table: " + Bytes.toString(tableName) + ", row=" + Bytes.toStringBinary(searchRow));
}
byte[] value = startRowResult.getValue(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER);
if (value == null || value.length == 0) {
  throw new IOException("HRegionInfo was null or empty in Meta for " + Bytes.toString(tableName) + ", row=" + Bytes.toStringBinary(searchRow));
}
{code}
Server: HRegion.getClosestRowBefore
{code}
Store store = getStore(family);
// get the closest key. (HStore.getRowKeyAtOrBefore can return null)
KeyValue key = store.getRowKeyAtOrBefore(row);
Result result = null;
if (key != null) {
  Get get = new Get(key.getRow());
  get.addFamily(family);
  result = get(get, null);
}
return result;
{code}
store.getRowKeyAtOrBefore(row) does not consider the readPoint, but the get does. So when a value has not been committed yet, getRowKeyAtOrBefore sees it but the get ignores it, and it is possible for a null result to be returned.
HRegionInfo was null or empty in Meta -- Key: HBASE-6184 URL: https://issues.apache.org/jira/browse/HBASE-6184 Project: HBase Issue Type: Bug Components: Client, io Affects Versions: 0.94.0 Reporter: jiafeng.zhang Fix For: 0.94.3 Attachments: HBASE-6184.patch insert data hadoop-0.23.2 + hbase-0.94.0 2012-06-07 13:09:38,573 WARN [org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation] Encountered problems when prefetch META table: java.io.IOException: HRegionInfo was null or empty in Meta for hbase_one_col, row=hbase_one_col,09115303780247449149,99 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:160) at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:48) at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:126) at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:123) at org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:359) at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:123) at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:99) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:894) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:948) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:836) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1482) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945) at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:801) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:776) at 
org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:397) at com.dinglicom.hbase.HbaseImport.insertData(HbaseImport.java:177) at com.dinglicom.hbase.HbaseImport.run(HbaseImport.java:210) at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6184) HRegionInfo was null or empty in Meta
[ https://issues.apache.org/jira/browse/HBASE-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483123#comment-13483123 ] binlijin commented on HBASE-6184: -
In 0.94 the write path is: write memstore, write hlog (lasts a few ms), update mvcc. Suppose the current readPoint = 9 and the new KeyValue has memstoreTS = 10 when HRegion.getClosestRowBefore is called. Then KeyValue key = store.getRowKeyAtOrBefore(row); will see the new KeyValue, but Get get = new Get(key.getRow()); get.addFamily(family); result = get(get, null); will not see the new KeyValue.
HRegionInfo was null or empty in Meta -- Key: HBASE-6184 URL: https://issues.apache.org/jira/browse/HBASE-6184 Project: HBase Issue Type: Bug Components: Client, io Affects Versions: 0.94.0 Reporter: jiafeng.zhang Fix For: 0.94.3 Attachments: HBASE-6184.patch insert data hadoop-0.23.2 + hbase-0.94.0 2012-06-07 13:09:38,573 WARN [org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation] Encountered problems when prefetch META table: java.io.IOException: HRegionInfo was null or empty in Meta for hbase_one_col, row=hbase_one_col,09115303780247449149,99 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:160) at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:48) at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:126) at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:123) at org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:359) at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:123) at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:99) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:894) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:948) at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:836) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1482) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945) at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:801) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:776) at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:397) at com.dinglicom.hbase.HbaseImport.insertData(HbaseImport.java:177) at com.dinglicom.hbase.HbaseImport.run(HbaseImport.java:210) at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
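The race described in the comment (readPoint = 9, new KeyValue with memstoreTS = 10) can be reproduced with a toy model. This is a heavily simplified sketch: `ReadPointSketch` is a hypothetical class, one map entry stands in for a whole row, and `visibleRowAtOrBefore` is just one conceivable remedy, assumed here for illustration and not taken from the attached patch.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, heavily simplified model of one store; not HBase's Store or HRegion.
final class ReadPointSketch {
  // row key -> memstoreTS of the write that created the row
  private final TreeMap<String, Long> rows = new TreeMap<String, Long>();
  private long readPoint;

  void write(String row, long memstoreTS) {
    rows.put(row, memstoreTS);
  }

  void setReadPoint(long readPoint) {
    this.readPoint = readPoint;
  }

  // Like the closest-row lookup described above: ignores the read point.
  String rowAtOrBefore(String row) {
    return rows.floorKey(row);
  }

  // Like the follow-up get: only sees writes at or below the read point.
  boolean isVisible(String row) {
    Long ts = rows.get(row);
    return ts != null && ts <= readPoint;
  }

  // One possible remedy (an assumption, not the attached patch): skip invisible rows.
  String visibleRowAtOrBefore(String row) {
    for (Map.Entry<String, Long> e : rows.headMap(row, true).descendingMap().entrySet()) {
      if (e.getValue() <= readPoint) {
        return e.getKey();
      }
    }
    return null;
  }
}
```

With an older row written at memstoreTS 5, the read point at 9 and a newer row written at memstoreTS 10, `rowAtOrBefore` returns the newer row even though `isVisible` reports it as not yet readable; that mismatch is the window in which the client ends up with an empty HRegionInfo.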
[jira] [Commented] (HBASE-6843) loading lzo error when using coprocessor
[ https://issues.apache.org/jira/browse/HBASE-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483171#comment-13483171 ]

Hudson commented on HBASE-6843:
---
Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #233 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/233/])
HBASE-6843 loading lzo error when using coprocessor (Andy) (Revision 1401551)
Result = FAILURE
larsh :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorClassLoader.java

loading lzo error when using coprocessor
Key: HBASE-6843
URL: https://issues.apache.org/jira/browse/HBASE-6843
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.94.1
Reporter: Zhou wenjian
Assignee: Zhou wenjian
Priority: Critical
Fix For: 0.94.3, 0.96.0
Attachments: HBASE-6843-trunk.patch

After applying HBASE-6308, we found the following error:

2012-09-06 00:44:38,341 DEBUG org.apache.hadoop.hbase.coprocessor.CoprocessorClassLoader: Finding class: com.hadoop.compression.lzo.LzoCodec
2012-09-06 00:44:38,351 ERROR com.hadoop.compression.lzo.GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: Native Library /home/zhuzhuang/hbase/0.94.0-ali-1.0/lib/native/Linux-amd64-64/libgplcompression.so already loaded in another classloader
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1772)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1732)
    at java.lang.Runtime.loadLibrary0(Runtime.java:823)
    at java.lang.System.loadLibrary(System.java:1028)
    at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
    at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:67)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:113)
    at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm$1.getCodec(Compression.java:107)
    at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:243)
    at org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:85)
    at org.apache.hadoop.hbase.regionserver.HRegion.checkCompressionCodecs(HRegion.java:3793)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3782)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3732)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2012-09-06 00:44:38,355 DEBUG org.apache.hadoop.hbase.coprocessor.CoprocessorClassLoader: Skipping exempt class java.io.PrintWriter - delegating directly to parent
2012-09-06 00:44:38,355 ERROR com.hadoop.compression.lzo.LzoCodec: Cannot load native-lzo without native-hadoop

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
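For context on the trace above: the JVM binds a native library to a single class loader, so a codec class whose static initializer calls System.loadLibrary fails when a second loader defines the same class again. The sketch below only illustrates the general idea of handing "exempt" class-name prefixes straight to the parent loader, which the "Skipping exempt class" log line hints at. It is not the committed patch; the class name ExemptPrefixClassLoader and the prefix list are hypothetical.

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Hypothetical child-first loader that delegates exempt class-name prefixes
 * directly to its parent. An illustration only, not CoprocessorClassLoader.
 */
class ExemptPrefixClassLoader extends URLClassLoader {
  private final String[] exemptPrefixes;

  ExemptPrefixClassLoader(URL[] urls, ClassLoader parent, String... exemptPrefixes) {
    super(urls, parent);
    this.exemptPrefixes = exemptPrefixes.clone();
  }

  /** True when the class must always come from the parent loader. */
  boolean isExempt(String className) {
    for (String prefix : exemptPrefixes) {
      if (className.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    synchronized (getClassLoadingLock(name)) {
      if (isExempt(name)) {
        // Exempt classes (for example a codec with a static System.loadLibrary call)
        // are defined once by the parent, so the native library is bound only once.
        Class<?> c = getParent().loadClass(name);
        if (resolve) {
          resolveClass(c);
        }
        return c;
      }
      Class<?> c = findLoadedClass(name);
      if (c == null) {
        try {
          c = findClass(name); // child-first lookup in this loader's URLs
        } catch (ClassNotFoundException e) {
          c = super.loadClass(name, false); // fall back to normal parent delegation
        }
      }
      if (resolve) {
        resolveClass(c);
      }
      return c;
    }
  }
}
```

With a prefix such as "com.hadoop.compression." in the exempt list (an assumption for illustration), every coprocessor loader would see the same LzoCodec class that the parent already initialized.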
[jira] [Commented] (HBASE-7018) Fix and Improve TableDescriptor caching for bulk assignment
[ https://issues.apache.org/jira/browse/HBASE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483172#comment-13483172 ]

Hudson commented on HBASE-7018:
---
Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #233 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/233/])
HBASE-7018 Fix and Improve TableDescriptor caching for bulk assignment (Revision 1401525)
Result = FAILURE
gchanan :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java

Fix and Improve TableDescriptor caching for bulk assignment
---
Key: HBASE-7018
URL: https://issues.apache.org/jira/browse/HBASE-7018
Project: HBase
Issue Type: Bug
Components: regionserver
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Fix For: 0.94.3, 0.96.0
Attachments: 7018-trunk.v2, HBASE-7018-94.patch, HBASE-7018-94-v2.patch, HBASE-7018-94-v3.patch, HBASE-7018-trunk.patch, HBASE-7018-v3-trunk.patch, HBASE-7018-v4-trunk.patch

HBASE-6214 backported HBASE-5998 (Bulk assignment: regionserver optimization by using a temporary cache for table descriptors when receiving an open regions request), but it's buggy on 0.94 (0.96 appears correct):
{code}
HTableDescriptor htd = null;
if (htds == null) {
  htd = this.tableDescriptors.get(region.getTableName());
} else {
  htd = htds.get(region.getTableNameAsString());
  if (htd == null) {
    htd = this.tableDescriptors.get(region.getTableName());
    htds.put(region.getRegionNameAsString(), htd);
  }
}
{code}
i.e. we get the tableName from the map but write the regionName, so the temporary cache never produces a hit.

Even fixing this, it looks like there are areas for improvement:
1) FSTableDescriptors already has a cache (though it goes to the NameNode each time through to check we have the latest copy). May as well combine these two caches; that might be a performance win as well, since we don't need to write to multiple caches.
2) FSTableDescriptors makes two RPCs to the NameNode when it encounters a new table. So the total number of RPCs necessary for a bulk assign is:
(without caching): #regions + #tables
(with caching): min(#regions,#tables) + #tables = #tables + #tables = 2 * #tables
We can make this only one RPC, yielding: #tables

Probably not a big deal for most users, but in a multi-tenant situation where the number of regions being bulk assigned approaches the number of tables being bulk assigned, this could be a nice performance win. Benchmarks coming.
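The key mismatch described above can be shown with a small stand-alone sketch. It assumes nothing about the real HBase classes: TableDescriptorLookup, the Function-based backing store, and the read counter are all hypothetical stand-ins, and the only point is that the temporary map is read and written under the same key (the table name).

```java
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical stand-in for the descriptor lookup during bulk open.
 * D plays the role of HTableDescriptor; backingStore plays the role of
 * this.tableDescriptors.get(...). Neither is the real HBase API.
 */
class TableDescriptorLookup<D> {
  private final Function<String, D> backingStore;
  /** Counts trips to the backing store, to make cache hits observable. */
  int backingStoreReads = 0;

  TableDescriptorLookup(Function<String, D> backingStore) {
    this.backingStore = backingStore;
  }

  /** Looks up a descriptor, using the per-request cache when one is supplied. */
  D lookup(String tableName, Map<String, D> htds) {
    if (htds == null) {
      return read(tableName);
    }
    D htd = htds.get(tableName);
    if (htd == null) {
      htd = read(tableName);
      htds.put(tableName, htd); // same key as the get above, so later regions hit
    }
    return htd;
  }

  private D read(String tableName) {
    backingStoreReads++;
    return backingStore.apply(tableName);
  }
}
```

Opening many regions of one table with a shared map then costs a single backing-store read, which is the behaviour the temporary cache was meant to provide.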
[jira] [Commented] (HBASE-7037) ReplicationPeer logs at WARN level aborting server instead of at FATAL
[ https://issues.apache.org/jira/browse/HBASE-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483173#comment-13483173 ]

Hudson commented on HBASE-7037:
---
Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #233 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/233/])
HBASE-7037 ReplicationPeer logs at WARN level aborting server instead of at FATAL (Revision 1401563)
Result = FAILURE
stack :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeer.java

ReplicationPeer logs at WARN level aborting server instead of at FATAL
--
Key: HBASE-7037
URL: https://issues.apache.org/jira/browse/HBASE-7037
Project: HBase
Issue Type: Bug
Components: Replication
Reporter: stack
Assignee: liang xie
Labels: noob
Fix For: 0.94.3, 0.96.0
Attachments: HBASE-7037.txt
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483325#comment-13483325 ]

Ted Yu commented on HBASE-6721:
---
{code}
+  public LoadBalancer getBalancer() {
+    return balancer;
+  }
{code}
The above method can be package private.
{code}
+public class GroupAdminEndpoint extends BaseEndpointCoprocessor implements GroupAdminProtocol, EventHandler.EventHandlerListener {
+  private static final Log LOG = LogFactory.getLog(GroupAdminClient.class);
{code}
Please add javadoc for the class. The line is beyond 100 characters. Log has wrong class.
{code}
+  private ConcurrentMap<String,String> serversInTransition =
{code}
What does the value in serversInTransition map represent ?
{code}
+    List<HRegionInfo> regions = new ArrayList<HRegionInfo>();
+    if (groupName == null) {
+      throw new NullPointerException("groupName can't be null");
{code}
nit: move ArrayList creation after the if statement.
{code}
+  public Collection<String> listTablesOfGroup(String groupName) throws IOException {
{code}
The return type is a collection, more generic than List that listOnlineRegionsOfGroup() returns. I guess there might be a reason.
{code}
+    HTableDescriptor[] tables = master.getTableDescriptors().getAll().values().toArray(new HTableDescriptor[0]);
{code}
nit: line too long.
{code}
+  public GroupInfo getGroup(String groupName) throws IOException {
{code}
Suggest renaming the method getGroupInfo(). getGroup() is kind of vague.

More reviews to follow.

RegionServer Group based Assignment
---
Key: HBASE-6721
URL: https://issues.apache.org/jira/browse/HBASE-6721
Project: HBase
Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
Fix For: 0.96.0
Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, HBASE-6721-DesigDoc.pdf

In multi-tenant deployments of HBase, it is likely that a RegionServer will be serving out regions from a number of different tables owned by various client applications. Being able to group a subset of running RegionServers and assign specific tables to it, provides a client application a level of isolation and resource allocation. The proposal essentially is to have an AssignmentManager which is aware of RegionServer groups and assigns tables to region servers based on groupings. Load balancing will occur on a per group basis as well. This is essentially a simplification of the approach taken in HBASE-4120. See attached document.
[jira] [Updated] (HBASE-6977) Multithread processing ZK assignment events
[ https://issues.apache.org/jira/browse/HBASE-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-6977:
---
Status: Open (was: Patch Available)

Will address Stack's comments and upload a new patch.

Multithread processing ZK assignment events
---
Key: HBASE-6977
URL: https://issues.apache.org/jira/browse/HBASE-6977
Project: HBase
Issue Type: Improvement
Components: Region Assignment
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Attachments: trunk-6977_v1.patch, trunk-6977_v2-1.patch

Related to HBASE-6976 and HBASE-6611. ZK event processing is a bottleneck for assignments, since there is only one ZK event thread. If we can use multiple threads, it should be better. With multiple threads, the order of events could be messed up. However, if we always pass all events related to one region to the same worker thread, the order should be kept. We need to play with it and find out how much performance improvement we can get.
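The ordering argument in the description (all events for one region go to the same worker, so per-region order is preserved) can be sketched as below. This is only an illustration under assumed names: RegionEventDispatcher and its methods are hypothetical and are not taken from the attached patches.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical dispatcher: each region name maps to one fixed single-threaded
 * worker, so events for a region run in submission order while different
 * regions can be processed in parallel.
 */
class RegionEventDispatcher {
  private final ExecutorService[] workers;

  RegionEventDispatcher(int workerCount) {
    workers = new ExecutorService[workerCount];
    for (int i = 0; i < workerCount; i++) {
      workers[i] = Executors.newSingleThreadExecutor();
    }
  }

  /** Stable mapping from region name to worker index. */
  int workerIndexFor(String regionName) {
    return Math.floorMod(regionName.hashCode(), workers.length);
  }

  /** Queues an event on the worker that owns this region. */
  void submit(String regionName, Runnable event) {
    workers[workerIndexFor(regionName)].execute(event);
  }

  /** Stops accepting events and waits for queued ones; false on timeout or interrupt. */
  boolean shutdownAndWait(long timeoutMillis) {
    for (ExecutorService worker : workers) {
      worker.shutdown();
    }
    try {
      for (ExecutorService worker : workers) {
        if (!worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
          return false;
        }
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

The trade-off of hashing is that a busy region can delay unrelated regions that share its worker; how much that matters is exactly the kind of thing the description says needs measuring.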
[jira] [Updated] (HBASE-6896) sync bulk and regular assigment handling socket timeout exception
[ https://issues.apache.org/jira/browse/HBASE-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-6896:
---
Resolution: Fixed
Status: Resolved (was: Patch Available)

Integrated into trunk. Thanks all for the review.

sync bulk and regular assigment handling socket timeout exception
-
Key: HBASE-6896
URL: https://issues.apache.org/jira/browse/HBASE-6896
Project: HBase
Issue Type: Bug
Components: Region Assignment
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Attachments: trunk-6896.patch, trunk-6896_v2.patch

In regular assignment, on a socket network timeout, it calls openRegion again and again, without changing the region plan or the ZK offline node, until the region is out of transition, as long as the region server is still up. We may need to sync them up and make sure bulk assignment does the same in this case.
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483385#comment-13483385 ]

Ted Yu commented on HBASE-6721:
---
In GroupAdminEndpoint:
{code}
+      throw new IOException(
+          "The region server or the target to move found to be null.");
{code}
It would be nice to point out which parameter is null.
{code}
+        throw new DoNotRetryIOException("Group must have no associated tables.");
{code}
Include group name in the exception message.
{code}
+  public Map<String, String> listServersInTransition() throws IOException {
{code}
Return type of Map includes additional information which is not used by callers. Suggest returning keySet.

Down in GroupAdminClient:
{code}
+      for(String server: proxy.listServersInTransition().keySet()) {
+        found = found || servers.contains(server);
+      }
{code}
Can you tell me what the body is supposed to achieve ?

Back to GroupAdminEndpoint:
{code}
+  private GroupInfoManager getGroupInfoManager() {
+    return ((GroupBasedLoadBalancer)menv.getMasterServices().getAssignmentManager().getBalancer()).getGroupInfoManager();
{code}
Does GroupInfoManager belong to balancer ? The above is probably the longest indirection I have ever seen :-)
{code}
+  private List<HRegionInfo> getOnlineRegions(String hostPort) throws IOException {
{code}
The above method is only called by listOnlineRegionsOfGroup() in a loop over online servers, resulting in nested loop. Please consider collapsing the nested loop into one loop.
{code}
+      LOG.error("Failed to complete GroupMoveServer with of "+h.getPlan().getServers().size()+
{code}
nit: remove ' of ' in above sentence.
[jira] [Commented] (HBASE-6896) sync bulk and regular assigment handling socket timeout exception
[ https://issues.apache.org/jira/browse/HBASE-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483406#comment-13483406 ]

Hudson commented on HBASE-6896:
---
Integrated in HBase-TRUNK #3482 (See [https://builds.apache.org/job/HBase-TRUNK/3482/])
HBASE-6896 sync bulk and regular assigment handling socket timeout exception (Revision 1401744)
Result = FAILURE
jxiang :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
[jira] [Commented] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483438#comment-13483438 ]

Kannan Muthukkaruppan commented on HBASE-6728:
--
Lars wrote: Will this work right if I set scanner caching to 1000 and then deal with 2mb rows? In that case every response will be 2g, and it would always block and never make any progress, right?

Yes, we considered that, and that's the reason for not using a simple counting semaphore that's initialized to the max size. We want the implementation to allow one request to exceed the queue size. We set the default at 1G, but we can exceed the limit by one request's size.

From http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/SizeBasedThrottler.java?view=markup&pathrev=1385058 :
{code}
 * This implementation allows you to set the value of internal
 * counter to be greater than threshold. It happens
 * when internal counter is lower than threshold and
 * increase method is called with parameter 'delta' big enough
 * so that sum of delta and internal counter is greater than
 * threshold. This is not a bug, this is a feature.
 * It solves some problems:
 * - thread calling increase with big parameter will not be
 *   starved by other threads calling increase with small
 *   arguments.
 * - thread calling increase with argument greater than
 *   threshold won't deadlock. This is useful when throttling
 *   queues - you can submit object that is bigger than limit.
 *
 * This implementation introduces small costs in terms of
 * synchronization (no synchronization in most cases at all), but is
 * vulnerable to races. For details see documentation of
 * increase method.
{code}

[89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
---
Key: HBASE-6728
URL: https://issues.apache.org/jira/browse/HBASE-6728
Project: HBase
Issue Type: Bug
Reporter: Kannan Muthukkaruppan
Assignee: Michal Gregorczyk
Fix For: 0.94.3
Attachments: 6728.94, 6728-trunk.txt

The per connection responseQueue is an unbounded queue. The request handler threads today try to send the response in line, but if things start to back up, the response is sent via a per connection responder thread. This intermediate queue, because it has no bounds, can be another source of OOMs. [Have not looked at this issue in trunk. So it may or may not be applicable there.]
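The admission rule quoted in the javadoc (block only while the counter is already at or above the threshold, so a single oversized request can still get in) can be sketched with a plain monitor. This is a simplification for illustration, not the SizeBasedThrottler source: the quoted javadoc says the real class avoids synchronization in most cases, whereas this sketch synchronizes every call, and the class name SimpleSizeThrottler is hypothetical.

```java
/**
 * Hypothetical, fully synchronized sketch of a size-based throttler.
 * A caller is admitted whenever the running total is below the threshold,
 * even if its own delta pushes the total past the threshold.
 */
class SimpleSizeThrottler {
  private final long threshold;
  private long current = 0;

  SimpleSizeThrottler(long threshold) {
    if (threshold <= 0) {
      throw new IllegalArgumentException("threshold must be positive");
    }
    this.threshold = threshold;
  }

  /** Blocks while the total is at or above the threshold, then adds delta. */
  synchronized long increase(long delta) throws InterruptedException {
    while (current >= threshold) {
      wait();
    }
    current += delta; // may exceed the threshold by at most one request's size
    return current;
  }

  /** Subtracts delta and wakes waiters once the total drops below the threshold. */
  synchronized long decrease(long delta) {
    current -= delta;
    if (current < threshold) {
      notifyAll();
    }
    return current;
  }

  synchronized long current() {
    return current;
  }
}
```

A counting semaphore initialized to the threshold would instead make a request larger than the threshold wait forever, which is the deadlock the comment above says the design avoids.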
[jira] [Comment Edited] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483438#comment-13483438 ]

Kannan Muthukkaruppan edited comment on HBASE-6728 at 10/24/12 6:06 PM:

Lars wrote: Will this work right if I set scanner caching to 1000 and then deal with 2mb rows? In that case every response will be 2g, and it would always block and never make any progress, right?

Yes, we considered that, and that's the reason for not using a simple counting semaphore that's initialized to the max size. We want the implementation to allow one request to exceed the queue size. We set the default at 1G, but we can exceed the limit by 1 requests' size amount.

From http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/SizeBasedThrottler.java?view=markup&pathrev=1385058 :
{code}
 * This implementation allows you to set the value of internal
 * counter to be greater than threshold. It happens
 * when internal counter is lower than threshold and
 * increase method is called with parameter 'delta' big enough
 * so that sum of delta and internal counter is greater than
 * threshold. This is not a bug, this is a feature.
 * It solves some problems:
 * - thread calling increase with big parameter will not be
 *   starved by other threads calling increase with small
 *   arguments.
 * - thread calling increase with argument greater than
 *   threshold won't deadlock. This is useful when throttling
 *   queues - you can submit object that is bigger than limit.
 *
 * This implementation introduces small costs in terms of
 * synchronization (no synchronization in most cases at all), but is
 * vulnerable to races. For details see documentation of
 * increase method.
{code}

was (Author: kannanm):

Lars wrote: Will this work right if I set scanner caching to 1000 and then deal with 2mb rows? In that case every response will be 2g, and it would always block and never make any progress, right?

Yes, we considered that, and that's the reason for not using a simple counting semaphore that's initialize to the max size. We want the implementation to allow one request to exceed the queue size. We set the default at 1G, but we can exceed the limit by 1 requests' size amount.

From http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/SizeBasedThrottler.java?view=markup&pathrev=1385058 :
{code}
 * This implementation allows you to set the value of internal
 * counter to be greater than threshold. It happens
 * when internal counter is lower than threshold and
 * increase method is called with parameter 'delta' big enough
 * so that sum of delta and internal counter is greater than
 * threshold. This is not a bug, this is a feature.
 * It solves some problems:
 * - thread calling increase with big parameter will not be
 *   starved by other threads calling increase with small
 *   arguments.
 * - thread calling increase with argument greater than
 *   threshold won't deadlock. This is useful when throttling
 *   queues - you can submit object that is bigger than limit.
 *
 * This implementation introduces small costs in terms of
 * synchronization (no synchronization in most cases at all), but is
 * vulnerable to races. For details see documentation of
 * increase method.
{code}
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483464#comment-13483464 ]

Francis Liu commented on HBASE-6721:

[~yuzhih...@gmail.com]
{quote}
What does the value in serversInTransition map represent?
{quote}
It represents servers that are being moved from one group to another.
{quote}
Can you tell me what the body is supposed to achieve ? Back to GroupAdminEndpoint:
{quote}
Retrieving the balancer during start() returns null. Thus I have to retrieve it lazily as needed.
{quote}
Does GroupInfoManager belong to balancer ? The above is probably the longest indirection I have ever seen
{quote}
We had to do this since we wanted to touch AssignmentManager as little as possible :)
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483471#comment-13483471 ]

Francis Liu commented on HBASE-6721:

{quote}
We had to do this since we didn't want to touch AssignmentManager as much as possible
{quote}
As an alternative, we can add a getBalancer() method to MasterServices. Thoughts?
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483498#comment-13483498 ]
Ted Yu commented on HBASE-6721:
---
{code}
+ public void beforeProcess(EventHandler event) {
{code}
I think normally the above should be called preProcess().
{code}
+ public void afterProcess(EventHandler event) {
{code}
Rename to postProcess().
{code}
+ * Copyright 2011 The Apache Software Foundation
{code}
The above is no longer needed in license header.
{code}
+public interface GroupAdminProtocol extends GroupAdmin, CoprocessorProtocol {
+}
{code}
I wasn't expecting a Protocol to not have methods in it :-)
{code}
+public class GroupBasedLoadBalancer implements LoadBalancer {
{code}
Add javadoc for GroupBasedLoadBalancer.
{code}
+ } catch (IOException e) {
+   LOG.warn("IOException while creating GroupInfoManagerImpl.", e);
+ }
{code}
I think if groupManager cannot be initialized, we should abort master because group policy wouldn't be enforced.
In correctAssignments():
{code}
+ if ((info == null) || (!info.containsServer(sName.getHostAndPort()))) {
+   // Misplaced region.
+   misplacedRegions.add(region);
{code}
Under what scenario would a region be misplaced at runtime ? I think rebalancing misplaced region(s) would affect normal operation of related groups.
{code}
+ //unassign misplaced regions, so that they are assigned to correct groups.
+ this.services.getAssignmentManager().unassign(misplacedRegions);
{code}
RegionServer Group based Assignment
---
Key: HBASE-6721
URL: https://issues.apache.org/jira/browse/HBASE-6721
Project: HBase
Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
Fix For: 0.96.0
Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, HBASE-6721-DesigDoc.pdf
In multi-tenant deployments of HBase, it is likely that a RegionServer will be serving out regions from a number of different tables owned by various client applications. Being able to group a subset of running RegionServers and assign specific tables to it, provides a client application a level of isolation and resource allocation. The proposal essentially is to have an AssignmentManager which is aware of RegionServer groups and assigns tables to region servers based on groupings. Load balancing will occur on a per group basis as well. This is essentially a simplification of the approach taken in HBASE-4120. See attached document.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
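The misplaced-region check quoted from correctAssignments() can be looked at in isolation. The following is a minimal sketch and not the patch's code: `GroupInfo` here is a hypothetical stand-in that only offers `containsServer`, and regions and servers are plain strings rather than HBase types.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class MisplacedRegionSketch {
    /** Hypothetical stand-in for the patch's GroupInfo: a set of host:port strings. */
    static final class GroupInfo {
        private final Set<String> servers = new HashSet<>();

        GroupInfo(String... hostAndPorts) {
            for (String s : hostAndPorts) {
                servers.add(s);
            }
        }

        boolean containsServer(String hostAndPort) {
            return servers.contains(hostAndPort);
        }
    }

    /**
     * Returns the regions whose current server is outside the group that
     * owns the region, or whose group is unknown (null).
     */
    static List<String> findMisplaced(Map<String, String> regionToServer,
                                      Map<String, GroupInfo> regionToGroup) {
        List<String> misplaced = new ArrayList<>();
        for (Map.Entry<String, String> e : regionToServer.entrySet()) {
            GroupInfo info = regionToGroup.get(e.getKey());
            // Same condition as the quoted patch fragment.
            if (info == null || !info.containsServer(e.getValue())) {
                misplaced.add(e.getKey());
            }
        }
        return misplaced;
    }
}
```

A region lands in the result either because its group has no record or because the hosting server was moved out of the group, which is the runtime scenario the review question asks about.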
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483502#comment-13483502 ] Ted Yu commented on HBASE-6721: --- bq. Can you tell me what the body is supposed to achieve ? I was asking about the following line of code: {code} found = found || servers.contains(server); {code} It seems to be condition checking. bq. As an alternative, we can add a getBalancer() Method to MasterServices. That would be better than the current form.
[jira] [Commented] (HBASE-7033) Add hbase.lru.blockcache.acceptable.factor to configuration, akin to the min.factor added by HBASE-6312
[ https://issues.apache.org/jira/browse/HBASE-7033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483523#comment-13483523 ] Sergey Shelukhin commented on HBASE-7033: - wondering if someone could review this... Add hbase.lru.blockcache.acceptable.factor to configuration, akin to the min.factor added by HBASE-6312 --- Key: HBASE-7033 URL: https://issues.apache.org/jira/browse/HBASE-7033 Project: HBase Issue Type: Improvement Components: io Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.96.0 Attachments: HBASE-7033.patch Background: we want to make the change to block cache setting available on 0.94 without actually changing the defaults as was done in HBASE-6312, as this can be destabilizing. Thus, both of these would be configurable instead of just one, and the user would be able to switch to new values.
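To illustrate what "both of these would be configurable" could look like, here is a minimal sketch that reads the two factors from a plain map standing in for the HBase configuration. The min-factor key name is inferred from the issue title, the default values are placeholders for illustration, and the ordering check is an assumption of this sketch rather than something the issue states.

```java
import java.util.Map;

final class BlockCacheFactorSketch {
    // Key named in the issue title.
    static final String ACCEPTABLE_FACTOR_KEY = "hbase.lru.blockcache.acceptable.factor";
    // Assumed key name, inferred from "the min.factor added by HBASE-6312".
    static final String MIN_FACTOR_KEY = "hbase.lru.blockcache.min.factor";

    // Placeholder defaults for illustration only.
    static final float PLACEHOLDER_MIN = 0.75f;
    static final float PLACEHOLDER_ACCEPTABLE = 0.85f;

    static float getFloat(Map<String, String> conf, String key, float dflt) {
        String raw = conf.get(key);
        return raw == null ? dflt : Float.parseFloat(raw.trim());
    }

    /** Returns {min, acceptable}; rejects values outside 0 < min <= acceptable <= 1. */
    static float[] readFactors(Map<String, String> conf) {
        float min = getFloat(conf, MIN_FACTOR_KEY, PLACEHOLDER_MIN);
        float acceptable = getFloat(conf, ACCEPTABLE_FACTOR_KEY, PLACEHOLDER_ACCEPTABLE);
        if (!(min > 0f && min <= acceptable && acceptable <= 1f)) {
            throw new IllegalArgumentException(
                "expected 0 < min <= acceptable <= 1, got min=" + min
                    + " acceptable=" + acceptable);
        }
        return new float[] {min, acceptable};
    }
}
```

With an empty map the placeholders apply, so existing deployments keep their behavior; a user who wants the newer values sets both keys explicitly.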
[jira] [Commented] (HBASE-7040) Port HBASE-5867 Improve Compaction Throttle Default to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483529#comment-13483529 ] Sergey Shelukhin commented on HBASE-7040: - We are trying to port a number of performance and stability improvements from trunk to 0.94 in order to, well, make it more performant and stable :) Understandably it's a balancing act with potential destabilization, so please feel free to -1 if you think it's not worth the risk. Thanks! Port HBASE-5867 Improve Compaction Throttle Default to 0.94 --- Key: HBASE-7040 URL: https://issues.apache.org/jira/browse/HBASE-7040 Project: HBase Issue Type: Task Affects Versions: 0.94.2 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.94.3 Attachments: HBASE-7040.patch Looks like a relatively important (and simple) improvement. Considering porting to 0.94...
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483550#comment-13483550 ]
Elliott Clark commented on HBASE-6721:
--
Just some initial thoughts:
* I couldn't seem to get it to compile for me on 0.94.
* There seem to be a bunch of formatting changes that aren't needed.
* Passing in the preferred server into the load balancer on randomAssignment is messy. If we know the preferred server why call this function at all ?
* The balancer is a public interface and we can't make changes to it in a minor release. And this patch won't apply to trunk.
* With this many interfaces and classes it might make sense to move them into a namespace.
* Why is GroupAdminClient in the master namespace and not in the client namespace ?
* Why a co-processor and not build it in ?
** Security was done that way because it can be added or removed. As the patch is, that's not really possible.
** This makes a lot of changes in core code for something that is a co-processor.
* Don't create a DefaultLoadBalancer in GroupBasedLoadBalancer. The balancer was made pluggable and that feature shouldn't go away.
* Why return ArrayListMultimap from groupRegions in GroupBasedLoadBalancer? Why not the base class ?
* HTableDescriptor seems like the correct location for info about the table if you don't want to put that data into meta.
* Putting things into the filesystem seems like the wrong way to do it. There are just so many different moving parts with getting things from hdfs with caching and cache invalidation, and edge cases on failure.
* There's a lot of logic about balancing bleeding into the AssignmentManager. Right now assignment manager is already too complex. I would much prefer a solution that had everything in the balancer.
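The point about keeping the balancer pluggable can be sketched as follows. This is an illustration under stated assumptions, not the patch's design: `SimpleBalancer`, `RoundRobin`, and `GroupAware` are hypothetical names, and regions and servers are plain strings. The group-aware wrapper receives its inner balancer through the constructor instead of instantiating a fixed default, and it returns the `Map` interface rather than a concrete collection type.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PluggableBalancerSketch {
    /** Hypothetical, much reduced stand-in for a load balancer interface. */
    interface SimpleBalancer {
        Map<String, String> assign(List<String> regions, List<String> servers);
    }

    /** One possible inner balancer; any other implementation can be plugged in. */
    static final class RoundRobin implements SimpleBalancer {
        @Override
        public Map<String, String> assign(List<String> regions, List<String> servers) {
            Map<String, String> plan = new LinkedHashMap<>();
            for (int i = 0; i < regions.size(); i++) {
                plan.put(regions.get(i), servers.get(i % servers.size()));
            }
            return plan;
        }
    }

    /** Applies the injected balancer separately within each group. */
    static final class GroupAware {
        private final SimpleBalancer delegate;

        GroupAware(SimpleBalancer delegate) {
            this.delegate = delegate;
        }

        Map<String, String> assign(Map<String, List<String>> regionsByGroup,
                                   Map<String, List<String>> serversByGroup) {
            Map<String, String> plan = new LinkedHashMap<>();
            for (Map.Entry<String, List<String>> e : regionsByGroup.entrySet()) {
                List<String> servers = serversByGroup.get(e.getKey());
                if (servers == null || servers.isEmpty()) {
                    continue; // no server in this group: leave its regions unassigned
                }
                plan.putAll(delegate.assign(e.getValue(), servers));
            }
            return plan;
        }
    }
}
```

Because the inner balancer is a constructor argument, swapping it out requires no change to the group-aware class.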
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483562#comment-13483562 ] Ted Yu commented on HBASE-6721: --- bq. There's a lot of logic about balancing bleeding into the AssignmentManager. Looking at the changes in AssignmentManager, they are mostly white space removal. There is only one real change: {code} + public LoadBalancer getBalancer() { +return balancer; + } {code} which Francis agrees to move out.
[jira] [Created] (HBASE-7045) Add some comments to MVCC code
Gregory Chanan created HBASE-7045: - Summary: Add some comments to MVCC code Key: HBASE-7045 URL: https://issues.apache.org/jira/browse/HBASE-7045 Project: HBase Issue Type: Task Components: Transactions/MVCC Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Fix For: 0.96.0 I've been digging through the MVCC/transaction code and adding some comments to help me (or others) understand quicker the next time through
[jira] [Updated] (HBASE-7045) Add some comments to MVCC code
[ https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan updated HBASE-7045: -- Attachment: HBASE-7045.patch
[jira] [Updated] (HBASE-7045) Add some comments to MVCC code
[ https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan updated HBASE-7045: -- Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-6371) [89-fb] Tier based compaction
[ https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-6371: Attachment: HBASE-6371-089fb-commit.patch I am attaching the 0.89fb commit for reference. The commit hash is 1b3e7bb4df1ed05d7d268cb90ffc23f5955c4398 [89-fb] Tier based compaction - Key: HBASE-6371 URL: https://issues.apache.org/jira/browse/HBASE-6371 Project: HBase Issue Type: Improvement Reporter: Akashnil Assignee: Liyin Tang Labels: noob Attachments: HBASE-6371-089fb-commit.patch Currently, the compaction selection is not very flexible and is not sensitive to the hotness of the data. Very old data is likely to be accessed less, and very recent data is likely to be in the block cache. Both of these considerations make it inefficient to compact these files as aggressively as other files. In some use-cases, the access-pattern is particularly obvious even though there is no way to control the compaction algorithm in those cases. In the new compaction selection algorithm, we plan to divide the candidate files into different levels according to oldness of the data that is present in those files. For each level, parameters like compaction ratio, minimum number of store-files in each compaction may be different. Number of levels, time-ranges, and parameters for each level will be configurable online on a per-column family basis.
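The idea of dividing candidate files into levels by data age, with per-level parameters, can be sketched as below. This is not the attached 0.89-fb commit: the `Tier` class, its fields, and the eligibility rule (a tier qualifies once it holds at least its minimum number of files) are assumptions made for illustration, and files are represented only by their age in milliseconds.

```java
import java.util.ArrayList;
import java.util.List;

final class TierSelectionSketch {
    /** Hypothetical per-tier settings; tiers are ordered youngest to oldest. */
    static final class Tier {
        final long maxAgeMillis;      // upper age bound of this tier
        final double compactionRatio; // carried per tier, not used in this sketch
        final int minFiles;           // minimum files before the tier is considered

        Tier(long maxAgeMillis, double compactionRatio, int minFiles) {
            this.maxAgeMillis = maxAgeMillis;
            this.compactionRatio = compactionRatio;
            this.minFiles = minFiles;
        }
    }

    /** Index of the first tier whose bound covers the age; the last tier catches the rest. */
    static int tierFor(long ageMillis, List<Tier> tiers) {
        for (int i = 0; i < tiers.size(); i++) {
            if (ageMillis <= tiers.get(i).maxAgeMillis) {
                return i;
            }
        }
        return tiers.size() - 1;
    }

    /** Groups file ages into one bucket per tier, preserving input order. */
    static List<List<Long>> bucketByTier(List<Long> fileAgesMillis, List<Tier> tiers) {
        List<List<Long>> buckets = new ArrayList<>();
        for (int i = 0; i < tiers.size(); i++) {
            buckets.add(new ArrayList<Long>());
        }
        for (long age : fileAgesMillis) {
            buckets.get(tierFor(age, tiers)).add(age);
        }
        return buckets;
    }

    static boolean eligible(List<Long> bucket, Tier tier) {
        return bucket.size() >= tier.minFiles;
    }
}
```

Giving the oldest tier a larger `minFiles` is one way to compact rarely read data less aggressively, in line with the motivation in the description.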
[jira] [Commented] (HBASE-7040) Port HBASE-5867 Improve Compaction Throttle Default to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483613#comment-13483613 ] Lars Hofhansl commented on HBASE-7040: -- Going to commit this today or tomorrow unless I hear objections.
[jira] [Commented] (HBASE-7040) Port HBASE-5867 Improve Compaction Throttle Default to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483612#comment-13483612 ] Lars Hofhansl commented on HBASE-7040: -- As I said, I like the patch :) I think we should commit it. Just wondered whether you had a specific reason.
[jira] [Commented] (HBASE-6371) [89-fb] Tier based compaction
[ https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483616#comment-13483616 ] Lars Hofhansl commented on HBASE-6371: -- This is from before there were coprocessors, interesting.
[jira] [Commented] (HBASE-7045) Add some comments to MVCC code
[ https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483621#comment-13483621 ]
Hadoop QA commented on HBASE-7045:
--
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550682/HBASE-7045.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 82 warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3139//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3139//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3139//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3139//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3139//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3139//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3139//console
This message is automatically generated.
[jira] [Commented] (HBASE-7045) Add some comments to MVCC code
[ https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483633#comment-13483633 ] Elliott Clark commented on HBASE-7045: -- +1 Nit: {code}To complete the WriteEntry, call {@link #completeMemstoreInsert(WriteEntry)}.{code} Maybe say: To complete and wait for it to be visible, call completeMemstoreInsert.
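The distinction in the nit above, completing a write versus waiting for it to become visible, can be shown with a small model. This is a simplified sketch written for illustration and not HBase's MVCC class: the method names are modeled loosely on the one quoted in the comment, and the rule that the read point only advances over a contiguous prefix of completed writes is an assumption of this model.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class MvccSketch {
    static final class WriteEntry {
        final long writeNumber;
        boolean completed;

        WriteEntry(long writeNumber) {
            this.writeNumber = writeNumber;
        }
    }

    private final Deque<WriteEntry> writeQueue = new ArrayDeque<>();
    private long nextWriteNumber = 1;
    private long readPoint = 0;

    /** Starts a write and hands back its entry; the write is not yet visible. */
    synchronized WriteEntry beginMemstoreInsert() {
        WriteEntry e = new WriteEntry(nextWriteNumber++);
        writeQueue.addLast(e);
        return e;
    }

    /**
     * Marks the entry complete and moves the read point over every completed
     * entry at the head of the queue. Returns whether this entry is now visible.
     * Does not block.
     */
    synchronized boolean advanceMemstore(WriteEntry e) {
        e.completed = true;
        while (!writeQueue.isEmpty() && writeQueue.peekFirst().completed) {
            readPoint = writeQueue.removeFirst().writeNumber;
        }
        notifyAll();
        return readPoint >= e.writeNumber;
    }

    /** Completes the entry and blocks until readers can see it. */
    synchronized void completeMemstoreInsert(WriteEntry e) {
        advanceMemstore(e);
        try {
            while (readPoint < e.writeNumber) {
                wait(); // an earlier write is still in flight
            }
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for visibility", ie);
        }
    }

    synchronized long memstoreReadPoint() {
        return readPoint;
    }
}
```

In this model a write that completes out of order is marked done but stays invisible until every earlier write has completed, which is why "complete" and "wait for it to be visible" are two separate steps worth naming in the javadoc.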
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483642#comment-13483642 ] Francis Liu commented on HBASE-6721: {quote} I was asking about the following line of code: found = found || servers.contains(server); It seems to be condition checking. {quote} Yeah it's basically checking if the list of servers to be moved is already in the ServersInTransition list, meaning it is already being moved so we shouldn't allow that.
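The check described above can be written out as a small, self-contained sketch. The class and parameter names are hypothetical and servers are plain strings; which of the two collections the patch iterates over is not stated in the thread, so the loop direction here is an assumption.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Set;

final class ServersInTransitionSketch {
    /** Loop form, mirroring the quoted "found = found || ..." accumulation. */
    static boolean anyInTransition(Collection<String> serversToMove,
                                   Set<String> serversInTransition) {
        boolean found = false;
        for (String server : serversToMove) {
            found = found || serversInTransition.contains(server);
        }
        return found;
    }

    /** Equivalent check using the standard library. */
    static boolean anyInTransitionDisjoint(Collection<String> serversToMove,
                                           Set<String> serversInTransition) {
        return !Collections.disjoint(serversToMove, serversInTransition);
    }
}
```

A move request would then be rejected when either method returns true, since at least one requested server is already being moved.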
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483645#comment-13483645 ] Elliott Clark commented on HBASE-6721: -- {quote}Looking at the changes in AssignmentManager, they are mostly white space removal. There is only one real change:{quote} You're missing the change where null plans are no longer queued which comes about because of this patch.
[jira] [Created] (HBASE-7046) Fix resource leak in TestHLogSplit#testOldRecoveredEditsFileSidelined
Himanshu Vashishtha created HBASE-7046: -- Summary: Fix resource leak in TestHLogSplit#testOldRecoveredEditsFileSidelined Key: HBASE-7046 URL: https://issues.apache.org/jira/browse/HBASE-7046 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.96.0 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Fix For: 0.96.0 This method creates a writer but never closes one.
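The kind of leak reported here is usually avoided by closing the writer in a finally block. The sketch below shows the pattern with a hypothetical `TrackingWriter` standing in for the real writer used by the test; it does not reproduce the test's code.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

final class WriterCloseSketch {
    /** Hypothetical writer that records whether close() was called. */
    static final class TrackingWriter implements Closeable {
        boolean closed;
        int appended;

        void append(String edit) throws IOException {
            if (closed) {
                throw new IOException("writer already closed");
            }
            appended++;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Writes all edits and always closes the writer, even if an append fails. */
    static TrackingWriter writeEdits(String... edits) {
        TrackingWriter writer = new TrackingWriter();
        try {
            for (String edit : edits) {
                writer.append(edit);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            writer.close(); // runs on both the normal and the exceptional path
        }
        return writer;
    }
}
```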
[jira] [Created] (HBASE-7047) [snapshots] Refactor error handling to use javax.management
Jesse Yates created HBASE-7047: -- Summary: [snapshots] Refactor error handling to use javax.management Key: HBASE-7047 URL: https://issues.apache.org/jira/browse/HBASE-7047 Project: HBase Issue Type: Sub-task Affects Versions: hbase-6055 Reporter: Jesse Yates Fix For: hbase-6055 The current error handling framework introduced in HBASE-6571 adds a lot of complexity for what is essentially a solved problem. Specifically, cross-thread notifications have been generalized for the JMX tooling in the javax.management classes. Similar to what we developed, they have a NotificationBroadcaster, NotificationListener, etc. though these are interfaces rather than general classes. These javax classes can be used almost 1-to-1 as replacements for things like the ExceptionOrchestrator and ExceptionListener. This also gives us the opportunity to easily add primitive notifications for standard HBase things like (1) timeouts, (2) aborts, and (3) server stops since the framework already considers things like typed notifications.
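As a rough illustration of the javax.management classes mentioned above, the sketch below sends typed notifications through a `NotificationBroadcasterSupport` to a `NotificationListener`. The type strings and the `RecordingListener` class are hypothetical and not taken from any attached patch; only the javax.management calls themselves are standard JDK API.

```java
import java.util.ArrayList;
import java.util.List;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;
import javax.management.NotificationListener;

final class SnapshotNotificationSketch {
    // Hypothetical notification types for the three cases named in the issue.
    static final String TIMEOUT = "hbase.snapshot.timeout";
    static final String ABORT = "hbase.snapshot.abort";
    static final String STOP = "hbase.snapshot.stop";

    /** Listener that simply records the type of every notification it receives. */
    static final class RecordingListener implements NotificationListener {
        final List<String> seenTypes = new ArrayList<>();

        @Override
        public void handleNotification(Notification notification, Object handback) {
            seenTypes.add(notification.getType());
        }
    }

    static List<String> demo() {
        // With the default constructor, listeners are invoked in the sending thread.
        NotificationBroadcasterSupport hub = new NotificationBroadcasterSupport();
        RecordingListener listener = new RecordingListener();
        hub.addNotificationListener(listener, null, null); // no filter, no handback

        long sequence = 0;
        hub.sendNotification(new Notification(TIMEOUT, "snapshot-1", ++sequence, "timed out"));
        hub.sendNotification(new Notification(STOP, "snapshot-1", ++sequence, "server stopping"));
        return listener.seenTypes;
    }
}
```

A listener interested in only one kind of event could pass a `NotificationFilter` as the second argument instead of null.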
[jira] [Assigned] (HBASE-7047) [snapshots] Refactor error handling to use javax.management
[ https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates reassigned HBASE-7047: -- Assignee: Jesse Yates [snapshots] Refactor error handling to use javax.management --- Key: HBASE-7047 URL: https://issues.apache.org/jira/browse/HBASE-7047 Project: HBase Issue Type: Sub-task Components: Client, master, regionserver, snapshots, Zookeeper Affects Versions: hbase-6055 Reporter: Jesse Yates Assignee: Jesse Yates Fix For: hbase-6055
[jira] [Updated] (HBASE-7047) [snapshots] Refactor error handling to use javax.management
[ https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-7047: --- Attachment: java_6667-v0.txt Attaching simple version that refactors the error handling (removing excess classes/tests). This patch also modifies the current implementation of the offline snapshots (HBASE-6863) to use the new classes - slight tweaks, but nothing too crazy.
[jira] [Updated] (HBASE-7047) [snapshots] Refactor error handling to use javax.management
[ https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-7047: --- Attachment: hbase-7047-v0-adv.patch Attaching 'advanced' version of v0 that does some more advanced refactoring of the offline snapshot handler to take advantage of the new framework. Specifically, uses a centralized notification 'hub' to track the running handler and then uses the added StopNotification to pass a 'stop' update to the running DisabledTableSnapshotHandler. This is really nice in that it is basically zero overhead to running multiple snapshots or adapting for stopping a running snapshot and any restores.
[jira] [Commented] (HBASE-7047) [snapshots] Refactor error handling to use javax.management
[ https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483688#comment-13483688 ] Ted Yu commented on HBASE-7047: --- Looking at java_6667-v0.txt, I don't see javax.management classes being used.
[jira] [Updated] (HBASE-7047) [snapshots] Refactor error handling to use javax.management
[ https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-7047: --- Attachment: (was: java_6667-v0.txt)
[jira] [Commented] (HBASE-7047) [snapshots] Refactor error handling to use javax.management
[ https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483689#comment-13483689 ] Jesse Yates commented on HBASE-7047: [~te...@apache.org] whoops, wrong patch. Let's try that again.
[jira] [Updated] (HBASE-7047) [snapshots] Refactor error handling to use javax.management
[ https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-7047: --- Attachment: hbase-7047-v0.patch Attaching the correct version of the 'basic' refactor.
[jira] [Commented] (HBASE-7045) Add some comments to MVCC code
[ https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483696#comment-13483696 ] Gregory Chanan commented on HBASE-7045: --- Thanks for the review. Will do what you suggest on commit. Add some comments to MVCC code -- Key: HBASE-7045 URL: https://issues.apache.org/jira/browse/HBASE-7045 Project: HBase Issue Type: Task Components: Transactions/MVCC Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Fix For: 0.96.0 Attachments: HBASE-7045.patch I've been digging through the MVCC/transaction code and adding some comments to help me (or others) understand it more quickly the next time through.
[jira] [Updated] (HBASE-7045) Add some comments to MVCC code
[ https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan updated HBASE-7045: -- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk.
[jira] [Created] (HBASE-7048) Regionsplitter requires the hadoop config path to be in hbase classpath
Ted Yu created HBASE-7048: - Summary: Regionsplitter requires the hadoop config path to be in hbase classpath Key: HBASE-7048 URL: https://issues.apache.org/jira/browse/HBASE-7048 Project: HBase Issue Type: Bug Affects Versions: 0.92.2 Reporter: Ted Yu Fix For: 0.94.3, 0.96.0 When the Hadoop config path isn't included in the HBase classpath, you will get the following:
{code}
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://t3.e.com/hbase/usertable/_balancedSplit, expected: file:///
	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:454)
	at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:67)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:431)
	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:301)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1005)
	at org.apache.hadoop.hbase.util.RegionSplitter.getSplits(RegionSplitter.java:643)
	at org.apache.hadoop.hbase.util.RegionSplitter.rollingSplit(RegionSplitter.java:367)
	at org.apache.hadoop.hbase.util.RegionSplitter.main(RegionSplitter.java:295)
{code}
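A note on why this exception appears: without the Hadoop configuration on the classpath, the default filesystem presumably falls back to Hadoop's built-in default of file:///, so a local filesystem object ends up being asked about an hdfs:// path. The sketch below is a simplified, JDK-only imitation of that scheme check. It is not Hadoop's actual FileSystem.checkPath code, and the class and method names are hypothetical.

```java
import java.net.URI;

public class WrongFsSketch {
    /**
     * Simplified stand-in for a filesystem's path check: a path that names a
     * scheme must name the same scheme as the filesystem it is handed to.
     */
    static void checkPath(URI fileSystemUri, URI path) {
        String scheme = path.getScheme();
        if (scheme != null && !scheme.equalsIgnoreCase(fileSystemUri.getScheme())) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fileSystemUri);
        }
    }

    /** Reproduces the mismatch: local default filesystem, HDFS split file. */
    static String demo() {
        URI localDefault = URI.create("file:///");
        URI splitFile = URI.create("hdfs://t3.e.com/hbase/usertable/_balancedSplit");
        try {
            checkPath(localDefault, splitFile);
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```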
[jira] [Commented] (HBASE-6371) [89-fb] Tier based compaction
[ https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483729#comment-13483729 ] Sergey Shelukhin commented on HBASE-6371: - I think the updateConfiguration mechanism from this patch deserves a separate commit; it's more generic than this change (I hope) and will have to be changed to protobufs. I will create a JIRA. [89-fb] Tier based compaction - Key: HBASE-6371 URL: https://issues.apache.org/jira/browse/HBASE-6371 Project: HBase Issue Type: Improvement Reporter: Akashnil Assignee: Liyin Tang Labels: noob Attachments: HBASE-6371-089fb-commit.patch Currently, the compaction selection is not very flexible and is not sensitive to the hotness of the data. Very old data is likely to be accessed less, and very recent data is likely to be in the block cache. Both of these considerations make it inefficient to compact these files as aggressively as other files. In some use-cases, the access-pattern is particularly obvious even though there is no way to control the compaction algorithm in those cases. In the new compaction selection algorithm, we plan to divide the candidate files into different levels according to the oldness of the data that is present in those files. For each level, parameters like the compaction ratio and the minimum number of store-files in each compaction may be different. The number of levels, time-ranges, and parameters for each level will be configurable online on a per-column family basis.
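To make the tiering idea in the description concrete, here is a minimal sketch of dividing candidate files into levels by data age and applying a per-level minimum file count. It is an illustration under assumed names and units (hours, tierFor, partition, eligible), not the selection algorithm from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

public class TierSketch {
    /** Returns the tier index for a file whose data is ageHours old. */
    static int tierFor(long ageHours, long[] tierMaxAgeHours) {
        for (int i = 0; i < tierMaxAgeHours.length; i++) {
            if (ageHours <= tierMaxAgeHours[i]) {
                return i;
            }
        }
        return tierMaxAgeHours.length; // last tier is open-ended (oldest data)
    }

    /** Groups file ages into tiers; tier 0 holds the most recent data. */
    static List<List<Long>> partition(long[] fileAgesHours, long[] tierMaxAgeHours) {
        List<List<Long>> tiers = new ArrayList<>();
        for (int i = 0; i <= tierMaxAgeHours.length; i++) {
            tiers.add(new ArrayList<>());
        }
        for (long age : fileAgesHours) {
            tiers.get(tierFor(age, tierMaxAgeHours)).add(age);
        }
        return tiers;
    }

    /** Hypothetical per-tier rule: only compact when enough files are present. */
    static boolean eligible(List<Long> tier, int minFiles) {
        return tier.size() >= minFiles;
    }

    public static void main(String[] args) {
        // Tier boundaries at one day and one week (assumed values).
        List<List<Long>> tiers = partition(new long[] {1, 5, 30, 400}, new long[] {24, 168});
        System.out.println(tiers);
        System.out.println(eligible(tiers.get(0), 2));
    }
}
```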
[jira] [Updated] (HBASE-5257) Allow filter to be evaluated after version handling
[ https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Sharma updated HBASE-5257: Attachment: HBASE-5257-0.92.txt The attached patch passed unit tests for HBase 0.92 - HBase 0.95-snapshot will need a different patch. Allow filter to be evaluated after version handling --- Key: HBASE-5257 URL: https://issues.apache.org/jira/browse/HBASE-5257 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Attachments: HBASE-5257-0.92.txt There are various use cases and filter types where evaluating the filter before versions are handled either does not make sense or makes filter handling more complicated. Also see this comment in ScanQueryMatcher:
{code}
/**
 * Filters should be checked before checking column trackers. If we do
 * otherwise, as was previously being done, ColumnTracker may increment its
 * counter for even that KV which may be discarded later on by Filter. This
 * would lead to incorrect results in certain cases.
 */
{code}
So we had Filters after the column trackers (which do the version checking), and then moved them. The order should be at the discretion of the Filter. We could either add a new method to FilterBase (maybe excludeVersions() or something), or have a new Filter wrapper (like WhileMatchFilter) that should only be used as the outermost filter and indicates the same (maybe ExcludeVersionsFilter). See the latest comments on HBASE-5229 for motivation.
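The difference between the two orderings discussed in the description is easiest to see on a toy example. The sketch below models one column's values (newest first) with plain strings and a predicate. It does not use HBase's Filter or ColumnTracker classes, and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class FilterOrderSketch {
    /** Filter first: only cells that pass the filter count toward maxVersions. */
    static List<String> filterThenVersions(List<String> newestFirst,
                                           Predicate<String> filter, int maxVersions) {
        List<String> out = new ArrayList<>();
        for (String value : newestFirst) {
            if (filter.test(value) && out.size() < maxVersions) {
                out.add(value);
            }
        }
        return out;
    }

    /** Versions first: the newest maxVersions cells are chosen, then filtered. */
    static List<String> versionsThenFilter(List<String> newestFirst,
                                           Predicate<String> filter, int maxVersions) {
        List<String> out = new ArrayList<>();
        int counted = 0;
        for (String value : newestFirst) {
            if (counted++ >= maxVersions) {
                break;
            }
            if (filter.test(value)) {
                out.add(value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> cells = Arrays.asList("c", "b", "a"); // newest version first
        Predicate<String> notC = v -> !v.equals("c");
        System.out.println(filterThenVersions(cells, notC, 1)); // older version surfaces
        System.out.println(versionsThenFilter(cells, notC, 1)); // nothing is returned
    }
}
```

With the filter applied first, an older version can surface when the newest one is rejected; with version handling first, the column simply drops out of the result. Which behaviour is right depends on the filter, which is the point the issue makes.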
[jira] [Commented] (HBASE-5257) Allow filter to be evaluated after version handling
[ https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483733#comment-13483733 ] Olson,Andrew commented on HBASE-5257: - I will be out of the office with limited access to email until Monday, 10/29/2012. For urgent issues please contact Greg Whitsitt. Andrew Olson | Sr. Software Architect | Cerner Corporation | 816.201.3825 | aols...@cerner.com | www.cerner.com
[jira] [Created] (HBASE-7049) add dynamic configuration update mechanism
Sergey Shelukhin created HBASE-7049: --- Summary: add dynamic configuration update mechanism Key: HBASE-7049 URL: https://issues.apache.org/jira/browse/HBASE-7049 Project: HBase Issue Type: Improvement Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Initial draft will be modeled on 0.89-fb changes; see HBASE-6371
[jira] [Commented] (HBASE-7049) add dynamic configuration update mechanism
[ https://issues.apache.org/jira/browse/HBASE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483736#comment-13483736 ] Ted Yu commented on HBASE-7049: --- @Sergey: Have you seen HBASE-5335? It is already in trunk.
[jira] [Updated] (HBASE-5257) Allow filter to be evaluated after version handling
[ https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-5257: -- Attachment: 5257-trunk.txt Patch for trunk. TestColumnPaginationFilter, TestFilter and TestFilterList passed.
[jira] [Updated] (HBASE-5257) Allow filter to be evaluated after version handling
[ https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-5257: -- Fix Version/s: 0.96.0 Status: Patch Available (was: Reopened)
[jira] [Commented] (HBASE-7049) add dynamic configuration update mechanism
[ https://issues.apache.org/jira/browse/HBASE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483749#comment-13483749 ] Sergey Shelukhin commented on HBASE-7049: - There's also a design option here. The current approach in the patch is an explicit admin command that is propagated to all the requisite objects and causes them to re-read the configuration they are interested in from disk. I personally prefer the approach where the act of replacing the file (or adding an override file) causes the service configuration to be updated automatically inside the configuration object itself. Callers never cache values from config during init; the config object does the caching on init or on the first request for a value (and again on config file change). The code thus calls conf.getLong(MyCoolValue) every time (or once per method call, per compaction, per request, ...) and gets the most recent value. For special cases, it's easy to add a mechanism to get several values atomically and, for the most special cases, a change callback. This avoids adding code to propagate config to various places and to handle updates there, and avoids the non-atomicity of copying the files and then updating config via an admin command. Are there opinions for either approach?
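A rough sketch of the second approach described above, where the configuration object itself notices that its file changed and callers simply ask for the value each time. This is an assumption-laden illustration: it uses a java.util.Properties file rather than HBase's XML configuration, detects changes by comparing file modification times, and every name in it (ReloadingConfSketch, my.cool.value) is hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Properties;

public class ReloadingConfSketch {
    private final Path file;
    private Properties values = new Properties();
    private FileTime loadedAt = null; // mtime of the file when last loaded

    ReloadingConfSketch(Path file) {
        this.file = file;
    }

    /** Callers do not cache the result; they call this whenever they need the value. */
    synchronized long getLong(String key, long defaultValue) throws IOException {
        FileTime modified = Files.getLastModifiedTime(file);
        if (!modified.equals(loadedAt)) {
            // File is new or was replaced: reload everything in one step.
            Properties fresh = new Properties();
            try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                fresh.load(in);
            }
            values = fresh;
            loadedAt = modified;
        }
        String raw = values.getProperty(key);
        return raw == null ? defaultValue : Long.parseLong(raw.trim());
    }

    /** Reads a value, rewrites the file, and reads again through the same object. */
    static long[] demo() throws IOException {
        Path file = Files.createTempFile("conf-sketch", ".properties");
        try {
            Files.write(file, "my.cool.value=10\n".getBytes(StandardCharsets.UTF_8));
            ReloadingConfSketch conf = new ReloadingConfSketch(file);
            long before = conf.getLong("my.cool.value", -1);

            Files.write(file, "my.cool.value=20\n".getBytes(StandardCharsets.UTF_8));
            // Bump the mtime explicitly so the demo does not depend on timestamp granularity.
            Files.setLastModifiedTime(file, FileTime.fromMillis(System.currentTimeMillis() + 10_000));
            long after = conf.getLong("my.cool.value", -1);
            return new long[] {before, after};
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] result = demo();
        System.out.println(result[0] + " -> " + result[1]);
    }
}
```

A production version would need a cheaper change check than a stat call per lookup, plus the atomic multi-value read and change callback the comment mentions; those are left out here.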
[jira] [Commented] (HBASE-7049) add dynamic configuration update mechanism
[ https://issues.apache.org/jira/browse/HBASE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483750#comment-13483750 ] Sergey Shelukhin commented on HBASE-7049: - ah, nevermind then :)
[jira] [Commented] (HBASE-5257) Allow filter to be evaluated after version handling
[ https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483757#comment-13483757 ] Varun Sharma commented on HBASE-5257: - [~te...@apache.org] Thanks for patching this against 0.96 - should we also be submitting to 0.92/0.94? We are using the 0.92 version of HBase.
[jira] [Commented] (HBASE-6371) [89-fb] Tier based compaction
[ https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483759#comment-13483759 ] Nicolas Spiegelberg commented on HBASE-6371: @Lars: you are correct about doing a better job of partitioning newly-written and stale data. With leveled compaction, the different tiers end up implicitly becoming different age groups. This was the primary motivation for us. Also note that we are looking into coprocessor-based compactions, but will probably use that for TSDB-style compactions and other stuff that is more niche and may not belong in the core.
[jira] [Commented] (HBASE-7049) add dynamic configuration update mechanism
[ https://issues.apache.org/jira/browse/HBASE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483762#comment-13483762 ] Sergey Shelukhin commented on HBASE-7049: - hmm, I am actually referring to XML config settings, rather than column/etc. Do you mean HBASE-3909?
[jira] [Updated] (HBASE-7049) add dynamic HBase xml configuration update mechanism
[ https://issues.apache.org/jira/browse/HBASE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-7049: Summary: add dynamic HBase xml configuration update mechanism (was: add dynamic configuration update mechanism)
[jira] [Resolved] (HBASE-7049) add dynamic HBase xml configuration update mechanism
[ https://issues.apache.org/jira/browse/HBASE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-7049. - Resolution: Duplicate
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483768#comment-13483768 ] Francis Liu commented on HBASE-6721: Thanks for the comments [~eclark]
{quote} I couldn't seem to get it to compile for me on 0.94 {quote}
Did you apply the patches attached in the subtasks prior to applying this patch? If you'd like them all in one single patch I can do that as well.
{quote} There seem to be a bunch of formatting changes that aren't needed. {quote}
Will clean that up in the next update.
{quote} Passing in the preferred server into the load balancer on randomAssignment is messy. If we know the preferred server why call this function at all? {quote}
Good point, will remove that argument.
{quote} The balancer is a public interface and we can't make changes to it in a minor release. And this patch won't apply to trunk. {quote}
I see. We can at least make it binary compatible by supporting both interfaces, if you're amenable to that. We're planning on getting 0.94 into production and it'd be great if we didn't have a lot of custom patches on top of it.
{quote} With this many interfaces and classes it might make sense to move them into a namespace. {quote}
Will look into doing this. My main concern is whether there are any dependencies on package-private methods.
{quote} Why a co-processor and not build it in? Security was done that way because it can be added or removed. As the patch is, that's not really possible. This makes a lot of changes in core code for something that is a co-processor. {quote}
As part of the design, HBase should run fine without the group-based classes enabled (endpoint, balancer, etc). If that is not the case, then that's a bug. As for the code changes in core code: some may be unavoidable, but we could probably still make it less invasive (i.e. remove the EventHandler changes). Having said that, I don't mind if the community would like to have this fully integrated into HBase, just let us know.
{quote} Don't create a DefaultLoadBalancer in GroupBasedLoadBalancer. The balancer was made pluggable and that feature shouldn't go away. {quote}
The balancer is still pluggable; it's just not pluggable for the GroupBasedLoadBalancer. It should be OK to make that pluggable as well, though.
{quote} HTableDescriptor seems like the correct location for info about the table if you don't want to put that data into meta. {quote}
Yes, we have group affiliation stored as a table property, though group information is stored on HDFS.
{quote} putting things into the filesystem seems like the wrong way to do it. There are just so many different moving parts with getting things from hdfs with caching and cache invalidation, and edge cases on failure. {quote}
I see, where do you suggest we put it? ZooKeeper? We mainly had it in HDFS since ZK seemed to be the place to store only ephemeral data. Putting the data in tables would be a lot more complex and would require more core code changes.
RegionServer Group based Assignment --- Key: HBASE-6721 URL: https://issues.apache.org/jira/browse/HBASE-6721 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Vandana Ayyalasomayajula Fix For: 0.96.0 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, HBASE-6721-DesigDoc.pdf In multi-tenant deployments of HBase, it is likely that a RegionServer will be serving out regions from a number of different tables owned by various client applications. Being able to group a subset of running RegionServers and assign specific tables to it provides a client application a level of isolation and resource allocation. The proposal essentially is to have an AssignmentManager which is aware of RegionServer groups and assigns tables to region servers based on groupings. Load balancing will occur on a per group basis as well. This is essentially a simplification of the approach taken in HBASE-4120. See attached document.
[jira] [Updated] (HBASE-5257) Allow filter to be evaluated after version handling
[ https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Sharma updated HBASE-5257:

    Attachment: HBASE-5257-0.94.txt

Patch for 0.94

Allow filter to be evaluated after version handling

Key: HBASE-5257
URL: https://issues.apache.org/jira/browse/HBASE-5257
Project: HBase
Issue Type: Improvement
Reporter: Lars Hofhansl
Fix For: 0.96.0
Attachments: 5257-trunk.txt, HBASE-5257-0.92.txt, HBASE-5257-0.94.txt

There are various use cases and filter types where evaluating the filter before versions are handled either does not make sense or makes filter handling more complicated.
Also see this comment in ScanQueryMatcher:
{code}
/**
 * Filters should be checked before checking column trackers. If we do
 * otherwise, as was previously being done, ColumnTracker may increment its
 * counter for even that KV which may be discarded later on by Filter. This
 * would lead to incorrect results in certain cases.
 */
{code}
So we had Filters after the column trackers (which do the version checking), and then moved them.
This should be at the discretion of the Filter.
We could either add a new method to FilterBase (maybe excludeVersions() or something), or have a new Filter wrapper (like WhileMatchFilter) that should only be used as the outermost filter and indicates the same (maybe ExcludeVersionsFilter).
See latest comments on HBASE-5229 for motivation.
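To make the two orderings in the description concrete, here is a minimal toy model. It is only a sketch under stated assumptions: it does not use ScanQueryMatcher, ColumnTracker, or any HBase API, the class and method names are hypothetical, and a column is reduced to a list of values ordered newest first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Toy model only: none of these names are HBase classes. */
final class FilterOrderSketch {

    /**
     * Filter first, then count versions: a rejected version never reaches the
     * version counter, so it does not use up the maxVersions budget.
     */
    static List<String> filterBeforeVersions(List<String> newestFirst,
                                             Predicate<String> filter,
                                             int maxVersions) {
        List<String> out = new ArrayList<>();
        for (String version : newestFirst) {
            if (!filter.test(version)) {
                continue; // discarded before version counting
            }
            if (out.size() == maxVersions) {
                break;
            }
            out.add(version);
        }
        return out;
    }

    /**
     * Count versions first, then filter: only the newest maxVersions versions
     * are ever shown to the filter, whether or not the filter keeps them.
     */
    static List<String> filterAfterVersions(List<String> newestFirst,
                                            Predicate<String> filter,
                                            int maxVersions) {
        List<String> out = new ArrayList<>();
        int counted = 0;
        for (String version : newestFirst) {
            if (counted == maxVersions) {
                break;
            }
            counted++; // the version is counted even if the filter rejects it
            if (filter.test(version)) {
                out.add(version);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList("c", "b", "a"); // newest first
        Predicate<String> notC = v -> !v.equals("c");
        System.out.println(filterBeforeVersions(column, notC, 1)); // prints [b]
        System.out.println(filterAfterVersions(column, notC, 1));  // prints []
    }
}
```

With one version requested, the first ordering falls back to an older version that passes the filter, while the second returns nothing because the newest version was counted and then rejected. Which result is wanted depends on the filter, which is why the issue suggests leaving the choice to the Filter.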
[jira] [Commented] (HBASE-5257) Allow filter to be evaluated after version handling
[ https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483782#comment-13483782 ]

Hadoop QA commented on HBASE-5257:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12550721/HBASE-5257-0.94.txt
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests.

    {color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3141//console

This message is automatically generated.
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483785#comment-13483785 ]

Francis Liu commented on HBASE-6721:

{quote}
You're missing the change where null plans are no longer queued which comes about because of this patch.
{quote}
We needed this change to prevent regions from being assigned to region servers they don't belong to. We can continue to recognize null; we just need another way to prevent regions from being assigned to the wrong group of region servers. One option is to put a dead/bogus server in the plan when no online servers are available for a given group, so that the region eventually gets reassigned once a live server is up. Would that work?
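The dead/bogus-server option floated in this comment could be sketched roughly as follows. This is an illustration only, not the HBASE-6721 patch: the class, the placeholder name, and the use of plain strings for servers and groups are all assumptions, and the sketch picks the first online server rather than a random one.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; these names are not part of HBase or of the patch. */
final class GroupPlanSketch {

    /** Placeholder destination that is never online, so the plan is revisited later. */
    static final String BOGUS_SERVER = "bogus-server,0,0";

    /**
     * Chooses a destination for a region whose table belongs to the given group.
     * Returns an online member of that group if there is one; otherwise returns
     * the placeholder instead of a server from another group.
     */
    static String chooseDestination(String group,
                                    Map<String, List<String>> serversByGroup,
                                    Set<String> onlineServers) {
        List<String> members =
            serversByGroup.getOrDefault(group, Collections.<String>emptyList());
        for (String server : members) {
            if (onlineServers.contains(server)) {
                return server;
            }
        }
        return BOGUS_SERVER;
    }

    /** True if the plan points at the placeholder and must be retried later. */
    static boolean needsReassignment(String destination) {
        return BOGUS_SERVER.equals(destination);
    }
}
```

The point of the placeholder is that the region is never handed to a server outside its group, while the plan itself stays non-null and can be picked up again once a group member comes online.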
[jira] [Commented] (HBASE-5257) Allow filter to be evaluated after version handling
[ https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483790#comment-13483790 ]

Hadoop QA commented on HBASE-5257:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12550711/5257-trunk.txt
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests.

    {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.

    {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 82 warning messages.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3140//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3140//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3140//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3140//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3140//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3140//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3140//console

This message is automatically generated.
[jira] [Commented] (HBASE-6896) sync bulk and regular assigment handling socket timeout exception
[ https://issues.apache.org/jira/browse/HBASE-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483794#comment-13483794 ]

Hudson commented on HBASE-6896:

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #234 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/234/])
    HBASE-6896 sync bulk and regular assigment handling socket timeout exception (Revision 1401744)

     Result = FAILURE
jxiang :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

sync bulk and regular assigment handling socket timeout exception

Key: HBASE-6896
URL: https://issues.apache.org/jira/browse/HBASE-6896
Project: HBase
Issue Type: Bug
Components: Region Assignment
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Attachments: trunk-6896.patch, trunk-6896_v2.patch

In regular assignment, when a socket timeout occurs, it keeps calling openRegion again and again, without changing the region plan or the ZK offline node, until the region is out of transition, provided the region server is still up. We may need to sync the two paths and make sure bulk assignment does the same in this case.
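The retry behaviour described for regular assignment can be sketched as below. This is a hedged illustration, not the AssignmentManager code: RegionOpener is a hypothetical stand-in for the openRegion RPC, and the maxAttempts bound is an assumption added so the loop terminates (the description retries until the region leaves transition).

```java
import java.net.SocketTimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of retrying openRegion with an unchanged plan. */
final class OpenRegionRetrySketch {

    /** Stand-in for the RPC that asks a region server to open a region. */
    interface RegionOpener {
        void openRegion(String region, String server) throws SocketTimeoutException;
    }

    /**
     * Re-sends openRegion to the same server after each socket timeout, as long
     * as that server is still up. Returns the number of attempts it took, or -1
     * if the server went down or the attempt budget ran out, in which case a
     * caller would have to pick a new plan.
     */
    static int openWithRetry(RegionOpener opener,
                             String region,
                             String server,
                             BooleanSupplier serverStillUp,
                             int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (!serverStillUp.getAsBoolean()) {
                return -1; // server is gone: retrying the same plan is pointless
            }
            try {
                opener.openRegion(region, server);
                return attempt;
            } catch (SocketTimeoutException e) {
                // Keep the same plan and the same server, then try again.
            }
        }
        return -1;
    }
}
```

If bulk assignment is to behave the same way, one option would be for both paths to share a single helper of this shape rather than each carrying its own timeout handling.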
[jira] [Commented] (HBASE-7045) Add some comments to MVCC code
[ https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483793#comment-13483793 ]

Hudson commented on HBASE-7045:

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #234 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/234/])
    HBASE-7045 Add some comments to MVCC code (Revision 1401910)

     Result = FAILURE
gchanan :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java

Add some comments to MVCC code

Key: HBASE-7045
URL: https://issues.apache.org/jira/browse/HBASE-7045
Project: HBase
Issue Type: Task
Components: Transactions/MVCC
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
Fix For: 0.96.0
Attachments: HBASE-7045.patch

I've been digging through the MVCC/transaction code and adding some comments to help me (or others) understand it more quickly the next time through.
[jira] [Updated] (HBASE-5257) Allow filter to be evaluated after version handling
[ https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Sharma updated HBASE-5257:

    Attachment: HBASE-5257-0.92.txt

Corrected patch file in the right format for 0.92

Allow filter to be evaluated after version handling

Key: HBASE-5257
URL: https://issues.apache.org/jira/browse/HBASE-5257
Project: HBase
Issue Type: Improvement
Reporter: Lars Hofhansl
Fix For: 0.96.0
Attachments: 5257-trunk.txt, HBASE-5257-0.92.txt, HBASE-5257-0.92.txt, HBASE-5257-0.94.txt, HBASE-5257-0.94.txt
[jira] [Updated] (HBASE-5257) Allow filter to be evaluated after version handling
[ https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Sharma updated HBASE-5257:

    Attachment: HBASE-5257-0.94.txt

Correctly formatted patch file...
[jira] [Commented] (HBASE-7045) Add some comments to MVCC code
[ https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483798#comment-13483798 ]

Hudson commented on HBASE-7045:

Integrated in HBase-TRUNK #3483 (See [https://builds.apache.org/job/HBase-TRUNK/3483/])
    HBASE-7045 Add some comments to MVCC code (Revision 1401910)

     Result = FAILURE
gchanan :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java