[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-10-24 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang updated HBASE-6070:
--


@ram,

I am reading the code related to region split. The code below in 
AssignmentManager seems to be dead code, because 1) I don't see any place that 
updates the regionState to State.SPLIT, and 2) for the scenario where the region 
has already been split and the RS crashed, ServerShutdownHandler should have 
already taken care of it.  Am I missing something here?  Thanks

if (rs.isSplit()) {
  LOG.debug("Ephemeral node deleted, regionserver crashed?, " +
    "clearing from RIT; rs=" + rs);
  regionOffline(rs.getRegion());
}


 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.94.1, 0.96.0

 Attachments: HBASE-6070_0.92_1.patch, HBASE-6070_0.92.patch, 
 HBASE-6070_0.94_1.patch, HBASE-6070_0.94.patch, HBASE-6070_trunk_1.patch, 
 HBASE-6070_trunk.patch


 We tried to address the problems in Master restart and RS restart while a region 
 SPLIT is in progress as part of HBASE-5806.
 While looking further we found that there is still one race condition.
 1. Split has just started and the znode is in RS_SPLIT state.
 2. RS goes down.
 3. First callback for SSH comes.
 4. As part of the fix for HBASE-5806, SSH knows that some region is in RIT.
 5. But now the nodeDeleted event comes for the SPLIT node, and there we try to 
 delete the region from RIT.
 6. After this we check in the SSH whether any node is in RIT.  As we 
 don't find the region in RIT, the region is never assigned.
 When we fixed HBASE-5806, step 6 happened first and then step 5 happened, so 
 we missed it.  Now we have found it. Will come up with a patch shortly.
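The ordering problem in the steps above can be illustrated with a small, deterministic toy model. This is only a sketch under stated assumptions: the class and method names (RitRaceSketch, sshFirstCallback, nodeDeleted, sshAssignPending) are hypothetical and do not mirror the real AssignmentManager or ServerShutdownHandler code, and the regions-in-transition (RIT) state is reduced to a plain map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Toy model of the ordering problem. All names are hypothetical; this is
 * not the HBase AssignmentManager or ServerShutdownHandler implementation.
 */
final class RitRaceSketch {
    // Regions in transition (RIT), keyed by region name.
    private final Map<String, String> regionsInTransition = new ConcurrentHashMap<>();

    RitRaceSketch(String region) {
        // Steps 1-2: the split has started, then the region server goes down.
        regionsInTransition.put(region, "SPLITTING");
    }

    /** Steps 3-4: SSH notices that the region is in RIT. */
    boolean sshFirstCallback(String region) {
        return regionsInTransition.containsKey(region);
    }

    /** Step 5: the nodeDeleted callback clears the region from RIT. */
    void nodeDeleted(String region) {
        regionsInTransition.remove(region);
    }

    /** Step 6: SSH assigns the region only if it still finds it in RIT. */
    boolean sshAssignPending(String region) {
        return regionsInTransition.containsKey(region);
    }

    /** Runs steps 5 and 6 in the requested order; returns true if the region is assigned. */
    static boolean run(boolean nodeDeletedBeforeSshCheck) {
        String region = "region-A";
        RitRaceSketch am = new RitRaceSketch(region);
        am.sshFirstCallback(region);
        boolean assigned;
        if (nodeDeletedBeforeSshCheck) {
            am.nodeDeleted(region);                  // step 5
            assigned = am.sshAssignPending(region);  // step 6 finds nothing
        } else {
            assigned = am.sshAssignPending(region);  // step 6 first
            am.nodeDeleted(region);                  // step 5 afterwards
        }
        return assigned;
    }

    public static void main(String[] args) {
        System.out.println("step 5 before step 6, assigned: " + run(true));
        System.out.println("step 6 before step 5, assigned: " + run(false));
    }
}
```

In this model the region is assigned only when the SSH check runs before nodeDeleted clears the entry, which matches the observation that the earlier fix was only exercised with step 6 happening first.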

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6843) loading lzo error when using coprocessor

2012-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483003#comment-13483003
 ] 

Hudson commented on HBASE-6843:
---

Integrated in HBase-TRUNK #3479 (See 
[https://builds.apache.org/job/HBase-TRUNK/3479/])
HBASE-6843 loading lzo error when using coprocessor (Andy) (Revision 1401551)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorClassLoader.java


 loading lzo error when using coprocessor
 

 Key: HBASE-6843
 URL: https://issues.apache.org/jira/browse/HBASE-6843
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors
Affects Versions: 0.94.1
Reporter: Zhou wenjian
Assignee: Zhou wenjian
Priority: Critical
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6843-trunk.patch


 After applying HBASE-6308, we found the following error:
 2012-09-06 00:44:38,341 DEBUG org.apache.hadoop.hbase.coprocessor.CoprocessorClassLoader: Finding class: com.hadoop.compression.lzo.LzoCodec
 2012-09-06 00:44:38,351 ERROR com.hadoop.compression.lzo.GPLNativeCodeLoader: Could not load native gpl library
 java.lang.UnsatisfiedLinkError: Native Library /home/zhuzhuang/hbase/0.94.0-ali-1.0/lib/native/Linux-amd64-64/libgplcompression.so already loaded in another classloader
 at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1772)
 at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1732)
 at java.lang.Runtime.loadLibrary0(Runtime.java:823)
 at java.lang.System.loadLibrary(System.java:1028)
 at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
 at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:67)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:113)
 at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm$1.getCodec(Compression.java:107)
 at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:243)
 at org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:85)
 at org.apache.hadoop.hbase.regionserver.HRegion.checkCompressionCodecs(HRegion.java:3793)
 at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3782)
 at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3732)
 at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
 at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
 at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 2012-09-06 00:44:38,355 DEBUG org.apache.hadoop.hbase.coprocessor.CoprocessorClassLoader: Skipping exempt class java.io.PrintWriter - delegating directly to parent
 2012-09-06 00:44:38,355 ERROR com.hadoop.compression.lzo.LzoCodec: Cannot load native-lzo without native-hadoop
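The JVM binds a native library to the class loader that first loaded it, so a second loader that defines its own copy of a class whose static initializer calls System.loadLibrary fails as in the trace above. One common way to avoid this class of error is to hand selected class-name prefixes straight to the parent loader. The sketch below illustrates that idea only; PrefixDelegatingLoader and its prefix list are hypothetical, and nothing here is taken from the actual CoprocessorClassLoader or from the committed patch.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Sketch of a class loader that resolves selected class-name prefixes through
 * its parent only. Hypothetical; not the HBase CoprocessorClassLoader.
 */
final class PrefixDelegatingLoader extends ClassLoader {
    private final List<String> parentPrefixes;

    PrefixDelegatingLoader(ClassLoader parent, List<String> parentPrefixes) {
        super(parent);
        this.parentPrefixes = parentPrefixes;
    }

    /** True if the named class should be resolved by the parent loader only. */
    boolean delegatesToParent(String className) {
        for (String prefix : parentPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (delegatesToParent(name)) {
            // Parent-only: the class, and any native library loaded by its
            // static initializer, stays bound to a single loader.
            return getParent().loadClass(name);
        }
        // Default parent-first delegation; a real coprocessor loader would
        // typically try its own jar first at this point.
        return super.loadClass(name, resolve);
    }

    public static void main(String[] args) throws Exception {
        PrefixDelegatingLoader loader = new PrefixDelegatingLoader(
            PrefixDelegatingLoader.class.getClassLoader(),
            Arrays.asList("java.", "com.hadoop.compression."));
        System.out.println(loader.delegatesToParent("com.hadoop.compression.lzo.LzoCodec"));
        System.out.println(loader.loadClass("java.io.PrintWriter") == java.io.PrintWriter.class);
    }
}
```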



[jira] [Updated] (HBASE-7037) ReplicationPeer logs at WARN level aborting server instead of at FATAL

2012-10-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7037:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
   0.94.3
 Assignee: liang xie
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk and 0.94 branch.  Thanks for the patch Liang.

 ReplicationPeer logs at WARN level aborting server instead of at FATAL
 --

 Key: HBASE-7037
 URL: https://issues.apache.org/jira/browse/HBASE-7037
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: stack
Assignee: liang xie
  Labels: noob
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-7037.txt






[jira] [Commented] (HBASE-6843) loading lzo error when using coprocessor

2012-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483023#comment-13483023
 ] 

Hudson commented on HBASE-6843:
---

Integrated in HBase-0.94 #551 (See 
[https://builds.apache.org/job/HBase-0.94/551/])
HBASE-6843 loading lzo error when using coprocessor (Andy) (Revision 1401550)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorClassLoader.java




[jira] [Updated] (HBASE-4676) Prefix Compression - Trie data block encoding

2012-10-24 Thread Matt Corgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Corgan updated HBASE-4676:
---

Attachment: HBASE-4676-prefix-tree-trunk-v2.patch

attaching full patch with fixes: HBASE-4676-prefix-tree-trunk-v2.patch


 Prefix Compression - Trie data block encoding
 -

 Key: HBASE-4676
 URL: https://issues.apache.org/jira/browse/HBASE-4676
 Project: HBase
  Issue Type: New Feature
  Components: io, Performance, regionserver
Affects Versions: 0.96.0
Reporter: Matt Corgan
Assignee: Matt Corgan
 Attachments: HBASE-4676-0.94-v1.patch, 
 HBASE-4676-prefix-tree-trunk-v1.patch, HBASE-4676-prefix-tree-trunk-v2.patch, 
 hbase-prefix-trie-0.1.jar, PrefixTrie_Format_v1.pdf, 
 PrefixTrie_Performance_v1.pdf, SeeksPerSec by blockSize.png


 The HBase data block format has room for two significant improvements for 
 applications that have high block cache hit ratios.  
 First, there is no prefix compression, and the current KeyValue format is 
 somewhat metadata heavy, so there can be tremendous memory bloat for many 
 common data layouts, specifically those with long keys and short values.
 Second, there is no random access to KeyValues inside data blocks.  This 
 means that every time you double the data block size, average seek time (or 
 average CPU consumption) goes up by a factor of 2.  The standard 64KB block 
 size is ~10x slower for random seeks than a 4KB block size, but block sizes 
 as small as 4KB cause problems elsewhere.  Using block sizes of 256KB or 1MB 
 or more may be more efficient from a disk access and block-cache perspective 
 in many big-data applications, but doing so is infeasible from a random seek 
 perspective.
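The doubling claim follows from scanning a block sequentially: with n entries, a seek examines about n/2 of them on average, so twice the entries means roughly twice the work. The sketch below counts comparisons for a sequential scan and, for contrast, for a binary search over fixed-width entries. BlockSeekCost and its methods are hypothetical names, and the model counts key comparisons only; it ignores decoding and cache effects.

```java
/** Hypothetical cost model: comparisons per seek inside one data block. */
final class BlockSeekCost {
    /** Average comparisons to locate each of n sorted keys by scanning from the start. */
    static double averageLinearComparisons(int n) {
        long total = 0;
        for (int target = 0; target < n; target++) {
            total += target + 1; // entries 0..target are examined
        }
        return (double) total / n;
    }

    /** Worst-case comparisons for a binary search over n randomly accessible entries. */
    static int maxBinaryComparisons(int n) {
        int steps = 0;
        while (n > 0) {
            n /= 2;
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        for (int n : new int[] {64, 128, 256}) {
            System.out.println(n + " entries: linear avg " + averageLinearComparisons(n)
                + ", binary max " + maxBinaryComparisons(n));
        }
    }
}
```

Doubling n doubles the linear average (to within half a comparison) but adds only one step to the binary search, which is why random access inside the block makes larger block sizes practical.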
 The PrefixTrie block encoding format attempts to solve both of these 
 problems.  Some features:
 * The trie format for row key encoding completely eliminates duplicate row keys 
 and encodes similar row keys into a standard trie structure, which also saves 
 a lot of space.
 * The column family is currently stored once at the beginning of each block.  
 This could easily be modified to allow multiple family names per block.
 * All qualifiers in the block are stored in their own trie format, which 
 caters nicely to wide rows.  Duplicate qualifiers between rows are eliminated.  
 The size of this trie determines the width of the block's qualifier 
 fixed-width-int.
 * The minimum timestamp is stored at the beginning of the block, and deltas 
 are calculated from that.  The maximum delta determines the width of the 
 block's timestamp fixed-width-int.
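The timestamp feature in the list above can be sketched as follows: store the block's minimum timestamp once, keep each cell's delta from it, and size the fixed-width integer from the largest delta. This is an illustration only; TimestampDeltaSketch and its method names are hypothetical and are not taken from the attached patch, and the byte-granular width is an assumption.

```java
/** Hypothetical sketch of min-timestamp plus fixed-width delta encoding. */
final class TimestampDeltaSketch {
    /** Smallest timestamp in the block; stored once in the block metadata. */
    static long min(long[] timestamps) {
        long min = timestamps[0];
        for (long t : timestamps) {
            min = Math.min(min, t);
        }
        return min;
    }

    /** Number of bytes (0-8) needed to hold a non-negative value; 0 bytes for the value 0. */
    static int bytesNeeded(long value) {
        int bytes = 0;
        while (bytes < 8 && (value >>> (8 * bytes)) != 0) {
            bytes++;
        }
        return bytes;
    }

    /** Per-cell deltas from the block minimum. */
    static long[] deltas(long[] timestamps) {
        long base = min(timestamps);
        long[] out = new long[timestamps.length];
        for (int i = 0; i < timestamps.length; i++) {
            out[i] = timestamps[i] - base;
        }
        return out;
    }

    /** Width in bytes of the block's timestamp field, set by the largest delta. */
    static int deltaWidth(long[] timestamps) {
        long max = 0;
        for (long d : deltas(timestamps)) {
            max = Math.max(max, d);
        }
        return bytesNeeded(max);
    }

    public static void main(String[] args) {
        long[] ts = {1351036800000L, 1351036800250L, 1351036801000L};
        System.out.println("min = " + min(ts) + ", width = " + deltaWidth(ts) + " bytes");
    }
}
```

With the three sample timestamps the largest delta is 1000, so each cell needs two bytes instead of the eight a full long would take; if every timestamp in the block is identical, the width drops to zero.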
 The block is structured with metadata at the beginning, then a section for 
 the row trie, then the column trie, then the timestamp deltas, and then 
 all the values.  Most work is done in the row trie, where every leaf node 
 (corresponding to a row) contains a list of offsets/references corresponding 
 to the cells in that row.  Each cell is fixed-width to enable binary 
 searching and is represented by [1 byte operationType, X bytes qualifier 
 offset, X bytes timestamp delta offset].
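Because every cell record has the same width, the i-th cell starts at i * cellWidth and can be read without decoding its predecessors, which is what makes binary search possible. The sketch below packs cells as [1 byte operation type, qualifier offset, timestamp delta index] and searches by qualifier offset. FixedWidthCells is a hypothetical name, the big-endian byte order is an assumption, and so is the premise that the cells of a row are sorted by qualifier offset.

```java
/** Hypothetical fixed-width cell list supporting random access and binary search. */
final class FixedWidthCells {
    private final byte[] data;
    private final int qualifierWidth;
    private final int timestampWidth;
    private final int cellWidth;
    private final int count;

    FixedWidthCells(int[] opTypes, long[] qualifierOffsets, long[] timestampIndexes,
                    int qualifierWidth, int timestampWidth) {
        this.qualifierWidth = qualifierWidth;
        this.timestampWidth = timestampWidth;
        this.cellWidth = 1 + qualifierWidth + timestampWidth;
        this.count = opTypes.length;
        this.data = new byte[count * cellWidth];
        for (int i = 0; i < count; i++) {
            int pos = i * cellWidth;
            data[pos] = (byte) opTypes[i];
            write(pos + 1, qualifierWidth, qualifierOffsets[i]);
            write(pos + 1 + qualifierWidth, timestampWidth, timestampIndexes[i]);
        }
    }

    /** Writes the low 'width' bytes of value in big-endian order (assumed layout). */
    private void write(int pos, int width, long value) {
        for (int b = width - 1; b >= 0; b--) {
            data[pos + b] = (byte) value;
            value >>>= 8;
        }
    }

    /** Reads an unsigned big-endian integer of 'width' bytes. */
    private long read(int pos, int width) {
        long value = 0;
        for (int b = 0; b < width; b++) {
            value = (value << 8) | (data[pos + b] & 0xFF);
        }
        return value;
    }

    long qualifierOffset(int cell) {
        return read(cell * cellWidth + 1, qualifierWidth);
    }

    long timestampIndex(int cell) {
        return read(cell * cellWidth + 1 + qualifierWidth, timestampWidth);
    }

    /** Binary search by qualifier offset; returns the cell index or -1 if absent. */
    int find(long qualifier) {
        int lo = 0;
        int hi = count - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            long q = qualifierOffset(mid);
            if (q < qualifier) {
                lo = mid + 1;
            } else if (q > qualifier) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        FixedWidthCells cells = new FixedWidthCells(
            new int[] {4, 4, 4}, new long[] {0, 300, 70000}, new long[] {2, 0, 1}, 3, 1);
        System.out.println("cell with qualifier offset 300: " + cells.find(300));
    }
}
```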
 If all operation types are the same for a block, there will be zero per-cell 
 overhead.  Same for timestamps.  Same for qualifiers when I get a chance.  
 So, the compression aspect is very strong, but makes a few small sacrifices 
 on VarInt size to enable faster binary searches in trie fan-out nodes.
 A more compressed but slower version might build on this by also applying 
 further (suffix, etc.) compression on the trie nodes at the cost of slower 
 write speed.  Even further compression could be obtained by using all VInts 
 instead of FInts with a sacrifice on random seek speed (though not huge).
 One current drawback is the write speed.  While programmed with good 
 constructs like TreeMaps, ByteBuffers, binary searches, etc., it's not 
 programmed with the same level of optimization as the read path.  Work will 
 need to be done to optimize the data structures used for encoding, which could 
 probably yield a 10x increase.  It will still be slower than delta encoding, 
 but with a much higher decode speed.  I have not yet created a thorough 
 benchmark for write speed or sequential read speed.
 Though the trie is reaching a point where it is internally very efficient 
 (probably within half or a quarter of its max read speed), the way that hbase 
 currently uses it is far from optimal.  The KeyValueScanner and related 
 classes that iterate through the trie will eventually need to be smarter and 
 have methods to do things like skipping to the next row of results without 
 scanning every cell in between.  When that is accomplished it will also allow 
 much faster compactions because the full row key will not have to be compared 
 as often as it is now.
 Current code is on github.  The trie code is in a separate project from the 
 slightly modified hbase.  There is an hbase project there as 

[jira] [Updated] (HBASE-4676) Prefix Compression - Trie data block encoding

2012-10-24 Thread Matt Corgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Corgan updated HBASE-4676:
---

Status: Open  (was: Patch Available)


[jira] [Updated] (HBASE-4676) Prefix Compression - Trie data block encoding

2012-10-24 Thread Matt Corgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Corgan updated HBASE-4676:
---

Status: Patch Available  (was: Open)


[jira] [Commented] (HBASE-4676) Prefix Compression - Trie data block encoding

2012-10-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483041#comment-13483041
 ] 

Hadoop QA commented on HBASE-4676:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12550594/HBASE-4676-prefix-tree-trunk-v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 143 
new or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
99 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 49 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.io.hfile.TestHFileDataBlockEncoder

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3134//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3134//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3134//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3134//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3134//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3134//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3134//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3134//console

This message is automatically generated.


[jira] [Created] (HBASE-7042) Master Coprocessor Endpoint

2012-10-24 Thread Francis Liu (JIRA)
Francis Liu created HBASE-7042:
--

 Summary: Master Coprocessor Endpoint
 Key: HBASE-7042
 URL: https://issues.apache.org/jira/browse/HBASE-7042
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu
 Attachments: HBASE-7042_94.patch

Having support for a master coprocessor endpoint would enable developers to 
easily extend HMaster functionality/features, as is the case for region 
server grouping.



[jira] [Updated] (HBASE-7042) Master Coprocessor Endpoint

2012-10-24 Thread Francis Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu updated HBASE-7042:
---

Attachment: HBASE-7042_94.patch

 Master Coprocessor Endpoint
 ---

 Key: HBASE-7042
 URL: https://issues.apache.org/jira/browse/HBASE-7042
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu
 Fix For: 0.96.0

 Attachments: HBASE-7042_94.patch


 Having support for a master coprocessor endpoint would enable developers to 
 easily extend HMaster functionality/features, as is the case for region 
 server grouping.



[jira] [Commented] (HBASE-7042) Master Coprocessor Endpoint

2012-10-24 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483045#comment-13483045
 ] 

Francis Liu commented on HBASE-7042:


Putting up a 0.94 patch for initial comments.

 Master Coprocessor Endpoint
 ---

 Key: HBASE-7042
 URL: https://issues.apache.org/jira/browse/HBASE-7042
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu
 Fix For: 0.96.0

 Attachments: HBASE-7042_94.patch


 Having support for a master coprocessor endpoint would enable developers to 
 easily extend HMaster functionality/features, as is the case for region 
 server grouping.



[jira] [Updated] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Francis Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu updated HBASE-6721:
---

Attachment: HBASE-6721_94.patch

0.94 patch for initial review

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it provides a client application with a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.
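
 The proposal above might be pictured with a small sketch like the one below.
 It is not taken from the attached patch or design document: the class
 GroupAssignmentSketch, its methods, and the round-robin placement are all
 assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of group-aware assignment: a table's regions may only
// land on the region servers that belong to the table's group.
final class GroupAssignmentSketch {
    private final Map<String, List<String>> serversByGroup = new LinkedHashMap<>();
    private final Map<String, String> groupByTable = new LinkedHashMap<>();

    void addGroup(String group, List<String> servers) {
        serversByGroup.put(group, new ArrayList<>(servers));
    }

    void assignTable(String table, String group) {
        if (!serversByGroup.containsKey(group)) {
            throw new IllegalArgumentException("unknown group: " + group);
        }
        groupByTable.put(table, group);
    }

    /** Round-robins the table's regions over the servers of its group only. */
    Map<String, String> assignRegions(String table, List<String> regions) {
        String group = groupByTable.get(table);
        if (group == null) {
            throw new IllegalArgumentException("table has no group: " + table);
        }
        List<String> servers = serversByGroup.get(group);
        if (servers.isEmpty()) {
            throw new IllegalStateException("group has no servers: " + group);
        }
        Map<String, String> plan = new LinkedHashMap<>();
        for (int i = 0; i < regions.size(); i++) {
            plan.put(regions.get(i), servers.get(i % servers.size()));
        }
        return plan;
    }

    public static void main(String[] args) {
        GroupAssignmentSketch am = new GroupAssignmentSketch();
        List<String> online = new ArrayList<>();
        online.add("rs1");
        online.add("rs2");
        am.addGroup("online", online);
        am.assignTable("t1", "online");
        List<String> regions = new ArrayList<>();
        regions.add("r1");
        regions.add("r2");
        regions.add("r3");
        System.out.println(am.assignRegions("t1", regions)); // {r1=rs1, r2=rs2, r3=rs1}
    }
}
```

 Balancing per group then amounts to running the same placement independently
 for each group's server list.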



[jira] [Created] (HBASE-7043) Region Server Group CLI commands

2012-10-24 Thread Francis Liu (JIRA)
Francis Liu created HBASE-7043:
--

 Summary: Region Server Group CLI commands
 Key: HBASE-7043
 URL: https://issues.apache.org/jira/browse/HBASE-7043
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu






[jira] [Updated] (HBASE-7043) Region Server Group CLI commands

2012-10-24 Thread Francis Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu updated HBASE-7043:
---

Attachment: HBASE-7043_94.patch

 Region Server Group CLI commands
 

 Key: HBASE-7043
 URL: https://issues.apache.org/jira/browse/HBASE-7043
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu
 Fix For: 0.96.0

 Attachments: HBASE-7043_94.patch






[jira] [Assigned] (HBASE-7043) Region Server Group CLI commands

2012-10-24 Thread Francis Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu reassigned HBASE-7043:
--

Assignee: Francis Liu

 Region Server Group CLI commands
 

 Key: HBASE-7043
 URL: https://issues.apache.org/jira/browse/HBASE-7043
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu
 Fix For: 0.96.0

 Attachments: HBASE-7043_94.patch






[jira] [Commented] (HBASE-7043) Region Server Group CLI commands

2012-10-24 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483056#comment-13483056
 ] 

Francis Liu commented on HBASE-7043:


0.94 initial patch for review

 Region Server Group CLI commands
 

 Key: HBASE-7043
 URL: https://issues.apache.org/jira/browse/HBASE-7043
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu
 Fix For: 0.96.0

 Attachments: HBASE-7043_94.patch






[jira] [Updated] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Francis Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu updated HBASE-6721:
---

Status: Patch Available  (was: Open)

0.94 patch for initial review. We decided to combine the patches of the two 
subtasks into one.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it provides a client application with a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Resolved] (HBASE-6723) Implement RegionServer Group Based Balancer

2012-10-24 Thread Francis Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu resolved HBASE-6723.


Resolution: Invalid

 Implement RegionServer Group Based Balancer
 ---

 Key: HBASE-6723
 URL: https://issues.apache.org/jira/browse/HBASE-6723
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula

 Re-purposing this Jira after the discussion last week.



[jira] [Resolved] (HBASE-6837) RegionServer Groups corpcoessor apis

2012-10-24 Thread Francis Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu resolved HBASE-6837.


Resolution: Invalid

 RegionServer Groups corpcoessor apis
 

 Key: HBASE-6837
 URL: https://issues.apache.org/jira/browse/HBASE-6837
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu





[jira] [Updated] (HBASE-7042) Master Coprocessor Endpoint

2012-10-24 Thread Francis Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu updated HBASE-7042:
---

Status: Patch Available  (was: Open)

 Master Coprocessor Endpoint
 ---

 Key: HBASE-7042
 URL: https://issues.apache.org/jira/browse/HBASE-7042
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu
 Fix For: 0.96.0

 Attachments: HBASE-7042_94.patch


 Having support for a master coprocessor endpoint would enable developers to 
 easily extend HMaster functionality/features, as is the case for region 
 server grouping.



[jira] [Updated] (HBASE-7043) Region Server Group CLI commands

2012-10-24 Thread Francis Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu updated HBASE-7043:
---

Status: Patch Available  (was: Open)

 Region Server Group CLI commands
 

 Key: HBASE-7043
 URL: https://issues.apache.org/jira/browse/HBASE-7043
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu
 Fix For: 0.96.0

 Attachments: HBASE-7043_94.patch






[jira] [Commented] (HBASE-7042) Master Coprocessor Endpoint

2012-10-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483062#comment-13483062
 ] 

Hadoop QA commented on HBASE-7042:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550596/HBASE-7042_94.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3136//console

This message is automatically generated.

 Master Coprocessor Endpoint
 ---

 Key: HBASE-7042
 URL: https://issues.apache.org/jira/browse/HBASE-7042
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu
 Fix For: 0.96.0

 Attachments: HBASE-7042_94.patch


 Having support for a master coprocessor endpoint would enable developers to 
 easily extend HMaster functionality/features, as is the case for region 
 server grouping.



[jira] [Commented] (HBASE-7043) Region Server Group CLI commands

2012-10-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483063#comment-13483063
 ] 

Hadoop QA commented on HBASE-7043:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550600/HBASE-7043_94.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3135//console

This message is automatically generated.

 Region Server Group CLI commands
 

 Key: HBASE-7043
 URL: https://issues.apache.org/jira/browse/HBASE-7043
 Project: HBase
  Issue Type: Sub-task
Reporter: Francis Liu
Assignee: Francis Liu
 Fix For: 0.96.0

 Attachments: HBASE-7043_94.patch






[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483065#comment-13483065
 ] 

Hadoop QA commented on HBASE-6721:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550599/HBASE-6721_94.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3137//console

This message is automatically generated.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it provides a client application with a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Updated] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Francis Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu updated HBASE-6721:
---

Attachment: HBASE-6721_94.patch

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it provides a client application with a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483069#comment-13483069
 ] 

Hadoop QA commented on HBASE-6721:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550601/HBASE-6721_94.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3138//console

This message is automatically generated.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it provides a client application with a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Commented] (HBASE-7037) ReplicationPeer logs at WARN level aborting server instead of at FATAL

2012-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483077#comment-13483077
 ] 

Hudson commented on HBASE-7037:
---

Integrated in HBase-0.94 #552 (See 
[https://builds.apache.org/job/HBase-0.94/552/])
HBASE-7037 ReplicationPeer logs at WARN level aborting server instead of at 
FATAL (Revision 1401564)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeer.java


 ReplicationPeer logs at WARN level aborting server instead of at FATAL
 --

 Key: HBASE-7037
 URL: https://issues.apache.org/jira/browse/HBASE-7037
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: stack
Assignee: liang xie
  Labels: noob
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-7037.txt






[jira] [Commented] (HBASE-7037) ReplicationPeer logs at WARN level aborting server instead of at FATAL

2012-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483087#comment-13483087
 ] 

Hudson commented on HBASE-7037:
---

Integrated in HBase-TRUNK #3480 (See 
[https://builds.apache.org/job/HBase-TRUNK/3480/])
HBASE-7037 ReplicationPeer logs at WARN level aborting server instead of at 
FATAL (Revision 1401563)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeer.java


 ReplicationPeer logs at WARN level aborting server instead of at FATAL
 --

 Key: HBASE-7037
 URL: https://issues.apache.org/jira/browse/HBASE-7037
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: stack
Assignee: liang xie
  Labels: noob
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-7037.txt






[jira] [Commented] (HBASE-7024) TableMapReduceUtil.initTableMapperJob unnecessarily limits the types of outputKeyClass and outputValueClass

2012-10-24 Thread Dave Beech (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483096#comment-13483096
 ] 

Dave Beech commented on HBASE-7024:
---

Thanks Ted, Stack. 

Stack - you are right that keys and values have to be serializable, but they 
don't have to be Serializable in the Java interface sense. The Job/JobConf 
classes in Hadoop accept absolutely any class. Map tasks use Hadoop's 
SerializationFactory to work out which serializer class to use 
(WritableSerialization is the default, but you can specify custom ones through 
the io.serializations job setting, like AvroSerialization).

The point is that Hadoop doesn't care at all what type your map output key and 
value classes are, so long as you have provided a serializer which works with 
them. If you haven't, the job dies horribly (no surprise there).
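
To illustrate the lookup described above, here is a rough stand-alone sketch. 
It does not use the Hadoop classes; Serialization, TypeBased, WritableLike and 
the key classes are hypothetical stand-ins, and the only behaviour it models 
is that the first configured serialization that accepts the class wins.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a serialization lookup: walk the configured
// serializations in order and use the first one that accepts the class.
final class SerializationLookupSketch {
    interface Serialization {
        boolean accept(Class<?> c);
        String name();
    }

    interface WritableLike {}                           // stands in for Writable
    static final class WritableKey implements WritableLike {}
    static final class AvroLikeKey {}                   // implements no special interface

    /** Accepts any class assignable to a given base type. */
    static final class TypeBased implements Serialization {
        private final String name;
        private final Class<?> base;

        TypeBased(String name, Class<?> base) {
            this.name = name;
            this.base = base;
        }

        public boolean accept(Class<?> c) {
            return base.isAssignableFrom(c);
        }

        public String name() {
            return name;
        }
    }

    /** Returns the name of the first serialization that accepts the class. */
    static String pick(List<Serialization> configured, Class<?> keyClass) {
        for (Serialization s : configured) {
            if (s.accept(keyClass)) {
                return s.name();
            }
        }
        // In this sketch a missing serializer is reported up front.
        throw new IllegalStateException("no serialization accepts " + keyClass.getName());
    }

    public static void main(String[] args) {
        List<Serialization> configured = new ArrayList<>();
        configured.add(new TypeBased("writable-like", WritableLike.class));
        configured.add(new TypeBased("avro-like", AvroLikeKey.class));
        System.out.println(pick(configured, AvroLikeKey.class)); // avro-like
    }
}
```

Nothing in the lookup needs the key class to extend a particular interface; 
it only needs some configured serialization to accept it.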

I haven't tested with Hadoop 2 yet, no, but I'd be very surprised if this patch 
broke anything. If they'd changed this behaviour in Hadoop I'm sure there'd be 
tons of regression problems with mapreduce jobs that need custom serializers.  


 TableMapReduceUtil.initTableMapperJob unnecessarily limits the types of 
 outputKeyClass and outputValueClass
 ---

 Key: HBASE-7024
 URL: https://issues.apache.org/jira/browse/HBASE-7024
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Reporter: Dave Beech
Priority: Minor
 Attachments: HBASE-7024.patch


 The various initTableMapperJob methods in TableMapReduceUtil take 
 outputKeyClass and outputValueClass parameters which need to extend 
 WritableComparable and Writable respectively. 
 Because of this, it is not convenient to use an alternative serialization 
 like Avro. (I wanted to set these parameters to AvroKey and AvroValue). 
 The methods in the MapReduce API to set map output key and value types do not 
 impose this restriction, so is there a reason to do it here?



[jira] [Created] (HBASE-7044) verifyRegionLocation in CatalogTracker.java didn't check if regionserver is in the cluster

2012-10-24 Thread wonderyl (JIRA)
wonderyl created HBASE-7044:
---

 Summary: verifyRegionLocation in CatalogTracker.java didn't check 
if regionserver is in the cluster
 Key: HBASE-7044
 URL: https://issues.apache.org/jira/browse/HBASE-7044
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0
Reporter: wonderyl


At the beginning there was one whole HBase cluster. Then I decided to split it 
into two clusters, one for offline mining and one for online service. The 
online one was stripped off, and the offline one kept the original master.
Unfortunately, the META of the original cluster was assigned to one of the 
stripped machines, and because there is a cache policy for META, the offline 
cluster still accessed the META on the stripped one.
After inspecting the code, I found that verifyRegionLocation in 
CatalogTracker.java checks whether the region server still contains the 
region, but it doesn't check whether the region server is still in the 
cluster, which is very easy: just check whether it is registered in zk.
All in all, I had to shut down the online cluster and restart the offline one. 
Then the META was re-assigned and everything was back to normal.
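
A rough sketch of the extra check suggested above follows. It is not the 
CatalogTracker code: the class VerifyLocationSketch, the RegionHost interface 
and the liveServersInZk set are hypothetical, and server names are plain 
strings here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: before trusting a cached location, require that the
// server is registered in this cluster, then ask whether it serves the region.
final class VerifyLocationSketch {
    /** Stand-in for asking a server whether it currently hosts a region. */
    interface RegionHost {
        boolean servesRegion(String server, String region);
    }

    static boolean verifyRegionLocation(String server, String region,
                                        Set<String> liveServersInZk, RegionHost host) {
        if (server == null) {
            return false;
        }
        // Proposed additional check: the server must belong to this cluster.
        if (!liveServersInZk.contains(server)) {
            return false;
        }
        // Check described in the report: the server still contains the region.
        return host.servesRegion(server, region);
    }

    public static void main(String[] args) {
        Set<String> live = new HashSet<>(Arrays.asList("rs1", "rs2"));
        RegionHost alwaysServes = (server, region) -> true;
        // rs9 answers for the region but is not registered in this cluster.
        System.out.println(verifyRegionLocation("rs9", ".META.", live, alwaysServes)); // false
    }
}
```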



[jira] [Commented] (HBASE-7017) Backport [replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file to 0.94

2012-10-24 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483109#comment-13483109
 ] 

Devaraj Das commented on HBASE-7017:


I should be able to give it a shot in a few days (traveling currently)

 Backport [replication] The replication-executor should make sure the file 
 that it is replicating is closed before declaring success on that file to 
 0.94
 --

 Key: HBASE-7017
 URL: https://issues.apache.org/jira/browse/HBASE-7017
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Devaraj Das
 Fix For: 0.94.3






[jira] [Commented] (HBASE-6480) If callQueueSize exceed maxQueueSize, all call will be rejected, do not reject priorityCall

2012-10-24 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483112#comment-13483112
 ] 

binlijin commented on HBASE-6480:
-

@stack I agree that there should be some limit for priorityQueue; Lars's 
suggestion is good.

 If callQueueSize exceed maxQueueSize, all call will be rejected, do not 
 reject priorityCall 
 

 Key: HBASE-6480
 URL: https://issues.apache.org/jira/browse/HBASE-6480
 Project: HBase
  Issue Type: Bug
Reporter: binlijin
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-6480-94.patch, HBASE-6480-trunk.patch


 Currently, if the callQueueSize exceeds maxQueueSize, all calls will be 
 rejected. Should we let priority calls pass through?
 Current:
 {code}
 if ((callSize + callQueueSize.get()) > maxQueueSize) {
   Call callTooBig = xxx
   return ;
 }
 if (priorityCallQueue != null && getQosLevel(param) > highPriorityLevel) {
   priorityCallQueue.put(call);
   updateCallQueueLenMetrics(priorityCallQueue);
 } else {
   callQueue.put(call);  // queue the call; maybe blocked here
   updateCallQueueLenMetrics(callQueue);
 }
 {code}
 Should we change it to :
 {code}
 if (priorityCallQueue != null && getQosLevel(param) > highPriorityLevel) {
   priorityCallQueue.put(call);
   updateCallQueueLenMetrics(priorityCallQueue);
 } else {
   if ((callSize + callQueueSize.get()) > maxQueueSize) {
     Call callTooBig = xxx
     return ;
   }
   callQueue.put(call);  // queue the call; maybe blocked here
   updateCallQueueLenMetrics(callQueue);
 }
 {code}
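
 The proposed ordering can be tried out in isolation with a sketch such as the
 one below. It is not the server code: CallAdmissionSketch and its Call class
 are hypothetical, both kinds of call are counted toward callQueueSize as an
 assumption, and offer() is used on unbounded queues so nothing blocks.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the proposed change: only non-priority calls are
// rejected when the queued size would exceed maxQueueSize.
final class CallAdmissionSketch {
    static final class Call {
        final long size;
        final int qos;

        Call(long size, int qos) {
            this.size = size;
            this.qos = qos;
        }
    }

    private final BlockingQueue<Call> callQueue = new LinkedBlockingQueue<>();
    private final BlockingQueue<Call> priorityCallQueue = new LinkedBlockingQueue<>();
    private final AtomicLong callQueueSize = new AtomicLong();
    private final long maxQueueSize;
    private final int highPriorityLevel;

    CallAdmissionSketch(long maxQueueSize, int highPriorityLevel) {
        this.maxQueueSize = maxQueueSize;
        this.highPriorityLevel = highPriorityLevel;
    }

    /** Returns false when a non-priority call is rejected as too big. */
    boolean enqueue(Call call) {
        boolean priority = call.qos > highPriorityLevel;
        if (!priority && call.size + callQueueSize.get() > maxQueueSize) {
            return false; // corresponds to the callTooBig branch
        }
        // offer() never blocks on these unbounded queues; the snippet above uses put().
        (priority ? priorityCallQueue : callQueue).offer(call);
        callQueueSize.addAndGet(call.size); // assumption: both kinds are counted
        return true;
    }

    int normalQueued() {
        return callQueue.size();
    }

    int priorityQueued() {
        return priorityCallQueue.size();
    }

    public static void main(String[] args) {
        CallAdmissionSketch server = new CallAdmissionSketch(100L, 10);
        System.out.println(server.enqueue(new Call(80L, 0)));  // true
        System.out.println(server.enqueue(new Call(50L, 0)));  // false, 130 > 100
        System.out.println(server.enqueue(new Call(50L, 20))); // true, priority call
    }
}
```

 Note that nothing bounds the priority queue in this sketch.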



[jira] [Commented] (HBASE-6184) HRegionInfo was null or empty in Meta

2012-10-24 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483122#comment-13483122
 ] 

binlijin commented on HBASE-6184:
-

This can happen when a region splits.
The write order in 0.94.x is: write memstore, write hlog, update mvcc.

Client:
{code}
metaTable = new HTable(configuration, HConstants.META_TABLE_NAME);
Result startRowResult = metaTable.getRowOrBefore(searchRow,
    HConstants.CATALOG_FAMILY);
if (startRowResult == null) {
  throw new TableNotFoundException("Cannot find row in .META. for table: "
      + Bytes.toString(tableName) + ", row="
      + Bytes.toStringBinary(searchRow));
}
byte[] value = startRowResult.getValue(HConstants.CATALOG_FAMILY,
    HConstants.REGIONINFO_QUALIFIER);
if (value == null || value.length == 0) {
  throw new IOException("HRegionInfo was null or empty in Meta for "
      + Bytes.toString(tableName) + ", row="
      + Bytes.toStringBinary(searchRow));
}
{code}

Server :
HRegion.getClosestRowBefore
{code}
  Store store = getStore(family);
  // get the closest key. (HStore.getRowKeyAtOrBefore can return null)
  KeyValue key = store.getRowKeyAtOrBefore(row);
  Result result = null;
  if (key != null) {
    Get get = new Get(key.getRow());
    get.addFamily(family);
    result = get(get, null);
  }
  return result;
{code}
 
store.getRowKeyAtOrBefore(row) doesn't consider the readPoint, but the get 
does. So if some value has not been committed yet, getRowKeyAtOrBefore sees it 
but the get ignores it, and it is possible that a null result is returned. 


 HRegionInfo was null or empty in Meta 
 --

 Key: HBASE-6184
 URL: https://issues.apache.org/jira/browse/HBASE-6184
 Project: HBase
  Issue Type: Bug
  Components: Client, io
Affects Versions: 0.94.0
Reporter: jiafeng.zhang
 Fix For: 0.94.3

 Attachments: HBASE-6184.patch


 insert data
 hadoop-0.23.2 + hbase-0.94.0
 2012-06-07 13:09:38,573 WARN  
 [org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation] 
 Encountered problems when prefetch META table: 
 java.io.IOException: HRegionInfo was null or empty in Meta for hbase_one_col, 
 row=hbase_one_col,09115303780247449149,99
 at 
 org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:160)
 at 
 org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:48)
 at 
 org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:126)
 at 
 org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:123)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:359)
 at 
 org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:123)
 at 
 org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:99)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:894)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:948)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:836)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1482)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
 at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945)
 at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:801)
 at org.apache.hadoop.hbase.client.HTable.put(HTable.java:776)
 at 
 org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:397)
 at com.dinglicom.hbase.HbaseImport.insertData(HbaseImport.java:177)
 at com.dinglicom.hbase.HbaseImport.run(HbaseImport.java:210)
 at java.lang.Thread.run(Thread.java:662)



[jira] [Commented] (HBASE-6184) HRegionInfo was null or empty in Meta

2012-10-24 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483123#comment-13483123
 ] 

binlijin commented on HBASE-6184:
-

In 0.94 the write path is:
write memstore 
write hlog // lasts a few ms
update mvcc

Suppose the current readPoint = 9 and the new KeyValue has memstoreTS = 10 when 
HRegion.getClosestRowBefore is called.
 KeyValue key = store.getRowKeyAtOrBefore(row);  will see the new KeyValue, 
but 
Get get = new Get(key.getRow());
get.addFamily(family);
result = get(get, null);
will not see the new KeyValue, because mvcc has not been updated yet.
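
A toy model of the race described above may help. It is not HBase code: the names `ToyStore`, `rowKeyAtOrBefore`, `get` and `updateMvcc` are made up for illustration, and it assumes that the first lookup ignores the MVCC read point while the follow-up Get honors it.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: not the HBase implementation. All names here are hypothetical.
class ToyStore {
  // A cell is visible to a reader only if memstoreTS <= readPoint.
  static final class Cell {
    final String row;
    final long memstoreTS;
    Cell(String row, long memstoreTS) { this.row = row; this.memstoreTS = memstoreTS; }
  }

  private final List<Cell> memstore = new ArrayList<Cell>();
  private long readPoint;

  ToyStore(long readPoint) { this.readPoint = readPoint; }

  // Step 1 of the write path: the cell lands in the memstore before mvcc is advanced.
  void writeMemstore(String row, long memstoreTS) { memstore.add(new Cell(row, memstoreTS)); }

  // Step 3 of the write path: advance the read point so the cell becomes visible.
  void updateMvcc(long newReadPoint) { readPoint = newReadPoint; }

  // Assumed behaviour of the first lookup: greatest row <= target, ignoring the read point.
  String rowKeyAtOrBefore(String row) {
    String best = null;
    for (Cell c : memstore) {
      if (c.row.compareTo(row) <= 0 && (best == null || c.row.compareTo(best) > 0)) {
        best = c.row;
      }
    }
    return best;
  }

  // The follow-up Get honors the read point, so it can miss a cell the first lookup saw.
  boolean get(String row) {
    for (Cell c : memstore) {
      if (c.row.equals(row) && c.memstoreTS <= readPoint) return true;
    }
    return false;
  }
}
```

With readPoint = 9 and a cell written at memstoreTS = 10, `rowKeyAtOrBefore` returns the row but `get` finds nothing until `updateMvcc(10)` is called, which would be consistent with the empty result reported in this issue.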

 HRegionInfo was null or empty in Meta 
 --

 Key: HBASE-6184
 URL: https://issues.apache.org/jira/browse/HBASE-6184
 Project: HBase
  Issue Type: Bug
  Components: Client, io
Affects Versions: 0.94.0
Reporter: jiafeng.zhang
 Fix For: 0.94.3

 Attachments: HBASE-6184.patch


 insert data
 hadoop-0.23.2 + hbase-0.94.0
 2012-06-07 13:09:38,573 WARN  
 [org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation] 
 Encountered problems when prefetch META table: 
 java.io.IOException: HRegionInfo was null or empty in Meta for hbase_one_col, 
 row=hbase_one_col,09115303780247449149,99
 at 
 org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:160)
 at 
 org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:48)
 at 
 org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:126)
 at 
 org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:123)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:359)
 at 
 org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:123)
 at 
 org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:99)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:894)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:948)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:836)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1482)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
 at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945)
 at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:801)
 at org.apache.hadoop.hbase.client.HTable.put(HTable.java:776)
 at 
 org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:397)
 at com.dinglicom.hbase.HbaseImport.insertData(HbaseImport.java:177)
 at com.dinglicom.hbase.HbaseImport.run(HbaseImport.java:210)
 at java.lang.Thread.run(Thread.java:662)



[jira] [Commented] (HBASE-6843) loading lzo error when using coprocessor

2012-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483171#comment-13483171
 ] 

Hudson commented on HBASE-6843:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #233 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/233/])
HBASE-6843 loading lzo error when using coprocessor (Andy) (Revision 
1401551)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorClassLoader.java


 loading lzo error when using coprocessor
 

 Key: HBASE-6843
 URL: https://issues.apache.org/jira/browse/HBASE-6843
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors
Affects Versions: 0.94.1
Reporter: Zhou wenjian
Assignee: Zhou wenjian
Priority: Critical
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6843-trunk.patch


 After applying HBASE-6308, we found the following error:
 2012-09-06 00:44:38,341 DEBUG 
 org.apache.hadoop.hbase.coprocessor.CoprocessorClassLoader: Finding class: 
 com.hadoop.compression.lzo.LzoCodec
 2012-09-06 00:44:38,351 ERROR com.hadoop.compression.lzo.GPLNativeCodeLoader: 
 Could not load native gpl library
 java.lang.UnsatisfiedLinkError: Native Library 
 /home/zhuzhuang/hbase/0.94.0-ali-1.0/lib/native/Linux-amd64-64/libgplcompression.so
  already loaded in another classloader
 at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1772)
 at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1732)
 at java.lang.Runtime.loadLibrary0(Runtime.java:823)
 at java.lang.System.loadLibrary(System.java:1028)
 at 
 com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
 at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:67)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:113)
 at 
 org.apache.hadoop.hbase.io.hfile.Compression$Algorithm$1.getCodec(Compression.java:107)
 at 
 org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:243)
 at 
 org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:85)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkCompressionCodecs(HRegion.java:3793)
 at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3782)
 at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3732)
 at 
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
 at 
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
 at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 2012-09-06 00:44:38,355 DEBUG 
 org.apache.hadoop.hbase.coprocessor.CoprocessorClassLoader: Skipping exempt 
 class java.io.PrintWriter - delegating directly to parent
 2012-09-06 00:44:38,355 ERROR com.hadoop.compression.lzo.LzoCodec: Cannot 
 load native-lzo without native-hadoop



[jira] [Commented] (HBASE-7018) Fix and Improve TableDescriptor caching for bulk assignment

2012-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483172#comment-13483172
 ] 

Hudson commented on HBASE-7018:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #233 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/233/])
HBASE-7018 Fix and Improve TableDescriptor caching for bulk assignment 
(Revision 1401525)

 Result = FAILURE
gchanan : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java


 Fix and Improve TableDescriptor caching for bulk assignment
 ---

 Key: HBASE-7018
 URL: https://issues.apache.org/jira/browse/HBASE-7018
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.94.3, 0.96.0

 Attachments: 7018-trunk.v2, HBASE-7018-94.patch, 
 HBASE-7018-94-v2.patch, HBASE-7018-94-v3.patch, HBASE-7018-trunk.patch, 
 HBASE-7018-v3-trunk.patch, HBASE-7018-v4-trunk.patch


 HBASE-6214 backported HBASE-5998 (Bulk assignment: regionserver optimization 
 by using a temporary cache for table descriptors when receiving an open 
 regions request), but it's buggy on 0.94 (0.96 appears correct):
 {code}
 HTableDescriptor htd = null;
 if (htds == null) {
   htd = this.tableDescriptors.get(region.getTableName());
 } else {
   htd = htds.get(region.getTableNameAsString());
   if (htd == null) {
     htd = this.tableDescriptors.get(region.getTableName());
     htds.put(region.getRegionNameAsString(), htd);
   }
 }
 {code}
 i.e. we get the tableName from the map but write the regionName.
 Even fixing this, it looks like there are areas for improvement:
 1) FSTableDescriptors already has a cache (though it goes to the NameNode 
 each time through to check we have the latest copy).  May as well combine 
 these two caches; that might be a performance win as well, since we don't 
 need to write to multiple caches.
 2) FSTableDescriptors makes two RPCs to the NameNode when it encounters a new 
 table.  So the total number of RPCs necessary for a bulk assign is
 (without caching):
 #regions + #tables
 (with caching, assuming #tables <= #regions):
 min(#regions,#tables) + #tables = #tables + #tables = 2 * #tables
 We can make this only one RPC, yielding:
 #tables
 Probably not a big deal for most users, but in a multi-tenant situation where 
 the number of regions being bulk assigned approaches the number of tables 
 being bulk assigned, this could be a nice performance win.
 Benchmarks coming.
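
 The key mismatch in the snippet above can be reproduced with a small stand-alone sketch. Nothing below is HBase code: `DescriptorCache`, `Loader` and the string "descriptors" are stand-ins, and each `load` call plays the role of a trip to the underlying table descriptor store.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-alone sketch; names are hypothetical and only mimic the snippet in the description.
class DescriptorCache {
  // Stand-in for the expensive lookup; counts how often it is called.
  static final class Loader {
    int calls = 0;
    String load(String tableName) { calls++; return "htd:" + tableName; }
  }

  // Buggy variant: reads by table name but writes under the region name, so reads never hit.
  static String getBuggy(Map<String, String> htds, Loader loader, String table, String region) {
    String htd = htds.get(table);
    if (htd == null) {
      htd = loader.load(table);
      htds.put(region, htd);
    }
    return htd;
  }

  // Fixed variant: the same key (the table name) is used for both the read and the write.
  static String getFixed(Map<String, String> htds, Loader loader, String table) {
    String htd = htds.get(table);
    if (htd == null) {
      htd = loader.load(table);
      htds.put(table, htd);
    }
    return htd;
  }

  // Opens regionsPerTable regions for each table and returns how often the loader was hit.
  static int countLoads(boolean fixed, String[] tables, int regionsPerTable) {
    Map<String, String> htds = new HashMap<String, String>();
    Loader loader = new Loader();
    for (String table : tables) {
      for (int i = 0; i < regionsPerTable; i++) {
        String region = table + ",region" + i;
        if (fixed) {
          getFixed(htds, loader, table);
        } else {
          getBuggy(htds, loader, table, region);
        }
      }
    }
    return loader.calls;
  }
}
```

 In this sketch the buggy variant hits the loader once per region, while the fixed variant hits it once per table.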



[jira] [Commented] (HBASE-7037) ReplicationPeer logs at WARN level aborting server instead of at FATAL

2012-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483173#comment-13483173
 ] 

Hudson commented on HBASE-7037:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #233 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/233/])
HBASE-7037 ReplicationPeer logs at WARN level aborting server instead of at 
FATAL (Revision 1401563)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeer.java


 ReplicationPeer logs at WARN level aborting server instead of at FATAL
 --

 Key: HBASE-7037
 URL: https://issues.apache.org/jira/browse/HBASE-7037
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: stack
Assignee: liang xie
  Labels: noob
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-7037.txt






[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483325#comment-13483325
 ] 

Ted Yu commented on HBASE-6721:
---

{code}
+  public LoadBalancer getBalancer() {
+    return balancer;
+  }
{code}
The above method can be package private.
{code}
+public class GroupAdminEndpoint extends BaseEndpointCoprocessor implements 
GroupAdminProtocol, EventHandler.EventHandlerListener {
+ private static final Log LOG = LogFactory.getLog(GroupAdminClient.class);
{code}
Please add javadoc for the class. The class declaration line is beyond 100 characters.
LOG is created with the wrong class (GroupAdminClient instead of GroupAdminEndpoint).
{code}
+  private ConcurrentMap<String,String> serversInTransition =
{code}
What does the value in the serversInTransition map represent?
{code}
+   List<HRegionInfo> regions = new ArrayList<HRegionInfo>();
+   if (groupName == null) {
+     throw new NullPointerException("groupName can't be null");
{code}
nit: move ArrayList creation after the if statement.
{code}
+  public Collection<String> listTablesOfGroup(String groupName) throws 
IOException {
{code}
The return type is Collection, which is more generic than the List that 
listOnlineRegionsOfGroup() returns. I guess there might be a reason.
{code}
+  HTableDescriptor[] tables = 
master.getTableDescriptors().getAll().values().toArray(new HTableDescriptor[0]);
{code}
nit: line too long.
{code}
+  public GroupInfo getGroup(String groupName) throws IOException {
{code}
Suggest renaming the method to getGroupInfo(); getGroup() is kind of vague.

More reviews to follow.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Updated] (HBASE-6977) Multithread processing ZK assignment events

2012-10-24 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6977:
---

Status: Open  (was: Patch Available)

Will address Stack's comments and upload a new patch.

 Multithread processing ZK assignment events
 ---

 Key: HBASE-6977
 URL: https://issues.apache.org/jira/browse/HBASE-6977
 Project: HBase
  Issue Type: Improvement
  Components: Region Assignment
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: trunk-6977_v1.patch, trunk-6977_v2-1.patch


 Related to HBASE-6976 and HBASE-6611.  ZK event processing is a bottleneck 
 for assignments, since there is only one ZK event thread.  If we can use 
 multiple threads, it should be better.
 With multiple threads, the order of events could be messed up. However, if we 
 always pass all events related to one region to the same worker thread, the 
 order should be kept.
 We need to play with it and find out how much performance improvement we can 
 get.
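
 One way to keep per-region ordering with several workers is to hash the region name to a fixed single-threaded executor. The sketch below is only an assumption about how that could look, not the attached patch; `RegionEventDispatcher` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: not the HBASE-6977 patch.
class RegionEventDispatcher {
  private final List<ExecutorService> workers = new ArrayList<ExecutorService>();

  RegionEventDispatcher(int nThreads) {
    for (int i = 0; i < nThreads; i++) {
      // One thread per worker, so events queued on a worker run in submission order.
      workers.add(Executors.newSingleThreadExecutor());
    }
  }

  // The same region name always maps to the same worker index.
  int workerIndex(String regionName) {
    return (regionName.hashCode() & Integer.MAX_VALUE) % workers.size();
  }

  // Queues an event on the worker that owns this region.
  void submit(String regionName, Runnable event) {
    workers.get(workerIndex(regionName)).execute(event);
  }

  // Stops accepting events and waits for queued ones; returns false on timeout or interrupt.
  boolean shutdownAndWait(long timeoutMs) {
    for (ExecutorService w : workers) {
      w.shutdown();
    }
    boolean done = true;
    try {
      for (ExecutorService w : workers) {
        done &= w.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return done;
  }
}
```

 Events for different regions may still interleave, but two events for the same region are always handled by the same thread in the order they were submitted.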



[jira] [Updated] (HBASE-6896) sync bulk and regular assigment handling socket timeout exception

2012-10-24 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6896:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Integrated into trunk.  Thanks all for the review.

 sync bulk and regular assigment handling socket timeout exception
 -

 Key: HBASE-6896
 URL: https://issues.apache.org/jira/browse/HBASE-6896
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: trunk-6896.patch, trunk-6896_v2.patch


 In regular assignment, in case of a socket timeout, it tries to call 
 openRegion again and again, without changing the region plan or the ZK offline node, 
 until the region is out of transition, as long as the region server is still up.
 We may need to sync them up and make sure bulk assignment does the same in 
 this case.



[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483385#comment-13483385
 ] 

Ted Yu commented on HBASE-6721:
---

In GroupAdminEndpoint:
{code}
+ throw new IOException(
+ "The region server or the target to move found to be null.");
{code}
It would be nice to point out which parameter is null.
{code}
+throw new DoNotRetryIOException("Group must have no associated tables.");
{code}
Include group name in the exception message.
{code}
+  public Map<String, String> listServersInTransition() throws IOException {
{code}
Return type of Map includes additional information which is not used by 
callers. Suggest returning keySet.
Down in GroupAdminClient:
{code}
+  for(String server: proxy.listServersInTransition().keySet()) {
+    found = found || servers.contains(server);
+  }
{code}
Can you tell me what the body is supposed to achieve?
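If the intent is to set found once any server in transition is also contained in servers, the loop could perhaps be expressed more directly. A minimal sketch, assuming both sides are collections of host:port strings (`TransitionCheck` and `anyInTransition` are made-up names, not part of the patch):

```java
import java.util.Collection;
import java.util.Collections;

// Hypothetical helper; not part of the HBASE-6721 patch.
class TransitionCheck {
  // True if at least one in-transition server is also in the given server set.
  static boolean anyInTransition(Collection<String> inTransition, Collection<String> servers) {
    return !Collections.disjoint(inTransition, servers);
  }
}
```

Unlike the quoted loop, this does not carry over an earlier value of found; the caller would have to OR the result in if that matters.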
Back to GroupAdminEndpoint:
{code}
+  private GroupInfoManager getGroupInfoManager() {
+    return 
((GroupBasedLoadBalancer)menv.getMasterServices().getAssignmentManager().getBalancer()).getGroupInfoManager();
{code}
Does GroupInfoManager belong to the balancer? The above is probably the longest 
indirection I have ever seen :-)
{code}
+  private List<HRegionInfo> getOnlineRegions(String hostPort) throws 
IOException {
{code}
The above method is only called by listOnlineRegionsOfGroup() in a loop over 
online servers, resulting in a nested loop.
Please consider collapsing the nested loop into one loop.
{code}
+  LOG.error(Failed to complete GroupMoveServer with of 
+h.getPlan().getServers().size()+
{code}
nit: remove ' of ' in above sentence.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Commented] (HBASE-6896) sync bulk and regular assigment handling socket timeout exception

2012-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483406#comment-13483406
 ] 

Hudson commented on HBASE-6896:
---

Integrated in HBase-TRUNK #3482 (See 
[https://builds.apache.org/job/HBase-TRUNK/3482/])
HBASE-6896 sync bulk and regular assigment handling socket timeout 
exception (Revision 1401744)

 Result = FAILURE
jxiang : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


 sync bulk and regular assigment handling socket timeout exception
 -

 Key: HBASE-6896
 URL: https://issues.apache.org/jira/browse/HBASE-6896
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: trunk-6896.patch, trunk-6896_v2.patch


 In regular assignment, in case of a socket timeout, it tries to call 
 openRegion again and again, without changing the region plan or the ZK offline node, 
 until the region is out of transition, as long as the region server is still up.
 We may need to sync them up and make sure bulk assignment does the same in 
 this case.



[jira] [Commented] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded

2012-10-24 Thread Kannan Muthukkaruppan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483438#comment-13483438
 ] 

Kannan Muthukkaruppan commented on HBASE-6728:
--


Lars wrote: Will this work right if I set scanner caching to 1000 and then 
deal with 2mb rows? In that case every response will be 2g, and it would always 
block and never make any progress, right?

Yes, we considered that, and that's the reason for not using a simple counting 
semaphore that's initialized to the max size. We want the implementation to 
allow one request to exceed the queue size. We set the default at 1G, but we 
can exceed the limit by one request's size.

From 
http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/SizeBasedThrottler.java?view=markuppathrev=1385058
 :


{code}
 * This implementation allows you to set the value of internal
 * counter to be greater than threshold. It happens
 * when internal counter is lower than threshold and
 * increase method is called with parameter 'delta' big enough
 * so that sum of delta and internal counter is greater than
 * threshold. This is not a bug, this is a feature.
 * It solves some problems:
 *   - thread calling increase with big parameter will not be
 *     starved by other threads calling increase with small
 *     arguments.
 *   - thread calling increase with argument greater than
 *     threshold won't deadlock. This is useful when throttling
 *     queues - you can submit object that is bigger than limit.
 *
 * This implementation introduces small costs in terms of
 * synchronization (no synchronization in most cases at all), but is
 * vulnerable to races. For details see documentation of
 * increase method.
{code}
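
The behaviour described in that javadoc can be sketched as follows. This is a simplified, fully synchronized variant written for illustration, not the SizeBasedThrottler source (which, per its javadoc, avoids synchronization in most cases); `SimpleSizeThrottler` and its method names are hypothetical.

```java
// Simplified illustration; hypothetical names, not the 0.89-fb SizeBasedThrottler.
class SimpleSizeThrottler {
  private final long threshold;
  private long current = 0;

  SimpleSizeThrottler(long threshold) { this.threshold = threshold; }

  // Non-blocking form: admits delta whenever the counter is below the threshold, even if
  // the sum then exceeds it. Returns false if the caller would have to wait.
  synchronized boolean tryIncrease(long delta) {
    if (current >= threshold) return false;
    current += delta;
    return true;
  }

  // Blocking form: waits until the counter drops below the threshold, then admits delta.
  synchronized void increase(long delta) throws InterruptedException {
    while (current >= threshold) {
      wait();
    }
    current += delta;
  }

  // Called once a response has been sent; wakes up any waiting callers.
  synchronized void decrease(long delta) {
    current -= delta;
    notifyAll();
  }

  synchronized long current() { return current; }
}
```

Because admission only checks that the counter is below the threshold, a single 2g response can pass a 1G limit, whereas a counting semaphore holding 1G permits could never grant a 2g acquire.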

 [89-fb] prevent OOM possibility due to per connection responseQueue being 
 unbounded
 ---

 Key: HBASE-6728
 URL: https://issues.apache.org/jira/browse/HBASE-6728
 Project: HBase
  Issue Type: Bug
Reporter: Kannan Muthukkaruppan
Assignee: Michal Gregorczyk
 Fix For: 0.94.3

 Attachments: 6728.94, 6728-trunk.txt


 The per connection responseQueue is an unbounded queue. The request handler 
 threads today try to send the response in line, but if things start to 
 backup, the response is sent via a per connection responder thread. This 
 intermediate queue, because it has no bounds, can be another source of OOMs.
 [Have not looked at this issue in trunk. So it may or may not be applicable 
 there.]



[jira] [Comment Edited] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded

2012-10-24 Thread Kannan Muthukkaruppan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483438#comment-13483438
 ] 

Kannan Muthukkaruppan edited comment on HBASE-6728 at 10/24/12 6:06 PM:


Lars wrote: Will this work right if I set scanner caching to 1000 and then 
deal with 2mb rows? In that case every response will be 2g, and it would always 
block and never make any progress, right?

Yes, we considered that, and that's the reason for not using a simple counting 
semaphore that's initialized to the max size. We want the implementation to 
allow one request to exceed the queue size. We set the default at 1G, but we 
can exceed the limit by one request's size.

From 
http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/SizeBasedThrottler.java?view=markuppathrev=1385058
 :


{code}
 * This implementation allows you to set the value of internal
 * counter to be greater than threshold. It happens
 * when internal counter is lower than threshold and
 * increase method is called with parameter 'delta' big enough
 * so that sum of delta and internal counter is greater than
 * threshold. This is not a bug, this is a feature.
 * It solves some problems:
 *   - thread calling increase with big parameter will not be
 *     starved by other threads calling increase with small
 *     arguments.
 *   - thread calling increase with argument greater than
 *     threshold won't deadlock. This is useful when throttling
 *     queues - you can submit object that is bigger than limit.
 *
 * This implementation introduces small costs in terms of
 * synchronization (no synchronization in most cases at all), but is
 * vulnerable to races. For details see documentation of
 * increase method.
{code}

 [89-fb] prevent OOM possibility due to per connection responseQueue being 
 unbounded
 ---

 Key: HBASE-6728
 URL: https://issues.apache.org/jira/browse/HBASE-6728
 Project: HBase
  Issue Type: Bug
Reporter: Kannan Muthukkaruppan
Assignee: Michal Gregorczyk
 Fix For: 0.94.3

 Attachments: 6728.94, 6728-trunk.txt


 The per connection responseQueue is an unbounded queue. The request handler 
 threads today try to send the response in line, but if things start to 
 backup, the response is sent via a per connection responder thread. This 
 intermediate queue, because it has no bounds, can be another source of OOMs.
 [Have not looked at this issue in trunk. So it may or may not be applicable 
 there.]



[jira] [Comment Edited] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded

2012-10-24 Thread Kannan Muthukkaruppan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13483438#comment-13483438
 ] 

Kannan Muthukkaruppan edited comment on HBASE-6728 at 10/24/12 6:06 PM:


Lars wrote: Will this work right if I set scanner caching to 1000 and then 
deal with 2mb rows? In that case every response will be 2g, and it would always 
block and never make any progress, right?

Yes, we considered that, and that's the reason for not using a simple counting 
semaphore that's initialized to the max size. We want the implementation to 
allow one request to exceed the queue size. We set the default at 1G, but we 
can exceed the limit by one request's size.

From 
http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/SizeBasedThrottler.java?view=markuppathrev=1385058
 :


{code}
 * This implementation allows you to set the value of internal
 * counter to be greater than threshold. It happens
 * when internal counter is lower than threshold and
 * increase method is called with parameter 'delta' big enough
 * so that sum of delta and internal counter is greater than
 * threshold. This is not a bug, this is a feature.
 * It solves some problems:
 *   - thread calling increase with big parameter will not be
 *     starved by other threads calling increase with small
 *     arguments.
 *   - thread calling increase with argument greater than
 *     threshold won't deadlock. This is useful when throttling
 *     queues - you can submit object that is bigger than limit
{code}

  was (Author: kannanm):
Lars wrote: Will this work right if I set scanner caching to 1000 and 
then deal with 2mb rows? In that case every response will be 2g, and it would 
always block and never make any progress, right?

Yes, we considered that, and that's the reason for not using a simple counting 
semaphore that's initialized to the max size. We want the implementation to 
allow one request to exceed the queue size. We set the default at 1G, but we 
can exceed the limit by 1 requests' size amount.

From 
http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/SizeBasedThrottler.java?view=markuppathrev=1385058
 :


{code}
 * This implementation allows you to set the value of internal
 * counter to be greater than threshold. It happens
 * when internal counter is lower than threshold and
 * increase method is called with parameter 'delta' big enough
 * so that sum of delta and internal counter is greater than
 * threshold. This is not a bug, this is a feature.
 * It solves some problems:
 *   - thread calling increase with big parameter will not be
 *     starved by other threads calling increase with small
 *     arguments.
 *   - thread calling increase with argument greater than
 *     threshold won't deadlock. This is useful when throttling
 *     queues - you can submit object that is bigger than limit.
 *
 * This implementation introduces small costs in terms of
 * synchronization (no synchronization in most cases at all), but is
 * vulnerable to races. For details see documentation of
 * increase method.
{code}
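
To make the "one request may exceed the limit" behavior concrete, here is a 
minimal sketch. It is not the SizeBasedThrottler from the 0.89-fb branch: the 
class name ThrottlerSketch and the wouldBlock() probe are made up for 
illustration, and it uses plain synchronized/wait/notifyAll rather than the 
mostly lock-free approach the javadoc describes. The admission rule is the 
one quoted above: a caller waits only while the counter is already at or 
above the threshold.

```java
/** Simplified illustration only; not the actual SizeBasedThrottler. */
final class ThrottlerSketch {
    private final long threshold;
    private long current = 0;

    ThrottlerSketch(long threshold) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
    }

    /**
     * Waits only while the counter is already at or above the threshold,
     * then adds delta. The result may overshoot the threshold by the size
     * of this one request, so a delta larger than the threshold cannot
     * deadlock.
     */
    synchronized long increase(long delta) {
        while (current >= threshold) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while throttled", e);
            }
        }
        current += delta;
        return current;
    }

    /** Releases delta and wakes any callers blocked in increase(). */
    synchronized long decrease(long delta) {
        current -= delta;
        notifyAll();
        return current;
    }

    /** Hypothetical probe: would a call to increase() have to wait right now? */
    synchronized boolean wouldBlock() {
        return current >= threshold;
    }
}
```

With a threshold of 10, increase(4) followed by increase(20) is admitted and 
leaves the counter at 24; further callers wait until decrease() brings the 
counter back under 10. A counting semaphore initialized to the max size would 
instead block the 20-unit request forever.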
  
 [89-fb] prevent OOM possibility due to per connection responseQueue being 
 unbounded
 ---

 Key: HBASE-6728
 URL: https://issues.apache.org/jira/browse/HBASE-6728
 Project: HBase
  Issue Type: Bug
Reporter: Kannan Muthukkaruppan
Assignee: Michal Gregorczyk
 Fix For: 0.94.3

 Attachments: 6728.94, 6728-trunk.txt


 The per connection responseQueue is an unbounded queue. The request handler 
 threads today try to send the response in line, but if things start to 
 backup, the response is sent via a per connection responder thread. This 
 intermediate queue, because it has no bounds, can be another source of OOMs.
 [Have not looked at this issue in trunk. So it may or may not be applicable 
 there.]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483464#comment-13483464
 ] 

Francis Liu commented on HBASE-6721:


[~yuzhih...@gmail.com]

{quote}
What does the value in serversInTransition map represent?
{quote}

It represents servers that are being moved from one group to another.

{quote}
Can you tell me what the body is supposed to achieve ?
Back to GroupAdminEndpoint:
{quote}

Retrieving the balancer during start() returns null, so I have to retrieve 
it lazily as needed.

{quote}
Does GroupInfoManager belong to balancer ? The above is probably the longest 
indirection I have ever seen
{quote}

We had to do this since we wanted to touch AssignmentManager as little as 
possible :)



 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483471#comment-13483471
 ] 

Francis Liu commented on HBASE-6721:


{quote}
We had to do this since we wanted to touch AssignmentManager as little as 
possible
{quote}

As an alternative, we can add a getBalancer() Method to MasterServices. 
Thoughts?

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483498#comment-13483498
 ] 

Ted Yu commented on HBASE-6721:
---

{code}
+  public void beforeProcess(EventHandler event) {
{code}
I think normally the above should be called preProcess().
{code}
+  public void afterProcess(EventHandler event) {
{code}
Rename to postProcess().
{code}
+ * Copyright 2011 The Apache Software Foundation
{code}
The above is no longer needed in license header.
{code}
+public interface GroupAdminProtocol extends GroupAdmin, CoprocessorProtocol {
+}
{code}
I wasn't expecting a Protocol to not have methods in it :-)
{code}
+public class GroupBasedLoadBalancer implements LoadBalancer {
{code}
Add javadoc for GroupBasedLoadBalancer.
{code}
+  } catch (IOException e) {
+LOG.warn("IOException while creating GroupInfoManagerImpl.", e);
+  }
{code}
I think if groupManager cannot be initialized, we should abort master because 
group policy wouldn't be enforced.
In correctAssignments():
{code}
+if ((info == null) || (!info.containsServer(sName.getHostAndPort()))) {
+  // Misplaced region.
+  misplacedRegions.add(region);
{code}
Under what scenario would a region be misplaced at runtime ? I think 
rebalancing misplaced region(s) would affect normal operation of related groups.
{code}
+//unassign misplaced regions, so that they are assigned to correct groups.
+this.services.getAssignmentManager().unassign(misplacedRegions);
{code}
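
For readers following the correctAssignments() discussion, the check can be 
sketched roughly as below. This is an illustration, not code from the patch: 
MisplacedRegionsSketch, findMisplaced and the use of plain strings for 
regions and servers are assumptions, and the real code works with 
HRegionInfo, ServerName and GroupInfo. A region counts as misplaced when its 
group is unknown or when its current server is not a member of that group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustration only; names and types are hypothetical stand-ins. */
final class MisplacedRegionsSketch {
    /**
     * @param regionToServer   current host ("host:port") of each region
     * @param regionToGroup    servers of the group that owns each region;
     *                         a missing entry means no group info was found
     * @return regions whose current server is outside their group
     */
    static List<String> findMisplaced(Map<String, String> regionToServer,
                                      Map<String, Set<String>> regionToGroup) {
        List<String> misplaced = new ArrayList<String>();
        for (Map.Entry<String, String> e : regionToServer.entrySet()) {
            Set<String> groupServers = regionToGroup.get(e.getKey());
            // Mirrors: (info == null) || !info.containsServer(hostAndPort)
            if (groupServers == null || !groupServers.contains(e.getValue())) {
                misplaced.add(e.getKey());
            }
        }
        return misplaced;
    }
}
```

Everything this returns would then be handed to unassign(), which is why the 
question of when a region can become misplaced at runtime matters: each hit 
moves a region that is currently serving.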

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483502#comment-13483502
 ] 

Ted Yu commented on HBASE-6721:
---

bq. Can you tell me what the body is supposed to achieve ?
I was asking about the following line of code:
{code}
found = found || servers.contains(server);
{code}
It seems to be condition checking.

bq. As an alternative, we can add a getBalancer() Method to MasterServices.
That would be better than the current form.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Commented] (HBASE-7033) Add hbase.lru.blockcache.acceptable.factor to configuration, akin to the min.factor added by HBASE-6312

2012-10-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483523#comment-13483523
 ] 

Sergey Shelukhin commented on HBASE-7033:
-

wondering if someone could review this...


 Add hbase.lru.blockcache.acceptable.factor to configuration, akin to the 
 min.factor added by HBASE-6312
 ---

 Key: HBASE-7033
 URL: https://issues.apache.org/jira/browse/HBASE-7033
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7033.patch


 Background: we want to make the change to block cache setting available on 
 0.94 without actually changing the defaults as was done in HBASE-6312, as 
 this can be destabilizing.
 Thus, both of these would be configurable instead of just one, and the user 
 would be able to switch to new values.



[jira] [Commented] (HBASE-7040) Port HBASE-5867 Improve Compaction Throttle Default to 0.94

2012-10-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483529#comment-13483529
 ] 

Sergey Shelukhin commented on HBASE-7040:
-

We are trying to port a number of performance and stability improvements from 
trunk to 0.94 in order to, well, make it more performant and stable :)
Understandably it's a balancing act with potential destabilization, so please 
feel free to -1 if you think it's not worth the risk.
Thanks!

 Port HBASE-5867 Improve Compaction Throttle Default to 0.94
 ---

 Key: HBASE-7040
 URL: https://issues.apache.org/jira/browse/HBASE-7040
 Project: HBase
  Issue Type: Task
Affects Versions: 0.94.2
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.94.3

 Attachments: HBASE-7040.patch


 Looks like a relatively important (and simple) improvement. Considering 
 porting to 0.94... 



[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483550#comment-13483550
 ] 

Elliott Clark commented on HBASE-6721:
--

Just some initial thoughts:

* I couldn't seem to get it to compile for me on 0.94
* There seem to be a bunch of formatting changes that aren't needed.
* Passing in the preferred server into the load balancer on randomAssignment is 
messy.  If we know the preferred server why call this function at all ?
* The balancer is a public interface and we can't make changes to it in a minor 
release. And this patch won't apply to trunk.
* With this many interfaces and classes it might make sense to move them into a 
namespace.
* Why is GroupAdminClient in the master namespace and not in the client 
namespace?
* Why a co-processor and not build it in ?
** Security was done that way because it can be added or removed. As the patch 
is, that's not really possible.
** This makes a lot of changes in core code for something that is a 
co-processor. 
* Don't create a DefaultLoadBalancer in GroupBasedLoadBalancer.  The balancer 
was made pluggable and that feature shouldn't go away.
* Why return ArrayListMultimap from groupRegions in GroupBasedLoadBalancer? Why 
not the base class?
* HTableDescriptor seems like the correct location for info about the table if 
you don't want to put that data into meta.
* Putting things into the filesystem seems like the wrong way to do it. There 
are just so many different moving parts with getting things from hdfs with 
caching and cache invalidation, and edge cases on failure.
* There's a lot of logic about balancing bleeding into the AssignmentManager.  
Right now assignment manager is already too complex.  I would much prefer a 
solution that had everything in the balancer.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483562#comment-13483562
 ] 

Ted Yu commented on HBASE-6721:
---

bq. There's a lot of logic about balancing bleeding into the AssignmentManager.
Looking at the changes in AssignmentManager, they are mostly white space 
removal.
There is only one real change:
{code}
+  public LoadBalancer getBalancer() {
+return balancer;
+  }
{code}
which Francis agrees to move out.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Created] (HBASE-7045) Add some comments to MVCC code

2012-10-24 Thread Gregory Chanan (JIRA)
Gregory Chanan created HBASE-7045:
-

 Summary: Add some comments to MVCC code
 Key: HBASE-7045
 URL: https://issues.apache.org/jira/browse/HBASE-7045
 Project: HBase
  Issue Type: Task
  Components: Transactions/MVCC
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.96.0


I've been digging through the MVCC/transaction code and adding some comments to 
help me (or others) understand quicker the next time through



[jira] [Updated] (HBASE-7045) Add some comments to MVCC code

2012-10-24 Thread Gregory Chanan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregory Chanan updated HBASE-7045:
--

Attachment: HBASE-7045.patch

 Add some comments to MVCC code
 --

 Key: HBASE-7045
 URL: https://issues.apache.org/jira/browse/HBASE-7045
 Project: HBase
  Issue Type: Task
  Components: Transactions/MVCC
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7045.patch


 I've been digging through the MVCC/transaction code and adding some comments 
 to help me (or others) understand quicker the next time through



[jira] [Updated] (HBASE-7045) Add some comments to MVCC code

2012-10-24 Thread Gregory Chanan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregory Chanan updated HBASE-7045:
--

Status: Patch Available  (was: Open)

 Add some comments to MVCC code
 --

 Key: HBASE-7045
 URL: https://issues.apache.org/jira/browse/HBASE-7045
 Project: HBase
  Issue Type: Task
  Components: Transactions/MVCC
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7045.patch


 I've been digging through the MVCC/transaction code and adding some comments 
 to help me (or others) understand quicker the next time through



[jira] [Updated] (HBASE-6371) [89-fb] Tier based compaction

2012-10-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-6371:


Attachment: HBASE-6371-089fb-commit.patch

I am attaching the 0.89fb commit for reference.
The commit hash is 1b3e7bb4df1ed05d7d268cb90ffc23f5955c4398 

 [89-fb] Tier based compaction
 -

 Key: HBASE-6371
 URL: https://issues.apache.org/jira/browse/HBASE-6371
 Project: HBase
  Issue Type: Improvement
Reporter: Akashnil
Assignee: Liyin Tang
  Labels: noob
 Attachments: HBASE-6371-089fb-commit.patch


 Currently, the compaction selection is not very flexible and is not sensitive 
 to the hotness of the data. Very old data is likely to be accessed less, and 
 very recent data is likely to be in the block cache. Both of these 
 considerations make it inefficient to compact these files as aggressively as 
 other files. In some use-cases, the access-pattern is particularly obvious 
 even though there is no way to control the compaction algorithm in those 
 cases.
 In the new compaction selection algorithm, we plan to divide the candidate 
 files into different levels according to oldness of the data that is present 
 in those files. For each level, parameters like compaction ratio, minimum 
 number of store-files in each compaction may be different. Number of levels, 
 time-ranges, and parameters for each level will be configurable online on a 
 per-column family basis.
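
 As a rough illustration of the "divide the candidate files into levels by 
 age" step, the sketch below buckets files by age against a list of 
 time-range boundaries. It is not taken from the attached 0.89-fb commit: 
 TierSketch, tierFor, groupByTier and the boundary values in the usage note 
 are assumptions, and the per-tier compaction ratio and minimum file count 
 are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only; not the selection algorithm from the patch. */
final class TierSketch {
    /**
     * Returns the index of the first tier whose maximum age covers the file.
     * Files older than every boundary fall into one extra, oldest tier.
     * Boundaries are assumed to be sorted in ascending order.
     */
    static int tierFor(long ageMillis, long[] tierMaxAgeMillis) {
        for (int i = 0; i < tierMaxAgeMillis.length; i++) {
            if (ageMillis <= tierMaxAgeMillis[i]) {
                return i;
            }
        }
        return tierMaxAgeMillis.length;
    }

    /** Groups file ages into tiers, keeping the input order within a tier. */
    static List<List<Long>> groupByTier(long[] fileAgesMillis, long[] tierMaxAgeMillis) {
        List<List<Long>> tiers = new ArrayList<List<Long>>();
        for (int i = 0; i <= tierMaxAgeMillis.length; i++) {
            tiers.add(new ArrayList<Long>());
        }
        for (long age : fileAgesMillis) {
            tiers.get(tierFor(age, tierMaxAgeMillis)).add(age);
        }
        return tiers;
    }
}
```

 With boundaries of 1000 ms and 10000 ms, files aged 500, 5000, 50000 and 
 1000 ms land in tiers 0, 1, 2 and 0 respectively. Each tier could then be 
 given its own compaction ratio and minimum store-file count, as the 
 description proposes.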



[jira] [Commented] (HBASE-7040) Port HBASE-5867 Improve Compaction Throttle Default to 0.94

2012-10-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483613#comment-13483613
 ] 

Lars Hofhansl commented on HBASE-7040:
--

Going to commit this today or tomorrow unless I hear objections.

 Port HBASE-5867 Improve Compaction Throttle Default to 0.94
 ---

 Key: HBASE-7040
 URL: https://issues.apache.org/jira/browse/HBASE-7040
 Project: HBase
  Issue Type: Task
Affects Versions: 0.94.2
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.94.3

 Attachments: HBASE-7040.patch


 Looks like a relatively important (and simple) improvement. Considering 
 porting to 0.94... 



[jira] [Commented] (HBASE-7040) Port HBASE-5867 Improve Compaction Throttle Default to 0.94

2012-10-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483612#comment-13483612
 ] 

Lars Hofhansl commented on HBASE-7040:
--

As I said, I like the patch :)

I think we should commit it. Just wondered whether you had a specific reason.

 Port HBASE-5867 Improve Compaction Throttle Default to 0.94
 ---

 Key: HBASE-7040
 URL: https://issues.apache.org/jira/browse/HBASE-7040
 Project: HBase
  Issue Type: Task
Affects Versions: 0.94.2
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.94.3

 Attachments: HBASE-7040.patch


 Looks like a relatively important (and simple) improvement. Considering 
 porting to 0.94... 



[jira] [Commented] (HBASE-6371) [89-fb] Tier based compaction

2012-10-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483616#comment-13483616
 ] 

Lars Hofhansl commented on HBASE-6371:
--

This is from before there were coprocessors, interesting.

 [89-fb] Tier based compaction
 -

 Key: HBASE-6371
 URL: https://issues.apache.org/jira/browse/HBASE-6371
 Project: HBase
  Issue Type: Improvement
Reporter: Akashnil
Assignee: Liyin Tang
  Labels: noob
 Attachments: HBASE-6371-089fb-commit.patch


 Currently, the compaction selection is not very flexible and is not sensitive 
 to the hotness of the data. Very old data is likely to be accessed less, and 
 very recent data is likely to be in the block cache. Both of these 
 considerations make it inefficient to compact these files as aggressively as 
 other files. In some use-cases, the access-pattern is particularly obvious 
 even though there is no way to control the compaction algorithm in those 
 cases.
 In the new compaction selection algorithm, we plan to divide the candidate 
 files into different levels according to oldness of the data that is present 
 in those files. For each level, parameters like compaction ratio, minimum 
 number of store-files in each compaction may be different. Number of levels, 
 time-ranges, and parameters for each level will be configurable online on a 
 per-column family basis.



[jira] [Commented] (HBASE-7045) Add some comments to MVCC code

2012-10-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483621#comment-13483621
 ] 

Hadoop QA commented on HBASE-7045:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550682/HBASE-7045.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
82 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3139//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3139//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3139//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3139//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3139//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3139//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3139//console

This message is automatically generated.

 Add some comments to MVCC code
 --

 Key: HBASE-7045
 URL: https://issues.apache.org/jira/browse/HBASE-7045
 Project: HBase
  Issue Type: Task
  Components: Transactions/MVCC
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7045.patch


 I've been digging through the MVCC/transaction code and adding some comments 
 to help me (or others) understand quicker the next time through



[jira] [Commented] (HBASE-7045) Add some comments to MVCC code

2012-10-24 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483633#comment-13483633
 ] 

Elliott Clark commented on HBASE-7045:
--

+1

Nit:
{code}To complete the WriteEntry, call {@link 
#completeMemstoreInsert(WriteEntry)}.{code}

Maybe say: To complete and wait for it to be visible, call 
completeMemstoreInsert.

 Add some comments to MVCC code
 --

 Key: HBASE-7045
 URL: https://issues.apache.org/jira/browse/HBASE-7045
 Project: HBase
  Issue Type: Task
  Components: Transactions/MVCC
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7045.patch


 I've been digging through the MVCC/transaction code and adding some comments 
 to help me (or others) understand quicker the next time through



[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483642#comment-13483642
 ] 

Francis Liu commented on HBASE-6721:


{quote}
I was asking about the following line of code:

found = found || servers.contains(server);

It seems to be condition checking.
{quote}

Yeah, it's basically checking whether any of the servers to be moved is already 
in the serversInTransition map, meaning it is already being moved, so we 
shouldn't allow that.
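
A minimal sketch of that check, for readers without the patch at hand. The 
class and method names are hypothetical and servers are represented as plain 
strings. The loop shape follows the quoted line, on the assumption that 
server iterates over the keys of serversInTransition and servers is the list 
requested to move.

```java
import java.util.Collection;
import java.util.Set;

/** Illustration only; not the code from the patch. */
final class TransitionCheckSketch {
    /**
     * Returns true when at least one server in the move request is already
     * being moved between groups.
     */
    static boolean anyAlreadyMoving(Set<String> serversInTransition,
                                    Collection<String> servers) {
        boolean found = false;
        for (String server : serversInTransition) {
            // Same accumulation as the line under review.
            found = found || servers.contains(server);
        }
        return found;
    }
}
```

The same condition could also be written as 
!Collections.disjoint(serversInTransition, servers), which may read more 
clearly as a condition check than the accumulating loop.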

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483645#comment-13483645
 ] 

Elliott Clark commented on HBASE-6721:
--

{quote}Looking at the changes in AssignmentManager, they are mostly white space 
removal.
There is only one real change:{quote}
You're missing the change where null plans are no longer queued which comes 
about because of this patch.



[jira] [Created] (HBASE-7046) Fix resource leak in TestHLogSplit#testOldRecoveredEditsFileSidelined

2012-10-24 Thread Himanshu Vashishtha (JIRA)
Himanshu Vashishtha created HBASE-7046:
--

 Summary: Fix resource leak in 
TestHLogSplit#testOldRecoveredEditsFileSidelined
 Key: HBASE-7046
 URL: https://issues.apache.org/jira/browse/HBASE-7046
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.96.0
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0


This method creates a writer but never closes it.
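
The usual shape of a fix for this kind of leak is to close the writer in a 
finally block. The sketch below is generic Java, not the actual test code: 
TrackingWriter is a hypothetical stand-in for the WAL writer, used only so the 
close call can be observed.

```java
import java.io.IOException;
import java.io.StringWriter;

class WriterCloseSketch {
    /** Hypothetical stand-in for the real writer; counts close() calls. */
    static class TrackingWriter extends StringWriter {
        int closeCalls = 0;

        @Override
        public void close() throws IOException {
            closeCalls++;
            super.close();
        }
    }

    /** Writes the edits and always closes the writer, even if write() throws. */
    static void writeEdits(TrackingWriter writer, String edits) throws IOException {
        try {
            writer.write(edits);
        } finally {
            writer.close();
        }
    }
}
```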



[jira] [Created] (HBASE-7047) [snapshots] Refactor error handling to use javax.management

2012-10-24 Thread Jesse Yates (JIRA)
Jesse Yates created HBASE-7047:
--

 Summary: [snapshots] Refactor error handling to use 
javax.management
 Key: HBASE-7047
 URL: https://issues.apache.org/jira/browse/HBASE-7047
 Project: HBase
  Issue Type: Sub-task
Affects Versions: hbase-6055
Reporter: Jesse Yates
 Fix For: hbase-6055


The current error handling framework introduced in HBASE-6571 adds a lot of 
complexity for what is essentially a solved problem. Specifically, cross-thread 
notifications have been generalized for the JMX tooling in the javax.management 
classes. 

Similar to what we developed, they have a NotificationBroadcaster, 
NotificationListener, etc. though these are interfaces rather than general 
classes. These javax classes can be used almost 1-to-1 as replacements for 
things like the ExceptionOrchestrator and ExceptionListener. This also gives us 
the opportunity to easily add primitive notifications for standard HBase things 
like (1) timeouts, (2) aborts, and (3) server stops since the framework already 
considers things like typed notifications. 
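
For readers unfamiliar with these classes, here is a minimal sketch of typed 
notifications with the standard javax.management API. The notification type 
strings, the source name "snapshot-1", and the RecordingListener class are 
assumptions made for this example; they are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;
import javax.management.NotificationListener;

class SnapshotErrorNotifications {
    // Hypothetical type strings; the patch may name these differently.
    static final String TIMEOUT = "hbase.snapshot.timeout";
    static final String ABORT = "hbase.snapshot.abort";

    /** Listener that records the type of every notification it receives. */
    static class RecordingListener implements NotificationListener {
        final List<String> seen = Collections.synchronizedList(new ArrayList<String>());

        @Override
        public void handleNotification(Notification notification, Object handback) {
            seen.add(notification.getType());
        }
    }

    /** Sends a timeout and an abort notification and returns the types the listener saw. */
    static List<String> demo() {
        NotificationBroadcasterSupport broadcaster = new NotificationBroadcasterSupport();
        RecordingListener listener = new RecordingListener();
        // null filter: receive every type; null handback: no per-listener context
        broadcaster.addNotificationListener(listener, null, null);
        broadcaster.sendNotification(new Notification(TIMEOUT, "snapshot-1", 1L, "timed out"));
        broadcaster.sendNotification(new Notification(ABORT, "snapshot-1", 2L, "aborted"));
        return new ArrayList<>(listener.seen);
    }
}
```

With the default constructor, NotificationBroadcasterSupport invokes listeners 
in the thread that calls sendNotification; an Executor can be passed to the 
constructor if delivery should happen on another thread.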



[jira] [Assigned] (HBASE-7047) [snapshots] Refactor error handling to use javax.management

2012-10-24 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates reassigned HBASE-7047:
--

Assignee: Jesse Yates

 [snapshots] Refactor error handling to use javax.management
 ---

 Key: HBASE-7047
 URL: https://issues.apache.org/jira/browse/HBASE-7047
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: hbase-6055




[jira] [Updated] (HBASE-7047) [snapshots] Refactor error handling to use javax.management

2012-10-24 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-7047:
---

Attachment: java_6667-v0.txt

Attaching a simple version that refactors the error handling (removing excess 
classes/tests). This patch also modifies the current implementation of the 
offline snapshots (HBASE-6863) to use the new classes: slight tweaks, but 
nothing too crazy.

 [snapshots] Refactor error handling to use javax.management
 ---

 Key: HBASE-7047
 URL: https://issues.apache.org/jira/browse/HBASE-7047
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: hbase-6055

 Attachments: java_6667-v0.txt




[jira] [Updated] (HBASE-7047) [snapshots] Refactor error handling to use javax.management

2012-10-24 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-7047:
---

Attachment: hbase-7047-v0-adv.patch

Attaching an 'advanced' version of v0 that does some more advanced refactoring of 
the offline snapshot handler to take advantage of the new framework. 

Specifically, it uses a centralized notification 'hub' to track the running 
handler and then uses the added StopNotification to pass a 'stop' update to the 
running DisabledTableSnapshotHandler. This is really nice in that it adds 
basically zero overhead for running multiple snapshots or for adapting the code 
to stop a running snapshot and any restores.

 [snapshots] Refactor error handling to use javax.management
 ---

 Key: HBASE-7047
 URL: https://issues.apache.org/jira/browse/HBASE-7047
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: hbase-6055

 Attachments: hbase-7047-v0-adv.patch, java_6667-v0.txt




[jira] [Commented] (HBASE-7047) [snapshots] Refactor error handling to use javax.management

2012-10-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483688#comment-13483688
 ] 

Ted Yu commented on HBASE-7047:
---

Looking at java_6667-v0.txt, I don't see javax.management classes being used.

 [snapshots] Refactor error handling to use javax.management
 ---

 Key: HBASE-7047
 URL: https://issues.apache.org/jira/browse/HBASE-7047
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: hbase-6055

 Attachments: hbase-7047-v0-adv.patch, java_6667-v0.txt




[jira] [Updated] (HBASE-7047) [snapshots] Refactor error handling to use javax.management

2012-10-24 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-7047:
---

Attachment: (was: java_6667-v0.txt)

 [snapshots] Refactor error handling to use javax.management
 ---

 Key: HBASE-7047
 URL: https://issues.apache.org/jira/browse/HBASE-7047
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: hbase-6055

 Attachments: hbase-7047-v0-adv.patch, hbase-7047-v0.patch




[jira] [Commented] (HBASE-7047) [snapshots] Refactor error handling to use javax.management

2012-10-24 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483689#comment-13483689
 ] 

Jesse Yates commented on HBASE-7047:


[~te...@apache.org] whoops, wrong patch. Let's try that again.

 [snapshots] Refactor error handling to use javax.management
 ---

 Key: HBASE-7047
 URL: https://issues.apache.org/jira/browse/HBASE-7047
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: hbase-6055

 Attachments: hbase-7047-v0-adv.patch, java_6667-v0.txt




[jira] [Updated] (HBASE-7047) [snapshots] Refactor error handling to use javax.management

2012-10-24 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-7047:
---

Attachment: hbase-7047-v0.patch

Attaching the correct version of the 'basic' refactor.

 [snapshots] Refactor error handling to use javax.management
 ---

 Key: HBASE-7047
 URL: https://issues.apache.org/jira/browse/HBASE-7047
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: hbase-6055

 Attachments: hbase-7047-v0-adv.patch, hbase-7047-v0.patch




[jira] [Commented] (HBASE-7045) Add some comments to MVCC code

2012-10-24 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483696#comment-13483696
 ] 

Gregory Chanan commented on HBASE-7045:
---

Thanks for the review.  Will do what you suggest on commit.

 Add some comments to MVCC code
 --

 Key: HBASE-7045
 URL: https://issues.apache.org/jira/browse/HBASE-7045
 Project: HBase
  Issue Type: Task
  Components: Transactions/MVCC
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7045.patch


 I've been digging through the MVCC/transaction code and adding some comments 
 to help me (or others) understand it more quickly the next time through.



[jira] [Updated] (HBASE-7045) Add some comments to MVCC code

2012-10-24 Thread Gregory Chanan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregory Chanan updated HBASE-7045:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk.

 Add some comments to MVCC code
 --

 Key: HBASE-7045
 URL: https://issues.apache.org/jira/browse/HBASE-7045
 Project: HBase
  Issue Type: Task
  Components: Transactions/MVCC
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7045.patch




[jira] [Created] (HBASE-7048) Regionsplitter requires the hadoop config path to be in hbase classpath

2012-10-24 Thread Ted Yu (JIRA)
Ted Yu created HBASE-7048:
-

 Summary: Regionsplitter requires the hadoop config path to be in 
hbase classpath
 Key: HBASE-7048
 URL: https://issues.apache.org/jira/browse/HBASE-7048
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.2
Reporter: Ted Yu
 Fix For: 0.94.3, 0.96.0


When the Hadoop config path isn't included in the HBase classpath, you will get 
the following:
{code}
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://t3.e.com/hbase/usertable/_balancedSplit, expected: file:///
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:454)
    at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:67)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:431)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:301)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1005)
    at org.apache.hadoop.hbase.util.RegionSplitter.getSplits(RegionSplitter.java:643)
    at org.apache.hadoop.hbase.util.RegionSplitter.rollingSplit(RegionSplitter.java:367)
    at org.apache.hadoop.hbase.util.RegionSplitter.main(RegionSplitter.java:295)
{code}
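
The trace suggests that, without the Hadoop config on the classpath, the 
default file system falls back to the local one (file:///) while the path 
being checked is an hdfs:// path. The sketch below only mimics that kind of 
scheme check with java.net.URI so the failure mode is easy to see; it is a 
simplified illustration under that assumption, not Hadoop's 
FileSystem.checkPath, and WrongFsSketch is a hypothetical name.

```java
import java.net.URI;

class WrongFsSketch {
    /**
     * Simplified stand-in for a file system's path check: a path with an
     * explicit scheme must use the same scheme as the file system itself.
     */
    static void checkPath(URI fileSystemUri, URI path) {
        String scheme = path.getScheme();
        if (scheme != null && !scheme.equalsIgnoreCase(fileSystemUri.getScheme())) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fileSystemUri);
        }
    }
}
```

Calling checkPath with a file:/// file system and the hdfs:// path from the 
trace throws; calling it with an hdfs:// file system does not, which is the 
situation once the cluster configuration is actually loaded.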



[jira] [Commented] (HBASE-6371) [89-fb] Tier based compaction

2012-10-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483729#comment-13483729
 ] 

Sergey Shelukhin commented on HBASE-6371:
-

I think the updateConfiguration mechanism from this patch deserves a separate 
commit; it's more generic than this change (I hope) and will have to be changed 
to use protobufs. 
I will create a JIRA.

 [89-fb] Tier based compaction
 -

 Key: HBASE-6371
 URL: https://issues.apache.org/jira/browse/HBASE-6371
 Project: HBase
  Issue Type: Improvement
Reporter: Akashnil
Assignee: Liyin Tang
  Labels: noob
 Attachments: HBASE-6371-089fb-commit.patch


 Currently, the compaction selection is not very flexible and is not sensitive 
 to the hotness of the data. Very old data is likely to be accessed less, and 
 very recent data is likely to be in the block cache. Both of these 
 considerations make it inefficient to compact these files as aggressively as 
 other files. In some use-cases, the access-pattern is particularly obvious 
 even though there is no way to control the compaction algorithm in those 
 cases.
 In the new compaction selection algorithm, we plan to divide the candidate 
 files into different levels according to oldness of the data that is present 
 in those files. For each level, parameters like compaction ratio, minimum 
 number of store-files in each compaction may be different. Number of levels, 
 time-ranges, and parameters for each level will be configurable online on a 
 per-column family basis.
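
 To make the "levels by age" idea concrete, here is a small sketch that buckets 
 candidate files into tiers using configurable age boundaries. The class 
 TierSketch, the use of file age in milliseconds, and the boundary semantics 
 (a file goes into the first tier whose upper bound is greater than its age) 
 are assumptions for illustration, not the algorithm in the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

class TierSketch {
    /**
     * Returns the index of the first tier whose upper bound exceeds the age.
     * Files older than every bound land in one extra, final tier.
     */
    static int tierFor(long ageMillis, long[] upperBoundsMillis) {
        for (int i = 0; i < upperBoundsMillis.length; i++) {
            if (ageMillis < upperBoundsMillis[i]) {
                return i;
            }
        }
        return upperBoundsMillis.length;
    }

    /** Groups file ages into tiers; tier i holds the ages that map to index i. */
    static List<List<Long>> groupByTier(long[] fileAgesMillis, long[] upperBoundsMillis) {
        List<List<Long>> tiers = new ArrayList<>();
        for (int i = 0; i <= upperBoundsMillis.length; i++) {
            tiers.add(new ArrayList<Long>());
        }
        for (long age : fileAgesMillis) {
            tiers.get(tierFor(age, upperBoundsMillis)).add(age);
        }
        return tiers;
    }
}
```

 Per-tier parameters such as the compaction ratio or the minimum number of 
 store files could then be looked up by tier index.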



[jira] [Updated] (HBASE-5257) Allow filter to be evaluated after version handling

2012-10-24 Thread Varun Sharma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Sharma updated HBASE-5257:


Attachment: HBASE-5257-0.92.txt

The attached patch passed unit tests for HBase 0.92; HBase 0.95-snapshot will 
need a different patch.

 Allow filter to be evaluated after version handling
 ---

 Key: HBASE-5257
 URL: https://issues.apache.org/jira/browse/HBASE-5257
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
 Attachments: HBASE-5257-0.92.txt


 There are various use cases and filter types where evaluating the filter 
 before versions are handled either does not make sense or makes filter 
 handling more complicated.
 Also see this comment in ScanQueryMatcher:
 {code}
 /**
  * Filters should be checked before checking column trackers. If we do
  * otherwise, as was previously being done, ColumnTracker may increment its
  * counter for even that KV which may be discarded later on by Filter. This
  * would lead to incorrect results in certain cases.
  */
 {code}
 So we had Filters after the column trackers (which do the version checking), 
 and then moved them.
 The ordering should be at the discretion of the Filter.
 We could either add a new method to FilterBase (maybe excludeVersions() or 
 something), or have a new Filter wrapper (like WhileMatchFilter) that should 
 only be used as the outermost filter and indicates the same (maybe 
 ExcludeVersionsFilter).
 See latest comments on HBASE-5229 for motivation.
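
 The difference between the two orderings can be shown with a toy model of one 
 column whose versions are listed newest first. This is a sketch only: 
 FilterOrderSketch and its methods are hypothetical names, and the code does 
 not use ScanQueryMatcher or any HBase class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class FilterOrderSketch {
    /** Filter first: only versions that pass the filter count toward maxVersions. */
    static List<String> filterThenVersions(List<String> newestFirst,
                                           Predicate<String> filter, int maxVersions) {
        List<String> out = new ArrayList<>();
        for (String value : newestFirst) {
            if (out.size() == maxVersions) {
                break;
            }
            if (filter.test(value)) {
                out.add(value);
            }
        }
        return out;
    }

    /** Versions first: keep the newest maxVersions, then apply the filter to those. */
    static List<String> versionsThenFilter(List<String> newestFirst,
                                           Predicate<String> filter, int maxVersions) {
        List<String> out = new ArrayList<>();
        int limit = Math.min(maxVersions, newestFirst.size());
        for (int i = 0; i < limit; i++) {
            String value = newestFirst.get(i);
            if (filter.test(value)) {
                out.add(value);
            }
        }
        return out;
    }
}
```

 With versions ["b", "a"], a filter that accepts only "a", and maxVersions = 1, 
 the filter-first ordering returns the older value "a", while the 
 versions-first ordering returns nothing because the newest version "b" is 
 rejected. Which answer is wanted depends on the filter, which is the point of 
 leaving the choice to the Filter.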



[jira] [Commented] (HBASE-5257) Allow filter to be evaluated after version handling

2012-10-24 Thread Olson,Andrew (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483733#comment-13483733
 ] 

Olson,Andrew commented on HBASE-5257:
-

I will be out of the office with limited access to email until Monday, 
10/29/2012. For urgent issues please contact Greg Whitsitt.

Andrew Olson | Sr. Software Architect | Cerner Corporation | 816.201.3825 | 
aols...@cerner.com | www.cerner.com


 Allow filter to be evaluated after version handling
 ---

 Key: HBASE-5257
 URL: https://issues.apache.org/jira/browse/HBASE-5257
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
 Attachments: HBASE-5257-0.92.txt




[jira] [Created] (HBASE-7049) add dynamic configuration update mechanism

2012-10-24 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-7049:
---

 Summary: add dynamic configuration update mechanism
 Key: HBASE-7049
 URL: https://issues.apache.org/jira/browse/HBASE-7049
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


Initial draft will be modeled on 0.89-fb changes; see HBASE-6371



[jira] [Commented] (HBASE-7049) add dynamic configuration update mechanism

2012-10-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483736#comment-13483736
 ] 

Ted Yu commented on HBASE-7049:
---

@Sergey:
Have you seen HBASE-5335 ?
It is already in trunk.

 add dynamic configuration update mechanism
 --

 Key: HBASE-7049
 URL: https://issues.apache.org/jira/browse/HBASE-7049
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin



[jira] [Updated] (HBASE-5257) Allow filter to be evaluated after version handling

2012-10-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-5257:
--

Attachment: 5257-trunk.txt

Patch for trunk.

TestColumnPaginationFilter, TestFilter and TestFilterList passed.

 Allow filter to be evaluated after version handling
 ---

 Key: HBASE-5257
 URL: https://issues.apache.org/jira/browse/HBASE-5257
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
 Fix For: 0.96.0

 Attachments: 5257-trunk.txt, HBASE-5257-0.92.txt




[jira] [Updated] (HBASE-5257) Allow filter to be evaluated after version handling

2012-10-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-5257:
--

Fix Version/s: 0.96.0
   Status: Patch Available  (was: Reopened)

 Allow filter to be evaluated after version handling
 ---

 Key: HBASE-5257
 URL: https://issues.apache.org/jira/browse/HBASE-5257
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
 Fix For: 0.96.0

 Attachments: 5257-trunk.txt, HBASE-5257-0.92.txt




[jira] [Commented] (HBASE-7049) add dynamic configuration update mechanism

2012-10-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483749#comment-13483749
 ] 

Sergey Shelukhin commented on HBASE-7049:
-

There's also a design option here.
The current approach in the patch is an explicit admin command that is 
propagated to all the requisite objects and causes them to re-read the 
configuration they are interested in from disk.
I personally prefer the approach where the act of replacing the file (or adding 
an override file) would cause the service configuration to be automatically 
updated inside the configuration object itself.
Code never caches values from the config during init; the config object does 
that on init/first request for a value (and on config file change). The code 
instead calls conf.getLong(MyCoolValue) every time (or once per method 
call/compaction/request/...) and gets the most recent value.
For special cases, it's easy to add a mechanism to get several values 
atomically, and for the most special cases to add a change callback.
This avoids adding code to propagate config to places and to handle updates, 
and avoids the non-atomicity of copying the files and then updating 
config via an admin command.

I wonder if there are opinions for either approach?
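
A minimal sketch of the second approach, in which the configuration object 
itself notices a changed file and callers simply ask for the value each time. 
Everything here is an assumption made for illustration: the class name 
ReloadingConf, the use of a java.util.Properties file rather than HBase's XML 
configuration, and the modification-time check as the change trigger.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Properties;

/** Hypothetical self-reloading config; not HBase's Configuration class. */
class ReloadingConf {
    private final Path file;
    private Properties cached = new Properties();
    private FileTime loadedAt = null; // modification time of the version we cached

    ReloadingConf(Path file) {
        this.file = file;
    }

    /** Returns the current value, re-reading the file if it changed since the last load. */
    synchronized long getLong(String key, long defaultValue) throws IOException {
        FileTime modified = Files.getLastModifiedTime(file);
        if (!modified.equals(loadedAt)) {
            Properties fresh = new Properties();
            try (InputStream in = Files.newInputStream(file)) {
                fresh.load(in);
            }
            cached = fresh;
            loadedAt = modified;
        }
        String raw = cached.getProperty(key);
        return raw == null ? defaultValue : Long.parseLong(raw.trim());
    }
}
```

Callers never hold on to a value, so no propagation code is needed; the cost 
is one file metadata lookup per read, which a real implementation would 
probably want to rate-limit or replace with a file watcher.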

 add dynamic configuration update mechanism
 --

 Key: HBASE-7049
 URL: https://issues.apache.org/jira/browse/HBASE-7049
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin



[jira] [Commented] (HBASE-7049) add dynamic configuration update mechanism

2012-10-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483750#comment-13483750
 ] 

Sergey Shelukhin commented on HBASE-7049:
-

ah, nevermind then :)


 add dynamic configuration update mechanism
 --

 Key: HBASE-7049
 URL: https://issues.apache.org/jira/browse/HBASE-7049
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

 Initial draft will be modeled on 0.89-fb changes; see HBASE-6371



[jira] [Commented] (HBASE-5257) Allow filter to be evaluated after version handling

2012-10-24 Thread Varun Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483757#comment-13483757
 ] 

Varun Sharma commented on HBASE-5257:
-

[~te...@apache.org]

Thanks for patching this against 0.96. Should we also submit patches for 
0.92/0.94? We are using the 0.92 version of HBase.

 Allow filter to be evaluated after version handling
 ---

 Key: HBASE-5257
 URL: https://issues.apache.org/jira/browse/HBASE-5257
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
 Fix For: 0.96.0

 Attachments: 5257-trunk.txt, HBASE-5257-0.92.txt


 There are various use cases and filter types where evaluating the filter 
 before versions are handled either does not make sense or makes filter 
 handling more complicated.
 Also see this comment in ScanQueryMatcher:
 {code}
 /**
  * Filters should be checked before checking column trackers. If we do
  * otherwise, as was previously being done, ColumnTracker may increment its
  * counter for even that KV which may be discarded later on by Filter. This
  * would lead to incorrect results in certain cases.
  */
 {code}
 So we had Filters after the column trackers (which do the version checking), 
 and then moved them.
 This should be at the discretion of the Filter.
 We could either add a new method to FilterBase (maybe excludeVersions() or 
 something), or have a new Filter wrapper (like WhileMatchFilter) that should 
 only be used as the outermost filter and indicates the same (maybe 
 ExcludeVersionsFilter).
 See latest comments on HBASE-5229 for motivation.
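
 The difference between the two orders can be shown with a toy model. This is 
 only a sketch: Cell, limitVersions and the two scan methods below are 
 hypothetical stand-ins, not ScanQueryMatcher or the HBase Filter API, and 
 cells are assumed to arrive newest-first within each column.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Toy model of a scan; none of these types are HBase classes. */
class FilterOrderDemo {
  static final class Cell {
    final String column;
    final long ts;
    final String value;

    Cell(String column, long ts, String value) {
      this.column = column;
      this.ts = ts;
      this.value = value;
    }
  }

  /** Keeps at most maxVersions cells per column (input is newest-first). */
  static List<Cell> limitVersions(List<Cell> cells, int maxVersions) {
    Map<String, Integer> seen = new HashMap<>();
    List<Cell> out = new ArrayList<>();
    for (Cell c : cells) {
      int n = seen.merge(c.column, 1, Integer::sum);
      if (n <= maxVersions) {
        out.add(c);
      }
    }
    return out;
  }

  static List<Cell> applyFilter(List<Cell> cells, Predicate<Cell> filter) {
    List<Cell> out = new ArrayList<>();
    for (Cell c : cells) {
      if (filter.test(c)) {
        out.add(c);
      }
    }
    return out;
  }

  /** Filter first: the version counter only counts cells the filter kept. */
  static List<Cell> filterThenVersions(List<Cell> cells, Predicate<Cell> f, int max) {
    return limitVersions(applyFilter(cells, f), max);
  }

  /** Versions first: the filter only sees the newest max cells per column. */
  static List<Cell> versionsThenFilter(List<Cell> cells, Predicate<Cell> f, int max) {
    return applyFilter(limitVersions(cells, max), f);
  }
}
```

 With one column holding "new" at ts=2 and "old" at ts=1, a filter that accepts 
 only "old", and a limit of one version, filtering first returns the older cell 
 while handling versions first returns nothing. Which answer is right depends 
 on the filter, which is why the choice is proposed to be left to it.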



[jira] [Commented] (HBASE-6371) [89-fb] Tier based compaction

2012-10-24 Thread Nicolas Spiegelberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483759#comment-13483759
 ] 

Nicolas Spiegelberg commented on HBASE-6371:


@Lars:  you are correct about doing a better job of partitioning newly-written 
and stale data.  With leveled compaction, the different tiers implicitly 
become different age groups.  This was the primary motivation for us.

Also note that we are looking into coprocessor-based compactions, but will 
probably use them for TSDB-style compactions and other niche features whose 
place in the core is questionable.

 [89-fb] Tier based compaction
 -

 Key: HBASE-6371
 URL: https://issues.apache.org/jira/browse/HBASE-6371
 Project: HBase
  Issue Type: Improvement
Reporter: Akashnil
Assignee: Liyin Tang
  Labels: noob
 Attachments: HBASE-6371-089fb-commit.patch


 Currently, the compaction selection is not very flexible and is not sensitive 
 to the hotness of the data. Very old data is likely to be accessed less, and 
 very recent data is likely to be in the block cache. Both of these 
 considerations make it inefficient to compact these files as aggressively as 
 other files. In some use-cases, the access-pattern is particularly obvious 
 even though there is no way to control the compaction algorithm in those 
 cases.
 In the new compaction selection algorithm, we plan to divide the candidate 
 files into different levels according to oldness of the data that is present 
 in those files. For each level, parameters like compaction ratio, minimum 
 number of store-files in each compaction may be different. Number of levels, 
 time-ranges, and parameters for each level will be configurable online on a 
 per-column family basis.
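
 As a rough illustration of the idea, files can be bucketed by age into tiers 
 that each carry their own parameters. This is a sketch under assumptions: 
 TierSketch, Tier, and the field names are hypothetical and are not taken from 
 the 0.89-fb patch, and the per-tier compaction ratio test itself is omitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tier model; not the code from the attached patch. */
class TierSketch {
  static final class Tier {
    final long maxAgeMillis;      // files at most this old belong to this tier
    final double compactionRatio; // per-tier parameter, carried but unused here
    final int minFiles;           // minimum files before the tier is considered

    Tier(long maxAgeMillis, double compactionRatio, int minFiles) {
      this.maxAgeMillis = maxAgeMillis;
      this.compactionRatio = compactionRatio;
      this.minFiles = minFiles;
    }
  }

  /**
   * Index of the first tier whose age bound covers the file. The last tier
   * catches everything older. The tier list must be non-empty and sorted by
   * increasing maxAgeMillis.
   */
  static int tierFor(long fileAgeMillis, List<Tier> tiers) {
    for (int i = 0; i < tiers.size(); i++) {
      if (fileAgeMillis <= tiers.get(i).maxAgeMillis) {
        return i;
      }
    }
    return tiers.size() - 1;
  }

  /** Tiers that hold at least their own minFiles candidate files. */
  static List<Integer> eligibleTiers(long[] fileAgesMillis, List<Tier> tiers) {
    int[] counts = new int[tiers.size()];
    for (long age : fileAgesMillis) {
      counts[tierFor(age, tiers)]++;
    }
    List<Integer> out = new ArrayList<>();
    for (int i = 0; i < counts.length; i++) {
      if (counts[i] >= tiers.get(i).minFiles) {
        out.add(i);
      }
    }
    return out;
  }
}
```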



[jira] [Commented] (HBASE-7049) add dynamic configuration update mechanism

2012-10-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483762#comment-13483762
 ] 

Sergey Shelukhin commented on HBASE-7049:
-

hmm, I am actually referring to xml config settings, rather than column/etc. 
Do you mean HBASE-3909?


 add dynamic configuration update mechanism
 --

 Key: HBASE-7049
 URL: https://issues.apache.org/jira/browse/HBASE-7049
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

 Initial draft will be modeled on 0.89-fb changes; see HBASE-6371



[jira] [Updated] (HBASE-7049) add dynamic HBase xml configuration update mechanism

2012-10-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7049:


Summary: add dynamic HBase xml configuration update mechanism  (was: add 
dynamic configuration update mechanism)

 add dynamic HBase xml configuration update mechanism
 

 Key: HBASE-7049
 URL: https://issues.apache.org/jira/browse/HBASE-7049
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

 Initial draft will be modeled on 0.89-fb changes; see HBASE-6371



[jira] [Resolved] (HBASE-7049) add dynamic HBase xml configuration update mechanism

2012-10-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HBASE-7049.
-

Resolution: Duplicate

 add dynamic HBase xml configuration update mechanism
 

 Key: HBASE-7049
 URL: https://issues.apache.org/jira/browse/HBASE-7049
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

 Initial draft will be modeled on 0.89-fb changes; see HBASE-6371



[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483768#comment-13483768
 ] 

Francis Liu commented on HBASE-6721:


Thanks for the comments [~eclark]

{quote}
I couldn't seem to get it to compile for me on 0.94
{quote}
Did you apply the patches attached in the subtasks before applying this patch? 
If you'd like them all in one single patch I can do that as well.

{quote}
There seem to be a bunch of formatting changes that aren't needed.
{quote}
Will clean that up in the next update.

{quote}
Passing in the preferred server into the load balancer on randomAssignment is 
messy. If we know the preferred server why call this function at all ?
{quote}
Good point, will remove that argument.

{quote}
The balancer is a public interface and we can't make changes to it in a minor 
release. And this patch won't apply to trunk.
{quote}
I see, we can make it binary compatible at least by supporting both interfaces 
if you're amenable to that. We're planning on getting 0.94 into production and 
it'd be great if we didn't have a lot of custom patches on top of it.

{quote}
With this many interfaces and classes it might make sense to move them into a 
namespace.
{quote}
Will look into doing this; my main concern is whether there are any 
dependencies on package-private methods.

{quote}
Why a co-processor and not build it in ?
Security was done that way because it can be added or removed. As the patch 
is, that's not really possible.
This makes a lot of changes in core code for something that is a 
co-processor.
{quote} 
As part of the design, HBase should run fine without the group based classes 
enabled (endpoint, balancer, etc). If that is not the case then that's a bug. As 
for the code changes in core code, some may be unavoidable, but we could 
probably still make it less invasive (i.e. remove the EventHandler changes). 
Having said that, I don't mind if the community would like to have this fully 
integrated into HBase, just let us know.

{quote}
Don't create a DefaultLoadBalancer in GroupBasedLoadBalancer. The balancer was 
made pluggable and that feature shouldn't go away.
{quote}
The balancer is still pluggable; it's just not pluggable for the 
GroupBasedLoadBalancer. It should be ok to make that pluggable as well, though.

{quote}
HTableDescriptor seems like the correct location for info about the table if 
you don't want to put that data into meta.
{quote}
Yes, we have group affiliation stored as a table property, though group 
information is stored on HDFS.

{quote}
putting things into the filesystem seems like the wrong way to do it. There are 
just so many different moving parts with getting things from hdfs with caching 
and cache invalidation, and edge cases on failure.
{quote}
I see, where do you suggest we put it? ZooKeeper? We mainly had it in HDFS since 
ZK seemed to be the place to store only ephemeral data. Putting the data in 
tables would be a lot more complex and would require more core code changes.



 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.
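
 The core lookup behind such group-aware assignment might look roughly like 
 the following. This is a sketch under assumptions: GroupAssignmentSketch, 
 candidateServers, and the "default" group name are hypothetical and are not 
 taken from the attached patch or design document.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical group lookup; not the classes from the attached patch. */
class GroupAssignmentSketch {
  /**
   * Returns the online servers that may host regions of the given table.
   * The result is empty when the table's group has no server online, so the
   * caller never falls back to a server outside the group.
   */
  static List<String> candidateServers(
      String table,
      Map<String, String> tableToGroup,
      Map<String, Set<String>> groupToServers,
      List<String> onlineServers) {
    // Tables without an explicit group fall into an assumed "default" group.
    String group = tableToGroup.getOrDefault(table, "default");
    Set<String> members =
        groupToServers.getOrDefault(group, Collections.<String>emptySet());
    List<String> out = new ArrayList<>();
    for (String server : onlineServers) {
      if (members.contains(server)) {
        out.add(server);
      }
    }
    return out;
  }
}
```

 A balancer restricted to this candidate list would then balance each group 
 on its own.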



[jira] [Updated] (HBASE-5257) Allow filter to be evaluated after version handling

2012-10-24 Thread Varun Sharma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Sharma updated HBASE-5257:


Attachment: HBASE-5257-0.94.txt

Patch for 0.94

 Allow filter to be evaluated after version handling
 ---

 Key: HBASE-5257
 URL: https://issues.apache.org/jira/browse/HBASE-5257
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
 Fix For: 0.96.0

 Attachments: 5257-trunk.txt, HBASE-5257-0.92.txt, HBASE-5257-0.94.txt




[jira] [Commented] (HBASE-5257) Allow filter to be evaluated after version handling

2012-10-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483782#comment-13483782
 ] 

Hadoop QA commented on HBASE-5257:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550721/HBASE-5257-0.94.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3141//console

This message is automatically generated.

 Allow filter to be evaluated after version handling
 ---

 Key: HBASE-5257
 URL: https://issues.apache.org/jira/browse/HBASE-5257
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
 Fix For: 0.96.0

 Attachments: 5257-trunk.txt, HBASE-5257-0.92.txt, HBASE-5257-0.94.txt




[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2012-10-24 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483785#comment-13483785
 ] 

Francis Liu commented on HBASE-6721:


{quote}
You're missing the change where null plans are no longer queued which comes 
about because of this patch.
{quote}
We needed this change to prevent regions from being assigned to region servers 
they don't belong to. We can continue to recognize null; we just need another 
way to prevent regions from being assigned to the wrong group of region 
servers. One option is to have a dead/bogus server as part of the plan if no 
online servers are available for a given group, so that the region eventually 
gets reassigned once a live server is up. Would that work?

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
 Fix For: 0.96.0

 Attachments: HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721-DesigDoc.pdf




[jira] [Commented] (HBASE-5257) Allow filter to be evaluated after version handling

2012-10-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483790#comment-13483790
 ] 

Hadoop QA commented on HBASE-5257:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12550711/5257-trunk.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
82 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3140//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3140//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3140//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3140//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3140//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3140//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3140//console

This message is automatically generated.

 Allow filter to be evaluated after version handling
 ---

 Key: HBASE-5257
 URL: https://issues.apache.org/jira/browse/HBASE-5257
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
 Fix For: 0.96.0

 Attachments: 5257-trunk.txt, HBASE-5257-0.92.txt, HBASE-5257-0.94.txt




[jira] [Commented] (HBASE-6896) sync bulk and regular assigment handling socket timeout exception

2012-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483794#comment-13483794
 ] 

Hudson commented on HBASE-6896:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #234 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/234/])
HBASE-6896 sync bulk and regular assigment handling socket timeout 
exception (Revision 1401744)

 Result = FAILURE
jxiang : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


 sync bulk and regular assigment handling socket timeout exception
 -

 Key: HBASE-6896
 URL: https://issues.apache.org/jira/browse/HBASE-6896
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: trunk-6896.patch, trunk-6896_v2.patch


 In regular assignment, on a socket (network) timeout, it keeps calling 
 openRegion again and again without changing the region plan or the ZK 
 offline node until the region is out of transition, as long as the region 
 server is still up.
 We may need to sync them up and make sure bulk assignment does the same in 
 this case.
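
 The retry behaviour described above can be sketched as a small loop. This is 
 an illustration only: RegionOpener, isServerUp and openWithRetry are 
 hypothetical names, not AssignmentManager methods, and the attempt bound is 
 an added assumption (the description retries until the region leaves 
 transition).

```java
import java.net.SocketTimeoutException;

/** Hypothetical retry loop; not the AssignmentManager implementation. */
class OpenRetrySketch {
  /** Stand-in for the RPC that asks a region server to open a region. */
  interface RegionOpener {
    void openRegion(String region) throws SocketTimeoutException;

    boolean isServerUp();
  }

  /**
   * Retries the open call against the same server (same plan) while it times
   * out and the server is still up. Returns the number of attempts used, or
   * -1 if the server went down or the attempt bound was reached, in which
   * case the caller would need a new plan.
   */
  static int openWithRetry(RegionOpener opener, String region, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        opener.openRegion(region);
        return attempt;
      } catch (SocketTimeoutException e) {
        if (!opener.isServerUp()) {
          return -1; // server is gone; retrying the same plan is pointless
        }
        // Server is still up: keep the plan and try again.
      }
    }
    return -1;
  }
}
```

 Sharing one such routine between regular and bulk assignment is one way the 
 two paths could be kept in sync.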



[jira] [Commented] (HBASE-7045) Add some comments to MVCC code

2012-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483793#comment-13483793
 ] 

Hudson commented on HBASE-7045:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #234 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/234/])
HBASE-7045 Add some comments to MVCC code (Revision 1401910)

 Result = FAILURE
gchanan : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java


 Add some comments to MVCC code
 --

 Key: HBASE-7045
 URL: https://issues.apache.org/jira/browse/HBASE-7045
 Project: HBase
  Issue Type: Task
  Components: Transactions/MVCC
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7045.patch


 I've been digging through the MVCC/transaction code and adding some comments 
 to help me (or others) understand quicker the next time through



[jira] [Updated] (HBASE-5257) Allow filter to be evaluated after version handling

2012-10-24 Thread Varun Sharma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Sharma updated HBASE-5257:


Attachment: HBASE-5257-0.92.txt

Corrected patch file in right format for 0.92

 Allow filter to be evaluated after version handling
 ---

 Key: HBASE-5257
 URL: https://issues.apache.org/jira/browse/HBASE-5257
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
 Fix For: 0.96.0

 Attachments: 5257-trunk.txt, HBASE-5257-0.92.txt, 
 HBASE-5257-0.92.txt, HBASE-5257-0.94.txt, HBASE-5257-0.94.txt




[jira] [Updated] (HBASE-5257) Allow filter to be evaluated after version handling

2012-10-24 Thread Varun Sharma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Sharma updated HBASE-5257:


Attachment: HBASE-5257-0.94.txt

Correctly formatted patch file...

 Allow filter to be evaluated after version handling
 ---

 Key: HBASE-5257
 URL: https://issues.apache.org/jira/browse/HBASE-5257
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
 Fix For: 0.96.0

 Attachments: 5257-trunk.txt, HBASE-5257-0.92.txt, 
 HBASE-5257-0.92.txt, HBASE-5257-0.94.txt, HBASE-5257-0.94.txt




[jira] [Commented] (HBASE-7045) Add some comments to MVCC code

2012-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483798#comment-13483798
 ] 

Hudson commented on HBASE-7045:
---

Integrated in HBase-TRUNK #3483 (See 
[https://builds.apache.org/job/HBase-TRUNK/3483/])
HBASE-7045 Add some comments to MVCC code (Revision 1401910)

 Result = FAILURE
gchanan : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java


 Add some comments to MVCC code
 --

 Key: HBASE-7045
 URL: https://issues.apache.org/jira/browse/HBASE-7045
 Project: HBase
  Issue Type: Task
  Components: Transactions/MVCC
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7045.patch


 I've been digging through the MVCC/transaction code and adding some comments 
 to help me (or others) understand quicker the next time through


