[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter

2011-10-19 Thread Nicolas Spiegelberg (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Spiegelberg updated HBASE-4489:
---

   Resolution: Fixed
Fix Version/s: (was: 0.90.5)
   0.94.0
   Status: Resolved  (was: Patch Available)

putting in 0.94 since it is an improvement and we don't plan on a long release 
cycle between 0.92 and 0.94

 Better key splitting in RegionSplitter
 --

 Key: HBASE-4489
 URL: https://issues.apache.org/jira/browse/HBASE-4489
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Dave Revell
Assignee: Dave Revell
 Fix For: 0.94.0

 Attachments: HBASE-4489-branch0.90-v1.patch, 
 HBASE-4489-branch0.90-v2.patch, HBASE-4489-branch0.90-v3.patch, 
 HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch, 
 HBASE-4489-trunk-v3.patch, HBASE-4489-trunk-v4.patch, 
 HBASE-4489-trunk-v5.patch


 The RegionSplitter utility allows users to create a pre-split table from the 
 command line or do a rolling split on an existing table. It supports 
 pluggable split algorithms that implement the SplitAlgorithm interface. The 
 only/default SplitAlgorithm is one that assumes keys fall in the range from 
 ASCII string  to ASCII string 7FFF. This is not a sane 
 default, and seems useless to most users. Users are likely to be surprised by 
 the fact that all the region splits occur in in the byte range of ASCII 
 characters.
 A better default split algorithm would be one that evenly divides the space 
 of all bytes, which is what this patch does. Making a table with five regions 
 would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and 
 \xFF\xFF.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter

2011-10-15 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4489:
--

Fix Version/s: 0.90.5

 Better key splitting in RegionSplitter
 --

 Key: HBASE-4489
 URL: https://issues.apache.org/jira/browse/HBASE-4489
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Dave Revell
Assignee: Dave Revell
 Fix For: 0.90.5

 Attachments: HBASE-4489-branch0.90-v1.patch, 
 HBASE-4489-branch0.90-v2.patch, HBASE-4489-branch0.90-v3.patch, 
 HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch, 
 HBASE-4489-trunk-v3.patch, HBASE-4489-trunk-v4.patch, 
 HBASE-4489-trunk-v5.patch


 The RegionSplitter utility allows users to create a pre-split table from the 
 command line or do a rolling split on an existing table. It supports 
 pluggable split algorithms that implement the SplitAlgorithm interface. The 
 only/default SplitAlgorithm is one that assumes keys fall in the range from 
 ASCII string  to ASCII string 7FFF. This is not a sane 
 default, and seems useless to most users. Users are likely to be surprised by 
 the fact that all the region splits occur in in the byte range of ASCII 
 characters.
 A better default split algorithm would be one that evenly divides the space 
 of all bytes, which is what this patch does. Making a table with five regions 
 would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and 
 \xFF\xFF.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter

2011-10-15 Thread Dave Revell (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Revell updated HBASE-4489:
---

Release Note: The split algorithm used by RegionSplitter is now a required 
parameter. Previously there was one split algorithm called MD5StringSplit, 
which was the default. MD5StringSplit has been renamed to HexStringSplit, and 
tweaked so that its maximum key is now  instead of 7FFF. A new 
split algorithm UniformSplit has been added which treats keys as arbitrary 
bytes.

 Better key splitting in RegionSplitter
 --

 Key: HBASE-4489
 URL: https://issues.apache.org/jira/browse/HBASE-4489
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Dave Revell
Assignee: Dave Revell
 Fix For: 0.90.5

 Attachments: HBASE-4489-branch0.90-v1.patch, 
 HBASE-4489-branch0.90-v2.patch, HBASE-4489-branch0.90-v3.patch, 
 HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch, 
 HBASE-4489-trunk-v3.patch, HBASE-4489-trunk-v4.patch, 
 HBASE-4489-trunk-v5.patch


 The RegionSplitter utility allows users to create a pre-split table from the 
 command line or do a rolling split on an existing table. It supports 
 pluggable split algorithms that implement the SplitAlgorithm interface. The 
 only/default SplitAlgorithm is one that assumes keys fall in the range from 
 ASCII string  to ASCII string 7FFF. This is not a sane 
 default, and seems useless to most users. Users are likely to be surprised by 
 the fact that all the region splits occur in in the byte range of ASCII 
 characters.
 A better default split algorithm would be one that evenly divides the space 
 of all bytes, which is what this patch does. Making a table with five regions 
 would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and 
 \xFF\xFF.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter

2011-10-14 Thread Dave Revell (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Revell updated HBASE-4489:
---

Attachment: HBASE-4489-trunk-v5.patch

Nicolas, agreed. New patch for trunk (v5) attached that incorporates your 
feedback.

The v3 patch is still the most current patch for 0.90.

 Better key splitting in RegionSplitter
 --

 Key: HBASE-4489
 URL: https://issues.apache.org/jira/browse/HBASE-4489
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Dave Revell
Assignee: Dave Revell
 Attachments: HBASE-4489-branch0.90-v1.patch, 
 HBASE-4489-branch0.90-v2.patch, HBASE-4489-branch0.90-v3.patch, 
 HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch, 
 HBASE-4489-trunk-v3.patch, HBASE-4489-trunk-v4.patch, 
 HBASE-4489-trunk-v5.patch


 The RegionSplitter utility allows users to create a pre-split table from the 
 command line or do a rolling split on an existing table. It supports 
 pluggable split algorithms that implement the SplitAlgorithm interface. The 
 only/default SplitAlgorithm is one that assumes keys fall in the range from 
 ASCII string  to ASCII string 7FFF. This is not a sane 
 default, and seems useless to most users. Users are likely to be surprised by 
 the fact that all the region splits occur in in the byte range of ASCII 
 characters.
 A better default split algorithm would be one that evenly divides the space 
 of all bytes, which is what this patch does. Making a table with five regions 
 would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and 
 \xFF\xFF.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter

2011-10-13 Thread Dave Revell (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Revell updated HBASE-4489:
---

Attachment: HBASE-4489-trunk-v4.patch

New patch for trunk with last night's feedback. 
 - The SplitAlgorithm to use is now a required parameter
 - MD5StringSplit is now called HexStringSplit
 - The ceiling of 7FFF is now , and tests were changed to 
accomodate this.

No changes are necessary on the 0.90 branch, so the -v3 patch for 0.90 is still 
current.

 Better key splitting in RegionSplitter
 --

 Key: HBASE-4489
 URL: https://issues.apache.org/jira/browse/HBASE-4489
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Dave Revell
Assignee: Dave Revell
 Attachments: HBASE-4489-branch0.90-v1.patch, 
 HBASE-4489-branch0.90-v2.patch, HBASE-4489-branch0.90-v3.patch, 
 HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch, 
 HBASE-4489-trunk-v3.patch, HBASE-4489-trunk-v4.patch


 The RegionSplitter utility allows users to create a pre-split table from the 
 command line or do a rolling split on an existing table. It supports 
 pluggable split algorithms that implement the SplitAlgorithm interface. The 
 only/default SplitAlgorithm is one that assumes keys fall in the range from 
 ASCII string  to ASCII string 7FFF. This is not a sane 
 default, and seems useless to most users. Users are likely to be surprised by 
 the fact that all the region splits occur in in the byte range of ASCII 
 characters.
 A better default split algorithm would be one that evenly divides the space 
 of all bytes, which is what this patch does. Making a table with five regions 
 would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and 
 \xFF\xFF.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter

2011-10-11 Thread Dave Revell (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Revell updated HBASE-4489:
---

Attachment: HBASE-4489-branch0.90-v3.patch
HBASE-4489-trunk-v3.patch

v3 patches with suggested fixes.

 Better key splitting in RegionSplitter
 --

 Key: HBASE-4489
 URL: https://issues.apache.org/jira/browse/HBASE-4489
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Dave Revell
Assignee: Dave Revell
 Attachments: HBASE-4489-branch0.90-v1.patch, 
 HBASE-4489-branch0.90-v2.patch, HBASE-4489-branch0.90-v3.patch, 
 HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch, 
 HBASE-4489-trunk-v3.patch


 The RegionSplitter utility allows users to create a pre-split table from the 
 command line or do a rolling split on an existing table. It supports 
 pluggable split algorithms that implement the SplitAlgorithm interface. The 
 only/default SplitAlgorithm is one that assumes keys fall in the range from 
 ASCII string  to ASCII string 7FFF. This is not a sane 
 default, and seems useless to most users. Users are likely to be surprised by 
 the fact that all the region splits occur in in the byte range of ASCII 
 characters.
 A better default split algorithm would be one that evenly divides the space 
 of all bytes, which is what this patch does. Making a table with five regions 
 would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and 
 \xFF\xFF.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter

2011-10-10 Thread Dave Revell (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Revell updated HBASE-4489:
---

Attachment: HBASE-4489-branch0.90-v2.patch
HBASE-4489-trunk-v2.patch

New patches ending in -v2. These have new tests for RegionSplitter.

Some weirdness: RegionSplitter.rollingSplit() seems to be broken, so it doesn't 
have any test cases in my code. I opened HBASE-4567 to focus on this. I also 
included a test case in TestRegionSplitter.java called 
reproduceDivByZeroFailure() that reproduces the problem. I think fixing this 
bug is outside the scope of this ticket.

 Better key splitting in RegionSplitter
 --

 Key: HBASE-4489
 URL: https://issues.apache.org/jira/browse/HBASE-4489
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Dave Revell
Assignee: Dave Revell
 Attachments: HBASE-4489-branch0.90-v1.patch, 
 HBASE-4489-branch0.90-v2.patch, HBASE-4489-trunk-v1.patch, 
 HBASE-4489-trunk-v2.patch


 The RegionSplitter utility allows users to create a pre-split table from the 
 command line or do a rolling split on an existing table. It supports 
 pluggable split algorithms that implement the SplitAlgorithm interface. The 
 only/default SplitAlgorithm is one that assumes keys fall in the range from 
 ASCII string  to ASCII string 7FFF. This is not a sane 
 default, and seems useless to most users. Users are likely to be surprised by 
 the fact that all the region splits occur in in the byte range of ASCII 
 characters.
 A better default split algorithm would be one that evenly divides the space 
 of all bytes, which is what this patch does. Making a table with five regions 
 would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and 
 \xFF\xFF.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter

2011-09-26 Thread Dave Revell (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Revell updated HBASE-4489:
---

Attachment: HBASE-4489-trunk-v1.patch
HBASE-4489-branch0.90-v1.patch

 Better key splitting in RegionSplitter
 --

 Key: HBASE-4489
 URL: https://issues.apache.org/jira/browse/HBASE-4489
 Project: HBase
  Issue Type: Improvement
Reporter: Dave Revell
Assignee: Dave Revell
 Attachments: HBASE-4489-branch0.90-v1.patch, HBASE-4489-trunk-v1.patch


 The RegionSplitter utility allows users to create a pre-split table from the 
 command line or do a rolling split on an existing table. It supports 
 pluggable split algorithms that implement the SplitAlgorithm interface. The 
 only/default SplitAlgorithm is one that assumes keys fall in the range from 
 ASCII string  to ASCII string 7FFF. This is not a sane 
 default, and seems useless to most users. Users are likely to be surprised by 
 the fact that all the region splits occur in in the byte range of ASCII 
 characters.
 A better default split algorithm would be one that evenly divides the space 
 of all bytes, which is what this patch does. Making a table with five regions 
 would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and 
 \xFF\xFF.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4489) Better key splitting in RegionSplitter

2011-09-26 Thread Dave Revell (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Revell updated HBASE-4489:
---

Affects Version/s: 0.90.4
   Status: Patch Available  (was: Open)

 Better key splitting in RegionSplitter
 --

 Key: HBASE-4489
 URL: https://issues.apache.org/jira/browse/HBASE-4489
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Dave Revell
Assignee: Dave Revell
 Attachments: HBASE-4489-branch0.90-v1.patch, HBASE-4489-trunk-v1.patch


 The RegionSplitter utility allows users to create a pre-split table from the 
 command line or do a rolling split on an existing table. It supports 
 pluggable split algorithms that implement the SplitAlgorithm interface. The 
 only/default SplitAlgorithm is one that assumes keys fall in the range from 
 ASCII string  to ASCII string 7FFF. This is not a sane 
 default, and seems useless to most users. Users are likely to be surprised by 
 the fact that all the region splits occur in in the byte range of ASCII 
 characters.
 A better default split algorithm would be one that evenly divides the space 
 of all bytes, which is what this patch does. Making a table with five regions 
 would split at \x33\x33..., \x66\x66, \x99\x99..., \xCC\xCC..., and 
 \xFF\xFF.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira