[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-31 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15623934#comment-15623934
 ] 

Jean-Marc Spaggiari commented on HBASE-16765:
-

LGTM. Thanks Lars!


> Improve IncreasingToUpperBoundRegionSplitPolicy
> ---
>
> Key: HBASE-16765
> URL: https://issues.apache.org/jira/browse/HBASE-16765
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 16765-0.98.txt
>
>
> We just did some experiments on some larger clusters and found that while 
> using IncreasingToUpperBoundRegionSplitPolicy generally works well and is 
> very convenient, it does tend to produce too many regions.
> Since the logic is - by design - local, checking the number of regions of the 
> table in question on the local server only, we end up with more regions than 
> necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-31 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15623931#comment-15623931
 ] 

Jean-Marc Spaggiari commented on HBASE-16765:
-

I don't have the exact number, but they have some small tables that got split 
way too much. I agree that 2 regions per table per RS is a good limit (then 
after that it's 10GB per region...)



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-31 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15623572#comment-15623572
 ] 

Lars Hofhansl commented on HBASE-16765:
---

OK... So I'll check the class into every branch, so that it could optionally be 
configured as SplitPolicy. For 2.0 I'm going to make this the default.
Everybody cool with that?



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-31 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15623570#comment-15623570
 ] 

Lars Hofhansl commented on HBASE-16765:
---

Heh... Yes. Although the old tribal knowledge is also not necessarily correct 
anymore. How many regions did HBase create? I think if we can cap it to 
2/table/server that'd be a good improvement, and workable, I'd think.




[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-30 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15620509#comment-15620509
 ] 

Jean-Marc Spaggiari commented on HBASE-16765:
-

Just came back from a customer site last week... I was smiling when I saw all 
tables configured with ConstantSizeRegionSplitPolicy... Cluster was about 300 
HBase servers. They complained about HBase creating way too many regions for 
small tables, so they just configured each table to use 
ConstantSizeRegionSplitPolicy. So we should definitely do something on that 
side...



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615861#comment-15615861
 ] 

stack commented on HBASE-16765:
---

Go for it [~lhofhansl]. Put notes up in release notes. Flag it incompatible and 
2.0.



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615823#comment-15615823
 ] 

Lars Hofhansl commented on HBASE-16765:
---

Lost sight of this.

I think this should be the default. It can only go into a major release (or 
perhaps a minor release). The risk is that after running with this for a while 
and then rolling back the upgrade, one might run into a split storm, as the old 
policy will split more aggressively. Not sure if that is enough to keep it out 
of the next minor release.

Lemme commit this, after explaining more in the comments.
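
To make the rollback risk above concrete, here is a rough sketch that compares
the two thresholds for one region. It assumes a 128MB flush size, a 10GB max
file size, and a hypothetical 5GB region on a server holding 2 regions of the
table. The class and method names are made up for illustration and are not part
of the patch.

```java
final class RollbackSplitCheck {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Old policy as discussed in this thread: min(max, 2 * flush * n^3). */
    static long oldThreshold(int regionsOnServer, long flushSize, long maxFileSize) {
        long n = regionsOnServer;
        return Math.min(maxFileSize, 2L * flushSize * n * n * n);
    }

    /** Step function: 2 * flush for a single region, max file size otherwise. */
    static long steppingThreshold(int regionsOnServer, long flushSize, long maxFileSize) {
        return regionsOnServer == 1 ? 2L * flushSize : maxFileSize;
    }

    public static void main(String[] args) {
        long flush = 128L * MB;    // assumed flush size
        long max = 10L * GB;       // assumed max file size
        long regionSize = 5L * GB; // hypothetical region grown under the step policy
        int n = 2;                 // regions of the table on this server

        System.out.println("stepping would split: "
            + (regionSize > steppingThreshold(n, flush, max)));
        System.out.println("old policy would split: "
            + (regionSize > oldThreshold(n, flush, max)));
    }
}
```

Under these assumptions the step policy leaves the 5GB region alone, while the
old policy's 2GB threshold at 2 regions would split it right after a rollback.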



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570029#comment-15570029
 ] 

stack commented on HBASE-16765:
---

Needs a release note.



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570026#comment-15570026
 ] 

stack commented on HBASE-16765:
---

+1 on patch. On commit, add more on how this is different from the default in 
the class comment. Should we make this the default since it is less aggressive? 
Can be a new issue.



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-07 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1541#comment-1541
 ] 

Jean-Marc Spaggiari commented on HBASE-16765:
-

Oh! I never figured it was cube and not square. I always just looked at the 
comment... Interesting. 

Well, this patch is then totally required, to get the 2 aligned... 



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1423#comment-1423
 ] 

Lars Hofhansl commented on HBASE-16765:
---

SteppingSplitPolicy is the fix :)
I noticed the comment in IncreasingToUpperBoundRegionSplitPolicy did not 
reflect the code so I fixed it.

Could change IncreasingToUpperBoundRegionSplitPolicy itself of course.



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-05 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549910#comment-15549910
 ] 

Jean-Marc Spaggiari commented on HBASE-16765:
-

Is the last patch correct? What is SteppingSplitPolicy? And I see only 
modifications to the comments for IncreasingToUpperBoundRegionSplitPolicy. 

I think until we get HBASE-12451 in, this might still help to reduce the 
damage...



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549869#comment-15549869
 ] 

Lars Hofhansl commented on HBASE-16765:
---

Any comments on the simple patch? Not worth it?



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15547075#comment-15547075
 ] 

Lars Hofhansl commented on HBASE-16765:
---

And of course HBASE-12451 is far more elaborate.



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15547059#comment-15547059
 ] 

Lars Hofhansl commented on HBASE-16765:
---

Oh... Yeah. Missing the new file... Arrgghh :)



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546483#comment-15546483
 ] 

Lars Hofhansl commented on HBASE-16765:
---

Yeah... No tests, copyright, etc, etc. Was just getting a feeling for what 
people think.

Do we need to be more aggressive in preventing the splitting in large clusters? 
In a 1000 node cluster even small tables pretty quickly grow to 2000 regions.



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-04 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546455#comment-15546455
 ] 

Jean-Marc Spaggiari commented on HBASE-16765:
-

Big +1. When I go onsite to customers I usually recommend disabling this split 
policy and going with a size based policy... Because as you figured, it creates 
way too many regions.



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546470#comment-15546470
 ] 

stack commented on HBASE-16765:
---

Sure.

I think the patch is missing stuff though.



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546468#comment-15546468
 ] 

Lars Hofhansl commented on HBASE-16765:
---

This should be the default, I think.



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546464#comment-15546464
 ] 

Lars Hofhansl commented on HBASE-16765:
---

Note that the optimum we can achieve is 2 regions per table per server unless 
we revert to some scheme with a global view of the number of servers and the 
current number of regions.

But 2 is twice as good as 4! :)



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546417#comment-15546417
 ] 

Lars Hofhansl commented on HBASE-16765:
---

In comparison, the default IncreasingToUpperBoundRegionSplitPolicy would need 4 
regions per table and server to reach the maximum split size.
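
The figure of 4 can be checked with a small worked example. This is only a
sketch: it assumes the threshold is min(max file size, 2 x flush size x n^3)
for n regions of the table on the server (the cube mentioned elsewhere in this
thread), a 128MB flush size, and a 10GB max file size. The class and method
names are made up for illustration.

```java
final class CubicSplitThreshold {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    // Assumed settings for this example only.
    static final long FLUSH_SIZE = 128L * MB;
    static final long MAX_FILE_SIZE = 10L * GB;

    /** Size at which a region splits when n regions of the table are on this server. */
    static long splitThreshold(int regionsOnServer) {
        long initialSize = 2L * FLUSH_SIZE;
        long n = regionsOnServer;
        return Math.min(MAX_FILE_SIZE, initialSize * n * n * n);
    }

    public static void main(String[] args) {
        // Prints 256, 2048, 6912 and 10240 MB for 1 to 4 regions.
        for (int n = 1; n <= 4; n++) {
            System.out.println(n + " region(s): split at " + splitThreshold(n) / MB + " MB");
        }
    }
}
```

With these numbers the cap is first hit at 4 regions, since 256MB x 27 = 6912MB
is still below 10GB while 256MB x 64 = 16GB is above it.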



[jira] [Commented] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546402#comment-15546402
 ] 

Lars Hofhansl commented on HBASE-16765:
---

I think ideally we want the following axioms:
# quick splitting and spreading of regions while the table is small
# ideally not more than one region of a table per server (MAX_FILESIZE 
permitting of course)

#2 is where IncreasingToUpperBoundRegionSplitPolicy falls short.
I'd propose a step function instead: split at 2 x flush size when only one 
region of the table is seen, stop splitting (i.e. constant size split policy) 
when more than 1 region is seen.
This should be as close to ideal as is possible with local knowledge only, 
usually not leading to more than 2 regions per server (unless we need to split 
more due to MAX_FILESIZE).
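
A minimal sketch of the step function described above, written as a standalone
method rather than against the actual split policy classes. The class name,
method name and parameters are hypothetical; the 128MB and 10GB values in
main() are assumed example settings.

```java
final class SteppingSplitSketch {
    /**
     * Size at which a region should split, given how many regions of the same
     * table are seen on the local server.
     */
    static long splitThreshold(int regionsOnServer, long flushSize, long maxFileSize) {
        if (regionsOnServer == 1) {
            // Only one region seen locally: split early so the table spreads out.
            return 2L * flushSize;
        }
        // More than one region seen: behave like a constant size policy.
        return maxFileSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long flushSize = 128L * mb;      // assumed example value
        long maxFileSize = 10240L * mb;  // assumed example value (10GB)
        for (int n = 1; n <= 3; n++) {
            System.out.println(n + " region(s): split at "
                + splitThreshold(n, flushSize, maxFileSize) / mb + " MB");
        }
    }
}
```

Because the threshold jumps straight to the max file size once a second region
is seen, a server usually stops at 2 regions of the table unless a region
outgrows MAX_FILESIZE.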

[~stack]


