Hi Josh,
Thanks for the feedback. Do you have a concrete example where salted
tables are 'evil'? That said, I really like the idea of enabling salting
through some predefined variable (like the number of region servers or
something similar).
An example could be:

SALT_BUCKETS = $REGION_SERVERS_COUNT
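
As a rough sketch of what that substitution could look like if it were done
by a script on top of Phoenix, as James suggested: the function name, the
example table and the 256 cap below are my assumptions, and the region
server count is assumed to be obtained separately (e.g. from HBase) and
passed in.

```python
from string import Template

# Assumption: Phoenix accepts SALT_BUCKETS values up to 256; adjust if
# your version documents a different limit.
MAX_SALT_BUCKETS = 256


def render_ddl(ddl_template: str, region_servers: int) -> str:
    """Fill $REGION_SERVERS_COUNT in a DDL template (hypothetical helper)."""
    if region_servers < 1:
        raise ValueError("region_servers must be at least 1")
    # Cap the bucket count so the generated DDL stays within the assumed limit.
    buckets = min(region_servers, MAX_SALT_BUCKETS)
    return Template(ddl_template).substitute(REGION_SERVERS_COUNT=buckets)


# Hypothetical table, used only to illustrate the substitution.
EXAMPLE_DDL = (
    "CREATE TABLE example (id VARCHAR PRIMARY KEY, val VARCHAR) "
    "SALT_BUCKETS = $REGION_SERVERS_COUNT"
)

if __name__ == "__main__":
    print(render_ddl(EXAMPLE_DDL, 10))
```

This only fills in the number; it does not decide whether salting is a good
idea for a given table.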

Best,
Flavio

On 12 Dec 2017 01:45, "Josh Elser" <els...@apache.org> wrote:

I'm a little hesitant about this because of a few things I've noticed
across many installations:

* Salted tables are *not* always more efficient. In fact, I've found myself
advising against salted tables a bit more often than expected. Certain
kinds of queries require much more work with salting than without it.

* Considering salt buckets as a measure of parallelism for a table, it's
impossible for the system to correctly judge what the parallelism of the
cluster should be. For example, with 10 RS and 1 Phoenix table, you would
want to start with 10 salt buckets. However, with 10 RS and 100 Phoenix
tables, you'd *maybe* want 3 salt buckets. It's hard to make
system-wide decisions correctly without a global view of the entire system.

I think James was trying to capture some of this in his use of "relative
conservative default", but I'd take that even a bit further and say that I
consider it harmful for Phoenix to do that out of the box.

However, I would flip the question upside down instead: what kind of
suggestions can Phoenix make as a database to the user to _recommend_ to
them that they enable salting on a table given its schema and important
queries?


On 12/8/17 12:34 PM, James Taylor wrote:

> Hi Flavio,
> I like the idea of “adaptable configuration” where you specify a config
> value as a % of some cluster resource (with relatively conservative
> defaults). Salting is somewhat of a gray area though as it’s not config
> based, but driven by your DDL. One solution you could implement on top of
> Phoenix is scripting for DDL that fills in the salt bucket parameter based
> on cluster size.
> Thanks,
> James
>
> On Tue, Dec 5, 2017 at 12:50 AM Flavio Pompermaier <pomperma...@okkam.it
> <mailto:pomperma...@okkam.it>> wrote:
>
>     Hi to all,
>     as stated in the documentation[1], "for optimal performance,
>     number of salt buckets should match number of region servers".
>     So why not add an AUTO/DEFAULT option for salting that defaults
>     this parameter to the number of region servers?
>     Otherwise I have to manually connect to HBase, retrieve that number,
>     and pass it to Phoenix...
>     What do you think?
>
>     [1] https://phoenix.apache.org/performance.html#Salting
>
>     Best,
>     Flavio
>
>
