Re: Salting based on partial rowkeys

Josh Elser Mon, 10 Sep 2018 15:25:54 -0700

Hey Gerald,

Trimming back to just dev@phoenix, but I am curious to hear some more about what you and Mike are thinking.


Some initial questions:

* What are the problem(s) that you see today with the current implementation of SALT_BUCKETS?

* How would your new feature/proposal work?
* How would your new feature solve your current problem?
* What are the drawbacks (if any) of your new feature?

I've definitely seen a problem where folks negatively impact their reads by "over-salting" because they were too lazy when writing data (either to think about a good distribution or to write some code to ingest their data).


Thanks in advance!

- Josh

On 9/10/18 4:56 PM, Gerald Sangudi wrote:

Hello folks,
We have a requirement for salting based on partial, rather than full, rowkeys. My colleague Mike Polcari has identified the requirement and proposed an approach.
I found an already-open JIRA ticket for the same issue: https://issues.apache.org/jira/browse/PHOENIX-4757. I can provide more details from the proposal.
The JIRA proposes a syntax of SALT_BUCKETS(col, ...) = N, whereas Mike proposes SALT_COLUMN=col or SALT_COLUMNS=col, ... .
The benefit at issue is that users gain more control over partitioning, and this can be used to push some additional aggregations and hash joins down to region servers.
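To illustrate the idea, here is a minimal sketch, not Phoenix's actual implementation. The function names, the CRC32 hash, and the sample rows are all assumptions made up for illustration; Phoenix computes its salt byte with its own hash. The sketch only shows that hashing a subset of the rowkey columns places every row sharing those column values in the same bucket, which is what would let a per-bucket aggregation or hash join on those columns run locally.

```python
import zlib


def salt_bucket(key_parts, num_buckets):
    """Hypothetical salt: hash the given key parts and map them to a bucket."""
    data = "\x00".join(key_parts).encode("utf-8")
    return zlib.crc32(data) % num_buckets


def full_key_salt(row, num_buckets):
    """Salt over every rowkey column (the current SALT_BUCKETS idea)."""
    return salt_bucket(list(row), num_buckets)


def partial_key_salt(row, salt_columns, num_buckets):
    """Salt over only the chosen rowkey column positions (the proposal)."""
    return salt_bucket([row[i] for i in salt_columns], num_buckets)


# Hypothetical rowkeys of the form (tenant, day).
rows = [
    ("tenant_a", "2018-09-01"),
    ("tenant_a", "2018-09-02"),
    ("tenant_b", "2018-09-01"),
]
NUM_BUCKETS = 8

full = {row: full_key_salt(row, NUM_BUCKETS) for row in rows}
partial = {row: partial_key_salt(row, [0], NUM_BUCKETS) for row in rows}

for row in rows:
    print(row, "full-key bucket:", full[row], "partial-key bucket:", partial[row])
```

With salting on column 0 only, both tenant_a rows always land in the same bucket, whereas full-key salting gives no such guarantee.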
I would appreciate any go-ahead / thoughts / guidance / objections / feedback. I'd like to be sure that the concept at least is not objectionable. We would like to work on this and submit a patch down the road. I'll also add a note to the JIRA ticket.
Thanks,
Gerald
