Re: Salting based on partial rowkeys

2018-11-05 Thread Gerald Sangudi
Hi folks,

I would like to continue the discussion below from a few weeks ago and
address the feedback from Thomas, Jaanai, and Sergey.

Having thought about this some more, I believe it amounts to introducing a
form of hash partitioning in Phoenix + HBase. For this to work in the
general case, region splits and merges would need to be disabled, and the
regions pre-defined. This is already supported by DisableRegionSplitPolicy.

I understand that HBase is range-partitioned. What we are proposing is to
let users with specialized use cases manage their own partitioning. All
other users would be unaffected.

Jaanai also mentioned that salting is meant to address write hotspots. That
is true. We are proposing an additional use of salting (or if you prefer,
another feature altogether) for specialized use cases. Again, all other
users would be unaffected.

We have a lot of data in HBase, and some of our queries would benefit from
some sort of hash partitioning. That is the crux of our proposal.

Some specific responses:

@Josh -- exactly.

@Thomas -- if we can partition the data exactly how we want, we can make
sure that certain queries do not go across regions. We can scan only the
matching regions, and we can perform the full aggregation within each
matching region, without having to do a merge on the client.

@Jaanai, @Sergey -- I hope the hotspotting concern is addressed above. We
would also be fine calling this something other than salting; that may be
better to avoid confusion.

@Lars -- you posted questions in PHOENIX-4757. I will respond there.

Thanks, everyone, for all the feedback on this. Our goal is to discuss all
the concerns and then get a yea-or-nay consensus from the committers.

Gerald

On Sun, Sep 16, 2018 at 9:52 PM la...@apache.org  wrote:

>  I added some comments on PHOENIX-4757.
>
> On Thursday, September 13, 2018, 6:42:12 PM PDT, Josh Elser <
> els...@apache.org> wrote:
>
>  Ahh, I get you now.
>
> For a composite primary key made up of columns 1 through N, you want
> similar controls to compute the value of the salt based on a sequence of
> the columns 1 through M where M <= N (instead of always on all columns).
>
> For large numbers of salt buckets and a scan over a facet, you prune
> your search space considerably. Makes sense to me!
>
> On 9/13/18 6:37 PM, Gerald Sangudi wrote:
> > In case the text formatting is lost below, I also added it as a comment
> > in the JIRA ticket:
> >
> > https://issues.apache.org/jira/browse/PHOENIX-4757
> >
> > [quoted proposal snipped; the full text appears below in this thread]

Re: Salting based on partial rowkeys

2018-09-16 Thread la...@apache.org
I added some comments on PHOENIX-4757.

On Thursday, September 13, 2018, 6:42:12 PM PDT, Josh Elser 
 wrote:  
 
 Ahh, I get you now.

For a composite primary key made up of columns 1 through N, you want 
similar controls to compute the value of the salt based on a sequence of 
the columns 1 through M where M <= N (instead of always on all columns).

For large numbers of salt buckets and a scan over a facet, you prune 
your search space considerably. Makes sense to me!

On 9/13/18 6:37 PM, Gerald Sangudi wrote:
> In case the text formatting is lost below, I also added it as a comment in
> the JIRA ticket:
>
> https://issues.apache.org/jira/browse/PHOENIX-4757
>
> [quoted proposal snipped; the full text appears below in this thread]

Re: Salting based on partial rowkeys

2018-09-13 Thread Josh Elser

Ahh, I get you now.

For a composite primary key made up of columns 1 through N, you want 
similar controls to compute the value of the salt based on a sequence of 
the columns 1 through M where M <= N (instead of always on all columns).


For large numbers of salt buckets and a scan over a facet, you prune 
your search space considerably. Makes sense to me!
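
As a rough illustration of that idea (toy code, not Phoenix internals; the
helper name and prefixLength parameter are made up), hashing only the
leading prefix of the composite key keeps all rows that share columns 1
through M in the same salt bucket:

    import java.util.Arrays;

    public class PrefixSaltExample {
        // Toy sketch: prefixLength is assumed to be the encoded length of
        // the leading M rowkey columns. Hashing only that prefix (instead
        // of the full key) means rows sharing those columns always map to
        // the same bucket.
        static byte saltByte(byte[] rowKeyWithoutSalt, int prefixLength, int buckets) {
            byte[] prefix = Arrays.copyOfRange(rowKeyWithoutSalt, 0, prefixLength);
            return (byte) Math.floorMod(Arrays.hashCode(prefix), buckets);
        }
    }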


On 9/13/18 6:37 PM, Gerald Sangudi wrote:

In case the text formatting is lost below, I also added it as a comment in
the JIRA ticket:

https://issues.apache.org/jira/browse/PHOENIX-4757

[quoted proposal snipped; the full text appears below in this thread]
Re: Salting based on partial rowkeys

2018-09-13 Thread Gerald Sangudi
In case the text formatting is lost below, I also added it as a comment in
the JIRA ticket:

https://issues.apache.org/jira/browse/PHOENIX-4757


On Thu, Sep 13, 2018 at 3:24 PM, Gerald Sangudi 
wrote:

> Sorry I missed Josh's reply; I've subscribed to the dev list now.
>
> [quoted proposal snipped; the full text appears below in this thread]

Re: Salting based on partial rowkeys

2018-09-13 Thread Gerald Sangudi
Sorry I missed Josh's reply; I've subscribed to the dev list now.

Below is a copy-and-paste from our internal document. Thanks in advance for
your review and additional feedback on this.

Gerald

Background

We make extensive use of multi-column rowkeys and salting in our different
Apache Phoenix deployments. We frequently perform group-by aggregations on
these data along a specific dimension that would benefit from predictably
partitioning the data along that dimension.

Proposal

We propose to add table metadata to allow schema designers to constrain
salting to a subset of the rowkey, rather than the full rowkey as it is
today. This will introduce a mechanism to partition data on a per-table
basis along a single dimension without application changes or much change
to the Phoenix runtime logic. We expect this will result in substantially
faster group-bys along the salted dimension and negligible penalties
elsewhere. This feature has also been proposed in PHOENIX-4757
(https://issues.apache.org/jira/browse/PHOENIX-4757), where it was pointed
out that partitioning and sorting data along different dimensions is a
common pattern in other datastores as well. Theoretically, it could cause
hotspotting when querying along the salted dimension without the leading
rowkey -- that would be an anti-pattern.

Usage Example

Current:

Schema:

CREATE TABLE relationship (
    id_1 BIGINT NOT NULL,
    id_2 BIGINT NOT NULL,
    other_key BIGINT NOT NULL,
    val SMALLINT,
    CONSTRAINT pk PRIMARY KEY (id_1, id_2, other_key))
SALT_BUCKETS=60;

Query:

Select id_2, sum(val)
From relationship
Where id_1 in (2,3)
Group by id_2

Explain:

0: jdbc:phoenix:> EXPLAIN Select id_2, sum(val) From relationship Where id_1 in (2,3) Group by id_2;
+------------------------------------------------------------------------------------------+---------+
|                                            PLAN                                           | EST_BY  |
+------------------------------------------------------------------------------------------+---------+
| CLIENT 60-CHUNK PARALLEL 60-WAY SKIP SCAN ON 120 KEYS OVER RELATIONSHIP [0,2] - [59,3]     | null    |
|     SERVER AGGREGATE INTO DISTINCT ROWS BY [ID_2]                                          | null    |
| CLIENT MERGE SORT                                                                          | null    |
+------------------------------------------------------------------------------------------+---------+
3 rows selected (0.048 seconds)

In this case, although the group by is performed on both the client and the
regionservers, almost all of the actual grouping happens on the client
because the id_2's are randomly distributed across the regionservers. As a
result, a lot of unnecessary data is serialized to the client and grouped
serially there. This can become quite material with large resultsets.

Proposed:

Schema:

CREATE TABLE relationship (
    id_1 BIGINT NOT NULL,
    id_2 BIGINT NOT NULL,
    other_key BIGINT NOT NULL,
    val SMALLINT,
    CONSTRAINT pk PRIMARY KEY (id_1, id_2, other_key))
SALT_BUCKETS=60, SALT_COLUMN = id_2;

Query (unchanged):

Select id_2, sum(val)
From relationship
Where id_1 in (2,3)
Group by id_2

Explain (unchanged)

Under the proposal, the data are partitioned so that all rows containing
the same id_2 are on the same regionserver, so the above query will perform
almost all of the grouping in parallel on the regionservers. No special
hint or changes to the query plan would be required to benefit. Tables
would need to be re-salted to take advantage of the new functionality.

Technical changes proposed to Phoenix:

- Create a new piece of table-level metadata: SALT_COLUMN. SALT_COLUMN will
  instruct the salting logic to generate a salt byte based only on the
  specified column. If unspecified, it will behave as it does today and
  default to salting the entire rowkey. This metadata may be specified only
  when the table is created and may not be modified. The specified column
  must be part of the rowkey.
- Modify all callers of getSaltingByte(byte[] value, int offset, int
  length, int bucketNum) to consistently leverage the new metadata.
- Tests
- Docs
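
As a minimal sketch of that change (not a patch; the package path is from
memory, and the way the SALT_COLUMN's offset and length within the rowkey
are obtained is an assumption), the salt byte would be computed from only
the SALT_COLUMN's bytes by reusing the existing getSaltingByte:

    import org.apache.phoenix.schema.SaltingUtil;

    public class SaltColumnSketch {
        // Sketch only. columnOffset/columnLength are assumed to be the
        // position and encoded length of the SALT_COLUMN (e.g. ID_2) inside
        // the unsalted rowkey; today callers pass offset 0 and the full key
        // length, i.e. the whole rowkey is hashed.
        static byte saltByteForColumn(byte[] unsaltedKey, int columnOffset,
                                      int columnLength, int bucketNum) {
            return SaltingUtil.getSaltingByte(unsaltedKey, columnOffset,
                                              columnLength, bucketNum);
        }
    }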
Design points:

One salt column vs multiple salt columns: Based on the existing signature
for getSaltingByte, it seems simpler to only support a single SALT_COLUMN
rather than multiple arbitrary SALT_COLUMNS. Known use-cases are completely
supported by a single column.

Syntax: PHOENIX-4757 suggests an alternate, less verbose syntax for
defining the salt bucket. The SALT_COLUMN syntax is suggested for clarity
and consistency with other Phoenix table options.

Future Enhancements (not in scope)

Different aspects of the query execution runtime could take advantage of
new metadata and implied knowledge that the data are partitioned in a
predictable manner. For example:
- It could be that client side grouping is completely unnecessary in cases
  where the 

Re: Salting based on partial rowkeys

2018-09-13 Thread Thomas D'Silva
Gerald,

I think you missed Josh's reply here :
https://lists.apache.org/thread.html/c5145461805429622a410c23c1199d578e146a5c94511b2d5833438b@%3Cdev.phoenix.apache.org%3E

Could you explain how using a subset of the pk columns to generate the salt
byte helps with partitioning, aggregations etc?

Thanks,
Thomas

On Thu, Sep 13, 2018 at 8:32 AM, Gerald Sangudi 
wrote:

> Hi folks,
>
> Any thoughts or feedback on this?
>
> Thanks,
> Gerald
>
> On Mon, Sep 10, 2018 at 1:56 PM, Gerald Sangudi wrote:
>
> > [original 2018-09-10 message snipped; quoted in full in Josh Elser's reply below]


Re: Salting based on partial rowkeys

2018-09-13 Thread Gerald Sangudi
Hi folks,

Any thoughts or feedback on this?

Thanks,
Gerald

On Mon, Sep 10, 2018 at 1:56 PM, Gerald Sangudi 
wrote:

> [original 2018-09-10 message snipped; quoted in full in Josh Elser's reply below]


Re: Salting based on partial rowkeys

2018-09-10 Thread Josh Elser

Hey Gerald,

Trimming back to just dev@phoenix, but I am curious to hear some more 
about what you and Mike are thinking.


Some initial questions:

* What are the problem(s) that you see today with the current 
implementation of SALT_BUCKETS?
* How would your new feature/proposal work?
* How would your new feature solve your current problem?
* What are the drawbacks (if any) of your new feature?

I've definitely seen a problem where folks negatively impact their reads 
by "over-salting" because they were too lazy when writing data (either 
to think about a good distribution or to write some code to ingest their 
data).


Thanks in advance!

- Josh

On 9/10/18 4:56 PM, Gerald Sangudi wrote:

Hello folks,

We have a requirement for salting based on partial, rather than full, 
rowkeys. My colleague Mike Polcari has identified the requirement and 
proposed an approach.


I found an already-open JIRA ticket for the same issue: 
https://issues.apache.org/jira/browse/PHOENIX-4757. I can provide more 
details from the proposal.


The JIRA proposes a syntax of SALT_BUCKETS(col, ...) = N, whereas Mike 
proposes SALT_COLUMN=col or SALT_COLUMNS=col, ... .


The benefit at issue is that users gain more control over partitioning, 
and this can be used to push some additional aggregations and hash joins 
down to region servers.


I would appreciate any go-ahead / thoughts / guidance / objections / 
feedback. I'd like to be sure that the concept at least is not 
objectionable. We would like to work on this and submit a patch down the 
road. I'll also add a note to the JIRA ticket.


Thanks,
Gerald