[jira] [Commented] (PHOENIX-4757) composite key salt_buckets

Gerald Sangudi (JIRA) Fri, 16 Nov 2018 12:07:43 -0800


    [ 
https://issues.apache.org/jira/browse/PHOENIX-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16689953#comment-16689953
 ]


Gerald Sangudi commented on PHOENIX-4757:
-----------------------------------------

[~jamestaylor] - re your questions--

 
{noformat}
If the table is salted, then for each salt bucket, the above is done. For 
example, with SALT_BUCKET=16, there will be 16 parallel scans all looking for 
data with the second byte being 2 or 3. This sets an upper bound on the size of 
the merge sort.
How would this be handled with this proposal? Would every region need to be 
scanned and merge sorted? In general, how would this impact scans for id_1? 
What's the difference between your proposal and what Phoenix would do if the PK 
was (id_2, id_1)?
{noformat}
 

For this specific query, our proposal using PK (id_1, id_2) with 
SALT_COLUMNS=(id_2) would behave similarly to how Phoenix behaves with PK 
(id_2, id_1). This would be a win for us, because it allows us to support both 
those queries that prefer PK (id_1, id_2) and queries like the GROUP BY above, 
without additional tables or indexes.

Furthermore, with our proposal, after the scans and aggregations within each 
region, only regions with overlapping values of id_2 would need to be merged. 
The partial salting allows us to identify regions with overlapping values of 
id_2 in some cases.

 
{noformat}
Would another, simpler alternative given a PK of (id_2, id_1) and a filter of 
id_1 be to improve the filtering by using a skip scan? Given the above query, 
Phoenix would aggregate in-place since you're grouping by the leading PK 
column. What it's not doing automatically is skipping to id_1 of 2 or 3 during 
the aggregation. It might just work if you add a /*+ SKIP_SCAN */ hint to the 
above query. If the cardinality of id_2 is low then you can get better 
performance by using this hint for queries of specific id_1 values. 
{noformat}
This would prevent us from using PK (id_1, id_2), which we also need.

 
{noformat}
Another high level idea: If the goal is to improve the scalability of GROUP BY 
(and in particular any post-processing that occurs which today happens o the 
client-side), then it'd be a good idea to introduce a shuffle step in the 
Phoenix query plan. In this case, instead of attempting to perform the GROUP BY 
in place, Phoenix would write the intermediate results of the query to a temp 
table which is ordered by the GROUP BY key. In that way, you could continue to 
get distributed processing at the expense of more writing. A shuffle step would 
essentially be implemented by performing an intermediate, distributed UPSERT 
SELECT.
{noformat}
We are trying to reduce the amount of processing. I hope this comment and the 
previous one show how processing can be reduced. Please lmk your thoughts. 
Thanks.

> composite key salt_buckets
> --------------------------
>
>                 Key: PHOENIX-4757
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4757
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.11.0
>            Reporter: cmd
>            Priority: Major
>             Fix For: 4.11.0
>
>
> CREATE TABLE IF NOT EXISTS user_events (
>  user_id VARCHAR NOT NULL,
>  event_type VARCHAR NOT NULL,
>  event_time VARCHAR NOT NULL
>  event_msg VARCHAR NOT NULL
>  event_status VARCHAR NOT NULL
>  event_opt VARCHAR NOT NULL
>  CONSTRAINT my_pk PRIMARY KEY (user_id,event_type,event_time)) 
> SALT_BUCKETS=128;
> and my query is:
>  1.select event_type,count(0) from us_population where user_id='xxxx' group 
> by event_type
>  2.select count(0) from us_population where user_id='xxxx' and 
> event_type='0101'
>  3.select * from us_population where user_id='xxxx' and event_type='0101' and 
> event_time>'20180101' and event_time<'20180201' order by event_time limit 
> 50,100
> Concurrency query ratio:
>  1:80%
>  2:10%
>  3:10% 
>  user_events data :50billion
>  It can be a field/some fileds of the primary key salted by hash
>  grammar with "SALT_BUCKETS(user_id)=4" or 
> "SALT_BUCKETS(user_id,event_type)=4"
> ref:
>  
> [https://www.safaribooksonline.com/library/view/greenplum-architecture/9781940540337/xhtml/chapter03.xhtml]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (PHOENIX-4757) composite key salt_buckets

Reply via email to