[jira] [Comment Edited] (PHOENIX-153) Implement TABLESAMPLE clause

Ethan Wang (JIRA) Sun, 25 Jun 2017 13:45:14 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16044823#comment-16044823
 ]


Ethan Wang edited comment on PHOENIX-153 at 6/25/17 8:44 PM:
-------------------------------------------------------------

Spec of this patch. Feedback plz.

Updated Jun25. Syntax to be similar to Calcite and PostgresSQL

++
++Belows are SUPPORTED
++
===BASE CASE====
select * from Person;
select * from PERSON TABLESAMPLE(45);
select * from PERSON TABLESAMPLE (45);

===WHERE CLAUSE====
select * from PERSON where ADDRESS = 'CA' OR name>'tina3';
select * from PERSON TABLESAMPLE (49) where ADDRESS = 'CA' OR name>'tina3';
select * from PERSON TABLESAMPLE (49) where ADDRESS = 'CA' OR name>'tina3' 
ORDER BY ADDRESS LIMIT 2;

===Wired Table===
select * from LOCAL_ADDRESS TABLESAMPLE (79);
select * from SYSTEM.STATS TABLESAMPLE (41);

===CORNER CASE===
select * from PERSON TABLESAMPLE (0);
select * from PERSON TABLESAMPLE (100.0);
select * from PERSON TABLESAMPLE (145.99);  (out of boundary exception)
select * from PERSON TABLESAMPLE ko; (syntax error exception)

===AGGREGATION===
select count(*) from PERSON TABLESAMPLE (49) LIMIT 2
select count(*) from (select NAME from PERSON TABLESAMPLE (49) limit 20)

===SUB QUERY===
select * from (select /*+NO_INDEX*/ * from PERSON tablesample (10) where Name > 
'tina10') where ADDRESS = 'CA'

===JOINS===
select * from PERSON1, PERSON2 tablesample (70) where PERSON1.Name = 
PERSON2.NAME
select /*+NO_INDEX*/ count(*) from PERSON tableSample (19), US_POPULATION 
tableSample (28) where PERSON.Name > US_POPULATION.STATE

===QUERY being OPTMIZED===
select * from PERSON tablesample (80) (may go to IDX_ADDRESS_PERSON index 
table, but table sampling rate carry on)

===INSERT SELECT====
upsert into personbig(ID, ADDRESS) select id, address from personbig 
tablesample (1);


was (Author: aertoria):
Spec of this patch. Feedback plz.
++
++Belows are SUPPORTED
++
===BASE CASE====
select * from Person;
select * from PERSON TABLESAMPLE 0.45;

===WHERE CLAUSE====
select * from PERSON where ADDRESS = 'CA' OR name>'tina3';
select * from PERSON TABLESAMPLE 0.49 where ADDRESS = 'CA' OR name>'tina3';
select * from PERSON TABLESAMPLE 0.49 where ADDRESS = 'CA' OR name>'tina3' 
LIMIT 1;

===Wired Table===
select * from LOCAL_ADDRESS TABLESAMPLE 0.79;
select * from SYSTEM.STATS TABLESAMPLE 0.41;

===CORNER CASE===
select * from PERSON TABLESAMPLE 0;
select * from PERSON TABLESAMPLE 1.45;
select * from PERSON TABLESAMPLE kko;

===AGGREGATION===
select count( * ) from PERSON TABLESAMPLE 0.5 LIMIT 2
select count( * ) from (select NAME from PERSON limit 20)

===SUB QUERY===
select * from (select /*+NO_INDEX*/ * from PERSON tablesample 0.1 where Name > 
'tina10') where ADDRESS = 'CA'

===JOINS===
select * from PERSON1, PERSON2 tablesample 0.7 where PERSON1.Name = PERSON2.NAME

===QUERY being OPTMIZED===
select * from PERSON tablesample 0.8 (goes to IDX_ADDRESS_PERSON index table, 
table sample carry on)

===INSERT SELECT====
upsert into personbig(ID, ADDRESS) select id, address from personbig 
tablesample 0.01;

> Implement TABLESAMPLE clause
> ----------------------------
>
>                 Key: PHOENIX-153
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-153
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: James Taylor
>            Assignee: Ethan Wang
>              Labels: enhancement
>
> Support the standard SQL TABLESAMPLE clause by implementing a filter that 
> uses a skip next hint based on the region boundaries of the table to only 
> return n rows per region.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Comment Edited] (PHOENIX-153) Implement TABLESAMPLE clause

Reply via email to