[jira] [Updated] (DRILL-6574) Add option to push LIMIT(0) on top of SCAN (late limit 0 optimization)

Bohdan Kazydub (JIRA) Wed, 18 Jul 2018 05:10:50 -0700


     [ 
https://issues.apache.org/jira/browse/DRILL-6574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Bohdan Kazydub updated DRILL-6574:
----------------------------------
    Description: 
Currently we have early limit 0 optimization 
(planner.enable_limit0_optimization) which determines query data types before 
actual scan. Since we not always able to determine data type during planning, 
we need to add one more option to enable late limit 0 optimization 
(planner.enable_limit0_on_scan, exit query right after scan. LIMIT(0) on SCAN 
for UNION and complex functions will be disabled i.e. UNION and complex 
functions need data to produce result schema. This would not work for the 
following list of functions: KVGEN, MAPPIFY, FLATTEN, CONVERT_FROMJSON, 
CONVERT_TOJSON, CONVERT_TOSIMPLEJSON, CONVERT_TOEXTENDEDJSON.

Query plan examples:

For query
{code:java}
SELECT * FROM (
  SELECT l.l_quantity, l.l_shipdate, o.o_custkey 
  FROM cp.`tpch/lineitem.parquet` l 
  JOIN cp.`tpch/orders.parquet` o ON l.l_orderkey = o.o_orderkey 
  LIMIT 2) 
LIMIT 0
{code}
{color:#6a8759}plan after changes looks like{color}
{noformat}
00-00    Screen : rowType = RecordType(ANY l_quantity, ANY l_shipdate, ANY 
o_custkey): rowcount = 1.0, cumulative cost = \{75183.1 rows, 210559.1 cpu, 0.0 
io, 0.0 network, 17.6 memory}, id = 527
00-01      Project(l_quantity=[$1], l_shipdate=[$2], o_custkey=[$4]) : rowType 
= RecordType(ANY l_quantity, ANY l_shipdate, ANY o_custkey): rowcount = 1.0, 
cumulative cost = \{75183.0 rows, 210559.0 cpu, 0.0 io, 0.0 network, 17.6 
memory}, id = 526
00-02        SelectionVectorRemover : rowType = RecordType(ANY l_orderkey, ANY 
l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 1.0, 
cumulative cost = \{75182.0 rows, 210556.0 cpu, 0.0 io, 0.0 network, 17.6 
memory}, id = 525
00-03          Limit(fetch=[0]) : rowType = RecordType(ANY l_orderkey, ANY 
l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 1.0, 
cumulative cost = \{75181.0 rows, 210555.0 cpu, 0.0 io, 0.0 network, 17.6 
memory}, id = 524
00-04            Limit(fetch=[2]) : rowType = RecordType(ANY l_orderkey, ANY 
l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 2.0, 
cumulative cost = \{75181.0 rows, 210555.0 cpu, 0.0 io, 0.0 network, 17.6 
memory}, id = 523
00-05              HashJoin(condition=[=($0, $3)], joinType=[inner]) : rowType 
= RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate, ANY o_orderkey, 
ANY o_custkey): rowcount = 1.0, cumulative cost = \{75179.0 rows, 210547.0 cpu, 
0.0 io, 0.0 network, 17.6 memory}, id = 522
00-07                SelectionVectorRemover : rowType = RecordType(ANY 
l_orderkey, ANY l_quantity, ANY l_shipdate): rowcount = 1.0, cumulative cost = 
\{60176.0 rows, 180526.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 518
00-09                  Limit(offset=[0], fetch=[0]) : rowType = RecordType(ANY 
l_orderkey, ANY l_quantity, ANY l_shipdate): rowcount = 1.0, cumulative cost = 
\{60175.0 rows, 180525.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 517
00-11                    Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/lineitem.parquet]], 
selectionRoot=classpath:/tpch/lineitem.parquet, numFiles=1, numRowGroups=1, 
usedMetadataFile=false, columns=[`l_orderkey`, `l_quantity`, `l_shipdate`]]]) : 
rowType = RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate): rowcount 
= 60175.0, cumulative cost = \{60175.0 rows, 180525.0 cpu, 0.0 io, 0.0 network, 
0.0 memory}, id = 516
00-06                SelectionVectorRemover : rowType = RecordType(ANY 
o_orderkey, ANY o_custkey): rowcount = 1.0, cumulative cost = \{15001.0 rows, 
30001.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 521
00-08                  Limit(offset=[0], fetch=[0]) : rowType = RecordType(ANY 
o_orderkey, ANY o_custkey): rowcount = 1.0, cumulative cost = \{15000.0 rows, 
30000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 520
00-10                    Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], 
selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, 
usedMetadataFile=false, columns=[`o_orderkey`, `o_custkey`]]]) : rowType = 
RecordType(ANY o_orderkey, ANY o_custkey): rowcount = 15000.0, cumulative cost 
= \{15000.0 rows, 30000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 519

{noformat}
and before changes:
{noformat}
00-00    Screen : rowType = RecordType(ANY l_quantity, ANY l_shipdate, ANY 
o_custkey): rowcount = 1.0, cumulative cost = \{150354.1 rows, 1052637.1 cpu, 
0.0 io, 0.0 network, 264000.0 memory}, id = 452
00-01      Project(l_quantity=[$1], l_shipdate=[$2], o_custkey=[$4]) : rowType 
= RecordType(ANY l_quantity, ANY l_shipdate, ANY o_custkey): rowcount = 1.0, 
cumulative cost = \{150354.0 rows, 1052637.0 cpu, 0.0 io, 0.0 network, 264000.0 
memory}, id = 451
00-02        SelectionVectorRemover : rowType = RecordType(ANY l_orderkey, ANY 
l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 1.0, 
cumulative cost = \{150353.0 rows, 1052634.0 cpu, 0.0 io, 0.0 network, 264000.0 
memory}, id = 450
00-03          Limit(fetch=[0]) : rowType = RecordType(ANY l_orderkey, ANY 
l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 1.0, 
cumulative cost = \{150352.0 rows, 1052633.0 cpu, 0.0 io, 0.0 network, 264000.0 
memory}, id = 449
00-04            Limit(fetch=[2]) : rowType = RecordType(ANY l_orderkey, ANY 
l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 2.0, 
cumulative cost = \{150352.0 rows, 1052633.0 cpu, 0.0 io, 0.0 network, 264000.0 
memory}, id = 448
00-05              HashJoin(condition=[=($0, $3)], joinType=[inner]) : rowType 
= RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate, ANY o_orderkey, 
ANY o_custkey): rowcount = 60175.0, cumulative cost = \{150350.0 rows, 
1052625.0 cpu, 0.0 io, 0.0 network, 264000.0 memory}, id = 447
00-07                Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/lineitem.parquet]], 
selectionRoot=classpath:/tpch/lineitem.parquet, numFiles=1, numRowGroups=1, 
usedMetadataFile=false, columns=[`l_orderkey`, `l_quantity`, `l_shipdate`]]]) : 
rowType = RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate): rowcount 
= 60175.0, cumulative cost = \{60175.0 rows, 180525.0 cpu, 0.0 io, 0.0 network, 
0.0 memory}, id = 445
00-06                Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], 
selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, 
usedMetadataFile=false, columns=[`o_orderkey`, `o_custkey`]]]) : rowType = 
RecordType(ANY o_orderkey, ANY o_custkey): rowcount = 15000.0, cumulative cost 
= \{15000.0 rows, 30000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 446

{noformat}
Also both early and late limit 0 optimizations will be enabled by default.

  was:
Currently we have early limit 0 optimization 
(planner.enable_limit0_optimization) which determines query data types before 
actual scan. Since we not always able to determine data type during planning, 
we need to add one more option to enable late limit 0 optimization 
(planner.enable_limit0_on_scan, exit query right after scan. LIMIT(0) on SCAN 
for UNION and complex functions will be disabled i.e. UNION and complex 
functions need data to produce result schema. This would not work for the 
following list of functions: KVGEN, MAPPIFY, FLATTEN, CONVERT_FROMJSON, 
CONVERT_TOJSON, CONVERT_TOSIMPLEJSON, CONVERT_TOEXTENDEDJSON.

Query plan examples:

For query
{code:java}
SELECT * FROM (
  SELECT l.l_quantity, l.l_shipdate, o.o_custkey 
  FROM cp.`tpch/lineitem.parquet` l 
  JOIN cp.`tpch/orders.parquet` o ON l.l_orderkey = o.o_orderkey 
  LIMIT 2) 
LIMIT 0
{code}
{color:#6a8759}{color:#333333}plan after changes looks like{color}
{color}

 
{noformat}
00-00    Screen : rowType = RecordType(ANY l_quantity, ANY l_shipdate, ANY 
o_custkey): rowcount = 1.0, cumulative cost = \{75183.1 rows, 210559.1 cpu, 0.0 
io, 0.0 network, 17.6 memory}, id = 527
00-01      Project(l_quantity=[$1], l_shipdate=[$2], o_custkey=[$4]) : rowType 
= RecordType(ANY l_quantity, ANY l_shipdate, ANY o_custkey): rowcount = 1.0, 
cumulative cost = \{75183.0 rows, 210559.0 cpu, 0.0 io, 0.0 network, 17.6 
memory}, id = 526
00-02        SelectionVectorRemover : rowType = RecordType(ANY l_orderkey, ANY 
l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 1.0, 
cumulative cost = \{75182.0 rows, 210556.0 cpu, 0.0 io, 0.0 network, 17.6 
memory}, id = 525
00-03          Limit(fetch=[0]) : rowType = RecordType(ANY l_orderkey, ANY 
l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 1.0, 
cumulative cost = \{75181.0 rows, 210555.0 cpu, 0.0 io, 0.0 network, 17.6 
memory}, id = 524
00-04            Limit(fetch=[2]) : rowType = RecordType(ANY l_orderkey, ANY 
l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 2.0, 
cumulative cost = \{75181.0 rows, 210555.0 cpu, 0.0 io, 0.0 network, 17.6 
memory}, id = 523
00-05              HashJoin(condition=[=($0, $3)], joinType=[inner]) : rowType 
= RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate, ANY o_orderkey, 
ANY o_custkey): rowcount = 1.0, cumulative cost = \{75179.0 rows, 210547.0 cpu, 
0.0 io, 0.0 network, 17.6 memory}, id = 522
00-07                SelectionVectorRemover : rowType = RecordType(ANY 
l_orderkey, ANY l_quantity, ANY l_shipdate): rowcount = 1.0, cumulative cost = 
\{60176.0 rows, 180526.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 518
00-09                  Limit(offset=[0], fetch=[0]) : rowType = RecordType(ANY 
l_orderkey, ANY l_quantity, ANY l_shipdate): rowcount = 1.0, cumulative cost = 
\{60175.0 rows, 180525.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 517
00-11                    Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/lineitem.parquet]], 
selectionRoot=classpath:/tpch/lineitem.parquet, numFiles=1, numRowGroups=1, 
usedMetadataFile=false, columns=[`l_orderkey`, `l_quantity`, `l_shipdate`]]]) : 
rowType = RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate): rowcount 
= 60175.0, cumulative cost = \{60175.0 rows, 180525.0 cpu, 0.0 io, 0.0 network, 
0.0 memory}, id = 516
00-06                SelectionVectorRemover : rowType = RecordType(ANY 
o_orderkey, ANY o_custkey): rowcount = 1.0, cumulative cost = \{15001.0 rows, 
30001.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 521
00-08                  Limit(offset=[0], fetch=[0]) : rowType = RecordType(ANY 
o_orderkey, ANY o_custkey): rowcount = 1.0, cumulative cost = \{15000.0 rows, 
30000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 520
00-10                    Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], 
selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, 
usedMetadataFile=false, columns=[`o_orderkey`, `o_custkey`]]]) : rowType = 
RecordType(ANY o_orderkey, ANY o_custkey): rowcount = 15000.0, cumulative cost 
= \{15000.0 rows, 30000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 519

{noformat}
and before changes:

{noformat}

00-00    Screen : rowType = RecordType(ANY l_quantity, ANY l_shipdate, ANY 
o_custkey): rowcount = 1.0, cumulative cost = \{150354.1 rows, 1052637.1 cpu, 
0.0 io, 0.0 network, 264000.0 memory}, id = 452
00-01      Project(l_quantity=[$1], l_shipdate=[$2], o_custkey=[$4]) : rowType 
= RecordType(ANY l_quantity, ANY l_shipdate, ANY o_custkey): rowcount = 1.0, 
cumulative cost = \{150354.0 rows, 1052637.0 cpu, 0.0 io, 0.0 network, 264000.0 
memory}, id = 451
00-02        SelectionVectorRemover : rowType = RecordType(ANY l_orderkey, ANY 
l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 1.0, 
cumulative cost = \{150353.0 rows, 1052634.0 cpu, 0.0 io, 0.0 network, 264000.0 
memory}, id = 450
00-03          Limit(fetch=[0]) : rowType = RecordType(ANY l_orderkey, ANY 
l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 1.0, 
cumulative cost = \{150352.0 rows, 1052633.0 cpu, 0.0 io, 0.0 network, 264000.0 
memory}, id = 449
00-04            Limit(fetch=[2]) : rowType = RecordType(ANY l_orderkey, ANY 
l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 2.0, 
cumulative cost = \{150352.0 rows, 1052633.0 cpu, 0.0 io, 0.0 network, 264000.0 
memory}, id = 448
00-05              HashJoin(condition=[=($0, $3)], joinType=[inner]) : rowType 
= RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate, ANY o_orderkey, 
ANY o_custkey): rowcount = 60175.0, cumulative cost = \{150350.0 rows, 
1052625.0 cpu, 0.0 io, 0.0 network, 264000.0 memory}, id = 447
00-07                Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/lineitem.parquet]], 
selectionRoot=classpath:/tpch/lineitem.parquet, numFiles=1, numRowGroups=1, 
usedMetadataFile=false, columns=[`l_orderkey`, `l_quantity`, `l_shipdate`]]]) : 
rowType = RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate): rowcount 
= 60175.0, cumulative cost = \{60175.0 rows, 180525.0 cpu, 0.0 io, 0.0 network, 
0.0 memory}, id = 445
00-06                Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], 
selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, 
usedMetadataFile=false, columns=[`o_orderkey`, `o_custkey`]]]) : rowType = 
RecordType(ANY o_orderkey, ANY o_custkey): rowcount = 15000.0, cumulative cost 
= \{15000.0 rows, 30000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 446

{noformat}

Also both early and late limit 0 optimizations will be enabled by default.


> Add option to push LIMIT(0) on top of SCAN (late limit 0 optimization)
> ----------------------------------------------------------------------
>
>                 Key: DRILL-6574
>                 URL: https://issues.apache.org/jira/browse/DRILL-6574
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.13.0
>            Reporter: Bohdan Kazydub
>            Assignee: Bohdan Kazydub
>            Priority: Major
>              Labels: doc-impacting
>             Fix For: 1.14.0
>
>
> Currently we have early limit 0 optimization 
> (planner.enable_limit0_optimization) which determines query data types before 
> actual scan. Since we not always able to determine data type during planning, 
> we need to add one more option to enable late limit 0 optimization 
> (planner.enable_limit0_on_scan, exit query right after scan. LIMIT(0) on SCAN 
> for UNION and complex functions will be disabled i.e. UNION and complex 
> functions need data to produce result schema. This would not work for the 
> following list of functions: KVGEN, MAPPIFY, FLATTEN, CONVERT_FROMJSON, 
> CONVERT_TOJSON, CONVERT_TOSIMPLEJSON, CONVERT_TOEXTENDEDJSON.
> Query plan examples:
> For query
> {code:java}
> SELECT * FROM (
>   SELECT l.l_quantity, l.l_shipdate, o.o_custkey 
>   FROM cp.`tpch/lineitem.parquet` l 
>   JOIN cp.`tpch/orders.parquet` o ON l.l_orderkey = o.o_orderkey 
>   LIMIT 2) 
> LIMIT 0
> {code}
> {color:#6a8759}plan after changes looks like{color}
> {noformat}
> 00-00    Screen : rowType = RecordType(ANY l_quantity, ANY l_shipdate, ANY 
> o_custkey): rowcount = 1.0, cumulative cost = \{75183.1 rows, 210559.1 cpu, 
> 0.0 io, 0.0 network, 17.6 memory}, id = 527
> 00-01      Project(l_quantity=[$1], l_shipdate=[$2], o_custkey=[$4]) : 
> rowType = RecordType(ANY l_quantity, ANY l_shipdate, ANY o_custkey): rowcount 
> = 1.0, cumulative cost = \{75183.0 rows, 210559.0 cpu, 0.0 io, 0.0 network, 
> 17.6 memory}, id = 526
> 00-02        SelectionVectorRemover : rowType = RecordType(ANY l_orderkey, 
> ANY l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 
> 1.0, cumulative cost = \{75182.0 rows, 210556.0 cpu, 0.0 io, 0.0 network, 
> 17.6 memory}, id = 525
> 00-03          Limit(fetch=[0]) : rowType = RecordType(ANY l_orderkey, ANY 
> l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 1.0, 
> cumulative cost = \{75181.0 rows, 210555.0 cpu, 0.0 io, 0.0 network, 17.6 
> memory}, id = 524
> 00-04            Limit(fetch=[2]) : rowType = RecordType(ANY l_orderkey, ANY 
> l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 2.0, 
> cumulative cost = \{75181.0 rows, 210555.0 cpu, 0.0 io, 0.0 network, 17.6 
> memory}, id = 523
> 00-05              HashJoin(condition=[=($0, $3)], joinType=[inner]) : 
> rowType = RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate, ANY 
> o_orderkey, ANY o_custkey): rowcount = 1.0, cumulative cost = \{75179.0 rows, 
> 210547.0 cpu, 0.0 io, 0.0 network, 17.6 memory}, id = 522
> 00-07                SelectionVectorRemover : rowType = RecordType(ANY 
> l_orderkey, ANY l_quantity, ANY l_shipdate): rowcount = 1.0, cumulative cost 
> = \{60176.0 rows, 180526.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 518
> 00-09                  Limit(offset=[0], fetch=[0]) : rowType = 
> RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate): rowcount = 1.0, 
> cumulative cost = \{60175.0 rows, 180525.0 cpu, 0.0 io, 0.0 network, 0.0 
> memory}, id = 517
> 00-11                    Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=classpath:/tpch/lineitem.parquet]], 
> selectionRoot=classpath:/tpch/lineitem.parquet, numFiles=1, numRowGroups=1, 
> usedMetadataFile=false, columns=[`l_orderkey`, `l_quantity`, `l_shipdate`]]]) 
> : rowType = RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate): 
> rowcount = 60175.0, cumulative cost = \{60175.0 rows, 180525.0 cpu, 0.0 io, 
> 0.0 network, 0.0 memory}, id = 516
> 00-06                SelectionVectorRemover : rowType = RecordType(ANY 
> o_orderkey, ANY o_custkey): rowcount = 1.0, cumulative cost = \{15001.0 rows, 
> 30001.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 521
> 00-08                  Limit(offset=[0], fetch=[0]) : rowType = 
> RecordType(ANY o_orderkey, ANY o_custkey): rowcount = 1.0, cumulative cost = 
> \{15000.0 rows, 30000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 520
> 00-10                    Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], 
> selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, 
> usedMetadataFile=false, columns=[`o_orderkey`, `o_custkey`]]]) : rowType = 
> RecordType(ANY o_orderkey, ANY o_custkey): rowcount = 15000.0, cumulative 
> cost = \{15000.0 rows, 30000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 519
> {noformat}
> and before changes:
> {noformat}
> 00-00    Screen : rowType = RecordType(ANY l_quantity, ANY l_shipdate, ANY 
> o_custkey): rowcount = 1.0, cumulative cost = \{150354.1 rows, 1052637.1 cpu, 
> 0.0 io, 0.0 network, 264000.0 memory}, id = 452
> 00-01      Project(l_quantity=[$1], l_shipdate=[$2], o_custkey=[$4]) : 
> rowType = RecordType(ANY l_quantity, ANY l_shipdate, ANY o_custkey): rowcount 
> = 1.0, cumulative cost = \{150354.0 rows, 1052637.0 cpu, 0.0 io, 0.0 network, 
> 264000.0 memory}, id = 451
> 00-02        SelectionVectorRemover : rowType = RecordType(ANY l_orderkey, 
> ANY l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 
> 1.0, cumulative cost = \{150353.0 rows, 1052634.0 cpu, 0.0 io, 0.0 network, 
> 264000.0 memory}, id = 450
> 00-03          Limit(fetch=[0]) : rowType = RecordType(ANY l_orderkey, ANY 
> l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 1.0, 
> cumulative cost = \{150352.0 rows, 1052633.0 cpu, 0.0 io, 0.0 network, 
> 264000.0 memory}, id = 449
> 00-04            Limit(fetch=[2]) : rowType = RecordType(ANY l_orderkey, ANY 
> l_quantity, ANY l_shipdate, ANY o_orderkey, ANY o_custkey): rowcount = 2.0, 
> cumulative cost = \{150352.0 rows, 1052633.0 cpu, 0.0 io, 0.0 network, 
> 264000.0 memory}, id = 448
> 00-05              HashJoin(condition=[=($0, $3)], joinType=[inner]) : 
> rowType = RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate, ANY 
> o_orderkey, ANY o_custkey): rowcount = 60175.0, cumulative cost = \{150350.0 
> rows, 1052625.0 cpu, 0.0 io, 0.0 network, 264000.0 memory}, id = 447
> 00-07                Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=classpath:/tpch/lineitem.parquet]], 
> selectionRoot=classpath:/tpch/lineitem.parquet, numFiles=1, numRowGroups=1, 
> usedMetadataFile=false, columns=[`l_orderkey`, `l_quantity`, `l_shipdate`]]]) 
> : rowType = RecordType(ANY l_orderkey, ANY l_quantity, ANY l_shipdate): 
> rowcount = 60175.0, cumulative cost = \{60175.0 rows, 180525.0 cpu, 0.0 io, 
> 0.0 network, 0.0 memory}, id = 445
> 00-06                Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], 
> selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, 
> usedMetadataFile=false, columns=[`o_orderkey`, `o_custkey`]]]) : rowType = 
> RecordType(ANY o_orderkey, ANY o_custkey): rowcount = 15000.0, cumulative 
> cost = \{15000.0 rows, 30000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 446
> {noformat}
> Also both early and late limit 0 optimizations will be enabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (DRILL-6574) Add option to push LIMIT(0) on top of SCAN (late limit 0 optimization)

Reply via email to