[jira] [Commented] (DRILL-4147) Union All operator runs in a single fragment

Robert Hou (JIRA) Fri, 05 Aug 2016 23:21:54 -0700

    [ 
https://issues.apache.org/jira/browse/DRILL-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410511#comment-15410511
 ]


Robert Hou commented on DRILL-4147:
-----------------------------------

Here is a simple test case using lineitem.  The lineitem table can be small, 
but it needs to be stored as many parquet files.

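-- Small parquet block size, intended to make the CTAS below write many files.
alter session set `store.parquet.block-size`=500000;
create table lineitemfiles as select * from lineitem;

-- Union All of a join over the many-file table with a one-row CTE.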

create table newlineitemfiles as 
with lineitem_cte as
(
select l.l_orderkey, l.l_partkey
from lineitemfiles l limit 1)

(select l.l_orderkey, l.l_partkey
from
lineitemfiles l
inner join
orders o
on l.l_orderkey = o.o_orderkey)

union all

(select l.l_orderkey, l.l_partkey
from lineitem_cte l);
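
One way to check how the Union All is planned, assuming the tables above 
exist, is to ask Drill for the physical plan of the same query without the 
CTAS wrapper.  This is only a sketch and was not part of the original test 
case.  In the text plan, each operator id is prefixed with its major fragment 
number, so the fragment containing the UnionAll and the scans feeding it can 
be read off directly.

```sql
-- Sketch: same query as above, wrapped in EXPLAIN instead of CTAS.
explain plan for
with lineitem_cte as
(
select l.l_orderkey, l.l_partkey
from lineitemfiles l limit 1)

(select l.l_orderkey, l.l_partkey
from
lineitemfiles l
inner join
orders o
on l.l_orderkey = o.o_orderkey)

union all

(select l.l_orderkey, l.l_partkey
from lineitem_cte l);
```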



> Union All operator runs in a single fragment
> --------------------------------------------
>
>                 Key: DRILL-4147
>                 URL: https://issues.apache.org/jira/browse/DRILL-4147
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: amit hadke
>            Assignee: Aman Sinha
>
> A user noticed that running a select from a single directory is much faster 
> than a union all on two directories.
> (https://drill.apache.org/blog/2014/12/09/running-sql-queries-on-amazon-s3/#comment-2349732267)
>  
> It seems that the UNION ALL operator doesn't parallelize its sub-scans (it is 
> using SINGLETON for the distribution type). Everything runs in a single fragment.
> We may have to use SubsetTransformer in UnionAllPrule.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
