[
https://issues.apache.org/jira/browse/DRILL-6605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Hou updated DRILL-6605:
------------------------------
Description:
Query is:
Advanced/tpcds/tpcds_sf100/hive/parquet/query84.sql
This uses the hive parquet reader.
{code:sql}
SELECT c_customer_id AS customer_id,
c_last_name
|| ', '
|| c_first_name AS customername
FROM customer,
customer_address,
customer_demographics,
household_demographics,
income_band,
store_returns
WHERE ca_city = 'Green Acres'
AND c_current_addr_sk = ca_address_sk
AND ib_lower_bound >= 54986
AND ib_upper_bound <= 54986 + 50000
AND ib_income_band_sk = hd_income_band_sk
AND cd_demo_sk = c_current_cdemo_sk
AND hd_demo_sk = c_current_hdemo_sk
AND sr_cdemo_sk = cd_demo_sk
ORDER BY c_customer_id
LIMIT 100
{code}
This query should return 100 rows
Here is the explain plan:
{noformat}
| 00-00 Screen
00-01 Project(customer_id=[$0], customername=[$1])
00-02 SelectionVectorRemover
00-03 Limit(fetch=[100])
00-04 SingleMergeExchange(sort0=[0])
01-01 OrderedMuxExchange(sort0=[0])
02-01 SelectionVectorRemover
02-02 TopN(limit=[100])
02-03 HashToRandomExchange(dist0=[[$0]])
03-01 Project(customer_id=[$0], customername=[||(||($5, ',
'), $4)])
03-02 Project(c_customer_id=[$1],
c_current_cdemo_sk=[$2], c_current_hdemo_sk=[$3], c_current_addr_sk=[$4],
c_first_name=[$5], c_last_name=[$6], ca_address_sk=[$8], ca_city=[$9],
cd_demo_sk=[$7], hd_demo_sk=[$10], hd_income_band_sk=[$11],
ib_income_band_sk=[$12], ib_lower_bound=[$13], ib_upper_bound=[$14],
sr_cdemo_sk=[$0])
03-03 HashJoin(condition=[=($7, $0)], joinType=[inner])
03-05 HashToRandomExchange(dist0=[[$0]])
04-01 Scan(groupscan=[HiveScan
[table=Table(dbName:tpcds100_parquet, tableName:store_returns),
columns=[`sr_cdemo_sk`], numPartitions=0, partitions= null,
inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/web_returns],
confProperties={}]])
03-04 HashToRandomExchange(dist0=[[$6]])
05-01 HashJoin(condition=[=($2, $9)],
joinType=[inner])
05-03 HashJoin(condition=[=($3, $7)],
joinType=[inner])
05-05 HashJoin(condition=[=($1, $6)],
joinType=[inner])
05-07 Scan(groupscan=[HiveScan
[table=Table(dbName:tpcds100_parquet, tableName:customer),
columns=[`c_customer_id`, `c_current_cdemo_sk`, `c_current_hdemo_sk`,
`c_current_addr_sk`, `c_first_name`, `c_last_name`], numPartitions=0,
partitions= null,
inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/customer],
confProperties={}]])
05-06 BroadcastExchange
06-01 Scan(groupscan=[HiveScan
[table=Table(dbName:tpcds100_parquet, tableName:customer_demographics),
columns=[`cd_demo_sk`], numPartitions=0, partitions= null,
inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/customer_demographics],
confProperties={}]])
05-04 BroadcastExchange
07-01 SelectionVectorRemover
07-02 Filter(condition=[=($1, 'Green
Acres')])
07-03 Scan(groupscan=[HiveScan
[table=Table(dbName:tpcds100_parquet, tableName:customer_address),
columns=[`ca_address_sk`, `ca_city`], numPartitions=0, partitions= null,
inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/customer_address],
confProperties={}]])
05-02 BroadcastExchange
08-01 HashJoin(condition=[=($1, $2)],
joinType=[inner])
08-03 Scan(groupscan=[HiveScan
[table=Table(dbName:tpcds100_parquet, tableName:household_demographics),
columns=[`hd_demo_sk`, `hd_income_band_sk`], numPartitions=0, partitions= null,
inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/household_demographics],
confProperties={}]])
08-02 SelectionVectorRemover
08-04 Filter(condition=[AND(>=($1, 54986),
<=($2, 104986))])
08-05 Scan(groupscan=[HiveScan
[table=Table(dbName:tpcds100_parquet, tableName:income_band),
columns=[`ib_income_band_sk`, `ib_lower_bound`, `ib_upper_bound`],
numPartitions=0, partitions= null,
inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/income_band],
confProperties={}]])
{noformat}
commit id is:
1.14.0-SNAPSHOT a77fd142d86dd5648cda8866b8ff3af39c7b6b11 DRILL-6516:
EMIT support in streaming agg 11.07.2018 @ 18:40:03 PDT Unknown
12.07.2018 @ 01:50:37 PDT
was:
Query is:
Advanced/tpcds/tpcds_sf100/hive/parquet/query84.sql
This uses the hive parquet reader.
{code:sql}
SELECT c_customer_id AS customer_id,
c_last_name
|| ', '
|| c_first_name AS customername
FROM customer,
customer_address,
customer_demographics,
household_demographics,
income_band,
store_returns
WHERE ca_city = 'Green Acres'
AND c_current_addr_sk = ca_address_sk
AND ib_lower_bound >= 54986
AND ib_upper_bound <= 54986 + 50000
AND ib_income_band_sk = hd_income_band_sk
AND cd_demo_sk = c_current_cdemo_sk
AND hd_demo_sk = c_current_hdemo_sk
AND sr_cdemo_sk = cd_demo_sk
ORDER BY c_customer_id
LIMIT 100
{code}
This query should return 100 rows
commit id is:
1.14.0-SNAPSHOT a77fd142d86dd5648cda8866b8ff3af39c7b6b11 DRILL-6516:
EMIT support in streaming agg 11.07.2018 @ 18:40:03 PDT Unknown
12.07.2018 @ 01:50:37 PDT
> TPCDS-84 Query does not return any rows
> ---------------------------------------
>
> Key: DRILL-6605
> URL: https://issues.apache.org/jira/browse/DRILL-6605
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Reporter: Robert Hou
> Assignee: Vitalii Diravka
> Priority: Blocker
> Fix For: 1.14.0
>
>
> Query is:
> Advanced/tpcds/tpcds_sf100/hive/parquet/query84.sql
> This uses the hive parquet reader.
> {code:sql}
> SELECT c_customer_id AS customer_id,
> c_last_name
> || ', '
> || c_first_name AS customername
> FROM customer,
> customer_address,
> customer_demographics,
> household_demographics,
> income_band,
> store_returns
> WHERE ca_city = 'Green Acres'
> AND c_current_addr_sk = ca_address_sk
> AND ib_lower_bound >= 54986
> AND ib_upper_bound <= 54986 + 50000
> AND ib_income_band_sk = hd_income_band_sk
> AND cd_demo_sk = c_current_cdemo_sk
> AND hd_demo_sk = c_current_hdemo_sk
> AND sr_cdemo_sk = cd_demo_sk
> ORDER BY c_customer_id
> LIMIT 100
> {code}
> This query should return 100 rows
> Here is the explain plan:
> {noformat}
> | 00-00 Screen
> 00-01 Project(customer_id=[$0], customername=[$1])
> 00-02 SelectionVectorRemover
> 00-03 Limit(fetch=[100])
> 00-04 SingleMergeExchange(sort0=[0])
> 01-01 OrderedMuxExchange(sort0=[0])
> 02-01 SelectionVectorRemover
> 02-02 TopN(limit=[100])
> 02-03 HashToRandomExchange(dist0=[[$0]])
> 03-01 Project(customer_id=[$0], customername=[||(||($5,
> ', '), $4)])
> 03-02 Project(c_customer_id=[$1],
> c_current_cdemo_sk=[$2], c_current_hdemo_sk=[$3], c_current_addr_sk=[$4],
> c_first_name=[$5], c_last_name=[$6], ca_address_sk=[$8], ca_city=[$9],
> cd_demo_sk=[$7], hd_demo_sk=[$10], hd_income_band_sk=[$11],
> ib_income_band_sk=[$12], ib_lower_bound=[$13], ib_upper_bound=[$14],
> sr_cdemo_sk=[$0])
> 03-03 HashJoin(condition=[=($7, $0)],
> joinType=[inner])
> 03-05 HashToRandomExchange(dist0=[[$0]])
> 04-01 Scan(groupscan=[HiveScan
> [table=Table(dbName:tpcds100_parquet, tableName:store_returns),
> columns=[`sr_cdemo_sk`], numPartitions=0, partitions= null,
> inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/web_returns],
> confProperties={}]])
> 03-04 HashToRandomExchange(dist0=[[$6]])
> 05-01 HashJoin(condition=[=($2, $9)],
> joinType=[inner])
> 05-03 HashJoin(condition=[=($3, $7)],
> joinType=[inner])
> 05-05 HashJoin(condition=[=($1, $6)],
> joinType=[inner])
> 05-07 Scan(groupscan=[HiveScan
> [table=Table(dbName:tpcds100_parquet, tableName:customer),
> columns=[`c_customer_id`, `c_current_cdemo_sk`, `c_current_hdemo_sk`,
> `c_current_addr_sk`, `c_first_name`, `c_last_name`], numPartitions=0,
> partitions= null,
> inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/customer],
> confProperties={}]])
> 05-06 BroadcastExchange
> 06-01 Scan(groupscan=[HiveScan
> [table=Table(dbName:tpcds100_parquet, tableName:customer_demographics),
> columns=[`cd_demo_sk`], numPartitions=0, partitions= null,
> inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/customer_demographics],
> confProperties={}]])
> 05-04 BroadcastExchange
> 07-01 SelectionVectorRemover
> 07-02 Filter(condition=[=($1, 'Green
> Acres')])
> 07-03 Scan(groupscan=[HiveScan
> [table=Table(dbName:tpcds100_parquet, tableName:customer_address),
> columns=[`ca_address_sk`, `ca_city`], numPartitions=0, partitions= null,
> inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/customer_address],
> confProperties={}]])
> 05-02 BroadcastExchange
> 08-01 HashJoin(condition=[=($1, $2)],
> joinType=[inner])
> 08-03 Scan(groupscan=[HiveScan
> [table=Table(dbName:tpcds100_parquet, tableName:household_demographics),
> columns=[`hd_demo_sk`, `hd_income_band_sk`], numPartitions=0, partitions=
> null,
> inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/household_demographics],
> confProperties={}]])
> 08-02 SelectionVectorRemover
> 08-04 Filter(condition=[AND(>=($1,
> 54986), <=($2, 104986))])
> 08-05 Scan(groupscan=[HiveScan
> [table=Table(dbName:tpcds100_parquet, tableName:income_band),
> columns=[`ib_income_band_sk`, `ib_lower_bound`, `ib_upper_bound`],
> numPartitions=0, partitions= null,
> inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/income_band],
> confProperties={}]])
> {noformat}
> commit id is:
> 1.14.0-SNAPSHOT a77fd142d86dd5648cda8866b8ff3af39c7b6b11
> DRILL-6516: EMIT support in streaming agg 11.07.2018 @ 18:40:03 PDT
> Unknown 12.07.2018 @ 01:50:37 PDT
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)