[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled

2015-10-08 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9695:
--
Affects Version/s: (was: 1.1.0)
   (was: 1.0.0)
   (was: 0.14.0)
   2.0.0

> Redundant filter operator in reducer Vertex when CBO is disabled
> 
>
> Key: HIVE-9695
> URL: https://issues.apache.org/jira/browse/HIVE-9695
> Project: Hive
> Issue Type: Improvement
> Components: Logical Optimizer
> Affects Versions: 2.0.0
> Reporter: Mostafa Mokhtar
> Assignee: Jesus Camacho Rodriguez
> Fix For: 2.0.0
>
> Attachments: HIVE-9695.01.patch, HIVE-9695.01.patch, HIVE-9695.patch
>
>
> There is a redundant filter operator in reducer Vertex when CBO is disabled.
> Query 
> {code}
> select 
> ss_item_sk, ss_ticket_number, ss_store_sk
> from
> store_sales a, store_returns b, store
> where
> a.ss_item_sk = b.sr_item_sk
> and a.ss_ticket_number = b.sr_ticket_number 
> and ss_sold_date_sk between 2450816 and 2451500
>   and sr_returned_date_sk between 2450816 and 2451500
>   and s_store_sk = ss_store_sk;
> {code}
> Plan snippet 
> {code}
>   Statistics: Num rows: 57439344 Data size: 1838059008 Basic stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
>     predicate: (((((_col1 = _col27) and (_col8 = _col34)) and _col22 BETWEEN 2450816 AND 2451500) and _col45 BETWEEN 2450816 AND 2451500) and (_col49 = _col6)) (type: boolean)
> {code}
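A minimal sketch of how a plan like the one quoted above might be reproduced. It assumes a Hive session running on Tez with TPC-DS-style tables named store_sales, store_returns and store. The two `set` lines use standard Hive configuration properties, but whether they match the reporter's exact session is an assumption; the report only says that CBO was disabled.

```sql
-- Assumption: these session settings are illustrative and not taken from the report.
set hive.execution.engine=tez;
set hive.cbo.enable=false;

-- EXPLAIN prints the operator tree, so the Filter Operator in the reducer
-- vertex can be inspected without running the query.
explain
select
  ss_item_sk, ss_ticket_number, ss_store_sk
from
  store_sales a, store_returns b, store
where
  a.ss_item_sk = b.sr_item_sk
  and a.ss_ticket_number = b.sr_ticket_number
  and ss_sold_date_sk between 2450816 and 2451500
  and sr_returned_date_sk between 2450816 and 2451500
  and s_store_sk = ss_store_sk;
```

Running the same statement with `hive.cbo.enable=true` and comparing the reducer vertex of the two plans is one way to see whether the extra Filter Operator appears only when CBO is off.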
> Full plan with CBO disabled
> {code}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (BROADCAST_EDGE), Map 4 (SIMPLE_EDGE)
>       DagName: mmokhtar_20150214182626_ad6820c7-b667-4652-ab25-cb60deed1a6d:13
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: b
>                   filterExpr: ((sr_item_sk is not null and sr_ticket_number is not null) and sr_returned_date_sk BETWEEN 2450816 AND 2451500) (type: boolean)
>                   Statistics: Num rows: 2370038095 Data size: 170506118656 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (sr_item_sk is not null and sr_ticket_number is not null) (type: boolean)
>                     Statistics: Num rows: 706893063 Data size: 6498502768 Basic stats: COMPLETE Column stats: COMPLETE
>                     Reduce Output Operator
>                       key expressions: sr_item_sk (type: int), sr_ticket_number (type: int)
>                       sort order: ++
>                       Map-reduce partition columns: sr_item_sk (type: int), sr_ticket_number (type: int)
>                       Statistics: Num rows: 706893063 Data size: 6498502768 Basic stats: COMPLETE Column stats: COMPLETE
>                       value expressions: sr_returned_date_sk (type: int)
>             Execution mode: vectorized
>         Map 3
>             Map Operator Tree:
>                 TableScan
>                   alias: store
>                   filterExpr: s_store_sk is not null (type: boolean)
>                   Statistics: Num rows: 1704 Data size: 3256276 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: s_store_sk is not null (type: boolean)
>                     Statistics: Num rows: 1704 Data size: 6816 Basic stats: COMPLETE Column stats: COMPLETE
>                     Reduce Output Operator
>                       key expressions: s_store_sk (type: int)
>                       sort order: +
>                       Map-reduce partition columns: s_store_sk (type: int)
>                       Statistics: Num rows: 1704 Data size: 6816 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 4
>             Map Operator Tree:
>                 TableScan
>                   alias: a
>                   filterExpr: (((ss_item_sk is not null and ss_ticket_number is not null) and ss_store_sk is not null) and ss_sold_date_sk BETWEEN 2450816 AND 2451500) (type: boolean)
>                   Statistics: Num rows: 28878719387 Data size: 2405805439460 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: ((ss_item_sk is not null and ss_ticket_number is not null) and ss_store_sk is not null) (type: boolean)
>                     Statistics: Num rows: 8405840828 Data size: 110101408700 Basic stats: COMPLETE Column stats: COMPLETE
>                     Reduce Output Operator
>                       key expressions: ss_item_sk (type: int), ss_ticket_number (type: int)
> 

[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled

2015-10-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9695:
---
Component/s: (was: Physical Optimizer)
 Logical Optimizer

> Redundant filter operator in reducer Vertex when CBO is disabled
> 
>
> Key: HIVE-9695
> URL: https://issues.apache.org/jira/browse/HIVE-9695
> Project: Hive
> Issue Type: Improvement
> Components: Logical Optimizer
> Affects Versions: 2.0.0
> Reporter: Mostafa Mokhtar
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-9695.01.patch, HIVE-9695.01.patch, HIVE-9695.patch
>
>

[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled

2015-10-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9695:
---
Affects Version/s: (was: 2.0.0)
   0.14.0
   1.0.0
   1.1.0

> Redundant filter operator in reducer Vertex when CBO is disabled
> 
>
> Key: HIVE-9695
> URL: https://issues.apache.org/jira/browse/HIVE-9695
> Project: Hive
> Issue Type: Improvement
> Components: Logical Optimizer
> Affects Versions: 0.14.0, 1.0.0, 1.1.0
> Reporter: Mostafa Mokhtar
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-9695.01.patch, HIVE-9695.01.patch, HIVE-9695.patch
>
>

[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled

2015-10-06 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9695:
--
Attachment: HIVE-9695.01.patch

> Redundant filter operator in reducer Vertex when CBO is disabled
> 
>
> Key: HIVE-9695
> URL: https://issues.apache.org/jira/browse/HIVE-9695
> Project: Hive
> Issue Type: Improvement
> Components: Physical Optimizer
> Affects Versions: 2.0.0
> Reporter: Mostafa Mokhtar
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-9695.01.patch, HIVE-9695.patch
>
>

[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled

2015-10-06 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9695:
--
Attachment: HIVE-9695.01.patch

> Redundant filter operator in reducer Vertex when CBO is disabled
> 
>
> Key: HIVE-9695
> URL: https://issues.apache.org/jira/browse/HIVE-9695
> Project: Hive
> Issue Type: Improvement
> Components: Physical Optimizer
> Affects Versions: 2.0.0
> Reporter: Mostafa Mokhtar
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-9695.01.patch, HIVE-9695.01.patch, HIVE-9695.patch
>
>

[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled

2015-10-01 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9695:
--
Issue Type: Improvement  (was: Bug)

> Redundant filter operator in reducer Vertex when CBO is disabled
> 
>
> Key: HIVE-9695
> URL: https://issues.apache.org/jira/browse/HIVE-9695
> Project: Hive
> Issue Type: Improvement
> Components: Physical Optimizer
> Affects Versions: 2.0.0
> Reporter: Mostafa Mokhtar
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-9695.patch
>
>

[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled

2015-10-01 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9695:
--
Affects Version/s: (was: 0.14.0)
   2.0.0

> Redundant filter operator in reducer Vertex when CBO is disabled
> 
>
> Key: HIVE-9695
> URL: https://issues.apache.org/jira/browse/HIVE-9695
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 2.0.0
> Reporter: Mostafa Mokhtar
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-9695.patch
>
>

[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled

2015-10-01 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9695:
--
Attachment: HIVE-9695.patch

> Redundant filter operator in reducer Vertex when CBO is disabled
> 
>
> Key: HIVE-9695
> URL: https://issues.apache.org/jira/browse/HIVE-9695
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 0.14.0
> Reporter: Mostafa Mokhtar
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-9695.patch
>
>

[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled

2015-04-13 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-9695:
-
Assignee: Laljo John Pullokkaran  (was: Gunther Hagleitner)

 Redundant filter operator in reducer Vertex when CBO is disabled
 

 Key: HIVE-9695
 URL: https://issues.apache.org/jira/browse/HIVE-9695
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Laljo John Pullokkaran
 Fix For: 1.2.0


 There is a redundant filter operator in reducer Vertex when CBO is disabled.
 Query 
 {code}
 select 
 ss_item_sk, ss_ticket_number, ss_store_sk
 from
 store_sales a, store_returns b, store
 where
 a.ss_item_sk = b.sr_item_sk
 and a.ss_ticket_number = b.sr_ticket_number 
 and ss_sold_date_sk between 2450816 and 2451500
   and sr_returned_date_sk between 2450816 and 2451500
   and s_store_sk = ss_store_sk;
 {code}
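 A minimal sketch for reproducing the plans in this report, assuming a Hive session running on Tez with the TPC-DS tables loaded. The two set statements are assumptions and do not come from the original report; hive.cbo.enable=false is meant to match the "CBO is disabled" condition in the summary.
 {code}
 -- Assumed session settings, not taken from the report.
 set hive.execution.engine=tez;
 set hive.cbo.enable=false;
 -- Same query as above, wrapped in EXPLAIN to print the operator tree.
 explain
 select
 ss_item_sk, ss_ticket_number, ss_store_sk
 from
 store_sales a, store_returns b, store
 where
 a.ss_item_sk = b.sr_item_sk
 and a.ss_ticket_number = b.sr_ticket_number
 and ss_sold_date_sk between 2450816 and 2451500
 and sr_returned_date_sk between 2450816 and 2451500
 and s_store_sk = ss_store_sk;
 {code}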
 Plan snippet 
 {code}
   Statistics: Num rows: 57439344 Data size: 1838059008 Basic stats: COMPLETE 
 Column stats: COMPLETE
   Filter Operator
 predicate: (((((_col1 = _col27) and (_col8 = _col34)) and 
 _col22 BETWEEN 2450816 AND 2451500) and _col45 BETWEEN 2450816 AND 2451500) 
 and (_col49 = _col6)) (type: boolean)
 {code}
 Full plan with CBO disabled
 {code}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 depends on stages: Stage-1
 STAGE PLANS:
   Stage: Stage-1
 Tez
   Edges:
 Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (BROADCAST_EDGE), Map 4 
 (SIMPLE_EDGE)
   DagName: mmokhtar_20150214182626_ad6820c7-b667-4652-ab25-cb60deed1a6d:13
   Vertices:
 Map 1
 Map Operator Tree:
 TableScan
   alias: b
   filterExpr: ((sr_item_sk is not null and sr_ticket_number 
 is not null) and sr_returned_date_sk BETWEEN 2450816 AND 2451500) (type: 
 boolean)
   Statistics: Num rows: 2370038095 Data size: 170506118656 
 Basic stats: COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: (sr_item_sk is not null and sr_ticket_number 
 is not null) (type: boolean)
 Statistics: Num rows: 706893063 Data size: 6498502768 
 Basic stats: COMPLETE Column stats: COMPLETE
 Reduce Output Operator
   key expressions: sr_item_sk (type: int), 
 sr_ticket_number (type: int)
   sort order: ++
   Map-reduce partition columns: sr_item_sk (type: int), 
 sr_ticket_number (type: int)
   Statistics: Num rows: 706893063 Data size: 6498502768 
 Basic stats: COMPLETE Column stats: COMPLETE
   value expressions: sr_returned_date_sk (type: int)
 Execution mode: vectorized
 Map 3
 Map Operator Tree:
 TableScan
   alias: store
   filterExpr: s_store_sk is not null (type: boolean)
   Statistics: Num rows: 1704 Data size: 3256276 Basic stats: 
 COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: s_store_sk is not null (type: boolean)
 Statistics: Num rows: 1704 Data size: 6816 Basic stats: 
 COMPLETE Column stats: COMPLETE
 Reduce Output Operator
   key expressions: s_store_sk (type: int)
   sort order: +
   Map-reduce partition columns: s_store_sk (type: int)
   Statistics: Num rows: 1704 Data size: 6816 Basic stats: 
 COMPLETE Column stats: COMPLETE
 Execution mode: vectorized
 Map 4
 Map Operator Tree:
 TableScan
   alias: a
   filterExpr: (((ss_item_sk is not null and ss_ticket_number 
 is not null) and ss_store_sk is not null) and ss_sold_date_sk BETWEEN 2450816 
 AND 2451500) (type: boolean)
   Statistics: Num rows: 28878719387 Data size: 2405805439460 
 Basic stats: COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: ((ss_item_sk is not null and ss_ticket_number 
 is not null) and ss_store_sk is not null) (type: boolean)
 Statistics: Num rows: 8405840828 Data size: 110101408700 
 Basic stats: COMPLETE Column stats: COMPLETE
 Reduce Output Operator
   key expressions: ss_item_sk (type: int), 
 ss_ticket_number (type: int)
   sort order: ++
   Map-reduce partition columns: ss_item_sk (type: