[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.1.0
>
> Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, 
> HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Attachment: (was: HIVE-13750.02.patch)

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, 
> HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Attachment: HIVE-13750.02.patch

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, 
> HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-18 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Status: Patch Available  (was: In Progress)

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, 
> HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-18 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Attachment: HIVE-13750.02.patch

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, 
> HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-18 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Status: Open  (was: Patch Available)

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.01.patch, HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-17 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Attachment: HIVE-13750.01.patch

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.01.patch, HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-17 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Status: Patch Available  (was: Open)

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Attachment: HIVE-13750.patch

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.patch, HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Status: Patch Available  (was: In Progress)

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Status: Open  (was: Patch Available)

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13750:

Target Version/s: 2.1.0

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-12 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Attachment: HIVE-13750.patch

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-12 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Status: Patch Available  (was: In Progress)

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If sort columns of previous 
> shuffle and partitioning columns of table match, reduce sink deduplication 
> optimizer removes extra shuffle stage, thus bringing down overhead to zero. 
> However, if they don’t match, we end up doing extra shuffle. This can be 
> improved since we can add table partition columns as a sort columns on 
> earlier shuffle and avoid this extra shuffle. This ensures that in cases 
> query already has a shuffle stage, we are not shuffling data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)