[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Resolution: Fixed Fix Version/s: 2.1.0 Status: Resolved (was: Patch Available) > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Fix For: 2.1.0 > > Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, > HIVE-13750.patch, HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Attachment: (was: HIVE-13750.02.patch) > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, > HIVE-13750.patch, HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Attachment: HIVE-13750.02.patch > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, > HIVE-13750.patch, HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Status: Patch Available (was: In Progress) > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, > HIVE-13750.patch, HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Attachment: HIVE-13750.02.patch > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.01.patch, HIVE-13750.02.patch, > HIVE-13750.patch, HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Status: Open (was: Patch Available) > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.01.patch, HIVE-13750.patch, HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Attachment: HIVE-13750.01.patch > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.01.patch, HIVE-13750.patch, HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Status: Patch Available (was: Open) > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.patch, HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Attachment: HIVE-13750.patch > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.patch, HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Status: Patch Available (was: In Progress) > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Status: Open (was: Patch Available) > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13750: Target Version/s: 2.1.0 > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Attachment: HIVE-13750.patch > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Status: Patch Available (was: In Progress) > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)