[ https://issues.apache.org/jira/browse/HIVE-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mithun Radhakrishnan updated HIVE-17781:
----------------------------------------
    Description: 
Here's one that [~cdrome] and [~thiruvel] worked on:

We found that certain Hadoop Map/Reduce settings that are set in site config 
files do not take effect in Hive jobs, because the Tez site configs do not 
contain the same settings.

In Yahoo's case, the problem was that, at the time, there was no mapping 
between {{MRJobConfig.COMPLETED_MAPS_FOR_REDUCE_SLOWSTART}} and 
{{TEZ_SHUFFLE_VERTEX_MANAGER_MAX_SRC_FRACTION}}. There were situations where 
significant capacity on production clusters was being used up doing nothing, 
while waiting for slow tasks to complete. This would have been avoided had 
the mappings been in place.

Tez provides a {{DeprecatedKeys}} utility class to help map MR settings to Tez 
settings. Hive should use this to ensure that the mappings are in sync.
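
To make the intent concrete, here is a minimal, self-contained sketch of the kind of key translation described above. It is not the attached patch and does not call Tez's {{DeprecatedKeys}}; the class {{MrToTezKeyMapper}}, its one-entry table, and the "explicit Tez value wins" rule are all assumptions made for illustration. The two property names are, to the best of our knowledge, the string values behind the constants named above.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a hand-rolled stand-in for an MR-to-Tez key mapping.
// The class name and the table contents are hypothetical, not Hive or Tez code.
class MrToTezKeyMapper {

    // Hypothetical mapping table holding only the pair discussed in this issue.
    static final Map<String, String> MR_TO_TEZ = new LinkedHashMap<>();
    static {
        MR_TO_TEZ.put("mapreduce.job.reduce.slowstart.completedmaps",
                      "tez.shuffle-vertex-manager.max-src-fraction");
    }

    // Copies each MR value onto its Tez key, unless the Tez key is already set.
    // Assumption: an explicitly configured Tez value should take precedence.
    // Returns the number of settings that were copied.
    static int applyMappings(Map<String, String> conf, Map<String, String> mrToTez) {
        int copied = 0;
        for (Map.Entry<String, String> e : mrToTez.entrySet()) {
            String mrValue = conf.get(e.getKey());
            if (mrValue != null && !conf.containsKey(e.getValue())) {
                conf.put(e.getValue(), mrValue);
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        // Simulates a site config that sets only the MR slow-start fraction.
        Map<String, String> conf = new HashMap<>();
        conf.put("mapreduce.job.reduce.slowstart.completedmaps", "0.95");

        int copied = applyMappings(conf, MR_TO_TEZ);
        System.out.println("copied=" + copied + ", tez value="
                + conf.get("tez.shuffle-vertex-manager.max-src-fraction"));
    }
}
```

In Hive itself, the table would presumably come from {{DeprecatedKeys}} rather than being hard-coded, so that new mappings added in Tez are picked up without a Hive change.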

(Note to self: YHIVE-883)

  was:
Here's one that [~cdrome] and [~thiruvel] worked on:

We found that certain Hadoop Map/Reduce settings that are set in site config 
files do not take effect in Hive jobs, because the Tez site configs do not 
contain the same settings.

In Yahoo's case, the problem was that, at the time, there was no mapping 
between {{MRJobConfig.COMPLETED_MAPS_FOR_REDUCE_SLOWSTART}} and 
{{TEZ_SHUFFLE_VERTEX_MANAGER_MAX_SRC_FRACTION}}. There were situations where 
significant capacity on production clusters was being used up doing nothing, 
while waiting for slow tasks to complete. This would have been avoided had 
the mappings been in place.

Tez provides a {{DeprecatedKeys}} utility class, to help map MR settings to Tez 
settings. Hive should use this to ensure that the mappings are in sync.


> Map MR settings to Tez settings via DeprecatedKeys
> --------------------------------------------------
>
>                 Key: HIVE-17781
>                 URL: https://issues.apache.org/jira/browse/HIVE-17781
>             Project: Hive
>          Issue Type: Bug
>          Components: Configuration, Tez
>    Affects Versions: 3.0.0
>            Reporter: Mithun Radhakrishnan
>            Assignee: Chris Drome
>         Attachments: HIVE-17781.1.patch
>
>
> Here's one that [~cdrome] and [~thiruvel] worked on:
> We found that certain Hadoop Map/Reduce settings that are set in site config 
> files do not take effect in Hive jobs, because the Tez site configs do not 
> contain the same settings.
> In Yahoo's case, the problem was that, at the time, there was no mapping 
> between {{MRJobConfig.COMPLETED_MAPS_FOR_REDUCE_SLOWSTART}} and 
> {{TEZ_SHUFFLE_VERTEX_MANAGER_MAX_SRC_FRACTION}}. There were situations where 
> significant capacity on production clusters was being used up doing nothing, 
> while waiting for slow tasks to complete. This would have been avoided had 
> the mappings been in place.
> Tez provides a {{DeprecatedKeys}} utility class to help map MR settings to 
> Tez settings. Hive should use this to ensure that the mappings are in sync.
> (Note to self: YHIVE-883)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
