[ https://issues.apache.org/jira/browse/HIVE-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747776#action_12747776 ]
Zheng Shao commented on HIVE-797:
---------------------------------
The mapper script can write to stderr to avoid being killed on timeout.
Alternatively, you can set the following to achieve the same result:
{code}
set hive.script.auto.progress=true;
{code}
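
For the stderr approach, a minimal sketch of a transform script is below. It is an illustration only: {{classify}} is a hypothetical stand-in for 'my_boolean_function', and {{run_mapper}} and {{heartbeat_every}} are names invented for this example. It assumes, as described above, that lines written to stderr are enough to keep the task from being killed.

```python
import sys


def classify(value):
    # Hypothetical stand-in for 'my_boolean_function': non-empty input is "true".
    return "true" if value.strip() else "false"


def run_mapper(rows, out, err, heartbeat_every=1000):
    """Write one result per input row to `out`, with a periodic note to `err`."""
    for count, row in enumerate(rows, start=1):
        out.write(classify(row.rstrip("\n")) + "\n")
        # Assumption: periodic stderr output is what signals that the script is alive.
        if count % heartbeat_every == 0:
            err.write("processed %d rows\n" % count)
            err.flush()


if __name__ == "__main__":
    run_mapper(sys.stdin, sys.stdout, sys.stderr)
```

The heartbeat is tied to the row count so the stderr volume stays small; a time-based interval would work just as well if individual rows are slow to process.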
> mappers should report life in ways other than emitting data
> -----------------------------------------------------------
>
> Key: HIVE-797
> URL: https://issues.apache.org/jira/browse/HIVE-797
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: S. Alex Smith
>
> Mappers which are performing a great deal of aggregation can be killed by
> time out even if they are running successfully. For example, in the
> following query the group by operator stops the mapper from returning any
> rows of data until the map is entirely finished. If the data processing
> takes longer than the time-out limit, the job will fail. The mapper should
> instead offer the tracker some indication that it is busy working.
> Alternatively, the tracker could ping the mapper with an appropriate question
> / warning before it sends a kill signal.
> FROM (
>   FROM my_table
>   SELECT TRANSFORM(my_data)
>   USING 'my_boolean_function'
>   AS boolean_output) a
> SELECT boolean_output, COUNT(1)
> GROUP BY boolean_output