[ https://issues.apache.org/jira/browse/HIVE-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747840#action_12747840 ]
S. Alex Smith commented on HIVE-797:
------------------------------------
Setting
    set hive.script.auto.progress=true;
seems to have no effect. Apart from the job succeeding (it does not), what
effect should I be able to measure to confirm that the setting is doing anything?
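One way to get something measurable might be a transform script that writes a heartbeat to stderr. This is a minimal sketch, assuming Hive treats stderr output from the script as progress (which is how the description of hive.script.auto.progress reads). The function name run_transform, the heartbeat_every parameter, and the stand-in predicate are hypothetical. The "reporter:status:" prefix is the Hadoop Streaming convention, and whether this Hive version interprets it is an assumption.

```python
import sys


def run_transform(rows, out=sys.stdout, err=sys.stderr, heartbeat_every=1000):
    """Write one boolean per input row and a periodic status line to stderr.

    Returns the number of rows processed.
    """
    count = 0
    for count, row in enumerate(rows, start=1):
        value = row.rstrip("\n")
        # Hypothetical predicate standing in for 'my_boolean_function':
        # a non-empty value counts as true.
        out.write("true\n" if value else "false\n")
        if count % heartbeat_every == 0:
            # Heartbeat on stderr. The prefix follows the Hadoop Streaming
            # status-line convention; that Hive honours it is an assumption.
            err.write("reporter:status:processed %d rows\n" % count)
            err.flush()
    return count
```

To use it as the script named in the USING clause, call run_transform(sys.stdin) at the bottom of the file. If the heartbeat has any effect, the task should survive past the timeout, and the stderr lines should appear in the task logs.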
> mappers should report life in ways other than emitting data
> -----------------------------------------------------------
>
> Key: HIVE-797
> URL: https://issues.apache.org/jira/browse/HIVE-797
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: S. Alex Smith
>
> Mappers that perform a great deal of aggregation can be killed by the task
> timeout even though they are running successfully. For example, in the
> following query the group-by operator prevents the mapper from emitting any
> rows until the map is entirely finished. If the data processing takes longer
> than the timeout limit, the job fails. The mapper should instead give the
> tracker some indication that it is busy working. Alternatively, the tracker
> could ping the mapper with an appropriate question or warning before it
> sends a kill signal.
> FROM (
>   FROM my_table
>   SELECT TRANSFORM(my_data)
>   USING 'my_boolean_function'
>   AS boolean_output
> ) a
> SELECT boolean_output, COUNT(1)
> GROUP BY boolean_output