mappers should report life in ways other than emitting data
-----------------------------------------------------------

                 Key: HIVE-797
                 URL: https://issues.apache.org/jira/browse/HIVE-797
             Project: Hadoop Hive
          Issue Type: Bug
            Reporter: S. Alex Smith


Mappers that perform a great deal of aggregation can be killed by the task 
time-out even though they are running successfully.  For example, in the 
following query the group-by operator prevents the mapper from emitting any 
rows until the map is entirely finished.  If processing the data takes longer 
than the time-out limit, the job fails.  The mapper should instead give the 
tracker some indication that it is busy working.  Alternatively, the tracker 
could ping the mapper with an appropriate question or warning before it sends 
a kill signal.

FROM (
  FROM my_table
  SELECT TRANSFORM(my_data)
  USING 'my_boolean_function'
  AS boolean_output) a
SELECT boolean_output, COUNT(1)
GROUP BY boolean_output
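
To illustrate the first suggestion (the mapper signalling that it is still
busy), here is a minimal sketch in Python rather than Hive's own code.  It
counts rows per key, as the GROUP BY above does, and calls a progress hook at
most once per interval while it works.  The names aggregate_with_heartbeat,
report_progress, interval_seconds and clock are all hypothetical; in Hadoop's
mapred API the equivalent hook would presumably be Reporter.progress(), but
how it would be wired into Hive's operators is not shown here.

```python
import time
from collections import Counter


def aggregate_with_heartbeat(rows, report_progress,
                             interval_seconds=60.0, clock=time.monotonic):
    """Count occurrences of each key in rows, like COUNT(1) ... GROUP BY.

    report_progress is a hypothetical zero-argument callback that tells the
    framework the task is alive. It is called at most once per
    interval_seconds, so the heartbeat costs little even on very large inputs.
    """
    counts = Counter()
    last_report = clock()
    for key in rows:
        counts[key] += 1
        now = clock()
        # Emit a heartbeat only when enough time has passed since the last one.
        if now - last_report >= interval_seconds:
            report_progress()
            last_report = now
    # Nothing is emitted until all input is consumed, which is exactly why
    # the heartbeat is needed in the meantime.
    return dict(counts)
```

The clock is passed in as a parameter only so that the throttling can be
tested without waiting in real time; a real mapper would use the default.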
