Heartbeating for streaming jobs should not depend on stdout
-----------------------------------------------------------
Key: HIVE-410
URL: https://issues.apache.org/jira/browse/HIVE-410
Project: Hadoop Hive
Issue Type: Bug
Reporter: Venky Iyer
Priority: Blocker
Streaming jobs that require iterative processing may take longer than 10
minutes to produce any rows. That alone shouldn't be cause to kill the job.
Producing dummy keepalive rows on stdout is bad if the data has to go into a
Hive table or into other Hive steps.
If we adopt the solution of using stderr to indicate heartbeats, can that be
combined with streaming counters
(http://hadoop.apache.org/core/docs/current/streaming.html#How+do+I+update+counters+in+streaming+applications%3F)?
Also, will limitations on the size of stderr break this?
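
For discussion, here is a rough sketch of what the stderr idea could look like
from the script side. It writes the "reporter:counter:" and "reporter:status:"
lines that Hadoop Streaming documents for stderr, from a background thread, so
stdout carries only real rows. Whether the Hive streaming path would treat such
lines as heartbeats is exactly the open question, so that part is an assumption.
The Heartbeat class, the "Heartbeat"/"beats" counter names and the 60 second
interval are hypothetical, chosen only for illustration.

```python
import sys
import threading


def counter_line(group, counter, amount=1):
    """Format a Hadoop Streaming counter update line for stderr."""
    return "reporter:counter:%s,%s,%d\n" % (group, counter, amount)


def status_line(message):
    """Format a Hadoop Streaming status update line for stderr."""
    return "reporter:status:%s\n" % message


class Heartbeat(object):
    """Hypothetical helper: emits a counter line periodically on a stream.

    Assumption: the framework reading stderr counts these lines as progress.
    """

    def __init__(self, interval_seconds=60.0, stream=None):
        self.interval = interval_seconds
        self.stream = stream if stream is not None else sys.stderr
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run)
        self._thread.daemon = True  # never keep the process alive on exit

    def beat(self):
        # One heartbeat: bump a counter and flush so it is seen immediately.
        self.stream.write(counter_line("Heartbeat", "beats", 1))
        self.stream.flush()

    def _run(self):
        # Event.wait returns False on timeout and True once stop is requested.
        while not self._stop.wait(self.interval):
            self.beat()

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._stop.set()
        self._thread.join()
        return False


def process(lines, out, transform):
    """Run a slow per-row transform while heartbeats go to stderr."""
    with Heartbeat():
        for line in lines:
            out.write(transform(line))
```

A script would call something like process(sys.stdin, sys.stdout, my_transform),
where my_transform is the user's own slow step. Each heartbeat is a single short
line, but this sketch does not settle whether a stderr size limit applies to
such lines.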