Yongjia Wang created SPARK-11152:
------------------------------------
Summary: Streaming UI: Input sizes are 0 for makeup batches started from a checkpoint
Key: SPARK-11152
URL: https://issues.apache.org/jira/browse/SPARK-11152
Project: Spark
Issue Type: Bug
Components: Streaming, Web UI
Reporter: Yongjia Wang
Priority: Minor
Suppose a streaming job restarts from a checkpoint taken at batch time x, and
the current time when the job resumes is x+10. Since Spark schedules the
missing batches x+1 to x+10 without any metadata, it packs all the backlogged
input into batch x+1 and then assigns any new input to batches x+2 through x+10
immediately, without waiting. This results in tiny batches that capture only
the input arriving during the back-to-back scheduling intervals. This behavior
is very reasonable. However, the streaming UI does not show the input sizes of
these makeup batches correctly: they are all 0 from batch x to x+10. Fixing
this would be very helpful.
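
To illustrate the batch assignment described above, here is a toy model in
plain Python. It is only a sketch of the behavior as reported, not Spark's
scheduler code: the function name, its parameters, and the fixed `gap_ms`
between back-to-back submissions are all assumptions made for illustration.

```python
def assign_makeup_batches(arrivals_ms, checkpoint_ms, resume_ms, interval_ms, gap_ms):
    """Toy model (hypothetical, not Spark code) of makeup batches after a restart.

    arrivals_ms   -- wall-clock arrival times of input records, in ms
    checkpoint_ms -- batch time x of the checkpoint
    resume_ms     -- wall-clock time at which the job resumes
    interval_ms   -- configured batch interval
    gap_ms        -- assumed wall-clock gap between back-to-back submissions
    """
    batches = {}
    window_start = checkpoint_ms
    # Missing batch times: x + interval, x + 2*interval, ..., up to resume time.
    batch_times = range(checkpoint_ms + interval_ms, resume_ms + 1, interval_ms)
    for i, batch_time in enumerate(batch_times):
        # The first makeup batch sees everything that arrived up to the resume
        # time; each later one only sees what arrived during one short gap.
        window_end = resume_ms + i * gap_ms
        batches[batch_time] = [t for t in arrivals_ms if window_start < t <= window_end]
        window_start = window_end
    return batches


if __name__ == "__main__":
    arrivals = [1500, 4200, 9900, 10005, 10050]
    result = assign_makeup_batches(arrivals, 0, 10_000, 1000, 10)
    for batch_time, records in result.items():
        print(batch_time, len(records))
```

In this model the first makeup batch receives the whole backlog and the
remaining ones are tiny or empty, so the sizes one would expect the UI to
report are non-zero for at least the first batch.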