Yongjia Wang created SPARK-11152:
------------------------------------

             Summary: Streaming UI: Input sizes are 0 for makeup batches started from a checkpoint
                 Key: SPARK-11152
                 URL: https://issues.apache.org/jira/browse/SPARK-11152
             Project: Spark
          Issue Type: Bug
          Components: Streaming, Web UI
            Reporter: Yongjia Wang
            Priority: Minor

Suppose a streaming job is restarted from a checkpoint taken at batch time x, and the current time when the job resumes is x+10. Because Spark schedules the missing batches x+1 through x+10 without any metadata, it packs all the backlogged input into batch x+1 and then assigns any new input to batches x+2 through x+10 immediately, without waiting. The result is a series of tiny batches that capture only the input arriving during the back-to-back scheduling intervals. This behavior is very reasonable.

However, the streaming UI does not show the correct input sizes for these makeup batches: they are all shown as 0 for batches x through x+10. Fixing this would be very helpful.
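
To make the scenario concrete, here is a toy model of the batch assignment described in this report. It is only a sketch of the reported behavior, not Spark's scheduler code: the function name, its parameters, the 10 ms gap between makeup batches, and the arrival data are all hypothetical, chosen for illustration.

```python
def assign_makeup_batches(arrivals, checkpoint_batch, missed_batches,
                          restart_ms, gap_ms):
    """Toy model (not Spark code) of input assignment after a checkpoint restart.

    arrivals         -- list of (arrival_time_ms, size) pairs for incoming input
    checkpoint_batch -- index x of the last checkpointed batch
    missed_batches   -- number of makeup batches to schedule (x+1 .. x+missed)
    restart_ms       -- wall-clock time at which the job resumes
    gap_ms           -- assumed wall-clock gap between back-to-back makeup batches
    """
    sizes = {}
    first = checkpoint_batch + 1

    # Batch x+1 receives everything that piled up before the restart.
    sizes[first] = sum(size for t, size in arrivals if t <= restart_ms)

    # Batches x+2 onward fire back to back, so each one only captures
    # input that arrives during its own short scheduling window.
    window_start = restart_ms
    for batch in range(first + 1, checkpoint_batch + missed_batches + 1):
        window_end = window_start + gap_ms
        sizes[batch] = sum(size for t, size in arrivals
                           if window_start < t <= window_end)
        window_start = window_end
    return sizes


# Hypothetical data: 100 units of input per second while the job was down
# (x = 0, restart at 10 s), then two small arrivals just after the restart.
arrivals = [(1000 * i, 100) for i in range(1, 11)] + [(10005, 3), (10015, 2)]
sizes = assign_makeup_batches(arrivals, checkpoint_batch=0, missed_batches=10,
                              restart_ms=10000, gap_ms=10)
for batch in sorted(sizes):
    print(f"batch x+{batch}: input size {sizes[batch]}")
```

In this model, batch x+1 carries the whole backlog and the following batches carry small or empty inputs. These per-batch sizes are what one would expect the streaming UI to display, rather than 0 for every makeup batch.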