Saisai Shao created SPARK-11632:
-----------------------------------
Summary: Filter out empty partition for KafkaRDD
Key: SPARK-11632
URL: https://issues.apache.org/jira/browse/SPARK-11632
Project: Spark
Issue Type: Improvement
Components: Streaming
Reporter: Saisai Shao
Priority: Minor
For KafkaRDD, each partition's processed message number is calculated
beforehand, so empty partition could be filtered out to avoid submitting
unnecessary tasks. This could alleviate scheduling overhead if there's no data
come in, also makes dynamic allocation effective to shrink the resources (no
pending tasks).
For other receiver-based DStream, BlockRDD already support empty one, so if no
data injected at that time period, there will be no task submitted.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]