Saisai Shao created SPARK-11632:
-----------------------------------

             Summary: Filter out empty partition for KafkaRDD
                 Key: SPARK-11632
                 URL: https://issues.apache.org/jira/browse/SPARK-11632
             Project: Spark
          Issue Type: Improvement
          Components: Streaming
            Reporter: Saisai Shao
            Priority: Minor


For KafkaRDD, each partition's processed message number is calculated 
beforehand, so empty partition could be filtered out to avoid submitting 
unnecessary tasks. This could alleviate scheduling overhead if there's no data 
come in, also makes dynamic allocation effective to shrink the resources (no 
pending tasks).

For other receiver-based DStream, BlockRDD already support empty one, so if no 
data injected at that time period, there will be no task submitted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to