GitHub user lonelytrooper opened a pull request:
https://github.com/apache/spark/pull/19274
[SPARK-22056] Add subconcurrency for KafkaRDDPartition
JIRA Issue: https://issues.apache.org/jira/browse/SPARK-22056
When Spark Streaming consumes data from Kafka using the direct approach,
Kafka partitions and KafkaRDDPartitions in Spark Streaming currently map
one-to-one (a bijection). To increase the computing capacity of Spark
Streaming, we usually have to increase the number of partitions in Kafka,
but adding too many partitions can cause problems in Kafka, for example
with leader election.
So we introduce a new mechanism that changes the bijection to a one-to-many
mapping, controlled by a new parameter named "topic.partition.subconcurrency".
This mechanism divides one KafkaRDDPartition into several according to the
parameter, which lets Spark Streaming use computing resources more
efficiently and avoids the problems caused by increasing the number of Kafka
partitions.
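
As a rough illustration of the one-to-many idea (not the code in this patch),
the sketch below splits the offset range of a single Kafka partition into
several contiguous sub-ranges, one per sub-partition. The function name
`split_offset_range` and the even-split policy are assumptions made for the
example; only the parameter name "topic.partition.subconcurrency" comes from
the description above.

```python
from typing import List, Tuple


def split_offset_range(
    from_offset: int, until_offset: int, subconcurrency: int
) -> List[Tuple[int, int]]:
    """Split [from_offset, until_offset) into contiguous sub-ranges.

    Hypothetical helper: `subconcurrency` plays the role of the
    "topic.partition.subconcurrency" setting. The real patch may divide
    the range differently.
    """
    if subconcurrency < 1:
        raise ValueError("subconcurrency must be >= 1")

    total = until_offset - from_offset
    if total <= 0:
        # Nothing to read: keep a single (empty) range.
        return [(from_offset, until_offset)]

    # Never create more sub-ranges than there are messages.
    parts = min(subconcurrency, total)
    base, extra = divmod(total, parts)

    ranges = []
    start = from_offset
    for i in range(parts):
        # The first `extra` sub-ranges take one additional offset each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges
```

With a subconcurrency of 3, offsets 100 (inclusive) to 110 (exclusive) would
be split into (100, 104), (104, 107) and (107, 110), so three tasks could
read the same Kafka partition in parallel without overlapping.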
(Please fill in changes proposed in this fix)
## How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise,
remove this)
Please review http://spark.apache.org/contributing.html before opening a
pull request.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/lonelytrooper/spark add_partition_concurrency
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19274.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19274
----
commit a89663411e568f265103f0b695168d4db68a2b36
Author: bjyfhanfei <[email protected]>
Date: 2017-09-04T09:00:25Z
add partition subconcurrency
commit d1132195d6b2087be4f18ad25614836c46512fe7
Author: bjyfhanfei <[email protected]>
Date: 2017-09-19T06:12:29Z
add topic.partition.subconcurrency
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]