Github user mayuehappy commented on the issue:

    https://github.com/apache/spark/pull/19233
  
    @srowen Thanks a lot for your reply. Maybe I didn't express it well, so let
    me describe a concrete scenario. Suppose we use Spark Streaming to consume a
    Kafka topic with 10 partitions; that produces a KafkaRDD with 10 partitions,
    which means each batch has 10 tasks. If each executor has 1 core and
    concurrentJobs = 1, then at most 10 executors can do useful work at the same
    time. Now suppose we enable dynamic allocation and set maxExecutors to 15.
    Spark Streaming's scaling policy is based on the ratio
    ProcessTime / BatchSize: whenever ProcessTime / BatchSize > scalingUpRatio,
    the job keeps requesting more executors from YARN, but the new executors
    beyond 10 can never receive a task, so they are useless. That is why I think
    we should cap the effective maximum at
    MaxExecutor = numTasks * concurrentJobs * cpuPerTask / coresPerExecutor.
    I hope this makes my point clearer.
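
    To make the idea concrete, here is a rough sketch of the cap I have in mind.
    This is not code from this PR; all the names (effectiveMaxExecutors,
    numTasks, concurrentJobs, cpusPerTask, coresPerExecutor, configuredMax) are
    just illustrative, standing in for the number of Kafka partitions,
    spark.streaming.concurrentJobs, spark.task.cpus, spark.executor.cores and
    spark.dynamicAllocation.maxExecutors:

    ```scala
    // Illustrative sketch only: cap the dynamic-allocation maximum at the
    // number of executors that can actually run tasks for one batch.
    object MaxExecutorCap {
      def effectiveMaxExecutors(
          numTasks: Int,          // e.g. number of Kafka partitions = 10
          concurrentJobs: Int,    // spark.streaming.concurrentJobs, default 1
          cpusPerTask: Int,       // spark.task.cpus, default 1
          coresPerExecutor: Int,  // spark.executor.cores, e.g. 1
          configuredMax: Int      // spark.dynamicAllocation.maxExecutors, e.g. 15
      ): Int = {
        // Executors beyond this count can never get a task of the current
        // batch, so asking YARN for them only wastes cluster resources.
        val usefulExecutors = math.ceil(
          numTasks.toDouble * concurrentJobs * cpusPerTask / coresPerExecutor
        ).toInt
        math.min(configuredMax, usefulExecutors)
      }
    }
    ```

    With the numbers above this gives min(15, 10 * 1 * 1 / 1) = 10, so the 5
    extra executors would never be requested in the first place.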

