[ 
https://issues.apache.org/jira/browse/SPARK-51111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924771#comment-17924771
 ] 

Jungtaek Lim commented on SPARK-51111:
--------------------------------------

I don't like to make the promise I can't do. I'm afraid I can't review the 
change, as I do not have enough context to review DStream, and I'm one of the 
people who led deprecation of DStream. I'd kindly suggest you to work with 
customer to be in sync with recent direction of Spark. You can wait for other 
committers to chime, but deprecation of DStream had already happened, so there 
is the direction. I'm sorry.

> Streaming job gets stuck during the startup of Driver for the consumers are 
> in the rebalance
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-51111
>                 URL: https://issues.apache.org/jira/browse/SPARK-51111
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 3.0.0
>            Reporter: Mars
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: KafkaJobTest.scala
>
>
> When multiple Kafka `InputDStream` are used (number exceeding the  Driver's 
> physical machine `Runtime.getRuntime.availableProcessors`), and the consumers 
> are configured with the same group id. Driver will remains stuck and log 
> 'Request joining group due to: group is already rebalancing'.
> It is more likely to occur in the k8s environment, because the cores of the 
> Driver is exactly the same as the cores of the physical machine 
> (`Runtime.getRuntime.availableProcessors`) .
> ROOT CAUSE:
> In the code,  Scala {{ParVector}} is used to initialize the streams, and this 
> variable creates a thread pool with a number of threads equal to 
> {{{}Runtime.getRuntime.availableProcessors{}}}. When the number of streams 
> exceeds the number of threads in the pool, it needs to wait for the previous 
> stream to complete before initializing the next one, leading to sequential 
> waiting.
> If the consumers of different topic InputStreams are configured with the same 
> group id, the Kafka server will trigger a rebalance operation due to the new 
> consumers joining the group. This rebalance operation can be time-consuming, 
> and when there are many streams, it can cause the initialization process to 
> hang very long time.
> We'd better to create a thread pool with a specified number of threads for 
> initialization. This can help the Driver complete the process faster.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to