[ 
https://issues.apache.org/jira/browse/SPARK-51111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mars updated SPARK-51111:
-------------------------
    Description: 
When multiple Kafka `InputDStream` are used (number exceeding the  Driver's 
physical machine `Runtime.getRuntime.availableProcessors`), and the consumers 
are configured with the same group id. Driver will remains stuck and log 
'Request joining group due to: group is already rebalancing'.
It is more likely to occur in the k8s environment, because the cores of the 
Driver is exactly the same as the cores of the physical machine 
(`Runtime.getRuntime.availableProcessors`) .

The root cause:
In the code,  Scala {{ParVector}} is used to initialize the streams, and this 
variable creates a thread pool with a number of threads equal to 
{{{}Runtime.getRuntime.availableProcessors{}}}. When the number of streams 
exceeds the number of threads in the pool, it needs to wait for the previous 
stream to complete before initializing the next one, leading to sequential 
waiting.

If the consumers of different topic InputStreams are configured with the same 
group id, the Kafka server will trigger a rebalance operation due to the new 
consumers joining the group. This rebalance operation can be time-consuming, 
and when there are many streams, it can cause the initialization process to 
hang very long time.

  was:
When multiple Kafka `InputDStream` are used (number exceeding the  Driver's 
physical machine `Runtime.getRuntime.availableProcessors`), and the consumers 
are configured with the same group id. Driver will remains stuck and log 
'Request joining group due to: group is already rebalancing'.
It is more likely to occur in the k8s environment, because the cores of the 
Driver is exactly the same as the cores of the physical machine 
(`Runtime.getRuntime.availableProcessors`) .


> Streaming job gets stuck during the startup of Driver for the consumers are 
> in the rebalance
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-51111
>                 URL: https://issues.apache.org/jira/browse/SPARK-51111
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 2.0.0
>            Reporter: Mars
>            Priority: Major
>
> When multiple Kafka `InputDStream` are used (number exceeding the  Driver's 
> physical machine `Runtime.getRuntime.availableProcessors`), and the consumers 
> are configured with the same group id. Driver will remains stuck and log 
> 'Request joining group due to: group is already rebalancing'.
> It is more likely to occur in the k8s environment, because the cores of the 
> Driver is exactly the same as the cores of the physical machine 
> (`Runtime.getRuntime.availableProcessors`) .
> The root cause:
> In the code,  Scala {{ParVector}} is used to initialize the streams, and this 
> variable creates a thread pool with a number of threads equal to 
> {{{}Runtime.getRuntime.availableProcessors{}}}. When the number of streams 
> exceeds the number of threads in the pool, it needs to wait for the previous 
> stream to complete before initializing the next one, leading to sequential 
> waiting.
> If the consumers of different topic InputStreams are configured with the same 
> group id, the Kafka server will trigger a rebalance operation due to the new 
> consumers joining the group. This rebalance operation can be time-consuming, 
> and when there are many streams, it can cause the initialization process to 
> hang very long time.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to