[ 
https://issues.apache.org/jira/browse/SPARK-51111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924548#comment-17924548
 ] 

Jungtaek Lim commented on SPARK-51111:
--------------------------------------

Hi, you are using an EOL Spark version with a deprecated component (DStream). We 
strongly recommend migrating to Structured Streaming. In recent Spark versions, 
Structured Streaming does not use a Kafka consumer but uses the Kafka AdminClient 
to retrieve the topic partitions, which eliminates this issue (as the AdminClient 
does not belong to a consumer group).
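
For illustration only, here is a minimal PySpark sketch of what such a 
migration might look like: a single Structured Streaming Kafka source 
subscribing to several topics, in place of several `InputDStream`s sharing one 
group id. The helper names (`kafka_source_options`, `build_stream`), the broker 
address, and the topic names are hypothetical and not taken from the ticket. 
The sketch also assumes the `spark-sql-kafka-0-10` package is on the classpath 
and that `spark` is an existing `SparkSession`.

```python
def kafka_source_options(bootstrap_servers, topics):
    """Build options for the Structured Streaming Kafka source.

    Hypothetical helper for illustration. The option keys are the ones
    documented for the Kafka source; the values are placeholders.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        # One source can subscribe to a comma-separated list of topics.
        "subscribe": ",".join(topics),
        "startingOffsets": "latest",
    }


def build_stream(spark, bootstrap_servers, topics):
    """Return a streaming DataFrame reading from the given Kafka topics.

    `spark` is assumed to be an existing SparkSession created elsewhere.
    """
    reader = spark.readStream.format("kafka")
    for key, value in kafka_source_options(bootstrap_servers, topics).items():
        reader = reader.option(key, value)
    return reader.load()
```

A call such as `build_stream(spark, "broker:9092", ["topic_a", "topic_b"])` 
would then return one streaming DataFrame covering both topics; whether a 
single source fits the original job depends on details not given in the ticket.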

I can help close the ticket if you don't mind, or keep it open if you would 
like to see other input. Thanks.

> Streaming job gets stuck during the startup of Driver for the consumers are 
> in the rebalance
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-51111
>                 URL: https://issues.apache.org/jira/browse/SPARK-51111
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 2.0.0
>            Reporter: Mars
>            Priority: Major
>
> When multiple Kafka `InputDStream`s are used (more than the Driver's 
> physical machine `Runtime.getRuntime.availableProcessors`) and the consumers 
> are configured with the same group id, the Driver remains stuck and logs 
> 'Request joining group due to: group is already rebalancing'.
> This is more likely to occur in a k8s environment, because the Driver's core 
> count is exactly the same as the physical machine's core count 
> (`Runtime.getRuntime.availableProcessors`).


