[ 
https://issues.apache.org/jira/browse/SPARK-55271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-55271:
-----------------------------------
    Labels: pull-request-available  (was: )

> NullPointerException in Kafka Micro-Batch Streaming Progress Reporting
> ----------------------------------------------------------------------
>
>                 Key: SPARK-55271
>                 URL: https://issues.apache.org/jira/browse/SPARK-55271
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 4.1.1
>            Reporter: Dipesh Chandra Maurya
>            Priority: Critical
>              Labels: pull-request-available
>
> h3. Root Cause
> The error originates in {{KafkaMicroBatchStream.metrics()}} at line 520, 
> where the code attempts to call {{.get()}} on a Scala {{Option}} that 
> contains {{{}null{}}}, then immediately invokes {{.map()}} on the null 
> result. This violates Scala's Option contract and causes the 
> NullPointerException.
> h3. Error Flow
>  # The streaming query completes a micro-batch trigger
>  # Progress reporting attempts to extract source metrics via 
> {{ProgressContext.extractSourceProgress()}}
>  # The KafkaMicroBatchStream is queried for its metrics
>  # An unsafe {{.get()}} call on an empty or null Option returns null
>  # Attempting to call {{.map()}} on the null value throws NullPointerException
> h3. Impact
>  * The streaming query crashes and stops processing
>  * Progress information cannot be generated for the current batch
>  * The failure occurs during the finishTrigger phase, after batch processing 
> completes
> h3. Technical Details
>  * {*}Component{*}: Spark Structured Streaming with Kafka source
>  * {*}Class{*}: {{org.apache.spark.sql.kafka010.KafkaMicroBatchStream}}
>  * {*}Method{*}: {{metrics()}} at line 520
>  * {*}Trigger Type{*}: ProcessingTime (micro-batch mode)
> h3. Likely Causes
>  * Missing or unavailable Kafka metrics configuration
>  * Empty/uninitialized metrics collection in the Kafka source
>  * Race condition where metrics are accessed before initialization
>  * Incompatibility between Spark and Kafka connector versions



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to