Dipesh Chandra Maurya created SPARK-55271:
---------------------------------------------
Summary: NullPointerException in Kafka Micro-Batch Streaming
Progress Reporting
Key: SPARK-55271
URL: https://issues.apache.org/jira/browse/SPARK-55271
Project: Spark
Issue Type: Bug
Components: Structured Streaming
Affects Versions: 4.1.1
Reporter: Dipesh Chandra Maurya
h3. Root Cause
The error originates in {{KafkaMicroBatchStream.metrics()}} at line 520, where
the code attempts to call {{.get()}} on a Scala {{Option}} that contains
{{{}null{}}}, then immediately invokes {{.map()}} on the null result. This
violates Scala's Option contract and causes the NullPointerException.
h3. Error Flow
# The streaming query completes a micro-batch trigger
# Progress reporting attempts to extract source metrics via
{{ProgressContext.extractSourceProgress()}}
# The KafkaMicroBatchStream is queried for its metrics
# An unsafe {{.get()}} call on an empty or null Option returns null
# Attempting to call {{.map()}} on the null value throws NullPointerException
h3. Impact
* The streaming query crashes and stops processing
* Progress information cannot be generated for the current batch
* The failure occurs during the finishTrigger phase, after batch processing
completes
h3. Technical Details
* {*}Component{*}: Spark Structured Streaming with Kafka source
* {*}Class{*}: {{org.apache.spark.sql.kafka010.KafkaMicroBatchStream}}
* {*}Method{*}: {{metrics()}} at line 520
* {*}Trigger Type{*}: ProcessingTime (micro-batch mode)
h3. Likely Causes
* Missing or unavailable Kafka metrics configuration
* Empty/uninitialized metrics collection in the Kafka source
* Race condition where metrics are accessed before initialization
* Incompatibility between Spark and Kafka connector versions
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]