[
https://issues.apache.org/jira/browse/SPARK-55505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated SPARK-55505:
-----------------------------------
Labels: pull-request-available (was: )
> NumberFormatException in SQLExecution.withNewExecutionId0 due to re-reading
> EXECUTION_ROOT_ID_KEY
> -------------------------------------------------------------------------------------------------
>
> Key: SPARK-55505
> URL: https://issues.apache.org/jira/browse/SPARK-55505
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.2.0
> Reporter: Yicong Huang
> Priority: Minor
> Labels: pull-request-available
>
> {{SQLExecution.withNewExecutionId0}} can throw {{NumberFormatException:
> Cannot parse null string}} at the line:
> {code:scala}
> val rootExecutionId = sc.getLocalProperty(EXECUTION_ROOT_ID_KEY).toLong
> {code}
> The current code checks if {{EXECUTION_ROOT_ID_KEY}} is null, sets it if so,
> then *re-reads* it from local properties assuming it is non-null:
> {code:scala}
> if (sc.getLocalProperty(EXECUTION_ROOT_ID_KEY) == null) {
>   sc.setLocalProperty(EXECUTION_ROOT_ID_KEY, executionId.toString)
>   sc.addJobTag(executionIdJobTag(sparkSession, executionId))
> }
> val rootExecutionId = sc.getLocalProperty(EXECUTION_ROOT_ID_KEY).toLong // crashes here
> {code}
> This re-read can return null under high-concurrency scenarios involving
> nested thread pools (e.g., {{CrossValidator(parallelism=4)}} with
> {{OneVsRest(parallelism=2)}} running from a Python {{ThreadPoolExecutor}}).
> The fix is to read the property once and use the value directly, avoiding the
> re-read:
> {code:scala}
> val existingRootId = sc.getLocalProperty(EXECUTION_ROOT_ID_KEY)
> val rootExecutionId = if (existingRootId != null) {
>   existingRootId.toLong
> } else {
>   sc.setLocalProperty(EXECUTION_ROOT_ID_KEY, executionId.toString)
>   sc.addJobTag(executionIdJobTag(sparkSession, executionId))
>   executionId
> }
> {code}
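> As a rough illustration of why reading once is safer, here is a standalone
> sketch. It is not Spark code: {{LocalProps}}, {{DroppedWriteProps}},
> {{MapProps}}, {{RootIdSketch}}, and the key name are hypothetical stand-ins
> for the {{SparkContext}} local-property calls. {{DroppedWriteProps}} only
> forces the "re-read returns null" symptom; it does not model how that
> happens in Spark.
> {code:scala}
> import scala.collection.mutable
>
> // Hypothetical stand-in for SparkContext local properties (assumption).
> trait LocalProps {
>   def get(key: String): String // null when unset, like getLocalProperty
>   def set(key: String, value: String): Unit
> }
>
> // Ignores writes, so every read returns null. This only simulates the
> // symptom from the report; the real cause is not modelled here.
> final class DroppedWriteProps extends LocalProps {
>   def get(key: String): String = null
>   def set(key: String, value: String): Unit = ()
> }
>
> // Well-behaved store backed by a mutable map.
> final class MapProps extends LocalProps {
>   private val m = mutable.Map.empty[String, String]
>   def get(key: String): String = m.get(key).orNull
>   def set(key: String, value: String): Unit = m.update(key, value)
> }
>
> object RootIdSketch {
>   val RootKey = "sketch.execution.root.id" // illustrative key name
>
>   // Old shape: check, set, then read again and parse the second read.
>   def checkThenReRead(p: LocalProps, executionId: Long): Long = {
>     if (p.get(RootKey) == null) p.set(RootKey, executionId.toString)
>     p.get(RootKey).toLong // NumberFormatException if this read is null
>   }
>
>   // New shape: read once; fall back to the value already in hand.
>   def readOnce(p: LocalProps, executionId: Long): Long = {
>     val existing = p.get(RootKey)
>     if (existing != null) existing.toLong
>     else {
>       p.set(RootKey, executionId.toString)
>       executionId
>     }
>   }
> }
> {code}
> With a {{DroppedWriteProps}} instance, {{checkThenReRead}} throws
> {{NumberFormatException}}, while {{readOnce}} returns the {{executionId}}
> it was given. With {{MapProps}}, both return the same value.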
> CI failure demonstrating the flaky test: [GitHub Actions
> Run|https://github.com/Yicong-Huang/spark/actions/runs/21961599500/attempts/1]
> (Attempt 1 failed, attempt 2 passed. Failed test:
> {{test_save_load_pipeline_estimator}} in {{CrossValidatorIOPipelineTests}})
> Stack trace (abbreviated):
> {code}
> java.lang.NumberFormatException: Cannot parse null string
> at java.lang.Long.parseLong(Long.java:550)
> at SQLExecution.scala:115
> {code}