sryza opened a new pull request, #52770: URL: https://github.com/apache/spark/pull/52770
### What changes were proposed in this pull request?

Hides the `SparkUserAppException` and stack trace when a pipeline run fails.

### Why are the changes needed?

I hit this when I ran a pipeline that had no flows:

```
org.apache.spark.SparkUserAppException: User application exited with 1
	at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:127)
	at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:569)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1028)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:226)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:95)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1166)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1175)
	at org.apache.spark.deploy.SparkPipelines$.main(SparkPipelines.scala:42)
	at org.apache.spark.deploy.SparkPipelines.main(SparkPipelines.scala)
```

This information isn't relevant to the user.

### Does this PR introduce _any_ user-facing change?

Not for anything that's been released.

### How was this patch tested?

Ran the CLI and observed that this error was gone and the other output remained the same:

```
> spark-pipelines run --conf spark.sql.catalogImplementation=hive
WARNING: Using incubator modules: jdk.incubator.vector
2025-10-28 13:22:49: Loading pipeline spec from /Users/sandy.ryza/sdp-test/demo2/pipeline.yml...
2025-10-28 13:22:49: Creating Spark session...
WARNING: Using incubator modules: jdk.incubator.vector
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
25/10/28 13:22:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/conf.py:64: UserWarning: Failed to set spark.sql.catalogImplementation to Some(hive) due to [CANNOT_MODIFY_STATIC_CONFIG] Cannot modify the value of the static Spark config: "spark.sql.catalogImplementation". SQLSTATE: 46110
2025-10-28 13:22:53: Creating dataflow graph...
2025-10-28 13:22:53: Registering graph elements...
2025-10-28 13:22:53: Loading definitions. Root directory: '/Users/sandy.ryza/sdp-test/demo2'.
2025-10-28 13:22:53: Found 2 files matching glob 'transformations/**/*'
2025-10-28 13:22:53: Importing /Users/sandy.ryza/sdp-test/demo2/transformations/example_python_materialized_view.py...
2025-10-28 13:22:53: Registering SQL file /Users/sandy.ryza/sdp-test/demo2/transformations/example_sql_materialized_view.sql...
2025-10-28 13:22:53: Starting run...
25/10/28 13:22:55 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
25/10/28 13:22:55 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore [email protected]
Traceback (most recent call last):
  File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 413, in <module>
    run(
  File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 340, in run
    handle_pipeline_events(result_iter)
  File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/pipelines/spark_connect_pipeline.py", line 53, in handle_pipeline_events
    for result in iter:
  File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py", line 1186, in execute_command_as_iterator
  File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py", line 1619, in _execute_and_fetch_as_iterator
  File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py", line 1893, in _handle_error
  File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py", line 1966, in _handle_rpc_error
pyspark.errors.exceptions.connect.AnalysisException: [PIPELINE_DATASET_WITHOUT_FLOW] Pipeline dataset `spark_catalog`.`default`.`abc` does not have any defined flows. Please attach a query with the dataset's definition, or explicitly define at least one flow that writes to the dataset. SQLSTATE: 0A000
25/10/28 13:22:57 INFO ShutdownHookManager: Shutdown hook called
25/10/28 13:22:57 INFO ShutdownHookManager: Deleting directory /private/var/folders/1v/dqhbgmt10vl6v3tdlwvvx90r0000gp/T/spark-1214d042-270d-407f-8324-0dfcdf72c38c
```

### Was this patch authored or co-authored using generative AI tooling?

<!-- If generative AI tooling has been used in the process of authoring this patch, please include the phrase: 'Generated-by: ' followed by the name of the tool and its version. If no, write 'No'. Please refer to the [ASF Generative Tooling Guidance](https://www.apache.org/legal/generative-tooling.html) for details. -->
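For context on where the wrapper exception comes from: when the Python process that `spark-pipelines` launches exits with a nonzero status, `PythonRunner.main` throws `SparkUserAppException(exitCode)`, which then propagates out through `SparkSubmit` and is printed with its full JVM stack trace even though the Python side already printed a user-facing error. A minimal sketch of the kind of fix this implies, catching the exception at the `SparkPipelines` entry point and surfacing only the exit code; the `buildSubmitArgs` helper is a hypothetical stand-in for however the class actually assembles its spark-submit arguments, and the real patch may be structured differently:

```scala
package org.apache.spark.deploy

import org.apache.spark.SparkUserAppException

object SparkPipelines {
  def main(args: Array[String]): Unit = {
    try {
      // Delegate to spark-submit, which runs the Python pipelines CLI
      // through PythonRunner.
      SparkSubmit.main(buildSubmitArgs(args))
    } catch {
      // PythonRunner wraps a nonzero exit from the Python process in
      // SparkUserAppException. The Python CLI has already printed its own
      // error (e.g. the AnalysisException above), so propagate just the
      // exit code instead of letting the JVM dump this wrapper's stack
      // trace.
      case e: SparkUserAppException =>
        System.exit(e.exitCode)
    }
  }

  // Hypothetical helper: stands in for the real spark-submit argument
  // construction, which is not shown in this message.
  private def buildSubmitArgs(args: Array[String]): Array[String] = args
}
```

With a catch like this, a failing run still ends with the Python traceback and a nonzero exit status, but the redundant `SparkUserAppException` stack trace no longer appears in the output.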
