[ https://issues.apache.org/jira/browse/SPARK-53735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-53735:
----------------------------------
        Parent: SPARK-51727
    Issue Type: Sub-task  (was: Improvement)

> Hide server-side JVM stack traces by default in spark-pipelines output
> ----------------------------------------------------------------------
>
>                 Key: SPARK-53735
>                 URL: https://issues.apache.org/jira/browse/SPARK-53735
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Declarative Pipelines
>    Affects Versions: 4.1.0
>            Reporter: Sanford Ryza
>            Assignee: Sanford Ryza
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.1.0
>
>
> Error output for failing pipeline runs can be very verbose and includes a lot of
> information that is not relevant to the user. We should hide the server-side JVM
> stack traces by default. The output from a failing run below illustrates the
> problem; a sketch of one possible way to trim the trace follows it.
>  
> 2025-09-26 17:07:50: Failed to resolve flow: 'spark_catalog.default.rental_bike_trips'.
> Error: [TABLE_OR_VIEW_NOT_FOUND] The table or view `spark_catalog`.`default`.`rental_bike_trips_raws` cannot be found. Verify the spelling and correctness of the schema and catalog.
> If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
> To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS. SQLSTATE: 42P01;
> 'UnresolvedRelation [spark_catalog, default, rental_bike_trips_raws], [], true
>  
> Traceback (most recent call last):
>   File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 360, in <module>
>     run(
>   File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 287, in run
>     handle_pipeline_events(result_iter)
>   File "/Users/sandy.ryza/oss/python/pyspark/pipelines/spark_connect_pipeline.py", line 53, in handle_pipeline_events
>     for result in iter:
>   File "/Users/sandy.ryza/oss/python/pyspark/sql/connect/client/core.py", line 1169, in execute_command_as_iterator
>     for response in self._execute_and_fetch_as_iterator(req, observations or {}):
>   File "/Users/sandy.ryza/oss/python/pyspark/sql/connect/client/core.py", line 1559, in _execute_and_fetch_as_iterator
>     self._handle_error(error)
>   File "/Users/sandy.ryza/oss/python/pyspark/sql/connect/client/core.py", line 1833, in _handle_error
>     self._handle_rpc_error(error)
>   File "/Users/sandy.ryza/oss/python/pyspark/sql/connect/client/core.py", line 1904, in _handle_rpc_error
>     raise convert_exception(
> pyspark.errors.exceptions.connect.AnalysisException:
> Failed to resolve flows in the pipeline.
>  
> A flow can fail to resolve because the flow itself contains errors or because it reads from an upstream flow which failed to resolve.
>  
> Flows with errors: spark_catalog.default.rental_bike_trips
> Flows that failed due to upstream errors:
>  
> To view the exceptions that were raised while resolving these flows, look for flow failures that precede this log.
>  
> JVM stacktrace:
> org.apache.spark.sql.pipelines.graph.UnresolvedPipelineException
> at org.apache.spark.sql.pipelines.graph.GraphValidations.validateSuccessfulFlowAnalysis(GraphValidations.scala:284)
> at org.apache.spark.sql.pipelines.graph.GraphValidations.validateSuccessfulFlowAnalysis$(GraphValidations.scala:247)
> at org.apache.spark.sql.pipelines.graph.DataflowGraph.validateSuccessfulFlowAnalysis(DataflowGraph.scala:33)
> at org.apache.spark.sql.pipelines.graph.DataflowGraph.$anonfun$validationFailure$1(DataflowGraph.scala:186)
> at scala.util.Try$.apply(Try.scala:217)
> at org.apache.spark.sql.pipelines.graph.DataflowGraph.validationFailure$lzycompute(DataflowGraph.scala:185)
> at org.apache.spark.sql.pipelines.graph.DataflowGraph.validationFailure(DataflowGraph.scala:185)
> at org.apache.spark.sql.pipelines.graph.DataflowGraph.validate(DataflowGraph.scala:173)
> at org.apache.spark.sql.pipelines.graph.PipelineExecution.resolveGraph(PipelineExecution.scala:109)
> at org.apache.spark.sql.pipelines.graph.PipelineExecution.startPipeline(PipelineExecution.scala:48)
> at org.apache.spark.sql.pipelines.graph.PipelineExecution.runPipeline(PipelineExecution.scala:63)
> at org.apache.spark.sql.connect.pipelines.PipelinesHandler$.startRun(PipelinesHandler.scala:294)
> at org.apache.spark.sql.connect.pipelines.PipelinesHandler$.handlePipelinesCommand(PipelinesHandler.scala:93)
> at org.apache.spark.sql.connect.planner.SparkConnectPlanner.handlePipelineCommand(SparkConnectPlanner.scala:2727)
> at org.apache.spark.sql.connect.planner.SparkConnectPlanner.process(SparkConnectPlanner.scala:2697)
> at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.handleCommand(ExecuteThreadRunner.scala:322)
> at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1(ExecuteThreadRunner.scala:224)
> at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1$adapted(ExecuteThreadRunner.scala:196)
> at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$2(SessionHolder.scala:349)
> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:804)
> at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$1(SessionHolder.scala:349)
> at org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:94)
> at org.apache.spark.sql.artifact.ArtifactManager.$anonfun$withResources$1(ArtifactManager.scala:112)
> at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:187)
> at org.apache.spark.sql.artifact.ArtifactManager.withClassLoaderIfNeeded(ArtifactManager.scala:102)
> at org.apache.spark.sql.artifact.ArtifactManager.withResources(ArtifactManager.scala:111)
> at org.apache.spark.sql.connect.service.SessionHolder.withSession(SessionHolder.scala:348)
> at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.executeInternal(ExecuteThreadRunner.scala:196)
> at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.org$apache$spark$sql$connect$execution$ExecuteThreadRunner$$execute(ExecuteThreadRunner.scala:125)
> at org.apache.spark.sql.connect.execution.ExecuteThreadRunner$ExecutionThread.run(ExecuteThreadRunner.scala:347)
> 25/09/26 10:07:50 INFO ShutdownHookManager: Shutdown hook called
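>  
> For illustration only, here is a minimal sketch of the kind of trimming the CLI
> could apply before printing the error. It assumes the "JVM stacktrace:" marker
> seen in the output above and a hypothetical verbose flag; the function name and
> approach are illustrative, not the actual pyspark.pipelines implementation.
>  
> # Hypothetical helper: strip the server-side JVM stack trace from an error
> # message unless the user explicitly asked for verbose output.
> JVM_STACKTRACE_MARKER = "JVM stacktrace:"  # marker taken from the example output above
>  
> def format_pipeline_error(message: str, verbose: bool = False) -> str:
>     """Return the error message with the server-side JVM stack trace removed,
>     unless verbose output was requested."""
>     if verbose:
>         return message
>     marker_index = message.find(JVM_STACKTRACE_MARKER)
>     if marker_index == -1:
>         return message
>     return message[:marker_index].rstrip()
>  
> if __name__ == "__main__":
>     sample = (
>         "Failed to resolve flows in the pipeline.\n\n"
>         "JVM stacktrace:\n"
>         "org.apache.spark.sql.pipelines.graph.UnresolvedPipelineException\n"
>     )
>     # Prints only the user-facing first line; the JVM stack trace is hidden.
>     print(format_pipeline_error(sample))
>  
> A hypothetical opt-in flag or environment variable could restore the full
> server-side trace when debugging.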


