This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new ac935f5074d [SPARK-46427][PYTHON][SQL] Change Python Data Source's description to be pretty in explain
ac935f5074d is described below

commit ac935f5074daac3fc2511f196b32c98007b61e53
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Fri Dec 15 22:02:31 2023 -0800

    [SPARK-46427][PYTHON][SQL] Change Python Data Source's description to be pretty in explain
    
    ### What changes were proposed in this pull request?
    
    This PR implements `Scan.description` for the Python Data Source scan so that it has a more readable string description in `DataFrame.explain`.
    
    ```python
    spark.table("pythonSourceTable").explain(True)
    ```
    
    Before:
    
    ```
    == Physical Plan ==
    *(1) Project [x#0, y#1]
    +- BatchScan test[x#0, y#1] class org.apache.spark.sql.execution.python.PythonTableProvider$$anon$1$$anon$2 RuntimeFilters: []
    ```
    
    After:
    
    ```
    == Physical Plan ==
    *(1) Project [x#0, y#1]
    +- BatchScan test[x#0, y#1] (Python) RuntimeFilters: []
    ```
    
    ### Why are the changes needed?
    
    Currently the plan shows the class name of a nested anonymous class, which is hard to read.
    
    ### Does this PR introduce _any_ user-facing change?
    
    It changes the plan description, but the main change has not been released yet, so no.
    
    ### How was this patch tested?
    
    Manually tested as above.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #44379 from HyukjinKwon/SPARK-46427.
    
    Authored-by: Hyukjin Kwon <gurwls...@apache.org>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 .../apache/spark/sql/execution/python/UserDefinedPythonDataSource.scala | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/python/UserDefinedPythonDataSource.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/python/UserDefinedPythonDataSource.scala
index 5e978a90088..047a133a322 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/python/UserDefinedPythonDataSource.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/python/UserDefinedPythonDataSource.scala
@@ -101,6 +101,8 @@ class PythonTableProvider extends TableProvider {
             new PythonPartitionReaderFactory(
               source, readerFunc, outputSchema, jobArtifactUUID)
           }
+
+          override def description: String = "(Python)"
         }
       }
 

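The effect of the override can be illustrated with a small pure-Python model. This is only a sketch, not Spark code: the `Scan` class, the two factory functions, and `simple_string` are hypothetical stand-ins. It assumes, consistent with the "Before" output in the commit message, that the default description falls back to the class's own name, and that the plan line is assembled from the node name, the output columns, the description, and the runtime filters.

```python
class Scan:
    """Toy stand-in for a scan object; not Spark's actual interface."""

    def description(self) -> str:
        # Assumed default: fall back to the class's qualified name, which is
        # noisy for classes defined inside functions or other classes.
        cls = type(self)
        return f"class {cls.__module__}.{cls.__qualname__}"


def make_default_scan() -> Scan:
    # Nested class, playing the role of an anonymous inner class.
    class _AnonymousScan(Scan):
        pass

    return _AnonymousScan()


def make_python_scan() -> Scan:
    # Same nested class, but with the description overridden as in the diff.
    class _AnonymousScan(Scan):
        def description(self) -> str:
            return "(Python)"

    return _AnonymousScan()


def simple_string(table: str, output: list[str], scan: Scan) -> str:
    """Hypothetical formatter mimicking the layout of the BatchScan plan line."""
    columns = ", ".join(output)
    return f"BatchScan {table}[{columns}] {scan.description()} RuntimeFilters: []"


if __name__ == "__main__":
    # Without the override, the nested class name leaks into the plan line.
    print(simple_string("test", ["x#0", "y#1"], make_default_scan()))
    # With the override:
    # BatchScan test[x#0, y#1] (Python) RuntimeFilters: []
    print(simple_string("test", ["x#0", "y#1"], make_python_scan()))
```

Overriding the description keeps the plan output stable even if the nested class is later renamed or restructured, since the string no longer depends on the class name.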

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
