This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new a4678143f6bb [SPARK-46201][PYTHON][DOCS] Correct the typing of `schema_of_{csv, json, xml}`
a4678143f6bb is described below

commit a4678143f6bb9fd4356bf12a1db993cdfa22c7bd
Author: Ruifeng Zheng <ruife...@apache.org>
AuthorDate: Fri Dec 1 12:28:10 2023 -0800

    [SPARK-46201][PYTHON][DOCS] Correct the typing of `schema_of_{csv, json, xml}`
    
    ### What changes were proposed in this pull request?
    Correct the typing of `schema_of_{csv, json, xml}`
    
    ### Why are the changes needed?
    Although `ColumnOrName` is defined as `ColumnOrName = Union[Column, str]`,
    we should not use it for a parameter whose string value is not a column name.
    In `schema_of_{csv, json, xml}`, the string is the CSV/JSON/XML content itself.
    
    e.g. https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.schema_of_csv.html
    
    
![image](https://github.com/apache/spark/assets/7322292/97681f11-a360-4bce-8557-2366aa07a0b5)
    
    In this case, the annotation should follow the `schema` parameter of `from_csv`:
    
![image](https://github.com/apache/spark/assets/7322292/3e05da85-6aba-4d23-b6a6-6a2fb8c9b8fd)
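
The distinction above can be illustrated with a minimal, self-contained sketch. It deliberately avoids importing PySpark: `Column` below is a hypothetical stand-in class, and the function bodies are placeholders, so only the annotations mirror the signatures shown in the diff. It suggests that the alias and the explicit union are the same type, while only the alias name carries the "column name" wording into a rendered signature.

```python
from typing import Dict, Optional, Union


class Column:
    """Hypothetical stand-in for pyspark.sql.Column (not the real class)."""


# Alias as quoted in the commit message.
ColumnOrName = Union[Column, str]


def to_json(col: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Column:
    # A str argument here is a column name, so the alias describes it well.
    return Column()  # placeholder body, assumption for illustration only


def schema_of_json(
    json: Union[Column, str], options: Optional[Dict[str, str]] = None
) -> Column:
    # A str argument here is a literal JSON document, so the union is spelled out.
    return Column()  # placeholder body, assumption for illustration only


# The alias and the explicit union compare equal as types.
same_type = ColumnOrName == Union[Column, str]

# The raw annotations differ: one keeps the alias name, the other the union.
alias_annotation = to_json.__annotations__["col"]
explicit_annotation = schema_of_json.__annotations__["json"]
```

Since the two spellings are equivalent as types, a change like this one should affect only how the signature reads in generated documentation, which matches the "doc-change" note in the commit message.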
    
    ### Does this PR introduce _any_ user-facing change?
    yes, doc-change
    
    ### How was this patch tested?
    ci
    
    ### Was this patch authored or co-authored using generative AI tooling?
    no
    
    Closes #44108 from zhengruifeng/py_doc_schema_of.
    
    Authored-by: Ruifeng Zheng <ruife...@apache.org>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 python/pyspark/sql/functions/builtin.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/python/pyspark/sql/functions/builtin.py b/python/pyspark/sql/functions/builtin.py
index 5e5c70322ec9..ac237f10c2e7 100644
--- a/python/pyspark/sql/functions/builtin.py
+++ b/python/pyspark/sql/functions/builtin.py
@@ -13753,7 +13753,7 @@ def to_json(col: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Co
 
 
 @_try_remote_functions
-def schema_of_json(json: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Column:
+def schema_of_json(json: Union[Column, str], options: Optional[Dict[str, str]] = None) -> Column:
     """
     Parses a JSON string and infers its schema in DDL format.
 
@@ -13941,7 +13941,7 @@ def from_xml(
 
 
 @_try_remote_functions
-def schema_of_xml(xml: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Column:
+def schema_of_xml(xml: Union[Column, str], options: Optional[Dict[str, str]] = None) -> Column:
     """
     Parses a XML string and infers its schema in DDL format.
 
@@ -14055,7 +14055,7 @@ def to_xml(col: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Col
 
 
 @_try_remote_functions
-def schema_of_csv(csv: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Column:
+def schema_of_csv(csv: Union[Column, str], options: Optional[Dict[str, str]] = None) -> Column:
     """
     Parses a CSV string and infers its schema in DDL format.
 


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
