This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new a4678143f6bb [SPARK-46201][PYTHON][DOCS] Correct the typing of `schema_of_{csv, json, xml}`
a4678143f6bb is described below

commit a4678143f6bb9fd4356bf12a1db993cdfa22c7bd
Author: Ruifeng Zheng <ruife...@apache.org>
AuthorDate: Fri Dec 1 12:28:10 2023 -0800

    [SPARK-46201][PYTHON][DOCS] Correct the typing of `schema_of_{csv, json, xml}`

    ### What changes were proposed in this pull request?
    Correct the typing of `schema_of_{csv, json, xml}`

    ### Why are the changes needed?
    although `ColumnOrName` is defined as `ColumnOrName = Union[Column, str]`, we should not use it when the string here is not a column name.

    e.g. https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.schema_of_csv.html

    ![image](https://github.com/apache/spark/assets/7322292/97681f11-a360-4bce-8557-2366aa07a0b5)

    in this case, we should follow parameter `schema` in `from_csv`:

    ![image](https://github.com/apache/spark/assets/7322292/3e05da85-6aba-4d23-b6a6-6a2fb8c9b8fd)

    ### Does this PR introduce _any_ user-facing change?
    yes, doc-change

    ### How was this patch tested?
    ci

    ### Was this patch authored or co-authored using generative AI tooling?
    no

    Closes #44108 from zhengruifeng/py_doc_schema_of.

    Authored-by: Ruifeng Zheng <ruife...@apache.org>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 python/pyspark/sql/functions/builtin.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/python/pyspark/sql/functions/builtin.py b/python/pyspark/sql/functions/builtin.py
index 5e5c70322ec9..ac237f10c2e7 100644
--- a/python/pyspark/sql/functions/builtin.py
+++ b/python/pyspark/sql/functions/builtin.py
@@ -13753,7 +13753,7 @@ def to_json(col: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Co
 
 
 @_try_remote_functions
-def schema_of_json(json: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Column:
+def schema_of_json(json: Union[Column, str], options: Optional[Dict[str, str]] = None) -> Column:
     """
     Parses a JSON string and infers its schema in DDL format.
 
@@ -13941,7 +13941,7 @@ def from_xml(
 
 
 @_try_remote_functions
-def schema_of_xml(xml: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Column:
+def schema_of_xml(xml: Union[Column, str], options: Optional[Dict[str, str]] = None) -> Column:
     """
     Parses a XML string and infers its schema in DDL format.
 
@@ -14055,7 +14055,7 @@ def to_xml(col: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Col
 
 
 @_try_remote_functions
-def schema_of_csv(csv: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Column:
+def schema_of_csv(csv: Union[Column, str], options: Optional[Dict[str, str]] = None) -> Column:
     """
     Parses a CSV string and infers its schema in DDL format.
 
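
The distinction this commit draws can be illustrated with a minimal, self-contained sketch. It does not import PySpark: `Column`, `col`, `lit`, `upper_sketch`, and `schema_of_csv_sketch` below are hypothetical stand-ins written for this illustration, and only the `ColumnOrName = Union[Column, str]` alias is taken from the commit message. The assumption is that a `str` passed to `schema_of_csv` is treated as the CSV text itself rather than as a column name, which is the reason given above for spelling out the union instead of using the alias.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for pyspark.sql.Column; it only records an expression string."""

    expr: str


# Same alias definition as quoted in the commit message.
ColumnOrName = Union[Column, str]


def col(name: str) -> Column:
    """Stand-in helper: interpret a string as a column *name*."""
    return Column(f"column({name})")


def lit(value: str) -> Column:
    """Stand-in helper: interpret a string as a literal *value*."""
    return Column(f"literal({value!r})")


def upper_sketch(c: ColumnOrName) -> Column:
    # A str here really is a column name, so the alias is an accurate hint.
    resolved = c if isinstance(c, Column) else col(c)
    return Column(f"upper({resolved.expr})")


def schema_of_csv_sketch(csv: Union[Column, str]) -> Column:
    # A str here is assumed to be the CSV text itself, so the annotation spells
    # out the union instead of using an alias whose name implies a column name.
    resolved = csv if isinstance(csv, Column) else lit(csv)
    return Column(f"schema_of_csv({resolved.expr})")


print(upper_sketch("name").expr)         # upper(column(name))
print(schema_of_csv_sketch("1|a").expr)  # schema_of_csv(literal('1|a'))
```

Both annotations describe the same type to a type checker; the difference is only in what the rendered signature suggests to a reader of the API documentation.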