zhengruifeng commented on code in PR #40260:
URL: https://github.com/apache/spark/pull/40260#discussion_r1123992438
##########
python/pyspark/sql/connect/_typing.py:
##########
@@ -57,7 +57,7 @@ class UserDefinedFunctionLike(Protocol):
     deterministic: bool

     @property
-    def returnType(self) -> DataType:
+    def returnType(self) -> DataTypeOrString:

Review Comment:
   Yeah, but I am still confused by the fact that the old implementation `PySparkSession.builder.getOrCreate().createDataFrame(data=[], schema=data_type).schema` works.

   I also tried
   ```
   session = PySparkSession.builder.getOrCreate()
   parsed = session.client._analyze(  # type: ignore[attr-defined]
       method="ddl_parse", ddl_string=data_type
   ).parsed
   ```
   and at least the tests passed. But if I try
   ```
   parsed = PySparkSession.builder.getOrCreate().client._analyze(  # type: ignore[attr-defined]
       method="ddl_parse", ddl_string=data_type
   ).parsed
   ```
   the tests always fail with `ValueError: Cannot invoke RPC on closed channel!`

   Maybe we will have to add a pure Python DDL parser, I don't know.
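   For illustration only, a minimal sketch of what such a pure-Python fallback could look like (a hypothetical helper, not an existing PySpark API; it only covers flat DDL schemas like `a INT, b STRING` and does no RPC):
   ```python
   # Hypothetical sketch only -- not part of PySpark. Parses a flat,
   # comma-separated DDL string into a StructType without a server round trip.
   from pyspark.sql.types import (
       BooleanType,
       DataType,
       DoubleType,
       IntegerType,
       LongType,
       StringType,
       StructField,
       StructType,
   )

   # Map a few common DDL type names to their pure-Python DataType instances.
   _ATOMIC_TYPES = {
       "int": IntegerType(),
       "integer": IntegerType(),
       "bigint": LongType(),
       "long": LongType(),
       "string": StringType(),
       "double": DoubleType(),
       "boolean": BooleanType(),
   }


   def parse_flat_ddl(ddl: str) -> DataType:
       """Parse e.g. "a INT, b STRING" into a StructType; no nested types."""
       fields = []
       for part in ddl.split(","):
           name, type_name = part.strip().split(maxsplit=1)
           fields.append(StructField(name, _ATOMIC_TYPES[type_name.strip().lower()]))
       return StructType(fields)


   # parse_flat_ddl("a INT, b STRING")
   # -> StructType([StructField('a', IntegerType(), True),
   #                StructField('b', StringType(), True)])
   ```
   Handling nested `struct<...>`, `array<...>`, `map<...>`, decimals, etc. would of course need a real tokenizer, which is why I'm not sure it is worth it compared to the `ddl_parse` RPC.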