zhengruifeng commented on code in PR #40260:
URL: https://github.com/apache/spark/pull/40260#discussion_r1123992438


##########
python/pyspark/sql/connect/_typing.py:
##########
@@ -57,7 +57,7 @@ class UserDefinedFunctionLike(Protocol):
     deterministic: bool
 
     @property
-    def returnType(self) -> DataType:
+    def returnType(self) -> DataTypeOrString:

Review Comment:
   Yeah, but I am still confused about one thing:
   
   the old implementation,
   `PySparkSession.builder.getOrCreate().createDataFrame(data=[], schema=data_type).schema`,
   works.
   
   I also tried binding the session to a local variable first:
   ```
       session = PySparkSession.builder.getOrCreate()
       parsed = session.client._analyze(  # type: ignore[attr-defined]
           method="ddl_parse", ddl_string=data_type
       ).parsed
   ```
   and at least the tests passed.
   
   But if I chain the calls without keeping a reference to the session:
   ```
       parsed = PySparkSession.builder.getOrCreate().client._analyze(  # type: ignore[attr-defined]
           method="ddl_parse", ddl_string=data_type
       ).parsed
   ```
   the tests always fail with `ValueError: Cannot invoke RPC on closed channel!`
   
   Maybe we will have to add a pure Python DDL parser; I don't know.
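
To make the "pure Python DDL parser" idea concrete, here is a rough sketch of what a client-side parser could look like. It is only an illustration under several assumptions: the names `ATOMIC`, `tokenize`, `_Parser`, and `parse_ddl` are hypothetical and not part of PySpark, the output is plain tuples rather than `DataType` objects, and only a small subset of the DDL grammar is handled (no `decimal(p,s)`, no `NOT NULL`, no comments, no backquoted names).

```python
import re
from typing import Any, List, Optional, Tuple

# Hypothetical subset of atomic type names; a real parser would need many more.
ATOMIC = {"boolean", "byte", "short", "int", "bigint", "long", "float",
          "double", "string", "binary", "date", "timestamp"}

# A token is either an identifier or one of the punctuation marks < > , :
_TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[<>,:])")


def tokenize(ddl: str) -> List[str]:
    """Split a DDL string into identifier and punctuation tokens."""
    ddl = ddl.rstrip()
    tokens, pos = [], 0
    while pos < len(ddl):
        m = _TOKEN.match(ddl, pos)
        if m is None:
            raise ValueError(f"Unexpected input at position {pos}: {ddl[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class _Parser:
    """Recursive-descent parser over the token list."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected: Optional[str] = None) -> str:
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"Expected {expected or 'a token'}, got {tok!r}")
        self.pos += 1
        return tok

    def parse_type(self) -> Any:
        name = self.take().lower()
        if name in ATOMIC:
            return name
        if name == "array":
            self.take("<")
            element = self.parse_type()
            self.take(">")
            return ("array", element)
        if name == "map":
            self.take("<")
            key = self.parse_type()
            self.take(",")
            value = self.parse_type()
            self.take(">")
            return ("map", key, value)
        if name == "struct":
            self.take("<")
            fields = self.parse_fields()
            self.take(">")
            return ("struct", fields)
        raise ValueError(f"Unsupported type: {name!r}")

    def parse_fields(self) -> List[Tuple[str, Any]]:
        """Parse `name type` or `name:type` pairs separated by commas."""
        fields = []
        while True:
            field_name = self.take()
            if not field_name.isidentifier():
                raise ValueError(f"Invalid field name: {field_name!r}")
            if self.peek() == ":":  # the colon is treated as optional here
                self.take(":")
            fields.append((field_name, self.parse_type()))
            if self.peek() != ",":
                return fields
            self.take(",")


def parse_ddl(ddl: str) -> Any:
    """Parse either a single type ("array<int>") or a field list ("a int, b string")."""
    tokens = tokenize(ddl)
    for attempt in ("type", "fields"):
        parser = _Parser(tokens)
        try:
            if attempt == "type":
                result = parser.parse_type()
            else:
                result = ("struct", parser.parse_fields())
        except ValueError:
            continue
        if parser.peek() is None:  # accept only if every token was consumed
            return result
    raise ValueError(f"Cannot parse DDL string: {ddl!r}")
```

Trying the single-type form first and falling back to a field list is just one way to handle both kinds of input, and it needs no session or RPC at all. Whether the maintenance cost of keeping such a parser in sync with the server-side grammar is acceptable is a separate question.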



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

