[GitHub] [spark] zero323 commented on a change in pull request #34136: [SPARK-36884][PYTHON] Inline type hints for pyspark.sql.session

GitBox Fri, 01 Oct 2021 09:54:39 -0700


zero323 commented on a change in pull request #34136:
URL: https://github.com/apache/spark/pull/34136#discussion_r720401238




##########
File path: python/pyspark/sql/session.py
##########
@@ -445,7 +487,12 @@ def _inferSchemaFromList(self, data, names=None):
             raise ValueError("Some of types cannot be determined after inferring")
         return schema
 
-    def _inferSchema(self, rdd, samplingRatio=None, names=None):
+    def _inferSchema(
+        self,
+        rdd: "RDD[Union[DateTimeLiteral, LiteralType, DecimalLiteral, RowLike]]",

Review comment:
   I'm wondering about this annotation. I have a feeling that it should be
   either `RDD[Any]` (type-wise, we can invoke this on an arbitrary RDD) or,
   if we want to signal that it can succeed only on certain types of RDDs,
   the `Literal*` variants should be omitted (we don't support schema
   inference on these).
   
   Same applies to `_inferSchemaFromList`.
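
   A rough sketch of the two options, for illustration only. This is not the
   PR's code: `RDD` below is a hypothetical stand-in class so the snippet runs
   without PySpark, `RowLike` is only an assumed approximation of the real
   alias, and the function names are made up.

```python
from typing import Any, Generic, List, Tuple, TypeVar, Union

T = TypeVar("T")


class RDD(Generic[T]):
    """Hypothetical stand-in for pyspark.RDD, present only so the sketch runs."""


# Assumed approximation of a row-like record; the real alias lives in PySpark.
RowLike = Union[List[Any], Tuple[Any, ...]]


def infer_schema_any(rdd: "RDD[Any]") -> None:
    """Option 1: accept any RDD, since the call is valid type-wise on all of them."""


def infer_schema_rows(rdd: "RDD[RowLike]") -> None:
    """Option 2: signal that inference can only succeed on row-like records."""
```

   Option 1 describes what the type checker can actually guarantee, while
   option 2 documents intent at the cost of rejecting some calls that would
   only fail at runtime anyway.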



