[GitHub] [spark] HyukjinKwon commented on a change in pull request #32775: [SPARK-35638][PYTHON] Introduce Field to manage dtypes and StructFields.

GitBox Fri, 04 Jun 2021 01:30:24 -0700


HyukjinKwon commented on a change in pull request #32775:
URL: https://github.com/apache/spark/pull/32775#discussion_r645252024




##########
File path: python/pyspark/pandas/frame.py
##########
@@ -7703,7 +7734,7 @@ def join(
         right: DataFrame, Series
         on: str, list of str, or array-like, optional
             Column or index level name(s) in the caller to join on the index in `right`, otherwise
-            joins index-on-index. If multiple values given, the `right` DataFrame must have a
+            joins index-on-index. If mult iple values given, the `right` DataFrame must have a

Review comment:
       nit: typo

##########
File path: python/pyspark/pandas/internal.py
##########
@@ -73,6 +80,131 @@
 SPARK_DEFAULT_SERIES_NAME = str(DEFAULT_SERIES_NAME)
 
 
+class Field:

Review comment:
       Should we maybe name it `InternalField`?

##########
File path: python/pyspark/pandas/internal.py
##########
@@ -385,10 +528,10 @@ def __init__(
         spark_frame: spark.DataFrame,
         index_spark_columns: Optional[List[spark.Column]],
         index_names: Optional[List[Optional[Tuple]]] = None,
-        index_dtypes: Optional[List[Dtype]] = None,
+        index_fields: Optional[List[Field]] = None,

Review comment:
       Hm .. should we maybe combine it with `index_spark_columns` and `data_spark_columns` respectively?

##########
File path: python/pyspark/pandas/internal.py
##########
@@ -385,10 +528,10 @@ def __init__(
         spark_frame: spark.DataFrame,
         index_spark_columns: Optional[List[spark.Column]],
         index_names: Optional[List[Optional[Tuple]]] = None,
-        index_dtypes: Optional[List[Dtype]] = None,
+        index_fields: Optional[List[Field]] = None,

Review comment:
       Can be done in a followup but just wondering.
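
       For illustration only, combining each Spark column with its field could be sketched roughly as below. This is not code from the PR: `ColumnWithField` and `pair_columns_with_fields` are hypothetical names, and `Any` stands in for `spark.Column` and the new `Field` class so the snippet runs without Spark.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class ColumnWithField:
    """Hypothetical container; not part of the PR."""

    spark_column: Any  # stand-in for spark.Column
    field: Any  # stand-in for the Field class added in this PR (may be None)


def pair_columns_with_fields(
    spark_columns: List[Any], fields: Optional[List[Any]] = None
) -> List[ColumnWithField]:
    """Zip columns with their fields so both travel as one argument."""
    if fields is None:
        # Mirror the optional `index_fields` argument: no field metadata given.
        fields = [None] * len(spark_columns)
    if len(spark_columns) != len(fields):
        raise ValueError(
            "expected %d fields, got %d" % (len(spark_columns), len(fields))
        )
    return [ColumnWithField(c, f) for c, f in zip(spark_columns, fields)]
```

       With something like this, `__init__` would take one list per role (index, data) instead of two parallel lists that have to be kept the same length.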




