xinrong-meng commented on code in PR #48628:
URL: https://github.com/apache/spark/pull/48628#discussion_r1815847602


##########
python/pyspark/sql/plot/plotly.py:
##########
@@ -214,3 +209,35 @@ def plot_histogram(data: "DataFrame", **kwargs: Any) -> "Figure":
     fig["layout"]["xaxis"]["title"] = "value"
     fig["layout"]["yaxis"]["title"] = "count"
     return fig
+
+
+def process_column_param(column: Optional[Union[str, List[str]]], data: "DataFrame") -> List[str]:
+    """
+    Processes the provided column parameter for a DataFrame.
+    - If `column` is None, returns a list of numeric or datetime columns from the DataFrame.
+    - If `column` is a string, converts it to a list first.
+    - If `column` is a list, it checks if all specified columns exist in the DataFrame
+      and are of valid types (NumericType, DateType, or TimestampType).
+    - Raises a PySparkTypeError if any column in the list is not present in the DataFrame
+      or has an invalid type.
+    """
+    valid_types = (NumericType, DateType, TimestampType)

Review Comment:
   Good catch!
   Box plots seem to support NumericType only.
   Density and histogram plots also accept `np.datetime64` in pandas API on Spark (PS), as shown at https://github.com/apache/spark/blob/master/python/pyspark/pandas/plot/core.py#L142.
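
One way to act on this would be to make the valid types depend on the plot kind. The following is only a rough sketch of that idea, not the code in this PR. To stay runnable without Spark, it represents the schema as a plain dict of column name to type name, raises the built-in `TypeError` in place of `PySparkTypeError`, and the names `VALID_TYPES_BY_KIND` and `select_plot_columns` are hypothetical. The box/hist/kde split follows the comment above and is an assumption, not verified behavior.

```python
from typing import Dict, List, Optional, Union

# Stand-ins for Spark type names (assumption: the real code would use
# NumericType, DateType and TimestampType classes with isinstance checks).
NUMERIC = {"ByteType", "ShortType", "IntegerType", "LongType",
           "FloatType", "DoubleType", "DecimalType"}
DATETIME = {"DateType", "TimestampType"}

# Hypothetical per-kind mapping: box plots are numeric only, while
# histogram and density plots also take date/timestamp columns.
VALID_TYPES_BY_KIND = {
    "box": NUMERIC,
    "hist": NUMERIC | DATETIME,
    "kde": NUMERIC | DATETIME,
}


def select_plot_columns(
    column: Optional[Union[str, List[str]]],
    schema: Dict[str, str],
    kind: str,
) -> List[str]:
    """Return the columns to plot, validated against the given plot kind."""
    valid = VALID_TYPES_BY_KIND[kind]
    if column is None:
        # Default: every column whose type is valid for this kind.
        return [name for name, type_name in schema.items() if type_name in valid]
    columns = [column] if isinstance(column, str) else list(column)
    for name in columns:
        if name not in schema:
            raise TypeError(f"Column {name!r} is not in the DataFrame.")
        if schema[name] not in valid:
            raise TypeError(
                f"Column {name!r} has type {schema[name]}, "
                f"which is not supported for {kind!r} plots."
            )
    return columns
```

With a schema such as `{"id": "IntegerType", "ts": "TimestampType"}`, this sketch would pick both columns for `"hist"` but only `id` for `"box"`, and would reject an explicit `ts` request for a box plot.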



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

