amaliujia commented on code in PR #38631:
URL: https://github.com/apache/spark/pull/38631#discussion_r1020561328


##########
python/pyspark/sql/connect/column.py:
##########
@@ -82,6 +82,74 @@ def to_plan(self, session: "RemoteSparkSession") -> "proto.Expression":
     def __str__(self) -> str:
         ...
 
+    def alias(self, *alias: str, **kwargs: Any) -> "Expression":
+        """
+        Returns this column aliased with a new name or names (in the case of expressions that
+        return more than one column, such as explode).
+
+        .. versionadded:: 1.3.0

Review Comment:
   version should be 3.4.0



##########
python/pyspark/sql/connect/column.py:
##########
@@ -82,6 +82,74 @@ def to_plan(self, session: "RemoteSparkSession") -> "proto.Expression":
     def __str__(self) -> str:
         ...
 
+    def alias(self, *alias: str, **kwargs: Any) -> "Expression":
+        """
+        Returns this column aliased with a new name or names (in the case of expressions that
+        return more than one column, such as explode).
+
+        .. versionadded:: 1.3.0
+
+        Parameters
+        ----------
+        alias : str
+            desired column names (collects all positional arguments passed)
+
+        Other Parameters
+        ----------------
+        metadata: dict
+            a dict of information to be stored in ``metadata`` attribute of the
+            corresponding :class:`StructField <pyspark.sql.types.StructField>` (optional, keyword
+            only argument)
+
+            .. versionchanged:: 2.2.0
+               Added optional ``metadata`` argument.

Review Comment:
   we don't need this:
   
   Connect is a new API and everything will start from 3.4.0



##########
python/pyspark/sql/connect/column.py:
##########
@@ -82,6 +82,74 @@ def to_plan(self, session: "RemoteSparkSession") -> "proto.Expression":
     def __str__(self) -> str:
         ...
 
+    def alias(self, *alias: str, **kwargs: Any) -> "Expression":
+        """
+        Returns this column aliased with a new name or names (in the case of expressions that
+        return more than one column, such as explode).
+
+        .. versionadded:: 1.3.0
+
+        Parameters
+        ----------
+        alias : str
+            desired column names (collects all positional arguments passed)
+
+        Other Parameters
+        ----------------
+        metadata: dict
+            a dict of information to be stored in ``metadata`` attribute of the
+            corresponding :class:`StructField <pyspark.sql.types.StructField>` (optional, keyword
+            only argument)
+
+            .. versionchanged:: 2.2.0
+               Added optional ``metadata`` argument.
+
+        Returns
+        -------
+        :class:`Column`
+            Column representing whether each element of Column is aliased with new name or names.
+
+        Examples
+        --------
+        >>> df = spark.createDataFrame(
+        ...      [(2, "Alice"), (5, "Bob")], ["age", "name"])
+        >>> df.select(df.age.alias("age2")).collect()
+        [Row(age2=2), Row(age2=5)]
+        >>> df.select(df.age.alias("age3", metadata={'max': 99})).schema['age3'].metadata['max']
+        99
+        """
+        metadata = kwargs.pop("metadata", None)
+        assert not kwargs, "Unexpected kwargs were passed: %s" % kwargs
+        return ColumnAlias(self, list(alias), metadata)
+
+
+class ColumnAlias(Expression):
+    def __init__(self, parent: Expression, alias: list[str], metadata: Any):
+
+        self._alias = alias
+        self._metadata = metadata
+        self._parent = parent
+
+    def to_plan(self, session: "RemoteSparkSession") -> "proto.Expression":
+        if len(self._alias) == 1:
+            if self._metadata:
+                raise ValueError("Creating aliases with metadata is not supported.")
+            else:
+                exp = proto.Expression()
+                exp.alias.name.append(self._alias[0])
+                exp.alias.expr.CopyFrom(self._parent.to_plan(session))
+                return exp
+        else:
+            if self._metadata:
+                raise ValueError("metadata can only be provided for a single column")

Review Comment:
   Wait, metadata is literally not supported yet..
   
   Maybe collapse this into a single `raise ValueError`
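   As a rough illustration of that suggestion, the two branches could collapse into a single up-front guard. A standalone sketch with a stand-in class, not the actual `pyspark.sql.connect` proto types:

   ```python
   from typing import Any, List, Optional


   class ColumnAliasSketch:
       """Simplified stand-in for the connect ColumnAlias; proto plumbing omitted."""

       def __init__(self, parent: Any, alias: List[str], metadata: Optional[dict]):
           self._alias = alias
           self._metadata = metadata
           self._parent = parent

       def to_plan(self) -> dict:
           # Reject metadata up front with a single ValueError, regardless of
           # how many alias names were supplied.
           if self._metadata:
               raise ValueError("Creating aliases with metadata is not supported.")
           # Single name -> plain alias; multiple names -> multi-alias
           # (mirrors the server-side transformAlias branching).
           if len(self._alias) == 1:
               return {"alias": self._alias[0], "expr": self._parent}
           return {"multi_alias": self._alias, "expr": self._parent}
   ```

   With the guard hoisted out, both the single-name and multi-name paths stay free of metadata handling until it is actually supported.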



##########
connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##########
@@ -334,7 +334,11 @@ class SparkConnectPlanner(session: SparkSession) {
   }
 
   private def transformAlias(alias: proto.Expression.Alias): NamedExpression = {
-    Alias(transformExpression(alias.getExpr), alias.getName)()
+    if (alias.getNameCount == 1) {
+      Alias(transformExpression(alias.getExpr), alias.getName(0))()
+    } else {
+      MultiAlias(transformExpression(alias.getExpr), alias.getNameList.asScala.toSeq)

Review Comment:
   Can you make a change in Connect DSL and add a server side test in `SparkConnectProtoSuite`?
   
   Should be straightforward. 
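   The Scala suite change itself is left to the PR author; as a language-neutral sketch, the behavior such a server-side test would pin down is the name-count branching in `transformAlias` (a hypothetical helper, for illustration only):

   ```python
   from typing import List


   def transform_alias(expr: str, names: List[str]) -> str:
       """Mimics SparkConnectPlanner.transformAlias: one name yields an Alias,
       several names yield a MultiAlias (e.g. for explode() output columns)."""
       if len(names) == 1:
           return f"Alias({expr}, {names[0]})"
       return f"MultiAlias({expr}, {names})"
   ```

   A proto-level test would build an `Expression.Alias` with one name and with several, then assert the plan contains `Alias` and `MultiAlias` respectively.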



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

