AgenticSpark opened a new pull request, #56726:
URL: https://github.com/apache/spark/pull/56726

   ### What changes were proposed in this pull request?
   
   This adds a `_name` property to the PySpark `Column` class that returns the
   column's name, alias, or expression as a string -- the same string shown inside
   `Column.__repr__` (`Column<'...'>`). It is implemented for both Spark Classic
   (`self._jc.toString()`) and Spark Connect (`self._expr.__repr__()`).
   
   ```python
   >>> from pyspark.sql import functions as sf
   >>> df = spark.createDataFrame([(2, "Alice")], ["age", "name"])
   >>> df.age._name
   'age'
   >>> sf.col("value")._name
   'value'
   >>> sf.col("a").cast("int")._name
   'CAST(a AS INT)'
   ```
   
   The leading underscore intentionally avoids a collision with the existing
   `Column.name` method, which is an alias for `Column.alias`.
   
   ### Why are the changes needed?
   
   Requested in [SPARK-38483](https://issues.apache.org/jira/browse/SPARK-38483).
   Having the name available as an attribute enables convenient patterns, e.g.
   re-aliasing an expression with the source column's name, or branching on a
   column's name inside a helper function:
   
   ```python
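   from pyspark.sql import functions as sf

   # Re-alias a derived expression with the source column's name.
   values = sf.col("values")
   distinct_values = sf.array_distinct(values).alias(values._name)

   # Branch on the column's name inside a helper.
   def custom_function(col):
       return col.cast("int") if col._name == "my_column" else col.cast("string")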
   ```
   
   Previously the name was only obtainable by parsing `repr(col)`.
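
For comparison, here is a minimal sketch of the `repr`-parsing workaround mentioned above. The helper name `name_from_repr` is hypothetical and not part of PySpark. It assumes only the `Column<'...'>` repr format quoted earlier in this description.

```python
import re

# Hypothetical helper, not part of PySpark: recover the inner string from a
# repr shaped like Column<'...'>. Assumes exactly that format.
_REPR_PATTERN = re.compile(r"^Column<'(.*)'>$", re.DOTALL)


def name_from_repr(column_repr: str) -> str:
    """Return the text between Column<' and '> in a Column repr string."""
    match = _REPR_PATTERN.match(column_repr)
    if match is None:
        raise ValueError(f"unexpected Column repr: {column_repr!r}")
    return match.group(1)
```

A caller would write `name_from_repr(repr(col))`. With the property proposed here, that becomes `col._name`, and the caller no longer depends on the repr format.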
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes -- a new `Column._name` property is available. There is no change to any
   existing behavior.
   
   ### How was this patch tested?
   
   Added `test_name_property` to `ColumnTestsMixin`, so it runs under both the
   classic (`pyspark.sql.tests.test_column`) and Spark Connect parity
   (`pyspark.sql.tests.connect.test_parity_column`) suites. It checks concrete
   values and the invariant `repr(col) == "Column<'%s'>" % col._name`. Doctests
   were also added on the new property.
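
The invariant can be illustrated without a Spark session by using a stand-in class. This is only a sketch: `FakeColumn` and `check_repr_invariant` are hypothetical names, and this is not the test code from the patch.

```python
class FakeColumn:
    """Hypothetical stand-in for a Column; stores only its display string."""

    def __init__(self, expr_string: str) -> None:
        self._expr_string = expr_string

    @property
    def _name(self) -> str:
        # Mirrors the described behavior: the string shown inside the repr.
        return self._expr_string

    def __repr__(self) -> str:
        return "Column<'%s'>" % self._name


def check_repr_invariant(col) -> bool:
    """True when repr(col) is exactly Column<'...'> wrapped around col._name."""
    return repr(col) == "Column<'%s'>" % col._name
```

In this sketch the check accepts any object that exposes `_name` and a `repr`, so the same function could be applied to classic and Connect columns alike.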
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: GitHub Copilot CLI (Claude Opus 4.8)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]