dongjoon-hyun opened a new pull request, #405:
URL: https://github.com/apache/spark-connect-swift/pull/405

   ### What changes were proposed in this pull request?
   
   This PR adds `withColumn` and `withColumns` transformations to `DataFrame`, 
exposing the Spark Connect `WithColumns` relation that was already defined in 
the generated protobuf but not surfaced in the Swift API.
   
   New public APIs in `DataFrame+Transformations.swift`:
   
   ```swift
   public func withColumn(_ colName: String, _ expr: String) -> DataFrame
   public func withColumns(_ colsMap: [String: String]) -> DataFrame
   ```
   
   - `withColumn` adds a new column or replaces an existing column with the 
same name; it delegates to `withColumns` with a single-entry map.
   - The column expression is provided as a SQL expression string and parsed on 
the server, consistent with the existing `filter(_:)` / `selectExpr(_:)` APIs 
(this client has no `Column` type).
   - A static plan builder `SparkConnectClient.getWithColumns(_:_:)` constructs 
the `WithColumns` relation by mapping each `(name, expr)` pair to an 
`Expression.Alias`, mirroring the sibling `getWithColumnRenamed(_:_:)`.
   - Adds the `WithColumns` type alias in `TypeAliases.swift`, matching the 
existing `WithColumnsRenamed` alias.
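   
   The mapping described in the bullets above can be sketched roughly as follows. 
   This is an illustration only, not the code in this PR: `AliasSketch`, 
   `WithColumnsSketch`, `makeWithColumns`, and `makeWithColumn` are hypothetical 
   stand-ins for the generated protobuf types and the real plan builder. The keys 
   are sorted here only to keep the sketch deterministic, since Swift 
   dictionaries have no defined iteration order; this PR does not state whether 
   the actual implementation does the same.
   
   ```swift
   // Hypothetical stand-ins for the generated protobuf messages.
   // The real `Expression.Alias` and `WithColumns` types carry more fields.
   struct AliasSketch: Equatable {
     var name: [String]  // alias name parts; a single part for a plain column
     var expr: String    // unparsed SQL expression string, resolved on the server
   }
   
   struct WithColumnsSketch: Equatable {
     var aliases: [AliasSketch]
   }
   
   // Maps each (name, expr) pair to one alias entry.
   // Sorting by key is an assumption made to keep this sketch deterministic.
   func makeWithColumns(_ colsMap: [String: String]) -> WithColumnsSketch {
     let aliases = colsMap
       .sorted { $0.key < $1.key }
       .map { AliasSketch(name: [$0.key], expr: $0.value) }
     return WithColumnsSketch(aliases: aliases)
   }
   
   // The single-column form delegates to the map form with one entry.
   func makeWithColumn(_ colName: String, _ expr: String) -> WithColumnsSketch {
     makeWithColumns([colName: expr])
   }
   
   let single = makeWithColumn("b", "id + 1")
   let multi = makeWithColumns(["c": "id * 2", "b": "id + 1"])
   print(single.aliases.map { $0.name[0] })  // ["b"]
   print(multi.aliases.map { $0.name[0] })   // ["b", "c"]
   ```
   
   Routing the single-column form through the map form keeps one code path for 
   building the relation, so both APIs share the same add-or-replace behavior.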
   
   ### Why are the changes needed?
   
   `DataFrame.withColumn` / `withColumns` are among the most commonly used 
PySpark and Spark SQL APIs for deriving or replacing columns from expressions. 
The Swift client previously supported only column *renaming* 
(`withColumnRenamed`) and had no way to add or replace a column from a computed 
expression. This change improves API parity with PySpark/Spark SQL.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. This adds two new public `DataFrame` APIs.
   
   ```swift
   let df = try await spark.range(1)                    // columns: [id]
   try await df.withColumn("b", "id + 1").show()        // columns: [id, b]
   try await df.withColumns(["b": "id + 1",
                             "c": "id * 2"]).show()     // columns: [id, b, c]
   ```
   
   ### How was this patch tested?
   
   Pass the CIs.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Claude Opus 4.8)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

