jiayuasu opened a new pull request, #832:
URL: https://github.com/apache/sedona-db/pull/832
Wire the `Expr` layer into the existing lazy `DataFrame` so users can
project columns without writing SQL strings.
This is the third small PR in the Phase P1 series of #791, building on #807
(Expr foundation) and #823 (Expr operators).
## What's new
```python
from sedonadb.expr import col
df.select("x", "y") # bare column names
df.select(col("x"), (col("y") + 1).alias("y1")) # Expr objects
df.select("x", (col("y") * 2).alias("y2")) # mix
```
- Strings are converted to column references via `col(name)` internally, so
they produce the same plan as the all-Expr form.
- Empty argument list → `ValueError`. Non-str/non-Expr argument →
`TypeError`. Unknown column → DataFusion plan-build error (the same behavior
locked in by the foundation PR, #807).
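
The argument handling described above can be sketched in plain Python. This is a minimal illustration, not the code in this PR: `Expr`, `col`, and `normalize_select_args` below are hypothetical stand-ins so the snippet runs without `sedonadb`, and the error messages are invented.

```python
# Hypothetical stand-ins for illustration only; not sedonadb's real classes.
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """Minimal stand-in for an expression node, holding a display string."""

    text: str

    def alias(self, name: str) -> "Expr":
        # Return a new node; the original expression is left unchanged.
        return Expr(f"{self.text} AS {name}")


def col(name: str) -> Expr:
    """Build a column reference from a bare column name."""
    return Expr(name)


def normalize_select_args(*exprs) -> list:
    """Turn a mix of str and Expr arguments into a list of Expr."""
    if not exprs:
        raise ValueError("select() requires at least one expression")
    normalized = []
    for arg in exprs:
        if isinstance(arg, Expr):
            normalized.append(arg)
        elif isinstance(arg, str):
            # Bare strings become column references.
            normalized.append(col(arg))
        else:
            raise TypeError(
                f"select() expects str or Expr, got {type(arg).__name__}"
            )
    return normalized
```

Under these assumptions, `normalize_select_args("x", "y")` and `normalize_select_args(col("x"), col("y"))` return equal lists, which mirrors the "same plan" property of the string and all-Expr forms.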
## Implementation
`InternalDataFrame::select` (Rust) is a thin wrapper that unwraps
`Vec<PyExpr>` to `Vec<Expr>` and calls DataFusion's `DataFrame::select`
directly. No new query-engine code.
## Test plan
- 11 tests in `tests/expr/test_dataframe_select.py` cover string projection,
Expr projection, mixed args, arithmetic Expr, alias output, literal-coercion
via operators, lazy return, and the three error paths.
- Assertions compare exact `column_names` / `to_pylist()` values, with no
substring matching, so any change in projection semantics fails loudly.
- All 11 pass locally; no regressions in existing `test_dataframe.py`.
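
To illustrate why exact assertions are stricter than substring matching, here is a small self-contained sketch. The helper names and the sample data are hypothetical, and the rows are assumed to be a list of dicts; none of this is taken from the actual test file.

```python
def exact_match(column_names, rows, expected_names, expected_rows) -> bool:
    """Strict check: both the column names and the row data must match exactly."""
    return (
        list(column_names) == list(expected_names)
        and list(rows) == list(expected_rows)
    )


def substring_match(column_names, fragment: str) -> bool:
    """Loose check: passes if the fragment appears anywhere in the joined names."""
    return fragment in ", ".join(column_names)


# Expected output of a projection such as select("x", (col("y") + 1).alias("y1")).
expected_names = ["x", "y1"]
expected_rows = [{"x": 1, "y1": 3}]

# A hypothetical regression in which the alias is silently dropped.
regressed_names = ["x", "y + 1"]
regressed_rows = [{"x": 1, "y + 1": 3}]
```

With this data, `substring_match(regressed_names, "y")` still passes, while `exact_match(regressed_names, regressed_rows, expected_names, expected_rows)` fails, so only the exact check catches the regression.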
`DataFrame.filter` / `.where` come in the next small PR.