jorisvandenbossche commented on pull request #9676:
URL: https://github.com/apache/arrow/pull/9676#issuecomment-796578198
Still needs tests, docs, and a decision on the actual user API, but basic
projection is working:
```python
>>> import pyarrow.dataset as ds
>>> dataset = ds.dataset("test.parquet", format="parquet")
>>> dataset.to_table().to_pandas()
   A    B  C
0  1  0.1  a
1  2  0.2  b
2  3  0.3  c
>>> dataset.to_table(columns=['A', 'C']).to_pandas()
   A  C
0  1  a
1  2  b
2  3  c
>>> dataset.to_table(columns={
... 'A_renamed': ds.field('A'),
... 'B_as_int': ds.field('B').cast("int64", safe=False),
... 'C_is_a': ds.field('C') == 'a'
... }).to_pandas()
   A_renamed  B_as_int  C_is_a
0          1         0    True
1          2         0   False
2          3         0   False
```