jorisvandenbossche commented on pull request #9676:
URL: https://github.com/apache/arrow/pull/9676#issuecomment-796578198


   Still needs tests, docs, and a decision on the actual user API, but basic
projection is already working:
   
   ```python
   >>> import pyarrow.dataset as ds
   >>> dataset = ds.dataset("test.parquet", format="parquet")
   
   >>> dataset.to_table().to_pandas()
      A    B  C
   0  1  0.1  a
   1  2  0.2  b
   2  3  0.3  c
   
   >>> dataset.to_table(columns=['A', 'C']).to_pandas()
      A  C
   0  1  a
   1  2  b
   2  3  c
   
   >>> dataset.to_table(columns={
   ...     'A_renamed': ds.field('A'),
   ...     'B_as_int': ds.field('B').cast("int64", safe=False),
   ...     'C_is_a': ds.field('C') == 'a'
   ... }).to_pandas()
      A_renamed  B_as_int  C_is_a
   0          1         0    True
   1          2         0   False
   2          3         0   False
   ```

