Hi Partha,

The functionality to select a nested field exists in the C++ library, but as far as I know, this is not yet exposed in the Python bindings, so the example you are showing is not yet supported in practice.

I opened a JIRA to track this feature: https://issues.apache.org/jira/browse/ARROW-11259

Best,
Joris

On Wed, 13 Jan 2021 at 19:57, PARTHA DUTTA <[email protected]> wrote:

> I have a Parquet file which has a field defined as a struct:
>
> workEmail: struct<address: string>
> child 0, address: string
> -- field metadata --
> PARQUET:field_id: '13'
> -- field metadata --
> PARQUET:field_id: '1'
>
> I am trying to write a filter as a DNF to query a specific value for
> workEmail.address but pyarrow does not seem to accept the DNF:
>
> tbl = pyarrow.parquet.read_table(filename, use_legacy_dataset=False,
> columns=["workEmail"], filters=[("workEmail.address", "=", "[email protected]
> ")])
>
> Is this supported? If not, any other workarounds?
>
> --
> Partha Dutta
> [email protected]
