vikasmalhotra08 opened a new issue #11347:
URL: https://github.com/apache/arrow/issues/11347


   Hello,
   
   Is it possible to read only specific nested fields when reading a Parquet
   file? I am getting the following error:
   ```
   pyarrow.lib.ArrowInvalid: Field named 'a.b' not found or not unique in the schema.
   ```
   
   Here is how the file is written out:
   ```
   import pyarrow.parquet as pq

   # Write the table to a Parquet file
   pq.write_table(
       table,
       where=file_path,
       version='2.0',
       compression='snappy'
   )
   ```
   
   Here is the schema of the Parquet file:
   ```
   required group field_id=0 schema {
     optional group field_id=1 a {
       optional binary field_id=2 abc (String);
       optional group field_id=3 b {
         optional binary field_id=4 c (String);
         optional binary field_id=5 d (String);
         optional binary field_id=6 e (String);
       }
     }
   }
   ```
   
   Here is how I am trying to read it:
   ```
   # read the table
   columns_needed = ['a.b', 'a.b.c']
   data = pq.read_table(
       file_path, 
       columns=columns_needed)
   ```
   


