jorisvandenbossche commented on pull request #11385:
URL: https://github.com/apache/arrow/pull/11385#issuecomment-941013899


   This is a somewhat hacky solution; I am not sure it is robust enough to be 
accepted (or that it is the best approach). 
   Currently, the mapping from string column name to parquet field index is 
done on the Python side. This is based on the FileMetaData.SchemaDescriptor, 
iterating through the columns and getting the dotted path of each column. The 
problem is that at this point, there is no easy way to know (AFAIK) if the 
column (ColumnDescriptor) is for a list type or not, to be able to also 
construct the shorter version of the dotted path. 
   
   So I did this at the C++ level instead, simply by exposing, in addition to 
the default dotted path, a "shorter" one that excludes the list inner 
elements. 
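
   To make the idea concrete, here is a minimal pure-Python sketch of the 
name-to-index mapping and of a "shorter" dotted path. It is not the code in 
this PR: the helper names (`build_name_to_indices`, `shorten_list_path`), the 
example paths, and the assumption that list columns show up as 
`list.element` / `list.item` segments are all hypothetical. The string-based 
stripping is deliberately naive; it cannot tell a real list from a struct 
field that happens to be named `list`, which is the reason for doing this at 
the C++ level where the type information is available.

```python
def build_name_to_indices(column_paths):
    """Map every dotted prefix of each leaf path to the leaf column indices below it."""
    mapping = {}
    for index, path in enumerate(column_paths):
        parts = path.split(".")
        for end in range(1, len(parts) + 1):
            mapping.setdefault(".".join(parts[:end]), []).append(index)
    return mapping


# Assumed naming of the list inner elements; purely illustrative.
LIST_MARKERS = {("list", "element"), ("list", "item")}


def shorten_list_path(path):
    """Naively drop 'list.element' / 'list.item' segment pairs from a dotted path."""
    parts = path.split(".")
    kept = []
    i = 0
    while i < len(parts):
        if tuple(parts[i:i + 2]) in LIST_MARKERS:
            i += 2  # skip both segments of the list wrapper
        else:
            kept.append(parts[i])
            i += 1
    return ".".join(kept)


# Hypothetical leaf column paths: a plain column and a list of structs.
paths = ["a", "b.list.element.x", "b.list.element.y"]
full = build_name_to_indices(paths)
short = build_name_to_indices([shorten_list_path(p) for p in paths])
```

   With the default paths, selecting field `x` requires the full name 
`b.list.element.x`; with the shortened paths, `b.x` resolves to the same 
leaf column index.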


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

