mzhukovs opened a new issue #11186:
URL: https://github.com/apache/arrow/issues/11186


   I would like to do basic manipulation of Parquet files stored in 
Azure Data Lake Storage (ADLS) Gen2. Is it possible to update ONLY the column 
names and/or metadata without having to recreate the file or read in all of the 
data?
   
   Using the handler provided by 
https://github.com/kaaveland/pyarrowfs-adlgen2, I am able to connect to storage 
just fine and read the metadata/schema, but I am hoping there is a way to write 
changes back (by specifying the filesystem that this library provides). Is this 
possible?
   
   It looks like 
[rename_columns](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html?highlight=rename_columns#pyarrow.Table.rename_columns)
 is a method on the Table object, so the data would already have been read in 
by the time it can be called. I am also not sure whether this function applies: 
https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_metadata.html?highlight=write_meta
   
   Would appreciate any insight.


