[GitHub] [arrow] legout commented on issue #31678: PyArrow: RuntimeError: AppendRowGroups requires equal schemas when writing _metadata file

via GitHub Tue, 22 Aug 2023 11:43:22 -0700


legout commented on issue #31678:
URL: https://github.com/apache/arrow/issues/31678#issuecomment-1688730932


   Sorry for my confusing comment. Here are some more details. 
   
   The parquet files of the dataset are exports from an Oracle database, written 
with another tool (KNIME). Unfortunately, this leads to the parquet files 
having different column orderings, although the data types of the columns are 
identical. 
   
   This means I am able to read the dataset (parquet files) using 
pyarrow.dataset or pyarrow.parquet.read_table. 
   However, when trying to create the _metadata and _common_metadata files 
according to 
https://arrow.apache.org/docs/python/parquet.html#writing-metadata-and-common-medata-files,
 I get this same error:
   
   RuntimeError: AppendRowGroups requires equal schemas. 
   
   
   I do understand that the data types have to be identical, but I wonder why 
the column ordering is important here. 
   
   I am currently on my mobile. I'll provide some sample code later. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.