legout commented on issue #31678: URL: https://github.com/apache/arrow/issues/31678#issuecomment-1688730932
Sorry for my confusing comment. Here are some more details.

The Parquet files in the dataset are exports from an Oracle database, written with another tool (KNIME). Unfortunately, this results in the Parquet files having different column orderings, although the data types of the columns are identical.

I am able to read the dataset (the Parquet files) using pyarrow.dataset or pyarrow.parquet.read_table. However, when I try to create the metadata and common metadata files as described in https://arrow.apache.org/docs/python/parquet.html#writing-metadata-and-common-medata-files, I get the same error:

RuntimeError: AppendRowGroups requires equal schemas.

I understand that the data types have to be identical, but I wonder why the column ordering matters here.

I am currently on my mobile. I'll provide some sample code later.
