arjunsr1 commented on issue #13396:
URL: https://github.com/apache/arrow/issues/13396#issuecomment-1162355789

   Hi @kou - it seems like `s3_existing_table.concatenate([table])` should work 
fine. However, I'm getting an error that says `Invalid schema at index 1 was 
different`. The steps I'm taking are as follows:
   
   - Construct a table from a CSV file (the variable is called tempfile) using `table = Arrow::Table.load(tempfile.path)`
   - Pull a Parquet file from S3 into a temporary file location (a Parquet file that was previously generated by calling `Arrow::Table.save()` on an Arrow table of the same type).
   - Call `s3_existing_table = Arrow::Table.load(_temporary_file_path_)`
   - Finally, try to merge the tables with `consolidated_table = s3_existing_table.concatenate([table])` (a condensed sketch of these steps is below)
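   
   A condensed sketch of those four steps in Ruby (the file paths here are placeholders, not the real ones):
   
   ```ruby
   require "arrow"
   require "parquet" # red-parquet, needed for the Parquet load below
   
   # Step 1: build a table from the uploaded CSV
   table = Arrow::Table.load("/tmp/upload.csv")
   
   # Steps 2 and 3: the Parquet file pulled down from S3 into a temporary file
   s3_existing_table = Arrow::Table.load("/tmp/existing.parquet")
   
   # Comparing the schemas here is where the timestamp[ms] vs timestamp[s]
   # difference shows up
   puts table.schema
   puts s3_existing_table.schema
   
   # Step 4: the merge that raises the "schema at index 1 was different" error
   consolidated_table = s3_existing_table.concatenate([table])
   ```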
   
   The only schema differences I can spot via diffchecker are:
   ```
   History Completed Time: timestamp[ms]
   History Created Time: timestamp[ms]
   vs
   History Completed Time: timestamp[s]
   History Created Time: timestamp[s]
   ```
   (The rest of the schema is omitted since it is identical.)
   Apologies if this goes beyond the scope of what you know, but do you think some schema change could be happening when Arrow tables are saved as Parquet files?
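   
   For reference, a round trip like the sketch below (again with placeholder paths) should show whether the save/load step alone changes the timestamp unit:
   
   ```ruby
   require "arrow"
   require "parquet"
   
   # Load the same CSV the pipeline starts from and note its schema,
   # then round trip it through Parquet the same way the pipeline does.
   csv_table = Arrow::Table.load("/tmp/upload.csv")
   puts csv_table.schema   # note the unit on the timestamp columns here
   
   csv_table.save("/tmp/roundtrip.parquet")
   reloaded = Arrow::Table.load("/tmp/roundtrip.parquet")
   puts reloaded.schema    # does the timestamp unit change after the round trip?
   ```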

