You can add the new data to an existing Parquet file as a new row group. A Parquet file can't be extended in place once it has been closed, so in practice this means rewriting it:
A. Open a new Parquet file for writing.
B. For each existing row group in the old file, write that row group to the new file. This step is usually fast even for large Parquet files, although each row group still has to be read and rewritten.
C. Take the data you want to add and write it as a new row group in the new file.
D. Close the new Parquet file.

https://arrow.apache.org/docs/python/parquet.html#finer-grained-reading-and-writing

-----Original Message-----
From: drin (via GitHub)
Sent: Thursday, November 13, 2025 9:24 AM
Subject: Re: [D] Safe way to periodically add arrow RecordBatch to a file [arrow]

GitHub user drin added a comment to the discussion: Safe way to periodically add arrow RecordBatch to a file

You can make epoch a column of your record batch, or put some index-like structure in the schema metadata that you can use to identify the record batch index from the epoch.

GitHub link: https://github.com/apache/arrow/discussions/48124#discussioncomment-14961205
