You can add the new data to an existing Parquet file as a new row group. 
Parquet files cannot be extended in place, so in practice you copy the file:

A. Open a new Parquet file for writing.
B. For each existing row group in the old file, write the row group to the new 
file. This copy is usually quick, although it reads and rewrites every row 
group, so its cost grows with the size of the file.
C. Take the data you want to add and write it as a new row group to the new 
file.
D. Close the new Parquet file and, if needed, swap it in place of the old one.

https://arrow.apache.org/docs/python/parquet.html#finer-grained-reading-and-writing

-----Original Message-----
From: drin (via GitHub) <[email protected]>
Sent: Thursday, November 13, 2025 9:24 AM
To: [email protected]
Subject: Re: [D] Safe way to periodically add arrow RecordBatch to a file 
[arrow]


GitHub user drin added a comment to the discussion: Safe way to periodically 
add arrow RecordBatch to a file

you can make epoch a column of your record batch or put some index-like 
structure in the schema metadata that you can use to identify record batch 
index from epoch.

GitHub link: 
https://github.com/apache/arrow/discussions/48124#discussioncomment-14961205
