On Fri, Jul 5, 2024 at 5:08 AM sud <suds1...@gmail.com> wrote:

> Hello all,
>
> It's a Postgres database. We have the option of getting files in CSV and/or
> Avro format messages from another system to load into our Postgres
> database. The volume will be 300 million messages per day across many files
> in batches.
>
> My question is, which format should we choose for faster data
> loading performance?
>
What application will be loading the data? If psql, then go with CSV; COPY is *really* efficient. If the PG tables are already mapped to the Avro format, then maybe Avro will be faster.

> And are there any other aspects that should be considered apart from just
> loading performance?
>

If all the data comes in at night, drop as many indices as possible before loading. Load each file in as few DB connections as possible: the most efficient binary format won't do you any good if you open and close a connection for each and every row.
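
To make the above concrete, here is a minimal sketch (not a tested production loader) that generates one psql script: it drops the indexes, runs one \copy per CSV file, then rebuilds the indexes, so everything goes through the single connection psql holds open. The function name, table name, file names, and index definition are all hypothetical, and it assumes the CSV files have a header row and that file and index names need no extra quoting.

```python
from typing import Dict, List


def build_load_script(table: str, csv_files: List[str], indexes: Dict[str, str]) -> str:
    """Return a psql script that loads every CSV file in one session.

    `indexes` maps an index name to the CREATE INDEX statement that
    rebuilds it. All names passed in are placeholders for illustration.
    """
    lines = []
    # Drop secondary indexes first so the load does not have to maintain them.
    for name in indexes:
        lines.append(f"DROP INDEX IF EXISTS {name};")
    # One \copy per file; psql runs them all over the same connection.
    # HEADER true assumes each CSV file starts with a header row.
    for path in csv_files:
        lines.append(f"\\copy {table} FROM '{path}' WITH (FORMAT csv, HEADER true)")
    # Rebuild each index once, after all rows are loaded.
    for ddl in indexes.values():
        lines.append(ddl.rstrip(";") + ";")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    script = build_load_script(
        table="messages",
        csv_files=["batch_001.csv", "batch_002.csv"],
        indexes={
            "messages_created_at_idx":
                "CREATE INDEX messages_created_at_idx ON messages (created_at)",
        },
    )
    print(script, end="")
```

Save the output to a file such as load.sql and run it with `psql -d yourdb -f load.sql` (database name is a placeholder). Whether dropping and rebuilding indexes pays off depends on table size and how much data arrives per batch, so measure before adopting it.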