On Mon, Apr 9, 2018 at 12:01 PM, Steve Atkins <st...@blighty.com> wrote:
> On Apr 9, 2018, at 8:49 AM, Mark Moellering
> <markmoellering@psyberation.com> wrote:
> >
> > Everyone,
> >
> > We are trying to architect a new system, which will have to take several
> > large data streams (a total of ~200,000 parsed files per second) and
> > place them in a database. I am trying to figure out the best way to
> > import that sort of data into Postgres.
> >
> > I keep thinking I can't be the first to have this problem and that there
> > are common solutions, but I can't find any. Does anyone know of some
> > sort of method, third-party program, etc., that can accept data from a
> > number of different sources and push it into Postgres as fast as
> > possible?
>
> Take a look at http://ossc-db.github.io/pg_bulkload/index.html. Check the
> benchmarks for different situations compared to COPY.
>
> Depending on what you're doing, using custom code to parse your data and
> then do multiple binary COPYs in parallel may be better.
>
> Cheers,
>   Steve

(fighting Google slightly to keep from top-posting...)

Thanks! How long can you run COPY? I have been looking at it more closely.
In some ways, it would be simple just to take data from stdin and send it
to Postgres, but can I do that literally 24/7? I am monitoring data feeds
that will never stop, and I don't know if that is how COPY is meant to be
used, or if I have to let it finish and start another one at some point.

Thanks for everyone's help and input!

Mark Moellering
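
To make the "let it finish and start another one" idea concrete: COPY runs
inside a transaction, so the usual pattern for a never-ending feed is not
one endless COPY but a loop of shorter COPYs, each committed as a batch.
A minimal sketch in Python with psycopg2, assuming a line-oriented feed on
stdin already in COPY text format; the table name (events), DSN, and batch
size below are placeholder assumptions, not anything from the thread:

    import io
    import sys
    import psycopg2

    BATCH_ROWS = 50_000  # hypothetical batch size; tune for your feed

    conn = psycopg2.connect("dbname=feeds")  # placeholder DSN

    def copy_batch(lines):
        # One COPY per batch; committing here makes the batch durable
        # before the next COPY starts, so the loop can run indefinitely.
        buf = io.StringIO("".join(lines))
        with conn.cursor() as cur:
            cur.copy_expert("COPY events (payload) FROM STDIN", buf)
        conn.commit()

    batch = []
    for line in sys.stdin:  # stand-in for the real data feed
        batch.append(line)
        if len(batch) >= BATCH_ROWS:
            copy_batch(batch)
            batch = []
    if batch:  # flush the tail on shutdown
        copy_batch(batch)

Batching this way also bounds how much work is lost or retried if the
connection drops mid-stream, which a single 24/7 COPY would not.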