On Mon, Apr 9, 2018 at 12:01 PM, Steve Atkins <st...@blighty.com> wrote:

>
> > On Apr 9, 2018, at 8:49 AM, Mark Moellering <markmoellering@psyberation.com> wrote:
> >
> > Everyone,
> >
> > We are trying to architect a new system, which will have to take several
> large datastreams (total of ~200,000 parsed files per second) and place
> them in a database.  I am trying to figure out the best way to import that
> sort of data into Postgres.
> >
> > I keep thinking I can't be the first to have this problem and there are
> > common solutions, but I can't find any.  Does anyone know of some sort of
> > method, third-party program, etc., that can accept data from a number of
> > different sources and push it into Postgres as fast as possible?
>
> Take a look at http://ossc-db.github.io/pg_bulkload/index.html. Check the
> benchmarks for different situations compared to COPY.
>
> Depending on what you're doing using custom code to parse your data and
> then do multiple binary COPYs in parallel may be better.
>
> Cheers,
>   Steve
>
>
>
(fighting google slightly to keep from top-posting...)

Thanks!

How long can you run COPY?  I have been looking at it more closely.  In
some ways, it would be simple just to take data from stdin and send it to
Postgres, but can I do that literally 24/7?  I am monitoring data feeds that
will never stop, and I don't know whether that is how COPY is meant to be
used, or whether I have to let it finish and start another one at some point.
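For reference, here is the rough shape of what I have been sketching. Python
with psycopg2 is only an assumed client library, and my_table, col_a..col_c,
and BATCH_SIZE are placeholder names: rather than one never-ending COPY, the
idea is to buffer the incoming feed and issue a short COPY per batch so each
statement actually finishes and commits.

import io
import psycopg2

BATCH_SIZE = 10000  # rows per COPY; placeholder value

def load_feed(conn, feed):
    """feed is any iterable yielding already-parsed (a, b, c) tuples."""
    buf = io.StringIO()
    count = 0
    with conn.cursor() as cur:
        for row in feed:
            # COPY text format: tab-separated columns, one row per line
            buf.write("\t".join(str(v) for v in row) + "\n")
            count += 1
            if count >= BATCH_SIZE:
                buf.seek(0)
                cur.copy_expert(
                    "COPY my_table (col_a, col_b, col_c) FROM STDIN", buf)
                conn.commit()  # each batch is its own short transaction
                buf, count = io.StringIO(), 0
        if count:  # flush the final partial batch if the feed ever ends
            buf.seek(0)
            cur.copy_expert(
                "COPY my_table (col_a, col_b, col_c) FROM STDIN", buf)
            conn.commit()

Several of these loaders could run in parallel on separate connections, one
per feed, which I think is in the spirit of the parallel-COPY suggestion.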

Thanks for everyone's help and input!

Mark Moellering
