On Tuesday 26 February 2008, Joshua D. Drake wrote:
> > Think 100GB+ of data that's in a CSV or delimited file. Right now
> > the best import path is with COPY, but it won't execute very fast as
> > a single process. Splitting the file manually will take a long time
> > (time that could be spent loading instead) and substantially increase
> > disk usage, so the ideal approach would figure out how to load in
> > parallel across all available CPUs against that single file.
>
> You mean load from position? That would be very, very cool.
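A minimal sketch of the "load from position" idea (not pgloader's code, just an
illustration under assumptions): each worker takes a byte range of the same file,
snaps its boundaries to line starts, and runs its own COPY ... FROM STDIN over that
slice. The connection string, table name, and file path are placeholders, and rows
are assumed to contain no embedded newlines.

import os
from multiprocessing import Process

import psycopg2

DSN = "dbname=test"            # placeholder connection string
TABLE = "big_table"            # placeholder target table
PATH = "/tmp/big.csv"          # placeholder input file
WORKERS = 4

class RangeReader:
    """File-like wrapper exposing read() over bytes [begin, stop) of f."""
    def __init__(self, f, begin, stop):
        self.f = f
        self.remaining = stop - begin
        f.seek(begin)

    def read(self, size=-1):
        if self.remaining <= 0:
            return b""
        if size < 0 or size > self.remaining:
            size = self.remaining
        chunk = self.f.read(size)
        self.remaining -= len(chunk)
        return chunk

def snap(f, pos):
    """Return the first byte offset >= pos that starts a line."""
    if pos == 0:
        return 0
    f.seek(pos - 1)
    f.readline()               # consume the (possibly partial) line
    return f.tell()

def load_range(start, end):
    """COPY the lines whose first byte lies in [start, end) into TABLE."""
    with open(PATH, "rb") as f:
        begin, stop = snap(f, start), snap(f, end)
        conn = psycopg2.connect(DSN)
        with conn, conn.cursor() as cur:
            cur.copy_expert("COPY %s FROM STDIN WITH CSV" % TABLE,
                            RangeReader(f, begin, stop))
        conn.close()

if __name__ == "__main__":
    size = os.path.getsize(PATH)
    cuts = [i * size // WORKERS for i in range(WORKERS)] + [size]
    procs = [Process(target=load_range, args=(cuts[i], cuts[i + 1]))
             for i in range(WORKERS)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

Because every worker snaps both ends of its range with the same rule, no line is
loaded twice and none is skipped, and the file never has to be physically split on
disk.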
Did I mention pgloader now does exactly this when configured like this:
  http://pgloader.projects.postgresql.org/dev/pgloader.1.html#_parallel_loading

  section_threads = N
  split_file_reading = True

IIRC, Simon and Greg Smith asked for pgloader to get those parallel loading
features in order to provide some initial results and ideas about the
performance gain, as a first step in designing the parallel COPY backend
implementation.

Hope this helps,
--
dim
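For context, a configuration using those two directives might look roughly like
the sketch below, written from memory of the pgloader 2.x INI format. Only
section_threads and split_file_reading come from the documentation linked above;
the section name, connection settings, table, file path, and column mapping are
illustrative placeholders.

  [pgsql]
  host = localhost
  port = 5432
  base = mydb
  user = loader
  pass = secret

  [bulk_csv]
  table              = big_table
  filename           = /tmp/big.csv
  format             = csv
  field_sep          = ,
  columns            = id:1, payload:2
  section_threads    = 4
  split_file_reading = True

With split_file_reading enabled, each of the section_threads workers reads a
different chunk of the same input file, which is the single-file parallel load
described in the quoted message.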