Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Florian G. Pflug
Tom Lane wrote: "Florian G. Pflug" <[EMAIL PROTECTED]> writes: Plus, I'd see this as a kind of testbed for gently introducing parallelism into postgres backends (especially thinking about sorting here). This thinking is exactly what makes me scream loudly and run in the other direction. I do

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Tom Lane
"Florian G. Pflug" <[EMAIL PROTECTED]> writes: > Plus, I'd see this as a kind of testbed for gently introducing > parallelism into postgres backends (especially thinking about sorting > here). This thinking is exactly what makes me scream loudly and run in the other direction. I don't want thre

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Florian G. Pflug
Andrew Dunstan wrote: Florian G. Pflug wrote: Would it be possible to determine when the copy is starting that this case holds, and not use the parallel parsing idea in those cases? In theory, yes. In practice, I don't want to be the one who has to answer to an angry user who just suffered a m

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Brian Hurt
Andrew Dunstan wrote: Florian G. Pflug wrote: Would it be possible to determine when the copy is starting that this case holds, and not use the parallel parsing idea in those cases? In theory, yes. In practice, I don't want to be the one who has to answer to an angry user who just suffe

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Andrew Dunstan
Florian G. Pflug wrote: Would it be possible to determine when the copy is starting that this case holds, and not use the parallel parsing idea in those cases? In theory, yes. In practice, I don't want to be the one who has to answer to an angry user who just suffered a major drop in COPY

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Yeah, but it wouldn't take advantage of, say, the hack to disable WAL > when the table was created in the same transaction. In the context of a parallelizing pg_restore this would be fairly easy to get around. We could probably teach the thing to combi

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Tom Dunstan
On Wed, Feb 27, 2008 at 9:26 PM, Florian G. Pflug <[EMAIL PROTECTED]> wrote: > I was thinking more along the line of letting a datatype specify a > function "void* ioprepare(typmod)" which returns some opaque object > specifying all that the input and output function needs to know. > We could t

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Florian G. Pflug
Brian Hurt wrote: Tom Lane wrote: "Florian G. Pflug" <[EMAIL PROTECTED]> writes: ... Neither the "dealer", nor the "workers" would need access to either the shared memory or the disk, thereby not messing with the "one backend is one transaction is one session" dogma. ... Unfortunately, thi

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Heikki Linnakangas
A.M. wrote: On Feb 27, 2008, at 9:11 AM, Florian G. Pflug wrote: Dimitri Fontaine wrote: Of course, the backends still have to parse the input given by pgloader, which only pre-processes data. I'm not sure having the client prepare the data some more (binary format or whatever) is a wise id

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Alvaro Herrera
A.M. wrote: > > On Feb 27, 2008, at 9:11 AM, Florian G. Pflug wrote: >> The reason that I'd love some within-one-backend solution is that it'd >> allow you to utilize more than one CPU for a restore within a *single* >> transaction. This is something that a client-side solution won't be >> able

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Brian Hurt
Tom Lane wrote: "Florian G. Pflug" <[EMAIL PROTECTED]> writes: ... Neither the "dealer", nor the "workers" would need access to either the shared memory or the disk, thereby not messing with the "one backend is one transaction is one session" dogma. ... Unfortunately, this idea ha

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread A.M.
On Feb 27, 2008, at 9:11 AM, Florian G. Pflug wrote: Dimitri Fontaine wrote: Of course, the backends still have to parse the input given by pgloader, which only pre-processes data. I'm not sure having the client prepare the data some more (binary format or whatever) is a wise idea, as you

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Florian G. Pflug
Dimitri Fontaine wrote: Of course, the backends still have to parse the input given by pgloader, which only pre-processes data. I'm not sure having the client prepare the data some more (binary format or whatever) is a wise idea, as you mentioned and wrt Tom's follow-up. But maybe I'm all wron

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Florian G. Pflug
Tom Lane wrote: "Florian G. Pflug" <[EMAIL PROTECTED]> writes: ... Neither the "dealer", nor the "workers" would need access to either the shared memory or the disk, thereby not messing with the "one backend is one transaction is one session" dogma. ... Unfortunately, this idea has far too

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Simon Riggs
On Wed, 2008-02-27 at 09:09 +0100, Dimitri Fontaine wrote: > Hi, > > On Wednesday 27 February 2008, Florian G. Pflug wrote: > > Upon reception of a COPY INTO command, a backend would > > .) Fork off a "dealer" and N "worker" processes that take over the > > client connection. The "dealer" distrib

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-27 Thread Dimitri Fontaine
Hi, On Wednesday 27 February 2008, Florian G. Pflug wrote: > Upon reception of a COPY INTO command, a backend would > .) Fork off a "dealer" and N "worker" processes that take over the > client connection. The "dealer" distributes lines received from the > client to the N workers, while the origin

Re: [HACKERS] An idea for parallelizing COPY within one backend

2008-02-26 Thread Tom Lane
"Florian G. Pflug" <[EMAIL PROTECTED]> writes: > ... > Neither the "dealer", nor the "workers" would need access to either > the shared memory or the disk, thereby not messing with the "one backend > is one transaction is one session" dogma. > ... Unfortunately, this idea has far too narrow a

[HACKERS] An idea for parallelizing COPY within one backend

2008-02-26 Thread Florian G. Pflug
As far as I can see the main difficulty in making COPY run faster (on the server) is that pretty involved conversion from plain-text lines into tuples. Trying to get rid of this conversion by having the client send something that resembles the data stored in on-disk tuples is not a good answer, ei
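
The dealer/worker split quoted in the replies above (a "dealer" distributing lines received from the client to N "workers" that do the text-to-tuple conversion) might be sketched roughly as follows. This is only an illustration, not PostgreSQL code: the names `deal`, `parse_line` and `parallel_parse` are made up for this sketch, the input is assumed to be tab-separated integers, and a thread pool stands in for the forked worker processes of the proposal (so in CPython it shows the data flow, not a real multi-CPU speedup).

```python
from concurrent.futures import ThreadPoolExecutor


def parse_line(line):
    """Worker step (hypothetical): convert one tab-separated line into a tuple of ints."""
    return tuple(int(field) for field in line.rstrip("\n").split("\t"))


def deal(lines, n_workers):
    """Dealer step (hypothetical): hand out lines to n_workers batches, round-robin."""
    return [lines[i::n_workers] for i in range(n_workers)]


def parallel_parse(lines, n_workers=4):
    """Parse all lines using n_workers workers and return rows in input order."""
    batches = deal(lines, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Threads are a stand-in for the forked worker processes (assumption).
        # pool.map yields results in the same order as the batches.
        parsed = list(pool.map(lambda batch: [parse_line(l) for l in batch], batches))
    # Undo the round-robin split so the rows come back in their original order.
    rows = [None] * len(lines)
    for i, batch in enumerate(parsed):
        rows[i::n_workers] = batch
    return rows
```

For example, `parallel_parse(["1\t2\n", "3\t4\n", "5\t6\n"], n_workers=2)` deals lines 1 and 3 to the first worker and line 2 to the second, then reassembles the parsed tuples in input order. Whether preserving input order matters for COPY is not settled by the snippets above; the sketch keeps it only to make the result easy to check.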