Andrew Dunstan wrote:
> Florian G. Pflug wrote:
>>> Would it be possible to determine, when the copy is starting, that this
>>> case holds, and not use the parallel parsing idea in those cases?
>>
>> In theory, yes. In practice, I don't want to be the one who has to
>> answer to an angry user who just suffered a major drop in COPY
>> performance after adding an ENUM column to his table.
>
> I have yet to be convinced that this is even theoretically a good path to
> follow. Any sufficiently large table could probably be partitioned, and
> then we could use the parallelism that is being discussed for pg_restore
> without any modification to the backend at all. Similar tricks could be
> played by an external bulk loader for third-party data sources.
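The client-driven approach described above can be sketched roughly as follows. This is a minimal illustration, not pg_restore's actual logic: the partition names, the `load_partition` stub, and the row count standing in for a real COPY are all assumptions.

```python
# Sketch of client-side parallel loading: one concurrent COPY-like job
# per partition, with no backend changes. A real loader would issue
# "COPY <partition> FROM STDIN" over its own libpq connection; here a
# row count stands in for the actual COPY call.
from concurrent.futures import ThreadPoolExecutor

def load_partition(partition, rows):
    # Stand-in for the COPY call: report how many rows this worker
    # would have streamed into the given partition.
    return partition, len(rows)

def parallel_load(partitioned_data, workers=4):
    # Each worker loads one partition concurrently; the backend only
    # ever sees ordinary serial COPY commands on separate connections.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(load_partition, part, rows)
                   for part, rows in partitioned_data.items()]
        return dict(f.result() for f in futures)
```

Since COPY is I/O-bound from the client's perspective, even a thread pool over separate connections is enough to keep several backends busy at once.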
That assumes that some specific bulk loader like pg_restore, pgloader,
or similar is used to perform the load. Plain libpq users would either
need to duplicate the logic these loaders contain, or wouldn't be able
to take advantage of fast loads.
Plus, I'd see this as a kind of testbed for gently introducing
parallelism into postgres backends (especially thinking about sorting
here). CPUs gain more and more cores, so in the long run I fear that we
will have to find ways to utilize more than one of those to execute a
single query.
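As one hedged illustration of the sorting case: a single logical sort can be split into per-core runs that are merged afterwards. The chunk sizing and the use of a process pool below are assumptions made for the sketch, not a proposal for how a backend would actually implement a parallel sort.

```python
# Sketch of a single logical sort spread across cores: chunks are
# sorted independently (potentially on separate CPUs) and the sorted
# runs are merged, much as an external sort merges its runs.
from heapq import merge

def chunked(seq, n):
    # Split seq into n roughly equal runs.
    size = max(1, (len(seq) + n - 1) // n)
    return [seq[i:i + size] for i in range(0, len(seq), size)]

def parallel_sort(seq, n=4, map_fn=map):
    # map_fn can be a process pool's .map to sort runs on separate
    # cores; the builtin map gives the same result serially.
    runs = map_fn(sorted, chunked(seq, n))
    return list(merge(*runs))

if __name__ == "__main__":
    from concurrent.futures import ProcessPoolExecutor
    with ProcessPoolExecutor() as pool:
        print(parallel_sort([5, 3, 8, 1, 9, 2, 7], n=2, map_fn=pool.map))
        # prints [1, 2, 3, 5, 7, 8, 9]
```

The merge step is sequential here; the point is only that the expensive part (sorting the runs) parallelises cleanly once the input is split.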
But of course the architectural details need to be sorted out before any
credible judgement about the feasibility of this idea can be made...
regards, Florian Pflug