Andrew Dunstan wrote:
Florian G. Pflug wrote:
Would it be possible to determine when the copy is starting that this case holds, and not use the parallel parsing idea in those cases?

In theory, yes. In pratice, I don't want to be the one who has to answer to an angry user who just suffered a major drop in COPY performance after adding an ENUM column to his table.

I am yet to be convinced that this is even theoretically a good path to follow. Any sufficiently large table could probably be partitioned and then we could use the parallelism that is being discussed for pg_restore without any modification to the backend at all. Similar tricks could be played by an external bulk loader for third party data sources.

That assumes that some specific bulkloader like pg_restore, pgloader
or similar is used to perform the load. Plain libpq-users would either need to duplicate the logic these loaders contain, or wouldn't be able to take advantage of fast loads.

Plus, I'd see this as a kind of testbed for gently introducing parallelism into postgres backends (especially thinking about sorting here). CPU gain more and more cores, so in the long run I fear that we will have to find ways to utilize more than one of those to execute a single query.

But of course the architectural details need to be sorted out before any credible judgement about the feasability of this idea can be made...

regards, Florian Pflug


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

              http://www.postgresql.org/docs/faq

Reply via email to