Jeff Davis wrote:
All I mean is that the second argument to
COPY should produce/consume bytes and not records. I'm not discussing
the internal implementation at all, only semantics.

In other words, STDIN is not a source of records, it's a source of
bytes; and likewise for STDOUT.

For the read case, I'm not sure it's so black and white. The current situation does map better to a function that produces a stream of bytes, but that's not necessarily the best approach for every situation. It's easy to imagine a function meant to accelerate bulk loading that internally produces a stream of already-processed records. A good example would be a function that reads from another database system in order to convert its data into PostgreSQL. If those records were loaded by a fairly direct path, that would happen at a much higher rate than if one had to convert them back into a stream of bytes with delimiters and then re-parse it.

I think there's a very valid use case for both approaches. Maybe it just turns into an option, so you can get a faster loading path one record at a time or just produce a stream of characters, depending on which one your data source maps to better. Something like this:

COPY target FROM FUNCTION foo() WITH RECORDS;
COPY target FROM FUNCTION foo() WITH BYTES;


That would seem to cover both situations. I'd think the WITH BYTES case would just do some basic parsing and then pass the result through the same code path as WITH RECORDS, so having both available shouldn't increase the size of the implementation that much.
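
To make that layering a bit more concrete, here is a rough sketch in Python (not PostgreSQL code, and not a proposal for the actual implementation). The names load_records, parse_bytes and load_bytes are hypothetical, and the byte format is assumed to be tab-delimited, newline-terminated text with no escaping or NULL handling, just to keep it short:

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, ...]


def load_records(records: Iterable[Record], target: List[Record]) -> int:
    """Hypothetical WITH RECORDS path: rows arrive already split into fields."""
    count = 0
    for rec in records:
        target.append(tuple(rec))
        count += 1
    return count


def parse_bytes(chunks: Iterable[bytes], delimiter: bytes = b"\t") -> Iterator[Record]:
    """Basic parsing for the hypothetical WITH BYTES path.

    Chunks may split a line anywhere, so an unfinished line is kept in a
    buffer until its newline shows up. Assumes no escaping in the data.
    """
    buf = b""
    for chunk in chunks:
        buf += chunk
        # Everything before the last newline is complete; the rest waits.
        *lines, buf = buf.split(b"\n")
        for line in lines:
            yield tuple(field.decode("utf-8") for field in line.split(delimiter))
    if buf:
        # Final line without a trailing newline.
        yield tuple(field.decode("utf-8") for field in buf.split(delimiter))


def load_bytes(chunks: Iterable[bytes], target: List[Record]) -> int:
    """WITH BYTES is just parsing stacked on top of the record path."""
    return load_records(parse_bytes(chunks), target)
```

A function that already has records in hand would call load_records directly and skip the parsing step. A function that only has raw bytes goes through load_bytes, which ends up in the same loader.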

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers