Hi,
The documentation for pg_getcopydata in DBD::Pg does not mention that
data is always returned exactly one row at a time.
The PostgreSQL function PQgetCopyData, which pg_getcopydata calls,
makes this guarantee:
"Data is always returned one data row at a time; if only a partial
row is available, it is not returned."
from:
http://www.postgresql.org/docs/8.2/static/libpq-copy.html
This guarantee is very helpful when considering how to un-escape and
handle the data. Since each data chunk is a single row, I don't have
to worry about escape sequences that span data chunks, or rows that
span data chunks. Realizing this made the code much simpler.
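To illustrate, here is a minimal sketch of the kind of per-row handling
this allows. It is not the code I actually use, and parse_copy_row is a
hypothetical helper name. It assumes the default COPY text format (tab
delimiter, \N for NULL, newline-terminated rows) and ignores character
encoding:

```perl
use strict;
use warnings;

# Single-character backslash escapes used by the COPY text format.
my %ESC = (b => "\b", f => "\f", n => "\n", r => "\r", t => "\t", v => "\x0B");

# Hypothetical helper: split ONE complete COPY row into fields.
# Assumes default tab delimiter and default NULL marker (\N -> undef).
sub parse_copy_row {
    my ($row) = @_;
    $row =~ s/\n\z//;    # drop the row terminator
    # Literal tabs only ever appear as delimiters; -1 keeps trailing
    # empty fields. split returns nothing for '', so handle that case.
    my @fields = length($row) ? split(/\t/, $row, -1) : ('');
    for my $f (@fields) {
        if ($f eq '\\N') { $f = undef; next; }    # NULL marker
        # Because the chunk is a whole row, no escape can be cut in half.
        $f =~ s{\\([0-7]{1,3}|x[0-9A-Fa-f]{1,2}|.)}{
            my $e = $1;
              $e =~ /\A[0-7]+\z/ ? chr(oct($e))     # \101  octal
            : $e =~ /\Ax(.+)\z/s ? chr(hex($1))     # \x41  hex
            : exists $ESC{$e}    ? $ESC{$e}         # \t, \n, ...
            :                      $e;              # \\ and anything else
        }gse;
    }
    return @fields;
}

# Intended use (needs a live handle, so shown as comments only):
#   $dbh->do("COPY mytable TO STDOUT");
#   my $row = '';
#   while ($dbh->pg_getcopydata($row) > 0) {
#       my @fields = parse_copy_row($row);
#   }
```

Without the one-row guarantee, the caller would have to buffer chunks and
look for row boundaries before any of this could run.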
Here is a suggested patch:
>>>
--- DBD-Pg-2.6.1.orig/Pg.pm 2008-04-22 12:58:32.000000000 -0500
+++ DBD-Pg-2.6.1/Pg.pm 2008-04-30 05:33:21.000000000 -0500
@@ -3592,33 +3592,34 @@
deprecated in favor of the pg_getcopydata, pg_putcopydata, and
pg_putcopyend methods.
=over 4
=item B<pg_getcopydata>
Used to retrieve data from a table after the server has been put into
COPY OUT
-mode by calling "COPY tablename TO STDOUT". The first argument to
pg_getcopydata
+mode by calling "COPY tablename TO STDOUT". Data is always returned
one data row at a time. The first argument to pg_getcopydata
is the variable into which the data will be stored (this variable
should not
be undefined, or it may throw a warning, although it may be a
reference). This
argument returns a number greater than 1 indicating the new size of
the variable,
or a -1 when the COPY has finished. Once a -1 has been returned, no
other action is
necessary, as COPY mode will have already terminated. Example:
$dbh->do("COPY mytable TO STDOUT");
my @data;
my $x=0;
1 while $dbh->pg_getcopydata($data[$x++]) > 0;
There is also a variation of this function called
pg_getcopydata_async, which,
as the name suggests, returns immediately. The only difference from
the original
function is that this version may return a 0, indicating that the row
is not
ready to be delivered yet. When this happens, the variable has not
been changed,
and you will need to call the function again until you get a non-zero
result.
+(Data is still always returned one data row at a time.)
=item B<pg_putcopydata>
Used to put data into a table after the server has been put into COPY
IN mode
by calling "COPY tablename FROM STDIN". The only argument is the data
you want
inserted. Issue a pg_putcopyend() when you have added all your rows.
The default delimiter is a tab character, but this can be changed in
<<<
Thanks!
David