"David E. Wheeler" <da...@justatheory.com> writes:
> I’ve discovered something a bit disturbing at $work. We’re migrating (slowly) 
> from Oracle to PostgreSQL, and in some cases are using oracle_fdw to copy 
> data over. Alas, there are a fair number of text values in the Oracle 
> database that, although the database is UTF-8, are actually something else 
> (CP1252 or Latin1). When we copy from an oracle_fdw foreign table into a 
> PostgreSQL table, PostgreSQL does not complain, but ends up storing the 
> mis-encoded strings, even though the database is UTF-8.

> I assume that this is because the foreign table, as a table, is assumed by 
> the system to have valid data, and therefore additional character encoding 
> validation is skipped, yes?

Probably not so much "assumed" as "nobody thought about it".  In
e.g. plperl we expend the cycles to do encoding validity checking on
*every* string entering the system from Perl.  I'm not sure why foreign
tables ought to get a pass on that, especially when you consider the
communication overhead that the encoding check would be amortized
against.

Now, having said that, I think it has to be the responsibility of the FDW
to apply any required check ... which makes this a bug report against
oracle_fdw, not the core system.  (FWIW, contrib/file_fdw depends on the
COPY code, which will check encoding.)
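[Editor's note: to illustrate what the missing check would catch, here is a
minimal sketch in Python (not oracle_fdw or backend code). In the server the
validation is done in C by pg_verifymbstr() against the database encoding;
the helper below only mimics that idea for the UTF-8 case. CP1252 "smart
quotes" encode as the single bytes 0x93/0x94, which are not valid UTF-8, so
a validity check rejects exactly the kind of mis-encoded Oracle data
described above.]

```python
# Sketch of an encoding-validity check, as pg_verifymbstr() would apply it
# for a UTF-8 database. Not actual FDW code; for illustration only.

def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is well-formed UTF-8, else False."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# CP1252 smart quotes become bytes 0x93 and 0x94 -- invalid as UTF-8.
cp1252_bytes = "\u201cquoted\u201d".encode("cp1252")  # b'\x93quoted\x94'
utf8_bytes = "\u201cquoted\u201d".encode("utf-8")

print(is_valid_utf8(cp1252_bytes))  # False: a checking path would reject this
print(is_valid_utf8(utf8_bytes))    # True: genuine UTF-8 passes
```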

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers