On Mar 23, 12:03 pm, Jason  Adams <[email protected]> wrote:
> I have been having a problem where we're using Sequel to migrate from
> a mysql database (utf8) to postgresql (utf8) and line breaks (\r\n) on
> the mysql side were appearing as \015\012 in the postgres side.  I
> thought it was a character encoding issue, but eventually tracked it
> down to the fact that Sequel thinks a mysql text column is a blob.
> When it is inserted into postgres, literal_blob is called on the
> column, which converts \r\n into \015\012.
>
> I'm guessing this is not the intended behavior, but before I tried to
> dig around in the code (which I am completely new to), I wanted to
> make sure there wasn't some deeper reason that MySQL text columns are
> being treated as blobs.  It's quite possible I'm doing something
> wrong...  Any ideas?
>
> To reproduce the situation:
>
> in mysql:
> create table mytest1 (comments text);
>
> in postgres:
> create table mytest1 (comments text);
>
> in ruby (assuming you have mysql_db and pg_db Sequel connections to
> each db):
> mysql_db[:mytest1].insert "1\r\n2\r\n3\r\n"  # correctly inserted in
> mysql db
> pg_db[:mytest1].insert mysql_db[:mytest1].first[:comments] #
> incorrectly inserted in pg db
>
> It works correctly if instead you do:
> pg_db[:mytest1].insert mysql_db[:mytest1].first[:comments].to_s

MySQL treats text and blob columns identically.  The type number used
for the text type and the type number used for the blob type are the
same.  Sequel converts blob columns to Sequel::SQL::Blob objects, and
since MySQL uses the same type number for text and blobs (a typical
stupid decision, IMO), that happens for text columns as well.  That's
half the issue.

The other half is on PostgreSQL the bytea and text columns aren't the
same (as it should be).  However, since you are using an SQL::Blob as
a value for a text column, it's going to literalize it as a blob, as
literalization depends on the type of the object, not the type of the
column (datasets don't know about the types of their columns).
PostgreSQL will automatically convert strings given to the bytea
columns, but if it is a text column, it will use the string as is (the
escaped bytea format).  Casting the literal to bytea doesn't fix
things, since the PostgreSQL doesn't unescape bytea values when
implicitly converting them to text (a mistake, IMO).

You have two real options.  You can use the work around you've already
discovered, or you can change the MySQL adapter to not return
Sequel::SQL::Blob objects for text columns:

  # untested, but should work if you are using the master branch
  [249, 250, 251, 252].each{|x| Sequel::MySQL::MYSQL_TYPES.delete(x)}

If you have other ideas or advice on how to deal with this, please let
me know.

Jeremy
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sequel-talk" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/sequel-talk?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to