On Mar 23, 7:45 pm, Jeremy Evans <[email protected]> wrote: > On Mar 23, 12:03 pm, Jason Adams <[email protected]> wrote: > > > > > I have been having a problem where we're using Sequel to migrate from > > a mysql database (utf8) to postgresql (utf8) and line breaks (\r\n) on > > the mysql side were appearing as \015\012 in the postgres side. I > > thought it was a character encoding issue, but eventually tracked it > > down to the fact that Sequel thinks a mysql text column is a blob. > > When it is inserted into postgres, literal_blob is called on the > > column, which converts \r\n into \015\012. > > > I'm guessing this is not the intended behavior, but before I tried to > > dig around in the code (which I am completely new to), I wanted to > > make sure there wasn't some deeper reason that MySQL text columns are > > being treated as blobs. It's quite possible I'm doing something > > wrong... Any ideas? > > > To reproduce the situation: > > > in mysql: > > create table mytest1 (comments text); > > > in postgres: > > create table mytest1 (comments text); > > > in ruby (assuming you have mysql_db and pg_db Sequel connections to > > each db): > > mysql_db[:mytest1].insert "1\r\n2\r\n3\r\n" # correctly inserted in > > mysql db > > pg_db[:mytest1].insert mysql_db[:mytest1].first[:comments] # > > incorrectly inserted in pg db > > > It works correctly if instead you do: > > pg_db[:mytest1].insert mysql_db[:mytest1].first[:comments].to_s > > MySQL treats text and blob columns identically. The type number used > for the text type and the type number used for the blob type are the > same. Sequel converts blob columns to Sequel::SQL::Blob objects, and > since MySQL uses the same type number for text and blobs (a typical > stupid decision, IMO), that happens for text columns as well. That's > half the issue. > > The other half is on PostgreSQL the bytea and text columns aren't the > same (as it should be). However, since you are using an SQL::Blob as > a value for a text column, it's going to literalize it as a blob, as > literalization depends on the type of the object, not the type of the > column (datasets don't know about the types of their columns). > PostgreSQL will automatically convert strings given to the bytea > columns, but if it is a text column, it will use the string as is (the > escaped bytea format). Casting the literal to bytea doesn't fix > things, since the PostgreSQL doesn't unescape bytea values when > implicitly converting them to text (a mistake, IMO). > > You have two real options. You can use the work around you've already > discovered, or you can change the MySQL adapter to not return > Sequel::SQL::Blob objects for text columns: > > # untested, but should work if you are using the master branch > [249, 250, 251, 252].each{|x| Sequel::MySQL::MYSQL_TYPES.delete(x)} > > If you have other ideas or advice on how to deal with this, please let > me know.
Hi Jeremy, Thanks for the help and clearing up the issue. Since I'm running this as a one-shot thing, I'll probably just stick to my workaround. Jason --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "sequel-talk" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/sequel-talk?hl=en -~----------~----~----~----~------~----~------~--~---
