On Friday, November 9, 2012 9:51:24 AM UTC-7, Jeremy Evans wrote: > > On Friday, November 9, 2012 1:03:04 AM UTC-8, Clint Pachl wrote: >> >> I am wondering what are the dis/advantages of storing ruby serialized >> objects (i.e. Marshal.dump) directly into a `bytea` column versus Base64 >> encoding the byte stream and storing into a `text` column? Also, are there >> any performance concerns with either method? I am using PostgreSQL to be >> specific. >> >> One of the advantages I found in base64 encoding first is that it doesn't >> screw up my terminal when Sequel logs to STDOUT. Maybe there is a way to >> escape or hide this binary data in the logger? However, it seems storing >> the serialized object directly, without encoding, would be more efficient. >> >> (this may make this post irrelevant) >> Finally, I was unable to reconstitute a ruby object after an >> insert/select. I keep getting the error from Marshal.dump, "data too >> short". It wasn't until I base64 encoded first that I got it to work. So >> maybe storing binary data directly doesn't work? I found a post from 2008, >> How >> do you insert binary data using sequel + >> postgresql?<https://groups.google.com/forum/#!searchin/sequel-talk/ruby$20binary/sequel-talk/4aRQGNPQ7Po/_hdJMmZ2Ki4J> >> >> >> Here's what I did: >> >> ds = DB[:core__checkout_snapshots] >> checkout_data = Marshal.dump(data) >> ds.insert(id: checkout_id, data: checkout_data) >> Marshal.load(ds[id: checkout_id][:data]) >> >> >> This failed with the `data` column as type `bytea` and `text`. >> > > When storing data in a bytea column, you need to mark it as a blob: > > ds.insert(id: checkout_id, data: Sequel.blob(checkout_data)) > > That may fix your issue. >
Yep, that fixed my issue! I knew the Postgres function I needed to escape the binary data (i.e. escape_bytea), but I just couldn't find the Sequel method to access it. I looked at the code, but too many abstraction layers for me to follow. I'm guessing it is in the C code of the Postgres driver. Personally, I don't think serialization of ruby objects into a database is > a good idea in most cases, but if you have to do it with Marshal, it's > probably better to store it in bytea instead of base64 encoded text. > I am glad you offered your opinion Jeremy. You made me think about what I am trying to do. I do indeed want to serialize my objects because they are short-lived snapshots of a collection of objects. Doing so just makes life easier. However, I decided that they don't belong in the database. I decided that I can more easily manage these snapshots using files. I would also agree that it is better to store a Marshal.dump directly in a bytea column. For example, upon quick comparison of my snapshots, the binary data is about about 35% smaller than the base64 encoded text. > About your logger issue, use a custom logger that escapes the output > instead of the default Logger class, that's unrelated to Sequel. > Escaping the binary data using Sequel.blob fixed the garbling of my terminal. I am in fact using the standard Logger, so no need for a custom logger. The Postgres escape_bytea function must make the data terminal-friendly. -- You received this message because you are subscribed to the Google Groups "sequel-talk" group. To view this discussion on the web visit https://groups.google.com/d/msg/sequel-talk/-/qRJxbbmVkfEJ. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/sequel-talk?hl=en.
