On Friday, November 9, 2012 9:51:24 AM UTC-7, Jeremy Evans wrote:
>
> On Friday, November 9, 2012 1:03:04 AM UTC-8, Clint Pachl wrote:
>>
>> I am wondering what are the dis/advantages of storing ruby serialized 
>> objects (i.e. Marshal.dump) directly into a `bytea` column versus Base64 
>> encoding the byte stream and storing into a `text` column? Also, are there 
>> any performance concerns with either method? I am using PostgreSQL to be 
>> specific.
>>
>> One of the advantages I found in base64 encoding first is that it doesn't 
>> screw up my terminal when Sequel logs to STDOUT. Maybe there is a way to 
>> escape or hide this binary data in the logger? However, it seems storing 
>> the serialized object directly, without encoding, would be more efficient.
>>
>> (this may make this post irrelevant)
>> Finally, I was unable to reconstitute a ruby object after an 
>> insert/select. I keep getting the error from Marshal.dump, "data too 
>> short". It wasn't until I base64 encoded first that I got it to work. So 
>> maybe storing binary data directly doesn't work? I found a post from 2008, 
>> How 
>> do you insert binary data using sequel + 
>> postgresql?<https://groups.google.com/forum/#!searchin/sequel-talk/ruby$20binary/sequel-talk/4aRQGNPQ7Po/_hdJMmZ2Ki4J>
>>  
>>
>> Here's what I did:
>>
>> ds = DB[:core__checkout_snapshots]
>> checkout_data = Marshal.dump(data)
>> ds.insert(id: checkout_id, data: checkout_data)
>> Marshal.load(ds[id: checkout_id][:data])
>>
>>
>> This failed with the `data` column as type `bytea` and `text`.
>>
>
> When storing data in a bytea column, you need to mark it as a blob:
>
>    ds.insert(id: checkout_id, data: Sequel.blob(checkout_data))
>
> That may fix your issue.
>

Yep, that fixed my issue!

I knew the Postgres function I needed to escape the binary data 
(i.e. escape_bytea), but I just couldn't find the Sequel method to access 
it. I looked at the code, but too many abstraction layers for me to follow. 
I'm guessing it is in the C code of the Postgres driver.

Personally, I don't think serialization of ruby objects into a database is 
> a good idea in most cases, but if you have to do it with Marshal, it's 
> probably better to store it in bytea instead of base64 encoded text.
>

I am glad you offered your opinion Jeremy. You made me think about what I 
am trying to do. I do indeed want to serialize my objects because they are 
short-lived snapshots of a collection of objects. Doing so just makes life 
easier. However, I decided that they don't belong in the database. I 
decided that I can more easily manage these snapshots using files.

I would also agree that it is better to store a Marshal.dump directly in a 
bytea column. For example, upon quick comparison of my snapshots, the 
binary data is about about 35% smaller than the base64 encoded text.



> About your logger issue, use a custom logger that escapes the output 
> instead of the default Logger class, that's unrelated to Sequel.
>

Escaping the binary data using Sequel.blob fixed the garbling of my 
terminal. I am in fact using the standard Logger, so no need for a custom 
logger. The Postgres escape_bytea function must make the data 
terminal-friendly.

-- 
You received this message because you are subscribed to the Google Groups 
"sequel-talk" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/sequel-talk/-/qRJxbbmVkfEJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sequel-talk?hl=en.

Reply via email to