Re: marshal.dumps quadratic growth and marshal.dump not allowing file-like objects

John Machin Sun, 15 Jun 2008 03:23:18 -0700

On Jun 15, 7:47 pm, Peter Otten <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > I'm stuck on a problem where I want to use marshal for serialization
> > (yes, yes, I know (c)Pickle is normally recommended here). I favor
> > marshal for speed for the types of data I use.
>
> > However, it seems that marshal.dumps() for large objects has a
> > quadratic performance issue, which I assume arises because it grows
> > its memory buffer in constant increments. This causes a nasty slowdown for
> > marshaling large objects. I thought I would get around this by passing
> > a cStringIO.StringIO object to marshal.dump() instead but I quickly
> > learned this is not supported (only true file objects are supported).
>
> > Any ideas about how to get around the marshal quadratic issue? Any
> > hope for a fix for that on the horizon? Thanks for any information.
>
> Here's how marshal resizes the string:
>
>         newsize = size + size + 1024;
>         if (newsize > 32*1024*1024) {
>                 newsize = size + 1024*1024;
>         }
>
> Maybe you can split your large objects and marshal multiple objects to keep
> the size below the 32MB limit.
>


But that change only went into the svn trunk on 11-May-2008; perhaps the
OP is using a production release, which would still have the previous
version: merely "newsize = size + 1024;".

Do people really generate 32MB pyc files, or is stopping doubling at
32MB just a safety valve in case someone/something runs amok?
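
As for Peter's suggestion of splitting, a minimal sketch might look like the following. It assumes the top-level object is a list that can be sliced; dumps_in_chunks, loads_from_chunks and chunk_len are hypothetical names, not part of the marshal module. With the old constant-increment rule, each smaller dumps() call should also spend far less time on resizing than one huge call would.

```python
import marshal


def dumps_in_chunks(items, chunk_len=10000):
    """Marshal a long list as several independent, smaller strings.

    chunk_len is an arbitrary illustrative value; tune it so that each
    marshalled chunk stays comfortably small.
    """
    return [marshal.dumps(items[i:i + chunk_len])
            for i in range(0, len(items), chunk_len)]


def loads_from_chunks(blobs):
    """Rebuild the original list from the strings made by dumps_in_chunks."""
    result = []
    for blob in blobs:
        result.extend(marshal.loads(blob))
    return result


if __name__ == "__main__":
    data = list(range(25000))
    blobs = dumps_in_chunks(data)
    assert loads_from_chunks(blobs) == data
```

Each chunk can then be written to a real file one after another, with a matching loop of marshal.load() calls on the reading side.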

Cheers,
John
