On Jun 15, 7:47 pm, Peter Otten <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > I'm stuck on a problem where I want to use marshal for serialization
> > (yes, yes, I know (c)Pickle is normally recommended here). I favor
> > marshal for speed for the types of data I use.
> >
> > However it seems that marshal.dumps() for large objects has a
> > quadratic performance issue which I'm assuming is that it grows its
> > memory buffer in constant increments. This causes a nasty slowdown for
> > marshaling large objects. I thought I would get around this by passing
> > a cStringIO.StringIO object to marshal.dump() instead but I quickly
> > learned this is not supported (only true file objects are supported).
> >
> > Any ideas about how to get around the marshal quadratic issue? Any
> > hope for a fix for that on the horizon? Thanks for any information.
>
> Here's how marshal resizes the string:
>
>         newsize = size + size + 1024;
>         if (newsize > 32*1024*1024) {
>             newsize = size + 1024*1024;
>         }
>
> Maybe you can split your large objects and marshal multiple objects to keep
> the size below the 32MB limit.
>
But that change went into the svn trunk on 11-May-2008; perhaps the OP
is using a production release which would have the previous version,
which is merely "newsize = size + 1024;".

Do people really generate 32MB pyc files, or is stopping doubling at
32MB just a safety valve in case someone/something runs amok?

Cheers,
John
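
To put rough numbers on the difference between the two resize rules, here is a small sketch that only simulates the buffer growth; it does not call marshal at all. The names grow_constant, grow_doubling and simulate are made up for illustration, the 50-byte starting size is an assumption, and "every resize copies the whole buffer" is a worst-case model of realloc rather than a measurement.

```python
def grow_constant(size):
    # Older rule quoted above: fixed 1024-byte increment.
    return size + 1024


def grow_doubling(size):
    # Trunk rule quoted above: roughly double, then fall back to
    # 1MB steps once the doubled size would exceed 32MB.
    newsize = size + size + 1024
    if newsize > 32 * 1024 * 1024:
        newsize = size + 1024 * 1024
    return newsize


def simulate(target, grow, initial=50):
    """Return (resizes, bytes_copied) needed to reach `target` bytes.

    Assumes, pessimistically, that every resize copies the whole
    existing buffer. `initial` is an assumed starting buffer size.
    """
    size = initial
    resizes = 0
    copied = 0
    while size < target:
        copied += size          # worst case: old contents are moved
        size = grow(size)
        resizes += 1
    return resizes, copied


target = 16 * 1024 * 1024       # a 16MB marshalled string
const_resizes, const_copied = simulate(target, grow_constant)
dbl_resizes, dbl_copied = simulate(target, grow_doubling)

print("constant increment:", const_resizes, "resizes,", const_copied, "bytes copied")
print("doubling:          ", dbl_resizes, "resizes,", dbl_copied, "bytes copied")
```

With a fixed increment the number of resizes grows linearly with the output size and each one may copy everything written so far, which is where the quadratic behaviour comes from. With doubling the resize count grows only logarithmically until the 32MB cap is reached.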