Re: cPickle asymptotic performance?

2008-06-13 Thread Hrvoje Niksic
Eric Jonas [EMAIL PROTECTED] writes:

 Try gc.disable() before loading the pickle, and gc.enable() after.
 
  Is cPickle's behavior known to be O(n^2)?
 
 No, but the garbage collector's sometimes is.

 Wow, that totally fixed it -- we went from 1200 seconds to 60
 seconds.

60 seconds is still a long time.  How many objects are you
deserializing?  Is the time now approximately O(n)?

 I'm somewhat shocked -- is the occasionally-abysmal behavior of the
 GC documented anywhere?

I don't think so.  It's somewhat of an FAQ on the list, though.  The
question tends to arise when someone tries to construct a large list
of non-trivial objects, which takes quadratic time because object
allocation regularly triggers GC, which traverses the growing list.
--
http://mail.python.org/mailman/listinfo/python-list


cPickle asymptotic performance?

2008-06-12 Thread Eric Jonas
Hello, 

I've done some benchmarking while attempting to serialize my (large)
graph data structure with cPickle; I'm seeing superlinear performance
(plotting it seems to suggest n^2 where n is the number of nodes of my
graph), in the duration of the pickle.dump calls and I can't quite
figure out why. The connectivity of the graph is such that the number of
nodes is ~ number of edges, so I don't think this is a problem of edge
count secretly growing  O(n). Is cPickle's behavior known to be O(n^2)?
Does anyone have any generic tips for speeding up cPickle? 

Thanks, 
...Eric


--
http://mail.python.org/mailman/listinfo/python-list


Re: cPickle asymptotic performance?

2008-06-12 Thread Hrvoje Niksic
Eric Jonas [EMAIL PROTECTED] writes:

 I've done some benchmarking while attempting to serialize my (large)
 graph data structure with cPickle; I'm seeing superlinear performance
 (plotting it seems to suggest n^2 where n is the number of nodes of my
 graph), in the duration of the pickle.dump calls and I can't quite
 figure out why.

Try gc.disable() before loading the pickle, and gc.enable() after.

 Is cPickle's behavior known to be O(n^2)?

No, but the garbage collector's sometimes is.
--
http://mail.python.org/mailman/listinfo/python-list


Re: cPickle asymptotic performance?

2008-06-12 Thread Calvin Spealman
If you are getting to the point where your data is large enough to  
really care about the speed of cPickle, then maybe its time you moved  
past pickles for your storage format? 2.5 includes sqlite, so you  
could persist them in a nice, indexed table or something. Just a  
suggestion.


On Jun 12, 2008, at 2:25 PM, Eric Jonas wrote:


Hello,

I've done some benchmarking while attempting to serialize my (large)
graph data structure with cPickle; I'm seeing superlinear performance
(plotting it seems to suggest n^2 where n is the number of nodes of my
graph), in the duration of the pickle.dump calls and I can't quite
figure out why. The connectivity of the graph is such that the  
number of

nodes is ~ number of edges, so I don't think this is a problem of edge
count secretly growing  O(n). Is cPickle's behavior known to be O 
(n^2)?

Does anyone have any generic tips for speeding up cPickle?

Thanks,
...Eric


--
http://mail.python.org/mailman/listinfo/python-list


--
http://mail.python.org/mailman/listinfo/python-list


Re: cPickle asymptotic performance?

2008-06-12 Thread Eric Jonas

On Thu, 2008-06-12 at 20:57 +0200, Hrvoje Niksic wrote:
 Eric Jonas [EMAIL PROTECTED] writes:
 
  I've done some benchmarking while attempting to serialize my (large)
  graph data structure with cPickle; I'm seeing superlinear performance
  (plotting it seems to suggest n^2 where n is the number of nodes of my
  graph), in the duration of the pickle.dump calls and I can't quite
  figure out why.
 
 Try gc.disable() before loading the pickle, and gc.enable() after.
 
  Is cPickle's behavior known to be O(n^2)?
 
 No, but the garbage collector's sometimes is.


Wow, that totally fixed it -- we went from 1200 seconds to 60 seconds.
I'm somewhat shocked -- is the occasionally-abysmal behavior of the GC
documented anywhere? 
...Eric

--
http://mail.python.org/mailman/listinfo/python-list