On 03/ 9/10 04:08 PM, [email protected] wrote:
On Mon, Mar 08, 2010 at 11:54:47PM -0600, Shawn Walker wrote:
   12123 pkg.server.catalog and pkg.updatelog should be removed or trimmed
   15087 catalog serialization makes excessive calls to write

webrev:
http://cr.opensolaris.org/~swalker/pkg-cat/

catalog.py:

   - lines 119/120: Can this be in default arguments to _dump instead of
     cls=None?

Sure.

   - Does it save us time to incrementally update the sha_1 hash by
     calling update() instead of generating it in one pass when the file
     is written, or is this to cope with the case where the file is
     written incrementally?

The only way to update the sha_1 hash incrementally is to iterate over what simplejson's encoder returns and call update() on each piece. As I've discovered before, doing that is very slow.

If simplejson's iterative encoder generated reasonably-sized chunks, it would be better to call update(). But at the moment, its iterative encoder will generate chunks as small as a single character (e.g. '{').
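To illustrate the chunk-size problem, here is a minimal sketch that counts how many pieces an iterative encoder yields. It uses the standard library's json module as a stand-in for simplejson (an assumption; the two share the JSONEncoder.iterencode() interface), and chunk_stats and the sample data are hypothetical names made up for the example.

```python
import json


def chunk_stats(obj):
    """Return (number_of_chunks, total_characters) for an iterative encode."""
    encoder = json.JSONEncoder()
    count = 0
    total = 0
    # iterencode() yields the serialised document piece by piece.
    for chunk in encoder.iterencode(obj):
        count += 1
        total += len(chunk)
    return count, total


# Hypothetical sample data, not taken from a real catalog.
sample = {"pkg": {"versions": ["0.5.11", "0.5.12"], "size": 1024}}
count, total = chunk_stats(sample)
print("chunks: %d, characters: %d" % (count, total))
```

Dividing the character count by the chunk count gives the average chunk size, which shows how little data each update() call would receive.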

That currently works out to more than 4.4 million iterations for the /dev catalog, so it's much faster to do the sha-1 calc in a single pass in filesystem-sized chunks.
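The single-pass approach can be sketched roughly as follows. This is not the code from the webrev; sha1_of_file and the 128 KiB default chunk size are assumptions chosen for illustration.

```python
import hashlib


def sha1_of_file(path, chunk_size=128 * 1024):
    """Hash an already-written file in one pass, reading large blocks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            # One update() call per large block, not one per JSON token.
            digest.update(block)
    return digest.hexdigest()
```

With blocks this size, the number of update() calls depends on the file size rather than on how many tokens the encoder emitted.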

At some point in the future, I hope to have enough time to either write a new serialiser (in C), or create a patch for simplejson that makes it suck less.

In particular, I could reduce memory usage and possibly reduce serialisation time further if I could have the iterative encoder write to an fd wrapped in a stdio stream with fdopen() instead of yielding each value or creating a giant python list object with everything inside.
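In pure Python the idea can only be approximated, but a rough sketch looks like this: feed the encoder's chunks into a buffered stream created with os.fdopen(), so that small chunks are batched into large writes and no giant list is built. It again uses the standard json module as a stand-in for simplejson; dump_to_path and the buffer size are hypothetical.

```python
import json
import os


def dump_to_path(obj, path, buffer_size=128 * 1024):
    """Serialise obj to path through a buffered stream wrapped around an fd."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    # os.fdopen() wraps the descriptor in a buffered file object; closing
    # the stream flushes the buffer and closes the descriptor.
    with os.fdopen(fd, "w", buffering=buffer_size, encoding="utf-8") as stream:
        for chunk in json.JSONEncoder().iterencode(obj):
            # Tiny chunks accumulate in the buffer instead of each one
            # turning into a separate write to the operating system.
            stream.write(chunk)
```

The encoder here still yields every value to Python, so this only addresses the write batching and the memory used by a combined list, not the per-chunk iteration cost that a C serialiser would avoid.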

However, I felt that this changeset was a big win for a relatively small amount of work.

Love the deletes from updatelog and server/catalog.

Once the publication tools use the new transport subsystem, I plan to move what little is left into the publisher module for the v0 refresh case. At that point, both can be deleted.

Cheers,
--
Shawn Walker
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
