Hi, all

This is a summary of the thinking behind a change I recently committed to iCalendar import:

<http://viewcvs.osafoundation.org/chandler/?rev=14887&view=rev>

that speeds up the 3000-event calendar import test case by 35%-40%. There are similar gains to be had in other performance scenarios (like reload, and possibly subscribe), but there's some trickiness involved, so it's worth documenting.

The main part of the diff (in parcels/osaf/sharing/stateless.py) wraps the import code in a "with repoView.reindexingDeferred():" block. Before this change, we were spending an enormous amount of time reinserting items into indexes, because import sets attributes that can affect the various indexes Chandler uses.
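
To illustrate why the deferral helps, here is a minimal sketch using a toy stand-in for a repository view. ToyView, set_value, reindexing_deferred and reindex_count are all hypothetical names made up for this example; this is not Chandler's repository code, only a model of the idea that an item touched several times during import gets reindexed once at the end rather than on every attribute change.

```python
import bisect
from contextlib import contextmanager


class ToyView:
    """Toy model of a view with one sorted index (hypothetical, not Chandler's API)."""

    def __init__(self):
        self.values = {}        # item name -> current attribute value
        self.index = []         # sorted list of (value, name) pairs
        self._indexed = {}      # item name -> value currently stored in the index
        self.reindex_count = 0  # number of index (re)insertions performed
        self._deferring = False
        self._dirty = set()

    def set_value(self, name, value):
        self.values[name] = value
        if self._deferring:
            self._dirty.add(name)   # remember the item, reindex it later
        else:
            self._reindex(name)     # eager: reindex on every change

    def _reindex(self, name):
        if name in self._indexed:
            self.index.remove((self._indexed[name], name))
        bisect.insort(self.index, (self.values[name], name))
        self._indexed[name] = self.values[name]
        self.reindex_count += 1

    @contextmanager
    def reindexing_deferred(self):
        self._deferring = True
        try:
            yield
        finally:
            self._deferring = False
            for name in self._dirty:    # one reindex per touched item
                self._reindex(name)
            self._dirty.clear()


def import_events(view, count):
    # Assumption: the importer changes an indexed attribute several times per item.
    for i in range(count):
        name = f"event{i}"
        view.set_value(name, 0)
        view.set_value(name, count - i)
        view.set_value(name, i)


eager = ToyView()
import_events(eager, 100)

deferred = ToyView()
with deferred.reindexing_deferred():
    import_events(deferred, 100)

print(eager.reindex_count, deferred.reindex_count)
```

In this toy model both views end up with the same index contents, but the deferred one does a third of the index insertions.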

Currently, there are two usage patterns for repository indexes in Chandler:

1) Indexes used to make sure items are unique: The cases I know of are the EmailAddress and Location kinds. We don't want to create a new item every time you address an item to a given email address, so we index the collection of all EmailAddress items, and use the index (actually, multiple indexes) to reuse an existing item if possible when you add or import an email address.

2) Indexes used for sorting or searching in the UI: Examples here are the indexes used for sorting on a dashboard column, and also the global startTime-related indexes used by the calendar UI to find all the relevant events for a given week/day.
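
As a rough illustration of the two patterns above, here is a small sketch in plain Python. Everything in it is hypothetical (the dict and sorted list standing in for repository indexes, the function names, the case-insensitive address matching, the integer start hours); it only shows the shape of a lookup-before-create index versus a sorted range-query index.

```python
import bisect

# Pattern 1: an index used to keep items unique.
address_index = {}  # normalized address -> item (a plain dict stands in for an item)


def get_or_create_address(address):
    # Assumption: addresses match case-insensitively, ignoring surrounding spaces.
    key = address.strip().lower()
    if key not in address_index:
        address_index[key] = {"address": address.strip()}
    return address_index[key]


# Pattern 2: an index used for sorting/searching in the UI.
start_index = []  # sorted list of (start_hour, event_name)


def add_event(start, name):
    bisect.insort(start_index, (start, name))


def events_between(lo, hi):
    """Return names of events with lo <= start < hi, in start order."""
    left = bisect.bisect_left(start_index, (lo,))
    right = bisect.bisect_left(start_index, (hi,))
    return [name for _, name in start_index[left:right]]


first = get_or_create_address("Someone@Example.com")
second = get_or_create_address("someone@example.com ")

add_event(14, "review")
add_event(9, "standup")
add_event(30, "next-day sync")
today = events_between(0, 24)
```

Note that pattern 1 reads the index in the middle of an import (every new address triggers a lookup), while pattern 2 is only read when the UI needs to display something.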

It turns out that it's OK to defer the indexes in #2 above for import (or reload, which is similar): the UI is already being notified of changes to the items it's displaying, so we don't need to keep all the indexes instantaneously up-to-date.

However, in case #1, deferring indexing often leads to errors that look like:

LookupError: Access to skiplist is denied, it is marked INVALID

because the deferral has left the index in a temporarily inconsistent state while we're trying to iterate over or insert into it. So, for case #1, Andi added a 'nodefer' keyword argument to the createIndex() call, which means that these indexes will always keep themselves consistent (i.e. essentially ignore reindexingDeferred()). This allows us to defer indexing for the remaining indexes, which, happily, is where the most time was previously wasted.
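
Here is a minimal sketch of that interaction, again with a toy model rather than the real repository. ToyIndex, ToyView, create_index, add and lookup are hypothetical names; only the 'nodefer' keyword and the idea of an index being marked invalid during deferral come from the description above.

```python
from contextlib import contextmanager


class ToyIndex:
    """Toy index (hypothetical); 'nodefer' indexes never become invalid."""

    def __init__(self, nodefer=False):
        self.nodefer = nodefer
        self.entries = {}   # key -> item
        self.valid = True

    def lookup(self, key):
        if not self.valid:
            raise LookupError("index is marked INVALID")
        return self.entries.get(key)


class ToyView:
    def __init__(self):
        self.indexes = {}
        self._deferring = False
        self._pending = []  # (index, key, item) updates postponed by deferral

    def create_index(self, name, nodefer=False):
        self.indexes[name] = ToyIndex(nodefer)
        return self.indexes[name]

    def add(self, index_name, key, item):
        index = self.indexes[index_name]
        if self._deferring and not index.nodefer:
            index.valid = False                     # stale until deferral ends
            self._pending.append((index, key, item))
        else:
            index.entries[key] = item               # kept consistent right away

    @contextmanager
    def reindexing_deferred(self):
        self._deferring = True
        try:
            yield
        finally:
            self._deferring = False
            for index, key, item in self._pending:
                index.entries[key] = item
                index.valid = True
            self._pending.clear()


view = ToyView()
view.create_index("emailAddress", nodefer=True)  # case #1: uniqueness
view.create_index("startTime")                   # case #2: UI sorting

with view.reindexing_deferred():
    view.add("emailAddress", "someone@example.com", "addr1")
    view.add("startTime", 900, "event1")
    # The nodefer index can be consulted safely mid-import.
    found = view.indexes["emailAddress"].lookup("someone@example.com")
    try:
        view.indexes["startTime"].lookup(900)
        deferred_lookup_failed = False
    except LookupError:
        deferred_lookup_failed = True

after = view.indexes["startTime"].lookup(900)
```

In the toy, the uniqueness index answers lookups during the import, the deferred index refuses them until the "with" block exits, and afterwards both are consistent again.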

--Grant

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Open Source Applications Foundation "chandler-dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/chandler-dev
