[Dev] A short explanation of Collections

John Anderson Mon, 29 Aug 2005 10:34:03 -0700

As the dust settles on the recent Collections and Sets work, I decided to write up a short description of what every Chandler developer should know about Collections. The idea of a query that automatically updates a list of items, and notifies subscribers of changes, has been central to Chandler from the beginning. Our design and implementation have evolved many times, influenced by what we have learned through experience. Although some of what I describe here might change slightly, I think the basic ideas will remain unchanged.

The new Collections are a replacement for repository.query.Query, which was used by ItemCollections. In the old ItemCollection world that most of you are probably familiar with, an ItemCollection was made up of a query that specified a set of items, modified by adding in a list of inclusion items and removing a list of exclusion items. The final results were cached in a ref collection that was usually accessed like an array. We ran into a number of problems using ItemCollections. For example, when one ItemCollection, e.g. the "All" item collection, fed its results into a new filtered ItemCollection, e.g. the subset of calendar events, there were problems propagating changes and notifications. Also, we learned that the majority of ItemCollections in Chandler were simply ordered lists of items, and the notion of order in ItemCollections was not always maintained.

In the new Collections world we have a number of different types of Collections:


KindCollection: all the items of a particular kind.

ListCollection: an explicit list of items.

FilteredCollection: all items in another source Collection that match a Python expression. You must manually specify a list of attributes which Items must have to be considered for filtering by the expression. In the future we may limit what Python code FilteredCollections may use.


UnionCollection: the union of two or more source Collections

IntersectionCollection: the intersection of two or more source Collections

DifferenceCollection: the difference between two source Collections

InclusionExclusionCollection: a collection similar to our old ItemCollection, that implements some convenience methods to access inclusions, exclusions, the source Collection, and methods to add and remove items. The InclusionExclusionCollection is made up of a union collection, a difference collection, two list collections and a source collection as follows:


InclusionExclusionCollection  = ((source - exclusions) + inclusions).
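
As a rough illustration of that formula only, here is a minimal sketch using plain Python sets. It is not the Chandler API; the function name and the sample item names are made up for the example.

```python
def inclusion_exclusion(source, inclusions, exclusions):
    """Return ((source - exclusions) + inclusions) using plain Python sets."""
    return (set(source) - set(exclusions)) | set(inclusions)


# Hypothetical item names, used only to exercise the formula.
source = {"note1", "note2", "event1"}
exclusions = {"note2"}
inclusions = {"task9"}

result = inclusion_exclusion(source, inclusions, exclusions)
print(sorted(result))
```

Because the inclusions are added after the exclusions are removed, an item that appears in both lists ends up in the result under this reading of the formula.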

To illustrate the power of Collections consider the new "All" Collection:

allCollection = ((((Notes - (Events filtered by (isGenerated = True))) - Trash) - allExclusions) + allInclusions)

allCollection is an InclusionExclusionCollection. Notes and Events are KindCollections. allInclusions, allExclusions and Trash are ListCollections.

There isn't any code necessary to exclude generated events or items in the trash from the "All" Collection, which simplifies the design. It's also easy to update the rules for what is contained in the "All" Collection without having to update a bunch of code. So if you find yourself writing a bunch of code to make sure items end up in the right Collections in the sidebar or elsewhere, you could probably avoid it completely by setting up the right Collections to start with.
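
To make the "no extra code" point concrete, here is a small self-contained model of the "All" Collection as a tree of collections that are evaluated on iteration. The class names mirror the Collection types above, but the constructors, the `universe` list, and the `Note`/`Event` classes are assumptions made for this sketch, not the real Chandler classes.

```python
class Note:
    def __init__(self, title):
        self.title = title


class Event(Note):
    def __init__(self, title, isGenerated=False):
        super().__init__(title)
        self.isGenerated = isGenerated


class ListCollection:
    def __init__(self, items=()):
        self.items = list(items)

    def __iter__(self):
        return iter(self.items)


class KindCollection:
    """All items in `universe` (a stand-in for the repository) of one kind."""
    def __init__(self, universe, kind):
        self.universe, self.kind = universe, kind

    def __iter__(self):
        return (i for i in self.universe if isinstance(i, self.kind))


class FilteredCollection:
    def __init__(self, source, predicate):
        self.source, self.predicate = source, predicate

    def __iter__(self):
        return (i for i in self.source if self.predicate(i))


class DifferenceCollection:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def __iter__(self):
        excluded = {id(i) for i in self.right}
        return (i for i in self.left if id(i) not in excluded)


class UnionCollection:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def __iter__(self):
        seen = set()
        for source in (self.left, self.right):
            for item in source:
                if id(item) not in seen:
                    seen.add(id(item))
                    yield item


universe = [Note("groceries"), Event("standup"),
            Event("standup (occurrence)", isGenerated=True)]
notes = KindCollection(universe, Note)
events = KindCollection(universe, Event)
generated = FilteredCollection(events, lambda e: e.isGenerated)
trash, allExclusions, allInclusions = (ListCollection() for _ in range(3))

# ((((Notes - generated Events) - Trash) - allExclusions) + allInclusions)
allCollection = UnionCollection(
    DifferenceCollection(
        DifferenceCollection(DifferenceCollection(notes, generated), trash),
        allExclusions),
    allInclusions)

before = [i.title for i in allCollection]
trash.items.append(universe[0])          # move "groceries" to the trash
after = [i.title for i in allCollection]
print(before, after)
```

Moving an item into the trash list is the only step needed. The "All" tree reflects the change the next time it is iterated, because nothing in this sketch caches results.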

You can subscribe to a collection by adding an item to notify to the collection's subscribers attribute. By default, the method "onCollectionEvent" is called on items that are subscribed; however, you can specify a different method name in the collectionEventHandler attribute of the item that is notified.
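
The dispatch rule described above can be sketched as follows. Only the names `subscribers`, `onCollectionEvent` and `collectionEventHandler` come from the description; the `Collection` class, its `add` and `notify` methods, and the handler arguments `(op, collection, item)` are assumptions for illustration.

```python
class Collection:
    """Toy collection that notifies subscribers when an item is added."""
    def __init__(self):
        self.subscribers = []
        self.items = []

    def add(self, item):
        self.items.append(item)
        self.notify("add", item)

    def notify(self, op, item):
        for subscriber in self.subscribers:
            # Fall back to the default method name unless the subscriber
            # names a different handler in collectionEventHandler.
            name = getattr(subscriber, "collectionEventHandler",
                           "onCollectionEvent")
            getattr(subscriber, name)(op, self, item)


class DefaultSubscriber:
    def __init__(self):
        self.seen = []

    def onCollectionEvent(self, op, collection, item):
        self.seen.append((op, item))


class CustomSubscriber:
    collectionEventHandler = "onSidebarChanged"   # hypothetical handler name

    def __init__(self):
        self.seen = []

    def onSidebarChanged(self, op, collection, item):
        self.seen.append((op, item))


collection = Collection()
default, custom = DefaultSubscriber(), CustomSubscriber()
collection.subscribers.extend([default, custom])
collection.add("note1")
print(default.seen, custom.seen)
```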

Collections are not dependent on Blocks, but Blocks are the main users of Collections.

That finishes the overview. For those who want to understand more detail or the implementation, read on.

Collections are Items that provide a thin wrapper on repository Set attribute values, where most of the work actually takes place. We need this wrapper for a few reasons. First, it's difficult to manage lots of references to an attribute, which is why Blocks, ContentItems, etc. are not attributes. Second, the Item implements the support for notifications. Finally, Set attributes require arguments that refer to other Sets in order to create them. These arguments aren't known when the Collection Item is created. This creates an awkward need to delay creation of the Set attribute. The Item provides Python magic to handle this awkward delayed creation. A further limitation of Sets is that they are immutable, which means that changing a node in a Collection tree is not supported. It may be possible to add more Python magic to the Item that destroys and re-creates the correct Sets when one node changes.

These disadvantages imposed by making Sets an attribute made some of us think that making Sets an Item would have been a better choice. The counter argument was that we would face the same limitations even if Sets were Items. There might also be situations where using Sets as attributes would have an advantage, even though they are used that way today.

Collections have the same kind of index that ItemCollections had. If you never index into a Collection it won't have an index. If you index into it, you'll get an index. The index you get is determined by an attribute on Collection. By default you'll get an ordered index, where the order is the same as the iteration order of the Collection. If the index attribute is the name of an attribute, you'll get an index sorted by that attribute.
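
A minimal sketch of that on-demand index behavior, under stated assumptions: the class `IndexedCollection`, the attribute name `indexName`, and the `hasIndex` helper are invented for this example and are not the names used in Chandler.

```python
from types import SimpleNamespace


class IndexedCollection:
    def __init__(self, source, indexName=None):
        self.source = source
        self.indexName = indexName   # None means "use iteration order"
        self._index = None           # no index exists until first use

    def __iter__(self):
        return iter(self.source)

    def hasIndex(self):
        return self._index is not None

    def __getitem__(self, position):
        if self._index is None:
            items = list(self)       # default: same order as iteration
            if self.indexName is not None:
                # Sort by the named attribute of each item.
                items.sort(key=lambda item: getattr(item, self.indexName))
            self._index = items
        return self._index[position]


items = [SimpleNamespace(title="b", priority=2),
         SimpleNamespace(title="a", priority=3),
         SimpleNamespace(title="c", priority=1)]

plain = IndexedCollection(items)
byTitle = IndexedCollection(items, indexName="title")

untouched = plain.hasIndex()         # nothing has indexed into it yet
firstPlain = plain[0].title          # iteration order
firstByTitle = byTitle[0].title      # sorted by the "title" attribute
print(untouched, firstPlain, firstByTitle)
```

This sketch builds the index once and never updates it; keeping an index current as items change is a separate concern that the sketch does not attempt.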

Unlike ItemCollections, Collections, except for ListCollections, don't cache their results.

Most Collections are used as contents for Blocks. As in the past, when the Block is rendered it subscribes to notifications, and when it's unrendered it unsubscribes from notifications. This is a simple optimization to minimize the number of notifications, since only blocks that are visible on the screen need to be notified to update themselves.

KindSets and FilteredSets maintain their indexes by using repository monitors. We use that same mechanism to notify subscribers. Notifications for Items coming and going to Collections are synchronous. This doesn't work for changes to attributes on Items in other views, so instead we use an asynchronous notification. In order to get these notifications it's necessary to poll for them. Each time OnIdle is called we do a repository update and poll for these notifications. Each time a notification is received, the block that gets the notification is added to a list of dirty blocks. At the end of OnIdle, the list of dirty blocks is updated on the screen and removed from the list of dirty blocks. This has the benefit of accumulating all of the changes to data fairly quickly, and only redrawing the affected part of the screen when there's nothing left to do.
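
The dirty-block accumulation can be sketched roughly like this. The repository update and polling step is replaced here by a simple in-memory queue, and the names `IdleUpdater`, `post`, `redraw` and `dirtyBlocks` are assumptions for the example rather than Chandler's actual code.

```python
from collections import deque


class Block:
    def __init__(self, name):
        self.name = name
        self.redraws = 0

    def redraw(self):
        self.redraws += 1


class IdleUpdater:
    def __init__(self):
        self.pending = deque()   # stand-in for notifications found by polling
        self.dirtyBlocks = []

    def post(self, block, change):
        self.pending.append((block, change))

    def onIdle(self):
        # Poll: drain every queued notification, marking each block dirty once.
        while self.pending:
            block, _change = self.pending.popleft()
            if block not in self.dirtyBlocks:
                self.dirtyBlocks.append(block)
        # Redraw each dirty block a single time, then clear the list.
        for block in self.dirtyBlocks:
            block.redraw()
        self.dirtyBlocks = []


summary, detail = Block("summary"), Block("detail")
updater = IdleUpdater()
updater.post(summary, "title changed")
updater.post(summary, "date changed")
updater.post(detail, "body changed")
updater.onIdle()
print(summary.redraws, detail.redraws)
```

Two changes to the same block result in one redraw, which is the accumulation benefit described above.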

Finally, we plan to implement nestable "Freeze/Thaw" methods to temporarily ignore and enable notifications, which will further improve performance.
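
Since Freeze/Thaw is only planned, the following is purely a sketch of one way nesting could work, using a depth counter. The `Notifier` class and its method names are hypothetical. The text does not say whether notifications arriving while frozen are dropped or replayed; this sketch assumes they are held and delivered on the outermost thaw.

```python
class Notifier:
    def __init__(self):
        self._freezeDepth = 0
        self._held = []
        self.delivered = []

    def freeze(self):
        self._freezeDepth += 1

    def thaw(self):
        if self._freezeDepth == 0:
            raise RuntimeError("thaw() called without a matching freeze()")
        self._freezeDepth -= 1
        if self._freezeDepth == 0:
            # Outermost thaw: release everything held while frozen.
            held, self._held = self._held, []
            self.delivered.extend(held)

    def notify(self, event):
        if self._freezeDepth:
            self._held.append(event)
        else:
            self.delivered.append(event)


notifier = Notifier()
notifier.freeze()
notifier.freeze()                 # nested freeze
notifier.notify("changed A")
notifier.thaw()                   # still frozen: depth is 1
stillFrozen = list(notifier.delivered)
notifier.thaw()                   # depth reaches 0, held events released
print(stillFrozen, notifier.delivered)
```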


_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Open Source Applications Foundation "Dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/dev
