On Tue, 9 Oct 2007, Grant Baillie wrote:

Also, one could think about addressing performance and scalability at the repo level, without changing the whole architecture.

While Chandler was developed with an infinitely scalable and infinitely fast repository in mind, it might be time to let reality sink in. The repository has come a long way in terms of performance and could still be improved, for sure, but couldn't one think about addressing performance and scalability at the app level as well, without changing the whole repository architecture?

Well, one can think about anything, so sure :). But as things stand, there isn't really an "app level" to speak of: the repository is intertwined with everything, and its API shapes the app layer in ways that aren't always so effective. (The current indexing situation is one concrete example.)

In other words, it's up to the app to dis-intertwine itself from the repository. I don't think that just tackling repository performance in isolation, as has been the approach until now, is the right solution anymore.

If, for instance, when importing 100,000 mail messages, we tell the UI about every itsy bitsy change, one attribute at a time, no amount of repository performance improvement is going to get us to the performance we expect.
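
To make that concrete, here is a rough sketch of what coalescing those notifications could look like. None of these names are Chandler or repository APIs: BatchingNotifier, notify_ui and import_messages are made up for illustration, and the only assumption is that the UI can accept one combined change set instead of one call per attribute.

```python
class BatchingNotifier:
    """Collects per-attribute changes and hands them to the UI once per batch.

    Hypothetical helper, not part of Chandler or its repository.
    """

    def __init__(self, notify_ui):
        self.notify_ui = notify_ui   # hypothetical UI callback taking one dict
        self.pending = {}            # item id -> set of changed attribute names
        self.depth = 0               # nesting level of open batches

    def __enter__(self):
        self.depth += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self.depth -= 1
        if self.depth == 0:
            self.flush()             # outermost batch closed: tell the UI once
        return False

    def changed(self, item_id, attribute):
        self.pending.setdefault(item_id, set()).add(attribute)
        if self.depth == 0:
            self.flush()             # no batch open: behave like today

    def flush(self):
        if self.pending:
            batch, self.pending = self.pending, {}
            self.notify_ui(batch)


def import_messages(messages, notifier):
    """messages is an iterable of (item id, changed attribute names)."""
    with notifier:
        for item_id, attributes in messages:
            for name in attributes:
                notifier.changed(item_id, name)
```

With this, importing any number of messages inside one batch results in a single notify_ui call, while a change made outside a batch is still delivered immediately.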

About mail import performance, I need to point out that the message in the status bar at the bottom of the UI is misleading. It says "committing <n> messages", implying that it's spending time inserting item records into the repository.

Since this conversation is now in the mode where we're throwing around row insert timing numbers, how about changing the message to say something like "converting mail messages to chandler items"? The actual repo insert part, the repo commit() part, is pretty small, even negligible, when compared to the time spent "chandlerizing" the mail messages into items with a live UI. I sure don't want people to think that it takes half an hour to write 7,000 mail message items into the repository.

Earlier today, Heikki proposed using multiple processes to better take advantage of multi-core hardware. Berkeley DB and the Chandler repository already fully support multiple processes accessing the same repository concurrently. It should be fairly easy for the application to split off some tasks into separate processes without any code changes in the task or repository components themselves. Importing a large amount of mail in a different process or background syncing collections in a different process could yield some interesting results.
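
As a very rough illustration of the shape this could take, the sketch below pushes the conversion work into a child Python process. It is a stand-in only: the child here hands its results back over a pipe as JSON, whereas in Chandler the child would presumably open the shared repository itself and commit there. WORKER, convert_in_child and the item fields are all made-up names, not Chandler code.

```python
import json
import subprocess
import sys

# Source of the child process. It reads raw messages as JSON on stdin,
# "chandlerizes" them (a trivial placeholder transformation here) and
# writes the resulting items as JSON on stdout.
WORKER = r"""
import json, sys
messages = json.load(sys.stdin)
items = [{"title": m["subject"].strip(), "sender": m["from"].lower()}
         for m in messages]
json.dump(items, sys.stdout)
"""


def convert_in_child(messages):
    """Run the conversion in a separate Python process and return its items."""
    result = subprocess.run(
        [sys.executable, "-c", WORKER],
        input=json.dumps(messages),
        capture_output=True,
        text=True,
        check=True,   # raise if the child exits with an error
    )
    return json.loads(result.stdout)
```

The parent, and with it the UI, stays out of the conversion loop entirely, and the child can run on another core.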

Andi..
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Open Source Applications Foundation "chandler-dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/chandler-dev
