On Wed, 2008-12-10 at 11:12 +0000, Michael Meeks wrote:
> Hi Philip,
>
> On Tue, 2008-12-09 at 19:59 +0100, Philip Van Hoof wrote:
> > > http://live.gnome.org/Evolution/Metadata
> >
> > For early visitors of that page, refresh because I have added/changed
> > quite a lot of it already.
>
> Looks really good.
>
> The only thing that I don't quite understand (the perennial problem
> with asynchronous interfaces), is the memory issue: it seems we need to
> store all Unset information on deleted mails somewhere [ unless you are
> a womble like me that keeps ~all mail forever ;-].
>
> What does the lifecycle for the data in that Unset store look like ?
> [ I assume that as/when you re-connect to the service you're as much
> likely to get an UnsetMany as a SetMany ]. What if that data starts to
> grow larger than the remaining data it describes ? ;-) [ depending on
> how we do Junk mail filtering of course that might be quite a common
> occurrence for some ].

I think the life cycle is best described by this document:

http://live.gnome.org/MetadataOnRemovableDevices

It specifies a metadata cache format for removable devices in Turtle
format. For your information when reading the document: the removal of
a resource has a special notation using blank resources, <> <>, and the
removal of a predicate (of a field of a resource) uses the notation
<pfx:predicate> <>.

Although cached metadata on a removable device is not exactly the same
use case, the life cycle of what the RDF store (or the metadata engine)
wants is the same:

- When a new resource is created, or one of its predicates (one of its
  fields) is updated, it just wants to know about these updates or
  creates. An update is the same as a create if the resource didn't
  exist before.

  For a cache it's important to know the "modified" timestamp, so that
  you know whether your own copy of the metadata is the most recent or
  the cache's copy of the resource is.

  For Evolution (for E-mail clients) we can simplify this as "whenever
  a Set or a SetMany happens, we assume time() to be that date". That's
  because we can assume the E-mail client to have top-most priority in
  all cases (being the benevolent dictator about metadata about
  E-mails, it knows best what we should swallow and when we should
  swallow its updates; we should not make up our own minds and
  decisions about it).

- When a resource gets deleted, the RDF store wants to know about this
  as soon as possible. This also counts in the asynchronous case (for
  example if the RDF store, being a subscriber, joins the subscription
  after the deletion took place): as soon as possible, preferably
  immediately after subscribing.

  Right now I don't think Evolution is keeping state about deleted
  UIDs.

  With IMAP there's a trick that you can do: you can assume that a hole
  in the UIDSET means that some sort of deletion occurred.
  That's because IMAP more or less specifies that the server can't
  reuse UIDs (some IMAP servers might not respect this, and those are
  also broken in Evolution afaik, or at least require a workaround that
  makes Evolution basically perform like a POP client for IMAP when
  synchronizing).

  With POP I don't think you can make any such assumptions.

- Removing a predicate from a resource (a field of a resource) isn't
  needed for E-mail. Luckily E-mail is a mostly read-only storage, with
  the exception of fields like <nmo:isRead>. Maybe if we want to
  support removing a flag or a custom flag at some point, we might need
  to add something to the API to indicate the removal of a field of a
  resource.

  For example, it's not possible for the CC or the TO list of an E-mail
  to change, because E-mails, once stored, are read-only in that
  respect.

I think, anyway, that it would make sense for Evolution to start doing
two things in the CamelDB:

* Log all deletions (just the UID should suffice). If the service
  reuses UIDs, then upon effective reuse of a UID, that UID's deletion
  entry should be removed from the log. Otherwise whoever depends on
  this log for knowing about effective deletions will lose the E-mail.

* Record a timestamp for each record in the summary table. This would
  store the time() when the record got added, and maybe also
  (preferably separately) the time() when the E-mail's flags were last
  changed.

With those two additions to the schema of the CamelDB it would, I
think, be possible to make a plugin that implements the service as
proposed on the wiki page.

Matthew Barnes replied on IRC that we should start storing those
timestamps anyhow. I also think it's a good idea. I was planning to
discuss this with psankar and srag too.

If we change the schema then we will also need to implement a migration
path from the old schema to the new. Using a temporary table you can
simulate MySQL's ALTER TABLE in SQLite.

    BEGIN TRANSACTION;
    -- Copy the existing rows aside into a temporary table
    CREATE TEMPORARY TABLE virtual_table AS SELECT * FROM orig_table;
    DROP TABLE orig_table;
    CREATE TABLE orig_table ( ... created datetime, modified datetime );
    -- Existing rows get the current Unix time for both new columns
    INSERT INTO orig_table
        SELECT *, strftime('%s', 'now'), strftime('%s', 'now')
        FROM virtual_table;
    DROP TABLE virtual_table;
    COMMIT;

-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be
gnome: pvanhoof at gnome dot org
http://pvanhoof.be/blog
http://codeminded.be

_______________________________________________
tracker-list mailing list
tracker-list@gnome.org
http://mail.gnome.org/mailman/listinfo/tracker-list
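
PS: to make the two proposed CamelDB additions a bit more concrete, here
is a rough sketch using Python's sqlite3 module. It is only an
illustration under assumptions: the table names summary and deleted_log,
the uid and flags columns, and all function names are hypothetical and
not taken from the real CamelDB schema. It also assumes the connection
was opened with isolation_level=None, so that BEGIN and COMMIT are
issued explicitly.

```python
import sqlite3
import time


def migrate_add_timestamps(conn, now=None):
    """Rebuild a hypothetical 'summary' table with created/modified columns."""
    now = int(time.time()) if now is None else now
    conn.execute("BEGIN")
    try:
        # Copy the existing rows aside, then recreate the table.
        conn.execute(
            "CREATE TEMPORARY TABLE summary_backup AS "
            "SELECT uid, flags FROM summary"
        )
        conn.execute("DROP TABLE summary")
        conn.execute(
            "CREATE TABLE summary (uid TEXT PRIMARY KEY, flags INTEGER,"
            " created INTEGER, modified INTEGER)"
        )
        # Old rows have no known history, so both timestamps start at 'now'.
        conn.execute(
            "INSERT INTO summary SELECT uid, flags, ?, ? FROM summary_backup",
            (now, now),
        )
        conn.execute("DROP TABLE summary_backup")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


def create_deletion_log(conn):
    """Create the (hypothetical) log of deleted UIDs."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deleted_log"
        " (uid TEXT PRIMARY KEY, deleted INTEGER)"
    )


def delete_message(conn, uid, now=None):
    """Remove a message and remember its UID for late subscribers."""
    now = int(time.time()) if now is None else now
    conn.execute("DELETE FROM summary WHERE uid = ?", (uid,))
    conn.execute("INSERT OR REPLACE INTO deleted_log VALUES (?, ?)", (uid, now))


def add_message(conn, uid, flags, now=None):
    """Add a message; a reused UID must no longer be reported as deleted."""
    now = int(time.time()) if now is None else now
    conn.execute("DELETE FROM deleted_log WHERE uid = ?", (uid,))
    conn.execute(
        "INSERT INTO summary VALUES (?, ?, ?, ?)", (uid, flags, now, now)
    )


def deleted_since(conn, since):
    """Return UIDs deleted at or after 'since', ordered by UID."""
    rows = conn.execute(
        "SELECT uid FROM deleted_log WHERE deleted >= ? ORDER BY uid", (since,)
    )
    return [uid for (uid,) in rows]
```

A subscriber that reconnects could call deleted_since() with the time of
its last sync to learn which UIDs to pass to UnsetMany. Entries older
than the oldest subscriber's last sync could then be pruned, which would
keep the log from growing without bound.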