Re: [Evolution-hackers] RFClue: Atomic folder updates

2011-12-05 Thread Milan Crha
On Sat, 2011-12-03 at 00:05 +0000, David Woodhouse wrote:
> The problem is, we need to handle these updates *atomically*. If we
> store the new timestamp before the changed messages, and we crash in
> the middle of doing so, then we might miss out forever on the messages
> in question. We'd restart, go to the server and say "what happened
> since YYY" and we never get told *again* about the messages which came
> in between time XXX and time YYY, that we didn't manage to fetch.
>
> And if you do it the other way round and store the changed messages
> first, and crash before you store the new timestamp, you get similar
> issues (cf. bug 664637).

Hi,
I do not think you can do anything on the CamelFolderSummary side
itself, except for changing your descendant of CamelMessageInfo, but
there will still be time intervals where a crash or unexpected
interruption will cause issues. I'm not sure if it's understood from your
description, but the SyncKey on the exchange server changes as soon as
the Sync call is finished (the server returns the new key), and asking
with the old key results in this bug. So, you could do:
 a) Call Sync with the XXX key
 b) process result with returned YYY key in a way:
    - mark CamelMessageInfo-s which changed as changed (to be updated)
    - create fake info-s with the same flag as above for new items
    - delete those deleted
 c) save changes to disk with the new YYY Sync key
 d) update what is supposed to be updated

There still is a chance that you crash during b), though that all is
supposed to be done on a local machine only, thus should not take that
long (depends on folder size and such).

You may check on folder open whether there are any info-s marked for
update and process d) if yes, and only then continue with the stored
sync key.
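
A rough sketch of steps b) to d) and the folder-open check, in Python rather
than Camel's C, might look like the following. Everything here is
hypothetical: the JSON summary file, the "pending" flag and the function
names are illustration only and are not CamelFolderSummary API. It also
assumes the summary and the sync key can be saved together in one step (a
write-then-rename here), which the real summary code may not guarantee.

```python
import json
import os
import tempfile

def save_summary(path, summary):
    """Write infos and sync key together, replacing the old file in one rename."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(summary, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)

def load_summary(path):
    with open(path) as f:
        return json.load(f)

def apply_sync_result(path, summary, changes, new_key):
    """Steps b) and c): flag what needs refreshing, then save once with the new key."""
    infos = summary["infos"]
    for kind, uid in changes:
        if kind == "delete":
            infos.pop(uid, None)
        else:
            # "create" or "update": keep (or add) a placeholder flagged for later
            infos.setdefault(uid, {})["pending"] = True
    summary["sync_key"] = new_key
    save_summary(path, summary)

def finish_pending(path, summary, fetch):
    """Step d), also run on folder open: fetch details for every flagged info."""
    for uid, info in summary["infos"].items():
        if info.get("pending"):
            info.update(fetch(uid))   # fetch is a caller-supplied callable
            info["pending"] = False
    save_summary(path, summary)
```

On folder open, the caller would load the summary and run finish_pending()
before asking the server anything with the stored sync key.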

I guess something like that should work. I do not see how any atomicity
would help you here, because any trouble between the SyncFolderItems
call and the saving of all received changes with the new Sync key will
always result in a desync: either the key stored locally differs from
the one on the server, or the folder content is not fully updated.

Nonetheless, there are people who want to connect to their
exchange account from multiple machines. Did you try whether this sync
can work in such an environment, please? I didn't, but I will be
surprised if the Sync key is tracked per client, as that seems like a
memory-consuming operation on the server to me. Imagine two identical
installations with identical evolution-ews versions in use, only on two
different machines connecting to the same server with the same user
name. You might get out of sync with the Sync keys in such an
environment quite easily.
Bye,
Milan

___
evolution-hackers mailing list
evolution-hackers@gnome.org
To change your list options or unsubscribe, visit ...
http://mail.gnome.org/mailman/listinfo/evolution-hackers


Re: [Evolution-hackers] RFClue: Atomic folder updates

2011-12-05 Thread David Woodhouse
On Mon, 2011-12-05 at 09:49 +0100, Milan Crha wrote:
> I'm not sure if it's understood from your description, but the
> SyncKey on the exchange server changes as soon as the Sync call is
> finished (the server returns the new key), and asking with the old key
> results in this bug.

Not quite. Asking again with the old key is *fine*, and it *MUST* be
fine. It happens a lot, if a mobile client is disconnected from the
network before it even *receives* the reply. There's a *huge* window
where the reply can get lost, so the server absolutely *has* to cope
with clients coming back to it with an old key.

In ActiveSync the server stores SyncKey information per client, and will
keep one previous key per client per folder. That's enough to deal with
client crashes. In EWS the SyncKeys seem to last much longer — as if
they contain all the information needed to find the right point in the
database transaction log, and nothing *extra* needs to be stored on the
server side, perhaps?

So the mere fact that you are asking with the old key is *not* a
problem.

The bug is caused because we crash and restart with an *inconsistent*
state because we've applied *some* of the changes. We then ask the
server what happened since XXX, but our cache doesn't *match* what the
server had at time XXX!

In fact, even that would be OK if we got the *same* answer. All of the
changes we get given in a single update are perfectly fine to apply
twice. The real problem happens if the folder has changed again in the
meantime, and some of the changes we were originally given in the
XXX-YYY delta have now been *reverted* (like a message being marked
read and then unread, or created and then deleted).

It goes like this...

 - We ask the server "what happened since XXX?".
 - The server tells us "message 123 was created, and your next key is
   YYY".
 - We add message 123 to our cache, and crash before storing the new
   key.
 - Someone *deletes* message 123.
 - We restart, and ask the server "what happened since XXX?".
 - The server tells us *nothing* about message 123; as far as it's
   concerned, that message was added and deleted without us ever
   knowing anything about it. But it is in our cache, and the server
   is *never* going to tell us that it got deleted because the server
   doesn't think we know it's there.
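
The sequence above can be reproduced with a toy model. This is only a
sketch: the Server class, its integer sync keys (an index into a change
log) and the in-memory "cache" are invented for illustration and say
nothing about how Exchange really stores its state. The one assumption
taken from the description is that the server reports the *net* change
since a key, so a create followed by a delete inside the window is
reported as nothing at all.

```python
class Server:
    """Toy delta server: a sync key is an index into a change log."""

    def __init__(self):
        self.log = []  # list of ("create" | "delete", uid)

    def create(self, uid):
        self.log.append(("create", uid))

    def delete(self, uid):
        self.log.append(("delete", uid))

    def sync(self, key):
        """Return the net changes since `key`, plus the new key."""
        created, deleted = set(), set()
        for kind, uid in self.log[key:]:
            if kind == "create":
                created.add(uid)
            elif uid in created:
                created.discard(uid)  # created and deleted in the window: say nothing
            else:
                deleted.add(uid)
        return created, deleted, len(self.log)


server = Server()
cache, stored_key = set(), 0  # the client state that survives a restart

server.create(123)
created, deleted, new_key = server.sync(stored_key)
cache |= created  # message 123 reaches the cache...
# ...and we "crash" here, before stored_key = new_key is written.

server.delete(123)  # someone deletes message 123

created, deleted, new_key = server.sync(stored_key)  # restart with the old key
cache = (cache | created) - deleted
stored_key = new_key
# cache still contains 123, although the folder on the server is empty,
# and no later sync() call will ever mention 123 again.
```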


> So, you could do:
>  a) Call Sync with the XXX key
>  b) process result with returned YYY key in a way:
>     - mark CamelMessageInfo-s which changed as changed (to be updated)
>     - create fake info-s with the same flag as above for new items
>     - delete those deleted
>  c) save changes to disk with the new YYY Sync key
>  d) update what is supposed to be updated
>
> There still is a chance that you crash during b), though that all is
> supposed to be done on a local machine only, thus should not take that
> long (depends on folder size and such).

Hm, I'm not quite sure what that process would achieve. The race
condition you highlight (crash during (b)) is exactly the one I'm trying
to eliminate. However small the window is, if you can crash at a certain
time and get data corruption, that is a *bug*.

There is no ordering for your (a), (b), and (c) which works; they have
to be *atomic* and hit the disk all at precisely the same time, or we
need an alternative solution (roll-back or roll-forward).

> You may check on folder open whether there are any info-s marked for
> update and process d) if yes, and only then continue with the stored
> sync key.
>
> I guess something like that should work. I do not see how any atomicity
> would help you here, because any trouble between the SyncFolderItems
> call and the saving of all received changes with the new Sync key will
> always result in a desync: either the key stored locally differs from
> the one on the server, or the folder content is not fully updated.

Atomicity means that when we restart, we have *either* the before or
the after state. Not anything in between.

So either we restart with Sync key 'XXX' and none of the changes applied
to our cache, or we restart with Sync key 'YYY' and *all* of the
changes. Either situation is fine; the server is quite happy for us to
start up and ask it *either* "what changed since XXX?" *or* "what
changed since YYY?".
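
As an illustration of that before-or-after property, here is a minimal
sketch using Python's sqlite3 module. The table layout and function names
are made up for the example and are not Camel's schema; the point is only
that the message changes and the new sync key are committed in one
transaction. A raised exception stands in for the crash here, whereas a
real crash would rely on SQLite discarding the uncommitted transaction
when the database is next opened.

```python
import sqlite3

def open_cache(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS messages "
               "(uid TEXT PRIMARY KEY, subject TEXT)")
    db.execute("CREATE TABLE IF NOT EXISTS folder_state "
               "(id INTEGER PRIMARY KEY CHECK (id = 1), sync_key TEXT)")
    db.execute("INSERT OR IGNORE INTO folder_state (id, sync_key) "
               "VALUES (1, 'XXX')")
    db.commit()
    return db

def apply_delta(db, created, deleted, new_key, crash=False):
    """Apply one server response; changes and key land together or not at all."""
    with db:  # commits on success, rolls back if an exception escapes
        for uid, subject in created:
            db.execute("INSERT OR REPLACE INTO messages VALUES (?, ?)",
                       (uid, subject))
        for uid in deleted:
            db.execute("DELETE FROM messages WHERE uid = ?", (uid,))
        if crash:
            raise RuntimeError("simulated crash before commit")
        db.execute("UPDATE folder_state SET sync_key = ? WHERE id = 1",
                   (new_key,))

def current_key(db):
    return db.execute("SELECT sync_key FROM folder_state").fetchone()[0]
```

After a failed apply_delta() the cache still holds key 'XXX' and none of
the new messages; after a successful one it holds 'YYY' and all of them.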

The *only* issue happens when our cache is inconsistent, and we end up
applying the changes since XXX to a copy of our local cache which
*doesn't* actually match the real state of the folder at that point in
time, because we'd already applied *some* changes to it before crashing.

That's why the atomicity is necessary.

If Evolution's cache storage can't give us that atomicity, then we
should be able to fake it. I suspect the best answer is to write the
server's response to disk before processing it. On startup, we can check
if any such changes are outstanding and need to be replayed.

We'll *still* be replaying changes since XXX to a folder state which
doesn't actually match XXX, but at least it'll be the *same* set of
changes. That actually makes it OK.
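
A minimal sketch of that roll-forward idea follows, assuming the server
response can be serialised to a small journal file and that applying a
response is idempotent. The file format, the dictionary-based cache and
all function names are hypothetical; save_cache is whatever the caller
uses to persist the cache and key.

```python
import json
import os

def write_journal(path, response):
    """Persist the raw server response before touching the cache."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(response, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # the journal appears complete or not at all

def apply_response(cache, response):
    """Idempotent: applying the same response twice gives the same cache."""
    for uid in response["created"]:
        cache["messages"][uid] = True
    for uid in response["deleted"]:
        cache["messages"].pop(uid, None)
    cache["sync_key"] = response["new_key"]

def replay_journal(journal_path, cache, save_cache):
    """Run at startup, before asking the server anything."""
    if not os.path.exists(journal_path):
        return False
    with open(journal_path) as f:
        response = json.load(f)
    apply_response(cache, response)
    save_cache(cache)
    os.remove(journal_path)  # drop the journal only after the cache is saved
    return True

def refresh(journal_path, cache, response, save_cache):
    """Normal path: journal first, then apply through the same replay code."""
    write_journal(journal_path, response)
    replay_journal(journal_path, cache, save_cache)
```

If the process dies anywhere after write_journal(), the next startup
replays the *same* XXX-to-YYY response over whatever half-updated cache
was left behind, which is the case described above as being OK.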



Re: [Evolution-hackers] RFClue: Atomic folder updates

2011-12-05 Thread Milan Crha
On Tue, 2011-12-06 at 00:09 +0000, David Woodhouse wrote:
> In fact, even that would be OK if we got the *same* answer. All of the
> changes we get given in a single update are perfectly fine to apply
> twice. The real problem happens if the folder has changed again in the
> meantime, and some of the changes we were originally given in the
> XXX-YYY delta have now been *reverted* (like a message being marked
> read and then unread, or created and then deleted).

Hi,
thanks for the explanation, it makes sense and I didn't think of it this
way before.

> If Evolution's cache storage can't give us that atomicity, then we
> should be able to fake it. I suspect the best answer is to write the
> server's response to disk before processing it. On startup, we can check
> if any such changes are outstanding and need to be replayed.
>
> We'll *still* be replaying changes since XXX to a folder state which
> doesn't actually match XXX, but at least it'll be the *same* set of
> changes. That actually makes it OK.

That's basically what I tried to express with my a), b), c), d), only at
the level of CamelFolderSummary, without any extra files being involved.

>> Nonetheless, there are people who want to connect to their
>> exchange account from multiple machines. Did you try whether this sync
>> can work in such an environment, please?
>
> Yes, I've been using EWS from both my laptop and my desktop for much of
> the last year (until I switched focus to ActiveSync; now I run only on
> one since I got a new laptop and haven't yet configured EWS on it).

I agree, given your explanation above. There should be no problem using
multiple clients if it works as you described.
Thanks and bye,
Milan



Re: [Evolution-hackers] RFClue: Atomic folder updates

2011-12-04 Thread Sankar P
My memory is very rusty and I have not looked at the Evolution sources in
more than 3 years now. However, we had a similar situation when we were
working on a GroupWise backend. The trick we used was to get the number
of messages in a folder on startup of Evolution (or when the user
explicitly presses the get-mail button), compare the mail counts between
the summary and the server, and update if they differ (obviously after
fetching the new items in this delta).

I am not sure whether the Exchange server has such a count API. If so, it
can be used without breaking the summary code much.

Sankar
http://psankar.blogspot.com 

 On 12/3/2011 at 05:35 AM, in message
1322870713.5191.20.ca...@shinybook.infradead.org, David Woodhouse
dw...@infradead.org wrote: 
> We have at least three mail protocols now which are delta-based. That
> is, you have a bookmark, 'sync key' or timestamp which represents what
> the server *last* told you about a given folder. You say to the server
> "what changed since XXX", and get back a list of added/removed/changed
> messages along with a *new* timestamp YYY.
>
> It's a very efficient way to handle mailbox access, and it's used by at
> least ActiveSync, EWS and IMAP+QRESYNC. In the Exchange protocols it's
> called 'SyncState' or 'SyncKey', and in IMAP it's the HIGHESTMODSEQ.
> (It's never *actually* a timestamp, since wall-clock time is a PITA. But
> it's easy to *think* of it as a timestamp; the modification time on the
> folder).
>
> The problem is, we need to handle these updates *atomically*. If we
> store the new timestamp before the changed messages, and we crash in the
> middle of doing so, then we might miss out forever on the messages in
> question. We'd restart, go to the server and say "what happened since
> YYY" and we never get told *again* about the messages which came in
> between time XXX and time YYY, that we didn't manage to fetch.
>
> And if you do it the other way round and store the changed messages
> first, and crash before you store the new timestamp, you get similar
> issues (cf. bug 664637).
>
> I'm not sure how to fix it. Looking at EWS first, since that's where we
> noticed it, I pondered removing the camel_folder_summary_save_to_db()
> call from the camel_ews_utils_sync_{created,updated,deleted}_items()
> helper functions, so it happens just *once* at the end of the loop in
> ews_refresh_info_sync() and commits all the changes at once. But just
> deferring the camel_folder_summary bits isn't enough, is it? The
> individual message_infos will have a lifetime of their own, and even
> *internally*, camel_folder_summary_save_to_db() doesn't actually write
> things out atomically using transactions in sqlite... does it?
>
> Any suggestions or insight would be gratefully received...
>
> --
> dwmw2
 




Re: [Evolution-hackers] RFClue: Atomic folder updates

2011-12-04 Thread David Woodhouse
On Sun, 2011-12-04 at 10:04 -0700, Sankar P wrote:
> My memory is very rusty and I have not looked at the Evolution sources
> in more than 3 years now. However, we had a similar situation when we
> were working on a GroupWise backend. The trick we used was to get the
> number of messages in a folder on startup of Evolution (or when the
> user explicitly presses the get-mail button), compare the mail counts
> between the summary and the server, and update if they differ
> (obviously after fetching the new items in this delta).
>
> I am not sure whether the Exchange server has such a count API. If so,
> it can be used without breaking the summary code much.

We do something similar to that for IMAP — using the counts as a sanity
check and falling back to the old-style non-QRESYNC method of refetching
the flags for *every* message in the folder.
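
In outline, that kind of sanity check could look like the sketch below.
The function and parameter names are hypothetical and this is not the
actual IMAP provider code; fetch_delta and fetch_all_uids stand for
whatever calls return the delta and the full message list. Note that a
matching count does not prove the cache is right (flags could still
differ); it only catches the cheap-to-detect cases.

```python
def refresh_folder(local_uids, server_count, fetch_delta, fetch_all_uids):
    """Apply the delta, then fall back to a full refetch if the counts disagree."""
    added, removed = fetch_delta()
    uids = (set(local_uids) | set(added)) - set(removed)
    if len(uids) != server_count:
        # The cache disagrees with the server: discard it and resync fully.
        uids = set(fetch_all_uids())
    return uids
```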

I don't believe we have fallback like that for EWS and ActiveSync. Those
didn't have sane delta-based operation 'tacked on' to an old protocol
which we can fall back to; they were designed to be delta-based from the
*start*, and the client is expected *not* to get itself out of sync.

Even if we could detect a problem in ActiveSync, we'd basically have to
refetch the entire folder to recover — and that would mean that message
IDs all change, so our cache has to be entirely blown away. I *think* we
could keep the cache in EWS and just refetch the message details, but I
still don't like it — it would be a horrid workaround for a bug, if we
ever have to use it.

When the server responds, it tells us "here are a bunch of changes, and
your new 'bookmark' for the folder state is XXX". We should remember
that, atomically. If we crash, our cache on restart should reflect
either the state *before* the response, or the state *after* it. Never
something in between.

-- 
dwmw2




[Evolution-hackers] RFClue: Atomic folder updates

2011-12-02 Thread David Woodhouse
We have at least three mail protocols now which are delta-based. That
is, you have a bookmark, 'sync key' or timestamp which represents what
the server *last* told you about a given folder. You say to the server
"what changed since XXX", and get back a list of added/removed/changed
messages along with a *new* timestamp YYY.

It's a very efficient way to handle mailbox access, and it's used by at
least ActiveSync, EWS and IMAP+QRESYNC. In the Exchange protocols it's
called 'SyncState' or 'SyncKey', and in IMAP it's the HIGHESTMODSEQ.
(It's never *actually* a timestamp, since wall-clock time is a PITA. But
it's easy to *think* of it as a timestamp; the modification time on the
folder).

The problem is, we need to handle these updates *atomically*. If we
store the new timestamp before the changed messages, and we crash in the
middle of doing so, then we might miss out forever on the messages in
question. We'd restart, go to the server and say "what happened since
YYY" and we never get told *again* about the messages which came in
between time XXX and time YYY, that we didn't manage to fetch.

And if you do it the other way round and store the changed messages
first, and crash before you store the new timestamp, you get similar
issues (cf. bug 664637).

I'm not sure how to fix it. Looking at EWS first, since that's where we
noticed it, I pondered removing the camel_folder_summary_save_to_db()
call from the camel_ews_utils_sync_{created,updated,deleted}_items()
helper functions, so it happens just *once* at the end of the loop in
ews_refresh_info_sync() and commits all the changes at once. But just
deferring the camel_folder_summary bits isn't enough, is it? The
individual message_infos will have a lifetime of their own, and even
*internally*, camel_folder_summary_save_to_db() doesn't actually write
things out atomically using transactions in sqlite... does it?

Any suggestions or insight would be gratefully received...

-- 
dwmw2

