Re: Message flag caching and polling.

2003-03-18 Thread Alexey Melnikov
David Woodhouse wrote:

 On Thu, 2003-03-13 at 00:34, Alexey Melnikov wrote:
  Please, have a look at draft-melnikov-imap-condstore-10.txt. Your timestamp is
  called modseq (modification sequence) in the draft. FLAGS-VALIDITY is called
  HIGHESTMODSEQ in the document.

 That is indeed almost precisely what I was looking for. Thank you.

 Btw I see only -09.

Hmm, maybe I haven't sent it yet ;-). I will double check.

 Did you s/successfull/successful/ in 3 of -10
 already?

No, this is fixed now. Thank you.

  The functionality you propose can be built as a small extension to CONDSTORE
  (and yes, other people already proposed something similar before).

 It's early here -- I'm not sure if there's any functionality I'd wanted
 that isn't there, from a client's point of view.

 The only real difference I note is that I was trying to allow for the
 option of a far more naive server implementation, where the server
 _only_ keeps 'HIGHESTMODSEQ' and doesn't actually keep MODSEQ for
 individual messages -- or where it keeps MODSEQ only for the N most
 recently changed messages. This might improve the adoption rate of the
 extension while still allowing the majority of the benefit to be seen by
 clients.

I am not 100% sure, but I believe this can be done for the draft as written.
If you can find any requirements that will prevent this, post a message to the
mailing list and let's discuss.

 I'm not particularly tied to that idea though -- from a client's point
 of view, I'm perfectly happy without it.

Cheers,
Alexey Melnikov
__
R&D, ACI Worldwide/MessagingDirect
Watford, UK

Work Phone: +44 1923 81 2877
Home Page: http://orthanc.ab.ca/mel

I speak for myself only, not for my employer.
__






Re: Message flag caching and polling.

2003-03-18 Thread Timo Sirainen
On Tue, 2003-03-18 at 20:16, Alexey Melnikov wrote:
  Also do you really have to return MODSEQ messageset for each SEARCH and
  SORT? You're just returning the search results twice. Why not add the MODSEQ
  into the untagged reply itself? ie. * SEARCH 1 2 4 10 (MODSEQ 123). Since
  client specifically asked for the MODSEQ, there's no compatibility issues.
 
 Actually I like the idea.
 I would rather change the syntax to * SEARCH (MODSEQ 123) 1 2 4 10, so that you
 can recognize new syntax earlier in parsing.

Well, from server's point of view it'd be better to send last, so it
wouldn't have to buffer all the search results before sending MODSEQ.
Not that important though.
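
A minimal sketch of how a client might accept either placement of the MODSEQ group discussed above. The function name `parse_search_reply` is hypothetical, and both reply syntaxes are only the proposals made in this thread, not something confirmed by the draft.

```python
import re
from typing import List, Optional, Tuple

# Matches the "(MODSEQ n)" group wherever it appears in the reply.
_MODSEQ = re.compile(r"\(MODSEQ (\d+)\)")


def parse_search_reply(line: str) -> Tuple[List[int], Optional[int]]:
    """Parse '* SEARCH ...' with an optional (MODSEQ n) before or after the ids."""
    prefix = "* SEARCH"
    if not line.startswith(prefix):
        raise ValueError("not an untagged SEARCH reply")
    rest = line[len(prefix):]
    match = _MODSEQ.search(rest)
    modseq = int(match.group(1)) if match else None
    # Remove the MODSEQ group, then read the remaining tokens as message ids.
    ids = [int(tok) for tok in _MODSEQ.sub(" ", rest).split()]
    return ids, modseq
```

With the group first, a streaming parser knows about the extended syntax before it reads any ids; with the group last, the server can emit ids as it finds them. This sketch reads the whole line, so it does not care which form is used.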



Re: Message flag caching and polling.

2003-03-13 Thread David Woodhouse
On Wed, 2003-03-12 at 16:49, Mark Crispin wrote:
 On Wed, 12 Mar 2003, David Woodhouse wrote:
  After all, although Evolution is taking 10 seconds to open certain
  folders at the moment because it's re-downloading flags, it doesn't
  actually _need_ all of those flags
 
 Right.  So why does it need to download all of them?  Note that the SEARCH
 command can often be used in lieu of having the flags locally.

It currently tries to download all of them because it's broken and we
can't update the GUI tree widget as and when items become visible. That
is an implementation detail which does need fixing.

That fact doesn't change the fact that not being able to _cache_ flags
is also suboptimal; each problem merely exacerbates the other.

But SEARCH is a very good idea for optimising Evolution's _current_
behaviour -- since Evo only needs to know (IIRC) about \Seen and
\Answered, those two search commands are likely to be a _lot_ faster
than just fetching all the flags. And it can probably be done in Evo's
IMAP back end without actually having to have rewritten all the rest of
the mailer (yet). Thanks for the suggestion.

For my linux-kernel mailing list folder (and bearing in mind that 'ssh
$MAILHOST /bin/true' takes three seconds on its own)...

$ ( time echo -e "000 select inbox.lists.l-k\\n001 search (unseen)\\n002 search (answered)\\n003 logout\\n" |
  ssh $MAILHOST imapd ) 2>&1 | egrep real\|EXISTS
* 5272 EXISTS
real    0m4.095s
$ ( time echo -e "000 select inbox.lists.l-k\\n001 fetch 1:5272 (flags)\\n002 logout\\n" |
  ssh $MAILHOST imapd ) 2>&1 | grep real
real    0m25.027s

Admittedly that's the folder that's going to give the _best_ ratio,
since it has many messages, few of which are unseen and even fewer of
which are answered. But then again, that's the one I care about most,
because that's the one that's been taking 20-odd seconds to open :)
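
A rough sketch of how a client could rebuild the two flags it cares about from the results of the two SEARCH commands instead of fetching every flag. The function name `flags_from_searches` and its arguments are hypothetical; it assumes the client already has the EXISTS count and the two lists of sequence numbers.

```python
from typing import Dict, Iterable, Set


def flags_from_searches(exists: int, unseen: Iterable[int],
                        answered: Iterable[int]) -> Dict[int, Set[str]]:
    """Rebuild Seen/Answered flags per sequence number from two SEARCH results."""
    unseen_set, answered_set = set(unseen), set(answered)
    flags: Dict[int, Set[str]] = {}
    for seq in range(1, exists + 1):
        current: Set[str] = set()
        if seq not in unseen_set:
            current.add("\\Seen")      # everything not reported UNSEEN is seen
        if seq in answered_set:
            current.add("\\Answered")
        flags[seq] = current
    return flags
```

The saving comes from the server sending only the (usually short) lists of unseen and answered messages rather than one FETCH response per message.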

  I prefer to do the same. The startup time is far from negligible. But I
  got out of the habit of leaving them running while using wu-imapd,
  because the imapd would keep killing itself if two clients looked at the
  same folder at the same time.
 
 That's an artifact of the traditional UNIX mailbox format.  Try the mbx
 format.

I tried MH, and I didn't like the loss of unseen-sequence with wu-imapd.
Mail I read from pine or with an IMAP client wasn't showing up as read
in other MH-capable stuff.

Now I'm trying maildir. Maybe I'll try mbx next :)
 
  It's about five seconds per page (19 mails) for me this morning. How
  does the size of the envelope matter -- we're only displaying four
  fields. Are we downloading headers we don't actually need?
 
 Pine uses envelopes, not headers.  If it's taking 5 seconds to get 19
 normal sized envelopes then something else is wrong.

I'll investigate precisely what it's doing. 

 Session is defined in RFC 2060, and refers to the period that a mailbox
 is selected.

Ah; OK. Since pine tends to restart a session each time I change
mailbox, that's not actually caching for as long as I'd like it to. I
tend to flit between mailboxen a lot.

-- 
dwmw2



Re: Message flag caching and polling.

2003-03-13 Thread Timo Sirainen
On Wed, Mar 12, 2003 at 05:45:25PM -0700, Alexey Melnikov wrote:
 Please, have a look at draft-melnikov-imap-condstore-10.txt. Your timestamp is
 called modseq (modification sequence) in the draft. FLAGS-VALIDITY is called
 HIGHESTMODSEQ in the document.
 The functionality you propose can be built as a small extension to CONDSTORE
 (and yes, other people already proposed something similar before).

There's no way I'm going to store extra 64bit per message in mailbox just
for this extension, but looks like it wouldn't prevent from using the tricks
I mentioned earlier. Instead of storing it for each message, I'd use my
existing transaction log file to remember last few changes. MODSEQ would be
last_MODSEQ + position in log file. MODSEQ of messages not in log file
would all be last_MODSEQ. Do you think this would cause any problems?
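
One possible reading of this scheme, as a sketch only. The class `LogModseq` and its method names are hypothetical, and the truncation rule (raise the base to the current highest value) is an assumption added so that no MODSEQ ever decreases.

```python
from typing import List


class LogModseq:
    """MODSEQ derived from a bounded change log instead of per-message storage."""

    def __init__(self, base_modseq: int = 1) -> None:
        self.base = base_modseq    # "last_MODSEQ": value for messages not in the log
        self.log: List[int] = []   # UIDs in the order their flags changed

    def record_change(self, uid: int) -> int:
        self.log.append(uid)
        return self.modseq(uid)

    def modseq(self, uid: int) -> int:
        # The latest log position wins; positions are 1-based.
        for pos in range(len(self.log), 0, -1):
            if self.log[pos - 1] == uid:
                return self.base + pos
        return self.base

    def highest_modseq(self) -> int:
        return self.base + len(self.log)

    def truncate(self) -> None:
        # Assumption: dropping the log raises the floor for every message.
        self.base = self.highest_modseq()
        self.log.clear()
```

Note the side effect: after `truncate()`, every message appears to have changed relative to any older cached value, so clients holding an older MODSEQ have to resync everything.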

Also do you really have to return MODSEQ messageset for each SEARCH and
SORT? You're just returning the search results twice. Why not add the MODSEQ
into the untagged reply itself? ie. * SEARCH 1 2 4 10 (MODSEQ 123). Since
client specifically asked for the MODSEQ, there's no compatibility issues.



Re: Message flag caching and polling.

2003-03-13 Thread David Woodhouse
On Thu, 2003-03-13 at 15:42, Timo Sirainen wrote:
 On Wed, Mar 12, 2003 at 05:45:25PM -0700, Alexey Melnikov wrote:
  Please, have a look at draft-melnikov-imap-condstore-10.txt. Your timestamp is
  called modseq (modification sequence) in the draft. FLAGS-VALIDITY is called
  HIGHESTMODSEQ in the document.
  The functionality you propose can be built as a small extension to CONDSTORE
  (and yes, other people already proposed something similar before).
 
 There's no way I'm going to store extra 64bit per message in mailbox just
 for this extension, but looks like it wouldn't prevent from using the tricks
 I mentioned earlier. Instead of storing it for each message, I'd use my
 existing transaction log file to remember last few changes. MODSEQ would be
 last_MODSEQ + position in log file. MODSEQ of messages not in log file
 would all be last_MODSEQ. Do you think this would cause any problems?

If you do this, you'll need to fix up the case where there's a single
client and it's the only one making changes to the folder.

If a client makes a conditional store, not only does the MODSEQ of the
_changed_ messages increase, but also the MODSEQ of _all_ other messages
in the folder.

Maybe you'd want something like...
 a103 UID STORE 6,4,8 (HIGHESTMODSEQ 200012121230045)
... which succeeds only if 200012121230045 is actually the HIGHESTMODSEQ
for the folder, and returns the _new_ HIGHESTMODSEQ.

That'd mean that a client has to be _entirely_ up to date before
submitting any changes though.
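
A minimal sketch of the check-then-store behaviour suggested above. The `Folder` class and `store_if_current` are hypothetical names, and the folder is modelled as a plain in-memory dictionary rather than any real server store.

```python
from typing import Dict, Iterable, Optional, Set


class Folder:
    def __init__(self) -> None:
        self.highest_modseq = 1
        self.flags: Dict[int, Set[str]] = {}

    def store_if_current(self, uids: Iterable[int], flag: str,
                         expected: int) -> Optional[int]:
        """Add flag to uids only if expected equals the folder's HIGHESTMODSEQ."""
        if expected != self.highest_modseq:
            return None                   # client is stale; nothing is changed
        for uid in uids:
            self.flags.setdefault(uid, set()).add(flag)
        self.highest_modseq += 1
        return self.highest_modseq        # the new value for the client to cache
```

A client that gets `None` back would have to resynchronise and retry, which is the "entirely up to date" cost mentioned above.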

-- 
dwmw2



Re: Message flag caching and polling.

2003-03-13 Thread Timo Sirainen
On Thu, Mar 13, 2003 at 04:04:56PM +, David Woodhouse wrote:
  I mentioned earlier. Instead of storing it for each message, I'd use my
  existing transaction log file to remember last few changes. MODSEQ would be
  last_MODSEQ + position in log file. MODSEQ of messages not in log file
  would all be last_MODSEQ. Do you think this would cause any problems?
 
 If you do this, you'll need to fix up the case where there's a single
 client and it's the only one making changes to the folder.
 
 If a client makes a conditional store, not only does the MODSEQ of the
 _changed_ messages increase, but also the MODSEQ of _all_ other messages
 in the folder.

I'm not sure what you mean. Yes, currently my log isn't written to unless
there's multiple clients accessing the mailbox. It's also truncated whenever
all clients are synced. Supporting this extension would have to change that
behaviour so that the log is written every time and is truncated only after
there's been many changes. There's actually two log files, when one is
full the other is truncated and selected, so half of the changes would be
left. That way MODSEQ of non-changed messages will change only when log file
is truncated, which likely takes days and most clients would have synced by
then.
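
The two-file rotation can be sketched as follows. `TwoLogs`, its capacity parameter, and the in-memory lists standing in for files are all assumptions made for illustration.

```python
from typing import List


class TwoLogs:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.logs: List[List[int]] = [[], []]
        self.active = 0

    def append(self, uid: int) -> None:
        if len(self.logs[self.active]) >= self.capacity:
            self.active ^= 1                  # switch to the other file...
            self.logs[self.active].clear()    # ...and truncate it before reuse
        self.logs[self.active].append(uid)

    def recent_changes(self) -> List[int]:
        # Older file first, then the active one.
        return self.logs[self.active ^ 1] + self.logs[self.active]
```

Because only one file is emptied at a time, at least one full file of history survives each rotation.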



Re: Message flag caching and polling.

2003-03-13 Thread Timo Sirainen
On Thu, Mar 13, 2003 at 06:18:43PM +0200, Timo Sirainen wrote:
  If you do this, you'll need to fix up the case where there's a single
  client and it's the only one making changes to the folder.
 I'm not sure what you mean.

OK, now I get it :) Yes, it would prevent client from caching the changes
infinitely. Maybe that's a problem for clients that want to be completely
up-to-date all the time, but not for those that only want to be sure the
visible messages are up to date. Also like I said, it would have to resync
old messages only when the log file is truncated which is much less than the
current sync every time.



Re: Message flag caching and polling.

2003-03-12 Thread David Woodhouse
On Tue, 2003-03-11 at 22:56, Mark Crispin wrote: 
 That's not the right view.  You should instead have "how can I build a
 good client within the context of IMAP, without expecting the existing
 IMAP world to add facilities to enable me."

That's how I _started_. I looked at the way UIDs allow me to cache
message headers on the client so they don't need to be re-downloaded. I
looked for an equivalent mechanism for flags, and found it to be absent.

Given that even if an extension allowing flag caching _does_ come into
existence I'd still have to implement the non-caching variant for
compatibility, perhaps I should ignore this particular problem for the
moment and come back to it if and when it actually starts to be a
significant problem.

After all, although Evolution is taking 10 seconds to open certain
folders at the moment because it's re-downloading flags, it doesn't
actually _need_ all of those flags, so perhaps it'll turn out that
having to re-download them isn't quite as much of a showstopper as it is
right now, although it'll still be suboptimal.

  Not entirely. I don't necessarily want a _complete_ synchronised state
  -- but neither do I want to have to discard the state I _do_ need, and
  download it again shortly thereafter, just because the server gives me
  no way to be sure that nothing's changed in the meantime.
 
 I keep my IMAP clients (at least, one at home and one in the office)
 running for days at a time.  I have two incoming mailboxes and some
 newsgroups.  All other mailboxes are ones that get updated by my action.

I prefer to do the same. The startup time is far from negligible. But I
got out of the habit of leaving them running while using wu-imapd,
because the imapd would keep killing itself if two clients looked at the
same folder at the same time. 

Now I'm _staying_ out of that habit because Evolution has the same
behaviour w.r.t. message flags as you describe below -- it doesn't
re-download them in the same session (i.e. the life of one 'evolution'
invocation). Which is somewhat unfortunate if the flags are changed by
another client, because we never notice till I exit and restart Evo.

The _correct_ behaviour, according to IMAP, is to redownload those flags
we care about _every_ time the folder is reselected. 

  Well, maybe. More fundamentally though, I'd say that this is a
  consequence of living on the wrong end of a 64K ISDN link and hoping
  that somewhere, somewhen, I can eliminate _any_ redundant traffic to the
  IMAP server.
 
 I once regularly used IMAP over a 2400 bps line, and to this day I still
 use IMAP over a CDPD device.  Please do not treat me as someone who
 doesn't understand the issue of slow lines.

I wouldn't claim you don't understand. I'm just trying to express the
pain that's a large part of my motivation :)

 My claim is not that you should re-download data that you already had.
 Rather, you should not re-download anything unless you need it; and by
 following a strict "don't need it, don't download it" will buy you more
 than efforts to keep the caches of a dozen clients in synchronization with
 each other.

I am in violent agreement with this. I _really_ don't want to
re-download data that I already had. This is why I'm looking for a way
to _know_ when I need to do so and when I don't, because currently the
status is that I have to re-download it unconditionally. 

Yes, of course I shouldn't be downloading anything I don't _need_ in the
first place, but that is largely an orthogonal issue. Extensions for
server-side threading have already allowed clients to get away with
requiring far less to be downloaded in the first place; that alleviates
but doesn't entirely remove the requirement for being able to cache what
we _do_ need without having to redownload it.

  And redownloading message flags which I already had and which haven't
  changed is definitely redundant -- especially if it's doing so just
  because I temporarily selected my inbox to read a new mail therein,
  before returning to the mailing list I was reading a moment ago.
 
 So why didn't you spawn a separate session for the other mailbox?  TCP
 sessions are cheap.

Well, it's not actually just TCP. It's 'ssh mailhost.$COMPANY.internal
exec imapd' where ssh knows that for *.$COMPANY.internal you actually
run 'ssh bastionhost.$COMPANY.com exec netcat %h %p' instead of making a
TCP connection... 

So it's not _that_ cheap, and not that fast to start up either. And I
can't just open one IMAP connection for _every_ folder I visit in an MUA
session -- there have to be limits on the number of connections I open.
Besides which, it doesn't fix the case where I _have_ actually closed
the MUA and restarted it, or where a temporary network outage has caused
a disconnection. 

But yes, making the client automatically spawn separate IMAP connections
in order to maintain state on some folders is a possibility which allows
us to work around the lack of caching info. It would have helped us work
around 

Re: Message flag caching and polling.

2003-03-12 Thread David Woodhouse
On Wed, 2003-03-12 at 06:58, Eric A. Hall wrote:
 I'm not trying to start a religious war here, but how much work would it
 really be to have a protocol extension which allowed the client to request
 flags which have changed since time. It seems that all of the difficulty
 would be in the implementation (the server data-store), not in the
 protocol, and there would be significant benefits to having this option
 available in the protocol. Faster resynchronization between sessions would
 be very good for all clients, online and offline alike.

I'm going to assume you meant something sane when you said time, of
course :) 

The protocol side could be fairly simple -- the idea that Timo Sirainen
offered in [EMAIL PROTECTED] seems fairly close to
what we'd want. You'd declare that a server supporting FLAGS-VALIDITY
_MUST_ include any messages with changed flags in its response, and
SHOULD make an effort not to include messages _without_ changed flags.

The _implementation_ doesn't have to be that difficult either -- the
common case where no other client has changed flags since the last visit
can happily be dealt with by a trivial change-counter where we tell a
client to redownload _all_ flags if _any_ change has been made.

Of course the protocol should allow more complex setups where we
guarantee to give a list with _no_ false positives. 

Some more thought about possible such implementations was given in 
[EMAIL PROTECTED]...

On Tue, 2003-03-11 at 17:30, Timo Sirainen wrote:
 - Keep flags-validity value of last flag change for each message. Takes
   pretty much disk space and may be slow.
 - Keep only the last flags-validity value. That helps only when there hasn't
   been any flag changes since client last accessed the mailbox
 - Keep low-validity and low-uid. If the client requests any flags-validity >=
   low-validity, only return low-uid:* instead of 1:*
 - Keep a log of the last few flags-validities and what messages they changed

Basically, it all looks technically feasible. It's just a case of
whether people will actually want it and start to make use of it.
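
As an illustration of the low-validity/low-uid option in the list above, here is one possible interpretation. The class `LowWatermark` is hypothetical, and the assumption is that the server remembers only the lowest UID whose flags changed since `low_validity`.

```python
from typing import Optional


class LowWatermark:
    def __init__(self, validity: int = 1) -> None:
        self.validity = validity          # current flags-validity counter
        self.low_validity = validity      # oldest counter the watermark covers
        self.low_uid: Optional[int] = None  # lowest UID changed since low_validity

    def flag_changed(self, uid: int) -> None:
        self.validity += 1
        self.low_uid = uid if self.low_uid is None else min(self.low_uid, uid)

    def refetch_range(self, client_validity: int) -> str:
        """UID range the client must refetch; empty string means nothing."""
        if client_validity == self.validity:
            return ""
        if client_validity >= self.low_validity and self.low_uid is not None:
            return f"{self.low_uid}:*"
        return "1:*"
```

This costs two integers per mailbox and works well when changes cluster at the recent end of a large folder, at the price of some false positives above `low_uid`.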

-- 
dwmw2



Re: Message flag caching and polling.

2003-03-12 Thread Eric A. Hall

on 3/12/2003 2:42 AM David Woodhouse wrote:
 On Wed, 2003-03-12 at 06:58, Eric A. Hall wrote:
 
I'm not trying to start a religious war here, but how much work would it
really be to have a protocol extension which allowed the client to request
flags which have changed since time. It seems that all of the difficulty
would be in the implementation (the server data-store), not in the
protocol, and there would be significant benefits to having this option
available in the protocol. Faster resynchronization between sessions would
be very good for all clients, online and offline alike.
 
 I'm going to assume you meant something sane when you said time, of
 course :) 

Uh, heh, I meant time alright. Specifically have the server return a
timestamp whenever a folder is closed, and let the client cache it. On
reconnect, the client can just ask for all messages with a modification
timestamp later than the last-cached value for that folder. For rich
data-stores, this just requires a last-modified-on attribute for each
message record; return those records which have a last-modified-on value
greater than the requested value.
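
For a data-store that does keep a last-modified-on value per message, the query is a simple filter. A sketch, assuming a hypothetical mapping from UID to a numeric last-modified stamp:

```python
from typing import Dict, List


def changed_since(last_modified: Dict[int, int], since: int) -> List[int]:
    """Return UIDs whose last-modified stamp is strictly later than `since`."""
    return sorted(uid for uid, stamp in last_modified.items() if stamp > since)
```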

 The protocol side could be fairly simple -- the idea that Timo Sirainen
 offered in [EMAIL PROTECTED] seems fairly close to
 what we'd want. You'd declare that a server supporting FLAGS-VALIDITY
 _MUST_ include any messages with changed flags in its response, and
 SHOULD make an effort not to include messages _without_ changed flags.

Doesn't this require the server to cache client state? It'd be a lot
simpler for the clients to keep track of their own state, since that's
what they're already doing.

-- 
Eric A. Hall             http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/



Re: Message flag caching and polling.

2003-03-12 Thread David Woodhouse
On Wed, 12 Mar 2003, Eric A. Hall wrote:

  I'm going to assume you meant something sane when you said time, of
  course :) 
 
 Uh, heh, I meant time alright. Specifically have the server return a
 timestamp whenever a folder is closed, and let the client cache it.

We're digressing somewhat. Yes the idea is basically sound but time 
actually isn't guaranteed to actually be different _every_ time you query 
it, and in the presence of NTP etc., isn't actually guaranteed to be 
monotonic either. Plus as soon as you start _calling_ it 'time' you get 
people wanting to use the clock on the _client_ and that's obviously even 
more broken.

Take a timestamp, make sure it's always newer than the previous one, and 
call it something different :)
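
A small sketch of that idea: start from the wall clock but never hand out a value that is not strictly greater than the previous one. `MonotonicStamp` is a hypothetical name, and the optional `now` argument exists only so the behaviour can be shown without depending on the real clock.

```python
import time
from typing import Optional


class MonotonicStamp:
    def __init__(self) -> None:
        self.last = 0

    def next(self, now: Optional[int] = None) -> int:
        """Return a value strictly greater than any previously returned."""
        if now is None:
            now = int(time.time())
        # If the clock stood still or stepped backwards, just count upwards.
        self.last = max(now, self.last + 1)
        return self.last
```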

  The protocol side could be fairly simple -- the idea that Timo Sirainen
  offered in [EMAIL PROTECTED] seems fairly close to
  what we'd want. You'd declare that a server supporting FLAGS-VALIDITY
  _MUST_ include any messages with changed flags in its response, and
  SHOULD make an effort not to include messages _without_ changed flags.
 
 Doesn't this require the server to cache client state? It'd be a lot
 simpler for the clients to keep track of their own state, since that's
 what they're already doing.

It doesn't require any per-client state on the server. For each change 
made to the folder, the server advances through a sequence of cookies 
(which might by an amazing coincidence resemble timestamps) and hands 
one off to the client. Each client remembers the cookie which was 
given to match its own locally-cached information.

Then occasionally a client comes along to the server and says "Tell me
what, if anything, changed since $THEN."

The server has the _option_ of maintaining some details about what changed
and when, for the last few changes which were made to the folder. If the 
cookie the client presents is so old that the server's forgotten 
everything that happened since then, then the server can just tell the 
client to invalidate its whole cache. 

If the server isn't keeping any logs at all, this just becomes a simple 
compare of the cookie with the 'latest' cookie, and a yes/no answer. 
Even that most simple implementation will suffice to optimise the common 
case where no changes have been made by another client in the time since 
this particular client last looked at the folder.

(Of course you ensure that in the one-client case you don't trigger 
invalidates when they're not necessary. So new mail arriving shouldn't 
change the change-counter cookie, only changing of flags on _existing_ 
mail should do that. And when a client changes flags in a folder and the 
change-counter cookie advances, that client shouldn't be told to 
invalidate its cache if it was already up-to-date. But those are just 
details.)
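
The cookie scheme described above can be sketched like this. `ChangeCounter` and its methods are hypothetical; `log_size=0` models the simplest server that keeps no log at all, and `None` stands for "invalidate your whole cache".

```python
from collections import deque
from typing import Deque, Optional, Set, Tuple


class ChangeCounter:
    def __init__(self, log_size: int = 0) -> None:
        self.cookie = 0
        # (cookie, uid) pairs for the most recent changes; maxlen=0 keeps nothing.
        self.log: Deque[Tuple[int, int]] = deque(maxlen=log_size)

    def flags_changed(self, uid: int) -> int:
        self.cookie += 1
        self.log.append((self.cookie, uid))
        return self.cookie

    def changed_since(self, then: int) -> Optional[Set[int]]:
        """Set of changed UIDs, or None meaning 'invalidate the whole cache'."""
        if then == self.cookie:
            return set()
        oldest_known = self.log[0][0] if self.log else self.cookie + 1
        # Unknown cookie, or the log no longer reaches back far enough.
        if then > self.cookie or then + 1 < oldest_known:
            return None
        return {uid for cookie, uid in self.log if cookie > then}
```

No per-client state is held: the only thing the server stores is the counter and, optionally, a short shared history.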

-- 
dwmw2



Re: Message flag caching and polling.

2003-03-12 Thread Cyrus Daboo
Hi Eric,

--On Wednesday, March 12, 2003 12:58:55 AM -0600 Eric A. Hall 
[EMAIL PROTECTED] wrote:

| I'm not trying to start a religious war here, but how much work would it
| really be to have a protocol extension which allowed the client to request
| flags which have changed since time. It seems that all of the difficulty
| would be in the implementation (the server data-store), not in the
| protocol, and there would be significant benefits to having this option
| available in the protocol. Faster resynchronization between sessions would
| be very good for all clients, online and offline alike.
|
| In those cases where it was impractical to store this kind of information,
| the server wouldn't implement it, which is reasonable behavior for any
| optional extension.
|
| Ignoring the data-store issues which will dictate whether a specific
| server is able to implement the feature, how feasible would this be?

The idea of adding a proper 'dynamic' synchronization capability to IMAP 
has been discussed before. You actually need to go further than has been 
discussed in this thread in that it's also good to get the list of messages 
expunged since the last time.

The solution to this involves adding a parameter to the SELECT command that 
is an opaque 'token' (which may be a timestamp or a transaction id or 
something else) and returns the UID of messages that have been expunged (as 
one untagged response) and messages whose state (flags, annotations etc) 
has changed (as another untagged response). The set of new messages can be 
inferred from the server in the usual manner so that data set does not need 
to be returned. Note that actual flag data is NOT sent - just the UID. That 
way clients get to decide exactly when they fetch the data.

Servers would be allowed to respond with an untagged 'NO token too old' 
response if they don't have data going back far enough to satisfy the 
request, in which case clients would have to fall back to what they do now.

The only issue with this is coming up with a reliable way for clients to 
get the token. That probably means requiring clients to CLOSE or UNSELECT 
mailboxes in order to get the token at a point after all unsolicited 
responses have been sent.

Of course implementing this on the server is going to be a pain for some 
types of mailbox format and it's going to involve managing a lot of 
meta-data for the mailbox. I've always envisioned this as being done using 
transaction ids (i.e. each command that changes the state of the 
mailbox/messages is recorded as a transaction) rather than timestamps. But 
that's something server vendors can decide on.
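
A sketch of the transaction-id variant, under stated assumptions: `TransactionLog` is a hypothetical name, the token is taken to be the id of the next transaction, and returning `None` stands in for the 'NO token too old' response.

```python
from typing import List, Optional, Tuple


class TransactionLog:
    def __init__(self, first_txn: int = 1) -> None:
        self.first_txn = first_txn                 # oldest transaction on record
        self.entries: List[Tuple[str, int]] = []   # (kind, uid); kind is "expunge" or "change"

    def record(self, kind: str, uid: int) -> None:
        self.entries.append((kind, uid))

    def token(self) -> int:
        return self.first_txn + len(self.entries)

    def resync(self, token: int) -> Optional[Tuple[List[int], List[int]]]:
        """Return (expunged_uids, changed_uids) since token, or None if too old."""
        if token < self.first_txn or token > self.token():
            return None
        recent = self.entries[token - self.first_txn:]
        expunged = sorted({uid for kind, uid in recent if kind == "expunge"})
        changed = sorted({uid for kind, uid in recent
                          if kind == "change" and uid not in expunged})
        return expunged, changed
```

Only UIDs are returned, matching the point above that the client decides when to fetch the actual flag data.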

--
Cyrus Daboo


Re: Message flag caching and polling.

2003-03-12 Thread Mark Crispin
On Wed, 12 Mar 2003, David Woodhouse wrote:
 After all, although Evolution is taking 10 seconds to open certain
 folders at the moment because it's re-downloading flags, it doesn't
 actually _need_ all of those flags

Right.  So why does it need to download all of them?  Note that the SEARCH
command can often be used in lieu of having the flags locally.

 I prefer to do the same. The startup time is far from negligible. But I
 got out of the habit of leaving them running while using wu-imapd,
 because the imapd would keep killing itself if two clients looked at the
 same folder at the same time.

That's an artifact of the traditional UNIX mailbox format.  Try the mbx
format.

 It's about five seconds per page (19 mails) for me this morning. How
 does the size of the envelope matter -- we're only displaying four
 fields. Are we downloading headers we don't actually need?

Pine uses envelopes, not headers.  If it's taking 5 seconds to get 19
normal sized envelopes then something else is wrong.

  Have you noticed that Pine *does* completely cache within a session?  As
  long as you don't give up the session, Pine will never re-fetch the same
  data.
 I'm not sure what you mean by 'session' in this context.

Session is defined in RFC 2060, and refers to the period that a mailbox
is selected.

-- Mark --

http://staff.washington.edu/mrc
Science does not emerge from voting, party politics, or public debate.


Re: Message flag caching and polling.

2003-03-12 Thread Eric A. Hall

on 3/12/2003 5:38 PM Steve Hole wrote:

 The answer -- the bigger the mailbox and the lower the transaction rate
  into the mailbox, the bigger the win.   If the mailbox was small or
 had a high transaction rate (lots of expunged and new messages) then it
 wasn't that much of a win.
 
 A number of mailboxes do have this type of activity pattern -- my inbox
  being one of those.

That seems like a reasonable outcome if you are only taking watermarks
periodically. If the watermark was updated frequently (such as each open
and close), wouldn't the problem be diminished for mailboxes that were
selected frequently?

Also the obvious point that where this feature benefits the most is
exactly where it is needed the most. I have more than a few folders with
4000 messages in them, and those folders take a very long time to open
whenever there is any significant latency or bandwidth constraint.

-- 
Eric A. Hall             http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/



Re: Message flag caching and polling.

2003-03-12 Thread Alexey Melnikov
Please, have a look at draft-melnikov-imap-condstore-10.txt. Your timestamp is
called modseq (modification sequence) in the draft. FLAGS-VALIDITY is called
HIGHESTMODSEQ in the document.
The functionality you propose can be built as a small extension to CONDSTORE
(and yes, other people already proposed something similar before).

David Woodhouse wrote:

 On Wed, 12 Mar 2003, Eric A. Hall wrote:

   I'm going to assume you meant something sane when you said time, of
   course :)
 
  Uh, heh, I meant time alright. Specifically have the server return a
  timestamp whenever a folder is closed, and let the client cache it.

 We're digressing somewhat. Yes the idea is basically sound but time
 actually isn't guaranteed to actually be different _every_ time you query
 it, and in the presence of NTP etc., isn't actually guaranteed to be
 monotonic either. Plus as soon as you start _calling_ it 'time' you get
 people wanting to use the clock on the _client_ and that's obviously even
 more broken.

 Take a timestamp, make sure it's always newer than the previous one, and
 call it something different :)

   The protocol side could be fairly simple -- the idea that Timo Sirainen
   offered in [EMAIL PROTECTED] seems fairly close to
   what we'd want. You'd declare that a server supporting FLAGS-VALIDITY
   _MUST_ include any messages with changed flags in its response, and
   SHOULD make an effort not to include messages _without_ changed flags.
 
  Doesn't this require the server to cache client state? It'd be a lot
  simpler for the clients to keep track of their own state, since that's
  what they're already doing.

 It doesn't require any per-client state on the server. For each change
 made to the folder, the server advances through a sequence of cookies
 (which might by an amazing coincidence resemble timestamps) and hands
 one off to the client. Each client remembers the cookie which was
 given to match its own locally-cached information.

 Then occasionally a client comes along to the server and says "Tell me
 what, if anything, changed since $THEN."

 The server has the _option_ of maintaining some details about what changed
 and when, for the last few changes which were made to the folder. If the
 cookie the client presents is so old that the server's forgotten
 everything that happened since then, then the server can just tell the
 client to invalidate its whole cache.

 If the server isn't keeping any logs at all, this just becomes a simple
 compare of the cookie with the 'latest' cookie, and a yes/no answer.
 Even that most simple implementation will suffice to optimise the common
 case where no changes have been made by another client in the time since
 this particular client last looked at the folder.

 (Of course you ensure that in the one-client case you don't trigger
 invalidates when they're not necessary. So new mail arriving shouldn't
 change the change-counter cookie, only changing of flags on _existing_
 mail should do that. And when a client changes flags in a folder and the
 change-counter cookie advances, that client shouldn't be told to
 invalidate its cache if it was already up-to-date. But those are just
 details.)

Regards,
Alexey
__
R&D, ACI Worldwide/MessagingDirect
Watford, UK

Work Phone: +44 1923 81 2877
Home Page: http://orthanc.ab.ca/mel

I speak for myself only, not for my employer.
__






Re: Message flag caching and polling.

2003-03-12 Thread Alexey Melnikov
Please, have a look at draft-melnikov-imap-condstore-10.txt. Your timestamp is
called modseq (modification sequence) in the draft. FLAGS-VALIDITY is called
HIGHESTMODSEQ in the document.
The functionality you propose can be build as a small extension to CONDSTORE
(and yes, other people already proposed something similar before).

David Woodhouse wrote:

 On Wed, 12 Mar 2003, Eric A. Hall wrote:

   I'm going to assume you meant something sane when you said time, of
   course :)
 
  Uh, heh, I meant time alright. Specifically have the server return a
  timestamp whenever a folder is closed, and let the client cache it.

 We're digressing somewhat. Yes the idea is basically sound but time
 actually isn't guaranteed to actually be different _every_ time you query
 it, and in the presence of NTP etc., isn't actually guaranteed to be
 monotonic either. Plus as soon as you start _calling_ it 'time' you get
 people wanting to use the clock on the _client_ and that's obviously even
 more broken.

 Take a timestamp, make sure it's always newer than the previous one, and
 call it something different :)
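The "timestamp that is always newer than the previous one" idea can be sketched in a few lines. This is only an illustration, not anything proposed in the thread: the class name `ChangeStamp` and the injectable `clock` parameter are made up for the example.

```python
import time


class ChangeStamp:
    """Issues stamps that resemble timestamps but never repeat or go backwards."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so a stalled or stepped clock can be simulated
        self._last = 0

    def next(self):
        now = int(self._clock())
        # If the clock has not advanced, or has stepped backwards (NTP, manual
        # adjustment), fall back to simply counting up from the last stamp.
        self._last = max(now, self._last + 1)
        return self._last
```

Because the value is only ever compared for order, nothing on the client side needs to know that it started life as a clock reading.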

   The protocol side could be fairly simple -- the idea that Timo Sirainen
   offered in [EMAIL PROTECTED] seems fairly close to
   what we'd want. You'd declare that a server supporting FLAGS-VALIDITY
   _MUST_ include any messages with changed flags in its response, and
   SHOULD make an effort not to include messages _without_ changed flags.
 
  Doesn't this require the server to cache client state? It'd be a lot
  simpler for the clients to keep track of their own state, since that's
  what they're already doing.

 It doesn't require any per-client state on the server. For each change
 made to the folder, the server advances through a sequence of cookies
 (which might by an amazing coincidence resemble timestamps) and hands
 one off to the client. Each client remembers the cookie which was
 given to match its own locally-cached information.

 Then occasionally a client comes along to the server and says "Tell me
 what, if anything, changed since $THEN."

 The server has the _option_ of maintaining some details about what changed
 and when, for the last few changes which were made to the folder. If the
 cookie the client presents is so old that the server's forgotten
 everything that happened since then, then the server can just tell the
 client to invalidate its whole cache.

 If the server isn't keeping any logs at all, this just becomes a simple
 compare of the cookie with the 'latest' cookie, and a yes/no answer.
 Even that most simple implementation will suffice to optimise the common
 case where no changes have been made by another client in the time since
 this particular client last looked at the folder.

 (Of course you ensure that in the one-client case you don't trigger
 invalidates when they're not necessary. So new mail arriving shouldn't
 change the change-counter cookie, only changing of flags on _existing_
 mail should do that. And when a client changes flags in a folder and the
 change-counter cookie advances, that client shouldn't be told to
 invalidate its cache if it was already up-to-date. But those are just
 details.)
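The scheme described above (a per-folder cookie, an optional bounded log of recent changes, and a fallback to "invalidate everything") might look roughly like this on the server side. It is a sketch under assumptions: `FolderChangeLog`, `keep` and the three result labels are hypothetical names, and the cookie is a plain counter.

```python
from collections import deque


class FolderChangeLog:
    """Per-folder change cookie plus an optional bounded log of recent changes."""

    def __init__(self, keep=0):
        self.cookie = 0
        # keep=0 is the simplest server: it remembers only the latest cookie.
        self._log = deque(maxlen=keep)

    def flags_changed(self, uid):
        """Record a flag change on an existing message; returns the new cookie."""
        self.cookie += 1
        self._log.append((self.cookie, uid))  # silently dropped when keep=0
        return self.cookie

    def changes_since(self, client_cookie):
        if client_cookie == self.cookie:
            return ("unchanged", set())
        # The log answers the question only if it still reaches back to the
        # first change made after the client's cookie.
        covered = (
            client_cookie < self.cookie
            and self._log
            and self._log[0][0] <= client_cookie + 1
        )
        if covered:
            return ("changed", {uid for c, uid in self._log if c > client_cookie})
        return ("invalidate", set())
```

Returning the new cookie from `flags_changed` is what would let a client that was already up to date adopt it after its own change instead of being told to invalidate. Note that new mail arriving does not touch the cookie at all in this sketch.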

Regards,
Alexey
__
R & D, ACI Worldwide/MessagingDirect
Watford, UK

Work Phone: +44 1923 81 2877
Home Page: http://orthanc.ab.ca/mel

I speak for myself only, not for my employer.
__





Re: Message flag caching and polling.

2003-03-12 Thread David Woodhouse
On Thu, 2003-03-13 at 00:34, Alexey Melnikov wrote:
 Please, have a look at draft-melnikov-imap-condstore-10.txt. Your timestamp is
 called modseq (modification sequence) in the draft. FLAGS-VALIDITY is called
 HIGHESTMODSEQ in the document.

That is indeed almost precisely what I was looking for. Thank you.

Btw I see only -09. Did you s/successfull/successful/ in §3 of -10
already?

 The functionality you propose can be built as a small extension to CONDSTORE
 (and yes, other people already proposed something similar before).

It's early here -- I'm not sure if there's any functionality I'd wanted
that isn't there, from a client's point of view.

The only real difference I note is that I was trying to allow for the
option of a far more naïve server implementation, where the server
_only_ keeps 'HIGHESTMODSEQ' and doesn't actually keep MODSEQ for
individual messages -- or where it keeps MODSEQ only for the N most
recently changed messages. This might improve the adoption rate of the
extension while still allowing the majority of the benefit to be seen by
clients. 

I'm not particularly tied to that idea though -- from a client's point
of view, I'm perfectly happy without it.

-- 
dwmw2



Re: Message flag caching and polling.

2003-03-11 Thread Mark Keasling
Hi,


On 11 Mar 2003 10:20:22 +, David Woodhouse [EMAIL PROTECTED] wrote...
 Consider the case where the following happens, in this order:
   1. Client SELECTs a mailbox.
   2. Mail arrives.
   3. Client issues 'IDLE' command.
 
 AFAICT nothing in RFC2177 states that the server 'MUST' immediately
 inform the client of changes to the mailbox which have occurred since
 the mailbox was selected, and unless that behaviour is required, there's
 _always_ a race condition between selecting the folder and entering
 'idle' mode. 

There is no race condition.  When the client sends the idle command, it
is telling the server to send notification about new mail as it arrives.
Since new mail has already arrived prior to the initiation of the IDLE
command, the server should immediately notify the client about the new mail
and then continue to send updates as they occur.

Regards,
Mark Keasling [EMAIL PROTECTED]



Re: Message flag caching and polling.

2003-03-11 Thread David Woodhouse
On Tue, 2003-03-11 at 10:39, Mark Keasling wrote:
 There is no race condition.  When the client sends the idle command, it
 is telling the server to send notification about new mail as it arrives.
 Since new mail has already arrived prior to the initiation of the IDLE
 command, the server should immediately notify the client about the new mail
 and then continue to send updates as they occur.

You said 'should' and I agree with that wholeheartedly. It's common
sense and obviously the correct thing to do. It'd be utterly stupid to
do otherwise.

But you didn't say 'MUST' in capitals :)

My concern is that you can't always rely on implementors to apply common
sense, but you _can_ rely on them to deviate as far as possible from
common sense while staying technically within the bounds of the RFC (and
even that only if you're lucky :).

-- 
dwmw2



Re: Message flag caching and polling.

2003-03-11 Thread Mark Keasling
Hi,

On 11 Mar 2003 10:48:52 +, David Woodhouse [EMAIL PROTECTED] wrote...
 On Tue, 2003-03-11 at 10:39, Mark Keasling wrote:
  There is no race condition.  When the client sends the idle command, it
  is telling the server to send notification about new mail as it arrives.
  Since new mail has already arrived prior to the initiation of the IDLE
  command, the server should immediately notify the client about the new mail
  and then continue to send updates as they occur.
 
 You said 'should' and I agree with that wholeheartedly. It's common
 sense and obviously the correct thing to do. It'd be utterly stupid to
 do otherwise.
 
 But you didn't say 'MUST' in capitals :)

Okay, I've had to dig out RFC2060 and thumb through it.
Here is the relevant text...

5.2.    Mailbox Size and Message Status Updates

   At any time, a server can send data that the client did not request.
   Sometimes, such behavior is REQUIRED.  For example, agents other than
   the server MAY add messages to the mailbox (e.g. new mail delivery),
   change the flags of message in the mailbox (e.g. simultaneous access
   to the same mailbox by multiple agents), or even remove messages from
   the mailbox.  A server MUST send mailbox size updates automatically
   if a mailbox size change is observed during the processing of a
   command.  A server SHOULD send message flag updates automatically,
   without requiring the client to request such updates explicitly.
   Special rules exist for server notification of a client about the
   removal of messages to prevent synchronization errors; see the
   description of the EXPUNGE response for more detail.

7.  Server Responses
...
   An example of unilateral untagged server data occurs when the IMAP
   connection is in selected state.  In selected state, the server
   checks the mailbox for new messages as part of command execution.
   Normally, this is part of the execution of every command; hence, a
   NOOP command suffices to check for new messages.  If new messages are
   found, the server sends untagged EXISTS and RECENT responses
   reflecting the new size of the mailbox.  Server implementations that
   offer multiple simultaneous access to the same mailbox SHOULD also
   send appropriate unilateral untagged FETCH and EXPUNGE responses if
   another agent changes the state of any message flags or expunges any
   messages.

You will notice that the RFC has MUST.

 My concern is that you can't always rely on implementors to apply common
 sense, but you _can_ rely on them to deviate as far as possible from
 common sense while staying technically within the bounds of the RFC (and
 even that only if you're lucky :).

A server cannot remain compliant with RFC2060 and not send mailbox
size updates at its first opportunity, which is at the latest the
next command sent by the client.  The server can't change its notion
of the mailbox size without informing the client.  The size can grow
at any time.  There are limits however on when the server can tell
the client that it has shrunk.

The IDLE extension is a way for the client to start a command
essentially asking the server for updates in real time.  The server
can then send updates as frequently or not as it chooses.  This
eliminates the need for a constant stream of NOOP commands to
achieve the same result.

A server which fails to notify the client of new mail isn't exactly
useful IMHO.

Regards,
Mark Keasling [EMAIL PROTECTED]



Re: Message flag caching and polling.

2003-03-11 Thread David Woodhouse
On Tue, 2003-03-11 at 11:41, Mark Keasling wrote:
  But you didn't say 'MUST' in capitals :)
 
 Okay, I've had to dig out RFC2060 and thumb through it.
 Here is the relevant text...
 
 ...  A server MUST send mailbox size updates automatically
if a mailbox size change is observed during the processing of a
command.  A server SHOULD send message flag updates automatically,
without requiring the client to request such updates explicitly.

Well, if I _really_ wanted to assume stupidity or malice on the part of
the server author -- and often it's useful for clients to do exactly
that -- I'd point out that the latter is only a 'SHOULD' and the former
although 'MUST' is predicated on the server actually _observing_ a
change, not on the mere fact that there _is_ a change. 

But in practice I suppose I can't see anyone claiming "But we didn't
check for the change hence we didn't observe it" as an excuse, so it
does come very close to being the 'MUST' which I was after, and which
I'd missed. Thanks for pointing it out.

 A server which fails to notify the client of new mail isn't exactly
 useful IMHO.

True.

-- 
dwmw2



Re: Message flag caching and polling.

2003-03-11 Thread David Woodhouse
On Tue, 2003-03-11 at 12:21, Timo Sirainen wrote:
 I've been thinking that server could send STATUS-replies whenever it
 notices new mail in mailboxes. Maybe a STATUS-WATCH (mailboxlist) or
 similiar command for that. Or that configuration could even be done at
 server side so it might not even require changes to clients, assuming it
 updates it's internal counts whenever it sees STATUS reply.

Well, even though the server is already permitted to send unsolicited
STATUS 'replies', the client would need to know that the server _will_
do so, hence that the client doesn't need to keep polling. There'd have
to be _some_ way for the server to indicate that situation, and probably
also you'd want a way for the client to select _which_ folders should be
monitored by the server. So the 'STATUS-WATCH' idea seems more useful
than silently configuring the server to do it. I wonder if just limiting
it to subscribed folders is OK though, or whether we really want the
client to be able to give a list?

 Right, \Marked is more about if mailbox possibly has \Recent messages
 (or something else interesting).

OK. Hence the only way a client can currently poll for mail in all
folders is to execute a separate 'STATUS' command on _each_ one,
periodically.

  4. What mechanisms exist or have been discussed to allow client-side
  caching of message flags (such as \Seen, \Answered in particular).
 
 Server should send flag updates to client. I think it's pretty safe to
 cache the flags for the current session and assume the server notifies
 of any changes.

Sorry, my question wasn't clear. I speak not of the currently-selected
folder but of _other_ folders, for which the server will not be sending
updates (and besides; flags changes are only a 'SHOULD' anyway).

 Evolution's IMAP support is pretty bad. I've been rewriting it for a
 while with one of the Ximian guys, but it's not that far yet and I've
 other things to do too.

True; the point of my questions is to establish best practice before I
try to assist with that rewrite. I started by adding the capacity to get
at an IMAP server by running 'ssh $mailhost exec imapd', and now I'm
looking at the rest of the things which annoy me about it.

The major one at the moment is the fact that STATUS queries for all
folders get queued in one lump, and I have to wait for them all to
complete before it'll actually do the next thing that _I_ asked it to ;)

Yes, I can make it prioritise commands from a queue and send
'user-driven' requests before background status polling, but in the long
term I'd rather eliminate the STATUS requests altogether if possible.
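The interim fix mentioned here, sending user-driven requests ahead of background STATUS polling, amounts to a small priority queue in the client. A minimal sketch, assuming hypothetical names (`CommandQueue`, `USER`, `BACKGROUND`) that have nothing to do with Evolution's actual code:

```python
import heapq
import itertools

USER, BACKGROUND = 0, 1  # lower number is sent first


class CommandQueue:
    """Orders pending IMAP commands by priority, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker that preserves arrival order

    def put(self, command, priority=BACKGROUND):
        heapq.heappush(self._heap, (priority, next(self._order), command))

    def get(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

This only helps if the client writes commands to the connection one at a time (or in small batches); a STATUS command that has already been sent cannot be overtaken.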

  Would it be feasible to add something like a flags-sequence-number which
  is incremented each time flags for existing messages are changed, which
  a client can use to invalidate its cache of message flags?
 
 Probably not, it wouldn't be that much different from simply fetching
 the flags instead of flag-seq-number.

I'm not sure I agree with that. Consider the case of a folder with 3000
messages in it, which _hasn't_ been modified by another client since it
was last used by the client we're sitting in front of. Perhaps we've
just selected another folder temporarily and then switched back to the
first. Surely it's better to do a single compare of a sequence number
(basically a datestamp) than to issue a FETCH for the flags of _every_
message again?

I was thinking of a new command where the client can present the
'flags-validity' sequence number which matches its own cached flags for
the folder, and be told by the server 'You need to (re)fetch flags for
the following messages...'. Which gives the naïve server implementation
the option of just saying 'all messages' if anything's changed, or more
complicated servers could perhaps keep track of _which_ messages' flags
have changed in the last few sequence numbers; keeping 'last changed at'
stamps per-message or something so that a client can be given a more
limited set of messages for which the flags have changed.

-- 
dwmw2



Re: Message flag caching and polling.

2003-03-11 Thread Timo Sirainen
On Tue, Mar 11, 2003 at 01:59:26PM +, David Woodhouse wrote:
 Well, even though the server is already permitted to send unsolicited
 STATUS 'replies', the client would need to know that the server _will_
 do so, hence that the client doesn't need to keep polling.

Well, yes. The reason I want it is that I'd want to see new mail immediately
when it comes, instead of only when client decides to poll for it. It
doesn't really matter to me if client still does the polling.

 monitored by the server. So the 'STATUS-WATCH' idea seems more useful
 than silently configuring the server to do it. I wonder if just limiting
 it to subscribed folders is OK though, or whether we really want the
 client to be able to give a list?

Subscribed list could be pretty large.

 OK. Hence the only way a client can currently poll for mail in all
 folders is to execute a separate 'STATUS' command on _each_ one,
 periodically.

It's also possible to create a connection for each checked mailbox and
issue IDLE in each of them. I don't know if it's such a good idea though.

 True; the point of my questions is to establish best practice before I
 try to assist with that rewrite.

So far I haven't tried to do any special optimizations. I just want it to
work.

 I started by adding the capacity to get
 at an IMAP server by running 'ssh $mailhost exec imapd', and now I'm
 looking at the rest of the things which annoy me about it.

That's actually what I did first too :) Did you get it to work well? I just
couldn't figure out any good way to make it ask password, so I never
actually began using it.

 The major one at the moment is the fact that STATUS queries for all
 folders get queued in one lump, and I have to wait for them all to
 complete before it'll actually do the next thing that _I_ asked it to ;)

 Yes, I can make it prioritise commands from a queue and send
 'user-driven' requests before background status polling, but in the long
 term I'd rather eliminate the STATUS requests altogether if possible.

How about just getting a faster server :) My STATUS replies near instantly.

 I'm not sure I agree with that. Consider the case of a folder with 3000
 messages in it, which _hasn't_ been modified by another client since it
 was last used by the client we're sitting in front of. Perhaps we've
 just selected another folder temporarily and then switched back to the
 first. Surely it's better to do a single compare of a sequence number
 (basically a datestamp) than to issue a FETCH for the flags of _every_
 message again?

Using multiple connections would also help with this. Then you wouldn't be
temporarily switching between mailboxes.

 I was thinking of a new command where the client can present the
 'flags-validity' sequence number which matches its own cached flags for
 the folder, and be told by the server 'You need to (re)fetch flags for
 the following messages...'. Which gives the naïve server implementation
 the option of just saying 'all messages' if anything's changed, or more
 complicated servers could perhaps keep track of _which_ messages' flags
 have changed in the last few sequence numbers; keeping 'last changed at'
 stamps per-message or something so that a client can be given a more
 limited set of messages for which the flags have changed.

Hmm. Perhaps.

SELECT INBOX (FLAGS-VALIDITY 100)
..
* OK [FLAGS-VALIDITY 105] 1:4,5:10
...
* 5 FETCH (FLAGS (..))
* OK [FLAGS-VALIDITY 106]
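For illustration, a client could pick the validity value and the message set out of a response line shaped like the ones above. FLAGS-VALIDITY is only a proposal in this thread, not an existing IMAP extension, so the response syntax, the function name `parse_flags_validity` and the regular expression are all assumptions.

```python
import re

# Matches the hypothetical "* OK [FLAGS-VALIDITY n] set" line from the example.
_RESP = re.compile(r"^\* OK \[FLAGS-VALIDITY (\d+)\](?: (\S+))?$")


def _num(token):
    # "*" means "the highest message in the mailbox"; represent it as None.
    return None if token == "*" else int(token)


def parse_flags_validity(line):
    """Return (validity, [(first, last), ...]) or None if the line does not match."""
    m = _RESP.match(line.strip())
    if m is None:
        return None
    ranges = []
    if m.group(2):
        for part in m.group(2).split(","):
            first, _, last = part.partition(":")
            ranges.append((_num(first), _num(last) if last else _num(first)))
    return int(m.group(1)), ranges
```

An empty range list means the server reported a new validity without asking the client to refetch anything.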

But I'm still not sure if it's worth it. Client doesn't _have_ to update
flags for all messages. You probably only care about the unseen count which
you can get with STATUS, and the flags for visible messages. Fetching flags
for only the 20 messages or so isn't slow at all. That would require some
changes to client, but probably not that much.



Re: Message flag caching and polling.

2003-03-11 Thread David Woodhouse
On Tue, 2003-03-11 at 15:09, Timo Sirainen wrote:
 On Tue, Mar 11, 2003 at 01:59:26PM +, David Woodhouse wrote:
  Well, even though the server is already permitted to send unsolicited
  STATUS 'replies', the client would need to know that the server _will_
  do so, hence that the client doesn't need to keep polling.
 
 Well, yes. The reason I want it is that I'd want to see new mail immediately
 when it comes, instead of only when client decides to poll for it. It
 doesn't really matter to me if client still does the polling.

Well, I'd probably settle for that too, and I could do it just by
hacking an imapd to offer unsolicited STATUS reports, and of course
making Evolution deal with them -- but _ideally_ we'd eliminate the
polling from the client in the case where it's not necessary.

Until I actually get bogged down in the details of making Evolution
behave in a vaguely sane way, I'll stick to chasing the ideal ;)

  monitored by the server. So the 'STATUS-WATCH' idea seems more useful
  than silently configuring the server to do it. I wonder if just limiting
  it to subscribed folders is OK though, or whether we really want the
  client to be able to give a list?
 
 Subscribed list could be pretty large.

True; and it's also true that the set of folders which actually receive
new mail is likely to be a subset of that.

  OK. Hence the only way a client can currently poll for mail in all
  folders is to execute a separate 'STATUS' command on _each_ one,
  periodically.
 
 It's also possible to create a connection for each checked mailbox and
 issue IDLE in each of them. I don't know if it's such a good idea though.

It'd work for me, I suppose, because I'm the only real user of the box
I'm using as a server. I wouldn't really want to advocate that as a
'best practice' way of monitoring multiple mailboxes on servers with
_more_ than one user though :)

 That's actually what I did first too :) Did you get it to work well? I just
 couldn't figure out any good way to make it ask password, so I never
 actually began using it.

Ask for password? What do you think ssh-agent is for? :)

You just want it to use ssh-askpass if necessary. There's a trick to
that -- if SSH finds it has a controlling tty, it'll open that and
attempt to use it to interact with the user, instead of running
$SSH_ASKPASS. So my Evo patches explicitly disconnect from the
controlling tty before running the ssh command. It does bring up a
window asking for a password.
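One way to get the same effect from a script is to start ssh in a new session, which leaves it without a controlling terminal. This is a sketch of the trick, not the Evolution patch itself: the helper names are hypothetical, it assumes a POSIX system, and ssh will still only use $SSH_ASKPASS if that variable (and normally DISPLAY) is set in the environment.

```python
import subprocess


def imapd_over_ssh_argv(mailhost):
    """Command line corresponding to 'ssh $mailhost exec imapd'."""
    return ["ssh", mailhost, "exec", "imapd"]


def spawn_without_tty(argv):
    """Start argv with pipes for stdin/stdout and no controlling terminal."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        # The child calls setsid() before exec, so it becomes the leader of a
        # new session that has no controlling tty.
        start_new_session=True,
    )
```

The returned pipes then carry the IMAP conversation, for example `spawn_without_tty(imapd_over_ssh_argv("mail.example.org"))` with a placeholder host name.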

ftp://ftp.uk.linux.org/pub/people/dwmw2/evo/

 How about just getting a faster server :) My STATUS replies near instantly.

I've already switched from wu-imapd to Courier, because I objected to
wu-imapd trawling through megabytes of each of my project archive
folders, checking the status of each message to see if it's unseen, when
it had done precisely the same thing one minute ago and the ctime of the
mbox file blatantly hadn't changed :)
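The complaint is that the scan was repeated although the file's ctime had not moved. A server-side cache keyed on that could be sketched as follows. The function name `cached_scan` and the `scan` callback are hypothetical, and comparing the size alongside the ctime is an extra assumption of the sketch, not something stated in the message.

```python
import os

_cache = {}  # path -> ((ctime_ns, size), result of the last scan)


def cached_scan(path, scan):
    """Run scan(path) only if the file's ctime or size changed since last time."""
    st = os.stat(path)
    key = (st.st_ctime_ns, st.st_size)
    hit = _cache.get(path)
    if hit is not None and hit[0] == key:
        return hit[1]  # file untouched: reuse the previous answer
    result = scan(path)
    _cache[path] = (key, result)
    return result
```

Here `scan` stands for whatever expensive walk over the mbox file produces the STATUS numbers.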

But I'm still the wrong end of a 64K ISDN line and even now, with
negligible time actually taken by the _server_ the round-trip time for
40-odd STATUS commands is enough to annoy me.

  I'm not sure I agree with that. Consider the case of a folder with 3000
  messages in it, which _hasn't_ been modified by another client since it
  was last used by the client we're sitting in front of. Perhaps we've
  just selected another folder temporarily and then switched back to the
  first. Surely it's better to do a single compare of a sequence number
  (basically a datestamp) than to issue a FETCH for the flags of _every_
  message again?
 
 Using multiple connections would also help with this. Then you wouldn't be
 temporarily switching between mailboxes.

It's sort of a viable workaround, but there are limits on the number of
connections you can open. Maybe you could keep a connection open for
each of the last N visited mailboxes, where N is configurable -- but
that still means you have to refetch all flags every time you start up,
and when you change through lots of folders (which is a common case when
you have lots of separate folders for lists). 
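The "keep a connection open for each of the last N visited mailboxes" workaround is essentially a least-recently-used cache of connections. A minimal sketch, where `MailboxConnections` and the `connect` factory are hypothetical and the connection objects only need a `close()` method:

```python
from collections import OrderedDict


class MailboxConnections:
    """Keeps at most `limit` connections, one per recently visited mailbox."""

    def __init__(self, limit, connect):
        self._limit = limit
        self._connect = connect  # factory: mailbox name -> open connection
        self._open = OrderedDict()

    def get(self, mailbox):
        if mailbox in self._open:
            self._open.move_to_end(mailbox)  # mark as most recently used
            return self._open[mailbox]
        if len(self._open) >= self._limit:
            # Drop the connection for the mailbox visited longest ago.
            _, oldest = self._open.popitem(last=False)
            oldest.close()
        conn = self._open[mailbox] = self._connect(mailbox)
        return conn
```

As the message points out, this does nothing for startup, and cycling through more than N folders still forces a refetch for every evicted one.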


 Hmm. Perhaps.
 
 SELECT INBOX (FLAGS-VALIDITY 100)
 ..
 * OK [FLAGS-VALIDITY 105] 1:4,5:10
 ...
 * 5 FETCH (FLAGS (..))
 * OK [FLAGS-VALIDITY 106]

That's much like what I was thinking, yeah.

 But I'm still not sure if it's worth it. Client doesn't _have_ to update
 flags for all messages. You probably only care about the unseen count which
 you can get with STATUS, and the flags for visible messages. Fetching flags
 for only the 20 messages or so isn't slow at all. That would require some
 changes to client, but probably not that much.

It's true that the client shouldn't be fetching _all_ the message flags.
To be honest I'm more annoyed with Evolution for insisting on making me
wait while it fetches all the message _headers_ before deigning to show
me the ten new messages at the bottom of the mailbox which was all I
actually wanted to see. But that's a separate quality-of-implementation

Re: Message flag caching and polling.

2003-03-11 Thread Mark Crispin
On Tue, 11 Mar 2003, David Woodhouse wrote:
 I've already switched from wu-imapd to Courier, because I objected to
 wu-imapd trawling through megabytes of each of my project archive
 folders, checking the status of each message to see if it's unseen, when
 it had done precisely the same thing one minute ago and the ctime of the
 mbox file blatantly hadn't changed :)

So why did you do a STATUS of the mailbox if the ctime hadn't changed?
It isn't as if IMAP doesn't provide mailbox status flags to tell you that.
Perhaps you use a client which doesn't know how to check those flags.

I also don't think that running a server which is known to be
non-compliant in many ways (Courier) is an improvement.

-- Mark --

http://staff.washington.edu/mrc
Science does not emerge from voting, party politics, or public debate.


Re: Message flag caching and polling.

2003-03-11 Thread Mark Crispin
On Tue, 11 Mar 2003, David Woodhouse wrote:
 By 'ctime' I meant the inode ctime on the underlying file system, for
 the inode of the mbox file.

I know what ctime is.

 When configured to 'check for mail in all folders' Evolution is issuing
 a LIST command, and then for each mailbox listed it's issuing a STATUS
 command. I wasn't aware of any way in which we could know at the client
 side that a given mailbox doesn't need checking. As discussed, even the
 \Unmarked flag doesn't necessarily mean that there isn't new mail in a
 mailbox which _this_ client hasn't previously seen.

That isn't checking for new mail; rather, it is checking for a need to
synchronize a particular client instance.

 I'm perfectly happy to fix Evolution if a correct fix is possible --
 it's just that I couldn't see one myself.

An IMAP server with fast STATUS metadata (such as Cyrus) will help you do
faster polling, but if the client really wants to maintain synchronization
with a great many mailboxes (as opposed to a much more limited set of
incoming mailboxes) then polling is not the correct fix.

The correct fix is MTA-based notification, not client-based polling of an
IMAP or POP server.  I've been working on a notification facility, largely
in secret because history has shown that the best way to preclude progress
is to have a committee work on it (as in there have been many prior
attempts and all have collapsed under their own weight).

Basically, you need to look at what comsat/biff do, and realize that
although those programs are obsolete, they are on the right track.

-- Mark --

http://staff.washington.edu/mrc
Science does not emerge from voting, party politics, or public debate.


Re: Message flag caching and polling.

2003-03-11 Thread Timo Sirainen
On Tue, Mar 11, 2003 at 03:58:55PM +, David Woodhouse wrote:
 But I'm still the wrong end of a 64K ISDN line and even now, with
 negligible time actually taken by the _server_ the round-trip time for
 40-odd STATUS commands is enough to annoy me.

That could be fixed by sending all the STATUS queries to server at once.
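Sending the queries "at once" means writing every STATUS command before reading any reply, so the link latency is paid once instead of once per folder. A sketch of building such a batch follows. The function name, the tag format and the chosen status items are assumptions, and mailbox names are assumed to be plain ASCII.

```python
def pipelined_status(mailboxes, items="(MESSAGES UNSEEN)", first_tag=1):
    """Return one byte string holding a tagged STATUS command per mailbox."""
    lines = []
    for n, name in enumerate(mailboxes, start=first_tag):
        # Send the name as a quoted string, escaping backslash and double quote.
        quoted = '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'
        lines.append(f"a{n:03d} STATUS {quoted} {items}\r\n")
    return "".join(lines).encode("ascii")
```

The client would write the whole buffer in one go and then match the tagged replies (`a001`, `a002`, ...) back to the mailboxes as they arrive.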

  But I'm still not sure if it's worth it. Client doesn't _have_ to update
  flags for all messages. You probably only care about the unseen count which
  you can get with STATUS, and the flags for visible messages. Fetching flags
  for only the 20 messages or so isn't slow at all. That would require some
  changes to client, but probably not that much.

Actually this brings to my mind: STATUS for selected folder isn't considered
good behaviour. Fetching flags for all messages in folder isn't considered
good behaviour. So, how exactly do I keep unseen messages count up to date
for selected mailbox?

I wouldn't be against flags-validity, but I'm not sure if it's worth the
trouble. The kind of validity-request uidset-response would allow servers to
be pretty flexible in how they handle it. I think it should work something
like this:

Client remembers the last flags-validity given by server and requests that
the next time it selects the mailbox. Server updates the validity when one
of the clients closes the mailbox and there has been at least one flag
update. It sends to all clients (including the one that is closing the
mailbox/connection):

* OK [FLAGS-VALIDITY newvalidity]

Updating flags-validity isn't necessarily a fast operation, so that way it
doesn't have to be updated all the time.

Some ways for server to handle this:

- Keep flags-validity value of last flag change for each message. Takes
  pretty much disk space and may be slow.
- Keep only the last flags-validity value. That helps only when there haven't
  been any flag changes since client last accessed the mailbox
- Keep low-validity and low-uid. If client requests any flags-validity >=
  low-validity, only return low-uid:* instead of 1:*
- Keep a log of the last few flags-validities and what messages they changed
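The third option in the list is the least obvious, so here is one possible reading of it as code. Everything about it is an assumption: the comparison is taken to mean "client's validity is at least low-validity", the validity is bumped on every change (the message suggests bumping it less often), and the names `LowWaterTracker`, `record_change` and `reset` are invented for the sketch.

```python
class LowWaterTracker:
    """Tracks one (low_validity, low_uid) pair instead of per-message stamps."""

    def __init__(self):
        self.validity = 0
        self.low_validity = 0  # every change after this validity ...
        self.low_uid = None    # ... touched only UIDs >= low_uid

    def record_change(self, uid):
        self.validity += 1
        self.low_uid = uid if self.low_uid is None else min(self.low_uid, uid)

    def reset(self):
        """Forget older changes so that later answers become narrower again."""
        self.low_validity = self.validity
        self.low_uid = None

    def changed_since(self, client_validity):
        if client_validity == self.validity:
            return None  # nothing to refetch
        if client_validity >= self.low_validity and self.low_uid is not None:
            return f"{self.low_uid}:*"
        return "1:*"  # too old to say anything better
```

The appeal is that the server stores two numbers per mailbox; the cost is that a single flag change on an old message widens the answer to almost the whole mailbox until the next reset.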



Re: Message flag caching and polling.

2003-03-11 Thread David Woodhouse
On Tue, 2003-03-11 at 17:15, Mark Crispin wrote:
 On Tue, 11 Mar 2003, David Woodhouse wrote:
  By 'ctime' I meant the inode ctime on the underlying file system, for
  the inode of the mbox file.
 
 I know what ctime is.

Sorry :) 

I was just confused because you seemed to be saying there was a way for
the client to query the ctime (or at least issue a command less slow
than STATUS) and decide for itself whether to re-issue a STATUS command
or not; which doesn't seem to be the case. Maybe I misunderstood what
you had said. 

  When configured to 'check for mail in all folders' Evolution is issuing
  a LIST command, and then for each mailbox listed it's issuing a STATUS
  command. I wasn't aware of any way in which we could know at the client
  side that a given mailbox doesn't need checking. As discussed, even the
  \Unmarked flag doesn't necessarily mean that there isn't new mail in a
  mailbox which _this_ client hasn't previously seen.
 
 That isn't checking for new mail; rather, it is checking for a need to
 synchronize a particular client instance.

The precise reason for doing this is because it's looking to see whether
it needs to update its folder tree widget by turning a folder name bold
and putting the number of unseen messages in parentheses by it, to
inform me (the user) of the fact that there are new messages therein.

We can describe that either as 'checking for new mail' or as 'checking
for a need to synchronise', but I suspect it doesn't really make a lot
of difference which terminology we use.

  I'm perfectly happy to fix Evolution if a correct fix is possible --
  it's just that I couldn't see one myself.
 
 An IMAP server with fast STATUS metadata (such as Cyrus) will help you do
 faster polling, 

That was the reason for switching to Courier. I considered Cyrus briefly
but it doesn't do PREAUTH when invoked from the MUA via 'ssh $mailhost
exec imapd'.

 but if the client really wants to maintain synchronization
 with a great many mailboxes (as opposed to a much more limited set of
 incoming mailboxes) then polling is not the correct fix.

 The correct fix is MTA-based notification, not client-based polling of an
 IMAP or POP server. 

I'm not sure what you mean by 'MTA-based notification'.

I would have said that the correct fix as far as the _client_ is
concerned is asynchronous notification by an IMAP server of such
changes. Hence the suggestion for an IMAP extension to provide just
that.

And you're arguably right that the correct way to _implement_ that is
MTA-based notification rather than having the IMAP server itself do
polling of individual mailboxen. 

  I've been working on a notification facility, largely
 in secret because history has shown that the best way to preclude progress
 is to have a committee work on it (as in there have been many prior
 attempts and all have collapsed under their own weight).

Were you considering this as an IMAP extension or a completely separate
connection which a client must make to the same host for a remarkably
similar purpose?

-- 
dwmw2



Re: Message flag caching and polling.

2003-03-11 Thread David Woodhouse
On Tue, 2003-03-11 at 17:30, Timo Sirainen wrote:
 That could be fixed by sending all the STATUS queries to server at once.

That's not really going to help. If I send all the STATUS queries to the
server at once, it'll go off and ignore me for a while, while it does
them all for me. It could be alleviated somewhat by ensuring that my
client doesn't let background STATUS updates take priority over _real_
stuff I've asked it to do for me in the foreground like displaying mail
messages -- but I still have a philosophical objection to the fact that
I'm required to poll for status in folders which the server _knows_
haven't changed :)
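
A rough sketch of the client-side prioritisation mentioned above, in Python. The class name and the two priority levels are hypothetical and not taken from Evolution or any other real client; the sketch only shows foreground requests being sent ahead of queued background STATUS polls.

```python
import heapq
import itertools

FOREGROUND = 0   # e.g. fetching the message the user just clicked on
BACKGROUND = 1   # e.g. periodic STATUS polls of other folders

class CommandQueue:
    """Hypothetical client-side queue: lower priority number is sent first."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order per level

    def push(self, priority, command):
        heapq.heappush(self._heap, (priority, next(self._order), command))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)

q = CommandQueue()
q.push(BACKGROUND, 'STATUS "lists/imap" (MESSAGES UNSEEN)')
q.push(BACKGROUND, 'STATUS "lists/kernel" (MESSAGES UNSEEN)')
q.push(FOREGROUND, "UID FETCH 801 (BODY.PEEK[TEXT])")
sent = [q.pop() for _ in range(len(q))]  # the FETCH comes out first
```

This only reorders what the client sends; it does nothing about the cost of each STATUS on the server, which is the objection raised above.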

 Actually this brings to my mind: STATUS for selected folder isn't considered
 good behaviour. Fetching flags for all messages in folder isn't considered
 good behaviour. So, how exactly do I keep unseen messages count up to date
 for selected mailbox?

000 SEARCH UNSEEN
* SEARCH 800 801
000 OK [FLAGS-VALIDITY 123] SEARCH DONE
001 FETCH 800:801 (FLAGS)
* 800 FETCH (FLAGS ())
* 801 FETCH (FLAGS ())
001 OK [FLAGS-VALIDITY 123]

You don't _need_ to keep flags for all messages, only those you care
about. Which in most cases does happen to include unseen messages.
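
A minimal Python sketch of the bookkeeping implied by the exchange above: seed a set of unseen messages from the SEARCH response, then adjust it as flag updates arrive. The function names are hypothetical, and the sketch ignores EXPUNGE renumbering of message sequence numbers, which a real client would have to handle (or avoid by using UIDs).

```python
def parse_search(line):
    """Return the message numbers from an untagged '* SEARCH ...' line."""
    parts = line.split()
    if parts[:2] != ["*", "SEARCH"]:
        raise ValueError("not a SEARCH response: %r" % line)
    return [int(n) for n in parts[2:]]

def apply_flags(unseen, msg, flags):
    """Update the unseen set after a FETCH (FLAGS ...) response for msg."""
    if "\\Seen" in flags:
        unseen.discard(msg)
    else:
        unseen.add(msg)

unseen = set(parse_search("* SEARCH 800 801"))
unseen_count = len(unseen)
```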

 I wouldn't be against flags-validity, but I'm not sure if it's worth the
 trouble. The kind of validity-request uidset-response would allow servers to
 be pretty flexible in how they handle it. I think it should work something
 like this:
 
 Client remembers the last flags-validity given by server and requests that
 the next time it selects the mailbox. Server updates the validity when one
 of the clients closes the mailbox and there has been at least one flag
 update. It sends to all clients (including the one that is closing the
 mailbox/connection):
 
 * OK [FLAGS-VALIDITY newvalidity]
 
 Updating flags-validity isn't necessarily a fast operation, so that way it
 doesn't have to be updated all the time.
 
 Some ways for server to handle this:
 
 - Keep flags-validity value of last flag change for each message. Takes
   quite a lot of disk space and may be slow.
 - Keep only the last flags-validity value. That helps only when there haven't
   been any flag changes since client last accessed the mailbox.
 - Keep low-validity and low-uid. If client requests any flags-validity >=
   low-validity, only return low-uid:* instead of 1:*
 - Keep a log of the last few flags-validities and what messages they changed

Yep, that's basically what I was thinking. The server's options range
from a simple change counter and telling the clients to invalidate _all_
cached flags if that changes, to far more sophisticated optimisations of
the invalidation list.
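
One way to picture the middle of that range is a bounded change log with a fall-back to "invalidate everything". The Python below is only a sketch under assumptions: the class, its method names and the integer token are all hypothetical, and nothing here is a proposed wire format.

```python
from collections import deque

class FlagChangeLog:
    """Hypothetical server-side bookkeeping for one mailbox."""

    def __init__(self, max_entries=4):
        self.validity = 1
        self._log = deque(maxlen=max_entries)  # (validity, frozenset of UIDs)

    def record_change(self, uids):
        """Flags of existing messages changed: bump the token, log the UIDs."""
        self.validity += 1
        self._log.append((self.validity, frozenset(uids)))
        return self.validity

    def changed_since(self, client_validity):
        """Return UIDs the client must refetch, or None for 'invalidate all'."""
        if client_validity == self.validity:
            return set()
        if client_validity > self.validity:
            return None  # unknown token: be safe and invalidate everything
        oldest_logged = self._log[0][0] if self._log else self.validity + 1
        if client_validity < oldest_logged - 1:
            return None  # history has been trimmed past the client's token
        changed = set()
        for validity, uids in self._log:
            if validity > client_validity:
                changed |= uids
        return changed
```

With max_entries=0 the log stays empty and this degenerates into the simple change counter: any mismatch means "invalidate all".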

Note that you probably don't want to update the flags-validity token
when new messages arrive. There are other ways of dealing with new
messages anyway. You only change flags-validity when flags of an
_existing_ message have changed. And we need to make sure that when
there's only one client and it makes changes, it doesn't get told to
invalidate its cache even by the most naïve server. Maybe we pass a
flags-validity argument with the STORE command too or something? 
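
A sketch, in Python, of how passing the token with STORE could work. The Mailbox class and its return convention are assumptions made for illustration, not part of any draft: the server reports whether the client's token was still current before the change, so a lone client can adopt the new token without flushing its cache.

```python
class Mailbox:
    """Hypothetical server-side mailbox that only keeps one change counter."""

    def __init__(self):
        self.validity = 1
        self.flags = {}  # uid -> set of flag names

    def append(self, uid):
        """New message arrival: deliberately leaves the validity token alone."""
        self.flags[uid] = set()

    def store(self, uid, flags, client_validity):
        """Apply a flag change; return (new_validity, cache_still_valid)."""
        # If the token matched, nobody else changed flags in the meantime,
        # so the only difference between old and new token is this STORE.
        cache_still_valid = (client_validity == self.validity)
        self.flags[uid] = set(flags)
        self.validity += 1
        return self.validity, cache_still_valid
```

A client that gets cache_still_valid back as False would have to treat its cached flags as stale, exactly as if it had reselected the mailbox with an old token.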

-- 
dwmw2



Re: Message flag caching and polling.

2003-03-11 Thread twk
David Woodhouse wrote:

When configured to 'check for mail in all folders' Evolution is issuing
a LIST command, and then for each mailbox listed it's issuing a STATUS
command. 
I assume that you are using "check for mail in all folders" because the 
server is directly delivering new mail to folders other than the inbox. 
Otherwise this option (which I believe Evolution turns on as a 
default...ack) is unnecessary.

Seems a severe solution to change the server to accommodate an email 
client with unwanted behavior. My experience with Evolution is that it 
appears to generate unusually high traffic to the server (I have no hard 
data to back this up, so don't ask :-).

Tom

--
Tom Karches                email : [EMAIL PROTECTED]
Web Systems Administrator  phone : 919.515.5508
NCSU Information Technology


Re: Message flag caching and polling.

2003-03-11 Thread David Woodhouse
On Tue, 2003-03-11 at 17:52, twk wrote:
 I assume that you are using "check for mail in all folders" because the 
 server is directly delivering new mail to folders other than the inbox. 
 Otherwise this option (which I believe Evolution turns on as a 
 default...ack) is unnecessary.

Yes, that's what I'm doing. Evolution seems to have been born with the
fundamentally screwed up idea that it's sensible to do mail filtering in
the MUA, but thankfully it doesn't attempt to enforce that brain damage
on me :)

 Seems a severe solution to change the server to accommodate an email 
 client with unwanted behavior. My experience with Evolution is that it 
 appears to generate unusually high traffic to the server (I have no hard 
 data to back this up, so don't ask :-).

Evo _does_ generate unusually high traffic to the server. You select a
mailbox and it'll refuse to display _anything_ until it's got _all_
headers for _every_ mail in that mailbox. You select a mail with a small
text/plain body and a _huge_ application/octet-stream attachment, and
it'll download the whole of the attachment just to run 'file' on it and
decide what options to put in the little drop-down menu associated with
it in the display. Etc.

Hence my new pet project -- making it behave sanely. :)

The slight snag being that upon perusing the IMAP RFCs to determine what
the optimal behaviour is, I found that _completely_ sane behaviour, as I
see it, is not entirely possible. You can't do client-side caching of
message flags at all, except for a folder while you've _actually_ got it
SELECTed, and you can't avoid actually polling _all_ folders which might
potentially have new mail -- you can't even use the \Marked and
\Unmarked flags in the LIST response to optimise away some of your
STATUS commands, because \Unmarked folders can have new mail you've
never noticed.
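
For reference, the per-folder polling being described looks roughly like the Python sketch below: parse each untagged STATUS response and compare it with the values cached from the previous poll. The function names are hypothetical, the parser only handles quoted or atom mailbox names (not literals), and note that nothing in MESSAGES, UIDNEXT or UIDVALIDITY reveals a flag change on an existing message.

```python
import re

# Matches e.g.: * STATUS "lists/imap" (MESSAGES 12 UIDNEXT 40 UIDVALIDITY 7)
STATUS_RE = re.compile(r'^\* STATUS (?:"([^"]*)"|(\S+)) \(([^)]*)\)$')

def parse_status(line):
    """Return (mailbox, {item: value}) from an untagged STATUS response."""
    m = STATUS_RE.match(line)
    if not m:
        raise ValueError("not a STATUS response: %r" % line)
    name = m.group(1) if m.group(1) is not None else m.group(2)
    items = m.group(3).split()
    values = {items[i]: int(items[i + 1]) for i in range(0, len(items), 2)}
    return name, values

def needs_resync(cached, fresh):
    """True if the folder gained or lost messages since the cached poll."""
    if cached is None:
        return True
    keys = ("UIDVALIDITY", "UIDNEXT", "MESSAGES")
    return any(cached.get(k) != fresh.get(k) for k in keys)
```

Every folder still costs one STATUS round trip per poll; the comparison only saves the follow-up SELECT and FETCH.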

So I'm wondering if these problems have been fixed in subsequent IMAP
extensions I'm not yet aware of, or if in fact we'd need to 'fix' the
IMAP protocol first before I can properly fix up Evolution (or indeed
whatever MUA I end up with) to _really_ make me happy :)

Only now I seem to have upset Mark by mentioning that in an attempt to
alleviate the problem with excessive STATUS commands I switched from
wu-imapd to Courier, which probably wasn't a good place to start :)

-- 
dwmw2



Re: Message flag caching and polling.

2003-03-11 Thread Mark Crispin
On Tue, 11 Mar 2003, David Woodhouse wrote:
 Evo _does_ generate unusually high traffic to the server. You select a
 mailbox and it'll refuse to display _anything_ until it's got _all_
 headers for _every_ mail in that mailbox. You select a mail with a small
 text/plain body and a _huge_ application/octet-stream attachment, and
 it'll download the whole of the attachment just to run 'file' on it and
 decide what options to put in the little drop-down menu associated with
 it in the display. Etc.

Unfortunately, I've seen this type of behavior quite commonly with
applications which take the point of view "IMAP is too limiting, therefore
we'll just download the whole thing and do it ourselves."

 You can't do client-side caching of
 message flags at all, except for a folder while you've _actually_ got it
 SELECTed, and you can't avoid actually polling _all_ folders which might
 potentially have new mail -- you can't even use the \Marked and
 \Unmarked flags in the LIST response to optimise away some of your
 STATUS commands, because \Unmarked folders can have new mail you've
 never noticed.

This is a consequence of trying to maintain a complete synchronized client
state of all messages in all mailboxes, so that, somewhere, somewhen, you
won't have to go to the server to get it.  I use online clients instead.

Yes, I have several dozen mailboxes.  But out of all those, only two are
active -- that is, they can be expected to receive new messages.  I also
read about a dozen newsgroups, but STATUS is fast with NNTP (or NNTP via
IMAP) so it's alright to poll.

I use several clients, but I don't keep state on any of them if the client
is not running.  Thus, I don't worry about synchronizing with previous
state.

IMHO -- and YMMV -- far more work is done in unnecessary synchronization
than is ever saved over what an online client does.  I've tried many
disconnected clients (which do synchronization) and each time, have gone
back to Pine.

 So I'm wondering if these problems have been fixed in subsequent IMAP
 extensions I'm not yet aware of, or if in fact we'd need to 'fix' the
 IMAP protocol first before I can properly fix up Evolution (or indeed
 whatever MUA I end up with) to _really_ make me happy :)

Any client which does not work with the IMAP base specification, and
instead depends upon an extension, is ultimately doomed to failure.

 Only now I seem to have upset Mark by mentioning that in an attempt to
 alleviate the problem with excessive STATUS commands I switched from
 wu-imapd to Courier, which probably wasn't a good place to start :)

You didn't upset me.  However, I am pointing out that you can easily get a
distorted view of how IMAP is supposed to work if you use Courier as an
example.

-- Mark --

http://staff.washington.edu/mrc
Science does not emerge from voting, party politics, or public debate.


Re: Message flag caching and polling.

2003-03-11 Thread David Woodhouse
On Tue, 2003-03-11 at 20:28, Mark Crispin wrote:
 On Tue, 11 Mar 2003, David Woodhouse wrote:
  Evo _does_ generate unusually high traffic to the server. You select a
  mailbox and it'll refuse to display _anything_ until it's got _all_
  headers for _every_ mail in that mailbox. You select a mail with a small
  text/plain body and a _huge_ application/octet-stream attachment, and
  it'll download the whole of the attachment just to run 'file' on it and
  decide what options to put in the little drop-down menu associated with
  it in the display. Etc.
 
 Unfortunately, I've seen this type of behavior quite commonly with
 applications which take the point of view "IMAP is too limiting, therefore
 we'll just download the whole thing and do it ourselves."

Well, it's broken and it needs fixed. But if I'm going to sit down with
a pretty GUI mail client and completely redesign its IMAP back end, I
want to take some time beforehand and get it _all_ right rather than
just fixing the stuff that's annoying me right now. Hence this
discussion -- I'm trying to take the view "IMAP is too limiting; how can
we fix it?"

 This is a consequence of trying to maintain a complete synchronized client
 state of all messages in all mailboxes, 

Not entirely. I don't necessarily want a _complete_ synchronised state
-- but neither do I want to have to discard the state I _do_ need, and
download it again shortly thereafter, just because the server gives me
no way to be sure that nothing's changed in the meantime.

If I don't talk explicitly about reducing the amount of state required
by the client, that's not because I don't agree it needs doing, but
rather because that is a quality of implementation issue and just wants
_fixing_ without discussion. But however little state we can get away
with actually requiring, we still want to cache it wherever possible to
avoid unnecessary traffic.

 so that, somewhere, somewhen, you
 won't have to go to the server to get it.  I use online clients instead.

Well, maybe. More fundamentally though, I'd say that this is a
consequence of living on the wrong end of a 64K ISDN link and hoping
that somewhere, somewhen, I can eliminate _any_ redundant traffic to the
IMAP server. 

And redownloading message flags which I already had and which haven't
changed is definitely redundant -- especially if it's doing so just
because I temporarily selected my inbox to read a new mail therein,
before returning to the mailing list I was reading a moment ago.

Maybe I'm trying to solve a problem which isn't actually relevant to
most people in the 'real' world where bandwidth is free -- I suppose I
might be half-inclined to buy that argument. But there are a lot of
parts of the world where bandwidth still isn't entirely free, and/or
where mail servers are expected to be accessed over a WAN -- so maybe I
won't buy it.

 IMHO -- and YMMV -- far more work is done in unnecessary synchronization
 than is ever saved over what an online client does.  I've tried many
 disconnected clients (which do synchronization) and each time, have gone
 back to Pine.

I tend to keep coming back to Pine too, and have tended to use it as an
example of 'ideal' IMAP behaviour. But it's not quite perfect either --
it _could_ cache headers locally but doesn't, and this means that
sometimes it takes _seconds_ just to draw the screen after I hit PgUp in
a message index, saturating my link while it's at it.

My mileage _does_ vary -- I don't think caching is that overrated; I
think you've been spoiled. :)

I'm not talking about full disconnected operation; merely trying to
reduce the total amount of traffic between client and server by
intelligent management of changes. It's a similar problem to the one
that UIDs solve. 

 Any client which does not work with the IMAP base specification, and
 instead depends upon an extension, is ultimately doomed to failure.

That's indisputably true. If I try to rewrite a mail client's back end
so that it _relies_ on being able to cache stuff using a new IMAP
extension, you have my permission to come over here and shoot me.
Repeatedly if needs be; till I stop it.

But that doesn't mean we shouldn't think about an extension which can
allow optimisation of client-server traffic in the happy situation
where both server and client _do_ support it.

  Only now I seem to have upset Mark by mentioning that in an attempt to
  alleviate the problem with excessive STATUS commands I switched from
  wu-imapd to Courier, which probably wasn't a good place to start :)
 
 You didn't upset me.  However, I am pointing out that you can easily get a
 distorted view of how IMAP is supposed to work if you use Courier as an
 example.

A fair point, and I'll bear it in mind -- thanks. To be honest, I'm
trying to avoid paying too much attention to the server side right now.
There's plenty for me to do just trying to get optimal behaviour from
the _client_, and I'm _assuming_ the server gets it 

Re: Message flag caching and polling.

2003-03-11 Thread Mark Crispin
On Tue, 11 Mar 2003, David Woodhouse wrote:
 I'm trying to take the view "IMAP is too limiting; how can
 we fix it?"

That's not the right view.  You should instead have "how can I build a
good client within the context of IMAP, without expecting the existing
IMAP world to add facilities to enable me?"

 Not entirely. I don't necessarily want a _complete_ synchronised state
 -- but neither do I want to have to discard the state I _do_ need, and
 download it again shortly thereafter, just because the server gives me
 no way to be sure that nothing's changed in the meantime.

I keep my IMAP clients (at least, one at home and one in the office)
running for days at a time.  I have two incoming mailboxes and some
newsgroups.  All other mailboxes are ones that get updated by my action.

 Well, maybe. More fundamentally though, I'd say that this is a
 consequence of living on the wrong end of a 64K ISDN link and hoping
 that somewhere, somewhen, I can eliminate _any_ redundant traffic to the
 IMAP server.

I once regularly used IMAP over a 2400 bps line, and to this day I still
use IMAP over a CDPD device.  Please do not treat me as someone who
doesn't understand the issue of slow lines.

My claim is not that you should re-download data that you already had.
Rather, you should not re-download anything unless you need it; and
following a strict "don't need it, don't download it" rule will buy you
more than efforts to keep the caches of a dozen clients in synchronization
with each other.

 And redownloading message flags which I already had and which haven't
 changed is definitely redundant -- especially if it's doing so just
 because I temporarily selected my inbox to read a new mail therein,
 before returning to the mailing list I was reading a moment ago.

So why didn't you spawn a separate session for the other mailbox?  TCP
sessions are cheap.

 I tend to keep coming back to Pine too, and have tended to use it as an
 example of 'ideal' IMAP behaviour. But it's not quite perfect either --
 it _could_ cache headers locally but doesn't, and this means that
 sometimes it takes _seconds_ just to draw the screen after I hit PgUp in
 a message index, saturating my link while it's at it.

I have never had it take more than a second or two, even with CDPD.  It
all depends upon the size of the envelopes.

Have you noticed that Pine *does* completely cache within a session?  As
long as you don't give up the session, Pine will never re-fetch the same
data.

-- Mark --

http://staff.washington.edu/mrc
Science does not emerge from voting, party politics, or public debate.


Re: Message flag caching and polling.

2003-03-11 Thread Eric A. Hall

on 3/11/2003 4:56 PM Mark Crispin wrote:

 Have you noticed that Pine *does* completely cache within a session?  As
 long as you don't give up the session, Pine will never re-fetch the same
 data.

I'm not trying to start a religious war here, but how much work would it
really be to have a protocol extension which allowed the client to request
flags which have changed since <time>? It seems that all of the difficulty
would be in the implementation (the server data-store), not in the
protocol, and there would be significant benefits to having this option
available in the protocol. Faster resynchronization between sessions would
be very good for all clients, online and offline alike.

In those cases where it was impractical to store this kind of information,
the server wouldn't implement it, which is reasonable behavior for any
optional extension.

Ignoring the data-store issues which will dictate whether a specific
server is able to implement the feature, how feasible would this be?
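
From the client's side, such a request could be consumed roughly as in the Python sketch below. Everything here is hypothetical: changed_since stands in for whatever command the extension would define, and the integer tokens stand in for <time>. The point is only that the client merges a small delta into its cache instead of refetching all flags.

```python
def resync_flags(cache, last_token, changed_since):
    """Merge flags changed since last_token into cache; return the new token.

    changed_since is a hypothetical callable: token -> (new_token, {uid: flags}).
    """
    new_token, changed = changed_since(last_token)
    for uid, flags in changed.items():
        cache[uid] = frozenset(flags)
    return new_token

# Stand-in for a server that remembers when each message's flags last changed.
SERVER_LOG = {
    101: (5, {"\\Seen"}),
    102: (9, {"\\Seen", "\\Answered"}),
    103: (2, set()),
}

def fake_changed_since(token):
    changed = {uid: flags for uid, (stamp, flags) in SERVER_LOG.items()
               if stamp > token}
    newest = max(stamp for stamp, _ in SERVER_LOG.values())
    return newest, changed

cache = {101: frozenset({"\\Seen"}), 102: frozenset({"\\Seen"}), 103: frozenset()}
token = resync_flags(cache, 5, fake_changed_since)
```

Here only message 102 is transferred, since it is the only one whose flags changed after token 5.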

-- 
Eric A. Hall             http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/