Re: [jmap-discuss] Perform actions on all messages of a Mailbox

2016-10-20 Thread Bron Gondwana
Sorry about the delay in replying to this.

On Tue, 18 Oct 2016, at 21:16, Matthieu Baechler wrote:
> Hi Neil,
>
> On Tuesday, 18 October 2016 at 00:18:24 UTC+2, Neil Jenkins wrote:
>> I don't think this should be added to the JMAP spec. One of the
>> concerns raised by a large mailbox provider we talked to was to make
>> sure a client could be rate limited in a reasonable manner, so it
>> can't overload the server. (We've also been careful the other way to
>> try to ensure the client can control exactly how much data it
>> requests from the server in one go.) Adding a command like this means
>> the client could ask the server to do something potentially very
>> expensive, depending on backend implementation.
>
> Do people expect to use JMAP as the only protocol to access
> mailboxes? In IMAP, such very expensive methods are very common
> (expunge, modification with a very broad uid range, etc).

Yes, they are.  They're also not undoable, and risky.  We have a self
service "restore from Backup" tool at FastMail because "I
accidentally deleted a ton of messages I didn't mean to" was a very
common support request.

(that and "I store all my important email in the Trash folder because
I'm insane, and I just hooked up an iPhone which wiped it", *sigh*)

>>
>>
>> Now the server could reject it if it's over a certain number of
>> messages resulting in the query, but the exact limit will be server
>> dependent and when it happens the client has to fall back to a
>> different approach. Having two different implementations in the
>> client is likely to be less tested and buggier.
>
>>
>>
>> In general, JMAP prefers the philosophy of explicitly telling the
>> server what changes to make; this is often more efficient anyway if
>> you're keeping the client cache in sync.
>
> You can do that with IfInState easily: you already know which messages
> will be changed client-side because you actually wrote the query.
>
>>
>>
>> The approach we take to this problem (and I would recommend) is to
>> fetch the list of ids up front (in pages if necessary), then ask the
>> server to make the changes to them in batches (say 100 to 500 at a
>> time), waiting for the previous request to finish before making the
>> next one.
>
> It doesn't look like a great API to me. Managing deletion with
> client-side batches for performance purposes doesn't sound good.
> I think a good implementation will consume more resources to handle
> such large queries than to do it server-side based on a query.

I thought that too at first, and I initially made the same arguments.
Especially because we already had the ability to delete mailboxes and
hence operate on the messages inside them.

>> The ids (should be) reasonably small and quick to fetch even for
>> large folders. Fetching them up front ensures you don't process
>> anything that arrives during the operation.
>
> IfInState already covers this case, don't you think?

It does, but the next getMessageUpdates will have to get a response for
all those messages anyway, because you don't know if the client had
cached anything for the IDs.

>
>> By doing it a batch at a time, you can make sure you won't overload
>> the server (and make sure the server will accept the request), and
>> also more easily show a progress bar to the user (because the user is
>> probably locked on the server while the changes are being made), or
>> even interleave other requests to keep the client responsive while a
>> large operation is happening in the background.
>
> It would be easily solved with an "async" capability on requests.
> We already have Event Source for receiving async results. What do
> you think?

Thanks for raising this topic, because we did discuss it in a lot of
detail, and we actually decided to go entirely the opposite direction!
Instead of deleting a mailbox causing the messages to be moved to the
"inbox" role if there were messages in the mailbox, we changed it so
that you can't delete a mailbox which contains messages.  You need to
explicitly delete them or move them out first.
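
To make the rule concrete, here is a minimal sketch of a destroy check
that refuses a non-empty mailbox. The data shapes (a Set of mailbox ids,
messages with a single mailboxId) and the error names are assumptions
made for illustration, not text from the spec.

```javascript
// Hypothetical in-memory model: `mailboxes` is a Set of mailbox ids and
// `messages` is an array of { id, mailboxId } objects.
// Returns null on success, or an error object (names are illustrative).
function destroyMailbox(mailboxes, messages, mailboxId) {
  if (!mailboxes.has(mailboxId)) {
    return { type: "notFound" };
  }
  // Refuse to destroy while any message still lives in the mailbox;
  // the client must delete or move those messages explicitly first.
  const hasMessages = messages.some((m) => m.mailboxId === mailboxId);
  if (hasMessages) {
    return { type: "mailboxHasMessages" };
  }
  mailboxes.delete(mailboxId);
  return null;
}
```

Nothing is moved or deleted implicitly here: the caller gets an error
back and has to decide what happens to the messages.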

There are some strong guiding principles in JMAP, and one of them is
that messages are precious and actions should be explicit.

Our own Cyrus IMAPd server has algorithms built on the assumption that
the biggest single mailbox will contain one million emails.  That's
pretty big.  We have around 20 users with more than that many emails
total across all their mailboxes (I know because we have a 32 bit file
size issue with an internal cache file when you get to about 3 million
messages in a single mailbox).

So looking at the most extreme case of deleting a million emails at the
same time, you're looking at one megabyte for every byte sent per
message.  A reasonable ID size is 64 bits, which is 16 hexadecimal
characters.  Add in commas and quotes, you're looking at roughly 20
bytes per id.  Multiply that by a million, that's 20 megabytes of IDs
to download and process.
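
As a rough check of that arithmetic, here is a small sketch. It assumes
each id is sent as a quoted hex string followed by a comma; the exact
count comes to 19 bytes per id, which the estimate above rounds to 20.

```javascript
// Rough size of a JSON list of message ids. All figures are the
// assumptions stated in the text, not measurements.
const ID_BITS = 64;
const HEX_CHARS_PER_ID = ID_BITS / 4;           // 4 bits per hex digit: 16
const BYTES_PER_ID = HEX_CHARS_PER_ID + 2 + 1;  // two quotes and one comma: 19

function idListBytes(messageCount) {
  return messageCount * BYTES_PER_ID;
}

const totalBytes = idListBytes(1000000);
console.log(`${(totalBytes / 1e6).toFixed(0)} MB`); // prints "19 MB"
```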

Yes, it's a lot of data.  But 1 million emails is nearly 2 years' wort

Re: [jmap-discuss] Perform actions on all messages of a Mailbox

2016-10-18 Thread Matthieu Baechler
Hi Neil,

On Tuesday, 18 October 2016 at 00:18:24 UTC+2, Neil Jenkins wrote:
>
> I don't think this should be added to the JMAP spec. One of the concerns 
> raised by a large mailbox provider we talked to was to make sure a client 
> could be rate limited in a reasonable manner, so it can't overload the 
> server. (We've also been careful the other way to try to ensure the client 
> can control exactly how much data it requests from the server in one go.) 
> Adding a command like this means the client could ask the server to do 
> something potentially very expensive, depending on backend implementation.
>

Do people expect to use JMAP as the only protocol to access mailboxes?
In IMAP, such very expensive methods are very common (expunge,
modification with a very broad uid range, etc).
 

>
> Now the server could reject it if it's over a certain number of messages 
> resulting in the query, but the exact limit will be server dependent and 
> when it happens the client has to fall back to a different approach. Having 
> two different implementations in the client is likely to be less tested and 
> buggier.
>
 

>
> In general, JMAP prefers the philosophy of explicitly telling the server 
> what changes to make; this is often more efficient anyway if you're keeping 
> the client cache in sync.
>

You can do that with IfInState easily: you already know which messages will
be changed client-side because you actually wrote the query.
 

>
> The approach we take to this problem (and I would recommend) is to fetch 
> the list of ids up front (in pages if necessary), then ask the server to 
> make the changes to them in batches (say 100 to 500 at a time), waiting for 
> the previous request to finish before making the next one.
>

It doesn't look like a great API to me. Managing deletion with client-side
batches for performance purposes doesn't sound good.
I think a good implementation will consume more resources to handle such
large queries than to do it server-side based on a query.
 

> The ids (should be) reasonably small and quick to fetch even for large 
> folders. Fetching them up front ensures you don't process anything that 
> arrives during the operation. 
>

IfInState already covers this case, don't you think?
 

> By doing it a batch at a time, you can make sure you won't overload the 
> server (and make sure the server will accept the request), and also more 
> easily show a progress bar to the user (because the user is probably locked 
> on the server while the changes are being made), or even interleave other 
> requests to keep the client responsive while a large operation is happening 
> in the background.
>

It would be easily solved with an "async" capability on requests. We
already have Event Source for receiving async results. What do you think?

>
[...] 

Regards,

-- 
Matthieu Baechler

-- 
You received this message because you are subscribed to the Google Groups 
"JMAP" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jmap-discuss+unsubscr...@googlegroups.com.
To post to this group, send email to jmap-discuss@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jmap-discuss/91dc3a56-0cf9-43b8-9a3c-f165d0027e39%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [jmap-discuss] Perform actions on all messages of a Mailbox

2016-10-17 Thread Neil Jenkins
I don't think this should be added to the JMAP spec. One of the concerns
raised by a large mailbox provider we talked to was to make sure a
client could be rate limited in a reasonable manner, so it can't
overload the server. (We've also been careful the other way to try to
ensure the client can control exactly how much data it requests from the
server in one go.) Adding a command like this means the client could ask
the server to do something potentially very expensive, depending on
backend implementation.

Now the server could reject it if it's over a certain number of messages
resulting in the query, but the exact limit will be server dependent and
when it happens the client has to fall back to a different approach.
Having two different implementations in the client is likely to be less
tested and buggier.

In general, JMAP prefers the philosophy of explicitly telling the server
what changes to make; this is often more efficient anyway if you're
keeping the client cache in sync.

The approach we take to this problem (and I would recommend) is to fetch
the list of ids up front (in pages if necessary), then ask the server to
make the changes to them in batches (say 100 to 500 at a time), waiting
for the previous request to finish before making the next one. The ids
(should be) reasonably small and quick to fetch even for large folders.
Fetching them up front ensures you don't process anything that arrives
during the operation. By doing it a batch at a time, you can make sure
you won't overload the server (and make sure the server will accept the
request), and also more easily show a progress bar to the user (because
the user is probably locked on the server while the changes are being
made), or even interleave other requests to keep the client responsive
while a large operation is happening in the background.
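
The batching approach described above could be sketched roughly as
follows. This is only an illustration: `sendBatch` is a hypothetical
caller-supplied function that performs one setMessages request for the
given ids and returns a promise, and `onProgress` is an optional
callback for something like a progress bar.

```javascript
// Split a list of ids into consecutive batches of at most `size` ids.
function chunk(ids, size) {
  const batches = [];
  for (let i = 0; i < ids.length; i += size) {
    batches.push(ids.slice(i, i + size));
  }
  return batches;
}

// Apply one batch at a time, waiting for each request to finish before
// sending the next. `sendBatch` is hypothetical (see the note above).
async function processInBatches(ids, size, sendBatch, onProgress = () => {}) {
  let done = 0;
  for (const batch of chunk(ids, size)) {
    await sendBatch(batch);
    done += batch.length;
    onProgress(done, ids.length);
  }
  return done;
}
```

Because the id list is fixed before the first batch is sent, messages
that arrive while the loop is running are never touched.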

I think adding "setMessages with filter" to the JMAP spec (and therefore
forcing all servers to implement it) would be a mistake. There's a
better approach to achieve the same goal, or you can add a custom
extension for your own client/server if you really want.

Neil.



Re: [jmap-discuss] Perform actions on all messages of a Mailbox

2016-10-17 Thread Bron Gondwana
This is a very sharp tool!  Particularly since a common programming
error would be to mess up the filter and send something which matched
every message on the server.  This is the equivalent of "DELETE FROM
Messages;" via SQL.  I'd want a few things for safety:

1) a return code which says "filterTooBroad" or something if you send an
   empty filter.

2) a strong suggestion in the implementation docs that you use ifInState
   to avoid race conditions with new things being delivered that you
   didn't mean to include.
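
A minimal sketch of what those two checks might look like on the server
side. The request shape and the "stateMismatch" name are assumptions
made for this example; "filterTooBroad" is the name suggested in point 1.

```javascript
// Returns an error object for an unsafe request, or null if it may
// proceed. `currentState` is the server's current state string.
function checkFilteredSet(request, currentState) {
  const filter = request.filter;
  // 1) A missing or empty filter would match every message: reject it.
  if (!filter || Object.keys(filter).length === 0) {
    return { type: "filterTooBroad" };
  }
  // 2) If the client supplied ifInState, refuse when the state has moved
  //    on, so newly delivered messages are not swept up by accident.
  if (request.ifInState !== undefined && request.ifInState !== currentState) {
    return { type: "stateMismatch" };
  }
  return null;
}
```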

...

Having said that, I'm generally in favour of protocols allowing you to
express your intent, and "delete everything in my Trash folder" is a
common user intent.  As is "move all the messages matching this
expression into a folder", so:


setMessages({
  filter: { from: 'br...@fastmail.fm' },
  update: {
    folderId: "xyz",
    isUnread: false,
  },
})

Yep, I could see that being useful.  It would blow your cache pretty
badly with the large getMessageUpdates in response, but a client that
didn't care about updates could do bulk operations without having to
know all the IDs.

From a server implementation standpoint, this is quite easy to write as
well, because you already have something for converting a filter into a
list of IDs, and something for applying changes to messages based on ID.
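
Roughly, the composition could look like the sketch below. It uses an
in-memory object keyed by message id; the helper names, the message
shape, and the exact-match filter semantics are all assumptions made
for illustration, not how any particular server does it.

```javascript
// Stand-in for "convert a filter into a list of IDs": keep the ids of
// messages whose properties equal every property named in the filter.
function idsMatching(messages, filter) {
  return Object.keys(messages).filter((id) =>
    Object.entries(filter).every(([key, value]) => messages[id][key] === value)
  );
}

// Stand-in for "apply changes to messages based on ID".
function updateByIds(messages, ids, changes) {
  for (const id of ids) {
    Object.assign(messages[id], changes);
  }
  return ids;
}

// The filtered variant is then just the two pieces chained together.
function setMessagesWithFilter(messages, filter, changes) {
  return updateByIds(messages, idsMatching(messages, filter), changes);
}
```

Note that in this sketch an empty filter matches every message, which
is exactly the hazard the safety checks above are meant to catch.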

Bron.

On Tue, 18 Oct 2016, at 00:52, D David wrote:
> Hi all,
>
> I was investigating a way for a JMAP client to empty a folder (the
> usual "Empty Trash" scenario) and it turned out there's nothing in the
> spec allowing me to do this.
> The only possible solution would be to fetch Message IDs with a call
> to getMessageList and then destroy all these IDs with a call to
> setMessages. Is my understanding right or am I missing something?
>
> If I'm right, Linagora would like to propose a simple but effective
> way to implement such "mass" actions on all messages in a given
> Mailbox: the setFilteredMessages API. This API will be very similar to
> setMessages but will apply update and/or destroy operations on
> messages matching a filter (the filter would behave exactly as the
> filter option of getMessageList: it will select messages matching the
> filter expression). Two examples below:
>  * Empty a folder
>
> setMessages({
>   filter: { inMailboxes: [''] },
>   destroy: true
> })
>
>
>  * Mark all messages from me as read
>
> setMessages({
>   filter: { from: 'ddo.linag...@gmail.com' },
>   update: { isUnread: false }
> })
>
>
> Some restrictions would apply, mainly:
>  * destroy MUST be a Boolean value. If true, matched messages are
>    permanently destroyed.
>  * update MUST be an object containing updated properties. The server
>    MUST apply the modifications to the matched messages.
>
> What do you all think? Once we agree on the proposal, I can prepare a
> PR to the spec quickly.
>
> Thanks,
> Regards,
>
> David
>

--
  Bron Gondwana
  br...@fastmail.fm
