Re: dovecot sieve duplicates detection

2019-12-04 Thread James Cassell via dovecot


On Wed, Dec 4, 2019, at 1:14 PM, Stephan Bosch via dovecot wrote:
> 
> 
> On 17/08/2018 09:14, Stephan Bosch wrote:
> >
> >
> > On 14/05/2018 at 23:03, James Cassell wrote:
> >>
> >> On Mon, May 14, 2018, at 4:52 PM, Stephan Bosch wrote:
> >>>
> >>> On 25/04/2018 at 22:49, James Cassell wrote:
> >>>> On Wed, Apr 25, 2018, at 3:20 PM, Stephan Bosch wrote:
> >>>>> Specify the ID used for duplicate checking explicitly using the
> >>>>> :uniqueid argument (https://tools.ietf.org/html/rfc7352#section-3.1).
> >>>>> Using the variables extension, compose the uniqueid from the
> >>>>> message-id and the mailbox name.
> >>>>>
> >>>> In my experience with dovecot's implementation, you can set the ID
> >>>> only once in a script.  If you try to filter duplicates based on
> >>>> multiple IDs, only the first (or last, I don't remember) takes effect.
> >>>>
> >>> Do you have a detailed example of the supposed wrong behavior?
> >>>
> >> I don't have them readily available. Basically, the result of the 
> >> first duplicate test in a script is taken as the result of any future 
> >> duplicate test, even if the parameters to that future duplicate test 
> >> in the same script are different and would otherwise result in a 
> >> different output. The duplicate test is only evaluated once and its 
> >> results are substituted everywhere.
> >>
> >> For example, I might want to flag a message as a new conversation if 
> >> I have not seen another message with the same subject. In the same 
> >> script, I might want to discard messages that are exactly identical 
> >> including message ID among others. The dovecot behavior would be to 
> >> discard all messages that match a subject of previously received 
> >> message.
> >
> > I finally managed to review this issue and I can confirm that this is 
> > a bug.
> 
> Fix released in 2.3.9.
> 

Awesome! Thanks for the followup!

V/r,
James Cassell


Re: dovecot sieve duplicates detection

2019-12-04 Thread Stephan Bosch via dovecot




On 17/08/2018 09:14, Stephan Bosch wrote:



On 14/05/2018 at 23:03, James Cassell wrote:


On Mon, May 14, 2018, at 4:52 PM, Stephan Bosch wrote:


On 25/04/2018 at 22:49, James Cassell wrote:

On Wed, Apr 25, 2018, at 3:20 PM, Stephan Bosch wrote:

Specify the ID used for duplicate checking explicitly using the
:uniqueid argument (https://tools.ietf.org/html/rfc7352#section-3.1).
Using the variables extension, compose the uniqueid from the message-id
and the mailbox name.

In my experience with dovecot's implementation, you can set the ID 
only once in a script.  If you try to filter duplicates based on 
multiple IDs, only the first (or last, I don't remember) takes effect.



Do you have a detailed example of the supposed wrong behavior?

I don't have them readily available. Basically, the result of the 
first duplicate test in a script is taken as the result of any future 
duplicate test, even if the parameters to that future duplicate test 
in the same script are different and would otherwise result in a 
different output. The duplicate test is only evaluated once and its 
results are substituted everywhere.


For example, I might want to flag a message as a new conversation if 
I have not seen another message with the same subject. In the same 
script, I might want to discard messages that are exactly identical 
including message ID among others. The dovecot behavior would be to 
discard all messages that match a subject of previously received 
message.


I finally managed to review this issue and I can confirm that this is 
a bug.


Fix released in 2.3.9.

Regards,

Stephan.



Re: dovecot sieve duplicates detection

2018-08-17 Thread Stephan Bosch




On 14/05/2018 at 23:03, James Cassell wrote:


On Mon, May 14, 2018, at 4:52 PM, Stephan Bosch wrote:


On 25/04/2018 at 22:49, James Cassell wrote:

On Wed, Apr 25, 2018, at 3:20 PM, Stephan Bosch wrote:

Specify the ID used for duplicate checking explicitly using the
:uniqueid argument (https://tools.ietf.org/html/rfc7352#section-3.1).
Using the variables extension, compose the uniqueid from the message-id
and the mailbox name.


In my experience with dovecot's implementation, you can set the ID only once in 
a script.  If you try to filter duplicates based on multiple IDs, only the 
first (or last, I don't remember) takes effect.


Do you have a detailed example of the supposed wrong behavior?


I don't have them readily available. Basically, the result of the first 
duplicate test in a script is taken as the result of any future duplicate test, 
even if the parameters to that future duplicate test in the same script are 
different and would otherwise result in a different output. The duplicate test 
is only evaluated once and its results are substituted everywhere.

For example, I might want to flag a message as a new conversation if I have not 
seen another message with the same subject. In the same script, I might want to 
discard messages that are exactly identical including message ID among others. 
The dovecot behavior would be to discard all messages that match a subject of 
previously received message.


I finally managed to review this issue and I can confirm that this is a bug.

Regards,

Stephan.



Re: dovecot sieve duplicates detection

2018-05-14 Thread James Cassell


On Mon, May 14, 2018, at 4:52 PM, Stephan Bosch wrote:
> 
> 
> On 25/04/2018 at 22:49, James Cassell wrote:
> > On Wed, Apr 25, 2018, at 3:20 PM, Stephan Bosch wrote:
> >>
> >> Specify the ID used for duplicate checking explicitly using the
> >> :uniqueid argument (https://tools.ietf.org/html/rfc7352#section-3.1).
> >> Using the variables extension, compose the uniqueid from the message-id
> >> and the mailbox name.
> >>
> > In my experience with dovecot's implementation, you can set the ID only 
> > once in a script.  If you try to filter duplicates based on multiple IDs, 
> > only the first (or last, I don't remember) takes effect.
> >
> 
> Do you have a detailed example of the supposed wrong behavior?
> 

I don't have them readily available. Basically, the result of the first 
duplicate test in a script is taken as the result of any future duplicate test, 
even if the parameters to that future duplicate test in the same script are 
different and would otherwise result in a different output. The duplicate test 
is only evaluated once and its results are substituted everywhere.

For example, I might want to flag a message as a new conversation if I have not 
seen another message with the same subject. In the same script, I might want to 
discard messages that are exactly identical including message ID among others. 
The dovecot behavior would be to discard all messages that match the subject 
of a previously received message.
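
A sketch of such a script, written against RFC 7352 rather than taken from a 
real setup. The handle names "msgid" and "subject" and the $NewConversation 
keyword are hypothetical; :handle is the RFC 7352 argument for keeping separate 
tracking lists per test.

```sieve
require ["duplicate", "variables", "imap4flags"];

# Test 1: discard exact re-deliveries, tracked by Message-ID.
# The header test fails when there is no Message-ID, so such
# messages are never treated as duplicates of each other.
if header :matches "message-id" "*" {
  if duplicate :handle "msgid" :uniqueid "${1}" {
    discard;
    stop;
  }
}

# Test 2: flag the first message seen with a given subject.
# This uses a different handle and a different unique ID, so its
# result should be independent of test 1.
if header :matches "subject" "*" {
  if not duplicate :handle "subject" :uniqueid "${1}" {
    addflag "$NewConversation";
  }
}
```

With the behavior described above, the second test would reuse the outcome of 
the first instead of being evaluated with its own parameters.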

> Regards,
> 
> Stephan.

V/r,
James Cassell


Re: dovecot sieve duplicates detection

2018-05-14 Thread Stephan Bosch



On 25/04/2018 at 22:49, James Cassell wrote:

On Wed, Apr 25, 2018, at 3:20 PM, Stephan Bosch wrote:


Specify the ID used for duplicate checking explicitly using the
:uniqueid argument (https://tools.ietf.org/html/rfc7352#section-3.1).
Using the variables extension, compose the uniqueid from the message-id
and the mailbox name.


In my experience with dovecot's implementation, you can set the ID only once in 
a script.  If you try to filter duplicates based on multiple IDs, only the 
first (or last, I don't remember) takes effect.



Do you have a detailed example of the supposed wrong behavior?

Regards,

Stephan.


Re: dovecot sieve duplicates detection

2018-04-25 Thread James Cassell

On Wed, Apr 25, 2018, at 3:20 PM, Stephan Bosch wrote:
> 
> 
> On 23/04/2018 at 22:03, André Rodier wrote:
> > On 23/04/18 14:18, Stephan Bosch wrote:
> >>
> >>
> >> On 11-4-2018 at 23:58, André Rodier wrote:
> >>> Hello,
> >>>
> >>> I have tested the sieve duplicate script with success so far, but I 
> >>> have
> >>> a question.
> >>
> >> Sieve duplicate script? You mean the Sieve duplicate extension (RFC 
> >> 7352)?
> >>
> >>> I would like to know if the "duplicate" sieve flag in Dovecot is global
> >>> to all folders, or specific to one folder only.
> >>
> >> It uses the lda-dupes file in the user's home directory. So, it is 
> >> not normally related to folders, although the identifier used for 
> >> duplicate matching could be composed of the mailbox name if you want.
> >>
> >>> For instance, if I copy an email from one folder to another, and I have
> >>> a discard action on duplicate email, is this action will be applied (in
> >>> this case, discard) or not.
> >>
> >> Are you talking about IMAPSieve now? I am not sure "duplicate" is 
> >> currently even allowed in that context.
> >>
> >>> If the duplicate is global to all folders, is there a way to restrict
> >>> the search in one folder only.
> >>
> >> You can set the :uniqueid parameter accordingly.
> >>
> >> Regards,
> >>
> >> Stephan.
> >
> > Thank you, Stephan.
> >
> > Yes, I meant the Sieve duplicate extension.
> >
> > I am using a program to import email (mbsync), which use the IMAP 
> > append function. Sometimes, the import fail and I have to restart the 
> > program. Unfortunately, the same emails are imported again.
> >
> > I found a fix by using a dovecot IMAP sieve script executed on the 
> > APPEND action 
> > (https://wiki.dovecot.org/Pigeonhole/Sieve/Plugins/IMAPSieve). I wrote 
> > a custom sieve script that "discard" the ones that are detected as 
> > "duplicate". It worked very well and the emails were not any more 
> > imported twice.
> >
> > However, there was a huge side effect: archiving an email with 
> > Thunderbird is not working any more, and even lost! I have been able 
> > to understand the error as this:
> >
> > 1. When archiving an email with Thunderbird, it is first copied 
> > (APPEND) into the archive folder, but the original folder is not 
> > expunged.
> > 2. The sieve script detect the email as duplicate, and discard it.
> > 3. When the original folder is expunged, the source email is lost...
> >
> > My conclusion was the duplicate detection function is global to all 
> > folders.
> >
> > If I could restrict the detection of duplicates in the current folder 
> > only, this would let me run the import program again without error.
> 
> Specify the ID used for duplicate checking explicitly using the 
> :uniqueid argument (https://tools.ietf.org/html/rfc7352#section-3.1). 
> Using the variables extension, compose the uniqueid from the message-id 
> and the mailbox name.
> 

In my experience with dovecot's implementation, you can set the ID only once in 
a script.  If you try to filter duplicates based on multiple IDs, only the 
first (or last, I don't remember) takes effect.

V/r,
James Cassell


Re: dovecot sieve duplicates detection

2018-04-25 Thread André Rodier

On 25/04/18 20:20, Stephan Bosch wrote:



On 23/04/2018 at 22:03, André Rodier wrote:

On 23/04/18 14:18, Stephan Bosch wrote:



On 11-4-2018 at 23:58, André Rodier wrote:

Hello,

I have tested the sieve duplicate script with success so far, but I 
have

a question.


Sieve duplicate script? You mean the Sieve duplicate extension (RFC 
7352)?



I would like to know if the "duplicate" sieve flag in Dovecot is global
to all folders, or specific to one folder only.


It uses the lda-dupes file in the user's home directory. So, it is 
not normally related to folders, although the identifier used for 
duplicate matching could be composed of the mailbox name if you want.



For instance, if I copy an email from one folder to another, and I have
a discard action on duplicate email, is this action will be applied (in
this case, discard) or not.


Are you talking about IMAPSieve now? I am not sure "duplicate" is 
currently even allowed in that context.



If the duplicate is global to all folders, is there a way to restrict
the search in one folder only.


You can set the :uniqueid parameter accordingly.

Regards,

Stephan.


Thank you, Stephan.

Yes, I meant the Sieve duplicate extension.

I am using a program to import email (mbsync), which use the IMAP 
append function. Sometimes, the import fail and I have to restart the 
program. Unfortunately, the same emails are imported again.


I found a fix by using a dovecot IMAP sieve script executed on the 
APPEND action 
(https://wiki.dovecot.org/Pigeonhole/Sieve/Plugins/IMAPSieve). I wrote 
a custom sieve script that "discard" the ones that are detected as 
"duplicate". It worked very well and the emails were not any more 
imported twice.


However, there was a huge side effect: archiving an email with 
Thunderbird is not working any more, and even lost! I have been able 
to understand the error as this:


1. When archiving an email with Thunderbird, it is first copied 
(APPEND) into the archive folder, but the original folder is not 
expunged.

2. The sieve script detect the email as duplicate, and discard it.
3. When the original folder is expunged, the source email is lost...

My conclusion was the duplicate detection function is global to all 
folders.


If I could restrict the detection of duplicates in the current folder 
only, this would let me run the import program again without error.


Specify the ID used for duplicate checking explicitly using the 
:uniqueid argument (https://tools.ietf.org/html/rfc7352#section-3.1). 
Using the variables extension, compose the uniqueid from the message-id 
and the mailbox name.


Regards,

Stephan.



Thank you, I will try this.

André


Re: dovecot sieve duplicates detection

2018-04-25 Thread Stephan Bosch



On 23/04/2018 at 22:03, André Rodier wrote:

On 23/04/18 14:18, Stephan Bosch wrote:



On 11-4-2018 at 23:58, André Rodier wrote:

Hello,

I have tested the sieve duplicate script with success so far, but I 
have

a question.


Sieve duplicate script? You mean the Sieve duplicate extension (RFC 
7352)?



I would like to know if the "duplicate" sieve flag in Dovecot is global
to all folders, or specific to one folder only.


It uses the lda-dupes file in the user's home directory. So, it is 
not normally related to folders, although the identifier used for 
duplicate matching could be composed of the mailbox name if you want.



For instance, if I copy an email from one folder to another, and I have
a discard action on duplicate email, is this action will be applied (in
this case, discard) or not.


Are you talking about IMAPSieve now? I am not sure "duplicate" is 
currently even allowed in that context.



If the duplicate is global to all folders, is there a way to restrict
the search in one folder only.


You can set the :uniqueid parameter accordingly.

Regards,

Stephan.


Thank you, Stephan.

Yes, I meant the Sieve duplicate extension.

I am using a program to import email (mbsync), which use the IMAP 
append function. Sometimes, the import fail and I have to restart the 
program. Unfortunately, the same emails are imported again.


I found a fix by using a dovecot IMAP sieve script executed on the 
APPEND action 
(https://wiki.dovecot.org/Pigeonhole/Sieve/Plugins/IMAPSieve). I wrote 
a custom sieve script that "discard" the ones that are detected as 
"duplicate". It worked very well and the emails were not any more 
imported twice.


However, there was a huge side effect: archiving an email with 
Thunderbird is not working any more, and even lost! I have been able 
to understand the error as this:


1. When archiving an email with Thunderbird, it is first copied 
(APPEND) into the archive folder, but the original folder is not 
expunged.

2. The sieve script detect the email as duplicate, and discard it.
3. When the original folder is expunged, the source email is lost...

My conclusion was the duplicate detection function is global to all 
folders.


If I could restrict the detection of duplicates in the current folder 
only, this would let me run the import program again without error.


Specify the ID used for duplicate checking explicitly using the 
:uniqueid argument (https://tools.ietf.org/html/rfc7352#section-3.1). 
Using the variables extension, compose the uniqueid from the message-id 
and the mailbox name.
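
A minimal sketch of this suggestion, assuming the script runs under IMAPSieve, 
where the imap.mailbox environment item (RFC 6785) holds the target mailbox 
name, and assuming the duplicate test is usable in that context, which is not 
confirmed here. The ":" separator in the ID is an arbitrary choice.

```sieve
require ["duplicate", "variables", "environment", "imapsieve"];

# Capture the name of the mailbox the message is being stored into.
if environment :matches "imap.mailbox" "*" {
  set "mailbox" "${1}";
}

# Only run the duplicate check when a Message-ID header is present;
# inside this block ${1} holds the Message-ID value.
if header :matches "message-id" "*" {
  # The unique ID combines mailbox and Message-ID, so the same message
  # appended to a different folder is not treated as a duplicate.
  if duplicate :uniqueid "${mailbox}:${1}" {
    discard;
  }
}
```

With an ID built this way, re-importing a message into the same folder is 
detected as a duplicate, while copying it to an archive folder is not.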


Regards,

Stephan.



Re: dovecot sieve duplicates detection

2018-04-23 Thread André Rodier

On 23/04/18 14:18, Stephan Bosch wrote:



On 11-4-2018 at 23:58, André Rodier wrote:

Hello,

I have tested the sieve duplicate script with success so far, but I have
a question.


Sieve duplicate script? You mean the Sieve duplicate extension (RFC 7352)?


I would like to know if the "duplicate" sieve flag in Dovecot is global
to all folders, or specific to one folder only.


It uses the lda-dupes file in the user's home directory. So, it is not 
normally related to folders, although the identifier used for duplicate 
matching could be composed of the mailbox name if you want.



For instance, if I copy an email from one folder to another, and I have
a discard action on duplicate email, is this action will be applied (in
this case, discard) or not.


Are you talking about IMAPSieve now? I am not sure "duplicate" is 
currently even allowed in that context.



If the duplicate is global to all folders, is there a way to restrict
the search in one folder only.


You can set the :uniqueid parameter accordingly.

Regards,

Stephan.


Thank you, Stephan.

Yes, I meant the Sieve duplicate extension.

I am using a program to import email (mbsync), which uses the IMAP append 
function. Sometimes the import fails and I have to restart the program. 
Unfortunately, the same emails are then imported again.


I found a fix by using a dovecot IMAP sieve script executed on the 
APPEND action 
(https://wiki.dovecot.org/Pigeonhole/Sieve/Plugins/IMAPSieve). I wrote a 
custom sieve script that applies "discard" to the ones that are detected as 
"duplicate". It worked very well and the emails were no longer 
imported twice.


However, there was a huge side effect: archiving an email with 
Thunderbird no longer works, and the email is even lost! I understand 
the failure as follows:


1. When Thunderbird archives an email, it first copies it (APPEND) 
into the archive folder, without yet expunging the original folder.
2. The sieve script detects the email as a duplicate and discards it.
3. When the original folder is expunged, the source email is lost...

My conclusion was the duplicate detection function is global to all folders.

If I could restrict the detection of duplicates to the current folder 
only, this would let me run the import program again without error.


Kind regards,
André.


Re: dovecot sieve duplicates detection

2018-04-23 Thread Stephan Bosch



On 11-4-2018 at 23:58, André Rodier wrote:

Hello,

I have tested the sieve duplicate script with success so far, but I have
a question.


Sieve duplicate script? You mean the Sieve duplicate extension (RFC 7352)?


I would like to know if the "duplicate" sieve flag in Dovecot is global
to all folders, or specific to one folder only.


It uses the lda-dupes file in the user's home directory. So, it is not 
normally related to folders, although the identifier used for duplicate 
matching could be composed of the mailbox name if you want.



For instance, if I copy an email from one folder to another, and I have
a discard action on duplicate email, is this action will be applied (in
this case, discard) or not.


Are you talking about IMAPSieve now? I am not sure "duplicate" is 
currently even allowed in that context.



If the duplicate is global to all folders, is there a way to restrict
the search in one folder only.


You can set the :uniqueid parameter accordingly.

Regards,

Stephan.


dovecot sieve duplicates detection

2018-04-11 Thread André Rodier
Hello,

I have tested the sieve duplicate script with success so far, but I have
a question.

I would like to know if the "duplicate" sieve flag in Dovecot is global
to all folders, or specific to one folder only.

For instance, if I copy an email from one folder to another, and I have
a discard action on duplicate email, will this action be applied (in
this case, discard) or not?

If the duplicate is global to all folders, is there a way to restrict
the search to one folder only?

Thanks for your help.
André