Re: Filter script to remove html, fullquotes and header lines

2022-03-20 Thread Cameron Simpson
On 20Mar2022 13:36, Martin Trautmann wrote:
>do you know about any mutt script that would go from message to message 
>and
>
>1) remove an HTML part if a plain text part is given
>
>2) remove all trailing lines,
>   starting with a quote sign ">"
>   and at least e.g. 10 occurrences
>
>  such as (^>.*[\r\n]+){10,} before the end of the message
>
>  Maybe I could append xzxzxzx to the end of the message first, delete 
>a fullquote up to there and remove xzxzxzx again?
>
>  Bonus: Do not remove fullquotes for messages without in-reply-to or 
>references headers.
>
>3) remove header fields which span more than 5 lines
>
>I want to shrink the size of some mailboxes for archive purposes, 
>without throwing away too much.

I think you'll have to write your own.

At minimum you need a full mail message parser so that you are not 
filtering, say, base64 or QP content incorrectly. So something which 
scans a mailbox and for each message:
- decodes it completely
- applies your filters
- assembles the new message
and writes this out to a new mailbox (so it isn't destructive and can be 
compared to the original - you don't want to accidentally shred your 
archive).
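
The loop described above could look roughly like this in Python, using the 
standard library's mailbox module. This is only a sketch: it assumes the 
mailboxes are in mbox format, the names filter_mailbox, drop_long_headers 
and KEEP_HEADERS are made up for illustration, and the set of protected 
headers is a guess at what is worth keeping. It reads "longer than 5 lines" 
as "folded over more than 5 lines".

```python
import mailbox

# Hypothetical choice: header fields that are never dropped, however long.
KEEP_HEADERS = {
    "from", "to", "cc", "subject", "date", "message-id",
    "in-reply-to", "references", "mime-version",
    "content-type", "content-transfer-encoding",
}


def drop_long_headers(msg, max_lines=5):
    """Remove header fields that are folded over more than max_lines lines."""
    # mailbox parses with the legacy (compat32) policy, so folded header
    # values still contain their line breaks and can simply be counted.
    headers = msg.items()
    for name in set(msg.keys()):
        del msg[name]              # removes every field with that name
    for name, value in headers:    # re-add the survivors in original order
        n_lines = str(value).count("\n") + 1
        if name.lower() in KEEP_HEADERS or n_lines <= max_lines:
            msg[name] = value
    return msg


def filter_mailbox(src_path, dst_path, filters):
    """Copy each message from src_path to dst_path, applying the filters."""
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)   # created if missing; source is untouched
    try:
        for msg in src:
            for apply_filter in filters:
                msg = apply_filter(msg)
            dst.add(msg)
        dst.flush()
    finally:
        dst.close()
        src.close()
```

A call such as filter_mailbox("archive.mbox", "archive-small.mbox", 
[drop_long_headers]) leaves the original file alone, so the two mailboxes 
can be compared before anything is replaced. No locking is done here, which 
is only reasonable if nothing else is delivering to either file.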

I'd do this in Python myself - it has a good email library and you can 
do all the things you describe fairly easily with it.
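
As a rough illustration of items 1 and 2 (plus the "bonus"), here is a 
sketch using the email package with policy.default, which gives decoded 
text via get_content(). The function names and the threshold of 10 lines 
are assumptions taken from the request, not anything mutt provides. It only 
looks at HTML parts that sit directly inside multipart/alternative, and 
only at quote blocks that run to the very end of the text (a signature 
below the quote defeats it).

```python
from email import message_from_string, policy


def drop_html_alternative(msg):
    """In multipart/alternative, drop text/html if text/plain is present."""
    for part in msg.walk():
        if (part.get_content_type() != "multipart/alternative"
                or not part.is_multipart()):
            continue
        subparts = part.get_payload()
        if any(p.get_content_type() == "text/plain" for p in subparts):
            part.set_payload([p for p in subparts
                              if p.get_content_type() != "text/html"])
    return msg


def strip_trailing_quote(text, min_lines=10):
    """Drop a block of '>' lines at the end of text if it is long enough."""
    lines = text.splitlines()
    end = len(lines)
    while end > 0 and not lines[end - 1].strip():
        end -= 1                    # ignore trailing blank lines
    start = end
    while start > 0 and lines[start - 1].startswith(">"):
        start -= 1                  # walk back over the quoted block
    if end - start < min_lines:
        return text
    return "\n".join(lines[:start]).rstrip() + "\n"


def strip_fullquotes(msg, min_lines=10):
    """Strip trailing quotes from text/plain parts, for replies only."""
    if msg["In-Reply-To"] is None and msg["References"] is None:
        return msg                  # not a reply: leave the quotes alone
    for part in msg.walk():
        if part.get_content_type() != "text/plain" or part.is_attachment():
            continue
        old = part.get_content()    # decoded from base64/QP and charset
        new = strip_trailing_quote(old, min_lines)
        if new != old:
            part.set_content(new)   # re-encodes; resets Content-* headers
    return msg


raw = (
    "From: a@example.com\n"
    "In-Reply-To: <1@example.com>\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="b"\n'
    "\n"
    "--b\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "Thanks!\n\n" + "> quoted\n" * 12 +
    "--b\n"
    "Content-Type: text/html; charset=utf-8\n"
    "\n"
    "<p>Thanks!</p>\n"
    "--b--\n"
)
example = message_from_string(raw, policy=policy.default)
example = strip_fullquotes(drop_html_alternative(example))
```

Note that set_content() rewrites the part as UTF-8 and discards its other 
Content-* headers (format=flowed, for instance), which seems acceptable for 
an archive but is worth checking against a copy first.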

Cheers,
Cameron Simpson 


Filter script to remove html, fullquotes and header lines

2022-03-20 Thread Martin Trautmann

Hi all,

do you know about any mutt script that would go from message to message and

1) remove an HTML part if a plain text part is given

2) remove all trailing lines,
   starting with a quote sign ">"
   and at least e.g. 10 occurrences

  such as (^>.*[\r\n]+){10,} before the end of the message

  Maybe I could append xzxzxzx to the end of the message first, delete 
a fullquote up to there and remove xzxzxzx again?


  Bonus: Do not remove fullquotes for messages without in-reply-to or 
references headers.


3) remove header fields which span more than 5 lines

I want to shrink the size of some mailboxes for archive purposes, 
without throwing away too much.


Thanks,
Martin