Bron Gondwana wrote in
 <[email protected]>:
 |On Wed, Nov 20, 2024, at 08:15, Steffen Nurpmeso wrote:
 |> that goes out without MIME as such (text/plain 7-bit content-type
 |> is optional), but both of these two messages came in via ML as
 |>
 |>   Content-Type: text/plain; charset="utf-8"
 |>   Content-Transfer-Encoding: base64
 |
 |Yeah, if the source message isn't MIME encoded, Mailman re-encodes. \
It is more than that.

 | It's a "detect message type" flag in the code, and it would be trivial \
 |to add a config "don't do that if DKIM2" and instead just MIME-wrap \
 |the existing message with the existing charset.
 ...
 |> -rw-r----- 1 steffen wheel 2167 Nov 19 21:22 t1-i.txt
 |> -rw-r----- 1 steffen wheel 2201 Nov 19 21:22 t1-o.txt
 |> -rw------- 1 steffen wheel  236 Nov 19 21:22 t1-patch
 |> -rw-r----- 1 steffen wheel 8412 Nov 19 21:22 t2-i.txt
 |> -rw-r----- 1 steffen wheel 5932 Nov 19 21:22 t2-o.txt
 |> -rw------- 1 steffen wheel 4350 Nov 19 21:23 t2-patch
 |>
 |> Hm.  Ok let me remove the bzip2 stuff from bsdiff..  Here is the
 |> same without, and then running plzip and zstd on the uncompressed
 |> binary data; this still has the normal header and such (note
 |> i have not yet looked at all, it may very well be that patches at
 |> position 0 or "EOT" could be optimized away etc etc.)
 |>
 |> plzip -9 and zstd -19
 |>
 |> -rw------- 1 steffen wheel  142 Nov 19 21:48 t1-patch-2.lz
 |> -rw------- 1 steffen wheel  116 Nov 19 21:48 t1-patch-2.zst
 |>
 |> -rw------- 1 steffen wheel 4654 Nov 19 21:48 t2-patch-2.lz
 |> -rw------- 1 steffen wheel 4577 Nov 19 21:48 t2-patch-2.zst
 |>
 |> It would be interesting to know how your implementation of the
 |> algorithm works out for those (and the "real" vcdiff
 |> implementation i have seen is huge).  Would be cool if it is
 |> superior, of course.
 |
 |My code uses a pretty basic perl diffing tool, but we could use vcdiff \
 |just fine too - and have it be an input to that format.  The format \
 |really is basically just the logic from RFC3284; but encoded to be \
 |readable.
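
As a rough illustration of the RFC 3284 logic mentioned in the quote
above, here is a minimal Python sketch of how a decoder applies ADD,
RUN and COPY instructions.  The tuple layout and the name apply_delta
are made up for this example; this is neither the RFC 3284 wire format
nor the format used by either poster's code.  The one thing taken from
the RFC is the addressing rule: a COPY address indexes the source
followed by the target decoded so far.

```python
def apply_delta(source: bytes, instructions) -> bytes:
    """Rebuild a target from a source and a list of delta instructions.

    Hypothetical instruction tuples (an assumption of this sketch):
      ("ADD", data)        append literal bytes carried in the delta
      ("RUN", size, byte)  append one byte value repeated size times
      ("COPY", size, addr) copy size bytes starting at addr, where addr
                           indexes source + target-decoded-so-far
    """
    target = bytearray()
    for inst in instructions:
        op = inst[0]
        if op == "ADD":
            target += inst[1]
        elif op == "RUN":
            _, size, byte = inst
            target += bytes([byte]) * size
        elif op == "COPY":
            _, size, addr = inst
            # Copy byte by byte so that a copy may overlap the data it
            # is currently producing.
            for pos in range(addr, addr + size):
                if pos < len(source):
                    target.append(source[pos])
                else:
                    tpos = pos - len(source)
                    if tpos >= len(target):
                        raise ValueError("COPY reads past decoded data")
                    target.append(target[tpos])
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return bytes(target)


source = b"Content-Type: text/plain\n"
delta = [
    ("COPY", 14, 0),        # "Content-Type: " taken from the source
    ("ADD", b"text/html"),  # literal bytes that are not in the source
    ("COPY", 1, 24),        # the trailing newline from the source
]
print(apply_delta(source, delta))  # b'Content-Type: text/html\n'
```

Only the ADD data has to travel inside the delta; the COPY and RUN
instructions cost a few bytes each no matter how much data they
reproduce.
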

Ok i now downloaded xdelta3 which uses the VCDIFF algorithm (like
Google's really big thing open-vcdiff), and i see i get for t1

  Offset Code Type1 Size1  @Addr1 + Type2 Size2 @Addr2
  000000 019  CPY_0    54  S@0
  000054 002  ADD       1
  000055 034  CPY_0    18  S@59
  000073 003  ADD       2
  000075 019  CPY_0    27  S@83
  000102 019  CPY_0   196  S@112
  000298 107  CPY_5    11  S@310
  000309 051  CPY_2    53  S@323
  000362 007  ADD       6
  000368 051  CPY_2    45  S@386
  000413 051  CPY_2   111  S@433
  000524 099  CPY_5   250  S@546
  000774 035  CPY_1    21  T@309
  000795 014  ADD      13
  000808 069  CPY_3     5  T@362
  000813 003  ADD       2
  000815 051  CPY_2    38  S@843
  000853 099  CPY_5   238  S@883
  001091 003  ADD       2
  001093 051  CPY_2  1074  S@1127

so i wildly guess you actually postprocess this output (for now).
The two examples i had posted are smaller when processed with
bsdiff compared to non-postprocessed VCDIFF, that much is plain.
But thank you!

Ciao,

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
|
|And in Fall, feel "The Dropbear Bard"s ball(s).
|
|The banded bear
|without a care,
|Banged on himself fore'er and e'er
|
|Farewell, dear collar bear
