Bron Gondwana wrote in
 <[email protected]>:
 |On Wed, Jun 18, 2025, at 21:08, Steffen Nurpmeso wrote:
 |> [email protected] wrote in
 |> <175028097011.643508.1606149502766119820@dt-datatracker-75bbdb9cc5-qvb4t\
 |> >:
 |> ...
 |>|   Title:   A method for describing changes made to an email
 |>|   Author:  Bron Gondwana
 |>|   Name:    draft-gondwana-dkim2-modification-alegbra-02.txt
 |>|   Pages:   6
 |>|   Dates:   2025-06-18
 |> 
 |> I thought it makes sense to make it very clear how different the
 |> approaches are.
 |> 
 |> Here it seems to be the desire to be able to recreate messages
 |> entirely.
 |> 
 |> For ACDC instead the DKIM-normalized data is used to create the
 |> differences: verifiers and signers have to create this
 |> representation anyway, and you can, after applying the difference,
 |> more or less immediately verify the signature of elder data.
 ...
 |Yes, calculating the diff on the normalised form is certainly a valid \
 |option, and having that convert back to the previous message's normalised \
 |form.  That's pretty much what I'm trying to do but not insisting on \
 |the normalising step.  I'd be happy to add the normalisation requirement.

Confusing ... but the last sentence sounds good to me.  I thought,
and still think, that it makes things easier to implement in the
code paths i can think of; and it is sufficient.

(Less so in code paths like, for example, Scott Kitterman's dkimpy,
where the milter and the DKIM code are actually separate, and the
milter "only" collects data and then feeds it all further on.  But
i could imagine that even there an additional interface for
passing already normalized data would open the door.)

I was convinced by your approach of inspecting the differences of
headers and body separately --- it is the better one, and so is
using tag=s to describe anything.  I.e., it is logical: working
through the potentially large body can be skipped if the body hash
did not change.  With credits to you.
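
(For illustration only: a minimal sketch of the "skip the body if the
body hash did not change" shortcut.  It assumes the "relaxed" body
canonicalization of RFC 6376 and SHA-256; the function names
relaxed_body, body_hash and body_needs_diff are hypothetical and come
from neither draft.)

```python
import base64
import hashlib
import re


def relaxed_body(body: bytes) -> bytes:
    """DKIM "relaxed" body canonicalization (RFC 6376, section 3.4.4)."""
    lines = body.split(b"\r\n")
    # Collapse runs of spaces/tabs to one space, then drop trailing whitespace.
    lines = [re.sub(rb"[ \t]+", b" ", ln).rstrip(b" ") for ln in lines]
    # Ignore all empty lines at the end of the body.
    while lines and lines[-1] == b"":
        lines.pop()
    # Every remaining line, including the last one, is terminated by CRLF.
    return b"".join(ln + b"\r\n" for ln in lines)


def body_hash(body: bytes) -> str:
    """Base64 SHA-256 over the canonicalized body, like a DKIM bh= value."""
    digest = hashlib.sha256(relaxed_body(body)).digest()
    return base64.b64encode(digest).decode("ascii")


def body_needs_diff(previous_bh: str, current_body: bytes) -> bool:
    """True only if the body changed and a body difference must be computed."""
    return body_hash(current_body) != previous_bh
```

With this, a signer that still has the previous bh= value at hand only
needs to run the (possibly expensive) body difference when
body_needs_diff() returns True; the header difference is handled
separately either way.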

I would expect lower CPU usage from your difference algorithm, and
the more so the larger the data.  It is interesting, you know, but
i for one would stick to the other approach.  With the normal
diff(1) algorithm i got about 30 percent for a 5 megabyte email,
and about 110 percent for a 24 megabyte email.  (With changed
body.)  For emails around and below 100 kilobytes i did not
encounter differences in CPU usage, and with a better compression
algorithm than ZLIB -- xz or zstd, which has an RFC; better in that
they can at times compress loads of zeroes better --, i get
satisfying results with the suffix sorting approach, and i like
the compact visual representation more.
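
(For illustration only: a small sketch of how the "loads of zeroes"
remark could be checked with Python's standard zlib and lzma modules,
the latter producing xz output.  The helper compressed_sizes is
hypothetical, zstd is left out, and this is not the measurement
described above.)

```python
import lzma
import zlib


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Return the compressed size in bytes for zlib and xz."""
    return {
        "zlib": len(zlib.compress(data, 9)),  # best zlib compression level
        "xz": len(lzma.compress(data)),       # .xz container, default preset
    }


if __name__ == "__main__":
    # One mebibyte of zero bytes as a stand-in for a zero-heavy difference.
    for name, size in compressed_sizes(bytes(1 << 20)).items():
        print(f"{name}: {size} bytes")
```

How much of this carries over to real differences depends on the
data; the sketch only covers the extreme all-zero case.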

Anyhow, thank you, and i wish you a nice weekend from Germany,

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

_______________________________________________
Ietf-dkim mailing list -- [email protected]
To unsubscribe send an email to [email protected]
