I slightly disagree with your analysis.

If the file was written with a different encoding, the string read back
will be corrupted, with malformed and unmappable byte sequences replaced
as described in [1]. The consequence is simply that the two strings
will differ, so the file will be overwritten anyway.
So I don't think any failure can happen, and the result should be the
same whether we compare the bytes or the strings.
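
To illustrate the point, here is a minimal sketch of the behaviour
described in [1]. The class name, method name and sample string are
made up for illustration; this is not code from Modello. It assumes the
file on disk was written in ISO-8859-1 while the writer expects UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Modello.
final class DecodeMismatchDemo {

    // Decodes with the String(byte[], Charset) constructor, which replaces
    // malformed input with U+FFFD instead of throwing an exception.
    static String decodeLeniently(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Content the generator wants to write ("caf" followed by e-acute).
        String generated = "caf\u00e9";

        // Assumption: the existing file was written in ISO-8859-1, so the
        // last character is the single byte 0xE9, which is not valid UTF-8.
        byte[] onDisk = generated.getBytes(StandardCharsets.ISO_8859_1);

        String decoded = decodeLeniently(onDisk);

        // No exception is thrown; the strings just differ, so a writer that
        // compares strings would overwrite the file.
        System.out.println(decoded.equals(generated));
        System.out.println(decoded.indexOf('\uFFFD') >= 0);
    }
}
```

Decoding does not fail here; the mismatch only shows up as a different
string.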

That said, I agree that comparing bytes avoids an unnecessary decoding
of the existing file and allows an optimization by checking the file
length first, as you said.  I've raised a PR for that [2].
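
For readers of the archive, a rough sketch of the byte-based comparison
discussed in this thread follows. It is not the code from the PR: the
class and method names are hypothetical, and it assumes the whole
generated content fits in memory as a String.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not the actual CachingWriter from Modello.
final class WriteIfChanged {

    /** Returns true if the file was (re)written, false if it was left untouched. */
    static boolean writeIfChanged(Path target, String content, Charset charset)
            throws IOException {
        // Encode the new content once; the existing file is never decoded.
        byte[] newBytes = content.getBytes(charset);

        // Cheap check first: if the lengths differ, the contents differ,
        // and the existing file does not need to be read at all.
        if (Files.isRegularFile(target)
                && Files.size(target) == newBytes.length
                && Arrays.equals(Files.readAllBytes(target), newBytes)) {
            return false; // identical bytes: keep the file and its timestamp
        }

        Files.write(target, newBytes);
        return true;
    }
}
```

Leaving an identical file untouched keeps its modification time, which
is what prevents downstream steps from treating it as changed.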

Cheers,
Guillaume Nodet

[1]
https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#String(byte[],%20java.nio.charset.Charset)
[2] https://github.com/codehaus-plexus/modello/pull/186

On Mon, Feb 14, 2022 at 08:41, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> I believe the added CachingWriter might become a cause of silent
> failures.
>
> What CachingWriter does is attempt to read the file and decode it with
> the provided encoding.
> Apparently, the decoding might fail since the file might be written in
> another encoding or it might be corrupted.
>
> A better approach would be to convert the created String into `byte[]`, and
> then compare the bytes with file contents.
> Then you never really need to decode the file.
>
> Could someone with commit privileges just fix it?
> The fix is just to remove readString and writeString methods, and use
> Arrays.equals to compare the contents.
> Then, you can compare file contents and avoid reading the file if you know
> the length differs (it optimizes the "file differs" case).
>
> ----
>
> I remember facing the "Maven recompiling everything" issue multiple times,
> so I wonder if Maven itself has a solution for that.
> I understand tracking inputs and outputs might take a while to implement,
> so what if Maven had a solid API for generating files that skips
> overwriting if the contents are the same?
>
> Sample issues off the top of my head:
> https://issues.apache.org/jira/browse/MRRESOURCES-91
> https://github.com/freemarker/fmpp/issues/11
> https://github.com/julianhyde/hydromatic-resource/pull/4
>
> Why does every plugin have to reinvent a half-broken caching wheel?
>
> Vladimir
>


-- 
------------------------
Guillaume Nodet
