On Tue, Nov 12, 2019 at 4:07 AM Branko Čibej <[email protected]> wrote:
> On 11.11.2019 17:30, Daniel Shahaf wrote:
> (snip test procedure)
>
> So I think we can close it as "Fixed at some point"?
>
> Ah, yes, I added some code to at least print something readable (or
> let's say "analyzable") in such cases when I added utf8proc to our code.
> I can't think of anything better to do, so we can close that issue as
> far as I'm concerned.

Brane, thanks for your input.

Daniel, thanks for testing this and documenting how. Please, could you
add that as a comment in the issue tracker? Or, if you'd like, I'll be
happy to do that and attribute it to you.

I agree that this issue should be closed.

From my reading, it looks like it was not closed as a reminder to move
this to APR. (Though that might make sense from a refactoring
standpoint, I think it would cause dependency headaches.) Also there
was a patch from Ulrich Drepper but it looks incomplete. I'm guessing
there was a desire to do more than print the offending hex codes but I
don't know what else you could do.

If there are no objections, I'll go ahead and close SVN-807.

As a side note, I feel dumb because I went spelunking all over the
code and found r842879...

[[[
r842879 | kfogel | 2002-07-30 18:33:13 -0400 (Tue, 30 Jul 2002) | 12 lines

Start on issue #807. Thanks to Justin Erenkrantz for his initial
patch to check_non_ascii().

* subversion/libsvn_subr/utf.c
  (check_non_ascii): Return the more appropriate APR_EINVAL.

* subversion/include/svn_utf.h, subversion/libsvn_subr/utf.c
  (svn_utf_cstring_from_utf8_fuzzy): New function.

* subversion/clients/cmdline/log-cmd.c
  (log_message_receiver): Degrade gracefully yet insistently.
]]]

and only AFTER that, I noticed that it was written in the issue
tracker all along:

[[[
Karl Fogel added a comment - 30/Jul/02 22:46

Okay, the worst has been resolved by revision 2805.

We still need to move the code over to APR, if appropriate. And apply
Brane and Ulrich's patches, but that may be a separate issue, will
mull on that.
]]]

r2805 + 840074 = r842879...

Which, by the way, is before 1.0.

The only other thing I found that deals with bad UTF-8 in log messages
is the logic that prevents bad UTF-8 and mixed line endings from
getting into the log and the automated test for it.

Nathan
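
P.S. For anyone else chasing old revision numbers from the issue
tracker: the arithmetic above (r2805 + 840074 = r842879) can be
sketched as a small helper. This is only an illustration. The function
names are hypothetical, and applying the same offset to other old
revisions is an assumption on my part, not something verified here.

```python
# Offset taken from the arithmetic in this mail: r2805 + 840074 = r842879.
# ASSUMPTION: the same constant applies to other old revision numbers.
REVISION_OFFSET = 840074


def old_to_current(old_rev: int) -> int:
    """Map an old revision number (as cited in the tracker) to today's number."""
    if old_rev < 0:
        raise ValueError("revision numbers are non-negative")
    return old_rev + REVISION_OFFSET


def current_to_old(current_rev: int) -> int:
    """Inverse mapping; rejects revisions that fall below the offset."""
    if current_rev < REVISION_OFFSET:
        raise ValueError("revision is below the offset, no old number exists")
    return current_rev - REVISION_OFFSET


if __name__ == "__main__":
    # The one data point given in this thread.
    print(f"r2805 -> r{old_to_current(2805)}")
```

With that, the "revision 2805" in Karl's tracker comment maps straight
to the r842879 log entry quoted above.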

