[ https://issues.apache.org/jira/browse/SVN-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nathan Hartman closed SVN-807.
------------------------------
    Fix Version/s:     (was: unscheduled)
                   1.0.0
       Resolution: Fixed

> gracefully degrade from failed charset conversion
> -------------------------------------------------
>
>                 Key: SVN-807
>                 URL: https://issues.apache.org/jira/browse/SVN-807
>             Project: Subversion
>          Issue Type: Bug
>    Affects Versions: all
>            Reporter: Karl Fogel
>            Priority: Minor
>             Fix For: 1.0.0
>
>         Attachments: 1_brane-utf-8.mbox, 2_ulrich.mbox
>
>
> {noformat:nopanel=true}
> Right now, if a log message contains characters that cannot be
> represented in the client's locale, that log message will simply show
> up as:
>
>    "[unconvertible log msg]"
>
> Graceful degradation would be nice here :-).
>
> See the dev list thread "Re: converting unconvertible UTF-8 data" for
> discussion of possible solutions.
>
> My first idea was to write a fuzzy converter function that replaces
> every unconverted byte with an escape sequence representing its
> numerical code ("?\XXX" or somesuch).
>
> Then Ulrich Drepper pointed out that since this data is mainly for
> human consumption, the "//TRANSLIT" behavior of glibc's iconv and GNU
> libiconv would produce more readable output.  We can at least detect
> when we're using one of those iconv's and append that option to the
> to-charset string where appropriate.  (Marcus Comstedt points out that
> some iconv implementations automatically do transliteration for you,
> and don't even tell you whether or not it's happened, which is sort of
> unnerving.)
>
> However, if you are on a system that doesn't support this, you'll get
> the result above.
>
> So there are various non-mutually-exclusive steps to take here:
>
>    - Write the fuzzy function with the escape codes, use where
>      translit not available.
>
>    - Meanwhile, get Subversion doing transliteration where possible
>      (Ulrich may do)
>
>    - Possible early fix: make "svn log" accept --force or
>      --message-encoding, so one can make it output the raw bytes or
>      a specific encoding, respectively.
> {noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
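
The "fuzzy converter" idea from the issue text (replace whatever cannot
be converted with an escape sequence giving its numeric byte value) can be
sketched as follows. This is an illustrative Python sketch only, not
Subversion's actual C implementation: the names fuzzy_convert and
svn_fuzzy_escape are hypothetical, and rendering "?\XXX" as three decimal
digits per UTF-8 byte is an assumption, since the issue leaves the exact
form open ("or somesuch").

```python
import codecs


def _escape_unencodable(err):
    """Codec error handler (hypothetical name).

    Replaces each character the target charset cannot represent with
    one "?\\NNN" escape per byte of its UTF-8 encoding. NNN is the
    byte value in decimal; that format is an assumption.
    """
    if not isinstance(err, UnicodeEncodeError):
        raise err
    bad = err.object[err.start:err.end]
    escaped = "".join("?\\%03d" % b for b in bad.encode("utf-8"))
    # Resume encoding right after the unencodable run.
    return escaped, err.end


# Register the handler under a name usable as the errors= argument.
codecs.register_error("svn_fuzzy_escape", _escape_unencodable)


def fuzzy_convert(utf8_bytes, to_charset):
    """Convert UTF-8 bytes to to_charset, escaping what does not fit."""
    text = utf8_bytes.decode("utf-8")
    return text.encode(to_charset, errors="svn_fuzzy_escape")


if __name__ == "__main__":
    # "é" is the two UTF-8 bytes 195 and 169, neither representable in ASCII.
    print(fuzzy_convert("café".encode("utf-8"), "ascii"))
```

Text that the target charset can represent passes through unchanged, so
only the unconvertible parts of a log message are degraded rather than the
whole message being replaced.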