Branko Čibej wrote on Sun, 06 Jan 2019 19:37 +0100:
> A simple check would be:
>
>   * if 0x0a is on an odd offset, and the next byte is 0x00, then it's a
>     UTF-16-LE linefeed;
>   * else if 0x0a is on an even offset, and the _previous_ byte is 0x00,
>     then it's a UTF-16-BE linefeed;

What would happen if it were an ASCII/UTF-8 file that happened to have a
literal NUL byte next to an LF byte?  I have seen/used some of those.

>   * otherwise just hope it's a linefeed and move on.

The encoding may also be set explicitly via a
svn:mime-type="text/foo; charset=utf-16-le" property.  (We even parse
that in mod_dav_svn, I think?)

Cheers,

Daniel
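
P.S. To make the ambiguity concrete, here is a minimal Python sketch of the quoted check. It is only an illustration, not code from Subversion: the function name classify_lf and its return strings are made up for this example, and it assumes the offsets in the quoted text are counted from 1 (so a UTF-16-LE linefeed, 0x0a 0x00, has its 0x0a on an odd offset).

```python
def classify_lf(buf: bytes, i: int) -> str:
    """Classify the 0x0a byte at 0-based index i using the quoted heuristic.

    Hypothetical helper for illustration only; offsets are treated as
    1-based, which is an assumption about the quoted text.
    """
    if buf[i] != 0x0A:
        raise ValueError("index does not point at a 0x0a byte")

    offset = i + 1  # convert the 0-based index to a 1-based offset

    # Odd offset and the next byte is NUL: looks like UTF-16-LE (0a 00).
    if offset % 2 == 1 and i + 1 < len(buf) and buf[i + 1] == 0x00:
        return "utf-16-le"

    # Even offset and the previous byte is NUL: looks like UTF-16-BE (00 0a).
    if offset % 2 == 0 and i >= 1 and buf[i - 1] == 0x00:
        return "utf-16-be"

    # Otherwise assume it is a plain single-byte linefeed.
    return "single-byte"


# Genuine UTF-16 input is classified as intended.
le = "a\n".encode("utf-16-le")   # b'a\x00\n\x00'
be = "a\n".encode("utf-16-be")   # b'\x00a\x00\n'
print(classify_lf(le, 2))
print(classify_lf(be, 3))

# An ASCII file with a literal NUL right after the LF is misread as UTF-16-LE.
ascii_with_nul = b"ab\n\x00cd"
print(classify_lf(ascii_with_nul, 2))
```

With these inputs the last call reports "utf-16-le" for what is really an ASCII file, which is the case I was asking about. An explicit charset in svn:mime-type would let us skip the guess entirely.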