Re: CVS and unicode

Christian Hujer Sun, 11 Sep 2005 05:21:12 -0700

Hi,

On Sunday, 11 September 2005 01:53, Pierre Asselin wrote:
> Christian Hujer <[EMAIL PROTECTED]> wrote:
> > [ ... ]  The CRLF byte sequences are:
> > ASCII: 0x0D 0x0A.
> > UTF-8: 0x0D 0x0A.
> > UTF-16 LE: 0x0D 0x00 0x0A 0x00.
> > UTF-16 BE: 0x00 0x0D 0x00 0x0A.
> >
> > CVS will not interfere with any of these.
> > The UTF-16 LE sequence will be split within the LF char. But since the
> > next line will be split at exactly the same point, this is not a problem
> > for line diffs.
>
> A UTF-16 file can contain octet sequences like (xx 0D)(0A yy) that
> CVS will mistake for line endings.
Ah okay, true, I didn't think about this.
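
To make the point concrete, here is a small Python sketch (the helper names
crlf_bytes and has_spurious_crlf are made up for illustration, and this is
not how CVS itself scans files). It reproduces the CRLF byte sequences
listed above and shows that UTF-16 text without any line break can still
contain the octets 0D 0A:

```python
CRLF = b"\r\n"  # the octets 0x0D 0x0A


def crlf_bytes(encoding: str) -> bytes:
    """Return the bytes that encode a real CR LF pair in the given encoding."""
    return "\r\n".encode(encoding)


def has_spurious_crlf(text: str, encoding: str) -> bool:
    """True if text without any CR or LF still encodes to bytes containing 0D 0A."""
    if "\r" in text or "\n" in text:
        raise ValueError("text must not contain real line breaks")
    return CRLF in text.encode(encoding)


if __name__ == "__main__":
    for enc in ("ascii", "utf-8", "utf-16-le", "utf-16-be"):
        print(enc, crlf_bytes(enc).hex(" "))

    # (xx 0D)(0A yy) in UTF-16 BE: U+010D followed by U+0A05.
    print(has_spurious_crlf("\u010d\u0a05", "utf-16-be"))
    # The same pattern in UTF-16 LE: U+0D41 followed by U+010A.
    print(has_spurious_crlf("\u0d41\u010a", "utf-16-le"))
    # A single character is enough in UTF-16 BE: U+0D0A.
    print(has_spurious_crlf("\u0d0a", "utf-16-be"))
```

The same characters encoded as UTF-8 do not produce 0D 0A, which is why
the problem is specific to UTF-16.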


> It will confuse diff, and if 
> a Windows client strips the "0D" upon commit and a Unix client
> tries to update, the contents will look seriously scrambled...
The diff problem is valid.

The Windows client problem is invalid since the client should not perform 
modifications on the files, whether -kb or not (imo).

Okay, UTF-16 is very likely to be a problem if not treated as binary. UTF-16 
should therefore be added with -kb.
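
For illustration, a rough Python sketch of what happens to UTF-16 LE data
if a client does convert CRLF to LF on a file that was not added with -kb.
The function text_mode_commit is a hypothetical stand-in for such a
conversion, not actual CVS code:

```python
def text_mode_commit(data: bytes) -> bytes:
    """Hypothetical stand-in for a client's CRLF -> LF conversion of text files."""
    return data.replace(b"\r\n", b"\n")


# No line break in this text, but U+0D41 followed by U+010A encodes to
# 41 0D 0A 01 in UTF-16 LE, which contains the octets 0D 0A.
original = "a\u0d41\u010ab"
encoded = original.encode("utf-16-le")

stored = text_mode_commit(encoded)
# One octet was removed, so every following 16-bit unit is shifted.
restored = stored.decode("utf-16-le", errors="replace")

# A real CR LF in UTF-16 LE is 0D 00 0A 00, so 0D and 0A are never adjacent
# and the conversion leaves genuine line endings untouched.
real_break = "x\r\ny".encode("utf-16-le")

if __name__ == "__main__":
    print(len(encoded), len(stored))
    print(restored == original)
    print(text_mode_commit(real_break) == real_break)
```

So the conversion misses the real line endings and only damages data that
happens to contain 0D 0A, which is one more reason to treat such files as
binary.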


Christian


_______________________________________________
Info-cvs mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/info-cvs
