-----Original Message-----
On Tue, May 29, 2007 at 04:59:12PM -0700, David Roundy wrote:
> On Mon, May 28, 2007 at 05:27:24PM -0700, Paul Schauble wrote:
> > I do international software development for Windows. This means I have
> > a bunch of source files in Unicode/UTF-16LE.
> >
> > At present, darcs does not handle this format, except as a pure binary
> > file.
> >
> > Would it be practical to add UTF-8, UTF-16BE, and UTF-16LE formats with
> > full diff/merge support? Is this something the developers would be
> > interested in doing?
>
> It would be practical, but I don't know of any developers who would have
> time to do so. I suspect it wouldn't even be particularly hard; the
> trickiness would all be in handling the line endings, since darcs doesn't
> do anything with the contents of lines (unless you use replace, which
> might be broken).
Conveniently, the UCS-2 byte sequence for the newline character (10)
contains the byte 10. So if darcs didn't refuse to line-diff files
containing 0 bytes, things would just work.
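To make that concrete, here is a minimal sketch (using the text
package's encodeUtf16LE for illustration; this is not darcs code):

    import qualified Data.ByteString as B
    import qualified Data.Text as T
    import Data.Text.Encoding (encodeUtf16LE)

    -- "a\nb" in UTF-16LE: the LF byte (10) really does appear.
    main :: IO ()
    main = print (B.unpack (encodeUtf16LE (T.pack "a\nb")))
    -- prints [97,0,10,0,98,0]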
Stefan
---------------------
Actually, no. If you just took the LF byte without taking the other byte
of the 16-bit character, then the byte-to-character alignment would be
wrong for the following line. You have to actually handle the data as
16-bit characters.
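To illustrate with a sketch (assuming a naive byte-level splitter; the
actual darcs internals may differ):

    import qualified Data.ByteString as B
    import qualified Data.Text as T
    import Data.Text.Encoding (encodeUtf16LE)

    -- Split at raw 0x0A bytes, the way a byte-oriented diff would.
    naiveLines :: B.ByteString -> [B.ByteString]
    naiveLines = B.split 0x0A

    main :: IO ()
    main = mapM_ (print . B.unpack)
                 (naiveLines (encodeUtf16LE (T.pack "ab\ncd")))
    -- "ab\ncd" encodes as 61 00 62 00 0A 00 63 00 64 00; the split
    -- yields [61 00 62 00] and [00 63 00 64 00]. The stray 00 high
    -- byte shifts the second chunk off its 16-bit boundaries, so it
    -- no longer decodes as "cd".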
That's why I'm wondering whether the underlying Haskell system can
handle 16-bit characters. If not, then how could this change be made?
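For what it's worth, GHC's Char is a full Unicode code point, so one
plausible approach (a sketch assuming the text package; not how darcs
works today) is to decode at the file boundary and operate on decoded
lines:

    import qualified Data.ByteString as B
    import qualified Data.Text as T
    import Data.Text.Encoding (decodeUtf16LE)

    -- Read a UTF-16LE file as text lines. Real files usually start
    -- with a BOM (FF FE), which decodes to U+FEFF and would need
    -- stripping; decode errors are not handled here.
    readUtf16leLines :: FilePath -> IO [T.Text]
    readUtf16leLines fp = fmap (T.lines . decodeUtf16LE) (B.readFile fp)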
BTW, the usual problem with programs handling UTF-16 is the NUL bytes
embedded in strings. This usually doesn't work out well unless the
underlying language handles wide characters.
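For instance, plain ASCII text is the worst case: "Hi" in UTF-16LE is
the bytes 48 00 69 00, so anything that treats file contents as
NUL-terminated C strings stops after the first character.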
++PLS
_______________________________________________
darcs-devel mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-devel