On Mon, Sep 19, 2005 at 08:54:58AM -0400, David Roundy wrote:
> On Sun, Sep 18, 2005 at 05:04:17PM -0500, Taral wrote:
> > On 9/18/05, David Roundy <[EMAIL PROTECTED]> wrote:
> > > Perhaps you could experiment with using the FFI to call isprint?
> > > 
> > > foreign import ccall unsafe "static ctype.h isprint" isprint :: CInt -> CInt
> > > 
> > > I suspect, though, that the isprint interface can't really work with
> > > UTF-8...
> > 
> > The functions for unicode are in wctype.h.
> 
> But we don't have unicode characters... or at least we have no way of
> knowing if we have unicode characters.  I suppose we could extract that
> from the locale, but then we'd also need to try to convert our contents
> from utf8 into unicode, which sounds like a bit of a nightmare to me.

Yes, darcs uses the data "chars" purely as binary octets, and
darcs itself would be the wrong place to implement the many
different ways of interpreting them.

I think a reasonable solution for handling various encodings
is with user-supplied translation hooks to filter all data
that darcs reads/writes to/from utf8 (or any other desired
"base" format of the repo).  It could handle both known and
invented encodings, and different line endings.  And we'd
have the option to make darcs utf8-aware at any later time,
since utf8 will hopefully be a very common repo base format,
and all pure ascii7 repos are already perfectly valid utf8.
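
A minimal sketch of what such a translation hook could look like,
assuming a hook is just a pair of pure octet filters.  The names
(Hook, toRepo, fromRepo, crlfHook, latin1Hook) are made up for
illustration and are not part of darcs:

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word8)

-- Hypothetical hook type: one filter per direction.
data Hook = Hook
  { toRepo   :: [Word8] -> [Word8]  -- working copy -> repo base format
  , fromRepo :: [Word8] -> [Word8]  -- repo base format -> working copy
  }

-- Line-ending hook: CRLF in the working copy, LF in the repo.
crlfHook :: Hook
crlfHook = Hook { toRepo = crlfToLf, fromRepo = lfToCrlf }
  where
    crlfToLf (13 : 10 : rest) = 10 : crlfToLf rest
    crlfToLf (b : rest)       = b : crlfToLf rest
    crlfToLf []               = []
    -- Assumes the repo side contains no CR LF pairs already.
    lfToCrlf (10 : rest) = 13 : 10 : lfToCrlf rest
    lfToCrlf (b : rest)  = b : lfToCrlf rest
    lfToCrlf []          = []

-- Encoding hook: Latin-1 in the working copy, utf8 in the repo.
latin1Hook :: Hook
latin1Hook = Hook { toRepo = concatMap enc, fromRepo = dec }
  where
    -- Octets below 0x80 are ascii7 and pass through unchanged.
    enc b
      | b < 0x80  = [b]
      | otherwise = [0xC0 .|. (b `shiftR` 6), 0x80 .|. (b .&. 0x3F)]
    -- Only the two-byte sequences that map back into Latin-1 are
    -- decoded; anything else is passed through untouched.
    dec (l : c : rest)
      | l == 0xC2 || l == 0xC3, c .&. 0xC0 == 0x80 =
          (((l .&. 0x03) `shiftL` 6) .|. (c .&. 0x3F)) : dec rest
    dec (b : rest) = b : dec rest
    dec []         = []
```

Darcs would only ever see the output of toRepo and feed its own
output through fromRepo, so it could keep treating everything as
plain octets.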

The isPrint hack is only a crude way to interpret the binary
octets in chars as characters encoded in the current system locale.

With GHC's new isPrint behavior something bad could happen if
a user sets DARCS_USE_ISPRINT=1 and has a locale with control
codes where unicode has printables.  I don't know if such an
encoding exists.
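
To make the concern concrete, here is a small sketch of how a
unicode-aware isPrint classifies raw octets when each octet is
taken as a code point.  The helper names (unicodePrintable,
highPrintables) are invented for this example:

```haskell
import Data.Char (chr, isPrint)
import Data.Word (Word8)

-- Treat an octet as the unicode code point with the same number
-- and ask Data.Char.isPrint about it.
unicodePrintable :: Word8 -> Bool
unicodePrintable = isPrint . chr . fromIntegral

-- High octets that unicode considers printable.  The range
-- 0x80..0x9F holds the C1 control codes, so none of those appear.
highPrintables :: [Word8]
highPrintables = filter unicodePrintable [0x80 .. 0xFF]
```

Any octet in highPrintables that the user's locale defines as a
control code would be written to the terminal unescaped.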

If I (or someone else) don't get a fix (the FFI thing) ready
before next stable release, there should probably be a note
in the change log and the doc about it.
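
For reference, a rough sketch of the FFI approach, following
David's suggested import.  The wrapper name isPrintOctet is my own
invention, and this is untested against darcs itself:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Data.Word (Word8)
import Foreign.C.Types (CInt (..))

-- C's isprint from ctype.h; it classifies a single octet
-- according to the current C locale (LC_CTYPE).
foreign import ccall unsafe "ctype.h isprint"
  c_isprint :: CInt -> CInt

-- An octet is printable if the locale's isprint says so.
-- Word8 keeps the argument within the 0..255 range that
-- isprint is defined for.
isPrintOctet :: Word8 -> Bool
isPrintOctet w = c_isprint (fromIntegral w) /= 0
```

Since this only ever looks at one octet at a time, it still can't
do anything sensible with multi-byte encodings such as utf8.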


-- 
Tommy Pettersson <[EMAIL PROTECTED]>

_______________________________________________
darcs-devel mailing list
[email protected]
http://www.abridgegame.org/cgi-bin/mailman/listinfo/darcs-devel
