Tue, 28 Mar 2000 14:52:56 +0200, Sven Panne <[EMAIL PROTECTED]>
writes:

> 1) To be honest, I don't know enough about char/wchar_t/Unicode/ISO-10646
>    to make suggestions here. Could somebody more knowledgeable make a
>    proposal how these should be handled in the FFI? Keep in mind that this
>    affects the reverse mapping, too, and that using good old C Strings
>    should be easy.

Although C does not require wchar_t to hold Unicode, the standard
clearly suggests it as a reasonable choice, so I vote for
representing Unicode characters on the C side as wchar_t where
appropriate, instead of some custom integer type:

------------------------------------------------------------------------
From: [EMAIL PROTECTED] ()
Newsgroups: comp.std.c
Subject: How wide must wchar_t be?
Date: 27 Mar 2000 08:50:54 GMT
Message-ID: <8bn7de$oa9$[EMAIL PROTECTED]>

Section 6.10.7 of the draft C standard (I haven't ordered
the final version yet) states

    __STDC_ISO_10646__  An integer constant of the form
        yyyymmL (for example, 199712L), intended to
        indicate that values of type wchar_t are the
        coded representations of the characters defined
        by ISO/IEC 10646, along with all amendments and
        technical corrigenda as of the specified year
        and month.

Nit:    Since there are "holes" in the character set
        defined by ISO/IEC 10646 and there are no holes
        in integer data representations, there will always
        be some values of type wchar_t that are not the
        coded representations of the characters defined by
        ISO/IEC 10646.

If ISO/IEC 10646 is amended to include characters outside the
BMP, i.e. characters whose code is greater than 65535, is
wchar_t required to be greater than 16-bits wide if
__STDC_ISO_10646__ is defined to a value that falls on or
after the date the amendment was approved?

                                        Sincerely,
                                        Bob Corbett
------------------------------------------------------------------------
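
A rough sketch of how C code could test what the quoted post talks
about, assuming a C99 implementation that provides <wchar.h>. The
function names are made up for illustration and are not part of any
proposal:

```c
#include <assert.h>
#include <wchar.h>   /* wchar_t, WCHAR_MAX */

/* yyyymm value of __STDC_ISO_10646__, or 0 when the implementation
   makes no promise that wchar_t values are ISO 10646 codes. */
long wchar_iso10646_version(void)
{
#ifdef __STDC_ISO_10646__
    return __STDC_ISO_10646__;
#else
    return 0L;
#endif
}

/* 1 if wchar_t can hold codes above 65535 (outside the BMP), else 0. */
int wchar_holds_beyond_bmp(void)
{
    return WCHAR_MAX > 0xFFFF;
}

/* Convert a code point to wchar_t; passing a value that does not
   fit is a caller error (hypothetical helper). */
wchar_t codepoint_to_wchar(unsigned long cp)
{
    assert(cp <= (unsigned long)WCHAR_MAX);
    return (wchar_t)cp;
}
```

An FFI implementation could consult the first two functions (or the
macros directly) to decide whether a Haskell Char can be passed as
wchar_t without loss on a given platform.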

Whether the mapping C->Haskell->C must be the identity, i.e. whether
every basic C type must be exactly expressible in a Haskell type
signature, is still open. The answer determines the possible
representations of CChar and CWChar.

It has not been fully decided yet how many equivalences between Hs*
and C primitive types, or between C* and Haskell primitive types,
should be explicitly promised. What has been decided:

                     HsAddr = void * - unless Ptr a replaces Addr
               HsForeignObj = void * - why did we forget about ForeignObj?
                HsStablePtr = void *
   Hs{Int,Word}{8,16,32,64} = {,u}int{8,16,32,64}_t
       - so C{,U}Int{8,16,32,64} are not needed, but Hs* are there for
         consistency and because these C types are not always available
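
On the C side, the equivalences above might look roughly like this.
It is only a sketch of a hypothetical header, assuming a platform
that has the C99 exact-width types; hs_check_fixed_widths is an
invented helper, not something that has been agreed on:

```c
#include <assert.h>
#include <limits.h>  /* CHAR_BIT */
#include <stdint.h>  /* intN_t, uintN_t (optional in C99) */

/* Pointer-like Haskell types are all passed as void *. */
typedef void    *HsAddr;
typedef void    *HsForeignObj;
typedef void    *HsStablePtr;

/* Hs{Int,Word}{8,16,32,64} = {,u}int{8,16,32,64}_t */
typedef int8_t   HsInt8;
typedef int16_t  HsInt16;
typedef int32_t  HsInt32;
typedef int64_t  HsInt64;
typedef uint8_t  HsWord8;
typedef uint16_t HsWord16;
typedef uint32_t HsWord32;
typedef uint64_t HsWord64;

/* Abort if any Hs* type is not as wide as its name promises. */
void hs_check_fixed_widths(void)
{
    assert(sizeof(HsInt8)   * CHAR_BIT == 8);
    assert(sizeof(HsInt16)  * CHAR_BIT == 16);
    assert(sizeof(HsInt32)  * CHAR_BIT == 32);
    assert(sizeof(HsInt64)  * CHAR_BIT == 64);
    assert(sizeof(HsWord8)  * CHAR_BIT == 8);
    assert(sizeof(HsWord16) * CHAR_BIT == 16);
    assert(sizeof(HsWord32) * CHAR_BIT == 32);
    assert(sizeof(HsWord64) * CHAR_BIT == 64);
}
```

On a platform without <stdint.h> the typedefs would have to name
whatever native types have the right widths, which is the reason for
having the Hs* names at all.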

Should anything be said about equivalences of various Chars?
About Float and Double? About Bool?

-- 
 __("<    Marcin Kowalczyk * [EMAIL PROTECTED] http://qrczak.ids.net.pl/
 \__/              GCS/M d- s+:-- a23 C+++$ UL++>++++$ P+++ L++>++++$ E-
  ^^                  W++ N+++ o? K? w(---) O? M- V? PS-- PE++ Y? PGP+ t
QRCZAK                  5? X- R tv-- b+>++ DI D- G+ e>++++ h! r--%>++ y-

