On Wed, 26 Apr 2006 19:42:46 +0330
"roozbeh gholizadeh" <[EMAIL PROTECTED]> wrote:
> On Wed, 26 Apr 2006 12:12:10 +0330, Marc Weustink
> <[EMAIL PROTECTED]> wrote:
>
> > roozbeh gholizadeh wrote:
> >> On Mon, 24 Apr 2006 10:43:28 +0330, Florian Klaempfl
> >> <[EMAIL PROTECTED]> wrote:
> >>
> >>> Micha Nelissen wrote:
> >>>
> >>>> Florian Klaempfl wrote:
> >>>>
> >>>>> Mattias Gaertner wrote:
> >>>>>
> >>>>>> First of all: 'unicode' is merely a table. The computer needs an
> >>>>>> encoding.
> >>>>>> The LCL supports UTF-8. So, yes, there is already a unicode LCL.
> >>>>>> Probably you want UTF-16 for wince.
> >>>>>
> >>>>>
> >>>>> Maybe it's possible to internally use a type for unicode which is OS
> >>>>> dependent. This requires a lot of code to be rewritten, but it makes it
> >>>>> possible to use the native unicode type on platforms where UTF-8 is
> >>>>> uncommon.
> >>>>
> >>>>
> >>>> How to solve streaming of component text then (LFM etc) ?
> >>>>
> >>>
> >>> Well, data stored in files always needs conversion when working cross
> >>> platform.
> >>>
> >> So if UTF-8 is used, do you mean I should return UTF-8 data from
> >> tedit.text, or convert it to ANSI?
> >> And in that case, how can the user have a tedit with unicode support?
> >
> > I think it should return a UTF-8 string. Then you still have all chars.
> >
>
> So what happens if I assign that text to another label.caption, or write it
> to a file?
> Also, is string1 + string2 supported? I mean, if one is ANSI and the other
> is UTF-8, what happens?
You get rubbish. Of course you cannot mix encodings.
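
As a rough illustration of the "rubbish" (a sketch in Python rather than
Pascal, since the effect is purely byte-level; the sample words and the helper
name decodes_as_utf8 are made up for this example): concatenating an
ISO-8859-1 byte string with a UTF-8 byte string gives bytes that are wrong
under either reading.

```python
# Hypothetical sample data: similar text held in two different encodings.
ansi_part = "café".encode("latin-1")   # ISO-8859-1: one byte per character
utf8_part = "naïve".encode("utf-8")    # UTF-8: 'ï' takes two bytes

# Byte-wise concatenation, which is what string1 + string2 amounts to
# when the strings carry no encoding information.
mixed = ansi_part + utf8_part


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Read as UTF-8 the result is malformed; read as ISO-8859-1 it is mojibake.
mixed_is_valid_utf8 = decodes_as_utf8(mixed)
mixed_as_latin1 = mixed.decode("latin-1")

# Converting both sides to one common representation first gives the
# intended text.
fixed = ansi_part.decode("latin-1") + utf8_part.decode("utf-8")
```

The only safe route is the last one: bring both operands into the same
encoding before adding them.
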
> Overall, I mean: from this point on, should we assume an ansistring always
> contains UTF-8, or can it have both meanings within the LCL interfaces?
No. At least not at the moment.
The gtk1 interface handles strings depending on the font in use. If you use a
UTF-8 font, the strings are treated as UTF-8. If the font uses an ISO encoding,
strings are treated as that ISO encoding. See for example synedit in the IDE.
Under gtk2 there are almost only UTF-8 fonts, so here all strings are UTF-8.
Because the difference between UTF-8 and ISO 8bit encoding matters only for
less than 1% of the LCL code (word boundaries, uppercase and bidi), both
work.
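
To make the uppercase case concrete, here is a small sketch (Python, with
made-up helper names ascii_upper and text_upper; it is not LCL code). Byte-wise
uppercasing of ASCII letters behaves identically for UTF-8 and an ISO 8-bit
encoding, while accented letters only come out right if the code knows which
encoding it is dealing with.

```python
def ascii_upper(data: bytes) -> bytes:
    """Uppercase only the ASCII letters a-z; all other bytes pass through."""
    return data.upper()


def text_upper(data: bytes, encoding: str) -> bytes:
    """Uppercase with knowledge of the encoding: decode, convert, re-encode."""
    return data.decode(encoding).upper().encode(encoding)


plain = "hello"
accented = "été"

# Pure ASCII text: the encoding makes no difference at all.
plain_latin1 = ascii_upper(plain.encode("latin-1"))
plain_utf8 = ascii_upper(plain.encode("utf-8"))

# Accented text: the byte-wise version leaves the accented letters lowercase,
# the encoding-aware version converts them.
naive_utf8 = ascii_upper(accented.encode("utf-8"))
aware_utf8 = text_upper(accented.encode("utf-8"), "utf-8")
aware_latin1 = text_upper(accented.encode("latin-1"), "latin-1")
```
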
Of course widestrings are totally incompatible, but have exactly the same
problems: multi-word UTF-16 characters, word boundaries, uppercase, bidi. And
you cannot add WideString (UTF-16) + WideString (UCS2).
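
The multi-word issue can be shown with a short sketch (Python; the helper
names utf16_code_units and fits_in_ucs2 are assumptions for this example). A
character outside the Basic Multilingual Plane takes two 16-bit words in
UTF-16 and has no representation at all in UCS2.

```python
def utf16_code_units(text: str) -> int:
    """Number of 16-bit words needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def fits_in_ucs2(text: str) -> bool:
    """True if every character fits in a single 16-bit word (UCS2)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


bmp_char = "\u00e9"        # e with acute accent, inside the BMP
clef = "\U0001D11E"        # musical G clef, outside the BMP

bmp_units = utf16_code_units(bmp_char)    # one 16-bit word
clef_units = utf16_code_units(clef)       # a surrogate pair: two words
clef_fits = fits_in_ucs2(clef)            # not representable in UCS2
```

So code that assumes one 16-bit word per character breaks on UTF-16 data in
the same way that code assuming one byte per character breaks on UTF-8.
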
The IDE converts all resourcestrings to the current character set, which is
typically UTF-8 under Linux/Mac/BSD nowadays.
Mattias