On 10/20/06, Graeme Geldenhuys <[EMAIL PROTECTED]> wrote:
Ok, I'll start up front by announcing my ignorance on these two items: UTF-8 and
Unicode.

After reading some discussions of implementing Unicode/UTF-8 support
in Lazarus I thought I would ask.  Could someone give the watered down
explanation to me (and probably others too)?  My mind is trying to
wrap around the two concepts as one issue, but I believe I'm just
getting myself more confused.

Unicode is a standard that gives every character of every human
language, plus math, music and Braille signs, one number (a code
point) in a single character set. (Some have even proposed adding
Klingon and Elvish.) Unicode also comes with a group of standards that
tell you how to deal with the needs of languages, paragraphs and more.
For example, my language, Hebrew, is written from right to left, so to
put numbers or English inside a Hebrew paragraph I need something that
can handle left-to-right and right-to-left text according to the
characters. So one of the Unicode standards is the Bidirectional
Algorithm: how to deal with two (or more) languages that are written
in different directions.
Another standard in Unicode covers Asian languages such as Japanese
(each column and each line has a different meaning, AFAIK) and how to
combine them with non-Asian languages such as English.
And there are more standards that come with "Unicode".


I read in Wikipedia about Unicode and UTF-8, but still it makes no sense.

Also as an example in Object Pascal, what is involved in changing a
function (or app) that uses standard ANSI strings to support UTF-8 or
Unicode or whatever it should be called?

When am I supposed to use String and WideString? Must I change all
references of String to WideString?  Is WideString = UTF-8 or Unicode
or UTF-16 or UCS-2 (whatever the hell that is)?   See my problem... I
am totally lost. :-)

UTF-8, UTF-16 and UTF-32 are different ways of storing Unicode
characters as bytes. AnsiString stores text as single bytes, so it can
hold UTF-8, where one character takes between one and four bytes.
WideString stores text as two-byte WideChars, which is UTF-16. UCS-2
is the older fixed-size two-byte encoding that UTF-16 grew out of; it
cannot represent characters above U+FFFF, while UTF-16 uses two
WideChars (a surrogate pair) for those. UTF-32 uses four bytes (32
bits) for every character.
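
To make the sizes concrete, here is a small sketch, assuming Free
Pascal and its System unit routines UTF8Encode and UTF8Decode. The
program name, the variable names and the three sample characters are
only my choices for illustration. It stores the same text once as
UTF-16 in a WideString and once as UTF-8 in a UTF8String, and prints
how much room each character takes:

```pascal
program EncodingSizes;

{$mode objfpc}{$H+}

var
  W: WideString;   // UTF-16: every element is a 2-byte WideChar
  U: UTF8String;   // UTF-8: every element is a single byte
  i: Integer;
begin
  // Sample text given as code points, so the example does not depend
  // on the encoding of this source file:
  // U+0041 (Latin A), U+05E9 (Hebrew shin), U+20AC (euro sign).
  W := #$0041#$05E9#$20AC;
  U := UTF8Encode(W);

  WriteLn('UTF-16 code units: ', Length(W));
  WriteLn('UTF-16 bytes:      ', Length(W) * SizeOf(WideChar));
  WriteLn('UTF-8 bytes:       ', Length(U));

  // The size of each character separately, in both encodings.
  // Indexing W like this is only safe because all three sample
  // characters are below U+FFFF (one WideChar each).
  for i := 1 to Length(W) do
    WriteLn('code point ', Ord(W[i]), ': ',
            SizeOf(WideChar), ' bytes in UTF-16, ',
            Length(UTF8Encode(Copy(W, i, 1))), ' bytes in UTF-8');

  // Converting back gives the original text again.
  if UTF8Decode(U) = W then
    WriteLn('round trip ok');
end.
```

In UTF-8 the 'A' takes one byte, the Hebrew letter two and the euro
sign three, while in UTF-16 each of them takes one 2-byte WideChar.
Characters above U+FFFF are left out here; in UTF-16 they would need
two WideChars each.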

I hope that helps to give you a start with Unicode :)



Example:

function MyFooBar(const AStrValue: String): String;
begin
  // .... do whatever in here
  Result := <some string value>;
end;

....

var
  s, r: String;
begin
   s := 'Graeme';
   r := MyFooBar(s);
   ...
end;


Many thanks in advance,
  - Graeme -

--
There's no place like 127.0.0.1



Ido
--
http://ik.homelinux.org/
