Re: [fpc-devel] lazarus bug report + fix: Utf8ToUnicode doesn't work correctly

2005-05-04 Thread Danny Milosavljevic
Hi,

On Wednesday, 04.05.2005 at 12:49 +0200, Michael Van Canneyt wrote:
> 
> On Wed, 4 May 2005, Jonas Maebe wrote:
> 
> > 
> > On 4 May 2005, at 12:04, Michael Van Canneyt wrote:
> > 
> > > > It contains a fixed version of the Utf8ToUnicode function. Since it
> > > > is part of the rtl, I close this lazarus issue and send you this
> > > > message. I did not test the fixed version.
> > > 
> > > The files in the zip file are not usable; They're in some unicode
> > > format, which I can't use nor check on Linux.
> > 
> > They're plain UTF-8. I know for a fact there are editors under Linux which
> > support that (at least emacs does, and it would surprise me immensely if vim
> > doesn't). Anyway, here's the plain ascii version of the "-fixed" file.
> 
> I already got it from Vincent and applied the patch. But:
> - diff doesn't grok unicode.
diff doesn't need to, it's line-based anyway. If you mean displaying
unicode, check the widechar xterm compilation option (uxterm) [or, for the
text console, the unicode_start script and loadkeys -u].

> - "joe" doesn't grok it either.
joe 3 does (joe 3 rocks in other ways too :))

> - "kate" & "kwrite" - no grok.
> - PFE (Windows) doesn't grok it either.   
> - Apparently vim converts to plain ascii before editing.
> - Last but not least: The FPC Compiler also doesn't grok unicode :-)
I didn't notice that it doesn't yet ... but maybe the char<->character
weirdness appears with utf-8 (which cannot be fixed except by fixing the
language definition itself).

> 
> So much for unicode :/

It used to be a hell of a lot of work to set up unicode to work with
linux, but nowadays distros support it out of the box :) (except
gentoo :P although I have a patch with all the necessary bits already
done open with them)

What distro do you use?

> 
> Michael.

cheers,
   Danny

-- 
www.keyserver.net key id A334AEA6



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] lazarus bug report + fix: Utf8ToUnicode doesn't work correctly

2005-05-04 Thread Michael Van Canneyt


On Wed, 4 May 2005, Michalis Kamburelis wrote:

> Michael Van Canneyt wrote:
> > 
> > On Wed, 4 May 2005, Jonas Maebe wrote:
> > 
> > 
> > > On 4 May 2005, at 12:04, Michael Van Canneyt wrote:
> > > 
> > > 
> > > > > It contains a fixed version of the Utf8ToUnicode function.
> > > > > Since it is part of the rtl, I close this lazarus issue and
> > > > > send you this message. I did not test the fixed version.
> > > > 
> > > > The files in the zip file are not usable; They're in some unicode
> > > > format, which I can't use nor check on Linux.
> > > 
> > > They're plain UTF-8. I know for a fact there are editors under Linux
> > > which support that (at least emacs does, and it would surprise me
> > > immensely if vim doesn't). Anyway, here's the plain ascii version of
> > > the "-fixed" file.
> > 
> > 
> > I already got it from Vincent and applied the patch. But:
> > - diff doesn't grok unicode.
> > - "joe" doesn't grok it either.
> > - "kate" & "kwrite" - no grok.
> > - PFE (Windows) doesn't grok it either.
> > - Apparently vim converts to plain ascii before editing.
> 
> I didn't look at this patch, but if it's really UTF-8 then you can open the
> patch in Emacs (use `C-x RET c utf-8 RET C-x C-f FILENAME RET' to make
> sure that utf-8 will be understood), make sure you're in diff-mode (`M-x
> diff-mode' if necessary), then `C-c C-a' (or `M-x diff-apply-hunk') will apply
> hunks of the patch. I don't know whether it solves the problems here (especially
> since FPC doesn't grok unicode...) but it's a painless way to apply a patch
> encoded in UTF-8.

Except for a small flaw: I don't have emacs installed :-)

Michael.



Re: [fpc-devel] lazarus bug report + fix: Utf8ToUnicode doesn't work correctly

2005-05-04 Thread Michalis Kamburelis
Michael Van Canneyt wrote:
> On Wed, 4 May 2005, Jonas Maebe wrote:
>
> > On 4 May 2005, at 12:04, Michael Van Canneyt wrote:
> >
> > > > It contains a fixed version of the Utf8ToUnicode function. Since it
> > > > is part of the rtl, I close this lazarus issue and send you this
> > > > message. I did not test the fixed version.
> > >
> > > The files in the zip file are not usable; They're in some unicode
> > > format, which I can't use nor check on Linux.
> >
> > They're plain UTF-8. I know for a fact there are editors under Linux which
> > support that (at least emacs does, and it would surprise me immensely if vim
> > doesn't). Anyway, here's the plain ascii version of the "-fixed" file.
>
> I already got it from Vincent and applied the patch. But:
> - diff doesn't grok unicode.
> - "joe" doesn't grok it either.
> - "kate" & "kwrite" - no grok.
> - PFE (Windows) doesn't grok it either.
> - Apparently vim converts to plain ascii before editing.

I didn't look at this patch, but if it's really UTF-8 then you can open
the patch in Emacs (use `C-x RET c utf-8 RET C-x C-f FILENAME RET'
to make sure that utf-8 will be understood), make sure you're in
diff-mode (`M-x diff-mode' if necessary), then `C-c C-a' (or `M-x
diff-apply-hunk') will apply hunks of the patch. I don't know whether it
solves the problems here (especially since FPC doesn't grok unicode...)
but it's a painless way to apply a patch encoded in UTF-8.

> - Last but not least: The FPC Compiler also doesn't grok unicode :-)
>
> So much for unicode :/
>
> Michael.


Re: [fpc-devel] lazarus bug report + fix: Utf8ToUnicode doesn't work correctly

2005-05-04 Thread Tomas Hajny
On Wed, May 4, 2005 12:07, Jonas Maebe said:
>
> On 4 May 2005, at 12:04, Michael Van Canneyt wrote:
>
>>> It contains a fixed version of the Utf8ToUnicode function. Since it
>>> is part of the rtl, I close this lazarus issue and send you this
>>> message. I did not test the fixed version.
>>
>> The files in the zip file are not usable; They're in some unicode
>> format, which I can't use nor check on Linux.
>
> They're plain UTF-8.

Aren't they rather UCS-2 (as used under Win32)? I think that UTF-8 only
uses 1 byte for ASCII 0-127, whereas these files have 0 after every single
character.

Tomas
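
The observation above (a zero byte after every single character) can be checked mechanically. What follows is only a minimal sketch, not anything from the attached files: the function name LooksLikeUcs2LE and both sample byte arrays are made up for illustration, and the test only recognises ASCII-range text stored as little-endian UCS-2.

```pascal
program GuessEncoding;

{$mode objfpc}{$H+}

{ Hypothetical helper: returns True when every second byte of Buf is zero,
  which is what ASCII-range text looks like when stored as little-endian
  UCS-2. A leading FF FE byte order mark, if present, is skipped. }
function LooksLikeUcs2LE(const Buf: array of Byte): Boolean;
var
  i, Start: Integer;
begin
  Result := False;
  Start := 0;
  if (Length(Buf) >= 2) and (Buf[0] = $FF) and (Buf[1] = $FE) then
    Start := 2;
  { Need at least one full 2-byte unit and an even number of bytes. }
  if (Length(Buf) - Start < 2) or (((Length(Buf) - Start) mod 2) <> 0) then
    Exit;
  { Check the high byte of every 2-byte unit. }
  i := Start + 1;
  while i < Length(Buf) do
  begin
    if Buf[i] <> 0 then
      Exit;
    Inc(i, 2);
  end;
  Result := True;
end;

const
  { 'begin' as UTF-8: one byte per ASCII character. }
  Utf8Sample: array[0..4] of Byte = ($62, $65, $67, $69, $6E);
  { 'begin' as UCS-2 little-endian: each character followed by a zero byte. }
  Ucs2Sample: array[0..9] of Byte =
    ($62, $00, $65, $00, $67, $00, $69, $00, $6E, $00);
begin
  WriteLn('UTF-8 sample looks like UCS-2 LE: ', LooksLikeUcs2LE(Utf8Sample));
  WriteLn('UCS-2 sample looks like UCS-2 LE: ', LooksLikeUcs2LE(Ucs2Sample));
end.
```

A file that passes this check would explain why tools expecting one byte per ASCII character choke on it, whereas genuine UTF-8 source containing only ASCII is byte-for-byte identical to plain ascii.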




Re: [fpc-devel] lazarus bug report + fix: Utf8ToUnicode doesn't work correctly

2005-05-04 Thread Michael Van Canneyt


On Wed, 4 May 2005, Jonas Maebe wrote:

> 
> On 4 May 2005, at 12:04, Michael Van Canneyt wrote:
> 
> > > It contains a fixed version of the Utf8ToUnicode function. Since it
> > > is part of the rtl, I close this lazarus issue and send you this
> > > message. I did not test the fixed version.
> > 
> > The files in the zip file are not usable; They're in some unicode format,
> > which I can't use nor check on Linux.
> 
> They're plain UTF-8. I know for a fact there are editors under Linux which
> support that (at least emacs does, and it would surprise me immensely if vim
> doesn't). Anyway, here's the plain ascii version of the "-fixed" file.

I already got it from Vincent and applied the patch. But:
- diff doesn't grok unicode.
- "joe" doesn't grok it either.
- "kate" & "kwrite" - no grok.
- PFE (Windows) doesn't grok it either.   
- Apparently vim converts to plain ascii before editing.
- Last but not least: The FPC Compiler also doesn't grok unicode :-)

So much for unicode :/

Michael.



Re: [fpc-devel] lazarus bug report + fix: Utf8ToUnicode doesn't work correctly

2005-05-04 Thread Jonas Maebe
On 4 May 2005, at 12:04, Michael Van Canneyt wrote:

>> It contains a fixed version of the Utf8ToUnicode function. Since it
>> is part of the rtl, I close this lazarus issue and send you this
>> message. I did not test the fixed version.
>
> The files in the zip file are not usable; They're in some unicode
> format, which I can't use nor check on Linux.

They're plain UTF-8. I know for a fact there are editors under Linux
which support that (at least emacs does, and it would surprise me
immensely if vim doesn't). Anyway, here's the plain ascii version of
the "-fixed" file.

Jonas
function Utf8ToUnicode(Dest: PWideChar; MaxDestChars: SizeUInt;
  Source: PChar; SourceBytes: SizeUInt): SizeUInt;
  var
    i,j : SizeUInt;
    w: SizeUInt;
    b : byte;
  begin
    if not assigned(Source) then
      begin
        result:=0;
        exit;
      end;
    result:=SizeUInt(-1);
    i:=0;
    j:=0;
    if assigned(Dest) then
      begin
        while (j<MaxDestChars) and (i<SourceBytes) do
          begin
            b:=byte(Source[i]);
            w:=b;
            inc(i);
            // 2 or 3 bytes?
            if b>=$80 then
              begin
                w:=b and $3f;
                if i>=SourceBytes then
                  exit;
                // 3 bytes?
                if (b and $20)<>0 then
                  begin
                    b:=byte(Source[i]);
                    inc(i);
                    if i>=SourceBytes then
                      exit;
                    if (b and $c0)<>$80 then
                      exit;
                    w:=(w shl 6) or (b and $3f);
                  end;
                b:=byte(Source[i]);
                w:=(w shl 6) or (b and $3f);
                if (b and $c0)<>$80 then
                  exit;
                inc(i);
              end;
            Dest[j]:=WideChar(w);
            inc(j);
          end;
        if j>=MaxDestChars then j:=MaxDestChars-1;
        Dest[j]:=#0;
      end
    else
      begin
        while i<SourceBytes do
          begin
            b:=byte(Source[i]);
            inc(i);
            // 2 or 3 bytes?
            if b>=$80 then
              begin
                if i>=SourceBytes then
                  exit;
                // 3 bytes?
                b := b and $3f;
                if (b and $20)<>0 then
                  begin
                    b:=byte(Source[i]);
                    inc(i);
                    if i>=SourceBytes then
                      exit;
                    if (b and $c0)<>$80 then
                      exit;
                  end;
                if (byte(Source[i]) and $c0)<>$80 then
                  exit;
                inc(i);
              end;
            inc(j);
          end;
      end;
    result:=j+1;
  end;
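
For anyone who wants to try the routine, here is a rough usage sketch of the two-pass pattern the code above suggests: call once with Dest = nil to count, then again with a buffer. It assumes the Utf8ToUnicode that the program links against behaves like the posted version (result is the character count plus one for the terminator, 0 for a nil Source, SizeUInt(-1) on malformed input); an RTL version may differ, for example in how it reports malformed input. The program name, the sample bytes and the variable names are made up for illustration.

```pascal
program Utf8TwoPass;

{$mode objfpc}{$H+}

const
  { UTF-8 input: 'a' (61), U+00E9 (C3 A9), U+20AC (E2 82 AC). }
  Src: array[0..5] of Char = (#$61, #$C3, #$A9, #$E2, #$82, #$AC);
var
  Buf: array of WideChar;
  Needed, Written: SizeUInt;
  k: Integer;
begin
  { Pass 1: with Dest = nil the routine only counts. }
  Needed := Utf8ToUnicode(nil, 0, PChar(@Src[0]), Length(Src));
  if (Needed = 0) or (Needed = High(SizeUInt)) then
  begin
    WriteLn('input missing or malformed');
    Halt(1);
  end;

  { Pass 2: allocate Needed WideChars (data plus terminator) and convert. }
  SetLength(Buf, Needed);
  Written := Utf8ToUnicode(@Buf[0], Needed, PChar(@Src[0]), Length(Src));
  if (Written = 0) or (Written = High(SizeUInt)) then
    Halt(1);

  { Written counts the terminator, so the data occupies Written - 1 slots. }
  for k := 0 to Integer(Written) - 2 do
    WriteLn('U+', HexStr(LongInt(Ord(Buf[k])), 4));
end.
```

Note that the posted code leaves all six payload bits of the lead byte in w, so a 3-byte sequence only yields the right value because WideChar(w) keeps the low 16 bits; with range checking enabled that cast may behave differently.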



Re: [fpc-devel] lazarus bug report + fix: Utf8ToUnicode doesn't work correctly

2005-05-04 Thread Michael Van Canneyt


On Wed, 4 May 2005, Vincent Snijders wrote:

> Hi,
> 
> Can you take a look at this issue reported at the lazarus bug tracker:
> http://www.lazarus.freepascal.org/mantis/view.php?id=888
> 
> It contains a fixed version of the Utf8ToUnicode function. Since it is part of
> the rtl, I close this lazarus issue and send you this message. I did not test
> the fixed version.

The files in the zip file are not usable; They're in some unicode format, which 
I can't use nor check on Linux.

Michael.
