Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Michael Van Canneyt



On Fri, 25 Mar 2016, Graeme Geldenhuys wrote:

> On 2016-03-25 12:23, Michael Van Canneyt wrote:
>> Correction, this particular function does not depend on cwstrings.
>
> When you say "this particular function" you are referring to the
> UTF8Decode() function, correct?
>
> The documentation page for UTF8Decode has explicitly removed the
> reference [that it requires a widestring manager] that was there before...
>
> http://www.freepascal.org/docs-html/current/rtl/system/utf8decode.html
>
> But, it does mention that it uses the low-level Utf8ToUnicode()
> function. Now let's see that function's documentation.
>
> http://www.freepascal.org/docs-html/current/rtl/system/utf8tounicode.html
>
> And here it mentions that a widestring manager IS required for it to
> function.

That statement in the documentation is wrong, I will correct it.

Encoding/Decoding UTF-8 to/from UTF16 is just shuffling bits.
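
To make the "shuffling bits" remark concrete, here is a minimal sketch of a UTF-8 to UTF-16 conversion written in plain Pascal, with no widestring manager and no lookup tables. It is an illustration only, not the RTL's implementation: the names DecodeUtf8 and AppendUtf16 are hypothetical, and the decoder is simplified (it does not reject overlong forms, encoded surrogates, or values above U+10FFFF).

```pascal
program utf8_to_utf16_sketch;
{$mode objfpc}{$H+}

// Decode one UTF-8 sequence starting at p^ into a code point.
// Returns the number of bytes consumed, or 0 on malformed or truncated input.
function DecodeUtf8(p: PByte; avail: SizeInt; out cp: LongWord): SizeInt;
var
  n, i: SizeInt;
begin
  Result := 0;
  cp := 0;
  if avail < 1 then Exit;
  // The lead byte tells us the sequence length and carries the top bits.
  if p[0] < $80 then begin cp := p[0]; n := 1; end
  else if (p[0] and $E0) = $C0 then begin cp := p[0] and $1F; n := 2; end
  else if (p[0] and $F0) = $E0 then begin cp := p[0] and $0F; n := 3; end
  else if (p[0] and $F8) = $F0 then begin cp := p[0] and $07; n := 4; end
  else Exit;
  if n > avail then Exit;
  // Each continuation byte (10xxxxxx) contributes six more bits.
  for i := 1 to n - 1 do
  begin
    if (p[i] and $C0) <> $80 then Exit;
    cp := (cp shl 6) or (p[i] and $3F);
  end;
  Result := n;
end;

// Append a code point as UTF-16, using a surrogate pair above U+FFFF.
procedure AppendUtf16(var w: UnicodeString; cp: LongWord);
begin
  if cp < $10000 then
    w := w + WideChar(cp)
  else
  begin
    Dec(cp, $10000);
    w := w + WideChar($D800 or (cp shr 10)) + WideChar($DC00 or (cp and $3FF));
  end;
end;

const
  // 'A', U+00E9 and U+1F600 encoded as UTF-8 bytes
  Src: array[0..6] of Byte = ($41, $C3, $A9, $F0, $9F, $98, $80);
var
  w: UnicodeString;
  cp: LongWord;
  i, n: SizeInt;
begin
  w := '';
  i := 0;
  while i < Length(Src) do
  begin
    n := DecodeUtf8(@Src[i], Length(Src) - i, cp);
    if n = 0 then Break;   // stop at the first malformed sequence
    AppendUtf16(w, cp);
    Inc(i, n);
  end;
  for i := 1 to Length(w) do
    Write(HexStr(Ord(w[i]), 4), ' ');
  WriteLn;
end.
```

For the sample bytes this should print the four code units 0041 00E9 D83D DE00. Only masks and shifts are involved, which is why no Unicode tables are needed for this direction of conversion.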

Michael.
___
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Graeme Geldenhuys
On 2016-03-25 12:23, Michael Van Canneyt wrote:
> Correction, this particular function does not depend on cwstrings.

When you say "this particular function" you are referring to the
UTF8Decode() function, correct?

The documentation page for UTF8Decode has explicitly removed the
reference [that it requires a widestring manager] that was there before...

http://www.freepascal.org/docs-html/current/rtl/system/utf8decode.html

But, it does mention that it uses the low-level Utf8ToUnicode()
function. Now let's see that function's documentation.

http://www.freepascal.org/docs-html/current/rtl/system/utf8tounicode.html

And here it mentions that a widestring manager IS required for it to
function.

So if UTF8Decode depends on UTF8ToUnicode, then by definition UTF8Decode
also depends on a widestring manager.

Regards,
  - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key:  http://tinyurl.com/graeme-pgp


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Juha Manninen
On Fri, Mar 25, 2016 at 7:14 PM, Bart wrote:
> It's just a define to signal that all strings in LCL are UTF8 and when
> offered to RTL their codepage is CP_UTF8.

Not only in LCL. The LazUTF8 unit from the LazUtils package can also be used
without LCL.

http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus#Using_UTF-8_in_non_LCL_programs

It means that when Graeme finally switches to FPC 3.x and he uses
LazUTF8 in his code, he gets cwstring as an extra bonus.

Regards,
Juha


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Bart
On 3/25/16, Felipe Monteiro de Carvalho wrote:

> Important part you are forgetting: {$IFDEF UTF8_RTL}
>
> I don't know why it is needed in the utf-8 RTL, since I haven't used
> this RTL yet, but in the RTL that I am using it doesn't depend on that
> unit :)

It's just a define to signal that all strings in LCL are UTF8 and when
offered to RTL their codepage is CP_UTF8.
When DisableUtf8RTL is defined, all strings are CP_ACP.

The name of the define may indeed be a little misleading, but it's short.
We have to cater for 3 different situations:
- default: we set DefaultSystemCodepage to CP_UTF8 (on Windows): UTF8_RTL
- DisableUtf8RTL defined: ACP_RTL
- Fpc without cp-string: NO_CP_RTL
(See ($lazarus)\components\lazutils\lazutils_defines.inc)

Bart


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Felipe Monteiro de Carvalho
On Fri, Mar 25, 2016 at 3:16 PM, Michael Van Canneyt wrote:
> "lazutf8 doesn't depending" when it is in the uses clause, sounds a bit
> strange to me :-)

Important part you are forgetting: {$IFDEF UTF8_RTL}

I don't know why it is needed in the utf-8 RTL, since I haven't used
this RTL yet, but in the RTL that I am using it doesn't depend on that
unit :)

Anyway, what I meant is that the routines themselves are Pascal
implementations of the Unicode standard. We even have
uppercase/lowercase tables. So we depend as little as possible on
system stuff. More reliable, more cross-platform and some routines
actually are several times faster than system ones.

-- 
Felipe Monteiro de Carvalho


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Martin Schreiber
On Friday 25 March 2016 15:37:36 Graeme Geldenhuys wrote:
> On 2016-03-25 14:06, Martin Schreiber wrote:
> > You can use the MSEgui functions in lib/common/msestrings.pas
>
> Thanks, but doesn't MSEgui also use cwstrings?
>
Not for utf-8 <-> utf-16 conversion. The MSEgui version of cwstring also maps
unicodemanager conversion functions with cp_utf8 to the internal MSEgui
functions instead of calling iconv.

Martin


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Graeme Geldenhuys
On 2016-03-25 14:06, Martin Schreiber wrote:
> You can use the MSEgui functions in lib/common/msestrings.pas

Thanks, but doesn't MSEgui also use cwstrings?

Regards,
  - Graeme -



Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Michael Van Canneyt



On Fri, 25 Mar 2016, Felipe Monteiro de Carvalho wrote:

> On Fri, Mar 25, 2016 at 2:01 PM, Michael Van Canneyt wrote:
>> Look at the sources
>
> Which proves me right, or do I miss something?

"lazutf8 doesn't depending" when it is in the uses clause,
sounds a bit strange to me :-)


Michael.


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Marco van de Voort
In our previous episode, Graeme Geldenhuys said:
> >> > Yes, this is correct.
> > Correction, this particular function does not depend on cwstrings.
> > All the other widestring (uppercase, compare etc) functions do.
> 
> Ok, thanks for that.
 
> Is there an easy way to see when a RTL function requires cwstrings to
> function correctly? Is it mentioned in the RTL documentation? Is looking
> at the RTL source code the only way to find that out?

Yes, I think looking at the source is the only way. But in this case, because
UTF-8 to UTF-16 conversion doesn't require tables, it makes sense that it
doesn't need a Unicode library implementation.

As soon as you start interpreting, comparing, or mutating characters, you need
tables, and those are better taken from the OS (or at least kept optional
for small programs that only use sysutils to delete a file or so).
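
As a rough illustration of the second case, the sketch below pulls in the cwstring unit (the unit is named cwstring, without the trailing "s") on Unix only, before calling a routine that has to interpret characters. The program name is made up, and the result of WideUpperCase depends on the C library and the active locale, so no particular output is promised here.

```pascal
program upper_sketch;
{$mode objfpc}{$H+}
uses
  {$ifdef unix}
  cwstring,  // installs a widestring manager backed by the C library
  {$endif}
  SysUtils;
var
  w: UnicodeString;
begin
  w := WideChar($00E9);     // U+00E9, built numerically to avoid source-codepage issues
  w := WideUpperCase(w);    // needs case tables, hence the widestring manager
  WriteLn('$', HexStr(Ord(w[1]), 4));
end.
```

On Windows the conditional leaves cwstring out, which matches the assumption in this thread that the OS already supplies the needed support there.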


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Martin Schreiber
On Friday 25 March 2016 14:48:18 Graeme Geldenhuys wrote:
> On 2016-03-25 12:20, Bart wrote:
> > If you're using LazUtf8 (or use LCL) then cwstring will be used in your
> > app.
>
> I don't use LCL at all, pure RTL & FCL code only. Based on the fact that
> LCL's code also requires "cwstrings" I assume my original assumption is
> correct, that if I want to do any UTF8-to-UTF16 conversions, use
> UTF8Decode etc, my applications (or frameworks) require "cwstrings" for
> now.
>
You can use the MSEgui functions in lib/common/msestrings.pas (stringtoutf8(),
stringtoutf8ansi(), utf8tostring(), utf8tostringansi()). AFAIK both LCL and
Free Pascal RTL also have such functions.

Martin


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Graeme Geldenhuys
On 2016-03-25 12:23, Michael Van Canneyt wrote:
>> > Yes, this is correct.
> Correction, this particular function does not depend on cwstrings.
> All the other widestring (uppercase, compare etc) functions do.


Ok, thanks for that.

Is there an easy way to see when a RTL function requires cwstrings to
function correctly? Is it mentioned in the RTL documentation? Is looking
at the RTL source code the only way to find that out?  Or does the
compiler in some way give a compilation hint that some RTL functions
will not function (because I might have left out cwstrings in a project).

Regards,
  - Graeme -



Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Felipe Monteiro de Carvalho
On Fri, Mar 25, 2016 at 2:01 PM, Michael Van Canneyt wrote:
> Look at the sources

Which proves me right, or do I miss something?

-- 
Felipe Monteiro de Carvalho


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Graeme Geldenhuys
On 2016-03-25 12:20, Bart wrote:
> If you're using LazUtf8 (or use LCL) then cwstring will be used in your app.


I don't use LCL at all, pure RTL & FCL code only. Based on the fact that
LCL's code also requires "cwstrings", I assume my original assumption is
correct: if I want to do any UTF8-to-UTF16 conversions (UTF8Decode etc.),
my applications (or frameworks) require "cwstrings" for now.


Regards,
  - Graeme -



Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Michael Van Canneyt



On Fri, 25 Mar 2016, Felipe Monteiro de Carvalho wrote:

> On Fri, Mar 25, 2016 at 1:20 PM, Bart wrote:
>> If you're using LazUtf8 (or use LCL) then cwstring will be used in your app.
>> And I guess that Utf8ToUtf16 from Lazutf8 does not depend on a WS
>> manager, but I may be terribly wrong about that.
>
> As far as I remember, lazutf8 doesn't depending on cwstring for
> (most?) of its functionality.

Look at the sources

uses
  {$IFDEF UTF8_RTL}
  {$ifdef unix}
  cwstring, // UTF8 RTL on Unix requires this. Must be used although it pulls in clib.
  {$endif}

Michael.


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Felipe Monteiro de Carvalho
On Fri, Mar 25, 2016 at 1:20 PM, Bart wrote:
> If you're using LazUtf8 (or use LCL) then cwstring will be used in your app.
> And I guess that Utf8ToUtf16 from Lazutf8 does not depend on a WS
> manager, but I may be terribly wrong about that.

As far as I remember, lazutf8 doesn't depending on cwstring for
(most?) of its functionality.

-- 
Felipe Monteiro de Carvalho


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Michael Van Canneyt



On Fri, 25 Mar 2016, Michael Van Canneyt wrote:

> On Fri, 25 Mar 2016, Graeme Geldenhuys wrote:
>
>> Hi,
>>
>> I'm using FPC 2.6.4 primarily. Am I correct in that UTF8Decode and most
>> (if not all) UTF8-to-UTF16 conversions don't function correctly (or not
>> at all) if you don't include the cwstrings unit in your project? I'm
>> referring to Unix-based OSes here. I believe Windows automatically
>> includes the WideString Manager for you.
>
> Yes, this is correct.

Correction, this particular function does not depend on cwstrings.
All the other widestring (uppercase, compare etc) functions do.
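
A small test program along these lines could be used to check the claim on a given FPC version. It is a sketch only: the program name is made up, and it deliberately leaves cwstring out of the project.

```pascal
program utf8decode_check;
{$mode objfpc}{$H+}
// Deliberately no cwstring in any uses clause.
var
  s: UTF8String;
  w: UnicodeString;
begin
  // Build the two UTF-8 bytes of U+00E9 by hand so that no
  // source-codepage conversion of a string literal is involved.
  SetLength(s, 2);
  s[1] := #$C3;
  s[2] := #$A9;
  w := UTF8Decode(s);
  WriteLn('UTF-16 code units: ', Length(w));
  if Length(w) > 0 then
    WriteLn('first unit: $', HexStr(Ord(w[1]), 4));
end.
```

If the conversion really is independent of the widestring manager, this should report one code unit with the value $00E9 on Unix as well.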

Michael.


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Michael Van Canneyt



On Fri, 25 Mar 2016, Graeme Geldenhuys wrote:

> Hi,
>
> I'm using FPC 2.6.4 primarily. Am I correct in that UTF8Decode and most
> (if not all) UTF8-to-UTF16 conversions don't function correctly (or not
> at all) if you don't include the cwstrings unit in your project? I'm
> referring to Unix-based OSes here. I believe Windows automatically
> includes the WideString Manager for you.

Yes, this is correct.

Michael.


Re: [fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Bart
On 3/25/16, Graeme Geldenhuys wrote:

> I'm using FPC 2.6.4 primarily. Am I correct in that UTF8Decode and most
> (if not all) UTF8-to-UTF16 conversions don't function correctly (or not
> at all) if you don't include the cwstrings unit in your project? I'm
> referring to Unix-based OSes here.

If you're using LazUtf8 (or use LCL) then cwstring will be used in your app.
And I guess that Utf8ToUtf16 from Lazutf8 does not depend on a WS
manager, but I may be terribly wrong about that.

Bart


[fpc-pascal] cwstrings unit and UTF8Decode()

2016-03-25 Thread Graeme Geldenhuys
Hi,

I'm using FPC 2.6.4 primarily. Am I correct in that UTF8Decode and most
(if not all) UTF8-to-UTF16 conversions don't function correctly (or not
at all) if you don't include the cwstrings unit in your project? I'm
referring to Unix-based OSes here. I believe Windows automatically
includes the WideString Manager for you.

Regards,
  - Graeme -
