Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On Fri, 25 Mar 2016, Graeme Geldenhuys wrote:

> On 2016-03-25 12:23, Michael Van Canneyt wrote:
>> Correction, this particular function does not depend on cwstrings.
>
> When you say "this particular function" you are referring to the UTF8Decode() function correct?
>
> The documentation page for UTF8Decode has explicitly removed the reference [that it requires a widestring manager] that was there before...
>
> http://www.freepascal.org/docs-html/current/rtl/system/utf8decode.html
>
> But, it does mention that it uses the low-level Utf8ToUnicode() function. Now lets see that function's documentation.
>
> http://www.freepascal.org/docs-html/current/rtl/system/utf8tounicode.html
>
> And here it mentions that a widestring manager IS required for it to function.

This is wrong, I will correct that. Encoding/decoding UTF-8 to/from UTF-16 is just shuffling bits.

Michael.
___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
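To illustrate the "just shuffling bits" remark, here is a minimal sketch of decoding UTF-8 into UTF-16 code units with nothing but shifts and masks, so no tables and no widestring manager are involved. DecodeUtf8Sketch is a hypothetical helper written for this note, not the RTL's Utf8ToUnicode() implementation. It assumes well-formed input and does no validation (no checks for overlong forms, stray continuation bytes, or truncation).

```pascal
program utf8bits;

{$mode objfpc}{$H+}

// Hypothetical helper (not part of the RTL): decodes well-formed UTF-8
// into UTF-16 code units using only shifts and masks.
function DecodeUtf8Sketch(const S: AnsiString): UnicodeString;
var
  i, n: Integer;
  b: Byte;
  cp: Cardinal;
begin
  Result := '';
  i := 1;
  while i <= Length(S) do
  begin
    b := Ord(S[i]);
    Inc(i);
    // The lead byte tells us how many continuation bytes follow.
    if b < $80 then
    begin
      cp := b; n := 0;
    end
    else if (b and $E0) = $C0 then
    begin
      cp := b and $1F; n := 1;
    end
    else if (b and $F0) = $E0 then
    begin
      cp := b and $0F; n := 2;
    end
    else
    begin
      cp := b and $07; n := 3;
    end;
    // Each continuation byte contributes its low 6 bits.
    while (n > 0) and (i <= Length(S)) do
    begin
      cp := (cp shl 6) or (Ord(S[i]) and $3F);
      Inc(i);
      Dec(n);
    end;
    if cp < $10000 then
      Result := Result + WideChar(Word(cp))
    else
    begin
      // Code points above the BMP become a surrogate pair.
      Dec(cp, $10000);
      Result := Result + WideChar(Word($D800 + (cp shr 10)));
      Result := Result + WideChar(Word($DC00 + (cp and $3FF)));
    end;
  end;
end;

var
  U: UnicodeString;
  k: Integer;
begin
  // U+00E9 (2 bytes) followed by U+1F600 (4 bytes, needs a surrogate pair)
  U := DecodeUtf8Sketch(#$C3#$A9#$F0#$9F#$98#$80);
  for k := 1 to Length(U) do
    WriteLn(HexStr(Ord(U[k]), 4));
end.
```

For the sample input this yields three code units: 00E9, then the surrogate pair D83D DE00. The code units are printed as hex rather than written as text, because writing a UnicodeString to the console would itself involve a conversion.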
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On 2016-03-25 12:23, Michael Van Canneyt wrote:
> Correction, this particular function does not depend on cwstrings.

When you say "this particular function", you are referring to the UTF8Decode() function, correct?

The documentation page for UTF8Decode has explicitly removed the reference [that it requires a widestring manager] that was there before...

http://www.freepascal.org/docs-html/current/rtl/system/utf8decode.html

But it does mention that it uses the low-level Utf8ToUnicode() function. Now let's see that function's documentation.

http://www.freepascal.org/docs-html/current/rtl/system/utf8tounicode.html

And here it mentions that a widestring manager IS required for it to function.

So if UTF8Decode depends on UTF8ToUnicode, then by definition UTF8Decode also depends on a widestring manager.

Regards,
  - Graeme -

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key: http://tinyurl.com/graeme-pgp
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On Fri, Mar 25, 2016 at 7:14 PM, Bart wrote:
> It's just a define to signal that all strings in LCL are UTF8 and when
> offered to RTL their codepage is CP_UTF8.

Not only in LCL. Package LazUtils / unit LazUTF8 can also be used without LCL.

http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus#Using_UTF-8_in_non_LCL_programs

It means that when Graeme finally switches to FPC 3.x and uses LazUTF8 in his code, he gets cwstring as an extra bonus.

Regards,
Juha
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On 3/25/16, Felipe Monteiro de Carvalho wrote:
> Important part you are forgetting: {$IFDEF UTF8_RTL}
>
> I don't know why it is needed in the utf-8 RTL, since I haven't used
> this RTL yet, but in the RTL that I am using it doesn't depend in that
> unit :)

It's just a define to signal that all strings in LCL are UTF8 and that, when offered to the RTL, their codepage is CP_UTF8. When DisableUtf8RTL is defined, all strings are CP_ACP. The name of the define may indeed be a little misleading, but it's short.

We have to cater for 3 different situations:
- default: we set DefaultSystemCodepage to CP_UTF8 (on Windows): UTF8_RTL
- DisableUtf8RTL defined: ACP_RTL
- Fpc without cp-string: NO_CP_RTL

(See ($lazarus)\components\lazutils\lazutils_defines.inc)

Bart
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On Fri, Mar 25, 2016 at 3:16 PM, Michael Van Canneyt wrote:
> "lazutf8 doesn't depending" when it is in the uses clause, sounds a bit
> strange to me :-)

Important part you are forgetting: {$IFDEF UTF8_RTL}

I don't know why it is needed in the utf-8 RTL, since I haven't used this RTL yet, but in the RTL that I am using it doesn't depend on that unit :)

Anyway, what I meant is that the routines themselves are Pascal implementations of the Unicode standard. We even have uppercase/lowercase tables. So we depend as little as possible on system stuff. That is more reliable and more cross-platform, and some routines are actually several times faster than the system ones.

--
Felipe Monteiro de Carvalho
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On Friday 25 March 2016 15:37:36 Graeme Geldenhuys wrote:
> On 2016-03-25 14:06, Martin Schreiber wrote:
> > You can use the MSEgui functions in lib/common/msestrings.pas
>
> Thanks, but doesn't MSEgui also use cwstrings?

Not for utf-8 <-> utf-16 conversion. The MSEgui version of cwstring also maps the unicodemanager conversion functions with cp_utf8 to the internal MSEgui functions instead of calling iconv.

Martin
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On 2016-03-25 14:06, Martin Schreiber wrote:
> You can use the MSEgui functions in lib/common/msestrings.pas

Thanks, but doesn't MSEgui also use cwstrings?

Regards,
  - Graeme -
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On Fri, 25 Mar 2016, Felipe Monteiro de Carvalho wrote:

> On Fri, Mar 25, 2016 at 2:01 PM, Michael Van Canneyt wrote:
>> Look at the sources
>
> Which proves me right, or do I miss something?

"lazutf8 doesn't depending" when it is in the uses clause, sounds a bit strange to me :-)

Michael.
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
In our previous episode, Graeme Geldenhuys said:
> >> > Yes, this is correct.
> > Correction, this particular function does not depend on cwstrings.
> > All the other widestring (uppercase, compare etc) functions do.
>
> Ok, thanks for that.
> Is there an easy way to see when a RTL function requires cwstrings to
> function correctly? Is it mentioned in the RTL documentation? Is looking
> at the RTL source code the only way to find that out?

Yes, I think so. But in this case, because utf8 to utf16 conversion doesn't require tables, it makes sense that it doesn't need a unicode library implementation.

As soon as a function starts interpreting/comparing/mutating characters, you need tables, and those are better taken from the OS (or at least kept optional, for small programs that only want to use sysutils to remove a file or so).
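The distinction drawn in this thread can be sketched as follows: UTF8Decode() is plain bit manipulation, while a case-mapping call such as WideUpperCase() goes through the widestring manager, which on Unix is typically installed by listing cwstring early in the uses clause. This is only an illustrative sketch. The program name is made up, and what WideUpperCase() returns for non-ASCII characters depends on the installed manager and the locale, so no result is claimed for that line.

```pascal
program casewithmanager;

{$mode objfpc}{$H+}

uses
  {$ifdef unix}
  cwstring,  // installs a widestring manager backed by the C library
  {$endif}
  SysUtils;

var
  W: WideString;
begin
  // Decoding: no tables needed, works with or without cwstring.
  W := UTF8Decode(#$C3#$A9);          // UTF-8 bytes for U+00E9
  WriteLn('decoded:   ', HexStr(Ord(W[1]), 4));

  // Case mapping: needs character tables, so it relies on the
  // widestring manager that cwstring installs on Unix.
  W := WideUpperCase(W);
  WriteLn('uppercase: ', HexStr(Ord(W[1]), 4));
end.
```

Removing cwstring from the uses clause on a Unix system is a simple way to check which of the two calls still behaves as expected with your compiler version.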
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On Friday 25 March 2016 14:48:18 Graeme Geldenhuys wrote:
> On 2016-03-25 12:20, Bart wrote:
> > If you're using LazUtf8 (or use LCL) then cwstring will be used in your
> > app.
>
> I don't use LCL at all, pure RTL & FCL code only. Based on the fact that
> LCL's code also requires "cwstrings" I assume my original assumptions is
> correct, that if I want to do any UTF8-to-UTF16 conversions, use
> UTF8Decode etc, my applications (or frameworks) require "cwstrings" for
> now.

You can use the MSEgui functions in lib/common/msestrings.pas (stringtoutf8(), stringtoutf8ansi(), utf8tostring(), utf8tostringansi()). AFAIK both LCL and the Free Pascal RTL also have such functions.

Martin
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On 2016-03-25 12:23, Michael Van Canneyt wrote:
>> > Yes, this is correct.
> Correction, this particular function does not depend on cwstrings.
> All the other widestring (uppercase, compare etc) functions do.

Ok, thanks for that.

Is there an easy way to see when a RTL function requires cwstrings to function correctly? Is it mentioned in the RTL documentation? Is looking at the RTL source code the only way to find that out? Or does the compiler in some way give a compilation hint that some RTL functions will not function (because I might have left out cwstrings in a project)?

Regards,
  - Graeme -
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On Fri, Mar 25, 2016 at 2:01 PM, Michael Van Canneyt wrote:
> Look at the sources

Which proves me right, or do I miss something?

--
Felipe Monteiro de Carvalho
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On 2016-03-25 12:20, Bart wrote:
> If you're using LazUtf8 (or use LCL) then cwstring will be used in your app.

I don't use LCL at all, pure RTL & FCL code only. Based on the fact that LCL's code also requires "cwstrings", I assume my original assumption is correct: if I want to do any UTF8-to-UTF16 conversions (use UTF8Decode etc.), my applications (or frameworks) require "cwstrings" for now.

Regards,
  - Graeme -
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On Fri, 25 Mar 2016, Felipe Monteiro de Carvalho wrote:

> On Fri, Mar 25, 2016 at 1:20 PM, Bart wrote:
>> If you're using LazUtf8 (or use LCL) then cwstring will be used in your app.
>> And I guess that Utf8ToUtf16 from Lazutf8 does not depend on a WS
>> manager, but I may be terribly wrong about that.
>
> As far as I remember, lazutf8 doesn't depending on cwstring for (most?) of its funcionality.

Look at the sources

    uses
      {$IFDEF UTF8_RTL}
      {$ifdef unix}
      cwstring, // UTF8 RTL on Unix requires this. Must be used although it pulls in clib.
      {$endif}

Michael.
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On Fri, Mar 25, 2016 at 1:20 PM, Bart wrote:
> If you're using LazUtf8 (or use LCL) then cwstring will be used in your app.
> And I guess that Utf8ToUtf16 from Lazutf8 does not depend on a WS
> manager, but I may be terribly wrong about that.

As far as I remember, lazutf8 doesn't depending on cwstring for (most?) of its functionality.

--
Felipe Monteiro de Carvalho
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On Fri, 25 Mar 2016, Michael Van Canneyt wrote:

> On Fri, 25 Mar 2016, Graeme Geldenhuys wrote:
>> Hi,
>>
>> I'm using FPC 2.6.4 primarily. Am I correct in that UTF8Decode and most (if not all) UTF8-to-UTF16 conversions don't function correctly (or not at all) if you don't include the cwstrings unit in your project? I referring to Unix-based OSes here. I believe Windows automatically include the WideString Manager for you.
>
> Yes, this is correct.

Correction, this particular function does not depend on cwstrings.
All the other widestring (uppercase, compare etc) functions do.

Michael.
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On Fri, 25 Mar 2016, Graeme Geldenhuys wrote:

> Hi,
>
> I'm using FPC 2.6.4 primarily. Am I correct in that UTF8Decode and most (if not all) UTF8-to-UTF16 conversions don't function correctly (or not at all) if you don't include the cwstrings unit in your project? I referring to Unix-based OSes here. I believe Windows automatically include the WideString Manager for you.

Yes, this is correct.

Michael.
Re: [fpc-pascal] cwstrings unit and UTF8Decode()
On 3/25/16, Graeme Geldenhuys wrote:
> I'm using FPC 2.6.4 primarily. Am I correct in that UTF8Decode and most
> (if not all) UTF8-to-UTF16 conversions don't function correctly (or not
> at all) if you don't include the cwstrings unit in your project? I
> referring to Unix-based OSes here.

If you're using LazUtf8 (or the LCL), then cwstring will be used in your app.
And I guess that Utf8ToUtf16 from LazUtf8 does not depend on a WS manager, but I may be terribly wrong about that.

Bart
[fpc-pascal] cwstrings unit and UTF8Decode()
Hi,

I'm using FPC 2.6.4 primarily. Am I correct in that UTF8Decode and most (if not all) UTF8-to-UTF16 conversions don't function correctly (or not at all) if you don't include the cwstrings unit in your project? I'm referring to Unix-based OSes here. I believe Windows automatically includes the WideString Manager for you.

Regards,
  - Graeme -