vfclists . wrote:
Isn't some formality in these Unicode discussions called for? Use of
everyday language to express things which can only be properly expressed
and tested through source code is very confusing.
The formal definitions can be found at
Isn't some formality in these Unicode discussions called for? Use of
everyday language to express things which can only be properly expressed
and tested through source code is very confusing.
Consider these few sentences by Mattias
It depends.
There are two codepages. The real one and the
On 04/20/2016 11:26 AM, Jonas Maebe wrote:
The reasons are
Thanks a lot for the great explanation.
So there are good reasons to stay with the status-quo (unless doing a
completely new versatile and straight forward String implementation that
exceeds functionality and "mind" Delphi allows,
Michael Schnell wrote on Tue, 19 Apr 2016:
On 04/19/2016 08:22 AM, Jonas Maebe wrote:
When any {$codepage xxx} directive is specified, string constants
in the source are represented in a way that makes lossless
conversion to any other code page possible. This conversion to the
target
BTW.:
http://www.freepascal.org/docs-html/rtl/system/defaultsystemcodepage.html says
that DefaultSystemcodepage can be modified in the user code at runtime.
I suppose that will change the way strings with StringCodePage() =
CP_ACP are handled.
I'll do some tests...
-Michael
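A test along these lines could be sketched as follows. This is only a minimal sketch, assuming FPC 3.0 or later; the program and variable names are made up for illustration. It changes DefaultSystemCodePage at run time through SetMultiByteConversionCodePage from the system unit and prints what StringCodePage reports before and after. The printed values depend on platform and locale, so none are claimed here.

```pascal
program acp_runtime;   // hypothetical name, for illustration only
{$mode objfpc}{$H+}

var
  s: AnsiString;       // declared code page is CP_ACP
begin
  s := 'abc';          // ASCII only, so no {$codepage} directive is needed

  WriteLn('DefaultSystemCodePage before: ', DefaultSystemCodePage);
  WriteLn('StringCodePage(s) before:     ', StringCodePage(s));

  // Change the code page that CP_ACP stands for from now on.
  SetMultiByteConversionCodePage(CP_UTF8);

  WriteLn('DefaultSystemCodePage after:  ', DefaultSystemCodePage);
  WriteLn('StringCodePage(s) after:      ', StringCodePage(s));
end.
```

Comparing the two StringCodePage lines shows whether an already assigned string is reported differently once the default has changed.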
On 04/19/2016 08:22 AM, Jonas Maebe wrote:
When any {$codepage xxx} directive is specified, string constants in
the source are represented in a way that makes lossless conversion to
any other code page possible. This conversion to the target code page
is performed at compile time where
On 04/19/2016 08:22 AM, Jonas Maebe wrote:
No, it does not. Please tell me which sentence of
http://wiki.freepascal.org/FPC_Unicode_support#String_constants
suggests that in any way.
I just was making fun of myself, naively supposing the contrary :-) ;-)
-Michael
Michael Schnell wrote:
On 04/16/2016 11:02 AM, Mattias Gaertner wrote:
For instance using {$codepage utf8} tells the compiler to convert all
your literals to UTF-16. Without the {$codepage} the compiler
preserves the real codepage.
I.e. (compiling in a UTF-8 based Linux)
- using {$codepage
On 04/16/2016 11:02 AM, Mattias Gaertner wrote:
StringCodePage on a literal is pretty useless. You should use
StringCodePage on variables.
Just exploring how the compiler works...
-Michael
___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
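To look at variables rather than literals, a small probe like the one below may help. It is a sketch under stated assumptions: FPC 3.0 or later, a source file that really is saved as UTF-8, and illustrative variable names. It assigns the same literal to variables with different declared code pages and prints the dynamic code page and byte length of each, without claiming any particular output.

```pascal
program cp_vars;       // hypothetical name, for illustration only
{$mode objfpc}{$H+}
{$codepage utf8}       // assumption: this file is saved as UTF-8

var
  a: AnsiString;       // declared code page CP_ACP
  u: UTF8String;       // declared code page CP_UTF8
  r: RawByteString;    // accepts any code page without conversion
begin
  a := 'äü';
  u := 'äü';
  r := u;              // keeps the bytes and the code page label of u

  WriteLn('a: cp=', StringCodePage(a), ' len=', Length(a));
  WriteLn('u: cp=', StringCodePage(u), ' len=', Length(u));
  WriteLn('r: cp=', StringCodePage(r), ' len=', Length(r));
end.
```

Running it once with and once without the {$codepage utf8} line makes the effect of the directive on each variable visible.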
On 04/16/2016 11:02 AM, Mattias Gaertner wrote:
For instance using {$codepage utf8} tells the compiler to convert all
your literals to UTF-16. Without the {$codepage} the compiler
preserves the real codepage.
I.e. (compiling in a UTF-8 based Linux)
- using {$codepage utf8} tells the compiler
On 04/16/2016 10:47 AM, Mattias Gaertner wrote:
That's correct. String literals in a codepage other than system are
stored as UTF-16 in the binary
(Assuming with "other than system" you mean different from the
DefaultSystemcodepage setting the compiler sees at its runtime).
I see. And of
Mattias Gaertner wrote:
That's correct. String literals in a codepage other than system are
stored as UTF-16 in the binary and converted on assign. The conversion
happens at runtime, so the string codepage is decided at
runtime.
That's correct if the assignment is to a variable/parameter that
On Fri, 15 Apr 2016 10:43:55 +0200
Michael Schnell wrote:
>[...]
> Do you suggest that the codepage of the sourcecode is preserved by the
> compiler when creating the string constant in object code ?
It depends.
There are two codepages. The real one and the one you tell
On Fri, 15 Apr 2016 10:19:06 +0200
Michael Schnell wrote:
> On 04/15/2016 08:35 AM, Michael Van Canneyt wrote:
> >
> > For string constants there are slightly different rules. There the
> > result depends on the {$codepage} directive of the source file.
>
> Hmmm.
>
> If
On 04/15/2016 08:35 AM, Michael Van Canneyt wrote:
For string constants there are slightly different rules. There the
result depends on the {$codepage} directive of the source file.
Hmmm.
If not setting $codepage for a constant string I get StringCodePage = 0,
If setting {$codepage UTF8}
On 04/15/2016 10:32 AM, Graeme Geldenhuys wrote:
If you as a programmer know the unit is saved in UTF-8 encoding, then
add {$codepage utf8} to the top of the unit. That tells the compiler
how to interpret string constants in that unit (without the need for
any guessing).
I did some test
On 2016-04-14 09:16, Michael Schnell wrote:
> For a test I did result := StringCodePage('äü');
If you as a programmer know the unit is saved in UTF-8 encoding, then
add {$codepage utf8} to the top of the unit. That tells the compiler how
to interpret string constants in that unit (without the
On Thu, 14 Apr 2016, Michael Schnell wrote:
On 04/14/2016 08:52 AM, Michael Van Canneyt wrote:
The default encoding for the string type is determined at run-time, not at
compile time.
How can that work for string constants ? Will they in fact (virtually) change
their encoding when
On 04/14/2016 08:52 AM, Michael Van Canneyt wrote:
The default encoding for the string type is determined at run-time,
not at compile time.
How can that work for string constants ? Will they in fact (virtually)
change their encoding when DefaultSystemcodepage is different ?
For a test I
On Wed, 13 Apr 2016, Michael Schnell wrote:
On 04/13/2016 09:04 AM, Michael Van Canneyt wrote:
It uses the DefaultSystemcodepage. If the system codepage is UTF8, then
it will use UTF8.
(Sorry for sending yet another reply to the same message of yours)
On 04/13/2016 09:04 AM, Michael Van Canneyt wrote:
It uses the DefaultSystemcodepage. If the system codepage is UTF8, then
it will use UTF8.
(Sorry for sending yet another reply to the same message of yours)
http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus says:
On the other
On 04/13/2016 09:04 AM, Michael Van Canneyt wrote:
It uses the DefaultSystemcodepage. If the system codepage is UTF8,
then it will use UTF8.
Thanks for the enlightenment.
Am I right assuming that the DefaultSystemcodepage is determined when
compiling the RTL and/or the compiler) ? (As the
On Tue, 12 Apr 2016, Michael Schnell wrote:
On 04/04/2016 11:27 AM, Juha Manninen wrote:
Just use the new UTF-8 mode provided by Lazarus and remove all explicit
conversion functions.
http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus
I just did some tests and it seems that
On 04/04/2016 11:27 AM, Juha Manninen wrote:
Just use the new UTF-8 mode provided by Lazarus and remove all
explicit conversion functions.
http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus
I just did some tests and it seems that TStringList (this is what Tobias
is concerned about)
On 04/04/2016 11:27 AM, Juha Manninen wrote:
On Mon, Apr 4, 2016 at 11:18 AM, wrote:
I use TStringList for UTF-8 strings. This is no longer possible, because
automatic conversions cause question marks and data loss.
You are completely lost with this issue. The
On 04/04/2016 10:43 AM, tobiasgie...@gmail.com wrote:
"Unicode aware Pascal code needs to set DefaultSystemCodePage to
CP_UTF8".
That can't be this ubiquitous. I do suppose that the default value is
supposed to make sense in many cases.
OTOH, if - as you seem to suggest - there is any
On 2016-04-04 13:15, Sven Barth wrote:
> Qt uses UTF-16 as well...
I always thought that strange. After all, Qt was born as a Unix-type GUI
toolkit. Unless I got my facts wrong. Then again, it's only in recent
years that Unix-like systems moved to UTF-8. I think even FreeBSD didn't
use UTF-8 out
On 04.04.2016 at 13:21, "Graeme Geldenhuys" <mailingli...@geldenhuys.co.uk> wrote:
>
> On 2016-04-04 12:06, Michael Van Canneyt wrote:
> > 1. Using UTF8 is a choice of lazarus. Other people may prefer
> > UnicodeString.
> > On Windows, UnicodeString is more 'natural' or 'native'.
>
> Based on
On Mon, 4 Apr 2016, Jonas Maebe wrote:
Michael Van Canneyt wrote on Mon, 04 Apr 2016:
On Mon, 4 Apr 2016, Graeme Geldenhuys wrote:
[add LCL UTF-8 helper units to FPC]
Though it could probably be added as quick as in FPC 3.0.2. It's simply
two new units that need to be explicitly used by
On Mon, 4 Apr 2016, Graeme Geldenhuys wrote:
On 2016-04-04 12:06, Michael Van Canneyt wrote:
1. Using UTF8 is a choice of lazarus. Other people may prefer UnicodeString.
On Windows, UnicodeString is more 'natural' or 'native'.
Based on Internet standards and most popular OSes (mobile
On 2016-04-04 12:06, Michael Van Canneyt wrote:
> 1. Using UTF8 is a choice of lazarus. Other people may prefer UnicodeString.
> On Windows, UnicodeString is more 'natural' or 'native'.
Based on Internet standards and most popular OSes (mobile devices
included), UTF-8 is king - so we all know
On Mon, 4 Apr 2016, Graeme Geldenhuys wrote:
more complete solution for UTF-8. This is useful for many users. They
don't have to reinvent the wheel.
Not having looked at the two units you mentioned... but if this is a
general requirement for anybody using UTF-8 or similar with FPC 3.0,
then
On 2016-04-04 11:34, Mattias Gaertner wrote:
> for that. In fact you don't have to use LazUtils: some users simply
> copied the two units FPCAdds and LazUTF8. It's all open source.
This was not made clear until you explicitly mentioned it. Juha's
initial comment was vague on the matter, and the
On 2016-04-04 11:40, Mattias Gaertner wrote:
> Or simply copy the two units FPCAdds, LazUTF8 or parts of them from
> here:
Thank you Juha and Mattias - I'll take a look at those to see what they do.
Regards,
- Graeme -
On Mon, 4 Apr 2016 13:27:05 +0300
Juha Manninen wrote:
>[...]
> But yes, it requires Lazarus IDE because LazUtils is a Lazarus
> package. At least you must create and compile the project using
> Lazarus IDE.
Or simply copy the two units FPCAdds, LazUTF8 or parts of
On Mon, 4 Apr 2016 10:52:20 +0100
Graeme Geldenhuys wrote:
> On 2016-04-04 10:27, Juha Manninen wrote:
> > Just use the new UTF-8 mode provided by Lazarus and remove all
> > explicit conversion functions.
>
> This is the FPC mailing list. Not everybody here uses
On Mon, Apr 4, 2016 at 12:52 PM, Graeme Geldenhuys
wrote:
> This is the FPC mailing list. Not everybody here uses Lazarus or LCL, so
> making such a suggestion is wishful thinking. For example, your
> suggestion means nothing to me, I don't use LCL.
Yes, I should
On 2016-04-04 10:27, Juha Manninen wrote:
> Just use the new UTF-8 mode provided by Lazarus and remove all
> explicit conversion functions.
This is the FPC mailing list. Not everybody here uses Lazarus or LCL, so
making such a suggestion is wishful thinking. For example, your
suggestion means
On Mon, Apr 4, 2016 at 11:18 AM, wrote:
> I use TStringList for UTF-8 strings. This is no longer possible, because
> automatic conversions cause question marks and data loss.
You are completely lost with this issue. The automatic conversion of
encodings is a big step
tobiasgiesen wrote on Mon, 04 Apr 2016:
Then please update the wiki - it is user editable.
Done:
http://wiki.freepascal.org/FPC_Unicode_support#Backward_compatibility
I hope this is correct.
It is incorrect in the sense that there is nothing utf8-specific about
the way your code
> Then please update the wiki - it is user editable.
Done:
http://wiki.freepascal.org/FPC_Unicode_support#Backward_compatibility
I hope this is correct.
Cheers,
Tobias
On 2016-04-04 09:43, tobiasgie...@gmail.com wrote:
> Very theoretical. What you really need to tell
> people is something like this:
Then please update the wiki - it is user editable. Even a seasoned
developer like myself still needs to get my head around all this FPC
Unicode stuff. So any
> > I use TStringList for UTF-8 strings. This is no longer possible, because
> > automatic conversions cause question marks and data loss.
>
> Lazarus uses TStringList with UTF-8 all over the place.
>
> Please post a complete example demonstrating the problem.
Sorry - this was only theoretical,
On Mon, 4 Apr 2016, tobiasgie...@gmail.com wrote:
Hello,
disallowing "AnsiString" code for UTF-8 is a huge regression.
I use TStringList for UTF-8 strings. This is no longer possible, because
automatic conversions cause question marks and data loss.
Same answer as in my other mail. Set
On Mon, 04 Apr 2016 10:18:18 +0200
tobiasgie...@gmail.com wrote:
> Hello,
>
> disallowing "AnsiString" code for UTF-8 is a huge regression.
>
> I use TStringList for UTF-8 strings. This is no longer possible, because
> automatic conversions cause question marks and data loss.
Lazarus uses
Hello,
disallowing "AnsiString" code for UTF-8 is a huge regression.
I use TStringList for UTF-8 strings. This is no longer possible, because
automatic conversions cause question marks and data loss.
I also use a large amount of third-party libraries that use the AnsiString
data type for UTF-8.
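For anyone trying to reproduce the conversions described here, the sketch below separates the two things that can happen to a string's code page: relabelling the existing bytes, and actually transcoding them. It is an illustration under assumptions, not a statement of what TStringList does internally: it assumes FPC 3.0 or later, uses SetCodePage and StringCodePage from the system unit, pulls in cwstring on Unix because conversions there need a widestring manager, and picks code page 1252 purely as an example target. No output is claimed.

```pascal
program relabel_vs_convert;   // hypothetical name, for illustration only
{$mode objfpc}{$H+}

{$ifdef unix}
uses cwstring;                // widestring manager for conversions on Unix
{$endif}

var
  raw: RawByteString;
begin
  // The two bytes that encode U+00E4 in UTF-8, set bytewise so that the
  // encoding of this source file plays no role.
  SetLength(raw, 2);
  raw[1] := #$C3;
  raw[2] := #$A4;

  // Relabel only: the bytes stay as they are, the tag becomes CP_UTF8.
  SetCodePage(raw, CP_UTF8, False);
  WriteLn('after relabel: cp=', StringCodePage(raw), ' len=', Length(raw));

  // Real conversion: the content is transcoded to the example code page.
  SetCodePage(raw, 1252, True);
  WriteLn('after convert: cp=', StringCodePage(raw), ' len=', Length(raw));
end.
```

If the length changes after the second call, the bytes were transcoded; characters the target code page cannot represent are where question marks would be expected to show up.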
46 matches