On 05.05.2017 12:01, Michael Van Canneyt via Lazarus wrote:
On Fri, 5 May 2017, Ondrej Pokorny via Lazarus wrote:
Believe me, I use it in production without any problems: I have
unicode-aware TStrings, I can read files with unicode names, I can do
everything with plain FPC trunk.
I am aware
On 2017-05-05 10:41, Ondrej Pokorny via Lazarus wrote:
> Just use "DefaultSystemCodePage := CP_UTF8" and every single-byte string
> is unicode enabled.
So does that mean you don't have to also call the following two functions
(which LCL does).
SetMultiByteConversionCodePage(CP_UTF8);
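For readers following along, here is a minimal sketch of the two forms mentioned so far, assuming FPC 3.0 or later, where both names live in the System unit. The program name is made up for illustration. It only sets the default code page for single-byte strings and prints it; it does not cover file-system code pages or anything else the LazUTF8 unit may do.

```pascal
program CodePageSketch;
{$mode objfpc}{$H+}

begin
  // Direct assignment, as suggested in the thread.
  DefaultSystemCodePage := CP_UTF8;

  // The RTL procedure named in the thread; it sets the same variable.
  SetMultiByteConversionCodePage(CP_UTF8);

  // Show the numeric code page now in effect for AnsiString.
  WriteLn('DefaultSystemCodePage = ', DefaultSystemCodePage);
end.
```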
On 2017-05-05 12:17, Mattias Gaertner via Lazarus wrote:
> I wonder if it would help if FPC would store UTF-8 string literals as
> UTF-8
Yeah, that would be the logical thing to do. FPC not doing that is what
really confused me.
Regards,
Graeme
--
On Fri, 5 May 2017 10:56:41 +0100
Graeme Geldenhuys via Lazarus wrote:
>[...]
> > or work with large amount of 8-bit strings.
>
> Why would you want to? Unicode supports all languages,
Maybe there is a misunderstanding. Let me rephrase my question:
What string
On Fri, 5 May 2017 12:01:47 +0200 (CEST)
Michael Van Canneyt via Lazarus wrote:
>[...]
> > Believe me, I use it in production without any problems: I have
> > unicode-aware TStrings, I can read files with unicode names, I can do
> > everything with plain FPC
Am 2017-05-05 um 12:16 schrieb Graeme Geldenhuys via Lazarus:
> In the end it’s about supporting Unicode. Does it really matter
> what internal encoding it is to achieve the “Unicode support”
> goal?
From a performance perspective it may be unwanted
to convert string encodings back and forth all
On Fri, May 5, 2017 at 1:20 AM, Graeme Geldenhuys via Lazarus
wrote:
> A case in point. Looking at the Wiki page you listed, I read the following:
> "
> Since FPC 3.0 you must add the flag -FcUTF8 or add {$codepage UTF8} at the
> beginning of the unit.
> ...
Uhhh,
On 05.05.2017 12:16, Graeme Geldenhuys via Lazarus wrote:
In the end it’s about supporting Unicode. Does it really matter
what internal encoding it is to achieve the “Unicode support”
goal?
Yep it does.
There are ways around that issue (i.e. code aware strings) but in fact
these trigger a
On Fri, 5 May 2017 12:52:48 +0200 (CEST)
Michael Van Canneyt via Lazarus wrote:
>[...]
> I propose to let the compiler observe the BOM.
> But I don't think more is needed.
FPC observes the BOM. Same as Delphi.
I wonder if it would help if FPC would store UTF-8
On Fri, May 5, 2017 at 2:29 PM, Michael Van Canneyt via Lazarus
wrote:
> Then what is still the problem ?
With BOM you get:
Error: UTF-8 code greater than 65535 found
which is counter-intuitive when the file and the string literal are both UTF-8.
It is related to
On Fri, 5 May 2017 12:17:22 +0200
Ondrej Pokorny via Lazarus wrote:
>[...]
> Embarcadero realized they made a mistake when they disabled (yes, only
> disabled not removed) 8-bit strings from NEXTGEN compilers. UTF8String
> and RawByteString are back for all
On Fri, 5 May 2017, Juha Manninen via Lazarus wrote:
On Fri, May 5, 2017 at 9:43 AM, Michael Van Canneyt via Lazarus
wrote:
What tricks do you still need in 3.0.x ?
The annoying tricky part with our UTF-8 solution is the assignment of
Unicode string
On 2017-05-05 11:55, Jürgen Hestermann via Lazarus wrote:
> I use UTF-8 internally and
> convert to/from UTF-16 for all Windows API functions and
> I never found any problem with it.
> The time that the API functions requires is so much longer than the
> time for string conversion that it does not
On 05.05.2017 13:02, Graeme Geldenhuys via Lazarus wrote:
On 2017-05-05 10:41, Ondrej Pokorny via Lazarus wrote:
Just use "DefaultSystemCodePage := CP_UTF8" and every single-byte string
is unicode enabled.
So does that mean you don't have to also call the following two functions
(which LCL
On 2017-05-05 11:01, Michael Van Canneyt via Lazarus wrote:
> We claim Delphi compatibility.
> So IMHO we must provide a UTF-16 Delphi compatible RTL.
In the end it’s about supporting Unicode. Does it really matter
what internal encoding it is to achieve the “Unicode support”
goal?
Regards,
On 2017-05-05 12:49, Juha Manninen via Lazarus wrote:
> A wrong information easily propagates, thus it is important to get this right.
No worries, I agree. Thanks for correcting my terminology.
Regards,
Graeme
--
___
Lazarus mailing list
On Fri, 5 May 2017, Mattias Gaertner via Lazarus wrote:
On Fri, 5 May 2017 12:52:48 +0200 (CEST)
Michael Van Canneyt via Lazarus wrote:
[...]
I propose to let the compiler observe the BOM.
But I don't think more is needed.
FPC observes the BOM. Same as
On Fri, May 5, 2017 at 2:02 PM, Graeme Geldenhuys via Lazarus
wrote:
> If so, then why does LCL also call the above two functions?
Graeme, they are called by LazUtils package, LazUTF8 unit, not by LCL.
It is not limited to GUI programming.
A wrong information
On 2017-05-05 07:43, Michael Van Canneyt via Lazarus wrote:
> As far as I know, you don't need any tricks to work with unicode
> filenames or output in 3.0.2. Maybe with exception of TStrings and
> TFileStream.
Again, I didn't have time to follow FPC 3.x development much, and I was too
confused
On 05.05.2017 11:17, Michael Van Canneyt via Lazarus wrote:
On Fri, 5 May 2017, Graeme Geldenhuys via Lazarus wrote:
On 2017-05-05 07:43, Michael Van Canneyt via Lazarus wrote:
As far as I know, you don't need any tricks to work with unicode
filenames or output in 3.0.2. Maybe with exception
On Fri, 5 May 2017, Ondrej Pokorny via Lazarus wrote:
On 05.05.2017 11:17, Michael Van Canneyt via Lazarus wrote:
On Fri, 5 May 2017, Graeme Geldenhuys via Lazarus wrote:
On 2017-05-05 07:43, Michael Van Canneyt via Lazarus wrote:
As far as I know, you don't need any tricks to work with
On 04.05.2017 16:56, Juha Manninen via Lazarus wrote:
I believe everybody is happy to get rid of the horrendous Windows
If this is true, there is a decent need for backwards compatibility.
That is why, theoretically, code aware strings are a good idea.
Unfortunately the implementation of
On 2017-05-05 00:15, Mattias Gaertner via Lazarus wrote:
> I added a FAQ:
> http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus#What_happens_when_I_use_.24codepage_utf8.3F
Ah, thanks for that explanation.
> AFAIK you are using UTF-8 in AnsiString in FPC 2.6.4. That works in
> many
On 05.05.2017 11:06, Graeme Geldenhuys via Lazarus wrote:
On 2017-05-05 07:43, Michael Van Canneyt via Lazarus wrote:
As far as I know, you don't need any tricks to work with unicode
filenames or output in 3.0.2. Maybe with exception of TStrings and
TFileStream.
Again, I didn't have time to
On Fri, 5 May 2017, Ondrej Pokorny via Lazarus wrote:
On 05.05.2017 11:06, Graeme Geldenhuys via Lazarus wrote:
On 2017-05-05 07:43, Michael Van Canneyt via Lazarus wrote:
As far as I know, you don't need any tricks to work with unicode
filenames or output in 3.0.2. Maybe with exception of
On Fri, 5 May 2017 11:31:00 +0300
Kostas Michalopoulos via Lazarus wrote:
>[...]
> To play the devil's advocate, the fact that ALL reviews said that it has
> excellent support for Unicode means that characters outside the BMP *are*
> rare. After all, BMP does
On Fri, 5 May 2017, Graeme Geldenhuys via Lazarus wrote:
On 2017-05-05 07:43, Michael Van Canneyt via Lazarus wrote:
As far as I know, you don't need any tricks to work with unicode
filenames or output in 3.0.2. Maybe with exception of TStrings and
TFileStream.
Again, I didn't have time to
On 2017-05-05 09:59, Michael Schnell via Lazarus wrote:
> (Most obvious drawback: not flexibly typed TStrings.)
I know not everybody likes Generics, but that is where I see
Generics could come in very handy. A single TStrings implementation
that supports multiple string types.
Or just implement
On 2017-05-05 10:17, Michael Van Canneyt via Lazarus wrote:
>> Something like:
>>
>> sl.LoadFromFile('some_utf8_file.txt', CP_UTF8);
>> sl.LoadFromFile('some_utf16_file.txt', CP_UTF16);
>> sl.LoadFromFile('some_latin1_file.txt', CP_Latin1);
>
> Not yet. These are the exceptions I was talking
On Fri, 5 May 2017 10:01:24 +0100
Graeme Geldenhuys via Lazarus wrote:
>[...]
> > AFAIK you are using UTF-8 in AnsiString in FPC 2.6.4. That works in
> > many cases, because of double fooling the compiler. This trick does not
> > work on Windows with RTL file
On 2017-05-05 10:41, Mattias Gaertner via Lazarus wrote:
> I wonder what they do when you need to access the raw 8-bit file names,
OSX, iOS, Android and Linux all use UTF-8 as standard, so filename access
is not going to be any problem. Windows is moving more and more towards
UTF-16 everywhere,
On 2017-05-05 09:31, Kostas Michalopoulos via Lazarus wrote:
> After all, BMP does include practically all languages used today.
The bottom line:
Unicode Standard <> BMP only!
If you think that, then rather promote your application as a UCS-2
compliant application, not a Unicode compliant
On 2017-05-05 10:17, Ondrej Pokorny via Lazarus wrote:
> I don't know about 3.0.x but you can do it in trunk 3.1.1. I posted a
> patch for it (r34475).
Fantastic! Glad to see somebody was thinking along the same lines
as I was. :)
Is that scheduled to be back-ported to FPC 3.0.x?
On Fri, 5 May 2017 14:12:05 +0300
Juha Manninen via Lazarus wrote:
>[...]
> Then Mattias adds FAQs contradicting the earlier texts ...
Oops. Which one?
Mattias
--
On Fri, May 5, 2017 at 3:56 PM, Sven Barth via Lazarus
wrote:
> That is mainly due to the compiler not supporting surrogate pairs for the
> UTF-8 -> UTF-16 conversion. If it would support them, then there wouldn't be
> a problem anymore...
That is a serious bug.
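For context, the surrogate-pair step of a UTF-8 -> UTF-16 conversion is small. Below is a minimal sketch in Free Pascal of how a code point above 65535 ($FFFF) maps to two UTF-16 code units under the Unicode standard. The procedure name ToSurrogatePair and the sample code point U+1F600 are my own illustrations; this is not the compiler's code and says nothing about how FPC would implement it.

```pascal
program SurrogateSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Split a code point in the range $10000..$10FFFF into a UTF-16
// surrogate pair. The caller must ensure the code point is in range.
procedure ToSurrogatePair(CodePoint: Cardinal; out LeadUnit, TrailUnit: Word);
var
  Offset: Cardinal;
begin
  Offset := CodePoint - $10000;                  // 20 significant bits remain
  LeadUnit := Word($D800 + (Offset shr 10));     // upper 10 bits
  TrailUnit := Word($DC00 + (Offset and $3FF));  // lower 10 bits
end;

var
  Lead, Trail: Word;
begin
  // U+1F600 is an arbitrary example outside the BMP.
  ToSurrogatePair($1F600, Lead, Trail);
  // Prints: D83D DE00
  WriteLn(IntToHex(LongInt(Lead), 4), ' ', IntToHex(LongInt(Trail), 4));
end.
```

Code points up to $FFFF need no such split, which is why the error only shows up for characters outside the BMP.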
On Fri, May 5, 2017 at 4:21 PM, Mattias Gaertner via Lazarus
wrote:
> Oops. Which one?
The FAQ says:
"Since FPC 3.0 you must add the flag -FcUTF8 or add {$codepage UTF8}
at the beginning of the unit."
The same page in "String Literals" section says:
"In most
On Fri, 5 May 2017 16:36:51 +0300
Juha Manninen via Lazarus wrote:
> On Fri, May 5, 2017 at 4:21 PM, Mattias Gaertner via Lazarus
> wrote:
> > Oops. Which one?
>
> The FAQ says:
> "Since FPC 3.0 you must add the flag -FcUTF8 or
Am 05.05.2017 13:50 schrieb "Juha Manninen via Lazarus" <
lazarus@lists.lazarus-ide.org>:
>
> On Fri, May 5, 2017 at 2:29 PM, Michael Van Canneyt via Lazarus
> wrote:
> > Then what is still the problem ?
>
> With BOM you get:
> Error: UTF-8 code greater than 65535
38 matches