Re: [fpc-devel] DOS GUI
What is the difference between this and the TUI that comes up when you start tp? (The latter obviously is already part of the fpc source code distribution.)

-Michael
___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
Re: [fpc-devel] Encoded AnsiString
On 12/29/2013 10:57 AM, Michael Van Canneyt wrote:
> We're way past that stage.

Sorry, but IMHO freezing the current (supposedly DXE compatible) state does not make much sense. To decently support creating fully portable applications for multiple OSes, TStrings and its siblings (e.g. TStringList) need to be implemented in a way that does not force a fixed encoding for storing the information and thus does not force time-consuming conversion when storing and retrieving strings in any different encoding.

-Michael
Re: [fpc-devel] Inc() and Dec() on properties
This behavior is absolutely logical, as a normal procedure could not use a property in that way. OTOH, just as a courtesy, the _builtin_ procedures inc() and dec() could be implemented in a way that allows it.

-Michael
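For readers who have not run into this: a minimal sketch of the situation discussed above, assuming FPC in objfpc mode. The class and property names are hypothetical; the commented-out line is the call the compiler is described as rejecting, and the line after it is the usual workaround.

```pascal
program IncProp;
{$mode objfpc}{$H+}

type
  // Hypothetical class, used only to illustrate the point.
  TCounter = class
  private
    FValue: Integer;
  public
    property Value: Integer read FValue write FValue;
  end;

var
  C: TCounter;
begin
  C := TCounter.Create;
  try
    // Inc(C.Value);         // rejected: Inc() needs a variable, not a property
    C.Value := C.Value + 1;  // equivalent that goes through the getter and setter
    WriteLn(C.Value);        // prints 1
  finally
    C.Free;
  end;
end.
```

The explicit assignment works regardless of whether the property maps to a field or to getter/setter methods, which is exactly what a var parameter cannot express.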
Re: [fpc-devel] Encoded AnsiString
On Tue, 7 Jan 2014, Michael Schnell wrote:
> On 12/29/2013 10:57 AM, Michael Van Canneyt wrote:
>> We're way past that stage.
>
> Sorry, but IMHO freezing the current (supposedly DXE compatible) state does not make much sense. To decently support creating fully portable applications for multiple OSes, TStrings and its siblings (e.g. TStringList) need to be implemented in a way that does not force a fixed encoding for storing the information and thus does not force time-consuming conversion when storing and retrieving strings in any different encoding.

We know this. But that is stage 2.

Michael.
Re: [fpc-devel] Encoded AnsiString
On 12/29/2013 07:26 PM, Hans-Peter Diettrich wrote:
> My view on RawByteString:

My view on RawByteString originally had been that it can hold strings of any encoding, and that the software that gets such a variable (or function argument) can detect the actual encoding and behave accordingly. Thus dynamically generic functions can be written that handle any encoding style. This is very appropriate, as there can be sensible functionalities that don't need to know the encoding at all, or not in detail (such as blindly storing and retrieving the information, concatenating, ...).

But I (not owning XE) learned that this is not the case with XE, so maybe inventing a differently named string subtype might be appropriate.

-Michael
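To make the "detect the actual encoding" idea concrete, here is a small sketch. It assumes an FPC version with code page-aware AnsiStrings (where `StringCodePage` is available in the System unit); `Describe` is a hypothetical helper, not an RTL routine. The values printed depend on the platform and its default code page, so none are claimed here.

```pascal
program ShowCodePage;
{$mode objfpc}{$H+}

// Hypothetical helper: accepts a string in any single-byte/multi-byte
// encoding and reports the dynamic code page stored with the string data.
procedure Describe(const Name: string; const S: RawByteString);
begin
  WriteLn(Name, ': code page ', StringCodePage(S), ', ', Length(S), ' bytes');
end;

var
  A: AnsiString;
  U: UTF8String;
begin
  A := 'abc';
  U := 'abc';
  Describe('AnsiString', A);  // passed through without conversion
  Describe('UTF8String', U);  // passed through without conversion
end.
```

A routine written this way can store, forward or concatenate the data without caring about the encoding, and only inspect the code page when it actually has to interpret the bytes.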
Re: [fpc-devel] Encoded AnsiString
On 12/30/2013 12:53 PM, Jy V wrote:
> A quick note: the new LLVM Delphi compiler forbids the use of AnsiString and AnsiChar,

Yuck! Forking the platforms in that incompatible way is a really ugly move of Delphi's.

Hopefully fpc (and Lazarus) is able to continue its path to real source code compatibility between all supported platforms.

Thanks to the developers for their great effort!

-Michael
Re: [fpc-devel] Encoded AnsiString
On 01/07/2014 10:41 AM, Michael Van Canneyt wrote:
> We know this. But that is stage 2.

Sounds great! Could you elaborate on those plans?

I fear that releasing stage 1 to the public might introduce another source of incompatibility.

Explanation: My horror scenario when trying to convince my colleagues to port their (huge) embedded Delphi application to Lazarus:
- They had it nicely working in pre-Unicode Delphi. It would be possible to port it to pre-Unicode Lazarus with decent effort.
- It took them a huge effort to port it to Unicode-enabled Delphi (including managing the glitches of multiple Delphi versions).
- Current Unicode-aware (UTF-8 enabled) Lazarus is compatible with neither pre-Unicode Delphi nor Unicode-aware Delphi. Hence using it for porting is out of the question.
- A Stage 1 that is just Delphi XE compatible might be a valid target, provided Lazarus is done appropriately. But I would probably recommend waiting for Stage 2, as some of the porting effort might be done in vain regarding the goodies Stage 2 promises. (Not to mention additional issues that might come up when migrating from Stage 1 to Stage 2.)
- Of course Stage 2 would be most appropriate, but only if Lazarus follows accordingly. (I suppose at that time the natural target arch will be ARM 64 :-) )

-Michael
Re: [fpc-devel] Encoded AnsiString
On Tue, 7 Jan 2014, Michael Schnell wrote:
> On 01/07/2014 10:41 AM, Michael Van Canneyt wrote:
>> We know this. But that is stage 2.
>
> Sounds great! Could you elaborate on those plans?

There is not much to say. There will be 2 sets of units:
* Unicode
* AnsiString

For Unicode, there is no problem, since everything is 2 bytes (ignoring some exotic code points here), so string encoding does not apply.
For AnsiString (string = AnsiString), there are some encoding issues, but the RTL is capable of using WideStrings for all OS interface routines.

Michael.
Re: [fpc-devel] Encoded AnsiString
On 01/07/2014 11:22 AM, Michael Van Canneyt wrote:
> There is not much to say. There will be 2 sets of units:
> * Unicode
> * AnsiString
>
> For Unicode, there is no problem, since everything is 2 bytes (ignoring some exotic code points here), so string encoding does not apply.
> For AnsiString (string = AnsiString), there are some encoding issues, but the RTL is capable of using WideStrings for all OS interface routines.

This might help in a certain way, but defining a decently dynamic string subtype and using it for TStringList would allow for a lot more flexibility / functionality. (Together with eliminating the ambiguous naming ANSI...)

-Michael
Re: [fpc-devel] Encoded AnsiString
On 07 Jan 2014, at 11:22, Michael Van Canneyt wrote:
> There will be 2 sets of units:
> * Unicode
> * AnsiString

Or they may be integrated to a large extent. This has not yet been decided. In any case, creating two separate sets of units is a good step regardless of what happens eventually, since we need the functionality of both in any case.

Jonas
Re: [fpc-devel] Encoded AnsiString
On 01/07/2014 11:28 AM, Jonas Maebe wrote:
> creating two separate sets of units is a good step regardless of what happens eventually, since we need the functionality of both in any case.

This sounds like Stage 3 is flickering on the horizon as well :-) :-)

-Michael
Re: [fpc-devel] Encoded AnsiString
On Tue, 7 Jan 2014, Michael Schnell wrote:
> On 01/07/2014 11:22 AM, Michael Van Canneyt wrote:
>> There is not much to say. There will be 2 sets of units:
>> * Unicode
>> * AnsiString
>>
>> For Unicode, there is no problem, since everything is 2 bytes (ignoring some exotic code points here), so string encoding does not apply.
>> For AnsiString (string = AnsiString), there are some encoding issues, but the RTL is capable of using WideStrings for all OS interface routines.
>
> This might help in a certain way, but defining a decently dynamic string subtype and using it for TStringList would allow for a lot more flexibility / functionality. (Together with eliminating the ambiguous naming ANSI...)

TRawByteString is what you need.

Michael.
Re: [fpc-devel] Encoded AnsiString
On 01/07/2014 11:48 AM, Michael Van Canneyt wrote:
> TRawByteString is what you need.

As far as I was told by DXE users, this is not true. That is why I did not test this yet. But as you state otherwise, I will look into that.

But anyway, this does not help as long as the RTL (especially TStrings, TStringList) does not use such a type, but forces a fixed type and thus unnecessary conversions in and out. (There have been several discussions on that, including how to implement decent encoding checks in the compiler, and performance issues.)

-Michael
Re: [fpc-devel] Encoded AnsiString
On Tue, 7 Jan 2014, Michael Schnell wrote:
> On 01/07/2014 11:48 AM, Michael Van Canneyt wrote:
>> TRawByteString is what you need.
>
> As far as I was told by DXE users, this is not true. That is why I did not test this yet. But as you state otherwise, I will look into that.
>
> But anyway, this does not help as long as the RTL (especially TStrings, TStringList) does not use such a type, but forces a fixed type and thus unnecessary conversions in and out. (There have been several discussions on that, including how to implement decent encoding checks in the compiler, and performance issues.)

We are well aware of that.

If you want a TStrings that can hold strings which may differ in their encoding (i.e. strings[0] has a different encoding from strings[1]), then you'll be left in the cold. TStrings will not support that (unless maybe the rawbytestring type can be specified, I would need to check that).

Other than that, you can/must specify an encoding when creating the TStrings instance. As a consequence, all strings you add will be checked for this encoding. This is only logical.

That said, if you have use-case 1 (which I doubt, since it is not possible even today), then you're better off using unicodestring anyway.

Michael.
Re: [fpc-devel] Encoded AnsiString
On 01/07/2014 12:27 PM, Michael Van Canneyt wrote:
> if you have use-case 1 (which I doubt, since it is not possible even today) then you're better off using unicodestring anyway.

My argumentation is driven by the experience (some my own, a huge lot by my colleagues) with doing embedded M2M communication projects with multiple protocols and applications (included in the project and/or predefined by customers), and with some programs requiring sophisticated user interaction in multiple languages. The projects originally were done (mine still are done) using old pre-Unicode Delphi.

Here, strings are used both to handle texts and to handle byte streams (that might contain binary data or might contain text in whatever encoding). On top of that, the embedded handling of mass data here needs to be very fast, handling multiple TCP/IP interfaces at the same time.

Here, of course, for binary strings 8-bit characters are the only appropriate choice, while for the multi-language text the best choice supposedly is using the encoding the underlying OS offers (e.g. UTF-16 for Windows) throughout the multiple units of the project.

So we have been burned by both the flexibility and the pitfalls of using strings in multiple encoding variants within a single project. This leads to the urgent request to allow any TStrings sibling (in our own code and in the RTL) to be used with arbitrary encodings, without forced re-encoding under the hood unless absolutely necessary.

-Michael
[fpc-devel] Explanation about code page-aware AnsiStrings
Hi,

Large parts of the recurring discussions about code page-aware AnsiStrings are related to the fact that many people don't know how they work. For this reason I've created an overview that explains the rules that are followed by the RTL/compiler at http://wiki.freepascal.org/FPC_Unicode_support (only Sections 1 to 3; 4 and later are older and mostly either incomplete or wishful thinking).

Note that this is not a discussion document, and apart from some notes about potential future changes, errors or unknown major Delphi-incompatibilities, the behaviour as described there will not change. The behaviour has been defined to maximise both backward compatibility with existing FPC code (written for FPC 2.6.x and earlier) and with Delphi 2009+. Automagic string types that solve world hunger are outside the scope.

Jonas
Re: [fpc-devel] Encoded AnsiString
On Tue, 7 Jan 2014, Michael Schnell wrote:
> On 01/07/2014 12:27 PM, Michael Van Canneyt wrote:
>> if you have use-case 1 (which I doubt, since it is not possible even today) then you're better off using unicodestring anyway.
>
> My argumentation is driven by the experience (some my own, a huge lot by my colleagues) with doing embedded M2M communication projects with multiple protocols and applications (included in the project and/or predefined by customers), and with some programs requiring sophisticated user interaction in multiple languages. The projects originally were done (mine still are done) using old pre-Unicode Delphi.
>
> Here, strings are used both to handle texts and to handle byte streams (that might contain binary data or might contain text in whatever encoding).

This is a mistake. You should use TByteArray for that. Old pre-Unicode Delphi also handles this type.

You cannot expect us to take into account your improper use of single-byte strings when designing an RTL.

You'll just have to see whether rawbytestring does what you need...

Michael.
Re: [fpc-devel] Explanation about code page-aware AnsiStrings
This in fact is in the paragraph "Old/obsolete sections", but it does not seem to be mentioned in any current paragraph:

Roadmap of RTL Unicode support with UnicodeString:
- TStrings: Not implemented. There is no UnicodeString version of TStrings.
- TStringList: Not implemented. There is no UnicodeString version of TStringList.

This is something just discussed and IMHO very critical, as (IMHO) a Delphi XE compatible implementation is not appropriate for flexible use and portability for multiple targets.

-Michael
Re: [fpc-devel] Encoded AnsiString
On 01/07/2014 01:05 PM, Michael Van Canneyt wrote:
> This is a mistake. You should use TByteArray for that. Old pre-Unicode Delphi also handles this type.

Of course I do know that this sometimes is recommended, but (with pre-Unicode Delphi) I don't see any point in not using String, which is a lot more handy, providing easy-to-use concatenation, search operations, etc.

With a new string type this would come for free, as it requires the same behavior as the locale-based 1-byte ANSI string encoding variant, which needs to be supported anyway.

Thus, IMHO, a discussion about TByteArray is pointless.

-Michael
Re: [fpc-devel] Encoded AnsiString
On Tue, 7 Jan 2014, Michael Schnell wrote:
> On 01/07/2014 01:05 PM, Michael Van Canneyt wrote:
>> This is a mistake. You should use TByteArray for that. Old pre-Unicode Delphi also handles this type.
>
> Of course I do know that this sometimes is recommended, but (with pre-Unicode Delphi) I don't see any point in not using String, which is a lot more handy, providing easy-to-use concatenation, search operations, etc.
>
> With a new string type this would come for free, as it requires the same behavior as the locale-based 1-byte ANSI string encoding variant, which needs to be supported anyway.
>
> Thus, IMHO, a discussion about TByteArray is pointless.

Maybe. But like I said: Do not expect us to adapt the RTL to suit any inappropriate use of strings.

Michael.
Re: [fpc-devel] Explanation about code page-aware AnsiStrings
On 07 Jan 2014, at 13:11, Michael Schnell wrote:
> On 07 Jan 2014, at 12:57, Jonas Maebe wrote:
>> [ http://wiki.freepascal.org/FPC_Unicode_support ]
>
> This in fact is in the paragraph "Old/obsolete sections", but it does not seem to be mentioned in any current paragraph: Roadmap of RTL Unicode support with UnicodeString:

That's because it is a description of what already exists and how it behaves, as explained in the introduction on that page and in the message I sent to this list. It is not the umpteenth iteration of a proposal of what a code page-aware classes unit should look like, nor an attempt to address every person's wish list written down previously on that page.

Jonas
Re: [fpc-devel] Encoded AnsiString
On 01/07/2014 01:24 PM, Michael Van Canneyt wrote:
> But like I said: Do not expect us to adapt the RTL to suit any inappropriate use of strings.

Like I said: I don't! (With the requested behavior this comes for free as a side effect.)

But I do expect decent handling of locale-based 1-byte ANSI string encoding variants (i.e. the functionality of pre-Unicode Delphi and of the current fpc) _at_the_same_time_as_ and _combinable_with_ Unicode strings (and at best additionally user-definable encoding schemes).

-Michael
Re: [fpc-devel] Explanation about code page-aware AnsiStrings
On 01/07/2014 01:23 PM, Jonas Maebe wrote:
> ... nor an attempt to address every person's wish list written down previously on that page.

Yep. And since (pure) Delphi XE compatible behavior is (at least in my opinion) not what is desirable for a portable language/RTL, while Delphi compatibility in general is a major requirement, this obviously is a problematic issue.

-Michael
Re: [fpc-devel] Explanation about code page-aware AnsiStrings
Jonas Maebe jonas.ma...@elis.ugent.be wrote on 7 January 2014 at 12:57:
> Hi,
>
> Large parts of the recurring discussions about code page-aware AnsiStrings are related to the fact that many people don't know how they work. For this reason I've created an overview that explains the rules that are followed by the RTL/compiler at http://wiki.freepascal.org/FPC_Unicode_support (only Sections 1 to 3; 4 and later are older and mostly either incomplete or wishful thinking).

Great!

What is this crap: http://wiki.freepascal.org/FPC_Unicode_support#FPC_Unicode_support ?

> Note that this is not a discussion document, and apart from some notes about potential future changes, errors or unknown major Delphi-incompatibilities, the behaviour as described there will not change. The behaviour has been defined to maximise both backward compatibility with existing FPC code (written for FPC 2.6.x and earlier) and with Delphi 2009+. Automagic string types that solve world hunger are outside the scope.

;)

Mattias
Re: [fpc-devel] Explanation about code page-aware AnsiStrings
On 07 Jan 2014, at 14:13, Mattias Gaertner wrote:
> What is this crap: http://wiki.freepascal.org/FPC_Unicode_support#FPC_Unicode_support ?

It's under the header "Old/obsolete sections" and, as mentioned above, that's incomplete or wishful thinking. I didn't want to delete any existing content for now, but yes, what's mentioned there but not in the current Sections 1 to 3 is not going to happen. Well, there is an UCS4String type in the RTL (like in FPC 2.6.x), but it's just a dynamic array of 0..$10 and not part of the regular string handling.

Jonas
Re: [fpc-devel] Encoded AnsiString
Michael Van Canneyt wrote:
> If you want a TStrings that can hold strings which may differ in their encoding (i.e. strings[0] has a different encoding from strings[1]) then you'll be left in the cold.

Just an idea: What if FPC adds another encoding, similar to RawByteString ($), but without the Delphi quirks? Or simply fixes the RawByteString flaws in the *Ansi* compiler and RTL?

1) In a discussion in the Embarcadero groups it turned out that, in an assignment of a RawByteString to another AnsiString type, the Delphi compiler should (but does not) check and, if necessary, convert the string to the static encoding of the target. This is (almost) the only way to create strings with a different static and dynamic encoding.

2) The stupid conversion to CP_ACP in an assignment *to* a RawByteString should be dropped. This applies in particular to the assignment to *function results*.

3) The function result type should be honored in functions accepting RawByteString parameters. The Delphi compiler seems to *assume* that the result of such functions is RawByteString, so that (including the aforementioned flaws) the outcome is a CP_ACP string, even if the declared function result is e.g. a UTF8String.

Test case:

  function conc(a,b: RawByteString): UTF8String;
  begin
    Result := a+b;
  end;

The same result as for

  function conc(a,b: RawByteString): RawByteString;
  begin
    Result := a+b;
  end;

the returned string has CP_ACP encoding :-(

When these flaws are fixed in the FPC compiler, the AnsiString types will always have the same static and dynamic encoding, as it should be. Then TStrings could be based on such RawByteStrings, without excess conversions or losses.

Sorting (TStringList) should then ignore the dynamic encoding, i.e. work on a strictly binary (byte-by-byte) basis.

DoDi
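The test case above can be turned into a complete program so that anyone can check the behaviour on their own compiler. This is only a sketch: it assumes an FPC version with code page-aware AnsiStrings, where `StringCodePage`, `CP_UTF8` and `CP_ACP` are provided by the System unit. No output is claimed here, since that is exactly what is under discussion.

```pascal
program ConcTest;
{$mode objfpc}{$H+}

// Same shape as the test case from the message: RawByteString in,
// UTF8String out.
function Conc(const A, B: RawByteString): UTF8String;
begin
  Result := A + B;
end;

var
  S1, S2, R: UTF8String;
begin
  S1 := 'foo';
  S2 := 'bar';
  R := Conc(S1, S2);
  // Print the dynamic code page of the result next to the two candidates.
  WriteLn('dynamic code page of result: ', StringCodePage(R));
  WriteLn('CP_UTF8 = ', CP_UTF8, ', CP_ACP = ', CP_ACP);
end.
```

Changing the declared result type to RawByteString and re-running gives the second variant of the test.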
Re: [fpc-devel] Encoded AnsiString
Jy V wrote:
> A quick note: the new LLVM Delphi compiler forbids the use of AnsiString and AnsiChar, (declared in the unit AnsiString.pas, you cannot use this unit anyway),

The compiler supports AnsiStrings, but these are hidden for *mobile* targets. There exists a hack to enable AnsiString support also for such targets, though.

DoDi
Re: [fpc-devel] Explanation about code page-aware AnsiStrings
Jonas Maebe wrote:
> Large parts of the recurring discussions about code page-aware AnsiStrings are related to the fact that many people don't know how they work. For this reason I've created an overview that explains the rules that are followed by the RTL/compiler at http://wiki.freepascal.org/FPC_Unicode_support (only Sections 1 to 3; 4 and later are older and mostly either incomplete or wishful thinking).

Thanks :-)

The chapter numbers are missing from the headings?

On my Win98 VM this page is not accessible:

  Error 403
  We're sorry, but we could not fulfill your request for /FPC_Unicode_support on this server.
  You do not have permission to access this server.
  Your technical support key is: 02f1-94ac-17f4-e8c8

What's wrong? On my Win8 machine the page and server are accessible, of course.

DoDi
Re: [fpc-devel] Explanation about code page-aware AnsiStrings
On Tue, Jan 7, 2014 at 12:57 PM, Jonas Maebe jonas.ma...@elis.ugent.be wrote:
> For this reason I've created an overview that explains the rules that are followed by the RTL/compiler at http://wiki.freepascal.org/FPC_Unicode_support

> it is best to save the source code in UTF-8 with a BOM.

Is this recommendation compatible with Delphi 7 and Delphi 2007? The idea is to be able to share the same code and the same units between Delphi and FPC.
Re: [fpc-devel] Explanation about code page-aware AnsiStrings
On 07 Jan 2014, at 15:54, Jy V wrote:
> On Tue, Jan 7, 2014 at 12:57 PM, Jonas Maebe jonas.ma...@elis.ugent.be wrote:
>> For this reason I've created an overview that explains the rules that are followed by the RTL/compiler at http://wiki.freepascal.org/FPC_Unicode_support
>
> it is best to save the source code in UTF-8 with a BOM.

This is the full quote: "it is best to include either an explicit {$codepage xxx} directive or save the source code in UTF-8 with a BOM" (which I've now clarified a bit on the wiki page). Both are equally good as far as FPC is concerned.

> Is this recommendation compatible with Delphi 7 and Delphi 2007?

Delphi doesn't know the {$codepage xxx} directive, but you can put it between {$ifdef fpc}. I don't know whether it automatically detects UTF-8-encoded files (or even handles them at all), especially these older versions. If it doesn't, you can still save your source code in any code page and add a {$codepage xxx} statement for FPC.

Additionally, I forgot to mention that you can also specify the used code page on the command line with FPC, via the -Fc parameter.

Jonas
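As an illustration of the advice above, a minimal sketch of a source file that declares its own encoding for FPC while hiding the directive from Delphi. The program name and the string literal are made up for the example, and the file is assumed to be saved as UTF-8; a Delphi build would additionally need its own console settings, which are not shown.

```pascal
program CodePageDemo;
{$ifdef FPC}
  {$mode objfpc}{$H+}
  {$codepage UTF8}  // tells FPC how the literals in this file are encoded
{$endif}

begin
  // A literal with non-ASCII characters; how it is interpreted depends on
  // the declared source code page.
  WriteLn('Grüße');
end.
```

Alternatively, the directive can be left out and the code page passed on the command line, e.g. `fpc -Fcutf8 codepagedemo.pas`.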
Re: [fpc-devel] Encoded AnsiString
On 01/07/2014 02:24 PM, Hans-Peter Diettrich wrote:
> The compiler supports AnsiStrings, but these are hidden for *mobile* targets.

Any reason for this strangeness?

-Michael
Re: [fpc-devel] Encoded AnsiString
On 01/07/2014 03:35 PM, Hans-Peter Diettrich wrote:
> What if FPC adds another encoding, similar to RawByteString ($), but without the Delphi quirks? Or simply fixes the RawByteString flaws in the *Ansi* compiler and RTL?

+1 (in fact I elaborated on that in some older threads here)

> 1) In a discussion in the Embarcadero groups it turned out that, in an assignment of a RawByteString to another AnsiString type, the Delphi compiler should (but does not) check and, if necessary, convert the string to the static encoding of the target. This is (almost) the only way to create strings with a different static and dynamic encoding.

+1

> 2) The stupid conversion to CP_ACP in an assignment *to* a RawByteString should be dropped. This applies in particular to the assignment to *function results*.
>
> 3) The function result type should be honored in functions accepting RawByteString parameters. The Delphi compiler seems to *assume* that the result of such functions is RawByteString, so that (including the aforementioned flaws) the outcome is a CP_ACP string, even if the declared function result is e.g. a UTF8String.

I offered some similar thoughts about how to define such a fully dynamically encoded string type, and how it can be implemented without harming performance even if TStrings and siblings (and the Lazarus API) in fact use it for the sake of flexibility and portability.

For Delphi compatibility this needs to be a new encoding type (say $FFFE), provided with some appropriate name.

-Michael
Re: [fpc-devel] Encoded AnsiString
On 1/7/2014 8:35 AM, Hans-Peter Diettrich wrote:
> Sorting (TStringList) should then ignore the dynamic encoding, i.e. work on a strictly binary (byte-by-byte) basis.

Why would that be desirable? If you sort a *string* list you'd expect it to do a string-based sort, and more than likely, a locale-based one at that.

--
Craig Peterson
Scooter Software
Re: [fpc-devel] Encoded AnsiString
>> The compiler supports AnsiStrings, but these are hidden for *mobile* targets.
>
> Any reason for this strangeness?

They're using the mobile compiler as an opportunity to break backwards compatibility and push the language in the directions they want to go. A single, 0-based string type, automatic reference counting, no "with", etc. Apparently the developers they're chasing after are too stupid to know how to deal with more than one string type.

--
Craig Peterson
Scooter Software
Re: [fpc-devel] Encoded AnsiString
On 07 Jan 2014, at 15:35, Hans-Peter Diettrich wrote:
> 1) In a discussion in the Embarcadero groups it turned out that, in an assignment of a RawByteString to another AnsiString type, the Delphi compiler should (but does not) check and, if necessary, convert the string to the static encoding of the target. This is (almost) the only way to create strings with a different static and dynamic encoding.
>
> 2) The stupid conversion to CP_ACP in an assignment *to* a RawByteString should be dropped. This applies in particular to the assignment to *function results*.

The conversion does not happen for all assignments, it only happens for concatenations that are assigned to RawByteString. And even then it doesn't always happen. Please read the wiki page I wrote (trying to prevent exactly this kind of wrong statement from being further repeated, and obviously failing). I even mentioned that we will probably add a way to change the behaviour in this specific case.

> 3) The function result type should be honored in functions accepting RawByteString parameters. The Delphi compiler seems to *assume* that the result of such functions is RawByteString, so that (including the aforementioned flaws) the outcome is a CP_ACP string, even if the declared function result is e.g. a UTF8String.
>
> Test case:
>
>   function conc(a,b: RawByteString): UTF8String;
>   begin
>     Result := a+b;
>   end;

This will always return CP_UTF8 on FPC. Does it really return CP_ACP on Delphi? Even if it does, I doubt we will change that. We couldn't even easily do that, because we don't know the static code pages of the strings that are concatenated inside the RTL routine that handles this.

> Then TStrings could be based on such RawByteStrings, without excess conversions or losses.

The problem with changing TStrings from AnsiString to RawByteString is not so much related to the behaviour of RawByteString, but more to descendant classes in existing third party (= user) source code that override methods using AnsiString parameters. We don't want to force everyone to rewrite their code so it uses RawByteString (if anything, RawByteString should probably be used as little as possible in user code, because always correctly dealing with all possible code pages is very hard).

> Sorting (TStringList) should then ignore the dynamic encoding, i.e. work on a strictly binary (byte-by-byte) basis.

Looking for just one second at the definition of the Sort methods of TStringList (and TStrings) would have prevented you from writing the above statement, which does not make any sense whatsoever (unless you want the compiler to start changing all code where a programmer passes a comparison function that does take code pages into account to the Sort methods of TStrings/TStringList).

Jonas
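For context on the sorting remark: TStringList already lets the caller supply the comparison, so a binary ordering is a matter of passing a suitable callback. The sketch below assumes FPC's Classes and SysUtils units; `ByteOrder` is a hypothetical callback that uses `CompareStr` as an ordinal, case-sensitive comparison and does no code page handling of its own.

```pascal
program SortDemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical callback matching the TStringListSortCompare signature.
// It compares the two entries by their character codes only.
function ByteOrder(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := CompareStr(List[Index1], List[Index2]);
end;

var
  L: TStringList;
  I: Integer;
begin
  L := TStringList.Create;
  try
    L.Add('b');
    L.Add('B');
    L.Add('a');
    L.CustomSort(@ByteOrder);  // the caller decides the ordering
    for I := 0 to L.Count - 1 do
      WriteLn(L[I]);
  finally
    L.Free;
  end;
end.
```

Because the ordering lives in user code, a locale-aware callback can be substituted without touching the list class.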
Re: [fpc-devel] Encoded AnsiString
On Tue, 7 Jan 2014, Craig Peterson wrote:
>>> The compiler supports AnsiStrings, but these are hidden for *mobile* targets.
>>
>> Any reason for this strangeness?
>
> They're using the mobile compiler as an opportunity to break backwards compatibility and push the language in the directions they want to go. A single, 0-based string type, automatic reference counting, no "with", etc. Apparently the developers they're chasing after are too stupid to know how to deal with more than one string type.

Well, that is about 90% of all developers.

Ever talk to a PHP, Javascript or VB developer? You can usually start by explaining the meaning of 'bit' and 'byte', and work your way up from there.

Michael.
Re: [fpc-devel] Encoded AnsiString
On 01/07/2014 04:24 PM, Craig Peterson wrote:

> They're using the mobile compiler as an opportunity to break backwards compatibility and push the language in the directions they want to go. A single, 0-based string type, automatic reference counting, no "with", etc. Apparently the developers they're chasing after are too stupid to know how to deal with more than one string type.

I hope fpc/Lazarus will not follow Delphi on that path.

- There have been decent discussions in multiple forums showing that classes with automatic reference counting are dangerous.
- Castrating the strings in that way (supposedly forcing UTF16) seems very inappropriate.

In fact I do like zero-based strings. But this breaks any backwards compatibility, so it can't be decently considered.

In fact I do like dropping "with". But this breaks any backwards compatibility. So I don't complain but just don't use it. (There is a decent way to implement qualified "with" blocks, but I don't think it is worth the effort.) In the end, the compiler should be smart enough to create the same code whether or not "with" is used.

-Michael
Re: [fpc-devel] Encoded AnsiString
On 01/07/2014 04:33 PM, Jonas Maebe wrote:

> but more regarding descendent classes in existing third party (= user) source code that override methods using AnsiString parameters.

Automatic type conversion should be able to handle this (as the new type would preserve, and thus dynamically know, the type of the information it holds).

A more severe problem with third parties is that they might derive a new class from TStrings. I did not decently check what happens to legacy code in that case. I do hope this can be handled automatically in most cases.

-Michael
Re: [fpc-devel] Encoded AnsiString
On 07.01.2014 17:00, Michael Schnell mschn...@lumino.de wrote:

> - there have been decent discussions in multiple forums showing that classes with automatic reference counting are dangerous

I like the idea of ARC classes. In fact I'm trying to develop an implementation that is fully backwards compatible (currently only in my head though :P )

> In fact I do like zero based strings. But this breaks any backwards compatibility so it can't be decently considered.

They can be switched on using a compiler directive (AFAIK $ZeroBasedStrings On/Off) in both current Delphi and FPC (2.7.1).

> In fact I do like dropping with. But this breaks any backwards compatibility. So I don't complain but just don't use it. (There is a decent way to implement qualified with blocks, but I don't think it is worth the effort.) In the end, the compiler should be smart enough to create the same code whether or not with is used.

It would be possible to implement a modeswitch that switches off "with". Regarding "same code": that will not be true if the expression inside the with-header is not free of side effects.

Regards, Sven
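The remark about side effects in the with-header can be shown with a small sketch. It assumes FPC in objfpc mode; TPair, GetPair and the Calls counter are hypothetical names made up for this illustration. The with-header expression is evaluated once, while the fully qualified form evaluates it once per access, so the two versions are not equivalent when the expression has a side effect.

```pascal
program withsideeffect;

{$mode objfpc}{$H+}

type
  TPair = record
    A, B: Integer;
  end;
  PPair = ^TPair;

var
  Pair: TPair;
  Calls: Integer = 0;

// Hypothetical accessor with a side effect: it counts its own calls.
function GetPair: PPair;
begin
  Inc(Calls);
  Result := @Pair;
end;

begin
  // The header expression GetPair()^ is evaluated once for the whole block.
  with GetPair()^ do
  begin
    A := 1;
    B := 2;
  end;
  WriteLn('with: ', Calls, ' call(s)');

  // Without "with", the expression is evaluated for each access.
  Calls := 0;
  GetPair()^.A := 1;
  GetPair()^.B := 2;
  WriteLn('qualified: ', Calls, ' call(s)');
end.
```

The first line reports one call and the second reports two. To get the behaviour of the with block without using "with", the pointer would have to be stored in a local variable first.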
Re: [fpc-devel] Explanation about code page-aware AnsiStrings
On 07.01.2014 14:34, Jonas Maebe wrote:

> On 07 Jan 2014, at 14:13, Mattias Gaertner wrote:
>
>> What is this crap: http://wiki.freepascal.org/FPC_Unicode_support#FPC_Unicode_support ?
>
> It's under the header "Old/obsolete sections" and, as mentioned above, that's incomplete or wishful thinking. I didn't want to delete any existing content for now, but yes, what's mentioned there but not in the current Sections 1 to 3 is not going to happen.

Maybe it could be moved to the discussions page?
Re: [fpc-devel] Explanation about code page-aware AnsiStrings
On 07 Jan 2014, at 21:41, Florian Klämpfl wrote:

> On 07.01.2014 14:34, Jonas Maebe wrote:
>
>> It's under the header "Old/obsolete sections" and, as mentioned above, that's incomplete or wishful thinking. I didn't want to delete any existing content for now, but yes, what's mentioned there but not in the current Sections 1 to 3 is not going to happen.
>
> Maybe it could be moved to the discussions page?

BigChimp (Reinier?) has already done that in the meantime.

Jonas
Re: [fpc-devel] Encoded AnsiString
Jonas Maebe wrote:

> On 07 Jan 2014, at 15:35, Hans-Peter Diettrich wrote:
>
>> 2) The stupid conversion to CP_ACP in an assignment *to* a RawByteString should be dropped. This applies in particular to the assignment to *function results*.
>
> The conversion does not happen for all assignments; it only happens for concatenations that are assigned to RawByteString. And even then it doesn't always happen. Please read the wiki page I wrote (trying to prevent exactly this kind of wrong statement from being repeated further, and obviously failing).

I've tested the behaviour, and it appears not only in assignments to RawByteStrings. See test case below.

>> Test case:
>>
>>   function conc(a,b: RawByteString): UTF8String;
>>   begin
>>     Result := a+b;
>>   end;
>
> This will always return CP_UTF8 on FPC. Does it really return CP_ACP on Delphi? Even if it does, I doubt we will change that.

This leads me back to my previous statement: it will be simpler to do things right than to try to achieve compatibility with *all* Delphi flaws. In particular when the Delphi flaws have never been documented...

> We couldn't even easily do that, because we don't know the static code pages of the strings that are concatenated inside the RTL routine that handles this.

Right! Only the compiler can do that, and therefore the compiler should do it right.

>> Then TStrings could be based on such RawByteStrings, without excess conversions or losses.
>
> The problem with changing TStrings from AnsiString to RawByteString is not so much related to the behaviour of RawByteString, but more to descendant classes in existing third-party (= user) source code that override methods using AnsiString parameters. We don't want to force everyone to rewrite their code so it uses RawByteString (if anything, RawByteString should probably be used as little as possible in user code, because always correctly dealing with all possible code pages is very hard).

Right *sigh*

>> Sorting (TStringList) eventually should ignore the dynamic encoding, i.e. work on a strictly binary (byte-by-byte) base.
>
> Looking for just one second at the definition of the Sort methods of TStringList (and TStrings) would have prevented you from writing the above statement, which does not make any sense whatsoever (unless you want the compiler to start changing all code where a programmer passes a comparison function that does take code pages into account to the Sort methods of TStrings/TStringList).

Fine that you took the bait ;-)

DoDi
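For anyone who wants to check the disputed test case on their own compiler, it can be wrapped in a small program that prints the dynamic code page of the result. This is only a sketch, assuming an FPC version with code-page-aware AnsiStrings (2.7.1 or later), where StringCodePage and CP_UTF8 come from the System unit; the variable names are made up for the illustration. No output is claimed here: according to Jonas's statement in the thread the two numbers should match on FPC, and the open question is what Delphi prints.

```pascal
program conctest;

{$mode objfpc}{$H+}

// The test case from the thread, unchanged apart from layout.
function conc(a, b: RawByteString): UTF8String;
begin
  Result := a + b;
end;

var
  s1: UTF8String;
  s2: AnsiString;
  r: UTF8String;
begin
  s1 := 'abc';
  s2 := 'def';
  // Concatenate two strings with different static code pages.
  r := conc(s1, s2);
  // StringCodePage reports the dynamic code page stored in the string header.
  WriteLn('dynamic code page of result: ', StringCodePage(r));
  WriteLn('CP_UTF8 = ', CP_UTF8);
end.
```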