Re: [fpc-devel] Delphi incompatible encoding
On Tue, December 2, 2014 08:31, Hans-Peter Diettrich wrote:
> Jonas Maebe wrote:
>> To get behaviour that is compatible with Delphi2009+, compile with
>> -Mdelphiunicode or {$modeswitch delphiunicode}.
>
> Do you mean {$mode delphiunicode}?
>
> Now I wonder about compilation at all. When I compile a console program
> on the command line, most strings are readable in the console (see
> previous answer). But when I compile using Lazarus, all strings
> (including UnicodeString!) are shown in unreadable UTF-8 encoding,
> regardless of $mode :-(
>
> What causes this difference, and how to make strings readable in a
> (Lazarus compiled) console application?
>
> Forgot to mention: everything on WinXP.

Probably best to ask about the wrong behaviour with Lazarus on a Lazarus list?

Otherwise: In what format (encoding) is your source file? Unless it is UTF-8 with a BOM, FPC decodes it according to the -Fc parameter, and Lazarus may pass a different setting for this option. In addition, it might be related to Lazarus playing with Default*SystemCodePage, which may not work well with a console using a different encoding, but that is just a wild guess that would need to be checked by someone who really knows what Lazarus does there...

Tomas
___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
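For readers following the thread, the symptom described above (UTF-8 bytes shown "unreadable" in a console) can be reproduced outside of FPC. The following is only an illustrative Python sketch, not FPC or Lazarus code; the sample word and the choice of cp1252 and cp850 as "typical" Windows ANSI and console code pages are assumptions made for the example.

```python
# Illustrative only: the same UTF-8 bytes viewed through different code pages.
text = "Grüße"                       # sample literal with non-ASCII characters
utf8_bytes = text.encode("utf-8")    # what a UTF-8 source file or string holds

# A console (or compiler) that assumes another code page misreads the bytes.
as_cp1252 = utf8_bytes.decode("cp1252")  # assumed Windows ANSI code page
as_cp850 = utf8_bytes.decode("cp850")    # assumed Western European console code page

# Decoding with the encoding that was actually used round-trips cleanly.
round_trip = utf8_bytes.decode("utf-8")

print(repr(as_cp1252), repr(as_cp850), repr(round_trip))
```

The point of the sketch is only that the bytes are intact in every case; what differs is which code page the reader of those bytes assumes.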
Re: [fpc-devel] Delphi incompatible encoding
Tomas Hajny wrote:
> On Tue, December 2, 2014 08:31, Hans-Peter Diettrich wrote:
>> When I compile a console program on the command line, most strings are
>> readable in the console (see previous answer). But when I compile using
>> Lazarus, all strings (including UnicodeString!) are shown in unreadable
>> UTF-8 encoding, regardless of $mode :-(
>
> Probably best to ask about the wrong behaviour with Lazarus on a
> Lazarus list?

It really seems to be a Lazarus problem. When compiled from a PAS file, the behaviour is the same as with FPC. The bad encoding is used when compiling from an LPR file (LPI project).

Thanks
DoDi
Re: [fpc-devel] Delphi incompatible encoding
Mattias Gaertner wrote:
> On Tue, 02 Dec 2014 04:05:59 +0100 Hans-Peter Diettrich drdiettri...@aol.com wrote:
>
> Many things affect string literals: source codepage, system codepage,
> string type, DefaultSystemCodePage, library, compiler version. I started
> a table for UTF-8 literals:
> http://wiki.lazarus.freepascal.org/Character_and_string_types#String_constants

Thanks, after some reading I changed the source file encoding, and both UTF-8 with BOM and Ansi provide correct results. The Lazarus default (UTF-8 without BOM) is not usable on Windows :-(

DoDi
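The rule mentioned earlier in the thread (a source file is treated as UTF-8 only if it starts with a BOM, otherwise a configured or default code page applies) can be modelled in a few lines. This is a hedged Python sketch of that decision, not the compiler's actual implementation; `pick_source_encoding` and its `fallback` parameter are hypothetical names, and cp1252 merely stands in for whatever -Fc or the system default would select.

```python
import codecs

def pick_source_encoding(raw: bytes, fallback: str = "cp1252") -> str:
    """Choose a codec: UTF-8 if the data starts with a BOM, else the fallback.

    'fallback' is a stand-in for the code page the compiler would otherwise
    assume (an assumption made for this sketch).
    """
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"   # this codec also strips the BOM while decoding
    return fallback

with_bom = codecs.BOM_UTF8 + "ä".encode("utf-8")
without_bom = "ä".encode("utf-8")

# With a BOM the literal survives; without one it is read as two ANSI chars.
decoded_with_bom = with_bom.decode(pick_source_encoding(with_bom))
decoded_without_bom = without_bom.decode(pick_source_encoding(without_bom))
print(repr(decoded_with_bom), repr(decoded_without_bom))
```

Under these assumptions, the BOM-less file yields a two-character garbled literal, which matches the observation that UTF-8 with BOM and plain Ansi sources both behaved correctly while UTF-8 without BOM did not.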
Re: [fpc-devel] Option -Wp does not work with new embedded target
You can find a lot of information on CMSIS here:
http://www.arm.com/products/processors/cortex-m/cortex-microcontroller-software-interface-standard.php

To download the svd files you need to create a free account @arm.com; then you can download lots of svd files for all major chips. If you have a license for the Keil IDE, it is easier still: they provide a tool that simplifies downloading the files (and sometimes the files are a little more up-to-date).

My first attempt was to convert the .h files to .pp (I still have some programs to do that). The problem there is that the header files are sometimes incomplete; the definitions for the bits in the registers are often missing.

The .svd files are XML files that are quite easy to parse, and they usually contain all the information at the bit level. And the nice thing is that no cleanup is necessary ;-)

Michael

On 01.12.14 at 20:33, Sietse Achterop wrote:
> Hello list,
>
> @Florian: thanks for finding my error. I saw that something was case
> insensitive, but not in this way(: it now works!
>
> @Michael:
> On 11/30/2014 08:14 PM, Michael Ring wrote:
>> Please download my diff here: http://temp.michael-ring.org/fpc-arm.diff
>> Please have a look at the rtl-files I provide (and tell me if you like
>> the way I created them), they are automagically created out of the
>> CMSIS sources provided by ARMST.
>
> I had a short look at them. It looks clean. But I am curious: how did
> you create them? I also started from the source from ARMST. I used
> programs that I found on the Internet, like h2pas and 2 versions of
> c2pas. But they only partly did the job, so quite some manual work was
> still needed to get it to compile. And it still needs a lot of cleaning
> up.
>
> Once I have it working properly, I want to make the ST libraries for
> the standard I/O and USB available. I think I will not try to translate
> them into Pascal, but use the ST libraries directly from C. (I use
> STM32_USB-Host-Device_Lib_V2.1.0.)
>
> I'll keep you informed.
>
> Sietse
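To give an idea of why the .svd route is attractive, here is a minimal Python sketch that walks an SVD-style XML document and lists registers with their bit fields. It is not the generator used for the rtl files mentioned above. The sample device is made up, and the sketch assumes fields are described with `bitOffset`/`bitWidth`; real SVD files may describe bit positions differently and carry much more data.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the style of a CMSIS-SVD file (not a real device).
SVD_SAMPLE = """<device>
  <name>DEMO</name>
  <peripherals>
    <peripheral>
      <name>GPIOA</name>
      <baseAddress>0x40020000</baseAddress>
      <registers>
        <register>
          <name>MODER</name>
          <addressOffset>0x0</addressOffset>
          <fields>
            <field><name>MODER0</name><bitOffset>0</bitOffset><bitWidth>2</bitWidth></field>
            <field><name>MODER1</name><bitOffset>2</bitOffset><bitWidth>2</bitWidth></field>
          </fields>
        </register>
      </registers>
    </peripheral>
  </peripherals>
</device>"""

def list_fields(svd_text):
    """Return (peripheral, register, address, field, bit offset, bit width) rows."""
    root = ET.fromstring(svd_text)
    rows = []
    for periph in root.iter("peripheral"):
        base = int(periph.findtext("baseAddress"), 0)   # base 0 accepts "0x..." and decimal
        for reg in periph.iter("register"):
            addr = base + int(reg.findtext("addressOffset"), 0)
            for field in reg.iter("field"):
                rows.append((periph.findtext("name"), reg.findtext("name"), addr,
                             field.findtext("name"),
                             int(field.findtext("bitOffset"), 0),
                             int(field.findtext("bitWidth"), 0)))
    return rows

for row in list_fields(SVD_SAMPLE):
    print(row)
```

From rows like these, a generator could emit Pascal record and constant declarations; how the actual rtl units are laid out is not shown here.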
Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support
On 11/28/2014 09:15 PM, Hans-Peter Diettrich wrote:
> You suggested to use string as UTF-16 on Windows, and UTF-8 on Linux.
> That's what I understand as a unique program-wide string representation
> (not sourcecode-wide, instead program as *compiled*). Then I cannot see
> any need or use for another DynamicString type.

I did already understand your meaning, and I understand that this unique program-wide string representation is better than having the libraries' APIs (including TStrings) force a fixed string encoding brand, independently of the OS we compile for (and of selectable $mode specifications). But I don't *suggest* this way, as it is not very versatile and hampers portability. As said, I *suggest* using DynamicString in such cases. Nonetheless, the types simply called String might be done in the way you suggest.

> Nothing can be broken, as long as the Delphi behaviour is undefined.

That of course is correct, but it just follows the poor excuse Embarcadero offers for the flawed implementation of RawByteString (which, as we both agree, will never be fixed). (In fact there are many instances where old flaws have been deliberately reproduced so as not to break compatibility.)

> Applied to FPC/Lazarus code (compiler, libraries, IDE...) this means
> that it's obviously easier to *prevent* possibly different
> static/dynamic encodings, instead of *checking and reacting* on such
> flaws throughout the entire codebase.

OK. Kill the type RawByteString and the constant CP_NONE and the usability of its value $. I do vote for doing so and instead providing new types such as ByteString, WordString, DWordString, and QWordString, denoted by the constants CP_Byte = $FF01, CP_Word = $FF02, CP_DWord = $FF04, CP_QWord = $FF08.

> Apart from that, every encoding-tolerant code will execute much slower
> than code without a need for checks and conversions everywhere.

As I pointed out, I don't agree at all.
- The check is only two ASM instructions.
- It does not result in additional conversions. In fact, in appropriate cases it can avoid a huge number of conversions (especially when calling libraries, e.g. by means of TStrings).
- In pure user code, the check is only done if DynamicString really is used in the user code, hence only when the user knows what to do. In fact, typical degradation = 0 %.
- When calling libraries (e.g. via TStrings), the check is very small compared with the function call that is done as a result of the same statement. Estimated typical degradation = 0.01 %.

So the Checking Overhead is nothing but a rumor. (Remember, I don't suggest dropping the standard statically typed paradigm altogether, as tight loops of course work best in that way.)

>> That is why fpc would need to define an additional type name (e.g.
>> DynamicString) and encoding brand number (e.g. CP_ANY = $FF00) for a
>> decently usable type for intermediately holding a String content.
>
> This again would make *FPC* programs incompatible with Delphi.

As I explained, this would not break any backwards compatibility, even if TStrings uses this type.
- The new type is purely additional, so its mere existence can't break anything: you don't need to use it in user code if you don't want to.
- The use of DynamicString in the interface of library functions does not break anything, as it is (to be) constructed in a way that provides full compatibility.

Please do show any code (not containing RawByteString) that is not compatible when using the DynamicString paradigm as described in http://wiki.freepascal.org/not_Delphi_compatible_enhancement_for_Unicode_Support#Analysis . Maybe the page needs to be improved.

> While fixing the RawByteString flaw would at least allow to *compile*
> FPC code with Delphi, the use of an different encoding value would
> definitely prevent compilation of such code with Delphi. What's the
> more serious incompatibility?

IMHO this would be much more dangerous than introducing a decently working new DynamicString type.

> RawXxxString can be used for really uncoded data as done with old-style
> strings in a lot of applications. Such a feature would be appreciated
> by many users, indeed :-)

While I would happily follow you in suggesting that indecent use of this type be made impossible via the fpc compiler, I don't think it's very dangerous to re-introduce the abysmal Delphi-compatible behavior of RawByteString (the documented as well as the undocumented features).

But why do you say "would be appreciated"? Is it not possible to use RawByteString in the way the name suggests, by never bringing it together with any String variable of a different encoding brand, and hence avoiding any conversion, whether intentional, documented, useful, or not?

Anyway: I added a sentence in the introduction of the wiki page, explaining the paradigm a little more explicitly.

-Michael
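As a reading aid for the DynamicString proposal, the following Python sketch models the check being argued about: an assignment converts only when the target has a fixed encoding that differs from the value's dynamic encoding, and never when the target accepts any encoding. This is purely a model of the idea as described in this thread, not FPC code; `TaggedString`, `assign`, and the codec table are hypothetical, and CP_ANY = $FF00 is the value proposed in the message, not an existing FPC constant.

```python
from dataclasses import dataclass

CP_1252 = 1252
CP_UTF8 = 65001
CP_ANY = 0xFF00   # value proposed in the thread; not an existing FPC constant

CODECS = {CP_1252: "cp1252", CP_UTF8: "utf-8"}

@dataclass
class TaggedString:
    """Hypothetical stand-in for a string payload plus its dynamic encoding."""
    data: bytes
    codepage: int

def assign(value: TaggedString, target_codepage: int) -> TaggedString:
    """Model of an assignment to a variable with the given static encoding."""
    # The check: an "any encoding" target, or a matching one, needs no work.
    if target_codepage == CP_ANY or value.codepage == target_codepage:
        return value
    # Otherwise convert to the target's static encoding.
    text = value.data.decode(CODECS[value.codepage])
    return TaggedString(text.encode(CODECS[target_codepage]), target_codepage)

s = TaggedString("é".encode("utf-8"), CP_UTF8)
kept = assign(s, CP_ANY)         # same object, no conversion
converted = assign(s, CP_1252)   # converted to the single byte 0xE9
print(kept is s, converted)
```

Whether such a check costs "two ASM instructions" in a real RTL, and how often it leads to a conversion in practice, is exactly the open question in this discussion; the sketch takes no position on that.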
Re: [fpc-devel] Delphi incompatible encoding
On Tue, 02 Dec 2014 11:32:13 +0100 Hans-Peter Diettrich drdiettri...@aol.com wrote:
> Mattias Gaertner wrote:
>> On Tue, 02 Dec 2014 04:05:59 +0100 Hans-Peter Diettrich drdiettri...@aol.com wrote:
>>
>> Many things affect string literals: source codepage, system codepage,
>> string type, DefaultSystemCodePage, library, compiler version. I
>> started a table for UTF-8 literals:
>> http://wiki.lazarus.freepascal.org/Character_and_string_types#String_constants
>
> Thanks, after some reading I changed the source file encoding, and both
> UTF-8 with BOM and Ansi provide correct results. The Lazarus default
> (UTF-8 without BOM) is not usable on Windows :-(

You need to add conversions. With the new DefaultSystemCodePage many of them are no longer needed.

Mattias
Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support
On 12/02/2014 01:05 PM, Michael Schnell wrote:
> But why do you say "would be appreciated"? Is it not possible to use
> RawByteString in the way the name suggests, by never bringing it
> together with any String variable of a different encoding brand, and
> hence avoiding any conversion, whether intentional, documented, useful,
> or not?

Of course you can't use any TStrings descendant (such as TStringList) in such code because, as in Delphi, TStrings is based on a statically typed String brand. This would be made possible by introducing DynamicString and using this type for TStrings and friends.

-Michael
Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support
On 11/29/2014 07:55 AM, Jonas Maebe wrote:
> Exactly the same goes for converting strings with code page CP_NONE to
> a different code page: your program is broken when it tries to do that,

Accessing an array beyond its bounds is not detectable at compile time, and with range checking switched off it is technically not detectable at runtime either, so *undefined* behaviour cannot be avoided there. In contrast, the attempt to convert strings with code page CP_NONE to a different code page is easily detectable by the compiler, as we have predefined string type brands here.

Thus, if the outcome is *defined* *to* *be* *undefined*, it can and should result in a compiler error message.

-Michael
Re: [fpc-devel] RFC: proper interpretation and implementation of Unicode Support
On 11/28/2014 08:19 PM, Hans-Peter Diettrich wrote:
> In that discussion I found several errors, which are not detected by
> the compiler nor handled in the RTL. In the concrete entry the illegal
> use of the *generic* CP_NONE identifier is mentioned. That's why I felt
> a need to address several specific topics in above draft.

Yep. You can't have a type brand whose encoding is both static and dynamic. This is what causes the complete mess introduced by RawByteString (in Delphi and in fpc).

So IMHO the only way to go is to suggest to the users (or force them) to use the type RawByteString (i.e. CP_NONE) exactly as the name suggests: no encoding brand is known, so it can't be auto-converted to any other encoding, and it can't preserve the encoding of anything that is assigned to it.

That said, we would then no longer have any (pseudo-)dynamically encoded type, and hence the encoding-type (and element-size) field in the string header would no longer make any sense and could be dropped altogether.

But as the implementation (in Delphi and) in fpc already provides encoding-type and element-size fields, I suggest using them for an additional, decently dynamic type DynamicString (CP_ANY = $FF00), which (IMHO) can be introduced without breaking any compatibility or introducing any noticeable performance degradation, and which allows for writing versatile code (including standard library APIs).

-Michael
Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support
Michael Schnell wrote:
> On 11/28/2014 09:15 PM, Hans-Peter Diettrich wrote:
>> Apart from that, every encoding-tolerant code will execute much slower
>> than code without a need for checks and conversions everywhere.
>
> As I pointed out, I don't agree at all.
> - The check is only two ASM instructions.
> - It does not result in additional conversions.

It does, e.g. when searching or sorting a StringList that can contain strings of different encodings. The choice of a unique encoding for application strings (maybe CP_ACP, UTF-8 or UTF-16) eliminates such conversions.

> So the Checking Overhead is nothing but a rumor. (Remember, I don't
> suggest dropping the standard statically typed paradigm altogether, as
> tight loops of course work best in that way.)

The rumor is the unimportant Conversion Overhead, i.e. how often a check leads to a conversion. When no check is required, conversions consequently cannot occur at all.

>> RawXxxString can be used for really uncoded data as done with
>> old-style strings in a lot of applications. Such a feature would be
>> appreciated by many users, indeed :-)
>
> But why do you say "would be appreciated"? Is it not possible to use
> RawByteString in the way the name suggests, by never bringing it
> together with any String variable of a different encoding brand, and
> hence avoiding any conversion, whether intentional, documented, useful,
> or not?

RawByteString cannot serve two different purposes :-(

In *Delphi* it is used as a polymorphic string, capable of *holding* actual strings of any encoding. But when assigned to a variable of a different encoding, a conversion may occur that converts the string into the declared (static) encoding of the target variable.

In *FPC* it currently is used in a way somewhat close to your idea, i.e. no conversion occurs in an assignment either *to* or *from* a RawByteString and some other AnsiString. We can only *hope* that *all* AnsiString operations are based on the dynamic encoding of every operand, with the corresponding checks and conversions inserted everywhere. This actually is not true, because the compiler relies on the static encoding of AnsiString variables, and inserts checks and conversions only when that encoding is different. Actually a single AnsiString type would be sufficient, because it already can hold data of any encoding :-(

I understand the FPC attempt to allow *at the same time* for the new (encoded) and the old (unencoded) AnsiString behaviour, where no automatic conversions are allowed. But this would require, at the same time, that e.g. all string literals *also* are stored in that (immutable) encoding, and that this encoding can *not* be changed at runtime, while DefaultSystemCodePage *can* be changed.

When the result of a conversion of a string of encoding CP_NONE is undefined, which of course is correct for the *dynamic* encoding, this could simply be changed into "conversions of CP_NONE strings do nothing". Then CP_NONE would be the perfect encoding for old-style AnsiStrings, with the only remaining problem being string expressions and assignments in which the operands have a different dynamic encoding. In these cases all operands would have to be converted into the CP_NONE encoding, as specified in another DefaultNoneEncoding constant (not variable!); the same encoding would apply in assignments *to* variables of a different encoding. Then all type aliases for AnsiStrings must also have unique names, which allow one to distinguish e.g.

  type UTF8String = AnsiString;

from

  type NewUTF8String = type AnsiString(CP_UTF8);

DoDi
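The objection about searching or sorting a list that holds strings of different encodings can be made concrete with a small Python model. It is only an illustration of the argument, not of any RTL behaviour; the `(bytes, codepage)` tuples and the `to_text` helper are hypothetical, and cp1252 and UTF-8 are chosen arbitrarily as the two encodings.

```python
CODECS = {1252: "cp1252", 65001: "utf-8"}

def to_text(data: bytes, codepage: int) -> str:
    """Convert a (bytes, codepage) pair to a common form for comparison."""
    return data.decode(CODECS[codepage])

# A list whose entries carry different dynamic encodings (hypothetical model).
items = [
    ("é".encode("cp1252"), 1252),
    ("é".encode("utf-8"), 65001),
    ("a".encode("utf-8"), 65001),
]

# The first two entries are the same text, but their raw bytes differ ...
raw_equal = items[0][0] == items[1][0]
# ... so comparing or sorting them correctly requires a conversion per entry.
text_equal = to_text(*items[0]) == to_text(*items[1])
sorted_texts = sorted(to_text(data, cp) for data, cp in items)

print(raw_equal, text_equal, sorted_texts)
```

If every entry were stored in one agreed encoding, the raw bytes could be compared for equality directly, which is the saving the message attributes to choosing a unique encoding for application strings.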