Re: [fpc-devel] Delphi incompatible encoding

2014-12-02 Thread Tomas Hajny
On Tue, December 2, 2014 08:31, Hans-Peter Diettrich wrote:
 Jonas Maebe schrieb:

 To get behaviour that is compatible with Delphi2009+, compile with
 -Mdelphiunicode or {$modeswitch delphiunicode}.

 Do you mean {$mode delphiunicode}?

 Now I wonder about compilation at all.
 When I compile a console program on the commandline, most strings are
 readable in the console (see previous answer). But when I compile using
 Lazarus, all strings (including UnicodeString!) are shown in unreadable
 UTF-8 encoding, regardless of $mode :-(

 What causes this difference, and how to make strings readable in a
 (Lazarus compiled) console application?

 Forgot to mention: everything on WinXP.

Probably best to ask about the wrong behaviour with Lazarus on a Lazarus
list? Otherwise: in what format (encoding) is your source file? Unless
it's UTF-8 with a BOM, FPC decodes it according to the -Fc parameter, and
Lazarus may pass a different setting for this option. In addition, it might
be related to Lazarus playing with Default*SystemCodePage, which may not
work well with a console using a different encoding, but that is just a wild
guess which would need to be checked by someone who really knows what
Lazarus does there...
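
To make the point about the source encoding concrete, here is a minimal sketch, assuming FPC 3.0 or later and a source file saved as UTF-8 without a BOM. The {$codepage utf8} directive is the in-source counterpart of -Fcutf8; the program name and the literal are made up for illustration.

```pascal
program literaldemo;
{$mode objfpc}{$H+}
{$codepage utf8}   // declare the source encoding; counterpart of -Fcutf8

var
  u: UnicodeString;
begin
  // With the source encoding declared, the compiler decodes the literal
  // at compile time and stores three UTF-16 code units.
  u := 'äöü';
  if Length(u) <> 3 then
  begin
    WriteLn('literal was not decoded as UTF-8');
    Halt(1);
  end;
  WriteLn('literal holds ', Length(u), ' UTF-16 code units');
end.
```

Whether the console then displays the characters correctly is a separate question that depends on the console code page.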

Tomas


___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Delphi incompatible encoding

2014-12-02 Thread Hans-Peter Diettrich

Tomas Hajny schrieb:

On Tue, December 2, 2014 08:31, Hans-Peter Diettrich wrote:



When I compile a console program on the commandline, most strings are
readable in the console (see previous answer). But when I compile using
Lazarus, all strings (including UnicodeString!) are shown in unreadable
UTF-8 encoding, regardless of $mode :-(



Probably best to ask about the wrong behaviour with Lazarus on a Lazarus
list?


It really seems to be a Lazarus problem. Compiled from a PAS file, the 
behaviour is the same as with FPC. The bad encoding is used when compiling 
from an LPR file (LPI project).


Thanks
DoDi



Re: [fpc-devel] Delphi incompatible encoding

2014-12-02 Thread Hans-Peter Diettrich

Mattias Gaertner schrieb:

On Tue, 02 Dec 2014 04:05:59 +0100
Hans-Peter Diettrich drdiettri...@aol.com wrote:



Many things affect string literals. Source codepage, system codepage,
string type, defaultsystemcodepage, library, compiler version.

I started a table for UTF-8 literals:
http://wiki.lazarus.freepascal.org/Character_and_string_types#String_constants


Thanks, after some reading I changed the sourcefile encoding, and both 
UTF8bom and Ansi provide correct results. The Lazarus default (UTF-8 
without BOM) is not usable on Windows :-(


DoDi



Re: [fpc-devel] Option -Wp does not work with new embedded target

2014-12-02 Thread Michael Ring

you can find a lot of information on CMSIS here:

http://www.arm.com/products/processors/cortex-m/cortex-microcontroller-software-interface-standard.php

To download the svd-files you need to create a free account @arm.com, 
then you can download lots of svd files for all major chips.


It is even easier when you have a license for the Keil IDE: they provide 
a tool that simplifies downloading the files (and sometimes the files 
are a little more up-to-date).


My first attempt was to convert the .h files to .pp (I still have some 
programs to do that). The problem there is that the header files are 
sometimes incomplete: the definitions for the bits in the registers are 
often missing.


The .svd files are xml files that are quite easy to parse and they 
usually contain all the information on the bit level. And the nice thing 
is that no cleanup is necessary ;-)


Michael

Am 01.12.14 um 20:33 schrieb Sietse Achterop:

Hello list,

@Florian: thanks for finding my error. I saw that something was case 
insensitive, but not in this way(:

  it now works!

@Michael:

On 11/30/2014 08:14 PM, Michael Ring wrote:

Please download my diff here:

http://temp.michael-ring.org/fpc-arm.diff

Please have a look at the rtl-files I provide (and tell me if you 
like the way I created them) , they are automagically created out of 
the CMSIS sources provided by ARMST.



I had a short look at them. It looks clean.
But I am curious: how did you create them? I also started from the 
source from ARMST. I used programs that I found on the Internet, like 
h2pas and 2 versions of c2pas.
But they only partly did the job, so quite some manual work was still 
needed to get it to compile.

And it still needs a lot of cleaning up.
If I have it properly working, I want to make the ST-libraries for the 
standard I/O and USB available. I think I will not try to translate them 
into Pascal, but use the ST-libraries directly from C.
(I use STM32_USB-Host-Device_Lib_V2.1.0)

  I'll keep you informed.
  Sietse






Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-12-02 Thread Michael Schnell

On 11/28/2014 09:15 PM, Hans-Peter Diettrich wrote:


You suggested to use string as UTF-16 on Windows, and UTF-8 on 
Linux. That's what I understand as a unique program-wide string 
representation (not sourcecode-wide, instead program as *compiled*). 
Then I cannot see any need or use for another DynamicString type.

I already understood your meaning, and I understand that this 
unique program-wide string representation is better than having the 
libraries' APIs (including TStrings) force a fixed string encoding 
brand, independently of the OS we compile for (and of selectable $mode 
specifications). But I don't *suggest* this way, as it is not very 
versatile and hampers portability. As said, I *suggest* using 
DynamicString in such cases. Nonetheless, the types simply called 
String might be done in the way you suggest.


Nothing can be broken, as long as the Delphi behaviour is undefined. 

That of course is correct, but it just follows the poor excuse 
Embarcadero offers for the flawed implementation of RawByteString 
(which, as we both agree, will never be fixed). (In fact there are many 
instances where old flaws have been deliberately reproduced in order 
not to break compatibility.)


Applied to FPC/Lazarus code (compiler, libraries, IDE...) this means 
that it's obviously easier to *prevent* possibly different 
static/dynamic encodings, instead of *checking and reacting* on such 
flaws throughout the entire codebase. 

OK. Kill the type RawByteString and the constant CP_NONE and the 
usability of its value $. I do vote for doing so and for providing 
instead new types such as ByteString, WordString, DWordString, and 
QWordString, denoted by the constants CP_Byte = $FF01, CP_Word = $FF02, 
CP_DWord = $FF04, CP_QWord = $FF08.


Apart from that, every encoding-tolerant code will execute much slower 
than code without a need for checks and conversions everywhere.

As I pointed out, I don't agree at all.
 - The check is only two ASM instructions.
 - It does not result in additional conversions. In fact, in appropriate 
cases it can avoid a huge number of conversions (especially when 
calling libraries, e.g. by means of TStrings).
 - In pure user code, the check is only done if DynamicString really is 
used in the user code, hence only when the user knows what to do. In 
fact, the degradation is commonly 0%.
 - When calling libraries (e.g. via TStrings), the check is very small 
compared to the function call that results from the same 
statement. Estimated common degradation: 0.01%.
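
The kind of check meant in the list above can be sketched as follows, assuming FPC 3.0 or later. StringCodePage and SetCodePage are RTL routines; NeedsConversion is a hypothetical helper that is not part of any library, and the sketch makes no claim about how many instructions the compiler actually emits.

```pascal
program checkdemo;
{$mode objfpc}{$H+}

// Hypothetical helper: True when s would have to be converted before it
// can be treated as a string of code page "wanted".
function NeedsConversion(const s: RawByteString;
  wanted: TSystemCodePage): Boolean;
begin
  // An empty string carries no data, so there is nothing to convert.
  Result := (s <> '') and (StringCodePage(s) <> wanted);
end;

var
  s: RawByteString;
begin
  s := 'abc';
  SetCodePage(s, CP_UTF8, False);   // tag as UTF-8 without touching the bytes
  if NeedsConversion(s, CP_UTF8) then
    Halt(1);                        // same encoding: no conversion expected
  if not NeedsConversion(s, 1252) then
    Halt(1);                        // different encoding: conversion expected
  WriteLn('check passed');
end.
```

The comparison itself is cheap; whether it leads to a conversion depends entirely on the data that reaches it.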


So the Checking Overhead is nothing but a rumor. (Remember, I don't 
suggest dropping the standard statically typed paradigm altogether, 
as tight loops of course work best that way.)


That is why fpc would need to define an additional type name (e.g. 
DynamicString) and encoding brand number (e.g. CP_ANY = $FF00) 
for a decently usable type for temporarily holding string content.


This again would make *FPC* programs incompatible with Delphi. 

As I explained in detail, this would not break any backwards 
compatibility, even if TStrings uses this type.
 - The new type is just additional, so its mere existence can't break 
anything: you don't need to use it in user code if you don't want to.
 - The use of DynamicString in the interface of library functions does 
not break anything, as it is (to be) constructed in a way that provides 
full compatibility.


Please do show any code (not containing RawByteString) that is not 
compatible when using the DynamicString paradigm as described in 
http://wiki.freepascal.org/not_Delphi_compatible_enhancement_for_Unicode_Support#Analysis 
. Maybe the page needs to be improved.


While fixing the RawByteString flaw would at least allow to *compile* 
FPC code with Delphi, the use of a different encoding value would 
definitely prevent compilation of such code with Delphi. What's the 
more serious incompatibility?

IMHO this would be much more dangerous than introducing a decently 
working new DynamicString type.

RawXxxString can be used for really uncoded data as done with 
old-style strings in a lot of applications.


Such a feature would be appreciated by many users, indeed :-)


While I would happily follow your suggestion to make indecent use of 
this type impossible in the fpc compiler, I don't think it's very 
dangerous to re-introduce the abysmal Delphi-compatible behavior of 
RawByteString (both the documented and the undocumented 
features).


But why do you say "would be appreciated"? Is it not possible to use 
RawByteString in the way the name suggests, by never bringing it 
together with any String variable of a different encoding brand, and 
hence avoiding any conversion, be it intentional/documented/useful or not?



Anyway: I added a sentence in the introduction of the wiki page, 
explaining the paradigm a little more explicitly.




-Michael





Re: [fpc-devel] Delphi incompatible encoding

2014-12-02 Thread Mattias Gaertner
On Tue, 02 Dec 2014 11:32:13 +0100
Hans-Peter Diettrich drdiettri...@aol.com wrote:

 Mattias Gaertner schrieb:
  On Tue, 02 Dec 2014 04:05:59 +0100
  Hans-Peter Diettrich drdiettri...@aol.com wrote:
 
  Many things affect string literals. Source codepage, system codepage,
  string type, defaultsystemcodepage, library, compiler version.
  
  I started a table for UTF-8 literals:
  http://wiki.lazarus.freepascal.org/Character_and_string_types#String_constants
 
 Thanks, after some reading I changed the sourcefile encoding, and both 
 UTF8bom and Ansi provide correct results. The Lazarus default (UTF-8 
 without BOM) is not usable on Windows :-(

You need to add conversions.

With the new DefaultSystemCodePage many of them are no longer needed.
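
A rough sketch of what such an explicit conversion can look like, assuming FPC 3.0 or later. UTF8Decode and SetCodePage are RTL routines from the System unit; the byte values and the program name are only illustrative.

```pascal
program convdemo;
{$mode objfpc}{$H+}

var
  raw: RawByteString;
  u: UnicodeString;
begin
  // Build the UTF-8 bytes for U+00E4 by hand, so that the example does
  // not depend on the encoding of this source file.
  SetLength(raw, 2);
  raw[1] := #$C3;
  raw[2] := #$A4;
  SetCodePage(raw, CP_UTF8, False);  // tag the bytes as UTF-8, no conversion

  // Explicit conversion from UTF-8 to UTF-16.
  u := UTF8Decode(raw);
  if (Length(u) <> 1) or (Ord(u[1]) <> $00E4) then
    Halt(1);
  WriteLn('decoded to ', Length(u), ' UTF-16 code unit');
end.
```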

Mattias


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-12-02 Thread Michael Schnell

On 12/02/2014 01:05 PM, Michael Schnell wrote:
But why do you say "would be appreciated"? Is it not possible to use 
RawByteString in the way the name suggests, by never bringing it 
together with any String variable of a different encoding brand, and 
hence avoiding any conversion, be it intentional/documented/useful or 
not?

Of course you can't use any TStrings sibling (such as TStringList) in 
such code, as with Delphi, TStrings is based on a statically typed 
String brand. This would be made possible by introducing DynamicString 
and using this type for TStrings and friends.


-Michael





Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-12-02 Thread Michael Schnell

On 11/29/2014 07:55 AM, Jonas Maebe wrote:
Exactly the same goes for converting strings with code page CP_NONE to 
a different code page: your program is broken when it tries to do that,


While accessing an array beyond its bounds is not detectable at compile 
time, and is technically not detectable at runtime when range checking 
is switched off, so that *undefined* behaviour can't be avoided there, 
the attempt to convert strings with code page CP_NONE to a different 
code page is easily detectable by the compiler, as we have predefined 
string variable type brands here. Thus, if the outcome is *defined* 
*to* *be* *undefined*, it can and should result in a compiler error 
message.


-Michael


Re: [fpc-devel] RFC: proper interpretation and implementation of Unicode Support

2014-12-02 Thread Michael Schnell

On 11/28/2014 08:19 PM, Hans-Peter Diettrich wrote:


In that discussion I found several errors, which are not detected by 
the compiler nor handled in the RTL. In the concrete entry the illegal 
use of the *generic* CP_NONE identifier is mentioned. That's why I 
felt a need to address several specific topics in above draft.

Yep.

You can't have a type brand whose encoding is both static and 
dynamic.


This is what causes the complete mess introduced by RawByteString (in 
Delphi and in fpc).


So IMHO the only way to go is to suggest to the users (or force them) 
to use the type RawByteString (i.e. CP_NONE) exactly as the name 
suggests: no encoding brand is known, so it can't be auto-converted to 
any other encoding, and it can't preserve the encoding of anything that 
is assigned to it.


This said, we don't have any (pseudo-)dynamically encoded type any 
more, and hence the encoding-type (and element-size) field in the 
string header does not make any sense any more and can be dropped 
altogether.


But as the implementation (in Delphi and) in fpc already provides 
encoding-type and element-size fields, I suggest using them for an 
additional decently dynamic type DynamicString (CP_ANY = $FF00), which 
(IMHO) can be introduced without breaking any compatibility or 
introducing any noticeable performance degradation, and allows for 
writing versatile code (including standard library APIs).


-Michael


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-12-02 Thread Hans-Peter Diettrich

Michael Schnell schrieb:

On 11/28/2014 09:15 PM, Hans-Peter Diettrich wrote:


Apart from that, every encoding-tolerant code will execute much slower 
than code without a need for checks and conversions everywhere.

As I pointed out I don't agree at all.
 - The check is only two ASM instructions
 - It does not result in additional conversions.


It does, e.g. in searching or sorting of StringList, when it can contain
strings of different encodings. The choice of a unique encoding for
application strings (maybe CP_ACP, UTF-8 or UTF-16) eliminates such
conversions.


So the Checking Overhead is nothing but a rumor. (Remember, I don't 
suggest dropping the standard statically typed paradigm altogether, 
as tight loops of course work best that way.)


The rumor is the unimportant Conversion Overhead, i.e. how often a
check leads to a conversion. When no check is required, conversions
consequently cannot occur at all.


RawXxxString can be used for really uncoded data as done with 
old-style strings in a lot of applications.


Such a feature would be appreciated by many users, indeed :-)


But why do you say "would be appreciated"? Is it not possible to use 
RawByteString in the way the name suggests, by never bringing it 
together with any String variable of a different encoding brand, and 
hence avoiding any conversion, be it intentional/documented/useful or not?


RawByteString cannot serve two different purposes :-(

In *Delphi* it is used as a polymorphic string, capable of *holding*
actual strings of any encoding. But when assigned to a variable of a
different encoding, a conversion may occur that converts the string into
the declared (static) encoding of the target variable.

In *FPC* it currently is used somewhat close to your idea, i.e. no
conversion occurs in an assignment either to *or from* a RawByteString
and some other AnsiString. We only can *hope* that *all* AnsiString
operations are based on the dynamic encoding of every operand, with
according checks and conversions inserted everywhere. This actually is
not true, because the compiler relies on the static encoding of
AnsiString variables, and inserts checks and conversions only when that
encoding is different. Actually a single AnsiString type would be
sufficient, because it already can hold data of any encoding :-(
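
The distinction between static and dynamic encoding can be observed with a small sketch, assuming FPC 3.0 or later. It only demonstrates that a RawByteString variable keeps the dynamic code page of whatever is assigned to it; it does not cover the Delphi behaviour, and the program name is made up.

```pascal
program rawdemo;
{$mode objfpc}{$H+}

var
  w: UnicodeString;
  u8: UTF8String;
  r: RawByteString;
begin
  w := 'abc';
  // UTF8Encode is expected to yield bytes tagged with CP_UTF8.
  u8 := UTF8Encode(w);
  // Assigning to a RawByteString does not convert anything; the dynamic
  // code page travels with the data.
  r := u8;
  if StringCodePage(r) <> CP_UTF8 then
    Halt(1);
  WriteLn('dynamic code page of r: ', StringCodePage(r));
end.
```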

I understand the FPC attempt to allow *at the same time* for the new
(encoded) and old (unencoded) AnsiString behaviour, where no automatic
conversions are allowed. But this would require at the same time that
e.g. all string literals *also* are stored in that (immutable) encoding,
and that this encoding can *not* be changed at runtime, while
DefaultSystemCodePage *can* be changed.

When the result of a conversion of a string of encoding CP_NONE is
undefined, which of course is correct for the *dynamic* encoding, this
simply could be changed into "conversions of CP_NONE strings do
nothing". Then CP_NONE would be the perfect encoding for old-style
AnsiStrings, with the only remaining problem being string expressions and
assignments where the operands have a different dynamic encoding. In
these cases all operands would have to be converted into the CP_NONE
encoding, as specified in another DefaultNoneEncoding constant (not
variable!); the same encoding would apply in assignments *to* variables of
a different encoding. Then also all type aliases for AnsiStrings must have
unique names, which allow to distinguish e.g.
  type UTF8String = AnsiString;
from
  type NewUTF8String = type AnsiString(CP_UTF8);

DoDi
