
Re: [fpc-devel] Closures -- debug warning @ ttgobj.FreeTemp

2015-03-10 Thread Hans-Peter Diettrich




On 09.03.2015 at 14:36, bla...@blaise.ru wrote:
> FPC trunk r30150, compiled with EXTDEBUG, emits a debug warning for
> the following program:
>
> --8--
> type T = interface
>   procedure Bar;
> end;
>
> function Foo: T;
> begin
>   result := nil
> end;
>
> begin
>   Foo().Bar()
>   // ^-- Warning: tgobj: (FreeTemp) temp at pos -44 is already free !
> end.
> --8--
> Does this indicate a problem in the compiler, or is this warning bogus?

I'd assume that the warning refers to result := nil in Foo(). Assign 
something different and try again to find out more.
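
A minimal sketch of that experiment, for anybody who wants to try it. The class TImpl is my own addition (hypothetical, not part of the original report); it only exists so that Foo can return something other than nil. I have not checked what EXTDEBUG reports for it.

```pascal
program FooNonNil;
{$mode objfpc}

type
  T = interface
    procedure Bar;
  end;

  // Hypothetical implementor, added only to have a non-nil result.
  TImpl = class(TInterfacedObject, T)
    procedure Bar;
  end;

procedure TImpl.Bar;
begin
  WriteLn('Bar called');
end;

function Foo: T;
begin
  Result := TImpl.Create;  // something different from nil
end;

begin
  Foo().Bar();
end.
```

If the warning disappears with this variant, the nil assignment is the likely trigger; if it stays, the temp for the function result itself is suspect.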


DoDi
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel

Re: [fpc-devel] BOOL

2014-12-15 Thread Hans-Peter Diettrich



On 14.12.2014 at 16:51, Marco van de Voort wrote:
> In our previous episode, Adriaan van Os said:
>> reveals 0 for False and -1 for True, where I had expected 0 for False and
>> 1 f
>> according to http://msdn.microsoft.com/en-us/library/eke1xt9y.aspx the
>> same
>> respectively in Visual Studio 2013.
>
> There is a C (99?) bool type, and a winapi (and much older) BOOL type. Two
> different libraries, two different headers, two different cases :-)

AFAIR Delphi defines some xxxBOOL types for the interpretation of
*WinAPI function results*. If so, these types and values should be
restricted to Windows platforms and not be used in general (cross-platform)
code.


DoDi

[fpc-devel] cp1252 problems

2014-12-07 Thread Hans-Peter Diettrich

Using FPC trunk, Lazarus on WinXP, file and $codepage UTF-8, 
DefaultSystemCodePage is 1252.


Then AnsiString variables can contain either UTF-8 or cp1252 strings 
(inconsistent), but that's an already known problem :-(


Now I found another bug with AnsiString(1252), which IMO should behave
like AnsiString(CP_ACP). Unfortunately it does not: the same
assignments of literals to both variables lead to different strings:


program cp1252Test;
{$codepage UTF8} //source file is UTF-8, as stated above
type
  WinAnsiString = type AnsiString(1252);
const
  cACP: AnsiString = 'ä'; //encoded UTF-8 = 'ä'
  cWin: WinAnsiString = 'ä'; //encoded 1252 = 'ä?'
var
  strA: AnsiString;
  strW: WinAnsiString;
begin
  strA := 'ä'; //encoded UTF-8 = 'ä'
  strW := 'ä'; //encoded 1252 = 'ä?'
  WriteLn('equal ',strA=strW); //FALSE!
  strW := cACP; //1252 'ä' okay
  strA := cWin; //1252 'ä?' wrong as above
end.

It looks to me as if the cp1252 strings (both const and var) are
converted from a UTF-16 char (2 bytes into 2 chars), with the first
char being the letter and the second one being the UTF-16 high byte (0),
shown as '?' (#63).


Longer literals, like 'äöü', are converted properly, but to encoding 
UTF-8 for AnsiString and encoding 1252 for WinAnsiString.
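
To check such a guess it helps to look at the raw bytes instead of the console output. A small sketch, assuming a UTF-8 source file; the Dump procedure is a hypothetical helper of mine, not an RTL routine:

```pascal
program DumpBytes;
{$codepage UTF8}
type
  WinAnsiString = type AnsiString(1252);

// Hypothetical helper: print dynamic code page, length and every byte.
// The RawByteString parameter avoids any conversion of the argument.
procedure Dump(const Name: string; const S: RawByteString);
var
  i: Integer;
begin
  Write(Name, ': CP=', StringCodePage(S), ' Len=', Length(S), ' bytes=');
  for i := 1 to Length(S) do
    Write(HexStr(Ord(S[i]), 2), ' ');
  WriteLn;
end;

var
  strA: AnsiString;
  strW: WinAnsiString;
begin
  strA := 'ä';
  strW := 'ä';
  Dump('strA', strA);
  Dump('strW', strW);
end.
```

A trailing 3F byte in strW would support the "high byte shown as '?'" explanation; the actual output depends on the compiler revision.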


Should I submit a bug report?

DoDi


[fpc-devel] RawByteString Insert etc.

2014-12-05 Thread Hans-Peter Diettrich

IMO the Insert procedure should convert the string-to-insert into the
code page of the target string. Otherwise the target string can become
unusable, containing a mix of characters from different codepages.
While a RawByteString can have any encoding, it cannot have two
encodings at the same time.
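
Roughly what I have in mind, as a sketch only. InsertConverted is a hypothetical wrapper, not an existing RTL procedure, and on non-Windows targets the conversion probably needs a widestring manager (e.g. the cwstring unit) to be installed:

```pascal
program InsertDemo;
{$mode objfpc}{$H+}
{$codepage UTF8}

// Hypothetical wrapper: bring Source into the dynamic code page of
// Target before the bytes are inserted.
procedure InsertConverted(const Source: RawByteString;
  var Target: RawByteString; Index: SizeInt);
var
  Tmp: RawByteString;
begin
  Tmp := Source;
  if (Tmp <> '') and (Target <> '') and
     (StringCodePage(Tmp) <> StringCodePage(Target)) then
    SetCodePage(Tmp, StringCodePage(Target), True); // True = convert data
  Insert(Tmp, Target, Index);
end;

var
  U: UTF8String;
  W: RawByteString;
begin
  U := 'ö';
  W := 'abc';
  SetCodePage(W, 1252, False);  // only label the ASCII data as cp1252
  InsertConverted(U, W, 2);
  WriteLn('CP=', StringCodePage(W), ' Len=', Length(W));
end.
```

The point is that the check is done once, inside the library routine, instead of in every caller.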


BTW, the documentation should be updated to RawByteString arguments.


More candidates:
Concat (implemented where? operator +=?)
Pos (make SubStr CP match Source CP)

To be converted to RawByteString at all (overload?):
Format (?)
StringReplace
LastDelimiter, IsDelimiter (in case of non-ASCII delimiters?)
...

Should I supply patches?

DoDi


[fpc-devel] AnsiUpperCase problems

2014-12-04 Thread Hans-Peter Diettrich

The following console program demonstrates various problems with the new 
(encoded) AnsiStrings (FPC trunk):


program litTest2;
{.$codepage UTF8} //off for now
uses Classes,SysUtils;
var A: AnsiString;
begin
  a := 'äöü';
  //a := a+' '; //uncomment later
  WriteLn(a,'äöü');
  WriteLn(AnsiUpperCase(a),AnsiUpperCase('äöü'));
end.

The output varies depending on (at least) the file encoding and target 
platform (tested only on Windows, using Lazarus).


With an Ansi source file the last line shows as 'ÄÖÜÄÖÜ', as expected. 
The variable also shows as 'äöü', but not the literal (3 graphical 
characters). In all other (tested) cases something different is shown, 
no uppercase letters at all.


With a UTF-8 source file (with BOM) both the variable and literal show
as 'äöü', but unfortunately never in upper case.


Adding {$codepage UTF8} requires a UTF-8 source file. That's compatible
with Lazarus defaults, so further tests (here) will use this
combination. Please note that (currently) Lazarus sets or leaves
DefaultSystemCodePage according to the actual OS, i.e. 1252 for my
installation, regardless of $codepage.


Now all items are shown as 'äöü', but again never in uppercase - how come?


AnsiUpperCase finally calls Win32AnsiUpperCase (on Windows), declared as
  function Win32AnsiUpperCase(const s: string): string;
which in turn calls CharUpperBuffA.
This explains why no uppercase conversion is performed, when S has a 
dynamic encoding different from (WinAPI) CP_ACP, which is expected by 
CharUpperBuffA. Actually I found the *dynamic* encoding of A and S as 
CP_UTF8, even if its static encoding is CP_ACP (or 1252).


Consequently AnsiUpperCase should convert S to the WinAPI CP_ACP 
(GetACP), before passing it to CharUpperBuffA. The same for all other 
functions with AnsiString arguments, calling external (OS API...) 
routines expecting a specific encoding, on all platforms. And for user 
code, which relies on the encoding of all strings being the declared 
one, like in:

  str1[1]:=str2[1]; //both strings of same type

IMO such additional checks and conversions should be avoided, because
they bloat the library code and consume runtime. Note that SetCodePage
requires a RawByteString (var parameter), and thus cannot be used
directly to adjust the dynamic codepage of an AnsiString.
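
For illustration, this is the kind of conversion every such routine would need. A sketch only: UpperCaseACP is a hypothetical wrapper of mine, and I have not verified that it cures the problem on every target.

```pascal
program UpperViaACP;
{$mode objfpc}{$H+}
{$codepage UTF8}
uses SysUtils;

// Hypothetical wrapper: force the dynamic encoding to
// DefaultSystemCodePage before the string reaches AnsiUpperCase.
// The RawByteString temporary is needed because SetCodePage takes
// a RawByteString var parameter.
function UpperCaseACP(const S: AnsiString): AnsiString;
var
  Tmp: RawByteString;
begin
  Tmp := S;
  SetCodePage(Tmp, DefaultSystemCodePage, True); // convert, not relabel
  Result := AnsiUpperCase(Tmp);
end;

var
  a: AnsiString;
begin
  a := 'äöü';
  WriteLn(UpperCaseACP(a));
end.
```

One temporary and one possible conversion per call, multiplied by every RTL function that talks to an OS API.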



Now let's add (uncomment) the line
  a := a+' ';
and voila, AnsiUpperCase works, because now the string has the expected
CP_ACP instead of UTF-8. The same effect occurs when A is assigned from
a UnicodeString variable.
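
The change of the dynamic encoding can be made visible with StringCodePage. A small probe, assuming a UTF-8 source file; the numbers printed depend on the system and the compiler revision:

```pascal
program CodePageProbe;
{$codepage UTF8}
var
  a: AnsiString;
begin
  a := 'äöü';
  WriteLn('after literal assignment: ', StringCodePage(a));
  a := a + ' ';
  WriteLn('after concatenation:      ', StringCodePage(a));
  WriteLn('DefaultSystemCodePage:    ', DefaultSystemCodePage);
end.
```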


Is it really intended, that AnsiString behaviour depends on such details?


The simplest solution would be to disallow a different static and
dynamic encoding of AnsiStrings, except for RawByteString. Then no
additional checks and conversions are required, except the one in the
assignment of a RawByteString to an AnsiString of a different type, and
everything else can be determined by the compiler from the known
static=dynamic encoding of strings.


More checks and conversions can be avoided, when the dynamic encoding of 
string literals is the actual encoding, as used by the compiler for the 
stored literal, not Delphi incompatible placeholders like CP_ACP. Then 
TranslatePlaceholderCP is required only for explicitly given encoding 
values, but no more for the dynamic encoding of strings.


DoDi


Re: [fpc-devel] Delphi incompatible encoding

2014-12-02 Thread Hans-Peter Diettrich


Tomas Hajny wrote:
> On Tue, December 2, 2014 08:31, Hans-Peter Diettrich wrote:
>> When I compile a console program on the commandline, most strings are
>> readable in the console (see previous answer). But when I compile using
>> Lazarus, all strings (including UnicodeString!) are shown in unreadable
>> UTF-8 encoding, regardless of $mode :-(
>
> Probably best to ask about the wrong behaviour with Lazarus on a Lazarus
> list?


It really seems to be a Lazarus problem. Compiled from a PAS file, the
behaviour is the same as with FPC. The bad encoding is used when
compiled from an LPR file (LPI project).


Thanks
DoDi


Re: [fpc-devel] Delphi incompatible encoding

2014-12-02 Thread Hans-Peter Diettrich


Mattias Gaertner wrote:
> On Tue, 02 Dec 2014 04:05:59 +0100
> Hans-Peter Diettrich drdiettri...@aol.com wrote:
>
> Many things affect string literals. Source codepage, system codepage,
> string type, defaultsystemcodepage, library, compiler version.
>
> I started a table for UTF-8 literals:
> http://wiki.lazarus.freepascal.org/Character_and_string_types#String_constants


Thanks, after some reading I changed the sourcefile encoding, and both 
UTF8bom and Ansi provide correct results. The Lazarus default (UTF-8 
without BOM) is not usable on Windows :-(


DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-12-02 Thread Hans-Peter Diettrich


Michael Schnell wrote:
> On 11/28/2014 09:15 PM, Hans-Peter Diettrich wrote:
>> Apart from that, every encoding-tolerant code will execute much slower
>> than code without a need for checks and conversions everywhere.
>
> As I pointed out I don't agree at all.
>  - The check is only two ASM instructions
>  - It does not result in additional conversions.


It does, e.g. in searching or sorting of StringList, when it can contain
strings of different encodings. The choice of a unique encoding for
application strings (maybe CP_ACP, UTF-8 or UTF-16) eliminates such
conversions.


> So the Checking Overhead is nothing but a rumor. (Remember, I don't
> suggest dropping the standard statically typed paradigm altogether,
> as close loops of course work best in that way.)


The rumor is the "unimportant Conversion Overhead", i.e. how often a
check leads to a conversion. When no check is required, conversions
consequently cannot occur at all.


>>> RawXxxString can be used for really uncoded data as done with
>>> old-style strings in a lot of applications.
>>
>> Such a feature would be appreciated by many users, indeed :-)
>
> But why do you say "would be appreciated"? Is it not possible to use
> RawByteString in a way the name suggests, by never bringing it
> together with any String variable of a different encoding brand and
> hence avoid any conversion - be same intentional/documented/useful or not.


RawByteString cannot serve two different purposes :-(

In *Delphi* it is used as a polymorphic string, capable of *holding*
actual strings of any encoding. But when assigned to a variable of a
different encoding, a conversion may occur that converts the string into
the declared (static) encoding of the target variable.

In *FPC* it currently is used somewhat close to your idea, i.e. no
conversion occurs in an assignment either *to or from* a RawByteString
and some other AnsiString. We only can *hope* that *all* AnsiString
operations are based on the dynamic encoding of every operand, with
according checks and conversions inserted everywhere. This actually is
not true, because the compiler relies on the static encoding of
AnsiString variables, and inserts checks and conversions only when that
encoding is different. Actually a single AnsiString type would be
sufficient, because it already can hold data of any encoding :-(
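
The difference can be observed with a short probe, compiled once with FPC and once with Delphi. A sketch, assuming a UTF-8 source file; I give no expected numbers because they depend on the compiler and its version:

```pascal
program RawAssign;
{$mode objfpc}{$H+}
{$codepage UTF8}
type
  WinAnsiString = type AnsiString(1252);
var
  U: UTF8String;
  R: RawByteString;
  W: WinAnsiString;
begin
  U := 'äöü';
  R := U;   // to RawByteString: no conversion expected anywhere
  W := R;   // from RawByteString: the case under discussion
  WriteLn('U: CP=', StringCodePage(U), ' Len=', Length(U));
  WriteLn('R: CP=', StringCodePage(R), ' Len=', Length(R));
  WriteLn('W: CP=', StringCodePage(W), ' Len=', Length(W));
end.
```

If W reports a code page other than 1252, its dynamic encoding differs from the declared one, and every later operation on W has to cope with that.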

I understand the FPC attempt, to allow *at the same time* for the new
(encoded) and old (unencoded) AnsiString behaviour, where no automatic
conversions are allowed. But this would require at the same time, that
e.g. all string literals *also* are stored in that (immutable) encoding,
and that this encoding can *not* be changed at runtime, while
DefaultSystemCodePage *can* be changed.

When the result of a conversion of a string of encoding CP_NONE is
undefined, which of course is correct for the *dynamic* encoding, this
simply could be changed into "conversions of CP_NONE strings do
nothing". Then CP_NONE would be the perfect encoding for old-style
AnsiStrings, with the only remaining problem being string expressions
and assignments in which the operands have a different dynamic encoding.
In these cases all operands would have to be converted into the CP_NONE
encoding, as specified in another DefaultNoneEncoding constant (not
variable!); the same encoding would apply in assignments *to* variables
of a different encoding. Then also all type aliases for AnsiStrings must
have unique names, which allow distinguishing e.g.
  type UTF8String = AnsiString;
from
  type NewUTF8String = type AnsiString(CP_UTF8);

DoDi


Re: [fpc-devel] Delphi incompatible encoding

2014-12-01 Thread Hans-Peter Diettrich


Sven Barth wrote:
> On 01.12.2014 at 10:33, Hans-Peter Diettrich drdiettri...@aol.com wrote:
>> Another one:
>> Delphi XE does not export the CP_xxx encoding constants from
>> System.pas. This means that the encoding constants are not available in
>> (compatible) user code.
>
> It's not the first and likely not the last we export from a different
> unit than Delphi. There will *always* be differences which already
> starts with Integer that is declared as an alias to LongInt in the
> ObjPas unit in FPC.


Well, Integer (and String...) are generic types, adjustable to the
best overall performance on every target - whatever "best" will mean.



>> CP_NONE is declared in Windows.pas for clipping, as:
>>   CP_NONE  = 0; { No clipping of output }
>> different from the CP_NONE encoding ($).
>
> Do they really not have a CP_NONE constant in System?


No, not even in the definition of RawByteString, and not in any other 
standard (RTL...) source file (except Windows.pas, see above).


DoDi


Re: [fpc-devel] Delphi incompatible encoding

2014-12-01 Thread Hans-Peter Diettrich


Jonas Maebe wrote:
> Hans-Peter Diettrich wrote on Mon, 01 Dec 2014:
>
> To get behaviour that is compatible with Delphi2009+, compile with
> -Mdelphiunicode or {$modeswitch delphiunicode}.


The compiler option (-M) works, but the $modeswitch is not accepted by 
the compiler (2.7.1): Illegal compiler switch DELPHIUNICODE. The same 
for {$mode ObjPas} - what else did I miss?


When I use Lazarus and set the compiler to the new 2.7.1, the modeswitch 
does not cause an error.
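
A quick way to see which string model is really in effect is to let the program report it. The sketch below uses the {$mode delphiunicode} directive instead of the rejected $modeswitch; that spelling is my assumption, the quoted advice only names -Mdelphiunicode and the $modeswitch form.

```pascal
program ModeProbe;
{$mode delphiunicode}  // assumption: accepted as a $mode directive
var
  S: string;
begin
  S := 'abc';
  // SizeOf(Char) is 2 when String is UnicodeString, 1 otherwise.
  WriteLn('SizeOf(Char)      = ', SizeOf(Char));
  WriteLn('StringCodePage(S) = ', StringCodePage(S));
end.
```

Removing the directive and compiling with and without -Mdelphiunicode shows whether the command line option alone has the same effect.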



> This particular difference is also documented in
> http://wiki.freepascal.org/FPC_Unicode_support#String_constants (search
> for delphiunicode or systemcodepage)


Thanks, that explains at least the FPC handling of literals.

But where can I find information about all the differences caused by 
above compiler option/modeswitch? Does it affect implicit AnsiString 
encoding conversions?



BTW it's nice that FPC console Write/Ln (mostly) converts AnsiStrings to 
the console codepage, while Delphi (XE) doesn't convert :-)


But I found a somewhat strange result with generic String variables, 
tested with:


var A: AnsiString; S: String;
begin
  S := ' äöü';
  A := S;
  //S := A; //changes nothing
  WriteLn('A CP: ',StringCodePage(A), A); //always shows ' äöü'
  WriteLn('S CP: ',StringCodePage(S), S); //letters differ
end.

When String is UnicodeString (DelphiUnicode), the output is correctly
converted for both strings (CP 1200,1252). But when String is not
UnicodeString, AnsiString and String should be the same type, no? The
console however shows different letters for the generic String and
AnsiString variable (both CP 1252). The output doesn't change when A is
assigned back to S. How come?


When A and S are exchanged:
  A := ' äöü';
  S := A;
  //A := S; //CP changes
the encoding of A is shown as zero. Now it makes a difference when S is 
assigned back to A, but only the codepage of A then also is shown as 
1252, while the letters still differ. Obviously String is not equivalent 
to AnsiString now, and string literals should be assigned to AnsiString 
variables only, not to String variables?


Very confused
DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-29 Thread Hans-Peter Diettrich


Jonas Maebe wrote:
> On 28/11/14 21:30, Hans-Peter Diettrich wrote:
>> I prefer to specify and document everything *before* coding, so that
>> everybody can expect that the code will behave as specified.
>
> If certain behaviour is explicitly undefined, it *is* specified and
> documented. It means that your program is buggy if it triggers such
> behaviour, and that the effect of triggering it could be anything.
>
> [...]
>
> An example from FPC itself is accessing an array beyond its bounds when
> range checking is switched off.


After this hint I reviewed the "Code page identifiers" section again,
and probably found the source of the misunderstandings.


> CP_NONE: this value indicates that no code page information has been
> associated with the string data. The result of any explicit or implicit
> operation that converts this data to another code page is undefined.


Does this mean CP_NONE is not an allowed *dynamic* (string *data*) 
encoding, just like any other undefined encoding value?


In this case the description is correct, but it describes a special
case of some *undefined* general rule about valid and invalid dynamic
encodings. This general rule should be documented first, not only for
CP_NONE. Documentation of the *intended* purpose of CP_NONE, as the
*static* encoding of the RawByteString type, is also missing entirely.


As Delphi doesn't allow for a dynamic encoding of CP_NONE, I don't 
understand the purpose of the FPC description. Now in turn some FPC 
developer might have misunderstood the (Delphi) handling of 
RawByteStrings, assuming that it were okay to omit a conversion in an 
assignment of RawByteString to an AnsiString of a different encoding.


That's why I think that the incorrect handling of such RawByteString 
assignments in FPC should be fixed, according to the general rule of 
assignments to an string of a different (static) encoding. CP_NONE 
definitely *is* different from any other encoding, and Delphi does not 
define an exception for RawByteStrings.




> Exactly the same goes for converting strings with code page CP_NONE to a
> different code page: your program is broken when it tries to do that,
> and we cannot guarantee any outcome. This is exactly what "the behaviour
> is undefined" means.


When a string *really* has a *dynamic* encoding of CP_NONE, this of
course is illegal and thus will lead to an undefined result. ACK, so
far. But since Delphi (quietly) changes a SetCodePage to CP_NONE into
the current CP_ACP, the undefined situation (invalid dynamic encoding)
must have been forced by some illegal *hack* before, or in the FPC case
by some erroneous (not Delphi conforming) RTL code.
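
Whether a given RTL really lets a string end up with a dynamic encoding of CP_NONE is easy to probe. A sketch; the result is deliberately not stated here, because it is exactly the point in dispute and may differ between FPC revisions and Delphi:

```pascal
program NoneProbe;
{$mode objfpc}{$H+}
var
  R: RawByteString;
begin
  R := 'abc';
  WriteLn('before: ', StringCodePage(R));
  SetCodePage(R, CP_NONE, False);  // relabel only, no conversion
  WriteLn('after:  ', StringCodePage(R));
  WriteLn('CP_NONE = ', CP_NONE);
end.
```

If "after" equals CP_NONE, the RTL accepts the placeholder as a dynamic encoding; if it shows the default code page, the request was quietly replaced as described for Delphi.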


DoDi


[fpc-devel] RFC: proper interpretation and implementation of Unicode Support

2014-11-28 Thread Hans-Peter Diettrich


In response to another thread (this should start a new thread):


> CP_NONE: this value indicates that no code page information has been
> associated with the string data. The result of any explicit or implicit
> operation that converts this data to another code page is undefined.


After rereading I found this definition incorrect, the entire section 
(and more) deserves a correction/clarification. The implementation may 
have to be changed accordingly.


This is my interpretation of the Delphi API around encoded AnsiStrings, 
as documented and implemented there, with added clarifications and notes 
on omissions and possible problems on non-Windows platforms.


I do not expect that the FPC developers fully agree with this 
interpretation, but I expect that all items of a revised version of the 
following draft become part of the FPC documentation, somehow.


Draft

1) CP_ACP, CP_OEM and CP_NONE are generic encodings (placeholders),
applicable as *static* string encodings inside a program only; they
can never denote a dynamic string encoding.


Note: codepage here means byte-based ANSI/ISO codepages, applicable to 
AnsiStrings, not Unicode codepages (BMP...). While CP_UTF16 (and BE/LE 
variations) can be used to specify a concrete (string,textfile...) 
*encoding*, they do not describe codepages (neither Ansi nor Unicode).


Note: these identifiers (names) should be used with extreme care in
documentation/discussions. In most cases CP_ACP stands for the *actual*
default encoding, equivalent to the value of a hypothetical *variable*
named CP_ACP, i.e. currently (see below) should be understood as
DefaultSystemCodePage. It should be made clear that the value of the
CP_ACP *constant* identifier (=0) is meant and usable only in a few
cases, like in the declaration of a string type; it may also be
acceptable in explicit conversion requests, and to denote the encoding
to use in file/stream I/O, where the functions replace CP_ACP by the
actual (DefaultSystemCodePage) value internally.


Note: in compiler, library and application code a value of CP_ACP should 
be considered equal to (be mapped into) the actual 
(DefaultSystemCodePage) encoding.
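
In code the mapping is trivial. A sketch with a hypothetical function name (ResolveCP is mine; the RTL has its own internal routine for this purpose, TranslatePlaceholderCP, whose exact behaviour I do not reproduce here):

```pascal
program ResolveDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: replace the CP_ACP placeholder by the actual
// default encoding, leave every concrete code page untouched.
function ResolveCP(CP: TSystemCodePage): TSystemCodePage;
begin
  if CP = CP_ACP then
    Result := DefaultSystemCodePage
  else
    Result := CP;
end;

begin
  WriteLn('CP_ACP  -> ', ResolveCP(CP_ACP));
  WriteLn('CP_UTF8 -> ', ResolveCP(CP_UTF8));
end.
```

The important part is that callers compare resolved values only, never the placeholder against a concrete code page.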


2) A platform (or Unicode library) may or may not provide their own 
*generic* values (constants) for application (CP_ACP) and console 
(CP_OEM) encoding, as well as further constants for e.g. filenames.


Note: CP_ACP is zero on Windows, possibly different on other platforms 
or libraries. Thus AnsiString(0) may be different from 
AnsiString(CP_ACP). It may be required to distinguish between a named 
Pascal constant CP_ACP=0, and the value of the generic 
application/default encoding in API calls (CP_SYS?).


3) The *actually* associated codepages are defined by the platform,
and possibly can be changed by the user (admin). A program may or may
not be allowed to change the associated codepages, either locally
(process wide) or globally (system wide).


Note: the name DefaultSystemCodePage should be reserved for the 
*system* defined codepage. When this setting can be different from an 
application-wide setting, another DefaultApplicationCodePage variable 
should be added. See the comments on Modifications and Notes on 
DefaultSystemCodePage in the Wiki page!


Note: a process should determine (retrieve) the platform settings 
*before* any attempt to interpret system-provided strings (commandline, 
environment variables...). Depending on the platform, more generic 
settings may apply to specific strings, like for filenames. In all 
external API calls, the RTL is responsible for the correct encoding of 
all string arguments, as expected by the called function. This applies 
in detail to CP_ACP, when this encoding can be changed inside a program 
to something different from the external (platform...) setting.


4) A RawByteString variable, of the static encoding CP_NONE, can hold 
strings of *any* dynamic encoding. No conversion is performed when a 
string is assigned to such a variable. In the opposite direction the 
standard handling should apply, i.e. different static encodings require 
a conversion into the static target encoding.


Note: It's known that Delphi does not always convert a RawByteString in
an assignment to a variable of a different type. This flaw should be
fixed in FPC. Is the corresponding Delphi behaviour *defined* anywhere?


5) Use StringCodePage to get an actual (dynamic) string encoding. 
StringCodePage never returns one of the generic values. The dynamic 
codepage of an unassigned (empty) string is assumed (by Delphi) as the 
actually selected CP_ACP codepage for AnsiString arguments, CP_UTF16 (or 
whatever applicable) for UnicodeString arguments.
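
What StringCodePage reports for empty strings of different static encodings can be checked like this (a sketch; WinAnsiString is just an example type, and I do not claim specific output for FPC):

```pascal
program EmptyProbe;
{$mode objfpc}{$H+}
type
  WinAnsiString = type AnsiString(1252);
var
  A: AnsiString;
  W: WinAnsiString;
  U: UTF8String;
  S: UnicodeString;
begin
  // All four strings are empty; only their static encoding differs.
  A := '';
  W := '';
  U := '';
  S := '';
  WriteLn('AnsiString:    ', StringCodePage(A));
  WriteLn('WinAnsiString: ', StringCodePage(W));
  WriteLn('UTF8String:    ', StringCodePage(U));
  WriteLn('UnicodeString: ', StringCodePage(S));
end.
```

If the three AnsiString variants print the same value, the function indeed cannot see the static encoding of an empty argument.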


Note: while an unassigned (empty) string variable has a static encoding, 
known to the compiler, this encoding is unknown to StringCodePage. The 
overloaded Ansi/Unicode versions of StringCodePage only know about the 
basic string type (Ansi/Unicode) of their arguments, but cannot 
determine a

Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-28 Thread Hans-Peter Diettrich


Jonas Maebe wrote:
> I'm sorry, but I simply cannot discuss with people that, when I
> literally state "the result is undefined", think that I may actually
> have meant "the result is defined and if you change the
> implementation and/or keep it stable across compiler releases, then
> it will also conform to whatever you think that this defined
> behaviour should be". I don't have the energy nor the patience for
> that.


I also have no use for continuing such discussions.

I prefer to specify and document everything *before* coding, so that 
everybody can expect that the code will behave as specified.


DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-28 Thread Hans-Peter Diettrich


Michael Schnell wrote:
> I fear that there will be code that relies on the flawed behavior of
> RawByteString ("it's a feature, not a bug") and using the same name with
> different behavior would break same. And a really usable DynamicString
> would not adhere to that description.


How can somebody rely on behaviour *stated* as undefined, or not 
working as defined?


DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-28 Thread Hans-Peter Diettrich


Michael Schnell wrote:
> On 11/27/2014 03:44 PM, Hans-Peter Diettrich wrote:
>> An *efficient* implementation would be based on a single program-wide
>> string representation, with different encodings being handled only in
>> an exchange with external data sources.
>
> Yep. But it would result in severe user code portability issues (see
> above). IMHO using DynamicString at the correct locations would not be
> (noticeably) less efficient but a lot more versatile.


You suggested to use string as UTF-16 on Windows, and UTF-8 on Linux. 
That's what I understand as a unique program-wide string representation 
(not sourcecode-wide, instead program as *compiled*). Then I cannot see 
any need or use for another DynamicString type.



> I also don't think we will ever see a fix for the poor implementation of
> RawByteString (avoiding the word "flaw" and the suggestion of a bad
> purpose), because it would break existing user code.


Nothing can be broken, as long as the Delphi behaviour is undefined. 
Code relying on specific compiler/library bugs is bound to that 
compiler, not portable in any way.


> Regarding fpc, correcting the flaws and keeping the name RawByteString
> would result in incompatibility issues vs Delphi and breaking code that
> will be ported from Delphi.


Same as above. When application code works properly with strings of 
*sometimes* different static and dynamic encoding, it will not stop 
working with strings of *never* different encodings.


Of course the opposite is not true. When some code works properly (only) 
with strings of the same static and dynamic encoding, it will stop 
working when compiled with Delphi. Then the coder has to insert explicit 
checks for the dynamic encoding of *all* strings, all over his code.


Applied to FPC/Lazarus code (compiler, libraries, IDE...) this means
that it's obviously easier to *prevent* possibly different
static/dynamic encodings, instead of *checking for and reacting to* such
flaws throughout the entire codebase. Apart from that, every
encoding-tolerant code will execute much slower than code without a need
for checks and conversions everywhere.


I seriously doubt that the FPC developers ever realized these 
consequences, and the amount of time required for finding, reporting and 
fixing the bugs in all affected pieces of their code :-(


> That is why fpc would need to define an additional type name (e.g
> DynamicString) and encoding brand number (e.g. CP_ANY = $FF00) for a
> decently usable type for intermediately holding a String content.


This again would make *FPC* programs incompatible with Delphi. While
fixing the RawByteString flaw would at least still allow compiling FPC
code with Delphi, the use of a different encoding value would definitely
prevent compilation of such code with Delphi. What's the more serious
incompatibility?



> RawXxxString can be used for really uncoded data as done with
> old-style strings in a lot of applications.


Such a feature would be appreciated by many users, indeed :-)

DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicodesupport

2014-11-28 Thread Hans-Peter Diettrich


Marco van de Voort wrote:
> In our previous episode, Hans-Peter Diettrich said:
>> While it certainly is a stupid (Microsoft) idea to use UTF-16 for file
>> storage, we'll have to take that into account.
>
> (16-bit codepages were designed into OS/2 and Windows NT before utf-8 even
> existed)


Right, both systems were developed by Microsoft :-]

No problem, as long as proper host/network byteorder conversion is 
applied in reading/writing such files. But in former times every 
computer manufacturer was proud of *his* clever text processing 
features, with characters stored in 6 up to 9 bit registers. In those 
times it was an essential *marketing* feature, when files could *not* be 
read by competing systems, due to different bytesize, bit-/byteorder, 
character sets, file formats etc.
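
For UTF-16 data the byte order conversion mentioned above is simple, as long as the file starts with a BOM. A sketch with a hypothetical helper (NormaliseUtf16 is my name, not an RTL routine); it works on code units already loaded into memory:

```pascal
program Utf16ByteOrder;
{$mode objfpc}

// Hypothetical helper: bring a buffer of UTF-16 code units into host
// byte order, based on a leading BOM. Returns False without a BOM.
function NormaliseUtf16(var Units: array of Word): Boolean;
var
  i: Integer;
begin
  Result := False;
  if Length(Units) = 0 then
    Exit;
  case Units[0] of
    $FEFF:
      Result := True;                 // already in host byte order
    $FFFE:
      begin                           // opposite order: swap every unit
        for i := 0 to High(Units) do
          Units[i] := SwapEndian(Units[i]);
        Result := True;
      end;
  end;
end;

var
  // BOM followed by 'A' and 'B', stored in the opposite byte order
  Buf: array[0..2] of Word = ($FFFE, $4100, $4200);
  i: Integer;
begin
  if NormaliseUtf16(Buf) then
    for i := 0 to High(Buf) do
      Write(HexStr(Buf[i], 4), ' ');
  WriteLn;
end.
```

Files without a BOM need the byte order from somewhere else (protocol, file format), which is where the real trouble starts.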


But times have changed, nowadays the Internet requires certain common 
standards (e.g. 8-bit bytes = octets, HTML, Unicode and more), which 
allow for data exchange across machine and country boundaries.


The lack of far-east support already forced the Japanese to invent their
own BIOS, codepages etc. Continued use of UCS2 would nowadays have
forced the Chinese to invent their own character encoding, which then
would be used by more people than UCS2. Guess what would happen to the
rest of the world, then...


OT
Or will the Chinese government enforce such a development soon, to 
eliminate the need for continued censorship of foreign web pages, 
because legal equipment then only could present genuine Chinese pages, 
but no more HTML, JavaScript and Unicode? How would the official Chinese 
programming language look like?

/OT

DoDi


Re: [fpc-devel] ThousandSeparator

2014-11-27 Thread Hans-Peter Diettrich


Sven Barth wrote:
> At my old company our Delphi application handled runtime changes to
> these settings rather well. For display the normal XToY (e.g. DateToStr)
> functions are used which use the DefaultFormatSettings which are updated
> automatically (the VCL's message loop triggers a repaint when format
> settings were changed in the system).


A repaint by itself doesn't change the strings. How do the new strings 
come into all the edit boxes, of all open forms?


Similarly, when the user changes the system language, can he expect that
every running application updates itself, with changed menus etc., up to
possibly open help viewers? What if the program is not prepared for a
different language, because e.g. a tax assistant is bound to a specific
country?


DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-27 Thread Hans-Peter Diettrich


Michael Schnell schrieb:

I now understand that the Element Size field in the String header is 
quite dummy, as under the hood there are two completely separate 
concepts for one-byte-Strings and 2-Byte Strings and none for other 
Element sizes.


After a code review I realized that the element size field is specific 
to dynamic strings, not present in dynamic arrays. Since the element 
size is bound to the string type, it could be omitted in the FPC 
implementation. [With little win, when the record alignment is preserved]


This to me is not obvious at all, as the language syntax and the String 
header data structure suggest a more universal paradigm for multiple 
string type brands, that each have an element-size and 
code-ID-number setting, handled by a common infrastructure.


This may have been envisaged by the Delphi architects, but was not 
continued later.


The universal paradigm would allow for extensions (e.g. UTF-32, 
multiple 16 Bit Code pages, an additional fully dynamic String type, 
n-byte un-encoded string types), as I described in the Wiki page.


Even if feasible, such arbitrary string storage can dramatically 
increase the number of implicit string conversions. An *efficient* 
implementation would be based on a single program-wide string 
representation, with different encodings being handled only in an 
exchange with external data sources.


That standard encoding may be Ansi or Unicode; even Delphi allows for 
both models, where Ansi again suggests the use of one specific codepage 
(CP_ACP) for best performance.



<Cassandra>
After all I have the impression that the known RawByteString flaws will 
never be fixed in Delphi, in order to encourage the users to take the 
step to UnicodeString. Now the question is whether these flaws are fixed 
in FPC, or whether Lazarus will become the first project that definitely 
requires a complete move to UnicodeString, for reliable operation.

For best support of non-UTF-16 platforms I'd suggest fixing the flaws...
</Cassandra>

DoDi


Re: [fpc-devel] ThousandSeparator

2014-11-27 Thread Hans-Peter Diettrich


Frederic Da Vitoria schrieb:
2014-11-26 16:54 GMT+01:00 Hans-Peter Diettrich drdiettri...@aol.com:


2) Formatted numbers, as entered by the user (maybe by copy&paste
from other applications), can have various encodings. Before a
conversion into binary values I'd remove all unexpected characters,
except for the last (rightmost) '.' or ',', which then becomes the
decimal separator as expected by the decoding function (RTL provided).


You mean that the first string to be converted to binary would 
automatically set the decimal separator?


No, my code would make no assumption about the format of strings edited 
by the user.


That would seem dangerous to 
me. What if the first string to be converted contained something like 
11,000, does this mean 11000 with thousand separator = comma (which 
would be true in at least USA), or 11 with decimal separator = comma 
(which would be true at least in France)? I can't think of any way to 
choose automatically.


Okay, that would require more knowledge about the value kept in a 
specific input field, for range checks or the like. As long as the 
thousands separators occurring in the string differ from '.' and ',', 
they are quite easy to identify.


AFAICS, the code needs either to use the system 
settings or to be told explicitly by the developer. Even relying on the 
system settings may not be enough, because one may need to import data 
formatted with different national settings from the system's settings.


Right. When e.g. a CAD program is fed with sizes from an external data 
sheet, it cannot be expected that the figures in that file change 
together with the system language, and are converted between inch and 
meter, temperatures between F and C ;-)


So it looks to me as a stupid idea, when the user changes such system 
settings *while* such a program is running. Furthermore the use of 
national formatting conventions for the exchange of values across 
applications looks to me like another stupid idea. Would somebody expect 
or even like it, when e.g. constant declarations are converted when 
copied into a Lazarus editor, and the compiler would require that all 
constants in source code conform to the current settings?


As mentioned in other contributions, the number formatting seems to be a 
Windows specific problem. How to deal with imported numbers on other 
platforms, with arbitrary settings per application?



After all a program could, when notified of such changes, ask the user 
whether to continue or restart, or force a restart. Restart should be 
safe, but when the user decides to continue, he must be aware of 
possible problems. When the restart takes considerable time, the user 
may learn that his behaviour is not very clever ;-)


DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-27 Thread Hans-Peter Diettrich


Michael Schnell schrieb:

On 11/26/2014 07:13 PM, Hans-Peter Diettrich wrote:


Not all codepages have a fixed number of bytes per character.
The string preamble contains the *element size* (1 for AnsiString), 
just like with every dynamic array.
Sorry for sloppy wording. Of course I did mean element size 
("Character" here obviously is not a printable item).


I'd restrict the use of "character" to physical Char types, just to 
avoid any misinterpretation.


Printable items (glyphs) are independent from the storage format. 
Ligatures or umlauts can consist of multiple codepoints, and several 
Unicode codepoints are not even printable.


A single printable character, as selectable by a single cursor step, 
can consist of multiple codepoints, even (or just) in Unicode.
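
To illustrate the point about codepoints versus printable characters, 
here is a minimal sketch (my own example, not taken from the wiki). It 
builds the same umlaut once as a single codepoint and once in decomposed 
form, and compares what Length reports for each:

```pascal
program decomposed;
{$mode objfpc}{$H+}
var
  Precomposed, Decomposed: UnicodeString;
begin
  // U+00E4: a with diaeresis as a single codepoint
  Precomposed := WideChar($00E4);
  // U+0061 'a' followed by U+0308 (combining diaeresis)
  Decomposed := WideChar($0061);
  Decomposed := Decomposed + WideChar($0308);
  // Length counts UTF-16 code units, not what the user sees as one letter
  WriteLn('precomposed length: ', Length(Precomposed));
  WriteLn('decomposed length:  ', Length(Decomposed));
  // The plain comparison does not normalize, so the two forms differ
  WriteLn('equal:              ', Precomposed = Decomposed);
end.
```

Both strings would be rendered as one letter, yet they differ in length 
and do not compare as equal without a normalization step.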



That's why I'd expect that the FPC documentation includes a glossary and 
definition of the terms, which should be used in the documentation and 
discussions.


DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-27 Thread Hans-Peter Diettrich


Jonas Maebe schrieb:

On 26/11/14 23:41, Hans-Peter Diettrich wrote:

In this case the implementation is compiler specific, somewhat
different from undefined (in a RawByteString):
CP_NONE: this value indicates that no code page information has been
associated with the string data. The result of any explicit or implicit
operation that converts this data to another code page is undefined.

IMO the result is well defined: it's the string with the encoding of
that other codepage.


Unless you actually tested this on all platforms and noted that is the
case, you cannot state this. And if you would actually test it, you
would discover that it is wrong
(http://bugs.freepascal.org/view.php?id=22501#c61238 ).


Bugs obviously violate some specification/definition, else it's not a 
bug, it's a feature ;-)



As mentioned in a previous discussion: don't use IMO (in my opinion)
when talking about testable facts. A testable fact is either true or
false, opinions do not enter the picture.


We're just talking about interpretations, not facts.



An undefined result, as I understand it, would
mean the result can be anything, unrelated to the function input.


Which is 100% correct.


Do you see any use for such function definitions, except in random 
generators?



IMO a better wording should be found, that does not cause the current
obvious confusion of some readers.


The confusion only occurs for readers that do not believe what is written.


Such statements come only from writers that do not believe that their 
words can be understood in various ways ;-)


DoDi


Re: [fpc-devel] Regionalisation (Was ThousandSeparator)

2014-11-27 Thread Hans-Peter Diettrich


Michael Thompson schrieb:

I hear you, but this issue is so much wider than separators.  I know one 
software package that will only successfully export data to excel if the 
system regional is one of the English (xxx) variations (Australian 
guaranteed to work, not really played with the rest...).  In this case, 
the client (in Denmark) has one PC in a corner, set to Australian 
settings, just for exports...


This may be a relic from the time when Microsoft found it a good idea 
to nationalize VBA (VB, Word, Excel...). I appreciated that insofar 
as no English macro virus could become active in my German Word (with 
only German keywords). The same language barrier may prevent proper data 
export, maybe starting with slightly different keyword spellings like 
for color/colour.


Similar problems exist(ed) in RTF export, so that MS had to ship another 
WinHelp compiler (HC) for every new WinWord version, that worked around 
the new errors in the RTF sources exported from Word, even after VBA was 
reverted to unique English-only keywords and function names.



As an Australian developer, this is just embarrassing...


I never felt a need or reason for considering Microsoft products as 
anything but buggy toys, hardly usable outside the USA :-(



To come back to the current discussions, the introduction of Unicode (as 
UCS-2 and UTF-16) was a similar (typically American) mistake, totally 
ignoring e.g. any Chinese character set (in favor of Klingon?). Apple, 
as another US company, invented the decomposed Unicode filenames - for 
lack of oversight, or to establish artificial platform barriers?


The step from strictly national Ansi applications to Unicode is a very 
tiny one, compared to the leaps that have to be taken afterwards, in 
order to make the program really work in foreign countries. I wonder how 
e.g. Belgian, Canadian or Swiss software has been written in such 
multi-lingual countries before, and how it is written nowadays.


DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-27 Thread Hans-Peter Diettrich


Michael Schnell schrieb:

On 11/26/2014 06:37 PM, Hans-Peter Diettrich wrote:


An AnsiString consists of AnsiChar's. The *meaning* of these char's 
(bytes) depends on their encoding, regardless of whether the used 
encoding is or is not stored with the string.
I understand that the implementation (in Delphi) seems to be driven more 
by the wording (ANSI) than by the logical paradigm the language syntax 
suggests. The language syntax and the string header fields suggest that 
both the element-size and the code-ID-number need to be adhered to (be it 
statically or dynamically - depending on the usage instance). E.g. there 
are at least two code pages for UTF-16 (LE and BE) that would 
be worth supporting.


You are confusing codepages and encodings :-(

UTF-7, UTF-8, UTF-16 and UTF-16BE describe different representations of 
the same values (Unicode codepoints). And I agree, all commonly used 
encodings should be implemented, at least for data import/export.



It's essential to distinguish between low-level (physical) AnsiChar 
values, and *logical* characters possibly consisting of multiple 
AnsiChars.
I now do see that the implementation is done following this concept. But 
the language syntax and the string header field suggest a more versatile 
paradigm, providing a universal reference counting element string type.


See it as a multi-level protocol for text processing. The bottom 
(physical) level deals with physical storage items (AnsiChar, 
WideChar...), and how they are stored in memory or files. Like it 
doesn't make sense to deal with individual bytes of real numbers in 
computations, it doesn't make sense to deal with individual bytes 
(AnsiChars) of logical characters - except in type/encoding conversions. 
Higher levels deal with logical values, which can consist of multiple 
physical items, and may need different interpretations (in case of Ansi 
codepages). This level is partially covered now by AnsiString encodings 
and UTF-16 surrogate pairs, which allow mapping the values into full 
Unicode (UCS-4) codepoints. But these codepoints still are not 
sufficient for a correct interpretation and manipulation of logical 
characters, which again can consist of multiple codepoints (decomposed 
umlauts, ligatures...). In a next level another (mostly language 
specific) interpretation may be required, like which logical characters 
have to be treated together (ligatures, non-breaking characters...). 
Some natural languages (Hebrew, Arabic...) require another special 
handling of (mixed) LTR/RTL reading, and of paths, influencing the 
graphical representation of character sequences; but that's nothing an 
application or library writer should have to deal with, such 
functionality should be provided by the target platform.


There must be a boundary between the standard (RTL) handling of the 
physical items and encodings, and higher text processing levels, up to 
language specific processing (how to break words, when to apply 
capitalization, syntax checks...), so that such special handling can be 
implemented in dedicated extensions (libraries, classes), by developers 
familiar with the rules and conventions of the natural languages.


For now we are talking only about the handling up to individual Unicode 
codepoints, and related string manipulation. For this, at least one 
string representation must exist that covers the full Unicode range of 
codepoints (UTF-8 or UTF-16 for now). When such an implementation claims 
undefined behaviour, then this can only mean implementation flaws, 
resulting in something different from what can be expected from proper 
Unicode handling. This includes invalid parameter values in subroutine 
calls, which should result in proper (defined) runtime error reporting 
(AV, error result...).


WRT AnsiString encodings, the only acceptable (expected) differences 
can result from lossy conversions, when converting proper Unicode into a 
non-UTF encoding. Even then the results should be consistent, even if 
the concrete results depend on some external (platform...) convention or 
settings.


IMO.


That's why I wonder *when* exactly the result of such an expression 
*is* converted (implicitly) into the static encoding of the target 
variable, and when *not*.
I understand that the idea is, to use the static encoding information 
provided by the type definition whenever possible.


Right, but here whenever possible depends on the correspondence of 
static and dynamic encoding. When the dynamic encoding can *ever* be 
different from the static encoding, except for RawByteString, I consider 
it NOT possible to derive the need for a conversion from the static 
encoding. In the handling of floatingpoint values we may have to expect 
invalid operations (division by zero, overflow...) or values (NaN...), 
but NOT that a Double variable ever contains two Integer values - unless 
forced by dirty hacks out of compiler control. Why should this be 
different and acceptable

Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-26 Thread Hans-Peter Diettrich


Mattias Gaertner schrieb:

On Wed, 26 Nov 2014 11:23:17 +0100
Michael Schnell mschn...@lumino.de wrote:


Seemingly here the bytes per character setting implicitly is thought 
of as a part of the code-page definition. correct ?


Code pages define bytes per character.


Huh?

Not all codepages have a fixed number of bytes per character.
The string preamble contains the *element size* (1 for AnsiString), just 
like with every dynamic array.




As you know: Don't confuse character with glyph and codepoint.


Right, but what is what?

I feel a need for an exact (official) definition of such (and more) 
terms, in order to prevent further misunderstandings of the 
documentation and in discussions.


E.g. code page has different meanings, when used with ANSI/ISO and 
Unicode character sets.
While ANSI/ISO codepages describe different mappings of bytes into 
characters, Unicode codepages define subsets of the whole Unicode range.


My understanding of character is a *logical* unit (letter), with 
possibly different encodings, values and sizes in different codepages 
(character sets).

What's the term for the *physical* unit (AnsiChar, WideChar)?



Ansistring supports only one byte per character code pages.


Huh?

What's your definition of character?

AnsiString supports MBCS codepages as well. The restriction is the 
physical storage unit (1 byte per string item), as imposed by AnsiChar.
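
A small sketch of what this means in practice (my own illustration, 
assuming a current FPC with code page aware strings). It stores one 
logical character as two AnsiChars in a UTF8String. The bytes are set by 
hand so that the result does not depend on the source file encoding:

```pascal
program mbcs;
{$mode objfpc}{$H+}
var
  S: UTF8String;
  W: UnicodeString;
begin
  // UTF-8 byte sequence for U+00E4 (a with diaeresis): C3 A4
  SetLength(S, 2);
  S[1] := #$C3;
  S[2] := #$A4;
  // The element size describes the physical unit (AnsiChar)
  WriteLn('element size: ', StringElementSize(S));
  // Length counts physical units (bytes), not logical characters
  WriteLn('Length(S):    ', Length(S));
  // After decoding to UTF-16 the same character needs fewer units
  W := UTF8Decode(S);
  WriteLn('UTF-16 units: ', Length(W));
end.
```

The string holds one logical character, but Length reports the number 
of bytes, which is the restriction imposed by AnsiChar.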


DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-26 Thread Hans-Peter Diettrich


Michael Schnell schrieb:

On 11/26/2014 11:40 AM, Mattias Gaertner wrote:
Ansistring supports only one byte per character code pages. 


Even more confused. Am I wrong thinking that with code aware Strings,  
for Delphi XE compatibility, in Windows CP_ACP needs to be UTF16 (if not 
right now, then due later) ?


Delphi XE does not properly support UTF-8. CP_ACP seems to depend on 
western/far-eastern versions, where the western version assumes and 
allows for any SBCS; I don't know of the same in far-east versions.
The SBCS restriction allows to simplify standard string handling and 
conversions, because every character (=byte) can be exchanged in place. 
UTF-8 doesn't fit into this picture, because it's a MBCS.


UTF-16 is not a valid value for CP_ACP in Delphi, because it's a 2-byte 
encoding. Even if the Delphi architects may have thought about a common 
string type, with a variable element size (1,2,4), this certainly soon 
turned out to be a stupid idea, so that AnsiString and 
WideString/UnicodeString still are strictly distinct types. WideString 
and UnicodeString imply UTF-16, with platform specific byte order 
(endianness). The latter becomes important almost only to compiler and 
library coders, in host/network byteorder conversions. For the sake of 
completeness, pdp-11 processors use yet another byte order, maybe more 
word-based processors (DG...) as well.


DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-26 Thread Hans-Peter Diettrich


Michael Schnell schrieb:

On 11/26/2014 12:09 PM, Sven Barth wrote:
 In Delphi (and FPC) CP_ACP corresponds by default with the current 
system codepage (e.g. CP1252 on a German Windows). 


OK. So in Delphi XE (in Germany) String(CP_ACP) is the same as 
String(CP1252) but different from String without brackets which in turn 
is the same as String(CP_UTF16) ? Correct ?


CP_ACP and CP_NONE describe *static* encodings and have fixed 
values (CP_ACP=0, CP_NONE=$FFFF). The dynamic encoding of strings, kept 
in AnsiString(0) or RawByteString variables, must be obtained from the 
string itself. When the string is empty, StringCodePage returns 
DefaultSystemCodePage (for CP_ACP).
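
For anybody who wants to check this on his own platform, a minimal 
sketch (assuming FPC's StringCodePage and the CP_xxx constants from the 
System unit; the printed values depend on the system, so none are shown 
here):

```pascal
program cpquery;
{$mode objfpc}{$H+}
var
  A: AnsiString;       // static code page CP_ACP
  U: UTF8String;       // static code page CP_UTF8
  R: RawByteString;    // static code page CP_NONE
begin
  WriteLn('CP_ACP = ', CP_ACP, ', CP_NONE = ', CP_NONE,
          ', CP_UTF8 = ', CP_UTF8);
  WriteLn('DefaultSystemCodePage: ', DefaultSystemCodePage);
  A := '';
  WriteLn('empty AnsiString:      ', StringCodePage(A));
  U := 'abc';
  R := U;  // assignment to RawByteString: no conversion takes place
  WriteLn('UTF8String:            ', StringCodePage(U));
  WriteLn('RawByteString copy:    ', StringCodePage(R));
end.
```

The interesting lines are the last two: the RawByteString has no 
encoding of its own, it reports whatever the assigned string carried.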



CP_UTF16 is not supported, because AnsiString only supports 1-Byte 
character strings (and UTF-8 as the odd one) and not 2-Byte character 
strings.


I still don't understand. The wiki article seems to suggest that it is 
about a type called ANSIString that features a dynamically settable 
code page information. From discussions about Delphi and FPC, I only 
know a String type with a dynamically settable code page information 
that also features a dynamically settable Bytes per Character 
information and hence does support 1, 2 and 4 Bytes per Character. 
(e.g. UTF-8, UTF-16, and UTF-32).


You should have noticed that there exists no String or Char type, that 
would allow for arbitrary bytes/char counts (see my other answer for 
details).



The difference to Delphi currently is that for FPC 
String=AnsiString(CP_ACP) and for Delphi String=UnicodeString (aka 
2-Byte string).




I understand that you mean (e.g.) Delphi XE. But what version of FPC is 
currently. Am I wrong assuming that in the svn we do have the 
NewStrings library that supports dynamical code-page *and* 
byte-per-character settings and hence supports e.g. CP1251, UTF-8, 
UTF-16, and UTF-32 ?


The byte-per-character field is read-only, just like for any dynamic array.

So I seem to understand the meaning of 
String(CP1252), String(CP_UTF8), and String(CP_UTF16) (which seems do be 
the Delphi notation), but I seemingly don't get the exact meaning of 
AnsiString(CP_ACP) or AnsiString(CP1251)


The Delphi notation is the same, e.g. AnsiString(CP_ACP).

In the end, what the definition of String without brackets is, might 
be due to a settable compiler option and/or the OS the compiler is set 
to create code for.


Right, the *generic* String type can be mapped to either ShortString, 
AnsiString(0) or UnicodeString, depending on compiler versions and 
switches. A rough guess can be derived from sizeof(Char).
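
Such a guess could look like the following sketch (my own example, 
nothing official). It only distinguishes the byte-based types from 
UnicodeString; telling ShortString from AnsiString additionally needs 
SizeOf(String):

```pascal
program strkind;
{$mode objfpc}{$H+}
begin
  // SizeOf(Char) is a compile-time constant that follows the generic
  // Char type selected by the compiler mode and switches.
  if SizeOf(Char) = 1 then
    WriteLn('Char = AnsiChar: String is ShortString or AnsiString')
  else
    WriteLn('Char = WideChar: String is UnicodeString');
  // A ShortString occupies its full buffer, while AnsiString and
  // UnicodeString variables are only pointer-sized references.
  WriteLn('SizeOf(String) = ', SizeOf(String));
end.
```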


DoDi


Re: [fpc-devel] Codepage aware RTL

2014-11-26 Thread Hans-Peter Diettrich


Mattias Gaertner schrieb:

Hi,

The page about FPC Unicode support mentions what has already been
updated to preserve character data.
http://wiki.freepascal.org/FPC_Unicode_support#RTL_changes

Is there already a page about what has not (yet) been updated aka does
not work with all code pages?


You mean this section?
http://wiki.freepascal.org/FPC_Unicode_support#RTL_todos

DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-26 Thread Hans-Peter Diettrich


Michael Schnell schrieb:

I fail to understand some of the text.

It seems to be unavoidable to use the name ANSIString even though I 
always though up when seeing a thing called ANSI containing Unicode 
(e. g.   UTF8String = type AnsiString(CP_UTF8) ).



Seemingly here the bytes per character setting implicitly is thought 
of as a part of the code-page definition. correct ?


An AnsiString consists of AnsiChar's. The *meaning* of these char's 
(bytes) depends on their encoding, regardless of whether the used 
encoding is or is not stored with the string.


It's essential to distinguish between low-level (physical) AnsiChar 
values, and *logical* characters possibly consisting of multiple AnsiChars.




In section Dynamic code page:

When assigning a string to a plain AnsiString (= AnsiString(CP_ACP)) or 
ShortString, the string data will however be converted to 
DefaultSystemCodePage. The dynamic code page of that AnsiString(CP_ACP) 
will then be the current value of DefaultSystemCodePage (e.g. 1250 for 
the Windows-1250 code page), even though its static code page is CP_ACP 
(which is a constant <> 1250). This is one example of how the static 
code page can differ from the dynamic code page. Subsequent sections 
will describe more such scenarios.


1) A short String does not have a Code page notification so for this 
static code page can differ from the dynamic code page does not seem 
to make much sense.


The text correctly states dynamic code page of that AnsiString. 
ShortString (and AnsiChar) has no encoding indicator, they are assumed 
to be encoded in CP_ACP.



2) I fail to understand how with this explanation that seems to force 
auto conversion for assignments between types with different code page 
settings (also for CP_ACP) the static code page can differ from the 
dynamic code page can happen.


Continue reading until you understood the special handling of string 
literals and RawByteString.


In fact this disaster seems to be able to happen (see section 
RawByteString) if assigning a string with a static code page X1 to a 
RawByteString (hence no conversion) and then assigning that 
RawByteString to a string with a static code page X2 (no conversion 
again). In fact I assume that without abusing RawByteString such 
intersexual strings can't be produced, otherwise this would be rather 
disastrous for normal users.


*All* intermediate strings, generated during the evaluation of string 
expressions, only have a dynamic encoding, thus can be considered as 
being RawByteStrings.


That's why I wonder *when* exactly the result of such an expression *is* 
converted (implicitly) into the static encoding of the target variable, 
and when *not*.


Obviously the compiler inserts a conversion request for the *direct* 
assignment of one string variable to another one, of a different 
*static* encoding. But what happens when a string expression doesn't 
have such a known static encoding???




In section RawByteString:

the results of conversions from/to the CP_NONE code page are undefined.

In effect the behavior is exactly defined in this section As a first 
approximation.


Right, the result *is* well defined, but has no *predetermined* dynamic 
encoding.


The entire mess results from the bad interpretation of RawByteString 
assignments, which IMO was well thought by the Delphi language 
architects, but not understood by the Delphi compiler coders. This 
interpretation also found its way into FPC:


Less intuitive is probably that when a RawByteString is assigned to an 
AnsiString(X), the same happens: no code page conversion[...]


It's clear that a conversion *can* be omitted for every assignment *to* 
a RawByteString. That's one of the purposes of that type - to avoid 
excess conversions into CP_ACP or UnicodeString.


But it's unclear why the heck the conversion should be omitted in an 
assignment to any *other* AnsiString type, as soon as the source string 
is a RawByteString???


Therefore I'd suggest a compiler switch, implementing the lame Delphi 
compatible behaviour only on *demand*, while the FPC default would force 
any required conversions with *every* assignment to any other (non-CP_NONE) 
AnsiString type. This simple change will safely prevent strings of 
different static and dynamic encoding, so that the corresponding tests 
can be removed safely from library *and* user code.



The proper use of RawByteStrings deserves further documentation, for 
users who want/need their own (generic) stringhandling routines. Topics 
should be:

- how to determine the dynamic encoding of strings (StringCodePage)
- how to force required conversions (SetCodePage)
- how to deal with strings of different encodings
- how to minimize the number of string conversions
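
As a starting point for the first two topics, a minimal sketch of a 
generic routine. EnsureCodePage is a hypothetical helper of mine, not an 
RTL function; StringCodePage and SetCodePage are the System unit 
routines. The demo uses ASCII-only data, because on Unix real 
conversions between code pages need a widestring manager (e.g. the 
cwstring unit):

```pascal
program rawbyte;
{$mode objfpc}{$H+}

{ Hypothetical helper: returns a copy of S that carries the requested
  dynamic code page, converting only when necessary. }
function EnsureCodePage(const S: RawByteString;
  CP: TSystemCodePage): RawByteString;
begin
  Result := S;
  // Check the dynamic encoding first, to avoid needless conversions
  if (S <> '') and (StringCodePage(S) <> CP) then
    SetCodePage(Result, CP, True);  // True: convert data, not only the tag
end;

var
  U: UTF8String;
  R: RawByteString;
begin
  U := 'plain ASCII text';
  // Passing U as RawByteString parameter does not convert it
  R := EnsureCodePage(U, 1252);
  WriteLn(StringCodePage(U), ' -> ', StringCodePage(R));
end.
```

Keeping parameters and results as RawByteString is what avoids the 
implicit conversions; the only conversion left is the explicit one.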

DoDi


Re: [fpc-devel] ThousandSeparator

2014-11-26 Thread Hans-Peter Diettrich


Ewald schrieb:


Of the OS/window manager actually. You are of course right in that there
are a certain set of separators that can be used, but the exact
separator to use is dependent on the system.


Sounds easy, but just yesterday I ran into a bunch of related problems. 
Even if the following is somewhat OT, my observations may be helpful to 
somebody else:


I just encountered a really problematic case, with the Ez-Builder IDE. 
That program aborts when the decimal separator is not a period '.', 
asking the user to adjust his national/language settings in the system. 
So what can/should a user do, in order to run this program on my German 
Windows with a comma ',' as the decimal separator?


A developer might ask the author to add proper handling of the 
system-wide national settings. But when that author spends time in 
presenting instructions, how to change the inconvenient setting, instead 
of correcting his code, I doubt that such a wish will be heard :-(


The dumb user might follow the instruction, causing problems in all 
other programs :-(
When the system does not notify all other (running) programs of such an 
global change, or when some other stupid program doesn't know how to 
deal with changed settings, the user better shuts down and restarts his 
system, before and after using that ill behaved program.


But exactly *what* should a clever program do, when it receives such a 
change notification? What should happen with the formatted numbers, 
shown in the forms of the program? Which code (app/OS?) puts the 
separators into formatted number strings?


I don't know if it's worth to discuss such problems in detail, so let me 
only present my preferred handling:


1) The actual settings are determined at program start, and remain 
unchanged until program termination.


2) Formatted numbers, as entered by the user (maybe by copy&paste from 
other applications), can have various encodings. Before a conversion 
into binary values I'd remove all unexpected characters, except for the 
last (rightmost) '.' or ',', which then becomes the decimal separator as 
expected by the decoding function (RTL provided).


3) For all other (non-GUI) purposes a unique string format is used, 
according to the conversion functions used by the compiler. This means 
no thousands separator, and a '.' decimal separator.
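
Steps 2) and 3) could be sketched as follows. NormalizeNumber is a 
hypothetical helper, written only for this mail; StrToFloat with an 
explicit TFormatSettings is the SysUtils routine. Exponents are not 
handled, and a lone thousands separator remains ambiguous:

```pascal
program normnum;
{$mode objfpc}{$H+}
uses
  SysUtils;

{ Hypothetical helper: keeps digits and a leading sign, drops all other
  characters, and turns the rightmost '.' or ',' into a '.' }
function NormalizeNumber(const S: string): string;
var
  i, LastSep: Integer;
begin
  Result := '';
  LastSep := 0;
  // Find the rightmost '.' or ','
  for i := Length(S) downto 1 do
    if S[i] in ['.', ','] then
    begin
      LastSep := i;
      Break;
    end;
  for i := 1 to Length(S) do
    if S[i] in ['0'..'9'] then
      Result := Result + S[i]
    else if (S[i] in ['+', '-']) and (Result = '') then
      Result := Result + S[i]
    else if i = LastSep then
      Result := Result + '.';
end;

var
  FS: TFormatSettings;
begin
  // Fixed format for the conversion, independent of the system settings
  FS := DefaultFormatSettings;
  FS.DecimalSeparator := '.';
  FS.ThousandSeparator := #0;
  WriteLn(StrToFloat(NormalizeNumber('1.234,56'), FS):0:2);
  WriteLn(StrToFloat(NormalizeNumber('1''234.56'), FS):0:2);
  // Ambiguous input: read as 11.000, although 11000 may have been meant
  WriteLn(StrToFloat(NormalizeNumber('11,000'), FS):0:3);
end.
```

The last call shows the limit of this approach: without knowledge about 
the expected range of the value, the rightmost separator is always 
taken as the decimal separator.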



But back to the original problem: I managed to create another user, 
whose number format settings match the expectations of the Ez-Builder, 
while using my German keyboard. For Linux users this may sound like an 
easy job, but adding and configuring users in Win8 turned out as kind of 
a nightmare :-(
Win8 requires an eMail address for every new user, but entering a fake 
address only allows to create the account, without any chance to log in 
subsequently. Probably the requested password has to be established by 
mail, at least I found no way to disable or specify or reset the 
password for the new account.
Fortunately I had retained a Guest account, could log in and adjust the 
format settings as prescribed, and then could successfully start Ez-Builder.


After all I hope that these problems are due to the cheap (Premium?) 
version of my Win8, that is *intentionally* crippled in several ways.


Conclusion:
Proper handling of separators in formatted numbers is essential, or else 
users may run into such big trouble that they will drop your program as 
unusable.


DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-26 Thread Hans-Peter Diettrich


Mattias Gaertner schrieb:


For example:
CP_ACP=0, DefaultSystemCodePage=1252
That means static code page is always 0, while dynamic code page can be
0 or 1252. Both describe the same encoding.


A *dynamic* encoding *never* can be CP_ACP nor CP_NONE (in Delphi). 
These values are allowed only for *static* types in type declarations.

CP_UTF16 is also not allowed.

Delphi StringCodePage reports the current default codepage 
(DefaultSystemCodePage) for empty AnsiStrings, CP_UTF16 for all 
UnicodeStrings.



In section RawByteString:

the results of conversions from/to the CP_NONE code page are undefined.


... because CP_NONE is not a real code page.


The same for CP_ACP.

DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-26 Thread Hans-Peter Diettrich


Michael Schnell schrieb:

So seemingly you could do MyStringType   = type 
AnsiString(CP_UTF16), and seemingly the size information is set 
according to this.


Not in Delphi XE.

DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-26 Thread Hans-Peter Diettrich


Jonas Maebe wrote:

On 26/11/14 17:41, Tomas Hajny wrote:
BTW, in this context - can users choose UTF16BE on little endian 
platforms (and vice versa)?


No, because we do not have any routines that allow a user to set/change
the codepage of a unicodestring (either at run time or at compile time).


What about file I/O?
It should be possible to read (and write) files of either endianness.
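
As long as the RTL offers no code page for the "other" endianness, a file reader can swap the bytes itself after loading the data. The following is a minimal sketch under that assumption; SwapUtf16Units is a hypothetical helper, not an RTL function, and detection of the byte order (e.g. by a BOM) is left out.

```pascal
program utf16swap;
{$mode objfpc}{$H+}

// Hypothetical helper: returns a copy of S with the two bytes of every
// UTF-16 code unit exchanged (UTF-16BE <-> UTF-16LE).
function SwapUtf16Units(const S: UnicodeString): UnicodeString;
var
  i: SizeInt;
begin
  Result := S;
  for i := 1 to Length(Result) do
    Result[i] := WideChar(SwapEndian(Word(Result[i])));
end;

var
  u: UnicodeString;
begin
  u := SwapUtf16Units('AB');
  // Swapping twice must give back the original text.
  if SwapUtf16Units(u) = 'AB' then
    WriteLn('round trip ok');
end.
```

Surrogate pairs need no special handling, because every 16-bit code unit is swapped on its own.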

DoDi


Re: [fpc-devel] Trying to understand the wiki-Page FPC Unicode support

2014-11-26 Thread Hans-Peter Diettrich


Jonas Maebe wrote:


Technically, that section literally states that they will be
concatenated without data loss and that the result is then converted to
the target string's encoding (except in case the target is
RawByteString). How that is implemented exactly is undefined; again in
the meaning of undefined, not in the meaning of undefined when
defined as meaning X.


In this case the implementation is compiler specific, somewhat 
different from undefined (in a RawByteString):
CP_NONE: this value indicates that no code page information has been 
associated with the string data. The result of any explicit or implicit 
operation that converts this data to another code page is undefined.


IMO the result is well defined: it's the string with the encoding of 
that other codepage. An undefined result, as I understand it, would 
mean the result can be anything, unrelated to the function input.


Likewise, the branch taken when an IF statement executes is not 
undefined merely because it depends on the actual condition value.


The value of a local variable initially is undefined, i.e. can be any 
value. But after an assignment it *is* defined, even if that value still 
may be *unpredictable* by static code analysis.


IMO a better wording should be found, one that does not cause the 
confusion that some readers obviously have now.




Regarding RawByteStrings there has been the definition a RawByteString
has exactly the same behavior as assigning that AnsiString(X) to another
AnsiString(X) variable with the same value of X: no code page conversion
or copying occurs. Seemingly this is not true for the intermediate
results of concatenations.


That paragraph only specifies that code page-aware strings are
concatenated without data loss, and then defines to which code page the
result will be converted before assigning it to the target.


What's the meaning of no copying occurs? Of course the reference to 
the string is copied into the target variable!


What's the same value of X, in case of AnsiString(CP_ACP) and 
AnsiString(DefaultSystemCodePage)?




Even if the intermediary result of a concatenation would be a
RawByteString (which is not stated nor necessarily ever the case), then
the above would apply and hence the (dynamic) code page of that
RawByteString would be the one as defined by the above-mentioned rules
before it would be assigned to the target.


Please note that the other statements refer to *static* encodings, 
hence my question about the (assumed) static encoding of an 
intermediate result. When the compiler inserts conversion requests 
based on *static* encodings, will it or will it not insert such a 
request before an intermediate result is assigned to the target variable?



Suggestion:

During string operations the source strings are converted [to CP_ACP?] 
when they have a different [dynamic?] encoding. When the result is 
stored in a variable, it is converted as required by the static encoding 
of the target.


Where as required means that a static target encoding of CP_ACP is 
replaced by the DefaultSystemCodePage, while CP_NONE does not require a 
conversion.


The CP_ACP case should be clarified as well, because it's unclear 
whether CP_ACP(=0) is *considered* equal to the current 
DefaultSystemCodePage, even if both values are *always* different (see 
above). The use of CP_ACP instead of DefaultSystemCodePage can be 
confusing and should be avoided or clarified before.


Perhaps it would help to concentrate on the following steps:
1) (string) operand fetch
2) (string) operations
3) (string) assignment

1) Fetching an operand removes any information about the static encoding 
of the source, only its dynamic encoding persists.
[Now the handling of non-AnsiString sources can be explained, like for 
literals, ShortString etc.

RawByteString is not special here, it's only a static encoding.
]

2) String operations take into account the dynamic encoding of their 
operands, with lossless conversions inserted as required.


3) When a string is assigned to a variable, it is converted, if necessary, 
as required by the static encoding of the target, with possible data loss.

[about required see above.
Special case: when the source is a variable, no conversion occurs when 
the *static* source and target types are compatible.

What exactly is compatible with CP_ACP?
]
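
The three steps above can be observed with a small FPC test program. It is only a sketch: TCp1252String is a name made up for the example, and the code pages actually printed depend on the compiler version and platform, which is exactly what such a test is meant to find out.

```pascal
program concatdemo;
{$mode objfpc}{$H+}
type
  TCp1252String = type AnsiString(1252);  // hypothetical name
var
  a: TCp1252String;   // static code page 1252
  u: UTF8String;      // static code page CP_UTF8
  r: RawByteString;   // static code page CP_NONE
  t: TCp1252String;
begin
  // 1) operand fetch: only the dynamic encodings of a and u matter
  a := 'abc';
  u := 'def';

  // 2) + 3) the same concatenation, assigned to two different targets
  r := a + u;   // RawByteString target: no conversion required
  t := a + u;   // target with static code page 1252

  WriteLn('RawByteString target: ', StringCodePage(r));
  WriteLn('1252 target:          ', StringCodePage(t));
end.
```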

DoDi


Re: [fpc-devel] ThousandSeparator

2014-11-25 Thread Hans-Peter Diettrich


Michael Van Canneyt wrote:


The ThousandSeparator is char and supports only 1 byte characters.
For example French and Russian need more.
Are there any plans to extend it?


Plans: yes. Time: no.

Maybe a widechar is sufficient ? Making it a string is more invasive 
than making it a widechar.


Are all possible separators members of the Unicode BMP?
What happens when a properly decorated string has to be converted to a 
specific (AnsiChar) codepage?


I'd assume that national separators are part of the according codepage, 
but is that always true?


DoDi


Re: [fpc-devel] ThousandSeparator

2014-11-25 Thread Hans-Peter Diettrich


Mattias Gaertner wrote:


Does concatenating a string and a WideChar create a UnicodeString? Can
this become a problem?


Concatenation requires 2 strings, so everything depends on the concrete 
code. Regardless of any compiler magic, something like this will 
happen:


var c: WideChar; s, cs: string;
begin
  cs := c; // dunno whether the compiler accepts this
  s := s + cs;
end.

The WideChar can be converted into a Unicode (UTF-8 or UTF-16) string. 
Afterwards this string may need another conversion, when the other 
string has a different encoding. In the worst case *both* strings are 
converted to the default Unicode representation (Delphi: UTF-16, 
Lazarus: UTF-8?), before they are concatenated. Another conversion may 
occur when the resulting string is assigned to a variable.


All this may become simpler when CP_ACP is used (at least in Delphi), and 
the separator is given in that encoding, as a single byte/AnsiChar in 
case of an SBCS CP_ACP. When Lazarus instead uses UTF-8 (MBCS) for 
CP_ACP, the character may occupy more than one byte, so that this 
simplification is impossible. This suggests storing the separator as a 
string instead of a WideChar, so that a concatenation of the strings 
may not require any further conversion.


Finally, when the expression (s+cs) is of type RawByteString (depending 
on the involved function declarations), the result will be stored in the 
target variable *without* another conversion. Then the static and 
dynamic encoding of s may be different afterwards.
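
The conversions described above can be checked with a small FPC test program. This is a sketch only: the chosen separator (NO-BREAK SPACE) is just an example of a character outside 7-bit ASCII, and the lengths and code pages printed depend on DefaultSystemCodePage, so no output is claimed. The compiler may also warn about the implicit string conversions.

```pascal
program wcconcat;
{$mode objfpc}{$H+}
var
  c: WideChar;
  s, cs: AnsiString;
  u: UnicodeString;
begin
  c := WideChar($00A0);  // NO-BREAK SPACE as an example separator

  // AnsiString path: the WideChar is converted first, then concatenated.
  s := '1';
  cs := c;               // implicit WideChar -> AnsiString conversion
  s := s + cs + '000';
  WriteLn('AnsiString:    length ', Length(s),
          ', code page ', StringCodePage(s));

  // UnicodeString path: no code page conversion is involved.
  u := '1';
  u := u + c + '000';
  WriteLn('UnicodeString: length ', Length(u));
end.
```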


DoDi


Re: [fpc-devel] Clarify expression grammar

2014-11-06 Thread Hans-Peter Diettrich


Vsevolod Alekseyev wrote:

Hi all,

in the FPC reference at 
http://www.freepascal.org/docs-html/ref/refse68.html#x127-13700012.1 , 
the formal grammar spec only goes down as far as factor. Can I please 
see the grammar for variable reference? A variable reference can be an 
arbitrarily complex thing; for example, 
MyStructArray[MyFunction(I)*10+1].StructMember[Ord(J)] is a perfectly 
valid variable reference.


ACK
I'm also missing the ^, . and [...] operators/selectors from the
list of operators.

[This is a second post, the first one didn't show up yet]

DoDi





Re: [fpc-devel] Clarify expression grammar

2014-11-05 Thread Hans-Peter Diettrich


Vsevolod Alekseyev wrote:

Hi all,

in the FPC reference at 
http://www.freepascal.org/docs-html/ref/refse68.html#x127-13700012.1 , 
the formal grammar spec only goes down as far as factor. Can I please 
see the grammar for variable reference? A variable reference can be an 
arbitrarily complex thing; for example, 
MyStructArray[MyFunction(I)*10+1].StructMember[Ord(J)] is a perfectly 
valid variable reference.


ACK
I'm also missing the ^, . and [...] operators/selectors from the 
list of operators.


DoDi


Re: [fpc-devel] Small virtual machine to cross compile FPC

2014-11-01 Thread Hans-Peter Diettrich


Paul Breneman wrote:


I think 100Mb is a bit small.
You'll need cross-binutils, X, cross-dev libs and whatnot.

650Mb would be feasible, I guess.


Thanks for that info, but couldn't most of that be downloaded into the VM 
*after* it is running?  Seems to me I'd like the *smallest* VM and then 
have a way to load things into that standard PC.  But maybe I'm thinking 
wrongly?  If so please help me get it right.


I don't understand why the VM *size* should matter - unless it's 30GB 
for current Windows versions. My goal would be a *simple* OS, easy to 
configure and manage, and then install into it whatever is required. Why 
download and configure all the required tools whenever the VM is run? 
It may take half a day to get the VM ready for cross-development, and 
the downloads end up on the virtual disk anyway.


For cross-development I'd install a network of dedicated target VMs, one 
of which can host the project files, and then build the project in every 
target VM. This would allow for parallel builds, and every created 
executable can be tested immediately on its platform - also in parallel 
for comparison of the GUI and operation. With a single development VM 
you would need another VM or emulator to perform the final checks, for 
every single target platform.


I've looked at (or tried) laz4android and fpcup.  Seems that such an 
approach would work much better on a standard PC?


Virtual machines work well on the same hardware (CPU), but for other 
targets (ARM instead of x86) an emulator is required. Wikipedia says 
that a LiveCD and AndroVM with Android for x86 is available, where it 
might be possible to develop Android applications somewhat natively on 
an x86 machine. But finally an emulator or physical device is required, 
where the cross-compiled programs can run on their target CPU, using the 
according libraries (RTL, VCL... for ARM).


Please don't ask me about Android, my experience is limited to 
FPC/Lazarus development on various Windows and Linux VMs, and I never 
tried to cross-compile myself. Why cross-compile when I cannot check the 
results?


DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-30 Thread Hans-Peter Diettrich


Sven Barth wrote:

On 28.10.2014 10:15, Michael Schnell wrote:

On 10/27/2014 05:17 PM, Hans-Peter Diettrich wrote:


Something like ShortString and AnsiString?


Only that ShortStrings can easily be avoided (AFAIK, no great
performance advantage to use them) and hence are seldom used right now.


ShortStrings don't have implicit initialization/finalization, thus no 
implicit try/finally blocks, which at least with FPC's platform 
independent exception handling mechanism or SEH on i386-win32 have quite 
some performance impact even in case no error occurred (SEH on 
x86_64-win64 (and in theory arm-wince) only has an impact in case of an 
error (and an impact in binary size for the exception tables)). Also 
there is no reference counting for ShortString.


So: basically the performance of reference counted objects compared to 
normal objects is more or less similar to the performance of AnsiStrings 
compared to ShortStrings (it's not completely equal, because AnsiStrings 
are allocated on the heap while ShortStrings are on the stack, but it's 
good enough...). And did anyone yet complain about the performance of 
AnsiStrings? ;)


Right, this entire discussion is somewhat fruitless without benchmarks.
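
A first, very rough benchmark for the ShortString/AnsiString comparison could look like the sketch below. It assumes a reasonably recent FPC with GetTickCount64 in SysUtils; the procedure names and the loop count are arbitrary, and the result depends heavily on target, optimization level and memory manager, so it only gives a first impression.

```pascal
program strbench;
{$mode objfpc}{$H+}
uses
  SysUtils;
const
  N = 5000000;  // arbitrary loop count

procedure UseShort;
var
  s: ShortString;  // lives on the stack, no implicit finalization
begin
  s := 'abc';
  s := s + 'def';
end;

procedure UseAnsi;
var
  s: AnsiString;   // managed type: implicit try/finally, heap allocation
begin
  s := 'abc';
  s := s + 'def';
end;

var
  i: LongInt;
  t: QWord;
begin
  t := GetTickCount64;
  for i := 1 to N do
    UseShort;
  WriteLn('ShortString: ', GetTickCount64 - t, ' ms');

  t := GetTickCount64;
  for i := 1 to N do
    UseAnsi;
  WriteLn('AnsiString:  ', GetTickCount64 - t, ' ms');
end.
```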

I wonder how difficult it would be to implement the existing Interface 
refcounting model for TObject, so that this runtime variant could be 
tested and benchmarked as well, in addition to the current compile-time 
approach. Given the problems of the compile-time approach revealed in 
this thread, it does not look viable to me at all.



Just an idea about type incompatibilities:
When a TArcObject cannot be assigned to a TObject variable, because a 
conversion (as between ShortString and AnsiString) is impossible, then a 
delegate could be created that turns the TArcObject into something 
compatible with TObject. I have no idea how this could be 
accomplished[1], but as long as it only affects refcounted objects, the 
overhead has to be accepted when using ARC objects at all. Some overhead 
is inevitable for ARC, and everybody should be free to decide whether 
such overhead is acceptable for his projects or targets. Not using ARC 
at all should always be an option.


[1] Perhaps the same (possibly simplified) mechanism could be used for 
TArcObject/TObject conversion as for Interface/TObject conversion.


DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-30 Thread Hans-Peter Diettrich


Sven Barth wrote:

On 28.10.2014 10:19, Michael Schnell wrote:

On 10/27/2014 07:59 PM, Sven Barth wrote:


- in code that does not use ARC (modeswitch arc off - the default;
or maybe better a local directive) all instance variables are
considered weak


While I do have a vision what weak means here, can you give an exact
description ?


- no change in reference count when assigning a refcounted object 
variable to it


- no change in reference count when assigning it to a refcounted object 
variable

I suspect that this can cause premature destruction of the object, when 
either

- another value is assigned to the refcounted object variable, or
- all other (counted) references to the object disappear.

But I don't have a solution for these problems, as long as the compiler 
inlines the refcounting code at compile time.
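
The same effect already exists today when interface references (counted) and plain object references (not counted) to one object are mixed. The following sketch uses the existing TInterfacedObject to illustrate the problem; it does not use the proposed ARC implementation, and TThing is a name made up for the example.

```pascal
program weakdemo;
{$mode objfpc}{$H+}
type
  TThing = class(TInterfacedObject)
  public
    destructor Destroy; override;
  end;

destructor TThing.Destroy;
begin
  WriteLn('TThing destroyed');
  inherited Destroy;
end;

var
  weak: TThing;        // plain object reference: not counted
  strong: IInterface;  // interface reference: counted
begin
  weak := TThing.Create;  // reference count stays 0
  strong := weak;         // reference count becomes 1
  strong := nil;          // count drops to 0: the object is destroyed here
  // From here on "weak" is a dangling pointer and must not be used.
  weak := nil;
end.
```

The program prints "TThing destroyed" at the point where the last counted reference is cleared, although the uncounted reference still exists.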


DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-30 Thread Hans-Peter Diettrich


Sven Barth wrote:

On 28.10.2014 09:57, Hans-Peter Diettrich wrote:



  Something like ShortString and AnsiString?



Take unit Typinfo for example where quite some methods take a TObject
instance.


The TypInfo methods can determine the exact type of their arguments, and
act inside accordingly.


If you have a method X that takes a TObject parameter how do you plan to 
pass it a reference counted object when these and TObject are not 
compatible *at compiletime*?


That's intentionally impossible in general. For TypInfo, a dedicated 
method (override) can be added, or an untyped parameter can be used like 
in FreeAndNil.


DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-30 Thread Hans-Peter Diettrich


Sven Barth wrote:

On 30.10.2014 04:14, Hans-Peter Diettrich wrote:



I wonder how difficult it would be to implement the existing Interface
refcounting model for TObject, so that this runtime variation could be
tested and benchmarked as well, in addition to the current compiletime
approach. According to the problems of the compiletime approach,
revealed in this thread, it looks not viable to me at all.


The code would mostly be the same as the one I already implemented. Add 
virtual to the ARCDecRef, ARCIncRef and ARCRefCount methods of 
TObject, adjust the RTL helper functions to not expect a refcount field 
at a specific offset, remove the restrictions that reference counting is 
only done for classes marked as refcounted and you're done...


Looks quite easy :-)

Could you introduce this feature into your branch, by conditional 
compilation?



Mark just pointed me to another problem, possibly unhandled yet. 
Interface refcounting must be updated, as soon as the underlying object 
becomes refcounted as well. Do you already have an idea how to handle 
refcounting for classes with interfaces?


DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-28 Thread Hans-Peter Diettrich


Sven Barth wrote:

On 27.10.2014 21:00, Hans-Peter Diettrich wrote:

Sven Barth wrote:
On 27.10.2014 17:20, Hans-Peter Diettrich <drdiettri...@aol.com> wrote:



  Something like ShortString and AnsiString?

With the difference that Short- and AnsiString are assignable to 
each other while Jonas does not want that for reference counted and 
ordinary classes.


Where would this matter? When TObject and TManagedObject are different 
(base) types, a direct assignment of references is impossible.
Take unit Typinfo for example where quite some methods take a TObject 
instance.


The TypInfo methods can determine the exact type of their arguments, and 
act inside accordingly.


Or all those classes (TStrings, TObjectList, TComponent, etc.) 
that somewhere take a TObject as parameter.


IMO containers play a different role in managed and unmanaged 
environments. E.g. a TObjectList.OwnsObjects property is useless with 
managed objects, and the circular owner/child and parent/child 
references in several persistent classes deserve special attention and 
handling, when used with managed objects. Similar considerations apply 
to strings: should TStrings contain AnsiStrings or UnicodeStrings, 
given that despite their assignment compatibility the implicit 
conversions between both can consume much runtime?


For such reasons I'd prefer separate environments (RTL...) for 
managed-only and unmanaged-only objects, just like for AnsiString and 
UnicodeString. But in combination such options would end up in many 
different library versions, so that I do not really suggest such an 
implementation. My dream is distinct FPC/Lazarus versions, designed for 
compatibility with D7, D2009, Unicode, Mobile and whatever versions may 
show up in the future. Then it should be possible to freeze the old 
versions with all bugs fixed, and new features will be added only to 
newer versions; this would eliminate all the aforementioned problems, 
which result from mixing features of different Delphi versions.


IMO Delphi versions don't offer backwards compatibility for good 
reasons; instead a purchased license *also* allows the use of all older 
versions, down to D7. What I'm missing here are bugfixes, because the 
development of older versions is almost stopped as soon as a new version 
is distributed. Known bugs are mostly fixed only in newer versions, 
which introduce new bugs and features at the same time - good for sales 
but bad for the customers. Since FPC/Lazarus are open source, user 
groups may offer continued support for their preferred version(s), by 
backporting bugfixes into these versions.




What do you mean with virtual counting methods?


Overriding these methods can enable/disable refcounting for a class, 
and all classes derived from it. The default then can be to do nothing 
(no counting).
But then it's the same reference counting as the COM one, because 
without the COM reference counting those virtual methods would not be 
called.


So you prefer to inline these methods?
Now I understand why you want refcounting fully handled at compile time...

DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-28 Thread Hans-Peter Diettrich


Boian Mitov wrote:
In general the C/C++ notion of doing as little in the language as 
possible, and as much in library has worked very well for it over the 
years.
Yes, pluggable languages concept has existed at least since C ;-) . I 
agree, and as I said has worked well.


AFAIR such languages lack compatibility with themselves, as soon as 
projects start using their private extensions. Then no project can 
borrow parts (libraries...) from other projects, in the worst case.


DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-28 Thread Hans-Peter Diettrich


Sven Barth wrote:
On 27.10.2014 23:41, Boian Mitov <mi...@mitov.com> wrote:

 
  Well... we may differ on this one. I absolutely love attributes, but 
I guess that is just me :-D .
  I think attributes are the greatest thing that has happened to Delphi 
ever, I just wish they were not so limited. Attributes allowed us to cut 
3/4 of our code base. You can't beat that easily.


Let me clarify: I have nothing against the concept of attributes, I just 
dislike the syntax and the introduction of attributes that influence 
compiler behavior.


+1

DoDi


Re: [fpc-devel] Small virtual machine to cross compile FPC

2014-10-28 Thread Hans-Peter Diettrich


Paul Breneman wrote:
I've spent a bit of time during the past 7 years trying to figure out 
how to simplify things by avoiding cross-compiling.  This page has many 
of the details:

  http://turbocontrol.com/monitor.htm

I think there is a way to simplify cross-compiling.  Levinux is a small 
(~20 MB) QEMU download for x86 PCs (Windows, OS X, Linux) that provides 
a small Tiny Core Linux VM.  I'd like to see something similar but with 
all the files and tools needed to pull the latest source code and 
cross-compile FPC (also with Debian instead of Tiny Core?).

  http://mikelev.in/ux/

It seems to me that such a small VM should allow a nice standard method 
that will make it easy to test and see things work.


I look forward to your thoughts and comments!


I wonder why you need or use cross-compilation at all?

The biggest part of a cross compiler is the target-specific libraries 
and tools, which are needed to create executables for a specific 
target. IMO it will be easier to create a dedicated VM for every target, 
and install FPC there, instead of adding cross-compilation features for 
many targets to whatever machine. Mobile devices often require their own 
emulator, or a physical device, for program development; a single VM is 
of little use for that. IMO.


DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-27 Thread Hans-Peter Diettrich


Jonas Maebe wrote:

Additionally, as mentioned before, I still believe it's a very bad idea 
to be able to inherit from a regular class and turn it into a 
reference counted class. Reference counted and non-reference-counted
classes are different language entities with different behaviour and 
different code generation requirements, and hence should be completely 
unrelated.


Something like ShortString and AnsiString?

I agree that a *compiler-based* implementation, of a single TObject base 
class, would require two sets of libraries, starting with the RTL, else 
a mix of units with different object types cannot be avoided in an 
executable. It would also practically rule out the use of DLLs built 
for a possibly different model.


Even if you completely forbid typecasting a reference counted class into 
a non-reference-counted parent class, simply calling an inherited method 
from a non-reference-counted parent class can easily completely mess up 
the reference counting (e.g. suppose that inherited method inserts 
self into a linked list).


ACK. The only way out I can see is adding the *possibility* of 
refcounting to TObject, meaning Add/ReleaseRef methods and a RefCount 
field. Then the compiler can safely generate refcounting code for *all* 
objects and non-weak references, and the counting methods take care of 
the required operations. Delphi offers two means for specialized 
refcounting, the virtual counting methods, and the (COM compatible?) 
refcount value of -1 for unmanaged objects.
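
How virtual counting methods could switch refcounting on for one branch of the class tree can be sketched as follows. All names (TBase, TCounted, AddRef, ReleaseRef) are hypothetical, the calls that a compiler would generate are written by hand here, and the counting is not thread safe; this is not the implementation discussed in this thread.

```pascal
program vrefcount;
{$mode objfpc}{$H+}
type
  // Stands in for TObject: the counting methods exist, but do nothing.
  TBase = class
  public
    procedure AddRef; virtual;
    procedure ReleaseRef; virtual;
  end;

  // Overriding the methods enables counting for this class and all
  // classes derived from it.
  TCounted = class(TBase)
  private
    FRefCount: LongInt;
  public
    destructor Destroy; override;
    procedure AddRef; override;
    procedure ReleaseRef; override;
  end;

procedure TBase.AddRef;
begin
  // default: no counting
end;

procedure TBase.ReleaseRef;
begin
  // default: no counting
end;

destructor TCounted.Destroy;
begin
  WriteLn('TCounted destroyed');
  inherited Destroy;
end;

procedure TCounted.AddRef;
begin
  Inc(FRefCount);
end;

procedure TCounted.ReleaseRef;
begin
  Dec(FRefCount);
  if FRefCount = 0 then
    Destroy;  // no field access after this point
end;

var
  o: TBase;
begin
  o := TCounted.Create;
  o.AddRef;       // calls written by hand instead of by the compiler
  o.AddRef;
  o.ReleaseRef;
  o.ReleaseRef;   // last reference gone: the object destroys itself

  o := TBase.Create;
  o.AddRef;       // no-ops for the uncounted class
  o.ReleaseRef;
  o.Free;         // still has to be freed manually
end.
```

The price is one virtual call per reference operation, also for uncounted classes, which is what a benchmark would have to measure.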


DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-27 Thread Hans-Peter Diettrich


Sven Barth wrote:
On 27.10.2014 17:20, Hans-Peter Diettrich <drdiettri...@aol.com> wrote:



  Something like ShortString and AnsiString?

With the difference that Short- and AnsiString are assignable to 
each other while Jonas does not want that for reference counted and 
ordinary classes.


Where would this matter? When TObject and TManagedObject are different 
(base) types, a direct assignment of references is impossible.




What do you mean with virtual counting methods?


Overriding these methods can enable/disable refcounting for a class, and 
all classes derived from it. The default then can be to do nothing (no 
counting).


The main reason I decided not to introduce reference counting for every 
class was that some people feared the performance impact of the 
reference counting. Though Florian said that it shouldn't be that bad on 
today's CPUs...


Did you ever benchmark your model?

That said: if someone wants to test it one could add refcounted to 
TObject (my code should(!) handle that correctly) and see what 
happens... (of course there will be problems with circular references then)


Fine :-)

DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-27 Thread Hans-Peter Diettrich


Kostas Michalopoulos wrote:
On Mon, Oct 27, 2014 at 5:17 PM, Hans-Peter Diettrich 
<drdiettri...@aol.com> wrote:


Then the compiler can safely generate refcounting code for *all*
objects and non-weak references, and the counting methods take care
of required operations


Wouldn't that cause all objects to pay (both in terms of performance 
and memory) for something that they don't use?


Right, that's why I suggest to keep both models separate. But the real 
runtime impact has to be benchmarked - it may be as low as with other 
managed types (AnsiString...), where nobody has complaints. Memory usage 
(4 bytes per object) should not matter, Delphi accepts it just for 
mobile devices!


IMO it is better to fully 
disallow subclasses from introducing reference counting than force 
functionality on objects that they don't need to use.


It *looks* better, but has several issues.

DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-27 Thread Hans-Peter Diettrich


Sven Barth wrote:

A semicolon has the problem that you need to distinguish between it 
being a modifier and a normal following identifier as not every keyword 
is a keyword in every context (like for example read and write for 
properties).


In this discussion I almost miss the elementary distinction between 
keywords (reserved words) and directives. Unlike keywords, directives 
are context sensitive and can be used as identifiers in all other 
places. That's why directives should *follow* identifiers, never precede 
them.


The semicolon usage is not well designed in Delphi, additional 
(intermediate) semicolons are not required and should be banned. Then 
the parser can continue to search for directives until the end of an 
applicable construct (declaration...) is found, which may be a semicolon 
or something else (comma, parenthesis...), depending on the construct 
syntax.


DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-26 Thread Hans-Peter Diettrich


Sven Barth wrote:
On 25.10.2014 03:17, Hans-Peter Diettrich <drdiettri...@aol.com> wrote:
  - a class instance is destroyed once the reference count reaches 
zero (and Free does not work for them)

 
 
  Shouldn't Free be usable as a finalizer, clearing all references to 
other objects within this instance?


One could do that (for now I've chosen the simple way). One would 
however need to check how this would be implemented best (e.g. it should 
be marked somehow so that the destructor later on does not try to work 
with already finalized fields; also all fields (Strings, arrays, 
interfaces, records, etc.) should be finalized so that it is 
consistent).


A finalizer must clear all managed fields, otherwise memory management 
would be corrupted. Doing so may destroy other managed objects, so the 
possible consequences must be considered. I wonder whether the order 
in which the fields are cleared may cause trouble?


Also it needs to be observed how other reference holders 
might react to that zombie instance.


Right, this should be considered by the developer.

A further problem might be legacy 
code which gets passed a reference counted instance (on which ARCIncRef 
was called to keep it alive) and which then calls Free. Might not be the 
intended result by neither code... This might be the reason of 
Embarcadero to implement Free as a no-op and add a new DisposeOf which 
does what you suggested.


Then Delphi compatibility has to be maintained. Is DisposeOf fully 
automatic, or can it be overridden or otherwise influenced (field 
sequence...)?


DoDi


Re: [fpc-devel] Proof of Concept ARC implementation

2014-10-24 Thread Hans-Peter Diettrich


Sven Barth wrote:

Hello together!

I've now finished my Proof of Concept ARC implementation which is based 
on the RFC I published a few weeks back: 
http://lists.freepascal.org/fpc-devel/2014-September/034263.html


Fine :-)


To recap:

[...]
- a class instance is destroyed once the reference count reaches zero 
(and Free does not work for them)


Shouldn't Free be usable as a finalizer, clearing all references to 
other objects within this instance?


DoDi


Re: [fpc-devel] suggestion: virtual method co-variance

2014-10-14 Thread Hans-Peter Diettrich


Sven Barth wrote:

At least at first sight there don't seem to be any real (technical) 
reasons to not allow covariance for return values. Parameters would be a 
different topic though...


Just so I get the idea right:

=== code begin ===

type
  TBar = class
    function Test: TObject; virtual;
  end;

  TFooBar = class(TBar)
    function Test: TStrings; override;
  end;

//...


I just wonder about the purpose and use of such a refinement. Should the 
compiler rely on a more specific return type, based on the *static* 
type of an object reference? I'd use different names for specialized 
methods/properties instead.


OTOH it would be nice to have specialized lists without much coding, 
i.e. without writing getters (and setters) which only do typecasts of 
their results. Something like


type
  TBar = class
    property Test: TObject ...; virtual;
  end;

  TFooBar = class(TBar)
    property Test: TStrings; override;
  end;

Please note that this kind of override should not require overriding 
the getters/setters; it would only enforce (static) type checks/casts, 
as far as they can be done at compile time.
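
For comparison, the boilerplate this would save can be sketched as 
follows. This is only an illustration under assumptions: FTest, GetTest 
and SetTest are made-up names, and the descendant merely redeclares the 
property with casting accessors, which hides the inherited property 
rather than overriding it.

```pascal
program CovariantGetterSketch;
{$mode objfpc}{$H+}

uses
  Classes;

type
  TBar = class
  protected
    FTest: TObject;                 // hypothetical backing field
  public
    property Test: TObject read FTest write FTest;
  end;

  TFooBar = class(TBar)
  private
    function GetTest: TStrings;     // accessor that only casts
    procedure SetTest(AValue: TStrings);
  public
    // Redeclared with a narrower type; this hides TBar.Test.
    property Test: TStrings read GetTest write SetTest;
  end;

function TFooBar.GetTest: TStrings;
begin
  Result := FTest as TStrings;      // checked cast, nil stays nil
end;

procedure TFooBar.SetTest(AValue: TStrings);
begin
  FTest := AValue;
end;

var
  Foo: TFooBar;
begin
  Foo := TFooBar.Create;
  try
    Foo.Test := TStringList.Create;
    Foo.Test.Add('no cast needed at the call site');
    WriteLn(Foo.Test.Count);
    // Through the static type TBar the same field is a plain TObject.
    WriteLn(TBar(Foo).Test.ClassName);
    Foo.Test.Free;
  finally
    Foo.Free;
  end;
end.
```

Every specialized class needs such a getter/setter pair today; the 
suggested property override would let the compiler generate the cast.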


But that smells like Generics, which already have their place in OPL...


Parameters would require different handling, because a single type 
mismatch would defer to the base class implementation, with possibly 
strange effects when this results in a bypass of the modified code in 
the overridden methods. That's another argument for my above suggestion, 
which should not only eliminate the need for overriding related methods, 
but should *disallow* explicit getter/setter overrides - so we would be 
back at generics again?


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-27 Thread Hans-Peter Diettrich


Mark Morgan Lloyd wrote:

Boian Mitov wrote:



I think parallel processing belongs in library implementations.


I have reservations, based in part on the fact that other language 
implementations are prepared to assume responsibility for 
parallelisation, in part on experience with e.g. APL which at the very 
least specifies that the user should assume that operations are 
parallelised, and in part on the fact that FPC already vectorises on 
e.g. SSE2 hardware.


What do you (both) mean by parallel processing?

Streaming (SSE...) does *vectorization*, i.e. multiple (floating point) 
operands of the same *array* are processed in parallel. Such cases can 
be handled by the compiler; no libraries are involved, no threads, no 
risk of side-effects.


When instead a general loop, possibly containing multiple statements, is 
broken into multiple loops, which are processed in parallel, 
side-effects can occur depending on the operations (virtual methods...) 
in the loop. Then it IMO is up to the developer to check which loops can 
be parallelized without side-effects, and to indicate that to the 
compiler. In this case the compiler could turn the body of the loop into 
a TThread, and insert an RTL call to execute this thread split into 
multiple instances. The RTL then creates the threads at runtime 
(depending on what? cores, already active threads...?), assigns to each 
a subrange of the entire loop interval, starts them and waits until all 
of them have terminated. This would limit the types of loops to FOR 
loops with a known interval, excluding REPEAT, WHILE and FOREACH loops. 
Right?
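
Done by hand, the transformation described above could look roughly 
like this. It is a sketch only: TRangeWorker, WorkerCount and the Data 
array are invented for the illustration, the worker count is fixed 
instead of being chosen by the RTL, and the loop body is deliberately 
free of side-effects (each index is written by exactly one thread).

```pascal
program ParallelForSketch;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  Classes;

const
  WorkerCount = 4;                  // assumed; an RTL might use the core count

var
  Data: array[0..999] of Integer;

type
  // One worker executes the loop body for the subrange FFirst..FLast.
  TRangeWorker = class(TThread)
  private
    FFirst, FLast: Integer;
  protected
    procedure Execute; override;
  public
    constructor Create(AFirst, ALast: Integer);
  end;

constructor TRangeWorker.Create(AFirst, ALast: Integer);
begin
  inherited Create(True);           // created suspended, started by the caller
  FFirst := AFirst;
  FLast := ALast;
end;

procedure TRangeWorker.Execute;
var
  I: Integer;
begin
  for I := FFirst to FLast do
    Data[I] := I * I;               // the original loop body
end;

var
  Workers: array[0..WorkerCount - 1] of TRangeWorker;
  W, Chunk, First, Last, I: Integer;
begin
  Chunk := Length(Data) div WorkerCount;
  for W := 0 to WorkerCount - 1 do
  begin
    First := W * Chunk;
    if W = WorkerCount - 1 then
      Last := High(Data)            // last worker takes the remainder
    else
      Last := First + Chunk - 1;
    Workers[W] := TRangeWorker.Create(First, Last);
    Workers[W].Start;
  end;

  // Wait until all subranges have been processed.
  for W := 0 to WorkerCount - 1 do
  begin
    Workers[W].WaitFor;
    Workers[W].Free;
  end;

  for I := Low(Data) to High(Data) do
    if Data[I] <> I * I then
      Halt(1);
  WriteLn('all ', Length(Data), ' elements processed');
end.
```

The split only works because the interval 0..999 is known before the 
loop starts, which is exactly why REPEAT and WHILE loops do not fit.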


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-27 Thread Hans-Peter Diettrich


Florian Klämpfl wrote:

If the time spent in this thread had been spent on coding, FPC would 
already have ARC. The list has approx. 600 members, 200 messages were 
written. If each of the 600 members spent on average 1 min reading each 
message, this is 2000 man-hours, i.e. approx. 1 MY :)


First, understanding the compiler code will take more time, before one 
knows where to start coding.


Second, I already supplied two proposals, which could be implemented in 
a few lines:

1) Use virtual _AddRef and _Release, override for ARC classes
2) Ditto, but non-virtual; add _RefCount to TObject, init to -1 for no ARC

In either case the compiler inserts calls to _AddRef/_Release wherever 
it already does for interfaces.


Last but not least, somebody must be entitled to implement ARC, against 
all objections of the users ;-)


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-27 Thread Hans-Peter Diettrich


Sven Barth wrote:

There are however some nasty problems inside constructors and 
destructors, because Self is reference counted as well (and should be 
after all as we don't want the instance to be destroyed behind our backs 
suddenly).


IMO before the end of a constructor, and before the start of a 
destructor, no references to the object exist at all, so that code 
outside these methods has no reference that could cause trouble.


It looks to me like inside methods Self doesn't deserve refcounting, 
because a method can be invoked only with an existing instance, which 
will stay alive at least until the call returns.


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-27 Thread Hans-Peter Diettrich


Marco van de Voort wrote:

In our previous episode, Sven Barth said:

It looks to me like inside methods Self doesn't deserve refcounting,
because a method can be invoked only with an existing instance, which
will stay alive at least until the call returns.
That's the thing I'm not yet entirely sure about. Though disabling it 
for Self would definitely simplify things. I'll simply give it a try and 
then my constructor problems should hopefully be solved as well...


Methods might return SELF as function result?


Yes, and this might be a problem during and immediately following 
construction. When the constructor passes Self to another procedure, 
that procedure must not do anything that increases and later decreases 
the refcount. Increasing the refcount during construction might help, 
but then the refcount cannot be decremented at the end of the 
constructor. And what should happen to the result of Create?

  MyObj := TMyObj.Create;
is fine when the refcount is increased to 1 when the instance is 
assigned to MyObj. In contrast

  TMyObj.Create;
looks somewhat useless, and how to destroy this zombie later, without 
refcounting? This construct might be legal with TThread.Create? Can 
somebody test what Delphi does in this case?


Suggestion:
The constructor inits the refcount to 1, and decrements it on exit - 
*not* using _Release! This will prevent destruction by refcounting 
during construction.
The compiler uses (always, or only with the latter syntax?) a hidden 
local variable, where a new reference is stored and refcounted. This 
will destroy a zombie on exit from the subroutine (maybe unit 
initialization!).



Another one - weak references:
When the compiler only handles immediate assignments to weak references, 
such a variable cannot be passed as VAR, because the called code cannot 
know about that special handling.


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-22 Thread Hans-Peter Diettrich


Sven Barth wrote:

On 21.09.2014 21:09, Hans-Peter Diettrich wrote:

Sven Barth wrote:

[...]

I'd add a _RefCount field to TObject, regardless of whether it's really
used later; this will fix the field offset - just like the VMT reference
is fixed in TObject, but not in Object types. This will eliminate
problems with class helpers.


I've especially written that it's not part of *every* object, because 
people complained about the size increase for all instances.


That's why I also suggested a compiler option, useful for all people 
who care about program size or ARC at all.


In general I agree with your thoughts, I only wanted to add a few remarks.

[...]

Here the compiler would always insert _AddRef, just like with
interfaces, possibly optimized (inlined?) like:
   if Result._RefCounter <> -1 then
     Result._AddRef; //or InterlockedIncrement(Result._RefCounter);


And that's another thing: people complained about having that reference 
count overhead for *all* assignments.


See above :-)

[...]

It. Was. Just. An. Example. To. Illustrate. The. Problem!
I would implement that differently as well, but there's *no* point to do 
that inside an example that's supposed to be as simple as possible!


Is it me, or what else makes you so angry today?
Sorry for that :-(

Regards
DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-22 Thread Hans-Peter Diettrich


Sven Barth wrote:
On 22.09.2014 09:47, Michael Schnell mschn...@lumino.de wrote:


  Why not use an interface to add ref-counting to an object? This seems 
to work nicely, even though the name "interface" is not descriptive in 
that regard.


Because you'll need to declare an interface for each class you want to 
have reference counted so that you can access its methods, properties, etc.


This overhead could be eliminated by another syntax extension, like
  TMyARCclass = interface(TObject)
where the Compiler could allow for implementations of the declared 
methods just as for

  TMyARCclass = class(TObject)
bridging the gap between traditional (strictly declarative) interfaces 
and classes (including implementations), with or without ARC.


DoDi


Re: [fpc-devel] weak referencing (was Suggestion:.....)

2014-09-22 Thread Hans-Peter Diettrich


Marco van de Voort wrote:

(to Sven)

So the cycle break mechanism is going to be marking potential cycle cases
as weak.

Do you still plan to at least detect cycles for debugging purposes?

Or is the cycle detection itself already too hard?

IOW I'm wondering what will happen (and what to do) if there is a cycle in a
sufficiently complex program.


I could imagine a tool for that purpose, instead of burdening the 
compiler with such rarely used functionality. More diagnostics could be 
removed from the compiler, like the detection of unused local variables 
or units - if that helps to speed up compilation. Separate diagnostic 
tools could immediately offer means to solve the detected problems 
interactively, which is not the purpose of a compiler.


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-22 Thread Hans-Peter Diettrich


Sven Barth wrote:
On 22.09.2014 12:59, Hans-Peter Diettrich drdiettri...@aol.com wrote:


  That's why I also suggested a compiler option, useful for all people 
who care about program size or ARC at all.


The problem with the compiler option is that you'd need to rebuild the 
complete RTL, packages and your application so that every unit has these 
changes. Otherwise e.g. the RTL would still contain reference counting 
code and TObject would still contain a reference count field.


Right, it should be a compiler *build* option, so that everybody can 
create his favored flavor(s) of the compiler, and decide which one to 
use with which project.


Every compiler then should provide a predefined constant for every such 
option, in case specific handling is required by conditional compilation 
of user code.




  Is it me, or what else makes you so angry today?
  Sorry for that :-(

I'm sorry that I got aggressive, but it's quite frustrating when one 
writes a simple example to illustrate something and the first complaint 
is how to make it use a better design, which is completely beside the 
point of the example -.-


Then I missed the point of the example(s). It looked to me like 
something final, well thought out like the rest of your message.


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-22 Thread Hans-Peter Diettrich


Boian Mitov wrote:
In general, records and classes are inherently the same thing (and in 
C++ are indeed practically interchangeable).


This model might have been the reason for introducing Object at all, for 
compatibility with CBuilder.


The only real difference in 
Delphi/FPC is that records are instantiated on the stack, the objects on 
the heap,


Like records, Objects can reside on the stack, on the heap or even in 
static memory.



and the artificial restriction on record inheritance.


Why inherit when you can't override virtual methods? I convert my 
Records into Objects when I want to extend a record.
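
A small sketch of that conversion, with invented type names: an 
old-style Object with one virtual method, instantiated in static memory 
and on the stack without any heap allocation. The only assumption beyond 
the text is the usual rule that a constructor must be called once to set 
up the VMT of such an instance.

```pascal
program ObjectSketch;
{$mode objfpc}

type
  PBase = ^TBase;
  TBase = object
    X: Integer;
    constructor Init(AX: Integer);    // the constructor call sets up the VMT
    function Value: Integer; virtual;
  end;

  TDoubled = object(TBase)            // "record inheritance" via Object
    function Value: Integer; virtual; // overrides TBase.Value
  end;

constructor TBase.Init(AX: Integer);
begin
  X := AX;
end;

function TBase.Value: Integer;
begin
  Result := X;
end;

function TDoubled.Value: Integer;
begin
  Result := 2 * X;
end;

var
  GlobalBase: TBase;                  // static memory

procedure Demo;
var
  LocalDoubled: TDoubled;             // lives on the stack
  P: PBase;
begin
  LocalDoubled.Init(21);
  P := @LocalDoubled;
  WriteLn(P^.Value);                  // dispatched to TDoubled.Value
end;

begin
  GlobalBase.Init(3);
  WriteLn(GlobalBase.Value);
  Demo;
end.
```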


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-21 Thread Hans-Peter Diettrich




Marco van de Voort wrote:

In our previous episode, Hans-Peter Diettrich said:
IMO Weak references should be reserved for users who accept possible 
consequential problems, but should never be used in standard libraries. 
At least I'd suggest to make weak references subject to a compiler 
switch, so that every user has a chance to disable them in case of trouble.


IMHO weak references trade one manual memory system in for a different
manual memory system.


Weak references steer/guide *automatic* memory management.


The hard part of manual memory systems, figuring out how a complex dynamic
structure deallocates (that is usually tackled by having a bit of design and
thought go into it), remains.


And this is what the user of such a structure (standard libraries...) 
does not always know. He may be unable to determine the reason for some 
runtime error in his code, when an object was destroyed automagically 
where it should still be alive. The user can debug (and fix) ordinary 
(owner/owned) patterns, implemented in high level code, but not 
table-driven (RTTI) or otherwise hidden (intrinsic) management 
procedures. While the user can change the owner of an object at runtime, 
he cannot change a weak reference into a strong one without 
recompilation of at least the unit containing that declaration, and 
without figuring out the consequences of such a change.


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-21 Thread Hans-Peter Diettrich


Sven Barth wrote:

On 20.09.2014 13:42, Sven Barth wrote:

On 20.09.2014 13:11, Peter Popov wrote:


- to remedy this TObject is extended with non-virtual methods that allow 
manual reference counting and would rely on the RTTI data I mentioned 
(let's call the methods AddRef, Release, IsReferenceCounted and RefCount 
for now, which can also be used to hook up the reference counting of 
IUnknown interfaces);



I'd add a _RefCount field to TObject, regardless of whether it's really 
used later; this will fix the field offset - just like the VMT reference 
is fixed in TObject, but not in Object types. This will eliminate 
problems with class helpers.


This approach would also allow switching any object from managed to 
unmanaged on the fly, by setting the counter to -1, because the special 
value -1 already indicates an unmanaged/const memory object (like with 
string literals).


In my first draft I considered virtual _AddRef/_Release methods, but 
calling a virtual method is more expensive than calling or inlining a 
static method.
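
To make the -1 convention concrete, here is a minimal sketch in user 
code. TCounted, its AManaged constructor parameter and the Boolean 
result of Release are invented for the illustration; a real 
implementation would put the field into TObject and let the compiler 
insert the calls.

```pascal
program RefCountSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical stand-in for a TObject carrying a _RefCount field.
  TCounted = class
  private
    FRefCount: LongInt;               // -1 means "not reference counted"
  public
    constructor Create(AManaged: Boolean);
    procedure AddRef;                 // static (non-virtual) method
    function Release: Boolean;        // True if the instance was destroyed
    property RefCount: LongInt read FRefCount;
  end;

constructor TCounted.Create(AManaged: Boolean);
begin
  inherited Create;
  if AManaged then
    FRefCount := 0
  else
    FRefCount := -1;
end;

procedure TCounted.AddRef;
begin
  if FRefCount <> -1 then
    InterlockedIncrement(FRefCount);
end;

function TCounted.Release: Boolean;
begin
  Result := False;
  if FRefCount = -1 then
    Exit;                             // unmanaged: leave lifetime to the owner
  if InterlockedDecrement(FRefCount) = 0 then
  begin
    Destroy;                          // no field access after this point
    Result := True;
  end;
end;

var
  Managed, Unmanaged: TCounted;
begin
  Managed := TCounted.Create(True);
  Managed.AddRef;
  Managed.AddRef;
  Managed.Release;                    // 2 -> 1
  if Managed.Release then             // 1 -> 0, instance destroyed
    WriteLn('managed instance destroyed by Release');

  Unmanaged := TCounted.Create(False);
  Unmanaged.AddRef;                   // no-op
  Unmanaged.Release;                  // no-op
  WriteLn('unmanaged count: ', Unmanaged.RefCount);
  Unmanaged.Free;                     // manual destruction as before
end.
```

The test against -1 is a plain field comparison, so it can be inlined 
at every assignment without a VMT lookup.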



 the code from above would then look like this to

make it safe:

=== code begin ===

function CreateObject: TObject;
begin
  Result := TARCObject.Create;
  Result.AddRef;
end;

=== code end ===


Here the compiler would always insert _AddRef, just like with 
interfaces, possibly optimized (inlined?) like:

  if Result._RefCounter <> -1 then
    Result._AddRef; //or InterlockedIncrement(Result._RefCounter);


- TObject.Free would be extended to take reference counting into account 
as well. If the object is reference counted (IsReferenceCounted returns 
true) it will call Release and otherwise it will continue to Destroy.
- there would be a TARCObject declared in System which is a direct 
descendant of TObject, but with reference counting enabled; same maybe 
also for TInterfacedObject


The convention of -1 meaning unmanaged favors managed objects by 
default, when InitInstance zeroes all fields of the instance just 
created. But when the VMT reference must be excluded or inserted 
afterwards, then _RefCount can be initialized at the same 
time (to -1 for the unmanaged default). Later on a TARCObject base class 
constructor/initializer will reset _RefCount to zero again.


- all classes can now have operator overloads as well though it should 
be warned in the documentation that non-reference counted objects might 
result in memory leaks there


...unless operators also test _RefCount


- this now only leaves the problems of cycles; take this code:

=== code begin ===

type
  TSomeClass = class(TARCObject)
    Children: specialize TList<TSomeClass>;
    Owner: TSomeClass;
    constructor Create(aOwner: TSomeClass);
  end;

constructor TSomeClass.Create(aOwner: TSomeClass);
begin
  Children := specialize TList<TSomeClass>.Create;
  Owner := aOwner;
  if Assigned(Owner) then
    Owner.Children.Add(Self);
end;


Here I'd prefer
  Owner.AddChild(Self);
so that the Owner can implement any decent/appropriate child management 
under the hood.



procedure Test;
var
  t1, t2: TSomeClass;
begin
  t1 := TSomeClass.Create(Nil);
  t2 := TSomeClass.Create(t1);
  // do something
end;

=== code end ===

Now once Test is left it would leave the instances which were assigned 
to t1 and t2 hanging, because they have references to each other.


This depends on the implementation of TOwner.Children[] and 
TChild.Owner. Is a stored TChild.Owner reference really required in a 
managed environment? IMO a (strong) unidirectional reference from 
Owner to Child will do it all. Then no child will be destroyed, as long 
as its owner holds a reference to it. That's the intended purpose of 
both owner/child and automatic memory management.


When it's desirable to definitely destroy an owned object at will, then 
its owner must be known, of course. In this case two different 
management approaches conflict with each other. In this case I'd accept 
a weak Owner reference, because the referenced Owner will stay alive 
longer than its listed children.


More problematic are circular references without a dedicated owner/child 
relationship.




There are (as far as I see) three ways to solve this:
* provide a way to break the circle (in this example e.g. setting Owner 
to Nil before leaving Test; this is what Delphi provides with the 
DisposeOf virtual method)

* introduce weak references which would disable reference counting, e.g.:

=== code begin ===

type
  TSomeClass = class(TARCObject)
// ...
Owner: TSomeClass weak;
// ...
  end;

=== code end ===

Now the TSomeClass.Create(t1) line in Test wouldn't increase the 
reference count of t1 further and thus both class instances would be 
destroyed after Test is left.


This IMO is the preferable way to go, in a definite owner/child 
relationship. The lifetime of an owner cannot depend on the existence 
of owned children, so that the owner will survive until it has 
destroyed/released all its children itself. A child-to-owner

Re: [fpc-devel] Suggestion: reference counted objects

2014-09-21 Thread Hans-Peter Diettrich


Florian Klämpfl wrote:

On 21.09.2014 07:22, Hans-Peter Diettrich wrote:

Boian Mitov wrote:

That is easy. It gets incremented when it gets assigned. The running 
threads have no way of accessing it if there is no reference 
(assignment) already in place.

The problem arises when an object is destroyed, or even elected for 
destruction in _Release, while another thread starts using the same 
instance.


This is not possible for a correctly written program: if two threads 
have references to the same instance, the ref. count is 2. So no 
destruction is possible.


What happens if such a reference (to A) is part of another object (B), 
known to the thread? I suspect a chance for a race condition here, when 
one thread clears B.A while another thread tries to acquire a reference 
from B.A.


But this may be an excursion into threadsafe coding, where any 
modification to a shared resource requires a lock in a multi-threaded 
environment. Then the above situation should never occur...


Is it really sufficient to protect refcounter changes by Interlocked 
Inc/Dec, to prevent race conditions while obtaining object references?
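
As far as I can tell the answer is no: the interlocked counter protects 
the count itself, but not the variable B.A from which a reference is 
copied. The usual remedy is to guard that shared variable with a lock, 
sketched below with interfaces, which are already reference counted. 
THolder, GetRef and Clear are invented names, and the demo is 
single-threaded on purpose so that its result is deterministic.

```pascal
program SharedRefSketch;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  SyncObjs;

type
  // Plays the role of object B, which holds the shared reference B.A.
  THolder = class
  private
    FLock: TCriticalSection;
    FA: IInterface;
  public
    constructor Create(const AValue: IInterface);
    destructor Destroy; override;
    function GetRef: IInterface;      // copy the reference under the lock
    procedure Clear;                  // drop the reference under the lock
  end;

constructor THolder.Create(const AValue: IInterface);
begin
  inherited Create;
  FLock := TCriticalSection.Create;
  FA := AValue;
end;

destructor THolder.Destroy;
begin
  FA := nil;
  FLock.Free;
  inherited Destroy;
end;

function THolder.GetRef: IInterface;
begin
  FLock.Acquire;
  try
    Result := FA;                     // read + AddRef cannot interleave with Clear
  finally
    FLock.Release;
  end;
end;

procedure THolder.Clear;
begin
  FLock.Acquire;
  try
    FA := nil;                        // may release the last reference
  finally
    FLock.Release;
  end;
end;

var
  Holder: THolder;
  Original, Copy1, Copy2: IInterface;
begin
  Original := TInterfacedObject.Create;
  Holder := THolder.Create(Original);
  Original := nil;                    // Holder now has the only reference
  Copy1 := Holder.GetRef;             // a "thread" acquires B.A
  Holder.Clear;                       // another "thread" clears B.A
  Copy2 := Holder.GetRef;
  WriteLn('copy taken before Clear is alive: ', Assigned(Copy1));
  WriteLn('copy taken after Clear is nil:    ', not Assigned(Copy2));
  Holder.Free;
end.
```

Without the lock, a reader could fetch the pointer from FA, be 
preempted, and increment the counter of an instance that Clear has 
meanwhile destroyed.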


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-20 Thread Hans-Peter Diettrich


Fabrício Srdic wrote:

Hello,

In platforms with managed code (.NET, Java), objects are automatically 
freed by the memory manager / garbage collector.


Would not it be interesting to have a similar feature in FPC?


AFAIK some Delphi XE version made TObject itself managed, by reference 
counting. It would be easy to introduce the same feature in FPC, so that 
no special base class would be required. As with extended RTTI, a 
decision should be made whether managed objects should be enabled or 
disabled by default. Afterwards automatic management can be turned on or 
off for every single class or object individually.


For example, through a root class where its objects are counted by 
reference, like the TInterfacedObjects. Thus, the programmer would be 
free from having to manually release objects.


In practice it turned out that the automatic destruction of objects 
still requires assistance of the coder, in many cases, in all languages 
with garbage collection. I.e. a destructor (or finalizer) still is 
required to prepare an object for subsequent destruction.


IMO it's sufficient to use Interfaces for all objects that should be 
subject to garbage collection.


DoDi


Re: [fpc-devel] RTTI generating

2014-09-20 Thread Hans-Peter Diettrich


Sven Barth wrote:
On 20.09.2014 01:52, Hans-Peter Diettrich drdiettri...@aol.com wrote:


  It's up to the coder to make all properties etc. published, when he 
*intends* to ever use RTTI on them. That's the way to tell the compiler 
what to do.


The extended RTTI introduced with Delphi 2010 allows you to even query 
private fields if the class developer decided to enable data generation 
for that.


AFAIK it works in the opposite direction: the developer must *exclude* 
explicitly, in *every* unit, what should *not* be subject to extended 
RTTI. That's my strongest point against (Delphi) extended RTTI, while 
RTTI by itself is okay for me.


I think that it's time to resume the work on my Delphi decompiler. I 
never published it before, but now it looks like it's time to wake up 
the XE coders, like the VB3 coders decades ago.


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-20 Thread Hans-Peter Diettrich


Sven Barth wrote:

On 20.09.2014 12:36, Hans-Peter Diettrich wrote:



AFAIK some Delphi XE made TObject itself managed, by reference counting.
It would be easy to introduce the same feature in FPC, so that no
special base class would be required. Like with extended RTTI a decision
should be made, whether managed objects should be enabled or disabled by
default. Afterwards automatic management can be turned on or off for
every single class or object individually.


It's basically easy, yes, but then one has to deal with code like this:

[...]
Which could lead to some unintended side effects if o is passed to 
some other code which keeps the instance around and .Free merely 
decreases the reference count. Of course that would have been a memory 
leak before and now it's not, but nevertheless it changes behavior.


I already mentioned that destructors still are required, but will have 
a different purpose and usage than before. This would discourage 
continued use of Destroy(), BeforeDestruction() etc., which should at 
least be renamed to prevent unconverted legacy code from compiling. This 
change will already break compatibility, so that consequently all 
libraries (in particular those dealing with lists containing objects) 
have to be updated. I was aware of such consequences, but I'm no longer 
sure of the consequences of my idea of simply turning refcounting on or 
off for specific objects or classes.


The mere implementation of refcounting for TObject is easy, but the 
consequences are hell :-(


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-20 Thread Hans-Peter Diettrich


Boian Mitov wrote:

The short story is that any approach has issues.
The component container approach has issue of a single ownership with 
the easy loss of pointers.

The ref counting has the danger of circular references.
The GC has the non deterministic behavior (Actually I proposed a 
deterministic/semideterministic GC algorithm ~8 years ago or so, but 
that is a different story).


I don't like the use of GC as a synonym for *mark-sweep* garbage 
collection only. Wikipedia also lists Reference Counting as just 
another form of garbage collection.


The point is that even with GC the developer is still required to 
carefully manage resources, and GC tends to make it even more complex.
Of all the above approaches, ARC with optional Weak pointers is the 
easiest to manage and the one that tends to lead to the fewest problems 
IMHO.


ACK - except for Weak references. Weak references turn the conservative 
memory management into an aggressive/optimistic one, with unpredictable 
consequences.


IMO Weak references should be reserved for users who accept possible 
consequential problems, but should never be used in standard libraries. 
At least I'd suggest to make weak references subject to a compiler 
switch, so that every user has a chance to disable them in case of trouble.


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-20 Thread Hans-Peter Diettrich


Sven Barth wrote:
On 20.09.2014 20:34, Giuliano Colla wrote:


  A general mechanism to be reliable should take into account all 
possibilities. If it does, it will block threads even when unnecessary. 
If it doesn't, it will be unsafe.


That would work the same way as it does in interfaces, arrays and 
strings: using Interlocked*-functions.


As I understand the Interlocked Inc/Dec functionality, it only protects 
the update of the reference counter against interrupts, but not the 
tests required before/after this update. As a precaution a RefCount 
should at least be incremented as soon as there exists a *chance* that 
the reference is used/copied in some piece of currently active code 
(thread...).


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-20 Thread Hans-Peter Diettrich


Giuliano Colla wrote:

  Hi Boian,

I'm easily convinced that you've developed a lot of things using 
reference counting. Design is the art of compromise, and possibly in 
your class of application that's the best compromise.
But we should never forget that our class of applications isn't the only 
possible one in the world. What is a bonus for you might be either 
useless or extremely harmful for someone else.


I might, for example, tell you that my company has been successfully 
implementing, for more than 30 years, a class of applications for the 
control of industrial processes, with hundreds of threads running 
simultaneously in a multi-CPU environment,

[...]

IMO realtime applications require a realtime OS, providing all required 
means of process synchronization and communication. Ordinary systems and 
developers should be happy with primitive threads, doing their work in 
the background and exiting when done.


It's good to know that FPC allows implementing and managing more complex 
parallel processing, if I understand you and Boian correctly?


DoDi


Re: [fpc-devel] Suggestion: reference counted objects

2014-09-20 Thread Hans-Peter Diettrich


Boian Mitov wrote:
That is easy. It gets incremented when it gets assigned. The running 
threads have no way of accessing it if there is no reference 
(assignment) already in place.


The problem arises when an object is destroyed, or even elected for 
destruction in _Release, while another thread starts using the same 
instance.


Indeed that is how it works in Delphi, and BTW: that is how Strings work 
in Delphi and FPC the last time I checked ;-) .


With strings it's possible to create another (unique) copy when a 
string is modified, and it does no harm when that copy is destroyed 
later - every user will find a valid (empty) string. Not so with 
objects, which can be neither copied nor reused after destruction.
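
The copy-on-write behaviour of AnsiStrings mentioned above can be 
observed directly. This is only a small demonstration; comparing the 
string variables as pointers is a debugging trick used here to show 
when the buffer is shared, not something regular code should rely on.

```pascal
program StringCopyOnWrite;
{$mode objfpc}{$H+}

var
  A, B: AnsiString;
begin
  A := StringOfChar('a', 3);          // heap-allocated, reference counted
  B := A;                             // only the reference is copied
  WriteLn('shared before write: ', Pointer(A) = Pointer(B));
  B[1] := 'X';                        // write access makes B unique first
  WriteLn('shared after write:  ', Pointer(A) = Pointer(B));
  WriteLn(A, ' ', B);                 // A is unaffected by the change to B
end.
```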


DoDi


Re: [fpc-devel] RTTI generating

2014-09-19 Thread Hans-Peter Diettrich


Boian Mitov wrote:

On Fri, 19 Sep 2014, Adriaan van Os wrote:



Your remarks seem to imply that you think RTTI can be used to inspect 
any aspect of an object.

It was/is not meant for that.


Quite incorrect. All languages with modern RTTI allow for full object 
inspection, and that includes Delphi 2010 and higher, C#, and even VB 
has it.


It's up to the coder to make all properties etc. published, when he 
*intends* to ever use RTTI on them. That's the way to tell the compiler 
what to do.
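
A short sketch of what that means in practice with the classic 
(published-only) RTTI: TPerson and its two properties are invented for 
the example, while GetStrProp and IsPublishedProp come from the TypInfo 
unit.

```pascal
program PublishedRttiSketch;
{$mode objfpc}{$H+}

uses
  Classes, TypInfo;

type
  TPerson = class(TPersistent)        // TPersistent enables published RTTI
  private
    FName: string;
    FSecret: string;
  public
    property Secret: string read FSecret write FSecret;  // no RTTI
  published
    property Name: string read FName write FName;        // RTTI generated
  end;

var
  P: TPerson;
begin
  P := TPerson.Create;
  try
    P.Name := 'DoDi';
    WriteLn(GetStrProp(P, 'Name'));        // access by property name
    WriteLn(IsPublishedProp(P, 'Name'));
    WriteLn(IsPublishedProp(P, 'Secret')); // public only: not visible to RTTI
  finally
    P.Free;
  end;
end.
```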


Inside a program there is no distinction between a well-behaved object 
inspector and an unauthorized object garbler - both can be implemented 
by using RTTI. If you don't like safe types and other restrictions, 
which exist in Pascal for good reasons, then choose any unsafe language 
to implement whatever mess you like :-]


DoDi


Re: [fpc-devel] Method for write string into TStreamt

2014-07-21 Thread Hans-Peter Diettrich


Dmitry Boyarintsev wrote:

How about introducing a default parameter?
The parameter keeps the method backward compatible, while allowing a 
string to be written without the prefix.


public procedure TStream.WriteAnsiString(
  const S: string;
  withLength: Boolean = true;


How should a string without a length be read back?
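
To illustrate the point: with the existing length prefix, two strings 
written back to back can be separated again on reading. The sketch 
below only uses TStream.WriteAnsiString and TStream.ReadAnsiString as 
they exist today; the variable names are arbitrary.

```pascal
program LengthPrefixSketch;
{$mode objfpc}{$H+}

uses
  Classes;

var
  Stream: TMemoryStream;
  First, Second: string;
begin
  Stream := TMemoryStream.Create;
  try
    Stream.WriteAnsiString('hello');  // length prefix + 5 characters
    Stream.WriteAnsiString('world');
    Stream.Position := 0;
    First := Stream.ReadAnsiString;   // the prefix tells where it ends
    Second := Stream.ReadAnsiString;
    WriteLn(First, ' / ', Second);
  finally
    Stream.Free;
  end;
end.
```

Without the prefix the stream would just contain the ten characters 
"helloworld", and the reader would need a length or delimiter from 
somewhere else.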

DoDi


Re: [fpc-devel] TTimers and TThreads. Attn Michael Schnell

2014-06-27 Thread Hans-Peter Diettrich


Giuliano Colla wrote:

If you're using relative times and not absolute ones, then you may avoid 
the search, without needing to re-sort, using a slightly different 
scheme, i.e. entering in a sorted list the times *relative to the 
previous one*.


Then your queue can run out of sync with the absolute time.

I don't see an advantage with using relative times, or unsorted lists.
On insertion a binary search over the list can be made, when the entries
are sorted by absolute time. Removal of entries occurs always from the
list head.
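
A minimal sketch of that scheme, assuming a plain dynamic array as the 
list (TTimerEntry, AddTimer and PopHead are hypothetical names, not 
taken from any RTL or LCL unit):

```pascal
program TimerQueueDemo;
{$mode objfpc}{$H+}

type
  TTimerEntry = record
    DueTime: QWord;   // absolute expiry time, e.g. in milliseconds
    Id: Integer;
  end;

var
  Queue: array of TTimerEntry;

// Binary search for the first position whose DueTime is greater than
// the new one, so entries with equal times keep their insertion order.
function FindInsertPos(DueTime: QWord): Integer;
var
  Lo, Hi, Mid: Integer;
begin
  Lo := 0;
  Hi := Length(Queue);
  while Lo < Hi do
  begin
    Mid := (Lo + Hi) div 2;
    if Queue[Mid].DueTime <= DueTime then
      Lo := Mid + 1
    else
      Hi := Mid;
  end;
  Result := Lo;
end;

procedure AddTimer(DueTime: QWord; Id: Integer);
var
  Idx, I: Integer;
begin
  Idx := FindInsertPos(DueTime);
  SetLength(Queue, Length(Queue) + 1);
  // Shift the later entries up by one slot.
  for I := High(Queue) downto Idx + 1 do
    Queue[I] := Queue[I - 1];
  Queue[Idx].DueTime := DueTime;
  Queue[Idx].Id := Id;
end;

// Removes and returns the head entry; the caller checks Length(Queue) > 0.
function PopHead: TTimerEntry;
var
  I: Integer;
begin
  Result := Queue[0];
  for I := 1 to High(Queue) do
    Queue[I - 1] := Queue[I];
  SetLength(Queue, Length(Queue) - 1);
end;

var
  E: TTimerEntry;
begin
  AddTimer(300, 1);
  AddTimer(100, 2);
  AddTimer(200, 3);
  while Length(Queue) > 0 do
  begin
    E := PopHead;
    WriteLn(E.DueTime, ' -> timer ', E.Id);
  end;
end.
```

The array makes removal from the head linear; a linked list or a ring 
buffer would avoid the shifting, at the cost of a less direct binary 
search.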

DoDi



Re: [fpc-devel] Extended($FFFFFFFFFFFFFFFF) = -1?

2014-03-03 Thread Hans-Peter Diettrich


Ewald wrote:

On 03 Mar 2014, at 00:29, Hans-Peter Diettrich wrote:



`-1` would then be $1 FFFF FFFF FFFF FFFF, whereas $FFFF FFFF FFFF FFFF
would be $0 FFFF FFFF FFFF FFFF. It really is quite
easy to store it like that and `fix` things [picking a fitting
datatype] afterwards.

The datatype has to be constructed/determined first, and *if* there
exists a type with more than 64 bits, then it will be a signed
type, with a 65 bit (unsigned) subrange matching your needs. But if
no such type exists, you are lost.


Yes, that is true, but there always is a 64 bit signed/unsigned type
(perhaps not native). On machines where, for example, only 32 bit
wide datatypes are allowed, the virtual subrange should be 33 instead
of 65 bits.


Subranges are expressed in low..hi notation, not in bits, meaning that 
the hi value must be expressible as a valid positive signed number.



Anyway, that is the way how I parse constants. The important rule
here is that you don't need the full 65 bit in the final
representation. The signedness of the type can fix this loss of the
one bit.


How (which data type) does *your* parser store untyped numerical constants?

IMO your problem arises from the fact that a bitpattern, with the 
highest of 64 bits set, cannot be stored in a larger (signed) type, as 
required. All such untyped constants will cause problems when assigned 
to typed variables. Test yourself what happens when you convert such a 
QWord value into Extended.



Anyway, then you have got backwards compatibility to take care
of, since there will be someone out there whose code actually
depends on this behaviour.

When we agree that a bitpattern of $FFFF FFFF FFFF FFFF can be
interpreted differently on different 32 bit machines, as -1 or
-MaxInt,


Why `on 32 bit machines`?


You're right, my guess of the number of bits was wrong.


I'm fairly confident that this particular
constant on this particular compiler version will generate the same
outcome on every possible architecture out there (just change the
`extended` to `single` in the original example, because extended
tends to vary).


That's my expectation, too.


then it's obvious that such a textual representation should cause
a compilation error "not portable". We know that such an error
message has not yet been implemented, but if you insist on writing
unportable code... :-]


I insist on using a constant that is: - 64 bit wide - Only contains
1's - Is interpreted as an unsigned number wherever mathematical
operations are performed.


Then you have to choose a different language. What will C++, C# or Java 
do in these cases?



Those demands are quite portable, no?


No.


My original problem was easily solved with a typecast
QWord(gargantuan constant goes here), so that was no longer an
issue. What baffled me though was the fact that this (in my
opinion) mis-parsing of certain constants is by design.


What you observed was related to the argument passed to WriteLn. When 
WriteLn includes code to output a QWord, then the output should reflect 
the bitpattern (unsigned number). The output of an Extended value 
reflects the value converted from integral to floating point, and that 
conversion assumes signed values. IIRC the x87 FPU doesn't have an 
instruction to load unsigned integral values, so that no compiler has a 
chance to make it load an unsigned value.
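
A small sketch of the distinction, assuming nothing beyond the System 
unit (the variable names are made up, and the printed floating point 
values depend on the target and compiler version, so none are claimed 
here). The same 64 bit pattern is written once as QWord and once as 
Int64, and then converted to floating point from each typed variable:

```pascal
program QWordDemo;
{$mode objfpc}{$H+}
var
  Q: QWord;
  I: Int64;
  D: Double;
begin
  Q := High(QWord);   // all 64 bits set
  I := Int64(Q);      // same bit pattern, read as a signed value
  WriteLn('as QWord: ', Q);
  WriteLn('as Int64: ', I);
  // The conversion to floating point starts from a typed variable,
  // so the signedness used for the conversion is explicit.
  D := Q;
  WriteLn('from QWord: ', D);
  D := I;
  WriteLn('from Int64: ', D);
end.
```

Going through a typed variable (or an explicit QWord typecast, as 
mentioned above) removes the ambiguity of the untyped constant.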


DoDi


Re: [fpc-devel] Extended($FFFFFFFFFFFFFFFF) = -1?

2014-03-03 Thread Hans-Peter Diettrich


Ewald wrote:

On 03 Mar 2014, at 12:49, Hans-Peter Diettrich wrote:



How (which data type) does *your* parser store untyped numerical
constants?


Roughly like this (syntax may be a bit awry, but you get the point):

TIntegerNumber = record
  case SignedNess: TSignedNess of
    snPositive: (UValue: QWord);
    snNegative: (SValue: Int64);
end;

The parser detects whether there is a `-` in front of the constant and
stores the right sign in the SignedNess field.


A parser doesn't work like that - too many possible cases with unary 
minus. If you need a datatype for integers with more bits than provided 
by the compiler, you must roll your own datatype.




Alright, let me rephrase my demands: - I want to store the value
18446744073709551615 in any kind of variable without loss of
precision.


See above.

DoDi


Re: [fpc-devel] Extended($FFFFFFFFFFFFFFFF) = -1?

2014-03-02 Thread Hans-Peter Diettrich


Ewald wrote:


Talking about principles: If hexadecimal is actually used to
represent bit patterns (as Hans-Peter Diettrich wrote), then the
decision to use a signed type here seems to violate this (represent
bitpatterns) principle, since the highest bit in a signed number has
a different meaning than the other bits, where in a bitpattern all
bits have equal meaning.


That's correct.


It seems like sticking to one principle (signed integer as much as
possible) actually breaks another principle (bitpattern).


Wirth and his Pascal language are well designed with signed types above 
all, and unsigned types being subranges. In so far one could consider 
hex constants with the sign bit set as syntactical errors.



You do care about the signedness, because the only way to represent
int64(-1) in hexadecimal is as $FFFFFFFFFFFFFFFF.


Negative numbers never should be expressed in hex.


And what about -$1? Or is that too far fetched?


That's correct, because -$1 is -1 is a valid integral expression, 
without signedness problems.



Numbers in two's complement do not consist of a single sign bit
followed by a magnitude. Those top 63 '1' bits together form the
- sign in this number.


Yes, but this can all be solved by parsing the string and storing it
with one extra MSBit (if there is a `-` in front of the constant it
must be negative, otherwise it should be positive).


This is why Wirth considers all types being signed, without such problems.



This highest bit then reflects the sign.


The sign representation is machine specific, as you know. On 1's 
complement machines there exist two representations of zero, as +0 and 
-0, and you cannot express both as hexadecimal constants in a portable 
way. That's why high level languages, like Pascal, forbid hex 
representations of (possibly) negative numerical values.



`-1` would then be $1 FFFF FFFF FFFF FFFF,
whereas $FFFF FFFF FFFF FFFF would be $0 FFFF FFFF FFFF FFFF. It
really is quite easy to store it like that and `fix` things [picking
a fitting datatype] afterwards.


The datatype has to be constructed/determined first, and *if* there 
exists a type with more than 64 bits, then it will be a signed type, 
with a 65 bit (unsigned) subrange matching your needs. But if no such 
type exists, you are lost.



Anyway, then you have got backwards
compatibility to take care of, since there will be someone out there
whose code actually depends on this behaviour.


When we agree that a bitpattern of $FFFF FFFF FFFF FFFF can be 
interpreted differently on different 32 bit machines, as -1 or -MaxInt, 
then it's obvious that such a textual representation should cause a 
compilation error "not portable". We know that such an error message 
has not yet been implemented, but if you insist on writing unportable 
code... :-]


DoDi


Re: [fpc-devel] Class property and virtual getter

2014-02-28 Thread Hans-Peter Diettrich


Michael Schnell wrote:

On 02/28/2014 02:18 AM, Hans-Peter Diettrich wrote:


So the lack of Self seems to apply to "static;" methods, not to 
class methods. I'll ask in an EMBT group for a description of 
"static;", the OH seems to reflect the C++ meaning only,


In ANSI C, "static" with functions just means "unreachable from outside 
the current source file" (i.e. by the linker). (I always thought this is 
a silly name for that meaning.)


Is this different with C++ ?


Yes. Some C keywords have multiple different meanings in C++, depending 
on where they occur in source code.


DoDi


Re: [fpc-devel] Extended($FFFFFFFFFFFFFFFF) = -1?

2014-02-28 Thread Hans-Peter Diettrich


Ewald wrote:

On 28 Feb 2014, at 20:39, Jonas Maebe wrote:



All hexadecimal constants are (conceptually) parsed as int64, so
this is by design. int64($FFFFFFFFFFFFFFFF) is not -1.



By the way, what do you do when you want to port fpc to a one's
complement machine (if they still exist)?


Numerical constants, where the sign matters, should only be encoded in 
decimal. The other formats (hex,oct,bin...) are intended for use with 
binary values, where the bit pattern is important. Then the code 
compiles correctly on any kind of machine.


Assumptions about type sizes and encodings can make *application* code 
unportable. E.g. the Extended type doesn't have a guaranteed size and 
binary representation, IIRC it's equivalent to Double on x64.


DoDi


Re: [fpc-devel] Class property and virtual getter

2014-02-27 Thread Hans-Peter Diettrich


Michael Van Canneyt wrote:

So what's the special use of a *class* property? If it exists for 
Delphi compatibility only, why then is it handled differently from 
property?


The reason is explained in the upcoming docs.

Namely: a static method cannot be overridden.


Sure, but virtual methods (including class methods) can be overridden.

The class property is part 
of this particular class, and descendent classes should not be able to 
override its behaviour.


A static class method can call another virtual class method, so this 
protection looks very artificial to me.



BTW Delphi XE allows to call a virtual class method, but when called 
from a static class method it calls it like a static method, overrides 
are simply ignored. Calling the same method directly honors overrides.


Also self is no longer known inside class methods in XE. In D7 it was 
the class type instead of the instance pointer. Thus a too restrictive 
compiler, geared towards compatibility with *new* Delphi versions, may 
break existing code.


DoDi


Re: [fpc-devel] Class property and virtual getter

2014-02-27 Thread Hans-Peter Diettrich


Michael Van Canneyt wrote:


The reason is explained in the upcoming docs.

Namely: a static method cannot be overridden.


Sure, but virtual methods (including class methods) can be overridden.

The class property is part of this particular class, and descendent 
classes should not be able to override its behaviour.


A static class method can call another virtual class method,


No, it cannot. Try it. It was explained to me using this exact example.


Then I missed the "static;" directive in the posted example, added to 
the getter/setter methods. Delphi introduced that directive after D7, 
and I found no useful description for it yet. Now it looks to me as if 
we have to distinguish ordinary static (non-virtual) methods from 
explicit "static;" methods.


DoDi


Re: [fpc-devel] Class property and virtual getter

2014-02-27 Thread Hans-Peter Diettrich


Sven Barth wrote:

On 27.02.2014 at 15:35, Hans-Peter Diettrich wrote:
Also self is no longer known inside class methods in XE. In D7 it was 
the class type instead of the instance pointer. Thus a too restrictive 
compiler, geared towards compatibility with *new* Delphi versions, 
may break existing code.

Source please. This compiles and runs without problems in XE:

=== source begin ===

type
  TTest = class
    class procedure Test;
  end;


When you add "static;", as required for class property getters/setters, 
the following won't compile:



class procedure TTest.Test;
begin
  Writeln(Self.ClassName);
end;


So the lack of Self seems to apply to "static;" methods, not to 
class methods. I'll ask in an EMBT group for a description of 
"static;", the OH seems to reflect the C++ meaning only, without 
mentioning the impact on OPL classes.
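
For comparison, a minimal FPC sketch of the two kinds of class methods 
(ShowName and NoSelf are hypothetical names; this shows the behaviour 
as I understand it for FPC in objfpc mode, not a statement about any 
particular Delphi version):

```pascal
program StaticDemo;
{$mode objfpc}{$H+}
type
  TTest = class
    class procedure ShowName;         // has an implicit Self (the class)
    class procedure NoSelf; static;   // no Self available
  end;

class procedure TTest.ShowName;
begin
  WriteLn(Self.ClassName);
end;

class procedure TTest.NoSelf;
begin
  // WriteLn(Self.ClassName);  // would not compile: no Self here
  WriteLn('static class method');
end;

begin
  TTest.ShowName;
  TTest.NoSelf;
end.
```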


DoDi


Re: [fpc-devel] Class property and virtual getter

2014-02-27 Thread Hans-Peter Diettrich


Jonas Maebe wrote:


Error: Only class methods, class properties and class variables can be
referred with class references


You have to declare an instance and then call its property. You don't 
have to instantiate the instance if the property maps to a class method.



Technically there's some obstacle to allow such construct?


As long as a method doesn't use Self, directly or implicitly, the 
absence of an object reference does not cause problems.


Class properties should be accessible from within static class methods. 
Having them accessible depending on the getter/setter they use (static 
or not) would break orthogonality (the visibility/usability must depend 
on the interface, not on the implementation of the interface).


This would mean that in legacy code the non-virtual methods have to be 
separated now, into non-virtual, static, class and static class methods, 
in order to keep the code compiling?


Non-static class methods cannot be called from static class methods 
because you don't know the original class type that was used to call it 
(and hence this could have unexpected results).


Does this mean that the new static class methods don't have a Self 
parameter?


DoDi


Re: [fpc-devel] About typecasts and the documentation

2014-02-08 Thread Hans-Peter Diettrich


Martin Frb wrote:

http://www.freepascal.org/docs-html/ref/refse67.html#x124-13400012.4
In general, the type size of the expression and the size of the type 
cast must be the same. However, for ordinal types (byte, char, word, 
boolean, enumerates) this is not so, they can be used interchangeably. 
That is, the following will work, although the sizes do not match.


http://www.freepascal.org/docs-html/ref/refse68.html#x125-13500012.5
A variable can be considered a single factor in an expression. It can 
therefore be typecast as well. A variable can be typecast to any type, 
provided the type has the same size as the original variable.


IMO type*cast* and type*conversion* should be kept separate. A cast then 
requires that the *size* is the same, while in a conversion the *value* 
stays the same. The compiler messages should reflect this difference 
(see your example below).
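
A minimal sketch of that difference (variable names are made up; the 
reinterpretation goes through a pointer so that it does not depend on 
how a particular compiler treats a direct typecast between Single and 
an integer type):

```pascal
program CastVsConvert;
{$mode objfpc}{$H+}
var
  S: Single;
  Converted: LongInt;
  Reinterpreted: LongWord;
begin
  S := 1.0;
  // Conversion: the numeric value is preserved.
  Converted := Round(S);
  // Cast in the strict sense: the same 32 bits are read as another type.
  Reinterpreted := PLongWord(@S)^;
  WriteLn('converted:     ', Converted);
  WriteLn('reinterpreted: $', HexStr(Reinterpreted, 8));
end.
```

Both types have the same size here, which is exactly the precondition 
the documentation states for a cast.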


Usually "typecast" can have both meanings, in particular for "value 
typecasts". Further terms are "type coercion" and "type promotion".


Typecasts can be further restricted to *compatible* types. Here numeric 
types seem to be compatible with other numeric types, but not with 
structured types (records...). With classes sometimes a distinction 
between upcast and downcast is made (type *inclusion*), where up and 
down reflect more basic (ancestors) and more derived classes. Possibly 
this also applies to conversions between Char and numeric types, for 
which standard conversions Ord(c) and Chr(i) are defined.


It's not clear to me why a TStrings.Objects[i]:TObject can be compatible 
with e.g. integer:

  MyStringList.AddObject('1',TObject(1));
This may be due to some underlying implementation detail, where the list 
(array) contains pointers instead of objects, and these pointers then 
are compatible with numbers.
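
A short sketch of that usage, assuming the pointer-sized PtrInt type 
as the intermediate step (the list contents are made up for 
illustration). The list is created with its default settings, so it 
never tries to free the fake "object":

```pascal
program ObjectsAsInts;
{$mode objfpc}{$H+}
uses
  Classes;
var
  List: TStringList;
  Value: PtrInt;
begin
  List := TStringList.Create;
  try
    // The Objects[] slot holds a reference (pointer-sized), so a
    // pointer-sized integer fits into it bit for bit.
    List.AddObject('one', TObject(PtrInt(1)));
    Value := PtrInt(List.Objects[0]);
    WriteLn(List[0], ' = ', Value);
  finally
    List.Free;
  end;
end.
```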


Often also multiple casts can be accepted, in something (untested) like
  MyObject := TObject(pointer(1));
or
  pointer(MyObject) := pointer(1);

IMO the detailed rules, as implemented in the compiler, are too complex 
for a simple description. That's why the docs only explain the syntax, 
not the full semantics behind the syntax.



foo := TFoo(longint(1)); // project1.lpr(9,8) Error: Illegal type conversion: LongInt to TFoo
foo := TFoo(1); // project1.lpr(10,8) Error: Illegal type conversion: LongInt to TFoo

end.


Obviously the types (record and numeric) are considered incompatible by 
the compiler.


DoDi


Re: [fpc-devel] Incomplete docs on operator precedence / Question about actual precedence

2014-02-03 Thread Hans-Peter Diettrich


Martin Frb wrote:


Further, it appears that ^ has a higher precedence than  unary -


IMO pointer/address arithmetic follows (or should follow) its own rules.
Unary - and @ should not be applicable to addresses. @ also is 
restricted to arguments which *do* have an address, i.e. not applicable 
to arithmetic expressions or properties.


Applicable binary operators depend on the type of both arguments, e.g. 
it's valid to subtract addresses (yielding an ordinal value), but adding 
addresses should be disallowed, while adding an ordinal value to a 
pointer is okay (yielding another address).
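
A minimal sketch of the allowed combinations with typed pointers, 
assuming FPC's pointer arithmetic in objfpc mode (the array and the 
variable names are made up):

```pascal
program PtrArith;
{$mode objfpc}{$H+}
var
  Data: array[0..3] of LongInt = (10, 20, 30, 40);
  P, Q: PLongInt;
begin
  P := @Data[0];
  Q := P + 2;        // pointer + ordinal yields another pointer
  WriteLn(Q^);       // the element Q now points at
  WriteLn(Q - P);    // pointer - pointer yields an ordinal distance
  // P + Q is deliberately left out: adding two addresses has no
  // meaningful result type.
end.
```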




//  p:= -@i;   // if enabled, next line will crash


This should not compile, unary - is not applicable to addresses.


  writeln( -p^ ); // writes -99


Here ^ must take precedence, applied to a pointer/address.

p+i^ is questionable, the only valid interpretation is (p+i)^.


IOW applicable operators and precedence depend on the type of the 
argument(s).


DoDi


Re: [fpc-devel] Black List of examples that FPC won't compile.

2014-01-28 Thread Hans-Peter Diettrich


Maciej Izak wrote:


Following the current logic, this example should not compile:

--begin code--
TA = class
  procedure Foo;
end;

TB = class(TA)
  procedure Foo(A: Integer = 0); overload;
end;

var
  b: TB;
begin
  b := TB.Create;
  b.Foo; // should raise Error: Can't determine which overloaded function to call

end;
--end code--


Delphi (XE) has no problem with this code, even if TA.Foo also is 
declared overload, or both are declared Foo(). The static type (b:TB) 
determines which static method to call.



Returning to the example from the bugtracker 
(http://bugs.freepascal.org/view.php?id=25607):
FPC doesn't recognize at the TB level that TObject.Create was hidden at 
the TA level by 


constructor Create(A: Integer = 0); virtual; overload;


Delphi here requires
 constructor Create(A: Integer = 0); overload; virtual;

Delphi seems to search for overloaded methods only in the same class. 
Otherwise an error "Previous declaration ... not marked 'overload'" 
would occur since TObject.Create was not marked 'overload'. Tested with 
declarations in the same class:

  constructor Create;
  constructor Create(A: integer);
where *both* must be marked 'overload'.

Only if there is no matching method in a class, the ancestors are 
searched as well. I.e. Delphi does not *hide* inherited methods, it only 
extends the search into ancestors *when required*, regardless of 
'overload' directives.


DoDi


Re: [fpc-devel] DOS GUI

2014-01-15 Thread Hans-Peter Diettrich


Thaddy wrote:
Well, I have a statement from their legal department dating from 2005 
amounting to: "we use it as you intended (sic) and see no reason to quote 
that this sourcecode is yours. Furthermore, the two units that contain 
said sourcecode you refer to are protected under U.S. copyright law and 
are our intellectual property." (It blahblah's a lot more, this is the 
essence and not verbatim.) In other words: closed source.


Well, such companies and lawyers can claim a lot. This is not different 
from other countries, but it may be much more expensive to defend 
against such piracy in the U.S. :-(


At least you know now that your license has been too generous. And your 
case also explains why the open source licenses are so complicated, in 
order to prevent Copyright addicts from hijacking open source code.


Now you can be right and probably you are right, but to be legally right 
in the U.S. this will cost a lot of funds that I can better use 
elsewhere. This type of answer is not unique to my case. I believe 
Henri Gourvest has a rather unique addition to some of his 
open-licenced sourcecode explicitly excluding said company from using it 
after a similarly bad experience.


Did you contact e.g. the FSF, asking for advice or assistance in your 
case? When that company is known for such illegal practices, they may be 
interested in defending open source principles.


DoDi


Re: [fpc-devel] DOS GUI

2014-01-14 Thread Hans-Peter Diettrich


Thaddy wrote:

It happened to me once or twice ;) that a certain company with ever 
changing names used my sourcecode and licensed it under their own closed 
terms because I included the term "use as you like".


Better: "free for private use".

If the owner wants that not to happen, choose any of these licenses 
mentioned.
This is really important. Without huge legal fees I can't get my 
intellectual property back.


Sorry, that's nonsense. You still have all rights on your own software, 
no need to get anything back. Even in outdated Copyright terms a "use 
as you like" should not mean "take ownership".


DoDi


Re: [fpc-devel] DOS GUI

2014-01-14 Thread Hans-Peter Diettrich


Mark Morgan Lloyd wrote:

Hans-Peter Diettrich wrote:

If the owner wants that not to happen, choose any of these licenses 
mentioned.
This is really important. Without huge legal fees I can't get my 
intellectual property back.


Sorry, that's nonsense. You still have all rights on your own 
software, no need to get anything back. Even in outdated Copyright 
terms a "use as you like" should not mean "take ownership".


I don't think that's necessarily the case. If you don't make a clear 
statement of ownership in every accessible file then it's difficult to 
claim that it's not in the public domain (or res nullius),


On the contrary, nobody can state then that it *is* in the public domain.

that's why 
classic IBM operating systems and HP calculator firmware are now being 
distributed freely.


Not legally in the EU, at least not with consent of the rights holder.

Ownership expires after some time, perhaps the old Copyright protection 
has expired now? Otherwise ownership expires 70 years after the *death* 
of the author, which is unlikely to have happened for software yet :-]


In current international law (Droit d'Auteur) *only* the author has 
rights on his work. Everybody else must be allowed by the author to use 
it. That's why an author note will allow to identify the person from 
whom one can obtain the right to use it. When the author can not be 
identified, then the work is *not* in the public domain, nobody is 
allowed to use it.


DoDi


Re: [fpc-devel] Explanation about code page-aware AnsiStrings

2014-01-08 Thread Hans-Peter Diettrich


Jonas Maebe wrote:

http://wiki.freepascal.org/FPC_Unicode_support (only Sections 1 to 3; 4 
and later are older and mostly either incomplete or wishful thinking).


Just a note on RawByteString concatenation:

Delphi concatenates RawByteStrings to the dynamic encoding of the 
*first* string; the appended strings are converted, if necessary, before 
concatenation. Special handling of strings with the same encoding is not 
required.

I.e. the result is *not* always a CP_ACP string, as documented in the wiki.

Please adjust the implementation accordingly, it makes the 
RawByteStrings much more useful. The handling of automatic conversions 
may be unified in general, when concatenated strings are assigned to a 
target of a known encoding; in this case the target encoding can be used 
for the result, instead of the encoding of the first string, the 
remaining concatenation process can be the same.



On OEMString:

CP_OEM (=1) works differently from CP_ACP (=0). Variables of type 
AnsiString(CP_OEM) will always have dynamic encoding CP_OEM, no 
substitution to a specific OEM codepage. CP_ACP strings instead have a 
dynamic encoding of the current DefaultSystemCodepage.


DoDi


Re: [fpc-devel] Explanation about code page-aware AnsiStrings

2014-01-08 Thread Hans-Peter Diettrich


Sven Barth wrote:
On 08.01.2014 at 15:58, Hans-Peter Diettrich drdiettri...@aol.com wrote:
  Delphi concatenates RawByteStrings to the dynamic encoding of the 
*first* string; the appended strings are converted, if necessary, before 
concatenation. Special handling of strings with the same encoding is not 
required.
  I.e. the result is *not* always a CP_ACP string, as documented in the 
wiki.


Would you be so kind to provide a simple test case for this? :)


function test(a,b: RawByteString): RawByteString;
begin
  Result := a+b;
  WriteLn(StringCodePage(Result));
end;

var
  u: UTF8String;
  a: AnsiString;
begin
  a := 'äöü';
  u := 'üöä';
  test(a,u);  //CP_ACP
  test(u,a);  //UTF-8
end;

It looks to me, however, that no conversion occurs at all!
The strings are only concatenated as they are.
Same for a concatenation of (global) RawByteString variables.

This of course would not be a desirable implementation :-(

DoDi


Re: [fpc-devel] Explanation about code page-aware AnsiStrings

2014-01-08 Thread Hans-Peter Diettrich


Sven Barth wrote:

On 08.01.2014 19:57, Hans-Peter Diettrich wrote:



It looks to me, however, that no conversion occurs at all!
The strings are only concatenated as they are.
Same for a concatenation of (global) RawByteString variables.

This of course would not be a desirable implementation :-(


I'm inclined to say "of course", because in your test function you are 
concatenating two RawByteStrings which - by definition - don't do any 
conversion.


I cited what I've been told in the EMBT groups - that a conversion is 
made when required. Everything else doesn't make sense.


DoDi


Re: [fpc-devel] Encoded AnsiString

2014-01-07 Thread Hans-Peter Diettrich


Michael Van Canneyt wrote:

If you want a TStrings that can hold strings which may differ in their 
encoding (i.e. strings[0] has a different encoding from strings[1]) then 
you'll be left in the cold.


Just an idea:
What if FPC adds another encoding, similar to RawByteString ($FFFF), but 
without the Delphi quirks? Or simply fix the RawByteString flaws in the 
*Ansi* compiler and RTL?


1) In a discussion in the Embarcadero groups it turned out that, in an 
assignment of a RawByteString to another AnsiString type, the Delphi 
compiler should (but does not) check and, if necessary, convert the 
string to the static encoding of the target. This is (almost) the only 
way to create strings with a different static and dynamic encoding.


2) The stupid conversion to CP_ACP in an assignment *to* a 
RawByteString should be dropped. This applies in particular to the 
assignment to *function results*.


3) The function result type should be honored, in functions accepting 
RawByteString parameters. The Delphi compiler seems to *assume* that the 
result of such functions is RawByteString, so that (including the 
aforementioned flaws) the outcome is a CP_ACP string, even if the 
declared function result is e.g. a UTF8String.


Test case:
  function conc(a,b: RawByteString): UTF8String;
  begin Result := a+b; end;
The same result as for
  function conc(a,b: RawByteString): RawByteString;
  begin Result := a+b; end;
the returned string has CP_ACP encoding :-(


When these flaws are fixed in the FPC compiler, the AnsiString types 
will always have the same static and dynamic encoding, as it should be.


Then TStrings could be based on such RawByteStrings, without excess 
conversions or losses. Sorting (TStringList) possibly should ignore 
the dynamic encoding, i.e. work on a strictly binary (byte-by-byte) base.


DoDi


Re: [fpc-devel] Encoded AnsiString

2014-01-07 Thread Hans-Peter Diettrich


Jy V wrote:

A quick note: the new LLVM Delphi compiler forbids the use of AnsiString 
and AnsiChar (declared in the unit AnsiString.pas, you cannot use this 
unit anyway),


The compiler supports AnsiStrings, but these are hidden for *mobile* 
targets. There exists a hack to enable AnsiString support also for such 
targets, though.


DoDi


Re: [fpc-devel] Explanation about code page-aware AnsiStrings

2014-01-07 Thread Hans-Peter Diettrich


Jonas Maebe wrote:

Large parts of the recurring discussions about code page-aware 
AnsiStrings are related to the fact that many people don't know how they 
work. For this reason I've created an overview that explains the rules 
that are followed by the RTL/compiler at 
http://wiki.freepascal.org/FPC_Unicode_support (only Sections 1 to 3; 4 
and later are older and mostly either incomplete or wishful thinking).


Thanks :-)

The chapter numbers are missing from the headings?


On my Win98 VM this page is not accessible:



Error 403

We're sorry, but we could not fulfill your request for
/FPC_Unicode_support on this server.

You do not have permission to access this server.

Your technical support key is: 02f1-94ac-17f4-e8c8


What's wrong?

On my Win8 machine the page and server is accessible, of course.

DoDi



Re: [fpc-devel] Encoded AnsiString

2014-01-07 Thread Hans-Peter Diettrich


Jonas Maebe wrote:

On 07 Jan 2014, at 15:35, Hans-Peter Diettrich wrote:



2) The stupid conversion to CP_ACP in an assignment *to* a
RawByteString should be dropped. This applies in particular to the
assignment to *function results*.


The conversion does not happen for all assignments, it only happens
for concatenations that are assigned to RawByteString. And even then
it doesn't always happen. Please read the wiki page I wrote (trying
to prevent exactly this kind of wrong statements from being further
repeated, and obviously failing).


I've tested the behaviour, and it appears not only in assignments to 
RawByteStrings. See test case below.




Test case:
  function conc(a,b: RawByteString): UTF8String;
  begin Result := a+b; end;


This will always return CP_UTF8 on FPC. Does it really return CP_ACP
on Delphi? Even if it does, I doubt we will change that.


This leads me back to my previous statement: it will be simpler to do 
things right than to try to achieve compatibility with *all* Delphi 
flaws, especially when the Delphi flaws have never been documented...
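
For anyone who wants to check this on their own compiler, the test case 
quoted above can be wrapped into a small program like the one below. It is 
only a sketch: the program name, the variable names and the sample literals 
are made up for illustration. StringCodePage is the RTL function that 
reports the dynamic code page of a string. What gets printed depends on the 
compiler and RTL version, so no output is claimed here.

```pascal
program ConcTest;
{$mode objfpc}{$H+}

// The function from the test case, unchanged apart from formatting.
function Conc(a, b: RawByteString): UTF8String;
begin
  Result := a + b;
end;

var
  u: UTF8String;   // static code page CP_UTF8
  s: AnsiString;   // static code page CP_ACP
  r: UTF8String;
begin
  // Illustrative values only; plain ASCII so no conversion can lose data.
  u := 'abc';
  s := 'def';
  r := Conc(u, s);
  // Print the dynamic code page of the function result.
  WriteLn('dynamic code page of result: ', StringCodePage(r));
  WriteLn('content: ', r);
end.
```

Comparing the printed code page with CP_UTF8 (65001) on FPC and on Delphi 
should settle which of the two statements above holds for a given version.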



We couldn't
even easily do that, because we don't know the static code pages
of the strings that are concatenated inside the RTL routine that
handles this.


Right! Only the compiler can do that, and therefore the compiler should 
do it right.



Then TStrings could be based on such RawByteStrings, without excess
conversions or losses.


The problem with changing TStrings from AnsiString to RawByteString
is not so much related to the behaviour of RawByteString, but more
regarding descendent classes in existing third party (= user) source
code that override methods using AnsiString parameters. We don't want
to force everyone to rewrite their code so it uses RawByteString (if
anything, RawByteString should probably be used as little as possible
in user code, because always correctly dealing with all possible code
pages is very hard).


Right *sigh*


Sorting (TStringList) eventually should ignore the dynamic
encoding, i.e. work on a strictly binary (byte-by-byte) base.


Looking for just one second at the definition of the Sort methods of
TStringList (and TStrings) would have prevented you from writing the
above statement, which does not make any sense whatsoever (unless you
want the compiler to start changing all code where a programmer
passes a comparison function that does take code pages into account
to the Sort methods of TStrings/TStringList).


Fine that you took the bait ;-)

DoDi


Re: [fpc-devel] Encoded AnsiString

2013-12-30 Thread Hans-Peter Diettrich


Paul Ishenin wrote:

30.12.2013 9:07, Hans-Peter Diettrich wrote:
Do you think that FPC should really reproduce all this inconsistent 
behaviour? Who would test or even specify the compatible behaviour, 
when every new variation will result in more unexpected results? IMO 
it's much easier to do it right, and fix the Delphi flaws in FPC.


The work is already done by the FPC team. AnsiString(codepage) works and 
works compatibly with Delphi (whether someone likes this or not) and the 
behavior is covered by tests. Trunk version is very close to 2.8 
release.


This means that UTF-8 won't work properly when it's not CP_ACP :-(

DoDi


Re: [fpc-devel] Encoded AnsiString

2013-12-29 Thread Hans-Peter Diettrich


Michael Van Canneyt wrote:



On Sun, 29 Dec 2013, Hans-Peter Diettrich wrote:

Inspired by the current Lazarus discussion I'd like to learn more 
about the current state of the implementation of the new AnsiStrings.


In case nothing has been done yet, I'd suggest to extend TAnsiRec by the 
new codePage and elemSize fields (words). These can be zero for now, 
so that the remaining codebase is not affected. Then it will be 
possible to play around with encoded strings, using the codePage field.




All this is done already a long time ago in trunk.
We're way past that stage.


I'm very confused, didn't use FPC for a long time. Have to refresh 
memory of all related procedures...


How do I instruct fpcup to checkout the trunk version? (Windows)
I tried to add a parameter fpcURL=trunk to the shortcut, is this correct?

How do I proceed (build, use in Lazarus...)?
Any links appreciated :-)

Current stage is the creation of a unicode RTL, where all base 
file/string operations accept unicode strings. This is done too.


Next step is creation of the unicode RTL, where string = widestring.
This will be combined with the dotted unit filenames, to be Delphi 2010+ 
compatible.


*sigh*
How do I create source files for use with both versions?

To allow people to choose, 2 RTLs will be created: one unicode 
(string=widestring), one non-unicode (string=ansistring).


This will result (probably) in 2 paths:
units/os-cpu
units/os-cpu-unicode
This is not decided yet.

I planned the work in February/March.


Thanks :-)

Where can I jump in?



A related question:
Why is the string length set to zero in NewAnsiString, when the 
allocated Length is already known?


Because the allocated memory length is not necessarily equal to the 
string length.
If you have a string of length 50, setting the length to 25 will not 
discard and reallocate the memory block, but merely set the character 
length to 25.
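
A minimal sketch of the situation described above, for illustration only 
(the program and variable names are made up). It shows just the visible 
part, the character length; whether the memory block is really kept or 
reallocated is an implementation detail of the RTL and the heap manager and 
cannot be seen from this code.

```pascal
program LenDemo;
{$mode objfpc}{$H+}

var
  s: AnsiString;
begin
  // Build a string of 50 characters.
  s := StringOfChar('x', 50);
  WriteLn('length before: ', Length(s));

  // Shrink the string: only the character length is guaranteed to change.
  SetLength(s, 25);
  WriteLn('length after: ', Length(s));
end.
```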


This means that the allocated length is stored somewhere else, in the 
memory block descriptor?


How can a user request a string of a specific allocation size?


Another one:

I've heard that a mix of encodings converts the (concatenated) output 
(RawByteString?) to CP_ACP, with possible losses. Is this correct?


DoDi


Re: [fpc-devel] Encoded AnsiString

2013-12-29 Thread Hans-Peter Diettrich


Michael Van Canneyt wrote:



On Sun, 29 Dec 2013, Hans-Peter Diettrich wrote:


This will be combined with the dotted unit filenames, to be Delphi 
2010+ compatible.


*sigh*
How do I create source files for use with both versions?


What do you mean by this statement ?


I'm not familiar with dotted unit names, they seem not to be used in XE.
So I only can imagine something like conditionals around the different 
items in un/dotted environment, to keep Classes separate from 
System.Classes?


Are directories involved? If so, does the Delphi structure match the FPC 
tree structure?




Where can I jump in?


When I'm done I will release a version for testing to the public.


Fine :-)



How can a user request a string of a specific allocation size?


You should not.


Okay.



Another one:

I've heard that a mix of encodings converts the (concatenated) output 
(RawByteString?) to CP_ACP, with possible losses. Is this correct?


Define output ?


s := SomeACPstr+SomeUTF8str+'äöü';

In XE I can concatenate ACP and UTF-8 strings and assign it to an OEM 
string without losses. Somebody said this will fail in FPC, on e.g.

  FindFirst(myPath+allfiles,faAnyFile,sr);
due to an (intermediate?) conversion of myPath+allfiles to CP_ACP.

Of course the string must be converted to CP_ACP if FindFirst expects 
exactly an AnsiString(0) argument, otherwise something is broken.
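
To make the question testable, here is a rough sketch of such a mixed 
concatenation. Everything in it is an assumption for illustration: the type 
names Cp1252String and OemString, the choice of code pages 1252 and 850, 
and the sample literal. It assumes the source file is saved as UTF-8. The 
result is printed, not predicted, since the outcome is exactly what is 
under discussion.

```pascal
program MixDemo;
{$mode objfpc}{$H+}
{$codepage utf8}  // assumption: this source file is saved as UTF-8

// On Unix a widestring manager is needed for code page conversions.
{$ifdef unix}uses cwstring;{$endif}

type
  // Hypothetical names, declared only for this sketch.
  Cp1252String = type AnsiString(1252);
  OemString    = type AnsiString(850);

var
  a: Cp1252String;
  u: UTF8String;
  o: OemString;
begin
  a := 'äöü';
  u := 'äöü';

  // Concatenate strings with two different static code pages and
  // assign the result to a string with a third code page.
  o := a + u;

  // Six characters are expected if nothing was lost on the way.
  WriteLn('code page of result: ', StringCodePage(o));
  WriteLn('length of result:    ', Length(o));
end.
```

If the length differs from 6, or the content shows question marks, an 
intermediate conversion has lost data somewhere.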


DoDi
