On 5/19/06, Daniël Mantione <[EMAIL PROTECTED]> wrote:
On Thu, 18 May 2006, Flávio Etrusco wrote:

> > L> Dynamic arrays can be very handy and I never knew anyone who avoids
> > L> them. Of course if your array has fixed length there's no reason
> > L> to use a dynamic array either.
> > L> Fortunately it's not very often that one falls into Borland's trap
> > L> that dynamic arrays aren't copy-on-write like AnsiStrings... BTW,
> > L> is this the behaviour in FPC, too?

Free Pascal is Delphi compatible.

I know that FPC aims to be Delphi-compatible, but that's not always the case; e.g. WideStrings were reference-counted until a couple of months ago. So you are saying that in this specific case FPC is already compatible with Delphi (unfortunately, as with the WideStrings case)?
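
For readers following the thread, the difference being asked about can be shown with a minimal sketch. The program name and variable names are made up for illustration, and it assumes the Delphi-compatible behaviour confirmed in the reply above: assigning a dynamic array copies only the reference, while writing to an AnsiString first makes a private copy.

```pascal
program SharingDemo;   { hypothetical name, illustration only }
{$mode objfpc}{$H+}

var
  a, b: array of Integer;
  s, t: string;        { AnsiString, because of $H+ }
begin
  SetLength(a, 3);
  a[0] := 1;
  b := a;              { copies the reference only; a and b share one block }
  b[0] := 99;          { no copy-on-write: the change is visible through a }
  WriteLn(a[0]);       { prints 99 }

  s := 'abc';
  t := s;              { t refers to the same string data as s }
  t[1] := 'X';         { the write makes t unique first, so s is untouched }
  WriteLn(s);          { prints abc }
end.
```

If an independent copy of the array is wanted, `b := Copy(a);` in place of the plain assignment is one way to get it.
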
A lot of people use getmem, combined with reallocmem if the array size has to change after the initial allocation. It is low-level programming and therefore ugly, but many people consider dynamic arrays ugly as well, because they differ a lot from standard Pascal semantics.

> > L> It's simply because the code has to check there's only one reference
> > L> to the string on each change. If you know there's no concurrent
> > L> access to the string (e.g. your app is single-threaded, or you have a
> > L> local copy of the string) you should access it as a PChar.

This code:

    procedure z;
    var a:string;
    begin
      a:='abc';
    end;

... generates this monster with $H+:

    P$TESTASTRING_Z:
            push    ebp
            mov     ebp,esp
            sub     esp,52
            mov     dword [ebp-4],0
            lea     eax,[ebp-24]
            mov     ecx,eax
            lea     eax,[ebp-48]
            mov     edx,eax
            mov     eax,1
            call    NEAR FPC_PUSHEXCEPTADDR
            call    NEAR FPC_SETJMP
            push    eax
            test    eax,eax
            jne     NEAR [EMAIL PROTECTED]
            lea     edx,[ebp-4]
            mov     eax,edx
            call    NEAR FPC_ANSISTR_DECR_REF
            mov     eax,dword [_$PROGRAM$_L12]
            mov     dword [ebp-4],eax
    [EMAIL PROTECTED]:
            call    NEAR FPC_POPADDRSTACK
            mov     edx,dword INIT__SYSTEM_ANSISTRING
            lea     eax,[ebp-4]
            call    NEAR fpc_finalize
            pop     eax
            test    eax,eax
            je      NEAR [EMAIL PROTECTED]
            call    NEAR FPC_RERAISE
    [EMAIL PROTECTED]:
            leave
            ret

With $H- the result is:

    P$TESTASTRING_Z:
            push    ebp
            mov     ebp,esp
            sub     esp,256
            lea     ecx,[ebp-256]
            mov     edx,dword _$PROGRAM$_L9
            mov     eax,255
            call    NEAR fpc_shortstr_to_shortstr
            leave
            ret

It is therefore not surprising that the shortstring version is faster. There are other reasons why shortstrings are faster: temporary strings are allocated on the stack, and a "sub esp,xxxx" is a lot faster than a getmem. Shortstrings also do not need reallocmem if they grow.

> > L>
> > L> > But they are said to be improved in recent versions (recent
> > L> snapshots?).
> > L>
> > L> I find it strange that the cost of copying a ShortString (maybe
> > L> because they are at most 255 bytes? Maybe cache locality usually
> > L> is fine in this case? 8-| ) is lower (better) than the
> > L> locked-count-reference and the exception trapping...

A shortstring copy is really fast. Shortstrings are copied 4 bytes at a time in assembler code, so you need at most 64 steps to copy a string of maximum length. Most strings are shorter, like the example above. However, you are right that copying is a limiting factor in shortstring performance.

> > L> Anyway, isn't it just the case to correctly optimize string
> > L> parameters as 'const' and 'var', and maybe using PChar in some few places, or
> > L> can you think of any other reason for AnsiStrings to be slower than
> > L> ShortStrings?

A lot of them, see above.

Daniël

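
The "at most 64 steps" figure above follows from the size of a full ShortString: one length byte plus 255 characters. A small sketch (the program name is hypothetical) that checks the arithmetic:

```pascal
program ShortStrSize;  { hypothetical name, illustration only }
begin
  { one length byte plus 255 characters }
  WriteLn(SizeOf(ShortString));        { prints 256 }
  { number of 4-byte moves needed for a maximum-length copy }
  WriteLn(SizeOf(ShortString) div 4);  { prints 64 }
end.
```
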
Wow, thanks a lot for all the info :-)
What disassembler do you use? Is there any nice free one?
I'll try to digest that assembly, since I'm not a "close friend" of it ;-)
Also, that case is IMHO the bad case for AnsiString (i.e. we have to add a reference). I'm more interested in whether there's any overhead when reading from a 'const' string parameter...

Cheers,
Flávio
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel