Re: [fpc-devel] Class field reordering
On Wednesday 18 July 2012 08:19:02 Martin Schreiber wrote:
> Used so that TParams creates tmseparam items instead of TParam:
>
> TCollection:
> - FItemClass
>
Probably can be solved in a forked db.pas

Martin
___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
Re: [fpc-devel] Class field reordering
On Tuesday 17 July 2012 09:40:36 michael.vancann...@wisa.be wrote:
> > Maybe, but what about performance? Another complication is the
> > "updatebuffer" with the "oldvalues".
>
> Thinking about it:
>
> I would allocate the buffer as is, with an integer-sized slot in the
> buffer for every string field. The slot contains an index, pointing
> to a separate array of strings containing N*M*2 strings. (N=number of
> records, M= Number of string fields per record, factor 2 for old values)
>
> Field value = Element [Index [+1 for old value]] in the array.
> where the [Index] is stored in the buffer.
>
> The speed performance penalty of this system should be negligible, since
> you assume all records are in memory anyway.
>
> And: everything can be done without meddling with the internals of TField.
>
Thank you. There are more items in the db.pas list...

But I think first we should concentrate on classes.pas because I really don't want to fork it. Forking db.pas is less problematic, and I would probably prefer it to an endless discussion and to solutions that are, in my eyes, not optimal. With a forked db.pas I can eliminate the many workarounds I already had to implement.

Cracker classes currently needed by MSEide+MSEgui:

Used by MSEide for different tasks (example: ask for ancestor forms and frames/submodules while loading a form/datamodule, recover in case of an error) and for streaming of frames with additional components/widgets:

TComponent:
- FComponents
- FComponentState
- FFreeNotifies
TWriter:
- FPropPath
- FAncestors
TReader:
- FStream
- FLoaded

Used so that TParams creates tmseparam items instead of TParam:

TCollection:
- FItemClass

Used to unify memorystreams/files/pipes/sockets/stringstreams:

THandlestream:
- FHandle
TMemoryStream:
- FCapacity

Why must these fields be private?

Thanks, Martin
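For readers unfamiliar with the term, the general cracker-class idiom referred to in the list above can be sketched roughly as follows. This is a hypothetical illustration, not MSEgui's code: TLibStream and TLibStreamCracker are made-up names, and the trick only works as long as the cracker repeats the fields in exactly the layout the compiler chose for the original class, which is why field reordering matters to it.

```pascal
program crackerdemo;
{$mode objfpc}

type
  // Hypothetical stand-in for a library class. In real code FHandle would
  // be private in another unit and therefore not directly reachable.
  TLibStream = class
  private
    FHandle: LongInt;
  public
    constructor Create(AHandle: LongInt);
  end;

  // Hypothetical cracker: repeats the field declarations in the same order,
  // so that a typecast reads the same offsets as the original class.
  TLibStreamCracker = class
  public
    FHandle: LongInt;
  end;

constructor TLibStream.Create(AHandle: LongInt);
begin
  inherited Create;
  FHandle := AHandle;
end;

var
  s: TLibStream;
begin
  s := TLibStream.Create(42);
  try
    // The cast between unrelated class types is unchecked and makes the
    // compiler emit a warning, hence the switch around it.
    {$warnings off}
    WriteLn('handle via cracker: ', TLibStreamCracker(s).FHandle);
    {$warnings on}
  finally
    s.Free;
  end;
end.
```

If the compiler lays out the two classes differently, the cast silently reads the wrong memory, so the idiom depends on a stable field order.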
Re: [fpc-devel] Cross compiling x86 on amd64, linking to crtbegin.o etc.
On 17.07.2012 15:04, Jonas Maebe wrote:
> On 17 Jul 2012, at 07:15, Sergei Gorelkin wrote:
>
>> I'm afraid this isn't entirely correct. The problems arise when using "-n"
>> in the command line (one example is compiling fpmake for packages
>> directory). In this case any paths from fpc.cfg are ignored. Moreover,
>> compiler silently ignores the absence of crti.o and company. The resulting
>> executable manages to work somehow, but it is mostly by chance.
>
> Then you'd have to pass extra CROSSOPT parameters to the compiler when
> building (it's not that uncommon having to specify extra parameters when
> building a cross-distribution). Unless of course there is a specification
> somewhere in the LSB (or whatever it's called today) that specifies the
> default locations of cross-libraries on multi-arch systems, but we can't
> keep adding code to the compiler to cater to all possible distributions.

I'm facing this problem even when not cross-compiling (and it is in fact unrelated to cross-compiling). The issue is that the path to crti.o is known only to gcc, if it is installed on the system. It isn't related to the paths of other libraries and it isn't known to ld. This path is queried by the fpc install script and written to fpc.cfg. Later it can become invalid if the gcc version is upgraded without reinstalling fpc (this is likely what happens on our test machines, see tests/packages/webtbs/tw14265 failing with "cannot find -lgcc").

I was thinking about querying this path at every ld invocation by running `gcc -print-libgcc-file-name`; likewise it can be queried for cross-compiling with `gcc -m32 -print-libgcc-file-name`. Hopefully doing so won't slow things down much. And probably fpc shouldn't stay silent if it needs gcc files to link and cannot find them.

Anyway, since I'm currently working on the ELF linker (it has reached the state where the packages directory builds :-), I'll almost inevitably collect all linking-related issues.

Regards, Sergei
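The query proposed in the message above could be sketched along these lines. It is only an illustration under assumptions: GccLibDir is a made-up helper name, the sketch relies on RunCommand from FPC's Process unit, and it merely reports the directory gcc names for libgcc rather than showing how the compiler itself would use it.

```pascal
program gcclibdir;
{$mode objfpc}{$H+}

uses
  SysUtils, Process;

// Hypothetical helper: asks gcc for the location of libgcc and returns the
// directory part. Multilib32 adds -m32, as suggested for the x86-on-amd64 case.
function GccLibDir(Multilib32: Boolean; out Dir: string): Boolean;
var
  Captured: string;
begin
  Dir := '';
  Captured := '';
  if Multilib32 then
    Result := RunCommand('gcc', ['-m32', '-print-libgcc-file-name'], Captured)
  else
    Result := RunCommand('gcc', ['-print-libgcc-file-name'], Captured);
  if Result then
    Dir := ExtractFilePath(Trim(Captured));
  // Treat an empty answer as a failure instead of continuing silently.
  Result := Result and (Dir <> '');
end;

var
  Dir: string;
begin
  if GccLibDir(False, Dir) then
    WriteLn('native gcc library directory: ', Dir)
  else
    WriteLn('gcc query failed; a linker driver should report this, not ignore it');
  if GccLibDir(True, Dir) then
    WriteLn('32-bit gcc library directory: ', Dir)
  else
    WriteLn('no usable answer for -m32');
end.
```

Running the query on every link costs one extra process start, which is the slowdown the message worries about; caching the result per compiler run would be one way to limit it.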
Re: [fpc-devel] Class field reordering
Hi,

On Tue, 2012-07-17 at 08:22 +0200, Skybuck Flying wrote:
> > I also wonder how much of an optimization it actually is ? Maybe 0.01%
> > more performance ?
>
> "
> 1) as mentioned in the original mail, the current transformation is
> implemented for saving memory, not for improving performance
> "
>
> This wasn't clear, it only mentions gaps. What kind of gaps ?

Gaps between fields of an object instance due to alignment. Typically, loading data from an unaligned address, i.e. an address that is not evenly divisible by the field size, is slower than loading from an aligned one. Also, some CPU architectures even give you an exception if you actually try.

E.g. for the following object

type
  test = class
    b1 : byte;
    d : double;
    b2 : byte;
    q : qword;
  end;

due to the above hardware limitations an instance will look as follows in memory (first column indicates offset, disregarding any additional internal data):

 0    : b1
 1-7  :
 8-15 : d
16    : b2
17-23 :
24-31 : q

So for storing 18 bytes of usable data, you use up 32 bytes in memory, i.e. a waste of 44%. Now imagine your program uses thousands of these records.

> Apparently you
> meant to minimize memory size, the opposite could also have been meant in the
> sense of optimizations to make fields fall on memory boundaries for perhaps
> increased fetch speed or something else.

You always want to make fields fall on memory boundaries (i.e. align them) except if you are either really scarce on memory, need a specific layout for i/o purposes or have another reason to do so (e.g. extra padding in multi-threaded code when you want to avoid cache line contention). But then hopefully you know what you need to take care of and read the manual.

There is a way to disable this reordering on a per-class basis.

> Later performance optimization possibilities for the future are mentioned as
> well.

Given that software on reasonably modern hardware is very often memory bound, a decrease in memory footprint often translates into real performance gains.
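The padded layout listed above can be inspected on a given target with a small test program. The following is a rough sketch, not part of the original mail: TDeclared repeats the class from the example, TSorted is a hand-sorted variant added purely for comparison, and OffsetOf is a made-up helper. The printed offsets include the hidden VMT pointer at the start of every instance and depend on the target CPU, packing settings and optimization options, so no particular output is claimed here.

```pascal
program fieldlayout;
{$mode objfpc}

type
  // Field order as declared in the example above.
  TDeclared = class
    b1: byte;
    d: double;
    b2: byte;
    q: qword;
  end;

  // Same fields, sorted by hand from largest to smallest (for comparison).
  TSorted = class
    d: double;
    q: qword;
    b1: byte;
    b2: byte;
  end;

// Hypothetical helper: offset of a field relative to the start of the instance.
function OffsetOf(Instance: TObject; Field: Pointer): PtrUInt;
begin
  Result := PtrUInt(Field) - PtrUInt(Instance);
end;

var
  a: TDeclared;
  s: TSorted;
begin
  a := TDeclared.Create;
  s := TSorted.Create;
  try
    WriteLn('TDeclared: b1=', OffsetOf(a, @a.b1), ' d=', OffsetOf(a, @a.d),
            ' b2=', OffsetOf(a, @a.b2), ' q=', OffsetOf(a, @a.q),
            ' InstanceSize=', TDeclared.InstanceSize);
    WriteLn('TSorted:   d=', OffsetOf(s, @s.d), ' q=', OffsetOf(s, @s.q),
            ' b1=', OffsetOf(s, @s.b1), ' b2=', OffsetOf(s, @s.b2),
            ' InstanceSize=', TSorted.InstanceSize);
  finally
    s.Free;
    a.Free;
  end;
end.
```

Comparing the two InstanceSize values shows how much of the footprint is padding caused by declaration order alone.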
In the above case, if this "optimization" is applied, the object instance looks as follows (for example, I do not know the exact algorithm):

 0-7  : d
 8-15 : q
16    : b1
17    : b2

I.e. the object now uses exactly 18 bytes of memory.

> "
> 2) if it was done for performance reasons, some people already got up to 34%
> extra performance by doing exactly that:
> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.34.4009 (download
> the cached version of the paper, the original link no longer works)
> "
>
> I scanned over this document and it seems to mention "profiling
> information", it also mentions "compilers which can then use this profiling
> information to re-arrange fields".
>
> To me it seems this optimization idea belongs in the realm of "profilers".
> It should be easy for a programmer to use such a profiling tool and make the
> necessary changes him/herself instead of complexifying the compiler.
>
> [.. snipped the rest...]

Imo for a static compiler like fpc, re-arranging fields for pure performance reasons without reasonably accurate profiling information is indeed pure guesswork. I do not see a problem with automating this process. However, as mentioned above, decreasing memory footprint often also increases performance.

> > I rarely inspect the binary equivalent of a class instance, so your
> > supposed optimization is probably not a big deal, for records that would
> > be a different matter since these are used in all kinds of api's and
> > input/output situations.
>
> "
> As mentioned in the original mail, the transformation would only be applied
> to classes.
> "
>
> How is a record not an abstraction and a class is an abstraction, that's
> kinda weird/inconsistent ?!

Because unfortunately people are already (mis-)using records by blockwriting/reading them without knowing that this is not portable at all. Actually such things already break e.g. when moving from 32 to 64 bit processors, or from one cpu architecture to another with different alignment rules, and so on. When 64 bit was new, there were many questions/issues about exactly that - on this list too.

It looks like a purely pragmatic decision. (And maybe sometime somewhere the Borland people defined it that way for records).

> >>> It's already bad that Delphi adds invisible fields to classes so they
> >>> cannot be simply dumped to disk... (virtual method table pointers ?)
> >>> this would make it even worse.
> >>
> >> If you want to program at an assembler level of abstraction, don't use
> >> high level language features.
> >
> > I see no reason why a high level language could not be used to produce
> > binary instructions and or files/data.
>
> "
> It can be used for that, as long as you don't use high level abstractions.
> The whole point of abstractions is to get rid of any guarantees as far as
> implementation is concerned, in order to increase portability, programmer
> productivity and compiler optim
Re: [fpc-devel] Cross compiling x86 on amd64, linking to crtbegin.o etc.
On 17 Jul 2012, at 07:15, Sergei Gorelkin wrote:
> I'm afraid this isn't entirely correct. The problems arise when using "-n" in
> the command line (one example is compiling fpmake for packages directory). In
> this case any paths from fpc.cfg are ignored. Moreover, compiler silently
> ignores the absence of crti.o and company. The resulting executable manages
> to work somehow, but it is mostly by chance.

Then you'd have to pass extra CROSSOPT parameters to the compiler when building (it's not that uncommon having to specify extra parameters when building a cross-distribution). Unless of course there is a specification somewhere in the LSB (or whatever it's called today) that specifies the default locations of cross-libraries on multi-arch systems, but we can't keep adding code to the compiler to cater to all possible distributions.

Jonas
Re: [fpc-devel] Class field reordering
I don't think this is a good idea. For example while debugging and looking at raw memory this would lead to confusion.

By knowing the order of the fields, you still don't know their exact offsets. If you want to know their address, print @classinstance.fieldname

Yes but I do know the order of the fields which does help make some sense of it. With your suggested optimizations it would become much more confusing/mixed/shuffled.

I also find it slightly strange how there is now an even bigger disconnect between records and classes.

"
The whole point of classes is to offer abstraction. Again: if you don't want abstraction, don't use data structures that offer an abstraction.
"

I have a different view on programming languages and pascal. It's a tool to help generate code. The abstraction is to prevent tying to a single instruction set/cpu/computer, but it's always nice if the result/instructions/data field can be compared to the high level code.

I also wonder how much of an optimization it actually is ? Maybe 0.01% more performance ?

"
1) as mentioned in the original mail, the current transformation is implemented for saving memory, not for improving performance
"

This wasn't clear, it only mentions gaps. What kind of gaps ? Apparently you meant to minimize memory size, the opposite could also have been meant in the sense of optimizations to make fields fall on memory boundaries for perhaps increased fetch speed or something else.

Later performance optimization possibilities for the future are mentioned as well.

"
2) if it was done for performance reasons, some people already got up to 34% extra performance by doing exactly that: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.34.4009 (download the cached version of the paper, the original link no longer works)
"

I scanned over this document and it seems to mention "profiling information", it also mentions "compilers which can then use this profiling information to re-arrange fields".

To me it seems this optimization idea belongs in the realm of "profilers". It should be easy for a programmer to use such a profiling tool and make the necessary changes him/herself instead of complexifying the compiler.

Perhaps someday these kinds of "complex optimization tricks" might backfire as well, because of different runs of the program or so, though usually there is some redline in a program.

How do you envision this profiling to be done with the free pascal compiler or did you think about "some kind of static offline optimization" ?

If the latter how about this:

"
setup code: access field 2. access field 2
red loop: forever access field1 then field2.
"

Without actually running the code and profiling it or analyzing the for loop, the static offline optimization trick would believe field2 is accessed the most while field1 is actually accessed the most first. So could lead to wrong optimization results.

I rarely inspect the binary equivalent of a class instance, so your supposed optimization is probably not a big deal, for records that would be a different matter since these are used in all kinds of api's and input/output situations.

"
As mentioned in the original mail, the transformation would only be applied to classes.
"

How is a record not an abstraction and a class is an abstraction, that's kinda weird/inconsistent ?!

It's already bad that Delphi adds invisible fields to classes so they cannot be simply dumped to disk... (virtual method table pointers ?) this would make it even worse.

If you want to program at an assembler level of abstraction, don't use high level language features.

I see no reason why a high level language could not be used to produce binary instructions and or files/data.

"
It can be used for that, as long as you don't use high level abstractions. The whole point of abstractions is to get rid of any guarantees as far as implementation is concerned, in order to increase portability, programmer productivity and compiler optimization opportunities.
"

How about instead extending the pascal language description and specifying that the order of the fields in the class and records must be the same in binary as well. This seems nice and constant and might allow some other functionalities in the future.

That is not to say that this supposed optimization could be done and later then removed if this order extension is introduced.

Finally I do see some merit for free pascal compiler or any other compiler to generate some helpful (debugging?) / profiling information (?) so that a profiler can report back to the user what source code fields are accessed and in what order... and how often to give some hints or suggestions to the programmer how to make the optimizations him/herself.

Some might even take fun in optimization code, but it can also be boring work, then again, after having done it a few times, perhaps programmers learn from i
Re: [fpc-devel] Class field reordering
On Tue, 17 Jul 2012, Martin Schreiber wrote:

> On Monday 16 July 2012 17:25:58 michael.vancann...@wisa.be wrote:
>> On Mon, 16 Jul 2012, Martin Schreiber wrote:
>>> On Monday 16 July 2012 16:50:06 michael.vancann...@wisa.be wrote:
>>>> Well, from your code adding the following to the protected section:
>>>>
>>>>   Property ValueBuffer : Pointer Read FValueBuffer;
>>>>   Property Validating : Boolean Read FValidating ;
>>>>
>>>> Should be enough to remove the need for the TFieldCracker ?
>>>
>>> Unfortunately no:
>>> "
>>> procedure tmsebufdataset.checkfreebuffer(const afield: tfield);
>>> begin
>>>  {$ifdef FPC}{$warnings off}{$endif}
>>>  with tfieldcracker(afield) do begin
>>>  {$ifdef FPC}{$warnings on}{$endif}
>>>   if foffset and 2 <> 0 then begin
>>>    freemem(fvaluebuffer);
>>>    fvaluebuffer:= nil;
>>>   end;
>>>  end;
>>> end;
>>> "
>>
>> If I understand correctly, you have in the record buffer just the pointer
>> of the (wide)string instead of the actual string data ?
>
> Correct. The UnicodeString pointer. That is not directly related to the
> procedure above, BTW. This one is to allow changing field values in
> OnValidate IIRC.
>
>> If so, why then didn't you implement the string fields as blobs are
>> implemented in SQLDB (it is basically the same problem), in that case you
>> would not need the internal fields of TField at all ?
>
> Maybe, but what about performance? Another complication is the
> "updatebuffer" with the "oldvalues".

Thinking about it:

I would allocate the buffer as is, with an integer-sized slot in the buffer for every string field. The slot contains an index, pointing to a separate array of strings containing N*M*2 strings. (N=number of records, M= Number of string fields per record, factor 2 for old values)

Field value = Element [Index [+1 for old value]] in the array.
where the [Index] is stored in the buffer.

The speed performance penalty of this system should be negligible, since you assume all records are in memory anyway.

And: everything can be done without meddling with the internals of TField.

Michael.
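The index-slot scheme described in the message above might look roughly like this in code. It is a minimal sketch under assumptions, not code from db.pas or MSEgui: the record and field counts are arbitrary, and TRecBuf, StringStore, InitBuffers, GetValue and SetValue are names invented for the illustration.

```pascal
program stringslots;
{$mode objfpc}{$H+}

const
  RecordCount  = 3;  // N, arbitrary value for the sketch
  StringFields = 2;  // M, arbitrary value for the sketch

type
  // Record buffer: one integer-sized slot per string field. The slot holds
  // an index into StringStore, not a string pointer.
  TRecBuf = record
    Slots: array[0..StringFields - 1] of LongInt;
  end;

var
  StringStore: array of UnicodeString;           // N*M*2 elements
  Buffers: array[0..RecordCount - 1] of TRecBuf;

procedure InitBuffers;
var
  r, f: Integer;
begin
  SetLength(StringStore, RecordCount * StringFields * 2);
  for r := 0 to RecordCount - 1 do
    for f := 0 to StringFields - 1 do
      // Element [Index] is the current value, [Index + 1] the old value.
      Buffers[r].Slots[f] := (r * StringFields + f) * 2;
end;

function GetValue(r, f: Integer; Old: Boolean): UnicodeString;
begin
  Result := StringStore[Buffers[r].Slots[f] + Ord(Old)];
end;

procedure SetValue(r, f: Integer; const Value: UnicodeString);
var
  idx: LongInt;
begin
  idx := Buffers[r].Slots[f];
  StringStore[idx + 1] := StringStore[idx];  // keep previous value as old value
  StringStore[idx] := Value;
end;

begin
  InitBuffers;
  SetValue(1, 0, 'first');
  SetValue(1, 0, 'second');
  WriteLn('current: ', GetValue(1, 0, False));
  WriteLn('old:     ', GetValue(1, 0, True));
end.
```

Because the buffer only stores integers, it can be copied or freed as plain memory, while the reference-counted strings stay in an ordinary managed array; the cost per access is one extra array lookup.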