Re: [fpc-devel] Class field reordering

2012-07-17 Thread Martin Schreiber
On Wednesday 18 July 2012 08:19:02 Martin Schreiber wrote:
> Used so that TParams creates tmseparam items instead of TParam:
>
> TCollection:
> - FItemClass
>
This can probably be solved in a forked db.pas.

Martin
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Class field reordering

2012-07-17 Thread Martin Schreiber
On Tuesday 17 July 2012 09:40:36 michael.vancann...@wisa.be wrote:
> > Maybe, but what about performance? Another complication is the
> > "updatebuffer" with the "oldvalues".
>
> Thinking about it:
>
> I would allocate the buffer as is, with for all string fields an
> integer-sized slot in the buffer. The slot contains an index, pointing
> to a separate array of strings containing  N*M*2 strings. (N=number of
> records, M= Number of string fields per record, factor 2 for old values)
>
> Field value =  Element [Index  [+1 for old value]] in the array.
> where the [Index] is stored in the buffer.
>
> The speed performance penalty of this system should be negligible, since
> you assume all records are in memory anyway.
>
> And: everything can be done without meddling with the internals of TField.
>
Thank you. There are more items in the db.pas list...
But I think we should concentrate on classes.pas first, because I really
don't want to fork it. Forking db.pas is less problematic, and I would
probably prefer it to an endless discussion and solutions that are, in my
eyes, not optimal. With a forked db.pas I can eliminate the many workarounds
I already had to implement.

Cracker classes currently needed by MSEide+MSEgui:

Used by MSEide for different tasks (example: ask for ancestor forms and 
frames/submodules while loading a form/datamodule, recover in case of an 
error) and for streaming of frames with additional components/widgets:

TComponent:
- FComponents
- FComponentState
- FFreeNotifies

TWriter:
- FPropPath
- FAncestors

TReader:
- FStream
- FLoaded

Used so that TParams creates tmseparam items instead of TParam:

TCollection:
- FItemClass

Used to unify memorystreams/files/pipes/sockets/stringstreams:

THandleStream:
- FHandle

TMemoryStream:
- FCapacity

Why must these fields be private?
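
For readers unfamiliar with the term, the cracker-class technique referred to above can be sketched roughly as follows. This is a minimal illustration, not MSEgui code: TLibraryThing and TLibraryThingCracker are hypothetical names, and the trick only works as long as both classes end up with the same field layout, which is exactly what compiler-side field reordering puts at risk.

```pascal
program CrackerSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical stand-in for a library class whose field is inaccessible.
  TLibraryThing = class
  strict private
    FHandle: LongInt;
  public
    constructor Create(AHandle: LongInt);
  end;

  // Cracker: repeats the field declarations in the same order, but public.
  // Only valid while the compiler lays out both classes identically.
  TLibraryThingCracker = class
  public
    FHandle: LongInt;
  end;

constructor TLibraryThing.Create(AHandle: LongInt);
begin
  inherited Create;
  FHandle := AHandle;
end;

var
  Thing: TLibraryThing;
begin
  Thing := TLibraryThing.Create(42);
  try
    // Unchecked cast via Pointer: the instance is reinterpreted as the
    // cracker class, so the hidden field is read by its offset alone.
    WriteLn('FHandle read through cracker: ',
      TLibraryThingCracker(Pointer(Thing)).FHandle);
  finally
    Thing.Free;
  end;
end.
```

With protected or public access to the fields listed above, none of this offset-dependent casting would be necessary.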

Thanks, Martin


Re: [fpc-devel] Cross compiling x86 on amd64, linking to crtbegin.o etc.

2012-07-17 Thread Sergei Gorelkin

On 17.07.2012 15:04, Jonas Maebe wrote:
>
> On 17 Jul 2012, at 07:15, Sergei Gorelkin wrote:
>
> > I'm afraid this isn't entirely correct. The problems arise when using "-n" in
> > the command line (one example is compiling fpmake for packages directory). In
> > this case any paths from fpc.cfg are ignored. Moreover, compiler silently
> > ignores the absence of crti.o and company. The resulting executable manages
> > to work somehow, but it is mostly by chance.
>
> Then you'd have to pass extra CROSSOPT parameters to the compiler when building
> (it's not that uncommon having to specify extra parameters when building a
> cross-distribution). Unless of course there is a specification somewhere in the
> LSB (or whatever it's called today) that specifies the default locations of
> cross-libraries on multi-arch systems, but we can't keep adding code to the
> compiler to cater to all possible distributions.


I'm facing this problem even when not cross-compiling (and it is in fact
unrelated to cross-compiling).
The issue is that the path to crti.o is known only to gcc, if it is installed
on the system. It isn't related to the paths of other libraries and it isn't
known to ld. This path is queried by the fpc install script and written to
fpc.cfg. Later it can become invalid if gcc is upgraded without reinstalling
fpc (this is likely what happens on our test machines, see
tests/packages/webtbs/tw14265 failing with "cannot find -lgcc").


I was thinking about querying this path at every ld invocation by running
`gcc -print-libgcc-file-name`; likewise, it can be queried for cross-compiling
with `gcc -m32 -print-libgcc-file-name`. Hopefully doing so won't slow things
down much.

And fpc probably shouldn't stay silent if it needs gcc files to link and
cannot find them.
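
As a rough illustration of the idea above (not compiler code), such a query could look like the sketch below. It assumes the RunCommand helper from the Process unit of a reasonably recent FPC release and a gcc that can be found via PATH; QueryLibgccDir and the error text are made-up names for this example.

```pascal
program FindLibgccDir;
{$mode objfpc}{$H+}

uses
  SysUtils, Process;

// Hypothetical helper: returns the directory gcc reports for libgcc,
// or '' if gcc could not be run or printed no path.
// ExtraArg is an optional flag such as '-m32' for the 32-bit libraries.
function QueryLibgccDir(const ExtraArg: string): string;
var
  GccOutput: string;
  Ok: Boolean;
begin
  Result := '';
  if ExtraArg <> '' then
    Ok := RunCommand('gcc', [ExtraArg, '-print-libgcc-file-name'], GccOutput)
  else
    Ok := RunCommand('gcc', ['-print-libgcc-file-name'], GccOutput);
  if Ok then
    // gcc prints the full file name; keep only the directory part.
    Result := ExtractFilePath(Trim(GccOutput));
end;

var
  Dir: string;
begin
  Dir := QueryLibgccDir('');
  if Dir = '' then
  begin
    // Fail loudly instead of linking without the gcc support files.
    WriteLn(StdErr, 'Error: cannot locate gcc support files needed for linking');
    Halt(1);
  end;
  WriteLn('libgcc directory: ', Dir);
end.
```

Passing '-m32' as ExtraArg corresponds to the cross-compiling query mentioned above; the cost is one extra process start per link.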

Anyway, since I'm currently working on the ELF linker (it has reached the state
where the packages directory builds :-), I'll almost inevitably collect all
linking-related issues.


Regards,
Sergei


Re: [fpc-devel] Class field reordering

2012-07-17 Thread Thomas Schatzl
Hi,

On Tue, 2012-07-17 at 08:22 +0200, Skybuck Flying wrote:
> > I also wonder how much of an optimization it actually is ? Maybe 0.01% 
> > more performance ?
> 
> "
> 1) as mentioned in the original mail, the current transformation is 
> implemented for saving memory, not for improving performance
> "
> 
> This wasn't clear, it only mentions gaps. What kind of gaps ? 

Gaps between fields of an object instance due to alignment. Typically,
loading data from an unaligned address, i.e. an address that is not evenly
divisible by the field size, is slower than loading from an aligned one.
Some CPU architectures even raise an exception if you actually try.

E.g. for the following object

type
  test = class
    b1 : byte;
    d : double;
    b2 : byte;
    q : qword;
  end;

due to above hardware limitations an instance will look as follows in
memory (first column indicates offset, disregarding any additional
internal data):

0    : b1
1-7  : (padding)
8-15 : d
16   : b2
17-23: (padding)
24-31: q

So for storing 18 bytes of usable data, you use up 32 bytes in memory,
i.e. 14 of the 32 bytes (about 44%) are wasted.
Now imagine your program uses thousands of these instances.

> Apparently you
> meant to minimize memory size, the opposite could also have been meant in the
> sense of optimizations to make fields fall on memory boundaries for perhaps
> increased fetch speed or something else.

You always want to make fields fall on memory boundaries (i.e. align
them) except if you are either really scarce on memory, need a specific
layout for i/o purposes or have another reason to do so (e.g. extra
padding in multi-threaded code when you want to avoid cache line
contention).

But then hopefully you know what you need to take care of and read the
manual. There is a way to disable this reordering on a per-class basis.

> Later performance optimization possibilities for the future are mentioned as 
> well.

Given that software on reasonably modern hardware is very often memory
bound, a decrease in memory footprint often translates into real
performance gains.

In the above case, if this "optimization" is applied, the object
instance looks as follows (this is only an example, I do not know the
exact algorithm):

0-7  : d
8-15 : q
16   : b1
17   : b2

I.e. the object now uses exactly 18 bytes of memory.
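
The two layouts can be checked with a small test program. The sketch below is only an approximation of the class case: it uses records (a class instance also carries internal data such as the VMT pointer), forces 8-byte field alignment with {$PACKRECORDS 8}, and the type names are made up for this example. Note that SizeOf for the reordered record may be rounded up beyond 18 because of padding at the end of the record.

```pascal
program LayoutSketch;
{$mode objfpc}
{$PACKRECORDS 8}  // assumption: align each field to min(field size, 8)

type
  // Fields in declaration order, as in the example above.
  TDeclared = record
    b1: Byte;
    d: Double;
    b2: Byte;
    q: QWord;
  end;

  // Same fields, largest first, as a reordering pass might arrange them.
  TReordered = record
    d: Double;
    q: QWord;
    b1: Byte;
    b2: Byte;
  end;

var
  a: TDeclared;
  r: TReordered;
begin
  // Offset of a field = address of the field minus address of the record.
  WriteLn('declared : b1@', PtrUInt(@a.b1) - PtrUInt(@a),
          ' d@', PtrUInt(@a.d) - PtrUInt(@a),
          ' b2@', PtrUInt(@a.b2) - PtrUInt(@a),
          ' q@', PtrUInt(@a.q) - PtrUInt(@a),
          ' SizeOf=', SizeOf(a));
  WriteLn('reordered: d@', PtrUInt(@r.d) - PtrUInt(@r),
          ' q@', PtrUInt(@r.q) - PtrUInt(@r),
          ' b1@', PtrUInt(@r.b1) - PtrUInt(@r),
          ' b2@', PtrUInt(@r.b2) - PtrUInt(@r),
          ' SizeOf=', SizeOf(r));
end.
```

The exact numbers depend on the target and the alignment settings, so the printed values are the thing to look at rather than any figure quoted here.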


> "
> 2) if it was done for performance reasons, some people already got up to 34% 
> extra performance by doing exactly that: 
> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.34.4009 (download 
> the cached version of the paper, the original link no longer works)
> "
> 
> I scanned over this document and it seems to mention "profiling 
> information", it also mentions "compilers which can then use this profiling 
> information to re-arrange fields".
> 
> To me it seems this optimization idea belongs in the realm of "profilers".
> It should be easy for a programmer to use such a profiling tool and make the
> necessary changes him/herself instead of complexifying the compiler.
>
> [.. snipped the rest...]

Imo for a static compiler like fpc, re-arranging fields for pure
performance reasons without reasonably accurate profiling information is
indeed pure guesswork.
I do not see a problem with automating this process.

However as mentioned above, decreasing memory footprint often also
increases performance.

> > I rarely inspect the binary equivalent of a class instance, so your
> > supposed optimization is probably not a big deal, for records that would
> > be a different matter since these are used in all kinds of api's and
> > input/output situations.
> 
> "
> As mentioned in the original mail, the transformation would only be applied 
> to classes.
> "
> 
> How is a record not an abstraction and a class is an abstraction, that's 
> kinda weird/inconsistent ?!

Because unfortunately people are already (mis-)using records by
blockwriting/reading them without knowing that this is not portable at
all.
Actually such things already break e.g. when moving from 32 to 64 bit
processors, or from one cpu architecture to another with different
alignment rules, and so on. When 64 bit was new, there have been many
questions/issues about exactly that - on this list too.

It looks like a purely pragmatic decision. (And maybe sometime somewhere
the Borland people defined it that way for records).

> >>> It's already bad that Delphi adds invisible fields to classes so they 
> >>> cannot be simply dumped to disk... (virtual method table pointers ?) 
> >>> this would make it even worse.
> >>
> >> If you want to program at an assembler level of abstraction, don't use 
> >> high level language features.
> >
> > I see no reason why a high level language could not be used to produce 
> > binary instructions and or files/data.
> 
> "
> It can be used for that, as long as you don't use high level abstractions. 
> The whole point of abstractions is to get rid of any guarantees as far as
> implementation is concerned, in order to increase portability, programmer 
> productivity and compiler optim

Re: [fpc-devel] Cross compiling x86 on amd64, linking to crtbegin.o etc.

2012-07-17 Thread Jonas Maebe

On 17 Jul 2012, at 07:15, Sergei Gorelkin wrote:

> I'm afraid this isn't entirely correct. The problems arise when using "-n" in
> the command line (one example is compiling fpmake for packages directory). In
> this case any paths from fpc.cfg are ignored. Moreover, compiler silently
> ignores the absence of crti.o and company. The resulting executable manages
> to work somehow, but it is mostly by chance.

Then you'd have to pass extra CROSSOPT parameters to the compiler when building 
(it's not that uncommon having to specify extra parameters when building a 
cross-distribution). Unless of course there is a specification somewhere in the 
LSB (or whatever it's called today) that specifies the default locations of 
cross-libraries on multi-arch systems, but we can't keep adding code to the
compiler to cater to all possible distributions.


Jonas


Re: [fpc-devel] Class field reordering

2012-07-17 Thread Skybuck Flying

I don't think this is a good idea.

For example while debugging and looking at the memory in raw this would 
lead to confusion.


By knowing the order of the fields, you still don't know their exact 
offsets. If you want to know their address, print 
@classinstance.fieldname


Yes but I do know the order of the fields which does help make some sense 
of it. With your suggested optimizations it would become much more 
confusing/mixed/shuffled.


I also find it slightly strange how there is now an even bigger disconnect 
between records and classes.


"
The whole point of classes is to offer abstraction. Again: if you don't want 
abstraction, don't use data structures that offer an abstraction.

"

I have a different view on programming languages and pascal. It's a tool to
help generate code. The abstraction is to prevent tying to a single
instruction set/cpu/computer, but it's always nice if the
result/instructions/data field can be compared to the high level code.


> I also wonder how much of an optimization it actually is ? Maybe 0.01%
> more performance ?


"
1) as mentioned in the original mail, the current transformation is 
implemented for saving memory, not for improving performance

"

This wasn't clear, it only mentions gaps. What kind of gaps ? Apparently you
meant to minimize memory size, the opposite could also have been meant in the
sense of optimizations to make fields fall on memory boundaries for perhaps
increased fetch speed or something else.


Later performance optimization possibilities for the future are mentioned as 
well.


"
2) if it was done for performance reasons, some people already got up to 34% 
extra performance by doing exactly that: 
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.34.4009 (download 
the cached version of the paper, the original link no longer works)

"

I scanned over this document and it seems to mention "profiling 
information", it also mentions "compilers which can then use this profiling 
information to re-arrange fields".


To me it seems this optimization idea belongs in the realm of "profilers".
It should be easy for a programmer to use such a profiling tool and make the
necessary changes him/herself instead of complexifying the compiler.


Perhaps someday these kinds of "complex optimization tricks" might backfire 
as well, because of different runs of the program or so, though usually 
there is some redline in a program.


How do you envision this profiling to be done with the free pascal compiler 
or did you think about "some kind of static offline optimization" ?


If the latter, how about this:

"
setup code:

access field 2.

access field 2

red loop:

forever access field1 then field2.
"

Without actually running the code and profiling it or analyzing the for
loop, the static offline optimization trick would believe field2 is accessed
the most, while in the loop it is actually field1 that is accessed first.


So this could lead to wrong optimization results.

> I rarely inspect the binary equivalent of a class instance, so your
> supposed optimization is probably not a big deal, for records that would
> be a different matter since these are used in all kinds of api's and
> input/output situations.


"
As mentioned in the original mail, the transformation would only be applied 
to classes.

"

How is a record not an abstraction and a class is an abstraction, that's 
kinda weird/inconsistent ?!


>>> It's already bad that Delphi adds invisible fields to classes so they
>>> cannot be simply dumped to disk... (virtual method table pointers ?)
>>> this would make it even worse.
>>
>> If you want to program at an assembler level of abstraction, don't use
>> high level language features.
>
> I see no reason why a high level language could not be used to produce
> binary instructions and or files/data.


"
It can be used for that, as long as you don't use high level abstractions. 
The whole point of abstractions is to get rid of any guarantees as far as
implementation is concerned, in order to increase portability, programmer 
productivity and compiler optimization opportunities.

"

How about instead extending the pascal language description and specifying
that the order of the fields in the class and records must be the same in
binary as well. This seems nice and constant and might allow some other
functionalities in the future.


That is not to say that this supposed optimization could not be done now and
then removed later if this order extension is introduced.


Finally I do see some merit for the free pascal compiler or any other compiler
to generate some helpful (debugging?) / profiling information (?) so that a
profiler can report back to the user what source code fields are accessed,
in what order and how often, to give some hints or suggestions to the
programmer on how to make the optimizations him/herself.


Some might even enjoy optimizing code, but it can also be boring
work, then again, after having done it a few times, perhaps programmers 
learn from i

Re: [fpc-devel] Class field reordering

2012-07-17 Thread michael . vancanneyt



On Tue, 17 Jul 2012, Martin Schreiber wrote:

> On Monday 16 July 2012 17:25:58 michael.vancann...@wisa.be wrote:
> > On Mon, 16 Jul 2012, Martin Schreiber wrote:
> > > On Monday 16 July 2012 16:50:06 michael.vancann...@wisa.be wrote:
> > > > Well, from your code adding the following to the protected section:
> > > >
> > > >    Property ValueBuffer : Pointer Read FValueBuffer;
> > > >    Property Validating : Boolean Read FValidating ;
> > > >
> > > > Should be enough to remove the need for the TFieldCracker ?
> > >
> > > Unfortunately no:
> > > "
> > > procedure tmsebufdataset.checkfreebuffer(const afield: tfield);
> > > begin
> > > {$ifdef FPC}{$warnings off}{$endif}
> > > with tfieldcracker(afield) do begin
> > > {$ifdef FPC}{$warnings on}{$endif}
> > >  if foffset and 2 <> 0 then begin
> > >   freemem(fvaluebuffer);
> > >   fvaluebuffer:= nil;
> > >  end;
> > > end;
> > > end;
> > > "
> >
> > If I understand correctly, you have in the record buffer just the pointer
> > of the (wide)string instead of the actual string data ?
>
> Correct. The UnicodeString pointer. That has not directly to do with the
> procedure above BTW. This one is to allow changing field values in OnValidate
> IIRC.
>
> > If so, why then didn't you implement the string fields as blobs are
> > implemented in SQLDB (it is basically the same problem), in that case you
> > would not need the internal fields of TField at all ?
>
> Maybe, but what about performance? Another complication is the "updatebuffer"
> with the "oldvalues".


Thinking about it:

I would allocate the buffer as is, with an integer-sized slot in the buffer
for each string field. The slot contains an index pointing to a separate
array of strings containing N*M*2 strings (N = number of records, M = number
of string fields per record, factor 2 for old values).

Field value = Element[Index [+1 for old value]] in the array,
where the Index is stored in the buffer.

The speed penalty of this system should be negligible, since you
assume all records are in memory anyway.

And: everything can be done without meddling with the internals of TField.
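
A minimal sketch of this scheme, under some assumptions: TRecBuf, StrPool, GetStr and SetStr are made-up names, the record and field counts are fixed constants, and saving the old value on every write is a simplification chosen for this example rather than what a real update buffer would do.

```pascal
program StringSlotSketch;
{$mode objfpc}{$H+}

const
  RecCount = 2;       // N: number of records held in memory
  StrFieldCount = 2;  // M: string fields per record

type
  // Hypothetical record buffer: one integer-sized slot per string field.
  TRecBuf = record
    Id: LongInt;
    StrSlot: array[0..StrFieldCount - 1] of LongInt;
  end;

var
  Bufs: array[0..RecCount - 1] of TRecBuf;
  StrPool: array of string;  // N*M*2 entries: current value, then old value

function GetStr(const Buf: TRecBuf; FieldNo: Integer; Old: Boolean): string;
begin
  // Field value = Element[Index [+1 for old value]]
  Result := StrPool[Buf.StrSlot[FieldNo] + Ord(Old)];
end;

procedure SetStr(const Buf: TRecBuf; FieldNo: Integer; const Value: string);
var
  Idx: LongInt;
begin
  Idx := Buf.StrSlot[FieldNo];
  StrPool[Idx + 1] := StrPool[Idx];  // assumption: keep the previous value
  StrPool[Idx] := Value;
end;

var
  r, f: Integer;
begin
  SetLength(StrPool, RecCount * StrFieldCount * 2);
  // Give every string field of every record its own pair of pool entries.
  for r := 0 to RecCount - 1 do
  begin
    Bufs[r].Id := r;
    for f := 0 to StrFieldCount - 1 do
      Bufs[r].StrSlot[f] := (r * StrFieldCount + f) * 2;
  end;

  SetStr(Bufs[1], 0, 'first');
  SetStr(Bufs[1], 0, 'second');
  WriteLn('current: ', GetStr(Bufs[1], 0, False));
  WriteLn('old    : ', GetStr(Bufs[1], 0, True));
end.
```

The record buffer itself only ever holds integers, so it can be copied or compared as raw memory while the strings stay reference-counted in the pool.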

Michael.