Re: [Bug 25480] - Experimental performance improvements.

2004-01-14 Thread Peter B. West
Finn Bock wrote:
[Peter B. West]

Alt-design (trying the hyphen for a while) takes different approaches 
at different times.  While building the subtree of any node, all of 
the properties are maintained in a HashMap, along with a BitSet of 
specified properties.

When the subtree construction is complete, the HashMap and BitSet are 
used to build the sparse array of only the relevant *resolved* 
property values 


If I understand the Alt-design code correctly, the function calls, like 
from-parent(), are resolved, but percentages are not resolved at this 
point and are still saved as an IndirectValue, right? The percentages 
will be resolved at a later stage?
Basically, yes.  There are complications with from-parent() and 
from-nearest-specified-value() when used with shorthands.  These have 
led to the creation of the FromParent and FromNearestSpecified 
pseudo-types.

(not properties - one of the differences with HEAD) 


I think you have mentioned this before, but is it such a big difference? 
HEAD wraps its datavalues in a very thin Property wrapper, but otherwise 
there is a one-to-one binding between a HEAD Property and its value.
I freely admit that when I started working with properties, I had only 
the fuzziest notion of the way they were processed in the original code. 
 I'm not a lot better informed now.  However, the idea of expressing 
properties in terms of data values still seems to me to be strange. 
Even though individual properties may *eventually* resolve to a 
particular basic type, the road there can be very complicated.  It 
seemed to me that I should be able to process property values into a 
range of possible data types (e.g. enumerations and lengths), postponing 
the resolution into a particular, say, length, as long as possible.

The other issue was that some types (enumerations, strings) resolve 
eventually into very different types depending on the property on which 
they are expressed.  The Rec. (and consequently the parser) allow a 
multiplicity of different data types in the expressions on many 
properties.  It just seemed cleaner to me to separate properties (which 
have certain static characteristics) out from data types.  That way, I 
have the option of resolving different datatypes into their final 
datatype downstream of the parsing and FO tree building.

I am open to enlightenment on this.

Peter
--
Peter B. West 


Re: [Bug 25480] - Experimental performance improvements.

2004-01-14 Thread Finn Bock
[Peter B. West]

Alt-design (trying the hyphen for a while) takes different approaches at 
different times.  While building the subtree of any node, all of the 
properties are maintained in a HashMap, along with a BitSet of specified 
properties.

When the subtree construction is complete, the HashMap and BitSet are 
used to build the sparse array of only the relevant *resolved* property 
values 
If I understand the Alt-design code correctly, the function calls, like 
from-parent(), are resolved, but percentages are not resolved at this 
point and are still saved as an IndirectValue, right? The percentages 
will be resolved at a later stage?

(not properties - one of the differences with HEAD) 
I think you have mentioned this before, but is it such a big difference? 
HEAD wraps its datavalues in a very thin Property wrapper, but otherwise 
there is a one-to-one binding between a HEAD Property and its value.

regards,
finn


RE: [Bug 25480] - Experimental performance improvements.

2004-01-14 Thread Andreas L. Delmelle
> -----Original Message-----
> From: Peter B. West [mailto:[EMAIL PROTECTED]
>

> If I mentioned PropertyValue singletons, it was a slip of the fingers.
> I maintain Property singletons, which exist solely to provide access
> to certain "static" information about individual properties.
>

Don't worry, your fingers still OK. The slip was all mine...

In any case: thanks for a very fine explanation. I'm going to digest your
remarks first, then maybe I'll be back for more.


Cheers,

Andreas



Re: [Bug 25480] - Experimental performance improvements.

2004-01-14 Thread Finn Bock
[me]

1) Only store the specified properties. That is what HEAD does now.
2) Put the relevant properties in a fast Property[] array and the
   remaining specified properties in a HashMap. For fo:root the result
   would be an array of size 1 for the 'media-usage' property.
3) Expect to store every valid property. For fo:root that would mean
   allocating an array large enough to store every defined property.
   This is what my patch does, and the "values" array works as the
   PropertyWindow.
... I'll try to come up with some numbers to see
how much memory that would use/save compared to 1) and 3).
You can find the counts of relevant and valid properties for each 
element type here:

   http://bckfnn-modules.sf.net/valid
   http://bckfnn-modules.sf.net/relevant
and a trace of the number of (base) properties defined at each element 
in the DocBook:TDG example I've been using all along.

   http://bckfnn-modules.sf.net/prop-count

Adding these numbers together in a rough sort of way, I've come to this 
result. The number to the right is the amount of bytes it takes to store 
the references to the properties.

1) hashsize(specified):                    713828
2) relevant * 4 + hashsize(specified):    3906168
3) valid * 4:                             7007052

Where hashsize is a function
   cnt * (16 + 12) + int(cnt * 1.) * 4
that tries to calculate the amount of memory consumed by the entries in a 
HashMap. 16 bytes are taken by each HashMap.Entry object, and I estimate 
12 bytes of system overhead for each object. 1. is the load factor of 
the HashMap.table array.
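
For anyone who wants to redo the arithmetic, the estimate can be sketched in Java roughly as below. The class and method names are made up for illustration. The table-size factor is passed in as a parameter because its value is cut off ("1.") in the message as archived, so nothing here should be read as the exact figure that was used.

```java
/** Rough memory estimate for property references, following the formula quoted above. */
final class PropertyMemoryEstimate {
    static final int ENTRY_BYTES = 16;     // fields of one HashMap.Entry (figure from the message)
    static final int OBJECT_OVERHEAD = 12; // estimated per-object system overhead (from the message)
    static final int REF_BYTES = 4;        // one reference slot in an array

    /**
     * hashsize(cnt) = cnt * (16 + 12) + int(cnt * tableFactor) * 4.
     * tableFactor is an assumption: table slots allocated per stored entry.
     */
    static long hashSize(long cnt, double tableFactor) {
        long entries = cnt * (ENTRY_BYTES + OBJECT_OVERHEAD);
        long table = (long) (cnt * tableFactor) * REF_BYTES;
        return entries + table;
    }

    /** Bytes needed for a plain array of references with the given number of slots. */
    static long arrayBytes(long slots) {
        return slots * REF_BYTES;
    }

    public static void main(String[] args) {
        // Back out the slot counts implied by the totals quoted above.
        long validSlots = 7007052L / REF_BYTES;
        long relevantSlots = (3906168L - 713828L) / REF_BYTES;
        System.out.println("valid slots:    " + validSlots);
        System.out.println("relevant slots: " + relevantSlots);
    }
}
```

The main method only inverts the quoted totals; it does not reproduce the per-element counts, which live in the summary file linked below.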

The number for 2) is most likely too high, it should only be necessary 
to store the non-relevant properties in the HashMap, but I don't have 
the count of non-relevant properties available.

The full summary can be found here:
   http://bckfnn-modules.sf.net/summary
occur: the number of times the element occurs in the input.
spec:  the number of specified properties on all the occurrences of that
   element type.
relev: the maximum number of relevant slots that can be filled on all
   the occurrences of that element type.
valid: the maximum number of valid slots that can be filled on all
   the occurrences of that element type.
regards,
finn


Re: [Bug 25480] - Experimental performance improvements.

2004-01-13 Thread Peter B. West
Andreas L. Delmelle wrote:
Sorry for the long post... Just trying to put a few ideas together.


Alt-design (trying the hyphen for a while) takes different approaches at
different times.  While building the subtree of any node, all of the
properties are maintained in a HashMap, along with a BitSet of specified
properties.
When the subtree construction is complete, the HashMap and BitSet are
used to build the sparse array of only the relevant *resolved* property
values (not properties - one of the differences with HEAD) and then
thrown away.


Yum... so a Collection of Objects is traded for a handful of --what? (Would
this be the propertyValue singletons you mention?)
If I mentioned PropertyValue singletons, it was a slip of the fingers. 
I maintain Property singletons, which exist solely to provide access 
to certain "static" information about individual properties.

What happens with
'unresolvable' Props at that point (for the cases you mention below)?
See the package org.apache.fop.datatypes.indirect, especially 
IndirectValue, which the others extend.  IndirectValue itself extends 
AbstractPropertyValue.  Unresolved values take their place alongside the 
resolved properties.
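
To make the idea concrete, here is a minimal Java sketch of how an unresolved value can sit in the same slot as a resolved one and be evaluated only when layout supplies a context. The types below (PropertyValue, LengthValue, IndirectPercentage) are illustrative stand-ins, not the classes in org.apache.fop.datatypes.indirect, whose actual API is not shown in this thread.

```java
/** Hypothetical common supertype for everything stored in a property slot. */
interface PropertyValue {
    boolean isResolved();
}

/** A fully resolved length, in millipoints. */
final class LengthValue implements PropertyValue {
    final int millipoints;

    LengthValue(int millipoints) {
        this.millipoints = millipoints;
    }

    public boolean isResolved() {
        return true;
    }
}

/** A percentage whose base is only known once layout provides a context. */
final class IndirectPercentage implements PropertyValue {
    final double percent;

    IndirectPercentage(double percent) {
        this.percent = percent;
    }

    public boolean isResolved() {
        return false;
    }

    /** Evaluate against a base length; can be called again in a new context. */
    LengthValue resolve(int baseMillipoints) {
        return new LengthValue((int) Math.round(baseMillipoints * percent / 100.0));
    }
}

final class IndirectValueDemo {
    /** What layout might do when it asks for a length: resolve on demand. */
    static int lengthOf(PropertyValue value, int contextBase) {
        if (value instanceof IndirectPercentage) {
            return ((IndirectPercentage) value).resolve(contextBase).millipoints;
        }
        return ((LengthValue) value).millipoints;
    }
}
```

In this sketch the type of the stored value is itself the signal that more computation is needed, so no separate flag or stack is kept in the FO node.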

Do
they get thrown away, or placed in a sort of Property stack for later
accessing? If so, how exactly do you indicate in the FObj (or FONode) that
the property in question still has to be resolved? So that, when Layout asks
for this property, it is signaled that some computations still need to be
made? (Pardon me if these questions mean I haven't read your docs thoroughly
enough; I did read some of them, but it seems I still miss a bit of
technical background to fully understand it.)
The docs are far from adequate.


This approach has to be modified in two environments - fo:static-content
and fo:marker.  In the case of fo:marker, the inherited environment is
not known at parse time, and in the case of static-content, the
appropriate fo:retrieve-marker subtrees are not known until the
region-body area tree is constructed.


Yes... cases I was overlooking --not so plain inheritance as the FObj's for
static-content appear only once or twice every page-sequence, but their
inherited properties and values vary, depending on the further processing of
the tree. Still, I'm wondering whether you really need to re-parse them (as
you indicated)... Couldn't you just, say, keep the Property alive and alter
its value when needed? (If this would save you any re-initializing, I mean)
The re-parsing idea seems very interesting to be able to say after a
threesome of pages have been processed: collapse the branches of the FOTree
that have already been laid out to the level below the current fo:flow (or
fo:block; in any case some logical point of reference). If downstream, it
turns out that their layout needs to be modified again (what are the odds?
any way of predicting this?), one could trigger a re-parse from the subtree
in question onwards. (This would, I think, save us some memory)
The re-parse may not be strictly necessary, but it is a workable first 
cut.  Generally, static-content and markers will not be large subtrees, 
so re-parsing them ought not to have a major impact.  This can be 
experimented with once the layout is working.  The beauty of the 
approach is that it simplifies the logic, without (I hope) costing too 
much in performance.

In terms of the re-layout, re-parsing is not required (Re, re, re your 
boat...)  IndirectValues will need to be re-evaluated in the new 
context, but the process of re-layout is not markedly different from the 
normal layout process.  Any particular lineage descendant from an 
fo:flow must be able to adapt to a new environment.

Because the page is a logical unit different from the flow, some line of 
descent from the fo:flow will be unresolved when any particular page, 
which is being laid out from that flow, fills up.  The incomplete 
lineage must somehow be associated with the new page, whose region-body 
dimensions may be different.  The layout must then proceed from the 
point at which the previous page filled.

There are similarities here with backtracking.  Backtracking layout 
involves backtracking to a particular lineage in a particular state 
(even if that is the beginning of an fo:flow.)

  Perhaps the
same approach can be taken WRT tables: collapse all finished rows, so their
cells are released. When their laid-out state turns out to be insufficient,
start processing again from the row in question onwards, *with* the
knowledge of what lies ahead this time...
Peter
--
Peter B. West 


RE: [Bug 25480] - Experimental performance improvements.

2004-01-13 Thread John Austin
On Tue, 2004-01-13 at 20:49, Glen Mazza wrote:
> Let's not get too certain of anything right now with
> respect to implementation--but you probably have a
> point--a huge and very repetitively formatted document
> (say, the Chicago phone book, perhaps) would have
> comparatively fewer properties with a higher
> cardinality for each.

SOLVED! Yes!

Something to cheer up a morbidly downcast Packers fan two
days after the fall of the mighty number '4'.

I used DocBook for the frequency table because I was familiar
with formatting it as PDF with FOP. I suspect that properties
have similar distributions in general because XSL-FO are always
generated with programs and (ransom notes notwithstanding) 
adhere to general styles.

Really repetitive documents would be only slightly more skewed
than general text documents. (Say 90-10 rather than 80-20).

Someone told me where to get the style sheets for the XSL-FO
specification (RenderX) and I wanted to generate the XSL-FO
file for it, as a more appropriate 'challenge' for the project. 

--
John Austin <[EMAIL PROTECTED]>


RE: [Bug 25480] - Experimental performance improvements.

2004-01-13 Thread Glen Mazza
Let's not get too certain of anything right now with
respect to implementation--but you probably have a
point--a huge and very repetitively formatted document
(say, the Chicago phone book, perhaps) would have
comparatively fewer properties with a higher
cardinality for each.

Glen

--- "Andreas L. Delmelle" <[EMAIL PROTECTED]>
wrote:
> > -----Original Message-----
> > From: Glen Mazza [mailto:[EMAIL PROTECTED]
> >
> > OK--this may also be overkill then.  Thank you for the
> > analysis.
> >
> 
> It will prove useful, I am sure --provided we want to uphold the intention
> to be able to process any size of document (from sources that may not be as
> exemplary as DocBook :/)... and eventually redo part of the process with
> significantly less effort.
> 
> Over & out (for the time being :) )
> 
> Cheers,
> 
> Andreas
> 




DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2004-01-13 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED



--- Additional Comments From [EMAIL PROTECTED]  2004-01-14 00:05 ---
OK, Finn, I just added in the inherit[] (inheritableProperties[]) array in 
PropertyList.  Also, converted what appears to be the last of the properties (I 
read through the whole patch again to confirm.)  We decided earlier for the 
time being to leave out the FO element conversions due to lack of use, so I 
will mark this patch as "Fixed".

If you see me missing anything, or would otherwise still like to keep it open, 
go ahead and do so.

Thanks!
Glen


RE: [Bug 25480] - Experimental performance improvements.

2004-01-13 Thread Andreas L. Delmelle
> -----Original Message-----
> From: Glen Mazza [mailto:[EMAIL PROTECTED]
>
> OK--this may also be overkill then.  Thank you for the
> analysis.
>

It will prove useful, I am sure --provided we want to uphold the intention
to be able to process any size of document (from sources that may not be as
exemplary as DocBook :/)... and eventually redo part of the process with
significantly less effort.

Over & out (for the time being :) )

Cheers,

Andreas



Re: [Bug 25480] - Experimental performance improvements.

2004-01-13 Thread Glen Mazza
OK--this may also be overkill then.  Thank you for the
analysis.

Glen

--- Finn Bock <[EMAIL PROTECTED]> wrote:
> [Glen Mazza]
> 
> > This sounds like it could be an excellent idea--a
> > PropertyRepository (extending, of course, a
> > DelmelleRepository (tm) ;) ) could be a very useful
> > tool for FO Tree Building.  Prior to creating any
> > Property instance for any FO's array, we send the
> > specs of the needed property to the
> > PropertyRepository, and it gives us either (1) a
> > brand-new property instance, or (2) a reference to an
> > already-created one.  So, indeed, only one instance of
> > that font-size = "12pt" would need to be created.
> 
> 
> The amount to be saved will of course depend on the input, but for
> "DocBook: The Definitive Guide", the amount is quite small. So small
> that I didn't bother to do it when I made the performance patch.
>
> Here is a breakdown of the string values that are parsed into
> properties. It is the output from "sort | uniq -c | sort", so the
> first column is the number of times that a value occurs.
>
> http://bckfnn-modules.sf.net/property-value-breakdown.txt
>
> The properties that start with a '.' are the default values for
> subproperties, and they can all be reduced to one occurrence if the
> default value for subproperties were cached.
>
> 'start-indent' and 'line-height=1.2em' both depend on other
> properties and therefore can't be (easily) cached.
> 
> regards,
> finn
> 




Re: [Bug 25480] - Experimental performance improvements.

2004-01-13 Thread Finn Bock
[Glen Mazza]

This sounds like it could be an excellent idea--a
PropertyRepository (extending, of course, a
DelmelleRepository (tm) ;) ) could be a very useful
tool for FO Tree Building.  Prior to creating any
Property instance for any FO's array, we send the
specs of the needed property to the
PropertyRepository, and it gives us either (1) a
brand-new property instance, or (2) a reference to an
already-created one.  So, indeed, only one instance of
that font-size = "12pt" would need to be created.  

The amount to be saved will of course depend on the input, but for 
"DocBook: The Definitive Guide", the amount is quite small. So small that 
I didn't bother to do it when I made the performance patch.

Here is a breakdown of the string values that are parsed into 
properties. It is the output from "sort | uniq -c | sort", so the 
first column is the number of times that a value occurs.

http://bckfnn-modules.sf.net/property-value-breakdown.txt

The properties that start with a '.' are the default values for 
subproperties, and they can all be reduced to one occurrence if the 
default value for subproperties were cached.

'start-indent' and 'line-height=1.2em' both depend on other properties 
and therefore can't be (easily) cached.

regards,
finn


RE: [Bug 25480] - Experimental performance improvements.

2004-01-13 Thread Andreas L. Delmelle

Sorry for the long post... Just trying to put a few ideas together.

> From: Finn Bock [mailto:[EMAIL PROTECTED]
>

> Pardon me for repeating what might be obvious,

Well, it wasn't to me, so... thanks in advance

> but I'd like to take
> a look at what information we want to store at each FObj.

If you look at the big picture, I think this could as well be none ( a null
vector ) for FObj's with nothing but inherited props. Their property
accessor should be able to perform the logic 'if absent in this FObj's prop
specs, get the specified/computed value (or the property, in case the value
is still unresolved) for the correct FObj', instead of having to redirect
the call to the immediate parent's Propertylist, which may or may not have
to do the same, who knows... the parser probably did at some point, but it
forgot, so did we and now it's got us frantically climbing up the
FOTree --like monkeys :)

> I can come up with 3 different strategies:
>
> 1) Only store the specified properties. That is what HEAD does now.
 
? Is this meant ironically? :)
IIC, if not optimized or normalized in some way, overconstraint would lead
to a drastically high object instantiation rate, not? Besides that, the spec
*does* enforce verbosity in some quite trivial and widely used constructs...
like, , tables

> From: Peter B. West [mailto:[EMAIL PROTECTED]
>

> Alt-design (trying the hyphen for a while) takes different approaches at
> different times.  While building the subtree of any node, all of the
> properties are maintained in a HashMap, along with a BitSet of specified
> properties.
>
> When the subtree construction is complete, the HashMap and BitSet are
> used to build the sparse array of only the relevant *resolved* property
> values (not properties - one of the differences with HEAD) and then
> thrown away.
>

Yum... so a Collection of Objects is traded for a handful of --what? (Would
this be the propertyValue singletons you mention?) What happens with
'unresolvable' Props at that point (for the cases you mention below)? Do
they get thrown away, or placed in a sort of Property stack for later
accessing? If so, how exactly do you indicate in the FObj (or FONode) that
the property in question still has to be resolved? So that, when Layout asks
for this property, it is signaled that some computations still need to be
made? (Pardon me if these questions mean I haven't read your docs thoroughly
enough; I did read some of them, but it seems I still miss a bit of
technical background to fully understand it.)

> This approach has to be modified in two environments - fo:static-content
> and fo:marker.  In the case of fo:marker, the inherited environment is
> not known at parse time, and in the case of static-content, the
> appropriate fo:retrieve-marker subtrees are not known until the
> region-body area tree is constructed.
>

Yes... cases I was overlooking --not so plain inheritance as the FObj's for
static-content appear only once or twice every page-sequence, but their
inherited properties and values vary, depending on the further processing of
the tree. Still, I'm wondering whether you really need to re-parse them (as
you indicated)... Couldn't you just, say, keep the Property alive and alter
its value when needed? (If this would save you any re-initializing, I mean)
The re-parsing idea seems very interesting to be able to say after a
threesome of pages have been processed: collapse the branches of the FOTree
that have already been laid out to the level below the current fo:flow (or
fo:block; in any case some logical point of reference). If downstream, it
turns out that their layout needs to be modified again (what are the odds?
any way of predicting this?), one could trigger a re-parse from the subtree
in question onwards. (This would, I think, save us some memory) Perhaps the
same approach can be taken WRT tables: collapse all finished rows, so their
cells are released. When their laid-out state turns out to be insufficient,
start processing again from the row in question onwards, *with* the
knowledge of what lies ahead this time...

> From: Glen Mazza [mailto:[EMAIL PROTECTED]
> --- "Andreas L. Delmelle" <[EMAIL PROTECTED]>
> > What I'm very concerned about, for example, are cases like
> > tables, where it would be quite awkward to have the TableCell
> > FObj's reference their own copy of a Property instance
>
> To put it more precisely, the individual Properties of
> an FObj are currently stored in the PropertyList of
> that FObj.

> This sounds like it could be an excellent idea--a
> PropertyRepository ( ... ) could be a very useful
> tool for FO Tree Building.  Prior to creating any
> Property instance for any FO's array, we send the
> specs of the needed property to the
> PropertyRepository, and it gives us either (1) a
> brand-new property instance, or (2) a reference to an
> already-created one.

Aha! So this approach could also be used to recycle some of the HashMap and
BitSet that get thrown away in al

RE: [Bug 25480] - Experimental performance improvements.

2004-01-12 Thread Glen Mazza
--- "Andreas L. Delmelle" <[EMAIL PROTECTED]>
wrote:
> What I'm very concerned about, for example, are cases like tables, where it
> would be quite awkward to have the TableCell FObj's reference their own copy
> of a Property instance

To put it more precisely, the individual Properties of
an FObj are currently stored in the PropertyList of
that FObj.

>
> > 2.) for any properties undefined, access the
> > PropertyWindow to determine the property instances to
> > use for them.  No recursion needed now.
> >
> 
> Sounds good, and while we're at it, could we test for equal specified values
> on any ancestors in step 1, so we can use the advantages of inheritance to
> the max...? 

> We don't really want to *punish* users
> who feel like specifying

who *don't* feel, I presume?  ;)

> 'font-size="12pt"' on 80K different FObj's that are descendants of the
> fo:flow, I think.
> In the proposed case, IMHO, there should be only one Property instance for
> 'font-size="12pt"'. Is this a correct view? Or is this outright impossible
> for some reason I'm missing?
> 
> 

This sounds like it could be an excellent idea--a
PropertyRepository (extending, of course, a
DelmelleRepository (tm) ;) ) could be a very useful
tool for FO Tree Building.  Prior to creating any
Property instance for any FO's array, we send the
specs of the needed property to the
PropertyRepository, and it gives us either (1) a
brand-new property instance, or (2) a reference to an
already-created one.  So, indeed, only one instance of
that font-size = "12pt" would need to be created.  
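
A minimal Java sketch of such a repository, under stated assumptions: SimpleProperty is a hypothetical immutable stand-in (the real FOP Property class is different), and the cache key is simply the attribute name and its specified string joined by '='. Sharing like this is only safe when the resulting instance does not depend on context, so values that are relative to other properties would need to bypass the cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical immutable property, used only for this illustration. */
final class SimpleProperty {
    final String name;
    final String value;

    SimpleProperty(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

/** Hands out one shared instance per (name, value) pair. */
final class PropertyRepository {
    private final Map<String, SimpleProperty> cache = new HashMap<>();

    /** Returns the cached instance, creating it on first request. */
    SimpleProperty get(String name, String value) {
        // '=' cannot occur in an XML attribute name, so the key is unambiguous.
        return cache.computeIfAbsent(name + "=" + value,
                key -> new SimpleProperty(name, value));
    }

    /** Number of distinct instances created so far. */
    int size() {
        return cache.size();
    }
}
```

With this shape, 80K elements that all specify font-size="12pt" would each hold a reference to the same instance, and size() would report 1 for that pair.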

The PropertyRepository would be accessed by
PropertyList when creating properties out of specified
attributes, prior to going to a PropertyWindow (if we
use one) to determine which non-specified properties
to attach.  (The PropertyWindow would only need the
PropertyRepository on app-startup, when it would
create its default objects.)  

But we may not need to bother with a
PropertyWindow--as you're saying, I doubt we're going
to need to store references to all properties at each
FObj--and its time savings on FO Tree building may be
questionable anyway.

Glen



Re: [Bug 25480] - Experimental performance improvements.

2004-01-12 Thread Glen Mazza
--- Finn Bock <[EMAIL PROTECTED]> wrote:
> > I.e., for all those references to the 'foo' property
> > instance for the children of an FO where that value
> > would be inherited, we don't have to create a new
> > Property instance, just a reference to the inherited
> > instance.
> 
> Right, that is also the way I see it.
> 

Good, all three of us are on the same page.

> 2) Put the relevant properties in a fast Property[] array and the
>    remaining specified properties in a HashMap. For fo:root the result
>    would be an array of size 1 for the 'media-usage' property.

No, I wasn't thinking of having a HashMap at all--just
a property array--first populated by incoming
specified properties, then populated by querying the
PropertyWindow for ancestor already-created property
instances.

The PropertyWindow is only a FO-tree build time
convenience for quickly attaching already created
property instances to the current node, without
needing to make recursive calls up the tree to obtain
those instances.  It is not needed if (1) the
recursion is not that big a deal in time, or (2) it is
not desired to store an array of all valid instances
for each FObj (i.e., we should continue something
similar to what HEAD does presently).  
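
As a rough Java sketch of that build-time step: the property array is filled first from the specified attributes and then from the window, with no climb up the tree. Everything here is hypothetical. Properties are identified by array index, Object stands in for a Property instance, and non-inheritable properties that are not specified are simply left null on the assumption that their initial values are looked up elsewhere.

```java
/** Illustration of populating one FObj's property array from a PropertyWindow. */
final class PropertyWindowSketch {
    /**
     * @param specified   instances created from this element's attributes (null = not specified)
     * @param window      instances in effect at the parent, already flattened
     * @param inheritable which property ids are inherited
     * @return the values array for this element
     */
    static Object[] populate(Object[] specified, Object[] window, boolean[] inheritable) {
        Object[] values = new Object[specified.length];
        for (int id = 0; id < values.length; id++) {
            if (specified[id] != null) {
                values[id] = specified[id];   // explicit attribute wins
            } else if (inheritable[id]) {
                values[id] = window[id];      // share the ancestor's instance
            }
        }
        return values;
    }
}
```

Because inherited slots are copied by reference, the array returned for one element can serve directly as the window for its children.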

However, its usage may also eliminate the need to
store all relevant properties for the children in the
array of each FObj.  (i.e., fo:root can return to just
storing media-usage.)  Of course, it is certainly more
space-efficient to store 250 properties at the top
than at each of the child nodes.  Ultimately, it
depends on what you want the property arrays for each
FObj instance to store.

Glen



Re: [Bug 25480] - Experimental performance improvements.

2004-01-12 Thread Peter B. West
Finn Bock wrote:
[Glen]

One thing I'm missing here, for Finn's design below:

values[0] = null // always null.
values[1] = reference to a 'foo' Property instance
values[2] = reference to a 'baz' Property instance
Can't we just have, say, values[1] refer to the
parent's Property instance in cases where inheritance
is applicable?
I.e., for all those references to the 'foo' property
instance for the children of an FO where that value
would be inherited, we don't have to create a new
Property instance, just a reference to the inherited
instance.


Right, that is also the way I see it.

But if the problem is the *recursion* necessary to
determine the property instance to inherit--here, not
the memory problem, but processing speed--I'm thinking
of a PropertyWindow instance as follows:
A PropertyWindow would be used temporarily during FObj
property initialization to hold references to all the
property instances that would be relevant for that
FObj should a property not be explicitly defined. 
So, to populate the property instances for a
particular FObj, i.e., the "values" array:


Pardon me for repeating what might be obvious, but I'd like to take
a look at what information we want to store at each FObj. I can come
up with 3 different strategies:
1) Only store the specified properties. That is what HEAD does now.
2) Put the relevant properties in a fast Property[] array and the
   remaining specified properties in a HashMap. For fo:root the result
   would be an array of size 1 for the 'media-usage' property.
3) Expect to store every valid property. For fo:root that would mean
   allocating an array large enough to store every defined property.
   This is what my patch does, and the "values" array works as the
   PropertyWindow.
As I understand your PropertyWindow proposal, it would allow us to
implement no. 2) above. I'll try to come up with some numbers to see
how much memory that would use/save compared to 1) and 3).
Alt-design (trying the hyphen for a while) takes different approaches at 
different times.  While building the subtree of any node, all of the 
properties are maintained in a HashMap, along with a BitSet of specified 
properties.

When the subtree construction is complete, the HashMap and BitSet are 
used to build the sparse array of only the relevant *resolved* property 
values (not properties - one of the differences with HEAD) and then 
thrown away.
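
A minimal Java sketch of the compaction step, under stated assumptions: property ids are small integers, the sparse array is modelled as two parallel arrays searched by id, and the BitSet passed in marks the ids relevant to the element. The role of the specified-properties BitSet is not modelled, and none of these names come from the alt-design sources.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Map;

/** Compact, read-only view of the resolved values kept for one FO node. */
final class SparseValuesSketch {
    private final int[] ids;        // property ids, ascending
    private final Object[] values;  // resolved values, parallel to ids

    private SparseValuesSketch(int[] ids, Object[] values) {
        this.ids = ids;
        this.values = values;
    }

    /**
     * Keeps only the relevant entries of the working map. After this call
     * the map (and any bookkeeping BitSet) can be discarded.
     */
    static SparseValuesSketch compact(Map<Integer, Object> working, BitSet relevant) {
        int[] ids = new int[relevant.cardinality()];
        Object[] values = new Object[ids.length];
        int n = 0;
        // nextSetBit walks the set bits in ascending order, so ids stays sorted.
        for (int id = relevant.nextSetBit(0); id >= 0; id = relevant.nextSetBit(id + 1)) {
            Object value = working.get(id);
            if (value != null) {
                ids[n] = id;
                values[n] = value;
                n++;
            }
        }
        return new SparseValuesSketch(Arrays.copyOf(ids, n), Arrays.copyOf(values, n));
    }

    /** Returns the value for a property id, or null if none was kept. */
    Object get(int id) {
        int pos = Arrays.binarySearch(ids, id);
        return pos >= 0 ? values[pos] : null;
    }

    int size() {
        return ids.length;
    }
}
```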

This approach has to be modified in two environments - fo:static-content 
and fo:marker.  In the case of fo:marker, the inherited environment is 
not known at parse time, and in the case of static-content, the 
appropriate fo:retrieve-marker subtrees are not known until the 
region-body area tree is constructed.

In general, the impact on storage of maintaining full details for 
fo:static-content and fo:marker will not be huge, even if these are 
parsed as encountered.  However, the plan for alt-design is 1) not to 
parse an fo:marker subtree unless and until it is required, and 2) to 
re-parse fo:static-content for each page after the region-body area tree 
has been constructed. (I'm working on these modifications now.)

Peter
--
Peter B. West 


Re: [Bug 25480] - Experimental performance improvements.

2004-01-12 Thread Finn Bock
[Glen]

One thing I'm missing here, for Finn's design below:

values[0] = null // always null.
values[1] = reference to a 'foo' Property instance
values[2] = reference to a 'baz' Property instance
Can't we just have, say, values[1] refer to the
parent's Property instance in cases where inheritance
is applicable?
I.e., for all those references to the 'foo' property
instance for the children of an FO where that value
would be inherited, we don't have to create a new
Property instance, just a reference to the inherited
instance.
Right, that is also the way I see it.

But if the problem is the *recursion* necessary to
determine the property instance to inherit--here, not
the memory problem, but processing speed--I'm thinking
of a PropertyWindow instance as follows:
A PropertyWindow would be used temporarily during FObj
property initialization to hold references to all the
property instances that would be relevant for that
FObj should a property not be explicitly defined.  

So, to populate the property instances for a
particular FObj, i.e., the "values" array:
Pardon me for repeating what might be obvious, but I'd like to take
a look at what information we want to store at each FObj. I can come
up with 3 different strategies:
1) Only store the specified properties. That is what HEAD does now.
2) Put the relevant properties in a fast Property[] array and the
   remaining specified properties in a HashMap. For fo:root the result
   would be an array of size 1 for the 'media-usage' property.
3) Expect to store every valid property. For fo:root that would mean
   allocating an array large enough to store every defined property.
   This is what my patch does, and the "values" array works as the
   PropertyWindow.
As I understand your PropertyWindow proposal, it would allow us to
implement no. 2) above. I'll try to come up with some numbers to see
how much memory that would use/save compared to 1) and 3).
regards,
finn


RE: [Bug 25480] - Experimental performance improvements.

2004-01-12 Thread Andreas L. Delmelle
> -Original Message-
> From: Glen Mazza [mailto:[EMAIL PROTECTED]
>

> One thing I'm missing here, for Finn's design below:
>
> values[0] = null // always null.
> values[1] = reference to a 'foo' Property instance
> values[2] = reference to a 'baz' Property instance
>
> Can't we just have, say, values[1] refer to the
> parent's Property instance in cases where inheritance
> is applicable?
>
> I.e., for all those references to the 'foo' property
> instance for the children of an FO where that value
> would be inherited, we don't have to create a new
> Property instance, just a reference to the inherited
> instance.
>

That sounds indeed very much like the sort of thing I'm referring to,
expressed with more profound background knowledge, so more to the point.

> [Me: ]
> > The difference would be that, instead of n
> > Property objects, each with
> > its own FObj, you would end up with one larger
> > Property object --larger,
> > because it has to store references of all FObj's to
> > which it applies - that
> > is in some way shared by n FObj's.
> >
> [Glen: ]
> I think this could be done more simply by the above,
> just have values[1] refer to the property of the
> parent instead.
>

Agreed.

What I'm very concerned about, for example, are cases like tables, where it
would be quite awkward to have the TableCell FObj's reference their own copy
of a Property instance --thus increasing their size in memory - while there
would be a way to get the value (resolved/unresolved) of the ancestor in
question, directly without any recursion, given the necessary adjustments.
IIC the TableRow FObj's Property instances will always be available as long
as any Cells are being processed. Same goes for the Table and the Columns
while any Rows are processed. I admit, this is a quite different form of
inheritance, but it serves as an illustration.
(Thinking of my recent colspan excursions here: does every cell really need
its own colspan variable where you could simply retrieve it from the column
(if specified)? This also has me doubting about simply ignoring implicit
columns at LM creation stage, like the proposed (and recently applied) patch
25809 does. The corresponding FObj would simply not exist. If it did, a LM
would automatically be instantiated for it. I'm thinking of creating 'a' (at
least one) column in the FOTree, but I'll discuss this in another
thread... )

> But if the problem is the *recursion* necessary to
> determine the property instance to inherit--here, not
> the memory problem, but processing speed--I'm thinking
> of a PropertyWindow instance as follows:
>

My concern, I believe, is both. Take, for instance, the font-size property.
I would consider it to be best practice in XSL-FO to specify the value that
will be used the most throughout the rest of the document on the fo:flow,
and use specified values to override this value only where it is absolutely
necessary to do so. This, to me, means that when it comes to processing,
there has to be an advantage to adhere to such practices, WRT memory
consumption as well as WRT speed. I mean: if it takes only a little more
time to process at FOTree building stage, it might be worth it if you can
avoid a lot of recursive calls later on in the process.


So...
> A PropertyWindow would be used temporarily during FObj
> property initialization to hold references to all the
> property instances that would be relevant for that
> FObj should a property not be explicitly defined.
>
> So, to populate the property instances for a
> particular FObj, i.e., the "values" array:
>
> 1.) read any incoming attributes and make properties
> out of them.
>
> 2.) for any properties undefined, access the
> PropertyWindow to determine the property instances to
> use for them.  No recursion needed now.
>

Sounds good, and while we're at it, could we test for equal specified values
on any ancestors in step 1, so we can use the advantages of inheritance to
the max...? We don't really want to *punish* users who feel like specifying
'font-size="12pt"' on 80K different FObj's that are descendants of the
fo:flow, I think.
In the proposed case, IMHO, there should be only one Property instance for
'font-size="12pt"'. Is this a correct view? Or is this outright impossible
for some reason I'm missing?
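
A minimal sketch of what "only one Property instance" could look like, assuming a Property is immutable and holds no reference back to its FObj. The class names PropertyCacheSketch and Prop, and the string key, are made up for illustration and are not FOP code. Values that depend on their context (for example "1.2em" or a percentage) could not be shared this blindly.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative cache that hands out one shared instance per name/value pair. */
final class PropertyCacheSketch {
    /** Hypothetical immutable stand-in for a Property. */
    static final class Prop {
        final String name;
        final String value;

        Prop(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    private final Map<String, Prop> cache = new HashMap<String, Prop>();

    /** Returns the shared instance, creating it on first request only. */
    Prop intern(String name, String value) {
        String key = name + '=' + value;
        Prop p = cache.get(key);
        if (p == null) {
            p = new Prop(name, value);
            cache.put(key, p);
        }
        return p;
    }

    int size() {
        return cache.size();
    }
}
```

With such a cache, 80K occurrences of font-size="12pt" would cost 80K references but only one Prop object; the price is one hash lookup per specified attribute at FOTree building time.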

> 3.) if the FObj has any child objects, create a new
> (temporary) PropertyWindow for their
> processing, based on the FObj's (parent)
> PropertyWindow and the value of its own Properties.
> This temporary PropertyWindow can be dropped once the
> children's FObjs have been processed.  So the number
> of open PropertyWindow instances during processing
> would just be equal to the depth of the tree at the
> current point of processing.
>
> (What I'm describing above may very well be a
> common design pattern; I don't know its name,
> however.)
>

Me neither... If it turns out not to be one, we could call it the
'Mazza-Window' :)

> OTOH, for any XSL functions which request, "run-time",
> to

RE: [Bug 25480] - Experimental performance improvements.

2004-01-11 Thread Glen Mazza
--- "Andreas L. Delmelle" <[EMAIL PROTECTED]>
wrote:
> > -Original Message-
> > From: Finn Bock [mailto:[EMAIL PROTECTED]
> >
> > > [Andreas L. Delmelle]
> > > In this case, however, I think you
> > > can't fully 'push' these onto the descendants,
> as this would
> > > lead to absurd storage-reqs for quite
> average-sized documents.
> >



> >
> > Absolutely correct, and I'm not at all sure that
> we should go to an array
> > based storage of the properties in PropertyList,
> as I did in the patch.

One thing I'm missing here, for Finn's design below:

values[0] = null // always null.
values[1] = reference to a 'foo' Property instance
values[2] = reference to a 'baz' Property instance

Can't we just have, say, values[1] refer to the
parent's Property instance in cases where inheritance
is applicable?

I.e., for all those references to the 'foo' property
instance for the children of an FO where that value
would be inherited, we don't have to create a new
Property instance, just a reference to the inherited
instance.


> The difference would be that, instead of n
> Property objects, each with
> its own FObj, you would end up with one larger
> Property object --larger,
> because it has to store references of all FObj's to
> which it applies - that
> is in some way shared by n FObj's.
> 

I think this could be done more simply by the above,
just have values[1] refer to the property of the
parent instead.  

But if the problem is the *recursion* necessary to
determine the property instance to inherit--here, not
the memory problem, but processing speed--I'm thinking
of a PropertyWindow instance as follows:

A PropertyWindow would be used temporarily during FObj
property initialization to hold references to all the
property instances that would be relevant for that
FObj should a property not be explicitly defined.  

So, to populate the property instances for a
particular FObj, i.e., the "values" array:

1.) read any incoming attributes and make properties
out of them.

2.) for any properties undefined, access the
PropertyWindow to determine the property instances to
use for them.  No recursion needed now.

3.) if the FObj has any child objects, create a new
(temporary) PropertyWindow for their
processing, based on the FObj's (parent)
PropertyWindow and the value of its own Properties.
This temporary PropertyWindow can be dropped once the
children's FObjs have been processed.  So the number
of open PropertyWindow instances during processing
would just be equal to the depth of the tree at the
current point of processing.

(What I'm describing above may very well be a
common design pattern; I don't know its name,
however.)
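
The three steps above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: property values are plain strings keyed by name, every property is treated as inheritable, and the names WindowSketch, populate() and forChildren() are invented here rather than taken from FOP.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative PropertyWindow; not the FOP implementation. */
final class WindowSketch {
    /** The inherited values visible to one FObj while it is being set up. */
    static final class PropertyWindow {
        private final Map<String, String> inherited;

        PropertyWindow(Map<String, String> inherited) {
            this.inherited = inherited;
        }

        String get(String name) {
            return inherited.get(name);
        }

        /** Step 3: the children see this window overlaid with the FObj's own values. */
        PropertyWindow forChildren(Map<String, String> specified) {
            Map<String, String> next = new HashMap<String, String>(inherited);
            next.putAll(specified);
            return new PropertyWindow(next);
        }
    }

    /** Steps 1 and 2: specified values first, then the window, with no recursion. */
    static Map<String, String> populate(Map<String, String> specified,
                                        PropertyWindow window,
                                        String[] relevant) {
        Map<String, String> values = new HashMap<String, String>();
        for (String name : relevant) {
            String v = specified.containsKey(name) ? specified.get(name) : window.get(name);
            if (v != null) {
                values.put(name, v);
            }
        }
        return values;
    }
}
```

If the caller keeps the window returned by forChildren() only while the children are being processed, the number of live windows stays equal to the current depth of the tree. The sketch copies the whole map per level; an array indexed by property id would be the cheaper equivalent.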

OTOH, for any XSL functions which request, "run-time",
to use different Property values than that obtained by
simple inheritance--we can't rely on the above.  These
functions are relatively infrequently used, and IMO do
not need to be as optimized as normal inheritance
would be--i.e., we still have to support them, but we
shouldn't be altering the speed of the base processing
in order to optimize for *these* functions--recursion
would be OK here, if it allows speed/memory saving for
usual processing.

Glen




RE: [Bug 25480] - Experimental performance improvements.

2004-01-11 Thread Andreas L. Delmelle
> -Original Message-
> From: Finn Bock [mailto:[EMAIL PROTECTED]
>
> > [Andreas L. Delmelle]
> > In this case, however, I think you
> > can't fully 'push' these onto the descendants, as this would
> > lead to absurd storage-reqs for quite average-sized documents.
>
> > OTOH, the inherited property value (resolved or unresolved)
> > can indeed be supposed as available at parse time, because a parent is
> > per se processed *before* any of its children.
>
> Absolutely correct, and I'm not at all sure that we should go to an array
> based storage of the properties in PropertyList, as I did in the patch.
> Using arrays has strengths:
>
> - speed during lookup.
> - possible to push inherited props into the children.
>
> and weaknesses:
>
> - much, much larger memory consumption.
>
> I have some ideas regarding releasing the array memory earlier during the
> endElement event that I will try out, but until we have a solution to
> the memory consumption, I think we are better off with the current HashMap
> implementation of property storage and then delay the implementation of
> pushing inherited props into the children.
>

My understanding of the Property side is still a bit too fragmented, but
I'll try to describe what I think to be 'the best of both worlds' in the
case of inherited props (maybe it is already (partially) done this way, I'm
not sure, but I don't get that impression from the code or the docs... see
the pointer below[1]). AFAICT, the way I describe it would turn the logic
almost inside-out:

One structure for an inherited property (in the form of a subclass? or an
'Inheritable' interface?) that contains a reference to the base FObj and an
array of all FObj's whose property accessors should be routed directly to
return the property (value) of the base FObj. This to avoid unnecessary
calls later on to an FObj's accessor from which you already know at parse
time that it's going to dispatch the call anyway. Take the case of a
property value where the property in question is inherited from some levels
up... The difference would be that, instead of n Property objects, each with
its own FObj, you would end up with one larger Property object --larger,
because it has to store references of all FObj's to which it applies - that
is in some way shared by n FObj's.

I have for the moment, no idea whether such an approach is feasible, but one
thing I do know, is that if it is, it'll save on a number of recursive calls
later on in the process.

When an FObj is interrogated by Layout about an unspecified value for a
property which falls in the inheritable category, the call for the property
value should trigger:

- getting the id for the nearest ancestor on which the property *was*
specified (conveniently stored this info at parse time, so shouldn't be a
problem)
- pass this id as a parameter to get({property}), so effectively reducing
the number of calls to 1
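
A rough sketch of that lookup, assuming each node records at parse time which node (itself or an ancestor) is the nearest one to specify each property. InheritSketchNode and its members are hypothetical names, values are plain strings, and every property is treated as inheritable for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative node type; FOP's FObj and PropertyList look different. */
final class InheritSketchNode {
    private final Map<String, String> specified;
    /** Per property name: the nearest node (this one or an ancestor) that specifies it. */
    private final Map<String, InheritSketchNode> nearest;

    InheritSketchNode(InheritSketchNode parent, Map<String, String> attributes) {
        this.specified = new HashMap<String, String>(attributes);
        if (parent != null && attributes.isEmpty()) {
            // Nothing specified here: share the parent's table instead of copying it.
            this.nearest = parent.nearest;
        } else {
            // The parent is complete before its children are parsed, so copying is safe.
            this.nearest = (parent == null)
                    ? new HashMap<String, InheritSketchNode>()
                    : new HashMap<String, InheritSketchNode>(parent.nearest);
            for (String name : attributes.keySet()) {
                this.nearest.put(name, this);
            }
        }
    }

    /** One map lookup plus one hop, however deep the node sits in the tree. */
    String get(String name) {
        InheritSketchNode owner = nearest.get(name);
        return owner == null ? null : owner.specified.get(name);
    }
}
```

Sharing the parent's table when a node specifies nothing keeps the extra storage proportional to the number of nodes that actually specify properties, which is the case the storage concerns in this thread are about.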

I must admit, I do see an increase in storage for :
- FO sources with little or no nesting (? Is this even conceivable? )
- FO sources where the values for inherited properties are specified at
every level (but this IMHO should be considered as 'bad practice')

   [Me]
>  > I just wonder if this has something to do with
>  > Finn's other idea of moving logic to the property side to save on
>  > unnecessary calls to empty methods ?

> [Finn]
> No, it is unrelated, but while I made the patch to remove the generated
> maker classes
>

As I said, I have a number of things to learn WRT the property side, so
thanks for the enlightenment.

>http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25873
>
> , I've concluded that it would also be good design decision and I plan
> on updating patch 25873 to show that the Property.Maker can become
> simpler and easier to understand as well as faster.

Great! Looking forward to this.


Cheers,

Andreas

[1] http://xml.apache.org/fop/design/properties.html#property-list-struct
in the main logic : "If the property is inherited, the process repeats
using the PropertyList of the FO's parent object. (This is easy because ..."
Of course, this would be _easy_, but my question is: Would it also be the
_preferred_ way of dealing with inheritance?



Re: [Bug 25480] - Experimental performance improvements.

2004-01-10 Thread Finn Bock
[Glen Mazza]

... what's your opinion on switching to FO
Constants at this time?  They probably will not give
us the rate of return that property constants have;
but there may be future indexing or processing
advantages with them.  I'm not strong one way or the
other on them.
Since then, you have removed the unused support for elementTable,
which was the only place an FO constant id was used. So I would
delay the implementation of FO ids until that future arrives.
http://marc.theaimsgroup.com/?l=fop-dev&m=107182754300725&w=2

Your solution in your email does not look difficult to
implement, but as a simplification, how about just
implementing getElementId() within FONode to query a
new ID if its value is uninitialized, say, -1, and to
store that value with the FONode, *instead of* having
each Extension FO's implement that query & store
function themselves?  
That would also work, but the beauty of having each extension FO
do the work is that it saves the size of an int (4 bytes) for each
instance of an FONode. In my mail, the extension stores the id value
in a static variable so it doesn't weigh down every FONode object.
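
The trade-off can be illustrated with a small sketch. It is not the code from the referenced mail: IdSketchNode, IdSketchExtension, allocateElementId() and the starting id 1000 are all assumptions made up for this example.

```java
/** Illustrative base class; the real FONode has no such members. */
abstract class IdSketchNode {
    private static int nextDynamicId = 1000; // assumed start of the extension id range

    /** Hands out a fresh id; shared by all extension element classes. */
    static synchronized int allocateElementId() {
        return nextDynamicId++;
    }

    abstract int getElementId();
}

/** A made-up extension element that keeps its id per class, not per instance. */
final class IdSketchExtension extends IdSketchNode {
    private static int elementId = -1; // -1 means "not allocated yet"

    @Override
    int getElementId() {
        if (elementId == -1) {
            elementId = allocateElementId(); // first call for this class only
        }
        return elementId;
    }
}
```

Because elementId is static, a million IdSketchExtension instances share one int, whereas an instance field in the base class would add 4 bytes to every node, extension or not.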
And it includes the properties that are valid for
all the children element of
each FO. That is what the large while loop does when
it calls mergeContent.


OK--but to satisfy the spec, we probably need to add a
static boolean array for the properties, defining
true/false of whether each property is "inheritable". 
Right, my patch stores that array in PropertyListBuilder.inherit.

Because one can attach an inheritable property to any
FO, regardless of whether or not it is relevant for it
or its children, we don't want to report an error if
the user chooses to do so--we can query this static
array to determine that we should just ignore the
property instead.
A good point, my patch just reports an error when a property is
specified that isn't relevant for the FO or any of its children.
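
As a sketch of the check being discussed, assuming a static boolean array indexed by property id: the ids, the three-entry table and the names InheritCheckSketch and check() are hypothetical, and only the idea of consulting an "inheritable" array comes from the thread.

```java
/** Illustrative decision: accept, silently ignore, or report an error. */
final class InheritCheckSketch {
    enum Outcome { ACCEPT, IGNORE, ERROR }

    // Hypothetical property ids; index 0 is unused.
    static final int PR_FONT_SIZE = 1;     // inherited according to the XSL spec
    static final int PR_COLUMN_NUMBER = 2; // not inherited

    /** INHERIT[id] is true when the property with that id is inheritable. */
    static final boolean[] INHERIT = { false, true, false };

    static Outcome check(int propertyId, boolean validForFo) {
        if (validForFo) {
            return Outcome.ACCEPT;
        }
        // Inheritable properties may be placed on any FO, so no error for them.
        return INHERIT[propertyId] ? Outcome.IGNORE : Outcome.ERROR;
    }
}
```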
regards,
finn


Re: [Bug 25480] - Experimental performance improvements.

2004-01-10 Thread Glen Mazza
--- Finn Bock <[EMAIL PROTECTED]> wrote:
> > 1.) For the new PropertyList constructor (in the
> patch), you appear to be 
> > duplicating the element ID argument, once as "el",
> the other time 
> > as "elementId"--just to confirm, they are
> referring to the same thing (and 
> > hence one of them can be removed)?
> 
> Yes, it is a silly leftover from my own conversion
> process.
> 

OK--BTW, what's your opinion on switching to FO
Constants at this time?  They probably will not give
us the rate of return that property constants have;
but there may be future indexing or processing
advantages with them.  I'm not strong one way or the
other on them.

One issue you brought up in the email below, was that
of extension elements needing their own ID's
dynamically:  

http://marc.theaimsgroup.com/?l=fop-dev&m=107182754300725&w=2

Your solution in your email does not look difficult to
implement, but as a simplification, how about just
implementing getElementId() within FONode to query a
new ID if its value is uninitialized, say, -1, and to
store that value with the FONode, *instead of* having
each Extension FO's implement that query & store
function themselves?  

> > 
> 
> No, the + 1 is a deliberate trick to handle unknown
> properties which
> should return a null value during lookup().



Excellent--thanks for your full explanation--I
understand it now, and have added comments to
PropertySets.java to make it clearer for others who
may also have questions.


> > --
> > 
> And it includes the properties that are valid for
> all the children element of
> each FO. That is what the large while loop does when
> it calls mergeContent.

OK--but to satisfy the spec, we probably need to add a
static boolean array for the properties, defining
true/false of whether each property is "inheritable". 


Because one can attach an inheritable property to any
FO, regardless of whether or not it is relevant for it
or its children, we don't want to report an error if
the user chooses to do so--we can query this static
array to determine that we should just ignore the
property instead.  I can probably help out with
this--but I'll wait until later when 25873 is
settled--we may end up having this information created
anyway by then.

> , I've concluded that it would also be good design
> decision and I plan
> on updating patch 25873 to show that the
> Property.Maker can become
> simpler and easier to understand as well as faster.
> 

Looking forward to seeing it--I haven't had much time
to look at 25873 yet.  One possible way I see of 
simplifying the makers would be for them to no longer
be a nested inner class of Property, but either part
of the Property class itself or part of its own class.


BTW, when you have write access available, let me know
when you'd like me to step back, I'll happily retreat
back to the renderers and let you take over the
properties at your much faster rate of speed.

Thanks,
Glen




DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2004-01-10 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2004-01-10 13:45 ---
Further comments and discussions here:

  http://marc.theaimsgroup.com/?l=fop-dev&m=107369978306013&w=2
  http://marc.theaimsgroup.com/?l=fop-dev&m=107374163230526&w=2


[Bug 25480] - Experimental performance improvements.

2004-01-10 Thread Finn Bock
[Glen in bugzilla]

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

I'm now looking at the changes to PropertySets.java (already applied) and 
PropertyList.java (only partly so) and have a few questions:

1.) For the new PropertyList constructor (in the patch), you appear to be 
duplicating the element ID argument, once as "el", the other time 
as "elementId"--just to confirm, they are referring to the same thing (and 
hence one of them can be removed)?
Yes, it is a silly leftover from my own conversion process.



2.) In PropertySets.java (already applied), method makeSparseIndices, you 
define indices[0] as: 

indices[0] = (short) (set.cardinality() + 1);

Later, in PropertyList, you initialize the values array as follows:

this.values = new Property[indices[0]];

I think we can then just use set.cardinality() in makeSparseIndices(), 
correct?  (i.e., leave out the +1).
No, the + 1 is a deliberate trick to handle unknown properties, which
should return a null value during lookup(). The other part of the trick
is the
   int j = 1;

in makeSparseIndices(), which ensures that all the unknown properties in an
indices array have a value of '0' and all known properties have a
value > 0. For an element fo:bar which supports 2 properties foo=21 and
baz=137, the indices array has the values
   indices[21] = 1
   indices[137] = 2
and all other values are 0. The PropertyList.values array then looks
like this:
   values[0] = null // always null.
   values[1] = reference to a 'foo' Property instance
   values[2] = reference to a 'baz' Property instance
In the performance sensitive PropertyList.lookup(), the code doesn't
have to test for properties that are unknown by fo:bar
   return values[indices[propertyName]];

because all unknown properties map to the values[0] index, which always
has a null value.
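
A self-contained sketch of the trick described above, using the fo:bar example with foo=21 and baz=137. The class names SparseLookupSketch and Prop, the put() method and the upper bound of 200 property ids are assumptions for illustration; only makeSparseIndices(), lookup(), indices and values echo names from the patch.

```java
import java.util.BitSet;

/** Illustrative only: mimics the sparse-index trick, not the actual patch code. */
final class SparseLookupSketch {
    /** Hypothetical stand-in for a Property; it only carries a label. */
    static final class Prop {
        final String text;
        Prop(String text) { this.text = text; }
    }

    final short[] indices;
    final Prop[] values;

    SparseLookupSketch(short[] indices) {
        this.indices = indices;
        this.values = new Prop[indices[0]]; // values[0] is never assigned
    }

    /** Property ids are assumed to run from 1 to maxPropertyId. */
    static short[] makeSparseIndices(BitSet valid, int maxPropertyId) {
        short[] indices = new short[maxPropertyId + 1];
        indices[0] = (short) (valid.cardinality() + 1); // +1 reserves values[0]
        int j = 1; // start at 1, so 0 always means "unknown property"
        for (int id = valid.nextSetBit(1); id >= 0; id = valid.nextSetBit(id + 1)) {
            indices[id] = (short) j++;
        }
        return indices;
    }

    void put(int propertyId, Prop value) {
        int slot = indices[propertyId];
        if (slot == 0) {
            throw new IllegalArgumentException("not valid here: " + propertyId);
        }
        values[slot] = value;
    }

    /** No test for unknown ids: they index values[0], which is always null. */
    Prop lookup(int propertyId) {
        return values[indices[propertyId]];
    }

    public static void main(String[] args) {
        BitSet valid = new BitSet();
        valid.set(21);  // foo
        valid.set(137); // baz
        SparseLookupSketch list = new SparseLookupSketch(makeSparseIndices(valid, 200));
        list.put(21, new Prop("foo"));
        list.put(137, new Prop("baz"));
        System.out.println(list.lookup(21).text);     // known property
        System.out.println(list.lookup(99) == null);  // unknown property
    }
}
```

Leaving out the + 1 would make values one slot too short, and starting j at 0 would make the first known property indistinguishable from an unknown one.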
--

3.) PropertySets.java defines those properties which are valid for each FO--in 
PropertyList, the proposed implementation then uses that information to limit 
the properties that can be assigned to an FObj (i.e., only those defined as 
valid for it.)  Am I correct here on this point?
And it includes the properties that are valid for all the child elements of
each FO. That is what the large while loop does when it calls mergeContent.
It keeps pulling the children's properties into the parent and repeats doing
it for all the FOs until no more properties can be pulled (yes, there has
to be a better way of doing that).
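
The "repeat until nothing more can be pulled" loop is a fixed-point iteration. A minimal sketch, assuming each FO is an index into an array of BitSets and its possible children are listed in a parallel int[][]; MergeSketch and mergeUntilStable() are invented names, not the generated PropertySets code.

```java
import java.util.BitSet;

/** Illustrative fixed-point merge of child property sets into their parents. */
final class MergeSketch {
    /** Ors each child's set into its parent's until a full pass changes nothing. */
    static void mergeUntilStable(BitSet[] props, int[][] children) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int fo = 0; fo < props.length; fo++) {
                for (int child : children[fo]) {
                    int before = props[fo].cardinality();
                    props[fo].or(props[child]);
                    if (props[fo].cardinality() != before) {
                        changed = true; // something was pulled up, so go round again
                    }
                }
            }
        }
    }
}
```

The loop terminates because sets only grow and are bounded by the total number of properties; it also copes with FOs that can contain themselves, such as fo:block inside fo:block.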
... and do we also need to somehow additionally qualify *those* 
properties as "valid for the FO but not directly relevant for it"?
I don't think so, and the patch doesn't do it. PropertySets only returns
the set of properties that are valid for a FO.


4.)  Finally, I'm too far removed from my C programming days to understand the 
math here:

In the PropertyList constructor, you code this:

   this.specified = new int[(indices[0] >> 5) + 1];  

(where indices[0] defines the number of properties valid for the FObj)

Why the bitshifting 5 to the right?  What does this accomplish--what is this 
shorthand for?
The 32 bits that are stored in an int. See below.

also, in putSpecified(int idx, Property value), you code this:

   specified[i >> 5] |= 1 << (i & 31);

I'm not clear what this is doing either.  What does putSpecified() do, and 
what's the point of the i & 31 and the Or'ing?  
PropertyList.specified is just a bitmap of the same size as values.length.
All the shifting and masking is similar to what is done in
java.util.BitSet, inlined in PropertyList for performance.
putSpecified() inserts a property in the array and sets the specified bit to
true. This is in contrast to the inherit() method, which also inserts a
property in the array but doesn't set the specified bit. The other put()
method isn't used anymore.
I also think that inlining the bitset implementation is going too far,
but it does make a performance difference, so I included it in the patch.
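
To make the shifting and masking concrete, here is a small stand-alone sketch of a bitmap packed into an int[]. SpecifiedBitsSketch and its method names are made up; only the two expressions (size >> 5) + 1 and 1 << (i & 31) are taken from the patch as quoted in this thread.

```java
/** Illustrative bitmap packed into an int[]; java.util.BitSet does the same with long[]. */
final class SpecifiedBitsSketch {
    private final int[] words;

    SpecifiedBitsSketch(int size) {
        // size >> 5 equals size / 32 for non-negative sizes: one int holds 32 flags.
        // The + 1 makes room for the flags that do not fill a whole int.
        this.words = new int[(size >> 5) + 1];
    }

    void set(int i) {
        // i >> 5 picks the int, i & 31 (i.e. i % 32) picks the bit inside it,
        // and |= switches that bit on without touching the other 31.
        words[i >> 5] |= 1 << (i & 31);
    }

    boolean get(int i) {
        return (words[i >> 5] & (1 << (i & 31))) != 0;
    }

    int wordCount() {
        return words.length;
    }

    public static void main(String[] args) {
        SpecifiedBitsSketch bits = new SpecifiedBitsSketch(40);
        bits.set(3);
        bits.set(37);
        System.out.println(bits.wordCount());
        System.out.println(bits.get(37) + " " + bits.get(36));
    }
}
```

For 40 flags this allocates two ints (8 bytes) where a boolean[] of the same length would typically take 40 bytes plus the array header.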
[Andreas L. Delmelle]

> IIC the initial strategy WRT inherited properties was to add methods to the
> FObj's to get these from their parent. I think the problem with this
> implementation is that, in the case of very large documents with deeply
> nested elements that inherit a property which is specified at the top-level,
> you would end up with one getter being dispatched to the parent's getter,
> and this in its turn being dispatched to yet another ancestor's getter (or
> Makers in case of Property creation)... In this case, however, I think you
> can't fully 'push' these onto the descendants, as this would lead to absurd
> storage-reqs for quite average-sized documents.
>
> OTOH, the inherited property value (resolved or unresolved) can indeed be
> supposed as available at parse time, because a parent is per se processed
> *before* any of its children.
Absolutely correct, and I'm not at all sure that we should go to an array
based storage of the properties in PropertyList, as I di

[Bug 25480] - [PATCH] Experimental performance improvements.

2004-01-09 Thread Andreas L. Delmelle
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
>
> --- Additional Comments From [EMAIL PROTECTED]  2004-01-10


Glen / Finn,

Hope you don't mind my interrupting here:

Particularly this point I found interesting:

> 3.) PropertySets.java defines those properties which are valid
> for each FO--in PropertyList, the proposed implementation then uses that
> information to limit the properties that can be assigned to an FObj
> (i.e., only those defined as valid for it.)  Am I correct here on this
point?
>
> If so, we may need to expand the "valid" properties to include
> the inheritable ones.
>

> Do we need to expand then the property sets for each FO to include the
> inheritable properties (you may already have done so, I'm not sure if
those
> were included)--and do we also need to somehow additionally qualify
*those*
> properties as "valid for the FO but not directly relevant for it"?  (I
> think "yes" for the first question, "no" for the second.)

IIC the initial strategy WRT inherited properties was to add methods to the
FObj's to get these from their parent. I think the problem with this
implementation is that, in the case of very large documents with deeply
nested elements that inherit a property which is specified at the top-level,
you would end up with one getter being dispatched to the parent's getter,
and this in its turn being dispatched to yet another ancestor's getter (or
Makers in case of Property creation)... In this case, however, I think you
can't fully 'push' these onto the descendants, as this would lead to absurd
storage-reqs for quite average-sized documents.

OTOH, the inherited property value (resolved or unresolved) can indeed be
supposed as available at parse time, because a parent is per se processed
*before* any of its children. I just wonder if this has something to do with
Finn's other idea of moving logic to the property side to save on
unnecessary calls to empty methods ?

> 4.)  Finally, I'm too far removed from my C programming days to understand
the
> math here:
>   this.specified = new int[(indices[0] >> 5) + 1];

He's dividing the value by 32 [or 2^5], right?

> (where indices[0] defines the number of properties valid for the FObj)

> also, in putSpecified(int idx, Property value), you code this:

>   specified[i >> 5] |= 1 << (i & 31);

> I'm not clear what this is doing either.  What does putSpecified() do, and
> what's the point of the i & 31 and the Or'ing?

I *definitely* *have* *to* take a closer look at that code!
I *definitely* *have* *to* take a closer look at that code!
I *definitely* *have* *to* take a closer look at that code!
...

That's all I can add for you now, Glen. Sorry :(

Cheers,

Andreas



DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2004-01-09 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2004-01-10 01:07 ---
Finn,

I'm now looking at the changes to PropertySets.java (already applied) and 
PropertyList.java (only partly so) and have a few questions:

1.) For the new PropertyList constructor (in the patch), you appear to be 
duplicating the element ID argument, once as "el", the other time 
as "elementId"--just to confirm, they are referring to the same thing (and 
hence one of them can be removed)?



2.) In PropertySets.java (already applied), method makeSparseIndices, you 
define indices[0] as: 

indices[0] = (short) (set.cardinality() + 1);

Later, in PropertyList, you initialize the values array as follows:

this.values = new Property[indices[0]];

I think we can then just use set.cardinality() in makeSparseIndices(), 
correct?  (i.e., leave out the +1).

--

3.) PropertySets.java defines those properties which are valid for each FO--in 
PropertyList, the proposed implementation then uses that information to limit 
the properties that can be assigned to an FObj (i.e., only those defined as 
valid for it.)  Am I correct here on this point?

If so, we may need to expand the "valid" properties to include the inheritable 
ones.  As Peter notes on the Alt-Design pages [1], in 5.1.4 of the Spec [2]:  

gives these two statements:

"The inheritable properties can be placed on any formatting object."

"Hence there is always a specified value defined for every inheritable property 
for every formatting object."

[1] http://xml.apache.org/fop/design/alt.design/properties/introduction.html
[2] http://www.w3.org/TR/2001/REC-xsl-20011015/slice5.html#inheritance

Do we need to expand then the property sets for each FO to include the 
inheritable properties (you may already have done so, I'm not sure if those 
were included)--and do we also need to somehow additionally qualify *those* 
properties as "valid for the FO but not directly relevant for it"?  (I 
think "yes" for the first question, "no" for the second.)



4.)  Finally, I'm too far removed from my C programming days to understand the 
math here:

In the PropertyList constructor, you code this:

   this.specified = new int[(indices[0] >> 5) + 1];  

(where indices[0] defines the number of properties valid for the FObj)

Why the bitshifting 5 to the right?  What does this accomplish--what is this 
shorthand for?


also, in putSpecified(int idx, Property value), you code this:

   specified[i >> 5] |= 1 << (i & 31);

I'm not clear what this is doing either.  What does putSpecified() do, and 
what's the point of the i & 31 and the Or'ing?  

Sorry for the long post--feel free to move this to FOP-DEV if easier for you to 
respond.

Thanks as always for your help!
Glen


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-23 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-24 00:08 ---

"It does not get filled at all.

It is there to support an AFAICT unused feature in the foproperties.xml file 
which makes it possible to specify special property makers for each fo:element."


To lessen confusion, I went ahead and removed it from FObj.java and 
PropertyList.java -- we can always bring it back should it become of use in the 
future.

Thanks,
Glen


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-22 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-23 03:39 ---
Finn,

Oops, sorry--My recent check-in of property-sets.xsl--I forgot to give you 
credit for that work.  It was basically a renaming of your mkpropset.xsl, with 
a few trivial cosmetic changes.

Glen


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-22 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-22 22:18 ---
It does not get filled at all.

It is there to support an AFAICT unused feature in the foproperties.xml file 
which makes it possible to specify special property makers for each fo:element.

Take a look at the comment for 'generic-property-list' in 
src/codegen/properties.dtd where the feature is described.

I did not port the codegeneration of this feature to int indexes, so it still 
uses a HashMap in my patch.

regards,
finn


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-22 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-22 22:01 ---
Finn, 

Question on the FOPropertyMapping.java class--from your changes, it is 
dutifully filling up the s_htGeneric (property) HashMap from its 
fo-property-mapping.xsl generation:

s_htGeneric[PR_SOURCE_DOCUMENT] = SourceDocumentMaker.maker(PR_SOURCE_DOCUMENT);
s_htGeneric[PR_ROLE] = RoleMaker.maker(PR_ROLE);
s_htGeneric[PR_ABSOLUTE_POSITION] = AbsolutePositionMaker.maker(PR_ABSOLUTE_POSITION);


I'm already using it.  But there is an s_htElementLists HashMap (which I 
haven't converted to yet) in this class that is *not* being populated here--
you've created accessors for it, etc., but it doesn't seem to get populated 
anywhere.  (Actually, the same goes with its predecessor that I'm still using.) 
Where does this HashMap get filled?

Thanks,
Glen


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-16 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-16 12:14 ---
Excellent...Thanks for the explanation!  I'll look over it tonight.

Glen


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-16 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-16 08:15 ---
The instances of java.util.BitSet that are created for each fo:element in 
PropSets are only used for collecting the properties of that element and the 
properties of all its child elements. The list of child-element properties is 
copied to the parent element by calling the mergeProperties() method repeatedly 
until no more properties can trickle upwards to the root element. (I really 
wanted to calculate the set of properties that can be applied to an element 
statically in the .xsl instead, but I couldn't quite figure out how to do that).

As an example, the BitSet for an element without any child elements consists of 
just the list of properties that apply to the element, while the BitSet for 
fo:root consists of all the properties from the xsl-fo specification.

After the completed BitSets have been calculated, they are turned into an 
array of shorts by makeSparseIndices(). The array is always PROPERTY_COUNT long 
and only the slots with supported properties have a non-zero value.

For fo:root there is no packing, so for all entries "indices[n] = n".
For some other element that supports 4 properties, the indices array might look 
like this:

   indices[0] = 5 // The number of properties that apply, plus 1
   indices[2] = 1
   indices[10] = 2
   indices[142] = 3
   indices[202] = 4

where the rest of the array has 0 values. (It is important that all the 
property identifiers in Constants.java have non-zero values).

The short[] arrays (one for each element type) are stored in the static 
PropSets.mapping and can be retrieved by getPropertySet(elementId).
This method is called from the ctor of PropertyList().

The sparse indices array is then used as an extra level of indirection when 
accessing the PropertyList.values array.

All of this business with sparse indices and PropSets.java is only a memory 
optimization. Allocating a full array 
  values = new Property[PROPERTY_COUNT]
in PropertyList would work just as well, but would use more memory.
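
The indirection described above can be sketched roughly as follows. This is 
only an illustration under stated assumptions, not the code from the patch: 
the class name, the PROPERTY_COUNT value of 250 and the get() helper are 
invented here, and only the layout of the indices array (slot 0 holds the 
count plus one, unsupported ids stay 0) follows the description.

```java
import java.util.BitSet;

/** Hypothetical sketch of the sparse-index indirection; not the patch's actual code. */
final class SparseIndexSketch {
    /** Assumed total number of property ids; id 0 is reserved because real ids are non-zero. */
    static final int PROPERTY_COUNT = 250;

    /**
     * Builds a PROPERTY_COUNT-long table from the set of supported property ids.
     * Slot 0 holds (number of supported properties + 1), each supported id maps
     * to a dense slot 1..n, and unsupported ids stay 0.
     */
    static short[] makeSparseIndices(BitSet supported) {
        short[] indices = new short[PROPERTY_COUNT];
        short next = 1;
        for (int id = supported.nextSetBit(1); id >= 0; id = supported.nextSetBit(id + 1)) {
            indices[id] = next++;
        }
        indices[0] = next;
        return indices;
    }

    /** Looks a value up through the indirection; returns null for unsupported ids. */
    static Object get(short[] indices, Object[] values, int propertyId) {
        int slot = propertyId > 0 ? indices[propertyId] : 0;
        return slot == 0 ? null : values[slot];
    }

    public static void main(String[] args) {
        BitSet supported = new BitSet();
        supported.set(2);
        supported.set(10);
        supported.set(142);
        supported.set(202);
        short[] indices = makeSparseIndices(supported);
        Object[] values = new Object[indices[0]];   // 5 slots instead of PROPERTY_COUNT
        values[indices[142]] = "12pt";
        System.out.println(get(indices, values, 142));   // prints 12pt
        System.out.println(get(indices, values, 3));     // prints null
    }
}
```

The short[] table is shared by every instance of an element type, so the 
per-instance cost is only the small values array.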


Re: (Victor et al) Re: Performance improvements.

2003-12-15 Thread Peter B. West
Victor Mote wrote:
> Peter B. West wrote:
>
> > If we go towards integer representation, properties in the API will
> > always be represented by integers.  By looking at this particular
> > signature, we are not locking ourselves in.  We can add other signatures
> > if the need arises, but they can be extensions of the basic call.
> >
> > The above call does not return an int or an Integer, but a PropertyValue.
> >
> > public PropertyValue getPropertyValue(int property)
> >
> > is, in fact, the signature from FONode.java in alt.design.
>
> OK. I'll interpret this as a firm -1 on my API proposal, which is sufficient
> to deep-six it. I think it will be a net benefit for the project for me to
> withdraw from the remainder of the Properties discussion.

Victor,

I thought that my input on this question was primarily informational. 
If I wanted to vote -1 on your proposal, I would.  I did not, and still 
do not, intend to vote on that proposal, because 1) my primary 
involvement is with alt.design, and 2) I don't understand it.  (I will 
post again shortly on the slowness of my understanding in general.)  If 
your proposal is deep sixed, it will be by those who are more intimately 
involved in HEAD, and who fully understand what you are trying to 
achieve.  My comments about the use of integers are based entirely on my 
experience with alt.design, which I thought might be helpful in coming 
up with a modified properties handler in HEAD.

Peter
--
Peter B. West 


Re: (Victor et al) Re: Performance improvements.

2003-12-15 Thread Peter B. West
Victor Mote wrote:
> Peter B. West wrote:
>
> > Glen Mazza wrote:
> > > --- "Peter B. West" <[EMAIL PROTECTED]> wrote:
> > >
> > >>See above.  In alt.design, all compounds are
> > >>shorthands, and all
> > >>shorthands are presumed to have been resolved into
> > >>their components
> > >>during FO Tree building.
> > >
> > > BTW, does Alt-Design already resolve the cases where
> > > *both* a shorthand and one of its included components
> > > are specified?  I.e., (usually, I believe), disregard
> > > the shorthand value for that component and use its
> > > explicitly given value instead?
> >
> > It's in the ordering of the properties.  There is a simple for loop to
> > process properties (attributes, in fact) defined on an FO.  The order of
> > definition ensures the correct order of evaluation - shorthand first,
> > then any individual properties.  The same goes for corresponding
> > properties, when I get around to doing them.  Check the order of
> > properties in PropNames.java.
>
> I don't know how to reconcile this with your previous posting:
>
> > See above.  In alt.design, all compounds are shorthands, and all
> > shorthands are presumed to have been resolved into their components
> > during FO Tree building.
>
> The previous posting seems to indicate that the properties have been
> decomposed, the chronologically later posting seems to indicate that the
> shorthand retains its shorthand character. The critical thing here is that
> the API not even allow anybody to ask about shorthand or compound
> properties. The API should allow requests only for the decomposed
> properties, and the FO Tree should resolve all of this before passing a
> value back.

The ordering is necessary to ensure that the FO attributes are processed
in the correct sequence.  When the properties on a particular node and
its subtree are fully parsed, the shorthands and compounds will have
been expanded, and the minimal property set for the stable FO tree is
constructed.  See makeSparsePropsSet() in FONode.java at
http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-fop/src/java/org/apache/fop/fo/FONode.java?content-type=text%2Fplain&rev=1.2.2.1

Within getPropertyValue() these situations are discriminated, according
to the comments on

private PropertyValue[] propertySet;

When the properties for this value are still being resolved,
getPropertyValue() will resolve inheritance and initial values.  During
this process, shorthands and compounds will be resolved.  At the time
the subtree of the FONode has been constructed, the property set of the
node will be reduced (by makeSparsePropsSet()), and the shorthands and
compounds will be eliminated.  A call to getPropertyValue() specifying
such a property will presently, IIRC, throw an exception.  If necessary,
that can be adjusted to return a NoType PropertyValue, but the
assumption is that by this stage, anything calling for a resolved
property will be aware of which properties are valid on the node.

In either case, any other method which requests such a value is going to
get short shrift.  I think that covers the last couple of sentences in
your comment above.

Basically, shorthands, etc, get resolved and eliminated from the valid
property set.
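
A rough sketch of the two phases described above, assuming a node that keeps 
everything in a map while its subtree is being built and then reduces itself 
to a compact array. None of this is taken from FONode.java: the class, the 
property ids and the use of String values are stand-ins chosen only to show 
the behaviour (lookups succeed during building, and asking for an eliminated 
shorthand after the reduction throws).

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-phase property holder; not the alt.design implementation. */
final class TwoPhaseNodeSketch {
    static final int FONT = 1;        // hypothetical shorthand id
    static final int FONT_SIZE = 2;   // hypothetical component id

    private Map<Integer, String> building = new HashMap<>();  // used while the subtree is built
    private final BitSet valid = new BitSet();                // ids kept after the reduction
    private String[] propertySet;                             // null until the reduction has run

    void set(int id, String value) {
        building.put(id, value);
    }

    /** Called once the subtree is complete: keep everything except the shorthands. */
    void makeSparse(BitSet shorthands) {
        for (int id : building.keySet()) {
            if (!shorthands.get(id)) {
                valid.set(id);
            }
        }
        propertySet = new String[valid.cardinality()];
        int slot = 0;
        for (int id = valid.nextSetBit(0); id >= 0; id = valid.nextSetBit(id + 1)) {
            propertySet[slot++] = building.get(id);
        }
        building = null;
    }

    String getPropertyValue(int id) {
        if (propertySet == null) {
            return building.get(id);              // still building: everything is visible
        }
        if (!valid.get(id)) {
            throw new IllegalArgumentException("property " + id + " is not valid on this node");
        }
        int slot = valid.get(0, id).cardinality();  // rank of id among the valid ids
        return propertySet[slot];
    }
}
```
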
Peter
--
Peter B. West 



RE: (Victor et al) Re: Performance improvements.

2003-12-15 Thread Victor Mote
Peter B. West wrote:

> If we go towards integer representation, properties in the API will
> always be represented by integers.  By looking at this particular
> signature, we are not locking ourselves in.  We can add other signatures
> if the need arises, but they can be extensions of the basic call.
>
> The above call does not return an int or an Integer, but a PropertyValue.
>
> public PropertyValue getPropertyValue(int property)
>
> is, in fact, the signature from FONode.java in alt.design.

OK. I'll interpret this as a firm -1 on my API proposal, which is sufficient
to deep-six it. I think it will be a net benefit for the project for me to
withdraw from the remainder of the Properties discussion.

Victor Mote



Re: (Victor et al) Re: Performance improvements.

2003-12-15 Thread Peter B. West
Victor Mote wrote:
> Peter B. West wrote:
>
> > Victor Mote wrote:
> > > Yes, yes, yes. Since we are trying to eliminate the need for unnecessary
> > > object creation, why create a MinOptMax object? Why not just return the
> > > three resolved values in separate methods. Anything that uses the
> > > information will have to separately address the 3 pieces of data anyway,
> > > so I don't see any advantage to packaging them in an object.
> >
> > In fact, in alt.design, getPropertyValue(PropNames.LEADER_LENGTH_MINIMUM)
> > etc.
> >
> > Speaking of the minimal API, why not PropertyValue getPropertyValue(int
> > propertyIndex)?  This is presumably defined on FONode or some such, and
> > FONode also presumably knows how to navigate the FO tree and related
> > Area tree in order to resolve percentages and the like.
>
> That may be OK for the internal-to-FO-Tree logic, but I don't think is
> suitable for the API, primarily because you are locked into one signature
> for all properties. 1) There may be some properties for which we want to
> return a non-integer value (Font object, for example). 2) There may be some
> properties for which we wish to pass more information. 3) You are going to
> have some ugly case or switch logic that has to determine which property is
> being dealt with. My opinion is that this is not a sound approach.

If we go towards integer representation, properties in the API will 
always be represented by integers.  By looking at this particular 
signature, we are not locking ourselves in.  We can add other signatures 
if the need arises, but they can be extensions of the basic call.

The above call does not return an int or an Integer, but a PropertyValue.

public PropertyValue getPropertyValue(int property)

is, in fact, the signature from FONode.java in alt.design.
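
To illustrate the point about the return type, here is a minimal sketch of how 
a single int-keyed getter can hand back different kinds of value. Only the 
method signature comes from the message above; the value classes, the property 
indices and the Node stand-in are invented for this example and are not the 
alt.design types.

```java
/** Hypothetical sketch: one int-keyed getter returning different value types. */
final class PropertyValueSketch {
    // Hypothetical property indices, for illustration only.
    static final int FONT_FAMILY = 1;
    static final int LEADER_LENGTH_MINIMUM = 2;
    static final int PROPERTY_COUNT = 3;

    /** Common supertype returned by the single getter. */
    interface PropertyValue { }

    static final class Length implements PropertyValue {
        final int millipoints;
        Length(int millipoints) { this.millipoints = millipoints; }
    }

    static final class FontFamily implements PropertyValue {
        final String name;
        FontFamily(String name) { this.name = name; }
    }

    /** Stand-in for an FO node that stores resolved values by integer index. */
    static final class Node {
        private final PropertyValue[] values = new PropertyValue[PROPERTY_COUNT];

        void set(int property, PropertyValue value) { values[property] = value; }

        public PropertyValue getPropertyValue(int property) { return values[property]; }
    }

    public static void main(String[] args) {
        Node node = new Node();
        node.set(FONT_FAMILY, new FontFamily("serif"));
        node.set(LEADER_LENGTH_MINIMUM, new Length(12000));
        PropertyValue v = node.getPropertyValue(FONT_FAMILY);
        if (v instanceof FontFamily) {
            System.out.println(((FontFamily) v).name);   // prints serif
        }
    }
}
```

The property is named by an integer, but the value that comes back is an 
object whose concrete type depends on the property.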

Peter
--
Peter B. West 


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-15 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-15 21:59 ---
Finn,

Thanks for your explanation.

More questions--I've just generated the PropSets.java, but am not sure what 
this is needed for.

1.)  As Alt-Design does (but for different objects), you're creating many 
BitSets that define the properties that are relevant for each FO--but I can't 
see these BitSets in use anywhere within your patch.  Am I missing those 
locations where you are employing these for use, or are you creating these with 
the knowledge that they will be used later?  If the latter, how are you 
anticipating their usage?

2.)  What is the purpose of the makeSparseIndices() method in PropSets?

3.)  What is the purpose of the mergeContent() method in PropSets?

Thanks,
Glen


RE: (Victor et al) Re: Performance improvements.

2003-12-15 Thread Glen Mazza
--- Victor Mote <[EMAIL PROTECTED]> wrote:
> The previous posting seems to indicate that the
> properties have been
> decomposed, the chronologically later posting seems
> to indicate that the
> shorthand retains its shorthand character. 

I think Peter and I were talking about inner-to-FO
properties resolution, not FO-to-Area or -layout.

> The
> critical thing here is that
> the API not even allow anybody to ask about
> shorthand or compound
> properties. The API should allow requests only for
> the decomposed
> properties, and the FO Tree should resolve all of
> this before passing a
> value back.
> 

Yes, for FO--> onwards.

Glen



RE: (Victor et al) Re: Performance improvements.

2003-12-15 Thread Victor Mote
Peter B. West wrote:

> Glen Mazza wrote:
> > --- "Peter B. West" <[EMAIL PROTECTED]> wrote:
> >
> >>See above.  In alt.design, all compounds are
> >>shorthands, and all
> >>shorthands are presumed to have been resolved into
> >>their components
> >>during FO Tree building.
> >>
> >
> >
> > BTW, does Alt-Design already resolve the cases where
> > *both* a shorthand and one of its included components
> > are specified?  I.e., (usually, I believe), disregard
> > the shorthand value for that component and use its
> > explicitly given value instead?
> >
>
> It's in the ordering of the properties.  There is a simple for loop to
> process properties (attributes, in fact) defined on an FO.  The order of
> definition ensures the correct order of evaluation - shorthand first,
> then any individual properties.  The same goes for corresponding
> properties, when I get around to doing them.  Check the order of
> properties in PropNames.java.

I don't know how to reconcile this with your previous posting:

> See above.  In alt.design, all compounds are shorthands, and all
> shorthands are presumed to have been resolved into their components
> during FO Tree building.

The previous posting seems to indicate that the properties have been
decomposed, the chronologically later posting seems to indicate that the
shorthand retains its shorthand character. The critical thing here is that
the API not even allow anybody to ask about shorthand or compound
properties. The API should allow requests only for the decomposed
properties, and the FO Tree should resolve all of this before passing a
value back.

Victor Mote



RE: (Victor et al) Re: Performance improvements.

2003-12-15 Thread Victor Mote
Peter B. West wrote:

> Victor Mote wrote:
> > Yes, yes, yes. Since we are trying to eliminate the need for unnecessary
> > object creation, why create a MinOptMax object? Why not just return the
> > three resolved values in separate methods. Anything that uses the
> > information will have to separately address the 3 pieces of
> data anyway, so
> > I don't see any advantage to packaging them in an object.
> >
>
> In fact, in alt.design, getPropertyValue(PropNames.LEADER_LENGTH_MINIMUM)
> etc.
>
> Speaking of the minimal API, why not PropertyValue getPropertyValue(int
> propertyIndex)?  This is presumably defined on FONode or some such, and
> FONode also presumably knows how to navigate the FO tree and related
> Area tree in order to resolve percentages and the like.

That may be OK for the internal-to-FO-Tree logic, but I don't think is
suitable for the API, primarily because you are locked into one signature
for all properties. 1) There may be some properties for which we want to
return a non-integer value (Font object, for example). 2) There may be some
properties for which we wish to pass more information. 3) You are going to
have some ugly case or switch logic that has to determine which property is
being dealt with. My opinion is that this is not a sound approach.

Victor Mote



DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-15 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-15 06:58 ---
The compound properties are shifted so that both base and compound can be 
stored into a single int. In PropertyList there are several cases where the 
base and the compound part are unmasked. This was just one possible way of 
keeping the HEAD design where the compound properties are stored as sub components.
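
As a small sketch of that packing: the three mask constants and the C_LENGTH 
and C_MINIMUM values below are the ones quoted elsewhere in this thread, while 
the base id PR_LEADER_LENGTH = 123 and the helper methods are made up for the 
example and are not taken from PropertyList.

```java
/** Hypothetical sketch of packing a base property id and a compound part into one int. */
final class CompoundIdSketch {
    // Mask constants and compound ids as quoted in the thread.
    static final int COMPOUND_SHIFT = 9;
    static final int PROPERTY_MASK = (1 << COMPOUND_SHIFT) - 1;   // low 9 bits: base property id
    static final int COMPOUND_MASK = ~PROPERTY_MASK;              // remaining bits: compound part
    static final int C_LENGTH = 4 << COMPOUND_SHIFT;
    static final int C_MINIMUM = 6 << COMPOUND_SHIFT;

    // Hypothetical base property id, chosen only for this example.
    static final int PR_LEADER_LENGTH = 123;

    /** Combines a base id (below 512) with an already shifted compound id. */
    static int pack(int baseId, int compoundId) {
        return baseId | compoundId;
    }

    /** Recovers the base property id. */
    static int basePart(int packed) {
        return packed & PROPERTY_MASK;
    }

    /** Recovers the (still shifted) compound id. */
    static int compoundPart(int packed) {
        return packed & COMPOUND_MASK;
    }

    public static void main(String[] args) {
        int packed = pack(PR_LEADER_LENGTH, C_MINIMUM);
        System.out.println(basePart(packed) == PR_LEADER_LENGTH);   // prints true
        System.out.println(compoundPart(packed) == C_MINIMUM);      // prints true
    }
}
```

With a shift of 9 the low bits leave room for base ids up to 511, and the 
compound part never collides with them.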

Without having looked at it in detail, I like the alt.design approach better, 
with its flat property space.

finn


Re: (Victor et al) Re: Performance improvements.

2003-12-14 Thread Glen Mazza
Good thing I asked--I may need to do that with the
Constants interface, where the ordering is currently
alphabetical.  

Glen

--- "Peter B. West" <[EMAIL PROTECTED]> wrote:
> 
> It's in the ordering of the properties.  There is a
> simple for loop to 
> process properties (attributes, in fact) defined on
> an FO.  The order of 
> definition ensures the correct order of evaluation -
> shorthand first, 
> then any individual properties.  The same goes for
> corresponding 
> properties, when I get around to doing them.  Check
> the order of 
> properties in PropNames.java.
> 
> Peter
> -- 
> Peter B. West
> 
> 




DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-14 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-15 01:25 ---
I brought in the Constants information for a start, and will be continuing to 
analyze the rest of your patch as well as Alt-Design until we're completely 
over to integer constants.  Thanks for the fine work!

Glen


Re: (Victor et al) Re: Performance improvements.

2003-12-14 Thread Peter B. West
Glen Mazza wrote:
> --- "Peter B. West" <[EMAIL PROTECTED]> wrote:
>
> > See above.  In alt.design, all compounds are
> > shorthands, and all 
> > shorthands are presumed to have been resolved into
> > their components 
> > during FO Tree building.
>
> BTW, does Alt-Design already resolve the cases where
> *both* a shorthand and one of its included components
> are specified?  I.e., (usually, I believe), disregard
> the shorthand value for that component and use its
> explicitly given value instead?

It's in the ordering of the properties.  There is a simple for loop to 
process properties (attributes, in fact) defined on an FO.  The order of 
definition ensures the correct order of evaluation - shorthand first, 
then any individual properties.  The same goes for corresponding 
properties, when I get around to doing them.  Check the order of 
properties in PropNames.java.
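
A minimal sketch of that idea, assuming property ids are assigned so that a 
shorthand always has a lower index than its components. The ids, the 
two-component "padding" expansion and the String values are simplifications 
invented here; this is not the PropNames.java ordering or the alt.design loop.

```java
/** Hypothetical sketch: processing in index order lets explicit components override a shorthand. */
final class ShorthandOrderSketch {
    // Hypothetical ids: the shorthand sorts before its components.
    static final int PADDING = 1;
    static final int PADDING_TOP = 2;
    static final int PADDING_BOTTOM = 3;
    static final int PROPERTY_COUNT = 4;

    /**
     * specified[id] is the attribute text given on the FO, or null if absent.
     * Returns the resolved values; the shorthand slot itself stays null.
     */
    static String[] resolve(String[] specified) {
        String[] resolved = new String[PROPERTY_COUNT];
        for (int id = 1; id < PROPERTY_COUNT; id++) {
            String value = specified[id];
            if (value == null) {
                continue;
            }
            if (id == PADDING) {
                // Expand the shorthand first ...
                resolved[PADDING_TOP] = value;
                resolved[PADDING_BOTTOM] = value;
            } else {
                // ... so that a later, explicitly given component overwrites it.
                resolved[id] = value;
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        String[] specified = new String[PROPERTY_COUNT];
        specified[PADDING] = "2pt";
        specified[PADDING_TOP] = "5pt";
        String[] resolved = resolve(specified);
        System.out.println(resolved[PADDING_TOP]);      // prints 5pt
        System.out.println(resolved[PADDING_BOTTOM]);   // prints 2pt
    }
}
```
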

Peter
--
Peter B. West 


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-14 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-14 23:02 ---
Finn,

The property constants file that your version will generate defines three 
constants as follows:

// Masks
int COMPOUND_SHIFT = 9;
int PROPERTY_MASK = (1 << COMPOUND_SHIFT)-1;
int COMPOUND_MASK = ~PROPERTY_MASK;

We see them at work later in the Constants file w.r.t. compound properties:

int C_BLOCK_PROGRESSION_DIRECTION = 1 << COMPOUND_SHIFT;
int C_CONDITIONALITY = 2 << COMPOUND_SHIFT;
int C_INLINE_PROGRESSION_DIRECTION = 3 << COMPOUND_SHIFT;
int C_LENGTH = 4 << COMPOUND_SHIFT;
int C_MAXIMUM = 5 << COMPOUND_SHIFT;
int C_MINIMUM = 6 << COMPOUND_SHIFT;

If I recall my C programming days correctly, you're doing a bitwise left shift 
by 9 bits for these constants. What's the benefit of shifting these compound 
constant values? Can you point me to a place in your patch where you take 
advantage of this shifting (e.g., masking, quick calculations of anything, 
etc.)?  I will add comments accordingly.

Thanks,
Glen


Re: (Victor et al) Re: Performance improvements.

2003-12-14 Thread Glen Mazza
--- "Peter B. West" <[EMAIL PROTECTED]> wrote:
> 
> See above.  In alt.design, all compounds are
> shorthands, and all 
> shorthands are presumed to have been resolved into
> their components 
> during FO Tree building.
> 

BTW, does Alt-Design already resolve the cases where
*both* a shorthand and one of its included components
are specified?  I.e., (usually, I believe), disregard
the shorthand value for that component and use its
explicitly given value instead?

Thanks,
Glen




Re: (Victor et al) Re: Performance improvements.

2003-12-14 Thread Peter B. West
Victor Mote wrote:
> Peter B. West wrote:
>
> > > It would be really nice to have a getLeaderLength()
> > > which returns a MinOptMax. This means getLeaderLength()
> > > has to:
> > >  - resolve percentages and functions
> > >  - deal with the leader-length shorthand setting before this
> > >  - deal with inheritance (n/a here, fortunately)
> >
> > Or getLeaderLengthMin(), getLeaderLengthOpt(), getLeaderLengthMax(),
> > with all values resolved.
>
> Yes, yes, yes. Since we are trying to eliminate the need for unnecessary
> object creation, why create a MinOptMax object? Why not just return the
> three resolved values in separate methods. Anything that uses the
> information will have to separately address the 3 pieces of data anyway, so
> I don't see any advantage to packaging them in an object.

In fact, in alt.design, getPropertyValue(PropNames.LEADER_LENGTH_MINIMUM)
etc.

Speaking of the minimal API, why not PropertyValue getPropertyValue(int 
propertyIndex)?  This is presumably defined on FONode or some such, and 
FONode also presumably knows how to navigate the FO tree and related 
Area tree in order to resolve percentages and the like.

> > > One of the complications in the maintenance code is that
> > > the code in the FO layout routines had to deal with resolving
> > > percentages. OTOH, the generator is mainly so ugly because
> > > Keiron et al. tried hard to press the shorthand handling
> > > into a common scheme. There should be better solutions for
> > > either problem.
>
> Nobody but the FO Tree should ever have to think about compound or shorthand
> properties. AFAICT, all examples of these can be decomposed into their
> components. The FO Tree's API should deliver only the decomposed (i.e.
> lowest-common-denominator or LCD) values. And yes, definitely, percentages
> should be handled on the FO Tree side. Make FO Tree do all of that work.

See above.  In alt.design, all compounds are shorthands, and all 
shorthands are presumed to have been resolved into their components 
during FO Tree building.

Peter
--
Peter B. West 


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-13 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-14 02:24 ---
Looking at your patch currently...

Glen


RE: (Victor et al) Re: Performance improvements.

2003-12-13 Thread Victor Mote
J.Pietschmann wrote:

> If we are at it, I'd vote for dumping generating the property
> classes and check the java files into CVS.

+1. I have noted Finn's and Glen's subsequent objections, and Joerg's
subsequent comments. I agree that the general need for that level of
flexibility has passed, and that these things *should* be rewritten in a
more OO way. I may be wrong, but I think most of these classes will
disappear after this stuff is properly rationalized. The vast majority of
the values can be reduced to primitive data types that can be stored
directly in FO Object instances.

I think one other advantage of the generated code was that it was easier to
deal with whether a property was supported or not. However, support for
properties now needs to be handled by the LayoutStrategy implementations.
IOW, as far as FO Tree is concerned, it handles all objects and properties,
and should remain agnostic about how that information may or may not be
used.

Victor Mote



RE: (Victor et al) Re: Performance improvements.

2003-12-13 Thread Victor Mote
Peter B. West wrote:

> > It would be really nice to have a getLeaderLength()
> > which returns a MinOptMax. This means getLeaderLength()
> > has to:
> >  - resolve percentages and functions
> >  - deal with the leader-length shorthand setting before this
> >  - deal with inheritance (n/a here, fortunately)
>
> Or getLeaderLengthMin(), getLeaderLengthOpt(), getLeaderLengthMax(),
> with all values resolved.

Yes, yes, yes. Since we are trying to eliminate the need for unnecessary
object creation, why create a MinOptMax object? Why not just return the
three resolved values in separate methods. Anything that uses the
information will have to separately address the 3 pieces of data anyway, so
I don't see any advantage to packaging them in an object.


> > One of the complications in the maintenance code is that
> > the code in the FO layout routines had to deal with resolving
> > percentages. OTOH, the generator is mainly so ugly because
> > Keiron et al. tried hard to press the shorthand handling
> > into a common scheme. There should be better solutions for
> > either problem.

Nobody but the FO Tree should ever have to think about compound or shorthand
properties. AFAICT, all examples of these can be decomposed into their
components. The FO Tree's API should deliver only the decomposed (i.e.
lowest-common-denominator or LCD) values. And yes, definitely, percentages
should be handled on the FO Tree side. Make FO Tree do all of that work.

Victor Mote



RE: (Victor et al) Re: Performance improvements.

2003-12-13 Thread Victor Mote
Glen Mazza wrote:

> Thanks, Finn and John Austin, for your efforts here.

Yes, and Peter, too.

> Victor, not to rush you, but how agreeable are you in
> general on switching to integer enumerations for FOP
> properties?  (Given Alt-Design, Peter obviously
> approves.)  I checked Alt-Design's PropNames.java[1]
> and liked what I saw.  It doesn't necessarily have to
> be that particular design, just the idea of integer
> constants in general.

That is fine. I hope nothing I have said would be interpreted to the
contrary, and I can't imagine why you might think that this proposition
rushes me (were you waiting on me for something?). I have the least
knowledge of or interest in the performance side. Having said that, I *very*
much want us to implement a lowest-common-denominator,
it-will-never-matter-what-the-guts-look-like API for the FO Tree to deliver
the property values. That is totally independent of the way properties are
stored and computed.

Victor Mote



Re: (Victor et al) Re: Performance improvements.

2003-12-13 Thread Finn Bock
I like the generation process as it allowed me to try out and 
experiment with different optimizations. I don't think that I 
realistically could have added caching of compound properties or changed 
the abs2rel/rel2abs code if I had to change the Maker classes manually.

[J.Pietschmann]

If it's common code, that's what class hierarchies and
inheritance are made for.

Indeed, but then I think you are talking about a cleaner, rewritten 
property handling class hierarchy, right?

But in what we have generated now, the similarities aren't handled by 
inheritance. So there is a certain amount of repeated-but-not-equal code 
in the Maker classes.

For instance, the 22 makers with isCorrespondingForced() methods 
generate this kind of code in HEAD:

public boolean isCorrespondingForced(PropertyList propertyList) {
    FObj parentFO = propertyList.getParentFObj();
    StringBuffer sbExpr = new StringBuffer();
    sbExpr.setLength(0);
    sbExpr.append("margin-");
    sbExpr.append(propertyList.wmRelToAbs(PropertyList.START));
    if (propertyList.getExplicit(sbExpr.toString()) != null)
        return true;
    return false;
}

and my optimized code looks like this:

public boolean isCorrespondingForced(PropertyList propertyList) {
    FObj parentFO = propertyList.getParentFObj();
    if (propertyList.getExplicit(propertyList.wmMap(
            Constants.P_MARGIN_LEFT,
            Constants.P_MARGIN_RIGHT,
            Constants.P_MARGIN_TOP)) != null)
        return true;
    return false;
}
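
The thread does not show wmMap() itself. As a guess at what such a helper 
does, the sketch below assumes it picks one of its three arguments according 
to the writing mode (lr-tb, rl-tb, tb-rl), which is where the "start" edge 
falls on the left, right and top respectively. The writing-mode codes, the 
property id values and the explicit writingMode parameter are all invented 
for this example.

```java
/** Hypothetical sketch of a writing-mode selector; the real wmMap() is not shown in the thread. */
final class WritingModeMapSketch {
    // Hypothetical writing-mode codes.
    static final int LR_TB = 0;
    static final int RL_TB = 1;
    static final int TB_RL = 2;

    // Hypothetical property ids standing in for the Constants.P_MARGIN_* values.
    static final int P_MARGIN_LEFT = 10;
    static final int P_MARGIN_RIGHT = 11;
    static final int P_MARGIN_TOP = 12;

    /** Assumed behaviour: return the id that applies in the given writing mode. */
    static int wmMap(int writingMode, int lrtb, int rltb, int tbrl) {
        switch (writingMode) {
            case LR_TB: return lrtb;
            case RL_TB: return rltb;
            case TB_RL: return tbrl;
            default: throw new IllegalArgumentException("unknown writing mode " + writingMode);
        }
    }

    public static void main(String[] args) {
        // The "start" margin in a right-to-left document is the right margin.
        int id = wmMap(RL_TB, P_MARGIN_LEFT, P_MARGIN_RIGHT, P_MARGIN_TOP);
        System.out.println(id == P_MARGIN_RIGHT);   // prints true
    }
}
```

Under that assumption the lookup needs no StringBuffer and no string 
comparison, only an integer selection.
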
Another optimization that I would like to try out involves creating a 
"copy" of the PropertyList.findProperty() method in the Maker classes, 
but one that doesn't call isCorrespondingForced() if the maker knows 
that it always returns false and doesn't call getExplicit() if there aren't 
any shorthands defined. Such an experiment is downright impossible if 
the Maker isn't generated.

regards,
finn


Re: (Victor et al) Re: Performance improvements.

2003-12-13 Thread J.Pietschmann
Glen Mazza wrote:
-1.  I'd like to hold off on this, at least until I
can gain a better understanding of the autogenerated
code.  I may still to the same conclusion as the other
committers, but Finn's endorsement of the XSLT--as
well as the long work of those like Keiron who have
worked with the XSLT files--suggests that there are
significant time benefits to using them.  (At work, I
use "SQL to write SQL" all the time, and love the time
efficiencies that result.)
Well, the XSLT generation of the Java classes was good
for bootstrap, because the properties were gained from
the XML source of the spec. The FO java classes were
bootstrapped this way too. This came in handy while the spec
was in flux and properties and elements were added and
changed. Remember, FOP tracked the spec from the early
development, and some bugs like the white-space-collapse
peculiarities are leftovers from this phase.
Unfortunately, generating has meanwhile become more of a
burden, because if you look at it in detail, there are
very, very few properties which are handled identically.
Catering for the fine differences has led to many ugly
hooks in the presumably generic code (have a look at
GenericShorthandParser, which isn't generic enough to
parse font shorthand properties).
[Actually, I'm looking forward to studying the XSLT
that generates these files--as I mentioned to Clay
that CVS and Ant were two of the initial benefits you
get by working on FOP, apparently being able to write
Java code using XSLT is a third one...i.e., Yeehaw!,
as I believe he had put it... ;)]
Well, I fell for the trap too...I'm all in for code generators,
and I regularly use some and write some for myself. However,
code generation has to have benefits, and if I have to provide
183 choose cases for individual code for (a fixed number of)
185 items, there is no longer any reasonable benefit. A class
hierarchy, and some proper abstractions should be enough to
avoid code duplication.
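
A minimal sketch of the class-hierarchy alternative, assuming hand-written makers where the common default lives in a base class and only the properties with a corresponding property override it. All names here (PropertyMaker, CorrespondingPropertyMaker, ExplicitLookup) are hypothetical and are not the classes in HEAD.

```java
// Hypothetical sketch of hand-written makers sharing code through inheritance.

/** Assumed minimal view of a property list: was this property id set explicitly? */
interface ExplicitLookup {
    boolean isExplicit(int propId);
}

abstract class PropertyMaker {
    private final int propId;

    PropertyMaker(int propId) {
        this.propId = propId;
    }

    int getPropId() {
        return propId;
    }

    /** Default for most properties: there is no corresponding property. */
    boolean isCorrespondingForced(ExplicitLookup explicit) {
        return false;
    }
}

/** Shared behaviour for the makers that do have a corresponding property. */
class CorrespondingPropertyMaker extends PropertyMaker {
    private final int correspondingId;

    CorrespondingPropertyMaker(int propId, int correspondingId) {
        super(propId);
        this.correspondingId = correspondingId;
    }

    @Override
    boolean isCorrespondingForced(ExplicitLookup explicit) {
        // Forced when the corresponding property was specified explicitly.
        return explicit.isExplicit(correspondingId);
    }
}
```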
If I had time I'd even rewrite the property code nearly from
scratch, in order to
- provide a proper property expression tree
- deal with shorthands and font family selections in a
 less convoluted way
- perhaps use a grammar driven parser for property
 expressions
The intermingling of (improper) tokenizing, top-down parsing
and error handling for property expressions as well as the
improper reuse of tokenizing for shorthand parsing (despite
it being a completely different grammar) was always enough to
drive my blood pressure through the roof.
J.Pietschmann



Re: (Victor et al) Re: Performance improvements.

2003-12-13 Thread John Austin
I haven't looked at the XSLT code but I have a question
in my mind that I need to answer about it.

I wonder what it is that is being generated and what were 
the design alternatives to the codegen implementation.

One question that popped in to my head was:

Is there 'missing polymorphism' here ?

As I said, I only have the question at this time. 

On Sat, 2003-12-13 at 12:12, Glen Mazza wrote:
> -1.  I'd like to hold off on this, at least until I
> can gain a better understanding of the autogenerated
> code.  I may still come to the same conclusion as the other
> committers, but Finn's endorsement of the XSLT--as
> well as the long work of those like Keiron who have
> worked with the XSLT files--suggests that there are
> significant time benefits to using them.  (At work, I
> use "SQL to write SQL" all the time, and love the time
> efficiencies that result.)
> 
> If we check in the Java code, then changes may end up
> being made to those files directly, which will result
> in the XSLT files becoming unregeneratable.  Or, every
> run of the XSLT will require re-modification of the
> changes made manually to all the Java
> files--potentially dozens--100's of files.  So I'm
> kind of leery about doing this at the moment.
> 
> [Actually, I'm looking forward to studying the XSLT
> that generates these files--as I mentioned to Clay
> that CVS and Ant were two of the initial benefits you
> get by working on FOP, apparently being able to write
> Java code using XSLT is a third one...i.e., Yeehaw!,
> as I believe he had put it... ;)]
> 
> Glen
> 
> --- "J.Pietschmann" <[EMAIL PROTECTED]> wrote:
> > Finn Bock wrote:
> > > I like the generation process as it allowed me to
> > try out and experiment 
> > > with different optimizations. I don't think that I
> > realistically could 
> > > have added caching of compound properties or
> > changed the abs2rel/rel2abs 
> > > code if I had to change the Maker classes
> > manually.
> > 
> > If it's common code, that's what class hierarchies
> > and
> > inheritance are made for.
> > 
> > J.Pietschmann
> > 
> > 
> 
> 
-- 
John Austin <[EMAIL PROTECTED]>


Re: (Victor et al) Re: Performance improvements.

2003-12-13 Thread Glen Mazza
-1.  I'd like to hold off on this, at least until I
can gain a better understanding of the autogenerated
code.  I may still come to the same conclusion as the other
committers, but Finn's endorsement of the XSLT--as
well as the long work of those like Keiron who have
worked with the XSLT files--suggests that there are
significant time benefits to using them.  (At work, I
use "SQL to write SQL" all the time, and love the time
efficiencies that result.)

If we check in the Java code, then changes may end up
being made to those files directly, which will result
in the XSLT files becoming unregeneratable.  Or, every
run of the XSLT will require re-modification of the
changes made manually to all the Java
files--potentially dozens--100's of files.  So I'm
kind of leery about doing this at the moment.

[Actually, I'm looking forward to studying the XSLT
that generates these files--as I mentioned to Clay
that CVS and Ant were two of the initial benefits you
get by working on FOP, apparently being able to write
Java code using XSLT is a third one...i.e., Yeehaw!,
as I believe he had put it... ;)]

Glen

--- "J.Pietschmann" <[EMAIL PROTECTED]> wrote:
> Finn Bock wrote:
> > I like the generation process as it allowed me to
> try out and experiment 
> > with different optimizations. I don't think that I
> realistically could 
> > have added caching of compound properties or
> changed the abs2rel/rel2abs 
> > code if I had to change the Maker classes
> manually.
> 
> If it's common code, that's what class hierarchies
> and
> inheritance are made for.
> 
> J.Pietschmann
> 
> 




Re: (Victor et al) Re: Performance improvements.

2003-12-13 Thread J.Pietschmann
Finn Bock wrote:
I like the generation process as it allowed me to try out and experiment 
with different optimizations. I don't think that I realistically could 
have added caching of compound properties or changed the abs2rel/rel2abs 
code if I had to change the Maker classes manually.
If it's common code, that's what class hierarchies and
inheritance are made for.
J.Pietschmann




Re: (Victor et al) Re: Performance improvements.

2003-12-13 Thread Jeremias Maerki
+1 to that, but please don't dump the original XML files
(foproperties.xml etc.) because they could come in handy later.

Besides that, I can't contribute much to this discussion, but I am
confident that you guys will find a good solution. The "integer idea" sounds
reasonable to me and I'm particularly glad that a few new smart heads
are starting to get active on HEAD. Thank you for your efforts!

On 13.12.2003 01:34:02 J.Pietschmann wrote:
> If we are at it, I'd vote for dumping generating the property
> classes and check the java files into CVS.


Jeremias Maerki (who's still looking for a way out of his current Delphi
swamp)


Re: (Victor et al) Re: Performance improvements.

2003-12-13 Thread Finn Bock
[J.Pietschmann]

[snip]

If we are at it, I'd vote for dumping generating the property
classes and check the java files into CVS.
I like the generation process as it allowed me to try out and experiment 
with different optimizations. I don't think that I realistically could 
have added caching of compound properties or changed the abs2rel/rel2abs 
code if I had to change the Maker classes manually.

regards,
finn


Re: (Victor et al) Re: Performance improvements.

2003-12-13 Thread Finn Bock
[Glen Mazza]

Thanks, Finn and John Austin, for your efforts here.
It's a pleasure.

Victor, not to rush you, but how agreeable are you in
general on switching to integer enumerations for FOP
properties?  (Given Alt-Design, Peter obviously
approves.)  I checked Alt-Design's PropNames.java[1]
and liked what I saw.  It doesn't necessarily have to
be that particular design, just the idea of integer
constants in general.
For inspiration you can also take a look at some of the same files that 
I made:

http://bckfnn-modules.sf.net/Constants.java
http://bckfnn-modules.sf.net/FOPropertyMapping.java
http://bckfnn-modules.sf.net/PropSets.java
The PropSets.java file is created from a new .xml file that specifies 
the element and property structure.

http://bckfnn-modules.sf.net/foelements.xml

Perhaps the information in PropSets.java could also be created from an 
XSL-FO DTD.

In addition to the performance improvements,
I suggested before we could have some form of
// very rough pseudocode
checkPropertySupported (int property) {
return isSupported[property];
}
That would be written like this:

boolean checkPropertySupported (int property) {
    return indices[property] != 0;
}
It appears that using int constants gives us more ways
to efficiently work with the data.  
I agree.

regards,
finn


Re: Performance improvements.

2003-12-13 Thread Finn Bock
On the other hand a series of smaller adjustments to the property
handling has improved the total processing time about 10%.
[Peter B. West]
That's a better time improvement than I got with alt.design.  The fact 
that I have extra overhead from the pull parsing may account for that.
Perhaps, but it doesn't make much sense comparing 'improvement' between 
the two designs. Not all of my speedup comes from using integers, some 
came from caching the default values from compound properties and from 
removing a lot of StringBuffer.append calls in the way the Makers are 
handling corresponding properties.

Neither of these two changes can apply to alt.design AFAICT, so the 
percentage-wise improvement that you found will naturally be smaller.

regards,
finn


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-13 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-13 07:08 ---
Created an attachment (id=9553)
New version of src\codegen\foelements.xml


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-13 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-13 07:08 ---
John, The foelements.xml that maps properties to elements did not include the 
shorthand properties. This is fixed now.


Re: (Victor et al) Re: Performance improvements.

2003-12-12 Thread Peter B. West
J.Pietschmann wrote:
Glen Mazza wrote:

Victor, not to rush you, but how agreeable are you in
general on switching to integer enumerations for FOP
properties?  (Given Alt-Design, Peter obviously
approves.)  I checked Alt-Design's PropNames.java[1]
and liked what I saw.  It doesn't necessarily have to
be that particular design, just the idea of integer
constants in general.


There are some ugly considerations. I think at the FO
level properties should be prepared into ready-to-use
bundles. At some point something has to deal with the
oddities called "compound properties".
They're shorthands, with some slightly different inheritance 
characteristics.  I think Arved may have come to the same conclusion.

It would be really nice to have a getLeaderLength()
which returns a MinOptMax. This means that getLeaderLength()
has to:
 - resolve percentages and functions
 - deal with the leader-length shorthand setting before this
 - deal with inheritance (n/a here, fortunately)
Or getLeaderLengthMin(), getLeaderLengthOpt(), getLeaderLengthMax(), 
with all values resolved.

One of the complications in the maintenance code is that
the code in the FO layout routines had to deal with resolving
percentages. OTOH, the generator is mainly so ugly because
Keiron et al. tried hard to press the shorthand handling
into a common scheme. There should be better solutions for
either problem.
While I haven't done the corresponding properties for alt.design yet, 
shorthand and compound properties are handled, if not prettily, at least 
coherently.  Percentages will, I think, be resolved by tightly 
integrating the processing of FO nodes and areas within PageSequence, so 
that these values are resolved as early as possible.  For example, the 
relevant reference-area will be made available from the area 
construction logic to the FO tree builder for resolution of the 
properties at the time of parsing.  In most cases, the area will be 
dimensioned at this time, but where not, the linkage between FO node and 
area will be established.

If we are at it, I'd vote for dumping generating the property
classes and check the java files into CVS.
+1

Peter
--
Peter B. West 


Re: Performance improvements.

2003-12-12 Thread Peter B. West
Finn Bock wrote:
Hi,

Inspired by the recent talk about optimizations, I have tried to look
for low hanging fruits that can speed up FOP, and as I expected I did 
not find any silver bullets.

On the other hand a series of smaller adjustments to the property
handling has improved the total processing time about 10%.
That's a better time improvement than I got with alt.design.  The fact 
that I have extra overhead from the pull parsing may account for that.

- Switched from string based property names to integer enums.
  Most of the lookups then changed from get("font-size") to
  get(P_FONT_SIZE) and compound property lookup changed to
  get(P_SPACE_AFTER | C_OPTIMUM).
- Cached the non-contextdep default compound properties in the makers.
  This caching is similar to what already existed for the base
  properties.
- Calculated rel2abs and abs2rel in the property makers at codegen time
  to avoid string manipulation at runtime.
- Changed the element lookup in PropertyListBuilder.elementTable from
  using string keys to using integer enums.
- Copy the inherited property values from parent fo into the child
  fo. findProperty() is then no longer recursive.
The data structure that maps property ids to property objects is a
sparse Java array somewhat (AFAICT) similar to those in alt-design.
The indication of whether a property is explicitly set is stored in a bit 
array.

Further optimizations:

- The PropertyManager stores a lot of properties in instance fields.
  Perhaps the properties in the PropertyList can be removed after the
  PropertyManager has taken what it needs. The remaining properties
  could then be packed more efficiently.
- The entire logic in findProperty() could be moved to the property
  makers. This could avoid 630.000 calls to the empty default methods
  in Property.Maker.
A compiled version of my changes can be downloaded here:

   http://bckfnn-modules.sf.net/fop.jar
I'm pleased to see that ideas from alt.design are making their way into 
the HEAD redesign.  I will assist where I can.

Peter
--
Peter B. West 


Re: (Victor et al) Re: Performance improvements.

2003-12-12 Thread J.Pietschmann
Glen Mazza wrote:
Victor, not to rush you, but how agreeable are you in
general on switching to integer enumerations for FOP
properties?  (Given Alt-Design, Peter obviously
approves.)  I checked Alt-Design's PropNames.java[1]
and liked what I saw.  It doesn't necessarily have to
be that particular design, just the idea of integer
constants in general.
There are some ugly considerations. I think at the FO
level properties should be prepared into ready-to-use
bundles. At some point something has to deal with the
oddities called "compound properties".
It would be really nice to have a getLeaderLength()
which returns a MinOptMax. This means that getLeaderLength()
has to:
 - resolve percentages and functions
 - deal with the leader-length shorthand setting before this
 - deal with inheritance (n/a here, fortunately)
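
A minimal sketch of such an accessor, assuming each component is either an absolute length in millipoints or a percentage of a reference width that the caller supplies. The classes below (MinOptMax, LengthSpec, LeaderProperties) are hypothetical simplifications, not FOP's real classes, and shorthand handling and function evaluation are left out.

```java
// Hypothetical sketch; these are simplified stand-ins, not FOP's real classes.

/** Fully resolved triple of lengths in millipoints. */
final class MinOptMax {
    final int min;
    final int opt;
    final int max;

    MinOptMax(int min, int opt, int max) {
        this.min = min;
        this.opt = opt;
        this.max = max;
    }
}

/** A length that is either absolute (millipoints) or a percentage of a base. */
final class LengthSpec {
    private final double value;
    private final boolean percentage;

    private LengthSpec(double value, boolean percentage) {
        this.value = value;
        this.percentage = percentage;
    }

    static LengthSpec absolute(int millipoints) {
        return new LengthSpec(millipoints, false);
    }

    static LengthSpec percent(double percent) {
        return new LengthSpec(percent, true);
    }

    /** Resolves against the given base length (millipoints). */
    int resolve(int baseMillipoints) {
        return percentage
                ? (int) Math.round(value * baseMillipoints / 100.0)
                : (int) value;
    }
}

final class LeaderProperties {
    private final LengthSpec min;
    private final LengthSpec opt;
    private final LengthSpec max;

    LeaderProperties(LengthSpec min, LengthSpec opt, LengthSpec max) {
        this.min = min;
        this.opt = opt;
        this.max = max;
    }

    /** Layout code gets plain numbers and never sees a percentage. */
    MinOptMax getLeaderLength(int referenceWidth) {
        return new MinOptMax(min.resolve(referenceWidth),
                             opt.resolve(referenceWidth),
                             max.resolve(referenceWidth));
    }
}
```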
One of the complications in the maintenance code is that
the code in the FO layout routines had to deal with resolving
percentages. OTOH, the generator is mainly so ugly because
Keiron et al. tried hard to press the shorthand handling
into a common scheme. There should be better solutions for
either problem.
If we are at it, I'd vote for dumping generating the property
classes and check the java files into CVS.
J.Pietschmann



(Victor et al) Re: Performance improvements.

2003-12-12 Thread Glen Mazza
Thanks, Finn and John Austin, for your efforts here.

Victor, not to rush you, but how agreeable are you in
general on switching to integer enumerations for FOP
properties?  (Given Alt-Design, Peter obviously
approves.)  I checked Alt-Design's PropNames.java[1]
and liked what I saw.  It doesn't necessarily have to
be that particular design, just the idea of integer
constants in general.

[1]
http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-fop/src/java/org/apache/fop/fo/Attic/PropNames.java?content-type=text%2Fplain&rev=1.1.2.1

Joerg? Jeremias?

In addition to the performance improvements,
I suggested before we could have some form of

// very rough pseudocode
boolean checkPropertySupported (int property) {
    return isSupported[property];
}

to quickly index properties supported for a particular
FO.

Also, for toString() implementations that will list
*all* the properties for a particular FO, we might be
able to have something very simple like this:

class FObj:

public String toString() {
   StringBuffer state = new StringBuffer();
   for (int i = 0; i < PROPMAX; i++) {
      state.append(getPropertyValue(i)).append("\n");
   }
   return state.toString();
}

It appears that using int constants gives us more ways
to efficiently work with the data.  

Thanks,
Glen


--- Finn Bock <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> Inspired by the recent talk about optimizations, I
> have tried to look
> for low hanging fruits that can speed up FOP, and as
> I expected I did 
> not find any silver bullets.
> 




DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-12 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-12 20:57 ---
Created an attachment (id=9551)
Test logs test1 (without patch) and test2 (with patch) in a ZIP file (tests.zip)


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-12 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-12 20:54 ---
Attachment 9550 was named 'test.tar.gz' in case you can't grok it. 
Use Linux tar zxf or gunzip and tar to unpack it.

I shall attach 'tests.zip' for those of you running tar-disadvantaged systems.


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-12 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-12 20:48 ---
Created an attachment (id=9550)
Log files of testing without (test1) and with (test2) proposed changes.


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-12 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-12 20:39 ---
I tested this in a recent copy of Fop HEAD and ran into a couple of exceptions:

My test includes the corrections for the FNF problem I reported last week.

The test I run is from the 'root' directory of the distribution (contains
build.sh) and is as follows:

1) time find test -name "*.fo" -print -exec ./test.sh {} \; 2>test1 1>&2

2) time find test -name "*.fo" -print -exec ./test.sh {} \; 2>test2 1>&2

Where 'test.sh' contains:

##
#!/bin/sh

echo java -Xms100m -Xmx200m -cp \
  .:build/fop.jar:lib/avalon-framework-4.1.4.jar:lib/batik.jar:lib/commons-io-dev-20030703.jar \
  org.apache.fop.apps.Fop -fo ${1} -pdf /tmp/$$.pdf
java -Xms100m -Xmx200m -cp \
  .:build/fop.jar:lib/avalon-framework-4.1.4.jar:lib/batik.jar:lib/commons-io-dev-20030703.jar \
  org.apache.fop.apps.Fop -fo ${1} -pdf /tmp/$$.pdf
##

The result is close but not perfect. I have attached 'test1' and 'test2' and
you may use a diff tool to find the exception. [Most differences are due to the
different PID no's assigned to the temp PDF files. This approach would be 
improved if I were to 'ls -l /tmp/$$.pdf' to show the output file sizes.]

The following exception shows up twice in my tests. 


> Exception in thread "main" java.lang.RuntimeException: Insert into unknown slot 175
>   at org.apache.fop.fo.PropertyList.putSpecified(PropertyList.java:162)
>   at org.apache.fop.fo.PropertyListBuilder.convertAttributeToProperty(PropertyListBuilder.java:268)
>   at org.apache.fop.fo.PropertyListBuilder.makeList(PropertyListBuilder.java:217)
>   at org.apache.fop.fo.FObj.handleAttrs(FObj.java:156)
>   at org.apache.fop.fo.flow.BlockContainer.handleAttrs(BlockContainer.java:95)
>   at org.apache.fop.fo.FOTreeBuilder.startElement(FOTreeBuilder.java:267)
>   at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
>   at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
>   at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
>   at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
>   at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>   at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
>   at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
>   at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-12 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-12 15:54 ---
Yes, switching to integer enums may be upcoming in the near future.  FOP's Alt-
Design already relies on them--see its PropNames.java class.

Glen


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-12 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-12 15:35 ---
The two new files that I've attached should be called 
   src/codegen/foelements.xml
and
   src/codegen/mkpropset.xsl


Performance improvements.

2003-12-12 Thread Finn Bock
Hi,

Inspired by the recent talk about optimizations, I have tried to look
for low hanging fruits that can speed up FOP, and as I expected I did 
not find any silver bullets.

On the other hand a series of smaller adjustments to the property
handling has improved the total processing time about 10%.
I have tested against a partial copy of "DocBook: The Definitive Guide" 
that I translated to .fo with saxon-6.1.5. The input files that I used 
for timing can be found here:

   http://bckfnn-modules.sf.net/DocBookDTG.zip

The modifications I've implemented are:

- Switched from string based property names to integer enums.
  Most of the lookups then changed from get("font-size") to
  get(P_FONT_SIZE) and compound property lookup changed to
  get(P_SPACE_AFTER | C_OPTIMUM).
- Cached the non-contextdep default compound properties in the makers.
  This caching is similar to what already existed for the base
  properties.
- Calculated rel2abs and abs2rel in the property makers at codegen time
  to avoid string manipulation at runtime.
- Changed the element lookup in PropertyListBuilder.elementTable from
  using string keys to using integer enums.
- Copy the inherited property values from parent fo into the child
  fo. findProperty() is then no longer recursive.
The data structure that maps property ids to property objects is a
sparse Java array somewhat (AFAICT) similar to those in alt-design.
The indication of whether a property is explicitly set is stored in a bit 
array.
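
A minimal sketch of a sparse array plus bit array of this kind, assuming an index table that maps each property id to a slot (0 meaning the property is not valid on the element) and a java.util.BitSet that records explicit values. The class and method names are hypothetical and the layout in the actual patch may differ.

```java
import java.util.BitSet;

// Hypothetical sketch; class and field names are illustrative only.
final class SparsePropertyList {
    private final int[] indices;   // property id -> slot + 1, or 0 if not valid here
    private final Object[] values; // one slot per valid property
    private final BitSet specified = new BitSet(); // slots that were set explicitly

    SparsePropertyList(int propertyCount, int[] validIds) {
        indices = new int[propertyCount];
        values = new Object[validIds.length];
        for (int slot = 0; slot < validIds.length; slot++) {
            indices[validIds[slot]] = slot + 1;
        }
    }

    boolean isSupported(int propId) {
        return indices[propId] != 0;
    }

    /** Stores an explicitly specified value; rejects ids with no slot. */
    void putSpecified(int propId, Object value) {
        int slot = indices[propId] - 1;
        if (slot < 0) {
            throw new IllegalArgumentException("Insert into unknown slot " + propId);
        }
        values[slot] = value;
        specified.set(slot);
    }

    Object get(int propId) {
        int slot = indices[propId] - 1;
        return slot < 0 ? null : values[slot];
    }

    boolean isExplicit(int propId) {
        int slot = indices[propId] - 1;
        return slot >= 0 && specified.get(slot);
    }
}
```

Only the index table is as long as the full property list; the value array holds one entry per property that is valid on the element.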

Further optimizations:

- The PropertyManager stores a lot of properties in instance fields.
  Perhaps the properties in the PropertyList can be removed after the
  PropertyManager has taken what it needs. The remaining properties
  could then be packed more efficiently.
- The entire logic in findProperty() could be moved to the property
  makers. This could avoid 630.000 calls to the empty default methods
  in Property.Maker.
A compiled version of my changes can be downloaded here:

   http://bckfnn-modules.sf.net/fop.jar

regards,
finn


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-12 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-12 15:12 ---
Created an attachment (id=9540)
A new file to be placed in src/codegen


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-12 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-12 15:11 ---
Created an attachment (id=9539)
A new file to be placed in src/codegen


DO NOT REPLY [Bug 25480] - [PATCH] Experimental performance improvements.

2003-12-12 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.





--- Additional Comments From [EMAIL PROTECTED]  2003-12-12 15:10 ---
Created an attachment (id=9538)
The patch


DO NOT REPLY [Bug 25480] New: - [PATCH] Experimental performance improvements.

2003-12-12 Thread bugzilla

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25480

[PATCH] Experimental performance improvements.

   Summary: [PATCH] Experimental performance improvements.
   Product: Fop
   Version: 1.0dev
  Platform: Other
OS/Version: Other
Status: NEW
  Severity: Normal
  Priority: Other
 Component: general
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]


This (rather large) patch implements different performance improvements to the 
property handling.

- Switched from string based property names to integer enums.
  Most of the lookups then changed from get("font-size") to
  get(P_FONT_SIZE) and compound property lookup changed to
  get(P_SPACE_AFTER | C_OPTIMUM).
- Cached the non-contextdep default compound properties in the makers.
  This caching is similar to what already existed for the base
  properties.
- Calculated rel2abs and abs2rel in the property makers at codegen time
  to avoid string manipulation at runtime.
- Changed the element lookup in PropertyListBuilder.elementTable from
  using string keys to using integer enums.
- Copy the inherited property values from parent fo into the child
  fo. findProperty() is then no longer recursive.
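
A minimal sketch of how a lookup key such as P_SPACE_AFTER | C_OPTIMUM can carry both the base property and the compound component in one int, assuming the component id sits in the bits above the base property id. The shift width and all constant values below are assumptions made for illustration; the values in the patch's Constants.java may be different.

```java
// Hypothetical sketch; the bit layout and constant values are assumptions.
final class PropertyKeys {
    static final int COMPOUND_SHIFT = 9; // low 9 bits hold the base property id
    static final int PROPERTY_MASK = (1 << COMPOUND_SHIFT) - 1;

    // Base property ids (illustrative values).
    static final int P_FONT_SIZE = 1;
    static final int P_SPACE_AFTER = 2;

    // Compound component ids, stored above the base id.
    static final int C_MINIMUM = 1 << COMPOUND_SHIFT;
    static final int C_OPTIMUM = 2 << COMPOUND_SHIFT;
    static final int C_MAXIMUM = 3 << COMPOUND_SHIFT;

    private PropertyKeys() { }

    /** Strips the component bits, leaving the base property id. */
    static int baseProperty(int key) {
        return key & PROPERTY_MASK;
    }

    /** Returns the component id, or 0 for a plain property. */
    static int component(int key) {
        return key & ~PROPERTY_MASK;
    }

    static boolean isCompound(int key) {
        return component(key) != 0;
    }
}
```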

This patch is not meant to be applied, but is purely for discussion and 
experimentation.