Re: Comments on new property maker implementation

2004-01-20 Thread Finn Bock
Finn Bock wrote:

I would guess that doing ~6 string compares to navigate the binary 
tree (with 148 color keywords) is slower than one string hash, ~1.2 
int compares and one string compare. But I haven't measured it, so you 
might well be right. Many keyword sets for other properties are 
much smaller and could perhaps benefit from a more suitable collection 
type.
[J.Pietschmann]

I meant setup effort, although a binary tree will most likely do
additional memory management. You are right about the lookup. Just
for curiosity, where do you get the 1.2 int comparisons? A perfect
hash should not have collisions.
I was comparing a standard HashMap with your binary tree. A perfect hash 
would likely have a more complicated hash function and of course zero 
int compares.
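
As a rough illustration of the compare counts discussed above, here is a
small sketch that counts the string compares made by a binary search over
148 sorted keywords and performs the same lookup through a standard
HashMap. The class name, the synthetic kwNNN keys and the counting
comparator are assumptions made for the example (it is not FOP code, and
it is written against a current JDK). It counts compares; it does not
measure time.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

class KeywordLookupSketch {
    /** Number of string comparisons made by the last counted search. */
    static int compares;

    /** Builds n placeholder keywords "kw000", "kw001", ... in sorted order. */
    static String[] makeKeywords(int n) {
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) {
            // 1000 + i has four digits; dropping the first gives zero padding
            keys[i] = "kw" + Integer.toString(1000 + i).substring(1);
        }
        return keys;
    }

    /** Binary search that counts every string comparison it performs. */
    static int countedBinarySearch(String[] sortedKeys, String key) {
        compares = 0;
        Comparator<String> counting = (a, b) -> {
            compares++;
            return a.compareTo(b);
        };
        return Arrays.binarySearch(sortedKeys, key, counting);
    }

    /** Hash lookup: one hashCode(), a bucket probe and an equals() check. */
    static Map<String, Integer> buildHashMap(String[] keys) {
        Map<String, Integer> map = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], i);
        }
        return map;
    }

    public static void main(String[] args) {
        String[] keys = makeKeywords(148);
        int index = countedBinarySearch(keys, "kw100");
        System.out.println("index=" + index + ", string compares=" + compares);
        System.out.println("hash lookup=" + buildHashMap(keys).get("kw100"));
    }
}
```

With 148 keys a binary search needs at most 8 compares, which is in the
same range as the ~6 estimated above.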

It might also be interesting how a trie or ternary tree (as used for
hyphenation patterns) would compare to hash maps for keywords (in
terms of setup costs, lookup costs and memory). I have a study of
various Java implementations on my todo list but haven't quite
gotten around to doing it.
Very interesting indeed.

regards,
finn


Re: Comments on new property maker implementation

2004-01-19 Thread Finn Bock
[Finn Bock]

You should perhaps also be aware that the values in a static array
get assigned to the array one element at a time. So

   static int[] a = { 101,102,103,104,105,106,107,108 };

becomes in bytecodes:

Method static {}
  0 bipush 8
  2 newarray int
  4 dup
  5 iconst_0
[Glen Mazza]

Hmmm...Are you saying that declaring a static array
isn't much (any?) faster than manually creating one? 
I would guess that it is a bit faster than the typical bytecode for 
manually created arrays since the above uses 'dup' instead of 
'getstatic' or 'aload' to push the array on the stack.

I didn't realize that there is code being run for
static arrays--I would have thought the compiled
bytecode just includes the array internally, and not
the code to create it.  (i.e., if you opened the
bytecode you would see an array 101 102 103 104...
sitting someplace.)
Arrays can't be stored in the constant pool so there is no place to put 
the data except as bytecode.

http://java.sun.com/docs/books/vmspec/html/ClassFile.doc.html#20080
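
To make the point concrete, the sketch below declares one array with an
initializer and fills a second one by hand in a static block. The class
and field names are invented for the example. Both arrays are populated by
code that runs when the class is initialized; neither is loaded as a
ready-made block of data.

```java
import java.util.Arrays;

class StaticArrayInitSketch {
    // Array initializer: javac turns this into instructions in the class
    // initializer that store the elements one at a time.
    static final int[] A = { 101, 102, 103, 104, 105, 106, 107, 108 };

    // Manually filled array with the same contents, also at class init time.
    static final int[] B = new int[8];
    static {
        for (int i = 0; i < B.length; i++) {
            B[i] = 101 + i;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(A, B));
    }
}
```

Compiling this and disassembling it with javap -c should show the
per-element stores for A in the static initializer, much like the listing
quoted above.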

It should perhaps be noted that constant instances of String are not 
really stored in the constant pool either. The pool just stores the 
utf-8 representation of the string constant, and each literal string is 
initialized as a new pool String object based on that (as UTF-16ish).

Isn't that how C works, at least?
I think so, but C has hardware support for code and data segments so C 
can make better use of it.

Sigh...I guess I *didn't* know bytecode by heart after
all!  ;-)
I didn't bring it up to discourage the use of static initialized arrays.
If it makes sense to put something in a static array we should do so 
without concern of compiletime vs. runtime. After all, the 
initialization is only performed once per classloader.

regards,
finn


Re: Comments on new property maker implementation

2004-01-19 Thread Finn Bock
You should perhaps also be aware that the values in a static array 
get assigned to the array one element at a time. So
[J.Pietschmann]

That's an unpleasant surprise. I was always under the impression
statically initialized data was stored along with the string
constants, like in C. This means a generated perfect hash table
wouldn't have much of an advantage over, let's say, a simple
binary tree loaded with the values in proper order so that the
tree becomes automatically balanced (without rotations like
rb-trees do).
I would guess that doing ~6 string compares to navigate the binary tree 
(with 148 color keywords) is slower than one string hash, ~1.2 int 
compares and one string compare. But I haven't measured it, so you might 
well be right. Many keyword sets for other properties are much 
smaller and could perhaps benefit from a more suitable collection type.

It would make sense, however, to properly initialize the initial size
values for the various hashmaps currently used.
Indeed. Rehashing a HashMap is very fast though, so I wouldn't expect a 
major speedup, but it all adds up.
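
A minimal sketch of what presizing could look like, assuming the number of
keywords is known up front and the default load factor of 0.75 is kept.
The helper names capacityFor and newKeywordMap are made up for the
example.

```java
import java.util.HashMap;
import java.util.Map;

class PresizedMapSketch {
    /**
     * Initial capacity large enough to hold `expected` entries at the
     * default 0.75 load factor without triggering a rehash.
     */
    static int capacityFor(int expected) {
        return (int) (expected / 0.75f) + 1;
    }

    /** Creates an empty map sized for the expected number of keywords. */
    static Map<String, String> newKeywordMap(int expected) {
        return new HashMap<>(capacityFor(expected));
    }

    public static void main(String[] args) {
        Map<String, String> colors = newKeywordMap(148);
        colors.put("antiquewhite", "#faebd7");
        System.out.println(capacityFor(148) + " " + colors.size());
    }
}
```

For 148 color keywords this asks for a capacity of 198, so the map never
has to grow while the keywords are added.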

regards,
finn


RE: Comments on new property maker implementation

2004-01-19 Thread Andreas L. Delmelle
 -Original Message-
 From: Finn Bock [mailto:[EMAIL PROTECTED]

 [ Glen : ]
  Sigh...I guess I *didn't* know bytecode by heart after
  all!  ;-)

 I didn't bring it up to discourage the use of static initialized arrays.
 If it makes sense to put something in a static array we should do so
 without concern of compiletime vs. runtime. After all, the
 initialization is only performed once per classloader.


Well, (sorry to disappoint you, Peter) I don't know my BC by heart, but IIRC
the real difference would be in the size of the compiled classes...
See also a little trick I stumbled upon for for-loops. It's common(?)
knowledge that testing for a value to be greater than (or equal to) another
value, is the same as testing whether the result of their subtraction is
greater than (or equal to) zero --rewriting this effectively saves a
processor instruction per comparison.
If you subsequently combine the test and the decrementing of the counter,
you can slightly reduce the size of the compiled class further.

In short:

  for( int i = 0; i < j; ++i )

is better written as ( = faster; that is: it will save a few hundred
millisecs in large loops, the few that might just be enough to give the
average user the impression that the software is actually any faster than
before )

  for( int i = j; i > 0; --i )

and even better ( leads to even more compact compiled classes;
performance-boost is negligible with current hardware, but might turn out
to be an advantage --albeit a minor one - when this class has to be loaded
from a network location )

  for( int i = j; --i >= 0; )
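
One thing worth checking before swapping loop forms is which values of i
each one actually visits. The sketch below records them for the three
forms above; the class and method names are invented for the example, and
it says nothing about speed or class size.

```java
import java.util.ArrayList;
import java.util.List;

class LoopFormsSketch {
    /** Counting up: visits 0, 1, ..., j-1. */
    static List<Integer> forward(int j) {
        List<Integer> seen = new ArrayList<>();
        for (int i = 0; i < j; ++i) {
            seen.add(i);
        }
        return seen;
    }

    /** Counting down with a separate decrement: visits j, j-1, ..., 1. */
    static List<Integer> countDown(int j) {
        List<Integer> seen = new ArrayList<>();
        for (int i = j; i > 0; --i) {
            seen.add(i);
        }
        return seen;
    }

    /** Decrement folded into the test: visits j-1, j-2, ..., 0. */
    static List<Integer> countDownCombined(int j) {
        List<Integer> seen = new ArrayList<>();
        for (int i = j; --i >= 0; ) {
            seen.add(i);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(forward(3));
        System.out.println(countDown(3));
        System.out.println(countDownCombined(3));
    }
}
```

Only the last form covers the same index range as the first one (in
reverse order), so it is the one that can replace a loop over array
indices without further changes to the loop body.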


Cheers,

Andreas



Re: Comments on new property maker implementation

2004-01-19 Thread J.Pietschmann
Finn Bock wrote:
I would guess that doing ~6 string compares to navigate the binary tree 
(with 148 color keywords) is slower than one string hash, ~1.2 int 
compares and one string compare. But I haven't measured it, so you might 
well be right. Many keyword sets for other properties are much 
smaller and could perhaps benefit from a more suitable collection type.
I meant setup effort, although a binary tree will most likely do
additional memory management. You are right about the lookup. Just
for curiosity, where do you get the 1.2 int comparisons? A perfect
hash should not have collisions.
It might also be interesting how a trie or ternary tree (as used for
hyphenation patterns) would compare to hash maps for keywords (in
terms of setup costs, lookup costs and memory). I have a study of
various Java implementations on my todo list but haven't quite
gotten around to doing it.
J.Pietschmann
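
For readers who have not met the structure, a ternary search tree for
keyword lookup can be sketched roughly as follows. This is a generic
textbook-style version written for illustration, not the hyphenation tree
used in FOP; the class name and the Integer payload are assumptions.

```java
class TernaryTreeSketch {
    /** One character per node, with lower/equal/higher child links. */
    private static final class Node {
        final char ch;
        Node lo, eq, hi;
        Integer value; // non-null only where a keyword ends

        Node(char ch) {
            this.ch = ch;
        }
    }

    private Node root;

    /** Stores a value under a non-empty key. */
    void put(String key, int value) {
        if (key.isEmpty()) {
            throw new IllegalArgumentException("empty key");
        }
        root = put(root, key, 0, value);
    }

    private Node put(Node n, String key, int i, int value) {
        char c = key.charAt(i);
        if (n == null) {
            n = new Node(c);
        }
        if (c < n.ch) {
            n.lo = put(n.lo, key, i, value);
        } else if (c > n.ch) {
            n.hi = put(n.hi, key, i, value);
        } else if (i < key.length() - 1) {
            n.eq = put(n.eq, key, i + 1, value);
        } else {
            n.value = value;
        }
        return n;
    }

    /** Returns the stored value, or null if the key is not a keyword. */
    Integer get(String key) {
        if (key.isEmpty()) {
            return null;
        }
        Node n = root;
        int i = 0;
        while (n != null) {
            char c = key.charAt(i);
            if (c < n.ch) {
                n = n.lo;
            } else if (c > n.ch) {
                n = n.hi;
            } else if (i < key.length() - 1) {
                n = n.eq;
                i++;
            } else {
                return n.value;
            }
        }
        return null;
    }
}
```

Lookups compare single characters instead of whole strings, and keywords
that share a prefix (aqua, aquamarine) share nodes. How setup cost and
memory compare with a HashMap is exactly the open question raised above.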


Comments on new property maker implementation

2004-01-18 Thread Glen Mazza
Finn,

I've looked at your changes--I like them, and I'm
thankful to have someone on our team to be able to
redesign the properties as you have.  Getting rid of
the 250 or so autogenerated classes will be a welcome
improvement.

Comments right now:

1.)  Unlike what I was saying earlier, I don't think
we should move from Property.Maker to a new
PropertyMaker class after all, your design looks fine.
 I've noticed most subclasses of Property.Maker are
within subclasses of Properties themselves (e.g.,
LengthProperty, LengthProperty.Maker, etc.) so it
looks like a neat, clean design.


2.)  The new FOPropertyMapping.java class appears (1)
autogenerated, and (2) to be an XSLT masterpiece at
that as well.  If it is indeed in good shape, I'd like
you to submit it to Bugzilla as the new
fo-property-mapping.xsl, replacing the old one of that
name in src/codegen.  (We won't apply it however,
until we no longer need the current autogenerated
fo-property-mapping.xsl, i.e., until all the old
properties have been tossed out.)

This way if we have to make wide-ranging changes to
FOPropertyMapping, we'll have an XSLT source file we
can conveniently work with.  (Note that putting it in
codegen does *not* mean that it will be automatically
autogenerated anymore--it won't, just as constants.xsl
no longer is--we'll pull it out of the main Ant build
target at that time and keep it in the separate, manual
xsltToJava target in our build file[1].

[1]
http://cvs.apache.org/viewcvs.cgi/xml-fop/build.xml?rev=1.97&view=auto
)


Comments on FOPropertyMapping:

I like removing all these autogenerated classes, but I
think we can still keep some processing at
compile-time for more of a performance gain, as
follows:

3)  I think the runtime construction of the generic
properties (genericColor, genericCondBorderWidth,
etc.) may not be necessary.  We can still have those
xslt-generated into classes (6-8 classes total), but
this time we check them into FOP (again, keeping the
xsl available for manual re-generation when needed). 
But most of the generic classes are so small (your
initialization of GenericCondPadding is only 4 lines
of code), that going back to creating concrete classes
would not be noticeably beneficial either, so I'm not
recommending this change.

One thing that *does* stick out, however, is the 100
or so addKeyword() calls for genericColor (the largest
of the generic properties):

  genericColor.addKeyword("antiquewhite", "#faebd7");
  genericColor.addKeyword("aqua", "#00");
  genericColor.addKeyword("aquamarine", "#7fffd4");
  .

I'd like us to have a static array of these
values--i.e., something done compile-time, that
genericColor can just reference, so we don't have to
do this keyword initialization.  


4)  I'd also like us to, rather than call
setInherited() and setDefault() for each of the
properties during initialization, for the
Property/Property.Maker classes to just reference that
information from two (new) static arrays, added to
Constants.java.  We can also get rid of these two
setter methods as well (ideally there shouldn't be
setters for these attributes anyway--they should
remain inherent to the Property.)

This change will allow us to take advantage of the
fact that we are now on int-constants. 
getDefault(PR_WHATEVER), for example, is just
Constants.DefaultArray[PR_WHATEVER].


5)  Similar to (b) above, several of the makers also
have a useGeneric() initialization requirement:

m  = new CondLengthProperty.Maker(PR_PADDING_END);
m.useGeneric(genericCondPadding);

For those Makers that require it, I'd like the
constructor to be expanded to this:

m  = new CondLengthProperty.Maker(PR_PADDING_END,
genericCondPadding);

Again, getting rid of the useGeneric() function.  This
is for more speed, encapsulation, and also shrinking
FOPropertyMapping class a bit.
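
A minimal sketch of the constructor form proposed in (5), using a
stand-in class. MakerSketch, its single default-value field and the
placeholder value of PR_PADDING_END are assumptions for illustration; the
real makers carry more state than this.

```java
class MakerSketch {
    static final int PR_PADDING_END = 1; // placeholder id for the example

    final int propId;
    private String defaultValue; // stand-in for data shared via the template

    /** Plain maker without a generic template. */
    MakerSketch(int propId) {
        this.propId = propId;
    }

    /** Proposed form: the generic template is supplied at construction. */
    MakerSketch(int propId, MakerSketch generic) {
        this(propId);
        this.defaultValue = generic.defaultValue; // copied once, up front
    }

    void setDefault(String value) {
        this.defaultValue = value;
    }

    String getDefault() {
        return defaultValue;
    }

    public static void main(String[] args) {
        MakerSketch genericCondPadding = new MakerSketch(0);
        genericCondPadding.setDefault("0pt");
        MakerSketch m = new MakerSketch(PR_PADDING_END, genericCondPadding);
        System.out.println(m.getDefault());
    }
}
```

With the template passed to the constructor, a maker cannot be left half
initialized because a useGeneric() call was forgotten.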

Sorry for the long post.  I'll probably have other
comments in other areas, but this is all I've studied
for now.

Thoughts?

Thanks,
Glen


__
Do you Yahoo!?
Yahoo! Hotjobs: Enter the Signing Bonus Sweepstakes
http://hotjobs.sweepstakes.yahoo.com/signingbonus


Re: Comments on new property maker implementation

2004-01-18 Thread Finn Bock
[Glen Mazza]

I've looked at your changes--I like them, and I'm
thankful to have someone on our team to be able to
redesign the properties as you have.  Getting rid of
the 250 or so autogenerated classes will be a welcome
improvement.
But the biggest improvement is IMHO the easy ability to create special 
maker subclasses to handle the corner cases. Take a look at 
IndentPropertyMaker for the calculation of start and end-indent and at 
BorderWidthPropertyMaker for the special handling of border-width when 
border-style is NONE.

Comments right now:

1.)  Unlike what I was saying earlier, I don't think
we should move from Property.Maker to a new
PropertyMaker class after all, your design looks fine.
 I've noticed most subclasses of Property.Maker are
within subclasses of Properties themselves (e.g.,
LengthProperty, LengthProperty.Maker, etc.) so it
looks like a neat, clean design.
2.)  The new FOPropertyMapping.java class appears (1)
autogenerated, and (2) to be an XSLT masterpiece at
that as well.  If it is indeed in good shape, I'd like
you to submit it to Bugzilla as the new
fo-property-mapping.xsl, replacing the old one of that
name in src/codegen.  
Initially my new FOPropertyMapping was generated by XSLT but that is now 
a long time ago and I have made lots of manual changes since then. The 
XSLT script only handled the most common property information and was 
just a hack to get me started. The output isn't a complete java file, it 
doesn't link the subproperties to the base properties and it doesn't 
deal with the classname of any of the complex properties.

(We won't apply it however,
until we no longer need the current autogenerated
fo-property-mapping.xsl, i.e., until all the old
properties have been tossed out.)
This way if we have to make wide-ranging changes to
FOPropertyMapping, we'll have an XSLT source file we
can conveniently work with.  (Note that putting it in
codegen does *not* mean that it will be automatically
autogenerated anymore--it won't, just as constants.xsl
no longer is--we'll pull it out of the main Ant build
target at that time and keep it in the separate, manual
xsltToJava target in our build file[1].
[1]
http://cvs.apache.org/viewcvs.cgi/xml-fop/build.xml?rev=1.97&view=auto
)
Comments on FOPropertyMapping:

I like removing all these autogenerated classes, but I
think we can still keep some processing at
compile-time for more of a performance gain, as
follows:
3)  I think the runtime construction of the generic
properties (genericColor, genericCondBorderWidth,
etc.) may not be necessary.  We can still have those
xslt-generated into classes (6-8 classes total), but
this time we check them into FOP (again, keeping the
xsl available for manual re-generation when needed). 
But most of the generic classes are so small (your
initialization of GenericCondPadding is only 4 lines
of code), that going back to creating concrete classes
would not be noticeably beneficial either, so I'm not
recommending this change.
The generic properties are just templates that carry default data 
values to be used later on, so I don't fully see how we could 
xslt-generate them as anything other than containers of default values. 
Your last statement is a bit difficult to parse so I'm not sure what 
exactly you are recommending.

One thing that *does* stick out, however, is the 100
or so addKeyword() calls for genericColor (the largest
of the generic properties):
  genericColor.addKeyword("antiquewhite", "#faebd7");
  genericColor.addKeyword("aqua", "#00");
  genericColor.addKeyword("aquamarine", "#7fffd4");
  .
I'd like us to have a static array of these
values--i.e., something done compile-time, that
genericColor can just reference, so we don't have to
do this keyword initialization.  
I probably need an example of what you are thinking of here. Right now in 
HEAD all the color keywords are stored in a HashMap created in 
GenericColor so the keywords initialization is already done. Putting the 
keywords in a static array would require us to somehow search the array 
and I don't see how that will be much faster.

You should perhaps also be aware that the values in a static array get 
assigned to the array one element at a time. So

static int[] a = { 101,102,103,104,105,106,107,108 };

becomes in bytecodes:

Method static {}
   0 bipush 8
   2 newarray int
   4 dup
   5 iconst_0
   6 bipush 101
   8 iastore
   9 dup
  10 iconst_1
  11 bipush 102
  13 iastore
  14 dup
  15 iconst_2
  16 bipush 103
  18 iastore
  ...
and so on for each index. (In case you don't know bytecode by heart, 
iconst and bipush both push a constant on the stack and iastore pops three 
items from the stack (an array reference, an index and a value) and assigns 
the value to that index in the array.)

4)  I'd also like us to, rather than call
setInherited() and setDefault() for each of the
properties during initialization, for the
Property/Property.Maker classes to just reference that
information from two (new) static arrays, added to
Constants.java.  We 

Re: Comments on new property maker implementation

2004-01-18 Thread J.Pietschmann
Glen Mazza wrote:
One thing that *does* stick out, however, is the 100
or so addKeyword() calls for genericColor
...
I'd like us to have a static array of these
values--i.e., something done compile-time, that
genericColor can just reference, so we don't have to
do this keyword initialization.  
Look up perfect hash code and the associated generators
on the internet, like gperf, a C++ implementation used by
gcc and a variety of other compilers to provide a data
structure for mapping strings to something else in an
efficient way. Mind you, this would also benefit mapping
FO and property names to their associated classes or
code numbers.
J.Pietschmann
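
To give an idea of what a perfect hash buys, here is a toy sketch that
searches for a seed under which a fixed set of keywords lands in distinct
slots. It is not how gperf works internally; the mixing function, the
table size of 2n+1 and all names are assumptions chosen to keep the
example short, and the keys are assumed to be distinct.

```java
class PerfectHashSketch {
    private final String[] table;
    private final int seed;

    private PerfectHashSketch(String[] table, int seed) {
        this.table = table;
        this.seed = seed;
    }

    /** Seeded hash of a key, reduced to a slot index in [0, size). */
    static int slot(String key, int seed, int size) {
        int h = key.hashCode() * (2 * seed + 1); // odd multiplier per seed
        h ^= h >>> 16;                           // fold high bits into low bits
        return Math.floorMod(h, size);
    }

    /** Tries seeds until every key gets a slot of its own. */
    static PerfectHashSketch build(String[] keys) {
        int size = keys.length * 2 + 1;
        for (int seed = 1; seed <= 1_000_000; seed++) {
            String[] table = new String[size];
            boolean collisionFree = true;
            for (String key : keys) {
                int s = slot(key, seed, size);
                if (table[s] != null) {
                    collisionFree = false;
                    break;
                }
                table[s] = key;
            }
            if (collisionFree) {
                return new PerfectHashSketch(table, seed);
            }
        }
        throw new IllegalStateException("no collision-free seed found");
    }

    /** Returns the slot of a keyword, or -1 if the key is not one. */
    int indexOf(String key) {
        int s = slot(key, seed, table.length);
        return key.equals(table[s]) ? s : -1; // the one string compare
    }
}
```

A lookup is then one hash computation and one string compare, with no
collision chain to walk. The seed search happens once, which is why real
generators do it at build time and emit the finished table as source.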


Re: Comments on new property maker implementation

2004-01-18 Thread J.Pietschmann
Finn Bock wrote:
You should perhaps also be aware that the values in a static array get 
assigned to the array one element at a time. So
That's an unpleasant surprise. I was always under the impression
statically initialized data was stored along with the string
constants, like in C. This means a generated perfect hash table
wouldn't have much of an advantage over, let's say, a simple
binary tree loaded with the values in proper order so that the
tree becomes automatically balanced (without rotations like
rb-trees do).
It would make sense, however, to properly initialize the initial size
values for the various hashmaps currently used.
J.Pietschmann



Re: Comments on new property maker implementation

2004-01-18 Thread Glen Mazza
--- Finn Bock [EMAIL PROTECTED] wrote:
 
 But the biggest improvement is IMHO the easy ability
 to create special 
 maker subclasses to handle the corner cases. Take a
 look at 
 IndentPropertyMaker for the calculation of start and
 end-indent and at 
 BorderWidthPropertyMaker for the special handling of
 border-width when 
 border-style is NONE.
 

Well, I'm not there yet, but I'll be able to
appreciate it in due time.

 
 Initially my new FOPropertyMapping was generated by
 XSLT but that is now 
 a long time ago and I have made lots of manual
 changes since then. 

OK, no problem, we'll modify the Java source from now
on.

 
 The generic properties are just templates that carry default data
 values to be used later on, so I don't fully see how we could
 xslt-generate them as anything other than containers of default
 values. Your last statement is a bit difficult to parse so I'm
 not sure what exactly you are recommending.
 

Umm, never mind.  What I was trying to say is that the
generic templates (GenericKeep, GenericSpace, etc., of
the present code) were all autogenerated.  *If* you
thought it still useful to keep it as such, it's OK
with me (i.e., going down from 250 autogenerated to
about 8 is still a very nice improvement.)  But you no
longer see a need for it, which is absolutely fine
with me.

  One thing that *does* stick out, however, is the 100
  or so addKeyword() calls for genericColor (the largest
  of the generic properties):
  
    genericColor.addKeyword("antiquewhite", "#faebd7");
    genericColor.addKeyword("aqua", "#00");
    genericColor.addKeyword("aquamarine", "#7fffd4");
    .
  
  I'd like us to have a static array of these
  values--i.e., something done compile-time, that
  genericColor can just reference, so we don't have to
  do this keyword initialization.  
 
 I probably need an example of what you are thinking of here. Right
 now in HEAD all the color keywords are stored in a HashMap created
 in GenericColor so the keywords initialization is already done. 

OK--I see, thanks for the enlightenment here.  Never
mind again, I was wrong on this point.

 Putting the keywords in a static array would require us to somehow
 search the array and I don't see how that will be much faster.
 

Yes, wasn't thinking of that.

  4)  I'd also like us to, rather than call
  setInherited() and setDefault() for each of the
  properties during initialization, for the
  Property/Property.Maker classes to just reference
 that
  information from two (new) static arrays, added to
  Constants.java.  We can also get rid of these two
  setter methods as well (ideally there shouldn't be
  setters for these attributes anyway--they should
  remain inherent to the Property.)
  
  This change will allow us to take advantage of the
  fact that we are now on int-constants. 
  getDefault(PR_WHATEVER), for example, is just
  Constants.DefaultArray[PR_WHATEVER].
 
 I think 'Default' is a bad example, no one ever tries to get the
 default value except for the property maker itself, but your
 argument holds for isInherited().
 

No--I don't think you've gotten my point here.  I
don't care about the consumers of that
information--even if it is just Property.Maker.  But I
don't see the reason to run-time initialize a
PropertyMaker with inherited and default values,
because I can add the whole array in the Constants
interface, or even in Property.Maker directly.  

static boolean inheritedArray[] =
{
    false,  // 0
    true,   // PR_PROP_1
    true,   // PR_PROP_2
    false,  // PR_PROP_3
    true,   // ...
};


Once you initialize a Property.Maker with its
PR_XX constant, *it* (the Maker) can always obtain
these values by accessing inheritedArray[PR_XX] or
defaultArray[PR_XX].  No reason to initialize via
setInherited(true) or setDefault(5).  Do you see
what I'm trying to say?

 Still, I disagree. If one wants to know whether a property is
 inherited, the proper way to get the information should be to call
 propertyMapping[PR_WHATEVER].isInherited().
 

OK--we can place these two arrays in a location where
only the Property.Makers can get to it.  (Maybe a
protected static array in Property.Maker?)  Thoughts
here?
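
One way the two positions could be combined, sketched with stand-in names:
the inherited flags live in a protected static array that only makers can
see, while callers keep asking the maker through isInherited().
PropertyMakerSketch, the PR_PROP_n constants and the flag values are
placeholders for the example, not the real FOP property table.

```java
class PropertyMakerSketch {
    // Placeholder property ids; the real ones would come from Constants.
    static final int PR_PROP_1 = 1;
    static final int PR_PROP_2 = 2;
    static final int PR_PROP_3 = 3;

    // Visible to makers and their subclasses only; indexed by property id.
    protected static final boolean[] INHERITED = {
        false, // 0 (unused)
        true,  // PR_PROP_1
        true,  // PR_PROP_2
        false, // PR_PROP_3
    };

    // Lookup table from property id to its maker, as in the thread.
    static final PropertyMakerSketch[] propertyMapping =
            new PropertyMakerSketch[INHERITED.length];
    static {
        for (int id = PR_PROP_1; id <= PR_PROP_3; id++) {
            propertyMapping[id] = new PropertyMakerSketch(id);
        }
    }

    private final int propId;

    PropertyMakerSketch(int propId) {
        this.propId = propId;
    }

    /** Callers ask the maker; the maker reads the shared array. */
    boolean isInherited() {
        return INHERITED[propId];
    }

    public static void main(String[] args) {
        System.out.println(propertyMapping[PR_PROP_2].isInherited());
        System.out.println(propertyMapping[PR_PROP_3].isInherited());
    }
}
```

No setInherited() call is needed during setup, and code outside the makers
still goes through propertyMapping[PR_WHATEVER].isInherited().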


  Again, getting rid of the useGeneric() function. 
 This
  is for more speed, encapsulation, and also
 shrinking
  FOPropertyMapping class a bit.
 
 A very good idea. +1.
 

I can probably make the modifications to this--looks
simple.

Thanks,
Glen




Re: Comments on new property maker implementation

2004-01-18 Thread Peter B. West
Finn Bock wrote:
I probably need an example of what you are thinking of here. Right now in 
HEAD all the color keywords are stored in a HashMap created in 
GenericColor so the keywords initialization is already done. Putting the 
keywords in a static array would require us to somehow search the array 
and I don't see how that will be much faster.

You should perhaps also be aware that the values in a static array get 
assigned to the array one element at a time. So

static int[] a = { 101,102,103,104,105,106,107,108 };

becomes in bytecodes:

Method static {}
   0 bipush 8
   2 newarray int
   4 dup
   5 iconst_0
   6 bipush 101
   8 iastore
   9 dup
  10 iconst_1
  11 bipush 102
  13 iastore
  14 dup
  15 iconst_2
  16 bipush 103
  18 iastore
  ...
and so on for each index. (In case you don't know bytecode by heart, 
iconst and bipush both push a constant on the stack and iastore pops three 
items from the stack (an array reference, an index and a value) and assigns 
the value to that index in the array.)
Finn,

I can't imagine there is anyone here who doesn't know bytecode by heart. 
 (Except maybe me.)

Peter
--
Peter B. West http://www.powerup.com.au/~pbwest/resume.html


Re: Comments on new property maker implementation

2004-01-18 Thread Glen Mazza
--- Peter B. West [EMAIL PROTECTED] wrote:
[Finn Bock]
  
  You should perhaps also be aware that the values in a static array
  get assigned to the array one element at a time. So
  
  static int[] a = { 101,102,103,104,105,106,107,108 };
  
  becomes in bytecodes:
  
  Method static {}
 0 bipush 8
 2 newarray int
 4 dup
 5 iconst_0


Hmmm...Are you saying that declaring a static array
isn't much (any?) faster than manually creating one? 
I didn't realize that there is code being run for
static arrays--I would have thought the compiled
bytecode just includes the array internally, and not
the code to create it.  (i.e., if you opened the
bytecode you would see an array 101 102 103 104...
sitting someplace.)  Isn't that how C works, at least?

Sigh...I guess I *didn't* know bytecode by heart after
all!  ;-)

Glen

