Re: [udk-dev] Thoughts on String Construction / Destruction Performance

2005-10-04 Thread Stephan Bergmann

Kay Ramme - Sun Germany - Hamburg wrote:

Hi guys,

Thorsten Behrens wrote:


Eike Rathke [EMAIL PROTECTED] writes:



A specialized parser could almost certainly be faster than the general
SAX parser passing strings back and forth. I wouldn't do it with
lex/yacc though, they're a nightmare to maintain, and in case wrong code
was generated, which can happen, you're almost lost. I'd prefer
boost::spirit instead, but it might be even more work to implement the
parser. I've no idea though whether boost::spirit would be suitable to
parse an ODF tree.



hm. I'd profile a larger test case beforehand - spirit is a recursive
parser vs. yacc being table-driven. But OTOH, maybe contemporary
optimizers are able to compensate for that. And I'd definitely bump
our boost to 1.33, then - spirit (and lots of other stuff) has been
improved a lot.

At least in theory an unrestricted (in the sense of programming language 
constructs etc.) parser generator, such as yacc, should be better than a 
limited one, such as boost::spirit. I know that in practice the gcc C++ 
parser, which AFAIK was originally a bison-generated parser, has been 
replaced by a hand-written one. That seems to show that bison does 
_not_ generate optimal code.


yacc/bison can only handle lalr(1) grammars, so if it is 
difficult/impossible to press your grammar into that form (and for C++ 
it is at least difficult), you are better off using some other approach 
than lalr(1).  This does not necessarily have anything to do with the 
performance of the resulting parser, and this definitely has nothing to 
do with whether a parser generator is implemented as a stand-alone tool 
or as an embedded DSL.
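Stephan's point, that a grammar which is hard to press into lalr(1) form is often easier to handle with a hand-written recursive-descent parser, can be illustrated with a toy sketch (the grammar and all names here are invented for illustration; this is not OOo code):

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Minimal hand-written recursive-descent parser for the toy grammar
//   expr := term (('+'|'-') term)*
//   term := digit+
// Each grammar rule maps directly onto one member function, which is
// what makes this style easy to extend where lalr(1) tables get awkward.
class TinyParser {
public:
    explicit TinyParser(std::string const & s) : m_text(s), m_pos(0) {}

    long parseExpr() {
        long value = parseTerm();
        while (m_pos < m_text.size() &&
               (m_text[m_pos] == '+' || m_text[m_pos] == '-')) {
            char op = m_text[m_pos++];
            long rhs = parseTerm();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

private:
    long parseTerm() {
        long value = 0;
        while (m_pos < m_text.size() &&
               std::isdigit(static_cast<unsigned char>(m_text[m_pos])))
            value = value * 10 + (m_text[m_pos++] - '0');
        return value;
    }

    std::string m_text;
    std::string::size_type m_pos;
};
```

The same direct style is what the hand-written gcc C++ parser mentioned above uses, scaled up to a full language.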


-Stephan

Once we reach the point where the parser becomes the main bottleneck, we 
should try out independent implementations using boost::spirit and 
bison/yacc. Personally, I am a fan of domain-oriented programming languages 
and would therefore favor bison. But I would certainly be open to 
everything fast & clean :-)


Kay


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [udk-dev] Thoughts on String Construction / Destruction Performance

2005-10-04 Thread Kay Ramme - Sun Germany - Hamburg

Stephan Bergmann wrote:

Kay Ramme - Sun Germany - Hamburg wrote:


Hi guys,

Thorsten Behrens wrote:


Eike Rathke [EMAIL PROTECTED] writes:



A specialized parser could almost certainly be faster than the general
SAX parser passing strings back and forth. I wouldn't do it with
lex/yacc though, they're a nightmare to maintain, and in case wrong 
code

was generated, which can happen, you're almost lost. I'd prefer
boost::spirit instead, but it might be even more work to implement the
parser. I've no idea though whether boost::spirit would be suitable to
parse an ODF tree.



hm. I'd profile a larger test case beforehand - spirit is a recursive
parser vs. yacc being table-driven. But OTOH, maybe contemporary
optimizers are able to compensate for that. And I'd definitely bump
our boost to 1.33, then - spirit (and lots of other stuff) has been
improved a lot.

At least in theory an unrestricted (in the sense of programming language 
constructs etc.) parser generator, such as yacc, should be better than a 
limited one, such as boost::spirit. I know that in practice the gcc C++ 
parser, which AFAIK was originally a bison-generated parser, has been 
replaced by a hand-written one. That seems to show that bison does 
_not_ generate optimal code.



yacc/bison can only handle lalr(1) grammars, so if it is 
difficult/impossible to press your grammar into that form (and for C++ 
it is at least difficult), you are better off using some other approach 
than lalr(1).  This does not necessarily have anything to do with the 

You are right.
performance of the resulting parser, and this definitely has nothing to 
do with whether a parser generator is implemented as a stand-alone tool 
or as an embedded DSL.
If I understand correctly (I have not yet taken a look at the code Thorsten 
pointed to, but promise to do so :-), an embedded parser gets generated 
at runtime, while yacc/bison generate the parser at compile time.


-Stephan

Kay



Once we reach the point where the parser becomes the main bottleneck, we 
should try out independent implementations using boost::spirit and 
bison/yacc. Personally, I am a fan of domain-oriented programming languages 
and would therefore favor bison. But I would certainly be open to 
everything fast & clean :-)


Kay






Re: [udk-dev] Thoughts on String Construction / Destruction Performance

2005-09-26 Thread Kay Ramme - Sun Germany - Hamburg

Hi guys,

Thorsten Behrens wrote:

Eike Rathke [EMAIL PROTECTED] writes:



A specialized parser could almost certainly be faster than the general
SAX parser passing strings back and forth. I wouldn't do it with
lex/yacc though, they're a nightmare to maintain, and in case wrong code
was generated, which can happen, you're almost lost. I'd prefer
boost::spirit instead, but it might be even more work to implement the
parser. I've no idea though whether boost::spirit would be suitable to
parse an ODF tree.



hm. I'd profile a larger test case beforehand - spirit is a recursive
parser vs. yacc being table-driven. But OTOH, maybe contemporary
optimizers are able to compensate for that. And I'd definitely bump
our boost to 1.33, then - spirit (and lots of other stuff) has been
improved a lot.

At least in theory an unrestricted (in the sense of programming language 
constructs etc.) parser generator, such as yacc, should be better than a 
limited one, such as boost::spirit. I know that in practice the gcc C++ 
parser, which AFAIK was originally a bison-generated parser, has been 
replaced by a hand-written one. That seems to show that bison does 
_not_ generate optimal code.


Once we reach the point where the parser becomes the main bottleneck, we 
should try out independent implementations using boost::spirit and 
bison/yacc. Personally, I am a fan of domain-oriented programming languages 
and would therefore favor bison. But I would certainly be open to 
everything fast & clean :-)


Kay






Re: [udk-dev] Thoughts on String Construction / Destruction Performance

2005-09-26 Thread Eike Rathke
Hi Thorsten,

On Sat, Sep 24, 2005 at 23:18:24 +0200, Thorsten Behrens wrote:

 hm. I'd profile a larger test case beforehand - spirit is a recursive
 parser vs. yacc being table-driven.

A valid concern. IMHO yacc _will_ be faster anyway, the question is just
how much, and whether spirit would be suitable at all. And whether
developing yet another parser is worth it. As there are many XML
parsers already available, what about eXpat, for example? It is used by
Mozilla, Perl, Python, and is said to be very fast.
http://expat.sourceforge.net/
Btw, the developer of eXpat is James Clark, who is a very active member
of the Thai OOo community; he might be able to tell us whether eXpat
would be feasible for fast ODF parsing.

Other parsers: http://www.xml.com/pub/rg/XML_Parsers

 But OTOH, maybe contemporary
 optimizers are able to compensate for that. And I'd definitely bump
 our boost to 1.33, then - spirit (and lots of other stuff) has been
 improved a lot.

As we're talking about an OOo3.0 target, I guess, I see no problem with
this.

  Eike

-- 
 OOo/SO Calc core developer. Number formatter bedevilled I18N transpositionizer.
 GnuPG key 0x293C05FD:  997A 4C60 CE41 0149 0DB3  9E96 2F1A D073 293C 05FD




Re: [udk-dev] Thoughts on String Construction / Destruction Performance

2005-09-24 Thread Thorsten Behrens
Eike Rathke [EMAIL PROTECTED] writes:

 A specialized parser could almost certainly be faster than the general
 SAX parser passing strings back and forth. I wouldn't do it with
 lex/yacc though, they're a nightmare to maintain, and in case wrong code
 was generated, which can happen, you're almost lost. I'd prefer
 boost::spirit instead, but it might be even more work to implement the
 parser. I've no idea though whether boost::spirit would be suitable to
 parse an ODF tree.

hm. I'd profile a larger test case beforehand - spirit is a recursive
parser vs. yacc being table-driven. But OTOH, maybe contemporary
optimizers are able to compensate for that. And I'd definitely bump
our boost to 1.33, then - spirit (and lots of other stuff) has been
improved a lot.

-- 

Thorsten

If you're not failing some of the time, you're not trying hard enough.




Re: [udk-dev] Thoughts on String Construction / Destruction Performance

2005-09-15 Thread Kay Ramme - Sun Germany - Hamburg

Hi Niklas,

Niklas Nebel wrote:

Kay Ramme - Sun Germany - Hamburg wrote:

The calculation just gives what the optimum (minimum) looks like. If 
the implementation does more string instantiations than calculated, 
then it is obviously not optimal (in this aspect) and the impact of 
string con-/destruction is increased.



You're right. But real numbers, also for the memory overhead in a 
complete running office, would surely be interesting.

These numbers obviously need to be measured. If they differ from what we 
calculated, then we know that the implementations are not optimal (which 
we probably know already anyway ;-), but I think the point is that we 
can judge the implementation. E.g. if the predicted value for string 
instantiations is X and we measure 100 x X, we know that we are 
relatively far away from the optimum and need to do some optimizations, 
but if we measure 1.1 x X, then we know we are already near the optimum 
and that it will probably be hard to improve performance further. Hope 
you understand what I mean ;-).


Another thing: test_character_copy looks somewhat strange. 100 
iterations of memcpy for 1024 bytes each, and the total time divided by 
1024?
Hey, that is a bug! Thanks for pointing this out. Oh well, that happens 
when one is too lazy to follow up the little obscurities. That probably 
means that a sal_Unicode copy is much less expensive (about 1/100) than 
34 insts. I will update the calculations accordingly.
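The normalization bug Niklas spotted can be sketched as follows. The function name and the use of std::chrono are my own invention (the real harness lives elsewhere in the sal test code); the point is only that the total time has to be divided by iterations * bytes, not by the byte count of a single iteration:

```cpp
#include <cassert>
#include <chrono>
#include <cstring>

// Hypothetical reconstruction of test_character_copy with corrected
// normalization: 100 memcpy calls of 1024 bytes each copy 102400 bytes
// in total, so the per-byte cost is total_time / (100 * 1024), not
// total_time / 1024 (which overstates the cost by a factor of 100).
double measure_copy_cost_ns() {
    constexpr int kIterations = 100;
    constexpr std::size_t kBytes = 1024;
    static char src[kBytes], dst[kBytes];
    static volatile char sink = 0;

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < kIterations; ++i) {
        src[0] = static_cast<char>(i);  // defeat dead-copy elimination
        std::memcpy(dst, src, kBytes);
        sink = dst[0];                  // keep the result observable
    }
    auto end = std::chrono::steady_clock::now();

    double total_ns =
        std::chrono::duration<double, std::nano>(end - start).count();
    // Divide by the TOTAL number of bytes copied.
    return total_ns / (static_cast<double>(kIterations) * kBytes);
}
```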


Niklas

Kay






Re: [udk-dev] Thoughts on String Construction / Destruction Performance

2005-09-14 Thread Kay Ramme - Sun Germany - Hamburg

Hi Niklas,

Niklas Nebel wrote:


Taking an attribute name as an example, with the SAX interface 
in between, the "via callbacks" part looks (roughly) like this:


The name is copied into a vector for the XAttributeList implementation, 
copied again for two getNameByIndex calls (one to look for namespace 
attributes, one in the implementation of the cell element context), then 
a hash_map is used to separate it into a namespace ID and local name.
That is correct. Some of the copy operations may easily be omitted, but 
with nearly no measurable impact.
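The last step of the copy chain Niklas describes, splitting a qualified name into a namespace ID and a local name via a hash map, might look roughly like this (a toy sketch, not the actual xmloff implementation; prefixes and IDs are invented):

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Split a qualified attribute name like "office:value" into a numeric
// namespace ID and the local name, using a prefix-to-ID map.  Note the
// substr() calls: each lookup still constructs temporary strings, which
// is exactly the kind of copy traffic discussed in this thread.
std::pair<int, std::string> splitQName(
    std::unordered_map<std::string, int> const & nsMap,
    std::string const & qname)
{
    std::string::size_type colon = qname.find(':');
    if (colon == std::string::npos)
        return { 0, qname };  // no prefix: default namespace ID 0
    auto it = nsMap.find(qname.substr(0, colon));
    int nsId = (it == nsMap.end()) ? -1 : it->second;  // -1: unknown prefix
    return { nsId, qname.substr(colon + 1) };
}
```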


Considering this, the calculation might be a bit simplistic.
The calculation just gives what the optimum (minimum) looks like. If 
the implementation does more string instantiations than calculated, 
then it is obviously not optimal (in this aspect) and the impact of 
string con-/destruction is increased.


Niklas

Kay






Re: [udk-dev] Thoughts on String Construction / Destruction Performance

2005-09-13 Thread Eike Rathke
Hi Kay,

On Tue, Sep 13, 2005 at 17:38:52 +0200, Kay Ramme - Sun Germany - Hamburg wrote:

 investigating especially into construction & destruction and suggesting a 
 slightly modified variant with improved behavior. So, if you have some 
 time left, I would like to get your feedback ;-).

Thank you for your rich and detailed analysis!

I fully second your conclusion about the heap-string-with-a-buffer. For
different scenarios it could even be an advantage, in time and space, to
have string types with different buffer sizes, e.g. 8, 16, 32, 64.
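The heap-string-with-a-buffer idea with several buffer sizes could be sketched as a template over the buffer size, so that variants of 8, 16, 32, and 64 characters coexist as suggested. This is a toy model with invented names, not the rtl string implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// A string that keeps short contents in an embedded buffer and only
// spills to the heap when the content (plus NUL) does not fit.  The
// buffer size is a template parameter, so SmallString<8>, <16>, <32>,
// and <64> can coexist for different usage scenarios.
template <std::size_t BufferSize>
class SmallString {
public:
    explicit SmallString(char const * s) {
        m_length = std::strlen(s);
        if (m_length < BufferSize) {            // fits: no heap allocation
            std::memcpy(m_buffer, s, m_length + 1);
            m_data = m_buffer;
        } else {                                // too long: spill to heap
            m_data = new char[m_length + 1];
            std::memcpy(m_data, s, m_length + 1);
        }
    }
    ~SmallString() { if (m_data != m_buffer) delete[] m_data; }
    SmallString(SmallString const &) = delete;
    SmallString & operator=(SmallString const &) = delete;

    char const * c_str() const { return m_data; }
    std::size_t length() const { return m_length; }
    bool onHeap() const { return m_data != m_buffer; }

private:
    char m_buffer[BufferSize];
    char * m_data;
    std::size_t m_length;
};
```

With a 16-character buffer, a 12-character attribute name like office:value avoids the heap entirely, which is the 95% case in the calculation further below.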

I would like to add an additional view for the scenario of loading
a spreadsheet, which may also apply to loading other non-spreadsheet
OpenDocument files. I didn't verify though if it holds true in its
entirety nor do I know if it would be feasible to implement it in
a timely manner.


<table:table-cell office:value-type="float" office:value="10">
  <text:p>10</text:p>
</table:table-cell>

As you stated, the steps involved for parsing value cells are

1. the XML file gets parsed, about 9 strings (for every tag and value)
   get created for every cell,
2. the strings are passed via callbacks to the Calc engine,
3. the Calc engine converts the content string into a number
4. the strings get destructed.

The following values are known in advance and do not change:

- length(table:table-cell) = 16c
- length(office:value-type) = 17c
- length(office:value) = 12c
- length(text:p) = 6c

Here we could go one step further. It could be possible for the parser
to preallocate sets of strings for different usage scenarios, here cell
of value type float with value and textual representation, and store
them in a map, e.g. table:table-cell, office:value-type, float, or 
even a combined office:value-type="float", office:value, text:p.
The map would have to be accessible by unique IDs that are generated
from the sequences of characters using a collision-free algorithm. While
parsing the stream, IDs would be generated on the fly (hence
collision-free to not make it necessary to pass strings around for
comparison like it is necessary in a general hash map) and matches
against the map could then return references to a shared sequence of
a string. This would eliminate all allocations and deallocations of
steps #1 and #4 above where fixed strings that are known in advance are
involved, thus reducing concurrency of your example from 3 to 1 (the
value 10).
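The preallocated-string map could be sketched as below. std::unordered_map stands in for the collision-free-ID lookup described above, so this sketch still pays for the string comparison that the proposed scheme would avoid; all names are invented:

```cpp
#include <cassert>
#include <initializer_list>
#include <memory>
#include <string>
#include <unordered_map>

// A pool of preallocated shared strings for markup tokens known in
// advance (tag and attribute names, fixed values like "float").  The
// parser looks tokens up here and hands out references to one shared
// instance instead of constructing a fresh string per cell, eliminating
// the construction/destruction of steps #1 and #4.
class TokenPool {
public:
    TokenPool(std::initializer_list<char const *> tokens) {
        for (char const * t : tokens)
            m_pool.emplace(t, std::make_shared<std::string>(t));
    }

    // Returns the shared instance for a known token, or null for
    // variable content (e.g. the cell value "10").
    std::shared_ptr<std::string const> lookup(std::string const & s) const {
        auto it = m_pool.find(s);
        return it == m_pool.end() ? nullptr : it->second;
    }

private:
    std::unordered_map<std::string, std::shared_ptr<std::string const>> m_pool;
};
```

A null result marks the remaining variable content, which matches the reduction from 3 constructed strings to 1 (the value 10) in the example.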

Using your numbers and replacing the average string length of 12.75 with
8 (assumption of medium value string length) and adjusting the
probability of a value string matching the buffer length it would be

- buffer_size == 16c
- probability(sequence >= buffer) ==  5%
- probability(sequence < buffer)  == 95%

construction_time := 40 * (0.05 * (519insts + 35insts * 8) + 0.95 * 35insts * 8)
                  == 12238 insts  (yours: 84690 insts)

destruction_time  := 40 * 0.05 * 465insts
                  ==   930 insts  (yours: 27900 insts)

an overall cost of (12238 + 930 == 13168) insts compared to the 
previous (84690 + 27900 == 112590), resulting in a factor of 0.11 
of the previously needed time of your example; compared to the current 
shared-heap-string (171630 insts) it is even a factor of 0.07 and not 
only 0.65.
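The arithmetic above can be re-run as a quick sanity check. The instruction counts (519 insts for the heap construction path, 35 insts per copied character, 465 insts for heap destruction) are taken from this mail, not measured here:

```cpp
#include <cassert>
#include <cmath>

// Expected construction cost: each of `strings` strings pays the heap
// path (519 insts plus the character copy) with probability p_heap,
// and only the character copy (35 insts per character) otherwise.
long construction_insts(int strings, double p_heap, int avg_len) {
    double per_string = p_heap * (519 + 35.0 * avg_len)
                      + (1.0 - p_heap) * 35.0 * avg_len;
    return std::lround(strings * per_string);
}

// Expected destruction cost: only heap-allocated strings (probability
// p_heap) pay the 465-inst deallocation.
long destruction_insts(int strings, double p_heap) {
    return std::lround(strings * p_heap * 465.0);
}
```

Plugging in 40 strings, 5% heap probability, and average length 8 reproduces the 12238 and 930 insts above.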

Now the question remains whether there is an algorithm that generates
2*40 unique IDs and references to shared sequences in less than
(112590 - 13168 == 99422) instructions (~ 1242 insts per
string) and whether UDK would like to implement such a map :-)

Note: all of this is just a quick idea resulting from a brainstorm, 
nothing seriously investigated; I didn't even verify the numbers...

  Eike

-- 
 OOo/SO Calc core developer. Number formatter bedevilled I18N transpositionizer.
 GnuPG key 0x293C05FD:  997A 4C60 CE41 0149 0DB3  9E96 2F1A D073 293C 05FD
