Re: [udk-dev] Thoughts on String Construction / Destruction Performance
Kay Ramme - Sun Germany - Hamburg wrote:
> Hi guys,
>
> Thorsten Behrens wrote:
>> Eike Rathke [EMAIL PROTECTED] writes:
>>> A specialized parser could almost certainly be faster than the
>>> general SAX parser passing strings back and forth. I wouldn't do it
>>> with lex/yacc though; they're a nightmare to maintain, and in case
>>> wrong code was generated, which can happen, you're almost lost. I'd
>>> prefer boost::spirit instead, but it might be even more work to
>>> implement the parser. I've no idea though whether boost::spirit
>>> would be suitable to parse an ODF tree.
>> hm. I'd profile a larger test case beforehand - spirit is a recursive
>> parser vs. yacc being table-driven. But OTOH, maybe contemporary
>> optimizers are able to compensate for that. And I'd definitely bump
>> our boost to 1.33, then - spirit (and lots of other stuff) has been
>> improved a lot.
> At least in theory, an unlimited (in the sense of programming language
> constructs etc.) parser generator such as yacc should be better than a
> limited one such as boost::spirit. I know that in practice the gcc C++
> parser, which AFAIK was originally a bison-generated parser, has been
> replaced by a hand-written one. Which seems to show that bison does
> _not_ generate optimal code.

yacc/bison can only handle LALR(1) grammars, so if it is difficult or impossible to press your grammar into that form (and for C++ it is at least difficult), you are better off using some other approach than LALR(1). This does not necessarily have anything to do with the performance of the resulting parser, and it definitely has nothing to do with whether a parser generator is implemented as a stand-alone tool or as an embedded DSL.

-Stephan

> Reaching the point where the parser becomes the main bottleneck, we
> should try out independent implementations using boost::spirit and
> bison/yacc. Personally, I am a fan of domain-oriented programming
> languages and would therefore favor bison. But I would certainly be
> open to everything fast & clean :-)
>
> Kay

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [udk-dev] Thoughts on String Construction / Destruction Performance
Stephan Bergmann wrote:
> Kay Ramme - Sun Germany - Hamburg wrote:
>> [...]
>> At least in theory, an unlimited (in the sense of programming language
>> constructs etc.) parser generator such as yacc should be better than a
>> limited one such as boost::spirit. I know that in practice the gcc C++
>> parser, which AFAIK was originally a bison-generated parser, has been
>> replaced by a hand-written one. Which seems to show that bison does
>> _not_ generate optimal code.
> yacc/bison can only handle LALR(1) grammars, so if it is difficult or
> impossible to press your grammar into that form (and for C++ it is at
> least difficult), you are better off using some other approach than
> LALR(1). This does not necessarily have anything to do with the
> performance of the resulting parser,

You are right.

> and this definitely has nothing to do with whether a parser generator
> is implemented as a stand-alone tool or as an embedded DSL.

If I understand correctly (I have not yet taken a look at the code Thorsten pointed to, but promise to do so :-), an embedded parser gets generated at runtime, while yacc/bison generate the parser at compile time.

> -Stephan

Kay

>> Reaching the point where the parser becomes the main bottleneck, we
>> should try out independent implementations using boost::spirit and
>> bison/yacc. Personally, I am a fan of domain-oriented programming
>> languages and would therefore favor bison. But I would certainly be
>> open to everything fast & clean :-)
>>
>> Kay
Re: [udk-dev] Thoughts on String Construction / Destruction Performance
Hi guys,

Thorsten Behrens wrote:
> Eike Rathke [EMAIL PROTECTED] writes:
>> A specialized parser could almost certainly be faster than the general
>> SAX parser passing strings back and forth. I wouldn't do it with
>> lex/yacc though; they're a nightmare to maintain, and in case wrong
>> code was generated, which can happen, you're almost lost. I'd prefer
>> boost::spirit instead, but it might be even more work to implement the
>> parser. I've no idea though whether boost::spirit would be suitable to
>> parse an ODF tree.
> hm. I'd profile a larger test case beforehand - spirit is a recursive
> parser vs. yacc being table-driven. But OTOH, maybe contemporary
> optimizers are able to compensate for that. And I'd definitely bump our
> boost to 1.33, then - spirit (and lots of other stuff) has been
> improved a lot.

At least in theory, an unlimited (in the sense of programming language constructs etc.) parser generator such as yacc should be better than a limited one such as boost::spirit. I know that in practice the gcc C++ parser, which AFAIK was originally a bison-generated parser, has been replaced by a hand-written one. Which seems to show that bison does _not_ generate optimal code.

Reaching the point where the parser becomes the main bottleneck, we should try out independent implementations using boost::spirit and bison/yacc. Personally, I am a fan of domain-oriented programming languages and would therefore favor bison. But I would certainly be open to everything fast & clean :-)

Kay
Re: [udk-dev] Thoughts on String Construction / Destruction Performance
Hi Thorsten,

On Sat, Sep 24, 2005 at 23:18:24 +0200, Thorsten Behrens wrote:
> hm. I'd profile a larger test case beforehand - spirit is a recursive
> parser vs. yacc being table-driven.

A valid concern. IMHO yacc _will_ be faster anyway; the question is just how much, and whether spirit would be suitable at all. And whether developing yet another parser is worth it. As there are many XML parsers already available, what about eXpat, for example? It is used by Mozilla, Perl, and Python, and is said to be very fast. http://expat.sourceforge.net/

Btw, the developer of eXpat is James Clark, who is a very active member of the Thai OOo community; he might be able to tell us whether eXpat would be feasible for fast ODF parsing.

Other parsers: http://www.xml.com/pub/rg/XML_Parsers

> But OTOH, maybe contemporary optimizers are able to compensate for
> that. And I'd definitely bump our boost to 1.33, then - spirit (and
> lots of other stuff) has been improved a lot.

As we're talking about an OOo 3.0 target, I guess, I see no problem with this.

  Eike

--
OOo/SO Calc core developer. Number formatter bedevilled I18N transpositionizer.
GnuPG key 0x293C05FD: 997A 4C60 CE41 0149 0DB3 9E96 2F1A D073 293C 05FD
Re: [udk-dev] Thoughts on String Construction / Destruction Performance
Eike Rathke [EMAIL PROTECTED] writes:
> A specialized parser could almost certainly be faster than the general
> SAX parser passing strings back and forth. I wouldn't do it with
> lex/yacc though; they're a nightmare to maintain, and in case wrong
> code was generated, which can happen, you're almost lost. I'd prefer
> boost::spirit instead, but it might be even more work to implement the
> parser. I've no idea though whether boost::spirit would be suitable to
> parse an ODF tree.

hm. I'd profile a larger test case beforehand - spirit is a recursive parser vs. yacc being table-driven. But OTOH, maybe contemporary optimizers are able to compensate for that. And I'd definitely bump our boost to 1.33, then - spirit (and lots of other stuff) has been improved a lot.

--
Thorsten

If you're not failing some of the time, you're not trying hard enough.
Re: [udk-dev] Thoughts on String Construction / Destruction Performance
Hi Niklas,

Niklas Nebel wrote:
> Kay Ramme - Sun Germany - Hamburg wrote:
>> The calculation just gives what the optimum (minimum) looks like. If
>> the implementation does more string instantiations than calculated,
>> then it is obviously not optimal (in this aspect) and the impact of
>> string con-/destruction is increased.
> You're right. But real numbers, also for the memory overhead in a
> complete running office, would sure be interesting.

These numbers obviously need to be measured. If they differ from what we calculated, then we know that the implementations are not optimal (which we probably know already anyway ;-), but I think the point is that we can judge the implementation. E.g. if the predicted value for string instantiations is X and we measure 100 x X, we know that we are relatively far away from the optimum and need to do some optimizations; but if we measure 1.1 x X, then we know we are already near the optimum and that it will probably be hard to improve performance further. Hope you understand what I mean ;-).

> Another thing: test_character_copy looks somewhat strange. 100
> iterations of memcpy for 1024 bytes each, and the total time divided
> by 1024?

Hey, that is a bug! Thanks for pointing this out. Oh well, that happens when one is too lazy to follow the little obscurities. That probably means that a sal_Unicode copy is much less (1/100) expensive than 34 insts. I will update the calculations accordingly.

> Niklas

Kay
Re: [udk-dev] Thoughts on String Construction / Destruction Performance
Hi Niklas,

Niklas Nebel wrote:
> Taking an attribute name as an example, with the SAX interface
> inbetween, the "via callbacks" part looks (roughly) like this: The
> name is copied into a vector for the XAttributeList implementation,
> copied again for two getNameByIndex calls (one to look for namespace
> attributes, one in the implementation of the cell element context),
> then a hash_map is used to separate it into a namespace ID and local
> name.

That is correct. Some of the copy operations may easily be omitted, but with nearly no measurable impact.

> Considering this, the calculation might be a bit simplistic.

The calculation just gives what the optimum (minimum) looks like. If the implementation does more string instantiations than calculated, then it is obviously not optimal (in this aspect) and the impact of string con-/destruction is increased.

> Niklas

Kay
Re: [udk-dev] Thoughts on String Construction / Destruction Performance
Hi Kay,

On Tue, Sep 13, 2005 at 17:38:52 +0200, Kay Ramme - Sun Germany - Hamburg wrote:
> investigating especially into construction/destruction and suggesting
> a slightly modified variant, with improved behavior. So, if you have
> some time left, I would like to get your feedback ;-).

Thank you for your rich and detailed analysis! I fully second your conclusion about the heap-string-with-a-buffer. For different scenarios it could even be an advantage, in time and space, to have string types with different buffer sizes, e.g. 8, 16, 32, 64.

I would like to add an additional view for the scenario of loading a spreadsheet, which may also apply to loading other non-spreadsheet OpenDocument files. I didn't verify though whether it holds true in its entirety, nor do I know if it would be feasible to implement it in a timely manner.

  <table:table-cell office:value-type="float" office:value="10">
    <text:p>10</text:p>
  </table:table-cell>

As you stated, the steps involved in parsing value cells are:

1. the XML file gets parsed; about 9 strings (for every tag and value) get created for every cell,
2. the strings are passed via callbacks to the Calc engine,
3. the Calc engine converts the content string into a number,
4. the strings get destructed.

The following values are known in advance and do not change:

- length(table:table-cell) = 16c
- length(office:value-type) = 17c
- length(office:value) = 12c
- length(text:p) = 6c

Here we could go one step further. It could be possible for the parser to preallocate sets of strings for different usage scenarios (here: a cell of value type float with value and textual representation) and store them in a map, e.g. "table:table-cell", "office:value-type", "float", or even a combined "office:value-type=\"float\"", "office:value", "text:p". The map would have to be accessible by unique IDs that are generated from the sequences of characters using a collision-free algorithm.

While parsing the stream, IDs would be generated on the fly (hence collision-free, so that it is not necessary to pass strings around for comparison, as it is in a general hash map), and matches against the map could then return references to a shared sequence of a string. This would eliminate all allocations and deallocations of steps #1 and #4 above where fixed strings that are known in advance are involved, thus reducing the concurrency of your example from 3 to 1 (the value 10).

Using your numbers, replacing the average string length of 12.75 with 8 (assuming a medium value string length) and adjusting the probability of a value string matching the buffer length, it would be:

- buffer_size == 16c
- probability(sequence >= buffer) == 5%
- probability(sequence < buffer) == 95%

construction_time := 40 * (0.05 * (519insts + 35insts * 8) + 0.95 * 35insts * 8) == 12238 insts (yours: 84690 insts)
destruction_time := 40 * 0.05 * 465insts == 930 insts (yours: 27900 insts)

That is an overall cost of (12238 + 930 == 13168) compared to the previous (84690 + 27900 == 112590), resulting in a factor of 0.11 of the previously needed time of your example; compared to the current shared-heap-string (171630 insts) it is even a factor of 0.07, and not only 0.65.

Now the question remains whether there is an algorithm that generates 2*40 unique IDs and references to shared sequences in less than (112590 - 13168 == 99422) instructions (~1242 insts per string), and whether UDK would like to implement such a map :-)

Note: all of this is just a quick idea resulting from a brainstorm, nothing seriously investigated; I didn't even verify the numbers...

  Eike

--
OOo/SO Calc core developer. Number formatter bedevilled I18N transpositionizer.
GnuPG key 0x293C05FD: 997A 4C60 CE41 0149 0DB3 9E96 2F1A D073 293C 05FD