On 28 October 2010 17:37, MJ Suhonos <m...@suhonos.ca> wrote:
> Let me openly state that I've never used Turbomarc. I believe the "special
> case" they are referring to is the subfield code with a value of "η", which
> is non-alphanumeric. I don't know enough about MARC to even begin guessing
> what this means or why it might occur (or not).
>
> The use case I see for Turbomarc is when you:
>
> 1- have a need for high performance
> 2- are converting binary MARC to XML
> 3- are writing your own XSLT to manipulate that XML (since it's not MARCXML)
>
> The first comment claims a 30-40% increase in XML parsing, which seems
> obvious when you compare the number of characters in the example provided:
> 277 vs. 419, or about 34% fewer going through the parser.

The speedup can be much greater than that. From the blog post itself: "Using xsltproc --timing showed that our transformations were faster by a factor of 4-5. Shortening the element names only improved performance fractionally, but since everything counts, we decided to do this as well". xsltproc uses the highly optimised LibXML/LibXSLT stack, which I would guess has less constant-time overhead than the PHP simplexml parser that yielded the smaller speedup.
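
For what it's worth, the character-count arithmetic in the quoted message is easy to check. The sketch below is only illustrative: the two counts (419 and 277) come from the quoted message, while the function and variable names are my own. It shows that the size reduction alone corresponds to a ratio of roughly 1.5, well short of the factor of 4-5 reported for xsltproc.

```python
def percent_fewer(before: int, after: int) -> float:
    """Return the percentage reduction going from `before` to `after`."""
    return 100.0 * (before - after) / before


# Character counts taken from the quoted message (example record only).
marcxml_chars = 419
turbomarc_chars = 277

reduction = percent_fewer(marcxml_chars, turbomarc_chars)
size_ratio = marcxml_chars / turbomarc_chars

print(f"{reduction:.1f}% fewer characters")   # 33.9% fewer characters
print(f"size ratio: {size_ratio:.2f}x")       # size ratio: 1.51x
```

So if parse time scaled purely with input size, the best one could expect from the shorter markup is about 1.5x, which is in line with the blog's remark that shortening element names helped only fractionally.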