Hi Luis,

I fetched the updated script from svn and timed it on our server with the large test file. The timings show that the interact.xml+interact.xsl transform is indeed much faster with all processors:
xsltproc    - 72.41s (prev 1011.96s)
xalan       - 111.44s (prev 1206.95s)
saxon-xslt  - 82.96s (prev 491.95s)
saxonb-xslt - 20.44s (prev 132.19s)

Certainly for us using saxonb-xslt it's now feasible to look at protxml results with 4500+ protein IDs, the limitation definitely being the web browser coping with the large HTML file rather than the xslt processing being very slow.

DT

Luis Mendoza wrote:
> Hello Dave,
>
> Thanks for the suggestion and research. We are planning on developing a
> completely new prot-xml viewer within the next few months, since the current
> one has severe limitations when dealing with very large files.
>
> In the meantime, David and I have just checked in a version of protxml2html
> that performs about 4-5 times faster than the previous one (still using
> xsltproc). It would be interesting to see if this speed-up holds with other
> xslt engines; we may tweak this a bit more in the next few weeks before the
> full re-write.
>
> Cheers,
> --Luis
>
>
> On Wed, Dec 2, 2009 at 9:17 AM, dctrud
> <[email protected]> wrote:
> Brian,
>
> The Ubuntu libsaxonb-java package in the universe repository installs
> the script /usr/bin/saxonb-xslt which fires up saxonb under Java. It
> expects filenames to be specified as the TPP does to Xalan, i.e.
>
> saxonb-xslt <xml file> <xsl file>
>
> ... so it will work if you just replace references to /usr/bin/xsltproc
> with /usr/bin/saxonb-xslt in the pl scripts.
>
> My quick tests were done on a command-line xslt transform. Actual
> performance when using protxml2html.pl via a web browser doesn't
> improve as much since the resulting html is still huge, and takes time
> to transfer and for the browser to display. It does seem very useful
> for getting huge results files out into text format quickly by
> invoking protxml2html on the command line though.
> Another thing to note is that on very small files xsltproc is probably
> still faster due to the overhead of starting up a JVM for Saxon.
>
> The commercial Saxon-EE is very impressive with its 28x speed-up, but
> the free speedup of 7.5x with saxonb is still very nice. I also tested
> Saxon-HE (new open source version that is replacing Saxonb, but not
> packaged for Ubuntu), and it's about the same as saxonb.
>
> DT
>
> On Dec 2, 4:53 pm, Brian Pratt
> <[email protected]> wrote:
>> Impressive! I'm unclear, though, on the practicalities of how it replaces
>> xsltproc (which is an executable) - presumably there's a script that invokes
>> java? In which case we have a TPP java dependency we didn't have before -
>> not that this is necessarily an insurmountable problem, and one we'll
>> probably have to address sooner than later anyway.
>>
>> Brian Pratt
>>
>> On Wed, Dec 2, 2009 at 4:28 AM, dctrud
>> <[email protected]> wrote:
>>> Have obtained a Saxon-EE evaluation to try it. Same .prot.xml file,
>>> same server - 35.04s (3.5% of the xsltproc run-time). Down side is
>>> that it costs 300 GBP per server.
>>> DT
>>> On Dec 2, 10:45 am, dctrud
>>> <[email protected]> wrote:
>>>> I've just done a quick comparison of the speeds of various XSLT
>>>> processors for transforming .prot.xml to html. There is a marked
>>>> difference between the processors, and xsltproc which is the TPP
>>>> default is not the quickest.
>>>> Tests performed on Ubuntu 9.0x 64-bit on a DELL R600 Dual Xeon 5500
>>>> 32GB RAM. All processors are installed from their Ubuntu packages.
>>>> Input document was a large 200 MB .prot.xml file resulting from OMSSA
>>>> search of the 72-run MaxQuant dataset downloaded from ProteomeCommons:
>>>> xsltproc - 1011.96s
>>>> xalan - 1206.95s
>>>> saxon-xslt - 491.95s
>>>> saxonb-xslt - 132.19s
>>>> saxonb-xslt works for me as a direct replacement for xsltproc in the
>>>> $xsltproc definition in protxml2html.pl
>>>> I've not tried the commercial Saxon-SA / Saxon-EE from Saxonica.com,
>>>> but they are supposedly faster still.
>>>> DT
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "spctools-discuss" group.
>>> To post to this group, send email to
>>> [email protected].
>>> To unsubscribe from this group, send email to
>>> [email protected].
>>> For more options, visit this group at
>>> http://groups.google.com/group/spctools-discuss?hl=en.
>>
>
--
Dr. David Trudgian
Bioinformatician in Proteomics
University of Oxford
Mon-Thu: CCMP, Roosevelt Drive
Tel: (+44) (01865 2)87784
Friday : Dunn School of Pathology, S. Parks Rd.
Tel: (+44) (01865 2)75557
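
For anyone wanting to compare the two sets of timings quoted in this thread, the per-processor speed-up factors can be worked out with a few lines of Python. This is only a rough sketch: the names `previous`, `updated` and `speedups` are made up for illustration and are not part of the TPP, and the numbers are simply copied from the messages above.

```python
# Wall-clock timings in seconds, copied from the thread, for the same
# large .prot.xml test file on the same server.
# The variable and function names here are illustrative assumptions.
previous = {
    "xsltproc": 1011.96,
    "xalan": 1206.95,
    "saxon-xslt": 491.95,
    "saxonb-xslt": 132.19,
}
updated = {
    "xsltproc": 72.41,
    "xalan": 111.44,
    "saxon-xslt": 82.96,
    "saxonb-xslt": 20.44,
}


def speedups(before, after):
    """Return the before/after run-time ratio for each processor."""
    return {name: before[name] / after[name] for name in before}


if __name__ == "__main__":
    for name, factor in speedups(previous, updated).items():
        print(f"{name:12s} {updated[name]:8.2f}s  ({factor:.1f}x faster than before)")
```

On these figures saxonb-xslt is still the quickest in absolute terms, although xsltproc shows the largest relative gain from the updated script.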
