Re: [docbook-apps] How to improve the build speed with saxon 6.X / docbook
On Friday 11 September 2009 at 06:13 +0100, Dave Pawson wrote:
> On 10/09/09 13:09, Sylvestre Ledru wrote:
>> Hello,
>>
>> I am currently trying to improve the build time of the documentation of a free scientific software package (Scilab). There are almost 1800 XML files, each between 1 k and 10 k in size. Before calling Saxon, some preprocessing is done (MathML to PNG through JEuclid, etc.) and finally all of the files are merged into a single file [1]. This file is processed with chunk.xsl or javahelp.xsl from docbook-xsl. Both take a long time (pretty much the same).
>>
>> However, the build time is way too long (from 30 to 60 minutes on a powerful computer to hours on a small CPU), especially on small architectures like s390 or armel. For example, the Debian build chains kill the process because it takes more than 150 minutes just to load the XML.
>>
>> Therefore, I am trying to improve the speed of the process. I wonder if there are any tricks to do so. Some people told me that merging all the XML files is not necessary, but I haven't been able to find out how to avoid it.
>
> Have you tried compiling the stylesheets using the Saxon option? Not something I've done, nor something I've heard being done on this list, but it definitely should show an improvement.

I am going to try. Do you have any pointers or documentation on this?

Thanks,
Sylvestre

- To unsubscribe, e-mail: docbook-apps-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: docbook-apps-h...@lists.oasis-open.org
Re: [docbook-apps] How to improve the build speed with saxon 6.X / docbook
On 14/09/09 10:13, Sylvestre Ledru wrote:
>> Have you tried compiling the stylesheets using the Saxon option? Not something I've done, nor something I've heard being done on this list, but it definitely should show an improvement.
>
> I am going to try. Do you have any pointers or documentation on this?
>
> Thanks,
> Sylvestre

I think I owe you an apology. It seems that compiling a stylesheet is a Saxon 9 option, not available in Saxon 6.5.5, which is the XSLT 1.0 engine needed for DocBook. Can anyone confirm this?

I'm looking at http://saxon.sourceforge.net/saxon6.5.5/using-xsl.html#Command-line for the command-line options.

http://www.saxonica.com/documentation/using-xsl/compiling.html documents using the pre-compiled stylesheet with XSLT 2.0.

regards

--
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk
Re: [docbook-apps] How to improve the build speed with saxon 6.X / docbook
Sylvestre,

A couple of things may help. Make sure that your catalogs are operating correctly and that resolving them does not go out to the net to resolve entries. This can slow things down greatly.

Also, have you thought about using xsltproc instead of Saxon? I maintain my DocBook tools so that I can use both Saxon and xsltproc, but Saxon is slower. xsltproc is in most Linux distros and there is a Windows package as well.

Which version of Saxon are you using, and which version of Java?

Regards,
Dean Nelson

In a message dated 09/10/09 05:09:38 Pacific Daylight Time, sylvestre.le...@inria.fr writes:
> Hello,
>
> I am currently trying to improve the build time of the documentation of a free scientific software package (Scilab). There are almost 1800 XML files, each between 1 k and 10 k in size. Before calling Saxon, some preprocessing is done (MathML to PNG through JEuclid, etc.) and finally all of the files are merged into a single file [1]. This file is processed with chunk.xsl or javahelp.xsl from docbook-xsl. Both take a long time (pretty much the same).
>
> However, the build time is way too long (from 30 to 60 minutes on a powerful computer to hours on a small CPU), especially on small architectures like s390 or armel. For example, the Debian build chains kill the process because it takes more than 150 minutes just to load the XML.
>
> Therefore, I am trying to improve the speed of the process. I wonder if there are any tricks to do so. Some people told me that merging all the XML files is not necessary, but I haven't been able to find out how to avoid it.
>
> I was wondering if there is a better way to structure the XML document. For now, it is (mainly) structured the following way (by merging all the XML files):
>
> <book>
>   <part>
>     <title>title of the chapter 1</title>
>     <refentry>
>       Details about the function [...]
>     </refentry>
>     <refentry>
>       Details about the function 2 [...]
>     </refentry>
>   </part>
>   <part>
>     <title>title of the chapter 2</title>
>     <refentry>
>       [...]
>     </refentry>
>   </part>
> </book>
>
> A few refentry elements are also stored in some chapter sections.
>
> There are quite a lot of links between the refentry elements (especially from the "see also" sections).
>
> Does anybody know how to improve this?
>
> Note that the PDF or PS generation is very fast and is based on the same master XML file.
>
> Many thanks,
> Sylvestre
>
> PS: I sent this email to the Saxon mailing list. They told me that this is most probably due to DocBook and not Saxon.
>
> [1] http://www.scilab.org/team/sylvestre.ledru/master_en_US_help-processed.xml.gz
Re: [docbook-apps] How to improve the build speed with saxon 6.X / docbook
Hello,

Thanks for your quick answer.

On Thursday 10 September 2009 at 06:59 -0700, DeanNelson wrote:
> Sylvestre,
>
> A couple of things may help. Make sure that your catalogs are operating correctly and that resolving them does not go out to the net to resolve entries. This can slow things down greatly.

A silly question :). How can I be sure of that?

> Also, have you thought about using xsltproc instead of Saxon? I maintain my DocBook tools so that I can use both Saxon and xsltproc, but Saxon is slower. xsltproc is in most Linux distros and there is a Windows package as well.

I already checked with xsltproc and the processing time is about the same...

> Which version of Saxon are you using, and which version of Java?

Saxon 6.5 and OpenJDK 6b16 (but I have the same issue with the Sun JDK).

Regards,
Sylvestre
RE: [docbook-apps] How to improve the build speed with saxon 6.X / docbook
>> A couple of things may help. Make sure that your catalogs are operating correctly and that resolving them does not go out to the net to resolve entries. This can slow things down greatly.
>
> A silly question :). How can I be sure of that?

The way I usually realize that my catalogs aren't working is to build a doc while offline :-)

>> Also, have you thought about using xsltproc instead of Saxon? I maintain my DocBook tools so that I can use both Saxon and xsltproc, but Saxon is slower. xsltproc is in most Linux distros and there is a Windows package as well.
>
> I already checked with xsltproc and the processing time is about the same...

That's been my experience too.

David
Re: [docbook-apps] How to improve the build speed with saxon 6.X / docbook
One way is to turn on the verbosity in the CatalogManager.properties file:

verbosity=4

Non-zero values print debugging info to the screen. This will be of great help in knowing whether all of the net-centric info is being resolved to a local location. This all assumes that you are using a catalog ;-) If not, then everything is going out to the net.

When you tried xsltproc, did you have the --nonet switch on the command line? I don't think Saxon has a similar switch, but it does have the CatalogManager.properties file, which helps with these types of issues. It may be only one unresolved entry that slows things down, so you will have to look really closely at the output.

Also, I use the JEuclid FOP plugin to render my MathML equations during the FOP generation. This saves a conversion step at the beginning.

Regards,
Dean Nelson

In a message dated 09/10/09 07:32:09 Pacific Daylight Time, dcra...@motive.com writes:
>>> A couple of things may help. Make sure that your catalogs are operating correctly and that resolving them does not go out to the net to resolve entries. This can slow things down greatly.
>>
>> A silly question :). How can I be sure of that?
>
> The way I usually realize that my catalogs aren't working is to build a doc while offline :-)
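As a complement to reading the resolver's verbose output, it can help to list up front which identifiers in a document or stylesheet point at the network at all, since those are the ones a catalog has to map to local files. The following is only a rough sketch: the names `remote_references` and `REMOTE_REF` are made up for this example, and it does a plain text scan rather than a real XML parse, so it can miss unusual declarations.

```python
import re

# Text that can precede a fetchable identifier: a SYSTEM keyword, a PUBLIC
# keyword followed by its public identifier, or the href of an xsl:import /
# xsl:include element. Ordinary hyperlinks in the content are ignored.
PREFIX = r"""(?:\bSYSTEM\s+|\bPUBLIC\s+(?:"[^"]*"|'[^']*')\s+|<xsl:(?:import|include)\b[^>]*?\bhref\s*=\s*)"""
# A quoted http or https URL; the URL itself is the captured group.
QUOTED_URL = r"""["'](https?://[^"']+)["']"""
REMOTE_REF = re.compile(PREFIX + QUOTED_URL)


def remote_references(text):
    """Return the distinct http(s) identifiers in text, in order of first appearance."""
    seen = []
    for match in REMOTE_REF.finditer(text):
        url = match.group(1)
        if url not in seen:
            seen.append(url)
    return seen


if __name__ == "__main__":
    import sys

    # Usage: python scan_refs.py master.xml custom.xsl
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for url in remote_references(handle.read()):
                print(f"{path}: {url}")
```

Every URL it prints should then show up as resolved to a local file in the verbosity=4 output; any that does not is a candidate for the one unresolved entry mentioned above.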
Re: [docbook-apps] How to improve the build speed with saxon 6.X / docbook
Bad luck, it does not seem to be related to the network... I tried with both and the time is pretty much the same.

I am going to do the same as you for MathML.

Thanks again for your advice!

Sylvestre

On Thursday 10 September 2009 at 07:57 -0700, DeanNelson wrote:
> One way is to turn on the verbosity in the CatalogManager.properties file [...]
>
> When you tried xsltproc, did you have the --nonet switch on the command line? [...]
>
> Also, I use the JEuclid FOP plugin to render my MathML equations during the FOP generation. This saves a conversion step at the beginning.
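For anyone who wants to repeat the comparison with and without the --nonet switch discussed in this thread, a small timing harness could look like the sketch below. It assumes xsltproc is on the PATH; the function names and the file names in the usage note are placeholders for this example, not part of the Scilab build.

```python
import subprocess
import time


def xsltproc_command(stylesheet, source, output, nonet=True):
    """Build an xsltproc argument list.

    With nonet=True the --nonet switch is added, so xsltproc refuses to
    fetch DTDs or entities over the network.
    """
    command = ["xsltproc"]
    if nonet:
        command.append("--nonet")
    command += ["--output", output, stylesheet, source]
    return command


def timed_run(command):
    """Run command and return (exit code, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = subprocess.run(command)
    return result.returncode, time.perf_counter() - start
```

Calling `timed_run(xsltproc_command("chunk.xsl", "master.xml", "out/"))` and then the same with `nonet=False` gives two wall-clock figures to compare. If they are close, as reported here, network lookups are unlikely to be the bottleneck.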
Re: [docbook-apps] How to improve the build speed with saxon 6.X / docbook
On 10/09/09 13:09, Sylvestre Ledru wrote:
> [...]
> Therefore, I am trying to improve the speed of the process. I wonder if there are any tricks to do so. Some people told me that merging all the XML files is not necessary, but I haven't been able to find out how to avoid it.

Have you tried compiling the stylesheets using the Saxon option? Not something I've done, nor something I've heard being done on this list, but it definitely should show an improvement.

regards

--
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk