On Mon, Oct 26, 2015 at 11:53 AM, Alexander Lakhin <a.lak...@postgrespro.ru> wrote:
> Hello, Peter.
>
> I've managed to speed up html generation from xml (make xslthtml) from 32
> min. (in my environment) to 4 min. by modifying slowest XSL templates.
> All my modifications incorporated in a single file
> stylesheet-xhtml-speedup.xsl, which is included in stylesheet.xsl.
> I performed optimization by analyzing output of:
> xsltproc --profile --stringparam pg.version '9.6devel' stylesheet.xsl postgres.xml
>
> Initial statistics:
> number  match      name                         mode           Calls  Tot 100us    Avg
>
>      0  appendix                                label.markup   23090   90677526   3927
>      1  chapter                                 label.markup   28870   39740757   1376
>      2             chunk-all-sections                           1289   23845066  18498
>      3             make.legalnotice.head.links                  2578    9630258   3735
>      4  indexterm                               reference       2579    4126513   1600
>      5             html.head                                    1289    3112534   2414
> ...
> index  % time     self  children        called  name
>                  0.479  1326.034      22/23090  toc.line [61]
>                  5.128  1308.245   21944/23090  sect1[label.markup] [13]
>                  3.772  1318.264     850/23090  substitute-markup [15]
>                  1.355  1304.631     274/23090  figure|table|example[label.markup] [32]
> [0]     47.95  906.775     1.613         23090  appendix[label.markup] [0]
>                  1.613     0.000   23090/23090  autolabel.format [29]
> -----------------------------------------------
>                  5.128  1308.245   24708/28870  sect1[label.markup] [13]
>                  0.479  1326.034     130/28870  toc.line [61]
>                  3.772  1318.264    2112/28870  substitute-markup [15]
>                  1.355  1304.631    1920/28870  figure|table|example[label.markup] [32]
> [1]     21.01  397.408     1.613         28870  chapter[label.markup] [1]
>                  1.613     0.000   28870/28870  autolabel.format [29]
> -----------------------------------------------
>                  0.164   238.606     1289/1289  process-chunk-element [98]
> [2]     12.61  238.451     0.225          1289  chunk-all-sections [2]
>                  0.225     7.117     1289/1289  process-chunk [86]
> -----------------------------------------------
>                 31.125   112.261     1289/2578  html.head [5]
>                 96.303    96.726     1289/2578  make.legalnotice.head.links [3]
> [3]      5.09   96.303    96.726          2578  make.legalnotice.head.links [3]
>                 96.303    96.726     1289/3867  make.legalnotice.head.links [3]
>                  0.339     0.494     1289/3867  *[object.title.markup.textonly] [69]
>                  0.085     0.781     1289/3867  ln.or.rh.filename [116]
> -----------------------------------------------
>
> Current statistics:
> number  match      name                         mode           Calls  Tot 100us    Avg
>
>      0             chunk-all-sections                           1289    5405958   4193
>      1             make.legalnotice.head.links                  1289    3159538   2451
>      2             html.head                                    1289    3068417   2380
>      3             gentext.template                           689835    2327761      3
>      4             l10n.language                              564453    1455253      2
>      5             href.target                                 29881    1344063     44
> ---
> index  % time     self  children        called  name
>                  0.136    54.207     1289/1289  process-chunk-element [95]
> [0]     20.40   54.060     0.312          1289  chunk-all-sections [0]
>                  0.312     6.468     1289/1289  process-chunk [67]
> -----------------------------------------------
>                 30.684    45.458     1289/1289  html.head [2]
> [1]     11.92   31.595     0.448          1289  make.legalnotice.head.links [1]
>                  0.290     0.403     1289/2578  *[object.title.markup.textonly] [71]
>                  0.159     0.828     1289/2578  ln.or.rh.filename [91]
> -----------------------------------------------
>                  0.330    31.617     1289/1289  chunk-element-content [65]
> [2]     11.58   30.684    45.458          1289  html.head [2]
>                 31.595     0.448    1289/15462  make.legalnotice.head.links [1]
>                 13.441     4.726    5153/15462  href.target [5]
>                  0.290     0.403    5153/15462  *[object.title.markup.textonly] [71]
>                  0.115     1.576    1289/15462  head.content [99]
>                  0.012     0.000    1289/15462  system.head.content [186]
>                  0.006     0.000    1289/15462  user.head.content [228]
>
> To make sure that result of the transformation is the same, I've compared
> original .html's with .html's generated with modified templates.
> Unfortunately xslt generates random id's, so it's needed to exclude them
> before comparing. I do that with:
> for f in */*.html; do
>   sed -e 's/id=\"\(ftn\.\)\?id[a-z][0-9]\+\"/id=\"id\"/g' -i $f
>   sed -e 's/href=\"[^#]*#\(ftn\.\)\?id[a-z][0-9]\+\"/href=\"#\"/g' -i $f
> done
>
> So if it's acceptable way to speed up generation of HTML (and maybe some
> other formats), what other steps should we take to move away from SGML?
> If the performance is still not satisfying, please let me know, I'll
> continue to optimize xslt.
> Besides performance issues, I can see some difference in results of 'make
> html' and 'make xslthtml'. For example, see doc/src/sgml/html/spi.html
> (xslt-generated version doesn't contain the lists of functions).
>
> Best regards,
> Alexander

I think this is a great result, and it's worth starting the move to XML.
I'd also note that it's the 21st century and we should think about
including pictures in our documentation, which would greatly improve it.
XML makes this easier.

> 06.04.2015 23:02, Peter Eisentraut wrote:
>
>> On 4/2/15 5:22 PM, Luzanov Pavel wrote:
>>
>>> Peter,
>>>
>>> I found this message in archives:
>>>
>>> http://www.postgresql.org/message-id/flat/519c3d99.9000...@gmx.net#519c3d99.9000...@gmx.net
>>>
>>> and, as you recommend, tested a speed of building docs on a fresh Ubuntu
>>> installation.
>>>
>>> 50sec for make html
>>> and 14min 50sec for make xslthtml
>>> 17 times slower!
>>>
>>> Is it still a main stopper for moving to XML?
>>>
>> Yes :)
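
For anyone who wants to repeat the before/after comparison without editing
the generated files in place, here is a rough sketch of how it might be
scripted. It applies the same two substitutions as the sed commands quoted
above, rewritten as extended regular expressions. The function names
normalize_ids and compare_trees are made up for illustration, and the sketch
assumes a sed that accepts -E and a mktemp that accepts -d; it has not been
run against the real doc build.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of the doc build.

# Replace generated ids (stdin -> stdout) with fixed placeholders.
# Same substitutions as the quoted sed commands, in -E syntax.
normalize_ids() {
    sed -E \
        -e 's/id="(ftn\.)?id[a-z][0-9]+"/id="id"/g' \
        -e 's/href="[^#]*#(ftn\.)?id[a-z][0-9]+"/href="#"/g'
}

# Compare every *.html in directory $1 with the same-named file in $2,
# ignoring generated ids. Returns 0 if all match, 1 otherwise.
compare_trees() {
    orig_dir=$1
    new_dir=$2
    tmp=$(mktemp -d) || return 2
    status=0
    for f in "$orig_dir"/*.html; do
        [ -f "$f" ] || continue          # glob matched nothing
        name=$(basename "$f")
        if [ ! -f "$new_dir/$name" ]; then
            echo "missing: $name"
            status=1
            continue
        fi
        normalize_ids < "$f" > "$tmp/a"
        normalize_ids < "$new_dir/$name" > "$tmp/b"
        if ! cmp -s "$tmp/a" "$tmp/b"; then
            echo "differs: $name"
            status=1
        fi
    done
    rm -rf "$tmp"
    return $status
}
```

Calling compare_trees with two directories of generated HTML (for example a
copy made before the stylesheet change and one made after) would list the
files that still differ once the ids are masked, and leaves both trees
untouched.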