Thanks, Kasun. You confirmed one suspicion (XML conformance). This information should help a lot.
Bill Burns
Verbum Communications, Inc.
+1.208.336.6081
bbu...@verbumcomm.com
http://www.verbumcomm.com

From: kasu...@gmail.com [mailto:kasu...@gmail.com] On Behalf Of Kasun Gajasinghe
Sent: Sunday, May 22, 2011 1:30 AM
To: Bill Burns
Cc: docbook-apps@lists.oasis-open.org
Subject: Re: [docbook-apps] Documentation for webhelpindexer.jar

On Sat, May 21, 2011 at 3:16 AM, Bill Burns <bbu...@verbumcomm.com> wrote:
> Hi, everyone.
>
> This is my first post. I apologize if this is off topic.
>
> I'm modifying a homespun web help transform based on the DocBook XSL 1.35
> HTML transform. I'm looking at retrofitting the web help indexer into the
> transform but having a bit of trouble finding documentation on it. Does any
> exist? Any caveats for attempting this? I'm not a developer, just an
> intrepid XSL tweaker.

The documentation for the original plugin that webhelpindexer is based on is at
http://www.helpml.com:8088/help/index.jsp?topic=/org.sample.help.doc/htmlsearch/DHSC_BestPractices_htmlsearch.html .
WebhelpIndexer is based on the htmlsearch DITA plugin, which we ported to DocBook with additional features.

As you know, a search has two components: indexing and searching. webhelpindexer takes care of indexing the contents. If you want to see how to invoke the webhelp indexer, have a look at the "index" target in the build.xml file of the DocBook webhelp transform (i.e. xsl/webhelp/build.xml). I hope you are familiar with Ant targets.

Do note that webhelpindexer is meant for XHTML transforms. It should work on HTML transforms too if your HTML files are XML-compatible, though that hasn't been tested.

You can follow the whole process in the Ant build.xml file. But to give a brief description of how to invoke the indexer from the command line:

* You need to have the following in your CLASSPATH:
  * webhelpindexer.jar, lucene-analyzers-3.0.0.jar, lucene-core-3.0.0.jar - These three are available in the extensions/ directory of docbook-xsl-1.76.1. If you can, use an XSL snapshot, which contains the latest version: http://docbook.sourceforge.net/snapshot/
  * xercesImpl.jar, xml-apis.jar - These two are available in the /usr/share/java directory on Linux distributions. Or you can download them.
* The main class is com.nexwave.nquindexer.IndexerMain
* Give two parameters as command-line arguments:
  * The folder containing the files that need to be indexed
  * (Optional) The language. Defaults to "en". See build.properties for details.
* You need to wrap the HTML content that needs to be indexed in a <div> tag with the id "content", i.e.
    <div id="content"> ... all the html contents except the toc, index etc. </div>

The following is the full command:

    java -cp webhelpindexer.jar:lucene-analyzers-3.0.0.jar:lucene-core-3.0.0.jar:/usr/share/java/xercesImpl.jar:/usr/share/java/xml-apis.jar com.nexwave.nquindexer.IndexerMain "/home/kasun/docbook/repository/trunk/xsl/webhelp/docs/content" "en"

That's all for the indexing part. This will create a directory, search/, which will contain the index.

Do not hesitate to ask any further questions you have!

Regards,
--Kasun

> Thanks,
>
> Bill Burns
> Verbum Communications, Inc.
> +1.208.336.6081
> bbu...@verbumcomm.com
> http://www.verbumcomm.com

--
~~~*******'''''''''''''*******~~~
Kasun Gajasinghe,
University of Moratuwa,
Sri Lanka.
Blog: http://kasunbg.blogspot.com
Twitter: http://twitter.com/kasunbg
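
For anyone who wants to script the command-line invocation described in the message, it can be wrapped so that the classpath is assembled in one place. This is only a minimal sketch for a Unix shell, not something shipped with DocBook XSL. The variable names (EXT_DIR, JAVA_LIB_DIR, DOCS_DIR, INDEX_LANG) and their default paths are assumptions to adjust for your own setup. Only the jar names, the main class, and the two arguments come from the message.

```shell
#!/bin/sh
# Hypothetical wrapper around the indexer command from the message.
# All four variables below are assumed names with assumed default paths;
# override them through the environment to match your installation.
EXT_DIR="${EXT_DIR:-docbook-xsl-1.76.1/extensions}"
JAVA_LIB_DIR="${JAVA_LIB_DIR:-/usr/share/java}"
DOCS_DIR="${DOCS_DIR:-docs/content}"
INDEX_LANG="${INDEX_LANG:-en}"

# Build a ':'-separated classpath from the five jars named in the message
# and remember any jar that is not present on disk.
CP=""
MISSING=""
for jar in \
    "$EXT_DIR/webhelpindexer.jar" \
    "$EXT_DIR/lucene-analyzers-3.0.0.jar" \
    "$EXT_DIR/lucene-core-3.0.0.jar" \
    "$JAVA_LIB_DIR/xercesImpl.jar" \
    "$JAVA_LIB_DIR/xml-apis.jar"
do
    [ -f "$jar" ] || MISSING="$MISSING $jar"
    CP="${CP:+$CP:}$jar"
done

# Only call the indexer when every jar was found.
if [ -n "$MISSING" ]; then
    echo "Not running the indexer; missing jars:$MISSING" >&2
else
    java -cp "$CP" com.nexwave.nquindexer.IndexerMain "$DOCS_DIR" "$INDEX_LANG"
fi
```

Run it from the directory that holds the DocBook XSL distribution, or set EXT_DIR and DOCS_DIR first. On Windows the classpath separator would be ";" rather than ":".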