Yes. We still use the ridx.py file shown here to index the HTML contents: http://www.mozilla.org/projects/help-viewer/ridx.py.txt
Could this be done with a (not so) simple program that searches inside the HTML files for <hN>...</hN> and <a name="..."></a> and generates the help-index.rdf? It could probably introduce additional markup, taking advantage of the migration to XHTML.
The ridx.py script looks for anchors formatted in the following way:
<A NAME="sample index entryIDX"></A>
It extracts the anchor name without the trailing IDX and makes it an entry in the RDF file that lives in the tree. I believe that the project page describes this process.
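To make the anchor convention concrete, here is a rough sketch of the extraction step. This is not the actual ridx.py, just an illustration under my own assumptions: the function name index_entries and the regular expression are made up for this example, and it only collects the entries without writing any RDF.

```python
import re

# Assumed pattern: an empty anchor such as <A NAME="...IDX"></A>.
# Tag and attribute names are matched case-insensitively.
ANCHOR_RE = re.compile(r'<a\s+name="([^"]*)"\s*>\s*</a>', re.IGNORECASE)


def index_entries(html):
    """Return (index label, anchor name) pairs for anchors ending in IDX.

    Hypothetical helper for illustration; the real script may differ.
    """
    entries = []
    for match in ANCHOR_RE.finditer(html):
        name = match.group(1)
        # Only anchors whose name ends with the literal suffix IDX count.
        if name.endswith("IDX"):
            entries.append((name[:-len("IDX")], name))
    return entries


if __name__ == "__main__":
    sample = '<A NAME="sample index entryIDX"></A> <a name="plain"></a>'
    for label, anchor in index_entries(sample):
        print(label, "->", "#" + anchor)
```

Each pair gives the text to show in the index and the anchor to link to; turning those pairs into help-index.rdf is a separate step that this sketch leaves out.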
I have a bug open to make this index and search db generation part of the build process, but that means it has to be done in Perl, and my Perl chops are gone. (Plus, I'm not sure we need to build the dbs *every* time: the idx and search data don't often change on the content side, so maybe manual generation is enough.)
I/O
--
Ian Oeschger
www.brownhen.com
