To use Lucene, you will have to have some way of creating Lucene Document objects, which you can then add to a Lucene index.
Most of the translation work is custom, because everyone's got different XML DTD's and schemas. There are examples of indexing XML with DOM and SAX for Lucene in the Lucene Sandbox.
There are probably a few steps you will need to take:
1) Figure out how your XML Schema or DTD maps to a Lucene Document - which XML elements are going to be which Lucene Fields, and how are they going to be indexed and stored?
2) Write a JUnit test that will be used to test your document conversion utility. It should take your XML documents, run them through your converter, and then check the fields in the Lucene document to make sure they are what you want. Do this for each type of XML document that you have.
3) Write conversion code that translates an XML document to a Lucene document. Do this for each type of XML document that you have.
4) Write an indexer utility that goes through your XML database and feeds XML documents throught the conversion utility and then into the Lucene indexer.
I might be forgetting one or two steps here :)
Jeff
Pierre Lacchini wrote:
Hello,
I'm using Lucene, and I need to index an XML Database (Tamino).
How can I do that ? Do i have to use an XML parser as Digester ?
I'm kinda noob with Lucene, and I really need help ;)
Thx !Pierre Lacchini
Consultant d�veloppement
PeopleWare
12, rue du Cimeti�re
L-8413 Steinfort
Phone : + 352 399 968 35
http://www.peopleware.lu
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
