Hi all, how can I index remote documents (PDF, HTML, XML)?
I am using Lucene 2.0.0. At the moment I run

    java org.w3c.tidy.Tidy -m *.html

to parse the HTML, then

    java org.apache.lucene.demo.IndexHTML -create -index index .\

to index the HTML, and

    java org.pdfbox.searchengine.lucene.IndexFiles -create -index C:\tomcat\webapps\luceneweb\index .\

to index the PDFs. But how can I parse XML? I run

    java dom.DOMFilter *.xml

but how can I then index the XML?

Thanks

--
View this message in context: http://www.nabble.com/Index-remotely-documents-tf4430491.html#a12639240
Sent from the Lucene - General mailing list archive at Nabble.com.
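
For the XML part of the question, one possible starting point is to pull the plain text out of each XML file with the JDK's own DOM parser and then hand that string to Lucene. The sketch below covers only the extraction step and uses nothing outside the standard library. The class name XmlTextExtractor and the method extractText are hypothetical names chosen for illustration; they are not part of Lucene or of the demo classes mentioned above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of Lucene): extracts the text of an XML document.
final class XmlTextExtractor {

    /** Parses an XML string and returns the text of all elements as one line. */
    static String extractText(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // getTextContent() concatenates the text of every descendant node,
        // skipping comments and processing instructions.
        String text = doc.getDocumentElement().getTextContent();

        // Collapse runs of whitespace (newlines, indentation) into single spaces.
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) throws Exception {
        // Sample input is an assumption; a real run would read each *.xml file.
        String xml = "<doc><title>Lucene</title> <body>Indexing XML</body></doc>";
        System.out.println(extractText(xml));
    }
}
```

With Lucene 2.0 on the classpath, the returned string could then presumably be indexed like any other text, for example by adding new Field("contents", text, Field.Store.NO, Field.Index.TOKENIZED) to a Document and passing it to IndexWriter.addDocument. The field name "contents" is an assumption here; use whichever field your search code queries.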
