I would have thought you would use Solr for this, not Lucene directly. And for Solr, we have the more appropriate Solr-Users mailing list.

With Solr, you can index XML:
*) in Solr-specific format (probably not your case)
*) (from a directory listing too) using DIH by mapping XPath to fields
*) by pre-processing the files with XSLT to map them to the fields
*) by running them through the extract handler (Tika), IIRC

In all cases (including using just Lucene) you will hit the challenge of what exactly "indexing" means for you. Do you include just text, attribute values, attribute names, elements of structure, etc.? That's the hard business-specific part.

If you want to preserve the richness of XML files, then pure Solr/Lucene is not best suited for it, but there are commercial solutions (using Lucene under the covers, I think) that deal with the full complexity of XML. One such is MarkLogic.

Regards,
   Alex.
----
http://www.solr-start.com/ - Resources for Solr users, new and experienced

On 28 March 2017 at 09:21, Karthikeyan P <[email protected]> wrote:
> I have a list of xml files in a directory , I have to parse these xml using
> apache lucene and index it. Once indexing is done , I want to be able to
> search text inside xml files. How can I achieve this? I am able to search
> text files in a similar way, can someone help me with xml lucene search??
>
> Regards,
> Karthikeyan.
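
P.S. A rough sketch of the pre-processing route from the reply above, in Python for brevity (an XSLT stylesheet, or a Java program feeding Lucene directly, would follow the same shape). It flattens one XML file into a few fields and writes them out in Solr's XML update format. The field names id, text and attr_values are placeholders, not anything Solr defines for you; they would have to match your own schema. Whether attribute values belong in the index at all is exactly the business-specific decision discussed above, so it is left as a switch.

```python
import xml.etree.ElementTree as ET


def flatten_xml(source_xml, doc_id, include_attributes=True):
    """Flatten one XML document into a dict of field name -> text.

    Field names ("id", "text", "attr_values") are hypothetical and must
    be adapted to the target schema.
    """
    root = ET.fromstring(source_xml)

    # itertext() yields element text and tails in document order;
    # the split/join collapses runs of whitespace.
    text = " ".join(" ".join(root.itertext()).split())
    fields = {"id": doc_id, "text": text}

    if include_attributes:
        # Attribute values only; attribute and element names are dropped.
        values = [v for elem in root.iter() for v in elem.attrib.values()]
        if values:
            fields["attr_values"] = " ".join(values)
    return fields


def to_solr_add_xml(docs):
    """Serialize flattened documents as a Solr <add> update message."""
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", {"name": name})
            field.text = value
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    sample = '<book lang="en"><title>Lucene in Action</title></book>'
    print(to_solr_add_xml([flatten_xml(sample, "book-1.xml")]))
```

The resulting message can be posted to a core's update handler. With plain Lucene, the same dict would instead be turned into one Document with one field per key.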
