I would have expected you to use Solr for this, rather than Lucene directly.
And for Solr questions, the Solr-Users mailing list is the more appropriate place.

With Solr, you can index XML:
*) in the Solr-specific format (probably not your case)
*) using DIH, by mapping XPath expressions to fields (it can work from a directory listing too)
*) by pre-processing the files with XSLT to map them to fields
*) by running them through the extract handler (Tika), IIRC
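
To make the first two options a bit more concrete, here is a rough sketch in
Python (standard library only) that maps path expressions to fields and emits
a message in Solr's XML update format (<add><doc><field name="...">). The
field names, the paths and the sample document are all made up for
illustration; this is not DIH itself, just the same idea done by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Solr field name to an ElementTree path expression.
FIELD_PATHS = {
    "title": "./head/title",
    "author": "./head/author",
    "body": "./body/para",
}

def to_solr_add(source_xml, doc_id):
    """Convert one source XML document into a Solr <add><doc> update message."""
    root = ET.fromstring(source_xml)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", name="id").text = doc_id
    for field_name, path in FIELD_PATHS.items():
        # findall returns every match, so repeated elements become
        # repeated fields (the Solr field must be multiValued for that).
        for node in root.findall(path):
            text = "".join(node.itertext()).strip()
            if text:
                ET.SubElement(doc, "field", name=field_name).text = text
    return ET.tostring(add, encoding="unicode")

# Made-up sample document, only to exercise the mapping above.
sample = """<article>
  <head><title>Indexing XML</title><author>A. Writer</author></head>
  <body><para>First paragraph.</para><para>Second <b>bold</b> one.</para></body>
</article>"""

solr_message = to_solr_add(sample, "article-1")
print(solr_message)
```

The resulting string could then be posted to a Solr update handler, assuming
the schema defines matching fields.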

In all cases (including using just Lucene) you will hit the challenge of
deciding what exactly "indexing" means for you. Do you include just the text,
or also attribute values, attribute names, structural elements, etc.? That's
the hard, business-specific part.
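
As a small illustration of that choice, the sketch below (Python, standard
library only) pulls four different "views" out of the same document: text,
attribute values, attribute names and element paths. The function name, the
view names and the sample XML are my own assumptions; which of these views you
actually send to the index is exactly the decision you have to make.

```python
import xml.etree.ElementTree as ET

def extract_views(source_xml):
    """Return several candidate 'views' of one XML document for indexing."""
    root = ET.fromstring(source_xml)
    views = {"text": [], "attr_values": [], "attr_names": set(), "element_paths": set()}

    def walk(node, path):
        current = path + "/" + node.tag
        views["element_paths"].add(current)
        for name, value in node.attrib.items():
            views["attr_names"].add(name)
            views["attr_values"].append(value)
        if node.text and node.text.strip():
            views["text"].append(node.text.strip())
        for child in node:
            walk(child, current)
            # Text that follows a child element belongs to the parent.
            if child.tail and child.tail.strip():
                views["text"].append(child.tail.strip())

    walk(root, "")
    return views

# Made-up sample document with text, attributes and nested structure.
sample = ('<book lang="en"><title id="t1">Lucene basics</title>'
          '<note>See <ref target="ch2"/> later</note></book>')
views = extract_views(sample)
print(views)
```

Each view could go into its own field, or be dropped entirely, depending on
what your searches need to match.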

If you want to preserve the richness of the XML files, then pure Solr/Lucene
is not the best fit, but there are commercial solutions (using
Lucene under the covers, I think) that deal with the full complexity of XML.
One such is MarkLogic.

Regards,
   Alex.

----
http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 28 March 2017 at 09:21, Karthikeyan P <[email protected]> wrote:
> I have a list of xml files in a directory , I have to parse these xml using
> apache lucene and index it. Once indexing is done , I want to be able to
> search text inside xml files. How can I achieve this? I am able to search
> text files in a similar way, can someone help me with xml lucene search??
>
> Regards,
> Karthikeyan.

