Hi, I am trying to index an XML file as a field in Lucene (through Solr); see the example below:
<add>
  <doc>
    <field name="title">As You Like it</field>
    <field name="author">Shakespeare, William</field>
    <field name="record"><myxml>here goes the xml...</myxml></field>
  </doc>
</add>

I can index the title and author fields because they are plain strings, but the record field is itself XML, and I run into problems because I cannot post an XML document inside a field with the post.sh script (Solr complains).

I wonder what the correct (and relatively simple) way of doing this would be. Ideally, I would like to store the XML as is and index only its text content, with the XML tags removed (I believe there is HTMLStripWhitespaceAnalyzer for that). The search results should also return the record as XML, so simply escaping it is not enough for me.

So far, my idea is to escape the XML record when posting it, unescape it for internal storage, and use the analyzer for indexing (which would possibly require creating a class like XMLField or similar).

Thanks,
mirko
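
P.S. To make the escape/unescape idea concrete, here is a minimal sketch in Python (standard library only). It is not Solr code: the function names build_add_doc and strip_tags are made up for illustration, and strip_tags only approximates what a tag-stripping analyzer would do at index time. It assumes the record is well-formed XML.

```python
import xml.etree.ElementTree as ET


def build_add_doc(title, author, record_xml):
    """Build an <add><doc>...</doc></add> message (hypothetical helper).

    The record is set as field *text*, so ElementTree escapes <, > and &
    when serializing and the message stays well-formed.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in (("title", title), ("author", author), ("record", record_xml)):
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


def strip_tags(record_xml):
    """Return only the text content of the record (hypothetical helper).

    Stand-in for the tag stripping an analyzer would do before indexing.
    """
    text = "".join(ET.fromstring(record_xml).itertext())
    return " ".join(text.split())  # collapse runs of whitespace


record = "<myxml>here goes the xml...</myxml>"
payload = build_add_doc("As You Like it", "Shakespeare, William", record)

# Parsing the message back unescapes the field, recovering the original XML.
stored = ET.fromstring(payload).find("doc/field[@name='record']").text
indexed_text = strip_tags(stored)
```

With this, payload carries the record as escaped text, stored is the original XML string again, and indexed_text is the tag-free text I would like to have indexed.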