Importing XML into SOLR, identifying a failed import document

2015-02-03 Thread Morris, Paul E.
Hi All, I'm using SOLR 4.9.0 to import XML using /dataimport from the dashboard and a suitably configured xml-data-config.xml file. Everything works fine, but very occasionally I encounter a bad XML file, and the XML import handler fails with the following error and the index rolls back.
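
One way to track down the offending file is to check every file for well-formedness before running the import. The following is a minimal sketch in Python, assuming the failure is caused by malformed XML (the error itself is not shown in this excerpt) and that the files sit in a single directory. The function name `find_malformed_xml` and the directory layout are hypothetical and are not part of SOLR or the Data Import Handler.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed_xml(directory):
    """Return (path, error message) pairs for *.xml files that fail to parse.

    This only checks well-formedness with the standard-library parser; it
    does not reproduce whatever additional checks the SOLR import performs.
    """
    failures = []
    for path in sorted(Path(directory).glob("*.xml")):
        try:
            ET.parse(str(path))
        except ET.ParseError as exc:
            # The ParseError message includes the line and column of the problem.
            failures.append((str(path), str(exc)))
    return failures
```

Calling `find_malformed_xml("/path/to/xml")` (the path is a placeholder) returns an empty list when every file parses, so any entries in the result point at the files to inspect before re-running /dataimport.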

SOLR terms component and finding least frequent terms

2015-03-27 Thread Morris, Paul E.
Dear SOLR users, I have been using the /terms component to find low-occurrence terms in a large SOLR index, and this works very well, but it is not possible to filter (fq) the results, so you are stuck analyzing the whole index. Other options might be to use SOLR faceting, but I don't see how

Finding the intersection of two queries

2015-04-13 Thread Morris, Paul E.
Dear SOLR users, I've found some facet postings and Erik's old Venn diagram slides but still can't figure out the query to do the following. Find the Document_IDs where the word apples is mentioned in both SECTION fields (each Section has its own unique ID REFERENCE). An example XML is

adding XML data to SOLR index using DIH (xml-data-config)

2015-06-17 Thread Morris, Paul E.
We regularly create a SOLR index from XML files, using the DIH with a suitably edited xml-data-config.xml. However, whenever new XML files become available it seems like we have to rebuild the entire index again using the Data Import Handler. Are we missing something? Should it be possible to add new