Hi Lance
thanx for your reply, but I have a question
is this patch committed to trunk?
SOLR-1499 is a plug-in for the DIH that uses Solr as a DataSource.
This means that you can read the database and PDFs separately. You
could index all of the PDF content in one DIH script. Then, when
there's a database update, you have a separate DIH scripts that reads
the old row from Solr, and pul
If it's of any help I've split the processing of PDF files from the
indexing. I put the PDF content into a text file (but I guess you could load
it into a database) and use that as part of the indexing. My processing of
the PDF files also compares timestamps on the document and the text file so
th