Hi,
I am using SOLR as part of the dspace 5.4 SW application.
I have a problem when running the dspace indexing command
(index-discovery). Most of the files are not being added to the index, and
an exception is raised.
It seems that Solr does not process the PDF files that are result of
scanning
Not sure how much help I can be, I have no clue what DSpace is
doing with Solr.
If you're willing to try to index straight to Solr, you can always use
SolrJ to parse the files, it's actually not very hard. Here's an example:
https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
some