Hi, everyone:
After a bit of digging, I have discovered that for any item with multiple
PDF bitstreams, only the first bitstream added is searchable. The other
bitstreams in the item seem to be ignored by the indexer. I have checked,
and the extracted text files are there, so it is not an issue with the
filter-media program.
We (at Cornell) have many items with multiple PDF bitstreams, and so far
all of my testing indicates that only the first bitstream of each item is
being indexed by the DSpace search engine.
Is this a known issue? Is there something wrong in my configuration files that
may be causing this?
George Kozak
Digital Library Specialist
Cornell University Library Information Technologies (CUL-IT)
501 Olin Library
Cornell University
Ithaca, NY 14853
607-255-8924
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech