I tested Fedora Generic Search 2.4 and have two questions.

(1)
My focus was on PDF full-text indexing. I found that some PDF documents
are full-text indexed correctly but some are not. Those that are not
can be converted to text using Adobe Acrobat, so they are not scanned
images. Their metadata is indexed correctly in Fedora.

An example of a document that was not full-text indexed, from the ERIC database:

"Digest of Education Statistics, 2009. NCES 2010-013"

http://www.eric.ed.gov/ERICWebPortal/search/recordDetails.jsp?ERICExtSearch_SearchValue_0=ED509883&searchtype=keyword&ERICExtSearch_SearchType_0=no&_pageLabel=RecordDetails&accno=ED509883&_nfls=false&source=ae

I looked at fedoragsearch.daily.log and see that fields like
<field name="dsmd_OBJ.Content-Type"> are present for the problem PDF
document. However, the <field name="ds.OBJ"> field is absent.

For other PDF documents that were full-text indexed without problems,
the <field name="ds.OBJ"> field was present in fedoragsearch.daily.log.
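
To check a log excerpt for one object quickly, something like the
following sketch might help. It assumes the excerpt contains the
<field name="..."> elements as plain text, as in the lines quoted
above; the function names and sample strings are hypothetical and are
not part of GSearch.

```python
import re

# Matches the name attribute of <field ...> elements as they appear in the log.
FIELD_NAME = re.compile(r'<field\b[^>]*\bname="([^"]+)"')


def field_names(log_excerpt):
    """Return the set of index field names found in a log excerpt."""
    return set(FIELD_NAME.findall(log_excerpt))


def has_full_text(log_excerpt, text_field="ds.OBJ"):
    """True if the full-text field (assumed to be ds.OBJ) is present."""
    return text_field in field_names(log_excerpt)


# Hypothetical excerpts modelled on the two cases described above.
sample_ok = (
    '<field name="dsmd_OBJ.Content-Type">application/pdf</field>'
    '<field name="ds.OBJ">extracted text</field>'
)
sample_bad = '<field name="dsmd_OBJ.Content-Type">application/pdf</field>'

print(has_full_text(sample_ok))
print(has_full_text(sample_bad))
```

Running this over the excerpt for each ingested PDF would show which
objects ended up without the ds.OBJ field.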

Any suggestions on how to fix this would be appreciated.


(2)
Additionally, every time any object is ingested, multiple entries
beginning with the following are written to fedora.log:

ERROR 2012-01-16 02:45:43.124 [http-8080-4]
(FedoraAPIABindingSOAPHTTPImpl) Error getting datastream dissemination
org.fcrepo.server.errors.DatastreamNotFoundException: [DefaulAccess]
No datastream could be returned. Either there is no datastream for the
digital object "mynamesp:someid" with datastream ID of "QUERY "  OR
there are no datastreams that match the specified date/time value of
"null "  .
...
...

"mynamesp:someid" is the collection into which I ingest objects.

Should I ignore these errors?
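
To see whether these entries always refer to the same object and
datastream ID, a rough tally like the sketch below could be run over
fedora.log. It assumes the message wording is exactly as quoted above;
the function name is hypothetical.

```python
import re
from collections import Counter

# Pulls the PID and datastream ID out of the DatastreamNotFoundException text.
MISSING_DS = re.compile(
    r'digital object\s+"([^"]+)"\s+with datastream ID of\s+"([^"]*)"'
)


def missing_datastreams(log_text):
    """Count (PID, datastream ID) pairs reported as not found."""
    # Collapse line breaks and repeated spaces so wrapped messages still match.
    normalized = " ".join(log_text.split())
    return Counter(
        (pid.strip(), dsid.strip())
        for pid, dsid in MISSING_DS.findall(normalized)
    )


# Sample text copied from the error quoted above.
sample = '''No datastream could be returned. Either there is no datastream for the
digital object "mynamesp:someid" with datastream ID of "QUERY "  OR
there are no datastreams that match the specified date/time value of
"null "  .'''

for (pid, dsid), count in missing_datastreams(sample).items():
    print(pid, dsid, count)
```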


Thank you,
Serhiy

_______________________________________________
Fedora-commons-users mailing list
Fedora-commons-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users