I tested Fedora Generic Search 2.4.

(1) The focus was on PDF full-text indexing. I found that some PDF documents are full-text indexed correctly but some are not. Those that are not can still be converted to text with Adobe Acrobat, so they are not scanned images. Their metadata is indexed fine in Fedora.
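One way to tell the two groups of PDFs apart is to check each logged index document for a full-text field. The following is only a minimal sketch, assuming the fields appear in fedoragsearch.daily.log as `<field name="...">` elements and that the full-text field for a datastream `OBJ` is named `ds.OBJ`, as described in this message. The function names and sample strings are hypothetical and are not part of GSearch.

```python
import re

# Matches the name attribute of <field name="..."> elements.
# The element layout is an assumption based on the log lines quoted in this message.
FIELD_NAME = re.compile(r'<field\s+name="([^"]+)"')


def field_names(index_doc):
    """Return the set of field names found in one logged index document."""
    return set(FIELD_NAME.findall(index_doc))


def has_full_text(index_doc, datastream_id="OBJ"):
    """True if the document has a ds.<ID> field, not only dsmd_<ID>.* metadata fields."""
    return "ds." + datastream_id in field_names(index_doc)


# Hypothetical samples standing in for two logged index documents.
indexed = (
    '<field name="dsmd_OBJ.Content-Type">application/pdf</field>'
    '<field name="ds.OBJ">extracted text</field>'
)
not_indexed = '<field name="dsmd_OBJ.Content-Type">application/pdf</field>'

print(has_full_text(indexed))
print(has_full_text(not_indexed))
```

Running this over the index document logged for each problem object would show which ones have only `dsmd_OBJ.*` fields, which narrows the problem to the text extraction step rather than to ingest.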
An example of a document that was not full-text indexed comes from the ERIC database: "Digest of Education Statistics, 2009. NCES 2010-013"
http://www.eric.ed.gov/ERICWebPortal/search/recordDetails.jsp?ERICExtSearch_SearchValue_0=ED509883&searchtype=keyword&ERICExtSearch_SearchType_0=no&_pageLabel=RecordDetails&accno=ED509883&_nfls=false&source=ae

I looked at fedoragsearch.daily.log and see that fields like <field name="dsmd_OBJ.Content-Type"> are present for the problem PDF document. However, the <field name="ds.OBJ"> field is absent. For other PDF documents that were full-text indexed without problems, <field name="ds.OBJ"> does appear in fedoragsearch.daily.log. Any suggestions on how to fix this would help.

(2) Additionally, for each ingest of any object, multiple entries beginning with the following are written to fedora.log:

    ERROR 2012-01-16 02:45:43.124 [http-8080-4] (FedoraAPIABindingSOAPHTTPImpl) Error getting datastream dissemination
    org.fcrepo.server.errors.DatastreamNotFoundException: [DefaulAccess] No datastream could be returned. Either there is no datastream for the digital object "mynamesp:someid" with datastream ID of "QUERY " OR there are no datastreams that match the specified date/time value of "null " .
    ...
    ...

"mynamesp:someid" is the collection into which I ingest objects. Should I ignore these errors?

Thank you,
Serhiy

_______________________________________________
Fedora-commons-users mailing list
Fedora-commons-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users