Can someone please respond to this mail, just so we know that we're 
being heard? The solution to the problem is as follows:

DSpace OAIHarvester.java logs to dspace.log:

log.info("HTTP Request: " + listRecords.getRequestURL());

and the printout in dspace.log is:

HTTP Request: http://eprints.someserver.somewhere/cgi/oai2?verb=ListRecords&until=2014-05-21T14:47:08Z&metadataPrefix=oai_bibl

The problem is the last parameter value: metadataPrefix=oai_bibl
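
For anyone who wants to check a logged request line without reading it by eye, here is a minimal sketch. It assumes that oai_dc, the prefix the OAI-PMH specification defines for Simple Dublin Core, is what the harvester should be sending. The class and method names are hypothetical and are not part of DSpace.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of DSpace: inspects a logged OAI-PMH request URL.
final class OaiRequestCheck {

    // Split the raw query string of the URL into name/value pairs.
    static Map<String, String> queryParams(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(url).getRawQuery();
        if (query == null) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                params.put(pair, "");
            } else {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    // Assumption: a Simple Dublin Core harvest should request metadataPrefix=oai_dc.
    static boolean usesSimpleDublinCore(String url) {
        return "oai_dc".equals(queryParams(url).get("metadataPrefix"));
    }

    public static void main(String[] args) {
        String logged = "http://eprints.someserver.somewhere/cgi/oai2"
                + "?verb=ListRecords&until=2014-05-21T14:47:08Z&metadataPrefix=oai_bibl";
        System.out.println(queryParams(logged).get("metadataPrefix")); // prints oai_bibl
        System.out.println(usesSimpleDublinCore(logged));              // prints false
    }
}
```

Run against the URL from our log, this reports the prefix as oai_bibl, which is the mismatch described above.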

After searching the entire DSpace directory structure for the string 
oai_bibl, we realized that EPrints might be responsible for pushing this 
string to DSpace. We have access to the EPrints server, so we searched 
there and found only one file containing oai_bibl: OAI_Bibliography.pm 
(EPrints is written in Perl and we're not well versed in it). We removed 
this file and restarted the Apache httpd service. After that, the OAI 
harvest of that EPrints server worked perfectly.

Hope someone else finds this helpful.

Kind regards,
Vladimir

On 2014-05-16 17:56, Vladimir Tomić wrote:
> Hi,
>
> We have successfully deployed DSpace 4.0 on a centos+postgres+tomcat
> machine. This DSpace is intended as an aggregator of metadata from
> multiple local repositories (harvesting via OAI-PMH using Simple Dublin
> Core).
>
> After successfully harvesting multiple non-EPrints based repositories,
> we ran into problems with an EPrints repository - all of the harvested
> articles are missing metadata. These articles are listed in our DSpace
> as Untitled, Unknown Author. When showing the full item record, only
> DSpace-generated Dublin Core elements are present:
>
> dc.date.accessioned
> dc.date.available
> dc.identifier.uri
> dc.description.provenance
> dc.description.provenance
>
> While trying to solve this issue, we have inspected EPrints and
> non-EPrints XML responses to OAI verb=ListRecords requests and couldn't
> find any notable differences. In another effort to pinpoint the problem,
> we have partially harvested metadata of the University of Southampton
> EPrints repository and still got around 10,000 Untitled, Unknown Author
> articles.
>
> Is harvesting EPrints from DSpace a known issue? Does anyone have any
> idea as to why this is happening and what might help?
>
> Kind regards,
> Vladimir
>
>
> _______________________________________________
> DSpace-tech mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
> List Etiquette: 
> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
>

