DSpace folks,

For some time now the Texas Digital Library has been investigating the use of OAI-ORE and OAI-PMH for handling ETDs from various schools across Texas in a federated collection. Our primary use case is still this: several IRs across the state hold ETD collections for their respective institutions, and we would like to create a single federated collection that aggregates those ETDs and keeps itself updated automatically.

To accomplish this, we have added the ability to point a DSpace collection at an external OAI-PMH provider and harvest its items into the local repository. If the remote repository supports OAI-ORE (for example, another DSpace instance), its resource maps can be used to harvest bitstreams as well. We also implemented a scheduling system that runs harvests on configured collections at set intervals.
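For readers less familiar with OAI-PMH, here is a rough sketch of what one harvest request and response page look like at the protocol level. It is not the code in the branch (which is Java and builds on the OCLC harvester2 library); it is only an illustration in Python. The function names and the example base URL are made up for this sketch. The verb, the request parameters, and the XML namespace come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     from_date=None, resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper).

    A resumptionToken is an exclusive argument in OAI-PMH, so when one
    is given no other selection parameters are sent with it.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
        if from_date:
            params["from"] = from_date  # only records changed since this date
    return base_url + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (records, token) for one ListRecords response page.

    records is a list of (identifier, deleted) tuples taken from the
    record headers; token is the resumptionToken for the next page, or
    None when the list is complete.
    """
    root = ET.fromstring(xml_text)
    records = []
    for header in root.iter(OAI_NS + "header"):
        identifier = header.find(OAI_NS + "identifier").text
        deleted = header.get("status") == "deleted"
        records.append((identifier, deleted))
    token_el = root.find(".//" + OAI_NS + "resumptionToken")
    token = None
    if token_el is not None and token_el.text and token_el.text.strip():
        token = token_el.text.strip()
    return records, token
```

A harvester would fetch the first URL, call parse_page on the response, and keep requesting pages with the returned token until it comes back as None. Passing the date of the last successful run as from_date is what lets a scheduled harvest pick up only the changes.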
This update is to let you know that the bulk of the project has been completed and is currently undergoing testing. If you want to take a look, the SVN branch is available at:

    https://source.tdl.org/svn/dspace/branches/dspace-1_5_0-with-harvesting/

We will be integrating the code into later versions of DSpace and would like it to be considered for inclusion in future releases.

The basic install and use instructions are as follows.

1. Check out the harvesting branch at:

       https://source.tdl.org/svn/dspace/branches/dspace-1_5_0-with-harvesting/

2. Follow the installation instructions in dspace/docs/install.html as usual, with two exceptions:

   a) Before running "mvn package" for the first time, you'll need to manually install a .jar into your Maven repository. It is found at:

          [dspace-source]/etc/oclc-harvester2-0.1.12.jar

      The full command (all on one line) is:

          mvn install:install-file -DgroupId=org.dspace -DartifactId=oclc-harvester2 -Dversion=0.1.12 -Dpackaging=jar -Dfile=[dspace-source]/etc/oclc-harvester2-0.1.12.jar

   b) There are some new settings in dspace.cfg. The ones of immediate interest to you are "dspace.oai.url", which is the URL that ORE uses to assign its resources a permanent home, and "harvester.eperson", which is the EPerson under whose authorization the automatic harvests are performed. The rest of the configuration options are described in the configure.html documentation.

3. Harvesting settings are collection-specific and can be configured from JSPUI, XMLUI and the command line.

   a) The command-line utility to configure and run harvests is currently executed via:

          [dspace-source]/bin/dsrun org.dspace.app.harvest.Harvest

      Use the -h flag for details.

   b) Both JSPUI and XMLUI support setting up a collection's harvest settings through its admin interface. In JSPUI, the harvest settings were added to the bottom of the Collection edit screen. In XMLUI, a new tab was added to the Edit Collection and Control Panel screens.
-Alexey Maslov