http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=10662
--- Comment #16 from David Cook <[email protected]> ---
Created attachment 42446
  --> http://bugs.koha-community.org/bugzilla3/attachment.cgi?id=42446&action=edit
Bug 10662 - Build OAI-PMH Harvesting Client

This patch set adds an OAI-PMH harvesting client to Koha.

It provides a user interface (UI) for defining external servers from which to harvest records using the OAI-PMH protocol. After it downloads records, it checks the harvest database to see if it needs to add a new record, update an existing record, or delete a record in Koha.

_TEST PLAN_

1) Apply all patches
2) Run updatedatabase.pl (to apply the atomic update)
3) Go to Administration > OAI-PMH servers
4) Click "New OAI-PMH server target"
5) At a minimum, include a valid "Base URL" and a valid "Metadata prefix".
6) Click "Test HTTP and OAI-PMH parameters"
7) If successful, continue with this plan. If unsuccessful, address the warning messages displayed in red before testing the parameters again.
8) At this point, you might want to choose a preferred granularity. All OAI-PMH servers must support YYYY-MM-DD according to the spec, but in practice this isn't always the case, so you may need to choose a finer granularity (note that support for the chosen granularity isn't tested by the "Test" button).
9) You may also want to choose a "From" and "Until" range, at least for the purposes of testing, so that you don't accidentally try downloading thousands or millions of records. (You may also want to download by "Set".)
10) You must change the "Active" parameter from "Inactive" to "Active" for the harvester to work on this server target.
11) Optionally, you may provide a path to an XSLT to transform the incoming data. There is a parameter called "identifier" which is passed to the XSLT engine. This contains the unique OAI-PMH identifier for a record. You may wish to add this to the MARC, especially for the sake of provenance. (You may also want to strip 952, 942, and 999 fields, as well as $9 subfields from incoming records. You may also try the magic "default" keyword here, which uses an XSLT I've already written and linked in the backend.)
12) Choose the MARC framework you would like to use (although Default is fine as well).
13) Optionally, you may wish to include an "Original system field". At this time, this has no real purpose. However, in the future, it may be used for linking downloaded holdings records with their original parent record (e.g. the 004 of a holdings record would link to the 001 of the bibliographic record). This field uses the format of 001 or 999$c, with the dollar sign as a subfield delimiter.
14) Click Save
15) You will now see a table containing your entry; click "View".
16) All the numbers on the following screen should be 0.

--

17) Set your environment variables for KOHA_CONF and PERL5LIB
18) Run "perl /misc/cronjobs/oai/oai_harvester.pl -d -v" to download your records (NOTE: This downloader will run as long as it needs to, so try to only download a few records. Ctrl+C will stop the harvest if it gets out of control.)
19) Revisit the web app as per step #15
20) It should now say "Harvested records waiting to be imported: X" with X being higher than 0.
21) Run "perl /misc/cronjobs/oai/oai_harvester.pl -i -v" to import these records into Koha.
22) The terminal output should indicate the result of the import. This should also be reflected by the web app as per step #15 (e.g. "Koha records created from harvested records: X").

--

Now, there are a few different scenarios to try:

If you control the OAI-PMH repository, try editing a record you've downloaded, then download records again (it might be necessary to change your "From" entry, as this should be auto-updated after each harvest) and check whether your Koha record is updated.

If your repository also supports deleted records, try deleting a record that you've already imported into Koha. Koha should get a deletion notice and delete the record from Koha (unless it has items attached).
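
The add, update, and delete behaviour described above can be pictured with a small sketch. This is only an illustration in Python, not the code from the patch set: the names `Harvested`, `Action`, `decide`, `known`, and `item_counts` are hypothetical, and the "keep" and "ignore" outcomes are assumptions about what happens when a deletion cannot or need not be carried out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Action(Enum):
    CREATE = "create"
    UPDATE = "update"
    DELETE = "delete"
    KEEP = "keep"      # assumption: deletion is refused because items are attached
    IGNORE = "ignore"  # assumption: deletion notice for a record never imported


@dataclass
class Harvested:
    identifier: str  # unique OAI-PMH identifier of the record
    deleted: bool    # True when the upstream repository reports a deletion


def decide(record: Harvested,
           known: Dict[str, int],
           item_counts: Dict[int, int]) -> Action:
    """Pick an import action for one harvested record.

    known maps OAI-PMH identifiers to (hypothetical) Koha record numbers;
    item_counts maps those record numbers to the number of attached items.
    """
    biblio = known.get(record.identifier)
    if record.deleted:
        if biblio is None:
            return Action.IGNORE
        if item_counts.get(biblio, 0) > 0:
            return Action.KEEP
        return Action.DELETE
    return Action.CREATE if biblio is None else Action.UPDATE


# Example: record 101 has no items, record 102 has three.
known = {"oai:example.org:1": 101, "oai:example.org:2": 102}
item_counts = {102: 3}
print(decide(Harvested("oai:example.org:1", deleted=True), known, item_counts))
print(decide(Harvested("oai:example.org:2", deleted=True), known, item_counts))
```

Keeping the decision separate from the database calls makes each scenario in this test plan easy to check in isolation.
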

If you delete a record from Koha, but that record still exists upstream, you'll still download updates for that record, but an error will be generated when trying to import it into Koha. Each record in this "error" state will be recorded in the "View" section of the UI as "Harvested records in an error state: X". (In the future, I might make it so that the record gets re-added, or add more configuration options to control this behaviour.)

If you want to reset the repository harvest (i.e. delete all your existing harvested records and re-harvest a repository), click "Reset repository harvest" in the "View" screen of the OAI-PMH server target. If errors are encountered while deleting an existing harvest, it should display hyperlinks to the problem records for manual intervention.
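
For anyone debugging a server target by hand, the "Base URL", "Metadata prefix", "From", "Until", and "Set" fields from the test plan correspond to the arguments of an OAI-PMH ListRecords request. The sketch below, in Python, shows roughly how such a request URL is assembled. It is not taken from the patch set: the function name `list_records_url`, the example base URL, and the `marc21` prefix are all placeholders. The one rule it enforces comes from the OAI-PMH specification, which requires "from" and "until" to use the same granularity.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# The two datestamp granularities defined by OAI-PMH.
DAY = re.compile(r"^\d{4}-\d{2}-\d{2}$")
SECOND = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$")


def granularity(stamp: str) -> str:
    """Return 'day' or 'second' for a valid datestamp, else raise."""
    if DAY.match(stamp):
        return "day"
    if SECOND.match(stamp):
        return "second"
    raise ValueError(f"not an OAI-PMH datestamp: {stamp!r}")


def list_records_url(base_url: str,
                     metadata_prefix: str,
                     from_: Optional[str] = None,
                     until: Optional[str] = None,
                     set_spec: Optional[str] = None) -> str:
    """Build a ListRecords request URL (hypothetical helper)."""
    if from_ and until and granularity(from_) != granularity(until):
        raise ValueError("'from' and 'until' must use the same granularity")
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_:
        params["from"] = from_
    if until:
        params["until"] = until
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder server and prefix; substitute the values of your own target.
print(list_records_url("http://example.org/oai", "marc21",
                       from_="2015-09-01", until="2015-09-10"))
# http://example.org/oai?verb=ListRecords&metadataPrefix=marc21&from=2015-09-01&until=2015-09-10
```

Opening the resulting URL in a browser is a quick way to see how many records a given "From" and "Until" range will return before running the downloader.
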
