http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=10662
--- Comment #19 from David Cook <[email protected]> ---

Hi all!

I've finally got something up for testing, so please take some time to test it out. So much has changed since I first started working on this back in 2013, but hopefully it provides all the functionality that you need. I'm sure that the user interface could use more attention, so I'd love to receive feedback on that. I'd also love to hear back about how the feature works.

The key component is the "oai_harvester.pl" cronjob, which will be set up by a system administrator. I don't think there's much that a web user can do to affect that, although I have seen other bugs talking about giving web users control over scheduling tasks. I think web users controlling scheduling is outside the scope of this bug.

Unlike "Staged MARC Management", there is currently no way of un-importing and re-importing. You can only "reset repository harvest", which deletes all currently harvested records and allows you to schedule a new harvest.

I was originally going to leverage the "Staged MARC Management" code, but I decided that giving web users control over selectively un-importing and re-importing batches of records harvested via OAI-PMH could be really problematic. That is, you might un-import a batch, which deletes 10 records, import a new batch which contains those 10 records, then try to re-import an earlier batch of those 10 records. Even if the (optional) record matching rules were set up perfectly, your Koha records would be wrong; they'd be for an older version of the upstream record.

I decided that once a record has been added to Koha, all further updates and deletions should be automatic. And if a record has been deleted from Koha (other than by "resetting the harvest"), then it cannot be re-added; it will instead generate an error (since you can't currently "undelete" a bibliographic record or re-add it with the same biblionumber).
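
As a rough illustration of the rule above, here is a small Python sketch. It is not the actual Perl code in the patches: the `Catalogue` class, `apply_harvested_record`, `reset_harvest` and the returned action strings are all hypothetical names. How an upstream deletion is stored (the link to the old biblionumber is kept) is also an assumption made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Catalogue:
    """Hypothetical in-memory stand-in for the bibliographic tables."""
    biblios: dict = field(default_factory=dict)    # biblionumber -> record data
    oai_links: dict = field(default_factory=dict)  # OAI-PMH identifier -> biblionumber
    next_biblionumber: int = 1


def apply_harvested_record(cat, identifier, record, upstream_deleted=False):
    """Apply one harvested record and return the action that was taken."""
    if identifier not in cat.oai_links:
        if upstream_deleted:
            return "skipped"  # never imported, so there is nothing to delete
        biblionumber = cat.next_biblionumber
        cat.next_biblionumber += 1
        cat.biblios[biblionumber] = record
        cat.oai_links[identifier] = biblionumber
        return "added"

    biblionumber = cat.oai_links[identifier]
    if biblionumber not in cat.biblios:
        # The linked record no longer exists locally and its biblionumber
        # cannot be reused, so report an error instead of re-adding it.
        return "error"
    if upstream_deleted:
        # Assumption: the identifier link is kept after an automatic deletion.
        del cat.biblios[biblionumber]
        return "deleted"
    cat.biblios[biblionumber] = record  # automatic update from upstream
    return "updated"


def reset_harvest(cat):
    """Delete every harvested record and forget all links, allowing a re-harvest."""
    for biblionumber in cat.oai_links.values():
        cat.biblios.pop(biblionumber, None)
    cat.oai_links.clear()
```

In this sketch, a record that was removed locally keeps returning "error" for its identifier until `reset_harvest` is called, after which the same identifier is added again under a new biblionumber.
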
However, I'm happy to discuss options for handling records that have been deleted from Koha. There is code that checks whether the record has been deleted in Koha, so it would be trivial to add a new record with a new biblionumber. I would, however, have to update some other code which expects a unique OAI-PMH identifier to be tied to only 1 Koha biblionumber, whereas in this case it would have 2 or more.

In fact, I'm happy to discuss every part of this code. Some of you might be interested in improving performance. At the moment, "oai_harvester.pl" runs synchronously, which means that first all the records need to be downloaded into the database, and then all the records need to be processed and imported into Koha. For initial imports or large imports, this takes hours.

However, I've recently gained a lot of experience using POE (Perl Object Environment). Using POE, I could presumably write an asynchronous program which imports records as they're received, rather than waiting for the entire harvest to complete. Unfortunately, POE was removed from Koha's dependencies in the past year or so, but I don't think it would be problematic to add it to the dependencies once again.

--

Although I'm posting these patches, the work isn't done yet.

A keen observer will note that there is a lack of consistency in naming. I sometimes say "oai_client", "oai_server", "oai_target", or "oai repository". It's not always clear what I mean, even though I know what I mean. I also want to clearly differentiate this feature from Koha's OAI-PMH server. I would be receptive to comments about preferred terminology in both the backend and the web app.

Additionally, I also need to do the following:

1) Add unit tests
2) Revise the embedded POD in the code
3) Add help pages (and possibly hints/tips in the templates for web users)

I'm going to hold off on these 3 tasks for the moment until we get further into the testing.
Otherwise they'll just need to be revised again after more code iterations. (That said, it would have been smart to have written unit tests from the beginning as I built up the code. Alas. Next time.)
