http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=10662
--- Comment #27 from David Cook <[email protected]> ---
The first time I started working on this feature, I thought about using "Staged Marc Management", but there were problems with it which I don't recall 100% (as it was over 2 years ago). I do have some memories though:

1. I wouldn't want the harvests accessible via the "Staged Marc Management" tool, because selective "import"/"undo import" of harvests would be highly problematic. You could import 100 records, unimport those 100 records, import 50 records, and then try to re-import the original 100 records, which include that 50 record subset. In this case, you might overwrite the newer 50 records with the older copies from the batch of 100. Of course, you could opt not to overwrite matches... but that relies on there being a good matcher, which there very well might not be. Plus, if you don't overwrite matches and have that setting defined at an OAI-PMH server level, you're never going to get newer records updating older records, which is also bad.

2. The "Staged Marc Management" record matcher relies on Zebra, which makes it prone to not always matching correctly. If something hasn't been indexed correctly, you'll get duplicate records. It also relies on Koha's indexing configuration. In some tests, I've forced the unique OAI-PMH identifier to be placed in the 035$a field... but that field isn't indexed by default. So it would be useless for matching without an update to the Zebra indexing... which can be achieved, but it's another point of failure. The matching also relies on import rules defined in Koha. If a staff member accidentally deletes your OAI-PMH matching rule, you're going to quickly get many, many duplicate records.

I chose to do my own import rules - using only the unique OAI-PMH identifier - because it was the most reliable way of making sure that harvested records weren't duplicated against themselves/each other.
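As a rough illustration of matching on the unique OAI-PMH identifier alone, here is a minimal Python sketch. It is not the harvester's actual code: `HarvestedRecord`, `HarvestStore` and `upsert` are hypothetical names, the in-memory dictionary stands in for whatever table the harvester would really use, and the datestamp guard against older copies is an added assumption rather than something described above.

```python
from dataclasses import dataclass


@dataclass
class HarvestedRecord:
    oai_identifier: str  # unique identifier from the OAI-PMH record header
    datestamp: str       # header datestamp in UTC, e.g. "2015-06-01T00:00:00Z"
    marcxml: str         # record payload


class HarvestStore:
    """Hypothetical in-memory stand-in for the harvester's own import table."""

    def __init__(self):
        self._by_identifier = {}

    def upsert(self, record):
        """Insert or overwrite, matching on the OAI-PMH identifier only.

        Returns "added", "updated" or "skipped".
        """
        existing = self._by_identifier.get(record.oai_identifier)
        if existing is None:
            self._by_identifier[record.oai_identifier] = record
            return "added"
        # Assumed guard: never replace a held record with an older copy.
        # UTC datestamps of the same granularity compare correctly as strings.
        if record.datestamp < existing.datestamp:
            return "skipped"
        self._by_identifier[record.oai_identifier] = record
        return "updated"

    def get(self, oai_identifier):
        return self._by_identifier.get(oai_identifier)
```

Because the identifier is the only key, no Zebra index or user-editable matching rule is involved, and re-sending an older copy of a record cannot replace a newer one.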
In the event that you're harvesting holdings, you also need to have the original bibliographic record in Koha. That means that if you're matching duplicates, the import must overwrite local bibliographic records 100% of the time. Otherwise, your holdings won't know which bibliographic record to bind to. If you're using "Staged Marc Management", it's easy to accidentally misconfigure it so that you're not overwriting local bibliographic records, and then you have problems again. Another reason I chose to do my own import rules is that I don't think you can trust the user to manage the OAI-PMH harvester configuration completely.

That all said, I think perhaps the "Staged Marc Management" system might be able to be leveraged... I just don't want it to be configurable by end users, since it needs very particular settings in order to work correctly. Unfortunately, this means that you're going to lose some of the functionality you want, like being able to look at all the records in a harvest. However, the idea of a "harvest" doesn't really make sense if you're running the harvester every few seconds. Each "harvest" might only have 1-2 records in it, so the concept of harvests becomes a bit unhelpful.

Ultimately, I think we'll need to discuss the import and duplication part of the feature more...
