On 20 August 2012 16:35, Sarah Breau <[email protected]> wrote:
>> I am not too sure about saying ISBNs for 'rejected' item types
>> should/could (not) be added to 'accepted' item types. If possible, it
>> would be nice if an ISBN only points a user to the item it was
>> attached to, not a related item. It is not possible (AFAIK) to explain
>> which ISBN is for what. On the other hand, it's sometimes hard to see
>> what an ISBN on an item identifies*, so you can't always tell whether
>> a mistake was made or a related ISBN was added.
>
> Since the motto of OL is one web page for every book, I separate out
> records that contain multiple formats. So if I am working on a record
> and it has more than one ISBN, I make a new record and move the
> paperback version over there. Sometimes they have different covers, so
> to me they are different books.
>
> The bigger question here is: where did all this bad information come
> from? It strikes me as sub-optimal to import a huge amount of data
> automatically and then have humans painstakingly sort through it and
> discard the non-book items one by one. And that's the best-case
> scenario: at this point, the human workers don't even have this
> ability. As a user, I sometimes get frustrated with the amount of
> disorderly information in OL, especially since as a user I don't have
> the tools to clean it up. I think I would spend more time on the
> database if a) I could make meaningful changes (like removing non-book
> items or merging duplicate records), and b) I didn't feel like
> somewhere around half of the records are duplicates (why bother fixing
> a record when it has twins out there that are just as incomplete?).
>
> Sarah
Hi Sarah,

As far as I can tell, the bad data was imported from bad library
records. Many libraries seem to have errors in their records, ranging
from bad data (e.g. a physical format of ":" and a Dewey Decimal Code
of "B" at the Library of Congress) to bad structure (e.g. missing
separators in MARC records). It also seems that on import, typical MARC
markup like "[Springfield, Va]" was not normalized to "Springfield,
VA". I have been working on some automated vacuuming, but VacuumBot can
only handle the simple cases; there is a rough sketch of the kind of
normalization I mean below.

I agree that users need more options for handling duplicates, but I am
afraid the effort will have to come from the users themselves. I'd love
to try automatic duplicate detection on the OL records, but I have no
experience with it yet beyond having MySQL find duplicate work titles
(also sketched below), and I need to get on with other work.

Ben
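P.S. In case it helps picture the vacuuming, here is a minimal sketch
of the place-name cleanup I described above. The function name, the
regex, and the abbreviation table are my own illustration, not actual
VacuumBot code, and the table is nowhere near complete:

    import re

    # Illustrative only: a few state abbreviations as they appear in
    # catalogue records, mapped to their postal forms.
    STATE_ABBREVIATIONS = {"Va": "VA", "Mass": "MA", "Conn": "CT"}

    def normalize_place(raw):
        """Turn a value like "[Springfield, Va]" into "Springfield, VA"."""
        place = raw.strip()
        # Cataloguers wrap supplied place names in square brackets.
        if place.startswith("[") and place.endswith("]"):
            place = place[1:-1]
        # Upgrade a trailing state abbreviation if we recognize it.
        match = re.match(r"(.*),\s*([A-Za-z]+)\.?$", place)
        if match:
            city, state = match.groups()
            place = "%s, %s" % (city, STATE_ABBREVIATIONS.get(state, state))
        return place

    print(normalize_place("[Springfield, Va]"))  # Springfield, VA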
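And the duplicate-title experiment I mentioned amounts to little more
than the following, using the MySQLdb driver against a hypothetical
local table "works" with a "title" column; the database and column
names will differ in anyone else's setup:

    import MySQLdb

    # Hypothetical local dump: one row per OL work.
    conn = MySQLdb.connect(db="ol_dump")
    cursor = conn.cursor()
    cursor.execute("""
        SELECT LOWER(TRIM(title)) AS norm_title, COUNT(*) AS copies
        FROM works
        GROUP BY norm_title
        HAVING copies > 1
        ORDER BY copies DESC
    """)
    for norm_title, copies in cursor.fetchall():
        print("%6d  %s" % (copies, norm_title))
    conn.close()

Exact title matches are only a crude first pass, of course: different
works share titles, and true duplicates often differ in punctuation or
subtitles. That is exactly why I say I have no real experience with
duplicate detection yet.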
