On Wed, Mar 13, 2013 at 10:48 AM, John Rigdon <[email protected]> wrote:
> A topic that was broached on the Discuss list this morning may be an
> important starting point and one that can impact and benefit bibliographic
> databases everywhere.

I think the content of the discussions is orthogonal to the forum that they're conducted in, which was the subject of Karen's message.

The vast majority of bibliographic databases are maintained by libraries. They have vast, nay endless, sets of rules, adaptations, exceptions, code lists, "controlled" vocabularies, "authorities" and as much standardization as you could ever want (at least for a card catalog).

> The topic centers around normalizing data entry procedures and guidelines.
> I don't know what has been done in the area, but a minimum seems to be an
> open source data dictionary with explanation of the terms.

Since the bulk of the data comes from library records, the starting point is the data definitions that they used, which you can find here:

http://www.loc.gov/marc/bibliographic/ecbdhome.html

Of course, it's not quite that simple since there are lots of ancillary and supporting standards.

What OpenLibrary has done, in my view, is attempt to bridge the gap between that existing practice and what would be useful to actual consumers, so there are things like normal name order instead of inverted (last, first), collecting together all editions under a single "work" record, etc. They've also greatly simplified things.

I agree that better standardization and documentation would be useful, but I think it's very early days for OpenLibrary as a community, so it's not that surprising that it could use improvement. We should definitely put it on the list though.

> Any progress we make in normalizing these will GREATLY help future
> machine manipulation projects.

Just to point out the opposing view, there is the school that says humans should just enter things in whatever format they want and it's up to the machines to figure it out.
As an example, ISBN-10 vs. ISBN-13, or dashes vs. no dashes, is trivial for a machine to deal with. Still and all, better standards/documentation would be useful for the areas where it matters.

Tom
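
To make the ISBN point concrete, here is a minimal sketch in Python of what "let the machine figure it out" could look like. It accepts an ISBN with or without dashes or spaces, in either the 10- or 13-digit form, and returns the bare 13-digit form. The function names (normalize_isbn, isbn10_to_isbn13) are made up for illustration and are not taken from the OpenLibrary code base. The check-digit arithmetic follows the standard ISBN rules, and the sketch assumes every ISBN-10 maps to the 978 prefix.

```python
import re


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert a bare, already validated ISBN-10 to ISBN-13 (978 prefix)."""
    core = "978" + isbn10[:9]  # drop the old check digit
    # ISBN-13 weights alternate 1, 3, 1, 3, ... from the left.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


def normalize_isbn(raw: str) -> str:
    """Return the bare 13-digit form of an ISBN, or raise ValueError.

    Hypothetical helper: strips dashes and whitespace, accepts either
    ISBN-10 or ISBN-13, and verifies the check digit in both cases.
    """
    s = re.sub(r"[\s-]", "", raw).upper()

    if re.fullmatch(r"\d{9}[\dX]", s):
        # ISBN-10: weights 10 down to 1; a final 'X' stands for 10.
        total = sum(
            (10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s)
        )
        if total % 11 != 0:
            raise ValueError(f"bad ISBN-10 check digit: {raw!r}")
        return isbn10_to_isbn13(s)

    if re.fullmatch(r"\d{13}", s):
        # ISBN-13: the weighted sum, check digit included, must be 0 mod 10.
        total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(s))
        if total % 10 != 0:
            raise ValueError(f"bad ISBN-13 check digit: {raw!r}")
        return s

    raise ValueError(f"not an ISBN: {raw!r}")
```

With this, normalize_isbn("0-306-40615-2"), normalize_isbn("0306406152") and normalize_isbn("978-0-306-40615-7") all return "9780306406157", so downstream code only ever has to deal with one form no matter what was typed in.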
_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to [email protected]
