On 12/30/12 11:38 AM, Ben Companjen wrote:
> Hi Karen,
>
> Thanks for your answer (and the link). If I understand correctly, you
> are saying that the editions of other imprints of the same
> publisher/company are included and that these are determined by the
> ISBNs (if possible).

Ben, I'm not quite sure we're saying the same thing, so I'll reword it a
bit. Edward did a study (which I can no longer find) in which he matched
some ISBN publisher IDs to the terms in the publisher string in the
records. (This is also what the OCLC study did.) He was then able to
cluster publisher names, including those from records where there was no
ISBN. The main one I recall was Oxford University Press, which had used
various names over its many hundreds of years of publishing. This
obviously had a margin of error. But it's the only way that I can figure
that these imprints are being gathered under a single publisher.

That said, I cannot find anything in the code that would indicate that
this is what is actually happening. I suspect that only Edward would
know, and he's not around.

I admit that it gets confusing because the publisher is linking to the
Work page, and in fact the publisher is relevant only to an edition. So
I wonder if all of the editions aren't getting linked together through
the work, and thus all of the publishers of all of the editions are
getting linked together. This is one of those areas where Works vs.
Editions actually matters.

I'll poke around and see if I can't make more sense of it using a
different publisher name. BTW, there's info on the publisher you picked
in Wikipedia -- and it still exists:
http://en.wikipedia.org/wiki/American_Tract_Society

I've spent the last couple of weeks reading up a lot on FRBR and the
Work/Edition concept, and I'm coming increasingly to the conclusion that
it's hell to implement. This may be another example of why.

I'll let you know if I discover anything.

kc

>
> I don't think it works this way - I believe publishers are indexed
> only by the contents of the "publisher" field.
> Let's take the same imprint, but the year 1684 [3]. That yields 2
> works, one of which is Pilgrim's progress by John Bunyan [4].
> If I sort the 338 editions by "edition" (year + publisher), I see that
> the 1684 edition [5] imprint is "Gedruckt by Juriaen van Poolsum ..."
> (Dutch! :)). There are American Tract Society editions from 1830 [6]
> and 1840 [7].
>
> In these editions no ISBNs are involved and Juriaen van Poolsum is
> (probably) not related to the ATS in any way.
>
> The problem may be more clear if you look at the record for "Gedruckt
> by Juriaen van Poolsum ..." [8]: one work, but so many editions
> published between 1678 and 2009 - impossible.
>
> Ben
>
> [3] http://openlibrary.org/publishers/American_tract_society#sort=edition_count&published_in=1684
> [4] http://openlibrary.org/works/OL107195W/Pilgrim%27s_progress
> [5] <http://openlibrary.org/books/OL20633079M/Eens_Christens_reyse_na_da_eeuwigheyd_...>
> [6] http://openlibrary.org/books/OL6642554M
> [7] http://openlibrary.org/books/OL23665874M
> [8] <http://openlibrary.org/publishers/Gedruckt_by_Juriaen_van_Poolsum_...>
>
> On 30 December 2012 20:03, Karen Coyle <[email protected]> wrote:
>> Ben, this is one of the complex things about publishers. I don't believe
>> there is an error here. If you look at the ones with ISBNs, they use the
>> same publisher identifier:
>>
>> 1432539787
>> 1432577166
>>
>> (The "4325" is the publisher ID).
>>
>> The same publisher (read: company) often publishes multiple imprints.
>> Thus "Vintage Books" is an imprint of Random House (or was, I can't
>> remember who still exists due to the many buyouts). OCLC did an entire
>> study of trying to see if they could identify the publishers from the
>> bib data and ISBNs:
>>
>> Connaway, Lynn Silipigni, and Timothy J. Dickey. 2011. "Publisher Names
>> in Bibliographic Data: An Experimental Authority File and a Prototype
>> Application." Library Resources and Technical Services, 55,4. Pre-print
>> available online at:
>> http://www.oclc.org/research/publications/library/2011/connaway-lrts.pdf
>> (.pdf: 388.5K/41 pp.).
>>
>> The method they used was very close to what Edward Betts did for the
>> Open Library.
>>
>> I also often suspect that some small publishers join together to use an
>> ISBN ID because it's cheaper for them. But I don't have any proof of that.
>>
>> kc
>>
>> On 12/30/12 9:44 AM, Ben Companjen wrote:
>>> Hi,
>>>
>>> I noticed that the page for publisher "American tract society" [1],
>>> more specifically the Publishing History graph showing the number of
>>> editions per year published by the publisher, says 9 editions were
>>> published by this publisher in 2007.
>>>
>>> When I click the 2007 bar, I get the 9 works that had editions
>>> published in 2007 by this publisher [2]. But for at least two of these
>>> works, the 2007 editions were not published by this publisher.
>>>
>>> I believe this problem is not specific to this publisher, but that it
>>> occurs with all publishers. Could it be a Solr feeding problem?
>>>
>>> And speaking of Solr, what is the status of the index updating
>>> process? I added a book a few months ago, but I can only access it via
>>> a direct link as the author page doesn't show it yet and the publisher
>>> page doesn't exist yet.
>>>
>>> Happy New year!
>>>
>>> Ben
>>>
>>>
>>> [1] http://openlibrary.org/publishers/American_tract_society
>>> [2] http://openlibrary.org/publishers/American_tract_society#published_in=2007&sort=edition_count
>>> _______________________________________________
>>> Ol-tech mailing list
>>> [email protected]
>>> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
>>> To unsubscribe from this mailing list, send email to
>>> [email protected]
>>>
>>
>> --
>> Karen Coyle
>> [email protected] http://kcoyle.net
>> ph: 1-510-540-7596
>> m: 1-510-435-8234
>> skype: kcoylenet
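
The Works vs. Editions hypothesis raised in this thread can be illustrated with a small sketch. It assumes, purely for illustration, that the search index holds one document per work with the publisher and year of every edition folded into multi-valued fields. Nothing in the thread confirms that Open Library's Solr index is built this way. The record layout, the function names, and the simplified publisher strings are all hypothetical. Only the work ID, the years, and the imprints come from the messages above.

```python
from collections import defaultdict

# Hypothetical edition records. The work ID, years and imprints are the ones
# cited in the thread; the dictionary layout is an assumption for illustration.
EDITIONS = [
    {"work": "OL107195W", "year": 1684, "publisher": "Gedruckt by Juriaen van Poolsum ..."},
    {"work": "OL107195W", "year": 1830, "publisher": "American Tract Society"},
    {"work": "OL107195W", "year": 1840, "publisher": "American Tract Society"},
]


def build_work_index(editions):
    """Fold edition fields into one multi-valued document per work."""
    index = defaultdict(lambda: {"publishers": set(), "years": set()})
    for ed in editions:
        doc = index[ed["work"]]
        doc["publishers"].add(ed["publisher"])
        doc["years"].add(ed["year"])
    return index


def works_by_publisher_and_year(index, publisher, year):
    """Work-level match: publisher and year may come from different editions."""
    return sorted(
        work
        for work, doc in index.items()
        if publisher in doc["publishers"] and year in doc["years"]
    )


def editions_by_publisher_and_year(editions, publisher, year):
    """Edition-level match: both conditions must hold for the same edition."""
    return [
        ed for ed in editions
        if ed["publisher"] == publisher and ed["year"] == year
    ]


index = build_work_index(EDITIONS)
work_hits = works_by_publisher_and_year(index, "American Tract Society", 1684)
edition_hits = editions_by_publisher_and_year(EDITIONS, "American Tract Society", 1684)

print(work_hits)     # ['OL107195W']
print(edition_hits)  # []
```

Under this assumption the work-level query returns Pilgrim's progress for "American Tract Society" in 1684 even though no single edition has both that publisher and that year. That matches the behaviour reported in footnote [3], and it would also explain how one imprint page could list editions spanning 1678 to 2009.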
