Interesting question. First, on the Library of Congress data: the Internet Archive has a snapshot of the LoC information from 2007. It was collected by the Scriblio project: http://www.archive.org/details/marc_records_scriblio_net

There are also some other record collections at the Internet Archive that contain MARC records, and there are some good MARC libraries in Perl.

As you point out though, looking at library catalogs is going to produce a lot of holes. You might have better luck looking at some of the larger publishers. They might have ONIX files they can share with you, but harvesting data from publishers typically isn't easy to do in an automatic way. The publishers generally don't seem to make that data publicly available, which is a pity. But I suspect contacting them and asking for ONIX dumps of their catalogs might be one of the quicker routes, particularly for historical information.

One nice advantage with Perl is that most of the books will have an ISBN, which will help with combining data from multiple sources.

Another old-school, non-automated technique would be to follow citation trails. Use something like Web of Science. Of course, the issue there is that many of the citation sources will be academic, and there will be holes for publishers like Sams that are more focused on developers. ACM Digital Library also does this to a degree, if I remember correctly, and they have non-ACM materials with record info. For example, the first hit is Perl Cookbook when I search there.

Depending on the scope of the project and how urgent it is, this might be a useful thing to crowd-source. Start gathering the data, make it available, and ask people to send information about anything that is missing.

One final question: do you want all books, published anywhere, and each edition? So do you want to know about, say, the Chinese translations of Effective Perl Programming and some small book only published in Sanskrit?

Jon Gorman

On Sun, Nov 6, 2011 at 1:18 PM, brian d foy <brian.d....@gmail.com> wrote:
> I'm looking for a way to discover all the books ever published about
> Perl. Where should I look?
>
> * Is there a Perl interface for the WorldCat APIs? If not, I'll make
> one. Are people merely shoving their results into something like
> XML::Feed? I have a big dump of data
>
> * WorldCat has many of the books, but there are holes. I realize that
> this is a union catalog instead of a historical database.
>
> * I know about the Amazon interfaces too, but I think that's the same
> problem as WorldCat (and there are already Perl interfaces for that).
>
> * I have the data dump from Google Books already.
>
> * I figure that the Library of Congress knows about a lot of them, but
> I don't have $20,000 to buy their 2012 database (or subsequent ones).
> Is there some other way to get re
>
> --
> brian d foy <brian.d....@gmail.com>
>
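
On the point above about ISBNs helping to combine data from multiple sources, here is a minimal sketch of what that merge could look like. It is written in Python only for brevity; the same logic should port directly to Perl. The function names (`to_isbn13`, `merge_by_isbn`), the record layout (plain dicts with an "isbn" key), and the sample records are all assumptions made for illustration, not anything taken from WorldCat, ONIX, or MARC tooling. The idea is to normalize every ISBN-10 or ISBN-13 to a bare 13-digit form and use that as the merge key.

```python
import re


def _check13(first12):
    """Compute the ISBN-13 check digit for a 12-digit prefix."""
    total = sum(int(ch) * (1 if i % 2 == 0 else 3)
                for i, ch in enumerate(first12))
    return str((10 - total % 10) % 10)


def to_isbn13(raw):
    """Return a bare 13-digit ISBN, or None if the checksum fails."""
    s = re.sub(r"[\s-]", "", raw).upper()
    if re.fullmatch(r"\d{9}[\dX]", s):
        # ISBN-10: weights 10..1, with 'X' standing for 10 in the last place.
        total = sum((10 - i) * (10 if ch == "X" else int(ch))
                    for i, ch in enumerate(s))
        if total % 11 != 0:
            return None
        core = "978" + s[:9]
        return core + _check13(core)
    if re.fullmatch(r"\d{13}", s):
        return s if _check13(s[:12]) == s[12] else None
    return None


def merge_by_isbn(*sources):
    """Merge (source_name, records) pairs into one dict keyed by ISBN-13.

    The first source to supply a field wins; later sources only fill gaps.
    """
    merged = {}
    for source_name, records in sources:
        for rec in records:
            key = to_isbn13(rec.get("isbn", ""))
            if key is None:
                continue  # a real run would set these aside for manual review
            entry = merged.setdefault(key, {"isbn13": key, "sources": []})
            entry["sources"].append(source_name)
            for field, value in rec.items():
                if field != "isbn" and value:
                    entry.setdefault(field, value)
    return merged


# Hypothetical sample records; the titles and publisher are placeholders.
catalog = [{"isbn": "0-596-00027-8", "title": "Example Perl Book"}]
publisher = [{"isbn": "9780596000271", "publisher": "Example Press"}]
merged = merge_by_isbn(("catalog", catalog), ("publisher", publisher))
```

Here the two sample records collapse into a single entry, because the hyphenated ISBN-10 and the ISBN-13 normalize to the same key. Books without an ISBN, or editions that share one, would still need matching on title, author, and year, which this sketch does not attempt.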