Re: Finding all the Perl books

2011-11-08 Thread Jon Gorman
Interesting question.

First, on the Library of Congress data, Internet Archive has a
snapshot of the LoC information from 2007.  It was collected by the
Scriblio project
http://www.archive.org/details/marc_records_scriblio_net.  There are
also some other record collections at archive.org that contain MARC
records.  There are some good MARC libraries in Perl.

As you point out though, looking at library catalogs is going to
produce a lot of holes.  You might have better luck looking at some of
the larger publishers.  They might have ONIX files they can share with
you, but harvesting data from publishers typically isn't easy to
automate.  Publishers generally don't seem to make that data publicly
available, which is a pity.  But I suspect contacting them and asking
for ONIX dumps of their catalogs might be one of the quicker routes,
particularly for historical information.

One nice advantage with Perl is that most of the books will have an
ISBN, which will help with combining data from multiple sources.
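
To make the ISBN point concrete, here is a minimal sketch of merging on ISBN, written in Python only for illustration (the logic ports directly to Perl). It assumes each source yields raw ISBN strings in mixed formats. The function names, the source names in the usage note, and the choice to key everything on the 13-digit form are assumptions of this sketch, not anything from the thread. It does not validate incoming check digits.

```python
import re
from collections import defaultdict


def isbn10_to_isbn13(isbn10):
    """Convert a 10-character ISBN to its 13-digit form with the 978 prefix."""
    core = "978" + isbn10[:9]  # drop the old check digit
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ... over 12 digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    return core + str((10 - total % 10) % 10)


def normalize_isbn(raw):
    """Return a bare 13-digit ISBN, or None if the input is not ISBN-shaped."""
    chars = re.sub(r"[^0-9Xx]", "", raw).upper()
    if len(chars) == 13 and chars.isdigit():
        return chars
    if len(chars) == 10 and chars[:9].isdigit():
        return isbn10_to_isbn13(chars)
    return None


def merge_by_isbn(sources):
    """Map each normalized ISBN to the set of source names that list it.

    `sources` is a dict of {source_name: iterable of raw ISBN strings}.
    """
    merged = defaultdict(set)
    for source_name, raw_isbns in sources.items():
        for raw in raw_isbns:
            key = normalize_isbn(raw)
            if key is not None:
                merged[key].add(source_name)
    return dict(merged)
```

For example, `merge_by_isbn({"worldcat": ["0-596-00313-7"], "google": ["9780596003135"]})` groups both entries under one key, so a book reported by two sources in different ISBN formats is counted once.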

Another old-school, non-automated technique would be to follow
citation trails, using something like Web of Science.  Of course, the
issue there is that many of the citation sources will be academic, and
there will be holes for publishers like Sams that are more focused on
developers.  ACM Digital Library also does this to a degree, if I
remember correctly, and it has record information for non-ACM
materials.  For example, the first hit is Perl Cookbook when I search
there.

Depending on the scope of the project or how urgent it is, this might
be a useful thing to crowd-source.  Start gathering the data, make it
available, and ask people to send information about anything that is
missing.

One final question: do you want all books, published anywhere, in
every edition?  So do you want to know about, say, the Chinese
translations of Effective Perl Programming and some small book only
published in Sanskrit?

Jon Gorman


On Sun, Nov 6, 2011 at 1:18 PM, brian d foy brian.d@gmail.com wrote:
 I'm looking for a way to discover all the books ever published about
 Perl. Where should I look?

 * Is there a Perl interface for the WorldCat APIs? If not, I'll make
 one. Are people merely shoving their results into something like
 XML::Feed? I have a big dump of data

 * WorldCat has many of the books, but there are holes. I realize that
 this is a union catalog instead of a historical database.

 * I know about the Amazon interfaces too, but I think that's the same
 problem as WorldCat (and there are already Perl interfaces for that).

 * I have the data dump from Google Books already.

 * I figure that the Library of Congress knows about a lot of them, but
 I don't have $20,000 to buy their 2012 database (or subsequent ones).
 Is there some other way to get re

 --
 brian d foy brian.d@gmail.com



FW: Finding all the Perl books

2011-11-08 Thread emily nedell tuck

I think WorldCat is the best bet; see below.

From: elibrar...@hotmail.com
To: jakob.v...@gbv.de
Subject: RE: Finding all the Perl books
Date: Tue, 8 Nov 2011 07:50:32 -0600

Go to WorldCat.org and search under the subject headings

Perl (Computer program language)
Perl (Langage de programmation)

etc.

If you do a book search on Perl, click on one of the results and look to the
right to find the subject headings. Click on one of those and you will get
only relevant titles.
 
WorldCat contains the Library of Congress catalog and the bibliographic
records of thousands of libraries around the world, but (ideally) without all
the duplication. It will at least provide a list of all of the Perl books ever
acquired by a participating library.
 
Emily 
 
 

 Date: Tue, 8 Nov 2011 09:18:49 +0100
 From: jakob.v...@gbv.de
 To: perl4lib@perl.org
 Subject: Re: Finding all the Perl books

 brian d foy asked:

  I'm looking for a way to discover all the books ever published about
  Perl. Where should I look?

 Unless someone else has already created a bibliography of Perl books,
 you will find almost all books in library catalogs - except some edge
 cases that depend on what "published" means. Probably there are some
 printed Perl tutorials distributed by hand that never made it into
 libraries, and the definition of e-book is rather fuzzy. I bet you mean
 traditional printed books, right?

 So the tricky part is to find the right library catalogs and how best
 to query them. You wrote:

  Is there a Perl interface for the WorldCat APIs? If not, I'll make
  one. Are people merely shoving their results into something like
  XML::Feed? I have a big dump of data

 The most popular search API for library catalogs is Z39.50, which is
 now being replaced by SRU:

 http://search.cpan.org/dist/SRU/
 http://search.cpan.org/dist/Net-Z3950-ZOOM/

 I guess you know http://www.oclc.org/developer/services/WCAPI

  WorldCat has many of the books, but there are holes. I realize that
  this is a union catalog instead of a historical database.

 Perl is very old, but not old enough to show up in historical databases
 ;-) WorldCat is the largest but not the only union catalog, which
 matters especially if you search for non-English books.

  I have the data dump from Google Books already.

 Where did you get this?

  I figure that the Library of Congress knows about a lot of them, but
  I don't have $20,000 to buy their 2012 database (or subsequent ones).

 Does someone at this list know whether all of LoC goes into WorldCat?

 In theory this query is a good use case for Linked Data, but then you
 will have to wait another 10 years. However, libraries have used
 controlled vocabularies for centuries, so there are some subject
 headings for Perl. I only looked in the German National Library:

 http://d-nb.info/gnd/4709495-3 Perl 6
 http://d-nb.info/gnd/7638891-8 Perl 5.10
 http://d-nb.info/gnd/4698927-4 Perl 5.8
 http://d-nb.info/gnd/4698920-1 Perl 5.6.1
 http://d-nb.info/gnd/4646656-3 Perl 5.6
 http://d-nb.info/gnd/4419978-8 Perl 5
 http://d-nb.info/gnd/4625418-3 mod_perl
 http://d-nb.info/gnd/4584437-9 Perl DBI
 http://d-nb.info/gnd/4307836-9 Perl in general

 The list of publications for each subject heading is available as RSS.

 Subject headings are important because the term Perl is used in other
 contexts too. For instance, there is a German town of this name:

 http://d-nb.info/gnd/4102974-4 =
 http://en.wikipedia.org/wiki/Perl,_Saarland

 Having said this, full-text search is the best method to start with. A
 good place to find libraries is

 http://en.wikipedia.org/wiki/Special:BookSources

 Many library catalogs are subsumed by union catalogs, so you don't need
 to query each of them.

 The best collection not created by libraries is LibraryThing, which is
 created by volunteers and provides good APIs too, see:

 http://www.librarything.com/tag/perl

 Perhaps the best method is crowd-sourcing the LibraryThing way.

 Sorry for not giving a simple answer. I doubt that you can find all Perl
 books in all languages fully automatically.

 Cheers
 Jakob

 --
 brian d foy


 --
 Verbundzentrale des GBV (VZG)
 Digitale Bibliothek - Jakob Voß
 Platz der Goettinger Sieben 1
 37073 Goettingen - Germany
 +49 (0)551 39-10242
 http://www.gbv.de
 jakob.v...@gbv.de   

Re: Finding all the Perl books

2011-11-08 Thread Ed Summers
On Tue, Nov 8, 2011 at 9:32 AM, Jon Gorman jonathan.gor...@gmail.com wrote:
 First, on the Library of Congress data, Internet Archive has
 a snapshot of the LoC information from 2007.  It was collected by
 the Scriblio project
 http://www.archive.org/details/marc_records_scriblio_net.  There's
 also some other record collections at archive that contain MARC
 records.  There's some good MARC libraries in Perl.

It's not widely known, but Internet Archive also subscribes to the
weekly updates from LC (I believe back to the Scriblio purchase) and
makes them available on the Web (god bless 'em):

   http://www.archive.org/details/marc_loc_updates

I believe all LoC records are present in WorldCat, except for the
catalog records that aren't in electronic form :-) I seem to remember
there was an impoverished search API that OCLC offers to the general
public, and that the nicer one is reserved for OCLC subscribers. You
could use the SRU module with LC's SRU endpoint:

   http://z3950.loc.gov:7090/voyager?operation=explain
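
As a rough illustration of what a search against that endpoint looks like on the wire, here is a small Python sketch that only builds a searchRetrieve URL using the standard SRU 1.1 parameter names. It uses Python's urllib rather than the Perl SRU module mentioned above. The `dc.subject` index in the usage note and the `marcxml` record schema are assumptions about what this particular server supports, and the explain response at the URL above is the place to check them.

```python
from urllib.parse import urlencode

# Endpoint quoted in the message; everything else here is illustrative.
LC_SRU_BASE = "http://z3950.loc.gov:7090/voyager"


def sru_search_url(cql_query, start_record=1, maximum_records=10,
                   record_schema="marcxml", base=LC_SRU_BASE):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query string."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,                  # CQL, e.g. an index/term pair
        "startRecord": start_record,         # SRU record positions start at 1
        "maximumRecords": maximum_records,
        "recordSchema": record_schema,       # assumed to be offered by the server
    }
    return base + "?" + urlencode(params)
```

Calling `sru_search_url('dc.subject = "Perl"', maximum_records=5)` gives a URL that can be fetched with any HTTP client. Paging through a result set is a matter of increasing `start_record` by `maximum_records` on each request.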

But, depending on what you are doing, I would probably be content to
sift through the 481 hits in Google Books and call it a day :-)

   https://www.googleapis.com/books/v1/volumes?q=perl

//Ed
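
For anyone taking the Google Books route, sifting through a few hundred hits means paging, since the volumes endpoint returns a limited number of items per request (40 is the documented maximum for `maxResults`). The Python sketch below generates the page URLs and pulls titles and ISBNs out of a decoded response. The helper names are hypothetical, and the field names (`items`, `volumeInfo`, `industryIdentifiers`) reflect the v1 response layout as I understand it, so treat them as assumptions to verify against a live response.

```python
from urllib.parse import urlencode

BOOKS_API = "https://www.googleapis.com/books/v1/volumes"


def page_urls(query, total_items, page_size=40):
    """Yield one request URL per page needed to cover total_items results."""
    for start in range(0, total_items, page_size):
        params = {"q": query, "startIndex": start, "maxResults": page_size}
        yield BOOKS_API + "?" + urlencode(params)


def extract_books(response):
    """Pull title and ISBNs out of one decoded JSON response (a dict)."""
    books = []
    for item in response.get("items", []):
        info = item.get("volumeInfo", {})
        isbns = [
            ident["identifier"]
            for ident in info.get("industryIdentifiers", [])
            if ident.get("type", "").startswith("ISBN")
        ]
        books.append({"title": info.get("title"), "isbns": isbns})
    return books
```

Each URL can be fetched with `urllib.request.urlopen` and decoded with `json.load` before being passed to `extract_books`. The reported total can shift between requests, so it is safer to stop when a page comes back without `items` than to trust the first count.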