Re: [CODE4LIB] Internet Archive collection codes?

2008-06-03 Thread Alexis Rossi
Hi,

You can do a search for mediatype:collection to return results for all
4200+ collections.
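
As a minimal sketch of that query, assuming the public advancedsearch.php
endpoint that archive.org exposes today (the endpoint, the field names, and
the paging parameters are assumptions and are not stated in this message),
a search URL could be built like this:

```python
from urllib.parse import urlencode

# Assumption: the public advanced-search endpoint on archive.org.
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def collection_search_url(rows=50, page=1):
    """Build a search URL for items whose mediatype is "collection"."""
    params = [
        ("q", "mediatype:collection"),
        ("fl[]", "identifier"),  # assumed field: collection code, e.g. "americana"
        ("fl[]", "title"),       # assumed field: display label
        ("rows", rows),          # results per page
        ("page", page),          # 1-based page number
        ("output", "json"),
    ]
    # urlencode percent-escapes the colon and the square brackets.
    return SEARCH_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(collection_search_url())
```

Paging through the results with a fixed number of rows would cover all of
the collections without requesting them in a single response.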

We have a search interface that will return specific fields for this query
in xml format, if you'd like, but I'll need to give you some permissions
to access it.  Feel free to send me an email if you'd like to use that
([EMAIL PROTECTED]).

Alexis




 Does anyone know where to get a list of Internet Archive collection
 codes and their human-readable display labels?

 For instance:
 americana = American Libraries
 gutenberg = Project Gutenberg
 librivoxaudio = [hell if I know]


 Some of these I can 'scrape' from the quick search box popup on the IA
 website. But they're not all in there. And maybe there's a better place to
 get these?

 Does anyone know the right place to ask the IA and/or the IA
 developer community about this?

 Jonathan



[CODE4LIB] Latest OpenLibrary.org release

2008-05-07 Thread Alexis Rossi

The OpenLibrary.org team (http://www.openlibrary.org) has just finished
its latest release on the long path towards one web page for every book
ever published.

What's new?

   * added another 6 million book records (13.4 million total) with 18
 million more records waiting to be integrated

   * built an API (http://www.openlibrary.org/dev/docs/api) to the data
 which allows you to query the database for objects matching
 particular criteria or to GET an object from the database

   * added internationalization support
 (http://www.openlibrary.org/i18n) - we have already started on
 Spanish, Italian and a few other languages, but users are now able
 to translate the site into any language

   * enabled full-text search of 230,000 scanned books from the advanced
 search page (http://www.openlibrary.org/advanced)

   * started merging library MARC records and non-library book data
 crawled from the web (still some kinks to be worked out!)
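
The two API operations in the list above (query for objects matching
criteria, and GET a single object) can be sketched as URL builders. This is
only an illustration: the /query.json path and the .json suffix on an object
key are assumptions based on the present-day site, not on the API
documentation linked above, and the keys in the usage lines are placeholders.

```python
from urllib.parse import urlencode

# Assumption: present-day base URL and JSON endpoints, not the 2008 API.
BASE_URL = "https://openlibrary.org"


def query_url(object_type, limit=10, **criteria):
    """URL asking for objects of one type that match the given criteria."""
    params = {"type": object_type, **criteria, "limit": limit}
    return f"{BASE_URL}/query.json?{urlencode(params)}"


def get_url(key):
    """URL for fetching a single object by its key."""
    return f"{BASE_URL}{key}.json"


if __name__ == "__main__":
    # Placeholder keys, for illustration only.
    print(query_url("/type/edition", authors="/authors/OL1A"))
    print(get_url("/books/OL1M"))
```

Keeping URL construction separate from the HTTP request makes it easy to
check the query before sending it with any HTTP client.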

OpenLibrary is a work in progress, so please help us build it!  The
site, the code and the documentation are all open, so if you're
interested in helping as a librarian or a programmer, join us - there's
lots left to do!

You can join the OL mailing list at:
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss

And I'd especially like to thank our awesome team:

   * Edward Betts
   * Anand Chitipothu
   * Karen Coyle
   * Rebecca Malamud
   * Paul Rubin
   * Aaron Swartz


Thanks,

Alexis Rossi
Internet Archive







Re: [CODE4LIB] Latest OpenLibrary.org release

2008-05-07 Thread Alexis Rossi

I'm the only non-techie on the team, so I don't know that much about
SRU.  (Our head programmer lives in India, and is presumably asleep at
the moment, otherwise I'd ask him!)  Is it an interface that is used
primarily by libraries?  We are definitely hoping that our API will be
used by all kinds, so perhaps that's the reasoning.

But this is an Open Source project, so if anyone would like to volunteer
to build an SRU interface... you can!  Please do! :-)

Alexis


Dr R. Sanderson wrote:

  * built an API http://www.openlibrary.org/dev/docs/api to the data
which allows you to query the database for objects matching
particular criteria or to GET an object from the database


Not SRU? Any reasons why you rolled your own?

Rob


[CODE4LIB] Programmer: Semantic Web Data Integrator for Library Records (Open Library)

2008-03-11 Thread Alexis Rossi

[Please excuse the cross-posting]

The Internet Archive is looking for a programmer who can bring library
records into the semantic web. This requires working with very large
datasets and analyzing, merging, and manipulating them to bring these
key resources to a wide audience.

The Internet Archive is a non-profit digital library committed to
preserving the world's digital cultural artifacts. Used by over 6
million people, this resource is becoming part of how the Internet
works. Our job is to put the best humanity has to offer within reach of
students, educators and the general public. Find out more about our
organization and web archive at www.archive.org

Open Library is an open source software project started by the Internet
Archive to build a site with one web page for every book ever published.
The site uses a new type of Semantic Wiki that preserves the structured
data that already exists for books. Leveraging millions of library and
publisher bibliographic records, we have already created a technology
demo, available at http://demo.openlibrary.org, and we're looking for a
data importer to help us grow the site to the next level.  Interested
applicants should be sure to look at the source code available on the
demo site before applying.

You will assist the current team of programmers to import data in MARC,
ONIX and other formats, crawl and parse information from the web, and
integrate and deduplicate the records that we get from different sources.

REQUIREMENTS:

   * Minimum of 3 years of experience with Python, Perl, or PHP is required
   * Must have UNIX experience
   * Experience with database calling or merging, crawling technology, or
book data a plus
   * Experience as a technical librarian a strong plus

We are located in the Presidio of San Francisco with parking and public
transportation available.

The Internet Archive is an equal opportunity employer. We provide
medical and dental benefits. Please send your resume and cover letter to
[EMAIL PROTECTED] with the subject line "Programmer- Semantic
Web". The Internet Archive thanks all applicants for their interest, but
advises that only those selected for an interview will be contacted. No
phone calls please.


[CODE4LIB] technology demo of Open Library - Now Open!

2007-07-16 Thread Alexis Rossi

Hi all,

I spoke to a few of you at the code4lib conference in Georgia about this, but 
it's finally up and ready for people to take a look.  Open Library is an effort 
to catalog every book in the world, while keeping the technology and all of the 
data open to everyone.

After months of hard work by a very dedicated group of people, Open Library is 
now open:

http://demo.openlibrary.org/

This is a technology demo, so it doesn't have all of the bells and whistles 
just yet.  But we're looking for help!  If you've got data, we want it!  If 
you're a programmer interested in helping, please let us know!

We have a series of pages describing our project and goals, a marvelous demo 
site that shows off what we're capable of, and a new series of mailing lists to 
bring more people into the project.

Please subscribe to the lists that interest you, poke around the site, and let 
us know what you think.

Thanks!

Alexis Rossi
Internet Archive