I played a little with the January data dump and found the following:
- there are 47796615 records (author, work, edition) in the dump
- the distributors field is empty in every edition record
- there are 7439623 records that have content in the contributions field
- there are 14996 records with content in the contributors field,
which is an array of (role, name) tuples; there are just under 300
values for 'role' - that could use some cleanup.
- there are 38 records containing a link to VIAF
- there are 1294 records containing a link to the English Wikipedia
- there are 1864 records with content in the wikipedia field, which
has links to non-English Wikipedias too.
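
For anyone who wants to reproduce tallies like these, here is a minimal
sketch in Python. It is not the script behind the numbers above. It
assumes each dump line is tab-separated with the record's JSON document
in the last column, and that contributors entries are objects with
'role' and 'name' keys as in the edition JSON. The function names are
made up for illustration.

```python
import json
from collections import Counter


def count_nonempty_fields(lines, fields):
    """Count, per field, how many records have a non-empty value for it."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Assumption: the JSON document is the last tab-separated column.
        record = json.loads(line.split("\t")[-1])
        counts["records"] += 1
        for field in fields:
            # Truthiness skips missing keys, None, "" and empty lists.
            if record.get(field):
                counts[field] += 1
    return counts


def count_roles(lines):
    """Tally the 'role' values found in contributors entries."""
    roles = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        record = json.loads(line.split("\t")[-1])
        for contributor in record.get("contributors") or []:
            # Assumption: each entry is an object with 'role' and 'name'.
            if isinstance(contributor, dict) and contributor.get("role"):
                roles[contributor["role"]] += 1
    return roles
```

Both functions take any iterable of lines, so a dump opened with
gzip.open(path, "rt", encoding="utf-8") can be passed in directly.
Sorting the result of count_roles() by frequency would be one way to
see which role values are candidates for cleanup.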

I published a blog post about this at
http://ben.companjen.name/2012/02/playing-with-the-open-library/

Ben

On 8 February 2012 15:41, Ben Companjen <[email protected]> wrote:
> Hi all,
>
> (About types)
> I was wondering how the contributors[] field is stored within
> /type/edition. There is no mention at
> http://openlibrary.org/type/edition, but the JSON representation at
> e.g. http://openlibrary.org/books/OL25154702M.json shows the field (a
> compound type with role and name, both plain strings), and the RDF
> template uses the information from the field too (just the name part
> though).
>
> I found out that the contributions[] field (that only has strings)
> contains data that come (probably) straight from the MARC records and
> may or may not have a role included in the string (e.g. "Ben Companjen
> (Editor)"). The field is not displayed in the page view, but it is in
> the RDF and JSON views.
>
> In the complete datadump of January 31st, no edition uses the
> distributors[] field. I had already asked myself what it is for, but
> now it seems it isn't used at all.
>
> (About documentation)
> There is an issue on GitHub
> (https://github.com/internetarchive/openlibrary/issues/100) about
> documenting the datadumps, which would include these fields. Perhaps
> it's a good idea to start with the individual properties and types?
> Infogami allows a description of each type and property:
> http://infogami.org/quicklook. Documenting the datadumps would then be
> easy: just copy the applicable property descriptions.
>
> I noticed that on the http://openlibrary.org/developers page, the
> "Bugs" link points to the Launchpad bug tracker (which is no longer
> tracked). Should it be updated to point to the issues on GitHub, and
> should the current bugs on Launchpad be moved?
>
> And what is the status of http://code.openlibrary.org/ ? The
> Developers page points to this documentation through "OL Development".
> Is this documentation still linked to the in-code documentation?
>
> I am looking to see if I can help out with more than the RDF output,
> but I'm having a hard time finding out what some of the code is doing.
> Thanks in advance for the answers.
>
> Regards,
>
> Ben
_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to 
[email protected]
