Hi Ben,
> That is great! Linking to Open Library makes Open Library more visible
> in the Linked Data world, I guess.
>
> I read your blog post, and would like to raise a couple of questions with you.
> First of all: where are the links? I see no link to the OL website (or
> a Work URI) on the page that is said to be an example...

Our SPARQL endpoint had a corrupted DB file. It has now been reindexed, so please have another try.

> Did you only link to Works, or to Editions too? ISBNs are associated

We link only to works, since it is not necessarily right to say that "two manifestations with the same ISBN are the same manifestation" (they can have different issue dates etc.).

> with Editions, so I'd expect that would be the first stop. Editions
> contain TOCs too, Works don't. Or are there too many Editions with the
> same ISBN?

In the JSON dump I can see that http://openlibrary.org/works/OL10000003W is a work and that it has an edition (which is included in the JSON), and this edition has an ISBN, so I linked our manifestation with that work:

/works/OL9999886W
{
  "authors": [
    { "author": { "key": "/authors/OL3964828A", "name": "M. Deshors", "type": { "key": "/type/author" } } }
  ],
  "covers": [ 3139849 ],
  "created": { "type": "/type/datetime", "value": "2009-12-11T01:57:10.423179" },
  "editions": [
    {
      "authors": [ { "key": "/authors/OL3964828A" } ],
      "covers": [ 3139849 ],
      "isbn_10": [ "2842702751" ],
      "isbn_13": [ "9782842702755" ],
      "key": "/books/OL12622527M",
      "languages": [ { "key": "/languages/fre" } ],
      "last_modified": { "type": "/type/datetime", "value": "2010-04-13T09:18:54.161394" },
      "latest_revision": 3,
      "physical_dimensions": "8.3 x 5.9 x 0.8 inches",
      "physical_format": "Paperback",
      "publish_date": "March 28, 2001",
      "publishers": [ "Mango" ],
      "revision": 3,
      "title": "Toute la toile autour du monde",
      "type": { "key": "/type/edition" },
      "weight": "15.7 ounces",
      "works": [ { "key": "/works/OL9999886W" } ]
    }
  ],
  "key": "/works/OL9999886W",
  "last_modified": { "type": "/type/datetime", "value": "2010-04-28T10:16:22.495622" },
  "latest_revision": 2,
  "revision": 2,
  "title": "Toute la toile autour du monde",
  "type": { "key": "/type/work" }
}

In the dump I found that this edition has no entry of its own, so it can only be found embedded in the work record (as seen above). This is somewhat confusing, since I am used to RDF, which uses graphs to store data, and the edition DOES resolve ( http://openlibrary.org/books/OL12622527M ), so why not just link to it from the work description in the dump? (The more I think about it, the more I have to admit that I am only _beginning_ to understand the dump ;) )

> And regarding your remarkable example: that German version of Lord of
> the Rings should not be linked to a German work, but to The One Work
> called "The Lord of the Rings" (I consider the separate publications
> of the three parts one work each). The German work is a duplicate.

Yep, guessed it. How to deduplicate?
Should not be too hard, because you have a work link to LibraryThing:

$ grep 'librarything": \["1386651"' ol_dump_deworks_2012-03-31.txt | wc -l
49

This seems to bring up 49 editions for one work, but a first spot check shows no association with the work level at all:

$ grep OL9177075M ol_dump_deworks_2012-03-31.txt
/books/OL9177075M {"editions": [{"publishers": ["RUSCONI"], "physical_format": "Paperback", "last_modified": {"type": "/type/datetime", "value": "2011-04-29T03:29:19.321447"}, "created": {"type": "/type/datetime", "value": "2008-04-30T09:38:13.731961"}, "number_of_pages": 1359, "isbn_13": ["9788818123210"], "languages": [{"key": "/languages/ita"}], "isbn_10": ["8818123211"], "publish_date": "1985", "key": "/books/OL9177075M", "title": "IL SIGNORE DEGLI ANELLI (Titolo originale dell'opera: The Lord of the Rings)", "oclc_numbers": ["635814336"], "revision": 4, "type": {"key": "/type/edition"}, "latest_revision": 4, "identifiers": {"goodreads": ["1110294"], "librarything": ["1386651"]}}], "authors": []}

> I recently published [1] a list of works that appear to be duplicates
> (based on title, subtitle and author) which unfortunately showed that
> a lot of cleaning up of edition-less works and duplicate works has to
> be done.
> That brings up another question: will you do the linking process again
> in the future?

Yes, I am willing to do so :)

> I imagine that eventually many works (and authors, and probably
> editions too) will be merged so that the Work URI you get back (in the
> Edition data) when you lookup the same ISBN again may change in the
> future. I don't think it will be a problem to have old URIs in your
> data, as they will redirect to the new URI(s) when you look them up.
> However, if you leave the old URIs in your dataset, you don't know for
> sure how many distinct works are linked. And since Open Library data
> changes regularly anyway, I don't suppose this was a one-time only
> experiment?
Right.

> Is the code you used to convert the datadump to RDF available online
> (and is it Free software)? Since my proposed changes to OL's "native"
> RDF output [2] haven't been accepted yet, perhaps other approaches can
> be promoted somehow. Talis's approach works well, but I'm interested
> to see others too.

Since I did not have much time and only needed the ISBN for a first approach anyway, I used some crude regexes to generate an ISBN triple.

-o

> [1] http://www.mail-archive.com/[email protected]/msg00613.html
> [2] https://github.com/internetarchive/openlibrary/pull/136 (comments
> still welcome, naturally)
>
> On 23 May 2012 15:51, Pascal Christoph <[email protected]> wrote:
>> Hi *,
>>
>> today we managed to link 1.2 M lobid.org resources to Open Library work
>> resources, simply using ISBN-10.
>> It seems that no commonly used identifier (that would be: VIAF or GND or ...
>> and not an extra minted Open Library identifier[1]) for creators in OL is
>> given.
>> Identifiers (among other things) help to disambiguate data, so if you want
>> to, you can enrich your data using our newly generated links. How to do
>> that, and a little more background, is on our blog:
>>
>> https://wiki1.hbz-nrw.de/display/SEM/2012/05/23/1.2+M+links+to+Open+Library
>>
>> Yes, and let me say "thank you" for your amazing work - this is just one more
>> fine example of what is achievable with LOD!
>>
>> -o
>>
>> [1] It may be that there is already a concordance out there between e.g. VIAF
>> and OL person URIs, I don't know, I just saw what's already there in the RDF
>>
>> --
>> Pascal Christoph
>> - Linked Open Data: http://lobid.org/ -
>> hbz - Hochschulbibliothekszentrum NRW
>> Telefon +49-221-40075-139
>> http://www.hbz-nrw.de/
>> _______________________________________________
>> Ol-tech mailing list
>> [email protected]
>> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
>> To unsubscribe from this mailing list, send email to
>> [email protected]
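
As a rough illustration of the ISBN-based linking and the LibraryThing grouping discussed in this message, here is a minimal Python sketch. It is not the regex-based script mentioned above. It assumes that each dump line is a key followed by whitespace and one JSON record, as in the excerpts quoted in this message; the function names and the trimmed sample lines are hypothetical.

```python
import json
from collections import defaultdict

OL_BASE = "http://openlibrary.org"  # base of the URIs quoted in this message


def parse_dump_line(line):
    """Split one dump line into (key, record).

    Assumed layout (taken from the quoted excerpts, not from a spec):
    a key such as /works/OL9999886W, whitespace, then one JSON object.
    """
    key, payload = line.strip().split(None, 1)
    return key, json.loads(payload)


def isbn_work_links(line):
    """Return (isbn10, work_uri) pairs for editions embedded in a work record."""
    key, record = parse_dump_line(line)
    work_key = record.get("key", key)
    if not work_key.startswith("/works/"):
        return []  # record has no work-level association, so nothing to link
    return [
        (isbn, OL_BASE + work_key)
        for edition in record.get("editions", [])
        for isbn in edition.get("isbn_10", [])
    ]


def editions_by_librarything(lines):
    """Group edition keys by LibraryThing work id (a deduplication hint)."""
    groups = defaultdict(list)
    for line in lines:
        _, record = parse_dump_line(line)
        for edition in record.get("editions", []):
            ids = edition.get("identifiers", {}).get("librarything", [])
            for lt_id in ids:
                groups[lt_id].append(edition["key"])
    return dict(groups)


# Sample lines trimmed down from the two records quoted in this message.
work_line = (
    '/works/OL9999886W {"editions": [{"isbn_10": ["2842702751"], '
    '"key": "/books/OL12622527M"}], "key": "/works/OL9999886W", '
    '"type": {"key": "/type/work"}}'
)
orphan_line = (
    '/books/OL9177075M {"editions": [{"isbn_10": ["8818123211"], '
    '"key": "/books/OL9177075M", '
    '"identifiers": {"librarything": ["1386651"]}}], "authors": []}'
)

print(isbn_work_links(work_line))
print(isbn_work_links(orphan_line))
print(editions_by_librarything([work_line, orphan_line]))
```

The second sample line yields no links because it carries no work key, which mirrors the spot check above; grouping by the LibraryThing id is one possible starting point for finding editions that belong to the same work.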
