Quoting Ross Singer <[email protected]>:

> Ok, I have a followup on this.
>
> I've made an analogous web service to OCLC's X-Identifier or
> LibraryThing's ThingISBN with Open Library's data:

This is great, Ross. I hope folks have time to play with it a bit and  
see what it reveals.

>
> http://ol-identifier.heroku.com/
>
> One of the things that becomes apparent with this is how many
> duplicate editions there are (multiple "owl:sameAs"es means the
> identifier in question appears in multiple records):

Editions get de-duped, but there are a lot of dupes that I think are
the result of either a bug or a failure of the de-dupe program to run
for some time. I will, however, take a look at some of the dupes and
see if they look like an algorithm problem or a data problem. There
have been a lot of problems with Amazon data, which almost always
contains an ISBN but often has either crappy data or junk in the title
field. But I think you're just seeing the result of a bug/error.

kc

>
> http://ol-identifier.heroku.com/oclc/792033:
> <rdf:Description rdf:about="http://ol-identifier.heroku.com/oclc/792033">
> <owl:sameAs rdf:resource="http://openlibrary.org/books/OL14464770M"/>
> <owl:sameAs rdf:resource="http://openlibrary.org/books/OL24210371M"/>
> </rdf:Description>
>
> http://ol-identifier.heroku.com/isbn/006251587X:
> <rdf:Description rdf:about="http://ol-identifier.heroku.com/isbn/006251587X">
> <owl:sameAs rdf:resource="http://openlibrary.org/books/OL22359132M"/>
> <owl:sameAs rdf:resource="http://openlibrary.org/books/OL9245413M"/>
> <owl:sameAs rdf:resource="http://openlibrary.org/books/OL38986M"/>
> <owl:sameAs rdf:resource="http://openlibrary.org/books/OL7290708M"/>
> </rdf:Description>
>
> http://ol-identifier.heroku.com/lccn/00004240:
> <rdf:Description rdf:about="http://ol-identifier.heroku.com/lccn/00004240">
> <owl:sameAs rdf:resource="http://openlibrary.org/books/OL23358959M"/>
> <owl:sameAs rdf:resource="http://openlibrary.org/books/OL6774976M"/>
> </rdf:Description>
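
For anyone who wants to check identifiers in bulk, here is a minimal
sketch of the "more than one owl:sameAs means a duplicate" test
described above. It works on a response body that has already been
fetched. The rdf:RDF wrapper and the namespace declarations are
assumptions, since the examples in this thread show only the inner
rdf:Description element, and the function names are made up for
illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# ISBN example copied from this thread. The <rdf:RDF> wrapper and the
# xmlns declarations are assumed; the message shows only the inner element.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
<rdf:Description rdf:about="http://ol-identifier.heroku.com/isbn/006251587X">
<owl:sameAs rdf:resource="http://openlibrary.org/books/OL22359132M"/>
<owl:sameAs rdf:resource="http://openlibrary.org/books/OL9245413M"/>
<owl:sameAs rdf:resource="http://openlibrary.org/books/OL38986M"/>
<owl:sameAs rdf:resource="http://openlibrary.org/books/OL7290708M"/>
</rdf:Description>
</rdf:RDF>"""


def same_as_targets(rdf_xml):
    """Map each rdf:about URI to the list of its owl:sameAs resources."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for desc in root.iter(f"{{{RDF}}}Description"):
        about = desc.get(f"{{{RDF}}}about")
        targets = [
            el.get(f"{{{RDF}}}resource")
            for el in desc.findall(f"{{{OWL}}}sameAs")
        ]
        result[about] = targets
    return result


def duplicated(rdf_xml):
    """Keep only identifiers that point at more than one edition record."""
    return {
        about: targets
        for about, targets in same_as_targets(rdf_xml).items()
        if len(targets) > 1
    }


if __name__ == "__main__":
    for about, targets in duplicated(SAMPLE).items():
        print(about, "appears in", len(targets), "edition records")
```

An identifier that comes back from duplicated() is only a candidate:
as noted above, it may reflect a de-dupe bug or bad source data rather
than a real algorithm problem.
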
>
> So, my question would be, do editions get merged?  If so, is there
> some log of these merges?
>
> By the way, I should have something to show with a feed of deprecated
> work IDs tomorrow.
>
> Thanks!
> -Ross.
> On Mon, Dec 13, 2010 at 4:12 PM, Ross Singer <[email protected]> wrote:
>> On Mon, Dec 13, 2010 at 3:13 PM, Edward Betts <[email protected]> wrote:
>>> It doesn't create a new work, it uses ID of the existing work with the
>>> most editions. If there is a draw it picks the lowest work ID.
>>>
>>> The other works get turned into redirects to the chosen work.
>>>
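
The selection rule Edward describes (keep the work with the most
editions, break a draw by taking the lowest work ID, redirect the rest)
can be sketched roughly as follows. This is not WorkBot's actual code.
It assumes work IDs have the form "OL<number>W" and that "lowest" means
numerically lowest; the function names and the sample IDs are
hypothetical.

```python
import re


def work_number(work_id):
    """Extract the numeric part of a work ID, assumed to look like 'OL123W'."""
    match = re.fullmatch(r"OL(\d+)W", work_id)
    if match is None:
        raise ValueError(f"unexpected work ID format: {work_id!r}")
    return int(match.group(1))


def choose_merge_target(edition_counts):
    """Pick the work with the most editions; on a draw, the lowest ID wins.

    edition_counts maps a work ID to its number of editions.
    """
    return min(
        edition_counts,
        key=lambda w: (-edition_counts[w], work_number(w)),
    )


def plan_merge(edition_counts):
    """Return the surviving work and a map of old work ID -> redirect target."""
    target = choose_merge_target(edition_counts)
    redirects = {w: target for w in edition_counts if w != target}
    return target, redirects


if __name__ == "__main__":
    # Hypothetical works and edition counts, for illustration only.
    counts = {"OL200W": 3, "OL100W": 3, "OL50W": 1}
    target, redirects = plan_merge(counts)
    print("keep:", target)
    for old, new in sorted(redirects.items()):
        print(old, "->", new)
```

A redirect map like this is also what a feed of merged works would
need to carry so that old work IDs can be updated in one pass.
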
>> Ok, good -- the redirect is a good safety net, if nothing else.
>>
>>> The changes are also visible here: http://openlibrary.org/people/WorkBot
>>>
>>>> Is there a way to track the changes to the Work identifiers?  The API
>>>> only seems to show the edition ids.
>>>
>>> We don't have a good feed for watching the changes to work identifiers.
>>> What would you use the feed for?
>>>
>> Well, it would be a lot easier to update anything with the old work id
>> to use the new one than to match every edition identifier and change
>> it.  In the end, though, it's not a huge deal.
>>
>>> Currently it is only WorkBot that is merging works, in the future we'll
>>> let humans do it as well, as we build that feature we can include a feed
>>> of merged works.
>>>
>> Good to know!
>>
>> Thanks for all of the pointers,
>> -Ross.
>>
> _______________________________________________
> Ol-tech mailing list
> [email protected]
> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
> To unsubscribe from this mailing list, send email to  
> [email protected]
>



-- 
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
