Mystery and detective (Detective and Mystery) might have been too
easy an example ... they are indeed different ... although this is not
intuitive.
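
To make the difference concrete, here is a minimal Python sketch. It shows that the two headings are different strings, and that an order-insensitive comparison would group them. The name `subject_key` and the small stop-word set are assumptions made up for this illustration; this is not how Open Library's search works.

```python
import re

# Hypothetical stop words, chosen only for this illustration.
STOP_WORDS = {"and", "the", "of"}

def subject_key(heading):
    """Return an order-insensitive key for a subject heading."""
    # Lowercase, split into alphanumeric words, drop the stop words.
    words = re.findall(r"[a-z0-9]+", heading.lower())
    return frozenset(w for w in words if w not in STOP_WORDS)

a = "Detective and mystery stories"
b = "Mystery and detective stories"

print(a == b)                            # plain string comparison
print(subject_key(a) == subject_key(b))  # word-set comparison
```

A word-set key like this is deliberately crude: it would also merge headings that differ only in word order but mean different things, so it is a starting point rather than a fix.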

Another example:

1) http://openlibrary.org/search?subject_facet=Science+fiction&subject=science+fiction: 3,527 hits

2) http://openlibrary.org/search?subject=science+fiction: 8,587 hits

3) http://openlibrary.org/subjects/science_fiction: 4,714 works

Why do I get such dramatically different results?

I am trying to generate a list of all Sci-Fi, Western, Mystery and
detective, and Romance titles with their original publication dates. My
question regarding ol_dump_works_2011-03-31.txt.gz is: if I run a
script that reads every JSON object and checks for the subject
"science fiction", which query's results (1, 2 or 3) will I get
back?
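
The kind of script I have in mind looks roughly like the sketch below. It assumes that each line of the dump is tab-separated with the JSON record in the last field, and that a work record carries its subjects as a list of strings under a "subjects" key. Both are assumptions about the dump layout, and the function names are made up for this example.

```python
import gzip
import json

def works_with_subject(lines, wanted="science fiction"):
    """Yield (key, title) for works that list `wanted` as a subject."""
    wanted = wanted.strip().lower()
    for line in lines:
        # Assumption: the JSON record is the last tab-separated field.
        record_text = line.rstrip("\n").split("\t")[-1]
        try:
            work = json.loads(record_text)
        except ValueError:
            continue  # skip lines that do not hold valid JSON
        if not isinstance(work, dict):
            continue
        # Assumption: subjects are stored as a list of strings.
        subjects = work.get("subjects") or []
        # Exact match after lowercasing, not a substring match.
        if any(isinstance(s, str) and s.strip().lower() == wanted
               for s in subjects):
            yield work.get("key"), work.get("title")

def scan_dump(path, wanted="science fiction"):
    """Stream a gzipped dump file and yield the matching works."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for hit in works_with_subject(handle, wanted):
            yield hit
```

It would be called as `for key, title in scan_dump("ol_dump_works_2011-03-31.txt.gz"): print(key, title)`. The match here is exact apart from case; a substring match (for example "science fiction, american") would return a different and larger set, so the choice of matching rule changes the count.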

thanks,
Rainer

On Sat, Apr 23, 2011 at 3:00 PM,  <[email protected]> wrote:
> Message: 1
> Date: Fri, 22 Apr 2011 14:25:44 -0700 (PDT)
> From: [email protected]
> Subject: Re: [ol-tech] multiple subject instances
> To: "Open Library -- technical discussion" <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain;charset=iso-8859-1
>
> Hi Rainer -
>
> It's a bit confusing, but strictly speaking, these two subjects are
> different, even though to our eye they seem the same:
>
> Detective and mystery stories [x]
> Mystery and detective stories [x]
>
> Different in the sense that the search can't recognize those 2 different
> phrases as similar or same. (At least our search doesn't do that sort of
> thing yet...)
>
> There are all sorts of tiny variations in the way people use subject
> headings. Perhaps the challenge becomes working out that similarity I
> mentioned.
>
>> Finally, in the ol_dump_works_2011-03-31.txt file, what subject level
>> is contained there?
>
> I'm not quite sure what you mean by "subject level," but hopefully, the
> dump for Works in the system also contains subject information, which Open
> Library attaches to the Work level. (Many of our edition records have
> subject information too, although that's not surfaced in the display. We
> made the decision that subjects were most suited to the overall Work level
> of something, since any manifestation of that Work would presumably be
> about the same thing.)
>
> Cheers,
> george
>
>
>
>
>> Hi,
>>
>> Why are there multiple subject instances per subject? And why are they
>> called subject facets when only "subjects" can be searched in the
>> "advanced search" options? Here is an example:
>>
>> http://openlibrary.org/search?subject_facet=Mystery+and+detective+stories&subject_facet=Detective+and+mystery+stories&subject=Mystery+and+detective+stories
>>
>> On each results page I am able to click on the same subject again in
>> the subject pane on the right. Why? Why are more restrictive
>> sub-subjects not given unique names?
>>
>> Finally, in the ol_dump_works_2011-03-31.txt file, what subject level
>> is contained there?
>>
>> Thanks,
>> Rainer
>> _______________________________________________
>> Ol-tech mailing list
>> [email protected]
>> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
>> To unsubscribe from this mailing list, send email to
>> [email protected]
>>
>
>
>
>
> ------------------------------
>
>
>
> End of Ol-tech Digest, Vol 40, Issue 4
> **************************************
>



-- 
Rainer Hilscher
2130 Pauline Blvd, Apt 203
Ann Arbor, MI 48103
USA