Mystery and detective (Detective and Mystery) might have been too easy an example ... they are indeed different ... although this is not intuitive.
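To make the difference concrete, here is a minimal sketch of why the two headings do not match as plain strings, and how an order-insensitive comparison would treat them as the same. The helper `subject_key` is hypothetical and only for illustration; it is not how Open Library's search works.

```python
import re

def subject_key(subject):
    """Hypothetical normalisation: lowercase the heading, drop the word
    'and', and sort the remaining words so that word order is ignored."""
    words = re.findall(r"[a-z0-9]+", subject.lower())
    return tuple(sorted(w for w in words if w != "and"))

a = "Detective and mystery stories"
b = "Mystery and detective stories"

# Plain comparison: the two headings are different strings.
print(a.lower() == b.lower())

# Order-insensitive comparison: both reduce to the same key.
print(subject_key(a) == subject_key(b))
```

A key like this is deliberately crude: it would also merge headings that differ in meaning only by word order, so it is a starting point for spotting variants rather than a rule for merging them.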
Another example:

1) http://openlibrary.org/search?subject_facet=Science+fiction&subject=science+fiction: 3,527 hits
2) http://openlibrary.org/search?subject=science+fiction: 8,587 hits
3) http://openlibrary.org/subjects/science_fiction: 4,714 works

Why do I get such dramatically different results? I am trying to generate a list of all Sci-Fi, Western, Mystery and detective, and Romance titles and their original publication dates.

My question regarding ol_dump_works_2011-03-31.txt.gz is: if I run a script that reads in every single JSON object and checks for the subject "science fiction", which of the queries (1, 2 or 3) will the data I get back correspond to?

Thanks,
Rainer

On Sat, Apr 23, 2011 at 3:00 PM, <[email protected]> wrote:
> Today's Topics:
>
>    1. Re: multiple subject instances ([email protected])
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 22 Apr 2011 14:25:44 -0700 (PDT)
> From: [email protected]
> Subject: Re: [ol-tech] multiple subject instances
> To: "Open Library -- technical discussion" <[email protected]>
>
> Hi Rainer -
>
> It's a bit confusing, but strictly speaking, these two subjects are
> different, even though to our eye they seem the same:
>
> Detective and mystery stories [x]
> Mystery and detective stories [x]
>
> Different in the sense that the search can't recognize those 2 different
> phrases as similar or same. (At least our search doesn't do that sort of
> thing yet...)
>
> There are all sorts of tiny variations in the way people use subject
> headings. Perhaps the challenge becomes working out that similarity I
> mentioned.
>
>> Finally, in the ol_dump_works_2011-03-31.txt file, what subject level
>> is contained there?
>
> I'm not quite sure what you mean by "subject level," but hopefully, the
> dump for Works in the system also contains subject information, which Open
> Library attaches to the Work level. (Many of our edition records also have
> subject information too, although that's not surfaced in the display. We
> made the decision that subjects were most suited to the overall Work level
> of something, since any manifestation of that Work would presumably be
> about the same thing.)
>
> Cheers,
> george
>
>> Hi,
>>
>> why are there multiple subject instances per subject? and why are they
>> called subject facets but only "subjects" can be searched in the
>> "advanced search options? Here is an example:
>>
>> http://openlibrary.org/search?subject_facet=Mystery+and+detective+stories&subject_facet=Detective+and+mystery+stories&subject=Mystery+and+detective+stories
>>
>> On each results page I am able to click on the same subject again in
>> the subject pane on the right. Why? Why are more restrictive
>> sub-subjects not given unique names?
>>
>> Finally, in the ol_dump_works_2011-03-31.txt file, what subject level
>> is contained there?
>>
>> Thanks,
>> Rainer
>
> End of Ol-tech Digest, Vol 40, Issue 4
> **************************************

--
Rainer Hilscher
2130 Pauline Blvd, Apt 203
Ann Arbor, MI 48103
USA
_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to [email protected]
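For reference, the script described in the question about ol_dump_works_2011-03-31.txt.gz could look roughly like the sketch below. It rests on two assumptions that should be checked against the actual file: that each line of the dump carries one JSON record in its last tab-separated column, and that a work's subjects are stored as a list of strings under a "subjects" key. The function names are illustrative only.

```python
import gzip
import json

def iter_works(lines):
    """Yield one JSON object per dump line.

    ASSUMPTION: each line is tab-separated with the JSON record in the
    last column; a line that is pure JSON (no tabs) also works."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        yield json.loads(line.split("\t")[-1])

def has_subject(work, wanted):
    """True if the work lists `wanted` as a subject (case-insensitive,
    exact heading match). ASSUMPTION: subjects live under 'subjects'."""
    wanted = wanted.strip().lower()
    subjects = work.get("subjects", [])
    return any(isinstance(s, str) and s.strip().lower() == wanted
               for s in subjects)

def works_with_subject(path, wanted):
    """Stream the gzipped dump and yield only the matching works."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for work in iter_works(fh):
            if has_subject(work, wanted):
                yield work

if __name__ == "__main__":
    matches = works_with_subject("ol_dump_works_2011-03-31.txt.gz",
                                 "science fiction")
    print(sum(1 for _ in matches))
```

Because this compares whole headings exactly (ignoring case), it only counts works that carry the heading "Science fiction" itself, not works whose subjects merely contain those words. Which of the three counts above it comes closest to depends on how the search indexes subjects, which the sketch does not try to reproduce.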
