Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

2012-05-22 Thread Simon Spero
Arash - you might not want to use a straight dump of WorldCat catalog records, at least not without the associated holdings information.* There are a lot of quasi-duplicate records that are sufficiently broken that the WorldCat de-duplication algorithm refuses to merge them. These records will

2012-05-22 Thread Arash.Joorabchi
Thank you Roy and Simon for the info. As for your second point, I suppose one advantage of using the WorldCat API at this experimental stage is that the returned bib records are already FRBR-ized. Ross - Thanks for the link to the Open Library data dump. The WorldCat collection is 2 orders of magnitude

2012-05-19 Thread Roy Tennant
Arash, Yes, we have made WorldCat available to researchers under a special license agreement. I suggest contacting Thom Hickey <hic...@oclc.org> about such an arrangement. Thanks, Roy On Fri, May 18, 2012 at 3:46 AM, Arash.Joorabchi <arash.joorab...@ul.ie> wrote: Dear Karen, I am conducting a

2012-05-18 Thread Arash.Joorabchi
Dear Karen, I am conducting a research experiment on automatic text classification and I am trying to retrieve top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right

2012-05-18 Thread Ross Singer
On May 18, 2012, at 6:46 AM, Arash.Joorabchi wrote: Dear Karen, I am conducting a research experiment on automatic text classification and I am trying to retrieve top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose

2012-05-17 Thread Karen Coombs
I forwarded this thread to the Product Manager for the WorldCat Search API. She responded that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to the WorldCat Search API doesn't currently support any scan-type searches either. Is there a particular

2012-05-16 Thread Mike Taylor
There is no standard way in CQL to express "field X is not empty". Depending on implementations, NOT srw.dd= might work (but evidently doesn't in this case). Another possibility is srw.dd=*, but again that may or may not work, and might be appallingly inefficient if it does. NOT srw.dd=null will
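
For anyone wanting to try the variants mentioned above against their own server, here is a minimal sketch that only builds the SRU searchRetrieve URLs. The base URL is a placeholder, the `srw.kw` keyword index name is an assumption, and whether a given server accepts either form is implementation-dependent, as noted in the message.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): substitute the real SRU base URL
# and whatever API key parameter the service requires.
SRU_BASE = "https://sru.example.org/search"


def sru_url(cql, max_records=10):
    """Build a searchRetrieve URL for a CQL query string."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql,
        "maximumRecords": max_records,
    }
    return SRU_BASE + "?" + urlencode(params)


def candidate_queries(keyphrase):
    """Return two non-standard attempts at 'keyphrase AND DDC present'.

    srw.kw is assumed to be the keyword index; srw.dd is the Dewey
    index named in the thread. Neither form is guaranteed to work.
    """
    phrase = keyphrase.replace("\\", "\\\\").replace('"', '\\"')
    base = f'srw.kw = "{phrase}"'
    return [
        f'{base} and srw.dd = "*"',   # wildcard-only term
        f'{base} not srw.dd = ""',    # exclude an empty value
    ]


for cql in candidate_queries("text classification"):
    print(sru_url(cql))
```

If the server answers with a diagnostic for both forms, the fallback is to drop the `srw.dd` clause and filter the returned records locally.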

2012-05-16 Thread Arash.Joorabchi
Hi Mark, srw.dd=* does not work either:
Identifier: info:srw/diagnostic/1/27
Meaning:
Details: srw.dd
Message: The index [srw.dd] did not include a searchable value
I suppose the only option left is to retrieve everything and filter the results on the client side.
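
A rough sketch of that client-side filtering, assuming the response carries MARCXML records (MARC21 slim namespace) and that "has a DDC number" means a non-empty 082 $a. The function name is made up for illustration, and fetching the response is left out.

```python
import xml.etree.ElementTree as ET

MARC_URI = "http://www.loc.gov/MARC21/slim"
MARC_NS = {"marc": MARC_URI}


def records_with_ddc(xml_text):
    """Return (record_element, [ddc numbers]) for records that have 082 $a.

    Works on any XML document containing MARCXML <record> elements,
    e.g. the body of an SRU searchRetrieve response.
    """
    root = ET.fromstring(xml_text)
    kept = []
    for record in root.iter(f"{{{MARC_URI}}}record"):
        subfields = record.findall(
            "marc:datafield[@tag='082']/marc:subfield[@code='a']", MARC_NS
        )
        # Ignore empty or whitespace-only subfields.
        ddc = [sf.text.strip() for sf in subfields if sf.text and sf.text.strip()]
        if ddc:
            kept.append((record, ddc))
    return kept
```

Since records without 082 are dropped after retrieval, you would need to request more records per query than you actually want to keep.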

2012-05-16 Thread Arash.Joorabchi
Hi Andy, I am an SRU newbie myself, so I don't know how this could be achieved using scan operations and could not find much info on the SRU website (http://www.loc.gov/standards/sru/). As for the wildcards, according to this guide: