Hi Guys

I have a Solr application searching on data uploaded by Nutch.  The search
I wish to carry out is for a particular document reference contained within
the "url" field, e.g. IAE-UPC-0001.

The problem is that the file names that make up the URLs are not
consistent, so a URL might contain the reference as IAE-UPC-0001 or
IAE_UPC_0001 (i.e. using either the minus sign or the underscore as the
delimiter), but not both.

I have created the following query (in the Solr admin interface):

url:"IAE-UPC-0001"

which works (returning the single expected document), as do:

url:"IAE*UPC*0001"
url:"IAE?UPC?0001"

when the doc ref is in the format IAE-UPC-0001 (i.e. using the minus sign
as the delimiter).

However:

url:"IAE_UPC_0001"
url:"IAE*UPC*0001"
url:"IAE?UPC?0001"

do not work (returning zero documents) when the doc ref is in the format
IAE_UPC_0001 (i.e. using the underscore character as the delimiter).
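In case it helps anyone reproduce this, here is a rough Python sketch of how the two quoted queries could be generated and combined into the single OR query I would ideally like to run. The function names (delimiter_variants, build_url_query) are made up for illustration and are not part of Solr or Nutch. The script only builds the query string and does not talk to Solr, and it assumes the reference is delimited only by minus signs or underscores.

```python
import re


def delimiter_variants(ref, delimiters=("-", "_")):
    """Return the reference rewritten once per candidate delimiter.

    Hypothetical helper: assumes the parts of the reference are
    separated only by '-' or '_'.
    """
    parts = re.split(r"[-_]", ref)
    return [d.join(parts) for d in delimiters]


def build_url_query(ref, field="url"):
    """Build a quoted query per variant and join them with OR."""
    clauses = ['{}:"{}"'.format(field, v) for v in delimiter_variants(ref)]
    return " OR ".join(clauses)


if __name__ == "__main__":
    print(build_url_query("IAE-UPC-0001"))
```

The resulting string can be pasted into the query box of the admin interface. It will only help, of course, once the underscore form actually matches.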

I'm assuming the underscore is a special character, but I have looked
through the Solr wiki and can't find anything that explains the problem.
The minus sign also has a specific meaning, but that is nullified by
putting the term in quotes.

Can anyone suggest what I'm doing wrong?

Many thanks

Paul
