OK, it looks like you're mixing fieldTypes. That is,
you have some "string" fields, which are completely
unanalyzed, and some analyzed fields. The analyzed
fields have stopwords removed at index time, but it
looks like your query-time analysis chain does NOT
remove stopwords, so the phrase query still looks
for "de", which was never indexed.
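
To make the suspected mismatch concrete, here is a toy model in Python. It is
NOT Solr code: the stopword set and the analyze/phrase_matches helpers are made
up for illustration. It assumes the index-time chain drops stopwords (keeping
the position gaps) while the query-time chain keeps every token.

```python
# Toy model of index-time vs. query-time analysis; not Solr's implementation.
FRENCH_STOPWORDS = {"de", "la", "le"}  # assumed stopword list, for illustration only


def analyze(text, stopwords=frozenset()):
    """Lowercase, split on whitespace, drop stopwords, keep original positions."""
    return [(pos, tok) for pos, tok in enumerate(text.lower().split())
            if tok not in stopwords]


def phrase_matches(indexed, query):
    """True if all query tokens occur in the index at the same relative offsets."""
    if not query:
        return False
    positions = {}
    for pos, tok in indexed:
        positions.setdefault(tok, set()).add(pos)
    first_pos, first_tok = query[0]
    for start in positions.get(first_tok, ()):
        if all((start + qpos - first_pos) in positions.get(qtok, ())
               for qpos, qtok in query):
            return True
    return False


# Index time: stopwords removed, so "de" never reaches the index.
indexed = analyze("Chef de projet informatique", FRENCH_STOPWORDS)

# Query time without stopword removal: the phrase still contains "de".
mismatched_query = analyze("chef de projet")
# Query time with the same stopword removal as at index time.
consistent_query = analyze("chef de projet", FRENCH_STOPWORDS)

print(phrase_matches(indexed, mismatched_query))  # False
print(phrase_matches(indexed, consistent_query))  # True
```

If the real chains behave like this, making the index and query analyzers
agree on stopwords should be enough for the phrase to match again.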

So it's probably a schema issue. The admin/analysis
page will help you understand how the analysis chains
work.

I'd also recommend that you NOT use eDismax when
experimenting with analyzers; having requests distributed
across all those fields can be confusing. Use the fielded
form of the query instead, e.g. title:"chef de projet", just
to reduce the clutter of the output. Certainly DO use
eDismax when you're working "for real".
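
A small Python sketch of building such a single-field request, in case it
helps. The host, port, and path in SOLR_SELECT are assumptions (adjust to
your install), and fielded_phrase_url is just a hypothetical helper name;
debugQuery is the same parameter that already appears in your output.

```python
from urllib.parse import urlencode

# Assumed location of the select handler; change to match your setup.
SOLR_SELECT = "http://localhost:8983/solr/select"


def fielded_phrase_url(field, phrase, base=SOLR_SELECT):
    """Build a URL for a phrase query against ONE field, with debug output on."""
    params = {
        "q": f'{field}:"{phrase}"',  # fielded phrase query, e.g. title:"chef de projet"
        "debugQuery": "true",        # ask Solr to include the parsed query
        "rows": 5,
    }
    return f"{base}?{urlencode(params)}"


url = fielded_phrase_url("title", "chef de projet")
print(url)
```

Running it once per field in your qf list gives one short parsedquery per
request, which is much easier to read than the nine-way DisjunctionMaxQuery.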

But you're on the right track.
Best
Erick

On Thu, Nov 17, 2011 at 8:10 AM, Jean-Claude Dauphin
<jc.daup...@gmail.com> wrote:
> Thanks Erick for your prompt response.
>
> I am not sure but I think I found why the phrase "chef de projet" is not
> found by dismax and edismax.
> The following terms are indexed and can be seen with Luke:
>   chef
>   projet
>   chef de projet
> When searching for the phrase "chef de projet", the terms 'chef' and
> 'projet' are found in the index but 'de' is not found. And thus no results.
> Please note that using the standard Lucene QueryParser, it works well.
>
> This is just what I suspect. Does it sound correct?
>
> Best wishes,
>
> Jean-Claude
>
> On Wed, Nov 16, 2011 at 9:26 PM, Erick Erickson 
> <erickerick...@gmail.com> wrote:
>
>> Ah, ok I was mis-reading some things. So, let's ignore the
>> category bits for now.
>>
>> Questions:
>> 1> Can you narrow down the problem? That is,
>>    demonstrate this with a single field and leave out
>>    the category stuff. Something like
>>    q=title:"chef de projet" getting no results and
>>    q=title:"chef projet" getting results? The idea
>>    is to cycle through all the fields to see if we can
>>    home in on the problem. I'd get rid of any pf
>>    parameters in your edismax definition too. I'm after
>>    the simplest case that can demonstrate the issue.
>>    For that matter, it'd be even easier if you could
>>    make this happen with the default searcher
>>    (solr/select?q=title:"chef de projet").
>> 2> If you can do <1>, please post the field definitions
>>     from your schema.xml file. One possibility is that
>>     you are removing stopwords at index time but not
>>     query time or vice-versa, but that's a wild guess.
>> 3> Once you have a field, use the admin/analysis page
>>     to see the exact transformations that occur at index
>>     and query time to see if anything jumps out.
>>
>> All in all, I suspect you have a field that isn't being parsed
>> as you expect at either index or query time, but as I said
>> above, that's a guess.
>>
>> Best
>> Erick
>>
>> On Wed, Nov 16, 2011 at 5:02 AM, Jean-Claude Dauphin
>> <jc.daup...@gmail.com> wrote:
>> > Thanks Erick for your quick answer.
>> >
>> > I am using Solr 3.1
>> >
>> > 1) I have set the mm parameter to 0 and removed the categories from the
>> > search. Thus the query is only for "chef de projet" and nothing else.
>> > But the problem remains, i.e., searching for "chef de projet" gives no
>> > results while searching for "chef projet" gives the right result.
>> >
>> > Here is an excerpt from the test I made:
>> >
>> > DISMAX query (q)=("chef de projet")
>> >
>> > =====The Parameters=====
>> >
>> > *queryResponse*=[{responseHeader={status=0,QTime=157,
>> >
>> > params={facet=true,
>> >
>> > f.createDate.facet.date.start=NOW/DAY-6DAYS,tie=0.1,
>> >
>> > facet.limit=40000,
>> >
>> > f.location.facet.limit=3,
>> >
>> > *q.alt*=*:*,
>> >
>> > facet.date.other=all,
>> >
>> > hl=true,version=2,
>> >
>> > *bq*=[categoryPayloads:category10000071^1,
>> > categoryPayloads:category10055078^1,
>> > categoryPayloads:category10055405^1],
>> >
>> > fl=*,score,
>> >
>> > debugQuery=true,
>> >
>> > facet.field=[soldProvisions, contractTypeText, nafCodeText, createDate,
>> > wage, keywords, labelLocation, jobCode, organizationName,
>> > requiredExperienceLevelText],
>> >
>> > *qs*=3,
>> >
>> > qt=edismax,
>> >
>> > facet.date.end=NOW/DAY,
>> >
>> > *mm*=0,
>> >
>> > facet.mincount=1,
>> >
>> > facet.date=createDate,
>> >
>> > *qf*= title^4.0 formattedDescription^2.0 nafCodeText^2.0 jobCodeText^3.0
>> > organizationName^1.0 keywords^3.0 location^1.0 labelLocation^1.0
>> > categoryPayloads^1.0,
>> >
>> > hl.fl=title,
>> >
>> > wt=javabin,
>> >
>> > rows=20,
>> >
>> > start=0,
>> >
>> > *q*=("chef de projet"),
>> >
>> > facet.date.gap=+1DAY,
>> >
>> > *stopwords*=false,
>> >
>> > *ps*=3}},
>> >
>> > ====The Solr Response====
>> > response={numFound=0
>> >
>> > ====Debug Info====
>> >
>> > debug={
>> >
>> > *rawquerystring*=("chef de projet"),
>> >
>> > *querystring*=("chef de projet"),
>> >
>> > *-----------------------------------*
>> >
>> > *parsedquery*=
>> >
>> > +*DisjunctionMaxQuery*((title:"chef de projet"~3^4.0 | keywords:chef de
>> > projet^3.0 | organizationName:chef de projet | location:chef de projet |
>> > formattedDescription:"chef de projet"~3^2.0 | nafCodeText:chef de
>> > projet^2.0 | jobCodeText:chef de projet^3.0 | categoryPayloads:"chef de
>> > projet"~3 | labelLocation:chef de projet)~0.1)
>> > *DisjunctionMaxQuery*((title:"(("chef
>> > chef) de (projet") projet)"~3^4.0)~0.1) categoryPayloads:category10000071
>> > categoryPayloads:category10055078 categoryPayloads:category10055405,
>> >
>> > *-----------------------------------*
>> >
>> > *parsedquery_toString*=+(title:"chef de projet"~3^4.0 | keywords:chef de
>> > projet^3.0 | organizationName:chef de projet | location:chef de projet |
>> > formattedDescription:"chef de projet"~3^2.0 | nafCodeText:chef de
>> > projet^2.0 | jobCodeText:chef de projet^3.0 | categoryPayloads:"chef de
>> > projet"~3 | labelLocation:chef de projet)~0.1 (title:"(("chef chef) de
>> > (projet") projet)"~3^4.0)~0.1 categoryPayloads:category10000071
>> > categoryPayloads:category10055078 categoryPayloads:category10055405,
>> >
>> >
>> >
>> > explain={},
>> >
>> > QParser=ExtendedDismaxQParser,altquerystring=null,
>> >
>> > *boost_queries*=[categoryPayloads:category10000071^1,
>> > categoryPayloads:category10055078^1,
>> > categoryPayloads:category10055405^1],
>> >
>> > *parsed_boost_queries*=[categoryPayloads:category10000071,
>> > categoryPayloads:category10055078, categoryPayloads:category10055405],
>> > boostfuncs=null,
>> >
>> > 2) I tried removing the bq values, but nothing changed:
>> >
>> > *querystring*=("chef de projet"),
>> >
>> > *parsedquery*=+*DisjunctionMaxQuery*((title:"chef de projet"~3^4.0 |
>> > keywords:chef de projet^3.0 | organizationName:chef de projet |
>> > location:chef de projet | formattedDescription:"chef de projet"~3^2.0 |
>> > nafCodeText:chef de projet^2.0 | jobCodeText:chef de projet^3.0 |
>> > categoryPayloads:"chef de projet"~3 | labelLocation:chef de projet)~0.1)
>> > *DisjunctionMaxQuery*((title:"(("chef chef) de (projet")
>> > projet)"~3^4.0)~0.1),
>> > *parsedquery_toString*=+(title:"chef de projet"~3^4.0 | keywords:chef de
>> > projet^3.0 | organizationName:chef de projet | location:chef de projet |
>> > formattedDescription:"chef de projet"~3^2.0 | nafCodeText:chef de
>> > projet^2.0 | jobCodeText:chef de projet^3.0 | categoryPayloads:"chef de
>> > projet"~3 | labelLocation:chef de projet)~0.1 (title:"(("chef chef) de
>> > (projet") projet)"~3^4.0)~0.1,
>> >
>> > 3) and the query which works
>> >
>> > debug={
>> >
>> > *rawquerystring*=("chef  projet"),
>> >
>> > *querystring*=("chef  projet"),
>> >
>> > *parsedquery*=+*DisjunctionMaxQuery*((title:"chef projet"~3^4.0 |
>> > keywords:chef  projet^3.0 | organizationName:chef  projet |
>> > location:chef  projet
>> > | formattedDescription:"chef projet"~3^2.0 | nafCodeText:chef
>> > projet^2.0 |
>> > jobCodeText:chef  projet^3.0 | categoryPayloads:"chef projet"~3 |
>> > labelLocation:chef  projet)~0.1) *DisjunctionMaxQuery*((title:"(("chef
>> > chef) (projet") projet)"~3^4.0)~0.1),
>> >
>> > *parsedquery_toString*=+(title:"chef projet"~3^4.0 | keywords:chef
>> > projet^3.0
>> > | organizationName:chef  projet | location:chef  projet |
>> > formattedDescription:"chef projet"~3^2.0 | nafCodeText:chef  projet^2.0 |
>> > jobCodeText:chef  projet^3.0 | categoryPayloads:"chef projet"~3 |
>> > labelLocation:chef  projet)~0.1 (title:"(("chef chef) (projet")
>> > projet)"~3^4.0)~0.1,
>> >
>> > explain={23715081=
>> >
>> > 14.832518 = (MATCH) sum of:
>> >
>> > I really don't know how to solve this issue and would appreciate any help.
>> >
>> > Best wishes
>> >
>> > Jean-Claude
>> >
>> >
>> > On Tue, Nov 15, 2011 at 9:28 PM, Erick Erickson
>> > <erickerick...@gmail.com> wrote:
>> >
>> >> The query rewriting is...er...interesting, and I'll skip that for now...
>> >>
>> >> As for why you're not getting results, see the mm parameter
>> >> here: http://wiki.apache.org/solr/DisMaxQParserPlugin
>> >>
>> >> Especially the line:
>> >> The default value is 100% (all clauses must match)
>> >>
>> >> so I suspect your categories are not matching, especially if there's
>> >> only a single category per document!
>> >>
>> >> Best
>> >> Erick
>> >>
>> >> On Tue, Nov 15, 2011 at 9:46 AM, Jean-Claude Dauphin
>> >> <jc.daup...@gmail.com> wrote:
>> >> > Hello,
>> >> >
>> >> > I would be very grateful if somebody could explain what the exact
>> >> > problem is and how to get the right results.
>> >> >
>> >> > Using dismax or edismax with the following query:
>> >> > EDISMAX query (q)=("chef de projet" category10000071 category10055078
>> >> > category10055405)
>> >> > gives no results (it should return 33 documents with "chef de projet"),
>> >> >
>> >> > while the following query
>> >> > EDISMAX query (q)=("chef projet")
>> >> >
>> >> > gives the right number of results.
>> >> >
>> >> > Here is the debug info:
>> >> >
>> >> > *queryResponse*=[{responseHeader={status=0,
>> >> >
>> >> > QTime=16,
>> >> >
>> >> > params={facet=true,
>> >> >
>> >> > f.createDate.facet.date.start=NOW/DAY-6DAYS,
>> >> >
>> >> > *tie*=0.1,
>> >> >
>> >> > facet.limit=40000,
>> >> >
>> >> > f.location.facet.limit=3,
>> >> >
>> >> > *q.alt*=*:*,
>> >> >
>> >> > facet.date.other=all,
>> >> >
>> >> > hl=true,version=2,
>> >> >
>> >> > *bq*=[categoryPayloads:category10000071^1,
>> >> > categoryPayloads:category10055078^1,
>> >> > categoryPayloads:category10055405^1],
>> >> >
>> >> > debugQuery=true,
>> >> >
>> >> > *fl*=*,score,
>> >> >
>> >> > facet.field=[soldProvisions, contractTypeText, nafCodeText, createDate,
>> >> > wage, keywords, labelLocation, jobCode, organizationName,
>> >> > requiredExperienceLevelText],
>> >> >
>> >> > *qs*=3,
>> >> >
>> >> > *qt*=edismax,
>> >> >
>> >> > facet.date.end=NOW/DAY,
>> >> >
>> >> > facet.mincount=1,
>> >> >
>> >> > facet.date=createDate,
>> >> >
>> >> > *qf*= title^4.0 formattedDescription^2.0 nafCodeText^2.0 jobCodeText^3.0
>> >> > organizationName^1.0 keywords^3.0 location^1.0 labelLocation^1.0
>> >> > categoryPayloads^1.0,
>> >> >
>> >> > hl.fl=title,
>> >> >
>> >> > wt=javabin,
>> >> >
>> >> > rows=20,
>> >> >
>> >> > start=0,
>> >> >
>> >> > *q*=("chef de projet" category10000071 category10055078
>> >> > category10055405),
>> >> >
>> >> > facet.date.gap=+1DAY}},
>> >> >
>> >> >
>> >> >
>> >> > *response*={numFound=0,start=00.0,docs=[]},facet_counts={facet_queries={},facet_fields={soldProvisions={},contractTypeText={},nafCodeText={},createDate={},wage={},keywords={},labelLocation={},jobCode={},organizationName={},requiredExperienceLevelText={}},facet_dates={createDate={gap=+1DAY,start=Wed
>> >> > Nov 09 01:00:00 CET 2011,end=Tue Nov 15 01:00:00 CET
>> >> > 2011,before=0,after=0,between=0}},facet_ranges={}},highlighting={},
>> >> >
>> >> > debug={
>> >> >
>> >> > *rawquerystring*=("chef de projet" category10000071 category10055078
>> >> > category10055405),
>> >> >
>> >> > *querystring*=("chef de projet" category10000071 category10055078
>> >> > category10055405),
>> >> >
>> >> > *parsedquery*=+((*DisjunctionMaxQuery*((title:"chef de projet"~3^4.0 |
>> >> > keywords:chef de projet^3.0 | organizationName:chef de projet |
>> >> > location:chef de projet | formattedDescription:"chef de projet"~3^2.0 |
>> >> > nafCodeText:chef de projet^2.0 | jobCodeText:chef de projet^3.0 |
>> >> > categoryPayloads:"chef de projet"~3 | labelLocation:chef de projet)~0.1)
>> >> >
>> >> > *DisjunctionMaxQuery*((title:category10000071^4.0 |
>> >> > keywords:category10000071^3.0 | organizationName:category10000071 |
>> >> > location:category10000071 | formattedDescription:category10000071^2.0 |
>> >> > nafCodeText:category10000071^2.0 | jobCodeText:category10000071^3.0 |
>> >> > categoryPayloads:category10000071 | labelLocation:category10000071)~0.1)
>> >> > *DisjunctionMaxQuery*((title:category10055078^4.0 |
>> >> > keywords:category10055078^3.0 | organizationName:category10055078 |
>> >> > location:category10055078 | formattedDescription:category10055078^2.0 |
>> >> > nafCodeText:category10055078^2.0 | jobCodeText:category10055078^3.0 |
>> >> > categoryPayloads:category10055078 | labelLocation:category10055078)~0.1)
>> >> > *DisjunctionMaxQuery*((title:category10055405^4.0 |
>> >> > keywords:category10055405^3.0 | organizationName:category10055405 |
>> >> > location:category10055405 | formattedDescription:category10055405^2.0 |
>> >> > nafCodeText:category10055405^2.0 | jobCodeText:category10055405^3.0 |
>> >> > categoryPayloads:category10055405 |
>> >> > labelLocation:category10055405)~0.1))~4)
>> >> > *DisjunctionMaxQuery*((title:"(("chef chef) de (projet" projet)
>> >> > category10000071 category10055078 (category10055405)
>> >> > category10055405)"^4.0)~0.1)
>> >> > categoryPayloads:category10000071 categoryPayloads:category10055078
>> >> > categoryPayloads:category10055405,
>> >> >
>> >> > *parsedquery_toString*=+(((title:"chef de projet"~3^4.0 |
>> >> > keywords:chef de projet^3.0 | organizationName:chef de projet |
>> >> > location:chef de projet | formattedDescription:"chef de projet"~3^2.0 |
>> >> > nafCodeText:chef de projet^2.0 | jobCodeText:chef de projet^3.0 |
>> >> > categoryPayloads:"chef de projet"~3 | labelLocation:chef de projet)~0.1
>> >> > (title:category10000071^4.0 |
>> >> > keywords:category10000071^3.0 | organizationName:category10000071 |
>> >> > location:category10000071 | formattedDescription:category10000071^2.0 |
>> >> > nafCodeText:category10000071^2.0 | jobCodeText:category10000071^3.0 |
>> >> > categoryPayloads:category10000071 | labelLocation:category10000071)~0.1
>> >> > (title:category10055078^4.0 | keywords:category10055078^3.0 |
>> >> > organizationName:category10055078 | location:category10055078 |
>> >> > formattedDescription:category10055078^2.0 |
>> >> > nafCodeText:category10055078^2.0 | jobCodeText:category10055078^3.0 |
>> >> > categoryPayloads:category10055078 | labelLocation:category10055078)~0.1
>> >> > (title:category10055405^4.0 | keywords:category10055405^3.0 |
>> >> > organizationName:category10055405 | location:category10055405 |
>> >> > formattedDescription:category10055405^2.0 |
>> >> > nafCodeText:category10055405^2.0 | jobCodeText:category10055405^3.0 |
>> >> > categoryPayloads:category10055405 | labelLocation:category10055405)~0.1)~4)
>> >> > (title:"(("chef chef) de (projet" projet) category10000071 category10055078
>> >> > (category10055405) category10055405)"^4.0)~0.1
>> >> > categoryPayloads:category10000071 categoryPayloads:category10055078
>> >> > categoryPayloads:category10055405,
>> >> >
>> >> > explain={},
>> >> >
>> >> > QParser=ExtendedDismaxQParser,
>> >> >
>> >> > altquerystring=null,
>> >> >
>> >> > boost_queries=[categoryPayloads:category10000071^1,
>> >> > categoryPayloads:category10055078^1,
>> >> > categoryPayloads:category10055405^1],
>> >> >
>> >> > parsed_boost_queries=[categoryPayloads:category10000071,
>> >> > categoryPayloads:category10055078, categoryPayloads:category10055405],
>> >> >
>> >> > boostfuncs=null,
>> >> >
>> >> > The last part:
>> >> > *DisjunctionMaxQuery*((title:"(("chef chef) de (projet" projet)
>> >> > category10000071 category10055078 (category10055405)
>> >> > category10055405)"^4.0)~0.1) categoryPayloads:category10000071
>> >> > categoryPayloads:category10055078 categoryPayloads:category10055405,
>> >> >
>> >> > is quite strange to me, and I would appreciate it if someone could
>> >> > explain the query rewriting.
>> >> >
>> >> > Best wishes,
>> >> >
>> >> > JC
>> >> > --
>> >> > Jean-Claude Dauphin
>> >> >
>> >>
>> >
>> >
>> >
>> > --
>> > Jean-Claude Dauphin
>> >
>> > jc.daup...@gmail.com
>> > jc.daup...@afus.unesco.org
>> >
>> > http://kenai.com/projects/j-isis/
>> > http://www.unesco.org/isis/
>> > http://www.unesco.org/idams/
>> > http://www.greenstone.org
>> >
>>
>
