Re: Wild-card query behavior

2019-10-09 Thread Mikhail Khludnev
Well, it reminds me of the usual awkward parsing issues. Try experimenting with
={!join to=...from=... v='field:12*'} or ={!join to=... from=...
v=$qq}=field:12*
No more questions to ask.
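
A minimal sketch of the second form, not part of the original reply. It assumes the parameter names the archive stripped were q and qq, and it uses the hypothetical field names parent_id and id in place of the elided to=... from=... values. The wildcard clause travels in its own parameter and is dereferenced with v=$qq, so it never passes through the local-params parser:

```python
from urllib.parse import urlencode, parse_qs

def join_params(from_field, to_field, inner_query):
    """Build request parameters for a {!join} query whose inner query
    is passed separately and dereferenced with v=$qq."""
    return {
        "q": "{!join from=%s to=%s v=$qq}" % (from_field, to_field),
        "qq": inner_query,
    }

# Hypothetical field names; the original message elides them as "...".
params = join_params("parent_id", "id", "solrField:(12*)")
query_string = urlencode(params)
print(query_string)
```

urlencode takes care of escaping the braces, the dollar sign and the parentheses, which is easy to get wrong when pasting such a query into a URL by hand.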

On Wed, Oct 9, 2019 at 4:39 PM Paresh  wrote:

> E.g. In query, join with wild-card query using parenthesis I get error -
>
> "error-class","org.apache.solr.common.SolrException",
>   "root-error-class","org.apache.solr.parser.ParseException"],
> "msg":"org.apache.solr.search.SyntaxError: Cannot parse
> 'solrField:(12*': Encountered \"\" at line 1, column 57.\r\nWas
> expecting one of:\r\n ...\r\n ...\r\n ...\r\n
> \"+\" ...\r\n\"-\" ...\r\n ...\r\n\"(\" ...\r\n
> \")\" ...\r\n\"*\" ...\r\n\"^\" ...\r\n ...\r\n
>  ...\r\n ...\r\n ...\r\n
> 
> ...\r\n ...\r\n\"[\" ...\r\n\"{\" ...\r\n
>  ...\r\n\"filter(\" ...\r\n ...\r\n",
> "code":400}}
>
> When using the same query with parenthesis in filter query (fq), I get the
> expected results.
>
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


-- 
Sincerely yours
Mikhail Khludnev


Re: Wild-card query behavior

2019-10-09 Thread Paresh
E.g., when I use a join with a wild-card query in parentheses in the main query, I get this error:

"error-class","org.apache.solr.common.SolrException",
  "root-error-class","org.apache.solr.parser.ParseException"],
"msg":"org.apache.solr.search.SyntaxError: Cannot parse
'solrField:(12*': Encountered \"\" at line 1, column 57.\r\nWas
expecting one of:\r\n ...\r\n ...\r\n ...\r\n   
\"+\" ...\r\n\"-\" ...\r\n ...\r\n\"(\" ...\r\n   
\")\" ...\r\n\"*\" ...\r\n\"^\" ...\r\n ...\r\n   
 ...\r\n ...\r\n ...\r\n
...\r\n ...\r\n\"[\" ...\r\n\"{\" ...\r\n   
 ...\r\n\"filter(\" ...\r\n ...\r\n",
"code":400}}

When I use the same query with parentheses in a filter query (fq), I get the
expected results.






Re: Wild-card query behavior

2019-10-09 Thread Mikhail Khludnev
Hello, Paresh.
Please examine debugQuery output, otherwise 'doesn't work' is vague.

On Wed, Oct 9, 2019 at 8:31 AM Paresh  wrote:

> Hi All,
>
> I am trying wild-card query with query, filter query with and without !join
> and finding it difficult to understand the SOLR behavior.
>
> (-) wild-card like 12* in query: field:12* works well
> (-) wild-card like 12* in query with {!join to=... from=...}field:12* -->
> works well
> (-) wild-card like (12*) in query with {!join to=... from=...}field:(12*)
> --> doesn't work
> (-) wild-card like (12*) in filter query with ={!join to=...
> from=...}field:12* --> doesn't work
> (-) wild-card like (12*) in filter query with ={!join to=...
> from=...}field:"12*" --> doesn't work
> (-) wild-card like (12*) in filter query with ={!join to=...
> from=...}field:(12*) --> works well
>
> Why wild-card query does not work with {!join}?
>
> Regards,
> Paresh
>
>
>
>


-- 
Sincerely yours
Mikhail Khludnev


Wild-card query behavior

2019-10-08 Thread Paresh
Hi All,

I am trying wild-card queries in the main query and in filter queries, with and
without {!join}, and I am finding it difficult to understand Solr's behavior.

(-) wild-card like 12* in query: field:12* works well
(-) wild-card like 12* in query with {!join to=... from=...}field:12* -->
works well
(-) wild-card like (12*) in query with {!join to=... from=...}field:(12*)
--> doesn't work 
(-) wild-card like (12*) in filter query with ={!join to=...
from=...}field:12* --> doesn't work
(-) wild-card like (12*) in filter query with ={!join to=...
from=...}field:"12*" --> doesn't work
(-) wild-card like (12*) in filter query with ={!join to=...
from=...}field:(12*) --> works well

Why does the wild-card query not work with {!join}?

Regards,
Paresh





Re: Odd Boolean Query behavior in SOLR 3.6

2017-06-13 Thread Erik Hatcher
Inner purely negative queries match nothing.  A query is about matching, and 
skipping over things that don’t match.  The fix is when using (-something) to 
do (*:* -something) to match everything and skip the negative clause items.

In your example, try fq=((*:* -documentTypeId:3) AND companyId:29096)
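
A minimal sketch of that rewrite, not from the original message. The helper name is hypothetical, and the regular expression only handles a parenthesised group whose sole content is a single negated field:value clause, which is all the example above needs:

```python
import re

# Matches a parenthesised group containing only one negated clause,
# e.g. "(-documentTypeId:3)".
PURE_NEGATIVE = re.compile(r"\(\s*-([^\s()]+)\s*\)")

def fix_pure_negative(query):
    """Rewrite "(-clause)" as "(*:* -clause)" so the group matches
    every document except those matching the clause."""
    return PURE_NEGATIVE.sub(r"(*:* -\1)", query)

fq = fix_pure_negative("((-documentTypeId:3) AND companyId:29096)")
print(fq)
```

A group that already contains a positive clause next to the negative one is left untouched, because it has something to match against.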

Erik

> On Jun 13, 2017, at 3:15 AM, abhi Abhishek  wrote:
> 
> Hi Everyone,
> 
>I have hit a weird behavior of Boolean Query, when I am
> running the query with below param’s  it’s not behaving as expected. can
> you please help me understand the behavior here?
> 
> 
> 
> q=*:*=((-documentTypeId:3)+AND+companyId:29096)=2.2=0=10=on=true
> 
> => Returns 0 matches
> 
> filter_queries: ((-documentTypeId:3) AND companyId:29096)
> 
> parsed_filter_queries: +(-documentTypeId:3) +companyId:29096
> 
> 
> 
> q=*:*=(-documentTypeId:3+AND+companyId:29096)=2.2=0=10=on=true
> 
> => returns 1600 matches
> 
> filter_queries:(-documentTypeId:3 AND companyId:29096)
> 
> parsed_filter_queries:-documentTypeId:3 +companyId:29096
> 
> 
> 
> Can you please help me understand what am I missing here?
> 
> 
> Thanks in Advance.
> 
> 
> Thanks & Best Regards,
> 
> Abhishek



Odd Boolean Query behavior in SOLR 3.6

2017-06-13 Thread abhi Abhishek
Hi Everyone,

I have hit a weird behavior with a Boolean query: when I run the query with
the parameters below, it does not behave as expected. Can you please help me
understand the behavior here?



q=*:*=((-documentTypeId:3)+AND+companyId:29096)=2.2=0=10=on=true

=> Returns 0 matches

filter_queries: ((-documentTypeId:3) AND companyId:29096)

parsed_filter_queries: +(-documentTypeId:3) +companyId:29096



q=*:*=(-documentTypeId:3+AND+companyId:29096)=2.2=0=10=on=true

=> returns 1600 matches

filter_queries:(-documentTypeId:3 AND companyId:29096)

parsed_filter_queries:-documentTypeId:3 +companyId:29096



Can you please help me understand what I am missing here?


Thanks in Advance.


Thanks & Best Regards,

Abhishek


Re: Wildcard query behavior.

2016-04-19 Thread Modassar Ather
Yes! Wildcards are not analyzed. Thanks, Shawn, for reminding me.
Thanks Erick for your response.

Best,
Modassar

On Mon, Apr 18, 2016 at 8:53 PM, Erick Erickson 
wrote:

> Here's a blog on the subject:
>
> https://lucidworks.com/blog/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/
>
> bq: When validator is changed to validate, both at query time and index
> time,
> then should not validator*/validator return the same results at-least?
>
> This is one of those problems that's easy to state, but hard to solve. And
> there are so many variations that any attempt to solve it will _always_
> have lots of surprises. Simple example (and remember that the
> stemming is usually algorithmic). "validator" probably stems to "validat".
> However, "validato" (note the 'o') may not stem
> the same way at all, so searching for "validato*" wouldn't produce the
> expected response.
>
> Best,
> Erick
>
> On Mon, Apr 18, 2016 at 6:23 AM, Shawn Heisey  wrote:
> > On 4/18/2016 1:18 AM, Modassar Ather wrote:
> >> When I search for f:validator I get 80K+ documents whereas if I search
> for
> >> f:validator* I get only around 150 results.
> >>
> >> When I checked on analysis page I see that validator is changed to
> >> validate. Per my understanding in both the above cases it should
> at-least
> >> give the exact same result of around 80K+ documents.
> >
> > What Reth was trying to tell you, but did not state clearly, is that
> > when you use wildcards, your query is NOT analyzed -- none of your
> > filters, including the stemmer, are used.
> >
> > Thanks,
> > Shawn
> >
>


Re: Wildcard query behavior.

2016-04-18 Thread Erick Erickson
Here's a blog on the subject:
https://lucidworks.com/blog/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/

bq: When validator is changed to validate, both at query time and index time,
then should not validator*/validator return the same results at-least?

This is one of those problems that's easy to state, but hard to solve. And
there are so many variations that any attempt to solve it will _always_
have lots of surprises. Simple example (and remember that the
stemming is usually algorithmic). "validator" probably stems to "validat".
However, "validato" (note the 'o') may not stem
the same way at all, so searching for "validato*" wouldn't produce the
expected response.

Best,
Erick

On Mon, Apr 18, 2016 at 6:23 AM, Shawn Heisey  wrote:
> On 4/18/2016 1:18 AM, Modassar Ather wrote:
>> When I search for f:validator I get 80K+ documents whereas if I search for
>> f:validator* I get only around 150 results.
>>
>> When I checked on analysis page I see that validator is changed to
>> validate. Per my understanding in both the above cases it should at-least
>> give the exact same result of around 80K+ documents.
>
> What Reth was trying to tell you, but did not state clearly, is that
> when you use wildcards, your query is NOT analyzed -- none of your
> filters, including the stemmer, are used.
>
> Thanks,
> Shawn
>


Re: Wildcard query behavior.

2016-04-18 Thread Shawn Heisey
On 4/18/2016 1:18 AM, Modassar Ather wrote:
> When I search for f:validator I get 80K+ documents whereas if I search for
> f:validator* I get only around 150 results.
>
> When I checked on analysis page I see that validator is changed to
> validate. Per my understanding in both the above cases it should at-least
> give the exact same result of around 80K+ documents.

What Reth was trying to tell you, but did not state clearly, is that
when you use wildcards, your query is NOT analyzed -- none of your
filters, including the stemmer, are used.
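
A toy illustration of the point, not Solr code: the one-entry stem table stands in for KStemFilterFactory, and the four documents are made up. Terms are stemmed at index time and a plain term query is stemmed the same way, but a wildcard query is compared against the indexed terms exactly as typed:

```python
# Toy stand-in for the stemmer; only the mapping discussed in the thread.
STEMS = {"validator": "validate"}

def analyze(token):
    """Lowercase and stem a token, as an analysis chain would."""
    token = token.lower()
    return STEMS.get(token, token)

# Made-up documents, one token each for brevity.
docs = ["validator", "Validator", "validate", "validators"]
indexed = [analyze(d) for d in docs]

def term_query(text):
    """Analyzed query: the query text goes through the same stemmer."""
    wanted = analyze(text)
    return [i for i, term in enumerate(indexed) if term == wanted]

def prefix_query(prefix):
    """Wildcard query: the prefix is NOT stemmed before matching."""
    return [i for i, term in enumerate(indexed) if term.startswith(prefix)]

print(len(term_query("validator")))    # matches the stemmed term "validate"
print(len(prefix_query("validator")))  # only terms that kept the longer prefix
print(len(prefix_query("validat")))    # a shorter prefix covers both forms
```

In this toy index the word "validator" no longer exists as a term after stemming, so the prefix "validator" can only hit terms the stemmer left alone.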

Thanks,
Shawn



Re: Wildcard query behavior.

2016-04-18 Thread Modassar Ather
Thanks Reth for your response.

When validator is changed to validate, both at query time and index time,
shouldn't validator* and validator return at least the same results?

E.g., 5 documents contain validator. At index time validator is changed to
validate.
Now when validator* is searched, it will also be changed to validate and should
match all 5 documents. In this case I am not sure how the wildcard is handled
internally, meaning what the query will be transformed into.

Please help me understand the internals of wildcard with stemming or point
me to some documents as I could not find any details on it.

Best,
Modassar

On Mon, Apr 18, 2016 at 1:04 PM, Reth RM  wrote:

> If you search for f:validat*, then I believe you will get same number of
> results. Please check.
>
> f:validator* is searching for records that have prefix "validator" where as
> field with stemmer which stems "validator" to "validate" (if this stemming
> was applied at index time as well as query time) its looking for records
> that have "validate" or "validator", so for obvious reasons, numFound might
> have been different.
>
>
>
> On Mon, Apr 18, 2016 at 12:48 PM, Modassar Ather 
> wrote:
>
> > Hi,
> >
> > Please help me understand following.
> >
> > I have analysis chain which uses KStemFilterFactory for a field. Solr
> > version is 5.4.0
> >
> > When I search for f:validator I get 80K+ documents whereas if I search
> for
> > f:validator* I get only around 150 results.
> >
> > When I checked on analysis page I see that validator is changed to
> > validate. Per my understanding in both the above cases it should at-least
> > give the exact same result of around 80K+ documents.
> >
> > I understand in some cases wildcards can result in sub-optimal results
> for
> > stemmed content. Please correct me if I am wrong.
> >
> > Thanks,
> > Modassar
> >
>


Re: Wildcard query behavior.

2016-04-18 Thread Reth RM
If you search for f:validat*, then I believe you will get the same number of
results. Please check.

f:validator* searches for records containing a term with the prefix "validator",
whereas f:validator on a field whose stemmer turns "validator" into "validate"
(if this stemming was applied at index time as well as query time) looks for
records that have "validate" or "validator". That is likely why numFound
differs.



On Mon, Apr 18, 2016 at 12:48 PM, Modassar Ather 
wrote:

> Hi,
>
> Please help me understand following.
>
> I have analysis chain which uses KStemFilterFactory for a field. Solr
> version is 5.4.0
>
> When I search for f:validator I get 80K+ documents whereas if I search for
> f:validator* I get only around 150 results.
>
> When I checked on analysis page I see that validator is changed to
> validate. Per my understanding in both the above cases it should at-least
> give the exact same result of around 80K+ documents.
>
> I understand in some cases wildcards can result in sub-optimal results for
> stemmed content. Please correct me if I am wrong.
>
> Thanks,
> Modassar
>


Wildcard query behavior.

2016-04-18 Thread Modassar Ather
Hi,

Please help me understand the following.

I have analysis chain which uses KStemFilterFactory for a field. Solr
version is 5.4.0

When I search for f:validator I get 80K+ documents whereas if I search for
f:validator* I get only around 150 results.

When I checked on the analysis page, I saw that validator is changed to
validate. Per my understanding, both of the above cases should give at least
the same result of around 80K+ documents.

I understand in some cases wildcards can result in sub-optimal results for
stemmed content. Please correct me if I am wrong.

Thanks,
Modassar


Re: Query behavior.

2016-03-19 Thread Jack Krupansky
That's what I thought you had meant before, but the Jira ticket indicates
that you are looking for some extra level of AND/MUST outside of the OR,
which is different from what you just indicated. In the ticket you say: "How
can I achieve following? "+((fl:java fl:book))"", which has an extra AND
outside of the inner sub-query, which is a little different than just "(fl:java
fl:book)". Sure, the results should be the same, but why insist on the
extra level of nested boolean query?

-- Jack Krupansky

On Thu, Mar 17, 2016 at 12:50 AM, Modassar Ather 
wrote:

> What I understand by q.op is the default operator. If there is no AND/OR
> in-between the terms the default will be AND as per my setting of q.op=AND.
> But what if the query has AND/OR explicitly put in-between the query terms?
> I just think that if (A OR B) is the query then the result should be based
> on any of the term's or both of the terms and not only both of the terms.
> Please correct me if my understanding is wrong.
>
> Thanks,
> Modassar
>
> On Wed, Mar 16, 2016 at 7:34 PM, Jack Krupansky 
> wrote:
>
> > Now you've confused me... Did you actually intend that q.op=AND was going
> > to perform some function in a query with only two terms and and OR
> > operator? I mean, why not just drop the q.op=AND?
> >
> > -- Jack Krupansky
> >
> > On Wed, Mar 16, 2016 at 1:31 AM, Modassar Ather 
> > wrote:
> >
> > > Jack as suggested I have created following jira issue.
> > >
> > > https://issues.apache.org/jira/browse/SOLR-8853
> > >
> > > Thanks,
> > > Modassar
> > >
> > >
> > > On Tue, Mar 15, 2016 at 8:15 PM, Jack Krupansky <
> > jack.krupan...@gmail.com>
> > > wrote:
> > >
> > > > That was precisely the point of the need for a new Jira - to answer
> > > exactly
> > > > the questions that you have posed - and that I had proposed as well.
> > > Until
> > > > some of the senior committers comment on that Jira you won't have
> > > answers.
> > > > They've painted themselves into a corner and now I am curious how
> they
> > > will
> > > > unpaint themselves out of that corner.
> > > >
> > > > -- Jack Krupansky
> > > >
> > > > On Tue, Mar 15, 2016 at 1:46 AM, Modassar Ather <
> > modather1...@gmail.com>
> > > > wrote:
> > > >
> > > > > Thanks Jack for your response.
> > > > > The following jira bug for this issue is already present so I have
> > not
> > > > > created a new one.
> > > > > https://issues.apache.org/jira/browse/SOLR-8812
> > > > >
> > > > > Kindly help me understand that whether it is possible to achieve
> > search
> > > > on
> > > > > ORed terms as it was done in earlier Solr version.
> > > > > Is this behavior intentional or is it a bug? I need to migrate to
> > > > > Solr-5.5.0 but not doing so due to this behavior.
> > > > >
> > > > > Thanks,
> > > > > Modassar
> > > > >
> > > > >
> > > > > On Fri, Mar 11, 2016 at 3:18 AM, Jack Krupansky <
> > > > jack.krupan...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > We probably need a Jira to investigate whether this really is an
> > > > > explicitly
> > > > > > intentional feature change, or whether it really is a bug. And if
> > it
> > > > > truly
> > > > > > was intentional, how people can work around the change to get the
> > > > > desired,
> > > > > > pre-5.5 behavior. Personally, I always thought it was a mistake
> > that
> > > > q.op
> > > > > > and mm were so tightly linked in Solr even though they are
> > > independent
> > > > in
> > > > > > Lucene.
> > > > > >
> > > > > > In short, I think people want to be able to set the default
> > behavior
> > > > for
> > > > > > individual terms (MUST vs. SHOULD) if explicit operators are not
> > > used,
> > > > > and
> > > > > > that OR is an explicit operator. And that mm should control only
> > how
> > > > many
> > > > > > SHOULD terms are required (Lucene MinShouldMatch.)
> > > > > >
> > > > > >
> > > > > > -- Jack Krupansky
> > > > > >
> > > > > > On Thu, Mar 10, 2016 at 3:41 AM, Modassar Ather <
> > > > modather1...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks Shawn for pointing to the jira issue. I was not sure
> that
> > if
> > > > it
> > > > > is
> > > > > > > an expected behavior or a bug or there could have been a way to
> > get
> > > > the
> > > > > > > desired result.
> > > > > > >
> > > > > > > Best,
> > > > > > > Modassar
> > > > > > >
> > > > > > > On Thu, Mar 10, 2016 at 11:32 AM, Shawn Heisey <
> > > apa...@elyograg.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > On 3/9/2016 10:55 PM, Shawn Heisey wrote:
> > > > > > > > > The ~2 syntax, when not attached to a phrase query (quotes)
> > is
> > > > the
> > > > > > way
> > > > > > > > > you express a fuzzy query. If it's attached to a query in
> > > quotes,
> > > > > > then
> > > > > > > > > it is a proximity query. I'm not sure whether it means
> > > something
> > > > > > > > > different when it's attached to a query clause in
> > parentheses,
> > > > > > someone
> > > > > > > > > 

Re: Query behavior.

2016-03-19 Thread Modassar Ather
What I understand by "+((fl:java fl:book))" is that any of the terms should be
present in the complete query. Please correct me if I am wrong.
What I want to achieve is (A OR B), where either term or both terms
will cause a match.
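
For reference, a toy model (not Solr code) of how the ~2 suffix in the parsed query changes matching. The function name and the sample documents are made up. With two optional clauses, a minimum-should-match of 2 demands both terms, which is the AND-like behavior reported in this thread, whereas a minimum of 1 gives the usual OR:

```python
def should_match(doc_terms, optional_terms, min_should_match=1):
    """Return True when at least min_should_match of the optional
    (SHOULD) terms occur in the document."""
    hits = sum(1 for term in optional_terms if term in doc_terms)
    return hits >= min_should_match

docs = {
    "d1": {"java"},
    "d2": {"book"},
    "d3": {"java", "book"},
}
terms = ["java", "book"]

# (fl:java fl:book): any one term is enough.
plain_or = sorted(d for d, t in docs.items() if should_match(t, terms))
# (fl:java fl:book)~2: both terms are required.
with_mm2 = sorted(d for d, t in docs.items() if should_match(t, terms, 2))

print(plain_or)
print(with_mm2)
```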

Thanks,
Modassar

On Thu, Mar 17, 2016 at 10:32 AM, Jack Krupansky 
wrote:

> That's what I thought you had meant before, but the Jira ticket indicates
> that you are looking for some extra level of AND/MUST outside of the OR,
> which is different from what you just indicated. In the ticket you say:
> "How
> can I achieve following? "+((fl:java fl:book))"", which has an extra AND
> outside of the inner sub-query, which is a little different than just
> "(fl:java
> fl:book)". Sure, the results should be the same, but why insist on the
> extra level of nested boolean query?
>
> -- Jack Krupansky
>
> On Thu, Mar 17, 2016 at 12:50 AM, Modassar Ather 
> wrote:
>
> > What I understand by q.op is the default operator. If there is no AND/OR
> > in-between the terms the default will be AND as per my setting of
> q.op=AND.
> > But what if the query has AND/OR explicitly put in-between the query
> terms?
> > I just think that if (A OR B) is the query then the result should be
> based
> > on any of the term's or both of the terms and not only both of the terms.
> > Please correct me if my understanding is wrong.
> >
> > Thanks,
> > Modassar
> >
> > On Wed, Mar 16, 2016 at 7:34 PM, Jack Krupansky <
> jack.krupan...@gmail.com>
> > wrote:
> >
> > > Now you've confused me... Did you actually intend that q.op=AND was
> going
> > > to perform some function in a query with only two terms and and OR
> > > operator? I mean, why not just drop the q.op=AND?
> > >
> > > -- Jack Krupansky
> > >
> > > On Wed, Mar 16, 2016 at 1:31 AM, Modassar Ather <
> modather1...@gmail.com>
> > > wrote:
> > >
> > > > Jack as suggested I have created following jira issue.
> > > >
> > > > https://issues.apache.org/jira/browse/SOLR-8853
> > > >
> > > > Thanks,
> > > > Modassar
> > > >
> > > >
> > > > On Tue, Mar 15, 2016 at 8:15 PM, Jack Krupansky <
> > > jack.krupan...@gmail.com>
> > > > wrote:
> > > >
> > > > > That was precisely the point of the need for a new Jira - to answer
> > > > exactly
> > > > > the questions that you have posed - and that I had proposed as
> well.
> > > > Until
> > > > > some of the senior committers comment on that Jira you won't have
> > > > answers.
> > > > > They've painted themselves into a corner and now I am curious how
> > they
> > > > will
> > > > > unpaint themselves out of that corner.
> > > > >
> > > > > -- Jack Krupansky
> > > > >
> > > > > On Tue, Mar 15, 2016 at 1:46 AM, Modassar Ather <
> > > modather1...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks Jack for your response.
> > > > > > The following jira bug for this issue is already present so I
> have
> > > not
> > > > > > created a new one.
> > > > > > https://issues.apache.org/jira/browse/SOLR-8812
> > > > > >
> > > > > > Kindly help me understand that whether it is possible to achieve
> > > search
> > > > > on
> > > > > > ORed terms as it was done in earlier Solr version.
> > > > > > Is this behavior intentional or is it a bug? I need to migrate to
> > > > > > Solr-5.5.0 but not doing so due to this behavior.
> > > > > >
> > > > > > Thanks,
> > > > > > Modassar
> > > > > >
> > > > > >
> > > > > > On Fri, Mar 11, 2016 at 3:18 AM, Jack Krupansky <
> > > > > jack.krupan...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > We probably need a Jira to investigate whether this really is
> an
> > > > > > explicitly
> > > > > > > intentional feature change, or whether it really is a bug. And
> if
> > > it
> > > > > > truly
> > > > > > > was intentional, how people can work around the change to get
> the
> > > > > > desired,
> > > > > > > pre-5.5 behavior. Personally, I always thought it was a mistake
> > > that
> > > > > q.op
> > > > > > > and mm were so tightly linked in Solr even though they are
> > > > independent
> > > > > in
> > > > > > > Lucene.
> > > > > > >
> > > > > > > In short, I think people want to be able to set the default
> > > behavior
> > > > > for
> > > > > > > individual terms (MUST vs. SHOULD) if explicit operators are
> not
> > > > used,
> > > > > > and
> > > > > > > that OR is an explicit operator. And that mm should control
> only
> > > how
> > > > > many
> > > > > > > SHOULD terms are required (Lucene MinShouldMatch.)
> > > > > > >
> > > > > > >
> > > > > > > -- Jack Krupansky
> > > > > > >
> > > > > > > On Thu, Mar 10, 2016 at 3:41 AM, Modassar Ather <
> > > > > modather1...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Thanks Shawn for pointing to the jira issue. I was not sure
> > that
> > > if
> > > > > it
> > > > > > is
> > > > > > > > an expected behavior or a bug or there could have been a way
> to
> > > get
> > > > > the
> > > > > > > > desired result.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Modassar

Re: Query behavior.

2016-03-19 Thread Jack Krupansky
I was just wanting to see the Jira clarified (without creating noise on the
Jira), but if others feel they understand the relevance of the outer AND/+
to the stated problem, fine. I don't think I have anything else to add to
the discussion at this stage. Now we sit and wait for some senior
committers to address the concern.

-- Jack Krupansky

On Fri, Mar 18, 2016 at 6:06 AM, Alessandro Benedetti  wrote:

> I think what he tried to explain was :
> " Input query : *fl:(java OR book)*
>  Instead of having the query parser parsing :
>  *+((fl:java fl:book)~2) *( which seems what is happening right now)
> He want the query parser to parse :
>
> +((fl:java fl:book)) ( without the mm expressed)
>
> More than the outer level of AND , I think the concern is in the absence of
> the ~2 operator ( mm=2 set automatically) .
>
> Anyway I can't reproduce the issue :(
>
> P.S. taking a brief look into the code
> : org/apache/solr/search/ExtendedDismaxQParser.java:341
> I suggest you to debug from that point as the comment says :
>
> // For correct lucene queries, turn off mm processing if there
> // were explicit operators (except for AND).
> if (query instanceof BooleanQuery) {
> query = SolrPluginUtils.setMinShouldMatch((BooleanQuery)query,
> config.minShouldMatch, config.mmAutoRelax);
> }
>
> I have no time now,
> Cheers
>
> On Fri, Mar 18, 2016 at 4:39 AM, Jack Krupansky 
> wrote:
>
> > You still haven't explained what exactly you are trying to accomplish
> with
> > that outer level AND/+/MUST. Please be specific - why you insist on
> > "+((fl:java
> > fl:book))" rather than  "fl:java fl:book".
> >
> > -- Jack Krupansky
> >
> > On Fri, Mar 18, 2016 at 12:12 AM, Modassar Ather  >
> > wrote:
> >
> > > What I understand by "+((fl:java fl:book))" is any of the terms should
> be
> > > present in the complete query. Please correct me if I am wrong.
> > > What I want to achieve is (A OR B) where any of the term or both of the
> > > term will cause a match.
> > >
> > > Thanks,
> > > Modassar
> > >
> > > On Thu, Mar 17, 2016 at 10:32 AM, Jack Krupansky <
> > jack.krupan...@gmail.com
> > > >
> > > wrote:
> > >
> > > > That's what I thought you had meant before, but the Jira ticket
> > indicates
> > > > that you are looking for some extra level of AND/MUST outside of the
> > OR,
> > > > which is different from what you just indicated. In the ticket you
> say:
> > > > "How
> > > > can I achieve following? "+((fl:java fl:book))"", which has an extra
> > AND
> > > > outside of the inner sub-query, which is a little different than just
> > > > "(fl:java
> > > > fl:book)". Sure, the results should be the same, but why insist on
> the
> > > > extra level of nested boolean query?
> > > >
> > > > -- Jack Krupansky
> > > >
> > > > On Thu, Mar 17, 2016 at 12:50 AM, Modassar Ather <
> > modather1...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > What I understand by q.op is the default operator. If there is no
> > > AND/OR
> > > > > in-between the terms the default will be AND as per my setting of
> > > > q.op=AND.
> > > > > But what if the query has AND/OR explicitly put in-between the
> query
> > > > terms?
> > > > > I just think that if (A OR B) is the query then the result should
> be
> > > > based
> > > > > on any of the term's or both of the terms and not only both of the
> > > terms.
> > > > > Please correct me if my understanding is wrong.
> > > > >
> > > > > Thanks,
> > > > > Modassar
> > > > >
> > > > > On Wed, Mar 16, 2016 at 7:34 PM, Jack Krupansky <
> > > > jack.krupan...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Now you've confused me... Did you actually intend that q.op=AND
> was
> > > > going
> > > > > > to perform some function in a query with only two terms and and
> OR
> > > > > > operator? I mean, why not just drop the q.op=AND?
> > > > > >
> > > > > > -- Jack Krupansky
> > > > > >
> > > > > > On Wed, Mar 16, 2016 at 1:31 AM, Modassar Ather <
> > > > modather1...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Jack as suggested I have created following jira issue.
> > > > > > >
> > > > > > > https://issues.apache.org/jira/browse/SOLR-8853
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Modassar
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Mar 15, 2016 at 8:15 PM, Jack Krupansky <
> > > > > > jack.krupan...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > That was precisely the point of the need for a new Jira - to
> > > answer
> > > > > > > exactly
> > > > > > > > the questions that you have posed - and that I had proposed
> as
> > > > well.
> > > > > > > Until
> > > > > > > > some of the senior committers comment on that Jira you won't
> > have
> > > > > > > answers.
> > > > > > > > They've painted themselves into a corner and now I am curious
> > how
> > > > > they
> > > > > > > will
> > > > > > > > unpaint themselves out of that corner.
> > > > > > > >
> > > > > > > > -- Jack Krupansky
> > > > > > > >

Re: Query behavior.

2016-03-19 Thread Jack Krupansky
Now you've confused me... Did you actually intend that q.op=AND was going
to perform some function in a query with only two terms and an OR
operator? I mean, why not just drop the q.op=AND?

-- Jack Krupansky

On Wed, Mar 16, 2016 at 1:31 AM, Modassar Ather 
wrote:

> Jack as suggested I have created following jira issue.
>
> https://issues.apache.org/jira/browse/SOLR-8853
>
> Thanks,
> Modassar
>
>
> On Tue, Mar 15, 2016 at 8:15 PM, Jack Krupansky 
> wrote:
>
> > That was precisely the point of the need for a new Jira - to answer
> exactly
> > the questions that you have posed - and that I had proposed as well.
> Until
> > some of the senior committers comment on that Jira you won't have
> answers.
> > They've painted themselves into a corner and now I am curious how they
> will
> > unpaint themselves out of that corner.
> >
> > -- Jack Krupansky
> >
> > On Tue, Mar 15, 2016 at 1:46 AM, Modassar Ather 
> > wrote:
> >
> > > Thanks Jack for your response.
> > > The following jira bug for this issue is already present so I have not
> > > created a new one.
> > > https://issues.apache.org/jira/browse/SOLR-8812
> > >
> > > Kindly help me understand that whether it is possible to achieve search
> > on
> > > ORed terms as it was done in earlier Solr version.
> > > Is this behavior intentional or is it a bug? I need to migrate to
> > > Solr-5.5.0 but not doing so due to this behavior.
> > >
> > > Thanks,
> > > Modassar
> > >
> > >
> > > On Fri, Mar 11, 2016 at 3:18 AM, Jack Krupansky <
> > jack.krupan...@gmail.com>
> > > wrote:
> > >
> > > > We probably need a Jira to investigate whether this really is an
> > > explicitly
> > > > intentional feature change, or whether it really is a bug. And if it
> > > truly
> > > > was intentional, how people can work around the change to get the
> > > desired,
> > > > pre-5.5 behavior. Personally, I always thought it was a mistake that
> > q.op
> > > > and mm were so tightly linked in Solr even though they are
> independent
> > in
> > > > Lucene.
> > > >
> > > > In short, I think people want to be able to set the default behavior
> > for
> > > > individual terms (MUST vs. SHOULD) if explicit operators are not
> used,
> > > and
> > > > that OR is an explicit operator. And that mm should control only how
> > many
> > > > SHOULD terms are required (Lucene MinShouldMatch.)
> > > >
> > > >
> > > > -- Jack Krupansky
> > > >
> > > > On Thu, Mar 10, 2016 at 3:41 AM, Modassar Ather <
> > modather1...@gmail.com>
> > > > wrote:
> > > >
> > > > > Thanks Shawn for pointing to the jira issue. I was not sure that if
> > it
> > > is
> > > > > an expected behavior or a bug or there could have been a way to get
> > the
> > > > > desired result.
> > > > >
> > > > > Best,
> > > > > Modassar
> > > > >
> > > > > On Thu, Mar 10, 2016 at 11:32 AM, Shawn Heisey <
> apa...@elyograg.org>
> > > > > wrote:
> > > > >
> > > > > > On 3/9/2016 10:55 PM, Shawn Heisey wrote:
> > > > > > > The ~2 syntax, when not attached to a phrase query (quotes) is
> > the
> > > > way
> > > > > > > you express a fuzzy query. If it's attached to a query in
> quotes,
> > > > then
> > > > > > > it is a proximity query. I'm not sure whether it means
> something
> > > > > > > different when it's attached to a query clause in parentheses,
> > > > someone
> > > > > > > with more knowledge will need to comment.
> > > > > > 
> > > > > > > https://issues.apache.org/jira/browse/SOLR-8812
> > > > > >
> > > > > > After I read SOLR-8812 more closely, it seems that the ~2 syntax
> > with
> > > > > > parentheses is the way that the effective mm value is expressed
> > for a
> > > > > > particular query clause in the parsed query.  I've learned
> > something
> > > > new
> > > > > > today.
> > > > > >
> > > > > > Thanks,
> > > > > > Shawn
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Query behavior.

2016-03-19 Thread Jack Krupansky
You still haven't explained what exactly you are trying to accomplish with
that outer level AND/+/MUST. Please be specific - why do you insist on
"+((fl:java fl:book))" rather than "fl:java fl:book"?

-- Jack Krupansky

On Fri, Mar 18, 2016 at 12:12 AM, Modassar Ather 
wrote:

> What I understand by "+((fl:java fl:book))" is any of the terms should be
> present in the complete query. Please correct me if I am wrong.
> What I want to achieve is (A OR B) where any of the term or both of the
> term will cause a match.
>
> Thanks,
> Modassar
>
> On Thu, Mar 17, 2016 at 10:32 AM, Jack Krupansky  >
> wrote:
>
> > That's what I thought you had meant before, but the Jira ticket indicates
> > that you are looking for some extra level of AND/MUST outside of the OR,
> > which is different from what you just indicated. In the ticket you say:
> > "How
> > can I achieve following? "+((fl:java fl:book))"", which has an extra AND
> > outside of the inner sub-query, which is a little different than just
> > "(fl:java
> > fl:book)". Sure, the results should be the same, but why insist on the
> > extra level of nested boolean query?
> >
> > -- Jack Krupansky
> >
> > On Thu, Mar 17, 2016 at 12:50 AM, Modassar Ather  >
> > wrote:
> >
> > > What I understand by q.op is the default operator. If there is no
> AND/OR
> > > in-between the terms the default will be AND as per my setting of
> > q.op=AND.
> > > But what if the query has AND/OR explicitly put in-between the query
> > terms?
> > > I just think that if (A OR B) is the query then the result should be
> > based
> > > on any of the term's or both of the terms and not only both of the
> terms.
> > > Please correct me if my understanding is wrong.
> > >
> > > Thanks,
> > > Modassar
> > >
> > > On Wed, Mar 16, 2016 at 7:34 PM, Jack Krupansky <
> > jack.krupan...@gmail.com>
> > > wrote:
> > >
> > > > Now you've confused me... Did you actually intend that q.op=AND was
> > going
> > > > to perform some function in a query with only two terms and and OR
> > > > operator? I mean, why not just drop the q.op=AND?
> > > >
> > > > -- Jack Krupansky
> > > >
> > > > On Wed, Mar 16, 2016 at 1:31 AM, Modassar Ather <
> > modather1...@gmail.com>
> > > > wrote:
> > > >
> > > > > Jack as suggested I have created following jira issue.
> > > > >
> > > > > https://issues.apache.org/jira/browse/SOLR-8853
> > > > >
> > > > > Thanks,
> > > > > Modassar
> > > > >
> > > > >
> > > > > On Tue, Mar 15, 2016 at 8:15 PM, Jack Krupansky <
> > > > jack.krupan...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > That was precisely the point of the need for a new Jira - to
> answer
> > > > > exactly
> > > > > > the questions that you have posed - and that I had proposed as
> > well.
> > > > > Until
> > > > > > some of the senior committers comment on that Jira you won't have
> > > > > answers.
> > > > > > They've painted themselves into a corner and now I am curious how
> > > they
> > > > > will
> > > > > > unpaint themselves out of that corner.
> > > > > >
> > > > > > -- Jack Krupansky
> > > > > >
> > > > > > On Tue, Mar 15, 2016 at 1:46 AM, Modassar Ather <
> > > > modather1...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks Jack for your response.
> > > > > > > The following jira bug for this issue is already present so I
> > have
> > > > not
> > > > > > > created a new one.
> > > > > > > https://issues.apache.org/jira/browse/SOLR-8812
> > > > > > >
> > > > > > > Kindly help me understand that whether it is possible to
> achieve
> > > > search
> > > > > > on
> > > > > > > ORed terms as it was done in earlier Solr version.
> > > > > > > Is this behavior intentional or is it a bug? I need to migrate
> to
> > > > > > > Solr-5.5.0 but not doing so due to this behavior.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Modassar
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Mar 11, 2016 at 3:18 AM, Jack Krupansky <
> > > > > > jack.krupan...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > We probably need a Jira to investigate whether this really is
> > an
> > > > > > > explicitly
> > > > > > > > intentional feature change, or whether it really is a bug.
> And
> > if
> > > > it
> > > > > > > truly
> > > > > > > > was intentional, how people can work around the change to get
> > the
> > > > > > > desired,
> > > > > > > > pre-5.5 behavior. Personally, I always thought it was a
> mistake
> > > > that
> > > > > > q.op
> > > > > > > > and mm were so tightly linked in Solr even though they are
> > > > > independent
> > > > > > in
> > > > > > > > Lucene.
> > > > > > > >
> > > > > > > > In short, I think people want to be able to set the default
> > > > behavior
> > > > > > for
> > > > > > > > individual terms (MUST vs. SHOULD) if explicit operators are
> > not
> > > > > used,
> > > > > > > and
> > > > > > > > that OR is an explicit operator. And that mm should control
> > only
> > > > how
> > > > > > many
> > > > 

Re: Query behavior.

2016-03-19 Thread Alessandro Benedetti
I think what he tried to explain was:
Input query: fl:(java OR book)
Instead of the query parser producing:
+((fl:java fl:book)~2) (which seems to be what is happening right now)
he wants the query parser to produce:

+((fl:java fl:book)) (without the mm expressed)

More than the outer level of AND, I think the concern is getting rid of
the ~2 operator (mm=2 set automatically).

Anyway, I can't reproduce the issue :(

P.S. Taking a brief look at the code
(org/apache/solr/search/ExtendedDismaxQParser.java:341),
I suggest you debug from that point, as the comment says:

    // For correct lucene queries, turn off mm processing if there
    // were explicit operators (except for AND).
    if (query instanceof BooleanQuery) {
        query = SolrPluginUtils.setMinShouldMatch((BooleanQuery) query,
            config.minShouldMatch, config.mmAutoRelax);
    }

I have no time now,
Cheers

On Fri, Mar 18, 2016 at 4:39 AM, Jack Krupansky 
wrote:

> You still haven't explained what exactly you are trying to accomplish with
> that outer level AND/+/MUST. Please be specific - why you insist on
> "+((fl:java
> fl:book))" rather than  "fl:java fl:book".
>
> -- Jack Krupansky
>
> On Fri, Mar 18, 2016 at 12:12 AM, Modassar Ather 
> wrote:
>
> > What I understand by "+((fl:java fl:book))" is any of the terms should be
> > present in the complete query. Please correct me if I am wrong.
> > What I want to achieve is (A OR B) where any of the term or both of the
> > term will cause a match.
> >
> > Thanks,
> > Modassar
> >
> > On Thu, Mar 17, 2016 at 10:32 AM, Jack Krupansky <
> jack.krupan...@gmail.com
> > >
> > wrote:
> >
> > > That's what I thought you had meant before, but the Jira ticket
> indicates
> > > that you are looking for some extra level of AND/MUST outside of the
> OR,
> > > which is different from what you just indicated. In the ticket you say:
> > > "How
> > > can I achieve following? "+((fl:java fl:book))"", which has an extra
> AND
> > > outside of the inner sub-query, which is a little different than just
> > > "(fl:java
> > > fl:book)". Sure, the results should be the same, but why insist on the
> > > extra level of nested boolean query?
> > >
> > > -- Jack Krupansky
> > >
> > > On Thu, Mar 17, 2016 at 12:50 AM, Modassar Ather <
> modather1...@gmail.com
> > >
> > > wrote:
> > >
> > > > What I understand by q.op is the default operator. If there is no
> > AND/OR
> > > > in-between the terms the default will be AND as per my setting of
> > > q.op=AND.
> > > > But what if the query has AND/OR explicitly put in-between the query
> > > terms?
> > > > I just think that if (A OR B) is the query then the result should be
> > > based
> > > > on any of the term's or both of the terms and not only both of the
> > terms.
> > > > Please correct me if my understanding is wrong.
> > > >
> > > > Thanks,
> > > > Modassar
> > > >
> > > > On Wed, Mar 16, 2016 at 7:34 PM, Jack Krupansky <
> > > jack.krupan...@gmail.com>
> > > > wrote:
> > > >
> > > > > Now you've confused me... Did you actually intend that q.op=AND was
> > > going
> > > > > to perform some function in a query with only two terms and and OR
> > > > > operator? I mean, why not just drop the q.op=AND?
> > > > >
> > > > > -- Jack Krupansky
> > > > >
> > > > > On Wed, Mar 16, 2016 at 1:31 AM, Modassar Ather <
> > > modather1...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Jack as suggested I have created following jira issue.
> > > > > >
> > > > > > https://issues.apache.org/jira/browse/SOLR-8853
> > > > > >
> > > > > > Thanks,
> > > > > > Modassar
> > > > > >
> > > > > >
> > > > > > On Tue, Mar 15, 2016 at 8:15 PM, Jack Krupansky <
> > > > > jack.krupan...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > That was precisely the point of the need for a new Jira - to
> > answer
> > > > > > exactly
> > > > > > > the questions that you have posed - and that I had proposed as
> > > well.
> > > > > > Until
> > > > > > > some of the senior committers comment on that Jira you won't
> have
> > > > > > answers.
> > > > > > > They've painted themselves into a corner and now I am curious
> how
> > > > they
> > > > > > will
> > > > > > > unpaint themselves out of that corner.
> > > > > > >
> > > > > > > -- Jack Krupansky
> > > > > > >
> > > > > > > On Tue, Mar 15, 2016 at 1:46 AM, Modassar Ather <
> > > > > modather1...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Thanks Jack for your response.
> > > > > > > > The following jira bug for this issue is already present so I
> > > have
> > > > > not
> > > > > > > > created a new one.
> > > > > > > > https://issues.apache.org/jira/browse/SOLR-8812
> > > > > > > >
> > > > > > > > Kindly help me understand that whether it is possible to
> > achieve
> > > > > search
> > > > > > > on
> > > > > > > > ORed terms as it was done in earlier Solr version.
> > > > > > > > Is this behavior intentional or is it a bug? I need to
> migrate
> > to
> > > > > > > > Solr-5.5.0 but not 

Re: Query behavior.

2016-03-19 Thread Modassar Ather
My understanding is that q.op sets the default operator: if there is no
AND/OR between the terms, the default will be AND, as per my setting of
q.op=AND.
But what if the query has AND/OR explicitly placed between the query terms?
I just think that if (A OR B) is the query, then a document should match if
it contains either term or both, not only when it contains both.
Please correct me if my understanding is wrong.

Thanks,
Modassar

On Wed, Mar 16, 2016 at 7:34 PM, Jack Krupansky 
wrote:

> Now you've confused me... Did you actually intend that q.op=AND was going
> to perform some function in a query with only two terms and and OR
> operator? I mean, why not just drop the q.op=AND?
>
> -- Jack Krupansky
>
> On Wed, Mar 16, 2016 at 1:31 AM, Modassar Ather 
> wrote:
>
> > Jack as suggested I have created following jira issue.
> >
> > https://issues.apache.org/jira/browse/SOLR-8853
> >
> > Thanks,
> > Modassar
> >
> >
> > On Tue, Mar 15, 2016 at 8:15 PM, Jack Krupansky <
> jack.krupan...@gmail.com>
> > wrote:
> >
> > > That was precisely the point of the need for a new Jira - to answer
> > exactly
> > > the questions that you have posed - and that I had proposed as well.
> > Until
> > > some of the senior committers comment on that Jira you won't have
> > answers.
> > > They've painted themselves into a corner and now I am curious how they
> > will
> > > unpaint themselves out of that corner.
> > >
> > > -- Jack Krupansky
> > >
> > > On Tue, Mar 15, 2016 at 1:46 AM, Modassar Ather <
> modather1...@gmail.com>
> > > wrote:
> > >
> > > > Thanks Jack for your response.
> > > > The following jira bug for this issue is already present so I have
> not
> > > > created a new one.
> > > > https://issues.apache.org/jira/browse/SOLR-8812
> > > >
> > > > Kindly help me understand that whether it is possible to achieve
> search
> > > on
> > > > ORed terms as it was done in earlier Solr version.
> > > > Is this behavior intentional or is it a bug? I need to migrate to
> > > > Solr-5.5.0 but not doing so due to this behavior.
> > > >
> > > > Thanks,
> > > > Modassar
> > > >
> > > >
> > > > On Fri, Mar 11, 2016 at 3:18 AM, Jack Krupansky <
> > > jack.krupan...@gmail.com>
> > > > wrote:
> > > >
> > > > > We probably need a Jira to investigate whether this really is an
> > > > explicitly
> > > > > intentional feature change, or whether it really is a bug. And if
> it
> > > > truly
> > > > > was intentional, how people can work around the change to get the
> > > > desired,
> > > > > pre-5.5 behavior. Personally, I always thought it was a mistake
> that
> > > q.op
> > > > > and mm were so tightly linked in Solr even though they are
> > independent
> > > in
> > > > > Lucene.
> > > > >
> > > > > In short, I think people want to be able to set the default
> behavior
> > > for
> > > > > individual terms (MUST vs. SHOULD) if explicit operators are not
> > used,
> > > > and
> > > > > that OR is an explicit operator. And that mm should control only
> how
> > > many
> > > > > SHOULD terms are required (Lucene MinShouldMatch.)
> > > > >
> > > > >
> > > > > -- Jack Krupansky
> > > > >
> > > > > On Thu, Mar 10, 2016 at 3:41 AM, Modassar Ather <
> > > modather1...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks Shawn for pointing to the jira issue. I was not sure that
> if
> > > it
> > > > is
> > > > > > an expected behavior or a bug or there could have been a way to
> get
> > > the
> > > > > > desired result.
> > > > > >
> > > > > > Best,
> > > > > > Modassar
> > > > > >
> > > > > > On Thu, Mar 10, 2016 at 11:32 AM, Shawn Heisey <
> > apa...@elyograg.org>
> > > > > > wrote:
> > > > > >
> > > > > > > On 3/9/2016 10:55 PM, Shawn Heisey wrote:
> > > > > > > > The ~2 syntax, when not attached to a phrase query (quotes)
> is
> > > the
> > > > > way
> > > > > > > > you express a fuzzy query. If it's attached to a query in
> > quotes,
> > > > > then
> > > > > > > > it is a proximity query. I'm not sure whether it means
> > something
> > > > > > > > different when it's attached to a query clause in
> parentheses,
> > > > > someone
> > > > > > > > with more knowledge will need to comment.
> > > > > > > 
> > > > > > > > https://issues.apache.org/jira/browse/SOLR-8812
> > > > > > >
> > > > > > > After I read SOLR-8812 more closely, it seems that the ~2
> syntax
> > > with
> > > > > > > parentheses is the way that the effective mm value is expressed
> > > for a
> > > > > > > particular query clause in the parsed query.  I've learned
> > > something
> > > > > new
> > > > > > > today.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Shawn
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Query behavior.

2016-03-15 Thread Modassar Ather
Jack, as suggested, I have created the following Jira issue.

https://issues.apache.org/jira/browse/SOLR-8853

Thanks,
Modassar


On Tue, Mar 15, 2016 at 8:15 PM, Jack Krupansky 
wrote:

> That was precisely the point of the need for a new Jira - to answer exactly
> the questions that you have posed - and that I had proposed as well. Until
> some of the senior committers comment on that Jira you won't have answers.
> They've painted themselves into a corner and now I am curious how they will
> unpaint themselves out of that corner.
>
> -- Jack Krupansky
>
> On Tue, Mar 15, 2016 at 1:46 AM, Modassar Ather 
> wrote:
>
> > Thanks Jack for your response.
> > The following jira bug for this issue is already present so I have not
> > created a new one.
> > https://issues.apache.org/jira/browse/SOLR-8812
> >
> > Kindly help me understand that whether it is possible to achieve search
> on
> > ORed terms as it was done in earlier Solr version.
> > Is this behavior intentional or is it a bug? I need to migrate to
> > Solr-5.5.0 but not doing so due to this behavior.
> >
> > Thanks,
> > Modassar
> >
> >
> > On Fri, Mar 11, 2016 at 3:18 AM, Jack Krupansky <
> jack.krupan...@gmail.com>
> > wrote:
> >
> > > We probably need a Jira to investigate whether this really is an
> > explicitly
> > > intentional feature change, or whether it really is a bug. And if it
> > truly
> > > was intentional, how people can work around the change to get the
> > desired,
> > > pre-5.5 behavior. Personally, I always thought it was a mistake that
> q.op
> > > and mm were so tightly linked in Solr even though they are independent
> in
> > > Lucene.
> > >
> > > In short, I think people want to be able to set the default behavior
> for
> > > individual terms (MUST vs. SHOULD) if explicit operators are not used,
> > and
> > > that OR is an explicit operator. And that mm should control only how
> many
> > > SHOULD terms are required (Lucene MinShouldMatch.)
> > >
> > >
> > > -- Jack Krupansky
> > >
> > > On Thu, Mar 10, 2016 at 3:41 AM, Modassar Ather <
> modather1...@gmail.com>
> > > wrote:
> > >
> > > > Thanks Shawn for pointing to the jira issue. I was not sure that if
> it
> > is
> > > > an expected behavior or a bug or there could have been a way to get
> the
> > > > desired result.
> > > >
> > > > Best,
> > > > Modassar
> > > >
> > > > On Thu, Mar 10, 2016 at 11:32 AM, Shawn Heisey 
> > > > wrote:
> > > >
> > > > > On 3/9/2016 10:55 PM, Shawn Heisey wrote:
> > > > > > The ~2 syntax, when not attached to a phrase query (quotes) is
> the
> > > way
> > > > > > you express a fuzzy query. If it's attached to a query in quotes,
> > > then
> > > > > > it is a proximity query. I'm not sure whether it means something
> > > > > > different when it's attached to a query clause in parentheses,
> > > someone
> > > > > > with more knowledge will need to comment.
> > > > > 
> > > > > > https://issues.apache.org/jira/browse/SOLR-8812
> > > > >
> > > > > After I read SOLR-8812 more closely, it seems that the ~2 syntax
> with
> > > > > parentheses is the way that the effective mm value is expressed
> for a
> > > > > particular query clause in the parsed query.  I've learned
> something
> > > new
> > > > > today.
> > > > >
> > > > > Thanks,
> > > > > Shawn
> > > > >
> > > > >
> > > >
> > >
> >
>


Re: Query behavior.

2016-03-15 Thread Jack Krupansky
That was precisely the point of the need for a new Jira - to answer exactly
the questions that you have posed - and that I had proposed as well. Until
some of the senior committers comment on that Jira you won't have answers.
They've painted themselves into a corner and now I am curious how they will
unpaint themselves out of that corner.

-- Jack Krupansky

On Tue, Mar 15, 2016 at 1:46 AM, Modassar Ather 
wrote:

> Thanks Jack for your response.
> The following jira bug for this issue is already present so I have not
> created a new one.
> https://issues.apache.org/jira/browse/SOLR-8812
>
> Kindly help me understand that whether it is possible to achieve search on
> ORed terms as it was done in earlier Solr version.
> Is this behavior intentional or is it a bug? I need to migrate to
> Solr-5.5.0 but not doing so due to this behavior.
>
> Thanks,
> Modassar
>
>
> On Fri, Mar 11, 2016 at 3:18 AM, Jack Krupansky 
> wrote:
>
> > We probably need a Jira to investigate whether this really is an
> explicitly
> > intentional feature change, or whether it really is a bug. And if it
> truly
> > was intentional, how people can work around the change to get the
> desired,
> > pre-5.5 behavior. Personally, I always thought it was a mistake that q.op
> > and mm were so tightly linked in Solr even though they are independent in
> > Lucene.
> >
> > In short, I think people want to be able to set the default behavior for
> > individual terms (MUST vs. SHOULD) if explicit operators are not used,
> and
> > that OR is an explicit operator. And that mm should control only how many
> > SHOULD terms are required (Lucene MinShouldMatch.)
> >
> >
> > -- Jack Krupansky
> >
> > On Thu, Mar 10, 2016 at 3:41 AM, Modassar Ather 
> > wrote:
> >
> > > Thanks Shawn for pointing to the jira issue. I was not sure that if it
> is
> > > an expected behavior or a bug or there could have been a way to get the
> > > desired result.
> > >
> > > Best,
> > > Modassar
> > >
> > > On Thu, Mar 10, 2016 at 11:32 AM, Shawn Heisey 
> > > wrote:
> > >
> > > > On 3/9/2016 10:55 PM, Shawn Heisey wrote:
> > > > > The ~2 syntax, when not attached to a phrase query (quotes) is the
> > way
> > > > > you express a fuzzy query. If it's attached to a query in quotes,
> > then
> > > > > it is a proximity query. I'm not sure whether it means something
> > > > > different when it's attached to a query clause in parentheses,
> > someone
> > > > > with more knowledge will need to comment.
> > > > 
> > > > > https://issues.apache.org/jira/browse/SOLR-8812
> > > >
> > > > After I read SOLR-8812 more closely, it seems that the ~2 syntax with
> > > > parentheses is the way that the effective mm value is expressed for a
> > > > particular query clause in the parsed query.  I've learned something
> > new
> > > > today.
> > > >
> > > > Thanks,
> > > > Shawn
> > > >
> > > >
> > >
> >
>


Re: Query behavior.

2016-03-14 Thread Modassar Ather
Thanks, Jack, for your response.
The following Jira bug already exists for this issue, so I have not
created a new one.
https://issues.apache.org/jira/browse/SOLR-8812

Kindly help me understand whether it is possible to search on ORed terms
the way it worked in earlier Solr versions.
Is this behavior intentional or is it a bug? I need to migrate to
Solr-5.5.0 but am holding off because of this behavior.

Thanks,
Modassar


On Fri, Mar 11, 2016 at 3:18 AM, Jack Krupansky 
wrote:

> We probably need a Jira to investigate whether this really is an explicitly
> intentional feature change, or whether it really is a bug. And if it truly
> was intentional, how people can work around the change to get the desired,
> pre-5.5 behavior. Personally, I always thought it was a mistake that q.op
> and mm were so tightly linked in Solr even though they are independent in
> Lucene.
>
> In short, I think people want to be able to set the default behavior for
> individual terms (MUST vs. SHOULD) if explicit operators are not used, and
> that OR is an explicit operator. And that mm should control only how many
> SHOULD terms are required (Lucene MinShouldMatch.)
>
>
> -- Jack Krupansky
>
> On Thu, Mar 10, 2016 at 3:41 AM, Modassar Ather 
> wrote:
>
> > Thanks Shawn for pointing to the jira issue. I was not sure that if it is
> > an expected behavior or a bug or there could have been a way to get the
> > desired result.
> >
> > Best,
> > Modassar
> >
> > On Thu, Mar 10, 2016 at 11:32 AM, Shawn Heisey 
> > wrote:
> >
> > > On 3/9/2016 10:55 PM, Shawn Heisey wrote:
> > > > The ~2 syntax, when not attached to a phrase query (quotes) is the
> way
> > > > you express a fuzzy query. If it's attached to a query in quotes,
> then
> > > > it is a proximity query. I'm not sure whether it means something
> > > > different when it's attached to a query clause in parentheses,
> someone
> > > > with more knowledge will need to comment.
> > > 
> > > > https://issues.apache.org/jira/browse/SOLR-8812
> > >
> > > After I read SOLR-8812 more closely, it seems that the ~2 syntax with
> > > parentheses is the way that the effective mm value is expressed for a
> > > particular query clause in the parsed query.  I've learned something
> new
> > > today.
> > >
> > > Thanks,
> > > Shawn
> > >
> > >
> >
>


Re: Query behavior.

2016-03-10 Thread Jack Krupansky
We probably need a Jira to investigate whether this really is an explicitly
intentional feature change, or whether it really is a bug. And if it truly
was intentional, how people can work around the change to get the desired,
pre-5.5 behavior. Personally, I always thought it was a mistake that q.op
and mm were so tightly linked in Solr even though they are independent in
Lucene.

In short, I think people want to be able to set the default behavior for
individual terms (MUST vs. SHOULD) if explicit operators are not used, and
that OR is an explicit operator. And that mm should control only how many
SHOULD terms are required (Lucene MinShouldMatch.)
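
To make the MUST/SHOULD/mm distinction concrete, here is a minimal sketch in
Python. It is a toy model of boolean clause matching, not Solr or Lucene code:
the names matches, hits, docs and or_query are hypothetical, and documents are
reduced to plain sets of terms.

```python
MUST, SHOULD = "MUST", "SHOULD"

def matches(doc_terms, clauses, min_should_match=0):
    """Toy model: every MUST term is required; at least
    `min_should_match` SHOULD terms must also be present."""
    must = [term for occur, term in clauses if occur == MUST]
    should = [term for occur, term in clauses if occur == SHOULD]
    if not all(term in doc_terms for term in must):
        return False
    if not should:
        return bool(must)
    should_hits = sum(term in doc_terms for term in should)
    # A query made only of SHOULD clauses still needs one of them to match.
    required = max(min_should_match, 0 if must else 1)
    return should_hits >= required

# Hypothetical four-document corpus.
docs = {
    "d1": {"java", "book"},
    "d2": {"java"},
    "d3": {"book"},
    "d4": {"python"},
}

or_query = [(SHOULD, "java"), (SHOULD, "book")]
and_query = [(MUST, "java"), (MUST, "book")]

def hits(clauses, mm=0):
    return sorted(d for d, terms in docs.items() if matches(terms, clauses, mm))

print(hits(or_query))        # OR with no mm
print(hits(or_query, mm=2))  # OR with mm=2 behaves like AND
print(hits(and_query))       # explicit AND
```

In this model, forcing mm=2 onto a two-term OR query makes it return the same
documents as the AND query, which is the symptom reported in this thread.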


-- Jack Krupansky

On Thu, Mar 10, 2016 at 3:41 AM, Modassar Ather 
wrote:

> Thanks Shawn for pointing to the jira issue. I was not sure that if it is
> an expected behavior or a bug or there could have been a way to get the
> desired result.
>
> Best,
> Modassar
>
> On Thu, Mar 10, 2016 at 11:32 AM, Shawn Heisey 
> wrote:
>
> > On 3/9/2016 10:55 PM, Shawn Heisey wrote:
> > > The ~2 syntax, when not attached to a phrase query (quotes) is the way
> > > you express a fuzzy query. If it's attached to a query in quotes, then
> > > it is a proximity query. I'm not sure whether it means something
> > > different when it's attached to a query clause in parentheses, someone
> > > with more knowledge will need to comment.
> > 
> > > https://issues.apache.org/jira/browse/SOLR-8812
> >
> > After I read SOLR-8812 more closely, it seems that the ~2 syntax with
> > parentheses is the way that the effective mm value is expressed for a
> > particular query clause in the parsed query.  I've learned something new
> > today.
> >
> > Thanks,
> > Shawn
> >
> >
>


Re: Query behavior.

2016-03-10 Thread Modassar Ather
Thanks, Shawn, for pointing to the Jira issue. I was not sure whether this
is expected behavior or a bug, or whether there is a way to get the
desired result.

Best,
Modassar

On Thu, Mar 10, 2016 at 11:32 AM, Shawn Heisey  wrote:

> On 3/9/2016 10:55 PM, Shawn Heisey wrote:
> > The ~2 syntax, when not attached to a phrase query (quotes) is the way
> > you express a fuzzy query. If it's attached to a query in quotes, then
> > it is a proximity query. I'm not sure whether it means something
> > different when it's attached to a query clause in parentheses, someone
> > with more knowledge will need to comment.
> 
> > https://issues.apache.org/jira/browse/SOLR-8812
>
> After I read SOLR-8812 more closely, it seems that the ~2 syntax with
> parentheses is the way that the effective mm value is expressed for a
> particular query clause in the parsed query.  I've learned something new
> today.
>
> Thanks,
> Shawn
>
>


Re: Query behavior.

2016-03-09 Thread Shawn Heisey
On 3/9/2016 10:55 PM, Shawn Heisey wrote:
> The ~2 syntax, when not attached to a phrase query (quotes) is the way
> you express a fuzzy query. If it's attached to a query in quotes, then
> it is a proximity query. I'm not sure whether it means something
> different when it's attached to a query clause in parentheses, someone
> with more knowledge will need to comment.

> https://issues.apache.org/jira/browse/SOLR-8812

After I read SOLR-8812 more closely, it seems that the ~2 syntax with
parentheses is the way that the effective mm value is expressed for a
particular query clause in the parsed query.  I've learned something new
today.
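
The three readings of ~N described in this message can be summarised in a
small Python sketch. It only classifies a clause by what precedes the tilde;
the function name classify_tilde is hypothetical, and the parenthesised form
is assumed to come from parsed-query debug output rather than from user input.

```python
import re

def classify_tilde(clause: str) -> str:
    """Guess what a trailing ~N means from the text before it."""
    clause = clause.strip()
    m = re.search(r"~(\d+(?:\.\d+)?)?$", clause)
    if not m:
        return "none"
    body = clause[:m.start()]
    if body.endswith('"'):
        return "proximity"         # "java book"~2
    if body.endswith(")"):
        return "min-should-match"  # (fl:java fl:book)~2 in a parsed query
    return "fuzzy"                 # roam~2

for example in ['roam~2', '"java book"~2', '(fl:java fl:book)~2', 'fl:java']:
    print(example, "->", classify_tilde(example))
```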

Thanks,
Shawn



Re: Query behavior.

2016-03-09 Thread Shawn Heisey
On 3/9/2016 12:07 AM, Modassar Ather wrote:
> Kindly help me understand the parsing of following query. I am using
> edismax parser and Solr-5.5.0.
> q.op is set to AND and there is no explicit mm value set.
>
> fl:(java OR book) => "boost(+((fl:java fl:book)~2),int(val))"
>
> When the query has explicit OR then why the ~2 is present in the parsed
> query?

The ~2 syntax, when not attached to a phrase query (quotes) is the way
you express a fuzzy query.  If it's attached to a query in quotes, then
it is a proximity query.  I'm not sure whether it means something
different when it's attached to a query clause in parentheses, someone
with more knowledge will need to comment.

> How can I achieve following?
> "boost(+((fl:java fl:book)),int(val))"
>
> The reason being the ANDed and ORed queries both returns the same number of
> documents. But what expected is that the ORed query should have more number
> of documents.

Normally I would say that if you get the same number of documents with
both "AND" & "OR" then it means that every document that contains "java"
also contains "book" ... but since you are running version 5.5.0, there
is a bug report that describes what you are seeing:

https://issues.apache.org/jira/browse/SOLR-8812

Thanks,
Shawn



Re: Query behavior.

2016-03-09 Thread Modassar Ather
Hi,

A suggestion will be very helpful.

Thanks,
Modassar

On Wed, Mar 9, 2016 at 12:37 PM, Modassar Ather 
wrote:

> Hi,
>
> Kindly help me understand the parsing of following query. I am using
> edismax parser and Solr-5.5.0.
> q.op is set to AND and there is no explicit mm value set.
>
> fl:(java OR book) => "boost(+((fl:java fl:book)~2),int(val))"
>
> When the query has explicit OR then why the ~2 is present in the parsed
> query?
>
> How can I achieve following?
> "boost(+((fl:java fl:book)),int(val))"
>
> The reason being the ANDed and ORed queries both returns the same number
> of documents. But what expected is that the ORed query should have more
> number of documents.
>
> Thanks,
> Modassar
>


Query behavior.

2016-03-08 Thread Modassar Ather
Hi,

Kindly help me understand the parsing of the following query. I am using
the edismax parser and Solr-5.5.0.
q.op is set to AND and there is no explicit mm value set.

fl:(java OR book) => "boost(+((fl:java fl:book)~2),int(val))"

When the query has an explicit OR, why is the ~2 present in the parsed
query?

How can I achieve the following?
"boost(+((fl:java fl:book)),int(val))"

The reason I ask is that the ANDed and ORed queries both return the same
number of documents, whereas the ORed query is expected to return more
documents.

Thanks,
Modassar


Re: Query behavior difference.

2016-01-06 Thread Emir Arnautovic

Hi Modassar,
It usually helps to analyze an extreme case, e.g. fl:a*
Which terms should be a better match? The shorter ones, or should all
be equally good?
What should the top document be? Assuming standard TF/IDF scoring is used,
it would be the one with the most terms that start with 'a', especially
terms that are infrequent in the corpus. Calculating that could be
expensive and irrelevant in most cases, so a constant score makes sense.


Thanks,
Emir

On 06.01.2016 12:07, Modassar Ather wrote:

> Please help me understand why queries like wildcard, prefix and few others
> are re-written into constant score query?
> Why the scoring factors are not taken into consideration in such queries?
>
> Please correct me if I am wrong that this behavior is per the query type
> irrespective of the parser used.
>
> Thanks,
> Modassar
>
> On Wed, Jan 6, 2016 at 12:56 PM, Modassar Ather 
> wrote:
>
>> Thanks for your response Ahmet.
>>
>> Best,
>> Modassar
>>
>> On Mon, Jan 4, 2016 at 5:07 PM, Ahmet Arslan 
>> wrote:
>>
>>> Hi,
>>>
>>> I think wildcard queries fl:networ* are re-written into Constant Score
>>> Query.
>>> fl=*,score should returns same score for all documents that are retrieved.
>>>
>>> Ahmet
>>>
>>>
>>>
>>> On Monday, January 4, 2016 12:22 PM, Modassar Ather <
>>> modather1...@gmail.com> wrote:
>>> Hi,
>>>
>>> Kindly help me understand how will relevance ranking differ int following
>>> searches.
>>>
>>> query : fl:network
>>> query : fl:networ*
>>>
>>> What I am observing that the results returned are different in both of
>>> them
>>> in a way that the top documents returned for q=fl:network is not present
>>> in
>>> the top results of q=fl:networ*.
>>> For example for q=fl:network I am getting top documents having around 20
>>> occurrence of network whereas the top result of q=fl:networ* has only
>>> couple of occurrence of network.
>>> I am aware of the underlying normalization process participation in
>>> relevance ranking of documents but not able to understand such a
>>> difference
>>> in the ranking of result for the queries.
>>>
>>> Thanks,
>>> Modassar





--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: Query behavior difference.

2016-01-06 Thread Modassar Ather
Please help me understand why queries such as wildcard, prefix and a few
others are rewritten into a constant-score query.
Why are the scoring factors not taken into consideration in such queries?

Please correct me if I am wrong, but I believe this behavior depends on the
query type, irrespective of the parser used.

Thanks,
Modassar



Re: Query behavior difference.

2016-01-06 Thread Jack Krupansky
The motivation for the constant-score rewrite is simply performance. As per
the Javadoc:

"*This method is faster than the BooleanQuery rewrite methods when the
number of matched terms or matched documents is non-trivial. Also, it will
never hit an errant BooleanQuery.TooManyClauses exception.*"

So that's a second reason - to avoid the max clause count limitation of
Boolean Query.

See:
https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/MultiTermQuery.html#CONSTANT_SCORE_REWRITE
https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/WildcardQuery.html
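
To make the effect on ranking concrete, here is a toy sketch in Python. It is not Lucene code: the documents, the sqrt(tf) score and the function names are all made up for illustration. It only shows why a term query can rank by frequency while a constant-score rewrite gives every match the same score.

```python
import math

# Toy index: document id -> tokens of field "fl" (made-up data).
DOCS = {
    "d1": ["network"] * 20 + ["cable"],
    "d2": ["networks", "network"],
    "d3": ["networking"],
    "d4": ["cable"],
}


def term_query(term):
    """Score each matching document by sqrt(term frequency), a rough TF stand-in."""
    hits = {
        doc_id: math.sqrt(tokens.count(term))
        for doc_id, tokens in DOCS.items()
        if term in tokens
    }
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))


def prefix_query_constant_score(prefix):
    """Constant-score rewrite: every matching document gets the same score."""
    hits = {
        doc_id: 1.0
        for doc_id, tokens in DOCS.items()
        if any(token.startswith(prefix) for token in tokens)
    }
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    print(term_query("network"))
    print(prefix_query_constant_score("networ"))
```

With the term query the 20-occurrence document comes first; with the prefix query all matches tie, so the order you see is whatever tie-break applies (document id in this sketch).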


-- Jack Krupansky



Re: Query behavior difference.

2016-01-06 Thread Modassar Ather
Thanks for your responses.

Best,
Modassar



Re: Query behavior difference.

2016-01-05 Thread Modassar Ather
Thanks for your response Ahmet.

Best,
Modassar



Query behavior difference.

2016-01-04 Thread Modassar Ather
Hi,

Kindly help me understand how relevance ranking will differ between the
following searches.

query : fl:network
query : fl:networ*

What I am observing is that the two queries rank results differently: the
top documents returned for q=fl:network are not present in the top results
of q=fl:networ*.
For example, for q=fl:network the top documents have around 20
occurrences of network, whereas the top result of q=fl:networ* has only
a couple of occurrences of network.
I am aware that the underlying normalization process plays a part in the
relevance ranking of documents, but I am not able to understand such a
difference in the ranking of results between the two queries.

Thanks,
Modassar


Re: Query behavior difference.

2016-01-04 Thread Ahmet Arslan
Hi,

I think wildcard queries such as fl:networ* are rewritten into a Constant Score Query.
Adding fl=*,score should return the same score for all documents that are retrieved.

Ahmet





Re: Difference in query behavior.

2015-11-30 Thread Jack Krupansky
The mm parameter or default operator logic only applies to the top level of
the query. Once you get nested in parentheses below the top level,
Solr/Lucene reverts to the default of the OR (SHOULD) operator.
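
A toy model of that rule, in Python rather than real Solr code, may help. The documents and function names below are invented for illustration, and the matching is plain set membership, not Lucene's BooleanQuery. It only shows how a nested group that falls back to OR can match documents that the top-level mm=100% query rejects.

```python
# Made-up documents: each field holds a set of indexed terms.
DOCS = {
    1: {"topic": {"facet"}, "title": {"solr", "lucene", "api"}},
    2: {"topic": {"facet"}, "title": {"solr"}},
    3: {"topic": {"other"}, "title": {"solr", "lucene", "api"}},
}
TITLE_TERMS = ["solr", "lucene", "api"]


def matches_all(field_terms, terms):
    """Top level with mm=100%: every term must be present."""
    return all(term in field_terms for term in terms)


def matches_any(field_terms, terms):
    """Nested group falling back to OR (SHOULD): one term is enough."""
    return any(term in field_terms for term in terms)


def query1(doc):
    # title:(solr lucene api), group at the top level
    return matches_all(doc["title"], TITLE_TERMS)


def query2(doc):
    # topic:facet AND title:(solr lucene api), group nested one level down
    return "facet" in doc["topic"] and matches_any(doc["title"], TITLE_TERMS)


def query2_explicit(doc):
    # topic:facet AND title:(+solr +lucene +api), operators spelled out
    return "facet" in doc["topic"] and matches_all(doc["title"], TITLE_TERMS)


def run(query):
    return sorted(doc_id for doc_id, doc in DOCS.items() if query(doc))


if __name__ == "__main__":
    print(run(query1), run(query2), run(query2_explicit))
```

Document 2 matches the second query but not the first, which is the "more results than expected" effect; spelling out the operators inside the group restores the subset.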

-- Jack Krupansky



Re: Difference in query behavior.

2015-11-30 Thread Upayavira
I cannot immediately explain the behaviour you are seeing, but can't you
use a filter query to achieve the same?

Add fq=topic:facet to your query string, and you'll be set.
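
For example, the request could be assembled like this. This is only a sketch in Python: the host, port and core name are placeholders, not anything from your setup, and q.op=AND is just carried over from your description.

```python
from urllib.parse import parse_qs, urlencode

# Placeholder endpoint: host, port and core name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"

params = [
    ("q", "title:(solr lucene api)"),  # the user's query, unchanged
    ("q.op", "AND"),
    ("fq", "topic:facet"),             # the restriction moved into a filter
]

query_string = urlencode(params)
url = SOLR_SELECT + "?" + query_string

if __name__ == "__main__":
    print(url)
    print(parse_qs(query_string))
```

urlencode takes care of escaping the colon, parentheses and spaces, so the filter reaches Solr as a separate fq parameter next to the main query.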

As to the original behaviour, the parsed query looks wrong, as it is
missing a bracket. Can you provide all of the versions of the queries
that Solr gives to you, along with the JSON/XML that wraps them?

Upayavira



Re: Difference in query behavior.

2015-11-30 Thread Alexandre Rafalovitch
On 30 November 2015 at 05:45, Modassar Ather  wrote:
>
> I have a query title:(solr lucene api). The mm is set to 100% using q.op as
> +(title:solr **title:faceting** title:api)~3

Does it though? solr lucene api => solr faceting api!

Is it possible you are staring at the wrong tab and the counts don't
match? I've done that often enough :-(

Regards,
   Alex.


Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


Re: Difference in query behavior.

2015-11-30 Thread Modassar Ather
Thanks for your response.

Upayavira : The missing bracket is a copy-paste error. The correct parsed query
is: +(+topic:facet +(title:solr title:lucene title:api)). Using fq is not an
option as these are user queries.
Alexandre : That is just an example query. The terms are only there to
explain the behavior. Basically, the two query forms are field:(term1
term2 term3) and field1:term4 AND field:(term1 term2 term3).
The second query should return a subset of the results of the first
query, but that is not happening.

Thanks Jack for your input.

Please let me know if there is a way to get a subset of the first query's
results from the second query. From my reading of the code, mm is not
considered unless there is an OR between the clauses. So for the
query field1:term4 AND field:(term1 term2 term3) mm is not considered at
all.

Regards,
Modassar

On Tue, Dec 1, 2015 at 10:14 AM, Modassar Ather <modather1...@gmail.com>
wrote:

> Thanks for your response.
>
> Upayavira : The missing bracket is a copy paste error. Correct parsed
> query : +(+topic:facet +(title:solr title:lucene title:api)). Use of fq is
> not an option as these are user queries.
> Alexandre : That is just an example query. Those terms used are just to
> explain the behavior. Basically the query forms can be seen as field:(term1
> term2 term3) and field1:term4 AND field:(term1 term2 term3)
>   The second query should bring the subset of the first
> query but that is not happening.
>
> Thanks Jack for your input.
>
> Please let me know if there is a way to achieve the subset of first query
> from second query.
>
> Regards,
> Modassar
>
>
> On Tue, Dec 1, 2015 at 9:57 AM, Modassar Ather <modather1...@gmail.com>
> wrote:
>
>> Hi Tim,
>>
>> I am using the SpanQueryParser for phrases particularly.
>>
>> Thanks,
>> Modassar
>>
>> On Mon, Nov 30, 2015 at 6:27 PM, Allison, Timothy B. <talli...@mitre.org>
>> wrote:
>>
>>> Out of curiosity, how does the SpanQueryParser work on this?  Or have
>>> you stopped using that?
>>>
>>> Cheers,
>>>
>>>   Tim
>>>
>>
>>
>


Difference in query behavior.

2015-11-30 Thread Modassar Ather
Hi,

I have a query title:(solr lucene api). The mm is set to 100% using q.op as
AND.
When the query is executed it returns documents having all the terms. It
parses to the following:
+(title:solr title:faceting title:api)~3

Similarly, I have another query, topic:facet AND title:(solr
lucene api), which is parsed as:
+(+topic:facet +(title:solr title:lucene title:api)

The second query should match a subset of the first query's results, but it
returns more results than the first.
As I understand it, the reason is that the second query has two clauses:
1) topic:facet, which MUST occur, and 2) (title:solr title:lucene
title:api), of which only one of the terms must occur.
In the first query there are three clauses joined by SHOULD, but because
mm is 100% all of the terms are matched.

Kindly help me understand how I can get a subset of the results of query 1
from query 2.
I understand that it will work if I put +/AND between the terms, but the
same is not required in query one.
Is there a way to group the clauses that ensures the first clause and all
the terms of the other clause must match, just as all the clauses are
matched in the first query?
Also, please let me know how ~ differs from phrase slop in the case of the
first query.

Thanks,
Modassar


changed query behavior

2014-04-14 Thread Johannes Siegert

Hi,

I have updated my Solr instance from 4.5.1 to 4.7.1.

Now one of my Solr queries is failing some tests.

Query: q=*:*&fq=(title:((TE)))&debug=true

Before the update:

<lst name="debug">
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">*:*</str>
<lst name="explain"/>
<str name="QParser">LuceneQParser</str>
<arr name="filter_queries">
<str>(title:((TE)))</str>
</arr>
<arr name="parsed_filter_queries">
<str>+title:te +title:t +title:e</str>
</arr>
...

After the update:

<lst name="debug">
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">*:*</str>
<lst name="explain"/>
<str name="QParser">LuceneQParser</str>
<arr name="filter_queries">
<str>(title:((TE)))</str>
</arr>
<arr name="parsed_filter_queries">
<str>+((title:te title:t)/no_coord) +title:e</str>
</arr>
...

Before the update the query delivered only one result. Now the query delivers
three results.


Do you have any idea why the parsed_filter_queries is +((title:te
title:t)/no_coord) +title:e instead of +title:te +title:t +title:e?


title-field definition:

<fieldType name="text_title" class="solr.TextField"
           positionIncrementGap="100" omitNorms="true">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
            splitOnNumerics="1" preserveOriginal="1" stemEnglishPossessive="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"
            splitOnNumerics="0" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

The default query operator is AND.

Thanks!

Johannes




Re: Join Query Behavior

2013-10-25 Thread Andy Pickler
If it helps to clarify any, here's the full query:

/select
?q=*:*
&fq=type:ProjectGroup
&fq={!join from=project_id_i to=project_id_im}user_id_i:65615 -role_id_i:18
type:UserRole

We have two Solr servers that were indexed from the same database.  One of
the servers is running Solr 4.2, while the other (test server) is running
4.5.

Solr 4.2:
<result name="response" numFound="64" start="0">

Solr 4.5.1:
<result name="response" numFound="2642" start="0">

Solr 4.2 returns the expected result with the project IDs filtered out
from the join query, while the 4.5 query shows *all* results (2642
records).  I can leave off the join query in 4.5 and get the same results,
which tells me obviously it is having no effect.

Is there a change to the join query behavior between these releases, or
could I have configured something differently in my 4.5.1 install?

Thanks,
Andy Pickler





Join Query Behavior

2013-10-24 Thread Andy Pickler
We're attempting to upgrade from Solr 4.2 to 4.5 but are finding that 4.5
is not honoring this join query:

first part of query...

fq={!join from=project_id_i to=project_id_im}user_id_i:65615 -role_id_i:18
type:UserRole

last part of query

On our Solr 4.2 instance adding/removing that query gives us different (and
expected) results, while the query doesn't affect the results at all in
4.5.  Is there any known join query behavior differences/fixes between 4.2
and 4.5 that might explain this, or should I be looking at other factors?

Thanks,
Andy Pickler


Re: Can anyone explain this Solr query behavior?

2013-05-24 Thread Shankar Sundararaju
Hi Upayavira,

Thank you for your analysis. I thought 'AND'  groupings are supported as
per documentation:

http://docs.lucidworks.com/display/solr/The+Extended+DisMax+Query+Parser
http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/queryparsersyntax.html#Grouping

But yes, q=doc-id:3000 AND (-text:[* TO *]) works as expected.

Thanks
-Shankar



On Thu, May 23, 2013 at 5:31 PM, Upayavira u...@odoko.co.uk wrote:

 (+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
 Classification:and^2.0 | Contributors:and^2.0 |
 Title:and^3.0/no_coord

 You're using edismax, not lucene. So AND is being considered as a search
 term, not an operator, and the word 'and' probably exists in 631580
 documents.

 Why is it triggering dismax? Probably because field:() is not valid
 syntax, so edismax is dropping to dismax because it isn't a valid lucene
 query.

 What do you expect text:() to do?

 If you want to match any docs that have a value in the text field, use
 q=text:[* TO *]

 To match docs that *don't* have a value in the text field:
 q=-text:[* TO *]

 Upayavira

 On Fri, May 24, 2013, at 12:23 AM, Shankar Sundararaju wrote:
  Hi Erick,
 
  Here's the output after turning on the debug flag:
 
  *q=text:()&debug=query*

  yields

  <response>
  <lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">17</int>
  <lst name="params">
  <str name="indent">true</str>
  <str name="q">text:()</str>
  <str name="debug">query</str>
  </lst>
  </lst>
  <result name="response" numFound="0" start="0" maxScore="0.0"></result>
  <lst name="debug">
  <str name="rawquerystring">text:()</str>
  <str name="querystring">text:()</str>
  <str name="parsedquery">(+())/no_coord</str>
  <str name="parsedquery_toString">+()</str>
  <str name="QParser">ExtendedDismaxQParser</str>
  <null name="altquerystring"/>
  <null name="boost_queries"/>
  <arr name="parsed_boost_queries"/>
  <null name="boostfuncs"/>
  </lst>
  </response>
 
  *q=doc-id:3000&debug=query*

  yields

  <response>
  <lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">17</int>
  <lst name="params">
  <str name="q">doc-id:3000</str>
  <str name="debug">query</str>
  </lst>
  </lst>
  <result name="response" numFound="1" start="0" maxScore="11.682044">
  <doc>
    :
    :
  </doc>
  </result>
  <lst name="debug">
  <str name="rawquerystring">doc-id:3000</str>
  <str name="querystring">doc-id:3000</str>
  <str name="parsedquery">(+doc-id:3000)/no_coord</str>
  <str name="parsedquery_toString">+doc-id:`&#8;&#0;&#0;&#23;8</str>
  <str name="QParser">ExtendedDismaxQParser</str>
  <null name="altquerystring"/>
  <null name="boost_queries"/>
  <arr name="parsed_boost_queries"/>
  <null name="boostfuncs"/>
  </lst>
  </response>
 
  *q=doc-id:3000 AND text:()&debug=query*

  yields

  <response>
  <lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">23</int>
  <lst name="params">
  <str name="q">doc-id:3000 AND text:()</str>
  <str name="debug">query</str>
  </lst>
  </lst>
  <result name="response" numFound="631647" start="0" maxScore="8.056607">
  <doc>
    :
  </doc>
  <doc>
    :
  </doc>
  <doc>
    :
  </doc>
  <doc>
    :
  </doc>
  <doc>
    :
  </doc>
  <doc>
    :
  </doc>
  </result>
  <lst name="debug">
  <str name="rawquerystring">doc-id:3000 AND text:()</str>
  <str name="querystring">doc-id:3000 AND text:()</str>
  <str name="parsedquery">
  (+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
  Classification:and^2.0 | Contributors:and^2.0 |
  Title:and^3.0/no_coord
  </str>
  <str name="parsedquery_toString">
  +(doc-id:`&#8;&#0;&#0;&#23;8 (Publisher:and^2.0 | text:and |
  Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))
  </str>
  <str name="QParser">ExtendedDismaxQParser</str>
  <null name="altquerystring"/>
  <null name="boost_queries"/>
  <arr name="parsed_boost_queries"/>
  <null name="boostfuncs"/>
  </lst>
  </response>
 
  *solrconfig.xml:*
  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <int name="rows">10</int>
      <str name="df">text</str>
      <str name="defType">edismax</str>
      <str name="qf">text^1.0 Title^3.0 Classification^2.0
        Contributors^2.0 Publisher^2.0</str>
    </lst>
 
  *schema.xml:*
  <field name="text" type="my_text" indexed="true" stored="false"
         required="false"/>
  <dynamicField name="*" type="my_text" indexed="true" stored="true"
                multiValued="false"/>
  <fieldType name="my_text" class="solr.TextField">
    <analyzer type="index" class="MyAnalyzer"/>
    <analyzer type="query" class="MyAnalyzer"/>
    <analyzer type="multiterm" class="MyAnalyzer"/>
  </fieldType>

  *Note:* MyAnalyzer, among a few other customizations, uses
  WhitespaceTokenizer and LowerCaseFilter.
 
  Thanks a lot.
 
  -Shankar
 
 
  On Thu, May 23, 2013 at 4:34 AM, Erick Erickson
  erickerick...@gmail.comwrote:
 
   Please post the results of adding debug=query to the URL.
   That'll tell us what the query parser spits out which is much
   easier to analyze.
  
   Best
   Erick
  
   On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju
   shan...@ebrary.com wrote:
This query returns 0 documents: *q=(+Title:() +Classification:()
+Contributors:() +text:())*
   
This returns 1 document: *q=doc-id:3000*
   
And this returns 631580 documents 

Re: Can anyone explain this Solr query behavior?

2013-05-24 Thread Shankar Sundararaju
Hi Jack Krupansky,

Thank you for your reply. I would like to know how you got the error
logging? Is there any special flag I have to turn on? Because I don't see
it in my solr.log even after switching the log level to DEBUG.

<str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:*
AND text:()': Encountered " ")" ") "" at line 1, column 15.

Thanks
-Shankar


On Thu, May 23, 2013 at 5:41 PM, Jack Krupansky <j...@basetechnology.com> wrote:

 Okay... sorry I wasn't paying close enough attention. What is happening is
 that the empty parentheses are illegal in Lucene query syntax:

  <str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:*
 AND text:()': Encountered " ")" ") "" at line 1, column 15.
 Was expecting one of:
     &lt;NOT&gt; ...
     "+" ...
     "-" ...
     &lt;BAREOPER&gt; ...
     "(" ...
     "*" ...
     &lt;QUOTED&gt; ...
     &lt;TERM&gt; ...
     &lt;PREFIXTERM&gt; ...
     &lt;WILDTERM&gt; ...
     &lt;REGEXPTERM&gt; ...
     "[" ...
     "{" ...
     &lt;LPARAMS&gt; ...
     &lt;NUMBER&gt; ...
     &lt;TERM&gt; ...
     "*" ...
 </str>
  <int name="code">400</int>

 Edismax traps such errors and then escapes the query so that Lucene will
 no longer throw an error. In this case, it puts quotes around the AND
 operator, which is why you see "and" included in the parsed query as if it
 were a term. And I believe it turns text:() into text:(), which makes
 the original Lucene error go away, but the () analyzes to nothing and
 generates no term in the query.

 So, fix your syntax error and the anomaly should go away.

 -- Jack Krupansky

 -Original Message- From: Shankar Sundararaju
 Sent: Thursday, May 23, 2013 7:23 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Can anyone explain this Solr query behavior?


 Hi Erick,

 Here's the output after turning on the debug flag:

 *q=text:()debug=query*


yields

 response
 lst name=responseHeader
 int name=status0/int
 int name=QTime17/int
 lst name=params
 str name=indenttrue/str
 str name=qtext:()/str
 str name=debugquery/str
 /lst
 /lst
 result name=response numFound=0 start=0 maxScore=0.0/result
 lst name=debug
 str name=rawquerystringtext:()**/str
 str name=querystringtext:()/**str
 str name=parsedquery(+())/no_**coord/str
 str name=parsedquery_toString+(**)/str
 str name=QParser**ExtendedDismaxQParser/str
 null name=altquerystring/
 null name=boost_queries/
 arr name=parsed_boost_queries/
 null name=boostfuncs/
 /lst
 /response

 *q=doc-id:3000debug=query*


yields

 response
 lst name=responseHeader
 int name=status0/int
 int name=QTime17/int
 lst name=params
 str name=qdoc-id:3000/str
 str name=debugquery/str
 /lst
 /lst
 result name=response numFound=1 start=0 maxScore=11.682044
 doc
  :
  :
 /doc
 /result
 lst name=debug
 str name=rawquerystringdoc-id:**3000/str
 str name=querystringdoc-id:**3000/str
 str name=parsedquery(+doc-id:**3000)/no_coord/str
 str name=parsedquery_toString+**doc-id:`#8;#0;#0;#23;8/str
 str name=QParser**ExtendedDismaxQParser/str
 null name=altquerystring/
 null name=boost_queries/
 arr name=parsed_boost_queries/
 null name=boostfuncs/
 /lst
 /response

 *q=doc-id:3000 AND text:()debug=query*

  yields

 response
 lst name=responseHeader
 int name=status0/int
 int name=QTime23/int
 lst name=params
 str name=qdoc-id:3000 AND text:()/str
 str name=debugquery/str
 /lst
 /lst
 result name=response numFound=631647 start=0 maxScore=8.056607
 doc
 :
 /doc
 :
 /doc
 doc
 :
 /doc
 doc
 :
 /doc
 doc
 :
 /doc
 doc
 :
 /doc
 /result
 lst name=debug
 str name=rawquerystringdoc-id:**3000 AND text:()/str
 str name=querystringdoc-id:3000 AND text:()/str
 str name=parsedquery
 (+(doc-id:3000 DisjunctionMaxQuery((**Publisher:and^2.0 | text:and |
 Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0/no_coord
 /str
 str name=parsedquery_toString
 +(doc-id:`#8;#0;#0;#23;8 (Publisher:and^2.0 | text:and |
 Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))
 /str
 str name=QParser**ExtendedDismaxQParser/str
 null name=altquerystring/
 null name=boost_queries/
 arr name=parsed_boost_queries/
 null name=boostfuncs/
 /lst
 /response

 *solrconfig.xml:*

 requestHandler name=/select class=solr.SearchHandler
 lst name=defaults
   str name=echoParamsexplicit/**str
   int name=rows10/int
   str name=dftext/str
   str name=defTypeedismax/str
   str name=qftext^1.0 Title^3.0 Classification^2.0
 Contributors^2.0 Publisher^2.0/str
 /lst

 *schema.xml:*

 field name=text type=my_text indexed=true stored=false required=
 false/*
 *

 dynamicField name=* type=my_text indexed=true stored=true
 multiValued=false/
 fieldType name=my_text class=solr.TextField analyzer type=index
 class=MyAnalyzer/ analyzer type=query class=MyAnalyzer/ analyzer
 type=multiterm class=MyAnalyzer/ /fieldType
 *
 *
 *Note:* MyAnalyzer, among a few other customizations, uses WhitespaceTokenizer
 and LowerCaseFilter

 Thanks a lot.

 -Shankar


 On Thu, May 23, 2013 at 4:34 AM, Erick Erickson erickerick...@gmail.com*
 *wrote:

  Please post the results

Re: Can anyone explain this Solr query behavior?

2013-05-24 Thread Jack Krupansky
Oh, I simply changed the query parser type to lucene, with defType=lucene, 
and then I see essentially the same error that edismax hits when it 
internally tries to parse the query.


But it might be nice if DEBUG-level logging for edismax displayed the 
error as well and then told you what remediation it was performing.


-- Jack Krupansky
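
The comparison described above can be reproduced by requesting the same query with debug=query, once with the configured default parser and once with defType=lucene. A minimal Python sketch that only builds the two URLs; the host, port and core name are assumptions and do not come from this thread:

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and core name are placeholders.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def debug_url(query, def_type=None):
    """Build a select URL that asks Solr for query-parsing debug output.

    def_type=None leaves the handler's configured parser (edismax in this
    thread) in place; def_type="lucene" switches to the standard parser,
    which reports the syntax error instead of escaping the query.
    """
    params = {"q": query, "debug": "query"}
    if def_type:
        params["defType"] = def_type
    return f"{SOLR_SELECT}?{urlencode(params)}"


if __name__ == "__main__":
    print(debug_url("doc-id:3000 AND text:()"))
    print(debug_url("doc-id:3000 AND text:()", def_type="lucene"))
```

Fetching the second URL should return the 400 response with the SyntaxError message rather than a result list.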

-Original Message- 
From: Shankar Sundararaju

Sent: Friday, May 24, 2013 1:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Can anyone explain this Solr query behavior?

Hi Jack Krupansky,

Thank you for your reply. I would like to know how you got the error
logging? Is there any special flag I have to turn on? Because I don't see
it in my solr.log even after switching the log level to DEBUG.

org.apache.solr.search.SyntaxError: Cannot parse 'id:* AND text:()':
Encountered " ")" ") "" at line 1, column 15.

Thanks
-Shankar



Re: Can anyone explain this Solr query behavior?

2013-05-23 Thread Erick Erickson
Please post the results of adding debug=query to the URL.
That'll tell us what the query parser spits out which is much
easier to analyze.

Best
Erick



Re: Can anyone explain this Solr query behavior?

2013-05-23 Thread Shankar Sundararaju
Hi Erick,

Here's the output after turning on the debug flag:

*q=text:()&debug=query*

yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">17</int>
<lst name="params">
<str name="indent">true</str>
<str name="q">text:()</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="0" start="0" maxScore="0.0"></result>
<lst name="debug">
<str name="rawquerystring">text:()</str>
<str name="querystring">text:()</str>
<str name="parsedquery">(+())/no_coord</str>
<str name="parsedquery_toString">+()</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*q=doc-id:3000&debug=query*

yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">17</int>
<lst name="params">
<str name="q">doc-id:3000</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="1" start="0" maxScore="11.682044">
<doc>
  :
  :
</doc>
</result>
<lst name="debug">
<str name="rawquerystring">doc-id:3000</str>
<str name="querystring">doc-id:3000</str>
<str name="parsedquery">(+doc-id:3000)/no_coord</str>
<str name="parsedquery_toString">+doc-id:`&#8;&#0;&#0;&#23;8</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*q=doc-id:3000 AND text:()&debug=query*

  yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">23</int>
<lst name="params">
<str name="q">doc-id:3000 AND text:()</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="631647" start="0" maxScore="8.056607">
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
</result>
<lst name="debug">
<str name="rawquerystring">doc-id:3000 AND text:()</str>
<str name="querystring">doc-id:3000 AND text:()</str>
<str name="parsedquery">
(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0/no_coord
</str>
<str name="parsedquery_toString">
+(doc-id:`&#8;&#0;&#0;&#23;8 (Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))
</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*solrconfig.xml:*
<requestHandler name="/select" class="solr.SearchHandler">
 <lst name="defaults">
   <str name="echoParams">explicit</str>
   <int name="rows">10</int>
   <str name="df">text</str>
   <str name="defType">edismax</str>
   <str name="qf">text^1.0 Title^3.0 Classification^2.0 Contributors^2.0 Publisher^2.0</str>
 </lst>
</requestHandler>

*schema.xml:*
<field name="text" type="my_text" indexed="true" stored="false" required="false"/>
<dynamicField name="*" type="my_text" indexed="true" stored="true" multiValued="false"/>
<fieldType name="my_text" class="solr.TextField">
  <analyzer type="index" class="MyAnalyzer"/>
  <analyzer type="query" class="MyAnalyzer"/>
  <analyzer type="multiterm" class="MyAnalyzer"/>
</fieldType>

*Note:* MyAnalyzer, among a few other customizations, uses WhitespaceTokenizer
and LowerCaseFilter

Thanks a lot.

-Shankar






-- 
Regards,
*Shankar Sundararaju
*Sr. Software Architect
ebrary, a ProQuest company
410 Cambridge Avenue, Palo Alto, CA 94306 USA
shan...@ebrary.com | www.ebrary.com | 650-475-8776 (w) | 408-426-3057 (c)


Re: Can anyone explain this Solr query behavior?

2013-05-23 Thread Upayavira
(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 |
Title:and^3.0/no_coord

You're using edismax, not lucene. So AND is being considered as a search
term, not an operator, and the word 'and' probably exists in 631580
documents.

Why is it triggering dismax? Probably because field:() is not valid
syntax, so edismax is dropping to dismax because it isn't a valid lucene
query.

What do you expect text:() to do?

If you want to match any docs that have a value in the text field, use
q=text:[* TO *]

To match docs that *don't* have a value in the text field:
q=-text:[* TO *]

Upayavira
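
The two range queries above can be wrapped in small helpers so the bracket syntax is written in one place only. A minimal Python sketch; the function names are made up for illustration, and the negative form is written with an explicit *:* in front, which is an assumption about how one might prefer to spell it rather than something stated in this thread:

```python
def has_value(field):
    """Clause matching documents that have any value in `field`."""
    return f"{field}:[* TO *]"


def has_no_value(field):
    """Clause matching documents that have no value in `field`.

    The leading *:* selects all documents, and the negated range
    clause then removes those that do have a value.
    """
    return f"*:* -{field}:[* TO *]"


if __name__ == "__main__":
    print("q=" + has_value("text"))
    print("q=" + has_no_value("text"))
```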



Re: Can anyone explain this Solr query behavior?

2013-05-23 Thread Jack Krupansky
Okay... sorry I wasn't paying close enough attention. What is happening is 
that the empty parentheses are illegal in Lucene query syntax:


<str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:* AND text:()': Encountered " ")" ") "" at line 1, column 15.
Was expecting one of:
   &lt;NOT&gt; ...
   "+" ...
   "-" ...
   &lt;BAREOPER&gt; ...
   "(" ...
   "*" ...
   &lt;QUOTED&gt; ...
   &lt;TERM&gt; ...
   &lt;PREFIXTERM&gt; ...
   &lt;WILDTERM&gt; ...
   &lt;REGEXPTERM&gt; ...
   "[" ...
   "{" ...
   &lt;LPARAMS&gt; ...
   &lt;NUMBER&gt; ...
   &lt;TERM&gt; ...
   "*" ...
   </str>
<int name="code">400</int>

Edismax traps such errors and then escapes the query so that Lucene will 
no longer throw an error. In this case, it puts quotes around the "AND" 
operator, which is why you see "and" included in the parsed query as if it 
were a term. And I believe it turns text:() into text:"()", which makes 
the original Lucene error go away, but the "()" analyzes to nothing and 
generates no term in the query.


So, fix your syntax error and the anomaly should go away.

-- Jack Krupansky
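
One way to avoid sending empty parentheses in the first place is to build the query string only from fields that actually have input. A minimal Python sketch, assuming the client assembles the query from a mapping of field name to user-entered terms; the function name and the choice to treat an empty field as "no constraint" (rather than "match nothing") are assumptions, and the terms are not escaped here:

```python
def build_query(doc_id, field_terms):
    """Join non-empty field clauses with AND, skipping empty ones.

    field_terms maps a field name to the raw terms typed for it.
    Fields whose terms are empty or whitespace-only are left out,
    so the result never contains a `field:()` clause.
    """
    clauses = [f"doc-id:{doc_id}"]
    for field, terms in field_terms.items():
        terms = terms.strip()
        if terms:
            clauses.append(f"{field}:({terms})")
    return " AND ".join(clauses)


if __name__ == "__main__":
    empty = {"Title": "", "Classification": "", "Contributors": "", "text": ""}
    print(build_query(3000, empty))
    print(build_query(3000, {"Title": "solr", "text": ""}))
```

If an empty field should instead mean "return nothing", the caller can check for that case and skip the request entirely.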


Can anyone explain this Solr query behavior?

2013-05-22 Thread Shankar Sundararaju
This query returns 0 documents: *q=(+Title:() +Classification:()
+Contributors:() +text:())*

This returns 1 document: *q=doc-id:3000*

And this returns 631580 documents when I was expecting 0: *q=doc-id:3000
AND (+Title:() +Classification:() +Contributors:() +text:())*

Am I missing something here? Can someone please explain? I am using Solr
4.2.1

Thanks
-Shankar


Strange query behavior

2010-06-28 Thread Marc Ghorayeb

Hello,
I have a title that says "3DVIA Studio & Virtools Maya and 3dsMax 
Exporters". The analysis tool for this field gives me these 
tokens:3dviadviastudio;virtoolmaya3dsmaxdssystèmmaxexport


However, when I search for "3dsmax", I get no results :( Furthermore, if I 
search for "dsmax", the spellchecker suggests "3dsmax" even though 
it doesn't find any results. If I search for any other token ("3dvia" or "max", 
for example), the document is found. "3dsmax" is the only token that doesn't 
seem to work!! :(
Here is my schema for this field:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>

    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="0"
            catenateNumbers="0"
            catenateAll="0"
            splitOnCaseChange="1"
            preserveOriginal="1"
    />

    <filter class="solr.TrimFilterFactory" updateOffsets="true"/>
    <filter class="solr.LengthFilterFactory" min="2" max="15"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>

    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="${Language}" protected="protwords.txt"/>
  </analyzer>

  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>

    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="0"
            splitOnCaseChange="1"
            preserveOriginal="1"
    />

    <filter class="solr.TrimFilterFactory" updateOffsets="true"/>
    <filter class="solr.LengthFilterFactory" min="2" max="15"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="${Language}" protected="protwords.txt"/>
  </analyzer>
</fieldType>
Can anyone help me out please? :(
PS: the ${Language} is set to en (for english) in this case...
  

Re: Strange query behavior

2010-06-28 Thread Joe Calderon
splitOnCaseChange is creating multiple tokens from 3dsMax. Disable it
or enable catenateAll. Use the analysis page in the admin tool to see
exactly how your text will be indexed by the analyzers without having to
reindex your documents; once you have it right, you can do a full
reindex.
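
To see why the indexed and queried forms can end up with different sub-tokens, here is a rough Python model of the splitting step. It is only a sketch: it is not the real WordDelimiterFilter, it handles ASCII only, it ignores the catenate options and the later length, stop-word and stemming filters, and the function name is made up for illustration:

```python
import re

# A part is: a lowercase run with an optional leading capital,
# a run of capitals not followed by lowercase, or a run of digits.
_PART = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+")


def split_word(token, preserve_original=True):
    """Split on case changes and letter/digit boundaries, then lowercase."""
    parts = _PART.findall(token)
    if preserve_original and parts != [token]:
        parts = [token] + parts
    return [p.lower() for p in parts]


if __name__ == "__main__":
    print(split_word("3dsMax"))  # form found in the title
    print(split_word("3dsmax"))  # form typed into the search box
```

In this model the title form yields the parts "ds" and "max", while the all-lowercase search form yields "dsmax", so the two sides do not produce the same set of sub-tokens. The admin analysis page remains the authoritative check.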



Unexpected boolean query behavior

2010-01-14 Thread markwaddle

Here is my query:
(virt* AND machine fingerprinting) OR (virt* AND encryption) OR (virt* AND
anonymous) OR (virt* AND analytic*) AND owned:true

It can be broken down to:
(A) OR (B) OR (C) OR (D) AND E

A, B, C and D are themselves AND boolean clauses.

The E clause at the end is not behaving the way I would expect. No matter
how I order the A,B,C and D clauses, it always returns the equivalent of
((D) AND E).

When I add additional parentheses it behaves the way I expect. Like:
((A) OR (B) OR (C) OR (D)) AND E
or
(A) OR (B) OR (C) OR ((D) AND E)

Can anyone explain why it behaves the way it does without the parentheses?
Is there something I am missing in the way it processes boolean clauses?

Thanks,
Mark
-- 
View this message in context: 
http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27166967.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Unexpected boolean query behavior

2010-01-14 Thread Otis Gospodnetic
Mark,

Does it help if you rewrite your query using +/- syntax (required, 
prohibited), or nothing for should?  Because that's what happens under the 
hood (terms are required, prohibited, or should occur).


Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch






Re: Unexpected boolean query behavior

2010-01-14 Thread markwaddle

That is a reasonable question. The problem here is that my users have already
created numerous queries just like this one, using ANDs and ORs. My users
are very technical and they have been using the results of these queries for
months now to perform analysis that drives business decisions. I need an
explanation for why this is happening so I can not only train them to
use it more effectively, but also restore their trust in the search
application.

Does anyone understand this behavior? Or can you recommend a place for me to
look?



-- 
View this message in context: 
http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27167750.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Unexpected boolean query behavior

2010-01-14 Thread Otis Gospodnetic
Hi Mark,

Does this help?
http://wiki.apache.org/lucene-java/BooleanQuerySyntax

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
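
The required/optional behaviour can be illustrated with a small model of how a flat list of clauses joined by AND and OR gets turned into required (+) and optional clauses. This is a simplified Python sketch of my understanding of the classic Lucene query parser with OR as the default operator; it ignores NOT, explicit +/- modifiers and nested groups, and the function names are made up for illustration:

```python
def assign_occurs(tokens):
    """tokens alternates clause names and AND/OR, e.g. ['A', 'OR', 'B'].

    A clause introduced by AND becomes required and also makes the
    clause just before it required; everything else stays optional.
    """
    clauses = []
    conj = None
    for tok in tokens:
        if tok in ("AND", "OR"):
            conj = tok
            continue
        if conj == "AND" and clauses:
            clauses[-1] = (clauses[-1][0], "MUST")
            occur = "MUST"
        else:
            occur = "SHOULD"
        clauses.append((tok, occur))
        conj = None
    return clauses


def render(clauses):
    """Show the clauses in +/- notation (optional clauses have no prefix)."""
    return " ".join(("+" if occur == "MUST" else "") + name
                    for name, occur in clauses)


def matches(clauses, satisfied):
    """satisfied is the set of clause names a document fulfils."""
    must = [name for name, occur in clauses if occur == "MUST"]
    should = [name for name, occur in clauses if occur == "SHOULD"]
    if must:
        # Optional clauses only affect scoring once anything is required.
        return all(name in satisfied for name in must)
    return any(name in satisfied for name in should)


if __name__ == "__main__":
    parsed = assign_occurs("A OR B OR C OR D AND E".split())
    print(render(parsed))
    print(matches(parsed, {"A"}), matches(parsed, {"D", "E"}))
```

Under this model, (A) OR (B) OR (C) OR (D) AND E leaves A, B and C optional and makes D and E required, which matches the observed "equivalent of ((D) AND E)" result. Adding explicit parentheses, as in ((A) OR (B) OR (C) OR (D)) AND E, groups the optional clauses into one required clause.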



- Original Message 
 From: markwaddle m...@markwaddle.com
 To: solr-user@lucene.apache.org
 Sent: Thu, January 14, 2010 3:38:34 PM
 Subject: Re: Unexpected boolean query behavior
 
 
 That is a reasonable question. The problem here is that my users have already
 created numerous queries just like this one, using ANDs and ORs. My users
 are very technical and they have been using the results of these queries for
 months now to perform analysis that drives business decisions. I need an
 explanation for why this is happening so I can not only train them on how to
 use it more effectively, but also to restore their trust in the search
 application.
 
 Does anyone understand this behavior? Or can you recommend a place for me to
 look?
 
 
 Otis Gospodnetic wrote:
  
  Mark,
  
  Does it help if you rewrite your query using +/- syntax (required,
  prohibited), or nothing for should?  Because that's what happens under
  the hood (terms are required, prohibited, or should occur).
  
  
  Otis
  --
  Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
  
  
  
  ----- Original Message -----
  From: markwaddle 
  To: solr-user@lucene.apache.org
  Sent: Thu, January 14, 2010 2:39:21 PM
  Subject: Unexpected boolean query behavior
  
  
  Here is my query:
  (virt* AND machine fingerprinting) OR (virt* AND encryption) OR
  (virt* AND anonymous) OR (virt* AND analytic*) AND owned:true
  
  It can be broken down to:
  (A) OR (B) OR (C) OR (D) AND E
  
  A, B, C and D are themselves AND boolean clauses.
  
  The E clause at the end is not behaving the way I would expect. No matter
  how I order the A, B, C and D clauses, it always returns the equivalent of
  ((D) AND E).
  
  When I add additional parentheses it behaves the way I expect. Like:
  ((A) OR (B) OR (C) OR (D)) AND E
  or
  (A) OR (B) OR (C) OR ((D) AND E)
  
  Can anyone explain why it behaves the way it does without the
  parentheses?
  Is there something I am missing in the way it processes boolean clauses?
  
  Thanks,
  Mark
  -- 
  View this message in context:
  http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27166967.html
  Sent from the Solr - User mailing list archive at Nabble.com.
  
  
  
 
 -- 
 View this message in context: 
 http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27167750.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Unexpected boolean query behavior

2010-01-14 Thread markwaddle

That explains my exact problem, thank you! May I ask how you found that wiki
posting?


Otis Gospodnetic wrote:
 
 Hi Mark,
 
 Does this help?
 http://wiki.apache.org/lucene-java/BooleanQuerySyntax
 
 Otis
 --
 Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
 
-- 
View this message in context: 
http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27170172.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Unexpected boolean query behavior

2010-01-14 Thread Lance Norskog
Try this:

http://www.lucidimagination.com/search/?q=boolean+query

On Thu, Jan 14, 2010 at 3:45 PM, markwaddle m...@markwaddle.com wrote:

 That explains my exact problem, thank you! May I ask how you found that wiki
 posting?


 Otis Gospodnetic wrote:

 Hi Mark,

 Does this help?
 http://wiki.apache.org/lucene-java/BooleanQuerySyntax

 Otis
 --
 Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch

 --
 View this message in context: 
 http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27170172.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
Lance Norskog
goks...@gmail.com