Re: QueryParser changes query by itself [solved]

2017-08-22 Thread Steve Rowe
Hi Bernd,

> On Aug 22, 2017, at 4:31 AM, Bernd Fehling wrote:
> 
> But the QueryBuilder only calls "stream.reset()"; it never calls
> "stream.end()", so Filters in the Analyzer chain can't do any cleanup
> (like my Filter wanted to do).
> I moved my "cleanup" into reset(), which feels like a dirty hack.
> 
> In my opinion, Lucene's QueryBuilder should call "stream.end()" after
> consuming the stream:
> ...
>   stream.reset();
>   while (stream.incrementToken()) {
>     numTokens++;
>     ...
>   }
>   stream.end();
> ...

The stream here is a CachingTokenFilter wrapping the passed-in TokenStream. On 
first call to cache.incrementToken(), CachingTokenFilter's cache is populated 
by exhausting the wrapped stream and then calling its end() method.
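
The behavior described above can be modeled in a few lines. This is a simplified, self-contained sketch and not Lucene's actual code: SimpleStream, ListStream, and CachingStream are hypothetical stand-ins for TokenStream, a concrete analysis chain, and CachingTokenFilter. It only illustrates that the caching wrapper drains the wrapped stream and calls its end() on the first request for a token, so the consuming loop does not have to.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class CachingStreamDemo {
    public static void main(String[] args) {
        ListStream inner = new ListStream(List.of("arms", "trade"));
        CachingStream stream = new CachingStream(inner);
        stream.reset();
        int numTokens = 0;
        while (stream.next() != null) {
            numTokens++;
        }
        // The loop above never called end(), yet the wrapped stream was ended.
        System.out.println(numTokens + " tokens, inner end() called: " + inner.endCalled);
    }
}

// Hypothetical stand-in for a token stream; not Lucene's TokenStream API.
interface SimpleStream {
    void reset();
    String next();   // returns null when exhausted
    void end();
}

// Wrapped stream that records whether end() was called on it.
class ListStream implements SimpleStream {
    private final List<String> tokens;
    private Iterator<String> it;
    boolean endCalled = false;

    ListStream(List<String> tokens) { this.tokens = tokens; }

    public void reset() { it = tokens.iterator(); endCalled = false; }
    public String next() { return it.hasNext() ? it.next() : null; }
    public void end() { endCalled = true; }
}

// Caching wrapper: the first next() drains the wrapped stream and ends it.
class CachingStream implements SimpleStream {
    private final SimpleStream input;
    private List<String> cache;      // null until the first next()
    private Iterator<String> replay;

    CachingStream(SimpleStream input) { this.input = input; }

    public void reset() {
        if (cache == null) {
            input.reset();             // first pass: prepare the wrapped stream
        } else {
            replay = cache.iterator(); // later passes: replay from the cache
        }
    }

    public String next() {
        if (cache == null) {
            cache = new ArrayList<>();
            for (String t = input.next(); t != null; t = input.next()) {
                cache.add(t);
            }
            input.end();               // the wrapped stream is ended here
            replay = cache.iterator();
        }
        return replay.hasNext() ? replay.next() : null;
    }

    public void end() { /* nothing to clean up in this sketch */ }
}
```

In this model, cleanup placed in the wrapped stream's end() still runs even though the consumer only calls reset() and next() on the wrapper.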

--
Steve
www.lucidworks.com

Re: QueryParser changes query by itself [solved]

2017-08-22 Thread Bernd Fehling


Re: QueryParser changes query by itself

2017-08-16 Thread Yonik Seeley
The queryCache shouldn't be involved, this is somehow an issue in
parsing (and Solr doesn't currently cache parsing).
Perhaps there is something shared in your SynonymQParser instances
that isn't quite thread safe?
It could also be something in the text analysis in lucene as well
(related to the new graph stuff?)
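
One way shared, non-thread-safe state could yield a term that never appeared in the query string can be sketched as follows. This is a hypothetical illustration, not the actual SynonymQParser or any Lucene filter: LeakyExpander keeps its working state in a static field shared by all instances, while SafeExpander keeps it per instance. The interleaving of two requests is replayed by hand on a single thread so the outcome is deterministic.

```java
class SharedStateDemo {
    // Hypothetical component whose working state is shared across instances.
    static class LeakyExpander {
        private static String pending;          // shared by every request

        void accept(String term) { pending = term; }
        String emit() { return pending; }
    }

    // Same component with per-instance state.
    static class SafeExpander {
        private String pending;                 // owned by a single request

        void accept(String term) { pending = term; }
        String emit() { return pending; }
    }

    // Replays the interleaving: A accepts, B accepts, then A emits.
    static String leakyResultForA() {
        LeakyExpander a = new LeakyExpander();
        LeakyExpander b = new LeakyExpander();
        a.accept("waffenhandel");
        b.accept("rss");                        // request B overwrites A's state
        return a.emit();
    }

    static String safeResultForA() {
        SafeExpander a = new SafeExpander();
        SafeExpander b = new SafeExpander();
        a.accept("waffenhandel");
        b.accept("rss");                        // touches only B's own state
        return a.emit();
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + leakyResultForA()); // leaky: rss
        System.out.println("safe:  " + safeResultForA());  // safe:  waffenhandel
    }
}
```

Under real concurrency the same overwrite would happen only occasionally, which would fit a symptom that shows up mainly at higher request rates.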

-Yonik




Re: QueryParser changes query by itself

2017-08-16 Thread Bernd Fehling
My class SynonymQParser, which calls SolrQueryParserBase.parse:

class SynonymQParser extends QParser {
    protected SolrQueryParser sqparser;
    ...
    @Override
    public Query parse() throws SyntaxError {
        ...
        sqparser = new SolrQueryParser(this, defaultField);
        sqparser.setEnableGraphQueries(false);
        sqparser.setEnablePositionIncrements(false);
        ...
        Query synquery = sqparser.parse(qstr);
        ...

And this is SolrQueryParserBase with its parse method:

public abstract class SolrQueryParserBase extends QueryBuilder {
    ...
    public Query parse(String query) throws SyntaxError {
        ReInit(new FastCharStream(new StringReader(query)));
        try {
            // TopLevelQuery is a Query followed by the end-of-input (EOF)
            // pass null so we can tell later if an explicit field was provided or not
            Query res = TopLevelQuery(null);
            return res != null ? res : newBooleanQuery().build();
        }
        ...


The String variable "query" going into the parse method is always
"textth:waffenhandel"!
With a breakpoint at "return", the Query variable "res" sometimes changes to a
TermQuery with term "textth:rss" instead of being a SynonymQuery.

This is strange!

What does ReInit do right before the try block? Is it a cache lookup?

Or is the problem in TopLevelQuery?
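
To check outside the debugger whether the same input really yields different parse results under load, a small harness along these lines could be used. It is only a sketch: parse is a hypothetical stand-in (a plain Function<String, String>) for whatever wraps the real parser call, and the thread and iteration counts are arbitrary.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class ParseStabilityCheck {
    // Runs parse(input) from several threads and returns every distinct result seen.
    static Set<String> distinctResults(Function<String, String> parse, String input,
                                       int threads, int callsPerThread)
            throws InterruptedException {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    seen.add(parse.apply(input));
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("parse calls did not finish in time");
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder parser: a pure function, so a single distinct result is expected.
        Function<String, String> parse = q -> "+" + q;
        Set<String> results = distinctResults(parse, "textth:waffenhandel", 8, 1000);
        System.out.println(results.size() + " distinct result(s): " + results);
    }
}
```

More than one distinct result for a fixed input would point at state leaking between calls rather than at the input itself.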

Regards
Bernd




Re: QueryParser changes query by itself

2017-08-16 Thread Bernd Fehling
Hi Ahmet,

Thank you for your reply. I was also looking at the QueryCache, but
with your hint about LUCENE-3758 I have a better starting point.

When the system is under high load and the QueryCache is filled, I see
a higher rate of changed queries.
In debug mode, the "timing-->process-->query" value of changed queries is
always "0" (zero).

The query parser "SynonymQParser" is self-developed and uses QParserPlugin.
It does no caching internally and has worked for years.
It was only recompiled against recent Lucene/Solr, with some modifications such
as using Builder with newer Lucene versions.

I will test without the query cache.
Which one should be disabled, the Query Result Cache?

Regards
Bernd




Re: QueryParser changes query by itself

2017-08-15 Thread Ahmet Arslan
Hi Bernd,

In LUCENE-3758, a new member field was added to the ComplexPhraseQuery class,
but we didn't change its hashCode method accordingly. This caused anomalies in
Solr, and Yonik found the bug and fixed hashCode. Your e-mail somehow reminded
me of this.
Could it be the QueryCache and the hashCode implementation of Query
subclasses?
Maybe your good and bad examples produce the same hashCode, and this is
confusing the query cache in Solr?
Can you disable the query cache to test it?
By the way, which query parser are you using? I believe SynonymQuery is 
produced by BM25 similarity, right?
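
The failure mode suspected above, a query class whose equals/hashCode ignore one of its fields, can be illustrated with a small stand-alone sketch. StaleQuery, FixedQuery, and the HashMap used as a cache are hypothetical and are not the real ComplexPhraseQuery or Solr's caches; the point is only that two different queries can hit the same cache entry when a newly added field is left out of equals and hashCode.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class HashCodeCacheDemo {
    // equals/hashCode were written before 'inOrder' existed and never updated.
    static final class StaleQuery {
        final String text;
        final boolean inOrder;   // newer field, ignored below

        StaleQuery(String text, boolean inOrder) {
            this.text = text;
            this.inOrder = inOrder;
        }

        @Override public boolean equals(Object o) {
            return o instanceof StaleQuery && ((StaleQuery) o).text.equals(text);
        }

        @Override public int hashCode() { return text.hashCode(); }
    }

    // Same class with every field taking part in equals/hashCode.
    static final class FixedQuery {
        final String text;
        final boolean inOrder;

        FixedQuery(String text, boolean inOrder) {
            this.text = text;
            this.inOrder = inOrder;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof FixedQuery)) return false;
            FixedQuery q = (FixedQuery) o;
            return inOrder == q.inOrder && text.equals(q.text);
        }

        @Override public int hashCode() { return Objects.hash(text, inOrder); }
    }

    static String lookupWithStaleKeys() {
        Map<StaleQuery, String> cache = new HashMap<>();
        cache.put(new StaleQuery("arms trade", true), "results for ordered phrase");
        // A different query (inOrder = false) wrongly hits the cached entry.
        return cache.get(new StaleQuery("arms trade", false));
    }

    static String lookupWithFixedKeys() {
        Map<FixedQuery, String> cache = new HashMap<>();
        cache.put(new FixedQuery("arms trade", true), "results for ordered phrase");
        // The differing field is now part of the key, so this is a cache miss.
        return cache.get(new FixedQuery("arms trade", false));
    }

    public static void main(String[] args) {
        System.out.println("stale keys: " + lookupWithStaleKeys());
        System.out.println("fixed keys: " + lookupWithFixedKeys());
    }
}
```

With the stale keys the lookup returns the entry cached for the other query; with the fixed keys it returns null.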

Ahmet



QueryParser changes query by itself

2017-08-11 Thread Bernd Fehling
We just noticed a very strange problem with the Solr 6.4.2 QueryParser.
The QueryParser changes the query by itself from time to time.
This happens when a search request is reloaded several times at a high rate.

Good example:
...
textth:waffenhandel
  
...
textth:waffenhandel
textth:waffenhandel
  +SynonymQuery(Synonym(textth:"arms sales" textth:"arms trade"...
  +Synonym(textth:"arms sales" textth:"arms trade"...


Bad example:
...
textth:waffenhandel
  
...
textth:waffenhandel
textth:waffenhandel
  +textth:rss
  +textth:rss

As you can see in the bad example, after several reloads the parsedquery
changed to the term "rss".
But the original querystring contains no "rss" substring at all. That is really
strange.

Has anyone seen this before?

Single index, Solr 6.4.2.

Regards
Bernd