Update to Solr 6 - Amazon EC2 high CPU SYS usage

2017-04-27 Thread Elodie Sannier

Hello,

We have migrated from Solr 5.4.1 to Solr 6.4.0 on Amazon EC2 and we are
seeing high CPU SYS usage, which drastically decreases Solr performance.

The JVM version (java-1.8.0-openjdk-1.8.0.131-0.b11.el6_9.x86_64), the
Jetty version (9.3.14) and the OS version (CentOS 6.9) have not changed
with the Solr upgrade.

Using the "strace" command we found a lot of "clock_gettime"
(gettimeofday) calls when Solr is started.

The clocksource on Amazon VMs is "xen" and, according to this blog post,
it makes these system calls much slower:
https://blog.packagecloud.io/eng/2017/03/08/system-calls-are-much-slower-on-ec2/


We have updated the clocksource to "tsc" and it fixes the issue.
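For reference, on a Linux guest the current and available clocksources can be inspected and switched through the standard sysfs paths (the echo below requires root and does not persist across reboots):

```shell
# Show the clocksource in use and the alternatives the kernel offers
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
cat /sys/devices/system/clocksource/clocksource0/available_clocksource

# Switch to tsc at runtime; add clocksource=tsc to the kernel boot
# parameters to make the change permanent
echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
```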

Is there a change between Solr 5.4.1 and 6.4.0 that would cause the JVM
to make many more gettimeofday calls?

Elodie

Kelkoo SAS
Société par Actions Simplifiée
Au capital de € 4.168.964,30
Siège social : 158 Ter Rue du Temple 75003 Paris
425 093 069 RCS Paris

Ce message et les pièces jointes sont confidentiels et établis à l'attention 
exclusive de leurs destinataires. Si vous n'êtes pas le destinataire de ce 
message, merci de le détruire et d'en avertir l'expéditeur.


Re: SolrIndexSearcher accumulation

2017-04-19 Thread Elodie Sannier

Yes, I didn't copy all of our code, but we also call extraReq.close() in a
finally block. That was not the problem.

On 04/19/2017 11:53 AM, Mikhail Khludnev wrote:

If you create SolrQueryRequest make sure you close it then, since it's
necessary to release a searcher.
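The pattern Mikhail describes can be sketched with plain Java: whoever creates the request must guarantee a close on every path, typically with try/finally. The Request class below is a hypothetical stand-in for SolrQueryRequest, not a Solr API.

```java
// Sketch: a request that must be closed to release the resource it holds.
public class CloseOnEveryPath {

    static class Request implements java.io.Closeable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    public static void main(String[] args) {
        Request extraReq = new Request();
        try {
            // ... run the extra query and read its results ...
        } finally {
            extraReq.close(); // releases the searcher reference even on error
        }
        System.out.println(extraReq.closed); // prints "true"
    }
}
```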

On Wed, Apr 19, 2017 at 12:35 PM, Elodie Sannier <elodie.sann...@kelkoo.fr>
wrote:


Hello,

We have found how to fix the problem.
When we update the original SolrQueryResponse object, we need to create
a new BasicResultContext object with the extra response.

Simplified code:

public class CustomSearchHandler extends SearchHandler {

  public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
      throws Exception {

    SolrQueryRequest extraReq = createExtraRequest();
    SolrQueryResponse extraRsp = new SolrQueryResponse();

    super.handleRequestBody(extraReq, extraRsp);

    ResultContext extraRc = (ResultContext) extraRsp.getResponse();

    // code with memory leak !!
    // rsp.addResponse(extraRc);

    // code without memory leak
    ResultContext extraRcClone = new BasicResultContext(extraRc.getDocList(),
        rsp.getReturnFields(), req.getSearcher(), extraRc.getQuery(), req);
    rsp.addResponse(extraRcClone);
  }
}

We don't know why we need to create a new BasicResultContext to properly
manage searchers. Do you know why?

Elodie






Re: SolrIndexSearcher accumulation

2017-04-19 Thread Elodie Sannier

Hello,

We have found how to fix the problem.
When we update the original SolrQueryResponse object, we need to create
a new BasicResultContext object with the extra response.

Simplified code:

public class CustomSearchHandler extends SearchHandler {

  public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
      throws Exception {

    SolrQueryRequest extraReq = createExtraRequest();
    SolrQueryResponse extraRsp = new SolrQueryResponse();

    super.handleRequestBody(extraReq, extraRsp);

    ResultContext extraRc = (ResultContext) extraRsp.getResponse();

    // code with memory leak !!
    // rsp.addResponse(extraRc);

    // code without memory leak
    ResultContext extraRcClone = new BasicResultContext(extraRc.getDocList(),
        rsp.getReturnFields(), req.getSearcher(), extraRc.getQuery(), req);
    rsp.addResponse(extraRcClone);
  }
}

We don't know why we need to create a new BasicResultContext to properly
manage searchers. Do you know why?

Elodie

On 04/07/2017 04:14 PM, Rick Leir wrote:

Hi Gerald
The best solution in my mind is to look at the custom code and try to find a 
way to remove it from your system. Solr queries can be complex, and I hope 
there is a way to get the results you need. Would you like to say what results 
you want to get, and what Solr queries you have tried?
I realize that in large organizations it is difficult to suggest change.
Cheers -- Rick

On April 7, 2017 9:08:19 AM EDT, Shawn Heisey  wrote:

On 4/7/2017 3:09 AM, Gerald Reinhart wrote:

We have some custom code that extends SearchHandler to be able to:

 - do an extra request
 - merge/combine the original request and the extra request results

On Solr 5.x, our code was working very well; now with Solr 6.x we
have the following issue: the number of SolrIndexSearcher instances is
increasing (we can see them in the admin view > Plugins / Stats > Core).

As SolrIndexSearcher instances accumulate, we have the following issues:
 - the memory used by Solr is increasing => OOM after a long
period of time in production
 - some files in the index have been deleted from the file system but
the Solr JVM still holds them => ("fake") full disk after a long period
of time in production

We are wondering,
 - what has changed between Solr 5.x and Solr 6.x in the
management of the SolrIndexSearcher?
 - what would be the best way, in a Solr plugin, to perform 2
queries and merge the results into a single SolrQueryResponse?

I hesitated to send a reply because when it comes right down to it, I
do
not know a whole lot about deep Solr internals.  I tend to do my work
with the code at a higher level, and don't dive down in the depths all
that often.  I am slowly learning, though.  You may need to wait for a
reply from someone who really knows those internals.

It looks like you and I participated in a discussion last month where
you were facing a similar problem with searchers -- deleted index files
being held open.  How did that turn out?  Seems like if that problem
were solved, it would also solve this problem.

Very likely, the fact that the plugin worked correctly in 5.x was
actually a bug in Solr related to reference counting, one that has been
fixed in later versions.

You may need to use a paste website or a file-sharing website to share
all your plugin code so that people can get a look at it.  The list has
a habit of deleting attachments.

Thanks,
Shawn





Re: [Migration Solr5 to Solr6] Unwanted deleted files references

2017-03-20 Thread Elodie Sannier

We have found a workaround: we close the searchers after checking the
current index version, and now the SolrCore no longer has many open
searchers.

However, we have fewer unwanted deleted file references, but we still
have some.

We have two collections fr_blue, fr_green with aliases:
fr -> fr_blue
fr_temp -> fr_green

The fr collection receives the queries, the fr_temp collection does not
receive the queries.

The problem occurs when we are doing the following sequence:
1- swap aliases (create alias fr -> fr_green and fr_temp -> fr_blue for
example)
2- reload collection with fr_temp alias (fr_blue for example)
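For illustration, with the standard Collections API the sequence above corresponds to calls like the following (host and port are assumptions; CREATEALIAS on an existing alias name simply repoints it):

```shell
# 1- swap the aliases: fr -> fr_green, fr_temp -> fr_blue
curl 'http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=fr&collections=fr_green'
curl 'http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=fr_temp&collections=fr_blue'

# 2- reload the collection now behind the fr_temp alias
curl 'http://localhost:8983/solr/admin/collections?action=RELOAD&name=fr_blue'
```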

We suspect that there is a problem with the reload of a collection that
received traffic until the alias swap but no longer receives any since
then.
Perhaps a problem with the increment / decrement of the searcher
reference count?

Elodie

On 03/14/2017 06:42 PM, Shawn Heisey wrote:

On 3/14/2017 10:23 AM, Elodie Sannier wrote:

The request close() method decrements the reference count on the
searcher.

 From what I could tell, that method decrements the reference counter,
but does not actually close the searcher object.  I cannot tell you what
the correct procedure is to make sure that all resources are properly
closed at the proper time.  This might be a bug, or there might be
something missing from your code.  I do not know which.

Thanks,
Shawn






Re: [Migration Solr5 to Solr6] Unwanted deleted files references

2017-03-14 Thread Elodie Sannier

The request close() method decrements the reference count on the searcher.

public abstract class SolrQueryRequestBase implements SolrQueryRequest,
Closeable {

  // The index searcher associated with this request
  protected RefCounted<SolrIndexSearcher> searcherHolder;

  public void close() {
    if (this.searcherHolder != null) {
      this.searcherHolder.decref();
      this.searcherHolder = null;
    }
  }
}

RefCounted keeps track of a reference count on the searcher and closes
it when the count hits zero.

public abstract class RefCounted<Type> {
  ...
  public void decref() {
    if (refcount.decrementAndGet() == 0) {
      close();
    }
  }
}
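As a self-contained illustration of the same pattern (a simplified sketch, not Solr's actual class), the resource is only closed when the last holder has called decref():

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified sketch of the RefCounted pattern used around SolrIndexSearcher.
public class RefCountDemo {

    static class RefCounted<T> {
        final T resource;
        final AtomicInteger refcount = new AtomicInteger(1); // owner's reference
        boolean closed = false;

        RefCounted(T resource) { this.resource = resource; }
        RefCounted<T> incref() { refcount.incrementAndGet(); return this; }
        void decref() {
            if (refcount.decrementAndGet() == 0) {
                closed = true; // here Solr would actually close the searcher
            }
        }
    }

    public static void main(String[] args) {
        RefCounted<String> holder = new RefCounted<>("searcher");
        holder.incref();   // req.getSearcher() takes a reference
        holder.decref();   // req.close() releases it
        System.out.println(holder.closed); // prints "false": core still holds it
        holder.decref();   // core drops its own reference
        System.out.println(holder.closed); // prints "true": resource closed
    }
}
```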

We assume that calling req.getSearcher() increases the reference count,
and that after we are done with the searcher we have to call close(),
which calls decref() to decrease the reference count.

But that does not seem to be enough, or maybe there is a bug in Solr in
this case?

Elodie

On 03/14/2017 03:02 PM, Shawn Heisey wrote:

On 3/14/2017 3:08 AM, Gerald Reinhart wrote:

Hi,
The custom code we have is something like this:

public class MySearchHandler extends SearchHandler {

  @Override
  public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
      throws Exception {
    SolrIndexSearcher searcher = req.getSearcher();
    try {
      // Do stuff with the searcher
    } finally {
      req.close();
    }
  }
}

Despite the fact that we always close the request each time we get a
SolrIndexSearcher from it, the number of SolrIndexSearcher instances keeps
increasing. Each time a new commit is done on the index, a new searcher is
created (this is normal) but the old one remains. Is there something wrong
with this custom code?

My understanding of Solr and Lucene internals is rudimentary, but I
might know what's happening here.

The code closes the request, but never closes the searcher.  Searcher
objects include a Lucene object that holds onto the index files that
pertain to that view of the index.  The searcher must be closed.

It does look like if you close the searcher and then close the request,
that might be enough to fully decrement all the reference counters
involved, but I do not know the code well enough to be sure of that.

Thanks,
Shawn






Re: [Migration Solr5 to Solr6] Unwanted deleted files references

2017-03-07 Thread Elodie Sannier

Thank you Alex for your answer.

The references to deleted files are only index files (with .fdt, .doc,
.dvd, ... extensions).

sudo lsof | grep DEL
java    1366  kookel  DEL  REG  253,8  15360013
/opt/kookel/data/searchSolrNode/solrindex/fr1_green/index/_2508z.cfs
java    1366  kookel  DEL  REG  253,8  15360035
/opt/kookel/data/searchSolrNode/solrindex/fr1_green/index/_25091.fdt
java    1366  kookel  DEL  REG  253,8  15425603
/opt/kookel/data/searchSolrNode/solrindex/fr1_green/index/_25091_Lucene50_0.tim
java    1366  kookel  DEL  REG  253,8  11624982
/opt/kookel/data/searchSolrNode/solrindex/fr1_green/index/_2508y.fdt
...
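A quick way to track whether a fix helps is to count these entries over time (pid 1366 taken from the output above):

```shell
# Number of deleted-but-still-mapped files held by the Solr JVM
sudo lsof -p 1366 | grep -c ' DEL '
```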

We tried optimizing the collection from the Solr Admin UI, but it had no
effect.

Elodie

On 03/07/2017 04:11 PM, Alexandre Rafalovitch wrote:

More sanity checks: what are the extensions/types of the files that
are not deleted?

If they are index files, optimize command (even if no longer
recommended for production) should really blow all the old ones away.
So, are they other kinds of files?

Regards,
Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 7 March 2017 at 09:55, Erick Erickson <erickerick...@gmail.com> wrote:

Just as a sanity check, if you restart the Solr JVM, do the files
disappear from disk?

Do you have any custom code anywhere in this chain? If so, do you open
any searchers but
fail to close them? Although why 6.4 would manifest the problem but
other code wouldn't
is a mystery, just another sanity check.

Best,
Erick

On Tue, Mar 7, 2017 at 6:44 AM, Elodie Sannier <elodie.sann...@kelkoo.fr> wrote:

Hello,

We have migrated from Solr 5.4.1 to Solr 6.4.0 and the disk usage has
increased.
We found hundreds of references to deleted index files still held by Solr.
Before the migration we had 15-30% of disk space used; after the migration
we have 60-90% of disk space used.

We are using Solr Cloud with 2 collections.

The commands applied on the collections are:
- for incremental indexation mode: add, deleteById with commitWithin of 30
minutes
- for full indexation mode: add, deleteById, commit
- for switch between incremental and full mode: deleteByQuery, createAlias,
reload
- there is also an autocommit every 15 minutes

We have seen the email "Solr leaking references to deleted files"
(2016-05-31) which describes the same problem, but the bugs mentioned there
are fixed.

We manually tried to force a commit, a reload and an optimize on the
collections without effect.

Is it a problem of configuration (merge / delete policy) or a possible
regression in the Solr code?

Thank you





--

Elodie Sannier
Software engineer

Kelkoo <http://www.kelkoo.com/>

E: elodie.sann...@kelkoo.fr  Skype: kelkooelodies
T: +33 (0)4 56 09 07 55
A: Parc Sud Galaxie, 6, rue des Méridiens, 38130 Echirolles




Re: [Migration Solr5 to Solr6] Unwanted deleted files references

2017-03-07 Thread Elodie Sannier

Thank you Erick for your answer.

The files are deleted even without a JVM restart, but they are still
reported as DELETED by the kernel (i.e. still held open).

We have custom code, and for the migration to Solr 6.4.0 we added new
code that calls req.getSearcher() but without a matching "close".
We will decrement the reference count on the searcher (to prevent the
searcher from remaining open after a commit) and see if it fixes the
problem.

Elodie

On 03/07/2017 03:55 PM, Erick Erickson wrote:

Just as a sanity check, if you restart the Solr JVM, do the files
disappear from disk?

Do you have any custom code anywhere in this chain? If so, do you open
any searchers but
fail to close them? Although why 6.4 would manifest the problem but
other code wouldn't
is a mystery, just another sanity check.

Best,
Erick

On Tue, Mar 7, 2017 at 6:44 AM, Elodie Sannier <elodie.sann...@kelkoo.fr> wrote:

Hello,

We have migrated from Solr 5.4.1 to Solr 6.4.0 and the disk usage has
increased.
We found hundreds of references to deleted index files still held by Solr.
Before the migration we had 15-30% of disk space used; after the migration
we have 60-90% of disk space used.

We are using Solr Cloud with 2 collections.

The commands applied on the collections are:
- for incremental indexation mode: add, deleteById with commitWithin of 30
minutes
- for full indexation mode: add, deleteById, commit
- for switch between incremental and full mode: deleteByQuery, createAlias,
reload
- there is also an autocommit every 15 minutes

We have seen the email "Solr leaking references to deleted files"
(2016-05-31) which describes the same problem, but the bugs mentioned there
are fixed.

We manually tried to force a commit, a reload and an optimize on the
collections without effect.

Is it a problem of configuration (merge / delete policy) or a possible
regression in the Solr code?

Thank you









[Migration Solr5 to Solr6] Unwanted deleted files references

2017-03-07 Thread Elodie Sannier

Hello,

We have migrated from Solr 5.4.1 to Solr 6.4.0 and the disk usage has
increased.
We found hundreds of references to deleted index files still held by Solr.
Before the migration we had 15-30% of disk space used; after the migration
we have 60-90% of disk space used.

We are using Solr Cloud with 2 collections.

The commands applied on the collections are:
- for incremental indexation mode: add, deleteById with commitWithin of 30 
minutes
- for full indexation mode: add, deleteById, commit
- for switch between incremental and full mode: deleteByQuery, createAlias, 
reload
- there is also an autocommit every 15 minutes

We have seen the email "Solr leaking references to deleted files"
(2016-05-31) which describes the same problem, but the bugs mentioned there
are fixed.

We manually tried to force a commit, a reload and an optimize on the 
collections without effect.

Is it a problem of configuration (merge / delete policy) or a possible
regression in the Solr code?

Thank you




Update to solr 5 - custom coordination factor implementation issue

2016-02-02 Thread Elodie Sannier

Hello,

We are using Solr 4.10.4 and we want to update to 5.4.1.

With Solr 4.10.4:
- we extend BooleanQuery with a custom class in order to change the
coordination factor behaviour (the coord method), but with Solr 5.4.1 this
computation no longer seems to be done by BooleanQuery
- in order to use our implementation, we extend ExtendedSolrQueryParser
with a custom class and override the methods newBooleanClause and
getBooleanQuery

How can we do this with Solr 5.4.1?

Elodie Sannier





Update to solr 5 - custom phrase query implementation issue

2016-02-02 Thread Elodie Sannier

Hello,

We are using Solr 4.10.4 and we want to update to 5.4.1.

With Solr 4.10.4:
- we extend PhraseQuery with a custom class in order to remove some
terms from phrase queries with phrase slop (overriding the add(Term term,
int position) method)
- in order to use our implementation, we extend ExtendedSolrQueryParser
with a custom class and override the method newPhraseQuery, but with
Solr 5 this method no longer exists

How can we do this with Solr 5.4.1?

Elodie Sannier



SolrCloud - ResultContext versus SolrDocumentList in distributed mode

2014-02-10 Thread Elodie Sannier

Hello,

I am using SolrCloud 4.5.1 with one shard and three replicas and I am
using the distributed mode.

I am using a custom SearchHandler which makes two sub-queries and merges
the responses.
When I merge the SolrQueryResponse objects I do the following casting :
SolrDocumentList firstResponseSDL = (SolrDocumentList)
firstResponse.getValues().get(Constants.RESPONSE);
SolrDocumentList secondResponseSDL = (SolrDocumentList)
secondResponse.getValues().get(Constants.RESPONSE);

Sometimes (not often), I have a ClassCastException only for the casting
of the second response:
java.lang.ClassCastException: org.apache.solr.response.ResultContext
cannot be cast to org.apache.solr.common.SolrDocumentList

Correct me if I am wrong, but I thought the response type was always
SolrDocumentList in distributed mode and ResultContext in non-distributed
mode.

In which case, in distributed mode, can the response of the first
sub-query be an instance of SolrDocumentList and that of the second
sub-query an instance of ResultContext?

Elodie Sannier



Re: Possible regression for Solr 4.6.0 - commitWithin does not work with replicas

2014-01-29 Thread Elodie Sannier

I have this configuration on test servers (the order in which the
instances start leads to this configuration), not in production.
Elodie

On 01/23/2014 04:35 PM, Shawn Heisey wrote:

On 12/11/2013 2:41 AM, Elodie Sannier wrote:

collection fr_blue:
- shard1 - server-01 (replica1), server-01 (replica2)
- shard2 - server-02 (replica1), server-02 (replica2)

collection fr_green:
- shard1 - server-01 (replica1), server-01 (replica2)
- shard2 - server-02 (replica1), server-02 (replica2)

I'm pretty sure this won't affect the issue you've mentioned, but it's
worth pointing out.

If this is really how you've arranged your shard replicas, your system
cannot survive a failure, because you've got both replicas for each
shard on the same server.  If that server dies, half of each collection
will be gone.

Thanks,
Shawn






Possible regression for Solr 4.6.0 - commitWithin does not work with replicas

2013-12-11 Thread Elodie Sannier

Hello,

I am using SolrCloud 4.6.0 with two shards, two replicas by shard and with two 
collections.

collection fr_blue:
- shard1 - server-01 (replica1), server-01 (replica2)
- shard2 - server-02 (replica1), server-02 (replica2)

collection fr_green:
- shard1 - server-01 (replica1), server-01 (replica2)
- shard2 - server-02 (replica1), server-02 (replica2)

I add documents using solrj CloudSolrServer and using commitWithin feature :
int commitWithinMs = 3;
SolrServer server = new CloudSolrServer(zkHost);
server.add(doc, commitWithinMs);

When I query an instance, for 5 indexed documents, the numFound value
changes on each call: randomly 0, 1, 4 or 5.
When I query the instances with distrib=false, I have:
- leader shard1: numFound=1
- leader shard2: numFound=4
- replica shard1: numFound=0
- replica shard1: numFound=0

The documents are not committed on the replicas, even after waiting more
than 30 seconds.

If I force a commit using http://server-01:8080/solr/update/?commit=true,
the documents are committed on the replicas and numFound=5.
I suppose that the leader forwards the documents to the replicas, but they
are not committed.

Is this a new bug with the commitWithin feature in distributed mode?

This problem does not occur with the version 4.5.1.

Elodie Sannier




SolrCloud 4.6.0 - leader election issue

2013-12-09 Thread Elodie Sannier
, moving to the next candidate
2013-12-06 21:27:58,732 [coreLoadExecutor-4-thread-2] INFO  
org.apache.solr.cloud.ShardLeaderElectionContext:runLeaderProcess:224  - Sync 
was not a success but no one else is active! I am the leader
2013-12-06 21:27:58,736 [coreLoadExecutor-4-thread-2] INFO  
org.apache.solr.cloud.ShardLeaderElectionContext:runLeaderProcess:251  - I am 
the new leader: 
http://dc1-vt-dev-xen-06-vm-07.dev.dc1.kelkoo.net:8080/searchsolrnodefr/fr_green/
 shard1

Is it a bug with the leader election?

This problem does not occur:
- with version 4.5.1
- if I start the four Solr instances with a delay between them (about 15
seconds)
- if I configure only one collection
- if I have only one replica per shard

Elodie Sannier


Unexpected value for boolean field in FunctionQuery

2013-09-10 Thread Elodie Sannier

Hello,

I am using Solr 4.4.0. When I'm using a FunctionQuery with boolean fields,
it seems that the default field value is true for documents without a value
in the field.

The page http://wiki.apache.org/solr/FunctionQuery#field says "0 is
returned for documents without a value in the field", so we could expect
the field value to be false.

Starting from the "SolrCloud - Getting Started" page with the document
exampledocs/ipod_video.xml, removing the boolean field inStock
(<field name="inStock">true</field>) demonstrates the problem.
When requesting with bf=if(inStock,10,0):
curl -sS 'http://localhost:8983/solr/select?q=*:*&bf=if%28inStock,10,0%29&defType=edismax&debugQuery=true'
The result indicates that the value of the boolean field inStock is seen as
true:
7.071068 = (MATCH) FunctionQuery(if(bool(inStock),const(10),const(0))), product
of:
  10.0 = if(bool(inStock)=true,const(10),const(0))
  1.0 = boost
  0.70710677 = queryNorm

Same behaviour using a FunctionQuery via the LocalParams syntax:
http://localhost:8983/solr/select?q={!func}if%28inStock,10,0%29&debugQuery=true
10.0 = (MATCH) FunctionQuery(if(bool(inStock),const(10),const(0))), product of:
  10.0 = if(bool(inStock)=true,const(10),const(0))
  1.0 = boost
  1.0 = queryNorm

Is that expected ?
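For what it's worth, a guard that sidesteps the default entirely (my suggestion, not something confirmed in this thread; it assumes the standard exists() function query from the 4.x function-query set): test for the field's presence before reading it, so documents without a value in inStock explicitly take the else branch:

```
bf=if(exists(inStock),if(inStock,10,0),0)
```

With this form the boost is 10 only for documents that both have the field and have it set to true; everything else scores 0, whatever bool() defaults to for missing values.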

Elodie Sannier


Kelkoo SAS
Société par Actions Simplifiée
Au capital de € 4.168.964,30
Siège social : 8, rue du Sentier 75002 Paris
425 093 069 RCS Paris

This message and its attachments are confidential and intended exclusively for their addressees. If you are not the intended recipient of this message, please delete it and notify the sender.


Re: Unexpected value for boolean field in FunctionQuery

2013-09-10 Thread Elodie Sannier

I didn't forget to commit my changes.
I used commands:
java -Durl=http://localhost:8983/solr/collection1/update -jar post.jar
ipod_video.xml
curl 'http://localhost:8983/solr/collection1/update/?commit=true'

When I use your url example
http://localhost:8983/solr/select?q=*:*&rows=100&fl=id,inStock,if%28inStock,10,0%29&debugQuery=true
I have :
<long name="if(inStock,10,0)">10</long>
(and my document does not have the inStock field)

Elodie

On 09/10/2013 03:54 PM, Yonik Seeley wrote:

I just tried a simple test with the example data, and things seem to
be working fine...

I tried this:
http://localhost:8983/solr/select
   ?q=*:*
   &rows=100
   &fl=id,inStock,if(inStock,10,0)

I saw values of 10 when inStock==true and values of 0 when it was
missing or explicitly false.
Perhaps you forgot to commit your changes when you removed the inStock
field from one of the example docs?

-Yonik
http://lucidworks.com









Re: Unexpected value for boolean field in FunctionQuery

2013-09-10 Thread Elodie Sannier

By the way Yonik, which version do you use (4.4.0 or nightly) ?

Elodie



Re: SolrCloud: no timing when no result in distributed mode

2013-08-27 Thread Elodie Sannier

Hello,

I'm using the 4.4.0 version but I still have the problem.
Should I create a JIRA issue for it ?

Elodie



Re: XInclude and Document Entity not working on schema.xml

2013-07-24 Thread Elodie Sannier

I'm using java-1.7.0-openjdk-1.7.0.3-2.1.el6.1.x86_64 and
tomcat6-6.0.24-48.el6_3.noarch.

I tested with the 4.4 solr version but I still have the bug.

Elodie



Re: XInclude and Document Entity not working on schema.xml

2013-07-23 Thread Elodie Sannier

Hello Chris,

Thank you for your help.

I checked differences between my files and your test files but I didn't
find bugs in my files.

All my files are in the same directory: collection1/conf

= schema.xml content:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE schema [
<!ENTITY commonschema_types SYSTEM "commonschema_types.xml">
<!ENTITY commonschema_others SYSTEM "commonschema_others.xml">
]>
<schema name="searchSolrSchema" version="1.5">

  <types>

    <fieldType name="text_stemmed" class="solr.TextField"
        positionIncrementGap="100" omitNorms="true">
      <!-- FR : french -->
      <!-- least aggressive stemming -->
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="com.kelkoo.search.solr.plugins.stemmer.fr.KelkooFrenchMinimalStemFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="com.kelkoo.search.solr.plugins.stemmer.fr.KelkooFrenchMinimalStemFilterFactory"/>
      </analyzer>
    </fieldType>

    &commonschema_types;

  </types>

  &commonschema_others;

</schema>

= commonschema_types.xml content:

<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>

<!-- int is for exact ids, works with grouped=true and distrib=true -->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0"
    sortMissingLast="true" omitNorms="true" positionIncrementGap="0"/>

<!-- tint is for numbers that need sorting and/or range queries
     (precisionStep="4" has better performance than precisionStep="8")
     and that do *not* need grouping (grouping does not work in distrib=true for tint) -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="4"
    sortMissingLast="true" omitNorms="true" positionIncrementGap="0"/>

<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="byte" class="solr.ByteField" omitNorms="true"/>
<fieldType name="float" class="solr.TrieFloatField" sortMissingLast="true" omitNorms="true"/>

<!-- A general text field which tokenizes with StandardTokenizer.
     omitNorms="true" means the (index time) lengthNorm will be the
     same whatever the number of tokens. -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>


The &commonschema_others; include works.

Do you see something wrong ?

Unfortunately I cannot use the 4.3.0 version because I'm using solr.xml
sharedLib, which does not work in 4.3.0
(cf. https://issues.apache.org/jira/browse/SOLR-4791).
Where can I find the newly voted 4.4 ?
I have this bug with the nightly 4.5-2013-07-18_06-04-44 found here
https://builds.apache.org/job/Solr-Artifacts-4.x/lastSuccessfulBuild/artifact/solr/package/
(the 18th of july).

Elodie Sannier




XInclude and Document Entity not working on schema.xml

2013-07-18 Thread Elodie Sannier

Hello,

I am using the solr nightly version 4.5-2013-07-18_06-04-44 and I want
to use a Document Entity in schema.xml; I get this exception :
java.lang.RuntimeException: schema fieldtype string(org.apache.solr.schema.StrField) invalid arguments:{xml:base=solrres:/commonschema_types.xml}
  at org.apache.solr.schema.FieldType.setArgs(FieldType.java:187)
  at org.apache.solr.schema.FieldTypePluginLoader.<init>(FieldTypePluginLoader.java:141)
  at org.apache.solr.schema.FieldTypePluginLoader.<init>(FieldTypePluginLoader.java:43)
  at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:190)
  ... 16 more

schema.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE schema [
<!ENTITY commonschema_types SYSTEM "commonschema_types.xml">
]>
<schema name="searchSolrSchema" version="1.5">
  <types>
    <!-- Stuff -->
    &commonschema_types;
  </types>
  <!-- Stuff -->
</schema>

commonschema_types.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
<!-- Stuff -->

The same error appears in this bug (fixed ?):
https://issues.apache.org/jira/browse/SOLR-3087

It works with solr-4.2.1.

//-

I also tried to use the XML XInclude mechanism
(http://en.wikipedia.org/wiki/XInclude) to include parts of schema.xml.

When I try to include a fieldType, I get this exception :
org.apache.solr.common.SolrException: Unknown fieldType 'long' specified on field _version_
  at org.apache.solr.schema.IndexSchema.loadFields(IndexSchema.java:644)
  at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:470)
  at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:164)
  at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
  at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
  at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:267)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:622)
  ... 10 more

The type is not found.

I include 'schema_integration.xml' like this in 'schema.xml' :
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="default" version="1.5">
  <types>
    <!-- Stuff -->
    <xi:include href="commonschema_types.xml"
        xmlns:xi="http://www.w3.org/2001/XInclude"/>
  </types>
  <!-- Stuff -->
  <fields>
    <field name="_version_" type="long" indexed="true" stored="true"
        multiValued="false"/>
    <!-- Stuff -->
  </fields>
</schema>

Is it a bug of the nightly version ?
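One way to narrow this down (my own sketch, not from the thread; it assumes libxml2's xmllint is available and uses throwaway paths under /tmp) is to expand the entity include outside Solr and confirm the XML itself is fine:

```shell
# Recreate a minimal schema.xml plus an entity-included file, then ask
# xmllint to substitute entities (--noent) and check well-formedness.
mkdir -p /tmp/schema-check
cd /tmp/schema-check

cat > commonschema_types.xml <<'EOF'
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
EOF

cat > schema.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE schema [
<!ENTITY commonschema_types SYSTEM "commonschema_types.xml">
]>
<schema name="check" version="1.5">
  <types>&commonschema_types;</types>
</schema>
EOF

# Prints the expanded document; a non-zero exit code means the XML is broken.
xmllint --noent schema.xml
```

If the expansion is clean here but Solr still fails with the xml:base=solrres:/... argument error, that points at the schema loader passing resolver attributes into FieldType.setArgs rather than at the files themselves.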

Elodie Sannier



SolrCloud: no timing when no result in distributed mode

2013-06-21 Thread Elodie Sannier

Hello,

I am using SolrCloud 4.2.1 with two shards. With the debugQuery=true parameter, when a query does not return any documents, the timing debug information is not returned:
curl -sS "http://localhost:8983/solr/select?q=dummy&debugQuery=true" | grep -o '<lst name="timing".*'

If I use the distrib=false parameter, the timing debug information is returned:
curl -sS "http://localhost:8983/solr/select?q=dummy&debugQuery=true&distrib=false" | grep -o '<lst name="timing".*'
<lst name="timing">
  <double name="time">1.0</double>
  <lst name="prepare">
    <double name="time">0.0</double>
    <lst name="query"><double name="time">0.0</double></lst>
    <lst name="facet"><double name="time">0.0</double></lst>
    <lst name="mlt"><double name="time">0.0</double></lst>
    <lst name="highlight"><double name="time">0.0</double></lst>
    <lst name="stats"><double name="time">0.0</double></lst>
    <lst name="debug"><double name="time">0.0</double></lst>
  </lst>
  <lst name="process">
    <double name="time">1.0</double>
    <lst name="query"><double name="time">0.0</double></lst>
    <lst name="facet"><double name="time">0.0</double></lst>
    <lst name="mlt"><double name="time">0.0</double></lst>
    <lst name="highlight"><double name="time">0.0</double></lst>
    <lst name="stats"><double name="time">0.0</double></lst>
    <lst name="debug"><double name="time">1.0</double></lst>
  </lst>
</lst>

Is it a bug of the distributed mode ?

Elodie Sannier


SolrCloud: 500 error with combination of debug and group in distributed search

2013-06-21 Thread Elodie Sannier

Hello,

I am using SolrCloud 4.2.1 with two shards, when I'm grouping on a field
and using the debug parameter in distributed mode, I have a 500 error.

http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity&debug=true
(same with debug=timing, debug=query or debug=results)
<lst name="error">
  <str name="msg">
    Server at http://localhost:8983/solr returned non ok status:500, message:Internal Server Error
  </str>
  <str name="trace">
org.apache.solr.common.SolrException: Server at http://localhost:8983/solr returned non ok status:500, message:Internal Server Error
  at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:373)
  at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
  at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:172)
  at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:135)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722)
  </str>
  <int name="code">500</int>
</lst>

In the logs I have:
2013-06-21 13:26:47,876 [http-8080-5] ERROR org.apache.solr.servlet.SolrDispatchFilter:log:96  - null:java.lang.NullPointerException
  at org.apache.solr.handler.component.DebugComponent.process(DebugComponent.java:56)
  at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:216)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
  at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
  at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
  at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
  at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
  at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
  at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
  at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
  at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
  at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
  at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
  at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
  at java.lang.Thread.run(Thread.java:722)

If I add the distrib=false parameter or if I replace the debug
parameter by the debugQuery=true parameter or if I remove the group
parameters, I don't have the error.

Is it a bug of the distributed mode with the combination of debug and
group ?

Elodie Sannier


Re: SolrCloud: no timing when no result in distributed mode

2013-06-21 Thread Elodie Sannier

Unfortunately I cannot use the 4.3.0 version because I'm using solr.xml
sharedLib which does not work in 4.3.0 (cf.
https://issues.apache.org/jira/browse/SOLR-4791).

Elodie

On 06/21/2013 03:30 PM, James Thomas wrote:

Seems to work fine for me on 4.3.0, maybe you can try a newer version.
4.3.1 is available.






Re: FieldCache insanity with field used as facet and group

2013-06-03 Thread Elodie Sannier

I'm reproducing the problem with the 4.2.1 example with 2 shards.

1) started up solr shards, indexed the example data, and confirmed empty
fieldCaches
[sanniere@funlevel-dx example]$ java
-Dbootstrap_confdir=./solr/collection1/conf
-Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar
[sanniere@funlevel-dx example2]$ java -Djetty.port=7574
-DzkHost=localhost:9983 -jar start.jar

2) used both grouping and faceting on the popularity field, then checked
the fieldcache insanity count
[sanniere@funlevel-dx example]$ curl -sS "http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity" > /dev/null
[sanniere@funlevel-dx example]$ curl -sS "http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=popularity" > /dev/null
[sanniere@funlevel-dx example]$ curl -sS "http://localhost:8983/solr/admin/mbeans?stats=true&key=fieldCache&wt=json&indent=true" | grep -E '"entries_count"|"insanity_count"'
"entries_count":10,
"insanity_count":2,

insanity#0:VALUEMISMATCH: Multiple distinct value objects for
SegmentCoreReader(owner=_g(4.2.1):C1)+popularity\n\t'SegmentCoreReader(owner=_g(4.2.1):C1)'='popularity',class
org.apache.lucene.index.SortedDocValues,0.5=org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#12129794\n\t'SegmentCoreReader(owner=_g(4.2.1):C1)'='popularity',int,null=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#12298774\n\t'SegmentCoreReader(owner=_g(4.2.1):C1)'='popularity',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#12298774\n,
insanity#1:VALUEMISMATCH: Multiple distinct value objects for
SegmentCoreReader(owner=_f(4.2.1):C9)+popularity\n\t'SegmentCoreReader(owner=_f(4.2.1):C9)'='popularity',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#16648315\n\t'SegmentCoreReader(owner=_f(4.2.1):C9)'='popularity',int,null=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#16648315\n\t'SegmentCoreReader(owner=_f(4.2.1):C9)'='popularity',class
org.apache.lucene.index.SortedDocValues,0.5=org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#1130715\n}}},
HIGHLIGHTING,{},
OTHER,{}]}

I've updated https://issues.apache.org/jira/browse/SOLR-4866

Elodie


On 07.05.2013 18:19, Chris Hostetter wrote:

: I am using the Lucene FieldCache with SolrCloud and I have insane instances
: with messages like:

FWIW: I'm the one that named the result of these sanity checks
"FieldCacheInsanity" and I have regretted it ever since -- a better label
would have been "inconsistency"

: VALUEMISMATCH: Multiple distinct value objects for
: SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)+merchantid
: 'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'='merchantid',class org.apache.lucene.index.SortedDocValues,0.5=org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#557711353
: 'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'='merchantid',int,null=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713
: 'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'='merchantid',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713
:
: All insane instances are for a field merchantid of type int used as facet
: and group field.

Interesting: it appears that the grouping code and the facet code are not
being consistent in how they are building the field cache, so you are
getting two objects in the cache for each segment

I haven't checked if this happens much with the example configs, but if
you could: please file a bug with the details of which Solr version you
are using along with the schema fieldType and field declarations for your
merchantid field, along with the mbean stats output showing the field
cache insanity after executing two queries like...

/select?q=*:*&facet=true&facet.field=merchantid
/select?q=*:*&group=true&group.field=merchantid

(that way we can rule out your custom SearchComponent as having a bug in
it)

: This insanity can have performance impact ?
: How can I fix it ?

the impact is just that more RAM is being used than is probably strictly
necessary.  Unless there is something unusual in your fieldType
declaration, I don't think there is an easy fix you can apply -- we need to
fix the underlying code.

-Hoss



Re: FieldCache insanity with field used as facet and group

2013-05-28 Thread Elodie Sannier

I've created https://issues.apache.org/jira/browse/SOLR-4866

Elodie

Le 07.05.2013 18:19, Chris Hostetter a écrit :

: I am using the Lucene FieldCache with SolrCloud and I have insane instances
: with messages like:

FWIW: I'm the one that named the result of these sanity checks
FieldCacheInsantity and i have regretted it ever since -- a better label
would have been inconsistency

: VALUEMISMATCH: Multiple distinct value objects for
: SegmentCoreReader(​owner=_11i(​4.2.1):C4493997/853637)+merchantid
: 'SegmentCoreReader(​owner=_11i(​4.2.1):C4493997/853637)'='merchantid',class
: 
org.apache.lucene.index.SortedDocValues,0.5=org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#557711353
: 
'SegmentCoreReader(​owner=_11i(​4.2.1):C4493997/853637)'='merchantid',int,null=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713
: 
'SegmentCoreReader(​owner=_11i(​4.2.1):C4493997/853637)'='merchantid',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713
:
: All insane instances are for a field merchantid of type int used as facet
: and group field.

Interesting: it appears that the grouping code and the facet code are not
being consistent in how they are building hte field cache, so you are
getting two objects in the cache for each segment

I haven't checked if this happens much with the example configs, but if
you could: please file a bug with the details of which Solr version you
are using along with the schema fieldType  filed declarations for your
merchantid field, along with the mbean stats output showing the field
cache insanity after executing two queries like...

/select?q=*:*facet=truefacet.field=merchantid
/select?q=*:*group=truegroup.field=merchantid

(that way we can rule out your custom SearchComponent as having a bug in
it)

: This insanity can have performance impact ?
: How can I fix it ?

the impact is just that more RAM is being used than is strictly
necessary. Unless there is something unusual in your fieldType
declaration, I don't think there is an easy fix you can apply -- we need to
fix the underlying code.

-Hoss



--
Kelkoo

Elodie Sannier, Software engineer

E elodie.sann...@kelkoo.fr
Y!Messenger kelkooelodies
T +33 (0)4 56 09 07 55
A 4/6 Rue des Méridiens 38130 Echirolles




Kelkoo SAS
Société par Actions Simplifiée
Au capital de € 4.168.964,30
Siège social : 8, rue du Sentier 75002 Paris
425 093 069 RCS Paris

This message and its attachments are confidential and intended solely for
their recipients. If you are not the intended recipient, please delete it
and notify the sender.


FieldCache insanity with field used as facet and group

2013-04-25 Thread Elodie Sannier

Hello,

I am using the Lucene FieldCache with SolrCloud and I have insane instances 
with messages like:

VALUEMISMATCH: Multiple distinct value objects for
SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)+merchantid
'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'='merchantid',class org.apache.lucene.index.SortedDocValues,0.5=org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#557711353
'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'='merchantid',int,null=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713
'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'='merchantid',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713

All insane instances are for a field merchantid of type int used as facet 
and group field.

I'm using a custom SearchHandler which makes two sub-queries, a first query 
with group.field=merchantid and a second query with facet.field=merchantid.

When I'm using the parameter facet.method=enum, I don't have the insane 
instance but I'm not sure it is the good fix.
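For what it's worth, facet.method can also be scoped to a single field with Solr's per-field override syntax, which leaves the rest of the request untouched (merchantid here simply stands in for the field being faceted):

```
/select?q=*:*&group=true&group.field=merchantid
/select?q=*:*&facet=true&facet.field=merchantid&f.merchantid.facet.method=enum
```

With facet.method=enum the facet counts come from term enumeration and the filterCache rather than from an uninverted FieldCache entry, which is consistent with the insanity disappearing.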

Can this insanity have a performance impact?
How can I fix it?

Elodie Sannier




SolrCloud: Result Grouping - no groups with field type with precisionStep 0

2013-04-09 Thread Elodie Sannier

Hello,

I am using the Result Grouping feature with SolrCloud, and it seems that
grouping does not work with field types having precisionStep property
greater than 0, in distributed mode.

I updated the SolrCloud - Getting Started page example A (Simple two
shard cluster).
In my schema.xml, the popularity field has an int type where I
changed precisionStep from 0 to 4 :

<fieldType name="int" class="solr.TrieIntField" precisionStep="4"
    positionIncrementGap="0" />
<field name="popularity" type="int" indexed="true" stored="true" />

When I'm requesting in distributed mode, the grouping on this field does
not return groups:
http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity&distrib=true

<lst name="grouped">
  <lst name="popularity">
    <int name="matches">1</int>
    <arr name="groups">
      <lst>
        <int name="groupValue">0</int>
        <result name="doclist" numFound="0" start="0" />
      </lst>
    </arr>
  </lst>
</lst>

When I'm requesting on a single core, the grouping on this field returns
a group:
http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity&distrib=false

<lst>
  <int name="groupValue">10</int>
  <result name="doclist" numFound="1" start="0">
    <doc>
      <str name="id">MA147LL/A</str>
      ...
      <int name="popularity">10</int>
      ...
    </doc>
  </result>
</lst>

If I go back to the original configuration, changing the int type back to
precisionStep="0", the distributed request works:
<fieldType name="int" class="solr.TrieIntField" precisionStep="0"
    positionIncrementGap="0" />

A precisionStep > 0 can be useful for range queries, but is it normal
that it is incompatible with grouping queries, and only in distributed mode?
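To illustrate what precisionStep changes at indexing time (a sketch of my own mirroring how trie encoding works, not actual Lucene code): with precisionStep=4 each 32-bit int is indexed as eight terms (the value plus lower-precision prefixes used to accelerate range queries), while precisionStep=0 indexes only the exact value:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of trie encoding for a 32-bit int field: precisionStep=s
// indexes one term per shift of s bits; step 0 means a single term.
public class TriePrecisionDemo {

    public static int termCount(int precisionStep) {
        if (precisionStep <= 0 || precisionStep >= 32) return 1; // single-term case
        return (32 + precisionStep - 1) / precisionStep;         // ceil(32 / s)
    }

    public static List<Integer> prefixes(int value, int precisionStep) {
        List<Integer> terms = new ArrayList<>();
        if (precisionStep <= 0 || precisionStep >= 32) {
            terms.add(value);
            return terms;
        }
        for (int shift = 0; shift < 32; shift += precisionStep) {
            terms.add(value & ~((1 << shift) - 1)); // drop the low 'shift' bits
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(termCount(0)); // 1: only the exact value is indexed
        System.out.println(termCount(4)); // 8: extra prefix terms for range queries
        System.out.println(prefixes(10, 4));
    }
}
```

So a field with precisionStep > 0 carries extra indexed terms per value; any distributed code path that enumerates indexed terms instead of the exact values would see those extras, which may be related to the behaviour above.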

Elodie Sannier



Solrj 4.2 - CloudSolrServer aliases are not loaded

2013-04-02 Thread Elodie Sannier

Hello,

I am using the new collection alias feature, and it seems the
CloudSolrServer class (solrj 4.2.0) does not allow using it, either for
update or select.

When I'm requesting the CloudSolrServer with a collection alias name, I
have the error:
org.apache.solr.common.SolrException: Collection not found:
aliasedCollection

The collection alias cannot be found because, in the
CloudSolrServer#getCollectionList method (line 319), the aliases variable
is always empty.

When I'm requesting the CloudSolrServer, the connect method is called
and it calls the ZkStateReader#createClusterStateWatchersAndUpdate method.
In the ZkStateReader#createClusterStateWatchersAndUpdate method, the
aliases are not loaded.

At line 295, the data from /clusterstate.json is loaded:
ClusterState clusterState = ClusterState.load(zkClient, liveNodeSet);
this.clusterState = clusterState;

Should we have the same data loading from /aliases.json, in order to
fill the aliases field? At line 299, a Watcher for aliases is created but
does not seem to be used.


As a workaround to avoid the error, I have to force the aliases loading
at my application start and when the aliases are updated:
CloudSolrServer solrServer = new CloudSolrServer("localhost:2181");
solrServer.setDefaultCollection("aliasedCollection");
solrServer.connect();
solrServer.getZkStateReader().updateAliases();

Is there a better way to use collection aliases with solrj ?

Elodie Sannier
