Update to Solr 6 - Amazon EC2 high CPU SYS usage
Hello,

We have migrated from Solr 5.4.1 to Solr 6.4.0 on Amazon EC2 and we now see high CPU SYS usage that drastically decreases Solr performance. The JVM version (java-1.8.0-openjdk-1.8.0.131-0.b11.el6_9.x86_64), the Jetty version (9.3.14) and the OS version (CentOS 6.9) did not change with the Solr upgrade.

Using the "strace" command we found a lot of "clock_gettime" (gettimeofday) calls once Solr is started. The clocksource on Amazon VMs is "xen" and, according to this web site, that makes these system calls expensive: https://blog.packagecloud.io/eng/2017/03/08/system-calls-are-much-slower-on-ec2/

We have updated the clocksource to "tsc" and it fixes the issue.

Is there a change between Solr 5.4.1 and 6.4.0 that would trigger many more gettimeofday calls from the JVM?

Elodie
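For readers who want to verify the active clocksource, the kernel exposes it through sysfs; a minimal Java sketch (standard Linux path, no Solr API involved):

    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class ClockSourceCheck {
        public static void main(String[] args) throws Exception {
            // "xen" routes clock_gettime/gettimeofday through real system calls,
            // while "tsc" lets glibc use the vDSO fast path.
            String path = "/sys/devices/system/clocksource/clocksource0/current_clocksource";
            System.out.println(new String(Files.readAllBytes(Paths.get(path))).trim());
        }
    }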
Re: SolrIndexSearcher accumulation
Yes, I didn't copy all our code, but we also call extraReq.close() in a finally block. That was not the problem.

On 04/19/2017 11:53 AM, Mikhail Khludnev wrote:

If you create a SolrQueryRequest, make sure you close it, since that is necessary to release a searcher.

On Wed, Apr 19, 2017 at 12:35 PM, Elodie Sannier <elodie.sann...@kelkoo.fr> wrote:

Hello,

We have found how to fix the problem. When we update the original SolrQueryResponse object, we need to create a new BasicResultContext object with the extra response.

Simplified code:

    public class CustomSearchHandler extends SearchHandler {

        public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
            SolrQueryRequest extraReq = createExtraRequest();
            SolrQueryResponse extraRsp = new SolrQueryResponse();
            super.handleRequestBody(extraReq, extraRsp);
            ResultContext extraRc = (ResultContext) extraRsp.getResponse();

            // code with memory leak !!
            rsp.addResponse(extraRc);

            // code without memory leak
            ResultContext extraRcClone = new BasicResultContext(extraRc.getDocList(),
                    rsp.getReturnFields(), req.getSearcher(), extraRc.getQuery(), req);
            rsp.addResponse(extraRcClone);
        }
    }

We don't know why we need to create a new BasicResultContext to properly manage searchers. Do you know why?

Elodie

On 04/07/2017 04:14 PM, Rick Leir wrote:

Hi Gerald,
The best solution in my mind is to look at the custom code and try to find a way to remove it from your system. Solr queries can be complex, and I hope there is a way to get the results you need. Would you like to say what results you want to get, and what Solr queries you have tried? I realize that in large organizations it is difficult to suggest change.
Cheers -- Rick

On April 7, 2017 9:08:19 AM EDT, Shawn Heisey <apa...@elyograg.org> wrote:

On 4/7/2017 3:09 AM, Gerald Reinhart wrote:

We have some custom code that extends SearchHandler in order to:
- do an extra request
- merge/combine the results of the original request and the extra request

On Solr 5.x our code worked very well; with Solr 6.x the number of SolrIndexSearcher instances keeps increasing (we can see them in the admin view > Plugins / Stats > Core). As SolrIndexSearcher instances accumulate, we have the following issues:
- the memory used by Solr increases => OOM after a long period of time in production
- some files in the index have been deleted from the file system but the Solr JVM still holds them open => ("fake") full disk after a long period of time in production

We are wondering:
- what has changed between Solr 5.x and Solr 6.x in the management of SolrIndexSearcher?
- what would be the best way, in a Solr plugin, to perform two queries and merge the results into a single SolrQueryResponse?

I hesitated to send a reply because, when it comes right down to it, I do not know a whole lot about deep Solr internals. I tend to work with the code at a higher level and don't dive down into the depths all that often. I am slowly learning, though. You may need to wait for a reply from someone who really knows those internals.

It looks like you and I participated in a discussion last month where you were facing a similar problem with searchers -- deleted index files being held open. How did that turn out? It seems like if that problem were solved, it would also solve this one. Very likely, the fact that the plugin worked correctly in 5.x was actually a bug in Solr related to reference counting, one that has been fixed in later versions.

You may need to use a paste website or a file-sharing website to share all your plugin code so that people can get a look at it. The list has a habit of deleting attachments.

Thanks,
Shawn
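Putting the two fixes discussed in this thread together -- closing the extra request in a finally block and rebinding the extra results to the original request's searcher -- gives roughly this sketch (createExtraRequest() is the poster's own helper, not a Solr API):

    public class CustomSearchHandler extends SearchHandler {

        @Override
        public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
            SolrQueryRequest extraReq = createExtraRequest();
            try {
                SolrQueryResponse extraRsp = new SolrQueryResponse();
                super.handleRequestBody(extraReq, extraRsp);
                ResultContext extraRc = (ResultContext) extraRsp.getResponse();
                // Rebind the DocList to the original request's searcher so that no
                // reference to the extra request's searcher outlives this handler.
                rsp.addResponse(new BasicResultContext(extraRc.getDocList(),
                        rsp.getReturnFields(), req.getSearcher(),
                        extraRc.getQuery(), req));
            } finally {
                extraReq.close(); // releases the searcher taken by the extra request
            }
        }
    }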
Re: [Migration Solr5 to Solr6] Unwanted deleted files references
We have found a workaround: we close the searchers after checking the current index version, and now the SolrCore no longer has many open searchers. We have fewer unwanted references to deleted files, but we still have some.

We have two collections, fr_blue and fr_green, with aliases:

    fr -> fr_blue
    fr_temp -> fr_green

The fr collection receives the queries; the fr_temp collection does not. The problem occurs when we run the following sequence (sketched in SolrJ below):
1- swap the aliases (e.g. create alias fr -> fr_green and fr_temp -> fr_blue)
2- reload the collection behind the fr_temp alias (fr_blue in this example)

We suspect there is a problem with reloading a collection that received traffic up to the alias swap but no longer receives any afterwards. A problem with the increment/decrement of the searcher reference count, perhaps?

Elodie

On 03/14/2017 06:42 PM, Shawn Heisey wrote:

On 3/14/2017 10:23 AM, Elodie Sannier wrote:

The request close() method decrements the reference count on the searcher.

From what I could tell, that method decrements the reference counter, but does not actually close the searcher object. I cannot tell you what the correct procedure is to make sure that all resources are properly closed at the proper time. This might be a bug, or there might be something missing from your code. I do not know which.

Thanks,
Shawn
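For reference, the swap-then-reload sequence described above looks roughly like this in SolrJ (method names assume the SolrJ 6.x CollectionAdminRequest static factories; adjust to your client version):

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;

    try (CloudSolrClient client = new CloudSolrClient("zkhost1:2181,zkhost2:2181")) {
        // 1- swap the aliases
        CollectionAdminRequest.createAlias("fr", "fr_green").process(client);
        CollectionAdminRequest.createAlias("fr_temp", "fr_blue").process(client);
        // 2- reload the collection that is now behind fr_temp
        CollectionAdminRequest.reloadCollection("fr_blue").process(client);
    }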
Re: [Migration Solr5 to Solr6] Unwanted deleted files references
The request close() method decrements the reference count on the searcher.

    public abstract class SolrQueryRequestBase implements SolrQueryRequest, Closeable {
        // The index searcher associated with this request
        protected RefCounted<SolrIndexSearcher> searcherHolder;

        public void close() {
            if (this.searcherHolder != null) {
                this.searcherHolder.decref();
                this.searcherHolder = null;
            }
        }
    }

RefCounted keeps track of a reference count on the searcher and closes it when the count hits zero.

    public abstract class RefCounted<Type> {
        ...
        public void decref() {
            if (refcount.decrementAndGet() == 0) {
                close();
            }
        }
    }

We assume that calling req.getSearcher() increases the reference count, and that after we are done with the searcher we have to call close(), which calls decref() to decrease the reference count. But that does not seem to be enough -- or maybe there is a bug in Solr in this case?

Elodie

On 03/14/2017 03:02 PM, Shawn Heisey wrote:

On 3/14/2017 3:08 AM, Gerald Reinhart wrote:

Hi,
The custom code we have is something like this:

    public class MySearchHandler extends SearchHandler {

        @Override
        public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
            SolrIndexSearcher searcher = req.getSearcher();
            try {
                // Do stuff with the searcher
            } finally {
                req.close();
            }
        }
    }

Despite the fact that we always close the request each time we get a SolrIndexSearcher from it, the number of SolrIndexSearcher instances keeps increasing. Each time a new commit is done on the index, a new searcher is created (this is normal) but the old one remains. Is there something wrong with this custom code?

My understanding of Solr and Lucene internals is rudimentary, but I might know what's happening here. The code closes the request, but never closes the searcher. Searcher objects include a Lucene object that holds onto the index files that pertain to that view of the index. The searcher must be closed. It does look like closing the searcher and then closing the request might be enough to fully decrement all the reference counters involved, but I do not know the code well enough to be sure of that.

Thanks,
Shawn
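The balanced pattern implied by this code, when a searcher is taken directly from the core (SolrCore.getSearcher() returns the RefCounted holder), is sketched below:

    import org.apache.solr.core.SolrCore;
    import org.apache.solr.search.SolrIndexSearcher;
    import org.apache.solr.util.RefCounted;

    void useSearcher(SolrCore core) {
        // getSearcher() increments the reference count on the current searcher...
        RefCounted<SolrIndexSearcher> holder = core.getSearcher();
        try {
            SolrIndexSearcher searcher = holder.get();
            // do stuff with the searcher
        } finally {
            holder.decref(); // ...and decref() must always balance it, even on errors
        }
    }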
Re: [Migration Solr5 to Solr6] Unwanted deleted files references
Thank you Alex for your answer.

The references to deleted files are only on index files (with .fdt, .doc, .dvd, ... extensions).

    sudo lsof | grep DEL
    java 1366 kookel DEL REG 253,8 15360013 /opt/kookel/data/searchSolrNode/solrindex/fr1_green/index/_2508z.cfs
    java 1366 kookel DEL REG 253,8 15360035 /opt/kookel/data/searchSolrNode/solrindex/fr1_green/index/_25091.fdt
    java 1366 kookel DEL REG 253,8 15425603 /opt/kookel/data/searchSolrNode/solrindex/fr1_green/index/_25091_Lucene50_0.tim
    java 1366 kookel DEL REG 253,8 11624982 /opt/kookel/data/searchSolrNode/solrindex/fr1_green/index/_2508y.fdt
    ...

We tried optimizing the collection from the Solr Admin UI, but it had no effect.

Elodie

On 03/07/2017 04:11 PM, Alexandre Rafalovitch wrote:

More sanity checks: what are the extensions/types of the files that are not deleted? If they are index files, the optimize command (even if no longer recommended for production) should really blow all the old ones away. So, are they other kinds of files?

Regards,
Alex.
http://www.solr-start.com/ - Resources for Solr users, new and experienced

On 7 March 2017 at 09:55, Erick Erickson <erickerick...@gmail.com> wrote:

Just as a sanity check, if you restart the Solr JVM, do the files disappear from disk?

Do you have any custom code anywhere in this chain? If so, do you open any searchers but fail to close them? Although why 6.4 would manifest the problem but other code wouldn't is a mystery, just another sanity check.

Best,
Erick

On Tue, Mar 7, 2017 at 6:44 AM, Elodie Sannier <elodie.sann...@kelkoo.fr> wrote:
...
Re: [Migration Solr5 to Solr6] Unwanted deleted files references
Thank you Erick for your answer.

The files are deleted even without a JVM restart, but since the Solr process still holds them open the kernel keeps showing them as DEL.

We do have custom code, and for the migration to Solr 6.4.0 we added new code that calls req.getSearcher() without a matching "close". We will decrement the reference count on the searcher's holder (to prevent the searcher from remaining open after a commit) and see if that fixes the problem.

Elodie

On 03/07/2017 03:55 PM, Erick Erickson wrote:

Just as a sanity check, if you restart the Solr JVM, do the files disappear from disk?

Do you have any custom code anywhere in this chain? If so, do you open any searchers but fail to close them? Although why 6.4 would manifest the problem but other code wouldn't is a mystery, just another sanity check.

Best,
Erick

On Tue, Mar 7, 2017 at 6:44 AM, Elodie Sannier <elodie.sann...@kelkoo.fr> wrote:
...
[Migration Solr5 to Solr6] Unwanted deleted files references
Hello,

We have migrated from Solr 5.4.1 to Solr 6.4.0 and the disk usage has increased: we found hundreds of references to deleted index files being held by Solr. Before the migration we had 15-30% of disk space used; after the migration we have 60-90%.

We are using SolrCloud with 2 collections. The commands applied on the collections are:
- incremental indexation mode: add, deleteById with a commitWithin of 30 minutes (see the SolrJ sketch below)
- full indexation mode: add, deleteById, commit
- switch between incremental and full mode: deleteByQuery, createAlias, reload
- there is also an autocommit every 15 minutes

We have seen the email "Solr leaking references to deleted files" from 2016-05-31, which describes the same problem, but the bugs it mentions are fixed. We manually tried to force a commit, a reload and an optimize on the collections, without effect.

Is this a configuration problem (merge/deletion policy), or a possible regression in the Solr code?

Thank you
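The incremental-mode calls listed above correspond roughly to this SolrJ sketch (client construction omitted; the 30-minute window matches the description):

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.common.SolrInputDocument;

    void incrementalUpdate(SolrClient client, SolrInputDocument doc, String idToDelete) throws Exception {
        int commitWithinMs = 30 * 60 * 1000; // commitWithin of 30 minutes
        client.add(doc, commitWithinMs);
        client.deleteById(idToDelete, commitWithinMs);
        // no explicit commit in incremental mode; Solr commits within the window
    }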
Update to solr 5 - custom coordination factor implementation issue
Hello,

We are using Solr 4.10.4 and we want to update to 5.4.1.

With Solr 4.10.4:
- we extend BooleanQuery with a custom class in order to change the coordination factor behaviour (the coord method), but with Solr 5.4.1 this computation no longer seems to be done by BooleanQuery
- in order to use our implementation, we extend ExtendedSolrQueryParser with a custom class and override the methods newBooleanClause and getBooleanQuery

How can we do this with Solr 5.4.1?

Elodie Sannier
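No answer appears in the archive. One possible direction -- an assumption, not something confirmed in this thread -- is to move the customization into a Similarity subclass, since TFIDFSimilarity.coord() is still overridable in Lucene 5.4 and a custom Similarity can be registered in schema.xml:

    import org.apache.lucene.search.similarities.DefaultSimilarity;

    // Hypothetical replacement for the BooleanQuery override described above.
    public class CustomCoordSimilarity extends DefaultSimilarity {
        @Override
        public float coord(int overlap, int maxOverlap) {
            // custom coordination-factor behaviour goes here,
            // e.g. neutralizing coord entirely:
            return 1.0f;
        }
    }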
Update to solr 5 - custom phrase query implementation issue
Hello,

We are using Solr 4.10.4 and we want to update to 5.4.1.

With Solr 4.10.4:
- we extend PhraseQuery with a custom class in order to remove some terms from phrase queries with phrase slop (we modify the add(Term term, int position) method)
- in order to use our implementation, we extend ExtendedSolrQueryParser with a custom class and override the method newPhraseQuery, but in Solr 5 this method no longer exists

How can we do this with Solr 5.4.1?

Elodie Sannier
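No answer appears in the archive for this one either, but as a hedged sketch of where such term filtering can move in Lucene 5.x, where PhraseQuery is immutable and assembled through PhraseQuery.Builder (shouldSkip() is a hypothetical stand-in for the removal logic described above):

    import java.util.List;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.PhraseQuery;

    PhraseQuery buildFilteredPhrase(List<Term> terms, int slop) {
        PhraseQuery.Builder builder = new PhraseQuery.Builder();
        builder.setSlop(slop);
        int position = 0;
        for (Term term : terms) {
            if (!shouldSkip(term)) {   // drop unwanted terms, keep original positions
                builder.add(term, position);
            }
            position++;
        }
        return builder.build();
    }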
SolrCloud - ResultContext versus SolrDocumentList in distributed mode
Hello,

I am using SolrCloud 4.5.1 with one shard and three replicas, in distributed mode. I am using a custom SearchHandler which makes two sub-queries and merges the responses. When I merge the SolrQueryResponse objects I do the following casts:

    SolrDocumentList firstResponseSDL = (SolrDocumentList) firstResponse.getValues().get(Constants.RESPONSE);
    SolrDocumentList secondResponseSDL = (SolrDocumentList) secondResponse.getValues().get(Constants.RESPONSE);

Sometimes (not often) I get a ClassCastException, only for the cast of the second response:

    java.lang.ClassCastException: org.apache.solr.response.ResultContext cannot be cast to org.apache.solr.common.SolrDocumentList

Correct me if I am wrong, but I thought the response type was always SolrDocumentList in distributed mode and ResultContext in non-distributed mode. In which case, in distributed mode, can the response of the first sub-query be an instance of SolrDocumentList while the second sub-query's is an instance of ResultContext?

Elodie Sannier
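A defensive variant -- not from the thread, just a sketch -- guards the cast instead of assuming one response shape (Constants.RESPONSE is the poster's own key):

    Object second = secondResponse.getValues().get(Constants.RESPONSE);
    if (second instanceof SolrDocumentList) {
        // distributed path: merge the document lists directly
        SolrDocumentList secondResponseSDL = (SolrDocumentList) second;
    } else if (second instanceof ResultContext) {
        // non-distributed path: the documents must be read out of the
        // ResultContext's DocList through the searcher before merging
        ResultContext rc = (ResultContext) second;
    }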
Re: Possible regression for Solr 4.6.0 - commitWithin does not work with replicas
I have this configuration only on test servers (the order in which the instances start leads to this layout), not in production.

Elodie

On 01/23/2014 04:35 PM, Shawn Heisey wrote:

On 12/11/2013 2:41 AM, Elodie Sannier wrote:

collection fr_blue:
- shard1 - server-01 (replica1), server-01 (replica2)
- shard2 - server-02 (replica1), server-02 (replica2)

collection fr_green:
- shard1 - server-01 (replica1), server-01 (replica2)
- shard2 - server-02 (replica1), server-02 (replica2)

I'm pretty sure this won't affect the issue you've mentioned, but it's worth pointing out: if this is really how you've arranged your shard replicas, your system cannot survive a failure, because you've got both replicas for each shard on the same server. If that server dies, half of each collection will be gone.

Thanks,
Shawn
Possible regression for Solr 4.6.0 - commitWithin does not work with replicas
Hello,

I am using SolrCloud 4.6.0 with two shards, two replicas per shard, and two collections.

collection fr_blue:
- shard1 - server-01 (replica1), server-01 (replica2)
- shard2 - server-02 (replica1), server-02 (replica2)

collection fr_green:
- shard1 - server-01 (replica1), server-01 (replica2)
- shard2 - server-02 (replica1), server-02 (replica2)

I add documents using the SolrJ CloudSolrServer and the commitWithin feature:

    int commitWithinMs = 3;
    SolrServer server = new CloudSolrServer(zkHost);
    server.add(doc, commitWithinMs);

When I query an instance, for 5 indexed documents, the numFound value changes on each call: randomly 0, 1, 4 or 5. When I query the instances with distrib=false, I get:
- leader shard1: numFound=1
- leader shard2: numFound=4
- replica shard1: numFound=0
- replica shard2: numFound=0

The documents are not committed on the replicas, even after waiting more than 30 seconds. If I force a commit using http://server-01:8080/solr/update/?commit=true, the documents are committed on the replicas and numFound=5.

I suppose that the leader forwards the documents to the replicas, but they are not committed there. Is this a new bug in the commitWithin feature for distributed mode? The problem does not occur with version 4.5.1.

Elodie Sannier
SolrCloud 4.6.0 - leader election issue
    ... , moving to the next candidate
    2013-12-06 21:27:58,732 [coreLoadExecutor-4-thread-2] INFO org.apache.solr.cloud.ShardLeaderElectionContext:runLeaderProcess:224 - Sync was not a success but no one else is active! I am the leader
    2013-12-06 21:27:58,736 [coreLoadExecutor-4-thread-2] INFO org.apache.solr.cloud.ShardLeaderElectionContext:runLeaderProcess:251 - I am the new leader: http://dc1-vt-dev-xen-06-vm-07.dev.dc1.kelkoo.net:8080/searchsolrnodefr/fr_green/ shard1

Is it a bug with the leader election? This problem does not occur:
- with version 4.5.1
- or if I start the four Solr instances with a delay between them (about 15 seconds)
- or if I configure only one collection
- or if I have only one replica per shard

Elodie Sannier
Unexpected value for boolean field in FunctionQuery
Hello,

I am using Solr version 4.4.0. When I use a FunctionQuery with boolean fields, it seems that the default field value is true for documents without a value in the field. The page http://wiki.apache.org/solr/FunctionQuery#field says "0 is returned for documents without a value in the field", so we could expect the field value to be false.

Starting from the SolrCloud "Getting Started" page with the document exampledocs/ipod_video.xml and removing the boolean field inStock:

    <field name="inStock">true</field>

demonstrates the problem. When requesting with bf=if(inStock,10,0):

    curl -sS "http://localhost:8983/solr/select?q=*:*&bf=if%28inStock,10,0%29&defType=edismax&debugQuery=true"

the result indicates that the value of the boolean field inStock is seen as true:

    7.071068 = (MATCH) FunctionQuery(if(bool(inStock),const(10),const(0))), product of:
      10.0 = if(bool(inStock)=true,const(10),const(0))
      1.0 = boost
      0.70710677 = queryNorm

Same behaviour using a FunctionQuery via the LocalParams syntax:

    http://localhost:8983/solr/select?q={!func}if%28inStock,10,0%29&debugQuery=true

    10.0 = (MATCH) FunctionQuery(if(bool(inStock),const(10),const(0))), product of:
      10.0 = if(bool(inStock)=true,const(10),const(0))
      1.0 = boost
      1.0 = queryNorm

Is that expected?

Elodie Sannier
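A common workaround (an assumption, not confirmed in this thread) is to make the missing-value default explicit with the def() function, so documents without the field evaluate to false; in SolrJ:

    import org.apache.solr.client.solrj.SolrQuery;

    SolrQuery query = new SolrQuery("*:*");
    query.set("defType", "edismax");
    // def() substitutes false when a document has no value for inStock
    query.set("bf", "if(def(inStock,false),10,0)");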
Re: Unexpected value for boolean field in FunctionQuery
I didn't forget to commit my changes. I used the commands:

    java -Durl=http://localhost:8983/solr/collection1/update -jar post.jar ipod_video.xml
    curl 'http://localhost:8983/solr/collection1/update/?commit=true'

When I use your url example

    http://localhost:8983/solr/select?q=*:*&rows=100&fl=id,inStock,if%28inStock,10,0%29&debugQuery=true

I get:

    <long name="if(inStock,10,0)">10</long>

(and my document does not have the inStock field)

Elodie

On 09/10/2013 03:54 PM, Yonik Seeley wrote:

I just tried a simple test with the example data, and things seem to be working fine... I tried this:

    http://localhost:8983/solr/select?q=*:*&rows=100&fl=id,inStock,if(inStock,10,0)

I saw values of 10 when inStock==true and values of 0 when it was missing or explicitly false. Perhaps you forgot to commit your changes when you removed the inStock field from one of the example docs?

-Yonik
http://lucidworks.com

On Tue, Sep 10, 2013 at 9:25 AM, Elodie Sannier <elodie.sann...@kelkoo.fr> wrote:
...
Re: Unexpected value for boolean field in FunctionQuery
By the way, Yonik, which version do you use (4.4.0 or a nightly)?

Elodie

On 09/10/2013 04:06 PM, Elodie Sannier wrote:
...
Re: SolrCloud: no timing when no result in distributed mode
Hello,

I'm using the 4.4.0 version but I still have the problem. Should I create a JIRA issue for it?

Elodie

On 06/21/2013 02:54 PM, Elodie Sannier wrote:
...
Re: XInclude and Document Entity not working on schema.xml
I'm using java-1.7.0-openjdk-1.7.0.3-2.1.el6.1.x86_64 and tomcat6-6.0.24-48.el6_3.noarch. I tested with the Solr 4.4 version, but I still have the bug.

Elodie
Re: XInclude and Document Entity not working on schema.xml
Hello Chris,

Thank you for your help. I checked the differences between my files and your test files, but I didn't find any bugs in my files. All my files are in the same directory: collection1/conf

schema.xml content:

    <?xml version="1.0" encoding="UTF-8" ?>
    <!DOCTYPE schema [
    <!ENTITY commonschema_types SYSTEM "commonschema_types.xml">
    <!ENTITY commonschema_others SYSTEM "commonschema_others.xml">
    ]>
    <schema name="searchSolrSchema" version="1.5">
      <types>
        <fieldType name="text_stemmed" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
          <!-- FR : french -->
          <!-- least aggressive stemming -->
          <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="com.kelkoo.search.solr.plugins.stemmer.fr.KelkooFrenchMinimalStemFilterFactory"/>
          </analyzer>
          <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="com.kelkoo.search.solr.plugins.stemmer.fr.KelkooFrenchMinimalStemFilterFactory"/>
          </analyzer>
        </fieldType>
        &commonschema_types;
      </types>
      &commonschema_others;
    </schema>

commonschema_types.xml content:

    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
    <!-- int is for exact ids, works with grouped=true and distrib=true -->
    <fieldType name="int" class="solr.TrieIntField" precisionStep="0" sortMissingLast="true" omitNorms="true" positionIncrementGap="0"/>
    <!-- tint is for numbers that need sorting and/or range queries (precisionStep="4" has better
         performance than precisionStep="8") and that do *not* need grouping (grouping does not
         work with distrib=true for tint) -->
    <fieldType name="tint" class="solr.TrieIntField" precisionStep="4" sortMissingLast="true" omitNorms="true" positionIncrementGap="0"/>
    <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
    <fieldType name="byte" class="solr.ByteField" omitNorms="true"/>
    <fieldType name="float" class="solr.TrieFloatField" sortMissingLast="true" omitNorms="true"/>
    <!-- A general text field which tokenizes with StandardTokenizer.
         omitNorms="true" means the (index time) lengthNorm will be the same whatever the number of tokens. -->
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
      </analyzer>
    </fieldType>

The &commonschema_others; include works. Do you see something wrong?

Unfortunately I cannot use the 4.3.0 version because I'm using the solr.xml sharedLib, which does not work in 4.3.0 (cf. https://issues.apache.org/jira/browse/SOLR-4791). Where can I find the newly voted 4.4? I have this bug with the nightly 4.5-2013-07-18_06-04-44 (the 18th of July) found here: https://builds.apache.org/job/Solr-Artifacts-4.x/lastSuccessfulBuild/artifact/solr/package/

Elodie Sannier
XInclude and Document Entity not working on schema.xml
Hello,

I am using the Solr nightly version 4.5-2013-07-18_06-04-44 and I want to use a Document Entity in schema.xml. I get this exception:

    java.lang.RuntimeException: schema fieldtype string(org.apache.solr.schema.StrField) invalid arguments:{xml:base=solrres:/commonschema_types.xml}
        at org.apache.solr.schema.FieldType.setArgs(FieldType.java:187)
        at org.apache.solr.schema.FieldTypePluginLoader.<init>(FieldTypePluginLoader.java:141)
        at org.apache.solr.schema.FieldTypePluginLoader.<init>(FieldTypePluginLoader.java:43)
        at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:190)
        ... 16 more

schema.xml:

    <?xml version="1.0" encoding="UTF-8" ?>
    <!DOCTYPE schema [
    <!ENTITY commonschema_types SYSTEM "commonschema_types.xml">
    ]>
    <schema name="searchSolrSchema" version="1.5">
      <types>
        <!-- Stuff -->
        &commonschema_types;
      </types>
      <!-- Stuff -->
    </schema>

commonschema_types.xml:

    <?xml version="1.0" encoding="UTF-8" ?>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
    <!-- Stuff -->

The same error appears in this bug (fixed?): https://issues.apache.org/jira/browse/SOLR-3087
It works with solr-4.2.1.

I also tried to use the XML XInclude mechanism (http://en.wikipedia.org/wiki/XInclude) to include parts of schema.xml. When I try to include a fieldType, I get this exception:

    org.apache.solr.common.SolrException: Unknown fieldType 'long' specified on field _version_
        at org.apache.solr.schema.IndexSchema.loadFields(IndexSchema.java:644)
        at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:470)
        at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:164)
        at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
        at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
        at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:267)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:622)
        ... 10 more

The type is not found. I include 'schema_integration.xml' like this in 'schema.xml':

    <?xml version="1.0" encoding="UTF-8" ?>
    <schema name="default" version="1.5">
      <types>
        <!-- Stuff -->
        <xi:include href="commonschema_types.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
      </types>
      <!-- Stuff -->
      <fields>
        <field name="_version_" type="long" indexed="true" stored="true" multiValued="false"/>
        <!-- Stuff -->
      </fields>
    </schema>

Is it a bug in the nightly version?

Elodie Sannier
SolrCloud: no timing when no result in distributed mode
Hello,

I am using SolrCloud 4.2.1 with two shards. With the debugQuery=true parameter, when a query does not return documents, the timing debug information is not returned:

    curl -sS "http://localhost:8983/solr/select?q=dummy&debugQuery=true" | grep -o '<lst name="timing".*'

If I use the distrib=false parameter, the timing debug information is returned:

    curl -sS "http://localhost:8983/solr/select?q=dummy&debugQuery=true&distrib=false" | grep -o '<lst name="timing".*'

    <lst name="timing">
      <double name="time">1.0</double>
      <lst name="prepare">
        <double name="time">0.0</double>
        <lst name="query"><double name="time">0.0</double></lst>
        <lst name="facet"><double name="time">0.0</double></lst>
        <lst name="mlt"><double name="time">0.0</double></lst>
        <lst name="highlight"><double name="time">0.0</double></lst>
        <lst name="stats"><double name="time">0.0</double></lst>
        <lst name="debug"><double name="time">0.0</double></lst>
      </lst>
      <lst name="process">
        <double name="time">1.0</double>
        <lst name="query"><double name="time">0.0</double></lst>
        <lst name="facet"><double name="time">0.0</double></lst>
        <lst name="mlt"><double name="time">0.0</double></lst>
        <lst name="highlight"><double name="time">0.0</double></lst>
        <lst name="stats"><double name="time">0.0</double></lst>
        <lst name="debug"><double name="time">1.0</double></lst>
      </lst>
    </lst>

Is it a bug of the distributed mode?

Elodie Sannier
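The same check can be reproduced from SolrJ instead of curl, as a rough sketch (standard SolrJ 4.x API; server construction omitted):

    import java.util.Map;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    void checkTiming(SolrServer server) throws Exception {
        SolrQuery q = new SolrQuery("dummy");
        q.set("debugQuery", true);
        QueryResponse rsp = server.query(q);
        Map<String, Object> debug = rsp.getDebugMap();
        // On the affected version this prints null for a zero-hit distributed
        // query, but a timing entry when distrib=false is added.
        System.out.println(debug == null ? null : debug.get("timing"));
    }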
SolrCloud: 500 error with combination of debug and group in distributed search
Hello,

I am using SolrCloud 4.2.1 with two shards. When I group on a field and use the debug parameter in distributed mode, I get a 500 error:

    http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity&debug=true

(idem with debug=timing, query or results)

    <lst name="error">
      <str name="msg">Server at http://localhost:8983/solr returned non ok status:500, message:Internal Server Error</str>
      <str name="trace">org.apache.solr.common.SolrException: Server at http://localhost:8983/solr returned non ok status:500, message:Internal Server Error
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:373)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
        at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:172)
        at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:135)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)</str>
      <int name="code">500</int>
    </lst>

In the logs I have:

    2013-06-21 13:26:47,876 [http-8080-5] ERROR org.apache.solr.servlet.SolrDispatchFilter:log:96 - null:java.lang.NullPointerException
        at org.apache.solr.handler.component.DebugComponent.process(DebugComponent.java:56)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:216)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:722)

If I add the distrib=false parameter, or if I replace the debug parameter with debugQuery=true, or if I remove the group parameters, I don't get the error.

Is it a bug of the distributed mode with the combination of debug and group?

Elodie Sannier
Re: SolrCloud: no timing when no result in distributed mode
Unfortunately I cannot use the 4.3.0 version because I'm using the solr.xml sharedLib, which does not work in 4.3.0 (cf. https://issues.apache.org/jira/browse/SOLR-4791).

Elodie

On 06/21/2013 03:30 PM, James Thomas wrote:

Seems to work fine for me on 4.3.0; maybe you can try a newer version. 4.3.1 is available.

-----Original Message-----
From: Elodie Sannier [mailto:elodie.sann...@kelkoo.fr]
Sent: Friday, June 21, 2013 8:54 AM
To: solr-user@lucene.apache.org
Subject: SolrCloud: no timing when no result in distributed mode
...
Re: FieldCache insanity with field used as facet and group
I'm reproducing the problem with the 4.2.1 example with 2 shards.

1) Started up the Solr shards, indexed the example data, and confirmed the field caches were empty:

[sanniere@funlevel-dx example]$ java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar
[sanniere@funlevel-dx example2]$ java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar

2) Used both grouping and faceting on the popularity field, then checked the fieldCache insanity count:

[sanniere@funlevel-dx example]$ curl -sS "http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity" > /dev/null
[sanniere@funlevel-dx example]$ curl -sS "http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=popularity" > /dev/null
[sanniere@funlevel-dx example]$ curl -sS "http://localhost:8983/solr/admin/mbeans?stats=true&key=fieldCache&wt=json&indent=true" | grep -E 'entries_count|insanity_count'

"entries_count": 10,
"insanity_count": 2,
"insanity#0": "VALUEMISMATCH: Multiple distinct value objects for SegmentCoreReader(owner=_g(4.2.1):C1)+popularity
    'SegmentCoreReader(owner=_g(4.2.1):C1)'='popularity',class org.apache.lucene.index.SortedDocValues,0.5=org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#12129794
    'SegmentCoreReader(owner=_g(4.2.1):C1)'='popularity',int,null=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#12298774
    'SegmentCoreReader(owner=_g(4.2.1):C1)'='popularity',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#12298774",
"insanity#1": "VALUEMISMATCH: Multiple distinct value objects for SegmentCoreReader(owner=_f(4.2.1):C9)+popularity
    'SegmentCoreReader(owner=_f(4.2.1):C9)'='popularity',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#16648315
    'SegmentCoreReader(owner=_f(4.2.1):C9)'='popularity',int,null=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#16648315
    'SegmentCoreReader(owner=_f(4.2.1):C9)'='popularity',class org.apache.lucene.index.SortedDocValues,0.5=org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#1130715"

I've updated https://issues.apache.org/jira/browse/SOLR-4866

Elodie

On 28.05.2013 10:22, Elodie Sannier wrote: I've created https://issues.apache.org/jira/browse/SOLR-4866 Elodie

On 07.05.2013 18:19, Chris Hostetter wrote:

: I am using the Lucene FieldCache with SolrCloud and I have insane instances
: with messages like:

FWIW: I'm the one that named the result of these sanity checks FieldCacheInsanity, and I have regretted it ever since -- a better label would have been "inconsistency".

: VALUEMISMATCH: Multiple distinct value objects for
: SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)+merchantid
: 'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'='merchantid',class
: org.apache.lucene.index.SortedDocValues,0.5=org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#557711353
: 'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'='merchantid',int,null=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713
: 'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'='merchantid',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713
:
: All insane instances are for a field merchantid of type int used as facet
: and group field.

Interesting: it appears that the grouping code and the facet code are not consistent in how they build the field cache, so you are getting two objects in the cache for each segment.

I haven't checked whether this happens with the example configs, but if you could: please file a bug with the details of which Solr version you are using, the schema fieldType and field declarations for your merchantid field, and the mbean stats output showing the field cache insanity after executing two queries like...

/select?q=*:*&facet=true&facet.field=merchantid
/select?q=*:*&group=true&group.field=merchantid

(that way we can rule out your custom SearchComponent as having a bug in it)

: Can this insanity have a performance impact?
: How can I fix it?

The impact is just that more RAM is being used than is strictly necessary. Unless there is something unusual in your fieldType declaration, I don't think there is an easy fix you can apply -- we need to fix the underlying code.

-Hoss
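A rough Java sketch of the diagnostic Hoss describes, assuming the stock example URL and the merchantid field from this thread; it fires the two queries and then dumps the fieldCache mbean counters:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class InsanityCheck {
    // fetch a URL and return the body as a string
    static String fetch(String url) throws Exception {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new URL(url).openStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) sb.append(line).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        String base = "http://localhost:8983/solr";
        // the two queries that populate the cache two different ways
        fetch(base + "/select?q=*:*&facet=true&facet.field=merchantid");
        fetch(base + "/select?q=*:*&group=true&group.field=merchantid");
        // then inspect the fieldCache stats for insanity entries
        String stats = fetch(base + "/admin/mbeans?stats=true&key=fieldCache&wt=json&indent=true");
        for (String line : stats.split("\n")) {
            if (line.contains("entries_count") || line.contains("insanity_count")) {
                System.out.println(line.trim());
            }
        }
    }
}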
FieldCache insanity with field used as facet and group
Hello, I am using the Lucene FieldCache with SolrCloud and I have insane instances with messages like:

VALUEMISMATCH: Multiple distinct value objects for SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)+merchantid
'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'='merchantid',class org.apache.lucene.index.SortedDocValues,0.5=org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#557711353
'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'='merchantid',int,null=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713
'SegmentCoreReader(owner=_11i(4.2.1):C4493997/853637)'='merchantid',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=org.apache.lucene.search.FieldCacheImpl$IntsFromArray#1105988713

All insane instances are for a field merchantid of type int used as a facet and group field. I'm using a custom SearchHandler which makes two sub-queries: a first query with group.field=merchantid and a second query with facet.field=merchantid. When I use the parameter facet.method=enum, I no longer see the insane instances, but I am not sure this is the right fix. Can this insanity have a performance impact? How can I fix it?

Elodie Sannier
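For reference, a minimal SolrJ sketch of the facet.method=enum workaround mentioned above (the server URL is an illustrative assumption, not from the original setup):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class EnumFacetWorkaround {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery("*:*");
        query.setFacet(true);
        query.addFacetField("merchantid");
        // facet.method=enum avoids the duplicate FieldCache entries
        // reported in this thread, at the cost of a different faceting path
        query.set("facet.method", "enum");
        QueryResponse rsp = server.query(query);
        System.out.println(rsp.getFacetField("merchantid").getValues());
        server.shutdown();
    }
}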
SolrCloud: Result Grouping - no groups with field type with precisionStep > 0
Hello, I am using the Result Grouping feature with SolrCloud, and it seems that grouping does not work in distributed mode with field types having a precisionStep property greater than 0. I updated the SolrCloud - Getting Started page example A (simple two-shard cluster). In my schema.xml, the popularity field has an int type where I changed precisionStep from 0 to 4:

<fieldType name="int" class="solr.TrieIntField" precisionStep="4" positionIncrementGap="0"/>
<field name="popularity" type="int" indexed="true" stored="true"/>

When I request in distributed mode, grouping on this field does not return groups:

http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity&distrib=true

<lst name="grouped">
  <lst name="popularity">
    <int name="matches">1</int>
    <arr name="groups">
      <lst>
        <int name="groupValue">0</int>
        <result name="doclist" numFound="0" start="0"/>
      </lst>
    </arr>
  </lst>
</lst>

When I request on a single core, grouping on this field returns a group:

http://localhost:8983/solr/select?q=*:*&group=true&group.field=popularity&distrib=false

<lst>
  <int name="groupValue">10</int>
  <result name="doclist" numFound="1" start="0">
    <doc>
      <str name="id">MA147LL/A</str>
      ...
      <int name="popularity">10</int>
      ...
    </doc>
  </result>
</lst>

If I go back to the original configuration, with the int type using precisionStep=0, the distributed request works:

<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>

A precisionStep greater than 0 can be useful for range queries, but is it normal that it is incompatible with grouping queries in distributed mode only?

Elodie Sannier
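A small SolrJ sketch, under the same assumptions as the URLs above (local two-shard example, popularity field), that compares the distributed and single-core grouping results in one run:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class GroupingCheck {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery("*:*");
        query.set("group", true);
        query.set("group.field", "popularity");
        // compare distributed and single-core behaviour of the same query
        for (boolean distrib : new boolean[] { true, false }) {
            query.set("distrib", distrib);
            QueryResponse rsp = server.query(query);
            int groups = rsp.getGroupResponse().getValues().get(0).getValues().size();
            System.out.println("distrib=" + distrib + ": " + groups + " group(s)");
        }
        server.shutdown();
    }
}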
Solrj 4.2 - CloudSolrServer aliases are not loaded
Hello, I am using the new collection alias feature, and it seems the CloudSolrServer class (solrj 4.2.0) does not allow using it, either for updates or selects. When I request the CloudSolrServer with a collection alias name, I get the error:

org.apache.solr.common.SolrException: Collection not found: aliasedCollection

The collection alias cannot be found because, in the CloudSolrServer#getCollectionList method (line 319), the alias variable is always empty. When I request the CloudSolrServer, the connect method is called and it calls the ZkStateReader#createClusterStateWatchersAndUpdate method. In ZkStateReader#createClusterStateWatchersAndUpdate, the aliases are not loaded. At line 295, the data from /clusterstate.json are loaded:

ClusterState clusterState = ClusterState.load(zkClient, liveNodeSet);
this.clusterState = clusterState;

Shouldn't the same data be loaded from /aliases.json, in order to fill the aliases field? At line 299, a Watcher for aliases is created but does not seem to be used. As a workaround to avoid the error, I have to force the aliases to load at application start and whenever the aliases are updated:

CloudSolrServer solrServer = new CloudSolrServer("localhost:2181");
solrServer.setDefaultCollection("aliasedCollection");
solrServer.connect();
solrServer.getZkStateReader().updateAliases();

Is there a better way to use collection aliases with solrj?

Elodie Sannier
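A sketch of that workaround wrapped in a small helper class, so the alias refresh is not forgotten when aliases change (the class name is hypothetical, and the broad throws Exception is a simplification covering the checked ZooKeeper exceptions that updateAliases can declare):

import org.apache.solr.client.solrj.impl.CloudSolrServer;

public class AliasAwareClient {
    private final CloudSolrServer server;

    public AliasAwareClient(String zkHost, String aliasedCollection) throws Exception {
        server = new CloudSolrServer(zkHost);
        server.setDefaultCollection(aliasedCollection);
        server.connect();
        // workaround: force the alias map to be read from /aliases.json,
        // since connect() alone does not populate it in solrj 4.2
        server.getZkStateReader().updateAliases();
    }

    // call this again whenever aliases are changed on the cluster
    public void refreshAliases() throws Exception {
        server.getZkStateReader().updateAliases();
    }

    public CloudSolrServer get() {
        return server;
    }
}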