Re: hello, a question about solr.
The name field is text, which is analyzed. I use the query name:ibmT63notebook.

2008/8/18, Shalin Shekhar Mangar [EMAIL PROTECTED]:

> Hi,
>
> What is the type of the field "name"? Does a query like name:ibm OR name:T63 OR name:notebook work for you?
>
> On Mon, Aug 18, 2008 at 10:43 AM, finy finy [EMAIL PROTECTED] wrote:
>
>> I have used Solr for 3 months, and I have found the following issue. Checking the Solr source code, I see that it uses Lucene's QueryParser to parse the user's input query string. For example, for a query like name:ibmT63notebook, Solr parses it as "name:ibm T63 notebook" and treats it as a PhraseQuery. But I want a result that includes ibm and T63 and notebook at any position; for example, it should match a sentence like "i have a notebook, it is t63 of ibm". Solr doesn't do that; the query parser produces a PhraseQuery. How can I get the result I have in mind?
>>
>> Thanks, your friend!

--
Regards,
Shalin Shekhar Mangar.
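Shalin's suggestion of explicit per-term clauses can be automated client-side before the query is sent. A minimal sketch, assuming the client already has the individual tokens (the helper is hypothetical, not part of Solr):

```python
def to_boolean_query(field, tokens, op="AND"):
    """Join pre-split tokens into an explicit boolean query string,
    e.g. name:(ibm AND T63 AND notebook), so Lucene's QueryParser
    builds a BooleanQuery instead of a PhraseQuery."""
    joined = (" %s " % op).join(tokens)
    return "%s:(%s)" % (field, joined)

print(to_boolean_query("name", ["ibm", "T63", "notebook"]))
# name:(ibm AND T63 AND notebook)
```

The same helper works for OR semantics by passing op="OR", matching Shalin's example query.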
Re: Jetty Multicore installation doesn't work
It seems that you are trying to use a Solr 1.3 feature (multiple cores) with a Solr 1.2 war file. If you want to use multiple cores, you must use a nightly build of Solr and take a look at the CoreAdmin page (formerly known as MultiCore): http://wiki.apache.org/solr/CoreAdmin

On Mon, Aug 18, 2008 at 2:19 PM, parthad76 [EMAIL PROTECTED] wrote:

> Hi,
>
> I tried to run the multicore installation of Jetty after downloading it. It's throwing the following error and I am not sure why. I added the multicore.xml file in solr.home, but that doesn't work either. Can someone please help?
>
> INFO: Solr home set to 'multicore/'
> 2008-08-18 14:18:31.796::WARN:  failed SolrRequestFilter
> java.lang.NoClassDefFoundError
>         at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:74)
>         at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>         at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
>         at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
>         at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
>         at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
>         at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>         at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
>         at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>         at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>         at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
>         at org.mortbay.jetty.Server.doStart(Server.java:210)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>         at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.mortbay.start.Main.invokeMain(Main.java:183)
>         at org.mortbay.start.Main.start(Main.java:497)
>         at org.mortbay.start.Main.main(Main.java:115)
> 2008-08-18 14:18:31.875::WARN:  failed [EMAIL PROTECTED] 8e059{/solr,jar:file:/D:/Projects/SaaS%20-%20Social%20Commerce%20Platform/Core%20Services/Search/apache-solr-1.2.0_Single/example/webapps/solr.war!/}
> java.lang.NoClassDefFoundError
>         at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:74)
>         ... (remainder identical to the trace above)
> 2008-08-18 14:18:31.906::WARN:
Re: IndexOutOfBoundsException
Hi Ian,

I sent this to java-user, but maybe you didn't see it, so let's try again on solr-user:

It looks like your stored fields file (_X.fdt) is corrupt. Are you using multiple threads to add docs? Can you try switching to SerialMergeScheduler to verify it's reproducible?

When you hit this exception, can you stop Solr and then run Lucene's CheckIndex tool (org.apache.lucene.index.CheckIndex) to verify the index is corrupt and see which segment it is? Then post back the exception and an ls -l of your index directory?

If you could post the client-side code you're using to build and submit docs to Solr, and if I can get access to the Medline content and can repro the bug, then I'll track it down...

Mike

On Aug 14, 2008, at 10:18 PM, Ian Connor wrote:

> I seem to be able to reproduce this very easily, and the data is Medline (so I am sure I can share it if needed, with a quick email to check).
>
> - I am using Fedora:
>   % uname -a
>   Linux ghetto5.projectlounge.com 2.6.23.1-42.fc8 #1 SMP Tue Oct 30 13:18:33 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
>   % java -version
>   java version 1.7.0
>   IcedTea Runtime Environment (build 1.7.0-b21)
>   IcedTea 64-Bit Server VM (build 1.7.0-b21, mixed mode)
> - single core (will use shards, but each machine has just one HDD, so I didn't see how cores would help; I am new at this)
> - next run I will keep the output to check for earlier errors
> - very reproducible, and I can share code + data if that will help
>
> On Thu, Aug 14, 2008 at 4:23 PM, Yonik Seeley [EMAIL PROTECTED] wrote:
>
>> Yikes... not good. This shouldn't be due to anything you did wrong, Ian... it looks like a Lucene bug.
>>
>> Some questions:
>> - what platform are you running on, and what JVM?
>> - are you using multicore? (I fixed some index locking bugs recently)
>> - are there any exceptions in the log before this?
>> - how reproducible is this?
>>
>> -Yonik
>>
>> On Thu, Aug 14, 2008 at 2:47 PM, Ian Connor [EMAIL PROTECTED] wrote:
>>
>>> Hi,
>>>
>>> I have rebuilt my index a few times (it should get up to about 4 million, but around 1 million it starts to fall apart).
>>>
>>> Exception in thread Lucene Merge Thread #0
>>> org.apache.lucene.index.MergePolicy$MergeException: java.lang.IndexOutOfBoundsException: Index: 105, Size: 33
>>>         at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:323)
>>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:300)
>>> Caused by: java.lang.IndexOutOfBoundsException: Index: 105, Size: 33
>>>         at java.util.ArrayList.rangeCheck(ArrayList.java:572)
>>>         at java.util.ArrayList.get(ArrayList.java:350)
>>>         at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:260)
>>>         at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:188)
>>>         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:670)
>>>         at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:349)
>>>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:134)
>>>         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3998)
>>>         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3650)
>>>         at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:214)
>>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:269)
>>>
>>> When this happens, the disk usage goes right up and the indexing really starts to slow down. I am using a Solr build from about a week ago, so my Lucene is at 2.4 according to the war files.
>>>
>>> Has anyone seen this error before? Is it possible to tell which array is too large? Would it be an array I am sending in or another internal one?
>>>
>>> Regards,
>>> Ian Connor
>
> --
> Regards,
> Ian Connor
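Mike's SerialMergeScheduler suggestion is a solrconfig.xml change. A sketch of the relevant element, assuming the 1.3-era syntax (check the solrconfig.xml shipped with your build, since the exact form has varied between Solr versions):

```xml
<!-- inside <indexDefaults> in solrconfig.xml (sketch; syntax varies by version) -->
<mergeScheduler>org.apache.lucene.index.SerialMergeScheduler</mergeScheduler>
```

With the serial scheduler, merges run in the calling thread rather than concurrently, which helps rule out merge-thread races when chasing corruption.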
Re: IndexOutOfBoundsException
Hi Mike,

I am currently ruling out some bad memory modules. Knowing that this is index corruption makes memory corruption more likely. If replacing the RAM does not fix the problem (which I need to do anyway due to segmentation faults), I will package up the crash into a reproducible scenario.

On Mon, Aug 18, 2008 at 5:56 AM, Michael McCandless [EMAIL PROTECTED] wrote:

> Hi Ian,
>
> I sent this to java-user, but maybe you didn't see it, so let's try again on solr-user:
>
> It looks like your stored fields file (_X.fdt) is corrupt. Are you using multiple threads to add docs? Can you try switching to SerialMergeScheduler to verify it's reproducible?
>
> When you hit this exception, can you stop Solr and then run Lucene's CheckIndex tool (org.apache.lucene.index.CheckIndex) to verify the index is corrupt and see which segment it is? Then post back the exception and an ls -l of your index directory?
>
> If you could post the client-side code you're using to build and submit docs to Solr, and if I can get access to the Medline content and can repro the bug, then I'll track it down...
>
> Mike

--
Regards,
Ian Connor
solr doc
Hello all,

I'm looking for a doc that covers the following situation: how can two Solr servers be synchronised with each other? And if one of them goes down for whatever reason, how can the other one take over? Does Solr have anything like master/slave takeover?

Any docs or suggestions are thankfully welcomed.

many thanks,
ak

_________________________________________________________________
Win New York holidays with Kellogg's Live Search
http://clk.atdmt.com/UKM/go/107571440/direct/01/
Restrict Wildcards
Hi list.

Is it possible to create a field type in Solr that does not match wildcard queries? I want it to only match the complete string, so if I have indexed foo123 and foo234, I don't want foo* to match either of them. This does not work with just using the predefined string type.

Any suggestions?

Warm regards,
Erlend Hamnaberg
Re: solr doc
Keep a slave handy as a second master, and if the real master goes down, let the second one take over.

On Mon, Aug 18, 2008 at 4:44 PM, dudes dudes [EMAIL PROTECTED] wrote:

> Hello all,
>
> I'm looking for a doc that covers the following situation: how can two Solr servers be synchronised with each other? And if one of them goes down for whatever reason, how can the other one take over? Does Solr have anything like master/slave takeover?
>
> Any docs or suggestions are thankfully welcomed.
>
> many thanks,
> ak

--
--Noble Paul
Re: solr doc
Take a look at http://wiki.apache.org/solr/CollectionDistribution

On Mon, Aug 18, 2008 at 4:44 PM, dudes dudes [EMAIL PROTECTED] wrote:

> Hello all,
>
> I'm looking for a doc that covers the following situation: how can two Solr servers be synchronised with each other? And if one of them goes down for whatever reason, how can the other one take over? Does Solr have anything like master/slave takeover?
>
> Any docs or suggestions are thankfully welcomed.
>
> many thanks,
> ak

--
Regards,
Shalin Shekhar Mangar.
RE: solr doc
thanks :)

> Date: Mon, 18 Aug 2008 17:54:20 +0530
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
> Subject: Re: solr doc
>
> Take a look at http://wiki.apache.org/solr/CollectionDistribution
>
> On Mon, Aug 18, 2008 at 4:44 PM, dudes dudes wrote:
>
>> Hello all,
>>
>> I'm looking for a doc that covers the following situation: how can two Solr servers be synchronised with each other? And if one of them goes down for whatever reason, how can the other one take over? Does Solr have anything like master/slave takeover?
>>
>> many thanks,
>> ak
>
> --
> Regards,
> Shalin Shekhar Mangar.

_________________________________________________________________
Win a voice over part with Kung Fu Panda. Live Search and 100's of Kung Fu Panda prizes to win with Live Search
http://clk.atdmt.com/UKM/go/107571439/direct/01/
Re: partialResults, distributed search SOLR-502
I don't think this patch is working yet. If I take a shard out of rotation (even just one out of four), I get an error:

org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException: Connection refused
org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException: Connection refused
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:256)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1156)

from

http://localhost:8983/solr/select/?shards=10.0.16.181:8983,10.0.16.182:8983,10.0.16.183:8983,10.0.16.184:8983/solr&timeAllowed=1000&q=cancer%0D%0A&version=2.2&start=0&rows=10&indent=on

where .181 is down but .182-.184 are up.

On Fri, Aug 15, 2008 at 1:23 PM, Brian Whitman [EMAIL PROTECTED] wrote:

> I was going to file a ticket like this:
>
> A SOLR-303 query with shards=host1,host2,host3 when host3 is down returns an error. One of the advantages of a shard implementation is that data can be stored redundantly across different shards, either as direct copies (e.g. when host1 and host3 are snapshooter'd copies of each other) or where there is some data RAID that stripes indexes for redundancy.
>
> But then I saw SOLR-502, which appears to be committed. If I have the above scenario (host1, host2, host3 where host3 is not up) and set a timeAllowed, will I still get a 400, or will it come back with partial results? If not, can we think of a way to get this to work?
>
> It's my understanding already that duplicate docIDs are merged in the SOLR-303 response, so other than building in some "this host isn't working, just move on and report it" logic, and of course the work to index redundantly, we wouldn't need anything else to achieve a good redundant shard implementation.
>
> B

--
Regards,
Ian Connor
Re: hello, a question about solr.
On Mon, 18 Aug 2008 15:33:02 +0800, finy finy [EMAIL PROTECTED] wrote:

> the name field is text, which is analysed; i use the query name:ibmT63notebook

Why do you search with no spaces? Is this free text entered by a user, or is it part of a link which you control?

PS: please don't top-post

_
{Beto|Norberto|Numard} Meijome

"Commitment is active, not passive. Commitment is doing whatever you can to bring about the desired result. Anything less is half-hearted."

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
Boosting fields by default
Hi,

I'm using the data import mechanism to pull data into my index. If I want to boost a certain field for all docs (e.g. the title over the body), what is the best way to do that? I was expecting to change something in schema.xml, but I don't see any info on boosting there.

Thanks in advance,
-Rakesh
Re: Restrict Wildcards
Erlend,

This doesn't work with string? Maybe something there is removing numbers. Have you tried with an example without numbers, e.g. fooaaa and foobbb? Does foo* match them both?

If it does, then perhaps you can create a custom field type and use KeywordTokenizer in it. The example schema.xml has some of this stuff.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Erlend Hamnaberg [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Monday, August 18, 2008 7:42:22 AM
Subject: Restrict Wildcards

Hi list.

Is it possible to create a field type in Solr that does not match wildcard queries? I want it to only match the complete string, so if I have indexed foo123 and foo234, I don't want foo* to match either of them. This does not work with just using the predefined string type.

Any suggestions?

Warm regards,
Erlend Hamnaberg
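For reference, a KeywordTokenizer-based field type along the lines Otis suggests might look like this in schema.xml (the type name is made up; adapted from the kind of definitions in the example schema):

```xml
<!-- the whole field value is kept as a single token, so analysis
     never splits foo123 into smaller parts -->
<fieldType name="exactString" class="solr.TextField" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

Note this controls how the value is tokenized at index time; whether wildcard queries reach such a field is a separate question for the query side.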
Re: hello, a question about solr.
Because I use Chinese characters. For example, for "ibm笔记本电脑", Solr will parse it into a term "ibm" and a phrase "笔记本 电脑". Can I make Solr query with a term "ibm" and a term "笔记本" and a term "电脑"?

2008/8/18, Norberto Meijome [EMAIL PROTECTED]:

> On Mon, 18 Aug 2008 15:33:02 +0800, finy finy [EMAIL PROTECTED] wrote:
>
>> the name field is text, which is analysed; i use the query name:ibmT63notebook
>
> Why do you search with no spaces? Is this free text entered by a user, or is it part of a link which you control?
>
> PS: please don't top-post
Re: Order of returned fields
Yes, this is normal behavior. Does order matter in your application? Could you explain why?

Order is maintained with multiple values of the same field name, though, which is important.

        Erik

On Aug 17, 2008, at 6:38 PM, Pierre Auslaender wrote:

> Hello,
>
> After a Solr query, I always get the fields back in alphabetical order, no matter how I insert them. Is this the normal behaviour?
>
> This is when adding the document...
>
> <doc>
>   <field name="uid">ch.tsr.esg.domain.ProgramCollection[id: 1]</field>
>   <field name="genre">collection</field>
>   <field name="collection">Bac à sable</field>
>   <field name="collection.url">http://localhost:8080/esg/api/collections/1</field>
> </doc>
>
> ... and this is when retrieving it:
>
> <doc>
>   <str name="collection">Bac à sable</str>
>   <str name="collection.url">http://localhost:8080/esg/api/collections/1</str>
>   <str name="genre">collection</str>
>   <str name="uid">ch.tsr.esg.domain.ProgramCollection[id: 1]</str>
> </doc>
>
> Thanks a lot,
> Pierre Auslaender
SimpleFacets: Performance Boost for Tokenized Fields
Hello:

Term vectors could be much faster than intersections with the FilterCache. Exception: when the size of the DocSet is close to (more than 50% of) the total count of documents in the index.

When it works (100 times faster than the current approach; a very specific scenario):
- use stored term vectors;
- 10,000,000 documents in the index;
- 5-10 terms per document;
- 200,000 unique terms for a tokenized field.

Obviously, calculating the sizes of 200,000 intersections with the FilterCache is slower than traversing 10-20,000 documents for smaller DocSets and counting the frequencies of terms. There are some related TODOs in the SOLR source.

--
Thanks,
Fuad Efendi
416-993-2060(cell)
Tokenizer Inc.
==
http://www.linkedin.com/in/liferay
http://www.tokenizer.org
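The tradeoff can be illustrated with a toy sketch (pure illustration in Python; this is not Solr's actual implementation):

```python
from collections import Counter

def facet_by_term_vectors(doc_set, term_vectors):
    # Walk each matching document's stored term vector and tally its
    # terms: cost grows with |DocSet| * terms-per-doc, so it wins
    # when the DocSet is small.
    counts = Counter()
    for doc_id in doc_set:
        counts.update(term_vectors[doc_id])
    return counts

def facet_by_intersection(doc_set, postings):
    # For every unique term in the field, intersect its posting list
    # with the DocSet: cost grows with the number of unique terms
    # (200,000 in Fuad's scenario), regardless of DocSet size.
    matched = set(doc_set)
    return Counter({term: len(matched & docs)
                    for term, docs in postings.items()
                    if matched & docs})

# tiny index: doc -> terms, plus the inverted view term -> docs
term_vectors = {0: ["ibm", "laptop"], 1: ["laptop"], 2: ["sun"]}
postings = {"ibm": {0}, "laptop": {0, 1}, "sun": {2}}

print(facet_by_term_vectors([0, 1], term_vectors))
print(facet_by_intersection([0, 1], postings))
```

Both strategies produce the same counts; only the cost model differs, which is exactly the point of the scenario above.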
.wsdl for example....
hi :)

Does anyone have a .wsdl definition for the example bundled with SOLR? If nobody has it, would it be useful to have one?

cheers,
B

_
{Beto|Norberto|Numard} Meijome

"Intelligence: Finding an error in a Knuth text. Stupidity: Cashing that $2.56 check you got."

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
Re: Boosting fields by default
On Mon, Aug 18, 2008 at 7:12 PM, Rakesh Godhani [EMAIL PROTECTED] wrote:

> Hi, I'm using the data import mechanism to pull data into my index. If I want to boost a certain field for all docs (e.g. the title over the body), what is the best way to do that? I was expecting to change something in schema.xml, but I don't see any info on boosting there.

You can specify the boost as an attribute on the field in data-config.xml:

<field column="title" boost="2.0" />

--
Regards,
Shalin Shekhar Mangar.
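In context, the boost attribute sits on the field mapping inside the entity. A sketch of a data-config.xml, where the data source, table, and column names are made up for illustration:

```xml
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db"/>
  <document>
    <entity name="item" query="select id, title, body from item">
      <field column="id"/>
      <field column="title" boost="2.0"/>  <!-- boost title over body -->
      <field column="body"/>
    </entity>
  </document>
</dataConfig>
```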
Re: partialResults, distributed search SOLR-502
Hi,

I have traced this as far as I can figure. It does seem as though the patch is in the trunk. I can see that timeAllowed is certainly being set and the Lucene class TimeLimitedCollector is being used when the param is there.

However, I have tried to trace RequestHandlerBase from this stack through to SearchHandler, and I get lost when the shard request is submitted. I can see it creates a CommonsHttpSolrServer to make the request, and that at least at this point the timeAllowed param is alive and well. However, when I try to dive into the QueryRequest and SolrServer, I realize my Java is a little rusty. Can anyone explain how the QueryRequest here uses the code that is found in SolrIndexSearcher?

On Mon, Aug 18, 2008 at 9:31 AM, Ian Connor [EMAIL PROTECTED] wrote:

> I don't think this patch is working yet. If I take a shard out of rotation (even just one out of four), I get an error:
>
> org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException: Connection refused
> org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException: Connection refused
>         at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:256)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1156)
>
> from
>
> http://localhost:8983/solr/select/?shards=10.0.16.181:8983,10.0.16.182:8983,10.0.16.183:8983,10.0.16.184:8983/solr&timeAllowed=1000&q=cancer%0D%0A&version=2.2&start=0&rows=10&indent=on
>
> where .181 is down but .182-.184 are up.

--
Regards,
Ian Connor
Re: partialResults, distributed search SOLR-502
On Aug 18, 2008, at 11:51 AM, Ian Connor wrote:

> On Mon, Aug 18, 2008 at 9:31 AM, Ian Connor [EMAIL PROTECTED] wrote:
>
>> I don't think this patch is working yet. If I take a shard out of rotation (even just one out of four), I get an error:
>>
>> org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException: Connection refused

It's my understanding that SOLR-502 is really only concerned with queries timing out (i.e. they connect but take over N seconds to return). If the connection gets refused, then a non-Solr Java connection exception is thrown. Something would have to get put in that (optionally) catches connection errors and still builds the response from the shards that did respond.

--
http://variogr.am/
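The missing piece Brian describes (catch per-shard connection errors and still answer from the shards that responded) could be sketched like this; everything below is a hypothetical client-side illustration, not existing Solr or SolrJ code:

```python
def query_shards(shards, fetch):
    """Query every shard, skipping any that refuse the connection.

    `fetch(shard)` returns that shard's list of result docs, or raises
    ConnectionError when the shard is down.  Returns the merged docs
    plus the list of shards that failed, so the caller can flag the
    response as partial instead of failing the whole request.
    """
    docs, failed = [], []
    for shard in shards:
        try:
            docs.extend(fetch(shard))
        except ConnectionError:
            failed.append(shard)
    return docs, failed

def fake_fetch(shard):
    # stand-in for the HTTP request a real client would make
    if shard == "10.0.16.181:8983":
        raise ConnectionError("Connection refused")
    return [shard + "/doc1"]

docs, failed = query_shards(
    ["10.0.16.181:8983", "10.0.16.182:8983"], fake_fetch)
print(docs, failed)
```

The down shard ends up in `failed` rather than aborting the request, which is the behaviour the thread is asking for.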
Re: Localisation, faceting
Hi,

Regarding boolean operator localization: there was a person who submitted patches for the same functionality, but for Lucene's QueryParser. This was a few years ago. I think his patch was never applied. Perhaps that helps.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Pierre Auslaender [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Saturday, August 16, 2008 12:50:53 PM
Subject: Localisation, faceting

Hello,

I have a couple of questions:

1/ Is it possible to localise query operator names without writing code? For instance, I'd like to issue queries with French operator names, e.g. ET (instead of AND), OU (instead of OR), etc.

2/ Is it possible for Solr to generate, in the XML response, the URLs or complete queries for each facet in a faceted search? Here's an example. Say my first query is:

http://localhost:8080/solr/select?q=bac&facet=true&facet.field=kind&facet.limit=-1

The kind field has three values: material, immaterial, time, and I get back a count for each (1024, 27633, 389). If I want to drill down into one facet, say into material, I have to manually rebuild a query like this:

http://localhost:8080/solr/select?q=bac&facet=true&facet.field=kind&facet.limit=-1&fq=kind:material

It's not too difficult, but surely Solr could add this URL or query string under the material element. Is this possible? Or do I have to XSLT the result myself?

Thanks,
Pierre Auslaender
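Until Solr emits such URLs itself, rebuilding the drill-down URL client-side is only a few lines. A sketch (the helper name is made up):

```python
from urllib.parse import urlencode

def drilldown_url(base, params, field, value):
    # Re-issue the original facet query with an added fq filter
    # restricting results to the chosen facet value.
    query = dict(params)
    query["fq"] = "%s:%s" % (field, value)
    return base + "?" + urlencode(query)

url = drilldown_url(
    "http://localhost:8080/solr/select",
    {"q": "bac", "facet": "true", "facet.field": "kind", "facet.limit": "-1"},
    "kind", "material")
print(url)
```

The same idea works in an XSLT or templating layer: copy the original parameters and append one fq per facet value.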
Re: Solr Logo thought
I like it, even its asymmetry. :)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Lukáš Vlček [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Sunday, August 17, 2008 7:02:25 PM
Subject: Re: Solr Logo thought

Hi,

My initial draft of a Solr logo can be found here: http://picasaweb.google.com/lukas.vlcek/Solr

The reason why I haven't attached it to SOLR-84 for now is that this is just a draft and not a final design (there are a lot of unfinished details). I would like to get some feedback before I spend more time on it. I had several ideas, but in the end I found that simplicity works best: simple font, sun motive, just two colors. It should look fine in both large and small formats. As for the favicon, I would use the sun motive only, that is, the O letter with the beams. The logo font still needs a lot of small (but important) touches. For now I would like to get feedback mostly on the basic idea.

Regards,
Lukas

On Sat, Aug 9, 2008 at 8:21 PM, Mark Miller wrote:

> Plenty left, but here is a template to get things started: http://wiki.apache.org/solr/LogoContest
>
> Speaking of which, if we want to maintain the momentum of interest in this topic, someone (ie: not me) should set up a LogoContest wiki page with some of the goals discussed in the various threads on solr-user and solr-dev recently, as well as draft up some good guidelines for how we should run the contest

--
http://blog.lukas-vlcek.com/
Re: partialResults, distributed search SOLR-502
Yes, as far as I know, what Brian said is correct. Also, as far as I know, there is nothing that gracefully handles problematic Solr instances during distributed search. Solr 1.4 request? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Brian Whitman [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Monday, August 18, 2008 11:57:23 AM Subject: Re: partialResults, distributed search SOLR-502 On Aug 18, 2008, at 11:51 AM, Ian Connor wrote: On Mon, Aug 18, 2008 at 9:31 AM, Ian Connor wrote: I don't think this patch is working yet. If I take a shard out of rotation (even just one out of four), I get an error: org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException: Connection refused It's my understanding that SOLR-502 is really only concerned with queries timing out (i.e. they connect but take over N seconds to return) If the connection gets refused then a non-solr java connection exception is thrown. Something would have to get put in that (optionally) catches connection errors and still builds the response from the shards that did respond. On Fri, Aug 15, 2008 at 1:23 PM, Brian Whitman wrote: I was going to file a ticket like this: A SOLR-303 query with shards=host1,host2,host3 when host3 is down returns an error. One of the advantages of a shard implementation is that data can be stored redundantly across different shards, either as direct copies (e.g. when host1 and host3 are snapshooter'd copies of each other) or where there is some data RAID that stripes indexes for redundancy. But then I saw SOLR-502, which appears to be committed. If I have the above scenario (host1,host2,host3 where host3 is not up) and set a timeAllowed, will I still get a 400 or will it come back with partial results? If not, can we think of a way to get this to work? 
It's my understanding already that duplicate docIDs are merged in the SOLR-303 response, so other than building in some "this host isn't working, just move on and report it" handling (and of course the work to index redundantly), we wouldn't need anything else to achieve a good redundant shard implementation. B -- Regards, Ian Connor -- Regards, Ian Connor -- http://variogr.am/
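A minimal sketch of the catch-and-continue behaviour discussed in this thread — skip shards that refuse connections and still build a response from the ones that answered. The shard names are hypothetical and the shards are stand-in Callables rather than real HTTP endpoints (real code would query them via SolrJ):

```java
import java.net.ConnectException;
import java.util.*;
import java.util.concurrent.Callable;

public class PartialResultsSketch {
    // Merge results from whichever shards respond; skip ones whose
    // connection fails instead of failing the whole distributed request.
    static List<String> queryShards(Map<String, Callable<List<String>>> shards) {
        List<String> merged = new ArrayList<>();
        for (Map.Entry<String, Callable<List<String>>> e : shards.entrySet()) {
            try {
                merged.addAll(e.getValue().call());
            } catch (Exception ex) {
                // host down: report it and move on
                System.err.println("shard " + e.getKey() + " skipped: " + ex);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Callable<List<String>>> shards = new LinkedHashMap<>();
        shards.put("host1", () -> List.of("doc1", "doc2"));
        shards.put("host2", () -> List.of("doc3"));
        shards.put("host3", () -> { throw new ConnectException("Connection refused"); });
        System.out.println(queryShards(shards)); // [doc1, doc2, doc3]
    }
}
```

Whether skipping a dead shard should be opt-in (and flagged in the response, like partialResults) is exactly the design question raised above.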
Re: partialResults, distributed search SOLR-502
When I put logging into SolrIndexSearcher just to see if we get there, I don't see any messages. However, I do see logging without a problem in QueryRequest and above. My issue is that I just cannot understand how SolrIndexSearcher comes into play here. On Mon, Aug 18, 2008 at 11:57 AM, Brian Whitman [EMAIL PROTECTED] wrote: On Aug 18, 2008, at 11:51 AM, Ian Connor wrote: On Mon, Aug 18, 2008 at 9:31 AM, Ian Connor [EMAIL PROTECTED] wrote: I don't think this patch is working yet. If I take a shard out of rotation (even just one out of four), I get an error: org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException: Connection refused It's my understanding that SOLR-502 is really only concerned with queries timing out (i.e. they connect but take over N seconds to return) If the connection gets refused then a non-solr java connection exception is thrown. Something would have to get put in that (optionally) catches connection errors and still builds the response from the shards that did respond. On Fri, Aug 15, 2008 at 1:23 PM, Brian Whitman [EMAIL PROTECTED] wrote: I was going to file a ticket like this: A SOLR-303 query with shards=host1,host2,host3 when host3 is down returns an error. One of the advantages of a shard implementation is that data can be stored redundantly across different shards, either as direct copies (e.g. when host1 and host3 are snapshooter'd copies of each other) or where there is some data RAID that stripes indexes for redundancy. But then I saw SOLR-502, which appears to be committed. If I have the above scenario (host1,host2,host3 where host3 is not up) and set a timeAllowed, will I still get a 400 or will it come back with partial results? If not, can we think of a way to get this to work? 
It's my understanding already that duplicate docIDs are merged in the SOLR-303 response, so other than building in some this host isn't working, just move on and report it and of course the work to index redundantly, we wouldn't need anything to achieve a good redundant shard implementation. B -- Regards, Ian Connor -- Regards, Ian Connor -- http://variogr.am/ -- Regards, Ian Connor 1 Leighton St #605 Cambridge, MA 02141 Direct Line: +1 (978) 672 Call Center Phone: +1 (714) 239 3875 (24 hrs) Mobile Phone: +1 (312) 218 3209 Fax: +1(770) 818 5697 Suisse Phone: +41 (0) 22 548 1664 Skype: ian.connor
Re: Boosting fields by default
Sweet, cool, thanks -Rakesh On 8/18/08 11:31 AM, Shalin Shekhar Mangar [EMAIL PROTECTED] wrote: On Mon, Aug 18, 2008 at 7:12 PM, Rakesh Godhani [EMAIL PROTECTED] wrote: Hi, I'm using the data import mechanism to pull data into my index. If I want to boost a certain field for all docs, (e.g. the title over the body) what is the best way to do that? I was expecting to change something in schema.xml but I don't see any info on boosting there. You can specify the boost as an attribute on the field in data-config.xml: <field column="title" boost="2.0" />
Re: partialResults, distributed search SOLR-502
On Mon, Aug 18, 2008 at 12:16 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote: Yes, as far as I know, what Brian said is correct. Also, as far as I know, there is nothing that gracefully handles problematic Solr instances during distributed search. Right... we punted that issue to a load balancer (which assumes that you have more than one copy of each shard). -Yonik
Re: partialResults, distributed search SOLR-502
On Aug 18, 2008, at 12:31 PM, Yonik Seeley wrote: On Mon, Aug 18, 2008 at 12:16 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote: Yes, as far as I know, what Brian said is correct. Also, as far as I know, there is nothing that gracefully handles problematic Solr instances during distributed search. Right... we punted that issue to a load balancer (which assumes that you have more than one copy of each shard). Can you explain how you have a LB handling shards? Do you put a separate LB in front of each group of replica shards?
Re: partialResults, distributed search SOLR-502
Right. And a LB that is configured to, say, make use of Solr's ping response to determine if Solr is healthy? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Yonik Seeley [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Monday, August 18, 2008 12:31:03 PM Subject: Re: partialResults, distributed search SOLR-502 On Mon, Aug 18, 2008 at 12:16 PM, Otis Gospodnetic wrote: Yes, as far as I know, what Brian said is correct. Also, as far as I know, there is nothing that gracefully handles problematic Solr instances during distributed search. Right... we punted that issue to a load balancer (which assumes that you have more than one copy of each shard). -Yonik
Re: Administrative questions
Thanks! I put that up on http://wiki.apache.org/solr/Daemontools , so if you want to add/change anything, you can do so at any time (anyone can edit or create wiki pages). Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Jon Drukman [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Friday, August 15, 2008 4:47:27 PM Subject: Re: Administrative questions Jason Rennie wrote: On Wed, Aug 13, 2008 at 1:52 PM, Jon Drukman wrote: Duh. I should have thought of that. I'm a big fan of djbdns so I'm quite familiar with daemontools. Thanks! :) My pleasure. Was nice to hear recently that DJB is moving toward more flexible licensing terms. For anyone unfamiliar w/ daemontools, here's DJB's explanation of why they rock compared to inittab, ttys, init.d, and rc.local: http://cr.yp.to/daemontools/faq/create.html#why in case anybody wants to know, here's how to run solr under daemontools. 1. install daemontools 2. create /etc/solr 3. create a user and group called solr 4. create shell script /etc/solr/run (edit to taste, i'm using the default jetty that comes with solr) #!/bin/sh exec 2>&1 cd /usr/local/apache-solr-1.2.0/example exec setuidgid solr java -jar start.jar 5. create /etc/solr/log/run containing: #!/bin/sh exec setuidgid solr multilog t ./main 6. ln -s /etc/solr /service/solr that is all. as long as you've got svscan set to launch when the system boots, solr will run and auto-restart on crashes. logs will be in /service/solr/log/main (auto-rotated). yay. -jsd-
Re: partialResults, distributed search SOLR-502
On Mon, Aug 18, 2008 at 12:34 PM, Brian Whitman [EMAIL PROTECTED] wrote: On Aug 18, 2008, at 12:31 PM, Yonik Seeley wrote: On Mon, Aug 18, 2008 at 12:16 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote: Yes, as far as I know, what Brian said is correct. Also, as far as I know, there is nothing that gracefully handles problematic Solr instances during distributed search. Right... we punted that issue to a load balancer (which assumes that you have more than one copy of each shard). Can you explain how you have a LB handling shards? Do you put a separate LB in front of each group of replica shards? A single load balancer should be fine... each shard has its own VIP which maps to 2 or more solr servers with a replica of that shard. -Yonik
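As a concrete illustration of the setup Yonik describes (all hostnames below are made up), the shards parameter would point at the per-shard VIPs rather than at individual replicas:

```
# hypothetical hostnames; each *-vip fronts 2+ replicas of that shard
http://any-searcher:8983/solr/select?q=foo&shards=shard1-vip:8983/solr,shard2-vip:8983/solr,shard3-vip:8983/solr
```

The load balancer behind each VIP picks a live replica, so a dead host drops out without the distributed query ever seeing a connection refusal.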
Re: partialResults, distributed search SOLR-502
My interest now is beyond the initial problem and would love if someone could explain how you get from a QueryRequest being created to using the code in SolrIndexSearcher. On Mon, Aug 18, 2008 at 12:34 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote: Right. And a LB that is configured to, say, make use of Solr's ping response to determine if Solr healthy? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Yonik Seeley [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Monday, August 18, 2008 12:31:03 PM Subject: Re: partialResults, distributed search SOLR-50 On Mon, Aug 18, 2008 at 12:16 PM, Otis Gospodnetic wrote: Yes, as far as I know, what Brian said is correct. Also, as far as I know, there is nothing that gracefully handles problematic Solr instances during distributed search. Right... we punted that issue to a load balancer (which assumes that you have more than one copy of each shard). -Yonik -- Regards, Ian Connor
Re: Auto commit error and java.io.FileNotFoundException
I'm assuming that one way to do this would be to set the logging level to FINEST in the logging page in the solr admin tool, and then to make sure my logging.properties file is also set to record the FINEST logging level. Let me know if that won't enable the sort of debugging info you are talking about. (I do understand that the logging page in the admin tool makes temporary changes that will get reverted when you restart Solr.) On Mon, Aug 18, 2008 at 3:05 AM, Michael McCandless [EMAIL PROTECTED] wrote: Since it seems reproducible, could you turn on debugging output (IndexWriter.setInfoStream(...)), get the FileNotFoundException to happen again, and post the resulting output? Mike
Re: Auto commit error and java.io.FileNotFoundException
Alas, I think this won't actually turn on IndexWriter's infoStream. I think you may need to modify the SolrIndexWriter.java sources, in the init method, to add a call to setInfoStream(...). Can any Solr developers confirm this? Mike Chris Harris wrote: I'm assuming that one way to do this would be to set the logging level to FINEST in the logging page in the solr admin tool, and then to make sure my logging.properties file is also set to record the FINEST logging level. Let me know if that won't enable to sort of debugging info you are talking about. (I do understand that the logging page in the admin tool makes temporary changes that will get reverted when you restart Solr.) On Mon, Aug 18, 2008 at 3:05 AM, Michael McCandless [EMAIL PROTECTED] wrote: Since it seems reproducible, could you turn on debugging output (IndexWriter.setInfoStream(...)), get the FileNotFoundException to happen again, and post the resulting output? Mike
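If anyone wants to try the modification Mike describes, it would amount to roughly this one-line sketch (placement is approximate and not from the source; IndexWriter.setInfoStream takes a PrintStream):

```
// sketch only: inside SolrIndexWriter's init(...), once the writer is configured
setInfoStream(System.err);  // route IndexWriter's merge/flush debug output to stderr
```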
Synonyms with spaces not working
Hello folks! Sorry to ask such a basic question but synonyms might be the end of me.. I suspect that there is something fundamentally wrong with the field type I've set up.. <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> <analyzer> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.TrimFilterFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/> <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/> </analyzer> </fieldType> In synonyms.txt I have a *large* list of synonyms in the following format.. a, b, c d e, f, g => something I'm having the behavior that searches for a, b, f, and g all work, but the c d e does not. I suspected that was because things were getting split on white space before they were going to the synonym filter, so I moved the synonym filters to be before the tokenizer. Something's still wrong though... any help would be most appreciated! Thank you for your time! Matthew Runo Software Engineer, Zappos.com [EMAIL PROTECTED] - 702-943-7833
Re: Synonyms with spaces not working
Matthew, there is a good page about synonyms on the Wiki that covers the multi-word synonyms stuff. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Matthew Runo [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Monday, August 18, 2008 1:39:52 PM Subject: Synonyms with spaces not working Hello folks! Sorry to ask such a basic question but synonyms might be the end of me.. I suspect that there is something fundamentally wrong with the field type I've set up.. <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> <analyzer> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.TrimFilterFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/> <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/> </analyzer> </fieldType> In synonyms.txt I have a *large* list of synonyms in the following format.. a, b, c d e, f, g => something I'm having the behavior that searches for a, b, f, and g all work, but the c d e does not. I suspected that was because things were getting split on white space before they were going to the synonym filter, so I moved the synonym filters to be before the tokenizer. Something's still wrong though... any help would be most appreciated! Thank you for your time! Matthew Runo Software Engineer, Zappos.com [EMAIL PROTECTED] - 702-943-7833
Re: Auto commit error and java.io.FileNotFoundException
Lucene v.2.1 has a bug with autocommit...
Re: Auto commit error and java.io.FileNotFoundException
On Mon, Aug 18, 2008 at 1:12 PM, Michael McCandless [EMAIL PROTECTED] wrote: Alas, I think this won't actually turn on IndexWriter's infoStream. I think you may need to modify the SolrIndexWriter.java sources, in the init method, to add a call to setInfoStream(...). Can any Solr developers confirm this? Yeah, we don't have that feature yet. -Yonik
Re: Auto commit error and java.io.FileNotFoundException
On Mon, Aug 18, 2008 at 6:05 AM, Michael McCandless [EMAIL PROTECTED] wrote: The output from CheckIndex shows quite a few missing files! Is there any possibility that two instances of Solr were somehow sharing the same index directory? To eliminate that possibility, the lock factory should be set to simple and unlockOnStartup should be false in solrconfig.xml -Yonik
RE: Synonyms with spaces not working
Hi Matthew, On 08/18/2008 at 1:39 PM, Matthew Runo wrote: <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> <analyzer> [...] <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> [...] I can see from SOLR-702 that most of your synonym rules have a single term/phrase on the right-hand side. The SynonymFilterFactory section of the AnalyzersTokenizersTokenFilters wiki page http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-2c461ac74b4ddd82e453dc68fcfc92da77358d46 says: # If expand==true, "ipod, i-pod, i pod" is equivalent to the explicit mapping: "ipod, i-pod, i pod => ipod, i-pod, i pod" AFAICT from looking at the code, however, the expand option is ignored when there is an explicit right-hand side of a rule (i.e. => something). a, b, c d e, f, g => something So documents containing c d e (or a or b or f or g) will only be indexed with something. I'm having the behavior that searches for a, b, f, and g all work, but the c d e does not. As Otis mentioned earlier in this thread, the above-linked wiki page mentions some gotchas about mixing phrases, synonyms, and the Lucene QueryParser. Perhaps you could address the problem by creating separate rules for your phrasal terms, e.g.: a, b, f, g => something c d e, something Using the above rule with no right-hand side, and with expand==true, both c d e and something will be indexed for documents containing c d e. Steve
Re: .wsdl for example....
On Aug 18, 2008, at 11:27 AM, Norberto Meijome wrote: does anyone have a .wsdl definition for the example bundled with SOLR? WSDL? surely you jest. Erik
Re: partialResults, distributed search SOLR-502
I have been using HAProxy on different ports (same IP). It seems to work but have not tested it in production yet. On Mon, Aug 18, 2008 at 12:37 PM, Yonik Seeley [EMAIL PROTECTED] wrote: On Mon, Aug 18, 2008 at 12:34 PM, Brian Whitman [EMAIL PROTECTED] wrote: On Aug 18, 2008, at 12:31 PM, Yonik Seeley wrote: On Mon, Aug 18, 2008 at 12:16 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote: Yes, as far as I know, what Brian said is correct. Also, as far as I know, there is nothing that gracefully handles problematic Solr instances during distributed search. Right... we punted that issue to a load balancer (which assumes that you have more than one copy of each shard). Can you explain how you have a LB handling shards? Do you put a separate LB in front of each group of replica shards? A single load balancer should be fine... each shard has it's own VIP which maps to 2 or more solr servers with a replica of that shard. -Yonik -- Regards, Ian Connor
Re: Order of returned fields
Order matters in my application because I'm indexing structured data - actually, a domain object model (a bit like with Hibernate Search), only I'm adding parents to children, instead of children to parents. So say I have Cities and People, with a 1-N relationship between City and People. I'm indexing documents for Cities, and documents for People, and the documents for People contain the fields of the City they're living in. When I display the results, I'd like the People fields to display before the City fields. I can parse the Solr response and rearrange the fields (in the Java middle-tier, or with XSLT, or in the Javascript client), but then I have to know of the domain in too many places. I have to know of the domain in my Java application, in the SOLR schema file, and in the Javascript that rearranges the fields... I thought maybe I could avoid the latter and put as much application information as possible in the SOLR schema, for instance specify an order for the returned fields... Thanks anyway, Pierre Erik Hatcher wrote: Yes, this is normal behavior. Does order matter in your application? Could you explain why? Order is maintained with multiple values of the same field name, though - which is important. Erik On Aug 17, 2008, at 6:38 PM, Pierre Auslaender wrote: Hello, After a Solr query, I always get the fields back in alphabetical order, no matter how I insert them. Is this the normal behaviour? This is when adding the document... <doc> <field name="uid">ch.tsr.esg.domain.ProgramCollection[id: 1]</field> <field name="genre">collection</field> <field name="collection">Bac à sable</field> <field name="collection.url">http://localhost:8080/esg/api/collections/1</field> </doc> ... and this is when retrieving it: <doc> <str name="collection">Bac à sable</str> <str name="collection.url">http://localhost:8080/esg/api/collections/1</str> <str name="genre">collection</str> <str name="uid">ch.tsr.esg.domain.ProgramCollection[id: 1]</str> </doc> Thanks a lot, Pierre Auslaender
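Absent schema-level ordering, the client-side rearrangement Pierre mentions can stay small. A sketch (the helper and the preferred-prefix convention are hypothetical, not from the source) that reorders alphabetically returned field names by a preferred prefix list:

```java
import java.util.*;

public class FieldOrderSketch {
    // Sort field names so those matching earlier prefixes come first;
    // ties keep their alphabetical order.
    static List<String> reorder(Collection<String> fields, List<String> prefixOrder) {
        List<String> out = new ArrayList<>(fields);
        out.sort(Comparator.comparingInt((String f) -> rank(f, prefixOrder))
                           .thenComparing(Comparator.naturalOrder()));
        return out;
    }

    private static int rank(String field, List<String> prefixOrder) {
        for (int i = 0; i < prefixOrder.size(); i++) {
            if (field.startsWith(prefixOrder.get(i))) return i;
        }
        return prefixOrder.size(); // unknown fields sort last
    }

    public static void main(String[] args) {
        List<String> alphabetical = List.of("collection", "collection.url", "genre", "uid");
        // show the People-ish fields (uid, genre) before the City-ish ones (collection*)
        System.out.println(reorder(alphabetical, List.of("uid", "genre", "collection")));
        // [uid, genre, collection, collection.url]
    }
}
```

The prefix list is the one place the domain knowledge lives on the display side, which is the trade-off Pierre is objecting to.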
Re: Localisation, faceting
Would that be of any interest to the SOLR / Lucene community, given the trend to globalisation / regionalisation ? My base is Switzerland - 4 official national tongues, none of them English. If one were to localise the boolean operators, would that have to be at the Lucene level, or could that be done at the SOLR level ? Thanks, Pierre Otis Gospodnetic wrote: Hi, Regarding Boolean operator localization -- there was a person who submitted patches for the same functionality, but for Lucene's QueryParser. This was a few years ago. I think his patch was never applied. Perhaps that helps. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Pierre Auslaender [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Saturday, August 16, 2008 12:50:53 PM Subject: Localisation, faceting Hello, I have a couple of questions: 1/ Is it possible to localise query operator names without writing code? For instance, I'd like to issue queries with French operator names, e.g. ET (instead of AND), OU (instead of OR), etc. 2/ Is it possible for Solr to generate, in the XML response, the URLs or complete queries for each facet in a faceted search? Here's an example. Say my first query is: http://localhost:8080/solr/select?q=bac&facet=true&facet.field=kind&facet.limit=-1 The kind field has three values: material, immaterial, time. I get back something like this: <int name="material">1024</int> <int name="immaterial">27633</int> <int name="time">389</int> If I want to drill down into one facet, say into material, I have to manually rebuild a query like this: http://localhost:8080/solr/select?q=bac&facet=true&facet.field=kind&facet.limit=-1&fq=kind:material It's not too difficult, but surely Solr could add this URL or query string under the material element. Is this possible? Or do I have to XSLT the result myself? Thanks, Pierre Auslaender
Re: Localisation, faceting
I would do it in the client, even if it meant parsing the query, modifying it, then unparsing it. This is exactly like changing To: to Zu: in a mail header. Show that in the client, but make it standard before it goes onto the network. If queries at the Solr/Lucene level are standard, then users with different locale settings could share saved queries. wunder On 8/18/08 2:18 PM, Pierre Auslaender [EMAIL PROTECTED] wrote: Would that be of any interest to the SOLR / Lucene community, given the trend to globalisation / regionalisation ? My base is Switzerland - 4 official national tongues, none of them English. If one were to localise the boolean operators, would that have to be at the Lucene level, or could that be done at the SOLR level ? Thanks, Pierre Otis Gospodnetic wrote: Hi, Regarding Boolean operator localization -- there was a person who submitted patches for the same functionality, but for Lucene's QueryParser. This was a few years ago. I think his patch was never applied. Perhaps that helps. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Pierre Auslaender [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Saturday, August 16, 2008 12:50:53 PM Subject: Localisation, faceting Hello, I have a couple of questions: 1/ Is it possible to localise query operator names without writing code? For instance, I'd like to issue queries with French operator names, e.g. ET (instead of AND), OU (instead of OR), etc. 2/ Is it possible for Solr to generate, in the XML response, the URLs or complete queries for each facet in a faceted search? Here's an example. Say my first query is: http://localhost:8080/solr/select?q=bac&facet=true&facet.field=kind&facet.limit=-1 The kind field has three values: material, immaterial, time.
I get back something like this: <int name="material">1024</int> <int name="immaterial">27633</int> <int name="time">389</int> If I want to drill down into one facet, say into material, I have to manually rebuild a query like this: http://localhost:8080/solr/select?q=bac&facet=true&facet.field=kind&facet.limit=-1&fq=kind:material It's not too difficult, but surely Solr could add this URL or query string under the material element. Is this possible? Or do I have to XSLT the result myself? Thanks, Pierre Auslaender
Re: Order of returned fields
Hey Pierre, I don't know if my case helps you, but what I do to keep relational information is to put the related data all in the same field. Let me give you an example: I have a product index. Each product has a list of manufacturer properties, like dimensions, color, connections supported (usb, bluetooth and so on), etc etc etc. Each property belongs to a context, so I index data following this model: propertyId ^ propertyLabel ^ propertyType ^ propertyValue Then I parse each result returned on my application. Does that help you? 2008/8/18 Pierre Auslaender [EMAIL PROTECTED] Order matters in my application because I'm indexing structured data - actually, a domain object model (a bit like with Hibernate Search), only I'm adding parents to children, instead of children to parents. So say I have Cities and People, with a 1-N relationship between City and People. I'm indexing documents for Cities, and documents for People, and the documents for People contain the fields of the City they're living in. When I display the results, I'd like the People fields to display before the City fields. I can parse the Solr response and rearrange the fields (in the Java middle-tier, or with XSLT, or in the Javascript client), but then I have to know of the domain in too many places. I have to know of the domain in my Java application, in the SOLR schema file, and in the Javascript that rearranges the fields... I thought maybe I could avoid the latter and put as much application information as possible in the SOLR schema, for instance specifiy an order for the returned fields... Thanks anyway, Pierre Erik Hatcher a écrit : Yes, this is normal behavior. Does order matter in your application? Could you explain why? Order is maintained with multiple values of the same field name, though - which is important. Erik On Aug 17, 2008, at 6:38 PM, Pierre Auslaender wrote: Hello, After a Solr query, I always get the fields back in alphabetical order, no matter how I insert them. 
Is this the normal behaviour? This is when adding the document... <doc> <field name="uid">ch.tsr.esg.domain.ProgramCollection[id: 1]</field> <field name="genre">collection</field> <field name="collection">Bac à sable</field> <field name="collection.url">http://localhost:8080/esg/api/collections/1</field> </doc> ... and this is when retrieving it: <doc> <str name="collection">Bac à sable</str> <str name="collection.url">http://localhost:8080/esg/api/collections/1</str> <str name="genre">collection</str> <str name="uid">ch.tsr.esg.domain.ProgramCollection[id: 1]</str> </doc> Thanks a lot, Pierre Auslaender -- Alexander Ramos Jardim
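A sketch of the pack/parse round trip for the delimited property field Alexander describes (the " ^ " separator and the four-part layout are his convention; the helper names are hypothetical):

```java
import java.util.Arrays;

public class PropertyFieldSketch {
    // Pack one property into a single indexed field value:
    // propertyId ^ propertyLabel ^ propertyType ^ propertyValue
    static String pack(String id, String label, String type, String value) {
        return String.join(" ^ ", id, label, type, value);
    }

    // Parse it back on the application side, tolerating whitespace
    // around the separator.
    static String[] parse(String fieldValue) {
        return fieldValue.split("\\s*\\^\\s*");
    }

    public static void main(String[] args) {
        String packed = pack("42", "color", "string", "blue");
        System.out.println(packed);                         // 42 ^ color ^ string ^ blue
        System.out.println(Arrays.toString(parse(packed))); // [42, color, string, blue]
    }
}
```

The obvious caveat: the separator must never occur inside a value, so some escaping (or a rarer delimiter) may be needed in practice.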
Re: Localisation, faceting
Excellent point about the saved queries. Thanks! So I could sniff the locale (from the HTML page or the Java application,...) and infer the query language, or try to do automatic guessing of the language based on the operator names (if they don't collide with indexed terms). This brings up another question: which query parser should I use? I guess it would be a bad idea to invent one, it would be better to reuse or adapt the query parser used by SOLR - or is it Lucene? Can you point me to the parser? Thanks, Pierre Walter Underwood wrote: I would do it in the client, even if it meant parsing the query, modifying it, then unparsing it. This is exactly like changing To: to Zu: in a mail header. Show that in the client, but make it standard before it goes onto the network. If queries at the Solr/Lucene level are standard, then users with different locale settings could share saved queries. wunder On 8/18/08 2:18 PM, Pierre Auslaender [EMAIL PROTECTED] wrote: Would that be of any interest to the SOLR / Lucene community, given the trend to globalisation / regionalisation ? My base is Switzerland - 4 official national tongues, none of them English. If one were to localise the boolean operators, would that have to be at the Lucene level, or could that be done at the SOLR level ? Thanks, Pierre Otis Gospodnetic wrote: Hi, Regarding Boolean operator localization -- there was a person who submitted patches for the same functionality, but for Lucene's QueryParser. This was a few years ago. I think his patch was never applied. Perhaps that helps. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Pierre Auslaender [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Saturday, August 16, 2008 12:50:53 PM Subject: Localisation, faceting Hello, I have a couple of questions: 1/ Is it possible to localise query operator names without writing code? For instance, I'd like to issue queries with French operator names, e.g.
ET (instead of AND), OU (instead of OR), etc. 2/ Is it possible for Solr to generate, in the XML response, the URLs or complete queries for each facet in a faceted search? Here's an example. Say my first query is: http://localhost:8080/solr/select?q=bac&facet=true&facet.field=kind&facet.limit=-1 The kind field has three values: material, immaterial, time. I get back something like this: <int name="material">1024</int> <int name="immaterial">27633</int> <int name="time">389</int> If I want to drill down into one facet, say into material, I have to manually rebuild a query like this: http://localhost:8080/solr/select?q=bac&facet=true&facet.field=kind&facet.limit=-1&fq=kind:material It's not too difficult, but surely Solr could add this URL or query string under the material element. Is this possible? Or do I have to XSLT the result myself? Thanks, Pierre Auslaender
Re: .wsdl for example....
Do you want a full web service for the SOLR example? How would a .wsdl help you? Why don't you use the HTTP interface SOLR provides? Anyway, if you need to develop a web service (SOAP compliant) to access SOLR, just remember to use an embedded core in your webservice. 2008/8/18 Norberto Meijome [EMAIL PROTECTED] hi :) does anyone have a .wsdl definition for the example bundled with SOLR? if nobody has it, would it be useful to have one ? cheers, B _ {Beto|Norberto|Numard} Meijome Intelligence: Finding an error in a Knuth text. Stupidity: Cashing that $2.56 check you got. I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned. -- Alexander Ramos Jardim
Re: Restrict Wildcards
I will try this tomorrow. Thanks for the suggestion. - Erlend On Mon, Aug 18, 2008 at 5:01 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote: Erlend, This doesn't work with string? Maybe something there is removing numbers. Have you tried with an example without numbers? e.g. fooaaa and foobbb. Does foo* match them both? If it does, then perhaps you can create a custom field type and use KeywordTokenizer in it. Example schema.xml has some of this stuff. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Erlend Hamnaberg [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Monday, August 18, 2008 7:42:22 AM Subject: Restrict Wildcards Hi list. Is it possible to create a field type in solr that does not match with wildcard queries? I want it to only match the complete string, so if I have indexed foo123 and foo234 i dont want foo* to match any of these. This does not work with just using the predefined string type. Any suggestions? Warm regards Erlend Hamnaberg
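For reference, the kind of custom field type Otis suggests would look roughly like this in schema.xml (the type name is made up; the example schema that ships with Solr contains a similar definition):

```
<fieldType name="string_keyword" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

KeywordTokenizer emits the whole input as a single token, so foo123 is indexed intact rather than being split into pieces.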
Solr won't start under jetty on RHEL5.2
I just migrated my solr instance to a new server, running RHEL5.2. I installed java from yum but I suspect it's different from the one I used to use. Anyway, my Solr no longer works. 2008-08-18 18:01:12.079::INFO: Logging to STDERR via org.mortbay.log.StdErrLog 2008-08-18 18:01:12.229::INFO: jetty-6.1.3 2008-08-18 18:01:12.330::INFO: Extract jar:file:/home/apps/solr/solr-1.2.0/webapps/solr.war!/ to /tmp/Jetty_0_0_0_0_8983_solr.war__solr__k1kf17/webapp 2008-08-18 18:01:12.452::INFO: NO JSP Support for /solr, did not find org.apache.jasper.servlet.JspServlet 18-Aug-08 6:01:12 PM org.apache.solr.servlet.SolrDispatchFilter init INFO: SolrDispatchFilter.init() 18-Aug-08 6:01:12 PM org.apache.solr.core.Config getInstanceDir INFO: JNDI not configured for Solr (NoInitialContextEx) 18-Aug-08 6:01:12 PM org.apache.solr.core.Config getInstanceDir INFO: Solr home defaulted to 'null' (could not find system property or JNDI) 18-Aug-08 6:01:12 PM org.apache.solr.core.Config setInstanceDir INFO: Solr home set to 'solr/' 18-Aug-08 6:01:12 PM org.apache.solr.core.SolrConfig initConfig INFO: Loaded SolrConfig: solrconfig.xml 18-Aug-08 6:01:12 PM org.apache.solr.servlet.SolrDispatchFilter init INFO: user.dir=/home/apps/solr/solr-1.2.0 2008-08-18 18:01:12.663::WARN: failed SolrRequestFilter java.lang.NoClassDefFoundError: org.apache.solr.core.SolrCore at java.lang.Class.initializeClass(libgcj.so.7rh) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:75) at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594) at org.mortbay.jetty.servlet.Context.startContext(Context.java:139) at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218) at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500) at 
org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147) at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117) at org.mortbay.jetty.Server.doStart(Server.java:210) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929) at java.lang.reflect.Method.invoke(libgcj.so.7rh) at org.mortbay.start.Main.invokeMain(Main.java:183) at org.mortbay.start.Main.start(Main.java:497) at org.mortbay.start.Main.main(Main.java:115) All attempts to load solr pages result in 404 not found errors. I suspect this is a Jetty configuration problem but I know nothing about jetty or servlet containers or anything like that. Could someone explain in words of one syllable or less how to get it to find the installation please? Thanks -jsd-
Re: Solr won't start under jetty on RHEL5.2
Jon Drukman wrote:
> I just migrated my solr instance to a new server, running RHEL5.2. I installed java from yum but I suspect it's different from the one I used to use.

Turns out my instincts were correct. The version from yum does not work. I installed the official Sun JDK and now it starts fine.

bad:
java version 1.4.2
gij (GNU libgcj) version 4.1.2 20071124 (Red Hat 4.1.2-42)

good:
java version 1.6.0_07
Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
Java HotSpot(TM) 64-Bit Server VM (build 10.0-b23, mixed mode)

-jsd-
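[Editor's note: the failure above can be caught before Jetty even starts, since the `java -version` banner reveals a GNU libgcj (gij) JVM. A minimal sketch of such a pre-flight check; the `classify_jvm` function name is invented for illustration, and the banner strings are taken from the message above.]

```shell
# classify_jvm: print "unsupported" for GNU libgcj / gij banners
# (which cannot run Solr), "supported" otherwise.
classify_jvm() {
  case "$1" in
    *gij*|*libgcj*) echo "unsupported" ;;  # GNU interpreter for Java
    *)              echo "supported"  ;;
  esac
}

# In a startup script you would feed it the live banner, e.g.:
#   classify_jvm "$(java -version 2>&1)"
classify_jvm "gij (GNU libgcj) version 4.1.2 20071124 (Red Hat 4.1.2-42)"
classify_jvm "Java HotSpot(TM) 64-Bit Server VM (build 10.0-b23, mixed mode)"
```

Prints "unsupported" then "supported" for the two sample banners.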
Re: hello, a question about solr.
On Mon, 18 Aug 2008 23:07:19 +0800, finy finy [EMAIL PROTECTED] wrote:
> because i use chinese character, for example ibm___ solr will parse it into a term ibm and a phrase _ __ can i use solr to query with a term ibm and a term _ and a term __?

Hi finy,

You should look into n-gram tokenizers. I'm not sure whether they are documented in the wiki, but they have been discussed on the mailing list quite a few times. In short, an n-gram tokenizer breaks your input into blocks of n characters, which are then used for comparison against the index. For Chinese, I think bi-grams (n = 2) are the favoured approach.

Good luck,
B

_ {Beto|Norberto|Numard} Meijome
"I used to hate weddings; all the Grandmas would poke me and say, 'You're next, sonny!' They stopped doing that when I started to do it to them at funerals."
I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
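[Editor's note: for reference, a bi-gram analysis can be configured in schema.xml. This is a hedged sketch, not taken from the thread — the fieldType name is made up, and it assumes solr.NGramTokenizerFactory (with minGramSize/maxGramSize parameters) is available in your Solr build.]

```xml
<!-- Hypothetical fieldType: index CJK text as overlapping character bi-grams.
     Apply the same analyzer at index and query time so terms line up. -->
<fieldType name="text_cjk_bigram" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="2"/>
  </analyzer>
</fieldType>
```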
Re: hello, a question about solr.
Thanks for your help. Could you give me your Gmail Talk address or MSN?

2008/8/19, Norberto Meijome [EMAIL PROTECTED]:
> On Mon, 18 Aug 2008 23:07:19 +0800 finy finy [EMAIL PROTECTED] wrote:
> > because i use chinese character, for example ibm___ solr will parse it into a term ibm and a phraze _ __ can i use solr to query with a term ibm and a term _ and a term __?
>
> Hi finy, you should look into n-gram tokenizers. [...] In short, an n-gram tokenizer breaks your input into blocks of characters of size n, which are then used to compare in the index. I think for Chinese, bi-gram is the favoured approach.
>
> good luck, B
Re: .wsdl for example....
On Mon, 18 Aug 2008 19:08:24 -0300, Alexander Ramos Jardim [EMAIL PROTECTED] wrote:
> Do you want a full web service example for SOLR? How would a .wsdl help you? Why don't you use the HTTP interface SOLR provides? Anyway, if you need to develop a (SOAP-compliant) web service to access SOLR, just remember to use an embedded core in your web service.

On Mon, 18 Aug 2008 15:37:24 -0400, Erik Hatcher [EMAIL PROTECTED] wrote:
> WSDL? surely you jest. Erik

:D I obviously said something terribly stupid; oh well, not the first time and most likely won't be the last either. Anyway, the reason for my asking is:

- I've put together a SOLR search service with a few cores. Nothing fancy; it works great as is.
- The .NET developer I am working with on this asked for a .wsdl (or .asmx) file to import into Visual Studio. Yes, he can access the service directly, but he seems to prefer a more 'well defined' interface (I haven't really decided whether it is worth the effort, but that is another question altogether).

The way I see it, SOLR is a RESTful service. I am not looking into wrapping the whole thing behind SOAP (I actually much prefer REST to SOAP, but that is entering quasi-religious grounds...), which should be able to be defined with a .wsdl (v1.1 should suffice, as only GET + POST are supported in SOLR anyway).

Am I missing anything here? Thanks in advance for your time + thoughts,
B

_ {Beto|Norberto|Numard} Meijome
"He has no enemies, but is intensely disliked by his friends." - Oscar Wilde
Re: .wsdl for example....
On Tue, 19 Aug 2008 11:23:48 +1000, Norberto Meijome [EMAIL PROTECTED] wrote:
> [...]
> The way I see it, SOLR is a RESTful service. I am not looking into wrapping the whole thing behind SOAP [...] which should be able to be defined with a .wsdl (v 1.1 should suffice as only GET + POST are supported in SOLR anyway). Am I missing anything here?

To be clear, I don't suggest we should have a .wsdl for the example — I'm simply asking whether there would be any use in having one. But given the responses I got, I'm now curious to understand what I have gotten wrong :)

Best,
B

_ {Beto|Norberto|Numard} Meijome
"I sense much NT in you. NT leads to Bluescreen. Bluescreen leads to downtime. Downtime leads to suffering. NT is the path to the darkside. Powerful Unix is."
Re: Clarification on facets
On Tue, 19 Aug 2008 10:18:12 +1200, Gene Campbell [EMAIL PROTECTED] wrote:
> Is this interpreted as meaning there are 10 documents that will match with 'car' in the title, and likewise 6 'boat' and 2 'bike'?

Correct.

> If so, is there any way to get counts for the *number of times* a value is found in a document? I'm looking for a way to determine the number of times 'car' is repeated in the title, for example.

Not sure - I would suggest that a field with a term repeated several times would receive a higher score when searching for that term, but I'm not sure how you could get the information you seek... maybe with the Luke handler? (but on a per-document basis... slow?)

B

_ {Beto|Norberto|Numard} Meijome
"Computers are like air conditioners; they can't do their job properly if you open windows."
Re: .wsdl for example....
check SolrSharp
http://wiki.apache.org/solr/SolrSharp

On Aug 18, 2008, at 9:23 PM, Norberto Meijome wrote:
> [...]
> The .NET developer I am working with on this asked for a .wsdl (or .asmx) file to import into Visual Studio [...] The way I see it, SOLR is a RESTful service. I am not looking into wrapping the whole thing behind SOAP [...] Am I missing anything here?
RE: .wsdl for example....
Various Java web service libraries come with 'wsdl2java' and 'java2wsdl' programs. You just run 'java2wsdl' on the Java SOAP description.

-----Original Message-----
From: Ryan McKinley [mailto:[EMAIL PROTECTED]
Sent: Monday, August 18, 2008 6:53 PM
To: solr-user@lucene.apache.org
Subject: Re: .wsdl for example

check SolrSharp
http://wiki.apache.org/solr/SolrSharp

On Aug 18, 2008, at 9:23 PM, Norberto Meijome wrote:
> [...]
> The .NET developer I am working with on this asked for a .wsdl (or .asmx) file to import into Visual Studio [...] Am I missing anything here?
Deadlock in lucene?
Hello folks! I was just wondering if anyone else has seen this issue under heavy load. We had some servers set to very high thread limits (12-core servers with 32 GB of RAM), and found several threads would end up in this state:

Name: http-8080-891
State: BLOCKED on [EMAIL PROTECTED] owned by: http-8080-191
Total blocked: 97,926  Total waited: 16

Stack trace:
org.apache.lucene.index.SegmentReader.isDeleted(SegmentReader.java:674)
org.apache.solr.search.function.FunctionQuery$AllScorer.next(FunctionQuery.java:116)
org.apache.lucene.util.ScorerDocQueue.topNextAndAdjustElsePop(ScorerDocQueue.java:116)
org.apache.lucene.search.DisjunctionSumScorer.advanceAfterCurrent(DisjunctionSumScorer.java:175)
org.apache.lucene.search.DisjunctionSumScorer.skipTo(DisjunctionSumScorer.java:228)
org.apache.lucene.search.ReqOptSumScorer.score(ReqOptSumScorer.java:76)
org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:357)
org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:320)
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:137)
org.apache.lucene.search.Searcher.search(Searcher.java:126)
org.apache.lucene.search.Searcher.search(Searcher.java:105)
org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1148)
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:834)
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:269)
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:160)
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:169)
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:128)
org.apache.solr.core.SolrCore.execute(SolrCore.java:1143)
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:272)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
java.lang.Thread.run(Thread.java:619)

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
[EMAIL PROTECTED] - 702-943-7833
Re: Deadlock in lucene?
It's not a deadlock (just a synchronization bottleneck), but it is a known issue in Lucene, and there has been some progress in improving the situation.

-Yonik

On Mon, Aug 18, 2008 at 10:55 PM, Matthew Runo [EMAIL PROTECTED] wrote:
> Hello folks! I was just wondering if anyone else has seen this issue under heavy load. We had some servers set to very high thread limits (12 core servers with 32 gigs of ram), and found several threads would end up in this state:
>
> Name: http-8080-891
> State: BLOCKED on [EMAIL PROTECTED] owned by: http-8080-191
> Total blocked: 97,926  Total waited: 16
>
> Stack trace:
> org.apache.lucene.index.SegmentReader.isDeleted(SegmentReader.java:674)
> org.apache.solr.search.function.FunctionQuery$AllScorer.next(FunctionQuery.java:116)
> [...]
Re: Clarification on facets
Thank you for the response. Always nice to have someone willing to validate your thinking! Of course, if anyone has any ideas on how to get the number of times a term is repeated in a document, I'm all ears.

cheers
gene

On Tue, Aug 19, 2008 at 1:42 PM, Norberto Meijome [EMAIL PROTECTED] wrote:
> Not sure - I would suggest that a field with a term repeated several times would receive a higher score when searching for that term, but not sure how you could get the information you seek... maybe with the Luke handler? (but on a per-document basis... slow?)
> [...]
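[Editor's note: one avenue the thread doesn't spell out — Lucene can record per-document term frequencies if the field is indexed with term vectors, which Solr exposes via the termVectors attribute in schema.xml. A hedged sketch, not from the thread; the field name is illustrative, and reading the frequencies back would take custom code against Lucene's IndexReader.getTermFreqVector(...), e.g. from an embedded core or a custom request handler.]

```xml
<!-- Hypothetical schema.xml entry: store term vectors for the title field
     so per-document term frequencies can be retrieved after indexing. -->
<field name="title" type="text" indexed="true" stored="true" termVectors="true"/>
```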