Re: Synonyms Phrase not working
Hi,
your search for /?q=produto_nome:lubrificante intimo is a phrase search and will be handled differently. Your other search gets the synonyms, but the last synonym is a multi-word synonym and not a phrase:

... produto_nome:lubrificante intimo) ))

See also:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

Regards
Bernd

On 01.10.2012 19:02, Gustav wrote:

Hello Everyone,

I'm having a problem using the SynonymFilterFactory in a query analyzer. This is my synonyms.txt file:

sexo => Preservativo, vaselina, viagra, lubrificante intimo

And this is the field type in which it is implemented:

<fieldType class="solr.TextField" name="produto_nome_synonyms" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" enablePositionIncrements="true" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" maxGramSize="25" minGramSize="1"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt" tokenizerFactory="KeywordTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" enablePositionIncrements="true" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>

The problem here is: when I search for /?q=produto_nome:lubrificante intimo, Solr returns 8 documents that match because of the n-gram filter factory, but when I search for /?q=produto_nome:sexo, Solr returns no results. I was expecting the same result as /?q=lubrificante intimo, as configured in the synonyms.

I turned on debugQuery=true and got the following parsedquery:

<str name="parsedquery">+DisjunctionMaxQuery(((produto_nome:preservativo produto_nome:vaselina produto_nome:viagra produto_nome:lubrificante intimo) ))</str>

I don't understand why it brings no results. Any ideas?

--
View this message in context: http://lucene.472066.n3.nabble.com/Synonyms-Phrase-not-working-tp4011237.html
Sent from the Solr - User mailing list archive at Nabble.com.

--
Bernd Fehling                        Universitätsbibliothek Bielefeld
Dipl.-Inform. (FH)                   LibTec - Bibliothekstechnologie
Universitätsstr. 25                  und Wissensmanagement
33615 Bielefeld
Tel. +49 521 106-4060
bernd.fehling(at)uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
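One workaround sketch (not from this thread, and not tested here): since the query-side analyzer cannot emit the multi-word synonym as a phrase, either map everything onto single-token forms in synonyms.txt, or quote the phrase in the query so the terms stay together:

```
# synonyms.txt sketch (hypothetical): collapse the multi-word synonym onto
# single tokens so no phrase is needed at query time
sexo, lubrificanteintimo => preservativo, vaselina, viagra

# alternatively, quote the phrase in the query so Solr treats it as one unit:
# /?q=produto_nome:"lubrificante intimo"
```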
Re: Understanding fieldCache SUBREADER insanity
Hi Yonik,

I've been attempting to fix the SUBREADER insanity in our custom component, and have perhaps made some progress (or is this worse?): I've gone from SUBREADER to VALUEMISMATCH insanity:

---snip---
entries_count : 12
entry#0 : 'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='f_normalizedTotalHotttnesss',class org.apache.lucene.search.FieldCacheImpl$DocsWithFieldCache,null=org.apache.lucene.util.FixedBitSet#1387502754
entry#1 : 'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='i_track_count',class org.apache.lucene.search.FieldCacheImpl$DocsWithFieldCache,null=org.apache.lucene.util.Bits$MatchAllBits#233863705
entry#2 : 'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='s_artistID',class org.apache.lucene.search.FieldCache$StringIndex,null=org.apache.lucene.search.FieldCache$StringIndex#652215925
entry#3 : 'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='s_artistID',class java.lang.String,null=[Ljava.lang.String;#1036517187
entry#4 : 'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='thingID',class java.lang.String,null=[Ljava.lang.String;#357017445
entry#5 : 'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='f_normalizedTotalHotttnesss',float,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_FLOAT_PARSER=[F#322888397
entry#6 : 'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='f_normalizedTotalHotttnesss',float,org.apache.lucene.search.FieldCache.DEFAULT_FLOAT_PARSER=org.apache.lucene.search.FieldCache$CreationPlaceholder#1229311421
entry#7 : 'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='f_normalizedTotalHotttnesss',float,null=[F#322888397
entry#8 : 'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='i_collapse',int,org.apache.lucene.search.FieldCache.DEFAULT_INT_PARSER=org.apache.lucene.search.FieldCache$CreationPlaceholder#92920526
entry#9 : 'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='i_collapse',int,null=[I#494669113
entry#10 : 'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='i_collapse',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=[I#494669113
entry#11 : 'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='i_track_count',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=[I#994584654
insanity_count : 1
insanity#0 : VALUEMISMATCH: Multiple distinct value objects for MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)+s_artistID
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='s_artistID',class org.apache.lucene.search.FieldCache$StringIndex,null=org.apache.lucene.search.FieldCache$StringIndex#652215925
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='s_artistID',class java.lang.String,null=[Ljava.lang.String;#1036517187
---snip---

Any suggestions on what the cause of this VALUEMISMATCH is, whether it is the normal case, or how to fix it?

For anybody else with SUBREADER insanity issues, this is the change I made to get this far (get the first leaf reader, since we are using a merged/optimized index):

---snip---
SolrIndexReader reader = searcher.getReader().getLeafReaders()[0];
collapseIDs = FieldCache.DEFAULT.getInts(reader, COLLAPSE_KEY_NAME);
hotnessValues = FieldCache.DEFAULT.getFloats(reader, HOTNESS_KEY_NAME);
artistIDs = FieldCache.DEFAULT.getStrings(reader, ARTIST_KEY_NAME);
---snip---

Thanks,
Aaron

On Wed, Sep 19, 2012 at 4:54 PM, Yonik Seeley yo...@lucidworks.com wrote:

already-optimized, single-segment index

That part is interesting... if true, then the type of insanity you saw should be impossible, and either the insanity detection or something else is broken.

-Yonik
http://lucidworks.com
Re: multivalued field question (FieldCache error)
I'm also using that field for a facet:

<requestHandler name="mytype" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">1</float>
    <str name="qf">many fields, but not store_slug</str>
    <str name="pf">many fields, but not store_slug</str>
    <str name="fl">..., store_slug</str>
    <str name="mm"><![CDATA[1<100% 5<80%]]></str>
    <int name="qs">2</int>
    <int name="ps">2</int>
    <str name="q.alt">*:*</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.collate">true</str>
    <!-- Facet -->
    <str name="facet">true</str>
    <str name="facet.mincount">1</str>
    <str name="facet.pivot.mincount">0</str>
    <str name="facet.sort">count</str>
    ...
    <str name="facet.field">store_slug</str>
    ...
    <str name="hl">false</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

On 01/10/12 18:34, Erik Hatcher wrote:

How is your request handler defined? Using store_slug for anything but fl?

Erik

On Oct 1, 2012, at 10:51, giovanni.bricc...@banzai.it wrote:

Hello,

I would like to put a multivalued field into a qt definition as an output field. To do this I edited the current solrconfig.xml definition and added the field to the fl specification. Unexpectedly, when I run the query q=*:*&qt=mytype I get the error:

<str name="msg">can not use FieldCache on multivalued field: store_slug</str>

But if I instead run the query

http://src-eprice-dev:8080/solr/0/select/?q=*:*&qt=mytype&fl=otherfield,mymultivaluedfiled

I don't get the error. Have you got any suggestions?

I'm using Solr 4 beta (solr-spec 4.0.0.2012.08.06.22.50.47, lucene-impl 4.0.0-BETA 1370099).

Giovanni

--
Giovanni Bricconi
Banzai Consulting
cell. 348 7283865
ufficio 02 00643839
via Gian Battista Vico 42
20132 Milano (MI)
Re: HttpSolrServer and external load balancer
Cheers, saved the day

Lee C

On 28 September 2012 23:27, Chris Hostetter hossman_luc...@fucit.org wrote:

: The issue we face is the f5 balancer is returning a cookie which the client
: is hanging onto, resulting in the same slave being hit for all requests.
...
: My question is can I configure the solr server to ignore client state? We
: are on solr 3.4

I'm not an expert on HTTP session affinity as implemented by various load balancers, but I can say with a high degree of confidence:

1) SolrJ doesn't care about cookies.
2) If any part of the codepath you are using cares about cookies sent back from your load balancer, it would be the HttpClient objects used by CommonsHttpSolrServer.
3) You have total control over the HttpClient objects used by CommonsHttpSolrServer via an optional constructor arg.
4) https://hc.apache.org/httpcomponents-client-ga/tutorial/html/statemgmt.html#d5e799

-Hoss
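Putting Hoss's points 2 and 3 together, a sketch of what that might look like for Solr 3.x SolrJ with commons-httpclient 3.x (not runnable standalone since it needs those jars on the classpath; the URL is a placeholder): configure the HttpClient yourself so it ignores Set-Cookie headers, then hand it to CommonsHttpSolrServer.

```java
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.cookie.CookiePolicy;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class NoCookiesSolrClient {
    // Sketch only: the HttpClient never stores the load balancer's affinity
    // cookie, so requests keep being spread across the slaves.
    public static CommonsHttpSolrServer create(String url) throws Exception {
        HttpClient client = new HttpClient();
        client.getParams().setCookiePolicy(CookiePolicy.IGNORE_COOKIES);
        return new CommonsHttpSolrServer(url, client);
    }
}
```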
RE: Problem with spellchecker
The problem is your stray double quote:

<str name="queryAnalyzerFieldType">text_general_fr"</str>

I'd think this would throw an exception somewhere.

-----Original message-----
From: Jose Aguilar jagui...@searchtechnologies.com
Sent: Tue 02-Oct-2012 01:40
To: solr-user@lucene.apache.org
Subject: Problem with spellchecker

We have configured two spellcheckers, English and French, in Solr 4 BETA. Each spellchecker works with a specific search handler. The English spellchecker is working as expected with any word, regardless of case. On the other hand, the French spellchecker only works with lowercase words: if the first letter is uppercase, the spellchecker does not return any suggestion unless we add the spellcheck.q parameter with that term.

To further clarify, this doesn't return any corrections:

http://localhost:8984/solr/collection1/handler?wt=xml&q=Systme

But this one works as expected:

http://localhost:8984/solr/collection1/handler?wt=xml&q=Systme&spellcheck.q=Systme

According to this page (http://wiki.apache.org/solr/SpellCheckComponent#q_OR_spellcheck.q), the spellcheck.q parameter shouldn't be required: "If spellcheck.q is defined, then it is used, otherwise the original input query is used."

Are we missing something? We double-checked the configuration settings for English, which is working fine, and it seems well configured. Here is an extract of the spellcheck component configuration for the French language:

<searchComponent name="spellcheckfr" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_general_fr"</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">SpellingFr</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">4</int>
    <float name="maxQueryFrequency">0.01</float>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

Thanks for any help
Re: Deploying and securing Solr war in JBoss AS
Hi Billy,

See http://wiki.apache.org/solr/SolrSecurity

One approach is to keep the master internal and expose read-only slaves, with just the select handlers defined in the Solr config, for public-facing requests. See your app container's security docs for other approaches.

On 1 October 2012 16:32, Billy Newman newman...@gmail.com wrote:

I am struggling with how to protect the Solr URLs (esp. the admin page(s)) when I deploy Solr to JBoss. I know that I can extract the web.xml from the war and mess with that, but I was wondering if there is a way to deploy the war as-is and modify some JBoss config file to protect that war's URL(s).

Has anyone deployed this to JBoss with success and has any experience with how to secure it? I looked here for JBoss config with no luck: http://wiki.apache.org/solr/SolrJBoss

Thanks in advance,
Billy
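For the container-level approach, the standard Servlet-spec constraint works in JBoss as well. A minimal web.xml sketch (the role name, realm name, and URL pattern here are made up for illustration; map the role to users in the container's own realm configuration):

```xml
<!-- web.xml fragment: restrict the Solr admin UI to an "admin" role -->
<security-constraint>
  <web-resource-collection>
    <web-resource-name>Solr admin</web-resource-name>
    <url-pattern>/admin/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>admin</role-name>
  </auth-constraint>
</security-constraint>
<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>SolrRealm</realm-name>
</login-config>
<security-role>
  <role-name>admin</role-name>
</security-role>
```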
Re: Synonyms Phrase not working
Gustav,

AFAIK, multi-word synonyms are one of the weak points of Lucene/Solr. I'm going to propose a solution approach at the forthcoming Eurocon: http://www.apachecon.eu/schedule/presentation/18/ . You are welcome!

--
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
Can I rely on correct handling of interrupted status of threads?
Hi,

I'm using Solr 3.6.1 embedded directly in an application, i.e. via EmbeddedSolrServer rather than over an HTTP connection, which works perfectly. Our application uses Thread.interrupt() for canceling long-running tasks (e.g. through Future.cancel).

A while (and a few Solr versions) back, a colleague of mine implemented a workaround because he said that Solr didn't handle the thread's interrupted status correctly: it neither set the interrupted status after catching an InterruptedException nor rethrew the exception, thus losing the information that an interrupt had been requested, which breaks libraries relying on that. However, I did not find anything up-to-date on this in the mailing list or forum archives on the web.

Is that still, or was it ever, the case? What does one have to watch out for when interrupting a thread that is doing anything within Solr/Lucene?

Any advice would be appreciated.

Regards,
Robert
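For reference, the pattern Robert's colleague expected is the standard one for any Java library (a generic sketch, not Solr code): code that catches InterruptedException should restore the flag so callers further up the stack, such as Future.cancel machinery, can still observe the interrupt.

```java
public class InterruptStatusDemo {

    // Returns true if the worker's interrupted status survives the catch block.
    static boolean preservesInterrupt() {
        final boolean[] preserved = new boolean[1];
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(10_000); // stands in for a long-running library call
            } catch (InterruptedException e) {
                // The crucial line: restore the flag instead of swallowing it.
                Thread.currentThread().interrupt();
            }
            preserved[0] = Thread.currentThread().isInterrupted();
        });
        worker.start();
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return preserved[0];
    }

    public static void main(String[] args) {
        System.out.println("interrupted status preserved: " + preservesInterrupt());
    }
}
```

Omitting the `Thread.currentThread().interrupt()` call in the catch block would make `isInterrupted()` return false afterwards, which is exactly the information loss described above.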
Re: move solr.war to Glassfish and got error running http://host:port/ProjectName/browse
Hello list,

On Sun, Sep 30, 2012 at 6:43 PM, Iwan Hanjoyo ihanj...@gmail.com wrote:

Hello all,

I am using the older Solr 3.6.1 version. I created a new web project (called SolrRedo) in NetBeans 7.1.1 running on the Glassfish web server. Then I moved the sources from the solr.war sample code (which resided inside apache-solr-3.6.1.zip) into the SolrRedo NetBeans 7.1.1 project. I also did some configuration (e.g. put the solr.home folder in a proper place), then deployed and ran the project. I can successfully run it in the browser (including http://localhost:8080/SolrRedo/admin/). However, I get error HTTP Status 500 when trying to browse http://localhost:8080/SolrRedo/browse/

How should I fix this problem?

Kind regards,
Hanjoyo

Here are the details:

HTTP Status 500 - lazy loading error

org.apache.solr.common.SolrException: lazy loading error at org.apache.solr.core.SolrCore$LazyQueryResponseWriterWrapper.getWrappedWriter(SolrCore.java:1763) at org.apache.solr.core.SolrCore$LazyQueryResponseWriterWrapper.getContentType(SolrCore.java:1778) at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:256) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:217) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:279) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:655) at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:595) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:161) at org.apache.catalina.connector.CoyoteAdapter.doService(CoyoteAdapter.java:331) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:231) at 
com.sun.enterprise.v3.services.impl.ContainerMapper$AdapterCallable.call(ContainerMapper.java:317) at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:195) at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:849) at com.sun.grizzly.http.ProcessorTask.doProcess (ProcessorTask.java:746) at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1045) at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:228) at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90) at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79) at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54) at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59) at com.sun.grizzly.ContextTask.run (ContextTask.java:71) at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532) at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513) at java.lang.Thread.run(Thread.java:722) Caused by: org.apache.solr.common.SolrException: Error loading class 'solr.VelocityResponseWriter' at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:394) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:419) at org.apache.solr.core.SolrCore.createQueryResponseWriter(SolrCore.java:487) at org.apache.solr.core.SolrCore.access$300 (SolrCore.java:72) at org.apache.solr.core.SolrCore$LazyQueryResponseWriterWrapper.getWrappedWriter(SolrCore.java:1758) ... 
28 more Caused by: java.lang.ClassNotFoundException: solr.VelocityResponseWriter at java.net.URLClassLoader$1.run (URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:423) at java.net.FactoryURLClassLoader.loadClass (URLClassLoader.java:789) at java.lang.ClassLoader.loadClass(ClassLoader.java:356) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at org.apache.solr.core.SolrResourceLoader.findClass (SolrResourceLoader.java:378) ... 32 more
Re: move solr.war to Glassfish and got error running http://host:port/ProjectName/browse
Hello list,

I finally solved the problem. I was missing the configuration of the Solr jar file in the solrconfig.xml file.

Thank you.

Kind regards,
Hanjoyo

On Tue, Oct 2, 2012 at 5:57 PM, Iwan Hanjoyo ihanj...@gmail.com wrote:

Hello list,

On Sun, Sep 30, 2012 at 6:43 PM, Iwan Hanjoyo ihanj...@gmail.com wrote:

Hello all,

I am using the older Solr 3.6.1 version. I created a new web project (called SolrRedo) in NetBeans 7.1.1 running on the Glassfish web server and moved the sources from the solr.war sample code into it. I can successfully run it in the browser (including http://localhost:8080/SolrRedo/admin/). However, I get error HTTP Status 500 ("lazy loading error", caused by java.lang.ClassNotFoundException: solr.VelocityResponseWriter) when trying to browse http://localhost:8080/SolrRedo/browse/

How should I fix this problem?

Kind regards,
Hanjoyo

...
Re: successfully move to glassfish but got error accessing Velocity sample code
Hello list,

I finally solved the problem. I was missing the configuration of the Solr jar files in the solrconfig.xml file.

Thank you.

Kind regards,
Hanjoyo

On Mon, Oct 1, 2012 at 8:58 PM, Iwan Hanjoyo ihanj...@gmail.com wrote:

Hello all,

First, after extracting the apache-solr-3.6.1.zip file, I can run and access http://localhost:8080/browse (the solritas velocity example) from Jetty. I also successfully moved the solr.war to Glassfish and got it running. However, I get an error when accessing http://localhost:8080/browse (the solritas velocity example) from Glassfish. What configuration is missing here? I have copied the solr.home folder as-is from apache-solr-3.6.1.zip.

Thanks in advance.

Kind regards,
Hanjoyo
Tuning DirectUpdateHandler2.addDoc
Hi,

I have been profiling SolrCloud when indexing into a sharded, non-replicated collection, because indexing slows down when the index files (*.fdt) grow to a couple of GB (the largest is about 3.5 GB). When profiling for a couple of minutes I see that most time is spent in the DirectUpdateHandler2.addDoc method (called about 8000 times). Its time is spent in UpdateLog.lookupVersion, VersionInfo.getVersionFromIndex and SolrIndexSearcher.lookupId (called about 6000 times), which in turn spend their time in AtomicReader.termDocsEnum, which is called about 530.000 times, taking about 770.000 ms.

Is it true that the reason AtomicReader.termDocsEnum is called 530.000/6000 =~ 90 times per SolrIndexSearcher.lookupId call is that I have on average 90 term files? Can I do anything to lower this number of term files? I'm running several cores on my SolrCloud instance.

Is there any way I can lower the time spent in each AtomicReader.termDocsEnum call (this seems to be much faster when I don't have so many documents in my collection/shard)?

Thanks as always.

Best regards
Trym
mapping values in fields
Hi,

I am trying to map values from one field onto other values in another field. For example:

original_field: orig_value1
mapped_field: mapped_value1

with the help of an explicitly defined (N:1) mapping:

orig_value1 => mapped_value1
orig_value2 => mapped_value1
orig_value3 => mapped_value2

I have tried to use SynonymFilterFactory for the mapped_field:

<fieldtype name="mapped_field" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="region-map.txt" ignoreCase="true" expand="true"/>
  </analyzer>
</fieldtype>

combined with:

<copyField src="original_field" dest="mapped_field"/>

Now, a search for mapped_field:mapped_value1 yields results; however, mapped_value1 does not appear in the result at all. Instead, orig_value1 also appears in the mapped_field. How can I achieve that the mapped value appears in the result as well?

thank you,
matej
Re: mapping values in fields
What's the query you send? I'm guessing a bit here since you haven't included it, but try ensuring two things:

1) your mapped_field has stored="true"
2) you specify (either in your request handler or on the URL) fl=mapped_value

Best
Erick

On Tue, Oct 2, 2012 at 9:04 AM, tech.vronk t...@vronk.net wrote:

Hi,

I am trying to map values from one field onto other values in another field. For example:

original_field: orig_value1
mapped_field: mapped_value1

with the help of an explicitly defined (N:1) mapping:

orig_value1 => mapped_value1
orig_value2 => mapped_value1
orig_value3 => mapped_value2

I have tried to use SynonymFilterFactory for the mapped_field:

<fieldtype name="mapped_field" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="region-map.txt" ignoreCase="true" expand="true"/>
  </analyzer>
</fieldtype>

combined with:

<copyField src="original_field" dest="mapped_field"/>

Now, a search for mapped_field:mapped_value1 yields results; however, mapped_value1 does not appear in the result at all. Instead, orig_value1 also appears in the mapped_field. How can I achieve that the mapped value appears in the result as well?

thank you,
matej
Re: mapping values in fields
the query is: mapped_field:mapped_value1, and it seems to correctly return the documents. The mapped_field has the attribute stored="true" and also appears in the result (even without requesting it explicitly with fl), just with orig_value1 instead of mapped_value1.

matej

On 02.10.2012 15:46, Erick Erickson wrote:

What's the query you send? I'm guessing a bit here since you haven't included it, but try ensuring two things:

1) your mapped_field has stored="true"
2) you specify (either in your request handler or on the URL) fl=mapped_value

Best
Erick

On Tue, Oct 2, 2012 at 9:04 AM, tech.vronk t...@vronk.net wrote:

Hi,

I am trying to map values from one field onto other values in another field, with the help of an explicitly defined (N:1) mapping:

orig_value1 => mapped_value1
orig_value2 => mapped_value1
orig_value3 => mapped_value2

I have tried to use SynonymFilterFactory on the mapped_field (with region-map.txt as the synonyms file), combined with:

<copyField src="original_field" dest="mapped_field"/>

Now, a search for mapped_field:mapped_value1 yields results; however, mapped_value1 does not appear in the result at all. Instead, orig_value1 also appears in the mapped_field. How can I achieve that the mapped value appears in the result as well?

thank you,
matej
Re: At a high level how does faceting in SolrCloud work?
Thanks for this guys, really excellent explanation!

On Thu, Sep 27, 2012 at 12:15 AM, Yonik Seeley yo...@lucidworks.com wrote:

On Wed, Sep 26, 2012 at 6:21 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

2) the coordinator node sums up the counts for any constraint returned by multiple nodes, and then picks the top (facet.limit) constraints based on the counts it knows about.

It's actually more sophisticated than that: we don't limit to the top facet.limit constraints in the first phase. For *all* constraints we see from the first phase, we calculate whether each could possibly be in the top facet.limit constraints (based on the shards we haven't heard from). If so, we request exact counts from those shards we haven't heard from.

(but I believe this second query is optimized to only ask a shard about a constraint if it didn't already get the count in the first request)

Correct.

So imagine you have 3 shards, and querying them individually with facet.field=cat&facet.limit=3 you get:

shardA: cars(8), books(7), computers(6)
shardB: toys(8), books(7), garden(5)
shardC: garden(4), books(3), computers(3)

If you made a SolrCloud query (or an explicit distributed query of those three shards), the first request the coordinator sends to each shard would specify a higher facet.limit, and might get back something like:

shardA: cars(8), books(7), computers(6), cleaning(4), ...
shardB: toys(8), books(7), garden(5), cleaning(4), ...
shardC: garden(4), books(3), computers(3), plants(3), ...

in which case cleaning pops up as a contender for being in the top constraints. The coordinator sums up the counts for the constraints it knows about, and might decide that these are the top 3:

books(17), computers(9), cleaning(8)

To extend your example, Solr notices that plants has a count of 3 on one shard and was missing from the other two shards. The maximum possible count it *could* have is 11 (3+4+4), which could possibly put it in the top 3, hence it will also ask shardA and shardB about plants.

-Yonik
http://lucidworks.com
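The upper-bound check Yonik describes is plain arithmetic, and can be sketched as follows (class and method names here are made up for illustration, not Solr's actual implementation): a term's total count is at most its known sum plus, for each shard that did not return it, the lowest count that shard did return.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FacetRefinementSketch {

    // Hypothetical first-phase results from the example above (term -> count per shard).
    static Map<String, Map<String, Integer>> exampleShards() {
        Map<String, Map<String, Integer>> shards = new LinkedHashMap<>();
        shards.put("shardA", Map.of("cars", 8, "books", 7, "computers", 6, "cleaning", 4));
        shards.put("shardB", Map.of("toys", 8, "books", 7, "garden", 5, "cleaning", 4));
        shards.put("shardC", Map.of("garden", 4, "books", 3, "computers", 3, "plants", 3));
        return shards;
    }

    // A shard that didn't return the term could still hold it with up to the
    // lowest count that shard did return; sum those bounds with the known counts.
    static int maxPossibleCount(Map<String, Map<String, Integer>> shards, String term) {
        int max = 0;
        for (Map<String, Integer> counts : shards.values()) {
            Integer c = counts.get(term);
            if (c != null) {
                max += c;
            } else {
                int lowest = Integer.MAX_VALUE;
                for (int v : counts.values()) lowest = Math.min(lowest, v);
                max += (lowest == Integer.MAX_VALUE) ? 0 : lowest;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // plants(3) was only seen on shardC; shardA and shardB each returned 4
        // as their lowest count, so plants could be as high as 3 + 4 + 4 = 11.
        System.out.println(maxPossibleCount(exampleShards(), "plants"));
    }
}
```

If that bound still reaches the top facet.limit, the coordinator refines by asking the missing shards for the exact count; otherwise the term can be dropped without a second request.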
Re: At a high level how does faceting in SolrCloud work?
So does mincount get considered in this as well?

On Tue, Oct 2, 2012 at 10:19 AM, Jamie Johnson jej2...@gmail.com wrote:
Thanks for this guys, really excellent explanation!

On Thu, Sep 27, 2012 at 12:15 AM, Yonik Seeley yo...@lucidworks.com wrote:
On Wed, Sep 26, 2012 at 6:21 PM, Chris Hostetter hossman_luc...@fucit.org wrote:
2) the coordinator node sums up the counts for any constraint returned by multiple nodes, and then picks the top (facet.limit) constraints based on the counts it knows about.

It's actually more sophisticated than that - we don't limit ourselves to the top facet.limit constraints in the first phase. For *all* constraints we see from the first phase, we calculate whether each could possibly be in the top facet.limit constraints (based on the shards we haven't heard from). If so, we request exact counts from those shards we haven't heard from.

(but I believe this second query is optimized to only ask a shard about a constraint if it didn't already get the count in the first request)

Correct. So imagine you have 3 shards, and querying them individually with facet.field=cat&facet.limit=3 you get...

shardA: cars(8), books(7), computers(6)
shardB: toys(8), books(7), garden(5)
shardC: garden(4), books(3), computers(3)

If you made a SolrCloud query (or an explicit distributed query of those three shards), the first request the coordinator would send to each shard would specify a higher facet.limit, and might get back something like...

shardA: cars(8), books(7), computers(6), cleaning(4), ...
shardB: toys(8), books(7), garden(5), cleaning(4), ...
shardC: garden(4), books(3), computers(3), plants(3), ...

...in which case cleaning pops up as a contender for being in the top constraints. The coordinator sums up the counts for the constraints it knows about, and might decide that these are the top 3... books(17), computers(9), cleaning(8)

To extend your example, Solr notices that plants has a count of 3 on one shard, and was missing from the other two shards. The maximum possible count it *could* have is 11 (3+4+4), which could possibly put it in the top 3, hence it will also ask shardA and shardB about plants.

-Yonik
http://lucidworks.com
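A minimal sketch of the upper-bound check Yonik describes (this is not Solr's actual implementation, just the arithmetic; the shard data mirrors the example above): a shard that returned a truncated facet list can be hiding an unreported constraint with at most the smallest count it did return.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the refinement decision in distributed faceting: a constraint
// is a refinement candidate when its known count, plus the best it could
// possibly score on shards that did not report it, could still reach the
// current top-N cutoff.
public class FacetRefinement {
    // perShard: shard -> (constraint -> count), as returned in phase one.
    public static long maxPossibleCount(String constraint,
                                        Map<String, Map<String, Integer>> perShard) {
        long max = 0;
        for (Map<String, Integer> counts : perShard.values()) {
            Integer known = counts.get(constraint);
            if (known != null) {
                max += known;
            } else {
                // Upper bound for an unreported constraint: the minimum
                // count present in this shard's (truncated) returned list.
                max += counts.values().stream().mapToInt(Integer::intValue).min().orElse(0);
            }
        }
        return max;
    }
}
```

With the phase-one responses above, plants gets an upper bound of 3+4+4 = 11, so the coordinator refines it against shardA and shardB.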
Re: mapping values in fields
Ah, I get it (finally). OK, there's no good way to do what you want that I know of. The problem is that stored="true" takes effect long before any transformations are applied, and what is stored is always the raw input. You effectively want to chain the fields together, i.e. apply the analysis chain _then_ have the copyField take effect, which is not supported. I don't know how to accomplish this off the top of my head OOB; I'd guess your client would have to manage the substitutions and then just index separate fields...

Best
Erick

On Tue, Oct 2, 2012 at 9:54 AM, tech.vronk t...@vronk.net wrote:
the query is: mapped_field:mapped_value1 and it seems to correctly return the documents. the mapped_field has the attribute stored="true" and also appears in the result (even without requesting it explicitly with fl), just with orig_value1 instead of mapped_value1.
matej

On 02.10.2012 15:46, Erick Erickson wrote:
What's the query you send? I'm guessing a bit here since you haven't included it, but try ensuring two things:
1 - your mapped_field has stored="true"
2 - you specify (either in your request handler or on the URL) fl=mapped_field

Best
Erick

On Tue, Oct 2, 2012 at 9:04 AM, tech.vronk t...@vronk.net wrote:
Hi, I try to map values from one field into other values in another field. For example:

original_field: orig_value1
mapped_field: mapped_value1

with the help of an explicitly defined (N:1) mapping:

orig_value1 => mapped_value1
orig_value2 => mapped_value1
orig_value3 => mapped_value2

I have tried to use SynonymFilterFactory for the mapped_field:

<fieldtype name="mapped_field" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="region-map.txt" ignoreCase="true" expand="true"/>
  </analyzer>
</fieldtype>

combined with:

<copyField source="original_field" dest="mapped_field"/>

Now, a search for mapped_field:mapped_value1 yields results; however, in the result mapped_value1 does not appear at all, but instead orig_value1 appears also in the mapped_field. How can I achieve that the mapped_value appears in the result as well?
thank you,
matej
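Erick's suggestion — have the client manage the substitutions and index a separate field — can be sketched like this (the field and value names are the hypothetical ones from the example, not a real schema):

```java
import java.util.HashMap;
import java.util.Map;

// Client-side version of the N:1 value mapping: apply the mapping before
// indexing and put the result in a separate field, so the *stored* value
// is already the mapped one (stored values are never touched by analysis).
public class ValueMapper {
    private final Map<String, String> mapping = new HashMap<>();

    public ValueMapper() {
        mapping.put("orig_value1", "mapped_value1");
        mapping.put("orig_value2", "mapped_value1");
        mapping.put("orig_value3", "mapped_value2");
    }

    // Returns the mapped value, falling back to the original when unmapped.
    public String map(String original) {
        return mapping.getOrDefault(original, original);
    }
}
```

The client would then set both original_field and the pre-mapped mapped_field on each document before adding it, instead of relying on copyField.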
Question about MoreLikeThis query with solrj
Hi :) I'm using Solr 3.6.1 and I'm trying to use the similarity features of Lucene/Solr to compare texts. The content of my documents is in French, so I defined a field like:

<field name="content_mlt" type="text_fr" termVectors="true" indexed="true" stored="true"/>

(it uses the default text_fr fieldType provided with the default schema.xml file)

I'm using the following method to query my index:

SolrQuery sQuery = new SolrQuery();
sQuery.setQueryType("/" + MoreLikeThisParams.MLT);
sQuery.set(MoreLikeThisParams.MATCH_INCLUDE, false);
sQuery.set(MoreLikeThisParams.MIN_DOC_FREQ, 1);
sQuery.set(MoreLikeThisParams.MIN_TERM_FREQ, 1);
sQuery.set(MoreLikeThisParams.MAX_QUERY_TERMS, 50);
sQuery.set(MoreLikeThisParams.SIMILARITY_FIELDS, field);
sQuery.set("fl", "*,id,score");
sQuery.setRows(5);
sQuery.setQuery("content_mlt:" + theContentToFind);
QueryResponse rsp = server.query(sQuery);
return rsp.getResults();

The problem is that the returned results and the associated scores look strange to me. I indexed the three following texts:

sample 1: Le 1° de l'article 81 du CGI exige que les allocations pour frais soient utilisées conformément à leur objet pour être affranchies de l'impôt. Lorsque la réalité du versement des allocations est établie, le bénéficiaire doit cependant être en mesure de justifier de leur utilisation.

sample 2: Le premier alinéa du 1° de l'article 81 du CGI prévoit que les rémunérations des journalistes, rédacteurs, photographes, directeurs de journaux et critiques dramatiques et musicaux perçues ès qualités constituent des allocations pour frais d'emploi affranchies d'impôt à concurrence de 7 650 EUR.

sample 3: Par ailleurs, lorsque leur montant est fixé par voie législative, les allocations pour frais prévues au 1° de l'article 81 du CGI sont toujours réputées utilisées conformément à leur objet et ne peuvent donner lieu à aucune vérification de la part de l'administration. Il s'agit d'une présomption irréfragable, qui ne peut donc pas être renversée par la preuve contraire qui serait apportée par l'administration d'une utilisation non conforme à son objet de l'allocation concernée. Pour que le deuxième alinéa du 1° de l'article 81 du CGI s'applique, deux conditions doivent être réunies simultanément : la nature d'allocation spéciale inhérente à la fonction ou à l'emploi résulte directement de la loi ; son montant est fixé par la loi.

I tried to query the index by passing the first sample as the content to query, and the result is the following:

MLT result: id: dc3 - score: 0.114195324 (corresponds to sample 3)
MLT result: id: dc2 - score: 0.035233106 (corresponds to sample 2)

The results don't even contain the first sample, although it is exactly the same text as the one put into the query :/ Any idea why I get these results? Maybe the query parameters are incorrect, or there is something to change in the Solr config?

Thanks :)
Gary
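One thing worth noting about the snippet above: MATCH_INCLUDE (mlt.match.include) set to false asks the MLT handler to leave the matched document itself out of the result list, which may be why sample 1 never shows up. As a debugging aid, here is a stdlib-only sketch (deliberately not SolrJ) of the HTTP parameters those SolrJ calls assemble; the parameter names follow the MoreLikeThisParams constants as I understand them, so treat them as an assumption and verify against your Solr version:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Rebuilds the /mlt request as a plain query string so it can be replayed
// in a browser while debugging. Each SolrJ set(...) call from the snippet
// above becomes one parameter here.
public class MltRequestSketch {
    public static String buildQueryString(String queryText) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("qt", "/mlt");
        params.put("mlt.match.include", "false"); // excludes the matched doc itself
        params.put("mlt.mindf", "1");
        params.put("mlt.mintf", "1");
        params.put("mlt.maxqt", "50");
        params.put("mlt.fl", "content_mlt");
        params.put("fl", "*,id,score");
        params.put("rows", "5");
        params.put("q", "content_mlt:" + queryText);
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }
}
```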
Hierarchical Data
Hi all, I'm trying to import some hierarchical data (stored in MySQL) into Solr, using DataImportHandler. Unfortunately, as most of you already know, MySQL has no support for recursive queries, so there is no way to query hierarchical data stored as an adjacency list. So I considered writing a custom DIH transformer which, given a specified SQL query (like select * from categories) and a value (e.g. category_id):
* fetches all data
* builds a hierarchical representation of the fetched data
* optionally caches the hierarchical data structure
* then returns 2 multi-valued lists which contain the 2 full paths (as String and as Number)
Is there something out of the box? Alternatively, does the above approach sound good? TIA

Twitter: http://www.twitter.com/m_cucchiara
G+: https://plus.google.com/107903711540963855921
Linkedin: http://www.linkedin.com/in/mauriziocucchiara
VisualizeMe: http://vizualize.me/maurizio.cucchiara?r=maurizio.cucchiara

Maurizio Cucchiara
RE: SolrJ - IOException
Hi Toke, We encountered this issue again. This time the SOLR servers were stalled. We are at 30 TPS. Please let us know any updates in the HTTP issue. Thanks, Balaji Balaji Gandhi, Senior Software Developer, Horizontal Platform Services Product Engineering │ Apollo Group, Inc. 1225 W. Washington St. | AZ23 | Tempe, AZ 85281 Phone: 602.713.2417 | Email: balaji.gan...@apollogrp.edumailto:balaji.gan...@apollogrp.edu P Go Green. Don't Print. Moreover soft copies can be indexed by algorithms. From: Balaji Gandhi Sent: Thursday, September 27, 2012 10:52 AM To: 'Toke Eskildsen [via Lucene]' Subject: RE: SolrJ - IOException Here is the stack trace:- org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server: org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:414) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:182) at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117) at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:122) at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:107) at org.apache.solr.handler.dataimport.thread.task.SolrUploadTask.upload(SolrUploadTask.java:31) at org.apache.solr.handler.dataimport.thread.SolrUploader.run(SolrUploader.java:31) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: org.apache.http.NoHttpResponseException: The target server failed to respond at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:101) at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252) at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:282) at 
org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247) at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:216) at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298) at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:647) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:353) ... 9 more Balaji Gandhi, Senior Software Developer, Horizontal Platform Services Product Engineering │ Apollo Group, Inc. 1225 W. Washington St. | AZ23 | Tempe, AZ 85281 Phone: 602.713.2417 | Email: balaji.gan...@apollogrp.edumailto:balaji.gan...@apollogrp.edu P Go Green. Don't Print. Moreover soft copies can be indexed by algorithms. From: Toke Eskildsen [via Lucene] [mailto:ml-node+s472066n4010082...@n3.nabble.com] Sent: Tuesday, September 25, 2012 12:19 AM To: Balaji Gandhi Subject: Re: SolrJ - IOException On Tue, 2012-09-25 at 01:50 +0200, balaji.gandhi wrote: I am encountering this error randomly (under load) when posting to Solr using SolrJ. Has anyone encountered a similar error? org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://localhost:8080/solr/profile at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:414) [...] This looks suspiciously like a potential bug in the HTTP keep-alive flow that we encountered some weeks ago. 
I am guessing that you are issuing more than 100 separate updates/second. Could you please provide the full stack trace? If you reply to this email, your message will be added to the discussion below: http://lucene.472066.n3.nabble.com/SolrJ-IOException-tp4010026p4010082.html To unsubscribe from SolrJ - IOException, click herehttp://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_codenode=4010026code=YmFsYWppLmdhbmRoaUBhcG9sbG9ncnAuZWR1fDQwMTAwMjZ8LTEwNzE2NTA1NDI=. NAMLhttp://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewerid=instant_html%21nabble%3Aemail.namlbase=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespacebreadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml This message is private and confidential. If you have received it in error, please notify the sender and
Re: Hierarchical Data
Hi Maurizio, if you can manipulate your MySQL db, a simpler solution can be the following:
1 - Add a new field for your hierarchical data inside your table: MY_HIERARCHICAL_FIELD
2 - Populate this new field directly in MySQL with a simple procedure*
3 - Import the data into your Solr index

*The MySQL procedure could be similar to the following:
1 - iterate over your table and find the top records not yet processed (i.e. with MY_HIERARCHICAL_FIELD empty and either no parent, or a parent whose MY_HIERARCHICAL_FIELD is NOT empty)
2 - find the ancestors of the found records
3 - update MY_HIERARCHICAL_FIELD with the value of the parent's MY_HIERARCHICAL_FIELD || '/' || the current MY_ID

At the end of the procedure you will have MY_HIERARCHICAL_FIELD populated with values like '12/35/45/154'. The value '12/35/45/154' means that the current record has id 154 and is a child of record 45, which is a child of 35, which is a child of 12, which has no parent.

2012/10/2 Maurizio Cucchiara mcucchi...@apache.org:
[...]
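The same path-enumeration idea can also run in memory, e.g. inside the custom DIH transformer Maurizio describes, instead of a MySQL procedure. A minimal sketch (ids and the '12/35/45/154' example come from the message above; the map layout is an assumption):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Builds the '/'-separated id path for a record by walking parent
// pointers of an adjacency list loaded from the database.
public class PathEnumerator {
    // childId -> parentId; root records are simply absent from the map.
    public static String pathFor(long id, Map<Long, Long> parentOf) {
        List<Long> path = new ArrayList<>();
        Long current = id;
        while (current != null) {
            path.add(0, current);            // prepend so the root comes first
            current = parentOf.get(current); // null once we reach a root
        }
        StringBuilder sb = new StringBuilder();
        for (Long part : path) {
            if (sb.length() > 0) sb.append('/');
            sb.append(part);
        }
        return sb.toString();
    }
}
```

A transformer would call this once per row (caching parentOf across rows) and emit the resulting path as the multi-valued field value.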
Re: multivalued filed question (FieldCache error)
: I'm also using that field for a facet:

Hmmm... that still doesn't make sense. Faceting can use FieldCache, but it will check if the field is multivalued to decide if/when/how to do this. There's nothing else in your requestHandler config that would suggest why you might get this error.

Can you please provide more details about the error you are getting -- in particular: the complete stack trace from the server logs. That should help us identify the code path leading to the problem.

: |requestHandler name=mytype class=solr.SearchHandler
: lst name=defaults
: str name=defTypedismax/str
: str name=echoParamsexplicit/str
: float name=tie1/float
: str name=qf
: many field but not store_slug
: /str
: str name=pf
: |many field but not store_slug|||
: /str
: str name=fl
: ..., store_slug
: /str
: str name=mm![CDATA[1100% 580%]]/str
: int name=qs2/int
: int name=ps2/int
: str name=q.alt*:*/str
: str name=spellcheck.dictionarydefault/str
: str name=spellchecktrue/str
: str name=spellcheck.extendedResultstrue/str
: str name=spellcheck.count10/str
: str name=spellcheck.collatetrue/str
: !-- Facet --
: str name=facettrue/str
: str name=facet.mincount1/str
: str name=facet.pivot.mincount0/str
: str name=facet.sortcount/str
: ...
: str name=facet.fieldstore_slug/str
: ...
: str name=hlfalse/str
: /lst
: arr name=last-components
: strspellcheck/str
: /arr
: /requestHandler|

On 01/10/12 18:34, Erik Hatcher wrote:
: How is your request handler defined? Using store_slug for anything but fl?
:
: Erik
:
: On Oct 1, 2012, at 10:51, giovanni.bricc...@banzai.it wrote:
:
: Hello,
: I would like to put a multivalued field into a qt definition as an output field. To do this I edit the current solrconfig.xml definition and add the field to the fl specification.
:
: Unexpectedly, when I do the query q=*:*&qt=mytype I get the error:
:
: str name=msg
: can not use FieldCache on multivalued field: store_slug
: /str
:
: But if I instead run the query
:
: http://src-eprice-dev:8080/solr/0/select/?q=*:*&qt=mytype&fl=otherfield,mymultivaluedfiled
:
: I don't get the error.
:
: Have you got any suggestions?
:
: I'm using solr 4 beta
:
: solr-spec 4.0.0.2012.08.06.22.50.47
: lucene-impl 4.0.0-BETA 1370099
:
: Giovanni
:
: --
: Giovanni Bricconi
: Banzai Consulting
: cell. 348 7283865
: ufficio 02 00643839
: via Gian Battista Vico 42
: 20132 Milano (MI)

-Hoss
Re: Hierarchical Data
Ciao Davide, unfortunately changing the structure of the dbs is not an option for me (there are many legacy dbs involved), otherwise I would have chosen a /closure table/ instead of the /path enumeration/ you mentioned before. Furthermore, I'd need 2 path-enumeration fields: 1 for the values (ids) and 1 for the labels (names). Also, I'm looking for a definitive, general solution.

Maurizio Cucchiara
Re: Can I rely on correct handling of interrupted status of threads?
I remember a bug in EmbeddedSolrServer in 1.4.1 where an exception bypassed request closing, which led to a searcher leak and OOM. It was fixed about two years ago.

On Tue, Oct 2, 2012 at 1:48 PM, Robert Krüger krue...@lesspain.de wrote:
Hi, I'm using Solr 3.6.1 embedded directly in an application, i.e. via EmbeddedSolrServer, not over an HTTP connection, which works perfectly. Our application uses Thread.interrupt() for canceling long-running tasks (e.g. through Future.cancel). A while (and a few Solr versions) back, a colleague of mine implemented a workaround because he said that Solr didn't handle the thread's interrupted status correctly, i.e. it did not set the interrupted status after catching an InterruptedException or rethrow it, thus losing the information that an interrupt had been requested, which breaks libraries relying on that. However, I did not find anything up-to-date in mailing list or forum archives on the web. Is that still, or was it ever, the case? What does one have to watch out for when interrupting a thread that is doing anything within Solr/Lucene? Any advice would be appreciated.

Regards,
Robert

--
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
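The convention Robert is worried about can be illustrated in a few lines: library code that catches InterruptedException should either rethrow it or restore the flag via Thread.currentThread().interrupt(), so cancellation mechanisms such as Future.cancel keep working. A minimal sketch (not Solr code):

```java
// A library method that swallows InterruptedException must restore the
// thread's interrupted flag, otherwise callers can no longer observe the
// cancellation request that Thread.interrupt() delivered.
public class InterruptFriendly {
    // Returns true if the sleep completed, false if it was interrupted.
    public static boolean sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag
            return false;
        }
    }
}
```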
WordBreak spell correction makes split terms optional?
The user query design your ownbinoculars is corrected by the 'wordbreak' dictionary to:

<str name="querystring">design your (own binoculars)</str>

Where are the parentheses coming from? Can I strip them with a post-processing filter? The parentheses make the terms optional, so, while the first match is excellent, the rest are irrelevant.

Thx,
Carrie Coy
Re: PHP client for a web application
Hi Esteban. I'm currently using both in my application. Both are fine. Solarium is great because it models the concepts of Solr and can build queries using OOP. The other one is more low-level, so you have to write queries manually, which can be good in some situations. Both are fast enough. Solarium has a bigger learning curve. Solarium has built-in batch updating and other things like parallel queries. So I would go with Solarium; it's a very nice library.

On Oct 3, 2012 5:38 AM, Esteban Cacavelos estebancacave...@gmail.com wrote:
Hi, I'm starting a web application using Solr as a search engine. The web site will be developed in PHP (maybe I'll use a framework as well). I would like to know some thoughts and opinions about the clients (http://wiki.apache.org/solr/SolPHP). I didn't like the PHP extension option very much because I think it is a limitation. So, I would like to read opinions about SOLARIUM and SOLR-PHP-CLIENT. Thanks in advance!
--
Esteban L. Cacavelos de Amoriza
Cel: 0981 220 429
RE: WordBreak spell correction makes split terms optional?
The parentheses are being added by the spellchecker. I tried to envision a number of different scenarios when designing how this would work, and at the time it seemed best to add parentheses around terms that originally were together but now are split up. From your example, I see this is a mistake. I see no reason why you can't just strip these, or use collateExtendedResults to construct your own query, for now. Would you mind giving a little more detail on your query parameters in a JIRA bug report, so we can track this problem and fix it?

Overall, it's a bit tricky when splitting a term into multiple terms to make sure that if the original term was required|optional|prohibited, all the resulting terms also have the same status. See WordBreakSolrSpellCheckerTest#testCollate for details.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

-Original Message-
From: Carrie Coy [mailto:c...@ssww.com]
Sent: Tuesday, October 02, 2012 3:08 PM
To: solr-user@lucene.apache.org
Subject: WordBreak spell correction makes split terms optional?
[...]
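A minimal post-processing filter along the lines James suggests, just dropping the parentheses the spellchecker added (the sample string comes from Carrie's report; in a real query you would only want to strip parentheses you know the spellchecker introduced, not ones the user typed):

```java
// Strips the parentheses the wordbreak spellchecker wraps around
// split-up terms, so all terms in the collation carry the same status.
public class CollationCleaner {
    public static String stripParens(String collation) {
        return collation.replace("(", "").replace(")", "");
    }
}
```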
NoHttpResponseException using Solrj to index
Hey, I am trying to make a simple application using SolrJ to index documents. I used start.jar to start Solr. When I try to index a document I get the following exception:

Exception in thread main java.lang.NoClassDefFoundError: org/apache/http/NoHttpResponseException

The exception occurs when I instantiate the SolrServer:

public static void indexFilesSolrCell(File srcFile, String solrId) throws IOException, SolrServerException {
    String urlString = "http://localhost:8983/solr";
    SolrServer solr = new HttpSolrServer(urlString);
    ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");

I already import the Apache solr-solrj jar.

Thank you,
--
Rui Vaz
Re: Can SOLR Index UTF-16 Text
If it is a simple text file, does that text file start with the UTF-16 BOM marker? http://unicode.org/faq/utf_bom.html

Also, do UTF-8 files work? If not, then your setup has a basic encoding problem. And, when you post such a text file (for example, with curl), use the UTF-16 charset mime-type: I think it is text/plain; charset=utf-16.

- Original Message -
| From: Chris Hostetter hossman_luc...@fucit.org
| To: solr-user@lucene.apache.org
| Sent: Friday, September 28, 2012 5:17:15 PM
| Subject: Re: Can SOLR Index UTF-16 Text
|
| : Our SOLR setup (4.0.BETA on Tomcat 6) works as expected when indexing UTF-8
| : files. Recently, however, we noticed that it has issues with indexing
| : certain text files, e.g. UTF-16 files. See attachment for an example
| : (tarred+zipped)
| :
| : tesla-utf16.txt
| : http://lucene.472066.n3.nabble.com/file/n4010834/tesla-utf16.txt
|
| No attachment came through to the list, and the URL nabble seems to have
| provided when you posted your message leads to a 404.
|
| In general, the question of is indexing a UTF-16 file supported largely
| depends on *how* you are indexing this file -- if it's plain text, are you
| parsing it yourself using some client code and then sending it to Solr,
| are you using DIH to read it from disk? Are you using
| ExtractingRequestHandler?
|
| Those are all very different ways to index data in Solr -- and depending
| on what you are doing determines how/where the encoding of that file is
| processed.
|
| -Hoss
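The first question above (does the file start with a UTF-16 BOM?) is easy to check programmatically. A small sketch: the BOM is FE FF for big-endian UTF-16 and FF FE for little-endian, per the Unicode FAQ linked above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Detects a UTF-16 byte-order mark at the start of a file's bytes.
// Returns null when no UTF-16 BOM is present (the file may then be
// UTF-8 or something else entirely).
public class BomSniffer {
    public static Charset detectUtf16Bom(byte[] head) {
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF, b1 = head[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) return StandardCharsets.UTF_16BE;
            if (b0 == 0xFF && b1 == 0xFE) return StandardCharsets.UTF_16LE;
        }
        return null;
    }
}
```

A client could read the first two bytes of the file, pick the charset this way, and decode with it before sending the text to Solr.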
ContentStreamUpdateRequest example in 4.0 Beta
Hi, is there any complete implementation for Solr 4.0 Beta of a class which uses ContentStreamUpdateRequest to send data to the ExtractingRequestHandler (http://wiki.apache.org/solr/ExtractingRequestHandler), similar to this one for the 3.1 version?

http://wiki.apache.org/solr/ContentStreamUpdateRequestExample

Thank you,
--
Rui Vaz
Re: Problem with spellchecker
Thank you for your help, the whole team overlooked this simple error. It was driving us crazy! :) Thanks!!
Jose.

On 10/2/12 1:23 AM, Markus Jelsma markus.jel...@openindex.io wrote:
The problem is your stray double quote:

<str name="queryAnalyzerFieldType">text_general_fr"</str>

I'd think this would throw an exception somewhere.

-Original message-
From: Jose Aguilar jagui...@searchtechnologies.com
Sent: Tue 02-Oct-2012 01:40
To: solr-user@lucene.apache.org
Subject: Problem with spellchecker

We have configured 2 spellcheckers, English and French, in Solr 4 BETA. Each spellchecker works with a specific search handler. The English spellchecker is working as expected with any word, regardless of case. On the other hand, the French spellchecker only works with lowercase words. If the first letter is uppercase, then the spellchecker does not return any suggestion unless we add the spellcheck.q parameter with that term. To further clarify, this doesn't return any corrections:

http://localhost:8984/solr/collection1/handler?wt=xml&q=Système

But this one works as expected:

http://localhost:8984/solr/collection1/handler?wt=xml&q=Système&spellcheck.q=Système

According to this page (http://wiki.apache.org/solr/SpellCheckComponent#q_OR_spellcheck.q), the spellcheck.q parameter shouldn't be required: If spellcheck.q is defined, then it is used, otherwise the original input query is used. Are we missing something? We double-checked the configuration settings for English, which is working fine and seems well configured.

Here is an extract of the spellcheck component configuration for the French language:

<searchComponent name="spellcheckfr" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_general_fr"</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">SpellingFr</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">4</int>
    <float name="maxQueryFrequency">0.01</float>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

Thanks for any help
RE: Can SOLR Index UTF-16 Text
Solr can index bytearrays too: unigram, bigram, trigram... even bitsets, tritsets, qatrisets ;-) LOL, I've got a bad cold... BTW, don't forget to configure UTF-8 as your default (Java) container encoding...
-Fuad
Re: SolrJ - IOException
Was it stalled due to a gc pause?

Sent from my iPhone

On Oct 2, 2012, at 10:02 AM, balaji.gandhi balaji.gan...@apollogrp.edu wrote:
Hi Toke, we encountered this issue again. This time the SOLR servers were stalled. We are at 30 TPS. Please let us know any updates on the HTTP issue. Thanks, Balaji
[...]
Re: anyone has solrcloud performance numbers?
I don't have the URL handy, but guys at LinkedIn have a benchmark tool for Solr, ElasticSearch, and Sensei. Check the list archives for URL and my signature below for a tool that can show metrics for any of those systems, which you'll probably want to observe during testing. Otis -- Performance Monitoring - http://sematext.com/spm On Oct 2, 2012 6:47 PM, varun srivastava varunmail...@gmail.com wrote: Hi, Does anyone has some solr cloud preliminary performance numbers ? Or if someone has performance comparison ( throughput and latency) between solr 3.6 and solrcloud ( having a huge monolithic index vs sharded) ? Thanks Varun
Re: Query among multiple cores
Are the cores join-able? If so, you can use Solr's join feature to execute just one query.

Otis
--
Performance Monitoring - http://sematext.com/spm

On Oct 2, 2012 5:50 PM, Nicholas Ding nicholas...@gmail.com wrote:
Hello, I'm working on a search project that involves searching against more than one core. For example, I have 3 cores: Core A, Core B, and Core C.
- First step: search Core A, get some ids.
- Second step: search Core B, get some keywords.
- Finally, I use the ids from Core A and the keywords from Core B to search against Core C.
I know I can write a PHP frontend to call Solr several times, but is it possible to do this inside Solr? Core A and Core B are pretty small; compared to the search time, the HTTP overhead is large. This project is going to have high-volume traffic, so I want to reduce the HTTP overhead if possible.
Thanks
Nicholas
Re: NoHttpResponseException using Solrj to index
You need to add the jar with that missing class to the startup command line.

Otis
--
Performance Monitoring - http://sematext.com/spm

On Oct 2, 2012 5:42 PM, Rui Vaz rui@gmail.com wrote:
[...]
Re: Query among multiple cores
Join is cool, but does it work across multiple cores? On the Solr wiki, I saw it only applies to a single core.

On Tue, Oct 2, 2012 at 11:06 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:
> Are the cores join-able? If so, you can use Solr's join feature to execute
> just one query.
Re: Follow links in xml doc
Hi Billy,

There is nothing in Solr that will do XML parsing and link extraction, so you'll need to do that part yourself. Once you do, have a look at Solr join for parent-child querying: http://search-lucene.com/?q=solr+join

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html

On Tue, Oct 2, 2012 at 9:51 PM, Billy Newman newman...@gmail.com wrote:
> Hello again all.
>
> I have a URLDataSource to index XML data. Is there any way to follow
> links within the XML doc and index the items they point to under the same
> document? I.e., if I search for a word or term and that term lives in a
> link of the doc with ID 12345, I would like that doc returned.
>
> Thanks,
> Billy
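The "do that part yourself" step could look roughly like the sketch below: pull the link URLs out of the XML so their content can be fetched and indexed alongside the parent document. The element and attribute names (`<link href="...">`) are hypothetical; adjust them to your actual XML schema.

```python
# Extract link URLs from an XML document using only the standard library.
# The <link href="..."> convention here is an assumption, not from the thread.
import xml.etree.ElementTree as ET

def extract_links(xml_text):
    """Return all href values found on <link> elements, at any depth."""
    root = ET.fromstring(xml_text)
    return [el.attrib["href"] for el in root.iter("link") if "href" in el.attrib]

doc = """<doc id="12345">
  <title>Parent document</title>
  <link href="http://example.com/child-1.xml"/>
  <link href="http://example.com/child-2.xml"/>
</doc>"""

print(extract_links(doc))
# ['http://example.com/child-1.xml', 'http://example.com/child-2.xml']
```

From there you would fetch each URL and add its content to the parent's Solr document (or index it as a child for a join query).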
Re: Query among multiple cores
Solr join does work across multiple cores, as long as they are in the same JVM.

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html

On Tue, Oct 2, 2012 at 11:09 PM, Nicholas Ding nicholas...@gmail.com wrote:
> Join is cool, but does it work across multiple cores? On the Solr wiki, I
> saw it only applies to a single core.
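For reference, a cross-core join in Solr 4.x is expressed with the join query parser's fromIndex parameter. The core names, field names, and the inner query below are made up for illustration, not taken from this thread:

```
/solr/coreC/select?q={!join fromIndex=coreA from=id to=parent_id}status:active
```

This queries coreA for documents matching status:active, collects their id values, and returns documents from coreC whose parent_id matches one of those values.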
Re: PHP client for a web application
Thanks for your response, Damien. As you said, with Solarium you can do some basic things more quickly than with solr-php-client. I think it is a good choice for basic applications, and if more specific things are needed, then solr-php-client works as well.

2012/10/2 Damien Camilleri i...@webdistribution.com.au
> Hi Esteban,
>
> I'm currently using both in my application, and both are fine. Solarium
> is great because it models the concepts of Solr and can build queries
> using OOP. The other one is lower level, so you have to write queries
> manually, which can be good in some situations. Both are fast enough.
> Solarium has a bigger learning curve, but it has built-in batch updating
> and other things like parallel queries, so I would go with Solarium. It's
> a very nice library.
>
> On Oct 3, 2012 5:38 AM, Esteban Cacavelos estebancacave...@gmail.com wrote:
>> Hi,
>>
>> I'm starting a web application that uses Solr as a search engine. The
>> web site will be developed in PHP (maybe with a framework). I would like
>> to hear some thoughts and opinions about the clients
>> (http://wiki.apache.org/solr/SolPHP). I didn't like the PHP extension
>> option very much because I think it is a limitation. So I would like to
>> read opinions about Solarium and solr-php-client.
>>
>> Thanks in advance!

--
Esteban L. Cacavelos de Amoriza
Cel: 0981 220 429
Re: anyone has solrcloud performance numbers?
Thanks, Otis.

On Tue, Oct 2, 2012 at 8:06 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:
> I don't have the URL handy, but the folks at LinkedIn have a benchmarking
> tool for Solr, ElasticSearch, and Sensei. Check the list archives for the
> URL, and see my signature below for a tool that can show metrics for any
> of those systems, which you'll probably want to observe during testing.
Re: anyone has solrcloud performance numbers?
Otis,

I am looking for performance benchmark numbers rather than performance monitoring tools; SPM looks like a monitoring tool. Moreover, that tool compares Solr with ElasticSearch etc., whereas I want a comparison between Solr 3.6 and SolrCloud.

Thanks
Varun

On Tue, Oct 2, 2012 at 9:15 PM, varun srivastava varunmail...@gmail.com wrote:
> Thanks, Otis.
Re: anyone has solrcloud performance numbers?
Hi,

I was trying to say that you will need to run the benchmark yourself, because each context is different. The LinkedIn tool I referred you to will help you do that - you don't have to bench the non-Solr engines. I was also suggesting that while you are benchmarking, you really want to be looking at various metrics, possibly with the help of SPM.

HTH
Otis
--
Performance Monitoring - http://sematext.com/spm

On Oct 3, 2012 12:25 AM, varun srivastava varunmail...@gmail.com wrote:
> Otis,
>
> I am looking for performance benchmark numbers rather than performance
> monitoring tools; SPM looks like a monitoring tool. Moreover, that tool
> compares Solr with ElasticSearch etc., whereas I want a comparison
> between Solr 3.6 and SolrCloud.
>
> Thanks
> Varun
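If you do end up rolling your own micro-benchmark, the measurement loop itself is simple: time each request and report throughput and latency percentiles. A minimal sketch, where the callable being timed is a stand-in - in a real test it would issue an actual Solr query over HTTP:

```python
# Minimal latency-benchmark sketch: time a callable N times and report
# throughput plus p50/p99 latency. `query` here is a placeholder; in a real
# benchmark it would send a Solr request (e.g. via urllib).
import time

def benchmark(query, n=100):
    latencies = []
    start = time.perf_counter()
    for _ in range(n):
        t0 = time.perf_counter()
        query()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "throughput_qps": n / elapsed,
        "p50_s": latencies[n // 2],
        "p99_s": latencies[min(n - 1, int(n * 0.99))],
    }

stats = benchmark(lambda: sum(range(1000)))  # stand-in for a real query
print(sorted(stats))
# ['p50_s', 'p99_s', 'throughput_qps']
```

Running the same loop against a Solr 3.6 core and a SolrCloud collection, with identical documents and query mix, would give the 3.6-vs-cloud comparison you're after.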