Re: Basic auth
> Although I'm not sure why you took this approach instead of supporting simple built-in basic auth and let us configure security the old/easy way

Going with Jetty basic auth is not useful in a large enough cluster. Where do you store the credentials, and how would you propagate them across the cluster? When you use Solr, you need a Solr-like way of managing that. The other problem is inter-node communication: how do you pass credentials along in that case?

> I'm guessing it has to do with future requirement of field/doc level security

Actually, that is an orthogonal requirement.

> I hope you can get rid of the war file soon and start promoting Solr as a set of libraries so one can easily embed/extend Solr

That is not what we have in mind. We want Solr to be a server which controls every aspect of its running. We should have the choice of getting rid of Jetty or whatever and moving to a new system. We only guarantee the interface/protocol to remain constant.

On Tue, Jul 28, 2015 at 2:19 AM, Fadi Mohsen fadi.moh...@gmail.com wrote:

Thank you. I tested providing my implementation of authentication in security.json, uploaded the file to ZK (just considering authentication), started the nodes and it worked like a charm. That required of course turning off Jetty basic auth. Although I'm not sure why you took this approach instead of supporting simple built-in basic auth and letting us configure security the old/easy way. I'm guessing it has to do with a future requirement of field/doc-level security. I hope you can get rid of the war file soon and start promoting Solr as a set of libraries so one can easily embed/extend Solr, since some (especially me) might consider command-line ZK operations not that continuous-delivery/automate-everything/production friendly.
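For readers following along: the security.json Noble refers to pairs an authentication section with an optional authorization section. A sketch of the shape documented for the Basic Auth work (SOLR-7692); the credential value below is a placeholder, not a working hash, so verify the exact format and class names against your Solr release:

```json
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "credentials": {
      "solr": "<base64 sha256 hash> <base64 salt>"
    }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "user-role": { "solr": "admin" },
    "permissions": [
      { "name": "security-edit", "role": "admin" }
    ]
  }
}
```

The file lives in ZooKeeper (uploaded as Fadi describes), so every node in the cluster reads the same credentials; that is exactly the propagation problem Noble raises with per-node Jetty realms.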
It's easy today to spin up a Jetty and wire/point out resource classes, or wire up CXF alongside, to get things playing, but I'm probably missing other things, since I see many mails usually in consensus against embedding, rather wanting people to consider Solr as a stand-alone service; not sure why! I'm probably getting out of context here. Regards

On 27 Jul 2015, at 13:17, Noble Paul noble.p...@gmail.com wrote:

Q. Do you know when it would be released?
5.3 will be released in another 3-4 weeks.

Q. Are there any requirements of ZK authentication must be there as well?
No.

bq. Providing my own security.json + class/implementation to verify user/pass should work today with 5.2, right?
Yes. But if you modify your credentials or anything in that JSON, you will have to restart all your nodes.

Q. SOLR-7274 pluggable security is already in 5.2 (my requirement is to provide user/pass in a secure manner, not as argument on cmd or from (our unsecured) ZK but from a configuration restful service)
I'm not clear what your question is. Basic Auth is a well-known standard; we are just implementing that standard. We store all credentials and permissions in ZK. That means it is only as secure as your ZK. As long as nobody can write to ZK, your system is safe.

On Wed, Jul 22, 2015 at 11:10 PM, Fadi Mohsen fadi.moh...@gmail.com wrote:

Hi, I have some questions regarding basic auth and proper support in 5.3: Do you know when it would be released? Are there any requirements of ZK authentication must be there as well? Do we store the user/pass in ZK? SOLR-7274 pluggable security is already in 5.2 (my requirement is to provide user/pass in a secure manner, not as an argument on the command line or from (our unsecured) ZK, but from a configuration restful service). I'm not sure the 5.3 release would fit the above requirement; can you reflect on this? Providing my own security.json + class/implementation to verify user/pass should work today with 5.2, right?
Thanks Fadi

On 22 Jul 2015, at 14:33, Noble Paul noble.p...@gmail.com wrote:

Solr 5.3 is coming with proper basic auth support: https://issues.apache.org/jira/browse/SOLR-7692

On Wed, Jul 22, 2015 at 5:28 PM, Peter Sturge peter.stu...@gmail.com wrote:

If you're using Jetty, you can use the standard realms mechanism for Basic Auth, and it works the same on Windows or UNIX. There are plenty of docs on the Jetty site about getting this working, although it does vary somewhat depending on the version of Jetty you're running (N.B. I would suggest using Jetty 9, not 8, as 8 is missing some key authentication classes). If, when you execute a search query against your Solr instance, you get a username and password popup, then Jetty's auth is set up; if you don't, then something's wrong in the Jetty config. It's worth noting that if you're doing distributed searches, Basic Auth on its own will not work for you. This is because Solr sends distributed requests to remote instances on behalf of the user, and it has no knowledge of the web container's auth mechanics. We got 'round this by customizing Solr to receive credentials and use them for authentication to remote
Search for All CAPS words
Hi, I need the capability to search for "GATE" separately from "gate". I cannot remove the lowercase filter factory from my search and analysis chains, since that would break many other search scenarios. Is there a way to payload/mark an ALL CAPS word in the index analyzer chain before it gets lowercased (by the LowerCaseFilterFactory) so that I can search for it with some custom grammar and logic in my query parser? Say I want: Field:_gate to match "GATE" only; Field:gate to match both "GATE" and "gate". Any pointers would be helpful. Thanks, Ritesh -- View this message in context: http://lucene.472066.n3.nabble.com/Search-for-All-CAPS-words-tp4219893.html Sent from the Solr - User mailing list archive at Nabble.com.
Hard Commit not working
Hi, I am trying to index documents using SolrCloud. After setting maxTime to 6 ms for hard commit, documents are visible instantly while adding them, not committing after 6 ms. I have added the Solr log below; please check it. I am not getting exactly what is happening.

*CURL to commit documents:*
curl http://localhost:8983/solr/test/update/json -H 'Content-type:application/json' -d 'json-here'

*solrconfig.xml:*
<autoCommit>
  <maxDocs>1</maxDocs>
  <maxTime>6</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<!-- <autoSoftCommit> <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime> </autoSoftCommit> -->

*Solr log:*
INFO - 2015-07-30 14:14:12.636; [test shard6 core_node2 test_shard6_replica1] org.apache.solr.update.processor.LogUpdateProcessor; [test_shard6_replica1] webapp=/solr path=/update params={update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://100.77.202.145:8983/solr/test_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false} {commit=} 0 26
Re: Hard Commit not working
Hi Edward, I am only sending 1 document for indexing, so why is it committing instantly? I set maxTime to 6.

On Thu, Jul 30, 2015 at 8:26 PM Edward Ribeiro edward.ribe...@gmail.com wrote:

Your maxDocs is set to 1. This is the number of pending docs before autocommit is triggered too. You should set it to a higher value like 1, for example. Edward

Em 30/07/2015 11:43, Nitin Solanki nitinml...@gmail.com escreveu: [...]
Re: [ANN] New Features For Splainer
Glad you find it useful, Daniel! Yeah, it's all driven from the browser. Splainer doesn't have a backend; it's just a bunch of HTML and JavaScript hosted on S3, so no worries about your data being shared around. It seems another common trend is just running it locally; I correspond with quite a few folks that do that. If you know something about basic JavaScript build tools, it typically works fine that way as well. Let me know if you have any ideas/problems! Cheers, -Doug

On Wed, Jul 29, 2015 at 10:14 AM, Davis, Daniel (NIH/NLM) [C] daniel.da...@nih.gov wrote:

I usually protect https://whatever.nlm.nih.gov/solr deeply, requiring CAS authentication against NIH Login, but I also make sure handleSelect=false, and reverse proxy https://whatever.nlm.nih.gov/search/core-name to /solr/select. I'm surprised and gratified that http://splainer.io/ works in my environment.

-Original Message- From: Doug Turnbull [mailto:dturnb...@opensourceconnections.com] Sent: Friday, July 24, 2015 3:47 PM To: solr-user@lucene.apache.org Subject: [ANN] New Features For Splainer

First, I wanted to humbly thank the Solr community for their contributions and feedback on our open source Solr sandbox, Splainer (http://splainer.io and http://github.com/o19s/splainer). The reception and comments have been generally positive and helpful, and I very much appreciate being part of such a great open source community that wants to support each other. What is Splainer exactly? Why should you care? Nobody likes working with Solr in the browser's URL bar. Splainer lets you paste in your Solr URL and get an instant, easy-to-understand breakdown of why some documents are ranked higher than others. It then gives you a friendly interface to tweak Solr params and experiment with different ideas, friendlier than trying to parse through XML and JSON. You needn't worry about security rules, since there is no Splainer backend that needs to talk to your Solr.
The interaction with Solr is 100% through your browser. If your PC can see Solr, then so can Splainer running in your browser. If you leave work or turn off the VPN, then Splainer can't see your Solr. It's all running locally on your machine through the browser! I wanted to share that we've been slowly adding features to Splainer. The two I wanted to highlight are captured in this blog article: http://opensourceconnections.com/blog/2015/07/24/splainer-a-solr-developers-best-friend/

To summarize, they include:

- Explain Other: You often wonder why obviously relevant search results don't come back. Splainer now gives you the ability to compare any document to a secondary document, to see what factors caused one document to rank higher than another.

- Share Splainerized Solr Results: Once you paste a Solr URL into Splainer, you can copy the splainer.io URL to share what you're seeing with a colleague. For example, here's some information about Virginia state laws about hunting deer from a boat: http://splainer.io/#?solr=http:%2F%2Fsolr.quepid.com%2Fsolr%2Fstatedecoded%2Fselect%3Fq%3Ddeer%20hunt%20from%20watercraft%0A%26defType%3Dedismax%0A%26qf%3Dcatch_line%20text%0A%26bq%3Dtitle:deer

There are many more smaller features and tweaks, but I wanted to let you know this was out there. I hope you find Splainer useful. I'm very happy to field pull requests, ideas, suggestions, or to try to figure out why Splainer isn't working for you! Cheers!

-- *Doug Turnbull **| *Search Relevance Consultant | OpenSource Connections http://opensourceconnections.com, LLC | 240.476.9983 Author: Relevant Search http://manning.com/turnbull This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.
Re: Search for All CAPS words
Have you tried copyField with a different field type for the copied field yet? That would be my first step. Make the copied field indexed-only, not stored, for efficiency. You can then either search against that copied field directly or use eDisMax against both fields and give that field a higher priority. Regards, Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/

On 30 July 2015 at 10:00, rks_lucene ppro.i...@gmail.com wrote: [...]
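A minimal schema.xml sketch of Alex's suggestion; the field and type names here are illustrative, not from the thread:

```xml
<!-- case-preserving variant: same tokenizer, but no LowerCaseFilterFactory,
     so GATE and gate remain distinct terms in the index -->
<fieldType name="text_cs" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>

<field name="body" type="text_general" indexed="true" stored="true"/>
<!-- indexed-only copy, as Alex recommends -->
<field name="body_cs" type="text_cs" indexed="true" stored="false"/>
<copyField source="body" dest="body_cs"/>
```

A query like body_cs:GATE then matches only the all-caps form, while body:gate (lowercased by the main chain) matches both cases.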
StandardTokenizerFactory and WhitespaceTokenizerFactory
I am indexing text that contains part numbers in various formats with hyphens/dashes and a few other special characters. Here's the problem: if I use StandardTokenizerFactory, the hyphens etc. are stripped, so I cannot search by the part number 222-333-; I can only search for 222 or 333 or 444. If I use the WhitespaceTokenizerFactory instead, I can search part numbers, but I'm not able to search words if they have punctuation like a comma or period after them. Example: "wheel,". Should I use copy fields with different tokenizers and then, during the search, choose based on the search string? Any other options?
Re: Hard Commit not working
Your maxDocs is set to 1. This is the number of pending docs before autocommit is triggered too. You should set it to a higher value like 1, for example. Edward

Em 30/07/2015 11:43, Nitin Solanki nitinml...@gmail.com escreveu: [...]
RE: StandardTokenizerFactory and WhitespaceTokenizerFactory
Will using PatternReplaceCharFilterFactory to replace comma, period, etc. with a space or empty char work?

-Original Message- From: Tarala, Magesh Sent: Thursday, July 30, 2015 10:08 AM To: solr-user@lucene.apache.org Subject: StandardTokenizerFactory and WhitespaceTokenizerFactory [...]
Re: Problem with 60 cc and 60cc
The reason is almost certainly that the query parser splits on whitespace before the analysis chain gets the query; thus, each token travels separately through your chain. Try it with quotes around it to see if this is your issue.

Upayavira

On Thu, Jul 30, 2015, at 04:52 PM, Jack Schlederer wrote: [...]
RE: StandardTokenizerFactory and WhitespaceTokenizerFactory
I'm adding PatternReplaceCharFilterFactory to exclude those characters. Looks like this works.

-Original Message- From: Tarala, Magesh Sent: Thursday, July 30, 2015 10:37 AM To: solr-user@lucene.apache.org Subject: RE: StandardTokenizerFactory and WhitespaceTokenizerFactory [...]
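A sketch of the combination Magesh ends up with; the pattern, type name, and filter order are illustrative assumptions, not taken from his actual config:

```xml
<fieldType name="text_partnum" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- replace sentence punctuation with a space before tokenizing,
         so "wheel," indexes as the searchable token "wheel" -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="[,.;:]" replacement=" "/>
    <!-- whitespace tokenizer keeps hyphenated part numbers intact as one token -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

One caveat worth noting: a char filter this broad also strips periods inside tokens (decimals, version strings), so narrow the pattern if part numbers can contain those characters.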
Re: Hard Commit not working
Please be more specific as to why you think something is not working.

-- Jack Krupansky

On Thu, Jul 30, 2015 at 10:43 AM, Nitin Solanki nitinml...@gmail.com wrote: [...]
Re: Zookeeper state and its effect on Solr cluster.
Hi, Before starting, our indexer does an upload/reload of Solr configuration files using the ZK UPLOAD and RELOAD APIs. In this process ZooKeeper is not stopped/restarted; ZK is alive and so are the Solr nodes. Doing this often causes the following exception. Kindly note that the ZK instance is standalone, not an ensemble. This exception only happens at RELOAD.

{responseHeader:{status:500,QTime:180028},error:{msg:reload the collection time out:180s,trace:org.apache.solr.common.SolrException: reload the collection time out:180s\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:237)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:168)\n\tat org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)\n\tat org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:660)\n\tat org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:431)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:497)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)\n\tat org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)\n\tat java.lang.Thread.run(Thread.java:745)\n,code:500}}

Kindly help, as this is blocking our smooth process of indexing. Regards, Modassar

On Tue, Jul 28, 2015 at 11:40 AM, Shawn Heisey apa...@elyograg.org wrote:

On 7/27/2015 10:59 PM, Modassar Ather wrote: If we upgrade ZooKeeper we need to restart. This upgrade process is automated for future releases/changes of ZooKeeper. This is a single external ZooKeeper which is completely stopped/shut down. No Solr nodes are restarted/shut down. My understanding is that even if ZooKeeper shuts down, after a restart the Solr nodes should come back in sync with the ZK state. Please correct me if I am wrong.

Disclaimer: I do not have a ton of concrete experience with SolrCloud. I do have a cloud setup, but it is running Solr 4.2.1, which at this point is ancient. I haven't needed to do much to maintain it ... it takes care of itself. Recovering correctly from a complete ZooKeeper failure is what I would hope for, but it's a scenario that I've never tried. I hope there's a unit test for it, but I haven't checked. A fully redundant ZooKeeper ensemble requires a minimum of three hosts.
If you need to upgrade ZK, then you upgrade them one at a time, and the ensemble never loses quorum. http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A6 Thanks, Shawn
Suggester always highlights suggestions even if we pass highlight=false
I am still experiencing the https://issues.apache.org/jira/browse/SOLR-6648 issue with Solr 5.2.1: even if I send highlight=false, Solr returns highlighted suggestions. Any idea why this is happening? My configuration follows.

*URL:* http://solrhost:solrport/mycorename/suggest?suggest.dictionary=altSuggester&suggest.dictionary=mainSuggester&wt=json&suggest.q=treatm&suggest.count=20&highlight=false

*Response:*
{
  responseHeader: { status: 0, QTime: 6 },
  suggest: {
    mainSuggester: {
      treatm: {
        numFound: 20,
        suggestions: [
          { term: *Treatm*ent Refusal, weight: 0, payload: },
          { term: Withholding *Treatm*ent, weight: 0, payload: },
          { term: *Treatm*ent Refusal, weight: 0, payload: },
          { term: Withholding *Treatm*ent, weight: 0, payload: }
        ]
      }
    },
    altSuggester: {
      treatm: {
        numFound: 2,
        suggestions: [
          { term: *treatm*ent, weight: 197, payload: },
          { term: *treatm*ents, weight: 5, payload: }
        ]
      }
    }
  }
}

*My configuration:*
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mainSuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">keyphrases</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <str name="indexPath">main-suggest</str>
    <str name="buildOnStartup">true</str>
  </lst>
  <lst name="suggester">
    <str name="name">altSuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">HighFrequencyDictionaryFactory</str>
    <str name="field">text</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <str name="indexPath">alt-suggest</str>
    <str name="allTermsRequired">false</str>
    <str name="buildOnStartup">true</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mainSuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

- Nutch Solr User
The ultimate search engine would basically understand everything in the world, and it would always give you the right thing.
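One configuration-side workaround worth checking (an assumption on my part, not confirmed in this thread): AnalyzingInfixLookupFactory accepts a highlight flag in the suggester definition itself in some Solr releases, which turns highlighting off at the source rather than per request:

```xml
<lst name="suggester">
  <str name="name">mainSuggester</str>
  <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
  <str name="dictionaryImpl">DocumentDictionaryFactory</str>
  <str name="field">keyphrases</str>
  <str name="suggestAnalyzerFieldType">text_general</str>
  <str name="indexPath">main-suggest</str>
  <str name="buildOnStartup">true</str>
  <!-- assumption: honored by AnalyzingInfixLookupFactory in your release -->
  <str name="highlight">false</str>
</lst>
```

Verify against the reference guide for your exact version, since SOLR-6648 reports this not working in some 5.x releases.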
How to handle line breaks for quoted queries
How can I recognize line breaks and disallow a quoted query from matching across them, as in the following example? I have two documents with just one text field:

1. AAA BBB <line break> CCC DDD
2. BBB CCC <line break> DDD AAA

The user enters the query "BBB CCC". How can I configure tokenizers so that Solr only returns doc #2? Thanks, Mohsen
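No answer appears in this thread, but one common approach (an assumption on my part, with illustrative names): split the text on line breaks before indexing and send each line as a separate value of a multiValued field. The positionIncrementGap on the field type then inserts a large position gap between values, so an ordinary phrase query cannot match across a line boundary:

```xml
<!-- each line of the source text becomes one value of this multiValued field;
     the 100-position gap between values blocks cross-line phrase matches -->
<fieldType name="text_lines" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="body" type="text_lines" indexed="true" stored="true" multiValued="true"/>
```

With doc #1 indexed as the values ["AAA BBB", "CCC DDD"], the phrase query body:"BBB CCC" no longer matches it, because BBB and CCC end up 100 positions apart; doc #2 indexed as ["BBB CCC", "DDD AAA"] still matches.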
Re: Hard Commit not working
Most probably because your solrconfig.xml is setting maxDocs to 1: <maxDocs>1</maxDocs>. Then Solr will autoCommit EITHER after 1 document or after maxTime has passed. Change your maxDocs value in solrconfig.xml to 1, don't forget to RELOAD the core, then test it again.

On Thu, Jul 30, 2015 at 12:13 PM, Nitin Solanki nitinml...@gmail.com wrote: [...]
Re: Personalized Search Results or Matching Documents to Users
On 7/30/2015 10:46 AM, Robert Farrior wrote: We have a requirement to be able to have a master product catalog and to create a sub-catalog of products per user. This means I may have 10,000 users who each create their own list of documents. This is a simple mapping of user to documents. The full data about the documents would be in the main catalog. What approaches would allow Solr to only return the results that are in the user's list? In other words, the main catalog has 3 documents: A, B and C. I have 2 users. User 1 has access to documents A and C but not B. User 2 has access to documents C and B but not A. When a user searches, I want to only return documents that the user has access to.

A common approach for Solr would be to have a multivalued user field on each document, with individual values for each user that can access the document. When you index the document, you include values in this field listing all the users that can access that document. Then you simply filter by user: fq=user:joe

This is EXTREMELY efficient at query time, especially when the number of users is much smaller than the number of documents. It may complicate indexing somewhat, but indexing is an extremely custom operation that users have to write themselves, so it probably won't be horrible. Thanks, Shawn
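A sketch of the schema side of Shawn's approach (the field name is illustrative):

```xml
<!-- one value per user permitted to see the document;
     indexed-only, since it is used purely for filtering -->
<field name="user" type="string" indexed="true" stored="false" multiValued="true"/>
```

At query time the application appends fq=user:joe, as Shawn shows; because fq results are kept in Solr's filter cache, a repeated per-user filter stays cheap across queries.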
Problem with 60 cc and 60cc
Hi, I'm in the process of revising a schema for the search function of an eCommerce platform. One of the sticking points is a particular use case of searching for "xx yy", where xx is any number and yy is an abbreviation for a unit of measurement (mm, cc, ml, in, etc.). The problem is that searching for "xx yy" and "xxyy" returns different results. One possible solution I tried was applying a few PatternReplaceCharFilterFactories to remove the whitespace between xx and yy if there was any (at both index and query time). These are the first few lines in the analyzer:

<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(?i)(\d+)\s?(pounds?|lbs?)" replacement="$1lb"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(?i)(\d+)\s?(inch[es]?|in?)" replacement="$1in"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(?i)(\d+)\s?(ounc[es]?|oz)" replacement="$1oz"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(?i)(\d+)\s?(quarts?|qts?)" replacement="$1qt"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(?i)(\d+)\s?(gallons?|gal?)" replacement="$1gal"/>
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(?i)(\d+)\s?(mm|cc|ml)" replacement="$1$2"/>

A few more lines down, I use a PatternCaptureGroupFilterFactory to emit the tokens xxyy, xx, and yy:

<filter class="solr.PatternCaptureGroupFilterFactory" pattern="(\d+)(lb|oz|in|qt|gal|mm|cc|ml)" preserve_original="true"/>

In Solr admin's analysis tool for the field type this applies to, both "xx yy" and "xxyy" are tokenized and filtered down identically (at both index and query time). The platform I'm working on searches many different fields by default, but even when I rig up the query to only search in this one field, I still get different results for "xxyy" and "xx yy". I'm wondering why this is. Attached is a screenshot from Solr analysis. Thanks, John
RE: Solr spell check mutliwords
Talha, In your configuration, you have this set:

<str name="spellcheck.maxResultsForSuggest">5</str>

...which means it will consider the query correctly spelled and offer no suggestions if there are 5 or more results. You could omit this parameter and it will always suggest when possible. Possibly a better option would be to add spellcheck.collateParam.mm=100% or spellcheck.collateParam.q.op=AND, so when testing collations against the index, it will require all the terms to match something. See https://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collateParam.XX for more information.

James Dyer Ingram Content Group

-Original Message- From: talha [mailto:talh...@gmail.com] Sent: Wednesday, July 22, 2015 9:34 AM To: solr-user@lucene.apache.org Subject: Solr spell check mutliwords

I could not figure out why my configured Solr spell checker is not giving the desired output. In my indexed data, the query symphony+mobile has around 3.5K+ docs and the spell checker detects it as correctly spelled. When I misspell symphony in the query symphony+mobile, it shows only results for mobile and the spell checker detects this query as correctly spelled. I have searched this query in different combinations. Please find the search result stats:

Query: symphony ResultFound: 1190 SpellChecker: correctly spelled
Query: mobile ResultFound: 2850 SpellChecker: correctly spelled
Query: simphony ResultFound: 0 SpellChecker: symphony Collation Hits: 1190
Query: symphony+mobile ResultFound: 3585 SpellChecker: correctly spelled
Query: simphony+mobile ResultFound: 2850 SpellChecker: correctly spelled
Query: symphony+mbile ResultFound: 1190 SpellChecker: correctly spelled

In the last two queries it should suggest something for the misspelled words simphony and mbile. Please find my configuration below.
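To make the suggested change concrete, a sketch of what the amended request parameters might look like (the query string and the choice of mm over q.op are placeholders for illustration):

```python
from urllib.parse import urlencode

# Sketch of a request that overrides mm only while the spellchecker
# tests collations, per the suggestion above.
params = urlencode({
    "q": "simphony mobile",
    "spellcheck": "true",
    "spellcheck.collate": "true",
    "spellcheck.collateParam.mm": "100%",
})
print(params)
```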
Only the spell check configuration is given.

solrconfig.xml:

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">product_name</str>
    <str name="spellcheck">on</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.alternativeTermCount">2</str>
    <str name="spellcheck.maxResultsForSuggest">5</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollationTries">5</str>
    <str name="spellcheck.maxCollations">3</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_suggest</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">suggest</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
  </lst>
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="field">suggest</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">10</int>
    <int name="minBreakLength">5</int>
  </lst>
</searchComponent>

schema.xml:

<fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
  </analyzer>
</fieldType>

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-spell-check-mutliwords-tp4218580.html Sent from the Solr - User mailing list archive at Nabble.com.
Peronalized Search Results or Matching Documents to Users
Hi, We have a requirement to be able to have a master product catalog and to create a sub-catalog of products per user. This means I may have 10,000 users who each create their own list of documents. This is a simple mapping of user to documents. The full data about the documents would be in the main catalog. What approaches would allow Solr to only return the results that are in the user's list? It seems like I would need a couple of steps in the process.

In other words, the main catalog has 3 documents: A, B and C. I have 2 users. User 1 has access to documents A and C but not B. User 2 has access to documents C and B but not A. When a user searches, I want to only return documents that the user has access to.

One approach would seem to be to have a DB table for the user's catalog list. Then during indexing, use that table to index each product against all applicable users. Then, during search, restrict the results to products that match the current user. No idea HOW to do that. Another approach would seem to be to do nothing at indexing time and instead provide some type of filter on the results that limits the results for the specific user. No idea how to do that either.

The goal is to have the ability to have personalized product catalogs. The big problem is that we have very large catalogs with a high number of users: something like 500,000 products and 10,000 customers. The solution needs to perform well for both indexing and search. My client is considering dropping Solr and investing in Endeca if Solr cannot handle this need efficiently. Any suggestions or help would be greatly appreciated. bob

-- View this message in context: http://lucene.472066.n3.nabble.com/Peronalized-Search-Results-or-Matching-Documents-to-Users-tp4219951.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Search for All CAPS words
Thanks, and I did think of the copy field option. So what you are suggesting is that I have a copyField whose indexing/query chains do not include the lowercase filter factory. I am afraid that would not help if my search query is complex with many words (say a boolean with proximity operators), because the full search string would have to go into the copyField (which is not lowercased). The rest of the words other than GATE wouldn't match properly then. Ritesh

-- View this message in context: http://lucene.472066.n3.nabble.com/Search-for-All-CAPS-words-tp4219893p4219959.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Query taking 50 sec
On 7/30/2015 3:53 AM, Manohar Sripada wrote: We have Solr Cloud (version 4.7.2) setup on 64 shards spread across VMs. I see my queries to Solr taking exactly 50 sec intermittently (as someone said so :P). This happens once in 10 queries. I have enabled log level to TRACE on all the solr nodes. I didn't find any issue with the query time on any given shard (max QTime observed on a shard is 10 ms). We ran all the tests related to network and everything looks fine there. Whenever the query took 50 sec, I am seeing the below log statements for the org.eclipse.jetty component. Is this some issue with Jetty? I could see these logs being printed every 11 seconds (2015-07-24 07:06:00, 2015-07-24 07:06:11, ...) for 4 times. Attached the complete logs during that duration. Can someone please help me here?

snip

INFO - 2015-07-24 07:06:00.128; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/logging params={_=1437736005493&since=1437734905469&wt=json} status=0 QTime=0

Those logs appear to be caused by someone watching the Logging tab in the admin UI. This admin UI page refreshes every ten seconds. No queries are happening during the log you included, only the requests for logging info. These requests are normally very fast, and in your log, they show a QTime of zero milliseconds.

64 shards is quite a bit, and as soon as someone talks about a very large install on virtual machines that is having performance problems, I suspect that they probably do not have enough resources (memory in particular) for what they are asking the system to do. Now it's time for some light reading: http://wiki.apache.org/solr/SolrPerformanceProblems

Next there are questions. This first bunch of questions is about the virtual machines themselves, not the host hardware for the virtual machines. Are you using the jetty (start.jar) included with Solr, or have you installed Solr into a different jetty?
On the dashboard of the admin UI, in the JVM section, there is an Args parameter, which may have multiple lines. What all is there?

If you add up all the shard replicas on a single virtual machine, how many docs are there and how much disk space is used by the index data? Include all replicas in those numbers, even if they duplicate data that's on another virtual machine.

How much memory does the virtual machine have, and how much of that memory is allocated to the java heap? Are all of the virtual machines similar as far as memory config and how much Solr data they contain?

If you are using a virtual machine platform that you host yourself, then I need to know how many of these virtual machines are loaded onto each physical machine, and how much memory that physical machine has. If you're using AWS, then this question is irrelevant. The allocation of CPU resources might be important, but it's not as important as memory.

Thanks, Shawn
Re: Search for All CAPS words
So, what you want is to duplicate a specific token, rename one of the copies, and inject it at the same offset as the original. So GATE => gate, _gate but gate => gate. That, to me, is a custom token filter. You can probably use KeywordRepeatFilterFactory as a base: http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/miscellaneous/KeywordRepeatFilterFactory.html (you can click through to the Filter and then the source from there). Regards, Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/

On 30 July 2015 at 13:53, rks_lucene ppro.i...@gmail.com wrote: Thanks and I did think of the copy field option. So what you are suggesting is that I have a copyfield in which I do not keep the lowercase factory analyzer in my indexing/query chains. I am afraid that would not help if my search query is complex with many words (say a boolean with proximity operators) because the full search string would have to go into the copyfield (not having the lowercase). The rest of the words other than GATE wouldn't match properly then. Ritesh -- View this message in context: http://lucene.472066.n3.nabble.com/Search-for-All-CAPS-words-tp4219893p4219959.html Sent from the Solr - User mailing list archive at Nabble.com.
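Not Lucene code, but a toy whitespace-tokenized simulation of the token stream such a filter would produce: an extra marked token at the same position whenever the original token was all caps. The "_" marker is just an assumed convention for this sketch:

```python
# Toy model of the proposed filter: all-caps tokens are emitted twice,
# once lowercased and once with a marker, at the same position
# (analogous to a 0 position increment on the duplicate in Lucene).
def caps_aware_analyze(text):
    tokens = []
    for position, tok in enumerate(text.split()):
        tokens.append((position, tok.lower()))
        if tok.isupper() and len(tok) > 1:
            tokens.append((position, "_" + tok.lower()))
    return tokens

print(caps_aware_analyze("the GATE is open"))
```

With this scheme, a case-sensitive search for GATE would query the `_gate` term, while an ordinary search for gate still matches both documents, and phrase/proximity queries keep working because positions are unchanged.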
Re: Question about Stemmer
Hi Ashish, are we talking about analysis at query time, index time, or both? As Erick says, I find it really hard to believe this combination helps in a classic search. Are you trying to provide something special? An Ngram token filter will produce a set of ngrams out of your token: "token" => to, ok, ke, en in the case of bigrams. I find this useless as input for a stemmer. Inverting the two token filters will probably make more sense. But can we know which kind of search you want to provide on top of this analysis? Analysis must always go hand in hand with the search you expect! Cheers

2015-07-29 10:49 GMT+01:00 Ashish Mukherjee ashish.mukher...@gmail.com: Hello, I am using a Stemmer on an Ngram field. I am getting better results with the Stemmer factory after the Ngram filter, but I was wondering what is the recommended practice when using a Stemmer on an Ngram field? Regards, Ashish

-- Benedetti Alessandro Visiting card - http://about.me/alessandro_benedetti Blog - http://alexbenedetti.blogspot.co.uk Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
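To make the bigram example above concrete (plain Python, not the Solr filter itself):

```python
# Character bigrams of "token", as an ngram filter with n=2 would emit
def ngrams(token, n=2):
    return [token[i:i + n] for i in range(len(token) - n + 1)]

print(ngrams("token"))  # ['to', 'ok', 'ke', 'en']
```

None of these fragments is a word a stemmer could do anything useful with, which is the point being made: stem first, then (if you really need it) ngram the stems.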
Re: Suggester always highlights suggestions even if we pass highlight=false
Hi Nutch, are you sure you are using the proper parameters? I cannot see the highlight param in the suggester configuration! From the issue you linked, it seems it is necessary to disable highlighting (default=true). I see it as a query param for the /suggest search handler. Am I wrong, or did you misunderstand the configuration? Cheers

2015-07-30 8:50 GMT+01:00 Nutch Solr User nutchsolru...@gmail.com: I am still experiencing the https://issues.apache.org/jira/browse/SOLR-6648 issue with Solr 5.2.1. Even if I send highlight=false, Solr returns highlighted suggestions. Any idea why this is happening?

URL:
http://solrhost:solrport/mycorename/suggest?suggest.dictionary=altSuggester&suggest.dictionary=mainSuggester&wt=json&suggest.q=treatm&suggest.count=20&highlight=false

Response:

{
  "responseHeader": { "status": 0, "QTime": 6 },
  "suggest": {
    "mainSuggester": {
      "treatm": {
        "numFound": 20,
        "suggestions": [
          { "term": "*Treatm*ent Refusal", "weight": 0, "payload": "" },
          { "term": "Withholding *Treatm*ent", "weight": 0, "payload": "" },
          { "term": "*Treatm*ent Refusal", "weight": 0, "payload": "" },
          { "term": "Withholding *Treatm*ent", "weight": 0, "payload": "" }
        ]
      }
    },
    "altSuggester": {
      "treatm": {
        "numFound": 2,
        "suggestions": [
          { "term": "*treatm*ent", "weight": 197, "payload": "" },
          { "term": "*treatm*ents", "weight": 5, "payload": "" }
        ]
      }
    }
  }
}

My configuration:

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mainSuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">keyphrases</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <str name="indexPath">main-suggest</str>
    <str name="buildOnStartup">true</str>
  </lst>
  <lst name="suggester">
    <str name="name">altSuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">HighFrequencyDictionaryFactory</str>
    <str name="field">text</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <str name="indexPath">alt-suggest</str>
    <str name="allTermsRequired">false</str>
    <str name="buildOnStartup">true</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mainSuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

- Nutch Solr User The ultimate search engine would basically understand everything in the world, and it would always give you the right thing.

-- View this message in context: http://lucene.472066.n3.nabble.com/Suggester-always-highlights-suggestions-even-if-we-pass-highlight-false-tp4219846.html Sent from the Solr - User mailing list archive at Nabble.com.

-- Benedetti Alessandro Visiting card - http://about.me/alessandro_benedetti Blog - http://alexbenedetti.blogspot.co.uk Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
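If memory serves, AnalyzingInfixLookupFactory can also read a highlight flag from the suggester definition itself, so (this is an assumption about the supported parameter, worth verifying against the Reference Guide for 5.2.1) the change would look something like:

```xml
<!-- Assumption: AnalyzingInfixLookupFactory honors a "highlight"
     setting in the suggester definition; it defaults to true -->
<lst name="suggester">
  <str name="name">mainSuggester</str>
  <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
  <str name="dictionaryImpl">DocumentDictionaryFactory</str>
  <str name="field">keyphrases</str>
  <str name="suggestAnalyzerFieldType">text_general</str>
  <str name="indexPath">main-suggest</str>
  <str name="buildOnStartup">true</str>
  <str name="highlight">false</str>
</lst>
```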
Solr Query taking 50 sec
Hi, We have Solr Cloud (version 4.7.2) setup on 64 shards spread across VMs. I see my queries to Solr taking exactly 50 sec intermittently (as someone said so :P). This happens once in 10 queries. I have enabled log level to TRACE on all the solr nodes. I didn't find any issue with the query time on any given shard (max QTime observed on a shard is 10 ms). We ran all the tests related to network and everything looks fine there. Whenever the query took 50 sec, I am seeing the below log statements for the org.eclipse.jetty component. Is this some issue with Jetty? I could see these logs being printed every 11 seconds (2015-07-24 07:06:00, 2015-07-24 07:06:11, ...) for 4 times. Attached the complete logs during that duration. Can someone please help me here?

DEBUG - 2015-07-24 07:06:00.126; org.eclipse.jetty.http.HttpParser; filled 707/707
DEBUG - 2015-07-24 07:06:00.127; org.eclipse.jetty.server.Server; REQUEST /solr/admin/info/logging on BlockingHttpConnection@7a5f39b0,g=HttpGenerator{s=0,h=-1,b=-1,c=-1},p=HttpParser{s=-5,l=209,c=0},r=43
DEBUG - 2015-07-24 07:06:00.127; org.eclipse.jetty.server.handler.ContextHandler; scope null||/solr/admin/info/logging @ o.e.j.w.WebAppContext{/solr,file:/u01/work/app/install_solr/daas_node/solr-webapp/webapp/},/u01/work/app/install_solr/daas_node/webapps/solr.war
DEBUG - 2015-07-24 07:06:00.127; org.eclipse.jetty.server.handler.ContextHandler; context=/solr||/admin/info/logging @ o.e.j.w.WebAppContext{/solr,file:/u01/work/app/install_solr/daas_node/solr-webapp/webapp/},/u01/work/app/install_solr/daas_node/webapps/solr.war
DEBUG - 2015-07-24 07:06:00.127; org.eclipse.jetty.server.session.SessionHandler; Got Session ID vZScVxfQ528bXYGHJw16N3vTLJ4t3L41bSkHNmyTywQKGGzZFC8p!-348395136!NONE from cookie
DEBUG - 2015-07-24 07:06:00.127; org.eclipse.jetty.server.session.SessionHandler; sessionManager=org.eclipse.jetty.server.session.HashSessionManager@1c49094
DEBUG - 2015-07-24 07:06:00.127; org.eclipse.jetty.server.session.SessionHandler; session=null
DEBUG - 2015-07-24 07:06:00.128; org.eclipse.jetty.servlet.ServletHandler; servlet /solr|/admin/info/logging|null - default
DEBUG - 2015-07-24 07:06:00.128; org.eclipse.jetty.servlet.ServletHandler; chain=SolrRequestFilter-default
DEBUG - 2015-07-24 07:06:00.128; org.eclipse.jetty.servlet.ServletHandler$CachedChain; call filter SolrRequestFilter
INFO - 2015-07-24 07:06:00.128; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/logging params={_=1437736005493&since=1437734905469&wt=json} status=0 QTime=0
DEBUG - 2015-07-24 07:06:00.128; org.apache.solr.servlet.SolrDispatchFilter; Closing out SolrRequest: {_=1437736005493&since=1437734905469&wt=json}
DEBUG - 2015-07-24 07:06:00.129; org.eclipse.jetty.server.Server; RESPONSE /solr/admin/info/logging 200 handled=true
DEBUG - 2015-07-24 07:06:06.327; org.apache.zookeeper.ClientCnxn$SendThread; Got ping response for sessionid: 0x14eaf8f79530460 after 0ms
DEBUG - 2015-07-24 07:06:11.118; org.eclipse.jetty.http.HttpParser; filled 707/707
DEBUG - 2015-07-24 07:06:11.119; org.eclipse.jetty.server.Server; REQUEST /solr/admin/info/logging on BlockingHttpConnection@7a5f39b0,g=HttpGenerator{s=0,h=-1,b=-1,c=-1},p=HttpParser{s=-5,l=209,c=0},r=44
DEBUG - 2015-07-24 07:06:11.119; org.eclipse.jetty.server.handler.ContextHandler; scope null||/solr/admin/info/logging @ o.e.j.w.WebAppContext{/solr,file:/u01/work/app/install_solr/daas_node/solr-webapp/webapp/},/u01/work/app/install_solr/daas_node/webapps/solr.war
DEBUG - 2015-07-24 07:06:11.119; org.eclipse.jetty.server.handler.ContextHandler; context=/solr||/admin/info/logging @ o.e.j.w.WebAppContext{/solr,file:/u01/work/app/install_solr/daas_node/solr-webapp/webapp/},/u01/work/app/install_solr/daas_node/webapps/solr.war
DEBUG - 2015-07-24 07:06:11.119; org.eclipse.jetty.server.session.SessionHandler; Got Session ID vZScVxfQ528bXYGHJw16N3vTLJ4t3L41bSkHNmyTywQKGGzZFC8p!-348395136!NONE from cookie
DEBUG - 2015-07-24 07:06:11.119; org.eclipse.jetty.server.session.SessionHandler; sessionManager=org.eclipse.jetty.server.session.HashSessionManager@1c49094
DEBUG - 2015-07-24 07:06:11.119; org.eclipse.jetty.server.session.SessionHandler; session=null
DEBUG - 2015-07-24 07:06:11.120; org.eclipse.jetty.servlet.ServletHandler; servlet /solr|/admin/info/logging|null - default
DEBUG - 2015-07-24 07:06:11.120; org.eclipse.jetty.servlet.ServletHandler; chain=SolrRequestFilter-default
DEBUG - 2015-07-24 07:06:11.120; org.eclipse.jetty.servlet.ServletHandler$CachedChain; call filter SolrRequestFilter
INFO - 2015-07-24 07:06:11.120; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/logging params={_=1437736016484&since=1437734905469&wt=json} status=0 QTime=0
DEBUG - 2015-07-24 07:06:11.120; org.apache.solr.servlet.SolrDispatchFilter; Closing out SolrRequest: {_=1437736016484&since=1437734905469&wt=json}
DEBUG - 2015-07-24 07:06:11.121;
Re: How to handle line breaks for quoted queries
Hi Mohsen, this is the perfect place for the positionIncrementGap attribute on your field type:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">

First of all, when phrase or positional searches are necessary you need to store term positions in your index. The position increment gap will increment the position whenever a new value of a multivalued field begins. At this point you have different solutions: 1) you provide multiple values, one per line break, if this fits your use case; 2) more likely, you take a look at a tokenizer that does the position increment on its own when a line break is found. If you use the analysis tool now, what do you get for your tokens and positions? How is the line break indexed? You can review positions with the analysis tool! Cheers

2015-07-30 8:40 GMT+01:00 Mohsen Saboorian mohs...@gmail.com: How can I recognize line breaks and not allow matching of a quoted query in the following example? I have two documents with just one text field: 1. AAA BBB line break CCC DDD 2. BBB CCC line break DDD AAA. The user enters the query "BBB CCC". How can I configure tokenizers so that Solr only returns doc #2? Thanks, Mohsen

-- Benedetti Alessandro Visiting card - http://about.me/alessandro_benedetti Blog - http://alexbenedetti.blogspot.co.uk Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
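A toy model (plain Python, not Solr internals) of why the gap keeps phrase matches from crossing value boundaries, assuming each line of the text is indexed as a separate value of the multivalued field:

```python
# Toy positions model: each value of a multivalued field starts `gap`
# positions after the last token of the previous value, so a phrase
# query (which needs adjacent positions) cannot span two values.
def term_positions(values, gap=100):
    positions, pos = {}, 0
    for value in values:
        for tok in value.lower().split():
            positions.setdefault(tok, []).append(pos)
            pos += 1
        pos += gap - 1  # jump the gap before the next value starts
    return positions

doc1 = term_positions(["AAA BBB", "CCC DDD"])  # "bbb".."ccc" 100 apart
doc2 = term_positions(["BBB CCC", "DDD AAA"])  # "bbb".."ccc" adjacent
print(doc1["bbb"], doc1["ccc"], doc2["bbb"], doc2["ccc"])
```

In doc2 the phrase "BBB CCC" finds the two terms at consecutive positions, while in doc1 the gap of 100 separates them, so the phrase query matches only doc2.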