Re: Basic auth

2015-07-30 Thread Noble Paul
bq. Although I'm not sure why you took this approach instead of
supporting simple built-in basic auth and letting us configure security
the old/easy way

Going with Jetty basic auth is not useful in a large enough cluster.
Where do you store the credentials, and how would you propagate them
across the cluster? When you use Solr, you need a Solr-like way of
managing that. The other problem is inter-node communication. How do
you pass credentials along in that case?

bq. I'm guessing it has to do with the future requirement of field/doc level security

Actually, that is an orthogonal requirement.

bq. I hope you can get rid of the war file soon and start promoting Solr
as a set of libraries so one can easily embed/extend Solr

That is not what we have in mind. We want Solr to be a server which
controls every aspect of its running. We should have the choice of
getting rid of Jetty or whatsoever and moving to a new system. We only
guarantee the interface/protocol to remain constant.



On Tue, Jul 28, 2015 at 2:19 AM, Fadi Mohsen fadi.moh...@gmail.com wrote:
 Thank you, I tested providing my implementation of authentication in 
 security.json, uploaded file to ZK (just considering authentication), started 
 nodes and it worked like a charm.

 That required of course turning off Jetty basic auth.

 Although I'm not sure why you took this approach instead of supporting  
 simple built-in basic auth and let us configure security the old/easy way.

 I'm guessing it has to do with future requirement of field/doc level security.

 I hope you can get rid of the war file soon and start promoting Solr as a set 
 of libraries so one can easily embed/extend Solr, since some (especially me) 
 might consider command-line ZK operations not that continuous 
 delivery/automate everything/production friendly.

 It's easy today to spin up a Jetty and wire / point out resource classes or 
 wire up CXF alongside to get things playing, but I'm probably missing out on 
 other things, since I see many mails usually in consensus about not embedding, 
 rather wanting people to consider Solr as a stand-alone service; not sure why!
 I'm probably getting out of context here.

 Regards

 On 27 Jul 2015, at 13:17, Noble Paul noble.p...@gmail.com wrote:

 Q. Do you know when it would be released?
 5.3 will be released in another 3-4 weeks.

 Q. Are there any requirements that ZK authentication must be there as well?
 NO

 bq. Providing my own security.json + class/implementation to verify
 user/pass should work today with 5.2, right?

 Yes. But if you modify your credentials or anything in that JSON, you
 will have to restart all your nodes.

 Q.SOLR-7274 pluggable security is already in 5.2 (my requirement is to
 provide user/pass in a secure manner, not as argument on cmd or from
 (our unsecured) ZK but from a configuration restful service,

 I'm not clear what your question is. Basic Auth is a well-known
 standard; we are just implementing that standard. We store all
 credentials & permissions in ZK. That means it is only as secure as
 your ZK. As long as nobody can write to ZK, your system is safe.
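 For reference, a minimal security.json of the kind SOLR-7692 stores in ZK
 could look like the sketch below. This is illustrative only: the plugin
 class names follow the Basic Auth work, and the credential value is a
 placeholder for a salted hash, not a real account.

```json
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "credentials": {
      "solr": "<base64-sha256-hash> <base64-salt>"
    }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "user-role": { "solr": "admin" },
    "permissions": [ { "name": "security-edit", "role": "admin" } ]
  }
}
```

 Uploading a changed version of this file to ZK is what, per the note above,
 still requires restarting the nodes on 5.2.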

 On Wed, Jul 22, 2015 at 11:10 PM, Fadi Mohsen fadi.moh...@gmail.com wrote:
 Hi, I have some questions regarding basic auth and proper support in 5.3:

 do you know when it would be released?

 Are there any requirements of ZK authentication must be there as well?

 Do we store the user/pass in ZK?

 SOLR-7274 pluggable security is already in 5.2 (my requirement is to 
 provide the user/pass in a secure manner, not as an argument on the cmd line 
 or from (our unsecured) ZK, but from a configuration RESTful service).
 I'm not sure the 5.3 release would fit the above requirement, can you reflect on 
 this?

 Providing my own security.json + class/implementation to verify user/pass 
 should work today with 5.2, right?

 Thanks
 Fadi

 On 22 Jul 2015, at 14:33, Noble Paul noble.p...@gmail.com wrote:

 Solr 5.3 is coming with proper basic auth support


 https://issues.apache.org/jira/browse/SOLR-7692

 On Wed, Jul 22, 2015 at 5:28 PM, Peter Sturge peter.stu...@gmail.com 
 wrote:
 If you're using Jetty you can use the standard realms mechanism for Basic
 Auth, and it works the same on Windows or UNIX. There are plenty of docs on
 the Jetty site about getting this working, although it does vary somewhat
 depending on the version of Jetty you're running (N.B. I would suggest
 using Jetty 9, and not 8, as 8 is missing some key authentication 
 classes).
 If, when you execute a search query to your Solr instance, you get a
 username and password popup, then Jetty's auth is set up. If you don't, then
 something's wrong in the Jetty config.

 It's worth noting that if you're doing distributed searches, Basic Auth on
 its own will not work for you. This is because Solr sends distributed
 requests to remote instances on behalf of the user, and it has no 
 knowledge
 of the web container's auth mechanics. We got 'round this by customizing
 Solr to receive credentials and use them for authentication to remote
 

Search for All CAPS words

2015-07-30 Thread rks_lucene
Hi,

I need the capability to search for /GATE/ separately from /gate/.

I cannot remove the lowercase filter factory in both my search and analysis
chains since that will break many other search scenarios.

Is there a way to payload/mark an ALL CAPS word in the index analyzer chain
before it gets lowercased (by the LowerCaseFilterFactory) so that I can
search it with some custom grammar and logic in my query parser?

Say I want:

Field:_gate to match /GATE/ only

Field:gate to match both /GATE/ and /gate/

Any pointers would be helpful.

thanks
Ritesh



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Search-for-All-CAPS-words-tp4219893.html
Sent from the Solr - User mailing list archive at Nabble.com.


Hard Commit not working

2015-07-30 Thread Nitin Solanki
Hi,
   I am trying to index documents using SolrCloud. After setting
maxTime to 6 ms in hard commit, documents are visible instantly while
adding them, not committing after 6 ms.
I have added the Solr log below. Please check it; I am not getting exactly
what is happening.

*CURL to commit documents:*

curl http://localhost:8983/solr/test/update/json -H
'Content-type:application/json' -d 'json-here'

*Solrconfig.xml:*
<autoCommit>
   <maxDocs>1</maxDocs>
   <maxTime>6</maxTime>
   <openSearcher>false</openSearcher>
 </autoCommit>
<!-- <autoSoftCommit> -->
 <!--  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime> -->
 <!-- </autoSoftCommit> -->


*Solr Log: *
INFO  - 2015-07-30 14:14:12.636; [test shard6 core_node2
test_shard6_replica1] org.apache.solr.update.processor.LogUpdateProcessor;
[test_shard6_replica1] webapp=/solr path=/update
params={update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=
http://100.77.202.145:8983/solr/test_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false}
{commit=} 0 26


Re: Hard Commit not working

2015-07-30 Thread Nitin Solanki
Hi Edward,
  I am only sending 1 document for indexing, so why is it
committing instantly? I gave maxTime as 6.

On Thu, Jul 30, 2015 at 8:26 PM Edward Ribeiro edward.ribe...@gmail.com
wrote:

 Your maxDocs is set to 1. This is the number of pending docs before
 autocommit is triggered too. You should set it to a higher value.

 Edward
 Em 30/07/2015 11:43, Nitin Solanki nitinml...@gmail.com escreveu:

  Hi,
 I am trying to index documents using solr cloud. After setting,
  maxTime to 6 ms in hard commit. Documents are visible instantly
 while
  adding them. Not commiting after 6 ms.
  I have added Solr log. Please check it. I am not getting exactly what is
  happening.
 
  *CURL to commit documents:*
 
  curl http://localhost:8983/solr/test/update/json -H
  'Content-type:application/json' -d 'json-here'
 
  *Solrconfig.xml:*
  <autoCommit>
     <maxDocs>1</maxDocs>
     <maxTime>6</maxTime>
     <openSearcher>false</openSearcher>
   </autoCommit>
  <!-- <autoSoftCommit> -->
   <!--  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime> -->
   <!-- </autoSoftCommit> -->
 
 
  *Solr Log: *
  INFO  - 2015-07-30 14:14:12.636; [test shard6 core_node2
  test_shard6_replica1]
 org.apache.solr.update.processor.LogUpdateProcessor;
  [test_shard6_replica1] webapp=/solr path=/update
 
 
  params={update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=
  http://100.77.202.145:8983/solr/test_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false}
  {commit=} 0 26
 



Re: [ANN] New Features For Splainer

2015-07-30 Thread Doug Turnbull
Glad you find it useful Daniel!

Yeah, it's all driven from the browser. Splainer doesn't have a backend; it's
just a bunch of HTML and JavaScript hosted on S3. So no worries about your
data being shared around.

It seems another common trend is just running it locally. I correspond with
quite a few folks that do just that. If you know something about basic
JavaScript build tools, it typically works fine that way as well.

Let me know if you have any ideas/problems!

Cheers,
-Doug

On Wed, Jul 29, 2015 at 10:14 AM, Davis, Daniel (NIH/NLM) [C] 
daniel.da...@nih.gov wrote:

 I usually protect https://whatever.nlm.nih.gov/solr deeply, requiring CAS
 authentication against NIH Login, but I also make sure handleSelect=false,
 and reverse proxy https://whatever.nlm.nih.gov/search/core-name to
 /solr/select.

 I'm surprised and gratified that http://splainer.io/ works in my
 environment.

 -Original Message-
 From: Doug Turnbull [mailto:dturnb...@opensourceconnections.com]
 Sent: Friday, July 24, 2015 3:47 PM
 To: solr-user@lucene.apache.org
 Subject: [ANN] New Features For Splainer

 First, I wanted to humbly thank the Solr community for their contributions
 and feedback for our open source Solr sandbox, Splainer (
 http://splainer.io and http://github.com/o19s/splainer). The reception
 and comments have been generally positive and helpful, and I very much
 appreciate being part of such a great open source community that wants to
 support each other.

 What is Splainer exactly? Why should you care? Nobody likes working with
 Solr in the browser's URL bar. Splainer lets you paste in your Solr URL
 and get an instant, easy-to-understand breakdown of why some documents are
 ranked higher than others. It then gives you a friendly interface to tweak
 Solr params and experiment with different ideas, with a friendlier UI than
 trying to parse through XML and JSON. You needn't worry about security
 rules for some Splainer backend needing to talk to your Solr: the
 interaction with Solr is 100% through your browser. If your PC can see
 Solr, then so can Splainer running in your browser. If you leave work or
 turn off the VPN, then Splainer can't see your Solr. It's all running
 locally on your machine through the browser!

 I wanted to share that we've been slowly adding features to Splainer. The
 two I wanted to highlight are captured in this blog article (
 http://opensourceconnections.com/blog/2015/07/24/splainer-a-solr-developers-best-friend/
 )

 To summarize, they include

 - Explain Other
 You often wonder why obviously relevant search results don't come back.
 Splainer now gives you the ability to compare any document to a secondary
 document to see what factors caused one document to rank higher than another

 - Share Splainerized Solr Results
 Once you paste a Solr URL into Splainer, you can then copy the splainer.io
 URL to share what you're seeing with a colleague. For example, here's some
 information about Virginia state laws about hunting deer from a boat:


 http://splainer.io/#?solr=http:%2F%2Fsolr.quepid.com%2Fsolr%2Fstatedecoded%2Fselect%3Fq%3Ddeer%20hunt%20from%20watercraft%0A%26defType%3Dedismax%0A%26qf%3Dcatch_line%20text%0A%26bq%3Dtitle:deer

 There's many more smaller features and tweaks, but I wanted to let you
 know this was out there. I hope you find Splainer useful. I'm very happy to
 field pull requests, ideas, suggestions, or try to figure out why Splainer
 isn't working for you!

 Cheers!
 --
 *Doug Turnbull **| *Search Relevance Consultant | OpenSource Connections 
 http://opensourceconnections.com, LLC | 240.476.9983
 Author: Relevant Search http://manning.com/turnbull This e-mail and all
 contents, including attachments, is considered to be Company Confidential
 unless explicitly stated otherwise, regardless of whether attachments are
 marked as such.




-- 
Doug Turnbull | Search Relevance Consultant | OpenSource Connections
<http://opensourceconnections.com>, LLC | 240.476.9983
Author: Relevant Search <http://manning.com/turnbull>
This e-mail and all contents, including attachments, is considered to be
Company Confidential unless explicitly stated otherwise, regardless
of whether attachments are marked as such.


Re: Search for All CAPS words

2015-07-30 Thread Alexandre Rafalovitch
Have you tried copyField with a different field type for the copied
field yet? That would be my first step. Make the copied field
indexed-only, not stored, for efficiency.

And you can then either search against that copied field directly or
use eDisMax against both fields and give that field a higher priority.
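A rough schema sketch of that approach (all field and type names below are
made up for illustration; the key parts are the copyField and the absence of
a lowercase filter on the copy):

```xml
<!-- original field keeps its existing lowercasing analysis -->
<field name="body" type="text_general" indexed="true" stored="true"/>

<!-- case-preserving copy: indexed only, never stored -->
<field name="body_cased" type="text_cased" indexed="true" stored="false"/>
<copyField source="body" dest="body_cased"/>

<fieldType name="text_cased" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- no LowerCaseFilterFactory, so GATE and gate stay distinct terms -->
  </analyzer>
</fieldType>
```

A query like body_cased:GATE would then match only the all-caps form, while
body:gate keeps matching both.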

Regards,
   Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 30 July 2015 at 10:00, rks_lucene ppro.i...@gmail.com wrote:
 Hi,

 I need the capability to search for /GATE/ separately from /gate/.

 I cannot remove the lowercase filter factory in both my search and analysis
 chains since that will break many other search scenarios.

 Is there a way to payload/mark an ALL CAPS word in the index analyzer chain
 before it gets lowercased (by the lowercasefilterfactory) so that I can
 search it with some custom grammar and logic in my query parser.

 Say I want:

 Field:_gate to match /GATE/ only

 Field:gate to match both /GATE/ and /gate/

 Any pointers would be helpful.

 thanks
 Ritesh



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Search-for-All-CAPS-words-tp4219893.html
 Sent from the Solr - User mailing list archive at Nabble.com.


StandardTokenizerFactory and WhitespaceTokenizerFactory

2015-07-30 Thread Tarala, Magesh
I am indexing text that contains part numbers in various formats that contain 
hyphens/dashes and a few other special characters.

Here's the problem: if I use StandardTokenizerFactory, the hyphens, etc. are 
stripped, so I cannot search by the part number 222-333-444. I can only 
search for 222 or 333 or 444.
If I use the WhitespaceTokenizerFactory instead, I can search part numbers, but 
I'm not able to search for words if they have punctuation like a comma or period 
after the word. Example: wheel,

Should I use copy fields with different tokenizers and then choose during the 
search based on the search string? Any other options?





Re: Hard Commit not working

2015-07-30 Thread Edward Ribeiro
Your maxDocs is set to 1. This is the number of pending docs before
autocommit is triggered too. You should set it to a higher value like
1, for example.

Edward
Em 30/07/2015 11:43, Nitin Solanki nitinml...@gmail.com escreveu:

 Hi,
I am trying to index documents using solr cloud. After setting,
 maxTime to 6 ms in hard commit. Documents are visible instantly while
 adding them. Not commiting after 6 ms.
 I have added Solr log. Please check it. I am not getting exactly what is
 happening.

 *CURL to commit documents:*

 curl http://localhost:8983/solr/test/update/json -H
 'Content-type:application/json' -d 'json-here'

 *Solrconfig.xml:*
 <autoCommit>
    <maxDocs>1</maxDocs>
    <maxTime>6</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
 <!-- <autoSoftCommit> -->
  <!--  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime> -->
  <!-- </autoSoftCommit> -->


 *Solr Log: *
 INFO  - 2015-07-30 14:14:12.636; [test shard6 core_node2
 test_shard6_replica1] org.apache.solr.update.processor.LogUpdateProcessor;
 [test_shard6_replica1] webapp=/solr path=/update

 params={update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=
 http://100.77.202.145:8983/solr/test_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false}
 {commit=} 0 26



RE: StandardTokenizerFactory and WhitespaceTokenizerFactory

2015-07-30 Thread Tarala, Magesh
Will using PatternReplaceCharFilterFactory to replace comma, period, etc. with a 
space or empty char work?

-Original Message-
From: Tarala, Magesh 
Sent: Thursday, July 30, 2015 10:08 AM
To: solr-user@lucene.apache.org
Subject: StandardTokenizerFactory and WhitespaceTokenizerFactory

I am indexing text that contains part numbers in various formats that contain 
hypens/dashes, and a few other special characters.

 Here's the problem: if I use StandardTokenizerFactory, the hyphens, etc. are 
 stripped, so I cannot search by the part number 222-333-444. I can only 
 search for 222 or 333 or 444.
If I use the WhitespaceTokenizerFactory instead, I can search part numbers, but 
I'm not able to search words if they have punctuations like comma or period 
after the word. Example: wheel,

Should I use copy fields and use different tokenizers and then during the 
search based on the search string? Any other options?





Re: Problem with 60 cc and 60cc

2015-07-30 Thread Upayavira
The reason is almost certainly that the query parser splits on
whitespace before the analysis chain gets the query; thus, each token
travels separately through your chain. Try the query with quotes around it
to see if this is your issue.

Upayavira

On Thu, Jul 30, 2015, at 04:52 PM, Jack Schlederer wrote:
 Hi,

 I'm in the process of revising a schema for the search function of an
 eCommerce platform.  One of the sticking points is a particular use
 case of searching for xx yy where xx is any number and yy is an
 abbreviation for a unit of measurement (mm, cc, ml, in, etc.).  The
 problem is that searching for xx yy and xxyy return different
 results. One possible solution I tried was applying a few
 PatternReplaceCharFilterFactories to remove the whitespace between xx
 and yy if there was any (at both index- and query-time).  These are
 the first few lines in the analyzer:

 <charFilter class="solr.PatternReplaceCharFilterFactory"
     pattern="(?i)(\d+)\s?(pounds?|lbs?)" replacement="$1lb"/>
 <charFilter class="solr.PatternReplaceCharFilterFactory"
     pattern="(?i)(\d+)\s?(inch[es]?|in?)" replacement="$1in"/>
 <charFilter class="solr.PatternReplaceCharFilterFactory"
     pattern="(?i)(\d+)\s?(ounc[es]?|oz)" replacement="$1oz"/>
 <charFilter class="solr.PatternReplaceCharFilterFactory"
     pattern="(?i)(\d+)\s?(quarts?|qts?)" replacement="$1qt"/>
 <charFilter class="solr.PatternReplaceCharFilterFactory"
     pattern="(?i)(\d+)\s?(gallons?|gal?)" replacement="$1gal"/>
 <charFilter class="solr.PatternReplaceCharFilterFactory"
     pattern="(?i)(\d+)\s?(mm|cc|ml)" replacement="$1$2"/>

 A few more lines down, I use a PatternCaptureGroupFilterFactory to
 emit the tokens "xxyy", "xx", and "yy":

 <filter class="solr.PatternCaptureGroupFilterFactory"
     pattern="(\d+)(lb|oz|in|qt|gal|mm|cc|ml)" preserve_original="true"/>

 In Solr admin's analysis tool for the field type this applies to, both
 "xx yy" and "xxyy" are tokenized and filtered down identically (at
 both index- and query-time).

 The platform I'm working on searches many different fields by default,
 but even when I rig up the query to only search in this one field, I
 still get different results for "xxyy" and "xx yy". I'm wondering why
 this is.

 Attached is a screenshot from Solr analysis.

 Thanks, John


RE: StandardTokenizerFactory and WhitespaceTokenizerFactory

2015-07-30 Thread Tarala, Magesh
I'm adding PatternReplaceCharFilterFactory to exclude those characters. Looks like 
this works. 
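For the archives, a sketch of what such a field type could look like (the
type name and the punctuation character class here are illustrative and would
need tuning for real data):

```xml
<fieldType name="text_partnum" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- runs before tokenizing: turn commas/periods into spaces, while
         hyphens inside part numbers are left untouched -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="[,.;:!?]" replacement=" "/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because the charFilter rewrites the raw text before the
WhitespaceTokenizerFactory sees it, "wheel," indexes as the token wheel,
and hyphenated part numbers survive as single tokens.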

-Original Message-
From: Tarala, Magesh 
Sent: Thursday, July 30, 2015 10:37 AM
To: solr-user@lucene.apache.org
Subject: RE: StandardTokenizerFactory and WhitespaceTokenizerFactory

Using PatternReplaceCharFilterFactory to replace comma, period, etc with space 
or empty char will work?

-Original Message-
From: Tarala, Magesh 
Sent: Thursday, July 30, 2015 10:08 AM
To: solr-user@lucene.apache.org
Subject: StandardTokenizerFactory and WhitespaceTokenizerFactory

I am indexing text that contains part numbers in various formats that contain 
hypens/dashes, and a few other special characters.

 Here's the problem: if I use StandardTokenizerFactory, the hyphens, etc. are 
 stripped, so I cannot search by the part number 222-333-444. I can only 
 search for 222 or 333 or 444.
If I use the WhitespaceTokenizerFactory instead, I can search part numbers, but 
I'm not able to search words if they have punctuations like comma or period 
after the word. Example: wheel,

Should I use copy fields and use different tokenizers and then during the 
search based on the search string? Any other options?





Re: Hard Commit not working

2015-07-30 Thread Jack Krupansky
Please be more specific as to why you think something is not working.

-- Jack Krupansky

On Thu, Jul 30, 2015 at 10:43 AM, Nitin Solanki nitinml...@gmail.com
wrote:

 Hi,
I am trying to index documents using solr cloud. After setting,
 maxTime to 6 ms in hard commit. Documents are visible instantly while
 adding them. Not commiting after 6 ms.
 I have added Solr log. Please check it. I am not getting exactly what is
 happening.

 *CURL to commit documents:*

 curl http://localhost:8983/solr/test/update/json -H
 'Content-type:application/json' -d 'json-here'

 *Solrconfig.xml:*
 <autoCommit>
    <maxDocs>1</maxDocs>
    <maxTime>6</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
 <!-- <autoSoftCommit> -->
  <!--  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime> -->
  <!-- </autoSoftCommit> -->


 *Solr Log: *
 INFO  - 2015-07-30 14:14:12.636; [test shard6 core_node2
 test_shard6_replica1] org.apache.solr.update.processor.LogUpdateProcessor;
 [test_shard6_replica1] webapp=/solr path=/update

 params={update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=
 http://100.77.202.145:8983/solr/test_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false}
 {commit=} 0 26



Re: Zookeeper state and its effect on Solr cluster.

2015-07-30 Thread Modassar Ather
Hi,

Our indexer, before starting, does an upload/reload of the Solr configuration
files using the ZK UPLOAD and RELOAD APIs. In this process ZooKeeper is not
stopped/restarted; ZK is alive and so are the Solr nodes.
Doing this often causes the following exception. Kindly note that the ZK
instance is standalone, not an ensemble. This exception only happens
at RELOAD.

{"responseHeader":{"status":500,"QTime":180028},"error":{"msg":"reload the
collection time out:180s","trace":"org.apache.solr.common.SolrException:
reload the collection time out:180s\n\tat
org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:237)\n\tat
org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:168)\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)\n\tat
org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:660)\n\tat
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:431)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)\n\tat
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)\n\tat
org.eclipse.jetty.server.Server.handle(Server.java:497)\n\tat
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)\n\tat
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)\n\tat
org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)\n\tat
java.lang.Thread.run(Thread.java:745)\n","code":500}}

Kindly help as this is blocking our smooth process of indexing.

Regards,
Modassar

On Tue, Jul 28, 2015 at 11:40 AM, Shawn Heisey apa...@elyograg.org wrote:

 On 7/27/2015 10:59 PM, Modassar Ather wrote:
  If we upgrade zookeeper we need to restart. This upgrade process is
  automated for future releases/changes of zookeeper.
  This is a single external zookeeper which is completely stopped/shutdown.
  No Solr node are restarted/shutdown.
  What I have understanding that even if the zookeeper shuts down, after
  restart the Solr nodes should come insync with the ZK state. Please
 correct
  me if I am wrong.

 Disclaimer:  I do not have a ton of concrete experience with SolrCloud.
  I do have a cloud setup, but it is running Solr 4.2.1, which at this
 point is ancient.  I haven't needed to do much to maintain it ... it
 takes care of itself.

 Recovering correctly from a complete zookeeper failure is what I would
 hope for, but it's a scenario that I've never tried.  I hope there's a
 unit test for it, but I haven't checked.

 A fully redundant zookeeper ensemble requires a minimum of three hosts.
  If you need to upgrade ZK, then you upgrade them one at a time, and the
 ensemble never loses quorum.

 http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A6

 Thanks,
 Shawn




Suggester always highlights suggestions even if we pass highlight=false

2015-07-30 Thread Nutch Solr User
I am still experiencing the https://issues.apache.org/jira/browse/SOLR-6648 issue
with Solr 5.2.1:
even if I send highlight=false, Solr returns highlighted suggestions. Any
idea why this is happening?

My configurations : 

*URL:*
http://solrhost:solrport/mycorename/suggest?suggest.dictionary=altSuggester&suggest.dictionary=mainSuggester&wt=json&suggest.q=treatm&suggest.count=20&highlight=false

*response:*

{
  "responseHeader": {
    "status": 0,
    "QTime": 6
  },
  "suggest": {
    "mainSuggester": {
      "treatm": {
        "numFound": 20,
        "suggestions": [
          {
            "term": "*Treatm*ent Refusal",
            "weight": 0,
            "payload": ""
          },
          {
            "term": "Withholding *Treatm*ent",
            "weight": 0,
            "payload": ""
          },
          {
            "term": "*Treatm*ent Refusal",
            "weight": 0,
            "payload": ""
          },
          {
            "term": "Withholding *Treatm*ent",
            "weight": 0,
            "payload": ""
          }
        ]
      }
    },
    "altSuggester": {
      "treatm": {
        "numFound": 2,
        "suggestions": [
          {
            "term": "*treatm*ent",
            "weight": 197,
            "payload": ""
          },
          {
            "term": "*treatm*ents",
            "weight": 5,
            "payload": ""
          }
        ]
      }
    }
  }
}


*My Configurations : *

<searchComponent name="suggest" class="solr.SuggestComponent">
   <lst name="suggester">
     <str name="name">mainSuggester</str>
     <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
     <str name="field">keyphrases</str>
     <str name="suggestAnalyzerFieldType">text_general</str>
     <str name="indexPath">main-suggest</str>
     <str name="buildOnStartup">true</str>
   </lst>
   <lst name="suggester">
     <str name="name">altSuggester</str>
     <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
     <str name="dictionaryImpl">HighFrequencyDictionaryFactory</str>
     <str name="field">text</str>
     <str name="suggestAnalyzerFieldType">text_general</str>
     <str name="indexPath">alt-suggest</str>
     <str name="allTermsRequired">false</str>
     <str name="buildOnStartup">true</str>
   </lst>
</searchComponent>


<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mainSuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>



-
Nutch Solr User

The ultimate search engine would basically understand everything in the world, 
and it would always give you the right thing.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Suggester-always-highlights-suggestions-even-if-we-pass-highlight-false-tp4219846.html
Sent from the Solr - User mailing list archive at Nabble.com.


How to handle line breaks for quoted queries

2015-07-30 Thread Mohsen Saboorian
How can I recognize line breaks and disallow matching of a quoted query
across them, as in the following example?

I have two documents with just one text field:
1. AAA BBB line break
CCC DDD

2. BBB CCC line break
DDD AAA

The user enters the query "BBB CCC". How can I configure tokenizers so that
Solr only returns doc #2?

Thanks,
Mohsen


Re: Hard Commit not working

2015-07-30 Thread Edward Ribeiro
Most probably because your solrconfig.xml is setting maxDocs to 1:
<maxDocs>1</maxDocs>. Solr will then autoCommit EITHER when 1 pending document
accumulates OR after maxTime has passed. Change your maxDocs value in
solrconfig.xml to a higher value, don't forget to RELOAD the core, then test
it again.
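Concretely, the autoCommit block would change along these lines (both values
below are illustrative; pick thresholds that fit your indexing rate):

```xml
<autoCommit>
  <!-- whichever of these two limits is hit first triggers the hard commit -->
  <maxDocs>10000</maxDocs>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
```

With openSearcher=false the hard commit only flushes to disk; visibility of
new documents is governed by soft commits or explicit commits, which is worth
keeping in mind when judging whether commits "work".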



On Thu, Jul 30, 2015 at 12:13 PM, Nitin Solanki nitinml...@gmail.com
wrote:

 Hi Edwards,
   I am only sending 1 document for indexing then why it is
 committing instantly. I gave maxTime to 6.

 On Thu, Jul 30, 2015 at 8:26 PM Edward Ribeiro edward.ribe...@gmail.com
 wrote:

  Your maxDocs is set to 1. This is the number of pending docs before
  autocommit is triggered too. You should set it to a higher value.
 
  Edward
  Em 30/07/2015 11:43, Nitin Solanki nitinml...@gmail.com escreveu:
 
   Hi,
  I am trying to index documents using solr cloud. After setting,
   maxTime to 6 ms in hard commit. Documents are visible instantly
  while
   adding them. Not commiting after 6 ms.
   I have added Solr log. Please check it. I am not getting exactly what
 is
   happening.
  
   *CURL to commit documents:*
  
   curl http://localhost:8983/solr/test/update/json -H
   'Content-type:application/json' -d 'json-here'
  
   *Solrconfig.xml:*
   <autoCommit>
      <maxDocs>1</maxDocs>
      <maxTime>6</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
   <!-- <autoSoftCommit> -->
    <!--  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime> -->
    <!-- </autoSoftCommit> -->
  
  
   *Solr Log: *
   INFO  - 2015-07-30 14:14:12.636; [test shard6 core_node2
   test_shard6_replica1]
  org.apache.solr.update.processor.LogUpdateProcessor;
   [test_shard6_replica1] webapp=/solr path=/update
  
  
 
  params={update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=
  http://100.77.202.145:8983/solr/test_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false}
   {commit=} 0 26
  
 



Re: Personalized Search Results or Matching Documents to Users

2015-07-30 Thread Shawn Heisey
On 7/30/2015 10:46 AM, Robert Farrior wrote:
 We have a requirement to be able to have a master product catalog and to
 create a sub-catalog of products per user. This means I may have 10,000
 users who each create their own list of documents. This is a simple mapping
 of user to documents. The full data about the documents would be in the main
 catalog.
 
 What approaches would allow Solr to only return the results that are in the
 user's list?  It seems like I would need a couple of steps in the process.
 In other words, the main catalog has 3 documents: A, B and C. I have 2
 users. User 1 has access to documents A and C but not B. User 2 has access
 to documents C and B but not A.
 
 When a user searches, I want to only return documents that the user has
 access to.

A common approach for Solr would be to have a multivalued "user" field
on each document, which has individual values for each user that can
access the document. When you index the document, you include values
in this field listing all the users that can access that document.

Then you simply filter by user:

fq=user:joe

This is EXTREMELY efficient at query time, especially when the number of
users is much smaller than the number of documents.  It may complicate
indexing somewhat, but indexing is an extremely custom operation that
users have to write themselves, so it probably won't be horrible.
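As a sketch of that approach (the field name, ids, and collection name here are illustrative, not from the original catalog):

```
schema.xml — the per-document ACL field:
  <field name="users" type="string" indexed="true" stored="false" multiValued="true"/>

Indexing (JSON update), master catalog docs A, B, C:
  [ {"id": "A", "name": "Product A", "users": ["user1"]},
    {"id": "B", "name": "Product B", "users": ["user2"]},
    {"id": "C", "name": "Product C", "users": ["user1", "user2"]} ]

Query for user1 (can only ever see A and C):
  /solr/catalog/select?q=*:*&fq=users:user1
```

The fq is cached in Solr's filterCache, so repeated searches by the same user reuse the computed document set.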

Thanks,
Shawn



Problem with 60 cc and 60cc

2015-07-30 Thread Jack Schlederer
Hi,

I'm in the process of revising a schema for the search function of an
eCommerce platform.  One of the sticking points is a particular use case of
searching for xx yy where xx is any number and yy is an abbreviation for
a unit of measurement (mm, cc, ml, in, etc.).  The problem is that
searches for xx yy and xxyy return different results. One possible
solution I tried was applying a few PatternReplaceCharFilterFactories to
remove the whitespace between xx and yy if there was any (at both index-
and query-time).  These are the first few lines in the analyzer:

<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="(?i)(\d+)\s?(pounds?|lbs?)" replacement="$1lb" />
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="(?i)(\d+)\s?(inch[es]?|in?)" replacement="$1in" />
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="(?i)(\d+)\s?(ounc[es]?|oz)" replacement="$1oz" />
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="(?i)(\d+)\s?(quarts?|qts?)" replacement="$1qt" />
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="(?i)(\d+)\s?(gallons?|gal?)" replacement="$1gal" />
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="(?i)(\d+)\s?(mm|cc|ml)" replacement="$1$2" />

A few more lines down, I use a PatternCaptureGroupFilterFactory to emit the
tokens xxyy, xx, and yy:

<filter class="solr.PatternCaptureGroupFilterFactory"
pattern="(\d+)(lb|oz|in|qt|gal|mm|cc|ml)" preserve_original="true" />
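The char filter regexes can be sanity-checked outside Solr. This Python sketch (only an approximation of what PatternReplaceCharFilterFactory does to the character stream) confirms that "60 cc" and "60cc" normalize to the same text; Java's `$1$2` replacement becomes `\1\2` in Python:

```python
import re

# Same pattern/replacement as the last charFilter above.
def normalize(text):
    return re.sub(r"(?i)(\d+)\s?(mm|cc|ml)", r"\1\2", text)

print(normalize("60 cc"))  # -> 60cc
print(normalize("60cc"))   # -> 60cc
```

If both forms come out identical here but still search differently, the difference is likely introduced elsewhere in the chain or by other fields being queried.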

In Solr admin's analysis tool for the field type this applies to, both xx
yy and xxyy are tokenized and filtered down identically (at both index-
and query-time).

The platform I'm working on searches many different fields by default, but
even when I rig up the query to only search in this one field, I still get
different results for xxyy and xx yy.  I'm wondering why this is.

Attached is a screenshot from Solr analysis.

Thanks,
John


RE: Solr spell check multiwords

2015-07-30 Thread Dyer, James
Talha,

In your configuration, you have this set:

<str name="spellcheck.maxResultsForSuggest">5</str>

...which means it will consider the query correctly spelled and offer no 
suggestions if there are 5 or more results. You could omit this parameter and 
it will always suggest when possible.  

Possibly, a better option would be to add spellcheck.collateParam.mm=100% or 
spellcheck.collateParam.q.op=AND, so that when testing collations against the 
index, all the terms are required to match something.  See 
https://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collateParam.XX for 
more information.
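As a hedged example (the collection name and query here are illustrative), the second option looks like this on the request:

```
/solr/collection1/select?q=simphony+mobile
    &spellcheck=true&spellcheck.collate=true
    &spellcheck.maxCollationTries=5
    &spellcheck.collateParam.q.op=AND
```

The q.op=AND override applies only while the component test-runs candidate collations, so a collation like "symphony mobile" is returned only if all of its terms match documents together; the user's main query keeps its own q.op.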

James Dyer
Ingram Content Group

-Original Message-
From: talha [mailto:talh...@gmail.com] 
Sent: Wednesday, July 22, 2015 9:34 AM
To: solr-user@lucene.apache.org
Subject: Solr spell check multiwords

I could not figure out why my configured Solr spell checker is not giving
the desired output. In my indexed data, the query symphony+mobile has around
3.5K+ docs and the spell checker detects it as correctly spelled. When I
misspell symphony in the query symphony+mobile, it shows only results for
mobile and the spell checker detects the query as correctly spelled. I have
searched this query in different combinations. Please find the search result stats:

Query: symphony 
ResultFound: 1190
SpellChecker: correctly spelled

Query: mobile
ResultFound: 2850
SpellChecker: correctly spelled

Query: simphony
ResultFound: 0
SpellChecker: symphony 
Collation Hits: 1190

Query: symphony+mobile
ResultFound: 3585
SpellChecker: correctly spelled 

Query: simphony+mobile
ResultFound: 2850
SpellChecker: correctly spelled

Query: symphony+mbile
ResultFound: 1190
SpellChecker: correctly spelled 

In the last two queries it should suggest something for the misspelled words
simphony and mbile.

Please find my configuration below. Only spell check configuration are given

solrconfig.xml

  <requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">

    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">product_name</str>

    <str name="spellcheck">on</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.alternativeTermCount">2</str>
    <str name="spellcheck.maxResultsForSuggest">5</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollationTries">5</str>
    <str name="spellcheck.maxCollations">3</str>

  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
  </requestHandler>

  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

  <str name="queryAnalyzerFieldType">text_suggest</str>

  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">suggest</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
  </lst>

  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="field">suggest</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">10</int>
    <int name="minBreakLength">5</int>
  </lst>

  </searchComponent>

schema.xml

  <fieldType name="text_suggest" class="solr.TextField"
  positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
    words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
  </analyzer>
  </fieldType>



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-spell-check-mutliwords-tp4218580.html
Sent from the Solr - User mailing list archive at Nabble.com.



Personalized Search Results or Matching Documents to Users

2015-07-30 Thread Robert Farrior
Hi,

We have a requirement to be able to have a master product catalog and to
create a sub-catalog of products per user. This means I may have 10,000
users who each create their own list of documents. This is a simple mapping
of user to documents. The full data about the documents would be in the main
catalog.

What approaches would allow Solr to only return the results that are in the
user's list?  It seems like I would need a couple of steps in the process.
In other words, the main catalog has 3 documents: A, B and C. I have 2
users. User 1 has access to documents A and C but not B. User 2 has access
to documents C and B but not A.

When a user searches, I want to only return documents that the user has
access to.

One approach would seem to have a DB table for the user's catalog list.
Then during indexing, use that table to index each product against all
applicable users.  Then, during search, restrict the results to products
that match to the current user. No idea HOW to do that.

Another approach would seem to be to do nothing on indexing and instead
provide some type of filter on the results that limits the results for the
specific user. No idea how to do that either.

The goal is to have the ability to have personalized product catalogs. The
big problem is that we have very large catalogs with a high number of users.
Something like 500,000 products and 10,000 customers.

The solution needs to perform well for both indexing and search.  My client
is considering dropping SOLR and investing in Endeca if Solr cannot handle
this need efficiently.

Any suggestions or help would be greatly appreciated.

bob



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Peronalized-Search-Results-or-Matching-Documents-to-Users-tp4219951.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Search for All CAPS words

2015-07-30 Thread rks_lucene
Thanks, and I did think of the copy field option. So what you are suggesting
is that I have a copyfield whose indexing/query chains do not include the
lowercase filter factory.

I am afraid that would not help if my search query is complex, with many
words (say a boolean with proximity operators), because the full search
string would have to go into the copyfield (which lacks lowercasing). The rest
of the words other than /GATE/ wouldn't match properly then.

Ritesh







--
View this message in context: 
http://lucene.472066.n3.nabble.com/Search-for-All-CAPS-words-tp4219893p4219959.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Query taking 50 sec

2015-07-30 Thread Shawn Heisey
On 7/30/2015 3:53 AM, Manohar Sripada wrote:
 We have Solr Cloud (version 4.7.2) setup on 64 shards spread across VMs.
 I see my queries to Solr taking exactly 50 sec intermittently (as
 someone said so :P). This happens once in 10 queries. 
 I have enabled log level to TRACE on all the solr nodes. I didn't find
 any issue with the query time on any given shard (max QTime observed on
 a shard is 10 ms).  We ran all the tests related to network and
 everything looks fine there. 
 
 Whenever the query took 50 sec, I am seeing the below log statements
 for the org.eclipse.jetty component. Is this some issue with Jetty?? I could
 see these logs being printed every 11 seconds (2015-07-24 07:06:00,
 2015-07-24 07:06:11, ...) for 4 times. Attached the complete logs
 during that duration. Can someone please help me here??

snip

 INFO  - 2015-07-24 07:06:00.128; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/logging params={_=1437736005493&since=1437734905469&wt=json} status=0 QTime=0

Those logs appear to be caused by someone watching the Logging tab in
the admin UI. This admin UI page refreshes every ten seconds. No queries
are happening during the log you included, only the requests for logging
info. These requests are normally very fast, and in your log, they show
a qtime of zero milliseconds.

64 shards is quite a bit, and as soon as someone talks about a very
large install on virtual machines that is having performance problems, I
suspect that they probably do not have enough resources (memory in
particular) for what they are asking the system to do.

Now it's time for some light reading:

http://wiki.apache.org/solr/SolrPerformanceProblems

Next there are questions. These first bunch of questions are about the
virtual machines themselves, not the host hardware for the virtual machines.

Are you using the jetty (start.jar) included with Solr, or have you
installed Solr into a different jetty?

On the dashboard of the admin UI, in the JVM section, there is an Args
parameter, which may have multiple lines. What all is there?

If you add up all the shard replicas on a single virtual machine, how
many docs are there and how much disk space is used by the index data?
Include all replicas in those numbers, even if they duplicate data
that's on another virtual machine.

How much memory does the virtual machine have, and how much of that
memory is allocated to the java heap?

Are all of the virtual machines similar as far as memory config and how
much Solr data they contain?

If you are using a virtual machine platform that you host yourself, then
I need to know how many of these virtual machines are loaded onto each
physical machine, and how much memory that physical machine has. If
you're using AWS, then this question is irrelevant. The allocation of
CPU resources might be important, but it's not as important as memory.

Thanks,
Shawn



Re: Search for All CAPS words

2015-07-30 Thread Alexandre Rafalovitch
So, what you want is to duplicate a specific token, rename one of the
copies, and inject it at the same offset as the original. So GATE =>
gate, _gate but gate => gate.

That, to me, is a custom token filter. You can probably use
KeywordRepeatFilterFactory as a base:
http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/miscellaneous/KeywordRepeatFilterFactory.html
(you can click through to the Filter and then source from there).

Regards,
   Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/
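As a sketch of the filter logic only (plain Python standing in for a Lucene TokenFilter; the `_` marker and the (text, increment) tuples are illustrative): each all-caps token is emitted lowercased plus a marked copy at position increment 0, so both land at the same position, the way KeywordRepeatFilter stacks its copies.

```python
def caps_marker_filter(tokens):
    """tokens: list of (text, position_increment) pairs, as a Lucene
    TokenStream would deliver them. All-caps tokens are emitted lowercased
    plus a '_'-marked copy stacked at the same position (increment 0)."""
    out = []
    for text, pos_inc in tokens:
        if text.isalpha() and text.isupper():
            out.append((text.lower(), pos_inc))  # original, lowercased
            out.append(("_" + text.lower(), 0))  # stacked marked copy
        else:
            out.append((text.lower(), pos_inc))
    return out

print(caps_marker_filter([("GATE", 1), ("gate", 1)]))
# -> [('gate', 1), ('_gate', 0), ('gate', 1)]
```

With an index built this way, searching `gate` matches both forms, while searching `_gate` matches only documents where the token was originally all caps, and phrase/proximity queries still work because positions are unchanged.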


On 30 July 2015 at 13:53, rks_lucene ppro.i...@gmail.com wrote:
 Thanks, and I did think of the copy field option. So what you are suggesting
 is that I have a copyfield whose indexing/query chains do not include the
 lowercase filter factory.

 I am afraid that would not help if my search query is complex, with many
 words (say a boolean with proximity operators), because the full search
 string would have to go into the copyfield (which lacks lowercasing). The rest
 of the words other than /GATE/ wouldn't match properly then.

 Ritesh







 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Search-for-All-CAPS-words-tp4219893p4219959.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Question about Stemmer

2015-07-30 Thread Alessandro Benedetti
Hi Ashish, are we talking about analysis at query time, index time, or both?
As Erick said, I find it really hard to believe in this combination for a
classic search.

Are you trying to provide something special ?

An Ngram token filter will produce a set of ngrams out of your token:

token

to ok ke en (in the case of bigrams).
I find this useless input for a stemmer.
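To make the bigram claim concrete, a quick sketch (ignoring NGramTokenFilter's min/max gram settings):

```python
def ngrams(token, n=2):
    """Character n-grams, roughly what an NGram token filter emits."""
    return [token[i:i + n] for i in range(len(token) - n + 1)]

print(ngrams("token"))  # -> ['to', 'ok', 'ke', 'en']
```

None of these fragments are real words, which is why stemming them is pointless: a stemmer expects whole word forms as input.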

Inverting the two token filters will probably make more sense.
But, can we know which kind of search you want to provide on top of this
analysis?

Analysis must always go hand in hand with the search you expect!

Cheers

2015-07-29 10:49 GMT+01:00 Ashish Mukherjee ashish.mukher...@gmail.com:

 Hello,

 I am using Stemmer on a Ngram field. I am getting better results with
 Stemmer factory after Ngram, but I was wondering what is the recommended
 practice when using Stemmer on Ngram field?

 Regards,
 Ashish




-- 
--

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Suggester always highlights suggestions even if we pass highlight=false

2015-07-30 Thread Alessandro Benedetti
Hi Nutch,
are you sure you are using the proper parameters?
I cannot see the highlight param in the suggester configuration!
From the issue you linked, it seems it is necessary to disable highlighting
(default = true).

I see it as a query param for the /suggest search handler.
Am I wrong, or did you misunderstand the configuration?

Cheers

2015-07-30 8:50 GMT+01:00 Nutch Solr User nutchsolru...@gmail.com:

 I am still experiencing the https://issues.apache.org/jira/browse/SOLR-6648
 issue with Solr 5.2.1.
 Even if I send highlight=false, Solr returns highlighted suggestions. Any
 idea why this is happening?

 My configurations :

 *URL:*
 http://solrhost:solrport/mycorename/suggest?suggest.dictionary=altSuggester&suggest.dictionary=mainSuggester&wt=json&suggest.q=treatm&suggest.count=20&highlight=false

 *Response:*

 {
   "responseHeader": {
     "status": 0,
     "QTime": 6
   },
   "suggest": {
     "mainSuggester": {
       "treatm": {
         "numFound": 20,
         "suggestions": [
           { "term": "<b>Treatm</b>ent Refusal", "weight": 0, "payload": "" },
           { "term": "Withholding <b>Treatm</b>ent", "weight": 0, "payload": "" },
           { "term": "<b>Treatm</b>ent Refusal", "weight": 0, "payload": "" },
           { "term": "Withholding <b>Treatm</b>ent", "weight": 0, "payload": "" }
         ]
       }
     },
     "altSuggester": {
       "treatm": {
         "numFound": 2,
         "suggestions": [
           { "term": "<b>treatm</b>ent", "weight": 197, "payload": "" },
           { "term": "<b>treatm</b>ents", "weight": 5, "payload": "" }
         ]
       }
     }
   }
 }


 *My Configurations : *

 <searchComponent name="suggest" class="solr.SuggestComponent">
   <lst name="suggester">
     <str name="name">mainSuggester</str>
     <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
     <str name="field">keyphrases</str>
     <str name="suggestAnalyzerFieldType">text_general</str>
     <str name="indexPath">main-suggest</str>
     <str name="buildOnStartup">true</str>
   </lst>
   <lst name="suggester">
     <str name="name">altSuggester</str>
     <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
     <str name="dictionaryImpl">HighFrequencyDictionaryFactory</str>
     <str name="field">text</str>
     <str name="suggestAnalyzerFieldType">text_general</str>
     <str name="indexPath">alt-suggest</str>
     <str name="allTermsRequired">false</str>
     <str name="buildOnStartup">true</str>
   </lst>
 </searchComponent>


 <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
   <lst name="defaults">
     <str name="suggest">true</str>
     <str name="suggest.count">10</str>
     <str name="suggest.dictionary">mainSuggester</str>
   </lst>
   <arr name="components">
     <str>suggest</str>
   </arr>
 </requestHandler>



 -
 Nutch Solr User

 The ultimate search engine would basically understand everything in the
 world, and it would always give you the right thing.
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Suggester-always-highlights-suggestions-even-if-we-pass-highlight-false-tp4219846.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
--

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Solr Query taking 50 sec

2015-07-30 Thread Manohar Sripada
Hi,

We have Solr Cloud (version 4.7.2) setup on 64 shards spread across VMs. I
see my queries to Solr taking exactly 50 sec intermittently (as someone
said so :P). This happens once in 10 queries.
I have enabled log level to TRACE on all the solr nodes. I didn't find any
issue with the query time on any given shard (max QTime observed on a shard
is 10 ms).  We ran all the tests related to network and everything looks
fine there.

Whenever the query took 50 sec, I am seeing the below log statements
for the org.eclipse.jetty component. Is this some issue with Jetty?? I could
see these logs being printed every 11 seconds (2015-07-24 07:06:00, 2015-07-24
07:06:11, ...) for 4 times. Attached the complete logs during that
duration. Can someone please help me here??


DEBUG - 2015-07-24 07:06:00.126; org.eclipse.jetty.http.HttpParser; filled 707/707
DEBUG - 2015-07-24 07:06:00.127; org.eclipse.jetty.server.Server; REQUEST /solr/admin/info/logging on BlockingHttpConnection@7a5f39b0,g=HttpGenerator{s=0,h=-1,b=-1,c=-1},p=HttpParser{s=-5,l=209,c=0},r=43
DEBUG - 2015-07-24 07:06:00.127; org.eclipse.jetty.server.handler.ContextHandler; scope null||/solr/admin/info/logging @ o.e.j.w.WebAppContext{/solr,file:/u01/work/app/install_solr/daas_node/solr-webapp/webapp/},/u01/work/app/install_solr/daas_node/webapps/solr.war
DEBUG - 2015-07-24 07:06:00.127; org.eclipse.jetty.server.handler.ContextHandler; context=/solr||/admin/info/logging @ o.e.j.w.WebAppContext{/solr,file:/u01/work/app/install_solr/daas_node/solr-webapp/webapp/},/u01/work/app/install_solr/daas_node/webapps/solr.war
DEBUG - 2015-07-24 07:06:00.127; org.eclipse.jetty.server.session.SessionHandler; Got Session ID vZScVxfQ528bXYGHJw16N3vTLJ4t3L41bSkHNmyTywQKGGzZFC8p!-348395136!NONE from cookie
DEBUG - 2015-07-24 07:06:00.127; org.eclipse.jetty.server.session.SessionHandler; sessionManager=org.eclipse.jetty.server.session.HashSessionManager@1c49094
DEBUG - 2015-07-24 07:06:00.127; org.eclipse.jetty.server.session.SessionHandler; session=null
DEBUG - 2015-07-24 07:06:00.128; org.eclipse.jetty.servlet.ServletHandler; servlet /solr|/admin/info/logging|null - default
DEBUG - 2015-07-24 07:06:00.128; org.eclipse.jetty.servlet.ServletHandler; chain=SolrRequestFilter-default
DEBUG - 2015-07-24 07:06:00.128; org.eclipse.jetty.servlet.ServletHandler$CachedChain; call filter SolrRequestFilter
INFO  - 2015-07-24 07:06:00.128; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/logging params={_=1437736005493&since=1437734905469&wt=json} status=0 QTime=0
DEBUG - 2015-07-24 07:06:00.128; org.apache.solr.servlet.SolrDispatchFilter; Closing out SolrRequest: {_=1437736005493&since=1437734905469&wt=json}
DEBUG - 2015-07-24 07:06:00.129; org.eclipse.jetty.server.Server; RESPONSE /solr/admin/info/logging  200 handled=true
DEBUG - 2015-07-24 07:06:06.327; org.apache.zookeeper.ClientCnxn$SendThread; Got ping response for sessionid: 0x14eaf8f79530460 after 0ms
DEBUG - 2015-07-24 07:06:11.118; org.eclipse.jetty.http.HttpParser; filled 707/707
DEBUG - 2015-07-24 07:06:11.119; org.eclipse.jetty.server.Server; REQUEST /solr/admin/info/logging on BlockingHttpConnection@7a5f39b0,g=HttpGenerator{s=0,h=-1,b=-1,c=-1},p=HttpParser{s=-5,l=209,c=0},r=44
DEBUG - 2015-07-24 07:06:11.119; org.eclipse.jetty.server.handler.ContextHandler; scope null||/solr/admin/info/logging @ o.e.j.w.WebAppContext{/solr,file:/u01/work/app/install_solr/daas_node/solr-webapp/webapp/},/u01/work/app/install_solr/daas_node/webapps/solr.war
DEBUG - 2015-07-24 07:06:11.119; org.eclipse.jetty.server.handler.ContextHandler; context=/solr||/admin/info/logging @ o.e.j.w.WebAppContext{/solr,file:/u01/work/app/install_solr/daas_node/solr-webapp/webapp/},/u01/work/app/install_solr/daas_node/webapps/solr.war
DEBUG - 2015-07-24 07:06:11.119; org.eclipse.jetty.server.session.SessionHandler; Got Session ID vZScVxfQ528bXYGHJw16N3vTLJ4t3L41bSkHNmyTywQKGGzZFC8p!-348395136!NONE from cookie
DEBUG - 2015-07-24 07:06:11.119; org.eclipse.jetty.server.session.SessionHandler; sessionManager=org.eclipse.jetty.server.session.HashSessionManager@1c49094
DEBUG - 2015-07-24 07:06:11.119; org.eclipse.jetty.server.session.SessionHandler; session=null
DEBUG - 2015-07-24 07:06:11.120; org.eclipse.jetty.servlet.ServletHandler; servlet /solr|/admin/info/logging|null - default
DEBUG - 2015-07-24 07:06:11.120; org.eclipse.jetty.servlet.ServletHandler; chain=SolrRequestFilter-default
DEBUG - 2015-07-24 07:06:11.120; org.eclipse.jetty.servlet.ServletHandler$CachedChain; call filter SolrRequestFilter
INFO  - 2015-07-24 07:06:11.120; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/logging params={_=1437736016484&since=1437734905469&wt=json} status=0 QTime=0
DEBUG - 2015-07-24 07:06:11.120; org.apache.solr.servlet.SolrDispatchFilter; Closing out SolrRequest: {_=1437736016484&since=1437734905469&wt=json}
DEBUG - 2015-07-24 07:06:11.121; 

Re: How to handle line breaks for quoted queries

2015-07-30 Thread Alessandro Benedetti
Hi Mohsen,
this is the perfect place for the *positionIncrementGap* attribute on your
field type:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">

First of all, when phrase or positional searches are necessary, you need to
store term positions in your index.
The position increment gap bumps the position whenever a new value of a
multivalued field begins.

At this point you have different solutions:

1) you provide multiple values based on line breaks, if that fits your use
case

2) more likely, you take a look at a tokenizer that does the position
increment on its own when a line break is found.

If you use the analysis tool now, what do you get for your tokens and
positions?
How is the line break indexed?
You can review positions with the analysis tool!
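A sketch of option 1 (the field name and gap value are illustrative): index each line as a separate value of a multivalued field, so the gap separates the values and a plain phrase query cannot match across a break:

```
<field name="body" type="text_general" indexed="true" stored="true" multiValued="true"/>

Doc 1: body = ["AAA BBB", "CCC DDD"]   (positions: AAA=0 BBB=1 CCC=101 DDD=102)
Doc 2: body = ["BBB CCC", "DDD AAA"]

q=body:"BBB CCC"  ->  matches doc 2 only; in doc 1, BBB and CCC
                      are 100 positions apart because of the gap
```

A phrase query with a slop smaller than the gap (e.g. "BBB CCC"~50) likewise cannot bridge the break.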

Cheers



2015-07-30 8:40 GMT+01:00 Mohsen Saboorian mohs...@gmail.com:

 How can I recognize line breaks and prevent a quoted query from matching
 across them, as in the following example?

 I have two documents with just one text field:
 1. AAA BBB line break
 CCC DDD

 2. BBB CCC line break
 DDD AAA

 User enters query BBB CCC. How can I configure tokenizers so that Solr
 only returns doc #2?

 Thanks,
 Mohsen




-- 
--

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England