Re: Using Synonyms as a feature with LTR

2018-02-12 Thread Alessandro Benedetti
In the end a feature will just be a numerical value. How do you plan to use synonyms in a field to generate a numerical feature? Are you planning to define a binary feature for a field, in case there is a match on the synonyms? Or a feature which contains a score for a query (with synonyms

Solr node is out of sync (looks Healthy)

2018-02-12 Thread Daniel Carrasco
Hello, We're using Solr to manage product data in our shop, and last week some customers called to tell us that the price differs between the shop and the shopping basket. After researching a bit, I noticed that it sometimes happens on page refresh. After disabling all caches I queried all solr

Re: Solr node is out of sync (looks Healthy)

2018-02-12 Thread Emir Arnautović
Hi Daniel, Can you tell us more about your document update process? How do you commit changes? Since it got fixed after restart, it seems to me that on that one node the index searcher was not reopened after updates. Do you see any errors/warnings on that node? Also, what do you mean by “All nodes

RE: Search for a word NOT followed by another on a Solr query

2018-02-12 Thread Markus Jelsma
You can abuse phrase query for that, q=leonardo AND -"leonardo da vinci" (assuming you have a proper default field set). Markus -Original message- > From:ivan > Sent: Monday 12th February 2018 12:54 > To: solr-user@lucene.apache.org > Subject: Search for a word
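A minimal sketch of how the query above could be passed as request parameters, assuming the default field is supplied with `df`. The field name `text` is a hypothetical placeholder, and no request is actually sent.

```python
from urllib.parse import urlencode, parse_qs

def build_select_params(word, phrase, default_field):
    """Build URL-encoded parameters for a select request.

    The query requires `word` and excludes documents that match the
    exact `phrase`. `default_field` is an assumed field name.
    """
    query = f'{word} AND -"{phrase}"'
    return urlencode({"q": query, "df": default_field})

# "text" is a hypothetical default field, not taken from the thread.
params = build_select_params("leonardo", "leonardo da vinci", "text")
print(params)
```

Note that this excludes every document containing the phrase, including documents that also contain a standalone "leonardo".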

Re: Solr node is out of sync (looks Healthy)

2018-02-12 Thread Emir Arnautović
Hi Daniel, Maybe it is Monday and I am still not warmed up, but your details seem a bit imprecise to me. Maybe not directly related to your problem, but just to exclude that you are having some strange Solr setup, here is my understanding: You are running a single SolrCloud cluster with 8

coord in SolR 7

2018-02-12 Thread Moll, Dr. Andreas
Hi, I am trying to upgrade our SolR installation from SolR 5 to 7. We use a customized similarity class that heavily depends on the coordination factor to scale the similarity for OR-queries with multiple terms. Since SolR 7 this feature has been removed. Is there any hook to implement this in our

SolrCloud to maintain keepalive connections

2018-02-12 Thread Keerthy Jayaraj
Hi, I am using SolrCloud implementation on Solr 5.2.1, and due to some infrastructure constraints, creating new http connections between shards looks expensive. I want my shardhandlers to use persistent connections while communicating between shards. Is there a way to configure this? Will this

Re: Solr node is out of sync (looks Healthy)

2018-02-12 Thread Daniel Carrasco
Hello, thanks for your help. I answer below. Greetings!! 2018-02-12 11:31 GMT+01:00 Emir Arnautović : > Hi Daniel, > Can you tell us more about your document update process. How do you commit > changes? Since it got fixed after restart, it seems to me that on

Search for a word NOT followed by another on a Solr query

2018-02-12 Thread ivan
What I'm trying to do is to only get results for "Leonardo" when it is not followed by "da vinci". If I have "Leonardo da vinci" in my result, that is fine as long as I have another "Leonardo" without "da vinci". Examples: "Leonardo foo bar" OK "Leonardo da vinci foo bar" KO "Leonardo foo bar Leonardo da

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread Deepak Goel
I would suggest keeping the load the same for both solr4 and solr6, and then testing. Also, please post the exact concurrent hits. On 12 Feb 2018 12:48, "~$alpha`" wrote: When both solr4 and solr6 have concurrent hits: 1. 30 to 40 : Avg response time 470ms vs 380ms Load

Re: Solr node is out of sync (looks Healthy)

2018-02-12 Thread Daniel Carrasco
Hello, 2018-02-12 12:32 GMT+01:00 Emir Arnautović : > Hi Daniel, > Maybe it is Monday and I am still not warmed up, but your details seem a > bit imprecise to me. Maybe not directly related to your problem, but just > to exclude that you are having some strange

Purchase of support

2018-02-12 Thread Hon Fook Boey
Hi, May I know if support/maintenance can be purchased for SOLR? Thanks and regards, Boey HF eHoB Technology Sdn Bhd (Co Reg No 561898-XGST Reg # 001282277376) No 12-2, Jln PJU 7/16A, Mutiara Damansara, 47800 Petaling Jaya, Malaysia Tel +6 03 7710 3308 Fax +6 03 7726 6228 Mobile +6 012

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread Deepak Goel
This would then mean that solr6 is reaching some kind of saturation (number of threads, etc.) at loads of about 60 hits, which then drives its performance to be very bad! Deepak "Please stop cruelty to Animals, help by becoming a Vegan" +91 73500 12833 deic...@gmail.com Facebook:

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread Walter Underwood
100% CPU will cause congestion and very slow response. In production, we do not drive Solr over 75% CPU. He is reporting load average, which is a bit harder to interpret. When the load average reaches the number of CPUs, that is probably the beginning of congestion. But I’m less sure about

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread ~$alpha`
Hits 41: Avg response time 470ms vs 380ms CPU Load reaches 6 vs 10 Hits 82: Avg response time 500ms vs 620ms (solr6 performing bad on peak hours) CPU Load reaches 11 vs 25 -- Sent from:

RE: Search for a word NOT followed by another on a Solr query

2018-02-12 Thread ivan
Markus Jelsma-2 wrote > You can abuse phrase query for that, q=leonardo AND -"leonardo da vinci" > (assuming you have a proper default field set). > > Markus This way I'm losing results where I have both "Leonardo" and "Leonardo da vinci" in the same field, see example number 3 "Leonardo foo bar

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread ~$alpha`
I can't test with more as performance is already degraded. It's a 32-core system and load 25 means 2500% CPU -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Solr search word NOT followed by another word

2018-02-12 Thread ivan
What I'm trying to do is to only get results for "Leonardo" when it is not followed by "da vinci". So any result containing "Leonardo" (not followed by "da vinci") is fine even if I have "Leonardo da vinci" in the result. I want to filter out only the results where I don't have "Leonardo" without "da

solr spell check index dictionary build failed issue

2018-02-12 Thread Krishna Kumar Sharma
Hello, I have an issue building the spell check index dictionary. It shows an error like: ERROR undefined SpellCheckComponent Exception in building spell check index for spellchecker: indexD org.apache.lucene.store.LockObtainFailedException: Lock held by this virtual machine:

RE: Solr search word NOT followed by another word

2018-02-12 Thread Allison, Timothy B.
That requires a SpanNotQuery. AFAIK, there is no way to do this with the current parsers included in Solr. My SpanQueryParser does cover this, and I'm hoping to port it to 7.x today or tomorrow. Syntax would be "Leonardo [da vinci]"!~0,1 https://issues.apache.org/jira/browse/LUCENE-5205
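To make the difference between the two behaviours discussed in this thread concrete, here is a rough pure-Python model over whitespace tokens. It is only an illustration of the matching semantics, not Solr or Lucene code, and the sample strings are invented for the example.

```python
def tokens(text):
    """Lowercase and split on whitespace (a stand-in for real analysis)."""
    return text.lower().split()

def has_phrase(toks, phrase):
    """True if the token list contains `phrase` as a contiguous run."""
    n = len(phrase)
    return any(toks[i:i + n] == phrase for i in range(len(toks) - n + 1))

def phrase_exclusion_match(text):
    """Model of: leonardo AND -"leonardo da vinci"."""
    toks = tokens(text)
    return "leonardo" in toks and not has_phrase(toks, ["leonardo", "da", "vinci"])

def span_not_match(text):
    """Model of the requested behaviour: at least one "leonardo"
    that is not directly followed by "da vinci"."""
    toks = tokens(text)
    return any(
        tok == "leonardo" and toks[i + 1:i + 3] != ["da", "vinci"]
        for i, tok in enumerate(toks)
    )

# Invented sample strings, for illustration only.
samples = [
    "Leonardo was here",
    "Leonardo da vinci was here",
    "Leonardo was here before Leonardo da vinci",
]
for s in samples:
    print(s, phrase_exclusion_match(s), span_not_match(s))
```

The two models agree on the first two samples and disagree on the third, which is the case the phrase-exclusion query loses.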

Solr Expression Slow

2018-02-12 Thread ~$alpha`
In the below Solr query I am sorting based on the below expression. > sum(if(and(tf(CRITERIA1_FILTER,Y),if(tf(PARTNER_CRITERIA1,N),0,1)),1,0),if(and(tf(CRITERIA2_FILTER,Y),if(tf(PARTNER_CRITERIA2,1),0,1)),1,0)) > asc I have mentioned PARTNER_CRITERIA1 and PARTNER_CRITERIA2 in the expression. I
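As a reading aid, the sort expression above can be modelled in plain Python. This is a sketch under assumptions: `tf` is modelled as a simple count over a list of field values, a non-zero count is treated as true, and the sample document is invented.

```python
def tf(doc, field, term):
    """Model of tf(field, term): how often `term` occurs in `field`."""
    return doc.get(field, []).count(term)

def criterion(doc, filter_field, partner_field, partner_term):
    """Model of: if(and(tf(filter,Y), if(tf(partner,term),0,1)), 1, 0)."""
    filter_on = tf(doc, filter_field, "Y") > 0
    partner_absent = tf(doc, partner_field, partner_term) == 0
    return 1 if (filter_on and partner_absent) else 0

def sort_value(doc):
    """Sum of the two criteria, as in the sum(...) expression."""
    return (
        criterion(doc, "CRITERIA1_FILTER", "PARTNER_CRITERIA1", "N")
        + criterion(doc, "CRITERIA2_FILTER", "PARTNER_CRITERIA2", "1")
    )

# Invented document: criterion 1 counts, criterion 2 does not.
doc = {
    "CRITERIA1_FILTER": ["Y"],
    "PARTNER_CRITERIA1": ["Y"],
    "CRITERIA2_FILTER": ["Y"],
    "PARTNER_CRITERIA2": ["1"],
}
print(sort_value(doc))
```

Each criterion contributes 1 only when its filter field contains Y and its partner field does not contain the given term, so the sort value ranges from 0 to 2.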

DovValues and in-place udpates

2018-02-12 Thread Brian Yee
I asked a question here about fast inventory updates last week and I was recommended to use docValues with partial in-place updates. I think this will work well, but there is a problem I can't think of a good solution for. Consider this scenario: InStock = 1 for a product. InStock changes to 0
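For reference, a partial update of a single field is expressed with a `set` operation in the update document. The sketch below only builds the JSON body; the unique key name `id` and the document id are assumptions, and whether Solr applies the change in place depends on how the field is defined in the schema.

```python
import json

def partial_update(doc_id, field, value):
    """Build an update body that sets one field on one document.

    "id" as the unique key name is an assumption for illustration.
    """
    return [{"id": doc_id, field: {"set": value}}]

# Hypothetical product id; InStock is the field named in the message.
payload = json.dumps(partial_update("product-123", "InStock", 0))
print(payload)
```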

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread Deepak Goel
Please test with a higher number of hits until CPU load reaches 100%. On 12 Feb 2018 19:44, "~$alpha`" wrote: > Hits 41: > Avg response time 470ms vs 380ms > CPU Load reaches 6 vs 10 > > Hits 82: > Avg response time 500ms vs 620ms

Null Pointer exception after upgrading lucene index from 6.1 to 7.2

2018-02-12 Thread Webster Homer
We ran the org.apache.lucene.index.IndexUpgrader as part of upgrading from 6.1 to 7.2.0. After the upgrade, one of our collections threw a NullPointerException on a query of *:*. We didn't observe errors in the logs. All of our other collections appear to be fine. Re-indexing the collection seems

Re: DovValues and in-place udpates

2018-02-12 Thread Charlie Hull
On 12/02/2018 16:02, Brian Yee wrote: I asked a question here about fast inventory updates last week and I was recommended to use docValues with partial in-place updates. I think this will work well, but there is a problem I can't think of a good solution for. Consider this scenario: InStock

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread ~$alpha`
Yes, but how do we move ahead now? It's strange that solr4 is behaving better than solr6 -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread Walter Underwood
25 of 32 CPUs loaded is 78% CPU, but I’ve never seen CPU use reported that way. How are you measuring CPU usage? What tool? wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Feb 12, 2018, at 7:08 AM, ~$alpha` wrote: > > I
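The arithmetic behind the 78% figure, as a small sketch. It simply divides the reported load average by the number of cores, which is a rough approximation rather than a measurement of CPU utilization.

```python
def load_ratio(load_average, cpu_count):
    """Load average expressed as a fraction of available cores."""
    return load_average / cpu_count

# 25 (reported load) on a 32-core machine
print(f"{load_ratio(25, 32):.0%}")
```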

Re: solrcloud Auto-commit doesn't seem reliable

2018-02-12 Thread Webster Homer
Erick, I am aware of the CDCR buffering problem causing tlog retention, we always turn buffering off in our cdcr configurations. My post was precipitated by seeing that we had uncommitted data in collections > 24 hours after it was loaded. The collections I was looking at are in our development

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread ~$alpha`
CPU utilization is already on the higher end, so VMs don't seem like a solution -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread ~$alpha`
top command in linux -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread Deepak Goel
One more idea could be to have multiple VMs (8 CPUs each) on your server and load balance them. That would help Solr6 scale nicely. On 12 Feb 2018 23:05, "Deepak Goel" wrote: > If the community cannot help, the only way I can think of is to > profile Solr (Java) under a

Re: Solr Expression Slow

2018-02-12 Thread ~$alpha`
It cannot be done at indexing time. It's based on logged-in user info -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

RE: DovValues and in-place udpates

2018-02-12 Thread Chris Hostetter
: True, I could remove the trigger to rebuild the entire document. But : what if a different field changes and the whole document is triggered : for update for a different field. We have the same problem. at a high level, your concern is really completely orthogonal to the question of in-place

Re: Solr Expression Slow

2018-02-12 Thread Erik Hatcher
I suggest applying that logic at index time and build yourself a SORT_CRITERIA field and use that rather than that sophisticated function that looks like it could collapse down to a single index-time field. Erik > On Feb 12, 2018, at 10:32 AM, ~$alpha` wrote:

Re: DovValues and in-place udpates

2018-02-12 Thread Erick Erickson
"But it also triggers a slow update that will rebuild the entire document..." Why do you think this? The whole _point_ of in-place updates is that they don't have to re-index the whole document And the only way to do that effectively would be if all the fields are stored, which is not a

Re: Purchase of support

2018-02-12 Thread Charlie Hull
On 12/02/2018 07:58, Hon Fook Boey wrote: Hi, May I know if support/maintenance can be purchased for SOLR? Hi, Various companies provide support for Solr (including us): what kind of support are you looking for? Best Charlie Thanks and regards, Boey HF eHoB Technology Sdn Bhd (Co

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread Deepak Goel
If the community cannot help, the only way I can think of is to profile Solr (Java) under a load test to find the problem. You could also use an APM. On 12 Feb 2018 23:00, "~$alpha`" wrote: > Yes, but how to move ahead now. > Its strange solr4 is better behaving

RE: DovValues and in-place udpates

2018-02-12 Thread Brian Yee
True, I could remove the trigger to rebuild the entire document. But what if a different field changes and the whole document is triggered for update for a different field. We have the same problem. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Monday,

Re: Solr search word NOT followed by another word

2018-02-12 Thread simon
Tim: How up to date is the Solr-5410 patch/zip in JIRA? Looking to use the Span Query parser in 6.5.1, migrating to 7.x sometime soon. Would love to see these committed! -Simon On Mon, Feb 12, 2018 at 10:41 AM, Allison, Timothy B. wrote: > That requires a

Re: Request routing / load-balancing TLOG & PULL replica types

2018-02-12 Thread Greg Roodt
Thanks Ere. I've taken a look at the discussion here: http://lucene.472066.n3.nabble.com/Limit-search-queries-only-to-pull-replicas-td4367323.html This is how I was imagining TLOG & PULL replicas would work, so if this functionality does get developed, it would be useful to me. I still have 2

Re: Request routing / load-balancing TLOG & PULL replica types

2018-02-12 Thread Tomas Fernandez Lobbe
> On Feb 12, 2018, at 12:06 PM, Greg Roodt wrote: > > Thanks Ere. I've taken a look at the discussion here: > http://lucene.472066.n3.nabble.com/Limit-search-queries-only-to-pull-replicas-td4367323.html > This is how I was imagining TLOG & PULL replicas would wor, so if this

Delta Import Failed to Complete

2018-02-12 Thread Perrin Bignoli
Hi, A couple of weeks ago, I ran into an unusual problem with Solr on which I could find previous discussion. I have a 4 node Solr cluster with 2 collections, ‘A’ and ‘B’. Each of the collections has 1 shard and 3 replicas. Both collections are updated with a delta-import that pulls from a

Re: Solr node is out of sync (looks Healthy)

2018-02-12 Thread Emir Arnautović
Hi Daniel, Please see inline comments. -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch Consulting Support Training - http://sematext.com/ > On 12 Feb 2018, at 13:13, Daniel Carrasco wrote: > > Hello, > > 2018-02-12 12:32 GMT+01:00 Emir

Re: "editorialMarkerFieldName"

2018-02-12 Thread Sadiki Latty
Thanks Chris, my colleague discovered this on Friday and shared it with me. Thanks for getting back to me. Sent from my iPhone > On Feb 12, 2018, at 5:13 PM, Chris Hostetter wrote: > > > https://issues.apache.org/jira/browse/SOLR-11977 > > : Date: Mon, 12 Feb

Re: solrcloud Auto-commit doesn't seem reliable

2018-02-12 Thread Erick Erickson
bq: But if 3 seconds is aggressive what would be a good value for soft commit? The usual answer is "as long as you can stand". All top-level caches are invalidated, autowarming is done etc. on each soft commit. That can be a lot of work and if your users are comfortable with docs not showing up

Issue Using JSON Facet API Buckets in Solr 6.6

2018-02-12 Thread Antelmo Aguilar
Hi, I was using the following part of a query to get facet buckets so that I can use the information in the buckets for some post-processing: "json":

Re: "editorialMarkerFieldName"

2018-02-12 Thread Chris Hostetter
IIUC the "editorialMarkerFieldName" config option is a bit misleading. Configuring that doesn't automatically add a field w/that name to your docs to indicate which of them have been elevated -- all it does is provide an *override* for what name can be used to refer to the "[elevated]"

Re: Solr Autoscaling multi-AZ rules

2018-02-12 Thread Noble Paul
>>Goal: No node should have more than 6 shards This is not possible today {"replica": "<7", "node":"#ANY"} , means don't put more than 7 replicas of the collection (irrespective of the shards) in a given node what do you mean by distinct 'RF' ? I think we are screwing up the terminologies a

Re: Search for a word NOT followed by another on a Solr query

2018-02-12 Thread Emir Arnautović
Hi Ivan, You might be able to use the complexphrase query parser to get what you need. You can test something like this: {!complexphrase df=my_field}"Leonardo -(da Vinci)" HTH, Emir -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch Consulting Support Training -

Re: Purchase of support

2018-02-12 Thread Emir Arnautović
Hi Boey, You can see Sematext’s support offer at https://sematext.com/support/solr-production-support/ In case you meant more like consulting type of support: https://sematext.com/consulting/solr/

Re: "editorialMarkerFieldName"

2018-02-12 Thread Chris Hostetter
https://issues.apache.org/jira/browse/SOLR-11977 : Date: Mon, 12 Feb 2018 14:44:34 -0700 (MST) : From: Chris Hostetter : To: "solr-user@lucene.apache.org" : Subject: Re: "editorialMarkerFieldName" : : : IIUC the

Re: Request routing / load-balancing TLOG & PULL replica types

2018-02-12 Thread Ere Maijala
Your question about directing queries to PULL replicas only has been discussed on the list. Look for topic "Limit search queries only to pull replicas". What I'd like to see is something similar to the preferLocalShards parameter. It could be something like "preferReplicaTypes=TLOG,PULL".

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread Emir Arnautović
Hi Birender, Do you monitor your heap? Is it possible that you are running close to max heap size and that GC is what is taking CPU? Emir -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch Consulting Support Training - http://sematext.com/ > On 11 Feb 2018, at

Re: Hard commits blocked | non-solrcloud v6.6.2

2018-02-12 Thread Emir Arnautović
I didn’t have time to look at the full thread dump, but noticed one thing in the pasted stack trace: AddSchemaFieldsUpdateProcessor.processAdd Is it possible that you are making a lot of changes to your schema? Emir -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch

Re: Request routing / load-balancing TLOG & PULL replica types

2018-02-12 Thread Greg Roodt
Thanks so much again Tomas! You've answered my questions and I clearly understand now. Great work! On 13 February 2018 at 09:13, Tomas Fernandez Lobbe wrote: > > > > On Feb 12, 2018, at 12:06 PM, Greg Roodt wrote: > > > > Thanks Ere. I've taken a look at

Index size increases disproportionately to size of added field when indexed=false

2018-02-12 Thread Howe, David
Hi, We are using Solr 7.1.0 to index a database of addresses. We have found that our index size increases massively when we add one extra field to the index, even though that field is stored and not indexed, and doesn’t contain a lot of data. When this occurs, we also observe a significant

Re: facet.method=uif not working in solr cloud?

2018-02-12 Thread Yonik Seeley
Feels like we should open an issue for this (that facet.method=uif is only respected if you specify another esoteric parameter...) -Yonik On Mon, Feb 12, 2018 at 8:34 PM, Wei wrote: > Adding facet.distrib.mco=true did the trick. Thanks Toke and Alessandro! > > Cheers, >

Re: facet.method=uif not working in solr cloud?

2018-02-12 Thread Wei
Adding facet.distrib.mco=true did the trick. Thanks Toke and Alessandro! Cheers, Wei On Thu, Feb 8, 2018 at 1:23 AM, Toke Eskildsen wrote: > On Fri, 2018-02-02 at 17:40 -0800, Wei wrote: > > I tried to debug a bit and see that when executing on a cloud solr > > server, although I

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread Yonik Seeley
On Sun, Feb 11, 2018 at 8:47 AM, ~$alpha` wrote: > I have upgraded Solr4.0 Beta to Solr6.6. The Cache results look Awesome but > overall the CPU load on solr6.6 is double the load on solr4.0 and hence I am > not able to roll solr6.6 to 100% of my traffic. > > *Some Key

Re: lat/long (location ) field context filters for autosuggestions

2018-02-12 Thread Emir Arnautović
Hi Deepak, Maybe it’s just me, but I am not sure what is the issue. Can you give some example what you are currently doing, what did you try and what issues did you run into. Thanks, Emir -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch Consulting Support

Re: Request routing / load-balancing TLOG & PULL replica types

2018-02-12 Thread Ere Maijala
2. In my experience using PULL replicas can have a significant positive effect on the server load. It depends of course on your analysis chain, but we do some fairly expensive analysis, and not having to do the same work X times does have a benefit. Unfortunately we need multiple shards so we