Re: how to get rid of double quotes in solr

2020-04-15 Thread Paras Lehana
Hi, Are you referring to the double quotes in the JSON result? On Tue, 14 Apr 2020 at 08:29, sefty nindyastuti wrote: > the data that I use is log from hadoop, my problem is hadoop log from > cluster, > the schema I use is filebeat --> logstash --> solr, I use logstash config > to parse the

Re: How upgrade to Solr 8 impact performance

2020-04-15 Thread Paras Lehana
In January, we upgraded Solr from version 6 to 8 skipping all versions in between. The hardware and Solr configurations were kept the same but we still faced degradation in response time by 30-50%. We had exceptional Query times around 25 ms with Solr 6 and now we are hovering around 36 ms.

How upgrade to Solr 8 impact performance

2020-04-15 Thread ChienHuaWang
Does anyone have experience upgrading an application from Solr 7.X to 8.X? How is the query performance? We found a slightly slower response time from the application with Solr 8 based on current measurements, and are still looking into it in more detail. But I'm wondering whether anyone has had a similar experience? is that

Re: ReversedWildcardFilter - should it be applied only at the index time?

2020-04-15 Thread TK Solr
It doesn't tell much: "debug":{ "rawquerystring":"email:*@aol.com", "querystring":"email:*@aol.com", "parsedquery":"(email:*@aol.com)", "parsedquery_toString":"email:*@aol.com", "explain":{ "11d6e092-58b5-4c1b-83bc-f3b37e0797fd":{ "match":true, "value":1.0, "description":"email:*@aol.com"},

Re: Unable to RESTORE collections via Collections API

2020-04-15 Thread Eugene Livis
In case this helps somebody in the future, given how completely unhelpful the Solr error message is - turns out the problem was occurring because the updateLog was disabled in solrconfig.xml. I have enabled updateLog the following way and the "restore" operation started working:

Re: ReversedWildcardFilter - should it be applied only at the index time?

2020-04-15 Thread Erick Erickson
What do you see if you add &debug=query? That should tell you…. Best, Erick > On Apr 15, 2020, at 2:40 PM, TK Solr wrote: > > Thank you. > > Is there any harm if I use it on the query side too? In my case it seems > working OK (even with withOriginal="false"), and even faster. > I see the query

Re: ReversedWildcardFilter - should it be applied only at the index time?

2020-04-15 Thread TK Solr
Thank you. Is there any harm if I use it on the query side too? In my case it seems to be working OK (even with withOriginal="false"), and even faster. I see the query parser code is taking a look at the index analyzer and applying ReversedWildcardFilter at query time. But I didn't quite understand what

Re: Defaults Merge Policy

2020-04-15 Thread Erick Erickson
The number of deleted documents will bounce around. The default TieredMergePolicy has a rather complex algorithm that decides which segments to merge, and the percentage of deleted docs in any given segment is a factor, but not the sole determinant. Merging is not really based on the raw number

Re: Solr index size has increased in solr 7.7.2

2020-04-15 Thread David Hastings
I wouldn't worry about the index size until you get above a half terabyte or so. Adding doc values and other features means you sacrifice things that don't matter, like size. Memory and SSDs are cheap. On Wed, Apr 15, 2020 at 1:21 PM Rajdeep Sahoo wrote: > Hi all > We are migrating from solr

Defaults Merge Policy

2020-04-15 Thread Kayak28
Hello, Solr Community: I would like to ask about the default merge policy for Solr 8.3.0. My client (SolrJ) makes a commit every 10,000 docs. I have not explicitly configured a merge policy via solrconfig.xml. Each time I index, some documents are updated or deleted. I think the Default Merge

Solr index size has increased in solr 7.7.2

2020-04-15 Thread Rajdeep Sahoo
Hi all, we are migrating from Solr 4.6 to Solr 7.7.2. In Solr 4.6 the index size was 2.5 GB, but in Solr 7.7.2 it is showing 6.8 GB with the same number of documents. Is this expected behavior, or are there any suggestions for how to optimize the size?

Re: Optimal size for queries?

2020-04-15 Thread Mark H. Wood
On Wed, Apr 15, 2020 at 10:09:59AM +0100, Colvin Cowie wrote: > Hi, I can't answer the question as to what the optimal size of rows per > request is. I would expect it to depend on the number of stored fields > being marshaled, and their type, and your hardware. It was a somewhat naive question,

Re: ZooKeeper 3.4 end of life

2020-04-15 Thread Jörn Franke
The problem with using TLS between Solr and ZK is the following: * 3.5.5 seems to support only TLS certificate authentication together with TLS. Solr supports only digest and Kerberos authentication. However, I have to check in the ZK jiras if this has changed with higher ZK versions *

Re: On the delay in electing a leader when the leader is dead(Solr 7.5)

2020-04-15 Thread Erick Erickson
There’s no way leader election, even with tlog replay should take a day. 10,000 docs/minute doesn’t sound like enough to clog up replay either, so something’s definitely not what I’d expect. What is your hard commit interval? That controls how big the tlog is and thus how long it’d take to

Re: facets & docValues

2020-04-15 Thread Erick Erickson
In a word, “yes”. I also suspect your corpus isn’t very big. I think the key is the facet queries. Now, I’m talking from theory rather than diving into the code, but querying on a docValues=true, indexed=false field is really doing a search. And searching on a field like that is effectively

Re: ZooKeeper 3.4 end of life

2020-04-15 Thread Erick Erickson
Good to hear and thanks for reporting back. The other thing ZK 3.5 allegedly makes easier is dynamically reconfiguring the ensemble. Again, haven’t personally tried it but I’d be cautious about that since Solr won’t be using the 3.5 jars and just dropping the 3.5 jars in for Solr to use would be

Re: Optimal size for queries?

2020-04-15 Thread Colvin Cowie
Hi, I can't answer the question as to what the optimal size of rows per request is. I would expect it to depend on the number of stored fields being marshaled, and their type, and your hardware. But using start + rows is a *bad thing* for deep paging. You need to use cursorMark, which looks like
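The message above is cut off in this listing, so what follows is not the author's example. It is a minimal sketch of cursorMark paging in general, assuming the usual Solr contract: the first request sends `cursorMark=*`, the sort includes the uniqueKey field, and each response carries a `nextCursorMark` that is sent back until it stops changing. The helper name `iter_docs_with_cursor`, the `fetch_page` callable, and the `id` uniqueKey are hypothetical; `fetch_page` stands in for whatever HTTP client calls `/select` and returns the parsed JSON.

```python
from typing import Any, Callable, Dict, Iterator


def iter_docs_with_cursor(
    fetch_page: Callable[[Dict[str, Any]], Dict[str, Any]],
    query: str,
    unique_key: str = "id",  # assumption: the collection's uniqueKey field
    rows: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield every matching document by following nextCursorMark."""
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        params = {
            "q": query,
            "rows": rows,
            # cursorMark requires a sort that includes the uniqueKey field
            "sort": f"{unique_key} asc",
            "cursorMark": cursor,
        }
        response = fetch_page(params)
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:
            # An unchanged cursor means there are no more results
            break
        cursor = next_cursor
```

Unlike start + rows, each request here resumes from the last sort position, so the cost of a page does not grow with how deep into the result set it is.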

Re: ReversedWildcardFilter - should it be applied only at the index time?

2020-04-15 Thread Colvin Cowie
You only need to apply it in the index analyzer: https://lucene.apache.org/solr/8_4_0/solr-core/org/apache/solr/analysis/ReversedWildcardFilterFactory.html If it appears in the index analyzer, the query part of it is automatically applied at query time. The ReversedWildcardFilter indexes *every*
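As a rough illustration of the idea discussed in this thread (not Solr's actual implementation), the sketch below shows why indexing a reversed form of each token turns a leading-wildcard query into a cheap prefix match. The function names are hypothetical, the marker character is an assumption for illustration, and the real filter's decision rules for when to reverse a query are more involved than this.

```python
MARKER = "\u0001"  # assumption: a marker that keeps reversed terms distinct


def index_forms(token: str, with_original: bool = True) -> list:
    """Return the terms to index for one token (original plus reversed)."""
    reversed_form = MARKER + token[::-1]
    return [token, reversed_form] if with_original else [reversed_form]


def rewrite_leading_wildcard(pattern: str) -> str:
    """Rewrite '*suffix' as a prefix pattern over the reversed terms."""
    only_leading_star = (
        pattern.startswith("*")
        and "*" not in pattern[1:]
        and "?" not in pattern
    )
    if only_leading_star:
        return MARKER + pattern[1:][::-1] + "*"
    return pattern  # other patterns are left untouched in this sketch


def matches(pattern: str, indexed_terms: list) -> bool:
    """Match a pattern against indexed terms (trailing '*' or exact only)."""
    rewritten = rewrite_leading_wildcard(pattern)
    if rewritten.endswith("*"):
        prefix = rewritten[:-1]
        return any(term.startswith(prefix) for term in indexed_terms)
    return rewritten in indexed_terms
```

With `terms = index_forms("user@aol.com")`, the query `*@aol.com` is rewritten to a prefix lookup on the reversed term, so `matches("*@aol.com", terms)` succeeds without scanning every term's suffix.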

Re: ZooKeeper 3.4 end of life

2020-04-15 Thread Bram Van Dam
On 09/04/2020 16:03, Bram Van Dam wrote: > Thanks, Erick. I'll give it a go this weekend and see how it behaves. > I'll report back so there's a record of my attempts in case anyone else > ends up asking the same question. Here's a quick update after non-exhaustive testing: Running SolrCloud