Re: REBALANCELEADERS is not reliable

2018-11-27 Thread Bernd Fehling
Hi Vadim, thanks for confirming. So it seems to be a general problem with Solr 6.x and 7.x, and it might still be there in the most recent versions. But where to start debugging this problem? Is something not stored correctly in ZooKeeper, or is the Overseer the problem? I was also reading something

Re: How to implement ssl for Solr cloud?

2018-11-27 Thread Zheng Lin Edwin Yeo
Hi, You can read the following first regarding the implementation of SSL. https://lucene.apache.org/solr/guide/7_5/enabling-ssl.html Regards, Edwin On Wed, 28 Nov 2018 at 01:42, John Milton wrote: > Hi Solr Team, > > In my Solr cloud cluster, I am having 3 Zookeeper external ensemble and 2 >

Re: one node too busy

2018-11-27 Thread Deepak Goel
You might have to use an APM tool (e.g. AppDynamics) to debug the busy Solr instance. Deepak

Re: Delete all, index all, end up with 1 segment with 50% deletes

2018-11-27 Thread Erick Erickson
Shawn's comment seems likely, somehow you're adding all the docs twice and only committing at the end. In that case there'd be only 1 segment. That's about the only way I can imagine your index has exactly one segment with exactly half the docs deleted. It'd be interesting for you to look at the

Re: Period on-line index optimization

2018-11-27 Thread Erick Erickson
And do note one implication of the link Shawn gave you. Now that you've optimized, you probably have one huge segment. It _will not_ be merged unless and until it has < 2.5G "live" documents. So you may see your percentage of deleted documents get quite a bit larger than you've seen before merging

Sanity check on dataimport handler -- what are the implications if status request returns error?

2018-11-27 Thread Shawn Heisey
What might the implications be if a DIH status request returns an error response other than a 404? A 404 says either the handler or the core probably don't exist. My guess, and I admit that I haven't read the code closely, is that if the handler exists but is so broken that it cannot return a

Haystack Relevance Conference Announced; CFP ends Jan 9!

2018-11-27 Thread Doug Turnbull
Hey everyone, Many of you may know about/have been to Haystack - The Search Relevance Conference. http://haystackconf.com We're excited to announce 2019's Haystack, April 22-25 in Charlottesville, VA, USA. Our CFP is due January 9th. We want to bring together practitioners that work on really

Re: Solr Delta Import Issue

2018-11-27 Thread Shawn Heisey
On 11/27/2018 10:32 AM, ~$alpha` wrote: SOLR VERSION 6.0.0 As seen in the image, There is a spike which can be observed at every 2 hours. i.e whenever delta import runs 1. Response time doubles 2. CPU load average doubles and if it

Re: Period on-line index optimization

2018-11-27 Thread Shawn Heisey
On 11/27/2018 10:04 AM, Christopher Schultz wrote: So, it's pretty much like GC promotion: the number of live objects is really the only thing that matters? That's probably a better analogy than most anything else I could come up with. Lucene must completely reconstruct all of the index

Re: Query regarding Dynamic Fields

2018-11-27 Thread Alexandre Rafalovitch
However, to add to Edward's message, you can with eDismax create synthetic field names that expand to multiple fields under the covers, using per-field 'qf' parameter. See: https://lucene.apache.org/solr/guide/7_5/the-extended-dismax-query-parser.html#field-aliasing-using-per-field-qf-overrides
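A minimal sketch of the per-field `qf` aliasing mentioned above, assuming a hypothetical alias `abc` that should expand to two hypothetical dynamic fields, `ABC_title` and `ABC_body`. It only builds the request parameters and does not contact a Solr server:

```python
from urllib.parse import urlencode


def edismax_alias_params(alias, fields, term):
    """Build eDismax parameters that make `alias` expand to several real fields.

    The alias and field names passed in are illustrative assumptions,
    not names taken from the original thread.
    """
    return {
        "defType": "edismax",
        # The query refers to the synthetic field name only.
        "q": f"{alias}:{term}",
        # f.<alias>.qf lists the real fields the alias stands for.
        f"f.{alias}.qf": " ".join(fields),
    }


params = edismax_alias_params("abc", ["ABC_title", "ABC_body"], "myValue")
query_string = urlencode(params)
```

The resulting query string can be appended to a collection's `/select` handler URL. Note that the real field names still have to be listed explicitly; the alias does not perform wildcard matching on field names.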

Re: Query regarding Dynamic Fields

2018-11-27 Thread Edward Ribeiro
You should provide the full name of the dynamic field in the query like q=s_myfield:foo, for example. Solr doesn't allow field prefix queries like q=s_*:foo. Edward On Nov 27, 2018 at 12:08, "jay harkhani" wrote: Hello All, We are using dynamic fields in our collection. We want to use
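As a rough illustration of the point above, the following sketch builds a select URL for a fully named dynamic field and refuses a wildcard in the field name. The base URL and the collection name `mycoll` are assumptions for the example, and the helper `build_select_url` is hypothetical, not part of any Solr client:

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, field, value):
    """Return a /select URL for a query on one fully named field."""
    # A field name such as "s_*" is rejected here because Solr does not
    # accept a wildcard in the field-name position of a query.
    if "*" in field:
        raise ValueError(f"wildcard not allowed in field name: {field!r}")
    query = urlencode({"q": f"{field}:{value}"})
    return f"{base_url}/{collection}/select?{query}"


url = build_select_url("http://localhost:8983/solr", "mycoll", "s_myfield", "foo")
```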

Re: Manage new nodes types limit

2018-11-27 Thread Edward Ribeiro
I don't know if you can promote a replica from PULL to TLOG, for example. You could accomplish this by deleting and then re-adding the replica, imho. Also, when adding a replica you can specify the type parameter (nrt, pull, tlog), see https://lucene.apache.org/solr/guide/7_4/collections-api.html#addreplica Edward
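A small sketch of an ADDREPLICA request that sets the `type` parameter described above. The base URL, collection name, and shard name are assumptions for illustration, and the helper `add_replica_url` is hypothetical. The code only constructs the URL and sends nothing:

```python
from urllib.parse import urlencode

# Replica types named in the message above.
REPLICA_TYPES = {"nrt", "tlog", "pull"}


def add_replica_url(base_url, collection, shard, replica_type):
    """Build a Collections API ADDREPLICA URL for the given replica type."""
    if replica_type not in REPLICA_TYPES:
        raise ValueError(f"unknown replica type: {replica_type!r}")
    query = urlencode({
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
        "type": replica_type,
    })
    return f"{base_url}/admin/collections?{query}"


pull_url = add_replica_url("http://localhost:8983/solr", "mycoll", "shard1", "pull")
```

Changing a replica from PULL to TLOG would then, under the delete-and-add approach suggested above, be a DELETEREPLICA call followed by an ADDREPLICA call with `type=tlog`.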

Re: Period on-line index optimization

2018-11-27 Thread Walter Underwood
There is one case where optimize makes sense. You do a full reload of content rarely, maybe once per day or once per week. You use a master/slave cluster. Your index isn’t huge (say under 1 million docs). We have exactly that setup for our textbook search. We do not run optimize. Our median

Re: Period on-line index optimization

2018-11-27 Thread Christopher Schultz
Walter, On 11/27/18 12:31, Walter Underwood wrote: > Optimize is just forcing a full merge. Solr does merges > automatically in the background. Understood. > It has been automatically doing merges for the months you’ve been > using it. Let it

How to implement ssl for Solr cloud?

2018-11-27 Thread John Milton
Hi Solr Team, In my SolrCloud cluster, I have an external ZooKeeper ensemble of 3 nodes and 2 SolrCloud instances. Do I need to implement SSL for all of the Solr instances? And does the SSL implementation require any additional configuration in ZooKeeper? Thanks, John Milton

Time-Routed Alias Not Distributing Wrongly Placed Docs

2018-11-27 Thread John Nashorn
Hello Everyone, I'm using "hive-solr" from Lucidworks to index my data into Solr (v:7.5, cloud mode). As written in the Solr Manual, TRA expects documents to be indexed using its alias name, and not directly into the collections under it. Unfortunately, hive-solr doesn't allow using TRA names

Solr Delta Import Issue

2018-11-27 Thread ~$alpha`
SOLR VERSION 6.0.0 As seen in the image, there is a spike every 2 hours, i.e. whenever the delta import runs: 1. Response time doubles 2. CPU load average doubles, and if it runs too near to peak hour then it goes even

Re: Period on-line index optimization

2018-11-27 Thread Walter Underwood
Optimize is just forcing a full merge. Solr does merges automatically in the background. It has been automatically doing merges for the months you’ve been using it. Let it continue. Don’t bother with optimize. It was a huge mistake to name that function “optimize”. Ultraseek had a button

Re: Period on-line index optimization

2018-11-27 Thread Christopher Schultz
Shawn, On 11/27/18 11:01, Shawn Heisey wrote: > On 11/27/2018 7:47 AM, Christopher Schultz wrote: >> I've got a single-core Solr instance with something like 1M small >> documents in it. It contains user information for fast-lookups, >> and it gets

one node too busy

2018-11-27 Thread Kudrettin Güleryüz
Hi, How can I debug what is causing occasional hiccup of our Solr cloud instance? When this issue happens, I can see that one of the nodes is too busy and the others are just doing fine. We use 6 nodes, 6 shards (1 shard per node), 1 replica for each collection. Can you please suggest tools to

Re: Period on-line index optimization

2018-11-27 Thread Shawn Heisey
On 11/27/2018 7:47 AM, Christopher Schultz wrote: I've got a single-core Solr instance with something like 1M small documents in it. It contains user information for fast-lookups, and it gets updated any time relevant user-info changes. Here's the basic info from the Core Dashboard: I'm

Manage new nodes types limit

2018-11-27 Thread Daniel Carrasco
Hello, I'm working on a new cluster and I want to build it with two TLOG nodes and N PULL nodes. For now I've found a way to scale the cluster by adding replicas when a node joins the cluster and deleting the replicas when it is unreachable (for example, after a shutdown), but all new replicas are

Re: Autoscaling using triggers to create new replicas

2018-11-27 Thread Daniel Carrasco
Hello, I've finally found the way to do it. The trick is to limit the number of replicas per node using policies: curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/api/cluster/autoscaling' --data-binary '{ "set-cluster-policy": [{"replica": "<2", "shard": "#EACH", "node":

RE: REBALANCELEADERS is not reliable

2018-11-27 Thread Vadim Ivanov
Hi Bernd, I have tried REBALANCELEADERS with Solr 6.3 and 7.5. I had very similar results and the same impression that it's not reliable :( -- Br, Vadim > -Original Message- > From: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de] > Sent: Tuesday, November 27, 2018 5:13 PM > To:

Period on-line index optimization

2018-11-27 Thread Christopher Schultz
All, I've got a single-core Solr instance with something like 1M small documents in it. It contains user information for fast-lookups, and it gets updated any time relevant user-info changes. Here's the basic info from the Core Dashboard: Last

Re: Delete all, index all, end up with 1 segment with 50% deletes

2018-11-27 Thread Shawn Heisey
On 11/27/2018 5:29 AM, Markus Jelsma wrote: A background batch process compiles a data set. When finished, it sends a delete all to its target collection, then everything gets sent by SolrJ, followed by a regular commit. When inspecting the core I notice it has one segment with 9578

REBALANCELEADERS is not reliable

2018-11-27 Thread Bernd Fehling
Hi list, unfortunately REBALANCELEADERS is not reliable and the leader election has unpredictable results with SolrCloud 6.6.5 and Zookeeper 3.4.10. Seen with 5 shards / 3 replicas. - CLUSTERSTATUS reports all replicas (core_nodes) as state=active. - setting with ADDREPLICAPROP the property

FW: Sort index by size

2018-11-27 Thread Srinivas Kashyap
Hi Shawn and everyone who replied to the thread, The solr version is 5.2.1 and each document is returning multi-valued fields for majority of fields defined in schema.xml. I'm in the process of pasting the content of my files to a paste website and soon will update. Thanks, Srinivas On

Query regarding Dynamic Fields

2018-11-27 Thread jay harkhani
Hello All, We are using dynamic fields in our collection. We want to use them in a query to fetch records. Can someone please advise on this? For example: q=ABC_*:"myValue" Here "ABC_*" is the dynamic field. Currently, when we provide the field name as above, it gives "org.apache.solr.search.SyntaxError".

Delete all, index all, end up with 1 segment with 50% deletes

2018-11-27 Thread Markus Jelsma
Hello, A background batch process compiles a data set. When finished, it sends a delete all to its target collection, then everything gets sent by SolrJ, followed by a regular commit. When inspecting the core I notice it has one segment with 9578 documents, of which exactly half are deleted.