Parallel merge of indexes

2020-02-04 Thread Erol Akarsu
I need some help merging indexes in parallel, in a much faster way. I am using the IndexMergeTool provided by Lucene, but it seems very slow. Is there a way to speed up the process? What I do is make 16 shards with no replication and then add a replica for every node and every shard. In the last

Re: Filtered join in Solr?

2020-02-04 Thread Edward Ribeiro
Just for the sake of an imagined scenario, you could use the [subquery] doc transformer. A query like the one below: /select?q=family:Smith&fq=watched_movies:[* TO *]&fl=*,movies:[subquery]&movies.q={!terms f=id v=$row.watched_movies} would bring back the results below: { "responseHeader":{ "status":0,
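A minimal sketch of how the request in the message above could be assembled with proper URL encoding. The query string in the archived preview has lost its parameter separators, so the parameter names used here (`fq`, `fl`, `movies.q`) are an assumption based on the usual [subquery] transformer syntax; the base URL is taken from the original question in this thread, and the helper name is hypothetical. No request is sent.

```python
from urllib.parse import urlencode


def build_subquery_url(base_url, family):
    """Build a /select URL that attaches watched movies to each family document.

    Parameter names are assumed; the archived message shows them garbled.
    """
    params = [
        ("q", f"family:{family}"),
        # Keep only documents that actually have watched_movies values.
        ("fq", "watched_movies:[* TO *]"),
        # Return all fields plus a "movies" pseudo-field filled by the subquery.
        ("fl", "*,movies:[subquery]"),
        # The subquery looks up movies whose id is in the row's watched_movies.
        ("movies.q", "{!terms f=id v=$row.watched_movies}"),
    ]
    return f"{base_url}/select?{urlencode(params)}"


subquery_url = build_subquery_url("http://localhost:8983/solr/test", "Smith")
print(subquery_url)
```

Encoding the parameters through `urlencode` avoids the problem visible in the preview, where special characters such as `&`, `{`, and `$` are easily mangled when a query is pasted as raw text.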

ID is a required field in SolrSchema . But not found in DataConfig

2020-02-04 Thread Karl Stoney
Hey all, I'm trying to use the DIH to copy from one collection to another. It appears to work (data gets copied); however, I've noticed this in the logs: 17:39:58.167 [qtp1472216456-87] INFO org.apache.solr.handler.dataimport.config.DIHConfiguration - ID is a required field in SolrSchema . But

Re: Handling All Replicas Down in Solr 8.3 Cloud Collection

2020-02-04 Thread Joseph Lorenzini
Here's roughly what was going on: 1. set up three node cluster with a collection. The collection has one shard and three replicas for that shard. 2. Shut down two of the nodes and verify the remaining node is the leader. Verified the other two nodes are registered as dead in solr ui.

Re: Exact search in Solr

2020-02-04 Thread yeikel valdes
You can store a non-analyzed version and copy it to an analyzed field. If you need full-text search, you use the analyzed version. Otherwise, use the non-analyzed version. If you want to search both, you could still do that and boost the non-analyzed version if needed. On Tue, 04 Feb 2020
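A rough sketch of the "search both and boost the non-analyzed version" idea from the message above, expressed as a Lucene-syntax query string. The field names `text_exact` and `text_analyzed`, the boost value, and the helper name are all hypothetical; the real names depend on how the two fields are defined in the schema.

```python
def exact_first_query(term, exact_field="text_exact",
                      analyzed_field="text_analyzed", boost=5):
    """Return a query that matches either field but ranks exact matches higher.

    Field names and the boost factor are illustrative assumptions.
    """
    # Escape backslashes and double quotes so the term is safe inside a phrase.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    phrase = f'"{escaped}"'
    return f"{exact_field}:{phrase}^{boost} OR {analyzed_field}:{phrase}"


print(exact_first_query("secret"))
# text_exact:"secret"^5 OR text_analyzed:"secret"
```

Dropping the second clause gives an exact-only search, which corresponds to using the non-analyzed field on its own.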

Filtered join in Solr?

2020-02-04 Thread Radu Gheorghe
Hello Solr users, How would you design a filtered join scenario? Say I have a bunch of movies (excuse any inaccuracies, this is an imagined scenario): curl -XPOST -H 'Content-Type: application/json' 'localhost:8983/solr/test/update?commitWithin=1000' --data-binary ' [{ "id": "1", "title":

Re: Handling All Replicas Down in Solr 8.3 Cloud Collection

2020-02-04 Thread Erick Erickson
First, be sure to wait at least 3 minutes before concluding the replicas are permanently down, that’s the default wait period for certain leader election fallbacks. It’s easy to conclude it’s never going to recover, 180 seconds is an eternity ;). You can try the collections API FORCELEADER
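For reference, a minimal sketch of what a Collections API FORCELEADER request URL looks like. The host, collection name, and shard name below are placeholders rather than values from this thread, and the helper name is hypothetical; the snippet only builds the URL and does not contact a cluster.

```python
from urllib.parse import urlencode


def force_leader_url(solr_base, collection, shard):
    """Build the Collections API URL that asks Solr to force a leader election."""
    params = {
        "action": "FORCELEADER",
        "collection": collection,  # placeholder collection name
        "shard": shard,            # placeholder shard name
    }
    return f"{solr_base}/admin/collections?{urlencode(params)}"


leader_url = force_leader_url("http://localhost:8983/solr", "mycollection", "shard1")
print(leader_url)
# http://localhost:8983/solr/admin/collections?action=FORCELEADER&collection=mycollection&shard=shard1
```

As the message above notes, this is something to try only after giving the replicas enough time to recover on their own.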

Handling All Replicas Down in Solr 8.3 Cloud Collection

2020-02-04 Thread Joseph Lorenzini
Hi all, I have a 3-node Solr Cloud instance with a single collection. The Solr nodes are pointed to a 3-node ZooKeeper ensemble. I was doing some basic disaster recovery testing and have encountered a problem that I haven't found an obvious way to fix. After I started back up the three solr

Re: How to compute index size

2020-02-04 Thread Andrzej Białecki
If you’re using Solr 8.2 or newer there’s a built-in index analysis tool that gives you a better understanding of what kind of data in your index occupies the most disk space, so that you can tweak your schema accordingly:

Re: Exact search in Solr

2020-02-04 Thread Mikhail Khludnev
Hello, Łukasz. The latter for sure. On Tue, Feb 4, 2020 at 12:44 PM Antczak, Lukasz wrote: > Hi, Solr experts! > > I would like to learn from you if there is a better solution for doing > 'exact search' in Solr. > Exact search means no analysis for the text other than tokenization. Query >

Exact search in Solr

2020-02-04 Thread Antczak, Lukasz
Hi, Solr experts! I would like to learn from you if there is a better solution for doing 'exact search' in Solr. Exact search means no analysis for the text other than tokenization. Query "secret" gives back only documents containing exactly "secret", not "secrets", "secretion", etc. Text that

How can shards distributed evenly among nodes

2020-02-04 Thread Yuan Zhao
Hi Team, We are using an autoscaling policy, and we make use of the utilize node feature to move replicas to a new node. But we found that after the replicas are moved, Solr can make sure the replicas belonging to the same shard are located on different nodes, but it cannot make sure the shards are distributed evenly on all the

Re: Haystack CFP is open, come and tell us how you tune relevance for Lucene/Solr

2020-02-04 Thread Charlie Hull
Hi all, You have until this Friday to submit a talk to Haystack! Very much looking forward to your submissions. Charlie On 27/01/2020 21:53, Doug Turnbull wrote: Just an update: the CFP was extended to Feb 7th, less than 2 weeks away. -> http://haystackconf.com It's your ethical imperative