Re: Solr Cloud wiping all cores when restart without proper zookeeper directories

2019-01-09 Thread Yogendra Kumar Soni
I have an existing collection http://10.2.12.239:11080/solr/test/select?q=*:*&rows=0 { "responseHeader":{ "zkConnected":true, "status":0, "QTime":121, "params":{ "q":"*:*", "rows":"0"}}, "response":{"numFound":150,"start":0,"maxScore":1.0,"docs":[] }} ls ls

Re: Solr Query running slow in Prod node

2019-01-09 Thread Zheng Lin Edwin Yeo
Hi, You have to check whether both environments use the same configuration, and whether the production Solr server has other programs running. Also, the query performance might be affected if there is indexing going on at the same time. Regards, Edwin On Thu, 10 Jan 2019 at 06:50, Dasarathi Minjur

Single query to get the count for all individual collections

2019-01-09 Thread Zheng Lin Edwin Yeo
Hi, I would like to find out, is there any way that I can send a single query to retrieve the numFound for all the individual collections? I have tried with this query http://localhost:8983/solr/collection1/select?q=*:*&collection=collection1,collection2 However, this query is doing the sum of all the
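A client-side workaround for the question above can be sketched as follows. This is not the answer given in the thread (the excerpt is cut off): it issues one `rows=0` request per collection rather than a single query, and the helper names `count_url`, `num_found`, `http_get` and `counts_per_collection` are hypothetical. It assumes the JSON response shape shown elsewhere in this listing (`response.numFound`).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def count_url(base_url, collection):
    """Build a select URL that matches every document but returns none."""
    query = urlencode({"q": "*:*", "rows": 0})
    return f"{base_url}/{collection}/select?{query}"


def num_found(body):
    """Extract numFound from a Solr JSON select response body."""
    return json.loads(body)["response"]["numFound"]


def http_get(url):
    """Default fetcher: plain HTTP GET returning the body as text."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


def counts_per_collection(base_url, collections, fetch=http_get):
    """Return {collection: numFound}, using one request per collection."""
    return {
        name: num_found(fetch(count_url(base_url, name)))
        for name in collections
    }
```

Calling `counts_per_collection("http://localhost:8983/solr", ["collection1", "collection2"])` would return the counts separately instead of their sum; the `fetch` parameter exists only so the function can be exercised without a running Solr.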

Re: Bootstrapping a Collection on SolrCloud

2019-01-09 Thread Erick Erickson
First, for a given data set, I can easily double or halve the size of the index on disk depending on what options I choose for my fields; things like how many times I may need to copy fields to support various use-cases, whether I need to store the input for some, all or no fields, whether I

Re: Bootstrapping a Collection on SolrCloud

2019-01-09 Thread Frank Greguska
Thanks, I am no Solr expert so I may be over-simplifying things a bit in my ignorance. "No. The replicas are in a "down" state until the Solr instances are brought back up" Why can't I dictate (at least initially) the "up" state somehow? It seems Solr keeps track of where replicas were deployed so that

Re: Is there a recommended open source GUI tool for monitoring 'zookeeper'?

2019-01-09 Thread Otis Gospodnetić
Hi, Sematext's monitoring agent with a ZooKeeper integration is open-source: https://github.com/sematext/sematext-agent-java The ZK integration is at https://github.com/sematext/sematext-agent-integrations/tree/master/zookeeper (Solr and SolrCloud integrations are in the same repo) If you can't

Is there a recommended open source GUI tool for monitoring 'zookeeper'?

2019-01-09 Thread 유정인
Hi Is there a recommended open source GUI tool for monitoring 'zookeeper'?

Re: Bootstrapping a Collection on SolrCloud

2019-01-09 Thread Erick Erickson
bq. do all 100 replicas move to the one remaining node? No. The replicas are in a "down" state until the Solr instances are brought back up (I'm skipping autoscaling here, but even that wouldn't move all the replicas to the one remaining node). bq. what the collection *should* look like based on the

Re: Bootstrapping a Collection on SolrCloud

2019-01-09 Thread Frank Greguska
Thanks for the response. You do raise good points. Say I reverse your example and I have a 10 node cluster with a 10-shard collection and a replication factor of 10. Now I kill 9 of my nodes, do all 100 replicas move to the one remaining node? I believe the answer is, well that depends on the

Solr Query running slow in Prod node

2019-01-09 Thread Dasarathi Minjur
Hello, We have a Solr query that runs much slower in our production Solr cluster than in lower environments. (Yes, it may not be an apples-to-apples comparison, but it's really slow in prod as HDFS gets pounded.) What are the general ways to track/troubleshoot slowness in a query? Is there any

Re: Bootstrapping a Collection on SolrCloud

2019-01-09 Thread Erick Erickson
How would you envision that working? When would the replicas actually be created and under what heuristics? Imagine this is possible, and there are a bunch of placeholders in ZK for a 10-shard collection with a replication factor of 10 (100 replicas all told). Now I bring up a single Solr

Re: Re: Re: Page faults

2019-01-09 Thread Erick Erickson
bq: We could create 2 separate collections. - Requires re-indexing - Code changes in our APIs and indexing process - Lost ability to query all the docs at once *** *** Not quite true. You can create an alias that points to multiple collections. HOWEVER, since the scores are computed using
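The alias mentioned above is created through the Collections API `CREATEALIAS` action. A minimal sketch of building that request follows; the function name `create_alias_url` and the alias and collection names (`all_docs`, `class_a`, `class_b`) are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collections):
    """Build a Collections API URL that points one alias at several collections."""
    params = urlencode({
        "action": "CREATEALIAS",
        "name": alias,
        # Solr expects a comma-separated list; urlencode escapes the commas.
        "collections": ",".join(collections),
    })
    return f"{base_url}/admin/collections?{params}"


# Hypothetical names: an alias spanning the two split collections.
print(create_alias_url("http://localhost:8983/solr", "all_docs", ["class_a", "class_b"]))
```

Queries sent to the alias name would then fan out to both collections, subject to the scoring caveat the message goes on to raise.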

Re: Re: Re: Page faults

2019-01-09 Thread Branham, Jeremy (Experis)
Thanks for the information Erick – I’ve learned there are 2 ‘classes’ of documents being stored in this collection. There are about 4x as many documents in class A as class B. When the documents are indexed, the document ID includes the key prefix like ‘A/1!’ or ‘B/1!’, which I understand spreads

Re: Solr Cloud wiping all cores when restart without proper zookeeper directories

2019-01-09 Thread lstusr 5u93n4
We've seen the same thing on solr 7.5 by doing: - create a collection - add some data - stop solr on all servers - delete all contents of the solr node from zookeeper - start solr on all nodes - create a collection with the same name as in the first step When doing this, solr wipes out the

Re: Concurrent User

2019-01-09 Thread Shawn Heisey
On 1/9/2019 9:00 AM, Senthil0809 wrote: I am new to this tool and we are planning to implement Apache Solr for a search and match process. Here are some of my requirements: 1. We have 500 concurrent users for the search and match process to pull the details from record 2. Around 5

Bootstrapping a Collection on SolrCloud

2019-01-09 Thread Frank Greguska
Hello, I am trying to bootstrap a SolrCloud installation and I ran into an issue that seems rather odd. I see it is possible to bootstrap a configuration set from an existing SOLR_HOME using ./server/scripts/cloud-scripts/zkcli.sh -zkhost ${ZK_HOST} -cmd bootstrap -solrhome ${SOLR_HOME} but

Concurrent User

2019-01-09 Thread Senthil0809
Hi Team, I am new to this tool and we are planning to implement Apache Solr for a search and match process. Here are some of my requirements: 1. We have 500 concurrent users for the search and match process to pull the details from record 2. Around 5 request will happen on daily

Re: REBALANCELEADERS is not reliable

2019-01-09 Thread Erick Erickson
Executive summary: The central problem is "how can I insert an ephemeral node in a specific place in a ZK queue". The code could be much, much simpler if there were a reliable way to do just that. I haven't looked at more recent ZKs to see if it's possible, I'd love it if there were a better way.

Re: Web Server HTTP Header Internal IP Disclosure SOLR port

2019-01-09 Thread Gus Heck
This sounds like something that might crop up if the admin UI were exposed to an alternate (or public) network space through a tunnel or proxy. The server knows nothing about the proxy/tunnel, and the cloud page has nice clickable machine names that point at the internal dns or ip names of the

Re: how to recover state.json files

2019-01-09 Thread Gus Heck
Not a direct solution, but manipulating data in Zookeeper can be made easier with https://github.com/rgs1/zk_shell On Wed, Jan 9, 2019 at 10:26 AM Erick Erickson wrote: > How did you "lose" the data? Exactly what happened? > > Where does the dataDir variable point in your > zoo.cfg file? By

Re: how to recover state.json files

2019-01-09 Thread Erick Erickson
How did you "lose" the data? Exactly what happened? Where does the dataDir variable point in your zoo.cfg file? By default it points to /tmp/zookeeper, which can be deleted by the operating system when the machine is restarted. Otherwise you can get/put arbitrary znodes by using "bin/solr zk cp".
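The dataDir check described above can be sketched as a small script. This is an illustration only, assuming the usual `key=value` layout of zoo.cfg; the helper names `zk_data_dir` and `is_volatile` are hypothetical.

```python
def zk_data_dir(cfg_text):
    """Return the dataDir value from zoo.cfg text, or None if it is not set."""
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "dataDir":
            return value.strip()
    return None


def is_volatile(data_dir):
    """True when dataDir sits under /tmp, which the OS may clear on restart."""
    return data_dir is not None and (
        data_dir == "/tmp" or data_dir.startswith("/tmp/")
    )
```

Reading the file with `open("conf/zoo.cfg").read()` (path assumed) and passing the text to `zk_data_dir` shows whether ZooKeeper state is being kept somewhere that survives a reboot.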

Re: Solr Cloud wiping all cores when restart without proper zookeeper directories

2019-01-09 Thread Erick Erickson
Solr doesn't just remove directories, this is very likely something in your environment that's doing this. In any case, there's no information here to help diagnose. You must tell us _exactly_ what steps you take in order to have any hope of helping. Best, Erick On Wed, Jan 9, 2019 at 2:48 AM

Re: Haystack Relevance Conference Announced; CFP ends Jan 9!

2019-01-09 Thread Charlie Hull
Hi all, Just to let you know the CFP has been extended until January 30th and we're really looking forward to seeing your proposals! http://haystackconf.com Cheers Charlie On 27/11/2018 22:33, Doug Turnbull wrote: Hey everyone, Many of you may know about/have been to Haystack - The

Re: how to recover state.json files

2019-01-09 Thread Bernd Fehling
Have you lost dataDir from all zookeepers? If not, first take a backup of the remaining dataDir and then start that zookeeper. Use ZooInspector to connect to dataDir at localhost and get your state.json including all other configs and settings. On 09.01.19 at 12:25, Yogendra Kumar Soni wrote:

how to recover state.json files

2019-01-09 Thread Yogendra Kumar Soni
How can we find attributes like shard names and hash ranges with their associated core names if we have lost the state.json file from zookeeper? core.properties only contains core-level information, but hash ranges are not stored there. Does Solr store collection information and shard information anywhere? --

Solr Cloud wiping all cores when restart without proper zookeeper directories

2019-01-09 Thread Yogendra Kumar Soni
We are running a Solr Cloud cluster using Solr 7.4 with 8 shards. When we started our Solr Cloud with a zookeeper node (without the collections directory but with only solr.xml and configs), our data directory containing core.properties and the cores' data became empty. -- *Thanks and Regards,* *Yogendra

Re: REBALANCELEADERS is not reliable

2019-01-09 Thread Bernd Fehling
Yes, your findings are also very strange. I wonder if we can discover the "inventor" of all this and ask him how it should work or better how he originally wanted it to work. Comments in the code (RebalanceLeaders.java) state that it is possible to have more than one electionNode with the same