Re: Getting rid of zookeeper

2020-06-09 Thread Dave
Is it horrible that I’m already burnt out from just reading that? I’m going to stick to the classic Solr master/slave setup for the foreseeable future; at least that lets me focus more on the search theory rather than on the back-end system nonstop. > On Jun 9, 2020, at 5:11 PM, Vincenzo

Re: Getting rid of zookeeper

2020-06-09 Thread Erick Erickson
The intermediate solution is to migrate to Curator. I don’t know all the ins and outs of that and whether or not it would be easier to set up and maintain. I do know that Zookeeper is deeply embedded in Solr and replacing it with almost anything would be a major pain. I’m also certain that

Re: Getting rid of zookeeper

2020-06-09 Thread Vincenzo D'Amore
My 2 cents: I have a few SolrCloud production installations, and I would like to share some thoughts on what I have learned over the last 4-5 years (fwiw), just as they come to mind. - To configure a SolrCloud *production* cluster you have to be a Zookeeper expert even if you only need Solr. - the Zookeeper

combined multiple bf into a single bf

2020-06-09 Thread Derek Poh
I have the following boost requirements using bf: response_rate is 3, boost by ^0.6; response_rate is 2, boost by ^0.3; response_time is 4, boost by ^0.6; response_time is 3, boost by ^0.3. I am using a bf for each of the boost requirements,

Re: Solr takes time to warm up core with huge data

2020-06-09 Thread Erick Erickson
I’d ignore the form of the query for the present, I think that’s a red herring. Start by taking all your sort clauses off. Then add them back one by one (you have to restart Solr between these experiments). My bet: your problem is “uninverting” and you’ll see your startup speed get worse the

Re: Atomic updates with add-distinct in Solr 7 cloud

2020-06-09 Thread Munendra S N
While checking this, I encountered a bug in master. Since you have mentioned that it works in Solr 8, the bug might be unrelated to your issue. I have raised a ticket to track the bug: https://issues.apache.org/jira/browse/SOLR-14550 Regards, Munendra S N On Mon, Jun 8, 2020 at

Re: Lucene-Solr project split

2020-06-09 Thread Dominique Bejean
Thank you. Dominique On Tue, Jun 9, 2020 at 3:18 PM, Ilan Ginzburg wrote: > See also https://issues.apache.org/jira/browse/SOLR-14521 > > On Tue, Jun 9, 2020 at 3:17 PM Ilan Ginzburg wrote: > >> Yes. >> >> >>

Re: How to determine why solr stops running?

2020-06-09 Thread Erick Erickson
To add to what Dave said, if you have a particular machine that’s prone to suddenly stopping, that’s usually a red flag that you should seriously think about hardware issues. If the problem strikes different machines, then I agree with Shawn that the first thing I’d be suspicious of is OOM

Re: Solrcloud 6.6 becomes nuts

2020-06-09 Thread Dominique Bejean
Hi, We had the problem again a few days ago. I have noticed that each time the problem occurs, the old generation of the heap suddenly grows. Its size is generally between 0.5 and 1.5 GB, against a 3 GB limit. Within 4 minutes the old generation grows to 3 GB and never goes down, as consecutive GC reclaims 0

Error when trying to create a core in solr

2020-06-09 Thread Jim Anderson
Hi, I am running Solr-7.3.1. I have just untarred the Solr-7.3.1 area and created a 'nutch' directory for the core. I have downloaded nutch-master.zip from https://github.com/apache/nutch, unzipped that file and copied schema.xml to .../server/solr/configsets/nutch/conf/schema.xml In the schema

Re: Lucene-Solr project split

2020-06-09 Thread Ilan Ginzburg
Yes. https://lists.apache.org/thread.html/raab13cabe321d12b6cda7dc6e529176f51ece31d30f00997dd36570a%40%3Cdev.lucene.apache.org%3E Ilan On Tue, Jun 9, 2020 at 3:10 PM Dominique Bejean wrote: > Hi, > > One of my clients claims that the Lucene-Solr project will split into two > separate projects

Re: Lucene-Solr project split

2020-06-09 Thread Ilan Ginzburg
See also https://issues.apache.org/jira/browse/SOLR-14521 On Tue, Jun 9, 2020 at 3:17 PM Ilan Ginzburg wrote: > Yes. > > > https://lists.apache.org/thread.html/raab13cabe321d12b6cda7dc6e529176f51ece31d30f00997dd36570a%40%3Cdev.lucene.apache.org%3E > > Ilan > > On Tue, Jun 9, 2020 at 3:10 PM

SyntaxError on Tagging filters in fq

2020-06-09 Thread AJALA, Marwan
Hello, Since I am not sure whether it’s a bug or a known issue (I could not find any JIRA ticket related to this error), or even whether it is on purpose, I am writing to this mailing list first (before opening a JIRA ticket). We are using the tagging and excluding syntax in Solr and we have noticed that when

Lucene-Solr project split

2020-06-09 Thread Dominique Bejean
Hi, One of my clients claims that the Lucene-Solr project will split into two separate projects after a vote of the community. I cannot find any trace of discussions on this subject. Is it true? Regards. Dominique

Re: Error when trying to create a core in solr

2020-06-09 Thread Jim Anderson
Hi Erick, I probably should have included information about the config directory. As part of the setup, I had copied the config directory as follows: $ cp -r /usr/share/solr-8.5.1/server/solr/configsets/_default/* . Note that the copy was from solr-8.5.1 because I could not find a '_default'

Indexing error when using Category Routed Alias

2020-06-09 Thread Tom Evans
Hi all 1. Set up a simple 1-node SolrCloud test setup using docker-compose, solr:8.5.2, zookeeper:3.5.8. 2. Upload a configset 3. Create two collections, one standard collection, one CRA, both using the same configset legacy: action=CREATE=products_old=products=true=1=-1 CRA: { "create-alias":

Re: Error when trying to create a core in solr

2020-06-09 Thread Erick Erickson
You need the entire config directory for a start, not just the schema file. And there’s no need to copy things around, just path to the nutch-provided config directory and you can leave off the “conf” since the upload process automatically checks for it and does the right thing. Best, Erick >

Solr Terms browsing in descending order

2020-06-09 Thread Jigar Gajjar
Hi, Thanks for following up on this one. terms.sort can work on index (only asc, which is the default); the other option is count, which does not help. We are using facets for getting terms but that has other issues too. I was thinking that if we can get terms in descending order then it will make our life much

Re: Getting to grips with auto-scaling

2020-06-09 Thread Tom Evans
Hi Radu Thanks for the reply - I'm starting to look that way myself, to create a different collection for each set of data, that way I can control more easily the scaling on each collection, eg to increase replication factor on those that will be queried more. I was looking at Category Routed

Getting rid of zookeeper

2020-06-09 Thread S G
Hello, I recently stumbled across KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum. Elasticsearch does this too. And so do many other systems. Is there some work to

Re: Getting rid of zookeeper

2020-06-09 Thread Walter Underwood
Zookeeper was created because fault-tolerant algorithms are extremely hard to test and get correct. Maybe the hardest thing in computing. Using a trusted implementation frees up lots of developer time. To get an idea of the difficulty, read through the kinds of things fixed in the Zookeeper

Re: Getting rid of zookeeper

2020-06-09 Thread David Hastings
Zookeeper is annoying to both set up and manage, but then again the same thing can be said about Solr Cloud. Not certain why you would want to deal with either. On Tue, Jun 9, 2020 at 3:29 PM S G wrote: > Hello, > > I recently stumbled across KIP-500: Replace ZooKeeper with a Self-Managed >

Re: Fw: TolerantUpdateProcessorFactory not functioning

2020-06-09 Thread Thomas Corthals
If your XML or JSON can't be parsed, your content never makes it to the update chain. It looks like you're trying to index non-UTF-8 data. You can set the encoding of your XML in the Content-Type header of your POST request. -H 'Content-Type: text/xml; charset=GB18030' JSON only allows UTF-8,
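As a rough illustration of the Content-Type advice above, here is a minimal Python sketch that builds (but does not send) an XML update request whose body encoding matches the declared charset. The function name, the localhost URL and the collection name `mycollection` are assumptions made for the example, not details from this thread.

```python
import urllib.request

# Hypothetical endpoint: host, port and collection name are assumptions.
DEFAULT_UPDATE_URL = "http://localhost:8983/solr/mycollection/update"


def build_update_request(xml_text, charset, url=DEFAULT_UPDATE_URL):
    """Build a POST request whose body is encoded in the declared charset."""
    # Encode the body with the same charset that the header announces,
    # so the server decodes the bytes the way they were written.
    body = xml_text.encode(charset)
    headers = {"Content-Type": "text/xml; charset=" + charset}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Example: prepare a request for GB18030-encoded XML.
req = build_update_request("<add/>", "GB18030")
```

Passing the returned object to `urllib.request.urlopen` would actually send it; that step is left out here so the sketch runs without a Solr server.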

Re: How to determine why solr stops running?

2020-06-09 Thread Shawn Heisey
On 5/14/2020 7:22 AM, Ryan W wrote: I manage a site where solr has stopped running a couple times in the past week. The server hasn't been rebooted, so that's not the reason. What else causes solr to stop running? How can I investigate why this is happening? Any situation where Solr stops

Re: Fw: TolerantUpdateProcessorFactory not functioning

2020-06-09 Thread Hup Chen
Oh, I got it, that's not an indexing error! Seems like I need to remove all the characters between [\x0-\x1F] (except \x9 TAB, \xA LF, \xD CR) first. Thanks a lot! From: Shawn Heisey Sent: Tuesday, June 9, 2020 3:19 PM To: solr-user@lucene.apache.org Subject:
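The clean-up described above could look something like the following Python sketch. It strips the control characters in the \x00-\x1F range while keeping TAB, LF and CR, which are the only characters in that range that XML 1.0 allows. The function name is made up for the example, and applying it to each field value before building the XML is an assumption about the workflow.

```python
import re

# Control characters in U+0000..U+001F, excluding TAB (\x09), LF (\x0a)
# and CR (\x0d), which XML 1.0 permits.
_INVALID_XML_CTRL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_control_chars(text):
    """Return text with the forbidden control characters removed."""
    return _INVALID_XML_CTRL.sub("", text)


# Example: a stray \x1e (record separator) inside a title is dropped,
# while the tab and newline survive.
cleaned = strip_control_chars("Miss\x1eing\tTitle\n")
```

Removing the characters outright is the simplest choice; replacing them with a space instead would preserve word boundaries if a control character happens to sit between two words.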

Re: Fw: TolerantUpdateProcessorFactory not functioning

2020-06-09 Thread Hup Chen
Thanks for your reply, this is one of the examples where it fails. POSTing with charset=utf-8 or another charset didn't help with the CTRL-CHAR "^" error found in the title field. I hope Solr can simply skip this record and go ahead and index the rest of the data. 9780373773244 9780373773244 Missing:

Re: Fw: TolerantUpdateProcessorFactory not functioning

2020-06-09 Thread Shawn Heisey
On 6/9/2020 12:44 AM, Hup Chen wrote: Thanks for your reply, this is one of the example where it fail. POST by using charset=utf-8 or other charset didn't help that CTRL-CHAR "^" error found in the title field, I hope solr can simply skip this record and go ahead to index the rest data.

Re: How to determine why solr stops running?

2020-06-09 Thread Dave
I’ll add that whenever I’ve had a solr instance shut down, for me it’s been a hardware failure. Either the ram or the disk got a “glitch” and both of these are relatively fragile and wear and tear type parts of the machine, and should be expected to fail and be replaced from time to time. Solr