CDCR issues

2019-03-21 Thread Jay Potharaju
Hi, I just enabled CDCR for one collection. I am seeing high CPU usage and a high, still-growing number of tlog files. The collection does not have a lot of data; I have just started reindexing. Solr 7.7.0, implicit sharding, 8 shards. I have enabled buffer on source side and disabled buffer

HTML/JavaScript Query and Results Display Not Working

2019-03-21 Thread Deoxyribonucleic_DNA ...
I am working from the https://wiki.apache.org/solr/SolJSON tutorial. I have put my Solr URL in the code, copied from a Solr admin query result to make sure the query should return something. I type "title:Asian" into the text box, but when the button is hit, the text box just clears and

Re: Use of ShingleFilter causing very large BooleanQuery structures in Solr 7.1

2019-03-21 Thread Erick Erickson
sow was introduced in Solr 6, so it’s just ignored in 4x. bq. Surely the tokenizer splits on white space anyway, or it wouldn't work? I didn’t work on that code, so I don’t have the details off the top of my head, but I’ll take a stab at it as far as my understanding goes. The result is in

Re: Use of ShingleFilter causing very large BooleanQuery structures in Solr 7.1

2019-03-21 Thread Hubert-Price, Neil
Hi Erick, I've run a series of tests using debug=true, the same original query, and variations around sow=true/sow=false/not set. See links below for .txt files containing the output. I have removed any genuine document content and replaced it with .. because I don't have the customer's

RE: is df needed for SolrCloud replication?

2019-03-21 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Thanks. That resolves the issue. Thanks again. -Original Message- From: Shawn Heisey Sent: Tuesday, March 19, 2019 7:10 PM To: solr-user@lucene.apache.org Subject: Re: is df needed for SolrCloud replication? On 3/19/2019 4:48 PM, Oakley, Craig (NIH/NLM/NCBI) [C] wrote: > I recently

Re: Upgrading solarj from 6.5.1 to 8.0.0

2019-03-21 Thread Erick Erickson
One tangent just so you’re aware. You _must_ re-index from scratch. Lucene 8x will refuse to open an index that was _ever_ touched by Solr 6. Best, Erick > On Mar 21, 2019, at 8:26 AM, Lahiru Jayasekera > wrote: > > Hi Jason, > Thanks for the response. I saw the method of setting credentials

Re: Migrate Solr Master To Cloud 7.5

2019-03-21 Thread Erick Erickson
Yeah, the link you referenced will work. It is _very important_ that you create your collection with exactly one shard then do the copy. After that you can use SPLITSHARD to sub-divide it. This is a costly operation, but probably not as costly as re-indexing. That said, it might be easier to
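
The one-shard-then-split sequence described above could be scripted against the Collections API roughly as follows. This is a minimal sketch, not a tested migration procedure: the base URL, the collection name `migrated`, and the shard name `shard1` are assumptions for illustration, and the index copy that happens between the two calls is left out.

```python
from urllib.parse import urlencode

# Assumed values for illustration only; adjust to the actual cluster.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "migrated"  # hypothetical collection name


def collections_api_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{SOLR_BASE}/admin/collections?{query}"


# Step 1: create the target collection with exactly one shard.
create_url = collections_api_url(
    "CREATE", name=COLLECTION, numShards=1, replicationFactor=1
)

# Step 2 (after the index has been copied into that single shard):
# split it. "shard1" is the name assumed for the only shard.
split_url = collections_api_url(
    "SPLITSHARD", collection=COLLECTION, shard="shard1"
)

if __name__ == "__main__":
    print(create_url)
    print(split_url)
```

The URLs are only built here, not sent, so each step can be checked before it is issued against a live cluster.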

Re: Gather Nodes Streaming

2019-03-21 Thread Joel Bernstein
gatherNodes requires single value fields in the tuples. In certain scenarios the cartesianProduct streaming expression can be used to explode a multi-value field into a single field stream. But in the scenario you describe this might not be possible. Joel Bernstein http://joelsolr.blogspot.com/

Re: highlighter, stored documents and performance

2019-03-21 Thread Erick Erickson
By and large, storing data will not affect search speed as much as you might think. Getting the top N results (say 10) doesn’t use stored data at all. It’s only _after_ that point that highlighting occurs on the 10 docs. As far as needing the full doc, Jörn is right, it must be stored. The
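
To make the "top N first, highlight afterwards" point concrete, here is a minimal sketch of a paged highlighting request. The base URL, the collection name `docs`, and the field name `content` are assumptions; `rows`, `start`, `hl`, and `hl.fl` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def highlight_query_url(base, collection, q, field, page=0, rows=10):
    """Build a /select URL that highlights `field` for one page of results.

    Only the `rows` documents on the requested page are highlighted,
    so the cost of highlighting scales with the page size, not the index size.
    """
    params = {
        "q": q,
        "rows": rows,
        "start": page * rows,  # offset of the first result on this page
        "hl": "true",
        "hl.fl": field,
    }
    return f"{base}/{collection}/select?{urlencode(params)}"


# Hypothetical values for illustration.
url = highlight_query_url(
    "http://localhost:8983/solr", "docs", "title:Asian", "content", page=2
)

if __name__ == "__main__":
    print(url)
```

Keeping `rows` small is what keeps highlighting cheap, which matches the paging advice elsewhere in this thread.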

Re: Strange disk size behavior

2019-03-21 Thread Erick Erickson
99% sure it’s background merging. When two segments are merged, the combined segment is written and only after it’s successful will the old segments be deleted. Restarting will stop any ongoing merging and delete any un-referenced segments. I expect you’ll see the space come back as you start
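
The transient disk usage described above can be estimated with simple arithmetic. This is a rough sketch under a simplifying assumption: the merged segment is taken to be as large as the sum of its inputs, whereas in practice it is usually smaller because deleted documents are dropped. The segment sizes are made-up numbers.

```python
def peak_disk_during_merge(segment_sizes_gb, merging_indices):
    """Upper-bound estimate of disk usage while a merge is in flight.

    The old segments stay on disk until the merged segment has been
    written successfully, so both copies exist at the same time.
    """
    existing = sum(segment_sizes_gb)
    merged_upper_bound = sum(segment_sizes_gb[i] for i in merging_indices)
    return existing + merged_upper_bound


def disk_after_merge(segment_sizes_gb, merging_indices):
    """Disk usage once the old segments are deleted (same upper-bound assumption)."""
    return sum(segment_sizes_gb)


# Hypothetical index: three segments of 20, 15 and 5 GB; merge the first two.
segments = [20, 15, 5]
peak = peak_disk_during_merge(segments, [0, 1])
after = disk_after_merge(segments, [0, 1])

if __name__ == "__main__":
    print(f"peak during merge: {peak} GB, after merge: {after} GB")
```

In the extreme case where every segment is merged into one, the peak approaches twice the index size, which is why free disk space comparable to the index itself is commonly recommended.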

Strange disk size behavior

2019-03-21 Thread SOLR4189
Hi all. We use Solr 6.5.1, and in our cluster each Solr core is placed on a different virtual machine (one core per node). Each virtual machine has a 104 GB disk. Yesterday we noticed that several Solr cores were using an abnormal amount of disk space. Running the command *"df -h

Re: highlighter, stored documents and performance

2019-03-21 Thread Jörn Franke
Hi, Then you have to go for the full documents. In that case I recommend reducing the number of returned results, using paging (if it is a web UI), and splitting the documents across several nodes (if the previous measures do not turn out to be successful). Best regards > On 21.03.2019 at 17:15, Martin Frank

RE: highlighter, stored documents and performance

2019-03-21 Thread Martin Frank Hansen (MHQ)
Hi Jörn, Thanks for your answer. Unfortunately, there is no summary included in the documents and I would like it to work for all documents. Best regards Martin Internal - KMD A/S -Original Message- From: Jörn Franke Sent: 21 March 2019 17:11 To: solr-user@lucene.apache.org

Re: highlighter, stored documents and performance

2019-03-21 Thread Jörn Franke
I don’t think so - to highlight any possible query you need the full document. You could optimize it by storing only a subset of the document and highlighting only within this subset. Alternatively you can store a summary and show only the summary without highlighting. > On 21.03.2019 at 17:05

highlighter, stored documents and performance

2019-03-21 Thread Martin Frank Hansen (MHQ)
Hi, I am wondering how highlighting in Solr performs when the number of documents gets large. Right now we have about 1 TB of data in all sorts of file types, and I was wondering how storing these documents within Solr (for highlighting purposes) will affect performance. Is it

Migrate Solr Master To Cloud 7.5

2019-03-21 Thread IZaBEE_Keeper
Hi. I have a large Solr 7.5 index, over 150M docs and 800GB, in a master-slave setup. I need to migrate the core to a SolrCloud instance with pull replicas, as the index will be exceeding the 2.2B doc limit for a single core. I found this..

Re: Delay searches till log replay finishes

2019-03-21 Thread Rahul Goswami
Erick, Shawn, Apologies for the late update on this thread and thank you for your inputs. My assumption about the number of segments increasing came from an incomplete understanding of the TieredMergePolicy, but I get it now. Another concern was a slowing indexing rate due to constant merges. This is

Re: Upgrading solarj from 6.5.1 to 8.0.0

2019-03-21 Thread Lahiru Jayasekera
Hi Jason, Thanks for the response. I saw the method of setting credentials on individual requests. But I need to set the credentials at the SolrClient level. If you remember the way to do it please let me know. Thanks On Thu, Mar 21, 2019 at 8:26 PM Jason Gerlowski wrote: > You should be able

Re: Use of ShingleFilter causing very large BooleanQuery structures in Solr 7.1

2019-03-21 Thread Erick Erickson
Neil: Yeah, the attachment-stripping catches everyone the first time, we’re so used to just adding anything we want to an e-mail… I don’t know enough about the query parsing to answer off the top of my head. I do know one thing that’s changed is “Split on Whitespace” has changed from true to

Re: Upgrading solarj from 6.5.1 to 8.0.0

2019-03-21 Thread Jason Gerlowski
You should be able to set credentials on individual requests with the SolrRequest.setBasicAuthCredentials() method. That's the method suggested by the latest Solr ref guide at least: https://lucene.apache.org/solr/guide/7_7/basic-authentication-plugin.html#using-basic-auth-with-solrj There might

Re: CDCR one source multiple targets

2019-03-21 Thread Arnold Bronley
I see a similar question asked but no answers there too. http://lucene.472066.n3.nabble.com/CDCR-Replication-from-one-source-to-multiple-targets-td4308717.html OP there is using multiple cdcr request handlers but in my case I am using multiple zkhost strings. It will be pretty limiting if we

Re: Environmental Protection Agency: Stop Deforesting in Sri Lanka

2019-03-21 Thread solrlucene
I am from India will it help On Thursday, March 21, 2019, wrote: > Hello there, > > I just signed the petition "Environmental Protection Agency: Stop > Deforesting in Sri Lanka" and wanted to see if you could help by adding > your name. > > Our goal is to reach 15,000 signatures and we need

Re: Use of ShingleFilter causing very large BooleanQuery structures in Solr 7.1

2019-03-21 Thread Hubert-Price, Neil
Hello Erick, This is the first time I've had reason to use the mailing list, so I wasn't aware of the behaviour around attachments. See below, links to the images that I originally sent as attachments, both are screenshots from within Eclipse MAT looking at a SOLR heap dump.

Environmental Protection Agency: Stop Deforesting in Sri Lanka

2019-03-21 Thread bjchathuranga
Hello there, I just signed the petition "Environmental Protection Agency: Stop Deforesting in Sri Lanka" and wanted to see if you could help by adding your name. Our goal is to reach 15,000 signatures and we need more support. You can read more and sign the petition here:

Upgrading solarj from 6.5.1 to 8.0.0

2019-03-21 Thread Lahiru Jayasekera
Hi all, I need help implementing the following code in SolrJ 8.0.0. private SolrClient server, adminServer; this.adminServer = new HttpSolrClient(SolrClientUrl); this.server = new HttpSolrClient( SolrClientUrl + "/" + mapping.getCoreName() ); if (serverUserAuth) { HttpClientUtil.setBasicAuth(