Bi-Directional CDCR

2020-06-03 Thread Gell-Holleron, Daniel
Hi there, I need some advice on how to configure Bi-Directional CDCR properly. I've created a collection on Site A (3 Solr nodes, 5 ZooKeepers). I've also created a collection on Site B (3 Solr nodes, 5 ZooKeepers). These both have the same number of shards (not sure if that is a factor or

Re: Not all EML files are indexing during indexing

2020-06-03 Thread Charlie Hull
I think the OP is indexing flat files, not web pages (but otherwise, I agree with you that Scrapy is great - I know some of the people behind it too and they're a good bunch). Charlie On 02/06/2020 16:41, Walter Underwood wrote: On Jun 2, 2020, at 7:40 AM, Charlie Hull wrote: If it was me

Periodically 100% cpu and high load/IO

2020-06-03 Thread Marvin Bredal Lillehaug
Hi, We have a cluster with five Solr (8.5.1, Java 11) nodes, and sometimes one or two nodes have Solr running with 100% CPU on all cores, «load» over 400, and high IO. It usually lasts five to ten minutes, and the node is hardly responding. Does anyone have any experience with this type of

Re: solr 8.4.1 with ssl tls1.2 creating an issue with non-leader node

2020-06-03 Thread yaswanth kumar
Thanks Franke, I have now switched to the default jetty-ssl.xml that comes with the Solr package, but the issue still happens when I try to push data to a non-leader node. Do you still think it's something to do with the configuration? Thanks, On Wed, Jun 3, 2020 at 12:29 AM Jörn

RE: which terms are used at the matched document?

2020-06-03 Thread Serkan KAZANCI
Good day Mikhail, So I searched for "banka", which means "bank" in my language. Below are the highlighted fragments of a matched document. You can see from the mark tags that the terms "Bankalar", "banka", "bankaya", "bankalar" exist in the document, "highlighting":{

Re: Periodically 100% cpu and high load/IO

2020-06-03 Thread Marvin Bredal Lillehaug
Yes, there is light to moderate indexing most of the time. The setup has NRT replicas, and the shards are around 45GB each. Index merging has been the hypothesis for some time, but we haven't dared to activate info stream logging. On Wed, Jun 3, 2020 at 2:34 PM Erick Erickson wrote: > One

Re: Insert documents to a particular shard

2020-06-03 Thread sambasivarao giddaluri
Thanks Jorn for your suggestions. It was a sample schema, but each document_type will have more fields. 1) Yes, I have exported graph traversal gatherNodes using streaming expressions, but we found a few issues, e.g. getting the parent doc based on a grandchild doc filter. Graph Traversal - {!graph from=parentId

SolrSlf4jReporter, MDC information not set if num collections > coreLoadThreads

2020-06-03 Thread Marvin Bredal Lillehaug
Hi! We just started using SolrSlf4jReporter to get hold of metrics. In Solr 8.5.2 there is an issue: when the number of cores is larger than 3 (the default value of coreLoadThreads), the logged metrics for some cores are missing all MDC variables for the core. There have been some changes concerning

Re: Periodically 100% cpu and high load/IO

2020-06-03 Thread Erick Erickson
One possibility is merging index segments. When this happens, are you actively indexing? And are these NRT replicas or TLOG/PULL? If the latter, are your TLOG leaders on the affected machines? Best, Erick > On Jun 3, 2020, at 3:57 AM, Marvin Bredal Lillehaug > wrote: > > Hi, > We have a

backup compression

2020-06-03 Thread Gell-Holleron, Daniel
Hi there, I wanted to know whether compression can be used as part of a backup command (action=BACKUP). Going forward, as more data is pumped into Solr, the backups are going to be very large. Aside from applying compression to the folder the backup gets written to, is there a

Re: Multiple Solr instances using same ZooKeepers

2020-06-03 Thread Walter Underwood
If your clusters are able to use the same ZooKeeper, then they are in the same data center (or AWS region), so you should not need CDCR. That is for clusters in different data centers. Also, CDCR has some known problems. What are you trying to solve with CDCR? There may be a better way to solve

Re: backup compression

2020-06-03 Thread Jan Høydahl
I see from the original issue https://issues.apache.org/jira/browse/SOLR-5750 that backup compression was thought of but not implemented. I don’t see any open JIRAs for it either. Would be great to have a ‘compress’ option to the command, and
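
Since the backup command has no compression option according to this thread, compressing the backup folder after the fact is the workaround the original question mentions. A minimal Python sketch of that step, assuming the backup has already been written to a local directory; the function name and the paths passed to it are hypothetical and not part of Solr:

```python
import tarfile
from pathlib import Path


def compress_backup(backup_dir: str, archive_path: str) -> str:
    """Pack a finished backup directory into a gzip-compressed tar archive."""
    source = Path(backup_dir)
    if not source.is_dir():
        raise NotADirectoryError(f"not a directory: {backup_dir}")
    # "w:gz" writes a gzip-compressed tar. arcname stores entries under the
    # directory's own name instead of its full filesystem path.
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source, arcname=source.name)
    return archive_path
```

Run it only after the backup has completed, so that no files change while the archive is being written.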

solrj - get metrics from all nodes

2020-06-03 Thread lstusr 5u93n4
Hi All, I'm attempting to connect to the metrics api in solrj to query metrics from my cluster. Using the CloudSolrClient, I get routed to one node, and get metrics only from that node. I'm building my request like this: GenericSolrRequest req = new GenericSolrRequest(METHOD.GET,

Re: which terms are used at the matched document?

2020-06-03 Thread Mikhail Khludnev
This is the matching term: "ht(*content:banka* in 71" On Wed, Jun 3, 2020 at 5:15 PM Serkan KAZANCI wrote: > Good day Mikhail, > > So I searched for "banka", which means "bank" in my language. Below are the > highlighted fragments of a matched document. You can see from the mark tags > that "Bankalar",

Re: which terms are used at the matched document?

2020-06-03 Thread Mikhail Khludnev
Hi, debugQuery response contains matched terms as well. It's just a little bit hard to read. On Wed, Jun 3, 2020 at 3:55 PM Serkan KAZANCI wrote: > Hi, > > > > Is it possible to retrieve the terms that are used to match the document? > (Keyword term itself, stemmed versions of term, term
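
Because the explain text in a debugQuery response is hard to read, a small parser can pull the matched terms out of it. The Python sketch below assumes that each matching term appears in the explain text as `weight(field:term in docId)`, which is consistent with the "content:banka in 71" fragment quoted in this thread. The helper name and the sample explain text are hypothetical, and phrase queries (terms containing spaces) are not handled:

```python
import re

# Matches entries such as "weight(content:banka in 71)" in explain text.
_WEIGHT = re.compile(r"weight\((\w+):(\S+) in \d+\)")


def matched_terms(explain_text: str) -> list[tuple[str, str]]:
    """Return the distinct (field, term) pairs in an explain string, in order."""
    seen: list[tuple[str, str]] = []
    for field, term in _WEIGHT.findall(explain_text):
        if (field, term) not in seen:
            seen.append((field, term))
    return seen


# Hypothetical explain fragment, shaped like the output quoted in the thread.
sample = (
    "1.2 = sum of:\n"
    "  0.7 = weight(content:banka in 71) [SchemaSimilarity], result of:\n"
    "  0.5 = weight(content:bankalar in 71) [SchemaSimilarity], result of:\n"
)
print(matched_terms(sample))
```

In practice the explain string for each document would be taken from the debug section of a response requested with debugQuery=true.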

RE: Autoscaling using SolrCloud8.5 on AWS EKS - issue with Node Added trigger

2020-06-03 Thread Mangla,Kirti
Hi, Looking for help on this issue. Has anyone faced this problem? Thanks, Kirti Mangla Software Engineer - Gartner Digital Markets - GetApp Two Horizon Center, Golf Course Road, Gurgaon, India Direct: +91124-4795963 From: Mangla,Kirti Sent: Wednesday, June 3, 2020 12:29 AM To:

which terms are used at the matched document?

2020-06-03 Thread Serkan KAZANCI
Hi, Is it possible to retrieve the terms that are used to match the document? (The keyword term itself, stemmed versions of the term, terms matched from synonyms.txt.) Example: search keyword "heaven". Found in document1 via "heavens" and "heaven", found in document2 via "heavenly", found in

Re: solr 8.4.1 with ssl tls1.2 creating an issue with non-leader node

2020-06-03 Thread yaswanth kumar
Hi Franke, I suspect it's because of the certificate encryption? But I will wait for you to confirm that. We are generating certs with RSA 2048 and finally combining them into a single JKS, and that's what we are referring to as the keystore and truststore. Let me know if it doesn't work or