Re: SOLR Cloud - Full index replication

2018-12-30 Thread Doss
Thanks Erick! We are using Solr version 7.0.1. Are there any disadvantages if we increase the peer sync size to 1000? We have analysed the GC logs but have not seen long GC pauses so far. We tried to find the reason for the full sync but found nothing more informative, though we have seen too many logs

Re: Removing words like "FONT-SIZE: 9pt; FONT-FAMILY: arial" from content

2018-12-30 Thread Zheng Lin Edwin Yeo
These texts are likely from the original EML file data, but they are not visible in the content when the EML file is opened in Microsoft Outlook. I have already applied the HTMLStripFieldUpdateProcessorFactory in solrconfig.xml, but these texts are still showing up in the index. Below is my

Re: Removing words like "FONT-SIZE: 9pt; FONT-FAMILY: arial" from content

2018-12-30 Thread Alexandre Rafalovitch
Specifically, a custom Update Request Processor chain can be used before indexing, probably with HTMLStripFieldUpdateProcessorFactory. Regards, Alex

Re: Removing words like "FONT-SIZE: 9pt; FONT-FAMILY: arial" from content

2018-12-30 Thread Vincenzo D'Amore
Hi, I think this kind of text manipulation should be done before indexing. If you have font-size and font-family in your text, you’re very likely indexing HTML with CSS. If I’m right, you’re entering a hell of words that should be removed from your text. On the other hand, if you have

Removing words like "FONT-SIZE: 9pt; FONT-FAMILY: arial" from content

2018-12-30 Thread Zheng Lin Edwin Yeo
Hi, I noticed that during the indexing of EML files, words like "*FONT-SIZE: 9pt; FONT-FAMILY: arial*" are being indexed into the content as well. How can we remove those words during indexing? I am using Solr 7.5.0. Regards, Edwin
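One way to approach the pre-indexing cleanup discussed in this thread is to strip leaked CSS declarations on the client side before the text is sent to Solr. The following is only a minimal sketch, not a Solr feature: the function name `strip_css_declarations`, the property list, and the assumption that each CSS value is a single token (as in the example from the question) are all illustrative choices.

```python
import re

# Assumption: only these properties leak into the content. Extend as needed.
CSS_PROPERTIES = ("font-size", "font-family", "font-weight")

# Matches "property: value" with an optional trailing semicolon.
# Assumption: the value is a single token such as "9pt" or "arial".
_CSS_DECLARATION = re.compile(
    r"\b(?:%s)\s*:\s*[^\s;<>\"]+;?" % "|".join(re.escape(p) for p in CSS_PROPERTIES),
    re.IGNORECASE,
)


def strip_css_declarations(text: str) -> str:
    """Remove known CSS declarations and collapse the leftover whitespace."""
    cleaned = _CSS_DECLARATION.sub(" ", text)
    return re.sub(r"\s+", " ", cleaned).strip()


print(strip_css_declarations("Meeting notes FONT-SIZE: 9pt; FONT-FAMILY: arial agenda"))
```

Multi-word values such as a quoted font name would need a more careful pattern, so this is best treated as a starting point rather than a complete filter.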

Re: PC hang while running Solr cloud instance?

2018-12-30 Thread David Hastings
1. Each PC? How many are you talking about? 2. Why are you using shards?

Identifying product name and other details from search string

2018-12-30 Thread UsesRN
Is there any way to identify the product name and other details from a search string in Solr or Java? For example: 1. Input String: " wound type cartridge filter size 20 * 4 Inch for RO plant" Output: Product: cartridge filter for RO plant Size: 20 * 4 inch 2. Input String: " WD 40 rust
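As a rough illustration of the size-extraction half of this question, a regular expression can pull a "width * height unit" pattern out of the query and return the rest of the text separately. This is a sketch under stated assumptions: `ParsedQuery`, `parse_query`, and the supported units are hypothetical, and it does not attempt the harder step of reducing the remaining text to a clean product name.

```python
import re
from typing import NamedTuple, Optional


class ParsedQuery(NamedTuple):
    size: Optional[str]  # normalized size such as "20 * 4 inch", or None
    remainder: str       # the query text with the size phrase removed


# Assumption: sizes look like "20 * 4 Inch" or "20x4 mm", optionally
# preceded by the word "size".
_SIZE = re.compile(
    r"(?:\bsize\s+)?(\d+(?:\.\d+)?)\s*[*x]\s*(\d+(?:\.\d+)?)\s*(inch|mm|cm)\b",
    re.IGNORECASE,
)


def parse_query(query: str) -> ParsedQuery:
    """Split a search string into a size phrase and the remaining text."""
    match = _SIZE.search(query)
    if match is None:
        return ParsedQuery(None, " ".join(query.split()))
    width, height, unit = match.groups()
    size = f"{width} * {height} {unit.lower()}"
    remainder = query[:match.start()] + " " + query[match.end():]
    return ParsedQuery(size, " ".join(remainder.split()))


print(parse_query(" wound type cartridge filter size 20 * 4 Inch for RO plant"))
```

The remainder here still contains words like "wound type"; deciding which of those belong to the product name would need a dictionary or a trained tagger, which is outside this sketch.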

PC hang while running Solr cloud instance?

2018-12-30 Thread John Milton
Wishing you all a happy new year. Hi, I am running a Solr Cloud 7.5 instance on Windows. It has 100 shards with a replication factor of 4. My PC is hanging, and CPU and memory usage are at 95%. Each PC has 16 GB of RAM. The PC is otherwise idle; at the moment no indexing or searching is happening, but

How to archive Solr cloud and delete the data?

2018-12-30 Thread Rekha
Hi Solr Team, I want to archive my Solr data. Is there any API available to archive data? I planned to read the data month by month and store it in another collection, but this takes a long time, since it amounts to adding and indexing the data again. And when I delete the archived data from the main
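For the month-by-month plan described here, the same date range query could select the documents to copy and later serve as the delete-by-query for the main collection. The sketch below only builds that query string; the helper name `month_range_query` and the date field name `created_at` are assumptions, since the message does not say which field holds the document date.

```python
from datetime import date


def month_range_query(field: str, year: int, month: int) -> str:
    """Build a half-open Solr range query covering one calendar month (UTC)."""
    start = date(year, month, 1)
    # Roll over to January of the next year after December.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    # "[" includes the first instant of the month, "}" excludes the first
    # instant of the following month, so consecutive months never overlap.
    return f"{field}:[{start.isoformat()}T00:00:00Z TO {end.isoformat()}T00:00:00Z}}"


# "created_at" is a hypothetical field name.
print(month_range_query("created_at", 2018, 12))
```

Using a half-open range means a document stamped exactly at midnight on the first of a month is archived once, with the month it starts.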

Re: RuleBasedAuthorizationPlugin configuration

2018-12-30 Thread Dominique Bejean
Hi, After reading the log file more carefully, here is my understanding. The request http://2:xx@localhost:8983/solr/biblio/select?indent=on&q=*:*&wt=json reports this in the log: 2018-12-30 12:24:52.102 INFO (qtp1731656333-20) [ x:biblio] o.a.s.s.HttpSolrCall USER_REQUIRED auth header Basic Mjox

Re: Reload synonyms without reloading the multiple collections

2018-12-30 Thread Simón de Frosterus Pokrzywnicki
Sorry, I see that it may have been confusing. My webapp reloads all the affected Collections (about a dozen of them) sequentially using the Collections API. Ideally I would be able to write some QueryTimeSynonymFilterFactory that would, periodically or when told, reload the
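The sequential reload described in this message could look roughly like the sketch below, which calls the Collections API `RELOAD` action once per collection. The base URL, the collection names, and the helper names `reload_url` and `reload_all` are assumptions for illustration; authentication and error handling are left out.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def reload_url(base_url: str, collection: str) -> str:
    """Build the Collections API URL that reloads one collection."""
    params = urlencode({"action": "RELOAD", "name": collection})
    return f"{base_url.rstrip('/')}/admin/collections?{params}"


def reload_all(base_url, collections, opener=urlopen):
    """Reload collections one after another and return each HTTP status."""
    statuses = {}
    for name in collections:
        # Each call blocks until Solr answers, so the reloads run sequentially.
        with opener(reload_url(base_url, name)) as response:
            statuses[name] = response.status
    return statuses


# Hypothetical base URL and collection name.
print(reload_url("http://localhost:8983/solr", "biblio"))
```

The `opener` parameter exists only so the loop can be exercised without a running Solr node; in normal use the default `urlopen` is enough.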