Re: SolrCloud Nodes going to recovery state during indexing

2018-01-03 Thread Sravan Kumar
Emir, Yes, there is a delete_by_query on every bulk insert. This delete_by_query deletes all documents that were last updated earlier than one day before the current time. Is the bulk delete_by_query the reason? On Wed, Jan 3, 2018 at 7:58 PM, Emir Arnautović < emir.arnauto...@sematext.com> wro
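
For readers following the thread, here is a minimal sketch of what such a delete-by-query request body might look like in Solr's JSON update format. The field name `last_updated` and the helper `build_delete_by_query` are assumptions for illustration only; the excerpt does not show the poster's actual field or query, and the range below reflects one reading of the description (documents last updated earlier than one day ago).

```python
import json


def build_delete_by_query(field: str = "last_updated", age: str = "1DAY") -> str:
    """Build a JSON update body that deletes documents older than `age`.

    `field` is a hypothetical date field; substitute the real one.
    """
    # Solr date math: NOW-1DAY means one day before the request is processed.
    query = f"{field}:[* TO NOW-{age}]"
    return json.dumps({"delete": {"query": query}})


body = build_delete_by_query()
print(body)
```

A body like this would typically be posted to the collection's update handler alongside, or right after, each bulk insert, which is the pattern the question asks about.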

Re: SolrCloud Nodes going to recovery state during indexing

2018-01-04 Thread Sravan Kumar
Solr & Elasticsearch Consulting Support Training - http://sematext.com/ > > > > > On 3 Jan 2018, at 16:04, Sravan Kumar wrote: > > > > Emir, > >Yes there is a delete_by_query on every bulk insert. > >This delete_by_query deletes all the documents which are

Title Search scoring issues with multivalued field & norm

2018-01-31 Thread Sravan Kumar
Hi, We are using Solr for our movie title search. Because it is a title search, it should be treated differently from normal document search. Hence, we use a modified version of TFIDFSimilarity with the following changes. - disabled TF & IDF, so each always has the value 1. - disabled norms by spe
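
To make the effect of those changes concrete, here is a toy sketch of scoring when TF and IDF are both fixed at 1 and no length norm is applied. This is not Lucene's actual scoring formula (it ignores boosts, query normalisation and similar factors), and `constant_score` is a hypothetical helper written only for this illustration.

```python
def constant_score(query_terms, field_tokens):
    """Toy score: every matching query term contributes tf * idf = 1 * 1."""
    tf = 1.0   # term frequency flattened to a constant
    idf = 1.0  # inverse document frequency flattened to a constant
    tokens = set(field_tokens)
    # No length norm is applied, so the field length never enters the score.
    return sum(tf * idf for term in set(query_terms) if term in tokens)


query = ["star", "wars"]
short_title = ["star", "wars"]
long_title = ["star", "wars", "episode", "iv", "a", "new", "hope"]

print(constant_score(query, short_title), constant_score(query, long_title))
```

Under these assumptions the short and the long title receive the same score, which is why the handling of norms matters for ranking exact or near-exact title matches.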

Re: Title Search scoring issues with multivalued field & norm

2018-01-31 Thread Sravan Kumar
wunder >> Walter Underwood >> wun...@wunderwood.org >> http://observer.wunderwood.org/ (my blog) >> >>> On Jan 31, 2018, at 4:38 AM, Sravan Kumar wrote: >>> >>> Hi, >>> We are using solr for our movie title search. >>> >>>

Re: Title Search scoring issues with multivalued field & norm

2018-01-31 Thread Sravan Kumar
f > you weighted things differently would you shorten the length of the chain. > Can you get the click throughs to happen sooner. > > Anyway, just my 2 cents > > > On Wed, Jan 31, 2018 at 6:38 PM, Sravan Kumar wrote: > > > > > @Walter: We have 6 fields decla

Re: Title Search scoring issues with multivalued field & norm

2018-01-31 Thread Sravan Kumar
, if you took the > >> previous months searches where there is a chain of successive > searches. If > >> you weighted things differently would you shorten the length of the > chain. > >> Can you get the click throughs to happen sooner. > >> > >> Anyway,

Re: Title Search scoring issues with multivalued field & norm

2018-02-04 Thread Sravan Kumar
affect the score. Is there any other way to handle norms in a multivalued field? On Thu, Feb 1, 2018 at 12:24 PM, Sravan Kumar wrote: > @Walter: Perhaps you are right on not to consider stemming. Instead fuzzy > search will cover these along with the misspellings. > > In case of symbols, we want t

Bi Gram token generation with fuzzy searches

2018-02-07 Thread Sravan Kumar
We have the following two fields for our movie title search: - title without symbols: a custom analyser with WordDelimiterFilterFactory, SynonymFilterFactory and other filters to retain only alphanumeric characters. - title with word bigrams: a custom analyser with solr.ShingleFilterFactory to gener
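
As an illustration of what word bigrams are, here is a small sketch that joins adjacent tokens with a space. It only approximates the idea behind a shingle filter with a shingle size of 2; the real solr.ShingleFilterFactory has further options (for example, it can also emit the original single tokens), and `word_bigrams` is a hypothetical helper, not part of Solr.

```python
def word_bigrams(tokens):
    """Return adjacent-token pairs joined by a single space."""
    # zip pairs each token with its successor; the last token has none.
    return [f"{first} {second}" for first, second in zip(tokens, tokens[1:])]


print(word_bigrams(["some", "movie", "title"]))
```

Because each bigram is indexed as one token containing a space, a fuzzy match on it works at the level of the whole two-word token rather than on each word separately.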

Re: Bi Gram token generation with fuzzy searches

2018-02-07 Thread Sravan Kumar
=false qf=title_bigrams}$v) OR _query({!edismax > qf=title}$v)&$v=some movie title > > > > HTH, > > Emir > > -- > > Monitoring - Log Management - Alerting - Anomaly Detection > > Solr & Elasticsearch Consulting Support Training - http://sematext.com/