Hi there,

I'm developing a custom Java application with Lucene 8.5.0.
I've tried to use DelimitedBoostTokenFilterFactory, but I have run into a problem, so please tell me if I'm doing something wrong. When I use BM25Similarity with the delimited boost filter, everything works as expected, but if I switch to BooleanSimilarity, the boosts have no effect. The parsed query looks OK: it contains the synonyms with the proper boost values, but the final score does not change.

I'm using StandardAnalyzer for search, and my SynonymGraphFilter has the default configuration:

    // Parameters for SynonymGraphFilterFactory
    Map<String, String> synonymParam = new HashMap<>();
    synonymParam.put("synonyms", synonymFileName);
    synonymParam.put("ignoreCase", "true");
    synonymParam.put("format", "solr");
    synonymParam.put("expand", "true");
    synonymParam.put("tokenizerFactory", "org.apache.lucene.analysis.core.WhitespaceTokenizerFactory");

    // Parameters for DelimitedBoostTokenFilterFactory
    Map<String, String> delimitedBoostTokenFilterMap = new HashMap<>();
    delimitedBoostTokenFilterMap.put("delimiter", "|");

    Analyzer customAnalyzer = CustomAnalyzer.builder(Paths.get(synonymFolder))
        .withTokenizer(StandardTokenizerFactory.NAME)
        .addTokenFilter(SynonymGraphFilterFactory.NAME, synonymParam)
        .addTokenFilter(DelimitedBoostTokenFilterFactory.NAME, delimitedBoostTokenFilterMap)
        .build();

Here's my debug output and some additional info.

Query:

    +Synonym(morphology_term_original_name_key:neoplasm^0.7 morphology_term_original_name_key:tumor^0.8 morphology_term_original_name_key:tumour^0.6)

Explanation with BooleanSimilarity:

    1.0 = weight(Synonym(morphology_term_original_name:neoplasm^0.7 morphology_term_original_name:tumor^0.8 morphology_term_original_name:tumour^0.6) in 0) [BooleanSimilarity], result of:
      1.0 = score(BooleanWeight), computed from:
        1.0 = boost, query boost

If I use BM25Similarity, the printout is as follows:

    0.75188845 = weight(Synonym(morphology_term_original_name:neoplasm^0.7 morphology_term_original_name:tumor^0.8 morphology_term_original_name:tumour^0.6) in 0) [BM25Similarity], result of:
      0.75188845 = score(freq=0.8), computed as boost * idf * tf from:
        1.3862944 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          1 = n, number of documents containing term
          5 = N, total number of documents with field
        0.5423729 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          0.8 = termFreq=0.8
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          1.0 = dl, length of field
          2.4 = avgdl, average length of field

Thanks in advance!
Ivana
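
As a sanity check on the BM25 printout above, the arithmetic can be reproduced in plain Java using only the formulas and values shown in the explanation. This is a minimal sketch, not Lucene code: the class and method names (Bm25Check, idf, tf) are made up for illustration, and a query boost of 1.0 is an assumption, since the BM25 explanation does not list the boost separately.

```java
// Hypothetical helper that recomputes the BM25 explain output by hand.
// It does not call Lucene; it only applies the formulas printed in the explanation.
public class Bm25Check {

    // idf = log(1 + (N - n + 0.5) / (n + 0.5))
    static double idf(long n, long N) {
        return Math.log(1 + (N - n + 0.5) / (n + 0.5));
    }

    // tf = freq / (freq + k1 * (1 - b + b * dl / avgdl))
    static double tf(double freq, double k1, double b, double dl, double avgdl) {
        return freq / (freq + k1 * (1 - b + b * dl / avgdl));
    }

    // score = boost * idf * tf
    static double score(double boost, double idf, double tf) {
        return boost * idf * tf;
    }

    public static void main(String[] args) {
        double idf = idf(1, 5);                   // n = 1, N = 5
        double tf = tf(0.8, 1.2, 0.75, 1.0, 2.4); // freq = 0.8 is the reported termFreq
        double boost = 1.0;                       // assumed: not listed in the BM25 printout
        System.out.println("idf   = " + idf);
        System.out.println("tf    = " + tf);
        System.out.println("score = " + score(boost, idf, tf));
    }
}
```

With these inputs the result agrees with the reported 0.75188845 to within floating-point rounding (Lucene computes in float, this sketch in double). Note that in the BM25 printout the value 0.8 appears only as freq, while the BooleanSimilarity printout lists nothing but the query boost.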