I realize that I probably have to set the similarity property of my field to "my_similarity" (and not to "tfCappedSimilarity"), and then define my_similarity in the index settings as being of type tfCappedSimilarity. When I do that, I get the following error when creating the index/mapping:
{"error":"IndexCreationException[[exbd] failed to create index]; nested: NoClassSettingsException[Failed to load class setting [type] with value [tfCappedSimilarity]]; nested: ClassNotFoundException[org.elasticsearch.index.similarity.tfcappedsimilarity.tfCappedSimilaritySimilarityProvider]; ","status":500}

Note that the error refers to the provider as tfCappedSimilaritySimilarityProvider ("Similarity" appears twice). Is that normal?

Patrick

On Monday, March 31, 2014 13:06:00 UTC-4, geantbrun wrote:
>
> Hi Ivan,
> I followed your instructions but it does not seem to work; I must have made
> a mistake somewhere. I created the jar file from the following two Java
> files. Could you tell me if they are OK?
>
> tfCappedSimilarity.java
> ***************************
> package org.elasticsearch.index.similarity;
>
> import org.apache.lucene.search.similarities.DefaultSimilarity;
> import org.elasticsearch.common.logging.ESLogger;
> import org.elasticsearch.common.logging.Loggers;
>
> public class tfCappedSimilarity extends DefaultSimilarity {
>
>     private ESLogger logger;
>
>     public tfCappedSimilarity() {
>         logger = Loggers.getLogger(getClass());
>     }
>
>     /**
>      * Capped tf value
>      */
>     @Override
>     public float tf(float freq) {
>         return (float)Math.sqrt(Math.min(9, freq));
>     }
> }
>
> tfCappedSimilarityProvider.java
> *************************************
> package org.elasticsearch.index.similarity;
>
> import org.elasticsearch.common.inject.Inject;
> import org.elasticsearch.common.inject.assistedinject.Assisted;
> import org.elasticsearch.common.settings.Settings;
>
> public class tfCappedSimilarityProvider extends AbstractSimilarityProvider {
>
>     private tfCappedSimilarity similarity;
>
>     @Inject
>     public tfCappedSimilarityProvider(@Assisted String name, @Assisted Settings settings) {
>         super(name);
>         this.similarity = new tfCappedSimilarity();
>     }
>
>     /**
>      * {@inheritDoc}
>      */
>     @Override
>     public tfCappedSimilarity get() {
>         return similarity;
>     }
> }
>
> In my mapping, I define the similarity property of my field as
> tfCappedSimilarity. Is that OK?
>
> What makes me say that it does not work: I insert a doc with a word
> repeated 16 times in my field. When I search for that word, the result
> shows a tf of 4 (the square root of 16) and not 3 as I was expecting. Is
> there a way to know whether the similarity was loaded or not (maybe in a
> log file)?
>
> Cheers,
> Patrick
>
> On Wednesday, March 26, 2014 17:16:36 UTC-4, Ivan Brusic wrote:
>>
>> I updated my gist to illustrate the SimilarityProvider that goes along
>> with it. Similarities are easier to add to Elasticsearch than most plugins.
>> You just need to compile the two files into a jar and then add that jar
>> to Elasticsearch's classpath ($ES_HOME/lib most likely). The code will
>> scan for every SimilarityProvider defined and load it.
>>
>> You then map the similarity to a field:
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#_configuring_similarity_per_field
>>
>> Note that you cannot change the similarity of a field dynamically.
>>
>> Ivan
>>
>> On Wed, Mar 26, 2014 at 12:49 PM, geantbrun <agin.p...@gmail.com> wrote:
>>
>>> Britta is looping over words that are passed as parameters. It's easy to
>>> implement her script for a simple query, but what about boolean queries? In
>>> my understanding (but I could be wrong, of course), I would have to parse
>>> the query to call the script with each sub-clause. Am I wrong?
>>>
>>> I prefer your custom similarity alternative. Again, sorry for the silly
>>> question (newbie!), but where do you put your Java file? Is it the only
>>> thing that is needed (except for the modification in the mapping)?
>>> Cheers,
>>> Patrick
>>>
>>> On Wednesday, March 26, 2014 11:58:52 UTC-4, Ivan Brusic wrote:
>>>>
>>>> I am still on a version of Elasticsearch that does not have access to
>>>> the new scoring capabilities, so I cannot test out any scripts. The
>>>> non-normalized term frequency should be the line:
>>>> tf = _index[field][word].tf()
>>>>
>>>> If that is the case, you could substitute that line with something like:
>>>> tf = Math.min(10, _index[field][word].tf())
>>>>
>>>> As I stated before, I am used to using Similarities, so I find that
>>>> approach easier. Here is a custom similarity that I used in Elasticsearch
>>>> (it removes any norms that are indexed):
>>>> https://gist.github.com/brusic/9786587
>>>>
>>>> The second part would be the tf() method you would need to implement
>>>> instead of the decodeNormValue I used.
>>>>
>>>> Cheers,
>>>>
>>>> Ivan
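
For reference, the arithmetic behind the capped tf() discussed in this thread can be checked on its own, without Lucene or Elasticsearch on the classpath. The sketch below is only an illustration: the class CappedTfDemo and its two methods are hypothetical names, not part of either library, and the cap of 9 is simply the value used in tfCappedSimilarity.java above.

```java
// Standalone sketch: no Lucene or Elasticsearch classes are used here.
// CappedTfDemo and its method names are hypothetical, for illustration only.
class CappedTfDemo {

    // Uncapped term-frequency factor: the square root of the raw frequency,
    // which is the behaviour reported in the thread (16 occurrences -> 4).
    static float defaultTf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Capped variant: occurrences beyond `cap` add nothing to the factor.
    static float cappedTf(float freq, float cap) {
        return (float) Math.sqrt(Math.min(cap, freq));
    }

    public static void main(String[] args) {
        float freq = 16f; // the word is repeated 16 times in the field
        System.out.println("default tf = " + defaultTf(freq));
        System.out.println("capped tf  = " + cappedTf(freq, 9f));
    }
}
```

With a cap of 9, a frequency of 16 gives a factor of 3 rather than 4, so a reported tf of 4 suggests the default similarity is still the one in use for that field.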