It has been a while since I used a custom similarity, but what you have looks right. Can you try the fully qualified class name as the type instead? Use org.elasticsearch.index.similarity.tfCappedSimilarityProvider. According to the error, Elasticsearch is looking for org.elasticsearch.index.similarity.tfcappedsimilarity.tfCappedSimilaritySimilarityProvider, which matches neither the package nor the name of your provider class.
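As a rough sketch of what I mean (untested on my side, and assuming the jar with your two classes is on Elasticsearch's classpath), only the type value in your settings would change; the mappings section stays as you posted it:

```json
{
  "settings" : {
    "index" : {
      "similarity" : {
        "my_similarity" : {
          "type" : "org.elasticsearch.index.similarity.tfCappedSimilarityProvider"
        }
      }
    }
  }
}
```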
--
Ivan

On Tue, Apr 1, 2014 at 7:00 AM, geantbrun <agin.patr...@gmail.com> wrote:

> Sure.
>
> {
>   "settings" : {
>     "index" : {
>       "similarity" : {
>         "my_similarity" : {
>           "type" : "tfCappedSimilarity"
>         }
>       }
>     }
>   },
>   "mappings" : {
>     "post" : {
>       "properties" : {
>         "id" : { "type" : "long", "store" : "yes", "precision_step" : "0" },
>         "name" : { "type" : "string", "store" : "yes", "index" : "analyzed" },
>         "contents" : { "type" : "string", "store" : "no", "index" : "analyzed", "similarity" : "my_similarity" }
>       }
>     }
>   }
> }
>
> If I replace tfCappedSimilarity with tfCapped in the mapping, the error
> is the same, except that the provider is referred to as
> tfCappedSimilarityProvider and not as
> tfCappedSimilaritySimilarityProvider.
> Cheers,
> Patrick
>
> On Monday, March 31, 2014 17:13:24 UTC-4, Ivan Brusic wrote:
>>
>> Can you also post your mapping where you defined the similarity?
>>
>> --
>> Ivan
>>
>> On Mon, Mar 31, 2014 at 10:36 AM, geantbrun <agin.p...@gmail.com> wrote:
>>
>>> I realize that I probably have to define the similarity property of my
>>> field as "my_similarity" (and not as "tfCappedSimilarity") and define
>>> my_similarity in the settings as being of type tfCappedSimilarity.
>>> When I do that, I get the following error at index/mapping creation:
>>>
>>> {"error":"IndexCreationException[[exbd] failed to create index]; nested: NoClassSettingsException[Failed to load class setting [type] with value [tfCappedSimilarity]]; nested: ClassNotFoundException[org.elasticsearch.index.similarity.tfcappedsimilarity.tfCappedSimilaritySimilarityProvider]; ","status":500}]
>>>
>>> Note that the provider is referred to in the error as
>>> tfCappedSimilaritySimilarityProvider ("Similarity" repeated twice). Is
>>> that normal?
>>> Patrick
>>>
>>> On Monday, March 31, 2014 13:06:00 UTC-4, geantbrun wrote:
>>>
>>>> Hi Ivan,
>>>> I followed your instructions but it does not seem to work, so I must
>>>> be going wrong somewhere.
>>>> I created the jar file from the following two java files,
>>>> could you tell me if they are ok?
>>>>
>>>> tfCappedSimilarity.java
>>>> ***************************
>>>> package org.elasticsearch.index.similarity;
>>>>
>>>> import org.apache.lucene.search.similarities.DefaultSimilarity;
>>>> import org.elasticsearch.common.logging.ESLogger;
>>>> import org.elasticsearch.common.logging.Loggers;
>>>>
>>>> public class tfCappedSimilarity extends DefaultSimilarity {
>>>>
>>>>     private ESLogger logger;
>>>>
>>>>     public tfCappedSimilarity() {
>>>>         logger = Loggers.getLogger(getClass());
>>>>     }
>>>>
>>>>     /**
>>>>      * Capped tf value
>>>>      */
>>>>     @Override
>>>>     public float tf(float freq) {
>>>>         return (float)Math.sqrt(Math.min(9, freq));
>>>>     }
>>>> }
>>>>
>>>> tfCappedSimilarityProvider.java
>>>> *************************************
>>>> package org.elasticsearch.index.similarity;
>>>>
>>>> import org.elasticsearch.common.inject.Inject;
>>>> import org.elasticsearch.common.inject.assistedinject.Assisted;
>>>> import org.elasticsearch.common.settings.Settings;
>>>>
>>>> public class tfCappedSimilarityProvider extends AbstractSimilarityProvider {
>>>>
>>>>     private tfCappedSimilarity similarity;
>>>>
>>>>     @Inject
>>>>     public tfCappedSimilarityProvider(@Assisted String name,
>>>>                                       @Assisted Settings settings) {
>>>>         super(name);
>>>>         this.similarity = new tfCappedSimilarity();
>>>>     }
>>>>
>>>>     /**
>>>>      * {@inheritDoc}
>>>>      */
>>>>     @Override
>>>>     public tfCappedSimilarity get() {
>>>>         return similarity;
>>>>     }
>>>> }
>>>>
>>>> In my mapping, I define the similarity property of my field as
>>>> tfCappedSimilarity, is it ok?
>>>>
>>>> What makes me say that it does not work: I insert a doc with a word
>>>> repeated 16 times in my field. When I do a search with that word, the
>>>> result shows a tf of 4 (square root of 16) and not 3 as I was expecting.
>>>> Is there a way to know if the similarity was loaded or not (maybe in a
>>>> log file?).
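Inline note on the expected numbers: the sketch below copies the tf() formula from tfCappedSimilarity into a standalone class, so the arithmetic can be checked without Lucene or Elasticsearch on the classpath. The class name TfCapCheck and its two helper methods are made up for illustration, and defaultTf assumes the plain square-root tf implied by the "tf of 4 (square root of 16)" observation above.

```java
// Hypothetical standalone check; not part of the similarity jar.
class TfCapCheck {

    // Same expression as the tf() override in tfCappedSimilarity:
    // the frequency is capped at 9 before taking the square root.
    public static float cappedTf(float freq) {
        return (float) Math.sqrt(Math.min(9, freq));
    }

    // Assumed uncapped behaviour: square root of the raw frequency.
    public static float defaultTf(float freq) {
        return (float) Math.sqrt(freq);
    }

    public static void main(String[] args) {
        float freq = 16f; // the word is repeated 16 times in the test document
        System.out.println("uncapped tf = " + defaultTf(freq));
        System.out.println("capped tf   = " + cappedTf(freq));
    }
}
```

For a term frequency of 16 this gives 4.0 uncapped and 3.0 capped, so a tf of 4 in the search result suggests that the custom similarity is not being applied to the field.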
>>>>
>>>> Cheers,
>>>> Patrick
>>>>
>>>> On Wednesday, March 26, 2014 17:16:36 UTC-4, Ivan Brusic wrote:
>>>>>
>>>>> I updated my gist to illustrate the SimilarityProvider that goes along
>>>>> with it. Similarities are easier to add to Elasticsearch than most
>>>>> plugins. You just need to compile the two files into a jar and then add
>>>>> that jar to Elasticsearch's classpath ($ES_HOME/lib most likely). The
>>>>> code will scan for every SimilarityProvider defined and load it.
>>>>>
>>>>> You then map the similarity to a field:
>>>>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#_configuring_similarity_per_field
>>>>>
>>>>> Note that you cannot change the similarity of a field dynamically.
>>>>>
>>>>> Ivan
>>>>>
>>>>> On Wed, Mar 26, 2014 at 12:49 PM, geantbrun <agin.p...@gmail.com> wrote:
>>>>>
>>>>>> Britta is looping over words that are passed as parameters. It's easy
>>>>>> to implement her script for a simple query, but what about boolean
>>>>>> queries? In my understanding (but I could be wrong of course), I would
>>>>>> have to parse the query to call the script with each sub-clause, am I
>>>>>> wrong?
>>>>>>
>>>>>> I prefer your custom similarity alternative. Again, sorry for the
>>>>>> silly question (newbie!) but where do you put your java file? Is it the
>>>>>> only thing that is needed (except for the modification in the mapping)?
>>>>>> cheers,
>>>>>> Patrick
>>>>>>
>>>>>> On Wednesday, March 26, 2014 11:58:52 UTC-4, Ivan Brusic wrote:
>>>>>>>
>>>>>>> I am still on a version of Elasticsearch that does not have access
>>>>>>> to the new scoring capabilities, so I cannot test out any scripts.
>>>>>>> The non-normalized term frequency should be the line:
>>>>>>> tf = _index[field][word].tf()
>>>>>>>
>>>>>>> If that is the case, you could substitute that line with something
>>>>>>> like:
>>>>>>> tf = Math.min(10, _index[field][word].tf())
>>>>>>>
>>>>>>> As I stated before, I am used to using Similarities, so I find the
>>>>>>> example easier. Here is a custom similarity that I used in
>>>>>>> Elasticsearch (removes any norms that are indexed):
>>>>>>> https://gist.github.com/brusic/9786587
>>>>>>>
>>>>>>> The second part would be the tf() method you would need to implement
>>>>>>> instead of the decodeNormValue I used.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Ivan
>>>>>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/6370b4dc-8243-4aea-918a-e4e4e9588aaf%40googlegroups.com.
>>>
>>> For more options, visit https://groups.google.com/d/optout.