Ivan,

Sorry, I know nothing about Java and I just realized that I skipped the
compile step: I simply put the .java files in a jar file with jar cf.
The problem now is that executing:
javac NormRemovalSimilarity.java -classpath ./elasticsearch-1.1.0.jar

generates errors, the first one being:

package org.apache.lucene.search.similarities does not exist

I Googled it but found nothing. Any idea?

Patrick

P.S. I installed elasticsearch the easy way
<https://gist.github.com/wingdspur/2026107> (dpkg with the deb file).

On Thursday, April 3, 2014 at 09:16:02 UTC-4, geantbrun wrote:
>
> Thanks again for your great help, Ivan. It does not work for me. When I
> replace NormRemovalSimilarityProvider with BM25SimilarityProvider (or
> simply with BM25), it works. Is it possible that I put my jar file in the
> wrong directory (usr/share/elasticsearch/lib)? Is it necessary to
> *register* the new classes I define somewhere before restarting the
> service?
> Cheers,
> Patrick
>
> On Wednesday, April 2, 2014 at 17:47:46 UTC-4, Ivan Brusic wrote:
>>
>> Are you using a full class name? I have no problems with
>>
>> curl -XPOST 'http://localhost:9200/sim/' -d '
>> {
>>   "settings" : {
>>     "similarity" : {
>>       "my_similarity" : {
>>         "type" : "org.elasticsearch.index.similarity.NormRemovalSimilarityProvider"
>>       }
>>     }
>>   },
>>   "mappings" : {
>>     "post" : {
>>       "properties" : {
>>         "id" : { "type" : "long", "store" : "yes", "precision_step" : "0" },
>>         "name" : { "type" : "string", "store" : "yes", "index" : "analyzed" },
>>         "contents" : { "type" : "string", "store" : "no", "index" : "analyzed", "similarity" : "my_similarity" }
>>       }
>>     }
>>   }
>> }
>> '
>>
>> On Wed, Apr 2, 2014 at 12:03 PM, geantbrun <agin.p...@gmail.com> wrote:
>>
>>> In order to better understand the error, I copied your
>>> NormRemovalSimilarity and NormRemovalSimilarityProvider code snippets
>>> into usr/share/elasticsearch/lib. I put these 2 files in a jar named
>>> NormRemovalSimilarity.jar. After restarting the elasticsearch service,
>>> I tried to create the index with the same mapping as before (except
>>> that I put "type" : "NormRemoval" in the settings of my_similarity).
>>>
>>> The result is the same:
>>> {"error":"IndexCreationException[[exbd] failed to create index]; nested: NoClassSettingsException[Failed to load class setting [type] with value [NormRemoval]]; nested: ClassNotFoundException[org.elasticsearch.index.similarity.normremoval.NormRemovalSimilarityProvider]; ","status":500}]
>>>
>>> I deleted the jar file just to see if the error is the same: yes, it is.
>>> It's as if the new similarity is never found or loaded. Is it still
>>> working without modifications on your side?
>>> Cheers,
>>> Patrick
>>>
>>> On Wednesday, April 2, 2014 at 00:31:44 UTC-4, Ivan Brusic wrote:
>>>>
>>>> It has been a while since I used a custom similarity, but what you have
>>>> looks right. Can you try a full class name instead?
>>>> Use org.elasticsearch.index.similarity.tfCappedSimilarityProvider.
>>>> According to the error, it is looking for
>>>> org.elasticsearch.index.similarity.tfcappedsimilarity.tfCappedSimilaritySimilarityProvider.
>>>>
>>>> --
>>>> Ivan
>>>>
>>>> On Tue, Apr 1, 2014 at 7:00 AM, geantbrun <agin.p...@gmail.com> wrote:
>>>>
>>>>> Sure.
>>>>>
>>>>> {
>>>>>   "settings" : {
>>>>>     "index" : {
>>>>>       "similarity" : {
>>>>>         "my_similarity" : {
>>>>>           "type" : "tfCappedSimilarity"
>>>>>         }
>>>>>       }
>>>>>     }
>>>>>   },
>>>>>   "mappings" : {
>>>>>     "post" : {
>>>>>       "properties" : {
>>>>>         "id" : { "type" : "long", "store" : "yes", "precision_step" : "0" },
>>>>>         "name" : { "type" : "string", "store" : "yes", "index" : "analyzed" },
>>>>>         "contents" : { "type" : "string", "store" : "no", "index" : "analyzed", "similarity" : "my_similarity" }
>>>>>       }
>>>>>     }
>>>>>   }
>>>>> }
>>>>>
>>>>> If I replace tfCappedSimilarity with tfCapped in the mapping, the
>>>>> error is the same except that the provider is referred to as
>>>>> tfCappedSimilarityProvider and not as
>>>>> tfCappedSimilaritySimilarityProvider.
>>>>> Cheers,
>>>>> Patrick
>>>>>
>>>>> On Monday, March 31, 2014 at 17:13:24 UTC-4, Ivan Brusic wrote:
>>>>>>
>>>>>> Can you also post your mapping where you defined the similarity?
>>>>>>
>>>>>> --
>>>>>> Ivan
>>>>>>
>>>>>> On Mon, Mar 31, 2014 at 10:36 AM, geantbrun <agin.p...@gmail.com> wrote:
>>>>>>
>>>>>>> I realize that I probably have to define the similarity property of
>>>>>>> my field as "my_similarity" (and not as "tfCappedSimilarity") and
>>>>>>> define my_similarity in the settings as being of type
>>>>>>> tfCappedSimilarity. When I do that, I get the following error at
>>>>>>> index/mapping creation:
>>>>>>>
>>>>>>> {"error":"IndexCreationException[[exbd] failed to create index]; nested: NoClassSettingsException[Failed to load class setting [type] with value [tfCappedSimilarity]]; nested: ClassNotFoundException[org.elasticsearch.index.similarity.tfcappedsimilarity.tfCappedSimilaritySimilarityProvider]; ","status":500}]
>>>>>>>
>>>>>>> Note that the provider is referred to in the error as
>>>>>>> tfCappedSimilaritySimilarityProvider ("Similarity" appears twice).
>>>>>>> Is that normal?
>>>>>>> Patrick
>>>>>>>
>>>>>>> On Monday, March 31, 2014 at 13:06:00 UTC-4, geantbrun wrote:
>>>>>>>
>>>>>>>> Hi Ivan,
>>>>>>>> I followed your instructions but it does not seem to work; I must
>>>>>>>> have gone wrong somewhere. I created the jar file from the
>>>>>>>> following two java files. Could you tell me if they are ok?
>>>>>>>>
>>>>>>>> tfCappedSimilarity.java
>>>>>>>> ***************************
>>>>>>>> package org.elasticsearch.index.similarity;
>>>>>>>>
>>>>>>>> import org.apache.lucene.search.similarities.DefaultSimilarity;
>>>>>>>> import org.elasticsearch.common.logging.ESLogger;
>>>>>>>> import org.elasticsearch.common.logging.Loggers;
>>>>>>>>
>>>>>>>> public class tfCappedSimilarity extends DefaultSimilarity {
>>>>>>>>
>>>>>>>>     private ESLogger logger;
>>>>>>>>
>>>>>>>>     public tfCappedSimilarity() {
>>>>>>>>         logger = Loggers.getLogger(getClass());
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     /**
>>>>>>>>      * Capped tf value
>>>>>>>>      */
>>>>>>>>     @Override
>>>>>>>>     public float tf(float freq) {
>>>>>>>>         return (float)Math.sqrt(Math.min(9, freq));
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> tfCappedSimilarityProvider.java
>>>>>>>> *************************************
>>>>>>>> package org.elasticsearch.index.similarity;
>>>>>>>>
>>>>>>>> import org.elasticsearch.common.inject.Inject;
>>>>>>>> import org.elasticsearch.common.inject.assistedinject.Assisted;
>>>>>>>> import org.elasticsearch.common.settings.Settings;
>>>>>>>>
>>>>>>>> public class tfCappedSimilarityProvider extends AbstractSimilarityProvider {
>>>>>>>>
>>>>>>>>     private tfCappedSimilarity similarity;
>>>>>>>>
>>>>>>>>     @Inject
>>>>>>>>     public tfCappedSimilarityProvider(@Assisted String name, @Assisted Settings settings) {
>>>>>>>>         super(name);
>>>>>>>>         this.similarity = new tfCappedSimilarity();
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     /**
>>>>>>>>      * {@inheritDoc}
>>>>>>>>      */
>>>>>>>>     @Override
>>>>>>>>     public tfCappedSimilarity get() {
>>>>>>>>         return similarity;
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> In my mapping, I define the similarity property of my field as
>>>>>>>> tfCappedSimilarity, is it ok?
>>>>>>>>
>>>>>>>> What makes me say that it does not work: I insert a doc with a word
>>>>>>>> repeated 16 times in my field.
>>>>>>>> When I do a search with that word, the result shows a tf of 4
>>>>>>>> (square root of 16) and not 3 as I was expecting. Is there a way
>>>>>>>> to know if the similarity was loaded or not (maybe in a log file?).
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Patrick
>>>>>>>>
>>>>>>>> On Wednesday, March 26, 2014 at 17:16:36 UTC-4, Ivan Brusic wrote:
>>>>>>>>>
>>>>>>>>> I updated my gist to illustrate the SimilarityProvider that goes
>>>>>>>>> along with it. Similarities are easier to add to Elasticsearch
>>>>>>>>> than most plugins. You just need to compile the two files into a
>>>>>>>>> jar and then add that jar to Elasticsearch's classpath
>>>>>>>>> ($ES_HOME/lib most likely). The code will scan for every
>>>>>>>>> SimilarityProvider defined and load it.
>>>>>>>>>
>>>>>>>>> You then map the similarity to a field:
>>>>>>>>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#_configuring_similarity_per_field
>>>>>>>>>
>>>>>>>>> Note that you cannot change the similarity of a field dynamically.
>>>>>>>>>
>>>>>>>>> Ivan
>>>>>>>>>
>>>>>>>>> On Wed, Mar 26, 2014 at 12:49 PM, geantbrun <agin.p...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Britta is looping over words that are passed as parameters. It's
>>>>>>>>>> easy to implement her script for a simple query, but what about
>>>>>>>>>> boolean queries? In my understanding (but I could be wrong of
>>>>>>>>>> course), I would have to parse the query to call the script with
>>>>>>>>>> each sub-clause, am I wrong?
>>>>>>>>>>
>>>>>>>>>> I prefer your custom similarity alternative. Again, sorry for the
>>>>>>>>>> silly question (newbie!) but where do you put your java file?
>>>>>>>>>> Is it the only thing that is needed (apart from the modification
>>>>>>>>>> in the mapping)?
>>>>>>>>>> Cheers,
>>>>>>>>>> Patrick
>>>>>>>>>>
>>>>>>>>>> On Wednesday, March 26, 2014 at 11:58:52 UTC-4, Ivan Brusic wrote:
>>>>>>>>>>>
>>>>>>>>>>> I am still on a version of Elasticsearch that does not have
>>>>>>>>>>> access to the new scoring capabilities, so I cannot test out
>>>>>>>>>>> any scripts. The non-normalized term frequency should be the
>>>>>>>>>>> line:
>>>>>>>>>>> tf = _index[field][word].tf()
>>>>>>>>>>>
>>>>>>>>>>> If that is the case, you could substitute that line with
>>>>>>>>>>> something like:
>>>>>>>>>>> tf = Math.min(10, _index[field][word].tf())
>>>>>>>>>>>
>>>>>>>>>>> As I stated before, I am used to using Similarities, so I find
>>>>>>>>>>> that approach easier. Here is a custom similarity that I used
>>>>>>>>>>> in Elasticsearch (it removes any norms that are indexed):
>>>>>>>>>>> https://gist.github.com/brusic/9786587
>>>>>>>>>>>
>>>>>>>>>>> The second part would be the tf() method, which you would need
>>>>>>>>>>> to implement instead of the decodeNormValue I used.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> Ivan
>>>>>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "elasticsearch" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to elasticsearc...@googlegroups.com.
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/elasticsearch/6370b4dc-8243-4aea-918a-e4e4e9588aaf%40googlegroups.com.
>>>>>>>
>>>>>>> For more options, visit https://groups.google.com/d/optout.
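
The arithmetic behind the 16-occurrence test described in this thread can be checked without Elasticsearch or Lucene. What follows is only a minimal standalone sketch, not the similarity class from the thread: CappedTfDemo, defaultTf, cappedTf and MAX_FREQ are hypothetical names introduced here, the cap of 9 is copied from the tfCappedSimilarity snippet quoted above, and defaultTf assumes the plain square-root tf that the thread attributes to the default similarity.

```java
// Standalone illustration only; this is NOT the Lucene/Elasticsearch class
// from the thread and it does not depend on either library.
final class CappedTfDemo {

    // Cap copied from the tfCappedSimilarity snippet quoted in the thread.
    static final float MAX_FREQ = 9f;

    // Assumed default behaviour: tf is the square root of the raw frequency.
    static float defaultTf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Capped variant: frequencies above MAX_FREQ are treated as MAX_FREQ.
    static float cappedTf(float freq) {
        return (float) Math.sqrt(Math.min(MAX_FREQ, freq));
    }

    public static void main(String[] args) {
        float[] freqs = {1f, 4f, 9f, 16f};
        for (float freq : freqs) {
            // Print both values side by side for each raw frequency.
            System.out.println("freq=" + freq
                    + " default=" + defaultTf(freq)
                    + " capped=" + cappedTf(freq));
        }
    }
}
```

For a raw frequency of 16, defaultTf gives 4 and cappedTf gives 3, so a tf of 4 in the search explanation suggests that the default similarity, not the custom one, was applied to the field.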