Hi Anshum,
my complaint was not meant as a polemic, just a sad observation :(
I know perfectly well that it was a lack of time rather than of intent!
Hopefully I will get some feedback and we can solve/improve the MLT together!

Cheers

On 11 March 2016 at 17:26, Anshum Gupta <[email protected]> wrote:

> Hi Alessandro,
>
> I've updated the JIRA. The committers try and review code whenever they
> get time and in this case, like other such times, I think we were all just
> lacking time, rather than the intent.
>
> Also, not all committers work on all parts of the code, so that narrows
> down the people who could potentially help you.
>
> On Fri, Mar 11, 2016 at 8:49 AM, Alessandro Benedetti <[email protected]> wrote:
>
>> I am starting to feel that it is not that easy to contribute improvements
>> or small fixes to Solr (if they are not super interesting to the masses).
>> I think this one could be a good improvement in the MLT but I would love
>> to discuss this with some committer.
>> The patch is attached, it has been there for months...
>> Any feedback would be appreciated, I want to contribute, but I need some
>> second opinions ...
>>
>> Cheers
>>
>> On 11 February 2016 at 13:48, Alessandro Benedetti <[email protected]> wrote:
>>
>>> Hi Guys,
>>> is it possible to have any feedback?
>>> Is there any process to speed up bug resolution / discussions?
>>> I just want to understand if the patch is not good enough, if I need to
>>> improve it or simply no-one took a look ...
>>>
>>> https://issues.apache.org/jira/browse/LUCENE-6954
>>>
>>> Cheers
>>>
>>> On 11 January 2016 at 15:25, Alessandro Benedetti <[email protected]> wrote:
>>>
>>>> Hi guys,
>>>> the patch seems fine to me.
>>>> I didn't spend much more time on the code but I checked the tests and
>>>> the pre-commit checks.
>>>> It seems fine to me.
>>>> Let me know,
>>>>
>>>> Cheers
>>>>
>>>> On 31 December 2015 at 18:40, Alessandro Benedetti <[email protected]> wrote:
>>>>
>>>>> https://issues.apache.org/jira/browse/LUCENE-6954
>>>>>
>>>>> First draft patch available, I will check the tests more thoroughly in
>>>>> the new year!
>>>>>
>>>>> On 29 December 2015 at 13:43, Alessandro Benedetti <[email protected]> wrote:
>>>>>
>>>>>> Sure, I will proceed tomorrow with the Jira and the simple patch +
>>>>>> tests.
>>>>>>
>>>>>> In the meantime let's try to collect some additional feedback.
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> On 29 December 2015 at 12:43, Anshum Gupta <[email protected]> wrote:
>>>>>>
>>>>>>> Feel free to create a JIRA and put up a patch if you can.
>>>>>>>
>>>>>>> On Tue, Dec 29, 2015 at 4:26 PM, Alessandro Benedetti <[email protected]> wrote:
>>>>>>>
>>>>>>> > Hi guys,
>>>>>>> > While I was exploring the way we build the More Like This query, I
>>>>>>> > discovered a part I am not convinced of :
>>>>>>> >
>>>>>>> > Let's see how we build the query :
>>>>>>> > org.apache.lucene.queries.mlt.MoreLikeThis#retrieveTerms(int)
>>>>>>> >
>>>>>>> > 1) we extract the terms from the interesting fields, adding them to a map :
>>>>>>> >
>>>>>>> > Map<String, Int> termFreqMap = new HashMap<>();
>>>>>>> >
>>>>>>> > *( we lose the relation field -> term, we no longer know which field
>>>>>>> > the term came from ! )*
>>>>>>> >
>>>>>>> > org.apache.lucene.queries.mlt.MoreLikeThis#createQueue
>>>>>>> >
>>>>>>> > 2) we build the queue that will contain the query terms, at this point we
>>>>>>> > connect these terms to some field again, but :
>>>>>>> >
>>>>>>> > ...
>>>>>>> >> // go through all the fields and find the largest document frequency
>>>>>>> >> String topField = fieldNames[0];
>>>>>>> >> int docFreq = 0;
>>>>>>> >> for (String fieldName : fieldNames) {
>>>>>>> >>   int freq = ir.docFreq(new Term(fieldName, word));
>>>>>>> >>   topField = (freq > docFreq) ? fieldName : topField;
>>>>>>> >>   docFreq = (freq > docFreq) ? freq : docFreq;
>>>>>>> >> }
>>>>>>> >> ...
>>>>>>> >
>>>>>>> > We identify the topField as the field with the highest document frequency
>>>>>>> > for the term t.
>>>>>>> > Then we build the termQuery :
>>>>>>> >
>>>>>>> > queue.add(new ScoreTerm(word, *topField*, score, idf, docFreq, tf));
>>>>>>> >
>>>>>>> > In this way we lose a lot of precision.
>>>>>>> > Not sure why we do that.
>>>>>>> > I would prefer to keep the relation between terms and fields.
>>>>>>> > This could improve the quality of the MLT query a lot.
>>>>>>> > If I run the MLT on 2 fields, *description* and *facilities* for
>>>>>>> > example, it is likely I want to find documents with similar terms in the
>>>>>>> > description and similar terms in the facilities, without mixing things
>>>>>>> > up and losing the semantics of the terms.
>>>>>>> >
>>>>>>> > Let me know your opinion,
>>>>>>> >
>>>>>>> > Cheers
>>>>>>> >
>>>>>>> > --
>>>>>>> > --------------------------
>>>>>>> >
>>>>>>> > Benedetti Alessandro
>>>>>>> > Visiting card : http://about.me/alessandro_benedetti
>>>>>>> >
>>>>>>> > "Tyger, tyger burning bright
>>>>>>> > In the forests of the night,
>>>>>>> > What immortal hand or eye
>>>>>>> > Could frame thy fearful symmetry?"
>>>>>>> >
>>>>>>> > William Blake - Songs of Experience -1794 England
>>>>>>> >
>>>>>>>
>>>>>>> --
>>>>>>> Anshum Gupta
>>>>>>>
>>>>>
>>>>> --
>>>>> --------------------------
>>>>>
>>>>> Benedetti Alessandro
>>>>> Visiting card : http://about.me/alessandro_benedetti
>>>>>
>>>>> "Tyger, tyger burning bright
>>>>> In the forests of the night,
>>>>> What immortal hand or eye
>>>>> Could frame thy fearful symmetry?"
>>>>>
>>>>> William Blake - Songs of Experience -1794 England
>>>>>
>
> --
> Anshum Gupta
>

--
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
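
The concern raised at the start of this thread can be illustrated with a small, self-contained sketch. It does not use Lucene: the class MltFieldSketch, its two methods, the "field:term" string encoding and the document-frequency numbers are all hypothetical, and an in-memory map stands in for ir.docFreq(new Term(fieldName, word)). The first method mimics the top-field selection quoted from createQueue; the second shows one possible way of keeping the field -> term relation, which is not necessarily what the patch in LUCENE-6954 does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Lucene.
class MltFieldSketch {

    // Mimics the behaviour described in the thread: terms from all fields are
    // merged, then each term is attached to the field where it has the highest
    // document frequency. docFreqs is keyed by "field:term" (an assumption).
    static List<String> collapseToTopField(Map<String, Set<String>> termsByField,
                                           String[] fieldNames,
                                           Map<String, Integer> docFreqs) {
        Set<String> words = new TreeSet<>();
        for (Set<String> terms : termsByField.values()) {
            words.addAll(terms); // the field -> term relation is lost here
        }
        List<String> queryTerms = new ArrayList<>();
        for (String word : words) {
            String topField = fieldNames[0];
            int docFreq = 0;
            for (String fieldName : fieldNames) {
                Integer stored = docFreqs.get(fieldName + ":" + word);
                int freq = (stored == null) ? 0 : stored;
                if (freq > docFreq) {
                    topField = fieldName;
                    docFreq = freq;
                }
            }
            queryTerms.add(topField + ":" + word);
        }
        return queryTerms;
    }

    // One possible alternative: keep every term attached to the field
    // it was actually extracted from.
    static List<String> keepPerField(Map<String, Set<String>> termsByField,
                                     String[] fieldNames) {
        List<String> queryTerms = new ArrayList<>();
        for (String fieldName : fieldNames) {
            Set<String> terms = termsByField.get(fieldName);
            if (terms == null) {
                continue;
            }
            for (String term : new TreeSet<>(terms)) {
                queryTerms.add(fieldName + ":" + term);
            }
        }
        return queryTerms;
    }

    public static void main(String[] args) {
        String[] fieldNames = {"description", "facilities"};

        // Terms extracted from the source document, per field (made-up data).
        Map<String, Set<String>> termsByField = new LinkedHashMap<>();
        termsByField.put("description", new TreeSet<>(Arrays.asList("pool", "quiet")));
        termsByField.put("facilities", new TreeSet<>(Arrays.asList("pool", "wifi")));

        // Made-up index statistics standing in for ir.docFreq(...).
        Map<String, Integer> docFreqs = new HashMap<>();
        docFreqs.put("description:pool", 5);
        docFreqs.put("facilities:pool", 40);
        docFreqs.put("description:quiet", 10);
        docFreqs.put("description:wifi", 30);
        docFreqs.put("facilities:wifi", 20);

        System.out.println(collapseToTopField(termsByField, fieldNames, docFreqs));
        System.out.println(keepPerField(termsByField, fieldNames));
    }
}
```

With this made-up data the collapsed variant yields facilities:pool, description:quiet and description:wifi: "wifi" ends up queried against description although the source document only had it in facilities, and "pool" is no longer matched in description at all. The per-field variant keeps description:pool, description:quiet, facilities:pool and facilities:wifi.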
