Oh, silly of me. :)

Thanks,
Edward

On Fri, 29 Nov 2019 at 07:13, Alan Woodward <[email protected]> wrote:

> I think it’s working fine - Luke is showing you the docFreq of the term,
> which will be 1 as it only appears in a single document.
>
> On 28 Nov 2019, at 21:51, Edward Ribeiro <[email protected]> wrote:
>
> Hi,
>
> Does anyone have an example of DelimitedTermFrequencyTokenFilter usage
> that they could share?
>
> I have been banging my head against the wall trying to make it work
> (https://gist.github.com/eribeiro/ebb24feb3fd84931b7c288b9b716ed49) and
> idk what I am doing wrong.
>
> I am creating a custom analyzer that uses a WhitespaceTokenizer to parse a
> string like "a|10 b|2 c|9" and pass it to
> DelimitedTermFrequencyTokenFilter. I am adding a custom field to the
> document so that it is indexed without positions and offsets.
>
> The debugger shows the string is being correctly parsed by DTFTF and its
> char and term attributes are properly set up. But the term frequency of
> each term is 1 when I inspect the index via Luke. Curiously, the output of
> my snippet shows the correct total term frequency, as seen below:
>
> field="text",maxDoc=1,docCount=1,sumTotalTermFreq=123,sumDocFreq=3
> a|10 b|23 c|90
> SumTotalTermFreq: 123
> SumDocFreq: 3
>
> Cheers,
> Edward
>
> PS: I am a Lucene newbie so it may be something quite stupid.
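
The distinction Alan draws between docFreq and term frequency can be checked by hand on the sample input. What follows is a minimal sketch in plain Python, not Lucene code: `parse_delimited` and `field_stats` are hypothetical helpers that only mimic the bookkeeping, assuming whitespace tokenization, a `|` delimiter, and a frequency of 1 for any token that carries no delimiter.

```python
from collections import Counter


def parse_delimited(text, delimiter="|"):
    """Map each term to its frequency for one document.

    Hypothetical helper: splits on whitespace, then on the delimiter.
    A token without a delimiter is assumed to count once.
    """
    freqs = {}
    for token in text.split():
        term, _, freq = token.partition(delimiter)
        freqs[term] = freqs.get(term, 0) + (int(freq) if freq else 1)
    return freqs


def field_stats(docs):
    """Return (doc_freq, total_term_freq) per term over all documents."""
    doc_freq = Counter()         # number of documents containing the term
    total_term_freq = Counter()  # sum of the term's frequencies over documents
    for text in docs:
        for term, freq in parse_delimited(text).items():
            doc_freq[term] += 1
            total_term_freq[term] += freq
    return doc_freq, total_term_freq


# The single document from the thread's printed output.
doc_freq, total_term_freq = field_stats(["a|10 b|23 c|90"])

print(dict(doc_freq))         # per-term document counts
print(dict(total_term_freq))  # per-term frequencies
print(sum(total_term_freq.values()), sum(doc_freq.values()))
```

With only one document in the index, every term has a docFreq of 1 while the frequencies still add up to 123 across 3 terms, which agrees with the sumTotalTermFreq=123 and sumDocFreq=3 in the printed statistics.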
