Yes, you are right. I understand now. Let me explain my issue a bit better, using the exact problem I have.

I have this text:

    Information number 61149-008.

Using the tokenizers and filters described previously, I get this list of tokens:

    information number 61149-008. 61149 008

Basically, the last token, "61149-008.", gets tokenized as:

    61149-008. 61149 008

The user is searching for "61149-008" without the dot, so this is not a match. I don't want to change the tokenization on the query side, to avoid altering the matches for other cases.

I would like to delete the dot at the end, basically generating this extra token:

    information number 61149-008. 61149 008 61149-008

I am not sure whether what I am saying makes sense, or whether there is another way to do this right.

Thanks a lot,
Sergio

On 24 November 2017 at 15:31, Shawn Heisey <[email protected]> wrote:

> On 11/24/2017 2:32 AM, marotosg wrote:
>
>> Hi Shawn.
>> Thanks for your reply. Actually my issue is with the last token. It looks
>> like, for the last token of a string, it keeps the dot.
>>
>> In your case: Testing. This is a test. Test.
>>
>> It keeps the "Test."
>>
>> Is there any reason I can't see for that behaviour?
>>
>
> I am really not sure what you're saying here.
>
> Every token is duplicated, one has the dot and one doesn't. This is what
> you wanted based on what I read in your initial email.
>
> Making a guess as to what you're asking about this time: If you're
> noticing that there isn't a "Test" as the last token on the line for WDF,
> then I have to tell you that it actually is there, the display was simply
> too wide for the browser window. Scrolling horizontally would be required
> to see the whole thing.
>
> Thanks,
> Shawn
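
For illustration only, the extra token asked for in the message above can be sketched in Python. This is not Solr configuration and does not reproduce any Solr filter; the function name add_dot_stripped_variants is hypothetical, and it works on a plain list of strings rather than a real analysis chain. It only shows the intended result: every original token is kept, and a copy without the trailing dot is added for tokens that end in one.

```python
def add_dot_stripped_variants(tokens):
    """Return the tokens plus a dot-free copy of each token that ends in a dot.

    Hypothetical helper for illustration; it is not part of Solr or Lucene.
    """
    result = list(tokens)  # keep every original token, in order
    for token in tokens:
        stripped = token.rstrip(".")
        # Add a variant only if a trailing dot was actually removed,
        # something is left, and the variant is not already present.
        if stripped and stripped != token and stripped not in result:
            result.append(stripped)
    return result


# Token list taken from the example in the message.
tokens = ["information", "number", "61149-008.", "61149", "008"]
print(add_dot_stripped_variants(tokens))
```

With the token list from the message, the result contains the five original tokens followed by "61149-008", so a query for "61149-008" without the dot would find a matching indexed token.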
