rclabo commented on issue #569:
URL: https://github.com/apache/lucenenet/issues/569#issuecomment-991120781
Emre, it’s a really good question, and one I’ve wondered about myself. It
prompted me to do a bit of digging, and this is the conclusion I reached:
It seems that Lucene considers the step of converting an Int64Field into a
trie structure for indexing to be a form of tokenization. The approach does
not use an Analyzer per se, but Lucene does greatly change the form of the
number before putting that new representation into the index. A non-tokenized
field has its value placed directly in the inverted index as a single term.
That is not the case for numbers: what is placed in the inverted index is a
set of terms forming a trie structure that corresponds to the number. That
trie structure often has 8 terms, but the number of terms will vary based on
the numeric Field’s NumericPrecisionStep.
One piece of code that sheds a bit of light on this is here:
https://github.com/apache/lucenenet/blob/Lucene.Net_4_8_0_beta00015/src/Lucene.Net/Document/Field.cs#L168
-Ron
rclabo
From: Emre Rauhofer ***@***.***
Sent: Thursday, December 9, 2021 7:57 AM
To: apache/lucenenet
Cc: Subscribed
Subject: [apache/lucenenet] Int64Field tokenized (Issue #569)
Hello,
I looked through the source code and saw that the Int64Field has the
parameter IsTokenized set to true.
I found that weird, because I thought only strings can be tokenized.
What does that mean for the integer?
And how does it affect the search?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]