Hi all!
I'm reimplementing a very Lucene-like search library as a learning
experience and I've run into a snag. Before I go deep code diving, I
thought I'd ask here in case someone has the time to answer.
The term dictionary file includes the term count in a header. But when
I'm merging segm
robert engels wrote:
but a better solution, since you probably need an indexed file into the
terms file, you might not even need the term count, since you should
read the indexed file into memory anyway (read every 16 entries, etc.) -
at which point you will know the number of terms in the file.
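A rough sketch of that idea, for illustration only. It assumes a made-up file layout (length-prefixed UTF-8 terms with no count header) and builds the sparse index by scanning the terms file at load time rather than reading a separate index file. The names write_terms, scan_terms and INDEX_INTERVAL are hypothetical and are not Lucene's API or file format.

```python
import io
import struct

INDEX_INTERVAL = 16  # assumption: keep every 16th term, as suggested above


def write_terms(out, terms):
    """Write length-prefixed UTF-8 terms; note there is no count header."""
    for term in terms:
        data = term.encode("utf-8")
        out.write(struct.pack(">H", len(data)))  # 2-byte big-endian length
        out.write(data)


def scan_terms(inp, interval=INDEX_INTERVAL):
    """Scan the terms once, sampling every interval-th term and its offset.

    Returns (sparse_index, term_count). The count is a by-product of the
    scan, so it never has to be stored in the file.
    """
    sparse_index = []
    count = 0
    while True:
        offset = inp.tell()
        header = inp.read(2)
        if len(header) < 2:  # end of file
            break
        (length,) = struct.unpack(">H", header)
        term = inp.read(length).decode("utf-8")
        if count % interval == 0:
            sparse_index.append((term, offset))
        count += 1
    return sparse_index, count
```

The sampled (term, offset) pairs give a lookup a place to seek to before scanning forward at most 15 entries, and the total count is available as soon as the scan finishes.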
Aside from the useful exchange I had with Robert, I'd still like to know
how Lucene knows what value to write in the "term count" part of the
term dictionary header when it's merging segments -- even if I decide
to forgo it in my own re-implementation.
Of course, I can always just dive into the c
robert engels wrote:
It seeks back at the end to the location and writes the size.
Ah! Sorry I didn't get that. Thanks for your help!
Matt
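For anyone following along, the seek-back approach Robert describes might look roughly like this. It is a minimal sketch under assumptions: the 8-byte big-endian count header and the length-prefixed term layout are invented for the example and are not Lucene's actual format, and merge_unique, write_term_dictionary and read_term_count are hypothetical names.

```python
import heapq
import io
import struct


def merge_unique(*sorted_term_lists):
    """Lazily merge sorted term lists, dropping duplicates.

    Because this is a generator, the number of distinct terms is not
    known until the merge has run to completion.
    """
    previous = None
    for term in heapq.merge(*sorted_term_lists):
        if term != previous:
            yield term
            previous = term


def write_term_dictionary(out, terms):
    """Write a placeholder count, stream the terms, then patch the count."""
    header_pos = out.tell()
    out.write(struct.pack(">q", -1))  # placeholder for the term count
    count = 0
    for term in terms:
        data = term.encode("utf-8")
        out.write(struct.pack(">H", len(data)))
        out.write(data)
        count += 1
    end_pos = out.tell()
    out.seek(header_pos)  # seek back to the header...
    out.write(struct.pack(">q", count))  # ...and overwrite the placeholder
    out.seek(end_pos)  # restore the position for any further writes
    return count


def read_term_count(inp):
    """Read the count from the assumed 8-byte header."""
    (count,) = struct.unpack(">q", inp.read(8))
    return count
```

This only works on a seekable output; a non-seekable stream would need the count stored in a footer or a separate file instead.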
Hi, another abstract implementation question:
Per Term Position (prox) data vs. Per Doc Term Vectors. Belt and
Suspenders? Can't Term Vectors effectively (performantly) replace
position data for doing phrase matches? Is there another use of
position data that term vectors doesn't satisfy? D
[ https://issues.apache.org/jira/browse/LUCENE-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1270#action_1270 ]
Matt Chaput edited comment on LUCENE-1613 at 4/27/09 1:1
Matt Chaput commented on LUCENE-1613:
-
Given how fundamental the issue is w.r.t.