[
https://issues.apache.org/jira/browse/LUCENE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558935#action_12558935
]
Karl Wettin commented on LUCENE-1016:
-------------------------------------
{quote}
I'm curious if the build part of this would be faster than reanalyzing a
document.
{quote}
It is a slow process on an index with many terms. Every term has to be iterated
and matched against the document number.
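
To illustrate why the cost grows with the number of terms, here is a minimal
sketch of the idea. It is not the code from the patch and does not use the
Lucene API: the inverted index is modelled as a plain in-memory map, and the
class name InvertedIndexScan and the method termFrequenciesFor are made up for
this example.

```java
import java.util.Map;
import java.util.TreeMap;

public class InvertedIndexScan {

    /**
     * Rebuilds a term -> frequency view for one document from a toy inverted
     * index of the form term -> (document number -> term frequency).
     * Every term in the index is visited, even those the document never
     * contains, so the work is proportional to the size of the term dictionary.
     */
    public static Map<String, Integer> termFrequenciesFor(
            Map<String, Map<Integer, Integer>> postings, int docNumber) {
        Map<String, Integer> vector = new TreeMap<>();
        for (Map.Entry<String, Map<Integer, Integer>> entry : postings.entrySet()) {
            // Match this term's postings against the requested document number.
            Integer freq = entry.getValue().get(docNumber);
            if (freq != null) {
                vector.put(entry.getKey(), freq);
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        // Hypothetical three-term index over documents 0 and 1.
        Map<String, Map<Integer, Integer>> postings = new TreeMap<>();
        postings.put("lucene", Map.of(0, 2, 1, 1));
        postings.put("vector", Map.of(1, 3));
        postings.put("index", Map.of(0, 1));

        System.out.println(termFrequenciesFor(postings, 1));
    }
}
```

In a real index the per-term lookup is itself a seek into a postings list
rather than a hash lookup, so the toy version understates the cost.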
{quote}
Just thinking out loud, but I have been wondering about a Highlighter that uses
the new TermVectorMapper. Using that alone doesn't account for documents without
term vectors, which still need to be analyzed. I was thinking this might account
for both cases, all through the TermVectorMapper mechanism. It just doesn't seem
like it would be very fast.
{quote}
This patch is mostly about when you don't have access to the source data. It
was used together with a TermVectorMappingCachedTokenStreamFactory to extract
re-indexable documents from any directory.
If you are thinking of using this piece of code together with the highlighter, I would consider
something else, perhaps a tool that could add the term vector to all documents
missing one in a single iteration sweep of the index. I know very little about
the file format and the highlighter though.
> TermVectorAccessor, transparent vector space access
> ----------------------------------------------------
>
> Key: LUCENE-1016
> URL: https://issues.apache.org/jira/browse/LUCENE-1016
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Term Vectors
> Affects Versions: 2.2
> Reporter: Karl Wettin
> Priority: Minor
> Attachments: LUCENE-1016.txt
>
>
> This class visits TermVectorMapper and populates it with information
> transparently, either by passing it down to the default terms cache (documents
> indexed with Field.TermVector) or by resolving the inverted index.