[ https://issues.apache.org/jira/browse/LUCENE-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749414#comment-13749414 ]
Robert Muir commented on LUCENE-5181:
-------------------------------------
{quote}
For their review, users are often presented with a match-oriented table view
rather than a document-oriented table view, i.e., each row in the table
represents a term match, generally with some context, and is joined with some
document metadata.
{quote}
How does highlighting fit into this?
My general concern is that passing the docID, and so encouraging the use of
o.a.l.document.Document within passage processing, will mean that people
retrieve stored fields for every single match, which would be very slow.
Are you using highlighting to rank the most relevant sentences, or do you really
want to enumerate term matches? In the latter case, Query.extractTerms() +
TermsEnum.docsAndPositions() with DocsAndPositionsEnum.FLAG_OFFSETS would be
much more efficient.
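
To make the cost argument concrete, here is a rough, self-contained sketch of
the match-oriented table built from postings rather than from the highlighter.
It does not use Lucene at all: Hit, Row, buildRows and metadataLoader are
hypothetical stand-ins, with the postings map playing the role of what a
per-term postings enumeration with offsets would return, and metadataLoader
playing the role of a stored-fields fetch. The one point it illustrates is that
document metadata is loaded once per docID, not once per match.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

class MatchTable {

    /** Hypothetical stand-in for one posting occurrence: docID plus character offsets. */
    static final class Hit {
        final int docId;
        final int startOffset;
        final int endOffset;

        Hit(int docId, int startOffset, int endOffset) {
            this.docId = docId;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** One row of the match-oriented table: the term, where it matched, and the joined metadata. */
    static final class Row {
        final String term;
        final Hit hit;
        final String metadata;

        Row(String term, Hit hit, String metadata) {
            this.term = term;
            this.hit = hit;
            this.metadata = metadata;
        }
    }

    /**
     * Emits one row per term occurrence. metadataLoader models an expensive
     * stored-fields lookup, so its result is cached and it is invoked at most
     * once per distinct docID, however many matches that document contains.
     */
    static List<Row> buildRows(Map<String, List<Hit>> postings,
                               List<String> queryTerms,
                               IntFunction<String> metadataLoader) {
        Map<Integer, String> metadataByDoc = new HashMap<>();
        List<Row> rows = new ArrayList<>();
        for (String term : queryTerms) {
            List<Hit> hits = postings.get(term);
            if (hits == null) {
                continue; // term does not occur in this toy index
            }
            for (Hit hit : hits) {
                String metadata = metadataByDoc.computeIfAbsent(
                        hit.docId, id -> metadataLoader.apply(id));
                rows.add(new Row(term, hit, metadata));
            }
        }
        return rows;
    }
}
```

With three occurrences of a term spread over two documents, buildRows returns
three rows but calls metadataLoader only twice. Context for each row would
still have to come from somewhere (the offsets only say where the match is), so
this is a sketch of the join, not of snippet extraction.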
> Passage knows its own docID
> ---------------------------
>
> Key: LUCENE-5181
> URL: https://issues.apache.org/jira/browse/LUCENE-5181
> Project: Lucene - Core
> Issue Type: Improvement
> Affects Versions: 4.4
> Reporter: Jon Stewart
> Priority: Minor
>
> The new PostingsHighlight package allows for retrieval of term matches from a
> query if one creates a class that extends PassageFormatter and overrides
> format(). However, class Passage does not have a docID field, nor is this
> provided via PassageFormatter.format(). Therefore, it's very difficult to
> know which Document contains a given Passage.
> It would suffice for PassageFormatter.format() to be passed the docID as a
> parameter. From the code in PostingsHighlighter, this seems like it would be
> easy.