IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS vs storing positions and offsets/

2013-05-08 Thread AarKay
I see that Lucene 4.x has FieldInfo.IndexOptions, which can be used to tell
Lucene whether to index documents/frequencies/positions/offsets.

We are in the process of upgrading from Lucene 2.9 to Lucene 4.x, and I was
wondering whether the older versions (2.9) had a way to tell Lucene whether to
index docs/freqs/pos/offsets, or did they always index positions and offsets
by default?

Also I see that Lucene 4.x has FieldType.setStoreTermVectorPositions and
FieldType.setStoreTermVectorOffsets.
Can someone please tell me a use case for storing positions and offsets in
the index?
Is it necessary to store termvector positions and offsets when using
IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS?

Thanks
-AarKay


Re: IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS vs storing positions and offsets/

2013-05-08 Thread AarKay
Thanks Mike. This is a little bit clearer to me now.

Just to make sure I got it right, do you mean that we need to store just
the offsets and set IndexOptions to DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
to be able to use PostingsHighlighter?
Also we don't need to store TermVectors and Positions. Correct?

I believe the use case for storing TermVectors and Positions is to use the
other highlighter (FastVectorHighlighter).
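
To make the positions/offsets distinction concrete, here is a small plain-Java sketch. It makes no Lucene calls: the class PositionsVsOffsets, the Occurrence type, and the whitespace tokenizer are all hypothetical stand-ins for an analyzer, not anything from the Lucene API. It only illustrates that a position is the token's ordinal in the field, that offsets are character indices into the original text, and that a highlighter needs only the offsets to wrap a match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PositionsVsOffsets {

    /** One token occurrence: its ordinal position and its character offsets. */
    public static final class Occurrence {
        public final String term;
        public final int position;     // 0-based ordinal of the token
        public final int startOffset;  // index of the first character
        public final int endOffset;    // index one past the last character

        Occurrence(String term, int position, int startOffset, int endOffset) {
            this.term = term;
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Hypothetical whitespace tokenizer that lowercases terms (stand-in for an analyzer). */
    public static List<Occurrence> tokenize(String text) {
        List<Occurrence> out = new ArrayList<Occurrence>();
        int position = 0;
        int i = 0;
        while (i < text.length()) {
            // Skip whitespace between tokens.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume one token.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i > start) {
                String term = text.substring(start, i).toLowerCase(Locale.ROOT);
                out.add(new Occurrence(term, position++, start, i));
            }
        }
        return out;
    }

    /** Wraps each occurrence of term in <b>..</b>, using only the recorded offsets. */
    public static String highlight(String text, String term) {
        StringBuilder sb = new StringBuilder();
        int last = 0;
        for (Occurrence o : tokenize(text)) {
            if (o.term.equals(term)) {
                sb.append(text, last, o.startOffset);
                sb.append("<b>").append(text, o.startOffset, o.endOffset).append("</b>");
                last = o.endOffset;
            }
        }
        return sb.append(text, last, text.length()).toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("Lucene indexes Lucene", "lucene"));
    }
}
```

In this toy model, the second "Lucene" has position 2 and offsets 15 to 21. A phrase query would care about the position, while the highlighter only uses the offsets.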



On Wed, May 8, 2013 at 5:59 AM, Michael McCandless <luc...@mikemccandless.com> wrote:

> On Wed, May 8, 2013 at 4:23 AM, AarKay wrote:
> > I see that Lucene 4.x has FieldInfo.IndexOptions that can be used to tell
> > lucene whether to Index Documents/Frequencies/Positions/Offsets.
> >
> > We are in the process of upgrading from Lucene 2.9 to Lucene 4.x and I was
> > wondering if there was a way to tell lucene whether to index
> > docs/freqs/pos/offsets or not in the older versions (2.9) or did it always
> > index positions and offsets by default?
>
> I believe in 2.9 you could only say "docs"
> (omitTermFreqAndPositions=true), or "docs+freqs+positions".  Offsets
> are new in 4.x.
>
> > Also I see that Lucene 4.x has FieldType.setStoreTermVectorPositions and
> > FieldType.setStoreTermVectorOffsets.
> > Can someone please tell me a use case for storing positions and offsets in
> > the index?
>
> Storing offsets in the index (postings) lets you use the new
> PostingsHighlighter.  It should be faster than the other two
> highlighters which rely on term vectors or on re-analysis at search
> time.
>
> > Is it necessary to store termvector positions and offsets when using
> > IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS?
>
> No.
>
> Term vectors are stored separately from postings (IndexOptions
> controls what's put into the postings).
>
> Mike McCandless
>
> http://blog.mikemccandless.com


FastVectorHighlighter returns null fragments - Lucene 4.0

2013-05-09 Thread AarKay
I have an index built with Lucene 4 using the following config:
storeTermVectors=true
storeTermVectorPositions=true
storeTermVectorOffsets=true
IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS
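
In code, I believe that configuration corresponds roughly to the FieldType below. This is only a sketch against the Lucene 4.x FieldType API: the class name ContentsFieldConfig and the method contentsType() are made up for illustration, and setStored(true) is an assumption, because the list above does not say whether the field is stored. As far as I understand, FastVectorHighlighter builds its fragments from the stored field text, so that setting is worth checking.

```java
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.FieldInfo.IndexOptions;

public class ContentsFieldConfig {

    /** Builds a FieldType mirroring the settings listed above (hypothetical helper). */
    static FieldType contentsType() {
        FieldType type = new FieldType();
        type.setIndexed(true);
        type.setTokenized(true);
        // Assumption: not stated in the list above.
        type.setStored(true);
        // Term vectors are kept separately from the postings.
        type.setStoreTermVectors(true);
        type.setStoreTermVectorPositions(true);
        type.setStoreTermVectorOffsets(true);
        // Controls what goes into the postings.
        type.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
        type.freeze();
        return type;
    }
}
```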

I am trying to use FastVectorHighlighter to retrieve snippets from the hit
docs, but the getBestFragment method returns null even when there are hit
docs.

Can someone please tell me what I am doing wrong?

Here is the code snippet showing how I am using FastVectorHighlighter:

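import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.standard.ClassicAnalyzer;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.*;
import org.apache.lucene.search.vectorhighlight.*;
import org.apache.lucene.util.Version;

// PRE_TAGS, POST_TAGS and indexSearcher are fields of the enclosing class (not shown).
private FastVectorHighlighter getHighlighter() {
    FragListBuilder fragListBuilder = new SimpleFragListBuilder(200);
    FragmentsBuilder fragmentBuilder = new SimpleFragmentsBuilder(PRE_TAGS, POST_TAGS);
    // phraseHighlight = true, fieldMatch = true
    return new FastVectorHighlighter(true, true, fragListBuilder, fragmentBuilder);
}

public void testHighlight(String term) throws Exception {
    ClassicAnalyzer analyzer = new ClassicAnalyzer(Version.LUCENE_40);
    // "contents" is the default field for the parsed query.
    Query query = new QueryParser(Version.LUCENE_40, "contents", analyzer).parse(term);
    FastVectorHighlighter highlighter = getHighlighter();
    FieldQuery fieldQuery = highlighter.getFieldQuery(query);

    TopDocs topDocs = indexSearcher.search(query, 10);
    List<String> fragments = new ArrayList<String>();
    for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
        // This is the call that returns null for every hit.
        fragments.add(highlighter.getBestFragment(fieldQuery,
                indexSearcher.getIndexReader(), scoreDoc.doc, "contents", 1000));
    }

    System.out.println(fragments.size() + " " + fragments.toString());
}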


Thanks
AarKay