Or 3: create a term list during document creation and place it into one of the document fields. If you only care about term frequency and not term positions, the list will be shorter than the actual document text. You can also collapse the list so that it includes only stemmed forms, and it can be encoded or even compressed. It can be the field that is used for indexing, or it can be an additional field that is only used for term storage. When you retrieve the document, you get the terms out of that field.
Message: 3
Date: Thu, 28 Jun 2001 09:33:01 +0200
From: Gerhard Schwarz <[EMAIL PROTECTED]>
Organization: Frost & Partner
To: Lucene Users <[EMAIL PROTECTED]>
Subject: Re: [Lucene-users] Retrieve Terms for a certain Document?
Hi Tal,
Tal Dayan wrote:
If you store the document fields in the index, you can
then retrieve the document, run the analyzer, filter out
duplicates and get a set of the terms. Is this what
you want?
That's the second way. But I would find it more comfortable if the index
could provide this info. Looking at the indexed words and adding or deleting
some indices is essential for me.
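For reference, the re-analysis route quoted above (retrieve the stored text, run the analyzer, drop duplicates) could look roughly like the sketch below. It is only an illustration: the class ReanalyzeSketch and the method uniqueTerms are hypothetical names, and the lowercase-and-split tokenization is an assumed stand-in for whatever Analyzer was used at indexing time. No Lucene calls are shown.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class ReanalyzeSketch {

    // Hypothetical stand-in for the real Analyzer: lowercases the text and
    // splits on anything that is not a letter or digit. A real setup would
    // have to reuse the same analysis that was applied while indexing.
    static Set<String> uniqueTerms(String storedText) {
        // LinkedHashSet drops duplicates and keeps first-seen order.
        Set<String> terms = new LinkedHashSet<>();
        String[] tokens = storedText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String token : tokens) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // In practice the text would come from a stored document field.
        String storedText = "The index stores the terms; the terms repeat.";
        System.out.println(uniqueTerms(storedText));
    }
}
```

The result is only as good as the match between this tokenization and the one used for indexing, which is part of why asking the index directly would be more comfortable.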
Unfortunately, the indexed file is only present while indexing. It is
a temporarily created text file, generated from various other formats,
with some additional user-defined keywords. The text file is not important
for my application (the original is stored elsewhere); all the information
about where to get the original file is stored in an additional field.
Seems like I have two choices:
1. I take the text file, store it beside the index, and re-analyze
it to get all qualified terms.
2. I change my Analyzer so that it creates a file while
indexing and puts all qualified terms into it.
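A rough sketch of the second choice, writing the qualified terms to a side file at indexing time and reading them back later, might look like this. Everything here is an assumption made for illustration: the class TermFileSketch, its method names, the simple tokenization standing in for the real Analyzer, and the "term, tab, frequency" line format. Hooking this into an actual Analyzer is not shown.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class TermFileSketch {

    // Counts term frequencies. The tokenization is a hypothetical stand-in
    // for the terms the real Analyzer would emit while indexing.
    static Map<String, Integer> countTerms(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Writes one "term<TAB>frequency" line per term (assumed format).
    static void writeTermFile(Path file, Map<String, Integer> counts) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            lines.add(entry.getKey() + "\t" + entry.getValue());
        }
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    // Reads the term file back, e.g. to inspect the terms of a document.
    static Map<String, Integer> readTermFile(Path file) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            int tab = line.indexOf('\t');
            counts.put(line.substring(0, tab), Integer.parseInt(line.substring(tab + 1)));
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("terms", ".txt");
        writeTermFile(file, countTerms("index the terms, then index again"));
        System.out.println(readTermFile(file));
        Files.delete(file);
    }
}
```

Keeping one such file per document next to the index means the terms can be looked up without the temporary text file.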
This is the solution we use in our application. It's not ideal, but it works for the time being. I agree, it would be great to be able to get this directly out of the index.
The second choice has the advantage that this file base could be
used to rebuild the index after a mishap.
Tal
Gerhard
--__--__--
_______________________________________________
Lucene-users mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/lucene-users
End of Lucene-users Digest