We often calculate co-occurrence information as an offline task and
store it and then it is just a simple lookup at run time. You just
have to put together the appropriate loops based on the window size
that you want for any given term. Probably not efficient if you
index is changing a lot.
-Grant
On Oct 3, 2006, at 6:14 AM, Renzo Scheffer wrote:
I try to get back a list of all left or right neighbours of a
searchterm.
Then I will count them to get back the Information, how often a
specific
word is used as neighbour of the searchterm. I know that the
results are
variable according to the used Analyzer/Filter. It's just an
experiment and
first I'll try to find out if it is possible to do something like
that with
Lucene.
Renzo
-----Ursprüngliche Nachricht-----
Von: Nicolas Lalevée [mailto:[EMAIL PROTECTED]
Gesendet: Dienstag, 3. Oktober 2006 00:04
An: java-user@lucene.apache.org
Betreff: Re: get terms by positions
Le Lundi 02 Octobre 2006 23:06, Renzo Scheffer a écrit :
Hi,
can anybody be so kind to tell me if it is possible to search a
Term by
its
position?
I search a term (for excample "soccer") and get back the DocId's and
positions as follows:
TermPositions termPos = reader.termPositions(new
Term("contents","soccer"));
while(termPos.next()){
int freq = termPos.freq();
for(int i=0; i<freq; i++){
int docNumber = termPos.doc();
int position = termPos.nextPosition();
System.out.println("DocId: "+docNumber+"; Pos:"+position);
}
Output:
DocId: 0; Pos: 1
DocId: 0; Pos: 4
DocId: 0; Pos: 7
DocId: 1; Pos: 3
DocId: 1; Pos: 7
Now I try to get back terms, one position before/after "soccer". I
considered to take the
Position and increase or decrease it. But I can't find a way to
get back a
term, according to the given Position.
Can anybody help me?
I think this is a non-sense to try to find a term. In Lucene, you
search
with
a term, you are not trying to get some. Basically, in Lucene, you
have a
list
of term pointing on documents, not the reverse.
Maybe if you explain why you are trying to do that, we can find a
better way
to do it.
Nicolas
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Skype: grant_ingersoll
Fax: 315-443-6886
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]