We often calculate co-occurrence information as an offline task and store it, so that at run time it is just a simple lookup. You just have to put together the appropriate loops, based on the window size you want, for any given term. This is probably not efficient if your index is changing a lot.
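
The offline step described above could look roughly like the sketch below. It is a minimal illustration in plain Java, not code from this thread: the class name CooccurrenceSketch, the methods addDocument and lookup, and the sample tokens are all hypothetical, and it assumes the tokens of each document are already available as a list (for example, produced by the same Analyzer used at index time).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: builds a term -> (neighbour -> count) table.
final class CooccurrenceSketch {

    // Offline step: for every token, count each other token that lies within
    // `window` positions to its left or right in the same document.
    static void addDocument(List<String> tokens, int window,
                            Map<String, Map<String, Integer>> table) {
        for (int i = 0; i < tokens.size(); i++) {
            String term = tokens.get(i);
            Map<String, Integer> row = table.get(term);
            if (row == null) {
                row = new HashMap<String, Integer>();
                table.put(term, row);
            }
            int from = Math.max(0, i - window);
            int to = Math.min(tokens.size() - 1, i + window);
            for (int j = from; j <= to; j++) {
                if (j == i) {
                    continue; // skip the term itself
                }
                String neighbour = tokens.get(j);
                Integer old = row.get(neighbour);
                row.put(neighbour, old == null ? 1 : old + 1);
            }
        }
    }

    // Run-time step: a plain lookup in the precomputed table.
    static int lookup(Map<String, Map<String, Integer>> table,
                      String term, String neighbour) {
        Map<String, Integer> row = table.get(term);
        if (row == null) {
            return 0;
        }
        Integer n = row.get(neighbour);
        return n == null ? 0 : n;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> table =
                new HashMap<String, Map<String, Integer>>();
        List<String> tokens =
                Arrays.asList("we", "play", "soccer", "and", "watch", "soccer");
        addDocument(tokens, 1, table);
        System.out.println("soccer/play: " + lookup(table, "soccer", "play"));
    }
}
```

The table has to be rebuilt or patched whenever documents are added or removed, which is why this approach fits a mostly static index better than one that changes often.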

-Grant

On Oct 3, 2006, at 6:14 AM, Renzo Scheffer wrote:

I am trying to get back a list of all left or right neighbours of a search term. Then I will count them to find out how often a specific word is used as a neighbour of the search term. I know that the results vary with the Analyzer/Filter used. It's just an experiment, and first I want to find out whether it is possible to do something like that with Lucene.

Renzo

-----Original Message-----
From: Nicolas Lalevée [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 3, 2006 00:04
To: java-user@lucene.apache.org
Subject: Re: get terms by positions

On Monday, 2 October 2006 23:06, Renzo Scheffer wrote:
Hi,

can anybody be so kind as to tell me whether it is possible to look up a term by its position?

I search for a term (for example "soccer") and get back the doc IDs and positions as follows:

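import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermPositions;

// reader is an already opened IndexReader
TermPositions termPos = reader.termPositions(new Term("contents", "soccer"));
while (termPos.next()) {
    int docNumber = termPos.doc();
    int freq = termPos.freq();
    // nextPosition() may be called at most freq() times per document
    for (int i = 0; i < freq; i++) {
        int position = termPos.nextPosition();
        System.out.println("DocId: " + docNumber + "; Pos: " + position);
    }
}
termPos.close();
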
Output:

DocId: 0; Pos: 1
DocId: 0; Pos: 4
DocId: 0; Pos: 7
DocId: 1; Pos: 3
DocId: 1; Pos: 7

Now I am trying to get back the terms one position before/after "soccer". I considered taking the position and increasing or decreasing it, but I can't find a way to get back a term for a given position.

Can anybody help me?


I think it does not make sense to try to find a term this way. In Lucene you search with a term; you do not try to retrieve one. Basically, Lucene holds a list of terms pointing to documents, not the reverse.

Maybe if you explain why you are trying to do that, we can find a better way to do it.

Nicolas
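
Because the index maps terms to positions rather than positions to terms, one way to find neighbours is to invert that mapping for a single document and then read off the entries at position - 1 or position + 1. The sketch below shows only that inversion in plain Java. The class name NeighbourSketch, its methods, and the sample data are hypothetical. It assumes that the positions of every term in one document are already at hand. In Lucene of that period I believe they could be read from a term vector stored with positions (TermPositionVector), but that part is an assumption and is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
final class NeighbourSketch {

    // Inverts term -> positions (one document) into position -> term.
    // Positions without a term (e.g. removed stop words) stay null.
    static String[] termsByPosition(Map<String, int[]> positionsOfTerm) {
        int max = -1;
        for (int[] positions : positionsOfTerm.values()) {
            for (int p : positions) {
                max = Math.max(max, p);
            }
        }
        String[] byPosition = new String[max + 1];
        for (Map.Entry<String, int[]> e : positionsOfTerm.entrySet()) {
            for (int p : e.getValue()) {
                byPosition[p] = e.getKey();
            }
        }
        return byPosition;
    }

    // Counts the terms found `offset` positions away from each occurrence of
    // `term`: offset -1 gives left neighbours, +1 gives right neighbours.
    static Map<String, Integer> neighbours(Map<String, int[]> positionsOfTerm,
                                           String term, int offset) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        int[] hits = positionsOfTerm.get(term);
        if (hits == null) {
            return counts; // term does not occur in this document
        }
        String[] byPosition = termsByPosition(positionsOfTerm);
        for (int p : hits) {
            int q = p + offset;
            if (q < 0 || q >= byPosition.length || byPosition[q] == null) {
                continue; // no neighbour at that position
            }
            Integer old = counts.get(byPosition[q]);
            counts.put(byPosition[q], old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample document: "we play soccer daily play soccer"
        Map<String, int[]> positions = new HashMap<String, int[]>();
        positions.put("we", new int[] {0});
        positions.put("play", new int[] {1, 4});
        positions.put("soccer", new int[] {2, 5});
        positions.put("daily", new int[] {3});
        System.out.println("left:  " + neighbours(positions, "soccer", -1));
        System.out.println("right: " + neighbours(positions, "soccer", 1));
    }
}
```

Summing these per-document counts over all documents that contain the term would give the overall neighbour frequencies asked about.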

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


--------------------------
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org

Voice: 315-443-5484
Skype: grant_ingersoll
Fax: 315-443-6886



