Ahmed,

FYI, I updated the term collocations package I mentioned earlier with a few 
fixes and changes which will make it work for Lucene 3.0.2.  This may help your 
task.

See: 
https://issues.apache.org/jira/browse/LUCENE-474

Thanks,

Ivan Provalov


--- On Sat, 8/21/10, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:

> From: Otis Gospodnetic <otis_gospodne...@yahoo.com>
> Subject: Re: Calculate Term Co-occurrence Matrix
> To: java-user@lucene.apache.org
> Date: Saturday, August 21, 2010, 8:05 AM
> Ahmed,
> 
> That's what that KPE (link in my previous email, below)
> will do for you.  It's 
> not open source at this time, but that is exactly one of
> the things it does.  I 
> think Mahout collocations stuff might work for you, too.
> 
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
> 
> 
> 
> ----- Original Message ----
> > From: ahmed algohary <algoharya...@gmail.com>
> > To: java-user@lucene.apache.org
> > Sent: Sat, August 21, 2010 7:20:03 AM
> > Subject: Re: Calculate Term Co-occurrence Matrix
> > 
> > Thanks for all your answers!
> > 
> > it seems like I did not make my question  clear.
> I have a text corpus and I
> > need to determine the pairs of words that  occur
> together in many documents.
> > I need to do that to be able to measure the 
> semantic proximity between
> > words. This method is expanded
> > here<http://forums.searchenginewatch.com/showthread.php?t=48>.
> > I hope to  find some code that given a text
> corpus, generate all the words
> > pairs with  their probability of occurring
> together.
> > 
> > 
> > On Sat, Aug 21, 2010 at 1:46  AM, Otis
> Gospodnetic <
> > otis_gospodne...@yahoo.com> 
> wrote:
> > 
> > > There is also a non-Mahout Key Phrase Extractor
> for  Collocations, SIPs, and
> > > a
> > > few other things:
> > > http://sematext.com/products/key-phrase-extractor/index.html
> > >
> > >  One of the demos that uses news data is at
> > > http://sematext.com/demo/kpe/index.html
> > >
> > > Otis
> > >  ----
> > > Sematext :: http://sematext.com/ :: Solr - Lucene -
> Nutch
> > > Lucene ecosystem  search :: http://search-lucene.com/
> > >
> > >
> > >
> > > ----- Original  Message ----
> > > > From: Grant Ingersoll <gsing...@apache.org>
> > > > To: java-user@lucene.apache.org
> > >  > Sent: Fri, August 20, 2010 8:52:17 AM
> > > > Subject: Re: Calculate  Term
> Co-occurrence Matrix
> > > >
> > > > You might also be interested  in
> Mahout's collocations package:
> > > >http://cwiki.apache.org/confluence/display/MAHOUT/Collocations
> > >  >
> > > > -Grant
> > > > On  Aug 19, 2010, at 11:39 AM,
> ahmed  algohary wrote:
> > > >
> > > > > Hi all,
> > > > >
> > >  > > I need to know if there is a
> Lucene plug-in or a Lucene-based  API  for
> > > > > calculating the term co-occurrence
> matrix for a  given text  corpus.
> > > > >
> > > > > Thanks!
> > >  > >
> > > > > --
> > > > >  Ahmed
> > >  >
> > > > --------------------------
> > > > Grant  Ingersoll
> > > > http://www.lucidimagination.com/
> > > >
> > > > Search the  Lucene ecosystem
> using  Solr/Lucene:
> > > >http://www.lucidimagination.com/search
> > > >
> > > >
> > >  > 
> ---------------------------------------------------------------------
> > >  > To  unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > >  > For  additional commands, e-mail:
> java-user-h...@lucene.apache.org
> > >  >
> > > >
> > >
> > > 
> ---------------------------------------------------------------------
> > > To  unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > >  For additional commands, e-mail: java-user-h...@lucene.apache.org
> > >
> > >
> > 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
> 
> 




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to