http://code.google.com/p/semanticvectors/
If you search the archives of this mailing list, you will find plenty of past discussions about LSI/LSA and Lucene.

On Tue, Jun 23, 2009 at 6:55 AM, Cool The Breezer <techcool.ku...@yahoo.com> wrote:
> Shashi,
> I think I am planning to do the same thing that the LSI methodology
> implements. From your message, it seems that you or somebody else may
> have used the LSI approach in Lucene. Can you share some of your work?
> I am most interested in any library, package, or paper for analyzing
> terms semantically and constructing a vector space.
>
> - RB
>
> ----- Original Message ----
> From: Shashi Kant <shashi....@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Tuesday, June 23, 2009 3:20:16 PM
> Subject: Re: Similarity
>
> I suspect what you are looking for is "Latent Semantics": it can
> algorithmically infer that "iPod~iPhone" or "Apple~Steve Jobs". Google for
> "Latent Semantic Indexing" or "Latent Semantic Analysis"; you can apply
> some of those approaches using the TermVectors in a Lucene index.
> Ontologies such as WordNet are very generic, so if you have a
> domain-specific corpus, you would need to generate some kind of Latent
> Semantic Index to extract the relations in it.
>
> On Tue, Jun 23, 2009 at 5:27 AM, Cool The Breezer
> <techcool.ku...@yahoo.com> wrote:
>
>> Lately I started using Lucene as the main search library for all documents
>> in our intranet. It works extremely well. I am trying to use similarity
>> functionality to find the similarity between two sentences/documents, and
>> I am trying to use WordNet in our search solution. I have used the wordnet
>> contrib package, and it works really well for expanding queries with
>> synonyms and getting results. But I get stuck when a query like
>> "Steve Jobs" should return documents containing "apple". Similarly,
>> "pirated" should match "willful downloading of copyrighted material".
>> This comes down to finding the meaning of a word with respect to its
>> context. Has anybody done this kind of context-based indexing, that is,
>> tokenizing based on the context of each word/token and then searching
>> after expanding the query with synonyms? I have come across some
>> SourceForge projects, such as http://wn-similarity.sourceforge.net/, that
>> relate words semantically using WordNet, but I am still confused about how
>> to move ahead with this kind of context-based search. I would appreciate
>> your help. I understand that this might not be directly related to Lucene,
>> but it falls in the same domain of search solutions.
>>
>> - RB
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org