http://code.google.com/p/semanticvectors/


If you search the archives of this mailing list, you will find plenty of
past discussions about LSI/LSA and Lucene.
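
For a feel of what LSA does with a term-document matrix, here is a minimal
sketch in plain Python. It is an illustration only: the toy corpus and every
function name are made up for this example, and it does not call the Lucene or
SemanticVectors APIs. It works on raw term counts and gets the truncated SVD by
power iteration with deflation; a real setup would read the counts from Lucene
term vectors and would more likely use a proper SVD library.

```python
import math
import random

# Toy corpus (hypothetical data). "apple" and "jobs" never share a document,
# but both co-occur with "ipod" and "iphone".
DOCS = [
    "steve jobs ipod",
    "apple ipod iphone",
    "jobs iphone",
    "lucene index search",
    "lucene search query",
]


def term_document_matrix(docs):
    """Return (sorted vocabulary, one row of raw counts per term)."""
    tokenized = [d.split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    rows = [[doc.count(t) for doc in tokenized] for t in vocab]
    return vocab, rows


def top_eigenpairs(sym, k, iters=1000, seed=0):
    """Top-k eigenpairs of a symmetric PSD matrix via power iteration
    with deflation. Assumes k does not exceed the rank of the matrix."""
    n = len(sym)
    m = [row[:] for row in sym]  # work on a copy
    rng = random.Random(seed)
    pairs = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        lam = sum(v[i] * sum(m[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        # Deflate: remove the component just found before the next round.
        for i in range(n):
            for j in range(n):
                m[i][j] -= lam * v[i] * v[j]
    return pairs


def lsa_term_vectors(docs, k=2):
    """Map each term to a k-dimensional latent vector."""
    vocab, a = term_document_matrix(docs)
    n = len(vocab)
    # Eigenvectors of A * A^T are the left singular vectors of A;
    # its eigenvalues are the squared singular values.
    gram = [[sum(x * y for x, y in zip(a[i], a[j])) for j in range(n)]
            for i in range(n)]
    pairs = top_eigenpairs(gram, k)
    return {
        term: [math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in pairs]
        for i, term in enumerate(vocab)
    }


def cosine(u, v):
    """Cosine similarity, with 0.0 for zero-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu and nv else 0.0


vectors = lsa_term_vectors(DOCS, k=2)
print("apple ~ jobs  :", round(cosine(vectors["apple"], vectors["jobs"]), 3))
print("apple ~ lucene:", round(cosine(vectors["apple"], vectors["lucene"]), 3))
```

On this toy corpus "apple" and "jobs" end up close together in the latent
space even though they never appear in the same document, while "apple" and
"lucene" stay unrelated. Keeping only k=2 dimensions is what forces terms with
shared context onto the same axis.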



On Tue, Jun 23, 2009 at 6:55 AM, Cool The Breezer
<techcool.ku...@yahoo.com> wrote:
>
> Shashi,
> I think I intend to do the same thing that the LSI methodology implements.
> From your message, it seems that you or somebody else may have used the LSI
> approach in Lucene. Could you share some of your work? I am most interested
> in any library, package, or paper for analyzing terms semantically and
> constructing a vector space.
>
> - RB
>
>
> ----- Original Message ----
> From: Shashi Kant <shashi....@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Tuesday, June 23, 2009 3:20:16 PM
> Subject: Re: Similarity
>
> I suspect what you are looking for is "Latent Semantics" - it can
> algorithmically infer that "iPod~iPhone" or "Apple~Steve Jobs". Google for
> "Latent Semantic Indexing" or "Latent Semantic Analysis" - you can apply
> some of those approaches using the TermVectors in a Lucene index.
> Ontologies such as WordNet are very generic, so if you have a
> domain-specific corpus, you would need to generate some kind of Latent
> Semantic Index to extract the relations in it.
>
>
>
>
> On Tue, Jun 23, 2009 at 5:27 AM, Cool The Breezer
> <techcool.ku...@yahoo.com> wrote:
>
>>
>> I recently started using Lucene as the main search library for all
>> documents on our intranet, and it works extremely well. I am now trying to
>> find the similarity between two sentences or documents, and to use WordNet
>> in our search solution. I have used the WordNet contrib package, and it
>> works well for expanding queries with synonyms. But I get stuck when a
>> query like "Steve Jobs" should return documents containing "apple", or
>> when "pirated" should match "willful downloading copyrighted material".
>> This comes down to finding the meaning of a word with respect to its
>> context. Has anybody done this kind of context-based indexing, that is,
>> tokenizing based on the context of each word/token and then searching
>> after expanding the query with synonyms? I have come across some
>> SourceForge projects like http://wn-similarity.sourceforge.net/ that
>> relate words semantically using WordNet, but I am still confused about how
>> to move ahead with this kind of context-based search. I appreciate your
>> help. I understand that this might not be directly related to Lucene, but
>> it falls in the same domain of search solutions.
>>
>> - RB
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>

