How to use query terms tfidf as a factor in document similarity calculation

2014-05-18 Thread Diaa Abdallah
Hi, I'm trying to implement Explicit Semantic Analysis (ESA) with Lucene. How do I take a term's TF-IDF in the query into consideration when matching documents? For example: Query: "a b c a d a" Doc1: "a b a" Doc2: "a b c" The query should match Doc1 better than Doc2. I'd like this to work without impacting
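
One way to make repeated query terms count, sketched outside Lucene: treat the query as a small document, weight its terms by TF-IDF as well, and rank by cosine similarity. The snippet below is plain Python rather than Lucene API; the function names and the smoothed IDF formula are assumptions chosen for illustration, not what the poster or Lucene uses.

```python
import math
from collections import Counter

def idf_table(docs):
    """Return a smoothed IDF function over tokenized docs.

    Assumed variant: log((1 + N) / (1 + df)) + 1, so unseen terms stay finite.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return lambda term: math.log((1 + n) / (1 + df[term])) + 1.0

def tfidf_vector(tokens, idf):
    """Map each term to raw term frequency times IDF."""
    tf = Counter(tokens)
    return {term: count * idf(term) for term, count in tf.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# The example from the message above.
docs = {"Doc1": "a b a".split(), "Doc2": "a b c".split()}
idf = idf_table(list(docs.values()))

# The query is weighted exactly like a document, so "a" counts three times.
query_vec = tfidf_vector("a b c a d a".split(), idf)
scores = {name: cosine(query_vec, tfidf_vector(tokens, idf))
          for name, tokens in docs.items()}

for name, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(name, round(value, 4))
```

Under these assumptions Doc1 ranks above Doc2, because the query's three occurrences of "a" line up with Doc1's two.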

Re: tf/idf similarity with modified document similarity

2014-03-07 Thread Jack Krupansky
@lucene.apache.org Subject: tf/idf similarity with modified document similarity Hello, what is the best method to score documents similar to default similarity, but the document frequency should be calculated per query against the matching result document set

tf/idf similarity with modified document similarity

2014-03-06 Thread Christian Reuschling
Hello, what is the best method to score documents similarly to the default similarity, but with the document frequency calculated per query against the matching result document set, not statically against the whole corpus? I didn't find a good and pe
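
The idea in this question, document frequency taken from the documents a query matches rather than from the whole index, can be illustrated as follows. This is a plain-Python sketch and not Lucene API; the OR-match rule, the function names, the square-root term frequency, and the smoothed IDF formula are all assumptions made for the example.

```python
import math
from collections import Counter

def matching_docs(query_terms, corpus):
    """Documents containing at least one query term (assumed OR semantics)."""
    wanted = set(query_terms)
    return {doc_id: tokens for doc_id, tokens in corpus.items()
            if wanted & set(tokens)}

def result_set_idf(query_terms, matched):
    """IDF per query term, with df counted only inside the matched set."""
    n = len(matched)
    df = Counter(term for tokens in matched.values() for term in set(tokens))
    return {term: math.log((1 + n) / (1 + df[term])) + 1.0
            for term in set(query_terms)}

def score(query_terms, corpus):
    """Score matched documents with sqrt(tf) * idf summed over query terms."""
    matched = matching_docs(query_terms, corpus)
    idf = result_set_idf(query_terms, matched)
    scores = {}
    for doc_id, tokens in matched.items():
        tf = Counter(tokens)
        scores[doc_id] = sum(math.sqrt(tf[term]) * idf[term]
                             for term in idf if tf[term])
    return scores

# Hypothetical four-document corpus; only d1 and d2 match the query.
corpus = {
    "d1": "lucene scoring similarity".split(),
    "d2": "lucene index".split(),
    "d3": "cooking recipes".split(),
    "d4": "gardening tips".split(),
}
scores = score(["lucene", "similarity"], corpus)
print(scores)
```

Because "lucene" appears in every matched document, it contributes the minimum weight here, whereas a corpus-wide document frequency would still treat it as fairly rare. The cost is that the matched set has to be known before scoring, which implies two passes over the results.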

Re: Document Similarity

2012-07-30 Thread in.abdul
>> regards shaimaa

RE: Document Similarity

2012-07-30 Thread Elshaimaa Ali
ntain any code any help will be greatly appreciated regards shaimaa > Date: Mon, 30 Jul 2012 07:32:49 -0700 > From: in.ab...@gmail.com > To: java-user@lucene.apache.org > Subject: Re: Document Similarity > > Hi Elshaimaa, > I couldn't understand what your need is. Can you

Re: Document Similarity

2012-07-30 Thread in.abdul
can use to map the document to one of the documents in > the index > regards shaimaa

Re: Measuring document similarity

2012-03-12 Thread Koji Sekiguchi
(12/03/13 2:38), Hassane Cabir wrote: > Hi guys, I'm using Lucene for my project and I need to calculate how similar two (or more) documents are, using TF-IDF. How do I get TF-IDF with Lucene? Any insights on this? Solr has TermVectorComponent which can return tf, df and tf-idf of each term in a docu

Measuring document similarity

2012-03-12 Thread Hassane Cabir
Hi guys, I'm using Lucene for my project and I need to calculate how similar two (or more) documents are, using TF-IDF. How do I get TF-IDF with Lucene? Any insights on this? Thank you for your support. -- Hassane

Re: Document similarity

2006-01-20 Thread Aleksey Serba
e the fields you just indexed... no need to > retrieve it again). > > -Yonik > > On 1/20/06, Klaus <[EMAIL PROTECTED]> wrote: > > > > >In my case, I need to filter similar documents in search results and > > >therefore determine document similarity durin

Re: Document similarity

2006-01-20 Thread Yonik Seeley
n my case, I need to filter similar documents in search results and > >therefore determine document similarity during the indexing process using > >term vectors. Obviously, I can't compare the document currently being indexed > >with all documents in my collection. > > Yes you can.

AW: Document similarity

2006-01-20 Thread Klaus
>In my case, I need to filter similar documents in search results and >therefore determine document similarity during the indexing process using >term vectors. Obviously, I can't compare the document currently being indexed >with all documents in my collection. Yes you can. Right after

Document similarity

2006-01-20 Thread Aleksey Serba
Hello Lucene people! First of all, I would like to thank all the community participants (developers, users, Erik and Otis for the "Lucene in Action" book) for their great work. As far as I understand it, the two most popular approaches to document similarity are: 1. "cosine