RE: Keyword extraction

2008-11-27 Thread Plaatje, Patrick
@lucene.apache.org Subject: Re: Keyword extraction Ah, yes, that is important. In Lucene, the MLT will see if the term vector is stored, and if it is not it will still be able to perform the querying, but in a much, much less efficient way. Lucene will analyze the document (and the variable

Re: Keyword extraction

2008-11-27 Thread Aleksander M. Stensby
[mailto:[EMAIL PROTECTED] Sent: Wednesday, 26 November 2008 15:07 To: solr-user@lucene.apache.org Subject: Re: Keyword extraction Ah, yes, that is important. In Lucene, the MLT will see if the term vector is stored, and if it is not it will still be able to perform the querying, but in a much much much

RE: Keyword extraction

2008-11-26 Thread Plaatje, Patrick
Hi All, as an addition to my previous post, no interestingTerms are returned when I execute the following URL: http://localhost:8080/solr/select/?q=id=18477975&mlt.fl=text&mlt.interestingTerms=list&mlt=true&mlt.match.include=true I get a moreLikeThis list though, any thoughts? Best, Patrick
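A minimal sketch of assembling the request above with explicit parameter separators and URL encoding, in case it helps when reproducing the problem. The parameter names and the /solr/select/ path are taken from the post; the helper name build_mlt_url and the field-query form id:18477975 are assumptions for illustration, not something stated in this message.

```python
from urllib.parse import urlencode

def build_mlt_url(base, doc_id, field="text"):
    """Assemble a MoreLikeThis request URL (hypothetical helper)."""
    params = [
        # Assumption: the intent is a field query on the "id" field.
        ("q", f"id:{doc_id}"),
        ("mlt", "true"),
        ("mlt.fl", field),
        ("mlt.interestingTerms", "list"),
        ("mlt.match.include", "true"),
    ]
    # urlencode joins the pairs with '&' and percent-encodes reserved
    # characters such as ':' in the query value.
    return f"{base}?{urlencode(params)}"

url = build_mlt_url("http://localhost:8080/solr/select/", 18477975)
print(url)
```

Building the query string from a list of pairs keeps each parameter separate, so a dropped "&" or a line wrap in the middle of a parameter name cannot silently merge two parameters.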

Re: Keyword extraction

2008-11-26 Thread Scurtu Vitalie
] wrote: From: Aleksander M. Stensby aleksander. [EMAIL PROTECTED] Subject: Re: Keyword extraction To: solr-user@lucene.apache.org Date: Wednesday, November 26, 2008, 1:03 PM I do not agree with you at all. The concept of MoreLikeThis is based on the fundamental idea of TF-IDF weighting

RE: Keyword extraction

2008-11-26 Thread Plaatje, Patrick
=18477975 numFound=0 start=0/ /lst Instead of delivering details of the interestingTerms. Thanks in advance Patrick -Original Message- From: Aleksander M. Stensby [mailto:[EMAIL PROTECTED] Sent: Wednesday, 26 November 2008 13:03 To: solr-user@lucene.apache.org Subject: Re: Keyword extraction

Re: Keyword extraction

2008-11-26 Thread Aleksander M. Stensby
@lucene.apache.org Subject: Re: Keyword extraction I do not agree with you at all. The concept of MoreLikeThis is based on the fundamental idea of TF-IDF weighting, and not term frequency alone. Please take a look at: http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/similar
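To illustrate the difference between TF-IDF weighting and term frequency alone, here is a small sketch on a toy corpus. It uses the plain textbook formula tf * log(N / df), which is an assumption for illustration and not Lucene's exact scoring; the function name interesting_terms and the sample sentences are likewise made up.

```python
import math
from collections import Counter

def interesting_terms(doc, corpus, top_n=3):
    """Rank the terms of `doc` by tf * idf; `doc` must be one of `corpus`."""
    n_docs = len(corpus)
    # Document frequency: the number of documents each term occurs in.
    df = Counter(term for d in corpus for term in set(d))
    # Term frequency within the document being analyzed.
    tf = Counter(doc)
    # Textbook idf = log(N / df); a term found in every document scores 0.
    scores = {t: count * math.log(n_docs / df[t]) for t, count in tf.items()}
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]

corpus = [
    "the solr index stores the term vector".split(),
    "the query hits the index".split(),
    "the cat sat on the mat".split(),
]
print(interesting_terms(corpus[0], corpus))
```

In the first toy document "the" has the highest raw frequency, but because it occurs in every document its idf is zero, so it drops out of the ranking in favor of rarer terms such as "solr".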

RE: Keyword extraction

2008-11-26 Thread Plaatje, Patrick
-Original Message- From: Aleksander M. Stensby [mailto:[EMAIL PROTECTED] Sent: Wednesday, 26 November 2008 14:37 To: solr-user@lucene.apache.org Subject: Re: Keyword extraction Hi there! Well, first of all I think you have an error in your query, if I'm not mistaken. You say http://localhost

Re: Keyword extraction

2008-11-26 Thread Aleksander M. Stensby
=3,4, results were poor, while for mlt.maxqt=5,6 it gave too many and irrelevant results. Thank you, Best Wishes, Vitalie Scurtu --- On Wed, 11/26/08, Aleksander M. Stensby [EMAIL PROTECTED] wrote: From: Aleksander M. Stensby aleksander. [EMAIL PROTECTED] Subject: Re: Keyword extraction

RE: Keyword extraction

2008-11-26 Thread Scurtu Vitalie
in this case). I hope it helps, Best Regards, Vitalie Scurtu --- On Wed, 11/26/08, Plaatje, Patrick [EMAIL PROTECTED] wrote: From: Plaatje, Patrick [EMAIL PROTECTED] Subject: RE: Keyword extraction To: solr-user@lucene.apache.org Date: Wednesday, November 26, 2008, 10:52 AM Hi All, as an addition

Re: Keyword extraction

2008-11-26 Thread Aleksander M. Stensby
[mailto:[EMAIL PROTECTED] Sent: Wednesday, 26 November 2008 14:37 To: solr-user@lucene.apache.org Subject: Re: Keyword extraction Hi there! Well, first of all I think you have an error in your query, if I'm not mistaken. You say http://localhost:8080/solr/select/?q=id=18477975... but since you

Re: Keyword extraction

2008-11-26 Thread Jeff Newburn
the index now, and see if this fixes the problem. Best, Patrick -Original Message- From: Aleksander M. Stensby [mailto:[EMAIL PROTECTED] Sent: Wednesday, 26 November 2008 14:37 To: solr-user@lucene.apache.org Subject: Re: Keyword extraction Hi there! Well, first of all I think

Re: Keyword extraction

2008-11-26 Thread Scurtu Vitalie
PROTECTED] Subject: Re: Keyword extraction To: solr-user@lucene.apache.org Date: Wednesday, November 26, 2008, 2:43 PM I'm sure that for certain problems and cases you will need to do quite a bit of tweaking to make it work (to suit your needs), but I responded to your statement because you made

Re: Keyword extraction

2008-11-26 Thread Aleksander M. Stensby
with similar title (more like this doesn't work in this case). I hope it helps, Best Regards, Vitalie Scurtu --- On Wed, 11/26/08, Plaatje, Patrick [EMAIL PROTECTED] wrote: From: Plaatje, Patrick [EMAIL PROTECTED] Subject: RE: Keyword extraction To: solr-user@lucene.apache.org Date: Wednesday

Re: Keyword extraction

2008-11-26 Thread Shalin Shekhar Mangar
You might also be interested in http://wiki.apache.org/solr/TermVectorComponent On Wed, Nov 26, 2008 at 12:39 AM, Plaatje, Patrick [EMAIL PROTECTED] wrote: Hi all, Struggling with a question I recently got from a colleague: is it possible to extract keywords from indexed content? In my

Keyword extraction

2008-11-25 Thread Plaatje, Patrick
Hi all, Struggling with a question I recently got from a colleague: is it possible to extract keywords from indexed content? In my opinion it should be possible to find out on what words the ranking of the indexed content is the highest (Lucene or Solr), but have no clue where to begin. Anyone

Re: Keyword extraction

2008-11-25 Thread Ryan McKinley
Lots of approaches out there... the easiest off-the-shelf method would be to use the MoreLikeThisHandler and get the top interesting terms: http://wiki.apache.org/solr/MoreLikeThisHandler ryan On Nov 25, 2008, at 2:09 PM, Plaatje, Patrick wrote: Hi all, Struggling with a question I