Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-12 Thread Dawid Weiss
Yes, this should definitely be mentioned somewhere (in the documentation :) At least we left a trace on the mailing list, so it'll be possible to refer to it. D. Jérôme Charron wrote: You're right -- changing anything with the input (snippets length, number of documents etc) will alter the c

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-12 Thread Jérôme Charron
You're right -- changing anything with the input (snippets length, number of documents etc) will alter the clusters. This is basically how it works. If you want clustering in your search engine then, depending on the type of data you serve, you'll have to experiment with the settings a bit and see

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-12 Thread Dawid Weiss
Hi Jerome, Yes Dawid, but it is already committed => the clustering now uses the plain text version returned by the toString() method. Ugh, yes, sorry about that, it uses Summary.toStrings(summaries) to be specific and that uses toString internally. Actually, the clustering uses the summa

Re: [Nutch-dev] Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-11 Thread Jérôme Charron
Bob Carpenter of alias-i had this to say when I brought up this very idea: http://article.gmane.org/gmane.comp.jakarta.lucene.devel/12599 Thanks for your response, Marvin. But finally my question is: shouldn't the Nutch clustering use some fixed-size snippets instead of the configurable displaye

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-11 Thread Jérôme Charron
> (but if the nutch-site.xml overrides the plugin.include property and doesn't include it, it will not be activated, like any other plugin) yes, that's what I meant; I guess that's the default case for people hacking plugins. Oh, yes Sami, I understand what you mean... Sorry, I just forgot to m

Re: [Nutch-dev] Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-11 Thread Marvin Humphrey
On May 11, 2006, at 3:36 AM, Jérôme Charron wrote: Actually, the clustering uses the summaries as input. I assume it would provide better results if it took the whole document content, no? I assume that clustering uses the summaries instead of the document content for some performa

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-11 Thread Sami Siren
Jérôme Charron wrote: (but if the nutch-site.xml overrides the plugin.include property and doesn't include it, it will not be activated, like any other plugin) yes, that's what I meant; I guess that's the default case for people hacking plugins. -- Sami Siren
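The override behavior discussed in this message can be illustrated with a small sketch. It assumes that the include list (which I understand to be the `plugin.includes` property in Nutch) is a regular expression matched against the whole plugin id, and that a value in nutch-site.xml replaces the default rather than extending it. The class, method names, and property values below are hypothetical and are not taken from the Nutch code base.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not the Nutch plugin loader.
final class PluginIncludeSketch {

    // Assumption: a site-level value, when present, replaces the default entirely.
    static String effectiveIncludes(String defaultValue, String siteValue) {
        return siteValue != null ? siteValue : defaultValue;
    }

    // Assumption: a plugin is active only if the whole id matches the regex.
    static boolean isActive(String pluginId, String includesRegex) {
        return Pattern.compile(includesRegex).matcher(pluginId).matches();
    }
}
```

With a made-up default of `protocol-http|parse-(text|html)|summary-basic` and a site override of `protocol-http|parse-(text|html)|my-plugin`, `summary-basic` is active under the default but not under the override, which matches the situation described for people who maintain their own plugin list.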

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-11 Thread Jérôme Charron
Add 3. Clustering would benefit from a plain text version. Yes Dawid, but it is already committed => the clustering now uses the plain text version returned by the toString() method. Dawid, I have a question about clustering. Actually, the clustering uses the summaries as input. I assume it wo

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-11 Thread Dawid Weiss
The reason is that they should not use the same HTML code : 1. OpenSearch should only use around highlights 2. search.jsp should use some more complicated HTML code () Add 3. Clustering would benefit from a plain text version. D.
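The three consumers listed above (OpenSearch, search.jsp, clustering) could share one summary representation and differ only in how it is rendered. The following is a minimal sketch of that idea, not the code that was committed: `Fragment`, `SummaryFormatter`, `PlainTextFormatter`, and `TaggedFormatter` are hypothetical names, and the highlight tag is left as a constructor parameter because the tag named in the thread did not survive in this archive.

```java
import java.util.List;

// Hypothetical sketch: these types are illustrative, not the Nutch classes.
final class Fragment {
    final String text;
    final boolean highlighted;

    Fragment(String text, boolean highlighted) {
        this.text = text;
        this.highlighted = highlighted;
    }
}

interface SummaryFormatter {
    String format(List<Fragment> fragments);
}

// Plain text (for example as clustering input): no markup and no escaping.
final class PlainTextFormatter implements SummaryFormatter {
    public String format(List<Fragment> fragments) {
        StringBuilder out = new StringBuilder();
        for (Fragment f : fragments) {
            out.append(f.text);
        }
        return out.toString();
    }
}

// Escapes the text and wraps highlighted fragments in caller-supplied tags.
final class TaggedFormatter implements SummaryFormatter {
    private final String open;
    private final String close;

    TaggedFormatter(String open, String close) {
        this.open = open;
        this.close = close;
    }

    public String format(List<Fragment> fragments) {
        StringBuilder out = new StringBuilder();
        for (Fragment f : fragments) {
            String escaped = escape(f.text);
            out.append(f.highlighted ? open + escaped + close : escaped);
        }
        return out.toString();
    }

    // Minimal HTML escaping; '&' must be replaced first.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

A caller would pick the formatter per output channel, for instance `new TaggedFormatter("<em>", "</em>")` for a web page (the `<em>` tag is an arbitrary placeholder) and `new PlainTextFormatter()` for clustering, so the HTML-specific code lives in one place.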

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-10 Thread Doug Cutting
Jérôme Charron wrote: Yes Doug, but in fact, the idea is to add the toString(Formatter) method in a common place (Summary). And add one specific Formatter implementation for OpenSearch and another one for search.jsp : The reason is that they should not use the same HTML code : 1. OpenSearch sho

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-10 Thread Jérôme Charron
> String toString(Encoder, Formatter) like in Lucene's Highlighter and provide some basic implementations of Encoder and Formatter. That sounds fine, but in the meantime, let's not reproduce the html-specific code in lots of places. We need it in both search.jsp and in OpenSearchServlet.jav

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-10 Thread Jérôme Charron
> Also a friendly hint to all plugin hackers, you need to enable summary-basic in your existing nutch-site.xml to get things working. Took me some time to realize this fact :) I think we should add this to nutch-default.xml, Did I miss something? summary-basic is activated in the nutch-de

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-10 Thread Jérôme Charron
> Also a friendly hint to all plugin hackers, you need to enable summary-basic in your existing nutch-site.xml to get things working. Took me some time to realize this fact :) Sounds like we should enable it by default, no? The summary-basic plugin is already enabled by default in nutch-defa

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-10 Thread Doug Cutting
Sami Siren wrote: Also a friendly hint to all plugin hackers, you need to enable summary-basic in your existing nutch-site.xml to get things working. Took me some time to realize this fact :) Sounds like we should enable it by default, no? Doug

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-10 Thread Andrzej Bialecki
Sami Siren wrote: Doesn't this break any existing application that uses OpenSearch and displays summaries in a web browser? This is an incompatible change which we should avoid. Also a friendly hint to all plugin hackers, you need to enable summary-basic in your existing nutch-site.xml t

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-10 Thread Sami Siren
Doesn't this break any existing application that uses OpenSearch and displays summaries in a web browser? This is an incompatible change which we should avoid. Also a friendly hint to all plugin hackers, you need to enable summary-basic in your existing nutch-site.xml to get things workin

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-10 Thread Doug Cutting
Jérôme Charron wrote: This means there's no markup in the OpenSearch output? Yes, no markup for now. Doesn't this break any existing application that uses OpenSearch and displays summaries in a web browser? This is an incompatible change which we should avoid. Shouldn't there be? Th

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-10 Thread Jérôme Charron
This means there's no markup in the OpenSearch output? Yes, no markup for now. Shouldn't there be? The restriction on description field is : "Can contain simple escaped HTML markup, such as , , , and elements." So, ya, why not. We can add around highlights. What do you and others think?

Re: svn commit: r405565 - in /lucene/nutch/trunk/src: java/org/apache/nutch/searcher/ test/org/apache/nutch/searcher/ web/jsp/

2006-05-09 Thread Doug Cutting
Thanks for making this change! A few comments: [EMAIL PROTECTED] wrote: == --- lucene/nutch/trunk/src/java/org/apache/nutch/searcher/OpenSearchServlet.java (original) +++ lucene/nutch/trunk/src/java/org/apache/nutch/