Yes, this should definitely be mentioned somewhere (in the documentation
:) At least we left a trace on the mailing list, so it'll be possible to
refer to it.
D.
Jérôme Charron wrote:
You're right -- changing anything with the input (snippets length,
number of documents etc) will alter the c
You're right -- changing anything with the input (snippets length,
number of documents etc) will alter the clusters. This is basically how
it works. If you want clustering in your search engine then, depending
on the type of data you serve, you'll have to experiment with the
settings a bit and see
Hi Jerome,
Yes Dawid, but it is already committed => the clustering now uses the plain
text version returned by the toString() method.
Ugh, yes, sorry about that, it uses Summary.toStrings(summaries) to be
specific and that uses toString internally.
Actually, the clustering uses the summa
Bob Carpenter of alias-i had this to say when I brought up this very
idea:
http://article.gmane.org/gmane.comp.jakarta.lucene.devel/12599
Thanks for your response, Marvin.
But ultimately my question is: shouldn't the Nutch clustering use
fixed-size snippets instead of the configurable displaye
> (but if the nutch-site.xml overrides the plugin.includes property and
> doesn't include it, it will not be activated, like any other plugin)
Yes, that's what I meant; I guess that's the default case for people
hacking plugins.
Oh, yes Sami, I understand what you mean...
Sorry, I just forgot to m
On May 11, 2006, at 3:36 AM, Jérôme Charron wrote:
Actually, the clustering uses the summaries as input. I assume it
would provide better results if it took the whole document
content, no?
I assume that clustering uses the summaries instead of the document
content for some performa
Jérôme Charron wrote:
(but if the nutch-site.xml overrides the plugin.includes property and
doesn't include it, it will not be activated, like any other plugin)
Yes, that's what I meant; I guess that's the default case for people
hacking plugins.
--
Sami Siren
Add 3. Clustering would benefit from a plain text version.
Yes Dawid, but it is already committed => the clustering now uses the plain
text version returned by the toString() method.
Dawid, I have a question about clustering.
Actually, the clustering uses the summaries as input. I assume it wo
The reason is that they should not use the same HTML code :
1. OpenSearch should only use around highlights
2. search.jsp should use some more complicated HTML code ()
Add 3. Clustering would benefit from a plain text version.
D.
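The idea discussed in this thread, one toString(Formatter) method on Summary with a separate formatter per consumer (OpenSearch, search.jsp, clustering), might be sketched roughly as below. This is only an illustration under stated assumptions, not the committed Nutch code: the names SummarySketch, Fragment, Formatter, PLAIN and BOLD_HTML are hypothetical, and the use of a bold tag around highlights is an assumption as well.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names are taken from the Nutch sources.
public class SummarySketch {

    /** One piece of a summary; highlight marks a matched query term. */
    public static final class Fragment {
        public final String text;
        public final boolean highlight;

        public Fragment(String text, boolean highlight) {
            this.text = text;
            this.highlight = highlight;
        }
    }

    /** Hypothetical interface: renders a single fragment as a string. */
    public interface Formatter {
        String format(Fragment fragment);
    }

    /** Plain text, e.g. as clustering input: no markup, no escaping. */
    public static final Formatter PLAIN = f -> f.text;

    /** Escaped text with a bold tag around highlights (the tag is an assumption). */
    public static final Formatter BOLD_HTML =
        f -> f.highlight ? "<b>" + escape(f.text) + "</b>" : escape(f.text);

    /** Minimal HTML escaping; the ampersand must be replaced first. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Renders all fragments with the given formatter, in order. */
    public static String toString(List<Fragment> fragments, Formatter formatter) {
        StringBuilder sb = new StringBuilder();
        for (Fragment f : fragments) {
            sb.append(formatter.format(f));
        }
        return sb.toString();
    }

    /** A small made-up summary used for demonstration. */
    public static List<Fragment> sample() {
        return Arrays.asList(
            new Fragment("Nutch ", false),
            new Fragment("clustering", true),
            new Fragment(" uses <summaries>", false));
    }

    public static void main(String[] args) {
        System.out.println(toString(sample(), PLAIN));
        System.out.println(toString(sample(), BOLD_HTML));
    }
}
```

With this shape the HTML-specific code lives in one formatter instead of being repeated in each caller, and a plain-text consumer such as clustering simply passes a formatter that emits no markup.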
Jérôme Charron wrote:
Yes Doug, but in fact the idea is to add the toString(Formatter) method in
a common place (Summary), and to add one specific Formatter implementation
for OpenSearch and another one for search.jsp.
The reason is that they should not use the same HTML code :
1. OpenSearch sho
> String toString(Encoder, Formatter) like in the Lucene's Highlighter and
> provide some basic implementations of Encoder and Formatter.
That sounds fine, but in the meantime, let's not reproduce the
html-specific code in lots of places. We need it in both search.jsp and
in OpenSearchServlet.jav
> Also a friendly hint to all plugin hackers, you need to enable
> summary-basic in your existing nutch-site.xml to get things working.
> Took me some time to realize this fact :)
I think we should add this to nutch-default.xml,
Did I miss something?
summary-basic is activated in the nutch-de
> Also a friendly hint to all plugin hackers, you need to enable
> summary-basic in your existing nutch-site.xml to get things working.
> Took me some time to realize this fact :)
Sounds like we should enable it by default, no?
The summary-basic plugin is already enabled by default in nutch-defa
Sami Siren wrote:
Also a friendly hint to all plugin hackers, you need to enable
summary-basic in your existing nutch-site.xml to get things working.
Took me some time to realize this fact :)
Sounds like we should enable it by default, no?
Doug
Sami Siren wrote:
Doesn't this break any existing application that uses OpenSearch and
displays summaries in a web browser? This is an incompatible change
which we should avoid.
Also a friendly hint to all plugin hackers, you need to enable
summary-basic in your existing nutch-site.xml t
Doesn't this break any existing application that uses OpenSearch and
displays summaries in a web browser? This is an incompatible change
which we should avoid.
Also a friendly hint to all plugin hackers, you need to enable
summary-basic in your existing nutch-site.xml to get things workin
Jérôme Charron wrote:
This means there's no markup in the OpenSearch output?
Yes, no markup for now.
Doesn't this break any existing application that uses OpenSearch and
displays summaries in a web browser? This is an incompatible change
which we should avoid.
Shouldn't there be?
Th
This means there's no markup in the OpenSearch output?
Yes, no markup for now.
Shouldn't there be?
The restriction on description field is : "Can contain simple escaped HTML
markup, such as , , , and elements."
So, ya, why not. We can add around highlights.
What do you and others think?
Thanks for making this change!
A few comments:
[EMAIL PROTECTED] wrote:
==
---
lucene/nutch/trunk/src/java/org/apache/nutch/searcher/OpenSearchServlet.java
(original)
+++
lucene/nutch/trunk/src/java/org/apache/nutch/