Hi,
here's a small patch that makes the summaries deterministic. Currently, when
two excerpts have the same number of tokens and the same number of
fragments, the comparison falls back to hashCode(). That leads to
quasi-random results, i.e. you can get a different summary when you reload
a result page, which is a bit confusing. With this patch you'll always get
the same summary.
(I actually sent this to Michael Cafarella some months ago, but I think it
was lost or forgotten back then.)
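
To illustrate the effect, here is a minimal standalone sketch, not the real
Summarizer code. The Excerpt class, its fields, and the names TieBreakDemo,
OLD and NEW are hypothetical stand-ins, and the sketch assumes the excerpt
class does not override hashCode(), so the identity-based default from
Object is used. It is written in current Java syntax for brevity.

```java
import java.util.Comparator;

public class TieBreakDemo {
    // Hypothetical stand-in for the summarizer's excerpt type (assumption,
    // not the real Nutch class). hashCode() is deliberately not overridden,
    // so Object's identity-based default applies.
    static final class Excerpt {
        final int numTokens;
        final int numFragments;

        Excerpt(int numTokens, int numFragments) {
            this.numTokens = numTokens;
            this.numFragments = numFragments;
        }
    }

    // Shaped like the code before the patch: ties fall back to hashCode().
    static final Comparator<Excerpt> OLD = (e1, e2) -> {
        if (e1.numTokens < e2.numTokens) return -1;
        if (e1.numTokens > e2.numTokens) return 1;
        int result = e1.numFragments - e2.numFragments;
        return result != 0 ? result : e1.hashCode() - e2.hashCode();
    };

    // Shaped like the code after the patch: ties compare as equal.
    static final Comparator<Excerpt> NEW = (e1, e2) -> {
        if (e1.numTokens < e2.numTokens) return -1;
        if (e1.numTokens > e2.numTokens) return 1;
        return e1.numFragments - e2.numFragments;
    };

    public static void main(String[] args) {
        Excerpt a = new Excerpt(3, 2);
        Excerpt b = new Excerpt(3, 2);
        // The sign of the old result depends on identity hash codes and is
        // not guaranteed to be the same from one run to the next.
        System.out.println("old: " + Integer.signum(OLD.compare(a, b)));
        // The new result depends only on the two counts.
        System.out.println("new: " + NEW.compare(a, b));
    }
}
```

With the patched comparison the ordering of tied excerpts no longer depends
on object identity, which is why the same page keeps the same summary.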
Regards
Daniel
--
http://www.danielnaber.de
Index: Summarizer.java
===================================================================
RCS file: /cvsroot/nutch/nutch/src/java/net/nutch/searcher/Summarizer.java,v
retrieving revision 1.5
diff -u -r1.5 Summarizer.java
--- Summarizer.java 4 Sep 2003 21:17:25 -0000 1.5
+++ Summarizer.java 19 May 2004 21:37:05 -0000
@@ -124,12 +124,7 @@
if (numToks1 < numToks2) {
return -1;
} else if (numToks1 == numToks2) {
- int result = excerpt1.numFragments() - excerpt2.numFragments();
- if (result == 0) {
- return excerpt1.hashCode() - excerpt2.hashCode();
- } else {
- return result;
- }
+ return excerpt1.numFragments() - excerpt2.numFragments();
} else {
return 1;
}