Hi,

here's a small patch that makes the summaries deterministic. Currently, when
two excerpts have the same number of tokens and fragments, the comparator
falls back to hashCode() as a tie-breaker. That leads to quasi-random
results, i.e. you can get a different summary when you reload a result page,
which is a bit confusing. With this patch you'll always get the same summary.
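
To illustrate the difference, here is a minimal, self-contained sketch. It is
not the actual Nutch code: the Excerpt class, its numTokens() accessor and the
names OLD and PATCHED are made up for this example, and it assumes that the
excerpt class does not override hashCode(), so the default identity-based
value is used.

```java
import java.util.Comparator;

public class TieBreakDemo {

    // Hypothetical stand-in for the excerpt objects compared in Summarizer.java.
    // hashCode() is deliberately NOT overridden (an assumption of this sketch).
    public static final class Excerpt {
        private final int numTokens;
        private final int numFragments;

        public Excerpt(int numTokens, int numFragments) {
            this.numTokens = numTokens;
            this.numFragments = numFragments;
        }

        public int numTokens() { return numTokens; }
        public int numFragments() { return numFragments; }
    }

    // Ordering before the patch: ties are broken by hashCode().
    public static final Comparator<Excerpt> OLD = (e1, e2) -> {
        if (e1.numTokens() < e2.numTokens()) return -1;
        if (e1.numTokens() > e2.numTokens()) return 1;
        int result = e1.numFragments() - e2.numFragments();
        // The identity hash is not guaranteed to be stable across JVM runs,
        // so the sign of this difference can change between runs.
        return result != 0 ? result : e1.hashCode() - e2.hashCode();
    };

    // Ordering after the patch: depends only on the excerpts' contents.
    public static final Comparator<Excerpt> PATCHED = (e1, e2) -> {
        if (e1.numTokens() < e2.numTokens()) return -1;
        if (e1.numTokens() > e2.numTokens()) return 1;
        return e1.numFragments() - e2.numFragments();
    };

    public static void main(String[] args) {
        Excerpt a = new Excerpt(3, 2);
        Excerpt b = new Excerpt(3, 2);
        System.out.println("old:     " + OLD.compare(a, b));
        System.out.println("patched: " + PATCHED.compare(a, b));
    }
}
```

The patched comparator returns 0 for the two equal excerpts on every run,
while the value printed for the old one depends on the JVM and may differ from
one run to the next.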

(I actually sent this to Michael Cafarella some months ago, but I think it
got lost or forgotten back then.)

Regards
 Daniel

-- 
http://www.danielnaber.de
Index: Summarizer.java
===================================================================
RCS file: /cvsroot/nutch/nutch/src/java/net/nutch/searcher/Summarizer.java,v
retrieving revision 1.5
diff -u -r1.5 Summarizer.java
--- Summarizer.java	4 Sep 2003 21:17:25 -0000	1.5
+++ Summarizer.java	19 May 2004 21:37:05 -0000
@@ -124,12 +124,7 @@
             if (numToks1 < numToks2) {
                 return -1;
             } else if (numToks1 == numToks2) {
-                int result = excerpt1.numFragments() - excerpt2.numFragments();
-                if (result == 0) {
-                    return excerpt1.hashCode() - excerpt2.hashCode();
-                } else {
-                    return result;
-                }
+                return excerpt1.numFragments() - excerpt2.numFragments();
             } else {
                 return 1;
             }
