I am trying to sort the search results by the "lastModified" field. I index
"lastModified" as an Integer and Keyword in the index and search with the
search(Query query, Filter filter, int n, Sort sort) method. I only modified
net.nutch.searcher.LuceneQueryOptimizer.optimize:
    return searcher.search(query, filter, numHits,
        new Sort(new SortField[] {
            new SortField("lastModified", SortField.INT, true)
        }));
The results certainly changed and are now largely sorted by time, but they are not sorted exactly by lastModified. The results look ugly. :(
I can see two sources of problems:
1. You should sort by the "date" field, not "lastModified", since "lastModified" is not indexed, and sorting requires an indexed field.
2. Not all pages have a lastModified value. You should change MoreIndexingFilter to always add a date. If no last modified is specified, then use the fetch date, fo.getFetchDate().
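The fallback in point 2 could look roughly like the sketch below. It is plain Java and not Nutch code: the Page class, effectiveDate, and newestFirst are hypothetical names made up for illustration, and the real change would go in MoreIndexingFilter using fo.getFetchDate(). It only shows the idea that every page gets a date, so that a sort on it is total.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class DateFallback {

    /** Hypothetical stand-in for a fetched page; not a Nutch class. */
    static final class Page {
        final String url;
        final Long lastModified; // null when no last-modified value was reported
        final long fetchDate;    // always known, since we fetched the page ourselves

        Page(String url, Long lastModified, long fetchDate) {
            this.url = url;
            this.lastModified = lastModified;
            this.fetchDate = fetchDate;
        }
    }

    /** Use lastModified when present and positive, otherwise the fetch date. */
    static long effectiveDate(Long lastModified, long fetchDate) {
        return (lastModified != null && lastModified > 0) ? lastModified : fetchDate;
    }

    /** Return a copy of the pages ordered newest first by effective date. */
    static List<Page> newestFirst(List<Page> pages) {
        List<Page> sorted = new ArrayList<>(pages);
        sorted.sort(Comparator
                .comparingLong((Page p) -> effectiveDate(p.lastModified, p.fetchDate))
                .reversed());
        return sorted;
    }
}
```

With a value always present, the "date" field can be indexed for every document, and pages without a last-modified value no longer end up in arbitrary positions in the sorted results.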
If you get this working, please send a patch. Even if it's a hack, it's a start for others.
Thanks,
Doug
