Alan Wang wrote:
I am trying to sort the search results by the "lastModified" field. So I index
"lastModified" as an Integer and Keyword in the index and search with the
search(Query query, Filter filter, int n, Sort sort) method. I just modified
net.nutch.searcher.LuceneQueryOptimizer.optimize:
    return searcher.search(query, filter, numHits,
        new Sort(new SortField[] {
            new SortField("lastModified", SortField.INT, true)
        }));

The results certainly changed, and are largely sorted by time. But they are not
sorted exactly by lastModified. The results look ugly. :(

I can see two sources of problems:

1. You should sort by the "date" field, not "lastModified", since "lastModified" is not indexed, and sorting requires an indexed field.

2. Not all pages have a lastModified value. You should change MoreIndexingFilter to always add a date. If no last modified is specified, then use the fetch date, fo.getFetchDate().
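
A rough sketch of the fallback in point 2, in plain Java with no Nutch or Lucene dependencies. The class, the Page type, and the method names are all made up for illustration; the real change would go in MoreIndexingFilter and use fo.getFetchDate(). It only shows the idea: every page gets a date, so a descending sort has no gaps.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class DateFallbackSketch {

    // Hypothetical stand-in for a fetched page; this is not a Nutch class.
    static final class Page {
        final String url;
        final Long lastModified; // null when no last-modified value was supplied
        final long fetchDate;    // always known (epoch milliseconds)

        Page(String url, Long lastModified, long fetchDate) {
            this.url = url;
            this.lastModified = lastModified;
            this.fetchDate = fetchDate;
        }
    }

    // Value to index under "date": lastModified if present, else the fetch date.
    static long indexDate(Page p) {
        return p.lastModified != null ? p.lastModified : p.fetchDate;
    }

    // Returns a copy sorted newest first, i.e. descending by the indexed date.
    static List<Page> newestFirst(List<Page> pages) {
        Comparator<Page> byDate = Comparator.comparingLong(DateFallbackSketch::indexDate);
        List<Page> sorted = new ArrayList<>(pages);
        sorted.sort(byDate.reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Page> pages = new ArrayList<>();
        pages.add(new Page("http://example.com/a", 300L, 100L));
        pages.add(new Page("http://example.com/b", null, 200L)); // falls back to fetch date
        pages.add(new Page("http://example.com/c", 400L, 100L));
        for (Page p : newestFirst(pages)) {
            System.out.println(p.url + " " + indexDate(p));
        }
    }
}
```

Without the fallback, pages lacking a last-modified value would have nothing to sort on, which is one way to end up with results that are only roughly in date order.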

If you get this working, please send a patch. Even if it's a hack, it's a start for others.

Thanks,

Doug
