Alan Wang wrote:
    String lastModified = metaData.getProperty("last-modified");
    if (lastModified == null)
      return doc;

If the metaData does not contain a "last-modified" entry (it comes from the HTTP headers), this early return hands the document back with no last-modified field, and hence nothing to sort it on.
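One way around this, as a rough sketch only: fall back to a default value instead of returning early, so every document has something in the sort field. This assumes metaData is a java.util.Properties; the class name, the method name and the "0" sentinel are made up for illustration and are not part of Nutch.

```java
import java.util.Properties;

class LastModifiedDefault {
    // Hypothetical sentinel for documents with no Last-Modified header.
    // "0" sorts before any real date; pick whatever suits your ordering.
    static final String MISSING = "0";

    // Returns the header value if present, otherwise the sentinel,
    // so the caller can always add a last-modified field.
    static String lastModifiedOrDefault(Properties metaData) {
        String lastModified = metaData.getProperty("last-modified");
        return (lastModified == null) ? MISSING : lastModified;
    }
}
```

Whether undated documents should sort first or last is a design choice; the sentinel only guarantees the field exists.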


Also, the sorting code you sent assumes that dates are ints, while you've modified things to index a long. That will cause problems too. It is substantially more efficient in Lucene to sort by ints, so I recommend switching this back to indexing a YYYYMMDD int. If you need more precision, you could index to the hour (YYYYMMDDHH) and still stay within positive integers, or you could convert things to something like minutes since 1970.
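To make the three encodings concrete, here is a minimal sketch of how each could be computed from a millisecond timestamp. It uses java.time and UTC, and the class and method names are invented for this example, so treat it as an illustration rather than what the Nutch code does.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

class DateSortKeys {
    // Day precision: YYYYMMDD, e.g. 20050314.
    static int toDayKey(long epochMillis) {
        ZonedDateTime t = Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC);
        return t.getYear() * 10000 + t.getMonthValue() * 100 + t.getDayOfMonth();
    }

    // Hour precision: YYYYMMDDHH. Computed as a long first, then narrowed;
    // toIntExact throws instead of silently wrapping if the value is too big.
    static int toHourKey(long epochMillis) {
        ZonedDateTime t = Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC);
        long key = (long) toDayKey(epochMillis) * 100 + t.getHour();
        return Math.toIntExact(key);
    }

    // Minute precision: whole minutes since 1970-01-01T00:00Z.
    static int toMinutesSinceEpoch(long epochMillis) {
        return Math.toIntExact(Math.floorDiv(epochMillis, 60_000L));
    }
}
```

The hour key fits in a positive int for any date through the year 2147, since 2147123123 is below Integer.MAX_VALUE (2147483647). The minutes form is not human-readable but sorts the same way.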

Doug

