Elwin wrote:
When I read pages out of a WebDB and printed the URL of each page, I found two URLs that look exactly the same. Is it possible for two pages to have the same URL?
WebDB should not allow two URLs that are exactly the same (Nutch uses an MD5 signature for that). Please check them carefully; most probably they differ in only a single character, or in whitespace.
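One way to check is to print the two strings in an escaped form and locate the first position where they differ. The following is only a rough sketch in Python, not Nutch code: the helper names (md5_hex, first_difference, char_name) and the example URLs are made up for illustration, and the MD5 here is simply taken over the UTF-8 bytes of the URL string, which may not match how Nutch computes its signature.

```python
import hashlib
import unicodedata


def md5_hex(url: str) -> str:
    """Return the MD5 hex digest of the URL's UTF-8 bytes (an assumption, not Nutch's exact scheme)."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def first_difference(a: str, b: str):
    """Return (index, char_in_a, char_in_b) for the first mismatch, or None if equal.

    If one string is a prefix of the other, the missing character is reported as None.
    """
    for i in range(max(len(a), len(b))):
        ca = a[i] if i < len(a) else None
        cb = b[i] if i < len(b) else None
        if ca != cb:
            return i, ca, cb
    return None


def char_name(ch):
    """Give a readable name for a character so that invisible ones show up."""
    if ch is None:
        return "<end of string>"
    # Control characters have no Unicode name, so fall back to the code point.
    return unicodedata.name(ch, f"U+{ord(ch):04X}")


# Hypothetical URLs: the second one carries a trailing space.
url_a = "http://www.example.com/index.html"
url_b = "http://www.example.com/index.html "

# repr() makes trailing whitespace and escapes visible.
print(repr(url_a), md5_hex(url_a))
print(repr(url_b), md5_hex(url_b))

diff = first_difference(url_a, url_b)
if diff is not None:
    index, ca, cb = diff
    print(f"first difference at index {index}: {char_name(ca)} vs {char_name(cb)}")
```

If the two printed digests differ, the strings are not byte-for-byte identical, and the reported index shows where to look.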
--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com