Hello,

I used nutch-1.2 to index a few domains. I noticed that Nutch correctly crawled
all sub-pages of the domains. By sub-pages I mean all links inside a domain;
for example, for the domain mydomain.com, links like
mydomain.com/show/photos/1 and so on. I also noticed in our Apache logs that
Googlebot crawled all sub-pages as well.
However, in a search for mydomain.com, Google gives mydomain.com on the first
page and almost no sub-pages, whereas Nutch gives all sub-pages. If a domain
has, let's say, 200 sub-pages and we display 10 results per page, then it would
take us 20 pages to get to results from other domains. In contrast, Google
displays results from other domains in second place.
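
To make the desired behavior concrete, here is a minimal sketch in Python. It is
not Nutch code and does not use any Nutch API; the function name
cap_hits_per_host, the max_per_host parameter, and the host otherdomain.com are
all made up for illustration. It only shows the idea of keeping a limited
number of hits per host so that other domains move up in the ranking.

```python
from collections import defaultdict
from urllib.parse import urlparse


def cap_hits_per_host(ranked_urls, max_per_host=2):
    """Keep at most max_per_host results per host, preserving rank order.

    Hypothetical helper for illustration only; not part of Nutch.
    """
    seen = defaultdict(int)
    kept = []
    for url in ranked_urls:
        # urlparse only fills in netloc when a scheme is present,
        # so prepend one for bare entries like "mydomain.com/show/photos/1".
        full = url if "://" in url else "http://" + url
        host = urlparse(full).netloc.lower()
        if seen[host] < max_per_host:
            kept.append(url)
            seen[host] += 1
    return kept


# Ranked hits as the search currently returns them (made-up sample data).
hits = [
    "mydomain.com",
    "mydomain.com/show/photos/1",
    "mydomain.com/show/photos/2",
    "otherdomain.com",
    "mydomain.com/show/photos/3",
]

print(cap_hits_per_host(hits, max_per_host=1))
```

With max_per_host=1 the other domain would appear directly after mydomain.com,
which is roughly what Google shows.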

Is there a way of fixing this issue?

Thanks in advance.
Alex.

