The output format is <hostCount,hostName>. The FETCHED and NOT_FETCHED entries, which come from the CrawlDatum status, can appear at any position in the text file, because the output is not sorted.
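If it helps, here is a minimal Python sketch of how such a file could be post-processed: it separates the two total lines from the per-host lines and then sorts the hosts by count. It assumes each line is a count followed by a name, separated by whitespace, as in the output quoted below. The function name split_domainstats and the STATUS_KEYS set are made up for illustration and are not part of Nutch.

```python
# Keys that are totals rather than host names (taken from the quoted output).
STATUS_KEYS = {"FETCHED", "NOT_FETCHED"}


def split_domainstats(lines):
    """Split '<count> <name>' lines into per-host counts and status totals."""
    hosts = {}
    totals = {}
    for line in lines:
        parts = line.split()
        # Skip blank or malformed lines instead of failing on them.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        count, name = int(parts[0]), parts[1]
        if name in STATUS_KEYS:
            totals[name] = count
        else:
            hosts[name] = count
    return hosts, totals


# Sample lines copied from the output quoted in this thread.
sample = """469 ttt.in.ua
12 aaa.com.ua
210661 FETCHED
427773 NOT_FETCHED
4238 aaaa.ru
1 all4vvvv.com.ua
17844 amtist.ru
4092 aptrrr.ru""".splitlines()

hosts, totals = split_domainstats(sample)

# The file itself is unsorted, so sort by count (largest first) afterwards.
ranked = sorted(hosts.items(), key=lambda kv: kv[1], reverse=True)

for name, count in ranked:
    print(count, name)
print(totals)
```

Sorting after the fact keeps the sketch independent of whatever order the job writes its lines in.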
On Mon, Aug 20, 2012 at 7:56 PM, Markus Jelsma <[email protected]> wrote:

> Those counts are the sum of the fetched pages for that host. 210661 are
> fetched in total and 427773 are unfetched.
>
>
> -----Original message-----
> > From:Alexei Korolev <[email protected]>
> > Sent: Mon 20-Aug-2012 13:38
> > To: [email protected]
> > Subject: what's mean this values?
> >
> > Hello,
> >
> > I tried to google about it, but without luck. I run this command:
> >
> > nutch domainstats crawl/crawldb/current temp host
> >
> > and then have following output:
> >
> > 469 ttt.in.ua
> > 12 aaa.com.ua
> > 210661 FETCHED
> > 427773 NOT_FETCHED
> > 4238 aaaa.ru
> > 1 all4vvvv.com.ua
> > 17844 amtist.ru
> > 4092 aptrrr.ru
> >
> > Anybody could explore for me what's mean this values? And why I have
> > FETCHED and NOT FETCHED in the middle of this list?
> >
> > Thanks.
> >
> > --
> > Alexei A. Korolev
>
>

--
Don't Grow Old, Grow Up... :-)

