On Wed, 16 May 2007 16:42:05 -0400, bbrown wrote
> This is kind of a generic question. Are there any stats on how many
> pages will get crawled based on some initial seed. For example, if
> you seed the list from dmoz, how many pages will get indexed? Lets
> say there are 4 million, will 4 million only get indexed?
>
> Or lets say I have 4000, will I get 30,000 crawled/indexed pages?
>
> --
> Berlin Brown
> [berlin dot brown at gmail dot com]
> http://botspiritcompany.com/botlist/?

Sorry, to clarify: let's say I use an average crawl depth of 3. I am asking
because I have about 8000 article pages (blogs, news articles) that I want
Nutch to crawl on a regular basis, and I would like an idea of how many
pages will end up in the index.

--
Berlin Brown
[berlin dot brown at gmail dot com]
http://botspiritcompany.com/botlist/?

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
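
To make the question concrete, here is a rough back-of-the-envelope sketch in Python, not anything taken from Nutch itself. It assumes the crawl runs in rounds (one per depth level), that each round fetches at most topN pages when a limit is set, and that every fetched page contributes some average number of new, previously unseen links. The function name and the new_links_per_page figure are hypothetical; the real number depends on the sites, the URL filters, and how many links are duplicates.

```python
def estimate_fetched_pages(seeds, depth, new_links_per_page, top_n=None):
    """Estimate how many pages are fetched in each crawl round.

    seeds              -- number of URLs injected at the start
    depth              -- number of crawl rounds
    new_links_per_page -- assumed average of new, unique links per fetched page
    top_n              -- optional cap on pages fetched per round
    """
    per_round = []
    frontier = seeds  # URLs known but not yet fetched
    for _ in range(depth):
        # Fetch everything in the frontier unless a per-round cap applies.
        fetched = frontier if top_n is None else min(frontier, top_n)
        per_round.append(fetched)
        # Unfetched URLs stay queued; fetched pages add newly found links.
        frontier = (frontier - fetched) + fetched * new_links_per_page
    return per_round


if __name__ == "__main__":
    # 8000 seeds, depth 3, assuming 2 new unique links per page.
    rounds = estimate_fetched_pages(8000, 3, 2)
    print(rounds, sum(rounds))

    # Same crawl, but capped at 10000 pages per round.
    capped = estimate_fetched_pages(8000, 3, 2, top_n=10000)
    print(capped, sum(capped))
```

Under these assumptions the total is bounded by depth times topN whenever a cap is set, so the cap rather than the seed count tends to decide the index size once the frontier grows past it.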
