Are you running in non-clustered (local) mode? If so, run the generator
with the parameter -numFetchers 1 and you should get all the URLs.

Perhaps we should fix this by adding a check in the Generator:

if the task is run with the local job runner, that parameter should be
forced to 1 (currently it defaults to job.getNumMapTasks(), which
defaults to 2).
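
A rough sketch of what such a check could look like. This is not the
actual Generator code: the class and method names are hypothetical, a
plain Map stands in for the job configuration, and it assumes the local
job runner can be recognised by the "mapred.job.tracker" property being
"local" (also assumed to be the value when the property is unset).

```java
import java.util.HashMap;
import java.util.Map;

public class NumFetchersCheck {

    /**
     * Hypothetical helper: decide how many fetch lists to generate.
     *
     * @param conf        stand-in for the job configuration (assumption)
     * @param requested   value of -numFetchers, or -1 if not given
     * @param numMapTasks stand-in for job.getNumMapTasks()
     */
    static int effectiveNumFetchers(Map<String, String> conf,
                                    int requested, int numMapTasks) {
        // Same default as described above: fall back to the map task count.
        int n = requested > 0 ? requested : numMapTasks;

        // Assumption: "local" marks the local job runner.
        String tracker = conf.getOrDefault("mapred.job.tracker", "local");
        if ("local".equals(tracker)) {
            // Force a single fetch list when running locally.
            n = 1;
        }
        return n;
    }

    public static void main(String[] args) {
        Map<String, String> local = new HashMap<>();
        local.put("mapred.job.tracker", "local");

        Map<String, String> cluster = new HashMap<>();
        cluster.put("mapred.job.tracker", "master:9001"); // made-up host

        System.out.println(effectiveNumFetchers(local, -1, 2));
        System.out.println(effectiveNumFetchers(cluster, -1, 2));
    }
}
```

With a check like this, a local run would end up with one fetch list
even when -numFetchers is not passed, while a clustered run would keep
the existing default.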

--
  Sami Siren

Frank Kempf wrote:
> Hello,
> 
> got stuck with generating.
> Injecting 3200 Urls into the database and generating afterwards leads 
> always to the same result of having 1632 Urls in crawl_generate.
> (I checked the db and it actually has 3200 entries).
> No matter if I try -topN 5000 / 50000 or nothing.
> How could I generate a whole set of first level Urls?
> 
> 
>   Kind regards
> 
>     Frank
> 
> 


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
