Are you running in non-clustered (local) mode? If so, run the generator
with the parameter -numFetchers 1 and you should get all the URLs.
Perhaps we should fix this by adding a check in the generator:
if the task is run with the local job runner, that parameter should be
forced to 1 (right now it defaults to job.getNumMapTasks(), which defaults to 2).
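
A minimal sketch of such a check, not actual Nutch code. It assumes the
local job runner can be recognized by the job tracker setting (the
mapred.job.tracker property) having the value "local". The class name,
the helper effectiveNumFetchers and its parameters are hypothetical and
only stand in for the values the generator would read from the job
configuration.

```java
public class NumFetchersCheck {

    // Hypothetical helper (not part of Nutch): decides how many fetch lists to generate.
    // jobTracker      - assumed to hold the value of the mapred.job.tracker property
    // requested       - value passed with -numFetchers, or -1 if the option was not given
    // defaultMapTasks - what job.getNumMapTasks() would return
    static int effectiveNumFetchers(String jobTracker, int requested, int defaultMapTasks) {
        // With the local job runner, force exactly one fetch list.
        if ("local".equals(jobTracker)) {
            return 1;
        }
        // Otherwise keep an explicit -numFetchers value, or fall back to the map task count.
        return requested > 0 ? requested : defaultMapTasks;
    }

    public static void main(String[] args) {
        // Local runner, no -numFetchers given: forced to 1 instead of the default.
        System.out.println(effectiveNumFetchers("local", -1, 2));
        // Clustered setup (host name is made up), no -numFetchers: uses the map task count.
        System.out.println(effectiveNumFetchers("jobtracker.example:9001", -1, 2));
        // Clustered setup with an explicit -numFetchers 4.
        System.out.println(effectiveNumFetchers("jobtracker.example:9001", 4, 2));
    }
}
```

Forcing the value rather than only changing the default means that an
explicit -numFetchers greater than 1 is also overridden in local mode.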
--
Sami Siren
Frank Kempf wrote:
Hello,
I got stuck with generating.
Injecting 3200 URLs into the database and generating afterwards always
leads to the same result: 1632 URLs in crawl_generate.
(I checked the db and it actually has 3200 entries.)
It makes no difference whether I use -topN 5000, -topN 50000, or no -topN at all.
How can I generate the whole set of first-level URLs?
Kind regards
Frank