This model also matters for another reason: with the current code, where NutchConf is a singleton, it is not possible to run several tasks in parallel within a single JVM with radically different parameters. For example, if you want to run several CrawlTool instances with different parameters under a single JVM, that is currently not possible. With the setConfig() change it becomes possible.
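To make the point concrete, here is a minimal sketch of the idea. It assumes nothing about the real Nutch classes: TaskConfig, CrawlTask, and ParallelCrawls are hypothetical names invented for illustration, and only the name setConfig() comes from the proposal above. Each task is handed its own configuration object instead of reading a JVM-wide singleton, so two tasks with different parameters can run side by side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a per-task configuration (NOT the real NutchConf API).
final class TaskConfig {
    private final Map<String, String> props = new HashMap<>();

    TaskConfig set(String key, String value) {
        props.put(key, value);
        return this;
    }

    String get(String key, String defaultValue) {
        return props.getOrDefault(key, defaultValue);
    }
}

// Hypothetical task: it receives its configuration through setConfig()
// rather than looking up a static, JVM-wide instance.
final class CrawlTask implements Callable<String> {
    private TaskConfig config;

    void setConfig(TaskConfig config) {
        this.config = config;
    }

    @Override
    public String call() {
        if (config == null) {
            throw new IllegalStateException("setConfig() must be called first");
        }
        // A real task would crawl; this one only reports the parameters it saw.
        return config.get("collection", "default") + ":" + config.get("depth", "1");
    }
}

public class ParallelCrawls {
    // Runs one task per configuration in the same JVM, in parallel,
    // and returns the results in the order the configurations were given.
    static List<String> runAll(List<TaskConfig> configs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, configs.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (TaskConfig c : configs) {
                CrawlTask task = new CrawlTask();
                task.setConfig(c);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        TaskConfig a = new TaskConfig().set("collection", "a").set("depth", "3");
        TaskConfig b = new TaskConfig().set("collection", "b");
        System.out.println(runAll(Arrays.asList(a, b)));
    }
}
```

With a singleton, both tasks would see whichever values were set last; here each task only ever sees the configuration it was given.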
But is that really how a search platform will be tuned in real life?
On the contrary, I was thinking that NutchConf must be a singleton across many JVMs (across many nodes).
No? Isn't that the real use case?
It depends: some of us need to work on one big search domain, and some of us need to work on many smaller search domains.
By moving away from NutchConf as a singleton, we can start to cover the latter use case without losing the former.
At least I like the idea of Nutch internally supporting more than one "Collection".
-- Sami Siren
