Hi all,

I've been busy lately with a Nutch 1.x setup and have managed to replicate
the crawl script as an Oozie workflow (with HUE for a pretty web UI). To
make things easy I've used the JavaMain action to execute the classes that
the nutch script invokes, parametrized as necessary.

One thing I noticed is that configuring the command line arguments this way
is a tad cumbersome, so: would it be unthinkable to adopt the Hadoop
-D configuration.setting=value convention to set these options?

The bash scripts could still hide the extra verbosity and preserve the
current args, while adding the option to define them in nutch-site.xml or
in Oozie under a more practical element.
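
To make the idea concrete, here is a rough, dependency-free sketch of the
convention: pull "-D key=value" pairs out of the argument list and leave
the positional args untouched. This is only an illustration. In Hadoop the
real parsing is done by GenericOptionsParser (via ToolRunner), and the class
name DashDArgs and the property names generate.topN and
generate.numFetchers below are placeholders for the example, not a claim
about what the patch or Nutch would use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: separates -D key=value settings from positional args. */
public class DashDArgs {

    /** Result holder: settings from -D pairs, plus the remaining args in order. */
    public static final class Parsed {
        public final Map<String, String> settings = new LinkedHashMap<>();
        public final List<String> positional = new ArrayList<>();
    }

    /** Accepts both "-D key=value" (two tokens) and "-Dkey=value" (one token). */
    public static Parsed parse(String[] args) {
        Parsed out = new Parsed();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String pair = null;
            if (arg.equals("-D")) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("-D must be followed by key=value");
                }
                pair = args[++i];          // consume the next token as the pair
            } else if (arg.startsWith("-D")) {
                pair = arg.substring(2);   // pair is glued to the flag
            }
            if (pair == null) {
                out.positional.add(arg);   // not a -D option, keep as-is
                continue;
            }
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + pair);
            }
            // Split on the first '=' only, so values may themselves contain '='.
            out.settings.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Placeholder property names and paths, for illustration only.
        Parsed p = parse(new String[] {
            "-D", "generate.topN=1000",
            "-Dgenerate.numFetchers=4",
            "crawl/crawldb", "crawl/segments"
        });
        System.out.println("settings:   " + p.settings);
        System.out.println("positional: " + p.positional);
    }
}
```

With something along these lines, a job class would read its options from
the configuration first and fall back to the current positional args, so
the existing scripts keep working unchanged.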

The patch wouldn't be too disruptive, but I don't want to do work that
wouldn't be merged upstream, so let me know if such an approach flies
in the face of community-wide decisions and so on...


Best,
Edoardo

-- 
A Motto
Smile a while, and while you smile
   another smiles
And soon there's miles and miles
   of smiles
And life's worth while because
   you smile
