Re: Separating nutch and hadoop configurations.

Briggs Wed, 11 Jul 2007 14:41:30 -0700

Hey, thanks.  My problem was that I also wanted the nutch conf out of
the nutch install dir. So, I did set the NUTCH_CONF_DIR variable in my
.bashrc and couldn't understand why it was never picking it up.  Well,
as it happens, that was the one variable I forgot to export!  Doh!
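
For anyone who hits the same thing, here is a minimal sketch of why the missing export matters. A variable that is only assigned in .bashrc stays local to that shell, so a child process such as the bin/nutch script never sees it. The path below is a made-up placeholder, not my real layout.

```shell
# Start from a clean slate so an earlier export does not skew the demo.
unset NUTCH_CONF_DIR

# Hypothetical path; substitute wherever your Nutch conf actually lives.
NUTCH_CONF_DIR=/tmp/nutch-conf-example

# Assigned but not exported: a child process does not inherit the variable.
unexported=$(sh -c 'echo "${NUTCH_CONF_DIR:-unset}"')

# Once exported, child processes (bin/nutch included) inherit it.
export NUTCH_CONF_DIR
exported=$(sh -c 'echo "${NUTCH_CONF_DIR:-unset}"')

echo "before export: $unexported"
echo "after export:  $exported"
```

In .bashrc the fix is simply to put export in front of the assignment.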


So, it wasn't hard at all. Though, I needed to replace
hadoop-12.whatever.jar with the latest within the nutch build.  It
seems to be working. Yay.


Thanks.




On 7/11/07, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:

Briggs wrote:
> I am currently trying to figure out how to deploy Nutch and Hadoop
> separately.  I want to configure Hadoop outside of Nutch and have
> Nutch use that service, rather than configuring hadoop within nutch.
> I would think all that Nutch should need to know is the urls to
> connect to Hadoop, but can't figure out how to get this to work.
>
> Is this possible?  If so, is there some sort of document, or archive
> of another list post for this?
>
> Sorry for the ignorance.

If you have a clean hadoop installation up and running (made e.g. from
one of the official Hadoop builds), it should be enough to put the
nutch*.job file in ${hadoop.dir}, and copy bin/nutch (possibly with some
minor modifications - my memory is a little vague on this ...).


--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



--
"Conscious decisions by conscious minds are what make reality real"
