I want to stream data from logs into HDFS in production, but I do NOT
want my production machine to be a part of the computation cluster.  The
reason I want to do it this way is to take advantage of HDFS without
putting computation load on my production machine.  Is this possible?
Furthermore, is this unnecessary because the computation would not put a
significant load on my production box (obviously this depends on the
map/reduce implementation, but I'm asking in general)?

I should note that our prod machine hosts our core web application and
database (saving up for another box :-).

Thanks,
Shahab
