I've searched the docs but couldn't find an answer to this.
We have some machines that produce data, and each one runs an adapter
(agent). These machines are 'close' to each other: they are physically on
the same network.
Then we have the HDFS cluster on other machines, on another network. The
two networks are of course connected (via the internet).
So we want to know which is better network-wise: to put the collector on
the same network as the adapters, or on the same machine as the HDFS
namenode?
Option A (collector close to the adapters) seems better to me, because the
adapters send data ALL THE TIME to the collector, while the collector sends
data to HDFS only once every 5 minutes, in a single write.

P.S. Our collector writes exactly what it gets from the adapters, so the
data volume is the same in both cases and is not a consideration.
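
To make the comparison concrete, here is a rough back-of-envelope sketch of
my reasoning. All the numbers (10 adapters, 4096 bytes per send, 60 sends per
minute) and all the names are made up for illustration, and it counts the
collector's 5-minute flush as a single transfer, as described above. It only
counts traffic crossing the link between the two networks; it does not model
latency, retries, or how HDFS actually performs the write.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    name: str
    wan_bytes: int      # bytes crossing the inter-network link per flush interval
    wan_transfers: int  # separate sends crossing that link per flush interval


def compare(num_adapters, bytes_per_send, sends_per_minute, flush_minutes=5):
    """Compare both collector placements over one flush interval.

    Assumes the collector forwards exactly what it receives, so the byte
    count over the inter-network link is identical in both cases.
    """
    sends = num_adapters * sends_per_minute * flush_minutes
    total_bytes = sends * bytes_per_send
    # Option A: adapters talk to the collector locally; only the flush crosses.
    near_adapters = Placement("A: collector near adapters", total_bytes, 1)
    # Option B: every adapter send crosses; the flush to HDFS stays local.
    near_namenode = Placement("B: collector on namenode host", total_bytes, sends)
    return near_adapters, near_namenode


# Hypothetical workload, not measured values.
a, b = compare(num_adapters=10, bytes_per_send=4096, sends_per_minute=60)
print(a)
print(b)
```

If this model is right, both options move the same number of bytes between
the networks, and the difference is only in how many separate sends have to
cross the link.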

Any recommendations?
Thanks,
-- 
Oded
