Hi everyone,

I am confronted with quite a difficult problem: how to connect two separate
Hadoop instances, each with its own HDFS, HBase, etc.

My first thought is: what if they share a master node? Surely this would
allow any MapReduce or Spark request to be run against both instances?
Comments?

If the above idea is feasible, how would one ensure that each data
collection remains in its respective instance and is not distributed across
both instances?

Kind Regards
Dave
