This is beginning to sound like a catch-22. I would personally lean toward a single high-performing HDFS cluster shared between the various types of applications (realtime vs. analytics), and then control/balance the resource allocation for each application. This works in scenarios where the mix of applications and workloads can be predicted beforehand; however, if the nature of the workload shifts for some reason, that could throw off the whole resource equilibrium.
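For the control/balance piece, one possibility (a minimal sketch, not something established in this thread) is YARN's CapacityScheduler, which caps what each class of jobs can take from the shared cluster. The queue names realtime and analytics and the capacity split below are hypothetical:

    <!-- capacity-scheduler.xml: hypothetical queues for realtime vs. analytics -->
    <property>
      <name>yarn.scheduler.capacity.root.queues</name>
      <value>realtime,analytics</value>
    </property>
    <property>
      <!-- guarantee 60% of cluster resources to the realtime queue -->
      <name>yarn.scheduler.capacity.root.realtime.capacity</name>
      <value>60</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.analytics.capacity</name>
      <value>40</value>
    </property>
    <property>
      <!-- cap analytics so a shifting workload cannot starve realtime jobs -->
      <name>yarn.scheduler.capacity.root.analytics.maximum-capacity</name>
      <value>50</value>
    </property>

Note that this only governs work submitted through YARN (i.e., the MapReduce/analytics side); HBase RegionServers themselves run outside the scheduler, so their memory and I/O footprint still has to be budgeted separately.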
Are there any additional Hadoop-specific monitoring tools that can be deployed to predict resource/performance bottlenecks in advance (in addition to regular BMC-type tools)?
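On that point, one Hadoop-specific hook worth knowing about is the metrics2 framework, which can push NameNode/DataNode metrics into an external collector such as Ganglia so you can trend them before a bottleneck hits. A minimal sketch of hadoop-metrics2.properties, where gmond-host:8649 is a placeholder for your Ganglia collector:

    # hadoop-metrics2.properties: publish HDFS daemon metrics to Ganglia
    *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
    # sampling period in seconds
    *.sink.ganglia.period=10
    namenode.sink.ganglia.servers=gmond-host:8649
    datanode.sink.ganglia.servers=gmond-host:8649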
