Hi all,
If HBase is used as the data sink in an MR job, would there be a
performance improvement if a) is done instead of b)?

a) All the Puts are collected in Reduce or Map (if there is no reduce) and a batch write is done.
b) Each <K,V> pair is written out using context.write(k, v).

If a) is considered instead of b), then wouldn't there be a violation of semantics w.r.t. KEYOUT, VALUEOUT (because <K, V> is not being output)? Is this OK?

Thank you.

Regards,
Raghava
