If we have a huge table and only 1% of it is updated every hour, it is a huge waste to read the whole table in through a MapReduce job and write out a new table.
Instead, if we store the table in HBase and use the existing HBase+Hive integration, then as long as we can do an upsert we only need to touch that 1% of entries, and the update can be very fast.
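
To make the cost difference concrete, here is a minimal sketch that models the table as an in-memory Python dict keyed by row key. It is not the HBase or Hive API; the names `full_rewrite`, `upsert`, `table`, and `delta` are hypothetical and only illustrate how many rows each approach has to write.

```python
def full_rewrite(table, delta):
    """Batch approach: read every row and write out a whole new table."""
    new_table = {}
    rows_written = 0
    for key, value in table.items():
        # Take the updated value if there is one, else copy the old row.
        new_table[key] = delta.get(key, value)
        rows_written += 1
    for key, value in delta.items():
        # Rows in the delta that did not exist before are inserts.
        if key not in new_table:
            new_table[key] = value
            rows_written += 1
    return new_table, rows_written


def upsert(table, delta):
    """Keyed-store approach: insert or overwrite only the changed rows."""
    rows_written = 0
    for key, value in delta.items():
        table[key] = value
        rows_written += 1
    return rows_written


# Assumed sizes for illustration: 10,000 existing rows, 1% of them updated,
# plus one brand-new row.
table = {f"row{i:05d}": 0 for i in range(10_000)}
delta = {f"row{i:05d}": 1 for i in range(100)}
delta["row99999"] = 1

rewritten, full_cost = full_rewrite(table, delta)
upsert_cost = upsert(table, delta)
```

In this toy model both approaches end with the same table contents, but the full rewrite writes 10,001 rows while the upsert writes 101, which is the saving the proposal is after.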
