Writes take longer in HBase. 

Just how much longer depends on how well you have tuned HBase. 

Now, having said that... suppose you want to find a single record in either 
HBase or Hive. 
Which do you think will be faster? ;-) 


On Jan 17, 2013, at 10:44 AM, Austin Chungath <austi...@gmail.com> wrote:

> Hi,
> Problem: Hive took 6 mins to load a data set; HBase took 1 hr 14 mins.
> It's a 20 GB data set, approx. 230 million records. The data is in HDFS,
> in a single text file. The cluster is 11 nodes, 8 cores.
> 
> I loaded this into Hive, partitioned by date, bucketed into 32, and sorted.
> Time taken: 6 mins.
> 
> I loaded the same data into HBase on the same cluster by writing a
> MapReduce job. It took 1 hr 14 mins. The cluster wasn't running anything
> else. Assuming the code I wrote is good enough, what is it that makes
> HBase slower than Hive at loading the data?
> 
> Thanks,
> Austin
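
For a sense of scale, the figures quoted above can be turned into rough load rates. This is only a back-of-the-envelope sketch in Python: the record count is the approximate one given in the message, and the names (RECORDS, records_per_second, and so on) are made up here for illustration, not taken from anyone's actual job.

```python
# Figures as quoted in the message; the record count is approximate.
RECORDS = 230_000_000
HIVE_SECONDS = 6 * 60            # 6 mins
HBASE_SECONDS = (60 + 14) * 60   # 1 hr 14 mins


def records_per_second(records, seconds):
    """Average load rate over the whole run."""
    return records / seconds


hive_rate = records_per_second(RECORDS, HIVE_SECONDS)
hbase_rate = records_per_second(RECORDS, HBASE_SECONDS)
slowdown = HBASE_SECONDS / HIVE_SECONDS

print(f"Hive:  {hive_rate:,.0f} records/s")
print(f"HBase: {hbase_rate:,.0f} records/s")
print(f"The HBase load took {slowdown:.1f}x as long")
```

These are averages across the whole cluster and say nothing about where the time went, so treat them as a baseline to compare against after tuning rather than as a diagnosis.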
