The block caching won't buy you much in terms of performance.
You *must* set the scanner caching.

Note that hbase.client.scanner.caching is a global config option (see 
HTable.getScanner(...)), so as long as it is set on the Configuration that 
Hive's HTable uses to create the scanner, it should work.
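
To make the fallback concrete, here is a minimal sketch of the lookup order as 
I understand it: a caching value set on the scan itself wins, otherwise the 
value from the configuration is used. This is a simplified model and not the 
HBase API. The class name, the method name, and the assumed default of 1 are 
all hypothetical, and the real default depends on the HBase release.

```java
import java.util.Properties;

// Simplified model of the fallback only; it does not call any HBase classes.
// ScannerCachingSketch and effectiveCaching are hypothetical names.
class ScannerCachingSketch {
    static final String CACHING_KEY = "hbase.client.scanner.caching";

    // Assumed fallback when the key is absent; the real default varies by HBase release.
    static final int ASSUMED_DEFAULT = 1;

    // perScanCaching <= 0 stands for "setCaching() was never called on the scan".
    static int effectiveCaching(int perScanCaching, Properties conf) {
        if (perScanCaching > 0) {
            return perScanCaching; // an explicit per-scan value takes precedence
        }
        String configured = conf.getProperty(CACHING_KEY);
        return configured == null ? ASSUMED_DEFAULT : Integer.parseInt(configured.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(effectiveCaching(-1, conf)); // nothing set anywhere

        conf.setProperty(CACHING_KEY, "1000");
        System.out.println(effectiveCaching(-1, conf)); // picked up from the config

        System.out.println(effectiveCaching(50, conf)); // per-scan value overrides the config
    }
}
```

In this model the code that builds the scan never has to call setCaching(), 
provided the option reaches the configuration object that the table is created 
with.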


-- Lars



________________________________
 From: java8964 <[email protected]>
To: "[email protected]" <[email protected]> 
Sent: Monday, February 10, 2014 12:19 PM
Subject: Re: Hive + Hbase scanning performance
 

Hi, Ted:
Our environment uses a distribution from a vendor, so it is not easy for me 
to patch it myself.
But I can ask whether the vendor is willing to include the patch in the next 
release.
Before I do that, I just want to make sure patching the code is the ONLY 
solution.
I read the source code of HiveHBaseTableInputFormat in Hive 0.9.0. I didn't see 
any place where it invokes scan.setCaching(), so I don't think "set 
hbase.client.scanner.caching" in the hive session will work, but that is just 
my guess. There are quite a lot of messages on the internet saying that it 
does work in this case, which confused me.
What I want to confirm is that "set hbase.client.scanner.caching" in fact 
doesn't work in hive for scan.setCaching(). Is that true?
Thanks
Yong

Date: Mon, 13 Jan 2014 19:31:38 -0800
Subject: Re: Hive + Hbase scanning performance
From: [email protected]
To: [email protected]

You can patch HIVE-3603 into your deployment so that you can make use of
scan.setCacheBlocks(false).

Cheers
