Hi, Lars:
Is there any logging I can enable to verify this?
I am not questioning your knowledge, but in my performance testing I really 
didn't see any difference.
I read org.apache.hadoop.hbase.client.Scan in HBase 0.94.3, and I didn't 
see any logging I could use to check what value the caching is actually 
being set to.
From the Hive code in org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat, 
it creates a Scan object with the default caching value (-1) and sets this 
scan on its base class, which is 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.
I believe this Scan object will then be serialized to the server, and I didn't 
find any place where its caching value is reset based on the Configuration. Of 
course, I may have missed it, since I have only just started reading the HBase 
codebase and don't know much about it yet.
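
To make the question concrete, here is a minimal sketch of the fallback as I 
understand it from the description of HTable.getScanner(...). It is plain Java 
with no HBase dependency: the class ScannerCachingModel, the method 
effectiveCaching, and the fallback parameter are made up for illustration and 
are not HBase APIs. Whether the real client behaves this way for the Scan that 
Hive builds is exactly what I am trying to verify.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Plain-Java model of the caching fallback discussed in this thread.
 * This is NOT HBase code; all names here are hypothetical.
 */
public class ScannerCachingModel {

    public static final String CACHING_KEY = "hbase.client.scanner.caching";

    /**
     * @param scanCaching value carried by the Scan (-1 means "not set")
     * @param conf        stand-in for the client-side Configuration
     * @param fallback    value used when the key is absent (an assumption)
     * @return the caching value the scanner would end up using in this model
     */
    public static int effectiveCaching(int scanCaching,
                                       Map<String, String> conf,
                                       int fallback) {
        // An explicit value on the Scan wins over the configuration.
        if (scanCaching > 0) {
            return scanCaching;
        }
        // Otherwise fall back to the global option, if it is set.
        String configured = conf.get(CACHING_KEY);
        return configured == null ? fallback : Integer.parseInt(configured);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(effectiveCaching(-1, conf, 1));  // prints 1

        conf.put(CACHING_KEY, "500");
        System.out.println(effectiveCaching(-1, conf, 1));  // prints 500
        System.out.println(effectiveCaching(100, conf, 1)); // prints 100
    }
}
```

If the model is right, a Scan left at -1 should pick up whatever the 
Configuration holds, which is why I would like a way to confirm the value that 
is really in effect.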
Is there any log on the server side that can show the caching value if I 
change a log level? If so, how?
Also, can you comment on Hive Jira 
https://issues.apache.org/jira/browse/HIVE-3603? 
In fact, I have the same question as the second-to-last comment in the Jira 
ticket, but no one ever answered it. 
Quoted:
Swarnim Kulkarni added a comment - 26/Aug/13 19:28
Edward Capriolo Thanks! Also how is setting this property different than 
directly setting the "hbase.client.scanner.caching" property in hive-site.xml 
without this enhancement? Wouldn't they have the same effect?

Thanks
Yong
> Date: Mon, 10 Feb 2014 12:37:07 -0800
> From: [email protected]
> Subject: Re: Hive + Hbase scanning performance
> To: [email protected]
> 
> The block caching won't buy you much in terms of performance.
> You *must* set the scanner caching.
> 
> Note that hbase.client.scanner.caching is a global config option. (see 
> HTable.getScanner(...)), so as long as that option is set on the 
> Configuration that the HTable sees that Hive uses to create the scanner it 
> should work.
> 
> 
> -- Lars
> 
> 
> 
> ________________________________
>  From: java8964 <[email protected]>
> To: "[email protected]" <[email protected]> 
> Sent: Monday, February 10, 2014 12:19 PM
> Subject: Re: Hive + Hbase scanning performance
>  
> 
> Hi, Ted:
> Our environment is using a distribution from a Vendor, so it is not easy just 
> to patch it myself.
> But I can seek the option to see if the vendor is willing to patch it in next 
> release.
> Before I do that, I just want to make sure patching the code is the ONLY 
> solution.
> I read the source code of Hive 0.9.0 of HiveHBaseTableInputFormat. I didn't 
> see any place it invoked scan.setCaching(), so I don't think "set 
> hbase.client.scanner.caching" in the hive session will work, but that is just 
> my guess. There are quite a lot of messages on the internet that it will work 
> in this case, so it confused me.
> What I want to confirm is that "set hbase.client.scanner.caching" in fact 
> doesn't work in hive for scan.setCaching(). Is that true?
> Thanks
> Yong
> 
> Date: Mon, 13 Jan 2014 19:31:38 -0800
> Subject: Re: Hive + Hbase scanning performance
> From: [email protected]
> To: [email protected]
> 
> You can patch HIVE-3603 into your deployment so that you can make use of
> scan.setCacheBlocks(false).
> 
> Cheers