> Date: Tue, 9 Nov 2010 23:13:53 +0530
> Subject: Re: Why and When to use HTablePool?
> From: [email protected]
> To: [email protected]
> 
> So the difference between a pool and an HTable would be negligible in a
> typical map-reduce environment, right, if I am not creating any new HTable
> instances in the map and reduce phases? Perhaps creating a pool can have a
> negative impact in this case?
> 

Well, I'm no expert, but I can't see any reason to run a pool of HTable 
connections from within a MapReduce job.
(I'm sure someone can come up with a use case...) 

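But from our experience, you create a single HTable instance in setup() 
and you're set. Each map task normally calls map() from a single thread, so 
there is nothing for a pool to hand the table out to. A pool adds complexity, 
and from what I can see it doesn't improve performance.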
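
To make the "one instance in setup()" pattern concrete, here is a rough 
sketch. It is not real HBase code: FakeTable, UploadTask and SingleTableDemo 
are hypothetical stand-ins I made up so the example runs with nothing but the 
JDK. In an actual job I would expect FakeTable to be an HTable that is opened 
in the mapper's setup() and closed in cleanup(), but treat that mapping as an 
assumption and check it against your HBase version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a table handle; counts how many were opened.
class FakeTable {
    static int opened = 0;
    final List<String> puts = new ArrayList<String>();
    boolean closed = false;

    FakeTable() {
        opened++;
    }

    // Records one write; rejects writes after close().
    void put(String row) {
        if (closed) {
            throw new IllegalStateException("table is closed");
        }
        puts.add(row);
    }

    void close() {
        closed = true;
    }
}

// Mirrors the task lifecycle: setup() once, map() per record, cleanup() once.
class UploadTask {
    private FakeTable table;

    // Open the single table instance for the whole task.
    void setup() {
        table = new FakeTable();
    }

    // Convert one input line to a write, reusing the same instance.
    void map(String line) {
        table.put(line.trim());
    }

    // Close the instance once all records are processed.
    FakeTable cleanup() {
        table.close();
        return table;
    }
}

class SingleTableDemo {
    // Drives one task over the given lines and returns the closed table.
    static FakeTable run(List<String> lines) {
        UploadTask task = new UploadTask();
        task.setup();
        for (String line : lines) {
            task.map(line);
        }
        return task.cleanup();
    }

    public static void main(String[] args) {
        FakeTable t = run(Arrays.asList("row1 ", "row2", " row3"));
        System.out.println(FakeTable.opened + " table opened, "
                + t.puts.size() + " puts");
    }
}
```

The point of the sketch is only that the number of table instances stays at 
one per task no matter how many records pass through map(), which is why a 
pool has nothing to add here.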

> E.g., what performance impact can I expect in my bulk-uploading MapReduce job?
> I create an HTable connection in the run() method, and each map converts a line
> from a text file to a Put instance. Also, it would be great if any of you
> could point me to an example usage of HTablePool.
> 
> hari

This is a difficult question to answer; there are too many factors. Your 
hardware, network, design, and code quality all have an impact on performance.
Of these, your design and your code will have the greatest impact.
                                          
