Hi folks, I'm running into an interesting issue: we have a ZooKeeper cluster running on 3 servers.
We run MapReduce jobs and use org.apache.hadoop.conf.Configuration to pass parameters to our mappers. The string-based (key/value) approach is imho not the most elegant way; I would prefer to pass e.g. a Writable object to my mappers, but it works with the conf object, so what... Our jobs use an HBase table as input (TableInputFormat). The Configuration is created anew for every job, because we have new parameter values each time (roughly like the first sketch below).

Now the problem occurs: the ZooKeeper cluster does not accept new connections. There is a default limit of 10 connections per client, and indeed all the old connections from the client to the ZooKeeper cluster are still established.

I found some description saying that we have to reuse the Configuration object. Ok, sure, we can reuse the Configuration object, but if we want to set different parameters in a multi-threaded environment, what then? Do I have to create a pool of Configuration objects and share them synchronized (something like the second sketch below)?

andre
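
For reference, a minimal sketch of the kind of per-job setup described above, assuming the TableMapper/TableMapReduceUtil API; the table name, parameter key and class names are made up for illustration, and the exact Job construction differs between Hadoop versions:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class PerJobScan {

    // the mapper reads its parameters back out of the job Configuration in setup()
    public static class MyMapper extends TableMapper<Text, LongWritable> {
        private long threshold;

        @Override
        protected void setup(Context context) {
            threshold = context.getConfiguration().getLong("myjob.threshold", 0L);
        }

        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
            // ... use threshold while processing the Result ...
        }
    }

    public static void submit(long threshold) throws Exception {
        // a fresh Configuration per job: parameters go in as plain strings,
        // but every new HBaseConfiguration also ends up with its own cached
        // client-side HBase/ZooKeeper connection
        Configuration conf = HBaseConfiguration.create();
        conf.setLong("myjob.threshold", threshold);

        Job job = Job.getInstance(conf, "scan-with-threshold");
        job.setJarByClass(PerJobScan.class);
        TableMapReduceUtil.initTableMapperJob(
                "my_table", new Scan(), MyMapper.class,
                Text.class, LongWritable.class, job);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);
        job.waitForCompletion(true);
    }
}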

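And a purely hypothetical sketch of the "pool of Configuration objects" idea from the question above; class name and pool size are made up, this is only meant to illustrate what is being asked, not a recommendation:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Hypothetical: each pooled Configuration would keep (and reuse) its own cached
// client-side HBase/ZooKeeper connection instead of leaking a new one per job.
public class ConfigurationPool {
    private final BlockingQueue<Configuration> pool;

    public ConfigurationPool(int size) {
        pool = new ArrayBlockingQueue<Configuration>(size);
        for (int i = 0; i < size; i++) {
            pool.add(HBaseConfiguration.create());
        }
    }

    // a thread borrows a Configuration, sets its job-specific parameters,
    // submits the job, then hands the Configuration back for reuse
    public Configuration borrow() throws InterruptedException {
        return pool.take();
    }

    public void giveBack(Configuration conf) {
        pool.add(conf);
    }
}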