Hello everybody,

I am running an 11-node HBase cluster (CDH3u0) with 3 ZooKeeper servers.
A MapReduce job that imports a text file into an HBase table runs very
slowly. What is the recommended HBase configuration for writing a large
data set (around 17 GB) into an HBase table? Also, when I run the job it
launches only 20 map tasks, although I expected 100 or more.

I have attached my hbase-site.xml file. If anyone knows what to change,
please help me.
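
To show why I expected more map tasks, here is a rough back-of-the-envelope
sketch. It assumes a single splittable input file and one map task per HDFS
block. The 64 MB block size is an assumption (I have not checked the value on
this cluster), and the class and method names are only for illustration.

```java
class SplitEstimate {
    // Estimated number of input splits, assuming split size == block size
    // and a single splittable input file (both are assumptions).
    static long estimateSplits(long inputBytes, long splitBytes) {
        return (inputBytes + splitBytes - 1) / splitBytes; // ceiling division
    }

    public static void main(String[] args) {
        long inputBytes = 17L * 1024 * 1024 * 1024; // about 17 GB of input
        long blockBytes = 64L * 1024 * 1024;        // assumed HDFS block size
        System.out.println("estimated splits: " + estimateSplits(inputBytes, blockBytes));
    }
}
```

Under those assumptions 17 GB gives 272 splits. Seeing only 20 tasks would
instead correspond to splits of roughly 870 MB each, so the input may not be
split the way I assumed.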
Here is the map function of my job:
public void map(LongWritable key, Text value, OutputCollector<TextPair, Text> output, Reporter reporter) throws IOException {
    String line = value.toString();
    //System.out.println("[read line ]"+line);
    if (line != null && !line.isEmpty()) {
        String[] items = line.split("\\,");
        String concept_id = items[1];
        String element_id = items[0];
        //System.out.println("[Concept id]:"+concept_id);
        Put put = new Put(Bytes.toBytes(concept_id));
        //keys of ELEMENT_* column families are element id
        put.add(Constant.COLUMN_ELEMENT_ID, Bytes.toBytes(element_id), Bytes.toBytes(items[0]));
        put.add(Constant.COLUMN_ELEMENT_CONTEXT_ID, Bytes.toBytes(element_id), Bytes.toBytes(items[2]));
        put.add(Constant.COLUMN_ELEMENT_POSITION_FORM, Bytes.toBytes(element_id), Bytes.toBytes(items[3]));
        put.add(Constant.COLUMN_ELEMENT_POSITION_TO, Bytes.toBytes(element_id), Bytes.toBytes(items[4]));
        put.add(Constant.COLUMN_ELEMENT_TERM_ID, Bytes.toBytes(element_id), Bytes.toBytes(items[5]));
        put.add(Constant.COLUMN_ELEMENT_DICTIONARY_ID, Bytes.toBytes(element_id), Bytes.toBytes(items[6]));
        put.add(Constant.COLUMN_ELEMENT_WORKFLOW_STATUS, Bytes.toBytes(element_id), Bytes.toBytes(items[7]));
        hTable.put(put);
        hTable.setAutoFlush(true);
        hTable.flushCommits();
        //output.collect(new TextPair(items[1],"1"),new Text(items[0]+items[1]));
        //System.out.println("[key value]"+ concept_id+" : "+line );
    }
}
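
The map function above flushes after every record (autoflush is on and
flushCommits() is called for each Put), so every input line costs at least one
round trip to a region server. The standalone sketch below only counts flushes
and makes no HBase calls. It mimics the client rule of flushing once the
buffered bytes exceed the write buffer size. The record count and the 500 bytes
per Put are made-up numbers, the 2 MB buffer is assumed to be the client
default, and the class and method names are hypothetical.

```java
class FlushCountSketch {
    // With a flush after every put, the number of flushes equals the record count.
    static long flushesPerRecord(long records) {
        return records;
    }

    // With buffering, a flush happens only once the buffered bytes exceed the
    // write buffer size, plus one final flush for whatever is left over.
    static long flushesBuffered(long records, long bytesPerPut, long bufferBytes) {
        long flushes = 0;
        long buffered = 0;
        for (long i = 0; i < records; i++) {
            buffered += bytesPerPut;
            if (buffered > bufferBytes) {
                flushes++;
                buffered = 0;
            }
        }
        if (buffered > 0) {
            flushes++; // final flush, e.g. when the task finishes
        }
        return flushes;
    }

    public static void main(String[] args) {
        long records = 1000000L;          // hypothetical number of input lines
        long bytesPerPut = 500L;          // hypothetical size of one Put
        long bufferBytes = 2L * 1024 * 1024; // assumed 2 MB write buffer
        System.out.println("flush per record: " + flushesPerRecord(records));
        System.out.println("buffered:         " + flushesBuffered(records, bytesPerPut, bufferBytes));
    }
}
```

If this reasoning holds, calling setAutoFlush(false) once when the task is set
up and flushCommits() once when it finishes might reduce the number of round
trips considerably, but I have not measured this on my cluster.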
//======================================================================================
Here is the HBase configuration file:
<property>
<name>hbase.zookeeper.property.maxClientCnxns</name>
<value>1000</value>
</property>
<property>
<name>hbase.hregion.max.filesize</name>
<value>1073741824</value>
</property>
<property>
<name>hbase.regionserver.handler.count</name>
<value>200</value>
</property>
<property>
<name>dfs.datanode.max.xcievers</name>
<value>4096</value>
</property>
<property>
<name>hfile.block.cache.size</name>
<value>0.4</value>
</property>
<property>
<name>hbase.client.scanner.caching</name>
<value>100000</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>server1,serve3,server5</value>
</property>
Cheers,
Byambajargal