what is the recommended configuration in hbase to write big data

byambajargal Wed, 01 Jun 2011 10:13:22 -0700

Hello everybody,

I am running an 11-node HBase cluster (CDH3u0) with 3 ZooKeeper servers.
A job that imports a text file into an HBase table runs very slowly. My
question: what is the recommended HBase configuration for writing a large
amount of data (around 17 GB) into an HBase table? When I run the job, it
launches only 20 map tasks; I expected it could be 100 or more.
I have attached my hbase-site.xml file. If someone knows, please help me.
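
For reference, here is a rough way to estimate how many map tasks to expect. It is a sketch that follows the split-size rule of the old-API FileInputFormat (split size = max(min size, min(goal size, block size))) for a single splittable file. The 64 MB and 128 MB block sizes and the single-file layout are assumptions, not values taken from this cluster, and the sketch ignores the small slop Hadoop allows on the last split. Compressed input such as gzip is not splittable and yields one map task per file, which is one thing worth checking when the count is much lower than this estimate.

```java
// Sketch only: approximates the old-API FileInputFormat split rule
// for ONE splittable file. All inputs below are assumed values.
class SplitEstimate {

    // splitSize = max(minSize, min(goalSize, blockSize))
    static long splitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Estimated number of splits (and so map tasks) for one file.
    // requestedMaps is only a hint to the framework.
    static long estimateSplits(long totalBytes, int requestedMaps,
                               long minSize, long blockSize) {
        long goalSize = totalBytes / Math.max(1, requestedMaps);
        long size = splitSize(goalSize, minSize, blockSize);
        return (totalBytes + size - 1) / size; // ceiling division
    }

    public static void main(String[] args) {
        long total = 17L * 1024 * 1024 * 1024;  // about 17 GB of input
        long block64 = 64L * 1024 * 1024;       // assumed block size
        long block128 = 128L * 1024 * 1024;     // assumed block size
        System.out.println(estimateSplits(total, 1, 1, block64));
        System.out.println(estimateSplits(total, 1, 1, block128));
    }
}
```

Under these assumptions the estimate is in the hundreds of map tasks, so a count of 20 means each task is reading roughly 870 MB on average.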


Here is the map function of my job:

public void map(LongWritable key, Text value,
        OutputCollector<TextPair, Text> output, Reporter reporter)
        throws IOException {

    String line = value.toString();

    //System.out.println("[read line ]"+line);

    if (line != null && !line.isEmpty()) {

        String[] items = line.split("\\,");

        String concept_id = items[1];
        String element_id = items[0];
        //System.out.println("[Concept id]:"+concept_id);
        Put put = new Put(Bytes.toBytes(concept_id));

        // keys of ELEMENT_* column families are element id
        put.add(Constant.COLUMN_ELEMENT_ID,
                Bytes.toBytes(element_id), Bytes.toBytes(items[0]));
        put.add(Constant.COLUMN_ELEMENT_CONTEXT_ID,
                Bytes.toBytes(element_id), Bytes.toBytes(items[2]));
        put.add(Constant.COLUMN_ELEMENT_POSITION_FORM,
                Bytes.toBytes(element_id), Bytes.toBytes(items[3]));
        put.add(Constant.COLUMN_ELEMENT_POSITION_TO,
                Bytes.toBytes(element_id), Bytes.toBytes(items[4]));
        put.add(Constant.COLUMN_ELEMENT_TERM_ID,
                Bytes.toBytes(element_id), Bytes.toBytes(items[5]));
        put.add(Constant.COLUMN_ELEMENT_DICTIONARY_ID,
                Bytes.toBytes(element_id), Bytes.toBytes(items[6]));
        put.add(Constant.COLUMN_ELEMENT_WORKFLOW_STATUS,
                Bytes.toBytes(element_id), Bytes.toBytes(items[7]));
        hTable.put(put);
        hTable.setAutoFlush(true);
        hTable.flushCommits();
        //output.collect(new TextPair(items[1],"1"),new Text(items[0]+items[1]));

        //System.out.println("[key value]"+ concept_id+" : "+line );

    }
}
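
A note on the last three calls in the map function: with auto-flush on and flushCommits() called per input line, every record is sent to the cluster on its own. The HBase client also offers HTable.setAutoFlush(false) together with HTable.setWriteBufferSize(long), which buffers puts on the client and sends them in batches. The class below is only a toy model of that difference, not the HBase client: its names are hypothetical, it counts simulated round trips, and it buffers by row count, whereas the real write buffer is sized in bytes and a real flush may contact several region servers.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, NOT the HBase client: counts simulated round trips to
// compare flushing after every record with flushing in batches.
class BufferedWriteModel {
    private final List<String> buffer = new ArrayList<String>();
    private final int maxBuffered;  // flush once this many rows are buffered
    private int roundTrips = 0;     // simulated network round trips
    private int rowsSent = 0;       // rows delivered so far

    BufferedWriteModel(int maxBuffered) {
        this.maxBuffered = maxBuffered;
    }

    // Loosely analogous to hTable.put(put) with auto-flush off.
    void put(String row) {
        buffer.add(row);
        if (buffer.size() >= maxBuffered) {
            flush();
        }
    }

    // Loosely analogous to hTable.flushCommits(): one trip per batch.
    void flush() {
        if (buffer.isEmpty()) {
            return; // nothing buffered, so no round trip
        }
        rowsSent += buffer.size();
        buffer.clear();
        roundTrips++;
    }

    int roundTrips() { return roundTrips; }
    int rowsSent() { return rowsSent; }

    // Writes `rows` records and returns the simulated round trips.
    static int simulate(int rows, int maxBuffered) {
        BufferedWriteModel table = new BufferedWriteModel(maxBuffered);
        for (int i = 0; i < rows; i++) {
            table.put("row-" + i);
        }
        table.flush(); // final flush, e.g. from the mapper's close()
        return table.roundTrips();
    }

    public static void main(String[] args) {
        System.out.println(simulate(10000, 1));     // flush per record
        System.out.println(simulate(10000, 1000));  // batched
    }
}
```

In this model, batching 1000 rows at a time cuts the round trips for 10000 rows from 10000 to 10. The real gain depends on row size, buffer size, and how rows spread across region servers.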
//======================================================================================
Here is the HBase configuration file:

<property>
  <name>hbase.zookeeper.property.maxClientCnxns</name>
  <value>1000</value>
</property>
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>1073741824</value>
</property>
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>200</value>
</property>


<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>

<property>
  <name>hfile.block.cache.size</name>
  <value>0.4</value>
</property>
<property>
  <name>hbase.client.scanner.caching</name>
  <value>100000</value>
</property>

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>server1,serve3,server5</value>
</property>
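
One way to read the hbase.hregion.max.filesize setting above: a table created without pre-split regions starts as a single region, so early in an import all writes go to one region server, and regions split only once a store file grows past that limit. The sketch below gives an order-of-magnitude region count for the finished import. It assumes the stored size equals the 17 GB input size, which ignores key/value overhead, compression, and the fact that a split leaves two regions of about half the limit each, so the real count can differ noticeably.

```java
// Rough estimate only: assumes stored size == input size, which
// ignores key/value overhead, compression, and split behavior.
class RegionEstimate {

    // ceil(dataBytes / maxFileSizeBytes)
    static long estimateRegions(long dataBytes, long maxFileSizeBytes) {
        return (dataBytes + maxFileSizeBytes - 1) / maxFileSizeBytes;
    }

    public static void main(String[] args) {
        long data = 17L * 1024 * 1024 * 1024;  // about 17 GB of input
        long maxFileSize = 1073741824L;        // hbase.hregion.max.filesize above
        System.out.println(estimateRegions(data, maxFileSize));
    }
}
```

With 11 region servers, a count in this range means the load can only spread across the cluster late in the import unless the table is created with several regions up front.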



 cheers


 Byambajargal
