Yes, I know exactly what HBase bulkload does, and we are applying the schema to bulkload into Phoenix. Just to clarify: if the WAL is enabled for the Phoenix tables being bulkloaded, the loading performance will be very poor, so disabling the WAL becomes an option for better loading performance. Correct me if I am wrong.
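For reference, Phoenix exposes the WAL setting as a per-table property. A minimal sketch of toggling it, assuming a hypothetical table named EVENTS and a placeholder ZooKeeper quorum (adjust both for your cluster):

```shell
# Hedged sketch: disable the WAL on a Phoenix table before a heavy load,
# then re-enable it afterwards. EVENTS and zkhost:2181 are placeholders.
echo "ALTER TABLE EVENTS SET DISABLE_WAL = true;"  > disable_wal.sql
sqlline.py zkhost:2181 disable_wal.sql

# ... run the load ...

echo "ALTER TABLE EVENTS SET DISABLE_WAL = false;" > enable_wal.sql
sqlline.py zkhost:2181 enable_wal.sql
```

Note that, as pointed out below, this only matters for online writes (UPSERTs through JDBC); an HFile-based bulk load bypasses the WAL regardless of this setting.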
Regards,
Sun
CertusNet

From: Nick Dimiduk
Date: 2015-01-14 02:50
To: user
Subject: Re: MapReduce bulk load into Phoenix table

On Tue, Jan 13, 2015 at 1:29 AM, [email protected] <[email protected]> wrote:

> As far as I know, bulk loading into Phoenix or HBase may be affected by several conditions, such as whether the WAL is enabled or the number of split regions.

Bulkloading in HBase does not go through the WAL; it uses the HFileOutputFormat to write HFiles directly. Region splits will have some impact on bulkload, but not in the same way as they do with online writes.

I agree with James -- it seems your host is very underpowered or your underlying cluster installation is not configured correctly. Please consider profiling the individual steps in isolation so as to better identify the bottleneck.

From: Ciureanu, Constantin (GfK)
Date: 2015-01-13 17:12
To: [email protected]
Subject: MapReduce bulk load into Phoenix table

Hello all,

(Due to the slow speed of Phoenix JDBC -- a single machine manages only ~1000-1500 rows/sec) I am also reading up on loading data into Phoenix via MapReduce.

So far I understand that the Key + List<[Key,Value]> to be inserted into the HBase table is obtained via a "dummy" Phoenix connection; those rows are then stored into HFiles, and after the MR job finishes, those HFiles are bulk-loaded normally into HBase.

My question: is there any better / faster approach? I assume this cannot reach the maximum speed for loading data into a Phoenix / HBase table.

I would also like to find a better / newer code sample than this one:
http://grepcode.com/file/repo1.maven.org/maven2/org.apache.phoenix/phoenix/4.0.0-incubating/org/apache/phoenix/mapreduce/CsvToKeyValueMapper.java#CsvToKeyValueMapper.loadPreUpsertProcessor%28org.apache.hadoop.conf.Configuration%29

Thank you,
Constantin
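The MapReduce path Constantin describes is packaged in Phoenix 4.x as org.apache.phoenix.mapreduce.CsvBulkLoadTool, which drives CsvToKeyValueMapper, writes HFiles via HFileOutputFormat, and then hands them to HBase's incremental bulk-load machinery. A sketch of invoking it, where the table name, input path, and ZooKeeper quorum are placeholders for illustration:

```shell
# Hedged sketch: run Phoenix's CSV bulk load MapReduce job.
# EXAMPLE_TABLE, /data/example.csv, and zkhost:2181 are placeholders;
# substitute your own table, HDFS input path, and ZK quorum.
hadoop jar phoenix-<version>-client.jar \
  org.apache.phoenix.mapreduce.CsvBulkLoadTool \
  --table EXAMPLE_TABLE \
  --input /data/example.csv \
  --zookeeper zkhost:2181
```

Because this route writes HFiles directly, the table's WAL setting does not affect its throughput; pre-splitting the target table so the reducers produce one HFile per region is usually the more effective tuning knob.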
