Yes, this can be used at production scale -- that's the intention of the CSV
bulk loader. The inserts and rollbacks that you see are purely in-memory
operations used to build up the KeyValues for the HFiles.
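
For illustration, the core of that pattern looks roughly like this (a
simplified sketch -- the JDBC URL, table and columns are placeholders, not
what the loader actually uses):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Iterator;
import java.util.List;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.util.Pair;
import org.apache.phoenix.util.PhoenixRuntime;

public class UncommittedKeyValueSketch {
  public static void main(String[] args) throws Exception {
    // Phoenix connection with autocommit off, so nothing is sent to the cluster.
    Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
    conn.setAutoCommit(false);

    // Placeholder table; the bulk loader builds its UPSERT from the table metadata.
    PreparedStatement stmt =
        conn.prepareStatement("UPSERT INTO MY_TABLE (ID, VAL) VALUES (?, ?)");
    stmt.setInt(1, 1);
    stmt.setString(2, "some value");
    stmt.execute();

    // Pull out the KeyValues that the upsert produced, without committing it.
    Iterator<Pair<byte[], List<KeyValue>>> uncommitted =
        PhoenixRuntime.getUncommittedDataIterator(conn);
    while (uncommitted.hasNext()) {
      for (KeyValue kv : uncommitted.next().getSecond()) {
        // In the mapper, each of these KeyValues is emitted and ends up in an HFile.
        System.out.println(kv);
      }
    }

    // Discard the in-memory state; no data was ever written to the region servers.
    conn.rollback();
    conn.close();
  }
}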

- Gabriel

> On 11 Jan 2015, at 19:39, Pariksheet Barapatre <pbarapa...@gmail.com> wrote:
> 
> Hi Gabriel,
> 
> This is great. Thanks. Can I use the same approach for generating HFiles at
> production scale? I am a bit worried because, for every row, the code tries
> to insert the row and then rolls it back (an uncommitted row).
> 
> 
> Many Thanks
> Pari
> 
> 
> 
>> On 11 January 2015 at 23:36, Gabriel Reid <gabriel.r...@gmail.com> wrote:
>> 
>> The CSV bulk loader in Phoenix actually does this -- it creates HFiles
>> via MapReduce based on CSV input.
>> 
>> You can take a look at the details of how it works in
>> CsvBulkLoadTool.java [1] and CsvToKeyValueMapper.java [2]. There isn't
>> currently a public API for creating Phoenix-compatible HFiles via
>> MapReduce in Phoenix, but there is a set of utility classes in the
>> org.apache.phoenix.mapreduce package for writing to Phoenix directly
>> as the output of a MapReduce program.
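>> 
>> If you just want to drive the existing CSV bulk loader from code, it can be
>> launched via ToolRunner -- roughly like this (a sketch; the table name,
>> input path and ZooKeeper quorum are placeholders):
>> 
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.hbase.HBaseConfiguration;
>> import org.apache.hadoop.util.ToolRunner;
>> import org.apache.phoenix.mapreduce.CsvBulkLoadTool;
>> 
>> public class BulkLoadDriver {
>>   public static void main(String[] args) throws Exception {
>>     Configuration conf = HBaseConfiguration.create();
>>     // Placeholder table, input path and ZooKeeper quorum.
>>     int exitCode = ToolRunner.run(conf, new CsvBulkLoadTool(), new String[] {
>>         "--table", "MY_TABLE",
>>         "--input", "/data/input.csv",
>>         "--zookeeper", "zk-host:2181"});
>>     System.exit(exitCode);
>>   }
>> }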
>> 
>> - Gabriel
>> 
>> 
>> 1.
>> https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java
>> 2.
>> https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvToKeyValueMapper.java
>> 
>> On Sun, Jan 11, 2015 at 6:34 PM, Pariksheet Barapatre
>> <pbarapa...@gmail.com> wrote:
>>> Hello All,
>>> 
>>> New year greetings..!!!
>>> 
>>> My question is as follows -
>>> 
>>> How can I create HFiles equivalent to those of a Phoenix salted table using MapReduce?
>>> 
>>> As per my understanding, we can create HFiles by specifying
>>> 
>>> HFileOutputFormat.configureIncrementalLoad(job, hTable);
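>>> 
>>> i.e. a plain HBase HFile-generating job set up roughly along these lines
>>> (a sketch; the table name, paths and the stub mapper are placeholders):
>>> 
>>> import org.apache.hadoop.conf.Configuration;
>>> import org.apache.hadoop.fs.Path;
>>> import org.apache.hadoop.hbase.HBaseConfiguration;
>>> import org.apache.hadoop.hbase.KeyValue;
>>> import org.apache.hadoop.hbase.client.HTable;
>>> import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
>>> import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
>>> import org.apache.hadoop.io.LongWritable;
>>> import org.apache.hadoop.io.Text;
>>> import org.apache.hadoop.mapreduce.Job;
>>> import org.apache.hadoop.mapreduce.Mapper;
>>> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
>>> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
>>> 
>>> public class HFileGenJob {
>>> 
>>>   // Stub mapper: this is exactly the part I am unsure about -- building the
>>>   // salted, Phoenix-compatible row key and KeyValues to emit here.
>>>   public static class StubMapper
>>>       extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {
>>>     @Override
>>>     protected void map(LongWritable key, Text line, Context ctx) {
>>>       // Placeholder: emit (ImmutableBytesWritable rowKey, KeyValue) pairs.
>>>     }
>>>   }
>>> 
>>>   public static void main(String[] args) throws Exception {
>>>     Configuration conf = HBaseConfiguration.create();
>>>     Job job = Job.getInstance(conf, "hfile-gen");
>>>     job.setJarByClass(HFileGenJob.class);
>>>     job.setMapperClass(StubMapper.class);
>>>     job.setMapOutputKeyClass(ImmutableBytesWritable.class);
>>>     job.setMapOutputValueClass(KeyValue.class);
>>>     FileInputFormat.addInputPath(job, new Path("/data/input"));
>>>     FileOutputFormat.setOutputPath(job, new Path("/data/hfiles"));
>>>     HTable hTable = new HTable(conf, "MY_TABLE");
>>>     HFileOutputFormat.configureIncrementalLoad(job, hTable);
>>>     System.exit(job.waitForCompletion(true) ? 0 : 1);
>>>   }
>>> }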
>>> 
>>> What would be the way to compute the salt and generate the Phoenix-equivalent
>>> row key and values?
>>> 
>>> 
>>> Cheers,
>>> Pari
> 
> 
> 
> -- 
> Cheers,
> Pari
