Yes, this can be used at production scale -- that's the intention of the CSV bulk loader. The inserts and rollbacks that you see are purely in-memory operations used to build up the KeyValues for the HFiles; nothing is ever written to the cluster by them.
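In rough terms, the pattern looks something like the sketch below. This is a simplified illustration of the idea, not the actual CsvToKeyValueMapper code; the JDBC URL, the table name MY_TABLE, and its columns are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Iterator;
import java.util.List;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.util.Pair;
import org.apache.phoenix.util.PhoenixRuntime;

public class KeyValueGenerationSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection URL; auto-commit stays off so the upsert
        // below remains an in-memory mutation only.
        Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
        conn.setAutoCommit(false);

        // Placeholder table/columns: any Phoenix UPSERT works the same way.
        PreparedStatement stmt =
            conn.prepareStatement("UPSERT INTO MY_TABLE (ID, NAME) VALUES (?, ?)");
        stmt.setLong(1, 1L);
        stmt.setString(2, "example");
        stmt.execute();

        // Pull the KeyValues that the uncommitted upsert produced in memory.
        // Phoenix has already applied rowkey construction, salting, and
        // column encoding at this point.
        Iterator<Pair<byte[], List<KeyValue>>> it =
            PhoenixRuntime.getUncommittedDataIterator(conn);
        while (it.hasNext()) {
            Pair<byte[], List<KeyValue>> kvsForTable = it.next();
            for (KeyValue kv : kvsForTable.getSecond()) {
                // In the bulk loader these are emitted from the mapper and
                // eventually written out as HFiles via HFileOutputFormat.
                System.out.println(kv);
            }
        }

        // Roll back so the in-memory mutation state is cleared for the next row.
        conn.rollback();
        conn.close();
    }
}

Because Phoenix builds the KeyValues itself here, the salted rowkeys and encoded values come out exactly as they would for a normal committed upsert.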
- Gabriel

> On 11 Jan 2015, at 19:39, Pariksheet Barapatre <pbarapa...@gmail.com> wrote:
>
> Hi Gabriel,
>
> This is great. Thanks. Can I use the same approach for generating HFiles at
> production scale? I am a bit worried because for every row, the code tries to
> insert the row and then rolls it back (uncommitted row).
>
> Many Thanks
> Pari
>
>> On 11 January 2015 at 23:36, Gabriel Reid <gabriel.r...@gmail.com> wrote:
>>
>> The CSV bulk loader in Phoenix actually does this -- it creates HFiles
>> via MapReduce based on CSV input.
>>
>> You can take a look at the details of how it works in
>> CsvBulkLoadTool.java [1] and CsvToKeyValueMapper.java [2]. There isn't
>> currently a public API for creating Phoenix-compatible HFiles via
>> MapReduce in Phoenix, but there is a set of utility classes in the
>> org.apache.phoenix.mapreduce package for writing to Phoenix directly
>> as the output of a MapReduce program.
>>
>> - Gabriel
>>
>> 1. https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java
>> 2. https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvToKeyValueMapper.java
>>
>> On Sun, Jan 11, 2015 at 6:34 PM, Pariksheet Barapatre <pbarapa...@gmail.com> wrote:
>>> Hello All,
>>>
>>> New year greetings..!!!
>>>
>>> My question is as follows -
>>>
>>> How can I create an HFile equivalent to a Phoenix salted table using MapReduce?
>>>
>>> As per my understanding, we can create an HFile by specifying
>>>
>>> HFileOutputFormat.configureIncrementalLoad(job, hTable);
>>>
>>> What would be the way to create the salt and generate the Phoenix-equivalent
>>> rowkey and values?
>>>
>>> Cheers,
>>> Pari
>
> --
> Cheers,
> Pari
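On the salting part of the original question: with the upsert-and-rollback pattern above, Phoenix computes the salt byte itself, so there is nothing extra to do. Purely as an illustration of what happens internally, here is a rough sketch using Phoenix's SaltingUtil; the rowkey bytes and bucket count below are made-up placeholders, not values from the thread.

import org.apache.phoenix.schema.SaltingUtil;

public class SaltByteSketch {
    // Placeholder: would match the SALT_BUCKETS value the table was created with.
    static final int TABLE_SALT_BUCKETS = 8;

    public static void main(String[] args) {
        // The "unsalted" rowkey bytes, i.e. the encoded primary key columns.
        byte[] rowKey = "some-row-key".getBytes();

        // Phoenix hashes the rowkey into one of the salt buckets; the resulting
        // byte becomes the leading byte of the physical rowkey.
        byte saltByte =
            SaltingUtil.getSaltingByte(rowKey, 0, rowKey.length, TABLE_SALT_BUCKETS);

        // Prepend the salt byte to form the physical rowkey.
        byte[] saltedKey = new byte[rowKey.length + 1];
        saltedKey[0] = saltByte;
        System.arraycopy(rowKey, 0, saltedKey, 1, rowKey.length);

        System.out.println("Salt bucket: " + saltByte);
    }
}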