[ https://issues.apache.org/jira/browse/PHOENIX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056030#comment-14056030 ]
jay wong commented on PHOENIX-1056:
-----------------------------------

I think an all-in-one tool would be fine. But I didn't know how CSVBulkLoadTool sets the reducer class:

{code}
job.setMapperClass(CsvToKeyValueMapper.class);
HFileOutputFormat.configureIncrementalLoad(job, htable);

// But in HFileOutputFormat.configureIncrementalLoad(job, htable):
if (KeyValue.class.equals(job.getMapOutputValueClass())) {
  job.setReducerClass(KeyValueSortReducer.class);
} else if (Put.class.equals(job.getMapOutputValueClass())) {
  job.setReducerClass(PutSortReducer.class);
} else {
  LOG.warn("Unknown map output value type:" + job.getMapOutputValueClass());
}
{code}

So I think the tool:
1. Must provide a way to build HFiles and load them with LoadIncrementalHFiles, not only write via htable.put(put).
2. Must fix the problem that [~jeffreyz] mentioned: the index table HFiles aren't aligned with the index table's own partitioning, so loading those generated index HFiles incurs extra writes.

The tool can also be used to build data for inserts, but not for updates. How should I handle the update case?

> A ImportTsv tool for phoenix to build table data and all index data.
> --------------------------------------------------------------------
>
>                 Key: PHOENIX-1056
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1056
>             Project: Phoenix
>          Issue Type: Task
>    Affects Versions: 3.0.0
>            Reporter: jay wong
>             Fix For: 3.1
>
>         Attachments: PHOENIX-1056.patch
>
>
> I have just built a tool to build table data and index table data, just like
> the ImportTsv job.
> http://hbase.apache.org/book/ops_mgt.html#importtsv
> When ImportTsv runs, it writes HFiles under a path named after each column family (CF).
> For example, if a table has two CFs, A and B,
> the output is
> ...../outputpath/A
> ...../outputpath/B
> In my job, we have a table, TableOne, and two indexes, IdxOne and IdxTwo.
> The output will be
> ...../outputpath/TableOne/A
> ...../outputpath/TableOne/B
> ...../outputpath/IdxOne
> ...../outputpath/IdxTwo
> If anyone needs it, I will build a clean tool.
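
For illustration only, the output layout described in the quoted issue could be sketched roughly as below. This is not the code from PHOENIX-1056.patch: the class `OutputLayoutSketch` and the helper `outputDirs` are hypothetical names, and "outputpath" is just a stand-in for the elided output prefix. It assumes, as in the description, one directory per column family under the data table and a single directory per index table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the directory layout from the issue description.
class OutputLayoutSketch {

    // Returns one directory per column family of the data table,
    // followed by one directory per index table.
    static List<String> outputDirs(String outputRoot, String dataTable,
                                   List<String> columnFamilies,
                                   List<String> indexTables) {
        List<String> dirs = new ArrayList<>();
        for (String cf : columnFamilies) {
            dirs.add(outputRoot + "/" + dataTable + "/" + cf);
        }
        for (String index : indexTables) {
            dirs.add(outputRoot + "/" + index);
        }
        return dirs;
    }

    public static void main(String[] args) {
        // "outputpath" is an assumed placeholder for the real output prefix.
        List<String> dirs = outputDirs("outputpath", "TableOne",
                Arrays.asList("A", "B"), Arrays.asList("IdxOne", "IdxTwo"));
        for (String dir : dirs) {
            System.out.println(dir);
        }
    }
}
```

Keeping each table's files under its own directory would let every directory be handed to LoadIncrementalHFiles separately, one load per table.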