[jira] [Commented] (PHOENIX-1056) A ImportTsv tool for phoenix to build table data and all index data.

jay wong (JIRA) Wed, 09 Jul 2014 02:31:34 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056030#comment-14056030
 ]


jay wong commented on PHOENIX-1056:
-----------------------------------

I think an all-in-one tool would be fine.

But I didn't know how CSVBulkLoadTool sets the reducer class:

{code}
job.setMapperClass(CsvToKeyValueMapper.class);
HFileOutputFormat.configureIncrementalLoad(job, htable);

// But inside HFileOutputFormat.configureIncrementalLoad(job, htable):
if (KeyValue.class.equals(job.getMapOutputValueClass())) {
  job.setReducerClass(KeyValueSortReducer.class);
} else if (Put.class.equals(job.getMapOutputValueClass())) {
  job.setReducerClass(PutSortReducer.class);
} else {
  LOG.warn("Unknown map output value type:" + job.getMapOutputValueClass());
}
{code}
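
To make the reducer selection concrete: the branch quoted above picks the reducer from the job's map output value class, so a mapper that emits KeyValue gets KeyValueSortReducer without any explicit setReducerClass call in the tool. A minimal sketch of that dispatch, using stand-in classes rather than the real HBase types (the nested KeyValue and Put classes and the pickReducer helper are hypothetical, for illustration only):

```java
import java.util.Optional;

final class ReducerChooser {
    // Stand-ins for the HBase value types; placeholders, not the real classes.
    static final class KeyValue {}
    static final class Put {}

    /** Returns the name of the reducer the quoted branch would select, if any. */
    static Optional<String> pickReducer(Class<?> mapOutputValueClass) {
        if (KeyValue.class.equals(mapOutputValueClass)) {
            return Optional.of("KeyValueSortReducer");
        } else if (Put.class.equals(mapOutputValueClass)) {
            return Optional.of("PutSortReducer");
        }
        // Unknown value type: the quoted code only logs a warning and sets no reducer.
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(pickReducer(KeyValue.class));
        System.out.println(pickReducer(String.class));
    }
}
```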

So I think the tool
 1. Must provide a way to build HFiles and load them with LoadIncrementalHFiles, 
not only via htable.put(put).
 2. Must fix the problem that [~jeffreyz] has mentioned: the index table HFiles 
aren't aligned with the index table's own partitioning, so loading those 
generated index HFiles will incur extra writes.
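
The alignment problem in point 2 can be illustrated with a small check: an HFile loads cleanly only if its whole key range falls inside one region of the target table. This is a rough sketch under simplifying assumptions, not Phoenix or HBase code: row keys are modeled as Strings instead of byte arrays, the first region's start key is the empty string, and the fitsOneRegion helper is hypothetical.

```java
import java.util.Arrays;
import java.util.TreeSet;

final class RegionAlignment {
    /**
     * True if the key range [firstKey, lastKey] lies inside a single region,
     * given the sorted start keys of the target table's regions.
     */
    static boolean fitsOneRegion(TreeSet<String> regionStartKeys, String firstKey, String lastKey) {
        // The region owning a key is the one with the greatest start key <= that key.
        String first = regionStartKeys.floor(firstKey);
        String last = regionStartKeys.floor(lastKey);
        return first != null && first.equals(last);
    }

    public static void main(String[] args) {
        // Two regions: ["", "m") and ["m", end of table).
        TreeSet<String> starts = new TreeSet<>(Arrays.asList("", "m"));
        System.out.println(fitsOneRegion(starts, "a", "c"));
        System.out.println(fitsOneRegion(starts, "a", "z"));
    }
}
```

An index HFile written against the data table's boundaries can fail this check for the index table, which is when the load has to do extra work.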

But such a tool can only be used to build data for inserts, not for updates.
How can I handle the update case?


> A ImportTsv tool for phoenix to build table data and all index data.
> --------------------------------------------------------------------
>
>                 Key: PHOENIX-1056
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1056
>             Project: Phoenix
>          Issue Type: Task
>    Affects Versions: 3.0.0
>            Reporter: jay wong
>             Fix For: 3.1
>
>         Attachments: PHOENIX-1056.patch
>
>
> I have just built a tool that builds table data and index table data, much 
> like the ImportTsv job.
> http://hbase.apache.org/book/ops_mgt.html#importtsv
> When ImportTsv runs, it writes HFiles under a path named after each column 
> family (CF).
> For example, if a table has two CFs, A and B,
> the output is 
> ...../outputpath/A
> ...../outputpath/B
> In my job, we have a table, TableOne, and two indexes, IdxOne and IdxTwo.
> The output will be
> ...../outputpath/TableOne/A
> ...../outputpath/TableOne/B
> ...../outputpath/IdxOne
> ...../outputpath/IdxTwo
> If anyone needs it, I will build a clean tool.
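
The directory layout described in the issue can be sketched as a small helper. This is only an illustration of the naming scheme, not code from the attached patch: the outputDirs method and the "/tmp/out" base path are hypothetical, and the index directories carry no column-family level because the description lists none.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OutputLayout {
    /** Lists the HFile output directories for one data table and its index tables. */
    static List<String> outputDirs(String outputPath, String table,
                                   List<String> families, List<String> indexes) {
        List<String> dirs = new ArrayList<>();
        // Data table: one directory per column family, nested under the table name.
        for (String cf : families) {
            dirs.add(outputPath + "/" + table + "/" + cf);
        }
        // Index tables: one directory per index, directly under the output path.
        for (String idx : indexes) {
            dirs.add(outputPath + "/" + idx);
        }
        return dirs;
    }

    public static void main(String[] args) {
        // "/tmp/out" is an assumed base path for the example.
        for (String dir : outputDirs("/tmp/out", "TableOne",
                Arrays.asList("A", "B"), Arrays.asList("IdxOne", "IdxTwo"))) {
            System.out.println(dir);
        }
    }
}
```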



--
This message was sent by Atlassian JIRA
(v6.2#6252)
