[jira] [Commented] (PHOENIX-1056) A ImportTsv tool for phoenix to build table data and all index data.

James Taylor (JIRA) Wed, 09 Jul 2014 01:03:08 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055974#comment-14055974
 ]


James Taylor commented on PHOENIX-1056:
---------------------------------------

Thanks, [~jaywong]. That's a good improvement to build both the table data and 
the index data in a single job. Open issues are:
- Do we need both a CSV bulk loader and an ImportTsv tool? How are they 
different? Or can the improvements you made be folded into the CSV bulk loader 
instead? If we do need both, can the ImportTsv tool be built on top of the CSV 
bulk loader?
- The CSV bulk loader uses publicly exposed Phoenix APIs to get at the 
underlying KeyValues and uses the Phoenix table metadata to drive the import, 
while the ImportTsv tool requires the column information to be passed in 
separately, in a somewhat awkward manner (leaving room for discrepancies 
between the real schema and the one passed in). The ImportTsv tool should go 
through the same Phoenix APIs as the CSV bulk loader IMO.
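
To make the discrepancy concern concrete, here is a minimal sketch of the kind 
of check a tool would need if the column list is passed in by hand. The class 
and method names are hypothetical; nothing here is taken from the attached 
patch or from the Phoenix API, and the "real" column list is just a plain list 
standing in for whatever the table metadata would report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Phoenix or of the attached patch.
final class ColumnSpecCheck {

    // Compares the columns the table actually has with the columns a user
    // passed on the command line and reports every mismatch.
    static List<String> discrepancies(List<String> actualColumns,
                                      List<String> passedInColumns) {
        List<String> problems = new ArrayList<>();
        for (String column : passedInColumns) {
            if (!actualColumns.contains(column)) {
                problems.add("unknown column: " + column);
            }
        }
        for (String column : actualColumns) {
            if (!passedInColumns.contains(column)) {
                problems.add("missing column: " + column);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Made-up column names, used only to exercise the check.
        List<String> actual = Arrays.asList("ID", "NAME", "CITY");
        List<String> passedIn = Arrays.asList("ID", "NAME", "TOWN");
        for (String problem : discrepancies(actual, passedIn)) {
            System.out.println(problem);
        }
    }
}
```

A tool that is driven by the table metadata never needs a check like this, 
because there is only one source for the column list.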

Thoughts? Would be interested in your opinions, [~gabriel.reid] and 
[~maghamravikiran]


> A ImportTsv tool for phoenix to build table data and all index data.
> --------------------------------------------------------------------
>
>                 Key: PHOENIX-1056
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1056
>             Project: Phoenix
>          Issue Type: Task
>    Affects Versions: 3.0.0
>            Reporter: jay wong
>             Fix For: 3.1
>
>         Attachments: PHOENIX-1056.patch
>
>
> I have just built a tool that builds the table data and the index table 
> data, much like the ImportTsv job.
> http://hbase.apache.org/book/ops_mgt.html#importtsv
> When ImportTsv runs, it writes HFiles under a path named after each column 
> family (CF).
> For example, if a table has two CFs, A and B, the output is:
> ...../outputpath/A
> ...../outputpath/B
> In my job, we have a table, TableOne, and two indexes, IdxOne and IdxTwo.
> The output will be:
> ...../outputpath/TableOne/A
> ...../outputpath/TableOne/B
> ...../outputpath/IdxOne
> ...../outputpath/IdxTwo
> If anyone needs it, I will build a clean version of the tool.
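
The directory layout in the description above can be sketched as a small 
helper. This is only an illustration of the naming scheme: the class name, the 
method name, and the /tmp/out root are hypothetical and do not come from the 
attached patch. It assumes, as the description shows, one directory per column 
family under the data table and a single directory per index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Phoenix or of the attached patch.
final class OutputLayoutSketch {

    // Data table HFiles go under <outputPath>/<table>/<columnFamily>;
    // each index gets one directory, <outputPath>/<index>.
    static List<String> outputDirs(String outputPath, String table,
                                   List<String> columnFamilies,
                                   List<String> indexes) {
        List<String> dirs = new ArrayList<>();
        for (String columnFamily : columnFamilies) {
            dirs.add(outputPath + "/" + table + "/" + columnFamily);
        }
        for (String index : indexes) {
            dirs.add(outputPath + "/" + index);
        }
        return dirs;
    }

    public static void main(String[] args) {
        // /tmp/out is a placeholder root chosen for this sketch.
        List<String> dirs = outputDirs("/tmp/out", "TableOne",
                Arrays.asList("A", "B"), Arrays.asList("IdxOne", "IdxTwo"));
        for (String dir : dirs) {
            System.out.println(dir);
        }
    }
}
```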



--
This message was sent by Atlassian JIRA
(v6.2#6252)
