[ https://issues.apache.org/jira/browse/PHOENIX-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173604#comment-15173604 ]
Gabriel Reid commented on PHOENIX-2723:
---------------------------------------
Is there any real advantage to running the tool with multiple inputs and
tables, as opposed to just running it multiple times for one table at a time?
I'm a bit worried about the possible complexity that this would add if there
isn't a big advantage to it. Additionally, this would break the current
functionality of supplying multiple comma-separated input files for importing
into a single table.
> Make BulkLoad able to load several tables at once
> -------------------------------------------------
>
> Key: PHOENIX-2723
> URL: https://issues.apache.org/jira/browse/PHOENIX-2723
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Sergey Soldatov
> Assignee: Sergey Soldatov
>
> Bulk loading is usually needed for more than one table, and it is usually
> done by running the jobs one by one. The idea is to provide lists of tables
> and corresponding input sources to the MR BulkLoad job. The syntax could be
> something like:
> yarn ... CsvBulkLoadTool -t table1,table2,table3 --input input1,input2,input3
> With a tableName => input map available during the map phase, we can
> determine which table the current split belongs to and produce the
> necessary tableRowKeyPair.
> Any thoughts or suggestions?
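
The table-to-input mapping proposed in the description could be sketched roughly as follows. This is only an illustration under stated assumptions, not Phoenix code: the class and method names are hypothetical, the i-th input is assumed to pair with the i-th table, and a split is assumed to be identified by a plain file path string (in a real mapper that path would presumably come from the Hadoop input split).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Phoenix.
final class MultiTableInputSketch {

    // Pairs the i-th input path with the i-th table name (assumed pairing rule).
    static Map<String, String> pairInputsWithTables(List<String> tables, List<String> inputs) {
        if (tables.size() != inputs.size()) {
            throw new IllegalArgumentException("Expected one input per table, got "
                    + tables.size() + " tables and " + inputs.size() + " inputs");
        }
        Map<String, String> inputToTable = new LinkedHashMap<>();
        for (int i = 0; i < tables.size(); i++) {
            inputToTable.put(inputs.get(i), tables.get(i));
        }
        return inputToTable;
    }

    // Returns the table whose input path equals the split's path
    // or is a parent directory of it.
    static String tableForSplit(Map<String, String> inputToTable, String splitPath) {
        for (Map.Entry<String, String> entry : inputToTable.entrySet()) {
            String input = entry.getKey();
            // Append a separator so "/data/input1" does not match "/data/input10".
            String dirPrefix = input.endsWith("/") ? input : input + "/";
            if (splitPath.equals(input) || splitPath.startsWith(dirPrefix)) {
                return entry.getValue();
            }
        }
        throw new IllegalStateException("No table configured for split " + splitPath);
    }
}
```

With tables table1,table2,table3 and the made-up inputs /data/input1,/data/input2,/data/input3, a split at /data/input2/part-00000.csv resolves to table2. Under this strict positional pairing, one table with two comma-separated inputs is rejected as a size mismatch, which is the compatibility concern raised in the comment above.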