[jira] [Commented] (PHOENIX-2723) Make BulkLoad able to load several tables at once

Gabriel Reid (JIRA) Tue, 15 Mar 2016 03:21:07 -0700

    [ https://issues.apache.org/jira/browse/PHOENIX-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195060#comment-15195060 ]


Gabriel Reid commented on PHOENIX-2723:
---------------------------------------

{quote}
Well, the logic is quite simple. If there are several input files and one table 
name, all those files will be loaded into this table. Otherwise the number of 
tables needs to equal the number of inputs.
{quote}

This sounds like the semantics of one input parameter are then changed by the 
contents of other input parameters, which I'm personally not in favor of.
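
To make the concern concrete, here is a rough sketch of the quoted rule. It is a hypothetical illustration only: the class name {{TableInputPairing}} and the method {{pair}} are made up for this comment and are not taken from the attached patch or from CsvBulkLoadTool. It assumes both options have already been split on commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not code from PHOENIX-2723-1.patch.
public class TableInputPairing {

    /** Resolves which input paths feed which table under the quoted rule. */
    public static Map<String, List<String>> pair(List<String> tables, List<String> inputs) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        if (tables.size() == 1) {
            // One table name: every input is loaded into that table.
            result.put(tables.get(0), new ArrayList<>(inputs));
            return result;
        }
        if (tables.size() != inputs.size()) {
            // Several table names: the counts must match exactly.
            throw new IllegalArgumentException(
                "Expected " + tables.size() + " inputs but got " + inputs.size());
        }
        for (int i = 0; i < tables.size(); i++) {
            // Several table names: the i-th input belongs to the i-th table.
            result.computeIfAbsent(tables.get(i), k -> new ArrayList<>()).add(inputs.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Same input list, different meaning depending on the table list.
        System.out.println(pair(Arrays.asList("table1"),
                                Arrays.asList("input1", "input2")));
        System.out.println(pair(Arrays.asList("table1", "table2"),
                                Arrays.asList("input1", "input2")));
    }
}
```

The same input list {{input1,input2}} is read as "two files for one table" in the first call and as "one file per table" in the second, which is the dependency between parameters described above.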

I think that sticking with a single invocation for loading a single table is 
the best way to stay in line with the [Principle of least 
astonishment|https://en.wikipedia.org/wiki/Principle_of_least_astonishment] 
(mostly because it is in line with how most other tools work), and the 
advantages of not having to write shell scripts around it and reduced start-up 
time don't feel like a big enough win to compromise on simplicity here. That's 
just my opinion of course.

> Make BulkLoad able to load several tables at once
> -------------------------------------------------
>
>                 Key: PHOENIX-2723
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2723
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Sergey Soldatov
>            Assignee: Sergey Soldatov
>         Attachments: PHOENIX-2723-1.patch
>
>
> It turns out that bulk load is usually required for more than one table, and 
> it is usually done by running jobs one by one. The idea is to provide lists of 
> tables and corresponding input sources to the MR BulkLoad job. The syntax can 
> be something like:
> yarn ... CsvBulkLoadTool -t table1,table2,table3 --input input1,input2,input3
> With a tableName => input map available during the map phase, we can determine 
> which table the current split belongs to and produce the necessary 
> tableRowKeyPair. Any thoughts, suggestions?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
