[jira] [Updated] (PHOENIX-1973) Improve CsvBulkLoadTool performance by moving keyvalue construction from map phase to reduce phase

Sergey Soldatov (JIRA) Mon, 15 Feb 2016 21:22:37 -0800

     [ 
https://issues.apache.org/jira/browse/PHOENIX-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sergey Soldatov updated PHOENIX-1973:
-------------------------------------
    Attachment: PHOENIX-1973-6.patch

Passing logical table names to the job as well as the physical ones. So, column 
information is building using logical name, but the output is written with 
physical. It's necessary for local indexes that are writing several logical 
tables to one physical.

> Improve CsvBulkLoadTool performance by moving keyvalue construction from map 
> phase to reduce phase
> --------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-1973
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1973
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Rajeshbabu Chintaguntla
>            Assignee: Sergey Soldatov
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-1973-1.patch, PHOENIX-1973-2.patch, 
> PHOENIX-1973-3.patch, PHOENIX-1973-4.patch, PHOENIX-1973-5.patch, 
> PHOENIX-1973-6.patch
>
>
> It's similar to HBASE-8768. Only thing is we need to write custom mapper and 
> reducer in Phoenix. In Map phase we just need to get row key from primary key 
> columns and write the full text of a line as usual(to ensure sorting). In 
> reducer we need to get actual key values by running upsert query.
> It's basically reduces lot of map output to write to disk and data need to be 
> transferred through network.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (PHOENIX-1973) Improve CsvBulkLoadTool performance by moving keyvalue construction from map phase to reduce phase

Reply via email to