[ 
https://issues.apache.org/jira/browse/PHOENIX-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959449#comment-14959449
 ] 

Gabriel Reid commented on PHOENIX-2216:
---------------------------------------

Thanks for the updated patches [~maghamraviki...@gmail.com], those tests are 
indeed working now with the multipleoutputs patch.

I was able to track down why the multipleoutputs patch isn't working, 
but the solution is probably to create some kind of wrapper around 
HFileOutputFormat2 which would add a custom OutputCommitter. I definitely like 
this approach, but it would probably involve quite a bit of slow-going 
debugging work (which I'm really not able to do at the moment, unfortunately).
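
To illustrate the wrapper idea only, here is a minimal sketch of the
delegation pattern. The Format and Committer interfaces below are simplified,
hypothetical stand-ins for Hadoop's OutputFormat and OutputCommitter (they are
not the real APIs), BaseFormat merely plays the role of HFileOutputFormat2,
and the extra per-table commit step is a placeholder, since what the custom
committer would actually need to do is not worked out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommitterWrapperSketch {

    // Hypothetical stand-in for Hadoop's OutputCommitter (not the real API).
    interface Committer {
        void commitTask(String taskId);
    }

    // Hypothetical stand-in for Hadoop's OutputFormat (not the real API).
    interface Format {
        String writerFor(String taskId);
        Committer committer();
    }

    // Plays the role of HFileOutputFormat2 in this sketch.
    static class BaseFormat implements Format {
        private final List<String> log;

        BaseFormat(List<String> log) {
            this.log = log;
        }

        public String writerFor(String taskId) {
            return "hfile-writer-" + taskId;
        }

        public Committer committer() {
            return taskId -> log.add("base commit " + taskId);
        }
    }

    // The wrapper: record writing is delegated unchanged, but the committer
    // it hands out runs the delegate's commit first and then a placeholder
    // step for each additional output table (an assumption of this sketch).
    static class WrappingFormat implements Format {
        private final Format delegate;
        private final List<String> extraTables;
        private final List<String> log;

        WrappingFormat(Format delegate, List<String> extraTables, List<String> log) {
            this.delegate = delegate;
            this.extraTables = extraTables;
            this.log = log;
        }

        public String writerFor(String taskId) {
            return delegate.writerFor(taskId);
        }

        public Committer committer() {
            Committer inner = delegate.committer();
            return taskId -> {
                inner.commitTask(taskId);
                for (String table : extraTables) {
                    log.add("extra commit " + table + " " + taskId);
                }
            };
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Format format = new WrappingFormat(
                new BaseFormat(log), Arrays.asList("INDEX_A", "INDEX_B"), log);
        format.committer().commitTask("task-0");
        // Print the commit steps in the order they ran.
        log.forEach(System.out::println);
    }
}
```

With the real classes, the wrapper would extend Hadoop's OutputFormat and
delegate to an HFileOutputFormat2 instance in the same way.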

As I said before, the approach with the custom HFileOutputFormat patch seems to 
be working fine, so realistically, if we want to get it in place in the short 
term, I think that's the way to go. The patch looks good to me. One thing that 
would be good is to update the integration test to use multiple regions on the 
test tables, as this is a spot where either of the approaches could run into 
issues.

> Support single mapper pass to CSV bulk load table and indexes
> -------------------------------------------------------------
>
>                 Key: PHOENIX-2216
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2216
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: maghamravikiran
>         Attachments: phoenix-custom-hfileoutputformat-comments.patch, 
> phoenix-custom-hfileoutputformat.patch, phoenix-multipleoutputs.patch
>
>
> Instead of running separate MR jobs for CSV bulk load: once for the table and 
> then once for each secondary index, generate both the data table HFiles and 
> the index table(s) HFiles in one mapper phase.
> Not sure if we need HBASE-3727 to be implemented for this or if we can do it 
> with existing HBase APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
