[jira] [Commented] (HBASE-3727) MultiHFileOutputFormat

James Taylor (JIRA) Tue, 24 Sep 2013 08:57:32 -0700

    [ https://issues.apache.org/jira/browse/HBASE-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776417#comment-13776417 ]


James Taylor commented on HBASE-3727:
-------------------------------------

We have a good use case for this, Andrew. Arun Singh, an open source 
contributor to Phoenix, is writing a bulk CSV loader for Phoenix. We'd like to 
get it working efficiently for the scenario where a data table has indexes. The 
indexes are themselves HBase tables (essentially partial copies of the data 
table with a different row key structure) with the same column family structure 
as the data table. With the feature you've added, we can populate the data 
table and its index tables in a single map-reduce run. Otherwise, we need a 
separate map-reduce run for the data table and for each index table, which is 
clearly suboptimal.

Any chance we could get this into a patch on 0.94?
                
> MultiHFileOutputFormat
> ----------------------
>
>                 Key: HBASE-3727
>                 URL: https://issues.apache.org/jira/browse/HBASE-3727
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>            Priority: Minor
>         Attachments: MultiHFileOutputFormat.java, MultiHFileOutputFormat.java
>
>
> Like MultiTableOutputFormat, but outputting HFiles. The key is the table 
> name as an ImmutableBytesWritable (IBW). Creates sub-writers (code cut and 
> pasted from HFileOutputFormat) on demand that produce HFiles in per-table 
> subdirectories of the configured output path. Does not currently support 
> partitioning for existing tables / incremental update.
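
The routing idea in the description above (key names the table, sub-writers
are created on demand, output lands in per-table subdirectories) can be
sketched roughly as follows. This is not the attached
MultiHFileOutputFormat.java: it is a minimal illustration that writes plain
text files instead of HFiles so it runs without HBase. The class name
PerTableWriterSketch, the file name part-00000, and the table names DATA and
DATA_IDX are all assumptions made for the example.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the record writer described in the issue.
// Text files take the place of HFiles; the routing logic is the point.
class PerTableWriterSketch implements AutoCloseable {
    private final Path outputRoot;
    // One sub-writer per table, created lazily.
    private final Map<String, BufferedWriter> writers = new LinkedHashMap<>();

    PerTableWriterSketch(Path outputRoot) {
        this.outputRoot = outputRoot;
    }

    // The key names the destination table. The sub-writer for a table is
    // created the first time that table is seen, under <outputRoot>/<table>/.
    void write(String tableName, String record) throws IOException {
        BufferedWriter writer = writers.get(tableName);
        if (writer == null) {
            Path tableDir = Files.createDirectories(outputRoot.resolve(tableName));
            writer = Files.newBufferedWriter(
                    tableDir.resolve("part-00000"), StandardCharsets.UTF_8);
            writers.put(tableName, writer);
        }
        writer.write(record);
        writer.newLine();
    }

    // Close every sub-writer that was opened during the run.
    @Override
    public void close() throws IOException {
        for (BufferedWriter writer : writers.values()) {
            writer.close();
        }
        writers.clear();
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("multi-hfile-sketch");
        try (PerTableWriterSketch out = new PerTableWriterSketch(root)) {
            out.write("DATA", "row1,a,b");      // data table record
            out.write("DATA_IDX", "a|row1");    // index table record
            out.write("DATA", "row2,c,d");
        }
        System.out.println("Output written under " + root);
    }
}
```

A single pass over the input can therefore feed the data table and any number
of index tables, which is the single map-reduce run scenario raised in the
comment. A real implementation would also have to deal with HFile sort order
and region boundaries, which this sketch ignores.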

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
