[jira] [Commented] (PHOENIX-918) Support importing directly from ORC formatted HDFS data

Gabriel Reid (JIRA) Thu, 10 Apr 2014 08:07:21 -0700

    [ https://issues.apache.org/jira/browse/PHOENIX-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13965423#comment-13965423 ]


Gabriel Reid commented on PHOENIX-918:
--------------------------------------

{quote}I know a good Apache open source project that already does that. Why not 
work to integrate Apache Phoenix with Apache Hive rather than have some 
separate/different/duplicated effort (i.e. like, as a start, implementing this 
JIRA)?{quote}

FWIW, I agree with this. At its core, Phoenix is an HBase schema mapping 
mechanism combined with a system for doing optimal scans and retrieval of 
data for a given query. Both of these would be the main focuses of an optimal 
HBaseStorageHandler for Hive, and I think that Phoenix has already largely 
solved them. That's my 2c anyhow.

As for the HCatalog integration, I think that basically dropping in 
HCatInputFormat in place of FileInputFormat should be enough to handle the 
use case that I outlined above. I expect that the biggest issue will simply 
be getting the dependencies set up correctly.

> Support importing directly from ORC formatted HDFS data
> -------------------------------------------------------
>
>                 Key: PHOENIX-918
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-918
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>
> We currently have a good way to import from CSV, but we should also add the 
> ability to import from HDFS ORC files, as this would likely be common if 
> folks have Hive data they'd like to import.
> [~enis], [~ndimiduk], [~devaraj] - Does this make sense, or is there a 
> better, existing way? Any takers on implementing it?



--
This message was sent by Atlassian JIRA
(v6.2#6252)
