[ https://issues.apache.org/jira/browse/PHOENIX-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13965423#comment-13965423 ]
Gabriel Reid commented on PHOENIX-918:
--------------------------------------
{quote}I know a good Apache open source project that already does that. Why not
work to integrate Apache Phoenix with Apache Hive rather than have some
separate/different/duplicated effort (i.e. like, as a start, implementing this
JIRA)?{quote}
FWIW, I agree with this. At its core, Phoenix is an HBase schema mapping
mechanism combined with a system for doing optimal scans and retrieval of
data for a given query. Both of these are also the main concerns of an
optimal HBaseStorageHandler for Hive, and I think Phoenix has already
largely solved them. That's my 2c anyhow.
As for the HCatalog integration, I think that basically dropping in
HCatInputFormat in place of FileInputFormat should be enough to handle the
use case that I outlined above. I expect the biggest issue will simply be
getting the dependencies set up correctly.
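To illustrate why I'd expect the swap to be small, here is a rough,
dependency-free sketch. It deliberately does not use any Hadoop, HCatalog, or
Phoenix classes: RowSource, CsvLines, TypedRecords, and describeUpserts are
all hypothetical names made up for this example, standing in for "whatever
the input format hands to the mapper" and "the code that turns a row into an
upsert". The assumption is that the downstream code only needs column values
by position, so it should not care whether rows came from split text lines
or from typed records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Dependency-free sketch; none of these types are Hadoop, HCatalog, or Phoenix classes. */
final class InputSwapSketch {

    /** Hypothetical stand-in for whatever the configured input format yields. */
    interface RowSource {
        List<List<Object>> rows();
    }

    /** Stand-in for the existing text path: each line is split on commas. */
    static final class CsvLines implements RowSource {
        private final List<String> lines;

        CsvLines(List<String> lines) {
            this.lines = lines;
        }

        @Override
        public List<List<Object>> rows() {
            List<List<Object>> out = new ArrayList<>();
            for (String line : lines) {
                out.add(new ArrayList<Object>(Arrays.asList(line.split(","))));
            }
            return out;
        }
    }

    /** Stand-in for a table-aware source that already yields typed column values. */
    static final class TypedRecords implements RowSource {
        private final List<List<Object>> records;

        TypedRecords(List<List<Object>> records) {
            this.records = records;
        }

        @Override
        public List<List<Object>> rows() {
            return records;
        }
    }

    /** The part that stays the same regardless of where the rows came from. */
    static List<String> describeUpserts(String table, RowSource source) {
        List<String> out = new ArrayList<>();
        for (List<Object> row : source.rows()) {
            StringBuilder placeholders = new StringBuilder();
            for (int i = 0; i < row.size(); i++) {
                placeholders.append(i == 0 ? "?" : ", ?");
            }
            out.add("UPSERT INTO " + table + " VALUES (" + placeholders + ") <- " + row);
        }
        return out;
    }

    public static void main(String[] args) {
        RowSource csv = new CsvLines(Arrays.asList("1,foo", "2,bar"));
        RowSource typed = new TypedRecords(Arrays.asList(
                Arrays.<Object>asList(1, "foo"),
                Arrays.<Object>asList(2, "bar")));

        // Only the source differs between the two calls.
        describeUpserts("EVENTS", csv).forEach(System.out::println);
        describeUpserts("EVENTS", typed).forEach(System.out::println);
    }
}
```

In a real job the source would be chosen through the job's input format
configuration rather than a constructor call, and the typed path would need
real type handling instead of relying on toString(); the sketch only shows
the shape of the change.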
> Support importing directly from ORC formatted HDFS data
> -------------------------------------------------------
>
> Key: PHOENIX-918
> URL: https://issues.apache.org/jira/browse/PHOENIX-918
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
>
> We currently have a good way to import from CSV, but we should also add the
> ability to import from HDFS ORC files, as this would likely be common if
> folks have Hive data they'd like to import.
> [~enis], [~ndimiduk], [~devaraj] - Does this make sense, or is there a
> better, existing way? Any takers on implementing it?