[
https://issues.apache.org/jira/browse/PHOENIX-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120717#comment-15120717
]
Randy Gelhausen commented on PHOENIX-2632:
------------------------------------------
I would like to see this moved into Phoenix in two ways:
1. [~jmahonin] agreed the "create if not exists" snippet would improve the
existing phoenix-spark API integration. I'll look at opening an additional JIRA
and submitting a preliminary patch to add it there.
2. I also envision this as a new "executable" module similar to the pre-built
bulk CSV loading MR job:
  HADOOP_CLASSPATH=$(hbase mapredcp):/path/to/hbase/conf \
    hadoop jar phoenix-4.0.0-incubating-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table EXAMPLE --input /data/example.csv
Making the generic "Hive table/query <-> Phoenix" use case bash-scriptable
opens the door to users who aren't going to write Spark code just to move data
back and forth between Hive and HBase.
[~elserj] [~jmahonin] I'm happy to add tests and restructure the existing code
for both 1 and 2, but will need some guidance once you decide yea or nay for
each.
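To illustrate the "create if not exists" step from item 1, here is a minimal sketch of how a Phoenix DDL statement might be derived from a Hive schema. It is not the HiveToPhoenix project's code: the function name, the type mapping, and the choice of primary key columns are all assumptions made for the example. Parameterized Hive types such as decimal(10,2) are deliberately left out.

```python
# Hypothetical Hive -> Phoenix type mapping (an assumption for this sketch;
# parameterized types like decimal(p,s) or varchar(n) are not handled).
HIVE_TO_PHOENIX = {
    "string": "VARCHAR",
    "int": "INTEGER",
    "bigint": "BIGINT",
    "smallint": "SMALLINT",
    "tinyint": "TINYINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
    "timestamp": "TIMESTAMP",
    "date": "DATE",
    "binary": "VARBINARY",
}


def phoenix_create_ddl(table, columns, pk_columns):
    """Build a CREATE TABLE IF NOT EXISTS statement for Phoenix.

    columns:    list of (column_name, hive_type) pairs
    pk_columns: column names forming the primary key (Phoenix requires one)
    """
    if not pk_columns:
        raise ValueError("Phoenix tables need at least one primary key column")

    names = [name for name, _ in columns]
    missing = [c for c in pk_columns if c not in names]
    if missing:
        raise ValueError("primary key columns not in schema: %s" % missing)

    defs = []
    for name, hive_type in columns:
        phoenix_type = HIVE_TO_PHOENIX.get(hive_type.lower())
        if phoenix_type is None:
            raise ValueError("no mapping for Hive type: %s" % hive_type)
        # Primary key columns are declared NOT NULL.
        not_null = " NOT NULL" if name in pk_columns else ""
        defs.append("%s %s%s" % (name, phoenix_type, not_null))

    defs.append("CONSTRAINT pk PRIMARY KEY (%s)" % ", ".join(pk_columns))
    return "CREATE TABLE IF NOT EXISTS %s (%s)" % (table, ", ".join(defs))


if __name__ == "__main__":
    print(phoenix_create_ddl(
        "EXAMPLE", [("id", "bigint"), ("name", "string")], ["id"]))
```

The resulting string could then be executed over a Phoenix JDBC connection before the DataFrame is saved. How the primary key is chosen is left to the caller here, since Hive tables carry no equivalent constraint.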
> Easier Hive->Phoenix data movement
> ----------------------------------
>
> Key: PHOENIX-2632
> URL: https://issues.apache.org/jira/browse/PHOENIX-2632
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Randy Gelhausen
>
> Moving tables or query results from Hive into Phoenix today requires
> error-prone manual schema redefinition inside HBase storage handler properties.
> Since Hive and Phoenix support nearly equivalent types, it should be easier for
> users to pick a Hive table and load it (or the results of a query on it) into
> Phoenix.
> I'm posting this to open a design discussion, and also to submit my own project
> https://github.com/randerzander/HiveToPhoenix for consideration as an early
> solution. It creates a Spark DataFrame from a Hive query, uses Phoenix JDBC
> to "create if not exists" a Phoenix equivalent table, and uses the
> phoenix-spark artifact to store the DataFrame into Phoenix.
> I'm eager to get feedback on whether this is interesting/useful to the Phoenix
> community.