[
https://issues.apache.org/jira/browse/PHOENIX-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492377#comment-14492377
]
ASF GitHub Bot commented on PHOENIX-1815:
-----------------------------------------
Github user jmahonin commented on the pull request:
https://github.com/apache/phoenix/pull/63#issuecomment-92343659
Thanks for the review @mravi
That HBaseConfiguration.create() step is a great idea, I'll make that
change ASAP.
Re: naming scheme, I'd attempted to follow the Cassandra-Spark connector,
since there isn't yet much reference code available, and its feature set is
relatively closely aligned with ours:
https://github.com/datastax/spark-cassandra-connector/tree/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector
Although I'm not completely married to the idea, both Datastax (Cassandra)
and Databricks (Spark) seem to follow a _Functions.scala scheme, where _ is the
class to which implicit helper methods are attached. In this case, the
new 'ProductRDDFunctions' class adds the implicit helper method 'saveToPhoenix'
to objects of type RDD[Product], i.e. an RDD of tuples.
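As a rough illustration of that _Functions enrichment pattern, here is a minimal, self-contained Scala sketch. It does not use Spark or Phoenix; the names SeqFunctions and saveToSink are hypothetical stand-ins showing how an implicit class (like ProductRDDFunctions) can attach a save-style method (like saveToPhoenix) to an existing type:

```scala
object EnrichmentSketch {
  // Hypothetical analogue of ProductRDDFunctions: an implicit class that
  // wraps a Seq of tuples (Products) and adds a helper method to it.
  // The real connector would wrap RDD[Product] and write to Phoenix instead.
  implicit class SeqFunctions[A <: Product](val data: Seq[A]) extends AnyVal {
    // Stand-in for saveToPhoenix: renders each tuple as a "row" string
    // rather than persisting it anywhere.
    def saveToSink(table: String): Seq[String] =
      data.map(row => s"$table: ${row.productIterator.mkString(",")}")
  }
}
```

With the implicit in scope, callers invoke the helper as if it were defined on the collection itself, e.g. `Seq((1, "a")).saveToSink("MY_TABLE")`.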
> Use Spark Data Source API in phoenix-spark module
> -------------------------------------------------
>
> Key: PHOENIX-1815
> URL: https://issues.apache.org/jira/browse/PHOENIX-1815
> Project: Phoenix
> Issue Type: New Feature
> Reporter: Josh Mahonin
>
> Spark 1.3.0 introduces a new 'Data Source' API to standardize load and save
> methods for different types of data sources.
> The phoenix-spark module should implement the same API for use as a pluggable
> data store in Spark.
> ref:
> https://spark.apache.org/docs/latest/sql-programming-guide.html#data-sources
>
> https://databricks.com/blog/2015/01/09/spark-sql-data-sources-api-unified-data-access-for-the-spark-platform.html
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)