[ 
https://issues.apache.org/jira/browse/PHOENIX-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484574#comment-14484574
 ] 

Josh Mahonin commented on PHOENIX-1071:
---------------------------------------

It looks like the latest changes, which moved the unit tests into the 'it' 
directory, actually end up disabling them, I think because the file names 
don't end in "IT". Given the memory / VM complications, this might be ideal 
for now.

I have a local branch with the unit tests running during the integration-test 
phase, and I'm presently working on having the tests extend the 
BaseHBaseManagedTimeIT class. My hope is that once that's in place, the 
build issues and parallel execution limitations will be fixed along with it. 
If all goes well, I should have something ready tomorrow.

> Provide integration for exposing Phoenix tables as Spark RDDs
> -------------------------------------------------------------
>
>                 Key: PHOENIX-1071
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1071
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>            Assignee: Josh Mahonin
>
> A core concept of Apache Spark is the resilient distributed dataset (RDD), a 
> "fault-tolerant collection of elements that can be operated on in parallel". 
> One can create RDDs referencing a dataset in any external storage system 
> offering a Hadoop InputFormat, like PhoenixInputFormat and 
> PhoenixOutputFormat. There could be opportunities for additional interesting 
> and deep integration. 
> Add the ability to save RDDs back to Phoenix with a {{saveAsPhoenixTable}} 
> action, implicitly creating necessary schema on demand.
> Add support for {{filter}} transformations that push predicates to the server.
> Add a new {{select}} transformation supporting a LINQ-like DSL, for example:
> {code}
> // Count the number of different coffee varieties offered by each
> // supplier from Guatemala
> phoenixTable("coffees")
>     .select(c =>
>         where(c.origin == "GT"))
>     .countByKey()
>     .foreach(r => println(r._1 + "=" + r._2))
> {code} 
> Support conversions between Scala and Java types and Phoenix table data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
