[ 
https://issues.apache.org/jira/browse/KUDU-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183098#comment-15183098
 ] 

Ted Malaska commented on KUDU-1214:
-----------------------------------

OK, this is what I'm going to do. I have started the work, but I'm going to 
wait until someone lets me know that Kudu will accept it.

What I would like to add covers the following use cases.

1. RDD.ForeachPartition: This is so I can put data from Spark into Kudu
2. RDD.MapPartition: This is so I can use complex interactions with Kudu when 
making transformations 
3. DStream.ForeachPartition: This is so I can put data from Spark into Kudu 
from Spark Streaming
4. DStream.MapPartition: This is so I can use complex interactions with Kudu 
when making transformations from Spark Streaming
5. RDD Functions: This will allow me to use my Kudu functions straight off an 
RDD so I don't need to use the KuduContext on every call
6. DStream Functions: This will allow me to use my Kudu functions straight off 
a DStream so I don't need to use the KuduContext on every call
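
To make the shape of these six items concrete, here is a rough sketch in plain 
Python. It does not use Spark or Kudu at all: partitions are modeled as lists, 
and FakeKuduClient, KuduContext (as written here), KuduRDD, and every method 
name are hypothetical stand-ins, not the API of the attached patch. The idea it 
illustrates is only that one client is opened per partition rather than per 
row, and that a thin wrapper can bind the context once so callers do not pass 
it on every call.

```python
# Plain-Python model of the per-partition pattern; no Spark or Kudu required.
# All class and method names below are hypothetical, for illustration only.

TABLE = {}  # shared in-memory "table" standing in for Kudu storage


class FakeKuduClient:
    """Stand-in for a real client; counts how many times one is opened."""
    connections_opened = 0

    def __init__(self):
        FakeKuduClient.connections_opened += 1

    def upsert(self, key, value):
        TABLE[key] = value

    def get(self, key):
        return TABLE.get(key)


class KuduContext:
    """Opens one client per partition and hands it to the user function."""

    def foreach_partition(self, partitions, func):
        # Use cases 1 and 3: write each partition's rows through one client.
        for part in partitions:
            client = FakeKuduClient()
            func(iter(part), client)

    def map_partitions(self, partitions, func):
        # Use cases 2 and 4: transform rows while talking to the store.
        out = []
        for part in partitions:
            client = FakeKuduClient()
            out.append(list(func(iter(part), client)))
        return out


class KuduRDD:
    """Use cases 5 and 6: bind the context once, call methods directly."""

    def __init__(self, partitions, context):
        self.partitions = partitions
        self.context = context

    def kudu_foreach_partition(self, func):
        self.context.foreach_partition(self.partitions, func)

    def kudu_map_partitions(self, func):
        mapped = self.context.map_partitions(self.partitions, func)
        return KuduRDD(mapped, self.context)


def write_rows(rows, client):
    for key, value in rows:
        client.upsert(key, value)


def enrich(keys, client):
    for key in keys:
        yield key, client.get(key)


rdd = KuduRDD([[("a", 1), ("b", 2)], [("c", 3)]], KuduContext())
rdd.kudu_foreach_partition(write_rows)

keys = KuduRDD([["a", "c"], ["zz"]], rdd.context)
looked_up = keys.kudu_map_partitions(enrich)
```

In a real integration the wrapper would presumably be supplied through Scala 
implicits on RDD and DStream, and the client would be a real Kudu client, but 
that is an assumption about the eventual design rather than something stated 
here.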

Please give me confirmation that I should continue with this work.
 

> Add Integration points for Spark, Spark Streaming, and Spark SQL
> ----------------------------------------------------------------
>
>                 Key: KUDU-1214
>                 URL: https://issues.apache.org/jira/browse/KUDU-1214
>             Project: Kudu
>          Issue Type: New Feature
>          Components: integration
>            Reporter: Ted Malaska
>         Attachments: KUDU-1214.1.patch
>
>
> This Jira will be broken up into four main Jiras:
> 1. Add Support for Spark RDD map and foreach integration with Kudu
> 2. Add Support for Spark DStream map and foreach integration with Kudu
> 3. Add Support for Spark SQL defaultSource and push down predicates
> 4. Add documentation for all Spark Integrations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
