[
https://issues.apache.org/jira/browse/KUDU-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183098#comment-15183098
]
Ted Malaska commented on KUDU-1214:
-----------------------------------
OK this is what I'm going to do. I started the work, but I'm going to wait
until someone lets me know that Kudu will accept this work.
I would like to add the following, each listed with its use case:
1. RDD.ForeachPartition: This is so I can put data from Spark into Kudu
2. RDD.MapPartition: This is so I can use complex interactions with Kudu when
making transformations
3. DStream.ForeachPartition: This is so I can put data from Spark into Kudu
from Spark Streaming
4. DStream.MapPartition: This is so I can use complex interactions with Kudu
when making transformations from Spark Streaming
5. RDD Functions: This will allow me to use my Kudu functions straight off an
RDD so I don't need to use the KuduContext on every call
6. DStream Functions: This will allow me to use my Kudu functions straight off
a DStream so I don't need to use the KuduContext on every call
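To make the per-partition idea in items 1 and 2 concrete, here is a minimal
sketch in plain Python with no Spark or Kudu dependency. Everything in it is
hypothetical and only for illustration: FakeKuduClient, KuduContextSketch,
foreach_partition, and map_partition are made-up names, not the proposed API,
and a list of lists stands in for an RDD's partitions. The point it tries to
show is that one client is opened per partition and shared by all rows in it.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Row = Tuple[str, int]


class FakeKuduClient:
    """Hypothetical stand-in for a Kudu client; it only touches an in-memory dict."""

    def __init__(self, store: Dict[str, int]) -> None:
        self.store = store
        self.closed = False

    def upsert(self, key: str, value: int) -> None:
        self.store[key] = value

    def get(self, key: str) -> int:
        return self.store[key]

    def close(self) -> None:
        self.closed = True


class KuduContextSketch:
    """Hypothetical context: opens one client per partition, not one per row."""

    def __init__(self, store: Dict[str, int]) -> None:
        self.store = store
        self.clients_opened = 0

    def _open_client(self) -> FakeKuduClient:
        self.clients_opened += 1
        return FakeKuduClient(self.store)

    def foreach_partition(
        self,
        partitions: List[List[Row]],
        fn: Callable[[Iterator[Row], FakeKuduClient], None],
    ) -> None:
        # Use case 1: push every row of a partition into the store.
        for part in partitions:
            client = self._open_client()
            try:
                fn(iter(part), client)
            finally:
                client.close()

    def map_partition(
        self,
        partitions: List[List[Row]],
        fn: Callable[[Iterator[Row], FakeKuduClient], Iterator[Row]],
    ) -> List[List[Row]]:
        # Use case 2: transform rows while consulting the store.
        out: List[List[Row]] = []
        for part in partitions:
            client = self._open_client()
            try:
                out.append(list(fn(iter(part), client)))
            finally:
                client.close()
        return out


def write_rows(rows: Iterator[Row], client: FakeKuduClient) -> None:
    for key, value in rows:
        client.upsert(key, value)


def enrich(rows: Iterator[Row], client: FakeKuduClient) -> Iterator[Row]:
    for key, _ in rows:
        yield (key, client.get(key) * 10)


store: Dict[str, int] = {}
ctx = KuduContextSketch(store)
partitions = [[("a", 1), ("b", 2)], [("c", 3)]]  # two pretend partitions

ctx.foreach_partition(partitions, write_rows)
enriched = ctx.map_partition(partitions, enrich)
```

Items 5 and 6 would then be thin wrappers that let the same calls hang off
the RDD or DStream itself instead of going through the context each time.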
Please confirm that I should continue with this work.
> Add Integration points for Spark, Spark Streaming, and Spark SQL
> ----------------------------------------------------------------
>
> Key: KUDU-1214
> URL: https://issues.apache.org/jira/browse/KUDU-1214
> Project: Kudu
> Issue Type: New Feature
> Components: integration
> Reporter: Ted Malaska
> Attachments: KUDU-1214.1.patch
>
>
> This JIRA will be broken up into four main JIRAs:
> 1. Add Support for Spark RDD map and foreach integration with Kudu
> 2. Add Support for Spark DStream map and foreach integration with Kudu
> 3. Add Support for Spark SQL defaultSource and push down predicates
> 4. Add documentation for all Spark Integrations