[ 
https://issues.apache.org/jira/browse/KUDU-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183268#comment-15183268
 ] 

Ted Malaska commented on KUDU-1214:
-----------------------------------

Thanks [~tlipcon] for the response.

So the functions may or may not be needed, depending on the cost of creating a 
Kudu client and on future expectations of having a Kerberos-enforced Kudu client.

Here are some questions to determine whether these functions are needed:
1. How expensive is the create client call?  If I said I needed to execute the 
create Kudu client call 20x a second, every second, for the next year within a 
given JVM, would you be OK with that?  If we are caching the client in the JVM, 
then this is not an issue.  I just need to validate that.
2. For saving to Kudu, would it not be easier to have a function on a DataFrame 
called saveToKudu, rather than having everyone write their own implementation 
with a foreachPartition?  Would the Kudu OutputFormat be enough for this use case?

So if on number 1 you say the client is cached, then we are good there.  If on 
number 2 you see the KuduOutputFormat as being good enough and you don't need 
an integration with DataFrame, then who am I to debate.
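
To make the caching question in number 1 concrete, here is a minimal sketch of 
a per-JVM client cache.  It does not use the Kudu API: FakeClient, ClientCache, 
getOrCreate and the master address string are all hypothetical stand-ins, and 
the real client may already handle this differently.  The only point is that 20 
"create" calls can resolve to a single shared instance.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ClientCache {
    // Hypothetical stand-in for an expensive-to-create client; NOT the Kudu API.
    public static final class FakeClient {
        final String masterAddress;
        FakeClient(String masterAddress) { this.masterAddress = masterAddress; }
    }

    // Counts how many clients were actually constructed in this JVM.
    public static final AtomicInteger CREATED = new AtomicInteger();

    // One client per master address, shared by every caller in this JVM.
    private static final ConcurrentHashMap<String, FakeClient> CLIENTS =
            new ConcurrentHashMap<>();

    // Returns the cached client for the address, constructing it at most once.
    public static FakeClient getOrCreate(String masterAddress) {
        return CLIENTS.computeIfAbsent(masterAddress, addr -> {
            CREATED.incrementAndGet();
            return new FakeClient(addr);
        });
    }

    public static void main(String[] args) {
        // Simulate 20 "create client" requests against the same (placeholder) master.
        for (int i = 0; i < 20; i++) {
            getOrCreate("master-host:7051");
        }
        System.out.println("clients created: " + CREATED.get());
    }
}
```

With something like this in place, each Spark task would ask the cache for a 
client instead of building its own, so the per-call cost only matters the first 
time a given master address is seen in a JVM.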

Let me know.
Ted Malaska

> Add Integration points for Spark, Spark Streaming, and Spark SQL
> ----------------------------------------------------------------
>
>                 Key: KUDU-1214
>                 URL: https://issues.apache.org/jira/browse/KUDU-1214
>             Project: Kudu
>          Issue Type: New Feature
>          Components: integration
>            Reporter: Ted Malaska
>         Attachments: KUDU-1214.1.patch
>
>
> This Jira will be broken up into four main Jiras:
> 1. Add Support for Spark RDD map and foreach integration with Kudu
> 2. Add Support for Spark DStream map and foreach integration with Kudu
> 3. Add Support for Spark SQL defaultSource and push down predicates
> 4. Add documentation for all Spark Integrations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
