Attila Zsolt Piros created KUDU-2539:
----------------------------------------

             Summary: Supporting Spark Streaming DataFrame in KuduContext
                 Key: KUDU-2539
                 URL: https://issues.apache.org/jira/browse/KUDU-2539
             Project: Kudu
          Issue Type: Improvement
          Components: spark
    Affects Versions: 1.8.0
            Reporter: Attila Zsolt Piros


Currently KuduContext does not support Spark streaming DataFrames. The problem 
comes from a foreachPartition call, which, like foreach, is an unsupported 
operation on a streaming DataFrame: 

[unsupported operations in 
streaming|https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#unsupported-operations]
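
A minimal sketch of the failure mode, independent of Kudu. It assumes a local 
Spark 2.x session and uses the built-in "rate" source as a stand-in for any 
streaming input. The object name and the no-op consumer are made up for 
illustration. Calling foreachPartition directly on the streaming DataFrame is 
expected to be rejected with an AnalysisException rather than run:

```scala
import org.apache.spark.sql.{AnalysisException, Row, SparkSession}

// Hypothetical reproduction app; not part of kudu-spark.
object ForeachPartitionOnStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("foreachPartition-on-stream")
      .getOrCreate()

    // The built-in "rate" source stands in for any streaming input.
    val streamingDf = spark.readStream.format("rate").load()

    // Explicitly typed function value so the Scala overload of
    // foreachPartition is selected unambiguously.
    val consume: Iterator[Row] => Unit = rows => rows.foreach(_ => ())

    try {
      // A batch-style action on a streaming DataFrame: Spark refuses to
      // execute this outside of writeStream.start().
      streamingDf.foreachPartition(consume)
    } catch {
      case e: AnalysisException =>
        println(s"Rejected by Spark: ${e.getMessage}")
    } finally {
      spark.stop()
    }
  }
}
```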

I have created a small example app with a custom Kudu sink which can be used 
for testing:

[kudu custom sink and example 
app|https://github.com/attilapiros/kudu_custom_sink]
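
For orientation only, a custom sink of this kind could look roughly like the 
sketch below. This is not the code from the repository above. The class names 
are hypothetical, and reading the "kudu.master" and "kudu.table" options is an 
assumption. It relies on Spark 2.x's internal Sink trait and on 
KuduContext.upsertRows, so it can only work once KuduContext no longer calls 
foreachPartition on the streaming DataFrame it is handed:

```scala
import org.apache.kudu.spark.kudu.KuduContext
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.execution.streaming.Sink
import org.apache.spark.sql.sources.StreamSinkProvider
import org.apache.spark.sql.streaming.OutputMode

// Hypothetical sink: hands every micro-batch to KuduContext.
class ExampleKuduSink(kuduContext: KuduContext, tableName: String) extends Sink {
  override def addBatch(batchId: Long, data: DataFrame): Unit = {
    // 'data' is still backed by a streaming plan, which is why a
    // foreachPartition call inside KuduContext is a problem here.
    kuduContext.upsertRows(data, tableName)
  }
}

// Hypothetical provider, so the sink can be selected with
// writeStream.format(classOf[ExampleKuduSinkProvider].getName).
class ExampleKuduSinkProvider extends StreamSinkProvider {
  override def createSink(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      partitionColumns: Seq[String],
      outputMode: OutputMode): Sink = {
    // Option names are assumptions made for this sketch.
    val master = parameters("kudu.master")
    val table = parameters("kudu.table")
    new ExampleKuduSink(new KuduContext(master, sqlContext.sparkContext), table)
  }
}
```

A streaming query would then pass the two options through 
writeStream.option(...) and set a checkpointLocation before calling start().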

A patch fixing this issue in kudu-spark is also ready, so a Gerrit review 
with the solution can be expected soon.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
