Attila Piros has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11199 )

Change subject: Supporting Spark streaming DataFrame in KuduContext.
......................................................................


Patch Set 5:

> Patch Set 5:
>
> (1 comment)
With structured streaming, calling foreachPartition on a streaming DataFrame is 
an unsupported operation. You can see the error by running the new test without 
my KuduContext changes:

~~~~~~~~~~~~~~~~~~~~~~~~
> Task :kudu-spark:test

org.apache.kudu.spark.kudu.StreamingTest > testKuduContextWithSparkStreaming FAILED
    org.apache.spark.sql.streaming.StreamingQueryException: Queries with streaming sources must be executed with writeStream.start();;
    LocalRelation [value#16]

    === Streaming Query ===
    Identifier: [id = f00274e4-469e-43fd-8d9a-3e42f81125a3, runId = 35f66191-e282-4112-b226-26cd2b0c6fae]
    Current Committed Offsets: {}
    Current Available Offsets: {MemoryStream[value#1]: 0}

    Current State: ACTIVE
    Thread State: RUNNABLE

    Logical Plan:
    Project [_1#7 AS key#10, _2#8 AS val#11]
    +- SerializeFromObject [assertnotnull(assertnotnull(input[0, scala.Tuple2, true]))._1 AS _1#7, staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, assertnotnull(assertnotnull(input[0, scala.Tuple2, true]))._2, true, false) AS _2#8]
       +- MapElements <function1>, int, [StructField(value,IntegerType,false)], obj#6: scala.Tuple2
          +- DeserializeToObject assertnotnull(cast(value#1 as int)), obj#5: int
             +- StreamingExecutionRelation MemoryStream[value#1], [value#1]

        Caused by:
        org.apache.spark.sql.AnalysisException: Queries with streaming sources must be executed with writeStream.start();;
        LocalRelation [value#16]

1 test completed, 1 failed

> Task :kudu-spark:test FAILED
~~~~~~~~~~~~~~~~~~~~~~~~~~~

So this solution takes the same approach as KafkaSink: it calls 
foreachPartition at the RDD level instead of on the DataFrame. Basically, it is 
a workaround.
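
To illustrate what "foreachPartition on RDD-level" could look like, here is a 
minimal sketch. It is not the code from the patch. It assumes the RDD is 
reached through the DataFrame's queryExecution.toRdd, which yields InternalRow 
values rather than Row. The object name, the method name and the row counter 
are hypothetical stand-ins for the real per-partition write.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.catalyst.InternalRow

object RddLevelForeachPartition {
  // Hypothetical stand-in for the per-partition write: it only counts rows.
  def processPartitions(data: DataFrame): Long = {
    val processed =
      data.sparkSession.sparkContext.longAccumulator("processedRows")
    // Assumption: use the RDD[InternalRow] behind the DataFrame's own
    // QueryExecution instead of calling the Dataset-level foreachPartition.
    data.queryExecution.toRdd.foreachPartition { rows: Iterator[InternalRow] =>
      // A real writer would convert each InternalRow using data.schema here.
      rows.foreach(_ => processed.add(1L))
    }
    processed.value
  }
}
```

The sketch can be tried on a plain batch DataFrame, which only demonstrates the 
API shape. The difference from the Dataset-level call matters when the 
DataFrame is the one handed to a streaming sink for a batch.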


--
To view, visit http://gerrit.cloudera.org:8080/11199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iead04539d3514920a5d6803c34715e5686124572
Gerrit-Change-Number: 11199
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Piros <piros.attila.zs...@gmail.com>
Gerrit-Reviewer: Attila Bukor <abu...@apache.org>
Gerrit-Reviewer: Attila Piros <piros.attila.zs...@gmail.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 26 Sep 2018 00:42:12 +0000
Gerrit-HasComments: No
