Attila Piros has posted comments on this change. (
http://gerrit.cloudera.org:8080/11199 )
Change subject: Supporting Spark streaming DataFrame in KuduContext.
......................................................................
Patch Set 5:
> Patch Set 5:
>
> (1 comment)
In case of structured streaming foreachPartition on a streaming DataFrame is an
unsupported operation. You can see this error running the new test without my
KuduContext changes:
~~~~~~~~~~~~~~~~~~~~~~~~
> Task :kudu-spark:test
org.apache.kudu.spark.kudu.StreamingTest > testKuduContextWithSparkStreaming
FAILED
org.apache.spark.sql.streaming.StreamingQueryException: Queries with
streaming sources must be executed with writeStream.start();;
LocalRelation [value#16]
=== Streaming Query ===
Identifier: [id = f00274e4-469e-43fd-8d9a-3e42f81125a3, runId =
35f66191-e282-4112-b226-26cd2b0c6fae]
Current Committed Offsets: {}
Current Available Offsets: {MemoryStream[value#1]: 0}
Current State: ACTIVE
Thread State: RUNNABLE
Logical Plan:
Project [_1#7 AS key#10, _2#8 AS val#11]
+- SerializeFromObject [assertnotnull(assertnotnull(input[0, scala.Tuple2,
true]))._1 AS _1#7, staticinvoke(class
org.apache.spark.unsafe.types.UTF8String, StringType, fromString,
assertnotnull(assertnotnull(input[0, scala.Tuple2, true]))._2, true, false) AS
_2#8]
+- MapElements <function1>, int, [StructField(value,IntegerType,false)],
obj#6: scala.Tuple2
+- DeserializeToObject assertnotnull(cast(value#1 as int)), obj#5: int
+- StreamingExecutionRelation MemoryStream[value#1], [value#1]
Caused by:
org.apache.spark.sql.AnalysisException: Queries with streaming sources
must be executed with writeStream.start();;
LocalRelation [value#16]
1 test completed, 1 failed
> Task :kudu-spark:test FAILED
~~~~~~~~~~~~~~~~~~~~~~~~~~~
So this solution uses the same method as was used for KafkaSink: using
foreachPartition on RDD-level. Basically it is a workaround.
--
To view, visit http://gerrit.cloudera.org:8080/11199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iead04539d3514920a5d6803c34715e5686124572
Gerrit-Change-Number: 11199
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Piros <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Attila Piros <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 26 Sep 2018 00:42:12 +0000
Gerrit-HasComments: No