GitHub user koeninger commented on a diff in the pull request:
https://github.com/apache/spark/pull/15679#discussion_r85641432
--- Diff: docs/streaming-kafka-0-10-integration.md ---
@@ -120,15 +184,24 @@ Kafka has an offset commit API that stores offsets in a special Kafka topic. By
<div class="codetabs">
<div data-lang="scala" markdown="1">
stream.foreachRDD { rdd =>
- val offsets = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
+ val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
// some time later, after outputs have completed
- stream.asInstanceOf[CanCommitOffsets].commitAsync(offsets)
+ stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}
As with HasOffsetRanges, the cast to CanCommitOffsets will only succeed if
called on the result of createDirectStream, not after transformations. The
commitAsync call is thread-safe, but must occur after your outputs have
completed if you want meaningful semantics.
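For context, a minimal end-to-end sketch of this pattern, assuming
`streamingContext`, `topics`, and `kafkaParams` are placeholders defined
elsewhere:

    import org.apache.kafka.clients.consumer.ConsumerRecord
    import org.apache.spark.streaming.kafka010._

    // `streamingContext`, `topics`, and `kafkaParams` are assumed to exist.
    val stream = KafkaUtils.createDirectStream[String, String](
      streamingContext,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](topics, kafkaParams))

    stream.foreachRDD { rdd =>
      // Cast the batch RDD itself, before any transformation, to read offsets.
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

      rdd.foreachPartition { records =>
        // write the records to the downstream store here
      }

      // Commit only after the output above has finished, so a failure before
      // this point causes the batch to be replayed rather than lost.
      stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
    }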
</div>
<div data-lang="java" markdown="1">
+ stream.foreachRDD(new VoidFunction<JavaRDD<ConsumerRecord<String, String>>>() {
+   @Override
+   public void call(JavaRDD<ConsumerRecord<String, String>> rdd) {
+     OffsetRange[] offsetRanges = ((HasOffsetRanges) rdd.rdd()).offsetRanges();
+
+     // some time later, after outputs have completed
+     ((CanCommitOffsets) stream.inputDStream()).commitAsync(offsetRanges);
--- End diff ---
I think it's far too late to fix those issues at this point. DStreams
return an RDD, not a parameterized type. KafkaUtils methods return DStreams
and RDDs, not an implementation-specific type.
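To make that constraint concrete, a small sketch (with `stream` as returned
by createDirectStream above) of what goes wrong after a transformation:

    // After map, both the static and the runtime types are plain RDDs.
    stream.map(record => record.value).foreachRDD { rdd =>
      // The RDD produced by the transformation is not a KafkaRDD and does
      // not implement HasOffsetRanges, so this cast throws ClassCastException.
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
    }

This is why the docs tell you to cast the result of createDirectStream
directly: the Kafka-specific behavior lives only on the concrete classes
behind the generic DStream/RDD return types.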