Github user sirishaSindri commented on a diff in the pull request:
https://github.com/apache/spark/pull/20836#discussion_r177276642
--- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala ---
@@ -279,9 +279,8 @@ private[kafka010] case class InternalKafkaConsumer(
       if (record.offset > offset) {
         // This may happen when some records aged out but their offsets already got verified
         if (failOnDataLoss) {
-          reportDataLoss(true, s"Cannot fetch records in [$offset, ${record.offset})")
--- End diff ---
@gaborgsomogyi Thank you for looking at it. For batch queries, the query will always fail if it cannot read any data from the provided offsets because of lost data (see
https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html).
With this change, it won't fail; instead it will return all the messages that are still available within the requested offset range.
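
To illustrate the behavior from the user side, here is a minimal sketch of a batch read over an explicit offset range. The bootstrap servers, topic name, and offsets are hypothetical, and it assumes `failOnDataLoss=false` is honored for batch queries as described above:

```scala
// Sketch only: requires the spark-sql-kafka-0-10 package on the classpath.
// Topic name, servers, and offsets below are hypothetical.
import org.apache.spark.sql.SparkSession

object KafkaBatchReadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-batch-read-sketch")
      .master("local[*]")
      .getOrCreate()

    val df = spark.read
      .format("kafka")
      .option("kafka.bootstrap.servers", "host1:9092")
      .option("subscribe", "events")
      // Explicit per-partition offset range for the batch query.
      .option("startingOffsets", """{"events":{"0":100}}""")
      .option("endingOffsets", """{"events":{"0":500}}""")
      // With the change discussed above, records that aged out of this range
      // are skipped rather than causing the whole batch query to fail.
      .option("failOnDataLoss", "false")
      .load()

    // Only the messages still available within [100, 500) are returned.
    df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "offset").show()

    spark.stop()
  }
}
```

With `failOnDataLoss` left at its default of `true`, the same query would still fail as soon as any record in the requested range had aged out.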
---