Ayush Verma created BAHIR-223:
---------------------------------

             Summary: Concern around reliability of sql-streaming-sqs
                 Key: BAHIR-223
                 URL: https://issues.apache.org/jira/browse/BAHIR-223
             Project: Bahir
          Issue Type: Bug
          Components: Spark Structured Streaming Connectors
            Reporter: Ayush Verma


Looking at the source for the *sql-streaming-sqs* connector, it appears that
messages are deleted from SQS on every fetchMaxOffset() call.

[https://github.com/apache/bahir/blob/3912360ca5bcca269a30ff42120cac46934693c4/sql-streaming-sqs/src/main/scala/org/apache/spark/sql/streaming/sqs/SqsSource.scala#L106]

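My understanding of a Spark Structured Streaming source is that a call to the
commit() method signals that Spark has completed processing up to the given
offset. If messages are deleted before that point, a failure during processing
could leave them unrecoverable. Should we not delete the SQS messages on a
call to commit() instead?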
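
To make the concern concrete, here is a minimal toy model in Python. It is
only a sketch and not the connector's code: FakeQueue, DeleteOnFetchSource,
DeleteOnCommitSource and remaining_after_crash are hypothetical names, and no
Spark or AWS API is used. It compares what is left in the queue when a query
fails after fetching a batch but before committing it.

```python
# Toy model only: every name here is hypothetical and exists purely to
# illustrate the question; none of it comes from Bahir, Spark, or the AWS SDK.

class FakeQueue:
    """Stands in for an SQS queue: messages stay until explicitly deleted."""

    def __init__(self, messages):
        self.messages = list(messages)

    def receive(self):
        # Returns a copy of the messages without removing them.
        return list(self.messages)

    def delete(self, batch):
        self.messages = [m for m in self.messages if m not in batch]


class DeleteOnFetchSource:
    """Deletes messages as soon as they are fetched."""

    def __init__(self, queue):
        self.queue = queue

    def fetch(self):
        batch = self.queue.receive()
        self.queue.delete(batch)
        return batch

    def commit(self, batch):
        pass  # nothing left to delete


class DeleteOnCommitSource:
    """Deletes messages only once the batch is reported as processed."""

    def __init__(self, queue):
        self.queue = queue

    def fetch(self):
        return self.queue.receive()

    def commit(self, batch):
        self.queue.delete(batch)


def remaining_after_crash(source_cls):
    """Fetch one batch, then 'crash' before commit; return what is left."""
    queue = FakeQueue(["m1", "m2", "m3"])
    source_cls(queue).fetch()
    return queue.messages


lost = remaining_after_crash(DeleteOnFetchSource)    # nothing left to retry
kept = remaining_after_crash(DeleteOnCommitSource)   # batch is still queued
```

The model deliberately ignores SQS visibility timeouts and receive batching;
it only shows that, with deletion tied to commit(), an uncommitted batch is
still available to a restarted query.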



