david-streamlio commented on issue #9966: URL: https://github.com/apache/pulsar/issues/9966#issuecomment-803033947
I guess my main concern is the duplication of data if a task fails and is re-processed. Consider a scenario where the task is a list of 1,000 records to be put into a database, and we fail on record 500 with a non-connectivity error such as "invalid foreign key". Then we proceed with inserting the remaining 500 records before failing the entire task. The task message will get redelivered, and the same 999 records will be inserted a second time. Even if these second inserts were just duplicates, they would all be unnecessary calls to the DB.
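To make the scenario concrete, here is a minimal sketch using Python's built-in `sqlite3` module in place of a real sink database. It is not Pulsar code: the table names, the helper functions (`make_db`, `process_continue_on_error`, `process_atomically`, `count`), and the idea that returning `False` stands for "fail the task so the message is redelivered" are all assumptions made for illustration. It compares the continue-after-error behavior described above with one possible alternative, wrapping the whole task in a single transaction.

```python
import sqlite3


def make_db():
    """Create an in-memory DB with a foreign key from child to parent (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child (n INTEGER, "
        "parent_id INTEGER NOT NULL REFERENCES parent(id))"
    )
    conn.execute("INSERT INTO parent (id) VALUES (1)")
    conn.commit()
    return conn


def process_continue_on_error(conn, records):
    """Insert each record on its own, keep going after a failure, then fail the task."""
    failed = 0
    for record in records:
        try:
            conn.execute("INSERT INTO child VALUES (?, ?)", record)
            conn.commit()
        except sqlite3.IntegrityError:
            failed += 1
    return failed == 0  # False stands in for "task failed, message is redelivered"


def process_atomically(conn, records):
    """Insert the whole task in one transaction; any failure rolls everything back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany("INSERT INTO child VALUES (?, ?)", records)
    except sqlite3.IntegrityError:
        return False
    return True


def count(conn):
    """Number of rows currently in the child table."""
    return conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]


# 1,000 records; record 500 points at a parent row that does not exist.
records = [(n, 999 if n == 500 else 1) for n in range(1, 1001)]
```

In this sketch, delivering `records` twice through `process_continue_on_error` leaves 1,998 rows in `child` (999 per delivery), while `process_atomically` leaves none, so a redelivery starts from a clean state. Whether a single transaction is practical depends on the sink and the task size; it is only one way to avoid the repeated inserts.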
