From the logs provided by AWS support, I can see these two issues:
1) Sometimes Camel doesn't extend the visibilityTimeout (or Camel sends the request but AWS doesn't apply it), and Camel doesn't notice this at all.
2) Sometimes AWS fails to extend the visibilityTimeout and Camel does catch it, but it only logs a WARNING.

My question is: why doesn't Camel retry the extension when this happens? And I am not sure whether Camel should throw an Exception rather than only log a warning, because the failure can lead to duplicate processing, which is a real problem.
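To illustrate what I mean by retrying, here is a rough sketch (not Camel's actual code) of a ChangeMessageVisibility call wrapped in a retry loop, using the AWS SDK v1 API that appears in the stack trace below. The class name, queueUrl/receiptHandle arguments, retry count and backoff are placeholders for discussion only:

    import com.amazonaws.services.sqs.AmazonSQS;
    import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
    import com.amazonaws.services.sqs.model.AmazonSQSException;

    // Hypothetical helper, for discussion only: retry the extension a few
    // times before giving up, instead of logging a single WARN.
    public class VisibilityExtender {
        private final AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();

        public boolean extend(String queueUrl, String receiptHandle, int timeoutSeconds, int maxAttempts) {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    sqs.changeMessageVisibility(queueUrl, receiptHandle, timeoutSeconds);
                    return true;
                } catch (AmazonSQSException e) {
                    // The error in our logs (InvalidParameterValue: "Message does not exist
                    // or is not available...") means the receipt handle is already stale,
                    // so retrying cannot help; only transient errors are worth retrying.
                    if ("InvalidParameterValue".equals(e.getErrorCode())) {
                        return false;
                    }
                    try {
                        Thread.sleep(1000L * attempt); // simple backoff before the next attempt
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        return false;
                    }
                }
            }
            return false;
        }
    }

Even a retry like this would not have prevented the duplicate in our case, because by the time the warning appears the receipt handle is already invalid; it would only help with transient failures, which is why I think the silent case (issue 1) is the more important one.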
On 02/07/2019, 11:22, "Qingyang Xu" <qingyang...@bbc.co.uk> wrote:

>Hi,
>Thank you for your reply. I don't have a simple reproducer. It happens
>about once in 200 messages. It is rare, but it does happen almost every
>day, which is quite annoying :(
>
>On 01/07/2019, 14:46, "Andrea Cosentino" <ancosen1...@yahoo.com.INVALID> wrote:
>
>>Do you have a simple reproducer for this?
>>
>>--
>>Andrea Cosentino
>>----------------------------------
>>Apache Camel PMC Chair
>>Apache Karaf Committer
>>Apache Servicemix PMC Member
>>Email: ancosen1...@yahoo.com
>>Twitter: @oscerd2
>>Github: oscerd
>>
>>On Tuesday, June 25, 2019, 5:11:05 PM GMT+2, Qingyang Xu
>><qingyang...@bbc.co.uk> wrote:
>>
>>Hi,
>>We have a microservice on AWS (multiple instances, each with multiple
>>consumers) that uses the Camel aws-sqs component to listen to an AWS SQS
>>queue. Our Camel version is 2.24.0 and the aws-sqs configuration is
>>visibilityTimeout=300&extendMessageVisibility=true&maxMessagesPerPoll=5&concurrentConsumers=5.
>>Sometimes (in less than 1% of cases) our microservice processes a message
>>twice, and sometimes we find a warning like this:
>>
>>2019-06-18 12:55:23.821 [Camel (camel-1) thread #7 - SqsTimeoutExtender]
>>WARN o.a.c.c.a.sqs.SqsConsumer - Extending visibility window failed for
>>exchange Exchange[ID-i-00edba07f823c5496-1560852804975-0-254]. Will not
>>attempt to extend visibility further. This exception will be ignored.
>>com.amazonaws.services.sqs.model.AmazonSQSException: Value
>>AQEB/3m4117WvFRRTVpLJVyLa9D0pbln3f2Y8x/0lgTfWEDd for parameter
>>ReceiptHandle is invalid. Reason: Message does not exist or is not
>>available for visibility timeout change. (Service: AmazonSQS; Status
>>Code: 400; Error Code: InvalidParameterValue; Request ID:
>>5531d38b-1253-59a1-a944-8609f3281568)
>>at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1638)
>>at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1303)
>>at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1055)
>>at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
>>at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
>>at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
>>at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
>>at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
>>at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
>>at com.amazonaws.services.sqs.AmazonSQSClient.doInvoke(AmazonSQSClient.java:1740)
>>at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:1716)
>>at com.amazonaws.services.sqs.AmazonSQSClient.executeChangeMessageVisibility(AmazonSQSClient.java:518)
>>at com.amazonaws.services.sqs.AmazonSQSClient.changeMessageVisibility(AmazonSQSClient.java:494)
>>at org.apache.camel.component.aws.sqs.SqsConsumer$TimeoutExtender.run(SqsConsumer.java:327)
>>at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>>at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>>at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>>at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>at java.lang.Thread.run(Thread.java:748)
>>
>>I thought this was because Camel failed to extend the visibilityTimeout,
>>so the message became visible in the queue again and was picked up by
>>another consumer. I raised an AWS support ticket and got logs from AWS
>>like this:
>>
>>1. Operation=SendMessage | RemoteIpAddress=52.213.250.143 | EndTime=Tue, 18 Jun 2019 12:51:18 UTC
>>2. Operation=ReceiveMessage | RemoteIpAddress=52.208.226.232 | EndTime=Tue, 18 Jun 2019 12:51:18 UTC | VisibilityTimeout=60
>>3. Operation=ReceiveMessage | RemoteIpAddress=52.209.137.230 | EndTime=Tue, 18 Jun 2019 12:52:19 UTC | VisibilityTimeout=60
>>4. Operation=ChangeMessageVisibility | RemoteIpAddress=52.209.137.230 | EndTime=Tue, 18 Jun 2019 12:52:49 UTC | VisibilityTimeout=90
>>5. Operation=ChangeMessageVisibility | RemoteIpAddress=52.209.137.230 | EndTime=Tue, 18 Jun 2019 12:53:49 UTC | VisibilityTimeout=90
>>6. Operation=ChangeMessageVisibility | RemoteIpAddress=52.209.137.230 | EndTime=Tue, 18 Jun 2019 12:54:49 UTC | VisibilityTimeout=90
>>7. Operation=ChangeMessageVisibility | RemoteIpAddress=52.208.226.232 | EndTime=Tue, 18 Jun 2019 12:55:23 UTC | VisibilityTimeout=90
>>8. Operation=DeleteMessage | RemoteIpAddress=52.209.137.230 | EndTime=Tue, 18 Jun 2019 12:55:23 UTC
>>9. Operation=ChangeMessageVisibility | RemoteIpAddress=52.208.226.232 | EndTime=Tue, 18 Jun 2019 12:56:23 UTC | VisibilityTimeout=90
>>10. Operation=DeleteMessage | RemoteIpAddress=52.208.226.232 | EndTime=Tue, 18 Jun 2019 12:56:57 UTC
>>
>>We can see the message was picked up by the first consumer
>>(RemoteIpAddress=52.208.226.232), which did not extend the
>>visibilityTimeout, so after 60 seconds the message was picked up by
>>another consumer (RemoteIpAddress=52.209.137.230), which extended the
>>visibilityTimeout as usual. This happened without any warning or
>>exception. We only saw the warning above because, roughly 4 minutes
>>later, the first consumer tried to extend the visibilityTimeout after the
>>message had already been deleted. Sometimes this happens completely
>>silently.
>>
>>This happens a couple of times every week. Could it be a problem with the
>>threads (concurrentConsumers=5) that schedule the visibilityTimeout
>>extension? Has anyone seen this before?
>>
>>Many thanks,
>>Qingyang Xu
>
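PS: for context, this is roughly how the consumer described in the quoted mail is configured; a minimal sketch only, where the queue name, the #sqsClient registry bean and the processor endpoint are placeholders, not our real values:

    import org.apache.camel.builder.RouteBuilder;

    public class SqsConsumerRoute extends RouteBuilder {
        @Override
        public void configure() {
            from("aws-sqs://my-queue"
                    + "?amazonSQSClient=#sqsClient"    // SQS client looked up from the registry
                    + "&visibilityTimeout=300"         // initial visibility timeout, in seconds
                    + "&extendMessageVisibility=true"  // Camel should keep extending it while we process
                    + "&maxMessagesPerPoll=5"
                    + "&concurrentConsumers=5")
                .to("bean:myProcessor");               // processing a message takes a few minutes
        }
    }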