[
https://issues.apache.org/jira/browse/GEODE-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188658#comment-16188658
]
Barry Oglesby commented on GEODE-3730:
--------------------------------------
I see whats going on with this issue.
A connection is re-established (meaning the remote site members went down and
have come back up):
{noformat}
[info 2017/09/11 13:50:24.910 PDT cluster1-server2 <Event Processor for
GatewaySender_ny_0> tid=0x56] ny: Using Pooled Connection to server-2:3528:
Connection[server-2:3528]@295453478
{noformat}
The {{Event Processor}} sends 2 batches in succession. Batch 1 contains event
with tradeId=1768. Batch 2 contains event with tradeId=1770.
{noformat}
[info 2017/09/11 13:50:25.230 PDT cluster1-server2 <Event Processor for
GatewaySender_ny_0> tid=0x56] During normal processing, dispatching the
following : 1 events (batch #1):
Event
EventID[threadID=0x10000|4;sequenceID=1770;bucketId=0]:1768->Trade
[tradeId=1768, tradeTime=Mon Sep 11 13:50:25 PDT 2017,
orderNumber=1768.order.1768, executingBroker=executingBroker1768]1781
[info 2017/09/11 13:50:25.444 PDT cluster1-server2 <Event Processor for
GatewaySender_ny_0> tid=0x56] During normal processing, dispatching the
following : 1 events (batch #2):
Event
EventID[threadID=0x1000a|4;sequenceID=1772;bucketId=10]:1770->Trade
[tradeId=1770, tradeTime=Mon Sep 11 13:50:25 PDT 2017,
orderNumber=1770.order.1770, executingBroker=executingBroker1770]1804
{noformat}
The {{AckReaderThread}} receives the ack for batch 1with tradeId=1768. It
contains a {{BatchException}}. Since
{{gemfire.GatewaySender.REMOVE_FROM_QUEUE_ON_EXCEPTION=false}}, the event is
not removed from the queue.
{noformat}
[warning 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for :
Event Processor for GatewaySender_ny_0> tid=0xfa] XXX AckReaderThread.run read
ack batchId=1; gotBatchException=true
[warning 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for :
Event Processor for GatewaySender_ny_0> tid=0xfa] A BatchException occurred
processing event 0
com.gemstone.gemfire.internal.cache.wan.BatchException70: Exception occurred
while processing a batch on the receiver running on DistributedSystem with Id:
3, DistributedMember on which the receiver is running:
server-2(cluster3-server2:64368)<v18>:58108
Caused by: com.gemstone.gemfire.cache.persistence.PartitionOfflineException:
Region /trade bucket 0 has persistent data that is no longer online stored at
these locations: [/192.168.2.14:/regionData created at timestamp 1505162997703
version 0 diskStoreId ff1a27e2ec154f39-a906b607a314e030 name null]
[warning 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for :
Event Processor for GatewaySender_ny_0> tid=0xfa] The event being processed
when the BatchException occurred was:
SenderEventImpl[id=EventID[threadID=0x10000|4;sequenceID=1770;bucketId=0];action=0;operation=CREATE;region=/trade;key=1768;value=Trade
[tradeId=1768, tradeTime=Mon Sep 11 13:50:25 PDT 2017,
orderNumber=1768.order.1768,
executingBroker=executingBroker1768];valueIsObject=1;numberOfParts=9;callbackArgument=GatewaySenderEventCallbackArgument
[originalCallbackArg=null;originatingSenderId=1;recipientGatewayReceivers=[2,
3, 4]];possibleDuplicate=false;creationTime=1505163025218;shadowKey=
1781;timeStamp=1505163025216;acked=false;dispatched=false]
{noformat}
The {{AbstractGatewaySenderEventProcessor batchIdToEventsMap}} contains two
in-progress events (one for each of the sent batches):
{noformat}
[warning 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for :
Event Processor for GatewaySender_ny_0> tid=0xfa] XXX AckReaderThread.run
events=
batchId=1
value=[[SenderEventImpl[id=EventID[threadID=0x10000|4;sequenceID=1770;bucketId=0];action=0;operation=CREATE;region=/trade;key=1768;value=Trade
[tradeId=1768, tradeTime=Mon Sep 11 13:50:25 PDT 2017,
orderNumber=1768.order.1768,
executingBroker=executingBroker1768];valueIsObject=1;numberOfParts=9;callbackArgument=GatewaySenderEventCallbackArgument
[originalCallbackArg=null;originatingSenderId=1;recipientGatewayReceivers=[2,
3, 4]];possibleDuplicate=false;creationTime=1505163025218;shadowKey=
1781;timeStamp=1505163025216;acked=false;dispatched=false]],
[SenderEventImpl[id=EventID[threadID=0x10000|4;sequenceID=1770;bucketId=0];action=0;operation=CREATE;region=/trade;key=1768;value=Trade
[tradeId=1768, tradeTime=Mon Sep 11 13:50:25 PDT 2017,
orderNumber=1768.order.1768,
executingBroker=executingBroker1768];valueIsObject=1;numberOfParts=9;callbackArgument=GatewaySenderEventCallbackArgument
[originalCallbackArg=null;originatingSenderId=1;recipientGatewayReceivers=[2,
3, 4]];possibleDuplicate=false;creationTime=1505163025218;shadowKey=
1781;timeStamp=1505163025216;acked=false;dispatched=false]]]
batchId=2
value=[[SenderEventImpl[id=EventID[threadID=0x1000a|4;sequenceID=1772;bucketId=10];action=0;operation=CREATE;region=/trade;key=1770;value=Trade
[tradeId=1770, tradeTime=Mon Sep 11 13:50:25 PDT 2017,
orderNumber=1770.order.1770,
executingBroker=executingBroker1770];valueIsObject=1;numberOfParts=9;callbackArgument=GatewaySenderEventCallbackArgument
[originalCallbackArg=null;originatingSenderId=1;recipientGatewayReceivers=[2,
3, 4]];possibleDuplicate=false;creationTime=1505163025430;shadowKey=
1804;timeStamp=1505163025428;acked=false;dispatched=false]],
[SenderEventImpl[id=EventID[threadID=0x1000a|4;sequenceID=1772;bucketId=10];action=0;operation=CREATE;region=/trade;key=1770;value=Trade
[tradeId=1770, tradeTime=Mon Sep 11 13:50:25 PDT 2017,
orderNumber=1770.order.1770,
executingBroker=executingBroker1770];valueIsObject=1;numberOfParts=9;callbackArgument=GatewaySenderEventCallbackArgument
[originalCallbackArg=null;originatingSenderId=1;recipientGatewayReceivers=[2,
3, 4]];possibleDuplicate=false;creationTime=1505163025430;shadowKey=
1804;timeStamp=1505163025428;acked=false;dispatched=false]]]
{noformat}
The {{AckReaderThread}} receives the ack for batch 2 with tradeId=1770. Since
it does not contain a {{BatchException}}, the {{GatewayEventFilterAdapter
afterAcknowledgement}} method is invoked.
{noformat}
[warning 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for :
Event Processor for GatewaySender_ny_0> tid=0xfa] XXX AckReaderThread.run read
ack batchId=2; gotBatchException=false
[info 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for : Event
Processor for GatewaySender_ny_0> tid=0xfa] Received ACK for
GatewayQueueEvent[key=1770,operation=CREATE]
{noformat}
The event with tradeId=1770 is attempted to be removed from the queue, but
since remove just removes the head of the queue (not the key), the wrong event
is removed (tradeId=1768) .
{noformat}
[warning 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for :
Event Processor for GatewaySender_ny_0> tid=0xfa] XXX
ParallelGatewaySenderQueue.remove about to remove
event=SenderEventImpl[id=EventID[threadID=0x10000|4;sequenceID=1770;bucketId=0];action=0;operation=CREATE;region=/trade;key=1768;value=Trade
[tradeId=1768, tradeTime=Mon Sep 11 13:50:25 PDT 2017,
orderNumber=1768.order.1768,
executingBroker=executingBroker1768];valueIsObject=1;numberOfParts=9;callbackArgument=GatewaySenderEventCallbackArgument
[originalCallbackArg=null;originatingSenderId=1;recipientGatewayReceivers=[2,
3, 4]];possibleDuplicate=false;creationTime=1505163025218;shadowKey=
1781;timeStamp=1505163025216;acked=false;dispatched=false]
{noformat}
The next batch is sent containing the already-processed event with
tradeId=1770. The failed event with tradeId=1768 is never processed.
{noformat}
[info 2017/09/11 13:50:25.650 PDT cluster1-server2 <Event Processor for
GatewaySender_ny_0> tid=0x56] During normal processing, dispatching the
following : 1 events (batch #3):
Event
EventID[threadID=0x1000a|4;sequenceID=1772;bucketId=10]:1770->Trade
[tradeId=1770, tradeTime=Mon Sep 11 13:50:25 PDT 2017,
orderNumber=1770.order.1770, executingBroker=executingBroker1770]1804
{noformat}
> With gemfire.GatewaySender.REMOVE_FROM_QUEUE_ON_EXCEPTION=false, gateway
> events can still be removed from the queue when an exception occurs
> --------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: GEODE-3730
> URL: https://issues.apache.org/jira/browse/GEODE-3730
> Project: Geode
> Issue Type: Bug
> Components: wan
> Reporter: Barry Oglesby
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)