albertogpz edited a comment on pull request #6659:
URL: https://github.com/apache/geode/pull/6659#issuecomment-876191044
> This is what the
> > > > > > > @pivotal-eshu
> > > > > > > It looks like the Tx code is trying to assess Gateway sender
requirement to support GEODE-7971 in TX code. Which doesn't seems right.
> > > > > > > The "getLastTransactionEvent" should just return the last
event in the transaction; currently it is doing it by
"callbacks.get(callbacks.size() - 1);" which also seems to be error prone or
inefficient.
> > > > > > > In "firePendingCallbacks" it could have iterated over the list
using size index:
> > > > > > > for (int i=0; i<callbaack.size(); i++){
> > > > > > >
ee.getRegion().invokeTXCallbacks(EnumListenerEvent.AFTER_DESTROY, ee, true,
(i+1) == size() /_isLastTrasaction_/);
> > > > > > > }
> > > > > > > Please correct me if i am missing anything.
> > > > > >
> > > > > >
> > > > > > It was implemented that way to put the logic of getting the last
transaction event in another class although it is as simple as getting the last
element.
> > > > > > Also, this logic could detect misconfigurations that can only be
detected at runtime (second check in `getLastTransactionEvent()`), then the
caller of getLastTransactionEvent() could detect it by means of the exception
and set all the events as the last in the transaction. Why do this? because if
not all events in the transaction go to the same senders (misconfiguration),
that would imply that some senders will not get all the events for the
transaction but could try if group-transaction-events is set to complete the
transaction in the batch, by looking for the last event in the transaction
which could never come (delaying the batch sending) or may come but not
guaranteeing that the rest of the events for the transaction were sent in the
batch (because they may not have reached the sender).
> > > > > > The first check of `getLastTransactionEvent()` was added in
order to avoid the cost of the second check if no sender had
group-transaction-events set to true. In that case, it is irrelevant to return
the last transaction event or null because it will not be used although, it
could make more sense to return the last transaction event.
> > > > >
> > > > >
> > > > > It looks like the GEODE-7971 tried to use grouping with the
existing way of "invokeTXCallbacks()", where client, wan, cache-listener
notification are done one event from transaction at a time. Instead it could
have invoked wan separately from "firePendingCallbacks"; which has all the TX
events in one group.
> > > > > E.g.:
> > > > > firePendingCallbacks() {
> > > > > invokeTxCallbackForClientCacheLisener(getPendingCallbacks());
> > > > > invokeTxCallbackForWan(getPendingCallbacks());
> > > > > }
> > > > > Then there was no need to tag last transaction event.
> > > > > Please correct me if i am missing anything.
> > > > > And calling "configError" for an TX entry not part of specific
sender seems to be wrong; why would it be a error; think about the case where
TX uses helper/lookup region entry/value to calculate a final result and other
site is only interested in the final result, this is not an error, this is the
most expected behavior.
> > > > > E.g.:
> > > > > getBonusIncreatementFromBonusRegion();
> > > > > updateNewSalaryForEmployee(); // The region or final value other
wan site is interested in.
> > > > > I think for now we need to change the naming of the methods to be
more meaningful (based on what they are doing):
> > > > > "getLastTransactionEvent" to
"getLastTransactionEventForGroupedWANSender()"
> > > > > Instead of "ServiceConfigurationError" return null or boolean. As
the new change is doing.
> > >
> > >
> > > > Tagging the last transaction event is needed by
`SerialGatewaySenderQueue` and `ParallelGatewaySenderQueue` to make sure that,
when group-transaction-events is enabled, batches do not contain incomplete
transactions. The way to assure that is to check that for every transaction in
the batch (events are tagged with the transaction event), there is an event for
that transaction that is the last event in the transaction (there is an event
tagged as last transaction event).
> > > > I do not see why your changes avoid the need for tagging the last
transaction event.
> > >
> > >
> > > Thinking more about this, I see what you are saying...
> > > The other option could have been, separating WAN callback and creating
a single event/message for all the entries in transaction. May be this is too
much.
> > > I thought about that alternative when I was designing this solution
but abandoned it as it implied bigger changes.
> > > > Secondly, having events for a transaction that will not be
replicated by all senders (when some of the senders have
group-transaction-events set to true) could POSSIBLY be a configuration error.
> > > > Let me illustrate it with an example:
> > > > Let's suppose that we have a system with two senders, sender1 and
sender2, both with group-transaction-events set to true.
> > > > Now let's suppose that a transaction (t1) arrives to Geode with two
events: event1 (create a) and event2 (create b).
> > > > Let's suppose that event1 is to be replicated by sender1 (but not
sender2) and event 2 is to be replicated by sender2 (but not sender1).
> > > > The last event for the transaction will be event2 and it will be
tagged as such.
> > > > What we will see happening is that even1 arrives to sender1's queue.
When event1 is added to a batch and the batch is to be sent, it will be checked
if all events for the transaction to which even1 belongs (t1) are in the batch.
As there is only one event and it is not the last event in the the transaction,
the batch creator will try to get the missing events for the transaction but it
will never get them because they will never arrive, the last event in the
transaction will not be on the sender1 queue but it will be on the sender2
queue.
> > > > At the same time, event2 will arrive to sender2's queue. When this
event is added to a batch and the batch is to be sent, it will be checked if
all transactions are complete. For t1, as we have the last event for the
transaction (event2), it will be considered that it is so. Nevertheless, as not
all events for the transaction are sent to this sender, the assumption would be
wrong.
> > >
> > >
> > > I understand why the lastTransactionEvent; but I don't see how its a
configuration error...When you say configuration error, are you meaning sender
configuration or setting grouping configuration for a sender. In any case, it
is different usecase but is not an error. Application have their own
requirement of sending/managing data in different sites.
> >
> >
> > I mean both. This is what the documentation says about this feature:
> > _In order to use transaction event grouping:
> >
> > * The `group-transaction-events` setting is supported only on serial
senders with just one dispatcher thread, or on parallel senders.
> > * The regions to which the transaction events belong must be replicated
by the same set of gateway senders that also have this setting enabled.
> > * This setting cannot be enabled if `enable-batch-conflation` is in
effect._
> >
> > The first and the third conditions can be (and are) checked at
configuration time.
> > But the second condition can only be checked at run time, when a
transaction is received. If you have a transaction with several events that
spans several regions, and the gateway senders for the regions are not the
same, then group-transaction-events will not work even though whoever
configured the gateway senders may think it will. Some senders will receive
some events for the transaction and other senders will receive other events.
>
> Yes...Thats up to application/use-case, how and what kind data it wants to
be sent to other site. The WAN gateway senders
> doesn't have any limitations like that...
>
> > One may argue that this is not necessarily a configuration error but the
fact is that, in this case, a transaction replicated by gateway senders who
have group-transaction-events set to true will not be sent in the same batch
because senders will not receive all the events for the transaction.
>
> The grouping of transaction is some-way of making sure the transaction can
be sent as a single group; but it should not be limiting factor (saying error).
If it is a limiting factor, the TX should have been rolled back and an
exception is thrown to application. Right now what I can see is nothing is
rolled back or TX is aborted because of error, it is only logged as message;
which is noticed after the event/error has already happened.
You are right. It is not a limiting factor. But if you have configured
group-transaction-events for some or all your senders and you have events in
transactions being sent by those senders that cannot be grouped then chances
are that you might have made a mistake during the configuration. As this cannot
be detected at configuration time and it is not necessarily a configuration
error, you do not want to rollback the transaction. But at least, you will have
a log telling you that transactions are not being grouped so that you can check
if there is a configuration error.
>
> @albertogpz you also mentioned "This is what the documentation says about
this feature:"; I believe this is not release feature and is not yet documented
in Geode manuals; if this needs to be documented we need to create a jira
ticket and tag docs team @davebarnes97
This is documented in:
geode-docs/topologies_and_comm/topology_concepts/multisite_overview.html.md.erb
>
> Also, reading over the code, I see many of the methods in
TXLastEventInTransactionUtils.class doesn't have unit level tests. We were
wondering what happens with the helper methods in this class, when there are no
senders configured, when there are multiple enders on each region with and
without grouping option. It will really help Geode product having all the unit
test cases covered; do you think you can add additional tests...
Tests could also be added to the non-public methods of
`TXLastEventInTransactionUtils` in `TXLastEventInTransactionUtilsTest` but the
current tests on the public 'getLastTransactionEvent()` method seemed enough to
me when I wrote them to cover the cases you mention.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]