Re: gateway sender queue

2019-11-14 Thread Suranjan Kumar
+1 to Dan's suggestion.
In addition we should think about the following scenario:
In certain cases, cleanQueue would cause stop to block if receivers are not
up or are throwing exceptions while applying the events. In that case events
will not be deleted from the queue and stop will block.
Currently this is controlled by a timeout; however, we could give the user an
option to force-stop the queue if it is not drained within the given time.
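
For illustration, here is a minimal sketch of that force-stop idea
(stopWithTimeout, the queueSize probe, and the force flag are all
hypothetical; only GatewaySender.stop() is existing public API):

import java.util.function.LongSupplier;

import org.apache.geode.cache.wan.GatewaySender;

public final class ForceStopSketch {
  // Wait up to timeoutMs for the queue to drain, then stop the sender only
  // if it drained or the user explicitly asked to force the stop.
  public static void stopWithTimeout(GatewaySender sender, LongSupplier queueSize,
      long timeoutMs, boolean force) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (queueSize.getAsLong() > 0 && System.currentTimeMillis() < deadline) {
      Thread.sleep(100); // poll until drained or timed out
    }
    if (queueSize.getAsLong() == 0 || force) {
      sender.stop(); // existing GatewaySender API
    }
    // otherwise leave the sender running and report the undrained queue
  }
}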

On Fri, Nov 15, 2019 at 6:18 AM Michael Oleske  wrote:

> >
> > vsd stat in the gfs file
> >
>
> Just here to say: consider using a meter instead of a stat, as documented
> in https://cwiki.apache.org/confluence/display/GEODE/How+to+add+a+Meter
> if more than a log message is warranted.
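>
> A quick Micrometer sketch of what such a counter could look like (the
> meter name here is made up, not an agreed-upon Geode meter):
>
> import io.micrometer.core.instrument.Counter;
> import io.micrometer.core.instrument.MeterRegistry;
> import io.micrometer.core.instrument.simple.SimpleMeterRegistry;
>
> public class CleanQueueMeterSketch {
>   public static void main(String[] args) {
>     MeterRegistry registry = new SimpleMeterRegistry();
>     Counter cleans = Counter.builder("geode.gateway.sender.queue.cleans")
>         .description("times a sender queue was cleaned on start")
>         .register(registry);
>     cleans.increment(); // call wherever cleanQueues actually executes
>     System.out.println(cleans.count()); // prints 1.0
>   }
> }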
>
> -michael
>
> On Thu, Nov 14, 2019 at 11:29 AM Nabarun Nag  wrote:
>
> > +1 to Dan's suggestion
> >
> > What about a log and a vsd stat in the gfs file that tell us whether any
> > cleanQueue commands were executed?
> >
> >
> > Regards
> > Nabarun Nag
> >
> > On Thu, Nov 14, 2019 at 10:27 AM Udo Kohlmeyer  wrote:
> >
> > > In addition... making it the default has bigger consequences that we
> > > have not thought about.
> > >
> > > E.g. if you purge an existing queue on startup, is this the startup of
> > > the server node or of the GS queue? Given that we have a shared-nothing
> > > architecture, purging *should* only be local and not cluster-wide...
> > > I'd be interested to see a proposal on this feature.
> > >
> > > --Udo
> > >
> > > On 11/14/19 10:24 AM, Jason Huynh wrote:
> > > > +1 to Dan's suggestion
> > > >
> > > > On Thu, Nov 14, 2019 at 9:52 AM Dan Smith  wrote:
> > > >
> > > >> I'm ok with adding a --cleanQueue option.
> > > >>
> > > >> However, I don't think it should default to true, since that would
> > > >> be changing the behavior of the existing command. It should default
> > > >> to false.
> > > >>
> > > >> -Dan
> > > >>
> > > >> On Thu, Nov 14, 2019 at 9:28 AM Xiaojian Zhou 
> > wrote:
> > > >>
> > > >>> The --cleanQueue option is a similar idea to Barry's "DeadLetter"
> > > >>> spike. I remember that we decided not to do it.
> > > >>>
> > > >>>
> > > >>> On Wed, Nov 13, 2019 at 11:41 PM Mario Ivanac
>  > >
> > > >>> wrote:
> > > >>>
> > >  Hi,
> > > 
> > >  just to remind you of the last question:
> > > 
> > >  what is your opinion on adding an additional option, --cleanQueues,
> > >  to the gfsh command "start gateway sender" to control clearing of
> > >  existing queues?
> > > 
> > >  This option will indicate whether, when the gateway sender is
> > >  started, we should discard/clean the existing queue or use the
> > >  existing queue.
> > >  By default, the existing queue will be discarded/cleaned.
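> > > 
> > >  For illustration, the proposed usage might look like this (the
> > >  --cleanQueues flag is only the proposal here, not a shipped option;
> > >  --id and --member are existing "start gateway-sender" options):
> > > 
> > >  gfsh> start gateway-sender --id=sender1 --member=server1 --cleanQueues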
> > > 
> > >  Best Regards,
> > >  Mario
> > >  
> > >  From: Mario Ivanac 
> > >  Sent: 8 November 2019 13:00
> > >  To: dev@geode.apache.org 
> > >  Subject: Re: gateway sender queue
> > > 
> > >  Hi all,
> > > 
> > >  one more clarification regarding the 3rd question:
> > > 
> > >  "* Could we add an extra option to the gfsh command "start gateway
> > >  sender" that allows control of queue reset (for instance,
> > >  --cleanQueues)?"
> > > 
> > >  This option will indicate whether, when the gateway sender is
> > >  started, we should discard/clean the existing queue or use the
> > >  existing queue.
> > >  By default, the existing queue will be discarded/cleaned.
> > > 
> > >  Best Regards,
> > >  Mario
> > >  
> > >  From: Mario Ivanac 
> > >  Sent: 7 November 2019 9:01
> > >  To: Dan Smith ; dev@geode.apache.org <dev@geode.apache.org>
> > >  Subject: Re: gateway sender queue
> > > 
> > >  Hi,
> > > 
> > >  thanks for the answers.
> > > 
> > >  Some more details regarding the 1st question.
> > > 
> > >  Is this behavior the same (for serial and parallel gateway senders)
> > >  when the queue is persistent?
> > >  Meaning, should a persistent queue be purged if we restart the
> > >  gateway sender?
> > > 
> > >  Thanks,
> > >  Mario
> > > 
> > >  
> > >  From: Dan Smith 
> > >  Sent: 5 November 2019 18:52
> > >  To: dev@geode.apache.org 
> > >  Subject: Re: gateway sender queue
> > > 
> > >  Some replies, inline:
> > > 
> > > > *   During testing we have observed different behavior in parallel
> > > > and serial gateway senders. In case we manually stop, then start
> > > > gateway senders, for parallel gateway senders the queue is purged,
> > > > but for serial gateway senders this is not the case. Is this normal
> > > > behavior or a bug?
> > > >
> > >  Hmm, I also think stop is supposed to clear the queue. I think if you
> > >  are seeing that it 

Re: [DISCUSS] TTL setting on WAN

2019-03-25 Thread Suranjan Kumar
Hi,
 I think one approach for a user would be to 'filter' the events while
dispatching. If I remember correctly, we can attach a filter at dispatch
time and filter the events based on the creationTime of the GatewayEvent. We
could provide a pre-created filter and enable it via some setting so that the
user doesn't have to write their own.

Something like:

/**
 * All events that spend timeToLive or more time in the queue will be deleted
 * from the queue and will not be sent to the remote site.
 * A possible consequence is that the two sites can become inconsistent when
 * events are dropped.
 */
public GatewaySenderFactory setEntryTimeToLive(long timeToLive);
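
For concreteness, here is a minimal sketch of such a pre-created filter on
top of the existing GatewayEventFilter interface (the creationTime extractor
is the hypothetical part, since the public GatewayQueueEvent API does not
expose a creation timestamp today):

import java.util.concurrent.TimeUnit;
import java.util.function.ToLongFunction;

import org.apache.geode.cache.wan.GatewayEventFilter;
import org.apache.geode.cache.wan.GatewayQueueEvent;

public class TtlGatewayEventFilter implements GatewayEventFilter {
  private final long timeToLiveMs;
  private final ToLongFunction<GatewayQueueEvent> creationTime;

  public TtlGatewayEventFilter(long ttl, TimeUnit unit,
      ToLongFunction<GatewayQueueEvent> creationTime) {
    this.timeToLiveMs = unit.toMillis(ttl);
    this.creationTime = creationTime;
  }

  @Override
  public boolean beforeEnqueue(GatewayQueueEvent event) {
    return true; // accept everything into the queue
  }

  @Override
  public boolean beforeTransmit(GatewayQueueEvent event) {
    // Skip (do not transmit) events that spent timeToLive or more in the queue.
    long age = System.currentTimeMillis() - creationTime.applyAsLong(event);
    return age < timeToLiveMs;
  }

  @Override
  public void afterAcknowledgement(GatewayQueueEvent event) {}

  @Override
  public void close() {}
}

Returning false from beforeTransmit keeps the event out of the batch sent to
the remote site, which matches the "delete instead of send" semantics above.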

As queues are read in LRU order, this would be faster too. The only drawback
is that there will be only one thread (not sure if we have a concurrent
dispatcher yet) clearing the queue.

As Udo/Dan mentioned above, the user needs to be aware of the consequences.


On Tue, Mar 26, 2019 at 3:09 AM Bruce Schuchardt 
wrote:

> I've walked through the code to forward expiration actions to async
> event listeners & don't see how to apply it to removal of queue entries
> for WAN.  The current implementation just queues the expiration
> actions.  If we wanted to remove any queued events associated with the
> expired region entry we'd have to scan the whole queue, which would be
> too slow if we're overflowing the queue to disk.
>
> I've also walked through the conflation code.  It applies only to the
> current batch being processed by the gateway sender.  The data structure
> used to perform conflation is just a Map that is created in the sender's
> batch processing method and then thrown away.
>
> On 3/20/19 11:15 AM, Dan Smith wrote:
> >> 2) The developer wants to replicate _state_.  This means that implicit
> >> state changes (expiration or eviction w/ destroy) could allow us to
> >> optimize the queue size.  This is very similar to conflation, just a
> >> different kind of optimization.
> >>
> >> For this second case, does it make sense to allow the user to specify a
> >> different TTL than the underlying region?  It seems like what the user
> >> wants is to not replicate stale data and having an extra TTL attribute
> >> would just be another value to mis-configure.  What do you think about
> just
> >> providing a boolean flag?
> >>
> >>
> > This kinda jogged my memory. AsyncEventQueues actually *do* have a
> boolean
> > flag to allow you to forward expiration events to the queue. I have no
> idea
> > how this interacts with conflation though -
> >
> https://geode.apache.org/releases/latest/javadoc/org/apache/geode/cache/asyncqueue/AsyncEventQueueFactory.html#setForwardExpirationDestroy-boolean-
> >
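> > A hedged usage sketch of that flag (the listener body is illustrative
> > only; setForwardExpirationDestroy and createAsyncEventQueueFactory are
> > the existing APIs):
> >
> > import java.util.List;
> > import org.apache.geode.cache.Cache;
> > import org.apache.geode.cache.CacheFactory;
> > import org.apache.geode.cache.asyncqueue.AsyncEvent;
> > import org.apache.geode.cache.asyncqueue.AsyncEventListener;
> > import org.apache.geode.cache.asyncqueue.AsyncEventQueue;
> > import org.apache.geode.cache.asyncqueue.AsyncEventQueueFactory;
> >
> > public class ForwardExpirationSketch {
> >   public static void main(String[] args) {
> >     Cache cache = new CacheFactory().create();
> >     AsyncEventQueueFactory factory = cache.createAsyncEventQueueFactory();
> >     factory.setForwardExpirationDestroy(true); // forward expiration destroys
> >     AsyncEventQueue queue = factory.create("expiryQueue",
> >         new AsyncEventListener() {
> >           @Override
> >           public boolean processEvents(List<AsyncEvent> events) {
> >             events.forEach(e ->
> >                 System.out.println(e.getOperation() + " " + e.getKey()));
> >             return true; // batch handled
> >           }
> >         });
> >     System.out.println("created queue " + queue.getId());
> >     cache.close();
> >   }
> > }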
>


Re: Issues due to Special exceptions in RVV

2017-05-23 Thread Suranjan Kumar
Hi Darrel,
  Actually, I reproduced the issue in a unit test that was seen in one of the
runs, and initializeVersionHolder doesn't fix the issue.
  initializeFrom is used to initialize a member's RVV from a GII provider.

  It turns out that if a member has already recorded a higher version, then
while initializing a version holder from a member with a lower version we
introduce a special exception.
  But later, when new versions are recorded by the same member, the special
exception is not handled properly, and that leads to version holder data
structure corruption, causing a holder#contains(version) error. In fact, I
see that in this case an exception can remain permanently in the version
holder even though the version has been recorded.

  I have a fix, which I will send for review soon. However, I just wanted to
understand the issue and why it is not caught in delta GII.

 Moreover, if you look at the method initializeFrom, already recorded
versions are ignored. Isn't that an issue? If not, why not?
 I will file a JIRA if you suggest.

On Tue, May 23, 2017 at 10:49 PM, Darrel Schneider <dschnei...@pivotal.io>
wrote:

> I see that this test directly calls initializeFrom but the product never
> does this.
> The product always calls it through this method:
>
> org.apache.geode.internal.cache.versions.RegionVersionVector
>     .initializeVersionHolder(T, RegionVersionHolder)
>
> I can see that initializeVersionHolder checks a memberToVersion map and on
> a miss does not call initializeFrom but instead adds the new member's RVV
> to the memberToVersion map. Have you considered whether calling
> initializeVersionHolder would fix the issues you are seeing?
>
>
> On Tue, May 23, 2017 at 10:04 AM, Darrel Schneider <dschnei...@pivotal.io>
> wrote:
>
> > I made these modifications and the following compiles, runs, and fails on
> > geode.
> > I added the following to RegionVersionVectorJUnitTest:
> >   @Test
> >   public void testInitialized() {
> >
> > RegionVersionHolder vh1 = new RegionVersionHolder<>("vh1");
> > vh1.recordVersion(56);
> > vh1.recordVersion(57);
> > vh1.recordVersion(58);
> > vh1.recordVersion(59);
> > vh1.recordVersion(60);
> > assertTrue(vh1.contains(57));
> > vh1 = vh1.clone();
> > assertTrue(vh1.contains(57));
> > System.out.println("This node init, vh1="+vh1);
> >
> > RegionVersionHolder vh2 = new RegionVersionHolder<>("vh2");
> > for(int i=1;i<57;i++) {
> >   vh2.recordVersion(i);
> > }
> > vh2 = vh2.clone();
> > System.out.println("GII node init, vh2="+vh2);
> >
> > vh1.initializeFrom(vh2);
> > // assertTrue(vh1.contains(57)); // fails
> > vh1 = vh1.clone();
> > System.out.println("After initialize, vh1="+vh1);
> >
> > vh1.recordVersion(58);
> > vh1.recordVersion(59);
> > vh1.recordVersion(60);
> >
> > System.out.println("After initialize and record version, vh1="+vh1);
> >
> > vh1 = vh1.clone();
> > vh1.recordVersion(57);
> > System.out.println("After initialize and record version after clone,
> > vh1="+vh1);
> >
> > assertTrue(vh1.contains(57)); //FAILS
> >   }
> >
> >
> > On Tue, May 23, 2017 at 9:37 AM, Kirk Lund <kl...@apache.org> wrote:
> >
> >> Are you sure you're using Geode? The signature of
> >> RegionVersionHolder#recordVersion
> >> in Geode is:
> >>
> >> RegionVersionHolder#recordVersion(long)
> >>
> >> I recommend checking out develop branch of Geode, write a test to
> confirm
> >> your bug and then submit that test with a Jira ticket.
> >>
> >> On Tue, May 23, 2017 at 7:10 AM, Suranjan Kumar <
> suranjan.ku...@gmail.com
> >> >
> >> wrote:
> >>
> >> > Hi Bruce/Dan,
> >> >
> >> >  #1. I see some issues due to the introduction of special exceptions
> >> > in RegionVersionHolder. In some cases this leads to RegionVersionHolder
> >> > data structure corruption, causing wrong contains(version) results.
> >> >
> >> >  This happens because we set the version to the max of the
> >> > currentVersion or the newly recorded one, but do not fix the special
> >> > exception, if any is present.
> >> >
> >> >  It is easily reproducible even in junit tests.
> >> > The corruption of the holder data structure causes even future record
> >> > operations to fail.

Issues due to Special exceptions in RVV

2017-05-23 Thread Suranjan Kumar
Hi Bruce/Dan,

 #1. I see some issues due to the introduction of special exceptions in
RegionVersionHolder. In some cases this leads to RegionVersionHolder
data structure corruption, causing wrong contains(version) results.

 This happens because we set the version to the max of the currentVersion and
the newly recorded one, but do not fix the special exception, if any is
present.

 It is easily reproducible even in junit tests.
The corruption of the holder data structure causes even future record
operations to fail.
For example the following test fails:

 public void testInitialized() {

  RegionVersionHolder vh1 = new RegionVersionHolder(member);
  vh1.recordVersion(56, null);
  vh1.recordVersion(57, null);
  vh1.recordVersion(58, null);
  vh1.recordVersion(59, null);
  vh1.recordVersion(60, null);
  vh1 = vh1.clone();
  System.out.println("This node init, vh1="+vh1);

  RegionVersionHolder vh2 = new RegionVersionHolder(member);
  for(int i=1;i<57;i++) {
vh2.recordVersion(i,null);
  }
  vh2 = vh2.clone();
  System.out.println("GII node init, vh2="+vh2);

  vh1.initializeFrom(vh2);
  vh1 = vh1.clone();
  System.out.println("After initialize, vh1="+vh1);


  vh1.recordVersion(58,null);
  vh1.recordVersion(59,null);
  vh1.recordVersion(60,null);

  System.out.println("After initialize and record version, vh1="+vh1);

  vh1 = vh1.clone();
  vh1.recordVersion(57,null);
  System.out.println("After initialize and record version after clone,
vh1="+vh1);

  assertTrue(vh1.contains(57)); //FAILS
}


 #2. I have observed that
RegionVersionHolder#initializeFrom(RegionVersionHolder source)
does not record already recorded versions in itself, contrary to what
the comment above the method claims. In fact the junit tests also
verify the same, so I couldn't understand the rationale behind it. It
just adds a special exception if any higher version was already
recorded by self.
 Shouldn't the resulting holder, after initializing from a source,
contain records for both holders? If not, then why do we even need the
special exception?

If possible, could you please let me know your thoughts on this?

Regards,
Suranjan Kumar