Hi,

FYI: I have filed a JIRA ticket for this, but I thought someone here might 
know of a solution or workaround, so I am posting it here as well.

In one of our projects we use Geode. Here is a summary of how we use it:
- Geode servers (Release 1.1.1) have multiple regions.
- Clients subscribe to the data from these regions.
- Clients register interest in all entries, so they receive updates for 
every entry, from creation through modification to deletion.
- One of the regions usually holds 5-10 million entries with a TTL of 24 hours. 
Most entries are added one after another within a span of about an hour, so 
when the TTL kicks in they are also destroyed within about an hour.
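
To put a rough number on that last point: if the expirations are spread evenly 
over the hour (an assumption; in practice they may be burstier), each subscribed 
client has to absorb on the order of 1,400 to 2,800 destroy events per second. 
A minimal sketch of the arithmetic, with a made-up class name:

```java
// Back-of-the-envelope estimate; the entry counts and the one-hour window
// come from the description above, the even spread is an assumption.
final class ExpiryBurstEstimate {

    /** Average destroy events per second if `entries` expire evenly over `windowSeconds`. */
    static long eventsPerSecond(long entries, long windowSeconds) {
        return entries / windowSeconds; // integer division, rounds down
    }

    public static void main(String[] args) {
        long windowSeconds = 60 * 60; // entries were created within about one hour
        System.out.println(eventsPerSecond(5_000_000L, windowSeconds));  // lower bound
        System.out.println(eventsPerSecond(10_000_000L, windowSeconds)); // upper bound
    }
}
```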

Problem:
Every now and then we observe the following message:
    Client queue for _gfe_non_durable_client_with_id_x.x.x.x(14229:loner):42754:e4266fc4_2_queue client is full.
This seems to happen when the TTL kicks in. Entries start getting evicted 
(deleted), and the resulting updates must be sent to the clients. We see that 
the updates flow for a while, but then they suddenly stop and the queue size 
starts growing. This is becoming a major issue for the smooth functioning of 
our production setup. Any help would be much appreciated.
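
As a rough illustration of why the queue fills during the expiry burst: once 
events are enqueued faster than a client drains them, the backlog grows 
linearly until it reaches the queue's capacity. The sketch below only does that 
arithmetic. The capacity and both rates are hypothetical numbers chosen for 
illustration, not measurements from our setup and not Geode defaults, and the 
class name is made up.

```java
// Illustrative arithmetic only; nothing here calls into Geode.
final class QueueFillSketch {

    /**
     * Whole seconds until a backlog that starts empty reaches `capacity`,
     * or -1 if the consumer keeps up with the producer.
     */
    static long secondsUntilFull(long capacity, long producedPerSec, long drainedPerSec) {
        long netGrowth = producedPerSec - drainedPerSec;
        if (netGrowth <= 0) {
            return -1; // the queue never fills
        }
        return (capacity + netGrowth - 1) / netGrowth; // ceiling division
    }

    public static void main(String[] args) {
        // Hypothetical: 100,000-event queue, 2,777 events/s in, 1,777 events/s out.
        System.out.println(secondsUntilFull(100_000L, 2_777L, 1_777L));
    }
}
```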

I did some groundwork by downloading the source and looking at the code. It 
references two issues, #37581 and #51400, but I am unable to view the actual 
JIRA tickets (they need login credentials). Hopefully this helps someone 
looking at the issue. Here is the pertinent code:

    @Override
    @edu.umd.cs.findbugs.annotations.SuppressWarnings("TLW_TWO_LOCK_WAIT")
    void checkQueueSizeConstraint() throws InterruptedException {
      if (this.haContainer instanceof HAContainerMap && isPrimary()) { // Fix for bug 39413
        if (Thread.interrupted())
          throw new InterruptedException();
        synchronized (this.putGuard) {
          if (putPermits <= 0) {
            synchronized (this.permitMon) {
              if (reconcilePutPermits() <= 0) {
                if (region.getSystem().getConfig().getRemoveUnresponsiveClient()) {
                  isClientSlowReciever = true;
                } else {
                  try {
                    long logFrequency = CacheClientNotifier.DEFAULT_LOG_FREQUENCY;
                    CacheClientNotifier ccn = CacheClientNotifier.getInstance();
                    if (ccn != null) { // check needed for junit tests
                      logFrequency = ccn.getLogFrequency();
                    }
                    if ((this.maxQueueSizeHitCount % logFrequency) == 0) {
                      logger.warn(LocalizedMessage.create(
                          LocalizedStrings.HARegionQueue_CLIENT_QUEUE_FOR_0_IS_FULL,
                          new Object[] {region.getName()}));
                      this.maxQueueSizeHitCount = 0;
                    }
                    ++this.maxQueueSizeHitCount;
                    this.region.checkReadiness(); // fix for bug 37581
                    // TODO: wait called while holding two locks
                    this.permitMon.wait(CacheClientNotifier.eventEnqueueWaitTime);
                    this.region.checkReadiness(); // fix for bug 37581
                    // Fix for #51400. Allow the queue to grow beyond its
                    // capacity/maxQueueSize, if it is taking a long time to
                    // drain the queue, either due to a slower client or the
                    // deadlock scenario mentioned in the ticket.
                    reconcilePutPermits();
                    if ((this.maxQueueSizeHitCount % logFrequency) == 1) {
                      logger.info(LocalizedMessage
                          .create(LocalizedStrings.HARegionQueue_RESUMING_WITH_PROCESSING_PUTS));
                    }
                  } catch (InterruptedException ex) {
                    // TODO: The line below is meaningless. Comment it out later
                    this.permitMon.notifyAll();
                    throw ex;
                  }
                }
              }
            } // synchronized (this.permitMon)
          } // if (putPermits <= 0)
          --putPermits;
        } // synchronized (this.putGuard)
      }
    }
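
My reading of the method, reduced to a stand-alone sketch: a put consumes a 
permit; when none are left, the putter folds in any permits released by the 
take side, and if there are still none it waits once (bounded by the wait 
time) and then proceeds anyway, so the queue can grow past its capacity. This 
is a simplified model, not Geode code. PermitGateSketch, acquire(), release() 
and returnedPermits are names I made up, and the behaviour of 
reconcilePutPermits() is an assumption, since it is not part of the snippet 
above. The remove-unresponsive-client branch, the logging and the readiness 
checks are left out.

```java
// Simplified, hypothetical model of the permit logic; not the Geode implementation.
final class PermitGateSketch {
    private final Object putGuard = new Object();
    private final Object permitMon = new Object();
    private final long waitMillis;
    private int putPermits;      // guarded by putGuard
    private int returnedPermits; // guarded by permitMon; released by the take side
    private int fullHits;        // guarded by putGuard; times a put found no permits

    PermitGateSketch(int capacity, long waitMillis) {
        if (waitMillis <= 0) {
            throw new IllegalArgumentException("waitMillis must be positive");
        }
        this.putPermits = capacity;
        this.waitMillis = waitMillis;
    }

    /** Called by the producer before it enqueues one event. */
    void acquire() throws InterruptedException {
        synchronized (putGuard) {
            if (putPermits <= 0) {
                synchronized (permitMon) {
                    if (reconcile() <= 0) {
                        fullHits++;
                        // Single bounded wait, no loop: after the timeout
                        // the put goes ahead even without a free permit.
                        permitMon.wait(waitMillis);
                        reconcile();
                    }
                }
            }
            --putPermits; // may go negative, i.e. the queue exceeds capacity
        }
    }

    /** Called by the consumer after it removes one event. */
    void release() {
        synchronized (permitMon) {
            returnedPermits++;
            permitMon.notifyAll();
        }
    }

    /** Folds released permits into putPermits; caller holds both locks. */
    private int reconcile() {
        putPermits += returnedPermits;
        returnedPermits = 0;
        return putPermits;
    }

    int permits() {
        synchronized (putGuard) {
            return putPermits;
        }
    }

    int fullHits() {
        synchronized (putGuard) {
            return fullHits;
        }
    }
}
```

In this model, with a capacity of 2 and no consumer, the third acquire() waits 
for the timeout and then leaves the permit count at -1, which corresponds to 
the queue holding one entry more than its nominal capacity.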


Thanks
Mangesh
