Well, I found it's already reverted.

But I think we don't have to.

After discussed with Jason, I worked out a new fix. It kept previous 5591's
intention of exception handling and improved on assigning the port.

The port is now checked if available, so it will also resolve 2 minutes
timeout issue for the retry. (Or at least will not make things worse).

On Wed, Sep 5, 2018 at 10:14 AM, Ryan McMahon <rmcma...@pivotal.io> wrote:

> +1 for reverting in both places.
>
> I see that there is already an isGatewayReceiver flag in the AcceptorImpl
> constructor.  It's not ideal, but could we use this flag to prevent the 2
> minute retry logic for happening if this flag is true?
>
> Ryan
>
> On Wed, Sep 5, 2018 at 10:01 AM, Lynn Hughes-Godfrey <
> lhughesgodf...@pivotal.io> wrote:
>
> > +1 for reverting in both places.
> >
> > On Wed, Sep 5, 2018 at 9:50 AM, Dan Smith <dsm...@pivotal.io> wrote:
> >
> > > +1 for reverting in both places. The current fix is not better, that's
> > why
> > > we are reverting it on the release branch!
> > >
> > > -Dan
> > >
> > > On Wed, Sep 5, 2018 at 9:47 AM, Jacob Barrett <jbarr...@pivotal.io>
> > wrote:
> > >
> > > > I’m not ok with reverting in develop. Revert in 1.7 and modify in
> > > develop.
> > > > We shouldn’t go backwards in develop. The current fix is better than
> > the
> > > > bug it fixes.
> > > >
> > > > > On Sep 5, 2018, at 9:40 AM, Nabarun Nag <n...@apache.org> wrote:
> > > > >
> > > > > If everyone is okay with it, I will revert that change in develop
> and
> > > > then
> > > > > cherry pick it to release/1.7.0 branch.
> > > > > Please do comment.
> > > > >
> > > > > Regards
> > > > > Nabarun Nag
> > > > >
> > > > >
> > > > >> On Wed, Sep 5, 2018 at 9:30 AM Dan Smith <dsm...@pivotal.io>
> wrote:
> > > > >>
> > > > >> +1 to yank it and rework the fix.
> > > > >>
> > > > >> Gester's change helps, but it just means that you will sometimes
> > > > randomly
> > > > >> have a 2 minute delay starting up a gateway receiver. I don't
> think
> > > > that is
> > > > >> a great user experience either.
> > > > >>
> > > > >> -Dan
> > > > >>
> > > > >> On Wed, Sep 5, 2018 at 8:20 AM, Bruce Schuchardt <
> > > > bschucha...@pivotal.io>
> > > > >> wrote:
> > > > >>
> > > > >>> Let's yank it
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>> On 9/4/18 5:04 PM, Sean Goller wrote:
> > > > >>>>
> > > > >>>> If it's to get the release out, I'm fine with reverting. I don't
> > > like
> > > > >> it,
> > > > >>>> but I'm not willing to die on that hill. :)
> > > > >>>>
> > > > >>>> -S.
> > > > >>>>
> > > > >>>> On Tue, Sep 4, 2018 at 4:38 PM Dan Smith <dsm...@pivotal.io>
> > wrote:
> > > > >>>>
> > > > >>>> Spitting this into a separate thread.
> > > > >>>>>
> > > > >>>>> I see the issue. The two minute timeout is the constructor for
> > > > >>>>> AcceptorImpl, where it retries to bind for 2 minutes.
> > > > >>>>>
> > > > >>>>> That behavior makes sense for CacheServer.start.
> > > > >>>>>
> > > > >>>>> But it doesn't make sense for the new logic in
> > > > GatewayReceiver.start()
> > > > >>>>> from
> > > > >>>>> GEODE-5591. That code is trying to use CacheServer.start to
> scan
> > > for
> > > > an
> > > > >>>>> available port, trying each port in a range. That free port
> > finding
> > > > >> logic
> > > > >>>>> really doesn't want to have two minutes of retries for each
> port.
> > > It
> > > > >>>>> seems
> > > > >>>>> like we need to rework the fix for GEODE-5591.
> > > > >>>>>
> > > > >>>>> Does it make sense to hold up the release to rework this fix,
> or
> > > > should
> > > > >>>>> we
> > > > >>>>> just revert it? Have we switched concourse over to using alpine
> > > > linux,
> > > > >>>>> which I think was the original motivation for this fix?
> > > > >>>>>
> > > > >>>>> -Dan
> > > > >>>>>
> > > > >>>>> On Tue, Sep 4, 2018 at 4:25 PM, Dan Smith <dsm...@pivotal.io>
> > > wrote:
> > > > >>>>>
> > > > >>>>> Why is it waiting at all in this case? Where is this 2 minute
> > > timeout
> > > > >>>>>> coming from?
> > > > >>>>>>
> > > > >>>>>> -Dan
> > > > >>>>>>
> > > > >>>>>> On Tue, Sep 4, 2018 at 4:12 PM, Sai Boorlagadda <
> > > > >>>>>>
> > > > >>>>> sai.boorlaga...@gmail.com
> > > > >>>>>
> > > > >>>>>> wrote:
> > > > >>>>>>> So the issue is that it takes longer to start than previous
> > > > releases?
> > > > >>>>>>> Also, is this wait time only when using Gfsh to create
> > > > >>>>>>> gateway-receiver?
> > > > >>>>>>>
> > > > >>>>>>> On Tue, Sep 4, 2018 at 4:03 PM Nabarun Nag <n...@apache.org>
> > > > wrote:
> > > > >>>>>>>
> > > > >>>>>>> Currently we have a minor issue in the release branch as
> > pointed
> > > > out
> > > > >>>>>>>>
> > > > >>>>>>> by
> > > > >>>>>
> > > > >>>>>> Barry O.
> > > > >>>>>>>> We will wait till a resolution is figured out for this
> issue.
> > > > >>>>>>>>
> > > > >>>>>>>> Steps:
> > > > >>>>>>>> 1. create locator
> > > > >>>>>>>> 2. start server --name=server1 --server-port=40404
> > > > >>>>>>>> 3. start server --name=server2 --server-port=40405
> > > > >>>>>>>> 4. create gateway-receiver --member=server1
> > > > >>>>>>>> 5. create gateway-receiver --member=server2 `This gets stuck
> > > for 2
> > > > >>>>>>>>
> > > > >>>>>>> minutes`
> > > > >>>>>>>
> > > > >>>>>>>> Is the 2 minute wait time acceptable? Should we document it?
> > > When
> > > > we
> > > > >>>>>>>>
> > > > >>>>>>> revert
> > > > >>>>>>>
> > > > >>>>>>>> GEODE-5591, this issue does not happen.
> > > > >>>>>>>>
> > > > >>>>>>>> Regards
> > > > >>>>>>>> Nabarun Nag
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
>
diff --git 
a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/GatewayReceiverImpl.java
 
b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/GatewayReceiverImpl.java
index cd2702991..95c363c63 100644
--- 
a/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/GatewayReceiverImpl.java
+++ 
b/geode-wan/src/main/java/org/apache/geode/internal/cache/wan/GatewayReceiverImpl.java
@@ -26,6 +26,7 @@ import org.apache.geode.cache.wan.GatewayReceiver;
 import org.apache.geode.cache.wan.GatewayTransportFilter;
 import org.apache.geode.distributed.internal.InternalDistributedSystem;
 import org.apache.geode.distributed.internal.ResourceEvent;
+import org.apache.geode.internal.AvailablePort;
 import org.apache.geode.internal.cache.CacheServerImpl;
 import org.apache.geode.internal.cache.InternalCache;
 import org.apache.geode.internal.i18n.LocalizedStrings;
@@ -144,7 +145,12 @@ public class GatewayReceiverImpl implements 
GatewayReceiver {
       return;
     }
 
-    for (int port = this.startPort; port <= this.endPort; port++) {
+    int loopStartPort = getPortToStart();
+    for (int port = loopStartPort; port <= endPort; port++) {
+      if (!AvailablePort.isPortAvailable(port, AvailablePort.SOCKET,
+          AvailablePort.getAddress(AvailablePort.SOCKET))) {
+        continue;
+      }
       receiver.setPort(port);
       receiver.setSocketBufferSize(socketBufferSize);
       receiver.setMaximumTimeBetweenPings(timeBetPings);
@@ -159,15 +165,20 @@ public class GatewayReceiverImpl implements 
GatewayReceiver {
         this.port = port;
         break;
       } catch (IOException e) {
-        if (port == this.endPort) {
-          throw new GatewayReceiverException("No available free port found in 
the given range (" +
+        if (port == this.endPort && startPort != endPort) {
+          port = this.startPort;
+          logger.info("loopback to " + this.startPort);
+        } else if (port == loopStartPort - 1 || startPort == endPort) {
+          logger.warn("No available free port found in the given range (" +
               this.startPort +
               "-" +
               this.endPort +
               ")", e);
+          throw new GatewayReceiverException(
+              
LocalizedStrings.SocketCreator_FAILED_TO_CREATE_SERVER_SOCKET_ON_0_1
+                  .toLocalizedString(new Object[] {bindAdd, this.port}));
         }
       }
-
     }
     logger
         
.info(LocalizedMessage.create(LocalizedStrings.GatewayReceiver_STARTED_ON_PORT, 
this.port));
@@ -177,6 +188,18 @@ public class GatewayReceiverImpl implements 
GatewayReceiver {
 
   }
 
+  private int getPortToStart() {
+    // choose a random port from the given port range
+    int rPort;
+    if (this.startPort == this.endPort) {
+      rPort = this.startPort;
+    } else {
+      rPort = AvailablePort.getRandomAvailablePortInRange(this.startPort, 
this.endPort,
+          AvailablePort.SOCKET);
+    }
+    return rPort;
+  }
+
   public void stop() {
     if (!isRunning()) {
       throw new GatewayReceiverException(

Reply via email to