The crux of the problem is that, as of today, brick processes on restart
try to reuse the port they were previously using, assuming that no other
process has taken it and without consulting pmap_registry_alloc() first.
With a recent change, pmap_registry_alloc() reassigns older ports that
were used but are now free. Hence snapd now gets a port that was
previously used by a brick and tries to bind to it, while the older brick
process, without consulting the pmap table, blindly reuses that same
port, and hence we see this failure.
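To make the collision concrete, here is a minimal standalone sketch
(plain sockets, not GlusterFS code; the port number 49152 is arbitrary):
the first bind() stands in for the restarted brick grabbing its old port,
the second for snapd being handed the same port by the portmapper, and
the second one fails with EADDRINUSE exactly as in the snapd log quoted
below.

/* Illustrative only: two listeners competing for one TCP port. */
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int bind_port(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        fprintf(stderr, "bind to %u failed: %s\n",
                (unsigned)port, strerror(errno));
        close(fd);
        return -1;
    }
    listen(fd, 16);
    return fd;
}

int main(void)
{
    int brick = bind_port(49152); /* restarted brick reusing its old port */
    int snapd = bind_port(49152); /* snapd handed the same port: EADDRINUSE */

    if (brick >= 0)
        close(brick);
    return snapd < 0 ? 1 : 0;
}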
Now, coming to the fix: I feel the brick process should not try to
reclaim its older port and should simply take a new port every time it
comes up. We will not run out of ports with this change because pmap now
allocates old ports again, so the port previously used by the brick
process will eventually be reused. If anyone sees any concern with this
approach, please feel free to raise it now.
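As a rough illustration of why we should not run out of ports, here is a
toy allocator (the names, range, and layout are made up; this is not the
actual pmap_registry code): freed ports go back into the pool, so a brick
taking a fresh port on every restart just cycles through the range
instead of exhausting it.

#include <stdbool.h>
#include <stdio.h>

#define PMAP_BASE  49152   /* illustrative range, not the real pmap range */
#define PMAP_COUNT 1024

static bool in_use[PMAP_COUNT];

/* Hand out the lowest free port in the range. */
static int toy_pmap_alloc(void)
{
    for (int i = 0; i < PMAP_COUNT; i++) {
        if (!in_use[i]) {
            in_use[i] = true;
            return PMAP_BASE + i;
        }
    }
    return -1; /* range exhausted */
}

/* Return a port to the pool so it can be reassigned later. */
static void toy_pmap_free(int port)
{
    int i = port - PMAP_BASE;
    if (i >= 0 && i < PMAP_COUNT)
        in_use[i] = false;
}

int main(void)
{
    int p1 = toy_pmap_alloc();   /* brick's first run */
    toy_pmap_free(p1);           /* brick goes down; its port is released */
    int p2 = toy_pmap_alloc();   /* a later daemon (e.g. snapd) may get p1 back */
    int p3 = toy_pmap_alloc();   /* the restarted brick takes a genuinely new port */
    printf("p1=%d p2=%d p3=%d\n", p1, p2, p3);
    return 0;
}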
While awaiting feedback from you guys, I have sent this patch
(http://review.gluster.org/15001), which moves the said test case to bad
tests for now. Once we collectively reach a conclusion on the fix, we
will remove it from bad tests.
Regards,
Avra
On 07/25/2016 02:33 PM, Avra Sengupta wrote:
The failure suggests that the port snapd is trying to bind to is
already in use. But snapd has been modified to use a new port
every time. I am looking into this.
On 07/25/2016 02:23 PM, Nithya Balachandran wrote:
More failures:
https://build.gluster.org/job/rackspace-regression-2GB-triggered/22452/console
I see these messages in the snapd.log:
[2016-07-22 05:31:52.482282] I
[rpcsvc.c:2199:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service:
Configured rpc.outstanding-rpc-limit with value 64
[2016-07-22 05:31:52.482352] W [MSGID: 101002]
[options.c:954:xl_opt_validate] 0-patchy-server: option 'listen-port'
is deprecated, preferred is 'transport.socket.listen-port',
continuing with correction
[2016-07-22 05:31:52.482436] E [socket.c:771:__socket_server_bind]
0-tcp.patchy-server: binding to failed: Address already in use
[2016-07-22 05:31:52.482447] E [socket.c:774:__socket_server_bind]
0-tcp.patchy-server: Port is already in use
[2016-07-22 05:31:52.482459] W [rpcsvc.c:1630:rpcsvc_create_listener]
0-rpc-service: listening on transport failed
[2016-07-22 05:31:52.482469] W [MSGID: 115045] [server.c:1061:init]
0-patchy-server: creation of listener failed
[2016-07-22 05:31:52.482481] E [MSGID: 101019]
[xlator.c:433:xlator_init] 0-patchy-server: Initialization of volume
'patchy-server' failed, review your volfile again
[2016-07-22 05:31:52.482491] E [MSGID: 101066]
[graph.c:324:glusterfs_graph_init] 0-patchy-server: initializing
translator failed
[2016-07-22 05:31:52.482499] E [MSGID: 101176]
[graph.c:670:glusterfs_graph_activate] 0-graph: init failed
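Reading the log bottom-up, everything after the bind error is just that
one failure propagating upward. A simplified sketch of the cascade (not
the actual translator code; the function names here are stand-ins for
__socket_server_bind, the server translator's init, and graph
activation) looks like:

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for __socket_server_bind(): the port is already taken. */
static int create_listener(void)
{
    errno = EADDRINUSE;
    fprintf(stderr, "binding failed: %s\n", strerror(errno));
    return -1;
}

/* Stand-in for the server translator's init(): no listener, no server. */
static int server_init(void)
{
    if (create_listener() < 0) {
        fprintf(stderr, "creation of listener failed\n");
        return -1;
    }
    return 0;
}

/* Stand-in for xlator_init()/glusterfs_graph_activate(): the whole
 * graph aborts, which is why snapd never comes up on that port. */
int main(void)
{
    if (server_init() < 0) {
        fprintf(stderr, "init failed\n");
        return 1;
    }
    return 0;
}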
On Mon, Jul 25, 2016 at 12:00 PM, Ashish Pandey <[email protected]> wrote:
Hi,
Following test has failed 3 times in last two days -
./tests/bugs/snapshot/bug-1316437.t
https://build.gluster.org/job/rackspace-regression-2GB-triggered/22445/consoleFull
https://build.gluster.org/job/rackspace-regression-2GB-triggered/22470/consoleFull
Please take a look at it and check whether it is a spurious failure or not.
Ashish
_______________________________________________
Gluster-devel mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-devel