Re: [Gluster-devel] reflink support for glusterfs and gluster-block using it for taking snapshots

2017-11-06 Thread Prasanna Kalever
On Tue, Nov 7, 2017 at 7:43 AM, Pranith Kumar Karampuri
 wrote:
> hi,
>  I just created a github issue for reflink support(#349) in glusterfs.
> We are intending to use this feature to do block snapshots in gluster-block.
>
> Please let us know your comments on the github issue. I have added the
> changes we may need for xlators I know a little bit about. Please help in
> identifying gaps in implementing this FOP.

Pranith,

You might be interested in taking a look at an approach taken earlier
to support snapshots using the xfs reflink feature.

Patch with working code:  https://review.gluster.org/#/c/13979/
Spec: 
https://github.com/gluster/glusterfs-specs/blob/master/under_review/reflink-based-fsnap.md

Note: this work is more than one and a half years old (April 2016), so
expect a rebase :-)
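For anyone new to reflinks: such a snapshot boils down to an extent-sharing clone of the backing file. A minimal sketch with hypothetical file names (`--reflink=auto` degrades to a plain byte copy on filesystems without reflink support, e.g. XFS not formatted with `-m reflink=1`, so this runs anywhere):

```shell
# Create a dummy block backing file, then "snapshot" it as a reflink clone.
echo "block data" > block0.img
# --reflink=always would fail on non-reflink filesystems; auto falls back to a copy.
cp --reflink=auto block0.img block0.snap1.img
# The clone is an independent file; later writes to either side do not affect the other.
cmp -s block0.img block0.snap1.img && echo "snapshot matches"
```

On a reflink-capable filesystem the clone completes in O(1) regardless of file size, which is what makes it attractive for block snapshots.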

Thanks,
--
Prasanna

>
> --
> Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] reflink support for glusterfs and gluster-block using it for taking snapshots

2017-11-06 Thread Pranith Kumar Karampuri
hi,
 I just created a github issue for reflink support
(#349) in glusterfs. We
are intending to use this feature to do block snapshots in gluster-block.

Please let us know your comments on the github issue. I have added the
changes we may need for xlators I know a little bit about. Please help in
identifying gaps in implementing this FOP.

-- 
Pranith

Re: [Gluster-devel] [Gluster-users] Request for Comments: Upgrades from 3.x to 4.0+

2017-11-06 Thread Alastair Neil
Ahh OK I see, thanks


On 6 November 2017 at 00:54, Kaushal M  wrote:

> On Fri, Nov 3, 2017 at 8:50 PM, Alastair Neil 
> wrote:
> > Just so I am clear the upgrade process will be as follows:
> >
> > upgrade all clients to 4.0
> >
> > rolling upgrade all servers to 4.0 (with GD1)
> >
> > kill all GD1 daemons on all servers and run upgrade script (new clients
> > unable to connect at this point)
> >
> > start GD2 (necessary, or does the upgrade script do this?)
> >
> >
> > I assume that once the cluster had been migrated to GD2 the glusterd
> startup
> > script will be smart enough to start the correct version?
> >
>
> This should be the process, mostly.
>
> The upgrade script needs GD2 to be running on all nodes before it can
> begin migration.
> But they don't need to have a cluster formed; the script should take
> care of forming the cluster.
>
>
> > -Thanks
> >
> >
> >
> >
> >
> > On 3 November 2017 at 04:06, Kaushal M  wrote:
> >>
> >> On Thu, Nov 2, 2017 at 7:53 PM, Darrell Budic 
> >> wrote:
> >> > Will the various client packages (centos in my case) be able to
> >> > automatically handle the upgrade vs new install decision, or will we
> be
> >> > required to do something manually to determine that?
> >>
> >> We should be able to do this with CentOS (and other RPM based distros)
> >> which have well split glusterfs packages currently.
> >> At this moment, I don't know exactly how much can be handled
> >> automatically, but I expect the amount of manual intervention to be
> >> minimal.
> >> The minimum manual work needed would be enabling and starting GD2 and
> >> starting the migration script.
> >>
> >> >
> >> > It’s a little unclear that things will continue without interruption
> >> > because
> >> > of the way you describe the change from GD1 to GD2, since it sounds
> like
> >> > it
> >> > stops GD1.
> >>
> >> With the described upgrade strategy, we can ensure continuous volume
> >> access to clients during the whole process (provided volumes have been
> >> setup with replication or ec).
> >>
> >> During the migration from GD1 to GD2, any existing clients still
> >> retain access, and can continue to work without interruption.
> >> This is possible because gluster keeps the management  (glusterds) and
> >> data (bricks and clients) parts separate.
> >> So it is possible to interrupt the management parts, without
> >> interrupting data access to existing clients.
> >> Clients and the server side brick processes need GlusterD to start up.
> >> But once they're running, they can run without GlusterD. GlusterD is
> >> only required again if something goes wrong.
> >> Stopping GD1 during the migration process will not lead to any
> >> interruptions for existing clients.
> >> The brick processes continue to run, and any connected clients continue
> >> to remain connected to the bricks.
> >> Any new clients which try to mount the volumes during this migration
> >> will fail, as a GlusterD will not be available (either GD1 or GD2).
> >>
> >> > Early days, obviously, but if you could clarify if that’s what
> >> > we’re used to as a rolling upgrade or how it works, that would be
> >> > appreciated.
> >>
> >> A Gluster rolling upgrade process allows data access to volumes
> >> during the process, while upgrading the brick processes as well.
> >> Rolling upgrades with uninterrupted access requires that volumes have
> >> redundancy (replicate or ec).
> >> Rolling upgrades involve upgrading servers belonging to a redundancy
> >> set (replica set or ec set), one at a time.
> >> One at a time,
> >> - A server is picked from a redundancy set
> >> - All Gluster processes are killed on the server, glusterd, bricks and
> >> other daemons included.
> >> - Gluster is upgraded and restarted on the server
> >> - A heal is performed to heal new data onto the bricks.
> >> - Move onto next server after heal finishes.
> >>
> >> Clients maintain uninterrupted access, because a full redundancy set
> >> is never taken offline all at once.
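To make the sequencing above concrete, here is a stubbed sketch of that per-server loop. The two functions are placeholders, not the real CLI invocations: `upgrade_server` stands in for killing gluster processes, upgrading packages, and restarting glusterd, and `heal_pending` stands in for parsing the pending-entry count out of `gluster volume heal <vol> info`.

```shell
# Stub: stop all gluster processes on the server, upgrade, restart glusterd.
upgrade_server() { echo "upgraded $1"; }
# Stub: number of entries still needing heal on the volume (0 = heal finished).
heal_pending() { echo 0; }

# One server of a redundancy set at a time, so a full set is never offline.
for server in server1 server2 server3; do
    upgrade_server "$server"
    # Only move to the next server once self-heal has caught up on this one.
    while [ "$(heal_pending)" -ne 0 ]; do sleep 10; done
done
```

The essential invariant is the inner wait: advancing before heal completes would risk taking down the only remaining good copy in the set.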
> >>
> >> > Also clarification that we’ll be able to upgrade from 3.x
> >> > (3.1x?) to 4.0, manually or automatically?
> >>
> >> Rolling upgrades from 3.1x to 4.0 are a manual process, but I believe
> >> gdeploy has playbooks to automate it.
> >> At the end of this you will be left with a 4.0 cluster, but still be
> >> running GD1.
> >> Upgrading from GD1 to GD2, in 4.0 will be a manual process. A script
> >> that automates this is planned only for 4.1.
> >>
> >> >
> >> >
> >> > 
> >> > From: Kaushal M 
> >> > Subject: [Gluster-users] Request for Comments: Upgrades from 3.x to
> 4.0+
> >> > Date: November 2, 2017 at 3:56:05 AM CDT
> >> > To: gluster-us...@gluster.org; Gluster Devel
> >> >
> >> > We're fast approaching the time for Gluster-4.0. And we would like to
> >> > set out the expected upgrade strategy and try to polish it to be as

Re: [Gluster-devel] Regression failure: ./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t

2017-11-06 Thread Atin Mukherjee
On Mon, 6 Nov 2017 at 18:26, Nithya Balachandran 
wrote:

> On 6 November 2017 at 18:02, Atin Mukherjee  wrote:
>
>> Snippet from where the test failed (the one which failed is marked in
>> bold):
>>
>> # Start the volume
>> TEST $CLI_1 volume start $V0
>>
>> EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H1 $B1/${V0}1
>> EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H2 $B2/${V0}2
>> EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H3 $B3/${V0}3
>>
>> # Bring down 2nd and 3rd glusterd
>> TEST kill_glusterd 2
>> TEST kill_glusterd 3
>>
>> # Server quorum is not met. Brick on 1st node must be down
>> *EXPECT_WITHIN $PROCESS_DOWN_TIMEOUT "0" brick_up_status_1 $V0 $H1 $B1/${V0}1 *
>>
>> *08:04:05* not ok 13 Got "" instead of "0", LINENUM:33
>> *08:04:05* FAILED COMMAND: 0 brick_up_status_1 patchy 127.1.1.1 /d/backends/1/patchy1
>>
>>
>> Nothing abnormal from the logs. The test failed as we expect the number
>> of bricks up to be 0 due to quorum loss, but it returned "" from the
>> command "$CLI_1 volume status $vol $host:$brick --xml | sed -ne
>> 's/.*<status>\([01]\)<\/status>/\1/p'". The only way for this command to
>> return a non-integer is some parsing error? As of now it's a mystery
>> to me; still looking into it.
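The empty string is reproducible with canned input: the helper's sed only prints when an XML `<status>` line is present, and when glusterd is unreachable the CLI emits an error message (or nothing) instead of XML, so nothing matches. A sketch with fabricated input, not real CLI output:

```shell
# Pretty-printed XML puts <status> on its own line, so the pattern matches once.
printf '<node>\n<status>1</status>\n</node>\n' \
  | sed -ne 's/.*<status>\([01]\)<\/status>/\1/p'
# An error string contains no <status> tag; sed prints nothing, hence "".
printf 'Connection failed. Please check if gluster daemon is operational.\n' \
  | sed -ne 's/.*<status>\([01]\)<\/status>/\1/p'
```

So "" does not indicate a parsing error inside valid XML; it indicates the command produced no XML at all.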
>>
>
> The funny thing is that it takes a very long time to send the sigterm to
> the brick (I'm assuming it is the local brick). It also looks like the test
> does not check that glusterd is down before it checks the brick status.
>

Yes, we should check for a peer_count before checking for the brick status.
But I’m not 100% sure if that’s the only issue as in that case I should
have seen an integer number greater than 0 instead of blank.
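A hedged sketch of what that ordering fix could look like in the .t file, assuming the test framework's `peer_count` helper (fragment only, untested):

```shell
TEST kill_glusterd 2
TEST kill_glusterd 3

# Wait until node 1 actually observes both peers as gone...
EXPECT_WITHIN $PROBE_TIMEOUT 0 peer_count

# ...before asserting that quorum loss has taken the local brick down.
EXPECT_WITHIN $PROCESS_DOWN_TIMEOUT "0" brick_up_status_1 $V0 $H1 $B1/${V0}1
```

This only removes one race; as noted above, it would not by itself explain a blank result rather than a stale "1".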


> [2017-11-06 08:03:21.403670]:++
> G_LOG:./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t:
> TEST: 30 kill_glusterd 3 ++
> *[2017-11-06 08:03:21.415249]*:++
> G_LOG:./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t:
> TEST: 33 0 brick_up_status_1 patchy 127.1.1.1 /d/backends/1/patchy1
> ++
>
> ...
>
> [*2017-11-06 08:03:44.972007]* I [MSGID: 106542]
> [glusterd-utils.c:8063:glusterd_brick_signal] 0-glusterd: sending signal 15
> to brick with pid 30706
>
> This is nearly 25 seconds later and PROCESS_DOWN_TIMEOUT is set to 5.
>
>
> Regards,
> Nithya
>
>
> On Mon, Nov 6, 2017 at 3:06 PM, Nithya Balachandran 
>> wrote:
>>
>>> Hi,
>>>
>>> Can someone take a look at :
>>> https://build.gluster.org/job/centos6-regression/7231/
>>> ?
>>>
>>>
>>> From the logs:
>>>
>>> [2017-11-06 08:03:21.200177]:++
>>> G_LOG:./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t:
>>> TEST: 26 1 brick_up_status_1 patchy 127.1.1.3 /d/backends/3/patchy3
>>> ++
>>> [2017-11-06 08:03:21.392027]:++
>>> G_LOG:./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t:
>>> TEST: 29 kill_glusterd 2 ++
>>> [2017-11-06 08:03:21.400647] W [socket.c:593:__socket_rwv] 0-management:
>>> readv on 127.1.1.2:24007 failed (No data available)
>>> The message "I [MSGID: 106499]
>>> [glusterd-handler.c:4303:__glusterd_handle_status_volume] 0-management:
>>> Received status volume req for volume patchy" repeated 2 times between
>>> [2017-11-06 08:03:20.983906] and [2017-11-06 08:03:21.373432]
>>> [2017-11-06 08:03:21.400698] I [MSGID: 106004]
>>> [glusterd-handler.c:6284:__glusterd_peer_rpc_notify] 0-management: Peer
>>> <127.1.1.2> (<4a9ec683-6d08-47f3-960f-1ed53be2e230>), in state <Peer in Cluster>, has disconnected from glusterd.
>>> [2017-11-06 08:03:21.400811] W
>>> [glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
>>> (-->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x1e9b9)
>>> [0x7fafe000d9b9]
>>> -->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3231e)
>>> [0x7fafe002131e]
>>> -->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xfdf37)
>>> [0x7fafe00ecf37] ) 0-management: Lock for vol patchy not held
>>> [2017-11-06 08:03:21.400827] W [MSGID: 106118]
>>> [glusterd-handler.c:6309:__glusterd_peer_rpc_notify] 0-management: Lock not
>>> released for patchy
>>> [2017-11-06 08:03:21.400851] C [MSGID: 106003]
>>> [glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action]
>>> 0-management: Server quorum regained for volume patchy. Starting local
>>> bricks.
>>> [2017-11-06 08:03:21.403670]:++
>>> G_LOG:./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t:
>>> TEST: 30 kill_glusterd 3 ++
>>> [2017-11-06 08:03:21.415249]:++
>>> G_LOG:./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t:
>>> TEST: 33 0 brick_up_status_1 patchy 127.1.1.1 /d/backends/1/patchy1
>>> ++
>>> [2017-11-06 08:03:31.158076] E [socket.c:2369:socket_connect_finish]
>>> 0-management: connection to 127.1.1.2:24007 failed (Connection
>>> refused); 

Re: [Gluster-devel] Regression failure : /tests/basic/ec/ec-1468261.t

2017-11-06 Thread Ashish Pandey

I don't think it is an issue with the test you mentioned. 
You may have to re-trigger the test. 
This is what I did for one of my patches. 

-- 
Ashish 
- Original Message -

From: "Nithya Balachandran"  
To: "Gluster Devel" , "Xavi Hernandez" 
, "Ashish Pandey"  
Sent: Monday, November 6, 2017 6:35:24 PM 
Subject: Regression failure : /tests/basic/ec/ec-1468261.t 

Can someone take a look at this? 
The run was aborted ( 
https://build.gluster.org/job/centos6-regression/7232/console ) 

Thanks, 
Nithya 


[Gluster-devel] Coverity covscan for 2017-11-06-8aace739 (master branch)

2017-11-06 Thread staticanalysis
GlusterFS Coverity covscan results are available from
http://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2017-11-06-8aace739


Re: [Gluster-devel] Regression failure: ./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t

2017-11-06 Thread Nithya Balachandran
On 6 November 2017 at 18:02, Atin Mukherjee  wrote:

> Snippet from where the test failed (the one which failed is marked in
> bold):
>
> # Start the volume
>
> TEST $CLI_1 volume start $V0
>
>
>
> EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H1
> $B1/${V0}1
> EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H2
> $B2/${V0}2
> EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H3
> $B3/${V0}3
>
>
> # Bring down 2nd and 3rd glusterd
>
> TEST kill_glusterd 2
>
> TEST kill_glusterd 3
>
>
>
> # Server quorum is not met. Brick on 1st node must be
> down
>
>
> *EXPECT_WITHIN $PROCESS_DOWN_TIMEOUT "0" brick_up_status_1 $V0 $H1
> $B1/${V0}1 *
>
> *08:04:05* not ok 13 Got "" instead of "0", LINENUM:33
> *08:04:05* FAILED COMMAND: 0 brick_up_status_1 patchy 127.1.1.1 /d/backends/1/patchy1
>
>
> Nothing abnormal from the logs. The test failed as we expect the number of
> bricks up to be 0 due to quorum loss, but it returned "" from the command
> "$CLI_1 volume status $vol $host:$brick --xml | sed -ne
> 's/.*<status>\([01]\)<\/status>/\1/p'". The only way for this command to
> return a non-integer is some parsing error? As of now it's a mystery
> to me, still looking into it.
>

The funny thing is that it takes a very long time to send the sigterm to
the brick (I'm assuming it is the local brick). It also looks like the test
does not check that glusterd is down before it checks the brick status.

[2017-11-06 08:03:21.403670]:++ G_LOG:./tests/bugs/glusterd/
bug-1345727-bricks-stop-on-no-quorum-validation.t: TEST: 30 kill_glusterd 3
++
*[2017-11-06 08:03:21.415249]*:++ G_LOG:./tests/bugs/glusterd/
bug-1345727-bricks-stop-on-no-quorum-validation.t: TEST: 33 0
brick_up_status_1 patchy 127.1.1.1 /d/backends/1/patchy1 ++

...

[*2017-11-06 08:03:44.972007]* I [MSGID: 106542]
[glusterd-utils.c:8063:glusterd_brick_signal] 0-glusterd: sending signal 15
to brick with pid 30706

This is nearly 25 seconds later and PROCESS_DOWN_TIMEOUT is set to 5.


Regards,
Nithya


On Mon, Nov 6, 2017 at 3:06 PM, Nithya Balachandran 
> wrote:
>
>> Hi,
>>
>> Can someone take a look at : https://build.gluster.org/job/centos6-regression/7231/
>> ?
>>
>>
>> From the logs:
>>
>> [2017-11-06 08:03:21.200177]:++ G_LOG:./tests/bugs/glusterd/bu
>> g-1345727-bricks-stop-on-no-quorum-validation.t: TEST: 26 1
>> brick_up_status_1 patchy 127.1.1.3 /d/backends/3/patchy3 ++
>> [2017-11-06 08:03:21.392027]:++ G_LOG:./tests/bugs/glusterd/bu
>> g-1345727-bricks-stop-on-no-quorum-validation.t: TEST: 29 kill_glusterd
>> 2 ++
>> [2017-11-06 08:03:21.400647] W [socket.c:593:__socket_rwv] 0-management:
>> readv on 127.1.1.2:24007 failed (No data available)
>> The message "I [MSGID: 106499] 
>> [glusterd-handler.c:4303:__glusterd_handle_status_volume]
>> 0-management: Received status volume req for volume patchy" repeated 2
>> times between [2017-11-06 08:03:20.983906] and [2017-11-06 08:03:21.373432]
>> [2017-11-06 08:03:21.400698] I [MSGID: 106004]
>> [glusterd-handler.c:6284:__glusterd_peer_rpc_notify] 0-management: Peer
>> <127.1.1.2> (<4a9ec683-6d08-47f3-960f-1ed53be2e230>), in state <Peer in Cluster>, has disconnected from glusterd.
>> [2017-11-06 08:03:21.400811] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
>> (-->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x1e9b9)
>> [0x7fafe000d9b9] -->/build/install/lib/glusterf
>> s/3.12.2/xlator/mgmt/glusterd.so(+0x3231e) [0x7fafe002131e]
>> -->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xfdf37)
>> [0x7fafe00ecf37] ) 0-management: Lock for vol patchy not held
>> [2017-11-06 08:03:21.400827] W [MSGID: 106118]
>> [glusterd-handler.c:6309:__glusterd_peer_rpc_notify] 0-management: Lock
>> not released for patchy
>> [2017-11-06 08:03:21.400851] C [MSGID: 106003]
>> [glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action]
>> 0-management: Server quorum regained for volume patchy. Starting local
>> bricks.
>> [2017-11-06 08:03:21.403670]:++ G_LOG:./tests/bugs/glusterd/bu
>> g-1345727-bricks-stop-on-no-quorum-validation.t: TEST: 30 kill_glusterd
>> 3 ++
>> [2017-11-06 08:03:21.415249]:++ G_LOG:./tests/bugs/glusterd/bu
>> g-1345727-bricks-stop-on-no-quorum-validation.t: TEST: 33 0
>> brick_up_status_1 patchy 127.1.1.1 /d/backends/1/patchy1 ++
>> [2017-11-06 08:03:31.158076] E [socket.c:2369:socket_connect_finish]
>> 0-management: connection to 127.1.1.2:24007 failed (Connection refused);
>> disconnecting socket
>> [2017-11-06 08:03:31.159513] I [MSGID: 106499]
>> [glusterd-handler.c:4303:__glusterd_handle_status_volume] 0-management:
>> Received status volume req for volume patchy
>> [2017-11-06 08:03:33.151685] W [socket.c:593:__socket_rwv] 0-management:
>> readv on 127.1.1.3:24007 failed (Connection reset by peer)
>> [2017-11-06 08:03:33.151735] I [MSGID: 106004]
>> [glusterd-handler.c:6284:__glusterd_peer_rpc_notify] 

[Gluster-devel] Gluster Summit BOF - Rebalance

2017-11-06 Thread Nithya Balachandran
Hi,

We had a BOF on Rebalance  at the Gluster Summit to get feedback from
Gluster users.

- Performance has improved over the last few releases and it works well for
large files.
- However, it is still not fast enough on volumes which contain a lot of
directories and small files. The bottleneck appears to be the
single-threaded filesystem crawl.
- Scripts that use the fix-layout and file-migration virtual xattrs
(available via the mount point) to rebalance volumes would be helpful
- Rebalance is currently broken on volumes with ZFS bricks (and other FS
where fallocate is not available). A fix for this is being worked on [1]
and should be ready soon.
- The rebalance status output is satisfactory
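For reference, the virtual-xattr mechanism mentioned above is driven from a client mount roughly like this (attribute names as I recall them from the DHT code; treat them as assumptions and verify against your version before scripting around them):

```shell
# Recalculate the layout of a directory (fix-layout) from the mount point.
setfattr -n distribute.fix.layout -v "yes" /mnt/gv0/some/dir
# Ask DHT to migrate a file whose hash no longer matches its cached subvolume.
setfattr -n trusted.distribute.migrate-data -v "force" /mnt/gv0/some/dir/file
```

A user-side crawl issuing these setfattrs could parallelize the walk itself, which is the bottleneck called out above.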

Amar, Susant, Raghavendra, please add if I have missed something.

Regards,
Nithya

[1] https://review.gluster.org/18573

[Gluster-devel] Regression failure : /tests/basic/ec/ec-1468261.t

2017-11-06 Thread Nithya Balachandran
Can someone take a look at this?
The run was aborted (
https://build.gluster.org/job/centos6-regression/7232/console)

Thanks,
Nithya

Re: [Gluster-devel] Regression failure: ./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t

2017-11-06 Thread Atin Mukherjee
Snippet from where the test failed (the one which failed is marked in bold):

# Start the volume
TEST $CLI_1 volume start $V0

EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H1 $B1/${V0}1
EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H2 $B2/${V0}2
EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H3 $B3/${V0}3

# Bring down 2nd and 3rd glusterd
TEST kill_glusterd 2
TEST kill_glusterd 3

# Server quorum is not met. Brick on 1st node must be down
*EXPECT_WITHIN $PROCESS_DOWN_TIMEOUT "0" brick_up_status_1 $V0 $H1 $B1/${V0}1 *

*08:04:05* not ok 13 Got "" instead of "0", LINENUM:33
*08:04:05* FAILED COMMAND: 0 brick_up_status_1 patchy 127.1.1.1 /d/backends/1/patchy1


Nothing abnormal from the logs. The test failed as we expect the number of
bricks up to be 0 due to quorum loss, but it returned "" from the command
"$CLI_1 volume status $vol $host:$brick --xml | sed -ne
's/.*<status>\([01]\)<\/status>/\1/p'". The only way for this command to
return a non-integer is some parsing error? As of now it's a mystery
to me, still looking into it.

On Mon, Nov 6, 2017 at 3:06 PM, Nithya Balachandran 
wrote:

> Hi,
>
> Can someone take a look at : https://build.gluster.org/job/centos6-regression/7231/
> ?
>
>
> From the logs:
>
> [2017-11-06 08:03:21.200177]:++ G_LOG:./tests/bugs/glusterd/
> bug-1345727-bricks-stop-on-no-quorum-validation.t: TEST: 26 1
> brick_up_status_1 patchy 127.1.1.3 /d/backends/3/patchy3 ++
> [2017-11-06 08:03:21.392027]:++ G_LOG:./tests/bugs/glusterd/
> bug-1345727-bricks-stop-on-no-quorum-validation.t: TEST: 29 kill_glusterd
> 2 ++
> [2017-11-06 08:03:21.400647] W [socket.c:593:__socket_rwv] 0-management:
> readv on 127.1.1.2:24007 failed (No data available)
> The message "I [MSGID: 106499] 
> [glusterd-handler.c:4303:__glusterd_handle_status_volume]
> 0-management: Received status volume req for volume patchy" repeated 2
> times between [2017-11-06 08:03:20.983906] and [2017-11-06 08:03:21.373432]
> [2017-11-06 08:03:21.400698] I [MSGID: 106004] 
> [glusterd-handler.c:6284:__glusterd_peer_rpc_notify]
> 0-management: Peer <127.1.1.2> (<4a9ec683-6d08-47f3-960f-1ed53be2e230>),
> in state <Peer in Cluster>, has disconnected from glusterd.
> [2017-11-06 08:03:21.400811] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
> (-->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x1e9b9)
> [0x7fafe000d9b9] 
> -->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3231e)
> [0x7fafe002131e] 
> -->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xfdf37)
> [0x7fafe00ecf37] ) 0-management: Lock for vol patchy not held
> [2017-11-06 08:03:21.400827] W [MSGID: 106118] 
> [glusterd-handler.c:6309:__glusterd_peer_rpc_notify]
> 0-management: Lock not released for patchy
> [2017-11-06 08:03:21.400851] C [MSGID: 106003]
> [glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action]
> 0-management: Server quorum regained for volume patchy. Starting local
> bricks.
> [2017-11-06 08:03:21.403670]:++ G_LOG:./tests/bugs/glusterd/
> bug-1345727-bricks-stop-on-no-quorum-validation.t: TEST: 30 kill_glusterd
> 3 ++
> [2017-11-06 08:03:21.415249]:++ G_LOG:./tests/bugs/glusterd/
> bug-1345727-bricks-stop-on-no-quorum-validation.t: TEST: 33 0
> brick_up_status_1 patchy 127.1.1.1 /d/backends/1/patchy1 ++
> [2017-11-06 08:03:31.158076] E [socket.c:2369:socket_connect_finish]
> 0-management: connection to 127.1.1.2:24007 failed (Connection refused);
> disconnecting socket
> [2017-11-06 08:03:31.159513] I [MSGID: 106499] 
> [glusterd-handler.c:4303:__glusterd_handle_status_volume]
> 0-management: Received status volume req for volume patchy
> [2017-11-06 08:03:33.151685] W [socket.c:593:__socket_rwv] 0-management:
> readv on 127.1.1.3:24007 failed (Connection reset by peer)
> [2017-11-06 08:03:33.151735] I [MSGID: 106004] 
> [glusterd-handler.c:6284:__glusterd_peer_rpc_notify]
> 0-management: Peer <127.1.1.3> (),
> in state <Peer in Cluster>, has disconnected from glusterd.
> [2017-11-06 08:03:33.151828] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock]
> (-->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x1e9b9)
> [0x7fafe000d9b9] 
> -->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3231e)
> [0x7fafe002131e] 
> -->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xfdfe4)
> [0x7fafe00ecfe4] ) 0-management: Lock owner mismatch. Lock for vol patchy
> held by faa07524-55ba-46af-8359-0c6c87df5e86
> [2017-11-06 08:03:33.151850] W [MSGID: 106118] 
> [glusterd-handler.c:6309:__glusterd_peer_rpc_notify]
> 0-management: Lock not released for patchy
> [2017-11-06 08:03:33.151873] C [MSGID: 106002]
> [glusterd-server-quorum.c:357:glusterd_do_volume_quorum_action]
> 0-management: Server quorum lost for volume patchy. Stopping local bricks.
> [2017-11-06 08:03:44.972007] I [MSGID: 106542] 
> [glusterd-utils.c:8063:glusterd_brick_signal]
> 0-glusterd: sending signal 15 to 

Re: [Gluster-devel] Coverity fixes

2017-11-06 Thread Amar Tumballi
One of the things I noticed is that if we make
https://scan.coverity.com/projects/gluster-glusterfs the source of truth
for coverity issues, then the issue IDs will be constant and we can
reference them.

Also note that we should most probably focus on 'High Impact' issues
first, rather than the medium/low impact issues.

Regards,
Amar

On Sat, Nov 4, 2017 at 7:38 AM, Vijay Bellur  wrote:

>
>
> On Fri, Nov 3, 2017 at 9:25 AM, Atin Mukherjee 
> wrote:
>
>>
>> On Fri, 3 Nov 2017 at 18:31, Kaleb S. KEITHLEY 
>> wrote:
>>
>>> On 11/02/2017 10:19 AM, Atin Mukherjee wrote:
>>> > While I appreciate the folks to contribute lot of coverity fixes over
>>> > last few days, I have an observation for some of the patches the
>>> > coverity issue id(s) are *not* mentioned which gets maintainers in a
>>> > difficult situation to understand the exact complaint coming out of the
>>> > coverity. From my past experience in fixing coverity defects, sometimes
>>> > the fixes might look simple but they are not.
>>> >
>>> > May I request all the developers to include the defect id in the commit
>>> > message for all the coverity fixes?
>>> >
>>>
>>> How does that work? AFAIK the defect IDs are constantly changing as some
>>> get fixed and new ones get added.
>>
>>
>> We’d need at least (a) the defect id with a pointer to the coverity link,
>> which most of the devs are now following I guess, but with the caveat that
>> the link goes stale in 7 days and the review needs to be done by then, or
>> (b) the commit message should contain the exact coverity description, which
>> is neater.
>>
>> (I was not aware that the defect ids are not constant; I only learned this
>> from Nigel today.)
>>
>>>
>>>
>
> +1 to providing a clean description of the issue rather than using a
> temporary defect ID.
>
> -Vijay
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Amar Tumballi (amarts)

[Gluster-devel] Regression failure: ./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t

2017-11-06 Thread Nithya Balachandran
Hi,

Can someone take a look at :
https://build.gluster.org/job/centos6-regression/7231/
?


From the logs:

[2017-11-06 08:03:21.200177]:++
G_LOG:./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t:
TEST: 26 1 brick_up_status_1 patchy 127.1.1.3 /d/backends/3/patchy3
++
[2017-11-06 08:03:21.392027]:++
G_LOG:./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t:
TEST: 29 kill_glusterd 2 ++
[2017-11-06 08:03:21.400647] W [socket.c:593:__socket_rwv] 0-management:
readv on 127.1.1.2:24007 failed (No data available)
The message "I [MSGID: 106499]
[glusterd-handler.c:4303:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume patchy" repeated 2 times between
[2017-11-06 08:03:20.983906] and [2017-11-06 08:03:21.373432]
[2017-11-06 08:03:21.400698] I [MSGID: 106004]
[glusterd-handler.c:6284:__glusterd_peer_rpc_notify] 0-management: Peer
<127.1.1.2> (<4a9ec683-6d08-47f3-960f-1ed53be2e230>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-11-06 08:03:21.400811] W
[glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
(-->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x1e9b9)
[0x7fafe000d9b9]
-->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3231e)
[0x7fafe002131e]
-->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xfdf37)
[0x7fafe00ecf37] ) 0-management: Lock for vol patchy not held
[2017-11-06 08:03:21.400827] W [MSGID: 106118]
[glusterd-handler.c:6309:__glusterd_peer_rpc_notify] 0-management: Lock not
released for patchy
[2017-11-06 08:03:21.400851] C [MSGID: 106003]
[glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action]
0-management: Server quorum regained for volume patchy. Starting local
bricks.
[2017-11-06 08:03:21.403670]:++
G_LOG:./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t:
TEST: 30 kill_glusterd 3 ++
[2017-11-06 08:03:21.415249]:++
G_LOG:./tests/bugs/glusterd/bug-1345727-bricks-stop-on-no-quorum-validation.t:
TEST: 33 0 brick_up_status_1 patchy 127.1.1.1 /d/backends/1/patchy1
++
[2017-11-06 08:03:31.158076] E [socket.c:2369:socket_connect_finish]
0-management: connection to 127.1.1.2:24007 failed (Connection refused);
disconnecting socket
[2017-11-06 08:03:31.159513] I [MSGID: 106499]
[glusterd-handler.c:4303:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume patchy
[2017-11-06 08:03:33.151685] W [socket.c:593:__socket_rwv] 0-management:
readv on 127.1.1.3:24007 failed (Connection reset by peer)
[2017-11-06 08:03:33.151735] I [MSGID: 106004]
[glusterd-handler.c:6284:__glusterd_peer_rpc_notify] 0-management: Peer
<127.1.1.3> (), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-11-06 08:03:33.151828] W
[glusterd-locks.c:686:glusterd_mgmt_v3_unlock]
(-->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x1e9b9)
[0x7fafe000d9b9]
-->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3231e)
[0x7fafe002131e]
-->/build/install/lib/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xfdfe4)
[0x7fafe00ecfe4] ) 0-management: Lock owner mismatch. Lock for vol patchy
held by faa07524-55ba-46af-8359-0c6c87df5e86
[2017-11-06 08:03:33.151850] W [MSGID: 106118]
[glusterd-handler.c:6309:__glusterd_peer_rpc_notify] 0-management: Lock not
released for patchy
[2017-11-06 08:03:33.151873] C [MSGID: 106002]
[glusterd-server-quorum.c:357:glusterd_do_volume_quorum_action]
0-management: Server quorum lost for volume patchy. Stopping local bricks.
[2017-11-06 08:03:44.972007] I [MSGID: 106542]
[glusterd-utils.c:8063:glusterd_brick_signal] 0-glusterd: sending signal 15
to brick with pid 30706

Thanks,
Nithya