[Gluster-devel] (legacy)GlusterD & SSL - status updates

2016-10-25 Thread Atin Mukherjee
Although legacy GlusterD doesn't get discussed much in the community
meeting, I feel we can and should share what was accomplished last week and
what's planned for next week for the major GlusterFS components, GlusterD
being one of them, especially with the new community meeting model proposed
by Kaushal. You can expect an email from one of us on this topic every
Wednesday.

*What has been done for last week:*

*Patches worked on*:

- nfs.disable  default configuration value in
volume info output
  -- http://review.gluster.org/15568

- Eventing related fixes
  -- http://review.gluster.org/15678 - glusterd: conditionally pass uuid
for EVENT_PEER_CONNECT
  -- http://review.gluster.org/15699 - glusterd: use GF_BRICK_STOPPING as
intermediate brickinfo->status state

- gluster get-state CLI fixes
  -- http://review.gluster.org/15655 - glusterd: set the brickinfo->port
before spawning the bricks
  -- http://review.gluster.org/15662 - cli, glusterd: Address issues in
get-state cli output

- SSL fixes
  -- http://review.gluster.org/14356 - rpc/socket: Close pipe on
disconnection
  -- http://review.gluster.org/15605 - rpc/socket.c : Modify gf_log message
in socket_poller code in case of error
  -- http://review.gluster.org/15677 - rpc/socket.c : Modify socket_poller
code in case of ENODATA error code

*(Major) Reviews*:

- The team is helping the tiering team review the changes required to make
the tier daemon a separate service by splitting it out of the rebalance
process ( http://review.gluster.org/#/c/13365/ ). This change is slated for 3.10.

*What is planned for next week:*

- review/comment resolution
  -- http://review.gluster.org/#/c/15563/ - glusterd, cli: Get global
options through volume get functionality (targeted for 3.10)
  -- http://review.gluster.org/#/c/15463/ - cli, glusterd: New CLI command
to get maximum supported op-version (targeted for 3.10)

- Continue on reviewing tiering changes

~ Atin (atinm)

Re: [Gluster-devel] quota-rename.t core in netbsd

2016-10-25 Thread Sanoj Unnikrishnan
Sure. Will continue working on it then.
Thanks,
Sanoj

On Wed, Oct 26, 2016 at 7:48 AM, Vijay Bellur  wrote:

> On Tue, Oct 25, 2016 at 1:15 AM, Sanoj Unnikrishnan 
> wrote:
> > This is a memory overrun issue.
> >
> > I was not able to fully RCA this issue. In particular i had trouble
> trying
> > to run gluster on netbsd setup.
> >
> > will pick this up later if it is deemed as priority.
> >
> >
>
> We have had elusive memory corruptions in quota and I think this is
> worth chasing down as we have a consistent reproducer.
>
> Emmanuel might be able to help with problems related to NetBSD environment.
>
> Thanks,
> Vijay
>

Re: [Gluster-devel] quota-rename.t core in netbsd

2016-10-25 Thread Vijay Bellur
On Tue, Oct 25, 2016 at 1:15 AM, Sanoj Unnikrishnan  wrote:
> This is a memory overrun issue.
>
> I was not able to fully RCA this issue. In particular i had trouble trying
> to run gluster on netbsd setup.
>
> will pick this up later if it is deemed as priority.
>
>

We have had elusive memory corruptions in quota and I think this is
worth chasing down as we have a consistent reproducer.

Emmanuel might be able to help with problems related to the NetBSD environment.

Thanks,
Vijay


Re: [Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-25 Thread qingwei wei
Hi,

Please see the client log below.

[2016-10-24 10:29:51.111603] I [fuse-bridge.c:5171:fuse_graph_setup]
0-fuse: switched to graph 0
[2016-10-24 10:29:51.111662] I [MSGID: 114035]
[client-handshake.c:193:client_set_lk_version_cbk]
0-testHeal-client-2: Server lk version = 1
[2016-10-24 10:29:51.112371] I [fuse-bridge.c:4083:fuse_init]
0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22
kernel 7.22
[2016-10-24 10:29:51.113563] I [MSGID: 108031]
[afr-common.c:2071:afr_local_discovery_cbk] 0-testHeal-replicate-0:
selecting local read_child testHeal-client-2
[2016-10-24 10:29:51.113604] I [MSGID: 108031]
[afr-common.c:2071:afr_local_discovery_cbk] 0-testHeal-replicate-0:
selecting local read_child testHeal-client-0
[2016-10-24 10:29:51.113630] I [MSGID: 108031]
[afr-common.c:2071:afr_local_discovery_cbk] 0-testHeal-replicate-0:
selecting local read_child testHeal-client-1
[2016-10-24 10:29:54.016802] W [MSGID: 108001]
[afr-transaction.c:789:afr_handle_quorum] 0-testHeal-replicate-0:
/.shard/9061198a-eb7e-45a2-93fb-eb396d1b2727.1: Failing MKNOD as quorum
is not met
[2016-10-24 10:29:54.019330] W [MSGID: 114031]
[client-rpc-fops.c:2981:client3_3_lookup_cbk] 0-testHeal-client-0:
remote operation failed. Path: (null)
(----) [Invalid argument]
[2016-10-24 10:29:54.019343] W [MSGID: 114031]
[client-rpc-fops.c:2981:client3_3_lookup_cbk] 0-testHeal-client-2:
remote operation failed. Path: (null)
(----) [Invalid argument]
[2016-10-24 10:29:54.019373] W [MSGID: 114031]
[client-rpc-fops.c:2981:client3_3_lookup_cbk] 0-testHeal-client-1:
remote operation failed. Path: (null)
(----) [Invalid argument]
[2016-10-24 10:29:54.019854] E [MSGID: 133010]
[shard.c:1582:shard_common_lookup_shards_cbk] 0-testHeal-shard: Lookup
on shard 1 failed. Base file gfid =
9061198a-eb7e-45a2-93fb-eb396d1b2727 [Input/output error]
[2016-10-24 10:29:54.020886] W [fuse-bridge.c:2227:fuse_readv_cbk]
0-glusterfs-fuse: 135: READ => -1
gfid=9061198a-eb7e-45a2-93fb-eb396d1b2727 fd=0x7f70c80d12dc
(Input/output error)
[2016-10-24 10:29:54.118264] W [MSGID: 114031]
[client-rpc-fops.c:2981:client3_3_lookup_cbk] 0-testHeal-client-0:
remote operation failed. Path: (null) (-
---) [Invalid argument]
[2016-10-24 10:29:54.118308] W [MSGID: 114031]
[client-rpc-fops.c:2981:client3_3_lookup_cbk] 0-testHeal-client-2:
remote operation failed. Path: (null)
(----) [Invalid argument]
[2016-10-24 10:29:54.118329] W [MSGID: 114031]
[client-rpc-fops.c:2981:client3_3_lookup_cbk] 0-testHeal-client-1:
remote operation failed. Path: (null)
(----) [Invalid argument]
[2016-10-24 10:29:54.118751] E [MSGID: 133010]
[shard.c:1582:shard_common_lookup_shards_cbk] 0-testHeal-shard: Lookup
on shard 1 failed. Base file gfid =
9061198a-eb7e-45a2-93fb-eb396d1b2727 [Input/output error]
[2016-10-24 10:29:54.118787] W [fuse-bridge.c:2227:fuse_readv_cbk]
0-glusterfs-fuse: 137: READ => -1
gfid=9061198a-eb7e-45a2-93fb-eb396d1b2727 fd=0x7f70c80d12dc
(Input/output error)
[2016-10-24 10:29:54.119330] W [MSGID: 114031]
[client-rpc-fops.c:2981:client3_3_lookup_cbk] 0-testHeal-client-1:
remote operation failed. Path: (null)
(----) [Invalid argument]
[2016-10-24 10:29:54.119338] W [MSGID: 114031]
[client-rpc-fops.c:2981:client3_3_lookup_cbk] 0-testHeal-client-0:
remote operation failed. Path: (null)
(----) [Invalid argument]
[2016-10-24 10:29:54.119368] W [MSGID: 114031]
[client-rpc-fops.c:2981:client3_3_lookup_cbk] 0-testHeal-client-2:
remote operation failed. Path: (null)
(----) [Invalid argument]
[2016-10-24 10:29:54.119674] E [MSGID: 133010]
[shard.c:1582:shard_common_lookup_shards_cbk] 0-testHeal-shard: Lookup
on shard 1 failed. Base file gfid =
9061198a-eb7e-45a2-93fb-eb396d1b2727 [Input/output error]
[2016-10-24 10:29:54.119715] W [fuse-bridge.c:2227:fuse_readv_cbk]
0-glusterfs-fuse: 138: READ => -1
gfid=9061198a-eb7e-45a2-93fb-eb396d1b2727 fd=0x7f70c80d12dc
(Input/output error)
[2016-10-24 10:36:13.140414] W [MSGID: 114031]
[client-rpc-fops.c:2981:client3_3_lookup_cbk] 0-testHeal-client-0:
remote operation failed. Path: (null)
(----) [Invalid argument]
[2016-10-24 10:36:13.140451] W [MSGID: 114031]
[client-rpc-fops.c:2981:client3_3_lookup_cbk] 0-testHeal-client-2:
remote operation failed. Path: (null)
(----) [Invalid argument]
[2016-10-24 10:36:13.140461] W [MSGID: 114031]
[client-rpc-fops.c:2981:client3_3_lookup_cbk] 0-testHeal-client-1:
remote operation failed. Path: (null)
(----) [Invalid argument]
[2016-10-24 10:36:13.140956] E [MSGID: 133010]
[shard.c:1582:shard_common_lookup_shards_cbk] 0-testHeal-shard: Lookup
on shard 1 failed. Base file gfid =
9061198a-eb7e-45a2-93fb-eb396d1b2727 

Re: [Gluster-devel] [Gluster-Maintainers] Gluster Test Thursday - Release 3.9

2016-10-25 Thread Kaleb S. KEITHLEY
On 10/25/2016 12:11 PM, Niels de Vos wrote:
> On Tue, Oct 25, 2016 at 07:51:47AM -0400, Kaleb S. KEITHLEY wrote:
>> On 10/25/2016 06:46 AM, Atin Mukherjee wrote:
>>>
>>>
>>> On Tue, Oct 25, 2016 at 4:12 PM, Aravinda wrote:
>>>
>>> Hi,
>>>
>>> Since Automated test framework for Gluster is in progress, we need
>>> help from Maintainers and developers to test the features and bug
>>> fixes to release Gluster 3.9.
>>>
>>> In last maintainers meeting Shyam shared an idea about having a Test
>>> day to accelerate the testing and release.
>>>
>>> Please participate in testing your component(s) on Oct 27, 2016. We
>>> will prepare the rc2 build by tomorrow and share the details before
>>   ^^^
>>> Test day.
>>>
>>> RC1 Link:
>>> http://www.gluster.org/pipermail/maintainers/2016-September/001442.html
>>> 
>>> 
>>>
>>>
>>> I don't think testing RC1 would be ideal as 3.9 head has moved forward
>>> with significant number of patches. I'd recommend of having an RC2 here.
>>>
>>
>> BTW, please tag RC2 as 3.9.0rc2 (versus 3.9rc2).  It makes building
>> packages for Fedora much easier.
>>
>> I know you were following what was done for 3.8rcX. That was a pain. :-}
> 
> Can you explain what the problem is with 3.9rc2 and 3.9.0? The huge
> advantage is that 3.9.0 is seen as a version update to 3.9rc2. When
> 3.9.0rc2 is used, 3.9.0 is *not* an update for that, and rc2 packages
> will stay installed until 3.9.1 is released...
> 
> You can check this easily with the rpmdev-vercmp command:
> 
>$ rpmdev-vercmp 3.9.0rc2 3.9.0
>3.9.0rc2 > 3.9.0
>$ rpmdev-vercmp 3.9rc2 3.9.0
>3.9rc2 < 3.9.0

Those aren't really very realistic RPM NVRs IMO.

> 
> So, at least for RPM packaging, 3.9rc2 is recommended, and 3.9.0rc2 is
> problematic.

That's not the only thing recommended.

Last I knew, one of several things that are recommended is, e.g.,
3.9.0-0.2rc2; 3.9.0-1 > 3.9.0-0.2rc2.

The RC (and {qa,alpha,beta}) packages (that I've) built for Fedora for
several years have had NVRs in that form.

This scheme was what was suggested to me on the fedora-devel mailing
list several years ago.

When RCs are tagged as 3.9rc1, I have to make non-trivial and
counter-intuitive changes to the .spec file to build packages with NVRs
like 3.9.0-0.XrcY. If they are tagged 3.9.0rc1, the changes are much more
straightforward and simpler.
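
A quick sanity check with rpmdev-vercmp shows why that form works (the
Release values below are only illustrative):

   $ rpmdev-vercmp 3.9.0-0.2rc2 3.9.0-1
   3.9.0-0.2rc2 < 3.9.0-1
   $ rpmdev-vercmp 3.9.0-0.2rc2 3.9.0-0.3rc3
   3.9.0-0.2rc2 < 3.9.0-0.3rc3

i.e. the pre-release builds sort below the final 3.9.0-1 package, so GA is
picked up as an update, and successive RCs still upgrade one another.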

-- 

Kaleb




Re: [Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-25 Thread Krutika Dhananjay
Tried it locally on my setup. Worked fine.

Could you please attach the mount logs?

-Krutika

On Tue, Oct 25, 2016 at 6:55 PM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

> +Krutika
>
> On Mon, Oct 24, 2016 at 4:10 PM, qingwei wei  wrote:
>
>> Hi,
>>
>> I am currently running a simple gluster setup using one server node
>> with multiple disks. I realize that if i delete away all the .shard
>> files in one replica in the backend, my application (dd) will report
>> Input/Output error even though i have 3 replicas.
>>
>> My gluster version is 3.7.16
>>
>> gluster volume file
>>
>> Volume Name: testHeal
>> Type: Replicate
>> Volume ID: 26d16d7f-bc4f-44a6-a18b-eab780d80851
>> Status: Started
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: 192.168.123.4:/mnt/sdb_mssd/testHeal2
>> Brick2: 192.168.123.4:/mnt/sde_mssd/testHeal2
>> Brick3: 192.168.123.4:/mnt/sdd_mssd/testHeal2
>> Options Reconfigured:
>> cluster.self-heal-daemon: on
>> features.shard-block-size: 16MB
>> features.shard: on
>> performance.readdir-ahead: on
>>
>> dd error
>>
>> [root@fujitsu05 .shard]# dd of=/home/test if=/mnt/fuseMount/ddTest
>> bs=16M count=20 oflag=direct
>> dd: error reading ‘/mnt/fuseMount/ddTest’: Input/output error
>> 1+0 records in
>> 1+0 records out
>> 16777216 bytes (17 MB) copied, 0.111038 s, 151 MB/s
>>
>> in the .shard folder where i deleted all the .shard file, i can see
>> one .shard file is recreated
>>
>> getfattr -d -e hex -m.  9061198a-eb7e-45a2-93fb-eb396d1b2727.1
>> # file: 9061198a-eb7e-45a2-93fb-eb396d1b2727.1
>> trusted.afr.testHeal-client-0=0x00010001
>> trusted.afr.testHeal-client-2=0x00010001
>> trusted.gfid=0x41b653f7daa14627b1f91f9e8554ddde
>>
>> However, the gfid is not the same compare to the other replicas
>>
>> getfattr -d -e hex -m.  9061198a-eb7e-45a2-93fb-eb396d1b2727.1
>> # file: 9061198a-eb7e-45a2-93fb-eb396d1b2727.1
>> trusted.afr.dirty=0x
>> trusted.afr.testHeal-client-1=0x
>> trusted.bit-rot.version=0x0300580dde99000e5e5d
>> trusted.gfid=0x9ee5c5eed7964a6cb9ac1a1419de5a40
>>
>> Is this consider a bug?
>>
>> Regards,
>>
>> Cwtan
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>
>
>
>
> --
> Pranith
>

Re: [Gluster-devel] [Gluster-Maintainers] Gluster Test Thursday - Release 3.9

2016-10-25 Thread Niels de Vos
On Tue, Oct 25, 2016 at 07:51:47AM -0400, Kaleb S. KEITHLEY wrote:
> On 10/25/2016 06:46 AM, Atin Mukherjee wrote:
> > 
> > 
> > On Tue, Oct 25, 2016 at 4:12 PM, Aravinda wrote:
> > 
> > Hi,
> > 
> > Since Automated test framework for Gluster is in progress, we need
> > help from Maintainers and developers to test the features and bug
> > fixes to release Gluster 3.9.
> > 
> > In last maintainers meeting Shyam shared an idea about having a Test
> > day to accelerate the testing and release.
> > 
> > Please participate in testing your component(s) on Oct 27, 2016. We
> > will prepare the rc2 build by tomorrow and share the details before
>   ^^^
> > Test day.
> > 
> > RC1 Link:
> > http://www.gluster.org/pipermail/maintainers/2016-September/001442.html
> > 
> > 
> > 
> > 
> > I don't think testing RC1 would be ideal as 3.9 head has moved forward
> > with significant number of patches. I'd recommend of having an RC2 here.
> > 
> 
> BTW, please tag RC2 as 3.9.0rc2 (versus 3.9rc2).  It makes building
> packages for Fedora much easier.
> 
> I know you were following what was done for 3.8rcX. That was a pain. :-}

Can you explain what the problem is with 3.9rc2 and 3.9.0? The huge
advantage is that 3.9.0 is seen as a version update to 3.9rc2. When
3.9.0rc2 is used, 3.9.0 is *not* an update for that, and rc2 packages
will stay installed until 3.9.1 is released...

You can check this easily with the rpmdev-vercmp command:

   $ rpmdev-vercmp 3.9.0rc2 3.9.0
   3.9.0rc2 > 3.9.0
   $ rpmdev-vercmp 3.9rc2 3.9.0
   3.9rc2 < 3.9.0

So, at least for RPM packaging, 3.9rc2 is recommended, and 3.9.0rc2 is
problematic.

Thanks,
Niels


> 
> 3.7 and 3.6 were all 3.X.0betaY or 3.X.0qaY.
> 
> If for some reason 3.9 doesn't get released soon, I'll need to package
> the RC to get 3.9 into Fedora 25 before its GA and having a packaging
> friendly tag will make it that much easier for me to get that done.
> 
> (See the community packaging matrix I sent to the mailing lists and/or
> at
> http://gluster.readthedocs.io/en/latest/Install-Guide/Community_Packages/)
> 
> N.B. This will serve as the email part of the RC tagging discussion
> action item I have.
> 
> Thanks.
> 
> 
> -- 
> 
> Kaleb
> ___
> maintainers mailing list
> maintain...@gluster.org
> http://www.gluster.org/mailman/listinfo/maintainers



[Gluster-devel] Minutes from today's Gluster Community Bug Triage meeting (Oct 25 2016)

2016-10-25 Thread Kaleb S. KEITHLEY

There were no meetings on Oct 11 or Oct 18 due to the small number of
attendees. There will be no meeting next week (Nov 1) due to a holiday in
Bangalore. The next meeting will be on Nov 8th.

Please find the minutes of today's Gluster Community Bug Triage meeting
at the links posted below.

Minutes:
https://meetbot.fedoraproject.org/gluster-meeting/2016-10-25/bug_triage.2016-10-25-12.00.html
Minutes (text):
https://meetbot.fedoraproject.org/gluster-meeting/2016-10-25/bug_triage.2016-10-25-12.00.txt
Log:
https://meetbot.fedoraproject.org/gluster-meeting/2016-10-25/bug_triage.2016-10-25-12.00.log.html


#gluster-meeting: bug triage



Meeting started by kkeithley at 12:00:07 UTC. The full logs are
available at
https://meetbot.fedoraproject.org/gluster-meeting/2016-10-25/bug_triage.2016-10-25-12.00.log.html
.



Meeting summary
---
* roll call  (kkeithley, 12:00:20)

* Action Items  (kkeithley, 12:02:50)
  * Saravanakmr will host  (kkeithley, 12:03:55)
  * Saravanakmr will host on 2016/11/8 ?  (kkeithley, 12:06:10)
  * bug triage on 2016/11/1 is cancelled due to holiday in Bangalore
(kkeithley, 12:08:43)
  * LINK: https://public.pad.fsfe.org/p/gluster-bugs-to-triage
(kkeithley, 12:11:18)

Meeting ended at 12:29:47 UTC.




Action Items






Action Items, by person
---
* **UNASSIGNED**
  * (none)




People Present (lines said)
---
* kkeithley (45)
* Saravanakmr (11)
* jiffin (8)
* hgowtham (6)
* zodbot (3)
* ashiq (1)
* rafi (1)




Generated by `MeetBot`_ 0.1.4

.. _`MeetBot`: http://wiki.debian.org/MeetBot



-- 

Kaleb


Re: [Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-25 Thread Pranith Kumar Karampuri
+Krutika

On Mon, Oct 24, 2016 at 4:10 PM, qingwei wei  wrote:

> Hi,
>
> I am currently running a simple gluster setup using one server node
> with multiple disks. I realize that if i delete away all the .shard
> files in one replica in the backend, my application (dd) will report
> Input/Output error even though i have 3 replicas.
>
> My gluster version is 3.7.16
>
> gluster volume file
>
> Volume Name: testHeal
> Type: Replicate
> Volume ID: 26d16d7f-bc4f-44a6-a18b-eab780d80851
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: 192.168.123.4:/mnt/sdb_mssd/testHeal2
> Brick2: 192.168.123.4:/mnt/sde_mssd/testHeal2
> Brick3: 192.168.123.4:/mnt/sdd_mssd/testHeal2
> Options Reconfigured:
> cluster.self-heal-daemon: on
> features.shard-block-size: 16MB
> features.shard: on
> performance.readdir-ahead: on
>
> dd error
>
> [root@fujitsu05 .shard]# dd of=/home/test if=/mnt/fuseMount/ddTest
> bs=16M count=20 oflag=direct
> dd: error reading ‘/mnt/fuseMount/ddTest’: Input/output error
> 1+0 records in
> 1+0 records out
> 16777216 bytes (17 MB) copied, 0.111038 s, 151 MB/s
>
> in the .shard folder where i deleted all the .shard file, i can see
> one .shard file is recreated
>
> getfattr -d -e hex -m.  9061198a-eb7e-45a2-93fb-eb396d1b2727.1
> # file: 9061198a-eb7e-45a2-93fb-eb396d1b2727.1
> trusted.afr.testHeal-client-0=0x00010001
> trusted.afr.testHeal-client-2=0x00010001
> trusted.gfid=0x41b653f7daa14627b1f91f9e8554ddde
>
> However, the gfid is not the same compare to the other replicas
>
> getfattr -d -e hex -m.  9061198a-eb7e-45a2-93fb-eb396d1b2727.1
> # file: 9061198a-eb7e-45a2-93fb-eb396d1b2727.1
> trusted.afr.dirty=0x
> trusted.afr.testHeal-client-1=0x
> trusted.bit-rot.version=0x0300580dde99000e5e5d
> trusted.gfid=0x9ee5c5eed7964a6cb9ac1a1419de5a40
>
> Is this consider a bug?
>
> Regards,
>
> Cwtan
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel




-- 
Pranith

Re: [Gluster-devel] [Gluster-Maintainers] Gluster Test Thursday - Release 3.9

2016-10-25 Thread Kaleb S. KEITHLEY
On 10/25/2016 06:46 AM, Atin Mukherjee wrote:
> 
> 
> On Tue, Oct 25, 2016 at 4:12 PM, Aravinda wrote:
> 
> Hi,
> 
> Since Automated test framework for Gluster is in progress, we need
> help from Maintainers and developers to test the features and bug
> fixes to release Gluster 3.9.
> 
> In last maintainers meeting Shyam shared an idea about having a Test
> day to accelerate the testing and release.
> 
> Please participate in testing your component(s) on Oct 27, 2016. We
> will prepare the rc2 build by tomorrow and share the details before
  ^^^
> Test day.
> 
> RC1 Link:
> http://www.gluster.org/pipermail/maintainers/2016-September/001442.html
> 
> 
> 
> I don't think testing RC1 would be ideal as 3.9 head has moved forward
> with significant number of patches. I'd recommend of having an RC2 here.
> 

BTW, please tag RC2 as 3.9.0rc2 (versus 3.9rc2).  It makes building
packages for Fedora much easier.

I know you were following what was done for 3.8rcX. That was a pain. :-}

3.7 and 3.6 were all 3.X.0betaY or 3.X.0qaY.

If for some reason 3.9 doesn't get released soon, I'll need to package
the RC to get 3.9 into Fedora 25 before its GA, and having a
packaging-friendly tag will make it that much easier for me to get that done.

(See the community packaging matrix I sent to the mailing lists and/or
at
http://gluster.readthedocs.io/en/latest/Install-Guide/Community_Packages/)

N.B. This will serve as the email part of the RC tagging discussion
action item I have.

Thanks.


-- 

Kaleb


Re: [Gluster-devel] [Gluster-Maintainers] Gluster Test Thursday - Release 3.9

2016-10-25 Thread Atin Mukherjee
On Tue, Oct 25, 2016 at 4:12 PM, Aravinda  wrote:

> Hi,
>
> Since Automated test framework for Gluster is in progress, we need help
> from Maintainers and developers to test the features and bug fixes to
> release Gluster 3.9.
>
> In last maintainers meeting Shyam shared an idea about having a Test day
> to accelerate the testing and release.
>
> Please participate in testing your component(s) on Oct 27, 2016. We will
> prepare the rc2 build by tomorrow and share the details before Test day.
>
> RC1 Link: http://www.gluster.org/pipermail/maintainers/2016-September/
> 001442.html


I don't think testing RC1 would be ideal as the 3.9 head has moved forward
with a significant number of patches. I'd recommend having an RC2 here.


> Release Checklist: https://public.pad.fsfe.org/p/
> gluster-component-release-checklist
>
>
> Thanks and Regards
> Aravinda and Pranith
>
> ___
> maintainers mailing list
> maintain...@gluster.org
> http://www.gluster.org/mailman/listinfo/maintainers
>



-- 

~ Atin (atinm)

[Gluster-devel] Gluster Test Thursday - Release 3.9

2016-10-25 Thread Aravinda

Hi,

Since the automated test framework for Gluster is still in progress, we need
help from maintainers and developers to test the features and bug fixes for
the Gluster 3.9 release.


In the last maintainers meeting, Shyam shared an idea about having a Test day
to accelerate the testing and release.


Please participate in testing your component(s) on Oct 27, 2016. We will
prepare the rc2 build by tomorrow and share the details before the Test day.


RC1 Link: 
http://www.gluster.org/pipermail/maintainers/2016-September/001442.html
Release Checklist: 
https://public.pad.fsfe.org/p/gluster-component-release-checklist



Thanks and Regards
Aravinda and Pranith



Re: [Gluster-devel] Need inputs for solution for renames + entry self-heal data loss in afr

2016-10-25 Thread Pranith Kumar Karampuri
https://bugzilla.redhat.com/show_bug.cgi?id=1366818 is the bug I am
referring to in the mail above. (Thanks, sankarshan, for pointing out that I
missed the link :-) )

On Tue, Oct 25, 2016 at 3:14 PM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

> One of the Red hat QE engineers (Nag Pavan) found a day 1 bug in entry
> self-heal where the file with good data can be replaced with file with bad
> data when renames + self-heal is involved in a particular way.
>
> Sample steps (From the bz):
> 1) have a plain replica volume with 2 bricks. start the volume and mount
> it.
> 2) mkdir dir && mkdir newdir && touch file1
> 3) bring first brick down
> 4) echo abc > dir/file1
> 5) bring the first brick back up and quickly bring the second brick down
> before self-heal can be triggered.
> 6) do mv dir/file1 newdir/file2 <<--- note that this is empty file.
>
> Now bring the second brick back up. If entry self-heal of 'dir' happens
> first then it deletes the file1 with content 'abc' now when 'newdir' heal
> happens it leads to creation of empty file and the data in the file is lost.
>
> Same can be achieved using 'link' + 'unlink' as well.
>
> The main reason for this problem is that afr entry-self-heal at the moment
> doesn't care completely about link-counts before deleting the final link of
> an inode, so it always does unlink and recreates the file and does data
> heals. In this corner case unlink happens on the good copy of the file and
> we either lose data or get stale data based on what is the data present on
> the sink file.
>
> Solution we are proposing is the following:
>
> 1) Posix will maintain a hidden directory '.glusterfs/anoninode'(We can
> call it lost+found as well) directory which will be used by afr/ec for
> keeping the 'inodes' until their names are resolved.
> 2) Both afr and ec when they need to heal a directory and a 'name' has to
> be deleted but on the other bricks if the inode is present, it renames this
> file as  'anoninode/' instead of doing unlink/rmdir on it.
> 3) For files:
>  a) Both afr, ec already has logic to do 'link' instead of new
> file creation if a gfid already exists in the brick. So when a name is
> resolved it does exactly what it does now.
>  b) Self-heal daemon will periodically crawl the first level of
> 'anoninode' directory to make sure it deletes the 'inodes' represented as
> files with gfid-string as names whenever the link count is > 1. It will
> also delete the files if the gfid cease to exist on the other bricks.
> 5) For directories:
>  a) both afr and ec need to perform 'rename' of the
> 'anoninode/dir-gfid' to the name it will be resolved to as part of entry
> self-heal, instead of 'mkdir'.
>  b) If self-heal daemon crawl detects that a directory is deleted
> on the other bricks, then it has to scan the files inside the deleted
> directory and move them into 'anoninode' if the gfid of the file/directory
> exists on the other bricks. Otherwise they can be safely deleted.
>
> Please let us know if you see any issues with this approach.
>
> --
> Pranith
>



-- 
Pranith

[Gluster-devel] Need inputs for solution for renames + entry self-heal data loss in afr

2016-10-25 Thread Pranith Kumar Karampuri
One of the Red Hat QE engineers (Nag Pavan) found a day-1 bug in entry
self-heal where a file with good data can be replaced by a file with bad
data when renames and self-heal interleave in a particular way.

Sample steps (from the bz):
1) Have a plain replica volume with 2 bricks. Start the volume and mount it.
2) mkdir dir && mkdir newdir && touch file1
3) Bring the first brick down.
4) echo abc > dir/file1
5) Bring the first brick back up and quickly bring the second brick down
before self-heal can be triggered.
6) Do mv dir/file1 newdir/file2 <<--- note that this is an empty file.

Now bring the second brick back up. If entry self-heal of 'dir' happens
first, it deletes file1 (the copy with content 'abc'); when the 'newdir'
heal then happens, it creates an empty file2 and the data in the file is lost.

The same can be achieved using 'link' + 'unlink' as well.
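
For anyone who wants to try this, a rough reproducer along the lines of the
steps above (volume name, brick paths and mount point are made up; "bringing
a brick down" here simply means killing that brick's glusterfsd process,
e.g. using the PID shown by 'gluster volume status'; I create file1 inside
dir up front so the first brick has an empty copy of it to rename):

  gluster volume create rep2 replica 2 host1:/bricks/b1 host1:/bricks/b2 force
  gluster volume start rep2
  mkdir -p /mnt/rep2 && mount -t glusterfs host1:/rep2 /mnt/rep2
  cd /mnt/rep2 && mkdir dir newdir && touch dir/file1
  # kill the glusterfsd serving /bricks/b1 (first brick down)
  echo abc > dir/file1
  # 'gluster volume start rep2 force' restarts brick1; immediately kill the
  # glusterfsd serving /bricks/b2, before self-heal gets triggered
  mv dir/file1 newdir/file2
  # bring brick2 back the same way; if 'dir' is entry-healed before 'newdir',
  # newdir/file2 ends up empty and the 'abc' data is gone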

The main reason for this problem is that afr entry self-heal at the moment
does not fully take link counts into account before deleting the final link
of an inode; it always unlinks, recreates the file and then does the data
heal. In this corner case the unlink happens on the good copy of the file,
and we either lose data or get stale data depending on what data is present
in the sink file.

The solution we are proposing is the following:

1) Posix will maintain a hidden directory '.glusterfs/anoninode' (we can
call it lost+found as well) which will be used by afr/ec for keeping the
'inodes' until their names are resolved.
2) When afr or ec needs to heal a directory and a 'name' has to be deleted,
but the inode is still present on the other bricks, it renames the file
into 'anoninode/' instead of doing an unlink/rmdir on it.
3) For files:
 a) Both afr and ec already have logic to do a 'link' instead of a new file
creation if the gfid already exists on the brick. So when a name is resolved
it does exactly what it does now.
 b) The self-heal daemon will periodically crawl the first level of the
'anoninode' directory to make sure it deletes the 'inodes', represented as
files with gfid-strings as names, whenever the link count is > 1. It will
also delete the files if the gfid ceases to exist on the other bricks.
5) For directories:
 a) Both afr and ec need to perform a 'rename' of 'anoninode/dir-gfid' to
the name it is resolved to as part of entry self-heal, instead of a 'mkdir'.
 b) If the self-heal daemon crawl detects that a directory has been deleted
on the other bricks, then it has to scan the files inside the deleted
directory and move them into 'anoninode' if the gfid of the file/directory
exists on the other bricks. Otherwise they can be safely deleted.
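
To make the intent of (1) and (2) concrete for a file, here is a rough
brick-side sketch of what the heal path would do (paths and helper steps
below are hypothetical and only illustrate the proposed layout; the real
change would of course live inside the posix/afr/ec code, not in a shell
script):

  BRICK=/bricks/b1          # hypothetical brick path
  F=$BRICK/dir/file1        # name that entry self-heal wants to remove
  # read trusted.gfid and turn it into the canonical gfid string
  hex=$(getfattr -n trusted.gfid -e hex "$F" \
        | awk -F= '/trusted.gfid/ {print substr($2,3)}')
  gfid=$(echo "$hex" \
        | sed 's/\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)/\1-\2-\3-\4-\5/')
  mkdir -p "$BRICK/.glusterfs/anoninode"
  # instead of unlinking the last name, park the inode under anoninode/<gfid>
  mv "$F" "$BRICK/.glusterfs/anoninode/$gfid"

When the name later gets resolved, the existing 'link if gfid exists' logic
in 3a kicks in as today, and the crawl in 3b eventually cleans up the parked
entry (or deletes it if the gfid is gone from the other bricks).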

Please let us know if you see any issues with this approach.

-- 
Pranith

Re: [Gluster-devel] New style community meetings - No more status updates

2016-10-25 Thread Atin Mukherjee
It's a +1 from me; however, this model will only be successful if we
diligently provide component updates.

On Tuesday 25 October 2016, Kaushal M  wrote:

> On Fri, Oct 21, 2016 at 11:46 AM, Kaushal M wrote:
> > On Thu, Oct 20, 2016 at 8:09 PM, Amye Scavarda wrote:
> >>
> >>
> >> On Thu, Oct 20, 2016 at 7:06 AM, Kaushal M wrote:
> >>>
> >>> Hi All,
> >>>
> >>> Our weekly community meetings have become mainly one hour of status
> >>> updates. This just drains the life out of the meeting, and doesn't
> >>> encourage new attendees to speak up.
> >>>
> >>> Let's try and change this. For the next meeting lets try skipping
> >>> updates all together and instead just dive into the 'Open floor' part
> >>> of the meeting.
> >>>
> >>> Let's have the updates to the regular topics be provided by the
> >>> regular owners before the meeting. This could either be through
> >>> sending out emails to the mailing lists, or updates entered into the
> >>> meeting etherpad[1]. As the host, I'll make sure to link to these
> >>> updates when the meeting begins, and in the meeting minutes. People
> >>> can view these updates later in their own time. People who need to
> >>> provide updates on AIs, just update the etherpad[1]. It will be
> >>> visible from there.
> >>>
> >>> Now let's move why I addressed this mail to this large and specific
> >>> set of people. The people who have been directly addressed are the
> >>> owners of the regular topics. You all are expected, before the next
> >>> meeting, to either,
> >>>  - Send out an update on the status for the topic you are responsible
> >>> for to the mailing lists, and then link to it on the the etherpad
> >>>  - or, provide you updates directly in the etherpad.
> >>> Please make sure you do this without fail.
> >>> If you do have anything to discuss, add it to the "Open floor" section.
> >>> Also, if I've missed out anyone in the addressed list, please make
> >>> sure they get this message too.
> >>>
> >>> Anyone else who wants to share their updates, add it to the 'Other
> >>> updates' section.
> >>>
> >>> Everyone else, go ahead and add anything you want to ask to the "Open
> >>> floor" section. Ensure to have your name with the topic you add
> >>> (etherpad colours are not reliable), and attend the meeting next week.
> >>> When your topic comes up, you'll have the floor.
> >>>
> >>> I hope that this new format helps make our meetings more colourful and
> >>> lively.
> >>>
> >>> As always, our community meetings will be held every Wednesday at
> >>> 1200UTC in #gluster-meeting on Freenode.
> >>> See you all there.
> >>>
> >>> ~kaushal
> >>>
> >>> [1]: https://public.pad.fsfe.org/p/gluster-community-meetings
> >>
> >>
> >> I really like this idea and am all in favor of color + liveliness.
> >> Let's give this new format three weeks or so, and we'll review around
> >> November 9th to see if we like this experiment.
> >> Fair?
> >> -- amye
> >
> > Sounds good to me.
> >
>
> Okay. We have one more day to the meeting, but I've yet to see any
> updates from all of you.
> Please ensure that you do this before the meeting tomorrow.
>
> >>
> >> --
> >> Amye Scavarda | a...@redhat.com  | Gluster Community Lead
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org 
> http://www.gluster.org/mailman/listinfo/gluster-devel
>


-- 
--Atin

Re: [Gluster-devel] New style community meetings - No more status updates

2016-10-25 Thread Kaushal M
On Fri, Oct 21, 2016 at 11:46 AM, Kaushal M  wrote:
> On Thu, Oct 20, 2016 at 8:09 PM, Amye Scavarda  wrote:
>>
>>
>> On Thu, Oct 20, 2016 at 7:06 AM, Kaushal M  wrote:
>>>
>>> Hi All,
>>>
>>> Our weekly community meetings have become mainly one hour of status
>>> updates. This just drains the life out of the meeting, and doesn't
>>> encourage new attendees to speak up.
>>>
>>> Let's try and change this. For the next meeting lets try skipping
>>> updates all together and instead just dive into the 'Open floor' part
>>> of the meeting.
>>>
>>> Let's have the updates to the regular topics be provided by the
>>> regular owners before the meeting. This could either be through
>>> sending out emails to the mailing lists, or updates entered into the
>>> meeting etherpad[1]. As the host, I'll make sure to link to these
>>> updates when the meeting begins, and in the meeting minutes. People
>>> can view these updates later in their own time. People who need to
>>> provide updates on AIs, just update the etherpad[1]. It will be
>>> visible from there.
>>>
>>> Now let's move why I addressed this mail to this large and specific
>>> set of people. The people who have been directly addressed are the
>>> owners of the regular topics. You all are expected, before the next
>>> meeting, to either,
>>>  - Send out an update on the status for the topic you are responsible
>>> for to the mailing lists, and then link to it on the the etherpad
>>>  - or, provide you updates directly in the etherpad.
>>> Please make sure you do this without fail.
>>> If you do have anything to discuss, add it to the "Open floor" section.
>>> Also, if I've missed out anyone in the addressed list, please make
>>> sure they get this message too.
>>>
>>> Anyone else who wants to share their updates, add it to the 'Other
>>> updates' section.
>>>
>>> Everyone else, go ahead and add anything you want to ask to the "Open
>>> floor" section. Ensure to have your name with the topic you add
>>> (etherpad colours are not reliable), and attend the meeting next week.
>>> When your topic comes up, you'll have the floor.
>>>
>>> I hope that this new format helps make our meetings more colourful and
>>> lively.
>>>
>>> As always, our community meetings will be held every Wednesday at
>>> 1200UTC in #gluster-meeting on Freenode.
>>> See you all there.
>>>
>>> ~kaushal
>>>
>>> [1]: https://public.pad.fsfe.org/p/gluster-community-meetings
>>
>>
>> I really like this idea and am all in favor of color + liveliness.
>> Let's give this new format three weeks or so, and we'll review around
>> November 9th to see if we like this experiment.
>> Fair?
>> -- amye
>
> Sounds good to me.
>

Okay. We have one more day to the meeting, but I've yet to see any
updates from all of you.
Please ensure that you do this before the meeting tomorrow.

>>
>> --
>> Amye Scavarda | a...@redhat.com | Gluster Community Lead


[Gluster-devel] Bugs with incorrect status

2016-10-25 Thread Niels de Vos
1349723 (mainline) MODIFIED: Added libraries to get server_brick dictionaries
  [master] I904612 distaf: adding libraries to get server_brick dictionaries 
(MERGED)
  ** akhak...@redhat.com: Bug 1349723 should be ON_QA, use v3.9rc1 for 
verification of the fix **

1342298 (mainline) MODIFIED: reading file with size less than 512 fails with 
odirect read
  [master] I097418 features/shard: Don't modify readv size (MERGED)
  ** b...@gluster.org: Bug 1342298 should be ON_QA, use v3.9rc1 for 
verification of the fix **

1362397 (mainline) MODIFIED: Mem leak in meta_default_readv in meta xlators
  [master] Ieb4132 meta: fix memory leak in meta xlators (MERGED)
  ** b...@gluster.org: Bug 1362397 should be ON_QA, use v3.9rc1 for 
verification of the fix **

1279747 (mainline) MODIFIED: spec: add CFLAGS=-DUSE_INSECURE_OPENSSL to 
configure command-line for RHEL-5 only
  ** mchan...@redhat.com: No change posted, but bug 1279747 is in MODIFIED **

1339181 (mainline) MODIFIED: Full heal of a sub-directory does not clean up 
name-indices when granular-entry-heal is enabled.
  [master] Ief71cc cluster/afr: Attempt name-index purge even on full-heal of 
directory (MERGED)
  ** kdhan...@redhat.com: Bug 1339181 should be ON_QA, use v3.9rc1 for 
verification of the fix **

1377288 (3.9) MODIFIED: The GlusterFS Callback RPC-calls always use RPC/XID 42
  [release-3.9] I2116be rpc: increase RPC/XID with each callback (MERGED)
  ** nde...@redhat.com: Bug 1377288 should be ON_QA, use v3.9rc1 for 
verification of the fix **

1332073 (mainline) MODIFIED: EINVAL errors while aggregating the directory size 
by quotad
  [master] Iaa quotad: fix potential buffer overflows (NEW)
  [master] If8a267 quotad: fix potential buffer overflows (NEW)
  [master] If8a267 quotad: fix potential buffer overflows (MERGED)
  ** rgowd...@redhat.com: Bug 1332073 should be in POST, change If8a267 under 
review **

1336612 (mainline) MODIFIED: one of vm goes to paused state when network goes 
down and comes up back
  [master] Ife1ce4 cluster/afr: Fix warning about unused variable (MERGED)
  [master] I5c50b6 cluster/afr: Refresh inode for inode-write fops in need 
(MERGED)
  [master] I571d0c cluster/afr: Refresh inode for inode-write fops in need 
(ABANDONED)
  [master] If6479e cluster/afr: Refresh inode for inode-write fops in need 
(ABANDONED)
  [master] Iabd91c cluster/afr: If possible give errno received from lower 
xlators (MERGED)
  ** pkara...@redhat.com: Bug 1336612 should be ON_QA, use v3.9rc1 for 
verification of the fix **

1343286 (mainline) MODIFIED: enabling glusternfs with nfs.rpc-auth-allow to 
many hosts failed
  [master] Ibbabad nfs: build exportlist with multiple groupnodes (MERGED)
  [master] I9d04ea xdr/nfs: free complete groupnode structure (MERGED)
  ** bku...@redhat.com: Bug 1343286 should be ON_QA, use v3.9rc1 for 
verification of the fix **

1358936 (mainline) MODIFIED: coverity: iobuf_get_page_aligned calling 
iobuf_get2 should check the return pointer
  [master] I3aa5b0 core: coverity, NULL potinter check (MERGED)
  ** johnzzpcrys...@gmail.com: Bug 1358936 should be ON_QA, use v3.9rc1 for 
verification of the fix **

1349284 (mainline) MODIFIED: [tiering]: Files of size greater than that of high 
watermark level should not be promoted
  [master] Ice0457 cluster/tier: dont promote if estimated block consumption > 
hi watermark (MERGED)
  ** mchan...@redhat.com: Bug 1349284 should be ON_QA, use v3.9rc1 for 
verification of the fix **

1158654 (mainline) POST: [FEAT] Journal Based Replication (JBR - formerly NSR)
  [master] Ia1c5aa NSR: Volgen Support (MERGED)
  ** jda...@redhat.com: Bug 1158654 should be CLOSED, v3.8.5 contains a fix **

1202717 (mainline) MODIFIED: quota: re-factor quota cli and glusterd changes 
and remove code duplication
  ** rgowd...@redhat.com: No change posted, but bug 1202717 is in MODIFIED **

1153964 (mainline) MODIFIED: quota: rename of "dir" fails in case of quota 
space availability is around 1GB
  [master] Iaad907 quota: No need for quota-limit check if rename is under same 
parent (ABANDONED)
  [master] I2c8140 quota: For a link operation, do quota_check_limit only till 
the common ancestor of src and dst file (MERGED)
  [master] Ia1e536 quota: For a rename operation, do quota_check_limit only 
till the common ancestor of src and dst file (MERGED)
  ** rgowd...@redhat.com: Bug 1153964 should be CLOSED, v3.8.5 contains a fix **

1340488 (mainline) MODIFIED: copy-export-ganesha.sh does not have a correct 
shebang
  [master] I22061a ganesha: fix the shebang for the copy-export script (MERGED)
  ** nde...@redhat.com: Bug 1340488 should be ON_QA, use v3.9rc1 for 
verification of the fix **

1371874 (mainline) POST: [RFE] DHT Events
  [master] Ib44517 dht/events:  Added rebalance events (MERGED)
  [master] I4e07cb events/dht: dht cli events (MERGED)
  [master] Ic571df events/dht: dht cli events (ABANDONED)
  ** nbala...@redhat.com: Bug 1371874 should be MODIFIED, change I4e07cb has 
been merged **