Re: [Gluster-users] [Gluster-devel] Release 3.10.2: Scheduled for the 30th of April

2017-05-15 Thread Shyam

Thanks Talur!

Further, Talur will be leading the release management work for 3.10.3 as 
well, and we intend to release it on time, this time :)


Shyam

On 05/14/2017 04:32 PM, Raghavendra Talur wrote:

Glusterfs 3.10.2 has been tagged.

Packages for the various distributions will be available in a few days,
and with that a more formal release announcement will be made.

- Tagged code: https://github.com/gluster/glusterfs/tree/v3.10.2
- Release notes:
https://github.com/gluster/glusterfs/blob/release-3.10/doc/release-notes/3.10.2.md

Thanks,
Raghavendra Talur

NOTE: The tracker bug for 3.10.2 will be closed in a couple of days, the
tracker for 3.10.3 will be opened, and an announcement for 3.10.3 will
be sent with the details.



On Wed, May 3, 2017 at 3:46 PM, Raghavendra Talur  wrote:

I had previously announced that we would be releasing 3.10.2 today.
This is to update that the 3.10.2 release is now delayed. We are waiting
for a bug [1] to be fixed.
If you are waiting for the 3.10.2 release for a particular bug fix, please
let us know.

I will follow up with the expected release date by tomorrow.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1447608

Thanks,
Raghavendra Talur

___
Gluster-devel mailing list
gluster-de...@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Release 3.11: RC0 has been tagged

2017-05-15 Thread Shyam

Hi,

Packages are available for testing from the following locations,

3.11.0rc0 packages are available in Fedora Rawhide (f27)

Packages for fedora-26, fedora-25, epel-7, and epel-6 are available now 
from [5]


Packages for Debian Stretch/9 and Jessie are at [5]

Reminder: 3.11.0rcX packages are (still) signed with the 3.10 signing key.

There will be a new signing key for the 3.11.0 GA and all following 
3.11.X packages.


We request testing feedback from the community, which can help us catch any 
major issues before the release.


Tracker BZ is at [2] and release notes are at [3]

Thanks,
Shyam

[5] Packages at download.gluster.org: 
https://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.11.0rc0/


On 05/08/2017 02:40 PM, Shyam wrote:

Hi,

Pending features for 3.11 have been merged (and those that did not make
it have been moved out of the 3.11 release window), leading to the creation
of the 3.11 RC0 tag in the gluster repositories.

Packagers have been notified via mail, and packages for the different
distributions will be made available soon.

We would like, at this point of the release, to encourage users and the
development community to *test 3.11* and provide feedback on the lists,
or raise bugs [1].

If any bug you raise is a blocker for the release, please add it to the
release tracker as well [2].

The scratch version of the release notes can be found here [3], and we
request all developers who added features to 3.11 to send in their
respective commits for updating the release notes with the required
information (please use the same github issue# as the feature when
posting commits against the release notes; that way the issue also gets
updated with a reference to the commit).

This is also a good time for developers to edit gluster documentation,
to add details regarding the features added to 3.11 [4].

Thanks,
Shyam and Kaushal

[1] File a bug: https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS

[2] Tracker BZ for 3.11.0 blockers:
https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.11.0

[3] Release notes:
https://github.com/gluster/glusterfs/blob/release-3.11/doc/release-notes/3.11.0.md


[4] Gluster documentation repository:
https://github.com/gluster/glusterdocs
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users



[Gluster-users] [Errno 5] Input/output error: '/abc/def/ghi.gz'

2017-05-15 Thread David Squire
I am running: glusterfs 3.5.9 built on Mar 28 2016 07:10:17

Other volume info:

Type: Distributed-Replicate
Number of Bricks: 8 x 3 = 24
Transport-type: tcp
Options Reconfigured:
performance.cache-refresh-timeout: 30
performance.cache-size: 768MB
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.server-quorum-ratio: 51

When I try to manipulate a file (def/ghi.gz) on the mounted glusterfs folder
(abc) I get an Errno 5 input/output error.  Most of the files work, but
there are lots that have this same problem.

I visited each brick in my volume to see what the extended file attributes
are for this file.

On my_volume-replicate-0 there is an empty file with the filename.  When I
run "ls -al" it looks like this:

-T2 root   root0 Mar  1 14:56 ghi.gz

On the first two bricks (bricks 0 and 1) of my_volume-replicate-0 when I run
"getfattr -d -m. -e hex ghi.gz" I get the following results:

# file: ghi.gz
trusted.afr.my_volume-client-0=0x
trusted.afr.my_volume-client-1=0x
trusted.afr.my_volume-client-2=0x00020002
trusted.gfid=0xabb0369b05844390add6ea72ce7e107a
trusted.glusterfs.dht.linkto=0x686f7374696e672d7265706c69636174652d3400

The linkto looks like the following when I use text encoding instead of hex
encoding:

trusted.glusterfs.dht.linkto="my_volume-replicate-4"
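
For reference, the same attribute can also be read directly in text encoding
with something like the following (run on the brick, alongside the getfattr
command above):

  getfattr -n trusted.glusterfs.dht.linkto -e text ghi.gz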

 

The third brick (brick 2) of my_volume-replicate-0 has these extended
attributes:

# file: ghi.gz
trusted.gfid=0xc5c99fe21c3f4582b48e6f69ff76e33b
trusted.glusterfs.dht.linkto=0x686f7374696e672d7265706c69636174652d3400

So the third brick has a DIFFERENT trusted.gfid.

The first two bricks have
trusted.afr.my_volume-client-2=0x00020002.  Does that mean
that the first two bricks think that the third brick (brick 2) has
differences?

All three bricks are linking to my_volume-replicate-4.

All three bricks (bricks 12, 13, and 14) of my_volume-replicate-4 have
the actual file with these extended attributes:
the actual file with these extended attributes:

# file: ghi.gz
trusted.afr.my_volume-client-12=0x
trusted.afr.my_volume-client-13=0x
trusted.afr.my_volume-client-14=0x
trusted.gfid=0xabb0369b05844390add6ea72ce7e107a

So, my_volume-replicate-4's trusted.gfid matches bricks 0 and 1 of
my_volume-replicate-0.  And they all have 0x for all
three trusted.afr.my_volume-client-## attributes.  I assume this means that
the file is the same on all three bricks of my_volume-replicate-4.

No other bricks in the system have the ghi.gz file on them.

When I go to .glusterfs/indices/xattrop of bricks 0 and 1 there is a file
there named abb0369b-0584-4390-add6-ea72ce7e107a.  This means that this file
id is in need of healing, correct?  There is NOT a file named
abb0369b-0584-4390-add6-ea72ce7e107a on brick 2.

When I run "gluster volume heal my_volume info heal-failed" it lists
 four times.  I have tried to do a full heal and a rebalance of the system,
but it does not fix this problem.

How do I fix this problem?  Is there an easy way that I can fix all of the
files with the problem in bulk?

Thank you very much for any insights or help you may have!!

Dave

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Order of Bricks

2017-05-15 Thread iceholger
Hi,
Does no one have any idea how to reorder bricks to change the distribution of
files in a replica 2 volume after adding a brick?
I mailed the problem on Friday.

Sent from MailDroid
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Testing for gbench.

2017-05-15 Thread Ben Turner
Hi all!  A while back I created a benchmark kit for Gluster:

https://github.com/gluster/gbench

To run it just check the help file:

[bturner@ben-laptop bt--0001]$ python GlusterBench.py -h
Gluster Benchmark Kit Options:
  -h --help   Print gbench options.
  -v  Verbose Output.
  -r --record-sizeRecord size to write in for large files in KB
  -s --seq-file-size  The size of file each IOZone thread creates in GB
  -f --files  The number of files to create for smallfile tests in KB
  -l --sm-file-size   The size of files to create for smallfile tests in KB
  -n --sample-sizeThe number of samples to collect for each test
  -t --threadsThe number of threads to run applications with
  -m --mount-pointThe mount point gbench runs against
Example: GlusterBench.py -r 1024 -s 8 -f 1 -l 1024 -n 3 -t 4 -m 
/gluster-mount -v

To run it just cd to the dir and run GlusterBench.py:

 $ git clone https://github.com/gluster/gbench.git
 $ cd gbench/bench-tests/bt--0001/
 $ python GlusterBench.py -r 1024 -s 8 -f 1 -l 1024 -n 3 -t 4 -m 
/gluster-mount -v

Gbench will install smallfile for you and create any config files, but you will 
need to have IOzone installed yourself, as I haven't yet found a reliable repo 
for IOzone.

If anyone is interested in benchmarking their cluster or testing the tool, I 
would appreciate it.  For any problems / enhancements / whatever, either email 
me or open an issue on GitHub.  In the future I would love to have a page where 
you can upload your results and see how your cluster's performance compares to 
others.  We are always looking for users / contributors, so if you know Python 
and want to contribute it would be appreciated!  Thanks!

-b
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-05-15 Thread Atin Mukherjee
On Mon, 15 May 2017 at 11:58, Pawan Alwandi  wrote:

> Hi Atin,
>
> I see below error.  Do I require gluster to be upgraded on all 3 hosts for
> this to work?  Right now I have host 1 running 3.10.1 and host 2 & 3
> running 3.6.2
>
> # gluster v set all cluster.op-version 31001
> volume set: failed: Required op_version (31001) is not supported
>

Yes, you should, given that the 3.6 version is EOLed.
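
For clarity, a minimal sketch of the overall sequence (assuming all three
hosts are being taken to 3.10.1; see the upgrade guide linked further down in
this thread for the full procedure):

  # upgrade the glusterfs packages on hosts 2 and 3 as well, then verify on each node
  gluster --version          # should now report 3.10.1 everywhere
  # once all nodes run 3.10.1, bump the cluster op-version from any one node
  gluster volume set all cluster.op-version 31001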

>
>
>
> On Mon, May 15, 2017 at 3:32 AM, Atin Mukherjee 
> wrote:
>
>> On Sun, 14 May 2017 at 21:43, Atin Mukherjee  wrote:
>>
>>> Allright, I see that you haven't bumped up the op-version. Can you
>>> please execute:
>>>
>>> gluster v set all cluster.op-version 30101  and then restart glusterd on
>>> all the nodes and check the brick status?
>>>
>>
>> s/30101/31001
>>
>>
>>>
>>> On Sun, May 14, 2017 at 8:55 PM, Pawan Alwandi 
>>> wrote:
>>>
 Hello Atin,

 Thanks for looking at this.  Below is the output you requested for.

 Again, I'm seeing those errors after upgrading gluster on host 1.

 Host 1

 # cat /var/lib/glusterd/glusterd.info
 UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
 operating-version=30600

 # cat /var/lib/glusterd/peers/*
 uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
 state=3
 hostname1=192.168.0.7
 uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
 state=3
 hostname1=192.168.0.6

 # gluster --version
 glusterfs 3.10.1

 Host 2

 # cat /var/lib/glusterd/glusterd.info
 UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95
 operating-version=30600

 # cat /var/lib/glusterd/peers/*
 uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
 state=3
 hostname1=192.168.0.7
 uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
 state=3
 hostname1=192.168.0.5

 # gluster --version
 glusterfs 3.6.2 built on Jan 21 2015 14:23:44

 Host 3

 # cat /var/lib/glusterd/glusterd.info
 UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
 operating-version=30600

 # cat /var/lib/glusterd/peers/*
 uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
 state=3
 hostname1=192.168.0.5
 uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
 state=3
 hostname1=192.168.0.6

 # gluster --version
 glusterfs 3.6.2 built on Jan 21 2015 14:23:44



 On Sat, May 13, 2017 at 6:28 PM, Atin Mukherjee 
 wrote:

> I have already asked for the following earlier:
>
> Can you please provide output of following from all the nodes:
>
> cat /var/lib/glusterd/glusterd.info
> cat /var/lib/glusterd/peers/*
>
> On Sat, 13 May 2017 at 12:22, Pawan Alwandi  wrote:
>
>> Hello folks,
>>
>> Does anyone have any idea whats going on here?
>>
>> Thanks,
>> Pawan
>>
>> On Wed, May 10, 2017 at 5:02 PM, Pawan Alwandi 
>> wrote:
>>
>>> Hello,
>>>
>>> I'm trying to upgrade gluster from 3.6.2 to 3.10.1 but don't see the
>>> glusterfsd and glusterfs processes coming up.
>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>> is the process that I'm trying to follow.
>>>
>>> This is a 3 node server setup with a replicated volume having
>>> replica count of 3.
>>>
>>> Logs below:
>>>
>>> [2017-05-10 09:07:03.507959] I [MSGID: 100030]
>>> [glusterfsd.c:2460:main] 0-/usr/sbin/glusterd: Started running
>>> /usr/sbin/glusterd version 3.10.1 (args: /usr/sbin/glusterd -p
>>> /var/run/glusterd.pid)
>>> [2017-05-10 09:07:03.512827] I [MSGID: 106478]
>>> [glusterd.c:1449:init] 0-management: Maximum allowed open file 
>>> descriptors
>>> set to 65536
>>> [2017-05-10 09:07:03.512855] I [MSGID: 106479]
>>> [glusterd.c:1496:init] 0-management: Using /var/lib/glusterd as working
>>> directory
>>> [2017-05-10 09:07:03.520426] W [MSGID: 103071]
>>> [rdma.c:4590:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
>>> channel creation failed [No such device]
>>> [2017-05-10 09:07:03.520452] W [MSGID: 103055] [rdma.c:4897:init]
>>> 0-rdma.management: Failed to initialize IB Device
>>> [2017-05-10 09:07:03.520465] W
>>> [rpc-transport.c:350:rpc_transport_load] 0-rpc-transport: 'rdma'
>>> initialization failed
>>> [2017-05-10 09:07:03.520518] W
>>> [rpcsvc.c:1661:rpcsvc_create_listener] 0-rpc-service: cannot create
>>> listener, initing the transport failed
>>> [2017-05-10 09:07:03.520534] E [MSGID: 106243]
>>> [glusterd.c:1720:init] 0-management: creation of 1 listeners failed,
>>> continuing with succeeded transport
>>> [2017-05-10 09:07:04.931764] I [MSGID: 106513]
>>> [glusterd-store.c:2197:glusterd_restore_op_version] 0-glusterd: 
>>> retrieved
>>> op-version: 30600
>>> [2017-05-10 09:07:04.964354] I [MSGID: 106544]

[Gluster-users] Mounting GlusterFS volume in a client.

2017-05-15 Thread Dwijadas Dey
Hi list users,

I am trying to mount a GlusterFS server volume on a Gluster client in the /var
directory. My intention is to connect a folder of the webserver on the Gluster
client (/var/www/html/data) to the Gluster storage server.

The remote Gluster storage server (CentOS) has 250GB and the Gluster client is
installed on CentOS 7 with 64GB allocated. Since the volume size of the client
is less than the volume size of the Gluster storage server, will the Gluster
storage server volume still mount on the Gluster client?
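
For reference, the mount being described would look something like this on the
client (the hostname and volume name here are placeholders):

  mount -t glusterfs gluster-server:/myvolume /var/www/html/data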

Regards
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster and NFS-Ganesha - cluster is down after reboot

2017-05-15 Thread hvjunk

> On 15 May 2017, at 12:56 PM, Soumya Koduri  wrote:
> 
> 
> 
> On 05/12/2017 06:27 PM, Adam Ru wrote:
>> Hi Soumya,
>> 
>> Thank you very much for last response – very useful.
>> 
>> I apologize for delay, I had to find time for another testing.
>> 
>> I updated instructions that I provided in previous e-mail. *** means
>> that the step was added.
>> 
>> Instructions:
>> - Clean installation of CentOS 7.3 with all updates, 3x node,
>> resolvable IPs and VIPs
>> - Stopped firewalld (just for testing)
>> - *** SELinux in permissive mode (I had to, will explain bellow)
>> - Install “centos-release-gluster" to get "centos-gluster310" repo

Should I also install centos-gluster310, or will that be automagically 
pulled in by centos-release-gluster?

>> and install following (nothing else):
>> --- glusterfs-server
>> --- glusterfs-ganesha
>> - Passwordless SSH between all nodes
>> (/var/lib/glusterd/nfs/secret.pem and secret.pem.pub on all nodes)
>> - systemctl enable and start glusterd
>> - gluster peer probe 
>> - gluster volume set all cluster.enable-shared-storage enable

After this step, I'd advise (given my experience doing this with Ansible) 
making sure that the shared filesystem has propagated to all the nodes, and 
that the needed entries have been made in fstab… as a safety check. I also 
load my systemd service and helper script at this point to assist in cluster 
cold-bootstrapping.
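
A quick sanity check along those lines, on each node (using the shared-storage
path from the steps below):

  grep shared_storage /etc/fstab
  mountpoint /var/run/gluster/shared_storage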

>> - systemctl enable and start pcsd.service
>> - systemctl enable pacemaker.service (cannot be started at this moment)
>> - Set password for hacluster user on all nodes
>> - pcs cluster auth -u hacluster -p blabla
>> - mkdir /var/run/gluster/shared_storage/nfs-ganesha/
>> - touch /var/run/gluster/shared_storage/nfs-ganesha/ganesha.conf (not
>> sure if needed)
>> - vi /var/run/gluster/shared_storage/nfs-ganesha/ganesha-ha.conf and
>> insert configuration
>> - Try list files on other nodes: ls
>> /var/run/gluster/shared_storage/nfs-ganesha/
>> - gluster nfs-ganesha enable
>> - *** systemctl enable pacemaker.service (again, since pacemaker was
>> disabled at this point)
>> - *** Check owner of "state", "statd", "sm" and "sm.bak" in
>> /var/lib/nfs/ (I had to: chown rpcuser:rpcuser
>> /var/lib/nfs/statd/state)
>> - Check on other nodes that nfs-ganesha.service is running and "pcs
>> status" shows started resources
>> - gluster volume create mynewshare replica 3 transport tcp
>> node1:/ node2:/ node3:/
>> - gluster volume start mynewshare
>> - gluster vol set mynewshare ganesha.enable on
>> 
>> At this moment, this is status of important (I think) services:
>> 
>> -- corosync.service disabled
>> -- corosync-notifyd.service disabled
>> -- glusterd.service enabled
>> -- glusterfsd.service   disabled
>> -- pacemaker.serviceenabled
>> -- pcsd.service enabled
>> -- nfs-ganesha.service  disabled
>> -- nfs-ganesha-config.service   static
>> -- nfs-ganesha-lock.service static
>> 
>> -- corosync.service active (running)
>> -- corosync-notifyd.service inactive (dead)
>> -- glusterd.service active (running)
>> -- glusterfsd.service   inactive (dead)
>> -- pacemaker.serviceactive (running)
>> -- pcsd.service active (running)
>> -- nfs-ganesha.service  active (running)
>> -- nfs-ganesha-config.service   inactive (dead)
>> -- nfs-ganesha-lock.service active (running)
>> 
>> May I ask you a few questions please?
>> 
>> 1. Could you please confirm that services above has correct status/state?
> 
> Looks good to the best of my knowledge.
> 
>> 
>> 2. When I restart a node then nfs-ganesha is not running. Of course I
>> cannot enable it since it needs to be enabled after shared storage is
>> mounted. What is best practice to start it automatically so I don’t
>> have to worry about restarting node? Should I create a script that
>> will check whether shared storage was mounted and then start
>> nfs-ganesha? How do you do this in production?
> 
> That's right. We have plans to address this in the near future (probably by 
> having a new .service which mounts shared_storage before starting 
> nfs-ganesha). But until then, yes, having a custom-defined script to do so is 
> the only way to automate it.

Refer to my previous posting that has a script & systemd service that 
addresses this problematic bootstrapping issue w.r.t. locally mounted gluster 
directories, which the shared directory is.
That could be used (with my permission) as a basis to help fix this issue…
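
For anyone without that posting handy, a minimal sketch of the idea follows
(hypothetical script name and paths, not the exact script referenced above);
it can be run from a oneshot systemd unit ordered After=glusterd.service:

  #!/bin/bash
  # wait-shared-storage.sh (hypothetical): block until the gluster shared
  # storage is mounted, then start nfs-ganesha
  until mountpoint -q /var/run/gluster/shared_storage; do
      sleep 5
  done
  systemctl start nfs-ganesha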

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster arbiter with tier

2017-05-15 Thread Ravishankar N

On 05/15/2017 11:03 AM, Benjamin Kingston wrote:

Are there any plans to enable tiering with arbiter enabled?


There was a discussion on brick ordering in tiered volumes affecting 
arbiter brick placement [1], but nothing concrete came out of it.  I don't 
think this is being actively looked into at the moment.


-Ravi

[1] 
http://lists.gluster.org/pipermail/gluster-devel/2016-January/047839.html



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users




Re: [Gluster-users] Reliability issues with Gluster 3.10 and shard

2017-05-15 Thread Nithya Balachandran
On 15 May 2017 at 11:01, Benjamin Kingston  wrote:

> I resolved this with the following settings, particularly disabling
> features.ctr-enabled
>

That's odd. CTR should be enabled for tiered volumes. Was it enabled by
default?
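
To double-check what the option is currently set to, something like the
following should work (volume name taken from the output below):

  gluster volume get storage2 features.ctr-enabled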



>
> Volume Name: storage2
> Type: Distributed-Replicate
> Volume ID: adaabca5-25ed-4e7f-ae86-2f20fc0143a8
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 3 x (2 + 1) = 9
> Transport-type: tcp
> Bricks:
> Brick1: fd00:0:0:3::6:/mnt/gluster/storage/brick0/glusterfs2
> Brick2: fd00:0:0:3::8:/mnt/gluster/storage/brick0/glusterfs2
> Brick3: fd00:0:0:3::10:/mnt/gluster/storage/brick0/glusterfs (arbiter)
> Brick4: fd00:0:0:3::6:/mnt/gluster/storage/brick1/glusterfs2
> Brick5: fd00:0:0:3::8:/mnt/gluster/storage/brick1/glusterfs2
> Brick6: fd00:0:0:3::10:/mnt/gluster/storage/brick1/glusterfs (arbiter)
> Brick7: fd00:0:0:3::6:/mnt/gluster/storage/brick2/glusterfs2
> Brick8: fd00:0:0:3::8:/mnt/gluster/storage/brick2/glusterfs2
> Brick9: fd00:0:0:3::10:/mnt/gluster/storage/brick2/glusterfs (arbiter)
> Options Reconfigured:
> performance.write-behind-window-size: 4MB
> performance.cache-invalidation: on
> transport.keepalive: on
> performance.write-behind: on
> performance.read-ahead: on
> performance.io-cache: on
> performance.stat-prefetch: on
> performance.open-behind: on
> cluster.use-compound-fops: on
> performance.cache-ima-xattrs: on
> features.cache-invalidation: on
> client.event-threads: 4
> cluster.data-self-heal-algorithm: full
> performance.client-io-threads: on
> server.event-threads: 4
> performance.quick-read: on
> features.scrub: Active
> features.bitrot: on
> features.shard: on
> transport.address-family: inet6
> nfs.disable: on
> server.allow-insecure: on
> user.cifs: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.tier-compact: on
> diagnostics.brick-log-level: WARNING
> diagnostics.client-log-level: WARNING
> cluster.self-heal-daemon: enable
> performance.cache-samba-metadata: on
> cluster.brick-multiplex: off
> cluster.enable-shared-storage: enable
> nfs-ganesha: enable
>
>
>
> -ben
>
> On Sat, May 13, 2017 at 12:20 PM, Benjamin Kingston 
> wrote:
>
>> Hers's some log entries from nfs-ganesha gfapi
>>
>> [2017-05-13 19:02:54.105936] E [MSGID: 133010]
>> [shard.c:1706:shard_common_lookup_shards_cbk] 0-storage2-shard: Lookup
>> on shard 11 failed. Base file gfid = 1494c083-a618-4eba-80a0-147e656dd9d0
>> [Input/output error]
>> [2017-05-13 19:02:54.106176] E [MSGID: 133010]
>> [shard.c:1706:shard_common_lookup_shards_cbk] 0-storage2-shard: Lookup
>> on shard 2 failed. Base file gfid = 1494c083-a618-4eba-80a0-147e656dd9d0
>> [Input/output error]
>> [2017-05-13 19:02:54.106288] E [MSGID: 133010]
>> [shard.c:1706:shard_common_lookup_shards_cbk] 0-storage2-shard: Lookup
>> on shard 1 failed. Base file gfid = 1494c083-a618-4eba-80a0-147e656dd9d0
>> [Input/output error]
>> [2017-05-13 19:02:54.384922] I [MSGID: 108026]
>> [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do]
>> 0-storage2-replicate-2: performing metadata selfheal on
>> fe651475-226e-42a3-be2d-751d4f58e383
>> [2017-05-13 19:02:54.385894] W [MSGID: 114031]
>> [client-rpc-fops.c:2258:client3_3_setattr_cbk] 0-storage2-client-8:
>> remote operation failed [Operation not permitted]
>> [2017-05-13 19:02:54.401187] I [MSGID: 108026]
>> [afr-self-heal-common.c:1255:afr_log_selfheal] 0-storage2-replicate-2:
>> Completed metadata selfheal on fe651475-226e-42a3-be2d-751d4f58e383.
>> sources=[0] 1  sinks=
>> [2017-05-13 19:02:57.830019] I [MSGID: 109066]
>> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
>> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
>> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.par2.tmp
>> (hash=storage2-readdir-ahead-2/cache=storage2-readdir-ahead-2) =>
>> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
>> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.par2
>> (hash=storage2-readdir-ahead-0/cache=)
>>
>> [2017-05-13 19:08:22.014899] I [MSGID: 109066]
>> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
>> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
>> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.srr.tmp
>> (hash=storage2-readdir-ahead-1/cache=storage2-readdir-ahead-1) =>
>> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
>> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.srr
>> (hash=storage2-readdir-ahead-1/cache=)
>> [2017-05-13 19:08:22.463840] I [MSGID: 109066]
>> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
>> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
>> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.r04.tmp
>> (hash=storage2-readdir-ahead-2/cache=storage2-readdir-ahead-2) =>
>> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
>> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.r04
>> (hash=storage2-readdir-ahead-0/cache=)
>> [2017-05-13 19:08:22.769542] I [MSGID: 

Re: [Gluster-users] Reliability issues with Gluster 3.10 and shard

2017-05-15 Thread Benjamin Kingston
I resolved this with the following settings, particularly disabling
features.ctr-enabled

Volume Name: storage2
Type: Distributed-Replicate
Volume ID: adaabca5-25ed-4e7f-ae86-2f20fc0143a8
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: fd00:0:0:3::6:/mnt/gluster/storage/brick0/glusterfs2
Brick2: fd00:0:0:3::8:/mnt/gluster/storage/brick0/glusterfs2
Brick3: fd00:0:0:3::10:/mnt/gluster/storage/brick0/glusterfs (arbiter)
Brick4: fd00:0:0:3::6:/mnt/gluster/storage/brick1/glusterfs2
Brick5: fd00:0:0:3::8:/mnt/gluster/storage/brick1/glusterfs2
Brick6: fd00:0:0:3::10:/mnt/gluster/storage/brick1/glusterfs (arbiter)
Brick7: fd00:0:0:3::6:/mnt/gluster/storage/brick2/glusterfs2
Brick8: fd00:0:0:3::8:/mnt/gluster/storage/brick2/glusterfs2
Brick9: fd00:0:0:3::10:/mnt/gluster/storage/brick2/glusterfs (arbiter)
Options Reconfigured:
performance.write-behind-window-size: 4MB
performance.cache-invalidation: on
transport.keepalive: on
performance.write-behind: on
performance.read-ahead: on
performance.io-cache: on
performance.stat-prefetch: on
performance.open-behind: on
cluster.use-compound-fops: on
performance.cache-ima-xattrs: on
features.cache-invalidation: on
client.event-threads: 4
cluster.data-self-heal-algorithm: full
performance.client-io-threads: on
server.event-threads: 4
performance.quick-read: on
features.scrub: Active
features.bitrot: on
features.shard: on
transport.address-family: inet6
nfs.disable: on
server.allow-insecure: on
user.cifs: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.tier-compact: on
diagnostics.brick-log-level: WARNING
diagnostics.client-log-level: WARNING
cluster.self-heal-daemon: enable
performance.cache-samba-metadata: on
cluster.brick-multiplex: off
cluster.enable-shared-storage: enable
nfs-ganesha: enable



-ben

On Sat, May 13, 2017 at 12:20 PM, Benjamin Kingston 
wrote:

> Hers's some log entries from nfs-ganesha gfapi
>
> [2017-05-13 19:02:54.105936] E [MSGID: 133010] 
> [shard.c:1706:shard_common_lookup_shards_cbk]
> 0-storage2-shard: Lookup on shard 11 failed. Base file gfid =
> 1494c083-a618-4eba-80a0-147e656dd9d0 [Input/output error]
> [2017-05-13 19:02:54.106176] E [MSGID: 133010] 
> [shard.c:1706:shard_common_lookup_shards_cbk]
> 0-storage2-shard: Lookup on shard 2 failed. Base file gfid =
> 1494c083-a618-4eba-80a0-147e656dd9d0 [Input/output error]
> [2017-05-13 19:02:54.106288] E [MSGID: 133010] 
> [shard.c:1706:shard_common_lookup_shards_cbk]
> 0-storage2-shard: Lookup on shard 1 failed. Base file gfid =
> 1494c083-a618-4eba-80a0-147e656dd9d0 [Input/output error]
> [2017-05-13 19:02:54.384922] I [MSGID: 108026]
> [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do]
> 0-storage2-replicate-2: performing metadata selfheal on
> fe651475-226e-42a3-be2d-751d4f58e383
> [2017-05-13 19:02:54.385894] W [MSGID: 114031] 
> [client-rpc-fops.c:2258:client3_3_setattr_cbk]
> 0-storage2-client-8: remote operation failed [Operation not permitted]
> [2017-05-13 19:02:54.401187] I [MSGID: 108026]
> [afr-self-heal-common.c:1255:afr_log_selfheal] 0-storage2-replicate-2:
> Completed metadata selfheal on fe651475-226e-42a3-be2d-751d4f58e383.
> sources=[0] 1  sinks=
> [2017-05-13 19:02:57.830019] I [MSGID: 109066]
> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.par2.tmp
> (hash=storage2-readdir-ahead-2/cache=storage2-readdir-ahead-2) =>
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.par2
> (hash=storage2-readdir-ahead-0/cache=)
>
> [2017-05-13 19:08:22.014899] I [MSGID: 109066]
> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.srr.tmp
> (hash=storage2-readdir-ahead-1/cache=storage2-readdir-ahead-1) =>
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.srr
> (hash=storage2-readdir-ahead-1/cache=)
> [2017-05-13 19:08:22.463840] I [MSGID: 109066]
> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.r04.tmp
> (hash=storage2-readdir-ahead-2/cache=storage2-readdir-ahead-2) =>
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.r04
> (hash=storage2-readdir-ahead-0/cache=)
> [2017-05-13 19:08:22.769542] I [MSGID: 109066]
> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.r01.tmp
> (hash=storage2-readdir-ahead-2/cache=storage2-readdir-ahead-2) =>
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 

Re: [Gluster-users] Reliability issues with Gluster 3.10 and shard

2017-05-15 Thread Krutika Dhananjay
Shard translator is currently supported only for VM image store workload.

-Krutika

On Sun, May 14, 2017 at 12:50 AM, Benjamin Kingston 
wrote:

> Hers's some log entries from nfs-ganesha gfapi
>
> [2017-05-13 19:02:54.105936] E [MSGID: 133010] 
> [shard.c:1706:shard_common_lookup_shards_cbk]
> 0-storage2-shard: Lookup on shard 11 failed. Base file gfid =
> 1494c083-a618-4eba-80a0-147e656dd9d0 [Input/output error]
> [2017-05-13 19:02:54.106176] E [MSGID: 133010] 
> [shard.c:1706:shard_common_lookup_shards_cbk]
> 0-storage2-shard: Lookup on shard 2 failed. Base file gfid =
> 1494c083-a618-4eba-80a0-147e656dd9d0 [Input/output error]
> [2017-05-13 19:02:54.106288] E [MSGID: 133010] 
> [shard.c:1706:shard_common_lookup_shards_cbk]
> 0-storage2-shard: Lookup on shard 1 failed. Base file gfid =
> 1494c083-a618-4eba-80a0-147e656dd9d0 [Input/output error]
> [2017-05-13 19:02:54.384922] I [MSGID: 108026]
> [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do]
> 0-storage2-replicate-2: performing metadata selfheal on
> fe651475-226e-42a3-be2d-751d4f58e383
> [2017-05-13 19:02:54.385894] W [MSGID: 114031] 
> [client-rpc-fops.c:2258:client3_3_setattr_cbk]
> 0-storage2-client-8: remote operation failed [Operation not permitted]
> [2017-05-13 19:02:54.401187] I [MSGID: 108026]
> [afr-self-heal-common.c:1255:afr_log_selfheal] 0-storage2-replicate-2:
> Completed metadata selfheal on fe651475-226e-42a3-be2d-751d4f58e383.
> sources=[0] 1  sinks=
> [2017-05-13 19:02:57.830019] I [MSGID: 109066]
> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.par2.tmp
> (hash=storage2-readdir-ahead-2/cache=storage2-readdir-ahead-2) =>
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.par2
> (hash=storage2-readdir-ahead-0/cache=)
>
> [2017-05-13 19:08:22.014899] I [MSGID: 109066]
> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.srr.tmp
> (hash=storage2-readdir-ahead-1/cache=storage2-readdir-ahead-1) =>
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.srr
> (hash=storage2-readdir-ahead-1/cache=)
> [2017-05-13 19:08:22.463840] I [MSGID: 109066]
> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.r04.tmp
> (hash=storage2-readdir-ahead-2/cache=storage2-readdir-ahead-2) =>
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.r04
> (hash=storage2-readdir-ahead-0/cache=)
> [2017-05-13 19:08:22.769542] I [MSGID: 109066]
> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.r01.tmp
> (hash=storage2-readdir-ahead-2/cache=storage2-readdir-ahead-2) =>
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.r01
> (hash=storage2-readdir-ahead-0/cache=)
> [2017-05-13 19:08:23.141069] I [MSGID: 109066]
> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.nfo.tmp
> (hash=storage2-readdir-ahead-1/cache=storage2-readdir-ahead-1) =>
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.nfo
> (hash=storage2-readdir-ahead-0/cache=)
> [2017-05-13 19:08:23.468554] I [MSGID: 109066]
> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.r00.tmp
> (hash=storage2-readdir-ahead-0/cache=storage2-readdir-ahead-0) =>
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.r00
> (hash=storage2-readdir-ahead-2/cache=)
> [2017-05-13 19:08:23.671753] I [MSGID: 109066]
> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.sfv.tmp
> (hash=storage2-readdir-ahead-2/cache=storage2-readdir-ahead-2) =>
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.sfv
> (hash=storage2-readdir-ahead-2/cache=)
> [2017-05-13 19:08:23.812152] I [MSGID: 109066]
> [dht-rename.c:1608:dht_rename] 0-storage2-dht: renaming
> /content/Downloads/incomplete/usenet/Attack.on.Titan.S02E05.
> 720p.WEB.x264-ANiURL.#27/aniurl-aot.s02e05.720p.web.r11.tmp
> 

Re: [Gluster-users] Failure while upgrading gluster to 3.10.1

2017-05-15 Thread Pawan Alwandi
Hi Atin,

I see below error.  Do I require gluster to be upgraded on all 3 hosts for
this to work?  Right now I have host 1 running 3.10.1 and host 2 & 3
running 3.6.2

# gluster v set all cluster.op-version 31001
volume set: failed: Required op_version (31001) is not supported


On Mon, May 15, 2017 at 3:32 AM, Atin Mukherjee  wrote:

> On Sun, 14 May 2017 at 21:43, Atin Mukherjee  wrote:
>
>> Allright, I see that you haven't bumped up the op-version. Can you please
>> execute:
>>
>> gluster v set all cluster.op-version 30101  and then restart glusterd on
>> all the nodes and check the brick status?
>>
>
> s/30101/31001
>
>
>>
>> On Sun, May 14, 2017 at 8:55 PM, Pawan Alwandi  wrote:
>>
>>> Hello Atin,
>>>
>>> Thanks for looking at this.  Below is the output you requested for.
>>>
>>> Again, I'm seeing those errors after upgrading gluster on host 1.
>>>
>>> Host 1
>>>
>>> # cat /var/lib/glusterd/glusterd.info
>>> UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>> operating-version=30600
>>>
>>> # cat /var/lib/glusterd/peers/*
>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>> state=3
>>> hostname1=192.168.0.7
>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>> state=3
>>> hostname1=192.168.0.6
>>>
>>> # gluster --version
>>> glusterfs 3.10.1
>>>
>>> Host 2
>>>
>>> # cat /var/lib/glusterd/glusterd.info
>>> UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>> operating-version=30600
>>>
>>> # cat /var/lib/glusterd/peers/*
>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>> state=3
>>> hostname1=192.168.0.7
>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>> state=3
>>> hostname1=192.168.0.5
>>>
>>> # gluster --version
>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>
>>> Host 3
>>>
>>> # cat /var/lib/glusterd/glusterd.info
>>> UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>> operating-version=30600
>>>
>>> # cat /var/lib/glusterd/peers/*
>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>> state=3
>>> hostname1=192.168.0.5
>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>> state=3
>>> hostname1=192.168.0.6
>>>
>>> # gluster --version
>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>
>>>
>>>
>>> On Sat, May 13, 2017 at 6:28 PM, Atin Mukherjee 
>>> wrote:
>>>
 I have already asked for the following earlier:

 Can you please provide output of following from all the nodes:

 cat /var/lib/glusterd/glusterd.info
 cat /var/lib/glusterd/peers/*

 On Sat, 13 May 2017 at 12:22, Pawan Alwandi  wrote:

> Hello folks,
>
> Does anyone have any idea whats going on here?
>
> Thanks,
> Pawan
>
> On Wed, May 10, 2017 at 5:02 PM, Pawan Alwandi 
> wrote:
>
>> Hello,
>>
>> I'm trying to upgrade gluster from 3.6.2 to 3.10.1 but don't see the
>> glusterfsd and glusterfs processes coming up.
>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/
>> upgrade_to_3.10/ is the process that I'm trying to follow.
>>
>> This is a 3 node server setup with a replicated volume having replica
>> count of 3.
>>
>> Logs below:
>>
>> [2017-05-10 09:07:03.507959] I [MSGID: 100030]
>> [glusterfsd.c:2460:main] 0-/usr/sbin/glusterd: Started running
>> /usr/sbin/glusterd version 3.10.1 (args: /usr/sbin/glusterd -p
>> /var/run/glusterd.pid)
>> [2017-05-10 09:07:03.512827] I [MSGID: 106478] [glusterd.c:1449:init]
>> 0-management: Maximum allowed open file descriptors set to 65536
>> [2017-05-10 09:07:03.512855] I [MSGID: 106479] [glusterd.c:1496:init]
>> 0-management: Using /var/lib/glusterd as working directory
>> [2017-05-10 09:07:03.520426] W [MSGID: 103071]
>> [rdma.c:4590:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm
>> event channel creation failed [No such device]
>> [2017-05-10 09:07:03.520452] W [MSGID: 103055] [rdma.c:4897:init]
>> 0-rdma.management: Failed to initialize IB Device
>> [2017-05-10 09:07:03.520465] W [rpc-transport.c:350:rpc_transport_load]
>> 0-rpc-transport: 'rdma' initialization failed
>> [2017-05-10 09:07:03.520518] W [rpcsvc.c:1661:rpcsvc_create_listener]
>> 0-rpc-service: cannot create listener, initing the transport failed
>> [2017-05-10 09:07:03.520534] E [MSGID: 106243] [glusterd.c:1720:init]
>> 0-management: creation of 1 listeners failed, continuing with succeeded
>> transport
>> [2017-05-10 09:07:04.931764] I [MSGID: 106513] 
>> [glusterd-store.c:2197:glusterd_restore_op_version]
>> 0-glusterd: retrieved op-version: 30600
>> [2017-05-10 09:07:04.964354] I [MSGID: 106544]
>> [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
>> 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>> [2017-05-10 09:07:04.993944] I [MSGID: 106498]
>> [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo]
>> 0-management: connect returned 0
>> [2017-05-10 09:07:04.995864] I [MSGID: