Re: [Gluster-users] Slow write times to gluster disk

2017-08-08 Thread Steve Postma
Soumya,

It's:


[root@mseas-data2 ~]# glusterfs --version
glusterfs 3.7.11 built on Apr 27 2016 14:09:20
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. 
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
[root@mseas-data2 ~]#


Thanks,

Steve



From: Soumya Koduri 
Sent: Tuesday, August 8, 2017 1:37 AM
To: Pat Haley; Niels de Vos
Cc: gluster-users@gluster.org; Pranith Kumar Karampuri; Ben Turner; Ravishankar 
N; Raghavendra Gowdappa; Niels de Vos; Steve Postma
Subject: Re: [Gluster-users] Slow write times to gluster disk



- Original Message -
> From: "Pat Haley" >
> To: "Soumya Koduri" >, 
> gluster-users@gluster.org, "Pranith Kumar 
> Karampuri" >
> Cc: "Ben Turner" >, 
> "Ravishankar N" >, 
> "Raghavendra Gowdappa"
> >, "Niels de Vos" 
> >, "Steve Postma" 
> >
> Sent: Monday, August 7, 2017 9:52:48 PM
> Subject: Re: [Gluster-users] Slow write times to gluster disk
>
>
> Hi Soumya,
>
> We just had the opportunity to try the option of disabling the
> kernel-NFS and restarting glusterd to start gNFS. However, the gluster
> NFS daemon crashes immediately on startup. What additional information,
> besides what we provide below, would help in debugging this?
>

Which version of glusterfs are you using? There were a few regressions
(all fixed now, at least in the master branch) caused by recent changes in the mount codepath.

Request Niels to comment.

Thanks,
Soumya


> Thanks,
>
> Pat
>
>
>  Forwarded Message 
> Subject: gluster-nfs crashing on start
> Date: Mon, 7 Aug 2017 16:05:09 +
> From: Steve Postma
> To: Pat Haley
>
>
>
> *To disable kernel-NFS and enable NFS through Gluster we:*
>
>
> gluster volume set data-volume nfs.export-volumes on
> gluster volume set data-volume nfs.disable off
>
> /etc/init.d/glusterd stop
>
>
> service nfslock stop
>
> service rpcgssd stop
>
> service rpcidmapd stop
>
> service portmap stop
>
> service nfs stop
>
>
> /etc/init.d/glusterd stop
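
For reference, a rough sketch of the sequence typically used on a CentOS 6-style init system to hand NFS duty over to gNFS; the chkconfig step and the final glusterd restart are assumptions, not part of the report above:

service nfs stop
service nfslock stop
chkconfig nfs off                            # keep kernel NFS from coming back at boot
gluster volume set data-volume nfs.disable off
/etc/init.d/glusterd restart                 # glusterd then spawns the gluster NFS server
gluster volume status data-volume            # the "NFS Server on ..." lines should show Online Y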
>
>
>
>
> *the /var/log/glusterfs/nfs.log immediately reports a crash:*
>
>
> [root@mseas-data2 glusterfs]# cat nfs.log
>
> [2017-08-07 15:20:16.327026] I [MSGID: 100030] [glusterfsd.c:2332:main]
> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version
> 3.7.11 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs
> -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S
> /var/run/gluster/7db74f19472511d20849e471bf224c1a.socket)
>
> [2017-08-07 15:20:16.345166] I [MSGID: 101190]
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
>
> [2017-08-07 15:20:16.351290] I
> [rpcsvc.c:2215:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service:
> Configured rpc.outstanding-rpc-limit with value 16
>
> pending frames:
>
> frame : type(0) op(0)
>
> patchset: git://git.gluster.com/glusterfs.git
>
> signal received: 11
>
> time of crash:
>
> 2017-08-07 15:20:17
>
> configuration details:
>
> argp 1
>
> backtrace 1
>
> dlfcn 1
>
> libpthread 1
>
> llistxattr 1
>
> setfsid 1
>
> spinlock 1
>
> epoll.h 1
>
> xattr.h 1
>
> st_atim.tv_nsec 1
>
> package-string: glusterfs 3.7.11
>
> /usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb8)[0x3889625a18]
>
> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x32f)[0x38896456af]
>
> /lib64/libc.so.6[0x34a1c32660]
>
> /lib64/libc.so.6[0x34a1d3382f]
>
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(+0x53307)[0x7f8d071b3307]
>
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(exp_file_parse+0x302)[0x7f8d071b3742]
>
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(mnt3_auth_set_exports_auth+0x45)[0x7f8d071b47a5]
>
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(_mnt3_init_auth_params+0x91)[0x7f8d07183e41]
>
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(mnt3svc_init+0x218)[0x7f8d07184228]
>
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(nfs_init_versions+0xd7)[0x7f8d07174a37]
>
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(init+0x77)[0x7f8d071767c7]
>
> /usr/lib64/libglusterfs.so.0(xlator_init+0x52)[0x3889622a82]
>
> /usr/lib64/libglusterfs.so.0(glusterfs_graph_init+0x31)[0x3889669aa1]
>
> 

Re: [Gluster-users] Upgrading from 3.6.3 to 3.10/3.11

2017-08-08 Thread Diego Remolina
I cannot speak to an interim version. I went from 3.6.x to 3.7.x a
long time ago and it was a disaster. Many samba crashes and core dumps
scared me, so I rolled back to the 3.6.x series and stayed there until I
upgraded to 3.10.2.

I never tried 3.8.x so I cannot speak to it, other than knowing it is
what Red Hat considers stable on their supported RHEL OS.

Diego

On Tue, Aug 8, 2017 at 9:04 AM, Brett Randall  wrote:
> Thanks Diego. This is invaluable information, appreciate it immensely. I had
> heard previously that you can always go back to previous Gluster binaries,
> but without understanding the data structures behind Gluster, I had no idea
> how safe that was. Backing up the lib folder makes perfect sense.
>
> The performance issues we're specifically keen to address are the small-file
> performance improvements introduced in 3.7. I feel that a lot of the
> complaints we get are from people using apps that are {slowly} crawling
> massively deep folders via SMB. I'm hoping that the improvements made in 3.7
> have stayed intact in 3.10! Otherwise, is there a generally accepted "fast
> and stable" version earlier than 3.10 that we should be looking at as an
> interim step?
>
> Brett
>
> 
> From: Diego Remolina 
> Sent: Tuesday, August 8, 2017 10:39:27 PM
> To: Brett Randall
> Cc: gluster-users@gluster.org List
> Subject: Re: [Gluster-users] Upgrading from 3.6.3 to 3.10/3.11
>
> I had a mixed experience going from 3.6.6 to 3.10.2 on a two server
> setup. I have since upgraded to 3.10.3 but I still have a bad problem
> with specific files (see CONS below).
>
> PROS
> - Back on a "supported" version.
> - Windows roaming profiles (small file performance) improved
> significantly via samba. This may be due to new tuning options added
> (see my tuning options for the volume below):
> Volume Name: export
> Type: Replicate
> Volume ID: ---snip---
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 10.0.1.7:/bricks/hdds/brick
> Brick2: 10.0.1.6:/bricks/hdds/brick
> Options Reconfigured:
> performance.stat-prefetch: on
> performance.cache-min-file-size: 0
> network.inode-lru-limit: 65536
> performance.cache-invalidation: on
> features.cache-invalidation: on
> performance.md-cache-timeout: 600
> features.cache-invalidation-timeout: 600
> performance.cache-samba-metadata: on
> transport.address-family: inet
> server.allow-insecure: on
> performance.cache-size: 10GB
> cluster.server-quorum-type: server
> nfs.disable: on
> performance.io-thread-count: 64
> performance.io-cache: on
> cluster.lookup-optimize: on
> cluster.readdir-optimize: on
> server.event-threads: 5
> client.event-threads: 5
> performance.cache-max-file-size: 256MB
> diagnostics.client-log-level: INFO
> diagnostics.brick-log-level: INFO
> cluster.server-quorum-ratio: 51%
>
> CONS
> - New problems came up with specific files (Autodesk Revit files) for
> which no solution has been found, other than to stop using the samba vfs
> gluster plugin and also play a stupid file renaming game. See:
> http://lists.gluster.org/pipermail/gluster-users/2017-June/031377.html
> - With 3.6.6 I had a nightly rsync process that would copy all the
> data from the gluster server pair to another server (nightly backup).
> This operation used to finish between 1-2AM every day. After the upgrade,
> this operation is much slower, with rsync finishing up between 3-5AM.
> - I have not looked into it much, but 40-ish days after the
> upgrade, the gluster mount on one server became stuck and I had to
> reboot the servers.
>
> As for recommendations, definitely do *not* go with 3.11 as that is
> *not* a long term release. Stay with 3.10.
> https://www.gluster.org/community/release-schedule/
>
> Make sure you have the 3.6.3 rpms available to downgrade if needed.
> You can always go back to the previous rpms if you have them available
> (this is not easy if you have a mix with other distros, i.e. Ubuntu,
> where the PPA only has the latest .deb file for each minor version).
>
> You must schedule downtime and bring the whole gluster setup down for the
> upgrade. Upgrade all servers, then clients, then test, test, test and
> test more (I did not notice my Revit file problem until users brought
> it to my attention). If things are going well in your testing, then
> you should do the op version upgrade, but not before committing to
> staying with 3.10. It is true that you can lower the op version later
> manually, but then you have to manually edit several files on each
> server, so I say, stay with the *older* op version until you are sure
> you want to stay on 3.10, then upgrade the op version.
>
> https://gluster.readthedocs.io/en/latest/Upgrade-Guide/op_version/
>
> Prior to any changes, backup all your gluster server configuration
> folders ( /var/lib/glusterd/ ) in every single server. That will allow
> you to go back to the moment before upgrade if really needed.

Re: [Gluster-users] Upgrading from 3.6.3 to 3.10/3.11

2017-08-08 Thread Brett Randall
Thanks Diego. This is invaluable information, appreciate it immensely. I had 
heard previously that you can always go back to previous Gluster binaries, but 
without understanding the data structures behind Gluster, I had no idea how 
safe that was. Backing up the lib folder makes perfect sense.

The performance issues we're specifically keen to address are the small-file 
performance improvements introduced in 3.7. I feel that a lot of the complaints 
we get are from people using apps that are {slowly} crawling massively deep 
folders via SMB. I'm hoping that the improvements made in 3.7 have stayed 
intact in 3.10! Otherwise, is there a generally accepted "fast and stable" 
version earlier than 3.10 that we should be looking at as an interim step?

Brett


From: Diego Remolina 
Sent: Tuesday, August 8, 2017 10:39:27 PM
To: Brett Randall
Cc: gluster-users@gluster.org List
Subject: Re: [Gluster-users] Upgrading from 3.6.3 to 3.10/3.11

I had a mixed experience going from 3.6.6 to 3.10.2 on a two server
setup. I have since upgraded to 3.10.3 but I still have a bad problem
with specific files (see CONS below).

PROS
- Back on a "supported" version.
- Windows roaming profiles (small file performance) improved
significantly via samba. This may be due to new tuning options added
(see my tuning options for the volume below):
Volume Name: export
Type: Replicate
Volume ID: ---snip---
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.0.1.7:/bricks/hdds/brick
Brick2: 10.0.1.6:/bricks/hdds/brick
Options Reconfigured:
performance.stat-prefetch: on
performance.cache-min-file-size: 0
network.inode-lru-limit: 65536
performance.cache-invalidation: on
features.cache-invalidation: on
performance.md-cache-timeout: 600
features.cache-invalidation-timeout: 600
performance.cache-samba-metadata: on
transport.address-family: inet
server.allow-insecure: on
performance.cache-size: 10GB
cluster.server-quorum-type: server
nfs.disable: on
performance.io-thread-count: 64
performance.io-cache: on
cluster.lookup-optimize: on
cluster.readdir-optimize: on
server.event-threads: 5
client.event-threads: 5
performance.cache-max-file-size: 256MB
diagnostics.client-log-level: INFO
diagnostics.brick-log-level: INFO
cluster.server-quorum-ratio: 51%

CONS
- New problems came up with specific files (Autodesk Revit files) for
which no solution has been found, other than to stop using the samba vfs
gluster plugin and also play a stupid file renaming game. See:
http://lists.gluster.org/pipermail/gluster-users/2017-June/031377.html
- With 3.6.6 I had a nightly rsync process that would copy all the
data from the gluster server pair to another server (nightly backup).
This operation used to finish between 1-2AM every day. After the upgrade,
this operation is much slower, with rsync finishing up between 3-5AM.
- I have not looked into it much, but 40-ish days after the
upgrade, the gluster mount on one server became stuck and I had to
reboot the servers.

As for recommendations, definitely do *not* go with 3.11 as that is
*not* a long term release. Stay with 3.10.
https://www.gluster.org/community/release-schedule/

Make sure you have the 3.6.3 rpms available to downgrade if needed.
You can always go back to the previous rpms if you have them available
(this is not easy if you have a mix with other distros, i.e. Ubuntu,
where the PPA only has the latest .deb file for each minor version).

You must schedule downtime and bring the whole gluster setup down for the
upgrade. Upgrade all servers, then clients, then test, test, test and
test more (I did not notice my Revit file problem until users brought
it to my attention). If things are going well in your testing, then
you should do the op version upgrade, but not before committing to
staying with 3.10. It is true that you can lower the op version later
manually, but then you have to manually edit several files on each
server, so I say, stay with the *older* op version until you are sure
you want to stay on 3.10, then upgrade the op version.

https://gluster.readthedocs.io/en/latest/Upgrade-Guide/op_version/

Prior to any changes, backup all your gluster server configuration
folders ( /var/lib/glusterd/ ) in every single server. That will allow
you to go back to the moment before upgrade if really needed.

HTH,

Diego



On Tue, Aug 8, 2017 at 6:51 AM, Brett Randall  wrote:
> Hi all
>
> We have a 20-node, 1pb Gluster deployment that is running 3.6.3 - the same
> version we installed on day 1. There are obviously numerous performance and
> feature improvements that we'd like to take advantage of. However, this is a
> production system and we don't have a replica of it that we can test the
> upgrade on.
>
> We're running CentOS 6.6 with official Gluster binaries. We rely on
> Gluster's NFS daemon, and also use samba-glusterfs with samba for SMB access
> to our Gluster volume.
>
> What risks might 

Re: [Gluster-users] How are bricks healed in Debian Jessie 3.11

2017-08-08 Thread lemonnierk
> Healing of contents works at the entire file level at the moment. For VM 
> image use cases, it is advised to enable sharding by virtue of which 
> heals would be restricted to only the shards that were modified when the 
> brick was down.

We even changed the heal algorithm to "full", since it seems better to just
re-download a small shard than to try to heal it. At least on 3.7 it
works better that way.
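
For anyone wanting to do the same, the option in question is cluster.data-self-heal-algorithm; a minimal sketch (the volume name is a placeholder):

gluster volume set <VOLNAME> cluster.data-self-heal-algorithm full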


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] How are bricks healed in Debian Jessie 3.11

2017-08-08 Thread Ravishankar N



On 08/08/2017 04:51 PM, Gerry O'Brien wrote:

Hi,

 How are bricks healed in Debian Jessie 3.11? Is it at the file or
block level? The scenario we have in mind is a 2-brick replica volume
for storing VM file systems in a self-service IaaS, e.g. OpenNebula. If
one of the bricks is off-line for a period of time, all the VM file
systems will have been modified when the brick comes back on-line. As
some of these VM file systems are quite large, 100GB+, our guess is that
if healing involves copying entire files then it might be unworkable for
us, as the time to restore a few minutes' downtime of a single brick could
take days. Just copying modified blocks would be fine.

 Can someone give us some insight on how healing works?
Healing of contents works at the entire file level at the moment. For VM 
image use cases, it is advised to enable sharding by virtue of which 
heals would be restricted to only the shards that were modified when the 
brick was down.
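
For reference, a minimal sketch of enabling sharding on a volume (the volume name and block size below are placeholders, and sharding only applies to files created after it is turned on):

gluster volume set <VOLNAME> features.shard on
gluster volume set <VOLNAME> features.shard-block-size 64MB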

-Ravi


 Regards,
 Gerry



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Upgrading from 3.6.3 to 3.10/3.11

2017-08-08 Thread Diego Remolina
I had a mixed experience going from 3.6.6 to 3.10.2 on a two server
setup. I have since upgraded to 3.10.3 but I still have a bad problem
with specific files (see CONS below).

PROS
- Back on a "supported" version.
- Windows roaming profiles (small file performance) improved
significantly via samba. This may be due to new tuning options added
(see my tuning options for the volume below):
Volume Name: export
Type: Replicate
Volume ID: ---snip---
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.0.1.7:/bricks/hdds/brick
Brick2: 10.0.1.6:/bricks/hdds/brick
Options Reconfigured:
performance.stat-prefetch: on
performance.cache-min-file-size: 0
network.inode-lru-limit: 65536
performance.cache-invalidation: on
features.cache-invalidation: on
performance.md-cache-timeout: 600
features.cache-invalidation-timeout: 600
performance.cache-samba-metadata: on
transport.address-family: inet
server.allow-insecure: on
performance.cache-size: 10GB
cluster.server-quorum-type: server
nfs.disable: on
performance.io-thread-count: 64
performance.io-cache: on
cluster.lookup-optimize: on
cluster.readdir-optimize: on
server.event-threads: 5
client.event-threads: 5
performance.cache-max-file-size: 256MB
diagnostics.client-log-level: INFO
diagnostics.brick-log-level: INFO
cluster.server-quorum-ratio: 51%
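
For reference, options like the above are applied one at a time with "gluster volume set"; a minimal sketch using the volume name "export" from this output:

gluster volume set export performance.md-cache-timeout 600
gluster volume set export performance.cache-samba-metadata on
gluster volume info export     # review the resulting "Options Reconfigured" list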

CONS
- New problems came up with specific files (Autodesk Revit files) for
which no solution has been found, other than to stop using the samba vfs
gluster plugin and also play a stupid file renaming game. See:
http://lists.gluster.org/pipermail/gluster-users/2017-June/031377.html
- With 3.6.6 I had a nightly rsync process that would copy all the
data from the gluster server pair to another server (nightly backup).
This operation used to finish between 1-2AM every day. After the upgrade,
this operation is much slower, with rsync finishing up between 3-5AM.
- I have not looked into it much, but 40-ish days after the
upgrade, the gluster mount on one server became stuck and I had to
reboot the servers.

As for recommendations, definitely do *not* go with 3.11 as that is
*not* a long term release. Stay with 3.10.
https://www.gluster.org/community/release-schedule/

Make sure you have the 3.6.3 rpms available to downgrade if needed.
You can always go back to the previous rpms if you have them available
(this is not easy if you have a mix with other distros, i.e. Ubuntu,
where the PPA only has the latest .deb file for each minor version).
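
A minimal sketch of preparing for such a rollback on CentOS; the paths below are just examples and assume the old rpms were saved locally:

rpm -qa 'glusterfs*' > /root/glusterfs-pkgs-before-upgrade.txt   # record what is installed
yum downgrade /root/glusterfs-3.6.3-rpms/glusterfs*.rpm          # rough rollback from the saved rpms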

You must schedule downtime and bring the whole gluster setup down for the
upgrade. Upgrade all servers, then clients, then test, test, test and
test more (I did not notice my Revit file problem until users brought
it to my attention). If things are going well in your testing, then
you should do the op version upgrade, but not before committing to
staying with 3.10. It is true that you can lower the op version later
manually, but then you have to manually edit several files on each
server, so I say, stay with the *older* op version until you are sure
you want to stay on 3.10, then upgrade the op version.

https://gluster.readthedocs.io/en/latest/Upgrade-Guide/op_version/
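
For reference, a minimal sketch of checking and raising the op-version once committed to 3.10; the value 31000 is believed to be the 3.10 op-version, but verify it against the release notes:

gluster volume get all cluster.op-version        # current cluster op-version
gluster volume get all cluster.max-op-version    # highest supported by the installed 3.10 binaries
gluster volume set all cluster.op-version 31000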

Prior to any changes, backup all your gluster server configuration
folders ( /var/lib/glusterd/ ) in every single server. That will allow
you to go back to the moment before upgrade if really needed.
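
A one-liner per server along these lines would do; the destination path is just an example:

tar czf /root/glusterd-backup-$(date +%F).tar.gz /var/lib/glusterd/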

HTH,

Diego



On Tue, Aug 8, 2017 at 6:51 AM, Brett Randall  wrote:
> Hi all
>
> We have a 20-node, 1pb Gluster deployment that is running 3.6.3 - the same
> version we installed on day 1. There are obviously numerous performance and
> feature improvements that we'd like to take advantage of. However, this is a
> production system and we don't have a replica of it that we can test the
> upgrade on.
>
> We're running CentOS 6.6 with official Gluster binaries. We rely on
> Gluster's NFS daemon, and also use samba-glusterfs with samba for SMB access
> to our Gluster volume.
>
> What risks might we face with an upgrade from 3.6 to 3.10/3.11? And what
> rollback options do we have?
>
> More importantly, is there anyone who would be willing to work for a
> retainer plus worked hours to be "on call" in case we have problems during
> the upgrade? Someone with plenty of experience in Gluster over the years and
> could diagnose any issues we may experience in an upgrade. If you're
> interested, please e-mail me off-list. I'm, of course, interested in advice
> on-list as well.
>
> Thanks
>
> Brett.
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] How to delete geo-replication session?

2017-08-08 Thread mabi
Thank you very much Aravinda. I used your instructions and I have 
geo-replication working again and I am very happy about that.
It would be great if you could add this process to the documentation. I am sure others
will also benefit from it.
Hopefully one last question: as mentioned in a previous post on this mailing list
(http://lists.gluster.org/pipermail/gluster-users/2017-July/031906.html) I have
a count of 272 under the FAILURES column of my geo-replication sessions, and I was
wondering what the procedure is to fix or deal with this?
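
For context, the FAILURES column referred to here is the one reported per worker by the detailed status command, e.g. (volume and slave names as used earlier in this thread):

gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo status detail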

>  Original Message 
> Subject: Re: [Gluster-users] How to delete geo-replication session?
> Local Time: August 8, 2017 11:20 AM
> UTC Time: August 8, 2017 9:20 AM
> From: avish...@redhat.com
> To: mabi 
> Gluster Users 
> Sorry I missed your previous mail.
>
> Please perform the following steps once a new node is added
>
> - Run gsec create command again
> gluster system:: execute gsec_create
>
> - Run Geo-rep create command with force and run start force
>
> gluster volume geo-replication <master_volume> <slave_host>::<slave_volume> create
> push-pem force
> gluster volume geo-replication <master_volume> <slave_host>::<slave_volume> start force
> With these steps you will be able to stop/delete the Geo-rep session. I will
> add these steps to the documentation page
> (http://gluster.readthedocs.io/en/latest/Administrator%20Guide/Geo%20Replication/).
>
> regards
> Aravinda VK
>
> On 08/08/2017 12:08 PM, mabi wrote:
>
>> When I run the "gluster volume geo-replication status" I see my geo 
>> replication session correctly including the volume name under the "VOL" 
>> column. I see my two nodes (node1 and node2) but not arbiternode as I have 
>> added it later after setting up geo-replication. For more details have a 
>> quick look at my previous post here:
>> http://lists.gluster.org/pipermail/gluster-users/2017-July/031911.html
>> Sorry for repeating myself but again: how can I manually delete this 
>> problematic geo-replication session?
>> It seems to me that when I added the arbiternode it broke geo-replication.
>> Alternatively, how can I fix this situation? But I think the easiest would be
>> to delete the geo-replication session.
>> Regards,
>> Mabi
>>
>>>  Original Message 
>>> Subject: Re: [Gluster-users] How to delete geo-replication session?
>>> Local Time: August 8, 2017 7:19 AM
>>> UTC Time: August 8, 2017 5:19 AM
>>> From: avish...@redhat.com
>>> To: mabi (m...@protonmail.ch), Gluster Users (gluster-users@gluster.org)
>>> Do you see any session listed when the Geo-replication status command is
>>> run (without any volume name)?
>>>
>>> gluster volume geo-replication status
>>>
>>> Volume stop force should work even if Geo-replication session exists. From 
>>> the error it looks like node "arbiternode.domain.tld" in Master cluster is 
>>> down or not reachable.
>>>
>>> regards
>>> Aravinda VK
>>>
>>> On 08/07/2017 10:01 PM, mabi wrote:
>>>
 Hi,

 I would really like to get rid of this geo-replication session as I am 
 stuck with it right now. For example I can't even stop my volume as it 
 complains about that geo-replication...
 Can someone let me know how I can delete it?
 Thanks

>  Original Message 
> Subject: How to delete geo-replication session?
> Local Time: August 1, 2017 12:15 PM
> UTC Time: August 1, 2017 10:15 AM
> From: m...@protonmail.ch
> To: Gluster Users (gluster-users@gluster.org)
> Hi,
> I would like to delete a geo-replication session on my GlusterFS 3.8.11 
> replica 2 volume in order to re-create it. Unfortunately the "delete" 
> command does not work as you can see below:
> $ sudo gluster volume geo-replication myvolume 
> gfs1geo.domain.tld::myvolume-geo delete
> Staging failed on arbiternode.domain.tld. Error: Geo-replication session 
> between myvolume and arbiternode.domain.tld::myvolume-geo does not exist.
> geo-replication command failed
> I also tried with "force" but no luck here either:
> $ sudo gluster volume geo-replication myvolume 
> gfs1geo.domain.tld::myvolume-geo delete force
> Usage: volume geo-replication [<VOLNAME>] [<SLAVE-URL>] {create 
> [[ssh-port n] [[no-verify]|[push-pem]]] [force]|start [force]|stop 
> [force]|pause [force]|resume [force]|config|status [detail]|delete 
> [reset-sync-time]} [options...]
> So how can I delete my geo-replication session manually?
> Mind that I do not want to reset-sync-time, I would like to delete it and 
> re-create it so that it continues to geo replicate where it left from.
> Thanks,
> M.

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org

 http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list

[Gluster-users] How are bricks healed in Debian Jessie 3.11

2017-08-08 Thread Gerry O'Brien
Hi,

How are bricks healed in Debian Jessie 3.11? Is it at the file or
block level? The scenario we have in mind is a 2-brick replica volume
for storing VM file systems in a self-service IaaS, e.g. OpenNebula. If
one of the bricks is off-line for a period of time, all the VM file
systems will have been modified when the brick comes back on-line. As
some of these VM file systems are quite large, 100GB+, our guess is that
if healing involves copying entire files then it might be unworkable for
us, as the time to restore a few minutes' downtime of a single brick could
take days. Just copying modified blocks would be fine.

Can someone give us some insight on how healing works?

Regards,
Gerry

-- 
Gerry O'Brien

Systems Manager
School of Computer Science and Statistics
Trinity College Dublin
Dublin 2
IRELAND

00 353 1 896 1341


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Upgrading from 3.6.3 to 3.10/3.11

2017-08-08 Thread Brett Randall
Hi all

We have a 20-node, 1pb Gluster deployment that is running 3.6.3 - the same
version we installed on day 1. There are obviously numerous performance and
feature improvements that we'd like to take advantage of. However, this is
a production system and we don't have a replica of it that we can test the
upgrade on.

We're running CentOS 6.6 with official Gluster binaries. We rely on
Gluster's NFS daemon, and also use samba-glusterfs with samba for SMB
access to our Gluster volume.

What risks might we face with an upgrade from 3.6 to 3.10/3.11? And what
rollback options do we have?

More importantly, is there anyone who would be willing to work for a
retainer plus worked hours to be "on call" in case we have problems during
the upgrade? Someone with plenty of experience in Gluster over the years
and could diagnose any issues we may experience in an upgrade. If you're
interested, please e-mail me off-list. I'm, of course, interested in advice
on-list as well.

Thanks

Brett.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] How to delete geo-replication session?

2017-08-08 Thread Aravinda

Sorry I missed your previous mail.

Please perform the following steps once a new node is added

- Run gsec create command again
gluster system:: execute gsec_create

- Run Geo-rep create command with force and run start force

gluster volume geo-replication <master_volume> <slave_host>::<slave_volume> 
create push-pem force
gluster volume geo-replication <master_volume> <slave_host>::<slave_volume> 
start force


With these steps you will be able to stop/delete the Geo-rep session. I 
will add these steps to the documentation page 
(http://gluster.readthedocs.io/en/latest/Administrator%20Guide/Geo%20Replication/).
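
As a concrete sketch using the example names from this thread (myvolume and gfs1geo.domain.tld::myvolume-geo), the full sequence would look roughly like:

gluster system:: execute gsec_create
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo create push-pem force
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo start force
# once every node recognises the session, it can be stopped and deleted as usual:
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo stop force
gluster volume geo-replication myvolume gfs1geo.domain.tld::myvolume-geo delete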


regards
Aravinda VK

On 08/08/2017 12:08 PM, mabi wrote:
When I run the "gluster volume geo-replication status" I see my geo 
replication session correctly including the volume name under the 
"VOL" column. I see my two nodes (node1 and node2) but not arbiternode 
as I have added it later after setting up geo-replication. For more 
details have a quick look at my previous post here:


http://lists.gluster.org/pipermail/gluster-users/2017-July/031911.html

Sorry for repeating myself but again: how can I manually delete this 
problematic geo-replication session?


It seems to me that when I added the arbiternode it broke geo-replication.

Alternatively, how can I fix this situation? But I think the easiest 
would be to delete the geo-replication session.


Regards,
Mabi




 Original Message 
Subject: Re: [Gluster-users] How to delete geo-replication session?
Local Time: August 8, 2017 7:19 AM
UTC Time: August 8, 2017 5:19 AM
From: avish...@redhat.com
To: mabi , Gluster Users 

Do you see any session listed when the Geo-replication status command is 
run (without any volume name)?


gluster volume geo-replication status

Volume stop force should work even if Geo-replication session exists. 
From the error it looks like node "arbiternode.domain.tld" in Master 
cluster is down or not reachable.

regards
Aravinda VK
On 08/07/2017 10:01 PM, mabi wrote:

Hi,

I would really like to get rid of this geo-replication session as I 
am stuck with it right now. For example I can't even stop my volume 
as it complains about that geo-replication...


Can someone let me know how I can delete it?

Thanks




 Original Message 
Subject: How to delete geo-replication session?
Local Time: August 1, 2017 12:15 PM
UTC Time: August 1, 2017 10:15 AM
From: m...@protonmail.ch
To: Gluster Users 

Hi,

I would like to delete a geo-replication session on my GlusterFS 
3.8.11 replica 2 volume in order to re-create it. Unfortunately 
the "delete" command does not work as you can see below:


$ sudo gluster volume geo-replication myvolume 
gfs1geo.domain.tld::myvolume-geo delete


Staging failed on arbiternode.domain.tld. Error: Geo-replication 
session between myvolume and arbiternode.domain.tld::myvolume-geo 
does not exist.

geo-replication command failed

I also tried with "force" but no luck here either:

$ sudo gluster volume geo-replication myvolume 
gfs1geo.domain.tld::myvolume-geo delete force


Usage: volume geo-replication [<VOLNAME>] [<SLAVE-URL>] {create 
[[ssh-port n] [[no-verify]|[push-pem]]] [force]|start [force]|stop 
[force]|pause [force]|resume [force]|config|status [detail]|delete 
[reset-sync-time]} [options...]




So how can I delete my geo-replication session manually?

Mind that I do not want to reset-sync-time, I would like to delete 
it and re-create it so that it continues to geo replicate where it 
left from.


Thanks,
M.




___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users




___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster stuck when trying to list a successful mount

2017-08-08 Thread Ilan Schwarts
OK.
The issue was a bad disk/mount on one of the machines.

When I tried "df -h" from Node1 it got stuck.
So I removed the volume and the peer, and remounted the device.
Now it's OK.
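
For anyone hitting something similar, a quick sanity-check sequence on each node might look like this (gv0 and /mnt are the names used earlier in this thread):

df -h                       # confirm the brick filesystem is actually mounted
gluster peer status         # peers should show "Peer in Cluster (Connected)"
gluster volume status gv0   # bricks should show Online Y
ls /mnt                     # finally, list the fuse mount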

On Mon, Aug 7, 2017 at 5:18 PM, Ilan Schwarts  wrote:
> Hi all,
> My infrastructure is GlusterFS with 2 Nodes:
> L137B-GlusterFS-Node1.L137B-root.com
> L137B-GlusterFS-Node2.L137B-root.com
>
>
> I have followed this guide:
> http://www.itzgeek.com/how-tos/linux/centos-how-tos/install-and-configure-glusterfs-on-centos-7-rhel-7.html/2
>
> I have created a device, formatted it as ext4, and mounted it.
> The next step was to install GlusterFS.
> I installed glusterfs 3.10.3, created a volume and started it:
> [root@L137B-GlusterFS-Node2 someuser]# gluster volume info
>
> Volume Name: gv0
> Type: Replicate
> Volume ID: a606f77b-c0df-427c-99ec-cee98b3ecd71
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: L137B-GlusterFS-Node1.L137B-root.com:/mnt/gluster/gv0
> Brick2: L137B-GlusterFS-Node2.L137B-root.com:/mnt/gluster/gv0
> Options Reconfigured:
> transport.address-family: inet
> nfs.disable: on
>
>
> Now when I try to mount to Node1 from Node2, with the command:
> mount -t glusterfs L137B-GlusterFS-Node2.L137B-root.com:/gv0 /mnt
>
> L137B-GlusterFS-Node2.L137B-root.com:/gv0 on /mnt type fuse.glusterfs
> (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
> fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
>
>
>
> it returns OK, no errors, but when I do "ls -la /mnt" the console gets stuck.
> In "dmesg -T" I get (from a duplicate session on another console):
> [Mon Aug  7 17:05:58 2017] fuse init (API version 7.22)
> [Mon Aug  7 17:09:32 2017] INFO: task glusterfsd:3273 blocked for more
> than 120 seconds.
> [Mon Aug  7 17:09:32 2017] "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Mon Aug  7 17:09:32 2017] glusterfsd  D 880092e380b0 0
> 3273  1 0x0080
> [Mon Aug  7 17:09:32 2017]  8800a1e3bb70 0086
> 880139b7af10 8800a1e3bfd8
> [Mon Aug  7 17:09:32 2017]  8800a1e3bfd8 8800a1e3bfd8
> 880139b7af10 880092e380a8
> [Mon Aug  7 17:09:32 2017]  880092e380ac 880139b7af10
>  880092e380b0
> [Mon Aug  7 17:09:32 2017] Call Trace:
> [Mon Aug  7 17:09:32 2017]  []
> schedule_preempt_disabled+0x29/0x70
> [Mon Aug  7 17:09:32 2017]  []
> __mutex_lock_slowpath+0xc5/0x1c0
> [Mon Aug  7 17:09:32 2017]  [] ? unlazy_walk+0x87/0x140
> [Mon Aug  7 17:09:32 2017]  [] mutex_lock+0x1f/0x2f
> [Mon Aug  7 17:09:32 2017]  [] lookup_slow+0x33/0xa7
> [Mon Aug  7 17:09:32 2017]  [] link_path_walk+0x80f/0x8b0
> [Mon Aug  7 17:09:32 2017]  [] ? __remove_hrtimer+0x46/0xe0
> [Mon Aug  7 17:09:32 2017]  [] path_lookupat+0x6b/0x7a0
> [Mon Aug  7 17:09:32 2017]  [] ? futex_wait+0x1a3/0x280
> [Mon Aug  7 17:09:32 2017]  [] ? kmem_cache_alloc+0x35/0x1e0
> [Mon Aug  7 17:09:32 2017]  [] ? getname_flags+0x4f/0x1a0
> [Mon Aug  7 17:09:32 2017]  [] filename_lookup+0x2b/0xc0
> [Mon Aug  7 17:09:32 2017]  [] user_path_at_empty+0x67/0xc0
> [Mon Aug  7 17:09:32 2017]  [] ? futex_wake+0x80/0x160
> [Mon Aug  7 17:09:32 2017]  [] user_path_at+0x11/0x20
> [Mon Aug  7 17:09:32 2017]  [] vfs_fstatat+0x63/0xc0
> [Mon Aug  7 17:09:32 2017]  [] SYSC_newlstat+0x31/0x60
> [Mon Aug  7 17:09:32 2017]  [] ? SyS_futex+0x80/0x180
> [Mon Aug  7 17:09:32 2017]  [] ?
> __audit_syscall_exit+0x1e6/0x280
> [Mon Aug  7 17:09:32 2017]  [] SyS_newlstat+0xe/0x10
> [Mon Aug  7 17:09:32 2017]  [] 
> system_call_fastpath+0x16/0x1b
> [Mon Aug  7 17:11:32 2017] INFO: task glusterfsd:3273 blocked for more
> than 120 seconds.
> [Mon Aug  7 17:11:32 2017] "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Mon Aug  7 17:11:32 2017] glusterfsd  D 880092e380b0 0
> 3273  1 0x0080
> [Mon Aug  7 17:11:32 2017]  8800a1e3bb70 0086
> 880139b7af10 8800a1e3bfd8
> [Mon Aug  7 17:11:32 2017]  8800a1e3bfd8 8800a1e3bfd8
> 880139b7af10 880092e380a8
> [Mon Aug  7 17:11:32 2017]  880092e380ac 880139b7af10
>  880092e380b0
> [Mon Aug  7 17:11:32 2017] Call Trace:
> [Mon Aug  7 17:11:32 2017]  []
> schedule_preempt_disabled+0x29/0x70
> [Mon Aug  7 17:11:32 2017]  []
> __mutex_lock_slowpath+0xc5/0x1c0
> [Mon Aug  7 17:11:32 2017]  [] ? unlazy_walk+0x87/0x140
> [Mon Aug  7 17:11:32 2017]  [] mutex_lock+0x1f/0x2f
> [Mon Aug  7 17:11:32 2017]  [] lookup_slow+0x33/0xa7
> [Mon Aug  7 17:11:32 2017]  [] link_path_walk+0x80f/0x8b0
> [Mon Aug  7 17:11:32 2017]  [] ? __remove_hrtimer+0x46/0xe0
> [Mon Aug  7 17:11:32 2017]  [] path_lookupat+0x6b/0x7a0
> [Mon Aug  7 17:11:32 2017]  [] ? futex_wait+0x1a3/0x280
> [Mon Aug  7 17:11:32 2017]  [] ? kmem_cache_alloc+0x35/0x1e0
> [Mon Aug  7 17:11:32 2017]  [] ? getname_flags+0x4f/0x1a0
> [Mon Aug  7 17:11:32 2017]  [] filename_lookup+0x2b/0xc0
> [Mon Aug  7 17:11:32 2017]  [] 

Re: [Gluster-users] Mailing list question

2017-08-08 Thread Amar Tumballi
Looks like the issue is with Gmail, which keeps the mail only under the 'Sent'
label if there are no responses. You can reply to your own mail by going to
'Sent' and responding in the thread.

If someone responds to that thread, then you will get it in your 'Inbox'.

-Amar

On Tue, Aug 8, 2017 at 12:56 PM, Ilan Schwarts  wrote:

> Hi all,
>
> How can I answer my question or delete the thread?
> When I sent a question to gluster-users@gluster.org I didn't get the
> mail (probably by design), so I cannot reply to it with a solution.
>
> I see it in the archive:
> http://lists.gluster.org/pipermail/gluster-users/2017-August/032008.html
>
> But I cannot answer/delete it.
>
> Thanks
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>



-- 
Amar Tumballi (amarts)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Mailing list question

2017-08-08 Thread lemonnierk
Hi,

If you haven't subscribed to the mailing list, indeed you won't get it.
I'd say just "craft" a reply by using the same subject and putting Re: in
front of it.

Next time I'd advise subscribing before posting, even if you
unsubscribe a few days later once the problem is solved :)

On Tue, Aug 08, 2017 at 10:26:58AM +0300, Ilan Schwarts wrote:
> Hi all,
> 
> How can I answer my question or delete the thread?
> When I sent a question to gluster-users@gluster.org I didn't get the
> mail (probably by design), so I cannot reply to it with a solution.
> 
> I see it in the archive:
> http://lists.gluster.org/pipermail/gluster-users/2017-August/032008.html
> 
> But I cannot answer/delete it.
> 
> Thanks
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Mailing list question

2017-08-08 Thread Ilan Schwarts
Hi all,

How can I answer my question or delete the thread?
When I sent a question to gluster-users@gluster.org I didn't get the
mail (probably by design), so I cannot reply to it with a solution.

I see it in the archive:
http://lists.gluster.org/pipermail/gluster-users/2017-August/032008.html

But I cannot answer/delete it.

Thanks
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] How to delete geo-replication session?

2017-08-08 Thread mabi
When I run the "gluster volume geo-replication status" I see my geo replication 
session correctly including the volume name under the "VOL" column. I see my 
two nodes (node1 and node2) but not arbiternode as I have added it later after 
setting up geo-replication. For more details have a quick look at my previous 
post here:
http://lists.gluster.org/pipermail/gluster-users/2017-July/031911.html
Sorry for repeating myself but again: how can I manually delete this 
problematic geo-replication session?
It seems to me that when I added the arbiternode it broke geo-replication.
Alternatively, how can I fix this situation? But I think the easiest would be to 
delete the geo-replication session.
Regards,
Mabi

>  Original Message 
> Subject: Re: [Gluster-users] How to delete geo-replication session?
> Local Time: August 8, 2017 7:19 AM
> UTC Time: August 8, 2017 5:19 AM
> From: avish...@redhat.com
> To: mabi , Gluster Users 
> Do you see any session listed when the Geo-replication status command is 
> run (without any volume name)?
>
> gluster volume geo-replication status
>
> Volume stop force should work even if Geo-replication session exists. From 
> the error it looks like node "arbiternode.domain.tld" in Master cluster is 
> down or not reachable.
>
> regards
> Aravinda VK
>
> On 08/07/2017 10:01 PM, mabi wrote:
>
>> Hi,
>>
>> I would really like to get rid of this geo-replication session as I am stuck 
>> with it right now. For example I can't even stop my volume as it complains 
>> about that geo-replication...
>> Can someone let me know how I can delete it?
>> Thanks
>>
>>>  Original Message 
>>> Subject: How to delete geo-replication session?
>>> Local Time: August 1, 2017 12:15 PM
>>> UTC Time: August 1, 2017 10:15 AM
>>> From: m...@protonmail.ch
>>> To: Gluster Users (gluster-users@gluster.org)
>>> Hi,
>>> I would like to delete a geo-replication session on my GlusterFS 3.8.11 
>>> replica 2 volume in order to re-create it. Unfortunately the "delete" 
>>> command does not work as you can see below:
>>> $ sudo gluster volume geo-replication myvolume 
>>> gfs1geo.domain.tld::myvolume-geo delete
>>> Staging failed on arbiternode.domain.tld. Error: Geo-replication session 
>>> between myvolume and arbiternode.domain.tld::myvolume-geo does not exist.
>>> geo-replication command failed
>>> I also tried with "force" but no luck here either:
>>> $ sudo gluster volume geo-replication myvolume 
>>> gfs1geo.domain.tld::myvolume-geo delete force
>>> Usage: volume geo-replication [<VOLNAME>] [<SLAVE-URL>] {create [[ssh-port 
>>> n] [[no-verify]|[push-pem]]] [force]|start [force]|stop [force]|pause 
>>> [force]|resume [force]|config|status [detail]|delete [reset-sync-time]} 
>>> [options...]
>>> So how can I delete my geo-replication session manually?
>>> Mind that I do not want to reset-sync-time, I would like to delete it and 
>>> re-create it so that it continues to geo replicate where it left from.
>>> Thanks,
>>> M.
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>>
>> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users