[Gluster-users] geo-replication - OSError: [Errno 1] Operation not permitted - failing with socket files?

2019-03-17 Thread Davide Obbi
Hi,

I am trying to understand why geo-replication, during the "History Crawl", starts
failing on each of the three bricks, one after the other. I have enabled
DEBUG for all the logs configurable by the geo-replication command.

Running glusterfs v4.16, the behaviour is as follows:
- the "History Crawl" worked fine for about one hour and actually replicated
some files and folders, although most of them look empty
- at some point the session becomes faulty, tries to start on another brick,
goes faulty again, and so on
- in the logs, the Python exception mentioned in the subject is raised:
[2019-03-17 18:52:49.565040] E [syncdutils(worker
/var/lib/heketi/mounts/vg_b088aec908c959c75674e01fb8598c21/brick_f90f425ecb89c3eec6ef2ef4a2f0a973/brick):332:log_raise_exception]
:
FAIL:

Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, in
main
func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, in
subcmd_worker
local.service_loop(remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1291,
in service_loop
g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 615, in
crawlwrap
self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1569, in
crawl
self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1469, in
changelogs_batch_process
self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1304, in
process
self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1203, in
process_change
failures = self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216, in
__call__
return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198, in
__call__
raise res
OSError: [Errno 1] Operation not permitted

- The operation before the exception:
[2019-03-17 18:52:49.545103] D [master(worker
/var/lib/heketi/mounts/vg_b088aec908c959c75674e01fb8598c21/brick_f90f425ecb89c3eec6ef2ef4a2f0a973/brick):1186:process_change]
_GMaster: entries: [{'uid': 7575, 'gfid':
'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'gid': 100, 'mode'
: 49536, 'entry':
'.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715.6lvofzOuVnfAwOwY',
'op': 'MKNOD'}, {'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'entry':
'.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9
234d005406a13deb4375459715', 'stat': {'atime': 1552661403.3846507, 'gid':
100, 'mtime': 1552661403.3846507, 'uid': 7575, 'mode': 49536}, 'link':
None, 'op': 'LINK'}, {'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3',
'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.con
trol_f7c33270dc9db9234d005406a13deb4375459715.6lvofzOuVnfAwOwY', 'op':
'UNLINK'}]
[2019-03-17 18:52:49.548614] D [repce(worker
/var/lib/heketi/mounts/vg_b088aec908c959c75674e01fb8598c21/brick_f90f425ecb89c3eec6ef2ef4a2f0a973/brick):179:push]
RepceClient: call 56917:140179359156032:1552848769.55 entry_ops([{'uid':
7575, 'gfid': 'e1ad7c98-f32a-4e48-9902-
cc75840de7c3', 'gid': 100, 'mode': 49536, 'entry':
'.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715.6lvofzOuVnfAwOwY',
'op': 'MKNOD'}, {'gfid': '*e1ad7c98-f32a-4e48-9902-cc75840de7c3*', 'entry':
'.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b
129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715', 'stat':
{'atime': 1552661403.3846507, 'gid': 100, 'mtime': 1552661403.3846507,
'uid': 7575, 'mode': 49536}, 'link': None, 'op': 'LINK'}, {'gfid':
'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'entry': '.gfid/5219e4b8
-a1f3-4a4e-b9c7-c9b129abe671/*.control_f7c33270dc9db9234d005406a13deb4375459715.6lvofzOuVnfAwOwY',
'op'*: 'UNLINK'}],) ...

- The highlighted gfid points to these control files, which are unix
sockets, as shown below:
rw---  2 pippo users 0 Mar 14 16:32
.control_31c3a99664c1f956f949311e58434037e6a52d22
srw---  2 pippo users 0 Mar 14 16:33
.control_a9b82937042529bca677b9f43eba9eb02ca7c5ee
srw---  2 pippo users 0 Mar 14 16:32
.control_f429221460d52570066d9f25521011fe7e081cf5
srw---  2 pippo users 0 Mar 15 15:50
.control_f7c33270dc9db9234d005406a13deb4375459715
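
(Side note: the 'mode': 49536 in the entry_ops payload above decodes to
0o140600, i.e. S_IFSOCK | 0600, which confirms these entries really are unix
sockets. A quick check on any box with Python available:

  python -c 'import stat; print(oct(49536), stat.S_ISSOCK(49536))'
  # expected output (Python 3): 0o140600 True
)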

So it seems geo-replication should at least skip such files rather
than raise an exception? Am I the first to experience this behaviour?

thanks in advance
Davide
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: Samba+Gluster: Performance measurements for small files

2019-01-22 Thread Davide Obbi
Hi David,

I haven't tested Samba, only the glusterfs FUSE mount. I posted the results a
few months ago; the tests were conducted using gluster 4.1.5:

*Options Reconfigured:*
client.event-threads 3
performance.cache-size 8GB
performance.io-thread-count 24
network.inode-lru-limit 1048576
performance.parallel-readdir on
performance.cache-invalidation on
performance.md-cache-timeout 600
features.cache-invalidation on
features.cache-invalidation-timeout 600
performance.client-io-threads on
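
(These are applied with the standard CLI; a minimal sketch, assuming a volume
named "myvol":

  gluster volume set myvol client.event-threads 3
  gluster volume set myvol performance.cache-size 8GB
  gluster volume set myvol performance.io-thread-count 24
  ...and so on for the remaining options.
)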

nr of clients 6
Network 10Gb
Clients Mem 128GB
Clients Cores 22
Centos 7.5.1804
Kernel 3.10.0-862.14.4.el7.x86_64



nr of servers/bricks per volume 3
Network 100Gb
*node to node is 100Gb, clients 10Gb
Server Mem 377GB
Server Cores 56 *Intel 5120 CPU
Storage 4x8TB NVME
Centos 7.5.1804
Kernel 3.10.0-862.14.4.el7.x86_64

These, for example, are the FOPS with 128K IO size (considered the sweet spot
for glusterfs according to the documentation). In blue, 8 threads per client;
in red, 4 threads per client.
[image: image.png]
Below 4K
[image: image.png]
and 1MB
[image: image.png]

On Tue, Jan 22, 2019 at 9:09 AM David Spisla  wrote:

> Hello Amar,
> thank you for the advice. We already use nl-cache option and a bunch of
> other settings. At the moment we try the samba-vfs-glusterfs plugin to
> access a gluster volume via samba. The performance increase now.
> Additionally we are looking for some performance measurements to compare
> with. Maybe someone in the community also does performance tests. Does
> Redhat has some official reference measurement?
>
> Regards
> David Spisla
>
> Am Di., 22. Jan. 2019 um 07:14 Uhr schrieb Amar Tumballi Suryanarayan <
> atumb...@redhat.com>:
>
>> For Samba usecase, please make sure you have nl-cache (ie,
>> 'negative-lookup cache') enabled. We have seen some improvements from this
>> value.
>>
>> -Amar
>>
>> On Tue, Dec 18, 2018 at 8:23 PM David Spisla  wrote:
>>
>>> Dear Gluster Community,
>>>
>>> it is a known fact that Samba+Gluster has a bad smallfile performance.
>>> We now have some test measurements created by this setup: 2-Node-Cluster on
>>> real hardware with Replica-2 Volume (just one subvolume), Gluster v.4.1.6,
>>> Samba v4.7. Samba writes to Gluster via FUSE. Files created by fio. We used
>>> a Windows System as Client which is in the same network like the servers.
>>>
>>> The measurements are as follows. In each test case 400 files were
>>> written:
>>>
>>>              64KiB_x_400 files   1MiB_x_400 files   10MiB_x_400 files
>>> 1 Thread     0,77 MiB/s          8,05 MiB/s         72,67 MiB/s
>>> 4 Threads    0,86 MiB/s          8,92 MiB/s         90,38 MiB/s
>>> 8 Threads    0,87 MiB/s          8,92 MiB/s         94,75 MiB/s
>>>
>>> Does anyone have measurements that are in a similar range or are 
>>> significantly different?
>>> We do not know which values can still be considered "normal" and which are 
>>> not.
>>> We also know that there are options to improve performance. But first of 
>>> all we are interested
>>> in whether there are reference values.
>>> Regards
>>> David Spisla
>>>
>>> _______
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>> --
>> Amar Tumballi (amarts)
>>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users



-- 
Davide Obbi
Senior System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
Empowering people to experience the world since 1996
43 languages, 214+ offices worldwide, 141,000+ global destinations, 29
million reported listings
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] To good to be truth speed improvements?

2019-01-15 Thread Davide Obbi
I think you can find the volume options by doing a grep -R option
/var/lib/glusterd/vols/ ; the .vol files show the options.
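
For example, for the "export" volume from this thread, run on one of the
gluster servers:

  grep -R option /var/lib/glusterd/vols/export/
  # or, for the resolved value of every option:
  gluster volume get export all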

On Tue, Jan 15, 2019 at 2:28 PM Diego Remolina  wrote:

> Hi Davide,
>
> The options information is already provided in prior e-mail, see the
> termbin.con link for the options of the volume after the 4.1.6 upgrade.
>
> The gluster options set on the volume are:
> https://termbin.com/yxtd
>
> This is the other piece:
>
> # gluster v info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b4353b3f-6ef6-4813-819a-8e85e5a95cff
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 10.0.1.7:/bricks/hdds/brick
> Brick2: 10.0.1.6:/bricks/hdds/brick
> Options Reconfigured:
> performance.stat-prefetch: on
> performance.cache-min-file-size: 0
> network.inode-lru-limit: 65536
> performance.cache-invalidation: on
> features.cache-invalidation: on
> performance.md-cache-timeout: 600
> features.cache-invalidation-timeout: 600
> performance.cache-samba-metadata: on
> transport.address-family: inet
> server.allow-insecure: on
> performance.cache-size: 10GB
> cluster.server-quorum-type: server
> nfs.disable: on
> performance.io-thread-count: 64
> performance.io-cache: on
> cluster.lookup-optimize: on
> cluster.readdir-optimize: on
> server.event-threads: 5
> client.event-threads: 5
> performance.cache-max-file-size: 256MB
> diagnostics.client-log-level: INFO
> diagnostics.brick-log-level: INFO
> cluster.server-quorum-ratio: 51%
>
> Now I did create a backup of /var/lib/glusterd so if you tell me how to
> pull information from there to compare I can do it.
>
> I compared the file /var/lib/glusterd/vols/export/info and it is the same
> in both, though entries are in different order.
>
> Diego
>
>
>
>
> On Tue, Jan 15, 2019 at 5:03 AM Davide Obbi 
> wrote:
>
>>
>>
>> On Tue, Jan 15, 2019 at 2:18 AM Diego Remolina 
>> wrote:
>>
>>> Dear all,
>>>
>>> I was running gluster 3.10.12 on a pair of servers and recently upgraded
>>> to 4.1.6. There is a cron job that runs nightly in one machine, which
>>> rsyncs the data on the servers over to another machine for backup purposes.
>>> The rsync operation runs on one of the gluster servers, which mounts the
>>> gluster volume via fuse on /export.
>>>
>>> When using 3.10.12, this process would start at 8:00PM nightly, and
>>> usually end up at around 4:30AM when the servers had been freshly rebooted.
>>> From this point, things would start taking a bit longer and stabilize
>>> ending at around 7-9AM depending on actual file changes and at some point
>>> the servers would start eating up so much ram (up to 30GB) and I would have
>>> to reboot them to bring things back to normal as the file system would
>>> become extremely slow (perhaps the memory leak I have read was present on
>>> 3.10.x).
>>>
>>> After upgrading to 4.1.6 over the weekend, I was shocked to see the
>>> rsync process finish in about 1 hour and 26 minutes. This is compared to 8
>>> hours 30 mins with the older version. This is a nice speed up, however, I
>>> can only ask myself what has changed so drastically that this process is
>>> now so fast. Have there really been improvements in 4.1.6 that could speed
>>> this up so dramatically? In both of my test cases, there would had not
>>> really been a lot to copy via rsync given the fresh reboots are done on
>>> Saturday after the sync has finished from the day before.
>>>
>>> In general, the servers (which are accessed via samba for windows
>>> clients) are much faster and responsive since the update to 4.1.6. Tonight
>>> I will have the first rsync run which will actually have to copy the day's
>>> changes and will have another point of comparison.
>>>
>>> I am still using fuse mounts for samba, due to prior problems with vsf
>>> =gluster, which are currently present in Samba 4.8.3-4, and already
>>> documented in bugs, for which patches exist, but no official updated samba
>>> packages have been released yet. Since I was going from 3.10.12 to 4.1.6 I
>>> also did not want to change other things to make sure I could track any
>>> issues just related to the change in gluster versions and eliminate other
>>> complexity.
>>>
>>> The file system currently has about 16TB of data in
>>> 5142816 files and 696544 directories
>>>

Re: [Gluster-users] [External] To good to be truth speed improvements?

2019-01-15 Thread Davide Obbi
On Tue, Jan 15, 2019 at 2:18 AM Diego Remolina  wrote:

> Dear all,
>
> I was running gluster 3.10.12 on a pair of servers and recently upgraded
> to 4.1.6. There is a cron job that runs nightly in one machine, which
> rsyncs the data on the servers over to another machine for backup purposes.
> The rsync operation runs on one of the gluster servers, which mounts the
> gluster volume via fuse on /export.
>
> When using 3.10.12, this process would start at 8:00PM nightly, and
> usually end up at around 4:30AM when the servers had been freshly rebooted.
> From this point, things would start taking a bit longer and stabilize
> ending at around 7-9AM depending on actual file changes and at some point
> the servers would start eating up so much ram (up to 30GB) and I would have
> to reboot them to bring things back to normal as the file system would
> become extremely slow (perhaps the memory leak I have read was present on
> 3.10.x).
>
> After upgrading to 4.1.6 over the weekend, I was shocked to see the rsync
> process finish in about 1 hour and 26 minutes. This is compared to 8 hours
> 30 mins with the older version. This is a nice speed up, however, I can
> only ask myself what has changed so drastically that this process is now so
> fast. Have there really been improvements in 4.1.6 that could speed this up
> so dramatically? In both of my test cases, there would had not really been
> a lot to copy via rsync given the fresh reboots are done on Saturday after
> the sync has finished from the day before.
>
> In general, the servers (which are accessed via samba for windows clients)
> are much faster and responsive since the update to 4.1.6. Tonight I will
> have the first rsync run which will actually have to copy the day's changes
> and will have another point of comparison.
>
> I am still using fuse mounts for samba, due to prior problems with vsf
> =gluster, which are currently present in Samba 4.8.3-4, and already
> documented in bugs, for which patches exist, but no official updated samba
> packages have been released yet. Since I was going from 3.10.12 to 4.1.6 I
> also did not want to change other things to make sure I could track any
> issues just related to the change in gluster versions and eliminate other
> complexity.
>
> The file system currently has about 16TB of data in
> 5142816 files and 696544 directories
>
> I've just ran the following code to count files and dirs and it took
> 67mins 38.957 secs to complete in this gluster volume:
> https://github.com/ChristopherSchultz/fast-file-count
>
> # time ( /root/sbin/dircnt /export )
> /export contains 5142816 files and 696544 directories
>
> real67m38.957s
> user0m6.225s
> sys 0m48.939s
>
> The gluster options set on the volume are:
> https://termbin.com/yxtd
>
> # gluster v status export
> Status of volume: export
> Gluster process TCP Port  RDMA Port  Online
> Pid
>
> --
> Brick 10.0.1.7:/bricks/hdds/brick   49157 0  Y
>  13986
> Brick 10.0.1.6:/bricks/hdds/brick   49153 0  Y
>  9953
> Self-heal Daemon on localhost   N/A   N/AY
>  21934
> Self-heal Daemon on 10.0.1.5N/A   N/AY
>  4598
> Self-heal Daemon on 10.0.1.6N/A   N/AY
>  14485
>
> Task Status of Volume export
>
> --
> There are no active volume tasks
>
> Truth, there is a 3rd server here, but no bricks on it.
>
> Thoughts?
>
> Diego
>
>
> 
>  Virus-free.
> www.avast.com
> 
> <#m_8084651329793795211_m_7462352325940458688_m_-6479459361629161759_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users


Hi Diego,

Besides the actual improvements made in the code, I think new releases might
set volume options by default that previously had different values. It would
have been interesting to diff "gluster volume get <volname> all" before and
after the upgrade. Out of curiosity, and because I am trying to figure out
volume options for rsync-like workloads, can you share the command output
anyway, along with gluster volume info?
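
Something along these lines, had the output been captured before the upgrade
(file names are only examples):

  # before the upgrade
  gluster volume get export all > /tmp/export-options-3.10.12.txt
  # after the upgrade
  gluster volume get export all > /tmp/export-options-4.1.6.txt
  diff /tmp/export-options-3.10.12.txt /tmp/export-options-4.1.6.txt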

thanks
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: A broken file that can not be deleted

2019-01-10 Thread Davide Obbi
Does self-heal report anything?
Did you re-mount on the client side?
How are the permissions displayed for the file on the servers?
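
A quick way to check those three things, assuming the volume is called "myvol":

  # does self-heal report anything / is it a split-brain?
  gluster volume heal myvol info
  gluster volume heal myvol info split-brain

  # on each server, look at the file directly on the brick path
  ls -la /path/to/brick/dir/.download_suspensions.memo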

On Thu, Jan 10, 2019 at 10:00 AM Raghavendra Gowdappa 
wrote:

>
>
> On Wed, Jan 9, 2019 at 7:48 PM Dmitry Isakbayev  wrote:
>
>> I am seeing a broken file that exists on 2 out of 3 nodes.
>>
>
> Wondering whether its a case of split brain.
>
>
>> The application trying to use the file throws file permissions error.
>> ls, rm, mv, touch all throw "Input/output error"
>>
>> $ ls -la
>> ls: cannot access .download_suspensions.memo: Input/output error
>> drwxrwxr-x. 2 ossadmin ossadmin  4096 Jan  9 08:06 .
>> drwxrwxr-x. 5 ossadmin ossadmin  4096 Jan  3 11:36 ..
>> -?? ? ????
>> .download_suspensions.memo
>>
>> $ rm ".download_suspensions.memo"
>> rm: cannot remove ‘.download_suspensions.memo’: Input/output error
>>
>>
>>
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users



-- 
Davide Obbi
Senior System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
Empowering people to experience the world since 1996
43 languages, 214+ offices worldwide, 141,000+ global destinations, 29
million reported listings
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: Input/output error on FUSE log

2019-01-07 Thread Davide Obbi
Then my last idea would be to try creating the same files, or running the
application, on the other volumes. Sorry I can't help further, but I will be
interested in the solution!

On Mon, Jan 7, 2019 at 7:52 PM Matt Waymack  wrote:

> Yep, first unmount/remounted, then rebooted clients.  Stopped/started the
> volumes, and rebooted all nodes.
>
>
>
> *From:* Davide Obbi 
> *Sent:* Monday, January 7, 2019 12:47 PM
> *To:* Matt Waymack 
> *Cc:* Raghavendra Gowdappa ;
> gluster-users@gluster.org List 
> *Subject:* Re: [External] Re: [Gluster-users] Input/output error on FUSE
> log
>
>
>
> i guess you tried already unmounting, stop/star and mounting?
>
>
>
> On Mon, Jan 7, 2019 at 7:44 PM Matt Waymack  wrote:
>
> Yes, all volumes use sharding.
>
>
>
> *From:* Davide Obbi 
> *Sent:* Monday, January 7, 2019 12:43 PM
> *To:* Matt Waymack 
> *Cc:* Raghavendra Gowdappa ;
> gluster-users@gluster.org List 
> *Subject:* Re: [External] Re: [Gluster-users] Input/output error on FUSE
> log
>
>
>
> are all the volumes being configured with sharding?
>
>
>
> On Mon, Jan 7, 2019 at 5:35 PM Matt Waymack  wrote:
>
> I think that I can rule out network as I have multiple volumes on the same
> nodes and not all volumes are affected.  Additionally, access via SMB using
> samba-vfs-glusterfs is not affected, even on the same volumes.   This is
> seemingly only affecting the FUSE clients.
>
>
>
> *From:* Davide Obbi 
> *Sent:* Sunday, January 6, 2019 12:26 PM
> *To:* Raghavendra Gowdappa 
> *Cc:* Matt Waymack ; gluster-users@gluster.org List <
> gluster-users@gluster.org>
> *Subject:* Re: [External] Re: [Gluster-users] Input/output error on FUSE
> log
>
>
>
> Hi,
>
>
>
> i would start doing some checks like: "(Input/output error)" seems
> returned by the operating system, this happens for instance trying to
> access a file system which is on a device not available so i would check
> the network connectivity between the client to servers  and server to
> server during the reported time.
>
>
>
> Regards
>
> Davide
>
>
>
> On Sun, Jan 6, 2019 at 3:32 AM Raghavendra Gowdappa 
> wrote:
>
>
>
>
>
> On Sun, Jan 6, 2019 at 7:58 AM Raghavendra Gowdappa 
> wrote:
>
>
>
>
>
> On Sun, Jan 6, 2019 at 4:19 AM Matt Waymack  wrote:
>
> Hi all,
>
>
>
> I'm having a problem writing to our volume.  When writing files larger
> than about 2GB, I get an intermittent issue where the write will fail and
> return Input/Output error.  This is also shown in the FUSE log of the
> client (this is affecting all clients).  A snip of a client log is below:
>
> [2019-01-05 22:39:44.581371] W [fuse-bridge.c:2474:fuse_writev_cbk]
> 0-glusterfs-fuse: 51040978: WRITE => -1
> gfid=82a0b5c4-7ef3-43c2-ad86-41e16673d7c2 fd=0x7f949839a368 (Input/output
> error)
>
> [2019-01-05 22:39:44.598392] W [fuse-bridge.c:1441:fuse_err_cbk]
> 0-glusterfs-fuse: 51040979: FLUSH() ERR => -1 (Input/output error)
>
> [2019-01-05 22:39:47.420920] W [fuse-bridge.c:2474:fuse_writev_cbk]
> 0-glusterfs-fuse: 51041266: WRITE => -1
> gfid=0e8e1e13-97a5-478a-bc58-e81ddf3698a3 fd=0x7f949809b7f8 (Input/output
> error)
>
> [2019-01-05 22:39:47.433377] W [fuse-bridge.c:1441:fuse_err_cbk]
> 0-glusterfs-fuse: 51041267: FLUSH() ERR => -1 (Input/output error)
>
> [2019-01-05 22:39:50.441531] W [fuse-bridge.c:2474:fuse_writev_cbk]
> 0-glusterfs-fuse: 51041548: WRITE => -1
> gfid=0e8e1e13-97a5-478a-bc58-e81ddf3698a3 fd=0x7f949839a368 (Input/output
> error)
>
> [2019-01-05 22:39:50.451914] W [fuse-bridge.c:1441:fuse_err_cbk]
> 0-glusterfs-fuse: 51041549: FLUSH() ERR => -1 (Input/output error)
>
> The message "W [MSGID: 109011] [dht-layout.c:163:dht_layout_search]
> 0-gv1-dht: no subvolume for hash (value) = 1311504267" repeated 1721 times
> between [2019-01-05 22:39:33.906241] and [2019-01-05 22:39:44.598371]
>
> The message "E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk]
> 0-gv1-dht: dict is null" repeated 1714 times between [2019-01-05
> 22:39:33.925981] and [2019-01-05 22:39:50.451862]
>
> The message "W [MSGID: 109011] [dht-layout.c:163:dht_layout_search]
> 0-gv1-dht: no subvolume for hash (value) = 1137142622" repeated 1707 times
> between [2019-01-05 22:39:39.636552] and [2019-01-05 22:39:50.451895]
>
>
>
> This looks to be a DHT issue. Some questions:
>
> * Are all subvolumes of DHT up and client is connected to them?
> Particularly the subvolume which contains the file in question.
>
> * Can you get all extended attributes of parent directory of the file from
> all bricks?
>
> * set diagnostics.client-log-level to 

Re: [Gluster-users] [External] Re: Input/output error on FUSE log

2019-01-07 Thread Davide Obbi
Are all the volumes configured with sharding?
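
(A quick way to confirm, with the volume name as a placeholder:

  gluster volume get myvol features.shard
  gluster volume get myvol features.shard-block-size
)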

On Mon, Jan 7, 2019 at 5:35 PM Matt Waymack  wrote:

> I think that I can rule out network as I have multiple volumes on the same
> nodes and not all volumes are affected.  Additionally, access via SMB using
> samba-vfs-glusterfs is not affected, even on the same volumes.   This is
> seemingly only affecting the FUSE clients.
>
>
>
> *From:* Davide Obbi 
> *Sent:* Sunday, January 6, 2019 12:26 PM
> *To:* Raghavendra Gowdappa 
> *Cc:* Matt Waymack ; gluster-users@gluster.org List <
> gluster-users@gluster.org>
> *Subject:* Re: [External] Re: [Gluster-users] Input/output error on FUSE
> log
>
>
>
> Hi,
>
>
>
> i would start doing some checks like: "(Input/output error)" seems
> returned by the operating system, this happens for instance trying to
> access a file system which is on a device not available so i would check
> the network connectivity between the client to servers  and server to
> server during the reported time.
>
>
>
> Regards
>
> Davide
>
>
>
> On Sun, Jan 6, 2019 at 3:32 AM Raghavendra Gowdappa 
> wrote:
>
>
>
>
>
> On Sun, Jan 6, 2019 at 7:58 AM Raghavendra Gowdappa 
> wrote:
>
>
>
>
>
> On Sun, Jan 6, 2019 at 4:19 AM Matt Waymack  wrote:
>
> Hi all,
>
>
>
> I'm having a problem writing to our volume.  When writing files larger
> than about 2GB, I get an intermittent issue where the write will fail and
> return Input/Output error.  This is also shown in the FUSE log of the
> client (this is affecting all clients).  A snip of a client log is below:
>
> [2019-01-05 22:39:44.581371] W [fuse-bridge.c:2474:fuse_writev_cbk]
> 0-glusterfs-fuse: 51040978: WRITE => -1
> gfid=82a0b5c4-7ef3-43c2-ad86-41e16673d7c2 fd=0x7f949839a368 (Input/output
> error)
>
> [2019-01-05 22:39:44.598392] W [fuse-bridge.c:1441:fuse_err_cbk]
> 0-glusterfs-fuse: 51040979: FLUSH() ERR => -1 (Input/output error)
>
> [2019-01-05 22:39:47.420920] W [fuse-bridge.c:2474:fuse_writev_cbk]
> 0-glusterfs-fuse: 51041266: WRITE => -1
> gfid=0e8e1e13-97a5-478a-bc58-e81ddf3698a3 fd=0x7f949809b7f8 (Input/output
> error)
>
> [2019-01-05 22:39:47.433377] W [fuse-bridge.c:1441:fuse_err_cbk]
> 0-glusterfs-fuse: 51041267: FLUSH() ERR => -1 (Input/output error)
>
> [2019-01-05 22:39:50.441531] W [fuse-bridge.c:2474:fuse_writev_cbk]
> 0-glusterfs-fuse: 51041548: WRITE => -1
> gfid=0e8e1e13-97a5-478a-bc58-e81ddf3698a3 fd=0x7f949839a368 (Input/output
> error)
>
> [2019-01-05 22:39:50.451914] W [fuse-bridge.c:1441:fuse_err_cbk]
> 0-glusterfs-fuse: 51041549: FLUSH() ERR => -1 (Input/output error)
>
> The message "W [MSGID: 109011] [dht-layout.c:163:dht_layout_search]
> 0-gv1-dht: no subvolume for hash (value) = 1311504267" repeated 1721 times
> between [2019-01-05 22:39:33.906241] and [2019-01-05 22:39:44.598371]
>
> The message "E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk]
> 0-gv1-dht: dict is null" repeated 1714 times between [2019-01-05
> 22:39:33.925981] and [2019-01-05 22:39:50.451862]
>
> The message "W [MSGID: 109011] [dht-layout.c:163:dht_layout_search]
> 0-gv1-dht: no subvolume for hash (value) = 1137142622" repeated 1707 times
> between [2019-01-05 22:39:39.636552] and [2019-01-05 22:39:50.451895]
>
>
>
> This looks to be a DHT issue. Some questions:
>
> * Are all subvolumes of DHT up and client is connected to them?
> Particularly the subvolume which contains the file in question.
>
> * Can you get all extended attributes of parent directory of the file from
> all bricks?
>
> * set diagnostics.client-log-level to TRACE, capture these errors again
> and attach the client log file.
>
>
>
> I spoke a bit early. dht_writev doesn't search hashed subvolume as its
> already been looked up in lookup. So, these msgs looks to be of a different
> issue - not  writev failure.
>
>
>
>
>
> This is intermittent for most files, but eventually if a file is large
> enough it will not write.  The workflow is SFTP tot he client which then
> writes to the volume over FUSE.  When files get to a certain point,w e can
> no longer write to them.  The file sizes are different as well, so it's not
> like they all get to the same size and just stop either.  I've ruled out a
> free space issue, our files at their largest are only a few hundred GB and
> we have tens of terrabytes free on each brick.  We are also sharding at 1GB.
>
>
>
> I'm not sure where to go from here as the error seems vague and I can only
> see it on the client log.  I'm not seeing these errors on the nodes
> themselves.  This is also seen if I mount the volume via FUSE on any o

Re: [Gluster-users] [External] Re: Input/output error on FUSE log

2019-01-07 Thread Davide Obbi
I guess you already tried unmounting, stopping/starting the volume, and re-mounting?

On Mon, Jan 7, 2019 at 7:44 PM Matt Waymack  wrote:

> Yes, all volumes use sharding.
>
>
>
> *From:* Davide Obbi 
> *Sent:* Monday, January 7, 2019 12:43 PM
> *To:* Matt Waymack 
> *Cc:* Raghavendra Gowdappa ;
> gluster-users@gluster.org List 
> *Subject:* Re: [External] Re: [Gluster-users] Input/output error on FUSE
> log
>
>
>
> are all the volumes being configured with sharding?
>
>
>
> On Mon, Jan 7, 2019 at 5:35 PM Matt Waymack  wrote:
>
> I think that I can rule out network as I have multiple volumes on the same
> nodes and not all volumes are affected.  Additionally, access via SMB using
> samba-vfs-glusterfs is not affected, even on the same volumes.   This is
> seemingly only affecting the FUSE clients.
>
>
>
> *From:* Davide Obbi 
> *Sent:* Sunday, January 6, 2019 12:26 PM
> *To:* Raghavendra Gowdappa 
> *Cc:* Matt Waymack ; gluster-users@gluster.org List <
> gluster-users@gluster.org>
> *Subject:* Re: [External] Re: [Gluster-users] Input/output error on FUSE
> log
>
>
>
> Hi,
>
>
>
> i would start doing some checks like: "(Input/output error)" seems
> returned by the operating system, this happens for instance trying to
> access a file system which is on a device not available so i would check
> the network connectivity between the client to servers  and server to
> server during the reported time.
>
>
>
> Regards
>
> Davide
>
>
>
> On Sun, Jan 6, 2019 at 3:32 AM Raghavendra Gowdappa 
> wrote:
>
>
>
>
>
> On Sun, Jan 6, 2019 at 7:58 AM Raghavendra Gowdappa 
> wrote:
>
>
>
>
>
> On Sun, Jan 6, 2019 at 4:19 AM Matt Waymack  wrote:
>
> Hi all,
>
>
>
> I'm having a problem writing to our volume.  When writing files larger
> than about 2GB, I get an intermittent issue where the write will fail and
> return Input/Output error.  This is also shown in the FUSE log of the
> client (this is affecting all clients).  A snip of a client log is below:
>
> [2019-01-05 22:39:44.581371] W [fuse-bridge.c:2474:fuse_writev_cbk]
> 0-glusterfs-fuse: 51040978: WRITE => -1
> gfid=82a0b5c4-7ef3-43c2-ad86-41e16673d7c2 fd=0x7f949839a368 (Input/output
> error)
>
> [2019-01-05 22:39:44.598392] W [fuse-bridge.c:1441:fuse_err_cbk]
> 0-glusterfs-fuse: 51040979: FLUSH() ERR => -1 (Input/output error)
>
> [2019-01-05 22:39:47.420920] W [fuse-bridge.c:2474:fuse_writev_cbk]
> 0-glusterfs-fuse: 51041266: WRITE => -1
> gfid=0e8e1e13-97a5-478a-bc58-e81ddf3698a3 fd=0x7f949809b7f8 (Input/output
> error)
>
> [2019-01-05 22:39:47.433377] W [fuse-bridge.c:1441:fuse_err_cbk]
> 0-glusterfs-fuse: 51041267: FLUSH() ERR => -1 (Input/output error)
>
> [2019-01-05 22:39:50.441531] W [fuse-bridge.c:2474:fuse_writev_cbk]
> 0-glusterfs-fuse: 51041548: WRITE => -1
> gfid=0e8e1e13-97a5-478a-bc58-e81ddf3698a3 fd=0x7f949839a368 (Input/output
> error)
>
> [2019-01-05 22:39:50.451914] W [fuse-bridge.c:1441:fuse_err_cbk]
> 0-glusterfs-fuse: 51041549: FLUSH() ERR => -1 (Input/output error)
>
> The message "W [MSGID: 109011] [dht-layout.c:163:dht_layout_search]
> 0-gv1-dht: no subvolume for hash (value) = 1311504267" repeated 1721 times
> between [2019-01-05 22:39:33.906241] and [2019-01-05 22:39:44.598371]
>
> The message "E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk]
> 0-gv1-dht: dict is null" repeated 1714 times between [2019-01-05
> 22:39:33.925981] and [2019-01-05 22:39:50.451862]
>
> The message "W [MSGID: 109011] [dht-layout.c:163:dht_layout_search]
> 0-gv1-dht: no subvolume for hash (value) = 1137142622" repeated 1707 times
> between [2019-01-05 22:39:39.636552] and [2019-01-05 22:39:50.451895]
>
>
>
> This looks to be a DHT issue. Some questions:
>
> * Are all subvolumes of DHT up and client is connected to them?
> Particularly the subvolume which contains the file in question.
>
> * Can you get all extended attributes of parent directory of the file from
> all bricks?
>
> * set diagnostics.client-log-level to TRACE, capture these errors again
> and attach the client log file.
>
>
>
> I spoke a bit early. dht_writev doesn't search hashed subvolume as its
> already been looked up in lookup. So, these msgs looks to be of a different
> issue - not  writev failure.
>
>
>
>
>
> This is intermittent for most files, but eventually if a file is large
> enough it will not write.  The workflow is SFTP tot he client which then
> writes to the volume over FUSE.  When files get to a certain point,w e can
> no longer write to them.  The file sizes are different as well, so it's not
> like t

Re: [Gluster-users] [External] Re: Input/output error on FUSE log

2019-01-06 Thread Davide Obbi
port-type: tcp
>>> Bricks:
>>> Brick1: tpc-glus4:/exp/b1/gv1
>>> Brick2: tpc-glus2:/exp/b1/gv1
>>> Brick3: tpc-arbiter1:/exp/b1/gv1 (arbiter)
>>> Brick4: tpc-glus2:/exp/b2/gv1
>>> Brick5: tpc-glus4:/exp/b2/gv1
>>> Brick6: tpc-arbiter1:/exp/b2/gv1 (arbiter)
>>> Brick7: tpc-glus4:/exp/b3/gv1
>>> Brick8: tpc-glus2:/exp/b3/gv1
>>> Brick9: tpc-arbiter1:/exp/b3/gv1 (arbiter)
>>> Brick10: tpc-glus4:/exp/b4/gv1
>>> Brick11: tpc-glus2:/exp/b4/gv1
>>> Brick12: tpc-arbiter1:/exp/b4/gv1 (arbiter)
>>> Brick13: tpc-glus1:/exp/b5/gv1
>>> Brick14: tpc-glus3:/exp/b5/gv1
>>> Brick15: tpc-arbiter2:/exp/b5/gv1 (arbiter)
>>> Brick16: tpc-glus1:/exp/b6/gv1
>>> Brick17: tpc-glus3:/exp/b6/gv1
>>> Brick18: tpc-arbiter2:/exp/b6/gv1 (arbiter)
>>> Brick19: tpc-glus1:/exp/b7/gv1
>>> Brick20: tpc-glus3:/exp/b7/gv1
>>> Brick21: tpc-arbiter2:/exp/b7/gv1 (arbiter)
>>> Brick22: tpc-glus1:/exp/b8/gv1
>>> Brick23: tpc-glus3:/exp/b8/gv1
>>> Brick24: tpc-arbiter2:/exp/b8/gv1 (arbiter)
>>> Options Reconfigured:
>>> performance.cache-samba-metadata: on
>>> performance.cache-invalidation: off
>>> features.shard-block-size: 1000MB
>>> features.shard: on
>>> transport.address-family: inet
>>> nfs.disable: on
>>> cluster.lookup-optimize: on
>>>
>>> I'm a bit stumped on this, any help is appreciated.  Thank you!
>>>
>>>
>>> ___
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users



-- 
Davide Obbi
Senior System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
Empowering people to experience the world since 1996
43 languages, 214+ offices worldwide, 141,000+ global destinations, 29
million reported listings
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: Self Heal Confusion

2018-12-31 Thread Davide Obbi
cluster.quorum-type auto
cluster.quorum-count (null)
cluster.server-quorum-type off
cluster.server-quorum-ratio 0
cluster.quorum-reads    no

Where exactly do I remove the gfid entries from - the .glusterfs
directory? --> yes; I can't remember exactly where, but try to do a find in the
brick paths with the gfid, it should return something (see the sketch below)

Where do I put the cluster.heal-timeout option - which file? --> it is not a
file; it is set with: gluster volume set <volname> <option> <value>
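
A rough sketch of both answers (brick path, volume name and gfid are
placeholders):

  # locate the gfid entry on a brick, using a gfid reported by "gluster volume heal <volname> info"
  find /path/to/brick/.glusterfs -name 'c2d41635-6049-4a49-bbdb-43eb52e0a2a1*'

  # change the self-heal schedule (value is in seconds; 300 is just an example)
  gluster volume set <volname> cluster.heal-timeout 300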

On Mon, Dec 31, 2018 at 10:34 AM Brett Holcomb  wrote:

> That is probably the case as a lot of files were deleted some time ago.
>
> I'm on version 5.2 but was on 3.12 until about a week ago.
>
> Here is the quorum info.  I'm running a distributed replicated volumes
> in 2 x 3 = 6
>
> cluster.quorum-type auto
> cluster.quorum-count (null)
> cluster.server-quorum-type off
> cluster.server-quorum-ratio 0
> cluster.quorum-readsno
>
> Where exacty do I remove the gfid entries from - the .glusterfs
> directory?  Do I just delete all the directories can files under this
> directory?
>
> Where do I put the cluster.heal-timeout option - which file?
>
> I think you've hit on the cause of the issue.  Thinking back we've had
> some extended power outages and due to a misconfiguration in the swap
> file device name a couple of the nodes did not come up and I didn't
> catch it for a while so maybe the deletes occured then.
>
> Thank you.
>
> On 12/31/18 2:58 AM, Davide Obbi wrote:
> > if the long GFID does not correspond to any file it could mean the
> > file has been deleted by the client mounting the volume. I think this
> > is caused when the delete was issued and the number of active bricks
> > were not reaching quorum majority or a second brick was taken down
> > while another was down or did not finish the selfheal, the latter more
> > likely.
> > It would be interesting to see:
> > - what version of glusterfs you running, it happened to me with 3.12
> > - volume quorum rules: "gluster volume get vol all | grep quorum"
> >
> > To clean it up if i remember correctly it should be possible to delete
> > the gfid entries from the brick mounts on the glusterfs server nodes
> > reporting the files to heal.
> >
> > As a side note you might want to consider changing the selfheal
> > timeout to more agressive schedule in cluster.heal-timeout option
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users



-- 
Davide Obbi
System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
Empowering people to experience the world since 1996
43 languages, 214+ offices worldwide, 141,000+ global destinations, 29
million reported listings
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: Self Heal Confusion

2018-12-31 Thread Davide Obbi
If the long GFID does not correspond to any file, it could mean the file has
been deleted by the client mounting the volume. I think this happens when the
delete was issued while the number of active bricks was not reaching quorum
majority, or when a second brick was taken down while another was down or had
not finished the self-heal; the latter is more likely.
It would be interesting to see:
- what version of glusterfs you are running; it happened to me with 3.12
- the volume quorum rules: "gluster volume get vol all | grep quorum"

To clean it up, if I remember correctly, it should be possible to delete the
gfid entries from the brick mounts on the glusterfs server nodes reporting
the files to heal.

As a side note, you might want to consider changing the self-heal timeout to a
more aggressive schedule via the cluster.heal-timeout option.
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Seeding geo-replication slaves

2018-12-05 Thread Davide Obbi
Hi,

From what I have seen, you can have the target volume mounted RW and write
other files to it without breaking replication. I don't know what would happen
if the files are the same, since I haven't tested it, but I guess that, as with
rsync, it would update the files from the source.

Kr

On Tue, Dec 4, 2018 at 5:23 PM Conrad Lawes  wrote:

> Is it possible to pre-seed a glusterfs geo-repl slave?
>
> Background story:
> I am presently using rsync to mirror 3 servers.
> The source server (master) resides in the UK.  The target servers reside
> in  Canada and USA.
> The targets servers presently have 1.5TB of mirrored data.
> I want to switch from rsync mirroring to glusterfs geo-replication.
> However,  I wish to use the existing data  as the starting point instead
> of starting geo-replication from scratch.  It will take a very long time to
> re-sync 1.5TB of data over our WAN connection.
>
> Does glusterfs allow for this?
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users



-- 
Davide Obbi
System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
Empowering people to experience the world since 1996
43 languages, 214+ offices worldwide, 141,000+ global destinations, 29
million reported listings
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: Geo Replication / Error: bash: gluster: command not found

2018-11-27 Thread Davide Obbi
Hi,

I resolved this by specifying the command= directive on the SSH public keys:

command="/usr/libexec/glusterfs/gsyncd" ssh-rsa B3
and
command="tar ${SSH_ORIGINAL_COMMAND#* }" ssh-rsa B...


On Tue, Nov 27, 2018 at 10:15 PM m0rbidini  wrote:

> Hi everyone.
>
> I'm having the same problem in almost the same scenario.
>
> It only started after I upgraded to v4.1.5
>
> Output from the first node of the master volume:
>
> [root@glusterfs-node1 ~]# gluster volume geo-replication gv0
> geouser@glusterfs-node3::gv0 create push-pem
> gluster command on geouser@glusterfs-node3 failed. Error: bash: gluster:
> command not found
> geo-replication command failed
>
> [root@glusterfs-node1 ~]# cat /etc/redhat-release
> CentOS Linux release 7.5.1804 (Core)
>
> [root@glusterfs-node1 ~]# rpm -qa | grep glu
> glusterfs-libs-4.1.5-1.el7.x86_64
> glusterfs-fuse-4.1.5-1.el7.x86_64
> centos-release-gluster41-1.0-3.el7.centos.noarch
> glusterfs-4.1.5-1.el7.x86_64
> glusterfs-cli-4.1.5-1.el7.x86_64
> glusterfs-client-xlators-4.1.5-1.el7.x86_64
> glusterfs-server-4.1.5-1.el7.x86_64
> glusterfs-geo-replication-4.1.5-1.el7.x86_64
> glusterfs-api-4.1.5-1.el7.x86_64
> python2-gluster-4.1.5-1.el7.x86_64
>
> Output from the first node of the slave volume:
>
> [root@glusterfs-node3 ~]# cat /etc/redhat-release
> CentOS Linux release 7.5.1804 (Core)
>
> [root@glusterfs-node3 ~]# rpm -qa | grep glu
> glusterfs-libs-4.1.5-1.el7.x86_64
> glusterfs-client-xlators-4.1.5-1.el7.x86_64
> glusterfs-fuse-4.1.5-1.el7.x86_64
> glusterfs-server-4.1.5-1.el7.x86_64
> glusterfs-geo-replication-4.1.5-1.el7.x86_64
> centos-release-gluster41-1.0-3.el7.centos.noarch
> glusterfs-4.1.5-1.el7.x86_64
> glusterfs-api-4.1.5-1.el7.x86_64
> glusterfs-cli-4.1.5-1.el7.x86_64
> python2-gluster-4.1.5-1.el7.x86_64
>
>
> > Hi all,
> >
> > I encounter a problem to set up my geo replication session in glusterfs
> > 4.1.5 on centos 7.5.1804.
> >
> > After I give the
> > gluster volume geo-replication mastervol geoaccount at servere::slavevol
> > create push-pem
> >
> > I see the following
> >
> > gluster command on geoaccount at servere failed. Error: bash: gluster:
> command
> > not found
> > geo-replication command failed
> >
> > Do you know where is the problem?
> >
> > Thanks in advance!
> >
> > BR
> > -- next part --
> > An HTML attachment was scrubbed...
> > URL: <
> http://lists.gluster.org/pipermail/gluster-users/attachments/20181115/7449ed9c/attachment.html
> >
> >
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users



-- 
Davide Obbi
System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
Empowering people to experience the world since 1996
43 languages, 214+ offices worldwide, 141,000+ global destinations, 29
million reported listings
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] glusterfs performance report

2018-11-14 Thread Davide Obbi
Hi,

I have been conducting performance tests over the past day on our new HW
where we plan to deploy a scalable file-system solution. I hope the results
can be helpful to someone.
I hope to receive some feedback regarding optimizations and volume xlator
setup.
Volume profiling data has been collected, in case it is needed.

Regards
Davide

The Clients

nr of clients:  6
Network:        10Gb
Clients Mem:    128GB
Clients Cores:  22
Centos:         7.5.1804
Kernel:         3.10.0-862.14.4.el7.x86_64

The Servers

nr of servers:  3
Network:        100Gb (*node to node is 100Gb, to clients 10Gb)
Server Mem:     377GB
Server Cores:   56 (*Intel 5120 CPU)
Storage:        4x8TB NVME
Centos:         7.5.1804
Kernel:         3.10.0-862.14.4.el7.x86_64


Gluster version

Both clients and servers running glusterfs 4.1.5 (glusterd not glusterd2)

Brick Setup

The Bricks have been automatically configured by heketi at the volume
creation, resulting in:

   - 1 VG per NVME disk
   - 1 thinpool with one LV
   - 1 LV mapped to one brick
   - 1 x 3 = 3


The tests

The tests have been carried out using smallfile utility:

https://github.com/distributed-system-analysis/smallfile
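
A typical smallfile invocation for this kind of test looks roughly like the
following (the mount point is a placeholder and the exact flags may differ
between smallfile versions):

  python smallfile_cli.py --top /mnt/glustervol/smf \
      --operation create --threads 8 --files 5000 --file-size 128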

A set of comparative tests has been carried out between the following
platforms; these tests include gluster volume profiling:

   - proprietary appliance, NVME over iSCSI, top of the range (1 client only)
   - proprietary appliance, SSD service over NFS, top of the range
   - Gluster 3-node cluster, specs above


A set of longer-running and resilience tests for GlusterFS only; for these
tests the system graph metrics are available, and drives/nodes were physically
unplugged.

Volume profiling:

As shown by the 8-thread 4K test below, gluster volume profiling did not incur
any performance degradation, so it has been left on for all the 5K-file tests.
Volume profiling results are enclosed.

Gluster volume options

The following volume options have been configured based on previous
experience. Some of these options have been tested at default vs. custom
values, as shown by the 4-thread test.

Other options haven't been explicitly set since they are already enabled at
their default values.

Options Reconfigured:

client.event-threads 3

performance.cache-size 8GB

performance.io-thread-count 24

network.inode-lru-limit 1048576

performance.parallel-readdir on

performance.cache-invalidation on

performance.md-cache-timeout 600

features.cache-invalidation on

features.cache-invalidation-timeout 600

performance.client-io-threads on

The Cache

I did not notice any difference when re-mounting the share or dropping the
cache via /proc, and in the real world I want to use any available cache as
much as possible, so none of the tests cleared caches or re-mounted.
Note, however, that performing two subsequent stat operations resulted in
approx. 10x faster FOPS; this test result is not recorded.
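
For completeness, the cache-clearing steps referred to above are along these
lines (server and mount point are placeholders):

  # drop the kernel page/dentry/inode caches on a client
  sync; echo 3 > /proc/sys/vm/drop_caches

  # or force a fresh FUSE mount
  umount /mnt/glustervol && mount -t glusterfs glusterserver-1:/glustervol /mnt/glustervol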


*The results*

Attached


glusterfs_smallfile.xlsx
Description: MS-Excel 2007 spreadsheet
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: duplicate performance.cache-size with different values

2018-11-13 Thread Davide Obbi
Bug 1649252 submitted

thanks

On Tue, Nov 13, 2018 at 2:41 AM Raghavendra Gowdappa 
wrote:

>
>
> On Mon, Nov 12, 2018 at 9:36 PM Davide Obbi 
> wrote:
>
>> Hi,
>>
>> i have noticed that this option is repeated twice with different values
>> in gluster 4.1.5 if you run gluster volume get volname all
>>
>> performance.cache-size  32MB
>> ...
>> performance.cache-size  128MB
>>
>
> This is a bug with naming the options. Two xlators io-cache and quick-read
> have same keys listed in glusterd-volume-set.c. can you file a bug?
>
>
>>
>> is that right?
>>
>> Regards
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>

-- 
Davide Obbi
System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
Empowering people to experience the world since 1996
43 languages, 214+ offices worldwide, 141,000+ global destinations, 29
million reported listings
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] duplicate performance.cache-size with different values

2018-11-12 Thread Davide Obbi
Hi,

I have noticed that this option is listed twice with different values in
gluster 4.1.5 if you run gluster volume get volname all

performance.cache-size  32MB
...
performance.cache-size  128MB


is that right?

Regards
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: anyone using gluster-block?

2018-11-09 Thread Davide Obbi
Hi Vijay,

The volume has been created using the heketi-cli blockvolume create command.
The block config is what heketi applies out of the box, and in my case it
ended up being:
- 3 nodes each with 1 brick
- the brick is carved from a VG with a single PV
- the PV consists of a 1.2TB SSD, not partitioned and no HW RAID behind
- the volume does not have any custom settings aside from what is configured in
/etc/glusterfs/group-gluster-block by default:
performance.quick-read=off
performance.read-ahead=off
performance.io-cache=off
performance.stat-prefetch=off
performance.open-behind=off
performance.readdir-ahead=off
performance.strict-o-direct=on
network.remote-dio=disable
cluster.eager-lock=enable
cluster.quorum-type=auto
cluster.data-self-heal-algorithm=full
cluster.locking-scheme=granular
cluster.shd-max-threads=8
cluster.shd-wait-qlength=1
features.shard=on
features.shard-block-size=64MB
user.cifs=off
server.allow-insecure=on
cluster.choose-local=off

Kernel: 3.10.0-862.11.6.el7.x86_64
OS: Centos 7.5.1804
tcmu-runner: 0.2rc4.el7

Each node has 32 cores and 128GB RAM and 10Gb connection.

What I am trying to understand is what the performance expectations should be
with gluster-block, since I couldn't find many benchmarks online.

Regards
Davide


On Fri, Nov 9, 2018 at 7:07 AM Vijay Bellur  wrote:

> Hi Davide,
>
> Can you please share the block hosting volume configuration?
>
> Also, more details about the kernel and tcmu-runner versions could help in
> understanding the problem better.
>
> Thanks,
> Vijay
>
> On Tue, Nov 6, 2018 at 6:16 AM Davide Obbi 
> wrote:
>
>> Hi,
>>
>> i am testing gluster-block and i am wondering if someone has used it and
>> have some feedback regarding its performance.. just to set some
>> expectations... for example:
>> - i have deployed a block volume using heketi on a 3 nodes gluster4.1
>> cluster. it's a replica3 volume.
>> - i have mounted via iscsi using multipath config suggested, created
>> vg/lv and put xfs on it
>> - all done without touching any volume setting or customizing xfs
>> parameters etc..
>> - all baremetal running on 10Gb, gluster has a single block device, SSD
>> in use by heketi
>>
>> so i tried a dd and i get a 4.7 MB/s?
>> - on the gluster nodes i have in write ~200iops, ~15MB/s, 75% util steady
>> and spiky await time up to 100ms alternating between the servers. CPUs are
>> mostly idle but there is some waiting...
>> - Glusterd and fsd utilization is below 1%
>>
>> The thing is that a gluster fuse mount on same platform does not have
>> this slowness so there must be something wrong with my understanding of
>> gluster-block?
>>
>>
>>
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>

-- 
Davide Obbi
System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
Empowering People to experience the world since 1996
43 languages, 214+ offices worldwide, 141,000+ global destinations, 29
million reported listings
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] anyone using gluster-block?

2018-11-06 Thread Davide Obbi
Hi,

I am testing gluster-block and I am wondering if someone has used it and has
some feedback regarding its performance, just to set some expectations. For
example:
- I have deployed a block volume using heketi on a 3-node gluster 4.1
cluster; it's a replica-3 volume.
- I have mounted it via iSCSI using the suggested multipath config, created a
VG/LV and put XFS on it.
- All done without touching any volume setting or customizing XFS
parameters, etc.
- All bare metal running on 10Gb; gluster has a single block device, an SSD,
in use by heketi.

So I tried a dd and I get 4.7 MB/s?
- On the gluster nodes I see, in write, ~200 IOPS, ~15MB/s, 75% util steady,
and spiky await times up to 100ms alternating between the servers. CPUs are
mostly idle but there is some waiting...
- glusterd and glusterfsd utilization is below 1%

The thing is that a gluster FUSE mount on the same platform does not have this
slowness, so there must be something wrong with my understanding of
gluster-block?
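
When comparing against the FUSE mount it is worth pinning down the dd
parameters, since a plain dd writes buffered 512-byte blocks by default; a
sketch of a more controlled test (the path is a placeholder):

  # buffered write with a larger block size
  dd if=/dev/zero of=/mnt/blockvol/testfile bs=1M count=1024
  # bypass the page cache to measure the iSCSI/gluster-block path itself
  dd if=/dev/zero of=/mnt/blockvol/testfile bs=1M count=1024 oflag=direct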
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] glusterfs 4.1.5 - SSL3_GET_RECORD:wrong version number

2018-10-09 Thread Davide Obbi
Hi,

After running a volume stop/start, the error disappeared and the volume can be
mounted from the server.
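
In other words, roughly (volume name and mount point are placeholders):

  gluster volume stop myvol
  gluster volume start myvol
  mount -t glusterfs glusterserver-1005:/myvol /mnt/myvol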

Regards

On Tue, Oct 9, 2018 at 3:27 PM Davide Obbi  wrote:

>
> Hi,
>
> i have enabled SSL/TLS on a cluster of 3 nodes, the server to server
> communication seems working since gluster volume status returns the three
> bricks while we are unable to mount from the client and the client can be
> also one of the gluster nodes iteself.
> Options:
> /var/lib/glusterd/secure-acceess
>   option transport.socket.ssl-cert-depth 3
>
> ssl.cipher-list:
> HIGH:!SSLv2:!SSLv3:!TLSv1:!TLSv1.1:TLSv1.2:!3DES:!RC4:!aNULL:!ADH
> auth.ssl-allow:
> localhost,glusterserver-1005,glusterserver-1008,glusterserver-1009
> server.ssl: on
> client.ssl: on
> auth.allow: glusterserver-1005,glusterserver-1008,glusterserver-1009
> ssl.certificate-depth: 3
>
> We noticed the following in glusterd logs, the .18 address is the client
> and one of the cluster nodes glusterserver-1005:
> [2018-10-09 13:12:10.786384] D [socket.c:354:ssl_setup_connection]
> 0-tcp.management: peer CN = glusterserver-1005
>
> [2018-10-09 13:12:10.786401] D [socket.c:357:ssl_setup_connection]
> 0-tcp.management: SSL verification succeeded (client: 10.10.0.18:49149)
> (server: 10.10.0.18:24007)
> [2018-10-09 13:12:10.956960] D [socket.c:354:ssl_setup_connection]
> 0-tcp.management: peer CN = glusterserver-1009
>
> [2018-10-09 13:12:10.956977] D [socket.c:357:ssl_setup_connection]
> 0-tcp.management: SSL verification succeeded (client: 10.10.0.27:49150)
> (server: 10.10.0.18:24007)
> [2018-10-09 13:12:11.322218] D [socket.c:354:ssl_setup_connection]
> 0-tcp.management: peer CN = glusterserver-1008
>
> [2018-10-09 13:12:11.322248] D [socket.c:357:ssl_setup_connection]
> 0-tcp.management: SSL verification succeeded (client: 10.10.0.23:49150)
> (server: 10.10.0.18:24007)
> [2018-10-09 13:12:11.368753] D [socket.c:354:ssl_setup_connection]
> 0-tcp.management: peer CN = glusterserver-1005
>
> [2018-10-09 13:12:11.368770] D [socket.c:357:ssl_setup_connection]
> 0-tcp.management: SSL verification succeeded (client: 10.10.0.18:49149)
> (server: 10.10.0.18:24007)
> [2018-10-09 13:12:13.535081] E [socket.c:364:ssl_setup_connection]
> 0-tcp.management: SSL connect error (client: 10.10.0.18:49149) (server:
> 10.10.0.18:24007)
> [2018-10-09 13:12:13.535102] E [socket.c:203:ssl_dump_error_stack]
> 0-tcp.management:   error:1408F10B:SSL routines:SSL3_GET_RECORD:wrong
> version number
> [2018-10-09 13:12:13.535129] E [socket.c:2677:socket_poller]
> 0-tcp.management: server setup failed
>
> I believe that something has changed since version 4.1.3 cause using that
> version we were able to mount on the client and we did not get that SSL
> error. Also the cipher volume option was not set in that version. At this
> point i can't understand if node to node is actually using SSL or not and
> why the client is unable to mount
>
> thanks
> Davide
>


-- 
Davide Obbi
System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
The world's #1 accommodation site
43 languages, 198+ offices worldwide, 120,000+ global destinations,
1,550,000+ room nights booked every day
No booking fees, best price always guaranteed
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] glusterfs 4.1.5 - SSL3_GET_RECORD:wrong version number

2018-10-09 Thread Davide Obbi
Hi,

I have enabled SSL/TLS on a cluster of 3 nodes. The server-to-server
communication seems to be working, since gluster volume status returns the
three bricks, but we are unable to mount from the client (and the client can
also be one of the gluster nodes itself).
Options:
/var/lib/glusterd/secure-access
  option transport.socket.ssl-cert-depth 3

ssl.cipher-list:
HIGH:!SSLv2:!SSLv3:!TLSv1:!TLSv1.1:TLSv1.2:!3DES:!RC4:!aNULL:!ADH
auth.ssl-allow:
localhost,glusterserver-1005,glusterserver-1008,glusterserver-1009
server.ssl: on
client.ssl: on
auth.allow: glusterserver-1005,glusterserver-1008,glusterserver-1009
ssl.certificate-depth: 3
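
For reference, a minimal sketch of how this kind of setup is usually applied
(the volume name "gv0" and the certificate paths are examples, not taken from
this cluster):

# TLS on the I/O path, per volume:
gluster volume set gv0 client.ssl on
gluster volume set gv0 server.ssl on
gluster volume set gv0 auth.ssl-allow 'glusterserver-1005,glusterserver-1008,glusterserver-1009'
gluster volume set gv0 ssl.certificate-depth 3
# TLS on the management path (glusterd), enabled on every node:
touch /var/lib/glusterd/secure-access
# Each node is expected to have its certificate, key and CA bundle in place,
# e.g. /etc/ssl/glusterfs.pem, /etc/ssl/glusterfs.key, /etc/ssl/glusterfs.ca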

We noticed the following in glusterd logs, the .18 address is the client
and one of the cluster nodes glusterserver-1005:
[2018-10-09 13:12:10.786384] D [socket.c:354:ssl_setup_connection]
0-tcp.management: peer CN = glusterserver-1005

[2018-10-09 13:12:10.786401] D [socket.c:357:ssl_setup_connection]
0-tcp.management: SSL verification succeeded (client: 10.10.0.18:49149)
(server: 10.10.0.18:24007)
[2018-10-09 13:12:10.956960] D [socket.c:354:ssl_setup_connection]
0-tcp.management: peer CN = glusterserver-1009

[2018-10-09 13:12:10.956977] D [socket.c:357:ssl_setup_connection]
0-tcp.management: SSL verification succeeded (client: 10.10.0.27:49150)
(server: 10.10.0.18:24007)
[2018-10-09 13:12:11.322218] D [socket.c:354:ssl_setup_connection]
0-tcp.management: peer CN = glusterserver-1008

[2018-10-09 13:12:11.322248] D [socket.c:357:ssl_setup_connection]
0-tcp.management: SSL verification succeeded (client: 10.10.0.23:49150)
(server: 10.10.0.18:24007)
[2018-10-09 13:12:11.368753] D [socket.c:354:ssl_setup_connection]
0-tcp.management: peer CN = glusterserver-1005

[2018-10-09 13:12:11.368770] D [socket.c:357:ssl_setup_connection]
0-tcp.management: SSL verification succeeded (client: 10.10.0.18:49149)
(server: 10.10.0.18:24007)
[2018-10-09 13:12:13.535081] E [socket.c:364:ssl_setup_connection]
0-tcp.management: SSL connect error (client: 10.10.0.18:49149) (server:
10.10.0.18:24007)
[2018-10-09 13:12:13.535102] E [socket.c:203:ssl_dump_error_stack]
0-tcp.management:   error:1408F10B:SSL routines:SSL3_GET_RECORD:wrong
version number
[2018-10-09 13:12:13.535129] E [socket.c:2677:socket_poller]
0-tcp.management: server setup failed

I believe that something has changed since version 4.1.3, because using that
version we were able to mount on the client and we did not get that SSL
error. Also, the cipher volume option was not set in that version. At this
point i can't tell whether node-to-node traffic is actually using SSL or not,
and why the client is unable to mount.
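
A quick way to sanity-check whether a given port is actually speaking TLS
(the hostname is an example; 24007 is the glusterd management port):

openssl s_client -connect glusterserver-1005:24007 </dev/null
# A TLS endpoint prints its certificate chain; a plain-TCP endpoint fails the
# handshake straight away.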

thanks
Davide
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: directory quotas non existing directory failing

2018-09-14 Thread Davide Obbi
thanks for the clarification

On Fri, Sep 14, 2018 at 8:44 AM Hari Gowtham  wrote:

> Hi,
>
> Thanks for letting us know.
> It is a mistake in the doc. will fix it.
>
> It should be as follows:
> You can set the disk limit on the directory even if files are not created.
> The disk limit is enforced for all the files in that directory once they
> are created.
>
> On Fri, Sep 14, 2018 at 11:35 AM Davide Obbi 
> wrote:
>
>> Here:
>>
>> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Directory%20Quota/
>>
>> Kr
>> Davide
>>
>> On Fri, Sep 14, 2018 at 7:12 AM Hari Gowtham  wrote:
>>
>>> Hi,
>>>
>>> Can you point to the right place in doc where it is mentioned as above?
>>> Need to understand the context in which it was mentioned.
>>>
>>> As far as I know, it is not possible to do so.
>>> Quota's limits are stored on the directory itself.
>>> Without the directory being there quota can't store the limit for
>>> future directories.
>>>
>>>
>>> On Thu, Sep 13, 2018 at 4:48 PM Davide Obbi 
>>> wrote:
>>> >
>>> > Hi,
>>> >
>>> > According to glusterdoc:
>>> >
>>> > Note You can set the disk limit on the directory even if it is not
>>> created. The disk limit is enforced immediately after creating that
>>> directory.
>>> > However if i try to set the limit on directories not existing on the
>>> volume i get:
>>> >
>>> > quota command failed : Failed to get trusted.gfid attribute on path
>>> /rhomes/davide. Reason : No such file or directory
>>> > please enter the path relative to the volume
>>> >
>>> > glusterfs 4.1.2
>>> >
>>> > Based on this i hoped it would be possible to create some sort of
>>> wildcard quota like:
>>> >
>>> > gluster volume quota homes limit-usage /rhomes/* 20GB
>>> >
>>> >
>>> > ___
>>> > Gluster-users mailing list
>>> > Gluster-users@gluster.org
>>> > https://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Hari Gowtham.
>>>
>>
>>
>> --
>> Davide Obbi
>> System Administrator
>>
>> Booking.com B.V.
>> Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
>> Direct +31207031558
>> [image: Booking.com] <https://www.booking.com/>
>> The world's #1 accommodation site
>> 43 languages, 198+ offices worldwide, 120,000+ global destinations,
>> 1,550,000+ room nights booked every day
>> No booking fees, best price always guaranteed
>> Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
>>
>
>
> --
> Regards,
> Hari Gowtham.
>


-- 
Davide Obbi
System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
The world's #1 accommodation site
43 languages, 198+ offices worldwide, 120,000+ global destinations,
1,550,000+ room nights booked every day
No booking fees, best price always guaranteed
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: directory quotas non existing directory failing

2018-09-14 Thread Davide Obbi
Here:
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Directory%20Quota/

Kr
Davide

On Fri, Sep 14, 2018 at 7:12 AM Hari Gowtham  wrote:

> Hi,
>
> Can you point to the right place in doc where it is mentioned as above?
> Need to understand the context in which it was mentioned.
>
> As far as I know, it is not possible to do so.
> Quota's limits are stored on the directory itself.
> Without the directory being there quota can't store the limit for
> future directories.
>
>
> On Thu, Sep 13, 2018 at 4:48 PM Davide Obbi 
> wrote:
> >
> > Hi,
> >
> > According to glusterdoc:
> >
> > Note You can set the disk limit on the directory even if it is not
> created. The disk limit is enforced immediately after creating that
> directory.
> > However if i try to set the limit on directories not existing on the
> volume i get:
> >
> > quota command failed : Failed to get trusted.gfid attribute on path
> /rhomes/davide. Reason : No such file or directory
> > please enter the path relative to the volume
> >
> > glusterfs 4.1.2
> >
> > Based on this i hoped that would be possible to create some sort of
> wildcard quota like:
> >
> > gluster volume quota homes limit-usage /rhomes/* 20GB
> >
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
> --
> Regards,
> Hari Gowtham.
>


-- 
Davide Obbi
System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
The world's #1 accommodation site
43 languages, 198+ offices worldwide, 120,000+ global destinations,
1,550,000+ room nights booked every day
No booking fees, best price always guaranteed
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] directory quotas non existing directory failing

2018-09-13 Thread Davide Obbi
Hi,

According to glusterdoc:

Note You can set the disk limit on the directory even if it is not created.
The disk limit is enforced immediately after creating that directory.
However, if i try to set the limit on directories not existing on the volume
i get:

quota command failed : Failed to get trusted.gfid attribute on path
/rhomes/davide. Reason : No such file or directory

please enter the path relative to the volume

*glusterfs 4.1.2*

Based on this i hoped it would be possible to create some sort of
wildcard quota like:

gluster volume quota homes limit-usage /rhomes/* 20GB
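
For comparison, a sketch of the sequence that does work, assuming the volume
is mounted at /mnt/homes (the mount point is an example): the directory has
to exist on the volume before the limit can be set.

gluster volume quota homes enable
mkdir -p /mnt/homes/rhomes/davide                     # create it via a client mount first
gluster volume quota homes limit-usage /rhomes/davide 20GB
gluster volume quota homes list /rhomes/davide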
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: file metadata operations performance - gluster 4.1

2018-08-31 Thread Davide Obbi
it didn't make a difference. I will try to re-configure with a 2x3 config.

On Fri, Aug 31, 2018 at 1:48 PM Raghavendra Gowdappa 
wrote:

> another relevant option is setting cluster.lookup-optimize on.
>
> On Fri, Aug 31, 2018 at 3:22 PM, Davide Obbi 
> wrote:
>
>> #gluster vol set VOLNAME group nl-cache --> didn't know there are groups
>> of options, after this command i got set the following:
>> performance.nl-cache-timeout: 600
>> performance.nl-cache: on
>> performance.parallel-readdir: on
>> performance.io-thread-count: 64
>> network.inode-lru-limit: 20
>>
>> to note that i had network.inode-lru-limit set to max and got reduced to
>> 20
>>
>> then i added
>> performance.nl-cache-positive-entry: on
>>
>> The volume options:
>> Options Reconfigured:
>> performance.nl-cache-timeout: 600
>> performance.nl-cache: on
>> performance.nl-cache-positive-entry: on
>> performance.parallel-readdir: on
>> performance.io-thread-count: 64
>> network.inode-lru-limit: 20
>> nfs.disable: on
>> transport.address-family: inet
>> performance.readdir-ahead: on
>> features.cache-invalidation-timeout: 600
>> features.cache-invalidation: on
>> performance.md-cache-timeout: 600
>> performance.stat-prefetch: on
>> performance.cache-invalidation: on
>> performance.cache-size: 10GB
>> network.ping-timeout: 5
>> diagnostics.client-log-level: WARNING
>> diagnostics.brick-log-level: WARNING
>> features.quota: off
>> features.inode-quota: off
>> performance.quick-read: on
>>
>> untar completed in 08mins 30secs
>>
>> increasing network.inode-lru-limit to 1048576 untar completed in the same
>> time
>> I have attached the gluster profile results of the last test, with
>> network.inode-lru-limit to 1048576
>>
>> I guess the next test will be creating more bricks for the same volume to
>> have a 2x3. Since i do not see bottlenecks at the disk level and i have
>> limited hw ATM i will just carve out the bricks from LVs from the same 1
>> disk VG.
>>
>> Also i have tried to look for a complete list of options/description
>> unsuccessfully can you point at one?
>>
>> thanks
>> Davide
>>
>> On Thu, Aug 30, 2018 at 5:47 PM Poornima Gurusiddaiah <
>> pguru...@redhat.com> wrote:
>>
>>> To enable nl-cache please use group option instead of single volume set:
>>>
>>> #gluster vol set VOLNAME group nl-cache
>>>
>>> This sets few other things including time out, invalidation etc.
>>>
>>> For enabling the option Raghavendra mentioned, you ll have to execute it
>>> explicitly, as it's not part of group option yet:
>>>
>>> #gluster vol set VOLNAME performance.nl-cache-positive-entry on
>>>
>>> Also from the past experience, setting the below option has helped in
>>> performance:
>>>
>>> # gluster vol set VOLNAME network.inode-lru-limit 20
>>>
>>> Regards,
>>> Poornima
>>>
>>>
>>> On Thu, Aug 30, 2018, 8:49 PM Raghavendra Gowdappa 
>>> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Aug 30, 2018 at 8:38 PM, Davide Obbi 
>>>> wrote:
>>>>
>>>>> yes "performance.parallel-readdir on and 1x3 replica
>>>>>
>>>>
>>>> That's surprising. I thought performance.parallel-readdir will help
>>>> only when distribute count is fairly high. This is something worth
>>>> investigating further.
>>>>
>>>>
>>>>> On Thu, Aug 30, 2018 at 5:00 PM Raghavendra Gowdappa <
>>>>> rgowd...@redhat.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 30, 2018 at 8:08 PM, Davide Obbi >>>>> > wrote:
>>>>>>
>>>>>>> Thanks Amar,
>>>>>>>
>>>>>>> i have enabled the negative lookups cache on the volume:
>>>>>>>
>>>>>>
>>>> I think enabling nl-cache-positive-entry might help for untarring or
>>>> git clone into glusterfs. Its disabled by default. can you let us know the
>>>> results?
>>>>
>>>> Option: performance.nl-cache-positive-entry
>>>> Default Value: (null)
>>>> Description: enable/disable storing of entries that were lookedup and
>>>> found to be present in the volume, thus lookup on non existent file is
&

Re: [Gluster-users] [External] Re: file metadata operations performance - gluster 4.1

2018-08-31 Thread Davide Obbi
#gluster vol set VOLNAME group nl-cache --> i didn't know there were groups of
options; after this command the following got set:
performance.nl-cache-timeout: 600
performance.nl-cache: on
performance.parallel-readdir: on
performance.io-thread-count: 64
network.inode-lru-limit: 20

to note that i had network.inode-lru-limit set to max and got reduced to
20

then i added
performance.nl-cache-positive-entry: on
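
For reference, a sketch of that command sequence (the volume name "myvol" is
an example):

gluster volume set myvol group nl-cache                          # applies the nl-cache group profile
gluster volume set myvol performance.nl-cache-positive-entry on  # not part of the group, set explicitly
gluster volume set myvol network.inode-lru-limit 1048576         # raise the inode LRU limit again
gluster volume get myvol all | grep -E 'nl-cache|inode-lru'      # verify what actually got set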

The volume options:
Options Reconfigured:
performance.nl-cache-timeout: 600
performance.nl-cache: on
performance.nl-cache-positive-entry: on
performance.parallel-readdir: on
performance.io-thread-count: 64
network.inode-lru-limit: 20
nfs.disable: on
transport.address-family: inet
performance.readdir-ahead: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
performance.md-cache-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.cache-size: 10GB
network.ping-timeout: 5
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: WARNING
features.quota: off
features.inode-quota: off
performance.quick-read: on

untar completed in 08mins 30secs

Increasing network.inode-lru-limit to 1048576, the untar completed in the
same time. I have attached the gluster profile results of the last test,
with network.inode-lru-limit set to 1048576.

I guess the next test will be creating more bricks for the same volume to
have a 2x3. Since i do not see bottlenecks at the disk level and i have
limited hw at the moment, i will just carve the bricks out of LVs from the
same single-disk VG.

Also, i have tried to find a complete list of options with descriptions,
unsuccessfully; can you point me at one?

thanks
Davide

On Thu, Aug 30, 2018 at 5:47 PM Poornima Gurusiddaiah 
wrote:

> To enable nl-cache please use group option instead of single volume set:
>
> #gluster vol set VOLNAME group nl-cache
>
> This sets few other things including time out, invalidation etc.
>
> For enabling the option Raghavendra mentioned, you ll have to execute it
> explicitly, as it's not part of group option yet:
>
> #gluster vol set VOLNAME performance.nl-cache-positive-entry on
>
> Also from the past experience, setting the below option has helped in
> performance:
>
> # gluster vol set VOLNAME network.inode-lru-limit 20
>
> Regards,
> Poornima
>
>
> On Thu, Aug 30, 2018, 8:49 PM Raghavendra Gowdappa 
> wrote:
>
>>
>>
>> On Thu, Aug 30, 2018 at 8:38 PM, Davide Obbi 
>> wrote:
>>
>>> yes "performance.parallel-readdir on and 1x3 replica
>>>
>>
>> That's surprising. I thought performance.parallel-readdir will help only
>> when distribute count is fairly high. This is something worth investigating
>> further.
>>
>>
>>> On Thu, Aug 30, 2018 at 5:00 PM Raghavendra Gowdappa <
>>> rgowd...@redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Aug 30, 2018 at 8:08 PM, Davide Obbi 
>>>> wrote:
>>>>
>>>>> Thanks Amar,
>>>>>
>>>>> i have enabled the negative lookups cache on the volume:
>>>>>
>>>>
>> I think enabling nl-cache-positive-entry might help for untarring or git
>> clone into glusterfs. Its disabled by default. can you let us know the
>> results?
>>
>> Option: performance.nl-cache-positive-entry
>> Default Value: (null)
>> Description: enable/disable storing of entries that were lookedup and
>> found to be present in the volume, thus lookup on non existent file is
>> served from the cache
>>
>>
>>>>> To deflate a tar archive (not compressed) of 1.3GB it takes aprox
>>>>> 9mins which can be considered a slight improvement from the previous 12-15
>>>>> however still not fast enough compared to local disk. The tar is present 
>>>>> on
>>>>> the gluster share/volume and deflated inside the same folder structure.
>>>>>
>>>>
>>>> I am assuming this is with parallel-readdir enabled, right?
>>>>
>>>>
>>>>> Running the operation twice (without removing the already deflated
>>>>> files) also did not reduce the time spent.
>>>>>
>>>>> Running the operation with the tar archive on local disk made no
>>>>> difference
>>>>>
>>>>> What really made a huge difference while git cloning was setting
>>>>> "performance.parallel-readdir on". During the phase "Receiving objects" ,
>>>>> as i enabled the xlator it bumped up from 3/4MBs to 27MBs
>>>>>
>>>>
>>>> What is the distribute count? Is it 1x3 replica?
>&

Re: [Gluster-users] [External] Re: file metadata operations performance - gluster 4.1

2018-08-30 Thread Davide Obbi
yes "performance.parallel-readdir on and 1x3 replica

On Thu, Aug 30, 2018 at 5:00 PM Raghavendra Gowdappa 
wrote:

>
>
> On Thu, Aug 30, 2018 at 8:08 PM, Davide Obbi 
> wrote:
>
>> Thanks Amar,
>>
>> i have enabled the negative lookups cache on the volume:
>>
>> To deflate a tar archive (not compressed) of 1.3GB it takes aprox 9mins
>> which can be considered a slight improvement from the previous 12-15
>> however still not fast enough compared to local disk. The tar is present on
>> the gluster share/volume and deflated inside the same folder structure.
>>
>
> I am assuming this is with parallel-readdir enabled, right?
>
>
>> Running the operation twice (without removing the already deflated files)
>> also did not reduce the time spent.
>>
>> Running the operation with the tar archive on local disk made no
>> difference
>>
>> What really made a huge difference while git cloning was setting
>> "performance.parallel-readdir on". During the phase "Receiving objects" ,
>> as i enabled the xlator it bumped up from 3/4MBs to 27MBs
>>
>
> What is the distribute count? Is it 1x3 replica?
>
>
>> So in conclusion i'm trying to make the untar operation working at an
>> acceptable level, not expecting local disks speed but at least being within
>> the 4mins
>>
>> I have attached the profiles collected at the end of the untar operations
>> with the archive on the mount and outside
>>
>> thanks
>> Davide
>>
>>
>> On Tue, Aug 28, 2018 at 8:41 AM Amar Tumballi 
>> wrote:
>>
>>> One of the observation we had with git clone like work load was,
>>> nl-cache (negative-lookup cache), helps here.
>>>
>>> Try 'gluster volume set $volume-name nl-cache enable'.
>>>
>>> Also sharing the 'profile info' during this performance observations
>>> also helps us to narrow down the situation.
>>>
>>> More on how to capture profile info @
>>> https://hackmd.io/PhhT5jPdQIKxzfeLQmnjJQ?view
>>>
>>> -Amar
>>>
>>>
>>> On Thu, Aug 23, 2018 at 7:11 PM, Davide Obbi 
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> did anyone ever managed to achieve reasonable waiting time while
>>>> performing metadata intensive operations such as git clone, untar etc...?
>>>> Is this possible workload or will never be in scope for glusterfs?
>>>>
>>>> I'd like to know, if possible, what would be the options that affect
>>>> such volume performances.
>>>> Albeit i managed to achieve decent git status/git grep operations, 3
>>>> and 30 secs, the git clone and untarring a file from/to the same share take
>>>> ages. for a git repo of aprox 6GB.
>>>>
>>>> I'm running a test environment with 3 way replica 128GB RAM and 24
>>>> cores are  2.40GHz, one internal SSD dedicated to the volume brick and 10Gb
>>>> network
>>>>
>>>> The options set so far that affects volume performances are:
>>>>  48 performance.readdir-ahead: on
>>>>  49 features.cache-invalidation-timeout: 600
>>>>  50 features.cache-invalidation: on
>>>>  51 performance.md-cache-timeout: 600
>>>>  52 performance.stat-prefetch: on
>>>>  53 performance.cache-invalidation: on
>>>>  54 performance.parallel-readdir: on
>>>>  55 network.inode-lru-limit: 90
>>>>  56 performance.io-thread-count: 32
>>>>  57 performance.cache-size: 10GB
>>>>
>>>> ___
>>>> Gluster-users mailing list
>>>> Gluster-users@gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>
>>>
>>>
>>> --
>>> Amar Tumballi (amarts)
>>>
>>
>>
>> --
>> Davide Obbi
>> System Administrator
>>
>> Booking.com B.V.
>> Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
>> Direct +31207031558
>> [image: Booking.com] <https://www.booking.com/>
>> The world's #1 accommodation site
>> 43 languages, 198+ offices worldwide, 120,000+ global destinations,
>> 1,550,000+ room nights booked every day
>> No booking fees, best price always guaranteed
>> Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>

-- 
Davide Obbi
System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
The world's #1 accommodation site
43 languages, 198+ offices worldwide, 120,000+ global destinations,
1,550,000+ room nights booked every day
No booking fees, best price always guaranteed
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] file metadata operations performance - gluster 4.1

2018-08-23 Thread Davide Obbi
Hello,

did anyone ever manage to achieve reasonable waiting times while performing
metadata-intensive operations such as git clone, untar, etc.? Is this a
feasible workload, or will it never be in scope for glusterfs?

I'd like to know, if possible, which options affect this kind of volume
performance.
Albeit i managed to achieve decent git status/git grep times, 3 and 30 secs
respectively, git clone and untarring a file from/to the same share take
ages for a git repo of approx 6GB.

I'm running a test environment with a 3-way replica, 128GB RAM and 24 cores
at 2.40GHz, one internal SSD dedicated to the volume brick and a 10Gb
network.

The options set so far that affect volume performance are:
 48 performance.readdir-ahead: on
 49 features.cache-invalidation-timeout: 600
 50 features.cache-invalidation: on
 51 performance.md-cache-timeout: 600
 52 performance.stat-prefetch: on
 53 performance.cache-invalidation: on
 54 performance.parallel-readdir: on
 55 network.inode-lru-limit: 90
 56 performance.io-thread-count: 32
 57 performance.cache-size: 10GB
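
For anyone reproducing this, a sketch of how the profile data mentioned in
this thread can be captured (volume name and paths are examples):

gluster volume profile myvol start
tar xf linux.tar -C /mnt/myvol/untar-test        # the metadata-heavy workload under test
gluster volume profile myvol info > profile-untar.txt
gluster volume profile myvol stop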
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: Proposal to mark few features as Deprecated / SunSet from Version 5.0

2018-08-22 Thread Davide Obbi
fully using RDMA transport, do get in touch with us
>> to prioritize the migration plan for your volume. Plan is to work on this
>> > after the release, so by version 6.0, we will have a cleaner transport
>> code, which just needs to support one type.
>> >
>> > ‘Tiering’ feature
>> >
>> > Gluster’s tiering feature which was planned to be providing an option
>> to keep your ‘hot’ data in different location than your cold data, so one
>> can
>> > get better performance. While we saw some users for the feature, it
>> needs much more attention to be completely bug free. At the time, we are not
>> > having any active maintainers for the feature, and hence suggesting to
>> take it out of the ‘supported’ tag.
>> >
>> > If you are willing to take it up, and maintain it, do let us know, and
>> we are happy to assist you.
>> >
>> > If you are already using tiering feature, before upgrading, make sure
>> to do gluster volume tier detach all the bricks before upgrading to next
>> > release. Also, we recommend you to use features like dmcache on your
>> LVM setup to get best performance from bricks.
>> >
>> > ‘Quota’
>> >
>> > This is a call out for ‘Quota’ feature, to let you all know that it
>> will be ‘no new development’ state. While this feature is ‘actively’ in use
>> by
>> > many people, the challenges we have in accounting mechanisms involved,
>> has made it hard to achieve good performance with the feature. Also, the
>> > amount of extended attribute get/set operations while using the feature
>> is not very ideal. Hence we recommend our users to move towards setting
>> > quota on backend bricks directly (ie, XFS project quota), or to use
>> different volumes for different directories etc.
>> >
>> > As the feature wouldn’t be deprecated immediately, the feature doesn’t
>> need a migration plan when you upgrade to newer version, but if you are a
>> new
>> > user, we wouldn’t recommend setting quota feature. By the release
>> dates, we will be publishing our best alternatives guide for gluster’s
>> current
>> > quota feature.
>> >
>> > Note that if you want to contribute to the feature, we have project
>> quota based issue open[2] Happy to get contributions, and help in getting a
>> > newer approach to Quota.
>> >
>> >
>> > These are our set of initial features which we propose to take out of
>> ‘fully’ supported features. While we are in the process of making the
>> > user/developer experience of the project much better with providing
>> well maintained codebase, we may come up with few more set of features
>> which we
>> > may possibly consider to move out of support, and hence keep watching
>> this space.
>> >
>> > [1] - http://review.gluster.org/4809
>> >
>> > [2] - https://github.com/gluster/glusterfs/issues/184
>> >
>> >
>> > Regards,
>> >
>> > Vijay, Shyam, Amar
>> >
>> >
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org
>> > https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
> --
> Amar Tumballi (amarts)
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users



-- 
Davide Obbi
System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
The world's #1 accommodation site
43 languages, 198+ offices worldwide, 120,000+ global destinations,
1,550,000+ room nights booked every day
No booking fees, best price always guaranteed
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] heketi and glusterd2

2018-08-16 Thread Davide Obbi
Hi,

how do i tell heketi to use glusterd2 instead of glusterd?

When i perform the topology load i get the following error:
Creating node gluster01 ... Unable to create node: New Node doesn't have
glusterd running

this suggests that it is looking for glusterd; in fact, if i switch to
glusterd it finds it.
In heketi.json i could not find any option to specify this.

thanks
Davide
--
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] create volume with glusterd2 fails with "HTTP/1.1 500 Internal Server Error"

2018-08-13 Thread Davide Obbi
Hi,

i'm trying to create a volume using glusterd2; however, curl does not return
anything and in verbose mode i get the above-mentioned error. The only
gluster service running on the hosts is "glusterd2.service".

/var/log/glusterd2/glusterd2.log:
time="2018-08-13 18:05:15.395479" level=info msg="runtime error: index out
of range"

JSON FILE:
{
    "name": "test01",
    "subvols": [
        {
            "type": "replicate",
            "bricks": [
                {"peerid": "2f468973-8de9-4dbf-90c2-7236e229697d", "path": "/srv/gfs/test01/brk01"},
                {"peerid": "ad228d0e-e7b9-4d02-a863-94c74bd3d843", "path": "/srv/gfs/test01/brk01"},
                {"peerid": "de7ea1d7-5566-40f4-bc34-0d68bfd48193", "path": "/srv/gfs/test01/brk01"}
            ],
            "replica": 3
        }
    ],
    "force": false
}

COMMAND:
curl -v -X POST http://localhost:24007/v1/volumes --data @test01.json -H
'Content-Type: application/json'
# i have also tried with the fqdn from localhost

ERROR:
* About to connect() to localhost port 24007 (#0)
*   Trying ::1...
* Connected to localhost (::1) port 24007 (#0)
> POST /v1/volumes HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:24007
> Accept: */*
> Content-Type: application/json
> Content-Length: 525
>
* upload completely sent off: 525 out of 525 bytes
< HTTP/1.1 500 Internal Server Error
< X-Gluster-Cluster-Id: 04385f96-4b7e-4afa-9366-8d1a3b30f36e
< X-Gluster-Peer-Id: ad228d0e-e7b9-4d02-a863-94c74bd3d843
< X-Request-Id: fea765a5-1014-48f7-919c-3431278d14ab
< Date: Mon, 13 Aug 2018 15:57:24 GMT
< Content-Length: 0
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host localhost left intact

VERSIONS:
glusterd2.x86_64           4.1.0-1.el7
glusterfs-server.x86_64    4.1.2-1.el7

PEERS:
+--------------------------------------+--------------+-------------------+--------------------+--------+
|  ID                                  |  NAME        |  CLIENT ADDRESSES |  PEER ADDRESSES    | ONLINE |
+--------------------------------------+--------------+-------------------+--------------------+--------+
| 2f468973-8de9-4dbf-90c2-7236e229697d | gluster-1009 | 127.0.0.1:24007   | gluster-1009:24008 | yes    |
|                                      |              | 10.10.10.19:24007 | 10.10.10.19:24008  |        |
| ad228d0e-e7b9-4d02-a863-94c74bd3d843 | gluster-1005 | 127.0.0.1:24007   | 10.10.10.18:24008  | yes    |
|                                      |              | 10.10.10.18:24007 |                    |        |
| de7ea1d7-5566-40f4-bc34-0d68bfd48193 | gluster-1008 | 127.0.0.1:24007   | gluster-1008:24008 | yes    |
|                                      |              | 10.10.10.23:24007 | 10.10.10.23:24008  |        |
+--------------------------------------+--------------+-------------------+--------------------+--------+
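
For completeness, a sketch of how the peer IDs used in test01.json can be
listed, either from glusterd2's REST API (same endpoint family as the volume
create call) or from the CLI:

curl -s http://localhost:24007/v1/peers     # returns the peer list as JSON
glustercli peer status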
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] glusterd2 4.1 availability in centos7 repositories

2018-08-01 Thread Davide Obbi
Hi,

does anyone know why glusterd2 4.1 is not available in the main centos
repos?

http://mirror.centos.org/centos/7/storage/x86_64/gluster-4.1/

while it is available in the buildlogs?

https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-4.1/
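
In the meantime, a sketch of a temporary yum repo file pointing at the
buildlogs location above (the file name and repo id are placeholders; the
buildlogs packages are unsigned, hence gpgcheck=0):

cat > /etc/yum.repos.d/gluster-41-buildlogs.repo <<'EOF'
[gluster-41-buildlogs]
name=CentOS-7 Gluster 4.1 (buildlogs, testing)
baseurl=https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-4.1/
enabled=1
gpgcheck=0
EOF
yum install glusterd2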

thanks
Davide
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: peer detach doesn't seem working in glustercli

2018-06-14 Thread Davide Obbi
thanks,

i have been able to remove the old id entry
5953d666-fccd-48e2-aeb9-5308a53ad5a0 successfully.

On Thu, Jun 14, 2018 at 4:07 PM, Aravinda  wrote:

> Hi,
>
> Thanks for the feedback, I will look into the issue.
>
> In Gluster 4.1, glustercli allows detaching a peer using the peer id as
> well. As a workaround, can you try using the REST API as below:
>
> curl -i -XDELETE http://localhost:24007/v1/peers/<peerid>
>
> For example:
> curl -i -XDELETE http://localhost:24007/v1/peers/00590d47-950d-4047-82d5-60f941c51827
>
> Peer id as shown in `glustercli peer status`
>
>
> On 06/13/2018 05:16 AM, Davide Obbi wrote:
>
> hi,
>
> i'm testing some operations with gluster4 and glustercli.
> I have re-installed a node with the same host name/IP and add it back to
> the cluster.
> This resulted in a double entry, this could be expected but then i am not
> able to remove the old one nor the new nor any other node.
> So at this point i am not sure i run peer detach correctly or something is
> not working as expected or everything broke by re-adding the same node
> again since i get always:
>
> Error: Unable to find Peer ID
>
> thanks
> Davide
>
>
> Usage:
>   glustercli peer detach  [flags]
>
> +--------------------------------------+------------------+---------------------+
> |  ID                                  |  NAME            |  PEERADDRESSES      |
> +--------------------------------------+------------------+---------------------+
> | 00590d47-950d-4047-82d5-60f941c51827 | glusternode-2405 | 10.10.10.1:24008    |
> | 5953d666-fccd-48e2-aeb9-5308a53ad5a0 | glusternode-2401 | 10.10.10.2.15:24008 |
> | 79ffdcbf-5147-4f7e-b28d-1fa056cd3ccc | glusternode-2404 | 10.10.10.3:24008    |
> | ba9e411e-9b69-4841-8b36-e1dbb5663135 | glusternode-2401 | 10.10.10.2:24008    |
> +--------------------------------------+------------------+---------------------+
>
> [root@glusternode-2404 ~]# glustercli peer detach 00590d47-950d-4047-82d5-
> 60f941c51827
> Peer detach failed
>
> Error: Unable to find Peer ID
> [root@glusternode-2404 ~]# glustercli peer detach 5953d666-fccd-48e2-aeb9-
> 5308a53ad5a0
> Peer detach failed
>
> Error: Unable to find Peer ID
> [root@glusternode-2404 ~]# glustercli peer detach glusternode-2401
> Peer detach failed
>
> Error: Unable to find Peer ID
> [root@glusternode-2404 ~]# glustercli peer detach glusternode-2404
> Peer detach failed
>
> Error: Unable to find Peer ID
>
>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
> --
> regards
> Aravinda VK
>
>


-- 
Davide Obbi
System Administrator

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
Direct +31207031558
[image: Booking.com] <https://www.booking.com/>
The world's #1 accommodation site
43 languages, 198+ offices worldwide, 120,000+ global destinations,
1,550,000+ room nights booked every day
No booking fees, best price always guaranteed
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] peer detach doesn't seem working in glustercli

2018-06-13 Thread Davide Obbi
hi,

i'm testing some operations with gluster4 and glustercli.
I have re-installed a node with the same hostname/IP and added it back to
the cluster.
This resulted in a double entry. That could be expected, but then i am not
able to remove the old entry, nor the new one, nor any other node.
So at this point i am not sure whether i ran peer detach correctly, whether
something is not working as expected, or whether everything broke by
re-adding the same node again, since i always get:

Error: Unable to find Peer ID

thanks
Davide


Usage:
  glustercli peer detach <peerid> [flags]

+--------------------------------------+------------------+---------------------+
|  ID                                  |  NAME            |  PEERADDRESSES      |
+--------------------------------------+------------------+---------------------+
| 00590d47-950d-4047-82d5-60f941c51827 | glusternode-2405 | 10.10.10.1:24008    |
| 5953d666-fccd-48e2-aeb9-5308a53ad5a0 | glusternode-2401 | 10.10.10.2.15:24008 |
| 79ffdcbf-5147-4f7e-b28d-1fa056cd3ccc | glusternode-2404 | 10.10.10.3:24008    |
| ba9e411e-9b69-4841-8b36-e1dbb5663135 | glusternode-2401 | 10.10.10.2:24008    |
+--------------------------------------+------------------+---------------------+


[root@glusternode-2404 ~]# glustercli peer detach 00590d47-950d-4047-82d5-60f941c51827
Peer detach failed

Error: Unable to find Peer ID
[root@glusternode-2404 ~]# glustercli peer detach 5953d666-fccd-48e2-aeb9-5308a53ad5a0
Peer detach failed

Error: Unable to find Peer ID
[root@glusternode-2404 ~]# glustercli peer detach glusternode-2401
Peer detach failed

Error: Unable to find Peer ID
[root@glusternode-2404 ~]# glustercli peer detach glusternode-2404
Peer detach failed

Error: Unable to find Peer ID
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: glusterfs 3.13 repo unavailable and downgrade to 3.12.9 fails

2018-05-15 Thread Davide Obbi
i have a 12-node cluster with 2 nodes downgraded and currently down. Does
the procedure of taking down the 3.13 nodes mean cluster downtime?

If it matters, /var/lib/glusterd/glusterd.info is empty on the 3.12.9 nodes.
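
For context, a sketch of what /var/lib/glusterd/glusterd.info normally
contains when it is not empty (the UUID is a placeholder; 31203 is the 3.12
op-version suggested below):

# cat /var/lib/glusterd/glusterd.info
UUID=<node-uuid>
operating-version=31203
# The UUID must match what the other peers have recorded for this node under
# /var/lib/glusterd/peers/. Then restart the daemon on that node:
systemctl restart glusterd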

On Tue, May 15, 2018 at 2:47 PM, Kaleb S. KEITHLEY <kkeit...@redhat.com>
wrote:

> On 05/15/2018 08:08 AM, Davide Obbi wrote:
> > Thanks Kaleb,
> >
> > any chance i can make the node working after the downgrade?
> > thanks
>
> Without knowing what doesn't work, I'll go out on a limb and guess that
> it's an op-version problem.
>
> Shut down your 3.13 nodes, change their op-version to one of the valid
> 3.12 op-versions (e.g. 31203) and restart. Then the 3.12 nodes should
> work with 3.13, and/or the formerly 3.13 nodes should work with 3.12.
>
> JoeJulian in #gluster may have other ideas.
>
>
> >
> > On Tue, May 15, 2018 at 2:02 PM, Kaleb S. KEITHLEY <kkeit...@redhat.com
> > <mailto:kkeit...@redhat.com>> wrote:
> >
> >
> > You can still get them from
> >   https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-3.13/
> > <https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-3.13/>
> >
> > (I don't know how much longer they'll be there. I suggest you copy
> them
> > if you think you're going to need them in the future.)
> >
> >
> >
> > n 05/15/2018 04:58 AM, Davide Obbi wrote:
> > > hi,
> > >
> > > i noticed that this repo for glusterfs 3.13 does not exists
> anymore at:
> > >
> > > http://mirror.centos.org/centos/7/storage/x86_64/
> > <http://mirror.centos.org/centos/7/storage/x86_64/>
> > >
> > > i knew was not going to be long term supported however the
> downgrade to
> > > 3.12 breaks the server node i believe the issue is with:
> > > *[2018-05-15 08:54:39.981101] E [MSGID: 101019]
> > > [xlator.c:503:xlator_init] 0-management: Initialization of volume
> > > 'management' failed, review your volfile
> again
> > > [2018-05-15 08:54:39.981113] E [MSGID: 101066]
> > > [graph.c:327:glusterfs_graph_init] 0-management: initializing
> translator
> > > failed
>
> > > [2018-05-15 08:54:39.981121] E [MSGID: 101176]
> > > [graph.c:698:glusterfs_graph_activate] 0-graph: init
> > > failed
>
> > > *
> > >
> > > Any help appreciated, thanks
> > >
> > > Details:
> > > *Installed Packages*
> > >
> > glusterfs.x86_64
>
> > >
> > 3.12.9-1.el7
>
> > > @gluster-3.12
> > >
> > glusterfs-api.x86_64
>
> > >
> > 3.12.9-1.el7
>
> > > @gluster-3.12
> > >
> > glusterfs-cli.x86_64
>
> > >
> > 3.12.9-1.el7
>
> > > @gluster-3.12
> > >
> > glusterfs-client-xlators.x86_64
>
> > >
> > 3.12.9-1.el7
>
> > > @gluster-3.12
> > >
> > glusterfs-events.x86_64
>
> > >
> > 3.12.9-1.el7
>
> > > @gluster-3.12
> > >
> > glusterfs-fuse.x86_64
>
> > >
> > 3.12.9-1.el7
>
> > > @gluster-3.12
> > >
> > glusterfs-libs.x86_64
>
> > >
> > 3.12.9-1.el7
>
> > > @gluster-3.12
> > >
> > glusterfs-server.x86_64
>
> > >
> > 3.12.9-1.el7
>
> > > @gluster-3.12
> > >
> > python2-gluster.x86_64
>
> > >
> > 3.12.9-1.el7
>
> > > @gluster-3.12
> > >
> > > *# journalctl -u
> > >
> > glusterd*
>
>
> > >
> > > -- Logs begin at Wed 2018-03-21 15:06:46 CET, end at Tue 2018-05-15
> > > 10:48:01 CEST. --
> > > Mar 21 15:06:58 glusternode-1003 systemd[1]: Starting GlusterFS, a
> > > clustered file-system
> > >
> > server...
>
> > >
> > > Mar 21 15:07:01 glusternode-1 systemd[1]: Started GlusterFS, a
> > clustered
> > > file-system
> > >
> > server.
>
> > >
> > > Mar 21 15:25:07 glusternode-1 systemd[1]: Stopping GlusterFS, a
> > > clustered file-system
> > >
> > server...
>
> > >
> > > Mar 21 15:25:07 glusternode-1 systemd[1]: Starting Gluste

Re: [Gluster-users] [External] Re: glusterfs 3.13 repo unavailable and downgrade to 3.12.9 fails

2018-05-15 Thread Davide Obbi
Thanks Kaleb,

any chance i can make the node work after the downgrade?
thanks

On Tue, May 15, 2018 at 2:02 PM, Kaleb S. KEITHLEY <kkeit...@redhat.com>
wrote:

>
> You can still get them from
>   https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-3.13/
>
> (I don't know how much longer they'll be there. I suggest you copy them
> if you think you're going to need them in the future.)
>
>
>
> n 05/15/2018 04:58 AM, Davide Obbi wrote:
> > hi,
> >
> > i noticed that this repo for glusterfs 3.13 does not exists anymore at:
> >
> > http://mirror.centos.org/centos/7/storage/x86_64/
> >
> > i knew was not going to be long term supported however the downgrade to
> > 3.12 breaks the server node i believe the issue is with:
> > *[2018-05-15 08:54:39.981101] E [MSGID: 101019]
> > [xlator.c:503:xlator_init] 0-management: Initialization of volume
> > 'management' failed, review your volfile again
>
> > [2018-05-15 08:54:39.981113] E [MSGID: 101066]
> > [graph.c:327:glusterfs_graph_init] 0-management: initializing translator
> > failed
> > [2018-05-15 08:54:39.981121] E [MSGID: 101176]
> > [graph.c:698:glusterfs_graph_activate] 0-graph: init
> > failed
>
> > *
> >
> > Any help appreciated, thanks
> >
> > Details:
> > *Installed Packages*
> > glusterfs.x86_64
>
> > 3.12.9-1.el7
>
> > @gluster-3.12
> > glusterfs-api.x86_64
>
> > 3.12.9-1.el7
>
> > @gluster-3.12
> > glusterfs-cli.x86_64
>
> > 3.12.9-1.el7
>
> > @gluster-3.12
> > glusterfs-client-xlators.x86_64
>
> > 3.12.9-1.el7
>
> > @gluster-3.12
> > glusterfs-events.x86_64
>
> > 3.12.9-1.el7
>
> > @gluster-3.12
> > glusterfs-fuse.x86_64
>
> > 3.12.9-1.el7
>
> > @gluster-3.12
> > glusterfs-libs.x86_64
>
> > 3.12.9-1.el7
>
> > @gluster-3.12
> > glusterfs-server.x86_64
>
> > 3.12.9-1.el7
>
> > @gluster-3.12
> > python2-gluster.x86_64
>
> > 3.12.9-1.el7
>
> > @gluster-3.12
> >
> > *# journalctl -u
> > glusterd*
>
>
> >
> > -- Logs begin at Wed 2018-03-21 15:06:46 CET, end at Tue 2018-05-15
> > 10:48:01 CEST. --
> > Mar 21 15:06:58 glusternode-1003 systemd[1]: Starting GlusterFS, a
> > clustered file-system
> > server...
>
> >
> > Mar 21 15:07:01 glusternode-1 systemd[1]: Started GlusterFS, a clustered
> > file-system
> > server.
>
> >
> > Mar 21 15:25:07 glusternode-1 systemd[1]: Stopping GlusterFS, a
> > clustered file-system
> > server...
>
> >
> > Mar 21 15:25:07 glusternode-1 systemd[1]: Starting GlusterFS, a
> > clustered file-system
> > server...
>
> >
> > Mar 21 15:25:09 glusternode-1 systemd[1]: Started GlusterFS, a clustered
> > file-system
> > server.
>
> >
> > May 10 12:29:11 glusternode-1 systemd[1]: Stopping GlusterFS, a
> > clustered file-system
> > server...
>
> >
> > May 10 12:29:11 glusternode-1 systemd[1]: Starting GlusterFS, a
> > clustered file-system
> > server...
>
> >
> > May 10 12:29:13 glusternode-1 systemd[1]: Started GlusterFS, a clustered
> > file-system
> > server.
>
> >
> > May 15 10:21:01 glusternode-1 systemd[1]: Starting GlusterFS, a
> > clustered file-system
> > server...
>
> >
> > May 15 10:21:01 glusternode-1 systemd[1]: glusterd.service: control
> > process exited, code=exited
> > status=1
> > May 15 10:21:01 glusternode-1 systemd[1]: Failed to start GlusterFS, a
> > clustered file-system
> > server.
> > May 15 10:21:01 glusternode-1 systemd[1]: Unit glusterd.service entered
> > failed
> > state.
>
> >
> > May 15 10:21:01 glusternode-1 systemd[1]: glusterd.service failed.
> > May 15 10:22:14 glusternode-1 systemd[1]: Starting GlusterFS, a
> > clustered file-system
> > server...
>
> >
> > May 15 10:22:14 glusternode-1 systemd[1]: glusterd.service: control
> > process exited, code=exited
> > status=1
> > May 15 10:22:14 glusternode-1 systemd[1]: Failed to start GlusterFS, a
> > clustered file-system
> > server.
> > May 15 10:22:14 glusternode-1 systemd[1]: Unit glusterd.service entered
> > failed
> > state.
>
> >
> > May 15 10:22:14 glusternode-1 systemd[1]: glusterd.service failed.
> > May 15 10:22:49 glusternode-1 systemd[1]: Starting GlusterFS, a
> > clustered file-system
> > server...
>
> >
> > May 15 10:22:49 glusternode-1 systemd[1]: glusterd.service: co

[Gluster-users] glusterfs 3.13 repo unavailable and downgrade to 3.12.9 fails

2018-05-15 Thread Davide Obbi
hi,

i noticed that the repo for glusterfs 3.13 does not exist anymore at:

http://mirror.centos.org/centos/7/storage/x86_64/

i knew it was not going to be long-term supported; however, the downgrade to
3.12 breaks the server node. I believe the issue is with:


[2018-05-15 08:54:39.981101] E [MSGID: 101019] [xlator.c:503:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2018-05-15 08:54:39.981113] E [MSGID: 101066] [graph.c:327:glusterfs_graph_init] 0-management: initializing translator failed
[2018-05-15 08:54:39.981121] E [MSGID: 101176] [graph.c:698:glusterfs_graph_activate] 0-graph: init failed

Any help appreciated, thanks

Details:
*Installed Packages*
glusterfs.x86_64                  3.12.9-1.el7    @gluster-3.12
glusterfs-api.x86_64              3.12.9-1.el7    @gluster-3.12
glusterfs-cli.x86_64              3.12.9-1.el7    @gluster-3.12
glusterfs-client-xlators.x86_64   3.12.9-1.el7    @gluster-3.12
glusterfs-events.x86_64           3.12.9-1.el7    @gluster-3.12
glusterfs-fuse.x86_64             3.12.9-1.el7    @gluster-3.12
glusterfs-libs.x86_64             3.12.9-1.el7    @gluster-3.12
glusterfs-server.x86_64           3.12.9-1.el7    @gluster-3.12
python2-gluster.x86_64            3.12.9-1.el7    @gluster-3.12

*# journalctl -u glusterd*

-- Logs begin at Wed 2018-03-21 15:06:46 CET, end at Tue 2018-05-15
10:48:01 CEST. --
Mar 21 15:06:58 glusternode-1003 systemd[1]: Starting GlusterFS, a
clustered file-system
server...

Mar 21 15:07:01 glusternode-1 systemd[1]: Started GlusterFS, a clustered
file-system
server.

Mar 21 15:25:07 glusternode-1 systemd[1]: Stopping GlusterFS, a clustered
file-system
server...

Mar 21 15:25:07 glusternode-1 systemd[1]: Starting GlusterFS, a clustered
file-system
server...

Mar 21 15:25:09 glusternode-1 systemd[1]: Started GlusterFS, a clustered
file-system
server.

May 10 12:29:11 glusternode-1 systemd[1]: Stopping GlusterFS, a clustered
file-system
server...

May 10 12:29:11 glusternode-1 systemd[1]: Starting GlusterFS, a clustered
file-system
server...

May 10 12:29:13 glusternode-1 systemd[1]: Started GlusterFS, a clustered
file-system
server.

May 15 10:21:01 glusternode-1 systemd[1]: Starting GlusterFS, a clustered
file-system
server...

May 15 10:21:01 glusternode-1 systemd[1]: glusterd.service: control process
exited, code=exited
status=1
May 15 10:21:01 glusternode-1 systemd[1]: Failed to start GlusterFS, a
clustered file-system
server.
May 15 10:21:01 glusternode-1 systemd[1]: Unit glusterd.service entered
failed
state.

May 15 10:21:01 glusternode-1 systemd[1]: glusterd.service failed.
May 15 10:22:14 glusternode-1 systemd[1]: Starting GlusterFS, a clustered
file-system
server...

May 15 10:22:14 glusternode-1 systemd[1]: glusterd.service: control process
exited, code=exited
status=1
May 15 10:22:14 glusternode-1 systemd[1]: Failed to start GlusterFS, a
clustered file-system
server.
May 15 10:22:14 glusternode-1 systemd[1]: Unit glusterd.service entered
failed
state.

May 15 10:22:14 glusternode-1 systemd[1]: glusterd.service failed.
May 15 10:22:49 glusternode-1 systemd[1]: Starting GlusterFS, a clustered
file-system
server...

May 15 10:22:49 glusternode-1 systemd[1]: glusterd.service: control process
exited, code=exited
status=1
May 15 10:22:49 glusternode-1 systemd[1]: Failed to start GlusterFS, a
clustered file-system
server.
May 15 10:22:49 glusternode-1 systemd[1]: Unit glusterd.service entered
failed
state.

May 15 10:22:49 glusternode-1 systemd[1]: glusterd.service failed.
May 15 10:23:36 glusternode-1 systemd[1]: Starting GlusterFS, a clustered
file-system
server...

May 15 10:23:36 glusternode-1 systemd[1]: glusterd.service: control process
exited, code=exited
status=1
May 15 10:23:36 glusternode-1 systemd[1]: Failed to start GlusterFS, a
clustered file-system
server.
May 15 10:23:36 glusternode-1 systemd[1]: Unit glusterd.service entered
failed
state.

May 15 10:23:36 glusternode-1 systemd[1]: glusterd.service failed.
[2018-05-15 08:54:38.354520] I [MSGID: 100030] [glusterfsd.c:2511:main]
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.12.9
(args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
[2018-05-15 08:54:38.359267] I [MSGID: 106478] [glusterd.c:1423:init]
0-management: Maximum allowed open file descriptors set to
65536
[2018-05-15 08:54:38.359305] I [MSGID: 106479] [glusterd.c:1481:init]
0-management: Using /var/lib/glusterd as working
directory
[2018-05-15 08:54:38.359313] I [MSGID: 106479] [glusterd.c:1486:init]
0-management: Using /var/run/gluster as pid file working
directory
[2018-05-15 08:54:38.363363] E [rpc-transport.c:283:rpc_transport_load]
0-rpc-transport: /usr/lib64/glusterfs/3.12.9/rpc-transport/rdma.so: cannot
open shared object file: No such file or directory
[2018-05-15 08:54:38.363381] W [rpc-transport.c:287:rpc_transport_load]
0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not
valid or not found on this machine
[2018-05-15 08:54:38.363391] W [rpcsvc.c:1682:rpcsvc_create_listener]
0-rpc-service: cannot