Re: [Gluster-users] Geo-replication configuration issue
Hi,

It looks like the master pem keys were not copied to the slave nodes properly. Please clean up /root/.ssh/authorized_keys on the slave nodes and run the geo-rep create force command again:

    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> create push-pem force

Do you observe any errors related to hook scripts in the glusterd log file?

regards
Aravinda

On 07/18/2016 10:11 PM, Alexandre Besnard wrote:
> Hello
>
> On a fresh Gluster 3.8 install, I am not able to configure a geo-replicated volume. Everything works fine up to starting the volume; however, Gluster reports a faulty status.
>
> When looking at the logs (gluster_error):
>
> [2016-07-18 16:30:04.371686] I [cli.c:730:main] 0-cli: Started running gluster with version 3.8.0
> [2016-07-18 16:30:04.435854] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
> [2016-07-18 16:30:04.435921] I [socket.c:2468:socket_event_handler] 0-transport: disconnecting now
> [2016-07-18 16:30:04.997986] I [input.c:31:cli_batch] 0-: Exiting with: 0
>
> From the geo-replication logs, it seems I have an SSH configuration issue:
>
> [2016-07-18 16:35:28.293524] I [monitor(monitor):266:monitor] Monitor:
> [2016-07-18 16:35:28.293740] I [monitor(monitor):267:monitor] Monitor: starting gsyncd worker
> [2016-07-18 16:35:28.352266] I [gsyncd(/gluster/backupvol):710:main_i] : syncing: gluster://localhost:backupvol -> ssh://root@ks4:gluster://localhost:backupvol
> [2016-07-18 16:35:28.352489] I [changelogagent(agent):73:__init__] ChangelogAgent: Agent listining...
> [2016-07-18 16:35:28.492474] E [syncdutils(/gluster/backupvol):252:log_raise_exception] : connection to peer is broken
> [2016-07-18 16:35:28.492706] E [resource(/gluster/backupvol):226:errlog] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-Fs2XND/b63292d563144e7818235d683516731d.sock root@ks4 /nonexistent/gsyncd --session-owner 3281242a-ab45-4a0d-99e5-2965b4ac5840 -N --listen --timeout 120 gluster://localhost:backupvol" returned with 255, saying:
> [2016-07-18 16:35:28.492794] E [resource(/gluster/backupvol):230:logerr] Popen: ssh> key_load_public: invalid format
> [2016-07-18 16:35:28.492863] E [resource(/gluster/backupvol):230:logerr] Popen: ssh> Permission denied (publickey,password).
> [2016-07-18 16:35:28.493004] I [syncdutils(/gluster/backupvol):220:finalize] : exiting.
> [2016-07-18 16:35:28.494045] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
> [2016-07-18 16:35:28.494204] I [syncdutils(agent):220:finalize] : exiting.
> [2016-07-18 16:35:28.494143] I [monitor(monitor):333:monitor] Monitor: worker(/gluster/backupvol) died before establishing connection
>
> I tried to fix them to the best of my knowledge but I am missing something. Can you help me fix it?
>
> Thanks,
> Alex

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
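For reference, a minimal sketch of the cleanup and re-create steps described above. MASTERVOL, SLAVEHOST and SLAVEVOL are placeholders, and the sed pattern is only an assumption about how the pushed gsyncd entries look, so inspect the file by hand before deleting anything:

    # on each slave node: back up and strip the stale entries pushed by a previous push-pem
    cp /root/.ssh/authorized_keys /root/.ssh/authorized_keys.bak
    sed -i '/gsyncd/d' /root/.ssh/authorized_keys

    # on the master: push the pem keys again and restart the session
    gluster volume geo-replication MASTERVOL SLAVEHOST::SLAVEVOL create push-pem force
    gluster volume geo-replication MASTERVOL SLAVEHOST::SLAVEVOL start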
Re: [Gluster-users] [Gluster-Maintainers] Gluster Events API - Help required to identify the list of Events from each component
So the framework is now in the mainline branch [1]. As a next step, I'd request all of you to start thinking about the important events that need to be captured from your components, and to send feedback.

[1] http://review.gluster.org/14248

~Atin

On Thu, Jul 14, 2016 at 1:45 PM, Aravinda wrote:
> +gluster-users
>
> regards
> Aravinda
>
> On 07/13/2016 09:03 PM, Vijay Bellur wrote:
>
>> On 07/13/2016 10:23 AM, Aravinda wrote:
>>
>>> Hi,
>>>
>>> We are working on an Eventing feature for Gluster and have sent a feature patch for the same.
>>> Design: http://review.gluster.org/13115
>>> Patch: http://review.gluster.org/14248
>>> Demo: http://aravindavk.in/blog/10-mins-intro-to-gluster-eventing
>>>
>>> The following document lists the events (mostly user-driven events are covered in the doc). Please let us know the events from your components that should be supported by the Eventing framework.
>>>
>>> https://docs.google.com/document/d/1oMOLxCbtryypdN8BRdBx30Ykquj4E31JsaJNeyGJCNo/edit?usp=sharing
>>
>> Thanks for putting this together, Aravinda! It might be worth polling the -users ML as well about events of interest.
>>
>> -Vijay
>
> ___
> maintainers mailing list
> maintain...@gluster.org
> http://www.gluster.org/mailman/listinfo/maintainers

--
--Atin

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] New cluster - first experience
On Mon, Jul 18, 2016 at 11:58 PM, Gandalf Corvotempesta <gandalf.corvotempe...@gmail.com> wrote:

> On 18/07/2016 20:13, Alastair Neil wrote:
>
>> It does not seem to me that this is a gluster issue. I just quickly
>> reviewed the thread and you said that you saw 60 MB/s with plain nfs to the
>> bricks, and with gluster and no sharding you got 59 MB/s.
>
> That's true, but I have to use sharding, and that kills my transfer rate.

What is the shard size you are looking to set?

> Additionally, I would like to optimize the current network as much as I can,
> and I'm looking for some suggestions from gluster users, as this network is
> totally dedicated to the gluster cluster.
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

--
Pranith

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
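In case it helps with the shard-size question above, the block size can be checked and raised per volume before the images are written (a sketch; VOLNAME is a placeholder, and the new size only applies to files created after the change):

    # show the current shard block size (defaults to 4MB if unset)
    gluster volume get VOLNAME features.shard-block-size
    # raise it, e.g. to 64MB
    gluster volume set VOLNAME features.shard-block-size 64MB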
[Gluster-users] Block storage
Is the block storage xlator stable and usable in production? Any docs about this?

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] New cluster - first experience
On 18/07/2016 20:13, Alastair Neil wrote:
> It does not seem to me that this is a gluster issue. I just quickly
> reviewed the thread and you said that you saw 60 MB/s with plain nfs to the
> bricks, and with gluster and no sharding you got 59 MB/s.

That's true, but I have to use sharding, and that kills my transfer rate.

Additionally, I would like to optimize the current network as much as I can, and I'm looking for some suggestions from gluster users, as this network is totally dedicated to the gluster cluster.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] gluster NFS/rpcbind conflict
Hello,

Just a note to give feedback on a known problem: I have 2 replica servers, and for some reasons I use NFS mounts on one of my clients (it is an old one with which I have trouble using the glusterfs native client). I managed to perform NFS mounts from one of the servers but failed on the other.

I was happy to find a thread about this problem: it is rpcbind being started by default with the "-w" option, which leads rpcbind to re-use NFS-server ports even though no NFS server is running any more (but one had been running on this machine). Removing the "-w" option and restarting rpcbind works fine.

This mail is only to suggest adding this to the documentation pages for glusterfs, as it seems that other people have met this problem. More generally, why not add a "troubleshooting" section to the documentation? I went through the official documentation and only found the solution by reading bug-report threads. It seems that this problem still exists on (at least) recent Debians - which I'm using - so it may save time for other users.

Another suggestion: indicate in the docs that it may save disk space to switch volumes to the WARNING log level (for clients). INFO is far too verbose for production (at least on 3.6.x) and should only be used when starting out with glusterfs.

Note: these are just improvement suggestions that may save time for other people. glusterfs works very well for our needs and we are happy to use it :)

Best regards,

--
Y.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
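A rough sketch of the two changes suggested above, assuming a recent Debian where the rpcbind options live in /etc/default/rpcbind (paths and service names may differ on other releases; VOLNAME is a placeholder):

    # drop the "-w" warm-start flag so rpcbind forgets stale NFS port registrations
    sed -i 's/ -w//' /etc/default/rpcbind
    service rpcbind restart
    service glusterfs-server restart   # lets Gluster's built-in NFS server re-register its ports

    # reduce client-side log verbosity from INFO to WARNING
    gluster volume set VOLNAME diagnostics.client-log-level WARNING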
Re: [Gluster-users] New cluster - first experience
It does not seem to me that this is a gluster issue. I just quickly reviewed the thread and you said that you saw 60 MB/s with plain nfs to the bricks, and with gluster and no sharding you got 59 MB/s:

> With plain NFS (no gluster involved) I'm getting almost the same
> speed: about 60MB/s
>
> Without sharding:
> # echo 3 > /proc/sys/vm/drop_caches; dd if=/dev/zero of=test bs=1M count=1000 conv=fsync
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 17.759 s, 59.0 MB/s

On 18 July 2016 at 06:35, Gandalf Corvotempesta <gandalf.corvotempe...@gmail.com> wrote:

> 2016-07-16 15:07 GMT+02:00 Gandalf Corvotempesta:
>> 2016-07-16 15:04 GMT+02:00 Gandalf Corvotempesta:
>>> [ ID] Interval       Transfer     Bandwidth
>>> [  3]  0.0-10.0 sec  2.31 GBytes  1.98 Gbits/sec
>>
>> Obviously I did the same test with all gluster servers. Speed is always
>> near 2gbit, so the network is not an issue here.
>
> Any help? I would like to start real tests with virtual machines and
> proxmox before the August holiday.
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
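Not from the thread, but one extra check that can help when comparing these numbers is to take the page cache out of the picture on the writer side (assuming the mount supports O_DIRECT):

    # write through the cache so gluster and plain-NFS numbers stay comparable
    dd if=/dev/zero of=test bs=1M count=1000 oflag=direct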
Re: [Gluster-users] Determine or force healing occurring
On Sat, Jul 16, 2016 at 9:53 PM, Jesper Led Lauridsen TS Infra server <j...@dr.dk> wrote:

> On 07/16/2016 04:10 AM, Pranith Kumar Karampuri wrote:
>
>> On Fri, Jul 15, 2016 at 5:20 PM, Jesper Led Lauridsen TS Infra server <j...@dr.dk> wrote:
>>
>>> Hi,
>>>
>>> How do I determine in which log etc. that a healing is in progress or started, and how do I force it if it has not started?
>>>
>>> Additional info: I have some problem with a volume. If I execute 'gluster volume heal <volname> info' the command just hangs, but if I execute 'gluster volume heal <volname> info split-brain' it returns that no files are in split-brain. Yet there are, and I have successfully recovered another one.
>>
>> If the command hangs there is a chance that operations on the file may have led to stale locks. Could you give the output of a statedump?
>> You can follow https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/ to generate the files.
>
> Thanks for your response. You are right, there was a stale lock. But I am sorry, I rebooted all my cluster nodes, so I guess (without knowing) that there is no reason to give you the output of a statedump?
>
> What I can confirm and give of information is:
> * All the servers failed to reboot, so I had to push the button. They all failed with the message
>   "Unmounting pipe file system: Cannot create link /etc/mtab~ Perhaps there is a stale lock file?"
> * After 2 nodes had rebooted, the command executed without any problem and reported a couple of split-brain entries (both directories and files).
> * Stracing the command showed that it was just looping, so basically the command wasn't hanging; it just couldn't finish.
> * I am using "glusterfs-3.6.2-1.el6.x86_64", but I am hoping to upgrade to 3.6.9 this weekend.
> * The file I referred to here now has the same output on both replicas when getting getfattr information. The trusted.afr.glu_rhevtst_dr2_data_01-client-[0,1] and trusted.afr.dirty xattrs are now all zero.

If you are anyway looking to upgrade, why not upgrade to 3.7.13, which is the latest stable version?

>>> I just have a problem with this one: I can't determine whether there is a healing process running or not.
>>>
>>> I have changed 'trusted.afr.glu_rhevtst_dr2_data_01-client-1' to 0x on the file located on glustertst03 and executed an 'ls -lrt' on the file on the gluster mount.
>>>
>>> [root@glustertst04 ]# getfattr -d -m . -e hex /bricks/brick1/glu_rhevtst_dr2_data_01/6bdc67d1-4ae5-47e3-86c3-ef0916996862/images/7669ca25-028e-40a5-9dc8-06c716101709/a1ae3612-bb89-45d8-8041-134c34592eab
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: bricks/brick1/glu_rhevtst_dr2_data_01/6bdc67d1-4ae5-47e3-86c3-ef0916996862/images/7669ca25-028e-40a5-9dc8-06c716101709/a1ae3612-bb89-45d8-8041-134c34592eab
>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
>>> trusted.afr.dirty=0x
>>> trusted.afr.glu_rhevtst_dr2_data_01-client-0=0x4c70
>>> trusted.afr.glu_rhevtst_dr2_data_01-client-1=0x
>>> trusted.gfid=0x7575f870875b4c899fd81ef16be3b1a1
>>> trusted.glusterfs.quota.70145d52-bb80-42ce-b437-64be6ee4a7d4.contri=0x0001606dc000
>>> trusted.pgfid.70145d52-bb80-42ce-b437-64be6ee4a7d4=0x0001
>>>
>>> [root@glustertst03 ]# getfattr -d -m . -e hex /bricks/brick1/glu_rhevtst_dr2_data_01/6bdc67d1-4ae5-47e3-86c3-ef0916996862/images/7669ca25-028e-40a5-9dc8-06c716101709/a1ae3612-bb89-45d8-8041-134c34592eab
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: bricks/brick1/glu_rhevtst_dr2_data_01/6bdc67d1-4ae5-47e3-86c3-ef0916996862/images/7669ca25-028e-40a5-9dc8-06c716101709/a1ae3612-bb89-45d8-8041-134c34592eab
>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
>>> trusted.afr.dirty=0x0027
>>> trusted.afr.glu_rhevtst_dr2_data_01-client-0=0x
>>> trusted.afr.glu_rhevtst_dr2_data_01-client-1=0x
>>> trusted.gfid=0x7575f870875b4c899fd81ef16be3b1a1
>>> trusted.glusterfs.quota.70145d52-bb80-42ce-b437-64be6ee4a7d4.contri=0x000160662000
>>> trusted.pgfid.70145d52-bb80-42ce-b437-64be6ee4a7d4=0x0001
>>>
>>> [root@glustertst04 ]# stat /var/run/gluster/glu_rhevtst_dr2_data_01/6bdc67d1-4ae5-47e3-86c3-ef0916996862/images/7669ca25-028e-40a5-9dc8-06c716101709/a1ae3612-bb89-45d8-8041-134c34592eab
>>>   File: `/var/run/gluster/glu_rhevtst_dr2_data_01/6bdc67d1-4ae5-47e3-86c3-ef0916996862/images/7669ca25-028e-40a5-9dc8-06c716101709/a1ae3612-bb89-45d8-8041-134c34592eab'
>>>   Size: 21474836480  Blocks: 11548384  IO Block: 131072  regular file
>>> Device: 31h/49d  Inode: 11517990069246079393  Links: 1
>>> Access: (0660/-rw-rw----)  Uid: (   36/vdsm)  Gid: (   36/ kvm)
>>> Access: 2016-07-15 13:33:47.860224289 +0200
>>> Modify: 2016-07-15 13:34:44.396125458 +0200
>>> Change: 2016-07-15 13:34:44.397125492 +0200
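For completeness, the statedump requested above can be generated roughly like this (a sketch; VOLNAME is a placeholder, and the dump location may differ on some builds):

    # ask every brick process of the volume to dump its state
    gluster volume statedump VOLNAME

    # dumps land under /var/run/gluster by default; blocked lock requests show up as BLOCKED entries
    grep -B2 -A2 BLOCKED /var/run/gluster/*.dump.*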
[Gluster-users] Geo-replication configuration issue
Hello

On a fresh Gluster 3.8 install, I am not able to configure a geo-replicated volume. Everything works fine up to starting the volume; however, Gluster reports a faulty status.

When looking at the logs (gluster_error):

[2016-07-18 16:30:04.371686] I [cli.c:730:main] 0-cli: Started running gluster with version 3.8.0
[2016-07-18 16:30:04.435854] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-07-18 16:30:04.435921] I [socket.c:2468:socket_event_handler] 0-transport: disconnecting now
[2016-07-18 16:30:04.997986] I [input.c:31:cli_batch] 0-: Exiting with: 0

From the geo-replication logs, it seems I have an SSH configuration issue:

[2016-07-18 16:35:28.293524] I [monitor(monitor):266:monitor] Monitor:
[2016-07-18 16:35:28.293740] I [monitor(monitor):267:monitor] Monitor: starting gsyncd worker
[2016-07-18 16:35:28.352266] I [gsyncd(/gluster/backupvol):710:main_i] : syncing: gluster://localhost:backupvol -> ssh://root@ks4:gluster://localhost:backupvol
[2016-07-18 16:35:28.352489] I [changelogagent(agent):73:__init__] ChangelogAgent: Agent listining...
[2016-07-18 16:35:28.492474] E [syncdutils(/gluster/backupvol):252:log_raise_exception] : connection to peer is broken
[2016-07-18 16:35:28.492706] E [resource(/gluster/backupvol):226:errlog] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-Fs2XND/b63292d563144e7818235d683516731d.sock root@ks4 /nonexistent/gsyncd --session-owner 3281242a-ab45-4a0d-99e5-2965b4ac5840 -N --listen --timeout 120 gluster://localhost:backupvol" returned with 255, saying:
[2016-07-18 16:35:28.492794] E [resource(/gluster/backupvol):230:logerr] Popen: ssh> key_load_public: invalid format
[2016-07-18 16:35:28.492863] E [resource(/gluster/backupvol):230:logerr] Popen: ssh> Permission denied (publickey,password).
[2016-07-18 16:35:28.493004] I [syncdutils(/gluster/backupvol):220:finalize] : exiting.
[2016-07-18 16:35:28.494045] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2016-07-18 16:35:28.494204] I [syncdutils(agent):220:finalize] : exiting.
[2016-07-18 16:35:28.494143] I [monitor(monitor):333:monitor] Monitor: worker(/gluster/backupvol) died before establishing connection

I tried to fix them to the best of my knowledge but I am missing something. Can you help me fix it?

Thanks,
Alex

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
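A few manual checks that can narrow down the "key_load_public: invalid format" / "Permission denied" errors above (a sketch; ks4 is the slave host from the log, and the file names assume a default geo-replication setup):

    # does the key gsyncd uses actually authenticate against the slave?
    ssh -i /var/lib/glusterd/geo-replication/secret.pem root@ks4 echo ok

    # compare the public half of the key with what ended up on the slave
    cat /var/lib/glusterd/geo-replication/secret.pem.pub
    ssh root@ks4 cat /root/.ssh/authorized_keys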
Re: [Gluster-users] Shard storage suggestions
On Mon, Jul 18, 2016 at 3:55 PM, Krutika Dhananjay wrote:

> Hi,
>
> The suggestion you gave was in fact considered at the time of writing the shard translator.
> Here are some of the considerations for sticking with a single directory as opposed to a two-tier classification of shards based on the initial chars of the uuid string:
>
> i) Even for a 4TB disk with the smallest possible shard size of 4MB, there will only be a max of 1048576 entries under /.shard in the worst case - a number far less than the max number of inodes that are supported by most backend file systems.
>
> ii) Entry self-heal for a single directory, even with the simplest case of one entry deleted/created while a replica is down, required crawling the whole sub-directory tree, figuring out which entry is present/absent between source and sink, and then healing it to the sink. With granular entry self-heal [1], we no longer have to live under this limitation.
>
> iii) Resolving shards from the original file name as given by the application to the corresponding shard within a single directory (/.shard in the existing case) would mean looking up the parent dir /.shard first, followed by a lookup on the actual shard that is to be operated on. But having a two-tier sub-directory structure means that we not only have to resolve (or look up) /.shard first, but also the directories '/.shard/d2', '/.shard/d2/18', and '/.shard/d2/18/d218cd1c-4bd9-40d7-9810-86b3f7932509' before finally looking up the shard, which is a lot of network operations. Yes, these are all one-time operations and the results can be cached in the inode table, but still, on account of having to have dynamic gfids (as opposed to just /.shard, which has a fixed gfid - be318638-e8a0-4c6d-977d-7a937aa84806), it is trivial to resolve the name of the shard to gfid, or the parent name to parent gfid _even_ in memory.

s/trivial/non-trivial/ in the last sentence above.

Oh and [1] - https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.8/granular-entry-self-healing.md

-Krutika

> Are you unhappy with the performance? What's your typical VM image size, shard block size and the capacity of individual bricks?
>
> -Krutika
>
> On Mon, Jul 18, 2016 at 2:43 PM, Gandalf Corvotempesta <gandalf.corvotempe...@gmail.com> wrote:
>
>> 2016-07-18 9:53 GMT+02:00 Oleksandr Natalenko:
>> > I'd say, like this:
>> >
>> > /.shard/d2/18/D218CD1C-4BD9-40D7-9810-86B3F7932509.1
>>
>> Yes, something like this.
>> I was on mobile when I wrote. Your suggestion is better than mine.
>>
>> Probably, using a directory for the whole shard is also better and keeps the directory structure clear:
>>
>> /.shard/d2/18/D218CD1C-4BD9-40D7-9810-86B3F7932509/D218CD1C-4BD9-40D7-9810-86B3F7932509.1
>>
>> The current shard directory structure doesn't scale at all.
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] New cluster - first experience
2016-07-16 15:07 GMT+02:00 Gandalf Corvotempesta:
> 2016-07-16 15:04 GMT+02:00 Gandalf Corvotempesta:
>> [ ID] Interval       Transfer     Bandwidth
>> [  3]  0.0-10.0 sec  2.31 GBytes  1.98 Gbits/sec
>
> Obviously I did the same test with all gluster servers. Speed is always
> near 2gbit, so the network is not an issue here.

Any help? I would like to start real tests with virtual machines and proxmox before the August holiday.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Shard storage suggestions
2016-07-18 12:25 GMT+02:00 Krutika Dhananjay:
> Hi,
>
> The suggestion you gave was in fact considered at the time of writing the shard translator.
> Here are some of the considerations for sticking with a single directory as opposed to a two-tier classification of shards based on the initial chars of the uuid string:
>
> i) Even for a 4TB disk with the smallest possible shard size of 4MB, there will only be a max of 1048576 entries under /.shard in the worst case - a number far less than the max number of inodes that are supported by most backend file systems.

This is with just one single file. What about thousands of huge sharded files? In a petabyte-scale cluster, having thousands of huge files should be considered normal.

> iii) Resolving shards from the original file name as given by the application to the corresponding shard within a single directory (/.shard in the existing case) would mean looking up the parent dir /.shard first, followed by a lookup on the actual shard that is to be operated on. But having a two-tier sub-directory structure means that we not only have to resolve (or look up) /.shard first, but also the directories '/.shard/d2', '/.shard/d2/18', and '/.shard/d2/18/d218cd1c-4bd9-40d7-9810-86b3f7932509' before finally looking up the shard, which is a lot of network operations. Yes, these are all one-time operations and the results can be cached in the inode table, but still, on account of having to have dynamic gfids (as opposed to just /.shard, which has a fixed gfid - be318638-e8a0-4c6d-977d-7a937aa84806), it is trivial to resolve the name of the shard to gfid, or the parent name to parent gfid _even_ in memory.

What about just one single level?

/.shard/d218cd1c-4bd9-40d7-9810-86b3f7932509/d218cd1c-4bd9-40d7-9810-86b3f7932509.1

You have the GFID, thus there is no need to crawl multiple levels; just access the proper path directly.

With this solution, you have 1,048,576 entries for a 4TB sharded file with a 4MB shard size. With the current implementation, you have 1,048,576 entries for each sharded file: if I have 100 4TB files, I'll end up with 1,048,576 * 100 = 104,857,600 files in a single directory.

> Are you unhappy with the performance? What's your typical VM image size, shard block size and the capacity of individual bricks?

No, I'm just thinking about this optimization.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Shard storage suggestions
Hi,

The suggestion you gave was in fact considered at the time of writing the shard translator.
Here are some of the considerations for sticking with a single directory as opposed to a two-tier classification of shards based on the initial chars of the uuid string:

i) Even for a 4TB disk with the smallest possible shard size of 4MB, there will only be a max of 1048576 entries under /.shard in the worst case - a number far less than the max number of inodes that are supported by most backend file systems.

ii) Entry self-heal for a single directory, even with the simplest case of one entry deleted/created while a replica is down, required crawling the whole sub-directory tree, figuring out which entry is present/absent between source and sink, and then healing it to the sink. With granular entry self-heal [1], we no longer have to live under this limitation.

iii) Resolving shards from the original file name as given by the application to the corresponding shard within a single directory (/.shard in the existing case) would mean looking up the parent dir /.shard first, followed by a lookup on the actual shard that is to be operated on. But having a two-tier sub-directory structure means that we not only have to resolve (or look up) /.shard first, but also the directories '/.shard/d2', '/.shard/d2/18', and '/.shard/d2/18/d218cd1c-4bd9-40d7-9810-86b3f7932509' before finally looking up the shard, which is a lot of network operations. Yes, these are all one-time operations and the results can be cached in the inode table, but still, on account of having to have dynamic gfids (as opposed to just /.shard, which has a fixed gfid - be318638-e8a0-4c6d-977d-7a937aa84806), it is trivial to resolve the name of the shard to gfid, or the parent name to parent gfid _even_ in memory.

Are you unhappy with the performance? What's your typical VM image size, shard block size and the capacity of individual bricks?

-Krutika

On Mon, Jul 18, 2016 at 2:43 PM, Gandalf Corvotempesta <gandalf.corvotempe...@gmail.com> wrote:

> 2016-07-18 9:53 GMT+02:00 Oleksandr Natalenko:
> > I'd say, like this:
> >
> > /.shard/d2/18/D218CD1C-4BD9-40D7-9810-86B3F7932509.1
>
> Yes, something like this.
> I was on mobile when I wrote. Your suggestion is better than mine.
>
> Probably, using a directory for the whole shard is also better and keeps the directory structure clear:
>
> /.shard/d2/18/D218CD1C-4BD9-40D7-9810-86B3F7932509/D218CD1C-4BD9-40D7-9810-86B3F7932509.1
>
> The current shard directory structure doesn't scale at all.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
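As an aside, the current layout is easy to inspect by hand: the first block stays at the file's own path and the remaining blocks live as /.shard/<GFID>.<N> on the bricks. A sketch, where /mnt/vol/vm.img and /bricks/brick1/vol are placeholders and the gfid is the one used as an example in this thread:

    # get the file's gfid from a client mount
    getfattr -n glusterfs.gfid.string --only-values /mnt/vol/vm.img; echo

    # then, on any brick, list that file's shards
    ls /bricks/brick1/vol/.shard/ | grep d218cd1c-4bd9-40d7-9810-86b3f7932509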
Re: [Gluster-users] lingering <gfid:*> entries in volume heal, gluster 3.6.3
On Fri, 2016-07-15 at 22:24 +0530, Ravishankar N wrote:
> On 07/15/2016 09:55 PM, Kingsley wrote:
> > This has revealed something. I'm now seeing lots of lines like this in
> > the shd log:
> >
> > [2016-07-15 16:20:51.098152] D [afr-self-heald.c:516:afr_shd_index_sweep] 0-callrec-replicate-0: got entry: eaa43674-b1a3-4833-a946-de7b7121bb88
> > [2016-07-15 16:20:51.099346] D [client-rpc-fops.c:1523:client3_3_inodelk_cbk] 0-callrec-client-2: remote operation failed: Stale file handle
> > [2016-07-15 16:20:51.100683] D [client-rpc-fops.c:2686:client3_3_opendir_cbk] 0-callrec-client-2: remote operation failed: Stale file handle. Path: (eaa43674-b1a3-4833-a946-de7b7121bb88)
>
> Looks like the files are not present at all in client-2, which is why you see these messages.
> Find out the file/directory names corresponding to these gfids from one of the healthy bricks and see if they are present in client-2 as well.
> If not, try accessing them from the mount. That should create any missing entries in client-2. Then launch heal again.
>
> Hope this helps.
> Ravi

Hi,

Thanks - I found the files that these entries corresponded to. Indeed they weren't on client-2. From a client I did an ls -lR of the directory tree that they were all in, and then self-heal automatically fixed everything, so now all is back in order.

Thank you for your help!

Cheers,
Kingsley.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
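For anyone hitting the same thing, one way to map such a gfid back to a path on a healthy brick (a sketch; /bricks/brick1/callrec is a placeholder brick root, and the gfid is the one from the log above):

    gfid=eaa43674-b1a3-4833-a946-de7b7121bb88
    # for a regular file, the .glusterfs entry is a hard link to the real file
    ls -l /bricks/brick1/callrec/.glusterfs/${gfid:0:2}/${gfid:2:2}/${gfid}
    find /bricks/brick1/callrec -samefile /bricks/brick1/callrec/.glusterfs/${gfid:0:2}/${gfid:2:2}/${gfid} -not -path '*/.glusterfs/*'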
Re: [Gluster-users] Shard storage suggestions
2016-07-18 9:53 GMT+02:00 Oleksandr Natalenko:
> I'd say, like this:
>
> /.shard/d2/18/D218CD1C-4BD9-40D7-9810-86B3F7932509.1

Yes, something like this.
I was on mobile when I wrote. Your suggestion is better than mine.

Probably, using a directory for the whole shard is also better and keeps the directory structure clear:

/.shard/d2/18/D218CD1C-4BD9-40D7-9810-86B3F7932509/D218CD1C-4BD9-40D7-9810-86B3F7932509.1

The current shard directory structure doesn't scale at all.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
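The mapping being proposed here is simple to express; a tiny, purely illustrative sketch of how a shard path would be derived from the gfid under the suggested two-level scheme:

    gfid=d218cd1c-4bd9-40d7-9810-86b3f7932509
    echo "/.shard/${gfid:0:2}/${gfid:2:2}/${gfid}/${gfid}.1"
    # -> /.shard/d2/18/d218cd1c-4bd9-40d7-9810-86b3f7932509/d218cd1c-4bd9-40d7-9810-86b3f7932509.1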
Re: [Gluster-users] Shard storage suggestions
I'd say, like this:

/.shard/d2/18/D218CD1C-4BD9-40D7-9810-86B3F7932509.1

On 18.07.2016 10:31, Gandalf Corvotempesta wrote:
> AFAIK gluster stores every shard in a single directory. With huge files
> this could lead to millions of small shard files in the same directory,
> which certainly leads to a performance issue.
>
> Why not move each shard into a dedicated directory, maybe also with a
> defined nested structure? For example, from this:
>
> /.shard/D218CD1C-4BD9-40D7-9810-86B3F7932509.1
>
> To something like this:
>
> /.shard/d/d2/D218CD1C-4BD9-40D7-9810-86B3F7932509/D218CD1C-4BD9-40D7-9810-86B3F7932509.1
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Shard storage suggestions
AFAIK gluster stores every shard in a single directory. With huge files this could lead to millions of small shard files in the same directory, which certainly leads to a performance issue.

Why not move each shard into a dedicated directory, maybe also with a defined nested structure? For example, from this:

/.shard/d218cd1c-4bd9-40d7-9810-86b3f7932509.1

To something like this:

/.shard/d/d2/d218cd1c-4bd9-40d7-9810-86b3f7932509/d218cd1c-4bd9-40d7-9810-86b3f7932509.1

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users