[Gluster-devel] spurious failures tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t
hi, http://build.gluster.org/job/rackspace-regression-2GB-triggered/11757/consoleFull has the logs. Could you please look into it? Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] spurious failures tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t
Yep will have a look - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Joseph Fernandes josfe...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, July 1, 2015 1:44:44 PM Subject: spurious failures tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t hi, http://build.gluster.org/job/rackspace-regression-2GB-triggered/11757/consoleFull has the logs. Could you please look into it. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Gluster Docker images are available at docker hub
On Wed, Jul 1, 2015 at 11:32 AM, Krishnan Parthasarathi kpart...@redhat.com wrote: Yeah this followed by glusterd restart should help But frankly, I was hoping that 'rm'-ing the file isn't a neat way to fix this issue Why is rm not a neat way? Is it because the container deployment tool needs to know about gluster internals? But isn't a Dockerfile dealing with details of the service(s) that is being deployed in a container? IIUC post 'rm' we need to restart glusterd, but touching a file as part of the Dockerfile would bring up glusterd with the new UUID. I haven't tried this tho'. I guess, like Humble said, fixing this in glusterd seems ideal. The above discussion was _if_ glusterd continues the current behavior, then either the 'rm' or 'touch' methods could be considered. thanx, deepak ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Gluster Docker images are available at docker hub
On 07/01/2015 03:03 PM, Deepak Shetty wrote: On Wed, Jul 1, 2015 at 11:32 AM, Krishnan Parthasarathi kpart...@redhat.com wrote: Yeah this followed by glusterd restart should help But frankly, I was hoping that 'rm'-ing the file isn't a neat way to fix this issue Why is rm not a neat way? Is it because the container deployment tool needs to know about gluster internals? But isn't a Dockerfile dealing with details of the service(s) that is being deployed in a container? IIUC post 'rm' we need to restart glusterd, but touching a file as part of the Dockerfile would bring up glusterd with the new UUID. I haven't tried this tho'. I guess, like Humble said, fixing this in glusterd seems ideal. The above discussion was _if_ glusterd continues the current behavior, then either the 'rm' or 'touch' methods could be considered. Here you go, http://review.gluster.org/11488 should fix it, and yes, the changes are in glusterD ;) thanx, deepak -- ~Atin ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Gluster Docker images are available at docker hub
On Wed, Jul 1, 2015 at 9:39 AM, Atin Mukherjee amukh...@redhat.com wrote: On 05/06/2015 12:31 PM, Humble Devassy Chirammal wrote: Hi All, Docker images of GlusterFS 3.6 for Fedora ( 21) and CentOS (7) are now available at docker hub ( https://registry.hub.docker.com/u/gluster/ ). These images can be used to deploy GlusterFS containers. The blog entry at planet.gluster.org [1] has details about how these images can be used. Please let me know if you have any comments/feedback/questions. [1] Building GlusterFS in a docker container @ planet.gluster.org [2] http://humblec.com/building-glusterfs-in-a-docker-container/ Hi Humble, As discussed yesterday, post daemon refactoring we generate the UUID at the time of glusterD init, and this has caused an issue in bringing up multiple docker containers, as UUIDs will be the same across different containers since yum install brings up glusterd and persists the information in /var/lib/glusterd.info. This also means we should add a testcase, and QEs should cover this scenario as part of their testing to ensure we don't break gluster for common-image-base type scenarios (containers, ovirt, virtualization templates). thanx, deepak ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Huge memory consumption with quota-marker
Hi, The new marker xlator uses the syncop framework to update quota-size in the background; it uses one synctask per write FOP. If there are 100 parallel writes with all different inodes but on the same directory '/dir', there will be ~100 txns waiting in queue to acquire a lock on its parent, i.e. '/dir'. Each of these txns uses a synctask, and each synctask allocates a stack of 2M (default size), so a total of ~200M usage. This usage can increase depending on the load. I am thinking of setting the stacksize for synctask to 256k; will this be sufficient, given that we perform very limited operations within a synctask during marker updation? Please provide suggestions on solving this problem. Thanks, Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] spurious failures tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t
On Jul 1, 2015 18:42, Raghavendra Talur raghavendra.ta...@gmail.com wrote: On Wed, Jul 1, 2015 at 3:18 PM, Joseph Fernandes josfe...@redhat.com wrote: Hi All, Tests 4-5 are failing, i.e. the following:

TEST $CLI volume start $V0
TEST $CLI volume attach-tier $V0 replica 2 $H0:$B0/${V0}$CACHE_BRICK_FIRST $H0:$B0/${V0}$CACHE_BRICK_LAST

Glusterd logs say:

[2015-07-01 07:33:25.053412] I [rpc-clnt.c:965:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2015-07-01 07:33:25.053851] [run.c:190:runner_log] (--> /build/install/lib/libglusterfs.so.0(_gf_log_callingfn+0x240)[0x7fe8349bfb82] (--> /build/install/lib/libglusterfs.so.0(runner_log+0x192)[0x7fe834a29426] (--> /build/install/lib/glusterfs/3.8dev/xlator/mgmt/glusterd.so(glusterd_volume_start_glusterfs+0xae7)[0x7fe829e475d7] (--> /build/install/lib/glusterfs/3.8dev/xlator/mgmt/glusterd.so(glusterd_brick_start+0x151)[0x7fe829e514e3] (--> /build/install/lib/glusterfs/3.8dev/xlator/mgmt/glusterd.so(glusterd_start_volume+0xba)[0x7fe829ebd534] ) 0-: Starting GlusterFS: /build/install/sbin/glusterfsd -s slave26.cloud.gluster.org --volfile-id patchy.slave26.cloud.gluster.org.d-backends-patchy3 -p /var/lib/glusterd/vols/patchy/run/slave26.cloud.gluster.org-d-backends-patchy3.pid -S /var/run/gluster/e511d04af0bd91bfc3b030969b789d95.socket --brick-name /d/backends/patchy3 -l /var/log/glusterfs/bricks/d-backends-patchy3.log --xlator-option *-posix.glusterd-uuid=aff38c34-7744-4cc0-9aa4-a9fab5a71b2f --brick-port 49172 --xlator-option patchy-server.listen-port=49172
[2015-07-01 07:33:25.070284] I [MSGID: 106144] [glusterd-pmap.c:269:pmap_registry_remove] 0-pmap: removing brick (null) on port 49172
[2015-07-01 07:33:25.071022] E [MSGID: 106005] [glusterd-utils.c:4448:glusterd_brick_start] 0-management: Unable to start brick slave26.cloud.gluster.org:/d/backends/patchy3
[2015-07-01 07:33:25.071053] E [MSGID: 106123] [glusterd-syncop.c:1416:gd_commit_op_phase] 0-management: Commit of operation 'Volume Start' failed on localhost

The volume is 2x2:

LAST_BRICK=3
TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}{0..$LAST_BRICK}

Looking into them, 3 of the bricks are fine, but the 4th brick's log shows:

[2015-07-01 07:33:25.056463] I [MSGID: 100030] [glusterfsd.c:2296:main] 0-/build/install/sbin/glusterfsd: Started running /build/install/sbin/glusterfsd version 3.8dev (args: /build/install/sbin/glusterfsd -s slave26.cloud.gluster.org --volfile-id patchy.slave26.cloud.gluster.org.d-backends-patchy3 -p /var/lib/glusterd/vols/patchy/run/slave26.cloud.gluster.org-d-backends-patchy3.pid -S /var/run/gluster/e511d04af0bd91bfc3b030969b789d95.socket --brick-name /d/backends/patchy3 -l /var/log/glusterfs/bricks/d-backends-patchy3.log --xlator-option *-posix.glusterd-uuid=aff38c34-7744-4cc0-9aa4-a9fab5a71b2f --brick-port 49172 --xlator-option patchy-server.listen-port=49172)
[2015-07-01 07:33:25.064879] I [MSGID: 101190] [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-07-01 07:33:25.068992] I [MSGID: 101173] [graph.c:268:gf_add_cmdline_options] 0-patchy-server: adding option 'listen-port' for volume 'patchy-server' with value '49172'
[2015-07-01 07:33:25.069034] I [MSGID: 101173] [graph.c:268:gf_add_cmdline_options] 0-patchy-posix: adding option 'glusterd-uuid' for volume 'patchy-posix' with value 'aff38c34-7744-4cc0-9aa4-a9fab5a71b2f'
[2015-07-01 07:33:25.069313] I [MSGID: 115034] [server.c:392:_check_for_auth_option] 0-/d/backends/patchy3: skip format check for non-addr auth option auth.login./d/backends/patchy3.allow
[2015-07-01 07:33:25.069316] I [MSGID: 101190] [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2015-07-01 07:33:25.069330] I [MSGID: 115034] [server.c:392:_check_for_auth_option] 0-/d/backends/patchy3: skip format check for non-addr auth option auth.login.18b50c0d-38fb-4b49-bb5e-b203f4217223.password
[2015-07-01 07:33:25.069580] I [rpcsvc.c:2210:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2015-07-01 07:33:25.069647] W [MSGID: 101002] [options.c:952:xl_opt_validate] 0-patchy-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
[2015-07-01 07:33:25.069736] E [socket.c:818:__socket_server_bind] 0-tcp.patchy-server: binding to failed: Address already in use
[2015-07-01 07:33:25.069750] E [socket.c:821:__socket_server_bind] 0-tcp.patchy-server: Port is already in use
[2015-07-01 07:33:25.069763] W [rpcsvc.c:1599:rpcsvc_transport_create] 0-rpc-service: listening on transport failed
[2015-07-01 07:33:25.069774] W [MSGID: 115045] [server.c:996:init] 0-patchy-server: creation of listener failed
[2015-07-01 07:33:25.069788] E [MSGID: 101019] [xlator.c:423:xlator_init] 0-patchy-server: Initialization of volume 'patchy-server' failed, review your volfile again
[2015-07-01 07:33:25.069798] E [MSGID: 101066]
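The failure above is the brick's server xlator being unable to bind the port glusterd assigned (49172) because something is still listening on it. A minimal, gluster-agnostic sketch of how that condition surfaces at the socket level; plain POSIX sockets, and the port number is only an example:

#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int bind_port(uint16_t port)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        fprintf(stderr, "bind(%u) failed: %s\n", (unsigned)port, strerror(errno));
        close(s);
        return -1;
    }
    listen(s, 10);
    return s;
}

int main(void)
{
    int first = bind_port(49172);   /* succeeds: port is now in use */
    int second = bind_port(49172);  /* fails with EADDRINUSE, as in the brick log */

    if (first >= 0)
        close(first);
    if (second >= 0)
        close(second);
    return 0;
}

Running it prints the "Address already in use" failure for the second bind, which is the same condition __socket_server_bind reports when the old brick process (or another listener) still holds the port.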
Re: [Gluster-devel] Progress on adding support for SEEK_DATA and SEEK_HOLE
On 07/01/2015 08:53 AM, Niels de Vos wrote: On Tue, Jun 30, 2015 at 11:48:20PM +0530, Ravishankar N wrote: On 06/22/2015 03:22 PM, Ravishankar N wrote: On 06/22/2015 01:41 PM, Miklos Szeredi wrote: On Sun, Jun 21, 2015 at 6:20 PM, Niels de Vos nde...@redhat.com wrote: Hi, it seems that there could be a reasonable benefit for virtual machine images on a FUSE mountpoint when SEEK_DATA and SEEK_HOLE would be available. At the moment, FUSE does not pass lseek() on to the userspace process that handles the I/O. Other filesystems that do not (need to) track the position in the file-descriptor are starting to support SEEK_DATA/HOLE. One example is NFS: https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-38#section-15.11 I would like to add this feature to Gluster, and am wondering if there are any reasons why it should/could not be added to FUSE. I don't see any reason why it couldn't be added. Please go ahead. Thanks for bouncing the mail to me Niels, I would be happy to work on this. I'll submit a patch by Monday next. Sent a patch @ http://thread.gmane.org/gmane.comp.file-systems.fuse.devel/14752 I've tested it with some skeleton code in gluster-fuse to handle lseek(). Ravi also sent his patch for glusterfs-fuse: http://review.gluster.org/11474 I have posted my COMPLETELY UNTESTED patches to their own Gerrit topic so that we can easily track the progress: http://review.gluster.org/#/q/status:open+project:glusterfs+branch:master+topic:wip/SEEK_HOLE My preference goes to share things early and make everyone able to follow progress (know where to find the latest patches). Assistance in testing, reviewing and improving is welcome! There are some outstanding things like seek() for ec and sharding, and probably more. This all was done as a suggestion from Christopher (kripper) Pereira, for improving the handling of sparse files (like most VM images). I've posted the patch for ec in the same Gerrit topic: http://review.gluster.org/11494/ It has not been tested, and some discussion will be needed about whether it's really necessary to send the request to all subvolumes. The lock and the xattrop are absolutely needed. Even if we send the request to only one subvolume, we need to know which ones are healthy (to avoid sending the request to a brick that could have invalid hole information). This could have been done in open, but since NFS does not issue open calls, we cannot rely on that. Once we know which bricks are healthy we could opt for sending the request only to one of them. In this case we need to be aware that even healthy bricks could have different hole locations. What do you think? Xavi ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Gluster Docker images are available at docker hub
On Wed, Jul 1, 2015 at 11:51 AM, Krishnan Parthasarathi kpart...@redhat.com wrote: We do have a way to tackle this situation from the code. Raghavendra Talur will be sending a patch shortly. We should fix it by undoing what daemon-refactoring did, which broke the lazy creation of the uuid for a node. Fixing it elsewhere is just masking the real cause. Meanwhile 'rm' is the stop-gap arrangement. Agreed. The patch I was supposed to send may not solve this issue. Atin understands this code much better than I do, and I guess he will be sending the patch to generate the uuid on first volume creation or first peer probe instead of on first start of glusterd. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel -- *Raghavendra Talur * ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
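A minimal sketch of the lazy-generation idea mentioned above: generate and persist the node UUID only the first time it is actually needed (say, on the first volume create or peer probe) rather than at glusterd init, so containers or VMs cloned from a common image each end up with their own UUID. This is only an illustration using libuuid (link with -luuid); the helper name is a placeholder and the real glusterd.info carries more fields than just the UUID:

#include <stdio.h>
#include <string.h>
#include <uuid/uuid.h>   /* libuuid: uuid_generate(), uuid_unparse() */

/* Placeholder: the real file also stores other fields (operating-version etc.);
 * treated here as a single-line UUID store purely for illustration. */
#define NODE_UUID_FILE "/var/lib/glusterd/glusterd.info"

/* Return the node UUID in 'out' (must be >= 37 bytes), generating and
 * persisting it only on first use instead of at daemon init. */
static int get_node_uuid(char *out, size_t outlen)
{
    FILE *fp = fopen(NODE_UUID_FILE, "r");
    if (fp) {
        /* Already provisioned on this node: reuse the stored value. */
        if (fgets(out, (int)outlen, fp)) {
            out[strcspn(out, "\n")] = '\0';
            fclose(fp);
            return 0;
        }
        fclose(fp);
    }

    /* First real use on this node: generate a fresh UUID now, so every
     * container/VM cloned from a common image gets its own identity. */
    uuid_t uuid;
    uuid_generate(uuid);
    uuid_unparse(uuid, out);

    fp = fopen(NODE_UUID_FILE, "w");
    if (!fp)
        return -1;
    fprintf(fp, "%s\n", out);
    fclose(fp);
    return 0;
}

int main(void)
{
    char buf[64];
    if (get_node_uuid(buf, sizeof(buf)) == 0)
        printf("node uuid: %s\n", buf);
    return 0;
}

The point of the design choice is simply that a yum install in an image build never triggers UUID generation, because nothing image-side ever asks for the UUID.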
Re: [Gluster-devel] [Gluster-users] Gluster Docker images are available at docker hub
Yeah this followed by glusterd restart should help But frankly, I was hoping that 'rm'-ing the file isn't a neat way to fix this issue Why is rm not a neat way? Is it because the container deployment tool needs to know about gluster internals? But isn't a Dockerfile dealing with details of the service(s) that is being deployed in a container? I do think fixing the dockerfile is *not* the correct way. That said, the use case is not just containers. This issue can pop up in oVirt or virtualization environments as well. The VM template may have a pre-configured glusterd in it, and the pool created out of this template can show the same behaviour. I believe fixing it in the gluster code base would be the right thing to do. Thanks Atin for the heads up! AFAICT we have 2 scenarios: 1) Non-container scenario, where the current behaviour of glusterd persisting the info in the .info file makes sense 2) Container scenario, where the same image gets used as the base, hence all containers get the same UUID For this we can have an option that instructs glusterd to refresh the UUID during the next start. --Humble On Wed, Jul 1, 2015 at 11:32 AM, Krishnan Parthasarathi kpart...@redhat.com wrote: Yeah this followed by glusterd restart should help But frankly, I was hoping that 'rm'-ing the file isn't a neat way to fix this issue Why is rm not a neat way? Is it because the container deployment tool needs to know about gluster internals? But isn't a Dockerfile dealing with details of the service(s) that is being deployed in a container? AFAICT we have 2 scenarios: 1) Non-container scenario, where the current behaviour of glusterd persisting the info in the .info file makes sense 2) Container scenario, where the same image gets used as the base, hence all containers get the same UUID For this we can have an option that instructs glusterd to refresh the UUID during the next start. Maybe something like the presence of a file /var/lib/glusterd/refresh_uuid makes glusterd refresh the UUID in .info and then delete this file; that way, the Dockerfile can touch this file post the gluster rpm install step and things should work as expected? If container deployment needs are different, it should address issues like the above. If we start addressing glusterd's configuration handling for every new deployment technology it would quickly become unmaintainable. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Problems when using different hostnames in a brick and a peer
Hi,

Recently, my company needed to change the hostnames used in our Gluster pool. At first, we had two Gluster nodes called storage1 and storage2. Our volumes used two bricks: storage1:/MYVOLUME and storage2:/MYVOLUME. We put the storage1 and storage2 IPs in the /etc/hosts file of our nodes and of our client servers. After some time, more client servers started using Gluster and we discovered that using hostnames without a domain (via /etc/hosts) on all client servers is a pain in the a$$ :(. So, we decided to change them to something like storage1.mydomain.com and storage2.mydomain.com. Remember that, at this point, we already had some volumes (with bricks):

$ gluster volume info MYVOL
[...]
Brick1: storage1:/MYDIR
Brick2: storage2:/MYDIR

For simplicity, let's consider that we had two Gluster nodes, each one with the following entries in /etc/hosts:

10.10.10.1 storage1
10.10.10.2 storage2

To implement the hostname changes, we changed the /etc/hosts file to:

10.10.10.1 storage1 storage1.mydomain.com
10.10.10.2 storage2 storage2.mydomain.com

And we ran on storage1:

$ gluster peer probe storage2.mydomain.com
peer probe: success

Everything worked well for some time, but glusterd started to fail after any reboot:

$ service glusterfs-server status
glusterfs-server start/running, process 14714
$ service glusterfs-server restart
glusterfs-server stop/waiting
glusterfs-server start/running, process 14860
$ service glusterfs-server status
glusterfs-server stop/waiting

To start the service again, it was necessary to roll back the hostname1 entry to storage2 in /var/lib/glusterd/peers/OUR_UUID. After some trial and error, we discovered that if we changed the order of the entries in /etc/hosts and repeated the process, everything worked. That is, from:

10.10.10.1 storage1 storage1.mydomain.com
10.10.10.2 storage2 storage2.mydomain.com

To:

10.10.10.1 storage1.mydomain.com storage1
10.10.10.2 storage2.mydomain.com storage2

And run:

gluster peer probe storage2.mydomain.com
service glusterfs-server restart

So we checked the glusterd debug log and the GlusterFS source code, and discovered that the big secret was the function glusterd_friend_find_by_hostname, in the file xlators/mgmt/glusterd/src/glusterd-utils.c. This function is called for each brick that isn't a local brick and does the following:

- It checks if the brick hostname is equal to some peer hostname;
- If it is, this peer is our wanted friend;
- If not, it gets the brick IP (resolving the hostname using the function getaddrinfo) and checks if the brick IP is equal to the peer hostname;
  - That is, we could have run "gluster peer probe 10.10.10.2", since the brick IP (storage2 resolves to 10.10.10.2) would be equal to the peer hostname (10.10.10.2);
- If it is, this peer is our wanted friend;
- If not, it gets the reverse of the brick IP (using the function getnameinfo) and checks if the brick reverse is equal to the peer hostname;
  - This is why changing the order of the entries in /etc/hosts worked as a workaround for us;
- If not, it returns an error (and glusterd will fail).

However, we think that comparing the brick IP (resolving the brick hostname) and the peer IP (resolving the peer hostname) would be a simpler and more comprehensive solution. Since brick and peer would have different hostnames but the same IP, it would work.
The solution could be:

- It checks if the brick hostname is equal to some peer hostname;
- If it is, this peer is our wanted friend;
- If not, it gets both the brick IPs (resolving the brick hostname using getaddrinfo) and the peer IPs (resolving the peer hostname) and, for each IP pair, checks if a brick IP is equal to a peer IP (a small sketch of this comparison follows below);
- If it is, this peer is our wanted friend;
- If not, it returns an error (and glusterd will fail).

What do you think about it? -- *Rarylson Freitas* Computer Engineer ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
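A small sketch of the comparison proposed above: resolve both the brick hostname and the peer hostname with getaddrinfo() and check whether they share at least one address. Plain POSIX resolver calls; the function name and the hostnames in main() are only illustrative, not the actual glusterd code:

#include <netdb.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Return true if hostname a and hostname b resolve to at least one common
 * address (the "same IP, different names" case described in the mail). */
static bool hostnames_share_address(const char *a, const char *b)
{
    struct addrinfo hints, *res_a = NULL, *res_b = NULL;
    bool match = false;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* compare both IPv4 and IPv6 results */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(a, NULL, &hints, &res_a) != 0)
        return false;
    if (getaddrinfo(b, NULL, &hints, &res_b) != 0) {
        freeaddrinfo(res_a);
        return false;
    }

    for (struct addrinfo *pa = res_a; pa && !match; pa = pa->ai_next) {
        for (struct addrinfo *pb = res_b; pb; pb = pb->ai_next) {
            if (pa->ai_family == pb->ai_family &&
                pa->ai_addrlen == pb->ai_addrlen &&
                memcmp(pa->ai_addr, pb->ai_addr, pa->ai_addrlen) == 0) {
                match = true;
                break;
            }
        }
    }

    freeaddrinfo(res_a);
    freeaddrinfo(res_b);
    return match;
}

int main(void)
{
    /* e.g. "storage2" (brick) vs "storage2.mydomain.com" (peer) from above */
    const char *brick_host = "storage2";
    const char *peer_host = "storage2.mydomain.com";

    printf("%s and %s %s a common address\n", brick_host, peer_host,
           hostnames_share_address(brick_host, peer_host) ? "share" : "do not share");
    return 0;
}

With this kind of check, the /etc/hosts ordering would no longer matter, since both names resolve to 10.10.10.2 either way.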
Re: [Gluster-devel] Unable to send patches to review.gluster.org
+ Infra, can any one of you just take a look at it? On 07/02/2015 09:53 AM, Anuradha Talur wrote: Hi, I'm unable to send patches to r.g.o, also not able to login. I'm getting the following errors respectively: 1) Permission denied (publickey). fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists. 2) Internal server error or forbidden access. Is anyone else facing the same issue? -- ~Atin ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Unable to send patches to review.gluster.org
Me too. Earlier (past week or so) this error used to last for about 15-20 minutes, but today seems to be its day. Venky On Thu, Jul 2, 2015 at 9:53 AM, Anuradha Talur ata...@redhat.com wrote: Hi, I'm unable to send patches to r.g.o, also not able to login. I'm getting the following errors respectively: 1) Permission denied (publickey). fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists. 2) Internal server error or forbidden access. Is anyone else facing the same issue? -- Thanks, Anuradha. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Huge memory consumption with quota-marker
Yes, we could take synctask size as an argument for synctask_create. The increase in synctask threads is not really a problem; it can't grow beyond 16 (SYNCENV_PROC_MAX). - Original Message - On 07/02/2015 10:40 AM, Krishnan Parthasarathi wrote: - Original Message - On Wednesday 01 July 2015 08:41 AM, Vijaikumar M wrote: Hi, The new marker xlator uses the syncop framework to update quota-size in the background; it uses one synctask per write FOP. If there are 100 parallel writes with all different inodes but on the same directory '/dir', there will be ~100 txns waiting in queue to acquire a lock on its parent, i.e. '/dir'. Each of these txns uses a synctask, and each synctask allocates a stack of 2M (default size), so a total of ~200M usage. This usage can increase depending on the load. I am thinking of setting the stacksize for synctask to 256k; will this be sufficient, given that we perform very limited operations within a synctask during marker updation? Seems like a good idea to me. Do we need a 256k stacksize or can we live with something even smaller? It was 16K when synctask was introduced. This is a property of syncenv. We could create a separate syncenv for marker transactions which has smaller stacks. env->stacksize (and SYNCTASK_DEFAULT_STACKSIZE) was increased to 2MB to support pump xlator based data migration for replace-brick. For the no. of stack frames a marker transaction could use at any given time, we could use much less, say 16K. Does that make sense? Creating one more syncenv will lead to extra sync-threads; maybe we can take stacksize as an argument. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
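A rough sketch of the separate-syncenv option discussed above: give marker transactions their own syncenv with a small per-task stack instead of the 2MB default, so ~100 queued txns cost roughly 25M of stacks rather than ~200M. The syncop interfaces shown here (syncenv_new() taking a stack size plus thread bounds, synctask_new() taking the env, task function, callback, frame and opaque pointer) are assumptions about that era's syncop.h, and all the marker_* names are placeholders:

/* Illustrative only; assumes the syncop.h interfaces of the 3.x tree. */
#include "syncop.h"

#define MARKER_SYNCENV_STACKSIZE (256 * 1024)  /* 256k instead of the 2MB default */

/* In real code this would be created once in the xlator's init(). */
static struct syncenv *marker_env;

/* Placeholder task body: would take the lock on the parent and update
 * the quota-size xattr for one transaction. */
static int
marker_updation_task(void *opaque)
{
    return 0;
}

/* Placeholder completion callback: cleanup for the finished transaction. */
static int
marker_updation_done(int ret, call_frame_t *frame, void *opaque)
{
    return 0;
}

int
marker_spawn_txn(call_frame_t *frame, void *txn)
{
    if (!marker_env) {
        /* One env for all marker txns: small stacks, bounded thread pool. */
        marker_env = syncenv_new(MARKER_SYNCENV_STACKSIZE, 1, 16);
        if (!marker_env)
            return -1;
    }

    /* Each queued txn now pins a 256k stack instead of a 2MB one. */
    return synctask_new(marker_env, marker_updation_task,
                        marker_updation_done, frame, txn);
}

The trade-off is the one raised in the thread: a dedicated env adds its own sync-threads, which is why passing the stack size per synctask was floated as the alternative.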
[Gluster-devel] Unable to send patches to review.gluster.org
Hi, I'm unable to send patches to r.g.o, also not able to login. I'm getting the following errors respectively: 1) Permission denied (publickey). fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists. 2) Internal server error or forbidden access. Is anyone else facing the same issue? -- Thanks, Anuradha. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Lock migration as a part of rebalance
One solution I can think of is to have the responsibility of the lock migration process spread between both the client and the rebalance process. A rough algo is outlined below:

1. We should have a static identifier for the client process (something like the process-uuid of the mount process - let's call it client-uuid) in the lock structure. This identifier won't change across reconnects.

2. rebalance just copies the entire lock-state verbatim to the dst-node (fd-number and client-uuid values would be the same as the values on the src-node).

3. The rebalance process marks these half-migrated locks as migration-in-progress on the dst-node. Any lock request which overlaps with migration-in-progress locks is considered conflicting and dealt with appropriately (if SETLK, unwind with EAGAIN; if SETLKW, block till these locks are released). The same approach is followed for mandatory locking too.

4. Whenever an fd based operation (like writev, release, lk, flush etc) happens on the fd, the client (through which the lock was acquired) migrates the lock. Migration is basically:
   * do an fgetxattr (fd, LOCKINFO_KEY, src-subvol). This will fetch the fd number on the src subvol - lockinfo.
   * open a new fd on the dst-subvol. Then do fsetxattr (new-fd, LOCKINFO_KEY, lockinfo, dst-subvol). The brick, on receiving setxattr on the virtual xattr LOCKINFO_KEY, looks for all the locks with ((fd == lockinfo) && (client-uuid == uuid-of-client-on-which-this-setxattr-came)) and then fills in appropriate values for client_t and fd (basically sets lock->fd = fd-num-of-the-fd-on-which-setxattr-came); a rough sketch of this matching follows below.

Some issues and solutions:

1. What if the client never connects to the dst brick?

We'll have a time-out for migration-in-progress locks to be converted into complete locks. If DHT doesn't migrate within this timeout, the server will clean up these locks. This is similar to the current protocol/client implementation of lock-heal (this functionality is disabled on the client as of now, but upcall needs this feature too and we can get this functionality working). If dht tries to migrate the locks after this timeout, it will have to re-acquire the lock on the destination (this has to be a non-blocking lock request, irrespective of the mode of the original lock). We get information about the current locks through the fd opened on src. If lock acquisition fails for some reason, dht marks the fd bad, so that the application will be notified about lost locks. One problem unsolved with this solution is another client (say c2) acquiring and releasing the lock during the period between the timeout and when the client (c1) initiates lock migration. However, that problem is present even with the existing lock implementation and is not really something new introduced by lock migration.

2. What if the client connects but disconnects before it could attempt to migrate the migration-in-progress locks?

The server can identify locks belonging to this client using client-uuid and clean them up. Dht trying to migrate locks after the first disconnect will try to re-acquire locks as outlined in 1.

3. What if the client disconnects from the src subvol and cannot get lock information from src for handling issues 1 and 2?

We'll mark the fd bad. We can optimize this to mark the fd bad only if locks have been acquired. To do this the client has to store some history in the fd on successful lock acquisition.
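A rough sketch of the server-side matching described in step 4 above. Every type and field name here (server_lock_t, fd_num, client_uuid, and so on) is a hypothetical stand-in, not the actual posix-locks/server xlator structures:

#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of a posix lock record on the brick. */
typedef struct server_lock {
    struct server_lock *next;
    uint64_t fd_num;          /* fd number the lock was taken on (src-subvol value) */
    char client_uuid[64];     /* stable client identifier from step 1 */
    void *client;             /* would be client_t in the real code */
    void *fd;                 /* would be fd_t in the real code */
    int migration_in_progress;
} server_lock_t;

/*
 * Handle fsetxattr(new-fd, LOCKINFO_KEY, lockinfo): find every
 * migration-in-progress lock matching (fd == lockinfo) && (client-uuid ==
 * uuid of the client that sent this setxattr) and re-bind it to the new
 * connection/fd, turning it into a complete lock.
 */
static int
rebind_migrated_locks(server_lock_t *locks, uint64_t lockinfo,
                      const char *req_client_uuid, void *new_client,
                      void *new_fd, uint64_t new_fd_num)
{
    int rebound = 0;

    for (server_lock_t *l = locks; l; l = l->next) {
        if (!l->migration_in_progress)
            continue;
        if (l->fd_num == lockinfo &&
            strcmp(l->client_uuid, req_client_uuid) == 0) {
            l->client = new_client;
            l->fd = new_fd;
            l->fd_num = new_fd_num;        /* lock->fd now points at the new fd */
            l->migration_in_progress = 0;  /* lock is fully migrated */
            rebound++;
        }
    }
    return rebound;
}

If the timeout from issue 1 fires before this setxattr arrives, the same records would instead be cleaned up (or the client forced to re-acquire, as outlined above).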
regards, On Wed, Dec 17, 2014 at 12:45 PM, Raghavendra G raghaven...@gluster.com wrote: On Wed, Dec 17, 2014 at 1:25 AM, Shyam srang...@redhat.com wrote: This mail intends to present the lock migration across subvolumes problem and seek solutions/thoughts around the same, so any feedback/corrections are appreciated.

# Current state of file locks post file migration during rebalance

Currently when a file is migrated during rebalance, its lock information is not transferred over from the old subvol to the new subvol that the file now resides on. As further lock requests, post migration of the file, would now be sent to the new subvol, any potential lock conflicts would not be detected until the locks are migrated over. The term locks above can refer to the POSIX locks acquired using the FOP lk by consumers of the volume, or to the gluster internal(?) inode/dentry locks. For now we limit the discussion to the POSIX locks supported by the FOP lk.

# Other areas in gluster that migrate locks

The current scheme of migrating locks in gluster on graph switches triggers an fd migration process that migrates the lock information from the old fd to the new fd. This is driven by the gluster client stack, protocol layer (FUSE, gfapi). This is done using the (set/get)xattr call with the attr name trusted.glusterfs.lockinfo, which in turn fetches the required key for the old fd and migrates the lock from this old fd to the new fd. IOW, there is very little information transferred, as the locks are migrated across fds on the same subvolume and not across subvolumes. Additionally, locks that are in the blocked state do not seem to be migrated (at
Re: [Gluster-devel] Unable to send patches to review.gluster.org
I get the following error:

error: unpack failed: error No space left on device
fatal: Unpack error, check server log

Pranith On 07/02/2015 09:58 AM, Atin Mukherjee wrote: + Infra, can any one of you just take a look at it? On 07/02/2015 09:53 AM, Anuradha Talur wrote: Hi, I'm unable to send patches to r.g.o, also not able to login. I'm getting the following errors respectively: 1) Permission denied (publickey). fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists. 2) Internal server error or forbidden access. Is anyone else facing the same issue? ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Progress on adding support for SEEK_DATA and SEEK_HOLE
On Tue, Jun 30, 2015 at 11:48:20PM +0530, Ravishankar N wrote: On 06/22/2015 03:22 PM, Ravishankar N wrote: On 06/22/2015 01:41 PM, Miklos Szeredi wrote: On Sun, Jun 21, 2015 at 6:20 PM, Niels de Vos nde...@redhat.com wrote: Hi, it seems that there could be a reasonable benefit for virtual machine images on a FUSE mountpoint when SEEK_DATA and SEEK_HOLE would be available. At the moment, FUSE does not pass lseek() on to the userspace process that handles the I/O. Other filesystems that do not (need to) track the position in the file-descriptor are starting to support SEEK_DATA/HOLE. One example is NFS: https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-38#section-15.11 I would like to add this feature to Gluster, and am wondering if there are any reasons why it should/could not be added to FUSE. I don't see any reason why it couldn't be added. Please go ahead. Thanks for bouncing the mail to me Niels, I would be happy to work on this. I'll submit a patch by Monday next. Sent a patch @ http://thread.gmane.org/gmane.comp.file-systems.fuse.devel/14752 I've tested it with some skeleton code in gluster-fuse to handle lseek(). Ravi also sent his patch for glusterfs-fuse: http://review.gluster.org/11474 I have posted my COMPLETELY UNTESTED patches to their own Gerrit topic so that we can easily track the progress: http://review.gluster.org/#/q/status:open+project:glusterfs+branch:master+topic:wip/SEEK_HOLE My preference goes to share things early and make everyone able to follow progress (know where to find the latest patches). Assistance in testing, reviewing and improving is welcome! There are some outstanding things like seek() for ec and sharding, and probably more. This all was done as a suggestion from Christopher (kripper) Pereira, for improving the handling of sparse files (like most VM images). Thanks, Niels ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
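For reference, this is how an application would consume the feature once lseek() passthrough lands in FUSE and the gluster patches above are merged: standard Linux SEEK_DATA/SEEK_HOLE usage to walk the allocated regions of a sparse file, nothing gluster-specific:

/* Enumerate the data regions of a sparse file (e.g. a VM image). */
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <sparse-file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    off_t off = 0;
    for (;;) {
        /* find the next byte that is actually allocated */
        off_t data = lseek(fd, off, SEEK_DATA);
        if (data < 0) {
            if (errno == ENXIO)   /* no more data before EOF */
                break;
            perror("lseek(SEEK_DATA)");
            return 1;
        }
        /* find where that data region ends (next hole, or EOF) */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole < 0) {
            perror("lseek(SEEK_HOLE)");
            return 1;
        }
        printf("data: %lld..%lld (%lld bytes)\n",
               (long long)data, (long long)hole, (long long)(hole - data));
        off = hole;
    }

    close(fd);
    return 0;
}

On a filesystem without real hole tracking, the same calls still work but simply report one data region covering the whole file, which is why qemu-img and similar tools benefit so much when the underlying mount reports holes accurately.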
Re: [Gluster-devel] [Gluster-users] Gluster Docker images are available at docker hub
We do have a way to tackle this situation from the code. Raghavendra Talur will be sending a patch shortly. We should fix it by undoing what daemon-refactoring did, which broke the lazy creation of the uuid for a node. Fixing it elsewhere is just masking the real cause. Meanwhile 'rm' is the stop-gap arrangement. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Gluster Docker images are available at docker hub
On 07/01/2015 11:51 AM, Krishnan Parthasarathi wrote: We do have a way to tackle this situation from the code. Raghavendra Talur will be sending a patch shortly. We should fix it by undoing what daemon-refactoring did, which broke the lazy creation of the uuid for a node. Fixing it elsewhere is just masking the real cause. Meanwhile 'rm' is the stop-gap arrangement. In rebalance we create the necessary files before starting the daemon, but the other daemons that use the svc framework create the files during glusterd init (the UUID is used to create the socket file); it would be better to create the necessary files at daemon start in the svc framework as well. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel