Re: [Gluster-users] Split brain that is not split brain
I don't understand why there's such a complicated process to recover when I can just look at both files, decide which one I need and delete the other one.

On Thu, Sep 11, 2014 at 7:56 AM, Pranith Kumar Karampuri pkara...@redhat.com wrote: On 09/11/2014 09:29 AM, Ilya Ivanov wrote: Right... I deleted it and now all appears to be fine. Still, could you please elaborate on gfid split-brain?

Could you go through https://github.com/gluster/glusterfs/blob/master/doc/debugging/split-brain.md Let us know if you would like something to be made clearer and we can add that and improve the document. Pranith

On Thu, Sep 11, 2014 at 5:32 AM, Pranith Kumar Karampuri pkara...@redhat.com wrote: On 09/11/2014 12:16 AM, Ilya Ivanov wrote: Any insight?

Was the other file's gfid d3def9e1-c6d0-4b7d-a322-b5019305182e? Could you check if this file exists in brick/.glusterfs/d3/de/ When a file is deleted, this entry also needs to be deleted if there are no more hard links to the file. Pranith

On Tue, Sep 9, 2014 at 8:35 AM, Ilya Ivanov bearw...@gmail.com wrote: What's a gfid split-brain and how is it different from a normal split-brain? I accessed the file with stat, but heal info still shows Number of entries: 1

[root@gluster1 gluster]# getfattr -d -m. -e hex gv01/123
# file: gv01/123
trusted.afr.gv01-client-0=0x
trusted.afr.gv01-client-1=0x
trusted.gfid=0x35f86f4561134ba0bd1b94ef70179d4d

[root@gluster1 gluster]# getfattr -d -m. -e hex gv01
# file: gv01
trusted.afr.gv01-client-0=0x
trusted.afr.gv01-client-1=0x
trusted.gfid=0x0001
trusted.glusterfs.dht=0x0001
trusted.glusterfs.volume-id=0x31a2c4c486ca4344b838d2c2e6c716c1

On Tue, Sep 9, 2014 at 8:19 AM, Pranith Kumar Karampuri pkara...@redhat.com wrote: On 09/09/2014 11:35 AM, Ilya Ivanov wrote: Ahh, thank you, now I get it. I deleted it on one node and the deletion replicated to the other one. Now I get the following output:

[root@gluster1 var]# gluster volume heal gv01 info
Brick gluster1:/home/gluster/gv01/
gfid:d3def9e1-c6d0-4b7d-a322-b5019305182e
Number of entries: 1
Brick gluster2:/home/gluster/gv01/
Number of entries: 0

Is that normal? Why isn't the number of entries reset to 0?

If you access the file using ls/stat etc., it will be fixed. But before that, could you please post the output of 'getfattr -d -m. -e hex file/path/in/backend/brick' and 'getfattr -d -m. -e hex parent/dir/to/file/path/in/backend/brick' Pranith

And why wouldn't the file show up in split-brain before, anyway?

Gfid split-brains are not shown in heal info split-brain yet. Pranith

On Tue, Sep 9, 2014 at 7:46 AM, Pranith Kumar Karampuri pkara...@redhat.com wrote: On 09/09/2014 01:54 AM, Ilya Ivanov wrote: Hello. I have Gluster 3.5.2 on CentOS 6. A primitive replicated volume, as described here: https://www.digitalocean.com/community/tutorials/how-to-create-a-redundant-storage-pool-using-glusterfs-on-ubuntu-servers. I tried to simulate split-brain by temporarily disconnecting the nodes and creating a file with the same name and different contents. That worked. The question is, how do I fix it now? All the tutorials suggest deleting the file from one of the nodes. I can't do that; it reports Input/output error. The file won't even show up in gluster volume heal gv00 info split-brain. That shows 0 entries.

The deletion needs to happen on one of the bricks, not from the mount point. Pranith

I can see the file in gluster volume heal gv00 info heal-failed, though. -- Ilya.
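For reference, a minimal sketch of the check Pranith describes above: locating the gfid's backing link under .glusterfs on a brick. The brick path and gfid are taken from this thread; this is illustrative, not an official procedure.

    # Run against the brick (NOT the FUSE mount).
    BRICK=/home/gluster/gv01
    GFID=d3def9e1-c6d0-4b7d-a322-b5019305182e

    # The backing entry for a gfid lives at .glusterfs/<first 2 chars>/<next 2 chars>/<gfid>.
    GFID_PATH="$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"
    ls -l "$GFID_PATH"

    # For a regular file this is a hard link; if the named file was already
    # deleted and the link count here is 1, this entry is the leftover that
    # keeps showing up in "heal info" until it is removed.
    stat -c '%h hard link(s)' "$GFID_PATH"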
Re: [Gluster-users] Split brain that is not split brain
On 09/11/2014 11:37 AM, Ilya Ivanov wrote: I don't understand why there's such a complicated process to recover when I can just look at both files, decide which one I need and delete the other one.

If the file is deleted, the whole file has to be copied back during heal. That is fine for small files, but for big files like VM images it takes less time if the file already exists, because only the parts that differ from the good copy are synced. One more reason: if the parent directory from which the file was deleted ends up as the heal source, self-heal will delete the file from the other directory rather than re-creating it. So instead of deleting the file, it is probably better practice to make a copy of the file somewhere and then delete it. We shall update the document with this new information as well. Thanks for the feedback.

In 3.7 this is going to be simplified. We are adding a command to fix split-brains where the user chooses which copy to keep and it does the rest.

Pranith

On Thu, Sep 11, 2014 at 7:56 AM, Pranith Kumar Karampuri pkara...@redhat.com wrote: snip
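A rough sketch of the "copy it somewhere, then delete from the brick" approach recommended above, assuming the replica-2 layout from this thread. The file name, backup location and mount point are illustrative, and the good copy is assumed to live on the other brick.

    BRICK=/home/gluster/gv01
    FILE=123                                   # illustrative name from this thread

    # 1. Keep a backup of the copy being removed, as suggested above.
    cp -a "$BRICK/$FILE" "/root/splitbrain-backup.$(date +%s)"

    # 2. Remove the bad copy from the brick (never from the mount), along with
    #    its gfid hard link under .glusterfs if no other hard links remain.
    GFID_HEX=$(getfattr -n trusted.gfid -e hex "$BRICK/$FILE" | awk -F0x '/trusted.gfid/ {print $2}')
    G="${GFID_HEX:0:8}-${GFID_HEX:8:4}-${GFID_HEX:12:4}-${GFID_HEX:16:4}-${GFID_HEX:20:12}"
    rm -f "$BRICK/$FILE" "$BRICK/.glusterfs/${G:0:2}/${G:2:2}/$G"

    # 3. Trigger self-heal from the good copy with a lookup from a client mount.
    stat /mnt/gv01/"$FILE"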
[Gluster-users] Fwd: New Defects reported by Coverity Scan for GlusterFS
To fix these Coverity issues, please check the below link for guidelines: http://www.gluster.org/community/documentation/index.php/Fixing_Issues_Reported_By_Tools_For_Static_Code_Analysis#Coverity

Original Message
Subject: New Defects reported by Coverity Scan for GlusterFS
Date: Thu, 11 Sep 2014 00:02:11 -0700
From: scan-ad...@coverity.com

Hi, Please find the latest report on new defect(s) introduced to GlusterFS found with Coverity Scan.

Defect(s) Reported-by: Coverity Scan
Showing 4 of 4 defect(s)

** CID 1238186: Logically dead code (DEADCODE) /xlators/cluster/afr/src/afr-dir-write.c: 339 in afr_mark_new_entry_changelog()
** CID 1238185: Explicit null dereferenced (FORWARD_NULL) /xlators/features/snapview-server/src/snapview-server-mgmt.c: 476 in svs_get_snapshot_list()
** CID 1238184: Explicit null dereferenced (FORWARD_NULL) /xlators/features/snapview-server/src/snapview-server-mgmt.c: 115 in svs_mgmt_init()
** CID 1238183: Missing break in switch (MISSING_BREAK) /xlators/mgmt/glusterd/src/glusterd-rebalance.c: 577 in glusterd_op_stage_rebalance()

*** CID 1238186: Logically dead code (DEADCODE)
/xlators/cluster/afr/src/afr-dir-write.c: 339 in afr_mark_new_entry_changelog()
333                 break;
334         }
335
336         new_frame = NULL;
337 out:
338         if (changelog)
    CID 1238186: Logically dead code (DEADCODE)
    Execution cannot reach this statement "afr_matrix_cleanup(changelo..."
339                 afr_matrix_cleanup (changelog, priv->child_count);
340         if (new_frame)
341                 AFR_STACK_DESTROY (new_frame);
342         if (xattr)
343                 dict_unref (xattr);
344         return;

*** CID 1238185: Explicit null dereferenced (FORWARD_NULL)
/xlators/features/snapview-server/src/snapview-server-mgmt.c: 476 in svs_get_snapshot_list()
470         if (frame_cleanup) {
471                 /*
472                  * Destroy the frame if we encountered an error
473                  * Else we need to clean it up in
474                  * mgmt_get_snapinfo_cbk
475                  */
    CID 1238185: Explicit null dereferenced (FORWARD_NULL)
    Dereferencing null pointer "frame".
476                 SVS_STACK_DESTROY (frame);
477         }
478
479         return ret;

*** CID 1238184: Explicit null dereferenced (FORWARD_NULL)
/xlators/features/snapview-server/src/snapview-server-mgmt.c: 115 in svs_mgmt_init()
109         ret = 0;
110
111         gf_log (this->name, GF_LOG_DEBUG, "svs mgmt init successful");
112
113 out:
114         if (ret) {
    CID 1238184: Explicit null dereferenced (FORWARD_NULL)
    Dereferencing null pointer "priv".
115                 rpc_clnt_connection_cleanup (priv->rpc->conn);
116                 rpc_clnt_unref (priv->rpc);
117                 priv->rpc = NULL;
118         }
119
120         return ret;

*** CID 1238183: Missing break in switch (MISSING_BREAK)
/xlators/mgmt/glusterd/src/glusterd-rebalance.c: 577 in glusterd_op_stage_rebalance()
571                                 "disconnect those clients before "
572                                 "attempting this command again.",
573                                 volname);
574                         goto out;
575                 }
576
    CID 1238183: Missing break in switch (MISSING_BREAK)
    The above case falls through to this one.
577         case GF_DEFRAG_CMD_START_FORCE:
578                 if (is_origin_glusterd (dict)) {
579                         op_ctx = glusterd_op_get_ctx ();
580                         if (!op_ctx) {
581                                 ret = -1;
582                                 gf_log (this->name, GF_LOG_ERROR,

To view the defects in Coverity Scan visit http://scan.coverity.com/projects/987?tab=overview
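Purely illustrative (not the actual GlusterFS patches): the defect classes above are normally fixed with patterns like the following. The struct, function and case names are simplified stand-ins invented for this example.

    /* Stand-in types and helpers, not real GlusterFS code. */
    struct conn { int fd; };
    struct priv { struct conn *rpc; };

    static void cleanup_conn (struct conn *c) { if (c) c->fd = -1; }

    /* FORWARD_NULL (CIDs 1238184/1238185): guard pointers on the error path
     * before dereferencing them. */
    static int mgmt_init_fixed (struct priv *priv, int failed)
    {
            int ret = failed ? -1 : 0;

            if (ret && priv && priv->rpc) {   /* NULL checks added */
                    cleanup_conn (priv->rpc);
                    priv->rpc = NULL;
            }
            return ret;
    }

    /* MISSING_BREAK (CID 1238183): terminate each case, or mark an intentional
     * fall-through explicitly. DEADCODE (CID 1238186) is the reverse exercise:
     * delete or restructure statements that can never execute. */
    static int stage_rebalance_fixed (int cmd)
    {
            switch (cmd) {
            case 1:                 /* e.g. plain "start" */
                    return 1;       /* no longer falls through to "start force" */
            case 2:                 /* e.g. "start force" */
                    return 2;
            default:
                    return -1;
            }
    }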
[Gluster-users] remote operation failed: Operation not permitted
Hello list, What would this kind of message mean?

[2014-09-10 21:49:15.360499] W [client-rpc-fops.c:1480:client3_3_fstat_cbk] 0-mailer-client-1: remote operation failed: Operation not permitted
[2014-09-10 21:49:15.360780] W [client-rpc-fops.c:1480:client3_3_fstat_cbk] 0-mailer-client-0: remote operation failed: Operation not permitted

I got this in the gluster log of one of my clients. The other client connected to these bricks reports more messages like this:

[2014-09-11 07:25:25.995942] W [client-rpc-fops.c:2774:client3_3_lookup_cbk] 0-mailer-client-0: remote operation failed: No such file or directory. Path: gfid:b539b594-0b30-445a-995f-1440accbf3ea (b539b594-0b30-445a-995f-1440accbf3ea)
[2014-09-11 07:25:25.996140] W [client-rpc-fops.c:2774:client3_3_lookup_cbk] 0-mailer-client-1: remote operation failed: No such file or directory. Path: gfid:b539b594-0b30-445a-995f-1440accbf3ea (b539b594-0b30-445a-995f-1440accbf3ea)
[2014-09-11 07:25:25.996158] E [fuse-bridge.c:776:fuse_getattr_resume] 0-glusterfs-fuse: 14752638: GETATTR 139947347157676 (b539b594-0b30-445a-995f-1440accbf3ea) resolution failed

The Gluster bricks (replicated) do not report any error in their logs. Thank you!
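One way to see which file that gfid refers to is to resolve it on a brick; a sketch follows. The brick path is not given in the message, so the value below is a placeholder.

    BRICK=/path/to/brick                         # placeholder; use the real brick path
    GFID=b539b594-0b30-445a-995f-1440accbf3ea    # gfid from the log lines above

    # The gfid's backing entry lives under .glusterfs/<aa>/<bb>/<full gfid>.
    GFID_FILE="$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"
    ls -l "$GFID_FILE"

    # For regular files this entry is a hard link, so the named path shares its inode:
    find "$BRICK" -path "$BRICK/.glusterfs" -prune -o -samefile "$GFID_FILE" -print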
Re: [Gluster-users] Split brain that is not split brain
Makes some sense. Yes, I meant make a backup and delete, rather than just delete. If I may suggest, putting that debug link somewhere more visible would be good, too. I wouldn't have found it without your help. Thank you for the assistance.

On Thu, Sep 11, 2014 at 9:14 AM, Pranith Kumar Karampuri pkara...@redhat.com wrote: snip
Re: [Gluster-users] Split brain that is not split brain
On 09/11/2014 01:13 PM, Ilya Ivanov wrote: Makes some sense. Yes, I meant make a backup and delete, rather than just delete. If I may suggest, putting that debug link somewhere more visible would be good, too. I wouldn't have found it without your help.

Justin, where shall we put the doc?

Pranith

Thank you for the assistance.

On Thu, Sep 11, 2014 at 9:14 AM, Pranith Kumar Karampuri pkara...@redhat.com wrote: snip
Re: [Gluster-users] Split brain that is not split brain
On 11/09/2014, at 9:44 AM, Pranith Kumar Karampuri wrote: snip Justin, where shall we put the doc?

In theory, the .md files should be pulled into the main documentation part of the static site, somewhere under here: http://www.gluster.org/documentation/

I'm not sure we've got that happening yet, but we definitely need to in the near future.

+ Justin
Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0
Bala, I think using Salt as the orchestration framework is a good idea. We would still need a consistent distributed store. I hope Salt has the provision to use one of our choice; it could be consul or something else that satisfies the criteria for choosing alternate technology. I will wait a couple of days for the community to chew on this and share their thoughts. If we have a consensus on this, we could 'port' the 'basic'[1] volume management commands to a system built using Salt and see for real how it fits our use case. Thoughts?

[1] basic commands - peer-probe, volume-create, volume-start and volume-add-brick

~KP

- Original Message -
From: Justin Clift jus...@gluster.org
To: Balamurugan Arumugam b...@gluster.com
Cc: Kaushal M kshlms...@gmail.com, gluster-users@gluster.org, Gluster Devel gluster-de...@gluster.org
Sent: Thursday, September 11, 2014 7:33:52 AM
Subject: Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0

On 11/09/2014, at 2:46 AM, Balamurugan Arumugam wrote: snip WRT the glusterd problem, I see Salt already resolves most of it at the infrastructure level. It's worth considering.

Salt used to have (~12 months ago) a reputation for being really buggy. Any idea if that's still the case?

I heard in various presentations that this was due to zeromq 2.x issues. With zeromq 3.x it's all gone. But we could explore more from a stability point of view.

Apart from that though, using Salt is an interesting idea. :)

Yes. I came across Salt recently while looking at unified management for storage, to manage gluster and ceph, which is still in the planning phase. I can think of a complete set of infra requirements to solve, from glusterd to unified management. Calamari (Ceph management) already uses Salt. It would be ideal if gluster, ceph and the unified management layer all used Salt (or the same infra).

Regards,
Bala
Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0
- Original Message -
From: Krishnan Parthasarathi kpart...@redhat.com
To: Balamurugan Arumugam barum...@redhat.com
Cc: gluster-users@gluster.org, Gluster Devel gluster-de...@gluster.org
Sent: Thursday, September 11, 2014 2:25:45 PM
Subject: Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0

snip

For the distributed store, I would think of MongoDB, which provides a distributed/replicated/highly-available database with master read-write and slave read-only. Let's see what the community thinks about SaltStack and/or MongoDB.

Regards,
Bala
Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0
On 11/09/2014, at 10:16 AM, Balamurugan Arumugam wrote: snip For the distributed store, I would think of MongoDB... Let's see what the community thinks about SaltStack and/or MongoDB.

Is this still relevant for MongoDB? (it's from 12+ months ago) http://aphyr.com/posts/284-call-me-maybe-mongodb

+ Justin
Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0
- Original Message -
From: Justin Clift jus...@gluster.org
To: Balamurugan Arumugam b...@gluster.com
Cc: Krishnan Parthasarathi kpart...@redhat.com, gluster-users@gluster.org, Gluster Devel gluster-de...@gluster.org
Sent: Thursday, September 11, 2014 2:48:41 PM
Subject: Re: [Gluster-devel] [Gluster-users] Proposal for GlusterD-2.0

snip Is this still relevant for MongoDB? (it's from 12+ months ago) http://aphyr.com/posts/284-call-me-maybe-mongodb

I haven't tried using Mongo yet. A PoC would help identify how helpful MongoDB is :)

Regards,
Bala
Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0
Yes. I came across Salt recently while looking at unified management for storage, to manage gluster and ceph, which is still in the planning phase. I can think of a complete set of infra requirements to solve, from glusterd to unified management. Calamari (Ceph management) already uses Salt. It would be ideal if gluster, ceph and the unified management layer all used Salt (or the same infra).

I think the idea of using Salt (or similar) is interesting, but it's also key that Ceph still has its mon cluster as well. (Is mon calamari an *intentional* Star Wars reference?) As I see it, glusterd, or anything we use to replace it, has multiple responsibilities:

(1) Track the current up/down state of cluster members and resources.
(2) Store configuration and coordinate changes to it.
(3) Orchestrate complex or long-running activities (e.g. rebalance).
(4) Provide service discovery (current portmapper).

Salt and its friends clearly shine at (2) and (3), though they outsource the actual data storage to an external data store. With such a data store, (4) becomes pretty trivial. The sticking point for me is (1). How does Salt handle that need, or how might it be satisfied on top of the facilities Salt does provide? I can see *very* clearly how to do it on top of etcd or consul. Could those in fact be used for Salt's data store? It seems like Salt shouldn't need a full-fledged industrial-strength database, just something with high consistency/availability and some basic semantics.

Maybe we should try to engage with the Salt developers to come up with ideas. Or find out exactly what functionality they found still needs to be in the mon cluster and not in Salt.
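To make (1), (2) and (4) concrete, here is a rough sketch of how an etcd/consul-style store could cover them, written with etcd v2-style CLI commands. Every key name is invented for illustration; none of this is an actual GlusterD design.

    # (1) Liveness: each node keeps refreshing a key with a TTL; if the node
    #     dies, the key expires and watchers see it disappear.
    etcdctl set /gluster/nodes/server1/alive up --ttl 15

    # (2) Configuration: volume options become ordinary keys every node can read.
    etcdctl set /gluster/volumes/gv01/replica-count 2

    # (4) Service discovery: a brick publishes its listening port, replacing
    #     the portmapper query.
    etcdctl set /gluster/portmap/server1/brick1 49153
    etcdctl get /gluster/portmap/server1/brick1

    # Watchers get membership/config change notifications without polling.
    etcdctl watch --recursive /gluster/nodes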
Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0
For the distributed store, I would think of MongoDB, which provides a distributed/replicated/highly-available database with master read-write and slave read-only. Let's see what the community thinks about SaltStack and/or MongoDB.

I definitely do not think MongoDB is the right tool for this job. I'm not one of those people who just bash MongoDB out of fashion, either. I frequently defend them against such attacks, and I used MongoDB for some work on CloudForms a while ago. However, a full MongoDB setup carries a pretty high operational complexity, to support high scale and rich features . . . which we don't need. This part of our system doesn't need sharding. It doesn't need complex ad-hoc query capability. If we don't need those features, we *certainly* don't need the complexity that comes with them. We need something with the very highest levels of reliability and consistency, with as little complexity as possible to go with that. Even its strongest advocates would probably agree that MongoDB doesn't fit those requirements very well.
Re: [Gluster-users] Proposal for GlusterD-2.0
I really hope that, whatever the outcome and final choice is, as an end user Gluster stays as simple to deploy as it is today. Yes, it would be nice for the peer and volume files to live in one cloud store (a chicken-and-egg problem: gluster is cloud storage, yet it is unable to bootstrap its own state into its own cloud infra), and the overall aim is scalability. Heavy products with their own servers and large key-value stores are fancy tools that take a long time to set up and have their own problems.

-Nirmal

-Original Message-
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Jeff Darcy
Sent: Thursday, September 11, 2014 10:23 AM
To: Balamurugan Arumugam
Cc: Gluster Devel; gluster-users@gluster.org
Subject: Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0

snip
Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0
I'm so glad to read this. I was thinking the same thing.

On Sep 11, 2014, at 7:22 AM, Jeff Darcy jda...@redhat.com wrote: snip
Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0
On Thu, Sep 11, 2014 at 4:55 AM, Krishnan Parthasarathi kpart...@redhat.com wrote: snip I think using Salt as the orchestration framework is a good idea.

I disagree. I think puppet + puppet-gluster would be a good idea :) One advantage is that the technology is already proven, and there's a working POC. Feel free to prove me wrong, or to request any features that it's missing. ;)
Re: [Gluster-users] Proposal for GlusterD-2.0
On Thu, Sep 11, 2014 at 12:01 PM, Prasad, Nirmal npra...@idirect.net wrote: I really hope whatever the outcome and final choice is ... as an end user I hope that Gluster stays as simple to deploy as it is today.

I think it's pretty simple already with puppet-gluster. It takes me around 15 minutes while I'm off drinking a $BEVERAGE.
Re: [Gluster-users] Dead directory after other node is rebooted
On 09/10/2014 08:32 AM, French Teddy wrote: I have a very simple two-node setup. After I rebooted one of the nodes, a directory deep inside the hierarchy on the other node has become dead. By dead I mean any process trying to access its contents gets stuck. Absolutely nothing appears in the logs. How can I fix that very, very quickly?

Does the stuck process come out eventually, or is it stuck forever? How many files are under this directory? Could you please provide the output of 'getfattr -d -m. -e hex dir/path/on/bricks'

Pranith

TIA
greg
gluster 3.3.2 on debian 7.5
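A sketch of the information being asked for above, plus one extra check for a hang; the paths and volume name are placeholders.

    # On BOTH nodes, against the brick path (not the mount):
    BRICK=/path/to/brick                 # placeholder
    DIR=path/to/the/dead/directory       # placeholder, relative to the brick
    getfattr -d -m. -e hex "$BRICK/$DIR"

    # Optional: dump brick state and look for blocked locks that could explain
    # processes hanging on that directory. The dump location depends on the
    # server.statedump-path option (often /var/run/gluster or /tmp).
    gluster volume statedump VOLNAME
    grep -i -A2 BLOCKED /var/run/gluster/*.dump.* /tmp/*.dump.* 2>/dev/null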
Re: [Gluster-users] file corruption on Gluster 3.5.1 and Ubuntu 14.04
Any more to this thread? I don't mean to nag, but this seems like a pretty serious issue. How can I help?

On Sep 7, 2014, at 9:51 AM, mike m...@luminatewireless.com wrote: I don't think I have these enabled. How can I confirm that?

On Sep 7, 2014, at 12:57 AM, Anand Avati av...@gluster.org wrote: The only reason O_APPEND gets stripped on the server side is because of one of the following xlators: - stripe - quiesce - crypt. If you have any of these, please try unloading/reconfiguring without these features and try again. Thanks

On Sat, Sep 6, 2014 at 3:31 PM, mike m...@luminatewireless.com wrote: I was able to narrow it down to a smallish python script. I've attached that to the bug. https://bugzilla.redhat.com/show_bug.cgi?id=1138970

On Sep 6, 2014, at 1:05 PM, Justin Clift jus...@gluster.org wrote: Thanks Mike, this is good stuff. :) + Justin

On 06/09/2014, at 8:19 PM, mike wrote: I upgraded the client to Gluster 3.5.2, but there is no difference. The bug is almost certainly in the FUSE client. If I remount the filesystem with NFS, the problem is no longer observable. I spent a little time looking through the xlator/fuse-bridge to see where the offsets are coming from, but I'm really not familiar enough with the code, so it is slow going. Unfortunately, I'm still having trouble reproducing this in a python script that could be readily attached to a bug report. I'll take a crack at that again, but I will file a bug anyway for completeness.

On Sep 5, 2014, at 7:10 PM, mike m...@luminatewireless.com wrote: I have narrowed down the source of the bug. Here is an strace of glusterfsd: http://fpaste.org/131455/40996378/ The first line represents a write that does *not* make it into the underlying file. The last line is the write that stomps the earlier write. As I said, the client file is opened in O_APPEND mode, but on the glusterfsd side the file is just O_CREAT|O_WRONLY. That means the offsets to pwrite() need to be valid. I correlated this to a tcpdump I took, and I can see that the RPCs being sent do in fact have the wrong offset. Interestingly, glusterfs.write-is-append = 0, which I wouldn't have expected. I think the bug lies in the glusterfs FUSE client. As to your question about Gluster 3.5.2, I may be able to do that if I am unable to find the bug in the source. -Mike

On Sep 5, 2014, at 6:16 PM, Justin Clift jus...@gluster.org wrote: On 06/09/2014, at 12:10 AM, mike wrote: I have found that the O_APPEND flag is key to this failure - I had overlooked that flag when reading the strace and trying to cobble together a minimal reproduction. I now have a small pair of python scripts that can reliably reproduce this failure.

As a thought, is there a reasonable way you can test this on GlusterFS 3.5.2? There were some important bug fixes in 3.5.2 (from 3.5.1). Note I'm not saying yours is one of them, I'm just asking if it's easy to test and find out. :)

Regards and best wishes, Justin Clift
Re: [Gluster-users] file corruption on Gluster 3.5.1 and Ubuntu 14.04
On 09/11/2014 11:38 PM, mike wrote: Any more to this thread? I don't mean to nag, but this seems like a pretty serious issue.

Most probably the issue is in write-behind, according to my tests. The people who know that xlator are Avati/Raghavendra G/Niels; CCed all of them.

Pranith

How can I help?

On Sep 7, 2014, at 9:51 AM, mike m...@luminatewireless.com wrote: snip
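In case it helps others hitting this, a sketch of how to confirm which xlators are in the client graph and to take write-behind out of the picture as an experiment; the volume name gv0 is a placeholder.

    # The client volfile generated on the servers lists every xlator in the
    # FUSE client graph (stripe/quiesce/crypt would show up here if loaded):
    grep 'type ' /var/lib/glusterd/vols/gv0/*fuse.vol

    # Disable write-behind for the volume, remount the client, and re-run the
    # reproducer; if the corruption stops, write-behind is the likely culprit.
    gluster volume set gv0 performance.write-behind off

    # Re-enable it afterwards.
    gluster volume set gv0 performance.write-behind on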
Re: [Gluster-users] libgfapi management functions
I apologize for not replying earlier. Such functionality is not planned for libgfapi; we plan to keep it restricted to the data path only. The projects at https://forge.gluster.org/ may be helpful to you.

On Tue, Sep 2, 2014 at 8:05 PM, Prasad, Nirmal npra...@idirect.net wrote: Thanks for the response. The one functionality or hook I'm looking for is a notification of brick failure; presently the process involves checking the volume status to see whether a certain brick has gone offline. When a brick fails, I would want to start a timer and replace the brick if it does not come back up automatically. Alternatively, if gluster allowed specifying "hot stand-bys" to replace it with after a certain timeout (e.g. 1 minute, or a replica threshold), that would work. Thanks Regards Nirmal

*From:* RAGHAVENDRA TALUR [mailto:raghavendra.ta...@gmail.com] *Sent:* Monday, September 01, 2014 3:26 PM *To:* Prasad, Nirmal *Cc:* gluster-users@gluster.org *Subject:* Re: [Gluster-users] libgfapi management functions

On Sat, Aug 30, 2014 at 9:12 PM, Prasad, Nirmal npra...@idirect.net wrote: Hi, Is there any plan or interest in getting the management functions added to gfapi? It looks pretty much like the client library functions could be added, e.g. the present libgfapi has glfs_mgmt.c with a few rpc messages exchanged with glusterd: https://github.com/gluster/glusterfs/blob/master/api/src/glfs-mgmt.c Any interest or plans to have more operations exposed (especially brick status change notifications to be propagated to applications)? I'm wondering if the gluster CLI could be built on this library; it would involve moving the functionality that exists in the CLI into a library and separating out the console interactions. Thanks Regards Nirmal

Hi Nirmal, The file glfs_mgmt.c has only those functions which are required by the client stack of glusterfs to interact with glusterd. The "management" here refers to interaction of the process/library with the glusterfs management daemon (glusterd) for its own purposes. GFAPI is currently intended to be used only as an interface/API to the glusterfs filesystem, not for management operations. Do you have a specific requirement which would need such a feature from GFAPI? There are other projects being developed to provide an API for management operations, like http://www.gluster.org/community/documentation/index.php/Features/rest-api

-- Raghavendra Talur
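A rough sketch of the polling workaround Nirmal describes (watch volume status and react once a brick has been offline past a timeout). The volume name, brick name, timeout and the "intervene" step are placeholders; this is not a supported gluster API.

    #!/bin/bash
    VOL=gv0                              # placeholder
    BRICK=server1:/bricks/b1             # placeholder
    TIMEOUT=60                           # seconds a brick may stay down
    down_since=0

    while sleep 10; do
        # "gluster volume status ... detail" prints an "Online : Y/N" field per brick.
        if gluster volume status "$VOL" "$BRICK" detail | grep -q '^Online[[:space:]]*:[[:space:]]*N'; then
            now=$(date +%s)
            [ "$down_since" -eq 0 ] && down_since=$now
            if [ $((now - down_since)) -ge "$TIMEOUT" ]; then
                echo "$BRICK down for ${TIMEOUT}s; time to intervene (e.g. replace-brick)"
                down_since=0
            fi
        else
            down_since=0
        fi
    done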
Re: [Gluster-users] Questions about gluster reblance
Hello Shyam. Thanks for the reply. Please see my reply below, starting with [paul:]. Please add me to the address list besides gluster-users when replying, so that I can reply more easily, since I subscribed to gluster-users in digest mode (no other choice, if I remember correctly).

Date: Wed, 10 Sep 2014 10:36:41 -0400
From: Shyam srang...@redhat.com
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] Questions about gluster reblance
Message-ID: 541061f9.7000...@redhat.com
Content-Type: text/plain; charset=UTF-8; format=flowed

On 09/10/2014 03:27 AM, Paul Guo wrote: Hello, Recently I spent a bit of time understanding rebalance, since I want to know its performance given that there could be more and more bricks added to my glusterfs volume and there will be more and more files and directories in the existing volume. During the test I saw something which I'm really confused about.

Steps:
SW versions: glusterfs 3.4.4 + centos 6.5
Initial configuration: replica 2, lab1:/brick1 + lab2:/brick1
fuse_mount it on /mnt
cp -rf /sbin /mnt (~300+ files under /sbin)
add two more bricks: lab1:/brick2 + lab2:/brick2, then run gluster rebalance.

1) fix-layout only (e.g. gluster volume rebalance g1 fix-layout start)

After rebalance is done (observed via gluster volume rebalance g1 status), I found there is no file under lab1:/brick2/sbin. The hash ranges of the new brick lab1:/brick2/sbin and the old brick lab1:/brick1/sbin appear to be ok.

[root@lab1 Desktop]# getfattr -dm. -e hex /brick2/sbin
getfattr: Removing leading '/' from absolute path names
# file: brick2/sbin
trusted.gfid=0x35976c2034d24dc2b0639fde18de007d
trusted.glusterfs.dht=0x00017fff

[root@lab1 Desktop]# getfattr -dm. -e hex /brick1/sbin
getfattr: Removing leading '/' from absolute path names
# file: brick1/sbin
trusted.gfid=0x35976c2034d24dc2b0639fde18de007d
trusted.glusterfs.dht=0x00017ffe

The question is: AFAIK, fix-layout would create linkto files (files with the linkto xattr and only the sticky bit set) for those files whose hash values belong to the new subvol, so there should have been some linkto files under lab1:/brick2, but none are there now. Why?

fix-layout only fixes the layout, i.e. it spreads the layout to the newer bricks (or bricks previously not participating in the layout). It would not create the linkto files. Post fix-layout, if one were to perform a lookup on a file that should have belonged to the newer brick as per the layout and the hash of that file name, one can see the linkto file being present. Hope this explains (1).

[paul:] After fix-layout is complete, I mount the volume on /mnt, then run ls -l /mnt/sbin/* and file /mnt/sbin/*, and I found that only a few linkto files were created, while most of the files that should have entries under the new brick (i.e. brick2) were not created.

[root@lab1 ~]# ls -l /brick2/sbin
total 0
---------T 2 root root 0 Sep 12 09:26 dmraid
---------T 2 root root 0 Sep 12 09:26 initctl
---------T 2 root root 0 Sep 12 09:26 ip6tables-multi
---------T 2 root root 0 Sep 12 09:26 portreserve
---------T 2 root root 0 Sep 12 09:26 reboot
---------T 2 root root 0 Sep 12 09:26 swapon

[root@lab1 ~]# getfattr -dm. -e hex /brick2/sbin
getfattr: Removing leading '/' from absolute path names
# file: brick2/sbin
trusted.gfid=0x94bc07cd18914a91ab12fbe931c63431
trusted.glusterfs.dht=0x00017fff

[root@lab1 ~]# ./gethash.py reboot
0xd48b11f6L
[root@lab1 ~]# ./gethash.py swapon
0x93129578L

The hash values of reboot and swapon are in the range of /brick2/sbin (i.e. 7fff - ), so the linkto files for these two binaries are expected, but more linkto files are missing, e.g.
xfsdump:

[root@lab1 ~]# ./gethash.py xfsdump
0xc17ff86bL

Even if I umount /mnt, stop and then start the volume, restart glusterd, remount /mnt and do the experiment again, I still find no more linkto files under /brick2/sbin.

2) fix-layout + data migration (e.g. gluster volume rebalance g1 start)

After migration is done, I saw linkto files under brick2/sbin. There are 300+ files in total under the system /sbin, and under brick2/sbin I found all 300+ of them, either migrated or linkto-ed:

-rwxr-xr-x 2 root root 17400 Sep 10 12:02 vmcore-dmesg
---------T 2 root root 0 Sep 10 12:03 weak-modules
---------T 2 root root 0 Sep 10 12:03 wipefs
-rwxr-xr-x 2 root root 295656 Sep 10 12:02 xfsdump
-rwxr-xr-x 2 root root 51 Sep 10 12:02 xfs_repair
-rwxr-xr-x 2 root root 348088 Sep 10 12:02 xfsrestore

And under brick1/sbin, the migrated files are gone as expected. There are close to 150 files under brick1/sbin.

This confuses me, since creating those linkto files seems to be unnecessary, at least for files whose hash values do not belong to the subvol. (My understanding is that if a file's hash value is in the range of a subvol, then it will be stored in that subvol.)

Can you check if a lookup
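For anyone reproducing this, a sketch of the checks that make the layout and linkto behaviour visible; the paths follow the brick layout described in this thread.

    # Layout hash range assigned to each brick for the directory:
    getfattr -n trusted.glusterfs.dht -e hex /brick1/sbin /brick2/sbin

    # Linkto files are zero-byte, mode 1000 (sticky bit only, listed as ---------T)
    # and carry a trusted.glusterfs.dht.linkto xattr naming the real subvolume:
    find /brick2/sbin -maxdepth 1 -type f -size 0 -perm -1000
    getfattr -n trusted.glusterfs.dht.linkto -e text /brick2/sbin/reboot

    # After a plain fix-layout, linkto files only appear lazily when a client
    # looks the file up; a full "rebalance ... start" also migrates the data.
    ls -l /mnt/sbin > /dev/null
    file /mnt/sbin/* > /dev/null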
Re: [Gluster-users] [Gluster-devel] Proposal for GlusterD-2.0
- Original Message -
On Thu, Sep 11, 2014 at 4:55 AM, Krishnan Parthasarathi kpart...@redhat.com wrote: snip

I disagree. I think puppet + puppet-gluster would be a good idea :) One advantage is that the technology is already proven, and there's a working POC. Feel free to prove me wrong, or to request any features that it's missing. ;)

I am glad you joined this discussion. I was expecting you to join earlier :) IIUC, puppet-gluster uses glusterd to perform glusterfs deployments. I think it's important to consider puppet given its acceptance. What are your thoughts on building 'glusterd' using puppet? The proposal mail describes the functions glusterd performs today. With that as a reference, could you elaborate on how we could use puppet to perform some (or all) of the functions of glusterd?

~KP