Re: [Gluster-devel] [Gluster-users] Proposal for GlusterD-2.0
On 09/05/2014 03:51 PM, Kaushal M wrote:
> GlusterD performs the following functions as the management daemon for GlusterFS:
> - Peer membership management
> - Maintaining consistency of configuration data across nodes (distributed configuration store)
> - Distributed command execution (orchestration)
> - Service management (managing the GlusterFS daemons)
> - Portmap service for GlusterFS daemons
>
> This proposal aims to delegate the above functions to technologies that solve these problems well. We aim to do this in a phased manner. The technology alternatives we would be looking at should have the following properties:
> - Open source
> - Vibrant community
> - Good documentation
> - Easy to deploy/manage
>
> This would allow GlusterD's architecture to be more modular. We also aim to make GlusterD's architecture as transparent and observable as possible. Separating out these functions would allow us to do that.
>
> The bulk of the current GlusterD code deals with keeping the configuration of the cluster, and of the volumes in it, consistent and available across the nodes. The current algorithm is not scalable (N^2 in the number of nodes) and does not prevent split-brain of the configuration. This is the problem area we are targeting for the first phase.
>
> As part of the first phase, we aim to delegate the distributed configuration store. We are exploring consul [1] as a replacement for the existing distributed configuration store (the sum total of /var/lib/glusterd/* across all nodes). Consul provides a distributed configuration store which is consistent and partition tolerant. By moving all Gluster-related configuration information into consul we could avoid split-brain situations.

Did you get a chance to go over the following questions while making the decision? If yes, could you please share the info?
- What are the consistency guarantees for changing the configuration in case of network partitions? Specifically, when there are 2 nodes and 1 of them is not reachable? And when there are more than 2 nodes?
- What are the consistency guarantees for reading the configuration in case of network partitions?

Pranith

> All development efforts towards this proposal would happen in parallel to the existing GlusterD code base. The existing code base would be actively maintained until GlusterD-2.0 is production-ready.
>
> This is in alignment with the GlusterFS Quattro proposals on making GlusterFS scalable and easy to deploy. This is the first-phase groundwork towards that goal.
>
> Questions and suggestions are welcome.
>
> ~kaushal
>
> [1] : http://www.consul.io/

___
Gluster-users mailing list
gluster-us...@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
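On the partition questions raised in this thread: Consul's store is built on a majority-consensus protocol (Raft), so writes are only accepted while a majority of the server nodes is reachable. The arithmetic behind the 2-node case can be sketched as follows. This is illustrative only, not Consul code, and how many Consul servers a GlusterD-2.0 deployment would run is an open design question:

```python
def quorum(cluster_size: int) -> int:
    """Smallest majority of a cluster: floor(n/2) + 1."""
    return cluster_size // 2 + 1

def writes_allowed(cluster_size: int, reachable: int) -> bool:
    """A majority-consensus store accepts writes only while a quorum is reachable."""
    return reachable >= quorum(cluster_size)

# The 2-node case: with 1 of 2 nodes unreachable, no majority exists on
# either side, so neither side can change configuration -- no split-brain,
# but also no write availability.
print(writes_allowed(2, 1))  # False

# With 3 nodes, losing 1 still leaves a majority of 2, so the surviving
# partition can keep accepting configuration changes.
print(writes_allowed(3, 2))  # True
```

This is why consensus-backed stores are typically deployed with an odd number of servers: a 2-node cluster tolerates no failures at all, while 3 nodes tolerate one.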
Re: [Gluster-devel] basic/afr/gfid-self-heal.t on release-3.6/NetBSD
Harshavardhana <har...@harshavardhana.net> wrote:
> This is the change, you need this patch. Since it isn't committed to release-3.6, you shouldn't be using master regression tests on release-3.6.

Right, but that is a bug fixed in master but still present in release-3.6, isn't it? Why not backport that change to release-3.6? The patch will not apply cleanly, as it requires previous changes, but perhaps it is worth working on?

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Re: [Gluster-devel] [Gluster-users] Proposal for GlusterD-2.0
On 09/06/2014 05:55 PM, Pranith Kumar Karampuri wrote:
> On 09/05/2014 03:51 PM, Kaushal M wrote:
> [...]
>
> Did you get a chance to go over the following questions while making the decision? If yes, could you please share the info?
> - What are the consistency guarantees for changing the configuration in case of network partitions? Specifically, when there are 2 nodes and 1 of them is not reachable? And when there are more than 2 nodes?
> - What are the consistency guarantees for reading the configuration in case of network partitions?

The Consul documentation claims that it can recover from network partitions:
http://www.consul.io/docs/internals/jepsen.html

Having said that, we are yet to do this POC.

~Atin
Re: [Gluster-devel] basic/afr/gfid-self-heal.t on release-3.6/NetBSD
> Right, but that is a bug fixed in master but still present in release-3.6, isn't it? Why not backport that change to release-3.6? The patch will not apply cleanly, as it requires previous changes, but perhaps it is worth working on?

That is left to the changeset owner; perhaps it will be done in the upcoming weeks.

--
Religious confuse piety with mere ritual, the virtuous confuse regulation with outcomes
[Gluster-devel] Brick replace
Hi,

I am trying to get tests/basic/pump.t to pass on NetBSD, but after a few experiments it seems the brick replace functionality is just broken. I ran the steps one by one on a fresh install:

netbsd0# glusterd
netbsd0# $CLI volume create $V0 $H0:$B0/${V0}0
volume create: patchy: success: please start the volume to access data
netbsd0# $CLI volume start $V0
volume start: patchy: success
netbsd0# $GFS --volfile-id=/$V0 --volfile-server=$H0 $M0
netbsd0# cp -r /usr/share/misc/ $M0/
netbsd0# $CLI volume replace-brick $V0 $H0:$B0/${V0}0 $H0:$B0/${V0}1 start
volume replace-brick: success: replace-brick started successfully
ID: 98030ade-2dab-467e-86cb-cbea2436e85f
netbsd0# $CLI volume replace-brick $V0 $H0:$B0/${V0}0 $H0:$B0/${V0}1 commit
volume replace-brick: failed: Commit failed on localhost. Please check the log file for more details.

Which logs should I look at? There is nothing in the mount log. Here is the glusterd log:

[2014-09-07 00:39:01.905900] I [glusterd-replace-brick.c:154:__glusterd_handle_replace_brick] 0-management: Received replace brick commit request
[2014-09-07 00:39:01.948259] I [glusterd-replace-brick.c:1441:rb_update_srcbrick_port] 0-: adding src-brick port no
[2014-09-07 00:39:01.951238] I [glusterd-replace-brick.c:1495:rb_update_dstbrick_port] 0-: adding dst-brick port no
[2014-09-07 00:39:01.974376] E [glusterd-replace-brick.c:1780:glusterd_op_replace_brick] 0-management: Commit operation failed
[2014-09-07 00:39:01.974423] E [glusterd-op-sm.c:4109:glusterd_op_ac_send_commit_op] 0-management: Commit of operation 'Volume Replace brick' failed on localhost

The brick log has a lot of errors:

[2014-09-07 00:44:44.041565] E [client-handshake.c:1544:client_query_portmap] 0-patchy-replace-brick: remote-subvolume not set in volfile
[2014-09-07 00:44:44.041636] I [client.c:2215:client_rpc_notify] 0-patchy-replace-brick: disconnected from patchy-replace-brick. Client process will keep trying to connect to glusterd until brick's port is available

Any hint of where to look?

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
[Gluster-devel] truncating grouplist
Hi,

I have a lot of log messages like this:

[2014-09-07 04:07:41.323747] W [rpc-clnt.c:1340:rpc_clnt_record] 0-gfs352-client-3: truncating grouplist from 500 to 87

This is produced here:

        /* The number of groups and the size of lk_owner depend on each other.
         * We can truncate the groups, but should not touch the lk_owner. */
        max_groups = GF_AUTH_GLUSTERFS_MAX_GROUPS (au.lk_owner.lk_owner_len);
        if (au.groups.groups_len > max_groups) {
                GF_LOG_OCCASIONALLY (gf_auth_max_groups_log,
                                     clnt->conn.name, GF_LOG_WARNING,
                                     "truncating grouplist from %d to %d",
                                     au.groups.groups_len, max_groups);
                au.groups.groups_len = max_groups;
        }

Wouldn't it make sense to only log if max_groups < NGROUPS_MAX? I do not really care that a grouplist is truncated if my system cannot handle the missing groups anyway.

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
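The suggested guard can be modeled in Python for illustration. The constant and helper names below are hypothetical stand-ins for the C code above; in C one would compare against the platform's NGROUPS_MAX (or sysconf(_SC_NGROUPS_MAX)):

```python
NGROUPS_MAX = 16  # stand-in for the OS group limit; NetBSD's has historically been small

def truncate_groups(groups, max_groups):
    """Truncate the grouplist; report whether the event is worth a warning.

    Per the suggestion above: warn only when max_groups < NGROUPS_MAX,
    i.e. only when Gluster drops groups the OS could otherwise have used.
    """
    should_warn = len(groups) > max_groups and max_groups < NGROUPS_MAX
    return groups[:max_groups], should_warn

# The case from the log: 500 groups truncated to 87. With NGROUPS_MAX = 16
# the OS would ignore everything past 16 anyway, so no warning is emitted.
truncated, warn = truncate_groups(list(range(500)), 87)
print(len(truncated), warn)  # 87 False
```

With a tighter protocol limit (say max_groups = 8 on the same system), groups the OS could actually use are being dropped, and the warning fires as before.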
[Gluster-devel] Another transaction is in progress
How am I supposed to address this?

# gluster volume status gfs352
Another transaction is in progress. Please try again after sometime.

Retrying after some time does not help. The glusterd log file at least tells me who the culprit is:

[2014-09-07 04:20:25.820379] E [glusterd-utils.c:153:glusterd_lock] 0-management: Unable to get lock for uuid: 078015de-2186-4bd7-a4d1-017e39c16dd3, lock held by: 078015de-2186-4bd7-a4d1-017e39c16dd3

This UUID is a brick whose log file does not tell me anything in particular.

--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
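One detail worth noticing in that log line: the requesting UUID and the holding UUID are identical, which suggests glusterd is blocked on its own (possibly stale) cluster lock rather than waiting on another peer. A quick way to spot this from the log; this helper is hypothetical, not part of Gluster, and the regex simply matches the message format shown above:

```python
import re

# Matches the glusterd_lock error format shown in the message above.
LOCK_RE = re.compile(r"Unable to get lock for uuid: (?P<want>[0-9a-f-]+), "
                     r"lock held by: (?P<held>[0-9a-f-]+)")

def lock_conflict(logline):
    """Return (requester, holder, self_held) from a glusterd lock error, or None."""
    m = LOCK_RE.search(logline)
    if m is None:
        return None
    return m.group("want"), m.group("held"), m.group("want") == m.group("held")

line = ("[2014-09-07 04:20:25.820379] E [glusterd-utils.c:153:glusterd_lock] "
        "0-management: Unable to get lock for uuid: "
        "078015de-2186-4bd7-a4d1-017e39c16dd3, "
        "lock held by: 078015de-2186-4bd7-a4d1-017e39c16dd3")

_, _, self_held = lock_conflict(line)
print(self_held)  # True: the holder is the requester itself
```

When requester and holder match like this, the lock is typically a leftover from an earlier operation on the same node that never released it, which is why retrying does not help until glusterd is restarted or the lock times out.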
Re: [Gluster-devel] Brick replace
On 09/07/2014 06:21 AM, Emmanuel Dreyfus wrote:
> Hi
>
> I am trying to get tests/basic/pump.t to pass on NetBSD, but after a few experiments it seems the brick replace functionality is just broken.
> [...]
> Any hint of where to look?

Looking at the code, I can see there are a lot of errors logged at debug level. I would suggest you run glusterd with -LDEBUG and reproduce the issue, so that you can get to the exact problem area by looking at the glusterd log (just check why rb_do_operation() returns -1).

Having said that, I believe we should also change the log level from DEBUG to ERROR in a few of the failure cases in rb_do_operation().

~Atin