On Fri, Nov 25, 2016 at 1:14 PM, songxin <[email protected]> wrote:
> Hi Atin,
> It seems that this workaround has to be done manually.
> Is that right?
> And even the files in bricks/* may be empty too.

Yes, that's right.

> Do you have a workaround which is implemented in glusterfs code?

A workaround is by nature manual; anything to be done through code should be considered a fix, not a workaround :)

> Thanks,
> Xin
>
> On 2016-11-25 15:36:29, "Atin Mukherjee" <[email protected]> wrote:
>
> On Fri, Nov 25, 2016 at 12:06 PM, songxin <[email protected]> wrote:
>
>> Hi Atin,
>> Do you mean that you have the workaround applicable now?
>> Or will it take time to design the workaround?
>>
>> If you have a workaround now, could you share it with me?
>
> If you end up having a 0-byte info file, you'd need to copy the same
> info file from the other node, put it there, and restart glusterd.
>
>> Thanks,
>> Xin
>>
>> On 2016-11-24 19:12:07, "Atin Mukherjee" <[email protected]> wrote:
>>
>> Xin - I appreciate your patience. I'd need some more time to pick this
>> item up from my backlog. I believe we have a workaround applicable here too.
>>
>> On Thu, 24 Nov 2016 at 14:24, songxin <[email protected]> wrote:
>>
>>> Hi Atin,
>>> Actually, glusterfs is used in my project,
>>> and our test team found this issue.
>>> So I want to make sure whether you plan to fix it.
>>> If you have a plan, I will wait for you, because your method should be
>>> better than mine.
>>>
>>> Thanks,
>>> Xin
>>>
>>> On 2016-11-21 10:00:36, "Atin Mukherjee" <[email protected]> wrote:
>>>
>>> Hi Xin,
>>>
>>> I've not got a chance to look into it yet. The delete-stale-volume function
>>> is in place to take care of wiping off volume configuration data which has
>>> been deleted from the cluster. However, we need to revisit this code to see
>>> if this function is still needed, given that we recently added a validation
>>> to fail a delete request if one of the glusterds is down. I'll get back to
>>> you on this.
>>> On Mon, 21 Nov 2016 at 07:24, songxin <[email protected]> wrote:
>>>
>>> Hi Atin,
>>> Thank you for your support.
>>>
>>> Any conclusions about this issue?
>>>
>>> Thanks,
>>> Xin
>>>
>>> On 2016-11-16 20:59:05, "Atin Mukherjee" <[email protected]> wrote:
>>>
>>> On Tue, Nov 15, 2016 at 1:53 PM, songxin <[email protected]> wrote:
>>>
>>> OK, thank you.
>>>
>>> On 2016-11-15 16:12:34, "Atin Mukherjee" <[email protected]> wrote:
>>>
>>> On Tue, Nov 15, 2016 at 12:47 PM, songxin <[email protected]> wrote:
>>>
>>> Hi Atin,
>>>
>>> I think the root cause is in the function glusterd_import_friend_volume,
>>> as below:
>>>
>>> int32_t
>>> glusterd_import_friend_volume (dict_t *peer_data, size_t count)
>>> {
>>>         ...
>>>         ret = glusterd_volinfo_find (new_volinfo->volname, &old_volinfo);
>>>         if (0 == ret) {
>>>                 (void) gd_check_and_update_rebalance_info (old_volinfo,
>>>                                                            new_volinfo);
>>>                 (void) glusterd_delete_stale_volume (old_volinfo,
>>>                                                      new_volinfo);
>>>         }
>>>         ...
>>>         ret = glusterd_store_volinfo (new_volinfo,
>>>                                       GLUSTERD_VOLINFO_VER_AC_NONE);
>>>         if (ret) {
>>>                 gf_msg (this->name, GF_LOG_ERROR, 0,
>>>                         GD_MSG_VOLINFO_STORE_FAIL, "Failed to store "
>>>                         "volinfo for volume %s", new_volinfo->volname);
>>>                 goto out;
>>>         }
>>>         ...
>>> }
>>>
>>> glusterd_delete_stale_volume will remove the info file and bricks/*, and
>>> glusterd_store_volinfo will create the new ones.
>>> But if glusterd is killed before the rename, the info file will be empty,
>>> and glusterd will fail to start the next time, because the info file is
>>> empty.
>>>
>>> Any idea, Atin?
>>>
>>> Give me some time, I will check it out; but reading this analysis it looks
>>> very well possible: a volume is changed while glusterd is down on node A,
>>> and when the same node comes up, we update the volinfo during the peer
>>> handshake, and during that time glusterd goes down once again. I'll
>>> confirm it by tomorrow.
>>> I checked the code, and it does look like you have got the right RCA for
>>> the issue, which you simulated through those two scripts. However, this
>>> can happen even when you try to create a fresh volume: if glusterd goes
>>> down while writing the content into the store, before renaming the
>>> info.tmp file, you get into the same situation.
>>>
>>> I'd really need to think through whether this can be fixed. Suggestions
>>> are always appreciated.
>>>
>>> BTW, excellent work Xin!
>>>
>>> Thanks,
>>> Xin
>>>
>>> On 2016-11-15 12:07:05, "Atin Mukherjee" <[email protected]> wrote:
>>>
>>> On Tue, Nov 15, 2016 at 8:58 AM, songxin <[email protected]> wrote:
>>>
>>> Hi Atin,
>>> I have some clues about this issue.
>>> I could reproduce this issue using the script mentioned in
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1308487 .
>>>
>>> I really appreciate your help in trying to nail down this issue. While I
>>> am at your email and going through the code to figure out the possible
>>> cause for it, unfortunately I don't see any script in the attachments of
>>> the bug. Could you please cross-check?
>>>
>>> After I added some debug prints in glusterd-store.c, as below, I found
>>> that /var/lib/glusterd/vols/xxx/info and
>>> /var/lib/glusterd/vols/xxx/bricks/* are removed,
>>> but other files in /var/lib/glusterd/vols/xxx/ are not removed.
>>> int32_t
>>> glusterd_store_volinfo (glusterd_volinfo_t *volinfo,
>>>                         glusterd_volinfo_ver_ac_t ac)
>>> {
>>>         int32_t     ret = -1;
>>>         struct stat buf = {0,};
>>>
>>>         GF_ASSERT (volinfo);
>>>
>>>         ret = access ("/var/lib/glusterd/vols/gv0/info", F_OK);
>>>         if (ret < 0) {
>>>                 gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
>>>                         "info does not exist (%d)", errno);
>>>         } else {
>>>                 ret = stat ("/var/lib/glusterd/vols/gv0/info", &buf);
>>>                 if (ret < 0) {
>>>                         gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
>>>                                 "stat info error");
>>>                 } else {
>>>                         gf_msg (THIS->name, GF_LOG_ERROR, 0, 0,
>>>                                 "info size is %lu, inode num is %lu",
>>>                                 buf.st_size, buf.st_ino);
>>>                 }
>>>         }
>>>
>>>         glusterd_perform_volinfo_version_action (volinfo, ac);
>>>         ret = glusterd_store_create_volume_dir (volinfo);
>>>         if (ret)
>>>                 goto out;
>>>
>>>         ...
>>> }
>>>
>>> So it is easy to understand why info or 10.32.1.144.-opt-lvmdir-c2-brick
>>> is sometimes empty.
>>> It is because the info file does not exist, and it is created by "fd =
>>> open (path, O_RDWR | O_CREAT | O_APPEND, 0600);" in the function
>>> gf_store_handle_new.
>>> The info file is empty before the rename,
>>> so the info file stays empty if glusterd shuts down before the rename.
>>>
>>> My questions are the following:
>>> 1. I did not find the point where info is removed. Could you tell me the
>>> point where info and bricks/* are removed?
>>> 2. Why are the files info and bricks/* removed, while other files in
>>> /var/lib/glusterd/vols/xxx/ are not?
>>>
>>> AFAIK, we never delete the info file, and hence this file is opened with
>>> the O_APPEND flag. As I said, I will go back and cross-check the code
>>> once again.
>>>
>>> Thanks,
>>> Xin
>>>
>>> On 2016-11-11 20:34:05, "Atin Mukherjee" <[email protected]> wrote:
>>>
>>> On Fri, Nov 11, 2016 at 4:00 PM, songxin <[email protected]> wrote:
>>>
>>> Hi Atin,
>>>
>>> Thank you for your support.
>>> I sincerely await your reply.
>>> By the way, could you confirm that this issue (the info file being
>>> empty) is caused by the rename being interrupted in the kernel?
>>>
>>> As per my RCA on that bug, it looked to be.
>>>
>>> Thanks,
>>> Xin
>>>
>>> On 2016-11-11 15:49:02, "Atin Mukherjee" <[email protected]> wrote:
>>>
>>> On Fri, Nov 11, 2016 at 1:15 PM, songxin <[email protected]> wrote:
>>>
>>> Hi Atin,
>>> Thank you for your reply.
>>> Actually it is very difficult to reproduce, because I don't know when
>>> there was an ongoing commit happening. It is just a coincidence.
>>> But I want to be sure of the root cause.
>>>
>>> I'll give it another try and see if this situation can be
>>> simulated/reproduced, and will keep you posted.
>>>
>>> So I would be grateful if you could answer my questions below.
>>>
>>> You said in a comment: "This issue is hit at part of the negative
>>> testing where while gluster volume set was executed at the same point of
>>> time glusterd in another instance was brought down. In the faulty node
>>> we could see /var/lib/glusterd/vols/<volname>/info file been empty
>>> whereas the info.tmp file has the correct contents."
>>>
>>> I have two questions for you.
>>>
>>> 1. Could you reproduce this issue with gluster volume set while glusterd
>>> was brought down?
>>> 2. Could you be certain that this issue is caused by the rename being
>>> interrupted in the kernel?
>>>
>>> In my case there are two files, info and
>>> 10.32.1.144.-opt-lvmdir-c2-brick, that are both empty.
>>> But in my view only one rename can be running at a time, because of the
>>> big lock.
>>> Why are both files empty?
>>>
>>> Could rename("info.tmp", "info") and rename("xxx-brick.tmp",
>>> "xxx-brick") be running in two threads?
>>>
>>> Thanks,
>>> Xin
>>>
>>> On 2016-11-11 15:27:03, "Atin Mukherjee" <[email protected]> wrote:
>>>
>>> On Fri, Nov 11, 2016 at 12:38 PM, songxin <[email protected]> wrote:
>>>
>>> Hi Atin,
>>> Thank you for your reply.
>>> As you said, the info file can only be changed in
>>> glusterd_store_volinfo(), sequentially, because of the big lock.
>>>
>>> I have found the similar issue that you mentioned:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1308487
>>>
>>> Great, so this is what I was actually trying to refer to in my first
>>> email when I said I saw a similar issue. Have you got a chance to look at
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1308487#c4 ? But in your
>>> case, did you try to bring down glusterd when there was an ongoing
>>> commit happening?
>>>
>>> You said in a comment: "This issue is hit at part of the negative
>>> testing where while gluster volume set was executed at the same point of
>>> time glusterd in another instance was brought down. In the faulty node
>>> we could see /var/lib/glusterd/vols/<volname>/info file been empty
>>> whereas the info.tmp file has the correct contents."
>>>
>>> I have two questions for you.
>>>
>>> 1. Could you reproduce this issue with gluster volume set while glusterd
>>> was brought down?
>>> 2. Could you be certain that this issue is caused by the rename being
>>> interrupted in the kernel?
>>>
>>> In my case there are two files, info and
>>> 10.32.1.144.-opt-lvmdir-c2-brick, that are both empty.
>>> But in my view only one rename can be running at a time, because of the
>>> big lock.
>>> Why are both files empty?
>>>
>>> Could rename("info.tmp", "info") and rename("xxx-brick.tmp",
>>> "xxx-brick") be running in two threads?
>>>
>>> Thanks,
>>> Xin
>>>
>>> On 2016-11-11 14:36:40, "Atin Mukherjee" <[email protected]> wrote:
>>>
>>> On Fri, Nov 11, 2016 at 8:33 AM, songxin <[email protected]> wrote:
>>>
>>> Hi Atin,
>>>
>>> Thank you for your reply.
>>> I have two questions for you.
>>>
>>> 1. Are the two files info and info.tmp only created or changed in the
>>> function glusterd_store_volinfo()? I did not find another point at which
>>> the two files are changed.
>>> If we are talking about the info file of a volume, then yes, the
>>> mentioned function actually takes care of it.
>>>
>>> 2. I found that glusterd_store_volinfo() is called at many points by
>>> glusterd. Is there a problem of thread synchronization? If so, one
>>> thread may open the same file info.tmp using the O_TRUNC flag while
>>> another thread is writing info.tmp. Could this case happen?
>>>
>>> In glusterd, threads are protected by the big lock and I don't see a
>>> possibility (theoretically) of having two glusterd_store_volinfo()
>>> calls at a given point of time.
>>>
>>> Thanks,
>>> Xin
>>>
>>> At 2016-11-10 21:41:06, "Atin Mukherjee" <[email protected]> wrote:
>>>
>>> Did you run out of disk space by any chance? AFAIK, the code is such
>>> that we write the new content to a .tmp file and rename it back to the
>>> original file. In the case of a disk-space issue I would expect both
>>> files to be of non-zero size. But having said that, I vaguely remember a
>>> similar issue (in the form of a bug or an email) landing up once that we
>>> couldn't reproduce, so my guess is that something is wrong with the
>>> atomic update here. I'll be glad if you have a reproducer for it, and
>>> then we can dig into it further.
>>>
>>> On Thu, Nov 10, 2016 at 1:32 PM, songxin <[email protected]> wrote:
>>>
>>> Hi,
>>> When I start glusterd, some errors happen,
>>> with the following log:
>>> [2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main]
>>> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6
>>> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
>>> [2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init]
>>> 0-management: Maximum allowed open file descriptors set to 65536
>>> [2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init]
>>> 0-management: Using /system/glusterd as working directory
>>> [2016-11-08 07:58:35.024508] I [MSGID: 106514]
>>> [glusterd-store.c:2075:glusterd_restore_op_version] 0-management:
>>> Upgrade detected. Setting op-version to minimum : 1
>>> [2016-11-08 07:58:35.025356] E [MSGID: 106206]
>>> [glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management:
>>> Failed to get next store iter
>>> [2016-11-08 07:58:35.025401] E [MSGID: 106207]
>>> [glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management:
>>> Failed to update volinfo for c_glusterfs volume
>>> [2016-11-08 07:58:35.025463] E [MSGID: 106201]
>>> [glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management:
>>> Unable to restore volume: c_glusterfs
>>> [2016-11-08 07:58:35.025544] E [MSGID: 101019]
>>> [xlator.c:428:xlator_init] 0-management: Initialization of volume
>>> 'management' failed, review your volfile again
>>> [2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init]
>>> 0-management: initializing translator failed
>>> [2016-11-08 07:58:35.025629] E [graph.c:661:glusterfs_graph_activate]
>>> 0-graph: init failed
>>> [2016-11-08 07:58:35.026109] W [glusterfsd.c:1236:cleanup_and_exit]
>>> (-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b260) [0x1000a718]
>>> -->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b3b8) [0x1000a5a8]
>>> -->/usr/sbin/glusterd(cleanup_and_exit-0x1c02c) [0x100098bc] ) 0-:
>>> received signum (0), shutting down
>>>
>>> Then I found that the size of vols/volume_name/info is 0, and this caused
>>> glusterd to shut down.
>>> But I found that vols/volume_name/info.tmp is not 0.
>>> I also found that there is a brick file
>>> vols/volume_name/bricks/xxxx.brick whose size is 0, while
>>> vols/volume_name/bricks/xxxx.brick.tmp is not 0.
>>>
>>> I read the code of glusterd_store_volinfo() in glusterd-store.c.
>>> I know that info.tmp is renamed to info in the function
>>> glusterd_store_volume_atomic_update().
>>>
>>> But my question is: why is the info file 0 bytes while info.tmp is not?
>>>
>>> Thanks,
>>> Xin
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> [email protected]
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>
>>> --
>>> ~ Atin (atinm)

--
~ Atin (atinm)
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
