It seems like I can update GlusterFS to 3.5.2-1 from the Debian sid repository: https://packages.debian.org/sid/glusterfs-server
Should I give it a try?
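In case it helps anyone else on Debian: below is roughly how I would pull just the GlusterFS packages from sid with apt pinning, so the rest of the node stays on the current release. This is untested on my side so far; glusterfs-server is the package from that page, and glusterfs-client / glusterfs-common are the companion packages I assume I would also need.

  # /etc/apt/sources.list.d/sid.list -- add sid as an extra package source
  deb http://ftp.debian.org/debian sid main

  # /etc/apt/preferences.d/glusterfs -- pin sid low so nothing else upgrades from it
  Package: *
  Pin: release a=unstable
  Pin-Priority: 100

  # then, on each node:
  apt-get update
  apt-get -t unstable install glusterfs-server glusterfs-client glusterfs-common

(I have also added two short notes at the very bottom, after the quoted thread: the brick-side check I use to tell whether a heal has really finished, and the fstab line I mount with.)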
2014-08-05 12:37 GMT+03:00 Roman <[email protected]>:

> really, seems like the same file
>
> stor1:
> a951641c5230472929836f9fcede6b04  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>
> stor2:
> a951641c5230472929836f9fcede6b04  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>
> One thing I've seen in the logs is that Proxmox VE somehow seems to connect to the servers with the wrong version:
> [2014-08-05 09:23:45.218550] I [client-handshake.c:1659:select_server_supported_programs] 0-HA-fast-150G-PVE1-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>
> but if I issue:
> root@pve1:~# glusterfs -V
> glusterfs 3.4.4 built on Jun 28 2014 03:44:57
> it seems ok.
>
> The servers use 3.4.4 meanwhile:
> [2014-08-05 09:23:45.117875] I [server-handshake.c:567:server_setvolume] 0-HA-fast-150G-PVE1-server: accepted client from stor1-9004-2014/08/05-09:23:45:93538-HA-fast-150G-PVE1-client-1-0 (version: 3.4.4)
> [2014-08-05 09:23:49.103035] I [server-handshake.c:567:server_setvolume] 0-HA-fast-150G-PVE1-server: accepted client from stor1-8998-2014/08/05-09:23:45:89883-HA-fast-150G-PVE1-client-0-0 (version: 3.4.4)
>
> Not sure if this could be the reason, of course.
> I did restart the Proxmox VE node yesterday (just for information).
>
> 2014-08-05 12:30 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
>
>> On 08/05/2014 02:33 PM, Roman wrote:
>>
>> Waited long enough for now, still different sizes and no logs about healing :(
>>
>> stor1
>> # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>> trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
>> trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
>> trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
>>
>> root@stor1:~# du -sh /exports/fast-test/150G/images/127/
>> 1.2G    /exports/fast-test/150G/images/127/
>>
>> stor2
>> # file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
>> trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
>> trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
>> trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
>>
>> root@stor2:~# du -sh /exports/fast-test/150G/images/127/
>> 1.4G    /exports/fast-test/150G/images/127/
>>
>> According to the changelogs, the file doesn't need any healing. Could you stop the operations on the VMs and take an md5sum on both these machines?
>>
>> Pranith
>>
>> 2014-08-05 11:49 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
>>
>>> On 08/05/2014 02:06 PM, Roman wrote:
>>>
>>> Well, it seems like it doesn't see the changes that were made to the volume?
>>> I created two files of 200 and 100 MB (from /dev/zero) after I disconnected the first brick.
>>> Then connected it back and got these logs:
>>>
>>> [2014-08-05 08:30:37.830150] I [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
>>> [2014-08-05 08:30:37.830207] I [rpc-clnt.c:1676:rpc_clnt_reconfig] 0-HA-fast-150G-PVE1-client-0: changing port to 49153 (from 0)
>>> [2014-08-05 08:30:37.830239] W [socket.c:514:__socket_rwv] 0-HA-fast-150G-PVE1-client-0: readv failed (No data available)
>>> [2014-08-05 08:30:37.831024] I [client-handshake.c:1659:select_server_supported_programs] 0-HA-fast-150G-PVE1-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>>> [2014-08-05 08:30:37.831375] I [client-handshake.c:1456:client_setvolume_cbk] 0-HA-fast-150G-PVE1-client-0: Connected to 10.250.0.1:49153, attached to remote volume '/exports/fast-test/150G'.
>>> [2014-08-05 08:30:37.831394] I [client-handshake.c:1468:client_setvolume_cbk] 0-HA-fast-150G-PVE1-client-0: Server and Client lk-version numbers are not same, reopening the fds
>>> [2014-08-05 08:30:37.831566] I [client-handshake.c:450:client_set_lk_version_cbk] 0-HA-fast-150G-PVE1-client-0: Server lk version = 1
>>>
>>> [2014-08-05 08:30:37.830150] I [glusterfsd-mgmt.c:1584:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
>>> This line seems weird to me, to be honest.
>>> I do not see any traffic on the switch interfaces between the gluster servers, which means there is no syncing between them.
>>> I tried to ls -l the files on the client and on the servers to trigger the healing, but it seems there was no success. Should I wait more?
>>>
>>> Yes, it should take around 10-15 minutes. Could you provide 'getfattr -d -m. -e hex <file-on-brick>' on both the bricks?
>>>
>>> Pranith
>>>
>>> 2014-08-05 11:25 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
>>>
>>>> On 08/05/2014 01:10 PM, Roman wrote:
>>>>
>>>> Ahha! For some reason I was not able to start the VM anymore; Proxmox VE told me that it is not able to read the qcow2 header because permission is denied for some reason. So I just deleted that file and created a new VM. And the next message I got was this:
>>>>
>>>> Seems like these are the messages from where you took down the bricks before self-heal. Could you restart the run, waiting for self-heals to complete before taking down the next brick?
>>>>
>>>> Pranith
>>>>
>>>> [2014-08-05 07:31:25.663412] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-HA-fast-150G-PVE1-replicate-0: Unable to self-heal contents of '/images/124/vm-124-disk-1.qcow2' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 60 ] [ 11 0 ] ]
>>>> [2014-08-05 07:31:25.663955] E [afr-self-heal-common.c:2262:afr_self_heal_completion_cbk] 0-HA-fast-150G-PVE1-replicate-0: background data self-heal failed on /images/124/vm-124-disk-1.qcow2
>>>>
>>>> 2014-08-05 10:13 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
>>>>
>>>>> I just responded to your earlier mail about how the log looks. The log comes in the mount's logfile.
>>>>>
>>>>> Pranith
>>>>>
>>>>> On 08/05/2014 12:41 PM, Roman wrote:
>>>>>
>>>>> Ok, so I've waited enough, I think. There was no traffic at all on the switch ports between the servers, and I could not find any suitable log message about a completed self-heal (waited about 30 minutes).
>>>>> This time I plugged out the other server's UTP cable and got into the same situation:
>>>>> root@gluster-test1:~# cat /var/log/dmesg
>>>>> -bash: /bin/cat: Input/output error
>>>>>
>>>>> brick logs:
>>>>> [2014-08-05 07:09:03.005474] I [server.c:762:server_rpc_notify] 0-HA-fast-150G-PVE1-server: disconnecting connection from pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
>>>>> [2014-08-05 07:09:03.005530] I [server-helpers.c:729:server_connection_put] 0-HA-fast-150G-PVE1-server: Shutting down connection pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
>>>>> [2014-08-05 07:09:03.005560] I [server-helpers.c:463:do_fd_cleanup] 0-HA-fast-150G-PVE1-server: fd cleanup on /images/124/vm-124-disk-1.qcow2
>>>>> [2014-08-05 07:09:03.005797] I [server-helpers.c:617:server_connection_destroy] 0-HA-fast-150G-PVE1-server: destroyed connection of pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
>>>>>
>>>>> 2014-08-05 9:53 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
>>>>>
>>>>>> Do you think it is possible for you to do these tests on the latest version, 3.5.2? 'gluster volume heal <volname> info' would give you that information in versions > 3.5.1.
>>>>>> Otherwise you will have to check it either from the logs (there will be a self-heal completed message in the mount logs) or by observing 'getfattr -d -m. -e hex <image-file-on-bricks>'.
>>>>>>
>>>>>> Pranith
>>>>>>
>>>>>> On 08/05/2014 12:09 PM, Roman wrote:
>>>>>>
>>>>>> Ok, I understand. I will try this shortly.
>>>>>> How can I be sure that the healing process is done if I am not able to see its status?
>>>>>>
>>>>>> 2014-08-05 9:30 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
>>>>>>
>>>>>>> Mounts will do the healing, not the self-heal daemon. The problem, I feel, is that whichever process does the healing needs to have the latest information about the good bricks in this usecase. Since for the VM usecase the mounts should have the latest information, we should let the mounts do the healing. If the mount accesses the VM image, either by someone doing operations inside the VM or by an explicit stat on the file, it should do the healing.
>>>>>>>
>>>>>>> Pranith.
>>>>>>>
>>>>>>> On 08/05/2014 10:39 AM, Roman wrote:
>>>>>>>
>>>>>>> Hmmm, you told me to turn it off. Did I misunderstand something? After I issued the command you sent me, I was not able to watch the healing process; it said it won't be healed, because it's turned off.
>>>>>>>
>>>>>>> 2014-08-05 5:39 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
>>>>>>>
>>>>>>>> You didn't mention anything about self-healing. Did you wait until the self-heal was complete?
>>>>>>>>
>>>>>>>> Pranith
>>>>>>>>
>>>>>>>> On 08/04/2014 05:49 PM, Roman wrote:
>>>>>>>>
>>>>>>>> Hi!
>>>>>>>> The result is pretty much the same. I set the switch port down for the 1st server, and it was ok. Then I set it back up and set the other server's port off, and that triggered an IO error on two virtual machines: one with a local root FS but network-mounted storage, and the other with a network root FS. The 1st gave an error on copying to or from the mounted network disk; the other just gave me an error even for reading log files.
>>>>>>>> cat: /var/log/alternatives.log: Input/output error
>>>>>>>> Then I reset the KVM VM and it told me there is no boot device. Next I virtually powered it off and then back on, and it booted.
>>>>>>>>
>>>>>>>> By the way, did I have to start/stop the volume?
>>>>>>>>
>>>>>>>> >> Could you do the following and test it again?
>>>>>>>> >> gluster volume set <volname> cluster.self-heal-daemon off
>>>>>>>> >> Pranith
>>>>>>>>
>>>>>>>> 2014-08-04 14:10 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
>>>>>>>>
>>>>>>>>> On 08/04/2014 03:33 PM, Roman wrote:
>>>>>>>>>
>>>>>>>>> Hello!
>>>>>>>>>
>>>>>>>>> Facing the same problem as mentioned here:
>>>>>>>>> http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html
>>>>>>>>>
>>>>>>>>> My setup is up and running, so I'm ready to help you back with feedback.
>>>>>>>>>
>>>>>>>>> setup:
>>>>>>>>> Proxmox server as client
>>>>>>>>> 2 physical gluster servers
>>>>>>>>> server side and client side both running glusterfs 3.4.4 from the gluster repo at the moment.
>>>>>>>>>
>>>>>>>>> the problem is:
>>>>>>>>> 1. created replica bricks.
>>>>>>>>> 2. mounted in Proxmox (tried both Proxmox ways: via GUI and via fstab (with the backup volume line); btw, while mounting via fstab I'm unable to launch a VM without cache, even though direct-io-mode is enabled in the fstab line)
>>>>>>>>> 3. installed a VM
>>>>>>>>> 4. brought one of the bricks down - ok
>>>>>>>>> 5. brought it back up, waited for the sync to be done.
>>>>>>>>> 6. brought the other brick down - got IO errors in the VM guest and was not able to restore the VM after I reset it via the host. It says "no bootable media". After I shut it down (forced) and bring it back up, it boots.
>>>>>>>>>
>>>>>>>>> Could you do the following and test it again?
>>>>>>>>> gluster volume set <volname> cluster.self-heal-daemon off
>>>>>>>>>
>>>>>>>>> Pranith
>>>>>>>>>
>>>>>>>>> Need help. Tried 3.4.3, 3.4.4.
>>>>>>>>> Still missing packages for 3.4.5 and 3.5.2 for Debian (3.5.1 always gives a healing error for some reason).
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>> Roman.
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Gluster-users mailing list
>>>>>>>>> [email protected]
>>>>>>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>
>
> --
> Best regards,
> Roman.

--
Best regards,
Roman.
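PS: for completeness, here is the quick check I run on both bricks while waiting for a heal to finish. It is just a sketch of the getfattr/md5sum/du steps discussed above; it assumes root ssh access to stor1 and stor2 and uses the image path from my test volume:

  #!/bin/bash
  # Compare heal state and content of one VM image across both bricks.
  FILE=/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2

  for host in stor1 stor2; do
      echo "=== $host ==="
      # all-zero trusted.afr.* changelog values mean no pending self-heal
      ssh "$host" getfattr -d -m . -e hex "$FILE"
      # with the VM stopped/quiet, the md5sums should match on both bricks
      ssh "$host" md5sum "$FILE"
      # du shows allocated blocks, so sparse qcow2 images can differ here
      ssh "$host" du -sh "$FILE"
  done

If the trusted.afr.* changelogs are all zero and the md5sums match, I treat the file as healed; the du sizes can still differ simply because the qcow2 images are sparse.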
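PS2: since mounting via the GUI vs. fstab came up above, this is roughly the fstab line I am using on the Proxmox node. The /mnt/pve mount point and the option set are just how my setup looks (backupvolfile-server points at the second storage node), so treat it as an example rather than a recommendation:

  # /etc/fstab on the Proxmox node (single line)
  stor1:/HA-fast-150G-PVE1  /mnt/pve/HA-fast-150G-PVE1  glusterfs  defaults,_netdev,backupvolfile-server=stor2,direct-io-mode=enable  0  0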
