Roman,
The file went into split-brain. I think we should do these tests
with 3.5.2, where monitoring the heals is easier. Let me also put
together a document on how to do the testing you are trying to do.
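(Reading the changelog xattrs from your mail below, as a rough sketch: in
the usual AFR layout the first 8 hex digits of each trusted.afr.* value
count pending data operations. Assuming the first block is from stor1 and
the second from stor2, stor1 blames the other brick with
client-1=0x00000132... (306 pending data operations) while stor2 blames
back with client-0=0x00000004... (4 pending). Each brick accusing the
other is exactly the split-brain signature. On 3.5.2 the same state
should be visible directly with:
gluster volume heal HA-fast-150G-PVE1 info split-brain
taking the volume name from your logs.)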
Humble/Niels,
Do we have debs available for 3.5.2? In 3.5.1 there was a packaging
issue where /usr/bin/glfsheal was not shipped along with the deb. I
think that should be fixed now as well?
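A quick way to verify the packaging, just a sketch assuming you have the
built .deb at hand (the exact filename below is a guess):
dpkg -c glusterfs-server_3.5.2-1_amd64.deb | grep glfsheal
should list /usr/bin/glfsheal if the fix made it in.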
Pranith
On 08/06/2014 11:52 AM, Roman wrote:
Good morning,
root@stor1:~# getfattr -d -m. -e hex
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
getfattr: Removing leading '/' from absolute path names
# file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
trusted.afr.HA-fast-150G-PVE1-client-1=0x000001320000000000000000
trusted.gfid=0x23c79523075a4158bea38078da570449
getfattr: Removing leading '/' from absolute path names
# file: exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000040000000000000000
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
trusted.gfid=0x23c79523075a4158bea38078da570449
2014-08-06 9:20 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
On 08/06/2014 11:30 AM, Roman wrote:
Also, this time files are not the same!
root@stor1:~# md5sum
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
32411360c53116b96a059f17306caeda
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
root@stor2:~# md5sum
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
65b8a6031bcb6f5fb3a11cb1e8b1c9c9
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
What is the getfattr output?
Pranith
2014-08-05 16:33 GMT+03:00 Roman <[email protected]>:
Nope, it is not working. But this time it went a bit differently.
root@gluster-client:~# dmesg
Segmentation fault
I was not even able to start the VM after I was done with the tests:
Could not read qcow2 header: Operation not permitted
And it seems it never starts to sync the files after the first
disconnect. The VM survives the first disconnect, but not the second (I
waited around 30 minutes). Also, I've got network.ping-timeout: 2 in
the volume settings, but the logs reacted to the first disconnect only
after around 30 seconds; the second was faster, 2 seconds.
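For reference, the setting can be checked with something like this
(volume name as in my logs), and it does report network.ping-timeout: 2:
gluster volume info HA-fast-150G-PVE1 | grep ping-timeout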
The reaction was also different.
The slower one:
[2014-08-05 13:26:19.558435] W [socket.c:514:__socket_rwv]
0-glusterfs: readv failed (Connection timed out)
[2014-08-05 13:26:19.558485] W
[socket.c:1962:__socket_proto_state_machine] 0-glusterfs:
reading from socket failed. Error (Connection timed out),
peer (10.250.0.1:24007)
[2014-08-05 13:26:21.281426] W [socket.c:514:__socket_rwv]
0-HA-fast-150G-PVE1-client-0: readv failed (Connection timed out)
[2014-08-05 13:26:21.281474] W
[socket.c:1962:__socket_proto_state_machine]
0-HA-fast-150G-PVE1-client-0: reading from socket failed.
Error (Connection timed out), peer (10.250.0.1:49153)
[2014-08-05 13:26:21.281507] I
[client.c:2098:client_rpc_notify]
0-HA-fast-150G-PVE1-client-0: disconnected
The fast one:
[2014-08-05 12:52:44.607389] C
[client-handshake.c:127:rpc_client_ping_timer_expired]
0-HA-fast-150G-PVE1-client-1: server 10.250.0.2:49153
has not responded in the last 2 seconds, disconnecting.
[2014-08-05 12:52:44.607491] W [socket.c:514:__socket_rwv]
0-HA-fast-150G-PVE1-client-1: readv failed (No data available)
[2014-08-05 12:52:44.607585] E
[rpc-clnt.c:368:saved_frames_unwind]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)
[0x7fcb1b4b0558]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)
[0x7fcb1b4aea63]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced
unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at
2014-08-05 12:52:42.463881 (xid=0x381883x)
[2014-08-05 12:52:44.607604] W
[client-rpc-fops.c:2624:client3_3_lookup_cbk]
0-HA-fast-150G-PVE1-client-1: remote operation failed:
Transport endpoint is not connected. Path: /
(00000000-0000-0000-0000-000000000001)
[2014-08-05 12:52:44.607736] E
[rpc-clnt.c:368:saved_frames_unwind]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)
[0x7fcb1b4b0558]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)
[0x7fcb1b4aea63]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced
unwinding frame type(GlusterFS Handshake) op(PING(3)) called
at 2014-08-05 12:52:42.463891 (xid=0x381884x)
[2014-08-05 12:52:44.607753] W
[client-handshake.c:276:client_ping_cbk]
0-HA-fast-150G-PVE1-client-1: timer must have expired
[2014-08-05 12:52:44.607776] I
[client.c:2098:client_rpc_notify]
0-HA-fast-150G-PVE1-client-1: disconnected
I've got SSD disks (just for info).
Should I go and give 3.5.2 a try?
2014-08-05 13:06 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
Please reply along with gluster-users :-). Maybe you are
hitting 'reply' instead of 'reply all'?
Pranith
On 08/05/2014 03:35 PM, Roman wrote:
To make sure and start clean, I've created another VM with raw
format and am going to repeat those steps. So now I've got
two VMs: one with qcow2 format and the other with raw
format. I will send another e-mail shortly.
2014-08-05 13:01 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
On 08/05/2014 03:07 PM, Roman wrote:
Really, it seems like the same file:
stor1:
a951641c5230472929836f9fcede6b04
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
stor2:
a951641c5230472929836f9fcede6b04
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
One thing I've seen from the logs: it looks like Proxmox
VE is connecting to the servers with the wrong version?
[2014-08-05 09:23:45.218550] I
[client-handshake.c:1659:select_server_supported_programs]
0-HA-fast-150G-PVE1-client-0: Using Program
GlusterFS 3.3, Num (1298437), Version (330)
It is the RPC (over-the-network data structures)
version, which has not changed at all since 3.3, so
that is not a problem. So what is the conclusion? Is
your test case working now or not?
Pranith
But if I issue:
root@pve1:~# glusterfs -V
glusterfs 3.4.4 built on Jun 28 2014 03:44:57
it seems OK. The servers use 3.4.4 meanwhile:
[2014-08-05 09:23:45.117875] I
[server-handshake.c:567:server_setvolume]
0-HA-fast-150G-PVE1-server: accepted client from
stor1-9004-2014/08/05-09:23:45:93538-HA-fast-150G-PVE1-client-1-0
(version: 3.4.4)
[2014-08-05 09:23:49.103035] I
[server-handshake.c:567:server_setvolume]
0-HA-fast-150G-PVE1-server: accepted client from
stor1-8998-2014/08/05-09:23:45:89883-HA-fast-150G-PVE1-client-0-0
(version: 3.4.4)
If this could be the reason, of course. I did restart the
Proxmox VE yesterday (just for information).
2014-08-05 12:30 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
On 08/05/2014 02:33 PM, Roman wrote:
I've waited long enough for now; still different
sizes and no logs about healing :(
stor1
# file:
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
root@stor1:~# du -sh
/exports/fast-test/150G/images/127/
1.2G /exports/fast-test/150G/images/127/
stor2
# file:
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
root@stor2:~# du -sh
/exports/fast-test/150G/images/127/
1.4G /exports/fast-test/150G/images/127/
According to the changelogs, the file doesn't
need any healing. Could you stop the operations
on the VMs and take md5sum on both these machines?
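For example, the same command you used earlier, run on both bricks once
the VM is idle:
md5sum /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2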
Pranith
2014-08-05 11:49 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
On 08/05/2014 02:06 PM, Roman wrote:
Well, it seems like it doesn't see that changes were
made to the volume? I created two files, 200 MB and
100 MB (from /dev/zero), after I disconnected the first
brick. Then I connected it back and got these logs:
[2014-08-05 08:30:37.830150] I
[glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]
0-glusterfs: No change in volfile, continuing
[2014-08-05 08:30:37.830207] I
[rpc-clnt.c:1676:rpc_clnt_reconfig]
0-HA-fast-150G-PVE1-client-0: changing
port to 49153 (from 0)
[2014-08-05 08:30:37.830239] W
[socket.c:514:__socket_rwv]
0-HA-fast-150G-PVE1-client-0: readv
failed (No data available)
[2014-08-05 08:30:37.831024] I
[client-handshake.c:1659:select_server_supported_programs]
0-HA-fast-150G-PVE1-client-0: Using
Program GlusterFS 3.3, Num (1298437),
Version (330)
[2014-08-05 08:30:37.831375] I
[client-handshake.c:1456:client_setvolume_cbk]
0-HA-fast-150G-PVE1-client-0: Connected
to 10.250.0.1:49153, attached to
remote volume '/exports/fast-test/150G'.
[2014-08-05 08:30:37.831394] I
[client-handshake.c:1468:client_setvolume_cbk]
0-HA-fast-150G-PVE1-client-0: Server and
Client lk-version numbers are not same,
reopening the fds
[2014-08-05 08:30:37.831566] I
[client-handshake.c:450:client_set_lk_version_cbk]
0-HA-fast-150G-PVE1-client-0: Server lk
version = 1
[2014-08-05 08:30:37.830150] I
[glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]
0-glusterfs: No change in volfile, continuing
This line seems weird to me, tbh.
I do not see any traffic on the switch interfaces
between the gluster servers, which means there is no
syncing between them. I tried to ls -l the files on the
client and servers to trigger the healing, but
seemingly with no success. Should I wait more?
Yes, it should take around 10-15 minutes.
Could you provide 'getfattr -d -m. -e hex
<file-on-brick>' on both the bricks.
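For example, on both stor1 and stor2, taking the brick
path from your earlier mails:
getfattr -d -m. -e hex /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2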
Pranith
2014-08-05 11:25 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
On 08/05/2014 01:10 PM, Roman wrote:
Aha! For some reason I was not able to start the VM
anymore; Proxmox VE told me that it was not able to
read the qcow2 header because permission was denied
for some reason. So I just deleted that file and
created a new VM. And the next message I've got was
this:
Seems like these are the messages from when you took
down the bricks before the self-heal finished. Could
you restart the run, waiting for self-heals to
complete before taking down the next brick?
Pranith
[2014-08-05 07:31:25.663412] E
[afr-self-heal-common.c:197:afr_sh_print_split_brain_log]
0-HA-fast-150G-PVE1-replicate-0:
Unable to self-heal contents of
'/images/124/vm-124-disk-1.qcow2'
(possible split-brain). Please
delete the file from all but the
preferred subvolume.- Pending
matrix: [ [ 0 60 ] [ 11 0 ] ]
[2014-08-05 07:31:25.663955] E
[afr-self-heal-common.c:2262:afr_self_heal_completion_cbk]
0-HA-fast-150G-PVE1-replicate-0:
background data self-heal failed on
/images/124/vm-124-disk-1.qcow2
2014-08-05 10:13 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
I just responded to your earlier mail about what the
log looks like. The log appears in the mount's logfile.
Pranith
On 08/05/2014 12:41 PM, Roman wrote:
OK, so I've waited enough, I think. There was no
traffic at all on the switch ports between the
servers. I could not find any suitable log message
about a completed self-heal (I waited about 30
minutes). This time I pulled out the other server's
UTP cable and got into the same situation:
root@gluster-test1:~# cat /var/log/dmesg
-bash: /bin/cat: Input/output error
brick logs:
[2014-08-05 07:09:03.005474] I
[server.c:762:server_rpc_notify]
0-HA-fast-150G-PVE1-server:
disconnecting connectionfrom
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
[2014-08-05 07:09:03.005530] I
[server-helpers.c:729:server_connection_put]
0-HA-fast-150G-PVE1-server:
Shutting down connection
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
[2014-08-05 07:09:03.005560] I
[server-helpers.c:463:do_fd_cleanup]
0-HA-fast-150G-PVE1-server: fd
cleanup on
/images/124/vm-124-disk-1.qcow2
[2014-08-05 07:09:03.005797] I
[server-helpers.c:617:server_connection_destroy]
0-HA-fast-150G-PVE1-server:
destroyed connection of
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
2014-08-05 9:53 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
Do you think it is possible for you to do these tests
on the latest version, 3.5.2? 'gluster volume heal
<volname> info' would give you that information in
versions > 3.5.1. Otherwise you will have to check it
either from the logs (there will be a self-heal
completed message in the mount logs) or by observing
'getfattr -d -m. -e hex <image-file-on-bricks>'.
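For the log route, something along these lines on the
client should find it (the log filename is derived from
the mount point, and the exact message wording differs a
bit between versions, so treat this as a sketch):
grep -i 'self-heal completed' /var/log/glusterfs/<mount-point>.log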
Pranith
On 08/05/2014 12:09 PM, Roman wrote:
OK, I understand. I will try this shortly.
How can I be sure that the healing process is done,
if I am not able to see its status?
2014-08-05 9:30 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
Mounts will do the healing, not the self-heal-daemon.
The point, I feel, is that whichever process does the
healing must have the latest information about the
good bricks in this use case. Since for the VM use
case the mounts should have the latest information,
we should let the mounts do the healing. If the mount
accesses the VM image, either through someone doing
operations inside the VM or through an explicit stat
on the file, it should do the healing.
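So to kick it off by hand, a stat on the image from the
client mount should be enough. Just a sketch; the mount
path depends on how Proxmox mounted the volume:
stat /mnt/pve/<storage-id>/images/124/vm-124-disk-1.qcow2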
Pranith.
On 08/05/2014 10:39 AM, Roman wrote:
Hmmm, you told me to turn it off. Did I misunderstand
something? After I issued the command you sent me, I
was not able to watch the healing process; it said the
file won't be healed, because healing is turned off.
2014-08-05 5:39 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
You didn't mention anything about self-healing. Did
you wait until the self-heal was complete?
Pranith
On 08/04/2014 05:49 PM, Roman wrote:
Hi!
The result is pretty much the same. I set the switch
port down for the 1st server; it was OK. Then I set it
back up and set the other server's port off, and that
triggered an IO error on two virtual machines: one
with a local root FS but network-mounted storage, and
the other with a network root FS. The 1st gave an
error on copying to or from the mounted network disk;
the other gave me an error even for reading log files:
cat: /var/log/alternatives.log: Input/output error
Then I reset the KVM VM and it told me there is no
boot device. Next I virtually powered it off and then
back on, and it booted.
By the way, did I have to start/stop the volume?
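(I mean 'gluster volume stop HA-fast-150G-PVE1' and then
'gluster volume start HA-fast-150G-PVE1', with the volume
name as in my logs.)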
>> Could you do the following and test it again?
>> gluster volume set <volname> cluster.self-heal-daemon off
>> Pranith
2014-08-04 14:10 GMT+03:00 Pranith Kumar Karampuri <[email protected]>:
On 08/04/2014 03:33 PM, Roman wrote:
Hello!
I'm facing the same problem as mentioned here:
http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html
My setup is up and running, so I'm ready to help you
back with feedback.
Setup: a Proxmox server as the client, and 2 physical
gluster servers. Both the server side and the client
side are currently running glusterfs 3.4.4 from the
gluster repo.
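For completeness, the volume is a 2-brick replica; it was
created with something like this (brick paths as they
appear in my logs; the exact command may have differed):
gluster volume create HA-fast-150G-PVE1 replica 2 stor1:/exports/fast-test/150G stor2:/exports/fast-test/150G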
The problem is:
1. Created replica bricks.
2. Mounted them in Proxmox (tried both Proxmox ways: via
the GUI and via fstab with a backup volume line; see the
sketch after this list). Btw, while mounting via fstab
I'm unable to launch a VM without cache, even though
direct-io-mode is enabled in the fstab line.
3. Installed a VM.
4. Brought one volume down - OK.
5. Brought it back up and waited for the sync to finish.
6. Brought the other volume down - I get IO errors on
the VM guest and am not able to restore the VM after I
reset it via the host. It says (no bootable media).
After I shut it down (forced) and bring it back up, it
boots.
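The fstab line I mean looks roughly like this (a sketch,
with server names as in my setup, not my exact line):
stor1:/HA-fast-150G-PVE1 /mnt/pve/gluster glusterfs defaults,backupvolfile-server=stor2,direct-io-mode=enable 0 0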
Could you do the following and test it again?
gluster volume set <volname> cluster.self-heal-daemon off
Pranith
Need help. Tried 3.4.3 and 3.4.4. Still missing
packages for 3.4.5 and 3.5.2 for Debian (3.5.1 always
gives a healing error for some reason).
--
Best regards,
Roman.
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users