Hi, 

We have a small Proxmox cluster running on top of Ceph.
Versions: Proxmox 5.2.6, Ceph 12.2.5 (Luminous)
Hardware: 4 machines, dual Xeon E5, 128 GB RAM
local SATA disks (RAID 1) for the OS
local SSDs for the OSDs (2 OSDs per machine, no RAID here)
4x 10 GBit (copper) NICs

We came upon the following situation:
A VM snapshot was created before a risky installation process, so that it
could be reverted. The installation was done and, because something went
wrong, a rollback to the snapshot was initiated.
However, the rollback took more than an hour, and during this time the whole
cluster was reacting very slowly.
We tried to find the reason for this, and it looks like an I/O bottleneck.
For some reason, most of the I/O was done by two local OSD processes (on the
same host where the VM was running).
iostat showed a transfer rate of about 30 MB/s per OSD disk, but utilization
was at 100% (whatever that means exactly).
The underlying SSDs are not damaged and normally deliver significantly higher
throughput.
The OSDs are based on filestore/XFS (we ran into some problems with bluestore
and decided to go back to filestore).
There were a lot of read/write operations in parallel at that time.
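
In case it helps to narrow this down, I suppose output like the following,
taken while such a rollback is running, would show whether the OSD daemons or
the disks are the bottleneck (the OSD id and device names are just
placeholders for our setup):

  # per-OSD commit/apply latency as reported by the cluster
  ceph osd perf

  # per-disk utilization and latency on the affected host
  iostat -x 1 /dev/sdc /dev/sdd

  # operations currently queued on a suspect OSD (run on its host)
  ceph daemon osd.4 dump_ops_in_flight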

Normal cluster operation is relatively smooth; only copying machines affects
I/O, but in that case we see transfer rates of more than 200 MB/s in iostat.
(That is not very fast for these SSDs from my point of view, but it is not
purely sequential writes either.)
Also, I/O utilization is nowhere near 100% while a copy action is running.
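
To get a baseline for what the pool can sustain independent of any VM
workload, I was thinking of something like rados bench (the pool name "rbd"
is a placeholder for our actual pool):

  # 60 seconds of 4 MB writes with 16 concurrent ops, keep the objects ...
  rados bench -p rbd 60 write -b 4194304 -t 16 --no-cleanup
  # ... then remove the benchmark objects again
  rados -p rbd cleanup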

SSD and SATA disks are on separate controllers. 

Any ideas where to tune for better snapshot rollback performance?
I am not sure how the placement of the snapshot data is handled by Proxmox or
Ceph.
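
The only client-side knob I have found so far that is supposed to affect the
concurrency of image-level maintenance operations (delete, flatten, and as
far as I can tell also rollback) is rbd_concurrent_management_ops. I am not
sure it applies to the way Proxmox triggers the rollback, but it could be set
in ceph.conf on the client side, e.g.:

  [client]
  # default is 10 as far as I can tell; a higher value lets more objects
  # be handled in parallel (at the cost of more parallel I/O)
  rbd concurrent management ops = 20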

Under the hood there are RBD devices which are snapshotted, so it should be up
to the Ceph logic where the snapshot data ends up (maybe depending on the
initial layout of the original device)?
Would the CRUSH map influence that?
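
As far as I understand it, snapshot data is kept as per-object clones next to
the head objects, so it should be placed by the same CRUSH rule as the
original image. To check where the image's objects actually land, I suppose
something like this would work (pool, image name and object prefix are
placeholders):

  rbd info rbd/vm-100-disk-1
  # take the block_name_prefix from the output (e.g. rbd_data.123456789abc)
  # and map one of its objects to a PG and its OSDs:
  ceph osd map rbd rbd_data.123456789abc.0000000000000000
  # overall data distribution per OSD:
  ceph osd df tree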

Live backups also take snapshots, as far as I can see. We have had very
strange locks on running backups in the past (mostly gone since the disks
were put on separate controllers).

Could this have the same cause?
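
If it happens again, I suppose the slow request warnings and the recently
completed operations on the affected OSDs would point to the culprit (osd.4
is again just an example):

  ceph health detail                    # lists slow requests and the OSDs involved
  ceph daemon osd.4 dump_historic_ops   # run on that OSD's host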

Another thing we found is the following (not on all hosts): 
[614673.831726] libceph: mon1 192.168.16.32:6789 session lost, hunting for new mon
[614673.848249] libceph: mon2 192.168.16.34:6789 session established
[614704.551754] libceph: mon2 192.168.16.34:6789 session lost, hunting for new mon
[614704.552729] libceph: mon1 192.168.16.32:6789 session established
[614735.271779] libceph: mon1 192.168.16.32:6789 session lost, hunting for new mon
[614735.272339] libceph: mon2 192.168.16.34:6789 session established

This points to a kernel problem that is still not solved (because the fix has
not been backported to 4.15).
I am not sure whether this is a reaction to a Ceph problem or the cause of
the Ceph problem.
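
To narrow down whether these libceph messages come from kernel-mapped RBD
devices (krbd) or something else in the kernel client, and whether the
monitors are actually flapping, I suppose these would be a starting point:

  rbd showmapped      # images mapped through the kernel client on this host
  ceph mon stat       # current monitor quorum
  ceph quorum_status  # more detail on quorum members and election epoch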

Any thoughts on this?

Marcus Haarmann 