Hello, I've tried to upgrade our ceph cluster to the Pacific release (version
16.2.0, planning to then move up one version at a time), but on our cluster the
upgrade is failing.
I originally installed the cluster (quite a while ago...) via cephadm on v15 (I
believe it was v15.2.8 underneath at the time).
I remember hitting an issue with the ceph mgr back then, which led me to use
ceph-base:latest-octopus as a workaround (the next point release wasn't out yet,
and the bug was crashing the cluster by filling the logs).
the cluster state is OK:

  cluster:
    id:     adc48d6a-61bf-11eb-9212-2f70acf7224f
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum server36,server38,server37 (age 3h)
    mgr: server36.xujjng(active, since 2h), standbys: server37.fyglah
    osd: 52 osds: 52 up (since 4h), 52 in (since 4h)
    rgw: 2 daemons active (2 hosts, 1 zones)

  data:
    pools:   11 pools, 489 pgs
    objects: 1.28M objects, 4.8 TiB
    usage:   15 TiB used, 65 TiB / 81 TiB avail
    pgs:     489 active+clean

  io:
    client: 3.2 MiB/s rd, 24 MiB/s wr, 2.73k op/s rd, 1.81k op/s wr
So I ran "ceph orch upgrade start --ceph-version 16.2.0". The first time, it
deployed a new ceph mgr running 16.2.0 and then got stuck there.
After waiting several hours, I stopped and restarted the upgrade, and nothing
happened.
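For reference, the stop/restart sequence I used was essentially the following (the version number is the one I was targeting at the time):

```shell
# Abort the stuck upgrade
ceph orch upgrade stop

# Kick it off again with the same target version
ceph orch upgrade start --ceph-version 16.2.0

# Poll progress (this is where I later got the empty status shown below)
ceph orch upgrade status
```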
I then manually upgraded the whole cluster except the two rgw daemons and
grafana/prometheus/alertmanager/node-exporter.
I retried and saw nothing happening in any of the logs (with debug enabled):
cephadm's log, the active mgr's log, ceph -W cephadm --watch-debug, etc.
I've since also tried with 16.2.1, as 16.2.0 didn't seem to work, but the
result is the same.
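To be explicit about what "in debug" means above, these are roughly the commands I used to turn up and follow cephadm's logging (the mgr daemon name is the active mgr from my cluster):

```shell
# Raise cephadm's cluster-log level so debug messages are emitted
ceph config set mgr mgr/cephadm/log_to_cluster_level debug

# Follow the cephadm channel live
ceph -W cephadm --watch-debug

# Dump recent cephadm log entries after the fact
ceph log last 100 debug cephadm

# On the host itself: the mgr container's logs
cephadm logs --name mgr.server36.xujjng
```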
here's what I see from ceph -W cephadm --watch-debug:
2021-07-09T14:13:14.642077+0000 mgr.server36.xujjng [INF] Upgrade: Started with
target docker.io/ceph/ceph:v16.2.1
and then nothing else.
in the mgr container's docker logs, I see (roughly) the same line, followed
only by unrelated debug output:
::ffff:127.0.0.1 - - [09/Jul/2021:15:08:33] "GET /metrics HTTP/1.1" 200 1423923
"" "Prometheus/2.18.1"
debug 2021-07-09T15:08:35.829+0000 7fa870818700 0 log_channel(cluster) log
[DBG] : pgmap v3602: 489 pgs: 489 active+clean; 4.8 TiB data, 15 TiB used, 65
TiB / 81 TiB avail; 2.8 MiB/s rd, 37 MiB/s wr, 3.99k op/s
debug 2021-07-09T15:08:37.829+0000 7fa870818700 0 log_channel(cluster) log
[DBG] : pgmap v3603: 489 pgs: 489 active+clean; 4.8 TiB data, 15 TiB used, 65
TiB / 81 TiB avail; 2.0 MiB/s rd, 30 MiB/s wr, 2.75k op/s
I don't see anything in the cephadm logs.
The upgrade status isn't very informative either:
{
    "target_image": "docker.io/ceph/ceph:v16.2.1",
    "in_progress": true,
    "services_complete": [],
    "progress": "",
    "message": ""
}
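In case it helps whoever replies, these commands show which versions and images the daemons are actually running (as far as I know these are the standard ones):

```shell
# Summary of running versions per daemon type
ceph versions

# Per-daemon view, including the container image each daemon uses
ceph orch ps
```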
Do you know where I could find logs or other information that would show why
the upgrade doesn't start?
Thanks!
Sylvain
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]