Hi!

I would then suggest installing the latest 20.2 Ceph packages on your clients (and maybe on your server, if any 19.2.1 packages are installed there). You can find a repository with Ubuntu 24.04 packages here: https://ca.ceph.com/debian-tentacle/
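On a client, the upgrade could look roughly like this. This is only a sketch: the signing-key URL, the 'noble' suite name and the package selection are assumptions on my side, so please check them against the Ceph install docs before running anything.

    # create the keyring directory and import the Ceph release key (assumed key URL)
    sudo install -d /etc/apt/keyrings
    curl -fsSL https://download.ceph.com/keys/release.asc \
        | sudo gpg --dearmor -o /etc/apt/keyrings/ceph.gpg
    # point apt at the tentacle repository mentioned above
    echo "deb [signed-by=/etc/apt/keyrings/ceph.gpg] https://ca.ceph.com/debian-tentacle/ noble main" \
        | sudo tee /etc/apt/sources.list.d/ceph.list
    sudo apt update
    sudo apt install --only-upgrade ceph-common   # client tools and libraries
    ceph --version                                # should now report 20.2.x

apt may offer further Ceph packages for upgrade (librados2, python3-ceph-common, ...); upgrading them in the same step keeps the client side consistent.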
Carsten

------------------------------------------------------------------
Carsten Goetze
Computer Graphics        tel: +49 531 391-2109
TU Braunschweig          fax: +49 531 391-2103
Muehlenpfordtstr. 23     eMail: [email protected]
D-38106 Braunschweig     http://www.cg.cs.tu-bs.de/people/goetze

> On 12.02.2026 at 10:45, dominik.baack <[email protected]> wrote:
>
> Hi,
>
> thanks for your reply.
>
> Mounting was done without the 'root_squash' option; here is the corresponding
> fstab entry:
>
> 192.168.251.2,192.168.251.3,192.168.251.4,192.168.251.5,192.168.251.6,192.168.251.7:/ /cephfs ceph name=gpu01,secretfile=/etc/ceph/gpu01.key,noatime,_netdev,recover_session=clean,read_from_replica=balance 0 0
>
> Dominik
>
> On 2026-02-12 09:51, goetze wrote:
>> Hi!
>> Have you mounted CephFS with the 'root_squash' option set? If so,
>> remove that option. I may be wrong here, but as far as I know, it is
>> still considered unsafe and can lead to data corruption, since the
>> necessary code changes have not yet made it into the mainline Linux
>> kernel.
>>
>> Carsten
>> ------------------------------------------------------------------
>> Carsten Goetze
>> Computer Graphics        tel: +49 531 391-2109
>> TU Braunschweig          fax: +49 531 391-2103
>> Muehlenpfordtstr. 23     eMail: [email protected]
>> D-38106 Braunschweig     http://www.cg.cs.tu-bs.de/people/goetze
>>
>>> On 12.02.2026 at 07:15, Dominik Baack via ceph-users <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> I have now noticed that files are still being actively corrupted / replaced
>>> by empty files when they are opened and saved.
>>>
>>> Access was done via Ceph 19.2.1 from Ubuntu 24 with a kernel mount.
>>> The Ceph servers are still running 20.2 Tentacle (deployed via cephadm)
>>> on an Ubuntu 24.04 host.
>>> The flags currently set are noout, norebalance, noscrub and nodeep-scrub.
>>>
>>> I am currently setting up a read-only mount to copy all existing data
>>> for a backup.
>>>
>>> I have no clue what is going on; so far I have only been able to observe
>>> this behavior on the nodes. As those were upgraded as well (MLNX driver,
>>> nvidia-fs, ...), could it be the network?
>>>
>>> Any idea how to recover from this?
>>>
>>> Cheers
>>> Dominik
>>>
>>> On 11.02.2026 at 18:44, dominik.baack via ceph-users wrote:
>>>> Hi,
>>>>
>>>> after a controlled shutdown of the whole cluster due to external
>>>> circumstances, we decided to update from 19.2 to 20.2 after the
>>>> restart. The system was healthy before and after the update.
>>>>
>>>> The nodes mounting the filesystem were not equally lucky and were
>>>> partially shut down hard. Storage was kept running for an additional
>>>> ~30 min after the node shutdown; all in-flight operations should have
>>>> finished.
>>>>
>>>> Now we discover that some of the user files seem to have been replaced
>>>> with zeros. For example:
>>>>
>>>> stat .gitignore
>>>>   File: .gitignore
>>>>   Size: 4429        Blocks: 9        IO Block: 4194304   regular file
>>>> Device: 0,48        Inode: 1100241384598        Links: 1
>>>>
>>>> hexdump -C .gitignore
>>>> 00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>>>> *
>>>> 00001140  00 00 00 00 00 00 00 00  00 00 00 00 00           |.............|
>>>> 0000114d
>>>>
>>>> Scanning for files containing only zeros shows several cases of files
>>>> that were likely accessed before or during the shutdown of the nodes.
>>>>
>>>> How should I proceed from here?
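For the read-only backup copy and the scan for zero-filled files mentioned above, something along these lines might work. This is only a sketch: the monitor addresses and the client name are taken from your fstab entry, the mount point /mnt/cephfs-ro is a placeholder, and any subset of the monitors should be enough for the mount.

    # one-off read-only kernel mount for the backup copy
    sudo mkdir -p /mnt/cephfs-ro
    sudo mount -t ceph 192.168.251.2,192.168.251.3,192.168.251.4:/ /mnt/cephfs-ro \
        -o ro,name=gpu01,secretfile=/etc/ceph/gpu01.key

    # flag regular, non-empty files whose content is nothing but NUL bytes
    find /mnt/cephfs-ro -type f -size +0c -print0 |
    while IFS= read -r -d '' f; do
        if head -c "$(stat -c %s "$f")" /dev/zero | cmp -s - "$f"; then
            printf 'all zeros: %s\n' "$f"
        fi
    done

The scan has to read every file in full, so on a large tree it may be worth running it per directory and writing the hits to a file before deciding what to restore from backup.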
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
