Hi!

I would then suggest installing the latest 20.2 Ceph packages on your clients (and maybe on your server, if any 19.2.1 packages are installed there). You can find a repository with Ubuntu 24.04 packages here: https://ca.ceph.com/debian-tentacle/
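On a client, the upgrade could look roughly like this. This is only a sketch: the signing-key URL, the 'noble' suite name and the package selection are assumptions on my side, so please check them against the Ceph install docs before running anything.

    # create the keyring directory and import the Ceph release key (assumed key URL)
    sudo install -d /etc/apt/keyrings
    curl -fsSL https://download.ceph.com/keys/release.asc \
        | sudo gpg --dearmor -o /etc/apt/keyrings/ceph.gpg
    # point apt at the tentacle repository mentioned above
    echo "deb [signed-by=/etc/apt/keyrings/ceph.gpg] https://ca.ceph.com/debian-tentacle/ noble main" \
        | sudo tee /etc/apt/sources.list.d/ceph.list
    sudo apt update
    sudo apt install --only-upgrade ceph-common   # client tools and libraries
    ceph --version                                # should now report 20.2.x

apt may offer further Ceph packages for upgrade (librados2, python3-ceph-common, ...); upgrading them in the same step keeps the client side consistent.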
Carsten

------------------------------------------------------------------
Carsten Goetze
Computer Graphics        tel: +49 531 391-2109
TU Braunschweig          fax: +49 531 391-2103
Muehlenpfordtstr. 23     eMail: [email protected]
D-38106 Braunschweig     http://www.cg.cs.tu-bs.de/people/goetze

> On 12.02.2026 at 10:45, dominik.baack <[email protected]> wrote:
>
> Hi,
>
> thanks for your reply.
>
> Mounting was done without the 'root_squash' option; here is the corresponding
> fstab entry:
>
> 192.168.251.2,192.168.251.3,192.168.251.4,192.168.251.5,192.168.251.6,192.168.251.7:/ /cephfs ceph name=gpu01,secretfile=/etc/ceph/gpu01.key,noatime,_netdev,recover_session=clean,read_from_replica=balance 0 0
>
> Dominik
>
> On 2026-02-12 09:51, goetze wrote:
>> Hi!
>> Have you mounted CephFS with the 'root_squash' option set? If so,
>> remove that option. I may be wrong here, but as far as I know, it is
>> still considered unsafe and can lead to data corruption, since the
>> necessary code changes have not yet made it into the mainline Linux
>> kernel.
>>
>> Carsten
>> ------------------------------------------------------------------
>> Carsten Goetze
>> Computer Graphics        tel: +49 531 391-2109
>> TU Braunschweig          fax: +49 531 391-2103
>> Muehlenpfordtstr. 23     eMail: [email protected]
>> D-38106 Braunschweig     http://www.cg.cs.tu-bs.de/people/goetze
>>
>>> On 12.02.2026 at 07:15, Dominik Baack via ceph-users <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> I have now noticed that files are still being actively corrupted / replaced
>>> by empty files when they are opened and saved.
>>>
>>> Access was done via Ceph 19.2.1 from Ubuntu 24 with a kernel mount.
>>> The Ceph servers are still running 20.2 Tentacle (deployed via cephadm)
>>> on an Ubuntu 24.04 host.
>>> The flags currently set are noout, norebalance, noscrub and nodeep-scrub.
>>>
>>> I am currently setting up a read-only mount to copy all existing data
>>> for a backup.
>>>
>>> I have no clue what is going on; so far I have only been able to observe
>>> this behavior on the nodes. As those were upgraded as well (MLNX driver,
>>> nvidia-fs, ...), could it be the network?
>>>
>>> Any idea how to recover from this?
>>>
>>> Cheers
>>> Dominik
>>>
>>> On 11.02.2026 at 18:44, dominik.baack via ceph-users wrote:
>>>> Hi,
>>>>
>>>> after a controlled shutdown of the whole cluster due to external
>>>> circumstances, we decided to update from 19.2 to 20.2 after the
>>>> restart. The system was healthy before and after the update.
>>>>
>>>> The nodes mounting the filesystem were not equally lucky and were
>>>> partially shut down hard. Storage was kept running for an additional
>>>> ~30 min after the node shutdown; all in-flight operations should have
>>>> finished.
>>>>
>>>> Now we discover that some of the user files seem to have been replaced
>>>> with zeros. For example:
>>>>
>>>> stat .gitignore
>>>>   File: .gitignore
>>>>   Size: 4429        Blocks: 9        IO Block: 4194304   regular file
>>>> Device: 0,48        Inode: 1100241384598        Links: 1
>>>>
>>>> hexdump -C .gitignore
>>>> 00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>>>> *
>>>> 00001140  00 00 00 00 00 00 00 00  00 00 00 00 00           |.............|
>>>> 0000114d
>>>>
>>>> Scanning for files containing only zeros shows several cases of files
>>>> that were likely accessed before or during the shutdown of the nodes.
>>>>
>>>> How should I proceed from here?
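For the read-only backup copy and the scan for zero-filled files mentioned above, something along these lines might work. This is only a sketch: the monitor addresses and the client name are taken from your fstab entry, the mount point /mnt/cephfs-ro is a placeholder, and any subset of the monitors should be enough for the mount.

    # one-off read-only kernel mount for the backup copy
    sudo mkdir -p /mnt/cephfs-ro
    sudo mount -t ceph 192.168.251.2,192.168.251.3,192.168.251.4:/ /mnt/cephfs-ro \
        -o ro,name=gpu01,secretfile=/etc/ceph/gpu01.key

    # flag regular, non-empty files whose content is nothing but NUL bytes
    find /mnt/cephfs-ro -type f -size +0c -print0 |
    while IFS= read -r -d '' f; do
        if head -c "$(stat -c %s "$f")" /dev/zero | cmp -s - "$f"; then
            printf 'all zeros: %s\n' "$f"
        fi
    done

The scan has to read every file in full, so on a large tree it may be worth running it per directory and writing the hits to a file before deciding what to restore from backup.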
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
