[ceph-users] Re: what happens if a server crashes with cephfs?

2022-12-08 Thread Charles Hedrick
Thanks. That's the behavior I was hoping for.

[ceph-users] Re: what happens if a server crashes with cephfs?

2022-12-08 Thread Gregory Farnum
Ceph clients keep updates buffered until they receive server notification that the update is persisted to disk. On server crash, the client connects to either the newly-responsible OSD (for file data) or the standby/restarted MDS (for file metadata) and replays outstanding operations. This is all

[ceph-users] Re: what happens if a server crashes with cephfs?

2022-12-08 Thread Charles Hedrick
I'm aware that the file system will remain available. My concern is about long jobs using it failing because a single operation returns an error. While none of the discussion so far has been explicit, I assume this can happen if an OSD fails, since it might have done an async acknowledgement

[ceph-users] Re: what happens if a server crashes with cephfs?

2022-12-08 Thread Manuel Holtgrewe
Hi Charles, are you concerned with a single Ceph cluster server crashing, or with the whole cluster going down? If you have sufficient redundancy, nothing bad should happen and the file system should remain available. The same should be true if you perform an upgrade in the "correct" way, e.g., through the

[ceph-users] Re: what happens if a server crashes with cephfs?

2022-12-08 Thread Charles Hedrick
Network and local file systems have different requirements. If I have a long job and the machine I'm running on crashes, I have to rerun it. The fact that the last 500 msec of data didn't get flushed to disk is unlikely to matter. If I have a long job using a network file system, and the server

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-08 Thread Anthony D'Atri
This page is a nice summary: https://brooker.co.za/blog/2014/07/04/iostat-pct.html
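For context, %util comes from iostat's extended output; a minimal invocation (standard sysstat tool, sampling interval chosen arbitrarily):

    # Print extended per-device statistics every second; %util is the last column.
    # On SSDs and other devices that serve many requests in parallel, %util can
    # sit near 100% while the device still has substantial untapped throughput.
    iostat -x 1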

[ceph-users] Re: what happens if a server crashes with cephfs?

2022-12-08 Thread Gregory Farnum
On Thu, Dec 8, 2022 at 8:42 AM Manuel Holtgrewe wrote: > Hi Charles, as far as I know, CephFS implements POSIX semantics. That is, if the CephFS server cluster dies for whatever reason then this will translate into I/O errors. This is the same as if your NFS server dies or you run the

[ceph-users] Re: what happens if a server crashes with cephfs?

2022-12-08 Thread Manuel Holtgrewe
Hi Charles, as far as I know, CephFS implements POSIX semantics. That is, if the CephFS server cluster dies for whatever reason then this will translate into I/O errors. This is the same as if your NFS server dies, or if you run the program locally on a workstation/laptop and the machine loses power.

[ceph-users] Re: what happens if a server crashes with cephfs?

2022-12-08 Thread Charles Hedrick
Thanks. I'm evaluating cephfs for a computer science dept. We have users that run week-long AI training jobs. They use standard packages, which they probably don't want to modify. At the moment we use NFS. It uses synchronous I/O, so if something goes wrong, the users' jobs pause until we

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-08 Thread Sven Kieske
On Wed, 2022-12-07 at 14:25 -0500, Anthony D'Atri wrote: > Especially on SSDs. > On Dec 7, 2022, at 14:16, Matthias Ferdinand wrote: > > The usefulness of %util is limited anyway. Interesting, this is the first time I have read such claims about the limited usefulness of %util. Can you back

[ceph-users] Ceph mgr rgw module missing in quincy

2022-12-08 Thread Szabo, Istvan (Agoda)
Hi, when I try to enable this module, it is missing: https://docs.ceph.com/en/quincy/mgr/rgw.html I looked in the mgr module list, but it is not there. What is the reason?
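For anyone hitting the same thing, a minimal diagnostic sketch (standard ceph CLI; whether the module ships at all depends on the build):

    ceph mgr module ls          # list the modules the running mgr knows about
    ceph mgr module enable rgw  # errors out if the module is not present in this build
    ceph versions               # confirm the mgr is actually running the expected release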

[ceph-users] Re: Cannot create snapshots if RBD image is mapped with -oexclusive

2022-12-08 Thread Andreas Teuchert
Hello, in case anyone finds this post while trying to find an answer to the same question, I believe the answer is here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/DBJRYTMQURANFFWSS4QDCKD5KULJQ46X/ As far as I understand it: creating a snapshot requires acquiring the
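A minimal sketch of the situation being described, assuming hypothetical pool and image names:

    rbd map -o exclusive mypool/myimage    # map with the exclusive option
    rbd snap create mypool/myimage@snap1   # expected to fail: snapshot creation must
                                           # acquire the exclusive lock, and a mapping
                                           # made with -o exclusive will not release it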

[ceph-users] Re: rbd-mirror stops replaying journal on primary cluster

2022-12-08 Thread Josef Johansson
Hi, running a simple `echo 1>a; sync; rm a; sync; fstrim --all` triggers the problem. No need to have the mount point mounted with discard. On Thu, Dec 8, 2022 at 12:33 AM Josef Johansson wrote: > Hi, I've updated https://tracker.ceph.com/issues/57396 with some more info, it seems that
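The one-liner above, expanded for readability (same commands, run inside the affected mount):

    echo 1 > a      # create a small file
    sync            # flush the write
    rm a            # delete the file
    sync            # flush the deletion
    fstrim --all    # discard the freed blocks; per the report, this triggers the problem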

[ceph-users] Unable to start monitor as a daemon

2022-12-08 Thread zRiemann Contact
Hi all, I'm at the evaluation stage, implementing a fully virtualized Ceph Quincy test cluster. I successfully deployed the first two mons and three OSDs; on the first mon I also deployed the manager and the dashboard. All of the deployments were carried out without any automation (Ansible or
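For a manual (non-cephadm) package install, the mon is normally run through the packaged systemd unit; a sketch, assuming the mon ID is the short hostname:

    sudo systemctl enable --now ceph-mon@$(hostname -s)   # start the mon now and at boot
    sudo journalctl -u ceph-mon@$(hostname -s) -e         # inspect why the daemon fails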

[ceph-users] Re: pool min_size

2022-12-08 Thread Eugen Block
Hi, if you value your data you shouldn't set it to k (in your case 6). The docs [1] are pretty clear about that: "min_size: Sets the minimum number of replicas required for I/O. See Set the Number of Object Replicas for further details. In the case of Erasure Coded pools this should
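As a concrete illustration (pool name hypothetical): for a k=6, m=2 erasure-coded pool, the recommended floor is min_size = k+1 = 7:

    ceph osd pool set mypool min_size 7   # stop I/O once more than one shard is unavailable
    ceph osd pool get mypool min_size     # verify the setting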

[ceph-users] Re: pacific: ceph-mon services stopped after OSDs are out/down

2022-12-08 Thread Eugen Block
Hi, do the MONs use the same SAS interface? They store the mon DB on local disk, so it might be related. But without any logs or more details it's just guessing. Regards, Eugen Quoting Mevludin Blazevic: Hi all, I'm running Pacific with cephadm. After installation, ceph