[ceph-users] Re: ceph octopus mysterious OSD crash

2021-03-19 Thread Stefan Kooman
On 3/19/21 9:11 PM, Philip Brown wrote: if we can't replace a drive on a node in a crash situation without blowing away the entire node, it seems to me ceph octopus fails the "test" part of the "test cluster" :-/ I agree. This should not be necessary. And I'm sure there is, or there will be

[ceph-users] Re: ceph octopus mysterious OSD crash

2021-03-19 Thread Tony Liu
Are you sure the OSD has its DB/WAL on the SSD? Tony From: Philip Brown Sent: March 19, 2021 02:49 PM To: Eugen Block Cc: ceph-users Subject: [ceph-users] Re: [BULK] Re: Re: ceph octopus mysterious OSD crash Wow. My expectations have been adjusted. Thank
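A quick way to verify where the DB/WAL actually lives is to inspect the OSD metadata and the LVM tags; a minimal sketch, using osd.33 from the thread as the example ID:

    # from an admin node: which devices back the OSD's data and DB
    ceph osd metadata 33 | grep -E 'bluefs|devices'
    # on the OSD host (inside "cephadm shell" for containerized setups):
    # list all OSDs with their block/db volumes and LVM tags
    ceph-volume lvm list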

[ceph-users] Re: high number of kernel clients per osd slow down

2021-03-19 Thread Stefan Kooman
On 3/19/21 7:20 PM, Andrej Filipcic wrote: Hi, I am testing 15.2.10 on a large cluster (RH8). A cephfs pool (size=1) with 122 NVMe OSDs works fine as long as the number of clients is relatively low. Writing from 400 kernel clients (ior benchmark), 8 streams each, causes issues. Writes are initially

[ceph-users] Re: ceph octopus mysterious OSD crash

2021-03-19 Thread Stefan Kooman
On 3/19/21 6:22 PM, Philip Brown wrote: I made *some* progress on cleanup. I could already do "ceph osd rm 33" from my master. But doing the cleanup on the actual OSD node was problematic. ceph-volume lvm zap xxx wasn't working properly, because the device wasn't fully released, because

[ceph-users] Re: ceph octopus mysterious OSD crash

2021-03-19 Thread David Orman
We also ran into a scenario in which I did exactly this, and it did _not_ work. It created the OSD, but did not put the DB/WAL on the NVMe (didn't even create an LV). I'm wondering if there's some constraint applied (haven't looked at the code yet) that when the NVMe already has all but the one DB on
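One thing worth checking in that situation is whether the DB volume group on the NVMe still has enough free extents for another DB LV; a minimal sketch (run on the OSD host, the VG/LV names will differ per cluster):

    # show volume groups on the node with total and free size
    vgs -o vg_name,vg_size,vg_free --units g
    # show the existing DB/WAL logical volumes and their sizes
    lvs -o lv_name,vg_name,lv_size --units g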

[ceph-users] Re: [BULK] Re: Re: ceph octopus mysterious OSD crash

2021-03-19 Thread Philip Brown
Wow. My expectations have been adjusted. Thank you for detailing your experience, so I had motivation to try again. Explicit steps I took: 1. went into "cephadm shell" and did a vgremove on the HDD; 2. ceph-volume zap /dev/(hdd); 3. lvremove (the matching old LV). This meant that the VG on the
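Roughly the same sequence as a hedged sketch (device, VG and LV names are placeholders to be taken from vgs/lvs output, not the actual names from this cluster):

    # inside "cephadm shell" on the OSD host
    vgs; lvs                                  # identify the stale VG on the HDD and the old DB LV
    vgremove <hdd_vg>                         # 1. remove the VG that lived on the failed HDD
    ceph-volume lvm zap /dev/<hdd> --destroy  # 2. wipe the HDD itself
    lvremove <ssd_db_vg>/<old_db_lv>          # 3. drop the matching old DB LV on the SSD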

[ceph-users] Re: ceph octopus mysterious OSD crash

2021-03-19 Thread Eugen Block
I am quite sure that this case is covered by cephadm already. A few months ago I tested it after a major rework of ceph-volume. I don't have any links right now, but I had a lab environment with multiple OSDs per node with RocksDB on SSD, and after wiping both the HDD and the DB LV, cephadm
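With a containerized deployment the wipe can also be driven through the orchestrator; a minimal sketch, assuming the host is named osd-host1 and the data disk is /dev/sdX:

    # zap the device via cephadm so it becomes available again
    ceph orch device zap osd-host1 /dev/sdX --force
    # with a matching OSD service spec in place, cephadm should then
    # pick the device up and recreate the OSD (and its DB LV)
    ceph orch device ls osd-host1 --refresh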

[ceph-users] Re: ceph octopus mysterious OSD crash

2021-03-19 Thread Stefan Kooman
On 3/19/21 3:53 PM, Philip Brown wrote: mkay. Sooo... what's the new and nifty proper way to clean this up? The outsider's view is, "I should just be able to run 'ceph orch osd rm 33'" Can you spawn a cephadm shell and run: ceph osd rm 33? And / or: ceph osd crush rm 33, or try to do it
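For the monitor-side cleanup, the individual steps (or the combined purge) look roughly like this, using osd.33 from the thread:

    # step by step
    ceph osd crush rm osd.33
    ceph auth del osd.33
    ceph osd rm 33
    # or, all in one
    ceph osd purge 33 --yes-i-really-mean-it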

[ceph-users] Re: ceph orch daemon add , separate db

2021-03-19 Thread Tony Liu
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/EC45YMDJZD3T6TQINGM222H2H4RZABJ4/ From: Philip Brown Sent: March 19, 2021 08:59 AM To: ceph-users Subject: [ceph-users] ceph orch daemon add , separate db I was having difficulty doing

[ceph-users] Re: ceph octopus mysterious OSD crash

2021-03-19 Thread Philip Brown
if we can't replace a drive on a node in a crash situation without blowing away the entire node, it seems to me ceph octopus fails the "test" part of the "test cluster" :-/ I vaguely recall running into this "doesn't have PARTUUID" problem before. THAT time, I did end up wiping the entire machine

[ceph-users] Re: howto:: emergency shutdown procedure and maintenance

2021-03-19 Thread Adrian Sevcenco
On 3/19/21 5:05 PM, Andrew Walker-Brown wrote: Hi Adrian, Hi! For maintenance, this is the procedure I’d follow: https://ceph.io/planet/how-to-do-a-ceph-cluster-maintenance-shutdown/ Difference between maintenance

[ceph-users] Re: LVM vs. direct disk access

2021-03-19 Thread Reed Dier
I think this would be a great place in the ML to look. https://ceph-users.ceph.narkive.com/AthYx879/ceph-volume-migration-and-disk-partition-support Reed > On Mar 19, 2021, at 2:17 PM, Marc wrote:

[ceph-users] Re: LVM vs. direct disk access

2021-03-19 Thread Marc
I asked exactly the same question a year or so ago. Sage told me to show evidence of a significant impact, because they had not measured one. If I remember correctly, the idea behind this is that not all storage devices are available as /dev/sdX like a normal disk, and LVM sort of solves this

[ceph-users] LVM vs. direct disk access

2021-03-19 Thread Nico Schottelius
Good evening, I've seen the shift in ceph to focus more on LVM than on plain (direct) access to disks. I was wondering what the motivation is for that. From my point of view OSD disk layouts never change (they are re-added if they do), so the dynamic approach of LVM is probably not the

[ceph-users] Re: high number of kernel clients per osd slow down

2021-03-19 Thread Andrej Filipcic
On 19/03/2021 19:41, Stefan Kooman wrote: On 3/19/21 7:20 PM, Andrej Filipcic wrote: Hi, I am testing 15.2.10 on a large cluster (RH8). A cephfs pool (size=1) with 122 NVMe OSDs works fine as long as the number of clients is relatively low. Writing from 400 kernel clients (ior benchmark), 8 streams

[ceph-users] Re: ceph octopus mysterious OSD crash

2021-03-19 Thread Philip Brown
Unfortunately, neither of those things will work, because ceph orch daemon add does not have a syntax that lets me add an SSD as a journal for an HDD, and likewise ceph orch apply osd --all-available-devices will not do the right thing: both for mixed SSD/HDD, but also, even though I have a
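The usual workaround is to describe the mixed HDD/SSD layout in an OSD service spec and apply that instead of --all-available-devices; a minimal sketch (the service_id, host_pattern and rotational filters are assumptions to adjust for the actual hardware):

    # osd_spec.yml (drivegroup-style OSD service spec)
    service_type: osd
    service_id: hdd_with_ssd_db
    placement:
      host_pattern: '*'
    data_devices:
      rotational: 1        # HDDs hold the data
    db_devices:
      rotational: 0        # SSDs/NVMe hold the DB/WAL

    # then apply it:
    ceph orch apply osd -i osd_spec.yml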

[ceph-users] high number of kernel clients per osd slow down

2021-03-19 Thread Andrej Filipcic
Hi, I am testing 15.2.10 on a large cluster (RH8). A cephfs pool (size=1) with 122 NVMe OSDs works fine as long as the number of clients is relatively low. Writing from 400 kernel clients (ior benchmark), 8 streams each, causes issues. Writes are initially fast at 100GB/s but then they drop to

[ceph-users] Re: ceph octopus mysterious OSD crash

2021-03-19 Thread Philip Brown
I made *some* progress on cleanup. I could already do "ceph osd rm 33" from my master. But doing the cleanup on the actual OSD node was problematic. ceph-volume lvm zap xxx wasn't working properly, because the device wasn't fully released: at the regular OS level, it can't even SEE
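When the kernel still holds the old LVM/device-mapper mappings, something along these lines usually releases the device before zapping; a hedged sketch (the mapper name is a placeholder taken from dmsetup output):

    # on the OSD host: find and remove the stale device-mapper entries
    dmsetup ls | grep ceph
    dmsetup remove <ceph--...--osd--block--...>
    # then wipe LVM metadata and data from the disk
    ceph-volume lvm zap /dev/<hdd> --destroy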

[ceph-users] March Ceph Science Virtual User Group Meeting

2021-03-19 Thread Kevin Hrpcek
Hey all, We will be having a Ceph science/research/big cluster call on Wednesday March 24th. If anyone wants to discuss something specific they can add it to the pad linked below. If you have questions or comments you can contact me. This is an informal open call of community members mostly

[ceph-users] ceph orch daemon add , separate db

2021-03-19 Thread Philip Brown
I was having difficulty doing this myself, and I came across this semi-recent thread: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/T4R76XJN2NE442GQJ5P2KRJN6HXPMYKL/ " I've tried adding OSDs with ceph orch daemon add ... but it's pretty limited. ...you can't [have] a separate

[ceph-users] Re: howto:: emergency shutdown procedure and maintenance

2021-03-19 Thread Andrew Walker-Brown
Hi Adrian, For maintenance, this is the procedure I’d follow: https://ceph.io/planet/how-to-do-a-ceph-cluster-maintenance-shutdown/ Difference between maintenance and emergency: I’d probably set all the flags as per maintenance but down the OSDs at the same time, followed by all the
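The flags referred to are typically set like this before stopping the OSDs (a sketch of the commonly used set; each one is cleared afterwards with the matching ceph osd unset):

    ceph osd set noout
    ceph osd set norecover
    ceph osd set norebalance
    ceph osd set nobackfill
    ceph osd set nodown
    ceph osd set pause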

[ceph-users] Re: ceph octopus mysterious OSD crash

2021-03-19 Thread Philip Brown
mkay. Sooo... what's the new and nifty proper way to clean this up? The outsider's view is, "I should just be able to run 'ceph orch osd rm 33'" but that returns Unable to find OSDs: ['33'] - Original Message - From: "Stefan Kooman" To: "Philip Brown" Cc: "ceph-users" Sent:
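Before falling back to manual removal, it can help to check whether the orchestrator still tracks the daemon at all; a small sketch:

    # does cephadm still know about an osd.33 daemon on any host?
    ceph orch ps | grep 'osd.33'
    # is osd.33 still present in the OSD/CRUSH map?
    ceph osd tree | grep -w 'osd.33'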

[ceph-users] Re: Importance of bluefs fix in Octopus 15.2.10 ?

2021-03-19 Thread Igor Fedotov
Hi Chris, this patch does fix potential data corruption. But IMO its probability is pretty low, and it tends to occur when performing bulk RocksDB/BlueFS writes during e.g. an omap naming scheme update, which in turn tends to happen during major point release upgrades. At least that's

[ceph-users] Importance of bluefs fix in Octopus 15.2.10 ?

2021-03-19 Thread Chris Palmer
When looking over the changelog for 15.2.10 I noticed some bluefs changes. One in particular caught my eye, and it was called out as a notable change: os/bluestore: fix huge reads/writes at BlueFS (pr#39701, Jianpeng Ma, Igor Fedotov). It wasn't

[ceph-users] Re: ceph-ansible in Pacific and beyond?

2021-03-19 Thread Stefan Kooman
On 3/17/21 5:50 PM, Matthew Vernon wrote: Hi, I caught up with Sage's talk on what to expect in Pacific (https://www.youtube.com/watch?v=PVtn53MbxTc) and there was no mention of ceph-ansible at all. Is it going to continue to be supported? We use it (and uncontainerised packages) for all