[ceph-users] Re: High OSD commit_latency after kernel upgrade

2024-03-22 Thread Özkan Göksu
c3--2ea3c8a05ca1 04K 2G 0 I think I also need to write a udev rule for the LVM OSD partitions, right? Anthony D'Atri wrote on Fri, 22 Mar 2024 at 18:11: > Maybe because the Crucial units are detected as client drives? But also > look at the device paths an
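
For reference, the cache_type attribute that the write-through trick relies on belongs to the physical SCSI disk, not to the LVM/dm device on top of it, so a separate rule for the LVM OSD partitions should not be needed. A quick way to check the mapping and the current setting (a sketch, assuming plain LVM-on-SATA OSDs):

  # which physical disk backs each OSD LV
  lvs -o lv_name,vg_name,devices
  # current write cache mode of the SCSI disks
  cat /sys/class/scsi_disk/*/cache_type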

[ceph-users] Re: High OSD commit_latency after kernel upgrade

2024-03-22 Thread Özkan Göksu
eph-write-through.rules ACTION=="add", SUBSYSTEM=="scsi_disk", ATTR{cache_type}:="write through" Özkan Göksu wrote on Fri, 22 Mar 2024 at 17:42: > Hello Anthony, thank you for the answer. > > While researching I also found out this type of iss
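
Written out in full, the rule fragment quoted above would look roughly like the sketch below; the file path and name are assumptions (the preview truncates them), and a udev reload or reboot is needed for the rule to take effect:

  # /etc/udev/rules.d/99-ceph-write-through.rules  (assumed path/name)
  ACTION=="add", SUBSYSTEM=="scsi_disk", ATTR{cache_type}:="write through"

  udevadm control --reload-rules
  udevadm trigger --subsystem-match=scsi_disk --action=add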

[ceph-users] Re: High OSD commit_latency after kernel upgrade

2024-03-22 Thread Özkan Göksu
ging-usb-ssd-provisioning-mode-from-unmap-to-full-in-ub> > ? > > > On Mar 22, 2024, at 09:36, Özkan Göksu wrote: > > Hello! > > After upgrading "5.15.0-84-generic" to "5.15.0-100-generic" (Ubuntu 22.04.2 > LTS) , commit latency star
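
The linked article is about the provisioning_mode sysfs knob on USB/SCSI disks; a rough sketch of checking and changing it at runtime (sdX is a placeholder, and the change does not persist across reboots without a udev rule):

  cat /sys/block/sdX/device/scsi_disk/*/provisioning_mode
  echo full | sudo tee /sys/block/sdX/device/scsi_disk/*/provisioning_mode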

[ceph-users] High OSD commit_latency after kernel upgrade

2024-03-22 Thread Özkan Göksu
Hello! After upgrading "5.15.0-84-generic" to "5.15.0-100-generic" (Ubuntu 22.04.2 LTS), commit latency started acting weird with "CT4000MX500SSD" drives.

osd  commit_latency(ms)  apply_latency(ms)
 36                 867               867
 37                3045              3045
 38
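
The per-OSD latencies quoted here come from the standard per-OSD performance report and can be watched live with:

  ceph osd perf                          # columns: osd, commit_latency(ms), apply_latency(ms)
  watch -n 2 "ceph osd perf | sort -nk 2"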

[ceph-users] Sata SSD trim latency with (WAL+DB on NVME + Sata OSD)

2024-02-26 Thread Özkan Göksu
Hello. With SSD drives that lack tantalum capacitors, Ceph faces trim latency on every write. I wonder if the behavior is the same if we locate the WAL+DB on NVMe drives with tantalum capacitors? Do I need to use NVMe + SAS SSD to avoid this latency issue? Best regards.
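
For context, a minimal sketch of the layout being asked about, with the WAL+DB on an NVMe partition and the data on the SATA/SAS SSD (device names are placeholders):

  ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1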

[ceph-users] Re: Seperate metadata pool in 3x MDS node

2024-02-26 Thread Özkan Göksu
d, even if the addressed MDS is on the same switch as > the client, you will always have this MDS which will consult/write (nvme) > OSDs on the other racks (among 1,2,4,5). > > In any case, the exercise is interesting. > > > > On Sat, 24 Feb 2024 at 19:56, Özkan Göksu wrote

[ceph-users] Seperate metadata pool in 3x MDS node

2024-02-24 Thread Özkan Göksu
Hello folks! I'm designing a new Ceph storage from scratch and I want to increase CephFS speed and decrease latency. Usually I build with the WAL+DB on NVMe and SAS/SATA SSDs, and I deploy the MDS and MONs on the same servers. This time a weird idea came to my mind and I think it has great
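
A rough sketch of the separate-metadata-pool part of this idea, assuming the NVMe OSDs carry the device class "nvme" (rule and pool names are made up):

  ceph osd crush rule create-replicated meta-on-nvme default host nvme
  ceph osd pool set cephfs_metadata crush_rule meta-on-nvme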

[ceph-users] Re: Performance improvement suggestion

2024-02-20 Thread Özkan Göksu
Hello. I didn't test it personally, but what about a rep 1 write cache pool on NVMe backed by another rep 2 pool? In theory it has the potential to be exactly what you are looking for. quag...@bol.com.br wrote on Thu, 1 Feb 2024 at 20:54: > > > Ok Anthony, > > I understood what you said. I
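
What this suggestion corresponds to in Ceph terms is a cache tier in writeback mode; a minimal sketch with invented pool names (keeping in mind that size=1 on the cache pool means any un-flushed writes are lost with a single NVMe failure):

  ceph osd tier add base-pool nvme-cache
  ceph osd tier cache-mode nvme-cache writeback
  ceph osd tier set-overlay base-pool nvme-cache
  ceph osd pool set nvme-cache hit_set_type bloom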

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-27 Thread Özkan Göksu
Thank you Frank. My focus is actually performance tuning. After your mail, I started to investigate the client side. I think the kernel tunings work great now; after applying them I didn't get any warning again. Now I will continue with performance tuning. I decided to distribute subvolumes across

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-26 Thread Özkan Göksu
61848msec); 0 zone resets What do you think? Özkan Göksu wrote on Sat, 27 Jan 2024 at 04:08: > Wow, I noticed something! > > To prevent RAM overflow with GPU training allocations, I'm using a 2TB > Samsung 870 EVO for swap. > > As you can see below, swap usage 18Gi

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-26 Thread Özkan Göksu
11Gi 50Gi 3.0Mi 1.0Gi 49Gi Swap: 1.8Ti 0B 1.8Ti I started to feel we are getting closer :) Özkan Göksu wrote on Sat, 27 Jan 2024 at 02:58: > I started to investigate my clients. > > For example: > > root@ud-01:~# ceph heal

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-26 Thread Özkan Göksu
_ra_hit 25658825 direct_map_level2_splits 1249 direct_map_level3_splits 49 nr_unstable 0 Özkan Göksu wrote on Sat, 27 Jan 2024 at 02:36: > Hello Frank. > > I have 84 clients (high-end servers) with: Ubuntu 20.04.5 LTS - Kernel: > Linux 5.4.0-125-generic > > My cluster 17.2.6

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-26 Thread Özkan Göksu
negative impact. > > Best regards, > = > Frank Schilder > AIT Risø Campus > Bygning 109, rum S14 > > ________ > From: Eugen Block > Sent: Friday, January 26, 2024 10:05 AM > To: Özkan Göksu > Cc: ceph-users@ceph.io

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-25 Thread Özkan Göksu
hit - d_lease 306 19110 3317071969 caps 14129 145404 3761682333 Özkan Göksu wrote on Thu, 25 Jan 2024 at 20:25: > Every user has a 1x subvolume and I only have 1 pool. > At
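
The d_lease/caps counters above look like the CephFS kernel client's debugfs statistics; on a client they can be read with something like the following (assuming debugfs is mounted, and noting that the exact file names vary with kernel version):

  # list the per-mount debug files (metrics, caps, mdsc, osdc, ...)
  ls /sys/kernel/debug/ceph/*/
  cat /sys/kernel/debug/ceph/*/metrics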

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-25 Thread Özkan Göksu
nning, is it possible to split the CephFS into multiple > subdirectories and pin them to different ranks? > But first I’d still like to know what the performance issue really is. > > Quoting Özkan Göksu: > > > I will try my best to explain my situation. > > > >
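
The directory pinning asked about here is done with an extended attribute on the directory; a small sketch, with example paths and rank numbers:

  setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/volumes/group_a
  setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/volumes/group_b
  getfattr -n ceph.dir.pin /mnt/cephfs/volumes/group_a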

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-25 Thread Özkan Göksu
em to have too much activity > going on (around 140 requests per second), but that's probably not the > usual traffic? What does ceph report in its client IO output? > Can you paste the 'ceph osd df' output as well? > Do you have dedicated MDS servers or are they colocated with other

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-25 Thread Özkan Göksu
Hello Eugen. I read all of your MDS-related topics, and thank you so much for your effort on this. There is not much information out there and I couldn't find an MDS tuning guide at all. It seems that you are the correct person to discuss MDS debugging and tuning with. Do you have any documents, or may I learn

[ceph-users] Re: cephfs-top causes 16 mgr modules have recently crashed

2024-01-25 Thread Özkan Göksu
> https://docs.ceph.com/en/quincy/cephadm/upgrade/#staggered-upgrade. > > On 23/01/24 12:19, Özkan Göksu wrote: > > Hello Jos. > Thank you for the reply. > > I can upgrade to 17.2.7, but I wonder whether I can upgrade only MON+MGR for this > issue or whether I need to upgrade all the part
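
Per the staggered-upgrade page linked above, upgrading only the mgr and mon daemons first is possible with cephadm; a sketch (image tag assumed):

  ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.7 --daemon-types mgr
  ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.7 --daemon-types mon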

[ceph-users] Re: cephfs-top causes 16 mgr modules have recently crashed

2024-01-22 Thread Özkan Göksu
"18.1.1", "18.1.0", "17.2.7", "17.2.6", "17.2.5", "17.2.4", "17.2.3", "17.2.2", "17.2.1", "17.2.0" ] } Best regards, Jos Collin

[ceph-users] cephfs-top causes 16 mgr modules have recently crashed

2024-01-22 Thread Özkan Göksu
Hello, when I run cephfs-top it causes an mgr module crash. Can you please tell me the reason? My environment: Ceph version: 17.2.6 Operating System: Ubuntu 22.04.2 LTS Kernel: Linux 5.15.0-84-generic I created the cephfs-top user with the following command: ceph auth get-or-create client.fstop
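
For reference, the full user creation and prerequisites for cephfs-top, roughly as the quincy docs describe them (treat the exact caps as a sketch):

  ceph mgr module enable stats
  ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd 'allow r' mgr 'allow r' > /etc/ceph/ceph.client.fstop.keyring
  cephfs-top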

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-16 Thread Özkan Göksu
"osdop_write": 713549, "osdop_writefull": 20267, "osdop_writesame": 0, "osdop_append": 0, "osdop_zero": 2, "osdop_truncate": 0, "osdop_delete": 1226688, "osdop_mapext": 0,

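Counters like osdop_write and osdop_writefull come from a daemon's objecter perf counters (here presumably the MDS); they can be pulled with something like the following, where the daemon name is a placeholder:

  ceph tell mds.<name> perf dump objecter
  # or, on the MDS host:
  ceph daemon mds.<name> perf dump objecter
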
[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-16 Thread Özkan Göksu
d I need to increase the value to decrease the latency. How can I do that? Özkan Göksu wrote on Wed, 17 Jan 2024 at 07:44: > Let me share some outputs about my cluster. > > root@ud-01:~# ceph fs status > ud-data - 84 clients > === > RANK STATE MDS
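
Whatever the specific option turns out to be (the preview cuts it off), runtime changes to MDS settings generally go through the config database; a generic sketch with a hypothetical option and value:

  ceph config set mds mds_cache_memory_limit 8589934592   # hypothetical option/value (8 GiB)
  ceph config get mds mds_cache_memory_limit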

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-16 Thread Özkan Göksu
"start": "0x10004a3c12e", "length": 3 }, { "start": "0x1000f9831fe", "length": 2 }, { "start": "0x1000

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-16 Thread Özkan Göksu
https://www.spinics.net/lists/ceph-users/msg73188.html > > Quoting Özkan Göksu: > > > Hello. > > > > I have a 5-node Ceph cluster and I'm constantly getting the "clients failing to > > respond to cache pressure" warning. > > > > I have 84 cephfs kernel

[ceph-users] 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-12 Thread Özkan Göksu
Hello. I have a 5-node Ceph cluster and I'm constantly getting the "clients failing to respond to cache pressure" warning. I have 84 CephFS kernel clients (servers), and my users access their personal subvolumes located on one pool. My users are software developers and the data is home and user
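
Two standard starting points for digging into this warning are the health detail and the per-client session list on the active MDS (daemon name is a placeholder); the caps counts per session usually point at the offending clients:

  ceph health detail
  ceph tell mds.<name> session ls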