Ake,
in this particular case I can answer your question in detail.

Before SFAOS 12.1 (IIRC), the /sys/block/*/queue/rotational setting was set
from userspace at mount time by a udev script, so the Lustre detection of
"rotational=0" could race with it.  Newer versions of SFAOS (12.1+) report the
rotational state in the SCSI VPD page, and the kernel detects it directly.
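
A quick way to see what the kernel currently reports for each block device is
to read the sysfs flag directly.  The loop below is only a sketch; the
list_rotational name and its optional root argument are made up for
illustration and are not part of any Lustre or SFAOS tooling.

```shell
# List each block device with its rotational flag (0 = flash, 1 = spinning).
# The optional first argument overrides the sysfs root; it exists only so the
# function can be tried against a scratch directory (hypothetical helper).
list_rotational() {
    root="${1:-/sys/block}"
    for f in "$root"/*/queue/rotational; do
        # Skip the unexpanded glob when no device matches.
        [ -r "$f" ] || continue
        dev="${f%/queue/rotational}"
        printf '%s rotational=%s\n' "${dev##*/}" "$(cat "$f")"
    done
}

list_rotational
```

Comparing this with "lctl get_param osd-ldiskfs.*.*cache*enable" on the same
server shows whether the cache defaults were chosen before the flag was set.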

For EXAScaler systems that may be running older SFAOS releases, a patch
(included in 2.12.6-ddn72/EXA5.2.5) revalidates the rotational device state
occasionally in case it has been modified after mount time, and uses that to
update the read_cache_enable and writethrough_cache_enable tunables *if they
have not been explicitly set*.
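
The rule this applies can be modelled roughly as below.  This is an
illustration of the behaviour described above, not the patch itself;
effective_cache_setting and its arguments are hypothetical names.

```shell
# effective_cache_setting ROTATIONAL [EXPLICIT]
# ROTATIONAL is the current value of queue/rotational (0 or 1).
# EXPLICIT is the value an admin set by hand, or empty if never set.
# Hypothetical model of the rule, not code from the Lustre tree.
effective_cache_setting() {
    rotational="$1"
    explicit="${2:-}"
    if [ -n "$explicit" ]; then
        # An explicit setting always wins and is never overridden.
        echo "$explicit"
    else
        # Default follows the device: cache on for HDD (1), off for flash (0).
        echo "$rotational"
    fi
}

effective_cache_setting 0      # prints 0: flash, nothing set by hand
effective_cache_setting 0 1    # prints 1: flash, but explicitly enabled
```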

Until you update to a newer EXA and/or SFAOS, you can explicitly tune 
osd-ldiskfs.*.read_cache_enable=0 and ...writethrough_cache_enable=0, using a 
wildcard "*" if all of the OSTs/MDTs are flash based.  If you have a hybrid 
NVMe/HDD system, you can explicitly select a subset of OST/MDT devices to 
disable the caches.
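
For example, the commands could be generated along these lines.  The helper
only prints the "lctl set_param" lines so they can be reviewed before being run
on the OSS/MDS nodes; disable_cache_cmds is a hypothetical name, and
stor10-OST0000 is just the target from your output.

```shell
# Print (not run) the lctl commands that disable both OSD caches for the
# given targets.  Pass '*' (quoted) to cover every target on the server.
# Hypothetical helper for illustration only.
disable_cache_cmds() {
    for target in "$@"; do
        for tunable in read_cache_enable writethrough_cache_enable; do
            printf 'lctl set_param osd-ldiskfs.%s.%s=0\n' "$target" "$tunable"
        done
    done
}

# All-flash system: one wildcard covers all OSTs/MDTs.
disable_cache_cmds '*'

# Hybrid system: name only the flash-backed targets.
disable_cache_cmds stor10-OST0000
```

Listing targets by name keeps the HDD-backed devices on their default of 1.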

Cheers, Andreas

On May 20, 2022, at 02:49, Åke Sandgren <ake.sandg...@hpc2n.umu.se> wrote:
On 5/20/22 09:53, Andreas Dilger via lustre-discuss wrote:
To elaborate a bit on Patrick's answer, there is no mechanism to do this on the 
*client*, because the performance difference between client RAM and server 
storage is still fairly significant, especially if the application is doing 
sub-page read or write operations.
However, on the *server* the OSS and MDS will *not* put flash storage into the 
page cache, because using the kernel page cache has a measurable overhead, and 
(at least in our testing) the performance of NVMe IOPS is actually better 
*without* the page cache because more CPU is available to handle RPCs.  This is 
controlled on the server with 
osd-ldiskfs.*.{read_cache_enable,writethrough_cache_enable}, default to 0 if 
the block device is non-rotational, default to 1 if block device is rotational.

Then my question is, what is it checking to determine non-rotational?

On our systems the NVMe disks have read/writethrough_cache_enable = 1 (DDN 
SFA400NVXE) with
===
/dev/sde on /lustre/stor10/ost0000 (NVMe)
cat /sys/block/sde/queue/rotational
0
lctl get_param osd-ldiskfs.*.*cache*enable
osd-ldiskfs.stor10-OST0000.read_cache_enable=1
osd-ldiskfs.stor10-OST0000.writethrough_cache_enable=1

EXAScaler SFA CentOS 5.2.3-r5
kmod-lustre-2.12.6_ddn58-1.el7.x86_64
===

--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se  Mobile: +46 70 7716134  Fax: +46 90-580 14
WWW: http://www.hpc2n.umu.se
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud