On 10/19/18 12:37 PM, Mohr Jr, Richard Frank (Rick Mohr) wrote:
On Oct 17, 2018, at 7:30 PM, Riccardo Veraldi <[email protected]> wrote:

Anyway, especially on the OSSes, you may eventually need to tune some ZFS module
parameters, raising the vdev read and write max-active values above their
defaults. You may also disable the ZIL, set redundant_metadata to "most", and
turn atime off.

I could send you a list of parameters that work well in my case.

Riccardo,

Would you mind sharing your ZFS parameters with the mailing list?  I would be 
interested to see which options you have changed.

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu

These ZFS module options worked for me on my high-performance cluster:

options zfs zfs_prefetch_disable=1
options zfs zfs_txg_history=120
options zfs metaslab_debug_unload=1
#
options zfs zfs_vdev_scheduler=deadline
options zfs zfs_vdev_async_write_active_min_dirty_percent=20
#
options zfs zfs_vdev_scrub_min_active=48
options zfs zfs_vdev_scrub_max_active=128
#
options zfs zfs_vdev_sync_write_min_active=8
options zfs zfs_vdev_sync_write_max_active=32
options zfs zfs_vdev_sync_read_min_active=8
options zfs zfs_vdev_sync_read_max_active=32
options zfs zfs_vdev_async_read_min_active=8
options zfs zfs_vdev_async_read_max_active=32
options zfs zfs_top_maxinflight=320
options zfs zfs_txg_timeout=30
options zfs zfs_dirty_data_max_percent=40
options zfs zfs_vdev_async_write_min_active=8
options zfs zfs_vdev_async_write_max_active=32
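
Module options like the ones above only take effect when the zfs module is loaded, so it can help to read the live values back from sysfs. The following is a minimal sketch under stated assumptions: the function name check_zfs_params and the path /etc/modprobe.d/zfs.conf are illustrative and not taken from the setup described here.

```shell
#!/bin/sh
# check_zfs_params CONF [SYSFS_DIR]
# Compare "options zfs name=value" lines in CONF with the live values
# under SYSFS_DIR (default: /sys/module/zfs/parameters).
# CONF and the function name are assumptions made for this sketch.
check_zfs_params() {
    conf=$1
    sysfs=${2:-/sys/module/zfs/parameters}
    # Keep only "options zfs ..." lines and print one name=value per line.
    awk '$1 == "options" && $2 == "zfs" { for (i = 3; i <= NF; i++) print $i }' "$conf" |
    while IFS='=' read -r name want
    do
        if [ ! -r "$sysfs/$name" ]; then
            # The loaded module does not expose this parameter.
            echo "MISSING $name"
            continue
        fi
        have=$(cat "$sysfs/$name")
        if [ "$have" = "$want" ]; then
            echo "OK $name=$have"
        else
            echo "DIFF $name: wanted $want, loaded $have"
        fi
    done
}
```

Called as `check_zfs_params /etc/modprobe.d/zfs.conf`, it prints one line per option. A parameter that the installed ZFS release does not provide shows up as MISSING, which is worth checking before copying a list of options between ZFS versions.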

##############

These are the ZFS properties that I changed on the OSSes:

zfs set mountpoint=none $ostpool

zfs set sync=disabled $ostpool

zfs set atime=off $ostpool

zfs set redundant_metadata=most $ostpool

zfs set xattr=sa $ostpool

zfs set recordsize=1M $ostpool
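
The six `zfs set` commands above can be applied in one loop per OST pool. This is a hedged sketch, not the original procedure: the function name set_ost_props and the DRY_RUN switch are assumptions added for illustration. Note that sync=disabled means synchronous writes are acknowledged before they reach stable storage.

```shell
#!/bin/sh
# set_ost_props POOL
# Apply the properties listed above to POOL. With DRY_RUN=1 the zfs
# commands are only printed, so they can be reviewed before anything changes.
# The function name and DRY_RUN are hypothetical helpers for this sketch.
set_ost_props() {
    pool=$1
    for prop in mountpoint=none sync=disabled atime=off \
                redundant_metadata=most xattr=sa recordsize=1M
    do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "zfs set $prop $pool"
        else
            # Stop at the first property that zfs refuses to set.
            zfs set "$prop" "$pool" || return 1
        fi
    done
}
```

For example, `DRY_RUN=1 set_ost_props "$ostpool"` prints the commands, and after a real run `zfs get mountpoint,sync,atime,redundant_metadata,xattr,recordsize "$ostpool"` shows the resulting values.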

#################


These are the ko2iblnd parameters for FDR Mellanox IB interfaces:

options ko2iblnd timeout=100 peer_credits=63 credits=2560 concurrent_sends=63 ntx=2048 fmr_pool_size=1280 fmr_flush_trigger=1024 ntx=5120

############

These are the ksocklnd parameters:

options ksocklnd sock_timeout=100 credits=2560 peer_credits=63

##############

These are the other parameters that I tweaked:

echo 32 > /sys/module/ptlrpc/parameters/max_ptlrpcds
echo 3 > /sys/module/ptlrpc/parameters/ptlrpcd_bind_policy

lctl set_param timeout=600
lctl set_param ldlm_timeout=200
lctl set_param at_min=250
lctl set_param at_max=600

###########

I also run this script at boot time to spread the IRQ assignments for the hard drives across all CPUs. It is not needed for kernels newer than 4.4.

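#!/bin/sh
# numa_smp.sh DEVICE FIRST_CPU LAST_CPU
# Assign the IRQs whose /proc/interrupts line matches DEVICE
# round-robin to CPUs FIRST_CPU..LAST_CPU.
device=$1
cpu1=$2
cpu2=$3
cpu=$cpu1
grep -- "$device" /proc/interrupts | awk '{print $1}' | sed 's/://' | while read -r int
do
  # Pin this IRQ to the current CPU and log what was written.
  echo "$cpu" > "/proc/irq/$int/smp_affinity_list"
  echo "echo CPU $cpu > /proc/irq/$int/smp_affinity_list"
  # Wrap back to the first CPU after the last one.
  if [ "$cpu" -eq "$cpu2" ]
  then
     cpu=$cpu1
  else
     cpu=$((cpu + 1))
  fi
done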
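
Before writing to /proc/irq, the round-robin mapping can be previewed. The function below is a sketch that mirrors the loop above without changing any affinity; the name preview_affinity and the optional fourth argument (an alternative interrupts file, useful for testing) are assumptions added for illustration.

```shell
#!/bin/sh
# preview_affinity PATTERN FIRST_CPU LAST_CPU [INTERRUPTS_FILE]
# Print "irq cpu" pairs in the same round-robin order as numa_smp.sh,
# without writing anything under /proc/irq.
preview_affinity() {
    pattern=$1
    first=$2
    last=$3
    file=${4:-/proc/interrupts}
    cpu=$first
    grep -- "$pattern" "$file" | awk '{print $1}' | sed 's/://' |
    while read -r irq
    do
        echo "$irq $cpu"
        # Wrap back to the first CPU after the last one.
        if [ "$cpu" -eq "$last" ]; then
            cpu=$first
        else
            cpu=$((cpu + 1))
        fi
    done
}
```

The pattern is whatever string identifies the storage controller's lines in /proc/interrupts on the machine in question; with three matching IRQs and CPUs 2 to 3, the third IRQ wraps back to CPU 2.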

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
