Hi community, I am running an ATS cluster with a debug workload (a replay of the normal workload with all headers removed), and I have noticed that a significant portion (>20%) of cache misses are tunneled through without being written to disk. I extracted this from the Via header, and in my toy experiment every miss should be cached. I assume this happens because the aggregation write buffer is full, so some of the disk writes are dropped.
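For anyone who wants to reproduce the count, here is a minimal Python sketch of how unwritten misses can be tallied from Via headers. It is not the exact script used for the numbers above. It assumes the documented ATS Via code layout, where `c` is followed by the cache-lookup result (`M` for miss) and `f` by the cache-fill result (`W` for written, blank for no write). The function name, hostname, and sample headers are hypothetical.

```python
import re

# Assumed layout of the ATS Via code string: [u?c?s?f?p?e?...]
# c = cache lookup result ('M' = miss), f = cache fill ('W' = written, ' ' = no write).
VIA_CODES = re.compile(r"\[u(.)c(.)s(.)f(.)p(.)e(.)")

def count_unwritten_misses(via_headers):
    """Return (misses, misses_without_cache_write) for an iterable of Via header values."""
    misses = 0
    unwritten = 0
    for via in via_headers:
        match = VIA_CODES.search(via)
        if match is None:
            continue  # no recognizable code string in this header, skip it
        lookup, fill = match.group(2), match.group(4)
        if lookup == "M":
            misses += 1
            if fill == " ":
                unwritten += 1
    return misses, unwritten

# Hypothetical sample headers, for illustration only.
sample = [
    "http/1.1 cache1 (ApacheTrafficServer [uScMsSfWpSeN:t cCMp sS])",  # miss, written
    "http/1.1 cache1 (ApacheTrafficServer [uScMsSf pSeN:t cCMp sS])",  # miss, not written
    "http/1.1 cache1 (ApacheTrafficServer [uScHs f p eN:t cCHp s ])",  # hit
]
misses, unwritten = count_unwritten_misses(sample)
ratio = unwritten / misses if misses else 0.0
```

The ratio of `unwritten` to `misses` is the fraction referred to above as misses that were tunneled without a cache write.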
I have tried increasing the aggregation buffer from 4MB to 256MB at compile time, but it does not help. Besides, my NVMe disk throughput (>1600MB/s in a FIO test with 1MB fragments, 80% read + 20% write) is far higher than what I observed during the experiments, in which I rarely saw more than 200MB/s of combined reads and writes. Any suggestions on how I should tune ATS for better performance? Maybe increase the number of cache threads per disk? Or increase or reduce the fragment size? Or any suggestions on how to find out why disk writes are being dropped? Any thoughts are appreciated!
Best, Jason
