bneradt commented on code in PR #10960: URL: https://github.com/apache/trafficserver/pull/10960#discussion_r1443212011
########## doc/admin-guide/files/records.yaml.en.rst: ########## @@ -2488,6 +2488,29 @@ Cache Control used from the asynchronous IO threads when IO finishes and the ``CacheVC`` lock or stripe lock is required. +.. ts::cv:: CONFIG proxy.config.cache.max_disk_errors INT 5 + + Cache disks sometimes fail due to hardware problems. |TS| keeps count of + the number of times it encounters I/O errors when accessing each cache disk. + If the number of errors on a disk reaches this setting, |TS| removes that + disk from the cache. + + Note that the count of errors is kept in memory, and is reset to zero when + |TS| is restarted. By default, |TS| will not remember which cache disk has + been removed in this way when it restarts. If you wish to change this behavior + and prevent known bad disks from re-joining the cache upon restart, change + the setting :ts:cv:`proxy.config.cache.persist_bad_disks`. + +.. ts::cv:: CONFIG proxy.config.cache.persist_bad_disks INT 0 + + When enabled (``1``), |TS| will remember which cache disks have been + marked as failed through :ts:cv:`proxy.config.cache.max_disk_errors`, + even after a restart. A list of known bad cache disks is written to + ``localstatedir/known_bad_disks``. If you replace the known bad disks, + delete this file so that |TS| will use it in the cache again. + + When disabled (``0``), |TS| will ignore the ``known_bad_disks`` file. + Review Comment: It's typical to say something explicit about the default for something like this, like: ``` This feature is disabled by default. ``` ########## src/iocore/cache/Cache.cc: ########## @@ -336,6 +344,15 @@ CacheProcessor::start_internal(int flags) opts |= O_CREAT; } + // Check if the disk is known to be bad Review Comment: nit: end with a period. ########## src/iocore/cache/Cache.cc: ########## @@ -336,6 +344,15 @@ CacheProcessor::start_internal(int flags) opts |= O_CREAT; } + // Check if the disk is known to be bad + if (cache_config_persist_bad_disks && !known_bad_disks.empty()) { + if (known_bad_disks.find(paths[gndisks]) != known_bad_disks.end()) { + Note("%s is a known bad disk. Skipping.", paths[gndisks]); Review Comment: Probably should be a warning. ########## include/tscore/Filenames.h: ########## @@ -47,6 +47,7 @@ namespace filename // Various other file names constexpr const char *RECORDS_STATS = "records.snap"; constexpr const char *HOST_RECORDS = "host_records.yaml"; + constexpr const char *BAD_DISKS = "bad_disks"; Review Comment: Probably good to add a `.txt` suffix. That can be helpful for some editors. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@trafficserver.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org