dexonsmith added inline comments.

================
Comment at: 
clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp:110-111
+
+CachedFileContents &DependencyScanningFilesystemSharedCache::getFileContents(
+    StringRef Filename, llvm::sys::fs::UniqueID UID) {
+  CacheShard &Shard = CacheShards[llvm::hash_value(Filename) % NumShards];
----------------
This name is a bit misleading... looks more like `getOrCreateFileContents()` to 
me.


================
Comment at: 
clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp:200-210
+          // FIXME: This read can fail.
+          // In that case, we should probably update `CacheEntry::MaybeStat`.
+          // However, that is concurrently read at the start of this function
+          // (`needsUpdate` -> `isReadable`), leading to a data race. If we try
+          // to avoid the data race by returning an `error_code` here (without
+          // updating the entry stat value, we can end up attempting to read the
+          // file again in the future, which breaks the consistency guarantees
----------------
dexonsmith wrote:
> I'm not quite following this logic. I think it's safe (and important!) to 
> modify MaybeStat if `read()` fails.
> 
> We're in a critical section that either creates and partially initializes the 
> entry, or incrementally updates it.
> 
> In the "creates and partially initializes" case:
> - All other workers will get nullptr for `Cache.getEntry()` and try to enter 
> the critical section.
> - We have just seen a successful MaybeStat value.
> - `needsRead()` will be `true` since we have not read contents before. We 
> will immediately try to read.
> - `read()` should open the original contents and can safely:
>     - on success, update the value for MaybeStat to match the observed size
>     - on failure, drop the value and set the error for MaybeStat to the 
> observed error
> - When we leave the critical section, either:
>     - MaybeStat stores an error; no thread will enter the critical section 
> again
>     - OriginalContents are initialized and `needsRead()` returns false
> 
> In the "incrementally updates" case:
> - `needsRead()` returns false so `read()` will not be called
> 
The key property is the last bullet of the first case: the "creates and 
partially initializes" case guarantees that, on leaving the critical section, 
either `MaybeStat` stores an error (`isReadable()` is false) or 
`OriginalContents` is initialized (`needsRead()` returns false).
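
To make that invariant concrete, here's a minimal sketch of the read step 
inside the critical section (the entry fields and the `readUnderLock` helper 
are illustrative, not the patch's actual API):
```
lang=c++
// Sketch only: `Entry` stands in for the partially-initialized cache entry.
// Precondition: we hold the entry's lock and Entry.MaybeStat is a success.
void readUnderLock(CacheEntry &Entry, StringRef Filename) {
  if (!Entry.needsRead())
    return; // "Incrementally updates" case: contents were already read.
  ErrorOr<std::unique_ptr<MemoryBuffer>> Buffer =
      getUnderlyingFS().getBufferForFile(Filename);
  if (!Buffer) {
    // On failure, drop the stat value and store the observed error. From now
    // on isReadable() is false, so no thread enters the critical section.
    Entry.MaybeStat = Buffer.getError();
    return;
  }
  // On success, update MaybeStat to match the observed size and publish the
  // contents. From now on needsRead() returns false.
  Entry.MaybeStat = llvm::vfs::Status::copyWithNewSize(
      *Entry.MaybeStat, (*Buffer)->getBufferSize());
  Entry.OriginalContents = std::move(*Buffer);
}
```
Either way we leave the critical section with the invariant re-established.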

----

I think I've found the part I was missing: this critical section is for a 
SharedCacheFileEntry (associated with a filename), but `OriginalContents` is a 
field on a CachedFileContents, which could apply to other UIDs (and 
`SharedCache.getFileContents()` is actually "get or create"). Since this 
commit introduces the UID map, it seems like the race might get worse here? 
(Not sure.)

----

Another problem: the UID can change between the first stat above (the call to 
`getUnderlyingFS().status(Filename)`) and the one inside `read()` if another 
process is writing at the same time. We can't trust the UID mapping from the 
first `status()` call unless content already exists for that UID.

I think that to avoid this race you need to delay creating the "UID to 
content" map entry until there is the result of a successful `read()` to 
store.
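
To make the race concrete, here's a hypothetical interleaving (not real code 
paths from the patch; `UIDMap` is just a stand-in UID-to-contents map):
```
lang=c++
ErrorOr<Status> Stat = getUnderlyingFS().status(Filename); // observes UID A
// <-- another process replaces Filename here (e.g. write to temp + rename)
ErrorOr<std::unique_ptr<MemoryBuffer>> Buffer =
    getUnderlyingFS().getBufferForFile(Filename); // reads the new file: UID B
// Caching these contents under UID A poisons the map: any other filename
// that legitimately resolves to UID A would now see B's contents.
UIDMap[Stat->getUniqueID()] = std::move(*Buffer); // wrong
```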

I'll describe an algorithm that I think is fairly clean and handles this. I'm 
using different data structure names to avoid confusion, since I've broken 
things down a little differently (a rough type sketch follows the list):
- ReadResult: stat (for directories), OR error and UID (failed read), OR stat 
and content (plus optional minimized content and pp ranges, and a way to 
update them atomically)
- FilenameMap: map from Filename to `ReadResult*` (shared and sharded; 
mirrored locally in each worker)
- UIDMap: map from UID to `ReadResult*` (shared and sharded; probably no local 
mirror)
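
A rough sketch of what those types could look like (the field names and the 
pp-ranges type are placeholders of mine, not proposed API):
```
lang=c++
struct ReadResult {
  ErrorOr<Status> MaybeStat;  // directory stat, file stat, or the read error
  UniqueID UID;               // kept even when MaybeStat stores an error
  std::unique_ptr<MemoryBuffer> Contents; // null for directories and errors
  // Minimized content and pp ranges, added at most once; see
  // minimizeIfNecessary() below for how updates are kept atomic.
  std::unique_ptr<MemoryBuffer> MinimizedContents;
  Optional<PPSkippedRanges> PPRanges; // `PPSkippedRanges` is a placeholder
};
// One shard of each map; a shard pairs a mutex with its slice of the map.
struct FilenameMapShard {
  std::mutex Mutex;
  StringMap<ReadResult *> Map; // mirrored locally (lock-free) per worker
};
struct UIDMapShard {
  std::mutex Mutex;
  DenseMap<UniqueID, ReadResult *> Map; // probably no local mirror
};
```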

And here's the algorithm:
```
lang=c++
// Top-level API: get the entry/result for some filename.
ErrorOr<ReadResult &> getOrCreateResult(StringRef Filename) {
  if (ReadResult *Result = lookupEntryForFilename(Filename))
    return minimizeIfNecessary(*Result, ShouldMinimize);
  if (ErrorOr<ReadResult &> Result = computeAndStoreResult(Filename))
    return minimizeIfNecessary(*Result, ShouldMinimize);
  else
    return Result.getError();
}
// Compute and store an entry/result for some filename. Returned result
// has in-sync stat+read info (assuming read was successful).
ErrorOr<ReadResult &> computeAndStoreResult(StringRef Filename) {
  ErrorOr<Status> Stat = UnderlyingFS->status(Filename);
  if (!Stat)
    return Stat.getError(); // Can't cache missing files.
  if (ReadResult *Result = lookupEntryForUID(Stat->getUniqueID()))
    return storeEntryForFilenameOrReturnExisting(Filename, *Result); // Known UID.
  // UID not known. Compute a ReadResult.
  //
  // Unless this is a directory (where we don't need to go back to the FS),
  // ignore existing 'Stat' because without an open file descriptor the UID
  // could change.
  Optional<ReadResult> Result;
  if (Stat->isDirectory())
    Result = ReadResult(*Stat);
  else if (ErrorOr<ReadResult> MaybeResult = computeReadResult(Filename))
    Result = std::move(*MaybeResult);
  else
    return MaybeResult.getError(); // File disappeared...
  // Store the result. Cascade through UID then Filename. Each level could
  // return a different result than it was passed in.
  return storeEntryForFilenameOrReturnExisting(
      Filename, storeEntryForUIDOrReturnExisting(std::move(*Result)));
}
// Lookup existing result in FilenameMap. No mutation. First checks local map,
// then falls back to the shared map (locks shard, looks up, unlocks, saves in
// local map, returns).
ReadResult *lookupEntryForFilename(StringRef Filename);
// Lookup existing result in UIDMap. No mutation. No local map, just a shared
// map (lock shard + lookup + return).
ReadResult *lookupEntryForUID(UniqueID UID);
// Compute read result using a single file descriptor.
// - Return error if `open()` fails. Can't cache missing files.
// - Else compute ReadResult: stat the open file descriptor and get a memory
//   buffer from it.
// Note: "error" state if the stat fails.
// Note: "error" state if the stat succeeds but the memory buffer fails to open.
// Note: if the memory buffer opens successfully, the status is updated with
// the observed size.
// Note: does not take a UID parameter since the live FS could have changed.
// Note: does not access or mutate UIDMap/FilenameMap/etc.
ErrorOr<ReadResult> computeReadResult(StringRef Filename);
// Compare-exchange. Pulls UID out of NewResult. Locks shard for UIDMap[UID];
// checks for an existing result; if none, bump-ptr-allocates and stores
// NewResult; returns stored result.
ReadResult &storeEntryForUIDOrReturnExisting(ReadResult &&NewResult);
// Compare-exchange. Locks shard for FilenameMap[Filename]; checks for an
// existing result; if none, stores the parameter; unlocks; updates local map
// with the stored result and returns it.
ReadResult &storeEntryForFilenameOrReturnExisting(StringRef Filename,
                                                  ReadResult &);
// If needed and missing, adds minimization info atomically. Note that Result
// may store a cached read error, or a directory.
ReadResult &minimizeIfNecessary(ReadResult &Result, bool ShouldMinimize);
```
The only thing "lost" is that two workers might both compute a ReadResult for 
the same file (with the slower one's work dropped on the floor). I'm skeptical 
this will matter in practice. If some measurement says it does, the 
FilenameMap could map from Filename to 
`unique_ptr<pair<mutex,atomic<ReadResult*>>>`, and `computeAndStoreResult()` 
could take the lock at the start and re-check the map (see the sketch 
below)... but IMO it's better to keep this simple and optimize for the common 
case.
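
Sketch of that alternative, if measurement ever justified it (again, names are 
illustrative):
```
lang=c++
// Alternative FilenameMap slot that serializes computation per filename.
struct FilenameSlot {
  std::mutex ComputeMutex;               // held while computing the result
  std::atomic<ReadResult *> Result{nullptr};
};
// FilenameMap becomes: StringMap<std::unique_ptr<FilenameSlot>> (sharded).
ErrorOr<ReadResult &> computeAndStoreResultSerialized(FilenameSlot &Slot,
                                                      StringRef Filename) {
  std::lock_guard<std::mutex> Lock(Slot.ComputeMutex);
  // Re-check: another worker may have finished while we waited for the lock.
  if (ReadResult *Existing = Slot.Result.load(std::memory_order_acquire))
    return *Existing;
  ErrorOr<ReadResult &> Result = computeAndStoreResult(Filename);
  if (Result)
    Slot.Result.store(&*Result, std::memory_order_release);
  return Result;
}
```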

This does leave behind an extended critical section in `minimizeIfNecessary()` 
to avoid racing to minimize, implying there's a mutex in `ReadResult` for 
updating the MinimizedContents and PPRanges. I suspect minimization is fast 
enough (and racing workers rare and cheap enough) that the memory overhead of 
a per-ReadResult mutex is a bad tradeoff. A simple alternative would be to 
compute the minimized contents before the critical section, then take a 
"store-minimization" lock just for setting the values; that mutex could be 
sharded by `UID % 64` or something. But I'm less confident about this part.
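
Here's roughly what I mean (the helpers on `ReadResult` are invented for 
illustration):
```
lang=c++
// Sketch: minimize outside any lock; take a sharded lock only to store.
static std::mutex StoreMinimizationLocks[64]; // sharded, not per-ReadResult
ReadResult &minimizeIfNecessary(ReadResult &Result, bool ShouldMinimize) {
  if (!ShouldMinimize || Result.isError() || Result.isDirectory() ||
      Result.isMinimized())
    return Result;
  // Racing workers may both minimize; the loser's work is dropped.
  MinimizedOutput Out = computeMinimized(Result.getContents());
  std::lock_guard<std::mutex> Lock(
      StoreMinimizationLocks[Result.getUID().getFile() % 64]);
  if (!Result.isMinimized()) // re-check under the lock
    Result.storeMinimized(std::move(Out)); // sets contents + pp ranges together
  return Result;
}
```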

This also leaves behind the double-stat behaviour (before and after open) for 
new files. New files should be pretty rare in the depscanner, so maybe this is 
fine; I've also observed cases where a failed `open` is slower than a failed 
`stat`, so for the common case of missing files (which don't get cached...) 
this might be best.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114966/new/

https://reviews.llvm.org/D114966
