On Sat, Aug 11 2018, René Scharfe wrote:

> Object IDs to skip are stored in a shared static oid_array. Lookups do
> a binary search on the sorted array. The code checks if the object IDs
> are already in the correct order while loading and skips sorting in that
> case.

I think this change makes sense, but it's missing an update to the
relevant documentation in Documentation/config.txt:

    fsck.skipList::
    	The path to a sorted list of object names (i.e. one SHA-1 per
    	line) that are known to be broken in a non-fatal way and should
    	be ignored. This feature is useful when an established project
    	should be accepted despite early commits containing errors that
    	can be safely ignored such as invalid committer email addresses.
    	Note: corrupt objects cannot be skipped with this setting.

Also, while I use the skipList feature, it's for something on the order
of 10-100 objects, so whatever algorithm the lookup uses isn't going to
matter, but I think it's interesting to describe the trade-off in the
commit message.

I.e. what if I have 100K objects listed in the skipList? Is it only
going to be read lazily during fsck if there's an issue, or on every
object etc.? What's the difference in performance?

Before this change, I wanted to follow up my ab/fsck-transfer-updates
with something where we'd die if we found the skipList wasn't ordered as
we read it, but from a UI POV this is even better.
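
For reference, here is a minimal standalone sketch of the approach the quoted message describes: note while loading whether the IDs arrive in order, sort only if they did not, then do a binary search per lookup. The names id_list, id_list_append and id_list_contains are made up for illustration, and the 20-byte ID size is an assumption; this is not the actual oid_array code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ID_LEN 20 /* raw SHA-1 size; an assumption for this sketch */

/* Hypothetical container, not git's struct oid_array. */
struct id_list {
	unsigned char (*ids)[ID_LEN];
	size_t nr, alloc;
	int sorted; /* stays 1 while entries arrive in ascending order */
};

#define ID_LIST_INIT { NULL, 0, 0, 1 }

static int id_cmp(const void *a, const void *b)
{
	return memcmp(a, b, ID_LEN);
}

/* Append one ID, tracking whether the input is still in order. */
static void id_list_append(struct id_list *l, const unsigned char *id)
{
	if (l->nr == l->alloc) {
		l->alloc = l->alloc ? 2 * l->alloc : 16;
		l->ids = realloc(l->ids, l->alloc * sizeof(*l->ids));
		assert(l->ids); /* a sketch: no real error handling */
	}
	if (l->nr && id_cmp(l->ids[l->nr - 1], id) > 0)
		l->sorted = 0;
	memcpy(l->ids[l->nr++], id, ID_LEN);
}

/* Binary search; sorts first, but only if loading saw disorder. */
static int id_list_contains(struct id_list *l, const unsigned char *id)
{
	if (!l->nr)
		return 0;
	if (!l->sorted) {
		qsort(l->ids, l->nr, sizeof(*l->ids), id_cmp);
		l->sorted = 1;
	}
	return bsearch(id, l->ids, l->nr, sizeof(*l->ids), id_cmp) != NULL;
}
```

With a list like this, an already-sorted file costs one comparison per entry at load time and no sort, and each lookup is a binary search, so roughly 17 comparisons for 100K entries. Whether the real code loads the list eagerly or lazily is a separate question this sketch doesn't answer.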