On Mon, 4 Mar 2024, Martin Storsjö wrote:

Hi,

On Mon, 4 Mar 2024, Mateusz Mikuła wrote:

rand() is not random enough and may lead to clashing temporary directories
when multiple link processes run in parallel, as was observed on Rust's CI.

It can be reproduced with these commands (run them all without long pauses):


for n in {1..15000}; do rm -f lib/libLLVMAVRAsmParser.a && \
ar qc lib/libLLVMAVRAsmParser.a lib/Target/AVR/AsmParser/CMakeFiles/LLVMAVRAsmParser.dir/AVRAsmParser.cpp.obj && \
ranlib.exe lib/libLLVMAVRAsmParser.a; done &

for n in {1..15000}; do rm -f lib/libLLVMSparcCodeGen.a && \
ar qc lib/libLLVMSparcCodeGen.a lib/Target/Sparc/CMakeFiles/LLVMSparcCodeGen.dir/*.obj && \
ranlib.exe lib/libLLVMSparcCodeGen.a; done

echo "done"
fg

Before the patch, this fails with the error: ranlib.exe: could not create temporary file whilst writing archive: no more archived files.

Thanks, I've run into this issue occasionally when building LLVM on msys2 as well, but I failed to reproduce it when I tried to look closer at it (as I had missed that one needs to build two archives at the same time in order to trigger it).

If the issue is that the random names clash, shouldn't mkstemp, as part of its contract, retry until it finds a non-conflicting combination? But, thinking further: is the issue that two processes end up trying the same sequence of pseudo-random file names, which then all clash, so that mkstemp returns an error because it was unable to find a unique name? I guess that's plausible. In that case, I guess this patch is fine (with Liu Hao's suggestion), as a way to reduce the risk of running into this.
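
To illustrate the "same sequence" hypothesis, here is a minimal sketch (not the mingw-w64 code). It assumes neither process calls srand() before mkstemp, so both start from the default seed; the helper names, the 6-character suffix and the ordering of the letters table are made up for the illustration.

```c
#include <stdlib.h>
#include <string.h>

/* The 62 characters used for the random suffix (ordering is an assumption). */
static const char letters[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

/* Hypothetical helper: fill buf[0..n-1] with rand()-chosen letters and
   NUL-terminate it, like the inner loop of mkstemp. buf needs n + 1 bytes. */
void fill_suffix(char *buf, size_t n)
{
    for (size_t j = 0; j < n; j++)
        buf[j] = letters[rand() % 62];
    buf[n] = '\0';
}

/* Simulate two freshly started processes. The C standard specifies that
   rand() without a prior srand() behaves as if srand(1) had been called,
   so re-seeding with 1 stands in for "a new process". Returns how many of
   the first `attempts` candidate names are identical between the two. */
int count_identical_candidates(int attempts)
{
    enum { SUFFIX = 6, MAX_ATTEMPTS = 64 };
    char a[MAX_ATTEMPTS][SUFFIX + 1];
    char b[SUFFIX + 1];
    int same = 0;

    if (attempts > MAX_ATTEMPTS)
        attempts = MAX_ATTEMPTS;

    srand(1);                           /* "process A" */
    for (int i = 0; i < attempts; i++)
        fill_suffix(a[i], SUFFIX);

    srand(1);                           /* "process B" */
    for (int i = 0; i < attempts; i++) {
        fill_suffix(b, SUFFIX);
        if (strcmp(a[i], b) == 0)
            same++;
    }
    return same;
}
```

With identical seeds, every candidate name matches, so two processes started at about the same time would walk through exactly the same list of names. That alone only explains contention, not a hard failure, since the loser of each race should simply move on to the next name.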

Looking closer at our mkstemp implementation, we have this loop:

    /*
        Like OpenBSD, mkstemp() will try at least 2 ** 31 combinations before
        giving up.
     */
    for (i = 0; i >= 0; i++) {
        for(j = index; j < len; j++) {
            template_name[j] = letters[rand () % 62];
        }
        fd = _sopen(template_name,
                _O_RDWR | _O_CREAT | _O_EXCL | _O_BINARY,
                _SH_DENYNO, _S_IREAD | _S_IWRITE);
        if (fd != -1) return fd;
        if (fd == -1 && errno != EEXIST) return -1;
    }

This should retry an absolutely insane number of times, so as long as one process finds a unique file name and stops iterating, the other parallel process should also find a unique one soon after, one would expect.

So if this fails, it looks like something is fishy here; if we have this clash, do we hit the "if (fd == -1 && errno != EEXIST) return -1;" case directly on the first iteration?

(Separately, it looks like the loop relies on undefined behaviour, namely signed overflow of i wrapping around to a negative value, in order to exit the loop.)
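
As a rough sketch of how the loop could terminate without relying on signed overflow, one could count attempts with an unsigned counter against an explicit bound. This is not a proposed patch: create_fn is a hypothetical stand-in for the _sopen call, the ordering of the letters table is an assumption, and the two stub callbacks only exist to exercise the loop.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the _sopen(..., _O_CREAT | _O_EXCL, ...) call:
   returns a file descriptor, or -1 with errno set. */
typedef int (*create_fn)(const char *name);

static const char letters[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

/* Try up to max_attempts random names. The unsigned counter compared
   against an explicit bound avoids the signed wraparound of "i >= 0". */
int mkstemp_sketch(char *template_name, size_t index, size_t len,
                   unsigned long max_attempts, create_fn create)
{
    for (unsigned long i = 0; i < max_attempts; i++) {
        for (size_t j = index; j < len; j++)
            template_name[j] = letters[rand() % 62];
        int fd = create(template_name);
        if (fd != -1)
            return fd;
        if (errno != EEXIST)
            return -1;          /* some other error: give up immediately */
    }
    errno = EEXIST;             /* every attempt clashed */
    return -1;
}

/* Stub callbacks for demonstration only. */
static int calls;

int always_exists(const char *name)
{
    (void)name;
    calls++;
    errno = EEXIST;
    return -1;
}

int succeeds_on_third(const char *name)
{
    (void)name;
    if (++calls < 3) {
        errno = EEXIST;
        return -1;
    }
    return 42;                  /* pretend file descriptor */
}
```

Calling mkstemp_sketch with always_exists and a bound of 5 makes exactly five attempts and then fails with EEXIST; with succeeds_on_third it returns on the third attempt. Passing 1UL << 31 as the bound would match the number of combinations mentioned in the existing comment.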

// Martin

_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
