It looks like the writes are done atomically, but the reads aren't. TSAN is right to be suspicious of this. In practice there is no bug here, but showing that requires higher-level reasoning that TSAN wouldn't be expected to figure out.
-Kenton On Fri, Oct 9, 2020 at 11:52 AM Vitali Lovich <[email protected]> wrote: > So you're saying just change the read to atomic read? > > Is it possible there's a legitimate read-before-write race condition & > that's what TSAN is complaining about? It doesn't seem to be complaining > about concurrent writes but I also haven't investigated this codepath to > understand yet as I assumed the problem was in my code. > > On Fri, Oct 9, 2020 at 9:48 AM Kenton Varda <[email protected]> wrote: > >> I think that ThreadSanitizer is having trouble recognizing that the >> initialization of `brokenCapFactory` is thread-safe, due to the awkward way >> in which it is initialized. It may end up being initialized by multiple >> threads, but all threads will initialize it to the same value, hence no >> atomics are necessary when reading it later. >> >> Maybe we should use atomic reads anyway, to make ThreadSanitizer happy. >> Doing so should be free, at least on x86. >> >> (The reason for the weird initialization is that the factory is >> implemented in libcapnp-rpc.so, yet needs to be callable from libcapnp.so >> -- but only if RPC is in use.) >> >> -Kenton >> >> On Fri, Oct 9, 2020 at 8:37 AM Vitali Lovich <[email protected]> wrote: >> >>> I'm trying to do a basic in-process connection of the servers & running >>> that under TSAN but I feel like this is some nuance of the APIs I'm missing >>> or using the API totally incorrectly outright, so I would appreciate some >>> feedback if possible. I'm sure the bug must be in my code rather than >>> cap'n'proto. I don't have a helpful standalone piece of code to demonstrate >>> this issue at the moment (but if it's really critical I can work on >>> providing that). I've provided brief snippets of what the threads do (the >>> threads themselves are really that simple, it's just the schema & the >>> threads using various helper internal libraries that make it harder to post >>> a working standalone example). >>> >>> Full TSAN (to avoid spamming the list): https://pastebin.com/hSkVDgsU >>> >>> For context, the main thread does >>> auto ioContext = kj::setupAsyncIo(); >>> auto serverThread = ioContext.provider->newPipeThread(...); >>> auto serverPtr = <wait for CV state from T1>; >>> capnp::TwoPartyClient rpcClient(*serverThread.pipe); >>> auto client = rpcClient.bootstrap().castAs<MyServer>(); >>> auto connectedResponse = >>> client.connectRequest().send().wait(ioContext.waitScope); *<-- this is >>> the problematic TSAN line * >>> >>> T1 (taking AsyncIoProvider&, AsyncIoStream& stream, kj::WaitScope& >>> waitScope): >>> capnp::TwoPartyVatNetwork network(stream, >>> capnp::rpc::twoparty::Side::SERVER); >>> auto myServer = kj::heap<MyServerImpl>(); >>> auto myServerPtr = myServer.get() >>> auto rpcServer = capnp::makeRpcServer(network, kj::mv( myServer)); >>> <publish myServerPtr to main thread using CV> >>> network.onDisconnect().wait(waitScope);* <- problematic TSAN line* >>> >>> SUMMARY: ThreadSanitizer: data race >>> capnproto/c++/src/capnp/layout.c++:2188:5 in >>> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*, >>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int) >>> >>> Read of size 8 at 0x0000016673e0 by main thread: >>> #0 >>> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*, >>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int) >>> capnproto/c++/src/capnp/layout.c++:2188:5 (ld-2.26.so+0x809e60) >>> #1 capnp::_::PointerReader::getCapability() const >>> capnproto/c++/src/capnp/layout.c++:2705 (ld-2.26.so+0x809e60) >>> #2 >>> capnp::AnyPointer::Reader::getPipelinedCap(kj::ArrayPtr<capnp::PipelineOp >>> const>) const capnproto/c++/src/capnp/any.c++:53:18 (ld-2.26.so >>> +0x7ff4a7) >>> #3 capnp::_::(anonymous >>> >>> Previous atomic write of size 8 at 0x0000016673e0 by thread T1: >>> #0 __tsan_atomic64_store <null> (ld-2.26.so+0x6911ee) >>> #1 >>> capnp::_::setGlobalBrokenCapFactoryForLayoutCpp(capnp::_::BrokenCapFactory&) >>> capnproto/c++/src/capnp/layout.c++:45:3 (ld-2.26.so+0x800d12) >>> #2 >>> capnp::ReaderCapabilityTable::ReaderCapabilityTable(kj::Array<kj::Maybe<kj::Own<capnp::ClientHook> >>> > >) capnproto/c++/src/capnp/capability.c++:951:3 (ld-2.26.so+0x6f32cc) >>> #3 capnp::_::(anonymous >>> namespace)::RpcConnectionState::RpcCallContext::RpcCallContext(capnp::_::(anonymous >>> namespace)::RpcConnectionState&, unsigned int, >>> kj::Own<capnp::IncomingRpcMessage>&&, >>> kj::Array<kj::Maybe<kj::Own<capnp::ClientHook> > >, >>> capnp::AnyPointer::Reader const&, bool, kj::Own<kj::PromiseFulfiller<void> >>> >&&, unsigned long, unsigned short) capnproto/c++/src/capnp/rpc.c++:2144:11 >>> (ld-2.26.so+0x754bea) >>> >>> -- >>> You received this message because you are subscribed to the Google >>> Groups "Cap'n Proto" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to [email protected]. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/capnproto/19bac31b-6f11-43d1-886e-93d6bc32557an%40googlegroups.com >>> <https://groups.google.com/d/msgid/capnproto/19bac31b-6f11-43d1-886e-93d6bc32557an%40googlegroups.com?utm_medium=email&utm_source=footer> >>> . >>> >> -- You received this message because you are subscribed to the Google Groups "Cap'n Proto" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/capnproto/CAJouXQkT-3dHBk3iqE_fsfngaFZ447yJK9APof1VEKh-9RTHVw%40mail.gmail.com.
