To close the loop on this, I put up a patch here:
http://gerrit.cloudera.org:8080/6095

-Todd

On Tue, Feb 14, 2017 at 6:32 PM, Todd Lipcon <[email protected]> wrote:

> On Tue, Feb 14, 2017 at 6:22 PM, Adar Dembo <[email protected]> wrote:
>
>> Any idea why this only surfaced recently? There's a call to
>> std::call_once in util/logging.cc that's been there for almost a year.
>>
>
> I'm guessing that one is only very rarely called (on crash) and would only
> be called by multiple threads at or near the same time in an even rarer
> case (two threads crash simultaneously).
>
>
>> In any case, I'm not a huge fan of #2. #1 and #3 both require a patch,
>> but I think #1 is more likely to get noticed (and reverted) upon an
>> LLVM 4.0 upgrade. So I think I vote for #1, though #3 is reasonable
>> too.
>>
>
> OK, I'll go with #1. I tested that the patch applies cleanly and fixes a
> manufactured test case I wrote.
>
>
>>
>> On Tue, Feb 14, 2017 at 6:09 PM, Todd Lipcon <[email protected]> wrote:
>> > BTW, I should add that libstdcxx as included in gcc 4.9.2 (devtoolset on
>> > el6) doesn't have this bug, since it just wraps pthread_once. So, it's
>> > probably not a _real_ issue since we don't use libc++ for production
>> builds.
>> >
>> > -Todd
>> >
>> > On Tue, Feb 14, 2017 at 6:05 PM, Todd Lipcon <[email protected]> wrote:
>> >
>> >> I noticed that a lot of our tests have gotten flaky with a TSAN race
>> >> reported around SSL initialization. After digging into a bit, I found
>> that
>> >> our version of libcxx has a bug in std::once where it's using a relaxed
>> >> load instead of an acquire load to check the flag of whether it has
>> run yet.
>> >>
>> >> I went to go report this upstream and discovered it's actually already
>> >> fixed:
>> >> https://github.com/llvm-mirror/libcxx/commit/
>> >> 4dbd4fcf8626453949625bf36976e668094cfbb1
>> >>
>> >> I'd still like to resolve the race, though. A couple options:
>> >> 1) apply this patch to our libcxx
>> >> 2) use GoogleOnce instead of std::call_once
>> >> 3) suppress it until we upgrade to llvm 4.0 with the newer libcxx,
>> which
>> >> should likely be released in a few weeks
>> >>
>> >> Any opinions?
>> >> -Todd
>> >> --
>> >> Todd Lipcon
>> >> Software Engineer, Cloudera
>> >>
>> >
>> >
>> >
>> > --
>> > Todd Lipcon
>> > Software Engineer, Cloudera
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Reply via email to