On 24 July 2017 at 17:04, Jim Apple <[email protected]> wrote: > I had anticipated that shared linking would save time and disk space, but > it sounds like, from your testing, it doesn't save much time. Does it save > disk space? >
I haven't measured but I would expect not. Do we need to be very careful about disk space in the current configuration? > > Does static linking save time when compiling incremental changes? > Again, I haven't measured. > > On Mon, Jul 24, 2017 at 4:51 PM, Henry Robinson <[email protected]> wrote: > > > :) I agree - we should also track any known breaks to shared linking in a > > best effort fashion because it's so useful to some dev workflows. > > > > On 24 July 2017 at 16:49, Tim Armstrong <[email protected]> wrote: > > > > > I vote for changing Jenkins' linking strategy now and not changing it > > back > > > :). Static linking is the blessed configuration so I think we should be > > > running tests with that primarily. > > > > > > On Mon, Jul 24, 2017 at 4:34 PM, Henry Robinson <[email protected]> > > wrote: > > > > > > > On 24 July 2017 at 13:58, Todd Lipcon <[email protected]> wrote: > > > > > > > > > On Mon, Jul 24, 2017 at 1:47 PM, Henry Robinson <[email protected]> > > > > wrote: > > > > > > > > > > > Thanks for the asan pointer - I'll give it a go. > > > > > > > > > > > > My understanding of linking isn't deep, but my working theory has > > > been > > > > > that > > > > > > the complications have been caused by glog getting linked twice - > > > once > > > > > > statically (possibly into libkudu.so), and once dynamically (via > > > > everyone > > > > > > else). > > > > > > > > > > > > > > > > In libkudu_client.so, we use a linker script to ensure that we > don't > > > leak > > > > > glog/gflags/etc symbols. Those are all listed as 'local' in > > > > > src/kudu/client/symbols.map. We also have a unit test > > > > > 'client_symbol-test.sh' which uses nm to dump the list of symbols > and > > > > make > > > > > sure that they all non-local non-weak symbols are under the > 'kudu::' > > > > > namespace. > > > > > > > > > > So it's possible that something's getting linked twice but I'd be > > > > somewhat > > > > > surprised if it's from the Kudu client. > > > > > > > > > > > > > > Good to know, thanks. > > > > > > > > ASAN hasn't turned up anything yet - so does anyone have an opinion > > about > > > > changing Jenkins' linking strategy for now? > > > > > > > > > > > > > -Todd > > > > > > > > > > > > > > > > > > > > > > I would think that could lead to one or both of the issues you > > linked > > > > to. > > > > > > > > > > > > > > > > > > On 24 July 2017 at 13:39, Todd Lipcon <[email protected]> wrote: > > > > > > > > > > > > > Is it possible that the issue here is due to a "one definition > > > rule" > > > > > > > violation? eg something like > > > > > > > https://github.com/google/sanitizers/wiki/AddressSanitizerOn > > > > > > > eDefinitionRuleViolation > > > > > > > Another similar thing is described here: > > > > > > > https://github.com/google/sanitizers/wiki/AddressSanitizerIn > > > > > > > itializationOrderFiasco > > > > > > > > > > > > > > ASAN with the appropriate flags might help expose if one of the > > > above > > > > > is > > > > > > > related. > > > > > > > > > > > > > > I wonder whether it is a kind of coincidence that it is fine > in a > > > > > static > > > > > > > build but causes problems in dynamic, and at some point the > > static > > > > link > > > > > > > order may slightly shift, causing another new subtle bug. > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Jul 24, 2017 at 1:22 PM, Henry Robinson < > > [email protected]> > > > > > > wrote: > > > > > > > > > > > > > > > We've started seeing isolated incidences of IMPALA-5702 > during > > > > GVOs, > > > > > > > where > > > > > > > > a custom cluster test fails by throwing an exception during > > > locale > > > > > > > > handling. > > > > > > > > > > > > > > > > I've been able to reproduce this locally, but only with > shared > > > > > linking > > > > > > > > enabled (which makes sense since the issue is symptomatic of > a > > > > global > > > > > > > c'tor > > > > > > > > not getting called the right number of times). > > > > > > > > > > > > > > > > It's probable that my patch for IMPALA-5659 exposed this > (since > > > it > > > > > > > forced a > > > > > > > > more correct linking strategy for thirdparty libraries when > > > dynamic > > > > > > > linking > > > > > > > > was enabled), but it looks to me at first glance like there > > were > > > > > latent > > > > > > > > dynamic linking bugs that we weren't getting hit by. Fixing > > > > > IMPALA-5702 > > > > > > > > will probably take a while, and I don't think we should hold > up > > > > GVOs > > > > > or > > > > > > > put > > > > > > > > them at risk. > > > > > > > > > > > > > > > > So there are two options: > > > > > > > > > > > > > > > > 1. Revert IMPALA-5659 > > > > > > > > > > > > > > > > 2. Switch GVO to static linking > > > > > > > > > > > > > > > > IMPALA-5659 is important to commit the kudu util library, > which > > > is > > > > > > needed > > > > > > > > for the KRPC work. Without it, shared linking doesn't work > *at > > > all* > > > > > > when > > > > > > > > the kudu util library is committed. > > > > > > > > > > > > > > > > Static linking doesn't take much longer in my unscientific > > > > > > measurements, > > > > > > > > and is closer to how Impala is actually used. In the interest > > of > > > > > > forward > > > > > > > > progress I'd like to try switching ubuntu-14.04-from-scratch > to > > > use > > > > > > > static > > > > > > > > linking while I work on IMPALA-5702. > > > > > > > > > > > > > > > > What does everyone else think? > > > > > > > > > > > > > > > > Henry > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Todd Lipcon > > > > > > > Software Engineer, Cloudera > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Todd Lipcon > > > > > Software Engineer, Cloudera > > > > > > > > > > > > > > >
