For sharding in #4, you might be interested in what test-with-docker.py
does. It already has some ASAN support, I think, so maybe adding UBSAN will
be pretty straight-forward.  In the screenshot below, you can see that I
sharded CLUSTER_TEST and EE_TEST into two halves. It seems to work ok.

[image: image.png]

But, anyway, it sounds like it's easiest to continue independently, and if
we find occasion to merge, we can do so later.

-- Philip

On Mon, Nov 12, 2018 at 1:45 PM Jim Apple <jbap...@cloudera.com> wrote:

> My only concern with this is the unfinished nature of the UBSAN work. Here
> are some things left to do:
>
> 1. Make UBSAN builds work with FE and JDBC tests. These aren't UBSAN
> unclean -- they plain don't work.
> 2. Make e2e, custom cluster, and BE tests UBSAN clean. This is in progress.
> 3. Make e2e tests pass with -full_ubsan. This turns UBSAN on during
> codegen, and it seems to break some e2e tests.
> 4. Give -full_ubsan a tolerable run time, probably by sharding the tests.
> Right now the core suite takes 7 or 8 hours.
>
> If you get centos6 up and working as part of pre-merge, I'm comfortable
> using it as a platform to test UBSAN on as we build more UBSAN coverage.
> I'm also happy to try and keep any sharding in #4 distribution-agnostic so
> that if UBSAN goes up first, you can easily port it to centos6. With those
> two in place, my feeling is that the two improvements we want to make
> "commute", in that whichever one we do first, the work and result should be
> similar.
>
> What do you think?
>
> On Mon, Nov 12, 2018 at 1:26 PM Philip Zeyliger <phi...@cloudera.com>
> wrote:
>
> > On Mon, Nov 12, 2018 at 1:21 PM Jim Apple <jbap...@cloudera.com> wrote:
> >
> > > I don't think I understand the "one stone" part - are you suggesting
> that
> > > we do UBSAN testing within a centos6 container?
> > >
> >
> > Exactly. If we're doing multiple builds, we may as well be mutating other
> > variables to get coverage from them. (Here, you're mutating "build type",
> > and I'm proposing we also mutate "base OS" to get additional coverage.)
> >
> > -- Philip
> >
> >
> >
> > >
> > > On Mon, Nov 12, 2018 at 1:01 PM Philip Zeyliger <phi...@cloudera.com>
> > > wrote:
> > >
> > > > Seems useful to me.
> > > >
> > > > If you're interested, we could kill multiple birds with one stone.
> > > > Specifically, I'm also interested in centos6/rh6 pre-merge testing.
> > There
> > > > are a variety of ways to do so, including running with
> test-with-docker
> > > > stuff. I recognize it's more work, but happy to help if you want to
> try
> > > it.
> > > >
> > > > -- Philip
> > > >
> > > > On Sat, Nov 10, 2018 at 11:10 PM Jim Apple <jbap...@cloudera.com>
> > wrote:
> > > >
> > > > > C++ has some constructs that have undefined behavior. Shall we test
> > for
> > > > > this during pre-merge testing?
> > > > >
> > > > > When the behavior of C++ code is formally "undefined" by the
> > standard,
> > > > > compilers can behave erratically, like not taking either branch of
> a
> > > > > if/else statement. This can be reproduced in the wild. The standard
> > > > itself
> > > > > notes:
> > > > >
> > > > > "Using a bool value in ways described by this International
> Standard
> > as
> > > > > 'undefined,' such as by examining the value of an uninitialized
> > > automatic
> > > > > object, might cause it to behave as if it is neither true nor
> false."
> > > > >
> > > > > Clang has a checker for this called UBSAN, and, after some effort,
> > the
> > > > data
> > > > > loading part of our build is now UBSAN-clean. I'm suggesting we add
> > > that
> > > > > test to the pre-merge testing. I'm happy to handle the details.
> > > > >
> > > > > When it fails, the output will look something like this:
> > > > >
> > > > >
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3573/console
> > > > >
> > > > > exprs/math-functions-ir.cc:405:13: runtime error: signed integer
> > > > overflow:
> > > > > 4738381338321616896 * 36 cannot be represented in type 'long'
> > > > > runtime/decimal-value.inline.h:254:17: runtime error: signed
> integer
> > > > > overflow: 0x4b3b4ca85a86c47a098a223fffffffff +
> > > > > 0x4b3b4ca85a86c47a098a223fffffffff cannot be represented in type
> > > > '__int128'
> > > > > runtime/row-batch-serialize-test.cc:243:18: runtime error: variable
> > > > length
> > > > > array bound evaluates to non-positive value 0
> > > > >
> > > >
> > >
> >
>

Reply via email to