The UBSAN pre-commit jobs have been running successfully for a while now. I
propose removing them from the pre-commit job to save on build time and
costs. I've setup a separate Jenkins job that runs BE, FE, JBDC, and custom
cluster tests against an UBSAN build, on a daily basis.

On Mon, Nov 12, 2018 at 1:53 PM Philip Zeyliger <phi...@cloudera.com> wrote:

> For sharding in #4, you might be interested in what test-with-docker.py
> does. It already has some ASAN support, I think, so maybe adding UBSAN will
> be pretty straight-forward.  In the screenshot below, you can see that I
> sharded CLUSTER_TEST and EE_TEST into two halves. It seems to work ok.
>
> [image: image.png]
>
> But, anyway, it sounds like it's easiest to continue independently, and if
> we find occasion to merge, we can do so later.
>
> -- Philip
>
> On Mon, Nov 12, 2018 at 1:45 PM Jim Apple <jbap...@cloudera.com> wrote:
>
>> My only concern with this is the unfinished nature of the UBSAN work. Here
>> are some things left to do:
>>
>> 1. Make UBSAN builds work with FE and JDBC tests. These aren't UBSAN
>> unclean -- they plain don't work.
>> 2. Make e2e, custom cluster, and BE tests UBSAN clean. This is in
>> progress.
>> 3. Make e2e tests pass with -full_ubsan. This turns UBSAN on during
>> codegen, and it seems to break some e2e tests.
>> 4. Give -full_ubsan a tolerable run time, probably by sharding the tests.
>> Right now the core suite takes 7 or 8 hours.
>>
>> If you get centos6 up and working as part of pre-merge, I'm comfortable
>> using it as a platform to test UBSAN on as we build more UBSAN coverage.
>> I'm also happy to try and keep any sharding in #4 distribution-agnostic so
>> that if UBSAN goes up first, you can easily port it to centos6. With those
>> two in place, my feeling is that the two improvements we want to make
>> "commute", in that whichever one we do first, the work and result should
>> be
>> similar.
>>
>> What do you think?
>>
>> On Mon, Nov 12, 2018 at 1:26 PM Philip Zeyliger <phi...@cloudera.com>
>> wrote:
>>
>> > On Mon, Nov 12, 2018 at 1:21 PM Jim Apple <jbap...@cloudera.com> wrote:
>> >
>> > > I don't think I understand the "one stone" part - are you suggesting
>> that
>> > > we do UBSAN testing within a centos6 container?
>> > >
>> >
>> > Exactly. If we're doing multiple builds, we may as well be mutating
>> other
>> > variables to get coverage from them. (Here, you're mutating "build
>> type",
>> > and I'm proposing we also mutate "base OS" to get additional coverage.)
>> >
>> > -- Philip
>> >
>> >
>> >
>> > >
>> > > On Mon, Nov 12, 2018 at 1:01 PM Philip Zeyliger <phi...@cloudera.com>
>> > > wrote:
>> > >
>> > > > Seems useful to me.
>> > > >
>> > > > If you're interested, we could kill multiple birds with one stone.
>> > > > Specifically, I'm also interested in centos6/rh6 pre-merge testing.
>> > There
>> > > > are a variety of ways to do so, including running with
>> test-with-docker
>> > > > stuff. I recognize it's more work, but happy to help if you want to
>> try
>> > > it.
>> > > >
>> > > > -- Philip
>> > > >
>> > > > On Sat, Nov 10, 2018 at 11:10 PM Jim Apple <jbap...@cloudera.com>
>> > wrote:
>> > > >
>> > > > > C++ has some constructs that have undefined behavior. Shall we
>> test
>> > for
>> > > > > this during pre-merge testing?
>> > > > >
>> > > > > When the behavior of C++ code is formally "undefined" by the
>> > standard,
>> > > > > compilers can behave erratically, like not taking either branch
>> of a
>> > > > > if/else statement. This can be reproduced in the wild. The
>> standard
>> > > > itself
>> > > > > notes:
>> > > > >
>> > > > > "Using a bool value in ways described by this International
>> Standard
>> > as
>> > > > > 'undefined,' such as by examining the value of an uninitialized
>> > > automatic
>> > > > > object, might cause it to behave as if it is neither true nor
>> false."
>> > > > >
>> > > > > Clang has a checker for this called UBSAN, and, after some effort,
>> > the
>> > > > data
>> > > > > loading part of our build is now UBSAN-clean. I'm suggesting we
>> add
>> > > that
>> > > > > test to the pre-merge testing. I'm happy to handle the details.
>> > > > >
>> > > > > When it fails, the output will look something like this:
>> > > > >
>> > > > >
>> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3573/console
>> > > > >
>> > > > > exprs/math-functions-ir.cc:405:13: runtime error: signed integer
>> > > > overflow:
>> > > > > 4738381338321616896 * 36 cannot be represented in type 'long'
>> > > > > runtime/decimal-value.inline.h:254:17: runtime error: signed
>> integer
>> > > > > overflow: 0x4b3b4ca85a86c47a098a223fffffffff +
>> > > > > 0x4b3b4ca85a86c47a098a223fffffffff cannot be represented in type
>> > > > '__int128'
>> > > > > runtime/row-batch-serialize-test.cc:243:18: runtime error:
>> variable
>> > > > length
>> > > > > array bound evaluates to non-positive value 0
>> > > > >
>> > > >
>> > >
>> >
>>
>

-- 
Sahil Takiar
Software Engineer
takiar.sa...@gmail.com | (510) 673-0309

Reply via email to