The UBSAN pre-commit jobs have been running successfully for a while now. I propose removing them from the pre-commit job to save on build time and costs. I've setup a separate Jenkins job that runs BE, FE, JBDC, and custom cluster tests against an UBSAN build, on a daily basis.
On Mon, Nov 12, 2018 at 1:53 PM Philip Zeyliger <phi...@cloudera.com> wrote: > For sharding in #4, you might be interested in what test-with-docker.py > does. It already has some ASAN support, I think, so maybe adding UBSAN will > be pretty straight-forward. In the screenshot below, you can see that I > sharded CLUSTER_TEST and EE_TEST into two halves. It seems to work ok. > > [image: image.png] > > But, anyway, it sounds like it's easiest to continue independently, and if > we find occasion to merge, we can do so later. > > -- Philip > > On Mon, Nov 12, 2018 at 1:45 PM Jim Apple <jbap...@cloudera.com> wrote: > >> My only concern with this is the unfinished nature of the UBSAN work. Here >> are some things left to do: >> >> 1. Make UBSAN builds work with FE and JDBC tests. These aren't UBSAN >> unclean -- they plain don't work. >> 2. Make e2e, custom cluster, and BE tests UBSAN clean. This is in >> progress. >> 3. Make e2e tests pass with -full_ubsan. This turns UBSAN on during >> codegen, and it seems to break some e2e tests. >> 4. Give -full_ubsan a tolerable run time, probably by sharding the tests. >> Right now the core suite takes 7 or 8 hours. >> >> If you get centos6 up and working as part of pre-merge, I'm comfortable >> using it as a platform to test UBSAN on as we build more UBSAN coverage. >> I'm also happy to try and keep any sharding in #4 distribution-agnostic so >> that if UBSAN goes up first, you can easily port it to centos6. With those >> two in place, my feeling is that the two improvements we want to make >> "commute", in that whichever one we do first, the work and result should >> be >> similar. >> >> What do you think? >> >> On Mon, Nov 12, 2018 at 1:26 PM Philip Zeyliger <phi...@cloudera.com> >> wrote: >> >> > On Mon, Nov 12, 2018 at 1:21 PM Jim Apple <jbap...@cloudera.com> wrote: >> > >> > > I don't think I understand the "one stone" part - are you suggesting >> that >> > > we do UBSAN testing within a centos6 container? >> > > >> > >> > Exactly. If we're doing multiple builds, we may as well be mutating >> other >> > variables to get coverage from them. (Here, you're mutating "build >> type", >> > and I'm proposing we also mutate "base OS" to get additional coverage.) >> > >> > -- Philip >> > >> > >> > >> > > >> > > On Mon, Nov 12, 2018 at 1:01 PM Philip Zeyliger <phi...@cloudera.com> >> > > wrote: >> > > >> > > > Seems useful to me. >> > > > >> > > > If you're interested, we could kill multiple birds with one stone. >> > > > Specifically, I'm also interested in centos6/rh6 pre-merge testing. >> > There >> > > > are a variety of ways to do so, including running with >> test-with-docker >> > > > stuff. I recognize it's more work, but happy to help if you want to >> try >> > > it. >> > > > >> > > > -- Philip >> > > > >> > > > On Sat, Nov 10, 2018 at 11:10 PM Jim Apple <jbap...@cloudera.com> >> > wrote: >> > > > >> > > > > C++ has some constructs that have undefined behavior. Shall we >> test >> > for >> > > > > this during pre-merge testing? >> > > > > >> > > > > When the behavior of C++ code is formally "undefined" by the >> > standard, >> > > > > compilers can behave erratically, like not taking either branch >> of a >> > > > > if/else statement. This can be reproduced in the wild. The >> standard >> > > > itself >> > > > > notes: >> > > > > >> > > > > "Using a bool value in ways described by this International >> Standard >> > as >> > > > > 'undefined,' such as by examining the value of an uninitialized >> > > automatic >> > > > > object, might cause it to behave as if it is neither true nor >> false." >> > > > > >> > > > > Clang has a checker for this called UBSAN, and, after some effort, >> > the >> > > > data >> > > > > loading part of our build is now UBSAN-clean. I'm suggesting we >> add >> > > that >> > > > > test to the pre-merge testing. I'm happy to handle the details. >> > > > > >> > > > > When it fails, the output will look something like this: >> > > > > >> > > > > >> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3573/console >> > > > > >> > > > > exprs/math-functions-ir.cc:405:13: runtime error: signed integer >> > > > overflow: >> > > > > 4738381338321616896 * 36 cannot be represented in type 'long' >> > > > > runtime/decimal-value.inline.h:254:17: runtime error: signed >> integer >> > > > > overflow: 0x4b3b4ca85a86c47a098a223fffffffff + >> > > > > 0x4b3b4ca85a86c47a098a223fffffffff cannot be represented in type >> > > > '__int128' >> > > > > runtime/row-batch-serialize-test.cc:243:18: runtime error: >> variable >> > > > length >> > > > > array bound evaluates to non-positive value 0 >> > > > > >> > > > >> > > >> > >> > -- Sahil Takiar Software Engineer takiar.sa...@gmail.com | (510) 673-0309