Having looked at the failing build Jim referenced, the failure appears to be security-related: RPC connection negotiation with the Kudu master is being rejected. The following is from the Kudu master's log during its startup sequence (see https://jenkins.impala.io/job/ubuntu-18.04-from-scratch/16/artifact/Impala/logs_static/logs/cluster/cdh6-node-1/kudu/master/kudu-master.INFO/*view*/ ), in the context of an Impala minicluster:

I0612 04:12:56.129866 8515 sys_catalog.cc:424] T 00000000000000000000000000000000 P 58a05ce6efa74b30907ac4d679bd0515 [sys.catalog]: configured and running, proceeding with master startup.
W0612 04:12:56.130080 8522 catalog_manager.cc:1113] T 00000000000000000000000000000000 P 58a05ce6efa74b30907ac4d679bd0515: acquiring CA information for follower catalog manager: Not found: root CA entry not found
W0612 04:12:56.130123 8522 catalog_manager.cc:596] Not found: root CA entry not found: failed to prepare follower catalog manager, will retry
I0612 04:12:56.130151 8521 catalog_manager.cc:1055] Loading table and tablet metadata into memory...
I0612 04:12:56.130228 8521 catalog_manager.cc:1066] Initializing Kudu internal certificate authority...
W0612 04:12:56.167639 8636 negotiation.cc:320] Unauthorized connection attempt: Server connection negotiation failed: server connection from 127.0.0.1:50174: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:12:56.170145 8636 negotiation.cc:320] Unauthorized connection attempt: Server connection negotiation failed: server connection from 127.0.0.1:50176: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:12:56.172571 8636 negotiation.cc:320] Unauthorized connection attempt: Server connection negotiation failed: server connection from 127.0.0.1:50178: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:12:56.182530 8636 negotiation.cc:320] Unauthorized connection attempt: Server connection negotiation failed: server connection from 127.0.0.1:50180: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:12:56.185034 8636 negotiation.cc:320] Unauthorized connection attempt: Server connection negotiation failed: server connection from 127.0.0.1:50182: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:12:56.187453 8636 negotiation.cc:320] Unauthorized connection attempt: Server connection negotiation failed: server connection from 127.0.0.1:50184: expected TLS_HANDSHAKE step: SASL_INITIATE
I0612 04:12:56.197146 8521 catalog_manager.cc:950] Generated new certificate authority record
I0612 04:12:56.198005 8521 catalog_manager.cc:1075] Loading token signing keys...
W0612 04:12:56.293697 8636 negotiation.cc:320] Unauthorized connection attempt: Server connection negotiation failed: server connection from 127.0.0.1:50186: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:12:56.295320 8636 negotiation.cc:320] Unauthorized connection attempt: Server connection negotiation failed: server connection from 127.0.0.1:50188: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:12:56.296821 8636 negotiation.cc:320] Unauthorized connection attempt: Server connection negotiation failed: server connection from 127.0.0.1:50190: expected TLS_HANDSHAKE step: SASL_INITIATE
I0612 04:12:56.416918 8521 catalog_manager.cc:4292] T 00000000000000000000000000000000 P 58a05ce6efa74b30907ac4d679bd0515: Generated new TSK 0
W0612 04:12:57.174684 8901 negotiation.cc:320] Unauthorized connection attempt: Server connection negotiation failed: server connection from 127.0.0.1:50192: expected TLS_HANDSHAKE step: SASL_INITIATE
[and so on...]

The same run has very similar messages in the tablet server logs as well:

I0612 04:12:56.289767 8396 rpc_server.cc:205] RPC server started. Bound to: 127.0.0.1:31202
I0612 04:12:56.289903 8396 webserver.cc:308] Webserver started at http://0.0.0.0:31302/ using document root /home/ubuntu/Impala/toolchain/cdh_components-1137441/kudu-1.10.0-cdh6.x-SNAPSHOT/release/bin/../lib/kudu/www and password file <none>
W0612 04:12:56.293773 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (0 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:12:56.296866 8897 heartbeater.cc:380] Failed 3 heartbeats in a row: no longer allowing fast heartbeat attempts.
W0612 04:13:56.424613 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (62 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:14:56.556850 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (122 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:15:56.694403 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (182 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:16:56.826400 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (242 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:17:56.955927 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (302 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:18:57.103503 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (362 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:19:57.237712 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (422 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:20:57.393489 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (482 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:21:57.522513 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (542 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:22:57.652271 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (602 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:23:57.782537 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (662 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE
W0612 04:24:57.910481 8897 heartbeater.cc:587] Failed to heartbeat to 127.0.0.1:7051 (722 consecutive failures): Not authorized: Failed to ping master at 127.0.0.1:7051: Client connection negotiation failed: client connection to 127.0.0.1:7051: FATAL_UNAUTHORIZED: Not authorized: expected TLS_HANDSHAKE step: SASL_INITIATE

On Mon, Jun 17, 2019 at 9:08 PM Todd Lipcon <[email protected]> wrote:

> On Sat, Jun 15, 2019 at 2:20 PM Jim Apple <[email protected]> wrote:
>
> > My goal is to have Impala keep up with (what I perceive to be) the most
> > popular version of the most popular Linux distribution, for the purpose of
> > easing the workflow of developers, especially new developers.
> >
>
> Sure, that makes sense. I use Ubuntu 18 myself, but tend to develop Impala
> on a remote box running el7 because the dev environment is too heavy-weight
> to realistically run on my laptop.
>
> >
> > 18.04 stopped being able to load data some time between June 9th and
> > https://jenkins.impala.io/job/ubuntu-18.04-from-scratch/14/ and June 12 and
> > https://jenkins.impala.io/job/ubuntu-18.04-from-scratch/16/artifact/Impala/logs_static/logs/data_loading/catalogd.ERROR/*view*/ .
> > I tried reproducing the June 9 run with the same git checkouts (Impala and
> > Impala-LZO) as #14 today, and data loading still failed.
> >
> > What RHEL 7 components did you have in mind that are closer to Ubuntu 16.04
> > than 18.04?
> >
>
> Stuff like libc, openssl, krb5, sasl, etc are pretty different
> version-wise. At least, I know when we made Kudu pass tests on Ubuntu 18,
> we dealt with issues mostly in those libraries, which aren't part of the
> toolchain (for security reasons we rely on OS-provided libs).
>
> Generally I think precommit running on something closer to the oldest
> supported OS is better than running on the newest, since it's more likely
> that new OSes are backward-compatible. Otherwise it's very easy to
> introduce code that uses features not available on el7, for example.
>
> >
> > On Wed, May 22, 2019 at 10:41 AM Todd Lipcon <[email protected]> wrote:
> >
> > > On Mon, May 20, 2019 at 8:36 PM Jim Apple <[email protected]> wrote:
> > >
> > > > Maybe now would be a good time to implement Everblue jobs that ping dev@
> > > > when they fail. Thoughts?
> > > >
> > >
> > > Mixed feelings on that. We already get many test runs per day of the
> > > "default" config because people are running precommit builds. Adding an
> > > additional cron-based job to the mix that runs the same builds doesn't seem
> > > like it adds much unless it tests some other config (eg Ubuntu 18 or a
> > > longer suite of tests). One thing I could get on board with would be
> > > switching the precommit builds to run just "core" tests or some other
> > > faster subset, and defer the exhaustive/long runs to scheduled builds or as
> > > an optional precommit for particularly invasive patches. I think that would
> > > increase dev quality of life substantially (I find my productivity is often
> > > hampered by only getting two shots at a precommit run per work day).
> > >
> > > I'm not against adding a cron-triggered full test/build on Ubuntu 18, but
> > > would like to know if someone plans to sign up to triage it when it fails.
> > > My experience with other Apache communities is that collective ownership
> > > over test triage duty (ie "email the dev list on failure") doesn't work. I
> > > seem to recall we had such builds back in 2010 or so on Hadoop and they
> > > just always got ignored. In various "day job" teams I've seen this work via
> > > a prescriptive rotation ("all team members take a triage/build-cop shift")
> > > but that's not really compatible with the nature of Apache projects being
> > > volunteer communities.
> > >
> > > So, I think I'll put the question back to you: as a committer you can spend
> > > your time as you like. If you think an Ubuntu 18 job running on a schedule
> > > would be useful and are willing to sign up to triage failures, sounds great to
> > > me :) Personally I don't develop on Ubuntu 18 and in my day job it's not a
> > > particularly important deployment platform, so I personally don't think
> > > I'll spend much time triaging that build.
> > >
> > > Todd
> > >
> > > >
> > > > On Mon, May 20, 2019 at 9:09 AM Todd Lipcon <[email protected]> wrote:
> > > >
> > > > > Adding a build-only job for 18.04 makes sense to me. A full test run on
> > > > > every precommit seems a bit expensive but doing one once a week or
> > > > > something like that might be a good idea to prevent runtime regressions.
> > > > >
> > > > > As for switching the precommit from 16.04 to 18.04, I'd lean towards
> > > > > keeping to 16.04 due to it being closer in terms of component versions to
> > > > > common enterprise distros like RHEL 7.
> > > > >
> > > > > -Todd
> > > > >
> > > > > On Sun, May 19, 2019 at 5:03 PM Jim Apple <[email protected]> wrote:
> > > > >
> > > > > > HEAD now passes on Ubuntu 18.04:
> > > > > >
> > > > > > https://jenkins.impala.io/job/ubuntu-18.04-from-scratch/
> > > > > >
> > > > > > Thanks to the community members who have made this happen!
> > > > > >
> > > > > > Should we add Ubuntu 18.04 to our pre-merge Jenkins job, replace 16.04 with
> > > > > > 18.04 in our pre-merge Jenkins job, or neither?
> > > > > >
> > > > > > I propose adding 18.04 for now (and so running both 16.04 and 18.04 on
> > > > > > merge) and removing 16.04 when it starts to become inconvenient.
> > > > > >
> > > > >
> > > > > --
> > > > > Todd Lipcon
> > > > > Software Engineer, Cloudera
> > > > >
> > > >
> > >
> > > --
> > > Todd Lipcon
> > > Software Engineer, Cloudera
> > >
> >
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
