James DeFelice created MESOS-9590:
-------------------------------------
Summary: Mesos CI sometimes, incorrectly, overwrites
already-pushed mesos master nightly images with new images built from
non-master branches.
Key: MESOS-9590
URL: https://issues.apache.org/jira/browse/MESOS-9590
Project: Mesos
Issue Type: Bug
Reporter: James DeFelice
Assignee: Jie Yu
I pulled image mesos/mesos-centos:master-2019-02-15 some time on the 15th and
worked with it locally, on my laptop, for about a week. Part of that work
included downloading the related mesos-xxx-devel.rpm from the same CI build
that produced the image so that I could build 3rd party mesos modules from the
master base image. The rpm was labeled as pre-1.8.0.
This worked great until I tried to repeat the work on another machine. The
other machine pulled the "same" Docker Hub image
(mesos/mesos-centos:master-2019-02-15), which had somehow been built with a
mesos-xxx.rpm labeled as pre-1.7.2. I couldn't build my docker image on this
changed base because the mesos-xxx-devel.rpm I had hardcoded into the
Dockerfile no longer matched the version of the mesos RPM shipped in the
base image.
The base image had changed under the same tag, such that the mesos RPM version
went from pre-1.8.0 to pre-1.7.2. This should never happen.
[~jieyu] investigated and found that the problem appears to happen at random.
Current thinking is that one of the mesos CI boxes uses a version of git that's
too old, and that the CI scripts incorrectly ignore the resulting failure: the
git command fails because the git version is too old, and the script ignores
any failure from the command pipeline in which that command runs. As a result,
the "version" of the branch being built cannot be detected and defaults to
master, overwriting *actual* master image builds.
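
To illustrate the suspected failure mode, here is a minimal Python sketch. It is
not the actual Mesos CI script: the function names, the (exit_code, stdout)
command shape, the exit code 129, and the branch name "1.7.x" are all
hypothetical and chosen only for illustration. It contrasts a tag computation
that ignores a failed branch lookup with one that refuses to continue.

```python
from typing import Callable, Tuple

# A "command" is any callable returning (exit_code, stdout). In a real CI
# script this would be a git invocation; here it is a hypothetical stand-in.
Command = Callable[[], Tuple[int, str]]


def tag_ignoring_failure(detect_branch: Command, date: str) -> str:
    """Suspected buggy pattern: a failed lookup silently becomes 'master'."""
    _code, out = detect_branch()       # exit code is never checked
    branch = out.strip() or "master"   # empty output falls back to master
    return f"{branch}-{date}"


def tag_strict(detect_branch: Command, date: str) -> str:
    """Safer pattern: refuse to produce a tag when the lookup fails."""
    code, out = detect_branch()
    branch = out.strip()
    if code != 0 or not branch:
        raise RuntimeError(f"branch detection failed (exit code {code})")
    return f"{branch}-{date}"


def old_git() -> Tuple[int, str]:
    # Hypothetical: a git command that fails on an outdated git version
    # and therefore prints nothing.
    return 129, ""


def working_git() -> Tuple[int, str]:
    # Hypothetical: the same command succeeding on a non-master branch.
    return 0, "1.7.x\n"


if __name__ == "__main__":
    # The non-master build is mislabeled as the master nightly.
    print(tag_ignoring_failure(old_git, "2019-02-15"))
    # With a working git the branch name ends up in the tag.
    print(tag_ignoring_failure(working_git, "2019-02-15"))
    try:
        tag_strict(old_git, "2019-02-15")
    except RuntimeError as err:
        print(f"build aborted: {err}")
```

With the strict variant, a CI box with a broken git would fail its build rather
than push an image tagged as master.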
[~jieyu] also wrote some patches, which I'll link here:
* https://reviews.apache.org/r/70024/
* https://reviews.apache.org/r/70025/