Hi Jess,

It’s not clear that the “Purge unused docker manifests and images” task is the 
culprit.  If I understand the doc correctly, it’s only supposed to delete 
images that are no longer associated with any tags.  I still see plenty of 
time-stamped SNAPSHOT docker images from a couple of weeks ago 
(e.g. onap/cli:2.0.0-SNAPSHOT-20180425T130053Z).

Within the past couple of hours, onap/dmaap/dmaap-mr:1.1.4 has disappeared.  
See 
https://jenkins.onap.org/job/integration-master-version-manifest-verify-java/182/.
  This broke the deployments that were running around that time.
Fortunately, I had a local docker cache where I could pull a copy of 
onap/dmaap/dmaap-mr:1.1.4, and I found that this image has the same sha256 hash 
as the onap/dmaap/dmaap-mr:latest image currently on nexus3.  So the image is 
still there, but just the tag is missing.  Maybe this is another clue to help 
narrow down the cause.

Thanks,
Gary


From: [email protected] 
[mailto:[email protected]] On Behalf Of Jessica Wagantall
Sent: Thursday, May 10, 2018 2:55 PM
To: [email protected]; onap-release <[email protected]>; 
Gildas Lanilis <[email protected]>; Jeremy Phelps 
<[email protected]>; Kenny Paul <[email protected]>; Anil 
Belur <[email protected]>
Subject: [onap-discuss] Nexus 3 images "disappearing" from the server

Dear ONAP team,

As mentioned today by Helen on the TSC call, it seems that we have experienced
an issue with Nexus3 where dependency images "disappear" for a while and then
come back.

I have been investigating this issue a little more closely; let me try to 
explain what I think is happening, using Gary's example.

In his case, onap/aaf/aaf_cm/manifests/2.1.0-SNAPSHOT 
(https://nexus3.onap.org/repository/docker.snapshot/v2/onap/aaf/aaf_cm/manifests/2.1.0-SNAPSHOT)
(among other AAF images) disappeared on the 9th of May and re-appeared a few 
hours later the same day.
Looking at the job that pushed this image, 
https://jenkins.onap.org/view/aaf/job/aaf-authz-master-docker-java-shell-daily/,
it seems the AAF images were successfully pushed on the 7th and the 9th, but 
the push failed on the 8th.

At the same time, I believe this rule kicked in:
[inline screenshot of the scheduled "Purge unused docker manifests and images" task configuration]

This rule appears to scan for dependencies and, every day, remove any snapshot 
image that is not referenced by anything:
https://help.sonatype.com/repomanager3/configuration/system-configuration#SystemConfiguration-TypesofTasksandWhentoUseThem
There is not much configuration on this rule that would explain what it is 
actually looking for, but I believe this is the cause.

This rule might have removed the AAF image pushed on the 7th and, since the 
AAF Jenkins job failed to push a new image on the 8th, the image may have been 
missing for some time. Then the job that ran on the 9th brought it back.

So, here are my suggestions:
- We need to make sure our daily jobs are healthy and address any failures on a 
daily basis to avoid this issue in the future.
- Keeping this rule in place helps us keep disk usage stable. If we were to 
remove it, we would have greater issues to address.
- I have confirmed with Andy that we prefer keeping this known configuration in 
place to avoid disk usage issues.

Let me know if my explanation was clear, and we can see whether keeping an eye 
on the dailies helps us reduce these occurrences.
Thanks a ton!
Jess

_______________________________________________
onap-discuss mailing list
[email protected]
https://lists.onap.org/mailman/listinfo/onap-discuss
