Gary,

It seems the new image for the multicloud/ocata project hasn’t been created [1]:
10:03:50 [WARN]  onap/msb/msb_discovery:1.1.0-STAGING-latest not released
10:03:51 [WARN]  onap/multicloud/framework:1.1.2-STAGING not released
10:03:52 [WARN]  onap/multicloud/openstack-newton:1.1.2-SNAPSHOT not released
10:03:52 [ERROR] onap/multicloud/openstack-ocata:1.1.2-SNAPSHOT not found
10:03:54 [WARN]  onap/multicloud/openstack-windriver:1.1.2-SNAPSHOT not released
10:03:55 [WARN]  onap/multicloud/vio:1.1.2-STAGING not released

I modified the Dockerfile to include a new variable [2] required for the HPA CSIT, and I forgot to provide a value for it. Anyway, this other patch [3] provides a default value for it and fixes the image-creation problem in the Dockerfile.

Regards,
Victor Morales

[1] http://12.234.32.117/jenkins/job/nexus3-docker-image-check/70/console
[2] https://gerrit.onap.org/r/#/c/46711/
[3] https://gerrit.onap.org/r/#/c/47151/2/ocata/docker/Dockerfile@13

From: <[email protected]> on behalf of Gary Wu <[email protected]>
Date: Saturday, May 12, 2018 at 8:01 AM
To: Jessica Wagantall <[email protected]>
Cc: Jeremy Phelps <[email protected]>, onap-release <[email protected]>, Anil Belur <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: [onap-discuss] Nexus 3 images "disappearing" from the server

One more clue: as of right now, onap/multicloud/openstack-newton:1.1.2-SNAPSHOT has reappeared, possibly due to a build. If we compare the image metadata for onap/multicloud/openstack-newton between 1.1.2-SNAPSHOT and latest:

Component Name     onap/multicloud/openstack-newton         onap/multicloud/openstack-newton
Component Version  1.1.2-SNAPSHOT                           latest
Blob created       Sat May 12 2018 06:09:29 GMT-0700 (PDT)  Wed Sep 20 2017 18:38:45 GMT-0700 (PDT)
Blob updated       Sat May 12 2018 06:09:29 GMT-0700 (PDT)  Sat May 12 2018 06:09:29 GMT-0700 (PDT)

The system seems to think that latest is an update to an existing tag, whereas 1.1.2-SNAPSHOT is a brand-new tag.
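Gary's tag comparison above can also be done by manifest digest: the Docker Registry v2 API returns the manifest digest in the Docker-Content-Digest response header, and two tags with equal digests name the same image. A minimal sketch in Python (the registry host and the use of a HEAD request here are assumptions for illustration, not the actual tooling used in the thread):

```python
# Sketch: compare two tags by manifest digest via the Docker Registry v2 API.
# The registry returns the digest in the Docker-Content-Digest header, so two
# tags with equal digests point at the same image blob.
import urllib.request

MANIFEST_V2 = "application/vnd.docker.distribution.manifest.v2+json"

def manifest_url(registry, image, tag):
    """Registry v2 endpoint for an image tag's manifest."""
    return "https://{}/v2/{}/manifests/{}".format(registry, image, tag)

def tag_digest(registry, image, tag):
    """HEAD the manifest endpoint and return its sha256 digest (network call)."""
    req = urllib.request.Request(
        manifest_url(registry, image, tag),
        method="HEAD",
        headers={"Accept": MANIFEST_V2},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.headers["Docker-Content-Digest"]

def same_image(digest_a, digest_b):
    """Two tags name the same image iff their manifest digests match."""
    return digest_a is not None and digest_a == digest_b
```

Comparing `tag_digest(...)` for 1.1.2-SNAPSHOT and latest would show directly whether the two tags resolve to the same blob, independent of Nexus's own "Blob created/updated" bookkeeping.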
Thanks,
Gary

From: Gary Wu
Sent: Saturday, May 12, 2018 7:03 AM
To: 'Jessica Wagantall' <[email protected]>
Cc: [email protected]; [email protected]; onap-release <[email protected]>; Gildas Lanilis <[email protected]>; Jeremy Phelps <[email protected]>; Kenny Paul <[email protected]>; Anil Belur <[email protected]>
Subject: RE: Re: [onap-discuss] Nexus 3 images "disappearing" from the server

By the way, I added an hourly check of nexus3 docker images so we can better observe what’s going on: http://12.234.32.117/jenkins/job/nexus3-docker-image-check/. Currently it shows that the following have disappeared since about 7 PM PDT last night:

19:03:46 [ERROR] onap/multicloud/openstack-newton:1.1.2-SNAPSHOT not found
19:03:46 [ERROR] onap/multicloud/openstack-ocata:1.1.2-SNAPSHOT not found
19:03:47 [ERROR] onap/multicloud/openstack-windriver:1.1.2-SNAPSHOT not found

Thanks,
Gary

From: Jessica Wagantall [mailto:[email protected]]
Sent: Friday, May 11, 2018 6:00 PM
To: Gary Wu <[email protected]>
Cc: [email protected]; [email protected]; onap-release <[email protected]>; Gildas Lanilis <[email protected]>; Jeremy Phelps <[email protected]>; Kenny Paul <[email protected]>; Anil Belur <[email protected]>
Subject: Re: Re: [onap-discuss] Nexus 3 images "disappearing" from the server

Gary,

Can I get your opinion on this? Looking at dmaap, it seems that this project is using two daily templates at the same time:

- dmaap-messagerouter-docker-master-docker-java-daily: last success #266 (12 hr ago), duration 8 min 34 sec
  https://jenkins.onap.org/view/dmaap/job/dmaap-messagerouter-docker-master-docker-java-daily/
- dmaap-messagerouter-docker-master-docker-version-java-daily: last success #267 (12 hr ago), duration 9 min 48 sec
  https://jenkins.onap.org/view/dmaap/job/dmaap-messagerouter-docker-master-docker-version-java-daily/

Both seem to update a latest tag. I am not saying this is the cause, but it is definitely not the right thing to do. Right?

thanks!
Jess

On Fri, May 11, 2018 at 5:49 PM, Jessica Wagantall <[email protected]> wrote:

Hi Gary,

Just a small observation: onap/cli:2.0.0-SNAPSHOT-20180425T130053Z looks like it will be removed in 45 minutes. We have a task that runs every day at 6:30 PT that removes SNAPSHOTs older than 16 days. I am looking further into your particular case.

Thanks!
Jess

On Fri, May 11, 2018 at 5:43 PM, Jessica Wagantall <[email protected]> wrote:

Dear Michal,

In your case, which is the Jenkins job that needs these versions?

policy_handler: nexus3.onap.org:10001/onap/org.onap.dcaegen2.platform.policy-handler:2.4.1
service_change_handler: nexus3.onap.org:10001/onap/org.onap.dcaegen2.platform.servicechange-handler:1.1.3

I think in your particular case these versions may simply need to be released, and that just hasn't been requested. I can see these binary versions tagged as "...SNAPSHOT..", but they haven't been pushed into the releases repository because no one has requested it.

Let me know if this info is helpful.

thanks!
Jess

On Fri, May 11, 2018 at 2:11 PM, Jessica Wagantall <[email protected]> wrote:

Thanks Gary,

I updated it a while ago and have been using it without issue. Gildas, can we propose this upgrade to the TSC please? It is something we can easily try, with about 1 minute of downtime while the server restarts with the new version.

Thanks!
Jess

On Fri, May 11, 2018 at 1:37 PM, Gary Wu <[email protected]> wrote:

Sounds reasonable, except that I’m no longer actively using nexus3ap and may not be able to give much feedback on how well it’s working.
Thanks,
Gary

From: Jessica Wagantall [mailto:[email protected]]
Sent: Friday, May 11, 2018 1:27 PM
To: [email protected]
Cc: Gary Wu <[email protected]>; [email protected]; onap-release <[email protected]>; Gildas Lanilis <[email protected]>; Jeremy Phelps <[email protected]>; Kenny Paul <[email protected]>; Anil Belur <[email protected]>
Subject: Re: Re: [onap-discuss] Nexus 3 images "disappearing" from the server

Thanks for your input, Michal and Gary.

Gary, the case you mention makes me wonder if this is a tag problem or a problem with the way binaries are being tagged every day. I was chatting with Andy, and he mentioned that Nexus3 in general reports many docker issues. We confirmed this is not related to disk capacity, since we still have about 3 TB of free space.

I am trying something now that I can't guarantee will fix our issue, but we could at least give it a shot: I have upgraded https://nexus3ap.onap.org/ to version 3.11.0 to see how it behaves. If everything goes fine and that version is stable, we should upgrade nexus3.onap.org too, so that we are at least on the latest version.

Can I get your thoughts on this?

Thanks!
Jess

On Thu, May 10, 2018 at 11:44 PM, Michal Ptacek <[email protected]> wrote:

Hi,

Not sure if it's related, but OOM currently needs the following images for dcaegen2:

policy_handler: nexus3.onap.org:10001/onap/org.onap.dcaegen2.platform.policy-handler:2.4.1
service_change_handler: nexus3.onap.org:10001/onap/org.onap.dcaegen2.platform.servicechange-handler:1.1.3

Both of them are currently NOT available on nexus; the question is whether that relates to this problem. Should we simply start using newer images (a fix in OOM?)? I don't know if I can propose that, as I am not from the DCAE team.

thanks,
Michal

--------- Original Message ---------
Sender: Gary Wu <[email protected]>
Date: 2018-05-11 07:39 (GMT+1)
Title: Re: [onap-discuss] Nexus 3 images "disappearing" from the server
To: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

Hi Jess,

It’s not clear that the “Purge unused docker manifests and images” task is the culprit. If I understand the doc correctly, it’s only supposed to delete images that are no longer associated with any tags, and I still see tons of time-stamped SNAPSHOT docker images from a couple of weeks ago (e.g. onap/cli:2.0.0-SNAPSHOT-20180425T130053Z).

As of the past couple of hours, onap/dmaap/dmaap-mr:1.1.4 has disappeared. See https://jenkins.onap.org/job/integration-master-version-manifest-verify-java/182/. This broke the deployments that were running around that time.
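A disappearance like this can be probed directly against the registry: a missing tag shows up as a 404 on the manifest endpoint. Below is a sketch of such a check; the endpoint construction and the WARN/ERROR classification are assumptions modeled on the nexus3-docker-image-check output earlier in the thread, not that job's actual code:

```python
# Sketch: probe whether an image tag still resolves on a Docker Registry v2 API.
# A 404 on the manifest endpoint means the tag is gone; present SNAPSHOT/STAGING
# tags are merely unreleased, mirroring the WARN/ERROR lines in the check output.
import urllib.error
import urllib.request

def manifest_url(registry, image, tag):
    """Registry v2 endpoint for an image tag's manifest."""
    return "https://{}/v2/{}/manifests/{}".format(registry, image, tag)

def classify(image, tag, status):
    """Map an HTTP status to the check's log level."""
    if status == 404:
        return "[ERROR] {}:{} not found".format(image, tag)
    if "SNAPSHOT" in tag or "STAGING" in tag:
        return "[WARN] {}:{} not released".format(image, tag)
    return "[OK] {}:{}".format(image, tag)

def check_tag(registry, image, tag):
    """HEAD the manifest endpoint and classify the result (network call)."""
    req = urllib.request.Request(
        manifest_url(registry, image, tag),
        method="HEAD",
        headers={"Accept": "application/vnd.docker.distribution.manifest.v2+json"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            status = resp.status
    except urllib.error.HTTPError as e:
        status = e.code
    return classify(image, tag, status)
```

Run hourly over the expected image list, this reproduces the "disappeared since 7 PM PDT" style of report without needing to pull any images.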
Fortunately, I had a local docker cache where I could pull a copy of onap/dmaap/dmaap-mr:1.1.4, and I found that this image has the same sha256 hash as the onap/dmaap/dmaap-mr:latest image currently on nexus3. So the image is still there; just the tag is missing. Maybe this is another clue to help narrow down the cause.

Thanks,
Gary

From: [email protected] [mailto:[email protected]] On Behalf Of Jessica Wagantall
Sent: Thursday, May 10, 2018 2:55 PM
To: [email protected]; onap-release <[email protected]>; Gildas Lanilis <[email protected]>; Jeremy Phelps <[email protected]>; Kenny Paul <[email protected]>; Anil Belur <[email protected]>
Subject: [onap-discuss] Nexus 3 images "disappearing" from the server

Dear ONAP team,

As mentioned today by Helen on the TSC call, it seems that we have experienced an issue with Nexus3 where dependency images "disappear" for a while and then come back. I was investigating this issue a little closer; let me try to explain what I think is happening, using Gary's example.

In his case, onap/aaf/aaf_cm/manifests/2.1.0-SNAPSHOT (https://nexus3.onap.org/repository/docker.snapshot/v2/onap/aaf/aaf_cm/manifests/2.1.0-SNAPSHOT), among other AAF images, disappeared on the 9th of May and re-appeared the same day after a few hours. Looking at the job that pushed this image, https://jenkins.onap.org/view/aaf/job/aaf-authz-master-docker-java-shell-daily/, it seems AAF binaries were successfully pushed on the 7th and the 9th but failed on the 8th. At the same time, I believe the purge rule kicked in. This rule seems to scan for dependencies and, every day, remove any snapshot not referenced by anyone:
https://help.sonatype.com/repomanager3/configuration/system-configuration#SystemConfiguration-TypesofTasksandWhentoUseThem

There is not much configuration on this rule to explain what it is actually looking for, but I believe this is the cause. The rule may have removed the AAF image pushed on the 7th, and, since the AAF Jenkins job failed to push a new image on the 8th, the image stayed gone for some time. The job that ran on the 9th then brought it back.

So, here are my suggestions:
- We need to make sure our daily jobs are healthy and address any failures on a daily basis, to avoid this issue in the future.
- Keeping this rule in place helps us keep disk usage stable. If we were to remove it, we would have greater issues to address.
- I have confirmed with Andy, and we prefer keeping this known configuration in place to avoid disk usage issues.

Let me know if my explanation was clear, and we can see whether keeping an eye on the dailies helps us reduce these occurrences.

Thanks a ton!
Jess

_______________________________________________
onap-discuss mailing list
[email protected]
https://lists.onap.org/mailman/listinfo/onap-discuss
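The daily cleanup described in the thread (the 6:30 PT task that removes SNAPSHOTs older than 16 days) amounts to a simple age filter over tags. A sketch, with illustrative tag names and dates rather than actual Nexus data:

```python
# Sketch of the daily SNAPSHOT cleanup: select SNAPSHOT tags whose last push
# is older than a cutoff (16 days in the task described in the thread).
# Inputs below are hypothetical; Nexus's real task operates on its own metadata.
from datetime import datetime, timedelta

def stale_snapshots(tags, now, max_age_days=16):
    """Return SNAPSHOT tags whose last push is older than max_age_days.

    tags: mapping of tag name -> datetime of last push.
    """
    cutoff = now - timedelta(days=max_age_days)
    return sorted(
        tag for tag, pushed in tags.items()
        if "SNAPSHOT" in tag and pushed < cutoff
    )
```

Note this only explains age-based removals like the onap/cli example; it does not account for a freshly pushed 1.1.2-SNAPSHOT tag vanishing, which is why the purge-unused-manifests task remained the prime suspect.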
