Team,
    The nexus3.onap.org:10001 slowdown is fixed as of 20181222:1900 EST.
     I am getting full speed on all downloads directly from nexus3 (no need 
for a nexus3.onap.info or nexus3ap.onap.org proxy now).
     Speed went from 0.2MB/sec to 48MB/sec, up by roughly 240x, which is normal. For 
example, an 800MB dmaap-mr image now downloads in 16sec on a clean VM.
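
As a quick sanity check on those figures, the arithmetic can be sketched in shell. The helper names transfer_seconds and speedup are hypothetical and exist only for this illustration; the sizes and rates are the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helpers, only to check the figures quoted above.

# Expected transfer time in seconds for a size (MB) at a rate (MB/sec).
transfer_seconds() {
    awk -v size="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", size / rate }'
}

# Ratio between a new rate and an old rate.
speedup() {
    awk -v new="$1" -v old="$2" 'BEGIN { printf "%.0f\n", new / old }'
}

transfer_seconds 800 48    # 800MB at the restored 48MB/sec
transfer_seconds 800 0.2   # the same image at the degraded 0.2MB/sec
speedup 48 0.2             # ratio of restored to degraded rate
```

At 48MB/sec the 800MB image works out to under 17 seconds, in line with the 16sec observed; at 0.2MB/sec the same image would take over an hour.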

     The routing issue is closed: traffic was rerouted.
     
https://lists.onap.org/g/onap-discuss/topic/onap_tsc_onap_discuss/28821863?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,28821863
     https://jira.onap.org/browse/TSC-79
     Thank you, Linux Foundation
     /michael

From: [email protected] <[email protected]> On Behalf Of 
Michael O'Brien
Sent: Friday, December 21, 2018 11:02 AM
To: [email protected]; Michael O'Brien <[email protected]>; 
[email protected]; [email protected]; 
[email protected]; Gildas Lanilis <[email protected]>; 
Kenny Paul <[email protected]>; Mike Elliott <[email protected]>; 
Soleil, Alain <[email protected]>; [email protected]; Yang Xu 
(Yang, Fixed Network) <[email protected]>
Subject: Re: [onap-tsc] [onap-discuss] Nexus3 slowdown issues - 
nexus3ap.onap.org proxy is full speed when prepulled daily - I'll keep the 
cache saturated

Team,
   I can confirm that nexus3ap, once cached, is pulling at “full” speed: 20G in 
10 min, with each image on the order of seconds (aaf, some aai). Note that a real 
test is a “clean” VM not on AWS or near onap.org, one that has just had Docker 
installed (no layers yet).
   I recommend we close this issue for now until the new year and use the 
workarounds.

   A good result from this test is that the routes to the proxy are 
“unhindered”. Running on AWS, the routes to nexus3 (cloud.onap.org) are 
different.
   This nexus3ap cache works the same as any other local cache, as expected, 
but it is a band-aid for now. We should be good with it until January, as most of 
us will be off for a couple of days shortly.

   However, if I try to pull an image that is not yet cached, it sits there and takes 
minutes to hours (workflow-init at 55MB took 10 min when it should have taken 5 
seconds) because it still needs to go through nexus3.
   To keep the speed up, we will need a daily docker_prepull.sh run 
immediately after the daily docker builds. I will run this on a schedule from a VM on 
Azure so that the proxy remains cached. Otherwise the first 
deployment to ask for a new image will need to wait the full duration of a 
pull from nexus3.onap.org.
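
A minimal sketch of what such a scheduled run could look like follows. The wrapper name warm_proxy.sh, the 06:00 schedule, the paths, and the RUN_PREPULL switch are all assumptions made for illustration; only docker_prepull.sh and its -b and -s flags come from this thread.

```shell
#!/bin/sh
# Hypothetical wrapper (warm_proxy.sh) for a scheduled cache-warming run.
# Example crontab entry; the time and the paths are assumptions:
#   0 6 * * * RUN_PREPULL=1 /home/ubuntu/warm_proxy.sh >> /home/ubuntu/warm_proxy.log 2>&1
# Assumes docker_prepull.sh sits in the directory the wrapper is started from.

BRANCH="${BRANCH:-casablanca}"
PROXY="${PROXY:-nexus3ap.onap.org:10001}"

# Print the prepull command line, using the -b and -s flags shown elsewhere in this thread.
prepull_cmd() {
    printf './docker_prepull.sh -b %s -s %s\n' "$BRANCH" "$PROXY"
}

# Only pull when explicitly asked, so the wrapper can be dry-run first.
if [ "${RUN_PREPULL:-0}" = "1" ]; then
    sudo $(prepull_cmd)
else
    echo "dry run: sudo $(prepull_cmd)"
fi
```

Running the wrapper without RUN_PREPULL=1 only prints the command it would execute, which makes it easy to check the branch and proxy before putting it on a schedule.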

   I recommend we use the proxy for now and hold off on the root cause of 
the main nexus3.onap.org route issue, as the cached proxy is working well.
   Rebuilt Casablanca images will only take about 5 more seconds to repull because most 
of the layers will already be on the system from the day before, so we don’t 
have to turn off the Casablanca merge jobs as a workaround.

Routes are better for the proxy:
obrienbiometrics:nexus michaelobrien$ traceroute nexus3ap.onap.org
traceroute: Warning: nexus3ap.onap.org has multiple addresses; using 13.251.228.5
traceroute to onap-nexus-elb-724767212.ap-southeast-1.elb.amazonaws.com (13.251.228.5), 64 hops max, 52 byte packets
1  192.168.20.1 (192.168.20.1)  2.035 ms  1.080 ms  0.857 ms
2  192.168.0.1 (192.168.0.1)  1.906 ms  2.446 ms  1.996 ms
3  * * *
4  24.156.158.197 (24.156.158.197)  10.330 ms  9.408 ms  11.047 ms
5  209.148.231.77 (209.148.231.77)  16.993 ms
    209.148.236.169 (209.148.236.169)  19.012 ms
    209.148.236.173 (209.148.236.173)  10.701 ms
6  9301-cgw01.ym.rmgt.net.rogers.com (209.148.229.229)  16.378 ms  23.694 ms  16.524 ms
7  209.148.235.105 (209.148.235.105)  55.253 ms
    209.148.237.17 (209.148.237.17)  26.782 ms
    209.148.235.101 (209.148.235.101)  30.302 ms
8  209-8-108-157.static.pccwglobal.net (209.8.108.157)  27.953 ms  29.701 ms  27.282 ms
9  tenge0-0-0-22.br02.hkg15.pccwbtn.net (63.223.15.98)  242.227 ms  275.205 ms  239.594 ms
10  hundredge0-6-0-0.br02.hkg15.pccwbtn.net (63.223.15.190)  229.824 ms
    63-217-17-42.static.pccwglobal.net (63.217.17.42)  237.815 ms  234.202 ms
11  63-217-17-42.static.pccwglobal.net (63.217.17.42)  291.196 ms
    52.93.35.84 (52.93.35.84)  362.510 ms
    52.93.35.36 (52.93.35.36)  292.224 ms
12  52.93.35.45 (52.93.35.45)  351.281 ms
    52.93.35.61 (52.93.35.61)  236.533 ms
    52.93.35.47 (52.93.35.47)  239.904 ms
13  52.93.35.127 (52.93.35.127)  270.910 ms
    54.239.43.164 (54.239.43.164)  349.463 ms
    54.239.43.182 (54.239.43.182)  262.831 ms
14  54.240.241.119 (54.240.241.119)  338.567 ms  284.551 ms
    54.239.43.164 (54.239.43.164)  254.775 ms
15  52.93.9.152 (52.93.9.152)  261.928 ms
    54.240.241.121 (54.240.241.121)  251.418 ms
    52.93.9.20 (52.93.9.20)  314.312 ms
16  52.93.11.81 (52.93.11.81)  351.155 ms
    52.93.11.1 (52.93.11.1)  353.004 ms
    52.93.11.51 (52.93.11.51)  353.171 ms
17  52.93.11.43 (52.93.11.43)  349.406 ms
    52.93.10.232 (52.93.10.232)  273.079 ms
    52.93.11.41 (52.93.11.41)  262.679 ms
18  52.93.9.93 (52.93.9.93)  265.661 ms
    52.93.8.27 (52.93.8.27)  285.240 ms
    52.93.8.159 (52.93.8.159)  261.546 ms
19  203.83.223.17 (203.83.223.17)  259.577 ms
    52.93.10.89 (52.93.10.89)  270.384 ms
    52.93.9.115 (52.93.9.115)  263.440 ms
20  * 203.83.223.175 (203.83.223.175)  247.910 ms *
21  * * *
Not so for nexus3.onap.org:
obrienbiometrics:nexus michaelobrien$ traceroute nexus3.onap.org
traceroute to cloud.onap.org (199.204.45.137), 64 hops max, 52 byte packets
1  192.168.20.1 (192.168.20.1)  1.470 ms  0.909 ms  0.877 ms
2  192.168.0.1 (192.168.0.1)  2.500 ms  1.805 ms  1.809 ms
3  * * *
4  24.156.158.197 (24.156.158.197)  13.671 ms  9.892 ms  11.598 ms
5  209.148.236.169 (209.148.236.169)  16.619 ms  14.900 ms
    209.148.236.173 (209.148.236.173)  20.104 ms
6  9301-cgw01.ym.rmgt.net.rogers.com (209.148.229.229)  16.498 ms  17.402 ms  16.041 ms
7  209.148.235.133 (209.148.235.133)  16.074 ms  16.827 ms
    209.148.235.18 (209.148.235.18)  17.194 ms
8  * * *
9  be3260.ccr22.ymq01.atlas.cogentco.com (154.54.42.90)  24.564 ms  24.442 ms  24.002 ms
10  te0-0-1-0.agr22.ymq01.atlas.cogentco.com (154.54.83.254)  30.135 ms  23.553 ms
    te0-0-1-0.agr21.ymq01.atlas.cogentco.com (154.54.82.54)  27.203 ms
11  154.24.60.126 (154.24.60.126)  24.989 ms  27.162 ms  35.718 ms
12  38.140.46.58 (38.140.46.58)  25.420 ms  25.970 ms  30.194 ms
13  compute-199-204-45-137.ca-ymq-1.vexxhost.net (199.204.45.137)  27.019 ms  39.879 ms  30.965 ms


Thank you
/michael

From: [email protected] <[email protected]> On Behalf Of 
Michael O'Brien
Sent: Thursday, December 20, 2018 2:47 PM
To: [email protected]; [email protected]; 
[email protected]; [email protected]; Gildas Lanilis 
<[email protected]>; Kenny Paul <[email protected]>; Mike Elliott 
<[email protected]>; Soleil, Alain <[email protected]>; 
[email protected]; Yang Xu (Yang, Fixed Network) <[email protected]>
Subject: Re: [onap-tsc] [onap-discuss] Nexus3 slowdown issues

Team,
    The root cause (RC) for nexus3.onap.org must still be fixed. For now we are faster, from 2h 
per image down to 17min, but our goal is 45 sec.
    Some updates: the nexus3ap proxy is experiencing the same issues as any 
other proxy, since it must deal with the latency of nexus3.onap.org. I see 
different behavior on 2 VMs:
    On a VM that had already pulled up to the E’s from my own proxy, the pulls from 
nexus3ap were fast, as most of the layers are shared and already downloaded. As 
soon as I hit an image that is not cached either locally or on nexus3ap, it 
takes 17 min to download a 1.1G image instead of 45 seconds. That is better than the 2 
hours previously, but not really fixed.
    On a VM that is empty, it takes the full 17 min per image to pull from 
nexus3ap.
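
For anyone who wants to reproduce per-image timings like these, a rough sketch follows. The timed_pull helper and the PULL_CMD override are hypothetical names introduced here; the only real command involved is docker pull.

```shell
#!/bin/sh
# Hypothetical helper to time a single image pull in whole seconds.
# PULL_CMD defaults to "docker pull"; it can be overridden so the timing
# logic can be tried on a machine without Docker.
PULL_CMD="${PULL_CMD:-docker pull}"

timed_pull() {
    start=$(date +%s)
    $PULL_CMD "$1" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo "$1 status=$status seconds=$((end - start))"
}

# Example call (image name taken from the listing in this thread):
#   timed_pull nexus3ap.onap.org:10001/onap/aaf/aaf_service:2.1.8
```

On a VM that already holds most layers the reported seconds will be small, so a meaningful comparison needs a clean VM, as noted elsewhere in this thread.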

    It looks like nexus3ap is truncating the problem route enough to lower the 
download time from 2 hours to 17 min per image, for example for the 1.1G aaf image.
    Another issue is how we handle rebuilt images for branches like Casablanca. 
I need to test this fully, but hopefully we do not need to download the entire 
image from scratch or re-warm the proxy when the Jenkins merge jobs run daily, i.e. 
when the hash changes.
    There are indications this will not be an issue because of shared layers: 
I repulled images that were already downloaded the day before and only got a 
5 sec cycle:
1.0.5: Pulling from 
onap/org.onap.dcaegen2.collectors.datafile.datafile-app-server
4fe2ade4980c: Already exists
6fc58a8d4ae4: Already exists
819f4a45746c: Pulling fs layer
9c4800b836af: Pulling fs layer



For example, a clean server took 150 min to download these:
ubuntu@ip-172-31-17-47:~$ sudo docker images
REPOSITORY                                     TAG                 IMAGE ID            CREATED             SIZE
nexus3ap.onap.org:10001/onap/aaf/aaf_service   2.1.8               6eb295fed110        5 weeks ago         1.16 GB
nexus3ap.onap.org:10001/onap/aaf/aaf_oauth     2.1.8               74dcdce76094        5 weeks ago         1.16 GB
nexus3ap.onap.org:10001/onap/aaf/aaf_locate    2.1.8               2a4eaa6275ff        5 weeks ago         1.16 GB
nexus3ap.onap.org:10001/onap/aaf/aaf_hello     2.1.8               495a01176053        5 weeks ago         1.16 GB
nexus3ap.onap.org:10001/onap/aaf/aaf_gui       2.1.8               8caa6dc681f0        5 weeks ago         1.16 GB
nexus3ap.onap.org:10001/onap/aaf/aaf_fs        2.1.8               3d663698534d        5 weeks ago         1.16 GB
nexus3ap.onap.org:10001/onap/aaf/aaf_cm        2.1.8               0ba25c4ec3fb        5 weeks ago         1.16 GB
nexus3ap.onap.org:10001/onap/aaf/aaf_agent     2.1.8               090b326a7f11        5 weeks ago         1.14 GB
nexus3ap.onap.org:10001/onap/aaf/aaf_config    2.1.8               6506ac785cb5        5 weeks ago         1.14 GB
nexus3ap.onap.org:10001/onap/aaf/aaf_cass      2.1.8               4b91e9b0b43f        5 weeks ago         323 MB


From: [email protected] <[email protected]> On Behalf Of Michael 
O'Brien
Sent: Thursday, December 20, 2018 12:03 PM
To: [email protected]; [email protected]; onap-tsc 
<[email protected]>; [email protected]; Gildas Lanilis 
<[email protected]>; Kenny Paul <[email protected]>; Mike Elliott 
<[email protected]>; Soleil, Alain <[email protected]>; 
[email protected]; Yang Xu (Yang, Fixed Network) <[email protected]>
Subject: Re: [onap-tsc] [onap-discuss] Nexus3 slowdown issues

Catherine,
   Some good news: so far a couple of pulls were back to normal using the ap 
mirror, getting 10+MB/sec initially, and they downloaded within a couple of min.

  I am rerunning the prepull script. If it finishes in under an hour for the 40-60G 
of images, we are good. I will update jiras/wikis/mails at that point.

   The mirror is slowing now, though, and hanging. I don’t know how many clients 
it can handle; there are at least 3 pulling now. I will advise on the test 
results.
   I am using the following to test:


# clean ubuntu 16.04 vm

sudo git clone https://gerrit.onap.org/r/logging-analytics

sudo cp logging-analytics/deploy/docker_prepull.sh .

sudo curl https://releases.rancher.com/install-docker/17.03.sh | sh

sudo usermod -aG docker ubuntu

sudo systemctl restart docker

sudo ./docker_prepull.sh -b casablanca -s nexus3ap.onap.org:10001

    Thank you
    /michael


From: [email protected] <[email protected]> On Behalf Of 
Jessica Wagantall
Sent: Thursday, December 20, 2018 11:04 AM
To: [email protected]; onap-tsc <[email protected]>; 
[email protected]; Gildas Lanilis <[email protected]>; 
Kenny Paul <[email protected]>
Subject: [onap-discuss] Nexus3 slowdown issues

Dear ONAP team

Our Infra team is helping us out looking into the Nexus3 issues the teams are
facing.

We have also escalated to our provider to look into the routing issues and we are
hoping to get an answer from them soon.

In the meantime, please also consider using the nexus3ap.onap.org mirror in case
this can unblock you.

Sorry again for the inconvenience!
Jess
This email and the information contained herein is proprietary and confidential 
and subject to the Amdocs Email Terms of Service, which you may review at 
https://www.amdocs.com/about/email-terms-of-service
