Klement, OK, I'll think about how to do that without too much trouble in its current state.
In the meantime, blowing out the CPU and memory a bit changed the error:

21:49:42 create 1k of p2p subifs OK
21:49:42 ==============================================================================
21:51:52 21:53:13,610 Timeout while waiting for child test runner process (last test running was `drop rx packet not matching p2p subinterface' in `/tmp/vpp-unittest-P2PEthernetIPV6-GDHSDK')!
21:51:52 Killing possible remaining process IDs: 19954 19962 19964

21:45:05 PPPoE Test Case
21:45:05 ===================================21:48:13,778 Timeout while waiting for child test runner process (last test running was `drop rx packet not matching p2p subinterface' in `/tmp/vpp-unittest-P2PEthernetIPV6-I0REOQ')!
21:47:45 Killing possible remaining process IDs: 20017 20025 20027

20:48:46 PPPoE Test Case
20:48:46 ===================================20:51:34,082 Timeout while waiting for child test runner process (last test running was `drop rx packet not matching p2p subinterface' in `/tmp/vpp-unittest-P2PEthernetIPV6-tQ5sP0')!
20:51:05 Killing possible remaining process IDs: 19919 19927 19929

Anything new/different/exciting in here?

Also, with the memory/CPU expansion (by roughly a third), these failures happen on the order of 2 to 3 minutes, as opposed to the 90 or so leading to the timeout failure.

Since the verifies are still happily chugging along, I ASSUME that this drop packet check isn't happening in that suite?

Ed

On Aug 9, 2017, at 1:04 PM, Klement Sekera -X (ksekera - PANTHEON TECHNOLOGIES at Cisco) <ksek...@cisco.com> wrote:

Ed,

it'd help if you could collect log.txt from a failed run so we could peek under the hood... please see my other email in this thread...
Thanks,
Klement

Quoting Ed Kern (ejk) (2017-08-09 20:48:46)

This is not you, or this patch. The make test-debug has had a 90+% failure rate (read: not 100%) for at least the last 100 builds (as far back as my current logs go, but I will probably blow that out a bit now).

You hit the one that is seen most often, on that create 1k of p2p subifs. The other, much less frequent, is:

13:40:24 CGNAT TCP session close initiated from outside network OK
13:40:24 =================================================Build timed out (after 120 minutes). Marking the build as failed.

So currently I'm allocating 10000 MHz in CPU and 8G in memory for verify and also for test-debug runs. I'm not obviously getting (as you can see) errors about it running out of memory, but I wonder if that's possibly what's happening. It's easy enough to blow my allocations out a bit and see if that makes a difference.

If anyone has other ideas to try, I'm happy to give them a shot.

Appreciate the heads up,
Ed

On Aug 9, 2017, at 12:07 PM, Dave Barach (dbarach) <[1]dbar...@cisco.com> wrote:

Please see [2]https://gerrit.fd.io/r/#/c/7927, and [3]http://jenkins.ejkern.net:8080/job/vpp-test-debug-master-ubuntu1604/1056/console

The patch in question is highly unlikely to cause this failure...

14:37:11 ==============================================================================
14:37:11 P2P Ethernet tests
14:37:11 ==============================================================================
14:37:11 delete/create p2p subif OK
14:37:11 create 100k of p2p subifs SKIP
14:37:11 create 1k of p2p subifs Build timed out (after 120 minutes). Marking the build as failed.
16:24:49 $ ssh-agent -k
16:24:54 unset SSH_AUTH_SOCK;
16:24:54 unset SSH_AGENT_PID;
16:24:54 echo Agent pid 84 killed;
16:25:07 [ssh-agent] Stopped.
16:25:07 Build was aborted
16:25:09 [WS-CLEANUP] Deleting project workspace...[WS-CLEANUP] done
16:25:11 Finished: FAILURE

Thanks…
Dave

References

Visible links
1. mailto:dbar...@cisco.com
2. https://gerrit.fd.io/r/#/c/7927
3. http://jenkins.ejkern.net:8080/job/vpp-test-debug-master-ubuntu1604/1056/console
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev
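
Editorial aside, not part of the original messages: the thread compares timeout failures across several builds by eye. A minimal sketch of doing the same from saved console output is below. The pattern is copied from the timeout lines quoted in this thread. The function name `find_timeouts` is hypothetical, and the idea that the reported temp directory is where the requested log.txt would be found is an assumption, not something the thread confirms.

```python
import re

# Matches the timeout message exactly as it appears in the console excerpts
# quoted in this thread (opening backtick, closing apostrophe).
TIMEOUT_RE = re.compile(
    r"Timeout while waiting for child test runner process "
    r"\(last test running was `(?P<test>[^']+)' in `(?P<tmpdir>[^']+)'\)"
)


def find_timeouts(console_text):
    """Return (test name, temp dir) pairs for every timeout message found.

    Hypothetical helper: it only scans text and touches no files.
    """
    return [
        (m.group("test"), m.group("tmpdir"))
        for m in TIMEOUT_RE.finditer(console_text)
    ]


if __name__ == "__main__":
    # Sample line taken verbatim from the first excerpt in this thread.
    sample = (
        "21:51:52 21:53:13,610 Timeout while waiting for child test runner "
        "process (last test running was `drop rx packet not matching p2p "
        "subinterface' in `/tmp/vpp-unittest-P2PEthernetIPV6-GDHSDK')!"
    )
    for test, tmpdir in find_timeouts(sample):
        print(test, "->", tmpdir)
```

Because it uses `finditer` on the whole text, the sketch also copes with excerpts where the message is fused onto the end of a separator line, as in the PPPoE examples above.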