On 9/28/21 13:30, Amber, Kumar wrote:
> Hi Ilya,
> 
> The test-case fails with the following build command on the master branch.
> 
> Pass:
> ./configure --with-dpdk=static CFLAGS=""
> Fails:
> ./configure --with-dpdk=static CFLAGS="-msse4.2"
> 
> Testing on ovs-master branch, running test case like this
> $ make check TESTSUITEFLAGS="779"
> 
> Based on the above build tests, we identified that "-msse4.2" is causing
> the unit test to fail. Note that OVS changes its hashing implementation based
> on SSE4.2 being available at compile time. This switches between the software
> murmur hash implementation and the SSE4.2 CRC32 instruction.
> 
> This test seems to expect *a specific value* of a hash result, causing it to 
> pass/fail based on hashing implementation selected at ./configure time.

Hi.  Thanks for the details.  That makes sense.
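
To make the compile-time dependency concrete, here is a minimal sketch of
how one could tell which hash path a given set of CFLAGS would presumably
select.  It assumes the choice keys off the compiler's predefined
__SSE4_2__ macro; the hash_path helper and its "crc32"/"murmur" labels are
hypothetical names used only for illustration, not anything in OVS.

```shell
#!/bin/sh
# Hypothetical helper: reads a compiler's predefined-macro dump on stdin
# and prints which hash path a build would presumably take.
# Assumption: the selection depends on __SSE4_2__ being defined.
hash_path() {
    if grep -q '__SSE4_2__'; then
        echo "crc32"
    else
        echo "murmur"
    fi
}

# Synthetic macro dumps, so that the sketch runs without a compiler.
with_sse=$(printf '#define __x86_64__ 1\n#define __SSE4_2__ 1\n' | hash_path)
without_sse=$(printf '#define __x86_64__ 1\n' | hash_path)

echo "with -msse4.2: $with_sse, without: $without_sse"
```

With GCC or Clang, a real macro dump could be piped in with something like
`cc -msse4.2 -dM -E - </dev/null | hash_path`.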

Could you try the following patch:

diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at
index 12fc1ef91..cf021c0cc 100644
--- a/tests/tunnel-push-pop.at
+++ b/tests/tunnel-push-pop.at
@@ -628,20 +628,22 @@ AT_CHECK([
 AT_CHECK([ovs-vsctl -- set Interface p0 options:tx_pcap=p0.pcap])
 
 packet=50540000000a505400000009123
-encap=f8bc124434b6aa55aa5500000800450000320000400040113406010102580101025c83a917c1001e00000000655800007b00
+dnl Source port is based on a packet hash, so it might be different depending
+dnl on compiler flags and CPU.  Masked with '....'.
+encap=f8bc124434b6aa55aa5500000800450000320000400040113406010102580101025c....17c1001e00000000655800007b00
 
 dnl Output to tunnel from a int-br internal port.
 dnl Checking that the packet arrived and it was correctly encapsulated.
 AT_CHECK([ovs-ofctl add-flow int-br "in_port=LOCAL,actions=debug_slow,output:2"])
 AT_CHECK([ovs-appctl netdev-dummy/receive int-br "${packet}4"])
-OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep "${encap}${packet}4" | wc -l` -ge 1])
+OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | egrep "${encap}${packet}4" | wc -l` -ge 1])
 dnl Sending again to exercise the non-miss upcall path.
 AT_CHECK([ovs-appctl netdev-dummy/receive int-br "${packet}4"])
-OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep "${encap}${packet}4" | wc -l` -ge 2])
+OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | egrep "${encap}${packet}4" | wc -l` -ge 2])
 
 dnl Output to tunnel from the controller.
 AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out int-br CONTROLLER "debug_slow,output:2" "${packet}5"])
-OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep "${encap}${packet}5" | wc -l` -ge 1])
+OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | egrep "${encap}${packet}5" | wc -l` -ge 1])
 
 dnl Datapath actions should not have tunnel push action.
 AT_CHECK([ovs-appctl dpctl/dump-flows | grep -q tnl_push], [1])
---
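
As a quick illustration of why the masked pattern should be stable, here is
a minimal standalone sketch.  It assumes two captures that differ only in
the 2-byte UDP source port; the second port value (1234) is made up and
merely stands in for whatever another hash implementation would produce.

```shell
#!/bin/sh
# Encapsulation header from the test, split around the UDP source port.
prefix=f8bc124434b6aa55aa5500000800450000320000400040113406010102580101025c
suffix=17c1001e00000000655800007b00

capture_a="${prefix}83a9${suffix}"   # source port from the original test
capture_b="${prefix}1234${suffix}"   # made-up port for a different hash

# Old expectation: exact source port.  New one: '....' matches any 4 chars.
exact="${prefix}83a9${suffix}"
masked="${prefix}....${suffix}"

exact_count=$(printf '%s\n%s\n' "$capture_a" "$capture_b" | grep -c "$exact")
masked_count=$(printf '%s\n%s\n' "$capture_a" "$capture_b" | grep -c "$masked")

echo "exact pattern matches:  $exact_count"
echo "masked pattern matches: $masked_count"
```

The exact pattern only matches the capture with the original source port,
while the masked one matches both.  Plain grep is used in this sketch since
'.' means the same thing in basic and extended regular expressions.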

I can send it formally once I'm back from PTO.

One other thing that I noticed is that all tests are failing with '-msse4.2'
and without dpdk due to extra warnings from the AVX512 code.  I think we need
to lower them to DBG, as they are not very useful, but they break tests and
make issues like this one drown in the flood of test failures.
If you can fix that, that would be great.

Best regards, Ilya Maximets.

> 
> Regards
> Amber
> 
>> -----Original Message-----
>> From: Ilya Maximets <[email protected]>
>> Sent: Tuesday, September 21, 2021 7:00 PM
>> To: Amber, Kumar <[email protected]>; Ilya Maximets
>> <[email protected]>; [email protected];
>> [email protected]
>> Cc: Stokes, Ian <[email protected]>; Van Haaren, Harry
>> <[email protected]>
>> Subject: Re: Unit Test Failure Report to OVS ML
>>
>> On 9/21/21 14:51, Amber, Kumar wrote:
>>> Hi Ilya,
>>>
>>> The test-case failure is not related to AVX512 or any patches; we are
>>> failing directly on the latest OVS "master" with no patches on top of it.
>>> I am still trying to root-cause the issue. We tested master on 4
>>> different servers, and all fail on the same test case.
>>
>> This sounds very weird.  How do you build it?
>>
>>>
>>> Regards
>>> Amber
>>>
>>>> -----Original Message-----
>>>> From: Ilya Maximets <[email protected]>
>>>> Sent: Monday, September 20, 2021 5:05 PM
>>>> To: Amber, Kumar <[email protected]>; [email protected];
>>>> [email protected]; [email protected]
>>>> Cc: Stokes, Ian <[email protected]>; Van Haaren, Harry
>>>> <[email protected]>
>>>> Subject: Re: Unit Test Failure Report to OVS ML
>>>>
>>>> On 9/20/21 12:35, Amber, Kumar wrote:
>>>>> Hi all,
>>>>>
>>>>> The commit with the ID and description below added a test case named
>>>>> "tunnel_push_pop - packet_out debug_slow" to the "tunnel-push-pop"
>>>>> test suite. This test has been found to fail on the latest master
>>>>> branch.
>>>>>
>>>>> ## ------------------------------- ##
>>>>> ## openvswitch 2.16.90 test suite. ##
>>>>> ## ------------------------------- ##
>>>>> 779: tunnel_push_pop - packet_out debug_slow         FAILED (ovs-macros.at:242)
>>>>>
>>>>> ## ------------- ##
>>>>> ## Test results. ##
>>>>> ## ------------- ##
>>>>>
>>>>> ERROR: 1 test was run,
>>>>> 1 failed unexpectedly.
>>>>>
>>>>> We did some investigation, and the matching is the cause of the failure.
>>>>>
>>>>> ./ovs-macros.at:242: hard failure
>>>>>
>>>>> 779. tunnel-push-pop.at:598: 779. tunnel_push_pop - packet_out
>>>>> debug_slow (tunnel-push-pop.at:598): FAILED (ovs-macros.at:242)
>>>>>
>>>>> Commit patch: 7e6b41ac8d9d183655be96795b529adeb33aeb47
>>>>>
>>>>> dpif-netdev: Fix crash when PACKET_OUT is metered.
>>>>>
>>>>> When a PACKET_OUT has output port of OFPP_TABLE, and the rule table
>>>>> includes a meter and this causes the packet to be deleted, execute
>>>>> with a clone of the packet, restoring the original packet if it is
>>>>> changed by the execution.
>>>>>
>>>>> Add tests to verify the original issue is fixed, and that the fix
>>>>> doesn't break tunnel processing.
>>>>>
>>>>> Would the authors of the patch investigate why the test is failing?
>>>>>
>>>>> Regards
>>>>> Amber
>>>>
>>>> Hi.
>>>>
>>>> I can't reproduce the issue.  I re-ran the test 10 times on 2 of my
>>>> systems and it works 10/10 without any issues.   And none of our CI
>>>> systems has issues with this test.
>>>>
>>>> The patch that added the test should not affect packet matching as it
>>>> only changes the execution of actions, just to avoid the crash under
>>>> certain conditions, and it tries to do that with the least amount of side
>>>> effects possible.  So, this patch should not be a root cause.  Maybe
>>>> the new test case just uncovered a different issue in packet matching?
>>>>
>>>> The test itself was carefully crafted to catch a particular issue
>>>> where a packet is not encapsulated, while it should be.  And the test
>>>> itself seems solid.
>>>>
>>>> Does it still fail for you, if you revert code changes from the patch
>>>> but keep the aforementioned unit test (this test is not for the crash
>>>> itself, so it should pass without the change in the patch)?
>>>>
>>>> Anyway, what does "the matching is the cause of the failure" mean?
>>>> Are you testing with avx512 enabled?  If so, doesn't autovalidator
>>>> tell you what the issue is?
>>>>
>>>> Best regards, Ilya Maximets.
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
