Hello Amos,
I ran the netperf test manually and verified that network
connectivity is not the problem.

On machine 172.25.43.226, I ran netserver on port 11922:

[email protected]:/tmp/netperftest/netperftest# ./netserver -p 11922
Starting netserver at port 11922
Starting netserver at hostname 0.0.0.0 port 11922 and family AF_UNSPEC

[email protected]:/tmp/netperftest/netperftest# netstat  -W -n -p -a | grep 11922
tcp        0      0 0.0.0.0:11922           0.0.0.0:*               LISTEN      22848/netserver
[email protected]:/tmp/netperftest/netperftest#




On machine 172.25.43.234, I ran the TCP_STREAM test for netperf:

[email protected]:/tmp/netperftest/netperftest# ./netperf -H 172.25.43.226 -p 11922,11922 -t TCP_STREAM  -I 99,2 -- -s1048576 -S1048576
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.25.43.226 (172.25.43.226) port 0 AF_INET : +/-1.000% @ 99% conf.  : demo
Recv   Send    Send                         
Socket Socket  Message  Elapsed             
Size   Size    Size     Time     Throughput 
bytes  bytes   bytes    secs.    10^6bits/sec 

2097152 2097152 2097152    10.02     941.37
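
A quick reachability check of the same kind can also be scripted. The following is only a minimal sketch (Python 3, standard library only); the helper name port_is_open is made up for illustration and is not part of netperf or autotest. It only shows that something accepts TCP connections on the port, not that netserver itself is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; not part of netperf or autotest.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

Run from the client machine, port_is_open("172.25.43.226", 11922) should return True while netserver is listening as shown above.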



However, when I run the test via autotest, it fails with a timeout: the
client times out waiting for the server to start. Details are below.
Is there any autotest-related configuration I may be missing?

Thanks,
Bhupesh

Here are the steps I followed to run the test via autotest

- updated the ./client/tests/netperf2/control.client
            NAME = "Netperf2 (Client)"
            AUTHOR = "Martin Bligh <[email protected]>"
            TIME = "MEDIUM"
            TEST_CATEGORY = "BENCHMARK"
            TEST_CLASS = "HARDWARE"
            TEST_TYPE = "CLIENT"
            DOC = """
            TCP/UDP/sockets/etc performance benchmark.
            See http://www.netperf.org/netperf/NetperfPage.html.
            """

            job.run_test('netperf2',
                         server_ip='172.25.43.226',
                         client_ip='172.25.43.234',
                         role='client',
                         tag='client')


-  updated the ./client/tests/netperf2/control.server
               NAME = "Netperf2 (Server)"
               AUTHOR = "Martin Bligh <[email protected]>"
               TIME = "MEDIUM"
               TEST_CATEGORY = "BENCHMARK"
               TEST_CLASS = "HARDWARE"
               TEST_TYPE = "CLIENT"
               DOC = """
               TCP/UDP/sockets/etc performance benchmark.
               See http://www.netperf.org/netperf/NetperfPage.html.
               """

               job.run_test('netperf2',
                            server_ip='172.25.43.226',
                            client_ip='172.25.43.234',
                            role='server',
                            tag='server')

- updated ./client/tests/netperf2/netperf2.py : increased all
'rendezvous' time intervals to 20 minutes

            if role == 'server':
                self.server_start(cpu_affinity)
                try:
                    # Wait up to 20 minutes for the client to reach this
                    # point.
                    self.job.barrier(server_tag, 'start_%d' % num_streams,
                                     1200).rendezvous(*all)
                    # Wait up to test_time + 20 minutes for the test to
                    # complete
                    self.job.barrier(server_tag, 'stop_%d' % num_streams,
                                     test_time+1200).rendezvous(*all)
                finally:
                    self.server_stop()

            elif role == 'client':
                # Wait up to 20 minutes for the server to start
                self.job.barrier(client_tag, 'start_%d' % num_streams,
                                 1200).rendezvous(*all)
                self.client(server_ip, test, test_time, num_streams,
                            test_specific_args, cpu_affinity)
                # Wait up to 20 minutes for the server to also reach
                # this point
                self.job.barrier(client_tag, 'stop_%d' % num_streams,
                                 1200).rendezvous(*all)
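
For readers unfamiliar with the rendezvous pattern used above: every party must reach the barrier before the timeout expires, otherwise the waiting side gives up with an error. The sketch below illustrates that behaviour with Python 3's standard threading.Barrier. It is an analogy only, and an assumption on my part that it is a fair one: autotest's barrier works between hosts over the network and is not implemented this way. The names rendezvous and run_parties are hypothetical.

```python
import threading


def rendezvous(barrier, timeout):
    """Wait at the barrier; return True if all parties arrived in time."""
    try:
        barrier.wait(timeout=timeout)
        return True
    except threading.BrokenBarrierError:
        # Raised when the timeout expires before every party has arrived.
        return False


def run_parties(num_expected, num_started, timeout):
    """Start num_started of num_expected parties and collect their results."""
    barrier = threading.Barrier(num_expected)
    results = []
    lock = threading.Lock()

    def party():
        ok = rendezvous(barrier, timeout)
        with lock:
            results.append(ok)

    threads = [threading.Thread(target=party) for _ in range(num_started)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With two expected parties and both started, every party gets through. With two expected and only one started, the lone party times out, which is the same shape of failure as a client waiting for a server role that never reaches the barrier. Raising the timeout does not help in that case; it only delays the error.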





- In the autotest UI,
             click Create job
             select "Client" from the "Test type" dropdown
             select Netperf2(client)
             select 172.25.43.226  and   172.25.43.234
             submit job

I do not see netserver running on the server:
[email protected]:~# ps -ef | grep netserver
root     13741 13682  0 01:57 pts/0    00:00:00 grep netserver

[email protected]:~# ps -ef | grep autotest
root     12703     1  0 01:56 ?        00:00:00 /usr/bin/python /ghostcache/autotest/autotestd /tmp/autoserv-xnMBak -H autoserv --verbose --hostname=172.25.43.226 --user=debug_user /ghostcache/autotest/control.autoserv
root     12704 12615  0 01:56 ?        00:00:00 /usr/bin/python /ghostcache/autotest/autotestd_monitor /tmp/autoserv-xnMBak 0 0
root     12706 12703  0 01:56 ?        00:00:00 /usr/bin/python -u /ghostcache/autotest/autotest -H autoserv --verbose --hostname=172.25.43.226 --user=debug_user /ghostcache/autotest/control.autoserv
root     12709 12706  0 01:56 ?        00:00:00 /usr/bin/python -u /ghostcache/autotest/autotest -H autoserv --verbose --hostname=172.25.43.226 --user=debug_user /ghostcache/autotest/control.autoserv
root     12710 12706  0 01:56 ?        00:00:00 /usr/bin/python -u /ghostcache/autotest/autotest -H autoserv --verbose --hostname=172.25.43.226 --user=debug_user /ghostcache/autotest/control.autoserv
root     12724 12706  0 01:56 ?        00:00:00 /usr/bin/python -u /ghostcache/autotest/autotest -H autoserv --verbose --hostname=172.25.43.226 --user=debug_user /ghostcache/autotest/control.autoserv
root     12751 12724  0 01:56 ?        00:00:00 /usr/bin/python -u /ghostcache/autotest/autotest -H autoserv --verbose --hostname=172.25.43.226 --user=debug_user /ghostcache/autotest/control.autoserv
root     12752 12724  0 01:56 ?        00:00:00 /usr/bin/python -u /ghostcache/autotest/autotest -H autoserv --verbose --hostname=172.25.43.226 --user=debug_user /ghostcache/autotest/control.autoserv
root     13866 13682  0 01:58 pts/0    00:00:00 grep autotest
[email protected]:~#



On the client:

[email protected]:~# ps -ef | grep netperf
root     13212 13153  0 01:58 pts/0    00:00:00 grep netperf
[email protected]:~# ps -ef | grep autotest
root     11939     1  0 01:56 ?        00:00:00 /usr/bin/python /ghostcache/autotest/autotestd /tmp/autoserv-NObANa -H autoserv --verbose --hostname=172.25.43.234 --user=debug_user /ghostcache/autotest/control.autoserv
root     11940 11850  0 01:56 ?        00:00:00 /usr/bin/python /ghostcache/autotest/autotestd_monitor /tmp/autoserv-NObANa 0 0
root     11943 11939  0 01:56 ?        00:00:00 /usr/bin/python -u /ghostcache/autotest/autotest -H autoserv --verbose --hostname=172.25.43.234 --user=debug_user /ghostcache/autotest/control.autoserv
root     11945 11943  0 01:56 ?        00:00:00 /usr/bin/python -u /ghostcache/autotest/autotest -H autoserv --verbose --hostname=172.25.43.234 --user=debug_user /ghostcache/autotest/control.autoserv
root     11946 11943  0 01:56 ?        00:00:00 /usr/bin/python -u /ghostcache/autotest/autotest -H autoserv --verbose --hostname=172.25.43.234 --user=debug_user /ghostcache/autotest/control.autoserv
root     11960 11943  0 01:56 ?        00:00:00 /usr/bin/python -u /ghostcache/autotest/autotest -H autoserv --verbose --hostname=172.25.43.234 --user=debug_user /ghostcache/autotest/control.autoserv
root     11987 11960  0 01:56 ?        00:00:00 /usr/bin/python -u /ghostcache/autotest/autotest -H autoserv --verbose --hostname=172.25.43.234 --user=debug_user /ghostcache/autotest/control.autoserv
root     11988 11960  0 01:56 ?        00:00:00 /usr/bin/python -u /ghostcache/autotest/autotest -H autoserv --verbose --hostname=172.25.43.234 --user=debug_user /ghostcache/autotest/control.autoserv
root     13268 13153  0 01:58 pts/0    00:00:00 grep autotest
[email protected]:~#




The job failed:

stack trace in /results/3-debug_user/172.25.43.226/debug/client.0.DEBUG

06/20 01:41:01 ERROR|base_barri:0277| master handshake timeout: (172.25.43.226:11922)
06/20 01:41:01 ERROR|  parallel:0026| child process failed
06/20 01:41:01 DEBUG|  parallel:0030| Traceback (most recent call last):
06/20 01:41:01 DEBUG|  parallel:0030|   File "/ghostcache/autotest/parallel.py", line 18, in fork_start
06/20 01:41:01 DEBUG|  parallel:0030|     l()
06/20 01:41:01 DEBUG|  parallel:0030|   File "/ghostcache/autotest/job.py", line 529, in <lambda>
06/20 01:41:01 DEBUG|  parallel:0030|     l = lambda : test.runtest(self, url, tag, args, dargs)
06/20 01:41:01 DEBUG|  parallel:0030|   File "/ghostcache/autotest/test.py", line 115, in runtest
06/20 01:41:01 DEBUG|  parallel:0030|     job.sysinfo.log_after_each_iteration)
06/20 01:41:01 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/test.py", line 931, in runtest
06/20 01:41:01 DEBUG|  parallel:0030|     mytest._exec(args, dargs)
06/20 01:41:01 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/test.py", line 426, in _exec
06/20 01:41:01 DEBUG|  parallel:0030|     _call_test_function(self.execute, *p_args, **p_dargs)
06/20 01:41:01 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/test.py", line 841, in _call_test_function
06/20 01:41:01 DEBUG|  parallel:0030|     return func(*args, **dargs)
06/20 01:41:01 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/test.py", line 299, in execute
06/20 01:41:01 DEBUG|  parallel:0030|     postprocess_profiled_run, args, dargs)
06/20 01:41:01 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/test.py", line 219, in _call_run_once
06/20 01:41:01 DEBUG|  parallel:0030|     self.run_once(*args, **dargs)
06/20 01:41:01 DEBUG|  parallel:0030|   File "/ghostcache/autotest/tmp/site_tests/netperf2/netperf2.py", line 103, in run_once
06/20 01:41:01 DEBUG|  parallel:0030|     1200).rendezvous(*all)
06/20 01:41:01 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/base_barrier.py", line 514, in rendezvous
06/20 01:41:01 DEBUG|  parallel:0030|     self._run_client(is_master=False)
06/20 01:41:01 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/base_barrier.py", line 391, in _run_client
06/20 01:41:01 DEBUG|  parallel:0030|     while self._remaining() is None or self._remaining() > 0:
06/20 01:41:01 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/base_barrier.py", line 184, in _remaining
06/20 01:41:01 DEBUG|  parallel:0030|     raise error.BarrierError(errmsg)
06/20 01:41:01 DEBUG|  parallel:0030| BarrierError: timeout waiting for barrier: start_1
06/20 01:41:01 INFO |       job:0212|   END ABORT       netperf2.client netperf2.client timestamp=1403228461    localtime=Jun 20 01:41:01
06/20 01:41:01 DEBUG|  base_job:0348| Persistent state client._record_indent now set to 1
06/20 01:41:01 DEBUG|  base_job:0375| Persistent state client.unexpected_reboot deleted
06/20 01:41:01 ERROR|       job:1341| JOB ERROR: timeout waiting for barrier: start_1
06/20 01:41:01 INFO |       job:0212| END ABORT ----    ----    timestamp=1403228461    localtime=Jun 20 01:41:01       timeout waiting for barrier: start_1
06/20 01:41:01 DEBUG|  base_job:0348| Persistent state client._record_indent now set to 0
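
To pull only the ERROR records out of a log like the one above, something along these lines may help. It is a rough Python 3 sketch: the record layout (timestamp, level, module:line, message) is inferred from the excerpt in this mail rather than from any autotest documentation, and the function names are hypothetical. Lines that do not start with a timestamp are treated as continuations of the previous record, so a mail-wrapped paste still parses.

```python
import re

# Layout inferred from the excerpt above, e.g.
# "06/20 01:41:01 ERROR|base_barri:0277| master handshake timeout: ..."
RECORD = re.compile(
    r"^(?P<ts>\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+)\s*\|\s*(?P<module>[^:|]+):(?P<line>\d+)\|\s?(?P<msg>.*)$"
)


def parse_records(text):
    """Split log text into a list of dicts, one per log record."""
    records = []
    for raw in text.splitlines():
        match = RECORD.match(raw)
        if match:
            records.append(match.groupdict())
        elif records and raw.strip():
            # No timestamp: treat as the wrapped tail of the previous record.
            prev = records[-1]
            prev["msg"] = prev["msg"].rstrip() + " " + raw.strip()
    return records


def error_records(text):
    """Return only the records logged at ERROR level."""
    return [r for r in parse_records(text) if r["level"] == "ERROR"]
```

Applied to the excerpt above, this should leave three records: the master handshake timeout, the failed child process, and the final JOB ERROR.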


Thanks,
Bhupesh


On 06/18/2014 03:28 PM, Purandare, Bhupesh wrote:
> Hello Amos,
> Thanks a lot for the reply. I think I was only updating ips in
> ./client/tests/netperf2/control.client.
>
> I am going to update the ips in ./client/tests/netperf2/control.server too
> now and run the tests again.
> Best Regards,
> Bhupesh
>
> On 6/17/14, 3:08 AM, "Amos Kong" <[email protected]> wrote:
>
>> On Thu, Jun 12, 2014 at 04:13:40PM -0400, Bhupesh Purandare wrote:
>>> Hello Amos,
>>> I am trying to run the Netperf2(client) test in Autotest 0.1.5.1.  I
>>> saw your git commits for netperf in autotest.
>>> I am using two hosts as required for the test, setting the IP address
>>> of one to 'client' and the other to 'server' in the control.client file.
>>>
>>> The tests are failing and I see the following in the client.0.DEBUG logs
>>
>> I found netperf2.py in
>> [amos@z autotest]$ find .|grep netperf2.py
>> ./client/tests/netperf2/netperf2.py
>> ./server/tests/netperf2/netperf2.py
>>
>> I guess you updated ip in ./client/tests/netperf2/control.client and
>> ./client/tests/netperf2/control.server,
>> and executed them in two hosts?
>>
>>> 06/12 18:43:16 DEBUG|  parallel:0030|   File "/ghostcache/autotest/parallel.py", line 18, in fork_start
>>> 06/12 18:43:16 DEBUG|  parallel:0030|     l()
>>> 06/12 18:43:16 DEBUG|  parallel:0030|   File "/ghostcache/autotest/job.py", line 529, in <lambda>
>>> 06/12 18:43:16 DEBUG|  parallel:0030|     l = lambda : test.runtest(self, url, tag, args, dargs)
>>> 06/12 18:43:16 DEBUG|  parallel:0030|   File "/ghostcache/autotest/test.py", line 115, in runtest
>>> 06/12 18:43:16 DEBUG|  parallel:0030|     job.sysinfo.log_after_each_iteration)
>>> 06/12 18:43:16 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/test.py", line 931, in runtest
>>> 06/12 18:43:16 DEBUG|  parallel:0030|     mytest._exec(args, dargs)
>>> 06/12 18:43:16 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/test.py", line 426, in _exec
>>> 06/12 18:43:16 DEBUG|  parallel:0030|     _call_test_function(self.execute, *p_args, **p_dargs)
>>> 06/12 18:43:16 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/test.py", line 841, in _call_test_function
>>> 06/12 18:43:16 DEBUG|  parallel:0030|     return func(*args, **dargs)
>>> 06/12 18:43:16 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/test.py", line 299, in execute
>>> 06/12 18:43:16 DEBUG|  parallel:0030|     postprocess_profiled_run, args, dargs)
>>> 06/12 18:43:16 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/test.py", line 219, in _call_run_once
>>> 06/12 18:43:16 DEBUG|  parallel:0030|     self.run_once(*args, **dargs)
>>> 06/12 18:43:16 DEBUG|  parallel:0030|   File "/ghostcache/autotest/tmp/site_tests/netperf2/netperf2.py", line 103, in run_once
>>> 06/12 18:43:16 DEBUG|  parallel:0030|     1200).rendezvous(*all)
>>> 06/12 18:43:16 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/base_barrier.py", line 514, in rendezvous
>>> 06/12 18:43:16 DEBUG|  parallel:0030|     self._run_client(is_master=False)
>>> 06/12 18:43:16 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/base_barrier.py", line 391, in _run_client
>>> 06/12 18:43:16 DEBUG|  parallel:0030|     while self._remaining() is None or self._remaining() > 0:
>>> 06/12 18:43:16 DEBUG|  parallel:0030|   File "/ghostcache/autotest/shared/base_barrier.py", line 184, in _remaining
>>> 06/12 18:43:16 DEBUG|  parallel:0030|     raise error.BarrierError(errmsg)
>>> 06/12 18:43:16 DEBUG|  parallel:0030| BarrierError: timeout waiting for barrier: start_1
>>> 06/12 18:43:16 INFO |       job:0212|   END ABORT       netperf2.client netperf2.client timestamp=1402598596    localtime=Jun 12 18:43:16
>>> 06/12 18:43:16 DEBUG|  base_job:0348| Persistent state client._record_indent now set to 1
>>> 06/12 18:43:16 DEBUG|  base_job:0375| Persistent state client.unexpected_reboot deleted
>>> 06/12 18:43:16 ERROR|       job:1341| JOB ERROR: timeout waiting for barrier: start_1
>>> 06/12 18:43:16 INFO |       job:0212| END ABORT ----    ----    timestamp=1402598596    localtime=Jun 12 18:43:16       timeout waiting for barrier: start_1
>>> 06/12 18:43:16 DEBUG|  base_job:0348| Persistent state client._record_indent now set to 0
>>>
>>> I tried tweaking the netperf2.py file to set higher values for time
>>> parameters; e.g. I tried increasing the wait time for server start from
>>> 10 minutes to 20 minutes.
>>> I also increased the wait time for the "server to reach this point"
>>> from 5 minutes to 10 minutes.
>>>
>>> elif role == 'client':
>>>      # Wait up to ten minutes for the server to start
>>>      self.job.barrier(client_tag, 'start_%d' % num_streams,
>>>                       1200).rendezvous(*all)
>>>      self.client(server_ip, test, test_time, num_streams,
>>>                  test_specific_args, cpu_affinity)
>>>      # Wait up to 5 minutes for the server to also reach this point
>>>      self.job.barrier(client_tag, 'stop_%d' % num_streams,
>>>                       600).rendezvous(*all)
>>                        ^^^ did you try to increase this 600
>> to 1200 ?
>>
>> Did you check the netperf server is launched or not?
>> You can try without Autotest to make sure the network works well.
>>
>>> Can you kindly guide as to what might be causing the test timeout?  Is
>>> there some documentation we should be using to run this test correctly
>>> or are there any patches available to be applied?
>>> Any help will be much appreciated.
>>
>> The latest Autotest is 0.16.0
>>
>>> Thanks,
>>> Bhupesh
>>
>> -- 
>>                      Amos.
>
>

_______________________________________________
Autotest-kernel mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/autotest-kernel
