Hi Antoine,

There is obviously something wrong with the clock on your setup.
Just take a look at the leftmost timestamp in manager.log: it shows that the 
manager ran from 10:31:37.295 to 10:39:26.425. However, it also shows on lines 
472, 730 and 1912 that the clock suddenly and temporarily jumped 4398 seconds 
(+01:13:18) into the future. This problem is further confirmed in report.xml, 
for example:
<!-- update step at 12  2010-05-28 10:33:22.823 [4481546ms]-->
<!-- update step at 18  2010-05-28 10:33:22.823 [83500ms]-->
For your information, the "YMD HMS" and the [ms] timestamps are printed from 
different variables, and the [ms] timestamp is sometimes too high by 4398046 ms. 
This is the same offset as in your previous report.xml files, so the 
problem looks very reproducible.
Checking the source code, I tracked this problem down to getmilliseconds() 
in utils.cpp and to the gettimeofday() system call. The root cause is most 
probably the system clock itself, which transiently jumps into the future for 
some reason... When that happens, the manager believes the current step is 
complete and increases the load to the next step.
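
As an illustration only (this is not the actual code from utils.cpp, and the 
function name is made up), a millisecond helper built on gettimeofday() looks 
roughly like the sketch below, which shows why a forward clock jump translates 
directly into an inflated [ms] timestamp:

    /* Illustration only, not the actual code from utils.cpp: a
     * millisecond counter derived from gettimeofday(). If the system
     * clock jumps ~4398 seconds ahead, the value returned here jumps
     * by roughly the same amount in milliseconds, and any elapsed-time
     * computation based on it makes the current load step look finished. */
    #include <sys/time.h>

    unsigned long long wall_clock_ms(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return (unsigned long long)tv.tv_sec * 1000ULL
               + (unsigned long long)tv.tv_usec / 1000ULL;
    }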

It is very likely that your system clock problem originates from a disagreement 
between the hypervisor and the virtual machine about what time it is. By the 
way, which hypervisor are you using?
I will let you investigate the HV and VM clock misconfiguration further. You 
already killed ntpd and ptpd on the VM, but there is probably a synchronization 
feature in the HV that regularly adjusts the clock of the VM... Or maybe yet 
another time-related daemon on the VM incorrectly moves the clock ahead and the 
HV (almost) immediately restores the correct time...
To demonstrate the problem, and later to confirm that it has been solved, you 
can write a simple C program that calls gettimeofday() in a loop, prints the 
tv_sec field of the timeval structure, and checks whether it has suddenly 
decreased (showing that it was incorrectly increased in the previous loop 
iteration). To avoid 100% CPU usage, you could add a sleep() inside the loop, 
but since that system call is also time-based, it might actually prevent your 
program from catching the problem!
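
Something along these lines should do (just a sketch; compile it with gcc and 
leave it running on the VM while a test is in progress):

    /* Sketch of the detector described above: sample gettimeofday()
     * in a loop and report whenever tv_sec goes backwards, i.e. the
     * previous sample was taken while the clock was wrongly ahead. */
    #include <stdio.h>
    #include <sys/time.h>
    #include <unistd.h>   /* only for the optional usleep() */

    int main(void)
    {
        struct timeval prev, cur;
        gettimeofday(&prev, NULL);
        for (;;) {
            gettimeofday(&cur, NULL);
            if (cur.tv_sec < prev.tv_sec) {
                printf("clock went back: %ld -> %ld (%ld s)\n",
                       (long)prev.tv_sec, (long)cur.tv_sec,
                       (long)(prev.tv_sec - cur.tv_sec));
                fflush(stdout);
            }
            prev = cur;
            /* usleep(1000); */  /* lowers CPU usage, but being time-based
                                    it may hide the very problem */
        }
        return 0;
    }

The usleep() is left commented out on purpose; if you enable it and the program 
stays silent while manager.log still shows the jump, disable it again and re-run.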

I don't mean that IMS Bench SIPp is not supported in a VM, but we have never 
tested that setup. In fact, we usually do the opposite: in order to benchmark 
our SUTs, we use a stack of at least 4 physical servers on which we run at 
least 4 SIPp TS instances. Anyway, it might run in a VM, provided the generated 
load is not too high, but you will probably lose some precision in the results. 
The major prerequisite, on a VM as on a physical server, is that the clock is 
linear and monotonic, which is currently not the case on your setup.
Regarding clock precision, the installation guide at 
http://sipp.sourceforge.net/ims_bench/reference.html#Pre-requisites recommends 
recompiling the kernel with the timer frequency set to 1000Hz. Some 
distributions ship a kernel already configured that way, and maybe you 
recompiled your kernel accordingly on the VM. But whatever the configuration of 
the VM, it still depends on the configuration of the underlying HV, which may 
well be out of your control.
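
If you want a quick sanity check of the timer granularity actually available 
inside the VM, a small program like this one (again only a rough sketch, and 
the exact figures will depend on the kernel and on the HV) measures how long a 
nominal 1 ms sleep really takes:

    /* Rough granularity check: request a 1 ms sleep many times and
     * report the worst observed duration, measured with gettimeofday(). */
    #include <stdio.h>
    #include <sys/time.h>
    #include <time.h>

    int main(void)
    {
        struct timespec req = { 0, 1000000 };   /* 1 ms */
        struct timeval t1, t2;
        long us, worst_us = 0;
        int i;

        for (i = 0; i < 1000; i++) {
            gettimeofday(&t1, NULL);
            nanosleep(&req, NULL);
            gettimeofday(&t2, NULL);
            us = (t2.tv_sec - t1.tv_sec) * 1000000L
                 + (t2.tv_usec - t1.tv_usec);
            if (us > worst_us)
                worst_us = us;
        }
        printf("worst observed 1 ms sleep: %ld us\n", worst_us);
        return 0;
    }

Roughly speaking, a kernel ticking at 1000Hz should keep this in the low 
milliseconds; much larger values would hint at a coarser timer or at HV 
scheduling effects.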

I guess that your current goal is to validate whether IMS Bench SIPp can be 
used to benchmark SailFin, and to check that the generated reports contain the 
information you expect. For that purpose, I understand your choice of running 
in a VM at a low load, because you plan to validate features rather than full 
performance. I also guess that, once you have validated the tools, you would 
deploy a real test setup using one or more dedicated physical servers in order 
to benchmark the actual performance of your SailFin SUT.
If that's the case, and unless the VM/HV clock misconfiguration turns out to be 
trivial to fix, there is no real value in spending your time making it work in 
a VM, because the final setup would use physical servers anyway. Instead, I 
would suggest temporarily using a real physical machine: a low-end server or 
even a desktop should be good enough to validate the features under a low load.

Regarding the "segmentation fault" problem, I think that it is related to the 
clock issue, because we obviously assume that the clock is monotonic and we 
make decisions based on the amount of time spent. If the time difference is 
sometimes negative, then we might take wrong decisions, such as deleting an 
object which could later be accessed at another point in the code when the time 
difference is correct again... So I wouldn't worry too much about it, until the 
clock issue is solved.
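
Purely as an illustration of that kind of failure mode (this is not the actual 
code path in the manager), here is what can happen when an elapsed time is 
computed from unsigned millisecond timestamps and the clock has stepped 
backwards in between, reusing the two values from your report.xml:

    /* Illustration only: if the "start" timestamp was taken while the
     * clock was ahead (4481546 ms) and "now" comes from the corrected
     * clock (83500 ms), the unsigned subtraction wraps around to a huge
     * value, any timeout test fires immediately, and an object may be
     * deleted while other code still expects it to exist. */
    #include <stdio.h>

    int main(void)
    {
        unsigned long long start_ms   = 4481546ULL;  /* clock was ahead */
        unsigned long long now_ms     = 83500ULL;    /* clock back to normal */
        unsigned long long timeout_ms = 32000ULL;    /* arbitrary example */

        unsigned long long elapsed = now_ms - start_ms;   /* wraps around */
        if (elapsed > timeout_ms)
            printf("bogus timeout: elapsed=%llu ms\n", elapsed);
        return 0;
    }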

Regards,
Patrice

-----Original Message-----
From: Antoine Roly [mailto:antoine.r...@gmail.com] 
Sent: Friday, May 28, 2010 10:47 AM
To: Buriez, Patrice
Cc: sipp-users@lists.sourceforge.net
Subject: RE: sipp ims bench

Hi Patrice,

I've tried some tests with initialSAPS set to an even value, but the
results are still strange, and the SAPS increases more than expected.
The files you asked for (the report and manager.log) are attached.

If the problem could come from the clock, I'm going to investigate in
that direction. The TS and manager are running in a virtual machine, so
maybe something is wrong there... It shouldn't be, and I've never had a
problem with it, but you never know...

Regards,

A.



Le jeudi 27 mai 2010 à 18:35 +0100, Buriez, Patrice a écrit :
> Hi Antoine,
> 
> This is really weird: the [ms] timestamp in report.xml still moves back 
> and forth, while the "YMD HMS.ms" one seems correct!
> Because of that transient wrong time reference, the load is increased too 
> often. That's why you got 60 instead of 5.
> 
> Can you do one more try, with InitialSAPS set to an even value, or to any 
> multiple of (StirSteps+1)?
> Please also attach the manager.log file.
> It's OK to run the manager and TS on the same computer.
> 
> Regards,
> Patrice
> 
> -----Original Message-----
> From: Antoine Roly [mailto:antoine.r...@gmail.com] 
> Sent: Thursday, May 27, 2010 6:14 PM
> To: Buriez, Patrice
> Cc: sipp-users@lists.sourceforge.net
> Subject: RE: sipp ims bench
> 
> Hi Patrice,
> 
> I've "svn co" revision 587 and killed ptpd and ntpd. 
> I haven't seen anything weird when I compiled the soft (make rmtl,  ossl
> and mgr as in the doc). 
> 
> The manager and the TS are the same host, so I suppose it's ok to run
> the test without both ntpd and ptpd, but I had to put the MaxTimeOffset
> to 0. 
> I don't know if this can have an important negative impact on the test
> (other than for the time in the report of course).
> 
> I've made several tests today, the results are strange. Almost all tests
> end correctly (i.e. without seg fault, but the results are weird), some
> test ends with a seg fault like in a previous mail.
> 
> Here are the 3 files from the latest test... In this one, the SAPS
> increased more than expected and overloaded SailFin. As you can see in
> the report, the requested load of the first step was 5, but the mean
> value is 60!!! I don't understand why the SAPS increases so much. GSL is
> working, and I think the software uses it to generate traffic, so...
> 
> Obviously there's something wrong, maybe in the way I'm using the bench,
> I don't know... Is it possible that it's not working as expected due to the
> very low values I'm using (for initialSAPS, SAPSincreaseAmount, ...)? I
> suppose not, but... Or is it because I have only a single TS running on the
> same host as the manager, and without ntpd or ptpd?
> 
> Regards,
> 
> A.
> 
> Le mercredi 26 mai 2010 à 17:54 +0100, Buriez, Patrice a écrit :
> > Hi Antoine,
> > 
> > I investigated the files you sent.
> > The report.xml file suggests that the time reference is moving back and 
> > forth.
> > I see several possible reasons for that:
> > 
> > - Are you running ntpd and ptpd at the same time?
> > If that's the case, kill at least one of them, or even both, and try again.
> > 
> > - The "Segmentation fault" suggests that something is going really bad. May 
> > be the stack got corrupted...
> > Try a "make clean", then "make", and check for errors and warnings. 
> > Anything weird there?
> > 
> > - We might have a regression in IMS Bench SIPp.
> > Get revision 587 and try again with this first version that supports 
> > SailFin:
> >     svn co -r 587 
> > https://sipp.svn.sourceforge.net/svnroot/sipp/sipp/branches/ims_bench 
> > ims_bench-587
> > 
> > Regards,
> > Patrice
> > 
> > -----Original Message-----
> > From: Antoine Roly [mailto:antoine.r...@gmail.com] 
> > Sent: Wednesday, May 26, 2010 2:12 PM
> > To: Buriez, Patrice
> > Subject: sipp ims bench
> > 
> > Hi Patrice,
> > 
> > Here are the files you asked.
> >  
> > For this test, only one instance of SIPp was running, on the same host
> > as the manager. I suppose this is not a problem. Of course, the SUT was
> > another host.
> > 
> > Thanks in advance
> > 
> > Regards,
> > 
> > Antoine
> > 

