On 25.01.2006 [08:17:29 +0200], Michael S. Tsirkin wrote:
> Quoting r. Nishanth Aravamudan <[EMAIL PROTECTED]>:
> > Subject: Re: [openib-general] Re: Re: Userspace testing results
> > (manykernels, many svn trees)
> >
> > On 24.01.2006 [23:19:52 +0200], Michael S. Tsirkin wrote:
> > > Quoting r. Nishanth Aravamudan <[EMAIL PROTECTED]>:
> > > > Subject: Re: [openib-general] Re: Re: Userspace testing results
> > > > (manykernels, many svn trees)
> > > >
> > > > On 24.01.2006 [21:39:23 +0200], Michael S. Tsirkin wrote:
> > > > > Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> > > > > > Subject: Re: [openib-general] Re: Re: Userspace testing results
> > > > > > (manykernels, many svn trees)
> > > > > >
> > > > > >     Michael> 1 sec = 5.37731e+14 usec
> > > > > >
> > > > > >     Michael> which seems to indicate something's still wrong.
> > > > > >
> > > > > > BTW this number is pretty close to 2^32 times bigger than 1e6,
> > > > > > so the problem is probably still using long long to return the
> > > > > > result of mftb (which will result in shifting the result by 32
> > > > > > bits, ie multiplying by 2^32).
> > > > >
> > > > > Hmm.
> > > > > Maybe make clean wasn't run after updating?
> > > > > Could it be run on rev 5174?
> > > >
> > > > Heh, here's what happens with 5174:
> > > >
> > > > Correlation coefficient r^2: 0.773428 < 0.9
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > > 1 sec = inf usec
> > > >
> > > > And so forth...
> > > >
> > > > Thanks,
> > > > Nish
> > >
> > > Hmm. Looks like mftb is returning wrong data.
> > > Could you uncomment lines setting DEBUG and DEBUG_DATA at the top?
> > > This will print all mftb values out.
> >
> > Here you go:
> >
> > <snip>
> > x=1990 y=397692
> > x=2000 y=399776
> > x=2010 y=401853
> > x=2020 y=403711
> > x=2030 y=405478
> > x=2040 y=407577
> > x=2050 y=409618
> > x=2060 y=411603
> > x=2070 y=413642
> > x=2080 y=415601
> > x=2090 y=417823
> > a = -8.02523
> > b = 199.818
> > a / b = -0.0401626
> > r^2 = 0.999999
> > Warning: measured timestamp frequency 199.818 differs from nominal 1600 MHz
> > 1 sec = 1.00195e+06 usec
> > 1 sec = 1.00198e+06 usec
> > 1 sec = 1.00207e+06 usec
> > 1 sec = 1.00207e+06 usec
> > 1 sec = 1.00207e+06 usec
> > 1 sec = 1.00207e+06 usec
> > 1 sec = 1.00207e+06 usec
> > 1 sec = 1.00207e+06 usec
> > 1 sec = 1.00207e+06 usec
> > 1 sec = 1.00207e+06 usec
> > Seems to work fine now ... what changed?
> Time to try rdma_lat/rdma_bw I guess.

I think rdma_lat and rdma_bw are fixed now, magically. The first job of
the day hasn't finished, but I checked the unformatted logs and it seems
to give the following:

rdma_lat:

Warning: measured timestamp frequency 199.838 differs from nominal 1600 MHz
loading libehca
local address: LID 0x0d QPN 0x140406 PSN 0xee1d06 RKey 0x2340032 VAddr 0x0000001001a001
remote address: LID 0x08 QPN 0x140406 PSN 0x790ae8 RKey 0x2340032 VAddr 0x0000001001a001
<snip all the values>
Latency typical: 6.10244 usec
Latency best   : 6.00736 usec
Latency worst  : 71.9282 usec

rdma_bw:

Warning: measured timestamp frequency 199.82 differs from nominal 1600 MHz
loading libehca
local address: LID 0x0d, QPN 0x150406, PSN 0x7cca90 RKey 0x23a0032 VAddr 0x000000f7fce000
remote address: LID 0x08, QPN 0x150406, PSN 0x35668f, RKey 0x23a0032 VAddr 0x000000f7fb8000
Bandwidth peak (#0 to #963): 233.043 MB/sec
Bandwidth average: 233.041 MB/sec
Service Demand peak (#0 to #963): 837 cycles/KB
Service Demand Avg : 50 cycles/KB

Thanks for the debugging,
Nish
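
For anyone wanting to sanity-check the a, b and r^2 values in the DEBUG_DATA
output, here is a minimal sketch of an ordinary least-squares fit over the
eleven (x, y) samples quoted in this thread. It assumes that x is an elapsed
time in microseconds and y the corresponding timebase delta, so the slope b
would be the timestamp frequency in MHz; the thread does not confirm that
reading, and the names XS, YS and fit_line are made up for illustration. This
is not the calibration code from the test tools themselves.

```python
# Ordinary least-squares fit y = a + b*x over the quoted DEBUG_DATA samples.
# Assumption (not confirmed by the thread): x is elapsed usec, y is the
# timebase delta, so the slope b is the measured frequency in MHz.

XS = [1990, 2000, 2010, 2020, 2030, 2040, 2050, 2060, 2070, 2080, 2090]
YS = [397692, 399776, 401853, 403711, 405478, 407577,
      409618, 411603, 413642, 415601, 417823]


def fit_line(xs, ys):
    """Return (a, b, r2): intercept, slope and squared correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums of squares and cross products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx               # slope
    a = mean_y - b * mean_x     # intercept
    r2 = (sxy * sxy) / (sxx * syy)
    return a, b, r2


if __name__ == "__main__":
    a, b, r2 = fit_line(XS, YS)
    print("a = %g" % a)
    print("b = %g" % b)
    print("r^2 = %g" % r2)
```

On just these eleven points the slope comes out close to 200, in line with
the 199.818 reported for the full data set and well below the nominal 1600
MHz that the warning mentions. The intercept will differ from the quoted
a = -8.02523, since only the tail of the data appears here.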
