Quoting r. Grant Grundler <[EMAIL PROTECTED]>:
> Subject: ia64 perf and FMR
>
> Hi,
> Just wanted to share initial perf results (and surprise)
> that I'm getting on the HP ZX1/IA64 boxes.
>
> Before FMR support was committed, netperf was reporting around
> 1720 Mb/s (215 MB/s) for IPoIB with msi_x=1 and netserver pinned
> to the CPU that wasn't taking interrupts. After FMR was committed,
> netperf is reporting about 3500 Mb/s (437 MB/s) for IPoIB. CPU was
> saturated on the send side in all cases.
>
> I've a vague idea what "Fast Memory Registration" is but not a good
> understanding. Can someone point me at a decent explanation of FMR?
>
> I'd like to understand the 2X in performance.
> Maybe we are doing 1/2 as much DMA mapping in one of
> the bug fixes?
>
> And I'm suspicious of the IPoIB numbers since SDP is also seeing
> a bit over 3500 Mb/s and sending CPU is also saturated. I was hoping
> SDP would be 40-60% faster than TCP (ipoib). Maybe I'm just not
> configuring libsdp.conf correctly for netperf and maybe the IPoIB
> numbers are correct. I've "rmmod ib_sdp" on both boxes, unloaded
> and reloaded all the other IB drivers, and "unset LD_PRELOAD".
> Is unloading ib_sdp sufficient to be sure SDP isn't used?
>
> (I do get "module in use" when netserver is running with LD_PRELOAD
> pointing at libsdp.so)
>
>
> I also reviewed all the "__attribute__ ((packed))" uses in
> include/ib_mad.h and include/ib_smi.h. It looks safe to me
> to remove them since every field is "naturally" aligned from
> the start of its respective structure. I also checked
> nested cases. However, while it worked fine, removing all use
> from the two files didn't matter for netperf TCP_STREAM.
>
> I didn't realize other files also use "packed" and will
> have to revisit the issue. I'm mostly worried some
> new use will not be well aligned and cause the compiler
> to insert padding. That will be a PITA to debug.
> What we need is a compiler warning to tell us when/where
> padding is inserted in a structure with a similar __attribute__.
>
> Reminder: not pinning the netserver thread to the other CPU
> costs around 25% performance. I think that's true for any single
> threaded networking perf test that saturates the CPU.
>
> thanks,
> grant

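On the padding worry quoted above: short of a dedicated warning, one option is a compile-time size check next to each structure. What follows is only a minimal sketch, assuming GCC or Clang with C11 `_Static_assert` available. The struct names and fields are hypothetical illustrations and are not taken from include/ib_mad.h or include/ib_smi.h. The idea is that if sizeof() equals the sum of the field sizes, the compiler inserted no padding, so dropping "packed" does not change the layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout (not from ib_mad.h): every field sits at a
 * naturally aligned offset, so no padding should be needed. */
struct demo_natural {
	uint8_t  base_version;
	uint8_t  mgmt_class;
	uint16_t status;
	uint32_t attr_mod;
	uint64_t tid;
};

/* Build fails here if a later change makes the compiler insert padding. */
_Static_assert(sizeof(struct demo_natural) == 1 + 1 + 2 + 4 + 8,
	       "padding inserted in struct demo_natural");
_Static_assert(offsetof(struct demo_natural, tid) == 8,
	       "tid moved in struct demo_natural");

/* Hypothetical counter-example: 'value' is not naturally aligned,
 * so an unpacked build pads after 'flag' on common ABIs. */
struct demo_padded {
	uint8_t  flag;
	uint32_t value;
};

/* Same fields with the attribute: no padding, but 'value' is unaligned. */
struct demo_packed {
	uint8_t  flag;
	uint32_t value;
} __attribute__ ((packed));

/* Padding bytes the compiler added to struct demo_padded. */
static inline size_t demo_padding_bytes(void)
{
	return sizeof(struct demo_padded) -
	       (sizeof(uint8_t) + sizeof(uint32_t));
}
```

GCC also has a -Wpadded option that warns when padding is included in a structure, but it tends to be noisy across a whole tree, so a per-structure assert like the above may be easier to live with. Nested structures would need the same check at each level.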
Can you try with hide DDR? This will disable FMRs for tavor.

--
MST - Michael S. Tsirkin
