Hi Stephen,

You may want to look at the local tailqs patchset I've made. This may fix your issues as well.
http://dpdk.org/ml/archives/dev/2014-June/003573.html

I'm planning to respin a v4 of it, adding the use of the --base-virtaddr flag to also control where rte_config is mapped, in addition to hugepages.

Thanks,
Anatoly

> -----Original Message-----
> From: Gooch, Stephen [mailto:stephen.gooch at windriver.com]
> Sent: Friday, June 20, 2014 3:37 PM
> To: Burakov, Anatoly; Richardson, Bruce; dev at dpdk.org
> Subject: RE: mmap() hint address
>
> Hello,
>
> One item I should have included is that this device is running a 32-bit
> 2.6.27 kernel, quite old, and sharing 4GB of RAM with a number of
> applications. We were able to find the issue. In the failure case, the vDSO
> is mapped lower (toward [heap]) than normal. As a result, .rte_config was
> mapped into the pre-mapped PCI uio resource virtual address range.
>
> The fix: (1) move the uio mmap() out of the narrow range at the bottom of the
> memory maps and (2) create spacing between the uio maps and the rte_config
> mmap(). It works with all huge page settings tested.
>
> - Stephen
>
> -----Original Message-----
> From: Burakov, Anatoly [mailto:anatoly.burakov at intel.com]
> Sent: Monday, June 16, 2014 1:00 AM
> To: RICHARDSON, BRUCE; Gooch, Stephen; dev at dpdk.org
> Subject: RE: mmap() hint address
>
> Hi Bruce, Stephen,
>
> > > Hello,
> > >
> > > I have seen a case where a secondary DPDK process tries to map a uio
> > > resource, where the corresponding virtual address is normally passed
> > > to mmap() as a hint address. However, in some instances mmap() returns
> > > a virtual address that is not the hint address, which results in
> > > rte_panic(), and the secondary process goes defunct.
> > >
> > > This happens from time to time on an embedded device when nr_hugepages is
> > > set to 128, but never when nr_hugepages is set to 256 on the same device.
> > > My question is, if mmap() can find the correct memory regions when
> > > hugepages is set to 256, would it not require fewer resources (and
> > > therefore be more likely to pass) at a lower value such as 128?
> > >
> > > Any ideas what would cause this mmap() behavior at a lower
> > > nr_hugepages value?
> > >
> > > - Stephen
> >
> > Hi Stephen,
> >
> > That's a strange one!
> > I don't know for definite why this is happening, but here is one
> > possible theory. :-)
> >
> > It could be due to the size of the memory blocks that are getting mmapped.
> > When you use 256 pages, the blocks of memory getting mapped may well
> > be larger (depending on how fragmented in memory the 2MB pages are),
> > and so may be getting mapped at a higher set of address ranges where
> > there is more free memory. This set of address ranges is then free in
> > the secondary process and it is similarly able to map the memory.
> > With the 128 hugepages, you may be looking for smaller amounts of
> > memory and so the addresses get mapped in at a different spot in the
> > virtual address space, one that may be more heavily used. Then when
> > the secondary process tries to duplicate the mappings, it already has
> > memory in that region in use and the mapping fails.
> > In short, one theory is that having bigger blocks to map causes the
> > memory to be mapped to a different location in the virtual address
> > space, one which is free from conflicts in the secondary process.
> >
> > So, how to confirm or refute this, and generally debug this issue?
> > Well, in general we would need to look at the messages printed out at
> > startup in the primary process to see how big the blocks are that it is
> > trying to map in each case, and where they end up in the virtual
> > address space.
>
> As I remember, the OVDK project has had vaguely similar issues (only they were
> trying to map hugepages into the space that QEMU had already occupied).
> This resulted in us adding a --base-virtaddr EAL command-line flag that
> specifies the virtual address at which the primary process starts mapping
> pages. I guess you can try that as well (just remember that it needs to be
> done in the primary process, because the secondary one just copies the
> mappings and either succeeds or fails to do so).
>
> Best regards,
> Anatoly Burakov
> DPDK SW Engineer
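
For reference, the behavior at the heart of this thread (mmap() treating its address argument only as a hint unless MAP_FIXED is given) can be sketched in C. This is a minimal illustration under stated assumptions, not DPDK's actual code: the helper name map_at_hint is hypothetical, and it maps anonymous memory rather than a uio resource or a hugepage file.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map `len` bytes of anonymous memory, passing `hint`
 * as the preferred address. Without MAP_FIXED the kernel is free to place
 * the mapping elsewhere (for example when the hinted range is already in
 * use), so the result has to be compared against the hint.
 *
 * Returns the mapped address on success. Returns NULL if mmap() failed or
 * if the kernel chose a different address; in the latter case the
 * unwanted mapping is released again.
 */
static void *map_at_hint(void *hint, size_t len)
{
    assert(len > 0); /* mmap() rejects zero-length mappings */

    void *addr = mmap(hint, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return NULL;

    if (addr != hint) {
        /* The hint was not honored: report it and undo the mapping. */
        fprintf(stderr, "mmap: wanted %p, got %p\n", hint, addr);
        munmap(addr, len);
        return NULL;
    }
    return addr;
}
```

Calling map_at_hint() a second time with the same hint while the first mapping is still live shows the failure mode described above: the kernel will not overlap an existing mapping, so the second call gets a different address and returns NULL. That is the situation a secondary process ends up in when something else (here, .rte_config) already occupies the range it needs.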