Hi Jerin,

> -----Original Message-----
> From: dev <dev-boun...@dpdk.org> On Behalf Of Gavin Hu (Arm Technology
> China)
> Sent: Monday, November 11, 2019 10:01 PM
> To: jer...@marvell.com; dev@dpdk.org
> Cc: Olivier Matz <olivier.m...@6wind.com>; Andrew Rybchenko
> <arybche...@solarflare.com>; David Christensen <d...@linux.vnet.ibm.com>;
> bruce.richard...@intel.com; konstantin.anan...@intel.com;
> hemant.agra...@nxp.com; Shahaf Shuler <shah...@mellanox.com>;
> Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>;
> vikto...@rehivetech.com; anatoly.bura...@intel.com; Steve Capper
> <steve.cap...@arm.com>; Ola Liljedahl <ola.liljed...@arm.com>; nd
> <n...@arm.com>
> Subject: Re: [dpdk-dev] Mbuf memory alignment constraints for
> (micro)architectures
> 
> Hi Jerin,
> 
> > -----Original Message-----
> > From: Jerin Jacob Kollanukkaran <jer...@marvell.com>
> > Sent: Thursday, October 31, 2019 2:02 AM
> > To: dev@dpdk.org
> > Cc: Olivier Matz <olivier.m...@6wind.com>; Andrew Rybchenko
> > <arybche...@solarflare.com>; David Christensen <d...@linux.vnet.ibm.com>;
> > bruce.richard...@intel.com; konstantin.anan...@intel.com;
> > hemant.agra...@nxp.com; Shahaf Shuler <shah...@mellanox.com>;
> > Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; Gavin Hu (Arm
> > Technology China) <gavin...@arm.com>; vikto...@rehivetech.com;
> > anatoly.bura...@intel.com
> > Subject: Mbuf memory alignment constraints for (micro)architectures
> >
> > CC:  Arch and platform maintainers
> >
> > While reviewing the mempool object allocation requirements in the code,
> >
> > A) it was found that, in the default case, mempool objects have padding
> > in the object trailer so that the start addresses of objects are spread
> > across the different channels, loading the DRAM channels equally for
> > better performance.
> >
> > # More documentation is here
> > https://doc.dpdk.org/guides/prog_guide/mempool_lib.html
> > in section 8.3. Memory Alignment Constraints
> >
> > B) optimize_object_size() implements the channel distribution requirement
> > with the following formula
> >
> >         new_obj_size = (obj_size + RTE_MEMPOOL_ALIGN_MASK) / RTE_MEMPOOL_ALIGN;
> >         while (get_gcd(new_obj_size, nrank * nchan) != 1)
> >                 new_obj_size++;
> >
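For readers following along, here is a minimal standalone sketch of the padding formula in (B). It is not the DPDK source: ALIGN = 64 stands in for RTE_MEMPOOL_ALIGN and is an assumption, padded_obj_size() and gcd_u() are hypothetical names, and the fallback values for a zero channel or rank count are chosen for illustration only.

```c
/* Assumed stand-in for RTE_MEMPOOL_ALIGN (typically the cache line size). */
#define ALIGN 64u
#define ALIGN_MASK (ALIGN - 1u)

/* Euclid's algorithm; hypothetical helper playing the role of get_gcd(). */
static unsigned int gcd_u(unsigned int a, unsigned int b)
{
	while (b != 0) {
		unsigned int t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/*
 * Round obj_size up to whole ALIGN units, then grow it until the unit
 * count is coprime with nrank * nchan. Consecutive objects then start
 * on different channels/ranks when the controller interleaves on
 * ALIGN-sized blocks. Returns the padded size in bytes.
 */
static unsigned int padded_obj_size(unsigned int obj_size,
				    unsigned int nchan, unsigned int nrank)
{
	unsigned int units;

	/* Illustrative fallbacks so the loop below always terminates. */
	if (nchan == 0)
		nchan = 4;
	if (nrank == 0)
		nrank = 1;

	units = (obj_size + ALIGN_MASK) / ALIGN;
	while (gcd_u(units, nrank * nchan) != 1)
		units++;

	return units * ALIGN;
}
```

With obj_size = 2048, nchan = 4 and nrank = 1, the unit count goes from 32 to 33, so each object occupies 2112 bytes, i.e. 64 bytes of padding per object. That padding is the memory cost discussed in (C).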
> >
> > C) The formula mentioned in (B) is NOT generic. At least on the octeontx2
> > SoC, the memory/DDR controller works in a different way:
> > # It XORs some of the physical address lines (not the user-space VA)
> > to compute a hash, and that hash function defines the actual channel.
> >
> > The XOR (CRC-like) scheme is useful because it gives a natural channel
> > distribution based on the address, i.e. there is no need to waste memory
> > on padding.
> >
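To make the XOR scheme in (C) concrete, here is a rough sketch of how such a hash could pick a channel. The bit masks below are purely hypothetical and are not taken from any octeontx2 documentation; xor_hash_channel() and parity64() are illustrative names, and a 4-channel system (2 select bits) is assumed.

```c
#include <stdint.h>

/*
 * Hypothetical masks: each entry lists the physical address bits that
 * are XORed together to produce one channel-select bit.
 */
static const uint64_t chan_bit_masks[2] = {
	(1ull << 7) | (1ull << 12) | (1ull << 17),
	(1ull << 8) | (1ull << 13) | (1ull << 18),
};

/* XOR of all bits in x: 1 if an odd number of bits are set, else 0. */
static unsigned int parity64(uint64_t x)
{
	x ^= x >> 32;
	x ^= x >> 16;
	x ^= x >> 8;
	x ^= x >> 4;
	x ^= x >> 2;
	x ^= x >> 1;
	return (unsigned int)(x & 1u);
}

/* Map a physical address to a channel index in the range 0..3. */
static unsigned int xor_hash_channel(uint64_t phys_addr)
{
	unsigned int chan = 0;
	unsigned int i;

	for (i = 0; i < 2; i++)
		chan |= parity64(phys_addr & chan_bit_masks[i]) << i;

	return chan;
}
```

Because several higher-order address bits feed each select bit, objects laid out at a power-of-two stride can still land on different channels without padding. With these example masks, physical addresses 0, 4096, 8192 and 12288 map to channels 0, 1, 2 and 3 respectively.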
> > So, in short, the padding scheme is not needed for some SoCs. I am trying
> > to send a patch to fix it. So the question is:
> >
> > # Do PPC and other ARM SoCs use formula (B) to compute the DRAM channel
> > distribution, or is it specific to x86? That would define where the hooks
> > need to be added to have a proper fix.
> Reading through some documents for both x86 and Arm, and after an internal
> discussion, it looks like this is specific to x86: x86 spreads adjacent
> virtual addresses within a page across multiple memory devices, and the
> interleaving is done per one or two cache lines.
> https://software.intel.com/en-us/articles/how-memory-is-accessed
> 
> Arm leaves flexibility to implementations; there is no fixed pattern for
> interleaving, so it can hardly be generalized.
Same conclusion, with more detail on this topic (from inside Arm):
"Interleaving (or striping) happens at the interconnect/memory controller
level, so on Arm-based systems it's going to be highly dependent on the given
SoC's integration and probably the system configuration too. Arm's own
interconnect and DMC IPs generally offer various options to support striping,
but even then it's the integrator's choice how to use them, and obviously there
are multitudes of alternative third-party IPs too.
In summary, this really depends on the system's interconnect and memory
controller capabilities and how it has been configured."
/Gavin
