On 12/11/2014 5:37 AM, Jay Rolette wrote:
> On Wed, Dec 10, 2014 at 12:39 PM, Ananyev, Konstantin <
> konstantin.ananyev at intel.com> wrote:
>
>>> I just got through replacing that entire function in my repo with a call
>>> to qsort() from the standard library last night myself. Faster
>>> (although probably not material to most deployments) and less code.
>> If you feel like it is worth it, why not to submit a patch? :)
>
> On Haswell and IvyBridge Xeons, with 128 1G huge pages, it doesn't make a
> user-noticeable difference in the time required for
> rte_eal_hugepage_init(). The reason I went ahead and checked it in my repo
> is because:
>
> a) it eats at my soul to see an O(n^2) case for something where qsort() is
> trivial to use
> b) we will increase that up to ~232 1G huge pages soon. Likely doesn't
> matter at that point either, but since it was already written...
>
> What *does* chew up a lot of time in init is where the huge pages are being
> explicitly zeroed in map_all_hugepages().
>
> Removing that memset() makes find_numasocket() blow up, but I was able to
> do a quick test where I only memset 1 byte on each page. That cut init time
> by 30% (~20 seconds in my test). Significant, but since I'm not entirely
> sure it is safe, I'm not making that change right now.
>
> On Linux, shared memory that isn't file-backed is automatically zeroed
> before the app gets it. However, I haven't had a chance to chase down
> whether that applies to huge pages or not, much less how hugetlbfs factors
> into the equation.
>
> Back to the question about the patch, if you guys are interested in it,
> I'll have to figure out your patch submission process. Shouldn't be a huge
> deal other than the fact that we are on DPDK 1.6 (r2).

Go ahead and post it :)

Thanks,
Michael

> Cheers,
> Jay
>
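
For anyone following along, the qsort() swap discussed above could look roughly like the sketch below. This is not Jay's patch. The thread excerpt doesn't name the function being replaced, so the sketch assumes it orders the page table by physical address, and struct page_entry, cmp_physaddr() and sort_pages_by_physaddr() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the EAL's per-page bookkeeping record;
 * the real structure carries more fields than this. */
struct page_entry {
    uint64_t physaddr;  /* physical address of the huge page */
    void *virtaddr;     /* where the page is mapped in this process */
};

/* qsort() comparator: ascending by physical address.
 * Compare explicitly instead of subtracting, because a 64-bit
 * difference does not fit in the int return value. */
static int
cmp_physaddr(const void *a, const void *b)
{
    const struct page_entry *pa = a;
    const struct page_entry *pb = b;

    if (pa->physaddr < pb->physaddr)
        return -1;
    if (pa->physaddr > pb->physaddr)
        return 1;
    return 0;
}

/* Sort n entries in place by physical address (assumed sort key). */
static void
sort_pages_by_physaddr(struct page_entry *pages, size_t n)
{
    assert(pages != NULL || n == 0);
    if (n > 1)
        qsort(pages, n, sizeof(pages[0]), cmp_physaddr);
}
```

Note that qsort() makes no stability guarantee, which should not matter here as long as no two pages share a physical address.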

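
The "memset 1 byte on each page" experiment mentioned above can be sketched along the following lines. This is only an illustration of the idea, not the change Jay tested. touch_each_page() is a hypothetical helper, and page_sz is passed in rather than taken from any DPDK structure. As noted in the thread, it leaves the rest of each page untouched, so it is only safe if the kernel already hands the pages over zeroed, which is still an open question here.

```c
#include <assert.h>
#include <stddef.h>

/* Write a single zero byte at the start of every page in
 * [base, base + len). Returns the number of pages touched.
 * Only the first byte of each page is modified. */
static size_t
touch_each_page(void *base, size_t len, size_t page_sz)
{
    /* volatile so the compiler cannot drop the stores */
    volatile unsigned char *p = base;
    size_t off;
    size_t touched = 0;

    assert(page_sz > 0);
    for (off = 0; off < len; off += page_sz) {
        p[off] = 0;  /* a single write is enough to fault the page in */
        touched++;
    }
    return touched;
}
```

In the scenario from the thread this would be called with the mapped virtual address and the huge page size in place of the full-page memset().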
