Quoting Roland Dreier <[EMAIL PROTECTED]>:

> Subject: Re: [PATCH] fmr support in mthca
>
> Michael> Good, glad to help. I will try to address your comments
> Michael> next week (it's already the weekend here).
>
> No problem. Libor won't be back until Monday so I won't even try
> SDP until then.
>
> Roland> What if we just reserve something like 64K MPTs and MTTs
> Roland> for FMRs and ioremap everything at driver startup? That
> Roland> would only use a few MB of vmalloc space and probably
> Roland> simplify the code too.
>
> Michael> I don't like these pre-allocations - if someone is only
> Michael> using SDP and IP over IB, it seems he will hardly need
> Michael> any regular regions. 64K MTTs with 4K page size cover up
> Michael> to 256 MByte of memory.
>
> We can bump up the numbers if you want. Right now the default
> allocation is 1 << 20 MTT segments (8 << 20 MTT entries). I see no
> problem with having 64K MPTs and 256K MTT segments reserved for
> FMRs by default. That should be more than enough for a single HCA
> -- 256K MTT segments means that 2 million pages or 8 GB of IO could
> be in flight at a time, which doesn't seem like a harsh limit to me.
>
> Ultimately we can make the allocations tunable at device init time,
> along with the rest of the parameters (number of QPs, number of
> CQs, etc). I haven't seen much pressure to do that so far but it is
> definitely in my plans.
>
> Michael> My other problem with this approach is one of
> Michael> implementation: the existing allocator and table code can
> Michael> be passed a reserved parameter, but they don't have the
> Michael> ability to allocate out of that pool. So we'd have to
> Michael> allocate out of a separate allocator, and take care that
> Michael> keys do not conflict. This gets a bit complicated.
>
> I think this is the way to go. Keys are easy to deal with -- in
> mthca_init_mr_table, we could just pass dev->limits.num_fmrs
> instead of dev->limits.reserved_mrws when initializing
> dev->mr_table.mpt_alloc, and then create a new table of size
> dev->limits.num_fmrs and reserve dev->limits.reserved_mrws out of
> that table.
>
> The buddy allocator is a little more work, but it needs to be
> cleaned up and encapsulated better anyway. Once that's done we'd
> just have two buddy allocators. The first one would cover all the
> MTT segments; we'd first take out a chunk of that one to cover the
> reserved MTTs, and then allocate another chunk that can hold
> whatever number of MTT segments we decide to use for FMRs.
>
> Michael> Maybe do something separate for 32-bit kernels (like
> Michael> disabling FMR support)?
>
> No FMRs on 32-bit kernels isn't going to fly. It doesn't seem that
> hard to make things work on i386, so why not do it?
>
> Michael> Yes, but for MTTs the addresses may not be physically
> Michael> contiguous, unless we want to limit FMRs to PAGE_SIZE/8
> Michael> MTTs, which means 512 MTTs -- that is 2 MByte with a 4K
> Michael> FMR page size. And even with this limitation it seems
> Michael> possible that the MTTs for a specific FMR start at a
> Michael> non-page-aligned boundary.
>
> I think it's fine to limit an FMR to 512 MTT entries. I'd have to
> look at the source to be sure of the exact numbers, but I know that
> for the Topspin stack, neither SDP nor SRP is using more than 32
> entries per FMR. A limit of mapping 512 pages/2 MB per FMR seems
> fine. I don't know of anyone using FMRs even close to that big.
>
> Even if it turns out to be too small, I see no problem with adding
> a small array of something on the order of 2 or 4 MTT pages.
>
> If we use the buddy allocator for the MTT entries backing FMRs,
> then alignment is OK. The buddy allocator guarantees that objects
> will be aligned to their size, which means that the MTT segments
> will never cross a page boundary.
>
> - R.

OK. I thought about it and I buy this design. I'll prepare a patch
along these lines.
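First, a quick sanity check of the numbers above (a throwaway
userspace snippet, not driver code; the 8-entries-per-segment and
4K-page figures are the ones used in this thread):

/* Sanity-check the sizing arithmetic from the mail above. */
#include <stdio.h>

int main(void)
{
	const long long entries_per_seg = 8;	/* MTT entries per segment */
	const long long page = 4096;		/* 4K FMR page size */

	/* Default pool: 1 << 20 MTT segments. */
	printf("default MTT entries: %lld\n", (1LL << 20) * entries_per_seg);

	/* Proposed FMR reservation: 256K MTT segments. */
	printf("IO in flight: %lld GB\n",
	       256LL * 1024 * entries_per_seg * page / (1 << 30));

	/* Per-FMR cap: PAGE_SIZE/8 = 512 MTT entries. */
	printf("per-FMR mapping: %lld MB\n", 512 * page / (1 << 20));

	return 0;
}

That prints 8388608 entries (= 8 << 20), 8 GB and 2 MB, matching the
figures quoted above.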
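And to convince myself of the alignment argument, here is a toy model
of a buddy allocator (an untested sketch, nothing like the real mthca
code beyond the idea): blocks of order k always start at a multiple
of 1 << k segments, so with 64-byte segments any block of up to 64
segments (512 MTT entries) sits inside a single 4K page of the MTT
table:

/* Toy buddy allocator over MTT segments: one free list per order.
 * Untested model, not the driver code.  A segment is 8 MTT entries
 * of 8 bytes = 64 bytes; the MTT table is assumed to start on a
 * page boundary. */
#include <stdio.h>

#define MAX_ORDER  10			/* model 1 << 10 segments */
#define NSEG	   (1 << MAX_ORDER)
#define SEG_BYTES  64
#define PAGE_BYTES 4096

static int free_list[MAX_ORDER + 1][NSEG];
static int free_count[MAX_ORDER + 1];

static void buddy_init(void)
{
	/* One free block covering the whole table. */
	free_list[MAX_ORDER][free_count[MAX_ORDER]++] = 0;
}

/* Allocate 1 << order contiguous segments; return first index or -1. */
static int buddy_alloc(int order)
{
	int o, seg;

	for (o = order; o <= MAX_ORDER; ++o)
		if (free_count[o])
			goto found;
	return -1;

found:
	seg = free_list[o][--free_count[o]];

	/* Split down to the requested order, putting the upper buddy
	 * of each split back on its free list.  The whole table starts
	 * at 0, so every order-o block starts at a multiple of
	 * 1 << o segments. */
	while (o > order) {
		--o;
		free_list[o][free_count[o]++] = seg + (1 << o);
	}
	return seg;
}

int main(void)
{
	int order;

	buddy_init();

	/* Two-allocator plan in miniature: carve the FMR chunk out of
	 * the main pool first; it comes back aligned to its own size. */
	printf("FMR chunk at segment %d\n", buddy_alloc(6));

	/* A 512-entry FMR needs 64 segments = one order-6 block =
	 * exactly one 4K page of MTT table; smaller powers of two
	 * stay within a page for the same reason. */
	for (order = 0; order <= 6; ++order) {
		int seg   = buddy_alloc(order);
		int start = seg * SEG_BYTES;
		int end   = start + (SEG_BYTES << order) - 1;

		printf("order %d: bytes %6d..%6d -> %s\n", order, start, end,
		       start / PAGE_BYTES == end / PAGE_BYTES ?
		       "one page" : "CROSSES page boundary");
	}
	return 0;
}

The real patch will of course need freeing and proper bookkeeping,
but the split path alone already shows the invariant the design
relies on.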
MST

--
MST - Michael S. Tsirkin
