Robin,

On Mon, May 07, 2007 at 06:24:38AM -0500, Robin Holt wrote:
> > > > Yes, sorry, I meant locked_vm. It may not be updated in the v2.0.
> > > 
> > > I am a little confused at how you got to using locked_vm? Shouldn't you
> > > be using do_mlock() to lock the region of memory and letting the kernel
> > > page fault handler update locked_vm? In the code, you mention a denial
> > > of service issue. What is that issue?
> > > 
> > Using mlock() requires root privileges. You do not want to restrict
> > monitoring tools (using sampling) to root users. Thus you have to do
> 
> This is wrong. I have often used mlock as a normal user. Not sure
> where you got this, but it is valid for a normal user to be able to
> use mlock. Why else would there be an _USER_ limit?
> 
Ok, you are right, you can use mlock() as a regular user.
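For reference, below is a minimal sketch of the locked_vm accounting being
discussed, assuming the 2.6-era pattern of a subsystem that pins pages itself
and charges them against RLIMIT_MEMLOCK. The helper name pfm_charge_locked_vm
is made up for illustration; this is not the actual perfmon code.

#include <linux/capability.h>
#include <linux/errno.h>
#include <linux/mm.h>
#include <linux/sched.h>

/*
 * Hypothetical helper: charge 'npages' of kernel-locked buffer memory
 * against the caller's RLIMIT_MEMLOCK, the way a subsystem has to when
 * it locks pages itself instead of relying on mlock() and the fault
 * handler to update locked_vm.
 */
static int pfm_charge_locked_vm(unsigned long npages)
{
	struct mm_struct *mm = current->mm;
	unsigned long locked, lock_limit;
	int ret = 0;

	down_write(&mm->mmap_sem);

	locked = mm->locked_vm + npages;
	lock_limit = current->signal->rlim[RLIMIT_MEMLOCK].rlim_cur >> PAGE_SHIFT;

	if (locked > lock_limit && !capable(CAP_IPC_LOCK))
		ret = -ENOMEM;          /* over the per-user memlock limit */
	else
		mm->locked_vm = locked; /* account the locked pages ourselves */

	up_write(&mm->mmap_sem);
	return ret;
}

Whether the pages end up locked via mlock() or directly by the kernel, a
check of this kind is what enforces the per-user limit.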
> > the update yourself. Furthermore, we cannot afford to take page faults in
> > the PMU interrupt handler.
> 
> You do not need to take any page faults in the PMU handler. While you
> are setting up the pages for sampling, call do_mlock from the kernel
> and you will get the pages locked. To ensure they are faulted, do
> get_user_pages().

That is a possibility.

> That said, I think mlock at all is the wrong thing. Where did that
> code come from?

We are not using mlock(). I think you are commenting on the v2.0 code base.
In v2.x (x>0), the way we allocate the sampling buffer is different. During
context creation, we pass a desired size in bytes. The kernel does the
allocation and locks in the pages. Then the user has to call mmap() to make
the buffer visible in user space. There is no mlock() required; the kernel
does the equivalent.

> > The issue is with programs which would request a very large buffer.
> > Keep in mind that the sampling buffer is allocated by the kernel
> > with vmalloc(). This is not user-pageable memory.
> 
> So limit the sample buffer size using something reasonable. mlock is
> not the same as a shared buffer size. We are finding that on even small
> sized systems, your requirement would require us to increase the mlock
> limit for everything on the system in order to sample with the existing
> number of entries.
> 
> For SuSE, the ulimit default is 128k. That is 8 pages on ia64. With the
> current algorithm, I am limited to 8 cpus before I exceed that limit.
> That is a tiny system and is _SIGNIFICANTLY_ below the old limit.

Whether we use mlock() or the kernel locks the pages directly does not
change the fact that we need to limit what regular users can consume to
avoid attacks. Given that we actually lock in some pages, I think
RLIMIT_MEMLOCK is relevant. The default hard limit set by distros is very
small, especially for large machines, yet you can always adjust it by
changing the limit in /etc/limits.conf.

Note that we have a second-level mechanism in place to avoid users creating
lots of processes with small buffers: perfmon also supports a global limit
for all sampling buffers of all users. That limit is available in v2.2 or
higher and can be configured via /sys.

> On a completely separate note, this entire calculation in user space
> is probably a bad idea. Why not let the kernel calculate the limit and
> return the failure? Don't require a similar calculation to be done in
> two separate locations, which just adds complexity that does not need to
> be there.

Applications can just pick a size; the kernel computes the minimal buffer
size and checks the limits. If that succeeds you are done, if not you get
E2BIG. This is the way this works.

Pfmon does this differently because it lets the user give the number of
entries in the buffer as opposed to its size. As such, it needs to convert
this into a byte size (a sketch of this conversion is appended at the end of
this message). I think using 'number of entries' is more practical for
end-users because they do not need to know the actual size of a sample. But
what pfmon does is independent of what the kernel interface does. You can
build a tool which exposes the buffer size as an option.

-- 
-Stephane

_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
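As announced above, here is a small, hypothetical sketch of the "number of
entries" to "byte size" conversion a tool like pfmon performs before asking
the kernel for a buffer. The header and entry sizes below are placeholders,
not the real sample format, and the program only does the arithmetic; the
actual context creation, the kernel-side limit checks (E2BIG), and the
mmap() of the buffer are as described in the message.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	long page_size = sysconf(_SC_PAGESIZE);

	/* placeholder sizes, not the real perfmon sample format */
	size_t hdr_size   = 64;   /* buffer header */
	size_t entry_size = 32;   /* one sample entry */

	unsigned long num_entries = (argc > 1) ? strtoul(argv[1], NULL, 0) : 2048;

	/* entries -> bytes, rounded up to a whole number of pages, since the
	 * kernel allocates and locks whole pages for the sampling buffer */
	size_t bytes = hdr_size + num_entries * entry_size;
	bytes = (bytes + page_size - 1) & ~((size_t)page_size - 1);

	printf("%lu entries -> request %zu bytes (%zu pages)\n",
	       num_entries, bytes, bytes / (size_t)page_size);

	/* the tool would then pass 'bytes' at context creation time; the
	 * kernel checks it against RLIMIT_MEMLOCK and the global /sys limit
	 * and returns E2BIG if it cannot be satisfied */
	return 0;
}

Letting the tool think in entries while the kernel thinks in bytes keeps the
limit enforcement in exactly one place: the kernel.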
