On Fri, Aug 19, 2016 at 05:41:20PM +0300, Yauheni Kaliuta wrote:
> Hi!
>
> At the moment there is no clear indication if a process exceeds resource
> limit. In some cases the problematic syscall can return a error, in some cases
> the process can be just killed.
>
> I'm trying to implement some sort of monitoring of such events and have a
> question, what way would be acceptable.
>
> 1) The straight forward solution would be to instrument every such a place
> with
> a printk (something related implemented, for example, by
> d977d56ce5b3e8842236f2f9e7483d4914c9592e).
>
> It has some concerns about reliability and performance (giving a way to flood
> printk buffer because of bad application, for example).
>
> 2) Using tracepoints. I've used a simple program, which dup()s until gets the
> error 3 times:

just to start up the discussion.. ;-)

I'd think this one (2) is the proper way, but generally you need
to come with good justification/usecase to add new tracepoint

also rlimit seems to be difficult to add tracepoints to, because
the checks are spread all over the code.. can't think of a good
solution ATM

> $ sudo ./perf record -e rlimit:rlimit_exceeded ./a.out
> Couldn't dup file: Too many open files, iteration 1020
> Couldn't dup file: Too many open files, iteration 1021
> Couldn't dup file: Too many open files, iteration 1022
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.010 MB perf.data (3 samples) ]
>
> $ sudo ./perf report
> # To display the perf.data header info, please use --header/--header-only
> options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 3 of event 'rlimit:rlimit_exceeded'
> # Event count (approx.): 3
> #
> # Overhead  Trace output
> # ........  ........................................................
> #
>    100.00%  RLIMIT NOFILE violation. Current 1024, requested Unknown
>
> The code to demonstrate the idea below:
>
> diff --git a/fs/file.c b/fs/file.c
> index 6b1acdfe59da..a358de041ac4 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -947,6 +947,9 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes)
>  		else
>  			fput(file);
>  	}
> +	if (ret == -EMFILE)
> +		rlimit_exceeded(RLIMIT_NOFILE,
> +				rlimit(RLIMIT_NOFILE), (u64)-1);
>  	return ret;

how about other places? alloc_fd/get_unused_fd_flags/replace_fd..

jirka
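
The source of the dup() test program quoted above is not part of this thread. For anyone who wants to reproduce the EMFILE case, a minimal userspace sketch might look like the following. It assumes the program simply dup()s one descriptor until dup() has failed with EMFILE a given number of times. The names set_nofile_soft_limit() and dup_until_emfile(), and the choice of /dev/null as the file to duplicate, are made up for illustration and do not come from the original a.out.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper: set the soft RLIMIT_NOFILE to `soft`, capped at
 * the hard limit. Returns 0 on success, -1 on error.
 */
static int set_nofile_soft_limit(rlim_t soft)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return -1;
	if (rl.rlim_max != RLIM_INFINITY && soft > rl.rlim_max)
		soft = rl.rlim_max;
	rl.rlim_cur = soft;
	return setrlimit(RLIMIT_NOFILE, &rl);
}

/*
 * Hypothetical reproducer: dup() a descriptor for /dev/null until dup()
 * has failed with EMFILE `failures_wanted` times. Returns the number of
 * successful dup() calls, or -1 on any other error. Every descriptor
 * opened here is closed again before returning.
 */
static long dup_until_emfile(int failures_wanted)
{
	struct rlimit rl;
	long ok = 0, iter = 0, result;
	int failures = 0, src, *fds;

	assert(failures_wanted > 0);

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
		return -1;

	/* Fewer than rlim_cur dups can succeed, so this array is big enough. */
	fds = malloc(rl.rlim_cur * sizeof(*fds));
	if (fds == NULL)
		return -1;

	src = open("/dev/null", O_RDONLY);
	if (src < 0) {
		free(fds);
		return -1;
	}

	while (failures < failures_wanted) {
		int fd = dup(src);

		if (fd >= 0) {
			fds[ok++] = fd;
		} else if (errno == EMFILE) {
			failures++;
			fprintf(stderr, "Couldn't dup file: %s, iteration %ld\n",
				strerror(errno), iter);
		} else {
			break;	/* unexpected error, give up */
		}
		iter++;
	}

	result = (failures == failures_wanted) ? ok : -1;

	while (ok > 0)
		close(fds[--ok]);
	close(src);
	free(fds);
	return result;
}
```

Calling dup_until_emfile(3) from main() gives the three failures seen in the perf run above. Lowering the soft limit first with set_nofile_soft_limit() keeps the run short on systems with a large default limit.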

