On 06/10/15 10:28 +0200, Dejan Muhamedagic wrote:
> On Mon, Oct 05, 2015 at 07:00:18PM +0300, Vladislav Bogdanov wrote:
>> 14.09.2015 02:31, Andrew Beekhof wrote:
>>>
>>>> On 8 Sep 2015, at 10:18 pm, Ulrich Windl
>>>> <[email protected]> wrote:
>>>>
>>>>>>> Vladislav Bogdanov <[email protected]> wrote on 08.09.2015 at 14:05 in
>>>> message <[email protected]>:
>>>>> Hi,
>>>>>
>>>>> just discovered very interesting issue.
>>>>> If there is a system user with very big UID (80000002 in my case),
>>>>> then crm_report (actually 'grep' it runs) consumes too much RAM.
>>>>>
>>>>> Relevant part of the process tree at that moment looks like (word-wrap
>>>>> off):
>>>>> USER       PID %CPU %MEM     VSZ     RSS TTY STAT START TIME COMMAND
>>>>> ...
>>>>> root     25526  0.0  0.0  106364     636 ?   S    12:37 0:00 \_ /bin/sh /usr/sbin/crm_report --dest=/var/log/crm_report -f 0000-01-01 00:00:00
>>>>> root     25585  0.0  0.0  106364     636 ?   S    12:37 0:00 \_ bash /var/log/crm_report/collector
>>>>> root     25613  0.0  0.0  106364     152 ?   S    12:37 0:00 \_ bash /var/log/crm_report/collector
>>>>> root     25614  0.0  0.0  106364     692 ?   S    12:37 0:00 \_ bash /var/log/crm_report/collector
>>>>> root     27965  4.9  0.0  100936     452 ?   S    12:38 0:01 | \_ cat /var/log/lastlog
>>>>> root     27966 23.0 82.9 3248996 1594688 ?   D    12:38 0:08 | \_ grep -l -e Starting Pacemaker
>>>>> root     25615  0.0  0.0  155432     600 ?   S    12:37 0:00 \_ sort -u
>>>>>
>>>>> ls -ls /var/log/lastlog shows:
>>>>> 40 -rw-r--r--. 1 root root 23360000876 Sep 8 04:36 /var/log/lastlog
>>>>>
>>>>> That is sparse binary file, which consumes only 40k of disk space.
>>>>> At the same time its size is 23GB, and grep takes all the RAM trying to
>>>>> grep a string from a 23GB of mostly zeroes without new-lines.
>>>>>
>>>>> I believe this is worth fixing,
>>>
>>> Shouldn't this be directed to the grep folks?
>>
>> Actually, not everything in /var/log are textual logs. Currently
>> findmsg() [z,bz,xz]cats _every_ file there and greps for a pattern.
>> Shouldn't it skip some well-known ones? btmp, lastlog and wtmp are
>> good candidates to be skipped. They are not intended to be handled
>> as a text.
>>
>> Or may be just test that file is a text in a find_decompressor() and
>> to not cat it if it is not?
>>
>> something like
>> find_decompressor() {
>>     if echo $1 | grep -qs 'bz2$'; then
>>         echo "bzip2 -dc"
>>     elif echo $1 | grep -qs 'gz$'; then
>>         echo "gzip -dc"
>>     elif echo $1 | grep -qs 'xz$'; then
>>         echo "xz -dc"
>>     elif file $1 | grep -qs 'text'; then
>>         echo "cat"
>>     else
>>         echo "echo"
>
> Good idea.

Even better might be to use process substitution and avoid cat'ing
when it is not needed, even for plain text files. That assumes GNU
grep 2.13+, which, in combination with the kernel, attempts to detect
sparse files and marks them as binary files[1]; this can then be
exploited with the -I option. But that is not expected to work under
/bin/sh, and achieving the same in a compatible way would be quite
clumsy, not to mention that it relies on non-POSIX extensions to grep.

And I don't think the grep folks can do any better with piped input...

[1] http://git.savannah.gnu.org/cgit/grep.git/tree/NEWS?id=c528aa1da0ef1635fa48c3ec804162cf3e71cb79#n22

>>     fi
>> }

--
Jan (Poki)
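
For illustration only, here is a minimal sketch of the "let grep open the
file itself" variant, assuming a grep that supports the non-POSIX -I
option (GNU grep does). The name grep_logs is hypothetical and is not
part of crm_report; compressed files still go through a pipe, so binary
detection on a named file only helps in the plain-file branch.

```shell
#!/bin/sh
# Hypothetical helper, NOT the actual crm_report/findmsg code.
# Usage: grep_logs PATTERN FILE...
# Prints the name of every FILE whose content matches PATTERN.
grep_logs() {
    pattern=$1
    shift
    for f in "$@"; do
        case $f in
            # Compressed logs must be piped; grep sees only a stream here.
            *.bz2) bzip2 -dc "$f" | grep -q -e "$pattern" && echo "$f" ;;
            *.gz)  gzip -dc "$f"  | grep -q -e "$pattern" && echo "$f" ;;
            *.xz)  xz -dc "$f"    | grep -q -e "$pattern" && echo "$f" ;;
            # Plain files: no cat. -I (a GNU extension, an assumption here)
            # makes grep treat files it considers binary as non-matching.
            *)     grep -I -l -e "$pattern" "$f" ;;
        esac
    done
}
```

Called as grep_logs 'Starting Pacemaker' /var/log/*, this would list only
files that grep considers text and that contain the pattern.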
_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
