Thanks for the answer.
> This is the development list - I expect reported problems for development
> releases not older than one month.

Fritz once told me to post here, so I have done so ever since. Where do I
get such a version, then? I tested the newest one available on SourceForge.
Do you have an svn/git repo?

> Seems to be a system related problem, if only one is not running well.
> IMHO this is caused by a system or assp configuration problem, if the same
> software is running on the same hardware differently

All the servers I'm running have the same hardware configuration and exactly
the same system (OS), which is partition-cloned. The servers might otherwise
be under different loads (apache, php, passenger, courier etc.), and that
would explain the huge number of errors when reserving futex locks on
/etc/localtime. It seems that assp is wasting CPU time waiting for futex
locks on this file. The solution would still be not to overuse futex locks
on this file.

> ASSP is written in Perl not C or C++ . The FUTEX calls are used by Perl
> (threads and/or threads::shared) on your OS (seems to be a linux) for
> shared memory access.

Obviously, but it means that some Perl library is overusing access to
/etc/localtime (or maybe some other files too). Perhaps some Perl lib used
by assp is not really up to the task of working under huge loads. Maybe some
other software running on this server is doing the same thing at the same
time and causing congestion. How can I find out what is causing all those
calls to /etc/localtime in assp? (I'm not really familiar with the assp code
or the Perl language.)

> Switching on 'WorkerLog' and 'SignalLog' to the highest level could help
> to investigate the problem. Enabling 'debug' could also help.

OK, I've enabled those options. I'll send the log file to your email, since
I don't want to post it publicly.

2015-02-26 8:12 GMT+01:00 Thomas Eckardt <thomas.ecka...@thockar.com>:

> You wrote:
>
> > I have assp (ASSP version 2.4.1(14132)) running on multiple servers.
> > I've tried the newest version of assp (ASSP version 2.4.3(14349)).
>
> This is the development list - I expect reported problems for development
> releases not older than one month.
>
> > On one of these servers assp is causing high CPU usage.
>
> Seems to be a system related problem, if only one is not running well.
> IMHO this is caused by a system or assp configuration problem, if the same
> software is running on the same hardware differently
>
> > FUTEX ...
>
> ASSP is written in Perl, not C or C++. The FUTEX calls are used by Perl
> (threads and/or threads::shared) on your OS (seems to be a linux) for
> shared memory access.
>
> > Assp "ASSP Worker/DB/Regex Status" shows workers in "ThreadGetNewCon"
> > status.
>
> It looks like the threads are having problems communicating with each
> other via shared memory. Another reason could be a socket problem that
> causes the MainThread to wake up all worker threads because no worker is
> answering.
>
> > access of /etc/localtime by assp process. Why does assp constantly
> > access it?
>
> ASSP does not access /etc/localtime directly. ASSP reads the time via Perl
> from the OS (this is a clib function). It looks like this OS function
> needs to constantly reread /etc/localtime.
>
> Switching on 'WorkerLog' and 'SignalLog' to the highest level could help
> to investigate the problem. Enabling 'debug' could also help.
>
> Thomas
>
> From: "krz...@gmail.com " <krz...@gmail.com>
> To: ASSP development mailing list <assp-test@lists.sourceforge.net>
> Date: 26.02.2015 01:38
> Subject: Re: [Assp-test] assp 100% cpu but basicly idle
>
> A month has passed and not even a single suggestion on how to debug the
> problem. Is the assp project dead?
>
> strace run for 10 seconds on a server that is not experiencing the problem:
>
> root@sv1 [/root]# strace -f -p 29554 -c
> Process 29554 attached with 8 threads - interrupt to quit
> ^CProcess 29554 detached
> Process 29645 detached
> Process 29970 detached
> Process 29991 detached
> Process 30032 detached
> Process 30441 detached
> Process 30658 detached
> Process 31035 detached
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  49.47    0.116007        1208        96           poll
>  31.64    0.074212         299       248           nanosleep
>  17.06    0.040001        8000         5           restart_syscall
>   1.71    0.004001          37       109        32 futex
>   0.12    0.000272           0      3259           sched_yield
>   0.01    0.000030           0        70           fcntl
>   0.00    0.000000           0        31         6 read
>   0.00    0.000000           0        38           write
>   0.00    0.000000           0         3           open
>   0.00    0.000000           0         8           close
>   0.00    0.000000           0       575         3 stat
>   0.00    0.000000           0         2           fstat
>   0.00    0.000000           0         9         8 lseek
>   0.00    0.000000           0         1           rt_sigaction
>   0.00    0.000000           0        58           rt_sigprocmask
>   0.00    0.000000           0         9         9 ioctl
>   0.00    0.000000           0         6           select
>   0.00    0.000000           0        34           alarm
>   0.00    0.000000           0         2           socket
>   0.00    0.000000           0         4         2 connect
>   0.00    0.000000           0         1           accept
>   0.00    0.000000           0         1           sendto
>   0.00    0.000000           0         1           recvfrom
>   0.00    0.000000           0         6           getsockname
>   0.00    0.000000           0         3         1 getpeername
>   0.00    0.000000           0         4           getdents
>   0.00    0.000000           0         1           rename
>   0.00    0.000000           0         1           prctl
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.234523                  4585        61 total
>
> strace run for 10 seconds on the server that is experiencing the problem:
>
> Process 22699 attached with 8 threads - interrupt to quit
> ^CProcess 22699 detached
> Process 22703 detached
> Process 22714 detached
> Process 22870 detached
> Process 22885 detached
> Process 22959 detached
> Process 23071 detached
> Process 23248 detached
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  33.46    2.232141       54442        41           poll
>  28.29    1.887313       14630       129           nanosleep
>  20.51    1.368086      456029         3           restart_syscall
>  17.43    1.162859           1   1190382    271641 futex
>   0.30    0.020000        5000         4           close
>   0.00    0.000161           0     21696           sched_yield
>   0.00    0.000018           0      4753         1 stat
>   0.00    0.000000           0        17         3 read
>   0.00    0.000000           0        16           write
>   0.00    0.000000           0         4           open
>   0.00    0.000000           0         3           fstat
>   0.00    0.000000           0         8         6 lseek
>   0.00    0.000000           0         1           rt_sigaction
>   0.00    0.000000           0        54           rt_sigprocmask
>   0.00    0.000000           0         8         8 ioctl
>   0.00    0.000000           0        11           select
>   0.00    0.000000           0        32           alarm
>   0.00    0.000000           0         1           socket
>   0.00    0.000000           0         2         1 connect
>   0.00    0.000000           0         1           accept
>   0.00    0.000000           0        15           sendto
>   0.00    0.000000           0        14           recvfrom
>   0.00    0.000000           0         6           getsockname
>   0.00    0.000000           0        17        15 getpeername
>   0.00    0.000000           0        54           fcntl
>   0.00    0.000000           0         4           getdents
>   0.00    0.000000           0         1           prctl
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    6.670578               1217277    271675 total
>
> It is immediately apparent that there is a much higher number of errors
> (271641) on futex calls. Why is that? I will just repeat that the OS is
> the same on both (cloned partition), the hardware is the same, and assp is
> the same.
> I speculate that this is because of all the futex calls made during the
> constant access of /etc/localtime by the assp process. Why does assp
> constantly access it? How can I fix it?
>
> [pid 22714] futex(0x7f11d6efea5c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f11d6efea58, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> [pid 22959] <... futex resumed> ) = 0
> [pid 22714] stat("/etc/localtime", <unfinished ...>
> [pid 22959] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
> [pid 22714] <... stat resumed> {st_mode=S_IFREG|0644, st_size=2679, ...}) = 0
> [pid 22959] <... futex resumed> ) = 0
> [pid 22959] futex(0x7f11d6efea5c, FUTEX_WAIT_PRIVATE, 1750855243, NULL <unfinished ...>
> [pid 22714] futex(0x7f11d6efea5c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f11d6efea58, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
> [pid 22959] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
> [pid 22714] <... futex resumed> ) = 0
> [pid 22959] futex(0x7f11d6efea20, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
> [pid 22714] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
> [pid 22959] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
> [pid 22714] <... futex resumed> ) = 0
> [pid 22959] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
> [pid 22714] futex(0x7f11d6efea5c, FUTEX_WAIT_PRIVATE, 1750855245, NULL <unfinished ...>
> [pid 22959] <... futex resumed> ) = 0
> [pid 22959] futex(0x7f11d6efea5c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f11d6efea58, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> [pid 22714] <... futex resumed> ) = 0
> [pid 22959] futex(0x7f11d6efea5c, FUTEX_WAIT_PRIVATE, 1750855247, NULL <unfinished ...>
> [pid 22714] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1) = 0
> [pid 22714] futex(0x7f11d6efea5c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f11d6efea58, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
> [pid 22959] <... futex resumed> ) = 0
> --
> [pid 22714] <... futex resumed> ) = 0
> [pid 22959] <... sched_yield resumed> ) = 0
> [pid 22959] stat("/etc/localtime", <unfinished ...>
> [pid 22714] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
> [pid 22959] <... stat resumed> {st_mode=S_IFREG|0644, st_size=2679, ...}) = 0
> [pid 22714] <... futex resumed> ) = 0
> [pid 22959] sched_yield( <unfinished ...>
> [pid 22714] sched_yield( <unfinished ...>
> [pid 22959] <... sched_yield resumed> ) = 0
> [pid 22959] sched_yield( <unfinished ...>
>
> _______________________________________________
> Assp-test mailing list
> Assp-test@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/assp-test
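
For comparing the two `strace -c` summaries quoted in this thread, a small script can pull out the per-syscall call and error counts. This is only a sketch under assumptions: the names `SyscallStat`, `parse_strace_summary`, `HEALTHY`, `BUSY` and `WINDOW_SECONDS` are made up for illustration, the sample rows are copied from the two tables, and the 10-second window is taken from the description of the runs.

```python
from typing import Dict, NamedTuple


class SyscallStat(NamedTuple):
    seconds: float  # total time attributed to the syscall
    calls: int
    errors: int


def parse_strace_summary(text: str) -> Dict[str, SyscallStat]:
    """Parse the table printed by `strace -c` into {syscall: SyscallStat}."""
    stats: Dict[str, SyscallStat] = {}
    for line in text.splitlines():
        # Tolerate email quoting ("> ") in front of each row.
        fields = line.lstrip("> ").split()
        # Data rows have 5 fields (errors column empty) or 6 fields.
        if len(fields) not in (5, 6) or fields[-1] == "total":
            continue
        try:
            seconds = float(fields[1])
            calls = int(fields[3])
            errors = int(fields[4]) if len(fields) == 6 else 0
        except ValueError:
            continue  # separator line, not a data row
        stats[fields[-1]] = SyscallStat(seconds, calls, errors)
    return stats


# A few rows copied from the two summaries in this thread.
HEALTHY = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  1.71    0.004001          37       109        32 futex
  0.12    0.000272           0      3259           sched_yield
  0.00    0.000000           0       575         3 stat
"""

BUSY = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 17.43    1.162859           1   1190382    271641 futex
  0.00    0.000161           0     21696           sched_yield
  0.00    0.000018           0      4753         1 stat
------ ----------- ----------- --------- --------- ----------------
100.00    6.670578               1217277    271675 total
"""

WINDOW_SECONDS = 10  # assumption: both traces ran for about 10 seconds

healthy = parse_strace_summary(HEALTHY)
busy = parse_strace_summary(BUSY)

for name in ("futex", "sched_yield", "stat"):
    h, b = healthy[name], busy[name]
    print(
        f"{name:12s} healthy {h.calls / WINDOW_SECONDS:10.1f}/s  "
        f"busy {b.calls / WINDOW_SECONDS:10.1f}/s  "
        f"ratio x{b.calls / h.calls:.0f}  "
        f"error share {h.errors / h.calls:.0%} vs {b.errors / b.calls:.0%}"
    )
```

On the rows sampled here, the share of futex calls that return an error is of the same order on both servers (roughly 29% and 23%); what differs is the number of calls, by a factor of about 10,900.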