A month has passed and there has not been even a single suggestion on how to debug this problem. Is the assp project dead?

strace run for 10 seconds on the server that is not experiencing the problem:

root@sv1 [/root]# strace -f -p 29554 -c
Process 29554 attached with 8 threads - interrupt to quit
^CProcess 29554 detached
Process 29645 detached
Process 29970 detached
Process 29991 detached
Process 30032 detached
Process 30441 detached
Process 30658 detached
Process 31035 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.47    0.116007        1208        96           poll
 31.64    0.074212         299       248           nanosleep
 17.06    0.040001        8000         5           restart_syscall
  1.71    0.004001          37       109        32 futex
  0.12    0.000272           0      3259           sched_yield
  0.01    0.000030           0        70           fcntl
  0.00    0.000000           0        31         6 read
  0.00    0.000000           0        38           write
  0.00    0.000000           0         3           open
  0.00    0.000000           0         8           close
  0.00    0.000000           0       575         3 stat
  0.00    0.000000           0         2           fstat
  0.00    0.000000           0         9         8 lseek
  0.00    0.000000           0         1           rt_sigaction
  0.00    0.000000           0        58           rt_sigprocmask
  0.00    0.000000           0         9         9 ioctl
  0.00    0.000000           0         6           select
  0.00    0.000000           0        34           alarm
  0.00    0.000000           0         2           socket
  0.00    0.000000           0         4         2 connect
  0.00    0.000000           0         1           accept
  0.00    0.000000           0         1           sendto
  0.00    0.000000           0         1           recvfrom
  0.00    0.000000           0         6           getsockname
  0.00    0.000000           0         3         1 getpeername
  0.00    0.000000           0         4           getdents
  0.00    0.000000           0         1           rename
  0.00    0.000000           0         1           prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.234523                  4585        61 total

strace run for 10 seconds on the server that is experiencing the problem:

Process 22699 attached with 8 threads - interrupt to quit
^CProcess 22699 detached
Process 22703 detached
Process 22714 detached
Process 22870 detached
Process 22885 detached
Process 22959 detached
Process 23071 detached
Process 23248 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 33.46    2.232141       54442        41           poll
 28.29    1.887313       14630       129           nanosleep
 20.51    1.368086      456029         3           restart_syscall
 17.43    1.162859           1   1190382    271641 futex
  0.30    0.020000        5000         4           close
  0.00    0.000161           0     21696           sched_yield
  0.00    0.000018           0      4753         1 stat
  0.00    0.000000           0        17         3 read
  0.00    0.000000           0        16           write
  0.00    0.000000           0         4           open
  0.00    0.000000           0         3           fstat
  0.00    0.000000           0         8         6 lseek
  0.00    0.000000           0         1           rt_sigaction
  0.00    0.000000           0        54           rt_sigprocmask
  0.00    0.000000           0         8         8 ioctl
  0.00    0.000000           0        11           select
  0.00    0.000000           0        32           alarm
  0.00    0.000000           0         1           socket
  0.00    0.000000           0         2         1 connect
  0.00    0.000000           0         1           accept
  0.00    0.000000           0        15           sendto
  0.00    0.000000           0        14           recvfrom
  0.00    0.000000           0         6           getsockname
  0.00    0.000000           0        17        15 getpeername
  0.00    0.000000           0        54           fcntl
  0.00    0.000000           0         4           getdents
  0.00    0.000000           0         1           prctl
------ ----------- ----------- --------- --------- ----------------
100.00    6.670578               1217277    271675 total

It is immediately obvious that the problem server makes far more futex calls (1190382 versus 109) and that a much higher number of them fail (271641 errors versus 32). Why is that? I will just repeat that the OS is the same on both machines (cloned partition), the hardware is the same, and assp is the same.

I speculate that this is caused by all the futex calls made around the constant accesses to /etc/localtime by the assp process (4753 stat calls versus 575 on the healthy server). Why does assp constantly access it? How can I fix it?

[pid 22714] futex(0x7f11d6efea5c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f11d6efea58, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
[pid 22959] <... futex resumed> ) = 0
[pid 22714] stat("/etc/localtime", <unfinished ...>
[pid 22959] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 22714] <... stat resumed> {st_mode=S_IFREG|0644, st_size=2679, ...}) = 0
[pid 22959] <... futex resumed> ) = 0
[pid 22959] futex(0x7f11d6efea5c, FUTEX_WAIT_PRIVATE, 1750855243, NULL <unfinished ...>
[pid 22714] futex(0x7f11d6efea5c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f11d6efea58, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
[pid 22959] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 22714] <... futex resumed> ) = 0
[pid 22959] futex(0x7f11d6efea20, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 22714] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 22959] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 22714] <... futex resumed> ) = 0
[pid 22959] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 22714] futex(0x7f11d6efea5c, FUTEX_WAIT_PRIVATE, 1750855245, NULL <unfinished ...>
[pid 22959] <... futex resumed> ) = 0
[pid 22959] futex(0x7f11d6efea5c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f11d6efea58, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
[pid 22714] <... futex resumed> ) = 0
[pid 22959] futex(0x7f11d6efea5c, FUTEX_WAIT_PRIVATE, 1750855247, NULL <unfinished ...>
[pid 22714] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 22714] futex(0x7f11d6efea5c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f11d6efea58, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
[pid 22959] <... futex resumed> ) = 0
--
[pid 22714] <... futex resumed> ) = 0
[pid 22959] <... sched_yield resumed> ) = 0
[pid 22959] stat("/etc/localtime", <unfinished ...>
[pid 22714] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 22959] <... stat resumed> {st_mode=S_IFREG|0644, st_size=2679, ...}) = 0
[pid 22714] <... futex resumed> ) = 0
[pid 22959] sched_yield( <unfinished ...>
[pid 22714] sched_yield( <unfinished ...>
[pid 22959] <... sched_yield resumed> ) = 0
[pid 22959] sched_yield( <unfinished ...>
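
To put numbers on the difference between the two servers, here is a small Python sketch that compares the call counts from the two strace -c summaries. It is only a rough helper for reading the tables: the names healthy, problem and compare are made up for this illustration and are not part of assp or strace, and the counts are copied by hand from the output in this message. It shows the futex calls on the problem server being roughly 10,900 times more frequent, with about 23% of them failing.

```python
# Counts copied by hand from the two `strace -c` summaries (10 seconds each).
# Each entry maps a syscall name to (calls, errors).
# The dictionary and function names are hypothetical, for illustration only.
healthy = {"futex": (109, 32), "stat": (575, 3), "sched_yield": (3259, 0)}
problem = {"futex": (1190382, 271641), "stat": (4753, 1), "sched_yield": (21696, 0)}


def compare(healthy, problem):
    """Return {syscall: (call_ratio, problem_error_rate)}.

    call_ratio is problem calls divided by healthy calls;
    problem_error_rate is the fraction of calls that failed on the problem server.
    """
    result = {}
    for name, (h_calls, _h_errors) in healthy.items():
        p_calls, p_errors = problem[name]
        result[name] = (p_calls / h_calls, p_errors / p_calls)
    return result


if __name__ == "__main__":
    for name, (ratio, err_rate) in compare(healthy, problem).items():
        print(f"{name:12s} {ratio:10.1f}x more calls, {err_rate:6.1%} failed")
```

The stat and sched_yield counts grow by less than a factor of ten, so the futex traffic is what stands out; this only quantifies the symptom and says nothing about the cause.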