I took a look at dolog() & log_flush(). Both use semop(). If I understood the semop man page correctly, a negative sem_op value means 'down' (i.e. enter a critical section) and a positive sem_op value means 'up' (i.e. leave the critical section). Given that, it looks to me like the syslog calls in dolog() & log_flush() print incorrect information. Am I right?
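To make the convention I have in mind concrete, here is a minimal sketch. It is not the open-iscsi code: the lock_* helper names are hypothetical, it assumes Linux SysV semaphores, and retrying on EINTR is my own suggestion rather than something the current code does. The point is only that the label in the error message should match the sign of sem_op.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Argument for semctl(); same layout as the "union semun" that the
 * Linux man page asks callers to define. Hypothetical name. */
union lock_semarg {
	int val;
	struct semid_ds *buf;
	unsigned short *array;
};

/* Create a private set with one semaphore, initialised to 1 (unlocked). */
static int lock_create(void)
{
	union lock_semarg arg;
	int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

	if (semid < 0)
		return -1;
	arg.val = 1;
	if (semctl(semid, 0, SETVAL, arg) < 0) {
		semctl(semid, 0, IPC_RMID);
		return -1;
	}
	return semid;
}

/* Apply delta (-1 or +1) to the semaphore. The label passed in is what
 * ends up in the error message, so it has to match the sign of delta. */
static int lock_op(int semid, short delta, const char *label)
{
	struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };

	assert(delta == 1 || delta == -1);
	while (semop(semid, &op, 1) < 0) {
		if (errno == EINTR)
			continue;	/* interrupted by a signal: retry (assumption) */
		fprintf(stderr, "semop %s failed %d (%s)\n",
			label, errno, strerror(errno));
		return -1;
	}
	return 0;
}

/* Negative sem_op: 'down', i.e. enter the critical section. */
static int lock_down(int semid) { return lock_op(semid, -1, "down"); }

/* Positive sem_op: 'up', i.e. leave the critical section. */
static int lock_up(int semid) { return lock_op(semid, 1, "up"); }

/* Current semaphore value: 1 means free, 0 means held. */
static int lock_value(int semid) { return semctl(semid, 0, GETVAL); }

/* Remove the semaphore set. */
static int lock_destroy(int semid) { return semctl(semid, 0, IPC_RMID); }
```

With this layout a caller would do lock_down(), touch the shared log area, then lock_up(), and a failure in either step is reported under the matching name.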
Another (bigger) problem - from time to time, when I run 'iscsiadm -m node -U all', it never returns. When I ran 'echo t > /proc/sysrq-trigger', I got the following:

iscsid S 0000000000000000 0 8441 1 8442 24234 (NOTLB)
Oct 15 14:46:29 b73 kernel: ffff81012e28dd28 0000000000000086 0000000000000000 000000007fb18660
Oct 15 14:46:29 b73 kernel: ffff81012e28de48 000000000000000a ffff81003e3bd080 ffff810143b85100
Oct 15 14:46:29 b73 kernel: 0000912e761c6b46 0000000000002973 ffff81003e3bd268 00000006800547f7
Oct 15 14:46:29 b73 kernel: Call Trace:
Oct 15 14:46:29 b73 kernel: [<ffffffff8014b5d4>] __next_cpu+0x19/0x28
Oct 15 14:46:29 b73 kernel: [<ffffffff8008bdf5>] find_busiest_group+0x20d/0x621
Oct 15 14:46:29 b73 kernel: [<ffffffff8011c267>] sys_semtimedop+0x627/0x720
Oct 15 14:46:29 b73 kernel: [<ffffffff80063097>] thread_return+0x62/0xfe
Oct 15 14:46:29 b73 kernel: [<ffffffff8004dd1b>] lock_hrtimer_base+0x26/0x4c
Oct 15 14:46:29 b73 kernel: [<ffffffff8003a65a>] hrtimer_try_to_cancel+0x4a/0x53
Oct 15 14:46:29 b73 kernel: [<ffffffff80059d69>] hrtimer_cancel+0xc/0x16
Oct 15 14:46:29 b73 kernel: [<ffffffff80063db6>] do_nanosleep+0x47/0x70
Oct 15 14:46:29 b73 kernel: [<ffffffff80059c56>] hrtimer_nanosleep+0x58/0x118
Oct 15 14:46:29 b73 kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Oct 15 14:46:29 b73 kernel:
Oct 15 14:46:29 b73 kernel: iscsid S ffffffff800627ba 0 8442 1 29016 8441 (NOTLB)
Oct 15 14:46:29 b73 kernel: ffff81012fa65d28 0000000000000086 0000000000000000 ffff8101a0282100
Oct 15 14:46:29 b73 kernel: ffff81012fa65e10 000000000000000a ffff81067683d040 ffff81017b7fb080
Oct 15 14:46:29 b73 kernel: 0000912e761c6007 00000000000028ca ffff81067683d228 000000018003d267
Oct 15 14:46:29 b73 kernel: Call Trace:
Oct 15 14:46:29 b73 kernel: [<ffffffff8011c267>] sys_semtimedop+0x627/0x720
Oct 15 14:46:29 b73 kernel: [<ffffffff80058e3a>] inet_stream_connect+0x225/0x236
Oct 15 14:46:29 b73 kernel: [<ffffffff8021a0a8>] sock_getsockopt+0x326/0x348
Oct 15 14:46:29 b73 kernel: [<ffffffff80032e39>] lock_sock+0xa7/0xb2
Oct 15 14:46:29 b73 kernel: [<ffffffff80217b32>] sys_connect+0x7e/0xae
Oct 15 14:46:29 b73 kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0

It looks like both iscsid processes are waiting for a semaphore. Later, when I ran strace, I got the following logs (because semop was interrupted):

Oct 15 14:53:28 b73 iscsid: semop up failed 4
Oct 15 14:53:56 b73 iscsid: semop down failed
Oct 15 14:54:27 b73 iscsid: semop up failed 4

BTW - why do we always have 2 iscsid processes?

Thanks,
Erez