pthreads mostly hang when signal
Hi all,

I have a C program with pthreads. It creates thousands of detached pthreads for short jobs at the beginning; they all come and go within the first 10 seconds. It also has 3 permanently running pthreads.

I signal these 3 permanently running pthreads to stop processing, and I send SIGUSR1 for this purpose. If I don't interrupt them, they run without any issue and do their job. If I send SIGUSR1 to them, it rarely works, that is, only rarely do the threads receive the signal immediately. Mostly they hang, that is, the threads never receive the signal.

The following code fragment shows how I send the signal and wait till the thread stops:

    LockMutex(threadCountMutex);
    tCount = threadCount;
    UnlockMutex(threadCountMutex);
    printf("B4 stop Thread_1. Threads: %d\n", tCount);

    LockMutex(Thread_1varMutex);
    Thread_1var.Thread_1Stopped = 0;       // Thread_1 not stopped yet.
    UnlockMutex(Thread_1varMutex);

    if (pthread_kill(tid1, SIGUSR1) != 0)  // Send a signal to the thread
    {                                      // to stop processing.
        fprintf(stderr, "pthread_kill failed for Thread_1!\n");
        exit(1);
    }

    Delay(25);  // Let Thread_1 settle.

    // Check now whether Thread_1 received the signal and stopped processing.
    for (threadActivateDelay = 0; threadActivateDelay < threadActivateTimeOut; threadActivateDelay += 50)
    {
        LockMutex(Thread_1varMutex);
        Thread_1Stopped = Thread_1var.Thread_1Stopped;
        UnlockMutex(Thread_1varMutex);

        if (Thread_1Stopped)
            break;
        else
            Delay(50);  // Let Thread_1 settle. 50ms

        LockMutex(threadCountMutex);
        tCount = threadCount;
        UnlockMutex(threadCountMutex);
        printf("Wait till Thread_1 stopped, Threads: %d Delay: %d\n", tCount, threadActivateDelay+50);
    }

    printf("Came out of Thread_1 loop, threadActivateDelay: %d, threadActivateTimeOut: %d\n", threadActivateDelay, threadActivateTimeOut);

    if (threadActivateDelay >= threadActivateTimeOut)  // Something is wrong. Thread may be dead.
    {
        fprintf(stderr, "Time out. Thread_1 may be dead!\n");
        exit(1);
    }

Note that Thread_1var.Thread_1Stopped is set to 1 by Thread_1 once it receives SIGUSR1.

The results of two runs of the program are as follows (values are in milliseconds):

    ./prog
    B4 stop Thread_1. Threads: 3
    Thread_1 cought SIGUSR1
    Wait till Thread_1 stopped, Threads: 3 Delay: 50
    Came out of Thread_1 loop, threadActivateDelay: 50, threadActivateTimeOut: 3000
    B4 stop Thread_2. Threads: 3
    Thread_2 cought SIGUSR1
    Came out of Thread_2 loop, threadActivateDelay: 0, threadActivateTimeOut: 3000
    B4 stop Thread_3. Threads: 3
    Wait till Thread_3 stopped, Threads: 3 Delay: 50
    Wait till Thread_3 stopped, Threads: 3 Delay: 100
    Wait till Thread_3 stopped, Threads: 3 Delay: 150
    :
    :
    Wait till Thread_3 stopped, Threads: 3 Delay: 3000
    Came out of Thread_3 loop, threadActivateDelay: 3000, threadActivateTimeOut: 3000
    Time out. Thread_3 may be dead!

    ./prog
    B4 stop Thread_1. Threads: 3
    Wait till Thread_1 stopped, Threads: 3 Delay: 50
    Wait till Thread_1 stopped, Threads: 3 Delay: 100
    Wait till Thread_1 stopped, Threads: 3 Delay: 150
    :
    :
    Wait till Thread_1 stopped, Threads: 3 Delay: 3000
    Came out of Thread_1 loop, threadActivateDelay: 3000, threadActivateTimeOut: 3000
    Time out. Thread_1 may be dead!

I have tested this program on FreeBSD 8.1 and 9.0 RC1, both i386. Different runs hang different threads. Also, as I mentioned earlier, on rare occasions all three threads stop immediately.

My issue is quite similar to the problem described in:

http://security.freebsd.org/advisories/FreeBSD-EN-10:02.sched_ule.asc

But it doesn't freeze the system. Increasing threadActivateTimeOut to 6ms also doesn't help once a thread hangs.

Please also note that once a thread receives SIGUSR1, it waits in sigwait() till it receives another signal.

So what have I hit? Is it a programming error on my side, a scheduling error, or something else? I would appreciate it very much if the FreeBSD guys could help me solve this issue.

Many thanks in advance.

Best regards
Unga
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
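For readers who want something to experiment with, the signalling scheme described in the message above (pthread_kill() with SIGUSR1, a mutex-protected "stopped" flag, and a polling loop with a timeout) might be sketched roughly as follows. This is not Unga's program and it is not a diagnosis of the hang. The post does not show how Thread_1 catches SIGUSR1, so the sketch assumes one common approach: SIGUSR1 is blocked in every thread and the worker receives it synchronously with sigwait(). All names (worker, stop_worker_demo, state_mutex, worker_stopped, delay_ms) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <time.h>

/* Hypothetical shared state, loosely modelled on Thread_1var.Thread_1Stopped. */
static pthread_mutex_t state_mutex = PTHREAD_MUTEX_INITIALIZER;
static int worker_stopped;

/* Read the flag under the mutex. */
static int is_stopped(void)
{
    int v;

    pthread_mutex_lock(&state_mutex);
    v = worker_stopped;
    pthread_mutex_unlock(&state_mutex);
    return v;
}

/* Write the flag under the mutex. */
static void set_stopped(int v)
{
    pthread_mutex_lock(&state_mutex);
    worker_stopped = v;
    pthread_mutex_unlock(&state_mutex);
}

/* Worker: waits for SIGUSR1 with sigwait() (an assumption, no handler is
 * installed), then marks itself as stopped. */
static void *worker(void *arg)
{
    sigset_t set;
    int sig = 0;

    (void)arg;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigwait(&set, &sig) == 0 && sig == SIGUSR1)
        set_stopped(1);
    return NULL;
}

/* Sleep for ms milliseconds. */
static void delay_ms(long ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Start a worker, signal it, and poll the flag every 50 ms.
 * Returns 1 if the worker stopped in time, 0 on timeout, -1 on error. */
int stop_worker_demo(int timeout_ms)
{
    pthread_t tid;
    sigset_t set;
    int waited;

    assert(timeout_ms > 0);

    /* Block SIGUSR1 before creating the thread: the worker inherits the
     * mask, so a signal sent before it reaches sigwait() stays pending. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    set_stopped(0);
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_kill(tid, SIGUSR1) != 0) {
        pthread_detach(tid);
        return -1;
    }

    for (waited = 0; waited < timeout_ms && !is_stopped(); waited += 50)
        delay_ms(50);

    if (!is_stopped()) {
        pthread_detach(tid);   /* timed out: do not block in join */
        return 0;
    }
    pthread_join(tid, NULL);
    return 1;
}
```

Calling stop_worker_demo(3000) from main() should return 1 on a system that follows POSIX signal semantics, because a blocked signal directed at a thread remains pending for that thread until sigwait() accepts it. Whether the original program behaves differently because it uses a signal handler, or for some other reason, cannot be told from the fragment posted.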
Re: FreeBSD Status Report July-September, 2011
On Tue, 8 Nov 2011 09:55:09 +, Daniel Gerzo wrote:
> FreeBSD Quarterly Status Report - Q3/2011

Unfortunately, I managed to use an old status report entry for KDE/FreeBSD, instead of the current one. I am sorry for any inconvenience; the current entry for KDE/FreeBSD is below:

KDE/FreeBSD

URL: http://FreeBSD.KDE.org
URL: http://FreeBSD.KDE.org/area51.php

Contact: KDE FreeBSD kde-free...@kde.org

The KDE/FreeBSD team has continued to improve the experience of KDE software and Qt under FreeBSD. The latest round of improvements include:

 * Splitting some of the KDE modules into smaller ports
 * Reduced startup time by ~15 seconds
 * Allowed auto-login out-of-the-box
 * Kopete supports GoogleTalk
 * Kalzium installs with its molecular editor
 * Zeitgeist support added
 * Porting Calligra to FreeBSD (work-in-progress)

The team has also made many releases and upstreamed many fixes and patches. The latest round of releases include:

 * Qt: 4.7.4
 * PyQt: 4.8.5 (SIP: 4.12.4)
 * KDE SC: 4.7.2
 * Amarok: 2.4.3
 * KDevelop: 4.2.3 (KDevPlatform: 1.2.3)

The team is always looking for more testers and porters so please contact us at kde-free...@kde.org and visit our home page at http://FreeBSD.KDE.org.

Open tasks:

 1. Testing KDE PIM 4.7.2
 2. Testing phonon-gstreamer and phonon-vlc as the phonon-xine backend was deprecated (and will remain in ports)

-- 
Kind regards
Daniel
Re: mfi timeouts
On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
> On 08/11/2011 22:24, Vincent Hoffman wrote:
> > On 08/11/2011 19:50, John Baldwin wrote:
> > > On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
> > > > On 28/10/2011 04:14, Jan Mikkelsen wrote:
> > > > > Hi,
> > > > >
> > > > > There is a patch linked to from this PR, which seems very similar:
> > > > >
> > > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
> > > > > http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
> > > > >
> > > > > The problem is also consistent with running mfiutil clearing the problem.
> > > > >
> > > > > I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
> > > >
> > > > The patch you linked to seems to have removed the stalls, although I have only had it running for a day. I'll post if it stalls again though.
> > > >
> > > > I did manage to scrounge the use of a Dell r410 with a LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05), badged as Dell PERC H700 Adapter, to test out the patch I originally found but had the same issue as this post:
> > > > http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html
> > > > I couldn't get the Dell to stall in the first place either though, so it could be a specific firmware version that has the issue.
> > > >
> > > > Anyway thanks for the pointers.
> > >
> > > Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
> >
> > Hi, yes I tried the patch you posted originally, unfortunately the Dell never finished booting either. The Supermicro is now in production but I'll take the Dell up to 9-STABLE and try your updated patch.
>
> The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had already been applied.

Odd, it's against stock head, so I don't know why it would have failed to apply.

> I have rebooted the Dell and it seems happy with the new patch (msi disabled.)

Okay, good. I'll commit the non-MSI bits at least and get them merged into 9.0 if possible.

> Booting with hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops the boot from completing.

Ok. Can you try changing it to use MSI instead of MSI-X? Just edit the mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.

-- 
John Baldwin
Re: mfi timeouts
On 09/11/2011 14:39, John Baldwin wrote:
> On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
> > On 08/11/2011 22:24, Vincent Hoffman wrote:
> > > On 08/11/2011 19:50, John Baldwin wrote:
> > > > On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
> > > > > On 28/10/2011 04:14, Jan Mikkelsen wrote:
> > > > > > Hi,
> > > > > >
> > > > > > There is a patch linked to from this PR, which seems very similar:
> > > > > >
> > > > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
> > > > > > http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
> > > > > >
> > > > > > The problem is also consistent with running mfiutil clearing the problem.
> > > > > >
> > > > > > I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
> > > > >
> > > > > The patch you linked to seems to have removed the stalls, although I have only had it running for a day. I'll post if it stalls again though.
> > > > >
> > > > > I did manage to scrounge the use of a Dell r410 with a LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05), badged as Dell PERC H700 Adapter, to test out the patch I originally found but had the same issue as this post:
> > > > > http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html
> > > > > I couldn't get the Dell to stall in the first place either though, so it could be a specific firmware version that has the issue.
> > > > >
> > > > > Anyway thanks for the pointers.
> > > >
> > > > Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
> > >
> > > Hi, yes I tried the patch you posted originally, unfortunately the Dell never finished booting either. The Supermicro is now in production but I'll take the Dell up to 9-STABLE and try your updated patch.
> >
> > The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had already been applied.
>
> Odd, it's against stock head, so I don't know why it would have failed to apply.

I think it was http://svnweb.freebsd.org/base?view=revision&revision=227309
No big deal.

> > I have rebooted the Dell and it seems happy with the new patch (msi disabled.)
>
> Okay, good. I'll commit the non-MSI bits at least and get them merged into 9.0 if possible.
>
> > Booting with hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops the boot from completing.
>
> Ok. Can you try changing it to use MSI instead of MSI-X? Just edit the mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.

I'll have a try.

Vince
Re: mfi timeouts
On 09/11/2011 14:39, John Baldwin wrote:
> On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
> > On 08/11/2011 22:24, Vincent Hoffman wrote:
> > > On 08/11/2011 19:50, John Baldwin wrote:
> > > > On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
> > > > > On 28/10/2011 04:14, Jan Mikkelsen wrote:
> > > > > > Hi,
> > > > > >
> > > > > > There is a patch linked to from this PR, which seems very similar:
> > > > > >
> > > > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
> > > > > > http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
> > > > > >
> > > > > > The problem is also consistent with running mfiutil clearing the problem.
> > > > > >
> > > > > > I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
> > > > >
> > > > > The patch you linked to seems to have removed the stalls, although I have only had it running for a day. I'll post if it stalls again though.
> > > > >
> > > > > I did manage to scrounge the use of a Dell r410 with a LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05), badged as Dell PERC H700 Adapter, to test out the patch I originally found but had the same issue as this post:
> > > > > http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html
> > > > > I couldn't get the Dell to stall in the first place either though, so it could be a specific firmware version that has the issue.
> > > > >
> > > > > Anyway thanks for the pointers.
> > > >
> > > > Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
> > >
> > > Hi, yes I tried the patch you posted originally, unfortunately the Dell never finished booting either. The Supermicro is now in production but I'll take the Dell up to 9-STABLE and try your updated patch.
> >
> > The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had already been applied.
>
> Odd, it's against stock head, so I don't know why it would have failed to apply.
>
> > I have rebooted the Dell and it seems happy with the new patch (msi disabled.)
>
> Okay, good. I'll commit the non-MSI bits at least and get them merged into 9.0 if possible.
>
> > Booting with hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops the boot from completing.
>
> Ok. Can you try changing it to use MSI instead of MSI-X? Just edit the mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.

Much better. It boots and says:

Nov 9 15:25:45 zfstest kernel: mfi0: Dell PERC H700 Adapter port 0xfc00-0xfcff mem 0xdf1bc000-0xdf1b,0xdf1c-0xdf1f irq 38 at device 0.0 on pci3
Nov 9 15:25:45 zfstest kernel: mfi0: Using MSI-X
Nov 9 15:25:45 zfstest kernel: mfi0: Megaraid SAS driver Ver 3.00
Nov 9 15:25:45 zfstest kernel: mfi0: 2004 (374167405s/0x0020/info) - Shutdown command received from host
Nov 9 15:25:45 zfstest kernel: mfi0: 2005 (boot + 34s/0x0020/info) - Firmware initialization started (PCI ID 0079/1000/1f16/1028)
Nov 9 15:25:45 zfstest kernel: mfi0: 2006 (boot + 34s/0x0020/info) - Firmware version 2.100.03-1046
Nov 9 15:25:45 zfstest kernel: mfi0: 2007 (boot + 36s/0x0008/info) - Battery Present
Nov 9 15:25:45 zfstest kernel: mfi0: 2008 (boot + 36s/0x0020/info) - Package version 12.10.0-0025
Nov 9 15:25:45 zfstest kernel: mfi0: 2009 (boot + 36s/0x0020/info) - Board Revision A00
Nov 9 15:25:45 zfstest kernel: mfi0: 2010 (boot + 61s/0x0002/info) - Inserted: PD 00(e0xff/s0)
Nov 9 15:25:45 zfstest kernel: mfi0: 2011 (boot + 61s/0x0002/info) - Inserted: PD 00(e0xff/s0) Info: enclPd=, scsiType=0, portMap=01, sasAddr=443322110700,
Nov 9 15:25:45 zfstest kernel: mfi0: 2012 (boot + 61s/0x0002/info) - Inserted: PD 01(e0xff/s1)
Nov 9 15:25:45 zfstest kernel: mfi0: 2013 (boot + 61s/0x0002/info) - Inserted: PD 01(e0xff/s1) Info: enclPd=, scsiType=0, portMap=00, sasAddr=443322110600,
Nov 9 15:25:45 zfstest kernel: mfi0: 2014 (374167491s/0x0020/info) - Time established as 11/09/11 15:24:51; (63 seconds since power on)
Nov 9 15:25:45 zfstest kernel: mfi0: 2015 (374167529s/0x0008/info) - Battery temperature is normal
Nov 9 15:25:45 zfstest kernel: mfi0: 2016 (374167529s/0x0008/info) - Battery started charging

More info as required.

Vince
Re: smartctl / mpt on 9.0-RC1
On 11/08/2011 09:33 PM, Marat N.Afanasyev wrote:
> why :) just a little misunderstanding, I suppose ;)
>
> I just showed what I'd expect from
>
> #smartctl -a -d 3ware,0 /dev/twa0
>
> in case of sas drive on channel 0

Yes. BTW, if you are able to provide access to a BSD box with MFI and SAS, I could fix the defect sectors status report. For twa/SAS much work needs to be done, but if there is anyone with such a controller and hardware (not in production!) I could try, at least.
Re: smartctl / mpt on 9.0-RC1
Alex Samorukov wrote:
> On 11/08/2011 09:33 PM, Marat N.Afanasyev wrote:
> > why :) just a little misunderstanding, I suppose ;)
> >
> > I just showed what I'd expect from
> >
> > #smartctl -a -d 3ware,0 /dev/twa0
> >
> > in case of sas drive on channel 0
>
> Yes. BTW, if you are able to provide access to a BSD box with MFI and SAS, I could fix the defect sectors status report. For twa/SAS much work needs to be done, but if there is anyone with such a controller and hardware (not in production!) I could try, at least.

I have one of my boxes being repaired, so as soon as it is returned I'll try to give you access to that box. Unfortunately all my SAS drives attached to 3ware controllers are in production boxes, so playing with them is not possible :(

-- 
SY, Marat