pthreads mostly hang when signal

2011-11-09 Thread Unga
Hi all

I have a C program with pthreads.

This program creates thousands of detached pthreads for short jobs at the 
beginning. They all come and go within the first 10 seconds.


But it has 3 permanently running pthreads.

I signal these 3 permanently running pthreads to stop processing. I send 
SIGUSR1 for this purpose.

If I don't interrupt these 3 permanently running pthreads, they run without any 
issue and do their job.

If I send SIGUSR1 to these 3 permanently running pthreads, it only rarely 
works. That is, only rarely do the threads receive the SIGUSR1 signal 
immediately.

Most of the time when I signal, they hang. That is, the threads don't receive 
the signal.


Following code fragment shows how I send the signal and wait till they stop:
 LockMutex(threadCountMutex);
 tCount = threadCount;
 UnlockMutex(threadCountMutex);

 printf("B4 stop Thread_1. Threads: %d\n", tCount);

 LockMutex(Thread_1varMutex);
 Thread_1var.Thread_1Stopped = 0; // Thread_1 not stopped yet.
 UnlockMutex(Thread_1varMutex);

 if (pthread_kill(tid1, SIGUSR1) != 0) // Send a signal to the thread
 {                                     // to stop processing.
     fprintf(stderr, "pthread_kill failed for Thread_1!\n");
     exit(1);
 }
 Delay(25); // Let Thread_1 settle.

 // Check now whether Thread_1 received the signal and stopped processing.
 for (threadActivateDelay = 0; threadActivateDelay < threadActivateTimeOut;
      threadActivateDelay += 50)
 {
     LockMutex(Thread_1varMutex);
     Thread_1Stopped = Thread_1var.Thread_1Stopped;
     UnlockMutex(Thread_1varMutex);

     if (Thread_1Stopped)
         break;
     else
         Delay(50); // Let Thread_1 settle. 50ms

     LockMutex(threadCountMutex);
     tCount = threadCount;
     UnlockMutex(threadCountMutex);

     printf("Wait till Thread_1 stopped, Threads: %d  Delay: %d\n", tCount,
            threadActivateDelay + 50);
 }

 printf("Came out of Thread_1 loop, threadActivateDelay: %d, "
        "threadActivateTimeOut: %d\n", threadActivateDelay,
        threadActivateTimeOut);
 if (threadActivateDelay >= threadActivateTimeOut) // Something is wrong.
 {                                                 // Thread may be dead.
     fprintf(stderr, "Time out. Thread_1 may be dead!\n");
     exit(1);
 }


Note, Thread_1var.Thread_1Stopped is set to 1 by Thread_1 once it receives 
SIGUSR1.
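
For illustration only, here is a minimal sketch of a stop handshake of this kind. It is not the original program: every name in it (stop_requested, worker_stopped, stop_worker_demo and so on) is hypothetical. It assumes the worker catches SIGUSR1 with a handler that only sets a flag. It also swaps the polling loop for pthread_cond_timedwait(), which is a different technique from the Delay(50) loop above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>

/* Hypothetical names throughout; none of this is the original program. */
static volatile sig_atomic_t stop_requested; /* written only by the handler */
static pthread_mutex_t stop_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t stop_cond = PTHREAD_COND_INITIALIZER;
static int worker_stopped;                   /* guarded by stop_mutex */

/* Keep the handler async-signal-safe: it only sets a flag. */
static void on_sigusr1(int signo)
{
    (void)signo;
    stop_requested = 1;
}

static void *worker(void *arg)
{
    struct timespec nap = { 0, 1000000L };   /* 1 ms of pretend work */

    (void)arg;
    while (!stop_requested)
        nanosleep(&nap, NULL);               /* an EINTR return just re-checks */

    pthread_mutex_lock(&stop_mutex);
    worker_stopped = 1;                      /* acknowledge the stop request */
    pthread_cond_signal(&stop_cond);
    pthread_mutex_unlock(&stop_mutex);
    return NULL;
}

/* Returns 1 if the worker acknowledged SIGUSR1 within timeout_ms, else 0. */
int stop_worker_demo(long timeout_ms)
{
    struct sigaction sa;
    struct timespec deadline;
    pthread_t tid;
    int rc, acknowledged;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);      /* install before any signal is sent */
    assert(rc == 0);

    stop_requested = 0;
    worker_stopped = 0;
    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_kill(tid, SIGUSR1);         /* thread-directed signal */
    assert(rc == 0);

    /* Absolute deadline on CLOCK_REALTIME, the default condvar clock. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&stop_mutex);
    rc = 0;
    while (!worker_stopped && rc != ETIMEDOUT)
        rc = pthread_cond_timedwait(&stop_cond, &stop_mutex, &deadline);
    acknowledged = worker_stopped;
    pthread_mutex_unlock(&stop_mutex);

    if (acknowledged)
        pthread_join(tid, NULL);
    else
        pthread_detach(tid);                 /* do not block on a hung worker */
    return acknowledged;
}
```

Calling stop_worker_demo(3000) from main() is expected to return 1 when the worker sees the signal. With the timed wait, the caller wakes as soon as the worker acknowledges instead of at the next 50ms poll.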

Result of two runs of the program is as follows (values are in milliseconds):
./prog
B4 stop Thread_1. Threads: 3
Thread_1 cought SIGUSR1
Wait till Thread_1 stopped, Threads: 3  Delay: 50
Came out of Thread_1 loop, threadActivateDelay: 50, threadActivateTimeOut: 3000
B4 stop Thread_2. Threads: 3
Thread_2 cought SIGUSR1
Came out of Thread_2 loop, threadActivateDelay: 0, threadActivateTimeOut: 3000
B4 stop Thread_3. Threads: 3
Wait till Thread_3 stopped, Threads: 3  Delay: 50
Wait till Thread_3 stopped, Threads: 3  Delay: 100
Wait till Thread_3 stopped, Threads: 3  Delay: 150
:
:
Wait till Thread_3 stopped, Threads: 3  Delay: 3000
Came out of Thread_3 loop, threadActivateDelay: 3000, threadActivateTimeOut: 3000
Time out. Thread_3 may be dead!



./prog
B4 stop Thread_1. Threads: 3
Wait till Thread_1 stopped, Threads: 3  Delay: 50
Wait till Thread_1 stopped, Threads: 3  Delay: 100
Wait till Thread_1 stopped, Threads: 3  Delay: 150
:
:
Wait till Thread_1 stopped, Threads: 3  Delay: 3000
Came out of Thread_1 loop, threadActivateDelay: 3000, threadActivateTimeOut: 3000
Time out. Thread_1 may be dead!


I have tested this program on FreeBSD 8.1 and 9.0 RC1, both i386. Different 
runs hang different threads. Also, as I mentioned earlier, on rare occasions 
all three threads stop immediately.

My issue is quite similar to the problem: 
http://security.freebsd.org/advisories/FreeBSD-EN-10:02.sched_ule.asc

But it doesn't freeze the system. 

Increasing threadActivateTimeOut to 6ms also doesn't help once a thread hangs.

Please also note that once a thread receives SIGUSR1, it waits in sigwait() 
till it receives another signal.
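
As a hedged illustration of the sigwait() step, the sketch below shows the usual pattern: the signal is blocked before the thread is created, so the thread inherits the mask, and sigwait() then collects the signal synchronously. The choice of SIGUSR2 as the "other" signal and the function names are assumptions, since the post does not say which signal is used.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

/* Hypothetical worker: parks in sigwait() until SIGUSR2 arrives. */
static void *wait_for_resume(void *arg)
{
    int *received = arg;
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR2);
    /* sigwait() returns 0 on success and stores the signal number. */
    if (sigwait(&set, received) != 0)
        *received = -1;
    return NULL;
}

/* Returns the signal number the worker collected, or -1 on failure. */
int sigwait_demo(void)
{
    sigset_t set, old;
    pthread_t tid;
    int received = 0;
    int rc;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR2);
    /* Block first: threads created afterwards inherit this mask. */
    rc = pthread_sigmask(SIG_BLOCK, &set, &old);
    assert(rc == 0);

    rc = pthread_create(&tid, NULL, wait_for_resume, &received);
    assert(rc == 0);
    /* The signal stays pending on the thread until sigwait() consumes it. */
    rc = pthread_kill(tid, SIGUSR2);
    assert(rc == 0);
    pthread_join(tid, NULL);

    pthread_sigmask(SIG_SETMASK, &old, NULL); /* restore the caller's mask */
    return received;
}
```

A signal that is blocked in a thread is not delivered to a handler in that thread; it stays pending until it is unblocked or collected by sigwait(). So it can be worth checking which signals each thread has blocked at the moment pthread_kill() is called.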


So what have I hit? Is it a programming error on my side, a scheduling 
error, or something else?

I would appreciate it very much if the FreeBSD guys could help me solve this 
issue.

Many thanks in advance.

Best regards
Unga
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: FreeBSD Status Report July-September, 2011

2011-11-09 Thread Daniel Gerzo

On Tue, 8 Nov 2011 09:55:09 +, Daniel Gerzo wrote:

FreeBSD Quarterly Status Report - Q3/2011


Unfortunately, I managed to use an old status report entry for 
KDE/FreeBSD, instead of the current one.
I am sorry for any inconvenience; the current entry for KDE/FreeBSD is 
below:



KDE/FreeBSD

   URL: http://FreeBSD.KDE.org
   URL: http://FreeBSD.KDE.org/area51.php

   Contact: KDE FreeBSD kde-free...@kde.org

   The KDE/FreeBSD team has continued to improve the experience of KDE
   software and Qt under FreeBSD. The latest round of improvements
   includes:
 * Splitting some of the KDE modules into smaller ports
 * Reduced startup time by ~15 seconds
 * Allowed auto-login out-of-the-box
 * Kopete supports GoogleTalk
 * Kalzium installs with its molecular editor
 * Zeitgeist support added
 * Porting Calligra to FreeBSD (work-in-progress)

   The team has also made many releases and upstreamed many fixes and
   patches. The latest round of releases includes:
 * Qt: 4.7.4
 * PyQt: 4.8.5 (SIP: 4.12.4)
 * KDE SC: 4.7.2
 * Amarok: 2.4.3
 * KDevelop: 4.2.3 (KDevPlatform: 1.2.3)

   The team is always looking for more testers and porters so please
   contact us at kde-free...@kde.org and visit our home page at
   http://FreeBSD.KDE.org.

Open tasks:

1. Testing KDE PIM 4.7.2
2. Testing phonon-gstreamer and phonon-vlc as the phonon-xine backend
   was deprecated (and will remain in ports)

--
Kind regards
  Daniel


Re: mfi timeouts

2011-11-09 Thread John Baldwin
On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
 On 08/11/2011 22:24, Vincent Hoffman wrote:
  On 08/11/2011 19:50, John Baldwin wrote:
  On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
  On 28/10/2011 04:14, Jan Mikkelsen wrote:
  Hi,
 
  There is a patch linked to from this PR, which seems very similar:
 
  http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
 
  http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
 
  The problem is also consistent with running mfiutil clearing the problem.
 
  I'm about to deploy mfi controllers in a similar configuration, so I'd be
  very curious about whether the patch fixes the problem for you.
  The patch you linked to seems to have removed the stalls, although I
  have only had it running for a day. I'll post if it stalls again though.
 
  I did manage to scrounge the use of a Dell r410 with a
  LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
  Badged as Dell PERC H700 Adapter
 
  to test out the patch I originally found but had the same issue as this post
 
  http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html
 
 
  I couldn't get the Dell to stall in the first place either, though, so the
  issue could be specific to a firmware version.
 
  Anyway thanks for the pointers.
  Hmm, did you try the patch I had posted from that earlier thread?  It had
  two changes in it, one was similar to the patch in the PR, the second added
  MSI-X support.  I've since tweaked it to make the MSI-X support off by
  default but possible to enable via loader.conf.  Would you be willing to
  try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
  Hi,
  yes I tried the patch you posted originally, unfortunately the dell
  never finished booting either. The Supermicro is now in production but
  I'll take the dell up to 9-STABLE and try your updated patch.
 
 The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had
 already been applied.

Odd, it's against stock head, so I don't know why it would have failed to
apply.
 
 I have rebooted the dell and it seems happy with the new patch (msi
 disabled.)

Okay, good.  I'll commit the non-MSI bits at least and get them merged into
9.0 if possible.

 Booting with
 hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops
 the boot from completing.

Ok.  Can you try changing it to use MSI instead of MSI-X?  Just edit the
mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.

-- 
John Baldwin


Re: mfi timeouts

2011-11-09 Thread Vincent Hoffman
On 09/11/2011 14:39, John Baldwin wrote:
 On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
 On 08/11/2011 22:24, Vincent Hoffman wrote:
 On 08/11/2011 19:50, John Baldwin wrote:
 On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
 On 28/10/2011 04:14, Jan Mikkelsen wrote:
 Hi,

 There is a patch linked to from this PR, which seems very similar:

 http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416

 http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html

 The problem is also consistent with running mfiutil clearing the problem.

 I'm about to deploy mfi controllers in a similar configuration, so I'd be
 very curious about whether the patch fixes the problem for you.
 The patch you linked to seems to have removed the stalls, although I
 have only had it running for a day. I'll post if it stalls again though.

 I did manage to scrounge the use of a Dell r410 with a
 LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
 Badged as Dell PERC H700 Adapter

 to test out the patch I originally found but had the same issue as this post

 http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html


 I couldn't get the Dell to stall in the first place either, though, so the
 issue could be specific to a firmware version.

 Anyway thanks for the pointers.
 Hmm, did you try the patch I had posted from that earlier thread?  It had
 two changes in it, one was similar to the patch in the PR, the second added
 MSI-X support.  I've since tweaked it to make the MSI-X support off by
 default but possible to enable via loader.conf.  Would you be willing to
 try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
 Hi,
 yes I tried the patch you posted originally, unfortunately the dell
 never finished booting either. The Supermicro is now in production but
 I'll take the dell up to 9-STABLE and try your updated patch.

 The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had
 already been applied.
 Odd, it's against stock head, so I don't know why it would have failed to
 apply.
  
I think it was http://svnweb.freebsd.org/base?view=revision&revision=227309

no big deal.
 I have rebooted the dell and it seems happy with the new patch (msi
 disabled.)
 Okay, good.  I'll commit the non-MSI bits at least and get them merged into
 9.0 if possible.

 Booting with
 hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops
 the boot from completing.
 Ok.  Can you try changing it to use MSI instead of MSI-X?  Just edit the
 mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.

I'll have a try.

Vince





Re: mfi timeouts

2011-11-09 Thread Vincent Hoffman
On 09/11/2011 14:39, John Baldwin wrote:
 On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
 On 08/11/2011 22:24, Vincent Hoffman wrote:
 On 08/11/2011 19:50, John Baldwin wrote:
 On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
 On 28/10/2011 04:14, Jan Mikkelsen wrote:
 Hi,

 There is a patch linked to from this PR, which seems very similar:

 http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416

 http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html

 The problem is also consistent with running mfiutil clearing the problem.

 I'm about to deploy mfi controllers in a similar configuration, so I'd be
 very curious about whether the patch fixes the problem for you.
 The patch you linked to seems to have removed the stalls, although I
 have only had it running for a day. I'll post if it stalls again though.

 I did manage to scrounge the use of a Dell r410 with a
 LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
 Badged as Dell PERC H700 Adapter

 to test out the patch I originally found but had the same issue as this post

 http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html


 I couldn't get the Dell to stall in the first place either, though, so the
 issue could be specific to a firmware version.

 Anyway thanks for the pointers.
 Hmm, did you try the patch I had posted from that earlier thread?  It had
 two changes in it, one was similar to the patch in the PR, the second added
 MSI-X support.  I've since tweaked it to make the MSI-X support off by
 default but possible to enable via loader.conf.  Would you be willing to
 try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
 Hi,
 yes I tried the patch you posted originally, unfortunately the dell
 never finished booting either. The Supermicro is now in production but
 I'll take the dell up to 9-STABLE and try your updated patch.

 The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had
 already been applied.
 Odd, it's against stock head, so I don't know why it would have failed to
 apply.
  
 I have rebooted the dell and it seems happy with the new patch (msi
 disabled.)
 Okay, good.  I'll commit the non-MSI bits at least and get them merged into
 9.0 if possible.

 Booting with
 hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops
 the boot from completing.
 Ok.  Can you try changing it to use MSI instead of MSI-X?  Just edit the
 mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.

Much better. It boots and says:
Nov  9 15:25:45 zfstest kernel: mfi0: Dell PERC H700 Adapter port
0xfc00-0xfcff mem 0xdf1bc000-0xdf1b,0xdf1c-0xdf1f irq 38 at
device 0.0 on pci3
Nov  9 15:25:45 zfstest kernel: mfi0: Using MSI-X
Nov  9 15:25:45 zfstest kernel: mfi0: Megaraid SAS driver Ver 3.00
Nov  9 15:25:45 zfstest kernel: mfi0: 2004 (374167405s/0x0020/info) -
Shutdown command received from host
Nov  9 15:25:45 zfstest kernel: mfi0: 2005 (boot + 34s/0x0020/info) -
Firmware initialization started (PCI ID 0079/1000/1f16/1028)
Nov  9 15:25:45 zfstest kernel: mfi0: 2006 (boot + 34s/0x0020/info) -
Firmware version 2.100.03-1046
Nov  9 15:25:45 zfstest kernel: mfi0: 2007 (boot + 36s/0x0008/info) -
Battery Present
Nov  9 15:25:45 zfstest kernel: mfi0: 2008 (boot + 36s/0x0020/info) -
Package version 12.10.0-0025
Nov  9 15:25:45 zfstest kernel: mfi0: 2009 (boot + 36s/0x0020/info) -
Board Revision A00
Nov  9 15:25:45 zfstest kernel: mfi0: 2010 (boot + 61s/0x0002/info) -
Inserted: PD 00(e0xff/s0)
Nov  9 15:25:45 zfstest kernel: mfi0: 2011 (boot + 61s/0x0002/info) -
Inserted: PD 00(e0xff/s0) Info: enclPd=, scsiType=0, portMap=01,
sasAddr=443322110700,
Nov  9 15:25:45 zfstest kernel: mfi0: 2012 (boot + 61s/0x0002/info) -
Inserted: PD 01(e0xff/s1)
Nov  9 15:25:45 zfstest kernel: mfi0: 2013 (boot + 61s/0x0002/info) -
Inserted: PD 01(e0xff/s1) Info: enclPd=, scsiType=0, portMap=00,
sasAddr=443322110600,
Nov  9 15:25:45 zfstest kernel: mfi0: 2014 (374167491s/0x0020/info) -
Time established as 11/09/11 15:24:51; (63 seconds since power on)
Nov  9 15:25:45 zfstest kernel: mfi0: 2015 (374167529s/0x0008/info) -
Battery temperature is normal
Nov  9 15:25:45 zfstest kernel: mfi0: 2016 (374167529s/0x0008/info) -
Battery started charging

More info as required.

Vince



Re: smartctl / mpt on 9.0-RC1

2011-11-09 Thread Alex Samorukov

On 11/08/2011 09:33 PM, Marat N.Afanasyev wrote:

why :)

just a little misunderstanding, I suppose ;) I just showed what I'd 
expect from


#smartctl -a -d 3ware,0 /dev/twa0

in case of sas drive on channel 0

Yes.

BTW, if you are able to provide access to a BSD box with MFI and SAS, I 
could fix the defect sectors status report. For twa/SAS much work 
needs to be done, but if anyone has such a controller and 
hardware (not in production!), I could at least try.




Re: smartctl / mpt on 9.0-RC1

2011-11-09 Thread Marat N.Afanasyev

Alex Samorukov wrote:

On 11/08/2011 09:33 PM, Marat N.Afanasyev wrote:

why :)

just a little misunderstanding, I suppose ;) I just showed what I'd
expect from

#smartctl -a -d 3ware,0 /dev/twa0

in case of sas drive on channel 0

Yes.

BTW, if you are able to provide access to a BSD box with MFI and SAS, I
could fix the defect sectors status report. For twa/SAS much work
needs to be done, but if anyone has such a controller and
hardware (not in production!), I could at least try.


One of my boxes is being repaired, so as soon as it is returned I'll try 
to give you access to it. Unfortunately, all my SAS drives attached to 
3ware controllers are in production boxes, so playing with them is not 
possible :(


--
SY, Marat