linux threads and fclose lock problem

2001-08-28 Thread Jeff Fellin

The following test program hangs on current as of 8/20/2001.
It hangs in the fprintf() call in testThread() instead of
running to completion. If the fclose() call in main, which closes
an unrelated file, is removed, the program runs to completion.

From tracing code, it appears that fclose.c locks the file, does some stuff,
and then *tries* to unlock the file.  But while _flockfile is called,
_funlockfile is *not*.  (The source for fclose.c calls FUNLOCKFILE(fp) -
don't know where FUNLOCKFILE is defined.)

So, a lock is left held.  Because these are recursive locks, further I/O
from the same thread is OK, but any I/O from a different thread blocks
forever.  The program works fine if the fclose is removed or if the
value of __isthreaded is set to zero.



#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

/* libc-internal flag that tells stdio whether to lock FILEs */
extern int __isthreaded;

void *testThread(void *param)
{
  const char *fn = "/tmp/file_to_write";
  FILE *fo = fopen(fn, "w");
  printf("testThread __isthreaded = %d\n", __isthreaded);
  printf("Writing to file %s\n", fn);
  fprintf(fo, "Test line\n");   /* the hang occurs here */
  printf("Finished writing to file %s\n", fn);
  return NULL;
}

int main()
{
  const char *fn = "/tmp/foobar";
  FILE *fi = fopen(fn, "r");

  printf("main __isthreaded = %d\n", __isthreaded);
  if (!fi) {
    printf("%s must exist for this test to work\n", fn);
    exit(0);
  }
  printf("We opened %s, fd of %d\n", fn, fi->_file);
  fclose(fi);   /* removing this call makes the test pass */
  printf("main after fclose __isthreaded = %d\n", __isthreaded);
  printf("File is closed\n");

  pthread_t tid;
  pthread_create(&tid, NULL, testThread, (void*)NULL);
  pthread_join(tid, NULL);
  return 0;
}

Here's the output:
---
main __isthreaded = 1
We opened /tmp/foobar, fd of 3
3, 671604288, 1 - Locking fd
main after fclose __isthreaded = 1
File is closed
testThread __isthread = 1
Writing to file /tmp/file_to_write
5, 3208641568, 2 - Locking fd
671604288 - File is locked by another thread
File is locked: 671644044, 3208642568
Did insert: 671644044
About to suspend:671644044
---

Changing the value of __isthreaded to 0 in main produces the expected output:
---
main __isthreaded = 0
We opened /tmp/foobar, fd of 3
main after fclose __isthreaded = 0
File is closed
testThread __isthread = 0
Writing to file /tmp/file_to_write
Finished writing to file /tmp/file_to_write
---


In the default case fclose leaves a lock held, so when the second thread
attempts file I/O it is suspended waiting for the main thread to release
the lock. However, the main thread doesn't know it holds a lock.


This is with linuxthreads 2.2.3_1.  The compile lines are:
g++ -ofclose-test.o -c -g  -D_THREAD_SAFE  \
-I/usr/local/include/pthread/linuxthreads \
-D_PTHREADS  -D__USE_UNIX98 fclose-test.cpp
g++  -o fclose-test.exe fclose-test.o -g  -L/usr/local/lib  -llthread  -llgcc_r


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



booting off multiple disks

2001-06-28 Thread Jeff Fellin


I have a system that I need to boot into either stable or current, each
with its own root filesystem. I am having problems configuring
the system on the second drive: the kernel (current) loads and
boots, but it uses the root filesystem of stable.

My configuration is:

Jun 28 08:39:29 /boot/kernel/kernel: isa0: ISA bus on isab0
Jun 28 08:39:29 /boot/kernel/kernel: atapci0: Intel PIIX4 ATA33 controller port 0x3460-0x346f at device 18.1 on pci0
Jun 28 08:39:29 /boot/kernel/kernel: ata0: at 0x1f0 irq 14 on atapci0
Jun 28 08:39:29 /boot/kernel/kernel: ata1: at 0x170 irq 15 on atapci0   

snip

Jun 28 08:39:30 /boot/kernel/kernel: ad0: DMA limited to UDMA33, non-ATA66 compliant cable
Jun 28 08:39:30 /boot/kernel/kernel: ad0: 28629MB QUANTUM FIREBALLP LM30 [58168/16/63] at ata0-master UDMA33
Jun 28 08:39:30 /boot/kernel/kernel: ad2: 8063MB IBM-DHEA-38451 [16383/16/63] at ata1-master UDMA33

snip
Jun 28 08:39:30 /boot/kernel/kernel: Mounting root from ufs:/dev/ad0s1a

I tried switching the values of ata.0 and ata.1 in /boot/device.hints,
but it didn't make any difference.

Simply put, I have stable on ad0 and current on ad2, and when I boot
a kernel from a drive I want the root filesystem to be on that same
drive.

Can anyone point me to any man pages that might explain (or hint at)
how to accomplish this?

Thank you in advance.
Jeff Fellin
Bell Labs, Murray Hill, NJ
(908) 582-7673
[EMAIL PROTECTED]




Kernel Buffer overwrite debugging

2000-12-18 Thread Jeff Fellin


I am having a problem with a device driver that uses physio
to transfer data to a SCSI adapter. Sometimes, after the buffer
is passed to the CAM system via xpt_action, the buffer
contents are modified. I've traced my driver and cannot
determine how this could be happening. I am running on a single
CPU Pentium II system with all system config defaults.

What I would like to do is dynamically set a watchpoint
on the buffer used by the write system call for the duration
of sending the data to the SCSI adapter. I want to do this
dynamically instead of manually setting a breakpoint in the
code and manually setting the watchpoint, because the problem
occurs around the 90th time, and I don't want SCSI bus timeouts
while I type the watch address.

I've examined the ddb code, and thought that if I emulated the
steps in db_trap() for the command of setting a watchpoint it
would work. However, it doesn't appear to be working.

What I've done is:

/* possible on data xfer = 512 bytes */
if (condition for problem) {

    db_watchpoint_cmd(bp->bio_addr, bp->bio_addr,
        bp->bio_count, "rw");
    db_continue_cmd(0, 0, 0, "w");
    db_restart_at_pc(FALSE);
}

When the buffer is done transmitting I do the following:

db_clear_watchpoints();
db_deletewatch_cmd(bp->bio_addr, bp->bio_addr,
    bp->bio_count, "rw");
db_continue_cmd(0, 0, 0, "w");
db_restart_at_pc(FALSE);

My driver trace printfs show that the data at bp->bio_addr was
changed from 0x601000a3 to 0x0. Additional traces show that the
first 200+ bytes are changed to zero.

Any guidance on how to use the ddb functions to debug this
problem is appreciated, as are alternative methods to determine
what is overwriting the buffer. Looking at the data on a
SCSI bus analyzer, the entire buffer has been zeroed out.

Thank you in advance for your help.

Jeff Fellin
MH 2A-352
(908) 582-7673
[EMAIL PROTECTED]


