Re: Linuxulator: possible Giant pushdown victim

2001-09-10 Thread John Baldwin


On 10-Sep-01 Dag-Erling Smorgrav wrote:
> Julian Elischer <[EMAIL PROTECTED]> writes:
>> Marcel Moolenaar wrote:
>> > BTW: Do we have handy functions for use in the remote debugger, such
>> > as show_proc, show_vm or whatever, that dump important information
>> > in a readable form?
>> Matt has a cool set of macros as does Grog.
> 
> I have a couple of macros I've used for debugging KLDs, which may
> serve as templates or inspiration for someone to write e.g. a "ps"
> macro (it shouldn't be too different from the "kldstat" macro, just
> walk the process table and print formatted info for every process)

Grog has a ps macro.  Look in sys/modules/vinum IIRC.

-- 

John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: Linuxulator: possible Giant pushdown victim

2001-09-10 Thread Dag-Erling Smorgrav

Julian Elischer <[EMAIL PROTECTED]> writes:
> Marcel Moolenaar wrote:
> > BTW: Do we have handy functions for use in the remote debugger, such
> > as show_proc, show_vm or whatever, that dump important information
> > in a readable form?
> Matt has a cool set of macros as does Grog.

I have a couple of macros I've used for debugging KLDs, which may
serve as templates or inspiration for someone to write e.g. a "ps"
macro (it shouldn't be too different from the "kldstat" macro, just
walk the process table and print formatted info for every process)

define kldstat
  set $kld = linker_files.tqh_first
  printf "Id Refs Address    Size     Name\n"
  while ($kld != 0)
    printf "%2d %4d 0x%08x %-8x %s\n", \
      $kld->id, $kld->refs, $kld->address, $kld->size, $kld->filename
    set $kld = $kld->link.tqe_next
  end
end

document kldstat
  Lists the modules that were loaded when the kernel crashed.
end

define kldstat-v
  set $kld = linker_files.tqh_first
  printf "Id Refs Address    Size     Name\n"
  while ($kld != 0)
    printf "%2d %4d 0x%08x %-8x %s\n", \
      $kld->id, $kld->refs, $kld->address, $kld->size, $kld->filename
    printf "Contains modules:\n"
    printf "Id Name\n"
    set $module = $kld->modules.tqh_first
    while ($module != 0)
      printf "%2d %s\n", $module->id, $module->name
      set $module = $module->link.tqe_next
    end
    set $kld = $kld->link.tqe_next
  end
end

document kldstat-v
  Lists modules with full information.
end

define kldload
  set $kld = linker_files.tqh_first
  set $done = 0
  while ($kld != 0 && $done == 0)
    if ($kld->filename == $arg0)
      set $done = 1
    else
      set $kld = $kld->link.tqe_next
    end
  end
  if ($done == 1)
    shell /usr/bin/objdump -h $arg0 | \
      awk '/ .text/ { print "set \$offset = 0x" $6 }' > .kgdb.temp
    source .kgdb.temp
    add-symbol-file $arg0 $kld->address + $offset
  end
end

document kldload
  Loads a module. Arguments are module name and offset of text section.
end
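
For anyone adapting kldload: the shell step only pulls the file offset
of .text out of the "objdump -h" section table, and the macro then adds
it to the module's load address. A minimal Python sketch of that
arithmetic follows. The sample section table and the 0xc1a00000 load
address are made-up values for illustration, and text_file_offset is a
hypothetical helper name, not part of any tool. It matches the section
name exactly, which is slightly stricter than the awk pattern.

```python
def text_file_offset(objdump_output):
    """Return the 'File off' column of the .text row from `objdump -h` text.

    Mirrors the awk step in the macro: pick the .text line and take
    field 6 (Idx, Name, Size, VMA, LMA, File off).
    """
    for line in objdump_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[1] == ".text":
            return int(fields[5], 16)
    raise ValueError("no .text section found")


# Made-up section table, shaped like `objdump -h` output.
SAMPLE = """\
Idx Name          Size      VMA       LMA       File off  Algn
  0 .hash         000000a8  00000094  00000094  00000094  2**2
  1 .text         00000b10  00000400  00000400  00000400  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
"""

load_address = 0xc1a00000          # stands in for $kld->address (assumed value)
offset = text_file_offset(SAMPLE)  # stands in for $offset
print(hex(load_address + offset))
```

The sum is what the macro hands to add-symbol-file as the text address.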

DES
-- 
Dag-Erling Smorgrav - [EMAIL PROTECTED]




Re: Linuxulator: possible Giant pushdown victim

2001-09-07 Thread Julian Elischer

Marcel Moolenaar wrote:
>
> BTW: Do we have handy functions for use in the remote debugger, such
> as show_proc, show_vm or whatever, that dump important information
> in a readable form?

Matt has a cool set of macros as does Grog.

-- 
Julian Elischer [EMAIL PROTECTED]
(OZ; presently in San Francisco, USA: hard at work in a very strange country!)




Re: Linuxulator: possible Giant pushdown victim

2001-09-07 Thread Marcel Moolenaar

On Thu, Sep 06, 2001 at 11:55:19AM -0700, John Baldwin wrote:
> 
> 
> Note that 3 of these are runnable (stat of 2 == SRUN).  In top, see if they are
> chewing up lots of time.

Top doesn't update after the first mozilla process has started. Its
trace is:

mi_switch()
cv_timedwait_sig()
select()
syscall()
syscall_with_err_pushed()
 --- syscall(93, .., select)

> > db> trace 517
> > mi_switch(0,cd193aa0,811f874,cd27cfa0,c02bead6) at mi_switch+0x1a0
> > _mtx_unlock_sleep(c039e860,0,c030b460,497) at _mtx_unlock_sleep+0x204
> > syscall(2f,2f,2f,811f874,1) at syscall+0x48a
> > syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
> > --- syscall (514), eip = 0x285a31a7, esp = 0x811f858, ebp = 0x811f9b4 ---
> 
> Weird syscall number (514).  This one was blocked on a mutex that was just
> released.  I'm betting that 0xc039e860 is Giant?  Perhaps not though?

Rien ne va plus! It is Giant.

> > db> trace 520
> > mi_switch(cd193ee0) at mi_switch+0x1a0
> > userret(cd193ee0,cd257fa8,0,208,befffc00) at userret+0x395
> > syscall(2f,2f,2f,befffd24,befffc00) at syscall+0x3c9
> > syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
> > --- syscall (0, Linux ELF, nosys), eip = 0x285b8bd4, esp = 0xbefffb24, ebp =
> > 0xbefffbf4 ---
> 
> Another instance of being preempted upon return to userland.  Possible that the
> regs in the trapframe are altered to hold return values and thus that the
> syscall number is invalid.  Hmm.

That certainly would explain it (see above).

> What locks do all these processes hold? 

No locks are held by any of the processes. The question then is: what
are they waiting for?

I started playing with remote debugging; let me look around for a bit.

BTW: Do we have handy functions for use in the remote debugger, such
as show_proc, show_vm or whatever, that dump important information
in a readable form?

-- 
 Marcel Moolenaar USPA: A-39004  [EMAIL PROTECTED]




Re: Linuxulator: possible Giant pushdown victim

2001-09-06 Thread John Baldwin


On 06-Sep-01 Marcel Moolenaar wrote:
> On Wed, Sep 05, 2001 at 02:47:28PM -0700, John Baldwin wrote:
>> 
>> Yes, you can trace individual processes though, using 'trace <pid>', and I'm
>> more curious about the traces of the Mozilla processes.
> 
> Ok, here it is:
> 
> db> ps
>   pid   proc     addr   uid  ppid  pgrp  flag stat wmesg   wchan   cmd
>   520 cd193ee0 cd256000 4152   517   514 02  2  mozilla-bin
>   519 cd197840 cd1ab000 4152   517   514 202  3  pause c17d3000 mozilla-bin
>   518 cd193880 cd27 4152   517   514 02  3  select c039bb24 mozilla-bin
>   517 cd193aa0 cd27b000 4152   514   514 02  2  mozilla-bin
>   514 cd194100 cd244000 4152   505   514 004002  2  mozilla-bin
>   ...

Note that 3 of these are runnable (stat of 2 == SRUN).  In top, see if they are
chewing up lots of time.
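
To make the stat column in the ps listing easier to read, here is a
small Python sketch that labels each row. Only the two state values
named in this thread (2 == SRUN, 3 == SSLEEP) are mapped; anything else
is reported as unknown rather than guessed. The row layout (stat as the
eighth whitespace-separated field, command last) is an assumption based
on the listing quoted above, and parse_ps_row and runnable_pids are
hypothetical helper names.

```python
# Only the states named in this thread; other values are left unmapped.
STATES = {2: "SRUN", 3: "SSLEEP"}


def parse_ps_row(row):
    """Split one ddb ps row into (pid, state name, command).

    Assumes the layout pid proc addr uid ppid pgrp flag stat
    [wmesg wchan] cmd, so stat is field 8 and cmd is the last field.
    """
    fields = row.split()
    pid = int(fields[0])
    stat = int(fields[7])
    return pid, STATES.get(stat, "unknown(%d)" % stat), fields[-1]


def runnable_pids(rows):
    """Return the pids whose state is SRUN, in listing order."""
    return [pid for pid, state, _ in map(parse_ps_row, rows) if state == "SRUN"]


# Rows copied from the listing above, with column padding collapsed.
ROWS = [
    "520 cd193ee0 cd256000 4152 517 514 02 2 mozilla-bin",
    "519 cd197840 cd1ab000 4152 517 514 202 3 pause c17d3000 mozilla-bin",
    "517 cd193aa0 cd27b000 4152 514 514 02 2 mozilla-bin",
    "514 cd194100 cd244000 4152 505 514 004002 2 mozilla-bin",
]

print(runnable_pids(ROWS))
```
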

> db> trace 514
> mi_switch(cd194100) at mi_switch+0x1a0
> userret(cd194100,cd245fa8,c5,a,bfbfeae0) at userret+0x395
> syscall(2f,2f,2f,282397c0,bfbfeae0) at syscall+0x3c9
> syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
> --- syscall (148, Linux ELF, linux_fdatasync), eip = 0x285c2074, esp =
> 0xbfbfeac8, ebp = 0xbfbfeb98 ---

It was returning from a syscall but had to do a context switch due to
PS_NEEDRESCHED because it got preempted.

> db> trace 517
> mi_switch(0,cd193aa0,811f874,cd27cfa0,c02bead6) at mi_switch+0x1a0
> _mtx_unlock_sleep(c039e860,0,c030b460,497) at _mtx_unlock_sleep+0x204
> syscall(2f,2f,2f,811f874,1) at syscall+0x48a
> syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
> --- syscall (514), eip = 0x285a31a7, esp = 0x811f858, ebp = 0x811f9b4 ---

Weird syscall number (514).  This one was blocked on a mutex that was just
released.  I'm betting that 0xc039e860 is Giant?  Perhaps not though?

> db> trace 518
> mi_switch(cd19399c,cd193880,0,2,0) at mi_switch+0x1a0
> cv_timedwait_sig(c039bb24,cd19399c,dad,1,bfbffeb8) at cv_timedwait_sig+0x65b
> poll(cd193880,cd271f44,cd19399c,cd193880,bf3ffa4c) at poll+0x656
> linux_poll(cd193880,cd271f80,bf3ffa4c,88b8,bf3ffa4c) at linux_poll+0x11f
> syscall(2f,2f,2f,bf3ffa4c,88b8) at syscall+0x339
> syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
> --- syscall (168, Linux ELF, linux_poll), eip = 0x285c7894, esp = 0xbf3ff9e8,
> ebp = 0xbf3ff9f4 ---

Asleep in select as ps shows.

> db> trace 519
> mi_switch(cd19795c,cd197840,c17d3000,c02f3a60,2) at mi_switch+0x1a0
> msleep(c17d3000,cd19795c,168,c02f0f49,0) at msleep+0x71a
> sigsuspend(cd197840,cd1acf4c,cd1acf44,bfbffeb8,cd19795c) at sigsuspend+0x19f
> linux_rt_sigsuspend(cd197840,cd1acf80,bf1ff94c,bf1ff94c,28239fc8) at
> linux_rt_sigsuspend+0x8e
> syscall(2f,2f,2f,28239fc8,bf1ff94c) at syscall+0x339
> syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
> --- syscall (179, Linux ELF, linux_rt_sigsuspend), eip = 0x2851c656, esp =
> 0xbf1ff92c, ebp = 0xbf1ff934 ---

Asleep in pause as ps shows.
 
> db> trace 520
> mi_switch(cd193ee0) at mi_switch+0x1a0
> userret(cd193ee0,cd257fa8,0,208,befffc00) at userret+0x395
> syscall(2f,2f,2f,befffd24,befffc00) at syscall+0x3c9
> syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
> --- syscall (0, Linux ELF, nosys), eip = 0x285b8bd4, esp = 0xbefffb24, ebp =
> 0xbefffbf4 ---

Another instance of being preempted upon return to userland.  Possible that the
regs in the trapframe are altered to hold return values and thus that the
syscall number is invalid.  Hmm.  What locks do all these processes hold? 
I would expect the ones in stat 3 (SSLEEP) to hold none, but the others might
hold locks.

-- 

John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




Re: Linuxulator: possible Giant pushdown victim

2001-09-05 Thread Marcel Moolenaar

On Wed, Sep 05, 2001 at 02:47:28PM -0700, John Baldwin wrote:
> 
> Yes, you can trace individual processes though, using 'trace <pid>', and I'm
> more curious about the traces of the Mozilla processes.

Ok, here it is:

db> ps
  pid   proc     addr   uid  ppid  pgrp  flag stat wmesg   wchan   cmd
  520 cd193ee0 cd256000 4152   517   514 02  2  mozilla-bin
  519 cd197840 cd1ab000 4152   517   514 202  3  pause c17d3000 mozilla-bin
  518 cd193880 cd27 4152   517   514 02  3  select c039bb24 mozilla-bin
  517 cd193aa0 cd27b000 4152   514   514 02  2  mozilla-bin
  514 cd194100 cd244000 4152   505   514 004002  2  mozilla-bin
  ...

db> trace
Debugger(c0305de9) at Debugger+0x44
scgetc(c039a080,2,c1667a00,c0392da0,4) at scgetc+0x412
sckbdevent(c0392da0,0,c039a080,c1667a00,c1669780) at sckbdevent+0x1c9
atkbd_intr(c0392da0,0,cc475f7c,c01bd99b,c0392da0) at atkbd_intr+0x22
atkbd_isa_intr(c0392da0) at atkbd_isa_intr+0x18
ithread_loop(c1669780,cc475fa8) at ithread_loop+0x2bf
fork_exit(c01bd6dc,c1669780,cc475fa8) at fork_exit+0xb4
fork_trampoline() at fork_trampoline+0x8

db> trace 514
mi_switch(cd194100) at mi_switch+0x1a0
userret(cd194100,cd245fa8,c5,a,bfbfeae0) at userret+0x395
syscall(2f,2f,2f,282397c0,bfbfeae0) at syscall+0x3c9
syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
--- syscall (148, Linux ELF, linux_fdatasync), eip = 0x285c2074, esp = 0xbfbfeac8, ebp = 0xbfbfeb98 ---

db> trace 517
mi_switch(0,cd193aa0,811f874,cd27cfa0,c02bead6) at mi_switch+0x1a0
_mtx_unlock_sleep(c039e860,0,c030b460,497) at _mtx_unlock_sleep+0x204
syscall(2f,2f,2f,811f874,1) at syscall+0x48a
syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
--- syscall (514), eip = 0x285a31a7, esp = 0x811f858, ebp = 0x811f9b4 ---

db> trace 518
mi_switch(cd19399c,cd193880,0,2,0) at mi_switch+0x1a0
cv_timedwait_sig(c039bb24,cd19399c,dad,1,bfbffeb8) at cv_timedwait_sig+0x65b
poll(cd193880,cd271f44,cd19399c,cd193880,bf3ffa4c) at poll+0x656
linux_poll(cd193880,cd271f80,bf3ffa4c,88b8,bf3ffa4c) at linux_poll+0x11f
syscall(2f,2f,2f,bf3ffa4c,88b8) at syscall+0x339
syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
--- syscall (168, Linux ELF, linux_poll), eip = 0x285c7894, esp = 0xbf3ff9e8, ebp = 0xbf3ff9f4 ---

db> trace 519
mi_switch(cd19795c,cd197840,c17d3000,c02f3a60,2) at mi_switch+0x1a0
msleep(c17d3000,cd19795c,168,c02f0f49,0) at msleep+0x71a
sigsuspend(cd197840,cd1acf4c,cd1acf44,bfbffeb8,cd19795c) at sigsuspend+0x19f
linux_rt_sigsuspend(cd197840,cd1acf80,bf1ff94c,bf1ff94c,28239fc8) at linux_rt_sigsuspend+0x8e
syscall(2f,2f,2f,28239fc8,bf1ff94c) at syscall+0x339
syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
--- syscall (179, Linux ELF, linux_rt_sigsuspend), eip = 0x2851c656, esp = 0xbf1ff92c, ebp = 0xbf1ff934 ---

db> trace 520
mi_switch(cd193ee0) at mi_switch+0x1a0
userret(cd193ee0,cd257fa8,0,208,befffc00) at userret+0x395
syscall(2f,2f,2f,befffd24,befffc00) at syscall+0x3c9
syscall_with_err_pushed() at syscall_with_err_pushed+0x1b
--- syscall (0, Linux ELF, nosys), eip = 0x285b8bd4, esp = 0xbefffb24, ebp = 0xbefffbf4 ---

NOTE 1: process 517: this process seems to be the most active. Multiple
breaks after continuing result in different traces.
NOTE 2: process 518: there's no linux_poll in the source tree. This is a
local change.
NOTE 3: process 520: syscall 0 is an invalid Linux syscall (used to be
setup()).
NOTE 4: this is not reproducible on Alpha, because it panics even before
loading mozilla, but this is for later.
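
Since the notes above turn on the syscall numbers in the trace footers,
here is a rough Python sketch for pulling them out of saved ddb output.
The regular expression is written against the footer lines in this
message only, and parse_syscall_footer is a hypothetical name; other
kernels or architectures may format the line differently.

```python
import re

# Matches footers such as:
#   --- syscall (148, Linux ELF, linux_fdatasync), eip = 0x285c2074, ...
#   --- syscall (514), eip = 0x285a31a7, ...
FOOTER = re.compile(
    r"--- syscall \((\d+)(?:, ([^,]+), ([^)]+))?\), eip = (0x[0-9a-f]+)"
)


def parse_syscall_footer(line):
    """Return (number, abi, name, eip) or None if the line is not a footer.

    abi and name are None when ddb printed only the number.
    """
    m = FOOTER.search(line)
    if m is None:
        return None
    return int(m.group(1)), m.group(2), m.group(3), int(m.group(4), 16)


for footer in (
    "--- syscall (148, Linux ELF, linux_fdatasync), eip = 0x285c2074, esp = 0xbfbfeac8",
    "--- syscall (514), eip = 0x285a31a7, esp = 0x811f858, ebp = 0x811f9b4 ---",
    "--- syscall (0, Linux ELF, nosys), eip = 0x285b8bd4, esp = 0xbefffb24",
):
    print(parse_syscall_footer(footer))
```

A footer with only a number, like the one for pid 517, comes back with
abi and name set to None, which makes the odd entries easy to spot.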

I'll go with my hunch (sp?) that it's linux_clone and see if I can find
the evidence. The system looks responsive, but everything that relates
to processes (creation, destruction) seems to queue up. At least that's
how it "feels"...

FYI,

-- 
 Marcel Moolenaar USPA: A-39004  [EMAIL PROTECTED]




Re: Linuxulator: possible Giant pushdown victim

2001-09-05 Thread John Baldwin


On 05-Sep-01 Marcel Moolenaar wrote:
> On Wed, Sep 05, 2001 at 11:04:04AM -0700, John Baldwin wrote:
>> 
>> On 05-Sep-01 Marcel Moolenaar wrote:
>> > Hi,
>> > 
>> > I get consistent locks when trying to run Mozilla for Linux (RH 7.1).
>> > 
>> > Breaking into the debugger, I see it hangs in fork_exit()+180. This
>> > should be the PROC_LOCK(p) in the source file (kern_fork.c).
>> 
>> Can you do 'show locks <pid>' where <pid> is the pid of the mozilla process?
>> Also, what does a 'trace' of the pid in question show?  (I take it this is
>> how you know where it locked up?)
> 
> show locks <pid> gives nothing for all cloned mozilla processes. This
> strikes me as odd. Another strange thing is that it seems to have a
> local effect at first (ie only mozilla hangs), but when trying to
> compose an email on the same machine (for example), it locks up hard.
> 
> I'll give you a complete trace when I call it a day at the office. In the
> meantime, this is roughly it (warning, from memory):
> 
> Debugger
> ...
> intr...kbd
> intr...isa
> ithread_loop
> fork_exit
> fork_trampoline
> 
> My guess is that everything beginning with ithread_loop is related to
> me breaking into the debugger with CA-ESC.

Yes, you can trace individual processes though, using 'trace <pid>', and I'm
more curious about the traces of the Mozilla processes.

-- 

John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




Re: Linuxulator: possible Giant pushdown victim

2001-09-05 Thread Marcel Moolenaar

On Wed, Sep 05, 2001 at 11:04:04AM -0700, John Baldwin wrote:
> 
> On 05-Sep-01 Marcel Moolenaar wrote:
> > Hi,
> > 
> > I get consistent locks when trying to run Mozilla for Linux (RH 7.1).
> > 
> > Breaking into the debugger, I see it hangs in fork_exit()+180. This
> > should be the PROC_LOCK(p) in the source file (kern_fork.c).
> 
> Can you do 'show locks <pid>' where <pid> is the pid of the mozilla process?
> Also, what does a 'trace' of the pid in question show?  (I take it this is how
> you know where it locked up?)

show locks <pid> gives nothing for all cloned mozilla processes. This
strikes me as odd. Another strange thing is that it seems to have a
local effect at first (ie only mozilla hangs), but when trying to
compose an email on the same machine (for example), it locks up hard.

I'll give you a complete trace when I call it a day at the office. In the
meantime, this is roughly it (warning, from memory):

Debugger
...
intr...kbd
intr...isa
ithread_loop
fork_exit
fork_trampoline

My guess is that everything beginning with ithread_loop is related to
me breaking into the debugger with CA-ESC.

When I get back home again, I'll try this on Alpha as well. The Alpha
has already got a serial console, so it's easier to experiment at this
time.

Please stand by... :-)

-- 
 Marcel Moolenaar USPA: A-39004  [EMAIL PROTECTED]




RE: Linuxulator: possible Giant pushdown victim

2001-09-05 Thread John Baldwin


On 05-Sep-01 Marcel Moolenaar wrote:
> Hi,
> 
> I get consistent locks when trying to run Mozilla for Linux (RH 7.1).
> 
> Breaking into the debugger, I see it hangs in fork_exit()+180. This
> should be the PROC_LOCK(p) in the source file (kern_fork.c).

Can you do 'show locks <pid>' where <pid> is the pid of the mozilla process?
Also, what does a 'trace' of the pid in question show?  (I take it this is how
you know where it locked up?)

-- 

John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




Linuxulator: possible Giant pushdown victim

2001-09-05 Thread Marcel Moolenaar

Hi,

I get consistent locks when trying to run Mozilla for Linux (RH 7.1).

Breaking into the debugger, I see it hangs in fork_exit()+180. This
should be the PROC_LOCK(p) in the source file (kern_fork.c).

Since a deadlock in this place should be seen for FreeBSD binaries as
well and since that's not the case, it must be Mozilla.

In the Linuxulator fork() and vfork() are implemented in terms of
their FreeBSD equivs, so I don't think that's the problem. This
leaves clone().

I'm in the office and can't try anything ATM, but if someone can tell
me if my deductions make sense or not I'll see if I can get it resolved
as soon as I'm home.

-- 
 Marcel Moolenaar USPA: A-39004  [EMAIL PROTECTED]
