Re: -current sudden panics :(

2000-03-23 Thread Matthew Dillon

This problem should now be fixed, it's probably the problem I just fixed
a moment ago in netinet/if_ether.c based on a thread in -hackers.  The
m_pullup() NULL check in arpintr() was broken, resulting in a NULL
pointer dereference.  

-Matt



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: -current sudden panics :(

2000-03-23 Thread Warner Losh

In message [EMAIL PROTECTED] Matthew Dillon writes:
: This problem should now be fixed, it's probably the problem I just fixed
: a moment ago in netinet/if_ether.c based on a thread in -hackers.  The
: m_pullup() NULL check in arpintr() was broken, resulting in a NULL
: pointer dereference.  

inoue-san's patch survived the night.  I'll check into your patch and
give it a try instead.

Warner





Re: -current sudden panics :(

2000-03-23 Thread Warner Losh

In message [EMAIL PROTECTED] Yoshinobu Inoue writes:
: I would like to narrow down the problem more and could you
: please try if this patch stop the problem or not?
: (The m_pullup() is recently added to if_rl.c. It should not be
: harmful, but I suspect that this might have invoked another
: hidden bug.)

This survived overnight.  I see that Matt Dillon has another patch,
I'll try that tonight.

Warner





Re: -current sudden panics :(

2000-03-23 Thread Yoshinobu Inoue

> : This problem should now be fixed, it's probably the problem I just fixed
> : a moment ago in netinet/if_ether.c based on a thread in -hackers.  The
> : m_pullup() NULL check in arpintr() was broken, resulting in a NULL
> : pointer dereference.
>
> inoue-san's patch survived the night.  I'll check into your patch and
> give it a try instead.

My patch is just a workaround that avoids m_pullup() when it is
not necessary, and his fix seems to be the real one for the
problem.
But I think my patch to if_rl.c is still worth applying
for performance reasons.

Cheers,
Yoshinobu Inoue





Re: -current sudden panics :(

2000-03-23 Thread Ilmar S. Habibulin

On Thu, 23 Mar 2000, Matthew Dillon wrote:

> This problem should now be fixed, it's probably the problem I just fixed
> a moment ago in netinet/if_ether.c based on a thread in -hackers.  The
> m_pullup() NULL check in arpintr() was broken, resulting in a NULL
> pointer dereference.
OK. Uptime is more than 8 hours now; I'll continue testing.






Re: -current sudden panics :(

2000-03-22 Thread Warner Losh

In message [EMAIL PROTECTED] "Ilmar S. Habibulin" writes:
: This is driver for ed(ne2000) cards. I have realtek(rl driver). I took a
: look at his source and didn't find such strings. There is comment there
: about cutting off mbuf header before passing it to ether_input - what's
: this?

I applied a similar patch to the end of the rl packet handling
routine.  It didn't solve my arp crashes, however.   It is almost as
if sometimes the rl driver passes a packet to ether_input and then
does bad things to it behind the scenes...  I've not had a lot of time
to try to track down why this does what it does.

Warner





Re: -current sudden panics :(

2000-03-22 Thread Yoshinobu Inoue

Hi,

> : This is driver for ed(ne2000) cards. I have realtek(rl driver). I took a
> : look at his source and didn't find such strings. There is comment there
> : about cutting off mbuf header before passing it to ether_input - what's
> : this?
>
> I applied a similar patch to the end of the rl packet handling
> routine.  It didn't solve my arp crashes, however.   It is almost as
> if sometimes the rl driver passes a packet to ether_input and then
> does bad things to it behind the scenes...  I've not had a lot of time
> to try to track down why this does what it does.
>
> Warner

I would like to narrow the problem down further. Could you
please try this patch and see whether it stops the problem?
(The m_pullup() call was recently added to if_rl.c. It should not be
harmful, but I suspect that it might have exposed another
hidden bug.)

Yoshinobu Inoue

Index: if_rl.c
===================================================================
RCS file: /home/ncvs/src/sys/pci/if_rl.c,v
retrieving revision 1.38
diff -u -r1.38 if_rl.c
--- if_rl.c     1999/12/28 06:04:29     1.38
+++ if_rl.c     2000/03/23 01:35:02
@@ -1130,7 +1130,8 @@
                        m_adj(m, RL_ETHER_ALIGN);
                        m_copyback(m, wrap, total_len - wrap,
                                sc->rl_cdata.rl_rx_buf);
-                       m = m_pullup(m, sizeof(struct ether_header));
+                       if (m->m_len < sizeof(struct ether_header))
+                               m = m_pullup(m, sizeof(struct ether_header));
                        if (m == NULL) {
                                printf("rl%d: m_pullup failed",
                                    sc->rl_unit);





Re: -current sudden panics :(

2000-03-22 Thread Ilmar S. Habibulin

On Tue, 21 Mar 2000, Warner Losh wrote:

> : But why there is such a sudden change? Everything worked just fine a week
> : before 5-current.
> No it didn't.  I've been seeing panics like this for about two weeks,
OK, it worked for me.

> but it hadn't been a priority until this week for me.  And I'm not
> seeing it on lightly loaded networks, but am on heavily loaded ones.
My PC is not even on a lightly loaded network; this network's load is
tending towards zero. ;-)

> Since our product's network port is just for debugging, it isn't a big
> deal to me
And I'm using FreeBSD as my desktop OS, so this has become a VERY BIG
problem for me. :(

> It is definitely a load related problem for me.  It usually works just
> fine, but sometimes there's a packet that gets to arp that arp barfs
> on.
I can't track down when it happens. Everything seems to be fine, then
- BBBOOOMMM - page fault. :(






Re: -current sudden panics :(

2000-03-21 Thread Yoshinobu Inoue

Hello,

> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x8
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc01843fc
> stack pointer           = 0x10:0xc026bd64
> frame pointer           = 0x10:0xc026bd64
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = Idle
> interrupt mask          =
> kernel: type 12 trap, code=0
> Stopped at      arpintr+0x9c:   movl    0x8(%ebx),%ecx
>
> trace gave this:
> arpintr(c022537b,0,10,10,c0220010) at arpintr+0x9c
> swi_net_next() at swi_net_next
>
> I'm sending kernel config and dmesg in the attachment. I have INET6 there,
> but it is not configured by ifconfig.
>
> What's this and how can I avoid these panics?

Do you have any other hints about the problem? At least
I couldn't reproduce it on my 4.0 and 5.0 machines.

  -Any kernel crash dump?
  -Is there any typical situation or condition where the
   problem happens?
  -What is your LAN card?


Thanks,
Yoshinobu Inoue





Re: -current sudden panics :(

2000-03-21 Thread Nikolai Saoukh

On Wed, Mar 22, 2000 at 12:51:36AM +0900, Yoshinobu Inoue wrote:

> > trace gave this:
> > arpintr(c022537b,0,10,10,c0220010) at arpintr+0x9c
> > swi_net_next() at swi_net_next
> >
> > I'm sending kernel config and dmesg in the attachment. I have INET6 there,
> > but it is not configured by ifconfig.
> >
> > What's this and how can I avoid these panics?
>
> Do you have any other hints for the problem?, because at least
> I couldn't reproduce it in my 4.0 and 5.0 machines.
>
>   -Any kernel crash dump?
>   -Is there any typical situation or condition where the
>    problem happens?
>   -What is your LAN card?

The driver for his card does not set the packet header pointer, so the
ARP code sees a NULL pointer. A small patch should cure this problem
(at least I hope so).

*** if_ed.c.old Tue Mar 21 19:21:40 2000
--- if_ed.c     Tue Mar 21 19:23:27 2000
***************
*** 2728,2733 ****
--- 2728,2734 ----
           */
          m->m_pkthdr.len = m->m_len = len - sizeof(struct ether_header);
          m->m_data += sizeof(struct ether_header);
+         m->m_pkthdr.header = (void *)eh;
  
          ether_input(&sc->arpcom.ac_if, eh, m);
          return;





Re: -current sudden panics :(

2000-03-21 Thread Yoshinobu Inoue

> -What is your LAN card?

Whoops, that was a needless question. According to the kernel log,
it should be using the rl driver.

> The driver for his card does not set packet header pointer, thus
> arp stuff see NULL pointer. small patch will cure this problem
> (at least I hope so).
>
> *** if_ed.c.old Tue Mar 21 19:21:40 2000
> --- if_ed.c     Tue Mar 21 19:23:27 2000
> ***************
> *** 2728,2733 ****
> --- 2728,2734 ----
>            */
>           m->m_pkthdr.len = m->m_len = len - sizeof(struct ether_header);
>           m->m_data += sizeof(struct ether_header);
> +         m->m_pkthdr.header = (void *)eh;
>
>           ether_input(&sc->arpcom.ac_if, eh, m);
>           return;

But shouldn't it be sys/pci/if_rl.c ?

Yoshinobu Inoue





Re: -current sudden panics :(

2000-03-21 Thread Nikolai Saoukh

On Wed, Mar 22, 2000 at 01:51:53AM +0900, Yoshinobu Inoue wrote:

> But shouldn't it be sys/pci/if_rl.c ?

Sorry, mea culpa. I mixed up his case with mine (token ring).





Re: -current sudden panics :(

2000-03-21 Thread Warner Losh

In message [EMAIL PROTECTED] Nikolai Saoukh writes:
:  But shouldn't it be sys/pci/if_rl.c ?
: 
: Sorry,
: it is mea culpa. I mixed his case with my (token ring).

Do you have the patch to if_rl.c?  I looked at it for all of 10
seconds and it wasn't immediately obvious to me.

Warner





Re: -current sudden panics :(

2000-03-21 Thread Ilmar S. Habibulin

On Wed, 22 Mar 2000, Yoshinobu Inoue wrote:

> Do you have any other hints for the problem?, because at least
> I couldn't reproduce it in my 4.0 and 5.0 machines.
>   -Any kernel crash dump?
Can you tell me the ddb command to make a kernel dump?

>   -Is there any typical situation or condition where the
>    problem happens?
I don't know. Uptime between panics ranges from 5 minutes to 10 hours. They
are sudden, as I said. :(

>   -What is your LAN card?
Something on a Realtek chipset (rl8139), maybe Acorp. I don't remember. The
card has worked fine for about a year.






Re: -current sudden panics :(

2000-03-21 Thread Ilmar S. Habibulin

On Tue, 21 Mar 2000, Nikolai Saoukh wrote:

> The driver for his card does not set packet header pointer, thus
> arp stuff see NULL pointer. small patch will cure this problem
> (at least I hope so).
>
> *** if_ed.c.old Tue Mar 21 19:21:40 2000
> --- if_ed.c     Tue Mar 21 19:23:27 2000
> ***************
> *** 2728,2733 ****
> --- 2728,2734 ----
>            */
>           m->m_pkthdr.len = m->m_len = len - sizeof(struct ether_header);
>           m->m_data += sizeof(struct ether_header);
> +         m->m_pkthdr.header = (void *)eh;
>
>           ether_input(&sc->arpcom.ac_if, eh, m);
>           return;
This is the driver for ed (NE2000) cards. I have a Realtek card (rl driver).
I took a look at its source and didn't find such lines. There is a comment
there about cutting off the mbuf header before passing it to ether_input -
what's this?






Re: -current sudden panics :(

2000-03-21 Thread Ilmar S. Habibulin

On Tue, 21 Mar 2000, Warner Losh wrote:

> In message [EMAIL PROTECTED] Nikolai Saoukh writes:
> : > But shouldn't it be sys/pci/if_rl.c ?
> :
> : Sorry,
> : it is mea culpa. I mixed his case with my (token ring).
>
> Do you have the patch to if_rl.c.  I looked at it for all of 10
> seconds and it wasn't immediately obvious to me.

But why is there such a sudden change? Everything worked just fine a week
before 5-current.






Re: -current sudden panics :(

2000-03-21 Thread Yoshinobu Inoue

> > The driver for his card does not set packet header pointer, thus
> > arp stuff see NULL pointer. small patch will cure this problem
> > (at least I hope so).
> >
> > *** if_ed.c.old Tue Mar 21 19:21:40 2000
> > --- if_ed.c     Tue Mar 21 19:23:27 2000
> > ***************
> > *** 2728,2733 ****
> > --- 2728,2734 ----
> >            */
> >           m->m_pkthdr.len = m->m_len = len - sizeof(struct ether_header);
> >           m->m_data += sizeof(struct ether_header);
> > +         m->m_pkthdr.header = (void *)eh;
> >
> >           ether_input(&sc->arpcom.ac_if, eh, m);
> >           return;
> This is driver for ed(ne2000) cards. I have realtek(rl driver). I took a
> look at his source and didn't find such strings. There is comment there
> about cutting off mbuf header before passing it to ether_input - what's
> this?

I think this fix is only necessary for the token-ring case (as he
says in his follow-up mail), and is not related to Ethernet.

Yoshinobu Inoue





Re: -current sudden panics :(

2000-03-21 Thread Yoshinobu Inoue

> > -Any kernel crash dump?
> Can you tell me ddb command to make a kernel dump?

 -Please confirm that your /var/crash has enough space for your
  machine's memory.

 -Please check your swap device using "swapinfo" etc.
  On my machine, for example:

   % swapinfo
   Device       1K-blocks     Used    Avail Capacity  Type
   /dev/wd0s2b     262144    75612   186404    29%    Interleaved

 -Please specify it as dumpdev in your /etc/rc.conf

   dumpdev="/dev/wd0s2b"

Then, at the reboot after a panic, the crash dump will be
written to files under /var/crash/.

Cheers,
Yoshinobu Inoue





Re: -current sudden panics :(

2000-03-21 Thread Warner Losh

In message [EMAIL PROTECTED] "Ilmar S. Habibulin" writes:
: But why there is such a sudden change? Everything worked just fine a week
: before 5-current.

No it didn't.  I've been seeing panics like this for about two weeks,
but it hadn't been a priority until this week for me.  And I'm not
seeing it on lightly loaded networks, but am on heavily loaded ones.
Since our product's network port is just for debugging, it isn't a big
deal to me.

It is definitely a load related problem for me.  It usually works just
fine, but sometimes there's a packet that gets to arp that arp barfs
on.

Warner

