from:"Hen, Shmulik"

Q: kallsyms - where can I find it and what does it do ?

2000-10-18 Thread Hen, Shmulik


Hello,

I'm trying to build a new kernel with kdb support and I keep getting an
error from the makefile:

kallsyms pass 1
[make] /bin/sh: /sbin/kallsyms: No such file or directory
error

What is kallsyms and where can I get it from ?

Here is what I have and what I did:
The machine is a Compaq Ap500 dual P-III Xeon.
I'm using RedHat 6.2 (clean disk - custom install - install everything).
The new kernel is linux-2.4.0-test9.tar.gz (the latest according to
www.kernel.org).
The kdb patch is kdb-v1.5-2.4.0-test9-pre9.gz (the latest according to
oss.sgi.com).

There doesn't seem to be a problem applying the patch (no error messages, at
least).
I made sure kdb support is checked under 'make menuconfig'.
I ran 'make dep; make clean; make bzImage' and it keeps failing.

While on the subject, how is it possible that I can build an SMP kernel on
that machine (before the patch), but can't build a UP kernel?
I use 'make menuconfig' and uncheck "SMP support", then run 'make dep; make
clean; make bzImage' and I get all kinds of warnings and errors until make
simply stops (it doesn't return to the prompt with an error message; it just
stops and I have to hit CR to get back to the prompt).


Thanks in advance,

Shmulik Hen

  Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/

Please ignore - RE: kallsyms - where can I find it and what does it do ?

2000-10-18 Thread Hen, Shmulik

-Original Message-
From: Hen, Shmulik [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, October 18, 2000 6:17 PM
To: [EMAIL PROTECTED]
Cc: 'Keith Owens'
Subject: Q: kallsyms - where can I find it and what does it do ?


page fault problems porting a network driver to 2.4.x

2000-10-24 Thread Hen, Shmulik


Hello,

We are developing an advanced networking services loadable module and are
having problems porting it to work on 2.4.x kernels. The driver is supposed
to provide services such as fault tolerance, load balancing and link
aggregation over a team of network adapters. It works OK on 2.2.x kernels
but hangs on 2.4.x kernels.

In order to debug it, we stripped it down to a mere "intermediate" or
"filter" driver that binds to a base driver and passes everything through in
both directions (Rx, Tx, IOCTL, stats, etc.). After going through the basics
of modifying the driver to compile on 2.4.x kernels and fighting some nasty
deadlocks due to the new nature of the networking layer, we managed to get
it to run. The driver will receive and transmit a few hundred thousand
packets (while a periodic timer expires 10 times a second and IOCTLs run
continuously), and then it causes an oops about not being able to handle a
page fault.

The function looks something like:

int iansHardStartXmit(struct sk_buff *skb, struct net_device *dev) {
    int res;
    struct net_device *base;

    spin_lock(lock);
    base = get_base_driver_by_name(name);

    if (base != NULL) {
        res = base->hard_start_xmit(skb, base);
    }

    spin_unlock(lock);
    return res;
}

We used kdb in order to track down the problem and found out the following
stack trace:

EBP         EIP         function(args)
0xc4cd1c54  0xd081e3e7  [e100]__kallsyms+0xb (0xc4b595a0, 0xc840f200)
                        e100 __kallsyms 0xd081e3dc 0xd081e3dc 0xd0820dsc
            0xd08244ba  [ians]iansHardStartXmit+0xa6 (0xc4b595a0, 0xc4d9bc00)
                        ians .text 0xd0824060 0xd0824414 0xd082452c
            0xc01f9d1f  qdisc_restart+0xcf (0xc4d9bc00)
                        kernel .text 0xc010 0xc01f9c50 0xc01f9f14
*
*
*

This goes on and shows that this is an ICMP echo reply packet going down
through the IP stack to the filter driver (apparently 0xc4b595a0 is the skb,
0xc4d9bc00 is the *dev of the filter driver and 0xc840f200 is the *dev of
the base driver). The filter driver is supposed to call the
dev->hard_start_xmit of the base driver, but strangely it lands somewhere in
the data segment of the base driver (__kallsyms is a part of the symbol
table of the module according to insmod -m).
Figuring the dev->hard_start_xmit pointer got trashed somehow, we added a
check to make sure the same pointer is always called, and the pointer was
indeed always the same. Looking at the assembly code with kdb, we could see
that the call to the base driver is done by a 'call *%eax' instruction. kdb
reports that eax=0x after the page fault (origeax).

How is it possible that the pointer to the function keeps its value, but
the jump to that function falls somewhere else?
The entire function is protected by a spinlock, so there is no worry about
other threads messing with my data.

We are using:
RedHat 6.2
gcc v2.91.66
modutils v2.3.11-1
kernel linux-2.4.0-test9
kdb v1.5-2.4.0-test9-pre9
Compaq ap500 dual p-III Xeon


Thanks,
Shmulik Hen

Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.




pointer to dev-hard_start_xmit() gets trashed in 2.4.0-test9

2000-10-25 Thread Hen, Shmulik


Hello,

We are developing an advanced networking services loadable module and are
having problems porting it to work on 2.4.x kernels. The driver is supposed
to provide services such as fault tolerance, load balancing and link
aggregation over a team of network adapters. It works OK on 2.2.x kernels
but hangs on 2.4.x kernels.

In order to debug it, we stripped it down to a mere "intermediate" or
"filter" driver that binds to a base driver and passes everything through in
both directions (Rx, Tx, IOCTL, stats, etc.). After going through the basics
of modifying the driver to compile on 2.4.x kernels and fighting some nasty
deadlocks due to the new nature of the networking layer, we managed to get
it to run. The driver will receive and transmit a few hundred thousand
packets (while a periodic timer expires 10 times a second and IOCTLs run
continuously), and then it causes an oops about not being able to handle a
page fault.

The function looks something like:

int iansHardStartXmit(struct sk_buff *skb, struct net_device *dev) {
    int res;
    struct net_device *base;

    spin_lock(lock);   // no interrupts involved, so spin_lock should do
    base = get_base_driver_by_name(name);

    if (base != NULL) {
        BUG_TRAP(ptr_g == base->hard_start_xmit); // make sure it's always the same addr
        res = base->hard_start_xmit(skb, base);
    }

    spin_unlock(lock);
    return res;
}

We used kdb in order to track down the problem and found out the following
stack trace:

EBP         EIP         function(args)
0xc4cd1c54  0xd081e3e7  [e100]__kallsyms+0xb (0xc4b595a0, 0xc840f200)
                        e100 __kallsyms 0xd081e3dc 0xd081e3dc 0xd0820dsc
            0xd08244ba  [ians]iansHardStartXmit+0xa6 (0xc4b595a0, 0xc4d9bc00)
                        ians .text 0xd0824060 0xd0824414 0xd082452c
            0xc01f9d1f  qdisc_restart+0xcf (0xc4d9bc00)
                        kernel .text 0xc010 0xc01f9c50 0xc01f9f14
*
*
*

This goes on and shows that this is an ICMP echo reply packet going down
through the IP stack to the filter driver (apparently 0xc4b595a0 is the skb,
0xc4d9bc00 is the *dev of the filter driver and 0xc840f200 is the *dev of
the base driver). The filter driver is supposed to call the
dev->hard_start_xmit of the base driver, but strangely it lands somewhere in
the data segment of the base driver (__kallsyms is a part of the symbol
table of the module according to insmod -m).
Figuring the dev->hard_start_xmit pointer got trashed somehow, we added a
check to make sure the same pointer is always called, and the pointer is
indeed always the same. Looking at the assembly code with kdb, we could see
that the call to the base driver is done by a 'call *%eax' instruction.

How is it possible that the pointer to the function keeps its value, but
the jump to that function falls somewhere else?

We are using:
RedHat 6.2
gcc v2.91.66
modutils v2.3.11-1 (was upgraded because of kdb)
kernel linux-2.4.0-test8 (SMP, +kdb, compiled with frame pointers,
SPINLOCK_DEBUG=2)
kdb v1.4-2.4.0-test9-pre9
Compaq ap500 dual p-III Xeon

Could this be a version mismatch between the components above ?

Thanks,
Shmulik Hen

Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.





RE: spinlock help

2001-03-07 Thread Hen, Shmulik


How about if the same sequence occurred, but from two different drivers ?

We've had some bad experience with this stuff. Our driver, which acts as an
intermediate net driver, would call the hard_start_xmit in the base driver.
The base driver, wanting to block receive interrupts, would issue a
'spin_lock_irqsave(a,b)' and process the packet. If the TX queue is full, it
could call an indication entry point in our intermediate driver to signal it
to stop sending more packets. Since our indication function handles many
types of indications but can process them only one at a time, we wanted to
block other indications while queuing the request.

The whole sequence would look like this:

[our driver]
ans_send() {
    .
    .
    e100_hard_start_xmit(dev, skb);
    .
    .
}

[e100.o]
e100_hard_start_xmit() {
    .
    .
    spin_lock_irqsave(a,b);
    .
    .
    if(tx_queue_full)
        ans_notify(TX_QUEUE_FULL);      <--
    .
    .
    spin_unlock_irqrestore(a,b);
}

[our driver]
ans_notify() {
    .
    .
    spin_lock_irqsave(c,d);
    queue_request(req_type);
    spin_unlock_irqrestore(c,d);        <--
    .
    .
}

At that point, for some reason, interrupts were back on and e100.o would
hang in an infinite loop (we verified on kernel 2.4.0-test10 +kdb that
the processor was enabling interrupts and that the e100_isr was called to
process an Rx interrupt).

How is it possible that a 'spin_unlock_irqrestore(c,d)' would also restore
what should have been restored only by a 'spin_unlock_irqrestore(a,b)'?


Thanks in advance,
Shmulik Hen  
  Software Engineer
Linux Advanced Networking Services
Intel Network Communications Group
Jerusalem, Israel.

-Original Message-
From: Nigel Gamble [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2001 1:54 AM
To: Manoj Sontakke
Cc: [EMAIL PROTECTED]
Subject: Re: spinlock help


On Tue, 6 Mar 2001, Manoj Sontakke wrote:
> 1. when spin_lock_irqsave() function is called the subsequent code is
> executed until spin_unlock_irqrestore() is called. is this right?

Yes.  The protected code will not be interrupted, or simultaneously
executed by another CPU.

> 2. is this sequence valid?
>    spin_lock_irqsave(a,b);
>    spin_lock_irqsave(c,d);

Yes, as long as it is followed by:

spin_unlock_irqrestore(c, d);
spin_unlock_irqrestore(a, b);

Nigel Gamble    [EMAIL PROTECTED]
Mountain View, CA, USA. http://www.nrg.org/


RE: spinlock help

2001-03-07 Thread Hen, Shmulik


spin_lock_bh() won't block interrupts and we need them blocked to prevent
more indications.
spin_lock_irq() could do the trick, but its counterpart spin_unlock_irq()
enables all interrupts by calling sti(), and this is even worse for us.


Shmulik.

-Original Message-
From: Manoj Sontakke [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2001 12:27 PM
To: Hen, Shmulik
Cc: '[EMAIL PROTECTED]'; Manoj Sontakke; [EMAIL PROTECTED]
Subject: Re: spinlock help


hi

spin_lock_irq() and spin_lock_bh()

can they be of any use to you?

"Hen, Shmulik" wrote:
 
 How about if the same sequence occurred, but from two different drivers ?
 

-- 
Regards,
Manoj Sontakke


RE: spinlock help

2001-03-07 Thread Hen, Shmulik


e100 implements all sorts of hooks for our intermediate driver (kind of a
co-development effort), so eepro100 is out of the question for us.


Shmulik.

-Original Message-
From: Ofer Fryman [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2001 12:31 PM
To: 'Hen, Shmulik'
Cc: [EMAIL PROTECTED]
Subject: RE: spinlock help


Did you try looking at Becker's eepro100 driver? It seems to be simple, with
no unnecessary spin_lock_irqsave.


RE: spinlock help

2001-03-07 Thread Hen, Shmulik


The kdb trace was accurate; we could actually see the e100 ISR pop up out of
nowhere right in the middle of our ans_notify every time the TX queue would
fill up. When we commented out the calls to spin_*_irqsave(), it worked fine
under heavy stress for days.

Is it possible that something was wrong with 2.4.0-test10 specifically?

We had to drop the locks in the final release and never got around to
checking it on other kernel releases (it went on the TO_DO list ;-).

Shmulik.

-Original Message-
From: Andrew Morton [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2001 1:43 PM
To: Hen, Shmulik
Subject: Re: spinlock help


"Hen, Shmulik" wrote:
 
 At that point, for some reason, interrupts were back

Sorry, that can't happen.

Really, you must have made a mistake somewhere.


RE: spinlock help

2001-03-08 Thread Hen, Shmulik


OK guys, you were right. The bug was in our code - sorry for the trouble.
It turns out that while I was away, the problem was solved by someone else.
The problem is probably related to the fact that when we did
'spin_lock_irqsave(c,d)', 'd' was a global variable. The fix was to wrap the
call in another function and declare 'd' as a local there. I can't quite
explain it, but I think that changing from a static to an automatic variable
made the difference. My best guess is that since 'd' is passed by value and
not by reference, the macro expansion of spin_lock_irqsave() relies on the
location of 'd' on the stack, and if 'd' was on the heap instead, it might
get trashed.

I would really like to hear your expert opinion on my assumption.


Thanks,
Shmulik.

-Original Message-
From: Andrew Morton [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2001 2:54 PM
To: Hen, Shmulik
Cc: 'LKML'
Subject: Re: spinlock help


"Hen, Shmulik" wrote:
 
> The kdb trace was accurate, we could actually see the e100 ISR pop from no
> where right in the middle of our ans_notify every time the TX queue would
> fill up. When we commented out the call to spin_*_irqsave(), it worked fine
> under heavy stress for days.
>
> Is it possible it was something wrong with 2.4.0-test10 specifically ?
 

Sorry, no.  If spin_lock_irqsave()/spin_unlock_irqrestore()
were accidentally reenabling interrupts then it would be
the biggest, ugliest catastrophe since someone put out a kernel
which forgot to flush dirty inodes to disk :)

Conceivably it was a compiler bug.  Were you using egcs-1.1.2/x86?


catch 22 - porting net driver from 2.2 to 2.4

2000-11-09 Thread Hen, Shmulik


Hello,

This is a bit long and I apologize (since there are kdb captures in it).

We are developing an advanced networking services driver (loadable module)
and are having problems porting it to work on the 2.4.x kernel.
The driver is supposed to provide services such as fault tolerance, load
balancing, link aggregation and VLAN. It does that by creating a group of
"virtual" adapters that are bound on top of a team of network adapters. This
works great on 2.2 kernels but showed a few problems on 2.4.0-test9.

The only problem we have left now has to do with insmod/rmmod. For good
reasons, we can't just call init_etherdev() like base drivers do, so we
created our own version that handles memory and name allocations and calls
register_netdevice() on its own, the same as init_etherdev() does. Since we've
got several "virtual" adapters that are part of a topology being built
progressively, we can't perform their registrations during module_init() but
rather through an IOCTL, and here is our problem:

If we call register_netdevice(), we get the following message:

RTNL: assertion failed at devinet.c(775):inetdev_event

We figured this is because we neglected rtnl_lock(), so instead we tried using
register_netdev() to handle this for us, but then we get:

Scheduling in interrupt
kernel BUG at sched.c:696!
Entering kdb (current=0xc51a8000, pid 1075) on processor 1 Panic:
invalid operand
due to panic @ 0xc011aa71

eax = 0x001b ebx = 0x0020 ecx = 0xc030d80c edx = 0x
esi = 0xc030b5b4 edi = 0x esp = 0xc51a9d34 eip = 0xc011aa71
ebp = 0xc51a9d8c  ss = 0x0018  cs = 0x0010 eflags = 0x00010246
 ds = 0x0018  es = 0x0018 origeax = 0x regs = 0xc51a9d00

[1]kdb> bt
EBP        EIP        Function(args)
0xc51a9d8c 0xc011aa71 schedule+0x935 (0xc2c9e000, 0xc4482520, 0xc51a9e10)
                      kernel .text 0xc010 0xc011a13c 0xc011aa80
0xc51a9db8 0xc0107b8d __down+0xf5
                      kernel .text 0xc010 0xc0107a98 0xc0107c68
0xc51a9dcc 0xc0107f43 __down_failed+0xb (0xc51a9de4, 0xd082de5c, 0xc2c9e000, 0xc60a8320, 0xc51a9dfc)
                      kernel .text 0xc010 0xc0107f38 0xc0107f4c
           0xc023a7a9 stext_lock+0x4919
                      kernel .text.lock 0xc0235e90 0xc0235e90 0xc023bd80
0xc51a9dd4 0xc01f2e81 rtnl_lock+0x11 (0xc2c9e000)
                      kernel .text 0xc010 0xc01f2e70 0xc01f2e88
0xc51a9de4 0xd082de5c [ians]iansInitEtherdev+0x20 (0xc4482520)
                      ians .text 0xd082d060 0xd082de3c 0xd082de78
.
. (boring chain of calls)
.
0xc51a9ec4 0xd082dbcd [ians]doControlIoctl+0x15d (0xc2c9e200, 0xc51a9f20, 0x89f0)
                      ians .text 0xd082d060 0xd082da70 0xd082dc40
0xc51a9ee4 0xc01ef09f dev_ifsioc+0x33f (0xc51a9f20, 0x89f0, 0xc51a9f20)
                      kernel .text 0xc010 0xc01eed60 0xc01ef0b0
0xc51a9f40 0xc01ef29d dev_ioctl+0x1ed (0x89f0, 0xba58)
                      kernel .text 0xc010 0xc01ef0b0 0xc01ef300
0xc51a9f64 0xc021a70c inet_ioctl+0x18c (0xc339d13c, 0x89f0, 0xba58)
                      kernel .text 0xc010 0xc021a580 0xc021a720
0xc51a9f84 0xc01e8f06 sock_ioctl+0x9e (0xc339d040, 0xc38e0900, 0x89f0, 0xba58)
                      kernel .text 0xc010 0xc01e8e68 0xc01e8f6c
0xc51a9fbc 0xc014f5fd sys_ioctl+0x26d (0x3, 0x89f0, 0xba58, 0x4000ae60, 0xbba4)
                      kernel .text 0xc010 0xc014f390 0xc014f6a0
           0xc010965f system_call+0x33
                      kernel .text 0xc010 0xc010962c 0xc0109664

We figured that since we are in user context (do_ioctl) and use
spin_lock_bh() to protect us from other concurrent threads, it might
interfere with rtnl_lock(), so we removed our lock just before calling
register_netdev() and locked again upon return, but then the whole process
just stopped and didn't return to the prompt. From within kdb, we could see
that all CPUs are running in idle, but if we try to return to the prompt the
whole system hangs. Sometimes it hangs if we try to run ifconfig -a to see
if the virtual adapters appear.

I can't use it without locks, I can't use it with locks and I can't complete
the operation if I remove my own locks - catch 22.


The other problem has to do with rmmod. Here we get called in our
cleanup_module function, and from it we try to call unregister_netdev() for
each registered virtual adapter.
In this case, we get:

Scheduling in interrupt
kernel BUG at sched.c:696!

Entering kdb (current=0xc38c8000, pid 1602) on processor 0 Panic:

Multiple warnings when compiling network driver in 2.4.0-test9

2000-10-29 Thread Hen, Shmulik


Hello,

While trying to compile a network driver for 2.4.0-test9 (+kdb-v1.5,
configured for UP) I'm getting multiple warnings:
/usr/src/linux/include/linux/sched.h:700: warning: can't inline call
to `__mmdrop'
/usr/src/linux/include/linux/sched.h:704: warning: called from here

This happens for every kernel header I try to use, such as netdevice.h,
skbuff.h, malloc.h and pci.h.
Each of those header files includes slab.h (line 14), which includes mm.h
(line 4), which includes sched.h, which contains the following on line 700:

700: extern inline void FASTCALL(__mmdrop(struct mm_struct *));
701: static inline void mmdrop(struct mm_struct * mm)
702: {
703:     if (atomic_dec_and_test(&mm->mm_count))
704:         __mmdrop(mm);
705: }

My makefile uses the following flags:
gcc -fomit-frame-pointer -Wall  -Wstrict-prototypes -Winline -O3
-D__KERNEL__ -DMODULE  -DDEBUG  -DMODVERSIONS -I/usr/src/linux/include

Can anyone tell me what this warning means and if I can safely ignore it (or
expect disaster) ?


Thanks,

Shmulik Hen,
Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.




Locking Between User Context and Soft IRQs in 2.4.0

2000-10-30 Thread Hen, Shmulik


Hello,

We are trying to port a network driver from 2.2.x to 2.4.x and have some
questions regarding locks.
According to the kernel locking HOWTO, we have to take extra care when
locking between user context threads and BH/tasklet/softIRQ,
so we learned (the hard way ;-) that when running the ioctl system call from
an application we should use spin_lock/unlock_bh() and not
spin_lock/unlock() inside dev->do_ioctl().

*   What about the other entry points implemented in net_device?
*   We've got dev->get_stats, dev->set_mac_address,
dev->set_multicast_list and others that are all called from running
'ifconfig', which is an application. Are they considered user context too?
*   What about dev->open and dev->stop?
*   We figured that dev->hard_start_xmit() and timer callbacks are not
considered user context, but how can I find out if they are being run as
SoftIRQs, as tasklets or as Bottom Halves? (Their different definitions
require different types of protection.)

Our driver is actually an intermediate driver bound on top of a regular net
driver. It behaves both as a network adapter driver and a protocol at the
same time. I can safely assume that it will have to handle both transmits
and receives simultaneously (no hardware interrupts are involved). We've
decided that for the first stage we are going to implement "wide" locks that
wrap entire operations from top to bottom. For example, our
dev->hard_start_xmit() will have a spin_lock() at the beginning and a
spin_unlock() at the end of the function.
*   Will it be safe to keep the lock until after the call to the base
driver's hard_start_xmit, or do I have to release the lock just before that?
*   Or, in our receive function, will I have to release the lock before
or after the call to netif_rx()?
*   What about other calls to the kernel? Can the running thread be
switched out of context when calling kernel entries and not be switched back
in when they finish? Should I beware of deadlocks in such a case?


Thanks in advance,
Shmulik Hen,
Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.





RE: catch 22 - porting net driver from 2.2 to 2.4

2000-11-12 Thread Hen, Shmulik


and you don't get the "RTNL: assertion failed at
devinet.c(775):inetdev_event" in 2.4.x ?

the thing is I need to prevent Tx/Rx when a topology change is initiated
from the ioctl (registering a virtual adapter is just one example), so they
all share a single lock and I must use spin_lock_bh from the ioctl.

Shmulik.

-Original Message-
From: Olaf Titz [mailto:[EMAIL PROTECTED]]
Sent: Friday, November 10, 2000 2:09 AM
To: [EMAIL PROTECTED]
Subject: Re: catch 22 - porting net driver from 2.2 to 2.4


> We figured that since we are in user context (do_ioctl) and use
> spin_lock_bh() to protect us from other concurrent threads, it might
> interfere with rtnl_lock() so we remove our lock just before calling
> register_netdev() and lock again upon return but then the whole process
> just stopped and didn't return to the prompt. from within kdb, we could
> see that

Can't you just do this:

#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,3,0) /* not sure about the 0 */
#define rtnl_LOCK() rtnl_lock()
#define rtnl_UNLOCK()   rtnl_unlock()
#else
#define rtnl_LOCK() /* nop */
#define rtnl_UNLOCK()   /* nop */
#endif

rtnl_LOCK();
register_netdevice(...);
rtnl_UNLOCK();

that works for me (yes, from do_ioctl, but without the bh lock - I
don't know if that's absolutely needed in your case).
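
A side note on the version test: LINUX_VERSION_CODE and KERNEL_VERSION() are
plain integers with major, minor and patch level packed into one value, so the
check is ordinary integer comparison. Below is a small user-space sketch of
that packing. DEMO_KERNEL_VERSION and needs_rtnl_lock are names invented for
this illustration, not kernel identifiers, and the 2.3.0 threshold is the one
Olaf himself flags as uncertain.

```c
/* Stand-in for KERNEL_VERSION(a,b,c): major, minor and patch level
 * packed into a single integer (assumed to mirror <linux/version.h>). */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Returns 1 if a kernel with this version code would take the branch
 * meant for 2.3.0 and later (the rtnl_lock() one), 0 otherwise. */
int needs_rtnl_lock(int version_code)
{
    return version_code >= DEMO_KERNEL_VERSION(2, 3, 0);
}
```

With this packing a 2.4.0 kernel compares greater than 2.3.0 and a 2.2.17
kernel compares smaller, which is all the #if relies on.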

Olaf

RE: catch 22 - porting net driver from 2.2 to 2.4

2000-11-12 Thread Hen, Shmulik


So how come I get the "RTNL: assertion failed at
devinet.c(775):inetdev_event" when I call register_netdevice without
rtnl_lock/unlock ?
could it be a 2.4.0-test9 thing ? (haven't used test10 or 11 yet).

and what about rmmod causing the panic when I use unregister_netdev or never
completing the operation when I use unregister_netdevice ?
does module_exit run inside rtnl_lock too ?


Shmulik.

-Original Message-
From: Jeff Garzik [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 09, 2000 7:37 PM
To: Hen, Shmulik
Cc: 'LNML'; 'LKML'; [EMAIL PROTECTED]
Subject: Re: catch 22 - porting net driver from 2.2 to 2.4


do_ioctl is inside rtnl_lock...

Remember if you need to alter the rules, you can always queue work in
the current context, and have a kernel thread handle the work.  The nice
thing about a kernel thread is that you start with a [almost] clean
state, when it comes to locks.

Jeff


-- 
Jeff Garzik |
Building 1024   | Would you like a Twinkie?
MandrakeSoft|

RE: catch 22 - porting net driver from 2.2 to 2.4

2000-11-13 Thread Hen, Shmulik



"Jeff Garzik" wrote:
> Theoretically, if you call unregister_netdev from rmmod, it should grab
> rtnl_lock and then complete the operation for you.  If that doesn't work
> for you, it sounds like you are not setting up, or cleaning up,
> something correctly.
>
> Basically... it sounds like there are still bugs in your driver that
> need working out :)

I followed the value of dev->refcnt and there is something strange. Before
the call to register_netdev it is set to 0 and after that it is increased to
1, but before the call to unregister_netdev it is somehow 2.

How can I tell who is modifying it and when ?

I tried using kdb to find out but it keeps hanging the machine. I wanted to
place a breakpoint that will pop if a certain memory address is accessed so
I did the following:

static void *p; (global - at the top of my source file)
EXPORT_SYMBOL (p);

in my probe function:
p = (void*) &dev->refcnt;

 insmod my_module
 ksyms

this showed that p is at address 0xd081ee90

then from within kdb:
kdb> md 0xd081ee90
0xd081ee90 c470bedc

kdb> bpha 0xc470bedc
Forced Global Breakpoint at...

kdb> go

system hung


Is there a way to place a breakpoint on a memory address access ?


Q: using kdb to trap memory modifications

2000-11-14 Thread Hen, Shmulik


hello,

I'm trying to see when and how a certain variable is being modified and I
wonder how to get kdb to do that for me.

When I do 'insmod my_module' with my network driver, I notice that
dev->refcnt is 0 at first and gets increased to 1 after calling
register_netdev. When I want to do 'rmmod my_module', somehow dev->refcnt is
at 2 before the call to unregister_netdev and so the kernel (2.4.0-test9)
won't let the module unload. Since I don't increase it explicitly, I want
to know who does it for me and when, to see if I'm doing anything wrong.

So, I modified my module to contain in the global section:
static void* ptr;
EXPORT_SYMBOL(ptr);

and in my module_init function I added:

ptr = (void*) &dev->refcnt;

after running 'insmod my_module' I used ksyms and found that ptr is at
address 0xd081eeb4.

after entering kdb, I did:

kdb> md 0xd081eeb4

0xd081eeb4  c470bedc
0xd081eec4
.
.

kdb> bph 0xc470bedc
Forced Breakpoint at #0 ...

kdb> be 0
kdb> go

When I try to return to normal shell, the system is totally hung and won't
even receive inputs from the remote serial terminal.
I tried bp, bpa and bpha, but the result is always the same.


Thanks,
Shmulik Hen
Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.




RE: catch 22 - porting net driver from 2.2 to 2.4

2000-11-20 Thread Hen, Shmulik


I tried using the kernel thread as demonstrated in your example and again it
failed (panic - scheduling in interrupt).
The difference is that your code executes the thread from within dev->open,
while my code tries to do that from dev->do_ioctl that has spinlocks around
the entire operation (which apparently sleeps).
If I comment out the spin_lock/unlock it will succeed, but then I can't be
sure I don't get any concurrent Tx/Rx/timer which is a bad idea while the
topology is still being created.

is there any way to do something like firing threads/timers atomically ?


Thanks,
Shmulik.

-Original Message-
From: Jeff Garzik [mailto:[EMAIL PROTECTED]]
Sent: Monday, November 13, 2000 4:26 PM
To: Hen, Shmulik
Subject: Re: catch 22 - porting net driver from 2.2 to 2.4


"Hen, Shmulik" wrote:
 
> Where can I find info about that ?
> My first idea was to fire a timer and let the callback routine do the
> work, but I worry about synchronization and about passing the list of
> items for it to handle.
> What is the accepted way of starting a kernel thread and how do I handle
> parameters and sync. ?

Attached is an example.  My "8139too" ethernet driver uses a kernel
thread instead of a timer to perform media checking.  It illustrates how
to start and stop a kernel thread.

-- 
Jeff Garzik |
Building 1024   | The chief enemy of creativity is "good" sense
MandrakeSoft|  -- Picasso


RE: change_mtu boundary checking error

2001-04-18 Thread Hen, Shmulik

But Ethernet is not only for IP, what about other protocols ?

-Original Message-
From: Alan Cox [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 17, 2001 3:41 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: change_mtu boundary checking error

> Now, the high boundary seemed reasonable (ETH_FRAME_LEN - ETH_HLEN =
> ETH_DATA_LEN) which gives 1500, but why is the low boundary set to 68 ?
> According to my calculations, it should have been ETH_ZLEN - ETH_HLEN
> which gives 46.

The IPv4 minimum MTU is 68 bytes. Below that not all frames can be delivered
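
The two candidate lower bounds come from different layers, which a few lines
of arithmetic make visible. This is only a sketch: the Ethernet constants are
the ones quoted from if_ether.h in this thread, the IPv4 figures (60-byte
maximum header, 8-byte minimum fragment) are taken from RFC 791, and the
IPV4_* macros and function names are invented for the illustration.

```c
#include <stdbool.h>

/* Ethernet constants, as quoted from include/linux/if_ether.h. */
#define ETH_HLEN      14    /* total octets in header */
#define ETH_ZLEN      60    /* min. octets in frame sans FCS */
#define ETH_DATA_LEN  1500  /* max. octets in payload */

/* IPv4 figures from RFC 791; these macro names are illustrative only. */
#define IPV4_MAX_HEADER    60  /* largest possible IPv4 header */
#define IPV4_MIN_FRAGMENT   8  /* smallest fragment payload */

/* Smallest Ethernet payload that needs no padding: 60 - 14 = 46. */
int eth_min_payload(void)
{
    return ETH_ZLEN - ETH_HLEN;
}

/* Smallest MTU an IPv4 link has to support: 60 + 8 = 68. */
int ipv4_min_mtu(void)
{
    return IPV4_MAX_HEADER + IPV4_MIN_FRAGMENT;
}

/* Same range as the check quoted from eth_change_mtu(). */
bool eth_mtu_in_range(int new_mtu)
{
    return new_mtu >= ipv4_min_mtu() && new_mtu <= ETH_DATA_LEN;
}
```

So 46 is a framing limit of Ethernet itself, while 68 is a requirement IPv4
places on any link it runs over; the default check uses the IPv4 figure.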


RE: ioctl call for network device

2001-05-08 Thread Hen, Shmulik


> struct ifreq has a member called ifr_data. It is a pointer. You can
> put a pointer to any of your data, including the most complex structure
> you might envision, in that area. This allows you to pass anything
> to and from your module. This pointer can be properly dereferenced
> in kernel space but you should use copy_to/from_user and friends so a
> user-space coding bug won't panic the kernel.

How about a linked list ?
Will the driver be able to follow the list where each node was dynamically
allocated by the application ?
Is there a size limit on the buffer ifr_data points to ? (AFAIK, Windows
NDIS drivers limit to 1 page buffer =4096 bytes).
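
On the linked-list question: the next pointers stored in each node are
user-space addresses, so a driver following them would need a separate
copy_from_user() per node. One way to sidestep that, sketched below, is to
flatten the list into a single pointer-free buffer in the application and
hand that buffer over through ifr_data. Everything here (struct node,
struct flat_req, FLAT_MAX, flatten_list) is hypothetical and not part of any
kernel interface.

```c
#include <stddef.h>

/* Hypothetical list node as the application might allocate it. */
struct node {
    int          value;
    struct node *next;
};

/* Hypothetical flat request: a count followed by the values, with no
 * embedded pointers, so one copy of the buffer carries everything. */
#define FLAT_MAX 64
struct flat_req {
    size_t count;
    int    values[FLAT_MAX];
};

/* Copy the list into the flat request.
 * Returns 0 on success, -1 if the list has more than FLAT_MAX nodes. */
int flatten_list(const struct node *head, struct flat_req *out)
{
    size_t n = 0;

    for (; head != NULL; head = head->next) {
        if (n == FLAT_MAX)
            return -1;
        out->values[n++] = head->value;
    }
    out->count = n;
    return 0;
}
```

The application would then point ifr_data at the struct flat_req, and the
driver could fetch it with a single copy_from_user() of
sizeof(struct flat_req) and validate count before using it.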


Thanks,

Shmulik Hen  
Linux Advanced Networking Services
Intel Network Communications Group


[OT] ethtool MII helpers (actually two OT's)

2001-06-24 Thread Hen, Shmulik


MII
---
Is there any support in the MII standard for 1000Mbps (GbE Fiber/Copper) ?
Perhaps an extension to the standard ?
I could see that some of the Gigabit adapters supported by the kernel
provide the MII IOCTLs
interface, but couldn't figure out how to extract the correct speed
information from the registers
I can read. I know it's a bit of a hassle and I have to get the local
capabilities and match them against the partner's capabilities and find the
highest common speed etc. etc. but I'm sure that if the driver can do it I
can reproduce it in userland too.
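
For 10/100 links, the matching step described above can be sketched in a few
lines of user-space C: AND the local advertisement (MII register 4) with the
link partner ability (MII register 5) and take the highest mode both sides
share. The bit values are assumed to match the ADVERTISE_*/LPA_* constants in
linux/mii.h; the DEMO_* names, struct link_mode and resolve_link are invented
for this example. Gigabit abilities are exchanged through further registers
that this sketch does not cover.

```c
#include <stdint.h>

/* Ability bits shared by MII register 4 (advertisement) and register 5
 * (link partner ability). DEMO_ names are local to this sketch. */
#define DEMO_10HALF   0x0020
#define DEMO_10FULL   0x0040
#define DEMO_100HALF  0x0080
#define DEMO_100FULL  0x0100

struct link_mode {
    int speed;        /* Mbps, 0 if there is no common mode */
    int full_duplex;  /* 1 = full duplex, 0 = half duplex */
};

/* Pick the best mode both ends support, highest priority first. */
struct link_mode resolve_link(uint16_t advertise, uint16_t partner)
{
    uint16_t common = advertise & partner;
    struct link_mode m = { 0, 0 };

    if (common & DEMO_100FULL) {
        m.speed = 100;
        m.full_duplex = 1;
    } else if (common & DEMO_100HALF) {
        m.speed = 100;
        m.full_duplex = 0;
    } else if (common & DEMO_10FULL) {
        m.speed = 10;
        m.full_duplex = 1;
    } else if (common & DEMO_10HALF) {
        m.speed = 10;
        m.full_duplex = 0;
    }
    return m;
}
```

The two register values would come from the MII ioctls mentioned above; how
a particular Gigabit driver maps its registers onto them is not something
this sketch can answer.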

EthTool
---
Is there a way that I can extract the link status information out of the
ethtool struct ?
I could see that at least one Gigabit adapter driver (bcm5700.c), provides
the EthTool interface
and reports the correct speed and duplex mode but not the link status.
Is there a place that defines how a driver is supposed to implement the
support for EthTool ?
I figured that since there is no separate field for link status (at least in
version 1.2), a driver is supposed to report speed=0 or something like that
when the link is down. I know this driver detects link status changes for
sure because it prints messages every time, but the speed and duplex are
always reported the same.


Thanks,
Shmulik Hen  
  Software Engineer
Linux Advanced Networking Services
Intel Network Communications Group
Jerusalem, Israel

-Original Message-
From: Jeff Garzik [mailto:[EMAIL PROTECTED]]
Sent: Friday, June 22, 2001 8:59 AM
To: Chris Wedgwood
Cc: Linux Kernel Mailing List; [EMAIL PROTECTED]; David S. Miller
Subject: Re: PATCH: ethtool MII helpers


Chris Wedgwood wrote:
 
> On Fri, Jun 22, 2001 at 01:24:36AM -0400, Jeff Garzik wrote:
>
> > Sure, and that's planned.  Wanna send me a patch for it?  :)
>
> Possibly, but I wonder if this is a kernel-space problem or not. Why
> not put all the smarts into userland for it?

I meant, send me a patch for userland ethtool, to do exactly what you
described.


> > It will definitely fall back on the MII ioctls if ethtool media
> > support for the desired command doesn't exist.
>
> Well, that is more or less as much as needs to be done. That, and
> some kind of super-set API to be defined for all new stuff, having
> two slightly different APIs for the same things sucks.

Both APIs do different things but have a common subset, yes.

The MII ioctls only do their thing for MII-like hardware.  ethtool can
be applied to any hardware.  Old ISA drivers that don't do MII, or do it
in a really nonstandard way.  For example I have ethtool code locally
which allows ne2k-pci to do media selection via ioctl, for two popular
ne2k cards, something it's never been able to do before.  Emulating media
selection support for things like 10base2-10baseT-AUI just isn't
possible with the MII ioctls.

MII is a standard and incredibly popular, thus mii-tool works with most
popular PCI NICs, for the most popular media types.  But it's still
basically a hardware interface.  I am not convinced it's a good idea to
make the [G]MII ioctls the Linux software media interface for all
network hardware.

I see ethtool as the interface for tuning your NIC, that works across
all hardware.
I see mii-diag as the way to do advanced MII-specific hardware stuff,
like next page or HA monitoring or whatever.

Jeff


-- 
Jeff Garzik  | Andre the Giant has a posse.
Building 1024|
MandrakeSoft |

kernel memory allocations alignment

2001-02-04 Thread Hen, Shmulik


Hello,

When using kmalloc(size_t size), do I get a guarantee that the memory region
allocated is aligned according to the size specified ?
More to the point, if I call kmalloc for type int on an IA64 architecture, is
the pointer going to be 8-byte aligned ?


Shmulik Hen
Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.



RE: kernel memory allocations alignment

2001-02-04 Thread Hen, Shmulik


Actually yes. We were warned that on IA64 architecture the system will halt
when accessing any type of variable via a pointer if the pointer does not
contain an aligned address matching that type. Until now we were using a
method of receiving a pointer to an array, casting it to a pointer to a
struct (packed with #pragma pack(1)), and retrieving fields directly from
it with pointers.
It seems we cannot do that any more and were wondering what the
alternatives are.
One way we could think of is to forget the packing and rearrange the fields
in the struct in descending order of size so they all come out aligned, but
we didn't know for sure if the first one will be aligned too.

Will that work ?
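
Another alignment-safe option, besides reordering the struct, is to stop
casting the byte array altogether and copy each field out with memcpy() or
assemble it byte by byte. A minimal sketch in plain C follows. The wire
layout (a 1-byte type followed by a 32-bit big-endian value at an odd offset)
and all the names are made up for the illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical wire layout: byte 0 = type, bytes 1..4 = 32-bit value in
 * network (big-endian) order. The value sits at an odd offset, so reading
 * it through a uint32_t pointer would be a misaligned access. */
struct msg {
    uint8_t  type;
    uint32_t value;   /* naturally aligned inside this unpacked struct */
};

/* Assemble a big-endian 32-bit value one byte at a time. */
uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Copy a 32-bit value in host byte order from any address. */
uint32_t get_unaligned32(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Decode the raw buffer into an ordinary, compiler-aligned struct. */
struct msg parse_msg(const uint8_t *buf)
{
    struct msg m;

    m.type  = buf[0];
    m.value = get_be32(buf + 1);
    return m;
}
```

Only byte accesses touch the raw buffer, so the result does not depend on
where the buffer starts. In kernel code the get_unaligned()/put_unaligned()
helpers serve a similar purpose.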


Thanks,
Shmulik Hen  
  Software Engineer
Linux Advanced Networking Services
Intel Network Communications Group
Jerusalem, Israel

-Original Message-
From: Manfred [mailto:[EMAIL PROTECTED]]
Sent: Sunday, February 04, 2001 5:56 PM
To: Hen, Shmulik
Cc: 'LKML'
Subject: Re: kernel memory allocations alignment


"Hen, Shmulik" wrote:
 
> When using kmalloc(size_t size), do I get a guarantee that the memory
> region allocated is aligned according to the size specified ?
> More to the point, if I call kmalloc for type int on an IA64 architecture,
> is the pointer going to be 8-byte aligned ?


Yes, kmalloc results are always 'sizeof(void*)' aligned.

Do you have stricter alignment requirements?

--
Manfred


RE: Q: How do I get from the latest stable kernel version to the latest prepatch version ?

2001-03-26 Thread Hen, Shmulik


Thanks.
It just struck me as odd that the latest is 2.4.2 while the prepatches were
2.4.3 so I figured there must be something I missed in between (my logic
told me that a 2.4.3 patch would be against a 2.4.3 something ;-).

BTW, I haven't seen any announcements from Linus in this mailing list
regarding new versions, just the updates on the web site and Alan's release
notes saying he's merging with 2.4.3xx. Are those announcements being posted
somewhere else now ?

-Original Message-
From: Leonid Mamtchenkov [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 26, 2001 2:33 PM
To: Hen, Shmulik
Cc: 'LKML'
Subject: Re: Q: How do I get from the latest stable kernel version to
the latest prepatch version ?


Hello Hen, Shmulik,

Once you wrote about "Q: How do I get from the latest stable kernel version
to the latest prepatch version ?":
HS According to http://www.kernel.org, the latest stable kernel version is
HS 2.4.2. The latest prepatch version is 2.4.3-pre3.
HS 
HS In order to get a full 2.4.3-pre8 kernel do I have to:
HS 
HS A. download linux-2.4.2.tar.gz and all the patch-2.4.3-preX.gz and apply
HS them in succession or,
HS B. download linux-2.4.3.tar.gz (exists ?) and then apply all the patches
HS or,
HS C. download linux-2.4.3-pre7.tar.gz (exists ?) and apply only
HS patch-2.4.3-pre8.gz ?

Download 2.4.2 and then apply 2.4.3-preX (latest) on it... that's it.
You might want to visit http://kernelnewbies.org .  They have some good docs
there.

-- 
 Best regards,
 Leonid Mamtchenkov
 System Administrator



RE: URGENT : System hands on Freeing unused kernel memory:

2001-03-27 Thread Hen, Shmulik


Does it hang forever ?

I've noticed that my kernel (2.4.2) stalls for several minutes with the same
message but suddenly after that the login prompt appears (anything between,
like configurations and services starting messages, are gone). We've been
able to track it down to a change we did to /etc/lilo.conf to add support
for kernel prints to go out to a serial debugger. Before that everything was
OK, but after we added append="console=tty0 console=ttyS1,38400", this
problem started. We did notice however that everything that doesn't appear
on the console does appear on the serial debugger.


Shmulik.

-Original Message-
From: Thomas Foerster [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, March 27, 2001 10:40 AM
To: [EMAIL PROTECTED]
Subject: Re: URGENT : System hands on "Freeing unused kernel memory: "



> On 03.27 Thomas Foerster wrote:
>
> > But suddenly the box was offline. One technical assistant from our ISP
> > tried to reboot our server (he couldn't tell me if there had been any
> > messages on the screen), but the system always hangs on
> >
> > Freeing unused kernel memory: xxk freed
>
> Try booting with init=/bin/bash, it looks like kernel gets a bad
> /sbin/init, and gets stuck. Perhaps the shutdown damaged init, it starts
> to run and get hung.

That didn't fix the problem :(

When i run "diff" on a new and the "old" init, i get no diffs ...

Must be something other :(

Thomas


RE: Plans for 2.5

2001-03-29 Thread Hen, Shmulik


Just some general questions:

1) Is there anywhere a list that describes what is intended to be in 2.5.x ?
2) Are there any early releases of 2.5.x ?
3) Are the things for 2.5.x being discussed on another mailing list ?
4) What is the time frame of releasing 2.5.x-final (or 2.6.x) ?

Specifically, I'm more interested in the network driver aspect.
1) Are there any intended changes to the networking layer ?
2) I overheard something about making the driver reentrant - any news ?
3) What about support for IPv6 ? (I noticed it was marked as experimental
until now)


Thanks in advance,
Shmulik Hen  
  Software Engineer
Linux Advanced Networking Services
Intel Network Communications Group
Jerusalem, Israel


-Original Message-
From: Bruno Avila [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 29, 2001 12:45 AM
To: [EMAIL PROTECTED]
Subject: Plans for 2.5


Hello people,

I got some questions. When are we going to develop stuff for 2.5?
What is planned? My opinion is that linux 2.5 should be about performance.
Since linux already is stable and well done by nature, we could think more
about performance as a differentiator over others. What do you people think?

  Bruno Avila

PS: Not a good english. I know! :)


change_mtu boundary checking error

2001-04-17 Thread Hen, Shmulik


Hello,

Going through the change_mtu() code in the kernel, I came across the default
function supplied when calling ether_setup().
I could see that eth_change_mtu() (drivers/net/net_init.c) does the
following:

if( (new_mtu < 68) || (new_mtu > 1500) )
        return -EINVAL;

Looking in include/linux/if_ether.h I found the following constants:
#define ETH_ALEN        6       /* Octets in one ethernet addr */
#define ETH_HLEN        14      /* Total octets in header. */
#define ETH_ZLEN        60      /* Min. octets in frame sans FCS */
#define ETH_DATA_LEN    1500    /* Max. octets in payload */
#define ETH_FRAME_LEN   1514    /* Max. octets in frame sans FCS */


Now, the high boundary seemed reasonable (ETH_FRAME_LEN - ETH_HLEN =
ETH_DATA_LEN) which gives 1500, but why is the low boundary set to 68 ?
According to my calculations, it should have been ETH_ZLEN - ETH_HLEN which
gives 46.

Doesn't MTU mean only the payload size ?
Where did the 68 come from ?


Thanks,
Shmulik Hen
Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)



RE: spinlock help

2001-03-07 Thread Hen, Shmulik

How about if the same sequence occurred, but from two different drivers ?

We've had some bad experience with this stuff. Our driver, which acts as an
intermediate net driver, would call the hard_start_xmit in the base driver.
The base driver, wanting to block receive interrupts would issue a
'spin_lock_irqsave(a,b)' and process the packet. If the TX queue is full, it
could call an indication entry point in our intermediate driver to signal it
to stop sending more packets. Since our indication function handles many
types of indications but can process them only one at a time, we wanted to
block other indications while queuing the request.

The whole sequence would look like this:

[our driver]
ans_send() {
    .
    .
    e100_hard_start_xmit(dev, skb);
    .
    .
}

[e100.o]
e100_hard_start_xmit() {
    .
    .
    spin_lock_irqsave(a,b);
    .
    .
    if(tx_queue_full)
        ans_notify(TX_QUEUE_FULL);      <--
    .
    .
    spin_unlock_irqrestore(a,b);
}

[our driver]
ans_notify() {
    .
    .
    spin_lock_irqsave(c,d);
    queue_request(req_type);
    spin_unlock_irqrestore(c,d);        <--
    .
    .
}

At that point, for some reason, interrupts were back and the e100.o would
hang in an infinite loop (we verified it on kernel 2.4.0-test10 +kdb that
the processor was enabling interrupts and that the e100_isr was called for
processing an Rx int.).

How is it possible that a 'spin_unlock_irqrestore(c,d)' would also restore
what should have been restored only with a 'spin_unlock_irqrestore(a,b)' ?
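
As a way of reasoning about the question, here is a toy user-space model of
the save/restore semantics on a single CPU. It is not kernel code and it
makes no claim about what actually happened in e100; all names are invented.
It shows that with a separate flags word per section the inner restore leaves
interrupts off, and that the symptom described above can appear in general
when the inner restore is given a flags word captured while interrupts were
still enabled (for instance a shared or stale variable).

```c
#include <stdbool.h>

/* Toy single-CPU model: one global "interrupts enabled" bit. */
bool irq_enabled = true;

/* Model of the irqsave half: remember the current state, then disable. */
void model_irq_save(unsigned long *flags)
{
    *flags = irq_enabled ? 1UL : 0UL;
    irq_enabled = false;
}

/* Model of the irqrestore half: put back whatever state was saved. */
void model_irq_restore(unsigned long flags)
{
    irq_enabled = (flags != 0);
}

/* Nested sections, each with its own flags word. Returns the interrupt
 * state observed right after the inner restore. */
bool nested_with_own_flags(void)
{
    unsigned long b, d;
    bool after_inner;

    irq_enabled = true;
    model_irq_save(&b);        /* outer: saves "enabled", disables */
    model_irq_save(&d);        /* inner: saves "disabled" */
    model_irq_restore(d);      /* inner restore: still disabled */
    after_inner = irq_enabled;
    model_irq_restore(b);      /* outer restore: enabled again */
    return after_inner;
}

/* Same nesting, but the inner restore uses a flags word that was
 * captured earlier, while interrupts were enabled. */
bool nested_with_stale_flags(void)
{
    unsigned long b, d, stale;
    bool after_inner;

    irq_enabled = true;
    model_irq_save(&stale);    /* captured while enabled */
    model_irq_restore(stale);

    model_irq_save(&b);        /* outer section begins */
    model_irq_save(&d);        /* correct inner value, saved in d */
    (void)d;                   /* ...but never used below */
    model_irq_restore(stale);  /* wrong word: interrupts come back on */
    after_inner = irq_enabled;
    model_irq_restore(b);
    return after_inner;
}
```

In the first function interrupts stay disabled until the outer restore; in
the second they are back on inside the outer section, which is the behaviour
reported above.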

Thanks in advance,
Shmulik Hen  
  Software Engineer
Linux Advanced Networking Services
Intel Network Communications Group
Jerusalem, Israel.

-Original Message-
From: Nigel Gamble [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2001 1:54 AM
To: Manoj Sontakke
Cc: [EMAIL PROTECTED]
Subject: Re: spinlock help

On Tue, 6 Mar 2001, Manoj Sontakke wrote:
> 1. when spin_lock_irqsave() function is called the subsequent code is
> executed untill spin_unloc_irqrestore()is called. is this right?

Yes.  The protected code will not be interrupted, or simultaneously
executed by another CPU.

> 2. is this sequence valid?
>   spin_lock_irqsave(a,b);
>   spin_lock_irqsave(c,d);

Yes, as long as it is followed by:

spin_unlock_irqrestore(c, d);
spin_unlock_irqrestore(a, b);

Nigel Gamble    [EMAIL PROTECTED]
Mountain View, CA, USA. http://www.nrg.org/


RE: spinlock help

2001-03-07 Thread Hen, Shmulik


spin_lock_bh() won't block interrupts and we need them blocked to prevent
more indications.
spin_lock_irq() could do the trick but its counterpart spin_unlock_irq()
enables all interrupts by calling sti(), and this is even worse for us.


Shmulik.

-Original Message-
From: Manoj Sontakke [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2001 12:27 PM
To: Hen, Shmulik
Cc: '[EMAIL PROTECTED]'; Manoj Sontakke; [EMAIL PROTECTED]
Subject: Re: spinlock help


hi

spin_lock_irq() and spin_lock_bh()

can they be of any use to u?

-- 
Regards,
Manoj Sontakke


RE: spinlock help

2001-03-07 Thread Hen, Shmulik

e100 implements all sorts of hooks for our intermediate driver (kind of a
co-development effort), so eepro100 is out of the question for us.

Shmulik.

-Original Message-
From: Ofer Fryman [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2001 12:31 PM
To: 'Hen, Shmulik'
Cc: [EMAIL PROTECTED]
Subject: RE: spinlock help

Did you try looking at Becker's eepro100 driver? It seems to be simple, with
no unnecessary spin_lock_irqsave.

-Original Message-
From: Hen, Shmulik [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2001 11:21 AM
To: '[EMAIL PROTECTED]'; Manoj Sontakke
Cc: [EMAIL PROTECTED]
Subject: RE: spinlock help

How about if the same sequence occurred, but from two different drivers ?

We've had some bad experience with this stuff. Our driver, which acts as an
intermediate net driver, would call the hard_start_xmit in the base driver.
The base driver, wanting to block receive interrupts would issue a
'spin_lock_irqsave(a,b)' and process the packet. If the TX queue is full, it
could call an indication entry point in our intermediate driver to signal it
to stop sending more packets. Since our indication function handles many
types of indications but can process them only one at a time, we wanted to
block other indications while queuing the request.

The whole sequence would look like this:

[our driver]
ans_send() {
    .
    .
    e100_hard_start_xmit(dev, skb);
    .
    .
}

[e100.o]
e100_hard_start_xmit() {
    .
    .
    spin_lock_irqsave(a,b);
    .
    .
    if(tx_queue_full)
        ans_notify(TX_QUEUE_FULL);  <--
    .
    .
    spin_unlock_irqrestore(a,b);
}

[our driver]
ans_notify() {
    .
    .
    spin_lock_irqsave(c,d);
    queue_request(req_type);
    spin_unlock_irqrestore(c,d);    <--
    .
    .
}

At that point, for some reason, interrupts were enabled again and e100.o would
hang in an infinite loop (using kernel 2.4.0-test10 + kdb, we verified that
the processor had interrupts enabled and that e100_isr was called to
process an Rx interrupt).

How is it possible that a 'spin_unlock_irqrestore(c,d)' would also restore
what should have been restored only by 'spin_unlock_irqrestore(a,b)' ?

Thanks in advance,
Shmulik Hen  
  Software Engineer
Linux Advanced Networking Services
Intel Network Communications Group
Jerusalem, Israel.

-Original Message-
From: Nigel Gamble [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2001 1:54 AM
To: Manoj Sontakke
Cc: [EMAIL PROTECTED]
Subject: Re: spinlock help

On Tue, 6 Mar 2001, Manoj Sontakke wrote:
> 1. when spin_lock_irqsave() function is called the subsequent code is
> executed until spin_unlock_irqrestore() is called. is this right?

Yes.  The protected code will not be interrupted, or simultaneously
executed by another CPU.

> 2. is this sequence valid?
>   spin_lock_irqsave(a,b);
>   spin_lock_irqsave(c,d);

Yes, as long as it is followed by:

spin_unlock_irqrestore(c, d);
spin_unlock_irqrestore(a, b);

Nigel Gamble[EMAIL PROTECTED]
Mountain View, CA, USA. http://www.nrg.org/


RE: spinlock help

2001-03-07 Thread Hen, Shmulik


The kdb trace was accurate; we could actually see the e100 ISR pop up out of
nowhere right in the middle of our ans_notify every time the TX queue would
fill up. When we commented out the calls to spin_*_irqsave(), it worked fine
under heavy stress for days.

Is it possible that something was wrong with 2.4.0-test10 specifically ?

We had to drop the locks in the final release and never got around to
checking it on other kernel releases (it went on the TO_DO list ;-).

Shmulik.

-Original Message-
From: Andrew Morton [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2001 1:43 PM
To: Hen, Shmulik
Subject: Re: spinlock help


"Hen, Shmulik" wrote:
> 
> How about if the same sequence occurred, but from two different drivers ?
> 
> We've had some bad experience with this stuff. Our driver, which acts as an
> intermediate net driver, would call the hard_start_xmit in the base driver.
> The base driver, wanting to block receive interrupts would issue a
> 'spin_lock_irqsave(a,b)' and process the packet. If the TX queue is full, it
> could call an indication entry point in our intermediate driver to signal it
> to stop sending more packets. Since our indication function handles many
> types of indications but can process them only one at a time, we wanted to
> block other indications while queuing the request.
> 
> The whole sequence would look like that:
> 
> [our driver]
> ans_send() {
> .
> .
> e100_hard_start_xmit(dev, skb);
> .
> .
> }
> 
> [e100.o]
> e100_hard_start_xmit() {
> .
> .
> spin_lock_irqsave(a,b);
> .
> .
> if(tx_queue_full)
> ans_notify(TX_QUEUE_FULL);  <--
> .
> .
> spin_unlock_irqrestore(a,b);
> }
> 
> [our driver]
> ans_notify() {
> .
> .
> spin_lock_irqsave(c,d);
> queue_request(req_type);
> spin_unlock_irqrestore(c,d);<--
> .
> .
> }
> 
> At that point, for some reason, interrupts were back

Sorry, that can't happen.

Really, you must have made a mistake somewhere.


RE: spinlock help

2001-03-08 Thread Hen, Shmulik

OK guys, you were right. The bug was in our code - sorry for the trouble.
It turns out that while I was away, the problem was solved by someone else. The
problem is probably related to the fact that when we did
'spin_lock_irqsave(c,d)', 'd' was a global variable. The fix was to wrap the
call in another function and declare 'd' as a local variable. I can't quite
explain it, but I think that changing from a static to an automatic variable
made the difference. My best guess is that since 'd' is passed by value and not
by reference, the macro expansion of spin_lock_irqsave() relies on the location
of 'd' on the stack, and if 'd' was on the heap instead, it might get
trashed.

I would really like to hear your expert opinion on my assumption.

Thanks,
Shmulik.

-Original Message-
From: Andrew Morton [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2001 2:54 PM
To: Hen, Shmulik
Cc: 'LKML'
Subject: Re: spinlock help

"Hen, Shmulik" wrote:
> 
> The kdb trace was accurate, we could actually see the e100 ISR pop from no
> where right in the middle of our ans_notify every time the TX queue would
> fill up. When we commented out the call to spin_*_irqsave(), it worked fine
> under heavy stress for days.
> 
> Is it possible it was something wrong with 2.4.0-test10 specifically ?
> 

Sorry, no.  If spin_lock_irqsave()/spin_unlock_irqrestore()
were accidentally reenabling interrupts then it would be
the biggest, ugliest catastrophe since someone put out a kernel
which forgot to flush dirty inodes to disk :)

Conceivably it was a compiler bug.  Were you using egcs-1.1.2/x86?


Q: network drivers interface changes

2000-09-27 Thread Hen, Shmulik


Hello,

Is there a good source of information that describes the changes in network
driver interface between 2.2.x and 2.4.x kernels ?


Thanks,
Shmulik Hen Software Engineer
Linux  Advanced Networking Services
Intel Network Communications Group
Jerusalem



[OT] ethtool MII helpers (actually two OT's)

2001-06-24 Thread Hen, Shmulik


MII
---
Is there any support in the MII standard for 1000Mbps (GbE Fiber/Copper) ?
Perhaps an extension to the standard ?
I could see that some of the Gigabit adapters supported by the kernel provide
the MII ioctl interface, but I couldn't figure out how to extract the correct
speed information from the registers I can read. I know it's a bit of a hassle:
I have to get the local capabilities, match them against the partner's
capabilities, find the highest common speed and so on, but I'm sure that if the
driver can do it, I can reproduce it in userland too.

EthTool
---
Is there a way that I can extract the link status information out of the
ethtool struct ?
I could see that at least one Gigabit adapter driver (bcm5700.c) provides the
EthTool interface and reports the correct speed and duplex mode, but not the
link status.
Is there a place that defines how a driver is supposed to implement
support for EthTool ?
I figured that since there is no separate field for link status (at least in
version 1.2), a driver is supposed to report speed=0 or something like that
when the link is down. I know this driver detects link status changes for
sure, because it prints a message every time, but the speed and duplex are
always reported the same.


Thanks,
Shmulik Hen  
  Software Engineer
Linux Advanced Networking Services
Intel Network Communications Group
Jerusalem, Israel

-Original Message-
From: Jeff Garzik [mailto:[EMAIL PROTECTED]]
Sent: Friday, June 22, 2001 8:59 AM
To: Chris Wedgwood
Cc: Linux Kernel Mailing List; [EMAIL PROTECTED]; David S. Miller
Subject: Re: PATCH: ethtool MII helpers


Chris Wedgwood wrote:
> 
> On Fri, Jun 22, 2001 at 01:24:36AM -0400, Jeff Garzik wrote:
> 
> Sure, and that's planned.  Wanna send me a patch for it?  :)
> 
> Possibly, but I wonder if this is a kernel-space problem or not. Why
> not put all the smarts into userland for it?

I meant, send me a patch for userland ethtool, to do exactly what you
described.


> It will definitely fall back on the MII ioctls if ethtool media
> support for the desired command doesn't exist.
> 
> Well, that is more or less as much as needs to be done. That, and
> some kind of super-set API to be defined for all new stuff, having
> two slightly different APIs for the same things sucks.

Both APIs do different things but have a common subset, yes.

The MII ioctls only do their thing for MII-like hardware.  ethtool can
be applied to any hardware, including old ISA drivers that don't do MII, or do
it in a really nonstandard way.  For example I have ethtool code locally
which allows ne2k-pci to do media selection via ioctl, for two popular
ne2k cards, something it's never been able to do before.  Emulating media
selection support for things like 10base2<->10baseT<->AUI just isn't
possible with the MII ioctls.

MII is a standard and incredibly popular, thus mii-tool works with most
popular PCI NICs, for the most popular media types.  But it's still
basically a hardware interface.  I am not convinced it's a good idea to
make the [G]MII ioctls the Linux software media interface for all
network hardware.

I see ethtool as the interface for tuning your NIC, that works across
all hardware.
I see mii-diag as the way to do advanced MII-specific hardware stuff,
like next page or HA monitoring or whatever.

Jeff


-- 
Jeff Garzik  | Andre the Giant has a posse.
Building 1024|
MandrakeSoft |

catch 22 - porting net driver from 2.2 to 2.4

2000-11-09 Thread Hen, Shmulik


Hello,

This is a bit long and I apologize (since there are kdb captures in it).

We are developing an advanced networking services driver (loadable module)
and are having problems porting it to work on 2.4.x kernel.
The driver is supposed to provide services such as fault tolerance, load
balancing, link aggregation and VLAN. It does that by creating a group of
"virtual" adapters that are bound on top of a team of network adapters. This
works great on 2.2 kernels but demonstrated a few problems on 2.4.0-test9.

The only problem we have left now has to do with insmod/rmmod. For good
reasons, we can't just call init_etherdev() like base drivers do, so we
created our own version that handles memory and name allocations and calls
register_netdevice() on its own, the same as init_etherdev(). Since we've
got several "virtual" adapters that are part of a topology being built
progressively, we can't perform their registrations during module_init() but
rather through an IOCTL, and here is our problem:

If we call register_netdevice(), we get the following message:

RTNL: assertion failed at devinet.c(775):inetdev_event

We figured this is because we neglected rtnl_lock(), so instead we tried using
register_netdev() to handle this for us, but then we get:

Scheduling in interrupt
kernel BUG at sched.c:696!
Entering kdb (current=0xc51a8000, pid 1075) on processor 1 Panic: invalid operand
due to panic @ 0xc011aa71

eax = 0x001b ebx = 0x0020 ecx = 0xc030d80c edx = 0x
esi = 0xc030b5b4 edi = 0x esp = 0xc51a9d34 eip = 0xc011aa71
ebp = 0xc51a9d8c  ss = 0x0018  cs = 0x0010 eflags = 0x00010246
 ds = 0x0018  es = 0x0018 origeax = 0x  = 0xc51a9d00

[1]kdb> bt
EBP        EIP        Function(args)
0xc51a9d8c 0xc011aa71 schedule+0x935 (0xc2c9e000, 0xc4482520, 0xc51a9e10)
                      kernel .text 0xc010 0xc011a13c 0xc011aa80
0xc51a9db8 0xc0107b8d __down+0xf5
                      kernel .text 0xc010 0xc0107a98 0xc0107c68
0xc51a9dcc 0xc0107f43 __down_failed+0xb (0xc51a9de4, 0xd082de5c, 0xc2c9e000, 0xc60a8320, 0xc51a9dfc)
                      kernel .text 0xc010 0xc0107f38 0xc0107f4c
           0xc023a7a9 stext_lock+0x4919
                      kernel .text.lock 0xc0235e90 0xc0235e90 0xc023bd80
0xc51a9dd4 0xc01f2e81 rtnl_lock+0x11 (0xc2c9e000)
                      kernel .text 0xc010 0xc01f2e70 0xc01f2e88
0xc51a9de4 0xd082de5c [ians]iansInitEtherdev+0x20 (0xc4482520)
                      ians .text 0xd082d060 0xd082de3c 0xd082de78
.
. (boring chain of calls)
.
0xc51a9ec4 0xd082dbcd [ians]doControlIoctl+0x15d (0xc2c9e200, 0xc51a9f20, 0x89f0)
                      ians .text 0xd082d060 0xd082da70 0xd082dc40
0xc51a9ee4 0xc01ef09f dev_ifsioc+0x33f (0xc51a9f20, 0x89f0, 0xc51a9f20)
                      kernel .text 0xc010 0xc01eed60 0xc01ef0b0
0xc51a9f40 0xc01ef29d dev_ioctl+0x1ed (0x89f0, 0xba58)
                      kernel .text 0xc010 0xc01ef0b0 0xc01ef300
0xc51a9f64 0xc021a70c inet_ioctl+0x18c (0xc339d13c, 0x89f0, 0xba58)
                      kernel .text 0xc010 0xc021a580 0xc021a720
0xc51a9f84 0xc01e8f06 sock_ioctl+0x9e (0xc339d040, 0xc38e0900, 0x89f0, 0xba58)
                      kernel .text 0xc010 0xc01e8e68 0xc01e8f6c
0xc51a9fbc 0xc014f5fd sys_ioctl+0x26d (0x3, 0x89f0, 0xba58, 0x4000ae60, 0xbba4)
                      kernel .text 0xc010 0xc014f390 0xc014f6a0
           0xc010965f system_call+0x33
                      kernel .text 0xc010 0xc010962c 0xc0109664

We figured that since we are in user context (do_ioctl) and use
spin_lock_bh() to protect us from other concurrent threads, it might
interfere with rtnl_lock(), so we removed our lock just before calling
register_netdev() and locked again upon return, but then the whole process just
stopped and didn't return to the prompt. From within kdb, we could see that
all CPUs were idle, but if we tried to return to the prompt the
whole system hung. Sometimes it also hangs if we try to run ifconfig -a to see
if the virtual adapters appear.

I can't use it without locks, I can't use it with locks and I can't complete
the operation if I remove my own locks - catch 22.


The other problem has to do with rmmod - here we get called in our
cleanup_module function and from it we try to call unregister_netdev() for
each registered virtual adapter.
In this case, we get:

Scheduling in interrupt
kernel BUG at sched.c:696!

Entering kdb (current=0xc38c8000, pid 1602) on processor 0 Panic:

Please ignore - RE: kallsyms - where can I find it and what does it do ?

2000-10-18 Thread Hen, Shmulik

-Original Message-
From: Hen, Shmulik [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, October 18, 2000 6:17 PM
To: [EMAIL PROTECTED]
Cc: 'Keith Owens'
Subject: Q: kallsyms - where can I find it and what does it do ?

Hello,

I'm trying to build a new kernel with kdb support and I keep getting an
error from the make file:

kallsyms pass 1
[make] /bin/sh: /sbin/kallsyms: No such file or directory
error

What is kallsyms and where can I get it from ?

Here is what I have and what I did:
The machine is a Compaq Ap500 dual P-III Xeon.
I'm using RedHat 6.2 (clean disk - custom install - install everything).
The new kernel is linux-2.4.0-test9.tar.gz (the latest according to
www.kernel.org).
The kdb patch is kdb-v1.5-2.4.0-test9-pre9.gz (the latest according to
oss.sgi.com).

There doesn't seem to be a problem running the patch (no error messages, at
least).
I made sure kdb support is checked under 'make menuconfig'.
I ran 'make dep; make clean; make bzImage' and it keeps failing.

While on the subject, how is it possible that I can build an SMP kernel on
that machine (before the patch), but can't build a UP kernel ?.
I use 'make menuconfig' and uncheck "SMP support", then run 'make dep; make
clean; make bzImage' and I get all kinds of warnings and errors until the
make file simply stops running (doesn't return to the prompt with an error
message - just stops and I have to hit CR to get back to the prompt).

Thanks in advance,

Shmulik Hen

  Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.


page fault problems porting a network driver to 2.4.x

2000-10-24 Thread Hen, Shmulik


Hello,

We are developing an advanced networking services loadable module and are
having problems porting it to work on 2.4.x kernels. The driver is supposed
to provide services such as fault tolerance, load balancing and link
aggregation over a team of network adapters. It works OK on 2.2.x kernels
but hangs on 2.4.x kernels.

In order to debug it, we stripped it down to become a mere "intermediate" or
"filter" driver that binds to a base driver and passes everything through in
both directions (Rx, Tx, IOCTL, stats, etc.). After going through the basics
of modifying the driver to compile on 2.4.x kernels and fighting some nasty
deadlocks due to the new nature of the networking layer, we managed to get
it to run. The driver will receive and transmit a few hundred thousand
packets (while having a periodic timer expire 10 times a second and
running continuous IOCTLs), and then it causes an oops about not being able
to handle a page fault.

The function looks something like:

int iansHardStartXmit(struct sk_buff *skb, struct net_device *dev) {
    int res;
    struct net_device *base;

    spin_lock();
    base = get_base_driver_by_name(name);

    if(base != NULL) {
        res = base->hard_start_xmit(skb, base);
    }

    spin_unlock();
    return res;
}

We used kdb in order to track down the problem and found the following
stack trace:

EBP         EIP         function(args)
0xc4cd1c54  0xd081e3e7  [e100]__kallsyms+0xb (0xc4b595a0, 0xc840f200)
                        e100 __kallsyms 0xd081e3dc 0xd081e3dc 0xd0820dsc
            0xd08244ba  [ians]iansHardStartXmit+0xa6 (0xc4b595a0, 0xc4d9bc00)
                        ians .text 0xd0824060 0xd0824414 0xd082452c
            0xc01f9d1f  qdisc_restart+0xcf (0xc4d9bc00)
                        kernel .text 0xc010 0xc01f9c50 0xc01f9f14
*
*
*

This goes on and shows that this is an ICMP echo reply packet going down
through the IP stack to the filter driver (apparently 0xc4b595a0 is the skb,
0xc4d9bc00 is the *dev of the filter driver and 0xc840f200 is the *dev of
the base driver). The filter driver is supposed to call the
dev->hard_start_xmit of the base driver, but strangely it lands somewhere in
the data segment of the base driver (__kallsyms is a part of the symbol
table of the module according to insmod -m).
Figuring the dev->hard_start_xmit pointer got trashed somehow, we added a
check to make sure the same pointer is always called, and indeed this was
the case. Looking at the assembly code with kdb, we could see that the call
to the base driver is done by a 'call *%eax' instruction. kdb reports that
eax=0x after the page fault (origeax).

How is it possible that the pointer to the function keeps its value, but
the jump to that function falls somewhere else ?
The entire function is protected by a spinlock, so there is no worry about
other threads messing with my data.

We are using:
RedHat 6.2
gcc v2.91.66
modutils v2.3.11-1
kernel linux-2.4.0-test9
kdb v1.5-2.4.0-test9-pre9
Compaq ap500 dual p-III Xeon


Thanks,
Shmulik Hen

Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.




pointer to dev->hard_start_xmit() gets trashed in 2.4.0-test9

2000-10-25 Thread Hen, Shmulik


Hello,

We are developing an advanced networking services loadable module and are
having problems porting it to work on 2.4.x kernels. The driver is supposed
to provide services such as fault tolerance, load balancing and link
aggregation over a team of network adapters. It works OK on 2.2.x kernels
but hangs on 2.4.x kernels.

In order to debug it, we stripped it down to become a mere "intermediate" or
"filter" driver that binds to a base driver and passes everything through in
both directions (Rx, Tx, IOCTL, stats, etc.). After going through the basics
of modifying the driver to compile on 2.4.x kernels and fighting some nasty
deadlocks due to the new nature of the networking layer, we managed to get
it to run. The driver will receive and transmit a few hundred thousand
packets (while having a periodic timer expire 10 times a second and
running continuous IOCTLs), and then it causes an oops about not being able
to handle a page fault.

The function looks something like:

int iansHardStartXmit(struct sk_buff *skb, struct net_device *dev) {
    int res;
    struct net_device *base;

    spin_lock();   //no interrupts involved, so spin_lock should do
    base = get_base_driver_by_name(name);

    if(base != NULL) {
        //make sure it's always the same addr
        BUG_TRAP(ptr_g == base->hard_start_xmit);
        res = base->hard_start_xmit(skb, base);
    }

    spin_unlock();
    return res;
}

We used kdb in order to track down the problem and found the following
stack trace:

EBP         EIP         function(args)
0xc4cd1c54  0xd081e3e7  [e100]__kallsyms+0xb (0xc4b595a0, 0xc840f200)
                        e100 __kallsyms 0xd081e3dc 0xd081e3dc 0xd0820dsc
            0xd08244ba  [ians]iansHardStartXmit+0xa6 (0xc4b595a0, 0xc4d9bc00)
                        ians .text 0xd0824060 0xd0824414 0xd082452c
            0xc01f9d1f  qdisc_restart+0xcf (0xc4d9bc00)
                        kernel .text 0xc010 0xc01f9c50 0xc01f9f14
*
*
*

This goes on and shows that this is an ICMP echo reply packet going down
through the IP stack to the filter driver (apparently 0xc4b595a0 is the skb,
0xc4d9bc00 is the *dev of the filter driver and 0xc840f200 is the *dev of
the base driver). The filter driver is supposed to call the
dev->hard_start_xmit of the base driver, but strangely it lands somewhere in
the data segment of the base driver (__kallsyms is a part of the symbol
table of the module according to insmod -m).
Figuring the dev->hard_start_xmit pointer got trashed somehow, we added a
check to make sure the same pointer is always called, and indeed this is the
case. Looking at the assembly code with kdb, we could see that the call to
the base driver is done by a 'call *%eax' instruction.

How is it possible that the pointer to the function keeps its value, but
the jump to that function falls somewhere else ?

We are using:
RedHat 6.2
gcc v2.91.66
modutils v2.3.11-1 (was upgraded because of kdb)
kernel linux-2.4.0-test8 (SMP, +kdb, compiled with frame pointers,
SPINLOCK_DEBUG=2)
kdb v1.4-2.4.0-test9-pre9
Compaq ap500 dual p-III Xeon

Could this be a version mismatch between the components above ?

Thanks,
Shmulik Hen

Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.





Multiple warnings when compiling network driver in 2.4.0-test9

2000-10-29 Thread Hen, Shmulik


Hello,

While trying to compile a network driver for 2.4.0-test9 (+kdb-v1.5,
configured for UP) I'm getting multiple warnings:
/usr/src/linux/include/linux/sched.h:700: warning: can't inline call to `__mmdrop'
/usr/src/linux/include/linux/sched.h:704: warning: called from here

This happens for every kernel header I try to use, such as netdevice.h,
skbuff.h, malloc.h and pci.h.
Each of those header files includes slab.h (line 14) that includes mm.h
(line 4) that includes sched.h which contains the following on line 700:

700:extern inline void FASTCALL(__mmdrop(struct mm_struct *));
701:static inline void mmdrop(struct mm_struct * mm)
702:{
703:    if (atomic_dec_and_test(&mm->mm_count))
704:        __mmdrop(mm);
705:}

My make file uses the following flags:
gcc -fomit-frame-pointer -Wall  -Wstrict-prototypes -Winline -O3
-D__KERNEL__ -DMODULE  -DDEBUG  -DMODVERSIONS -I/usr/src/linux/include

Can anyone tell me what this warning means and if I can safely ignore it (or
expect disaster) ?


Thanks,

Shmulik Hen,
Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.




Locking Between User Context and Soft IRQs in 2.4.0

2000-10-30 Thread Hen, Shmulik


Hello,

We are trying to port a network driver from 2.2.x to 2.4.x and have some
questions regarding locks.
According to the kernel locking HOWTO, we have to take extra care when
locking between user context threads and BH/tasklet/softIRQ,
so we learned (the hard way ;-) that when running the ioctl system call from
an application we should use spin_lock/unlock_bh() and not
spin_lock/unlock() inside dev->do_ioctl().

*   What about the other entry points implemented in net_device ? 
*   We've got dev->get_stats, dev->set_mac_address,
dev->set_multicast_list and others that are all called from running
'ifconfig' which is an application. Are they considered user context too ?
*   What about dev->open and dev->stop ?
*   We figured that dev->hard_start_xmit() and timer callbacks are not
considered user context, but how can I find out if they are being run as
SoftIRQ or as tasklets or as Bottom Halves ? (their different definitions
require different types of protections)

Our driver is actually an intermediate driver bound on top of a regular net
driver. It behaves both as a network adapter driver and a protocol at the
same time. I can safely assume that it will have to handle both transmits
and receives simultaneously (no hardware interrupts are involved). We've
decided that for the first stage we are going to implement "wide" locks that
wrap entire operations from top to bottom. For example, our
dev->hard_start_xmit() will have a spin_lock() at the beginning and a
spin_unlock() at the end of the function.
*   Will it be safe to keep the lock until after the call to the base
driver's hard_start_xmit, or do I have to release the lock just before that
?
*   Or, in our receive function, will I have to release the lock before
or after the call to netif_rx() ?
*   What about other calls to the kernel ? can the running thread be
switched out of context when calling kernel entries and not be switched back
in when they finish ? should I beware of deadlocks in such case ?


Thanks in advance,
Shmulik Hen,
Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.





RE: catch 22 - porting net driver from 2.2 to 2.4

2000-11-12 Thread Hen, Shmulik


And you don't get the "RTNL: assertion failed at
devinet.c(775):inetdev_event" in 2.4.x ?

The thing is I need to prevent Tx/Rx when a topology change is initiated
from the ioctl (registering a virtual adapter is just one example), so they
all share a single lock and I must use spin_lock_bh from the ioctl.

Shmulik.

-Original Message-
From: Olaf Titz [mailto:[EMAIL PROTECTED]]
Sent: Friday, November 10, 2000 2:09 AM
To: [EMAIL PROTECTED]
Subject: Re: catch 22 - porting net driver from 2.2 to 2.4


> We figured that since we are in user context (do_ioctl) and use
> spin_lock_bh() to protect us from other concurrent threads, it might
> interfere with rtnl_lock() so we remove our lock just before calling
> register_netdev() and lock again upon return but then the whole process just
> stopped and didn't return to the prompt. from within kdb, we could see that

Can't you just do this:

#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,3,0) /* not sure about the 0 */
#define rtnl_LOCK() rtnl_lock()
#define rtnl_UNLOCK()   rtnl_unlock()
#else
#define rtnl_LOCK() /* nop */
#define rtnl_UNLOCK()   /* nop */
#endif

rtnl_LOCK();
register_netdevice(...);
rtnl_UNLOCK();

that works for me (yes, from do_ioctl, but without the bh lock - I
don't know if that's absolutely needed in your case).

Olaf

RE: catch 22 - porting net driver from 2.2 to 2.4

2000-11-12 Thread Hen, Shmulik


So how come I get the "RTNL: assertion failed at
devinet.c(775):inetdev_event" when I call register_netdevice without
rtnl_lock/unlock ?
Could it be a 2.4.0-test9 thing ? (I haven't used test10 or 11 yet.)

And what about rmmod causing the panic when I use unregister_netdev, or never
completing the operation when I use unregister_netdevice ?
Does module_exit run inside rtnl_lock too ?


Shmulik.

-Original Message-
From: Jeff Garzik [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 09, 2000 7:37 PM
To: Hen, Shmulik
Cc: 'LNML'; 'LKML'; [EMAIL PROTECTED]
Subject: Re: catch 22 - porting net driver from 2.2 to 2.4


do_ioctl is inside rtnl_lock...

Remember if you need to alter the rules, you can always queue work in
the current context, and have a kernel thread handle the work.  The nice
thing about a kernel thread is that you start with a [almost] clean
state, when it comes to locks.

Jeff


-- 
Jeff Garzik   |
Building 1024 | Would you like a Twinkie?
MandrakeSoft  |
-
To unsubscribe from this list: send the line "unsubscribe linux-net" in
the body of a message to [EMAIL PROTECTED]


RE: catch 22 - porting net driver from 2.2 to 2.4

2000-11-13 Thread Hen, Shmulik


Where can I find info about that ?
My first idea was to fire a timer and let the callback routine do the work,
but I worry about synchronization and about passing the list of items for it
to handle.
What is the accepted way of starting a kernel thread and how do I handle
parameters and sync. ?


Thanks,
Shmulik.

-Original Message-
From: Jeff Garzik [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 09, 2000 7:37 PM
To: Hen, Shmulik
Cc: 'LNML'; 'LKML'; [EMAIL PROTECTED]
Subject: Re: catch 22 - porting net driver from 2.2 to 2.4


do_ioctl is inside rtnl_lock...

Remember if you need to alter the rules, you can always queue work in
the current context, and have a kernel thread handle the work.  The nice
thing about a kernel thread is that you start with a [almost] clean
state, when it comes to locks.

Jeff


-- 
Jeff Garzik   |
Building 1024 | Would you like a Twinkie?
MandrakeSoft  |


RE: catch 22 - porting net driver from 2.2 to 2.4

2000-11-13 Thread Hen, Shmulik

"Jeff Garzik" wrote:
> Theoretically, if you call unregister_netdev from rmmod, it should grab
> rtnl_lock and then complete the operation for you.  If that doesn't work
> for you, it sounds like you are not setting up, or cleaning up,
> something correctly.
> 
> Basically... it sounds like there are still bugs in your driver that
> need working out :)

I followed the value of dev->refcnt and there is something strange. Before
the call to register_netdev it is set to 0 and after that it is increased to
1, but before the call to unregister_netdev it is somehow 2.

How can I tell who is modifying it and when ?

I tried using kdb to find out but it keeps hanging the machine. I wanted to
place a breakpoint that will pop if a certain memory address is accessed so
I did the following:

static void *p; (global - at the top of my source file)
EXPORT_SYMBOL (p);

in my probe function:
p = (void*) &dev->refcnt;

> insmod my_module
> ksyms

this showed that p is at address 0xd081ee90

then from within kdb:
kdb> md 0xd081ee90
0xd081ee90 c470bedc  

kdb> bpha 0xc470bedc
Forced Global Breakpoint at...

kdb> go

system hung

Is there a way to place a breakpoint on a memory address access ?


Q: using kdb to trap memory modifications

2000-11-14 Thread Hen, Shmulik


Hello,

I'm trying to see when and how a certain variable is being modified and I
wonder how to get kdb to do that for me.

When I do 'insmod my_module' with my network driver, I notice that
dev->refcnt is 0 at first and gets increased to 1 after calling
register_netdev. When I want to do 'rmmod my_module', dev->refcnt is somehow
at 2 before the call to unregister_netdev, and so the kernel (2.4.0-test9)
won't let the module unload. Since I don't increase it explicitly, I want
to know who does it for me and when, to see if I'm doing anything wrong.

So, I modified my module to contain in the global section:
static void* ptr;
EXPORT_SYMBOL(ptr);

and in my module_init function I added:

ptr = (void*) &dev->refcnt;

after running 'insmod my_module' I used ksyms and found that ptr is at
address 0xd081eeb4.

after entering kdb, I did:

kdb> md 0xd081eeb4

0xd081eeb4  c470bedc      
0xd081eec4        
.
.

kdb> bph 0xc470bedc
Forced Breakpoint at #0 ...

kdb> be 0
kdb> go

When I try to return to normal shell, the system is totally hung and won't
even receive inputs from the remote serial terminal.
I tried bp, bpa and bpha, but the result is always the same.


Thanks,
Shmulik Hen
Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.




RE: catch 22 - porting net driver from 2.2 to 2.4

2000-11-20 Thread Hen, Shmulik

I tried using the kernel thread as demonstrated in your example and again it
failed (panic - scheduling in interrupt).
The difference is that your code executes the thread from within dev->open,
while my code tries to do that from dev->do_ioctl that has spinlocks around
the entire operation (which apparently sleeps).
If I comment out the spin_lock/unlock it will succeed, but then I can't be
sure I don't get any concurrent Tx/Rx/timer which is a bad idea while the
topology is still being created.

Is there any way to do something like firing threads/timers atomically ?

Thanks,
Shmulik.

-Original Message-
From: Jeff Garzik [mailto:[EMAIL PROTECTED]]
Sent: Monday, November 13, 2000 4:26 PM
To: Hen, Shmulik
Subject: Re: catch 22 - porting net driver from 2.2 to 2.4

"Hen, Shmulik" wrote:
> 
> Where can I find info about that ?
> My first idea was to fire a timer and let the callback routine do the work,
> but I worry about synchronization and about passing the list of items for
> it to handle.
> What is the accepted way of starting a kernel thread and how do I handle
> parameters and sync. ?

Attached is an example.  My "8139too" ethernet driver uses a kernel
thread instead of a timer to perform media checking.  It illustrates how
to start and stop a kernel thread.
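
The attachment is not preserved in this archive. As a rough user-space analogy only (POSIX threads, not the kernel thread API that 8139too uses), the pattern of starting a worker, handing it parameters through a locked queue, and stopping it might be sketched as follows. Every name below is invented for illustration:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 16

/* Illustrative only; none of these names are kernel API. */
struct work_queue {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    int  items[QUEUE_CAP];  /* pending work, e.g. device indices */
    int  head, count;
    bool stop;              /* set by the owner to ask the worker to exit */
    int  handled;           /* number of items the worker has processed */
};

/* Caller side: a short critical section that never waits on slow work. */
static bool queue_work_item(struct work_queue *q, int item)
{
    bool ok = false;

    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAP) {
        q->items[(q->head + q->count) % QUEUE_CAP] = item;
        q->count++;
        ok = true;
        pthread_cond_signal(&q->wake);
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Worker side: take one item under the lock, then handle it unlocked. */
static void *worker_main(void *arg)
{
    struct work_queue *q = arg;

    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->count == 0 && !q->stop)
            pthread_cond_wait(&q->wake, &q->lock);
        if (q->count == 0 && q->stop)
            break;              /* queue drained and stop requested */
        int item = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        pthread_mutex_unlock(&q->lock);

        (void)item;  /* the slow, possibly sleeping, operation goes here */

        pthread_mutex_lock(&q->lock);
        q->handled++;
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Start the worker, queue n items, stop it, report how many it handled. */
static int run_deferred(int n)
{
    struct work_queue q = { .head = 0, .count = 0, .stop = false, .handled = 0 };
    pthread_t tid;

    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.wake, NULL);
    if (pthread_create(&tid, NULL, worker_main, &q) != 0)
        return -1;

    for (int i = 0; i < n; i++)
        queue_work_item(&q, i);

    pthread_mutex_lock(&q.lock);
    q.stop = true;
    pthread_cond_signal(&q.wake);
    pthread_mutex_unlock(&q.lock);
    pthread_join(tid, NULL);

    pthread_cond_destroy(&q.wake);
    pthread_mutex_destroy(&q.lock);
    return q.handled;
}
```

The point of the design is that the caller holds the lock only long enough to copy a parameter into the queue, while the worker performs the slow step with no lock held.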

-- 
Jeff Garzik   |
Building 1024 | The chief enemy of creativity is "good" sense
MandrakeSoft  |  -- Picasso


kernel memory allocations alignment

2001-02-04 Thread Hen, Shmulik


Hello,

When using kmalloc(size_t size), do I get a guarantee that the memory region
allocated is aligned according to the size specified ?
More to the point, if I call kmalloc for type int on an IA64 architecture, is
the pointer going to be 8-byte aligned ?


Shmulik Hen
Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.



RE: kernel memory allocations alignment

2001-02-04 Thread Hen, Shmulik

Actually yes. We were warned that on IA64 architecture the system will halt
when accessing any type of variable via a pointer if the pointer does not
contain an aligned address matching that type. Until now we were using a
method of receiving a pointer to an array, casting it to a pointer to a
struct (packed with #pragma pack(1)), and retrieving fields directly from
it with pointers.
It seems we cannot do that any more and we were wondering what the
alternatives are.
One way we could think of is to forget the packing and rearrange the fields in
the struct in descending order of size so they all come out aligned, but we
didn't know for sure if the first one will be aligned too.

Will that work ?
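
One commonly used alternative that does not depend on how the incoming buffer is aligned is to copy the bytes into a naturally aligned local struct with memcpy and read the fields from the copy. A minimal sketch, assuming a hypothetical header layout (struct wire_hdr and read_hdr are made-up names, and byte-order conversion is left out):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire header, fields ordered largest to smallest so that
 * natural alignment introduces no padding between them. */
struct wire_hdr {
    uint64_t session;
    uint32_t seq;
    uint16_t type;
    uint16_t flags;
};

/* Compile-time checks that the unpacked layout has no holes. */
_Static_assert(offsetof(struct wire_hdr, seq)   == 8,  "seq offset");
_Static_assert(offsetof(struct wire_hdr, type)  == 12, "type offset");
_Static_assert(offsetof(struct wire_hdr, flags) == 14, "flags offset");
_Static_assert(sizeof(struct wire_hdr) == 16, "no padding");

/* Copy the header out of a byte buffer that may start at any address.
 * memcpy makes no alignment assumption about src, and the local copy
 * is aligned by the compiler. */
static struct wire_hdr read_hdr(const unsigned char *src)
{
    struct wire_hdr h;

    memcpy(&h, src, sizeof h);
    return h;
}
```

Ordering the fields by size removes the need for packing, but it says nothing about where the array itself starts, which is why the sketch copies rather than casts.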

Thanks,
Shmulik Hen
Software Engineer
Linux Advanced Networking Services
Intel Network Communications Group
Jerusalem, Israel

-Original Message-
From: Manfred [mailto:[EMAIL PROTECTED]]
Sent: Sunday, February 04, 2001 5:56 PM
To: Hen, Shmulik
Cc: 'LKML'
Subject: Re: kernel memory allocations alignment

"Hen, Shmulik" wrote:
> 
> When using kmalloc(size_t size), do I get a guarantee that the memory
> region allocated is aligned according to the size specified ?
> More to the point, if I call kmalloc for type int on an IA64 architecture,
> is the pointer going to be 8-byte aligned ?
>

Yes, kmalloc results are always 'sizeof(void*)' aligned.

Do you have stricter alignment requirements?

--
Manfred


Q: How do I get from the latest stable kernel version to the latest prepatch version ?

2001-03-26 Thread Hen, Shmulik


Hi,

According to http://www.kernel.org, the latest stable kernel version is
2.4.2. The latest prepatch version is 2.4.3-pre3.

In order to get a full 2.4.3-pre8 kernel do I have to:

A. download linux-2.4.2.tar.gz and all the patch-2.4.3-preX.gz and apply
them in succession or,
B. download linux-2.4.3.tar.gz (exists ?) and then apply all the patches or,
C. download linux-2.4.3-pre7.tar.gz (exists ?) and apply only
patch-2.4.3-pre8.gz ?


Thanks,

Shmulik Hen
Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)
Intel Corporation Ltd.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

RE: Q: How do I get from the latest stable kernel version to the latest prepatch version ?

2001-03-26 Thread Hen, Shmulik


Thanks.
It just struck me as odd that the latest is 2.4.2 while the prepatches were
2.4.3 so I figured there must be something I missed in between (my logic
told me that a 2.4.3 patch would be against a 2.4.3 something ;-).

BTW, I haven't seen any announcements from Linus in this mailing list
regarding new versions, just the updates on the web site and Alan's release
notes saying he's merging with 2.4.3xx. Are those announcements being posted
somewhere else now ?

-Original Message-
From: Leonid Mamtchenkov [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 26, 2001 2:33 PM
To: Hen, Shmulik
Cc: 'LKML'
Subject: Re: Q: How do I get from the latest stable kernel version to
the latest prepatch version ?


Hello Hen, Shmulik,

Once you wrote about "Q: How do I get from the latest stable kernel version
to the latest prepatch version ?":
HS> According to http://www.kernel.org, the latest stable kernel version is
HS> 2.4.2. The latest prepatch version is 2.4.3-pre3.
HS> 
HS> In order to get a full 2.4.3-pre8 kernel do I have to:
HS> 
HS> A. download linux-2.4.2.tar.gz and all the patch-2.4.3-preX.gz and apply
HS> them in succession or,
HS> B. download linux-2.4.3.tar.gz (exists ?) and then apply all the patches
HS> or,
HS> C. download linux-2.4.3-pre7.tar.gz (exists ?) and apply only
HS> patch-2.4.3-pre8.gz ?

Download 2.4.2 and then apply 2.4.3-preX (latest) on it... that's it.
You might want to visit http://kernelnewbies.org .  They have some good docs
there.

-- 
 Best regards,
 Leonid Mamtchenkov
 System Administrator



RE: URGENT : System hands on "Freeing unused kernel memory: "

2001-03-27 Thread Hen, Shmulik

Does it hang forever ?

I've noticed that my kernel (2.4.2) stalls for several minutes with the same
message, but then the login prompt suddenly appears (everything in between,
such as configuration and service startup messages, is missing). We've been
able to track it down to a change we made to /etc/lilo.conf to add support
for kernel prints to go out to a serial debugger. Before that everything was
OK, but after we added append="console=tty0 console=ttyS1,38400", this
problem started. We did notice, however, that everything that doesn't appear
on the console does appear on the serial debugger.

Shmulik.

-Original Message-
From: Thomas Foerster [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, March 27, 2001 10:40 AM
To: [EMAIL PROTECTED]
Subject: Re: URGENT : System hands on "Freeing unused kernel memory: "

> On 03.27 Thomas Foerster wrote:
>>
>> But suddenly the box was offline. One technical assistant from our ISP
>> tried to reboot our server (he couldn't tell me if there had been any
>> messages on the screen), but the system always hangs on
>>
>> Freeing unused kernel memory: xxk freed
>>

> Try booting with init=/bin/bash, it looks like kernel gets a bad
> /sbin/init, and gets stuck. Perhaps the shutdown damaged init, it starts
> to run and get hung.

That didn't fix the problem :(

When I run "diff" on a new and the "old" init, I get no diffs ...

Must be something else :(

Thomas



RE: Plans for 2.5

2001-03-29 Thread Hen, Shmulik


Just some general questions:

1) Is there anywhere a list that describes what is intended to be in 2.5.x ?
2) Are there any early releases of 2.5.x ?
3) Are the things for 2.5.x being discussed on another mailing list ?
4) What is the time frame of releasing 2.5.x-final (or 2.6.x) ?

Specifically, I'm more interested in the network driver aspect.
1) Are there any intended changes to the networking layer ?
2) I overheard something about making the driver reentrant - any news ?
3) What about support for IPv6 ? (I noticed it was marked as experimental
until now)


Thanks in advance,
Shmulik Hen
Software Engineer
Linux Advanced Networking Services
Intel Network Communications Group
Jerusalem, Israel


-Original Message-
From: Bruno Avila [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 29, 2001 12:45 AM
To: [EMAIL PROTECTED]
Subject: Plans for 2.5


Hello people,

I got some questions. When are we going to develop stuff for 2.5? What is
planned? In my opinion, the focus for linux 2.5 should be performance. Since
linux already is stable and well done by nature, we could think more about
performance as a differentiator over others. What do you people think?

  Bruno Avila

PS: Not a good english. I know! :)



change_mtu boundary checking error

2001-04-17 Thread Hen, Shmulik


Hello,

Going through the change_mtu() code in the kernel, I came across the default
function supplied when calling ether_setup().
I could see that eth_change_mtu() (drivers/net/net_init.c) does the
following:

if( (new_mtu < 68) || (new_mtu > 1500) )
        return -EINVAL;

Looking in include/linux/if_ether.h I found the following constants:
#define ETH_ALEN        6       /* Octets in one ethernet addr */
#define ETH_HLEN        14      /* Total octets in header. */
#define ETH_ZLEN        60      /* Min. octets in frame sans FCS */
#define ETH_DATA_LEN    1500    /* Max. octets in payload */
#define ETH_FRAME_LEN   1514    /* Max. octets in frame sans FCS */


Now, the high boundary seemed reasonable (ETH_FRAME_LEN - ETH_HLEN =
ETH_DATA_LEN) which gives 1500, but why is the low boundary set to 68 ?
According to my calculations, it should have been ETH_ZLEN - ETH_HLEN which
gives 46.

Doesn't MTU mean only the payload size ?
Where did the 68 come from ?
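
The arithmetic above can be checked in user space. The following is only a sketch that mirrors the quoted condition, using the constant values listed in this message; check_mtu is a made-up name, not the kernel function:

```c
#include <errno.h>

#define ETH_HLEN      14    /* Total octets in header */
#define ETH_ZLEN      60    /* Min. octets in frame sans FCS */
#define ETH_DATA_LEN  1500  /* Max. octets in payload */
#define ETH_FRAME_LEN 1514  /* Max. octets in frame sans FCS */

/* User-space mirror of the bounds check quoted above (hypothetical name). */
static int check_mtu(int new_mtu)
{
    if (new_mtu < 68 || new_mtu > ETH_DATA_LEN)
        return -EINVAL;
    return 0;
}
```

With these values an MTU of 46, the smallest Ethernet payload, is rejected by the check even though ETH_ZLEN - ETH_HLEN allows it at the frame level.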


Thanks,
Shmulik Hen
Software Engineer
Linux Advanced Networking Services
Network Communications Group, Israel (NCGj)



RE: change_mtu boundary checking error

2001-04-18 Thread Hen, Shmulik


But Ethernet is not only for IP, what about other protocols ?

-Original Message-
From: Alan Cox [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 17, 2001 3:41 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: change_mtu boundary checking error


> Now, the high boundary seemed reasonable (ETH_FRAME_LEN - ETH_HLEN =
> ETH_DATA_LEN) which gives 1500, but why is the low boundary set to 68 ?
> According to my calculations, it should have been ETH_ZLEN - ETH_HLEN
> which gives 46.

The IPv4 minimum MTU is 68 bytes. Below that not all frames can be delivered
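
As background for that figure: RFC 791 arrives at 68 by adding the largest possible IPv4 header (60 octets) to the smallest fragment payload (8 octets). A small sketch of the arithmetic; the macro and function names are made up for illustration:

```c
/* Figures from RFC 791 (IPv4); the names are illustrative. */
#define IPV4_MAX_HEADER_LEN 60  /* IHL is 4 bits: 15 words of 4 octets */
#define IPV4_MIN_FRAGMENT   8   /* fragment offsets count 8-octet units */
#define IPV4_MIN_MTU (IPV4_MAX_HEADER_LEN + IPV4_MIN_FRAGMENT)

/* Payload octets that fit in one non-final fragment for a given MTU and
 * header length, rounded down to a multiple of 8. */
static int fragment_payload(int mtu, int header_len)
{
    int room = mtu - header_len;

    return room < 0 ? 0 : room - (room % 8);
}
```

Below 68, a datagram carrying a maximum-size header leaves no room for even one 8-octet fragment, which is the delivery problem the reply refers to.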


