Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2

2006-10-22 Thread Andrew Morton
On Sat, 21 Oct 2006 12:38:14 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Norbert Preining [EMAIL PROTECTED]
 Date: Sat, 21 Oct 2006 15:22:39 +0200
 
   [c010469d] dump_stack+0x12/0x14
   [c0141cb3] softlockup_tick+0xaa/0xc1
   [c0129bad] update_process_times+0x3b/0x5e
   [c01362a1] handle_update_profile+0x14/0x1e
   [c0115956] smp_apic_timer_interrupt+0x49/0x5b
   [c0103998] apic_timer_interrupt+0x28/0x30
  DWARF2 unwinder stuck at apic_timer_interrupt+0x28/0x30
  Leftover inexact backtrace:
   [c01d03af] delay_tsc+0xb/0x13
   [c01d03e0] __delay+0x6/0x7
 
 It's OOPS'ing by softlockup'ing in udelay() and then we get a corrupt
 backtrace.

The unwinder-based backtrace is screwed up (yet again) but the old-style
backtrace is there, in all its messy glory.

Weeding out the crap, I think it's this:

 [c01d03af] delay_tsc+0xb/0x13
 [c01d03e0] __delay+0x6/0x7
 [c022c240] _tw32_flush+0x3f/0x51
 [c022da97] tg3_switch_clocks+0x8f/0x93

 I assume tg3_init_hw() got inlined

 [c0237673] tg3_open+0x250/0x520
 [c02d3263] dev_open+0x2b/0x62
 [c02d1dd8] dev_change_flags+0x47/0xe4
 [c0307fcc] devinet_ioctl+0x252/0x556
 [c02d2e5a] dev_ifsioc+0x113/0x38d
 [c02d29c4] dev_load+0x24/0x4b
 [c02c9265] sock_ioctl+0x19e/0x1c2

It's strange that the post-2.6.19-rc2 changes triggered this - that code
won't have run yet.

Norbert, are you really sure?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2

2006-10-22 Thread Norbert Preining
Hi all!

On Sam, 21 Okt 2006, Michael Chan wrote:
  2.6.19-rc2  works
  2.6.19-rc2+patch does not work
  
 It doesn't make any sense.  This patch is totally benign and
 cannot cause the No firmware running and lockup that you
 reported.  Can you please double-check?

Ok, I cannot reproduce it anymore. No idea why it happened.

ANyway, with my current rc2+tg3 patch I have no problems, while with
rc2-mm2 I have the problems.

Best wishes

Norbert

---
Dr. Norbert Preining [EMAIL PROTECTED]Università di Siena
Debian Developer [EMAIL PROTECTED] Debian TeX Group
gpg DSA: 0x09C5B094  fp: 14DF 2E6C 0307 BE6D AD76  A9C0 D2BF 4AA3 09C5 B094
---
SHENANDOAH (n.)
The infinite smugness of one who knows they are entitled to a place in
a nuclear bunker.
--- Douglas Adams, The Meaning of Liff
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2

2006-10-22 Thread Norbert Preining
Hi all!

Ok, you will lough at me...

On Son, 22 Okt 2006, preining wrote:
   2.6.19-rc2works
   2.6.19-rc2+patch does not work
   
  It doesn't make any sense.  This patch is totally benign and
  cannot cause the No firmware running and lockup that you
  reported.  Can you please double-check?
 
 Ok, I cannot reproduce it anymore. No idea why it happened.

And again I can reporduce it. How, (again, please don't lough):

I booted into windows (sometimes one has too, contract from the EC with
macros in Excel tables ... grrr).

WinXP didn't mange to get an IP address from my cable modem.

Rebooting into linux, same problem as reported, 

and, but no idea whether this is related: the modem just looses sync and
need several resets until it find back into syncronization.

I will do some more experiments with Win-different linux kernel
switching.

Sorry for the chaos, no idea what has happened here!!!

Best wishes

Norbert

---
Dr. Norbert Preining [EMAIL PROTECTED]Università di Siena
Debian Developer [EMAIL PROTECTED] Debian TeX Group
gpg DSA: 0x09C5B094  fp: 14DF 2E6C 0307 BE6D AD76  A9C0 D2BF 4AA3 09C5 B094
---
`We've got to find out what people want from fire, how
they relate to it, what sort of image it has for them.'
The crowd were tense. They were expecting something
wonderful from Ford.
`Stick it up your nose,' he said.
`Which is precisely the sort of thing we need to know,'
insisted the girl, `Do people want fire that can be fitted
nasally?'
 --- Ford debating what to do with fire with a marketing
 --- girl.
 --- Douglas Adams, The Hitchhikers Guide to the Galaxy
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2

2006-10-21 Thread Andrew Morton

cc's added.

On Sat, 21 Oct 2006 15:22:39 +0200
Norbert Preining [EMAIL PROTECTED] wrote:

 Hi all!
 
 I get the same bug again and again, always when ifplugd is started:
 
 tg3: eth0: No firmware running.
 BUG: soft lockup detected on CPU#0!
  [c0103ec7] dump_trace+0x68/0x1b4
  [c010402b] show_trace_log_lvl+0x18/0x2c
  [c010463a] show_trace+0xf/0x11
  [c010469d] dump_stack+0x12/0x14
  [c0141cb3] softlockup_tick+0xaa/0xc1
  [c0129bad] update_process_times+0x3b/0x5e
  [c01362a1] handle_update_profile+0x14/0x1e
  [c0115956] smp_apic_timer_interrupt+0x49/0x5b
  [c0103998] apic_timer_interrupt+0x28/0x30
 DWARF2 unwinder stuck at apic_timer_interrupt+0x28/0x30
 Leftover inexact backtrace:
  [c01d03af] delay_tsc+0xb/0x13
  [c01d03e0] __delay+0x6/0x7
  [c022ce12] tg3_readphy+0x6e/0xd5
  [c022e0d1] tg3_setup_copper_phy+0x30b/0xa15
  [c01064d9] profile_pc+0x24/0x53
  [c022f475] tg3_setup_phy+0xc9a/0xd1f
  [c0103998] apic_timer_interrupt+0x28/0x30
  [c022c240] _tw32_flush+0x3f/0x51
  [c022dc4a] tg3_write_mem+0xcf/0xe7
  [c0231683] tg3_reset_hw+0x10ab/0x13a0
  [c01d03e0] __delay+0x6/0x7
  [c01d03e0] __delay+0x6/0x7
  [c022c240] _tw32_flush+0x3f/0x51
  [c01d03e0] __delay+0x6/0x7
  [c022da97] tg3_switch_clocks+0x8f/0x93
  [c0237673] tg3_open+0x250/0x520
  [c02d3263] dev_open+0x2b/0x62
  [c02d1dd8] dev_change_flags+0x47/0xe4
  [c0307fcc] devinet_ioctl+0x252/0x556
  [c02d2e5a] dev_ifsioc+0x113/0x38d
  [c02d29c4] dev_load+0x24/0x4b
  [c02c90c7] sock_ioctl+0x0/0x1c2
  [c02c9265] sock_ioctl+0x19e/0x1c2
  [c02ca151] sock_map_fd+0x41/0x4a
  [c02c90c7] sock_ioctl+0x0/0x1c2
  [c01684bb] do_ioctl+0x1f/0x62
  [c0168743] vfs_ioctl+0x245/0x257
  [c0168788] sys_ioctl+0x33/0x4b
  [c0102f40] syscall_call+0x7/0xb
  ===
 
 With  2.6.19-rc2 (no -mm) it does not happen.
 
 Normal dmesg gives:
 tg3.c:v3.66 (September 23, 2006)
 PCI: Enabling device :03:00.0 ( - 0002)
 ACPI: PCI Interrupt :03:00.0[A] - GSI 17 (level, low) - IRQ 18
 PCI: Setting latency timer of device :03:00.0 to 64
 eth0: Tigon3 [partno(BCM95789) rev 4101 PHY(5750)] (PCI Express) 
 10/100/1000Base
 T Ethernet 00:16:36:1e:27:ad
 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] 
 TSOcap[1] 
 eth0: dma_rwctrl[7618] dma_mask[64-bit]
 


There are tg3 changes in -mm, but I doubt it they caused this hang.

Can you test 2.6.19-rc2 plus the below?

Thanks.

--- linux-2.6.19-rc2/drivers/net/tg3.c  2006-10-13 10:35:00.0 -0700
+++ devel/drivers/net/tg3.c 2006-10-21 09:34:42.0 -0700
@@ -68,8 +68,8 @@
 
 #define DRV_MODULE_NAMEtg3
 #define PFX DRV_MODULE_NAME: 
-#define DRV_MODULE_VERSION 3.66
-#define DRV_MODULE_RELDATE September 23, 2006
+#define DRV_MODULE_VERSION 3.67
+#define DRV_MODULE_RELDATE October 18, 2006
 
 #define TG3_DEF_MAC_MODE   0
 #define TG3_DEF_RX_MODE0
@@ -129,7 +129,7 @@
 #define RX_JUMBO_PKT_BUF_SZ(9046 + tp-rx_offset + 64)
 
 /* minimum number of free TX descriptors required to wake up TX process */
-#define TG3_TX_WAKEUP_THRESH   (TG3_TX_RING_SIZE / 4)
+#define TG3_TX_WAKEUP_THRESH(tp)   ((tp)-tx_pending / 4)
 
 /* number of ETHTOOL_GSTATS u64's */
 #define TG3_NUM_STATS  (sizeof(struct tg3_ethtool_stats)/sizeof(u64))
@@ -3075,10 +3075,10 @@ static void tg3_tx(struct tg3 *tp)
smp_mb();
 
if (unlikely(netif_queue_stopped(tp-dev) 
-(tg3_tx_avail(tp)  TG3_TX_WAKEUP_THRESH))) {
+(tg3_tx_avail(tp)  TG3_TX_WAKEUP_THRESH(tp {
netif_tx_lock(tp-dev);
if (netif_queue_stopped(tp-dev) 
-   (tg3_tx_avail(tp)  TG3_TX_WAKEUP_THRESH))
+   (tg3_tx_avail(tp)  TG3_TX_WAKEUP_THRESH(tp)))
netif_wake_queue(tp-dev);
netif_tx_unlock(tp-dev);
}
@@ -3928,7 +3928,7 @@ static int tg3_start_xmit(struct sk_buff
tp-tx_prod = entry;
if (unlikely(tg3_tx_avail(tp) = (MAX_SKB_FRAGS + 1))) {
netif_stop_queue(dev);
-   if (tg3_tx_avail(tp)  TG3_TX_WAKEUP_THRESH)
+   if (tg3_tx_avail(tp)  TG3_TX_WAKEUP_THRESH(tp))
netif_wake_queue(tp-dev);
}
 
@@ -4143,7 +4143,7 @@ static int tg3_start_xmit_dma_bug(struct
tp-tx_prod = entry;
if (unlikely(tg3_tx_avail(tp) = (MAX_SKB_FRAGS + 1))) {
netif_stop_queue(dev);
-   if (tg3_tx_avail(tp)  TG3_TX_WAKEUP_THRESH)
+   if (tg3_tx_avail(tp)  TG3_TX_WAKEUP_THRESH(tp))
netif_wake_queue(tp-dev);
}
 
@@ -8106,7 +8106,10 @@ static int tg3_set_ringparam(struct net_
 
if ((ering-rx_pending  TG3_RX_RING_SIZE - 1) ||
(ering-rx_jumbo_pending  TG3_RX_JUMBO_RING_SIZE - 1) ||
-   (ering-tx_pending  TG3_TX_RING_SIZE - 1))
+   (ering-tx_pending  TG3_TX_RING_SIZE - 1) ||
+   (ering-tx_pending = MAX_SKB_FRAGS) ||
+   

Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2

2006-10-21 Thread Rafael J. Wysocki
On Saturday, 21 October 2006 19:02, Andrew Morton wrote:
 
 cc's added.
 
 On Sat, 21 Oct 2006 15:22:39 +0200
 Norbert Preining [EMAIL PROTECTED] wrote:
 
  Hi all!
  
  I get the same bug again and again, always when ifplugd is started:
  
  tg3: eth0: No firmware running.
  BUG: soft lockup detected on CPU#0!
   [c0103ec7] dump_trace+0x68/0x1b4
   [c010402b] show_trace_log_lvl+0x18/0x2c
   [c010463a] show_trace+0xf/0x11
   [c010469d] dump_stack+0x12/0x14
   [c0141cb3] softlockup_tick+0xaa/0xc1
   [c0129bad] update_process_times+0x3b/0x5e
   [c01362a1] handle_update_profile+0x14/0x1e
   [c0115956] smp_apic_timer_interrupt+0x49/0x5b
   [c0103998] apic_timer_interrupt+0x28/0x30
  DWARF2 unwinder stuck at apic_timer_interrupt+0x28/0x30
  Leftover inexact backtrace:
   [c01d03af] delay_tsc+0xb/0x13
   [c01d03e0] __delay+0x6/0x7
   [c022ce12] tg3_readphy+0x6e/0xd5
   [c022e0d1] tg3_setup_copper_phy+0x30b/0xa15
   [c01064d9] profile_pc+0x24/0x53
   [c022f475] tg3_setup_phy+0xc9a/0xd1f
   [c0103998] apic_timer_interrupt+0x28/0x30
   [c022c240] _tw32_flush+0x3f/0x51
   [c022dc4a] tg3_write_mem+0xcf/0xe7
   [c0231683] tg3_reset_hw+0x10ab/0x13a0
   [c01d03e0] __delay+0x6/0x7
   [c01d03e0] __delay+0x6/0x7
   [c022c240] _tw32_flush+0x3f/0x51
   [c01d03e0] __delay+0x6/0x7
   [c022da97] tg3_switch_clocks+0x8f/0x93
   [c0237673] tg3_open+0x250/0x520
   [c02d3263] dev_open+0x2b/0x62
   [c02d1dd8] dev_change_flags+0x47/0xe4
   [c0307fcc] devinet_ioctl+0x252/0x556
   [c02d2e5a] dev_ifsioc+0x113/0x38d
   [c02d29c4] dev_load+0x24/0x4b
   [c02c90c7] sock_ioctl+0x0/0x1c2
   [c02c9265] sock_ioctl+0x19e/0x1c2
   [c02ca151] sock_map_fd+0x41/0x4a
   [c02c90c7] sock_ioctl+0x0/0x1c2
   [c01684bb] do_ioctl+0x1f/0x62
   [c0168743] vfs_ioctl+0x245/0x257
   [c0168788] sys_ioctl+0x33/0x4b
   [c0102f40] syscall_call+0x7/0xb
   ===
  
  With  2.6.19-rc2 (no -mm) it does not happen.
  
  Normal dmesg gives:
  tg3.c:v3.66 (September 23, 2006)
  PCI: Enabling device :03:00.0 ( - 0002)
  ACPI: PCI Interrupt :03:00.0[A] - GSI 17 (level, low) - IRQ 18
  PCI: Setting latency timer of device :03:00.0 to 64
  eth0: Tigon3 [partno(BCM95789) rev 4101 PHY(5750)] (PCI Express) 
  10/100/1000Base
  T Ethernet 00:16:36:1e:27:ad
  eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] 
  TSOcap[1] 
  eth0: dma_rwctrl[7618] dma_mask[64-bit]
  
 
 
 There are tg3 changes in -mm, but I doubt it they caused this hang.

FWIW, I have a tg3 running just fine with 2.6.19-rc2-mm2, on x86-64.

Greetings,
Rafael


-- 
You never change things by fighting the existing reality.
R. Buckminster Fuller
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2

2006-10-21 Thread Norbert Preining
Hi Andrew, hi all!

On Sam, 21 Okt 2006, Andrew Morton wrote:
 Can you test 2.6.19-rc2 plus the below?

2.6.19-rc2  works
2.6.19-rc2+patch does not work

So it is this patch.

hw:
Acer TravelMate 3012WMi
03:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5789 Gigabit 
Ethernet PCI Express (rev 11)

If you need dmesg, .config, something else, no problem.

Best wishes

Norbert

---
Dr. Norbert Preining [EMAIL PROTECTED]Università di Siena
Debian Developer [EMAIL PROTECTED] Debian TeX Group
gpg DSA: 0x09C5B094  fp: 14DF 2E6C 0307 BE6D AD76  A9C0 D2BF 4AA3 09C5 B094
---
DITHERINGTON (n)
Sudden access to panic experienced by one who realises that he is
being drawn inexorably into a clabby (q.v.) conversion, i.e. one he
has no hope of enjoying, benefiting from or understanding.
--- Douglas Adams, The Meaning of Liff
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2

2006-10-21 Thread Michael Chan
Norbert Preining wrote:
 On Sam, 21 Okt 2006, Andrew Morton wrote:
  Can you test 2.6.19-rc2 plus the below?
 
 2.6.19-rc2works
 2.6.19-rc2+patch does not work
 
 So it is this patch.
 
It doesn't make any sense.  This patch is totally benign and
cannot cause the No firmware running and lockup that you
reported.  Can you please double-check?

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html