Re: Linux v2.6.21-rc3

2007-03-13 Thread Eric W. Biederman

Here is a quick summary of the regressions I am looking at.

- Currently we appear to have a pid leak in tty_io.c
  http://lkml.org/lkml/2007/3/8/222

- There is a missing INIT_WORK in vt.c that causes an oops
  when we attempt to use SAK.
  http://lkml.org/lkml/2007/3/11/148

- We have a network ABI regression caused by the latest sysfs
  changes to net-sysfs.c.  In particular we now cannot rename network
  devices if our destination name happens to be the name of a sysfs file that
  the network device appears in, and if we try the kernel gets very
  confused and we lose access to the network device.

  Do we just want to revert commit 43cb76d91ee85f579a69d42bc8efc08bac560278?
  Greg has been working on this off and on and has not found a
  simple solution yet.

- pci_save_state, pci_restore_state are broken and have been for a
  while if used on anything besides plain pci (pci-x, pci-e and msi)
  and are not used in pairs. (gregkh and Andrew have the patches to 
  correct this).

- I am still confirming that I have fixed all of the irq handling
  problems that resulted in the No irq for vector message.  I think
  I have but I have at least one indirect bug report that I'm still
  following up on.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux v2.6.21-rc3

2007-03-13 Thread Greg KH
On Tue, Mar 13, 2007 at 01:26:48PM -0600, Eric W. Biederman wrote:
> - We have a network ABI regression caused by the latest sysfs
>   changes to net-sysfs.c.  In particular we now cannot rename network
>   devices if our destination name happens to be the name of a sysfs file that
>   the network device appears in, and if we try the kernel gets very
>   confused and we lose access to the network device.
>
>   Do we just want to revert commit 43cb76d91ee85f579a69d42bc8efc08bac560278?
>   Greg has been working on this off and on and has not found a
>   simple solution yet.

I do not think this should be reverted, as the odds that someone will
rename their network device to be "irq" or something else that is in the
pci device's directory is pretty slim.  It also only shows up if
CONFIG_SYSFS_DEPRECATED is disabled, not the common option.
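
As a rough userspace illustration of the collision being discussed, the sketch below checks whether a proposed interface name already exists as an entry in a given directory. The helper name `name_collides` and the idea of checking from userspace are assumptions made for this sketch; it does not reproduce the kernel's rename path or the net-sysfs.c code.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if "dir/name" already exists,
 * 0 if it does not, and -1 if the combined path is too long.
 */
int name_collides(const char *dir, const char *name)
{
    char path[4096];
    struct stat st;
    int n = snprintf(path, sizeof(path), "%s/%s", dir, name);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    /* stat() succeeding means an entry with that name is already there. */
    return stat(path, &st) == 0 ? 1 : 0;
}
```

A caller would pass the PCI device's sysfs directory on the machine in question as `dir` and the proposed interface name (for example "irq") as `name`; a return value of 1 marks a name that would hit the problem described above.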

But I am still working on it, I sent you and Kay a patch that, while it
locks up at boot time, should be close to what we need to address this
:)

> - pci_save_state, pci_restore_state are broken and have been for a
>   while if used on anything besides plain pci (pci-x, pci-e and msi)
>   and are not used in pairs. (gregkh and Andrew have the patches to
>   correct this).

I think these are already in Linus's tree right now, right?

thanks,

greg k-h


Re: Linux v2.6.21-rc3

2007-03-13 Thread Linus Torvalds


On Tue, 13 Mar 2007, Greg KH wrote:
 
> > - pci_save_state, pci_restore_state are broken and have been for a
> >   while if used on anything besides plain pci (pci-x, pci-e and msi)
> >   and are not used in pairs. (gregkh and Andrew have the patches to
> >   correct this).
>
> I think these are already in Linus's tree right now, right?

Yes. I just wanted some more testing of it, and while I didn't hear much, 
at least Auke added his ack, and the old state was clearly broken, so they 
got applied yesterday.

Linus


Re: Linux v2.6.21-rc3

2007-03-13 Thread Eric W. Biederman
Greg KH [EMAIL PROTECTED] writes:

> I do not think this should be reverted, as the odds that someone will
> rename their network device to be "irq" or something else that is in the
> pci device's directory is pretty slim.  It also only shows up if
> CONFIG_SYSFS_DEPRECATED is disabled, not the common option.

Ah.  I missed that last little bit.

> But I am still working on it, I sent you and Kay a patch that, while it
> locks up at boot time, should be close to what we need to address this
> :)

Hmm.  I haven't seen that one.

> > - pci_save_state, pci_restore_state are broken and have been for a
> >   while if used on anything besides plain pci (pci-x, pci-e and msi)
> >   and are not used in pairs. (gregkh and Andrew have the patches to
> >   correct this).
>
> I think these are already in Linus's tree right now, right?

Oops I missed that.

Eric


Re: Linux v2.6.21-rc3

2007-03-08 Thread Alistair John Strachan
(Dropped LKML, whoops.)

On Wednesday 07 March 2007 04:59, you wrote:
> We've finally hopefully started to put a dent in the regressions,
> especially the suspend/resume problems introduced since 2.6.20.
>
> So 2.6.21-rc3 is out there now, and there's some hope that it will work
> more widely than -rc1 and -rc2 did. Please do give it a good testing, and
> update Adrian and the mailing list (and me) about any regressions
> (hopefully many more of the "it's fixed now" than other kinds, but all
> regressions are interesting).

Robert and Jeff already know about these, but I thought I'd send out a
reminder.

ata2: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 
status 0x500 next cpb count 0x0 next cpb idx 0x0
ata2: CPB 0: ctl_flags 0xd, resp_flags 0x1
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd 35/00:30:b5:c1:8f/00:01:01:00:00/e0 tag 0 cdb 0x0 data 155648 out
 res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: configured for UDMA/133
ata2: EH complete
SCSI device sdb: 488397168 512-byte hdwr sectors (250059 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support 
DPO or FUA

They didn't happen (or didn't happen as frequently) in 2.6.20; it's a serious
bug. Happened in -rc2 and -rc3. A patch from Robert reverting
721449bf0d51213fe3abf0ac3e3561ef9ea7827a seems to make them go away.

-- 
Cheers,
Alistair.

Final year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.


Re: Linux v2.6.21-rc3

2007-03-08 Thread Benjamin Herrenschmidt
On Wed, 2007-03-07 at 07:39 -0800, Linus Torvalds wrote:
 
> On Wed, 7 Mar 2007, Benjamin Herrenschmidt wrote:
>
> > On Tue, 2007-03-06 at 20:59 -0800, Linus Torvalds wrote:
> >
> > > Linus Torvalds (2):
> > >       Revert "[PATCH] LOG2: Alter get_order() so that it can make use of ilog2() on a constant"
> > >       Linux 2.6.21-rc3
> >
> > Greg, I think we should revert that patch in 2.6.20.x stable series too
> > as get_order is broken there as well, causing random kernel memory
> > corruption every now and then among others.
>
> Did you confirm that that was indeed the cause of the problem you saw?

Well, at least one of the problems I caught with my ppc32 implementation
of DEBUG_PAGEALLOC yes. PowerPC dma_alloc_coherent, on machines with
cache consistent PCI DMA, would use get_order to allocate pages and then
memset over the size passed in. The ide-pmac driver, among others, would
trigger that bug by asking for 0x1020 bytes while get_order only
returned 0. (I should look into making the ide-pmac driver allocate =
4K but that's a different matter).
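
For illustration, here is a minimal userspace sketch of the arithmetic behind this case. It assumes 4 KiB pages (a page shift of 12) and uses a plain shift-and-count loop as the reference order calculation. The names `ref_get_order` and `overrun_bytes` are made up for this sketch and are not kernel functions.

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12                          /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Smallest order such that (page size << order) >= size.
 * size must be non-zero.
 */
int ref_get_order(size_t size)
{
    int order = 0;

    size = (size - 1) >> SKETCH_PAGE_SHIFT;
    while (size) {
        order++;
        size >>= 1;
    }
    return order;
}

/*
 * Number of bytes written past the end of the allocation if `size`
 * bytes are cleared in a block of the given order.
 */
size_t overrun_bytes(size_t size, int order)
{
    size_t allocated = SKETCH_PAGE_SIZE << order;

    return size > allocated ? size - allocated : 0;
}
```

Under these assumptions a request for 0x1020 bytes needs order 1 (two pages). If the order comes back as 0 instead, clearing 0x1020 bytes runs 0x20 bytes past the single page that was allocated, which is the kind of overrun DEBUG_PAGEALLOC can catch.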
 
I think it fixed David Woodhouse's random crashes too.

> As far as I can tell, the bug (because it tested the wrong #define) would
> only affect the constant-size case, and only for something larger than a
> single page, and only for a non-power-of-two size. So it looked fairly
> hard to trigger, if only because all the obvious constants I saw seemed
> to already be powers-of-two..
>
> So did you hunt it down to a particular case where it triggers?

Yup, the above. Calls to dma_alloc_coherent with a constant size that
is not a multiple of the page size and larger than one page. (Our
dma_alloc_coherent implementation on 32 bits is inline).

Ben.



Re: Linux v2.6.21-rc3

2007-03-08 Thread Benjamin Herrenschmidt
On Wed, 2007-03-07 at 21:52 +0100, Arnd Bergmann wrote:
> On Wednesday 07 March 2007 16:39:00 Linus Torvalds wrote:
> > So did you hunt it down to a particular case where it triggers?
>
> IIRC, it crashed on boot in the powerpc iommu code when slab
> debugging is enabled. Not sure if it was on Cell or on benh's
> powerbook though.

Not iommu code, but dma_alloc_coherent() for non-iommu 32 bits
machines :-) Oh and it wasn't slab but DEBUG_PAGEALLOC :-)

Ben.




Re: Linux v2.6.21-rc3

2007-03-07 Thread Benjamin Herrenschmidt
On Tue, 2007-03-06 at 20:59 -0800, Linus Torvalds wrote:

> Linus Torvalds (2):
>       Revert "[PATCH] LOG2: Alter get_order() so that it can make use of ilog2() on a constant"
>       Linux 2.6.21-rc3

Greg, I think we should revert that patch in 2.6.20.x stable series too
as get_order is broken there as well, causing random kernel memory
corruption every now and then among others.

Cheers,
Ben



Re: Linux v2.6.21-rc3

2007-03-07 Thread Michal Piotrowski
Hi,

Linus Torvalds wrote:
> We've finally hopefully started to put a dent in the regressions,
> especially the suspend/resume problems introduced since 2.6.20.

I get this while
echo shutdown > /sys/power/disk; echo disk > /sys/power/state

BUG: using smp_processor_id() in preemptible [0001] code: swsusp_shutdown/3359
caller is check_tsc_sync_source+0x1b/0xef
 [<c010503d>] show_trace_log_lvl+0x1a/0x2f
 [<c0105724>] show_trace+0x12/0x14
 [<c01057d6>] dump_stack+0x16/0x18
 [<c01f835e>] debug_smp_processor_id+0xa2/0xb4
 [<c0113cc5>] check_tsc_sync_source+0x1b/0xef
 [<c011367d>] __cpu_up+0x136/0x158
 [<c0141aec>] _cpu_up+0x74/0xbf
 [<c0141b5d>] cpu_up+0x26/0x38
 [<c0141bbc>] enable_nonboot_cpus+0x4d/0x9a
 [<c0146ae0>] pm_suspend_disk+0x11c/0x210
 [<c014597e>] enter_state+0x50/0x1d0
 [<c0145b84>] state_store+0x86/0x9c
 [<c01a53d0>] subsys_attr_store+0x20/0x25
 [<c01a54ea>] sysfs_write_file+0xc1/0xe9
 [<c017199b>] vfs_write+0xaf/0x138
 [<c0171f65>] sys_write+0x3d/0x61
 [<c0104064>] syscall_call+0x7/0xb
 ===

l *check_tsc_sync_source+0x1b/0xef
0xc0113caa is in check_tsc_sync_source 
(/mnt/md0/devel/linux-git/arch/i386/kernel/../../x86_64/kernel/tsc_sync.c:99).
94  /*
95   * Source CPU calls into this - it waits for the freshly booted
96   * target CPU to arrive and then starts the measurement:
97   */
98  void __cpuinit check_tsc_sync_source(int cpu)
99  {
100 int cpus = 2;
101
102 /*
103  * No need to check if we already know that the TSC is not

echo platform > /sys/power/disk; echo disk > /sys/power/state
doesn't work (as always).

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc3/boot.log
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc3/git-config

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)


Re: Linux v2.6.21-rc3

2007-03-07 Thread Michal Piotrowski
Linus Torvalds wrote:
> We've finally hopefully started to put a dent in the regressions,
> especially the suspend/resume problems introduced since 2.6.20.
>
> So 2.6.21-rc3 is out there now, and there's some hope that it will work
> more widely than -rc1 and -rc2 did. Please do give it a good testing, and
> update Adrian and the mailing list (and me) about any regressions
> (hopefully many more of the "it's fixed now" than other kinds, but all
> regressions are interesting).
>
> The appended shortlog gives a reasonable overview. In general we're
> definitely calming down, and most of the changes are fairly small and
> obvious fixes.
>
> Let's keep the fixes to a minimum, especially since I'm planning on biting
> people's heads off if I get any more pull requests for things that aren't
> real and obvious fixes.
>
>               Linus

BTW. Does anyone care about parport console?
console=lp0 hangs since at least 2.6.18

Calling initcall 0xc0438939: pty_init+0x0/0x231()
Calling initcall 0xc0439235: lp_init_module+0x0/0x238()
lp: driver loaded but no devices found
Calling initcall 0xc043947f: mod_init+0x0/0x286()
intel_rng: FWH not detected
Calling initcall 0xc0439aa9: serial8250_init+0x0/0x114()
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
PM: Adding info for platform:serial8250
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
PM: Adding info for No Bus:ttyS0
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
PM: Adding info for No Bus:ttyS1
PM: Adding info for No Bus:ttyS2
PM: Adding info for No Bus:ttyS3
Calling initcall 0xc0439c6c: serial8250_pnp_init+0x0/0xf()
PM: Removing info for No Bus:ttyS0
00:06: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
PM: Adding info for No Bus:ttyS0
PM: Removing info for No Bus:ttyS1
00:07: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
PM: Adding info for No Bus:ttyS1
Calling initcall 0xc0439c7b: serial8250_pci_init+0x0/0x16()
Calling initcall 0xc043a16d: parport_default_proc_register+0x0/0x16()
Calling initcall 0xc043a250: parport_pc_init+0x0/0x196()
parport: PnPBIOS parport detected.
parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
lp0: using parport0 (interrupt-driven).

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc3/git-config

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)


Re: Linux v2.6.21-rc3

2007-03-07 Thread Greg KH
On Wed, Mar 07, 2007 at 11:25:32AM +0100, Benjamin Herrenschmidt wrote:
> On Tue, 2007-03-06 at 20:59 -0800, Linus Torvalds wrote:
>
> > Linus Torvalds (2):
> >       Revert "[PATCH] LOG2: Alter get_order() so that it can make use of ilog2() on a constant"
> >       Linux 2.6.21-rc3
>
> Greg, I think we should revert that patch in 2.6.20.x stable series too
> as get_order is broken there as well, causing random kernel memory
> corruption every now and then among others.

Now added to the -stable tree, thanks for pointing it out to me.

greg k-h


Re: Linux v2.6.21-rc3

2007-03-07 Thread Mark Lord

Greg / Adrian,

I didn't see anything in -rc3 to address the USB hub/serial crashes
reported here for -rc2.  What's the status for those, or who should
I be pinging to get them fixed?

Thanks

Mark


Message-ID: [EMAIL PROTECTED]
Date: Sun, 04 Mar 2007 23:43:02 -0500
From: Mark Lord [EMAIL PROTECTED]
User-Agent: Thunderbird 1.5.0.10 (X11/20070221)
MIME-Version: 1.0
To: Greg KH [EMAIL PROTECTED]
CC: Adrian Bunk [EMAIL PROTECTED], 
 Andrew Morton [EMAIL PROTECTED],

 Linux Kernel Mailing List linux-kernel@vger.kernel.org
Subject: Re: [BUG} usb regression in 2.6.21-rc2-git3
References: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
In-Reply-To: [EMAIL PROTECTED]
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

Mark Lord wrote:


Here's another one for Greg:

I have a Targus USB 1.1 dock, basically a hub with built-in
serial, parallel, PS/2 KB, PS/2 Mouse, and extra USB ports.

Simply connecting, and then disconnecting it causes an oops with 
2.6.21-rc2:

..

Same behaviour with a second, different USB 1.1 dock here as well:

Mar  4 23:40:16 silvy kernel: usb 5-8: new high speed USB device using ehci_hcd 
and address 5
Mar  4 23:40:16 silvy kernel: usb 5-8: configuration #1 chosen from 1 choice
Mar  4 23:40:16 silvy kernel: hub 5-8:1.0: USB hub found
Mar  4 23:40:16 silvy kernel: hub 5-8:1.0: 4 ports detected
Mar  4 23:40:16 silvy kernel: usb 5-8.4: new full speed USB device using 
ehci_hcd and address 6
Mar  4 23:40:17 silvy kernel: usb 5-8.4: configuration #1 chosen from 1 choice
Mar  4 23:40:17 silvy kernel: hub 5-8.4:1.0: USB hub found
Mar  4 23:40:17 silvy kernel: hub 5-8.4:1.0: 4 ports detected
Mar  4 23:40:17 silvy kernel: usb 5-8.4.1: new full speed USB device using 
ehci_hcd and address 7
Mar  4 23:40:17 silvy kernel: usb 5-8.4.1: configuration #1 chosen from 1 choice
Mar  4 23:40:17 silvy kernel: usb 5-8.4.3: new low speed USB device using 
ehci_hcd and address 8
Mar  4 23:40:17 silvy kernel: usb 5-8.4.3: configuration #1 chosen from 1 choice
Mar  4 23:40:17 silvy kernel: input: Composite USB PS2 Converter USB to PS2 
Adaptor  v1.12 as /class/input/input8
Mar  4 23:40:17 silvy kernel: input: USB HID v1.10 Keyboard [Composite USB PS2 
Converter USB to PS2 Adaptor  v1.12] on usb-:00:1d.7-8.4.3
Mar  4 23:40:17 silvy kernel: input: Composite USB PS2 Converter USB to PS2 
Adaptor  v1.12 as /class/input/input9
Mar  4 23:40:17 silvy kernel: input: USB HID v1.10 Mouse [Composite USB PS2 
Converter USB to PS2 Adaptor  v1.12] on usb-:00:1d.7-8.4.3
Mar  4 23:40:17 silvy kernel: usb 5-8.4.4: new full speed USB device using 
ehci_hcd and address 9
Mar  4 23:40:17 silvy kernel: usb 5-8.4.4: configuration #1 chosen from 1 choice
Mar  4 23:40:17 silvy kernel: pl2303 5-8.4.4:1.0: pl2303 converter detected
Mar  4 23:40:17 silvy kernel: usb-serial ttyUSB0: Error registering port 
device, continuing
Mar  4 23:40:17 silvy kernel: drivers/usb/class/usblp.c: usblp0: USB 
Bidirectional printer dev 7 if 0 alt 1 proto 2 vid 0x0B39 pid 0x0801
Mar  4 23:40:17 silvy kernel: usbcore: registered new interface driver usblp
Mar  4 23:40:17 silvy kernel: drivers/usb/class/usblp.c: v0.13: USB Printer 
Device Class driver
Mar  4 23:41:05 silvy kernel: usb 5-8: USB disconnect, address 5
Mar  4 23:41:05 silvy kernel: usb 5-8.4: USB disconnect, address 6
Mar  4 23:41:05 silvy kernel: usb 5-8.4.1: USB disconnect, address 7
Mar  4 23:41:05 silvy kernel: drivers/usb/class/usblp.c: usblp0: removed
Mar  4 23:41:05 silvy kernel: usb 5-8.4.3: USB disconnect, address 8
Mar  4 23:41:05 silvy kernel: usb 5-8.4.4: USB disconnect, address 9
Mar  4 23:41:05 silvy kernel: BUG: unable to handle kernel NULL pointer 
dereference at virtual address 000c
Mar  4 23:41:05 silvy kernel:  printing eip:
Mar  4 23:41:05 silvy kernel: c027c251
Mar  4 23:41:05 silvy kernel: *pde = 
Mar  4 23:41:05 silvy kernel: Oops:  [#1]
Mar  4 23:41:05 silvy kernel: PREEMPT 
Mar  4 23:41:05 silvy kernel: Modules linked in: usblp radeon drm nfsd exportfs lockd nfs_acl sunrpc acpi_cpufreq cpufreq_ondemand cpufreq_powersave cpufreq_userspace cpufreq_stats freq_table cpufreq_conservative ac fan button thermal video battery container processor rfcomm l2cap bluetooth cfq_iosched deflate zlib_deflate twofish twofish_common serpent blowfish des cbc ecb blkcipher aes xcbc sha256 sha1 crypto_null af_key af_packet sbp2 usbhid hid pl2303 usbserial mousedev pcmcia snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore ipw2200 ahci psmouse serio_raw pcspkr ieee80211 ieee80211_crypt sdhci mmc_core snd_page_alloc yenta_socket rsrc_nonstatic firmware_class ohci1394 ieee1394 b44 mii pcmcia_core intel_agp ehci_hcd uhci_hcd usbcore agpgart sg sr_mod cdrom unix

Mar  4 23:41:05 silvy kernel: CPU:0
Mar  4 23:41:05 silvy kernel: EIP:0060:[klist_del+6/69]Not tainted VLI
Mar  4 23:41:05 silvy kernel: EFLAGS: 00010286   (2.6.21-rc2-git3 #5)
Mar  4 23:41:05 silvy kernel: EIP is at 

Re: Linux v2.6.21-rc3

2007-03-07 Thread Thomas Gleixner
On Tue, 2007-03-06 at 20:59 -0800, Linus Torvalds wrote:
> We've finally hopefully started to put a dent in the regressions,
> especially the suspend/resume problems introduced since 2.6.20.

Still having SATA breakage on resume:

Caught that one (from screen)

ATA: abnormal status 0x7F on port 0x000118cf
irq 21: nobody cared (try booting ..)
...
Disabling IRQ #21


During normal boot I see the "ATA: abnormal status 0x7F on port
0x000118cf" once, but there the system behaves normally.

tglx




Re: Linux v2.6.21-rc3

2007-03-07 Thread Greg KH
On Wed, Mar 07, 2007 at 09:15:39AM -0500, Mark Lord wrote:
> Greg / Adrian,
>
> I didn't see anything in -rc3 to address the USB hub/serial crashes
> reported here for -rc2.  What's the status for those, or who should
> I be pinging to get them fixed?

I have a series of USB bugfixes that need to get sent to Linus that
should fix the serial issues.  I'll get to them after I drag this next
-stable release out the door...

thanks,

greg k-h


Re: Linux v2.6.21-rc3

2007-03-07 Thread Linus Torvalds


On Wed, 7 Mar 2007, Benjamin Herrenschmidt wrote:

> On Tue, 2007-03-06 at 20:59 -0800, Linus Torvalds wrote:
>
> > Linus Torvalds (2):
> >       Revert "[PATCH] LOG2: Alter get_order() so that it can make use of ilog2() on a constant"
> >       Linux 2.6.21-rc3
>
> Greg, I think we should revert that patch in 2.6.20.x stable series too
> as get_order is broken there as well, causing random kernel memory
> corruption every now and then among others.

Did you confirm that that was indeed the cause of the problem you saw?

As far as I can tell, the bug (because it tested the wrong #define) would 
only affect the constant-size case, and only for something larger than a 
single page, and only for a non-power-of-two size. So it looked fairly 
hard to trigger, if only because all the obvious constants I saw seemed 
to already be powers-of-two..
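
To make those three conditions concrete, here is a small userspace sketch. The thread does not show the faulty code, so `floor_order` below is only a stand-in that rounds down where the reference rounds up. It is an assumption chosen because it goes wrong under the same conditions, not a copy of the reverted patch. 4 KiB pages are assumed and all names are hypothetical.

```c
#include <stddef.h>

#define SK_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Reference: round up to the next power-of-two number of pages. */
int ceil_order(size_t size)
{
    int order = 0;

    for (size = (size - 1) >> SK_PAGE_SHIFT; size; size >>= 1)
        order++;
    return order;
}

/* Stand-in for a broken variant: rounds down instead of up. */
int floor_order(size_t size)
{
    int order = -1;

    for (size >>= SK_PAGE_SHIFT; size; size >>= 1)
        order++;
    return order < 0 ? 0 : order;
}

/* Non-zero when the stand-in disagrees with the reference. */
int is_affected(size_t size)
{
    return floor_order(size) != ceil_order(size);
}
```

With this stand-in, sizes up to one page (0x800, 0x1000) and power-of-two sizes above it (0x2000) agree with the reference, while sizes such as 0x1020 or 0x3000 come out one order too small.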

So did you hunt it down to a particular case where it triggers?

Linus


Re: Linux v2.6.21-rc3

2007-03-07 Thread Linus Torvalds


On Wed, 7 Mar 2007, Michal Piotrowski wrote:
 
> BTW. Does anyone care about parport console?

I do think we care, but I don't think anybody in particular feels singled 
out as a maintainer...

> console=lp0 hangs since at least 2.6.18

Ok, that's not exactly new then, which implies that not a *lot* of people 
even care ;)

Do you think you'd be willing to try to figure out when it started? You 
seem to be the first one to have even noticed.

(I tried to google it, and the most recent thing google finds is your 
report, although I also saw a report of somebody trying it under qemu in 
July last year and also reported a hang)

Looking through the history of the last few years (it's in git), I don't see
anything even *remotely* suspicious there, so it's probably either 
 (a) really old, and hasn't worked in a loong time and nobody just uses it
 (b) something really stupid that happened while doing other cleanups (but 
 the changes in the last two years are *literally* just things like 
 removing devfs support)
 (c) some infrastructure change that subtly broke lpconsole, probably 
 causing an oops during printk, which obviously results in a printk 
 itself, which thus hangs.

It would be good to get it fixed, although for obvious reasons it's not a 
huge priority..

Linus


Re: Linux v2.6.21-rc3

2007-03-07 Thread Linus Torvalds

[ Ingo and Thomas added to  Cc, because I think this is them.. ]

Ingo, I think this came in during commit 95492e4646, x86: rewrite SMP TSC 
sync code.

(Leaving the original message quoted in full for Ingo and Thomas, sorry 
for the waste of bandwidth)

Linus

---
On Wed, 7 Mar 2007, Michal Piotrowski wrote:
 
> I get this while
> echo shutdown > /sys/power/disk; echo disk > /sys/power/state
>
> BUG: using smp_processor_id() in preemptible [0001] code: swsusp_shutdown/3359
> caller is check_tsc_sync_source+0x1b/0xef
>  [<c010503d>] show_trace_log_lvl+0x1a/0x2f
>  [<c0105724>] show_trace+0x12/0x14
>  [<c01057d6>] dump_stack+0x16/0x18
>  [<c01f835e>] debug_smp_processor_id+0xa2/0xb4
>  [<c0113cc5>] check_tsc_sync_source+0x1b/0xef
>  [<c011367d>] __cpu_up+0x136/0x158
>  [<c0141aec>] _cpu_up+0x74/0xbf
>  [<c0141b5d>] cpu_up+0x26/0x38
>  [<c0141bbc>] enable_nonboot_cpus+0x4d/0x9a
>  [<c0146ae0>] pm_suspend_disk+0x11c/0x210
>  [<c014597e>] enter_state+0x50/0x1d0
>  [<c0145b84>] state_store+0x86/0x9c
>  [<c01a53d0>] subsys_attr_store+0x20/0x25
>  [<c01a54ea>] sysfs_write_file+0xc1/0xe9
>  [<c017199b>] vfs_write+0xaf/0x138
>  [<c0171f65>] sys_write+0x3d/0x61
>  [<c0104064>] syscall_call+0x7/0xb
>  ===
>
> l *check_tsc_sync_source+0x1b/0xef
> 0xc0113caa is in check_tsc_sync_source
> (/mnt/md0/devel/linux-git/arch/i386/kernel/../../x86_64/kernel/tsc_sync.c:99).
> 94  /*
> 95   * Source CPU calls into this - it waits for the freshly booted
> 96   * target CPU to arrive and then starts the measurement:
> 97   */
> 98  void __cpuinit check_tsc_sync_source(int cpu)
> 99  {
> 100 int cpus = 2;
> 101
> 102 /*
> 103  * No need to check if we already know that the TSC is not
>
> echo platform > /sys/power/disk; echo disk > /sys/power/state
> doesn't work (as always).
>
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc3/boot.log
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc3/git-config
>
> Regards,
> Michal
>
> --
> Michal K. K. Piotrowski
> LTG - Linux Testers Group (PL)
> (http://www.stardust.webpages.pl/ltg/)
> LTG - Linux Testers Group (EN)
> (http://www.stardust.webpages.pl/linux_testers_group_en/)
 


Re: Linux v2.6.21-rc3

2007-03-07 Thread Thomas Gleixner
On Wed, 2007-03-07 at 15:22 +0100, Thomas Gleixner wrote:
> On Tue, 2007-03-06 at 20:59 -0800, Linus Torvalds wrote:
> > We've finally hopefully started to put a dent in the regressions,
> > especially the suspend/resume problems introduced since 2.6.20.
>
> Still having SATA breakage on resume:
>
> Caught that one (from screen)
>
> ATA: abnormal status 0x7F on port 0x000118cf
> irq 21: nobody cared (try booting ..)
> ...
> Disabling IRQ #21
>
>
> During normal boot I see the "ATA: abnormal status 0x7F on port
> 0x000118cf" once, but there the system behaves normally.

I enabled ATA_DEBUG and hacked it to provide debug output only on
resume. Now the disk resumes and no stale interrupt happens.

Full log at: http://www.tglx.de/private/tglx/sata-2.6.21-rc3.log

Both states are fully reproducible. (DEBUG ON/OFF == GOOD/BAD)

/me continues the libata exploration

tglx




Re: [Linux-parport] Linux v2.6.21-rc3

2007-03-07 Thread Stephen Mollett
On Wednesday 07 Mar 2007, Michal Piotrowski wrote:
> BTW. Does anyone care about parport console?
> console=lp0 hangs since at least 2.6.18

For the record, I used console=lp0 quite recently (stock 2.6.19 according to 
the printout, running on i386) [to find out what was causing a panic that 
immediately vanished off the top of the screen because of atkbd.c: Spurious 
ACK...s from the flashing kb LEDs] and it worked just fine.

The parport-related lines went:

lp: driver loaded but no devices found
parport: PnPBIOS parport detected.
parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE,EPP]
parport0: Printer, EPSON Stylus COLOR 600
lp0: using parport0 (interrupt-driven)
lp0: console ready

... then the kernel continued booting until the panic occurred (it was a silly 
storage-related misconfig on my part).

If anyone wants me to try anything (newer kernel or different parport-related 
BIOS settings, perhaps, to see if I can duplicate the problem?) and report 
back, let me know.

Stephen


Re: [Linux-parport] Linux v2.6.21-rc3

2007-03-07 Thread Russell King
On Wed, Mar 07, 2007 at 05:14:21PM +, Stephen Mollett wrote:
> On Wednesday 07 Mar 2007, Michal Piotrowski wrote:
> > BTW. Does anyone care about parport console?
> > console=lp0 hangs since at least 2.6.18
>
> For the record, I used console=lp0 quite recently (stock 2.6.19 according to
> the printout, running on i386) [to find out what was causing a panic that
> immediately vanished off the top of the screen because of atkbd.c: Spurious
> ACK...s from the flashing kb LEDs] and it worked just fine.

ISTR lp consoles block indefinitely until the printer is ready, so
if you ask for a lp console but don't have a working printer connected
it will hang.

-- 
Russell King
 Linux kernel2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:


Re: Linux v2.6.21-rc3

2007-03-07 Thread Soeren Sonnenburg
On Wed, 2007-03-07 at 15:22 +0100, Thomas Gleixner wrote:
> On Tue, 2007-03-06 at 20:59 -0800, Linus Torvalds wrote:
> > We've finally hopefully started to put a dent in the regressions,
> > especially the suspend/resume problems introduced since 2.6.20.
>
> Still having SATA breakage on resume:
>
> Caught that one (from screen)
>
> ATA: abnormal status 0x7F on port 0x000118cf
> irq 21: nobody cared (try booting ..)
> ...
> Disabling IRQ #21
>
>
> During normal boot I see the "ATA: abnormal status 0x7F on port
> 0x000118cf" once, but there the system behaves normally.
>
>       tglx

maybe that is also causing the hang I am still seeing with the full
config... :(
(no display, no usb device activation, but I tend to think the mbp wants
to access the hdd...)

SCSI device sda: write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ata1.00: qc timeout (cmd 0xa1)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1: failed to recover some devices, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
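
For anyone reading the status bytes in these logs, they can be checked
against the classic ATA status register layout (BSY in bit 7 down to ERR
in bit 0). A minimal sketch, not taken from libata; the table and
function names are made up for illustration:

```python
# Classic ATA status register bits, most significant first.
# The names follow the traditional ATA layout; this table is an
# illustration and is not copied from the kernel sources.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data
    (0x02, "IDX"),   # index
    (0x01, "ERR"),   # error
]


def decode_ata_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


for value in (0x80, 0x7F):
    print(hex(value), decode_ata_status(value))
```

Read this way, 0x80 is BSY alone, which fits the "port is slow to
respond" messages, while 0x7F has every bit except BSY set at once,
which is unlikely to be a meaningful device state.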

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: Linux v2.6.21-rc3

2007-03-07 Thread Arnd Bergmann
On Wednesday 07 March 2007 16:39:00 Linus Torvalds wrote:
 So did you hunt it down to a particular case where it triggers?

IIRC, it crashed on boot in the powerpc iommu code when slab
debugging is enabled. Not sure if it was on Cell or on benh's
powerbook though.

Arnd 


Linux v2.6.21-rc3

2007-03-06 Thread Linus Torvalds

We've finally hopefully started to put a dent in the regressions, 
especially the suspend/resume problems introduced since 2.6.20.

So 2.6.21-rc3 is out there now, and there's some hope that it will work 
more widely than -rc1 and -rc2 did. Please do give it a good testing, and 
update Adrian and the mailing list (and me) about any regressions 
(hopefully many more of the "it's fixed now" kind than other kinds, but all 
regressions are interesting).

The appended shortlog gives a reasonable overview. In general we're 
definitely calming down, and most of the changes are fairly small and 
obvious fixes. 

Let's keep the fixes to a minimum, especially since I'm planning on biting 
people's heads off if I get any more pull requests for things that aren't 
real and obvious fixes. 

Linus

---

Adam Litke (1):
  Fix get_unmapped_area and fsync for hugetlb shm segments

Adrian Bunk (8):
  HID: hid-debug.c should #include <linux/hid-debug.h>
  arch/arm26/kernel/entry.S: remove dead code
  make ipc/shm.c:shm_nopage() static
  mm/{,tiny-}shmem.c cleanups
  drivers/video/sm501fb.c: make 4 functions static
  fix the SYSCTL=n compilation
  arch/i386/kernel/vmi.c must #include <asm/kmap_types.h>
  remove arch/i386/kernel/tsc.c:custom_sched_clock

Ahmed S. Darwish (1):
  KVM: Use ARRAY_SIZE macro instead of manual calculation.

Akira Iguchi (1):
  scc_pata: bugfix for checking DMA IRQ status

Alan Cox (4):
  libata-core: Fix simplex handling
  pata_qdi: Fix initialisation
  siimage: DRAC4 note
  ide: remove a ton of pointless #undef REALLY_SLOW_IO

Alexandr Andreev (1):
  [IA64] sync compat getdents

Alexey Dobriyan (1):
  geode-aes: use unsigned long for spin_lock_irqsave

Allan Graves (1):
  uml: enable RAW

Andres Salomon (3):
  i386: make x86_64 tsc header require i386 rather than vice-versa
  hrtimers: fix HRTIMER_CB_IRQSAFE_NO_SOFTIRQ description
  hrtimers: hrtimer_clock_base description typo

Andrew Morton (7):
  throttle_vm_writeout(): don't loop on GFP_NOFS and GFP_NOIO allocations
  ide: fix pmac breakage
  KVM: Move kvmfs magic number to <linux/magic.h>
  cyclades: return closing_wait
  revert "drivers/net/tulip/dmfe: support basic carrier detection"
  sis900 warning fixes
  fix build with CONFIG_NO_IDLE_HZ=n

Andrzej Zaborowski (1):
  ARM: OMAP: correct misc 15xx and non-15xx platform code

Antonino A. Daplas (2):
  MAINTAINERS: Update email address
  atyfb: Fix kconfig error

Aristeu Sergio Rozanski Filho (1):
  tty_io: fix race in master pty close/slave pty close path

Arnaldo Carvalho de Melo (1):
  [TCP]: Fix minisock tcp_create_openreq_child() typo.

Arnaud Patard (1):
  ARM: OMAP: board-nokia770: correct lcd name

Atsushi Nemoto (4):
  [MIPS] jmr3927: build fix
  [MIPS] Convert to RTC-class ds1742 driver
  [MIPS] No need to write c0_compare in plat_timer_setup
  [MIPS] TX39: Remove redundant tx39_blast_icache() calls

Avi Kivity (13):
  KVM: mmu: add missing dirty page tracking cases
  KVM: Cosmetics
  KVM: Add hypercall host support for svm
  KVM: Wire up hypercall handlers to a central arch-independent location
  KVM: svm: init cr0 with the wp bit set
  KVM: More 0 -> NULL conversions
  KVM: Add internal filesystem for generating inodes
  KVM: Create an inode per virtual machine
  KVM: Rename some kvm_dev_ioctl_*() functions to kvm_vm_ioctl_*()
  KVM: Move kvm_vm_ioctl_create_vcpu() around
  KVM: Per-vcpu inodes
  KVM: Bump API version
  KVM: Fix bogus failure in kvm.ko module initialization

Bartlomiej Zolnierkiewicz (3):
  ide: remove some obsoleted kernel params (v2)
  ide: make legacy IDE VLB modules check for the probe kernel params (v2)
  pata_pdc202xx_old: fix data corruption and other problems

Ben Dooks (2):
  [ARM] 4238/1: S3C24XX: docs: update suspend and resume
  [ARM] 4239/1: S3C24XX: Update kconfig entries for PM

Brice Goglin (1):
  myri10ge: fix copyright and license

Catalin Marinas (1):
  [ARM] 4241/1: Define mb() as compiler barrier on a uniprocessor system

Christian Krafft (1):
  ipmi: check, if default ports are accessible on PPC

Christoph Lameter (1):
  Page migration: Fix vma flag checking

Con Kolivas (1):
  sched: remove SMT nice

Cornelia Huck (3):
  [S390] cio: Fix locking when calling notify function.
  [S390] cio: Use path verification to check for path state.
  [S390] cio: Call cancel_halt_clear even when actl == 0.

Dale Farnsworth (2):
  mv643xx_eth: move mac_addr inside mv643xx_eth_platform_data
  mv643xx_eth: Place explicit port number in mv643xx_eth_platform_data

Dan Aloni (1):
  [VLAN]: Avoid a 4-order allocation.

Daniel Walker (2):
  update timekeeping_is_continuous comment
  fix vsyscall settimeofday

Dave Johnson (1):
  [MIPS] Fix __raw_read_trylock() to allow multiple readers