Re: [Bug #15192] netperf ~50% regression with 2.6.33-rc1, bisect to 1b9508f

2010-02-01 Thread Mike Galbraith
On Mon, 2010-02-01 at 01:22 +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.32.  Please verify if it still should be listed and let me know
> (either way).

Yes, it should remain open.  We're currently waiting for some data from
Lin Ming.  The regression itself doesn't make much sense: a kernel with
NEWIDLE disabled should show the same performance, but does not.

-Mike

--
To unsubscribe from this list: send the line "unsubscribe kernel-testers" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Bug #14621] specjbb2005 and aim7 regression with 2.6.32-rc kernels

2010-02-01 Thread Mike Galbraith
On Mon, 2010-02-01 at 01:43 +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
> 
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
> be listed and let me know (either way).

Yes, it should remain open.  The aim7 regression isn't reproducible here;
specjbb2005 is unknown, since it's not available to the general public.

-Mike



Re: [Bug #14897] i915: Commit 0e442c60 causes flickering

2010-02-01 Thread David John
On 02/01/2010 06:13 AM, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
> 
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
> be listed and let me know (either way).
> 
> 
> Bug-Entry        : http://bugzilla.kernel.org/show_bug.cgi?id=14897
> Subject          : i915: Commit 0e442c60 causes flickering
> Submitter        : David John
> Date             : 2009-12-09 17:26 (54 days old)
> First-Bad-Commit : http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0e442c60dd39ac6924b11a20497734bd2303744c
> References       : http://marc.info/?l=linux-kernel&m=126037889600769&w=4
> Handled-By       : David John
> Patch            : http://patchwork.kernel.org/patch/75423/
> 
> 
> 

Hi Rafael,

The patch fixing this has not been merged yet, so the bug should still
be listed.

Regards,
David.



Re: [Bug #14942] gkrellm no longer shows all the temperatures on thinkpad x60

2010-02-01 Thread Pavel Machek
> On Sun, 10 Jan 2010, Rafael J. Wysocki wrote:
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=14942
> > Subject     : gkrellm no longer shows all the temperatures on thinkpad x60
> > Submitter   : Pavel Machek
> > Date        : 2009-12-27 21:57 (15 days old)
> > References  : http://marc.info/?l=linux-kernel&m=126195107005565&w=4
> > Handled-By  : Henrique de Moraes Holschuh
> > Patch       : http://patchwork.kernel.org/patch/69809/
> 
> Waiting for Pavel to confirm whether this can be closed or not.  There could
> be more than one bug involved, and I only fixed one.

Can be closed; it seems that gkrellm removed the graphs after it
failed once, or something like that.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [Bug #15196] kmem_cache_create: duplicate cache ccid2_h

2010-02-01 Thread Neil Horman
On Sun, Jan 31, 2010 at 11:20:50PM -0800, David Miller wrote:
> From: Xiaotian Feng 
> Date: Mon, 1 Feb 2010 11:30:02 +0800
> 
> > On Mon, Feb 1, 2010 at 8:22 AM, Rafael J. Wysocki  wrote:
> >> This message has been generated automatically as a part of a report
> >> of recent regressions.
> >>
> >> The following bug entry is on the current list of known regressions
> >> from 2.6.32.  Please verify if it still should be listed and let me know
> >> (either way).
> >>
> >>
> >> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=15196
> >> Subject         : kmem_cache_create: duplicate cache ccid2_h
> >> Submitter       : Heinz Diehl 
> >> Date            : 2010-01-30 18:33 (2 days old)
> >> References      : http://marc.info/?l=linux-kernel&m=126487640324942&w=4
> > 
> > Cc'ed Neil,
> >
> > I think this one was introduced by commit
> > de4ef86cfce60d2250111f34f8a084e769f23b16, which passes
> > char *slab_name_fmt as a function parameter, while vsnprintf still
> > uses sizeof(slab_name_fmt), which is now 8 (or 4 on a 32-bit kernel)
> > instead of 32 as in the old version.
> >
> > Does the following patch resolve this bug, Heinz?
> 
> There seems to be even more to this than that.  Neil's
> patch seems to need completely reverting.
> 
> See the patch set posted by Gerrit Renker:
> 
> http://marc.info/?l=linux-netdev&m=126500585823775&w=2
> http://marc.info/?l=linux-netdev&m=126500591923880&w=2
> 


Dave, some of this doesn't make the least bit of sense to me.  I get the sizeof
error, that's clear (and I apologize, I should have seen that), but Gerrit's
revert of the dccp_probe changes is nonsensical.  I'm not sure I even follow
the comments:

>Previously (during about 4 years of this module's history) there had never
>been a problem with the 'silent dependency' that the commit tried to fix:
>this dependency is deliberate and required, since dccp_probe performs probing
>of dccp connections and hence needs to know about dccp internals.

He claims this dependency is deliberate and required, to which I agree, but he
would seem to fix that by making the dccp_probe module error out in the event
that dccp wasn't loaded.  Why bother with that?
Neil
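
For readers following along, the sizeof error discussed above can be
reproduced outside the kernel.  The sketch below is only an illustration, not
the dccp code: the helper names are made up, and the two format strings are
assumed from the cache name in the bug title.  Inside a function that receives
the buffer as a char *, sizeof yields the pointer size (8, or 4 on 32-bit), so
two distinct names are cut down to the same short prefix.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define SLAB_NAME_LEN 32	/* the "32 as old version" buffer size from the thread */

/* Buggy shape (hypothetical helper): sizeof(name) is the size of a pointer. */
static int format_name_buggy(char *name, const char *fmt, ...)
{
	size_t cap = sizeof(name);	/* 8 or 4, not SLAB_NAME_LEN */
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(name, cap, fmt, args);
	va_end(args);
	return n;
}

/* Fixed shape (hypothetical helper): the caller passes the real buffer size. */
static int format_name_fixed(char *name, size_t len, const char *fmt, ...)
{
	va_list args;
	int n;

	assert(len > 0);
	va_start(args, fmt);
	n = vsnprintf(name, len, fmt, args);
	va_end(args);
	return n;
}
```

Called with "ccid%u_hc_rx_sock" and "ccid%u_hc_tx_sock" (assumed names), the
buggy helper produces the same truncated string for both on a given machine,
which is the kind of collision a "duplicate cache" complaint would follow
from.  The fixed helper, given sizeof of the caller's array, keeps them apart.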




Re: [Bug #15196] kmem_cache_create: duplicate cache ccid2_h

2010-02-01 Thread David Miller
From: Neil Horman 
Date: Mon, 1 Feb 2010 06:55:18 -0500

> He claims this dependency is deliberate and required, to which I agree, but he
> would seem to fix that by making the dccp_probe module error out in the event
> that dccp wasn't loaded.  Why bother with that?

Neil, please get into the thread Gerrit started so he can see your
questions and responses too.

I already chided him for not CC:'ing you in the first place, guys
stop hiding from each other :-)


Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0

2010-02-01 Thread Dan Carpenter
On Sun, Jan 31, 2010 at 05:39:22PM -0800, Justin P. Mattock wrote:
> On 01/31/10 16:43, Rafael J. Wysocki wrote:
>> This message has been generated automatically as a part of a report
>> of regressions introduced between 2.6.31 and 2.6.32.
>>
>> The following bug entry is on the current list of known regressions
>> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
>> be listed and let me know (either way).
>>
>>
>> Bug-Entry  : http://bugzilla.kernel.org/show_bug.cgi?id=14487
>> Subject    : PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0
>> Submitter  : Justin P. Mattock
>> Date       : 2009-10-23 16:45 (101 days old)
>> References : http://lkml.org/lkml/2009/10/23/252
>>
>>
>>
>
>
> yeah still hitting this.
> looking at the issue if I change:
>
> @@ 260
>
> if ((class == 0xffffffff))
>   continue;
> to
>
> if ((class == 0xffffffff || 0xffffffff))
>   continue;
>

Uh... 0xffffffff is always true so basically that's the same as deleting the
if condition.

I've added the linux1394-devel people to the CC list.

Justin has found an issue that when he boots with ohci1394_dma=early his
computer crashes.

He can get it to boot by modifying drivers/ieee1394/init_ohci1394_dma.c:

init_ohci1394_dma_on_all_controllers()
   254  /* Poor man's PCI discovery, the only thing we can do at early boot */
   255  for (num = 0; num < 32; num++) {
   256          for (slot = 0; slot < 32; slot++) {
   257                  for (func = 0; func < 8; func++) {
   258                          u32 class = read_pci_config(num,slot,func,
   259                                                  PCI_CLASS_REVISION);
   260                          if ((class == 0xffffffff))
   261                                  continue; /* No device at this func */

If he continues here then his system boots.

   262
   263                          if (class>>8 != PCI_CLASS_SERIAL_FIREWIRE_OHCI)
   264                                  continue; /* Not an OHCI-1394 device */
   265
   266                          init_ohci1394_controller(num, slot, func);
   267                          break; /* Assume one controller per device */

This comment is not terribly clear btw.  The code assumes one controller per
slot.

   268                  }
   269          }
   270  }

regards,
dan carpenter
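
To illustrate the point about the modified condition: in C, == binds tighter
than ||, so the changed test is parsed as (class == X) || X, and any nonzero X
makes it true for every value read.  A minimal sketch, assuming the constant
is the all-ones value 0xffffffff; the helper names are hypothetical, and the
sentinel is passed as a parameter only so compilers do not complain about a
constant operand.

```c
#include <assert.h>
#include <stdint.h>

#define NO_DEVICE 0xffffffffu	/* assumed "nothing here" value of a config read */

/* Original test: skip only when the config read returns all ones. */
static int skip_original(uint32_t class_rev)
{
	return class_rev == NO_DEVICE;
}

/* Modified test as written: (class_rev == sentinel) || sentinel. */
static int skip_modified(uint32_t class_rev, uint32_t sentinel)
{
	return class_rev == sentinel || sentinel;
}

/* What a two-value comparison would have to look like instead. */
static int skip_either(uint32_t class_rev, uint32_t a, uint32_t b)
{
	return class_rev == a || class_rev == b;
}
```

With a nonzero sentinel, skip_modified() returns 1 even for a class/revision
word that looks like a real device, so every function is skipped.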


> I'm able to boot, but don't have enough knowledge to know
> what is really happening(or how to execute this).
> will continue looking at this
> (hopefully I get somewhere on this);
>
> Justin P. Mattock
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bug #15196] kmem_cache_create: duplicate cache ccid2_h

2010-02-01 Thread Neil Horman
On Mon, Feb 01, 2010 at 04:49:11AM -0800, David Miller wrote:
> From: Neil Horman 
> Date: Mon, 1 Feb 2010 06:55:18 -0500
> 
> > He claims this dependency is deliberate and required, to which I agree, but he
> > would seem to fix that by making the dccp_probe module error out in the event
> > that dccp wasn't loaded.  Why bother with that?
> 
> Neil, please get into the thread Gerrit started so he can see your
> questions and responses too.
> 
> I already chided him for not CC:'ing you in the first place, guys
> stop hiding from each other :-)
> 
I already posted to both of his posts (about 45 minutes ago).  I think vger is
being slow (whoever runs that system should really tune it up ;) )

Neil



[PATCH] ACPI: Remove old blacklist entries

2010-02-01 Thread Matthew Garrett
On Wed, Jan 27, 2010 at 11:35:20AM +0800, Feng Tang wrote:

> Which only enforces the "acpi_disabled" check and should have no
> logical problem.
> 
> And I guess your platform is blacklisted and acpi_disabled is set to 1,
> while it still needs to parse ACPI tables to get SMP info. So I would suggest
> adding "acpi=force" for your case.

It's actually set to force_ht, so it sounds like some blacklisting is 
now stricter than it used to be. However, given that acpi=force seems to 
work, how about just doing the following?

commit a8d9241dad684f7dda46889f00c9e627773e868e
Author: Matthew Garrett 
Date:   Mon Feb 1 09:51:44 2010 -0500

ACPI: Remove old blacklist entries

The kernel has a set of blacklist entries that disable ACPI functionality
on various machines. These all seem to date from pre-git days and most
have no indication of what they were meant to fix. Let's work on the
assumption that we've fixed whatever it was that was broken before and so
remove most of the entries.

Signed-off-by: Matthew Garrett 

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index fb1035c..086f0b6 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -1256,35 +1256,6 @@ static int __init disable_acpi_pci(const struct dmi_system_id *d)
        return 0;
 }
 
-static int __init dmi_disable_acpi(const struct dmi_system_id *d)
-{
-       if (!acpi_force) {
-               printk(KERN_NOTICE "%s detected: acpi off\n", d->ident);
-               disable_acpi();
-       } else {
-               printk(KERN_NOTICE
-                      "Warning: DMI blacklist says broken, but acpi forced\n");
-       }
-       return 0;
-}
-
-/*
- * Limit ACPI to CPU enumeration for HT
- */
-static int __init force_acpi_ht(const struct dmi_system_id *d)
-{
-       if (!acpi_force) {
-               printk(KERN_NOTICE "%s detected: force use of acpi=ht\n",
-                      d->ident);
-               disable_acpi();
-               acpi_ht = 1;
-       } else {
-               printk(KERN_NOTICE
-                      "Warning: acpi=force overrules DMI blacklist: acpi=ht\n");
-       }
-       return 0;
-}
-
 /*
  * Force ignoring BIOS IRQ0 pin2 override
  */
@@ -1308,116 +1279,6 @@ static int __init dmi_ignore_irq0_timer_override(const struct dmi_system_id *d)
  * works for you, please contact linux-a...@vger.kernel.org
  */
 static struct dmi_system_id __initdata acpi_dmi_table[] = {
-       /*
-        * Boxes that need ACPI disabled
-        */
-       {
-        .callback = dmi_disable_acpi,
-        .ident = "IBM Thinkpad",
-        .matches = {
-                    DMI_MATCH(DMI_BOARD_VENDOR, "IBM"),
-                    DMI_MATCH(DMI_BOARD_NAME, "2629H1G"),
-                    },
-        },
-
-       /*
-        * Boxes that need acpi=ht
-        */
-       {
-        .callback = force_acpi_ht,
-        .ident = "FSC Primergy T850",
-        .matches = {
-                    DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU SIEMENS"),
-                    DMI_MATCH(DMI_PRODUCT_NAME, "PRIMERGY T850"),
-                    },
-        },
-       {
-        .callback = force_acpi_ht,
-        .ident = "HP VISUALIZE NT Workstation",
-        .matches = {
-                    DMI_MATCH(DMI_BOARD_VENDOR, "Hewlett-Packard"),
-                    DMI_MATCH(DMI_PRODUCT_NAME, "HP VISUALIZE NT Workstation"),
-                    },
-        },
-       {
-        .callback = force_acpi_ht,
-        .ident = "Compaq Workstation W8000",
-        .matches = {
-                    DMI_MATCH(DMI_SYS_VENDOR, "Compaq"),
-                    DMI_MATCH(DMI_PRODUCT_NAME, "Workstation W8000"),
-                    },
-        },
-       {
-        .callback = force_acpi_ht,
-        .ident = "ASUS P2B-DS",
-        .matches = {
-                    DMI_MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
-                    DMI_MATCH(DMI_BOARD_NAME, "P2B-DS"),
-                    },
-        },
-       {
-        .callback = force_acpi_ht,
-        .ident = "ASUS CUR-DLS",
-        .matches = {
-                    DMI_MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
-                    DMI_MATCH(DMI_BOARD_NAME, "CUR-DLS"),
-                    },
-        },
-       {
-        .callback = force_acpi_ht,
-        .ident = "ABIT i440BX-W83977",
-        .matches = {
-                    DMI_MATCH(DMI_BOARD_VENDOR, "ABIT "),
-                    DMI_MATCH(DMI_BOARD_NAME, "i440BX-W83977 (BP6)"),
-                    },
-        },
-       {
-        .callback = force_acpi_ht,
-        .ident = "IBM Bladecenter",
-        .matches = {
-                    DMI_MATCH(DMI_BOARD_VENDOR, "IBM"),
-                    DMI_MATCH(DMI_BOARD_NAME, "IBM eServer BladeCenter HS20"),
-                    },
-        },
-       {
-        .callback = force_acpi_ht,
-        .ident = "IBM eServer xSeries 360",
-        .matches = {
-                    DMI_MATCH(DMI_BOARD_VENDOR, "IBM")

Re: [Bug #15196] kmem_cache_create: duplicate cache ccid2_h

2010-02-01 Thread Heinz Diehl
On 01.02.2010, Xiaotian Feng wrote: 

> Does the following patch resolve this bug, Heinz?
[...]

The patch was completely malformed, don't know what happened on the way,
but I applied it by hand. Yes, it fixes the problem for me.

Thanks,
Heinz.


Re: [Bug #14482] kernel BUG at fs/dcache.c:670 +lvm +md +ext3

2010-02-01 Thread Thomas Backlund

01.02.2010 02:43, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=14482
> Subject     : kernel BUG at fs/dcache.c:670 +lvm +md +ext3
> Submitter   : Alexander Clouter
> Date        : 2009-10-23 10:30 (101 days old)
> References  : http://lkml.org/lkml/2009/10/23/50

Afaik this is the same issue as the one referenced here:

http://lkml.org/lkml/2010/1/28/292

The patch in the above thread should fix the issue.

--
Thomas


Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

2010-02-01 Thread David John
On 02/01/2010 06:36 AM, Marcel Holtmann wrote:
> Hi Rafael,
> 
>> This message has been generated automatically as a part of a report
>> of regressions introduced between 2.6.31 and 2.6.32.
>>
>> The following bug entry is on the current list of known regressions
>> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
>> be listed and let me know (either way).
>>
>>
>> Bug-Entry        : http://bugzilla.kernel.org/show_bug.cgi?id=15127
>> Subject          : Bluetooth: sleeping function called from invalid context
>> Submitter        : David John
>> Date             : 2010-01-12 9:19 (20 days old)
>> First-Bad-Commit : http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
>> References       : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
> 
> you have an outdated email from Luiz and I have changed it to the right
> one now.
> 
> I looked with him at the patch and I think this will fix it:
> 
> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> index fc5ee32..2b50637 100644
> --- a/net/bluetooth/rfcomm/core.c
> +++ b/net/bluetooth/rfcomm/core.c
> @@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long arg)
>   BT_DBG("session %p state %ld", s, s->state);
>  
>   set_bit(RFCOMM_TIMED_OUT, &s->flags);
> - rfcomm_session_put(s);
>   rfcomm_schedule(RFCOMM_SCHED_TIMEO);
>  }
>  
> @@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
>   if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
>   s->state = BT_DISCONN;
>   rfcomm_send_disc(s, 0);
> + rfcomm_session_put(s);
>   continue;
>   }
>  
> We need some extra testing on this with the actual hardware we did the
> patch for. So this will take at least a few days before we get our hands
> on it.
> 
> Regards
> 
> Marcel
> 
> 
> 

Hi Marcel,

FWIW, your patch fixes the issue.

Regards,
David.
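
As I read the patch quoted above, it moves rfcomm_session_put() out of the
timer callback and into rfcomm_process_sessions(): the callback only marks the
session as timed out, and the session-processing loop drops the reference
later.  The following is a userspace sketch of that pattern under that
assumption, not the rfcomm code; the struct and function names are
hypothetical, and plain ints stand in for the kernel's atomic bit operations
and reference counting.

```c
#include <assert.h>

struct session {
	int timed_out;	/* stands in for the RFCOMM_TIMED_OUT flag bit */
	int refcnt;	/* stands in for the session reference count */
};

/* Timer callback: only records the timeout, does no release work here. */
static void session_timeout(struct session *s)
{
	s->timed_out = 1;
}

/* Processing loop body: returns 1 if a pending timeout was handled. */
static int process_session(struct session *s)
{
	if (!s->timed_out)
		return 0;
	s->timed_out = 0;	/* test-and-clear, as in the patch */
	s->refcnt--;		/* the put now happens in this context */
	return 1;
}
```

The reference count is untouched after session_timeout() and drops only once
process_session() has run, which is the ordering the two hunks establish.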


Re: [Bug #15125] hung task - jbd2/dm-1-8 (during raid rebuild)

2010-02-01 Thread Michael Breuer

On 1/31/2010 7:42 PM, Michael Breuer wrote:
> On 1/31/2010 7:22 PM, Rafael J. Wysocki wrote:
>> This message has been generated automatically as a part of a report
>> of recent regressions.
>>
>> The following bug entry is on the current list of known regressions
>> from 2.6.32.  Please verify if it still should be listed and let me know
>> (either way).
>>
>>
>> Bug-Entry  : http://bugzilla.kernel.org/show_bug.cgi?id=15125
>> Subject    : hung task - jbd2/dm-1-8 (during raid rebuild)
>> Submitter  : Michael Breuer
>> Date       : 2010-01-10 21:47 (22 days old)
>> References : http://marc.info/?l=linux-kernel&m=126316012025978&w=4
>
> Yup. Hit it again on 2.6.33-rc5.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

I was not able to recreate this in rc6.


Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0

2010-02-01 Thread Justin P. Mattock

On 02/01/10 04:54, Dan Carpenter wrote:

[...]

yeah I'll admit it, I don't know what I'm doing
(but am willing to try).

Thanks for the response, I'll try and
give as much info on this as possible.

Justin P. Mattock


Re: [Bug #14949] drm_vm.c:drm_mmap: possible circular locking dependency detected

2010-02-01 Thread Borislav Petkov
On Mon, Feb 1, 2010 at 1:22 AM, Rafael J. Wysocki  wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
>
> The following bug entry is on the current list of known regressions
> from 2.6.32.  Please verify if it still should be listed and let me know
> (either way).
>
>
> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=14949
> Subject         : drm_vm.c:drm_mmap: possible circular locking dependency detected
> Submitter       : Borislav Petkov
> Date            : 2009-12-26 9:45 (37 days old)
> References      : http://marc.info/?l=linux-kernel&m=126182073616279&w=4
> Handled-By      : Eric W. Biederman
> Patch           : http://patchwork.kernel.org/patch/70461/

Yes, this is fixed.

-- 
Regards/Gruss,
Boris


Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

2010-02-01 Thread Marcel Holtmann
Hi David,

> >> This message has been generated automatically as a part of a report
> >> of regressions introduced between 2.6.31 and 2.6.32.
> >>
> >> The following bug entry is on the current list of known regressions
> >> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
> >> be listed and let me know (either way).
> >>
> >>
> >> Bug-Entry        : http://bugzilla.kernel.org/show_bug.cgi?id=15127
> >> Subject          : Bluetooth: sleeping function called from invalid context
> >> Submitter        : David John
> >> Date             : 2010-01-12 9:19 (20 days old)
> >> First-Bad-Commit : http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
> >> References       : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
> > 
> > you have an outdated email from Luiz and I have changed it to the right
> > one now.
> > 
> > I looked with him at the patch and I think this will fix it:
> > 
> > diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> > index fc5ee32..2b50637 100644
> > --- a/net/bluetooth/rfcomm/core.c
> > +++ b/net/bluetooth/rfcomm/core.c
> > @@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long arg)
> > BT_DBG("session %p state %ld", s, s->state);
> >  
> > set_bit(RFCOMM_TIMED_OUT, &s->flags);
> > -   rfcomm_session_put(s);
> > rfcomm_schedule(RFCOMM_SCHED_TIMEO);
> >  }
> >  
> > @@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
> > if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
> > s->state = BT_DISCONN;
> > rfcomm_send_disc(s, 0);
> > +   rfcomm_session_put(s);
> > continue;
> > }
> >  
> > We need some extra testing on this with the actual hardware we did the
> > patch for. So this will take at least a few days before we get our hands
> > on it.
>
> FWIW, your patch fixes the issue.

nice. So I can add a tested-by line to the final patch?

Just out of curiosity, which hardware did you test this with?  We only
know about one headset that should cause this issue.

Regards

Marcel




ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

2010-02-01 Thread Stefan Richter
Justin P. Mattock wrote:
> On 02/01/10 04:54, Dan Carpenter wrote:
>> On Sun, Jan 31, 2010 at 05:39:22PM -0800, Justin P. Mattock wrote:
>>> On 01/31/10 16:43, Rafael J. Wysocki wrote:
>>>> This message has been generated automatically as a part of a report
>>>> of regressions introduced between 2.6.31 and 2.6.32.
>>>>
>>>> The following bug entry is on the current list of known regressions
>>>> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
>>>> be listed and let me know (either way).
>>>>
>>>>
>>>> Bug-Entry  : http://bugzilla.kernel.org/show_bug.cgi?id=14487
>>>> Subject    : PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0
>>>> Submitter  : Justin P. Mattock
>>>> Date       : 2009-10-23 16:45 (101 days old)
>>>> References : http://lkml.org/lkml/2009/10/23/252
[...]
>>> yeah still hitting this.
[...]
>> I've added the linux1394-devel people to the CC list.

Thanks.  Alas the original author is MIA, and the bug seems to be tied
to the early platform setup code (rather than OHCI 1394 device specific
code) about which I for one am clueless.

The listed MAINTAINERS contact of init_ohci1394_dma.c is linux1394-devel
and me, but a good deal of this driver is very x86 platform specific.
(There was some interest in making it useful for other architectures, but
this would merely mean that the respective architecture people need to
keep an eye on their parts of this driver.)

>> Justin has found an issue that when he boots with:  ohci1394_dma=early
>> his computer crashes.
>>
>> He can get it to boot by modifying drivers/ieee1394/init_ohci1394_dma.c:
[...]

This modification and some others in the LKML thread from October simply
cause init_ohci1394_controller() to be skipped for all devices.
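
To make the "skipped for all devices" effect concrete, here is a small
userspace simulation of the discovery loop quoted earlier in the thread.  It
is a sketch under stated assumptions: fake_read_class() is a hypothetical
stand-in for read_pci_config() that pretends a single OHCI-1394 function
exists at 02:03.0, the all-ones "no device" value is assumed, and the scan
only counts where init_ohci1394_controller() would have been called.

```c
#include <assert.h>
#include <stdint.h>

#define NO_DEVICE                      0xffffffffu
#define PCI_CLASS_SERIAL_FIREWIRE_OHCI 0x0c0010u	/* as in the kernel's pci_ids.h */

/* Hypothetical config space: one OHCI-1394 function at bus 2, slot 3, func 0. */
static uint32_t fake_read_class(int num, int slot, int func)
{
	if (num == 2 && slot == 3 && func == 0)
		return (PCI_CLASS_SERIAL_FIREWIRE_OHCI << 8) | 0x01u; /* class | revision */
	return NO_DEVICE;
}

/* Counts controllers that would be initialized; skip_all models the modified test. */
static int scan(int skip_all)
{
	int num, slot, func, inited = 0;

	for (num = 0; num < 32; num++) {
		for (slot = 0; slot < 32; slot++) {
			for (func = 0; func < 8; func++) {
				uint32_t class_rev = fake_read_class(num, slot, func);

				if (class_rev == NO_DEVICE || skip_all)
					continue;	/* no device at this func */
				if (class_rev >> 8 != PCI_CLASS_SERIAL_FIREWIRE_OHCI)
					continue;	/* not an OHCI-1394 device */
				inited++;	/* init_ohci1394_controller() would run here */
				break;		/* leaves only the func loop for this slot */
			}
		}
	}
	return inited;
}
```

In this simulation scan(0) finds the one controller, while scan(1) never
reaches the init call, matching the observation that the modified kernel boots
simply because the early initialization no longer happens.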

init_ohci1394_controller() is simple enough:

static inline void __init init_ohci1394_controller(int num, int slot,
                                                    int func)
{
        unsigned long ohci_base;
        struct ti_ohci ohci;

        printk(KERN_INFO "init_ohci1394_dma: initializing OHCI-1394"
                         " at %02x:%02x.%x\n", num, slot, func);

        ohci_base = read_pci_config(num, slot, func,
                        PCI_BASE_ADDRESS_0+(0<<2)) & PCI_BASE_ADDRESS_MEM_MASK;

        set_fixmap_nocache(FIX_OHCI1394_BASE, ohci_base);

        ohci.registers = (void *)fix_to_virt(FIX_OHCI1394_BASE);

        init_ohci1394_reset_and_init_dma(&ohci);
}

Justin, you already established that read_pci_config is not the point
where it crashes, right?

set_fixmap_nocache() and fix_to_virt() frighten me because I don't know
what they do. :-)

The rest, init_ohci1394_reset_and_init_dma(), is something which I can
easily follow.  There is just a bunch of register reads and writes with
occasional mdelays.  This /could/ be a cause of the crash too if the
controller is inspired to do something dangerous in there --- meaning,
if the OHCI 1394 controller starts to write something per DMA into
memory.  However, we do not switch on any DMA context except for the
so-called physical DMA unit which only springs into action if a remote
FireWire-attached console instructs it to do so.

I am noticing one point where init_ohci1394_dma.c violates the OHCI 1394
specification:  OHCI1394_HCControl_linkEnable is switched on while the
OHCI1394_ConfigROMmap register is still invalid.  This register needs to
contain a physical address of a 1kB sized, 1kB aligned memory region
which allows DMA_TO_DEVICE.  So, since this is a read-only DMA, I am
tempted to say that this potential issue should not be a cause for a
kernel crash.
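
As a small aside on the "1kB sized, 1kB aligned" requirement mentioned above:
such a constraint is usually checked with a mask on the low address bits.  The
helpers below are hypothetical and not part of init_ohci1394_dma.c; they only
show the arithmetic, assuming a 32-bit physical address.

```c
#include <assert.h>
#include <stdint.h>

#define CONFIG_ROM_SIZE 1024u	/* 1 kB, per the description above */

/* Nonzero if addr sits on a 1 kB boundary. */
static int config_rom_addr_ok(uint32_t addr)
{
	return (addr & (CONFIG_ROM_SIZE - 1)) == 0;
}

/* Rounds an arbitrary address up to the next 1 kB boundary. */
static uint32_t config_rom_align_up(uint32_t addr)
{
	return (addr + CONFIG_ROM_SIZE - 1) & ~(CONFIG_ROM_SIZE - 1);
}
```

For instance, 0x1000 passes the check while 0x1200 does not, and rounding
0x1201 up gives 0x1400.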

(Side note: the OHCI 1394 spec is freely available, see
http://ieee1394.wiki.kernel.org/index.php/Specifications#OHCI_Release_1.1.2C_January_6.2C_2000
)


Justin Mattock wrote on 2009-10-27 in http://lkml.org/lkml/2009/10/27/335:
> o.k. you should be able to view
> this:(let me know if you can't and I can
> manually write out, and in time find a public
> photo sharing suite to make things easier).
> 
> http://www.flickr.com/photos/44066...@n08/4050317695
> 
> When this happens I see lots of messages from the print
> during boot, then this happens.

(Now that a bugzilla.kernel.org ticket exists for this you can also use
bugzilla.kernel.org to publish screenshots if you have an account there.)

This screenshot looks like ___alloc_bootmem_node is the issue here, or
am I mistaken about what the order of functions in the backtrace means?
-- 
Stefan Richter
-=-==-=- --=- =
http://arcgraph.de/sr/


Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

2010-02-01 Thread Justin P. Mattock

On 02/01/10 11:57, Stefan Richter wrote:

Justin P. Mattock wrote:

On 02/01/10 04:54, Dan Carpenter wrote:

On Sun, Jan 31, 2010 at 05:39:22PM -0800, Justin P. Mattock wrote:

On 01/31/10 16:43, Rafael J. Wysocki wrote:

This message has been generated automatically as a part of a report
of regressions introduced between 2.6.31 and 2.6.32.

The following bug entry is on the current list of known regressions
introduced between 2.6.31 and 2.6.32.  Please verify if it still should
be listed and let me know (either way).


Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=14487
Subject : PANIC: early exception 08 rip 246:10 error 810251b5 cr2 0
Submitter   : Justin P. Mattock
Date: 2009-10-23 16:45 (101 days old)
References  : http://lkml.org/lkml/2009/10/23/252

[...]

yeah still hitting this.

[...]

I've added the linux1394-devel people to the CC list.


Thanks.  Alas the original author is MIA, and the bug seems to be tied
to the early platform setup code (rather than OHCI 1394 device specific
code) about which I for one am clueless.

The listed MAINTAINERS contact of init_ohci1394_dma.c is linux1394-devel
and me, but a good deal of this driver is very x86 platform specific.
(There was some interest in making it useful for other architectures, but
this would merely mean that the respective architecture people need to
keep an eye on their parts of this driver.)


Justin has found an issue that when he boots with:  ohci1394_dma=early
his computer
crashes.

He can get it to boot by modifying drivers/ieee1394/init_ohci1394_dma.c:

[...]

This modification and some others in the LKML thread from October simply
cause init_ohci1394_controller() to be skipped for all devices.

init_ohci1394_controller() is simple enough:

static inline void __init init_ohci1394_controller(int num, int slot,
						   int func)
{
	unsigned long ohci_base;
	struct ti_ohci ohci;

	printk(KERN_INFO "init_ohci1394_dma: initializing OHCI-1394"
			 " at %02x:%02x.%x\n", num, slot, func);

	ohci_base = read_pci_config(num, slot, func,
			PCI_BASE_ADDRESS_0+(0<<2)) & PCI_BASE_ADDRESS_MEM_MASK;

	set_fixmap_nocache(FIX_OHCI1394_BASE, ohci_base);

	ohci.registers = (void *)fix_to_virt(FIX_OHCI1394_BASE);

	init_ohci1394_reset_and_init_dma(&ohci);
}

Justin, you already established that read_pci_config is not the point
where it crashes, right?

set_fixmap_nocache() and fix_to_virt() frighten me because I don't know
what they do. :-)

The rest, init_ohci1394_reset_and_init_dma(), is something which I can
easily follow.  There is just a bunch of register reads and writes with
occasional mdelays.  This /could/ be a cause of the crash too if the
controller is inspired to do something dangerous in there --- meaning,
if the OHCI 1394 controller starts to write something per DMA into
memory.  However, we do not switch on any DMA context except for the
so-called physical DMA unit which only springs into action if a remote
FireWire-attached console instructs it to do so.

I am noticing one point where init_ohci1394_dma.c violates the OHCI 1394
specification:  OHCI1394_HCControl_linkEnable is switched on while the
OHCI1394_ConfigROMmap register is still invalid.  This register needs to
contain a physical address of a 1kB sized, 1kB aligned memory region
which allows DMA_TO_DEVICE.  So, since this is a read-only DMA, I am
tempted to say that this potential issue should not be a cause for a
kernel crash.

(Side note, the OHCI 1394 spec is freely available, see
http://ieee1394.wiki.kernel.org/index.php/Specifications#OHCI_Release_1.1.2C_January_6.2C_2000
)


Justin Mattock wrote on 2009-10-27 in http://lkml.org/lkml/2009/10/27/335:

o.k. you should be able to view
this:(let me know if you can't and I can
manually write out, and in time find a public
photo sharing suite to make things easier).

http://www.flickr.com/photos/44066...@n08/4050317695

When this happens I see lots of messages from the print
during boot, then this happens.


(Now that a bugzilla.kernel.org ticket exists for this you can also use
bugzilla.kernel.org to publish screenshots if you have an account there.)

This screenshot looks like ___alloc_bootmem_node is the issue here, or
am I mistaken about what the order of functions in the backtrace means?



cool, thanks for the assistance and info on this.
(I'll have to read through the specification for ohci1394);

as for __alloc_bootmem_node I have not looked into that yet.
(I can read up on this today).

what I was looking at was:
set_fixmap_nocache(FIX_OHCI1394_BASE, ohci_base);

which led me to arch/x86/include/asm/fixmap.h
leading me to believe I was hitting something with
FIXADDR_TOP because the system is a pure64.
(reading through fixmap.h there is mention that
vsyscall only covers 32bit making me think this might
be it).

and also:

init_ohci1394_reset_and_init_dma(&ohci);
(on the bugreport I have a temporary patch
which gets me up a

Re: [Bug #14943] nfs regression?

2010-02-01 Thread Nikola Ciprich
Hi Rafael,
Sorry, I haven't had time to test newer kernels lately :(, but according to
the changelogs, no related problems were fixed up to 2.6.32.7...
I'll update the problematic machine on Wednesday, though...
regards
nik
On Mon, Feb 01, 2010 at 01:43:18AM +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.31 and 2.6.32.
> 
> The following bug entry is on the current list of known regressions
> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
> be listed and let me know (either way).
> 
> 
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14943
> Subject   : nfs regression?
> Submitter : Nikola Ciprich 
> Date  : 2009-12-28 12:10 (35 days old)
> References: http://marc.info/?l=linux-kernel&m=126200276223524&w=4
> 
> 
> 

-- 
-
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:+420 596 621 273
mobil:  +420 777 093 799

www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-


Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

2010-02-01 Thread Luiz Augusto von Dentz
Hi,

On Mon, Feb 1, 2010 at 11:14 AM, Marcel Holtmann  wrote:
> Hi David,
>
>> >> This message has been generated automatically as a part of a report
>> >> of regressions introduced between 2.6.31 and 2.6.32.
>> >>
>> >> The following bug entry is on the current list of known regressions
>> >> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
>> >> be listed and let me know (either way).
>> >>
>> >>
>> >> Bug-Entry  : http://bugzilla.kernel.org/show_bug.cgi?id=15127
>> >> Subject            : Bluetooth: sleeping function called from invalid 
>> >> context
>> >> Submitter  : David John 
>> >> Date               : 2010-01-12 9:19 (20 days old)
>> >> First-Bad-Commit: 
>> >> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
>> >> References : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
>> >
>> > you have an outdated email from Luiz and I change it to the right one
>> > now.
>> >
>> > I looked with him at the patch and I think this will fix it:
>> >
>> > diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
>> > index fc5ee32..2b50637 100644
>> > --- a/net/bluetooth/rfcomm/core.c
>> > +++ b/net/bluetooth/rfcomm/core.c
>> > @@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long
>> > arg)
>> >     BT_DBG("session %p state %ld", s, s->state);
>> >
>> >     set_bit(RFCOMM_TIMED_OUT, &s->flags);
>> > -   rfcomm_session_put(s);
>> >     rfcomm_schedule(RFCOMM_SCHED_TIMEO);
>> >  }
>> >
>> > @@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
>> >             if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
>> >                     s->state = BT_DISCONN;
>> >                     rfcomm_send_disc(s, 0);
>> > +                   rfcomm_session_put(s);
>> >                     continue;
>> >             }
>> >
>> > We need some extra testing on this with the actual hardware we did the
>> > patch for. So this will take at least a few days before we get our hands
>> > on it.
>>
>> FWIW, your patch fixes the issue.
>
> nice. So I can add a tested-by line to the final patch?
>
> Just out of curiosity, which hardware did you test this with? We only
> know about one headset that should cause this issue.
>

Just in case, here is the hcidump of the Nokia HS-12W, the one that
has a problem when connection authorization is denied:

> ACL data: handle 11 flags 0x02 dlen 8
L2CAP(d): cid 0x0041 len 4 [psm 3]
  RFCOMM(s): SABM: cr 1 dlci 26 pf 1 ilen 0 fcs 0xe7
< ACL data: handle 11 flags 0x02 dlen 12
L2CAP(s): Disconn req: dcid 0x0042 scid 0x0040
< ACL data: handle 11 flags 0x02 dlen 8
L2CAP(d): cid 0x0044 len 4 [psm 3]
  RFCOMM(s): DM: cr 1 dlci 26 pf 1 ilen 0 fcs 0xcd
> HCI Event: Number of Completed Packets (0x13) plen 5
> ACL data: handle 11 flags 0x02 dlen 12
L2CAP(s): Disconn rsp: dcid 0x0042 scid 0x0040
< ACL data: handle 11 flags 0x02 dlen 8
L2CAP(d): cid 0x0044 len 4 [psm 3]
  RFCOMM(s): DISC: cr 0 dlci 0 pf 1 ilen 0 fcs 0x9c
> ACL data: handle 11 flags 0x02 dlen 8
L2CAP(d): cid 0x0041 len 4 [psm 3]
  RFCOMM(s): UA: cr 0 dlci 0 pf 1 ilen 0 fcs 0xb6
< ACL data: handle 11 flags 0x02 dlen 12
L2CAP(s): Disconn req: dcid 0x0044 scid 0x0041
> HCI Event: Number of Completed Packets (0x13) plen 5
> ACL data: handle 11 flags 0x02 dlen 12
L2CAP(s): Disconn rsp: dcid 0x0044 scid 0x0041
< HCI Command: Disconnect (0x01|0x0006) plen 3
> HCI Event: Command Status (0x0f) plen 4
> HCI Event: Disconn Complete (0x05) plen 4

So this means the patch works. DISC 0 is sent from our side (due to
the session timeout) when normally it should be the other end that
disconnects right away when we respond with DM.

-- 
Luiz Augusto von Dentz
Engenheiro de Computação


Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

2010-02-01 Thread Stefan Richter
Justin P. Mattock wrote:
> (as for yesterdays 0x(just experimenting)Google gives me
> no info on the differences between 8f's to 16f's, I was under the
> impression that it's x86_32 and x86_64 for the pci address).

As Dan noted,
(class == 0x || 0x)
is always true because it is logically the same as
(class == whatever) || true

If you really meant
class == 0x || class == 0x
then the latter half will never become true because class is declared as
u32 and got its value from read_pci_config() which also returns u32.
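
The two forms of the test can be compared in a small standalone sketch.
The constants below are assumptions chosen only for illustration (an
all-ones 32-bit value and an all-ones 64-bit value); the values used in
the original experiment are not preserved here, and the function names
are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants only; not taken from the original patch. */
#define ALL_ONES_32 UINT32_C(0xffffffff)
#define ALL_ONES_64 UINT64_C(0xffffffffffffffff)

/* "class == A || B": B is a nonzero constant, so the whole
 * expression is true no matter what class_val holds. */
static bool test_without_second_compare(uint32_t class_val)
{
	return (class_val == ALL_ONES_32) || ALL_ONES_64;
}

/* Second half of "class == A || class == B" on its own: class_val is
 * widened to 64 bits with its upper half zero, so it can never equal
 * a constant whose upper 32 bits are set. */
static bool second_half_only(uint32_t class_val)
{
	return class_val == ALL_ONES_64;
}

/* The full corrected form therefore behaves like "class == A". */
static bool test_with_second_compare(uint32_t class_val)
{
	return (class_val == ALL_ONES_32) || second_half_only(class_val);
}
```

With these assumed values, the first function returns true for every
input, while the last one is true only for the all-ones 32-bit value.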

BTW, whether a PCI device is capable of accessing 32 bit bus addresses
or also 64 bit bus addresses depends on the device, not on the CPU.
Originally, PCI only had a 32 bit addressing model.  OHCI 1394 1.0/1.1
implementations only deal with 32 bit local bus addresses.

The 'class' however is not an address but merely a register value with
24 bits width.  (Defined in the PCI Local Bus spec which is not freely
available, cited in OHCI 1394 annex A.3.)  This register is read as a 32
bits wide register from which the excess byte is later discarded.  If
all bits read are 1, the bus:slot:function is not actually populated.
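
As a rough standalone sketch of that decoding (not the driver's code):
the helper names are hypothetical, and the class code 0x0c0010 for a
FireWire OHCI controller is stated here as an assumption about the PCI
class assignment rather than something taken from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed 24-bit class code: serial bus (0x0c), FireWire (0x00),
 * OHCI programming interface (0x10). */
#define CLASS_FIREWIRE_OHCI UINT32_C(0x0c0010)

/* A read of all ones means nothing answered at bus:slot:function. */
static bool slot_is_empty(uint32_t reg)
{
	return reg == UINT32_C(0xffffffff);
}

/* The class code is the upper 24 bits; the low byte is discarded. */
static uint32_t class_code(uint32_t reg)
{
	return reg >> 8;
}

static bool is_firewire_ohci(uint32_t reg)
{
	if (slot_is_empty(reg))
		return false;
	return class_code(reg) == CLASS_FIREWIRE_OHCI;
}
```

Nothing in this check involves 64-bit quantities; the value being
tested is a 32-bit register read, whatever the CPU's word size.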
-- 
Stefan Richter
-=-==-=- --=- =
http://arcgraph.de/sr/


Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

2010-02-01 Thread Justin P. Mattock

On 02/01/10 14:27, Stefan Richter wrote:

Justin P. Mattock wrote:

(as for yesterdays 0x(just experimenting)Google gives me
no info on the differences between 8f's to 16f's, I was under the
impression that it's x86_32 and x86_64 for the pci address).


As Dan noted,
(class == 0x || 0x)
is always true because it is logically the same as
(class == whatever) || true

If you really meant
class == 0x || class == 0x


yeah that's what I was going for(just to see).


then the latter half will never become true because class is declared as
u32 and got its value from read_pci_config() which also returns u32.



That's what I was afraid of. I'm guessing there probably would be a lot 
of things to change for(if this correct) u64.



BTW, whether a PCI device is capable of accessing 32 bit bus addresses
or also 64 bit bus addresses depends on the device, not on the CPU.
Originally, PCI only had a 32 bit addressing model.  OHCI 1394 1.0/1.1
implementations only deal with 32 bit local bus addresses.


I haven't even looked at what the device was capable of doing.



The 'class' however is not an address but merely a register value with
24 bits width.  (Defined in the PCI Local Bus spec which is not freely
available, cited in OHCI 1394 annex A.3.)  This register is read as a 32
bits wide register from which the excess byte is later discarded.  If
all bits read are 1, the bus:slot:function is not actually populated.


So(correct me if I'm wrong), I'm generating a 64 bit register
and the kernel is looking for a 32 bit register causing the crash.


Justin P. Mattock




Re: [Bug #15198] Radeon KMS regression

2010-02-01 Thread Kevin Winchester
On Mon, 2010-02-01 at 01:22 +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.32.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15198
> Subject   : Radeon KMS regression
> Submitter : Kevin Winchester 
> Date  : 2010-01-30 17:18 (2 days old)
> First-Bad-Commit: 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=42590a75019a50012f25a962246498dead428433
> References: http://marc.info/?l=linux-kernel&m=126487191019612&w=4
> Handled-By: FUJITA Tomonori 
> Patch : http://patchwork.kernel.org/patch/75023/
> 
> 

This is fixed by the patch from FUJITA Tomonori - I just confirmed with
my latest build of Linus' tree (which has the patch).

Thanks,

-- 
Kevin Winchester




Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

2010-02-01 Thread David John
On 02/02/2010 12:44 AM, Marcel Holtmann wrote:
> Hi David,
> 
>>>> This message has been generated automatically as a part of a report
>>>> of regressions introduced between 2.6.31 and 2.6.32.
>>>>
>>>> The following bug entry is on the current list of known regressions
>>>> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
>>>> be listed and let me know (either way).
>>>>
>>>>
>>>> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=15127
>>>> Subject         : Bluetooth: sleeping function called from invalid context
>>>> Submitter       : David John
>>>> Date            : 2010-01-12 9:19 (20 days old)
>>>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
>>>> References      : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
>>>
>>> you have an outdated email from Luiz and I change it to the right one
>>> now.
>>>
>>> I looked with him at the patch and I think this will fix it:
>>>
>>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
>>> index fc5ee32..2b50637 100644
>>> --- a/net/bluetooth/rfcomm/core.c
>>> +++ b/net/bluetooth/rfcomm/core.c
>>> @@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long
>>> arg)
>>> BT_DBG("session %p state %ld", s, s->state);
>>>  
>>> set_bit(RFCOMM_TIMED_OUT, &s->flags);
>>> -   rfcomm_session_put(s);
>>> rfcomm_schedule(RFCOMM_SCHED_TIMEO);
>>>  }
>>>  
>>> @@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
>>> if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
>>> s->state = BT_DISCONN;
>>> rfcomm_send_disc(s, 0);
>>> +   rfcomm_session_put(s);
>>> continue;
>>> }
>>>  
>>> We need some extra testing on this with the actual hardware we did the
>>> patch for. So this will take at least a few days before we get our hands
>>> on it.
>>
>> FWIW, your patch fixes the issue.
> 
> nice. So I can add a tested-by line to the final patch?

Sure,

Tested-by: David John 

> 
> Just out of curiosity, which hardware did you test this with?

I have an inbuilt (laptop) USB Dell Wireless 365 Bluetooth Module
(413c:8160). I can send more info about the device if you want.

> We only know about one headset that should cause this issue.

That's weird. I assumed it would happen for any device, since
rfcomm_session_add is called from multiple places and it adds
rfcomm_session_timeout on a timer which will cause the trace
if the timer fires.

I could be wrong though.

Regards,
David.

>
> Regards
>
> Marcel
>
>
>


Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

2010-02-01 Thread Marcel Holtmann
Hi David,

> >>>> This message has been generated automatically as a part of a report
> >>>> of regressions introduced between 2.6.31 and 2.6.32.
> >>>>
> >>>> The following bug entry is on the current list of known regressions
> >>>> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
> >>>> be listed and let me know (either way).
> >>>>
> >>>>
> >>>> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=15127
> >>>> Subject         : Bluetooth: sleeping function called from invalid context
> >>>> Submitter       : David John
> >>>> Date            : 2010-01-12 9:19 (20 days old)
> >>>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
> >>>> References      : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
> >>>
> >>> you have an outdated email from Luiz and I change it to the right one
> >>> now.
> >>>
> >>> I looked with him at the patch and I think this will fix it:
> >>>
> >>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> >>> index fc5ee32..2b50637 100644
> >>> --- a/net/bluetooth/rfcomm/core.c
> >>> +++ b/net/bluetooth/rfcomm/core.c
> >>> @@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long
> >>> arg)
> >>>   BT_DBG("session %p state %ld", s, s->state);
> >>>  
> >>>   set_bit(RFCOMM_TIMED_OUT, &s->flags);
> >>> - rfcomm_session_put(s);
> >>>   rfcomm_schedule(RFCOMM_SCHED_TIMEO);
> >>>  }
> >>>  
> >>> @@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
> >>>   if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
> >>>   s->state = BT_DISCONN;
> >>>   rfcomm_send_disc(s, 0);
> >>> + rfcomm_session_put(s);
> >>>   continue;
> >>>   }
> >>>  
> >>> We need some extra testing on this with the actual hardware we did the
> >>> patch for. So this will take at least a few days before we get our hands
> >>> on it.
> >>
> >> FWIW, your patch fixes the issue.
> > 
> > nice. So I can add a tested-by line to the final patch?
> 
> Sure,
> 
> Tested-by: David John 
> 
> > 
> > Just out of curiosity, which hardware did you test this with?
> 
> I have an inbuilt (laptop) USB Dell Wireless 365 Bluetooth Module
> (413c:8160). I can send more info about the device if you want.

I meant which device you are connecting to. Is it a headset or another
computer?

> > We only know about one headset that should cause this issue.
> 
> That's weird. I assumed it would happen for any device, since
> rfcomm_session_add is called from multiple places and it adds
> rfcomm_session_timeout on a timer which will cause the trace
> if the timer fires.

The timer will only fire for non-behaving remote stacks. With a proper
stack following the RFCOMM specification it should never fire.
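
The idea behind the patch quoted above can be sketched in userspace C.
This is only an illustration under stated assumptions, not the RFCOMM
code: the struct, the function names and the flag bit are hypothetical,
and C11 atomics stand in for the kernel's set_bit() and
test_and_clear_bit(). The timer callback runs where sleeping is not
allowed, so it only records the timeout; the reference is dropped later
by the worker that processes sessions.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TIMED_OUT_BIT 0x1u	/* hypothetical flag bit */

struct session {
	atomic_uint flags;
	int refcnt;		/* simplified, non-atomic counter */
	bool disc_sent;
};

static void session_init(struct session *s)
{
	atomic_init(&s->flags, 0u);
	s->refcnt = 1;		/* reference held on behalf of the timer */
	s->disc_sent = false;
}

/* Timer context: must not sleep, so it only sets the flag.
 * Real code would also wake the worker thread here. */
static void session_timeout(struct session *s)
{
	atomic_fetch_or(&s->flags, TIMED_OUT_BIT);
}

/* Worker context: test and clear the flag in one atomic step, then do
 * the work that may sleep, including dropping the reference. */
static bool process_session(struct session *s)
{
	unsigned int old = atomic_fetch_and(&s->flags, ~TIMED_OUT_BIT);

	if (!(old & TIMED_OUT_BIT))
		return false;
	s->disc_sent = true;	/* stands in for sending DISC */
	s->refcnt--;		/* the put happens here, not in the timer */
	return true;
}
```

Clearing the flag atomically means one timeout is handled exactly once,
even if the worker runs again before the next timer event.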

Regards

Marcel




Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

2010-02-01 Thread Stefan Richter
Justin P. Mattock wrote:
> So(correct me if I'm wrong), I'm generating a 64 bit register
> and the kernel is looking for a 32 bit register causing the crash.

No, the class = read_pci_config(); if (class == ...) ... parts of the
code are entirely innocent as far as I can tell.  This is just the
FireWire--PCI chip detection.  It is the subsequent driver setup for the
chip that crashes somewhere.

When you modified that chip detection code earlier, you only prevented
crashes when your modifications ended up as "ignore all PCI devices,
also FireWire ones" == "do nothing at all".

Perhaps the bootup sequence of the x86(-64) platform was changed from
2.6.31 to .32 such that some assumptions in init_ohci1394_dma about which
resources are available at which point are no longer true.  According to your
screenshot in http://lkml.org/lkml/2009/10/27/335 the issue is about
memory allocation, not about PCI bus access.
-- 
Stefan Richter
-=-==-=- --=- ---=-
http://arcgraph.de/sr/


Re: [Bug #15127] Bluetooth: sleeping function called from invalid context

2010-02-01 Thread David John
On 02/02/2010 11:11 AM, Marcel Holtmann wrote:
> Hi David,
> 
>>>>>> This message has been generated automatically as a part of a report
>>>>>> of regressions introduced between 2.6.31 and 2.6.32.
>>>>>>
>>>>>> The following bug entry is on the current list of known regressions
>>>>>> introduced between 2.6.31 and 2.6.32.  Please verify if it still should
>>>>>> be listed and let me know (either way).
>>>>>>
>>>>>>
>>>>>> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=15127
>>>>>> Subject         : Bluetooth: sleeping function called from invalid context
>>>>>> Submitter       : David John
>>>>>> Date            : 2010-01-12 9:19 (20 days old)
>>>>>> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e726b17422bade75fba94e625cd35fd1353e682
>>>>>> References      : http://marc.info/?l=linux-kernel&m=126328727021949&w=4
>>>>>
>>>>> you have an outdated email from Luiz and I change it to the right one
>>>>> now.
>>>>>
>>>>> I looked with him at the patch and I think this will fix it:
>>>>>
>>>>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
>>>>> index fc5ee32..2b50637 100644
>>>>> --- a/net/bluetooth/rfcomm/core.c
>>>>> +++ b/net/bluetooth/rfcomm/core.c
>>>>> @@ -252,7 +252,6 @@ static void rfcomm_session_timeout(unsigned long arg)
>>>>>     BT_DBG("session %p state %ld", s, s->state);
>>>>>
>>>>>     set_bit(RFCOMM_TIMED_OUT, &s->flags);
>>>>> -   rfcomm_session_put(s);
>>>>>     rfcomm_schedule(RFCOMM_SCHED_TIMEO);
>>>>>  }
>>>>>
>>>>> @@ -1920,6 +1919,7 @@ static inline void rfcomm_process_sessions(void)
>>>>>             if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
>>>>>                     s->state = BT_DISCONN;
>>>>>                     rfcomm_send_disc(s, 0);
>>>>> +                   rfcomm_session_put(s);
>>>>>                     continue;
>>>>>             }
>>>>>
>>>>> We need some extra testing on this with the actual hardware we did the
>>>>> patch for. So this will take at least a few days before we get our hands
>>>>> on it.
>>>>
>>>> FWIW, your patch fixes the issue.
>>>
>>> nice. So I can add a tested-by line to the final patch?
>>
>> Sure,
>>
>> Tested-by: David John 
>>
>>>
>>> Just out of curiosity, which hardware did you test this with?
>>
>> I have an inbuilt (laptop) USB Dell Wireless 365 Bluetooth Module
>> (413c:8160). I can send more info about the device if you want.
> 
> I meant which device you are connecting to. Is it a headset or another
> computer?
> 
>>> We only know about one headset that should cause this issue.
>>
>> That's weird. I assumed it would happen for any device, since
>> rfcomm_session_add is called from multiple places and it adds
>> rfcomm_session_timeout on a timer which will cause the trace
>> if the timer fires.
> 
> The timer will only fire for non-behaving remote stacks. With a proper
> stack following the RFCOMM specification it should never fire.
> 
> Regards
> 
> Marcel
> 
> 
> 

Ah. It's a Sony Ericsson W800i phone. I noticed a new problem while
testing yesterday: Transferring a file to the phone seems to happen
correctly, but at the end of the transfer, the phone reports that the
connection was lost and I get this in the log:

btusb_bulk_complete: hci0 urb 88007a5b59c0 failed to resubmit (19)
btusb_bulk_complete: hci0 urb 880077a200c0 failed to resubmit (19)
btusb_intr_complete: hci0 urb 88007a5b5780 failed to resubmit (19)
btusb_send_frame: hci0 urb 88004db809c0 submission failed

To remove btusb, I have to shut down the laptop Bluetooth. I'll check and
see if I can reproduce and track down the issue. Note that the phone was
working okay pre 2.6.32.

Regards,
David.


Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

2010-02-01 Thread Justin P. Mattock

On 02/01/10 21:45, Stefan Richter wrote:

Justin P. Mattock wrote:

So(correct me if I'm wrong), I'm generating a 64 bit register
and the kernel is looking for a 32 bit register causing the crash.


No, the class = read_pci_config(); if (class == ...) ... parts of the
code are entirely innocent as far as I can tell.  This is just the
FireWire--PCI chip detection.  It is the subsequent driver setup for the
chip that crashes somewhere.

When you modified that chip detection code earlier, you only prevented
crashes when your modifications ended up as "ignore all PCI devices,
also FireWire ones" == "do nothing at all".

Perhaps the bootup sequence of the x86(-64) platform was changed from
2.6.31 to .32 such that some assumptions in init_ohci1394_dma about which
resources are available at which point are no longer true.  According to your
screenshot in http://lkml.org/lkml/2009/10/27/335 the issue is about
memory allocation, not about PCI bus access.



Alright.. I'll keep focus on that
and see if I can figure this out.

As for anything changed in the kernel
(2.6.31 - present), tough to say
from what I remember I had created a new fresh
lfs system using these CFLAGS:

CFLAGS="-mtune=core2 -march=core2 -O2 -pipe -fomit-frame-pointer" 
CXXFLAGS="${CFLAGS}" MAKEOPTS="{-j3}"

(without -m option gcc defaults(I think)to -m32).

which booted with ohci1394_dma=early just fine.

then decided to build another lfs system with the same CFLAGS except
added -m64 (pure64) to the build process.
(then this showed up).

What I can try is do a git revert to 2.6.29/27 to see if this thing
fires off(before going any further). if the system boots then do a bisect.

Justin P. Mattock


Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

2010-02-01 Thread Justin P. Mattock

o.k. I feel really stupid right now.
after staring at this for some time I didn't even
think to do a git revert to test other
kernel versions(duh!!).

so doing a git revert to v2.6.27 ohci1394_dma
boots up fine.
a bit late now to do a bisect, but in the morning
I'll start this and see what I get from it, then
go from there.

(man!! let this be a lesson for me);

Justin P. Mattock


Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

2010-02-01 Thread Stefan Richter
Justin P. Mattock wrote:
> As for anything changed in the kernel
> (2.6.31 - present), tough to say
> from what I remember I had created a new fresh
> lfs system using these CFLAGS:
> 
> CFLAGS="-mtune=core2 -march=core2 -O2 -pipe -fomit-frame-pointer"
> CXXFLAGS="${CFLAGS}" MAKEOPTS="{-j3}"
> (without -m option gcc defaults(I think)to -m32).
> 
> which booted with ohci1394_dma=early just fine.
> 
> then decided to build another lfs system with the same CFLAGS except
> added -m64 (pure64) to the build process.
> (then this showed up).
> 
> What I can try is do a git revert to 2.6.29/27 to see if this thing
> fires off(before going any further). if the system boots then do a bisect.

Do I understand correctly that at this moment it is only known that the
bug could be
  - *either* a 2.6.31 -> 2.6.32 regression
  - *or* an x86-64 specific bug that does not occur on x86-32,
right?

I have a Core 2 Duo based PC with x86-32 kernel and userland and an AMD
based x86-64 PC and could give ohci1394_dma=early a try on both (never
tested it myself before).  I could furthermore attempt to build and
install an x86-64 kernel on the Core 2 Duo PC but I am afraid I am far
too short of spare time for that.
-- 
Stefan Richter
-=-==-=- --=- ---=-
http://arcgraph.de/sr/


Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

2010-02-01 Thread Stefan Richter
Stefan Richter wrote:
> Do I understand correctly that at this moment it is only known that the
> bug could be
>   - *either* a 2.6.31 -> 2.6.32 regression
>   - *or* an x86-64 specific bug that does not occur on x86-32,
> right?

(OK, according to your other post it /is/ a regression, at least on
x86-64 and definitely between 2.6.27 (good) and 2.6.32 (bad).)
-- 
Stefan Richter
-=-==-=- --=- ---=-
http://arcgraph.de/sr/


Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

2010-02-01 Thread Justin P. Mattock

On 02/01/10 22:55, Stefan Richter wrote:

Justin P. Mattock wrote:

As for anything changed in the kernel
(2.6.31 - present), tough to say
from what I remember I had created a new fresh
lfs system using these CFLAGS:

CFLAGS="-mtune=core2 -march=core2 -O2 -pipe -fomit-frame-pointer"
CXXFLAGS="${CFLAGS}" MAKEOPTS="{-j3}"
(without -m option gcc defaults(I think)to -m32).

which booted with ohci1394_dma=early just fine.

then decided to build another lfs system with the same CFLAGS except
added -m64 (pure64) to the build process.
(then this showed up).

What I can try is do a git revert to 2.6.29/27 to see if this thing
fires off(before going any further). if the system boots then do a bisect.


Do I understand correctly that at this moment it is only known that the
bug could be
   - *either* a 2.6.31 ->  2.6.32 regression
   - *or* an x86-64 specific bug that does not occur on x86-32,
right?



at first I was under the impression this was an arch thing because of
building an x86_32, and then building x86_64 (and hitting this). But now,
after reverting to 2.6.27, I'm thinking otherwise. (My bad, should have
done this at first but didn't even think to.)



I have a Core 2 Duo based PC with x86-32 kernel and userland and an AMD
based x86-64 PC and could give ohci1394_dma=early a try on both (never
tested it myself before).  I could furthermore attempt to build and
install an x86-64 kernel on the Core 2 Duo PC but I am afraid I am far
too short of spare time for that.


no..
I need  to do a bisect from 2.6.27 to present to see
(just need to crash for a few hrs, then can start);
then I'll go from there.

Justin P. Mattock


Re: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)

2010-02-01 Thread Justin P. Mattock

On 02/01/10 22:57, Stefan Richter wrote:

Stefan Richter wrote:

Do I understand correctly that at this moment it is only known that the
bug could be
   - *either* a 2.6.31 ->  2.6.32 regression
   - *or* an x86-64 specific bug that does not occur on x86-32,
right?


(OK, according to your other post it /is/ a regression, at least on
x86-64 and definitely between 2.6.27 (good) and 2.6.32 (bad).)


I'll go with the bisect in the morning(late over here),
and then go from there.(just pissed at myself
for not thinking to do this at the beginning).

Justin P. Mattock