Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-19 Thread Paolo Ciarrocchi
2005/7/18, Mark Gross <[EMAIL PROTECTED]>:
> On Friday 15 July 2005 16:14, Rik van Riel wrote:
> > On Fri, 15 Jul 2005, Mark Gross wrote:
> > > What would be wrong in expecting the folks making the driver changes
> > > to have some story on how they are validating their changes don't break
> > > existing working hardware?  It could probably be accomplished in open
> > > source with subsystem testing volunteers.
> >
> > Are you volunteering ?
> 
> I am not volunteering.  That last sentence was meant to say "It could
> probably..."
> 
> I'm just poking at a process change that would include a more formal
> validation / testing phase as part of getting changes into the stable tree.  I
> don't have any silver bullets.

I totally agree with you, but the real problem is *how* to do that.
Do you have any suggestions?

-- 
Paolo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-18 Thread Mark Gross
On Friday 15 July 2005 16:14, Rik van Riel wrote:
> On Fri, 15 Jul 2005, Mark Gross wrote:
> > What would be wrong in expecting the folks making the driver changes
> > to have some story on how they are validating their changes don't break
> > existing working hardware?  It could probably be accomplished in open
> > source with subsystem testing volunteers.
>
> Are you volunteering ?

I am not volunteering.  That last sentence was meant to say "It could 
probably..."

I'm just poking at a process change that would include a more formal 
validation / testing phase as part of getting changes into the stable tree.  I 
don't have any silver bullets.

-- 
--mgross
BTW: This may or may not be the opinion of my employer, more likely not.  



Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-16 Thread Pavel Machek
Hi!

> Why can't I expect SWSusp to work better and more reliably from release to 
> release?  

Patches welcome. Or employ someone to do swsusp development for you.

> Some possible things that could help:
> 
> * Adopt a no-regressions-allowed policy: everything stops until any 
> identified regression (in performance, functionality, or stability) is fixed 
> or the changes are all rolled back.  This works really well if, in addition, 
> organized pre-flight testing is done before calling a new version number.  
> You simply cannot rely on ad hoc regression testing and reporting.  It's got 
> too much latency.

This would also mean "no development at all".

> * Assign validation folks that the developer needs to appease before changes 
> are allowed to be accepted into the tree. 

So... get me someone to test swsusp in each -rc and -mm
release... that would help. If you can't provide the manpower, why are
you whining?

Pavel
-- 
teflon -- maybe it is a trademark, but it should not be.


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread Rik van Riel
On Fri, 15 Jul 2005, Mark Gross wrote:

> What would be wrong in expecting the folks making the driver changes 
> to have some story on how they are validating their changes don't break 
> existing working hardware?  It could probably be accomplished in open 
> source with subsystem testing volunteers.

Are you volunteering ?

-- 
The Theory of Escalating Commitment: "The cost of continuing mistakes is
borne by others, while the cost of admitting mistakes is borne by yourself."
  -- Joseph Stiglitz, Nobel Laureate in Economics


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread David Lang

On Fri, 15 Jul 2005, Mark Gross wrote:

> On Thursday 14 July 2005 19:09, Dave Jones wrote:
>
> > On Fri, Jul 15, 2005 at 03:45:28AM +0200, Jesper Juhl wrote:
> > > > > The problem is the process, not the code.
> > > > > * The issues are too much ad hoc code flux without enough
> > > > > disciplined/formal regression testing and review.
> > > >
> > > > It's basically impossible to regression test swsusp except to release
> > > > it. Its success or failure depends on the exact driver
> > > > combination/platform/BIOS version etc.  E.g. all drivers have to
> > > > cooperate and the particular bugs in your BIOS need to be worked
> > > > around etc. Since that is quite fragile, regressions are common.
> > > >
> > > > However, in some other cases I agree some more regression testing
> > > > before release would be nice. But that's not how Linux works.  Linux
> > > > does regression testing after release.
> > >
> > > And who says that couldn't change?
> > >
> > > In my opinion it would be nice if Linus/Andrew had some basic
> > > regression tests they could run on kernels before releasing them.
> >
> > The problem is that this wouldn't cover the more painful problems,
> > such as hardware-specific problems.
> >
> > As Fedora kernel maintainer, I frequently get asked why people's
> > sound cards stopped working when they did an update, or why
> > their system no longer boots, usually followed by a
> > "wasn't this update tested before it was released?"
> >
> > The bulk of all the regressions I see reported every time
> > I put out a kernel update rpm that rebases to a newer
> > upstream release are in drivers. Those just aren't going
> > to be caught by folks that don't have the hardware.
>
> This problem is the developer making driver changes without having the
> resources to test the changes on enough of the hardware affected by his
> change, and therefore probably shouldn't be making changes they cannot
> realistically test.
>
> What would be wrong in expecting the folks making the driver changes to
> have some story on how they are validating their changes don't break
> existing working hardware?  It could probably be accomplished in open
> source with subsystem testing volunteers.

In that case you will have a lot of drivers that won't work because the 
rest of the kernel has changed and they haven't been changed to match.

Do you have the resources to test a few hundred network cards, video 
cards, etc.? If you do, great, hope you can help out; if not, why should you 
require other kernel folks to have resources that you don't have?


David Lang

--
There are two ways of constructing a software design. One way is to make it so 
simple that there are obviously no deficiencies. And the other way is to make 
it so complicated that there are no obvious deficiencies.
 -- C.A.R. Hoare


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread Dave Jones
On Fri, Jul 15, 2005 at 02:47:46PM -0700, Mark Gross wrote:

 > This problem is the developer making driver changes without having the
 > resources to test the changes on enough of the hardware affected by his
 > change, and therefore probably shouldn't be making changes they cannot
 > realistically test.

Such is life. The situation arises quite often where fixing a bug
for one person breaks it for another. The lack of hardware to test on
isn't the fault of the person making the change, nor the person requesting
the change. The problem is that the person it breaks for doesn't test
testing kernels, so the problem is only found out about when it's too late.

The agpgart driver for example supports around 50-60 different chipsets.
I don't have a tenth of the hardware that it supports at my disposal,
yet when I get patches fixing some problem for someone, or adding support
for yet another variant, I'm not going to go out and find the variants
I don't have.  By your metric I shouldn't apply that change.

That's not how things work.

 > What would be wrong in expecting the folks making the driver changes to
 > have some story on how they are validating their changes don't break
 > existing working hardware?

It's impractical given the plethora of hardware combinations out there.

 > It could probably be accomplished in open source with subsystem
 > testing volunteers.

People tend not to test things marked 'test kernels' or 'rc kernels'.
They prefer to shout loudly when the final release happens, and
blame it on 'the new kernel development model sucking'.

 > > The only way to cover as many combinations of hardware
 > > out there is by releasing test kernels. (Updates-testing
 > > repository for Fedora users, or -rc kernels in Linus' case).
 > > If users won't/don't test those 'test' releases, we're
 > > going to regress when the final release happens, there's
 > > no two ways about it.
 > 
 > You can't blame the users!  Don't fall into that trap.  It's not productive.

You're missing my point. The bits are out there for people to
test with.  We can't help people who won't help themselves,
and they shouldn't be at all surprised to find things breaking
if they choose to not take part in testing.

Dave



Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread Mark Gross
On Thursday 14 July 2005 19:09, Dave Jones wrote:
> On Fri, Jul 15, 2005 at 03:45:28AM +0200, Jesper Juhl wrote:
>  > > > The problem is the process, not the code.
>  > > > * The issues are too much ad hoc code flux without enough
>  > > > disciplined/formal regression testing and review.
>  > >
>  > > It's basically impossible to regression test swsusp except to release
>  > > it. Its success or failure depends on the exact driver
>  > > combination/platform/BIOS version etc.  E.g. all drivers have to
>  > > cooperate and the particular bugs in your BIOS need to be worked
>  > > around etc. Since that is quite fragile, regressions are common.
>  > >
>  > > However, in some other cases I agree some more regression testing
>  > > before release would be nice. But that's not how Linux works.  Linux
>  > > does regression testing after release.
>  >
>  > And who says that couldn't change?
>  >
>  > In my opinion it would be nice if Linus/Andrew had some basic
>  > regression tests they could run on kernels before releasing them.
>
> The problem is that this wouldn't cover the more painful problems,
> such as hardware-specific problems.
>
> As Fedora kernel maintainer, I frequently get asked why people's
> sound cards stopped working when they did an update, or why
> their system no longer boots, usually followed by a
> "wasn't this update tested before it was released?"
>
> The bulk of all the regressions I see reported every time
> I put out a kernel update rpm that rebases to a newer
> upstream release are in drivers. Those just aren't going
> to be caught by folks that don't have the hardware.

This problem is the developer making driver changes without having the resources 
to test the changes on enough of the hardware affected by his change, and 
therefore probably shouldn't be making changes they cannot realistically test.

What would be wrong in expecting the folks making the driver changes to have 
some story on how they are validating their changes don't break existing 
working hardware?  It could probably be accomplished in open source with 
subsystem testing volunteers.
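
(To make the idea concrete: a subsystem testing volunteer's check could be as
simple as diffing the driver probe messages from a known-good boot against
those from a boot of the new kernel. The sketch below is purely illustrative;
the log lines, driver names, and helper functions in it are invented for the
example, not taken from any real harness.)

```python
# Illustrative sketch only: flag drivers that initialized on a known-good
# kernel but are missing from the new kernel's boot log.  A real harness
# would feed in captured dmesg output from both boots; the sample logs
# below are made up.

def probed_drivers(boot_log):
    """Collect driver names from lines of the form 'driver: message'."""
    drivers = set()
    for line in boot_log.splitlines():
        if ":" in line:
            drivers.add(line.split(":", 1)[0].strip())
    return drivers

def find_regressions(good_log, new_log):
    """Return drivers seen in the good boot but absent from the new one."""
    return sorted(probed_drivers(good_log) - probed_drivers(new_log))

if __name__ == "__main__":
    good = ("e100: Intel PRO/100 detected\n"
            "snd_intel8x0: codec ready\n"
            "agpgart: AGP bridge found")
    new = ("e100: Intel PRO/100 detected\n"
           "agpgart: AGP bridge found")
    print(find_regressions(good, new))  # the sound driver dropped out
```

A volunteer with the hardware could run something like this after every -rc
and report only the drivers that disappeared, rather than eyeballing logs.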

>
> The only way to cover as many combinations of hardware
> out there is by releasing test kernels. (Updates-testing
> repository for Fedora users, or -rc kernels in Linus' case).
> If users won't/don't test those 'test' releases, we're
> going to regress when the final release happens, there's
> no two ways about it.

You can't blame the users!  Don't fall into that trap.  It's not productive.

>
>   Dave

-- 
--mgross
BTW: This may or may not be the opinion of my employer, more likely not.  



Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread Mark Gross
On Thursday 14 July 2005 19:16, Dave Airlie wrote:
> > That, of course, you cannot do. But, you can regression test a lot of
> > other things, and having a default test suite that is constantly being
> > added to and always being run before releases (that test hardware
> > agnostic stuff) could help cut down on the number of regressions in
> > new releases.
> > You can't test everything this way, nor should you, but you can test
> > many things, and adding a bit of formal testing to the release
> > procedure wouldn't be a bad thing IMO.
>
> But if you read people's complaints about regressions, they are nearly
> always to do with hardware that used to work not working any more:
> ALPS touchpads, sound cards, software suspend. So these people still
> gain nothing from you regression testing anything, and you still get as
> many reports. The -rc series is meant to provide the testing for the
> release so nothing really big gets through (like can't boot from IDE
> anymore or something like that).
>

I've seen large labs with lots of different systems used for dedicated testing 
of products I've worked on in the past.  The validation folks held the keys 
to the build, and if a change got in that broke on an important OEM's 
hardware, then everything stopped until that change was either fixed or backed 
out.

It ain't cheap.  In open source we are attempting to simulate this, but we 
don't simulate the control of the validation leads.
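
(That "everything stops" rule can be expressed, again purely as an
illustration with invented platform names and results, as a simple release
gate over per-platform validation results:)

```python
# Illustrative sketch of a no-regressions release gate: a build goes out
# only if every tracked platform in the validation lab still passes.
# Platform names and results here are invented for the example.

def release_gate(results):
    """results maps platform name -> True (pass) or False (regression).

    Returns (ok, blockers): ok is True only when nothing regressed;
    blockers lists platforms that must be fixed, or the change backed out.
    """
    blockers = sorted(name for name, passed in results.items() if not passed)
    return (not blockers, blockers)

if __name__ == "__main__":
    lab = {"oem-laptop-a": True, "oem-desktop-b": False, "ref-server-c": True}
    ok, blockers = release_gate(lab)
    if not ok:
        print("build blocked until fixed or backed out:", blockers)
```

The expensive part is not the gate itself but keeping the lab of real
hardware behind `results` populated and staffed.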

> Dave.

-- 
--mgross
BTW: This may or may not be the opinion of my employer, more likely not.  



Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread Mark Gross
On Thursday 14 July 2005 19:09, Andi Kleen wrote:
> > You can't test everything this way, nor should you, but you can test
> > many things, and adding a bit of formal testing to the release
> > procedure wouldn't be a bad thing IMO.
>
> In the linux model that's left to the distributions. In fact doing it
> properly takes months. You wouldn't want to wait months for a new mainline
> kernel.
>
> Formal testing is not really compatible with "release early, release often"
>

This is true.  I think we are seeing the effects of releasing more often than 
we should be into a "stable" tree.  "Early and often" makes sense for developing 
new features, but should they be pushed into a stable release so often?

> You could do things like "run LTP first", but in practice LTP rarely finds
> bugs.
>
> -Andi

-- 
--mgross
BTW: This may or may not be the opinion of my employer, more likely not.  



Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread Alan Cox
> I have always wondered how Windows got it right circa 1995 - version after 
> version, on several different kinds of hardware, and it always works reliably. 
> I have been using Linux since 1997, and not a single time have I succeeded in 
> getting it to suspend and resume reliably. 

Because Windows at the time used the APM BIOS, and the APM BIOS vendors
made sure Windows worked and generally didn't care about more. When the
vendor got it right, it worked; indeed, Linux back to 1.x will suspend to
disk nicely on an old IBM ThinkPad.




Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread Alan Cox
 I have always wondered how Windows got it right circa 1995 - Version after 
 version, several different hardwares and it always works reliably. 
 I am using Linux since 1997 and not a single time have I succeeded in getting 
 it to suspend and resume reliably. 

Because Windows at the time used the APM BIOS and the APM BIOS vendors
made sure Windows worked and generally didnt care about more. When the
vendor got it right it worked, indeed Linux back to 1.x will suspend to
disk nicely on an old IBM thinkpad.


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread Mark Gross
On Thursday 14 July 2005 19:09, Andi Kleen wrote:
  You can't test everything this way, nor should you, but you can test
  many things, and adding a bit of formal testing to the release
  procedure wouldn't be a bad thing IMO.

 In the linux model that's left to the distributions. In fact doing it
 properly takes months. You wouldn't want to wait months for a new mainline
 kernel.

 Formal testing is not really compatible with release early, release often


This is true.  I think we are seeing the effects of releasing more often than 
we should be into a stable tree.  Early and Often make sence for developing 
new features, but should they be pushed into a stable release so often?

 You could do things like run LTP first, but in practice LTP rarely finds
 bugs.

 -Andi

-- 
--mgross
BTW: This may or may not be the opinion of my employer, more likely not.  

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread Mark Gross
On Thursday 14 July 2005 19:16, Dave Airlie wrote:
  That, of course, you cannot do. But, you can regression test a lot of
  other things, and having a default test suite that is constantly being
  added to and always being run before releases (that test hardware
  agnostic stuff) could help cut down on the number of regressions in
  new releases.
  You can't test everything this way, nor should you, but you can test
  many things, and adding a bit of formal testing to the release
  procedure wouldn't be a bad thing IMO.

 But if you read peoples complaints about regression they are nearly
 always to do with hardware that used to work not working any more ..
 alps touchpads, sound cards, software suspend.. so these people still
 gain nothing by you regression testing anything so you still get as
 many reports.. the -rc series is meant to provide the testing for the
 release so nothing really big gets through (like can't boot from IDE
 anymore or something like that)


I've seen large labs of lots of different systems used for dedicated testing 
of products I've worked on in the past.  The validation folks held the keys 
to the build and if a change got in that broke on an important OEM's 
hardware, then everything stops until that change is either fixed or backed 
out.

It aint cheap.  In open source we are attempting to simulate this, but we 
don't simulate the control of the validation leads.

 Dave.

-- 
--mgross
BTW: This may or may not be the opinion of my employer, more likely not.  

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread Mark Gross
On Thursday 14 July 2005 19:09, Dave Jones wrote:
 On Fri, Jul 15, 2005 at 03:45:28AM +0200, Jesper Juhl wrote:
 The problem is the process, not than the code.
 * The issues are too much ad-hock code flux without enough
 disciplined/formal regression testing and review.
   
It's basically impossible to regression test swsusp except to release
it. Its success or failure depends on exactly the driver
combination/platform/BIOS version etc.  e.g. all drivers have to
cooperate and the particular bugs in your BIOS need to be worked
around etc. Since that is quite fragile regressions are common.
   
However in some other cases I agree some more regression testing
before release would be nice. But that's not how Linux works.  Linux
does regression testing after release.
  
   And who says that couldn't change?
  
   In my oppinion it would be nice if Linus/Andrew had some basic
   regression tests they could run on kernels before releasing them.

 The problem is that this wouldn't cover the more painful problems
 such as hardware specific problems.

 As Fedora kernel maintainer, I frequently get asked why peoples
 sound cards stopped working when they did an update, or why
 their system no longer boots, usually followed by a
 wasnt this update tested before it was released?

 The bulk of all the regressions I see reported every time
 I put out a kernel update rpm that rebases to a newer
 upstream release are in drivers. Those just aren't going
 to be caught by folks that don't have the hardware.

This problem is the developer making driver changes without have the resources 
to test the changes on a enough of the hardware effected by his change, and 
therefore probubly shouldn't be making changes they cannot realisticaly test.

What would be wrong in expecting the folks making the driver changes have some 
story on how they are validating there changes don't break existing working 
hardware?  I could probly be accomplished in open source with subsystem 
testing volenteers.


 The only way to cover as many combinations of hardware
 out there is by releasing test kernels. (Updates-testing
 repository for Fedora users, or -rc kernels in Linus' case).
 If users won't/don't test those 'test' releases, we're
 going to regress when the final release happens, there's
 no two ways about it.

You can't blame the users!  Don't fall into that trap.  Its not productive.


   Dave

-- 
--mgross
BTW: This may or may not be the opinion of my employer, more likely not.  

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread Dave Jones
On Fri, Jul 15, 2005 at 02:47:46PM -0700, Mark Gross wrote:
 
  This problem is the developer making driver changes without have the 
  resources 
  to test the changes on a enough of the hardware effected by his change, and 
  therefore probubly shouldn't be making changes they cannot realisticaly test.

Such is life. The situation arises quite often where fixing a bug
for one person breaks it for another. The lack of hardware to test on
isn't the fault of the person making the change, nor the person requesting
the change. The problem is that the person it breaks for doesn't test
testing kernels, so the problem is only found out about when its too late.

The agpgart driver for example supports around 50-60 different chipsets.
I don't have a tenth of the hardware that it supports at my disposal,
yet when I get patches fixing some problem for someone, or adding support
for yet another variant, I'm not going to go out and find the variants
I don't have.  By your metric I shouldn't apply that change.

That's not how things work.

  What would be wrong in expecting the folks making the driver changes have 
  some 
  story on how they are validating there changes don't break existing working 
  hardware?

It's impractical given the plethora of hardware combinations out there.

  I could probly be accomplished in open source with subsystem 
  testing volenteers.

People tend not to test things marked 'test kernels' or 'rc kernels'.
They prefer to shout loudly when the final release happens, and
blame it on 'the new kernel development model sucking'.

   The only way to cover as many combinations of hardware
   out there is by releasing test kernels. (Updates-testing
   repository for Fedora users, or -rc kernels in Linus' case).
   If users won't/don't test those 'test' releases, we're
   going to regress when the final release happens, there's
   no two ways about it.
  
  You can't blame the users!  Don't fall into that trap.  Its not productive.

You're missing my point. The bits are out there for people to
test with.  We can't help people who won't help themselves,
and they shouldn't be at all surprised to find things breaking
if they choose to not take part in testing.

Dave

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread David Lang

On Fri, 15 Jul 2005, Mark Gross wrote:


On Thursday 14 July 2005 19:09, Dave Jones wrote:

On Fri, Jul 15, 2005 at 03:45:28AM +0200, Jesper Juhl wrote:
 The problem is the process, not than the code.
 * The issues are too much ad-hock code flux without enough
 disciplined/formal regression testing and review.

> > > > It's basically impossible to regression test swsusp except to release
> > > > it. Its success or failure depends on exactly the driver
> > > > combination/platform/BIOS version etc.  e.g. all drivers have to
> > > > cooperate and the particular bugs in your BIOS need to be worked
> > > > around etc. Since that is quite fragile regressions are common.
> > > >
> > > > However in some other cases I agree some more regression testing
> > > > before release would be nice. But that's not how Linux works.  Linux
> > > > does regression testing after release.
> > >
> > > And who says that couldn't change?
> > >
> > > In my opinion it would be nice if Linus/Andrew had some basic
> > > regression tests they could run on kernels before releasing them.
> >
> > The problem is that this wouldn't cover the more painful problems,
> > such as hardware-specific ones.
> >
> > As Fedora kernel maintainer, I frequently get asked why people's
> > sound cards stopped working when they did an update, or why
> > their system no longer boots, usually followed by a
> > "wasn't this update tested before it was released?"
> >
> > The bulk of all the regressions I see reported every time
> > I put out a kernel update rpm that rebases to a newer
> > upstream release are in drivers. Those just aren't going
> > to be caught by folks that don't have the hardware.
>
> The problem is the developer making driver changes without having the
> resources to test the changes on enough of the hardware affected by his
> change, and therefore probably shouldn't be making changes he cannot
> realistically test.
>
> What would be wrong in expecting the folks making the driver changes have
> some story on how they are validating there changes don't break existing
> working hardware?  I could probly be accomplished in open source with
> subsystem testing volenteers.

In that case you will have a lot of drivers that won't work because the
rest of the kernel has changed and they haven't been changed to match.

Do you have the resources to test a few hundred network cards, video
cards, etc.?  If you do, great, hope you can help out; if not, why should
you require other kernel folks to have resources that you don't have?


David Lang

--
There are two ways of constructing a software design. One way is to make it so 
simple that there are obviously no deficiencies. And the other way is to make 
it so complicated that there are no obvious deficiencies.
 -- C.A.R. Hoare
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-15 Thread Rik van Riel
On Fri, 15 Jul 2005, Mark Gross wrote:

> What would be wrong in expecting the folks making the driver changes
> have some story on how they are validating there changes don't break
> existing working hardware?  I could probly be accomplished in open
> source with subsystem testing volenteers.

Are you volunteering?

-- 
The Theory of Escalating Commitment: The cost of continuing mistakes is
borne by others, while the cost of admitting mistakes is borne by yourself.
  -- Joseph Stiglitz, Nobel Laureate in Economics


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Dave Airlie
> That, of course, you cannot do. But you can regression test a lot of
> other things, and having a default test suite that is constantly being
> added to, and always run before releases (testing hardware-agnostic
> stuff), could help cut down on the number of regressions in new
> releases.
> You can't test everything this way, nor should you, but you can test
> many things, and adding a bit of formal testing to the release
> procedure wouldn't be a bad thing IMO.

But if you read people's complaints about regressions, they are nearly
always to do with hardware that used to work not working any more:
ALPS touchpads, sound cards, software suspend. So these people still
gain nothing from you regression testing anything, and you still get as
many reports. The -rc series is meant to provide the testing for the
release, so nothing really big gets through (like not being able to boot
from IDE anymore, or something like that).

Dave.


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Parag Warudkar
On Thursday 14 July 2005 20:38, Andi Kleen wrote:
> It's basically impossible to regression test swsusp except to release it.
> Its success or failure depends on exactly the driver
> combination/platform/BIOS version etc.  e.g. all drivers have to cooperate
> and the particular bugs in your BIOS need to be worked around etc. Since
> that is quite fragile regressions are common.

I have always wondered how Windows got it right circa 1995: version after
version, on several different hardware configurations, and it always works
reliably.  I have been using Linux since 1997, and not a single time have I
succeeded in getting it to suspend and resume reliably.

Is it such an uninteresting subject that it doesn't warrant serious effort,
or is there a lot of hardware documentation missing, or do the driver model
and OS design itself make it impossible to get suspend/resume right?

Parag


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Andi Kleen
On Thu, Jul 14, 2005 at 10:09:11PM -0400, Parag Warudkar wrote:
> I have always wondered how Windows got it right circa 1995: version after
> version, on several different hardware configurations, and it always works
> reliably.  I have been using Linux since 1997, and not a single time have I
> succeeded in getting it to suspend and resume reliably.

What happens with Windows is that the laptop vendor takes the
frozen Windows version available at the time the machine hits the market,
then tweaks the BIOS and the drivers until everything runs, and then
releases the machine.

But if you use newer (or older) Windows releases, or even service packs or
different drivers, on that machine, you end up with exactly the same problem.

> Is it such an uninteresting subject that it doesn't warrant serious effort,
> or is there a lot of hardware documentation missing, or do the driver model
> and OS design itself make it impossible to get suspend/resume right?

I think you underestimate the complexity of the problem. Suspend/resume
is a fragile cooperation of many, many different components in the
kernel/firmware/hardware, and all of them have to work flawlessly
together.  That's hard.

-Andi



Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Andi Kleen
> You can't test everything this way, nor should you, but you can test
> many things, and adding a bit of formal testing to the release
> procedure wouldn't be a bad thing IMO.

In the Linux model, that's left to the distributions. In fact, doing it properly
takes months. You wouldn't want to wait months for a new mainline kernel.

Formal testing is not really compatible with "release early, release often".

You could do things like "run LTP first", but in practice LTP rarely finds
bugs.

-Andi
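The "run LTP first" idea above would amount to a small pre-release gate. A minimal sketch, assuming a wrapper that is entirely hypothetical: only the `./runltp -f syscalls` invocation in the comment comes from LTP itself; the gate function and its names are invented here for illustration.

```python
# Hypothetical "run LTP first" pre-release gate.  The real LTP suite is
# driven by its runltp script (e.g. "./runltp -f syscalls"); the wrapper
# below is invented for illustration and takes the command as a parameter.
import subprocess

def pre_release_gate(version: str, test_cmd: list[str]) -> str:
    """Run the suite; refuse to tag the release if it fails."""
    result = subprocess.run(test_cmd)
    if result.returncode == 0:
        return f"gate passed: ok to release {version}"
    return f"gate failed: hold {version}"

# e.g. pre_release_gate("2.6.13-rc4", ["./runltp", "-f", "syscalls"])
```

Of course, Andi's point stands either way: a gate like this only catches what the suite can exercise, which is exactly why LTP rarely finds the driver regressions discussed in this thread.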


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Dave Jones
On Fri, Jul 15, 2005 at 03:45:28AM +0200, Jesper Juhl wrote:
 
 > > > The problem is the process, not the code.
 > > > * The issues are too much ad hoc code flux without enough
 > > > disciplined/formal regression testing and review.
 > > 
 > > It's basically impossible to regression test swsusp except to release it.
 > > Its success or failure depends on exactly the driver 
 > > combination/platform/BIOS
 > > version etc.  e.g. all drivers have to cooperate and the particular
 > > bugs in your BIOS need to be worked around etc. Since that is quite fragile
 > > regressions are common.
 > > 
 > > However in some other cases I agree some more regression testing
 > > before release would be nice. But that's not how Linux works.  Linux
 > > does regression testing after release.
 > > 
 > And who says that couldn't change?
 > 
 > In my opinion it would be nice if Linus/Andrew had some basic
 > regression tests they could run on kernels before releasing them.

The problem is that this wouldn't cover the more painful problems,
such as hardware-specific ones.

As Fedora kernel maintainer, I frequently get asked why people's
sound cards stopped working when they did an update, or why
their system no longer boots, usually followed by a
"wasn't this update tested before it was released?"

The bulk of all the regressions I see reported every time
I put out a kernel update rpm that rebases to a newer
upstream release are in drivers. Those just aren't going
to be caught by folks that don't have the hardware.

The only way to cover as many combinations of hardware
out there as possible is by releasing test kernels (the updates-testing
repository for Fedora users, or -rc kernels in Linus' case).
If users won't/don't test those 'test' releases, we're
going to regress when the final release happens; there's
no two ways about it.

Dave



Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Jesper Juhl
On 7/15/05, Chris Friesen <[EMAIL PROTECTED]> wrote:
> Jesper Juhl wrote:
> 
> > In my opinion it would be nice if Linus/Andrew had some basic
> > regression tests they could run on kernels before releasing them.
> 
> How do you regression test behaviour on broken hardware (and BIOSes)
> that you don't have?
> 
That, of course, you cannot do. But you can regression test a lot of
other things, and having a default test suite that is constantly being
added to, and always run before releases (testing hardware-agnostic
stuff), could help cut down on the number of regressions in new
releases.
You can't test everything this way, nor should you, but you can test
many things, and adding a bit of formal testing to the release
procedure wouldn't be a bad thing IMO.

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Chris Friesen

Jesper Juhl wrote:

> In my opinion it would be nice if Linus/Andrew had some basic
> regression tests they could run on kernels before releasing them.

How do you regression test behaviour on broken hardware (and BIOSes)
that you don't have?


Chris




Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Jesper Juhl
On 15 Jul 2005 02:38:58 +0200, Andi Kleen <[EMAIL PROTECTED]> wrote:
> Mark Gross <[EMAIL PROTECTED]> writes:
> >
> > The problem is the process, not the code.
> > * The issues are too much ad hoc code flux without enough
> > disciplined/formal regression testing and review.
> 
> It's basically impossible to regression test swsusp except to release it.
> Its success or failure depends on exactly the driver combination/platform/BIOS
> version etc.  e.g. all drivers have to cooperate and the particular
> bugs in your BIOS need to be worked around etc. Since that is quite fragile
> regressions are common.
> 
> However in some other cases I agree some more regression testing
> before release would be nice. But that's not how Linux works.  Linux
> does regression testing after release.
> 
And who says that couldn't change?

In my opinion it would be nice if Linus/Andrew had some basic
regression tests they could run on kernels before releasing them.
There are plenty of "Linux test" projects out there that could be
borrowed from to create some sort of regression test harness for them
to run prior to release.  It would be super nice if they had a suite
of tests to run, and could then drop a mail on lkml saying "2.6.x is
almost ready to go, but it currently fails regression tests #x, #y &
#z; we need to get those fixed before we can release".  And every time
a bug was found that could reasonably be tested for in the future, it
would be added to the regression test suite.  That would lead to more
consistent quality, I believe.
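A harness of the kind described here could be as simple as a directory of pass/fail scripts, one added per reproducible bug, plus the status mail. This is a minimal sketch; the directory layout, script convention, and function names are invented for illustration and are not taken from any existing kernel test project.

```python
# Minimal sketch of a pre-release regression harness: each reproducible bug
# gets a small pass/fail script checked into a directory, and the release
# mail can list exactly which tests still fail.  Layout and names invented.
import subprocess
from pathlib import Path

def run_regression_suite(test_dir: str) -> list[str]:
    """Run every *.sh test script in test_dir; return the names that fail."""
    failures = []
    for script in sorted(Path(test_dir).glob("*.sh")):
        result = subprocess.run(["sh", str(script)], capture_output=True)
        if result.returncode != 0:
            failures.append(script.name)
    return failures

def release_summary(version: str, failures: list[str]) -> str:
    """The kind of status mail described above."""
    if not failures:
        return f"{version}: all regression tests passed"
    return f"{version} almost ready, but currently fails: " + ", ".join(failures)
```

The interesting part is the policy, not the code: every fixed bug leaves behind a script, so the suite only ever grows.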


-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Andi Kleen
Mark Gross <[EMAIL PROTECTED]> writes:
> 
> The problem is the process, not the code.
> * The issues are too much ad hoc code flux without enough disciplined/formal
> regression testing and review.

It's basically impossible to regression test swsusp except to release it. 
Its success or failure depends on exactly the driver combination/platform/BIOS
version etc.  e.g. all drivers have to cooperate and the particular
bugs in your BIOS need to be worked around etc. Since that is quite fragile
regressions are common.

However in some other cases I agree some more regression testing
before release would be nice. But that's not how Linux works.  Linux
does regression testing after release.

-Andi


Re: Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Andrew Morton
Mark Gross <[EMAIL PROTECTED]> wrote:
>
> I know this is a broken record, but the development process within the LKML
> isn't resulting in more stable and better code.  Some process change could
> be a good thing.

We rely upon people (such as [EMAIL PROTECTED]) to send bug reports.

>  Why does my ALPS touchpad have to stop working every time I test a new
>  "STABLE" kernel?

The alps driver is always broken.  Seems to be a feature.

Please test 2.6.13-rc3 and if it also fails send a comprehensive bug report
to Dmitry Torokhov <[EMAIL PROTECTED]> and Vojtech Pavlik
<[EMAIL PROTECTED]>

>  Why does swsusp have to start hanging randomly on shutdown and startup?

swsusp also is a problematic feature.  You appear to have chosen two of the
very most problematic parts of the kernel (you missed ACPI) and then
generalised them to the whole.  That isn't valid.

Please test 2.6.13-rc3 and if it also fails send a comprehensive bug report
to Pavel Machek <[EMAIL PROTECTED]>


Why is 2.6.12.2 less stable on my laptop than 2.6.10?

2005-07-14 Thread Mark Gross
I know this is a broken record, but the development process within the LKML 
isn't resulting in more stable and better code.  Some process change could be 
a good thing.

Why does my ALPS touchpad have to stop working every time I test a new 
"STABLE" kernel?

Why does swsusp have to start hanging randomly on shutdown and startup?

I rolled my home box back to 2.6.10 because I want some stability (2.6.10 
has problems with swsusp from time to time, but it's livable for me, for now).

The process is broken if on a stable series we cannot at least make sure 
obvious regressions don't smack users between the eyes.

I see the problem as too much code flux coming from people without the 
resources, or discipline, to effectively regression test for side effects 
of their changes.

I know there is a lot of back-patting about how well the dot-dot stability 
release process is working, but that process is a solution for a different, 
simpler problem, and we still have breakage.

Stability and deliberate feature design and development, along with disciplined 
regression testing and validation, is what is needed.  Why can't there be more 
targeted and planned development?  Are we in a race to see how many changes 
we can push into a "stable" tree?

Shouldn't changes be regression tested, formally, before they're allowed to go 
into a tree?

Why can't I expect swsusp to work better and more reliably from release to 
release?

I know there is a point where software goes from fun to work, but without more 
deliberate and disciplined WORK I see the 2.6 tree spinning out of control.

The problem is the process, not the code.
* The issues are too much ad hoc code flux without enough disciplined/formal 
regression testing and review.
* Small regressions are accepted and expected to be caught later.
* Only ad hoc validation happens before changes are accepted.

Some possible things that could help:

* Adopt a no-regressions-allowed policy: everything stops until any 
identified regression (in performance, functionality, or stability) is fixed 
or the changes are all rolled back.  This works really well if, in addition, 
organized pre-flight testing is done before calling a new version number.  
You simply cannot rely on ad hoc regression testing and reporting.  It's got 
too much latency.
* Assign validation folks whom the developer needs to appease before changes 
are allowed to be accepted into the tree.
* Make all changes to the kernel be submitted not by the developers, but by 
designated subsystem validation owners.  If too many bugs continue to sneak 
by, address the problem by adding validation help to that subsystem or get a 
new owner for the problem subsystem.  (<-- I like this one a lot.)
* Start 2.7.
* All of the above.  (<-- this one is good too)
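The first option, a no-regressions-allowed gate, can be sketched concretely. Everything below (the regression record, its field names, the patch identifiers) is hypothetical, invented purely to illustrate the policy; it is not any real kernel tooling.

```python
# Illustrative sketch of a "no regressions allowed" release gate: a release
# is tagged only when no identified regression remains open; otherwise the
# changes implicated by the open regressions are the rollback candidates.
# The Regression record and its fields are invented for this example.
from dataclasses import dataclass

@dataclass
class Regression:
    summary: str            # e.g. "ALPS touchpad dead after update"
    kind: str               # "performance", "functionality", or "stability"
    introduced_by: str      # the change (commit/patch id) suspected at fault
    fixed: bool = False

def gate_release(version: str, regressions: list[Regression]) -> str:
    """Return the release decision under a no-regressions-allowed policy."""
    open_regs = [r for r in regressions if not r.fixed]
    if not open_regs:
        return f"release {version}"
    rollback = sorted({r.introduced_by for r in open_regs})
    return f"hold {version}: fix or roll back " + ", ".join(rollback)
```

The latency complaint above is visible even in the sketch: the gate is only as good as how quickly regressions get identified and tied back to the change that introduced them.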

--mgross
BTW: This may or may not be the opinion of my employer, more likely not.
