RE: [PATCH] 2.6.25-rc2-mm1 - fix mcount GPL bogosity.

2008-02-26 Thread David Schwartz

> "David Schwartz" <[EMAIL PROTECTED]> writes:

> > I don't know who told you that or why, but it's obvious nonsense,

> Correct.

> > Exports should be marked GPL if and only if they cannot be used
> > except in a derivative work. If it is possible to use them
> > without taking
> > sufficient protectable expression, they should not be marked GPL.

> This isn't very obvious to me.

It may not be obvious, but it is the design and purpose of marking exports
GPL.

> The licence doesn't talk about GPL or non-GPL exports. It doesn't
> restrict the use, only distribution of the software. One is free to
> remove _GPL from the code and distribute it anyway (except perhaps for
> some DMCA nonsense).

That's true. The DMCA doesn't prevent it, since marking symbols is *not* a
license enforcement mechanism.

> If code is a derivative work it has to be distributed (use is not
> restricted) under GPL, EXPORT_GPL or not _GPL.

Of course.

> One may say _GPL is a strong indication that all users are
> automatically derivative works, but it's only that - an indication. It
> doesn't mean they are really derivative works and it doesn't mean a
> module not using any _GPL exports isn't a derivative.

Of course. (The only people who argue otherwise are the 'linking makes a
derivative work' idiots.)

> I think introducing these _GPL symbols was a mistake in the first place.

Perhaps, since people seem to be trying to refight the same battles again.

The agreement made when the feature was added was that EXPORT_GPL was not a
license enforcement mechanism but an indication that someone believed that
any use of the symbol was possible only in a derivative work that would need
to be distributed under the GPL.

> Actually I think the _GPL exports are really harmful - somebody
> distributing a binary module may claim he/she doesn't violate the GPL
> because the module uses only non-GPL exports.

Anyone can argue anything. That would be an obviously stupid argument.
Perhaps clearer documentation might be helpful, but the GPL speaks for
itself.

> OTOH GPL symbols give
> _us_ exactly nothing.

They serve as a warning and, as a practical matter, may make it a bit more
difficult to violate the license.

DS


RE: [PATCH] 2.6.25-rc2-mm1 - fix mcount GPL bogosity.

2008-02-25 Thread David Schwartz

> The reason I added GPL is not because of some idea that this is all
> "chummy" with the kernel. But because I derived the mcount code from
> glibc's version of mcount. Now you may argue that glibc is under LGPL
> and a non-GPL export is fine. But I've been advised that if I ever take
> code from someone else, to always export it with GPL.
>
> -- Steve

I don't know who told you that or why, but it's obvious nonsense, as this
issue shows. Exports should be marked GPL if and only if they cannot be used
except in a derivative work. If it is possible to use them without taking
sufficient protectable expression, they should not be marked GPL.

This was what everyone agreed to when GPL exports were created.

DS


RE: [PATCH] USB: mark USB drivers as being GPL only

2008-02-09 Thread David Schwartz

Marcel Holtmann wrote:

> Let's phrase this in better words, as Valdis pointed out: You can't
> distribute an application (binary or source form) under anything else
> than GPL if it uses a GPL library.

This simply cannot be correct. The only way it could be true is if the work
was a derivative work of a GPL'd work. There is no other way it could become
subject to the GPL.

So this argument reduces to -- any work that uses a library is a derivative
work of that library. But this is clearly wrong. For work X to be a
derivative work of work Y, it must contain substantial protected expression
from work Y, but an application need not have any expression from the
libraries it uses.

> It makes no difference if you
> distribute the GPL library with it or not.

If you do not distribute the GPL library, the library is simply being used
in the intended, ordinary way. You do not need to agree to, nor can you
violate, the GPL simply by using a work in its ordinary intended way.

If the application contains insufficient copyrightable expression from the
library to be considered a derivative work (and purely functional things do
not count), then it cannot be a derivative work. The library is not being
copied or distributed. So how can its copyright be infringed?

> But hey (again), feel free to disagree with me here.

This argument has no basis in law or common sense. It's completely
off-the-wall.

And to Pekka Enberg:

>It doesn't matter how "hard" it was to write that code. What matters
>is whether your code requires enough copyrighted aspects of the
>original work to constitute a derived work. There's a huge difference
>between using kmalloc and spin_lock and writing a driver that is built
>on top of the full USB stack of the Linux kernel, for example.

The legal standard is not whether it "requires" copyrighted aspects but
whether it *contains* them. The driver does not contain the USB stack. The
aspects of the USB stack that the driver must contain are purely
functional -- its API.

You simply can't have it both ways. If the driver must contain X in order to
do its job, then X is functional and cannot make the driver a derivative
work. You cannot protect, by copyright, every way to accomplish a particular
function. Copyright only protects creative choices among millions of (at
least arguably) equally good choices.

DS


RE: [PATCH] USB: mark USB drivers as being GPL only

2008-02-07 Thread David Schwartz

> Don't ignore, "mere aggregation of another work not based on the Program
> with the Program (or with a work based on the Program) on a volume of a
> storage or distribution medium does not bring the other work under the
> scope of this License."  Static linking certainly makes something part
> of the whole; dynamic linking doesn't.

Actually, static linking does not, since the whole is not a "work". Under 
copyright law, a "work" can only be made by creative effort. Static linking is 
not creative effort, so it cannot create a work. If it were, the linker would 
be entitled to copyright on the new work, which makes no sense at all.

An exception might exist if there were a large number of equally good ways to 
perform the link and the person who linked it had to creatively choose a method. 
But normally, anything purely dominated by functional considerations (which 
static linking almost always is) is not considered sufficiently creative.

If you statically link work "X" to work "Y", the result is *not* work "Z", 
derivative from "X" and "Y". It is parts of work "X" and parts of work "Y" 
mechanically combined. A group of combined works follows the license for each 
of the individual works from which sufficient protectable expression has been 
taken.

A "derivative work" is a new work, and can only be formed by creative effort 
not in the works it is claimed to be derivative of.

And to Alan Cox, who wrote:

> First mistake: The GPL is not a contract it is a license.

A license is a form of contract in which part of the compensation one party 
receives is rights to the intellectual property of the other party.

>If the GPL was a contract it could most certainly impose conditions upon
>original works. Contract law permits you to write things like "If you buy the
>source for this package you agree not to write a competing product for
>three years even if an original work". 

Sure, and those things would apply to anyone who has accepted the contract. Why 
do you think the GPL couldn't say those things and enforce them against anyone 
who had agreed to the GPL?

How is agreeing to release source code any different from agreeing not to write 
a competing product? (Except that a court may be more likely to enforce the 
latter than the former, of course.)

And to Marcel:

> so how do you build this module that is not linked without using the
> Linux kernel. Hence derivative work. Hence dynamic linking at runtime of
> binary only code is violating the GPL.

When there is only one way to do it, you cannot copyright that one way. You 
need a patent for that. So, no, it's not a derivative work because what was 
taken is the one way to do it, and "one way to do it" is not protectable 
expression. A derivative work only applies when protectable expression is taken.

DS


RE: [PATCH] USB: mark USB drivers as being GPL only

2008-02-06 Thread David Schwartz

> IANAL, but when looking at the "But when you distribute the same 
> sections as part of a whole which is a work based on the Program, the 
> distribution of the whole must be on the terms of this License" of the 
> GPLv2 I would still consult a lawyer before e.g. selling a laptop with a 
> closed-source driver loaded through ndiswrapper.

If that were true, you couldn't legally install more than one program on a 
computer without specific license permission from all the copyright holders.

A "work based on the Program" is the same as a derivative work. A laptop with 
an assortment of different programs on it is not a work, it is a collection of 
works.

DS


RE: ndiswrapper and GPL-only symbols redux

2008-02-06 Thread David Schwartz

Adrian Bunk wrote:

> The Linux kernel is licenced under the GPLv2.
> 
> Ndiswrapper loads and executes code with non-GPLv2-compatible licences 
> in the kernel, in a way that might be considered similar to a GPLv2'ed 
> userspace program dlopen()ing a dynamic library file with a non-GPLv2-
> compatible licence.
> 
> IANAL, but I do think there might be real copyright issues with 
> ndiswrapper.

Neither the kernel+ndiswrapper nor the non-free driver were developed with 
knowledge of the other, so there is simply no way one could be a derivative 
work of the other. Since no creative effort is required to link them together, 
and the linked result is not fixed in a permanent medium, a derivative work 
cannot be created by the linking process itself.
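
As a userspace analogue of the dlopen() comparison quoted above, consider this
minimal sketch (the library path and entry point are hypothetical):

#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
        /* Mechanically load an independently developed binary at
         * runtime. No creative effort or combination is involved. */
        void *handle = dlopen("./vendor_driver.so", RTLD_NOW);
        if (!handle) {
                fprintf(stderr, "dlopen: %s\n", dlerror());
                return 1;
        }

        int (*driver_init)(void) =
                (int (*)(void))dlsym(handle, "driver_init");
        int rc = driver_init ? driver_init() : -1;

        dlclose(handle);
        return rc;
}

(Link with -ldl.) Whether the host program is GPL changes nothing about what
the loading step itself creates, which is exactly the point.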

In any event, even if it was, this is the normal use of ndiswrapper, and normal 
use cannot be encumbered by copyright. Otherwise, it would be unwise to color 
in a coloring book.

So if there is a possible copyright issue, I for one can't imagine what it 
could be. There simply *cannot* be a copyright issue when one merely uses a 
work in the normal, intended and expected way.

DS


RE: ndiswrapper and GPL-only symbols redux

2008-01-30 Thread David Schwartz

Combined responses to many fragmented comments in this thread. No two 
consecutive excerpts are from the same person.

> Interesting... I never heard about this `transferring ownership of a
> single copy not involving GPL'.
> 
> Note that some lawyers claim that at trade shows, you should not hand over
> a demo device running GPLed code to any interested party, as it would be
> distribution...

In the United States, 17 USC 109 specifically permits this:

"Notwithstanding the provisions of section 106 (3), the owner of a particular 
copy or phonorecord lawfully made under this title, or any person authorized by 
such owner, is entitled, without the authority of the copyright owner, to sell 
or otherwise dispose of the possession of that copy or phonorecord."

> IANAL, and I don't know about the laws in other countries, but at least 
> in Germany modifications of a copyrighted work require the permission of 
> the copyright holder.

Ah, so coloring books are illegal in Germany? Or it's just illegal to color 
them in? Or you need a special license to do so?

> IANAL, but I have serious doubts whether putting some glue layer between 
> the GPL'ed code and the code with a not GPL compatible licence is really 
> a legally effective way of circumventing the GPL.

The GPL has no power to control works that are neither GPL nor derived from GPL 
works. There is no need to circumvent situations the GPL has no business 
applying to.

This is a use of the GPL'd code. It's not a distribution and it's not a 
creative combination. It is, and should be, outside the GPL's scope.

> Read the paragraph starting with "These requirements apply to the 
> modified work as a whole." of the GPLv2.

There is no "modified work as a whole" in this case. A machine combination of 
two or more works produces those two or more works, not a work. Otherwise, the 
linker itself would be entitled to copyright on the new work, which is nonsense.

For copyright purposes, a work can only be created by creative effort. There is 
no creative effort in linking the kernel, ndiswrapper, and a Windows driver, so 
no "modified work as a whole" is created.

A linker cannot create a work because it is incapable of creative effort. If it 
cannot create a work, it cannot create a derivative work. There is no "modified 
work as a whole".

Section 2 of the GPL is about creative modifications that form a "work based on 
the Program". Only a human can do that. GPL section 2 actually makes that 
fairly clear:

"These requirements apply to the modified work as a whole.  If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works.  But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it."

Note that it is only when you distribute the "same sections" as part of a 
"whole which is a work based on the Program". So these requirements only apply 
when someone creates a single work.

DS


RE: ndiswrapper and GPL-only symbols redux

2008-01-30 Thread David Schwartz

> I wouldn't quite say that. I wasn't going to comment, but...personally,
> I actually disagree with the assertions that ndiswrapper isn't causing
> proprietary code to link against GPL functions in the kernel (how is
> an NDIS implementation any different than a shim layer provided to
> load a graphics driver?), but I wasn't trying to make that point.

By that logic, the kernel should always be tainted since it could
potentially always be linked to non-GPL code.

The ndiswrapper code is just like the kernel. It is GPL, but it could be
linked to non-free code. Any reason why ndiswrapper should be tainted would
equally well argue that any kernel with module-loading capability should be
tainted. Somebody might load a non-free module.
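
What load-time tainting actually keys off is the module's own declared
license, not what the module might later load. Roughly, simplified from the
kernel's license_is_gpl_compatible() (the exact string list may differ by
version):

#include <string.h>

static int license_is_gpl_compatible(const char *license)
{
        return strcmp(license, "GPL") == 0
            || strcmp(license, "GPL v2") == 0
            || strcmp(license, "GPL and additional rights") == 0
            || strcmp(license, "Dual BSD/GPL") == 0
            || strcmp(license, "Dual MIT/GPL") == 0
            || strcmp(license, "Dual MPL/GPL") == 0;
}

/* A module declaring MODULE_LICENSE("Proprietary") taints the kernel at
 * load time; ndiswrapper declares GPL, so loading it does not, no matter
 * what it loads afterward. */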

DS


RE: Why is the kfree() argument const?

2008-01-18 Thread David Schwartz

> On Thu, 17 Jan 2008, David Schwartz wrote:

> > Nonsense. The 'kfree' function *destroys* the object pointed to by the
> > pointer. How can you describe that as not doing anything to the object?
>
> Here's an idea. Think it through.
>
> Why don't we need write permissions to a file to unlink it?

You cannot unlink a file. Given a file, if you were to attempt to unlink it,
what directory would you remove it from?

Unlinking a file is an operation on the directory the file is in, not on the
file itself. We do need write permissions to the directory.

If you had only a const pointer to the data in the file, you should
definitely not be able to use that pointer to find a directory the file is
in and remove it without clearly indicating you know *exactly* what you're
doing. Given just that 'const' pointer, you're not supposed to be modifying
the data, and certainly not supposed to be using that pointer to modify
anything logically above it.

> Here's a hint: because unlinking doesn't *write* to it. In fact, it
> doesn't read from it either. It doesn't do any access at all to that
> object, it just *removes* it.

Right. It's an operation on the directory the file is in that might have
consequences for the file.

> Is the file gone after you unlink it? Yes (modulo refcounting for
> aliasing
> "pointers" aka filenames, but that's the same for any memory manager -
> malloc/free just doesn't have any, so you could think of it as a
> non-hardlinking filesystem).

The file is gone if and only if the directory was the only thing that needed
the file to exist. A file that is in only one directory "belongs to" that
directory. So write permission to the directory is all that is needed.

What you are arguing is essentially that you should be able to remove a file
from any directory it is in just because you have write access to the file's
data.

> So you're the one who are speaking nonsense. Making something "not exist"
> is not at all the same thing as accessing it for a write (or a read). It
> is a metadata operation that doesn't conceptually change the data in any
> way, shape or form - it just makes it go away.

Making something "not exist" is a modification operation on that thing.

> And btw, exactly as with kfree(), a unlink() may well do something like
> "disk scrubbing" for security purposes, or cancel pending writes to the
> backing store. But even though it may write (or, by undoing a pending
> write, effectively "change the state") to the disk sectors that used to
> contain the file data, ONLY AN IDIOT would call it "writing to the file".
> Because "the file" is gone. Writing to the place where the file
> used to be
> is a different thing.

I agree with you about that part. I can't understand why you keep thinking
this is where our disagreement lies when I've stated at least three times
that I agree about this. The issue has nothing to do with whether or not
'kfree' modifies the particular bytes pointed to. It has to do with whether
or not 'kfree' is the kind of operation one would normally want to allow on
a 'const' object.

> So give it up. You're wrong. Freeing a memory area is not "writing to it"
> or accessing it in *any* manner, it's an operation on another level
> entirely.

Nevertheless, it's a modification operation on an object that's not supposed
to be modified.

By the way, I did think of one argument that supports your position: Suppose
you have a reference counted object. You have a 'release reference and free
if zero' function. Should it be 'const'? If not, how can a 'lookup and
reference for read' function return a const pointer to the object?

However, on balance, I think a 'release reference and free if zero' function
that operates on a const pointer is sufficiently unusual that a cast to show
you know what you're doing is not a bad thing. In this case, you know it's
safe to destroy the object through a const pointer because you *know* nobody
gave you the 'const' pointer trusting you not to destroy the object. A cast
to show you have that special knowledge is, IMO, reasonable.
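
In code, the scenario looks something like this minimal userspace sketch (all
names are hypothetical, with free() standing in for kfree):

#include <stdlib.h>

struct entry {
        int refs;
        int data;
};

/* Hand out a read-only view; callers must not modify the entry. */
const struct entry *entry_get_for_read(struct entry *e)
{
        e->refs++;
        return e;
}

/* Release a reference, freeing on zero. Taking a non-const pointer
 * forces a caller holding only a const pointer to cast, which makes
 * the potentially destructive operation visible at the call site. */
void entry_put(struct entry *e)
{
        if (--e->refs == 0)
                free(e);
}

/* A reader that *knows* its const pointer is safe to release: */
void reader_done(const struct entry *e)
{
        entry_put((struct entry *)e);
}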

DS


RE: Why is the kfree() argument const?

2008-01-17 Thread David Schwartz

> On Thu, 17 Jan 2008, David Schwartz wrote:

> > > "const" has nothing to do with "logical state".  It has one
> > > meaning, and
> > > one meaning only: the compiler should complain if that
> > > particular type is
> > > used to do a write access.
> >
> > Right, exactly.

> So why do you complain?
>
> kfree() literally doesn't write to the object.

Because the object ceases to exist. Any modification requires write access,
whether or not that modification is literally a write.

> > You are the only one who has suggested it has anything to do
> > with changes
> > through other pointers or in other ways. So you are arguing against only
> > yourself here.

> No, I'm saying that "const" has absolutely *zero* meaning on writes to an
> object through _other_ pointers (or direct access) to the object.

Nobody disagrees with that.

> And you're seemingly not understanding that *lack* of meaning.

No, I understand that.

> kfree() doesn't do *squat* to the object pointed to by the pointer it is
> passed. It only uses it to look up its own data structures, of which the
> pointer is but a small detail.

> And those other data structures aren't constant.

Nonsense. The 'kfree' function *destroys* the object pointed to by the
pointer. How can you describe that as not doing anything to the object?

> > Nobody has said it has anything to do with anything but
> > operations through
> > that pointer.

> .. and I'm telling you: kfree() does *nothing* conceptually through that
> pointer. No writes, and not even any reads! Which is exactly why it's
> const.

It destroys the object the pointer points to. Destroying an object requires
write access to it.

> The only thing kfree does through that pointer is to update its own
> concept of what memory it has free.

That is not what it does, that is how it does it. What it does is destroy
the object.

> Now, what it does to its own free memory is just an
> implementation detail,
> and has nothing what-so-ever to do with the pointer you passed it.

I agree, except that it destroys the object the pointer points to.

> See?

I now have a much better understanding of what you're saying, but I still
think it's nonsense.

1) An operation that modifies the logical state of an object should not
normally be done through a 'const' pointer. The reason you make a pointer
'const' is to indicate that this pointer should not be used to change the
logical state of the object pointed to.

2) The 'kfree' operation changes the logical state of the object pointed to,
as the object goes from existent to non-existent.

3) It is most useful for 'kfree' to be non-const because destroying an
object through a const pointer can easily be done in error. One of the
reasons you provide a const pointer is because you need the function you
pass the pointer to not to modify the object. Since this is an unusual
operation that could be an error, it is logical to force the person doing it
to clearly indicate that he knows the pointer is const and that he knows it
is right anyway.
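
To make the two positions concrete, here is a compilable userspace sketch,
with free() standing in for kfree and a hypothetical struct (kfree() itself
is the kernel function whose 'const void *' argument started this thread):

#include <stdlib.h>

struct obj {
        int value;
};

/* Position one: freeing neither reads nor writes through the pointer,
 * so the parameter can be const and callers need no cast. */
void destroy_const(const struct obj *p)
{
        free((void *)p);        /* the one cast lives in the allocator */
}

/* Position two (argued above): destruction changes the object's
 * logical state, so the parameter should be non-const and a caller
 * holding only a const pointer must cast, flagging the operation. */
void destroy_nonconst(struct obj *p)
{
        free(p);
}

int main(void)
{
        const struct obj *p = malloc(sizeof *p);
        destroy_const(p);                   /* compiles silently */

        p = malloc(sizeof *p);
        destroy_nonconst((struct obj *)p);  /* the cast shows intent */
        return 0;
}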

I'm curious to hear how some other people on this feel. You are the first
competent coder I have *ever* heard make this argument.

By the way, I disagree with your metadata versus data argument. I would
agree that a function that changes only an object's metadata could be done
through a const pointer without needed a cast. A good example would be a
function that updates a "last time this object was read" variable.

However, *destroying* an object is not a metadata operation -- it destroys
the data as well. This is kind of a philosophical point, but an object does
not have a "does this object exist" piece of metadata. If an object does not
exist, it has no data. So destroying an object destroys the data and is thus
a write/modification operation on the data.

DS


RE: Why is the kfree() argument const?

2008-01-17 Thread David Schwartz

> On Thu, 17 Jan 2008, David Schwartz wrote:

> > No, that's not what it means. It has nothing to do with memory.
> > It has to do
> > with logical state.

> Blah. That's just your own made-up explanation of what you think "const"
> should mean. It has no logical background or any basis in the C language.

To some extent, I agree. You can use "const" for pretty much any reason.
It's just a way to say that you have a pointer and you would like an error
if certain things are done with it.

You could use it to mean anything you want it to mean. The most common use,
and the one intended, is to indicate that an object's logical state will not
be changed through that pointer.

> "const" has nothing to do with "logical state".  It has one meaning, and
> one meaning only: the compiler should complain if that particular type is
> used to do a write access.

Right, exactly.

> It says nothing at all about the "logical state of the object".
> It cannot,
> since a single object can - and does - have multiple pointers to it.

You are the only one who has suggested it has anything to do with changes
through other pointers or in other ways. So you are arguing against only
yourself here.

Nobody has said it has anything to do with anything but operations through
that pointer.

> So your standpoint not only has no relevant background to it, it's also
> not even logically consistent.

Actually, that is true of your position. On the one hand, you defend it
because kfree does not change the data. On the other hand, you claim that it
has nothing to do with whether or not the data is changed.

The normal use of "const" is to indicate that the logical state of the
object should not be changed through that pointer. The 'kfree' function
changes the logical state of the object. So, logically, 'kfree' should not
be const.

The usefulness of "const" is that you get an error if you unexpectedly
modify something you weren't expected to modify. If you are 'kfree'ing an
object that is supposed to be logically immutable, you should be made to
indicate that you are aware the object is logically immutable.

Simply put, you have to cast in any case where you mean to do something
that you want to get an error for if you do not cast. I would like to get an
error if I call 'kfree' through a const pointer, because that often is an
error. I may have a const pointer because my caller still plans to use the
object.

Honestly, I find your position bizarre.

DS


RE: Why is the kfree() argument const?

2008-01-17 Thread David Schwartz


> On Thu, 17 Jan 2008, David Schwartz wrote:

> > Which does change the thing the pointer points to. It goes from being a
> > valid object to not existing.

> No. It's the *pointer* that is no longer valid.

The pointer is no longer valid because the object it pointed to no longer
exists. Yes, the pointer itself is invalid too, but that is not the end of
the story.

> There's definitely a difference between "exists and is changed" and
> "doesn't exist any more".

> > How is ceasing to exist not a change?

> It's not a change to the data behind it, it's a change to the *metadata*.
> Which is something that "const" doesn't talk about at all.

It doesn't matter what has changed. All that matters is whether this is
something we normally want to happen to a const pointer or whether doing
this to a const pointer is not normal.

> > >Why? Because we want the types to be as tight as possible,
> > > and normal
> > >code should need as few casts as possible.
> >
> > Right, and that's why you are wrong.
>
> No, it's why I'm right.

> "kmalloc/kfree" (or any memory manager) by definition has to play games
> with pointers and do things like cast them. But the users shouldn't need
> to, not for something like this.

If you don't like having to cast, don't use 'const'. But if you use 'const',
you have to cast when you mean to do something that you would like to be
warned about if you do it by accident.

> > No, it's both correct and useful. This code is the exception to
> a rule. The
> > rule is that the object remain unchanged and this violates that rule.

> No.
>
> You are continuing to make the mistake that you think that "const" means
> that the memory behind the pointer is not going to change.

No, that's not what it means. It has nothing to do with memory. It has to do
with logical state.

> Why do you make that mistake, when it is PROVABLY NOT TRUE!

I don't. You do, because you argue 'kfree' can be const because it doesn't
change the memory. The change in the memory is meaningless, the change in
the logical state of the object is what matters.

> Try this trivial program:
>
>   int main(int argc, char **argv)
>   {
>   int i;
>   const int *c;
>
>   i = 5;
>   c = &i;
>   i = 10;
>   return *c;
>   }
>
> and realize that according to the C rules, if it returns anything but 10,
> the compiler is *buggy*.
>
> The fact is, that in spite of us having a "const int *", the data behind
> that pointer may change.

I don't know what you think this example proves. Nobody is arguing that so
long as one const pointer to an object exists, no code anywhere should ever
be able to change it.

All I'm saying is that changing the logical state of an object *through* a
const pointer is unusual. You should need a cast to do this because that's
the only way to get a warning if you do it by mistake.

> So it doesn't matter ONE WHIT if you pass in a "const *" to "kfree()": it
> does not guarantee that the data doesn't change, because the object you
> point to has other pointers pointing to it.

Right, nobody said this was about guaranteeing that data doesn't change.

> This isn't worth discussing. It's really simple: a conforming program
> CANNOT POSSIBLY TELL whether "kfree()" modified the data or not.

But that's exactly what doesn't matter. As you've said at least twice now,
it has nothing to do with changing the data. It has to do with changing the
logical state of the object. That's what you're not supposed to do through a
'const' pointer.

> As such,
> AS FAR AS THE PROGRAM IS CONCERNED, kfree() takes a const
> pointer, and the
> rule that "if it can be considered const, it should be marked
> const" comes
> and says that kfree() should take a const pointer.

That's crazy.

> In other words - anything that could ever disagree with "const *" is BY
> DEFINITION buggy.
>
> It really is that simple.

I think you may be the only person in the world who thinks so.

DS


RE: Why is the kfree() argument const?

2008-01-17 Thread David Schwartz


 On Thu, 17 Jan 2008, David Schwartz wrote:

  Which does change the thing the pointer points to. It goes from being a
  valid object to not existing.

 No. It's the *pointer* that is no longer valid.

The pointer is no longer valid because the object it pointed to no longer
exists. The pointer is also no longer valid, but that is not the end of the
story.

 There's definitely a difference between exists and is changed and
 doesn't exist any more.

  How is ceasing to exist not a change?

 It's not a change to the data behind it, it's a change to the *metadata*.
 Which is somethign that const doesn't talk about at all.

It doesn't matter what has changed. All that matters is whether this is
something we normally want to happen to a const pointer or whether doing
this to a const pointer is not normal.

  Why? Because we want the types to be as tight as possible,
   and normal
  code should need as few casts as possible.
 
  Right, and that's why you are wrong.

 No, it's why I'm right.

 kmalloc/kfree (or any memory manager) by definition has to play games
 with pointers and do things like cast them. But the users shouldn't need
 to, not for something like this.

If you don't like having to cast, don't use 'const'. But if you use 'const',
you have to cast when you mean to do something that you would like to be
warned about if you do it by accident.

  No, it's both correct and useful. This code is the exception to
 a rule. The
  rule is that the object remain unchanged and this violates that rule.

 No.

 You are continuing to make the mistake that you think that const means
 that the memory behind the pointer is not going to change.

No, that's not what it means. It has nothing to do with memory. It has to do
with logical state.

 Why do you make that mistake, when it is PROVABLY NOT TRUE!

I don't. You do, because you argue 'kfree' can be const because it doesn't
change the memory. The change in the memory is meaningless, the change in
the logical state of the object is what matters.

 Try this trivial program:

   int main(int argc, char **argv)
   {
   int i;
   const int *c;

   i = 5;
   c = i;
   i = 10;
   return *c;
   }

 and realize that according to the C rules, if it returns anything but 10,
 the compiler is *buggy*.

 The fact is, that in spite of us having a const int *, the data behind
 that pointer may change.

I don't know what you think this example proves. Nobody is arguing that so
long as one const pointer to an object exists, no code anywhere should ever
be able to change it.

All I'm saying is that changing the logical state of an object *through* a
const pointer is unusual. You should need a cast to do this because that's
the only way to get a warning if you do it by mistake.

 So it doesn't matter ONE WHIT if you pass in a const * to kfree(): it
 does not guarantee that the data doesn't change, because the object you
 point to has other pointers pointing to it.

Right, nobody said this was about guaranteeing that data doesn't change.

 This isn't worth discussing. It's really simple: a conforming program
 CANNOT POSSIBLY TELL whether kfree() modified the data or not.

But that's exactly what doesn't matter. As you've said at least twice now,
it has nothing to do with changing the data. It has to do with changing the
logical state of the object. That's what you're not supposed to do through a
'const' pointer.

 As such,
 AS FAR AS THE PROGRAM IS CONCERNED, kfree() takes a const
 pointer, and the
 rule that if it can be considered const, it should be marked
 const comes
 and says that kfree() should take a const pointer.

That's crazy.

 In other words - anythign that could ever disagree with const * is BY
 DEFINITION buggy.

 It really is that simple.

I think you may be the only person in the world who thinks so.

DS


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Why is the kfree() argument const?

2008-01-17 Thread David Schwartz

> On Thu, 17 Jan 2008, David Schwartz wrote:

> > No, that's not what it means. It has nothing to do with memory. It has
> > to do with logical state.

> Blah. That's just your own made-up explanation of what you think const
> should mean. It has no logical background or any basis in the C language.

To some extent, I agree. You can use const for pretty much any reason.
It's just a way to say that you have a pointer and you would like an error
if certain things are done with it.

You could use it to mean anything you want it to mean. The most common use,
and the one intended, is to indicate that an object's logical state will not
be changed through that pointer.

> const has nothing to do with logical state.  It has one meaning, and
> one meaning only: the compiler should complain if that particular type is
> used to do a write access.

Right, exactly.

> It says nothing at all about the logical state of the object. It cannot,
> since a single object can - and does - have multiple pointers to it.

You are the only one who has suggested it has anything to do with changes
through other pointers or in other ways. So you are arguing against only
yourself here.

Nobody has said it has anything to do with anything but operations through
that pointer.

> So your standpoint not only has no relevant background to it, it's also
> not even logically consistent.

Actually, that is true of your position. On the one hand, you defend it
because kfree does not change the data. On the other hand, you claim that it
has nothing to do with whether or not the data is changed.

The normal use of const is to indicate that the logical state of the
object should not be changed through that pointer. The 'kfree' function
changes the logical state of the object. So, logically, 'kfree' should not
be const.

The usefulness of const is that you get an error if you unexpectedly
modify something you weren't expected to modify. If you are 'kfree'ing an
object that is supposed to be logically immutable, you should be made to
indicate that you are aware the object is logically immutable.

Simply put, you have to cast in any case where you mean to do something
that you want an error for if you do not cast. I would like to get an
error if I call 'kfree' through a const pointer, because that often is an
error. I may have a const pointer because my caller still plans to use the
object.

Honestly, I find your position bizarre.

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Why is the kfree() argument const?

2008-01-17 Thread David Schwartz

> On Thu, 17 Jan 2008, David Schwartz wrote:

> > > const has nothing to do with logical state.  It has one meaning, and
> > > one meaning only: the compiler should complain if that particular type
> > > is used to do a write access.

> > Right, exactly.

> So why do you complain?

> kfree() literally doesn't write to the object.

Because the object ceases to exist. However, any modification requires write
access, whether or not that modification is implemented as a literal write.

> > You are the only one who has suggested it has anything to do with changes
> > through other pointers or in other ways. So you are arguing against only
> > yourself here.

> No, I'm saying that const has absolutely *zero* meaning on writes to an
> object through _other_ pointers (or direct access) to the object.

Nobody disagrees with that.

> And you're seemingly not understanding that *lack* of meaning.

No, I understand that.

> kfree() doesn't do *squat* to the object pointed to by the pointer it is
> passed. It only uses it to look up its own data structures, of which the
> pointer is but a small detail.

> And those other data structures aren't constant.

Nonsense. The 'kfree' function *destroys* the object pointed to by the
pointer. How can you describe that as not doing anything to the object?

> > Nobody has said it has anything to do with anything but operations
> > through that pointer.

> .. and I'm telling you: kfree() does *nothing* conceptually through that
> pointer. No writes, and not even any reads! Which is exactly why it's
> const.

It destroys the object the pointer points to. Destroying an object requires
write access to it.

> The only thing kfree does through that pointer is to update its own
> concept of what memory it has free.

That is not what it does, that is how it does it. What it does is destroy
the object.

> Now, what it does to its own free memory is just an implementation detail,
> and has nothing what-so-ever to do with the pointer you passed it.

I agree, except that it destroys the object the pointer points to.

> See?

I now have a much better understanding of what you're saying, but I still
think it's nonsense.

1) An operation that modifies the logical state of an object should not
normally be done through a 'const' pointer. The reason you make a pointer
'const' is to indicate that this pointer should not be used to change the
logical state of the object pointed to.

2) The 'kfree' operation changes the logical state of the object pointed to,
as the object goes from existent to non-existent.

3) It is most useful for 'kfree' to be non-const because destroying an
object through a const pointer can easily be done in error. One of the
reasons you provide a const pointer is because you need the function you
pass the pointer to not to modify the object. Since this is an unusual
operation that could be an error, it is logical to force the person doing it
to clearly indicate that he knows the pointer is const and that he knows it
is right anyway.
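
To make point 3 concrete, here is a kernel-style fragment (illustrative
only) showing what the current const-taking kfree() silently accepts:

    #include <linux/slab.h>

    struct item { int value; };

    static int report_and_free(const struct item *it)
    {
        int v = it->value;
        kfree(it);      /* accepted with no cast and no warning, even though
                         * the caller's const pointer suggests the object is
                         * supposed to outlive this call */
        return v;
    }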

I'm curious to hear how some other people feel about this. You are the first
competent coder I have *ever* heard make this argument.

By the way, I disagree with your metadata versus data argument. I would
agree that a function that changes only an object's metadata could be done
through a const pointer without needing a cast. A good example would be a
function that updates a "last time this object was read" variable.

However, *destroying* an object is not a metadata operation -- it destroys
the data as well. This is kind of a philosophical point, but an object does
not have a "does this object exist" piece of metadata. If an object does not
exist, it has no data. So destroying an object destroys the data and is thus
a write/modification operation on the data.

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Trailing periods in kernel messages

2007-12-21 Thread David Schwartz

Jan Engelhardt wrote:

> On Dec 21 2007 17:56, Herbert Xu wrote:
> >>
> >> I do not believe "opinions" are relevant here. Relevant would be cites
> >> from respected style guides (Fowlers, Oxford Guide To Style et al.) to
> >> show they do not need a full stop.
> >>
> >> I've not found one, but I am open to references.
> >
> >Well from where I come from, full stops are only used for complete
> >sentences.
> >[...]
> >As to what is a complete sentence, that is debatable.  However,
> >typically it would include a subject and a predicate.  By this
> >rule the following line is not a complete sentence:
> >
> > [XFS] Initialise current offset in xfs_file_readdir correctly
> >
> >The reason is that it lacks a subject.
>
> "current offset" is your subject.

I hate to have to point this out, but "current offset" is the object, not
the subject. If the sentence was, "I have initialized the current offset in
xfs_file_readdir correctly.", then it would be quite clear that "I" is the
subject and "the current offset" is the object.

The log entry has an implied subject of "I" or, if you prefer, "the kernel".
It is not a complete sentence both because it implies the subject in a
context where English does not permit that and because it lacks words required
by grammar (such as the "the" before "current offset"). It also lacks a
helping verb, since it should be "have initialized" (or perhaps "initialized").

Sometimes you can imply the subject, as in "Go home!". This is not one
of those cases. You cannot say "Am sleepy" to mean "I am sleepy"; even
though it would seem perfectly reasonable to allow an implied subject here,
English doesn't.

There is no reason log entries should be complete sentences. If you look at
a typical log, the complete sentences generally look worse than the
fragments.

For example:

CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
CPU serial number disabled.

and

EXT3 FS on hdc7, internal journal
EXT3-fs: mounted filesystem with ordered data mode.

And why the inconsistency in the beginning in both these examples?

Personally, I think a mix of sentences and statements is just fine.
Sentences should end with a period when it looks worse not to.

The following extracts from my log look perfect to me:

Switched to high resolution mode on CPU 0
lp: driver loaded but no devices found
Real Time Clock Driver v1.12ac
Linux agpgart interface v0.102
agpgart: Detected VIA Apollo Pro 133 chipset
agpgart: AGP aperture is 4M @ 0xfe00

Entries that look imperfect to me include:

ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
Detected 1004.544 MHz processor.
ENABLING IO-APIC IRQs
EXT3-fs: INFO: recovery required on readonly filesystem.
Time: tsc clocksource has been installed.

The last one just looks wrong, even though it is a complete sentence.
Perhaps changing 'tsc' to 'TSC' would help, or just saying "using TSC" or
"TSC enabled".

Inconsistencies include:

PCI: VIA PCI bridge detected. Disabling DAC.
PCI: Enabling Via external APIC routing
pci :00:04.2: uhci_check_and_reset_hc: legsup = 0x2000
pci :00:04.2: Performing full reset

and

TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered

and

PCI: Bridge: :00:01.0
  IO window: disabled.
  MEM window: f800-fddf

More important than any hard and fast rules is just how it looks. Also
important is how it looks in context. For example, with the upper case and
lower case 'pci', either way is fine, but some of each doesn't look good.
Same for 'TCP'. Why does one message have a colon and not the others?

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: /dev/urandom uses uninit bytes, leaks user data

2007-12-17 Thread David Schwartz

> Has anyone *proven* that using uninitialized data this way is safe?

You can probably find dozens of things in the Linux kernel that have not
been proven to be safe. That means nothing.

> As
> a *user* of this stuff, I'm *very* hesitant to trust Linux's RNG when I
> hear things like this.  (Hint: most protocols are considered insecure
> until proven otherwise, not the other way around.)

There's no reason whatsoever to think this is unsafe. First, you can't
access the pool directly. Second, even if you could, it's mixed in securely.

> Now imagine a security program.  It runs some forward secret protocol
> and it's very safe not to leak data that would break forward secrecy
> (mlockall, memset when done with stuff, etc).  It runs on a freshly
> booted machine (no DSA involved, so we're not automatically hosed), so
> an attacker knows the initial pool state.  Conveniently, some *secret*
> (say an ephemeral key, or, worse, a password) gets mixed in to the pool.
>   There are apparently at most three bytes of extra data mixed in, but
> suppose the attacker knows add the words that were supposed to get mixed
> in.  Now the program clears all its state to "ensure" forward secrecy,
> and *then* the machine gets hacked.  Now the attacker can learn (with at
> most 2^24 guesses worth of computation) 24 bits worth of a secret, which
> could quite easily reduce the work involved in breaking whatever forward
> secret protocol was involved from intractable to somewhat easy.  Or it
> could leak three bytes of password.  Or whatever.

This is no more precise than "imagine there's some vulnerability in the
RNG". Yes, if there's a vulnerability, then we're vulnerable.

An attacker can always (at least in principle) get the pool out of the
kernel. The RNG's design is premised on the notion that it is
computationally infeasible to get the input entropy out of the pool. If an
attacker can watch data going into the pool, he needn't get it out of the
pool.

> Sorry for the somewhat inflammatory email, but this is absurd.

I agree.

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: /dev/urandom uses uninit bytes, leaks user data

2007-12-17 Thread David Schwartz

> The bottom line:  At a cost of at most three unpredictable branches
> (whether to clear the bytes in the last word with indices congruent
> to 1, 2, or 3 modulo 4), then the code can reduce the risk from something
> small but positive, to zero.  This is very inexpensive insurance.

> John Reiser, [EMAIL PROTECTED]

Even if you're right, the change isn't free. You've simply presented
evidence of one non-zero benefit of it. You've given no way to assess the
size of this benefit and no way to tell whether it exceeds the cost. There
is also a non-zero *security* cost to this change.

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: yield API

2007-12-13 Thread David Schwartz

Kyle Moffett wrote:

> That is a *terrible* disgusting way to use yield.  Better options:
>(1) inotify/dnotify

Sure, tie yourself to a Linux-specific mechanism that may or may not work
over things like NFS. That's much worse.

>(2) create a "foo.lock" file and put the mutex in that

Right, tie yourself to process-shared mutexes which historically weren't
available on Linux. That's much better than an option that's been stable for
a decade.

>(3) just start with the check-file-and-sleep loop.

How is that better? There is literally no improvement, since the first check
will (almost) always fail.

> > Now is this the best way to handle this situation? No.  Does it
> > work better than just doing the wait loop from the start? Yes.
>
> It works better than doing the wait-loop from the start?  What
> evidence do you provide to support this assertion?

The evidence is that more than half the time, this avoids the sleep. That
means it has zero cost, since the yield is no heavier than a sleep would be,
and has a possible benefit, since the first sleep may be too long.
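
For concreteness, a userspace sketch of the pattern being defended (the
file name, helper name, and sleep interval here are all made up):

    #include <unistd.h>
    #include <sched.h>

    static void wait_for_file(const char *path)
    {
        if (access(path, F_OK) == 0)
            return;                 /* over half the time: no sleep at all */
        sched_yield();              /* the creator is likely already runnable */
        while (access(path, F_OK) != 0)
            usleep(10 * 1000);      /* fall back to the estimated-wait loop */
    }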

> Specifically, in
> the first case you tell the kernel "I'm waiting for something but I
> don't know what it is or how long it will take"; while in the second
> case you tell the kernel "I'm waiting for something that will take
> exactly X milliseconds, even though I don't know what it is.  If you
> really want something similar to the old behavior then just replace
> the "sched_yield()" call with a proper sleep for the estimated time
> it will take the program to create the file.

The problem is that if the estimate is too short, pre-emption will result in
a huge performance drop. If the estimate is too long, there will be some
wasted CPU. What was the claimed benefit of doing this again?

> > Is this a good way to use sched_yield()? Maybe, maybe not.  But it
> > *is* an actual use of the API in a real app.

> We weren't looking for "actual uses", especially not in binary-only
> apps.  What we are looking for is optimal uses of sched_yield(); ones
> where that is the best alternative.  This... certainly isn't.

Your standards for "optimal" are totally unrealistic. In his case, it was
optimal. Using platform-specific optimizations would have meant more
development and test time for minimal benefit. Sleeping first would have had
some performance cost and no benefit. In his case, sched_yield was optimal.
Really.

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Why does reading from /dev/urandom deplete entropy so much?

2007-12-11 Thread David Schwartz

Phillip Susi wrote:

> What good does using multiple levels of RNG do?  Why seed one RNG from
> another?  Wouldn't it be better to have just one RNG that everybody
> uses?  Doesn't the act of reading from the RNG add entropy to it, since
> no one reader has any idea how often and at what times other readers are
> stirring the pool?

No, unfortunately. The problem is that while in most typical cases that may
be true, the estimate of how much entropy we have has to be based on the
assumption that everything we've done up to that point has been carefully
orchestrated by the mortal enemy of whatever is currently asking us for
entropy.

While I don't have any easy solutions with obvious irrefutable technical
brilliance or that will make everyone happy, I do think that one of the
problems is that neither /dev/random nor /dev/urandom are guaranteed to
provide what most people want. In the most common use case, you want
cryptographically-strong randomness even under the assumption that all
previous activity is orchestrated by the enemy. Unfortunately, /dev/urandom
will happily give you randomness worse than this while /dev/random will
block even when you have it.

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Why does reading from /dev/urandom deplete entropy so much?

2007-12-08 Thread David Schwartz

> heh, along those lines you could also do
>
>   dmesg > /dev/random
>
> 
>
> dmesg often has machine-unique identifiers of all sorts (including the
> MAC address, if you have an ethernet driver loaded)
>
>   Jeff

A good three-part solution would be:

1) Encourage distributions to do "dmesg > /dev/random" in their startup
scripts. This could even be added to the kernel (as a one-time dump of the
kernel message buffer just before init is started).

2) Encourage drivers to output any unique information to the kernel log. I
believe all/most Ethernet drivers already do this with MAC addresses.
Perhaps we can get the kernel to include CPU serial numbers and we can get
the IDE/SATA drivers to include hard drive serial numbers. We can also use
the TSC, where available, in early bootup, which measures exactly how long
it took to get the kernel going, which should have some entropy in it.

3) Add more entropy to the kernel's pool at early startup, even if the
quality of that entropy is low. Track it appropriately, of course.

This should be enough to get cryptographically-strong random numbers that
would hold up against anyone who didn't have access to the 'dmesg' output.
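
As a sketch of point 3 (illustrative userspace code; it uses the existing
RNDADDENTROPY ioctl so the mixed-in bytes are both stirred and tracked):

    #include <fcntl.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/random.h>

    /* Mix 'len' bytes into the pool and credit 'bits' of entropy.
     * A plain write() to /dev/random mixes but credits nothing, which
     * is fine for the low-quality early-boot data described above. */
    static int add_entropy(const unsigned char *buf, int len, int bits)
    {
        struct {
            struct rand_pool_info info;
            unsigned char data[512];
        } rpi;
        int fd, ret;

        if (len > (int)sizeof(rpi.data))
            return -1;
        fd = open("/dev/random", O_WRONLY);
        if (fd < 0)
            return -1;
        rpi.info.entropy_count = bits;
        rpi.info.buf_size = len;
        memcpy(rpi.data, buf, len);
        ret = ioctl(fd, RNDADDENTROPY, &rpi.info);
        close(fd);
        return ret;
    }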

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Is the PCI clock within the spec?

2007-12-04 Thread David Schwartz

> > A scope probe will allow you to see if there is
> > a clock signal. That's all. You can't determine
> > its quality. A 4-inch ground lead on the scope
> > probe will result in 10-20% overshoot and undershoot
> > being observed.

> I don't understand this 10-20% figure.
> (0V + 10-20% is still 0V.)

If you're jumping from a 900 foot marker to a 910 foot marker, does a 10%
overshoot mean you jumped 1 foot too far or 90 feet too far? The percentage
is of the distance you were trying to go, not of where you started or where
you ended up.

> AFAIU, the nominal peak-to-peak voltage is 3.3V. The observed
> peak-to-peak voltage is 6.08V (3.3V + 84%).

So a 10% undershoot would mean that rather than going from 3.3V to 0V, you
overshot 0V by 10% of the distance you travelled. The voltages could just as
well be 100V and 103.3V; the transitions would still be the same. What you
call zero is, at least in principle, arbitrary.
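
To put numbers on it (plain arithmetic from the figures quoted above): the
reference swing is 3.3V, so a 10% undershoot on a 3.3V-to-0V edge means
dipping about 0.33V below zero, and the observed 6.08V peak-to-peak works
out to (6.08 - 3.3) / 3.3, or roughly 84% combined overshoot and undershoot.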

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: sched_yield: delete sysctl_sched_compat_yield

2007-12-03 Thread David Schwartz

> * Mark Lord <[EMAIL PROTECTED]> wrote:

> > Ack.  And what of the suggestion to try to ensure that a yielding task
> > simply not end up as the very next one chosen to run?  Maybe by
> > swapping it with another (adjacent?) task in the tree if it comes out
> > on top again?

> we did that too for quite some time in CFS - it was found to be "not
> aggressive enough" by some folks and "too aggressive" by others. Then when
> people started bickering over this we added these two simple corner
> cases - switchable via a flag. (minimum aggression and maximum aggression)

They are both correct. It is not aggressive enough if there are tasks other
than those two that are at the same static priority level and ready to run.
It is too aggressive if the task it is swapped with is at a lower static
priority level.

Perhaps it might be possible to scan for the task at the same static
priority level that is ready-to-run but last in line among other
ready-to-run tasks and put it after that task? I think that's about as close
as we can get to the POSIX-specified behavior.

> > Thanks Ingo -- I *really* like this scheduler!

Just in case this isn't clear, I like CFS too and sincerely appreciate the
work Ingo, Con, and others have done on it.

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: sched_yield: delete sysctl_sched_compat_yield

2007-12-03 Thread David Schwartz

Chris Friesen wrote:

> David Schwartz wrote:

> > I've asked versions of this question at least three times
> > and never gotten
> > anything approaching a straight answer:
> >
> > 1) What is the current default 'sched_yield' behavior?
> >
> > 2) What is the current alternate 'sched_yield' behavior?

> I'm pretty sure I've seen responses from Ingo describing this multiple
> times in various threads.  Google should have them.

> If I remember right, the default is to simply recalculate the task's
> position in the tree and reinsert it, and the alternate is to yield to
> everything currently runnable.

The meaning of the default behavior then depends upon where in the tree it
reinserts it.

> > 3) Are either of them sensible? Simply acting as if the
> > current thread's
> > timeslice was up should be sufficient.

> The new scheduler doesn't really have a concept of "timeslice".  This is
> one of the core problems with determining what to do on sched_yield().

Then it should probably just not support 'sched_yield' and return ENOSYS.
Applications should work around an ENOSYS reply (since some versions of
Solaris return this, among other reasons). Perhaps for compatibility, it
could also yield 'lightly' just in case applications ignore the return
value.

It could also handle it the way it handles the smallest sleep time that it
supports. This is sub-optimal if no other tasks are ready-to-run at the same
static priority level, and that might be an expensive check.
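
A sketch of that workaround from the application side (illustrative only;
the helper name is made up):

    #include <errno.h>
    #include <sched.h>
    #include <time.h>

    static void yield_or_min_sleep(void)
    {
        if (sched_yield() == -1 && errno == ENOSYS) {
            /* treat the yield like the smallest supported sleep */
            struct timespec ts = { 0, 1 };
            nanosleep(&ts, NULL);
        }
    }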

If CFS really can't support sched_yield's semantics, then it should just
not, and that's that. Return ENOSYS and admit that the behavior sched_yield
is documented to have simply can't be supported by the scheduler.

> > The implication I keep getting is that neither the default
> > behavior nor the
> > alternate behavior are sensible. What is so hard about simply
> > scheduling the
> > next thread?

> The problem is where do we insert the task that is yielding?  CFS is
> based around a tree structure ordered by time.

We put it exactly where we would have when its timeslice ran out. If we can
reward it a little bit, that's great. But if not, we can live with that.
Just imagine that the timer interrupt fired to indicate the end of the
thread's run time when the thread called 'sched_yield'.

> The old scheduler was priority-based, so you could essentially yield to
> everyone of the same niceness level.
>
> With the new scheduler, this would be possible, but would involve extra
> work tracking the position of the rightmost task at each priority level.
> This additional overhead is what Ingo is trying to avoid.

Then what does he do when the task runs out of run time? It's hard to
imagine we can't do that when the task calls sched_yield.

> > We don't need perfection, but it sounds like we have two
> > alternatives of
> > which neither is sensible.

> sched_yield() isn't a great API.

I agree.

> It just says to delay the task,
> without specifying how long or what the task is waiting *for*.

That is not true. The task is waiting for something that will be done by
another thread that is ready-to-run and at the same priority level. The task
does not need to wait until the thing is guaranteed done but wishes to wait
until it is more likely to be done. This is an often-misused but sometimes
sensible thing to do.

I think the API gets blamed for two things that are not its fault:

1) It's often misunderstood and misused.

2) It was often chosen as a "best available" solution because no truly good
solutions were available.

> Other
> constructs are much more useful because they give the scheduler more
> information with which to make a decision.

Sure, if there is more information. But if all you really want to do is wait
until other threads at the same static priority level have had a chance to
run, then sched_yield is the right API.

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: sched_yield: delete sysctl_sched_compat_yield

2007-12-03 Thread David Schwartz

I've asked versions of this question at least three times and never gotten
anything approaching a straight answer:

1) What is the current default 'sched_yield' behavior?

2) What is the current alternate 'sched_yield' behavior?

3) Are either of them sensible? Simply acting as if the current thread's
timeslice was up should be sufficient.

The implication I keep getting is that neither the default behavior nor the
alternate behavior is sensible. What is so hard about simply scheduling the
next thread?

We don't need perfection, but it sounds like we have two alternatives of
which neither is sensible.

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: namespace support requires network modules to say "GPL"

2007-12-02 Thread David Schwartz

> > Then init_net needs to be not GPL limited. Sorry, we need to allow
> > non GPL network drivers.  There is a fine line between keeping the

> Why - they aren't exactly likely to be permissible by law

Really? What law and/or what clause in the GPL says that derivative works
have to be licensed under the GPL? Or does the kernel have some new
technique to determine whether or not code has been distributed?

As I read the GPL, it only requires you to release something under the GPL
if you distribute it. The kernel has no idea whether or not code has been
distributed. So if it's enforcing the GPL, it cannot prohibit anything
non-distributed code can lawfully do. (Ergo, it's *NOT* *ENFORCING* the
GPL.)

> > binary seething masses from accessing random kernel functions,
> and allowing
> > reasonable (but still non GPL) things like ndiswrapper to use network
> > device interface.
>
> It's up to the ndiswrapper authors how they licence their code, but they
> should respect how we licence ours.

You license yours under the GPL, so they should respect the GPL.

It sounds like we're back to where we were years ago. Didn't we already
agree that EXPORT_SYMBOL_GPL was *NOT* a GPL-enforcement mechanism and had
nothing to do with respecting the GPL? After all, if it is a GPL-enforcement
mechanism, why is it not a "further restriction" which is prohibited by the
GPL? (The GPL contains no restrictions on what code can use what symbols if
that code is not distributed, but EXPORT_SYMBOL_GPL does.)

Are you now claiming that EXPORT_SYMBOL_GPL is intended to enforce the GPL?

DS


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Question regarding mutex locking

2007-11-28 Thread David Schwartz

> Thanks for the help. Someday, I hope to understand this stuff.
>
> Larry

Any code either deals with an object or it doesn't. If it doesn't deal with
that object, it should not be acquiring locks on that object. If it does
deal with that object, it must know the internal details of that object,
including when and whether locks are held, or it cannot deal with that
object sanely.

So your question starts out broken: it says, "I need to lock an object, but
I have no clue what's going on with that very same object." If you don't
know what's going on with the object, you don't know enough about the object
to lock it. If you do, you should know whether you hold the lock or not.

Either architect it so this function doesn't deal with that object, and so
doesn't need to lock it, or architect it so that this function knows what's
going on with that object, and so knows whether it holds the lock or not.

If you don't follow this rule, a lot of things can go horribly wrong. The
two biggest issues are:

1) You don't know the semantic effect of locking and unlocking the mutex. So
any code placed before the mutex is acquired or after it's released may not
do what's expected. For example, you cannot unlock the mutex and yield,
because you might not actually wind up unlocking the mutex.

2) A function that acquires a lock normally expects the object it locks to
be in a consistent state when it acquires the lock. However, since your code
may or may not acquire the mutex, it is not assured that its lock gets the
object in a consistent state. Requiring the caller to know this and call the
function with the object in a consistent state creates brokenness of varying
kinds. (If the object may change, why not just release the lock before
calling? If the object may not change, why is the sub-function releasing the
lock?)
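
A kernel-style sketch of the rule (hypothetical names; the broken shape
first, then the split that makes lock state part of each function's
contract):

    #include <linux/mutex.h>

    struct obj {
        struct mutex lock;
        int state;
    };

    /* Broken: whether the lock is held is decided at run time. */
    static void update_broken(struct obj *o, int already_locked)
    {
        if (!already_locked)
            mutex_lock(&o->lock);
        o->state++;
        if (!already_locked)
            mutex_unlock(&o->lock);
    }

    /* Better: one function per locking contract. */
    static void update_locked(struct obj *o)    /* caller holds o->lock */
    {
        o->state++;
    }

    static void update(struct obj *o)           /* caller must not hold it */
    {
        mutex_lock(&o->lock);
        update_locked(o);
        mutex_unlock(&o->lock);
    }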

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [RFC/PATCH] SO_NO_CHECK for IPv6

2007-11-25 Thread David Schwartz

> David Schwartz <[EMAIL PROTECTED]> wrote:

> >> Regardless of whatever verifications your application is doing
> >> on the data, it is not checksumming the ports and that's what
> >> the pseudo-header is helping with.

> > So what? We are in the case where the data has already gotten
> > to him. If it
> > got to him in error, he'll reject it anyway. The receive
> > checksum check will
> > only reject packets that he would reject anyway. That makes it needless.

> What if it goes to the wrong recipient who doesn't have the upper-
> level checksums?

Since that recipient is not him, he has no control over its policy and thus
no ability to harm it or help it.

> This is the whole point, IPv6 unlike IPv4 does not have IP header
> checksums so the high-level needs to protect it by checksumming
> the pseudo-header.

Exactly. But *he* doesn't need to check that checksum, given that he already
got the packet, since he has an upper-level checksum. He is not saying that
his reasoning applies to everyone, just that it applies to him. He is not
talking about disabling the send checksum, but the receive checksum. He
knows that he does not need it.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [RFC/PATCH] SO_NO_CHECK for IPv6

2007-11-22 Thread David Schwartz

> Regardless of whatever verifications your application is doing
> on the data, it is not checksumming the ports and that's what
> the pseudo-header is helping with.

So what? We are in the case where the data has already gotten to him. If it
got to him in error, he'll reject it anyway. The receive checksum check will
only reject packets that he would reject anyway. That makes it needless.

Of course, if the check is nearly free, there's no potential win, so no
point in bothering.
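
For context, here is a minimal sketch of the existing socket option the
patch proposes extending. One hedge: today SO_NO_CHECK suppresses UDP
checksum generation on IPv4 sends; the receive side is exactly what this
thread is debating. Error handling is omitted.

#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
        int one = 1;
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        /* An application with its own end-to-end checks asks the
         * stack to skip UDP checksumming on this socket. */
        setsockopt(fd, SOL_SOCKET, SO_NO_CHECK, &one, sizeof(one));
        return 0;
}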

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] Time-based RFC 4122 UUID generator

2007-11-19 Thread David Schwartz

> > I use libuuid and I assume libuuid uses some uuid generator support
> > from the kernel.
>
> No, it does not. It's pure userspace and may produce double UUIDs.
>
> > libuuid comes from a package that Ted's maintain IIRC.
> >
> > I (my company) use uuid to uniquely identify objects in a distributed
> > database.
> > [Proprietary closed source stuff].
>
> Same here.
>
> Helge

Any UUID generator that can produce duplicate UUIDs with probability
significantly greater than purely random UUIDs is so badly broken that it
should not ever be used. Anyone who finds such a UUID generator should
immediately either fix it or throw it on the junk heap. Anyone who knowingly
uses such a UUID generator should be publicly shamed.

Rather than (or at the very least, in addition to) adding a new UUID
generator, let's fix the one(s) we have.
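
For reference, a minimal sketch of the userspace generator in question
(libuuid): uuid_generate_time() is the time-based variant whose uniqueness
this thread is questioning.

#include <stdio.h>
#include <uuid/uuid.h>          /* libuuid; link with -luuid */

int main(void)
{
        uuid_t u;
        char text[37];          /* 36 characters plus the NUL */

        uuid_generate_time(u);  /* time-based (version 1) UUID */
        uuid_unparse(u, text);
        printf("%s\n", text);
        return 0;
}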

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Policy on dual licensing?

2007-11-05 Thread David Schwartz

> What I suppose is that people porting BSD code to Linux don't mean
> closing the doors for back-porting changes. They are simply unaware
> or forget about the possibility of dual licensing. Obviously, each
> submitter should read Documentation/SubmittingDrivers, where it is
> explicitly stated. Yet humans are prone to forgetting, so this may
> seem not enough.
>
> What I propose is implementing a policy on accepting such code.
> According to it, every time a maintainer is considering a driver
> that is derived from BSD and licensed GPL-only, should request
> for dual licensing before accepting the patch. If the submitter is
> reluctant to do so - what can we do, it's better to have this inside
> this way than not at all. However, this should minimize such cases
> and, hopefully, satisfy the claims about Linux maintainers not doing
> all that they could to make the world a better place.
>
> Best regards,
> Remigiusz Modrzejewski

This will result in more code in the Linux tree that has a license other
than the project default. This will impose a greater and greater burden on
developers who have to carefully check the license of files every time they
cut and paste code from one file into another.

It creates a serious risk of incorrect license notices (because someone
cuts/pastes a substantial chunk of GPL-only code into a dual-licensed file
without changing the license notice) and accidental copyright violations
(because someone else took the cut/pasted part into a BSD-licensed project)
if intimately-connected files are under different licenses. Every effort
should be made to avoid this.

Having a clear policy would be a good idea. I think the general policy
should be that any dual-licensed file should contain a clear notice that the
Linux kernel is GPL (that is the only license 'guaranteed' to cover the
entire distribution) and that development may result in the file being
"contaminated" by code that is not dual-licensed.

Just a notice referring to a 'dual license FAQ' in Documentation would be
fine, of course. That file should advise developers that they should remove
the dual license if they cause the file to be no longer dual-licensable due
to code they've added, cut/pasted, or modified. Gratuitous removal of
dual-licensing should be discouraged, but removing it should be encouraged
where it's a genuine impediment to development.

The example I always use is if we have a filesystem with a different
license. Imagine if a new function is added to the filesystem interface. It
is offered in a 'generic' version, with the expectation that filesystems
will override it to provide a better-performing version. Imagine if the
generic version is GPLv2-only and a filesystem in-tree is dual licensed.

A developer probably cannot cut/paste the generic version as a base without
breaking the dual license. If they want to keep the dual license, they have
to re-implement the function. This creates an increased risk of bugs or
incompatibilities. Worse, it creates a maintenance headache in that this
function will need to be understood separately from other filesystems'
implementation of the same function. A little imagination will allow one to
imagine many ways this can cause problems.

The only good way this can end is if they change the license on that file to
GPL only. Possible bad ways include accidentally contaminating the
apparently dual-licensed file with code that was offered by its author only
under the GPL.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Is gcc thread-unsafe?

2007-11-02 Thread David Schwartz

> Another conclusion from the cited text is that in contrast with what
> was stated before on the gcc mailing list, it is not required to
> declare thread-shared variables volatile if that thread-shared data is
> consistently protected by calls to locking functions.
>
> Bart Van Assche.

It all depends upon what threading standard you are using. If GCC is going
to support POSIX threading, it cannot require that thread-shared data be
marked 'volatile' since POSIX does not require this.

It can offer semantic guarantees for volatile-qualified data if it wants to.
But POSIX provides a set of guarantees that do not require marking data as
'volatile' and if GCC is going to support POSIX threading, it has to support
providing those guarantees.

As far as I know, no threading standard either requires 'volatile' or states
that it is sufficient for any particular purpose. So there seems to be no
reason to declare thread-shared variables as
volatile except as some kind of platform-specific optimization.

POSIX mutexes are sufficient. They are necessary if there is no other way to
get the guarantees you need. Nothing prevents GCC from providing any
guarantees it wants for 'volatile' qualified data. But POSIX mutexes must
work as POSIX specifies or GCC cannot support POSIX threading.

This is the nightmare scenario (thanks to Hans-J. Boehm):

int x;
bool need_to_lock;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

for(int i=0; i<50; i++)
{
 if(unlikely(need_to_lock)) pthread_mutex_lock(&mutex);
 x++;
 if(unlikely(need_to_lock)) pthread_mutex_unlock(&mutex);
}

Now suppose the compiler optimizes this as follows (pseudocode: "reg" stands
for the CPU register the compiler has chosen to cache x in):

reg=x;
for(int i=0; i<50; i++)
{
 if(need_to_lock)
 {
  x=reg; pthread_mutex_lock(&mutex); reg=x;
 }
 reg++;
 if(need_to_lock)
 {
  x=reg; pthread_mutex_unlock(&mutex); reg=x;
 }
}
x=reg;

This is a perfectly legal optimization for single-threaded code. It may in
fact be an actual optimization. Clearly, it totally destroys threaded code.

This shows that, unfortunately, the normal assumption that not knowing
anything about the pthread functions ensures that optimizations won't break
them is incorrect.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: epoll design problems with common fork/exec patterns

2007-10-28 Thread David Schwartz

Eric Dumazet wrote:

> Events are not necessarly reported "by descriptors". epoll uses an opaque
> field provided by the user.
>
> It's up to the user to properly chose a tag that will makes sense
> if the user
> app is playing dup()/close() games for example.

Great. So the only issue then is that the documentation is confusing. It
frequently uses the term "fd" where it means file. For example, it says:

  Q1 What happens if you add the same fd to an epoll_set twice?

  A1 You will probably get EEXIST. However, it is possible that two
     threads may add the same fd twice. This is a harmless condition.

This gives no reason to think there's anything wrong with adding the same
file twice so long as you do so through different descriptors. (One can
imagine an application that does this to segregate read and write operations
to avoid a race where the descriptor is closed from under a writer due to
handling a fatal read error.) Obviously, that won't work.

And this part:

  Q6 Will the close of an fd cause it to be removed from all epoll
     sets automatically?

  A6 Yes.

This is incorrect. Closing an fd will not cause it to be removed from all
epoll sets automatically. Only closing a file will. This is what caused the
OP's confusion, and it is at best imprecise and, at worst, flat out wrong.
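
A small demonstration of the distinction, using nothing beyond standard
epoll, pipe, and dup calls (error handling omitted): the registered file
stays in the set after its original descriptor is closed, because a dup
keeps the file alive.

#include <stdio.h>
#include <unistd.h>
#include <sys/epoll.h>

int main(void)
{
        int p[2], dupfd;
        struct epoll_event ev = { .events = EPOLLIN };
        int ep = epoll_create(1);

        pipe(p);
        dupfd = dup(p[0]);      /* second descriptor, same file */
        ev.data.fd = p[0];
        epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev);

        close(p[0]);            /* the registered fd is gone... */
        write(p[1], "x", 1);

        /* ...but the file is still open via dupfd, so the event is
         * still reported, tagged with the now-dead fd number. */
        printf("%d event(s)\n", epoll_wait(ep, &ev, 1, 0));
        return 0;
}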

DS

PS: It is customary to trim individuals off of CC lists when replying to a
list when the subject matter of the post is squarely inside the subject of
the list. If the person CC'd was interested in the list's subject, he or she
would presumably subscribe to the list. Not everyone wants two copies of
every post. Not everyone wants a personal copy of every sub-thread that
results from a post they make. In the past few years, I've received
approximately an equal number of complaints about trimming CC's on posts to
LKML and not trimming CC's on such posts.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: epoll design problems with common fork/exec patterns

2007-10-27 Thread David Schwartz

> 6) Epoll removes the file from the set, when the *kernel* object gets
>closed (internal use-count goes to zero)
>
> With that in mind, how can the code snippet above trigger a removal from
> the epoll set?

I don't see how that can be. Suppose I add fd 8 to an epoll set. 
Suppose fd
5 is a dup of fd 8. Now, I close fd 8. How can fd 8 remain in my epoll set,
since there no longer is an fd 8? Events on files registered for epoll
notification are reported by descriptor, so the set membership has to be
associated (as reflected into userspace) with the descriptor, not the file.

For example, consider:

1) Process creates an epoll set, the set gets fd 4.

2) Process creates a socket, it gets fd 5.

3) The process adds fd 5 to set 4.

4) The process forks.

5) The child inherits the epoll set but not the socket.

Here the kernel cannot quite do the right thing. Ideally, the parent 
would
still have fd 5 in its version of the epoll set. After all, it has not
closed fd 5. However, the child *cannot* see fd 5 in its version of the
epoll set since it has no fd 5. An event reported for fd 5 would be
nonsense.

So it seems the kernel either has to break one of these "would/cannot"
requirements, or it has to split the epoll set in two. However, splitting
the set into two sets is clearly wrong since the processes should share it.

  Q6 Will the close of an fd cause it to be removed from all epoll
     sets automatically?

  A6 Yes.

Note that this talks of the close of an "fd", not a file. The 'close'
function in fact closes an fd, as that fd is then reusable. So it sounds
like the problem above is solved by removing the fd from the set, but in
practice this doesn't happen. I have programs that call 'close' between
'fork' and 'exec' and do not see the socket removed from the poll set.
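
A hypothetical minimal program showing that observation: the child's close()
between fork() and exec() does not remove the file from the shared epoll
set, because the parent still holds the file open (error handling omitted).

#include <stdio.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/wait.h>

int main(void)
{
        int p[2];
        struct epoll_event ev = { .events = EPOLLIN };
        int ep = epoll_create(1);

        pipe(p);
        ev.data.fd = p[0];
        epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev);

        if (fork() == 0) {      /* child: close, as is done between */
                close(p[0]);    /* fork and exec, then go away      */
                _exit(0);
        }
        wait(NULL);             /* the child's close has happened */
        write(p[1], "x", 1);

        /* The parent still gets the event: the child's close did not
         * pull the file out of the epoll set. */
        printf("%d event(s)\n", epoll_wait(ep, &ev, 1, 0));
        return 0;
}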

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Is gcc thread-unsafe?

2007-10-26 Thread David Schwartz

> Well, yeah.  I know what you mean.  However, at this moment, some gcc
> developers are trying really hard not to be total d*ckheads about this
> issue, but get gcc fixed.  Give us a chance.
>
> Andrew.

Can we get some kind of consensus that 'optimizations' that add writes to
any object that the programmer might have taken the address of are invalid
on any platform that supports memory protection? That seems like obvious
common sense to me.

And it has the advantage that it can't be language-lawyered. There is no
document that states the rational requirements of a compiler that's going to
support a memory protection model. So they can be anything rational people
think they should be.
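
To make the objection concrete, here is a hypothetical illustration (the
mapping and names are invented for the example) of why an added write is
incompatible with memory protection:

#include <sys/mman.h>

void f(int *p, int set)
{
        if (set)
                *p = 1;         /* the source stores only when set != 0 */
        /* A compiler that rewrites this as
         *     int tmp = *p; *p = set ? 1 : tmp;
         * has added a write the programmer never asked for. */
}

int main(void)
{
        /* One page that may be read but never written. */
        int *ro = mmap(0, 4096, PROT_READ,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        f(ro, 0);               /* correct per the source; faults if the
                                   compiler invented the store */
        return 0;
}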

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Is gcc thread-unsafe?

2007-10-25 Thread David Schwartz

I asked a collection of knowledgeable people I know about the issue. The
consensus is that the optimization is not permitted in POSIX code but that
it is permitted in pure C code. The basic argument goes like this:

To make POSIX-compliant code even possible, surely optimizations that 
add
writes to variables must be prohibited. That is -- if POSIX prohibits
writing to a variable in certain cases only the programmer can detect, then
a POSIX-compliant compiler cannot write to a variable except where
explicitly told to do so. Any optimization that *adds* a write to a variable
that would not otherwise occur *must* be prohibited.

Otherwise, it is literally impossible to comply with the POSIX 
requirement
that concurrent modifications and reads to shared variables take place while
holding a mutex.

The simplest solution is simply to ditch the optimization. If it really
isn't even an optimization, then that's an easy way out.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Is gcc thread-unsafe?

2007-10-24 Thread David Schwartz

> Well that's exactly right. For threaded programs (and maybe even
> real-world non-threaded ones in general), you don't want to be
> even _reading_ global variables if you don't need to. Cache misses
> and cacheline bouncing could easily cause performance to completely
> tank in some cases while only gaining a cycle or two in
> microbenchmarks for doing these funny x86 predication things.

For some CPUs, replacing a conditional branch with a conditional move is a
*huge* win because it cannot be mispredicted. In general, compilers should
optimize for unshared data, since that's much more common in typical code.
Even for shared data, the usual case is that you are going to access the
data only a few times, so pulling the cache line to the CPU is essentially
free since it will happen eventually.
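
Concretely, the transformation under discussion turns the branchy form below
into the branchless one (illustrative only):

extern int x;

void branchy(int f)
{
        if (f)
                x = 1;          /* store happens only when f is true */
}

void branchless(int f)
{
        int tmp = x;            /* unconditional load...               */
        x = f ? 1 : tmp;        /* ...and unconditional store: nothing
                                   to mispredict, but a write that the
                                   source never requested              */
}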

Heuristics may show that the vast majority of such constructs write anyway.
So the optimization may also be valid based on such heuristics.

A better question is whether it's legal for a compiler that claims to
support POSIX threads. I'm going to post on comp.programming.threads, where
the threading experts hang out.

A very interesting case to be sure.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [rfc][patch 3/3] x86: optimise barriers

2007-10-15 Thread David Schwartz

> From: Intel(R) 64 and IA-32 Architectures Software Developer's Manual
> Volume 3A:
>
>"7.2.2 Memory Ordering in P6 and More Recent Processor Families
> ...
> 1. Reads can be carried out speculatively and in any order.
> ..."
>
> So, it looks to me like almost the 1-st Commandment. Some people (like
> me) did believe this, others tried to check, and it was respected for
> years notwithstanding nobody had ever seen such an event.

When Intel first added speculative loads to the x86 family, they pegged the
speculative load to the cache line. If the cache line is invalidated, so is
the speculative load. As a result, out-of-order reads to normal memory are
invisible to software. If a write to the same memory location on another CPU
would make the fetched value invalid, it will make the cache line invalid,
which invalidates the fetch.

I think it's extremely unlikely that any x86 CPU will do this any
differently. It's hard to imagine Intel and AMD would go to all this trouble
for so long just to stop so late in the line's lifetime.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Aggregation in embedded context, is kernel GPL2 prejudiceagainst embedded systems?

2007-10-11 Thread David Schwartz

Adrian Bunk wrote:

> even for dynamically linking including non-GPL code is not white but 
> already dark grey.

IANAL, but personally, I think it's perfectly black and white.

No mechanical combination (that means compressing, linking, tarring, compiling, 
or whatever) can create a work for copyright purposes. It can only convert the 
original work into a new form or aggregate works.

There are a few exceptions to this by statute. For example, translation (by 
explicit law) can create a derivative work. Presumably this was because nobody 
ever imagined an automated process that could translate a work. It was assumed 
such a process must always be creative.

To create a 'derivative work', you must create a new *work*, and a compiler and 
linker can't do that. Under copyright law, the creation of a work requires 
creative input. Compilers and linkers are not creative.

If you link two works together, the result is an aggregate of those two works 
(and possibly the linker). This must be the case because there is no creative 
combination, and without creativity, a new work (for copyright purposes) cannot 
be formed.

No amount of mechanical automated combination of works can create a new work 
for copyright purposes. If you feed A and B into a linker, all you can get out 
is A, B, and perhaps the linker.

This doesn't mean that the result isn't a derivative work of one of the inputs. 
But this can only happen if one of the input works was a derivative to begin 
with.

"Mere aggregation" must mean as opposed to creative combination. Think about a 
tar/gzip. Bits of each work are mixed into the other as the subsequent work has 
elements in common to the previous work compressed out. This is just as much 
mixing as a linker does, perhaps arguably more. The key is that no creativity 
is used, and thus no *new* work (and a derivative work is a new work) is 
created.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: SLUB performance regression vs SLAB

2007-10-04 Thread David Schwartz

> On 10/04/2007 07:39 PM, David Schwartz wrote:

> > But this is just a preposterous position to put him in. If there's no
> > reproduceable test case, then why should he care that one
> > program he can't
> > even see works badly? If you care, you fix it.

> People have been trying for years to make reproducible test cases
> for huge and complex workloads. It doesn't work. The tests that do
> work take weeks to run and need to be carefully validated before
> they can be officially released. The open source community can and
> should be working on similar tests, but they will never be simple.

That's true, but irrelevant. Either the test can identify a problem that
applies generally, or it's doing nothing but measuring how good the system
is at doing the test. If the former, it should be possible to create a
simple test case once you know from the complex test where the problem is.
If the latter, who cares about a supposed regression?

It should be possible to identify exactly what portion of the test shows the
regression the most and exactly what the system is doing during that moment.
The test may be great at finding regressions, but once it finds them, they
should be forever *found*.

Did you follow the recent incident when iperf found what seemed to be a
significant CFS networking regression? The only way to identify that it was
a quirk in what iperf was doing was by looking at exactly what iperf was
doing. The only efficient way was to look at iperf's source and see that
iperf's weird yielding meant it didn't replicate typical use cases like it
was supposed to.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: SLUB performance regression vs SLAB

2007-10-04 Thread David Schwartz

David Miller wrote:

> Using an unpublishable benchmark, whose results even cannot be
> published, really stretches the limits of "reasonable" don't you
> think?
>
> This "SLUB isn't ready yet" bullshit is just a shamans dance which
> distracts attention away from the real problem, which is that a
> reproducable, publishable test case, is not being provided to the
> developer so he can work on fixing the problem.
>
> I can tell you this thing would be fixed overnight if a proper test
> case had been provided by now.

I would just like to echo what you said just a bit angrier. This is the same
as someone asking him to fix a bug that they can only see with a binary-only
kernel module. I think he's perfectly justified in simply responding "the
bug is as likely to be in your code as mine".

Now, just because he's justified in doing that doesn't mean he should. I
presume he has an honest desire to improve his own code and if they've found
a real problem, I'm sure he'd love to fix it.

But this is just a preposterous position to put him in. If there's no
reproducible test case, then why should he care that one program he can't
even see works badly? If you care, you fix it.

Matthew Wilcox wrote:

> Yet here we stand.  Christoph is aggressively trying to get slab removed
> from the tree.  There is a testcase which shows slub performing worse
> than slab.  It's not my fault I can't publish it.  And just because I
> can't publish it doesn't mean it doesn't exist.

It means it may or may not exist. All we have is your word that slub is the
problem. If I said I found a bug in the Linux kernel that caused it to panic
but I could only reproduce it with the nVidia driver, I'd be laughed at.

It may even be that slub is better, your benchmark simply interprets this as
worse. Without the details of your benchmark, we can't know. For example,
I've seen benchmarks that (usually unintentionally) actually do a *variable*
amount of work and details of the implementation may result in the benchmark
actually doing *more* work, so it taking longer does not mean it ran slower.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: get amount of "entropy" in /dev/random ?

2007-10-02 Thread David Schwartz

> From the userlevel, can I get an estimate of  "amount of entropy"
> in /dev/random, that is, the estimate of number of bytes
> readable until it blocks ? Of course multiple processes
> can read bytes and this would not be exact ... but still .. as an upper
> boundary estimate ?
>
> Thanks
> Yakov

Yes. Look in drivers/char/random.c at the random_ioctl handler. You will see
RNDGETENTCNT.
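
A minimal userspace sketch of querying that count; the value is the kernel's
entropy estimate in bits, so dividing by eight gives a rough upper bound on
the bytes readable before a read may block:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/random.h>

int main(void)
{
        int fd = open("/dev/random", O_RDONLY);
        int bits;

        if (fd < 0 || ioctl(fd, RNDGETENTCNT, &bits) < 0) {
                perror("RNDGETENTCNT");
                return 1;
        }
        printf("entropy: %d bits (~%d bytes)\n", bits, bits / 8);
        close(fd);
        return 0;
}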

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Network slowdown due to CFS

2007-10-02 Thread David Schwartz

This is a combined response to Arjan's:

> that's also what trylock is for... as well as spinaphores...
> (you can argue that futexes should be more intelligent and do
> spinaphore stuff etc... and I can buy that, lets improve them in the
> kernel by any means. But userspace yield() isn't the answer. A
> yield_to() would have been a ton better (which would return immediately
> if the thing you want to yield to is running already somethere), a
> blind "yield" isn't, since it doesn't say what you want to yield to.

And Ingo's:

> but i'll attempt to weave the chain of argument one step forward (in the
> hope of not distorting your point in any way): _if_ the sched_yield()
> call in that memory allocator is done because it uses a locking
> primitive that is unfair (hence the memory pool lock can be starved),
> then the "guaranteed large latency" is caused by "guaranteed
> unfairness". The solution is not to insert a random latency (via a
> sched_yield() call) that also has a side-effect of fairness to other
> tasks, because this random latency introduces guaranteed unfairness for
> this particular task. The correct solution IMO is to make the locking
> primitive more fair _without_ random delays, and there are a number of
> good techniques for that. (they mostly center around the use of futexes)

So now I not only have to come up with an example where sched_yield is the
best practical choice, I have to come up with one where sched_yield is the
best conceivable choice? Didn't we start out by agreeing these are very rare
cases? Why are we designing new APIs for them (Arjan) and why do we care
about their performance (Ingo)?

These are *rare* cases. It is a waste of time to optimize them.

In this case, nobody cares about fairness to the service thread. It is a
cleanup task that probably runs every few minutes. It could be delayed for
minutes and nobody would care. What they do care about is the impact of the
service thread on the threads doing real work.

You two challenged me to present any legitimate use case for sched_yield. I
see now that was not a legitimate challenge and you two were determined to
shoot down any response no matter how reasonable on the grounds that there
is some way to do it better, no matter how complex, impractical, or
unjustified by the real-world problem.

I think if a pthread_mutex had a 'yield to others blocking on this mutex'
kind of a 'go to the back of the line' option, that would cover the majority
of cases where sched_yield is your best choice currently. Unfortunately,
POSIX gave us yield.

Note that I think we all agree that any program whose performance relies on
quirks of sched_yield (such as the examples that have been cited as CFS
'regressions') are broken horribly. None of the cases I am suggesting use
sched_yield as anything more than a minor optimization.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: get amount of entropy in /dev/random ?

2007-10-02 Thread David Schwartz

> From the userlevel, can I get an estimate of the amount of entropy
> in /dev/random, that is, an estimate of the number of bytes
> readable until it blocks? Of course multiple processes
> can read bytes and this would not be exact ... but still ... as an upper
> boundary estimate?
>
> Thanks
> Yakov

Yes. Look in drivers/char/random.c at the random_ioctl handler. You will see
RNDGETENTCNT.
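
For example (an untested sketch; note that the ioctl reports the
estimate in bits, so divide by 8 for a rough upper bound in bytes):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/random.h>

int main(void)
{
        int fd = open("/dev/random", O_RDONLY);
        int bits;

        if (fd < 0)
                return 1;
        /* Ask the kernel for its current entropy estimate. */
        if (ioctl(fd, RNDGETENTCNT, &bits) == 0)
                printf("entropy: %d bits (~%d bytes)\n", bits, bits / 8);
        close(fd);
        return 0;
}

The same estimate is also exported at
/proc/sys/kernel/random/entropy_avail.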

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Network slowdown due to CFS

2007-10-01 Thread David Schwartz

> yielding IS blocking. Just with indeterminate fuzzyness added to it

Yielding is sort of blocking, but the difference is that yielding will not
idle the CPU while blocking might. Yielding is sometimes preferable to
blocking in a case where the thread knows it can make forward progress even
if it doesn't get the resource. (As in the examples I explained.)

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Network slowdown due to CFS

2007-10-01 Thread David Schwartz

Arjan van de Ven wrote:

> > It can occasionally be an optimization. You may have a case where you
> > can do something very efficiently if a lock is not held, but you
> > cannot afford to wait for the lock to be released. So you check the
> > lock, if it's held, you yield and then check again. If that fails,
> > you do it the less optimal way (for example, dispatching it to a
> > thread that *can* afford to wait).

> at this point it's "use a futex" instead; once you're doing system
> calls you might as well use the right one for what you're trying to
> achieve.

There are two answers to this. One is that you sometimes are writing POSIX
code and Linux-specific optimizations don't change the fact that you still
need a portable implementation.

The other answer is that futexes don't change anything in this case. In
fact, the last time I hit this, the lock was a futex on Linux.
Nevertheless, that doesn't change the basic issue. The lock is locked, you
cannot afford to wait for it, but not getting the lock is expensive. The
solution is to yield and check the lock again. If it's still held, you
dispatch to another thread, but many times, yielding can avoid that.

A futex doesn't change the fact that sometimes you can't afford to block on
a lock but nevertheless would save significant effort if you were able to
acquire it. Odds are the thread that holds it is about to release it anyway.

That is, you need something in-between "non-blocking trylock, fail easily"
and "blocking lock, do not fail", but you'd rather make forward progress
without the lock than actually block/sleep.
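
In code, the pattern is roughly this (just a sketch; do_work_locked()
and dispatch_to_worker() are invented stand-ins for the fast and slow
paths):

#include <pthread.h>
#include <sched.h>

static void do_work_locked(void)     { /* fast path, lock held */ }
static void dispatch_to_worker(void) { /* slow path: hand off  */ }

static void fast_path_or_dispatch(pthread_mutex_t *lock)
{
        if (pthread_mutex_trylock(lock) != 0) {
                /* Odds are the holder is about to release it. */
                sched_yield();
                if (pthread_mutex_trylock(lock) != 0) {
                        dispatch_to_worker();
                        return;
                }
        }
        do_work_locked();
        pthread_mutex_unlock(lock);
}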

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Network slowdown due to CFS

2007-10-01 Thread David Schwartz

> These are generic statements, but i'm _really_ interested in the
> specifics. Real, specific code that i can look at. The typical Linux
> distro consists of in execess of 500 millions of lines of code, in tens
> of thousands of apps, so there really must be some good, valid and
> "right" use of sched_yield() somewhere in there, in some mainstream app,
> right? (because, as you might have guessed it, in the past decade of
> sched_yield() existence i _have_ seen my share of sched_yield()
> utilizing user-space code, and at the moment i'm not really impressed by
> those examples.)

Maybe, maybe not. Even if so, it would be very difficult to find. Simply
grepping for sched_yield is not going to help because determining whether a
given use of sched_yield is smart is not going to be easy.

> (user-space spinlocks are broken beyond words for anything but perhaps
> SCHED_FIFO tasks.)

User-space spinlocks are broken so spinlocks can only be implemented in
kernel-space? Even if you use the kernel to schedule/unschedule the tasks,
you still have to spin in user-space.

> > One example I know of is a defragmenter for a multi-threaded memory
> > allocator, and it has to lock whole pools. When it releases these
> > locks, it calls yield before re-acquiring them to go back to work. The
> > idea is to "go to the back of the line" if any threads are blocking on
> > those mutexes.

> at a quick glance this seems broken too - but if you show the specific
> code i might be able to point out the breakage in detail. (One
> underlying problem here appears to be fairness: a quick unlock/lock
> sequence may starve out other threads. yield wont solve that fundamental
> problem either, and it will introduce random latencies into apps using
> this memory allocator.)

You are assuming that random latencies are necessarily bad. Random latencies
may be significantly better than predictable high latency.


> > Can you explain what the current sched_yield behavior *is* for CFS and
> > what the tunable does to change it?

> sure. (and i described that flag on lkml before) The sched_yield flag
> does two things:

>  - if 0 ("opportunistic mode"), then the task will reschedule to any
>other task that is in "bigger need for CPU time" than the currently
>running task, as indicated by CFS's ->wait_runtime metric. (or as
>indicated by the similar ->vruntime metric in sched-devel.git)
>
>  - if 1 ("agressive mode"), then the task will be one-time requeued to
>the right end of the CFS rbtree. This means that for one instance,
>all other tasks will run before this task will run again - after that
>this task's natural ordering within the rbtree is restored.

Thank you. Unfortunately, neither of these does what sched_yield is really
supposed to do. Opportunistic mode does too little and aggressive mode does
too much.

> > The desired behavior is for the current thread to not be rescheduled
> > until every thread at the same static priority as this thread has had
> > a chance to be scheduled.

> do you realize that this "desired behavior" you just described is not
> achieved by the old scheduler, and that this random behavior _is_ the
> main problem here? If yield was well-specified then we could implement
> it in a well-specified way - even if the API was poor.

> But fact is that it is _not_ well-specified, and apps grew upon a random
> scheduler implementation details in random ways. (in the lkml discussion
> about this topic, Linus offered a pretty sane theoretical definition for
> yield but it's not simple to implement [and no scheduler implements it
> at the moment] - nor will it map to the old scheduler's yield behavior
> so we'll end up breaking more apps.)

I don't have a problem with failing to emulate the old scheduler's behavior
if we can show that the new behavior has saner semantics. Unfortunately, in
this case, I think CFS' semantics are pretty bad. Neither of these is what
sched_yield is supposed to do.

Note that I'm not saying this is a particularly big deal. And I'm not
calling CFS' behavior a regression, since it's not really better or worse
than the old behavior, simply different.

I'm not familiar enough with CFS' internals to help much on the
implementation, but there may be some simple compromise yield that might
work well enough. How about simply acting as if the task used up its
timeslice and scheduling the next one? (Possibly with a slight reduction in
penalty or reward for not really using all the time, if possible?)
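
Very roughly, and purely hypothetically (this is not CFS code; every
name below is invented), I mean something like:

struct task {
        long vruntime;          /* ordering key, as in CFS          */
        long slice_remaining;   /* unused part of the current slice */
};

/* Charge the yielding task as if it had consumed its whole
 * timeslice, then requeue it and pick the next task as usual. */
static void yield_charge_full_slice(struct task *t)
{
        t->vruntime += t->slice_remaining;
        t->slice_remaining = 0;
        /* ...requeue t by vruntime and schedule the next task... */
}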

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Network slowdown due to CFS

2007-10-01 Thread David Schwartz

> * Jarek Poplawski <[EMAIL PROTECTED]> wrote:
>
> > BTW, it looks like risky to criticise sched_yield too much: some
> > people can misinterpret such discussions and stop using this at all,
> > even where it's right.

> Really, i have never seen a _single_ mainstream app where the use of
> sched_yield() was the right choice.

It can occasionally be an optimization. You may have a case where you can do
something very efficiently if a lock is not held, but you cannot afford to
wait for the lock to be released. So you check the lock, if it's held, you
yield and then check again. If that fails, you do it the less optimal way
(for example, dispatching it to a thread that *can* afford to wait).

It is also sometimes used in the implementation of spinlock-type primitives.
After spinning fails, yielding is tried.

I think it's also sometimes appropriate when a thread may monopolize a
mutex. For example, consider a rarely-run task that cleans up some expensive
structures. It may need to hold locks that are only held during this complex
clean up.

One example I know of is a defragmenter for a multi-threaded memory
allocator, and it has to lock whole pools. When it releases these locks, it
calls yield before re-acquiring them to go back to work. The idea is to "go
to the back of the line" if any threads are blocking on those mutexes.
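
Schematically, the loop looks like this (a sketch with invented names,
not the allocator's real code):

#include <pthread.h>
#include <sched.h>

struct pool {
        pthread_mutex_t lock;
        struct pool *next;
};

static void defragment_pool(struct pool *p) { /* ... */ (void)p; }

static void defragment_all(struct pool *head)
{
        struct pool *p;

        for (p = head; p != NULL; p = p->next) {
                pthread_mutex_lock(&p->lock);
                defragment_pool(p);
                pthread_mutex_unlock(&p->lock);
                /* Go to the back of the line if anyone is waiting. */
                sched_yield();
        }
}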

There are certainly other ways to do these things, but I have seen cases
where, IMO, yielding was the best solution. Doing nothing would have been
okay too.

> Fortunately, the sched_yield() API is already one of the most rarely
> used scheduler functionalities, so it does not really matter. [ In my
> experience a Linux scheduler is stabilizing pretty well when the
> discussion shifts to yield behavior, because that shows that everything
> else is pretty much fine ;-) ]

Can you explain what the current sched_yield behavior *is* for CFS and what
the tunable does to change it?

The desired behavior is for the current thread to not be rescheduled until
every thread at the same static priority as this thread has had a chance to
be scheduled.

Of course, it's not clear exactly what a "chance" is.

The semantics with respect to threads at other static priority levels is not
clear. Ditto for SMP issues. It's also not clear whether threads that yield
should be rewarded or punished for doing so.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Network slowdown due to CFS

2007-09-26 Thread David Schwartz

> > I think the real fix would be for iperf to use blocking network IO
> > though, or maybe to use a POSIX mutex or POSIX semaphores.
>
> So it's definitely not a bug in the kernel, only in iperf?

Martin:

Actually, in this case I think iperf is doing the right thing (though not
the best thing) and the kernel is doing the wrong thing. It's calling
'sched_yield' to ensure that every other thread gets a chance to run before
the current thread runs again. CFS is not doing that, allowing the yielding
thread to hog the CPU to the exclusion of the other threads. (It can allow
the yielding thread to hog the CPU, of course, just not to the exclusion of
other threads.)

It's still better to use some kind of rational synchronization primitive
(like mutex/semaphore) so that the other threads can tell you when there's
something for you to do. It's still better to use blocking network IO, so
the kernel will let you know exactly when to try I/O and your dynamic
priority can rise.

Ingo:

Can you clarify what CFS' current default sched_yield implementation is and
what setting sched_compat_yield to 1 does? Which way do we get the right
semantics (all threads of equal static priority are scheduled, with some
possible SMP fuzziness, before this thread is scheduled again)?

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: CFS: some bad numbers with Java/database threading [FIXED]

2007-09-19 Thread David Schwartz

Ack, sorry, I'm wrong.

Please ignore me, if you weren't already.

I'm glad to hear this will be fixed. The task should be moved last for its
priority level.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: CFS: some bad numbers with Java/database threading [FIXED]

2007-09-19 Thread David Schwartz

Chris Friesen wrote:

> > The yielding task has given up the cpu.  The other task should get to
> > run for a timeslice (or whatever the equivalent is in CFS) until the
> > yielding task again "becomes head of the thread list".

> Are you sure this isn't happening? I'll run some tests on my SMP
> system running CFS. But I'll bet the context switch rate is quite rapid.

Yep, that's exactly what's happening. The tasks are alternating. They are
both always ready-to-run. The yielding task is put at the end of the queue
for its priority level.

There is no reason the yielding task should get less CPU since they're both
always ready-to-run.

The only downside here is that a yielding task results in very small
timeslices, which cause cache inefficiencies. A sane lower bound on the
timeslice might be a good idea. But there is no semantic problem.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: CFS: some bad numbers with Java/database threading [FIXED]

2007-09-19 Thread David Schwartz

> David Schwartz wrote:

> > Nonsense. The task is always ready-to-run. There is no reason
> > its CPU should
> > be low. This bug report is based on a misunderstanding of what yielding
> > means.

> The yielding task has given up the cpu.  The other task should get to
> run for a timeslice (or whatever the equivalent is in CFS) until the
> yielding task again "becomes head of the thread list".

Are you sure this isn't happening? I'll run some tests on my SMP system
running CFS. But I'll bet the context switch rate is quite rapid.

Honestly, I can't imagine what else could be happening here. Does CFS spin
in a loop doing nothing when you call sched_yield even though another task
is ready-to-run? That seems kind of bizarre. Is sched_yield acting as a
no-op?

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: CFS: some bad numbers with Java/database threading [FIXED]

2007-09-19 Thread David Schwartz

> The CFS scheduler does not seem to implement sched_yield correctly. If one
> program loops with a sched_yield and another program prints out timing
> information in a loop. You will see that if both are taskset to
> the same core
> that the timing stats will be twice as long as when they are on
> different cores.
> This problem was not in 2.6.21-1.3194 but showed up in
> 2.6.22.4-65 and continues
> in the newest released kernel 2.6.22.5-76.

I disagree with the bug report.

> You will see that both tasks use 50% of the CPU.
> Then kill task2 and run:
> "taskset -c 1 ./task2"

This seems right. They're both always ready to run. They're at the same
priority. Neither ever blocks. There is no reason one should get more CPU
than the other.

> Now task2 will run twice as fast verifying that it is not some
> anomaly with the
> way top calculates CPU usage with sched_yield.
>
> Actual results:
> Tasks with sched_yield do not yield like they are suppose to.

Umm, how does he get that? It's yielding at blinding speed.

> Expected results:
> The sched_yield task's CPU usage should go to near 0% when
> another task is on
> the same CPU.

Nonsense. The task is always ready-to-run. There is no reason its CPU should
be low. This bug report is based on a misunderstanding of what yielding
means.

The Linux man page says:

   "A process can relinquish the processor voluntarily without blocking
   by calling sched_yield(). The process will then be moved to the end
   of the queue for its static priority and a new process gets to run."

Notice the "without blocking" part?

POSIX says:

"The sched_yield() function forces the running thread to relinquish the
processor until it again becomes the head of its thread list. It takes no
arguments."

CFS is perfectly complying with both of these. This bug report is a great
example of how sched_yield can be misunderstood and misused.

You can even argue that the sched_yield process should get even more CPU,
since it's voluntarily relinquishing (which should be rewarded) rather than
infinitely spinning (which should be punished). (Not that I agree with this
argument, I'm just using it to counter-balance the other argument.)
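
If anyone wants to see it for themselves, a minimal reproduction is
just the following, with two instances pinned to one core (e.g.
"taskset -c 0 ./task"):

#include <sched.h>

int main(void)
{
        /* Always runnable again immediately: yielding does not
         * make the task stop being ready-to-run, so two of these
         * on one core split the CPU roughly 50/50. */
        for (;;)
                sched_yield();
}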

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: Wasting our Freedom

2007-09-17 Thread David Schwartz

> "David Schwartz" <[EMAIL PROTECTED]> writes:

> > My point is that you *cannot* prevent a recipient of a
> > derivative work from
> > receiving any rights under either the GPL or the BSD to any protectable
> > elements in that work.
>
> Of course you can.

No you can't.

> What rights do you have to BSD-licenced works, made available
> (under BSD) to MS exclusively? You only get the binary object...

You are equating what rights I have with my ability to exercise those
rights. They are not the same thing. For example, I once bought the rights
to publicly display the movie "Monty Python and the Holy Grail". To my
surprise, the rights to public display did not include an actual copy of the
film.

In any event, I never claimed that anyone has rights to a protectable
element that they do not possess a lawful copy of. That's a completely
separate issue and one that has nothing to do with what's being discussed
here because these are all cases where you have the work.

> You know, this is quite common practice - instead of assigning
> copyright, you can grant a BSD-style licence (for some fee,
> something like "do what you want but I will do what I want with
> my code").

Sure, *you* can grant a BSD-style license to any protectable elements *you*
authored. But unless your recipients can obtain a BSD-style license to all
protectable elements in the work from their respective authors, they cannot
modify or distribute it.

*You* cannot grant any rights to protectable elements authored by someone
else, unless you have a relicensing agreement. Neither the GPL nor the BSD
is one of those.

> >> If A sold a BSD licence to B only and this B sold a proprietary
> >> licence (for a derived work) to C, C (without that clause) wouldn't
> >> have a BSD licence to the original work. This is BTW common scenario.
> >
> > C most certainly would have a BSD license, should he choose to
> > comply with
> > terms, to every protectable element that is in both the
> > original work and
> > the work he received.

> But he may have received only binary program image - or the source
> under NDA.
> Sure, NDA doesn't cover public information, but BSD doesn't mean public.
> Now what?

What the hell does that have to do with anything? Are you just trying to be
deliberately dense or waste time? Is it not totally obvious how the
principles I explain apply to a case like that?

Only someone who signs an NDA must comply with it. If you signed an NDA, you
must comply with it. An NDA can definitely subtract rights. It's a complex
question whether an NDA can subtract GPL rights, but again, that has nothing
to do with what we're talking about here.

Sure, you can have the right from me to do X and still not be allowed to do
X because you agreed with someone else not to do it. So what?

> > C has no right to license any protectable element he did not author to
> > anyone else. He cannot set the license terms for those elements to C.

> Sure, the licence covers the >>>entire work<<<, not some "elements".

This is a misleading statement. The phrase "entire work" has two senses. The
license definitely does not cover the "entire work" in the sense of every
protectable element in the work unless each individual author of those
elements chose to offer that element under that license.

If by "entire work", you mean any compilation or derivative work copyright
the "final" author has, then yes, that's available under whatever license
the "final" author places it under. But that license does not actually
permit you to distribute the work.

This is really complicated and I wish I had a clear way to explain it.
Suppose I write a work and then you modify it. Assume your modification
includes adding new protectable elements to that work. When someone
distributes that new derivative work, they are distributing protectable
elements authored by both you and me.

Absent a relicensing agreement, they must obtain some rights from you and
some rights from me to do that. You cannot license the protectable elements
that I authored that are still in the resulting derivative work.

> > Neither the BSD nor the GPL ever give you the right to change the actual
> > license a work is offered under by the original author.
>
> Of course, that's a very distant thing.

Exactly. Every protectable element in the final work is licensed by the
original author to every recipient who takes advantage of the license offer.

> >> BTW: a work by multiple authors is a different thing than a work
> >> derived from another.
> >
> > In practice it doesn't matter.
>
> Of course it does. Only author of a (derived) work can licence
> it, in this case he/she could change the licence 

RE: Wasting our Freedom

2007-09-17 Thread David Schwartz

Krzysztof Halasa writes:

> "David Schwartz" <[EMAIL PROTECTED]> writes:
>
> > Theodore Tso writes:
>
> hardly

I apologize for the error in attribution.

> > Of course you don't need a license to *use* the derived work.
> > You never need
> > a license to use a work. (In the United States. Some countries
> > word this a
> > bit differently but get the same effect.)

> Really? I thought you need a licence to use, say, MS Windows.
> Even to possess a copy. But I don't know about USA, I'm told
> there are strange things happening there :-)

No, you do not need a license to use MS Windows. Microsoft may choose to
compel you to agree to a license in exchange for allowing you to install a
copy, but that is not quite the same thing.

If you read United States copyright law, you will see that *use* is not one
of the rights reserved to the copyright holder. Every lawful possessor of a
work may use it in the ordinary way, assuming they did not *agree* to some
kind of restriction.

> > If, however, you wanted to get the right to modify or distribute a
> > derivative work, you would need to obtain the rights to every
> > protectable
> > element in that work.

> Of course.

> > Read GPL section 6, particularly this part: "Each time you
> > redistribute the
> > Program (or any work based on the Program), the recipient automatically
> > receives a license from the original licensor to copy,
> > distribute or modify
> > the Program subject to
> > these terms and conditions."

> Seems fine, your point?

My point is that you *cannot* prevent a recipient of a derivative work from
receiving any rights under either the GPL or the BSD to any protectable
elements in that work.

> In addition to the rights from you (to the whole derived work),
> the recipient receives rights to the original work, from original
> author.
> It makes perfect sense, making sure the original author can't sue
> you like in the SCO case.
>
> If A sold a BSD licence to B only and this B sold a proprietary
> licence (for a derived work) to C, C (without that clause) wouldn't
> have a BSD licence to the original work. This is BTW common scenario.

C most certainly would have a BSD license, should he choose to comply with
terms, to every protectable element that is in both the original work and
the work he received.

C has no right to license any protectable element he did not author to
anyone else. He cannot set the license terms for those elements to C.

Again, read GPL section 6. (And this is true for the BSD license as well, at
least in the United States, because it's the only way such a license could
work.)

Neither the BSD nor the GPL ever give you the right to change the actual
license a work is offered under by the original author. In fact, they could
not give you this right under US copyright law. Modifying the license *text*
is not the same thing as modifying the license.

> > To distribute a derivative work that contains protectable elements from
> > multiple authors, you are distributing all of those elements
> > and need the
> > rights to all of them. You need a license to each element and
> > in the absence
> > of any relicensing arrangements (which the GPL and BSD license are not),
> > only the original author can grant that to you.

> Of course.
>
> BTW: a work by multiple authors is a different thing than a work
> derived from another.

In practice it doesn't matter. All that matters is that you have a single
fixed form of expression that contains creative elements contributed by
different people potentially under different licenses. The issues of whether
it's a derivative work or a combined work and whether the distributor has
contributed sufficient protectable elements to assert their own copyright
really have no effect on any of the issues that matter here.

> > It is a common confusion that just because the final author has
> > copyright in
> > the derivative work, that means he can license the work.

> Of course he (and only he) can. It doesn't mean the end users can't
> receive additional rights.

No, he can't. He can only license those protectable elements that he
authored.

There is no way you can license protectable elements authored by another
absent a relicensing agreement. The GPL is explicitly not a relicensing
agreement, see section 6. The BSD license is implicitly not a relicensing
agreement.

> Come on, licence = promise not to sue. Why would the copyright
> holder be unable to promise not to sue? It just doesn't make sense.

A license is not just a promise not to sue, it's an *enforceable*
*commitment* not to sue. It's an explicit grant of permission against legal
rights.

Would you argue that I can license Disney's "The Lion King" movie to you if
I promise not to sue you?
