Re: [Numpy-discussion] New package to speed up ufunc inner loops

2020-11-04 Thread Robert Kern
On Wed, Nov 4, 2020 at 7:22 PM Aaron Meurer  wrote:

>
> > That's not to say that there isn't clearer language that could be
> > drafted. The NEP is still in Draft stage. But if you think it could be
> > clearer, please propose specific edits to the draft. Like with unclear
> > documentation, it's the person who finds the current docs
> > insufficient/confusing/unclear that is in the best position to recommend
> > the language that would have helped them. Collaboration helps.
>
> I disagree. The best person to write documentation is the person who
> actually understands the package. I already noted that I don't
> actually understand the actual situation with the trademark, for
> instance.
>

Rather, I meant that the best person to fix confusing language is the
person who was confused, after consulting with the authors/experts to
reach a consensus about what was intended.


> I don't really understand why there is pushback against making the NEP
> clearer. Also "like with unclear documentation", if someone says that
> documentation is unclear, you should take their word for it that it
> actually is, and improve it, rather than somehow trying to argue that
> they actually aren't confused.
>

I'm not. I'm saying that I don't know how to make it more clear to those
people because I'm not experiencing it like they are. The things I could
think to add are the same kinds of things that were already stated
explicitly in the Abstract, Motivation, and Scope. It seems like Stefan is
in the same boat. Authors need editors, but the editor can't just say
"rewrite!" I don't know what kind of assumptions and context this
hypothetical reader is bringing to this reading that are leading to
confusion. Sometimes it's clear, but not for me, here (and more relevantly,
Stefan).

Do you think this needs a complete revamp? Or just an additional sentence
to explicitly state that this does not add additional legal restrictions to
the copyright license?

-- 
Robert Kern
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New package to speed up ufunc inner loops

2020-11-04 Thread Stefan van der Walt
On Wed, Nov 4, 2020, at 16:21, Aaron Meurer wrote:
> But as I noted, this is already off topic for the original discussion
> here, and since there's apparently no interest in improving the NEP
> wording, I'll drop it.

I was trying to understand where, specifically, the language falls short, and 
what to do about improving it.

Perhaps a sentence making it clear that this is not a licensing issue will 
assuage your concerns? If not, please help me understand where statements are 
overly strong, unclear, or insufficient in coverage.

Best regards, 
Stéfan 


Re: [Numpy-discussion] New package to speed up ufunc inner loops

2020-11-04 Thread Aaron Meurer
> Misinterpreted in what way? That they would think we have an ability to 
> enforce the guidelines? We *are* trying to encourage certain behavior here. 
> If they read it and, out of abundant caution, reach out to us, that's a fine 
> outcome.
> What negative outcomes do you foresee?

That it is a legal requirement, as part of the license to use NumPy.
The negative outcome is that someone reads the document and believes
NumPy to not actually be open source software.

> That's not to say that there isn't clearer language that could be drafted. 
> The NEP is still in Draft stage. But if you think it could be clearer, please 
> propose specific edits to the draft. Like with unclear documentation, it's 
> the person who finds the current docs insufficient/confusing/unclear that is 
> in the best position to recommend the language that would have helped them. 
> Collaboration helps.

I disagree. The best person to write documentation is the person who
actually understands the package. I already noted that I don't
actually understand the actual situation with the trademark, for
instance.

I don't really understand why there is pushback against making the NEP
clearer. Also "like with unclear documentation", if someone says that
documentation is unclear, you should take their word for it that it
actually is, and improve it, rather than somehow trying to argue that
they actually aren't confused.

But as I noted, this is already off topic for the original discussion
here, and since there's apparently no interest in improving the NEP
wording, I'll drop it.

Aaron Meurer


Re: [Numpy-discussion] New package to speed up ufunc inner loops

2020-11-04 Thread Robert Kern
On Wed, Nov 4, 2020 at 5:55 PM Aaron Meurer  wrote:

> On Wed, Nov 4, 2020 at 3:02 PM Robert Kern  wrote:
> >
> > On Wed, Nov 4, 2020 at 4:49 PM Aaron Meurer  wrote:
> >>
> >> I hope this isn't too off topic, but this "fair play" NEP reads like
> >> it is a set of additional restrictions on the NumPy license, which if
> >> it is, would make NumPy no longer open source by the OSI definition. I
> >> think the NEP should be much clearer that these are requests but not
> >> requirements.
> >
> >
> > FWIW, I don't read the NEP like that. Aside from the trademark on the
> > name "NumPy", which _are_ enforceable requirements but are orthogonal to
> > the copyright license, I see enough "request-like" language on everything
> > else.
>
> To be clear, I don't read it like that either. But I also implicitly
> understand that this is the intention of the document, because I know
> that NumPy wouldn't actually place restrictions like these on its
> license. My point is just that the document ought to be clearer about
> this, as I can easily see someone misinterpreting it, especially if
> they aren't close enough to the community that they would implicitly
> understand that it is only a set of guidelines.
>
> > There is no language of forced restriction.
>
> The language you quoted reads ambiguously to me. It isn't forced, but
> it also isn't obviously nonforced. "Please talk to us first" is the
> sort of language I would expect to see for software that is
> commercially licensed and can only be used with permission. All the
> bullet points say "do not", which sounds forced to me. And the
> trademark thing makes it even more confusing because even if you read
> the rest as "only guidelines", it isn't clear if this is somehow an
> exception.
>

If you pick out an individual sentence and consider it in isolation, sure.
But there's a significant amount of context in the Abstract, Motivation,
and Scope sections that preface the rules. And the discussion of many of
the rules explicitly discusses ways to "break" the rules if you have to. We
use "rule" language in many contexts besides legally-enforceable contracts
and licenses.

> Again, *I* understand the purpose of this document, but I think the
> way it is currently written it could easily be misinterpreted by
> someone else.
>

I'm willing to wait for someone to actually misinterpret it.

That's not to say that there isn't clearer language that could be drafted.
The NEP is still in Draft stage. But if you think it could be clearer,
please propose specific edits to the draft. Like with unclear
documentation, it's the person who finds the current docs
insufficient/confusing/unclear that is in the best position to recommend
the language that would have helped them. Collaboration helps.

-- 
Robert Kern


Re: [Numpy-discussion] New package to speed up ufunc inner loops

2020-11-04 Thread Stefan van der Walt
On Wed, Nov 4, 2020, at 14:54, Aaron Meurer wrote:
> Again, *I* understand the purpose of this document, but I think the
> way it is currently written it could easily be misinterpreted by
> someone else.

Misinterpreted in what way? That they would think we have an ability to enforce 
the guidelines? We *are* trying to encourage certain behavior here. If they 
read it and, out of abundant caution, reach out to us, that's a fine outcome. 

What negative outcomes do you foresee? 

Stéfan 


Re: [Numpy-discussion] New package to speed up ufunc inner loops

2020-11-04 Thread Stefan van der Walt
On Wed, Nov 4, 2020, at 13:47, Aaron Meurer wrote:
> I hope this isn't too off topic, but this "fair play" NEP reads like
> it is a set of additional restrictions on the NumPy license, which if
> it is, would make NumPy no longer open source by the OSI definition. I
> think the NEP should be much clearer that these are requests but not
> requirements.

Specifically, the NEP is worded as follows:

"""
This document aims to define a minimal set of rules that, when followed, will 
be considered good-faith efforts in line with the expectations of the NumPy 
developers.

...

When in doubt, please talk to us first. We may suggest an alternative; at 
minimum, we’ll be prepared.
"""

There is no language of forced restriction.

The heading in question is "Do not reuse the NumPy name for projects not 
developed by the NumPy community".  Matti is a member of our community, and 
while the project may be sponsored by others, he is doing exactly what the NEP 
recommends: discussing the issue with the community.

Community members should weigh in if they see an issue with the naming.  I 
don't think this is a particularly good name for a package (not easy to 
pronounce, does not indicate functionality of the package), but I don't 
personally have an issue with it either.

Best regards,
Stéfan


Re: [Numpy-discussion] New package to speed up ufunc inner loops

2020-11-04 Thread Sebastian Berg
On Tue, 2020-11-03 at 17:54 +0200, Matti Picus wrote:
> Hi. On behalf of Quansight and RTOSHoldings, I would like to
> introduce 
> "pnumpy", a package to speed up NumPy.
> 
> https://quansight.github.io/numpy-threading-extensions/stable/index.html
> 

Nice to see these efforts, especially with the intention of possible
upstreaming.  I hope we can improve the NumPy infrastructure to make
such attempts much easier and more powerful in the future! (And as I
mentioned, I had such things in mind with NEP 43, albeit as a possible
later extension, not an explicit goal.)

I am a bit curious about the actual performance improvements, even
without allowing more flexibility on the NumPy side. My gut feeling
would be fairly large variations, with sometimes big improvements due to
parallelization but often only added overhead due to NumPy not giving
you deep enough control?


As to the name, I don't have an issue with using `pnumpy`, although I
was never hugely concerned about it.

Initially I thought a longer name might be nicer, but the old(?)
accelerated-numpy or fast_numpy_loops doesn't seem that much clearer to
me.  I guess in the end, I think it's just important to be clear that
this type of project patches/modifies NumPy and is not associated with
it directly.

It seems `pnumpy` is currently taken on PyPI, though, with a small
number of downloads: https://pypistats.org/packages/pnumpy
(although I wonder how many are actual users).

Cheers,

Sebastian


> 
> What is in it?
> 
> - use "PyUFunc_ReplaceLoopBySignature" to hook all the UFunc inner
> loops
> 
> - When the inner loop is called with a large enough array, chunk the 
> data and perform the iteration via a thread pool
> 
> - Add a different memory allocator for "ndarray" data (will require
> an 
> appropriate API from NumPy)
> 
> - Allow using optimized loops above and beyond what NumPy provides
> 
> - Allow logging inner loop calls and parameters to learn about the 
> current process and perhaps tune the performance accordingly
> 
> 
> The first release contains the hooking mechanism and the thread
> pool, 
> the rest has been prototyped but is not ready for release. The idea 
> behind the package is that a third-party package can try things out
> and 
> iterate much faster than NumPy. If some of the ideas bear fruit, and
> do 
> not add an undue maintenance burden to NumPy, the code can be ported
> to 
> NumPy. I am not sure NumPy wishes to take upon itself the burden of 
> managing threads, but a third-party package may be able to.
> 
> 
> I am writing to the mailing list both to announce the pre-release
> under 
> the wrong name, and, in accordance with the fair play rules[1], to 
> request use of the "numpy" name in the package. We had considered
> many 
> options, in the end would like to propose "pnumpy" (the p is either 
> "parallel" or "performant" or "preliminary", whatever you desire).
> 
> 
> Matti
> 
> 
> [1] https://numpy.org/neps/nep-0036-fair-play.html#fair-play-rules
> 
> 





Re: [Numpy-discussion] New package to speed up ufunc inner loops

2020-11-04 Thread Robert Kern
On Wed, Nov 4, 2020 at 4:49 PM Aaron Meurer  wrote:

> I hope this isn't too off topic, but this "fair play" NEP reads like
> it is a set of additional restrictions on the NumPy license, which if
> it is, would make NumPy no longer open source by the OSI definition. I
> think the NEP should be much clearer that these are requests but not
> requirements.
>

FWIW, I don't read the NEP like that. Aside from the trademark on the name
"NumPy", which _are_ enforceable requirements but are orthogonal to the
copyright license, I see enough "request-like" language on everything else.

-- 
Robert Kern


Re: [Numpy-discussion] New package to speed up ufunc inner loops

2020-11-04 Thread Aaron Meurer
I hope this isn't too off topic, but this "fair play" NEP reads like
it is a set of additional restrictions on the NumPy license, which if
it is, would make NumPy no longer open source by the OSI definition. I
think the NEP should be much clearer that these are requests but not
requirements.

Aaron Meurer

On Wed, Nov 4, 2020 at 2:44 PM Ralf Gommers  wrote:
>
>
>
> On Tue, Nov 3, 2020 at 3:54 PM Matti Picus  wrote:
>>
>> Hi. On behalf of Quansight and RTOSHoldings, I would like to introduce
>> "pnumpy", a package to speed up NumPy.
>>
>> https://quansight.github.io/numpy-threading-extensions/stable/index.html
>>
>>
>> What is in it?
>>
>> - use "PyUFunc_ReplaceLoopBySignature" to hook all the UFunc inner loops
>>
>> - When the inner loop is called with a large enough array, chunk the
>> data and perform the iteration via a thread pool
>>
>> - Add a different memory allocator for "ndarray" data (will require an
>> appropriate API from NumPy)
>>
>> - Allow using optimized loops above and beyond what NumPy provides
>>
>> - Allow logging inner loop calls and parameters to learn about the
>> current process and perhaps tune the performance accordingly
>>
>>
>> The first release contains the hooking mechanism and the thread pool,
>> the rest has been prototyped but is not ready for release. The idea
>> behind the package is that a third-party package can try things out and
>> iterate much faster than NumPy. If some of the ideas bear fruit, and do
>> not add an undue maintenance burden to NumPy, the code can be ported to
>> NumPy. I am not sure NumPy wishes to take upon itself the burden of
>> managing threads, but a third-party package may be able to.
>>
>>
>> I am writing to the mailing list both to announce the pre-release under
>> the wrong name, and, in accordance with the fair play rules[1], to
>> request use of the "numpy" name in the package. We had considered many
>> options, in the end would like to propose "pnumpy" (the p is either
>> "parallel" or "performant" or "preliminary", whatever you desire).
>
>
> Thanks Matti!
>
> Obviously as another Quansight employee I have a conflict of interest here, 
> so let me just say I wasn't involved with choosing the `pnumpy` name but did 
> already comment internally on using "numpy" as part of the package name would 
> probably be fine, given that Matti is the main author and the intent is to 
> migrate the useful parts into NumPy itself.
>
> Hopefully someone else can comment, maybe Stéfan as the "fair play" NEP 
> author?
>
> Cheers,
> Ralf
>
>
>>
>>
>> Matti
>>
>>
>> [1] https://numpy.org/neps/nep-0036-fair-play.html#fair-play-rules
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New package to speed up ufunc inner loops

2020-11-04 Thread Ralf Gommers
On Tue, Nov 3, 2020 at 3:54 PM Matti Picus  wrote:

> Hi. On behalf of Quansight and RTOSHoldings, I would like to introduce
> "pnumpy", a package to speed up NumPy.
>
> https://quansight.github.io/numpy-threading-extensions/stable/index.html
>
>
> What is in it?
>
> - use "PyUFunc_ReplaceLoopBySignature" to hook all the UFunc inner loops
>
> - When the inner loop is called with a large enough array, chunk the
> data and perform the iteration via a thread pool
>
> - Add a different memory allocator for "ndarray" data (will require an
> appropriate API from NumPy)
>
> - Allow using optimized loops above and beyond what NumPy provides
>
> - Allow logging inner loop calls and parameters to learn about the
> current process and perhaps tune the performance accordingly
>
>
> The first release contains the hooking mechanism and the thread pool,
> the rest has been prototyped but is not ready for release. The idea
> behind the package is that a third-party package can try things out and
> iterate much faster than NumPy. If some of the ideas bear fruit, and do
> not add an undue maintenance burden to NumPy, the code can be ported to
> NumPy. I am not sure NumPy wishes to take upon itself the burden of
> managing threads, but a third-party package may be able to.
>
>
> I am writing to the mailing list both to announce the pre-release under
> the wrong name, and, in accordance with the fair play rules[1], to
> request use of the "numpy" name in the package. We had considered many
> options, in the end would like to propose "pnumpy" (the p is either
> "parallel" or "performant" or "preliminary", whatever you desire).
>

Thanks Matti!

Obviously, as another Quansight employee I have a conflict of interest here,
so let me just say I wasn't involved with choosing the `pnumpy` name but
did already comment internally that using "numpy" as part of the package
name would probably be fine, given that Matti is the main author and the
intent is to migrate the useful parts into NumPy itself.

Hopefully someone else can comment, maybe Stéfan as the "fair play" NEP
author?

Cheers,
Ralf



>
> Matti
>
>
> [1] https://numpy.org/neps/nep-0036-fair-play.html#fair-play-rules
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>


[Numpy-discussion] New package to speed up ufunc inner loops

2020-11-03 Thread Matti Picus
Hi. On behalf of Quansight and RTOSHoldings, I would like to introduce 
"pnumpy", a package to speed up NumPy.


https://quansight.github.io/numpy-threading-extensions/stable/index.html


What is in it?

- Use "PyUFunc_ReplaceLoopBySignature" to hook all the UFunc inner loops

- When the inner loop is called with a large enough array, chunk the 
data and perform the iteration via a thread pool


- Add a different memory allocator for "ndarray" data (will require an 
appropriate API from NumPy)


- Allow using optimized loops above and beyond what NumPy provides

- Allow logging inner loop calls and parameters to learn about the 
current process and perhaps tune the performance accordingly
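
To illustrate the chunk-and-thread-pool idea from the list above, here is a
minimal Python-level sketch. It is not how pnumpy is implemented (the package
hooks the inner loops in C via "PyUFunc_ReplaceLoopBySignature"); the function
name `chunked_ufunc` and the parameters `min_size` and `n_workers` are
hypothetical and chosen only for this example. It relies on NumPy releasing
the GIL inside ufunc inner loops for non-object dtypes, so the threads can
actually run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def chunked_ufunc(ufunc, x, min_size=100_000, n_workers=4):
    """Apply a unary ufunc to a 1-D array, using threads for large inputs.

    Hypothetical helper for illustration only; not part of NumPy or pnumpy.
    """
    x = np.asarray(x)
    if x.ndim != 1:
        raise ValueError("this sketch only handles 1-D arrays")

    # Small arrays: threading overhead would dominate, so call directly.
    if x.size < min_size:
        return ufunc(x)

    # Probe the result dtype on one element, then preallocate the output.
    out = np.empty(x.shape, dtype=ufunc(x[:1]).dtype)

    # Split [0, n) into n_workers contiguous chunks.
    bounds = np.linspace(0, x.shape[0], n_workers + 1, dtype=int)

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        # Slices are views, so each thread writes into its own part of `out`.
        ufunc(x[lo:hi], out=out[lo:hi])

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Consume the iterator so any exception in a worker is re-raised.
        list(pool.map(work, range(n_workers)))
    return out
```

Calling `chunked_ufunc(np.sqrt, x)` on a large float array should give the
same values as `np.sqrt(x)`; whether it is faster depends on the array size,
the ufunc, and the machine, which is why a size threshold is part of the
design.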



The first release contains the hooking mechanism and the thread pool, 
the rest has been prototyped but is not ready for release. The idea 
behind the package is that a third-party package can try things out and 
iterate much faster than NumPy. If some of the ideas bear fruit, and do 
not add an undue maintenance burden to NumPy, the code can be ported to 
NumPy. I am not sure NumPy wishes to take upon itself the burden of 
managing threads, but a third-party package may be able to.



I am writing to the mailing list both to announce the pre-release under 
the wrong name, and, in accordance with the fair play rules[1], to 
request use of the "numpy" name in the package. We considered many 
options and in the end would like to propose "pnumpy" (the p is either 
"parallel" or "performant" or "preliminary", whatever you desire).



Matti


[1] https://numpy.org/neps/nep-0036-fair-play.html#fair-play-rules
