Re: [Python-Dev] Official citation for Python

2018-09-16 Thread Stephen J. Turnbull
Jacqueline Kazil writes:

 > *As a user, I am writing an academic paper and I need to cite Python. *

I don't understand the meaning of "need" and "Python".  To understand
your code, one likely needs the Language Reference and surely the
Library Reference, and probably documentation of the APIs and
semantics of various third party code.

To just give credit to the Python project for the suite of tools
you've used, a citation like the R Project's should do (I think this
has appeared more than once, I copy it from José María Mateos's
parallel post):

 > To cite R in publications use:

 >   R Core Team (2018). R: A language and environment for statistical
 >   computing. R Foundation for Statistical Computing, Vienna, Austria.
 >   URL https://www.R-project.org/.

I guess for Python that would be something like

"""
Python Core Developers [2018].  Python: A general purpose language for
computing, with batteries included.  Python Software Foundation,
Beaverton, OR.  https://www.python.org/.
"""

I like R's citation() builtin.

One caveat: I get the impression that the R Project is far more
centralized than Python is, that there are not huge independent
projects like SciPy and NumPy and Twisted and so on, nor independent
implementations of the core language like PyPy and Jython.  So I
suspect that for most serious scientific computing you would need to
cite one or more third-party projects as well, and perhaps an
implementation such as PyPy or Jython.

Jacqueline again:

 > Let's throw reproducibility out the window for now (<--- something
 > I never thought I would say), because that should be captured in
 > the code, not in the citations.
 >
 > So, if we don't need the specific version of Python, then maybe
 > creating one citation is all we need.

Do you realize that `3 / 2` means different computations depending on
the version of Python?  And that `"a string"` produces different
objects with different duck-types depending on the version?
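
A minimal sketch of that version dependence, assuming it is run under
either Python 2.7 or Python 3; the helper name `describe_literals` is
made up for illustration:

```python
import sys


def describe_literals():
    """Report how the running interpreter treats `3 / 2` and a plain string literal."""
    # On Python 2, int / int floors (gives 1); on Python 3 it is true division (1.5).
    quotient = 3 / 2
    # A plain literal is a byte string on Python 2 and a text string on Python 3,
    # so it only matches the type of u"" on Python 3.
    literal_is_text = isinstance("a string", type(u""))
    return sys.version_info[0], quotient, literal_is_text
```

Calling `describe_literals()` under both interpreters shows why a
version-free citation leaves the meaning of the cited code open.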

As far as handling versions, this would do, I think:

f"""
Python Core Developers [{release_year}].  Python: A general purpose
language for computing, with batteries included, version
{version_number}.  Python Software Foundation, Beaverton, OR.
Project URL: https://www.python.org/.
"""
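
In the spirit of R's citation(), the template above could be filled in
from the running interpreter.  This is only a sketch: no such function
exists in Python, the name `citation` and its parameters are
hypothetical, and the release year is passed in by the caller because
this sketch does not try to derive it.

```python
import sys

# The wording is taken from the template proposed above.
CITATION_TEMPLATE = (
    "Python Core Developers [{release_year}].  Python: A general purpose "
    "language for computing, with batteries included, version "
    "{version_number}.  Python Software Foundation, Beaverton, OR.  "
    "Project URL: https://www.python.org/."
)


def citation(release_year, version_info=sys.version_info):
    """Return the proposed citation text for the given interpreter version."""
    # Use major.minor.micro, e.g. "3.7.0".
    version_number = "{}.{}.{}".format(*version_info[:3])
    return CITATION_TEMPLATE.format(
        release_year=release_year, version_number=version_number
    )
```

For example, `citation(2018, (3, 7, 0))` yields the text with "version
3.7.0" filled in.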
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register

2018-09-16 Thread Wes Turner
Should Python builds add `-mindirect-branch=thunk
-mindirect-branch-register` to CFLAGS?

Where in the build scripts would this need to be added, and for which
architectures?

/Qspectre is the MSVC build flag for Spectre Variant 1:

> The /Qspectre option is available in Visual Studio 2017 version 15.7 and
> later.

https://docs.microsoft.com/en-us/cpp/build/reference/qspectre?view=vs-2017

security@ directed me to the issue tracker / lists,
so I'm forwarding this to python-dev and python-ideas, as well.

# Forwarded message
From: *Wes Turner* 
Date: Wednesday, September 12, 2018
Subject: SEC: Spectre variant 2: GCC: -mindirect-branch=thunk
-mindirect-branch-register
To: distutils-sig 


Should C extensions that compile all add
`-mindirect-branch=thunk -mindirect-branch-register` [1] to mitigate the
risk of Spectre variant 2 (which does indeed affect user space applications
as well as kernels)?

[1] https://github.com/speed47/spectre-meltdown-checker/issues/119#issuecomment-361432244
[2] https://en.wikipedia.org/wiki/Spectre_(security_vulnerability)
[3] https://en.wikipedia.org/wiki/Speculative_Store_Bypass#Speculative_execution_exploit_variants

On Wednesday, September 12, 2018, Wes Turner  wrote:
>
>> On Wednesday, September 12, 2018, Joni Orponen 
>> wrote:
>>
>>> On Wed, Sep 12, 2018 at 8:48 PM Wes Turner  wrote:
>>>
>>>> Should C extensions that compile all add
>>>> `-mindirect-branch=thunk -mindirect-branch-register` [1] to mitigate
>>>> the risk of Spectre variant 2 (which does indeed affect user space
>>>> applications as well as kernels)?
>>>>
>>>
>>> Are those available on GCC <= 4.2.0 as per PEP 513?
>>>
>>
>> AFAIU, only
>> GCC 7.3 and 8 have the retpoline (indirect-branch=thunk) support enabled
>> by the `-mindirect-branch=thunk -mindirect-branch-register` CFLAGS.
>>
>
 On Wednesday, September 12, 2018, Wes Turner  wrote:

> "What is a retpoline and how does it work?"
> https://stackoverflow.com/questions/48089426/what-is-a-retpoline-and-how-does-it-work
>
>


Re: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register

2018-09-16 Thread Wes Turner
On Sunday, September 16, 2018, Wes Turner  wrote:

> Should Python builds add `-mindirect-branch=thunk
> -mindirect-branch-register` to CFLAGS?
> [..]

There has probably already been an announcement (ANN) about this?

If not, someone with appropriate security posture and syntax could address:

Whether python.org binaries are already rebuilt

Whether OS package binaries are already rebuilt

Whether anaconda binaries are already rebuilt

Whether C extension binaries on pypi are already rebuilt


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-16 Thread Neil Schemenauer
On 2018-09-15, Paul Moore wrote:
> On Fri, 14 Sep 2018 at 23:28, Neil Schemenauer  wrote:
> > We could have a new format, .pya (compiled python archive) that has
> > data for many .pyc files in it.
[..]
> Isn't that essentially what putting the stdlib in a zipfile does? (See
> the windows embedded distribution for an example). It probably uses
> normal IO rather than mmap, but maybe adding a "use mmap" flag to the
> zipfile module would be a more general enhancement that zipimport
> could use for free.

Yeah, it's close to the same thing.  If the syscalls are what gives
the speedup, using a better zipfile implementation might give nearly
the same benefit.
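
For reference, the zipfile route already works with the standard
import machinery.  A minimal sketch, assuming a throwaway module; the
names `import_from_zip`, `bundle.zip` and `pya_demo` are made up for
illustration:

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Write one module into a fresh zip archive and import it from there."""
    archive = os.path.join(tempfile.mkdtemp(), "bundle.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(module_name + ".py", source)
    # An archive on sys.path is served by the zipimport machinery.
    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(archive)


demo = import_from_zip("pya_demo", "ANSWER = 42\n")
```

The imported module's `__file__` points inside the archive, which is
the property a .pya-style container would share.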

At the sprint we discussed a variation of Larry's (FB's) patch.
Allow the frozen data to be in DLLs as well as in the python
executable data segment.  So, importlib would be frozen into the
exe.  The standard library could become another DLL.  The user could
provide one or more DLLs that contains their app code and package
deps.  In general, I think there would only be two DLLs: stdlib and
app+deps.

My suggestion of a special format (similar to zipfile) was
motivated by the wish to avoid platform build tools.  E.g. Windows
users would have a harder time building DLLs.  However, I now think
depending on platform build tools is fine.  The people who will
build these DLLs will have the tools and skills to do so.  Even if
there is only a DLL for the stdlib, it will be a win.  If no DLLs
are provided, you get the same behavior as current Python (i.e.
importlib is frozen in, everything else can come from .py files).

I think there is no question that Larry's PR will be faster than the
zipfile approach.  It removes the unmarshal step.  Maybe that benefit
will be small but I think it should count.  Also, I suspect the OS
can page-in the DLL on-demand and perhaps leave parts of module .pyc
data on disk.  Larry had the idea of keeping code objects frozen
until they need to be executed.  It's a cool idea that would be
enabled by this first step.
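
To make the unmarshal step concrete, here is a small sketch of the
round trip that a .pyc-based import pays for and that pre-built C
structures would skip.  It uses only the stdlib `marshal` module; the
`greet` function is a made-up example and this is not code from the PR:

```python
import marshal

source = "def greet(name):\n    return 'hello ' + name\n"
code = compile(source, "<demo>", "exec")

# Roughly what a .pyc file stores after its header.
payload = marshal.dumps(code)
# The unmarshal step: rebuild the code object from bytes at import time.
restored = marshal.loads(payload)

namespace = {}
exec(restored, namespace)
result = namespace["greet"]("world")
```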

I'm excited about Larry's PR.  I think if we get it cleaned up and
into Python 3.8, we will clearly leave Python 2.7 behind in terms of
startup performance.  That has been a goal of mine for a couple
years now.

Regards,

  Neil


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-16 Thread Antoine Pitrou
On Fri, 14 Sep 2018 14:27:37 -0700
Larry Hastings  wrote:
> 
> I don't propose to merge the patch in its current state.  I think it 
> would need a lot of work both in terms of "doing things the way Python 
> does it" as well as just code smell (the serializer is implemented in 
> both C and Python and jumps back and forth, also the build process for 
> the serialized modules is pretty tiresome).
> 
> Is it worth working on?

I think it's of limited interest if it only helps with modules used
during the startup sequence, not arbitrary stdlib or third-party
modules.

To give an idea, on my machine the baseline Python startup is about 20ms
(`time python -c pass`), but if I import Numpy it grows to 100ms, and
with Pandas it's more than 200ms.  Saving 4ms on the baseline startup
would make no practical difference for concrete usage.
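
A rough way to repeat that measurement from Python itself, as a sketch
only: the helper name `startup_seconds` is made up, and the numbers
will differ by machine and by which packages are installed.

```python
import subprocess
import sys
import time


def startup_seconds(statement="pass", runs=5):
    """Return the best wall-clock time over several `python -c statement` runs."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", statement], check=True)
        best = min(best, time.perf_counter() - start)
    return best


baseline = startup_seconds("pass")
```

Passing "import numpy" or "import pandas" as the statement (where
those packages are installed) gives the comparison described above.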

I'm ready to think there are other use cases where it matters, though.

Regards

Antoine.




Re: [Python-Dev] Official citation for Python

2018-09-16 Thread Jacqueline Kazil
RE: Why cite Python….

I would say that in this paper —
http://conference.scipy.org/proceedings/scipy2015/pdfs/jacqueline_kazil.pdf,
where we introduced a new library, we should have cited Python, because the
library was based in Python. We were riding on the coattails of Python and
if Python did not exist, then this library would not exist.

(taking this a level higher)
Just as someone doing research (a specific application) should cite the
Mesa library. Without the good and bad that is Mesa, their research would
have taken a different form.

Since my Ph.D is on Mesa, I will be citing Python there.

I think for more insight we can look at who has cited some of Guido’s stuff…
For example:
https://scholar.google.com/scholar?cites=900267235435084077&as_sdt=20005&sciodt=0,9&hl=en

Does that help?

RE: Just like R - Versions

@Stephen
Are you suggesting major versions or minor versions?

RE: Guido’s prior works

Some of those have weight already. Should we be picking one of those and
pointing people to that?

Final decision

I am going to the NumFocus summit for maintainers of Science Python
libraries next week. I believe that the Science Python community is where
the main audience for this is… correct me if you think this is a wrong
assumption.

I thought I could take two to three concrete formats and user test there
and report on how community members who would be using the citation feel.

Good idea? Bad idea?



-- 
Jacqueline Kazil | @jackiekazil


Re: [Python-Dev] Official citation for Python

2018-09-16 Thread Brett Cannon
On Sun, 16 Sep 2018 at 15:23 Jacqueline Kazil  wrote:

> [..]
> Final decision
>
> I am going to the NumFocus summit for maintainers of Science Python
> libraries next week. I believe that the Science Python community is where
> the main audience for this is… correct me if you think this is a wrong
> assumption.
>
> I thought I could take two to three concrete formats and user test there
> and report on how community members who would be using the citation feel.
>
> Good idea? Bad idea?
>
I think seeing how academics other than the ones here react definitely
wouldn't hurt.

-Brett




Re: [Python-Dev] Official citation for Python

2018-09-16 Thread Jacqueline Kazil
Cool, thanks!

On Sun, Sep 16, 2018 at 7:19 PM Brett Cannon  wrote:

> I think seeing how some other academics other than the ones here
> definitely wouldn't hurt.
>
> -Brett

-- 
Jacqueline Kazil | @jackiekazil


Re: [Python-Dev] Official citation for Python

2018-09-16 Thread Paul Ganssle
I think the "why" in this case should be a bit deeper than that, because
until recently, it's been somewhat unusual to cite the /tools you use/
to create a paper.

I see three major reasons why people cite software packages, and the
form of the citation would have different requirements for each one:

1. *Academic credit / Academic use metrics*

Given the weird way that academia has evolved, academics are largely judged by
their publications and how influential those publications are. A lot of
the people who work on statistical and scientific Python libraries are
doing excellent and incredibly influential work, but that's largely
invisible to the metrics used by funding and tenure committees, so
there's been an effort to do things like getting DOIs for libraries or
publishing articles in journals like the Journal of Open Source
Software: https://joss.theoj.org

Then you cite the libraries if you use them, and the people who
contribute to the work can say, "Look I'm a regular contributor to this
core library that is cited in 90% of papers". This seems less important
to CPython, where the majority of core contributors (as far as I can
tell) are not academics and have little use for high h-index papers.
That said, even if no one involved cares about the academic credit, if
every paper that used Python cited the language, it probably /would/
provide useful metrics to the PSF and others interested in this.

If all you want is a formal way to say "I used Python for this" as a
citation so that it can be tracked, then a single DOI for the entire
language should be sufficient.

2. *As a primary source or example for some claims*

If you are writing an article about language design and you are
referencing how Python handles async or scoping or unicode or something,
you want to make it easy for your readers to see the context of your
statement, to verify that it's true and to get more details than you
might want to include as part of what may be a tangential mention in
your paper. I have a sense that this is closer to the original reason
people cited things in papers and books before citations became a metric
for measuring influence - and subsequently a way to give credit for the
source of ideas.

If this is why you are citing Python, you should probably be citing a
specific sub-section of the language reference and/or documentation, and
that citation should probably be versioned, since new features are added
in every minor version, and the way some of these things are handled may
change over time. In this case, a separate DOI for each minor version
that points to the documentation as built by a specific commit or git
tag or whatever would probably be ideal.

3. *To aid reproducibility*

It won't go all the way towards reproducing your research, but given
that Python is a living language that is always changing - both in
implementation and the spec itself - to the extent that you have a
"methods" section, it should probably include things like operating
system version, CPython version and the versions of all libraries you
used so that if someone is failing to replicate your results, they know
how to build an environment where it /should work/.

If you want to include this information in the form of a citation, then
I would think that you would want to be both /more/ granular - citing
the specific interpreter you used (CPython, Jython, PyPy), the full
version (3.6.6 rather than 3.6) and possibly even other factors like
operating system, etc. - and /less/ granular, in that you don't need to
cite a specific subset of the interpreter (e.g. async), but just the
interpreter as a whole.
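
For reason 2 above, a versioned pointer into the documentation can be
built mechanically.  A sketch, assuming docs.python.org keeps serving
one documentation tree per minor version; the helper name
`versioned_docs_url` is made up, and no per-version DOI exists today:

```python
import sys


def versioned_docs_url(page, version_info=sys.version_info):
    """Build a docs.python.org URL pinned to a minor version such as 3.6."""
    minor_version = "{}.{}".format(version_info[0], version_info[1])
    return "https://docs.python.org/{}/{}".format(minor_version, page)
```

For instance, `versioned_docs_url("reference/datamodel.html", (3, 6, 6))`
points at the 3.6 edition of the data model chapter rather than
whatever the current release says.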

--

My thoughts on the matter are that I think the CPython core dev team
probably cares a lot less about #1 than, say, the R dev team, which is
one reason why there's no clear way to cite "CPython" as a whole.

I think that #3 is a very laudable goal, but probably should be in some
sort of "methods" section of the document being prepared rather than
overloading citations for it, though having a standardized way to
describe your Python setup (similar to, say, the pandas debugging
feature `pandas.show_versions()`) that is optimized for publication
would probably be super helpful.
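
A minimal sketch of what such a report could look like, using only the
stdlib.  The function name `environment_report` is hypothetical, and
`importlib.metadata` requires Python 3.8 or later, so this is an
illustration rather than something available on the releases current
at the time of this thread:

```python
import platform
from importlib import metadata  # Python 3.8+


def environment_report(packages=()):
    """Describe interpreter, OS and selected package versions as plain text."""
    lines = [
        "Implementation: {} {}".format(
            platform.python_implementation(), platform.python_version()
        ),
        "Operating system: {}".format(platform.platform()),
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append("{}: {}".format(name, version))
    return "\n".join(lines)
```

Something like `environment_report(["numpy", "pandas"])` could then be
pasted into a methods section.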

While #2 is probably only a small fraction of all the times where people
would want to "cite CPython", I think it's probably the most important
one, since it's performing a very specific function useful to the reader
of the paper. It also seems not terribly difficult to come up with some
guidance for unambiguously referencing sections of the documentation
and/or language reference, and having "get a DOI for the documentation"
be part of the release cycle.

Best,
Paul

P.S. I will also be at the NumFocus summit. It's been some time since
I've been an academic, but hopefully there will be an interesting
discussion about this there!


Re: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register

2018-09-16 Thread Wes Turner
Are all current Python builds and C extensions vulnerable to Spectre
variants {1, 2, *}?

There are now multiple threads:

"SEC: Spectre variant 2: GCC: -mindirect-branch=thunk
-mindirect-branch-register"
-
https://mail.python.org/mm3/archives/list/distutils-...@python.org/thread/4BGE226DB5EWIAT5VCJ75QD5ASOVJZCM/
- https://mail.python.org/pipermail/python-ideas/2018-September/053473.html
- https://mail.python.org/pipermail/python-dev/2018-September/155199.html


Original thread (that I forwarded to security@):
"[Python-ideas] Executable space protection: NX bit,"
https://mail.python.org/pipermail/python-ideas/2018-September/053175.html
> ~ Do trampolines / nested functions in C extensions switch off the NX bit?

On Sunday, September 16, 2018, Nathaniel Smith  wrote:

> On Wed, Sep 12, 2018, 12:29 Joni Orponen  wrote:
>
>> On Wed, Sep 12, 2018 at 8:48 PM Wes Turner  wrote:
>>
>>> Should C extensions that compile all add
>>> `-mindirect-branch=thunk -mindirect-branch-register` [1] to mitigate the
>>> risk of Spectre variant 2 (which does indeed affect user space applications
>>> as well as kernels)?
>>>
>>
>> Are those available on GCC <= 4.2.0 as per PEP 513?
>>
>
> Pretty sure no manylinux1 compiler is ever going to get these mitigations.
>
> For manylinux2010 on x86-64, we can easily use a much newer compiler: RH
> maintains a recent compiler, currently gcc 7.3, or if that doesn't work for
> some reason then the conda folks have apparently figured out how to
> build the equivalent from gcc upstream releases.
>

Are there different CFLAGS and/or gcc compatibility flags in conda builds
of Python and C extensions?

Where are those set in conda builds?

What's the best way to set CFLAGS in Python builds and C extensions?

export CFLAGS="-mindirect-branch=thunk -mindirect-branch-register"
./configure
make

?
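
One way to see which flags a given interpreter was built with (and which
distutils-based extension builds typically inherit) is to ask sysconfig.
A minimal sketch, assuming a POSIX build where the CFLAGS config variable
is populated; the helper name missing_retpoline_flags is hypothetical and
only checks for the two flags named in this thread:

```python
import shlex
import sysconfig

# Flags discussed in this thread (GCC's retpoline-style mitigation for Spectre variant 2).
RETPOLINE_FLAGS = ("-mindirect-branch=thunk", "-mindirect-branch-register")


def missing_retpoline_flags(cflags):
    """Return the flags from RETPOLINE_FLAGS that do not appear in cflags.

    Hypothetical helper for illustration; cflags may be None or empty.
    """
    present = set(shlex.split(cflags or ""))
    return [flag for flag in RETPOLINE_FLAGS if flag not in present]


if __name__ == "__main__":
    # CFLAGS recorded when this interpreter was built; this can be None
    # on some platforms (for example Windows builds).
    build_cflags = sysconfig.get_config_var("CFLAGS")
    print("build CFLAGS:", build_cflags)
    print("missing:", missing_retpoline_flags(build_cflags))
```

This only reports what the build configuration recorded; it says nothing
about whether the compiler in use actually supports those flags.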

Why are we supposed to use an old version of GCC that doesn't have the
retpoline patches that only mitigate Spectre variant 2?


>
> Unfortunately, the manylinux2010 infrastructure is not quite ready... I'm
> pretty sure it needs some volunteers to push it to the finish line, though
> unfortunately I haven't had enough time to keep track.
>

"PEP 571 -- The manylinux2010 Platform Tag"
https://www.python.org/dev/peps/pep-0571/

"Tracking issue for manylinux2010 rollout"
https://github.com/pypa/manylinux/issues/179

Are all current Python builds and C extensions vulnerable to Spectre
variants {1, 2, *}?

>


Re: [Python-Dev] Official citation for Python

2018-09-16 Thread Jeremy Hylton
I wanted to start with an easy answer that is surely unsatisfying:
http://blog.apastyle.org/apastyle/2015/01/how-to-cite-software-in-apa-style.html

APA style is pretty popular, and it says that standard software doesn't
need to be specified. Standard software includes "Microsoft Word, Java, and
Adobe Photoshop." So I'd say Python fits well in that category, and doesn't
need to be cited.

I said you wouldn't be satisfied...

On Sat, Sep 15, 2018 at 11:02 AM Jacqueline Kazil 
wrote:

> I just got caught up on the thread. This is a really great discussion.
> Thank you for all the contributions.
>
> Before we get into the details, let's go back to the main use case we are
> trying to solve.
> *As a user, I am writing an academic paper and I need to cite Python. *
>

The goal here is ambiguous. Python means many things--a language described
by the language specification, the source code of a particular
implementation of the language (Python often refers to CPython), a
particular binary release of the implementation of the language (Python
1.5.2 for Windows). Which one is relevant in the context of the paper? If
you're talking about a bug in timsort in a particular version of CPython,
then you probably want to cite that specific version of the implementation.

I suspect the most common goal for a citation is just to describe the
language "in general" where 1.5.2 or 3.7.0 and Jython or CPython are all
details that don't matter. In that case I'd cite the language
specification. We're talking about putting a citation in a paper (a written
source) and the written language specification captures what we think of as
essential for the language. If you want to cite Turing's proof of the
undecidability of the halting problem, you'd cite the paper where he wrote
it down (in Proceedings of the London Mathematical Society). If you want to
cite a programming language in the abstract, cite the specification that
describes it.

I think style guides are relevant here. They give guidance on how to cite
an item based on its category. For example, the MLA style guide describes
how to cite a digital file, a physical object, and many other things. My
favorite example under "physical object" is "Physical objects found
online." (Think about it :-).

There's some discussion of how to cite source code here:
http://integrity.mit.edu/handbook/writing-code. Notably this is talking
about citing source code in the context of other source code, and it mostly
recommends using URLs. If you wanted to cite a particular piece of source
code in a written article, you'd probably follow one of the approaches for
citing online resources. Try to identify who / when / what / where. For
example, MLA style for a blog post would be: Editor, screen name, author,
or compiler name (if available). “Posting Title.” Name of Site, Version
number (if available), Name of institution/organization affiliated with the
site (sponsor or publisher), URL. Date of access. You could cite a
particular source file this way or a particular source release.

The date usually refers to the original publication date. I think that was
with the 1.0 release, although I'm not sure. I'd probably pick that date,
but someone can correct me if there's an earlier date. It would suggest
somehow that current Python and the original Python were mostly the same
thing, which is an idea I like.

van Rossum, Guido (1994). "The Python Language Reference". Python Software
Foundation, https://docs.python.org/reference/index.html. Retrieved 16
September 2018.
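
The who / when / what / where breakdown can be sketched as a small
formatter. This is only an illustration: the Citation class and its field
names are hypothetical, and the output layout simply mirrors the example
citation above rather than any official style guide.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    """Hypothetical container for the who / when / what / where of a citation."""
    author: str      # who
    year: int        # when (original publication)
    title: str       # what
    publisher: str   # where (organization)
    url: str         # where (location)
    retrieved: str   # date of access

    def format(self):
        # Layout copied from the example citation in this message,
        # not taken from a style guide.
        return (
            f'{self.author} ({self.year}). "{self.title}". '
            f"{self.publisher}, {self.url}. Retrieved {self.retrieved}."
        )


python_ref = Citation(
    author="van Rossum, Guido",
    year=1994,
    title="The Python Language Reference",
    publisher="Python Software Foundation",
    url="https://docs.python.org/reference/index.html",
    retrieved="16 September 2018",
)
print(python_ref.format())
```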

I'd say that's all settled. If anyone asks you, "How can you be sure that
settles it?" You can answer, "Some guy said it on a mailing list." And then
you can cite the message:

Jeremy Hylton. "[Python-Dev] Official citation for Python." Sep. 17, 2018.
python-dev, https://mail.python.org/mailman/listinfo/python-dev. Accessed
18 September 2018.

Jeremy


> Let's throw reproducibility out the window for now (<--- something I never
> thought I would say), because that should be captured in the code, not in
> the citations.
>
> So, if we don't need the specific version of Python, then maybe creating
> one citation is all we need.
> And that gives it some good Google juice as well.
>
> Thoughts?
>
> (Once we nail down one or many, I think we can then move into the details
> of the content of the citation.)
>
> -Jackie
>
> On Thu, Sep 13, 2018 at 12:47 AM Wes Turner  wrote:
>
>> There was a thread about adding __cite__ to things and a tool to collect
>> those citations a while back.
>>
>> "[Python-ideas] Add a __cite__ method for scientific packages"
>> http://markmail.org/thread/rekmbmh64qxwcind
>>
>> Which CPython source file should contain this __cite__ value?
>>
>> ... On a related note, you should ask the list admin to append a URL to
>> each mailing list message whenever this list is upgraded to mm3; so that
>> you can all be appropriately cited.
>>
>> On Thursday, September 13, 2018, Wes Turner  wrote:
>>
>>> Do you guys think we should all