date:20190320

Greg Ewing:
> So use NamedTemporaryFile(delete = False) and close it before passing it to 
> the other program.

That's effectively the same as calling tempfile.mktemp.   While it does waste 
time opening and closing an unused file, that doesn't help with security.  If 
anything, it might worsen security.

If a secure implementation of mktemp is truly impossible, then the same could 
be said for NamedTemporaryFile(delete=False). Should that be deprecated as 
well?

regards, Anders

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Remove tempfile.mktemp()

2019-03-20 Thread Sebastian Rittau



On 20.03.19 at 09:47, Anders Munch wrote:

> Greg Ewing:
>> So use NamedTemporaryFile(delete = False) and close it before passing it to
>> the other program.
>
> That's effectively the same as calling tempfile.mktemp.   While it does waste
> time opening and closing an unused file, that doesn't help with security.  If
> anything, it might worsen security.


That is not actually true. The important difference is that with 
NamedTemporaryFile the file exists with appropriate access rights (0600). 
This denies other users access to that file. With mktemp() no file is 
created, so another user can "hijack" that name and cause programs to 
write potentially privileged data into, or read manipulated data from, 
that file.
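
As a rough illustration of that difference, the sketch below reserves a name with NamedTemporaryFile(delete=False) and inspects the resulting file. It assumes a POSIX system for the permission check, and the helper name reserve_temp_name is made up for this example.

```python
import os
import stat
import tempfile


def reserve_temp_name(suffix=""):
    """Create an empty temporary file and return its path (hypothetical helper)."""
    # delete=False keeps the file on disk after the handle is closed.
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as f:
        return f.name


path = reserve_temp_name(".dat")
exists = os.path.exists(path)                 # the name is already occupied
mode = stat.S_IMODE(os.stat(path).st_mode)
group_other_bits = mode & 0o077               # 0 on POSIX: owner-only access
os.unlink(path)                               # the caller must clean up
```

With mktemp() the first check would be false: only a name comes back, and nothing stops another user from creating that path first.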


 - Sebastian



Re: [Python-Dev] Remove tempfile.mktemp()

Anders Munch:
>>> So use NamedTemporaryFile(delete = False) and close it before passing it to 
>>> the other program.
>> That's effectively the same as calling tempfile.mktemp.   While it does 
>> waste time opening and closing an unused file, that doesn't help with 
>> security
Sebastian Rittau:
> That is not actually true. The important difference is that with 
> NamedTemporaryFile the file exists with appropriate access right (0600).

You are right, I must have mentally reversed the polarity of the delete 
argument.  And I didn't realise that the access right on a file had the power 
to prevent itself from being removed from the folder that it's in.  I thought 
the access flags were a property of the file itself and not the directory 
entry. Not sure how that works.

But if NamedTemporaryFile(delete=False) is secure then why not use that to 
implement mktemp?

def mktemp(suffix="", prefix=template, dir=None):
    with NamedTemporaryFile(delete=False, suffix=suffix, prefix=prefix,
                            dir=dir) as f:
        return f.name

Yes, it does leave an empty file if the name is not used, but the name is 
usually created with the intent to use it, so that is rarely going to be a 
problem. Just document that that's how it is.  It does mean that where there's 
an explicit file-exists check before writing the file, that code will break. 
But it will break a lot less code than removing mktemp entirely.

regards, Anders


Re: [Python-Dev] Remove tempfile.mktemp()

2019-03-20 Thread eryk sun

On 3/20/19, Anders Munch  wrote:
>
> You are right, I must have mentally reversed the polarity of the delete
> argument.  And I didn't realise that the access right on a file had the
> power to prevent itself from being removed from the folder that it's in.  I
> thought the access flags were a property of the file itself and not the
> directory entry. Not sure how that works.

In POSIX, it's secure so long as we use a directory that doesn't grant
write access to other users, or one that has the sticky bit set such
as "/tmp". A directory that has the sticky bit set allows only root
and the owner of the file to unlink the file.
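
The POSIX condition can be checked from Python along these lines. This is only a sketch: the helper name is invented, it looks at the mode bits alone (no ACLs, no ownership checks), and it treats group write access as access by "other users".

```python
import os
import stat
import tempfile


def others_cannot_unlink(directory):
    """Return True if the mode bits stop other non-root users from removing
    our files in `directory` (hypothetical helper, mode bits only)."""
    mode = os.stat(directory).st_mode
    writable_by_others = bool(mode & (stat.S_IWGRP | stat.S_IWOTH))
    sticky = bool(mode & stat.S_ISVTX)
    # Either nobody else can write to the directory, or the sticky bit
    # restricts unlinking to root and the file's owner.
    return sticky or not writable_by_others


private_dir = tempfile.mkdtemp()          # mkdtemp() creates the directory with mode 0o700
private_ok = others_cannot_unlink(private_dir)
os.rmdir(private_dir)
```

A world-writable directory without the sticky bit (mode 0o777) would make the helper return False.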

In Windows, a user's default %TEMP% directory is only accessible by
the user, SYSTEM, and Administrators. The only way others can delete a
file there is if the file security is modified to allow it (possible
for individual files, unlike POSIX). This works even with no access to
the temp directory itself because users have SeChangeNotifyPrivilege,
which bypasses traverse (execute) access checks.

Re: [Python-Dev] Remove tempfile.mktemp()

Nathaniel J. Smith:
> Historically, mktemp variants have caused *tons* of serious security
> vulnerabilities. It's not a theoretical issue.

All the more reason to have a standard library function that gets it right.

> The choice of ENTROPY_BYTES is an interesting question. 16 (= 128 bits) would
> be a nice "obviously safe" number, and gives 22-byte filenames. We might be
> able to get away with fewer, if we had a plausible cost model for the
> attack. This is another point where a security specialist might be helpful 
> :-).

I'm not a security specialist but I play one on TV.
Here's my take on it.

Any kind of brute force attack will require at least one syscall per try, to
create a file or check if a file by a given name exists.  It's a safe assumption
that names have to be tried individually, because if the attacker has a faster
way of enumerating existing file names, then the entropy of the filename is
worthless anyway.

That means even with only 41 bits of entropy, the attacker will have to make 2^40
tries on average.  For an individual short-lived file, that could be enough;
even with a billion syscalls per second, that's over a thousand seconds, leaving
plenty of time to initiate whatever writes the file.
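
The arithmetic behind that estimate can be reproduced in a few lines. The rate of one billion tries per second is the figure assumed in the text; the function name is invented for this sketch.

```python
def expected_attack_seconds(entropy_bits, tries_per_second):
    """Average time to hit one name: half of the name space must be tried."""
    expected_tries = 2 ** (entropy_bits - 1)
    return expected_tries / tries_per_second


RATE = 1e9  # assumed: one billion syscalls per second

for bits in (41, 64, 80, 128):
    seconds = expected_attack_seconds(bits, RATE)
    years = seconds / (86400 * 365.25)
    print(f"{bits:3d} bits: {seconds:.3g} s ({years:.3g} years)")
```

For 41 bits this gives roughly 1100 seconds, which matches the "over a thousand seconds" figure.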

However, there could be applications where the window of attack is very long,
hours or days even, or that are constantly writing new temporary files, and
where the attacker can keep trying at a rapid pace, and then 41 bits is
definitely not secure.

128 bits seems like overkill: There's no birthday attack because no-one keeps
2^(ENTROPY_BITS/2) files around, and the attack is running on the attackee's
system, so there's no using specialised accelerator hardware.  I'd say 64 bits
is enough under those circumstances, but I wouldn't be surprised if a better
security specialist could make a case for more.  So maybe go with 80 bits,
which puts it at 15 or 16 characters.


Med venlig hilsen/Best regards

 Anders Munch
Chief Security Architect

T: +45 76266981  *  M: +45 51856626
a...@flonidan.dk  *  www.flonidan.com 
 FLONIDAN A/S  *  Islandsvej 29  *  DK-8700 Horsens  *  CVR: 89919916
Winner of the 2018 Frost & Sullivan Customer Leadership Award

Re: [Python-Dev] Remove tempfile.mktemp()

Hi,

I'm not really convinced that mktemp() should be made "more secure".
To be clear: mktemp() is vulnerable by design. It's not a matter of
entropy. You can watch the /tmp directory using inotify and "discover"
immediately the "secret" filename, it doesn't depend on the amount of
entropy used to generate the filename. A function is either unsafe or
secure.

Why does mktemp() only use 8 characters? Well, I guess that humans like to
be able to manually copy (type) a filename :-)

Note: For those who didn't notice, the "mktemp()" name comes from a
function with the same name in the libc.
http://man7.org/linux/man-pages/man3/mktemp.3.html

Victor

On Wed, Mar 20, 2019 at 12:29, Anders Munch wrote:
>
> Nathaniel J. Smith:
> > Historically, mktemp variants have caused *tons* of serious security
> > vulnerabilities. It's not a theoretical issue.
>
> All the more reason to have a standard library function that gets it right.
>
> > The choice of ENTROPY_BYTES is an interesting question. 16 (= 128 bits) 
> > would
> > be a nice "obviously safe" number, and gives 22-byte filenames. We might be
> > able to get away with fewer, if we had a plausible cost model for the
> > attack. This is another point where a security specialist might be helpful 
> > :-).
>
> I'm not a security specialist but I play one on TV.
> Here's my take on it.
>
> Any kind of brute force attack will require at least one syscall per try, to
> create a file or check if a file by a given name exists.  It's a safe 
> assumption
> that names have to be tried individually, because if the attacker has a faster
> way of enumerating existing file names, then the entropy of the filename is
> worthless anyway.
>
> That means even with only 41 bits of entropy, the attacker will have to make 2^40
> tries on average.  For an individual short-lived file, that could be enough;
> even with a billion syscalls per second, that's over a thousand seconds, 
> leaving
> plenty of time to initiate whatever writes the file.
>
> However, there could be applications where the window of attack is very long,
> hours or days even, or that are constantly writing new temporary files, and
> where the attacker can keep trying at a rapid pace, and then 41 bits is
> definitely not secure.
>
> 128 bits seems like overkill: There's no birthday attack because no-one keeps
> 2^(ENTROPY_BITS/2) files around, and the attack is running on the attackee's
> system, so there's no using specialised accelerator hardware.  I'd say 64 bits
> is enough under those circumstances, but I wouldn't be surprised if a better
> security specialist could make a case for more.  So maybe go with 80 bits,
> which puts it at 15 or 16 characters.
>
>
> Med venlig hilsen/Best regards
>
>  Anders Munch
> Chief Security Architect
>
> T: +45 76266981  *  M: +45 51856626
> a...@flonidan.dk  *  www.flonidan.com
>  FLONIDAN A/S  *  Islandsvej 29  *  DK-8700 Horsens  *  CVR: 89919916
> Winner of the 2018 Frost & Sullivan Customer Leadership Award



-- 
Night gathers, and now my watch begins. It shall not end until my death.

Re: [Python-Dev] Remove tempfile.mktemp()

2019-03-20 Thread Steven D'Aprano

On Wed, Mar 20, 2019 at 11:25:03AM +, Anders Munch wrote:

> 128 bits seems like overkill: There's no birthday attack because no-one keeps
> 2^(ENTROPY_BITS/2) files around, 

You haven't seen my Downloads folder... :-)

But seriously:

> and the attack is running on the attackee's
> system, so there's no using specialised accelerator hardware.  I'd say 64 bits
> is enough under those circumstances, but I wouldn't be surprised if a better
> security specialist could make a case for more.  So maybe go with 80 bits,
> that's puts it at 15 or 16 characters.

Why be so miserly with entropy? This probably isn't a token that goes to 
a human, who may have to type it into a web browser, or send it by SMS. 
It's likely to be a name used only by the machine. Using 128 bits is just 
22 characters using secrets.token_urlsafe().

The default entropy used by secrets is 32 bytes, which gives a 43 
character token. I have plenty of files with names longer than that:

"Funny video of cat playing piano while dog does backflips.mp4"

Of course, if you have some specific need for the file name to be 
shorter (or longer!) then there ought to be a way to set the entropy 
used. But I think the default secrets entropy is fine, and it's better to 
have longer names than shorter ones, within reason. I don't think 40-50 
characters (plus any prefix or suffix) is excessive for a temporary file 
intended for use by an application rather than a human.
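
The lengths quoted above are easy to check. The snippet is only a demonstration of secrets.token_urlsafe(); the variable names are arbitrary, and the 32-byte default is what current CPython uses (the documentation says it may change).

```python
import secrets

name_128 = secrets.token_urlsafe(16)   # 16 random bytes = 128 bits
name_256 = secrets.token_urlsafe(32)   # 32 random bytes, the current default

# URL-safe base64 encodes 6 bits per character and drops the padding.
print(len(name_128), name_128)
print(len(name_256), name_256)
```

The first token is 22 characters long and the second 43, which lines up with the numbers in the message.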

-- 
Steven

Re: [Python-Dev] Remove tempfile.mktemp()

2019-03-20 Thread Jeroen Demeyer


On 2019-03-20 12:45, Victor Stinner wrote:

> You can watch the /tmp directory using inotify and "discover"
> immediately the "secret" filename, it doesn't depend on the amount of
> entropy used to generate the filename.


That's not the problem. The security issue here is guessing the filename 
*before* it's created and putting a different file or symlink in place.


So I actually do think that mktemp() could be made secure by using a 
longer name generated by a secure random generator.
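
A related point, sketched here as an illustration rather than as part of the proposal: whoever finally creates the file can also refuse to write through something that was planted at that name. The helper create_exclusively is a made-up name; the flags are standard os.open() flags.

```python
import os
import tempfile


def create_exclusively(path):
    """Open `path` for writing only if nothing exists there yet.

    O_CREAT together with O_EXCL makes the call fail with FileExistsError
    when the name is already taken, including by a symlink.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    return os.open(path, flags, 0o600)


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "report.tmp")

fd = create_exclusively(target)       # first creation succeeds
os.close(fd)

try:
    create_exclusively(target)        # the name is now occupied
    second_attempt_failed = False
except FileExistsError:
    second_attempt_failed = True

os.unlink(target)
os.rmdir(workdir)
```

This only helps when the creating program is under our control; a name handed to another program depends on how that program opens it.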


Re: [Python-Dev] Remove tempfile.mktemp()

2019-03-20 Thread Steven D'Aprano

On Wed, Mar 20, 2019 at 12:45:40PM +0100, Victor Stinner wrote:
> Hi,
> 
> I'm not really convinced that mktemp() should be made "more secure".
> To be clear: mktemp() is vulnerable by design. It's not a matter of
> entropy. You can watch the /tmp directory using inotify and "discover"
> immediately the "secret" filename, it doesn't depend on the amount of
> entropy used to generate the filename. A function is either unsafe or
> secure.

Security is not a binary state, it is never either-or "unsafe" or 
"secure". Secure against what attacks? Unsafe under what circumstances?

I can use the unsafe mktemp on a stand alone single-user computer, 
disconnected from the internet, guaranteed to have nothing but trusted 
software, and it will be secure in practice.

Or I can use the "safe interfaces" and I'm still vulnerable to an 
Advanced Persistent Threat that has compromised the OS specifically to 
target my application. If the attacker controls the OS or the hardware, 
then effectively they've already won.

-- 
Steven

Re: [Python-Dev] Remove tempfile.mktemp()

Steven D'Aprano:
>> 128 bits seems like overkill: There's no birthday attack because 
>> no-one keeps 2^(ENTROPY_BITS/2) files around
> You haven't seen my Downloads folder... :-)

I put it to you that those files are not temporary :-)

> Why be so miserly with entropy?

I don't necessarily disagree.  

> Using 128 bits is just 22 characters using secrets.token_urlsafe().

A little more when you take into account case-insensitive file systems.

regards, Anders


Re: [Python-Dev] Remove tempfile.mktemp()

Victor Stinner:
> To be clear: mktemp() is vulnerable by design

No: mktemp() is vulnerable by implementation.  Specifically, returning a file 
name in a world-accessible location, /tmp.

regards, Anders


Re: [Python-Dev] Remove tempfile.mktemp()

2019-03-20 Thread Antoine Pitrou

On Wed, 20 Mar 2019 11:25:53 +1300
Greg Ewing  wrote:
> Antoine Pitrou wrote:
> > Does it always work? According to the docs, """Whether the name can be
> > used to open the file a second time, while the named temporary file is
> > still open, varies across platforms  
> 
> So use NamedTemporaryFile(delete = False) and close it before passing
> it to the other program.

How is it more secure than using mktemp()?

Regards

Antoine.



Re: [Python-Dev] Best way to specify docstrings for member objects

2019-03-20 Thread Ethan Furman


On 03/19/2019 11:55 AM, Raymond Hettinger wrote:

> I'm working on ways to improve help() by giving docstrings
> to member objects.

Cool!

> There's another way I would like to propose.  The __slots__
> definition already works with any iterable including a
> dictionary (the dict values are ignored), so we could use the
> values for the docstrings.
>
> [...]
>
> What do you all think about the proposal?


This proposal only works with objects defining __slots__, and only the objects 
in __slots__?  Does it help Enum, dataclasses, or other enhanced 
classes/objects?

--
~Ethan~

Re: [Python-Dev] Best way to specify docstrings for member objects

2019-03-20 Thread Gregory P. Smith

(answers above and below the quoting)

I like the idea of documenting attributes, but we shouldn't force the user
to use __slots__ as that has significant side effects and is rarely
something people should bother to use.  There are multiple types of
attributes: class and instance.  But regardless of where they are
initialized, they all define the API shape of a class (or instance).

Q: Where at runtime, regardless of the syntax chosen, would such docstrings
live?  One (of many) common conventions today is to just put them into an
Attributes: or similar section of the class docstring.  We could actually
do that automatically by appending a section to the class docstring, but
that unstructures the data, enshrining one format.  It could also break
existing code for users of the few APIs that treat docstrings as structured
runtime data instead of documentation, if someone were to try to use
attribute docstrings on subclasses of those library types.  (ply does
this; I believe some database abstraction APIs do as well.)

On Wed, Mar 20, 2019 at 12:41 AM Serhiy Storchaka 
wrote:

> 19.03.19 20:55, Raymond Hettinger wrote:
> > I'm working on ways to improve help() by giving docstrings to
> > member objects.
> >
> > One way to do it is to wait until after the class definition and then
> > make individual, direct assignments to __doc__ attributes.  This way widely
> > separates the docstrings from their initial __slots__ definition.   Working
> > downstream from the class definition feels awkward and doesn't look pretty.
> >
> > There's another way I would like to propose¹.  The __slots__ definition
> > already works with any iterable including a dictionary (the dict values are
> > ignored), so we could use the values for the docstrings.
>
> I think it would be nice to separate docstrings from the bytecode. This
> would allow having several translated sets of docstrings and loading an
> appropriate set depending on user preferences. This would help in
> teaching Python.
>
> It is possible with docstrings of modules, classes, functions, methods
> and properties (created by using the decorator), because the compiler
> knows what string literal is a docstring. But this is impossible with
> namedtuple fields and any of the above ideas for slots.
>
> It would be nice to allow specifying docstrings for slots as for methods
> and properties. Something like in the following pseudocode:
>
> class NormalDist:
>     slot mu:
>         '''Arithmetic mean'''
>     slot sigma:
>         '''Standard deviation'''
>

I don't think adding a 'slot' keyword even if limited in scope to class
body definition level is a good idea (very painful anytime we reserve a new
word that is already used in code and APIs).

> It would also be nice to annotate slots and add default values (used
> when the slot value was not set).
>
> class NormalDist:
>     mu: float = 0.0
>     '''Arithmetic mean'''
>     sigma: float = 1.0
>     '''Standard deviation'''
>
>
Something along these lines is more interesting to me, and it could be
applied to variables in _any_ scope, though there wouldn't be a point in
using a string in a context where the name isn't bound to a class or module.

The best practice today remains "just use the class docstring to document
your public class and instance attributes".  FWIW other languages tend to
generate their documentation from code via comments rather than requiring a
special in-language, runtime-accessible syntax to declare it as
documentation.

It feels like Python would be diverging from the norm if we were to encourage
more of this implicit assignment of __doc__ that is carried around at runtime
than we already have.  I'm not convinced that is a good thing.

-gps

Re: [Python-Dev] Best way to specify docstrings for member objects

2019-03-20 Thread Ivan Pozdeev via Python-Dev


On 19.03.2019 21:55, Raymond Hettinger wrote:

I'm working on ways to improve help() by giving docstrings to member 
objects.

One way to do it is to wait until after the class definition and then make 
individual, direct assignments to __doc__ attributes.  This way widely 
separates the docstrings from their initial __slots__ definition.   Working 
downstream from the class definition feels awkward and doesn't look pretty.

There's another way I would like to propose¹.  The __slots__ definition already 
works with any iterable including a dictionary (the dict values are ignored), 
so we could use the values for the docstrings.

This keeps all the relevant information in one place (much like we already do 
with property() objects).  This way already works; we just need a few lines in 
pydoc to check to see if a dict is present.  This way also looks pretty and 
doesn't feel awkward.
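
To make the idea concrete, here is a minimal sketch of the kind of check a documentation tool could perform. The helper slot_docstrings is a hypothetical name and is not taken from pydoc; the class is a cut-down version of the NormalDist example.

```python
def slot_docstrings(cls):
    """Return {slot name: docstring} if the class declares __slots__ as a dict.

    Hypothetical helper sketching the check a documentation tool could make.
    """
    slots = cls.__dict__.get("__slots__")
    return dict(slots) if isinstance(slots, dict) else {}


class NormalDist:
    "Normal distribution of a random variable"

    # The dict keys become the slots; the values are free for docstrings.
    __slots__ = {"mu": "Arithmetic mean.", "sigma": "Standard deviation."}

    def __init__(self, mu=0.0, sigma=1.0):
        self.mu = mu
        self.sigma = sigma


docs = slot_docstrings(NormalDist)
```

Classes that use a tuple or list for __slots__, or no slots at all, simply yield an empty mapping.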

I've included worked out examples below.  What do you all think about the 
proposal?


Raymond


¹ https://bugs.python.org/issue36326


== Desired help() output ==


help(NormalDist)

Help on class NormalDist in module __main__:

class NormalDist(builtins.object)
  |  NormalDist(mu=0.0, sigma=1.0)
  |
  |  Normal distribution of a random variable
  |
  |  Methods defined here:
  |
  |  __init__(self, mu=0.0, sigma=1.0)
  |      NormalDist where mu is the mean and sigma is the standard deviation.
  |
  |  cdf(self, x)
  |      Cumulative distribution function.  P(X <= x)
  |
  |  pdf(self, x)
  |      Probability density function.  P(x <= X < x+dx) / dx
  |
  |  ----------------------------------------------------------------------
  |  Data descriptors defined here:
  |
  |  mu
  |      Arithmetic mean.
  |
  |  sigma
  |      Standard deviation.
  |
  |  variance
  |      Square of the standard deviation.



== Example of assigning docstrings after the class definition ==

from math import erf, exp, sqrt, tau

class NormalDist:
    'Normal distribution of a random variable'

    __slots__ = ('mu', 'sigma')

    def __init__(self, mu=0.0, sigma=1.0):
        'NormalDist where mu is the mean and sigma is the standard deviation.'
        self.mu = mu
        self.sigma = sigma

    @property
    def variance(self):
        'Square of the standard deviation.'
        return self.sigma ** 2.

    def pdf(self, x):
        'Probability density function.  P(x <= X < x+dx) / dx'
        variance = self.variance
        return exp((x - self.mu)**2.0 / (-2.0*variance)) / sqrt(tau * variance)

    def cdf(self, x):
        'Cumulative distribution function.  P(X <= x)'
        return 0.5 * (1.0 + erf((x - self.mu) / (self.sigma * sqrt(2.0))))

NormalDist.mu.__doc__ = 'Arithmetic mean'
NormalDist.sigma.__doc__ = 'Standard deviation'


IMO this is another manifestation of the problem that things in the class 
definition have no access to the class object.
Logically speaking, a definition item should be able to see everything that is 
defined before it.
For the same reason, we have to jump through hoops to use a class name in a class attribute definition -- see e.g. 
https://stackoverflow.com/questions/14513019/python-get-class-name


If that problem is resolved, you would be able to write something like:

class NormalDist:
    'Normal distribution of a random variable'

    __slots__ = ('mu', 'sigma')

    __self__.mu.__doc__ = 'Arithmetic mean'
    __self__.sigma.__doc__ = 'Standard deviation'





== Example of assigning docstrings with a dict ==

from math import erf, exp, sqrt, tau

class NormalDist:
    'Normal distribution of a random variable'

    __slots__ = {'mu' : 'Arithmetic mean.', 'sigma': 'Standard deviation.'}

    def __init__(self, mu=0.0, sigma=1.0):
        'NormalDist where mu is the mean and sigma is the standard deviation.'
        self.mu = mu
        self.sigma = sigma

    @property
    def variance(self):
        'Square of the standard deviation.'
        return self.sigma ** 2.

    def pdf(self, x):
        'Probability density function.  P(x <= X < x+dx) / dx'
        variance = self.variance
        return exp((x - self.mu)**2.0 / (-2.0*variance)) / sqrt(tau * variance)

    def cdf(self, x):
        'Cumulative distribution function.  P(X <= x)'
        return 0.5 * (1.0 + erf((x - self.mu) / (self.sigma * sqrt(2.0))))



--
Regards,
Ivan


Re: [Python-Dev] Remove tempfile.mktemp()

2019-03-20 Thread Ivan Pozdeev via Python-Dev

Before we can say if something is "secure" or not, we need a threat model --
i.e. we need to agree on which use cases we are protecting and from what
threats.

So far, I've seen these use cases:

1. File for the current process' private use
2. File/file name generated by the current process; written by another process,
read by current one
3. File name generated by the current process; written by the current process,
read by another one

And the following threats, three axes:

a. Processes run as other users
b. Processes run as the same user (or a user that otherwise automatically has
access to all your files)

1. Accidental collision from a process that uses CREATE_NEW or equivalent
2. Accidental collision from a process that doesn't use CREATE_NEW or equivalent
3. Malicious code creating files at random
4. Malicious code actively monitoring file creation

-1. read
-2. write

E.g. for threat b-4), it's not safe to use named files for IPC at all, only
case 1 can be secured (with exclusive open).
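
For use case 1, the exclusive open is what tempfile.mkstemp() already provides. The following is a small sketch of that pattern; the prefix, suffix and payload are arbitrary.

```python
import os
import tempfile

# mkstemp() creates the file and opens it in one step, so no other process
# can claim the name between generation and use.
fd, path = tempfile.mkstemp(prefix="private-", suffix=".tmp")
try:
    with os.fdopen(fd, "w+") as f:    # fdopen takes ownership of the descriptor
        f.write("scratch data")
        f.seek(0)
        content = f.read()
finally:
    os.unlink(path)                   # mkstemp() leaves cleanup to the caller
```

Use cases 2 and 3 are harder precisely because the name has to leave the process before the other side opens the file.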

On 19.03.2019 16:03, Stéphane Wirtel wrote:

Hi,

Context: raise a warning or remove tempfile.mktemp()
BPO: https://bugs.python.org/issue36309

Since 2.3, this function has been deprecated, but only in the
documentation. In the code, there is a commented-out RuntimeWarning,
commented out by Guido in 2002 because the warning was too annoying (and I
understand ;-)).

So, in this BPO, we started to discuss the future of this function,
and Serhiy proposed discussing it on the Python-dev mailing list.

Question: Should we drop it or add a (Pending)DeprecationWarning?

Suggestion and timeline:

3.8, we raise a PendingDeprecationWarning
* update the code
* update the documentation
* update the tests
(check for a PendingDeprecationWarning if sys.version_info[:2] == (3, 8))

3.9, we change PendingDeprecationWarning to DeprecationWarning
(check for a DeprecationWarning if sys.version_info[:2] == (3, 9))

3.9+, we drop tempfile.mktemp()
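
The first step of such a timeline could look roughly like this. It is a sketch only: mktemp_with_warning is a hypothetical wrapper written for illustration, not the patch discussed on the tracker, and the warning text is invented.

```python
import tempfile
import warnings


def mktemp_with_warning(suffix="", prefix="tmp", dir=None):
    """Hypothetical wrapper: warn, then delegate to tempfile.mktemp()."""
    warnings.warn(
        "mktemp() is insecure; use mkstemp() or NamedTemporaryFile() instead",
        PendingDeprecationWarning,
        stacklevel=2,   # attribute the warning to the caller
    )
    return tempfile.mktemp(suffix=suffix, prefix=prefix, dir=dir)


# A test can record the warnings raised by a call and inspect them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    name = mktemp_with_warning(suffix=".txt")

categories = [w.category for w in caught]
```

Moving to the next step would then only require swapping the warning category.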

What do you suggest?

Have a nice day and thank you for your feedback.

--
Regards,
Ivan


Re: [Python-Dev] Best way to specify docstrings for member objects

> On Mar 20, 2019, at 3:30 PM, Gregory P. Smith  wrote:
> 
> I like the idea of documenting attributes, but we shouldn't force the user to 
> use __slots__ as that has significant side effects and is rarely something 
> people should bother to use.

Member objects are like property objects in that they exist at the class level 
and show up in the help whether you want them to or not.   AFAICT, they are the 
only such objects to not have a way to attach docstrings.

For instance level attributes created by __init__, the usual way to document 
them is in either the class docstring or the __init__ docstring.  This is 
because they don't actually exist until  __init__ is run.

No one is forcing anyone to use slots.  I'm just pointing out that for classes 
that do use them, there is currently no way to annotate them like we do for 
property objects (which people aren't being forced to use either).  The goal is 
to make help() better for whatever people are currently doing.  That shouldn't 
be controversial.

Someone not liking or recommending slots is quite different from not wanting 
them documented.  In the examples I posted (taken from the standard library), 
the help() is clearly better with the annotations than without.

Raymond


Re: [Python-Dev] Best way to specify docstrings for member objects

2019-03-20 Thread Ethan Furman


On 03/20/2019 03:24 PM, Ethan Furman wrote:
> On 03/19/2019 11:55 AM, Raymond Hettinger wrote:
>
>> There's another way I would like to propose.  The __slots__
>> definition already works with any iterable including a
>> dictionary (the dict values are ignored), so we could use the
>> values for the docstrings.
>>
>> [...]
>>
>> What do you all think about the proposal?
>
> This proposal only works with objects defining __slots__, and only
> the objects in __slots__?  Does it help Enum, dataclasses, or other
> enhanced classes/objects?


Hmm.  Said somewhat less snarkily, is there a more general solution to the 
problem of absent docstrings or do we have to attack this problem 
piece-by-piece?

--
~Ethan~

Re: [Python-Dev] Best way to specify docstrings for member objects



> On Mar 20, 2019, at 3:47 PM, Ivan Pozdeev via Python-Dev 
>  wrote:
> 
>> NormalDist.mu.__doc__ = 'Arithmetic mean'
>> NormalDist.sigma.__doc__ = 'Standard deviation'
> 
> IMO this is another manifestation of the problem that things in the class 
> definition have no access to the class object.
> Logically speaking, a definition item should be able to see everything that 
> is defined before it.

The member objects get created downstream by the type() metaclass.  So, there 
isn't a visibility issue because the objects don't exist yet.
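
This can be observed directly. The snippet below is a small illustration for CPython; the attribute mu_defined_in_body exists only to record what the class body could see.

```python
class NormalDist:
    __slots__ = ("mu", "sigma")

    # At this point in the class body, `mu` is not a name at all: the
    # descriptors are only built when type() creates the class object.
    mu_defined_in_body = "mu" in locals()


descriptor = NormalDist.mu            # exists once the class has been created
kind = type(descriptor).__name__      # 'member_descriptor' on CPython
```

So inside the class body there is nothing yet for a docstring to be attached to.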


Raymond


Re: [Python-Dev] Best way to specify docstrings for member objects

> On Mar 20, 2019, at 3:59 PM, Ethan Furman  wrote:
> 
> Hmm.  Said somewhat less snarkily, is there a more general solution to the 
> problem of absent docstrings or do we have to attack this problem 
> piece-by-piece?

I think this is the last piece.  The pydoc help() utility already knows how to 
find docstrings for other class level descriptors:  property, classmethod, 
staticmethod.

Enum() already has nice looking help() output because the class variables are 
assigned values that have a nice __repr__, making them self documenting.

By design, dataclasses aren't special -- they just make regular classes, 
similar to or better than you would write by hand.
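As a rough illustration of the cases above (the Circle and Color classes are made up for this sketch), inspect.getdoc() can retrieve the docstrings of a property, a classmethod and a staticmethod, and Enum members document themselves through their repr:

```python
import inspect
from enum import Enum


class Circle:
    """A circle with a radius."""

    def __init__(self, radius):
        self.radius = radius

    @property
    def diameter(self):
        """Twice the radius."""
        return 2 * self.radius

    @classmethod
    def unit(cls):
        """Return a circle of radius 1."""
        return cls(1)

    @staticmethod
    def describe():
        """Return a short description of the shape."""
        return "round"


class Color(Enum):
    RED = 1
    GREEN = 2


# Docstrings of the class-level descriptors are all discoverable.
docs = {name: inspect.getdoc(getattr(Circle, name))
        for name in ("diameter", "unit", "describe")}
print(docs)
print(repr(Color.RED))
```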

Raymond

Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?

Hi,

On Mon, 18 Mar 2019 at 23:41, Raymond Hettinger wrote:
> We're having a super interesting discussion on 
> https://bugs.python.org/issue34160 .  It is now marked as a release blocker 
> and warrants a broader discussion.

Thanks for starting a thread on python-dev. I'm the one who raised the
priority to release blocker to trigger such discussion on python-dev.


> Our problem is that at least two distinct and important users have written 
> tests that depend on exact byte-by-byte comparisons of the final 
> serialization.

Sorry, but I don't think that's a good summary of the issue. IMHO
the issue is more general: it is about how we introduce backward
incompatible changes in Python.

The migration from Python 2 to Python 3 took around ten years. That's
way too long and it caused a lot of trouble in the Python community.
IMHO one explanation is our patronizing behavior toward users,
which I would summarize as "your code is wrong, you have to fix
it" (whereas the code was working well for 10 years with Python 2!).

I'm not opposed to backward incompatible changes, but I think that we
must very carefully prepare the migration and do our best to help
users to migrate their code.


> 2). Go into every XML module and add attribute sorting options to each 
> function that generates XML. (...)

Written like that, it sounds painful and like a huge project... But in
practice, the implementation looks simple and straightforward:
https://github.com/python/cpython/pull/12354/files

I don't understand why such a simple solution has been rejected.

IMHO adding an optional sort parameter is just the *bare minimum* that
we can do for our users.

Alternatives have been proposed, like a recipe to sort node attributes
before serialization, but honestly, it's way too complex. I don't want
to have to copy such a recipe into every project: add a new function,
import it, use it wherever XML is written into a file, etc. Taken alone,
maybe it's acceptable. But please remember that some companies are
still porting their large Python 2 code bases to Python 3. This new
backward incompatible change lands on top of the pile of other backward
incompatible changes between 2.7 and 3.8.

I would prefer to be able to "just add" sort=True. Don't forget that
tests like "if sys.version_info >= (3, 8):" will be needed, which makes
the overall fix more complicated.
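To make that cost concrete, here is a minimal sketch of what such a version-guarded workaround might look like in a project that wants sorted attributes on both 3.7 and 3.8. The helper name `tostring_sorted` is hypothetical; the in-place sort is one possible form of the "sort node attributes before serialization" recipe, not necessarily the one proposed on the tracker:

```python
import sys
import xml.etree.ElementTree as ET


def tostring_sorted(root):
    """Serialize root with attributes sorted by name (hypothetical helper)."""
    # Python 3.7 and earlier already sort attributes on output; from 3.8 on
    # the insertion order is kept, so the attributes are sorted by hand.
    if sys.version_info >= (3, 8):
        for element in root.iter():
            items = sorted(element.attrib.items())
            element.attrib.clear()
            element.attrib.update(items)
    return ET.tostring(root)


root = ET.Element("e", {"b": "2", "a": "1"})
print(tostring_sorted(root))
```

Note that the helper modifies the tree in place, which is one more thing a caller has to be aware of.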

Said differently, the stdlib should help the user to update Python.
The pain should not only be on the user side.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.

Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?

> On Mar 19, 2019, at 4:53 AM, Ned Batchelder  wrote:
> 
> None of this is impossible, but please try not to preach to us maintainers 
> that we are doing it wrong, that it will be easy to fix, etc

There's no preaching and no judgment.  We can't have a conversation though if 
we can't state the crux of the problem: some existing tests in third-party 
modules depend on the XML serialization being byte-for-byte identical forever. 
The various respondents to this thread have indicated that the standard library 
should only make that guarantee within a single feature release and that it may 
vary across feature releases.

For docutils, it may end-up being an easy fix (either with a semantic 
comparison or with regenerating the target files when point releases differ).  
For Coverage, I don't make any presumption that reengineering the tests will be 
easy or fun.  Several mitigation strategies have been proposed:

* alter the element creation code to create the attributes in the desired order
* use a canonicalization tool to create output that is guaranteed not to change
* generate new baseline files when a feature release changes
* apply Stefan's recipe for reordering attributes
* make a semantic level comparison

Will any of these work for you?
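For the last option, a semantic level comparison could be sketched roughly as follows. The function name `elements_equal` is hypothetical, and ignoring surrounding whitespace in text and tail is an assumption that a real test suite may or may not want:

```python
import xml.etree.ElementTree as ET


def elements_equal(a, b):
    """Compare two elements by content, ignoring attribute order."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib  # dict comparison ignores ordering
        and (a.text or "").strip() == (b.text or "").strip()
        and (a.tail or "").strip() == (b.tail or "").strip()
        and len(a) == len(b)
        and all(elements_equal(x, y) for x, y in zip(a, b))
    )


old = ET.fromstring('<report version="1" name="cov"><line hits="3" n="7"/></report>')
new = ET.fromstring('<report name="cov" version="1"><line n="7" hits="3" /></report>')
print(elements_equal(old, new))
```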

Raymond


Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?

On Thu, 21 Mar 2019 at 01:30, Raymond Hettinger wrote:
> There's no preaching and no judgment.  We can't have a conversation though if 
> we can't state the crux of the problem: some existing tests in third-party 
> modules depend on the XML serialization being byte-for-byte identical 
> forever. The various respondents to this thread have indicated that the 
> standard library should only make that guarantee within a single feature 
> release and that it may vary across feature releases.
>
> For docutils, it may end-up being an easy fix (either with a semantic 
> comparison or with regenerating the target files when point releases differ). 
> For Coverage, I don't make any presumption that reengineering the tests will 
> be easy or fun.  Several mitigation strategies have been proposed:
>
> * alter the element creation code to create the attributes in the desired order
> * use a canonicalization tool to create output that is guaranteed not to change
> * generate new baseline files when a feature release changes
> * apply Stefan's recipe for reordering attributes
> * make a semantic level comparison
>
> Will any of these work for you?

Python 3.8 is still in a very early stage of testing. We have only started
to discover which projects are broken by the XML change.

IMHO the problem is wider than just unit tests written in Python.
Python can be used to produce the XML, but other languages can be used
to parse or compare the generated XML. For example, if the generated
file is stored in Git, it will be seen as modified and "git diff" will
show a lot of "irrelevant" changes.

Comparing XML as strings can also be used to avoid expensive
disk/database writes or to reduce network bandwidth. That's an
issue if the comparing program isn't written in Python, whereas the
XML is generated by Python.

Getting the same output on Python 3.7 and Python 3.8 also matters
for https://reproducible-builds.org/

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.

Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?

> On Mar 20, 2019, at 5:22 PM, Victor Stinner  wrote:
> 
> I don't understand why such a simple solution has been rejected.

It hasn't been rejected. That is above my pay grade.  Stefan and I recommended 
against going down this path. However, since you're in disagreement and have 
marked this as a release blocker, it is now time for the steering committee to 
earn their pay (which is at least double what I'm making) or defer to the 
principal module maintainer, Stefan.

To recap reasons for not going down this path:

1) The only known use case for a "sort=True" parameter is to perpetuate the 
practice of byte-by-byte output comparisons guaranteed to work across feature 
releases.  The various XML experts in this thread have opined that this isn't 
something we should guarantee (and sorting isn't the only detail subject 
to change; Stefan listed others).

2) The intent of the XML modules is to implement the specification and be 
interoperable with other languages and other XML tools. It is not intended to 
be used to generate an exact binary output.  Per section 3.1 of the XML spec, 
"Note that the order of attribute specifications in a start-tag or 
empty-element tag is not significant."

3) Mitigating a test failure is a one-time problem. API expansions are forever.

4) The existing API is not small and presents a challenge for teaching. Making 
the API bigger will make it worse.

5) As far as I can tell, XML tools in other languages (such as Java) don't sort 
(and likely for good reason).  LXML is dropping its attribute sorting as well, 
so the standard library would become more of an outlier.
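For anyone who does need stable bytes, a canonicalization step is one alternative to a sort parameter. The sketch below assumes Python 3.8 or later, where xml.etree.ElementTree.canonicalize() is available; the sample documents are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Two serializations of the same document that differ only in attribute
# order and in how the empty element is written.
first = '<doc b="2" a="1"><item/></doc>'
second = '<doc a="1" b="2"><item></item></doc>'

# Canonical XML normalizes both details, so the canonical forms can be
# compared as plain strings even though the raw strings differ.
same_raw = first == second
same_canonical = ET.canonicalize(first) == ET.canonicalize(second)
print(same_raw, same_canonical)
```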

Raymond


Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?

On Mon, 18 Mar 2019 at 23:41, Raymond Hettinger wrote:
> The code in the current 3.8 alpha differs from 3.7 in that it removes 
> attribute sorting and instead preserves the order the user specified when 
> creating an element.  As far as I can tell, there is no objection to this as 
> a feature.

By the way, what's the rationale of this backward incompatible change?

I found this short message:
"FWIW, this issue arose from an end-user problem. She had a hard
requirement to show a security clearance level as the first attribute.
We did find a work around but it was hack."
https://bugs.python.org/issue34160#msg338098

It's the first time that I hear a user asking to preserve attribute
insertion order (or did I miss a previous request?). Technically, it
was possible to implement the feature earlier using OrderedDict. So
why do it now?

Is it really worth it to break Python backward compatibility (change
the default behavior) for everyone, if it's only needed by a few users?


> 1) Revert back to the 3.7 behavior. This of course, makes all the tests pass 
> :-)  The downside is that it perpetuates the practice of bytewise equality 
> tests and locks in all implementation quirks forever.  I don't know of anyone 
> advocating this option, but it is the simplest thing to do.

Can't we revert to the Python 3.7 behavior and add a new opt-in option to
preserve the attribute insertion order (the current Python 3.8 default
behavior)?

The Python 3.7 behavior, sorting attributes by name, doesn't sound so
silly to me. It's an arbitrary choice, but at least the output is
deterministic. And well, Python has been doing that for 20 years :-)


> 4) Fix the tests in the third-party modules (...)

I also like the option "do not break backward compatibility", so that
no project has to be fixed :-)

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.

Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?