[Python-Dev] Re: What to do about invalid escape sequences


On 8/10/2019 3:36 PM, Greg Ewing wrote:
> Glenn Linderman wrote:
>> I wonder how many raw strings actually use the \" escape
>> productively? Maybe that should be deprecated too! ? I can't think
>> of a good and necessary use for it, can anyone?
>
> Quite rare, I expect, but it's bound to break someone's code.
> It might be better to introduce a new string prefix, e.g.
> 'v' for 'verbatim':
>
>    v"C:\Users\Fred\"

Which is why I suggested rr"C:\directory\", but allowed as how there
might be better spellings. I like your v for verbatim!
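
The limitation under discussion can be checked directly. The following is a minimal sketch, not part of the original message; the variable names are illustrative. It shows that a raw string literal ending in a single backslash does not compile in current Python, and lists three spellings that do produce a trailing backslash. The proposed v"" and rr"" prefixes are hypothetical and are not used.

```python
# The literal r"C:\Users\Fred\" is rejected: the final \" does not end the string.
# Below is the source text of that literal, built with doubled backslashes.
source = 'r"C:\\Users\\Fred\\"'

try:
    compile(source, "<example>", "eval")
    compiles = True
except SyntaxError:
    compiles = False

# Three spellings that work today; each evaluates to C:\Users\Fred\
doubled = "C:\\Users\\Fred\\"
concatenated = r"C:\Users\Fred" "\\"   # adjacent literals are joined
sliced = r"C:\Users\Fred\ "[:-1]       # the "extra character contortion"

print(compiles)                           # False
print(doubled == concatenated == sliced)  # True
```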
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GTOVRKM7Q4VU67KYDQF6ICU7HAJDSBRX/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread Greg Ewing


Glenn Linderman wrote:
> I wonder how many raw strings actually use the \" escape productively?
> Maybe that should be deprecated too! ? I can't think of a good and
> necessary use for it, can anyone?


Quite rare, I expect, but it's bound to break someone's code.
It might be better to introduce a new string prefix, e.g.
'v' for 'verbatim':

   v"C:\Users\Fred\"

--
Greg
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TQM37LMDVIKQ7UXLNLVMUUSF3ZYT7TYI/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread Greg Ewing


Rob Cliffe via Python-Dev wrote:
> Also, the former is simply more *informative* - it tells the reader that
> baz is expected to be a directory, not a file.


On Windows you can usually tell that from the fact that filenames
almost always have an extension, and directory names almost never
do.

--
Greg
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/F4Y4HNU72QOVWHCGLD74N7ZTAEJP2XBF/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread eryk sun

On 8/10/19, Rob Cliffe via Python-Dev  wrote:
> On 10/08/2019 11:50:35, eryk sun wrote:
>> On 8/9/19, Steven D'Aprano  wrote:
>>> I'm also curious why the string needs to *end* with a backslash. Both of
>>> these are the same path:
>>>
>>>  C:\foo\bar\baz\
>>>  C:\foo\bar\baz
>
> Also, the former is simply more *informative* - it tells the reader that
> baz is expected to be a directory, not a file.

This is an important point that I overlooked. The trailing backslash
is more than just a redundant character to inform human readers. Refer
to [MS-FSA] 2.1.5.1 "Server Requests an Open of a File" [1]. A
create/open fails with STATUS_OBJECT_NAME_INVALID if either of the
following is true:

* PathName contains a trailing backslash and
  CreateOptions.FILE_NON_DIRECTORY_FILE is
  TRUE.

* PathName contains a trailing backslash and
  StreamTypeToOpen is DataStream

For NtCreateFile or NtOpenFile (in the NT API), the
FILE_NON_DIRECTORY_FILE option restricts the call to a regular file,
and FILE_DIRECTORY_FILE restricts it to a directory. With neither
option, the call can target either a file or directory. A trailing
backslash is another information channel. It tells the filesystem that
the target has to be a directory. If we specify
FILE_NON_DIRECTORY_FILE with a trailing backslash on the name, this is
an immediate failure as an invalid name without even checking the
entry. If we specify neither option and use a trailing backslash, it's
an invalid name if the filesystem finds a regular file or data stream.
Had the call specified the FILE_DIRECTORY_FILE option, it would
instead fail with STATUS_NOT_A_DIRECTORY.
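
The rules in the paragraph above can be written down as a small decision function. This is only a model of the behaviour as described in this message, not the NT API itself; the function name `open_status` and its boolean parameters are made up for illustration, and failure modes not mentioned above are not modeled.

```python
def open_status(trailing_backslash, non_directory_file, directory_file,
                target_is_directory):
    """Return the NT status name for the cases described above (a model only)."""
    # First case: rejected as an invalid name without checking the entry.
    if trailing_backslash and non_directory_file:
        return "STATUS_OBJECT_NAME_INVALID"
    if not target_is_directory:
        # A directory was explicitly requested but a regular file was found.
        if directory_file:
            return "STATUS_NOT_A_DIRECTORY"
        # Second case: neither option given, but the trailing backslash
        # says the target has to be a directory.
        if trailing_backslash:
            return "STATUS_OBJECT_NAME_INVALID"
    # Other failure modes are deliberately not modeled in this sketch.
    return "STATUS_SUCCESS"


# "baz" is a regular file in each call below.
print(open_status(True, True, False, False))    # like open('foo\\bar\\baz\\')
print(open_status(True, False, False, False))   # like os.stat('foo\\bar\\baz\\')
print(open_status(True, False, True, False))    # FILE_DIRECTORY_FILE requested
```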

We can see this in practice in the published source for the fastfat
filesystem driver. FatCommonCreate [2] (for a create or open) has the
following code to handle the second case (in this code, an FCB is a
file control block for a regular file, and a DCB is a directory
control block):

    if (NodeType(Fcb) == FAT_NTC_FCB) {

        //
        //  Check if we were only to open a directory
        //

        if (OpenDirectory) {
            DebugTrace(0, Dbg, "Cannot open file as directory\n", 0);
            try_return( Iosb.Status = STATUS_NOT_A_DIRECTORY );
        }

        DebugTrace(0, Dbg, "Open existing fcb, Fcb = %p\n", Fcb);

        if ( TrailingBackslash ) {
            try_return( Iosb.Status = STATUS_OBJECT_NAME_INVALID );
        }

We observe the first case with a typical CreateFileW call, which uses
the option FILE_NON_DIRECTORY_FILE. In the following example "baz" is
a regular file:

>>> f = open(r'foo\bar\baz') # success
>>> try: open('foo\\bar\\baz\\')
... except OSError as e: print(e)
...
[Errno 22] Invalid argument: 'foo\\bar\\baz\\'

C EINVAL (22) is mapped from Windows ERROR_INVALID_NAME (123), which
is mapped from NT STATUS_OBJECT_NAME_INVALID (0xC0000033).
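
Because the same failure surfaces as errno 22 from open() and as WinError 123 from os.stat(), code that wants to recognise it has to look at both fields. A minimal sketch, assuming a hypothetical helper named `is_invalid_name_error` (not part of Python or of this thread):

```python
import errno

ERROR_INVALID_NAME = 123   # Windows error code quoted in the message above


def is_invalid_name_error(exc):
    """Return True if an OSError looks like the invalid-name failure above."""
    # The winerror attribute only exists on Windows builds of Python,
    # so fall back to the C errno value everywhere else.
    if getattr(exc, "winerror", None) == ERROR_INVALID_NAME:
        return True
    return exc.errno == errno.EINVAL


# Simulated exception, so the sketch runs on any platform.
print(is_invalid_name_error(OSError(errno.EINVAL, "Invalid argument")))  # True
```

Note that EINVAL is raised for other reasons too, so the errno branch is a coarse check.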

We can observe the second case with os.stat(), which calls CreateFileW
with backup semantics, which omits the FILE_NON_DIRECTORY_FILE option
in order to allow the call to open either a file or directory. In this
case the filesystem has to actually check that "baz" is a data file
before it can fail the call, as was shown in the fastfat code snippet
above:

>>> try: os.stat('foo\\bar\\baz\\')
... except OSError as e: print(e)
...
[WinError 123] The filename, directory name, or
volume label syntax is incorrect: 'foo\\bar\\baz\\'

[1] 
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fsa/8ada5fbe-db4e-49fd-aef6-20d54b748e40
[2] 
https://github.com/microsoft/Windows-driver-samples/blob/74200/filesys/fastfat/create.c#L1398
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QPDXUY4OXR2XOCNUHSKC7QRQGAXWV5WQ/

[Python-Dev] Re: What to do about invalid escape sequences


On 8/10/2019 12:19 PM, Guido van Rossum wrote:
> Regular expressions.

I assume that is in response to the "good use for \" escape" question?

But can't you just surround them with ' instead of " ? Or ''' ?
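
Both positions can be illustrated with a character class that has to match either kind of quote. This is a sketch for illustration only; the pattern and sample text are made up. In a raw string the \" escape keeps the backslash, which the regex engine then reads as an escaped quote, while the triple-quoted spelling avoids the escape entirely.

```python
import re

# Inside r"...", \" keeps BOTH characters: the backslash and the quote.
pattern = r"[\"']"            # five characters: [ \ " ' ]

# The triple-quoted alternative needs no backslash at all.
alternative = r'''["']'''     # four characters: [ " ' ]

text = 'say "hi" or \'bye\''
found = re.findall(pattern, text)
also_found = re.findall(alternative, text)

print(len(pattern), len(alternative))   # 5 4
print(found == also_found)              # True
```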



On Sat, Aug 10, 2019 at 12:12 Glenn Linderman wrote:


On 8/10/2019 11:16 AM, Terry Reedy wrote:

On 8/10/2019 4:33 AM, Paul Moore wrote:


(Side issue)


This deserves its own thread.


As a Windows developer, who has seen far too many cases where use of
slashes in filenames implies a Unix-based developer not thinking
sufficiently about Windows compatibility, or where it leads to people
hard coding '/' rather than using os.sep (or better, pathlib), I
strongly object to this characterisation. Rather, I would simply say
"to make Windows users more aware of the clash in usage between
backslashes in filenames and backslashes as string escapes".

There are *many* valid ways to write Windows pathnames in your code:

1. Raw strings


As pointed out elsewhere, Raw strings have limitations, paths
ending in \ cannot be represented, and such do exist in various
situations, not all of which can be easily avoided... except by
the "extra character contortion" of "C:\directory\ "[:-1]  (does
someone know a better way?)

It would be useful to make a "really raw" string that doesn't
treat \ special in any way. With 4 different quoting possibilities
( ' " ''' """ ) there isn't really a reason to treat \ special at
the end of a raw string, except for backward compatibility.

I wonder how many raw strings actually use the \"  escape
productively? Maybe that should be deprecated too! ?  I can't
think of a good and necessary use for it, can anyone?

Or invent "really raw" in some spelling, such as rr"c:\directory\"
or e for exact, or x for exact, or <... here>"c:\directory\"

And that brings me to the thought that if   \e  wants to become an
escape for escape, that maybe there should be an "extended escape"
prefix... if you want to use more escapes, define   ee"string
where \\ can only be used as an escape or escaped character, \e
means the ASCII escape character, and \ followed by a character
with no escape definition would be an error."

Of course "extended escape" could be spelled lots of different
ways too, but not the same way as "really raw" :)


2. Doubling the backslashes
3. Using pathlib (possibly with slash as a directory separator, where
it's explicitly noted as a portable option)
4. Using slashes

IMO, using slashes is the *worst* of these. But this latter is a
matter of opinion - I've no objection to others believing differently,
but I *do* object to slashes being presented as the only option, or
the recommended option without qualification.


Perhaps Python Setup and Usage, 3. Using Python on Windows,
should have a section of file paths, at most x.y.z, so visible in
the TOC listed by https://docs.python.org/3/using/index.html



Message archived at

https://mail.python.org/archives/list/python-dev@python.org/message/5MZAXJJYKNMQAS63QW4HS2TUPMQH7LSL/

--
--Guido (mobile)


Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BZDAXLX2IQTIUT2W47SFI2CJTZSPXY2V/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread Guido van Rossum

Regular expressions.

On Sat, Aug 10, 2019 at 12:12 Glenn Linderman  wrote:

> On 8/10/2019 11:16 AM, Terry Reedy wrote:
>
> On 8/10/2019 4:33 AM, Paul Moore wrote:
>
> (Side issue)
>
>
> This deserves its own thread.
>
> As a Windows developer, who has seen far too many cases where use of
> slashes in filenames implies a Unix-based developer not thinking
> sufficiently about Windows compatibility, or where it leads to people
> hard coding '/' rather than using os.sep (or better, pathlib), I
> strongly object to this characterisation. Rather, I would simply say
> "to make Windows users more aware of the clash in usage between
> backslashes in filenames and backslashes as string escapes".
>
> There are *many* valid ways to write Windows pathnames in your code:
>
> 1. Raw strings
>
>
> As pointed out elsewhere, Raw strings have limitations, paths ending in \
> cannot be represented, and such do exist in various situations, not all of
> which can be easily avoided... except by the "extra character contortion"
> of   "C:\directory\ "[:-1]  (does someone know a better way?)
>
> It would be useful to make a "really raw" string that doesn't treat \
> special in any way. With 4 different quoting possibilities ( ' " ''' """ )
> there isn't really a reason to treat \ special at the end of a raw string,
> except for backward compatibility.
>
> I wonder how many raw strings actually use the \"  escape productively?
> Maybe that should be deprecated too! ?  I can't think of a good and
> necessary use for it, can anyone?
>
> Or invent "really raw" in some spelling, such as rr"c:\directory\"
> or e for exact, or x for exact, or <... here>"c:\directory\"
>
> And that brings me to the thought that if   \e  wants to become an escape
> for escape, that maybe there should be an "extended escape" prefix... if
> you want to use more escapes, define   ee"string where \\ can only be used
> as an escape or escaped character, \e means the ASCII escape character, and
> \ followed by a character with no escape definition would be an error."
>
> Of course "extended escape" could be spelled lots of different ways too,
> but not the same way as "really raw" :)
>
> 2. Doubling the backslashes
> 3. Using pathlib (possibly with slash as a directory separator, where
> it's explicitly noted as a portable option)
> 4. Using slashes
>
> IMO, using slashes is the *worst* of these. But this latter is a
> matter of opinion - I've no objection to others believing differently,
> but I *do* object to slashes being presented as the only option, or
> the recommended option without qualification.
>
>
> Perhaps Python Setup and Usage, 3. Using Python on Windows, should have a
> section of file paths, at most x.y.z, so visible in the TOC listed by
> https://docs.python.org/3/using/index.html
>
>
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/5MZAXJJYKNMQAS63QW4HS2TUPMQH7LSL/
>
-- 
--Guido (mobile)
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LSFNRZTMK6HLUCE7IAWKD3GCBLZ7KINQ/

[Python-Dev] Re: What to do about invalid escape sequences

On 8/10/2019 11:16 AM, Terry Reedy wrote:

On 8/10/2019 4:33 AM, Paul Moore wrote:

(Side issue)

This deserves its own thread.

As a Windows developer, who has seen far too many cases where use of
slashes in filenames implies a Unix-based developer not thinking
sufficiently about Windows compatibility, or where it leads to people
hard coding '/' rather than using os.sep (or better, pathlib), I
strongly object to this characterisation. Rather, I would simply say
"to make Windows users more aware of the clash in usage between
backslashes in filenames and backslashes as string escapes".

There are *many* valid ways to write Windows pathnames in your code:

1. Raw strings

As pointed out elsewhere, Raw strings have limitations, paths ending in
\ cannot be represented, and such do exist in various situations, not
all of which can be easily avoided... except by the "extra character
contortion" of "C:\directory\ "[:-1] (does someone know a better way?)

It would be useful to make a "really raw" string that doesn't treat \
special in any way. With 4 different quoting possibilities ( ' " ''' """
) there isn't really a reason to treat \ special at the end of a raw
string, except for backward compatibility.

I wonder how many raw strings actually use the \" escape productively?
Maybe that should be deprecated too! ? I can't think of a good and
necessary use for it, can anyone?

Or invent "really raw" in some spelling, such as rr"c:\directory\"
or e for exact, or x for exact, or <... here>"c:\directory\"

Of course "extended escape" could be spelled lots of different ways too,
but not the same way as "really raw" :)

2. Doubling the backslashes
3. Using pathlib (possibly with slash as a directory separator, where
it's explicitly noted as a portable option)
4. Using slashes

IMO, using slashes is the *worst* of these. But this latter is a
matter of opinion - I've no objection to others believing differently,
but I *do* object to slashes being presented as the only option, or
the recommended option without qualification.

Perhaps Python Setup and Usage, 3. Using Python on Windows, should
have a section of file paths, at most x.y.z, so visible in the TOC
listed by https://docs.python.org/3/using/index.html

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread Rob Cliffe via Python-Dev

On 10/08/2019 11:50:35, eryk sun wrote:

On 8/9/19, Steven D'Aprano wrote:

I'm also curious why the string needs to *end* with a backslash. Both of
these are the same path:

C:\foo\bar\baz\
C:\foo\bar\baz

Also, the former is simply more *informative* - it tells the reader that
baz is expected to be a directory, not a file.

Rob Cliffe

The above two cases are equivalent. But that's not the case for the
root directory. Unlike Unix, filesystem namespaces are implemented
directly on devices. For example, "//./C:" might resolve to a volume
device such as "\\Device\\HarddiskVolume2". With a trailing slash
added, "//./C:/" resolves to "\\Device\\HarddiskVolume2\\", which is
the root directory of the mounted filesystem on the volume.

Also, as a classic DOS path, "C:" without a trailing slash expands to
the working directory on drive "C:". The system runtime library looks
for this path in a hidden environment variable named "=C:". The
Windows API never sets these hidden "=X:" drive variables. The C
runtime sets them, as does Python's os.chdir.

Some volume-management functions require a trailing slash or
backslash, such as GetVolumeInformationW [1].
GetVolumeNameForVolumeMountPointW [2] actually requires it to be a
trailing backslash. It will not accept a trailing forward slash such
as "C:\\Mount\\Volume/" (a bug since Windows 2000). The volume name
(e.g. "\\\\?\\Volume{GUID}\\")
returned by the latter includes a trailing backslash, which must be
present in the target path in order for a mountpoint to function
properly as a directory, else it would resolve to the volume device
instead of the root directory.

[1]
https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getvolumeinformationw
[2]
https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getvolumenameforvolumemountpointw

If they're Windows developers, they ought to be aware that the Windows
file system API allows / anywhere you can use \ and it is the
common convention in Python to use forward slashes.

The Windows file API actually does not allow slash to be used anywhere
that we can use backslash. It's usually allowed, but not always. For
the most part, the conditions where forward slash is not supported are
intentional.

Windows replaces forward slash with backslash in normal DOS paths and
normal device paths. But sometimes we have to use a special form of
device path that bypasses normalization. A path that isn't normalized
can only use backslash as the path separator. For example, the most
common case is that the process doesn't have long paths enabled. In
this case we're limited to MAX_PATH, which limits file paths to a
paltry 259 characters (sans the terminating null); the current
directory to 258 characters (sans a trailing backslash and null); and
the path of a new directory to 247 characters (subtract 12 from 259 to
leave space for an 8.3 filename). By skipping DOS normalization, we
can access a path with up to about 32,750 characters (i.e. 32,767 sans
the length of the device name in the final NT path under
"\\Device\\").

(Long normalized paths are available starting in Windows 10, but the
system policy that allows this is disabled by default, and even if
enabled, each application has to declare itself to be long-path aware
in its manifest. This is declared for python[w].exe in Python 3.6+.)

A device path is an explicit reference to a user's local device
directory (in the object namespace), which shadows the global device
directory. In NT, this directory is aliased to a special "\\??\\"
prefix (backslash only). A local device directory is created for each
logon session (not terminal session) by the security system that runs
in terminal session 0 (i.e. the system services session). The
per-logon directory is located at
"\\Sessions\\0\\DosDevices\\<Logon Session ID>". In the Windows API,
it's accessible as "//?/" or "//./", or with any mix of forward slashes
or backslashes, but only the all-backslash form is special-cased to
bypass the normalization step.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/3SDFM2EKFO3UNTATS7KVBY2WOUTFMAF5/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread Terry Reedy

On 8/10/2019 4:33 AM, Paul Moore wrote:

(Side issue)

This deserves its own thread.

There are *many* valid ways to write Windows pathnames in your code:

1. Raw strings
2. Doubling the backslashes
3. Using pathlib (possibly with slash as a directory separator, where
it's explicitly noted as a portable option)
4. Using slashes

Perhaps Python Setup and Usage, 3. Using Python on Windows, should have
a section of file paths, at most x.y.z, so visible in the TOC listed by
https://docs.python.org/3/using/index.html

--
Terry Jan Reedy
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/SH3M5GGHJPIMKVTEYI6FFBYWHVZT7O64/

[Python-Dev] Re: What to do about invalid escape sequences

On 8/10/2019 7:03 AM, Paul Moore wrote:

On Sat, 10 Aug 2019 at 12:06, Chris Angelico wrote:

On Sat, Aug 10, 2019 at 6:39 PM Paul Moore wrote:

There are *many* valid ways to write Windows pathnames in your code:

1. Raw strings
2. Doubling the backslashes
3. Using pathlib (possibly with slash as a directory separator, where
it's explicitly noted as a portable option)
4. Using slashes

Please expand on why this is the worst?

I did say it was a matter of opinion, so I'm not going to respond if
people say that any of the following is "wrong", but since you asked:

1. Backslash is the native separator, whereas slash is not (see eryk
sun's post for *way* more detail).
2. People who routinely use slash have a tendency to forget to use
os.sep rather than a literal slash in places where it *does* matter.
3. Using slash, in my experience, ends up with paths with "mixed"
separators (os.path.join("C:/work/apps", "foo") ->
'C:/work/apps\\foo') which are messy to deal with, and ugly for the
user.
4. If a path with slashes is displayed directly to the user without
normalisation, it looks incorrect and can confuse users who are only
used to "native" Windows programs.

Etc.

Not to mention the problem of passing paths with / to other Windows
programs via system or subprocess.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/A7MBGUBTRNLZ5UWCMS4NHYAFGQC6MNQJ/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread Paul Moore

On Sat, 10 Aug 2019 at 12:06, Chris Angelico  wrote:
>
> On Sat, Aug 10, 2019 at 6:39 PM Paul Moore  wrote:
> > There are *many* valid ways to write Windows pathnames in your code:
> >
> > 1. Raw strings
> > 2. Doubling the backslashes
> > 3. Using pathlib (possibly with slash as a directory separator, where
> > it's explicitly noted as a portable option)
> > 4. Using slashes
> >
> > IMO, using slashes is the *worst* of these. But this latter is a
> > matter of opinion - I've no objection to others believing differently,
> > but I *do* object to slashes being presented as the only option, or
> > the recommended option without qualification.
>
> Please expand on why this is the worst?

I did say it was a matter of opinion, so I'm not going to respond if
people say that any of the following is "wrong", but since you asked:

1. Backslash is the native separator, whereas slash is not (see eryk
sun's post for *way* more detail).
2. People who routinely use slash have a tendency to forget to use
os.sep rather than a literal slash in places where it *does* matter.
3. Using slash, in my experience, ends up with paths with "mixed"
separators (os.path.join("C:/work/apps", "foo") ->
'C:/work/apps\\foo') which are messy to deal with, and ugly for the
user.
4. If a path with slashes is displayed directly to the user without
normalisation, it looks incorrect and can confuse users who are only
used to "native" Windows programs.

Etc.
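
Point 3 can be reproduced on any platform, because the ntpath module is the Windows flavour of os.path and is importable everywhere. A minimal sketch with illustrative variable names, showing the mixed-separator result and two ways to get a uniformly backslashed path:

```python
import ntpath                        # Windows os.path, usable on any OS
from pathlib import PureWindowsPath

# join() appends with a backslash but leaves existing slashes alone.
mixed = ntpath.join("C:/work/apps", "foo")

# normpath() converts every separator to a backslash.
normalized = ntpath.normpath(mixed)

# pathlib accepts slashes on input and renders backslashes on output.
via_pathlib = str(PureWindowsPath("C:/work/apps") / "foo")

print(mixed)
print(normalized)
print(via_pathlib == normalized)     # True
```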

Paul
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QNAZ4G7VCCBZSFJLUCGH6NTTGW726R6G/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread eryk sun

On 8/10/19, eryk sun  wrote:
>
> The per-logon directory is located at
> "\\Sessions\\0\\DosDevices\\<Logon Session ID>". In the Windows API,
> it's accessible as "//?/" or "//./", or with any mix of forward slashes
> or backslashes, but only the all-backslash form is special-cased to
> bypass the normalization step.

Correction: I slipped up in that last sentence. Only the all-backslash
form that's in the "?" namespace bypasses normalization, as most
Windows users should at least have seen in passing. These special
device paths pop up here and there. For example, r'\\?\C:\Temp\spam. .
.' allows creating or opening a file named "spam. . .", which the
Windows API would normalize as "spam". But I don't recommend
sidestepping the normal rules -- except for the path length limit
because there are ways to make long paths conveniently accessible
(e.g. symbolic links, bind-like mountpoints, and subst drives).

Sometimes people also come across "\\??\\" paths and come to the
mistaken conclusion that these can be used in Windows API programs.
No, they're for NT. The runtime library mangles them, e.g.
nt._getfullpathname(r'\??\C:') == 'C:\\??\\C:'.
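
Since a path in the "?" namespace is not normalized by the Windows API, anything that builds one has to convert slashes and resolve ".." itself. The helper below is a hypothetical sketch (the name `to_extended_path` is not a Python or Windows API) that handles only absolute drive-letter paths; UNC paths and drive-relative paths are rejected rather than guessed at.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"   # the four characters \\?\


def to_extended_path(path):
    """Hypothetical helper: prefix a drive-letter path with the \\?\ namespace."""
    if path.startswith(EXTENDED_PREFIX):
        return path   # already in the all-backslash "?" form
    drive, rest = ntpath.splitdrive(path)
    if len(drive) != 2 or drive[1] != ":" or rest[:1] not in ("\\", "/"):
        raise ValueError("only absolute drive-letter paths are handled here")
    # Normalization is bypassed for these paths, so do it up front:
    # normpath() turns / into \ and collapses "." and ".." components.
    return EXTENDED_PREFIX + ntpath.normpath(path)


print(to_extended_path("C:/Temp/spam. . ."))
```

The trailing dots and spaces survive because ntpath.normpath does not apply the Windows API's name-trimming rules.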
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VANNT2SIH7EBPEOUC6M7HI7PYASJPYC7/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread Rob Cliffe via Python-Dev




On 06/08/2019 23:41:25, Greg Ewing wrote:
> Rob Cliffe via Python-Dev wrote:
>> Sorry, that won't work.  Strings are parsed at compile time, open()
>> is executed at run-time.
>
> It could check for control characters, which are probably the result
> of a backslash accident. Maybe even auto-correct them...


By "It", do you mean open() ?  If so:
It already checks for control characters, at least with Python 2.7 on 
Windows:


>>> open('mydir\test')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 22] invalid mode ('r') or filename: 'mydir\test'

As for auto-correct (presumably "\a" to "\\a", "\b" to "\\b" etc.), I 
hope you're not serious.
"In the face of gibberish, refuse the temptation to show how smart your 
guessing is."
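
The "backslash accident" in the example above is easy to see once the string is inspected rather than opened. A small sketch follows; the helper `has_control_chars` is made up for illustration and is not what open() does internally.

```python
accidental = 'mydir\test'    # \t is a valid escape: this contains a TAB
intended = r'mydir\test'     # raw string: a backslash followed by "test"


def has_control_chars(name):
    """Return True if any character is an ASCII control character (below 32)."""
    return any(ord(ch) < 32 for ch in name)


print(len(accidental), has_control_chars(accidental))   # 9 True
print(len(intended), has_control_chars(intended))       # 10 False
```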

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UK46EASIZVFTIQPORH7AG3EFB522NFI3/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread Chris Angelico

On Sat, Aug 10, 2019 at 6:39 PM Paul Moore  wrote:
> There are *many* valid ways to write Windows pathnames in your code:
>
> 1. Raw strings
> 2. Doubling the backslashes
> 3. Using pathlib (possibly with slash as a directory separator, where
> it's explicitly noted as a portable option)
> 4. Using slashes
>
> IMO, using slashes is the *worst* of these. But this latter is a
> matter of opinion - I've no objection to others believing differently,
> but I *do* object to slashes being presented as the only option, or
> the recommended option without qualification.

Please expand on why this is the worst?

ChrisA
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PXVO7OT4EK2GRDC5DM6JXMP3WBOVC7DC/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread eryk sun

On 8/9/19, Steven D'Aprano  wrote:
>
> I'm also curious why the string needs to *end* with a backslash. Both of
> these are the same path:
>
> C:\foo\bar\baz\
> C:\foo\bar\baz

The above two cases are equivalent. But that's not the case for the
root directory. Unlike Unix, filesystem namespaces are implemented
directly on devices. For example, "//./C:" might resolve to a volume
device such as "\\Device\\HarddiskVolume2". With a trailing slash
added, "//./C:/" resolves to "\\Device\\HarddiskVolume2\\", which is
the root directory of the mounted filesystem on the volume.

Also, as a classic DOS path, "C:" without a trailing slash expands to
the working directory on drive "C:". The system runtime library looks
for this path in a hidden environment variable named "=C:". The
Windows API never sets these hidden "=X:" drive variables. The C
runtime sets them, as does Python's os.chdir.
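
The distinction between "C:" and "C:\" is visible in Python's own path handling, and can be checked on any platform through ntpath and PureWindowsPath. A minimal sketch (it only inspects the strings; it does not consult the hidden "=C:" variable or touch the filesystem):

```python
import ntpath
from pathlib import PureWindowsPath

# "C:" names a drive but no root, so it is relative to that drive's
# working directory; "C:\" names the root directory of the drive.
print(ntpath.isabs("C:"))                     # False
print(ntpath.isabs("C:\\"))                   # True
print(PureWindowsPath("C:").is_absolute())    # False
print(PureWindowsPath("C:\\").is_absolute())  # True

# A drive-relative path splits into a drive and a rootless remainder.
print(ntpath.splitdrive("C:foo"))             # ('C:', 'foo')
```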

Some volume-management functions require a trailing slash or
backslash, such as GetVolumeInformationW [1].
GetVolumeNameForVolumeMountPointW [2] actually requires it to be a
trailing backslash. It will not accept a trailing forward slash such
as "C:\\Mount\\Volume/" (a bug since Windows 2000). The volume name
(e.g. "\\\\?\\Volume{GUID}\\")
returned by the latter includes a trailing backslash, which must be
present in the target path in order for a mountpoint to function
properly as a directory, else it would resolve to the volume device
instead of the root directory.

[1] 
https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getvolumeinformationw
[2] 
https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getvolumenameforvolumemountpointw

> If they're Windows developers, they ought to be aware that the Windows
> file system API allows / anywhere you can use \ and it is the
> common convention in Python to use forward slashes.

The Windows file API actually does not allow slash to be used anywhere
that we can use backslash. It's usually allowed, but not always. For
the most part, the conditions where forward slash is not supported are
intentional.

Windows replaces forward slash with backslash in normal DOS paths and
normal device paths. But sometimes we have to use a special form of
device path that bypasses normalization. A path that isn't normalized
can only use backslash as the path separator. For example, the most
common case is that the process doesn't have long paths enabled. In
this case we're limited to MAX_PATH, which limits file paths to a
paltry 259 characters (sans the terminating null); the current
directory to 258 characters (sans a trailing backslash and null); and
the path of a new directory to 247 characters (subtract 12 from 259 to
leave space for an 8.3 filename). By skipping DOS normalization, we
can access a path with up to about 32,750 characters (i.e. 32,767 sans
the length of the device name in the final NT path under
"\\Device\\").

(Long normalized paths are available starting in Windows 10, but the
system policy that allows this is disabled by default, and even if
enabled, each application has to declare itself to be long-path aware
in its manifest. This is declared for python[w].exe in Python 3.6+.)
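
The length limits quoted above follow from simple arithmetic on MAX_PATH, which is 260 in the Windows headers. The sketch below just restates those numbers; the constant and function names are illustrative, and the 32,767 figure is taken from the message rather than derived.

```python
MAX_PATH = 260                           # includes the terminating null

max_file_path = MAX_PATH - 1             # 259: sans the terminating null
max_current_dir = MAX_PATH - 2           # 258: sans trailing backslash and null
max_new_dir = max_file_path - 12         # 247: leaves room for an 8.3 filename

MAX_NT_PATH = 32767                      # upper bound once normalization is skipped


def exceeds_legacy_limit(path):
    """Return True if a file path is too long for a process without long paths."""
    return len(path) > max_file_path


print(max_file_path, max_current_dir, max_new_dir)   # 259 258 247
```

The 12 characters reserved for a new directory correspond to an 8.3 name: 8 characters, a dot, and a 3-character extension.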

A device path is an explicit reference to a user's local device
directory (in the object namespace), which shadows the global device
directory. In NT, this directory is aliased to a special "\\??\\"
prefix (backslash only). A local device directory is created for each
logon session (not terminal session) by the security system that runs
in terminal session 0 (i.e. the system services session). The
per-logon directory is located at "\\Sessions\\0\\DosDevices\\". In the Windows API, it's accessible as "//?/" or "//./",
or with any mix of forward slashes or backslashes, but only the
all-backslash form is special-cased to bypass the normalization step.
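
As a rough illustration of the point above, the following sketch builds the backslash-only extended form of an absolute DOS path. The helper name to_extended_path is hypothetical (it is not a standard library function), and it uses ntpath purely for string manipulation, so it can be tried on any platform. It assumes the input is an absolute drive-letter or UNC path and never touches the file system.

```python
import ntpath

# Ordinary (non-raw) literals, so every backslash is doubled.
EXTENDED_PREFIX = "\\\\?\\"            # the four characters \\?\
EXTENDED_UNC_PREFIX = "\\\\?\\UNC\\"   # the eight characters \\?\UNC\


def to_extended_path(path):
    # Hypothetical helper: return the all-backslash \\?\ form of a path.
    # Paths already in device form are returned unchanged.
    if path.startswith((EXTENDED_PREFIX, "\\\\.\\")):
        return path
    # normpath turns forward slashes into backslashes and collapses
    # "." and ".." components, which the system will not do for us
    # once normalization is bypassed.
    normalized = ntpath.normpath(path)
    if not ntpath.isabs(normalized):
        raise ValueError("expected an absolute path: %r" % (path,))
    if normalized.startswith("\\\\"):
        # UNC path: \\server\share\... becomes \\?\UNC\server\share\...
        return EXTENDED_UNC_PREFIX + normalized[2:]
    return EXTENDED_PREFIX + normalized


print(to_extended_path("C:/Users/Fred/file.txt"))
```

The design choice here is to normalize in Python first, because a path that skips DOS normalization has to arrive with backslashes only and without relative components.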
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3SDFM2EKFO3UNTATS7KVBY2WOUTFMAF5/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread Paul Moore

On Sat, 10 Aug 2019 at 00:36, Steven D'Aprano wrote:
> 2. To strongly discourage newbie Windows developers from hard-coding
> paths using backslashes, but to use forward slashes instead.

(Side issue)

As a Windows developer, who has seen far too many cases where use of
slashes in filenames implies a Unix-based developer not thinking
sufficiently about Windows compatibility, or where it leads to people
hard coding '/' rather than using os.sep (or better, pathlib), I
strongly object to this characterisation. Rather, I would simply say
"to make Windows users more aware of the clash in usage between
backslashes in filenames and backslashes as string escapes".

There are *many* valid ways to write Windows pathnames in your code:

1. Raw strings
2. Doubling the backslashes
3. Using pathlib (possibly with slash as a directory separator, where
it's explicitly noted as a portable option)
4. Using slashes

IMO, using slashes is the *worst* of these. But this latter is a
matter of opinion - I've no objection to others believing differently,
but I *do* object to slashes being presented as the only option, or
the recommended option without qualification.
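
For reference, the four spellings listed above can be compared directly. This is only a small sketch; the path itself is made up, and PureWindowsPath is used so that it behaves the same on any platform.

```python
from pathlib import PureWindowsPath

# 1. Raw string
raw = r"C:\Users\Fred\notes.txt"
# 2. Doubled backslashes
doubled = "C:\\Users\\Fred\\notes.txt"
# 3. pathlib, using "/" as the portable joining operator
via_pathlib = PureWindowsPath("C:/") / "Users" / "Fred" / "notes.txt"
# 4. Plain forward slashes
slashes = "C:/Users/Fred/notes.txt"

print(raw == doubled)                          # same string
print(str(via_pathlib) == raw)                 # pathlib renders backslashes
print(slashes == raw)                          # different string
print(PureWindowsPath(slashes) == via_pathlib) # but the same path to pathlib
```

Only the forward-slash spelling produces a different string, which matters whenever the path is displayed to a user or compared as text.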

Paul
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FZABAKCBZY72FKFRPK3OXPLKSQ62JZ6N/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread Steve Holden

While not a total solution, it seems like it might be worthwhile forcing
flake8 or similar checks when uploading PyPI modules.

That would catch the illegal escape sequences where it really matters -
before they enter the ecosystem.

(general) fathead:pyxll-www sholden$ cat t.py
"Docstring with illegal \escape sequence"
(general) fathead:pyxll-www sholden$ flake8 t.py
t.py:1:25: W605 invalid escape sequence '\e'

While this won't mitigate the case for existing packages, it should reduce
the number of packages containing potentially erroneous string constants,
preparing the ground for the eventual introduction of the syntax error.
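
A check in the same spirit as the flake8 run above can be sketched with the standard library alone, by compiling the source and recording the warnings the compiler emits. The function name compile_warnings is made up for this illustration. It assumes Python 3.6 or later, where an unrecognised escape produces a DeprecationWarning or, in newer releases, a SyntaxWarning; it also reports any other compile-time warning of those categories.

```python
import warnings

BACKSLASH = chr(92)  # avoids writing an invalid escape in this example itself


def compile_warnings(source, filename="<string>"):
    # Hypothetical helper: compile the source and return the
    # DeprecationWarning/SyntaxWarning records raised while doing so.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, filename, "exec")
    return [w for w in caught
            if issubclass(w.category, (DeprecationWarning, SyntaxWarning))]


# The same text as t.py above, plus a corrected variant.
bad_source = '"Docstring with illegal ' + BACKSLASH + 'escape sequence"\n'
good_source = '"Docstring with doubled ' + BACKSLASH * 2 + 'escape sequence"\n'

for w in compile_warnings(bad_source, "t.py"):
    print(w.filename, w.lineno, w.category.__name__)
print(len(compile_warnings(good_source, "t.py")))
```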

Steve Holden


On Sat, Aug 10, 2019 at 8:07 AM Serhiy Storchaka 
wrote:

> 10.08.19 02:04, Gregory P. Smith wrote:
> > I've merged the PR reverting the behavior in 3.8 and am doing the same
> > in the master branch.
>
> I was going to rebase it onto master and go through the normal backporting
> process if we decide that DeprecationWarning should be in master. I was
> waiting for the end of the discussion.
>
> > Recall the nightmare caused by md5.py and sha.py DeprecationWarning's in
> > 2.5...  this would be similar.
>
> It is very different because DeprecationWarning for md5.py and sha.py is
> emitted at runtime.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/24ID6EF6ESG64B6VFXVRL4XNWP5I7ITW/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread Serhiy Storchaka


10.08.19 02:04, Gregory P. Smith wrote:
> I've merged the PR reverting the behavior in 3.8 and am doing the same
> in the master branch.

I was going to rebase it onto master and go through the normal backporting
process if we decide that DeprecationWarning should be in master. I was
waiting for the end of the discussion.

> Recall the nightmare caused by md5.py and sha.py DeprecationWarning's in
> 2.5...  this would be similar.

It is very different because the DeprecationWarning for md5.py and sha.py is
emitted at runtime.

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/H5VXWS6UT2OZBTXG7HUERKAQQIQ4BYEA/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-10 Thread Serhiy Storchaka

09.08.19 19:39, Steve Dower wrote:
> I also posted another possible option that helps solve the real problem
> faced by users, and not just the "we want to have a warning" problem
> that is purely ours.

Warnings solve two problems:

* Teaching users that a backslash has special meaning and should be
escaped unless it is used for special meaning.

* Avoiding breakage or the introduction of bugs if we add new escape
sequences (like \e).

> * change the SyntaxWarning into a default-silenced one that fires
> every time a .pyc is loaded (this is the hard part, but it's doable)

It was considered an advantage that these warnings are shown only once
at compile time. So they will be shown to the author of the code, but
the user of the code will not see them (except at installation time).
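
The compile-time-only behaviour described above can be observed with a short sketch. It assumes an interpreter where an unrecognised escape is still a warning rather than an error, and it stands in for the .pyc mechanism by compiling once and then executing the resulting code object.

```python
import warnings

BACKSLASH = chr(92)
# The source text is:  pattern = "\d+"
SOURCE = 'pattern = "' + BACKSLASH + 'd+"\n'

# Step 1: compiling is what the author (or the installer) does.
with warnings.catch_warnings(record=True) as at_compile:
    warnings.simplefilter("always")
    code = compile(SOURCE, "<demo>", "exec")

# Step 2: running the already compiled code is what the user does.
namespace = {}
with warnings.catch_warnings(record=True) as at_run:
    warnings.simplefilter("always")
    exec(code, namespace)

print(len(at_compile) >= 1)  # the warning was raised while compiling
print(len(at_run))           # nothing is raised while running
```

Running the compiled code a second or third time stays silent as well, which is why a user who only ever loads cached bytecode never sees the warning.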

Actually, we need to distinguish the author from the user of the code
and show warnings only to the author. Using .pyc files was just a
heuristic: the author compiles the Python code, and the user uses
compiled .pyc files. It would be nice to have a more reliable way to
determine the ownership of the code. This is relevant not only to
SyntaxWarnings, but also to runtime DeprecationWarnings. Maybe silence
warnings only for read-only files and make files installed by pip read-only?

> * change pathlib.PureWindowsPath, os.fsencode and os.fsdecode to
> explicitly warn when the path contains control characters

This can cause additional harm. Currently you get the expected
FileNotFoundError when a user-specified path is bad, and it can be caught
and handled. But with warnings you will get either noise on the output or
an unexpected unhandled error.

> * change the PyErr_SetExcFromWindowsErrWithFilenameObjects function to
> append (or chain) an extra message when either of the filenames
> contains control characters (or change OSError to do it, or the
> default sys.excepthook)

I do not understand what goal will be achieved by this.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/BCAOEGQYK5KYAMPDQ5O6KWGCOOQUJ6UV/

[Python-Dev] Re: What to do about invalid escape sequences


On 8/9/2019 3:56 PM, Steven D'Aprano wrote:

I'm not trying to be confrontational, I'm trying to understand your
use-case(s) and see if it would be broken by the planned change to
string escapes.


Yeah, that's fine. Sometimes it is hard to communicate via email (versus 
saying a lot).



On Fri, Aug 09, 2019 at 03:18:29PM -0700, Glenn Linderman wrote:

On 8/9/2019 2:53 PM, Steven D'Aprano wrote:

On Fri, Aug 09, 2019 at 01:12:59PM -0700, Glenn Linderman wrote:


The reason I never use raw strings is in the documentation, it is
because \ still has a special meaning, and the first several times I
felt the need for raw strings, it was for directory names that wanted to
end with \ and couldn't.

Can you elaborate? I find it unlikely that I would ever want a docstring

I didn't mention docstring.  I just wanted a string with a path name
ending in \.

You said you never used raw strings in the documentation. I read that as
doc strings. What sort of documentation are you writing that isn't a doc
string but is inside your .py files where the difference between raw and
regular strings is meaningful?


No, what I said was that the reason is in the documentation. The reason 
that I don't use raw strings is in the Python documentation. I don't 
claim to use raw strings for documentation I write. The reason is that 
\" to end the string doesn't work, and the first good-sounding 
justification for using raw strings that I stumbled across was to avoid 
"c:\\directory\\" in favor of r"c:\directory\"  but that doesn't work, 
and neither does r"c:\directory\\". Since then, I have not found any other 
compelling need for raw strings that overcomes that deficiency... the 
benefit of raw strings is that you don't have to double the \\. But the 
benefit is contradicted by not being able to use one at the end of the 
string. If you can't use it at the end of the string, the utility of not 
doubling them in the middle of the string is just too confusing to make 
it worth figuring out the workarounds when you have a string full of \ 
that happens to end in \. It's just easier to remember the "always double \" 
rule than to remember the extra "but if your string containing \ 
doesn't have one at the end you can get away with using a raw string and 
not doubling the \" rule.
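
The behaviour described above can be checked with a small sketch. The helper is_valid_expression is a made-up name, and the failing literal is assembled from pieces so that this example itself remains valid Python.

```python
BACKSLASH = chr(92)


def is_valid_expression(source):
    # Hypothetical helper: True if the text compiles as an expression.
    try:
        compile(source, "<demo>", "eval")
    except SyntaxError:
        return False
    return True


# The text  r"c:\directory\"  : the final \" does not close the literal.
one_trailing = 'r"c:' + BACKSLASH + 'directory' + BACKSLASH + '"'
print(is_valid_expression(one_trailing))

# r"c:\directory\\" compiles, but keeps BOTH trailing backslashes.
two_trailing = r"c:\directory\\"
print(two_trailing.endswith(BACKSLASH * 2))

# Two workarounds that do yield exactly one trailing backslash.
wanted = "c:\\directory\\"
concatenated = r"c:\directory" "\\"   # adjacent literals are joined
sliced = r"c:\directory\ "[:-1]       # drop a sacrificial trailing space
print(concatenated == wanted, sliced == wanted)
```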



Windows users are used to seeing backslashes in paths, I don't care to
be the one to explain why my program uses / and all the rest use \.

If you don't use raw strings for paths, you get to explain why your
program uses \\ and all the rest use \ *wink*

Wrong. Users don't look at the source code. They look at the output. I 
also don't want to have to write code to convert /-laden paths to 
\-laden paths when I display them to the user.




If they're Windows end users, they won't be reading your source code and
will never know how you represent hard-coded paths in the source code.


They will if I display the path as a default value for an argument, or 
show them the path for other reasons, or if the path shows up in an 
exception message.




If they're Windows developers, they ought to be aware that the Windows
file system API allows / anywhere you can use \ and it is the
common convention in Python to use forward slashes.


This, we can agree on.


I'm also curious why the string needs to *end* with a backslash. Both of
these are the same path:

 C:\foo\bar\baz\
 C:\foo\bar\baz


Sure. But only one of them can be used successfully with   + filename  
(for example).


Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OYRSO4WHUFA7Q34HJTBIMQL337JTA5RX/

[Python-Dev] Re: What to do about invalid escape sequences

On 8/9/2019 4:08 PM, MRAB wrote:

On 2019-08-09 23:56, Steven D'Aprano wrote:

I'm not trying to be confrontational, I'm trying to understand your
use-case(s) and see if it would be broken by the planned change to
string escapes.

On Fri, Aug 09, 2019 at 03:18:29PM -0700, Glenn Linderman wrote:

On 8/9/2019 2:53 PM, Steven D'Aprano wrote:
>On Fri, Aug 09, 2019 at 01:12:59PM -0700, Glenn Linderman wrote:
>
>>The reason I never use raw strings is in the documentation, it is
>>because \ still has a special meaning, and the first several times I
>>felt the need for raw strings, it was for directory names that wanted to
>>end with \ and couldn't.
>Can you elaborate? I find it unlikely that I would ever want a docstring

I didn't mention docstring.  I just wanted a string with a path name 
ending in \.

You said you never used raw strings in the documentation. I read that as
doc strings. What sort of documentation are you writing that isn't a doc
string but is inside your .py files where the difference between raw and
regular strings is meaningful?

Windows users are used to seeing backslashes in paths, I don't care 
to be the one to explain why my program uses / and all the rest use \.

If you don't use raw strings for paths, you get to explain why your
program uses \\ and all the rest use \ *wink*

If they're Windows end users, they won't be reading your source code and
will never know how you represent hard-coded paths in the source code.

If they're Windows developers, they ought to be aware that the Windows
file system API allows / anywhere you can use \ and it is the
common convention in Python to use forward slashes.

I'm also curious why the string needs to *end* with a backslash. Both of
these are the same path:

 C:\foo\bar\baz\
 C:\foo\bar\baz

The only time it's required is for the root directory of a drive:

C:\

That's not the only time it's required, but it is a case that is far 
harder to specify in other ways.  It's required any time you  want to 
say   + filename without writing + "\\" + filename, or os.path.join( 
"C:\\", filename )
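
A short sketch of the difference being discussed, using ntpath so that the Windows rules apply on any platform. The directory and file names are invented for the example.

```python
import ntpath

with_sep = "C:\\foo\\bar\\baz\\"
without_sep = "C:\\foo\\bar\\baz"

# Plain concatenation only works when the directory ends with a separator.
print(with_sep + "file.txt")
print(without_sep + "file.txt")

# join() inserts the separator only when it is missing.
print(ntpath.join(with_sep, "file.txt") == ntpath.join(without_sep, "file.txt"))

# The drive root is the special case: "C:" alone is drive-relative.
print(ntpath.join("C:", "file.txt"))
print(ntpath.join("C:\\", "file.txt"))
```

The last two lines show why the root directory is the hard case: without the trailing backslash, the result names a file relative to the current directory on drive C: rather than a file in the root.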
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LXM72OGMFTNJP3NQPITJWWGB6ITNRBH4/

[Python-Dev] Re: What to do about invalid escape sequences

On 8/9/2019 4:07 PM, Gregory P. Smith wrote:

On Fri, Aug 9, 2019 at 11:37 AM Eric V. Smith wrote:

On 8/9/2019 2:28 PM, Jonathan Goble wrote:
> On Fri, Aug 9, 2019 at 12:34 PM Nick Coghlan wrote:
>> I find the "Our deprecation warnings were even less visible than
>> normal" argument for extending the deprecation period compelling.
> Outsider's 2 cents from reading this discussion (with no personal
> experience with this warning):
>
> I am perplexed at the opinion, seemingly espoused by multiple people
> in this thread, that because a major part of the problem is that the
> warnings were not visible enough, somehow the proposed solution is
> making them not visible enough again? It's too late, in my
> understanding, in the 3.8 cycle to add a new feature like a change to
> how these warnings are produced (it seems a significant change to the
> .pyc structure is needed to emit them at runtime), so this supposed
> "solution" is nothing but kicking the can down the road. When 3.9
> rolls around, public exposure to the problem of invalid escape
> sequences will still be approximately what it is now (because if
> nobody saw the warnings in 3.7, they certainly won't see them in 3.8
> with this "fix"), so you'll end up with the same complaints about
> SyntaxWarning that started this discussion, end up back on
> DeprecationWarning for 3.9 (hopefully with support for emitting them
> at runtime instead of just compile-time), then have to wait until
> 3.10/4.0 for SyntaxWarning and eventually the next version to actually
> make them errors.

Yes, I think that's the idea: Deprecation warning in 3.9, but more
visible than what 3.7 has. That is, not just at compile time but at run
time. What's required to make that happen is an open question.

I've lost track of who suggested what in this thread, but yes, that 
concept has been rolling over in my mind as a potentially good idea 
after someone suggested it.  Compile time warnings should turn into 
bytecode for a warnings.warn call in the generated pyc.  I haven't 
spent time trying to reason if that actually addresses the real issues 
we're having moving forward with a syntax warning change though. A 
reasonable feature to ask for as a feature in 3.9 or later perhaps.

The documentation actually claims it was deprecated in version 3.6. So 
it has already been 2 releases worth of deprecation, visible warning or not.

Ship it.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YYSC5CDJWOF24AUWC4IHJG45COHOTHW3/

[Python-Dev] Re: What to do about invalid escape sequences

On Fri, Aug 09, 2019 at 02:28:13PM -0400, Jonathan Goble wrote:

> I am perplexed at the opinion, seemingly espoused by multiple people
> in this thread, that because a major part of the problem is that the
> warnings were not visible enough, somehow the proposed solution is
> making them not visible enough again?

Making the warnings invisible by default is only the first step, not the 
entire solution.

We don't break backwards compatibility lightly, and the current 
behaviour is not an accident, it is a documented feature which 
developers are entitled to rely on.

We are choosing to change that behaviour, breaking backwards 
compatibility, to the inconvenience of end-users, library authors, 
and developers on Mac/Unix/Linux, for two benefits:

1. To possibly allow the addition of new escape sequences such as \e 
some time in the future.

2. To strongly discourage newbie Windows developers from hard-coding 
paths using backslashes, but to use forward slashes instead.
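
To make the first benefit concrete, here is a sketch of what an unrecognised escape currently evaluates to, and why giving \e a meaning later would silently change existing strings. It assumes an interpreter that still accepts unrecognised escapes with a warning, and it uses the ESC character as the meaning \e would presumably get, which is an assumption rather than anything decided in this thread.

```python
import warnings

BACKSLASH = chr(92)
# The literal text  "\e[1m"  built from pieces so this file stays warning-free.
literal_source = '"' + BACKSLASH + 'e[1m"'

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the deprecation for the demo
    current_value = eval(literal_source)

# Today the backslash is kept: five characters, starting with a backslash.
print(len(current_value), current_value[0] == BACKSLASH)

# If \e were defined as ESC, the same literal would mean this instead.
hypothetical_value = "\x1b[1m"
print(len(hypothetical_value), current_value == hypothetical_value)
```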

Especially on Python-Ideas, time and time again we hear the mantra that 
we should only break backwards compatibility if the benefit strongly 
outweighs the cost of change. Raymond has given compelling (to me at 
least) testimony that right now, the cost of change is far too high for 
the two minor benefits gained.

So *right now*, it looks like we ought to be prepared to back away from 
the change altogether. We thought that the balance would be:

"it will be a little bit painful, but the benefit will outweigh the pain"

justifying breaking backwards compatibility, but we have found that the 
pain is greater than expected. If we cannot reduce the pain, and move 
the balance into the "nett positive" rather than the "nett negative" we 
have right now, we ought to cancel the deprecation.

Making the deprecation silent by default will reduce the pain. That's 
the first step. Pushing the deprecation schedule back a release or more 
will give us time to rethink the deprecation process, fix the technical 
issues we discovered about SyntaxWarnings, and give library authors time 
to eliminate the warnings from their libraries.

> It's too late, in my
> understanding, in the 3.8 cycle to add a new feature like a change to
> how these warnings are produced (it seems a significant change to the
> .pyc structure is needed to emit them at runtime), so this supposed
> "solution" is nothing but kicking the can down the road.

Is that a problem? Any deadline we have to make unrecognised backslash 
escapes an error is a self-imposed deadline. We lived with this feature 
for more than a quarter of a century, we can keep kicking the can down 
the road until the benefit outweighs the pain.

If that means "forever", then I personally will be sad, but so be it.

However, even if it is too late to add any new tools or features to 
Python 3.8 (and that's not clear: this won't be a *language* change, so 
the feature freeze may not apply) all is not lost.

We're aware of the problem, and can start pointing library authors at 
this thread, and the relevant b.p.o. ticket, and push them in the right 
direction.

Raymond mentioned two libraries by name, bottle and docutils, and Matt 
scanned the top 100 packages on PyPI. That's a good place to start for 
anyone wanting to contribute: raise bug reports on the individual 
library trackers. (If they haven't already been raised.)

https://github.com/bottlepy/bottle/issues

(I'd do that myself except I have technical problems using Github.)

I have reported it to docutils:

https://sourceforge.net/p/docutils/bugs/373/

[...]
> So put these warnings front and center now
> so package and code maintainers actually see it

The problem is that this seriously and negatively affects the experience 
for many end-users. That's what we're trying to prevent, or at least 
mitigate.

-- 
Steven
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/B7FH5IGUX24J7X7QEANAOSTIKTOHZJ5E/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread MRAB

On 2019-08-09 23:56, Steven D'Aprano wrote:

I'm not trying to be confrontational, I'm trying to understand your
use-case(s) and see if it would be broken by the planned change to
string escapes.

On Fri, Aug 09, 2019 at 03:18:29PM -0700, Glenn Linderman wrote:

On 8/9/2019 2:53 PM, Steven D'Aprano wrote:
>On Fri, Aug 09, 2019 at 01:12:59PM -0700, Glenn Linderman wrote:
>
>>The reason I never use raw strings is in the documentation, it is
>>because \ still has a special meaning, and the first several times I
>>felt the need for raw strings, it was for directory names that wanted to
>>end with \ and couldn't.
>Can you elaborate? I find it unlikely that I would ever want a docstring

I didn't mention docstring.  I just wanted a string with a path name 
ending in \.

You said you never used raw strings in the documentation. I read that as
doc strings. What sort of documentation are you writing that isn't a doc
string but is inside your .py files where the difference between raw and
regular strings is meaningful?

Windows users are used to seeing backslashes in paths, I don't care to 
be the one to explain why my program uses / and all the rest use \.

If you don't use raw strings for paths, you get to explain why your
program uses \\ and all the rest use \ *wink*

If they're Windows end users, they won't be reading your source code and
will never know how you represent hard-coded paths in the source code.

If they're Windows developers, they ought to be aware that the Windows
file system API allows / anywhere you can use \ and it is the
common convention in Python to use forward slashes.

I'm also curious why the string needs to *end* with a backslash. Both of
these are the same path:

 C:\foo\bar\baz\
 C:\foo\bar\baz

The only time it's required is for the root directory of a drive:

C:\
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GIPRSYAINB4NE4IORCYRTYN7TZWMCZ34/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Gregory P. Smith

On Fri, Aug 9, 2019 at 11:37 AM Eric V. Smith wrote:

> On 8/9/2019 2:28 PM, Jonathan Goble wrote:
> > On Fri, Aug 9, 2019 at 12:34 PM Nick Coghlan wrote:
> >> I find the "Our deprecation warnings were even less visible than
> >> normal" argument for extending the deprecation period compelling.
> > Outsider's 2 cents from reading this discussion (with no personal
> > experience with this warning):
> >
> > I am perplexed at the opinion, seemingly espoused by multiple people
> > in this thread, that because a major part of the problem is that the
> > warnings were not visible enough, somehow the proposed solution is
> > making them not visible enough again? It's too late, in my
> > understanding, in the 3.8 cycle to add a new feature like a change to
> > how these warnings are produced (it seems a significant change to the
> > .pyc structure is needed to emit them at runtime), so this supposed
> > "solution" is nothing but kicking the can down the road. When 3.9
> > rolls around, public exposure to the problem of invalid escape
> > sequences will still be approximately what it is now (because if
> > nobody saw the warnings in 3.7, they certainly won't see them in 3.8
> > with this "fix"), so you'll end up with the same complaints about
> > SyntaxWarning that started this discussion, end up back on
> > DeprecationWarning for 3.9 (hopefully with support for emitting them
> > at runtime instead of just compile-time), then have to wait until
> > 3.10/4.0 for SyntaxWarning and eventually the next version to actually
> > make them errors.
>
> Yes, I think that's the idea: Deprecation warning in 3.9, but more
> visible than what 3.7 has. That is, not just at compile time but at run
> time. What's required to make that happen is an open question.
>

I've lost track of who suggested what in this thread, but yes, that concept
has been rolling over in my mind as a potentially good idea after someone
suggested it.  Compile time warnings should turn into bytecode for a
warnings.warn call in the generated pyc.  I haven't spent time trying to
reason if that actually addresses the real issues we're having moving
forward with a syntax warning change though.  A reasonable feature to ask
for as a feature in 3.9 or later perhaps.
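
As a very rough sketch of that idea, and not a description of how CPython does or would implement it, the compile-time warnings can be recorded while parsing and then re-issued by statements prepended to the module body. The function name compile_with_runtime_warnings is invented here. The sketch ignores details such as module docstrings and "from __future__" imports (which must come first), and it leaks a helper name into the module namespace.

```python
import ast
import warnings


def compile_with_runtime_warnings(source, filename="<demo>"):
    # Hypothetical helper: bake parse-time warnings into the code object.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        tree = ast.parse(source, filename)
    if caught:
        # Build statements that replay each recorded warning when the
        # module is executed, reported against the original line.
        lines = ["import warnings as _w"]
        for w in caught:
            lines.append(
                "_w.warn_explicit({!r}, DeprecationWarning, {!r}, {!r})".format(
                    str(w.message), filename, w.lineno
                )
            )
        tree.body[:0] = ast.parse("\n".join(lines), filename).body
        ast.fix_missing_locations(tree)
    return compile(tree, filename, "exec")


BACKSLASH = chr(92)
SOURCE = 'pattern = "' + BACKSLASH + 'd+"\n'   # the text: pattern = "\d+"
code = compile_with_runtime_warnings(SOURCE)

with warnings.catch_warnings(record=True) as seen:
    warnings.simplefilter("always")
    exec(code, {})
print(len(seen) >= 1)  # the warning now fires at run time as well
```

Because the replayed warning is an ordinary runtime warning, the usual filters apply to it, so it could be silenced by default for end users and enabled by developers.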

-gps
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BR7T76SXANRAGJ3QOMWZUEGRVPVP/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Gregory P. Smith

On Fri, Aug 9, 2019 at 8:43 AM Guido van Rossum wrote:

> This discussion looks like there's no end in sight. Maybe the Steering
> Council should take a vote?
>

I've merged the PR reverting the behavior in 3.8 and am doing the same in
the master branch.

The sheer volume of email this is generating shows that we're not ready to
do this to our users.

Recall the nightmare caused by md5.py and sha.py DeprecationWarning's in
2.5...  this would be similar.

We need owners of code to see the problems, not end users of other people's
code.

FWIW, lest people think I don't like this change and just pushed the revert
buttons as a result, wrong.  I agree with the ultimate SyntaxError and
believe we should move the language there (it is better for long term code
quality).  But it needs to be done in a way that disrupts the *right*
people in the process, not disrupting an exponentially higher number of
users of other people's code.

If the steering council does anything it should be deciding if we're still
going to do this at all and, if so, planning how we do it without repeating
past mistakes.

-gps
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/W4BUFLMDX5FAFOVLYP4C2LQ2HOTJZEZX/

[Python-Dev] Re: What to do about invalid escape sequences

I'm not trying to be confrontational, I'm trying to understand your 
use-case(s) and see if it would be broken by the planned change to 
string escapes.

On Fri, Aug 09, 2019 at 03:18:29PM -0700, Glenn Linderman wrote:
> On 8/9/2019 2:53 PM, Steven D'Aprano wrote:
> >On Fri, Aug 09, 2019 at 01:12:59PM -0700, Glenn Linderman wrote:
> >
> >>The reason I never use raw strings is in the documentation, it is
> >>because \ still has a special meaning, and the first several times I
> >>felt the need for raw strings, it was for directory names that wanted to
> >>end with \ and couldn't.
> >Can you elaborate? I find it unlikely that I would ever want a docstring
> 
> I didn't mention docstring.  I just wanted a string with a path name 
> ending in \.

You said you never used raw strings in the documentation. I read that as 
doc strings. What sort of documentation are you writing that isn't a doc 
string but is inside your .py files where the difference between raw and 
regular strings is meaningful?

> Windows users are used to seeing backslashes in paths, I don't care to 
> be the one to explain why my program uses / and all the rest use \.

If you don't use raw strings for paths, you get to explain why your 
program uses \\ and all the rest use \ *wink*

If they're Windows end users, they won't be reading your source code and 
will never know how you represent hard-coded paths in the source code.

If they're Windows developers, they ought to be aware that the Windows 
file system API allows / anywhere you can use \ and it is the 
common convention in Python to use forward slashes.

I'm also curious why the string needs to *end* with a backslash. Both of 
these are the same path:

C:\foo\bar\baz\
C:\foo\bar\baz

-- 
Steven
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UT4WDQRJ5U5TA5YYHOM4RRDZV6KEC347/

[Python-Dev] Re: What to do about invalid escape sequences


On 8/9/2019 2:53 PM, Steven D'Aprano wrote:

On Fri, Aug 09, 2019 at 01:12:59PM -0700, Glenn Linderman wrote:


The reason I never use raw strings is in the documentation, it is
because \ still has a special meaning, and the first several times I
felt the need for raw strings, it was for directory names that wanted to
end with \ and couldn't.

Can you elaborate? I find it unlikely that I would ever want a docstring


I didn't mention docstring.  I just wanted a string with a path name 
ending in \.



that ends with a backslash:

 def func():
 r"""Documentation goes here...
 more documentation...
 ending with a Windows path that needs a trailing backslash
 like this C:\directory\"""

That seems horribly contrived. Why use backslashes in the path when the
strong recommendation is to use forward slashes?


Windows users are used to seeing backslashes in paths, I don't care to 
be the one to explain why my program uses / and all the rest use \.



And why not solve the problem by simply moving the closing quotes to the
next line, as PEP 8 recommends?

 r"""Documentation ...
 C:\directory\
 """


This isn't my problem, I wasn't using docstrings, and including a 
newline in a path name doesn't work.  I suppose one could "solve" the 
problem by using


"c:\directory\ "[ :-1]

but that is just as annoying as

"c:\\directory\\"

and back when I discovered the problem, I was still learning Python, and 
didn't think of the above solution either.





[...]

Even in a raw literal, quotes can be escaped with a backslash

Indeed, they're not so much "raw" strings as only slightly blanched
strings.




Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EHE2CNRDGS6AF6GYO4DX7UNNE24JH6CG/

[Python-Dev] Re: What to do about invalid escape sequences

On Fri, Aug 09, 2019 at 01:12:59PM -0700, Glenn Linderman wrote:

> The reason I never use raw strings is in the documentation, it is 
> because \ still has a special meaning, and the first several times I 
> felt the need for raw strings, it was for directory names that wanted to 
> end with \ and couldn't.

Can you elaborate? I find it unlikely that I would ever want a docstring 
that ends with a backslash:

def func():
r"""Documentation goes here...
more documentation...
ending with a Windows path that needs a trailing backslash
like this C:\directory\"""

That seems horribly contrived. Why use backslashes in the path when the 
strong recommendation is to use forward slashes?

And why not solve the problem by simply moving the closing quotes to the 
next line, as PEP 8 recommends?

r"""Documentation ...
C:\directory\
"""

[...]
> >Even in a raw literal, quotes can be escaped with a backslash

Indeed, they're not so much "raw" strings as only slightly blanched 
strings.

-- 
Steven
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Z354BLWONCWMUMFULE64MWUK4TA6PMK2/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread brian . skinn

Nathaniel Smith wrote:
> Unfortunately, their solution isn't a pytest incantation, it's a
> separate 'compileall' invocation they run on their source tree. I'm
> not sure how you'd convert this into a pytest feature, because I don't
> think pytest always knows which parts of your code are your code versus
> which parts are supporting libraries.
> -n

Ahh, did not appreciate this. :-( Never mind, then!
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VFFV7MJUZKMSD6FS3OONSEN5XLBOLT5R/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Nathaniel Smith

On Fri, Aug 9, 2019 at 12:07 PM  wrote:
>
> Eric V. Smith wrote:
> >  Hopefully the warnings in 3.9 would be more visible than what we saw in
> > 3.7, so that library authors can take notice and do something about it
> > before 3.10 rolls around.
> > Eric
>
> Apologies for the ~double-post on the thread, but: the SymPy team has figured 
> out the right pytest incantation to expose these warnings. Given the 
> extensive adoption of pytest, perhaps it would be good to combine (1) a FR on 
> pytest to add a convenience flag enabling this mix of options with (2) an 
> aggressive "marketing push", encouraging library maintainers to add it to 
> their testing/CI.

Unfortunately, their solution isn't a pytest incantation, it's a
separate 'compileall' invocation they run on their source tree. I'm
not sure how you'd convert this into a pytest feature, because I don't
think pytest always knows which parts of your code are your code versus
which parts are supporting libraries.
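A minimal sketch of a compile-only check in the spirit of the approach described above. It makes no claim about SymPy's actual setup: the helper name `find_invalid_escapes` is made up for illustration, and it assumes the warning category is DeprecationWarning or SyntaxWarning depending on the Python version (a later version may raise SyntaxError instead).

```python
import warnings


def find_invalid_escapes(source, filename="<example>"):
    """Compile `source` and return messages about invalid escape sequences.

    Hypothetical helper for illustration; it compiles the code but never runs it.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        try:
            compile(source, filename, "exec")
        except SyntaxError as exc:
            # A future Python version may reject the literal outright.
            return [str(exc)]
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, (DeprecationWarning, SyntaxWarning))
    ]


BACKSLASH = chr(92)
bad_source = 'filename = "location' + BACKSLASH + 'data"'    # "location\data"
good_source = 'filename = r"location' + BACKSLASH + 'data"'  # r"location\data"

print(find_invalid_escapes(bad_source))   # at least one message
print(find_invalid_escapes(good_source))  # []
```

Because the check works on source text you choose to feed it, it sidesteps the question of which imported modules are "your code".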

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/H36DMKUODHOQOYIIZCKW6LYKSGJLXTT4/

[Python-Dev] Re: What to do about invalid escape sequences


On 8/9/2019 9:08 AM, Nick Coghlan wrote:

On Sat, 10 Aug 2019 at 01:44, Guido van Rossum  wrote:

This discussion looks like there's no end in sight. Maybe the Steering Council 
should take a vote?

I find the "Our deprecation warnings were even less visible than
normal" argument for extending the deprecation period compelling.

I also think the UX of the warning itself could be reviewed to provide
a more explicit nudge towards using raw strings when folks want to
allow arbitrary embedded backslashes. Consider:

 SyntaxWarning: invalid escape sequence \,

vs something like:

 SyntaxWarning: invalid escape sequence \, (Note: adding the raw
string literal prefix, r, will accept all non-trailing backslashes)

After all, the habit we're trying to encourage is "If I want to
include backslashes without escaping them all, I should use a raw
string", not "I should memorize the entire set of valid escape
sequences" or even "I should always escape backslashes".

Cheers,
Nick.

The reason I never use raw strings is in the documentation, it is 
because \ still has a special meaning, and the first several times I 
felt the need for raw strings, it was for directory names that wanted to 
end with \ and couldn't. Quoted below. Also relevant to the discussion 
is the "benefit" of leaving the backslash in the result of an illegal 
escape, which no one has mentioned in this huge thread.


Unlike Standard C, all unrecognized escape sequences are left in the 
string unchanged, i.e., /the backslash is left in the result/. (This 
behavior is useful when debugging: if an escape sequence is mistyped, 
the resulting output is more easily recognized as broken.) It is also 
important to note that the escape sequences only recognized in string 
literals fall into the category of unrecognized escapes for bytes 
literals.


Changed in version 3.6: Unrecognized escape sequences produce a
DeprecationWarning. In some future version of Python they will be
a SyntaxError.

Even in a raw literal, quotes can be escaped with a backslash, but the 
backslash remains in the result; for example, |r"\""| is a valid 
string literal consisting of two characters: a backslash and a double 
quote; |r"\"| is not a valid string literal (even a raw string cannot 
end in an odd number of backslashes). Specifically, /a raw literal 
cannot end in a single backslash/ (since the backslash would escape 
the following quote character). Note also that a single backslash 
followed by a newline is interpreted as those two characters as part 
of the literal, /not/ as a line continuation.
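The rules quoted above can be checked directly. This is only an illustrative sketch; `is_valid_literal` is a hypothetical helper, and the backslash is built with `chr(92)` so the example source itself stays unambiguous.

```python
BACKSLASH = chr(92)

# r"\"" is valid: two characters, a backslash and a double quote.
two_chars = r"\""
print(len(two_chars))  # 2


def is_valid_literal(source):
    """Return True if `source` compiles as a single Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Source text r"C:\directory\" : the final backslash escapes the closing quote.
trailing = 'r"C:' + BACKSLASH + "directory" + BACKSLASH + '"'
print(is_valid_literal(trailing))  # False

# Two common workarounds that yield the same 13-character string.
escaped = "C:\\directory\\"
joined = r"C:\directory" "\\"  # adjacent literals are concatenated
print(escaped == joined)  # True
```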




___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HKLW5VBZK46TOP6WURFH767YCHRFOYNN/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread brian . skinn

Eric V. Smith wrote:
>  Hopefully the warnings in 3.9 would be more visible than what we saw in
> 3.7, so that library authors can take notice and do something about it 
> before 3.10 rolls around.
> Eric

Apologies for the ~double-post on the thread, but: the SymPy team has figured 
out the right pytest incantation to expose these warnings. Given the extensive 
adoption of pytest, perhaps it would be good to combine (1) a FR on pytest to 
add a convenience flag enabling this mix of options with (2) an aggressive 
"marketing push", encouraging library maintainers to add it to their testing/CI.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/S2464WJ3QCDE4CBM6AWITHMFCISA6O75/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Jonathan Goble

On Fri, Aug 9, 2019 at 2:36 PM Eric V. Smith  wrote:
>
> On 8/9/2019 2:28 PM, Jonathan Goble wrote:
> > On Fri, Aug 9, 2019 at 12:34 PM Nick Coghlan  wrote:
> >> I find the "Our deprecation warnings were even less visible than
> >> normal" argument for extending the deprecation period compelling.
> > Outsider's 2 cents from reading this discussion (with no personal
> > experience with this warning):
> >
> > I am perplexed at the opinion, seemingly espoused by multiple people
> > in this thread, that because a major part of the problem is that the
> > warnings were not visible enough, somehow the proposed solution is
> > making them not visible enough again? It's too late, in my
> > understanding, in the 3.8 cycle to add a new feature like a change to
> > how these warnings are produced (it seems a significant change to the
> > .pyc structure is needed to emit them at runtime), so this supposed
> > "solution" is nothing but kicking the can down the road. When 3.9
> > rolls around, public exposure to the problem of invalid escape
> > sequences will still be approximately what it is now (because if
> > nobody saw the warnings in 3.7, they certainly won't see them in 3.8
> > with this "fix"), so you'll end up with the same complaints about
> > SyntaxWarning that started this discussion, end up back on
> > DeprecationWarning for 3.9 (hopefully with support for emitting them
> > at runtime instead of just compile-time), then have to wait until
> > 3.10/4.0 for SyntaxWarning and eventually the next version to actually
> > make them errors.
>
> Yes, I think that's the idea: Deprecation warning in 3.9, but more
> visible than what 3.7 has. That is, not just at compile time but at run
> time. What's required to make that happen is an open question.
>
> > It seems to me, in my humble but uneducated opinion, that if people
> > are not seeing the warnings, then continuing to give them warnings
> > they won't see isn't a solution to anything. Put the warning front and
> > center. The argument of third-party packages will always be an issue,
> > even if we wait ten years. So put these warnings front and center now
> > so package and code maintainers actually see it, and I'll bet the
> > problematic escape sequences get fixed rather quickly.
> >
> > What am I missing here?
>
> Hopefully the warnings in 3.9 would be more visible than what we saw in
> 3.7, so that library authors can take notice and do something about it
> before 3.10 rolls around.

OK, so I'm at least understanding the plan correctly. I just don't get
the idea of kicking the can down the road on the hope that in 3.9
people will see the warning (knowing that you are still using a
warning that is disabled by default and thus has a high chance of not
being seen until 3.10), when we already have the ability to push out a
visible-by-default warning now in 3.8 and get people to take notice
two whole feature releases (= about 3 years) earlier.

The SyntaxWarning disruption (or SyntaxError disruption) has to happen
eventually, and while I support the idea of making compile-time
DeprecationWarnings be emitted at runtime, I really don't think that a
disabled-by-default warning is going to change a whole lot. Sure, the
major packages will likely see it and update their code, but lots of
smaller specialty packages and independent developers won't see it in
3.9. The bulk of the change isn't going to happen until we go to
SyntaxWarning, so why not just get it over with instead of dragging it
out for three years?
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CWM2KO5IA24UCBSAYJP735EYKXIRRQRG/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Eric V. Smith


On 8/9/2019 2:28 PM, Jonathan Goble wrote:

On Fri, Aug 9, 2019 at 12:34 PM Nick Coghlan  wrote:

I find the "Our deprecation warnings were even less visible than
normal" argument for extending the deprecation period compelling.

Outsider's 2 cents from reading this discussion (with no personal
experience with this warning):

I am perplexed at the opinion, seemingly espoused by multiple people
in this thread, that because a major part of the problem is that the
warnings were not visible enough, somehow the proposed solution is
making them not visible enough again? It's too late, in my
understanding, in the 3.8 cycle to add a new feature like a change to
how these warnings are produced (it seems a significant change to the
.pyc structure is needed to emit them at runtime), so this supposed
"solution" is nothing but kicking the can down the road. When 3.9
rolls around, public exposure to the problem of invalid escape
sequences will still be approximately what it is now (because if
nobody saw the warnings in 3.7, they certainly won't see them in 3.8
with this "fix"), so you'll end up with the same complaints about
SyntaxWarning that started this discussion, end up back on
DeprecationWarning for 3.9 (hopefully with support for emitting them
at runtime instead of just compile-time), then have to wait until
3.10/4.0 for SyntaxWarning and eventually the next version to actually
make them errors.


Yes, I think that's the idea: Deprecation warning in 3.9, but more 
visible than what 3.7 has. That is, not just at compile time but at run
time. What's required to make that happen is an open question.



It seems to me, in my humble but uneducated opinion, that if people
are not seeing the warnings, then continuing to give them warnings
they won't see isn't a solution to anything. Put the warning front and
center. The argument of third-party packages will always be an issue,
even if we wait ten years. So put these warnings front and center now
so package and code maintainers actually see it, and I'll bet the
problematic escape sequences get fixed rather quickly.

What am I missing here?


Hopefully the warnings in 3.9 would be more visible than what we saw in
3.7, so that library authors can take notice and do something about it 
before 3.10 rolls around.


Eric
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GGZY7B2WFHVXRQ7NVTHGC2F4L5RJIKDI/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Jonathan Goble

On Fri, Aug 9, 2019 at 12:34 PM Nick Coghlan  wrote:
>
> I find the "Our deprecation warnings were even less visible than
> normal" argument for extending the deprecation period compelling.

Outsider's 2 cents from reading this discussion (with no personal
experience with this warning):

I am perplexed at the opinion, seemingly espoused by multiple people
in this thread, that because a major part of the problem is that the
warnings were not visible enough, somehow the proposed solution is
making them not visible enough again? It's too late, in my
understanding, in the 3.8 cycle to add a new feature like a change to
how these warnings are produced (it seems a significant change to the
.pyc structure is needed to emit them at runtime), so this supposed
"solution" is nothing but kicking the can down the road. When 3.9
rolls around, public exposure to the problem of invalid escape
sequences will still be approximately what it is now (because if
nobody saw the warnings in 3.7, they certainly won't see them in 3.8
with this "fix"), so you'll end up with the same complaints about
SyntaxWarning that started this discussion, end up back on
DeprecationWarning for 3.9 (hopefully with support for emitting them
at runtime instead of just compile-time), then have to wait until
3.10/4.0 for SyntaxWarning and eventually the next version to actually
make them errors.

It seems to me, in my humble but uneducated opinion, that if people
are not seeing the warnings, then continuing to give them warnings
they won't see isn't a solution to anything. Put the warning front and
center. The argument of third-party packages will always be an issue,
even if we wait ten years. So put these warnings front and center now
so package and code maintainers actually see it, and I'll bet the
problematic escape sequences get fixed rather quickly.

What am I missing here?
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6ZBX2PULRGIRUBQ735ONGV2RZU2LP3WQ/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Paul Moore

On Fri, 9 Aug 2019 at 17:55, Steve Dower  wrote:
> > * change the SyntaxWarning into a default-silenced one that fires every 
> > time a .pyc is loaded (this is the hard part, but it's doable)
> > * change pathlib.PureWindowsPath, os.fsencode and os.fsdecode to explicitly 
> > warn when the path contains control characters
> > * change the PyErr_SetExcFromWindowsErrWithFilenameObjects function to 
> > append (or chain) an extra message when either of the filenames contains 
> > control characters (or change OSError to do it, or the default 
> > sys.excepthook)

The second and third parts of this seem like they are both independent
of the first, and useful improvements in their own right.

Paul
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6L65KDXMRTTLHX7HWAU4WLRMHEH7GXFA/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Steve Dower


On 09Aug2019 0905, Serhiy Storchaka wrote:

09.08.19 18:30, Guido van Rossum wrote:
This discussion looks like there's no end in sight. Maybe the Steering 
Council should take a vote?


Possible options:

1. SyntaxWarning in 3.8+ (the current status).
2. DeprecationWarning in 3.8, SyntaxWarning in 3.9+ (revert changes in
   3.8 only).
3. DeprecationWarning in 3.8 and 3.9 (revert changes in master and 3.8).
4. No warnings at all.


I also posted another possible option that helps solve the real problem 
faced by users, and not just the "we want to have a warning" problem 
that is purely ours.



* change the SyntaxWarning into a default-silenced one that fires every time a 
.pyc is loaded (this is the hard part, but it's doable)
* change pathlib.PureWindowsPath, os.fsencode and os.fsdecode to explicitly 
warn when the path contains control characters
* change the PyErr_SetExcFromWindowsErrWithFilenameObjects function to append
(or chain) an extra message when either of the filenames contains control
characters (or change OSError to do it, or the default sys.excepthook)
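To illustrate the "path contains control characters" idea from the list above, here is a small sketch. It is not how pathlib or os behave today; `control_chars_in` is a hypothetical helper, and treating Unicode category "Cc" as "control character" is an assumption made for the example.

```python
import unicodedata


def control_chars_in(path):
    """Return the control characters found in `path` (hypothetical helper)."""
    return [ch for ch in path if unicodedata.category(ch) == "Cc"]


intended = r"C:\temp\new"   # raw string: backslashes are kept
mistyped = "C:\temp\new"    # \t and \n silently become TAB and newline

print(control_chars_in(intended))  # []
print(control_chars_in(mistyped))  # ['\t', '\n']
```

A check like this catches exactly the paths that produce no warning at compile time, which is why it would complement rather than replace the escape-sequence warning.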


Cheers,
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GZJPZ55OR2CIERO5Q4ETPZPAQZSFAEDD/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Nick Coghlan

On Sat, 10 Aug 2019 at 01:44, Guido van Rossum  wrote:
>
> This discussion looks like there's no end in sight. Maybe the Steering 
> Council should take a vote?

I find the "Our deprecation warnings were even less visible than
normal" argument for extending the deprecation period compelling.

I also think the UX of the warning itself could be reviewed to provide
a more explicit nudge towards using raw strings when folks want to
allow arbitrary embedded backslashes. Consider:

SyntaxWarning: invalid escape sequence \,

vs something like:

SyntaxWarning: invalid escape sequence \, (Note: adding the raw
string literal prefix, r, will accept all non-trailing backslashes)

After all, the habit we're trying to encourage is "If I want to
include backslashes without escaping them all, I should use a raw
string", not "I should memorize the entire set of valid escape
sequences" or even "I should always escape backslashes".
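As a small illustration of that habit (an example chosen here, not one taken from the thread), a regular expression written as a raw string is identical to the fully escaped version:

```python
import re

# Without the r prefix every backslash must be doubled.
escaped = "\\d+\\.\\d+"
raw = r"\d+\.\d+"
print(escaped == raw)  # True

# The raw form is used directly as a pattern.
match = re.fullmatch(raw, "3.8")
print(match is not None)  # True
```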

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/N7Q4R3GX5RBF3FPGWMMKWYB4LOI7GVOC/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Serhiy Storchaka


09.08.19 18:30, Guido van Rossum wrote:
This discussion looks like there's no end in sight. Maybe the Steering 
Council should take a vote?


Possible options:

1. SyntaxWarning in 3.8+ (the current status).
2. DeprecationWarning in 3.8, SyntaxWarning in 3.9+ (revert changes in
   3.8 only).
3. DeprecationWarning in 3.8 and 3.9 (revert changes in master and 3.8).
4. No warnings at all.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TRZHNEITUZTDEEHSFWV5SUEXNRHTU3KQ/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Guido van Rossum

This discussion looks like there's no end in sight. Maybe the Steering
Council should take a vote?

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him/his **(why is my pronoun here?)*

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5BYIHT2BV7TPDHP6F5W44K4JKN5PHQQ3/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread brian . skinn

> This whole thread would be an excellent justification for following 3.9
> with 4.0. It's as near as we ever want to get to a breaking change, and a
> major version number would indicate the need to review. If increasing
> strictness of escape code interpretation in string literals is the only
> incompatibility there would surely be general delight.
> 
> Kind regards,
> Steve Holden

I rather doubt that allowing breaking changes into a Python 4.0 would end up 
with this as the only proposed incompatibility. Once word got out, a flood of 
incompat requests would probably get raised. I personally have a change I'd 
like made to doctest (https://bugs.python.org/issue36714), and I know of 
another in argparse (https://bugs.python.org/issue33109) that I'm personally 
neutral on but that others have stronger feelings about.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SEBJF7C7RRG3Q3MFD5D6CTOFZUX7DNSE/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-09 Thread Chris Angelico

On Fri, Aug 9, 2019 at 11:22 PM Steven D'Aprano  wrote:
>
> And this change won't fix that, because *good* paths that currently work
> today will fail in the future, but *bad* paths that silently do the
> wrong thing will continue to silently do the wrong thing.

Except that many paths can be both "good" and "bad", because paths
have multiple components. So the warning has a VERY high probability
of happening.

But I've given up on this debate. No more posts from me. Some things
aren't worth fighting for. With the number of words posted in this
thread saying "we need convenience, not correctness", I'm done
arguing.

ChrisA
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/K4T7V5Z5GSGGK7HO73ZMFGTTAGMKHDE3/

[Python-Dev] Re: What to do about invalid escape sequences

On Wed, Aug 07, 2019 at 07:47:45PM +1000, Chris Angelico wrote:
> On Wed, Aug 7, 2019 at 7:33 PM Steven D'Aprano  wrote:
> > What's the rush? Let's be objective here: what benefit are we going to
> > get from this change? Is there anyone hanging out desperately for "\d"
> > and "\-" to become SyntaxErrors, so they can... do what?
> 
> So that problems can start to be detected. Time and again, Python
> users on Windows get EXTREMELY confused by the way their code worked
> perfectly with one path, then bizarrely fails with another. That is a
> very real problem, and the problem is that it appeared to work when
> actually it was wrong.

And this change won't fix that, because *good* paths that currently work 
today will fail in the future, but *bad* paths that silently do the 
wrong thing will continue to silently do the wrong thing.

py> filename = "location\data"  # will work correctly
<stdin>:1: SyntaxWarning: invalid escape sequence \d

py> filename = "location\temp"  # doesn't work as expected, but no error
py>

Effectively, we are hoping that Windows users will infer from the 
failure of "\d" (say) that they shouldn't use "\t" even though it 
doesn't raise. Perhaps some of them will, but I maintain we're talking 
about a small, incremental improvement, not something that will once and 
for all fix the problem.
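The silent case can be seen by comparing lengths. This sketch uses only the `\t` path, since that one is accepted without any warning:

```python
windows_path = "location\temp"   # \t is a valid escape: a TAB character
raw_path = r"location\temp"      # raw string keeps the backslash
posix_style = "location/temp"    # forward slashes need no escaping

print(len(windows_path))     # 12
print(len(raw_path))         # 13
print("\t" in windows_path)  # True
```

The one-character difference is the whole bug: the first string contains a TAB where the directory separator was meant to be.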

I don't think this is a benefit for users of any operating system except 
Windows users. For Linux, Unix, Mac users, one could argue strongly that 
we're making the string escape experience a tiny bit *worse*, not 
better. Raymond's example of ASCII art for example.

I think the subset of users that this will help is quite small:

- users on Windows;

- who haven't read or paid attention to the innumerable recommendations
  on the web and the documentation that they always use forward slashes
  in paths;

- who happen to use an escape like \d rather than \t;

- and will read and understand the eventual SyntaxWarning/Error;

- and infer from that error that they should change their path to use
  forward slashes instead of backslashes;

- and all this happens *before* they get bitten by the \t problem and
  they learn the hard way not to use backslashes in paths.

I'm not saying this isn't worth doing. I'm saying it's a small benefit 
that *right now* is a lot less than the cost to library authors and users.

> Python has a history of fixing these problems. It used to be that
> b"\x61\x62\x63\x64" was equal to u"abcd", but now Python sees these as
> fundamentally different.

Yes, and we fixed that over a 10+ year period involving no fewer than 
three full releases in the Python 2.x series and eight full releases in 
the Python 3.x series, and the transition period is not over yet since 
2.7 is not yet EOLed.

> Data-dependent bugs caused by a syntactic
> oddity are a language flaw that needs to be fixed.

There is always a tradeoff between the severity of the flaw and how much 
pain we are willing to accept to fix it. I think Raymond has made a good 
case that in this instance, the pain of fixing it *now* is greater than 
the benefit.

(I don't think he has made the case to reverse the deprecation
altogether.)

If the benefit versus pain never moves into the black, then we should 
keep the status quo indefinitely, like any other language wart or 
misfeature we're stuck with due to backwards compatibility.

("Although never is often better than *right* now.")

But having said that, I'm confident that given an improved deprecation 
process that makes it easier for library authors to see the warning 
before end-users, we will be able to move forward in a release or two.

> > Because our processes don't work the way we assumed, it turns out that
> > in practice we haven't given developers the deprecation period we
> > thought we had. Read Nathaniel's post, if you haven't already done so:
> >
> > https://mail.python.org/archives/list/python-dev@python.org/message/E7QCC74OBYEY3PVLNQG2ZAVRO653LD5K/
> >
> > He makes a compelling case that while we might have had the promised
> > deprecation period by the letter of the law, in practice most developers
> > will have never seen it, and we will be breaking the spirit of the
> > promise if we continue with the unmodified plan.
> 
> Yes, that's a fair complaint. But merely pushing the deprecation back
> by a version is not solving it. There has to be SOMETHING done
> differently.

"We must do SOMETHING!!! This is something, therefore we must do it!!!"

I agree that we ought to fix the problem with the deprecation warnings.

What I don't agree with is the demand that unless I can give a fix for 
the deprecation warning issue *right now* we must stay the course no 
matter how annoying and painful it is for users and library authors.

> > And yet here we are rushing through a breaking change in an accelerated
> > manner, for a change of marginal benefit.
> 
> It's not a marginal benefit. For people who try to teach Python on
> multiple operating systems, this

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-08 Thread Stephen J. Turnbull

Steve Holden writes:

 > This whole thread would be an excellent justification for following 3.9
 > with 4.0. It's as near as we ever want to get to a breaking change, and a
 > major version number would indicate the need to review. If increasing
 > strictness of escape code interpretation in string literals is the only
 > incompatibility there would surely be general delight.

This should be the first chapter in the Beautiful Version Numbering
book!  I love it!
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LY2RX4ROGH54IU57RO7Y2O6IDDV5LUBG/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-08 Thread Jim J. Jewett

FWIW, the web archive 
https://mail.python.org/archives/list/python-dev@python.org/thread/ZX2JLOZDOXWVBQLKE4UCVTU5JABPQSLB/
 does not seem to display the problems ... apparently the individual messages 
are not included in view source, and are cleaned up for chrome's inspect.  I'm 
not sure whether that counts as a bug in the archiving or not.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UYSBJFII467TKNA2SDYCJZUQFLCGEAKY/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-08 Thread Eric V. Smith

On 8/5/2019 4:30 PM, raymond.hettin...@gmail.com wrote:

Thanks for weighing in. I think this is an important usability discussion.
IMO it is the number one issue affecting the end user experience with this
release. If we could get more people to actively use the beta release, the
issue would stand-out front and center. But if people don't use the beta in
earnest, we won't have confirmation until it is too late.

We really don't have to go this path. Arguably, the implicit conversion of
'\latex' to '\\latex' is a feature that has existed for three decades, and now
we're deciding to turn it off to define existing practices as errors. I don't
think any commercial product manager would allow this to occur without a lot of
end user testing.

As much as I'd love to force this change through [0], it really does
seem like we're forcing it. Especially given Nathaniel's point about the
discoverability problems with compile-time warnings, I think we should
delay a visible warning about this. Possibly in 3.9 we can do something
about making these warnings visible at run time, not just compile time.
I had a similar problem with f-strings (can't recall the details now,
since resolved), and the compile-time-only nature made it difficult to
notice. I realize a run-time warning for this would require a fair bit
of work that might not be worth it.

I think Raymond's point goes beyond this. I think he's proposing that we
never make this change. I'm sympathetic to that, too. But the first step
is to change 3.8's behavior to not make this visible. That is, we should
restore the 3.7 warning behavior.

Eric

[0] And the real reason I'd like this is so we can add \e
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/UADZYIYTPGNRELG477F3SSRB3K7R2J75/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-08 Thread Jeroen Demeyer

> When you take a text string and create a string literal to represent
> it, sometimes you have to modify it to become syntactically valid.

Even simpler: use r""" instead of """

The only case where that won't work is when you need actual escape
sequences. But I find this very rare in practice for triple-quoted
strings.
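A short sketch of the difference the prefix makes for a docstring. The function names and the LaTeX snippet are invented for illustration; `\f` is used because it is a valid escape and so produces no warning at all.

```python
def plain():
    """Render \frac{a}{b} nicely."""   # \f becomes a form feed here


def raw():
    r"""Render \frac{a}{b} nicely."""  # backslash preserved


print("\f" in plain.__doc__)    # True
print("\\frac" in raw.__doc__)  # True
```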
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LV5STHINBEREK2Y43OQLFUOBQPN2AXZC/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-08 Thread Christian Tismer

Hey friends,

This is IMHO a great idea.
If a package claims to be Python 3.8 compatible, then it has to
be correct concerning invalid escapes.

A new pip version could perhaps even refuse packages with such
literals when they claim to support Python 3.8.

But how can it actually happen that a pre-3.8 package gets installed
when you install Python 3.8? Does pip allow installation without
a section that defines the allowed versions?

Ok, maybe packages are claimed for Python 3.8 and not further checked.

But let's assume the third-party things that Raymond sees do _not_
come from pip, but elsewhere. Pre-existing stuff that is somehow copied
into the newer Python version? Sure, quite possible!

But then it is quite likely that those third-party things still
have their creation date from pre-3.8 time.
What about the simple heuristic that a Python module with a creation
date earlier than xxx does simply not issue the annoying warning?

Maybe that already cures the disease in enough cases?

just a wild idea - \leave \old \code \untouched -- ciao - \Chris

On 06.08.19 18:59, Neil Schemenauer wrote:
> 
> Making it an error so soon would be a mistake, IMHO.  That will break
> currently working code for small benefit.  When Python was a young
> language with a few thousand users, it was easier to make these
> kinds of changes.  Now, we should be much more conservative and give
> people a long time and a lot of warning.  Ideally, we should provide
> tools to fix code if possible.
> 
> Could PyPI and pip gain the ability to warn and even fix these
> issues?  Having a warning from pip at install time could be better
> than a warning at import time.  If linting was built into PyPI, we
> could even do a census to see how many packages would be affected by
> turning it into an error.
> 
> On 2019-08-05, raymond.hettin...@gmail.com wrote:
>> P.S. In the world of C compilers, I suspect that if the relatively
>> new compiler warnings were treated as errors, the breakage would
>> be widespread. Presumably that's why they haven't gone down this
>> road.
> 
> The comparison with C compilers is relevant.  C and C++ represent a
> fairly extreme position on not breaking working code.   E.g. K & R
> style function declarations were supported for decades.  I don't
> think we need to go quite that far but also one or two releases is
> not enough time.
> 
> Regards,
> 
>   Neil
> ___
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/V2EDFDJGXRIDMKJU3FKIWC2NDLMUZA2Y/
> 

-- 
Christian Tismer :^)   tis...@stackless.com
Software Consulting  : http://www.stackless.com/
Karl-Liebknecht-Str. 121 : https://github.com/PySide
14482 Potsdam: GPG key -> 0xFB7BEE0E
phone +49 173 24 18 776  fax +49 (30) 700143-0023
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XVJCYXDZ7VPMMCTP2BPNAJ3OO7S4II4V/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-08 Thread Dima Tisnek

These two ought to be converted to raw strings, shouldn't they?

On Thu, 8 Aug 2019 at 08:04,  wrote:
>
> For me, these warnings are continuing to arise almost daily.  See two recent 
> examples below.  In both cases, the code previously had always worked without 
> complaint.
>
> - Example from yesterday's class 
>
> ''' How old-style formatting works with positional placeholders
>
> print('The answer is %d today, but was %d yesterday' % (new, old))
>  \o
>   \o
> '''
>
> SyntaxWarning: invalid escape sequence \-
>
> - Example from today's class 
>
> # Cut and pasted from:
> # https://en.wikipedia.org/wiki/VCard#vCard_2.1
> vcard = '''
> BEGIN:VCARD
> VERSION:2.1
> N:Gump;Forrest;;Mr.
> FN:Forrest Gump
> ORG:Bubba Gump Shrimp Co.
> TITLE:Shrimp Man
> PHOTO;GIF:http://www.example.com/dir_photos/my_photo.gif
> TEL;WORK;VOICE:(111) 555-1212
> TEL;HOME;VOICE:(404) 555-1212
> ADR;WORK;PREF:;;100 Waters Edge;Baytown;LA;30314;United States of America
> LABEL;WORK;PREF;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:100 Waters Edge=0D=
>  =0ABaytown\, LA 30314=0D=0AUnited States of America
> ADR;HOME:;;42 Plantation St.;Baytown;LA;30314;United States of America
> LABEL;HOME;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:42 Plantation St.=0D=0A=
>  Baytown, LA 30314=0D=0AUnited States of America
> EMAIL:forrestg...@example.com
> REV:20080424T195243Z
> END:VCARD
> '''
>
> SyntaxWarning: invalid escape sequence \,
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/P5YTWGKVSR5EFTHHUKOXW32CBEUYIRW2/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-08 Thread Dima Tisnek

I feel this is one of the cases where we're expecting early adopters
to proactively post pull requests against affected libraries. Failing
that, opening issues against affected libraries.

I was ready to do just that, but alas didn't even have to!
Matt's analysis shows that it's now too hard.

What was hard for me were the rules. In fact, not being up to date, I
couldn't even find the PEP that specified the change.

What the Python devs could do is guide users on how to update existing code.
Something like python3.8 -c 'print(repr("\b\l\a\h"))' but with sensible output.
And instructions for those who support both py2 and py3 from the same codebase.
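
One way such guidance could look, as a rough sketch rather than an official tool: compile the source with warnings recorded and list the lines that trigger the invalid-escape warning. The helper name find_invalid_escapes is made up for this example; compile() and the warnings module are standard library. It assumes a Python version where invalid escapes still produce a warning rather than an error.

```python
import warnings


def find_invalid_escapes(source, filename="<string>"):
    """Return (line number, message) pairs for invalid escapes in source.

    Hypothetical helper: it relies on the compiler emitting a
    DeprecationWarning or SyntaxWarning (depending on the Python
    version) whose text mentions "invalid escape sequence".
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record even default-silenced warnings
        compile(source, filename, "exec")
    return [
        (w.lineno, str(w.message))
        for w in caught
        if "invalid escape sequence" in str(w.message)
    ]


# Raw strings are used here so that this example itself stays warning-free.
sample = r'pattern = "\d+"' + "\n"
for lineno, message in find_invalid_escapes(sample):
    print(f"line {lineno}: {message}")
```

The usual fixes for a reported line are to make the literal raw (r"\d+") or to double the backslash ("\\d+"); both spellings also work on older Python versions, so they suit code bases that support several versions at once.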

I could hope for a feature in psf/black, but maybe that's not for everyone.
Just my 2c :)

On Mon, 5 Aug 2019 at 13:30,  wrote:
>
> We should revisit what we want to do (if anything) about invalid escape 
> sequences.
>
> For Python 3.8, the DeprecationWarning was converted to a SyntaxWarning which 
> is visible by default.  The intention is to make it a SyntaxError in Python 
> 3.9.
>
> This once seemed like a reasonable and innocuous idea to me; however, I've 
> been using the 3.8 beta heavily for a month and no longer think it is a good 
> idea.  The warning crops up frequently, often due to third-party packages 
> (such as docutils and bottle) that users can't easily do anything about.  And 
> during live demos and student workshops, it is especially distracting.
>
> I now think our cure is worse than the disease.  If code currently has a 
> non-raw string with '\latex', do we really need Python to yelp about it (for 
> 3.8) or reject it entirely (for 3.9)?   If someone can't remember exactly 
> which special characters need to be escaped, do we really need to stop them 
> in their tracks during a data analysis session?  Do we really need to reject 
> ASCII art in docstrings: ` \---> special case'?
>
> IIRC, the original problem to be solved was false positives rather than false 
> negatives:  filename = '..\training\new_memo.doc'.  The warnings and errors 
> don't do (and likely can't do) anything about this.
>
> If Python 3.8 goes out as-is, we may be punching our users in the nose and 
> getting almost no gain from it.  ISTM this is a job best left for linters.  
> For a very long time, Python has been accepting the likes of 'more \latex 
> markup' and has been silently converting it to 'more \\latex markup'.  I now 
> think it should remain that way.  This issue in the 3.8 beta releases has 
> been an almost daily annoyance for me and my customers. Depending on how you 
> use Python, this may not affect you or it may arise multiple times per day.
>
>
> Raymond
>
> P.S.  Before responding, it would be a useful exercise to think for a moment 
> about whether you remember exactly which characters must be escaped or 
> whether you habitually put in an extra backslash when you aren't sure.  Then 
> see:  https://bugs.python.org/issue32912
> ___
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/ZX2JLOZDOXWVBQLKE4UCVTU5JABPQSLB/
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/F2ZIHAT2EIWM5IOJFP2THGUOSFZJ3Z2W/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-08 Thread Glenn Linderman


On 8/7/2019 6:13 PM, raymond.hettin...@gmail.com wrote:

> This isn't about me.  As a heavy user of the 3.8 beta, I'm just the canary in
> the coal mine.

Are you, with an understanding of the issue, submitting bug reports on 
the issues you find, thus helping to alleviate the problem, and educate 
the package maintainers?


Or are you just carping here?

I'll apologize in advance for using the word "carping" if the answer to 
my first question is yes. :)


Glenn
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/N26WJ2BCYT7CPFRHZGLQKILDCCKDTV5N/

[Python-Dev] Re: What to do about invalid escape sequences


08.08.19 07:55, Toshio Kuratomi wrote:

> Like the Ansible feature, though, the problem is that over time we've
> discovered that it is hard to educate users about the exact
> characteristic of the feature (\k == k but \n == newline;

No, \k == \\k. This differs from most other programming languages.
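
A short check of that correction, as a sketch: the helper literal_value is invented for this example, and it evaluates the literal from source text so that the demonstration itself does not contain an unrecognized escape. It assumes a Python version that still accepts such escapes (with or without a warning).

```python
import ast
import warnings


def literal_value(literal_source):
    """Evaluate a string literal given as source text.

    Warnings are silenced so the demonstration runs quietly on versions
    that only warn about unrecognized escapes.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return ast.literal_eval(literal_source)


# Unrecognized escape: the backslash is kept, giving two characters.
assert literal_value(r'"\k"') == "\\k"
assert len(literal_value(r'"\k"')) == 2

# Recognized escape: backslash plus n collapses to a single newline.
assert literal_value(r'"\n"') == "\n"
assert len(literal_value(r'"\n"')) == 1
```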
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7OTMWGJOMXT6F6NONVSL2WLFG3VPP4B6/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread Toshio Kuratomi

On Mon, Aug 5, 2019 at 6:47 PM  wrote:
>
> I wish people with more product management experience would chime in; 
> otherwise, 3.8 is going to ship with an intentional hard-to-ignore annoyance 
> on the premise that we don't like the way people have been programming and 
> that they need to change their code even if it was working just fine.
>

I was resisting weighing in since I don't know the discussion around
deprecating this language feature in the first place (other than
what's given in this thread).  However, in the product I work on we
made a very similar change in our last release so I'll throw it out
there for people to take what they will from it.

We have a long standing feature which allows people to define groups
of hosts and give them a name.  In the past that name could include
dashes, dots, and other characters which are not legal as Python
identifiers.  When users use those group names in our "DSL" (not truly
a DSL but close enough), they can do it using either dictionary-lookup
syntax (groupvars['groupname']) or using dotted attribute notation
groupvars.groupname.  We also have a longstanding problem where users
will try to do something like groupvars.group-name using the
dotted attribute notation with group names that aren't proper python
identifiers.  This causes problems as the name then gets split on the
characters that aren't legal in identifiers and results in something
unexpected (undefined variable, an actual subtraction operation, etc).
In our last release we decided to deprecate and eventually make it
illegal to use non-python-identifiers for the group names.
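
The product's "DSL" is not Python, so the following is only an analogy: plain Python's own grammar shows the same split between the two notations. The names groupvars and group-name come from the paragraph above; the host list is made-up data.

```python
import ast

# Dotted access followed by "-name" parses as a subtraction, not as a
# lookup of an attribute called "group-name".
tree = ast.parse("groupvars.group-name", mode="eval")
assert isinstance(tree.body, ast.BinOp)
assert isinstance(tree.body.op, ast.Sub)

# Subscript syntax accepts any string key, so the same name is fine there.
groupvars = {"group-name": ["host1", "host2"]}  # made-up data
assert groupvars["group-name"] == ["host1", "host2"]
assert not "group-name".isidentifier()
```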

At first, product management *did* let us get away with this.  But
after some time and usage of the pre-releases, they came to realize
that this was a major problem.  Users had gotten used to being able
to use these characters in their group names.  They had defined their
group names and gotten used to typing their group names and built up a
whole body of playbooks that used these group names.

Product management still let us get away with this... sort of. The
scope of the change was definitely modified.  Users were now allowed
to select whether invalid group names were disallowed (so they could
port their installations), allowed with a warning (presumably so they
could do work but also see that they were affected) or allow without a
warning (presumably because they knew not to use these group names
with dotted attribute notation).  This feature was also no longer
allowed to be deprecated... We could have a warning that said "Don't
do this" but not remove the feature in the future.

Now... I said this was a config option.  So what we do have in the
release is that the config option allows but warns by default and *the
config option* has a deprecation warning.  You see... we're planning
on changing from warn by default now to disallowing by default in the
future so the deprecation is flagging the change in config value.

And you know what?  Users absolutely hate this.  They don't like the
warning.  They don't like the implication that they're doing something
wrong by using a long-standing feature.  They don't like that we're
going to change the default so that their current group names will
break.  They dislike that it's being warned about because of
attribute-lookup-notation which they can just learn not to use with
their group names.  They dislike this so much that some of us have
talked about abandoning this idea... instead, having a public group
name that users use when they write in the "DSL" and an internal group
name that we use when evaluating the group names. Perhaps that works,
perhaps it doesn't, but I think that's where my story starts being
specific to our feature and no longer applicable to Python and escape
sequences.

Now like I said, I don't know the discussions that led to invalid
escape sequences being deprecated so I don't know whether there's more
compelling reasons for doing it but it seems to me that there's even
less to gain by doing this than what we did in Ansible.  The thing
Ansible is complaining about can do the wrong thing when used in
conjunction with certain other features of our "DSL".  The things that
the Python escape sequence warning is complaining about are never invalid (As
was pointed out, it's complaining when a sequence of two characters
will do what the user intended rather than complaining when a sequence
of two characters will do something that the user did not intend).
Like the Ansible feature, though, the problem is that over time we've
discovered that it is hard to educate users about the exact
characteristic of the feature (\k == k but \n == newline;
groupvars['group-name']  works but groupvars.group-name does not) so
we've both given up on continuing to educate the users in favor of
attempting to nanny the user into not using the feature.  That most
emphatically has not worked for us and has spent a bunch of goodwill
with our users but the python userbase is not

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread raymond . hettinger

This isn't about me.  As a heavy user of the 3.8 beta, I'm just the canary in 
the coal mine.

After many encounters with these warnings, I'm starting to believe that
Python's long-standing behavior was convenient for users.  Effectively, "\-"
wasn't an error, it was just a way of writing "\\-". For the most part, that
worked out fine. Sure, we've all seen interactive prompt errors from having \t in
a pathname but not in production (likely because a FileNotFoundError would
surface immediately).
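
For reference, the pathname case looks like this in practice. The path is the one used earlier in this thread; the sketch only shows that the literal compiles without any warning while holding different characters than intended.

```python
# Both escapes in this literal are valid, so it compiles silently but
# contains a tab and a newline instead of backslashes.
broken = '..\training\new_memo.doc'
assert "\t" in broken and "\n" in broken
assert "\\" not in broken

# A raw string (or doubled backslashes) keeps the intended path.
fixed = r'..\training\new_memo.doc'
assert fixed == '..\\training\\new_memo.doc'
assert fixed.count("\\") == 2
```

This is the case the invalid-escape warning cannot catch, since nothing in the literal is invalid.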
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4YNZYCOBWGMLC6BDXQFJJWLXEK47I5PU/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread MRAB


On 2019-08-07 23:43, Steve Holden wrote:
> This whole thread would be an excellent justification for following 3.9
> with 4.0. It's as near as we ever want to get to a breaking change, and
> a major version number would indicate the need to review. If increasing
> strictness of escape code interpretation in string literals is the only
> incompatibility there would surely be general delight.



I can think of another possible one: import * requires __all__.

[snip]
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WQEHMFMR7IRWYDXDSCZUGJKGDI5HNEDK/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread Chris Angelico

On Thu, Aug 8, 2019 at 8:58 AM  wrote:
>
> For me, these warnings are continuing to arise almost daily.  See two recent 
> examples below.  In both cases, the code previously had always worked without 
> complaint.
>
> - Example from yesterday's class 
>
> ''' How old-style formatting works with positional placeholders
>
> print('The answer is %d today, but was %d yesterday' % (new, old))
>  \o
>   \o
> '''
>
> SyntaxWarning: invalid escape sequence \-

I've no idea why this is even a string literal, but if it absolutely
has to be, then you could use a character other than backslash.

> - Example from today's class 
>
> # Cut and pasted from:
> # https://en.wikipedia.org/wiki/VCard#vCard_2.1
> vcard = '''
> ...
> LABEL;WORK;PREF;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:100 Waters Edge=0D=
>  =0ABaytown\, LA 30314=0D=0AUnited States of America
> ...
> '''
>
> SyntaxWarning: invalid escape sequence \,

When you take a text string and create a string literal to represent
it, sometimes you have to modify it to become syntactically valid.
This is exactly the sort of thing that SHOULD be being warned about,
because it's sometimes going to work and sometimes not, depending on
the exact data you're working with. Please don't teach people the
habit of pretending that the backslash isn't significant.

If the warning were changed to be silent for 3.8, what would you do
differently? How would having extra time to solve this problem help
you?

ChrisA
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WCNY3C7VBLCP5RDKKVMMEMN7R26GK2FI/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread raymond . hettinger

For me, these warnings are continuing to arise almost daily.  See two recent 
examples below.  In both cases, the code previously had always worked without 
complaint.

- Example from yesterday's class 

''' How old-style formatting works with positional placeholders

print('The answer is %d today, but was %d yesterday' % (new, old))
 \o
  \o
'''
   
SyntaxWarning: invalid escape sequence \-

- Example from today's class 

# Cut and pasted from: 
# https://en.wikipedia.org/wiki/VCard#vCard_2.1
vcard = '''
BEGIN:VCARD
VERSION:2.1
N:Gump;Forrest;;Mr.
FN:Forrest Gump
ORG:Bubba Gump Shrimp Co.
TITLE:Shrimp Man
PHOTO;GIF:http://www.example.com/dir_photos/my_photo.gif
TEL;WORK;VOICE:(111) 555-1212
TEL;HOME;VOICE:(404) 555-1212
ADR;WORK;PREF:;;100 Waters Edge;Baytown;LA;30314;United States of America
LABEL;WORK;PREF;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:100 Waters Edge=0D=
 =0ABaytown\, LA 30314=0D=0AUnited States of America
ADR;HOME:;;42 Plantation St.;Baytown;LA;30314;United States of America
LABEL;HOME;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:42 Plantation St.=0D=0A=
 Baytown, LA 30314=0D=0AUnited States of America
EMAIL:forrestg...@example.com
REV:20080424T195243Z
END:VCARD
'''

SyntaxWarning: invalid escape sequence \,
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OYGRL5AWSJZ34MDLGIFTWJXQPLNSK23S/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread Steve Holden

This whole thread would be an excellent justification for following 3.9
with 4.0. It's as near as we ever want to get to a breaking change, and a
major version number would indicate the need to review. If increasing
strictness of escape code interpretation in string literals is the only
incompatibility there would surely be general delight.

Kind regards,
Steve Holden


On Wed, Aug 7, 2019 at 8:19 PM eryk sun  wrote:

> On 8/7/19, Steve Dower  wrote:
> >
> > * change the PyErr_SetExcFromWindowsErrWithFilenameObjects function to
> > append (or chain) an extra message when either of the filenames contains
> > control characters (or change OSError to do it, or the default
> > sys.excepthook)
>
> On a related note for Windows, if the error is specifically
> ERROR_INVALID_NAME, we could extend this to look for and warn about
> the five reserved wildcard characters (asterisk, question mark, double
> quote, less than, greater than), pipe, and colon. It's only sometimes
> the case for colon because it's allowed in device names and used as
> the name and type delimiter for stream names.
>
> Kernel object names don't reserve wildcard characters, pipe, and
> colon. So I wouldn't want anything but the control-character warning
> if it's say ERROR_FILE_NOT_FOUND. An example would be
> SharedMemory(name='Global\test'), or a similar error for registry key
> and value names such as OpenKey(hkey, 'spam\test'), that is if winreg
> were updated to include the name in the exception. Note that forward
> slash is just a name character in these cases, not a path separator,
> so we have to use backslash, even if just via replace('/', '\\').
>
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KM2IVRWN5QPLCFHJ5FUWZ6XB7DW2VONS/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread eryk sun

On 8/7/19, Steve Dower  wrote:
>
> * change the PyErr_SetExcFromWindowsErrWithFilenameObjects function to
> append (or chain) an extra message when either of the filenames contains
> control characters (or change OSError to do it, or the default
> sys.excepthook)

On a related note for Windows, if the error is specifically
ERROR_INVALID_NAME, we could extend this to look for and warn about
the five reserved wildcard characters (asterisk, question mark, double
quote, less than, greater than), pipe, and colon. It's only sometimes
the case for colon because it's allowed in device names and used as
the name and type delimiter for stream names.

Kernel object names don't reserve wildcard characters, pipe, and
colon. So I wouldn't want anything but the control-character warning
if it's say ERROR_FILE_NOT_FOUND. An example would be
SharedMemory(name='Global\test'), or a similar error for registry key
and value names such as OpenKey(hkey, 'spam\test'), that is if winreg
were updated to include the name in the exception. Note that forward
slash is just a name character in these cases, not a path separator,
so we have to use backslash, even if just via replace('/', '\\').
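
The object-name case can be shown with plain strings, without touching any Windows API. This sketch only inspects the name that would be passed along; 'Global\test' and the replace() call are taken from the paragraph above.

```python
# Backslash-t is a recognized escape, so this name silently contains a tab.
name = 'Global\test'
assert name == "Global" + "\t" + "est"

# Writing the name with a forward slash and converting it, as suggested
# above, yields the intended backslash-separated name.
converted = 'Global/test'.replace('/', '\\')
assert converted == r'Global\test'
assert len(converted) == len(name) + 1
```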
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UFMVFL4QDUXLZFBWVW4YLAKPHQ6LTPDK/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread brian . skinn

Steven D'Aprano wrote:

> Because our processes don't work the way we assumed, it turns out that 
> in practice we haven't given developers the deprecation period we 
> thought we had. Read Nathaniel's post, if you haven't already done so:
> https://mail.python.org/archives/list/python-dev@python.org/message/E7QCC74O...
> He makes a compelling case that while we might have had the promised 
> deprecation period by the letter of the law, in practice most developers 
> will have never seen it, and we will be breaking the spirit of the 
> promise if we continue with the unmodified plan.
> ...
> I'm sure that the affected devs will understand why it was their fault 
> they couldn't see the warnings, when even people from a first-class 
> library like SymPy took four iterations to do it right.
> > Currently it
> > requires some extra steps or flags, which are not well known. What
> > change are you proposing for 3.8 that will ensure that this actually
> > gets solved?
> > Absolutely nothing. I don't have to: we're an entire community, this 
> doesn't have to fall only on my shoulders. I'm not even the messenger: 
> that's Raymond. I'm just (partly) agreeing with him.
> Just because I don't have a solution for this problem doesn't mean the 
> problem doesn't exist.

As the SymPy team has figured out the right pytest incantation to expose these 
warnings, perhaps a feature request on pytest to encapsulate that mix of 
options into a single flag would be a good idea?
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WOJQTOXMYKHLQO4KICEIZH3PDEMQLMBL/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread Steve Dower


On 07Aug2019 0247, Chris Angelico wrote:
> On Wed, Aug 7, 2019 at 7:33 PM Steven D'Aprano  wrote:
> > What's the rush? Let's be objective here: what benefit are we going to
> > get from this change? Is there anyone hanging out desperately for "\d"
> > and "\-" to become SyntaxErrors, so they can... do what?
>
> So that problems can start to be detected. Time and again, Python
> users on Windows get EXTREMELY confused by the way their code worked
> perfectly with one path, then bizarrely fails with another. That is a
> very real problem, and the problem is that it appeared to work when
> actually it was wrong.
> [...]
> If you can offer a better plan, then by all means, do so. But
> deferring without a change is of no real value, and it means ANOTHER
> eighteen months added onto the time before novice programmers get to
> be told about string literal problems.


Allow me to offer one:

* change the SyntaxWarning into a default-silenced one that fires every 
time a .pyc is loaded (this is the hard part, but it's doable)
* change pathlib.PureWindowsPath, os.fsencode and os.fsdecode to 
explicitly warn when the path contains control characters
* change the PyErr_SetExcFromWindowsErrWithFilenameObjects function to 
append (or chain) an extra message when either of the filenames contains 
control characters (or change OSError to do it, or the default 
sys.excepthook)
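
To make the second bullet concrete, here is a minimal sketch of the kind of check it describes. The function warn_on_control_chars is hypothetical and is not part of pathlib or os; the choice of RuntimeWarning and the wording of the message are assumptions as well.

```python
import warnings


def warn_on_control_chars(path):
    """Warn if path contains ASCII control characters, then return it.

    Hypothetical helper illustrating the idea; it is not part of
    pathlib or os.
    """
    found = sorted({ch for ch in path if ord(ch) < 32})
    if found:
        shown = ", ".join(repr(ch) for ch in found)
        warnings.warn(
            f"path {path!r} contains control characters ({shown}); "
            "was a raw string or a doubled backslash intended?",
            RuntimeWarning,
            stacklevel=2,
        )
    return path


# Demonstration: both escapes below are valid, so this literal really
# does contain a tab and a newline.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_on_control_chars('C:\temp\new')
for w in caught:
    print(w.message)
```

A check like this fires at run time on the actual data, which is why it can catch the '\t' and '\n' cases that a compile-time escape warning never sees.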


I don't care whether the changes are applied to all platforms rather 
than just Windows, but since Windows developers hit the problem and 
(some) Linux developers like to use control characters in filenames, I 
can see a justification for only warning on Windows.


Long term we can still deprecate and eventually block unrecognized 
escape sequences, but the long standing behaviour can stand for a few 
more years without creating more harm.


Cheers,
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2ICLCF5T53DBPVZPVHMT2XTXL64QF7WW/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread Joao S. O. Bueno

From what I can see, the majority of new users in an interactive environment
seeing the warning will do so because the incorrect string will be
in _their_ code. The benefits are immediate, as people change to either
using raw strings or using forward slashes for file paths.

The examples at the beginning of this thread, where someone changing
a file path to "C:\users" suddenly has broken code, speak for themselves:
this is a _fix_. Broken libraries will be fixed within weeks of a Py 3.8
release. People will either be using an old install, with Python 3.7, or
they keep everything up to date, and for those, after 2 months max, the
library warnings will be all but gone.

In the meantime, what is possible is to publicize more how to disable these
warnings on the end-user side, since we all agree that few people know how
to do that.
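
As one possible form of that advice, here is a sketch using the standard warnings module. The function name silence_invalid_escape_warnings is invented for the example. The filter has to be installed before the affected modules are compiled, and the message pattern is an assumption based on the warning text quoted in this thread.

```python
import warnings


def silence_invalid_escape_warnings():
    """Install filters that hide invalid-escape warnings.

    Hypothetical convenience wrapper around warnings.filterwarnings().
    Both categories are covered because the category used for this
    warning differs between Python versions.
    """
    for category in (DeprecationWarning, SyntaxWarning):
        warnings.filterwarnings(
            "ignore",
            message=".*invalid escape sequence",
            category=category,
        )


# Demonstration inside a temporary warnings context, so the filters do
# not leak into the rest of the program.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    silence_invalid_escape_warnings()
    compile(r'x = "\d"' + "\n", "<demo>", "exec")
print(len(caught))
```

Similar filtering is available without code changes through the interpreter's -W option or the PYTHONWARNINGS environment variable, which may be the easier route for end users who only run other people's code.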

On Wed, 7 Aug 2019 at 06:51, Chris Angelico  wrote:

> On Wed, Aug 7, 2019 at 7:33 PM Steven D'Aprano 
> wrote:
> > What's the rush? Let's be objective here: what benefit are we going to
> > get from this change? Is there anyone hanging out desperately for "\d"
> > and "\-" to become SyntaxErrors, so they can... do what?
>
> So that problems can start to be detected. Time and again, Python
> users on Windows get EXTREMELY confused by the way their code worked
> perfectly with one path, then bizarrely fails with another. That is a
> very real problem, and the problem is that it appeared to work when
> actually it was wrong.
>
> Python has a history of fixing these problems. It used to be that
> b"\x61\x62\x63\x64" was equal to u"abcd", but now Python sees these as
> fundamentally different. Data-dependent bugs caused by a syntactic
> oddity are a language flaw that needs to be fixed.
>
> > Because our processes don't work the way we assumed, it turns out that
> > in practice we haven't given developers the deprecation period we
> > thought we had. Read Nathaniel's post, if you haven't already done so:
> >
> >
> https://mail.python.org/archives/list/python-dev@python.org/message/E7QCC74OBYEY3PVLNQG2ZAVRO653LD5K/
> >
> > He makes a compelling case that while we might have had the promised
> > deprecation period by the letter of the law, in practice most developers
> > will have never seen it, and we will be breaking the spirit of the
> > promise if we continue with the unmodified plan.
>
> Yes, that's a fair complaint. But merely pushing the deprecation back
> by a version is not solving it. There has to be SOMETHING done
> differently.
>
> > And yet here we are rushing through a breaking change in an accelerated
> > manner, for a change of marginal benefit.
>
> It's not a marginal benefit. For people who try to teach Python on
> multiple operating systems, this is a very very real benefit. Just
> because YOU don't see the benefit doesn't mean it isn't there.
>
> > > Otherwise, all you're doing is saying "I wish this
> > > problem would just go away".
> >
> > No, I'm saying we don't have to rush this into 3.8. Let's keep the
> > warning silent and push everything back a release.
> >
> > Now is better than never.
> > Although never is often better than *right* now.
>
> Not sure how the Zen supports what you're saying there, since you're
> specifically saying "not never, not now, just later". But what do you
> actually mean by not rushing this into 3.8?
>
> > Right now, we're looking at a seriously compromised user-experience for
> > 3.8. People are going to hate these warnings, many of them won't know
> > what to do with them and will be sure that Python is buggy, and for very
> > little benefit.
>
> Then the problem is that people blame Python for these warnings. That
> is a problem to be solved; we need people to understand that a warning
> emitted by a library is a *library bug* not a language flaw.
>
> > > Library authors can start _right now_ fixing their code so it's more
> > > 3.8 compatible.
> >
> > Provided that (1) they are aware that this is a problem that needs to be
> > fixed, and (2) they have the round tuits to actually fix it by 3.8.0.
> > Neither are guaranteed.
>
> (1) Yes it is, see above; (2) fair point, but this is restricted to
> string literals and can be detected simply by compiling the code, so
> it's a reasonably findable problem.
>
> > > ("More" because 3.8 doesn't actually break anything.)
> > > What is actually gained by waiting longer
> >
> > We gain the avoidance of a painful experience in 3.8 for a significant
> > number of users and third-party devs.
> >
> > The question we haven't had answered is what we gain by pushing through
> > with the original plan. Plenty of people have said "Let's just do it"
> > but as far as I can see not one has explained *why* we should put end-
> > users and library developers through this frustrating and annoying
> > rushed deprecation period.
>
> And unless you have a plan to do something different in 3.8 that
> ensures that library devs see the warnings, there's no justification
> for the delay. All you'll do is defer the

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread Chris Angelico

On Wed, Aug 7, 2019 at 7:33 PM Steven D'Aprano  wrote:
> What's the rush? Let's be objective here: what benefit are we going to
> get from this change? Is there anyone hanging out desperately for "\d"
> and "\-" to become SyntaxErrors, so they can... do what?

So that problems can start to be detected. Time and again, Python
users on Windows get EXTREMELY confused by the way their code worked
perfectly with one path, then bizarrely fails with another. That is a
very real problem, and the problem is that it appeared to work when
actually it was wrong.
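
A minimal sketch of that failure mode follows. The paths are made up for illustration, and eval() is used only so that the example file itself contains no invalid escapes. It assumes a Python version in which an invalid escape is still a warning rather than an error.

```python
import warnings

def evaluate_literal(literal_source):
    """Evaluate the source text of a string literal, hiding escape warnings."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return eval(literal_source)

# Hypothetical paths, typed without an r prefix as a newcomer might.
works = evaluate_literal(r'"C:\data\work"')   # \d and \w are not escapes, kept as-is
breaks = evaluate_literal(r'"C:\data\new"')   # \n is a real escape: a newline

print(works == r"C:\data\work")   # the first path survives by accident
print("\n" in breaks)             # the second path now contains a newline
```

Both literals look equally reasonable in the source; only the second one silently changes meaning.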

Python has a history of fixing these problems. It used to be that
b"\x61\x62\x63\x64" was equal to u"abcd", but now Python sees these as
fundamentally different. Data-dependent bugs caused by a syntactic
oddity are a language flaw that needs to be fixed.
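
For reference, a small sketch of the Python 3 behaviour being described (run without the -b or -bb interpreter flags, which turn the comparison into a warning or an error):

```python
raw = b"\x61\x62\x63\x64"

# On Python 3 a bytes object never compares equal to a str.
same = (raw == "abcd")

# The comparison only succeeds after an explicit decode.
decoded_same = (raw.decode("ascii") == "abcd")

print(same, decoded_same)
```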

> Because our processes don't work the way we assumed, it turns out that
> in practice we haven't given developers the deprecation period we
> thought we had. Read Nathaniel's post, if you haven't already done so:
>
> https://mail.python.org/archives/list/python-dev@python.org/message/E7QCC74OBYEY3PVLNQG2ZAVRO653LD5K/
>
> He makes a compelling case that while we might have had the promised
> deprecation period by the letter of the law, in practice most developers
> will have never seen it, and we will be breaking the spirit of the
> promise if we continue with the unmodified plan.

Yes, that's a fair complaint. But merely pushing the deprecation back
by a version is not solving it. There has to be SOMETHING done
differently.

> And yet here we are rushing through a breaking change in an accelerated
> manner, for a change of marginal benefit.

It's not a marginal benefit. For people who try to teach Python on
multiple operating systems, this is a very very real benefit. Just
because YOU don't see the benefit doesn't mean it isn't there.

> > Otherwise, all you're doing is saying "I wish this
> > problem would just go away".
>
> No, I'm saying we don't have to rush this into 3.8. Let's keep the
> warning silent and push everything back a release.
>
> Now is better than never.
> Although never is often better than *right* now.

Not sure how the Zen supports what you're saying there, since you're
specifically saying "not never, not now, just later". But what do you
actually mean by not rushing this into 3.8?

> Right now, we're looking at a seriously compromised user-experience for
> 3.8. People are going to hate these warnings, many of them won't know
> what to do with them and will be sure that Python is buggy, and for very
> little benefit.

Then the problem is that people blame Python for these warnings. That
is a problem to be solved; we need people to understand that a warning
emitted by a library is a *library bug* not a language flaw.

> > Library authors can start _right now_ fixing their code so it's more
> > 3.8 compatible.
>
> Provided that (1) they are aware that this is a problem that needs to be
> fixed, and (2) they have the round tuits to actually fix it by 3.8.0.
> Neither are guaranteed.

(1) Yes it is, see above; (2) fair point, but this is restricted to
string literals and can be detected simply by compiling the code, so
it's a reasonably findable problem.
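
As a rough sketch of what "detected simply by compiling the code" can look like: the helper name below is hypothetical, and it relies on the warning text containing "invalid escape sequence", which is an assumption about the interpreter's wording.

```python
import warnings

def has_invalid_escape(source, filename="<example>"):
    """Return True if compiling source reports an invalid escape sequence."""
    with warnings.catch_warnings():
        # Promote warnings to errors so the compile step fails loudly.
        warnings.simplefilter("error")
        try:
            compile(source, filename, "exec")
        except (SyntaxError, Warning) as exc:
            return "invalid escape sequence" in str(exc)
    return False

print(has_invalid_escape(r'pattern = "\d+"'))    # non-raw literal with \d
print(has_invalid_escape(r'pattern = r"\d+"'))   # raw literal, nothing to report
```

Nothing is executed here; the source only has to be compiled, which is why the problem is findable without running the library.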

> > ("More" because 3.8 doesn't actually break anything.)
> > What is actually gained by waiting longer
>
> We gain the avoidance of a painful experience in 3.8 for a significant
> number of users and third-party devs.
>
> The question we haven't had answered is what we gain by pushing through
> with the original plan. Plenty of people have said "Let's just do it"
> but as far as I can see not one has explained *why* we should put end-
> users and library developers through this frustrating and annoying
> rushed deprecation period.

And unless you have a plan to do something different in 3.8 that
ensures that library devs see the warnings, there's no justification
for the delay. All you'll do is defer the exact same problem by
another eighteen months. If the warning remains silent in 3.8, how
will library devs get any indication that they need to fix something?

If you can offer a better plan, then by all means, do so. But
deferring without a change is of no real value, and it means ANOTHER
eighteen months added onto the time before novice programmers get to
be told about string literal problems.

ChrisA
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RISO4KSTHBMQZJT5XFS34GCB2PB66WNV/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread Paul Moore

On Wed, 7 Aug 2019 at 10:32, Steven D'Aprano  wrote:

> No, I'm saying we don't have to rush this into 3.8. Let's keep the
> warning silent and push everything back a release.
>
> Now is better than never.
> Although never is often better than *right* now.
>
> Right now, we're looking at a seriously compromised user-experience for
> 3.8. People are going to hate these warnings, many of them won't know
> what to do with them and will be sure that Python is buggy, and for very
> little benefit.
>
> Let's slow down and put it off for another release, giving us time to
> solve the warnings problem, and library authors the deprecation period
> promised.

+1 from me. The arguments made here are pretty compelling to me, and I
agree that we should take a breath and not rush this warning into 3.8,
given what we now know.

Paul
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OXGY2MPRTK3BJAXCRVLFKKKQNREKO7O4/

[Python-Dev] Re: What to do about invalid escape sequences

2019-08-07 Thread Steven D'Aprano

On Wed, Aug 07, 2019 at 02:33:51PM +1000, Chris Angelico wrote:
> On Wed, Aug 7, 2019 at 1:54 PM Steven D'Aprano  wrote:
> > Don't think of this as a failure. Think of it as an opportunity: we've
> > identified a weakness in our deprecation process. Let's fix that
> > process, make sure that *developers* will see the warning in 3.8 or 3.9,
> > and not raise an exception until 4.0 or 4.1.
> >
> 
> So HOW are you going to make sure developers see it?

I've only just started thinking about it, give me a couple of minutes! *wink*

What's the rush? Let's be objective here: what benefit are we going to 
get from this change? Is there anyone hanging out desperately for "\d" 
and "\-" to become SyntaxErrors, so they can... do what?

Because our processes don't work the way we assumed, it turns out that 
in practice we haven't given developers the deprecation period we 
thought we had. Read Nathaniel's post, if you haven't already done so:

https://mail.python.org/archives/list/python-dev@python.org/message/E7QCC74OBYEY3PVLNQG2ZAVRO653LD5K/

He makes a compelling case that while we might have had the promised 
deprecation period by the letter of the law, in practice most developers 
will have never seen it, and we will be breaking the spirit of the 
promise if we continue with the unmodified plan.

Quite frankly, if we continue with the unmodified plan, third-party devs 
who are affected will have the right to feel mightily pissed off at us. 
We make an implicit, if not explicit, promise that we won't break 
backwards compatibility lightly, but if we do, we will give them plenty 
of notice except under the most dire circumstances (such as a serious 
security vulnerability).

And yet here we are rushing through a breaking change in an accelerated 
manner, for a change of marginal benefit. Sure, we can say that 
*technically* we gave them all the notice promised, it was at the bottom 
of a locked filing cabinet stuck in a disused lavatory with a sign on 
the door saying "Beware of The Leopard".

https://www.goodreads.com/quotes/40705-but-the-plans-were-on-display-on-display-i-eventually

I'm sure that the affected devs will understand why it was *their* fault 
they couldn't see the warnings, when even people from a first-class 
library like SymPy took four iterations to do it right.

> Currently it
> requires some extra steps or flags, which are not well known. What
> change are you proposing for 3.8 that will ensure that this actually
> gets solved? 

Absolutely nothing. I don't have to: we're an entire community, this 
doesn't have to fall only on my shoulders. I'm not even the messenger: 
that's Raymond. I'm just (partly) agreeing with him.

Just because I don't have a solution for this problem doesn't mean the 
problem doesn't exist.

> Otherwise, all you're doing is saying "I wish this
> problem would just go away".

No, I'm saying we don't have to rush this into 3.8. Let's keep the 
warning silent and push everything back a release.

Now is better than never.
Although never is often better than *right* now.

Right now, we're looking at a seriously compromised user-experience for 
3.8. People are going to hate these warnings, many of them won't know 
what to do with them and will be sure that Python is buggy, and for very 
little benefit.

Let's slow down and put it off for another release, giving us time to 
solve the warnings problem, and library authors the deprecation period 
promised.

> Library authors can start _right now_ fixing their code so it's more
> 3.8 compatible.

Provided that (1) they are aware that this is a problem that needs to be 
fixed, and (2) they have the round tuits to actually fix it by 3.8.0. 
Neither are guaranteed.

It's not a big fix, but people have other priorities, like work, family, 
a life, etc. That's why we normally give developers *multiple years* of 
warnings to fix problems, not weeks. This change is not so important 
that we have to push it through in an accelerated time frame.

> ("More" because 3.8 doesn't actually break anything.)
> What is actually gained by waiting longer

We gain the avoidance of a painful experience in 3.8 for a significant 
number of users and third-party devs.

The question we haven't had answered is what we gain by pushing through 
with the original plan. Plenty of people have said "Let's just do it" 
but as far as I can see not one has explained *why* we should put 
end-users and library developers through this frustrating and annoying 
rushed deprecation period.

-- 
Steven
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/I6DFONZPRHL4VYUYICAXIMUTR4KVVHV6/

[Python-Dev] Re: What to do about invalid escape sequences


07.08.19 03:57, Gregory P. Smith wrote:
People distribute code via PyPI. If we reject uploads of packages with 
these problems and link to fixers (modernize can be taught what to do), 
we prevent them from spreading further.


How can we check that there are such problems in the package? Pass all 
*.py files through a linter? But the package can contain "incorrect" 
files, for example files for Python 2 or earlier Python 3 versions. Even 
the CPython test suite contains bad Python files for testing purposes.
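
A rough sketch of such a check, and of why its results are ambiguous. The function name and the three labels are hypothetical, and matching on the text "invalid escape sequence" is an assumption about the interpreter's message.

```python
import warnings
from pathlib import Path

def scan_tree(root):
    """Label each *.py file under root: 'ok', 'invalid-escape' or 'other-error'."""
    root = Path(root)
    report = {}
    for path in sorted(root.rglob("*.py")):
        source = path.read_bytes()
        label = "ok"
        with warnings.catch_warnings():
            # Promote compile-time warnings to errors so they can be caught.
            warnings.simplefilter("error")
            try:
                compile(source, str(path), "exec")
            except (SyntaxError, Warning) as exc:
                if "invalid escape sequence" in str(exc):
                    label = "invalid-escape"
                else:
                    # For example a Python 2 file, or a deliberately broken test file.
                    label = "other-error"
        report[str(path.relative_to(root))] = label
    return report
```

A file labelled 'other-error' may be perfectly intentional, so a checker like this cannot tell on its own whether rejecting the upload would be justified.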

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JX7IDIGFLAZIF2YQIR5IYNNHLLHGRA4T/

[Python-Dev] Re: What to do about invalid escape sequences


07.08.19 03:31, Rob Cliffe via Python-Dev wrote:

How about: whenever a third-party library uses a potentially-wrong
escape sequence, it creates a message on the console. Then when
someone sees that message, they can post a bug report against the
package.


Wouldn't it just increase the amount of noise? The main complaint 
about new warnings is the noise.

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VZHWNZ4X7PXXE4Y4XIZCLMWSYGNJ5WPY/

[Python-Dev] Re: What to do about invalid escape sequences


07.08.19 01:37, Brett Cannon wrote:

I think this is a good example of how the community is not running tests with 
warnings on and making sure that their code is warnings-free. This warning has 
existed for at least one full release and fixing it doesn't require some crazy 
work-around for backwards compatibility, and so this tells me people are simply 
either ignoring the warnings or they are not aware of them.


There are several PRs for fixing warnings on GitHub every month. And it 
seems a deprecation warning about importing ABCs from collections is at 
least as common (if not more so) as the warning about "invalid escape 
sequences". The former is more visible to end users because it is emitted 
on every run, not only at the first bytecode compilation.
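
A small sketch of the compile-time versus run-time difference, assuming a Python version where an invalid escape is still a warning. The compiled code object here stands in for a cached .pyc file.

```python
import warnings

source = r'pattern = "\d+"'

# Compiling the source is what triggers the warning.
with warnings.catch_warnings(record=True) as at_compile:
    warnings.simplefilter("always")
    code = compile(source, "<example>", "exec")

# Running the already-compiled code emits nothing further.
with warnings.catch_warnings(record=True) as at_run:
    warnings.simplefilter("always")
    namespace = {}
    exec(code, namespace)
    exec(code, namespace)

print(len(at_compile) >= 1, len(at_run) == 0)
```

Once the bytecode is cached, later runs skip the compile step, so the end user never sees the warning again.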

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7TYOKDS3D5YXKJFBJO6G6OVKVRYKRCHO/

[Python-Dev] Re: What to do about invalid escape sequences