[issue37764] email.Message.as_string infinite loop

2019-08-15 Thread Abhilash Raj


Abhilash Raj  added the comment:

However, the 2nd bug I spoke of is somewhat speculative. I haven't been able to 
find a test case that matches rfc2047_matcher but raises an exception in 
get_encoded_word (after, of course, the first bug is fixed), which is the only 
way to cause an infinite loop.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37764] email.Message.as_string infinite loop

2019-08-15 Thread Abhilash Raj


Abhilash Raj  added the comment:

I meant, =aa is identified as an encoded word escape.

--




[issue37764] email.Message.as_string infinite loop

2019-08-15 Thread Abhilash Raj


Abhilash Raj  added the comment:

You have correctly identified that "=aa" is detected as an encoded word escape 
and causes get_encoded_word to fail.

However, "=?utf-8?q?somevalue?=aa" should ideally get parsed as "somevalueaa" 
and not "=?utf-8?q?somevalue?=aa". This is because "=?utf-8?q?somevalue?=" is a 
valid encoded word; it is just not followed by whitespace.

modified   Lib/email/_header_value_parser.py
@@ -1037,7 +1037,10 @@ def get_encoded_word(value):
         raise errors.HeaderParseError(
             "expected encoded word but found {}".format(value))
     remstr = ''.join(remainder)
-    if len(remstr) > 1 and remstr[0] in hexdigits and remstr[1] in hexdigits:
+    if (len(remstr) > 1 and
+        remstr[0] in hexdigits and
+        remstr[1] in hexdigits and
+        tok.count('?') < 2):
         # The ? after the CTE was followed by an encoded word escape (=XX).
         rest, *remainder = remstr.split('?=', 1)

This can be avoided by checking whether `?` occurs twice in `tok`.
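To illustrate the parse I would expect for that input, here is a minimal standalone sketch. It is not the code in _header_value_parser.py: the ENCODED_WORD pattern and the split_leading_encoded_word helper are hypothetical names made up for this example, and the sketch ignores details such as RFC 2231 language tags.

```python
import base64
import quopri
import re

# Hypothetical pattern: charset, CTE (b or q), payload, terminated by "?=".
ENCODED_WORD = re.compile(r'=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=')


def split_leading_encoded_word(value):
    """Return (decoded_text, remainder) for a value starting with an encoded word."""
    match = ENCODED_WORD.match(value)
    if match is None:
        raise ValueError("expected encoded word but found {!r}".format(value))
    charset, cte, payload = match.groups()
    if cte.lower() == 'q':
        # header=True makes "_" decode to a space, as in RFC 2047 "Q" encoding.
        raw = quopri.decodestring(payload.encode('ascii'), header=True)
    else:
        raw = base64.b64decode(payload)
    # Whatever follows the closing "?=" is plain text, even without whitespace.
    return raw.decode(charset), value[match.end():]


decoded, rest = split_leading_encoded_word("=?utf-8?q?somevalue?=aa")
print(decoded + rest)
```

The point is only that the trailing "aa" belongs to the remainder, not to the encoded word.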

The 2nd bug, which needs a better test case, is that if the encoded word is 
invalid, you will keep running into an infinite loop, which you correctly fixed 
in your PR. However, the test case you used is more appropriate for the first 
issue.

You can fix both issues in the PR. For that, you need to add a test case for 
the 2nd issue and a fix for the first issue.

Looking into the PR now.

--




[issue37824] IDLE: Handle Shell input warnings properly.

2019-08-15 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

I am combining the trivial 'leave Shell input SyntaxWarnings alone (instead of 
making them SyntaxErrors)' part of #34857 with this issue, 'print Shell input 
warnings (and the very rare internal IDLE Warnings occurring after Shell 
exists) in Shell', and changing the title accordingly.

The PR is a WIP with at least two bugs. Help wanted.
1. The triple output.  (Before the patch, when I started IDLE from a command 
prompt, I only saw one.)
2. Warnings are inserted before the text generating the warning.  I interpret 
this as the intercepted \n causing compilation and printing of the warning 
before the \n is inserted at the end of the line and the iomark moved to the 
beginning of the next.  I don't know what moves the iomark.

--
stage: patch review -> 




[issue37824] IDLE: Handle Shell input warnings properly.

2019-08-15 Thread Terry J. Reedy


Change by Terry J. Reedy :


--
keywords: +patch
pull_requests: +15030
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/15311




[issue34857] IDLE: Module warnings misplaced.

2019-08-15 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

I moved leaving SyntaxWarnings as warnings to #37824, as that is mostly about 
editing pyshell.

Changing the ordering of warnings and RESTART involves run.show_warnings.  I 
believe the latter should collect lines in a warnings list and print
''.join(warnings) at the appropriate time.
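
A rough sketch of that collect-then-print idea, written against the standard warnings module only. The names pending_warnings, collect_warning and flush_warnings are hypothetical and are not IDLE's actual code; when and where the flush happens is left open.

```python
import warnings

# Hypothetical buffer holding formatted warning text until it can be shown.
pending_warnings = []


def collect_warning(message, category, filename, lineno, file=None, line=None):
    """showwarning-compatible hook that stores the text instead of printing it."""
    pending_warnings.append(
        warnings.formatwarning(message, category, filename, lineno, line))


def flush_warnings():
    """Return everything collected so far as one string and empty the buffer."""
    text = ''.join(pending_warnings)
    pending_warnings.clear()
    return text


collect_warning("example warning", UserWarning, "example.py", 1)
print(flush_warnings(), end='')
```

The hook has the same signature as warnings.showwarning, so it could be assigned there while output has to be delayed.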

--
title: IDLE: SyntaxWarning not handled properly -> IDLE: Module warnings 
misplaced.




[issue37824] IDLE: Handle Shell input warnings properly.

2019-08-15 Thread Terry J. Reedy


Change by Terry J. Reedy :


--
nosy: +rhettinger, taleinat
title: IDLE: DeprecationWarning not handled properly -> IDLE: Handle Shell 
input warnings properly.




[issue37871] 40 * 473 grid of "é" has a single wrong character on Windows

2019-08-15 Thread Eryk Sun

Eryk Sun  added the comment:

To be compatible with Windows 7, _io__WindowsConsoleIO_write_impl in 
Modules/_io/winconsoleio.c is forced to write to the console in chunks that do 
not exceed 32 KiB. It does so by repeatedly dividing the length to decode by 2 
until the decoded buffer size is small enough. 

wlen = MultiByteToWideChar(CP_UTF8, 0, b->buf, len, NULL, 0);
while (wlen > 32766 / sizeof(wchar_t)) {
    len /= 2;
    wlen = MultiByteToWideChar(CP_UTF8, 0, b->buf, len, NULL, 0);
}

With `('é' * 40 + '\n') * 473`, encoded as UTF-8, we have 473 82-byte lines 
(note that "\n" has been translated to "\r\n"). This is 38,786 bytes, which is 
too much for a single write, so it splits it in two.

>>> 38786 // 2
19393
>>> 19393 // 82
236
>>> 19393 % 82
41

This means line 237 ends up with 20 'é' characters (UTF-8 b'\xc3\xa9') and one 
partial character sequence, b'\xc3'. When this buffer is passed to 
MultiByteToWideChar to decode from UTF-8 to UTF-16, the partial sequence gets 
decoded as the replacement character U+FFFD. For the next write, the remaining 
b'\xa9' byte also gets decoded as U+FFFD.

To avoid this, _io__WindowsConsoleIO_write_impl could decode the whole buffer 
in one pass, and slice that up into writes that are less than 32 KiB. Or it 
could ensure that its UTF-8 slices are always at character boundaries.
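
The second option can be sketched in Python as follows. This is only an illustration of slicing UTF-8 at character boundaries, not a proposed patch for winconsoleio.c; utf8_safe_chunks is a hypothetical helper, and the 19393-byte limit is just the halved length from the example above.

```python
def utf8_safe_chunks(data, max_len):
    """Yield slices of UTF-8 `data`, each at most `max_len` bytes, cut between characters."""
    start = 0
    while start < len(data):
        end = min(start + max_len, len(data))
        # A byte of the form 0b10xxxxxx continues a character, so a cut just
        # before it would split that character; back up to its lead byte.
        while start < end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        if end == start:
            # max_len is smaller than one character; give up on alignment.
            end = min(start + max_len, len(data))
        yield data[start:end]
        start = end


data = (('é' * 40 + '\r\n') * 473).encode('utf-8')

# Naive halving cuts the 21st 'é' of line 237 in two.
naive = data[:len(data) // 2].decode('utf-8', errors='replace')
print(repr(naive[-3:]))

# Boundary-aware slices all decode cleanly.
for chunk in utf8_safe_chunks(data, len(data) // 2):
    chunk.decode('utf-8')
```

The backing-up loop moves at most three bytes, since a UTF-8 character is never longer than four bytes.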

--
components: +IO
nosy: +eryksun




[issue37863] Speed up hash(fractions.Fraction)

2019-08-15 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

There's likely more that could be done -- I've just taken the low hanging 
fruit.  If someone wants to re-open this and go farther, please take it from 
here.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue37863] Speed up hash(fractions.Fraction)

2019-08-15 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset f3cb68f2e4c3e0c405460f9bb881f5c1db70f535 by Raymond Hettinger in 
branch 'master':
bpo-37863: Optimize Fraction.__hash__() (#15298)
https://github.com/python/cpython/commit/f3cb68f2e4c3e0c405460f9bb881f5c1db70f535


--




[issue37834] readlink on Windows cannot read app exec links

2019-08-15 Thread Eryk Sun


Eryk Sun  added the comment:

> So for an actual non-root mount point ntpath.ismount() returns True 
> and with IO_REPARSE_TAG_MOUNT_POINT included ntpath.islink() also 
> returns True. nt.readlink() returns the "\\?\Volume{GUID}\" path

If islink() is true, then st_mode has S_IFLNK and not S_IFDIR. So we have a 
mount point that's a symlink, which is not possible in POSIX, and it's not a 
directory, which is unusual in POSIX to say the least. 
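
For reference, the file-type part of st_mode holds exactly one type, which is why the two cannot both be set. A small check using only the constants from the stat module (the permission bits are arbitrary example values):

```python
import stat

# Example mode values: a type field combined with arbitrary permission bits.
link_mode = stat.S_IFLNK | 0o777
dir_mode = stat.S_IFDIR | 0o755

for name, mode in (("link", link_mode), ("dir", dir_mode)):
    # S_IFMT() extracts the type field; S_ISLNK/S_ISDIR compare against it.
    print(name, oct(stat.S_IFMT(mode)), stat.S_ISLNK(mode), stat.S_ISDIR(mode))

# The bit test used further down gives 0 for a symlink mode as well.
print(link_mode & stat.S_IFDIR)
```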

For cross-platform consistency, I think it's better if ismount() is a 
directory. The question would be how to ensure that's true in all cases. 
Windows makes this hard to accomplish reliably due to DOS 'devices' that get 
reparsed in the object manager to arbitrary paths, not necessarily to volume 
devices.

> Root mount points ("C:\\", etc.) do not return true for islink()

Not really. Here's a mountpoint-symlink chimera:

>>> os.readlink('C:/Mount/Windows')
'C:\\Windows'
>>> os.system('subst W: C:\\Mount\\Windows')
0

It's a symlink and not a directory:

>>> os.path.islink('W:\\')
True
>>> os.lstat('W:\\').st_mode & stat.S_IFDIR
0

But it's also a mount point:

>>> os.path.ismount('W:\\')
True

The object manager reparses "W:" as "\\??\\C:\\Mount\\Windows", and we open it 
with a trailing backslash, which is fine, i.e. "\\??\\C:\\Mount\\Windows\\". 

> I'm not seeing why having both islink() and ismount() be true 
> in this case is a problem.

It's only possible if a mount point is not a directory. That we'd be returning 
this for a junction is a strange state of affairs because a junction must 
target a file system directory. I prefer generalizing junction as a 
name-surrogate type that allows S_IFDIR.

--




[issue17305] IDNA2008 encoding is missing

2019-08-15 Thread Ashwin Ramaswami


Ashwin Ramaswami  added the comment:

So is the consensus that the best way to do this is to move the "idna" library 
to stdlib, or implement it from scratch?

--




[issue37834] readlink on Windows cannot read app exec links

2019-08-15 Thread Eryk Sun


Eryk Sun  added the comment:

> # Always make the OS resolve "unknown" reparse points
> ALLOWED_TO_TRAVERSE = {SYMLINK, MOUNT_POINT}
> if !traverse and st.reparse_tag not in ALLOWED_TO_TRAVERSE:
>     return xstat(path, !traverse)

To me the naming here makes sense as ALLOWED_TO_OPEN -- as in if traverse is 
false, meaning we opened the tag, but the tag is not in ALLOWED_TO_OPEN, then 
we have to reopen in order to traverse it. But then, on the second pass after 
ERROR_CANT_ACCESS_FILE, traverse is false, so we would also have to special 
case tags that should never be traversed in NOT_ALLOWED_TO_TRAVERSE = 
{APPEXECLINK, ...}.

I mentioned this in a GitHub comment, and suggested maybe adding another 
internal-only parameter to check to avoid having to ever special case the 
app-exec-link tag. For example, we could have an explicit "open_reparse_point" 
parameter. It would take precedence over traverse (i.e. follow_symlinks). For 
now, it could be internal only.

I assumed stat() would return the reparse point for all tags that fail to 
reparse with ERROR_CANT_ACCESS_FILE since it's not an invalid reparse point, 
just an unhandled one that has no other significance at the file system level. 
To stat(), it can never be anything other than a reparse point. Whatever 
relevance it has depends on some other context (e.g. a CreateProcessW call).

> And the open question is just whether MOUNT_POINT should be in that 
> set near the end. I believe it should, since the alternative is to 
> force all Python developers to write special Windows-only code to
> handle directory junctions.

Python provides no cross-platform tooling to manipulate mount points or read 
what they target (e.g. something like "/dev/sda1" in Unix), so that doesn't 
bother me per se. A mount point is just a directory in the POSIX mindset. 

That doesn't mean, however, that I wouldn't like the ability to detect "name 
surrogate" reparse points in general to implement safer behavior for 
shutil.rmtree and anything else that walks a directory tree. Windows shells 
don't follow name surrogates (including mount points) when deleting a tree, 
such as `rmdir /s`. Unix `rm -rf` does follow mount points. (A process would 
need root access to unmount a directory anyway.) The authors of shutil.rmtree 
have a Unix perspective. For Windows, I'd like to change that perspective. When 
in Rome...
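
As a rough illustration of a tree walk that reports links but never descends into them, here is a sketch using os.scandir. It only covers real symbolic links; how junctions and other name-surrogate reparse points are classified by is_dir(follow_symlinks=False) is exactly what is under discussion, so that part is an assumption left open. The function name is hypothetical.

```python
import os


def entries_without_following_links(root):
    """Yield every path under `root`; links are listed but not traversed."""
    with os.scandir(root) as entries:
        for entry in entries:
            yield entry.path
            # follow_symlinks=False: a link to a directory is not a directory here.
            if entry.is_dir(follow_symlinks=False):
                yield from entries_without_following_links(entry.path)


if __name__ == "__main__":
    for path in entries_without_following_links(os.curdir):
        print(path)
```

A delete pass built on this would remove the link entry itself and leave the target tree alone.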

If the only way to get this is to special case mount-point or name-surrogate 
reparse points as applicable to "follow_symlinks", then I suggest that this 
should be clearly documented and that we not go so far as to pretend that 
they're symlinks via S_IFLNK, islink, and readlink. Continue reporting mount 
points as directories (S_IFDIR). Continue with only supporting actual symbolic 
links for the triple: islink(), readlink(), and symlink(). In this case, we can 
copy symlinks and be certain the semantics remain the same since we're not 
changing the type of reparse point.

--




[issue33007] Objects referencing private-mangled names do not roundtrip properly under pickling.

2019-08-15 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

This problem is specific to private methods AFAICT, since they're the only 
things which have an unmangled __name__ used to pickle them, but are stored as 
a mangled name.

More details on cause and solution on issue #37852, which I closed as a 
duplicate of this issue.
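
A short demonstration of the name mismatch, without involving pickle itself (Widget and its methods are made-up names for this example):

```python
class Widget:
    def __hidden(self):          # stored on the class as _Widget__hidden
        return 42

    def get_hidden(self):
        return self.__hidden     # mangled by the compiler inside the class body


widget = Widget()
method = widget.get_hidden()

print(method.__name__)                    # the unmangled name
print('_Widget__hidden' in vars(Widget))  # the name it is actually stored under
print(hasattr(widget, method.__name__))   # lookup by __name__ does not find it
```

Anything that records the method by its __name__ and later looks it up with getattr() will therefore miss it.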

--
nosy: +josh.r
versions: +Python 3.6, Python 3.8, Python 3.9




[issue37871] 40 * 473 grid of "é" has a single wrong character on Windows

2019-08-15 Thread ANdy

New submission from ANdy :

# To reproduce:
# Put this text in a file `a.py` and run `py a.py`.
# Or just run: py -c "print(('é' * 40 + '\n') * 473)"
# Scroll up for a while. One of the lines will be:
# ��ééé
# (You can spot this because it's slightly longer than the other lines.)
# The error is consistently on line 237, column 21 (1-indexed).

# The error reproduces on Windows but not Linux. Tested in both PowerShell and CMD.
# (Failed to reproduce on either a real Linux machine or on Ubuntu with WSL.)
# On Windows, the error reproduces every time consistently.

# There is no error if N = 472 or 474.
N = 473
# There is no error if W = 39 or 41.
# (I tested with console windows of varying sizes, all well over 40 characters.)
W = 40
# There is no error if ch = "e" with no accent.
# There is still an error for other unicode characters like "Ö" or "ü".
ch = "é"
# There is no error without newlines.
s = (ch * W + "\n") * N
# Assert the string itself is correct.
assert all(c in (ch, "\n") for c in s)
print(s)

# There is no error if we use N separate print statements
# instead of printing a single string with N newlines.

# Similar scripts written in Groovy, JS and Ruby have no error.
# Groovy: System.out.println(("é" * 40 + "\n") * 473)
# JS: console.log(("é".repeat(40) + "\n").repeat(473))
# Ruby: puts(("é" * 40 + "\n") * 473)

--
components: Windows
messages: 349837
nosy: anhans, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: 40 * 473 grid of "é" has a single wrong character on Windows
type: behavior
versions: Python 3.7




[issue37852] Pickling doesn't work for name-mangled private methods

2019-08-15 Thread Josh Rosenberg


Change by Josh Rosenberg :


--
resolution:  -> duplicate
stage:  -> resolved
status: open -> closed
superseder:  -> Objects referencing private-mangled names do not roundtrip 
properly under pickling.




[issue37863] Speed up hash(fractions.Fraction)

2019-08-15 Thread Tim Peters


Tim Peters  added the comment:

Mark, I did just a little browsing on this.  It seems it's well known that egcd 
beats straightforward exponentiation for this purpose in arbitrary precision 
contexts, for reasons already sketched (egcd needs narrower arithmetic from the 
start, benefits from the division narrowing on each iteration, and since the 
quotient on each iteration usually fits in a single "digit" the multiplication 
by the quotient goes fast too).

But gonzo implementations switch back to exponentiation, using fancier 
primitives like Montgomery multiplication.

As usual, I'm not keen on bloating the code for "state of the art" giant int 
algorithms, but suit yourself!  The focus in this PR is dead simple spelling 
changes with relatively massive payoffs.
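
For anyone following along, the two approaches to a modular inverse look roughly like this. It is a plain-Python sketch for comparison only, not the code in the PR; modinv_egcd and modinv_pow are names made up here, and P is taken from sys.hash_info.modulus.

```python
import sys

P = sys.hash_info.modulus  # the prime used by numeric hashing


def modinv_egcd(a, m):
    """Inverse of a modulo m via the extended Euclidean algorithm."""
    old_r, r = a % m, m
    old_s, s = 1, 0
    # Invariant: old_s * a == old_r (mod m) and s * a == r (mod m).
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("{} has no inverse modulo {}".format(a, m))
    return old_s % m


def modinv_pow(a, p):
    """Inverse of a modulo a prime p via Fermat's little theorem."""
    return pow(a, p - 2, p)


d = 10**20 + 39
print(modinv_egcd(d, P) == modinv_pow(d, P))
```

The egcd loop works with operands that shrink on every iteration, while the exponentiation does a full-width modular multiply at every squaring step.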

--




[issue37834] readlink on Windows cannot read app exec links

2019-08-15 Thread Eryk Sun


Eryk Sun  added the comment:

> It also has a bug that a drive root is a mount point, even if the 
> drive doesn't exist. Also, it's wrong in not checking for junctions 
> in UNC paths. SMB supports opening reparse points over the wire.

"It" in the above sentences is ntpath.ismount, not GetVolumePathNameW.

--




[issue37834] readlink on Windows cannot read app exec links

2019-08-15 Thread Eryk Sun


Eryk Sun  added the comment:

> Okay, I get it now. So we _do_ want to "upgrade" lstat() to stat() 
> when it's not a symlink.

I don't see that as a behavior upgrade. It's just an implementation detail. 
lstat() is still following its mandate to not follow symlinks -- however you 
ultimately define what a "symlink" is in this context in Windows.
 
> I don't want to add any parameters - I want to have predictable and
> reasonable default behaviour. os.readlink() already exists for 
> "open reparse point" behaviour.

I'd appreciate a parameter to always open reparse points, even if a 
filter-driver or the I/O manager handles them. 

I'm no longer a big fan of mapping "follow_symlinks" to name surrogates (I used 
to like this idea a couple years ago), or splitting hairs regarding 
volume-mount-point junctions and bind-like junctions (used to like this too a 
year ago, because some projects do this, before I thought about the deeper 
concerns). But it's not up to me. If follow_symlinks means name surrogates, at 
least then lstat can open any reparse point that claims to link to another path 
and thus *should* have link-like behavior (hard link or soft link). 

For example, we are able to move, rename, and delete symlinks and junctions 
without affecting the target (except for a junction that's a volume mount 
point, Windows will try DeleteVolumeMountPointW, which can have side effects; 
failure is ignored and the directory deleted anyway). This is implemented by 
the Windows API opening the reparse point and checking for symlink and junction 
tags. It reparses other tags, regardless of whether they're name surrogates, 
but I assume name-surrogate reparse points should be implemented by their 
owning filter drivers to behave in a similar fashion for actions such as rename 
and delete.

While deleting a name-surrogate reparse point should have no effect on the 
target, it still might have unintended consequences. For example, it might 
revive a 'deleted' file in a VFS for Git repo if we delete the tombstone 
reparse point that marks a file that's supposed to be 'deleted'. This might 
happen if code checks os.lstat(filename) and decides to delete the file in a 
non-standard way that ensures only a reparse point is deleted, e.g. 
CreateFileW(filename, ..., FILE_FLAG_DELETE_ON_CLOSE | 
FILE_FLAG_OPEN_REPARSE_POINT, NULL), or manually setting the 
FileDispositionInfo. (DeleteFileW would fail with a file-not-found error 
because it would reparse the tombstone.) Now it's in for a surprise because the 
file exists again in the projected filesystem, even though it was just 
'deleted'. This is in theory. I haven't experimented with projected file 
systems to determine whether they actually allow opening a tombstone reparse 
point when using FILE_FLAG_OPEN_REPARSE_POINT. I assume they do, like any other 
reparse point, unless there's deeper magic involved here.

The questions for me are whether os.readlink() should also read junctions and 
exactly what follow_symlinks means in Windows. We have a complicated story to 
tell if follow_symlinks=False (lstat) opens any reparse point or opens just 
name-surrogate reparse points, and islink() is made consistent with this, but 
then readlink() doesn't work. 

If junctions are handled as symlinks, then islink(), readlink(), symlink() 
would be used to copy a junction 'link' while copying a tree (e.g. 
shutil.copytree with symlinks=True). This would transform junctions into 
directory symlinks. In this case, we potentially have a problem that relative 
symlinks in the tree no longer target the same files when accessed via a 
directory symlink instead of a junction. No one thinks about this problem on 
the POSIX side because it would be weird to copy a mountpoint as a symlink. In 
POSIX, a mountpoint is always seen as just a directory and always traversed.

> I'm still not convinced that this is what we want to do. I don't 
> have a true Linux machine handy to try it out (Python 3.6 and 3.7 on
>  WSL behave exactly like the semantics I'm proposing, but that may 
> just be because it's the Windows kernel below it).

If you're accessing NT junctions under WSL, in that environment they're always 
handled as symlinks. And the result of my "C:/Junction" and "C:/Symlink" 
example --- i.e. "/mnt/c/Junction" and "/mnt/c/Symlink" -- is that *both* 
behave the same way, which is as expected since the WSL environment sees both 
as symlinks, but also fundamentally wrong. In an NT process, they behave 
differently, as a mount point (hard name grafting) and a symlink (soft name 
grafting). This is a decision in WSL's drvfs file-system driver, and I have to 
assume it's intentional. 

In a perfect world, a path on the volume should be consistently evaluated, 
regardless of whether it's accessed from a WSL or NT process. But it's also a 
difficult problem, maybe intractable, if they want to avoid Linux programs 
traversing junctions in dangerous operations -- e.g. `rm -rf`. The only name 
surro

[issue37870] os.path.ismount returns false for disconnected CIFS mounts in Linux

2019-08-15 Thread Matt Christopher


New submission from Matt Christopher :

I've got a case where we mount a CIFS filesystem and then later the actual 
backing filesystem is deleted (but the mount remains on the machine).

When running from a shell, this is the behavior which I see after the backing 
CIFS filesystem has gone away:
root@1b20608623a246f1af69058acdfbfd3006:/fsmounts# ll
ls: cannot access 'cifsmountpoint': Input/output error
total 8
drwxrwx--- 3 _user _grp 4096 Aug 15 15:46 ./
drwxrwx--- 8 _user _grp 4096 Aug 15 15:46 ../
d? ? ??  ?? cifsmountpoint/
root@1b20608623a246f1af69058acdfbfd3006:/fsmounts# stat -c "%d" cifsmountpoint
stat: cannot stat 'cifsmountpoint': Input/output error

Running mount -l shows this:
///c7e868cd-3047-4881-b05b-a1a1d087dbf5 on /fsmounts/cifsmountpoint type cifs (rw,relatime,vers=3.0,cache=strict,username=,domain=,uid=0,noforceuid,gid=0,noforcegid,addr=52.239.160.104,file_mode=0777,dir_mode=0777,soft,persistenthandles,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)

In the Python source, I see that posixpath.py has this snippet:
    try:
        s1 = os.lstat(path)
    except (OSError, ValueError):
        # It doesn't exist -- so not a mount point. :-)
        return False

The problem is that the comment: "# It doesn't exist -- so not a mount point. 
:-)" assumes a particular kind of OSError - in reality not every OS error means 
that it doesn't exist. In this case we're getting OSError with errno == 5, 
which is:
OSError: [Errno 5] Input/output error: 

Now, I'm not entirely sure what (if anything) the ismount function is supposed 
to be doing here... but returning false seems incorrect. This IS a mount, and 
you can see so via mount -l.
I am aware that there are other libraries (i.e. psutil.disk_partitions) which 
can help me to detect this situation but I was surprised that ismount was 
saying false here. It seems like it should possibly just raise, or maybe 
there's a fancy way to check mounts if lstat fails.
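
To make the "just raise" option concrete, here is a small sketch that treats only "no such path" errors as absence and lets everything else (such as errno 5) propagate. The helper name lstat_if_present is hypothetical, and this is not a proposed patch to posixpath.ismount.

```python
import errno
import os


def lstat_if_present(path):
    """Return os.lstat(path), or None if the path does not exist.

    Other failures, e.g. EIO from a dead network mount, are re-raised
    instead of being reported as "does not exist".
    """
    try:
        return os.lstat(path)
    except OSError as exc:
        if exc.errno in (errno.ENOENT, errno.ENOTDIR):
            return None
        raise


print(lstat_if_present(os.curdir) is not None)
print(lstat_if_present(os.path.join(os.curdir, "no-such-entry-for-this-example")))
```

A caller could then return False only for None and decide separately what an I/O error should mean.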

This looks kinda related to https://bugs.python.org/issue2466 (although this is 
already fixed and not exactly the same problem it's a similar class of issue)

--
messages: 349833
nosy: Matt Christopher
priority: normal
severity: normal
status: open
title: os.path.ismount returns false for disconnected CIFS mounts in Linux
type: behavior
versions: Python 3.6




[issue37868] `is_dataclass` returns `True` if `getattr` always succeeds.

2019-08-15 Thread Eric V. Smith


Eric V. Smith  added the comment:

Yeah, I agree it's not an awesome design to work with classes or instances, but 
it's documented that way. As soon as I write some tests I'll check this in.

--




[issue37834] readlink on Windows cannot read app exec links

2019-08-15 Thread Steve Dower


Steve Dower  added the comment:

So for an actual non-root mount point, ntpath.ismount() returns True and with 
IO_REPARSE_TAG_MOUNT_POINT included ntpath.islink() also returns True. 
nt.readlink() returns the "\\?\Volume{GUID}\" path

Root mount points ("C:\\", etc.) do not return true for islink()

os.rename() and os.unlink() work on non-root mount points, but not on root 
mount points. So there is at least some value in being able to detect "this is 
a root mount point that acts like a file".

I'm not seeing why having both islink() and ismount() be true in this case is a 
problem.

--




[issue37587] JSON loads performance improvement for long strings

2019-08-15 Thread Marco Paolini


Marco Paolini  added the comment:

ujson (https://github.com/esnme/ultrajson) instead is faster when decoding 
non-ascii in the same example above, so it is likely there is room for 
improvement...

--




[issue37834] readlink on Windows cannot read app exec links

2019-08-15 Thread Steve Dower


Steve Dower  added the comment:

> So we _do_ want to "upgrade" lstat() to stat() when it's not a symlink.

Except this bug came about because we want to _downgrade_ stat() to lstat() 
when it's an appexeclink, because the whole point of those is to use them 
without following them (and yeah, most operations are going to fail, but they'd 
fail against the target file too).

So we have this logic:

def xstat(path, traverse):
    f = open(path, flags | (0 if traverse else OPEN_REPARSE_POINT))
    if !f:
        # Special case for appexeclink
        if traverse and ERROR_CANT_OPEN_FILE:
            st = xstat(path, !traverse)
            if st.reparse_tag == APPEXECLINK:
                return st
            raise ERROR_CANT_OPEN_FILE
        # Handle "likely" errors
        if ERROR_ACCESS_DENIED or SHARING_VIOLATION:
            st = read_from_dir(os.path.split(path))
    else:
        st = read_from_file(f)

    # Always make the OS resolve "unknown" reparse points
    ALLOWED_TO_TRAVERSE = {SYMLINK, MOUNT_POINT}
    if !traverse and st.reparse_tag not in ALLOWED_TO_TRAVERSE:
        return xstat(path, !traverse)

    return st

And the open question is just whether MOUNT_POINT should be in that set near 
the end. I believe it should, since the alternative is to force all Python 
developers to write special Windows-only code to handle directory junctions.

--




[issue29535] datetime hash is deterministic in some cases

2019-08-15 Thread Ashwin Ramaswami


Ashwin Ramaswami  added the comment:

> Making the numeric hash non-predictable while maintaining its current 
> properties would be difficult.

Why so?

> In fact, I think it's reasonable to assume that there are no websites 
> vulnerable to a DOS via *numeric* hash collisions until we see evidence 
> otherwise. I'd expect that there are *way* more places where a dict is being 
> constructed with string keys in this way than with numeric keys.

That's true, but why do we restrict ourselves to websites? This is how I see 
it: As a Python developer, it seems like my program is immune to hash collision 
DoS if I use strings/bytes as dictionary keys, but *not* if my keys, say, are 
tuples of strings. Why not make the hash non-predictable for all builtin types 
by default?
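
The predictability of numeric hashes is easy to see: for non-negative integers the hash is the value reduced modulo sys.hash_info.modulus, so colliding keys can be generated at will. A small sketch (the variable names are only for illustration):

```python
import sys

M = sys.hash_info.modulus

# Distinct integers that all land on the same hash value.
colliding_keys = [5 + k * M for k in range(4)]
print([hash(key) for key in colliding_keys])

# The hash of the modulus itself wraps around to zero.
print(hash(M))
```

This holds regardless of PYTHONHASHSEED, which only affects str and bytes hashing.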

--




[issue37834] readlink on Windows cannot read app exec links

2019-08-15 Thread Steve Dower


Steve Dower  added the comment:

> For example, if we've opened an HSM reparse point, we must reopen to let
> the file-system filter driver implement its semantics to replace the
> reparse point with the real file from auxiliary storage and complete the
> request. That is the stat() result I want when I say stat(filename,
> follow_symlinks=False) or lstat(filename), because this file is not a
> symlink. It's implicitly just the file to end users -- despite whatever
> backend tricks are being played in the kernel to implement other
> behavior such as HSM. Conflating this with a symlink is not right. Lies
> catch up with us. We can't copy it as link via os.symlink and
> os.readlink, and it doesn't get treated like a symlink in API functions.

Okay, I get it now. So we _do_ want to "upgrade" lstat() to stat() when it's 
not a symlink.

> If you want to add an "open reparse point" parameter ...

I don't want to add any parameters - I want to have predictable and reasonable 
default behaviour. os.readlink() already exists for "open reparse point" 
behaviour.

The discussion is only about what os.lstat() returns when you pass in a path to 
a junction.

> As to mount points, yes, I do think we should always traverse them.
> Please see my extended comment and the follow-up example on GitHub.

I'm still not convinced that this is what we want to do. I don't have a true 
Linux machine handy to try it out (Python 3.6 and 3.7 on WSL behave exactly 
like the semantics I'm proposing, but that may just be because it's the Windows 
kernel below it).

> A mount point is not a link. ismount() and islink() can never both be
> true. Also, a POSIX symlink can never be a directory, which is why we
> make stat() pretend directory symlinks aren't directories. If the user
> wants a link, they can use a symlink that's created by os.symlink,
> mklink, new-item -type SymbolicLink, etc.

ismount() is currently not true for junctions. And I can't find any reference 
that says that POSIX symlinks can't point to directories, nor any evidence that 
we suppress symlink-to-directory creation or resolution in Python (also tested 
on WSL).
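
The POSIX half of this point can be checked directly. The following is only a minimal sketch, assuming a platform where os.symlink() works without special privileges; the helper names describe_path and demo are hypothetical. It shows lstat() reporting the link itself while stat() follows it to the target directory. It says nothing about how Windows junctions should behave, which is the open question in this thread.

```python
import os
import stat
import tempfile


def describe_path(path):
    """Compare what lstat() and stat() report for the same path."""
    link_info = os.lstat(path)    # does not follow a symlink
    target_info = os.stat(path)   # follows symlinks
    return {
        "lstat_is_link": stat.S_ISLNK(link_info.st_mode),
        "lstat_is_dir": stat.S_ISDIR(link_info.st_mode),
        "stat_is_dir": stat.S_ISDIR(target_info.st_mode),
    }


def demo():
    """Create a directory plus a symlink to it and describe the symlink."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target_dir")
        link = os.path.join(tmp, "link_to_dir")
        os.mkdir(target)
        os.symlink(target, link, target_is_directory=True)
        return describe_path(link)
```

Calling demo() returns a small dict, so the three flags can be inspected side by side.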

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37587] JSON loads performance improvement for long strings

2019-08-15 Thread Marco Paolini

Marco Paolini  added the comment:

Oops, sorry, here are the right commands:

python -m pyperf timeit -s 'import json;' -s 'c = "a"; s = json.dumps(c * (2**10 // (len(json.dumps(c)) - 2)))' 'json.loads(s)' -o ascii2k.json
python -m pyperf timeit -s 'import json;' -s 'c = "€"; s = json.dumps(c * (2**10 // (len(json.dumps(c)) - 2)))' 'json.loads(s)' -o nonascii2k.json

Mean +- std dev: [ascii2k] 3.69 us +- 0.05 us -> [nonascii2k] 12.4 us +- 0.1 
us: 3.35x slower (+235%)
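
For readers who want to see what the setup expression builds, here is a small sketch of the same arithmetic. The helper name make_payload is hypothetical, and the sketch assumes the default ensure_ascii=True, under which "€" is written as a six-character \u escape.

```python
import json


def make_payload(char, target_len=2**10):
    """Build a JSON string literal whose body is about target_len characters."""
    # Encoded width of one character, not counting the two surrounding quotes.
    encoded_width = len(json.dumps(char)) - 2
    return json.dumps(char * (target_len // encoded_width))


ascii_payload = make_payload("a")     # 1024 one-character items
escaped_payload = make_payload("€")   # 170 six-character \u20ac escapes
```

Both payloads have nearly the same encoded length, so the timing difference comes from decoding escapes rather than from input size.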

--




[issue37587] JSON loads performance improvement for long strings

2019-08-15 Thread Marco Paolini

Marco Paolini  added the comment:

Also worth noting: escape sequences for non-ASCII characters are slower, even 
when the encoded length is the same.

python -m pyperf timeit -s 'import json;' -s  'c = "€"; s = json.dumps(c * 
(2**10 // len(json.dumps(c)) - 2))' 'json.loads(s)' -o nonascii2k.json

python -m pyperf timeit -s 'import json;' -s  'c = "a"; s = json.dumps(c * 
(2**10 // len(json.dumps(c)) - 2))' 'json.loads(s)' -o ascii2k.json

Mean +- std dev: [ascii2k] 2.59 us +- 0.04 us -> [nonascii2k] 9.98 us +- 0.12 
us: 3.86x slower (+286%)

--




[issue37869] Compilation warning on GCC version 7.4.0-1ubuntu1~18.04.1

2019-08-15 Thread Hansraj Das


New submission from Hansraj Das :

I am getting the compilation warning below when compiling the Python source:

***
gcc -pthread -c -Wno-unused-result -Wsign-compare -g -Og -Wall -std=c99 
-Wextra -Wno-unused-result -Wno-unused-parameter 
-Wno-missing-field-initializers -Werror=implicit-function-declaration 
-I./Include/internal -I. -I./Include -DPy_BUILD_CORE -o Objects/obmalloc.o 
Objects/obmalloc.c
Objects/obmalloc.c: In function '_PyObject_Malloc':
Objects/obmalloc.c:1646:16: warning: 'ptr' may be used uninitialized in this 
function [-Wmaybe-uninitialized]
     return ptr;
            ^~~
***

This related pull request has suggestions from Victor: 
https://github.com/python/cpython/pull/15293

--
components: Build
files: warning-2019-08-15 01-15-06.png
messages: 349824
nosy: hansrajdas, vstinner
priority: normal
pull_requests: 15029
severity: normal
status: open
title: Compilation warning on GCC version 7.4.0-1ubuntu1~18.04.1
type: behavior
versions: Python 3.9
Added file: https://bugs.python.org/file48546/warning-2019-08-15 01-15-06.png




[issue37587] JSON loads performance improvement for long strings

2019-08-15 Thread Marco Paolini


Marco Paolini  added the comment:

I also confirm Inada's patch further improves performance!

All my previous benchmarks were done with gcc and PGO optimizations performed 
only with test_json task... maybe this explains the weird results?

I tested the performance of new master 69f37bcb28d7cd78255828029f895958b5baf6ff 
with *all* PGO tasks, reverting my original patch:

diff --git a/Modules/_json.c b/Modules/_json.c
index 112903ea57..9b63167276 100644
--- a/Modules/_json.c
+++ b/Modules/_json.c
@@ -442,7 +442,7 @@ scanstring_unicode(PyObject *pystr, Py_ssize_t end, int strict, Py_ssize_t *next
             if (d == '"' || d == '\\') {
                 break;
             }
-            if (d <= 0x1f && strict) {
+            if (strict && d <= 0x1f) {
                 raise_errmsg("Invalid control character at", pystr, next);
                 goto bail;
             }

... and surprise...

Mean +- std dev: [69f37bcb28d7cd78255828029f895958b5baf6ff] 5.29 us +- 0.07 us 
-> [69f37bcb28d7cd78255828029f895958b5baf6ff-patched] 5.11 us +- 0.03 us: 1.04x 
faster (-4%)

Should we revert my original patch entirely now? Or am I missing something?

--




[issue20490] Show clear error message on circular import

2019-08-15 Thread Anthony Sottile


Change by Anthony Sottile :


--
nosy: +Anthony Sottile
versions: +Python 3.8, Python 3.9 -Python 3.5




[issue37868] `is_dataclass` returns `True` if `getattr` always succeeds.

2019-08-15 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

I am not sure that it is a good idea to accept both a type and an instance, 
but if that is the goal, is_dataclass() should be defined as:

def is_dataclass(obj):
    cls = obj if isinstance(obj, type) else type(obj)
    return hasattr(cls, _FIELDS)

_is_dataclass_instance() should be changed too:

def _is_dataclass_instance(obj):
    return hasattr(type(obj), _FIELDS)
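
A self-contained sketch of why the lookup has to go through the type, under the assumption that the marker attribute is "__dataclass_fields__" (the value of the private _FIELDS constant in the dataclasses module). The function name is_dataclass_by_type and the classes below are illustrative only, not the actual patch.

```python
from dataclasses import dataclass

# Assumed value of the private dataclasses._FIELDS constant.
_FIELDS = "__dataclass_fields__"


def is_dataclass_by_type(obj):
    """Look the marker up on the class, so instance __getattr__ is bypassed."""
    cls = obj if isinstance(obj, type) else type(obj)
    return hasattr(cls, _FIELDS)


class A:
    # Every failed attribute lookup on an instance "succeeds" with 0.
    def __getattr__(self, key):
        return 0


@dataclass
class Point:
    x: int = 0


# hasattr() on the instance is fooled by __getattr__ ...
fooled = hasattr(A(), _FIELDS)
# ... while the lookup on the type is not.
not_fooled = is_dataclass_by_type(A())
```

The same helper still accepts both the Point class and a Point instance.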

--
nosy: +serhiy.storchaka




[issue20490] Show clear error message on circular import

2019-08-15 Thread Anthony Sottile


Change by Anthony Sottile :


--
keywords: +patch
pull_requests: +15028
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/15308




[issue37868] `is_dataclass` returns `True` if `getattr` always succeeds.

2019-08-15 Thread Eric V. Smith


Eric V. Smith  added the comment:

I'm guessing I'm looking up the attribute on the instance, not the class.

--




[issue37866] PyModule_GetState Segmentation fault when called Py_Initialize

2019-08-15 Thread Joannah Nanjekye


Change by Joannah Nanjekye :


--
nosy: +eric.snow, ncoghlan, vstinner




[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-15 Thread Abhilash Raj


Abhilash Raj  added the comment:

@Victor: This is already backported to 3.6. I am not sure what gets 
backported to 3.5 right now; I don't even see a 'Backport to 3.5' label on 
GitHub (which made me think we are discouraged from backporting to 3.5). I can 
work on a manual backport if needed.

This patch most probably won't backport to 2.7 without rewriting it completely, 
since the implementation in 2.7 is very different from what we have today.

--




[issue37863] Speed up hash(fractions.Fraction)

2019-08-15 Thread Tim Peters


Tim Peters  added the comment:

Well, details matter ;-)  Division in Python is expensive.  In the 
exponentiation algorithm each reduction (in general) requires a 122-by-61 bit 
division.  In egcd, after it gets going nothing exceeds 61 bits, and across 
iterations the inputs to the division step get smaller each time around.

So, e.g., when Raymond tried a Fraction with denominator 5813, "almost all" the 
egcd divisions involved inputs each with a single internal Python "digit".  But 
"almost all" the raise-to-the-(P-2) divisions involve a numerator with 5 
internal digits and 3 in the denominator.  Big difference, even if the total 
number of divisions is about the same.
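
The digit counts above can be reproduced with a little arithmetic. This is a rough sketch, assuming 30-bit internal digits (what sys.int_info.bits_per_digit reports on typical 64-bit builds); the helper name internal_digits is hypothetical.

```python
import sys


def internal_digits(n, bits_per_digit=sys.int_info.bits_per_digit):
    """Number of internal digits needed to hold abs(n)."""
    # Ceiling division written with floor division; zero still occupies one digit here.
    return max(1, -(-abs(n).bit_length() // bits_per_digit))


# Values shaped like the ones in the discussion, using 30-bit digits explicitly.
small_operand = internal_digits(5813, 30)           # a 13-bit denominator
reduced_operand = internal_digits(2**61 - 2, 30)    # a 61-bit value below P
product_operand = internal_digits(2**122 - 1, 30)   # a 122-bit product
```

With 30-bit digits these come out to one, three and five digits respectively, matching the figures quoted above.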

--




[issue37642] timezone allows no offset from range (23:59, 24:00)

2019-08-15 Thread Paul Ganssle


Change by Paul Ganssle :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue37834] readlink on Windows cannot read app exec links

2019-08-15 Thread Eryk Sun


Eryk Sun  added the comment:

> Unless your point is that we should _always_ traverse junctions? In 
> which case we have a traverse 'upgrade' scenario (calls to lstat() 
> become calls to stat() when we find out it's a junction).

If we've opened the reparse point to test IO_REPARSE_TAG_SYMLINK, and that's 
not the case, then we need to reopen with reparsing enabled. This is exactly 
what Windows API functions do in order to implement particular behavior for 
just symlinks or just mountpoints. 

For example, if we've opened an HSM reparse point, we must reopen to let the 
file-system filter driver implement its semantics to replace the reparse point 
with the real file from auxiliary storage and complete the request. That is the 
stat() result I want when I say stat(filename, follow_symlinks=False) or 
lstat(filename), because this file is not a symlink. It's implicitly just the 
file to end users -- despite whatever backend tricks are being played in the 
kernel to implement other behavior such as HSM. Conflating this with a symlink 
is not right. Lies catch up with us. We can't copy it as a link via os.symlink 
and os.readlink, and it doesn't get treated like a symlink in API functions.  

If you want to add an "open reparse point" parameter, that would make sense. 
It's of some use to get the tag and implement particular behavior for types of 
reparse points, and particularly for name surrogates, which includes mount 
points (junctions).

As to mount points, yes, I do think we should always traverse them. Please see 
my extended comment and the follow-up example on GitHub.

> Again, not sure why we'd want to hide the ability to manipulate the 
> junction itself from Python users, except to emulate POSIX. And I'd 
> imagine anyone using lstat() is doing it deliberately to manipulate 
> the link and would prefer we didn't force them to add Windows-
> specific code that's even more complex.

A mount point is not a link. ismount() and islink() can never both be true. 
Also, a POSIX symlink can never be a directory, which is why we make stat() 
pretend directory symlinks aren't directories. If the user wants a link, they 
can use a symlink that's created by os.symlink, mklink, new-item -type 
SymbolicLink, etc.

--




[issue37642] timezone allows no offset from range (23:59, 24:00)

2019-08-15 Thread Paul Ganssle


Paul Ganssle  added the comment:


New changeset ed44b84961eb0e5b97e4866c1455ac4093d27549 by Paul Ganssle in 
branch '3.7':
bpo-37642: Update acceptable offsets in timezone (GH-14878) (#15226)
https://github.com/python/cpython/commit/ed44b84961eb0e5b97e4866c1455ac4093d27549


--




[issue37642] timezone allows no offset from range (23:59, 24:00)

2019-08-15 Thread Paul Ganssle


Paul Ganssle  added the comment:


New changeset 27b38b99b3a154fa5c25cd67fe01fb4fc04604b0 by Paul Ganssle in 
branch '3.8':
bpo-37642: Update acceptable offsets in timezone (GH-14878) (#15227)
https://github.com/python/cpython/commit/27b38b99b3a154fa5c25cd67fe01fb4fc04604b0


--




[issue37790] subprocess.Popen() is extremely slow

2019-08-15 Thread John Levon


Change by John Levon :


--
nosy: +movement




[issue37863] Speed up hash(fractions.Fraction)

2019-08-15 Thread Mark Dickinson


Mark Dickinson  added the comment:

> Indeed, I bet it would pay in `long_pow()` to add another test, under the `if 
> (Py_SIZE(b) < 0)` branch, to skip the exponentiation part entirely when b is 
> -1.

Agreed.

--




[issue37863] Speed up hash(fractions.Fraction)

2019-08-15 Thread Mark Dickinson


Mark Dickinson  added the comment:

> That's a major difference between exponents of bit lengths 61 
> ((P-2).bit_length()) and 1 ((1).bit_length()).

Right, but that's stacked up against the cost of the extended Euclidean 
algorithm for computing the inverse. The extended gcd for computing the inverse 
of 1425089352415399815 (for example) modulo 2**61 - 1 takes 69 steps, each one 
of which involves a PyLong quotient-and-remainder division, a PyLong 
multiplication and a subtraction. So that's at least the same order of 
magnitude when it comes to number of operations.
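
To make the per-step cost concrete, here is a pure-Python sketch of an extended Euclidean inverse that also counts its division steps. It is an illustration only, not the C implementation of long_invmod(), and the name egcd_inverse is hypothetical. Its step count for a given input can be compared with the figure quoted above.

```python
def egcd_inverse(a, m):
    """Return (inverse of a modulo m, number of division steps taken)."""
    # Invariants: old_s * a == old_r (mod m) and s * a == r (mod m).
    old_r, r = a % m, m
    old_s, s = 1, 0
    steps = 0
    while r:
        q, rem = divmod(old_r, r)         # one quotient-and-remainder division
        old_r, r = r, rem
        old_s, s = s, old_s - q * s       # one multiplication and one subtraction
        steps += 1
    if old_r != 1:
        raise ValueError("a is not invertible modulo m")
    return old_s % m, steps


P = 2**61 - 1
inverse, step_count = egcd_inverse(1425089352415399815, P)
```

Because P is prime, the result must agree with the Fermat-style pow(a, P - 2, P).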

I'd bet that a dedicated pure C square-and-multiply algorithm (with an addition 
chain specifically chosen for the target modulus, and with the multiplication 
and reduction specialised for the particular form of the modulus) would still 
be the fastest way to go here. I believe optimal addition chains for 2**31-3 
are known, and it shouldn't be too hard to find something close-to-optimal (as 
opposed to proved optimal) for 2**61-3.

--




[issue37834] readlink on Windows cannot read app exec links

2019-08-15 Thread Steve Dower


Steve Dower  added the comment:

Unless your point is that we should _always_ traverse junctions? In which case 
we have a traverse 'upgrade' scenario (calls to lstat() become calls to stat() 
when we find out it's a junction).

Again, not sure why we'd want to hide the ability to manipulate the junction 
itself from Python users, except to emulate POSIX. And I'd imagine anyone using 
lstat() is doing it deliberately to manipulate the link and would prefer we 
didn't force them to add Windows-specific code that's even more complex.

--




[issue37834] readlink on Windows cannot read app exec links

2019-08-15 Thread Steve Dower


Steve Dower  added the comment:

[Quoting from the PR comments]

> traverse is from follow_symlinks and only applies to symlinks. It does not 
> apply to other types of reparse points.

I get your argument that junctions are not symlinks, but I disagree that we 
should try this hard to emulate POSIX semantics rather than letting the OS do 
its thing. We aren't reparsing anything ourselves (in the PR) - if the OS is 
configured to do something different because of a reparse point, we're simply 
going to respect that instead of trying to work around it.

A user who has created a directory junction likely wants it to behave as if the 
directory is actually in that location. Similarly, a user who has created a 
directory symlink likely wants it to behave as if it were in that location. 
PowerShell treats both the same for high-level operations - the LinkType 
attribute is the only way to tell them apart (mirrored in the st_reparse_tag 
field).

> If we've opened the reparse point to test for a symlink, we must reopen for 
> all other types

The premise here is not true - we've opened the reparse point to get the file 
attributes. The only reason we look at the reparse tag at all is to raise an 
error if the user requested traversal and despite that, we ended up at a link, 
and I'm becoming less convinced that should be an error anyway (this is 
different from nt.readlink() and ntpath.realpath(), of course, where we want to 
read the link and return where it points).

nt.stat() is trying to read the file attributes, and if they are not accessible 
then raising is the correct behaviour, so I don't see why we should try any 
harder than the OS here.

--




[issue37863] Speed up hash(fractions.Fraction)

2019-08-15 Thread Tim Peters


Tim Peters  added the comment:

Why I expected a major speedup from this:  the binary exponentiation routine 
(for "reasonably small" exponents) does 30 * ceiling(exponent.bit_length() / 
30) multiply-and-reduces, plus another for each bit set in the exponent.  
That's a major difference between exponents of bit lengths 61 
((P-2).bit_length()) and 1 ((1).bit_length()).  Indeed, I bet it would pay in 
`long_pow()` to add another test, under the `if (Py_SIZE(b) < 0)` branch, to 
skip the exponentiation part entirely when b is -1.  `long_invmod()` would be 
the end of it then.  Because I expect using an exponent of -1 for modular 
inverse will be overwhelmingly more common than using any other negative 
exponent with a modulus.
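
The estimate above is easy to evaluate for the two exponents mentioned. This sketch just plugs numbers into the stated formula; it does not measure long_pow(), and the function name estimated_mulmods is hypothetical.

```python
import math


def estimated_mulmods(exponent):
    """Evaluate the multiply-and-reduce estimate quoted above."""
    per_chunk = 30 * math.ceil(exponent.bit_length() / 30)
    set_bits = bin(exponent).count("1")
    return per_chunk + set_bits


P = 2**61 - 1
fermat_cost = estimated_mulmods(P - 2)   # exponent used for pow(x, P - 2, P)
trivial_cost = estimated_mulmods(1)      # exponent left over after the -1 trick
```

Under this formula the 61-bit exponent costs 150 multiply-and-reduces against 31 for an exponent of 1.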

--




[issue37863] Speed up hash(fractions.Fraction)

2019-08-15 Thread Mark Dickinson


Mark Dickinson  added the comment:

> Should be significantly faster.  If not, the new "-1" implementation should 
> be changed ;-)

I wouldn't have bet on this, before seeing Raymond's benchmark results. Writing 
a fast path for invmod for C-size integers is still on my to-do list; the 
current implementation does way too many Python-level divisions.

--




[issue37860] Add netlify deploy preview for docs

2019-08-15 Thread Brett Cannon


Change by Brett Cannon :


--
nosy: +brett.cannon




[issue37207] Use PEP 590 vectorcall to speed up calls to range(), list() and dict()

2019-08-15 Thread miss-islington


miss-islington  added the comment:


New changeset 37806f404f57b234902f0c8de9a04647ad01b7f1 by Miss Islington (bot) 
(Jeroen Demeyer) in branch 'master':
bpo-37207: enable vectorcall for type.__call__ (GH-14588)
https://github.com/python/cpython/commit/37806f404f57b234902f0c8de9a04647ad01b7f1


--
nosy: +miss-islington




[issue37834] readlink on Windows cannot read app exec links

2019-08-15 Thread Steve Dower


Steve Dower  added the comment:

> I assume you're talking about realpath() here ...

Yes, and so are you :) Let's move that discussion to issue9949 and/or PR 15287.

> I think os.chdir should raise an exception when passed a device path.

When the OS starts returning an error code for this case, we can start raising 
an exception. It might be worth reporting these cases though, as you're right 
that they don't seem to be handled correctly.

--




[issue37645] Replace PyEval_GetFuncName/PyEval_GetFuncDesc

2019-08-15 Thread Petr Viktorin


Petr Viktorin  added the comment:

I am not convinced.

I'm wary of making error messages depend on the str representation of a 
function; that would prevent us from changing it later.
I'm wary of "%S" used in error messages. Those are for the programmer, not the 
user, so they should prefer __repr__.

I train beginners to recognize "<function f at 0x7f9f4bbe5e18>" as a sign of 
omitted parentheses. The ugliness is useful: it shows you're dealing with an 
internal object, not a data value.

So, I think "<function f at 0x7f9f4bbe5e18>" is much better than just "f()". I wouldn't mind 
"" (maybe even with the full signature), but that doesn't quite 
help this case.
(I don't care much for the "at 0x7f9f4bbe5e18" part, but that's not the issue 
here.)
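
As a Python-level illustration only (the C-level "%S" and "%R" formats are what is actually under discussion), a plain function defines no __str__ of its own, so both conversions currently give the same angle-bracket text:

```python
def f():
    return 42


# Forgetting the parentheses formats the function object, not its result.
as_repr = "%r" % (f,)
as_str = "%s" % (f,)
called = "%r" % (f(),)
```

Here as_repr and as_str are identical and start with "<function", while called is just "42".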

--
nosy: +petr.viktorin




[issue37867] docs says subprocess.run accepts a string but this does not work on linux

2019-08-15 Thread simon mackenzie


simon mackenzie  added the comment:

Would be clearer if the arguments were listed before the return object.

On Thu, 15 Aug 2019 at 15:05, SilentGhost  wrote:

>
> SilentGhost  added the comment:
>
> But docs don't say that at all. You're looking at description of an
> attribute of returned object. And of course it can be a string, under
> certain conditions. The attributes of CompletedProcess and function
> arguments are described in the standard way, and I haven't heard of anyone
> mixing the two before.
>
> --
> resolution:  -> not a bug
> stage:  -> resolved
> status: open -> closed
>

--




[issue37867] docs says subprocess.run accepts a string but this does not work on linux

2019-08-15 Thread SilentGhost


SilentGhost  added the comment:

But docs don't say that at all. You're looking at description of an attribute 
of returned object. And of course it can be a string, under certain conditions. 
The attributes of CompletedProcess and function arguments are described in the 
standard way, and I haven't heard of anyone mixing the two before.

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




[issue37867] docs says subprocess.run accepts a string but this does not work on linux

2019-08-15 Thread simon mackenzie


simon mackenzie  added the comment:

Technically true, but I am not the first person to have incorrectly read
this as saying that the argument can be a string, which suggests it is not
clear to the reader. Maybe it should be stated explicitly in the description
of run, as it is not obvious or intuitive.

On Thu, 15 Aug 2019 at 13:47, SilentGhost  wrote:

>
> SilentGhost  added the comment:
>
> The only place this phrase appears is in CompletedProcess.args description
> and it is correct there. Whether args arguments of subprocess.run (or
> generally Popen) can be a list or a string is discussed in Frequently Used
> Arguments section, and it is perfectly clear from reading the text under
> which condition you can pass a string as args argument. I don't think
> anything needs fixing, both implementation and docs are correct.
>
> --
> assignee:  -> docs@python
> components: +Documentation
> nosy: +SilentGhost, docs@python
> type:  -> behavior
>

--
title: docs says subprocess.run  accepts a string but this does not work on 
linux -> docs says subprocess.run accepts a string but this does not work on 
linux




[issue37867] docs says subprocess.run accepts a string but this does not work on linux

2019-08-15 Thread SilentGhost


SilentGhost  added the comment:

The only place this phrase appears is in CompletedProcess.args description and 
it is correct there. Whether args arguments of subprocess.run (or generally 
Popen) can be a list or a string is discussed in Frequently Used Arguments 
section, and it is perfectly clear from reading the text under which condition 
you can pass a string as args argument. I don't think anything needs fixing, 
both implementation and docs are correct.
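
A short sketch of the two forms being discussed, assuming Python 3.7+ for capture_output and an "echo" command available through the platform shell. CompletedProcess.args simply echoes back whatever was passed in, which is why its description mentions both a list and a string.

```python
import subprocess
import sys

# List form: works without a shell on every platform.
list_cmd = [sys.executable, "-c", "print('hello')"]
list_result = subprocess.run(list_cmd, capture_output=True, text=True)

# String form with arguments: needs shell=True on POSIX, otherwise the whole
# string is treated as the name of the program to execute.
string_result = subprocess.run(
    "echo hello", shell=True, capture_output=True, text=True
)
```

After this runs, list_result.args is the original list and string_result.args is the original string.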

--
assignee:  -> docs@python
components: +Documentation
nosy: +SilentGhost, docs@python
type:  -> behavior




[issue37868] `is_dataclass` returns `True` if `getattr` always succeeds.

2019-08-15 Thread Eric V. Smith


Change by Eric V. Smith :


--
assignee:  -> eric.smith
nosy: +eric.smith

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37868] `is_dataclass` returns `True` if `getattr` always succeeds.

2019-08-15 Thread Johan Hidding


New submission from Johan Hidding :

Given a class `A` that overloads `__getattr__`

```
class A:
  def __getattr__(self, key):
    return 0
```

An instance of this class is always identified as a dataclass.

```
from dataclasses import is_dataclass

a = A()
print(is_dataclass(a))
```

gives the output `True`.

Possible fix: check for the instance type.

```
is_dataclass(type(a))
```

does give the correct answer.

--
components: Library (Lib)
messages: 349802
nosy: Johan Hidding
priority: normal
severity: normal
status: open
title: `is_dataclass` returns `True` if `getattr` always succeeds.
type: behavior
versions: Python 3.7




[issue37867] docs says subprocess.run accepts a string but this does not work on linux

2019-08-15 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan :


--
nosy: +gregory.p.smith




[issue29535] datetime hash is deterministic in some cases

2019-08-15 Thread Mark Dickinson


Mark Dickinson  added the comment:

> shouldn't numerics, datetime objects, and tuples be non-deterministically 
> hashed as well? [...]

Making the numeric hash non-predictable while maintaining its current 
properties would be difficult.

But fortunately, I don't think it's necessary. IIUC, the original DOS attack 
involved carefully-crafted collections of keywords and values being passed to a 
website backend, with that backend then putting those keywords and values into 
a Python dictionary. I'd expect that there are *way* more places where a dict 
is being constructed with string keys in this way than with numeric keys. In 
fact, I think it's reasonable to assume that there are no websites vulnerable 
to a DOS via *numeric* hash collisions until we see evidence otherwise.

FWIW, I'd expect the same to be true for datetime objects; I'm not sure why 
they were originally included. IANASE, but it seems to me that covering Unicode 
strings and bytestrings should be enough in practice.

--
nosy: +mark.dickinson




[issue37867] docs says subprocess.run accepts a string but this does not work on linux

2019-08-15 Thread simon mackenzie


New submission from simon mackenzie :

The docs for subprocess.run say "The arguments used to launch the process. This 
may be a list or a string."

This works on Windows, but on Linux it has to be a list. Either the behaviour 
needs fixing or the docs need to be changed.

--
messages: 349800
nosy: simon mackenzie
priority: normal
severity: normal
status: open
title: docs says subprocess.run  accepts a string but this does not work on 
linux




[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-15 Thread Peter Edwards


Peter Edwards  added the comment:

On Wed, 14 Aug 2019 at 23:13, STINNER Victor  wrote:

>
> STINNER Victor  added the comment:
>
> About PR 13649, I'm not sure that _PyThread_preferred_stacksize() is still
> relevant, since my change fixed test_faulthandler test_register_chain(). I
> chose my change since it's less invasive: it only impacts faulthandler, and
> it minimalizes the memory usage (especially when faulthandler is not used).
>

Sure - there's no reason for it to exist if you don't want to use it to fix
the issue here.

> Python/thread_pthread.h refactor changes of PR 13649 are interested. Would
> you like to extract them into a new PR which doesn't add
> _PyThread_preferred_stacksize() but just add new PLATFORM_xxx macros?
>

Yes, certainly.

> Maybe test_faulthandler will fail tomorrow on a new platform, but I prefer
> to open a discussion once such case happens, rather than guessing how
> faulthandler can crash on an hypothetical platforms.

Well, one argument for the dynamic approach is that existing Python
binaries can adjust without needing to be respun for new CPUs. I think
SIGSTKSZ is a vestige from when CPU architectures had consistently sized
register sets across models. It's interesting to read the comment on the
IA64 definition for SIGSTKSZ:

https://github.com/torvalds/linux/blob/master/arch/ia64/include/uapi/asm/signal.h#L83

> I'm sure that libc developers are well aware of the FPU state size and
> update SIGSTKSZ accordingly.
>

The current value comes from the kernel sources, and has not changed since
2005 at the latest (the initial git commit of the kernel), which I
think predates xsave/xrestore by some margin. I don't think it's a useful
measure of anything in the real (x86) world today.

> glibc code computing xsave_state_size:
>
>
> https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86/cpu-features.c;h=4bab1549132fe8a4c203a70b8c7a51c1dc304049;hb=HEAD#l223
>
> --
>
> If tomorrow, it becomes too hard to choose a good default value for
> faulthandler stack size, another workaround would be to make it
> configurable, as Python lets developers choose the thread stack size:
> _thread.stack_size(size).
>
> --
>

--




[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-15 Thread Peter Edwards


Peter Edwards  added the comment:

On Wed, 14 Aug 2019 at 22:32, STINNER Victor  wrote:

>
> We are talking about the faulthandler_user() function of
> Modules/faulthandler.c. It is implemented in pure C, it doesn't allocate
> memory on the heap, it uses a very small set of functions (write(),
> sigaction(), raise()) and it tries to minimize its usage of the stack
> memory.
>

I was more concerned about what was happening in the chained handler, which
will also run on the restricted stack: I had assumed that was potentially
running arbitrary Python code. That's actually probably incorrect, now that
I think about it, but it's harder to infer much about its stack usage
directly in faulthandler.c. I'll take a look (just to satisfy myself, more
than anything)

> It is very different than the traceback module which is implemented in
> pure Python.
>

Right, totally - I had jumped to the conclusion that it would end up
executing in the interpreter via the chain, but, as I say, that's probably
wrong. I'm not sure what guarantees the chained signal handler makes about
its stack usage. (Will educate myself)
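
To make the chaining concrete, here is a small sketch of what
test_register_chain exercises, assuming a Unix platform with SIGUSR1 and
that it runs in the main thread. The function name demo_register_chain is
made up for illustration. The previous handler that faulthandler chains to
is the C-level handler installed by signal.signal(), which only records
that the signal arrived; the Python callable runs later from the
interpreter loop, not on the alternate signal stack.

```python
import faulthandler
import os
import signal
import tempfile


def demo_register_chain():
    """Dump a traceback on SIGUSR1, then chain to the previous handler."""
    called = []

    def py_handler(signum, frame):
        # Runs later in the main thread, on the normal stack.
        called.append(signum)

    previous = signal.signal(signal.SIGUSR1, py_handler)
    with tempfile.TemporaryFile(mode="w+") as log:
        # chain=True asks faulthandler to call the previously installed
        # handler after it has written the traceback to the file.
        faulthandler.register(signal.SIGUSR1, file=log, chain=True)
        try:
            os.kill(os.getpid(), signal.SIGUSR1)
            log.seek(0)
            dump = log.read()
        finally:
            faulthandler.unregister(signal.SIGUSR1)
            signal.signal(signal.SIGUSR1, previous)
    return called, dump
```

If the chained handler is only that C trampoline, its stack usage should be
small, but I have not verified that for every platform.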

> faulthandler is really designed to debug segmentation fault, stack
> overflow, Python hang (like a deadlock), etc.





[issue37866] PyModule_GetState Segmentation fault when called Py_Initialize

2019-08-15 Thread Hua Liu


New submission from Hua Liu :

A crash file was produced when I tried to call Py_Initialize from a C file.
Python 2.7 and Python 3.5.3 are installed on my x86-64 box.




[issue21131] test_faulthandler.test_register_chain fails on 64bit ppc/arm with kernel >= 3.10

2019-08-15 Thread Peter Edwards


Peter Edwards  added the comment:

On Wed, 14 Aug 2019 at 22:34, STINNER Victor  wrote:

>
> ...I'm not sure that we can fix bpo-37851 in Python 3.7.

That's totally reasonable, sure.




[issue37866] PyModule_GetState Segmentation fault when called Py_Initialize

2019-08-15 Thread Hua Liu


Change by Hua Liu :


Added file: https://bugs.python.org/file48545/backtrace.txt




[issue37866] PyModule_GetState Segmentation fault when called Py_Initialize

2019-08-15 Thread Hua Liu


Change by Hua Liu :


--
nosy: Hua Liu
priority: normal
severity: normal
status: open
title: PyModule_GetState Segmentation fault when called Py_Initialize
type: crash
versions: Python 3.5




[issue37865] tempfile.NamedTemporaryFile() raises exception on close() when file is absent

2019-08-15 Thread Karthikeyan Singaravelan


Karthikeyan Singaravelan  added the comment:

I think this is the same as https://bugs.python.org/issue29573 .

--
nosy: +xtreak




[issue37865] tempfile.NamedTemporaryFile() raises exception on close() when file is absent

2019-08-15 Thread Andrei Pashkin


New submission from Andrei Pashkin :

Here is an example:

import tempfile
import os


with tempfile.NamedTemporaryFile() as temp:
    os.remove(temp.name)


And here is an error it produces:

Traceback (most recent call last):
  File "test.py", line 6, in <module>
    os.remove(temp.name)
  File "/usr/lib/python3.7/tempfile.py", line 639, in __exit__
    self.close()
  File "/usr/lib/python3.7/tempfile.py", line 646, in close
    self._closer.close()
  File "/usr/lib/python3.7/tempfile.py", line 583, in close
    unlink(self.name)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpzn8gtiz1'
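
Until this is changed in tempfile, one possible workaround is to suppress
the error around the whole with block. This is only a sketch, assuming a
POSIX system (removing an open file fails on Windows); the function name
remove_inside_context is made up for illustration. Note that it also hides
a FileNotFoundError raised by anything else inside the block.

```python
import contextlib
import os
import tempfile


def remove_inside_context():
    """Remove the temporary file early and tolerate the failing cleanup."""
    with contextlib.suppress(FileNotFoundError):
        with tempfile.NamedTemporaryFile() as temp:
            name = temp.name
            os.remove(name)  # the file is gone before __exit__ runs
    return name
```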

--
messages: 349794
nosy: pashkin
priority: normal
severity: normal
status: open
title: tempfile.NamedTemporaryFile() raises exception on close() when file is absent
versions: Python 3.6, Python 3.7
