[issue31783] Race condition in ThreadPoolExecutor when scheduling new jobs while the interpreter shuts down

2017-10-13 Thread Steven Barker

New submission from Steven Barker <blckkn...@gmail.com>:

While investigating a Stack Overflow question (here: 
https://stackoverflow.com/q/46529767/1405065), I found that there may be a race 
condition in the cleanup code for concurrent.futures.ThreadPoolIterator. The 
behavior in normal situations is fairly benign (the executor may run a few more 
jobs than you'd expect, but exits cleanly), but in rare situations it might 
lose track of a running thread and allow the interpreter to shut down while the 
thread is still trying to do work.

Here's some example that concisely demonstrates the situation where the issue 
can come up (it doesn't actually cause the race to go the wrong way on my 
system, but sets up the possibility for it to occur):


from threading import current_thread
from concurrent.futures import ThreadPoolExecutor
from time import sleep

pool = ThreadPoolExecutor(4)

def f(_):
print(current_thread().name)
future = pool.submit(sleep, 0.1)
future.add_done_callback(f)

f(None)


The callback from completion of one job schedules another job, indefinitely.

When run in an interactive session, this code will print thread names forever. 
You should get "MainThread" once, followed by a bunch of 
"ThreadPoolExecutor-X_Y" names (often the same name will be repeated most of 
the time, due to the GIL I think, but in theory the work could rotate between 
threads). The main thread will return to the interactive REPL right away, so 
you can type in other stuff while the executor's worker threads are printing 
stuff the background (I suggest running pool.shutdown() to make them stop). 
This is fine.

But if the example code is run as a script, you'll usually get "MainThread", 
followed by exactly four repeats of "ThreadPoolExecutor-0_0" (or fewer in the 
unlikely case that the race condition strikes you). That's the number of 
threads the ThreadPoolExecutor was limited to, but note that the thread name 
that gets printed will usually end with 0 every time (you don't get one output 
from each worker thread, just the same number of outputs as there are threads, 
all from the first thread). Why you get that number of outputs (instead of zero 
or one or an infinite number) was one part of the Stack Overflow question.

The answer turned out to be that after the main thread has queued up the first 
job in the ThreadPoolExecutor, it runs off the end of the script's code, so it 
starts shutting down the interpreter. The cleanup function _python_exit (in 
Lib/concurrent/futures/thread.py) gets run since it is registered with atexit, 
and it tries to signal the worker threads to shut down cleanly. However, the 
shutdown logic interacts oddly with an executor that's still spinning up its 
threads. It will only signal and join the threads that existed when it started 
running, not any new threads.

As it turns out, usually the newly spawned threads will shut themselves down 
immediately after they spawn, but as a side effect, the first worker thread 
carries on longer than expected, doing one additional job for each new thread 
that gets spawned and exiting itself only when the executor has a full set. 
This is why there are four outputs from the worker thread instead of some other 
number. But the exact behavior is dependent on the thread scheduling order, so 
there is a race condition.

You can demonstrate a change in behavior from different timing by putting a 
call to time.sleep at the very top of the _worker() function (delaying how 
quickly the new threads can get to the work queue). You should see the program 
behavior change to only print "ThreadPoolExecutor-0_0" once before exiting.

Lets go through the steps of the process:

1. The main thread runs f() and schedules a job (which adds a work item to the 
executor's work queue). The first worker thread is spawned by the executor to 
run the job, since it doesn't have any threads yet. The main thread also sets a 
callback on the future to run f() again.

2. The main thread exits f() and reaches the end of the script, so it begins 
the interpreter shutdown process, including calling atexit functions. One of 
those is _python_exit, which adds a reference to None to the executor's work 
queue. Note that the None is added *after* the job item from step 1 (since 
they're both done by the same main thread). It then calls join() on the worker 
thread spawned in step 1, waiting for it to exit. It won't try to join any 
other threads that spawn later, since they don't exist yet.

3. The first worker thread spawned by the executor in step 1 begins running and 
pops an item off the work queue. The first item is a real job, so it runs it. 
(The first parts of this step may be running in parallel with step 2, but 
completing job will take much longer than step 2, so the rest of this step runs 
by itself after step 2 has finished.) Eventually the job ends and the callback 
function on the Future is 

[issue2786] Names in function call exception should have class names, if they're methods

2016-08-04 Thread Steven Barker

Steven Barker added the comment:

A few weeks ago I reported issue 27389 which looks to be a duplicate of this 
issue. Has any progress been made on this issue?

--
nosy: +Steven.Barker

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2786>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue27389] When a TypeError is raised due to invalid arguments to a method, it should use __qualname__ to identify the class the method is in

2016-08-04 Thread Steven Barker

Steven Barker added the comment:

Yes, this looks to be a duplicate of that issue. I'm closing this issue as a 
duplicate, but I don't seem to be able to set the Superseder field. If you can, 
please set that to Issue 2786.

--
resolution:  -> duplicate
status: open -> closed

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27389>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue27389] When a TypeError is raised due to invalid arguments to a method, it should use __qualname__ to identify the class the method is in

2016-06-25 Thread Steven Barker

New submission from Steven Barker:

When a method is called with incorrect arguments (too many or too few, for 
instance), a TypeError is raised. The message in the TypeError generally of the 
form:

foo() takes 2 positional arguments but 3 were given

I think the message should include the class name along with the method name, 
so it would say `SomeClass.foo` instead of just `foo`. Since that is 
`SomeClass.foo`'s __qualname__, it should not be too hard to get the right name 
in most situations.

Here's an example showing how the current error messages can be ambiguous:

class A:
def foo(self, x):
pass

class B:
def foo(self, x, y): # different method signature!
pass

lst = [A(), B()]

for item in lst:
item.foo(1)  # raises TypeError: foo() missing 1 required positional 
argument: 'y'"

for item in lst:
item.foo(1, 2) # raises "TypeError: foo() takes 2 positional arguments but 
3 were given"

In neither loop is is clear which class's `foo` method is causing the exception 
(nor does the traceback help, since it only shows the `item.foo(...)` line). Of 
course, in this example it's easy to see the two classes have `foo` methods 
with different signatures, but if there were hundreds of objects in the list 
and they were instances of dozens of different classes it would be rather more 
annoying to figure out which class has the incorrect method signature.

I've looked through the code and the two exceptions above come from the 
`format_missing` and `too_many_positional` functions in Python/ceval.c . It's 
not obvious how to patch them to use `__qualname__` instead of `__name__`, 
since they are taking the name from a code object, rather than a function 
object or bound method object (and code objects don't have an equivalent to 
`__qualname__`, only `co_name` which is related to `__name__`).

Several other argument related TypeError exceptions are raised directly in 
_PyEval_EvalCodeWithName, which *does* have a `qualname` parameter, though the 
function doesn't use it for much. It's also where the other functions described 
above get called from, so it could probably pass the `qualname` along to them. 
Alas, it seems that in some common cases (such as calling a Python function 
with any kind of argument unpacking like `*foo` or `**foo`), the value of the 
`qualname` parameter is actually Null, so it may not be of much help.

A few extra TypeErrors related to function calls are raised directly in the 
gigantic `PyEval_EvalFrameEx` function. These seem to all use 
`PyEval_GetFuncName` to get the name, so perhaps we could modify its behavior 
to return the method's `__qualname__` rather than the `__name__`. (I have no 
idea what backwards compatibility issues this might cause. Perhaps a new 
function that returns the qualname would be better.)

--
messages: 269274
nosy: Steven.Barker
priority: normal
severity: normal
status: open
title: When a TypeError is raised due to invalid arguments to a method, it 
should use __qualname__ to identify the class the method is in
type: enhancement
versions: Python 3.6

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27389>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26631] Unable to install Python 3.5.1 on Windows 10 - Error 0x80070643: Failed to install MSI package.

2016-06-13 Thread Steven Barker

Changes by Steven Barker <blckkn...@gmail.com>:


Added file: http://bugs.python.org/file43372/Python 3.5.2rc1 
(64-bit)_20160613002148_008_launcher_AllUsers.log

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26631>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26631] Unable to install Python 3.5.1 on Windows 10 - Error 0x80070643: Failed to install MSI package.

2016-06-13 Thread Steven Barker

Changes by Steven Barker <blckkn...@gmail.com>:


Added file: http://bugs.python.org/file43371/Python 3.5.2rc1 
(64-bit)_20160613002950.log

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26631>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue26631] Unable to install Python 3.5.1 on Windows 10 - Error 0x80070643: Failed to install MSI package.

2016-06-13 Thread Steven Barker

Steven Barker added the comment:

I've just encountered this error when trying to update to the 3.5.2rc1 release 
(64-bit Python, Windows 10). I'd already had the 3.5.1 release installed, so I 
suppose it could have been an issue with the older installer trying to 
uninstall the old version.

I believe the issue may be related to installing the `py.exe` launcher program. 
I have an alpha release of Python 3.6.0 installed, and so the 3.5.2rc1 
installer gave me a popup saying that the launcher would not be updated since I 
already had a newer version. I assumed that was unrelated to the issue though 
because the installation process kept going for a couple minutes after seeing 
that message. However, trying the install again with custom settings and 
unselecting the launcher program made it work. I'll attach log the main log 
file for both the failed and successful runs of the installer, and the 
launcher_AllUsers file from the failed run.

Oh, I just realized the successful run was with the web installer while the 
failed runs were the full download version. I have no idea if that would have 
made a difference or not.

--
nosy: +Steven.Barker
Added file: http://bugs.python.org/file43370/Python 3.5.2rc1 
(64-bit)_20160613002148.log

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26631>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue27072] random.getrandbits is limited to 2**31-1 bits on 64-bit Windows

2016-05-21 Thread Steven Barker

New submission from Steven Barker:

The C implementation of `_random.Random.getrandbits` is unnecessarily limited 
in the number of bits it can produce on 64-bit Windows systems. I learned about 
this issue in discussion of my answer to this stack overflow question:

 
http://stackoverflow.com/questions/37356338/is-there-a-predictable-replacement-for-os-urandom-using-pythons-random-module

The argument parsing code in `getrandbits` forces its Python `int` argument to 
fit into a C `int` variable. On my 64-bit Windows system, any value larger than 
`2**31-1` causes a `OverflowError`.

Since the number of bits is directly related to how much memory we need to 
allocate (in the non-fast case), I think `Py_ssize_t` would be more appropriate 
type than a regular `int`. This probably isn't an issue on non-Windows or 
64-bit systems, where `int` and `Py_ssize_t` will have the same size.

I'm attaching a very simple patch that changes the types of the relevant 
variables and the format code in the call to `PyArg_ParseTuple`. The code works 
and still passes its tests with the patch. I considered adding an additional 
test for this issue, but passing test cases would require allocations of 
several gigabytes of memory which seems a rather unfriendly thing to add in a 
test for a fairly minor issue.

This issue doesn't effect the pure Python implementation of 
`random.SystemRandom.getrandbits`, which already worked fine when large numbers 
of bits were requested. The documentation for `random.getrandbits` doesn't 
mention any limitation on the number of bits provided, so I don't imagine there 
will be backwards compatibility issues. I also don't expect the change to have 
any impact on third party `Random` replacement classes.

For convenience, here's the contents of the very short patch (which I'll also 
attach):

diff --git a/Modules/_randommodule.c b/Modules/_randommodule.c
index fd6b230..3bf564f 100644
--- a/Modules/_randommodule.c
+++ b/Modules/_randommodule.c
@@ -348,12 +348,12 @@ random_setstate(RandomObject *self, PyObject *state)
 static PyObject *
 random_getrandbits(RandomObject *self, PyObject *args)
 {
-int k, i, words;
+Py_ssize_t k, i, words;
 PY_UINT32_T r;
 PY_UINT32_T *wordarray;
 PyObject *result;

-if (!PyArg_ParseTuple(args, "i:getrandbits", ))
+if (!PyArg_ParseTuple(args, "n:getrandbits", ))
 return NULL;

 if (k <= 0) {

--
components: Library (Lib)
files: getrandbits.diff
keywords: patch
messages: 265987
nosy: Steven.Barker
priority: normal
severity: normal
status: open
title: random.getrandbits is limited to 2**31-1 bits on 64-bit Windows
type: enhancement
versions: Python 3.6
Added file: http://bugs.python.org/file42919/getrandbits.diff

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27072>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12920] inspect.getsource only works for objects loaded from files, not interactive session

2016-05-18 Thread Steven Barker

Steven Barker added the comment:

The problem being discussed here just came up on Stack Overflow today: 
http://stackoverflow.com/questions/37288135/inspect-module-for-python/

The cause of the incorrect error message is pretty clear. The relevant code 
from `inspect.getfile` should do something better when the object has a 
`__module__` attribute, but the module named (when looked up in `sys.modules`) 
does not have a `__file__` attribute. Currently it says the module is a builtin 
class, which is total nonsense.

A very basic fix would be to have an extra error case:

if isclass(object):
if hasattr(object, '__module__'):
object = sys.modules.get(object.__module__)
if hasattr(object, '__file__'):
return object.__file__
raise TypeError()  # need a relevant message here!!!
raise TypeError('{!r} is a built-in class'.format(object))

It might be easier to make a meaningful message if the code after the first 
`if` didn't overwrite `object` with the module.

But, really, it would be nice to figure out a better fix, which would make the 
relevant inspect functions actually work for classes defined interactively in 
the `__main__` module.

--
nosy: +Steven.Barker

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12920>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23447] Import fails when doing a circular import involving an `import *`

2015-10-01 Thread Steven Barker

Steven Barker added the comment:

Thanks for looking at the issue Brett.

I think you're right that your patch has incorrect semantics, since it doesn't 
save the value to the provided namespace if it had to go through the special 
path to find the name. I think my patch got that part right (though I 
definitely missed one PyUnicode_Check).

I've just noticed that our patches may not always do the right thing in all 
error situations. They may return -1 without setting an exception (or perhaps 
passing on a strange one from an internal call) if there's something weird 
going on like `__name__` not existing or not being a string.

I'm attaching a new version of my patch that moves the PyErr_Format call down 
to the later check for `value == NULL`. This handles both the basic case where 
nothing strange goes on but the requested name is not found, and the weird 
cases where unexpected stuff goes wrong. It will replace any errors raised by 
the earlier code with an ImportError. I think an ImportError is the appropriate 
exception for most error cases, but perhaps some internal errors should not be 
overwritten and more complicated error logic is needed.

I also notice that the code in the "import_from" function (from which I 
borrowed heavily for my patch) was changed in one of the 3.5rc patches to have 
new error handling (it raises ImportError more often I think). I don't see an 
obvious way to copy its new error logic to import_all_from, but my C coding 
skills are rather rusty so I could be missing an easy approach.

--
Added file: http://bugs.python.org/file40649/23447_4.diff

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23447>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23447] Import fails when doing a circular import involving an `import *`

2015-09-08 Thread Steven Barker

Steven Barker added the comment:

I've managed to partially fix my build environment, so I can verify that my 
patch (attached previously) works as intended. I'm not completely sure that it 
doesn't break unrelated things as my build still has lots of failing tests 
(which all appear to be unrelated to this issue). Nothing obviously new breaks 
with the patch applied.

The test I added in the patch does fail (as expected) on the unpatched 
interpreter, but passes on a patched version (though I've just noticed that the 
attempt at catching the exception doesn't work, since the current import code 
raises an AttributeError rather than the ImportError I had expected). I don't 
believe any other tests failed anew on the patched code, but like I said, I've 
got a lot of unrelated test failures so I might be missing some subtle issues.

I'm attaching an updated version of the patch with better exception handling in 
the test code. The patch applies cleanly to the latest dev branch. I'd 
appreciate any reviews or other feedback.

--
Added file: http://bugs.python.org/file40415/23447_3.diff

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23447>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17991] ctypes.c_char gives a misleading error when passed a one-character unicode string

2015-07-29 Thread Steven Barker

Steven Barker added the comment:

I was looking over some of the bugs I've contributed to, and it looks like this 
one has been fixed. It should be marked as a dupe of issue 22161 and closed (I 
can close, but not set a superseder, it seems).

--
resolution:  - duplicate
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17991
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23447] Import fails when doing a circular import involving an `import *`

2015-07-28 Thread Steven Barker

Steven Barker added the comment:

I've finally gotten around to looking into this issue and it turns out that 
fixing from package import * to work with circular imports is pretty easy, as 
all that needs to be done is apply the same logic from the patch for issue 
17636 to the loop in the next function. I'll attach a patch that does this.

Unfortunately, my Windows build environment seems to be completely messed up at 
the moment and I'm getting link errors on everything in my python repo, so I've 
not been able to test the code at all. Since I have no idea when I'll actually 
have the time to learn what the hell's going wrong with MSVC, I though I'd 
submit my patch and perhaps somebody on a better OS can make sure it works 
properly. I've included a new test which should show the issue with circular 
imports (and verify that the rest of the patch fixes it, hopefully).

--
keywords: +patch
Added file: http://bugs.python.org/file40047/23447_2.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23447
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23495] The writer.writerows method should be documented as accepting any iterable (not only a list)

2015-02-20 Thread Steven Barker

New submission from Steven Barker:

The documentation for the csv.writer.writerows method says that it expects a 
list of row objects, when it really will accept any iterable that yields rows 
(such as a generator). While it's often nice for code to be more accepting than 
the documented requirements, I think the docs in this case should state that 
writerows() expects an iterable, rather than misinforming users that a list is 
required.

This documentation issue was brought up in a Stack Overflow question here: 
http://stackoverflow.com/questions/28636848/csv-writer-writerows-takes-iterator

I expect the necessary documentation patch will be pretty trivial, and if 
nobody else gets to it first, I will try to provide one when I have enough time 
to update my cpython checkout (not soon, alas).

--
assignee: docs@python
components: Documentation
messages: 236328
nosy: Steven.Barker, docs@python
priority: normal
severity: normal
status: open
title: The writer.writerows method should be documented as accepting any 
iterable (not only a list)
type: enhancement
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23495
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23495] The writer.writerows method should be documented as accepting any iterable (not only a list)

2015-02-20 Thread Steven Barker

Steven Barker added the comment:

Another Stack Overflow user pointed out that the DictWriter's writerows 
implementation (in pure Python) unnecessarily converts whatever its argument is 
into a list in memory before passing it on to the builtin writer.writerows 
method which would accept any iterable type. Probably it should be changed to 
use map or a generator expression instead.

So, perhaps this issue isn't purely documentation, but actually a small 
behavior enhancement as well.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23495
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23447] Relative imports with __all__ attribute

2015-02-12 Thread Steven Barker

Steven Barker added the comment:

This issue is a special case of the problem discussed in issue 992389, that 
modules within packages are not added to the package dictionary until they are 
fully loaded, which breaks circular imports in the form from package import 
module.

The consensus on that issue is that it doesn't need to be fixed completely, 
given the partial fix from issue 17636. I think the current issue is a corner 
case that was not covered by the fix, but which probably should be fixed as 
well, for consistency. The fix so far has made imports that name the module 
work, even though the module objects are still not placed into the package's 
namespace yet (this is why Antonio's last example works in the newly released 
3.5a1, though not in previous versions). Wildcard imports however still fail.

Can the fix be expanded to cover wildcard imports as well? I know, we're 
heaping up two kinds of usually-bad code (wildcard imports and circular 
imports) on top of one another, but still, the failure is very unexpected to 
anyone who's not read the bug reports.

I don't know my way around the import system at all yet, so I'm not going to be 
capable of writing up a patch immediately, but if there's any interest at all, 
and nobody more knowledgeable gets to it first I might see what I can learn and 
try to put together something.

Here's a more concise example of the issue:


package/__init__.py:

__all__ = [module]


package/module.py:

from . import module # this works in 3.5a1, thanks to the fix from issue 
17636
from . import *  # this still fails

--
nosy: +Steven.Barker

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23447
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21389] The repr of BoundMethod objects sometimes incorrectly identifies the bound function

2014-08-20 Thread Steven Barker

Steven Barker added the comment:

OK, I've written some tests of the new bound method repr functionality, which 
I'm attaching as a patch against the current tip.

I test the basic repr output, all the cases the old code got wrong (inherited 
methods, overridden methods, methods called via super, and classmethods) and 
the strange corner cases that probably won't come up in ordinary code (such as 
methods manually created from callables that don't have __name__ or 
__qualname__ attributes).

I've also fixed the defaultdict test that was relying upon the old repr output. 
I don't believe there are any other places in the standard library or tests 
where a bound method's repr is examined.

My patch adds the tests in a new file, Lib/test/test_bound_method_repr.py. I 
looked to see if there was an existing file that tested similar behavior, but 
none that I found really seemed appropriate. If I overlooked a better place to 
put the new tests, please let me know and I'll update the test patch.

I'm not very experienced at writing unit tests, so comments and/or criticism is 
welcomed. I copied parts of the file's structure (such as the test_main() 
function and if __name__ == __main__ boilerplate) from other test files, so 
hopefully I've stayed pretty close to the usual Python test style.

--
Added file: http://bugs.python.org/file36420/method_repr_tests.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21389
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's shallow option

2014-08-03 Thread Steven Barker

Steven Barker added the comment:

I've worked on this filecmp issue some more, and I have some new patches.

First up is a patch that only modifies the tests. It has one test that fails 
without the behavior patch. The test patch also modifies some other tests so 
that they will work after the behavior patch is applied. Notably, test_keyword 
had a some tests that would fail due to false negatives on shallow comparisons.

The second patch is the behavior and documentation changes required to actually 
fix this issue. This should be very simple to understand (the behavior change 
is only two lines of code, quoted in the discussion above). If you apply only 
this patch, you'll get several test failures, all due to false negative shallow 
comparisons (where two files have the same contents, but their stat signatures 
differ).

With these new patches, I think this issue is ready for a review, and 
eventually to be committed. The behavior change is simple and I think, 
obviously correct (assuming we want to risk breaking backwards compatibility). 
Perhaps my test code can be improved, but I don't think it's too bad.

So, the main question is will too much outside code break if we make this 
behavior change?

I don't think filecmp is used very widely, but as was demonstrated by the 
standard library's only use of filecmp (in Lib/test/test_keyword.py), it's 
likely that a lot of users perform shallow comparisons (by default) where they 
really don't want to get false-negatives. If we decide that the changed 
behavior is too big of a break of backwards compatibility, we just need to 
document the current behavior better (at a minimum, the docstring for 
filecmp.cmp must be corrected).

--
Added file: http://bugs.python.org/file36238/filecmp_test_patch.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1234674
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's shallow option

2014-08-03 Thread Steven Barker

Changes by Steven Barker blckkn...@gmail.com:


Added file: http://bugs.python.org/file36239/filecmp_behavior_and_doc_fix.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1234674
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's shallow option

2014-07-27 Thread Steven Barker

Steven Barker added the comment:

I think that your test patch misses the confusing/possibly wrong case. That 
case is when two files have the same contents, but different mtimes. If you 
attempt a shallow comparison, you'll actually get a deep comparison (reading 
the whole files) and a result of True rather than the expected (though 
incorrect) False.

Try the test part of my patch (without the behavior change), and you'll see the 
failure of test_shallow_false_negative. In your first assertion (that 
filecmp.cmp(self.name, self.name_uppercase) is False), you get the expected 
result, but for the wrong reason (you get it because the file contents differ, 
not because they have different mtimes).

Now, it might be that the bad case is actually working as we want it to (or 
at least, the behavior is established enough that we don't want to change it, 
for risk of breaking running code). If so, we should instead change the 
documentation (and especially the docstring) to explicit state that even if you 
request a shallow comparison, you might get a deep comparison instead.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1234674
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21547] '!s' formatting documentation bug

2014-05-21 Thread Steven Barker

Steven Barker added the comment:

The behavior of !s with the format() methods isn't exactly the same as %s with 
% formatting. With the latter, the conversion depends on the type of the result 
string, which in turn depends on whether the format string *or any of the 
values values* is unicode:

 class X():
def __str__(self): return str
def __unicode__(self): return uunicode

 %s %s % (foo, X())
'foo str'
 %s %s % (ufoo, X())
u'foo unicode'
 u%s %s % (foo, X())
u'foo unicode'
 u%s %s % (ufoo, X())
u'foo unicode'

The format methods are more consistent, always returning the same type as the 
format string regardless of the types of the arguments (and using the 
appropriate converter):

 {} {!s}.format(foo, X())
'foo str'
 {} {!s}.format(ufoo, X())
'foo str'
 u{} {!s}.format(foo, X())
u'foo unicode'
 u{} {!s}.format(ufoo, X())
u'foo unicode'

The documentation for %s conversion (in the second table here: 
https://docs.python.org/2/library/stdtypes.html#string-formatting-operations ) 
also suggests that it always uses str(), though the footnote for that table 
entry alludes to the behavior shown above without ever mentioning using 
unicode() for conversions explicitly.

--
nosy: +Steven.Barker

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21547
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21389] The repr of BoundMethod objects sometimes incorrectly identifies the bound function

2014-05-09 Thread Steven Barker

Steven Barker added the comment:

Here's a patch that changes the behavior of method_repr in 
Objects/classobject.c . It first tries to use __func__.__qualname__, then tries 
__func__.__name__ as a fallback and finally uses ? if neither of those 
attributes are available.

I'm not sure if the __name__ fallback is tested (as it seems that pretty much 
all callables have __qualname__ these days). The last ? case actually does 
get tested by Lib/test/test_descr.py which creates a messed up method with 
classmethod(1).__get__(1) (which oddly does not raise an error immediately upon 
creation, but rather only when it is called).

I've not written C in several years, so please let me know if you see I've done 
something obviously wrong, or any places the patch could be improved. It is 
mostly a copy-and-paste of existing code with a few modifications and 
deletions, so hopefully I can't have messed up anything too badly!

I'm currently ignoring a comment in the code that says we shouldn't use 
repr()/%R to format __self__. I don't really understand that, so I've stick 
with the existing behavior on that front. If that is something that should 
change, I'd be happy to try reworking it in some other way, just let me know 
what the concern is.

Here are some examples of the new repr output in a build with the patch:

 class A():
...   def foo(self):
... pass
...
 class B(A):
...   def foo(self):
... pass
...
 class C(A):
...   pass
...
 class D():
...   @classmethod
...   def bar():
... pass
...
 A().foo
bound method A.foo of __main__.A object at 0x02267508
 B().foo
bound method B.foo of __main__.B object at 0x02267578
 C().foo
bound method A.foo of __main__.C object at 0x02267658
 super(B, B()).foo
bound method A.foo of __main__.B object at 0x022676C8
 D.bar
bound method D.bar of class '__main__.D'

--
keywords: +patch
versions: +Python 3.5
Added file: http://bugs.python.org/file35202/method_repr.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21389
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21389] The repr of BoundMethod objects sometimes incorrectly identifies the bound function

2014-05-09 Thread Steven Barker

Steven Barker added the comment:

Ah, I figured out why using %R may be bad. It breaks for the following silly 
class:

class C():
def __repr__(self):
return repr(self.__repr__) # or use any other bound method

repr(C()) will recurse until the recursion limit is hit, both with and without 
my patch. If this seems like a real issue, I could probably replace the %R code 
with a variation on the base case code in PyObject_Repr:

PyUnicode_FromFormat(%s object at %p,
 v-ob_type-tp_name, v)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21389
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21445] Some asserts in test_filecmp have the wrong messages

2014-05-06 Thread Steven Barker

New submission from Steven Barker:

While working on a fix for issue 1234674, I found that the first test method in 
Lib/test/test_filecmp.py (FileCompareTestCase.test_matching) has switched up 
messages in its AssertEquals calls. The first two asserts have the message that 
should belong to the second two, and visa versa.

I've prepared a very simple patch that switches around the AssertEquals 
arguments so that the messages match what is being tested. The behavior checked 
by the asserts remains exactly the same, only the message printed if an assert 
fails will be different.

The test still passes. If you want to see the wrong message get displayed, the 
patch I uploaded for issue 1234674 may cause the test edited here to fail 
intermittently (it depend on whether two files are created in a shorter time 
than your filesystems's mtime resolution in the setup code). The failures are 
not affected by this patch, but with it you'll get the right message rather 
than a wrong one.

--
components: Tests
files: filecmp_test_messages.diff
keywords: patch
messages: 217970
nosy: Steven.Barker
priority: normal
severity: normal
status: open
title: Some asserts in test_filecmp have the wrong messages
type: behavior
versions: Python 3.5
Added file: http://bugs.python.org/file35160/filecmp_test_messages.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21445
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's shallow option

2014-05-05 Thread Steven Barker

Steven Barker added the comment:

Here's a patch against the default branch that fixes filecmp.cmp's behavior 
when shallow is True, including an update to the module's docs (correcting 
the previous ambiguity discussed in the 2011 python-dev thread mentioned by 
Sandro Tosi) and a couple of new tests of files where shallow comparisons 
should be expected get the answer wrong.

The changed behavior is very simple. The lines from Lib/filecmp.py in the 
original code:

if shallow and s1 == s2:
return True

are changed to:

if shallow:
return s1 == s2

This presumes answers to the questions asked by Georg Brandl back in 2005 of 
No, shallow comparisons never read the file contents and No, non-shallow 
comparisons don't care about mtimes (except for avoiding out of date cache 
entries).

If we only applied the test changes (not the behavior change) from my patch, 
one of the new tests would fail, as the current filecmp.cmp code never gives 
false negatives on comparisons in shallow mode (only false positives).

We probably don't want to commit the patch exactly as it stands, because the 
changed behavior causes several of the previously existing tests to fail. 
Almost all the dircmp tests fail for me, and the test_matching filecmp.cmp 
test does so intermittently (due to my system sometimes being fast enough at 
creating the files to beat my filesystem's mtime resolution).

The test failures are all because of shallow comparisons between files with the 
same contents, but (slightly) different mtimes. The new shallow comparison 
behavior is to say those files are unequal while the old code would fall back 
on testing the file contents despite the shallow parameter, and so it would 
return True. The failing tests can probably be updated to either explicitly set 
the file mtimes with sys.utime (as I do with the files used in the new tests), 
or we could rewrite some of the setup code to use something like shutil.copy2 
to make copies of files while preserving their mtimes.

I'm not very familiar with the best practices when it comes to writing unit 
tests, so I pretty much copied the style of the existing tests in 
Lib/test/test_filecmp.py (though I've split mine up a bit more).

I understand from reading other filecmp bug reports that that test style is 
considered to be pretty poor, so there's probably room to improve my code as 
well. I thought I'd post this patch before making an effort at fixing the older 
tests, in order to get some feedback. I'll be happy to incorporate any 
suggested improvements!

This issue has been open for almost 9 years. Hopefully my patch can help get it 
moving again towards being fixed!

--
keywords: +patch
versions: +Python 3.5
Added file: http://bugs.python.org/file35159/filecmp_real_shallow.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1234674
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21389] The repr of BoundMethod objects sometimes incorrectly identifies the bound function

2014-04-29 Thread Steven Barker

New submission from Steven Barker:

The repr of bound method objects can be misleading in certain situations. The 
repr is always is of the format:

bound method x.y of object

But x is often incorrect.

Here are some examples where the current code gets it wrong:

# inherited method
class Base(object):
def foo(self):
pass

class Derived(Base):
pass

print(Derived().foo)
# not too bad, prints bound method Derived.foo of __main__.Derived 
object at 0x
# it probably should say Base.foo instead of Derived.foo, but at least 
they're two names for the same function

# however, an override and super() it gets very bad
class Derived2(Base):
def foo(self):
pass

print(super(Derived2, Derived2()).foo)
# totally wrong, prints bound method Dervied2.foo of __main__.derived2 
object at 0x
# but it actually *is* Base.foo bound to a Derived2 instance!

# bound class methods:
class Test(object):
@classmethod
def foo(cls):
pass

print(Test.foo)
# wrong, prints bound method type.foo of class '__main__.Test'

I suggest that rather than trying to assemble the x.y pair by from 
__self__.__class__ and __func__.__name__, the BoundMethod should just use 
the __func__.__qualname__. In each of the cases above, the function's 
location would be correctly located this way.

I came across this bug while investigating a confusing (to me) issue with 
metaclasses and inheritance. The misleading repr output made it much harder 
to figure out that my expectations were wrong. Here's a simplified example of 
how it led me astray:

class A(object):
@classmethod
def foo(cls):
return classmethod from A

class BMeta(type):
def foo(cls):
return instancemethod from BMeta

class B(A, metaclass=BMeta):
pass

print(B.foo()) # surprisingly (to me) prints classmethod from A
print(B.foo)   # incorrectly prints bound method BMeta.foo of class 
__main__.B

It is presumably not a bug that inherited class methods take precedence over 
instance methods from a metaclass (though it was surprising to me at the time). 
The repr of the bound method though, suggested exactly the opposite. Given that 
it gets many more common situations wrong as well, I think that the repr should 
be fixed.

The relevant code appears to be in the method_repr function in 
Objects/Classobject.c .

--
components: Interpreter Core
messages: 217566
nosy: Steven.Barker
priority: normal
severity: normal
status: open
title: The repr of BoundMethod objects sometimes incorrectly identifies the 
bound function
type: behavior
versions: Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21389
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's shallow option

2014-04-21 Thread Steven Barker

Steven Barker added the comment:

A recent Stack Overflow question (http://stackoverflow.com/q/23192359/1405065) 
relates to this bug. The questioner was surprised that filecmp.cmp is much 
slower than usual for certain large files, despite the shallow parameter 
being True.

It is pretty clear to me that the behavior of filecmp.cmp does not match its 
docstring with respect to shallow. Either the docstring should be updated to 
match the existing behavior, or (more likely?) the behavior should change to 
match the docs.

--
components: +Library (Lib) -Extension Modules
nosy: +Steven.Barker
versions: +Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1234674
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue20058] IDLE's shell returns a multiple-line string to input() or readline() when multiple lines of text are pasted by the user

2013-12-23 Thread Steven Barker

New submission from Steven Barker:

Pasting multiple lines of input and then pressing Enter when IDLE is waiting to 
read a single line (such as when input() or sys.stdin.readline() have been 
called) will result is a multi-line string being given as the input, rather 
than a single line.

This may be most easily understood by looking at an example. Run this code in 
IDLE (either by typing it in the shell, or from a file with F5):

s = X
while s:
s = input()
print(repr(s))

First, try typing in several lines. Each will be printed separately, with no 
newlines inside the strings (since input() strips a trailing newline).

foo
'foo'
bar
'bar'
baz
'baz'

Next, copy several lines of text from somewhere. It doesn't matter what the 
lines' contents are. Here I grabbed a list of Python version numbers, as I was 
on the download page after grabbing 3.4.0b1 for testing this bug:

3.1.5
3.0.1
2.7.6
2.6.9
2.5.6
2.4.6
2.3.7
2.2.3 all the preceding lines were pasted in one go, followed by enter 
pressed here
'3.1.5\n3.0.1\n2.7.6\n2.6.9\n2.5.6\n2.4.6\n2.3.7\n2.2.3'

This behavior is different than what the Python interpreter does in a regular 
console shell. When running in cmd.exe on Windows, Python treats a multi-line 
paste just like typed input:

3.1.5
'3.1.5'
3.0.1
'3.0.1'
2.7.6
'2.7.6'
2.6.9
'2.6.9'
2.5.6
'2.5.6'
2.4.6
'2.4.6'
2.3.7
'2.3.7'
2.2.3 enter typed here
'2.2.3'

I expect the same behavior will be common in other kinds of terminals on other 
platforms.

This issue makes testing certain kinds of programs very frustrating. If your 
program needs to read certain text from STDIN, and you want to paste that text 
in quickly, you need to update your code with special logic just for use in 
IDLE's console. As an example of the kind of pain you may experience, try 
copying and pasting a block of text with a blank line into the input loop 
above. On a regular console session it will exit the loop after the blank line. 
In IDLE, it will keep running.

I've traced the source of this issue through IDLE's sys.stdin file object and 
an RPC call, and found it probably is located in the 
idlelib.PyShell.PyShell.readline method (or the surrounding code). This grabs a 
string from the Text object in the shell window and returns it to the Python 
code running in the subprocess.

Probably it should have some extra steps added to check if it got multiple 
lines. If so, it should split the string on newlines and return just one line 
of text for each readline call. I'm not sure exactly what should be done with 
the rest of the lines, but perhaps they could be queued up (or somehow put 
back by moving the markers in the Text object) so later lines would be grabbed 
by later input requests.

Or alternatively, maybe the event where the multi-line paste arrives should be 
handled differently, as several single-line input events, rather than a single 
multiple-line one.

--
components: IDLE
messages: 206879
nosy: Steven.Barker
priority: normal
severity: normal
status: open
title: IDLE's shell returns a multiple-line string to input() or readline() 
when multiple lines of text are pasted by the user
type: behavior
versions: Python 2.7, Python 3.3, Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20058
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17991] ctypes.c_char gives a misleading error when passed a one-character unicode string

2013-05-15 Thread Steven Barker

New submission from Steven Barker:

While investigating a Stack Overflow question 
(http://stackoverflow.com/questions/16484764/multiprocessing-value-clear-syntax)
 I came across a misleading error message from the multiprocessing.Value 
constructor:

 import multiprocessing
 my_char = x
 v = multiprocessing.Value(c, my_char)
Traceback (most recent call last):
  File pyshell#21, line 1, in module
v = multiprocessing.Value(c, my_char)
  File S:\Python33\lib\multiprocessing\__init__.py, line 243, in Value
return Value(typecode_or_type, *args, lock=lock)
  File S:\Python33\lib\multiprocessing\sharedctypes.py, line 70, in Value
obj = RawValue(typecode_or_type, *args)
  File S:\Python33\lib\multiprocessing\sharedctypes.py, line 47, in RawValue
obj.__init__(*args)
TypeError: one character string expected

The one character string expected message was rather unhelpful, since that 
seemed to be what I gave it. After being stumped by this for a bit, I realized 
that it might be having an issue with Unicode and giving an error message more 
appropriate to Python 2 than Python 3. Sure enough, passing a one-character 
bytes instance works just fine.

So, at a minimum I think the error message should be updated to say it wants a 
one-character bytes instance rather than a string (which to my mind means a 
str instance). A further enhancement might be to accept unicode strings that 
contain just a single ASCII character too, but I realize that may be more messy 
and error prone.

The error message comes from the ctypes module, and can be reproduced easily:

 ctypes.c_char(x)
Traceback (most recent call last):
  File pyshell#28, line 1, in module
ctypes.c_char(x)
TypeError: one character string expected

The exception text comes from Modules/_ctypes/cfield.c:1151

--
components: Library (Lib)
messages: 189338
nosy: Steven.Barker
priority: normal
severity: normal
status: open
title: ctypes.c_char gives a misleading error when passed a one-character 
unicode string
type: enhancement
versions: Python 3.3

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17991
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17991] ctypes.c_char gives a misleading error when passed a one-character unicode string

2013-05-15 Thread Steven Barker

Changes by Steven Barker blckkn...@gmail.com:


--
components: +ctypes -Library (Lib)

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17991
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com