[issue31783] Race condition in ThreadPoolExecutor when scheduling new jobs while the interpreter shuts down
New submission from Steven Barker <blckkn...@gmail.com>:

While investigating a Stack Overflow question (here: https://stackoverflow.com/q/46529767/1405065), I found that there may be a race condition in the cleanup code for concurrent.futures.ThreadPoolExecutor. The behavior in normal situations is fairly benign (the executor may run a few more jobs than you'd expect, but exits cleanly), but in rare situations it might lose track of a running thread and allow the interpreter to shut down while the thread is still trying to do work.

Here's an example that concisely demonstrates the situation where the issue can come up (it doesn't actually cause the race to go the wrong way on my system, but it sets up the possibility for it to occur):

    from threading import current_thread
    from concurrent.futures import ThreadPoolExecutor
    from time import sleep

    pool = ThreadPoolExecutor(4)

    def f(_):
        print(current_thread().name)
        future = pool.submit(sleep, 0.1)
        future.add_done_callback(f)

    f(None)

The callback from completion of one job schedules another job, indefinitely. When run in an interactive session, this code will print thread names forever. You should get "MainThread" once, followed by a bunch of "ThreadPoolExecutor-X_Y" names (often the same name will be repeated most of the time, due to the GIL I think, but in theory the work could rotate between threads). The main thread will return to the interactive REPL right away, so you can type in other stuff while the executor's worker threads are printing in the background (I suggest running pool.shutdown() to make them stop). This is fine.

But if the example code is run as a script, you'll usually get "MainThread" followed by exactly four repeats of "ThreadPoolExecutor-0_0" (or fewer, in the unlikely case that the race condition strikes you).
That's the number of threads the ThreadPoolExecutor was limited to, but note that the thread name that gets printed will usually end with 0 every time (you don't get one output from each worker thread, just the same number of outputs as there are threads, all from the first thread). Why you get that number of outputs (instead of zero, one, or an infinite number) was one part of the Stack Overflow question.

The answer turned out to be that after the main thread has queued up the first job in the ThreadPoolExecutor, it runs off the end of the script's code, so it starts shutting down the interpreter. The cleanup function _python_exit (in Lib/concurrent/futures/thread.py) gets run since it is registered with atexit, and it tries to signal the worker threads to shut down cleanly. However, the shutdown logic interacts oddly with an executor that's still spinning up its threads: it will only signal and join the threads that existed when it started running, not any new threads. As it turns out, usually the newly spawned threads will shut themselves down immediately after they spawn, but as a side effect, the first worker thread carries on longer than expected, doing one additional job for each new thread that gets spawned and exiting itself only when the executor has a full set. This is why there are four outputs from the worker thread instead of some other number.

But the exact behavior is dependent on the thread scheduling order, so there is a race condition. You can demonstrate a change in behavior from different timing by putting a call to time.sleep at the very top of the _worker() function (delaying how quickly the new threads can get to the work queue). You should see the program's behavior change to only print "ThreadPoolExecutor-0_0" once before exiting.

Let's go through the steps of the process:

1. The main thread runs f() and schedules a job (which adds a work item to the executor's work queue).
   The first worker thread is spawned by the executor to run the job, since it doesn't have any threads yet. The main thread also sets a callback on the future to run f() again.

2. The main thread exits f() and reaches the end of the script, so it begins the interpreter shutdown process, including calling atexit functions. One of those is _python_exit, which adds a reference to None to the executor's work queue. Note that the None is added *after* the job item from step 1 (since they're both added by the same main thread). It then calls join() on the worker thread spawned in step 1, waiting for it to exit. It won't try to join any other threads that spawn later, since they don't exist yet.

3. The first worker thread spawned by the executor in step 1 begins running and pops an item off the work queue. The first item is a real job, so it runs it. (The first parts of this step may be running in parallel with step 2, but completing the job will take much longer than step 2, so the rest of this step runs by itself after step 2 has finished.) Eventually the job ends and the callback function on the Future is
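Whatever the eventual fix in Lib/concurrent/futures/thread.py looks like, a script can sidestep the atexit hook entirely by stopping the resubmission loop and shutting the pool down itself before the interpreter begins to exit. This is only a sketch of that defensive pattern, not part of the original report — the `stopping` event and the timings are my own illustration:

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Event, current_thread
from time import sleep

stopping = Event()
pool = ThreadPoolExecutor(4)
names = []  # thread names observed by the callback chain

def f(_):
    names.append(current_thread().name)
    if not stopping.is_set():  # only reschedule while the script is running
        try:
            pool.submit(sleep, 0.01).add_done_callback(f)
        except RuntimeError:
            pass  # pool already shut down; stop rescheduling

f(None)
sleep(0.2)                # let the callback chain run for a while
stopping.set()            # stop scheduling new jobs...
pool.shutdown(wait=True)  # ...and join every worker before the script ends
print(len(names) > 1)     # the chain ran beyond the initial MainThread call
```

With the explicit shutdown(wait=True) the script never reaches _python_exit with workers still spinning up, so the shutdown ordering described above is no longer timing-dependent.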
[issue2786] Names in function call exception should have class names, if they're methods
Steven Barker added the comment:

A few weeks ago I reported issue 27389, which looks to be a duplicate of this issue. Has any progress been made on this issue?

--
nosy: +Steven.Barker

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue2786> ___
___ Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue27389] When a TypeError is raised due to invalid arguments to a method, it should use __qualname__ to identify the class the method is in
Steven Barker added the comment:

Yes, this looks to be a duplicate of that issue. I'm closing this issue as a duplicate, but I don't seem to be able to set the Superseder field. If you can, please set that to issue 2786.

--
resolution: -> duplicate
status: open -> closed

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue27389> ___
[issue27389] When a TypeError is raised due to invalid arguments to a method, it should use __qualname__ to identify the class the method is in
New submission from Steven Barker:

When a method is called with incorrect arguments (too many or too few, for instance), a TypeError is raised. The message in the TypeError is generally of the form:

    foo() takes 2 positional arguments but 3 were given

I think the message should include the class name along with the method name, so it would say `SomeClass.foo` instead of just `foo`. Since that is `SomeClass.foo`'s __qualname__, it should not be too hard to get the right name in most situations.

Here's an example showing how the current error messages can be ambiguous:

    class A:
        def foo(self, x):
            pass

    class B:
        def foo(self, x, y):  # different method signature!
            pass

    lst = [A(), B()]

    for item in lst:
        item.foo(1)     # raises "TypeError: foo() missing 1 required positional argument: 'y'"

    for item in lst:
        item.foo(1, 2)  # raises "TypeError: foo() takes 2 positional arguments but 3 were given"

In neither loop is it clear which class's `foo` method is causing the exception (nor does the traceback help, since it only shows the `item.foo(...)` line). Of course, in this example it's easy to see the two classes have `foo` methods with different signatures, but if there were hundreds of objects in the list and they were instances of dozens of different classes, it would be rather more annoying to figure out which class has the incorrect method signature.

I've looked through the code, and the two exceptions above come from the `format_missing` and `too_many_positional` functions in Python/ceval.c. It's not obvious how to patch them to use `__qualname__` instead of `__name__`, since they are taking the name from a code object, rather than a function object or bound method object (and code objects don't have an equivalent to `__qualname__`, only `co_name`, which is related to `__name__`). Several other argument-related TypeError exceptions are raised directly in _PyEval_EvalCodeWithName, which *does* have a `qualname` parameter, though the function doesn't use it for much.
It's also where the other functions described above get called from, so it could probably pass the `qualname` along to them. Alas, it seems that in some common cases (such as calling a Python function with any kind of argument unpacking like `*foo` or `**foo`), the value of the `qualname` parameter is actually NULL, so it may not be of much help.

A few extra TypeErrors related to function calls are raised directly in the gigantic `PyEval_EvalFrameEx` function. These seem to all use `PyEval_GetFuncName` to get the name, so perhaps we could modify its behavior to return the method's `__qualname__` rather than the `__name__`. (I have no idea what backwards-compatibility issues this might cause. Perhaps a new function that returns the qualname would be better.)

--
messages: 269274
nosy: Steven.Barker
priority: normal
severity: normal
status: open
title: When a TypeError is raised due to invalid arguments to a method, it should use __qualname__ to identify the class the method is in
type: enhancement
versions: Python 3.6

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue27389> ___
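To make the naming distinction concrete, here is a small demonstration (on a current CPython) of the three attributes involved; the ceval.c error messages draw from the code object's co_name, while __qualname__ is what the report proposes they use:

```python
class SomeClass:
    def foo(self, x):
        pass

# co_name (what the ceval.c messages use) and __name__ are both bare;
# __qualname__ carries the class context the report asks for.
print(SomeClass.foo.__code__.co_name)  # foo
print(SomeClass.foo.__name__)          # foo
print(SomeClass.foo.__qualname__)      # SomeClass.foo
```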
[issue26631] Unable to install Python 3.5.1 on Windows 10 - Error 0x80070643: Failed to install MSI package.
Changes by Steven Barker <blckkn...@gmail.com>:

Added file: http://bugs.python.org/file43372/Python 3.5.2rc1 (64-bit)_20160613002148_008_launcher_AllUsers.log

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue26631> ___
[issue26631] Unable to install Python 3.5.1 on Windows 10 - Error 0x80070643: Failed to install MSI package.
Changes by Steven Barker <blckkn...@gmail.com>:

Added file: http://bugs.python.org/file43371/Python 3.5.2rc1 (64-bit)_20160613002950.log

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue26631> ___
[issue26631] Unable to install Python 3.5.1 on Windows 10 - Error 0x80070643: Failed to install MSI package.
Steven Barker added the comment:

I've just encountered this error when trying to update to the 3.5.2rc1 release (64-bit Python, Windows 10). I'd already had the 3.5.1 release installed, so I suppose it could have been an issue with the older installer trying to uninstall the old version.

I believe the issue may be related to installing the `py.exe` launcher program. I have an alpha release of Python 3.6.0 installed, and so the 3.5.2rc1 installer gave me a popup saying that the launcher would not be updated since I already had a newer version. I assumed that was unrelated to the issue, though, because the installation process kept going for a couple minutes after showing that message. However, trying the install again with custom settings and unselecting the launcher program made it work.

I'll attach the main log file for both the failed and successful runs of the installer, and the launcher_AllUsers file from the failed run.

Oh, I just realized the successful run was with the web installer while the failed runs were the full download version. I have no idea whether that made a difference or not.

--
nosy: +Steven.Barker
Added file: http://bugs.python.org/file43370/Python 3.5.2rc1 (64-bit)_20160613002148.log

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue26631> ___
[issue27072] random.getrandbits is limited to 2**31-1 bits on 64-bit Windows
New submission from Steven Barker:

The C implementation of `_random.Random.getrandbits` is unnecessarily limited in the number of bits it can produce on 64-bit Windows systems. I learned about this issue in discussion of my answer to this Stack Overflow question: http://stackoverflow.com/questions/37356338/is-there-a-predictable-replacement-for-os-urandom-using-pythons-random-module

The argument parsing code in `getrandbits` forces its Python `int` argument to fit into a C `int` variable. On my 64-bit Windows system, any value larger than `2**31-1` causes an `OverflowError`. Since the number of bits is directly related to how much memory we need to allocate (in the non-fast case), I think `Py_ssize_t` would be a more appropriate type than a regular `int`. This isn't an issue on 32-bit systems, where `int` and `Py_ssize_t` have the same size.

I'm attaching a very simple patch that changes the types of the relevant variables and the format code in the call to `PyArg_ParseTuple`. The code works and still passes its tests with the patch. I considered adding an additional test for this issue, but passing test cases would require allocations of several gigabytes of memory, which seems a rather unfriendly thing to add in a test for a fairly minor issue.

This issue doesn't affect the pure Python implementation of `random.SystemRandom.getrandbits`, which already worked fine when large numbers of bits were requested. The documentation for `random.getrandbits` doesn't mention any limitation on the number of bits provided, so I don't imagine there will be backwards-compatibility issues. I also don't expect the change to have any impact on third-party `Random` replacement classes.
For convenience, here's the contents of the very short patch (which I'll also attach):

    diff --git a/Modules/_randommodule.c b/Modules/_randommodule.c
    index fd6b230..3bf564f 100644
    --- a/Modules/_randommodule.c
    +++ b/Modules/_randommodule.c
    @@ -348,12 +348,12 @@ random_setstate(RandomObject *self, PyObject *state)
     static PyObject *
     random_getrandbits(RandomObject *self, PyObject *args)
     {
    -    int k, i, words;
    +    Py_ssize_t k, i, words;
         PY_UINT32_T r;
         PY_UINT32_T *wordarray;
         PyObject *result;

    -    if (!PyArg_ParseTuple(args, "i:getrandbits", &k))
    +    if (!PyArg_ParseTuple(args, "n:getrandbits", &k))
             return NULL;

         if (k <= 0) {

--
components: Library (Lib)
files: getrandbits.diff
keywords: patch
messages: 265987
nosy: Steven.Barker
priority: normal
severity: normal
status: open
title: random.getrandbits is limited to 2**31-1 bits on 64-bit Windows
type: enhancement
versions: Python 3.6
Added file: http://bugs.python.org/file42919/getrandbits.diff

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue27072> ___
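For what it's worth, the documented contract the patch preserves — getrandbits(k) returns a k-bit value — can be spot-checked at small sizes without the multi-gigabyte allocations mentioned above. This is just a sanity sketch, not the test the patch would need:

```python
import random

random.seed(12345)  # deterministic run, though the range check holds for any seed
# getrandbits(k) returns an int in [0, 2**k); the k values below straddle
# the single-word fast path and the multi-word path of the C implementation.
results = {k: random.getrandbits(k) for k in (1, 31, 32, 33, 64, 1000)}
for k, v in results.items():
    assert 0 <= v < 2 ** k, (k, v)
print("ok")
```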
[issue12920] inspect.getsource only works for objects loaded from files, not interactive session
Steven Barker added the comment:

The problem being discussed here just came up on Stack Overflow today: http://stackoverflow.com/questions/37288135/inspect-module-for-python/

The cause of the incorrect error message is pretty clear. The relevant code from `inspect.getfile` should do something better when the object has a `__module__` attribute, but the module named (when looked up in `sys.modules`) does not have a `__file__` attribute. Currently it says the module is a builtin class, which is total nonsense. A very basic fix would be to have an extra error case:

    if isclass(object):
        if hasattr(object, '__module__'):
            object = sys.modules.get(object.__module__)
            if hasattr(object, '__file__'):
                return object.__file__
            raise TypeError()  # need a relevant message here!!!
        raise TypeError('{!r} is a built-in class'.format(object))

It might be easier to produce a meaningful message if the code after the first `if` didn't overwrite `object` with the module. But, really, it would be nice to figure out a better fix, which would make the relevant inspect functions actually work for classes defined interactively in the `__main__` module.

--
nosy: +Steven.Barker

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue12920> ___
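The misleading message can be reproduced outside an interactive session by faking a REPL-like module with no `__file__` attribute. This is a sketch — the module name `fake_interactive` is purely illustrative, and the exact wording depends on the interpreter version:

```python
import inspect
import sys
import types

# A module with no __file__, standing in for __main__ in a REPL session.
mod = types.ModuleType("fake_interactive")
sys.modules["fake_interactive"] = mod
exec("class C:\n    pass\n", mod.__dict__)

msg = ""
try:
    inspect.getsource(mod.C)
except TypeError as e:
    msg = str(e)
print(msg)  # the misleading "... is a built-in class" message
```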
[issue23447] Import fails when doing a circular import involving an `import *`
Steven Barker added the comment:

Thanks for looking at the issue, Brett. I think you're right that your patch has incorrect semantics, since it doesn't save the value to the provided namespace if it had to go through the special path to find the name. I think my patch got that part right (though I definitely missed one PyUnicode_Check).

I've just noticed that our patches may not always do the right thing in all error situations. They may return -1 without setting an exception (or perhaps passing on a strange one from an internal call) if there's something weird going on, like `__name__` not existing or not being a string. I'm attaching a new version of my patch that moves the PyErr_Format call down to the later check for `value == NULL`. This handles both the basic case, where nothing strange goes on but the requested name is not found, and the weird cases where unexpected stuff goes wrong. It will replace any errors raised by the earlier code with an ImportError. I think an ImportError is the appropriate exception for most error cases, but perhaps some internal errors should not be overwritten and more complicated error logic is needed.

I also notice that the code in the "import_from" function (from which I borrowed heavily for my patch) was changed in one of the 3.5rc patches to have new error handling (it raises ImportError more often, I think). I don't see an obvious way to copy its new error logic to import_all_from, but my C coding skills are rather rusty, so I could be missing an easy approach.

--
Added file: http://bugs.python.org/file40649/23447_4.diff

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue23447> ___
[issue23447] Import fails when doing a circular import involving an `import *`
Steven Barker added the comment:

I've managed to partially fix my build environment, so I can verify that my patch (attached previously) works as intended. I'm not completely sure that it doesn't break unrelated things, as my build still has lots of failing tests (which all appear to be unrelated to this issue). Nothing obviously new breaks with the patch applied.

The test I added in the patch does fail (as expected) on the unpatched interpreter, but passes on a patched version (though I've just noticed that the attempt at catching the exception doesn't work, since the current import code raises an AttributeError rather than the ImportError I had expected). I don't believe any other tests failed anew on the patched code, but like I said, I've got a lot of unrelated test failures, so I might be missing some subtle issues.

I'm attaching an updated version of the patch with better exception handling in the test code. The patch applies cleanly to the latest dev branch. I'd appreciate any reviews or other feedback.

--
Added file: http://bugs.python.org/file40415/23447_3.diff

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue23447> ___
[issue17991] ctypes.c_char gives a misleading error when passed a one-character unicode string
Steven Barker added the comment:

I was looking over some of the bugs I've contributed to, and it looks like this one has been fixed. It should be marked as a dupe of issue 22161 and closed (I can close, but not set a superseder, it seems).

--
resolution: -> duplicate
status: open -> closed

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue17991> ___
[issue23447] Import fails when doing a circular import involving an `import *`
Steven Barker added the comment:

I've finally gotten around to looking into this issue, and it turns out that fixing `from package import *` to work with circular imports is pretty easy, as all that needs to be done is apply the same logic from the patch for issue 17636 to the loop in the import_all_from function. I'll attach a patch that does this.

Unfortunately, my Windows build environment seems to be completely messed up at the moment and I'm getting link errors on everything in my python repo, so I've not been able to test the code at all. Since I have no idea when I'll actually have the time to learn what the hell's going wrong with MSVC, I thought I'd submit my patch and perhaps somebody on a better OS can make sure it works properly. I've included a new test which should show the issue with circular imports (and verify that the rest of the patch fixes it, hopefully).

--
keywords: +patch
Added file: http://bugs.python.org/file40047/23447_2.diff

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue23447> ___
[issue23495] The writer.writerows method should be documented as accepting any iterable (not only a list)
New submission from Steven Barker:

The documentation for the csv.writer.writerows method says that it expects a list of row objects, when it really will accept any iterable that yields rows (such as a generator). While it's often nice for code to be more accepting than the documented requirements, I think the docs in this case should state that writerows() expects an iterable, rather than misinforming users that a list is required.

This documentation issue was brought up in a Stack Overflow question here: http://stackoverflow.com/questions/28636848/csv-writer-writerows-takes-iterator

I expect the necessary documentation patch will be pretty trivial, and if nobody else gets to it first, I will try to provide one when I have enough time to update my cpython checkout (not soon, alas).

--
assignee: docs@python
components: Documentation
messages: 236328
nosy: Steven.Barker, docs@python
priority: normal
severity: normal
status: open
title: The writer.writerows method should be documented as accepting any iterable (not only a list)
type: enhancement
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue23495> ___
[issue23495] The writer.writerows method should be documented as accepting any iterable (not only a list)
Steven Barker added the comment:

Another Stack Overflow user pointed out that DictWriter's writerows implementation (in pure Python) unnecessarily converts whatever its argument is into a list in memory before passing it on to the builtin writer.writerows method, which would accept any iterable type. Probably it should be changed to use map or a generator expression instead. So, perhaps this issue isn't purely documentation, but actually a small behavior enhancement as well.

--

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue23495> ___
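The underlying behavior is easy to confirm: the built-in writer.writerows happily consumes a generator expression, pulling rows one at a time. A quick sketch:

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

# No list is ever materialized: writerows pulls rows from the
# generator expression lazily.
writer.writerows((i, i * i) for i in range(3))

rows = buf.getvalue().splitlines()
print(rows)  # ['0,0', '1,1', '2,4']
```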
[issue23447] Relative imports with __all__ attribute
Steven Barker added the comment:

This issue is a special case of the problem discussed in issue 992389: modules within packages are not added to the package dictionary until they are fully loaded, which breaks circular imports in the form `from package import module`. The consensus on that issue is that it doesn't need to be fixed completely, given the partial fix from issue 17636. I think the current issue is a corner case that was not covered by the fix, but which probably should be fixed as well, for consistency.

The fix so far has made imports that name the module work, even though the module objects are still not placed into the package's namespace yet (this is why Antonio's last example works in the newly released 3.5a1, though not in previous versions). Wildcard imports, however, still fail. Can the fix be expanded to cover wildcard imports as well? I know, we're heaping up two kinds of usually-bad code (wildcard imports and circular imports) on top of one another, but still, the failure is very unexpected to anyone who's not read the bug reports.

I don't know my way around the import system at all yet, so I'm not going to be capable of writing up a patch immediately, but if there's any interest at all, and nobody more knowledgeable gets to it first, I might see what I can learn and try to put together something.

Here's a more concise example of the issue:

    package/__init__.py:

        __all__ = ['module']

    package/module.py:

        from . import module  # this works in 3.5a1, thanks to the fix from issue 17636
        from . import *       # this still fails

--
nosy: +Steven.Barker

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue23447> ___
[issue21389] The repr of BoundMethod objects sometimes incorrectly identifies the bound function
Steven Barker added the comment:

OK, I've written some tests of the new bound method repr functionality, which I'm attaching as a patch against the current tip. I test the basic repr output, all the cases the old code got wrong (inherited methods, overridden methods, methods called via super, and classmethods), and the strange corner cases that probably won't come up in ordinary code (such as methods manually created from callables that don't have __name__ or __qualname__ attributes). I've also fixed the defaultdict test that was relying upon the old repr output. I don't believe there are any other places in the standard library or tests where a bound method's repr is examined.

My patch adds the tests in a new file, Lib/test/test_bound_method_repr.py. I looked to see if there was an existing file that tested similar behavior, but none that I found really seemed appropriate. If I overlooked a better place to put the new tests, please let me know and I'll update the test patch.

I'm not very experienced at writing unit tests, so comments and/or criticism are welcomed. I copied parts of the file's structure (such as the test_main() function and the `if __name__ == "__main__"` boilerplate) from other test files, so hopefully I've stayed pretty close to the usual Python test style.

--
Added file: http://bugs.python.org/file36420/method_repr_tests.diff

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue21389> ___
[issue1234674] filecmp.cmp's shallow option
Steven Barker added the comment:

I've worked on this filecmp issue some more, and I have some new patches.

First up is a patch that only modifies the tests. It has one test that fails without the behavior patch. The test patch also modifies some other tests so that they will work after the behavior patch is applied. Notably, test_keyword had some tests that would fail due to false negatives on shallow comparisons.

The second patch is the behavior and documentation changes required to actually fix this issue. This should be very simple to understand (the behavior change is only two lines of code, quoted in the discussion above). If you apply only this patch, you'll get several test failures, all due to false-negative shallow comparisons (where two files have the same contents, but their stat signatures differ).

With these new patches, I think this issue is ready for a review, and eventually to be committed. The behavior change is simple and, I think, obviously correct (assuming we want to risk breaking backwards compatibility). Perhaps my test code can be improved, but I don't think it's too bad.

So, the main question is: will too much outside code break if we make this behavior change? I don't think filecmp is used very widely, but as was demonstrated by the standard library's only use of filecmp (in Lib/test/test_keyword.py), it's likely that a lot of users perform shallow comparisons (by default) where they really don't want to get false negatives. If we decide that the changed behavior is too big of a break of backwards compatibility, we just need to document the current behavior better (at a minimum, the docstring for filecmp.cmp must be corrected).

--
Added file: http://bugs.python.org/file36238/filecmp_test_patch.diff

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue1234674> ___
[issue1234674] filecmp.cmp's shallow option
Changes by Steven Barker <blckkn...@gmail.com>:

Added file: http://bugs.python.org/file36239/filecmp_behavior_and_doc_fix.diff

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue1234674> ___
[issue1234674] filecmp.cmp's shallow option
Steven Barker added the comment:

I think that your test patch misses the confusing/possibly-wrong case. That case is when two files have the same contents, but different mtimes. If you attempt a shallow comparison, you'll actually get a deep comparison (reading the whole files) and a result of True, rather than the expected (though incorrect) False. Try the test part of my patch (without the behavior change), and you'll see the failure of test_shallow_false_negative.

In your first assertion (that filecmp.cmp(self.name, self.name_uppercase) is False), you get the expected result, but for the wrong reason (you get it because the file contents differ, not because they have different mtimes).

Now, it might be that the bad case is actually working as we want it to (or at least, the behavior is established enough that we don't want to change it, for risk of breaking running code). If so, we should instead change the documentation (and especially the docstring) to explicitly state that even if you request a shallow comparison, you might get a deep comparison instead.

--

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue1234674> ___
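The surprise case described above is straightforward to reproduce with filecmp as shipped: two identical files with different mtimes have different shallow stat signatures, so filecmp.cmp silently falls back to reading both files. A self-contained sketch (the file names are illustrative):

```python
import filecmp
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a.txt")
    b = os.path.join(d, "b.txt")
    for path in (a, b):
        with open(path, "wb") as f:
            f.write(b"same contents")

    # Force different mtimes so the shallow stat signatures differ.
    past = time.time() - 1000
    os.utime(a, (past, past))

    # shallow=True, but the signatures differ, so filecmp reads both
    # files anyway and reports True: the "surprise" deep comparison.
    result = filecmp.cmp(a, b, shallow=True)
    print(result)  # True
```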
[issue21547] '!s' formatting documentation bug
Steven Barker added the comment:

The behavior of !s with the format() methods isn't exactly the same as %s with % formatting. With the latter, the conversion depends on the type of the result string, which in turn depends on whether the format string *or any of the values* is unicode:

    >>> class X():
    ...     def __str__(self):
    ...         return "str"
    ...     def __unicode__(self):
    ...         return u"unicode"
    ...
    >>> "%s %s" % ("foo", X())
    'foo str'
    >>> "%s %s" % (u"foo", X())
    u'foo unicode'
    >>> u"%s %s" % ("foo", X())
    u'foo unicode'
    >>> u"%s %s" % (u"foo", X())
    u'foo unicode'

The format methods are more consistent, always returning the same type as the format string regardless of the types of the arguments (and using the appropriate converter):

    >>> "{} {!s}".format("foo", X())
    'foo str'
    >>> "{} {!s}".format(u"foo", X())
    'foo str'
    >>> u"{} {!s}".format("foo", X())
    u'foo unicode'
    >>> u"{} {!s}".format(u"foo", X())
    u'foo unicode'

The documentation for %s conversion (in the second table here: https://docs.python.org/2/library/stdtypes.html#string-formatting-operations ) also suggests that it always uses str(), though the footnote for that table entry alludes to the behavior shown above without ever mentioning using unicode() for conversions explicitly.

--
nosy: +Steven.Barker

___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue21547> ___
[issue21389] The repr of BoundMethod objects sometimes incorrectly identifies the bound function
Steven Barker added the comment: Here's a patch that changes the behavior of method_repr in Objects/classobject.c. It first tries to use __func__.__qualname__, then tries __func__.__name__ as a fallback, and finally uses "?" if neither of those attributes is available. I'm not sure if the __name__ fallback is tested (it seems that pretty much all callables have __qualname__ these days). The last "?" case actually does get tested by Lib/test/test_descr.py, which creates a messed-up method with classmethod(1).__get__(1) (which oddly does not raise an error immediately upon creation, but only when it is called). I've not written C in several years, so please let me know if you see I've done something obviously wrong, or any places the patch could be improved. It is mostly a copy-and-paste of existing code with a few modifications and deletions, so hopefully I can't have messed up anything too badly! I'm currently ignoring a comment in the code that says we shouldn't use repr()/%R to format __self__. I don't really understand that, so I've stuck with the existing behavior on that front. If that is something that should change, I'd be happy to try reworking it in some other way; just let me know what the concern is. Here are some examples of the new repr output in a build with the patch:

>>> class A():
...     def foo(self):
...         pass
...
>>> class B(A):
...     def foo(self):
...         pass
...
>>> class C(A):
...     pass
...
>>> class D():
...     @classmethod
...     def bar(cls):
...         pass
...
>>> A().foo
<bound method A.foo of <__main__.A object at 0x02267508>>
>>> B().foo
<bound method B.foo of <__main__.B object at 0x02267578>>
>>> C().foo
<bound method A.foo of <__main__.C object at 0x02267658>>
>>> super(B, B()).foo
<bound method A.foo of <__main__.B object at 0x022676C8>>
>>> D.bar
<bound method D.bar of <class '__main__.D'>>

-- keywords: +patch versions: +Python 3.5 Added file: http://bugs.python.org/file35202/method_repr.diff
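The __qualname__-based identification the patch relies on can be checked directly. This repr change landed in Python 3.5, so the repr assertion in the sketch below assumes a 3.5-or-later interpreter:

```python
class Base:
    def foo(self):
        pass

class Derived(Base):
    pass

# The underlying function object records where it was defined, which is
# exactly what the repr should report for an inherited method.
m = Derived().foo
print(m.__func__.__qualname__)  # Base.foo
print(repr(m))
```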
[issue21389] The repr of BoundMethod objects sometimes incorrectly identifies the bound function
Steven Barker added the comment: Ah, I figured out why using %R may be bad. It breaks for the following silly class:

class C():
    def __repr__(self):
        return repr(self.__repr__)  # or use any other bound method

repr(C()) will recurse until the recursion limit is hit, both with and without my patch. If this seems like a real issue, I could probably replace the %R code with a variation on the base-case code in PyObject_Repr:

PyUnicode_FromFormat("<%s object at %p>", v->ob_type->tp_name, v)
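The recursion is easy to reproduce, since the bound-method repr formats __self__ with its repr:

```python
class C:
    def __repr__(self):
        # The repr of a bound method includes the repr of __self__,
        # which calls this method again, and so on without end.
        return repr(self.__repr__)

try:
    repr(C())
    recursed = False
except RecursionError:
    recursed = True
print(recursed)  # True
```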
[issue21445] Some asserts in test_filecmp have the wrong messages
New submission from Steven Barker: While working on a fix for issue 1234674, I found that the first test method in Lib/test/test_filecmp.py (FileCompareTestCase.test_matching) has switched-up messages in its assertEqual calls. The first two asserts have the message that should belong to the second two, and vice versa. I've prepared a very simple patch that switches around the assertEqual arguments so that the messages match what is being tested. The behavior checked by the asserts remains exactly the same; only the message printed if an assert fails will be different. The test still passes. If you want to see the wrong message get displayed, the patch I uploaded for issue 1234674 may cause the test edited here to fail intermittently (it depends on whether two files are created in a shorter time than your filesystem's mtime resolution in the setup code). The failures are not affected by this patch, but with it you'll get the right message rather than a wrong one. -- components: Tests files: filecmp_test_messages.diff keywords: patch messages: 217970 nosy: Steven.Barker priority: normal severity: normal status: open title: Some asserts in test_filecmp have the wrong messages type: behavior versions: Python 3.5 Added file: http://bugs.python.org/file35160/filecmp_test_messages.diff
[issue1234674] filecmp.cmp's shallow option
Steven Barker added the comment: Here's a patch against the default branch that fixes filecmp.cmp's behavior when shallow is True, including an update to the module's docs (correcting the previous ambiguity discussed in the 2011 python-dev thread mentioned by Sandro Tosi) and a couple of new tests of files where shallow comparisons should be expected to get the answer wrong. The changed behavior is very simple. These lines from Lib/filecmp.py in the original code:

    if shallow and s1 == s2:
        return True

are changed to:

    if shallow:
        return s1 == s2

This presumes answers to the questions asked by Georg Brandl back in 2005 of "No, shallow comparisons never read the file contents" and "No, non-shallow comparisons don't care about mtimes (except for avoiding out-of-date cache entries)". If we only applied the test changes (not the behavior change) from my patch, one of the new tests would fail, as the current filecmp.cmp code never gives false negatives on comparisons in shallow mode (only false positives). We probably don't want to commit the patch exactly as it stands, because the changed behavior causes several of the previously existing tests to fail. Almost all the dircmp tests fail for me, and the test_matching filecmp.cmp test does so intermittently (due to my system sometimes being fast enough at creating the files to beat my filesystem's mtime resolution). The test failures are all because of shallow comparisons between files with the same contents but (slightly) different mtimes. The new shallow-comparison behavior is to say those files are unequal, while the old code would fall back on testing the file contents despite the shallow parameter, and so it would return True. The failing tests can probably be updated to either explicitly set the file mtimes with os.utime (as I do with the files used in the new tests), or we could rewrite some of the setup code to use something like shutil.copy2 to make copies of files while preserving their mtimes. I'm not very familiar with the best practices when it comes to writing unit tests, so I pretty much copied the style of the existing tests in Lib/test/test_filecmp.py (though I've split mine up a bit more). I understand from reading other filecmp bug reports that that test style is considered to be pretty poor, so there's probably room to improve my code as well. I thought I'd post this patch before making an effort at fixing the older tests, in order to get some feedback. I'll be happy to incorporate any suggested improvements! This issue has been open for almost 9 years. Hopefully my patch can help get it moving again towards being fixed! -- keywords: +patch versions: +Python 3.5 Added file: http://bugs.python.org/file35159/filecmp_real_shallow.diff
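The strict-shallow semantics the patch proposes can be sketched as a standalone function. This is an illustration of the patch's intent, not the actual Lib/filecmp.py code (the helper names here are made up):

```python
import os
import stat
import tempfile

def _sig(st):
    # The same signature tuple filecmp uses: file type, size, mtime.
    return (stat.S_IFMT(st.st_mode), st.st_size, st.st_mtime)

def cmp_strict_shallow(f1, f2):
    # Under the patch, a shallow comparison is decided entirely by the
    # stat signatures; the file contents are never read.
    return _sig(os.stat(f1)) == _sig(os.stat(f2))

# Two files, identical contents, explicitly different mtimes:
tmpdir = tempfile.mkdtemp()
f1 = os.path.join(tmpdir, "one.txt")
f2 = os.path.join(tmpdir, "two.txt")
for path in (f1, f2):
    with open(path, "w") as f:
        f.write("same contents\n")
os.utime(f1, (1_000_000_000, 1_000_000_000))
os.utime(f2, (1_000_000_001, 1_000_000_001))

different_mtimes = cmp_strict_shallow(f1, f2)   # False under the patch
os.utime(f2, (1_000_000_000, 1_000_000_000))
same_mtimes = cmp_strict_shallow(f1, f2)        # True
print(different_mtimes, same_mtimes)
```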
[issue21389] The repr of BoundMethod objects sometimes incorrectly identifies the bound function
New submission from Steven Barker: The repr of bound method objects can be misleading in certain situations. The repr is always of the format:

<bound method x.y of object>

But x is often incorrect. Here are some examples where the current code gets it wrong:

# inherited method
class Base(object):
    def foo(self):
        pass

class Derived(Base):
    pass

print(Derived().foo)
# not too bad, prints <bound method Derived.foo of <__main__.Derived object at 0x...>>
# it probably should say Base.foo instead of Derived.foo, but at least
# they're two names for the same function

# however, with an override and super() it gets very bad
class Derived2(Base):
    def foo(self):
        pass

print(super(Derived2, Derived2()).foo)
# totally wrong, prints <bound method Derived2.foo of <__main__.Derived2 object at 0x...>>
# but it actually *is* Base.foo bound to a Derived2 instance!

# bound class methods:
class Test(object):
    @classmethod
    def foo(cls):
        pass

print(Test.foo)
# wrong, prints <bound method type.foo of <class '__main__.Test'>>

I suggest that rather than trying to assemble the x.y pair from __self__.__class__ and __func__.__name__, the bound method should just use __func__.__qualname__. In each of the cases above, the function would be correctly located this way. I came across this bug while investigating a confusing (to me) issue with metaclasses and inheritance. The misleading repr output made it much harder to figure out that my expectations were wrong. Here's a simplified example of how it led me astray:

class A(object):
    @classmethod
    def foo(cls):
        return "classmethod from A"

class BMeta(type):
    def foo(cls):
        return "instancemethod from BMeta"

class B(A, metaclass=BMeta):
    pass

print(B.foo())  # surprisingly (to me) prints "classmethod from A"
print(B.foo)    # incorrectly prints <bound method BMeta.foo of <class '__main__.B'>>

It is presumably not a bug that inherited class methods take precedence over instance methods from a metaclass (though it was surprising to me at the time). The repr of the bound method, though, suggested exactly the opposite. Given that it gets many more common situations wrong as well, I think that the repr should be fixed. The relevant code appears to be in the method_repr function in Objects/classobject.c. -- components: Interpreter Core messages: 217566 nosy: Steven.Barker priority: normal severity: normal status: open title: The repr of BoundMethod objects sometimes incorrectly identifies the bound function type: behavior versions: Python 3.4
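The metaclass-precedence surprise from the report can be checked directly on any current Python 3 (the repr comment assumes 3.5+, where the qualname-based repr landed):

```python
class A:
    @classmethod
    def foo(cls):
        return "classmethod from A"

class BMeta(type):
    def foo(cls):
        return "instancemethod from BMeta"

class B(A, metaclass=BMeta):
    pass

# The classmethod inherited from A wins over the metaclass method,
# because attributes found on the class's own MRO shadow non-data
# descriptors found on the metaclass.
result = B.foo()
print(result)        # classmethod from A
print(repr(B.foo))   # with the qualname fix, this correctly names A.foo
```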
[issue1234674] filecmp.cmp's shallow option
Steven Barker added the comment: A recent Stack Overflow question (http://stackoverflow.com/q/23192359/1405065) relates to this bug. The questioner was surprised that filecmp.cmp is much slower than usual for certain large files, despite the shallow parameter being True. It is pretty clear to me that the behavior of filecmp.cmp does not match its docstring with respect to shallow. Either the docstring should be updated to match the existing behavior, or (more likely?) the behavior should change to match the docs. -- components: +Library (Lib) -Extension Modules nosy: +Steven.Barker versions: +Python 3.4
[issue20058] IDLE's shell returns a multiple-line string to input() or readline() when multiple lines of text are pasted by the user
New submission from Steven Barker: Pasting multiple lines of input and then pressing Enter when IDLE is waiting to read a single line (such as when input() or sys.stdin.readline() has been called) will result in a multi-line string being given as the input, rather than a single line. This may be most easily understood by looking at an example. Run this code in IDLE (either by typing it in the shell, or from a file with F5):

s = "X"
while s:
    s = input()
    print(repr(s))

First, try typing in several lines. Each will be printed separately, with no newlines inside the strings (since input() strips the trailing newline):

foo
'foo'
bar
'bar'
baz
'baz'

Next, copy several lines of text from somewhere. It doesn't matter what the lines' contents are. Here I grabbed a list of Python version numbers, as I was on the download page after grabbing 3.4.0b1 for testing this bug:

3.1.5
3.0.1
2.7.6
2.6.9
2.5.6
2.4.6
2.3.7
2.2.3    <- all the preceding lines were pasted in one go, followed by Enter pressed here
'3.1.5\n3.0.1\n2.7.6\n2.6.9\n2.5.6\n2.4.6\n2.3.7\n2.2.3'

This behavior is different from what the Python interpreter does in a regular console shell. When running in cmd.exe on Windows, Python treats a multi-line paste just like typed input:

3.1.5
'3.1.5'
3.0.1
'3.0.1'
2.7.6
'2.7.6'
2.6.9
'2.6.9'
2.5.6
'2.5.6'
2.4.6
'2.4.6'
2.3.7
'2.3.7'
2.2.3    <- Enter typed here
'2.2.3'

I expect the same behavior will be common in other kinds of terminals on other platforms. This issue makes testing certain kinds of programs very frustrating. If your program needs to read certain text from STDIN, and you want to paste that text in quickly, you need to update your code with special logic just for use in IDLE's console. As an example of the kind of pain you may experience, try copying and pasting a block of text containing a blank line into the input loop above. In a regular console session it will exit the loop after the blank line. In IDLE, it will keep running. I've traced the source of this issue through IDLE's sys.stdin file object and an RPC call, and found it is probably located in the idlelib.PyShell.PyShell.readline method (or the surrounding code). This grabs a string from the Text object in the shell window and returns it to the Python code running in the subprocess. Probably it should have some extra steps added to check whether it got multiple lines. If so, it should split the string on newlines and return just one line of text for each readline call. I'm not sure exactly what should be done with the rest of the lines, but perhaps they could be queued up (or somehow put back by moving the markers in the Text object) so later lines would be grabbed by later input requests. Or alternatively, maybe the event where the multi-line paste arrives should be handled differently, as several single-line input events rather than a single multi-line one. -- components: IDLE messages: 206879 nosy: Steven.Barker priority: normal severity: normal status: open title: IDLE's shell returns a multiple-line string to input() or readline() when multiple lines of text are pasted by the user type: behavior versions: Python 2.7, Python 3.3, Python 3.4
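The queueing idea suggested above can be sketched in isolation. This is not IDLE's actual implementation; the class and the get_raw_input hook are hypothetical stand-ins for the shell widget handing over the user's (possibly pasted) text:

```python
class LineBufferedReader:
    def __init__(self, get_raw_input):
        # get_raw_input is a hypothetical callable standing in for the
        # shell widget delivering whatever the user typed or pasted.
        self._get_raw_input = get_raw_input
        self._pending = []

    def readline(self):
        # If a previous request delivered several lines at once, hand
        # back the queued lines one at a time before asking for more.
        if not self._pending:
            raw = self._get_raw_input()
            self._pending = raw.splitlines(keepends=True)
        return self._pending.pop(0)

# Simulate a three-line paste arriving in a single input event:
reader = LineBufferedReader(lambda: "3.1.5\n3.0.1\n2.7.6\n")
lines = [reader.readline() for _ in range(3)]
print(lines)  # ['3.1.5\n', '3.0.1\n', '2.7.6\n']
```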
[issue17991] ctypes.c_char gives a misleading error when passed a one-character unicode string
New submission from Steven Barker: While investigating a Stack Overflow question (http://stackoverflow.com/questions/16484764/multiprocessing-value-clear-syntax) I came across a misleading error message from the multiprocessing.Value constructor:

>>> import multiprocessing
>>> my_char = 'x'
>>> v = multiprocessing.Value('c', my_char)
Traceback (most recent call last):
  File "<pyshell#21>", line 1, in <module>
    v = multiprocessing.Value('c', my_char)
  File "S:\Python33\lib\multiprocessing\__init__.py", line 243, in Value
    return Value(typecode_or_type, *args, lock=lock)
  File "S:\Python33\lib\multiprocessing\sharedctypes.py", line 70, in Value
    obj = RawValue(typecode_or_type, *args)
  File "S:\Python33\lib\multiprocessing\sharedctypes.py", line 47, in RawValue
    obj.__init__(*args)
TypeError: one character string expected

The "one character string expected" message was rather unhelpful, since that seemed to be what I gave it. After being stumped by this for a bit, I realized that it might be having an issue with Unicode and giving an error message more appropriate to Python 2 than Python 3. Sure enough, passing a one-character bytes instance works just fine. So, at a minimum I think the error message should be updated to say it wants a one-character bytes instance rather than a "string" (which to my mind means a str instance). A further enhancement might be to accept unicode strings that contain just a single ASCII character too, but I realize that may be more messy and error-prone. The error message comes from the ctypes module, and can be reproduced easily:

>>> ctypes.c_char('x')
Traceback (most recent call last):
  File "<pyshell#28>", line 1, in <module>
    ctypes.c_char('x')
TypeError: one character string expected

The exception text comes from Modules/_ctypes/cfield.c:1151 -- components: Library (Lib) messages: 189338 nosy: Steven.Barker priority: normal severity: normal status: open title: ctypes.c_char gives a misleading error when passed a one-character unicode string type: enhancement versions: Python 3.3
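The contrast is easy to demonstrate: in Python 3, ctypes.c_char accepts a single byte, so a bytes literal works where the one-character str above failed (the TypeError's exact wording has since been improved in newer CPython versions, so only the exception type is checked here):

```python
import ctypes

# A one-character bytes instance works fine:
ok = ctypes.c_char(b"x")

# The equivalent one-character str raises the TypeError discussed above:
try:
    ctypes.c_char("x")
    raised = False
except TypeError:
    raised = True
print(ok.value, raised)
```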
[issue17991] ctypes.c_char gives a misleading error when passed a one-character unicode string
Changes by Steven Barker <blckkn...@gmail.com>: -- components: +ctypes -Library (Lib)