New submission from Dennis Sweeney :
I understand that SequenceMatcher's ratio() method does not guarantee that
SequenceMatcher(None, a, b).ratio() == SequenceMatcher(None, b, a).ratio().
Below is a counterexample:
# Example from
https://mail.python.org/pipermail/python-list/2010
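The counterexample itself is cut off above. As a hedged illustration (my own construction, not the example from the linked thread): the default autojunk heuristic only inspects the second sequence, so swapping the arguments can change which characters get ignored and hence the ratio.
from difflib import SequenceMatcher
# Hypothetical inputs chosen for illustration: in a 200+ character second
# sequence, any character occupying more than 1% of it is treated as junk.
a = "x" * 300
b = "x" * 2 + "y" * 198
print(SequenceMatcher(None, a, b).ratio())   # 'x' is rare in b, so it can match
print(SequenceMatcher(None, b, a).ratio())   # 'x' is "popular" in a and junked; may differ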
Dennis Sweeney added the comment:
Disregard merge_recipe.py: it would skip over a value that had already been
retrieved from the iterator when the loop finished.
Change by Dennis Sweeney :
--
keywords: +patch
pull_requests: +16903
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/17422
New submission from Dennis Sweeney :
Although the implementation of the heapq.merge function uses an underlying heap
structure, its behavior centers on iterators. For this reason, I believe there
should either be an alias to this function in the itertools module or at least
a recipe
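For concreteness, here is a rough sketch of what such a recipe could look like. This is my own illustration (not the merge_recipe.py attachment referenced below); it mirrors the heap-of-iterators approach of heapq.merge and assumes the input iterables are already sorted.
import heapq

def merge(*iterables, key=None):
    # Sketch only: yield items from already-sorted iterables in sorted order.
    keyfunc = (lambda value: value) if key is None else key
    heap = []
    for order, it in enumerate(map(iter, iterables)):
        for value in it:                      # prime the heap with one item each
            heap.append((keyfunc(value), order, value, it))
            break
    heapq.heapify(heap)
    while heap:
        _, order, value, it = heap[0]
        yield value
        for value in it:                      # advance the iterable that just won
            heapq.heapreplace(heap, (keyfunc(value), order, value, it))
            break
        else:                                 # that iterable is exhausted
            heapq.heappop(heap)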
Dennis Sweeney added the comment:
The following seems like it is a short, readable recipe for itertools.
--
Added file: https://bugs.python.org/file48748/merge_recipe.py
Dennis Sweeney added the comment:
Should there be a similar generic test case in test.seq_test?
New submission from Dennis Sweeney :
Similar to https://bugs.python.org/issue39453, but with deques:
Python 3.9.0a3+:
>>> from collections import deque
>>> class A:
... def __eq__(self, other):
... L.clear()
... return NotImplemented
...
>>> L =
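The session above is cut off. A hedged reconstruction of the kind of reproducer meant, mirroring the list case from issue39453, might be:
from collections import deque

class A:
    def __eq__(self, other):
        L.clear()                 # mutate the deque while it is being searched
        return NotImplemented

L = deque([A()])
# Membership testing iterates over L while __eq__ empties it; before the fix
# this could touch freed memory instead of failing cleanly.
print(1 in L)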
Change by Dennis Sweeney :
--
keywords: +patch
pull_requests: +17796
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/18421
Dennis Sweeney added the comment:
I made the requested changes to reflect that this is for code cleanliness
rather than strictly for performance.
However, it appears that Visual Studio on Windows 10 was not doing the
optimization one might expect. In particular, here is the disassembly
Dennis Sweeney added the comment:
> Hmm, Is this build on release mode?
Yes -- in debug mode, each Py_INCREF compiles to these 8 instructions:
78BD071E mov ecx,dword ptr [_Py_RefTotal (79039700h)]
78BD0724 add ecx,1
78BD0727 mov dword
Dennis Sweeney added the comment:
> Debug mode is not meaningful.
> Visual Studio will optimize fully on release mode.
Sorry if I wasn't clear--the original assembly difference I posted in
(https://bugs.python.org/msg362665) was indeed using the "release" build
config
New submission from Dennis Sweeney :
The following tiny change:
diff --git a/Objects/listobject.c b/Objects/listobject.c
index 3c39c6444b..3ac03b71d0 100644
--- a/Objects/listobject.c
+++ b/Objects/listobject.c
@@ -2643,8 +2643,7 @@ list_richcompare(PyObject *v, PyObject *w, int op
Change by Dennis Sweeney :
--
keywords: +patch
pull_requests: +18004
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/18638
Change by Dennis Sweeney :
--
pull_requests: +17803
pull_request: https://github.com/python/cpython/pull/18427
Dennis Sweeney added the comment:
Correction correction: returning zero preserves the invariant below.
from math import gcd as GCD
from functools import reduce
from itertools import starmap, chain
def gcd(*args):
    return reduce(GCD, args, 0)
iterables = [[10, 20, 30
Dennis Sweeney added the comment:
Correction: gcd(itertools.chain(iterables)) == gcd(*map(gcd, iterables))
Dennis Sweeney added the comment:
I think the behavior of gcd() == 0 is correct, but it should be documented,
because it isn't completely obvious.
Arguments for gcd() == 0:
- Preserves the invariant gcd(itertools.chain(iterables)) ==
gcd(itertools.starmap(gcd, iterables)) in the case
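A hedged reading of that invariant, with illustrative data of my own and star-unpacking added to fit the gcd(*args) wrapper above:
from math import gcd as GCD
from functools import reduce
from itertools import chain, starmap

def gcd(*args):
    return reduce(GCD, args, 0)     # gcd() == 0, the identity element

iterables = [[10, 20, 30], [15, 25], []]
# The empty sub-iterable contributes gcd() == 0, which is harmless.
assert gcd(*chain(*iterables)) == gcd(*starmap(gcd, iterables)) == 5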
New submission from Dennis Sweeney :
Should something like the following go in the standard library, most likely in
the math module? I know I had to use such a thing before pow(a, -1, b) worked,
but Bezout is more general. And many of the easy stackoverflow implementations
of CRT congruence
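For reference, a minimal sketch of the kind of extended-Euclidean helper being discussed (illustrative only, not a proposed stdlib API):
def ext_gcd(a, b):
    # Returns (g, x, y) with a*x + b*y == g == gcd(a, b), for a, b >= 0.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

g, x, y = ext_gcd(240, 46)
assert g == 240 * x + 46 * y == 2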
Change by Dennis Sweeney :
--
pull_requests: +17174
pull_request: https://github.com/python/cpython/pull/17729
Dennis Sweeney added the comment:
PR 17729 is a C implementation of a non-recursive "flattening" of the
recursive-lazy-mergesort algorithm into a tournament whose state is a tree of
losers of comparisons.
Change by Dennis Sweeney :
--
pull_requests: +18304
pull_request: https://github.com/python/cpython/pull/18953
Change by Dennis Sweeney :
--
resolution: duplicate ->
status: closed -> open
Change by Dennis Sweeney :
--
resolution: -> wont fix
stage: patch review -> resolved
status: open -> closed
New submission from Dennis Sweeney :
It seems that `.join` methods typically return the type of the separator on
which they are called:
>>> bytearray(b" ").join([b"a", b"b"])
bytearray(b'a b')
>>> b" ".join([byt
Dennis Sweeney added the comment:
This is not a duplicate: issue16397 concerned
" ".join([US("a"), US("b")])
while this issue concerns the return value and the acceptable parameters of
UserString.join().
Dennis Sweeney added the comment:
Yes:
>>> x = "A"*10**6
>>> x.cutprefix("B") is x
True
>>> x.cutprefix("") is x
True
>>> y = b"A"*10**6
>>> y.cutprefix(b"B
Dennis Sweeney added the comment:
The existing Python implementation is benefiting from the C accelerators for
heapify and heapreplace. When forcing pure Python using test.support, I get
these results:
.\python.bat -m pyperf timeit -s "from random import random; from collections
i
Dennis Sweeney added the comment:
First, as I posted at
https://github.com/python/cpython/pull/17729#issuecomment-571864662, there is a
theoretical advantage of fewer comparisons in all cases, and the new algorithm
would be especially dominant when one iterable keeps winning. (I'm given
New submission from Dennis Sweeney :
Following discussion here (
https://mail.python.org/archives/list/python-id...@python.org/thread/RJARZSUKCXRJIP42Z2YBBAEN5XA7KEC3/
), there is a proposal to add new methods str.cutprefix and str.cutsuffix to
alleviate the common misuse of str.lstrip
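For orientation, a rough pure-Python equivalent of the proposed str.cutprefix (the feature was ultimately accepted under the name str.removeprefix in PEP 616):
def cutprefix(s, prefix):
    # Sketch only: drop the prefix if present, otherwise return s unchanged.
    return s[len(prefix):] if s.startswith(prefix) else s

assert cutprefix("test_foo", "test_") == "foo"
assert cutprefix("foo", "test_") == "foo"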
Change by Dennis Sweeney :
--
keywords: +patch
pull_requests: +18292
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/18939
Dennis Sweeney added the comment:
The trouble is that itertools.product accepts iterators, and there is no
guaranteed way of "restarting" an arbitrary iterator in Python. Consider:
>>> a = iter([1,2,3])
>>> b = iter([4,5,6])
>>> next(a)
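A hedged completion of the point: product() effectively materializes its inputs up front, so anything already pulled from an iterator is simply gone and cannot be replayed.
from itertools import product

a = iter([1, 2, 3])
b = iter([4, 5, 6])
next(a)                           # 1 has already been consumed from a
print(list(product(a, b)))        # pairs are built only from 2 and 3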
Change by Dennis Sweeney :
--
versions: +Python 3.9 -Python 3.7
Dennis Sweeney added the comment:
I disabled indexing and antivirus and I didn't see anything else obvious that
would access the files, but I'm probably missing something -- I get the same
intermittent failure when I build from the source at the 3.8.2 release, but not
on a copy of 3.8.2
New submission from Dennis Sweeney :
I get the following intermittent failure when running the tests on Master on
Windows 10.
======================================================================
PS C:\...\cpython> .\python.bat -m unittest
Change by Dennis Sweeney :
--
keywords: +patch
pull_requests: +18930
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/19594
Dennis Sweeney added the comment:
== Master ==
.\python.bat -m pyperf timeit -s "import random, math;
data=random.getrandbits(8*10_000_000).to_bytes(10_000_000, 'big')" "temp =
data.hex(); '\n'.join(temp[n:n+128] for n in range(0, len(temp), 128))"
Mean
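The benchmark output is cut off above. For context, a hedged illustration that the chunked-join spelling being timed produces the same text as letting bytes.hex() insert the separator itself (assuming the sep/bytes_per_sep parameters added in 3.8; the data here is a small stand-in for the 10 MB input):
data = bytes(range(256)) * 4
temp = data.hex()
joined = '\n'.join(temp[n:n+128] for n in range(0, len(temp), 128))
# A negative bytes_per_sep groups from the left; 64 bytes == 128 hex digits.
assert joined == data.hex('\n', -64)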
Dennis Sweeney added the comment:
I replicated this behavior. This looks like the relevant loop in pystrhex.c:
for (i=j=0; i < arglen; ++i) {
assert((j + 1) < resultlen);
unsigned char c;
c = (argbuf[i] >> 4) & 0x0f;
retbuf[j++] = Py_hexdigi
Change by Dennis Sweeney :
--
resolution: -> works for me
stage: -> resolved
status: open -> closed
Dennis Sweeney added the comment:
My suspicion was confirmed about PyPy (My PyPy here is Python 3.6.1
(784b254d6699, Apr 16 2019, 12:10:48) [PyPy 7.1.1-beta0 with MSC v.1910 32 bit]
on win32). In what follows, "heapq2.py" had exactly the `class merge` Python
implementation fro
Dennis Sweeney added the comment:
I think this question is about types in C, apart from any Python C API.
According to https://docs.python.org/3/c-api/arg.html#numbers, the specifier is
c: (bytes or bytearray of length 1) -> [char]
so you should be able to write to a C variable of t
Change by Dennis Sweeney :
--
components: +Tests -Installation
title: Errors during make test python 3.8.2 -> OS-related test failures on
Linux in Python 3.8.2
type: compile error -> behavior
Dennis Sweeney added the comment:
Thanks for reaching out! This is about test failures, not problems with
the installation process, correct? I took a look at the failures:
======================================================================
ERROR: test_add_file_after_2107
Dennis Sweeney added the comment:
There's a failure here:
https://buildbot.python.org/all/#/builders/64/builds/656
Failed subtests:
test_killed_child -
test.test_concurrent_futures.ProcessPoolSpawnProcessPoolExecutorTest
Traceback (most recent call last
Dennis Sweeney added the comment:
I'm personally -0 for underscores -- they might slightly improve readability of
the function name in isolation but may also add confusion about which methods
have underscores. Only one out of the 45 non-dunder str methods has an
underscore right now
Dennis Sweeney added the comment:
Oops -- I now see the message on Python-Dev.
Change by Dennis Sweeney :
--
keywords: +patch
nosy: +Dennis Sweeney
nosy_count: 1.0 -> 2.0
pull_requests: +19171
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/19855
Dennis Sweeney added the comment:
git bisect says that this was fixed here:
commit b94dbd7ac34dc0c79512656eb17f6f07e09fca7a
Author: Pablo Galindo
Date: Mon Apr 27 18:35:58 2020 +0100
bpo-40334: Support PyPARSE_DONT_IMPLY_DEDENT in the new parser (GH-19736)
--
nosy: +Dennis
Dennis Sweeney added the comment:
I can submit a PR. Just making sure I understand, is this essentially the
desired behavior change?
import weakref
import functools
if 0:
from test.support import import_fresh_module
functools = import_fresh_module('functools', blocked=['_functools
New submission from Dennis Sweeney :
Since bytes.hex() was added in 3.5, we should be able to make the following
change:
diff --git a/Lib/secrets.py b/Lib/secrets.py
index a546efbdd4..1dd8629f52 100644
--- a/Lib/secrets.py
+++ b/Lib/secrets.py
@@ -13,7 +13,6 @@ __all__
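The diff is cut off; the idea, sketched here with a hypothetical simplification that draws randomness straight from os.urandom instead of secrets.token_bytes, is roughly:
import os

def token_hex(nbytes=32):
    # Sketch of the refactor: rely on bytes.hex() rather than
    # binascii.hexlify(...).decode('ascii').
    return os.urandom(nbytes).hex()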
Change by Dennis Sweeney :
--
keywords: +patch
pull_requests: +19070
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/19749
Change by Dennis Sweeney :
--
title: Small Refactoring: Use the bytes.hex() in secrets.token_hex() -> Small
Refactoring: Use bytes.hex() in secrets.token_hex()
Dennis Sweeney added the comment:
> `Mapping.__reversed__` exists
While ``'__reversed__' in dir(Mapping)`` is true, that unfortunately does not
mean that it is a real callable method:
from collections.abc import Mapping
class Map(Mapping):
def __getitem__(self
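The snippet is cut off, but the point can be seen without the subclass: the attribute shows up in dir() yet is deliberately set to None so that reversed() refuses Mapping instances.
from collections.abc import Mapping

print('__reversed__' in dir(Mapping))   # True
print(Mapping.__reversed__)             # None -- a sentinel meaning "not reversible"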
Dennis Sweeney added the comment:
I think the behavior is consistent between tuple and an empty subclass:
>>> from typing import List
>>> class T(tuple):
pass
== Empty tuple/T ==
>>> List[()]
Traceback (most recent call last):
Change by Dennis Sweeney :
Added file: https://bugs.python.org/file49166/recursive_merge.py
Change by Dennis Sweeney :
Added file: https://bugs.python.org/file49165/losers.py
Change by Dennis Sweeney :
Removed file: https://bugs.python.org/file49165/losers.py
Change by Dennis Sweeney :
Added file: https://bugs.python.org/file49167/losers.py
Change by Dennis Sweeney :
Removed file: https://bugs.python.org/file48747/iter_merge.py
Change by Dennis Sweeney :
Added file: https://bugs.python.org/file49164/tournament_heap.py
Change by Dennis Sweeney :
Removed file: https://bugs.python.org/file49156/recursive_merge.py
Change by Dennis Sweeney :
Removed file: https://bugs.python.org/file48748/merge_recipe.py
Dennis Sweeney added the comment:
It seems to me that the code sprawl mostly comes from the separate handling of
the four keyed/unkeyed and forward/reverse cases, which as far as I can tell
requires a branch in the innermost loop if not unrolled into separate cases. I
think
Dennis Sweeney added the comment:
I mostly like new_merge.py too, especially the dynamic reduction of the tree.
However, it looks like ``list(merge([2],[1],[1]))`` currently fails, and I
think what's missing is the following in the sibling-promotion:
+ if sibling.left
Dennis Sweeney added the comment:
For some more ideas for features or APIs, you could look at:
https://docs.sympy.org/latest/modules/ntheory.html or
http://doc.sagemath.org/html/en/reference/rings_standard/sage/arith/misc.html
for an absolute upper bound.
If there's to be a minimal number
Change by Dennis Sweeney :
--
keywords: +patch
pull_requests: +19253
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/19938
Change by Dennis Sweeney :
--
nosy: +gvanrossum, levkivskyi
Dennis Sweeney added the comment:
The attached recursive_merge.py should be much less ugly and still somewhat
performant.
It should be the same algorithm as that PR, just written recursively rather
than iteratively.
I got some text files from http://www.gwicks.net/dictionaries.htm
Dennis Sweeney added the comment:
As Serhiy suggested, keeping the algorithm but moving the Python implementation
to be a generator again (as I recently changed in PR 18427) gives another
performance boost (although this unrolling is many lines of code).
Timing the C implementation
Dennis Sweeney added the comment:
If no one has started, I can draft such a PEP.
Dennis Sweeney added the comment:
Here is a draft PEP -- I believe it needs a Core Developer sponsor now?
--
Added file: https://bugs.python.org/file48983/pep-.rst
Change by Dennis Sweeney :
Added file: https://bugs.python.org/file48989/pep-.rst
Change by Dennis Sweeney :
Removed file: https://bugs.python.org/file48983/pep-.rst
Dennis Sweeney added the comment:
https://github.com/python/peps/pull/1332
Dennis Sweeney added the comment:
Just posted it.
Change by Dennis Sweeney :
--
keywords: +patch
nosy: +Dennis Sweeney
nosy_count: 2.0 -> 3.0
pull_requests: +19523
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/20236
Dennis Sweeney added the comment:
The attached PR isn't exactly what you requested, but it's a very minimal code
change that uses the existing __qualname__ functionality to change the message
to
TypeError: A.foo() takes 1 positional argument but 2 were given
Does that address those
Dennis Sweeney added the comment:
I just ran the entire test suite with:
--- a/Python/ceval.c
+++ b/Python/ceval.c
@@ -4179,6 +4179,7 @@ _PyEval_EvalCode(PyThreadState *tstate,
Py_ssize_t j;
if (keyword == NULL || !PyUnicode_Check(keyword)) {
+printf("THIS
Dennis Sweeney added the comment:
Sure -- I'll file the issue.
Dennis Sweeney added the comment:
https://bugs.python.org/issue40706
New submission from Dennis Sweeney :
When I was looking into https://bugs.python.org/issue40679, I couldn't come up
with a test case for the following block, so I added a print statement:
--- a/Python/ceval.c
+++ b/Python/ceval.c
@@ -4179,6 +4179,7 @@ _PyEval_EvalCode(PyThreadState *tstate
New submission from Dennis Sweeney :
One of the tests (test_ttk_guionly.test_variable_change) on the Ubuntu CI is
intermittently hanging on this code:
https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Lib/tkinter/test/test_ttk/test_extensions.py#L147
Dennis Sweeney added the comment:
key_and_reverse.py employs the same strategy as winners.py, but uses lists as
the nodes of the tree rather than using Node instances. It also eliminates the
recursion of treeify, and adds (with neither much of a performance hit nor much
code duplication
Dennis Sweeney added the comment:
Here's a reproducer.
--
nosy: +Dennis Sweeney
Added file: https://bugs.python.org/file49447/reproducer.py
Dennis Sweeney added the comment:
Attached is a proof of concept.
--
Added file: https://bugs.python.org/file49436/disksort.py
Dennis Sweeney added the comment:
If we were to do this, I think a better API might be to accept an arbitrary
iterable, then produce a sorted iterable:
def sorted_on_disk(iterable, key=None, reverse=False) -> Iterable:
...
It would sort chunks of the input and store them in fi
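A rough sketch of that API (distinct from the attached disksort.py, which is not shown here): sort fixed-size chunks in memory, spill each sorted chunk to a temporary file, then lazily heapq.merge the chunks back together. The chunk size and the pickle framing are arbitrary choices for illustration.
import heapq, itertools, pickle, tempfile

def sorted_on_disk(iterable, key=None, reverse=False, chunksize=100_000):
    files = []
    it = iter(iterable)
    while True:
        chunk = sorted(itertools.islice(it, chunksize), key=key, reverse=reverse)
        if not chunk:
            break
        f = tempfile.TemporaryFile()
        for item in chunk:                 # spill the sorted chunk to disk
            pickle.dump(item, f)
        f.seek(0)
        files.append(f)

    def read(f):
        try:
            while True:
                yield pickle.load(f)
        except EOFError:
            f.close()

    # Lazily merge the sorted runs; only one item per run is in memory at a time.
    return heapq.merge(*map(read, files), key=key, reverse=reverse)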
Dennis Sweeney added the comment:
The most recent batch of commits added a jump table.
Between master and PR 22679 now, there are 151 cases slower than master and 463
that are faster than master.
The slower cases are at most twice as slow, but the faster cases are often
10-20x faster.
I could
Dennis Sweeney added the comment:
@Tim I got this again for that benchmark:
length=3442, value=ASXABCDHAB...: Mean +- std dev: 2.39 ms +- 0.01 ms
Unfortunately not a ghost.
Dennis Sweeney added the comment:
bench_table.txt gives my results (`ref` is Master, `change` is with PR 22679).
The change gives 342 faster cases and 275 slower cases, and 9 cases with no
change.
I chose a random word of length 10**6 with a zipf character distribution for
the haystack
Dennis Sweeney added the comment:
Another algorithmic possibility: Instead of the bitset, we could have a
stack-allocated
uint8_t jump[32]; // maybe 64? Maybe uint16_t?
It would say this: If the last character lined up in the haystack is congruent
to i mod (1 << 8), then jump
Dennis Sweeney added the comment:
That test needle happened to end with a G and not have another G until much
earlier. The status quo took advantage of that, but the PR only takes advantage
of the skip value for a certain middle character. Perhaps it could do both
Dennis Sweeney added the comment:
Here is a C implementation of the two-way algorithm that should work as a
drop-in replacement for Objects/stringlib/fastsearch.h.
Benchmarking so far, it looks like it is a bit slower in a lot of cases. But
it's also a bit faster in some other cases
Dennis Sweeney added the comment:
PR 22679 is a draft that does the two-way algorithm but also adds both of the
tricks from Fredrik's implementation: a bit-set "bloom filter" and remembering
the skip-distance between some pair of characters.
--
Added file: https://bugs.
Change by Dennis Sweeney :
--
pull_requests: +21650
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/22679
Dennis Sweeney added the comment:
I used random_bench.py to compare PR 22679 to Master, and the results are in
bench_results.txt. Results were varied. I suppose this depends on what cases we
want to optimize for.
--
Added file: https://bugs.python.org/file49512/random_bench.py
Dennis Sweeney added the comment:
I'm doing a couple more timing tests to try to understand exactly when the
cutoff should be applied (based on some combination of needle and haystack
lengths).
Can the rolling hash algorithm be made to go sublinear like O(n/m)? It looked
like it was pretty
Dennis Sweeney added the comment:
I added the cutoff for strings >= 10 characters, and I converted the PR from a
draft to "Ready to Review."
When running stringbench.py before and after the PR, I get these results:
Summary:
Unicode Before: 81.82 Bytes Before: 92.62
Dennis Sweeney added the comment:
> But there _also_ seem to be real (but much smaller) benefits
> for the "search backward" cases, which I don't recall seeing
> when I tried it. Do you have a guess as to why?
I did change `skip = mlast - 1;` to `skip = mlast;` as you
Dennis Sweeney added the comment:
> Dennis, I think that's expected, right? Two-way on its own can exploit
> nothing about individual characters - it only preprocesses the needle to
> break the possibility for quadratic-time behavior due to periods in the
> needle.
Yes, th
Dennis Sweeney added the comment:
FWIW, one of the "# Made the spaces line up" is actually a "skip ahead by the
needle length".
Dennis Sweeney added the comment:
I posted the example thinking that having a concrete walkthrough might be good
for discussion, and it looks like it was. ;-)
This makes me curious how a simplified-but-not-as-simplified-as-the-status-quo
Sunday algorithm would fare: using the Sunday