[issue42456] Logical Error

2020-11-24 Thread Tim Peters


Tim Peters  added the comment:

There's no bug here.

"&" is the bitwise Boolean logical-and operator on integers. For example,

>>> 1 & 2
0
>>> 1 & 3
1

It binds more tightly than the "==" equality-testing operator. To get the 
result you want, you would need to add more parentheses to override the natural 
operator precedence:

>>> (a == 10) & (not(a!=b)) & (b == 10)
True

But, more to the point, what you probably really want is the "and" logical 
operator, not the bitwise integer "&" operator:

>>> a == 10 and (not(a!=b)) and b == 10
True
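
To make the grouping concrete, here's a tiny sketch (it assumes a and b are both
10, consistent with the True results above):

a = b = 10
# "&" binds tighter than "==", so without parentheses this is evaluated as
# a == (10 & (not (a != b)) & (b == 10)), i.e. a == 0:
print(a == 10 & (not (a != b)) & (b == 10))    # False
# Parenthesizing each comparison, or better, using "and", gives the intended result:
print((a == 10) & (not (a != b)) & (b == 10))  # True
print(a == 10 and not (a != b) and b == 10)    # True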

--
nosy: +tim.peters
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




[issue42336] Make PCbuild/build.bat build x64 by default

2020-11-12 Thread Tim Peters


Tim Peters  added the comment:

+1. If you're feeling more ambitious, it would also be good to change build.bat 
and rt.bat to use the same "which platform?" spellings, with the same defaults.

--
nosy: +tim.peters




[issue41972] bytes.find consistently hangs in a particular scenario

2020-11-06 Thread Tim Peters


Tim Peters  added the comment:

But that also suggests a different approach: start with the current code, but 
add introspection to switch to your enhancement of C&P if the current code is 
drifting out of linear-time territory.

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-11-06 Thread Tim Peters


Tim Peters  added the comment:

I'm sorry I haven't been able to give more time to this. I love what's been 
done, but am just overwhelmed :-(

The main thing here is to end quadratic-time disasters, without doing 
significant damage in other cases. Toward that end it would be fine to use 
"very high" cutoffs, and save tuning for later. It's clear that we _can_ get 
significant benefit even in many non-disastrous cases, but unclear exactly how 
to exploit that. But marking the issue report as "fixed" doesn't require 
resolving that - and I want to see this improvement in the next Python release.

An idea that hasn't been tried: in the world of sorting, quicksort is usually 
the fastest method - but it's prone to quadratic-time disasters. OTOH, heapsort 
never suffers disasters, but is almost always significantly slower in ordinary 
cases. So the more-or-less straightforward "introsort" compromises, by starting 
with a plain quicksort, but with "introspection" code added to measure how 
quickly it's making progress. If quicksort appears to be thrashing, it switches 
to heapsort.

It's possible an "introsearch" would work well too. Start with pure brute force 
(not even with the current preprocessing), and switch to something else if it's 
making slow progress. That's easy to measure: compare the number of character 
comparisons we've made so far to how many character positions we've advanced 
the search window so far. If that ratio exceeds (say) 3, we're heading out of 
linear-time territory, so should switch to a different approach.
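
For concreteness, here's a tiny pure-Python sketch of that idea (illustrative
only - the real thing would live in C, and str.find merely stands in for the
smarter fallback):

def introsearch(haystack, needle, ratio=3):
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0
    comparisons = 0
    for start in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if haystack[start + j] != needle[j]:
                break
        else:
            return start                  # brute force found it
        # Slow-progress test: comparisons made vs. positions advanced so far.
        if comparisons > ratio * (start + 1):
            # Heading out of linear-time territory - hand off to a
            # worst-case-linear algorithm (str.find is just a placeholder).
            return haystack.find(needle, start)
    return -1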

With no preprocessing at all, that's likely to be faster than even the current 
code for very short needles, and in all cases where the needle is found at the 
start of the haystack.

This context is trickier, though, in that the current code, and pure C&P, can 
also enjoy sub-linear time cases. It's always something ;-)

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-22 Thread Tim Peters


Tim Peters  added the comment:

Note that Sunday doesn't care (at all) where mismatches occur. The "natural" 
way to add Sunday: follow pure C-P unless/until it finds a mismatching 
position. Pure C-P then computes a specific shift. Nothing about that changes. 
But something is added: also compute the shift Sunday suggests, and pick the 
larger of that and what C-P computed.
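
For reference, the Sunday shift by itself is dirt cheap to compute - a sketch
(a dict stands in here for whatever table the real code would use):

def sunday_table(needle):
    m = len(needle)
    table = {}
    for i, ch in enumerate(needle):
        table[ch] = m - i            # rightmost occurrence wins
    return table

def sunday_shift(table, m, char_one_past_window):
    # A character not in the needle at all allows the maximal m + 1 jump.
    return table.get(char_one_past_window, m + 1)

On a mismatch, take the max of C-P's shift and sunday_shift(table, len(needle),
<the character one past the window>) - exactly the "pick the larger" rule above.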

C-P and Sunday both have cases where they can justify "supernaturally large" 
shifts (i.e., as long as len(needle), or even that +1 for Sunday), and they're 
not always the same cases.

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-18 Thread Tim Peters


Tim Peters  added the comment:

I confess I _assumed_ all along that you were generalizing the current code's 
Sunday trick to 7-bit equivalence classes (up from 32 bits total) and 64K 
possible shift counts (up from just 2 total possibilities: 1 or len(needle)+1). 
The Sunday trick couldn't care less where or when the mismatch occurs, just 
that a mismatch occurred somewhere.

In my head I was picturing the paper's code, not the PR.  Whenever it makes a 
shift, it could compare it to the "Sunday-ish shift", and pick the larger.  
That should have no effect on worst case O() behavior - it would always shift 
at least as much as Crochemore-Perrin when a mismatch was hit.

I can't say how it would relate to details of the PR's spelling, because I'm 
still trying to fully understand the paper ;-) I don't believe I can usefully 
review the code before then.

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-18 Thread Tim Peters


Tim Peters  added the comment:

Dennis, I think that's expected, right? Two-way on its own can exploit nothing 
about individual characters - it only preprocesses the needle to break the 
possibility for quadratic-time behavior due to periods in the needle.

It sounds like you switched the current code's Sunday-ish approach to a more 
Boyer-Moore-ish approach. That probably hurt, and especially for shorter 
needles (where Sunday can yield a maximum shift of len(needle)+1, not 
len(needle) - the "+1" is more significant the smaller len(needle)).

Sunday gave 3 variants of his basic algorithm, and in his tests they were all 
faster than Boyer-Moore, and more so the shorter the needle.  Just FYI, here's 
a scanned PDF of his paper:

https://csclub.uwaterloo.ca/~pbarfuss/p132-sunday.pdf

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-17 Thread Tim Peters


Tim Peters  added the comment:

Yup, they act essentially the same, but yours jumps into the quicksand earlier 
;-)

I'm fond of this one:

"""
HUGE = 10**7
BIG = 10**6

bigxs = 'x' * BIG

haystack = 'x' * HUGE
needle = bigxs + 'y' + bigxs
"""

"The problem" then is in middle of the needle, not at either end, so really 
simple tricks all fail. For example, set(haystack) is a subset of set(needle), 
so our "Bloom filter" is useless, and needle[-1] == needle[-2], so our "skip" 
count is the useless 0.  Pure brute force is quadratic-time, and we're even 
slower because we're paying over & over to try heuristic speedups that happen 
never to pay off.
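
A scaled-down version is handy for poking at this interactively (sizes shrunk
so it runs instantly; the point is just to see why the heuristics get no
traction):

BIG = 10**3
bigxs = 'x' * BIG
haystack = 'x' * (10 * BIG)
needle = bigxs + 'y' + bigxs

print(set(haystack) <= set(needle))   # True - the "Bloom filter" can't exclude anything
print(needle[-1] == needle[-2])       # True - so the "skip" count is the useless 0
print(haystack.find(needle))          # -1, found the slow way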

Two-way splits the needle as

u = bigxs
v = 'y' + bigxs

and again never gets out of the "try to match the right part" phase. Each time 
around it fails to match 'y' on the first try, and shifts right by 1.  Boring, 
but linear time overall.

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-17 Thread Tim Peters


Tim Peters  added the comment:

I don't think we have any idea how the OP stumbled into this. Looks like it 
"just happened".

The case you constructed is quadratic-time, but not quite as bad:

BaB
BB

Fails at once, because 'a' doesn't match the trailing needle 'B'. The Bloom 
filter is useless because the next haystack 'B' _is_ in the needle.  So shift 
right 1:

BaB
xBB

Last character matches, so goes on to compare 4 Bs and an aB mismatch. 6 in 
all. Bloom filter useless again, and another shift by 1:

BaB
xxBB

Now there's 1 less compare, because only 3 Bs match.

And so on.  After the next shift, only 2 Bs match, and after the shift 
following that, only 1 B.  Getting to:

BaB
xBB

Now after matching the trailing Bs, aB mismatches at once.  And we're done, 
because another shift by 1 moves the end of the needle beyond the end of the 
haystack (although I bet the Bloom filter reads up the trailing NUL byte from 
the haystack and actually makes a "giant" shift).

You're right that two-way yawns at this ;-)  'B' * (K*2) is split into

u = "" # empty string!
v = 'B' * (K*2)

so only the "match the right half" part of the searching algorithm comes into 
play.

BaB
BB

first mismatches at the 6th character (1-based counting - at index 5), so it 
shifts right by 6:

BaB
xxBB

And it's done, because the needle has moved beyond the end of the haystack.

The brainpower that went into making this "look trivial" is quite impressive :-)

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-17 Thread Tim Peters


Tim Peters  added the comment:

When I ran stringbench yesterday (or the day before - don't remember), almost 
all the benefit seemed to come from the "late match, 100 characters" tests.  
Seems similar for your run.  Here are your results for just that batch, 
interleaving the two runs to make it easy to see:  first line from the "before" 
run, second line from the "after" (PR) run, then a blank line.  Lather, rinse, 
repeat.

These are dramatic speedups for the "search forward" cases.  But there _also_ 
seem to be real (but much smaller) benefits for the "search backward" cases, 
which I don't recall seeing when I tried it.  Do you have a guess as to why?

== late match, 100 characters
        bytes   unicode
        (in ms) (in ms) ratio%=bytes/unicode*100
2.73    3.88    70.4    s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)
0.17    0.15    116.3   s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)

2.01    3.54    56.8    s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)
0.89    0.87    101.8   s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)

1.66    2.36    70.2    s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)
0.15    0.13    111.5   s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)

2.74    3.89    70.5    s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)
0.17    0.15    112.4   s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)

3.93    4.00    98.4    s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)
0.30    0.27    108.0   s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)

3.99    4.59    86.8    s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)
3.13    2.51    124.5   s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)

1.64    2.23    73.3    s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)
1.54    1.82    84.5    s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)

3.97    4.59    86.4    s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)
3.18    2.53    125.8   s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)

4.69    4.67    100.3   s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)
3.37    2.66    126.9   s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)

4.09    2.82    145.0   s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)
3.39    2.62    129.5   s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)

3.50    3.51    99.7    s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)
0.30    0.28    106.0   s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)

Just for contrast, doesn't make much difference for the "late match, two 
characters" tests:

== late match, two characters
0.44    0.58    76.2    ("AB"*300+"C").find("BC") (*1000)
0.57    0.48    120.2   ("AB"*300+"C").find("BC") (*1000)

0.59    0.73    80.5    ("AB"*300+"CA").find("CA") (*1000)
0.56    0.72    77.5    ("AB"*300+"CA").find("CA") (*1000)

0.55    0.49    112.5   "BC" in ("AB"*300+"C") (*1000)
0.66    0.37    177.7   "BC" in ("AB"*300+"C") (*1000)

0.45    0.58    76.5    ("AB"*300+"C").index("BC") (*1000)
0.57    0.49    116.5   ("AB"*300+"C").index("BC") (*1000)

0.61    0.62    98.6    ("AB"*300+"C").partition("BC") (*1000)
0.72    0.52    137.2   ("AB"*300+"C").partition("BC") (*1000)

0.62    0.64    96.4    ("C"+"AB"*300).rfind("CA") (*1000)
0.49    0.49    101.6   ("C"+"AB"*300).rfind("CA") (*1000)

0.57    0.65    87.5    ("BC"+"AB"*300).rfind("BC") (*1000)
0.51    0.57    89.3    ("BC"+"AB"*300).rfind("BC") (*1000)

0.62    0.64    96.5    ("C"+"AB"*300).rindex("CA") (*1000)
0.50    0.49    101.2   ("C"+"AB"*300).rindex("CA") (*1000)

0.68    0.69    99.0    ("C"+"AB"*300).rpartition("CA") (*1000)
0.61    0.54    113.5   ("C"+"AB"*300).rpartition("CA") (*1000)

0.82    0.60    137.8   ("C"+"AB"*300).rsplit("CA", 1) (*1000)
0.63    0.57    112.0   ("C"+"AB"*300).rsplit("CA", 1) (*1000)

0.63    0.61    103.0   ("AB"*300+"C").split("BC", 1) (*1000)
0.74    0.54    138.2   ("AB"*300+"C").split("BC", 1) (*1000)

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-16 Thread Tim Peters


Tim Peters  added the comment:

BTW, in the old post of Fredrik's I linked to, he referred to a 
"stringbench.py" program he used for timings, but the link is dead.

I was surprised to find that it still lives on, under the Tools directory. Or 
did - I'm on Windows now and its Python distro doesn't always have everything 
a Linux distro has, for "reasons" I can't remember ;-)

Anyway, here on GitHub:

https://github.com/python/cpython/tree/master/Tools/stringbench

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-16 Thread Tim Peters


Tim Peters  added the comment:

And changed the "Type" field to "performance", because speed is the only issue 
here.

--
type: behavior -> performance




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-16 Thread Tim Peters


Tim Peters  added the comment:

Removed 3.8 from the Versions list. The code was functioning as designed, and 
the O(n*m) worst case bounds were always known to be possible, so there's no 
actual bug here.

--
versions:  -Python 3.8




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-16 Thread Tim Peters


Tim Peters  added the comment:

I don't think Rabin-Karp is worth trying here. The PR is provably worst-case 
linear time, and the constant factor is already reasonably small. Its only real 
weakness I can see is that it can be significantly (but seemingly not 
dramatically) slower than the status quo, in what are exceptionally good cases 
for the status quo, and perhaps most often due to the increased preprocessing 
cost of the new code (which is relatively most damaging if the needle is found 
early in the haystack - in which cases preprocessing can be close to a pure 
waste of time).

The status quo and the PR both enjoy sublinear (O(len(haystack) / len(needle))) 
cases too, but R-K doesn't.

Where R-K is a "go to" choice is when an app needs to search the same large 
text for several strings (Wikipedia has a decent description of how R-K can be 
adapted to that) - but Python has no string API that does such a thing (closest 
is joining the search-for strings via "|" for a regexp search).
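
For the record, the multi-string adaptation is straightforward to sketch in
pure Python (schematic only: it assumes all patterns in the list share one
length, and uses a toy rolling hash):

def rabin_karp_multi(text, patterns):
    m = len(patterns[0])
    assert all(len(p) == m for p in patterns)
    B, MOD = 256, (1 << 61) - 1

    def h(s):
        v = 0
        for ch in s:
            v = (v * B + ord(ch)) % MOD
        return v

    targets = {}
    for p in patterns:
        targets.setdefault(h(p), []).append(p)
    power = pow(B, m - 1, MOD)
    hv = h(text[:m])
    hits = []
    for i in range(len(text) - m + 1):
        for p in targets.get(hv, ()):
            if text[i:i+m] == p:      # guard against hash collisions
                hits.append((i, p))
        if i + m < len(text):
            # Roll the hash: drop text[i], append text[i+m].
            hv = ((hv - ord(text[i]) * power) * B + ord(text[i + m])) % MOD
    return hits

For example, rabin_karp_multi("abracadabra", ["bra", "cad"]) returns
[(1, 'bra'), (4, 'cad'), (8, 'bra')].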

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-14 Thread Tim Peters


Tim Peters  added the comment:

For completeness, a link to the python-dev thread about this:

https://mail.python.org/archives/list/python-...@python.org/thread/ECAZN35JCEE67ZVYHALRXDP4FILGR53Y/#4IEDAS5QAHF53IV5G3MRWPQAYBIOCWJ5

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-13 Thread Tim Peters


Tim Peters  added the comment:

> There's no discomfort at all to me if, e.g., it stored
> 32-bit counts and is indexed by the last 6 bits of the
> character.  That's a measly 256 bytes in all.

Or, for the same space, 16-bit counts indexed by the last 7 bits. Then there's 
no aliasing for 7-bit ASCII, which is still very common in my world ;-)  
Needles over 64K characters aren't.

Which is a weird rule of thumb that's served me well, although for no solid 
reason I can detect:  when faced with a universe of tradeoff possibilities for 
which it appears impossible to get a handle on "the typical" case, optimize for 
_your_ cases. Then at least one user will be delighted in the end :-)

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-13 Thread Tim Peters


Tim Peters  added the comment:

Dennis, I'm delighted that the timing harness pointed out an actual glitch, and 
that it was so (seemingly) straightforward to identify the algorithmic cause. 
This gives me increased confidence that this project can be pushed to adoption, 
and your name will be hallowed in Python's history :-)

I have no problem with increasing the constant space. Fredrik was Python's 
Unicode pioneer too, so was naturally repelled by any scheme that required 
additional space proportional to the alphabet size. The "Bloom filter" is the 
tiny bit left of Daniel Sunday's algorithm, which actually has no such thing 
;-) Along the lines you suggested, Sunday precomputes a vector indexed by 
characters, each entry giving the distance to that character's rightmost index 
in the needle.  The "Bloom filter" throws that vector away and only saves 
hashes of the indices where the vector holds its maximum possible value.

Note that there's nothing magical about "right to left" in Sunday's algorithm 
(or, really, in our current use of the Bloom filter). Characters can be 
compared in any order, and when there's a mismatch, the skip vector can be 
indexed by the character one beyond the search window to often find a decent 
amount to skip.  Indeed, in apps where the expected frequency of characters is 
known, Sunday's algorithm is often adapted to compare the least-frequently 
expected needle character first.

The downside isn't really the space, but that stack space is uninitialized 
trash. Initializing it to a known value increases preprocessing overhead, 
albeit independent of needle length. So the relative overhead is higher the 
shorter the needle.

I agree expanding it beyond the tiny bit vector is likely to be significantly 
helpful.  There's no discomfort at all to me if, e.g., it stored 32-bit counts 
and is indexed by the last 6 bits of the character.  That's a measly 256 bytes 
in all.

It's also possible that a more capable "Sunday-ish vector" of this kind would 
render the current `skip` trick usually useless by comparison. Or not ;-)

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-13 Thread Tim Peters


Tim Peters  added the comment:

Dennis, would it be possible to isolate some of the cases with more extreme 
results and run them repeatedly under the same timing framework, as a test of 
how trustworthy the _framework_ is? From decades of bitter experience, most 
benchmarking efforts end up chasing ghosts ;-)

For example, this result:

length=3442, value=ASXABCDHAB...  | 289 us  | 2.36 ms: 8.19x slower (+719%) 

Is that real, or an illusion?

Since the alphabet has only 26 letters, it's all but certain that a needle that 
long has more than one instance of every letter. So the status quo's "Bloom 
filter" will have every relevant bit set, rendering its _most_ effective 
speedup trick useless. That makes it hard (but not impossible) to imagine how 
it ends up being so much faster than a method with more powerful analysis to 
exploit.

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-13 Thread Tim Peters


Tim Peters  added the comment:

> For a bunch of cases it's slower, for some others it's faster.

I have scant real idea what you're doing, but in the output you showed 4 output 
lines are labelled "slower" but 18 are labelled "faster".

What you wrote just above appears to say the reverse (I'd call 18 "a bunch" 
compared to 4 "some others").

Could you please state plainly which of {status quo, PR} is faster on an output 
line labelled "faster"?

My a priori guess was that the PR had the highest chance of being slower when 
the needle is short and is found in the haystack early on. Then preprocessing 
time accounts for a relatively higher percentage of the total time taken, and 
the PR's preprocessing is more expensive than the status quo's.

The alphabet size here is small (just 26 possible letters, from 
`ascii_uppercase`), so it's quite likely that a short needle _will_ be found 
early on in a long haystack.

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-11 Thread Tim Peters


Tim Peters  added the comment:

Impressive, Dennis! Nice work.

FYI, on the OP's original test data, your fastsearch() completes each search in 
under 20 seconds using CPython, and in well under 1 second using PyPy.

Unfortunately, that's so promising it can't just be dismissed offhandedly ;-)  
Looks like something quite worth pursuing here!

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-10 Thread Tim Peters


Tim Peters  added the comment:

Ya, the case for the diff is at best marginal. Note that while it may be 
theoretically provable that the extra test would make the worst cases slower, 
that's almost certainly not measurable. The extra test would almost never be 
executed in the worst cases: in those the last characters of the needle and 
haystack DO match, and we spend almost all the total time in the

for (j = 0; j < mlast; j++)

loop.  The extra test is only in the branch where the last characters _don't_ 
match.

In that branch, it's already certain that the last haystack character does not 
match the last needle character, so in a probabilistic sense, for "random" data 
that increases the odds that the last haystack character isn't in the needle at 
all. In which case the payoff is relatively large compared to the cost of the 
test (can skip len(needle) positions at once, instead of only 1).

I don't believe this can be out-thought. It would require timing on "typical" 
real-life searches. Which are likely impossible to obtain ;-)

Offhand do you know what the _best_ timing for two-way search is in a 
pattern-not-found case? The algorithm is too complicated for that to be 
apparent to me at a glance. As Fredrik said in the post of his I linked to, he 
was very keen to have an algorithm with best case sublinear time. For example, 
the one we have takes best-case not-found O(n/m) time (where n is the haystack 
length and m the needle length).  For example, searching for 'B' * 1_000_000 in 
'A' * 10_000_000 fails after just 9 character comparisons (well, not counting 
the million less 1 compares to initialize `skip` to the ultimately useless 0).
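
That's easy to see from Python (same sizes as in the example above):

import time

haystack = 'A' * 10_000_000
needle = 'B' * 1_000_000
t0 = time.perf_counter()
print(haystack.find(needle))      # -1
print(time.perf_counter() - t0)   # tiny - roughly len(haystack)/len(needle) work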

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-09 Thread Tim Peters


Tim Peters  added the comment:

The attached fastsearch.diff removes the speed hit in the original test case 
and in the constructed one.

I don't know whether it "should be" applied, and really can't make time to dig 
into it.

The rationale:  when the last characters of the pattern string and the current 
search window don't match, this is what happens:  if the "Bloom filter" can 
prove that the character one beyond the search window isn't in the pattern at 
all, the search window is advanced by len(pattern) + 1 positions. Else it's 
advanced only 1 position.

What changes here: if the Bloom filter thinks the character one beyond the 
search window may be in the pattern, this goes on to see whether the Bloom 
filter thinks the last character in the search window may be in the pattern (by 
case assumption, we already know it's not the _last_ character in the pattern). 
If not, the search window can be advanced by len(pattern) positions.

So it adds an extra check in the last-characters-don't-match and 
next-character-may-be-in-the-pattern case:  if the last character is not in the 
pattern, it's irrelevant that the next character may be - there's no possible 
overall match including the last character, so we can move the search window 
beyond it.
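
As a runnable pure-Python model of that rule (a plain set of the needle's
characters stands in for the real 32-bit "Bloom filter" mask; this is a sketch
of the logic, not the C code):

def advance_after_last_char_mismatch(haystack, i, needle, bloom):
    # Called when haystack[i + len(needle) - 1] != needle[-1].
    m = len(needle)
    nxt = i + m                   # the character one beyond the search window
    if nxt < len(haystack) and haystack[nxt] not in bloom:
        return i + m + 1          # status quo: that character can't be in the needle
    if haystack[i + m - 1] not in bloom:
        return i + m              # the diff's extra test: last window char not in needle
    return i + 1                  # otherwise slide by one, as before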

The question is whether this costs more than it saves overall.  It saves 
literally hours of CPU time for one search in this report's case ;-)  But 
worst-case time remains O(m*n) regardless, just somewhat harder to provoke.

--
keywords: +patch
Added file: https://bugs.python.org/file49503/fastsearch.diff




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-08 Thread Tim Peters


Tim Peters  added the comment:

BTW, this initialization in the FASTSEARCH code appears to me to be a mistake:

skip = mlast - 1;

That's "mistake" in the sense of "not quite what was intended, and so 
confusing", not in the sense of "leads to a wrong result".

I believe `skip` should be initialized to plain `mlast` instead.  Setting it to 
something smaller than that (like to `mlast - 1`) doesn't cause an error, but 
fails to exploit as much skipping as possible.

`mlast - 1` is the same value `skip` is set to if the following loop finds that 
the last character of the pattern is also the pattern's first character, but 
doesn't appear elsewhere in the pattern. It can be one larger if the last 
pattern character is unique in the pattern (which is the initial value of 
`skip` set by the line shown above).
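
In pure-Python terms the preprocessing amounts to this (Bloom-filter
bookkeeping omitted, and with `skip` started at `mlast` as argued above):

def compute_skip(needle):
    mlast = len(needle) - 1
    skip = mlast                  # rather than mlast - 1
    for i in range(mlast):
        if needle[i] == needle[mlast]:
            skip = mlast - i - 1  # realigns the rightmost earlier occurrence of the last char
    return skip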

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-08 Thread Tim Peters


Tim Peters  added the comment:

Just FYI, the original test program did get the right answer for the second 
search on my box - after about 3 1/2 hours :-)

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-08 Thread Tim Peters


Tim Peters  added the comment:

Good sleuthing, Dennis! Yes, Fredrik was not willing to add "potentially 
expensive" (in time or in space) tricks:

http://effbot.org/zone/stringlib.htm

So worst-case time is proportional to the product of the arguments' lengths, 
and the cases here appear to, essentially, hit that.  It _was_ a goal that it 
always be at least as fast as the dirt-dumb search algorithm it replaced, and 
in good real-life (not just contrived) cases to be much faster.  It met the 
goals it had.

--




[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-07 Thread Tim Peters


Tim Peters  added the comment:

Also reproduced on 64-bit Win10 with just-released 3.9.0.

Note that string search tries to incorporate a number of tricks (pieces of 
Boyer-Moore, Sunday, etc) to speed searches.  The "skip tables" here are 
probably computing a 0 by mistake. The file here contains only 2 distinct byte 
values, and the relatively huge string to search for has only 1, so there's 
plenty of opportunity for those tricks to get confused ;-)

--
nosy: +tim.peters




[issue41964] difflib SequenceMatcher get_matching_blocks returns non-matching blocks in some cases

2020-10-07 Thread Tim Peters


Tim Peters  added the comment:

I believe your testing code is in error, perhaps because it's so overly 
elaborate you've lost track of what it's doing.  Here's a straightforward test 
program:

import difflib
s1='http://local:56067/register/200930162135700;>'
s2='http://local:53813/register/20100517282450281;>'
d = difflib.SequenceMatcher(None, s1, s2)
for m in d.get_matching_blocks():
    print(m, repr(s1[m.a : m.a + m.size]),
             repr(s2[m.b : m.b + m.size]))

and its output under 3.9.0:

Match(a=0, b=0, size=39) 'http://local:5' 'http://local:5'
Match(a=43, b=43, size=12) '/register/20' '/register/20'
Match(a=59, b=55, size=1) '1' '1'
Match(a=66, b=56, size=2) '00' '00'
Match(a=68, b=70, size=2) '">' '">'
Match(a=70, b=72, size=0) '' ''

Your test program is obtaining the substrings to display from these two lines:

mtch_a = dply_Frst[mblk.a : mblk.a + mblk.size];
mtch_b = dply_Frst[mblk.b : mblk.b + mblk.size];

But BOTH of those are extracting substrings from `dply_Frst`. Looks like you 
intended to use the `dply_Scnd` argument in the second line instead.
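
That is, presumably the intended pair was:

mtch_a = dply_Frst[mblk.a : mblk.a + mblk.size]
mtch_b = dply_Scnd[mblk.b : mblk.b + mblk.size]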

If I change my test program to use `s1` for both substrings, then it matches 
the output you gave. But that doesn't make sense ;-)

So I'm closing this as not-a-bug. If I missed something relevant, feel free to 
re-open it.

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




[issue41912] Long generator chain causes segmentation fault

2020-10-04 Thread Tim Peters


Tim Peters  added the comment:

"Stackless" is a large topic with a convoluted history. Do the web search. In 
short, no, it will never go in the core - too disruptive to too many things. 
Parts have lived on in other ways, watered down versions. The PyPy project 
captured most of what remains, as optional features. CPython's generators 
captured the easiest part: after a yield, the C stack space the generator 
consumed is released, while the Python VM stack space lives on in the heap 
awaiting possible resumption (but, if/when it's resumed, C stack space is 
required again).
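
A tiny illustration of that last point:

def gen():
    yield 1
    yield 2

g = gen()
print(next(g))                 # 1 - uses C stack space only while actually running
print(g.gi_frame is not None)  # True - the Python frame waits in the heap, suspended
print(next(g))                 # 2 - resuming needs C stack space again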

--




[issue41912] Long generator chain causes segmentation fault

2020-10-02 Thread Tim Peters


Tim Peters  added the comment:

Right, generators played no essential role here. Just one way of piling up a 
tall tower of C stack frames.

Search the web for "stackless Python" for the history of attempts to divorce 
the CPython implementation from the platform C stack.

There are ways to increase the main thread's stack size on Windows too using 
MSVC, but unless someone is extremely knowledgeable and willing to write some 
assembler to help out, they need to change it via a Windows linker option when 
the .exe is built, or via the EDITBIN developer tool (to modify the .exe file).

--




[issue41912] Long generator chain causes segmentation fault

2020-10-02 Thread Tim Peters


Tim Peters  added the comment:

There is no way in portable ANSI C to deduce a "safe" limit. The limits that 
exist were picked by hand across platforms, to be conservative guesses at what 
would "never" break.

You're allowed to increase the limit if you think you know better - and you 
may! No two platforms are the same. But - again - you do so at your own risk.  
There is no bug here unless you can provoke a crash _without_ overriding the 
default limit.

I was there when the limit was introduced. It was introduced precisely to avoid 
the vast array of strange system failures that can occur when the C stack 
overflows - and, again, there is no portable way to detect that "it's getting 
close" in ANSI C.  Stack sizes provided by platform C implementations are all 
over the place, from kilobytes to megabytes, and can also be different yet 
again for threads, and there is no portable way in C to find out what they are.

"""
For my friend using Windows, a value as low as 4000 suffices, which I don't 
think anyone would argue is unreasonably high.
"""

For the defaults provided by Microsoft's Visual C compiler (which CPython 
uses), it is unreasonably high - the C stack overflows.  In fact, under the 
released Python 3.8.5 on my 64-bit Windows, the largest value for which your 
program isn't obviously broken is about 2512.  If I don't override the default 
recursion limit on Windows, I cannot provoke a problem with your program on 
Windows (it dies - as designed and intended - with a `RecursionError` exception instead).
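
For example, leaving the default limit alone:

import sys

def down(n):
    return n if n == 0 else down(n - 1)

print(sys.getrecursionlimit())   # typically 1000
try:
    down(10**6)
except RecursionError as exc:
    print("no crash, just:", exc)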

--




[issue41912] Long generator chain causes segmentation fault

2020-10-02 Thread Tim Peters


Tim Peters  added the comment:

The docs are already clear about that you play with `setrecursionlimit()` at 
your own risk:

"""
Set the maximum depth of the Python interpreter stack to limit. This limit 
prevents infinite recursion from causing an overflow of the C stack and 
crashing Python.

The highest possible limit is platform-dependent. A user may need to set the 
limit higher when they have a program that requires deep recursion and a 
platform that supports a higher limit. This should be done with care, because a 
too-high limit can lead to a crash.
"""

If you see a crash instead of a `RecursionError` exception when you _don't_ 
shoot yourself in the foot this way, that might be interesting. But, as is, 
you're deliberately overriding a mechanism designed to prevent these kinds of 
crashes.

--
nosy: +tim.peters




[issue37168] small_ints[] modified (third party C-extension?)

2020-08-30 Thread Tim Peters


Tim Peters  added the comment:

Closing, since it remains a unique report and there hasn't been another word 
about it in over a year.

--
resolution:  -> works for me
stage:  -> resolved
status: pending -> closed




[issue41513] High accuracy math.hypot()

2020-08-29 Thread Tim Peters


Tim Peters  added the comment:

About test_frac.py, I changed the main loop like so:

got = [float(expected)]                # NEW
for hypot in hypots:
    actual = hypot(*coords)
    got.append(float(actual))          # NEW
    err = (actual - expected) / expected
    bits = round(1 / err).bit_length()
    errs[hypot][bits] += 1
if len(set(got)) > 1:                  # NEW
    print(got)                         # NEW

That is, to display every case where the four float results weren't identical.

Result: nothing was displayed (although it's still running the n=1000 chunk, 
there's no sign that will change).  None of these variations made any 
difference to results users actually get.

Even the "worst" of these reliably develops dozens of "good bits" beyond IEEE 
double precision, but invisibly (under the covers, with no visible effect on 
delivered results).

So if there's something else that speeds the code, perhaps it's worth pursuing, 
but we're already long beyond the point of getting any payback for pursuing 
accuracy.

--




[issue41660] multiprocessing.Manager objects lose connection info

2020-08-29 Thread Tim Peters


Tim Peters  added the comment:

Noting that adding a `.join()` to the failing code on the StackOverflow report 
appeared to fix that problem too.

In hindsight, I guess I'm only mildly surprised that letting the main process 
run full speed into interpreter shutdown code while worker processes are still 
running can have bad effects. But I am _somewhat_ surprised. For threads, early 
in shutdown we automatically run a loop to .join() them.

--




[issue41660] multiprocessing.Manager objects lose connection info

2020-08-29 Thread Tim Peters


Tim Peters  added the comment:

And more weirdness, changing the tail to:

for i in range(10):
    state_value.value = i
    state_ready.clear()
    producerprocess = MyProducer(state_value, state_ready)
    consumerprocess = MyConsumer(state_value, state_ready)
    producerprocess.start()
    consumerprocess.start()
    producerprocess.join()
    consumerprocess.join()
    print(state_value, state_ready.is_set())

That runs fine! All the expected output, and no tracebacks.  But it ALSO runs 
fine if EITHER of the .join() lines is commented out.

--




[issue41660] multiprocessing.Manager objects lose connection info

2020-08-28 Thread Tim Peters


Tim Peters  added the comment:

Weird. If I insert these between the two process starts:

import time
time.sleep(2)

then the producer produces the expected output:

at start: 666
at producer start: 666

and the program blows up instead when it gets to

print("in consumer:", self.val.value)

Same thing if, instead of a sleep, I put

producerprocess.join()

before starting the consumer.

If I change the tail end to:

producerprocess.start()
import time
time.sleep(2)
print(state_value)
consumerprocess = MyConsumer(state_value, state_ready)
consumerprocess.start()

then I see

at start: 666
at producer start: 666
Value('i', 42)

before it blows up in the consumer.  So `state_value` appears to survive 
intact, and mutated as intended, across the producer's life - but gets 
corrupted somehow when it's _also_ passed to the consumer.

Weirder ;-) , if I replace the tail with the ever-more elaborate:

producerprocess = MyProducer(state_value, state_ready)
producerprocess.start()
import time
time.sleep(2)
producerprocess.join()
print(state_value)

state_value.value = 13
producerprocess = MyProducer(state_value, state_ready)
producerprocess.start()
time.sleep(2)
producerprocess.join()
print(state_value)

consumerprocess = MyConsumer(state_value, state_ready)
consumerprocess.start()

then I see

at start: 666
at producer start: 666
Value('i', 42)
at producer start: 13
Value('i', 42)

before it blows up in the consumer - so it survived two full producer 
lifetimes!  So ... generalize ... change the tail to:


for i in range(10):
    state_value.value = i
    producerprocess = MyProducer(state_value, state_ready)
    producerprocess.start()
    producerprocess.join()
    print(state_value)

consumerprocess = MyConsumer(state_value, state_ready)
consumerprocess.start()

and I see everything working fine before it blows up in the consumer:

at start: 666
at producer start: 0
Value('i', 42)
at producer start: 1
Value('i', 42)
at producer start: 2
Value('i', 42)
at producer start: 3
Value('i', 42)
at producer start: 4
Value('i', 42)
at producer start: 5
Value('i', 42)
at producer start: 6
Value('i', 42)
at producer start: 7
Value('i', 42)
at producer start: 8
Value('i', 42)
at producer start: 9
Value('i', 42)

--




[issue41660] multiprocessing.Manager objects lose connection info

2020-08-28 Thread Tim Peters


New submission from Tim Peters :

This started on StackOverflow:

https://stackoverflow.com/questions/63623651/how-to-properly-share-manager-dict-between-processes

Here's a simpler program.

Short course: an object of a subclass of mp.Process has an attribute of 
seemingly any type obtained from an mp.Manager(). The report above used a 
Manager.dict, and the program here a Manager.Value.

When .start() is invoked, the first time that attribute is used in any way that 
requires communication with the Manager server, the program dies. The output 
below is from 3.8.5 on Windows; the report above is some Linux flavor. The 
tracebacks are very similar, the only obvious difference being that the 
implementation complains about a socket on Linux but about a named pipe on 
Windows.

Note that there's no problem passing such things as explicit arguments to 
functions across processes. The loss here appears specific to the inscrutable 
under-the-covers pickle dance needed to create a "self" by magic on worker 
processes.

The code:

from multiprocessing import Process, Manager, Event

class MyProducer(Process):
    def __init__(self, value, event):
        Process.__init__(self)
        self.val = value
        self.event = event

    def run(self):
        print("at producer start:", self.val.value)
        self.val.value = 42
        self.event.set()

class MyConsumer(Process):
    def __init__(self, value, event):
        Process.__init__(self)
        self.val = value
        self.event = event

    def run(self):
        self.event.wait()
        print("in consumer:", self.val.value)

if __name__ == "__main__":
    state_value = Manager().Value('i', 666)
    print("at start:", state_value.value)
    state_ready = Event()
    producerprocess = MyProducer(state_value, state_ready)
    consumerprocess = MyConsumer(state_value, state_ready)
    producerprocess.start()
    consumerprocess.start()

The output:

at start: 666
Process MyProducer-2:
Traceback (most recent call last):
  File 
"C:\Users\Tim\AppData\Local\Programs\Python\Python38\lib\multiprocessing\managers.py",
 line 827, in _callmethod
conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File 
"C:\Users\Tim\AppData\Local\Programs\Python\Python38\lib\multiprocessing\process.py",
 line 315, in _bootstrap
self.run()
  File "C:\Code\temp.py", line 10, in run
print("at producer start:", self.val.value)
  File 
"C:\Users\Tim\AppData\Local\Programs\Python\Python38\lib\multiprocessing\managers.py",
 line 1154, in get
return self._callmethod('get')
  File 
"C:\Users\Tim\AppData\Local\Programs\Python\Python38\lib\multiprocessing\managers.py",
 line 831, in _callmethod
self._connect()
  File 
"C:\Users\Tim\AppData\Local\Programs\Python\Python38\lib\multiprocessing\managers.py",
 line 818, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
  File 
"C:\Users\Tim\AppData\Local\Programs\Python\Python38\lib\multiprocessing\connection.py",
 line 500, in Client
c = PipeClient(address)
  File 
"C:\Users\Tim\AppData\Local\Programs\Python\Python38\lib\multiprocessing\connection.py",
 line 702, in PipeClient
_winapi.WaitNamedPipe(address, 1000)
FileNotFoundError: [WinError 2] The system cannot find the file specified

--
components: Library (Lib)
messages: 376051
nosy: tim.peters
priority: normal
severity: normal
status: open
title: multiprocessing.Manager objects lose connection info
type: behavior
versions: Python 3.8




[issue41513] High accuracy math.hypot()

2020-08-25 Thread Tim Peters


Tim Peters  added the comment:

One more implication: since the quality of the initial square root doesn't 
really much matter, instead of

result = sqrt(to_float(parts))
a, b = split(result)
parts = add_on(-a*a, parts)
parts = add_on(-2.0*a*b, parts)
parts = add_on(-b*b, parts)
x = to_float(parts)
result += x / (2.0 * result)

at the end, it should work just as well (in fact, probably a little better, due 
to getting more beneficial cancellation) to do:

a = parts[0] - 1.0
result = sqrt(a)
x, y = split(result)
result += (a - x*x - 2.0*x*y - y*y + parts[1]) / (2.0 * result)

at the end. Although this crucially relies on doing the last-line chain of 
subtracts and adds "left to right".

--




[issue41513] High accuracy math.hypot()

2020-08-24 Thread Tim Peters


Tim Peters  added the comment:

Do it! It's elegant and practical :-)

--




[issue41513] Scale by power of two in math.hypot()

2020-08-21 Thread Tim Peters


Tim Peters  added the comment:

> won't have a chance to work through it for a week or so

These have been much more in the way of FYI glosses. There's no "suggestion" 
here to be pursued - just trying to get a deeper understanding of code already 
written :-)

While I can concoct any number of cases where the add_on() method fails correct 
rounding, all the ones I dug into turned out to be "oops! went the wrong way" 
in nearest/even tie cases. By any non-anal measure, its accuracy is excellent.

--




[issue41513] Scale by power of two in math.hypot()

2020-08-20 Thread Tim Peters


Tim Peters  added the comment:

My apologies if nobody cares about this ;-) I just think it's helpful if we all 
understand what really helps here.

> Pretty much the whole trick relies on computing
> "sumsq - result**2" to greater than basic machine
> precision.

But why is that necessary at all?  Because we didn't compute the sqrt of the 
sum of the squares to begin with - the input to sqrt was a _rounded_-to-double 
approximation to the sum of squares. The correction step tries to account for 
the sum-of-squares bits sqrt() didn't see to begin with.

So, e.g., instead of doing

result = sqrt(to_float(parts))
a, b = split(result)
parts = add_on(-a*a, parts)
parts = add_on(-b*b, parts)
parts = add_on(-2.0*a*b, parts)
x = to_float(parts)
result += x / (2.0 * result)

it works just as well to do just

result = float((D(parts[0] - 1.0) + D(parts[1])).sqrt())

where D is the default-precision decimal.Decimal constructor. Then sqrt sees 
all the sum-of-squares bits we have (although the "+" rounds away those 
following the 28th decimal digit).

--




[issue41513] Scale by power of two in math.hypot()

2020-08-19 Thread Tim Peters


Tim Peters  added the comment:

Just FYI, if the "differential correction" step seems obscure to anyone, here's 
some insight, following a chain of mathematically equivalent respellings:

result + x / (2 * result) =
result + (sumsq - result**2) / (2 * result) =
result + (sumsq - result**2) / result / 2 =
result + (sumsq / result - result**2 / result) / 2 =
result + (sumsq / result - result) / 2 =
result + sumsq / result / 2 - result / 2 =
result / 2 + sumsq / result / 2 =
(result + sumsq / result) / 2

I hope the last line is an "aha!" for you then:  it's one iteration of Newton's 
square root method, for improving a guess "result" for the square root of 
"sumsq". Which shouldn't be too surprising, since Newton's method is also 
derived from a first-order Taylor expansion around 0.
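
A quick numeric sanity check of that equivalence (the values here are just made
up for illustration):

import math

sumsq = 2.0 + 2.0 ** -30           # pretend this is the computed sum of squares
result = math.sqrt(2.0)            # a deliberately imperfect first guess
x = sumsq - result * result
via_correction = result + x / (2.0 * result)
via_newton = (result + sumsq / result) / 2.0
print(via_correction, via_newton)  # agree, possibly apart from the last bit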

Note an implication: the quality of the initial square root is pretty much 
irrelevant, since a Newton iteration basically doubles the number of "good 
bits".  Pretty much the whole trick relies on computing "sumsq - result**2" to 
greater than basic machine precision.

--




[issue41587] Potential memory leak while using shared memory

2020-08-19 Thread Tim Peters


Tim Peters  added the comment:

There's no evidence of a Python issue here, so I recommend closing this. It's 
not the Python bug tracker's job to try to make sense of platform-specific 
reporting tools, which, as already explained, can display exceedingly confusing 
numbers. We (the Python project) didn't write them, have no control over what 
they display, and have no special insight into them.

Here's a similar simple program that creates a 4 GiB block of shared memory, 
and passes it to 16 processes. I'm running this on Windows 10, on a machine 
with 4 physical cores and 16 GiB of RAM. If the memory weren't actually being 
shared, Windows would have killed the job (there's no "over allocation" allowed 
at all on Windows), because there's nowhere near 4 * 17 = 68 GiB of RAM 
available.

Instead, the program brings the machine to its knees, but because it created 16 
processes each of which is 100% CPU- and memory-bound. The Windows task manager 
detailed view shows less than a MiB of non-shared memory in use by each of the 
worker processes, with the 4 GiB in the "memory in use that can be shared with 
other processes" column for each worker process.  I'm running lots of other 
stuff too, and Windows still says there's over 5 GiB of my 16 GiB of RAM free 
for use.

Windows reporting has its own quirks. For example, the shared memory block 
doesn't show up at all in the workers at first. Instead the amount reported 
steadily increases, as the write loop in each worker process forces the OS to 
materialize shared pages in the worker's virtual address space (that is, 
Windows reports pages actually in use by a process, not virtual address space 
reservations).

from multiprocessing import Process, shared_memory

SHM_SIZE = 2 ** 32 # 4 GiB

def f(name):
    shm = shared_memory.SharedMemory(name=name)
    print(f"shm of size {shm.size:,}")
    buf = shm.buf
    for i in range(shm.size):
        buf[i] = 123
    shm.close()

def main():
    shm = shared_memory.SharedMemory(create=True, size=SHM_SIZE, name='shm')
    ps = []
    for i in range(16):
        p = Process(target=f, args=('shm',))
        p.start()
        ps.append(p)
    for p in ps:
        p.join()
    shm.close()
    shm.unlink()

if __name__ == '__main__':
    main()

--
nosy: +tim.peters




[issue41513] Scale by power of two in math.hypot()

2020-08-19 Thread Tim Peters


Tim Peters  added the comment:

Here's an amusing cautionary tale:  when looking at correct-rounding failures 
for the fsum approach, I was baffled until I realized it was actually the 
_decimal_ method that was failing.  Simplest example I have is 9 instances of 
b=4.999999999999999, which is 1 ulp less than 5. Of course the best-possible 
hypot is 3*b rounded, but that's not what default decimal precision computes 
(when rounded back).  In fact, it bounces between "off by 1" and "correct" 
until decimal precision exceeds 96:

>>> pfailed = []
>>> for p in range(28, 300):
...     with decimal.localcontext() as c:
...         c.prec = p
...         if float(sum([decimal.Decimal(b) ** 2] * 9).sqrt()) != 3*b:
...             pfailed.append(p)
...
>>> pfailed
[28, 29, 30, 31, 34, 36, 37, 39, 41, 42, 43, 44, 45, 57, 77, 96]

In a nutshell, it takes in the ballpark of 50 decimal digits to represent b 
exactly, then twice as many to represent its square exactly, then another to 
avoid losing a digit when summing 9 of those.  So if decimal prec is less than 
about 100, it can lose info.
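
Those digit counts are easy to check:

from decimal import Decimal, localcontext

b = 4.999999999999999                  # the float 1 ulp below 5
d = Decimal(b)                         # exact conversion
print(len(d.as_tuple().digits))        # 51 - about 50 digits to write b exactly
with localcontext() as c:
    c.prec = 300                       # plenty, so the square isn't rounded
    print(len((d * d).as_tuple().digits))   # roughly twice as many for b**2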

So the fsum method is actually more reliable (it doesn't lose any info until 
the fsum step).

--




[issue41513] Scale by power of two in math.hypot()

2020-08-18 Thread Tim Peters


Tim Peters  added the comment:

> That's about 0.50026 ulp too small - shocking ;-)

Actually, that's an illusion due to the limited precision of Decimal.  The 
rounded result is exactly 1/2 ulp away from the infinitely precise result - 
it's a nearest/even tie case.

To make analysis easy, just note that the mathematical hypot(*([x] * 9)) is 
exactly 3*x.  For x = 16 + ulp(16), 3*x moves to a higher binade and has two 
trailing one bits.  The last has to be rounded away, rounding up to leave the 
last retained bit even.  Any source of error, no matter how tiny, can be enough 
so that doesn't happen.
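
The tie is easy to verify with exact rational arithmetic (math.ulp is 3.9+):

import math
from fractions import Fraction

x = 16.0 + math.ulp(16.0)
exact = 3 * Fraction(x)                   # the mathematically exact 3*x
lo = Fraction(48.0 + math.ulp(48.0))      # the two neighboring doubles
hi = Fraction(48.0 + 2.0 * math.ulp(48.0))
print(exact - lo == hi - exact)           # True - a dead-center tie
print(float(exact))                       # 48.000000000000014 (ties to even)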

--




[issue41513] Scale by power of two in math.hypot()

2020-08-18 Thread Tim Peters


Tim Peters  added the comment:

Here's a "correct rounding" fail for the add_on approach:

xs = [16.000000000000004] * 9

decimal result = 48.00000000000001065814103642
which rounds to float 48.000000000000014

add_on result: 48.00000000000001

That's about 0.50026 ulp too small - shocking ;-)

The fsum approach got this one right, but has similar failures on other inputs 
of the same form; i.e.,

[i + ulp(i)] * 9

for various integers `i`.

--

___
Python tracker 
<https://bugs.python.org/issue41513>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41513] Scale by power of two in math.hypot()

2020-08-18 Thread Tim Peters


Tim Peters  added the comment:

About speed, the fsum() version I posted ran about twice as fast as the 
all-Decimal approach, but the add_on() version runs a little slower than 
all-Decimal. I assume that's because fsum() is coded in C while the add_on() 
prototype makes mounds of additional Python-level function calls. Using 
the released Windows 3.8.5 CPython, on argument vectors of length 50.

Is it worth it? Up to you ;-) There are good arguments to be made in favor of 
increased accuracy, and in favor of speed.

About "correctly rounded", such a claim can only be justified by proof, not by 
testing (unless testing exhaustively covers the entire space of inputs). Short 
of that, best you can claim is stuff like "max error < 0.51 ulp" (or whatever 
other bound can be proved).

--

___
Python tracker 
<https://bugs.python.org/issue41513>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38379] finalizer resurrection in gc

2020-08-16 Thread Tim Peters


Tim Peters  added the comment:

I suspect you're reading some specific technical meaning into the word "block" 
that the PR and release note didn't intend by their informal use of the word. 
But I'm unclear on what technical meaning you have in mind.

Before the change, gc "just gave up" after seeing a resurrection, ending the 
then-current cyclic gc run. It that sense, yes, resurrection "blocked" gc from 
making progress. It did not, e.g., "block" the interpreter in the sense of 
deadlock, or of waiting for some lock to be released, or of waiting for a 
network request to respond, or ...

--

___
Python tracker 
<https://bugs.python.org/issue38379>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41513] Scale by power of two in math.hypot()

2020-08-15 Thread Tim Peters


Tim Peters  added the comment:

Oh no - I wouldn't use this as a default implementation. Too expensive. There 
is one aspect you may find especially attractive, though: unlike even the 
Decimal approach, it should be 100% insensitive to argument order (no info is 
lost before fsum() is called, and fsum's output should be unaffected by summand 
order).

--

___
Python tracker 
<https://bugs.python.org/issue41513>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41513] Scale by power of two in math.hypot()

2020-08-15 Thread Tim Peters


Tim Peters  added the comment:

> ...
> one at a time subtract a**2 (an argument squared) in descending
> order of magnitude
> ...

But that doesn't really help unless the sum of squares was computed without 
care to begin with. Can do as well by skipping that but instead computing the 
original sum of squares in increasing order of magnitude.

Cheapest way I know of that "seemingly always" reproduces the Decimal result 
(when that's rounded back to float) combines fsum(), Veltkamp splitting, and 
the correction trick from the paper:

from math import fsum, ldexp, sqrt

def split(x, T27=ldexp(1.0, 27)+1.0):
    t = x * T27
    hi = t - (t - x)
    lo = x - hi
    assert hi + lo == x
    return hi, lo

# result := hypot(*xs)
parts = []
for x in xs:
    a, b = split(x)
    parts.append(a*a)
    parts.append(2.0*a*b)
    parts.append(b*b)
result = sqrt(fsum(parts))
a, b = split(result)
parts.append(-a*a)
parts.append(-2.0*a*b)
parts.append(-b*b)
x = fsum(parts) # best float approx to sum(x_i ** 2) - result**2
result += x / (2.0 * result)

--

___
Python tracker 
<https://bugs.python.org/issue41513>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41513] Scale by power of two in math.hypot()

2020-08-14 Thread Tim Peters


Tim Peters  added the comment:

Cute: for any number of arguments, try computing h**2, then one at a time 
subtract a**2 (an argument squared) in descending order of magnitude.  Call 
that (h**2 - a1**2 - a2**2 - ...) x.

Then

h -= x/(2*h)

That should reduce errors too, although not nearly so effectively, since it's a 
cheap way of estimating (& correcting for) the discrepancy between sum(a**2) 
and h**2.

Note that "descending order" is important: it's trying to cancel as many 
leading bits as possible as early as possible, so that lower-order bits come 
into play.

Then again ... why bother? ;-)

--

___
Python tracker 
<https://bugs.python.org/issue41513>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41458] Avoid overflow/underflow in math.prod()

2020-08-09 Thread Tim Peters


Tim Peters  added the comment:

Or, like I did, they succumbed to an untested "seemingly plausible" illusion ;-)

I generated 1,000 random vectors (in [0.0, 10.0)) of length 100, and for each 
generated 10,000 permutations.  So that's 10 million 100-element products 
overall.  The convert-to-decimal method was 100% insensitive to permutations, 
generating the same product (default decimal prec result rounded to float) for 
each of the 10,000 permutations all 1,000 times.

The distributions of errors for the left-to-right and pairing products were 
truly indistinguishable.  They ranged from -20 to +20 ulp (taking the decimal 
result as being correct).  When I plotted them on the same graph, I thought I 
had made an error, because I couldn't see _any_ difference on a 32-inch 
monitor!  I only saw a single curve.  At each ulp the counts almost always 
rounded to the same pixel on the monitor, so the color of the second curve 
plotted almost utterly overwrote the first curve.

As a sanity check, on the same vectors using the same driver code I compared 
sum() to a pairing sum. Pairing sum was dramatically better, with a much 
tighter error distribution with a much higher peak at the center ("no error"). 
That's what I erroneously expected to see for products too - although, in 
hindsight, I can't imagine why ;-)

--

___
Python tracker 
<https://bugs.python.org/issue41458>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41458] Avoid overflow/underflow in math.prod()

2020-08-09 Thread Tim Peters


Tim Peters  added the comment:

More extensive testing convinces me that pairing multiplication is no real help 
at all - the error distributions appear statistically indistinguishable from 
left-to-right multiplication.

I believe this has to do with the "condition numbers" of fp addition and 
multiplication, which are poor for fp addition and good for fp multiplication.  
Intuitively, fp addition systematically loses mounds of information whenever 
two addends in different binades are added (the lower bits in the addend in the 
binade closer to 0 are entirely lost).  But the accuracy of fp mult couldn't 
care less which binades the inputs are in, provided only the result 
doesn't overflow or become subnormal.

For "random" vectors, pairing summation tends to keep addends close together in 
magnitude, which is "the real" reason it helps.  Left-to-right summation tends 
to make the difference in magnitude increase as it goes along (the running sum 
keeps getting bigger & bigger).

So, red herring.  The one thing that _can_ be done more-or-less 
straightforwardly and cheaply for fp mult is to prevent spurious overflow and 
underflow (including spurious trips into subnormal-land).

--

___
Python tracker 
<https://bugs.python.org/issue41458>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41458] Avoid overflow/underflow in math.prod()

2020-08-08 Thread Tim Peters


Tim Peters  added the comment:

Well, that can't work:  the most likely result for a long input is 0.0 (try 
it!). frexp() forces the mantissa into range [0.5, 1.0).  Multiply N of those, 
and the result _can_ be as small as 2**-N. So, as in Mark's code, every 
thousand times (2**-1000 is nearing the subnormal boundary) frexp() is used 
again to force the product-so-far back into range. That's straightforward when 
going "left to right".

With fancier reduction schemes, "it varies". Aiming for "obviously correct" 
rather than for maximal cleverness ;-) , here I'll associate each partial 
product with an integer e such that it's guaranteed (in the absence of 
infinities, NaNs, zeroes), abs(partial_product) >= 2^^-e. Then quick integer 
arithmetic can be used in advance to guard against a partial product 
underflowing:

def frpair(seq):
    from math import frexp, ldexp
    stack = []
    exp = 0
    for i, x in enumerate(seq, start=1):
        m, e = frexp(x)
        exp += e
        stack += [(m, 1)]
        while not i&1:
            i >>= 1
            (x, e1), (y, e2) = stack[-2:]
            esum = e1 + e2
            if esum >= 1000:
                x, e = frexp(x)
                exp += e
                y, e = frexp(y)
                exp += e
                esum = 2
            stack[-2:] = [(x * y, esum)]
    total = 1.0
    totale = 0
    while stack:
        x, e = stack.pop()
        totale += e
        if totale >= 1000:
            total, e = frexp(total)
            exp += e
            x, e = frexp(x)
            exp += e
            totale = 2
        total *= x
    return ldexp(total, exp)

But I see no obvious improvement in accuracy over "left to right" for the 
uniformly random test cases I've tried.

--

___
Python tracker 
<https://bugs.python.org/issue41458>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41458] Avoid overflow/underflow in math.prod()

2020-08-08 Thread Tim Peters


Tim Peters  added the comment:

"Denormal" and "subnormal" mean the same thing. The former is probably still in 
more common use, but all the relevant standards moved to "subnormal" some years 
ago.

Long chains of floating mults can lose precision too, but hardly anyone bothers 
to do anything about that.  Unlike sums, they're just not common or in critical 
cores.  For example, nobody ever computes the determinant of a 
million-by-million triangular matrix by producting ;-) the diagonal.

I only recall one paper devoted to this, using "doubled precision" tricks like 
fsum employs (but much more expensive unless hardware fused mul-add is 
available):

"Accurate Floating Point Product and Exponentiation"
Stef Graillat

However, that does nothing to prevent spurious overflow or underflow.

Much simpler:  presumably pairwise products can enjoy lower accumulated error 
for essentially the same reasons pairwise summation "works well". Yet, far as I 
know, nobody bothers.

Here's a cute program:

if 1:
    import math
    from decimal import Decimal
    from random import random, shuffle, seed

    def pairwise(xs, lo, hi):
        n = hi - lo
        if n == 1:
            return xs[lo]
        elif n == 2:
            return xs[lo] * xs[lo + 1]
        else:
            mid = (lo + hi) // 2
            return pairwise(xs, lo, mid) * pairwise(xs, mid, hi)

    N = 4000
    print(f"using {N=:,}")
    for s in (153,
              53,
              314,
              271828,
              789):
        print("\nwith seed", s)
        seed(s)
        xs = [random() * 10.0 for i in range(N)]
        xs.extend(1.0 / x for x in xs[:])
        shuffle(xs)
        print("prod   ", math.prod(xs))
        print("product", product(xs)) # the code attached to this report
        print("frexp  ", prod2(xs))   # straightforward frexp
        print("Decimal", float(math.prod(map(Decimal, xs))))
        print("pair   ", pairwise(xs, 0, len(xs)))

By construction, the product of xs should be close to 1.0.  With N=4000 as 
given, out-of-range doesn't happen, and all results are pretty much the same:

using N=4,000

with seed 153
prod    0.9991
product 0.9991
frexp   0.9991
Decimal 1.0042
pair    1.0016

with seed 53
prod    1.0056
product 1.0056
frexp   1.0056
Decimal 1.0002
pair    0.9997

with seed 314
prod    1.0067
product 1.0067
frexp   1.0067
Decimal 1.0082
pair    1.0002

with seed 271828
prod    0.9984
product 0.9984
frexp   0.9984
Decimal 1.0004
pair    1.0064

with seed 789
prod    0.9994
product 0.9994
frexp   0.9994
Decimal 1.0
pair    1.0069

But boost N so that out-of-range is common, and only frexp and Decimal remain 
reliable. Looks almost certain that `product()` has serious bugs:

using N=400,000

with seed 153
prod    0.0
product 1980.1146715391837
frexp   0.969
Decimal 1.027
pair    nan

with seed 53
prod    0.0
product 6.595056534948324e+24
frexp   1.0484
Decimal 1.0513
pair    nan

with seed 314
prod    0.0
product 6.44538471095855e+60
frexp   0.9573
Decimal 0.934
pair    nan

with seed 271828
prod    inf
product 2556126.798990014
frexp   0.9818
Decimal 0.9885
pair    nan

with seed 789
prod    0.0
product 118772.89118349401
frexp   0.9304
Decimal 1.0053
pair    nan

--

___
Python tracker 
<https://bugs.python.org/issue41458>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41458] Avoid overflow/underflow in math.prod()

2020-08-06 Thread Tim Peters


Tim Peters  added the comment:

I may well have misread the code, believing it can still allow spurious 
over/underflows. On second reading of the current file, I don't know - it's 
more complicated than I thought.

If it does guarantee to prevent them, then I shift from -1 to (probably) -0.

--

___
Python tracker 
<https://bugs.python.org/issue41458>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41458] Avoid overflow/underflow in math.prod()

2020-08-06 Thread Tim Peters


Tim Peters  added the comment:

Cool!  So looks like you could also address an accuracy (not out-of-range) 
thing the frexp() method also does as well as possible: loosen the definition 
of "underflow" to include losing bits to subnormal products. For example, with 
the inputs

>>> xs = [1e-200, 1e-121, 1e308]
>>> product(xs)
9.98012604599318e-14
>>> _.hex()
'0x1.c17704f58189cp-44'
>>> math.prod(xs) # same thing
9.98012604599318e-14
>>> prod2(xs) # a tiny implementation of the frexp() method
1e-13
>>> _.hex()
'0x1.c25c268497682p-44'

product/math.prod get only a handful of the leading result bits right, because

>>> 1e-200 * 1e-121
1e-321
>>> _.hex()
'0x0.00000000000cap-1022'

loses all but a handful of the trailing bits due to being in the subnormal 
range.  Changing the order can repair that:

>>> 1e-200 * 1e308 * 1e-121
1e-13

IOW, so far as "underflow" goes, we _start_ losing bits when the product 
becomes subnormal, well before it reaches 0.
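
For concreteness, here's a bare-bones sketch of that frexp() method (my code and names, 
not the actual prod2() used above; it renormalizes on every step instead of only every 
thousand steps or so, as a production version would):

from math import frexp, ldexp

def frexp_prod(xs):
    mant, exp = 1.0, 0
    for x in xs:
        m, e = frexp(x)        # x == m * 2**e, with 0.5 <= abs(m) < 1 for finite nonzero x
        mant *= m
        exp += e
        mant, e = frexp(mant)  # keep the running product away from over/underflow
        exp += e
    return ldexp(mant, exp)    # out of range only if the true result is out of range

print(frexp_prod([1e-200, 1e-121, 1e308]))  # 1e-13 - no trip through subnormal-land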

--

___
Python tracker 
<https://bugs.python.org/issue41458>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41458] Avoid overflow/underflow in math.prod()

2020-08-06 Thread Tim Peters


Tim Peters  added the comment:

See "wisdom" earlier ;-) It's ad hoc trickery that seemingly can't be explained 
without showing the precise implementation in use today. As already mentioned, 
frexp() trickery _can_ be explained: exactly what you'd get if left-to-right HW 
multiplication were performed with an unbounded exponent, over- and 
under-flowing if and only if the infinite precision exponent at the end 
"doesn't fit". It completely eliminates spurious over-/under-flow. If someone 
wants that, it would be suitable for an fprod() function (which, like fsum(), 
makes strong guarantees at real cost).

Precisely which cases does this other thing protect against? Is there any way 
to characterize them beyond "well, try it and see"?

If there's an actual problem here, doesn't it deserve to be fixed? Just making 
it "less common" in some unquantifiable, unexplainable way seems unprincipled, 
and _possibly_ even harmful (e.g., by making it harder to provoke the 
underlying still-there problem without white box adversarial testing).


> Presumably that is why numpy uses pairwise summation instead of
> straight addition for example.

Loss of precision in fp summation is an extremely common real-life problem with 
real-life consequences.  That's why you can find dozens of papers, over the 
decades, on schemes to improve that. Adding a vector of a million floats is 
common; multiplying a vector of even a hundred floats is rare.

Similarly, scaled hypot implementations have been re-invented dozens of times. 
Etc.  When there's "a real" problem in fp life, it's hard _not_ to find mounds 
of prior art trying to address it.

You can find a few pieces on avoiding spurious under-/over-flow when chaining 
multiplies and divides, primarily older papers talking about 32-bit float 
formats with very limited dynamic range. The common advice: rearrange the 
handful of inputs in the source code by hand, based on your application 
knowledge of their likely magnitudes, to make left-to-right evaluation "safe".

--

___
Python tracker 
<https://bugs.python.org/issue41458>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41458] Avoid overflow/underflow in math.prod()

2020-08-06 Thread Tim Peters


Tim Peters  added the comment:

I'm skeptical of the need for - and wisdom of - this. Where does it come up? I 
can't think of any context where this would have been useful, or of any other 
language or package that does something like this. Long chains of mults are 
unusual outside of integer combinatorics to begin with.

Closest I can recall came up in the Spambayes project. There we needed to 
compute the sum of the logs of a potentially large number of probabilities.  
But log (lack of!) speed proved to be a bottleneck, so I changed it to compute 
the log of the product of the probabilities. That product was all but certain 
to underflow to 0.0. But "for real" - no rearrangement of terms would have made 
any difference to that. Instead, akin to what Mark spelled out, every now & 
again frexp() was used to push the product bits back into [0.5, 1.0), and the 
power-of-2 exponent was tracked apart from that.

fsum() is primarily aimed at a very different problem: improving the low-order 
bits.

How to explain what this change to prod() would do? It may or may not stop 
spurious overflows or underflows, and the result depends on the order of the 
multiplicands. But in what way? If we can't/won't define what it does, how can 
other implementations of Python reproduce CPython's result?

While I don't (yet?) see a real need for it, one thing that could work:  
compute the product left to right. Full speed - no checks at all. If the result 
is a 0, a NaN, or an infinity (which is bound to be rare in real life), do it 
again left to right with the frexp() approach. Then it can be explained: the 
result is what you'd get from left-to-right multiplication if the hardware had 
an unbounded exponent field, suffering overflow/underflow at the end only if 
the true exponent has magnitude too large for the hardware.
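
A sketch of that "full speed first, redo only on trouble" idea (mine; frexp_prod() here 
stands for any unbounded-exponent left-to-right product, like the frexp() sketch earlier 
in this thread):

from math import isinf, isnan, prod

def prod_with_fallback(xs):
    xs = list(xs)
    r = prod(xs)
    if r == 0.0 or isinf(r) or isnan(r):
        # Rare in real life; redo left to right with the exponent tracked separately.
        r = frexp_prod(xs)
    return r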

--

___
Python tracker 
<https://bugs.python.org/issue41458>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41421] Random.paretovariate sometimes raises ZeroDivisionError for small alpha

2020-08-01 Thread Tim Peters


Tim Peters  added the comment:

That text is fine, if you feel something needs to be said at all. I really 
don't. A Pareto distribution of this kind with parameter <= 1.0 has infinite 
expected value - VERY long tail. Anyone who knows what they're doing already 
knows that. The reason the implementation can't "blow up" for parameters >= 
(roughly) 0.1 isn't that IEEE doubles have such a large dynamic range but 
rather that they can't represent a number < 1.0 larger than 1 - 2**-53 (so u = 
1 - random.random() is always at least 2**-53). The actual distribution has 
infinite expected value nonetheless, but the implementation is incapable of 
generating any of its very large values (which, while very rare, are also very 
large).

--

___
Python tracker 
<https://bugs.python.org/issue41421>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41131] Augment random.choices() with the alias method

2020-07-30 Thread Tim Peters


Tim Peters  added the comment:

Oh yes - I understood the intent of the code.  It's as good an approach to 
living with floating-point slop in this context as I've seen.  It's not 
obviously broken.  But neither is it obviously correct, and after a few minutes 
I didn't see a clear path to rigorously proving it can't break.  In FP, if 
there's one chance in 2**53 that it _can_ break, it will ;-)

The suggestions I made change none of that:  reducing the number of rounding 
errors can only help.

OTOH, an all-integer version of the algorithm _can_ be "obviously correct", and 
can embed asserts to verify that it is.  It can all be done in a single 
fixed-size list of 3-lists ([weight, value, alias]), swapping entries as needed 
to keep the 3 regions of interest ("too small", "on the nose", "too large") 
each contiguous.  Then it's straightforward to prove that you run out of "too 
small" entries at the same time you run out of "too large" ones.
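
For illustration, a from-scratch sketch of that all-integer preprocessing (my names, not 
the class mentioned above; it assumes positive integer weights and Python 3.9+ for 
math.lcm):

from math import lcm
from random import randrange

def build_alias(weights):
    n = len(weights)
    total = sum(weights)
    m = lcm(total, n)
    scaled = [w * (m // total) for w in weights]  # sum(scaled) == n * mean exactly
    mean = m // n
    alias = list(range(n))
    small = [i for i, w in enumerate(scaled) if w < mean]
    large = [i for i, w in enumerate(scaled) if w > mean]
    while small:
        s = small.pop()
        l = large.pop()
        alias[s] = l                   # slot s redirects its unused share to l
        scaled[l] -= mean - scaled[s]  # exact integer arithmetic - nothing can drift
        if scaled[l] < mean:
            small.append(l)
        elif scaled[l] > mean:
            large.append(l)
    assert not large                   # "too large" runs dry exactly when "too small" does
    return scaled, alias, mean

def choose(values, scaled, alias, mean):
    i = randrange(len(values))
    return values[i] if randrange(mean) < scaled[i] else values[alias[i]]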

--

___
Python tracker 
<https://bugs.python.org/issue41131>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41131] Augment random.choices() with the alias method

2020-07-30 Thread Tim Peters


Tim Peters  added the comment:

I'm not clear on that the alias method is valuable here. Because of the 
preprocessing expense, it cries out for a class instead: an object that can 
retain the preprocessed tables, be saved to disk (pickled), restored later, and 
used repeatedly to make O(1) selections. I have such a class, which uses exact 
integer arithmetic (e.g., during preprocessing, there's at least one "too 
small" bin if and only if there's at least one "too large" bin), and is as 
unbiased as choice() and randrange(). Reasoning about float vagaries is much 
harder, and I see no error analysis here.

Most troubling: that due to rounding, reducing a "too large" bin results in a 
weight that spuriously compares <= bin_size. Then a "too small" bin with a 
genuinely tiny weight can be left with nothing to pair with, and so ends up 
aliasing itself, giving it a massively too-high probability of being picked.

So if you're determined to stick to "a function", I'd stick to the simple 
inverse CDF sampling method.  Unless the number of choices is "very" large, the 
log factor doesn't much matter.

That said, the alias numerics here could be improved some by working with the 
weights directly, "normalizing" them only at the end.  Instead of comparing 
weights to bin_size, compare them instead to the mean:

mean = total / n

(note: the next step to getting rid of floats entirely is to pre-multiply the 
weights by lcm(total, n) // total  - then their mean is exactly the integer 
lcm(total, n) // n).

The mean (instead of bin_size) suffers rounding errors then, but the original 
weights don't during the "hard part" of preprocessing.  At the end,

fn = n + 0.0
U = [weights[i] / total + i / fn for i in range(n)]

instead.  The "i / fn" also suffers just one rounding error as opposed to the 
two in the current "bin_width * i".  That can make a difference for n as small 
as 5:

>>> 3/5
0.6
>>> (1/5)*3
0.6000000000000001

--

___
Python tracker 
<https://bugs.python.org/issue41131>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41421] Random.paretovariate sometimes raises ZeroDivisionError for small alpha

2020-07-28 Thread Tim Peters


Tim Peters  added the comment:

BTW, if we have to "do something", how about changing

return 1.0 / u ** (1.0/alpha)

to the mathematically equivalent

return (1.0 / u) ** (1.0/alpha)

? Not sure about Linux-y boxes, but on Windows that would raise OverflowError 
instead of ZeroDivisionError. Which makes more sense, because the Pareto 
variable we're _trying_ to produce is too large to represent.  If it returns 
math.inf on some box instead, same thing.

Or, also faster, and suffering fewer rounding errors,

return  u ** (-1.0 / alpha)

--

___
Python tracker 
<https://bugs.python.org/issue41421>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41421] Random.paretovariate sometimes raises ZeroDivisionError for small alpha

2020-07-28 Thread Tim Peters


Tim Peters  added the comment:

I'm inclined to ignore this. No actual user has complained about this, and I 
doubt any ever will:  it's got to be rare as hen's teeth to use a parameter 
outside of, say, [0.1, 10.0], in real life. The division error can't happen for 
those. For "small" alpha, the distribution simply does contain values too large 
to represent as binary doubles. So the only other defensible thing to do be 
done in this case is to return math.inf (or raise OverflowError) - but why slow 
it down for something no actual user cares about?

--

___
Python tracker 
<https://bugs.python.org/issue41421>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41389] Garbage Collector Ignoring Some (Not All) Circular References of Identical Type

2020-07-27 Thread Tim Peters


Tim Peters  added the comment:

Well, this isn't a help desk ;-) You may want instead to detail your problem 
on, say, StackOverflow, or the general Python mailing list.

Please note that I don't know what your "problem" _is_:  you haven't said. You 
posted some numbers that didn't make sense to you, and made unwarranted 
extrapolations from those (for example, no, those numbers won't get worse if 
you let the program run a million times longer).

So you should spell out what the "real" problem is. This shows signs of being 
an "XY problem":

http://xyproblem.info/

For example, now you say:

> Calling gc.collect() on regular intervals doesn't seem
> to work consistently

That's news to me. The code you posted shows quite different behavior when 
FORCE_GC is set.

But if it's true that calling gc.collect() regularly doesn't alleviate "the 
real problem" (whatever that may be!), then that shows the opposite of what you 
appear to be assuming: that Python's cyclic gc is the root of the cause. 
collect() _will_ reclaim every scrap of RAM that's actually trash at the time 
it's called. So if calling that doesn't help, the problem is almost certainly 
NOT that trash isn't getting reclaimed. Something else is the cause.

Examples: it's not actually trash. It is, and gc collects it, but is unable to 
return it to the C library from which the memory came. It is returned to the C 
library, but that in turn is unable to return the memory to the OS. It is 
returned to the OS, but the OS decides to leave its virtual address space 
mapped to the process for now.

Details not only matter, they _can_ be everything when dealing with the 
multiple layers of memory management on modern machines. Waiting for a "general 
insight" is probably futile here :-(

--

___
Python tracker 
<https://bugs.python.org/issue41389>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41389] Garbage Collector Ignoring Some (Not All) Circular References of Identical Type

2020-07-27 Thread Tim Peters


Tim Peters  added the comment:

It's impossible for any implementation to know that cyclic trash _is_ trash 
without, in some way, traversing the object graph. This is expensive, so 
CPython (or any other language) does not incur that expense after every single 
decref that leaves a non-zero refcount (the one and only cheap clue that cyclic 
trash _may_ have just been created).

If you want/need synchronous behavior, avoid cycles. CPython's refcounting does 
dispose of trash the instant an object (not involved in a cycle) becomes trash. 
That behavior cannot be extended to cyclic trash short of (as above) running a 
cyclic gc pass extremely frequently.
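
A tiny demonstration of the difference (mine, just to make it concrete):

import gc

class C:
    def __del__(self):
        print("reclaimed")

c = C()
del c          # no cycle: refcounting reclaims it immediately - "reclaimed" is printed

c = C()
c.me = c       # a self-cycle
del c          # nothing printed yet - only cyclic gc can discover this trash
gc.collect()   # now "reclaimed" is printed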

I don't know of any language that guarantees all garbage will be collected 
"right away". Do you?  CPython does much more in that respect (due to primarily 
relying on refcounting) than most.

--

___
Python tracker 
<https://bugs.python.org/issue41389>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41389] Garbage Collector Ignoring Some (Not All) Circular References of Identical Type

2020-07-24 Thread Tim Peters


Tim Peters  added the comment:

What makes you think that? Your own output shows that the number of "Active" 
objects does NOT monotonically increase across output lines. It goes up 
sometimes, and down sometimes.  Whether it goes up or down is entirely due to 
accidents of when your monitoring thread happens to wake up during the lifetime 
of the program's gc history.

I boosted the loop count to 10 million on my box just now. It had no 
significant effect on peak memory use. At the end:

>>> A.COUNT, A.TOTAL, B.COUNT, B.TOTAL
(298, 1000, 298, 1000)
>>> gc.collect()
1188
>>> A.COUNT, A.TOTAL, B.COUNT, B.TOTAL
(1, 1000, 1, 1000)

There is no leak. An A and B object survive collect() because the last A object 
created remains bound to the variable `a` used in the loop (so is still 
reachable).

So I thank you for creating a nice test program, but I'm closing this since it 
doesn't demonstrate a real problem.

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue41389>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41389] Garbage Collector Ignoring Some (Not All) Circular References of Identical Type

2020-07-24 Thread Tim Peters


Tim Peters  added the comment:

I see no evidence of a bug here. To the contrary, the output proves that 
__del__ methods are getting called all along. And if garbage weren't being 
collected, after allocating a million objects each with its own megabyte string 
object, memory use at the end would be a terabyte, not a comparatively measly 
;-) gigabyte.

Note that Python's cyclic gc is NOT asynchronous. It only runs when you call it 
directly, or when an internal count of allocations exceeds an internal count of 
deallocations. When your loop ends, your output shows that 940 A and B objects 
remain to be collected, spread across some number of the gc's "generations". 
That's where your gigabyte lives (about a thousand A objects each with its own 
megabyte of string data). It will remain in use until gc is forced to run 
again. But 99.9% of the A objects have already been collected.

--
nosy: +tim.peters

___
Python tracker 
<https://bugs.python.org/issue41389>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41311] Add a function to get a random sample from an iterable (reservoir sampling)

2020-07-18 Thread Tim Peters


Tim Peters  added the comment:

The lack of exactness (and possibility of platform-dependent results, 
including, e.g., when a single platform changes its math libraries) certainly 
works against it.

But I think Raymond is more bothered by that there's no apparently _compelling_ 
use case, in the sense of something frequent enough in real life to warrant 
including it in the standard library.

For example, there's really no problem right now if you have a giant iterable 
_and_ you know its length.  I had a case where I had to sample a few hundred 
doubles from giant binary files of floats.  The "obvious" solution worked great:

for o in sorted(random.sample(range(0, file_size, 8), 1000)):
    seek to offset o and read up 8 bytes

Now random access to that kind of iterable is doable, but a similar approach 
works fine too if it's sequential-only access to a one-shot iterator of known 
length:  pick the indices in advance, and skip over the iterator until each 
index is hit in turn.
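
A sketch of that known-length approach (mine; the helper name is invented), which returns 
the picks in iteration order:

import random

def sample_known_length(it, n, k):
    wanted = sorted(random.sample(range(n), k))
    picked, j = [], 0
    for i, x in enumerate(it):
        if j < k and i == wanted[j]:
            picked.append(x)
            j += 1
            if j == k:
                break
    return picked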

It doesn't score much points for being faster than materializing a set or dict 
into a sequence first, since that's a micro-optimization justified only by 
current CPython implementation accidents.

Where it's absolutely needed is when there's a possibly-giant iterable of 
unknown length. Unlike Raymond, I think that's possibly worth addressing (it's 
not hard to find people asking about it on the web). But it's not a problem 
I've had in real life, so, ya, it's hard to act enthusiastic ;-)

PS: you should also guard against W > 1.0. No high-quality math library will 
create such a thing given these inputs, but testing only for exact equality to 
1.0 is no cheaper and so needlessly optimistic.

--

___
Python tracker 
<https://bugs.python.org/issue41311>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41311] Add a function to get a random sample from an iterable (reservoir sampling)

2020-07-17 Thread Tim Peters


Tim Peters  added the comment:

Julia's randsubseq() doesn't allow to specify the _size_ of the output desired. 
It picks each input element independently with probability p, and the output 
can be of any size from 0 through the input's size (with mean output length 
p*length(A)). Reservoir sampling is simply irrelevant to that, although they 
almost certainly use a form of skip-generation internally.

The quoted docs don't make much sense: for any given p, O(p*N) = O(N). I'm 
guessing they're trying to say that the mean of the number of times the RNG is 
called is p*N.

--

___
Python tracker 
<https://bugs.python.org/issue41311>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41311] Add a function to get a random sample from an iterable (reservoir sampling)

2020-07-16 Thread Tim Peters


Tim Peters  added the comment:

Thanks! That explanation really helps explain where "geometric distribution" 
comes from. Although why it keeps taking k'th roots remains a mystery to me ;-)

Speaking of which, the two instances of

exp(log(random())/k)

are numerically suspect. Better written as

random()**(1/k)

The underlying `pow()` implementation will, in effect, compute log(random()) 
with extra bits of precision for internal use. Doing log(random()) forces it to 
use a 53-bit approximation. Not to mention that it's more _obvious_ to write a 
k'th root as a k'th root.  Note: then the 1/k can be computed once outside the 
loop.

Perhaps worse is

log(1-W)

which should be written

log1p(-W)

instead.  W is between 0 and 1, and the closer it is to 0 the more its trailing 
bits are entirely lost in computing 1-W. It's the purpose of log1p to combat 
this very problem.
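
A quick illustration (mine) of how much that matters:

from math import log, log1p

W = 1e-12
print(log(1.0 - W))  # 1.0 - W already discarded most of W's bits; only the first
                     # 4-5 significant digits of this result are right
print(log1p(-W))     # about -1.0000000000000005e-12 - good to the last bit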

--

___
Python tracker 
<https://bugs.python.org/issue41311>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41311] Add a function to get a random sample from an iterable (reservoir sampling)

2020-07-16 Thread Tim Peters


Tim Peters  added the comment:

Pro: focus on the "iterable" part of the title. If you want to, e.g., select 3 
lines "at random" out of a multi-million-line text file, this kind of reservoir 
sampling allows to do that holding no more than one line in memory 
simultaneously. Materializing an iterable in a sequence is sometimes 
impossible. Yup, it takes O(number of elements) time, but so does anything else 
that has to look at every element of an iterable (including materializing an 
iterable in a sequence!).

So this would make most sense as a different function.

Con: over the years we've worked diligently to purge these algorithms of minor 
biases, leaving them as unbiased as the core generator. Introducing 
floating-point vagaries goes against that, and results may vary at times due to 
tiny differences in platform exp() and log() implementations.

Worse, I'm not sure that the _approach_ is beyond reproach: the claim that 
skips "follow a geometric distribution" is baffling to me. If the reservoir has 
size K, then when we're looking at the N'th element it should be skipped with 
probability 1-K/N. But when looking at the N+1'th, that changes to 1-K/(N+1). 
Then 1-K/(N+2), and so on. These probabilities are all different. In an actual 
geometric distribution, the skip probability is a constant.

Now 1-K/(N+i) is approximately equal to 1-K/(N+j) for i and j sufficiently 
close, and for K sufficiently smaller than N, so the skips may be well 
_approximated_ by a geometric distribution. But that's quite different than 
saying they _do_ follow a geometric distribution, and I see no reason to trust 
that the difference can't matter.

The simple algorithm on the Wikipedia page suffers none of these potential 
problems (it implements an obviously-sound approach, & our `randrange()` is 
unbiased and platform-independent).  But, for selecting 3 out of a 
million-element iterable, would invoke the core generator hundreds of thousands 
of times more often.
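
For reference, that simple algorithm ("Algorithm R") is just this (my sketch of it, not 
proposed code):

from random import randrange

def reservoir_sample(iterable, k):
    reservoir = []
    for i, x in enumerate(iterable, start=1):
        if i <= k:
            reservoir.append(x)   # keep the first k unconditionally
        else:
            j = randrange(i)      # chance k/i of landing below k
            if j < k:
                reservoir[j] = x  # replace a uniformly random slot
    return reservoir

The cost is the one randrange() call per input element - which is where "hundreds of 
thousands of times more often" comes from.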

--
nosy: +mark.dickinson, tim.peters

___
Python tracker 
<https://bugs.python.org/issue41311>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41310] micro-optimization: increase our float parsing speed by Nx

2020-07-16 Thread Tim Peters


Tim Peters  added the comment:

Gregory, care to take their code and time it against Python?

I'm not inclined to: reading the comments in the code, they're trying "fast 
paths" already described in papers by Clinger and - later - by Gay.  When those 
fast paths don't apply, they fall back on the system `strtod`. But Python's 
current implementation builds on Gay's fastest code (and never falls back to 
the system).

It's possible they "improved" on Gay by building relatively giant tables of 
static data.

--

___
Python tracker 
<https://bugs.python.org/issue41310>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41205] Documentation Decimal power 0 to the 0 is Nan (versus 0 to the 0 which is 1)

2020-07-03 Thread Tim Peters


Tim Peters  added the comment:

Huh! I thought everyone in Standards World gave up by now, and agreed 0**0 
should be 1.

--
nosy: +tim.peters

___
Python tracker 
<https://bugs.python.org/issue41205>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41198] Round built-in function not shows zeros acording significant figures and calculates different numbers of odd and even

2020-07-03 Thread Tim Peters


Tim Peters  added the comment:

Thanks, Mark! I didn't even know __round__ had become a dunder method.

For the rest, I'll follow StackOverflow - I don't have an instant answer, and 
the instant answers I _had_ didn't survive second thoughts ;-)

--

___
Python tracker 
<https://bugs.python.org/issue41198>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41198] Round built-in function not shows zeros acording significant figures and calculates different numbers of odd and even

2020-07-02 Thread Tim Peters


Tim Peters  added the comment:

Cool! So the only thing surprising to me here is just how far off balance the 
arange() run was.  So I'd like to keep this open long enough for Mark to 
notice, just in case it's pointing to something fishy in numpy.

--

___
Python tracker 
<https://bugs.python.org/issue41198>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41198] Round built-in function not shows zeros acording significant figures and calculates different numbers of odd and even

2020-07-02 Thread Tim Peters


Tim Peters  added the comment:

I assumed Mark would tell us what's up with the arange() oddity, so let's see 
whether he does.  There is no truly good way to generate "evenly spaced" binary 
floats using a non-representable conceptual decimal delta.  The dumbass ;-) way 
doesn't show a discrepancy in pure Python:

>>> num = ne = no = 0
>>> d = 0.001
>>> while num < 1.0:
...     digit = int(round(num, 1) * 10)
...     if digit & 1:
...         no += 1
...     else:
...         ne += 1
...     num += d
>>> ne, no
(500, 500)

However, a somewhat less naive way does show a discrepancy, but less so than 
what arange() apparently does:

>>> ne = no = 0
>>> for i in range(1000):
...     digit = int(round(i * d, 1) * 10)
...     if digit & 1:
...         no += 1
...     else:
...         ne += 1
>>> ne, no
(501, 499)

I assume that's because of the specific nearest/even behavior I already showed 
for multipliers i=250 and i=750.

--

___
Python tracker 
<https://bugs.python.org/issue41198>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41198] Round built-in function not shows zeros acording significant figures and calculates different numbers of odd and even

2020-07-02 Thread Tim Peters


Tim Peters  added the comment:

For the first, your hardware's binary floating-point has no concept of 
significant trailing zeroes. If you need such a thing, use Python's `decimal` 
module instead, which does support a "significant trailing zero" concept. You 
would need an entirely new data type to graft such a notion onto Python's (or 
numpy's!) binary floats.
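
For example (decimal, unlike binary float, keeps track of trailing zeroes):

from decimal import Decimal

print(Decimal("0.100") * 3)  # 0.300 - the trailing zero is significant and preserved
print(0.100 * 3)             # 0.30000000000000004 - binary float has no such notion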

For the second, we'd have to dig into exactly what numpy's `arange()` does. 
Very few of the numbers you're working with are exactly representable in binary 
floating point except for 0.0. For example, "0.001" is approximated by a binary 
float whose exact decimal value is

0.001000000000000000020816681711721685132943093776702880859375

Sometimes the rounded (by machine float arithmetic) multiples of that are 
exactly representable, but usually not. For example,

>>> 0.001 * 250
0.25

rounds to the exactly representable 1/4, and

>>> 0.001 * 750
0.75

to the exactly representable 3/4. However, `round()` uses 
round-to-nearest/even, and then

>>> round(0.25, 1)
0.2
>>> round(0.75, 1)
0.8

both resolve the tie to the closest even value (although neither of those 
_results_ are exactly representable in binary floating-point - although if you 
go on to multiply them by 10.0, they do round (in hardware) to exactly 2.0 and 
8.0).

Note that numpy's arange() docs do warn you against using it ;-)

"""
When using a non-integer step, such as 0.1, the results will often not be 
consistent. It is better to use numpy.linspace for these cases.
"""

--
nosy: +tim.peters

___
Python tracker 
<https://bugs.python.org/issue41198>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41133] Insufficient description of cyclic garbage collector for C API

2020-06-29 Thread Tim Peters


Tim Peters  added the comment:

I don't see real value in the docs noting that Bad Things can happen if code 
lies about true refcounts. If a container points to an object, _of course_ the 
container should own that reference. Cheating on that isn't intended to be 
supported in any way, so there's no obligation to explain how or why things can 
go wrong otherwise. Worse, trying to explain such things would constrain 
implementations to have the same specific kinds of failure modes forever after. 
Neither would there would be real value in repeating "DON'T LIE ABOUT 
REFCOUNTS!" on every page, either ;-)

If someone just wants to know more about CPython's cyclic collector, that's 
fine, but that takes a great many more words than are suitable in the C API 
docs. Luckily, Pablo recently did that:

https://devguide.python.org/garbage_collector/

--
nosy: +tim.peters

___
Python tracker 
<https://bugs.python.org/issue41133>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41071] from an int to a float , why

2020-06-22 Thread Tim Peters


Tim Peters  added the comment:

Mike, read that exchange again. You originally wrote

"print(2 / 2) gives 2.0 instead of 2"

but you didn't _mean_ that. You meant to say it "gives 1.0 instead of 1", or 
you meant something other than "2 / 2").  In Python 3,

>>> print(2 / 2)
1.0

Which is what Serhiy said it does.

For the rest, read the PEP again after you calm down. In particular,

"Classic division will remain the default in the Python 2.x series; true 
division will be standard in Python 3.0."

Also all true.

--

___
Python tracker 
<https://bugs.python.org/issue41071>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41071] from an int to a float , why

2020-06-22 Thread Tim Peters


Tim Peters  added the comment:

Read the PEP Serhiy already linked to:

https://www.python.org/dev/peps/pep-0238/

This was a deliberate change to how "integer / integer" works, introduced with 
Python 3.

--
nosy: +tim.peters
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue41071>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40879] Strange regex cycle

2020-06-06 Thread Tim Peters


Tim Peters  added the comment:

Closing this as "Won't Fix", since the possibility of exponential-time behavior 
with naively written nested quantifiers is well known, and there are no plans 
to "do something" about that.

--
resolution:  -> wont fix
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue40879>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40879] Strange regex cycle

2020-06-05 Thread Tim Peters


Tim Peters  added the comment:

Note that the relatively tiny pattern here extracts just a small piece of the 
regexp in question. As the output shows, increase the length of a string it 
fails to match by one character, and the time taken to fail approximately 
doubles: exponential-time behavior.

>>> import re
>>> c = re.compile(r'(?:[^\s()<>]+)+x')
>>> from time import perf_counter as now
>>> size = 1
>>> while True:
...     s = 'a' * size
...     start = now()
...     junk = c.search(s)
...     finish = now()
...     print(size, finish - start)
...     size += 1

1 9.90009110954e-06
2 1.180009998257e-05
3 1.120017197453e-05
4 1.209992187804e-05
5 1.520007098424e-05
6 1.569977414336e-05
7 2.11999194988e-05
8 3.3900053602e-05
9 4.959982774534e-05
10 7.77998547282e-05
11 0.0001481001830304
12 0.000340170905
13 0.000634899943317
14 0.001219100143504
15 0.00248249985066
16 0.00469410023127
17 0.00934219991863
18 0.0195416999067
19 0.0388015000526
20 0.076214100156
21 0.147214899945
22 0.2777167001336
23 0.649172200095
24 1.35531179
25 2.22982969982
26 4.98656629993
27 9.56792559995
28 19.0918107999
29 42.363349
30 83.5749305999
31 158.8824948998
...

The machine was far from quiet while running this, but it doesn't matter: the 
_point_ is dead obvious regardless.

--

___
Python tracker 
<https://bugs.python.org/issue40879>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40879] Strange regex cycle

2020-06-05 Thread Tim Peters


Tim Peters  added the comment:

The repr truncates the pattern string, for display, if it's "too long". The 
only visual clue about that, though, is that the display is missing the pattern 
string's closing quote, as in the output you showed here. If you look at 
url_pat.pattern, though, you'll see that nothing has been lost.

I'm not sure why it does that.  As I vaguely recall, some years ago there was a 
crusade to limit maximum repr sizes because long output was considered to be "a 
security issue" (e.g., DoS attacks vis tricking logging/auditing facilities 
into writing giant strings when recording reprs).

In any case, that's all there is to that part.

For the rest, it's exceedingly unlikely that there's actually an infinite loop. 
Instead there's a messy regexp with multiple nested quantifiers, which are 
notorious for exhibiting exponential-time behavior and especially in 
non-matching cases. They can be rewritten to have linear-time behavior instead, 
but it's an effort I personally have no interest in pursuing here. See Jeffrey 
Friedl's "Mastering Regular Expressions" book for detailed explanations.

The reason I have no interest: it's almost always a losing idea to try to parse 
any aspect of HTML with regexps. Use an HTML parser instead (or for URLs 
specifically, see urllib.parse).
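
For example:

from urllib.parse import urlparse

print(urlparse("https://bugs.python.org/issue40879?a=1#notes"))
# ParseResult(scheme='https', netloc='bugs.python.org', path='/issue40879',
#             params='', query='a=1', fragment='notes')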

--
nosy: +tim.peters

___
Python tracker 
<https://bugs.python.org/issue40879>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Tim Peters


Tim Peters  added the comment:

`functools` is clearly a poor place for this. `imath` would also be. 
`graph_stuff_probably_limited_to_a_topsort` is the only accurate name ;-) 
Off-the-wall possibilities include `misclib` (stuff that just doesn't fit 
anywhere else - yet) and `cslib` (Computer Science-y stuff).

Whichever name wins, we should probably look to ensure the name isn't also of a 
package already in (non-trivial) use.

--

___
Python tracker 
<https://bugs.python.org/issue17005>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29882] Add an efficient popcount method for integers

2020-05-25 Thread Tim Peters


Tim Peters  added the comment:

I see I never explicitly said +1, so I will now: +1 on merging this :-)

--

___
Python tracker 
<https://bugs.python.org/issue29882>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40480] "fnmatch" exponential execution time

2020-05-12 Thread Tim Peters


Tim Peters  added the comment:

Ned, I'm happy to do this. While the ability to join wasn't documented, it's 
not an unreasonable expectation. I'm not sure it's possible to fail _unless_ 
the regexps use named groups (and/or numbered backreferences) - and nobody in 
their right mind would expect regexps for such simple patterns to do such a 
thing ;-)

So chances seem decent the regression your user stumbled into wouldn't be the 
only one to pop up.  The fix was actually quite easy (and thanks to Anthony for 
nudging me in that direction!).  The annoying part was writing a test given 
that the precise group names generated are no longer predictable :-(

--

___
Python tracker 
<https://bugs.python.org/issue40480>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40480] "fnmatch" exponential execution time

2020-05-11 Thread Tim Peters


Tim Peters  added the comment:


New changeset b1b4c790e7d3b5f4244450aefe3d8f01710c13f7 by Tim Peters in branch 
'master':
bpo-40480: restore ability to join fnmatch.translate() results (GH-20049)
https://github.com/python/cpython/commit/b1b4c790e7d3b5f4244450aefe3d8f01710c13f7


--

___
Python tracker 
<https://bugs.python.org/issue40480>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40480] "fnmatch" exponential execution time

2020-05-11 Thread Tim Peters


Change by Tim Peters :


--
pull_requests: +19358
pull_request: https://github.com/python/cpython/pull/20049

___
Python tracker 
<https://bugs.python.org/issue40480>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40480] "fnmatch" exponential execution time

2020-05-11 Thread Tim Peters


Tim Peters  added the comment:

I don't want something probabilistic.  Fix it or don't ;-)

One thing that would work, but at the cost of non-determinism:  do the same as 
now, but obtain the number part of the group name by applying next() to a 
module-global private instance of itertools.count().  That will keep the 
numbers increasing "forever", and across calls.  The point to using .count() is 
that it's atomic (i.e., won't repeat a number if multiple threads happen to be 
constructing regexps simultaneously).
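
That would look something like this (illustration only - not what translate() actually 
does today, and the names are made up):

import itertools

_group_id = itertools.count()      # module-global, private; next() on it is atomic

def _new_group_name():
    return f"g{next(_group_id)}"   # 'g0', 'g1', ... never reused, even across threads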

It's a darned silly amount of effort, though ;-)

--

___
Python tracker 
<https://bugs.python.org/issue40480>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40480] "fnmatch" exponential execution time

2020-05-11 Thread Tim Peters


Tim Peters  added the comment:

Ned, would it be possible to rewrite code of the form:

if giant pasted regexp matches:

to:

if any(p matches for p in patterns):

That should work under any version of Python.

There's no guarantee that regexps _can_ be pasted together and still work, so I 
can't call this change "a bug".  That pasting regexps together "worked" before 
was an implementation accident.

I'd be happy to change it anyway, except I know of no way to use Python's re 
engine without backreferences that can avoid exponential-time behavior in some 
cases.  In some other regexp engines, yes (e.g., as the code comments note, in 
those that support "atomic grouping"), but not in Python's.  Nor does Python's 
re engine support reusing backreference names or numbers.

So I know of no way to restore the ability to paste regexps together that 
wouldn't reintroduce the possiblity of exponential time failure :-(

--

___
Python tracker 
<https://bugs.python.org/issue40480>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40539] Docs - difflib.SequenceMatcher quick_ratio and real_quick_ratio improved docs

2020-05-06 Thread Tim Peters


Tim Peters  added the comment:

Thanks for the effort, but I'm rejecting this.  The language deliberately 
defines nothing about how these are calculated.  It defines how `.ratio()` is 
computed, but that's all.  An implementation is free to do whatever it likes 
for the "quick" versions, provided only they return upper bounds on `.ratio()`. 
 Indeed, it's perfectly fine if an implementation merely returns 1.0 for both, 
regardless of the arguments.

If an implementation is cleverer than that, great, that's fine too - but it 
would be actively counterproductive to constrain them to be no _more_ clever 
than the current implementations.

--
nosy: +tim.peters
resolution:  -> rejected
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue40539>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40480] "fnmatch" exponential execution time

2020-05-05 Thread Tim Peters


Change by Tim Peters :


--
assignee:  -> tim.peters
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue40480>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40480] "fnmatch" exponential execution time

2020-05-05 Thread Tim Peters


Tim Peters  added the comment:


New changeset b9c46a2c2d7fc68457bff641f78932d66f5e5f59 by Tim Peters in branch 
'master':
bpo-40480 "fnmatch" exponential execution time (GH-19908)
https://github.com/python/cpython/commit/b9c46a2c2d7fc68457bff641f78932d66f5e5f59


--

___
Python tracker 
<https://bugs.python.org/issue40480>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40480] "fnmatch" exponential execution time

2020-05-04 Thread Tim Peters


Change by Tim Peters :


--
keywords: +patch
pull_requests: +19223
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/19908

___
Python tracker 
<https://bugs.python.org/issue40480>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40480] "fnmatch" exponential execution time

2020-05-04 Thread Tim Peters


Tim Peters  added the comment:

Changed version to 3.9, because anything done would change the regexp 
generated, and `fnmatch.translate()` makes that regexp visible.

--
stage:  -> needs patch
versions: +Python 3.9 -Python 3.8

___
Python tracker 
<https://bugs.python.org/issue40480>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40480] "fnmatch" exponential execution time

2020-05-03 Thread Tim Peters


Tim Peters  added the comment:

"The trick" with this kind of pattern is that a `*` should consume as little as 
possible to allow the next fixed portion to match.  Apply that at every stage, 
and it succeeds or it doesn't.  Backtracking can't change that outcome.  For, 
e.g., the shell pattern:

a*bc*d*

a regexp to do this without backtracking would be like this, IF Python's re 
engine supported "atomic groups" (but it doesn't):

a(?>.*?bc)(?>.*?d).*\Z

The same effect can be gotten in a more convoluted way, via positive lookahead 
assertions and backreferencing (which Python's re engine does support):

a(?=(.*?bc))\1(?=(.*?d))\2.*\Z

Assertions are "one and done":  if the overall match fails, assertions don't 
try alternatives.
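
To make that concrete, here's a small sketch (the test strings are made up):
each assertion captures the shortest prefix reaching the next literal chunk,
and the backreference then consumes exactly that text, so no backtracking
can occur.

    import re

    # Emulation of the shell pattern "a*bc*d*" without atomic groups.
    pat = re.compile(r"(?s)a(?=(.*?bc))\1(?=(.*?d))\2.*\Z")

    print(bool(pat.match("aXXbcYYdZZ")))   # True
    print(bool(pat.match("aXXbcYY")))      # False - no "d" after the "bc"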

So that's _a_ way to proceed.  I'm unclear on whether it's worth it, though.  
Stuff like "*a*a*a*a*a*a*" is just hard to swallow as a shell pattern that 
would occur in real life.

--

___
Python tracker 
<https://bugs.python.org/issue40480>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40480] "fnmatch" exponential execution time

2020-05-03 Thread Tim Peters


Tim Peters  added the comment:

Note that doctest has the same kind of potential problem with matching ellipsis 
(0 or more characters) in expected output blocks.  Backtracking isn't needed at 
all to resolve patterns of that limited kind, but I don't think Python's re 
module supports enough gimmicks to disable backtracking.

So instead doctest has its own

_ellipsis_match()

function to do it in worst-case linear time.
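
The idea is roughly this (a simplified sketch, not the stdlib's exact code):
split the expected output on the marker, pin the first and last pieces to the
ends of the actual output, and advance through the middle pieces with
str.find(), so each "..." consumes as little as possible and no backtracking
is ever needed.

    def ellipsis_match(want, got, marker="..."):
        # Simplified sketch of "..." matching without backtracking.
        pieces = want.split(marker)
        if len(pieces) == 1:          # no ellipsis at all - exact match
            return want == got
        start, end = pieces[0], pieces[-1]
        if not got.startswith(start):
            return False
        pos = len(start)
        for piece in pieces[1:-1]:    # middle pieces: leftmost match wins
            hit = got.find(piece, pos)
            if hit < 0:
                return False
            pos = hit + len(piece)
        return pos <= len(got) - len(end) and got.endswith(end)

    print(ellipsis_match("a...bc...d", "aXXbcYYdZZd"))   # True
    print(ellipsis_match("a...bc...d", "aXXbcYYZZ"))     # False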

--
nosy: +tim.peters

___
Python tracker 
<https://bugs.python.org/issue40480>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40451] Unexpected sys.getrefcount(foo) output

2020-04-30 Thread Tim Peters


Tim Peters  added the comment:

"The 'aux' object" is simply the integer 1.  The dict is irrelevant to the 
outcome, except that the dict owns one reference to 1.  Do

sys.getrefcount(1)

all by itself and you'll see much the same.

This isn't a bug, but neither is it a feature:  it's undocumented, 
implementation-defined behavior.  It so happens that CPython treats a number of 
small integers as singletons, creating only one object for each, shared by all 
contexts that use the integer.

Here's from a fresh Python interactive session:

>>> from sys import getrefcount as r
>>> r(1)
94
>>> r(2)
76
>>> r(3)
27
>>> r(4)
49
>>> r(5)
23
>>> r(6)
11
>>> r(7)
13
>>> r(8)
35
>>> r(9)
13

Nothing about that is a bug, nor is anything about that defined behavior.  It 
just reflects how many times these small integers happen to be referenced by 
all the under-the-covers stuff that happened to get imported by the time the 
interactive prompt was displayed.
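
To see the sharing directly (again a CPython implementation detail, not
something the language guarantees):

    >>> a = 250 + 6
    >>> b = 256
    >>> a is b       # same cached int object on CPython
    True
    >>> from sys import getrefcount as r
    >>> r(1) > 2     # many hidden holders of the singleton 1
    True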

--
nosy: +tim.peters

___
Python tracker 
<https://bugs.python.org/issue40451>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40446] pow(a, b, p) where b=int((p-1)/2) spits out gibbrish for big p

2020-04-29 Thread Tim Peters


Change by Tim Peters :


--
resolution: fixed -> not a bug

___
Python tracker 
<https://bugs.python.org/issue40446>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com


