[issue34751] Hash collisions for tuples

2018-09-29 Thread Tim Peters
Tim Peters added the comment: Jeroen, thanks for helping us fly slightly less blind! ;-) It's a lot of work. I'd say you may as well pick a prime. It's folklore, but "a reason" is that you've discovered that regular bit patterns in multipliers can hurt, and sticking to primes

[issue34751] Hash collisions for tuples

2018-09-28 Thread Tim Peters
Tim Peters added the comment: [Tim] > Perhaps worth noting that FNV-1a works great if used as > _intended_: on a stream of unsigned bytes. > ... > >Py_uhash_t t = (Py_uhash_t)y; >for (int i = 0; i < sizeof(t); ++i) { >x = (x ^ (t & 0xff)) *
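
The truncated C loop above appears to feed a component hash through FNV-1a one byte at a time. A minimal Python sketch of that idea follows. The offset basis and prime are the standard published 64-bit FNV constants; the names `fnv1a_64` and `mix_hash` and the little-endian byte order are assumptions made for illustration, not code taken from the tracker.

```python
FNV64_OFFSET = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV64_PRIME = 0x100000001B3        # standard 64-bit FNV prime
MASK64 = (1 << 64) - 1


def fnv1a_64(data: bytes, x: int = FNV64_OFFSET) -> int:
    """FNV-1a: xor in one byte, then multiply, for each byte in turn."""
    for byte in data:
        x = ((x ^ byte) * FNV64_PRIME) & MASK64
    return x


def mix_hash(x: int, y: int) -> int:
    """Hypothetical helper: fold the 8 bytes of hash y into state x."""
    t = y & MASK64  # view y as an unsigned 64-bit value
    return fnv1a_64(t.to_bytes(8, "little"), x)
```

Usage: start with `x = FNV64_OFFSET` and call `x = mix_hash(x, hash(item))` for each tuple item.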

[issue34751] Hash collisions for tuples

2018-09-27 Thread Tim Peters
Tim Peters added the comment: Also worth noting: other projects need to combine hashes too. Here's a 64-bit version of the highly regarded C++ Boost[1] library's hash_combine() function (I replaced its 32-bit int literal with "a random" 64-bit one): x ^= (Py
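
For readers unfamiliar with it, Boost's `hash_combine()` mixes a new hash into a running seed with an add, two shifts, and an xor. The sketch below is a 64-bit Python rendering of that shape. The constant `K` is a placeholder chosen here (the 64-bit golden-ratio value), not the "random" literal used in the comment, which is cut off in this listing.

```python
MASK64 = (1 << 64) - 1
K = 0x9E3779B97F4A7C15  # assumption: placeholder 64-bit constant


def hash_combine(seed: int, h: int) -> int:
    """Boost-style combine: seed ^= h + K + (seed << 6) + (seed >> 2)."""
    seed &= MASK64
    mixed = ((h & MASK64) + K + (seed << 6) + (seed >> 2)) & MASK64
    return seed ^ mixed
```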

[issue34751] Hash collisions for tuples

2018-09-27 Thread Tim Peters
Tim Peters added the comment: Perhaps worth noting that FNV-1a works great if used as _intended_: on a stream of unsigned bytes. All tests except the new tuple hash test suffer no collisions; the new test suffers 14. Nothing is needed to try to worm around nested tuple catastrophes

[issue34751] Hash collisions for tuples

2018-09-27 Thread Tim Peters
Tim Peters added the comment: I should have spelled this out before: these are all permutations, so in general permuting the result space of `x * mult + y` (or any other permutation involving x and y) is exactly the same as not permuting it but applying a different permutation to y instead

[issue34751] Hash collisions for tuples

2018-09-26 Thread Tim Peters
Tim Peters added the comment: >> The two-liner above with the xor in the second line is >> exactly Bernstein 33A, followed by a permutation >> of 33A's _output_ space. > Not output space, but internal state ? 33A's output _is_ its internal state at the end.

[issue34751] Hash collisions for tuples

2018-09-26 Thread Tim Peters
Tim Peters added the comment: High-order bit: please restore the original tuple hash test. You have the worst case of "but I didn't write it" I've ever encountered ;-) Your new test is valuable, but I've seen several cases now where it fails to detect any problems where the ori

[issue34751] Hash collisions for tuples

2018-09-25 Thread Tim Peters
Tim Peters added the comment: >> j is even implies (j ^ -3) == -(j ^ 3) > This follows from what I posted before: if j is even, then > j ^ 3 is odd, so we can apply the rule x ^ -2 = -x to x = j ^ 3 > ... Thanks! That helps a lot. I had a blind spot there. This kind of th
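
Both identities quoted above can be checked by brute force with Python's unbounded ints. This is only a spot check over a small range, not a proof; the function name and the range are arbitrary.

```python
def check_xor_identities(limit: int = 1000) -> bool:
    """Spot-check the two xor identities over [-limit, limit)."""
    for x in range(-limit + 1, limit, 2):   # odd x
        assert x ^ -2 == -x
    for j in range(-limit, limit, 2):       # even j
        assert j ^ -3 == -(j ^ 3)
    return True
```

The second identity follows from the first because `3 ^ -3 == -2`, so `j ^ -3 == (j ^ 3) ^ -2`, and `j ^ 3` is odd whenever `j` is even.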

[issue34751] Hash collisions for tuples

2018-09-25 Thread Tim Peters
Tim Peters added the comment: > Suppose that there is a hash collision, say hash((3, 3)) == > hash((-3, -3)) and you change the hashing algorithm to fix > this collision. There are _two_ hash functions at play in that collision: the tuple hash function, and the integer hash

[issue34751] Hash collisions for tuples

2018-09-24 Thread Tim Peters
Tim Peters added the comment: And one more: x = (x * mult) ^ t; also appears to work equally well. So, way back when, it appears we _could_ have wormed around the disaster du jour just by applying a shift-xor permutation to the raw hash results. Note the implication: if we

[issue34751] Hash collisions for tuples

2018-09-24 Thread Tim Peters
Tim Peters added the comment: Just noting that this Bernstein-like variant appears to work as well as the FNV-1a version in all the goofy ;-) endcase tests I've accumulated: while (--len >= 0) { y = PyObject_Hash(*p++); if (y == -1) return

[issue34751] Hash collisions for tuples

2018-09-24 Thread Tim Peters
Tim Peters added the comment: > advantage of my approach is that high-order bits become more > important: I don't much care about high-order bits, beyond that we don't systematically _lose_ them. The dict and set lookup routines have their own strategies for incorporating high-orde

[issue34751] Hash collisions for tuples

2018-09-24 Thread Tim Peters
Tim Peters added the comment: Jeroen, I understood the part about -2 from your initial report ;-) That's why the last code I posted didn't use -2 at all (neither -1, which hashes to -2). None of the very many colliding tuples contained -2 in any form. For example, these 8 tuples all have
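
The remark that -1 hashes to -2 is a CPython detail: -1 is reserved as an error return at the C level, so `hash(-1)` is remapped. A short check, assuming CPython:

```python
def minus_one_collides() -> bool:
    """On CPython, -1 and -2 share a hash, and so do tuples built from them."""
    return hash(-1) == hash(-2) and hash((-1,)) == hash((-2,))
```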

[issue34751] Hash collisions for tuples

2018-09-24 Thread Tim Peters
Tim Peters added the comment: > when you do t ^= t << 7, then you are not changing > the lower 7 bits at all. I want to leave low-order hash bits alone. That's deliberate. The most important tuple component types, for tuples that are hashable, are strings and contiguous ranges
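
A small sketch of the shift-xor step under discussion, on 64-bit unsigned values. It shows both points: the low 7 bits pass through unchanged, and the step is a permutation, since it can be undone. The helper names are made up for this example.

```python
MASK64 = (1 << 64) - 1


def shift_xor(t: int) -> int:
    """t ^= t << 7, truncated to 64 bits."""
    return (t ^ (t << 7)) & MASK64


def unshift_xor(u: int) -> int:
    """Invert shift_xor by recovering 7 more low-order bits per pass."""
    t = u
    for _ in range(64 // 7 + 1):
        t = u ^ ((t << 7) & MASK64)
    return t
```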

[issue34751] Hash collisions for tuples

2018-09-23 Thread Tim Peters
Tim Peters added the comment: BTW, those tests were all done under a 64-bit build. Some differences in a 32-bit build: 1. The test_tuple hash test started with 6 collisions. With the change, it went down to 4. Also changing to the FNV-1a 32-bit multiplier boosted it to 8. The test

[issue34751] Hash collisions for tuples

2018-09-23 Thread Tim Peters
Tim Peters added the comment: FYI, using this for the guts of the tuple hash works well on everything we've discussed. In particular, no collisions in the current test_tuple hash test, and none either in the cases mixing negative and positive little ints. This all remains so using

[issue34751] Hash collisions for tuples

2018-09-23 Thread Tim Peters
Tim Peters added the comment: [Raymond, on boosting the multiplier on 64-bit boxes] > Yes, that would be perfectly reasonable (though to some > extent the objects in the tuple also share some of the > responsibility for getting all bits into play). It's of value independent of that

[issue34751] Hash collisions for tuples

2018-09-23 Thread Tim Peters
Tim Peters added the comment: Has anyone figured out the real source of the degeneration when mixing in negative integers? I have not. XOR always permutes the hash range - it's one-to-one. No possible outputs are lost, and XOR with a negative int isn't "obviously degenerate"
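
The claim that xor with a fixed value is one-to-one is easy to confirm exhaustively on a small word size. The sketch below uses 8-bit values; the function name is arbitrary.

```python
def xor_is_permutation(k: int, bits: int = 8) -> bool:
    """True if x -> x ^ k maps range(2**bits) onto itself."""
    size = 1 << bits
    k &= size - 1  # reduce k (possibly negative) to the word size
    return {x ^ k for x in range(size)} == set(range(size))
```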

[issue34751] Hash collisions for tuples

2018-09-23 Thread Tim Peters
Tim Peters added the comment: Oh, I don't agree that it's "broken" either. There's still no real-world test case here demonstrating catastrophic behavior, neither even a contrived test case demonstrating that, nor a coherent characterization of what "the problem" is. I

[issue34751] Hash collisions for tuples

2018-09-22 Thread Tim Peters
Tim Peters added the comment: Raymond, I share your concerns. There's no reason at all to make gratuitous changes (like dropping the "post-addition of a constant and incorporating length signature"), apart from that there's no apparent reason for them existing to begin with ;-)

[issue34751] Hash collisions for tuples

2018-09-22 Thread Tim Peters
Tim Peters added the comment: I strive not to believe anything in the absence of evidence ;-) FNV-1a supplanted Bernstein's scheme in many projects because it works better. Indeed, Python itself used FNV for string hashing before the security wonks got exercised over collision attacks

[issue34751] Hash collisions for tuples

2018-09-22 Thread Tim Peters
Tim Peters added the comment: So you don't know of any directly relevant research either. "Offhand I can't see anything wrong" is better than nothing, but very far from "and we know it will be OK because [see references 1 and 2]". That Bernstein's DJBX33A has been

[issue34397] remove redundant overflow checks in tuple and list implementations

2018-09-21 Thread Tim Peters
Tim Peters added the comment: Because the behavior of signed integer overflow isn't defined in C. Picture a 3-bit integer type, where the maximum value of the signed integer type is 3. 3+3 has no defined result. Cast them to the unsigned flavor of the integer type, though, and the result
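
The 3-bit picture can be modeled in Python. In C, unsigned arithmetic is defined to wrap modulo 2**bits, so the sum of two non-negative signed values always has a defined result after casting to unsigned. The helper below is an illustration of that rule, not code from the patch.

```python
BITS = 3  # toy word size: signed max is 3, unsigned max is 7


def add_unsigned(a: int, b: int, bits: int = BITS) -> int:
    """Model C unsigned addition: the result wraps modulo 2**bits."""
    return (a + b) % (1 << bits)
```

Here `add_unsigned(3, 3)` is 6, which fits in the unsigned type and can then be compared safely against the signed maximum.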

[issue34751] Hash collisions for tuples

2018-09-21 Thread Tim Peters
Tim Peters added the comment: >> Why do you claim the original was "too small"? Too small for >> what purpose? > If the multiplier is too small, then the resulting hash values are > small too. This causes collisions to appear for smaller numbers: All righ

[issue34561] Replace list sorting merge_collapse()?

2018-09-21 Thread Tim Peters
Tim Peters added the comment: Thank you, Vincent! I very much enjoyed - and appreciated - your paper I referenced at the start. Way back when, I thought I had a proof of O(N log N), but never wrote it up because some details weren't convincing - even to me ;-) . Then I had to move

[issue34751] Hash collisions for tuples

2018-09-21 Thread Tim Peters
Tim Peters added the comment: Oops! """ "j odd implies j^(-2) == -j, so that m*(j^(-2)) == -m" """ The tail end should say "m*(j^(-2)) == -m*j" instead. -- ___ P


[issue34751] Hash collisions for tuples

2018-09-21 Thread Tim Peters
Tim Peters added the comment: For me, it's largely because you make raw assertions with extreme confidence that the first thing you think of off the top of your head can't possibly make anything else worse. When it turns out it does make some things worse, you're equally confident

[issue34751] Hash collisions for tuples

2018-09-21 Thread Tim Peters
Tim Peters added the comment: You said it yourself: "It's not hard to come up with ...". That's not what "real life" means. Here: >>> len(set(hash(1 << i) for i in range(100_000))) 61 Wow! Only 61 hash codes across 100 thousand distinct integers?! Y
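
The count of 61 is not an accident. CPython reduces int hashes modulo `sys.hash_info.modulus`, which is `2**61 - 1` on 64-bit builds, so the hashes of powers of two cycle with period 61. A sketch that derives the expected count from the running interpreter instead of hard-coding it:

```python
import sys


def distinct_power_of_two_hashes(limit: int) -> int:
    """Number of distinct hash codes among 1 << i for i in range(limit)."""
    return len({hash(1 << i) for i in range(limit)})


# For a modulus of the form 2**p - 1, the count saturates at p.
expected = sys.hash_info.modulus.bit_length()
```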

[issue34751] Hash collisions for tuples

2018-09-21 Thread Tim Peters
Tim Peters added the comment: @jdemeyer, you didn't submit a patch, or give any hint that you _might_. It _looked_ like you wanted other people to do all the work, based on a contrived example and a vague suggestion. And we already knew from history that "a simple Bernstein hash suf

[issue34751] Hash collisions for tuples

2018-09-20 Thread Tim Peters
Change by Tim Peters : -- nosy: +ned.deily ___ Python tracker <https://bugs.python.org/issue34751>

[issue34751] Hash collisions for tuples

2018-09-20 Thread Tim Peters
Tim Peters added the comment: Ah! I see that the original SourceForge bug report got duplicated on this tracker, as PR #942952. So clicking on that is a lot easier than digging thru the mail archive. One message there noted that replacing xor with addition made collision statistics much

[issue34751] Hash collisions for tuples

2018-09-20 Thread Tim Peters
Tim Peters added the comment: @jdemeyer, please define exactly what you mean by "Bernstein hash". Bernstein has authored many hashes, and none on his current hash page could possibly be called "simple": https://cr.yp.to/hash.html If you're talking about the teen

[issue34659] Inconsistency between functools.reduce & itertools.accumulate

2018-09-17 Thread Tim Peters
Tim Peters added the comment: Ya, I care: `None` was always intended to be an explicit way to say "nothing here", and using unique non-None sentinels instead for that purpose is needlessly convoluted. `initial=None` is perfect. But then I'm old & in the way ;-)

[issue34691] _contextvars missing in x64 master branch Windows build?

2018-09-17 Thread Tim Peters
Tim Peters added the comment: FYI, I bet I didn't see a problem with the Win32 target because I followed instructions ;-) and did my first build using build.bat. Using that for the x64 target too makes the problem go away.

[issue34561] Replace list sorting merge_collapse()?

2018-09-16 Thread Tim Peters
Tim Peters added the comment: Another runstack.py adds a bad case for 2-merge, and an even worse (percentage-wise) bad case for timsort. powersort happens to be optimal for both. So they all have contrived bad cases now. powersort's bad cases are the least bad. So far ;-) But I expect

[issue34561] Replace list sorting merge_collapse()?

2018-09-15 Thread Tim Peters
Tim Peters added the comment: New version of runstack.py. - Reworked code to reflect that Python's sort uses (start_offset, run_length) pairs to record runs. - Two unbounded-integer power implementations, one using a loop and the other division. The loop version implies that, in Python's

[issue34691] _contextvars missing in x64 master branch Windows build?

2018-09-14 Thread Tim Peters
New submission from Tim Peters : Using Visual Studio 2017 to build the current master branch of Python (something I'm trying for the first time in about two years - maybe I'm missing something obvious!), with the x64 target, under both the Release and Debug builds I get a Python that can't

[issue34561] Replace list sorting merge_collapse()?

2018-09-06 Thread Tim Peters
Tim Peters added the comment: No, there's no requirement that run lengths on the stack be ordered in any way by magnitude. That's simply one rule timsort uses, as well as 2-merge and various other schemes discussed in papers. powersort has no such rule, and that's fine. Regardless, rules

[issue34561] Replace list sorting merge_collapse()?

2018-09-06 Thread Tim Peters
Tim Peters added the comment: The notion of cost is that merging runs of lengths A and B has "cost" A+B, period. Nothing to do with logarithms. Merge runs of lengths 1 and 1000, and it has cost 1001. They don't care about galloping, only about how the order in which merges are
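
Under this cost model, the total depends only on the order of merges. A small sketch that totals the cost of a merge plan over a stack of run lengths; the function name and the plan encoding (a list of stack indices) are assumptions made for the example.

```python
def cost_of_plan(runs, plan):
    """Total cost of merging adjacent runs in the given order.

    Each plan entry i merges runs[i] and runs[i + 1] on the current
    stack; a merge of lengths A and B costs A + B.
    """
    runs = list(runs)
    total = 0
    for i in plan:
        merged = runs[i] + runs[i + 1]
        total += merged
        runs[i:i + 2] = [merged]
    return total
```

For runs `[1000, 1, 1]`, merging left to right costs 1001 + 1002 = 2003, while merging the two short runs first costs 2 + 1002 = 1004.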

[issue34561] Replace list sorting merge_collapse()?

2018-09-04 Thread Tim Peters
Tim Peters added the comment: A new version of the file models a version of the `powersort` merge ordering too. It clearly dominates timsort and 2-merge in all cases tried, for this notion of "cost". Against it, its code is much more complex, and the algorithm is very far fro

[issue34561] Replace list sorting merge_collapse()?

2018-09-04 Thread Tim Peters
Tim Peters added the comment: "Galloping" is the heart & soul of Python's sorting algorithm. It's explained in detail here: https://github.com/python/cpython/blob/master/Objects/listsort.txt The Java fork of the sorting code has had repeated bugs due to reducing the size

[issue34561] Replace list sorting merge_collapse()?

2018-09-03 Thread Tim Peters
Tim Peters added the comment: Looks like all sorts of academics are exercised over the run-merging order now. Here's a paper that's unhappy because timsort's strategy, and 2-merge too, aren't always near-optimal with respect to the entropy of the distribution of natural run lengths

[issue34561] Replace list sorting merge_collapse()?

2018-09-01 Thread Tim Peters
Tim Peters added the comment: The attached runstack.py models the relevant parts of timsort's current merge_collapse and the proposed 2-merge. Barring conceptual or coding errors, they appear to behave much the same with respect to "total cost", with no clear overall winner.

[issue34561] Replace list sorting merge_collapse()?

2018-08-31 Thread Tim Peters
New submission from Tim Peters : The invariants on the run-length stack are uncomfortably subtle. There was a flap a while back when an attempt at a formal correctness proof uncovered that the _intended_ invariants weren't always maintained. That was easily repaired (as the researchers

[issue34397] remove redundant overflow checks in tuple and list implementations

2018-08-14 Thread Tim Peters
Tim Peters added the comment: Bah - the relevant thing to assert is really assert((size_t)Py_SIZE(a) + (size_t)Py_SIZE(b) <= (size_t)PY_SSIZE_T_MAX); C sucks ;-)

[issue34397] remove redundant overflow checks in tuple and list implementations

2018-08-14 Thread Tim Peters
Tim Peters added the comment: I agree there's pointless code now, but don't understand why the patch replaces it with mysterious asserts. For example, what's the point of this? assert(Py_SIZE(a) <= PY_SSIZE_T_MAX / sizeof(PyObject*)); assert(Py_SIZE(b) <= PY_SSIZE_T_MAX / sizeof(Py

[issue34376] Improve accuracy of math.hypot() and math.dist()

2018-08-12 Thread Tim Peters
Tim Peters added the comment: Sure, if we make more assumptions. For 754 doubles, e.g., scaling isn't needed if `1e-100 < absmax < 1e100` unless there are a truly ludicrous number of points. Because, if that holds, the true sum is between 1e-200 and number_of_points*1e200, both fa

[issue34376] Improve accuracy of math.hypot() and math.dist()

2018-08-11 Thread Tim Peters
Tim Peters added the comment: Thanks for doing the "real ulp" calc, Raymond! It was intended to make the Kahan gimmick look better, and it succeeded ;-) I don't personally care whether adding 10K things ends up with 50 ulp error, but to each their own. Division can be mostly re

[issue34376] Improve accuracy of math.hypot() and math.dist()

2018-08-10 Thread Tim Peters
Tim Peters added the comment: Not that it matters: "ulp" is a measure of absolute error, but the script is computing some notion of relative error and _calling_ that "ulp". It can understate the true ulp error by up to a factor of 2 (the "wobble" of base 2 f

[issue34291] UnboundLocalError raised on call to global

2018-07-31 Thread Tim Peters
Tim Peters added the comment: Yes, the assignment does "hide the global definition of g". But this determination is made at compile time, not at run time: an assignment to `g` _anywhere_ inside `f()` makes _every_ appearance of `g` within `f()` local to `f`.
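
A minimal reproduction of the behavior described, with made-up function bodies. Because `g` is assigned somewhere inside `f()`, the earlier call to `g()` refers to the local name, which is still unbound at that point.

```python
def g():
    return "global g"


def f():
    result = g()   # raises UnboundLocalError: g is local to f
    g = None       # this assignment is what makes g local everywhere in f
    return result


def call_f():
    """Return the name of the exception raised by f(), if any."""
    try:
        f()
    except UnboundLocalError as exc:
        return type(exc).__name__
    return None
```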

[issue33566] re.findall() dead locked whent the expected ending char not occur until end of string

2018-07-28 Thread Tim Peters
Tim Peters added the comment: Closing as not-a-bug - not enough info to reproduce, but the regexp looked prone to exponential-time backtracking to both MRAB and me, and there's been no response to requests for more info. -- components: +Regular Expressions nosy: +ezio.melotti

[issue33113] Query performance is very low and can even lead to denial of service

2018-07-28 Thread Tim Peters
Tim Peters added the comment: Note: if you found a regexp like this _in_ the Python distribution, then a bug report would be appropriate. It's certainly possible to write regexps that can suffer catastrophic backtracking, and we've repaired a few of those, over the years, that shipped

[issue29710] Incorrect representation caveat on bitwise operation docs

2018-07-23 Thread Tim Peters
Tim Peters added the comment: Nick suggested two changes on 2018-07-15 (look above). Mark & I agreed about the first change, so it wasn't mentioned again after that. All the rest has been refining the second change.

[issue29710] Incorrect representation caveat on bitwise operation docs

2018-07-23 Thread Tim Peters
Tim Peters added the comment: @CuriousLearner, does the PR also include Nick's first suggested change? Here: """ 1. Replace the opening paragraph of https://docs.python.org/3/library/stdtypes.html#bitwise-operations-on-integer-types (the one I originally quoted when ope

[issue34180] bool(Q) always return True for a priority queue Q

2018-07-22 Thread Tim Peters
Tim Peters added the comment: I'm sure Guido designed the API to discourage subtly bug-ridden code relying on the mistaken belief that it _can_ know the queue's current size. In the general multi-threaded context Queue is intended to be used, the only thing `.qsize()`'s caller can know
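
`queue.Queue` and its subclasses define neither `__len__` nor `__bool__`, so `bool(q)` is always true. One common alternative is to attempt the operation and handle `queue.Empty`, which avoids relying on a size that may already be stale. The `drain` helper below is a hypothetical example of that pattern.

```python
import queue


def drain(q):
    """Remove and return everything currently retrievable from q."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items
```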

[issue34168] RAM consumption too high using concurrent.futures (Python 3.7 / 3.6 )

2018-07-20 Thread Tim Peters
Tim Peters added the comment: Note that you can consume multiple gigabytes of RAM with this simpler program too, and for the same reasons: """ import concurrent.futures as cf bucket = range(30_000_000) def _dns_query(target): from time import sleep slee

[issue34168] RAM consumption too high using concurrent.futures (Python 3.7 / 3.6 )

2018-07-20 Thread Tim Peters
Tim Peters added the comment: If your `bucket` has 30 million items, then for element in bucket: executor.submit(kwargs['function']['name'], element, **kwargs) is going to create 30 million Future objects (and all the under-the-covers objects needed to manage their concurrency
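
One way to avoid creating every Future up front is to keep only a bounded number in flight and submit a new task each time one finishes. The sketch below assumes a thread pool and a function of one argument; `run_bounded` and its parameters are names invented here, and results come back in completion order.

```python
import concurrent.futures as cf
from itertools import islice


def run_bounded(func, items, max_workers=4, max_pending=100):
    """Apply func to items with at most max_pending Futures alive."""
    results = []
    it = iter(items)
    with cf.ThreadPoolExecutor(max_workers=max_workers) as ex:
        # Prime the window with the first max_pending tasks.
        pending = {ex.submit(func, x) for x in islice(it, max_pending)}
        while pending:
            done, pending = cf.wait(pending, return_when=cf.FIRST_COMPLETED)
            for fut in done:
                results.append(fut.result())
            # Refill the window: one new task per finished task.
            for x in islice(it, len(done)):
                pending.add(ex.submit(func, x))
    return results
```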

[issue29710] Incorrect representation caveat on bitwise operation docs

2018-07-16 Thread Tim Peters
Tim Peters added the comment: Ya, Mark's got a point there. Perhaps s/the internal/a finite two's complement/ ?

[issue29710] Incorrect representation caveat on bitwise operation docs

2018-07-15 Thread Tim Peters
Tim Peters added the comment: Well, all 6 operations "are calculated as though carried out in two's complement with an infinite number of sign bits", so I'd float that part out of the footnote and into the main text. When, e.g., you're thinking of ints _as_ bitstrings, it's e
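
The "infinite string of sign bits" model is visible from Python directly: masking a negative int exposes as many of its two's-complement bits as the mask is wide. The helper name below is made up for the example.

```python
def low_bits(n: int, width: int) -> int:
    """Low `width` bits of n, viewed as infinite two's complement."""
    return n & ((1 << width) - 1)
```

For example, `low_bits(-1, 8)` is 255 (all ones) and `low_bits(-3, 4)` is 13 (binary 1101). The same model explains `~5 == -6` and `-8 >> 1 == -4`.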

[issue29710] Incorrect representation caveat on bitwise operation docs

2018-07-14 Thread Tim Peters
Tim Peters added the comment: Nick, that seems a decent compromise. "Infinite string of sign bits" is how Guido & I both thought of it when the semantics of longs were first defined, and others in this report apparently find it natural enough too. It also applies to al

[issue34109] Accumulator bug

2018-07-13 Thread Tim Peters
Tim Peters added the comment: ? I expect your code to return -1 about once per 7**4 = 2401 times, which would be about 400 times per million tries, which is what your output shows. If you start with -5, and randint(1, 7) returns 1 four times in a row, r5 is left at -5 + 4 = -1
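
The arithmetic behind "about 400 times per million" can be written out exactly. This sketch only assumes that each `randint(1, 7)` call returns 1 with probability 1/7 and that the four draws are independent.

```python
from fractions import Fraction

# Probability that four independent draws from 1..7 are all 1.
p_four_ones = Fraction(1, 7) ** 4

# Expected number of such outcomes in a million trials.
expected_per_million = p_four_ones * 1_000_000
```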

[issue34100] Same integers in a tuple of constant literals are not merged

2018-07-11 Thread Tim Peters
Tim Peters added the comment: Fine, Serhiy, so reword it a tiny bit: it's nice if a code object's co_consts vector references as few distinct objects as possible. Still a matter of pragmatics, not of correctness.

[issue34100] Same integers in a tuple of constant literals are not merged

2018-07-11 Thread Tim Peters
Tim Peters added the comment: The language doesn't define anything about this - any program relying on accidental identity is in error itself. Still, it's nice if a code object's co_consts vector is as short as reasonably possible. That's a matter of pragmatics, not of correctness

[issue34016] Bug in sort()

2018-07-01 Thread Tim Peters
Tim Peters added the comment: Lucas, as Mark said you're sorting _strings_ here, not sorting integers. Please study his reply. As strings, "10" is less than "9", because "1" is less than "9". >>> "10"
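
A short illustration of the difference, using made-up sample data: strings compare character by character, so a numeric key is needed to get numeric order.

```python
values = ["10", "9", "2", "33"]

as_strings = sorted(values)           # lexicographic: "1" < "2" < "3" < "9"
as_numbers = sorted(values, key=int)  # compare by integer value instead
```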

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Tim Peters
Tim Peters added the comment: [Victor] > This method [shuffle()] has a weird API. What is > the point of passing a random function, > ... I proposed to deprecate this argument and remove it later. I don't care here. This is a bug report. Making backward-incompatible API cha

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Tim Peters
Tim Peters added the comment: Victor, look at Raymond's patch. In Python 3, `randrange()` and friends already use the all-integer `getrandbits()`. He's changing three other lines, where some variant of `int(random() * someinteger)` is being used in an inner loop for speed. Presumably
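
For contrast with `int(random() * someinteger)`, here is a sketch of the all-integer approach: draw just enough random bits and reject values that fall outside the range. It never touches floats, so float rounding cannot push the result out of range. This is an illustration of the technique, not a copy of the `random` module's private code.

```python
import random


def randbelow(n: int, getrandbits=random.getrandbits) -> int:
    """Uniform random int in range(n), using only integer operations."""
    if n <= 0:
        raise ValueError("n must be positive")
    k = n.bit_length()      # enough bits to cover 0 .. n inclusive
    r = getrandbits(k)
    while r >= n:           # reject and retry; under half of draws fail
        r = getrandbits(k)
    return r
```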

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Tim Peters
Tim Peters added the comment: [Mark] > If we do this, can we also persuade Guido to Pronounce that > Python implementations assume IEEE 754 format and semantics > for floating-point? On its own, I don't think a change to force 53-bit precision _on_ 754 boxes would justify that. Th

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Tim Peters
Tim Peters added the comment: Mark, ya, I agree it's most prudent to let sleeping dogs lie. In the one "real" complaint we got (issue 24546) the cause was never determined - but double rounding was ruled out in that specific case, and no _plausible_ cause was identified (short of

[issue24567] random.choice IndexError due to double-rounding

2018-06-25 Thread Tim Peters
Tim Peters added the comment: Mark, do you believe that 32-bit Linux uses a different libm? One that fails if, e.g., SSE2 were used instead? I don't know, but I'd sure be surprised if it did. Very surprised - compilers have been notoriously unpredictable in exactly when and where

[issue24567] random.choice IndexError due to double-rounding

2018-06-24 Thread Tim Peters
Tim Peters added the comment: There are a couple bug reports here that have been open for years, and it's about time we closed them. My stance: if any platform still exists on which "double rounding" is still a potential problem, Python _configuration_ should be changed to disa

[issue33089] Add multi-dimensional Euclidean distance function to the math module

2018-06-24 Thread Tim Peters
Tim Peters added the comment: Raymond, I'd say scaling is vital (to prevent spurious infinities), but complications beyond that are questionable, slowing things down for an improvement in accuracy that may be of no actual benefit. Note that your original "simple homework problems for

[issue33812] Different behavior between datetime.py and its C accelerator

2018-06-09 Thread Tim Peters
Tim Peters added the comment: I'd call it a bug fix, but I'm really not anal about what people call things ;-)

[issue33814] exec() maybe has a memory leak

2018-06-09 Thread Tim Peters
Tim Peters added the comment: Dan, your bug report is pretty much incoherent ;-) This standard Stack Overflow advice applies here too: https://stackoverflow.com/help/mcve Guessing your complaint is that: sys.getrefcount(itertools.repeat) keeps increasing by 1 across calls to `leaks

[issue33812] Different behavior between datetime.py and its C accelerator

2018-06-08 Thread Tim Peters
Tim Peters added the comment: I copy/pasted the definitions of "aware" and "naive" from the docs. Your TZ's .utcoffset() returns None, so, yes, any datetime using an instance of that for its tzinfo is naive. In print(datetime(2000,1,1).astimezone(timezone.utc)) the

[issue33812] Different behavior between datetime.py and its C accelerator

2018-06-08 Thread Tim Peters
Tim Peters added the comment: The message isn't confusing - the definition of "aware" is confusing ;-) """ A datetime object d is aware if d.tzinfo is not None and d.tzinfo.utcoffset(d) does not return None. If d.tzinfo is None, or if d.tzinfo is not None but d.tz
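
The quoted definition translates directly into a predicate. The `NoOffset` class below is a made-up tzinfo whose `utcoffset()` returns None, which shows the case where a datetime has a tzinfo attached and is still naive.

```python
from datetime import datetime, timezone, tzinfo


def is_aware(d: datetime) -> bool:
    """Aware means: a tzinfo is attached and it reports a UTC offset."""
    return d.tzinfo is not None and d.tzinfo.utcoffset(d) is not None


class NoOffset(tzinfo):
    """Hypothetical tzinfo that never reports an offset."""

    def utcoffset(self, dt):
        return None
```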

[issue21196] Name mangling example in Python tutorial

2018-06-06 Thread Tim Peters
Tim Peters added the comment: Berker Peksag's change (PR 5667) is very simple and, I think, helpful. -- nosy: +tim.peters

[issue32832] doctest should support custom ps1/ps2 prompts

2018-05-28 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: You missed my point about IPython: forget "In/Out arrays, etc". What you suggest is inadequate for _just_ changing PS1/PS2 for IPython. Again, read their `parse()` function. They support _more than one_ set of PS1/PS2

[issue32832] doctest should support custom ps1/ps2 prompts

2018-05-27 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: Sergey, I understand that, but I don't care. The only people I've ever seen _use_ this are people writing an entirely different shell interface. They're rare. There's no value in complicating doctest to cater to theoretical use

[issue32832] doctest should support custom ps1/ps2 prompts

2018-05-27 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: doctest was intended to deal with the standard CPython terminal shell. I'd like to keep it that way, but recognize that everyone wants to change everything into "a framework" ;-) How many other shells are there? As Sergey li

[issue33579] calendar.timegm not always an inverse of time.gmtime

2018-05-19 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: They both look wrong to me. Under 3.6.5 on Win10, `one` and `three` are the same. Python 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 17:00:18) [MSC v.1900 64 bit (AMD64)] on win32 time.struct_time(tm_year=2009, tm_mon=2, tm_mday=13, tm_h

[issue33572] False/True as dictionary keys treated as integers

2018-05-18 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: I expect these docs date back to when ints, longs, and floats were the only hashable language-supplied types for which mixed-type comparison could ever return True. They could stand some updates ;-) `fractions.Fraction` and `decimal.D
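
The behavior in the issue title comes from two rules working together: keys that compare equal must hash equal, and a dict keeps one entry per equal key. A short demonstration with made-up values:

```python
from decimal import Decimal
from fractions import Fraction

d = {1: "int", True: "bool", 1.0: "float"}
# All three keys are equal, so the dict holds one entry: the first key
# inserted is kept, and later assignments only replace its value.

same_hash = hash(1) == hash(True) == hash(1.0) == hash(Fraction(1)) == hash(Decimal(1))
```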

[issue33566] re.findall() dead locked whent the expected ending char not occur until end of string

2018-05-18 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: Min, you need to give a complete example other people can actually run for themselves. Offhand, this part of the regexp (.|\s)* all by itself _can_ cause exponential-time behavior. You can run this for yourself: >>> import re
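
In `(.|\s)*`, a whitespace character other than a newline can be matched by either alternative, which is what gives the matcher exponentially many ways to fail. If the intent is "any character, including newlines", a single `.` with `re.DOTALL` expresses the same thing with one path per character. The pattern with a trailing `x` below is an invented stand-in, since the original regexp is not shown in this listing.

```python
import re

# Ambiguous: a space matches both "." and "\s". Kept for comparison only;
# avoid running it on long inputs that cannot match.
SLOW = re.compile(r"(.|\s)*x")

# Same language, no ambiguity.
FAST = re.compile(r".*x", re.DOTALL)


def agree(s: str) -> bool:
    """True if both patterns give the same match/no-match answer for s."""
    return bool(SLOW.match(s)) == bool(FAST.match(s))
```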

[issue33424] 4.4. break and continue Statements, and else Clauses on Loops

2018-05-03 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: Closing because this appears to be senseless. -- nosy: +tim.peters resolution: -> rejected stage: -> resolved status: open -> closed

[issue33402] Change the fractions.Fraction class to convert to a unicode fraction string

2018-05-02 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: -1. We should stop pretending this _might_ happen ;-) -- nosy: +tim.peters

[issue33372] Wrong calculation

2018-04-26 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: Please find a minimal example that illustrates the problem you think you've found, and paste the plain text _into_ the bug report. In the meantime, I'm closing this as "not a bug". The division operator applied to integers in P

[issue33293] Using datetime.datetime.utcnow().timestamp() in Python3.6.0 can't get correct UTC timestamp.

2018-04-17 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: docstrings give brief statements intended to jog your memory; they're not intended to be comprehensive docs. Read the actual documentation and see whether you're still confused. When you "assumed it is irrelevant to time zone"

[issue33293] Using datetime.datetime.utcnow().timestamp() in Python3.6.0 can't get correct UTC timestamp.

2018-04-17 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: Ned, I think this one is more the case that the OP didn't read the docs ;-) That said, there's a level of complexity here that seemingly can't be reduced: the distinctions between the `datetime` and `time` modules' views of the

[issue33293] Using datetime.datetime.utcnow().timestamp() in Python3.6.0 can't get correct UTC timestamp.

2018-04-17 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: I agree this isn't a bug (and it was right to close it). I expect the OP is confused about what the `.timestamp()` method does, though. This note in the docs directly addresses what happens in their problematic `datetime.utcnow().tim
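
The underlying rule is that `.timestamp()` treats a naive datetime as local time, and `utcnow()` returns a naive datetime. An aware UTC datetime gives the expected POSIX timestamp. A minimal sketch, with a made-up function name:

```python
import time
from datetime import datetime, timezone


def utc_timestamp() -> float:
    """POSIX timestamp taken from an aware UTC datetime."""
    return datetime.now(timezone.utc).timestamp()
```

The result should agree with `time.time()` regardless of the machine's local time zone.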

[issue33204] IDLE: remove \b from colorizer string prefix

2018-04-01 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: Sounds good (removing \b) to me, Terry! -- nosy: +tim.peters

[issue33144] random._randbelow optimization

2018-03-26 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: I'm the wrong guy to ask about that. Since I worked at Zope Corp, my natural inclination is to monkey-patch everything - but knowing full well that will offend everyone else ;-) That said, this optimization seems straightforward to me

[issue33144] random._randbelow optimization

2018-03-26 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: I don't see anything objectionable about the class optimizing the implementation of a private method. I'll note that there's a speed benefit beyond just removing the two type checks in the common case: the optimized `_randbelow()` also

[issue33114] random.sample() behavior is unexpected/unclear from docs

2018-03-25 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: There's nothing in the docs I can see that implies `sample(x, n)` is a prefix of what `sample(x, n+1)` would have returned had the latter been called instead. If so, then - as always - it's "at your own risk" when you re

[issue33083] math.factorial accepts non-integral Decimal instances

2018-03-20 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: factorial(float) was obviously intended to work the way it does, so I'd leave it alone in whatever changes are made to resolve _this_ issue. I view it as a harmless-enough quirk, but, regardless, if people want to deprecate it that

[issue33112] SequenceMatcher bug

2018-03-20 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: Please see the response to issue31889. Short course: you need to pass `autojunk=False` to the SequenceMatcher constructor. -- nosy: +tim.peters resolution: -> duplicate stage: -> resolved status: op
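
The fix mentioned is a keyword argument to the constructor. With `autojunk` left at its default, sequences of 200 or more items have their most frequent items treated as junk, which can distort results for long inputs with few distinct elements. A minimal wrapper, with a made-up name:

```python
from difflib import SequenceMatcher


def similarity(a, b) -> float:
    """Similarity ratio with the automatic junk heuristic turned off."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()
```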

[issue33089] Add multi-dimensional Euclidean distance function to the math module

2018-03-19 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: Mark, how about writing a clever single-rounding dot product that merely _detects_ when it encounters troublesome cases? If so, it can fall back to a (presumably) much slower method. For example, like this for the latter: def srdp(

[issue33089] Add multi-dimensional Euclidean distance function to the math module

2018-03-19 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: Mark, thanks! I'm happy with that resolution: if any argument is infinite, return +inf; else if any argument is a NaN, return a NaN; else do something useful ;-) Serhiy, yes, the scaling that prevents catastrophic overflow/underfl
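
The resolution described (infinity wins, then NaN, then the real computation) combined with scaling by the largest magnitude can be sketched as follows. This is an illustration of the rules stated in the comment, not the code that was pasted into the issue, and `hypot_sketch` is a hypothetical name.

```python
import math


def hypot_sketch(*coords) -> float:
    """Euclidean norm with inf/NaN rules and overflow-avoiding scaling."""
    vals = [abs(float(c)) for c in coords]
    if any(math.isinf(v) for v in vals):
        return math.inf                 # any infinity wins, even over NaN
    if any(math.isnan(v) for v in vals):
        return math.nan
    scale = max(vals, default=0.0)
    if scale == 0.0:
        return 0.0
    # Dividing by the largest magnitude keeps every square in [0, 1].
    return scale * math.sqrt(sum((v / scale) ** 2 for v in vals))
```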

[issue33089] Add multi-dimensional Euclidean distance function to the math module

2018-03-19 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: Some notes on the hypot() code I pasted in: first, it has to special case infinities too - it works fine if there's only one of 'em, but returns a NaN if there's more than one (it ends up computing inf/inf, and the resulting NaN prop

[issue33089] Add multi-dimensional Euclidean distance function to the math module

2018-03-18 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: I'd be +1 on generalizing math.hypot to accept an arbitrary number of arguments. It's the natural building block for computing distance, but the reverse is strained. Both are useful. Here's scaling code translated from the F

[issue33098] add implicit conversion for random.choice() on a dict

2018-03-18 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: This won't be changed. The dict type doesn't support efficient random choice (neither do sets, by the way), and it's been repeatedly decided that it would do a disservice to users to hide that. As you know, you can materialize the
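
Materializing the keys makes the linear cost explicit at the call site. A one-line helper, with a made-up name:

```python
import random


def random_key(d: dict, rng=random):
    """Pick a random key by first copying the keys into a list (O(n))."""
    return rng.choice(list(d))
```

If many choices are needed from an unchanging dict, building the list once and reusing it avoids repeating the copy.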

[issue26680] Incorporating float.is_integer into the numeric tower and Decimal

2018-03-15 Thread Tim Peters
Tim Peters <t...@python.org> added the comment: If you want to deprecate the method, bring that up on python-dev or python-ideas. It's inappropriate on the issue tracker (unless, e.g., you open a new issue with a patch to rip it out of the language). It's also inappropriate t
