[Tim]
> I know the theoretical number of probes for dicts, but not for sets
> anymore. The latter use a mix of probe strategies now, "randomish"
> jumps (same as for dicts) but also purely linear ("up by 1") probing
> to try to exploit L1 cache.
>
> It's not _apparent_ to me that the mix actually
[Larry]
> It's a lightweight abstract dependency graph. Its nodes are opaque,
> only required to be hashable. And it doesn't require that you give it
> all the nodes in strict dependency order.
>
> When you add a node, you can also optionally specify
> dependencies, and those dependencies aren't
[Larry]
> Here is the original description of my problem, from the original email in
> this thread. I considered this an adequate explanation of my problem
> at the time.
>> I do have a use case for this. In one project I maintain a "ready" list of
>> jobs; I need to iterate over it, but I also
[Nick Coghlan ]
> I took Larry's request a slightly different way: he has a use case where
> he wants order preservation (so built in sets aren't good), but combined
> with low cost duplicate identification and elimination and removal of
> arbitrary elements (so lists and collections.deque aren't
[Skip Montanaro ]
> ...
> I thought setting PYTHONTRACEMALLOC should provoke some useful output,
> but I was confused into thinking I was (am?) still missing something
> because it continued to produce this message:
>
> Enable tracemalloc to get the memory block allocation traceback
Ah, I
[Skip Montanaro ]
> I've got a memory issue in my modified Python interpreter I'm trying
> to debug. Output at the end of the problematic unit test looks like this:
To my eyes, you left out the most important part ;-) A traceback
showing who made the fatal free() call to begin with.
In debug
For posterity, just recording best guesses for the other mysteries left hanging:
- PYTHONTRACEMALLOC didn't work for you because Victor's traceback
showed that Py_FinalizeEx was executing _PyImport_Fini, one statement
_after_ it disabled tracemalloc via _PyTraceMalloc_Fini.
- The address passed
I'm surprised nobody has mentioned this: there are no "unboxed" types
in CPython - in effect, every object user code creates is allocated
from the heap. Even, e.g., integers and floats. So even non-contrived
code can create garbage at a ferocious rate. For example, think about
this simple
[Steven D'Aprano ]
> Perhaps this is a silly suggestion, but could we offer this as an
> external function in the stdlib rather than a string method?
>
> Leave it up to the user to decide whether or not their data best suits
> the find method or the new search function. It sounds like we can offer
Fredrik Lundh crafted our current string search algorithms, and
they've served us very well. They're nearly always as fast as
dumbest-possible brute force search, and sometimes much faster. This
was bought with some very cheap one-pass preprocessing of the pattern
(the substring to search _for_),
[Guido]
> The key seems to be:
Except none of that quoted text (which I'll skip repeating) gives the
slightest clue as to _why_ it may be an improvement. So you split the
needle into two pieces. So what? What's the _point_? Why would
someone even imagine that might help?
Why is one half then
[Dennis Sweeney ]
> Here's my attempt at some heuristic motivation:
Thanks, Dennis! It helps. One gloss:
>
> The key insight though is that the worst strings are still
> "periodic enough", and if we have two different patterns going on,
> then we can intentionally split them apart.
The
[Guido]
> I am not able to dream up any hard cases -- like other posters,
> my own use of substring search is usually looking for a short
> string in a relatively short piece of text. I doubt even the current
> optimizations matter to my uses.
I should have responded to this part differently.
[Marco Sulla]
> Excuse me if I intrude in an algorithm that I have not understood, but
> the new optimization can be applied to regexps too?
The algorithm is limited to searching for fixed strings.
However, _part_ of our regexp implementation (the bit that looks ahead
for a fixed string) will
I don't plan on making a series of these posts, just this one, to give
people _some_ insight into why the new algorithm gets systematic
benefits the current algorithm can't. It splits the needle into two
pieces, u and v, very carefully selected by subtle linear-time needle
preprocessing (and it's
[Tim Peters, explains one of the new algorithm's surprisingly
effective moving parts]
[Chris Angelico ]
> Thank you, great explanation. Can this be added to the source code
> if/when this algorithm gets implemented?
No ;-) While I enjoy trying to make hard things clear(er),
[Tim]
>> Note that no "extra" storage is needed to exploit this. No character
>> lookups, no extra expenses in time or space of any kind. Just "if we
>> mismatch on the k'th try, we can jump ahead k positions".
[Antoine Pitrou ]
> Ok, so that means that on a N-character haystack, it'll always do
[Tim]
> ...
> Alas, the higher preprocessing costs leave the current PR slower in "too
> many" cases too, especially when the needle is short and found early
> in the haystack. Then any preprocessing cost approaches a pure waste
> of time.
But that was this morning. Since then, Dennis changed
Rest assured that Dennis is aware that pragmatics may change for
shorter needles.
The code has always made a special case of 1-character needles,
because it's impossible "even in theory" to improve over
straightforward brute force search then.
Say the length of the text to search is `t`, and
[Guido]
> Maybe someone reading this can finish the Wikipedia page on
> Two-Way Search? The code example trails off with a function with
> some incomprehensible remarks and then a TODO..
Yes, the Wikipedia page is worse than useless in its current state,
although some of the references it lists
[Paul Moore ]
> (This is a genuine question, and I'm terrified of being yelled at for
> asking it, which gives an idea of the way this thread has gone - but I
> genuinely do want to know, to try to improve my own writing).
>
> What *is* the correct inclusive way to refer to an unidentified person
>>> ... “it looks like it’s happening on
>>> python-dev too” to mean that the request was for both lists.
[Tim Peters]
>> It depends on who you want to annoy least ;-)
>>
>> If it's the position of the PSF that some kind(s) of messages must be
>> suppressed, then I'll need a mo
[Brett Cannon wrote:]
> Regardless of what side you fall on, I think we can agree that
> emotions are running very high at the moment. Nothing is going
> to change in at least the next 24 hours, so I am personally
> asking folks to step back for at least that long and think about:
>
> Is what you
One microscopic point:
[Guido]
> ...
> (if `.x` is unacceptable, it’s unclear why `^x` would be any
> better),
As Python's self-appointed spokesperson for the elderly, there's one
very clear difference: a leading "." is - literally - one microscopic
point, all but invisible. A leading caret is
[Victor Stinner ]
> If someone continues to feed the PEP 8 discussion, would it be
> possible to change their account to require moderation for 1 day or
> maybe even up to 1 week? I know that Mailman 3 makes it possible.
I see no such capability. I could, for example, manually fiddle
things so
[Ernest W. Durbin III ]
> Reviewing, I may have misinterpreted the message from PSF Executive
> Director regarding the situation.
>
> It does appear that python-ideas moderators contacted postmaster@.
> Appears I misread a message saying “it looks like it’s happening on
> python-dev too” to mean
[Tim]
See reply to Glenn. Can you give an example of a dotted name that is
not a constant value pattern? An example of a non-dotted name that is?
If you can't do either (and I cannot), then that's simply what "if
[Rhodri James ]
>>> case long.chain.of.attributes:
[Tim]
>>
[Rhodri James ]
> I'm seriously going to maintain that I will forget the meaning of "case
> _:" quickly and regularly,
Actually, you won't - trust me ;-)
> just as I quickly and regularly forget to use
> "|" instead of "+" for set union. More accurately, I will quickly and
> regularly forget
You got everything right the first time ;-) The PEP is an extended
illustration of "although that way may not be obvious at first unless
you're Dutch".
I too thought "why not else:?" at first. But "case _:" covers it in
the one obvious way after grasping how general wildcard matches are.
[Tim]
>> ".NAME" grated at first, but extends the idea that dotted names are
>> always constant value patterns to "if and only if". So it has mnemonic
>> value. When context alone can't distinguish whether a name is meant as
>> (in effect) an lvalue or an rvalue, no syntax decorations can prevent
[Taine Zhao ]
> "or" brings an intuition of the execution order of pattern matching, just
> like how people already know about "short-circuiting".
>
> "or" 's operator precedence also suggests the syntax of OR patterns.
>
> As we have "|" as an existing operator, it seems that there might be
>
[Ethan Furman ]
> "case _:" is easy to miss -- I missed it several times reading through the
> PEP.
As I said, I don't care about "shallow first impressions". I care
about how a thing hangs together _after_ climbing its learning curve -
which in this case is about a nanometer tall ;-)
You're
[Ernest W. Durbin III ]
> At the request of the list moderators of python-ideas and python-dev,
> both lists have been placed into emergency moderation mode. All
> new posts must be approved before landing on the list.
>
> When directed by the list moderators, this moderation will be disabled.
I
[Julien Danjou]
> ...
> Supposedly PyObject_Malloc() returns some memory space to store a
> PyObject. If that was true all the time, that would allow anyone to
> introspect the allocated memory and understand why it's being used.
>
> Unfortunately, this is not the case. Objects whose types are
[Dan Stromberg ]
> ...
> Timsort added the innovation of making mergesort in-place, plus a little
> (though already common) O(n^2) sorting for small sublists.
Actually, both were already very common in mergesorts. "timsort" is
much more a work of engineering than of insight ;-) That is, it
[Ethan Furman]
> A question [1] has arisen about the viability of `random.SystemRandom` in
> Pythons before and after the secrets module was introduced
> (3.5 I think) -- specifically
>
> does it give independent and uniform discrete distribution for
> cryptographic purposes across
I'm guessing it's time to fiddle local CPython clones to account for
master->main renaming now?
If so, I've seen two blobs of instructions, which are very similar but
not identical:
Blob 1 ("origin"):
"""
You just need to update your local clone after the branch name changes.
From the local
FYI, I just force-unsubscribed this member (Hoi Lam Poon) from
python-dev. Normally I don't do things like that, since, e.g., we have
no way to know whether the sender address was spoofed in emails we
get. But in this case Hoi's name has come up several times as the
sender of individual spam, and
Various variations on:
> ... I am also considering unsubscribing if someone doesn't step in and stop
> the mess going on between Brett and Marco. ...
Overall, "me too!" pile-ons _are_ "the [bulk of the] mess" to most
list subscribers.
It will die out on its own in time. Dr. Brett should know by
[Marco Sulla ]
> It's the Netiquette, Chris. It's older than Internet. It's a gross
> violation of the Netiquette remarking grammatical or syntactical
> errors. I think that also the least advanced AI will understand what I
> meant.
As multiple people have said now, including me, they had no idea
[Marco Sulla ]
> Oh, this is enough. The sense of the phrase was very clear and you all
> have understood it.
Sincerely, I have no idea what "I pretend your immediate excuses."
means, in or out of context.
> Remarking grammatical errors is a gross violation
> of the Netiquette. I ask
[Marco Sulla ]
> I repeat, even the worst AI will understand from the context what I
> meant.
Amazingly enough, the truth value of a proposition does not increase
via repetition ;-)
>>> bool(True * 1_000_000_000)
True
>>> bool(False * 1_000_000_000)
False
> But let me do a very rude example:
>
[Laurent Lyaudet ]
> ...
> My benchmarks could be improved but however I found that Shivers' sort
> and adaptive Shivers' sort (aka Jugé's sort) performs better than
> Tim's sort.
Cool! Could you move this to the issue report already open on this?
Replace list sorting merge_collapse()?
[me]
> If you want more active moderation, volunteer for the job. I'd happily
> give it up, and acknowledge that my laissez-faire moderation approach
> is out of style.
But, please, don't tell _me_ off-list that you volunteer. I want no
say in who would become a new moderator - I'm already doing
Sorry, all! This post was pure spam - I clicked the wrong button on
the moderator UI. The list has already been set to auto-reject any
future posts from this member.
On Mon, Aug 9, 2021 at 10:51 AM ridhimaortiz--- via Python-Dev
wrote:
>
> It is really nice post. https://bit.ly/3fsxwwl
>
Sorry for the spam! A bunch of these were backed up in the moderation
queue. I used the UI to set the list to auto-discard future messages
from this address, but then clicked "Accept" in the mistaken sense of
"yes, accept my request to auto-nuke this clown". But it took "Accept"
to mean "sure
[Gregory P. Smith ]
> The reason for digits being a multiple of 5 bits should be revisited vs
> its original intent
I added that. The only intent was to make it easier to implement
bigint exponentiation easily while viewing the exponent as being in
base 32 (so as to chew up 5 bits at a time).
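To make "viewing the exponent as being in base 32" concrete, here is a rough pure-Python sketch of fixed-window modular exponentiation that consumes the exponent 5 bits at a time. It is not CPython's C code; `pow_base32` is a hypothetical name, and the internal digit layout is not modeled at all. The connection to the digit size is only this: if each internal digit holds a multiple of 5 bits, every digit splits into whole 5-bit chunks with no chunk straddling two digits.

```python
def pow_base32(base, exp, mod):
    """Compute (base ** exp) % mod, reading exp as a base-32 number."""
    if exp < 0 or mod <= 0:
        raise ValueError("this sketch needs exp >= 0 and mod > 0")

    # table[i] == base**i % mod for every possible 5-bit digit i.
    table = [1 % mod]
    for _ in range(31):
        table.append(table[-1] * base % mod)

    # Peel off base-32 digits, least significant first.
    digits = []
    while exp:
        digits.append(exp & 31)
        exp >>= 5

    result = 1 % mod
    for digit in reversed(digits):
        # Shifting the exponent left by 5 bits means 5 squarings ...
        for _ in range(5):
            result = result * result % mod
        # ... and adding the digit means one table multiplication.
        result = result * table[digit] % mod
    return result


print(pow_base32(3, 1000, 10**9 + 7) == pow(3, 1000, 10**9 + 7))
```

The trade-off in the sketch is the usual one for windowed methods: a 32-entry table is built up front, in exchange for at most one extra multiplication per 5 bits of exponent.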
>> The reason for digits being a multiple of 5 bits should be revisited vs
>> its original intent
> I added that. The only intent was to make it easier to implement
> bigint exponentiation easily ...
That said, I see the comments in longintrepr.h note a stronger constraint:
"""
the marshal code
[Christopher Barker ]
> Earlier in the thread, we were pointed to multiple implementations.
>
> Is this particular one clearly the “best”[*]?
>
> If so, then sure.
>
> -CHB
>
> [*] best meaning “most appropriate for the stdlib”. A couple folks have
> already pointed to the quality of the code. But
[Christopher Barker ]
> Maybe a stupid question:
>
> What are use cases for sorted dicts?
>
> I don’t think I’ve ever needed one.
For example, for some mappings with totally ordered keys, it can be
useful to ask for the value associated with a key that's not actually
there, because "close to the
I started writing up a SortedDict use case I have, but it's very
elaborate and I expect it would just end with endless pointless
argument about other approaches I _could_ take. But I already know all
those ;-)
So let's look at something conceptually "dead easy" instead: priority
queues. They're a
[Bob Fang ]
> This is a modest proposal to consider having sorted containers
> (http://www.grantjenks.com/docs/sortedcontainers/) in standard library.
+1 from me, but if and only if Grant Jenks (its author) wants that too.
It's first-rate code in all respects, including that it's a fine
example
[Raymond Bisdorff ]
> I fully agree with your point. By default, all the components of the
> tuple should be used in the comparison.
>
> Yet, I was confused by the following result.
> >>> from operator import itemgetter
> >>> L = [(1, 'a'), (2, 'b'), (1, 'c'), (2, 'd'), (3, 'e')]
> >>>
[Raymond Bisdorff ]
> ...
> Please notice the following inconsistency in Python3.10.0 and before of
> a sort(reverse=True) result:
>
> >>> L = [(1, 'a'), (2, 'b'), (1, 'c'), (2, 'd'), (3, 'e')]
> >>> L.sort(reverse=True)
> >>> L
> >>> [(3, 'e'), (2, 'd'), (2, 'b'), (1, 'c'), (1, 'a')]
Looks
[Ethan Furman ]
> When is an empty container contained by a non-empty container?
That depends on how the non-empty container's type defines
__contains__. The "stringish" types (str, bytes, bytearray) work _very_
differently from others (list, set, tuple) in this respect.
t in x
for the latter
[Guido]
> I don't think there's a way to do a PGO build from Visual Studio; but
> a command prompt in the repo can do it using `PCbuild\build.bat --pgo`.
> Just be patient with it.
Thanks! That worked, and was easy, and gave me an executable that runs
"// 10" at supernatural speed.
Alas, Visual
[Gregory P. Smith ]
> ...
> That only appears true in default boring -O2 builds. Use
> `./configure --enable-optimizations` and the C version is *much* faster
> than your asm one...
>
> 250ns for C vs 370ns for your asm divl one using old gcc 9.3 on my
> zen3 when compiled using
[Mark Dickinson ]
>> Division may still be problematic.
Heh. I'm half convinced that heavy duty bigint packages are so often
written in assembler because their authors are driven insane by trying
to trick C compilers into generating "the obvious" machine
instructions needed.
An alternative to HW
[Barry Scott and Steve Dower share tips for convincing Visual Studio
to show assembler without recompiling the file]
Thanks, fellows! That mostly ;-) worked. Problem remaining is that
breakpoints just didn't work. They showed up "visually", and in the table
of set breakpoints, but code went
[Tim, incidentally notes that passing 10 as the divisor to inplace_divrem1()
is "impossibly fast" on Windows, consuming less than a third of the time
taken when passing seemingly any other divisor]
[Mark Dickinson, discovers much the same is true under other, but not all,
Linux-y builds, due to the
Heads up! I found my PSF Board voting info in my gmail spam folder
today; looks like it was mailed out this morning.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
[Skip Montanaro ]
> I subscribe to the python/cpython stuff on GitHub. I find it basically
> impossible to follow because of the volume.
> ...
> How (if at all) do people deal with this firehose of email? Am I the
> only person dumb enough to have tried?
My observation is that, over time, all