[Python-Dev] Re: Deferred, coalescing, and other very recent reference counting optimization
On 9/2/20 8:50 PM, Jim J. Jewett wrote:
> I suspect that splitting the reference count away from the object itself
> could also be profitable, as it means the cache won't have to be dirtied
> (and flushed) on read access, and can keep Copy-On-Write from duplicating
> pages.

I had a patch from Thomas Wouters doing that for the Gilectomy. Last time I
tried it, it was a performance wash.

//arry/
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at
https://mail.python.org/archives/list/[email protected]/message/RTRVQJNOJPGKBTJVPD62Z7LTKR7QI7FB/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Travis CI migrated from the legacy GitHub API to the new GitHub Action
Hi,

tl;dr Travis CI issues are now resolved thanks to Ernest!

During the last 3 months, the Travis CI job randomly failed to report the
build status to GitHub pull requests:
https://github.com/python/core-workflow/issues/371

I discovered that travis-ci.org uses the legacy GitHub API, whereas the new
travis-ci.com website uses the new GitHub Apps API. The migration started in
May 2018, is still in the beta phase, and must be done manually:

* https://blog.travis-ci.com/2018-05-02-open-source-projects-on-travis-ci-com-with-github-apps
* https://docs.travis-ci.com/user/migrate/open-source-on-travis-ci-com/#existing-open-source-repositories-on-travis-ciorg

Two weeks ago, Ernest W. Durbin III migrated the GitHub "python" organization
from travis-ci.org to travis-ci.com, and also migrated the cpython project to
travis-ci.com (each project must be migrated individually).

Since the migration, I haven't seen the "Travis CI doesn't report the build
status" issue on new pull requests. Great! Thanks Ernest!

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.

Message archived at
https://mail.python.org/archives/list/[email protected]/message/ZABQWCQKUNKNO5XUCNIMB7GELSXTA646/
[Python-Dev] Python logging with a custom clock
Dear list,

log records in the Python logging library always use timestamps provided by
`time.time()`, i.e. the usual system clock (UTC, CLOCK_REALTIME). This time
is used as the absolute timestamp in log records and for timestamps relative
to the load of the library (Lib/logging/__init__.py: `_startTime`).

I would like to receive feedback on, and propose, the attached (and possibly
incomplete) patch that allows the programmer to provide a custom callable to
get the time instead of `time.time()`. For example, this could be:

    clock = lambda: time.clock_gettime(time.CLOCK_TAI)

and the callable could be provided during initial logging setup:

    logging.basicConfig(clock=clock, ...)

There is a similar approach in log4j to specify a custom clock [0].

This change enables the use of non-UTC clocks, e.g. `CLOCK_TAI` or
`CLOCK_MONOTONIC`, which are unaffected by leap seconds and count SI seconds.
(In fact, logging's use of differences of UTC timestamps could make users
believe that the obtained duration reflects SI seconds, which it doesn't in
all cases.)

Combining a custom absolute clock such as `CLOCK_TAI` with custom log
formatters allows users to /store or transfer/ log records with TAI
timestamps, and /display/ them with UTC timestamps (e.g. properly converted
from TAI to UTC with a "60" second during an active leap second). This
resolves the ambiguity when analysing and correlating logs from different
machines, also during leap seconds.

Attached is a simple example showing the different timestamps based on UTC
and TAI (assuming the current offset of +37 seconds [1] is properly
configured on the host, e.g. through PTP or `adjtimex()` with `ADJ_TAI`).

    $ export TZ=GMT
    $ date --iso-8601=seconds && python3 example.py
    2020-09-02T14:34:14+00:00
    2020-09-02T14:34:51+ INFO message

According to the documentation `time.CLOCK_TAI` was introduced in Python 3.9
[2], but the system constant can already be used today (e.g. on Debian
Buster, Linux 4.19.0, Python 3.7.3 it is 11).
The two patches provided are for Python 3.7.3 (Debian Buster) and Python
3.8.5 (python.org). In the latter case, it may need to be considered how
changing the Python logging clock works: it probably should fail if handlers
are already configured, unless `force` is also provided and handlers are
reset.

Kind regards,
-- nicolas benes

[0] `log4j.Clock` in https://logging.apache.org/log4j/log4j-2.8/manual/configuration.html
[1] https://www.timeanddate.com/worldclock/other/tai
[2] https://docs.python.org/dev/library/time.html#time.CLOCK_TAI

--
Nicolas Benes
[email protected]
Software Engineer
European Southern Observatory
https://www.eso.org
Karl-Schwarzschild-Strasse 2
D-85748 Garching b. Muenchen
Germany

>From e19413a025d7807f68fc83f17cbb5da147d869f5 Mon Sep 17 00:00:00 2001
From: Nicolas Benes
Date: Tue, 1 Sep 2020 18:33:38 +0200
Subject: [PATCH] Logging with custom clock

---
 Lib/logging/__init__.py | 20 ++++++++++++++------
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/Lib/logging/__init__.py b/Lib/logging/__init__.py
index 2761509..54ccf9e 100644
--- a/Lib/logging/__init__.py
+++ b/Lib/logging/__init__.py
@@ -49,10 +49,15 @@ __date__ = "07 February 2010"
 # Miscellaneous module data
 #---
 
+#
+# _clock_gettime is a callable to get the current time in seconds
+#
+_clock_gettime = time.time
+
 #
 # _startTime is used as the base when calculating the relative time of events
 #
-_startTime = time.time()
+_startTime = _clock_gettime()
 
 #
 # raiseExceptions is used to see if exceptions during handling should be
@@ -295,7 +300,7 @@ class LogRecord(object):
         """
         Initialize a logging record with interesting information.
         """
-        ct = time.time()
+        ct = _clock_gettime()
         self.name = name
         self.msg = msg
 #
@@ -494,8 +499,8 @@ class Formatter(object):
 %(lineno)d          Source line number where the logging call was issued
                     (if available)
 %(funcName)s        Function name
-%(created)f         Time when the LogRecord was created (time.time()
-                    return value)
+%(created)f         Time when the LogRecord was created (by default
+                    time.time() return value)
 %(asctime)s         Textual time when the LogRecord was created
 %(msecs)d           Millisecond portion of the creation time
 %(relativeCreated)d Time in milliseconds when the LogRecord was created,
@@ -1862,6 +1867,8 @@ def basicConfig(**kwargs):
               handlers, which will be added to the root handler. Any handler
               in the list which does not have a formatter assigned will be
               assigned the formatter created
[Python-Dev] Buildbot migrated to a new server
Hi,

tl;dr Buildbots were unstable for 3 weeks, but the issue is mostly resolved.

Since last January, the disk of the buildbot server filled up every 2 weeks
and The Night’s Watch had to fix it in the darkness for you (usually: remove
JUnit files and restart the server). The old machine only has 8 GB for the
whole system and all data, whereas buildbot workers produce large JUnit (XML)
files (around 5 MB per file).

Three weeks ago, Ernest W. Durbin III provided us with a new machine with a
larger disk (60 GB) and installed a PostgreSQL database (whereas SQLite was
used previously). He automated the installation of the machine, and also
(great new feature!) automated reloading the Buildbot server when a new
configuration is pushed to the Git repository. The configuration is public
and maintained at:

https://github.com/python/buildmaster-config/

The migration was really smooth, except that last week we noticed that
workers started to be disconnected every minute, and then filled their
temporary directories with compiler files leaked by interrupted builds.
Buildbot owners have to update their client configuration and manually
remove the temporary files:

https://mail.python.org/archives/list/[email protected]/thread/SZR2OLH67OYXSSADSM65HJYOIMFF44JZ/

Most buildbot worker configurations have been updated and the issue is
mostly resolved. There is another minor issue: HTTPS connections are also
closed after 1 minute, so the web page is refreshed automatically every
minute. The load balancer configuration should be adjusted:

https://bugs.python.org/issue41701

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.

Message archived at
https://mail.python.org/archives/list/[email protected]/message/DYUX5EEDAX3IO66QOICPK3VNEENSEIIQ/
[Python-Dev] Re: Deferred, coalescing, and other very recent reference counting optimization
I'm surprised nobody has mentioned this: there are no "unboxed" types
in CPython - in effect, every object user code creates is allocated
from the heap. Even, e.g., integers and floats. So even non-contrived
code can create garbage at a ferocious rate. For example, think about
this simple function, which naively computes the Euclidean distance
between two n-vectors:
```python
def dist(xs, ys):
    from math import sqrt
    if len(xs) != len(ys):
        raise ValueError("inputs must have same length")
    return sqrt(sum((x - y)**2 for x, y in zip(xs, ys)))
```
In general, `len(xs)` and `len(ys)` each create a new integer object,
which both become trash the instant `!=` completes. Say the common
length is `n`.
`zip` then creates `n` 2-tuple objects, each of which lives only long
enough to be unpacked into `x` and `y`. Then the result of `x-y` lives
only long enough to be squared, and the result of that lives only long
enough to be added into `sum()`'s internal accumulator. Finally, the
grand total lives only long enough to extract its square root.
With "immediate" reclamation of garbage via refcounting, memory use is
trivial regardless of how large `n` is, as CPython reuses the same heap
space over & over & over, ASAP. The space for each 2-tuple is
reclaimed before `x-y` is computed, the space for that is reclaimed
when the square completes, and the space for the square is reclaimed
right after `sum()` folds it in. It's memory-efficient and
cache-friendly "in the small".
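[Editor's note: this is easy to see with `tracemalloc`; a sketch (the exact
peak varies by build, but it stays tiny and independent of `n`):]

```python
import tracemalloc
from math import sqrt

def dist(xs, ys):
    if len(xs) != len(ys):
        raise ValueError("inputs must have same length")
    return sqrt(sum((x - y)**2 for x, y in zip(xs, ys)))

n = 100_000
xs = [float(i) for i in range(n)]
ys = [float(i + 1) for i in range(n)]

tracemalloc.start()
d = dist(xs, ys)
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Despite creating ~100,000 tuples, differences, and squares during the
# call, the peak heap growth measured here is tiny: each intermediate is
# reclaimed the moment its refcount hits zero.
```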
Of course that's _assuming_, e.g., that `(x-y).__pow__(2)` doesn't
save away its arguments somewhere that outlives the method call, but
the compiler has no way to know whether it does. The only thing it
can assume about the element types is that they support the methods
the code invokes. In fact, the compiler has no idea whether the
result type of `x-y` is even the same as the type of `x` or of `y`.
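[Editor's note: a contrived operand type (names are illustrative) shows why
the compiler can't assume the "temporaries" are really temporary:]

```python
class Sneaky:
    """A contrived numeric wrapper whose subtraction saves its operands."""
    seen = []  # operands escape here and outlive the operator call

    def __init__(self, v):
        self.v = v

    def __sub__(self, other):
        Sneaky.seen.append((self, other))  # stash both arguments
        return Sneaky(self.v - other.v)

x, y = Sneaky(5), Sneaky(2)
diff = x - y
# x and y would normally become trash once the expression is done, but
# here they remain reachable through Sneaky.seen, so refcounting cannot
# free them -- and the compiler can never rule this behavior out.
```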
Message archived at
https://mail.python.org/archives/list/[email protected]/message/NU4T5TFPDVYDCR5ADY6KKJ6USWVFD3TZ/
[Python-Dev] Re: Deferred, coalescing, and other very recent reference counting optimization
Tim Peters wrote:
> `zip` then creates `n` 2-tuple objects, each of which lives only long
> enough to be unpacked into `x` and `y`... With "immediate" reclamation
> of garbage via refcounting, memory use is trivial regardless of how
> large `n` is, as CPython reuses the same heap space over & over & over,
> ASAP. The space for each 2-tuple is reclaimed before `x-y` is computed...

It's also worth noting that the current refcounting scheme allows for some
pretty sneaky optimizations under the hood. In your example, `zip` only ever
creates one 2-tuple, and keeps reusing it over and over:

https://github.com/python/cpython/blob/c96d00e88ead8f99bb6aa1357928ac4545d9287c/Python/bltinmodule.c#L2623

This works because most `zip` usage looks exactly like yours, where the tuple
is only around long enough to be unpacked. If `zip.__next__` sees that the
result tuple is no longer referenced anywhere else, it is free to mutate(!)
the tuple in place. I believe PyUnicode_Append does something similar for
string concatenation, as well.

Message archived at
https://mail.python.org/archives/list/[email protected]/message/MJ4BL42YSMP5BUGWT7EEK3EKNVGBDH35/
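[Editor's note: the tuple reuse described above is observable from Python.
This is a CPython implementation detail, not a language guarantee; a sketch:]

```python
# When the caller drops its reference, zip.__next__ can recycle the same
# result tuple instead of allocating a fresh one (CPython-specific).
it = zip([1, 2, 3], ["a", "b", "c"])

first = next(it)
first_id = id(first)
assert first == (1, "a")

del first          # refcount drops to 1 (zip's own internal reference)
second = next(it)  # zip reuses the now-exclusively-held tuple in place
assert second == (2, "b")
assert id(second) == first_id  # same object, mutated in place
```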
[Python-Dev] Re: PEP 622 version 2 (Structural Pattern Matching)
On 30/07/2020 00:34, Nick Coghlan wrote:
> the proposed name binding syntax inherently conflicts with the existing
> assignment statement lvalue syntax in two areas:
> * dotted names (binds an attribute in assignment, looks up a constraint
>   value in a match case)
> * underscore targets (binds in assignment, wildcard match without binding
>   in a match case)
> The former syntactic conflict presents a bigger problem, though, as it
> means that we'd be irrevocably committed to having two different lvalue
> syntaxes for the rest of Python's future as a language.

+1

Message archived at
https://mail.python.org/archives/list/[email protected]/message/WLNMH7OFURYPL2E7YT5JRYXW7RLDGIH6/
