[Python-Dev] Re: Presenting PEP 695: Type Parameter Syntax

2022-07-15 Thread Paul Ganssle
I actually really like some variation on `@with type S`, or some other 
variation that uses a keyword, because that makes it much easier for 
someone newly encountering one of these syntax constructs to search for 
what it does. If you didn't already know what the square brackets did, 
how would you try to find out? "what do square brackets mean in Python" 
would probably turn up a bunch of stuff about element access, and maybe 
something about generic type parameters.


By contrast, `@with type S` is kinda self-explanatory, and even if it's 
not, 'What does "with type" mean in Python' will almost certainly turn 
up meaningful results.


An additional benefit is that I find some of these examples to be a bit 
visually cluttered with all the syntax:


def func1[T](a: T) -> T: ...          # OK
class ClassA[S, T](Protocol): ...     # OK

Which would look less cluttered with a prefix clause:

@with type T
def func1(a: T) -> T: ...             # OK

@with type S
@with type T
class ClassA(Protocol): ...           # OK

Of the objections to this concept in the PEP, the most obvious one 
to me was that the scoping rules were less clear, but it is not entirely 
clear to me why the scope of the prefix clause couldn't be extended to 
include class / function decorators that follow the prefix clause; the 
choice of scoping seems like it was a defensible but mostly arbitrary 
one. I think as long as the new prefix clause is something that was 
syntactically forbidden prior to the introduction of PEP 695 (e.g. 
`@with type` or `[typevar: S]` or whatever), it will be relatively clear 
that this is not a normal decorator, and so "the scoping and time of 
execution doesn't match decorators" doesn't seem like a major concern to 
me relative to the benefits of using a more searchable syntax.
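For reference, the bracketed forms above are PEP 695's spelling of what, before Python 3.12, required explicit TypeVar declarations. A rough pre-695 equivalent, with the type variable names declared as ordinary (and greppable) assignments:

```python
from typing import TypeVar, Protocol

# Pre-PEP 695 spelling of the same generics: the TypeVar names are
# declared explicitly, which keeps them easy to search for.
T = TypeVar("T")
S = TypeVar("S")

def func1(a: T) -> T:           # PEP 695: def func1[T](a: T) -> T: ...
    return a

class ClassA(Protocol[S, T]):   # PEP 695: class ClassA[S, T](Protocol): ...
    ...
```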


On 7/12/22 18:09, Jelle Zijlstra wrote:


The Rejected ideas mention “various syntactic options for specifying
type parameters that preceded def and class statements” rejected
because
scoping is less clear and doesn't work well with decorators. I
wonder if
decorator-like syntax itself was considered, e.g. something like:
```
@with type S
@with type T
@dec(Foo[S])
class ClassA: ...
```

We did consider variations of that. Pure decorator syntax (like your 
"@dec" line) wouldn't allow us to get the scoping right, since 
decorators can't define new names. (Unless you use a walrus operator, 
but that wouldn't meet the goal of providing cleaner syntax.)
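A sketch of the walrus workaround mentioned here, with `dec` as a hypothetical decorator factory: because decorator expressions are evaluated before the class statement runs, the walrus does bind the name in time, but the result is hardly cleaner than a plain `TypeVar` assignment:

```python
from typing import Generic, TypeVar

def dec(tv):
    # Hypothetical decorator factory; returns the class unchanged.
    def wrap(cls):
        return cls
    return wrap

# The decorator expression is evaluated before the class statement,
# so the walrus binds S early enough to be used in the base list.
@dec(S := TypeVar("S"))
class ClassA(Generic[S]):
    ...
```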


We also considered some ideas similar to your "@with type". It can 
work, but it feels more verbose than the current proposal, and it's 
not in line with what most other languages do.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OB7TJYLEFYV6MGF5VB3FYVLH3QEIRKNT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-12 Thread Paul Ganssle
I think Richard's point was two-fold: People usually don't want or need 
this kind of thing /except/ when they have some very specific 
performance requirements, in which case they probably want to also be 
very specific about the kind of container they are using rather than 
using an abstract "sorted container".


It is true that people sensitive to performance may really care about 
the way dict is implemented, but there are a great many uses for 
associative arrays in general. I knew about sortedcontainers and I also 
don't remember ever seeing a situation where I needed one or recommended 
its use. Maybe it would be useful for leetcode-style interview questions 
(which may by itself be a good reason to include it in the standard 
library, since a lot of those interviewers will let you use 
`collections.deque` or `collections.OrderedDict` without implementing 
any sort of linked list, but they won't let you import a module that 
provides a trie or something).


I'm fairly neutral on this proposal. I do think the standard library is 
a good place for these sorts of fundamental data structures (even 
abstract ones), but I do agree with Richard that they are pretty niche. 
In almost any situation where you'd want / need something like this, I 
feel like adding a PyPI dependency would not be a big deal.


One thing that I do think might be valuable is if the plan were to 
re-implement these as C extensions. Maintaining and distributing C 
extensions on PyPI is in many ways more difficult than bundling them 
directly into CPython. That said, doing so adds maintenance burden and 
implementation cost (and it's a different proposition than merely 
adopting an existing library), so I'm probably -0.0 on the whole thing — 
the absence of these types of containers in the standard library is not 
an obvious lack, and getting them from PyPI in the situations where you 
actually need them doesn't seem like a big deal, particularly when it's 
a pure Python implementation.


On 11/11/21 21:43, Steven D'Aprano wrote:

On Thu, Nov 11, 2021 at 11:01:32AM -0800, Richard Levasseur wrote:


Should the stdlib have e.g. SortedList? Probably not, because the use cases
of such data types are too niche for a one-size-fits-all implementation, and
there are too many implementations with too many of their own settings.
Should it have e.g. BTree, RedBlackTree, SortedArrayList, etc? Probably so,
because those are somewhat fundamental data structures and implementing,
testing, and verifying them is very much non-trivial. While niche, having
them at your immediate disposal is very powerful.

By that reasoning, we shouldn't have dicts, for exactly the same reason:
for anyone wanting an associative array, there are so many implementation
variants to choose from:

- hash table with linear probing
- hash table with chaining
- AVL tree
- red-black tree
- judy array
- splay tree
- treap
- scapegoat tree

and many more, and most of them can be tuned.

If Python didn't already have dict, your argument against SortedList
would equally apply to it: they are "fundamental data structures and
implementing, testing, and verifying them is very much non-trivial".

So if your argument is correct, that would imply that standardizing on
one single dict implementation, one that isn't even tunable by the
caller, was a mistake. We should have provided a dozen different hash
tables, trees and treaps.

But... we didn't, and I don't think that Python is a worse language
because we only have one associative array implementation in the stdlib.

Whenever I need an associative array, I don't lose sleep over whether I
could get 2% better performance for hits, at a cost of 17% worse
performance for misses, by swapping over to some other implementation. I
just reach for dict, knowing that it will almost always be good enough.



Last year, for fun, after wishing there was a SortedSet in the standard
lib, I ended up implementing a Red-Black Tree and BTree based sorted
dictionary/set[1]. After then trying to use them for my use case[2], I
found that, in order to fully and truly exploit their benefits, the basic
Sequence/Collection/Set/Dict APIs didn't really suffice. I needed APIs that
would let me, e.g. binary search to a particular spot and then iterate, or
give me a range between two points, etc.

I believe that sortedcontainers.SortedSet provides that functionality
via the irange() method.

http://www.grantjenks.com/docs/sortedcontainers/sortedlist.html#sortedcontainers.SortedList.irange
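For readers unfamiliar with that API, here is a minimal stdlib sketch of the "binary search to a spot, then iterate" pattern that irange() provides. This is illustrative only; the real sortedcontainers.SortedList uses a far more sophisticated list-of-lists structure:

```python
import bisect

class MiniSortedList:
    """Toy sorted sequence with an irange()-style query; illustration
    only -- not the sortedcontainers implementation."""

    def __init__(self, items=()):
        self._data = sorted(items)

    def add(self, value):
        bisect.insort(self._data, value)  # binary-search insert

    def irange(self, lo, hi):
        # Two binary searches locate the closed interval [lo, hi],
        # then we simply iterate over that slice.
        i = bisect.bisect_left(self._data, lo)
        j = bisect.bisect_right(self._data, hi)
        return iter(self._data[i:j])

s = MiniSortedList([5, 1, 9, 3, 7])
s.add(4)
assert list(s.irange(3, 7)) == [3, 4, 5, 7]
```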

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HI3IW4XEANYTUOEWVRMH2WYZLGBSQEPD/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: The list.sort(reverse=True) method not consistent with description

2021-10-30 Thread Paul Ganssle
You are re-assigning the list on line 4 here, not displaying it. I get 
the answer you expect when using the `itemgetter(0)` key:


IPython 7.28.0, on CPython 3.9.7 (default, Aug 31 2021 13:28:12)
>>> import operator
>>> from operator import itemgetter
>>> L = [(1, 'a'), (2, 'b'), (1, 'c'), (2, 'd'), (3, 'e')]
>>> L.sort(key=operator.itemgetter(0), reverse=True)
>>> L
[(3, 'e'), (2, 'b'), (2, 'd'), (1, 'a'), (1, 'c')]

On 10/30/21 12:47, Raymond Bisdorff wrote:

Dear All,

I fully agree with your point. By default, all the components of the 
tuple should be used in the comparison.


Yet, I was confused by the following result.
>>> from operator import itemgetter
>>> L = [(1, 'a'), (2, 'b'), (1, 'c'), (2, 'd'), (3, 'e')]
>>> L.sort(key=itemgetter(0), reverse=True)
>>> L = [(3, 'e'), (2, 'd'), (2, 'b'), (1, 'c'), (1, 'a')]

Should the tuple comparison in this case, I thought, not be based solely 
on the first tuple component?

Best Regards

On 10/30/21 18:26, Tim Peters wrote:

[Raymond Bisdorff ]

...
Please notice the following inconsistency in Python 3.10.0 and earlier in
a sort(reverse=True) result:

  >>> L = [(1, 'a'), (2, 'b'), (1, 'c'), (2, 'd'), (3, 'e')]
  >>> L.sort(reverse=True)
  >>> L
  [(3, 'e'), (2, 'd'), (2, 'b'), (1, 'c'), (1, 'a')]

Looks good to me.


it should be:

  >>> L = [(1, 'a'), (2, 'b'), (1, 'c'), (2, 'd'), (3, 'e')]
  >>> reverseTuplesSort(L)
[(3, 'e'), (2, 'b'), (2, 'd'), (1, 'a'), (1, 'c')]

Stability is irrelevant in the example, because no two list elements
are equal. You appear to be thinking, perhaps, that s[0] == t[0] when
s == (1, 'a') and t == (1, 'c') means s and t "are equal", but that's
not so at all.


>>> s = (1, 'a')
>>> t = (1, 'c')
>>> s == t
False
>>> s < t
True
>>> t > s
True

So s MUST come before t in a forward sort, and t MUST come before s in
a reverse sort.
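Tim's point is easy to verify: with full-tuple comparison there are no ties, so reverse=True yields the exact reversal, whereas a key restricted to the first component creates ties whose original (stable) order is preserved:

```python
L = [(1, 'a'), (2, 'b'), (1, 'c'), (2, 'd'), (3, 'e')]

# Full-tuple comparison: no two elements compare equal, so the reverse
# sort is the exact reversal of the forward sort.
assert sorted(L, reverse=True) == [(3, 'e'), (2, 'd'), (2, 'b'), (1, 'c'), (1, 'a')]

# Key on the first component only: (2, 'b')/(2, 'd') now tie, and sort
# stability keeps ties in their original order even with reverse=True.
assert sorted(L, key=lambda t: t[0], reverse=True) == [
    (3, 'e'), (2, 'b'), (2, 'd'), (1, 'a'), (1, 'c')]
```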




___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LRERVUWNU2FABB2KRKV2KPHAG2UBKFXI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: In support of PEP 649

2021-04-15 Thread Paul Ganssle
It seems evident that the problems with PEP 563 have been well-known at
least to pydantic for several years, as you can see in the issue Samuel
Colvin linked: https://github.com/samuelcolvin/pydantic/issues/2678

That said, while I do think that "please contact upstream when you see a
problem developing, not just before a major deadline" is a good lesson
to take away from this, it is probably not worth litigating the question
of the particular manner of the notification. As much as I think it
would have been good for this discussion to happen 6 months, 1 year, 2
years or 3 years ago, "before the code freeze" is a heck of a lot better
than "after the release" (which is often when the notification comes in,
and why we have been encouraging people to test against alphas and betas).

Hopefully we can let it rest by saying that the timing of learning about
this apparently urgent situation could have been much better, but it
could have been worse as well.

Best,
Paul

On 4/15/21 7:12 PM, Christopher Barker wrote:
> On Thu, Apr 15, 2021 at 3:29 PM Bernat Gabor wrote:
> > I'm a bit surprised that this topic is brought up just days before
> the feature freeze of Python 3.10.
>
> I have no idea about the technical details, but I think there is a
> bit of a disconnect in the community:
>
> Annotations have been around for years.
>
> More recently, there was an official specification of how to use them
> for type hinting (was that PEP 484? a bit confused as that appears to
> still be provisional -- but anyway)
>
> Since the specification for type hinting, a lot of work has been done
> on/with static type checkers (e.g. MyPy).
>
> That work has informed the changes / improvements / proposals to the
> annotation and type system in Python.
>
> However, those of us that are not actively involved (and maybe not
> that interested) in static typing have been not paying a great deal of
> attention to those efforts.
>
> But some of us may, in fact, be using annotations for other reasons,
> but since we haven't been involved in the discussion, some issues may
> have been missed.
>
> -CHB
>
>
> -- 
> Christopher Barker, PhD (Chris)
>
> Python Language Consulting
>   - Teaching
>   - Scientific Software Development
>   - Desktop GUI and Web Development
>   - wxPython, numpy, scipy, Cython
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WETOB43S2H6RGVX7RKWOZNHZZHH245V6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: In support of PEP 649

2021-04-15 Thread Paul Ganssle
I should point out that "accept PEP 649" and "break pydantic" are not
the only options here. The thing that will break pydantic is the end of
PEP 563's deprecation period, not a failure to implement PEP 649.

Other viable options include:

- Leave PEP 563 opt-in until we can agree on a solution to the problem.
- Leave PEP 563 opt-in forever.
- Deprecate PEP 563 and go back to status quo ante.
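To make the mechanics concrete: under PEP 563 every annotation is stored as a string, and runtime consumers like pydantic must re-evaluate it. The sketch below writes the strings out literally to simulate what `from __future__ import annotations` does to every annotation:

```python
import typing

# Literal string annotations simulate PEP 563 semantics, under which
# *all* annotations are stored as strings rather than evaluated objects.
def f(x: "int") -> "str":
    return str(x)

# Runtime consumers see strings, not types...
assert f.__annotations__["x"] == "int"
# ...and must re-evaluate them, e.g. via typing.get_type_hints(), which
# breaks when the referenced names are only visible in a local scope.
assert typing.get_type_hints(f)["x"] is int
```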

I haven't followed this closely enough — if PEP 649 were accepted today,
would it even be ready for use before the 3.10 code freeze (which is in
a few weeks)?

Assuming this is a real problem (and based in part on how long it took
for attrs to get what support it has for PEP 563, I wouldn't be
surprised if PEP 563 is quietly throwing a spanner in the works in
several other places as well), my vote is to leave PEP 563 opt-in until
at least 3.11 rather than try to rush through a discussion on and
implementation of PEP 649.

Best,
Paul

On 4/15/21 4:20 PM, Bluenix wrote:
> Please accept PEP 649!
>
> Python's type hinting has become so much more useful than originally thought, 
> and without this change much of that will be hindered. For example (you 
> already know about Pydantic and FastAPI) 
> [discord.py](https://github.com/Rapptz/discord.py)'s commands system allows 
> you to use typehinting to specify how arguments should be converted. Take the 
> following code:
>
> ```py
> import discord
> from discord.ext import commands
>
> bot = commands.Bot(command_prefix='>')
>
> @bot.command()
> # discord.py reads the typehints and converts the arguments accordingly
> async def reply(ctx, member: discord.Member, *, text: str):  # ctx is always passed
>     await ctx.send(f'{member.mention}! {text}')
>
> bot.run('token')
> ```
>
> I must say, this is extremely ergonomic design! Don't break it :)
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GT2HQDH2HOZFSOTA5LHTFL5OV46UPKTB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Distro packagers: PEP 615 and the tzdata dependency

2020-11-28 Thread Paul Ganssle
Considering the people involved and the nature of the list, I suspect
that adding a new @python.org mailing list would be better than
discourse. In my experience, it's very difficult to just follow a single
topic on the discourse, and most people complain that the e-mail
integration is not great. For something like, "Here's a head's up about
something affecting distributors", I don't think Discourse offers much
in the way of advantages.

My guess is that distributors would be happiest with a relatively
low-volume e-mail list that would point to discussions happening
elsewhere (or that announces changes relevant to distributors).

Best,
Paul

On 11/28/20 3:13 PM, Brett Cannon wrote:
> So that's two people. If five people involved in the distribution of
> Python speak up I will go ahead and create a category on
> discuss.python.org (people can ask sooner,
> but my personal threshold here to do the work myself is 5).
>
> On Wed., Nov. 25, 2020, 00:18 Petr Viktorin,  > wrote:
>
> On 11/24/20 7:50 PM, Brett Cannon wrote:
> > If enough people were interested we could create a "Distributors"
> > category on discuss.python.org.
>
> I'd join :)
>
> >
> > On Tue, Nov 24, 2020 at 9:08 AM Tianon Gravi  wrote:
> >
> >      > I'd love to have an easy way to keep them in the loop.
> >
> >     I'm one of the maintainers on
> >     https://github.com/docker-library/python
> >     (which is what results in https://hub.docker.com/_/python), and I'd
> >     love to have an easy way to keep myself in the loop too! O:)
> >
> >     Is there a lower-frequency mailing list where things like
> this are
> >     normally posted that I could follow?
> >     (I don't want to be a burden, although we'd certainly really
> love to
> >     have more upstream collaboration on that repo -- we do our
> best to
> >     represent upstream as correctly/accurately as possible, but
> we're not
> >     experts!)
> >
> >      > would it make sense to add a packaging section to our
> >     documentation or
> >      > to write an informational PEP?
> >
> >     FWIW, I love the idea of an explicit "packaging" section in
> the docs
> >     (or a PEP), but I've maintained that for other projects
> before and
> >     know it's not always easy or obvious. :)
> >
> >     ♥,
> >     - Tianon
> >        4096R / B42F 6819 007F 00F8 8E36  4FD4 036A 9C25 BF35 7DD4
> >
> >     PS. thanks doko for giving me a link to this thread! :D
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Y3GTLEJIVLA7G4SPD3LEDWMPC7ZISHVS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Distro packagers: PEP 615 and the tzdata dependency

2020-11-16 Thread Paul Ganssle
Good to know. I think one reason I wouldn't immediately think to send
things to Linux-SIG is that for me "distro packagers" also includes
people maintaining packages in the BSD, Solaris, conda-forge, homebrew,
etc ecosystems. It might make sense to have a dedicated list or
discourse for distro packagers, but as I don't meet the description I
don't know how useful it would be for me to opine on the form it should
take.

Best,
Paul

On 11/16/20 10:45 AM, Miro Hrončok wrote:
> On 11/16/20 4:10 PM, Paul Ganssle wrote:
>>> Maybe it would make sense to create a community mailing list for
>>> packagers?
>> That sounds like a good idea to me.
>
> I am following the Linux SIG mailing list. But it's mostly dead.
>
> https://mail.python.org/archives/list/linux-...@python.org/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JSGVNDBFNDZ6WHGVUA23PMXRVPHN6IQU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Distro packagers: PEP 615 and the tzdata dependency

2020-11-16 Thread Paul Ganssle
> why is this dependency needed?  The tzdata package is marked with Priority
> required. See Debian policy 2.5.
>
> Matthias

Not to put words in Barry's mouth, but I took Barry's comment to be more
an answer to the question of how to contact "distro packagers" as a
group, more than he was taking a position about this particular issue.

That said, as I mentioned on the launchpad
<https://bugs.launchpad.net/ubuntu/+source/python3.9/+bug/1904271/comments/4>
and elsewhere in this thread, I think that, in general, it makes sense
to have an explicit dependency on tzdata as a clear indicator that we
actually have a direct dependency on tzdata, even if it's not
technically required because it's satisfied automatically for some
reason. It would make it easier for people to find packages that depend
on tzdata to notify us of issues (I've done something like this before —
`tzlocal` is planning to make a backwards-incompatible change to using
`zoneinfo`, and when I found out about this I dug up all the projects I
could find that explicitly depend on `tzlocal` to notify them of the
upcoming change).

It also is less fragile. As Christian mentioned in the launchpad issue,
zoneinfo is actually broken without a direct dependency on tzdata in the
ubuntu docker container — possibly that's a bug in the way the ubuntu
docker container is packaged, but it wouldn't be a problem for /us/ if
python had a dependency on tzdata. As far as I can tell it also wouldn't
hurt anything for this dependency to be made explicit rather than implicit.

That said — it's entirely up to individual package maintainers how they
want to handle their dependency metadata. The important thing I wanted
to convey to distro maintainers is that PEP 615 effectively is saying
that Python now depends on the system metadata package being present. If
there's no need to do anything because tzdata is an implicit dependency
for the entire Debian ecosystem then that's fine.

Best,
Paul

On 11/16/20 2:15 PM, Matthias Klose wrote:
> On 11/16/20 6:46 PM, Barry Warsaw wrote:
>> That’s what I was going to suggest.  I’m not doing any Debian or Ubuntu work 
>> these days, but Matthias Klose should be watching both lists, and should be 
>> able to handle the Debuntu stack.
>
>> -Barry
>>
>>> On Nov 16, 2020, at 07:45, Miro Hrončok  wrote:
>>>
>>> On 11/16/20 4:10 PM, Paul Ganssle wrote:
>>>>> Maybe it would make sense to create a community mailing list for
>>>>> packagers?
>>>> That sounds like a good idea to me.
>>> I am following the Linux SIG mailing list. But it's mostly dead.
>>>
>>> https://mail.python.org/archives/list/linux-...@python.org/
>>>
>>> --
>>> Miro Hrončok
>>> --
>>> Phone: +420777974800
>>> IRC: mhroncok
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CYYMZFSJ4IAI3FDJCFNIMFWJLXFNDXJ5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Distro packagers: PEP 615 and the tzdata dependency

2020-11-16 Thread Paul Ganssle
On 11/16/20 9:57 AM, Filipe Laíns wrote:
> In Arch Linux, tzdata is a dependency of glibc, which is part of the
> base[1] package that should be present on every installation. So, there
> is no action necessary :)
> We could make it an explicit dependency, but that is not necessarily
> required; it is up to the maintainer, whom I have notified.

I opened a bug on the tracker including a patch:
https://bugs.archlinux.org/task/68642?project=1=python

I do think it makes sense to make it a direct dependency, since I
personally worry that transitive dependencies may be fragile (and even
if the transitive dependency is rock-solid, it doesn't hurt anything for
it to exist — plus it gives you metadata about what projects are using
what packages, for purposes of notifying or looking for issues). That
said, it's obviously up to y'all.

There's actually another thing that is probably of interest to distro
packagers, which is that zoneinfo includes a compile-time configuration
option:
https://docs.python.org/3/library/zoneinfo.html#zoneinfo-data-compile-time-config

By default we don't know where the zoneinfo will be deployed, so we use
a set of locations where it's commonly deployed. Since distro packagers
know exactly where it is deployed, they can use the `--with-tzpath`
configuration option to specify it at build time, to make the lookup
that much faster and more accurate.
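For example, the configured search path is visible at runtime as zoneinfo.TZPATH, and can also be overridden without rebuilding via the PYTHONTZPATH environment variable or zoneinfo.reset_tzpath():

```python
import zoneinfo

# TZPATH is the tuple of directories baked in via `configure --with-tzpath`
# (and overridable at runtime with PYTHONTZPATH or zoneinfo.reset_tzpath()).
assert isinstance(zoneinfo.TZPATH, tuple)
for path in zoneinfo.TZPATH:
    print(path)  # commonly /usr/share/zoneinfo on Linux distros
```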

> I would say the best approach to reach distro packages is to open a bug
> in their issue tracker, or to reach them via mailing list. Some of them
> have specific mailing lists for Python, Fedora has python-devel[2].

Yeah, I was mainly looking for a way to contact all of them at once,
since it doesn't scale very well for me to open bugs on the trackers of
every distributor. I can open bugs for my distro and distros that I care
about, but even that can be pretty duplicative.

> Maybe it would make sense to create a community mailing list for
> packagers?
That sounds like a good idea to me.
> I would also suggest to in the future maybe have a "For packagers"
> section, or similar, in the release notes. I don't think right now
> there is any reasonable way packagers could have known about this new
> dependency unless they are involved with the Python upstream. All the
> current changelog does is link to PEP 615, where that information is
> present, but it has no mention of new dependencies.
This also seems like a good idea to me (though I'm not entirely sure how
often packagers are impacted by changes to CPython — zoneinfo may have
been an unusual case).

Best,
Paul



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/74KJJM6ZGHVV2RADCTZLVAKY7VO5KIG2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Distro packagers: PEP 615 and the tzdata dependency

2020-11-16 Thread Paul Ganssle
Hi all,

One part of PEP 615 (IANA time zones) that I expected to be easily
overlooked (a point about which I seem to have been right) is the
recommendation that starting with Python 3.9, distros should add a
dependency on the system time zone data to the Python package. Quoting
the PEP:

> Python distributors are encouraged to ensure that time zone data is
> installed alongside Python whenever possible (e.g. by declaring tzdata
> as a dependency for the python package).

https://www.python.org/dev/peps/pep-0615/#system-time-zone-information

I am not sure what is the best way to reach the largest number of distro
packagers to suggest that they add a dependency on tzdata for Python
3.9+, so I figured I'd send a message here. If you know a distro
packager who does not follow this list, or you use a distro that doesn't
have a dependency on tzdata for Python 3.9, please forward this e-mail
to them and/or open a bug on their issue tracker.

So far I know conda-forge got the dependency right from the first Python
3.9 release, Fedora has plans to fix this, and Christian Heimes has
opened a bug on the Ubuntu launchpad for this. I will figure out how
best to notify Arch Linux and do that (since
that's the distro I use).

I suspect this will be most important for distros that are heavily used
in containers, since tzdata is a common enough dependency that it's
likely to be satisfied accidentally in most full-featured user
environments anyway.

Thanks!

Paul

P.S. If there's a mailing list or other community for Python distro
packagers, please let me know, because I often find that we're making
decisions that affect them that they don't hear about until way down the
line. I'd love to have an easy way to keep them in the loop.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PYXET7BHSETUJHSLFREM5TDZZXDTDTLY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 632: Deprecate distutils module

2020-09-04 Thread Paul Ganssle

On 9/4/20 12:45 PM, Stefan Krah wrote:
> Since distutils does not change, why remove it? It is a lot of work
> for people with little gain.

If we don't remove it, we should at least freeze the bug component so
that people cannot report bugs in distutils. Triaging these bugs alone
is a decent amount of work. We should probably also set up a Bedevere to
auto-reject PRs that touch distutils files (since telling people that
distutils is frozen and no longer maintained is effort as well), and
disable distutils in the CI so that it does not generate work for people
maintaining the buildbots.

> I'd really like to build C extensions without downloading an external
> package.
How often do you actually build extensions without building or
installing external packages? You don't use `pip install` or PEP 517
builds? Just legacy build and installs? Do you not build or release
wheels (which requires the `wheel` package)? Are you planning to upload
artifacts to PyPI — if so, won't you need an external package (or at
least a maintained package that can keep up with the APIs)? Before we
deprecated and removed it in setuptools, setup.py upload was causing
problems with the metadata it uploaded — we may need to ban
distutils-created packages from PyPI in order to keep PyPI going).
> Features like C++ support have not been worked on for more than a
> decade.  Are the setuptools maintainers planning to address these
> issues now?
>
Considering that we /aren't/ adding anything to distutils today, the
chances of this happening in setuptools are pretty much strictly better
than in distutils.
>
>> * Modules/_decimal/tests/formathelper.py
> elif find_executable('locale'):
> locale_list = subprocess.Popen(["locale", "-a"],
>   stdout=subprocess.PIPE).communicate()[0]
>
> One of the many things that just work out of the box.  -10 on removing
> distutils from the stdlib.  Freezing it is fine.
>
>
>
> Stefan Krah
>
>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/6LQN5OAJQSEJS6YHRZNK4QORJXCHLPHA/
> Code of Conduct: http://python.org/psf/codeofconduct/


signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/C6UWPNN2YLN5RRT54DWETQLY3VJVOSY5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 632: Deprecate distutils module

2020-09-04 Thread Paul Ganssle

On 9/4/20 12:22 PM, Paul Moore wrote:
> I believe that's correct. My main concern here is that removing
> distutils from the stdlib won't have made that problem any better, it
> will simply have transferred the responsibility onto the setuptools
> maintainers. While I assume that they are comfortable with that, I
> suspect they may take a different position on backwards compatibility
> than core Python does (not least because one copy of setuptools has to
> support multiple python versions, including alternative versions like
> PyPy, whereas the stdlib copy only needs to handle the version of
> Python it's distributed with).
I think that it's /basically/ true that this move does transfer that
responsibility over to setuptools, and I'm pretty sure that this is
effectively handing over a big deprecation to deprecation specialists,
since — at least as long as I've been involved with it — the process of
maintaining setuptools is dominated by deprecating things (we do
bugfixes and add features as well, of course, but there's a lot more to
deprecate than in your typical project).

That said, there are two major advantages to moving distutils into
setuptools as the first step in making these "backwards-incompatible"
changes (moving distutils into PyPI has similar advantages):

1. Deprecation notices start going out to /all/ supported versions of
Python /immediately/ if they are using setuptools, making it easier to
get the ecosystem to move together.
2. The fact that setuptools supports many versions of Python decouples
the upgrade cycle of setuptools from the upgrade cycle of Python. Users
can opt-in to new features by pinning a minimum version of setuptools
(allowing them to take on migrations /without/ needing to upgrade their
Python version), and if setuptools removes a feature they are counting
on, they can pin a maximum version of setuptools mostly without
disrupting their ability to upgrade Python versions. Related to this,
setuptools can (and does) do more frequent updates, so it's easier to
make a quick release undoing major breaking changes (or adding a new
feature to work around them, etc).
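To make point 2 concrete, that pinning happens in a project's build
requirements; a hypothetical sketch (the version bounds here are
illustrative, not recommendations):

```toml
# pyproject.toml
[build-system]
# Opt in to a newer setuptools feature without upgrading Python, or cap
# setuptools below a release that removed a feature you still rely on.
requires = ["setuptools>=50,<60", "wheel"]
build-backend = "setuptools.build_meta"
```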
> I think the arguments in favour of this PEP from CPython's point of
> view are fairly strong, but the arguments from the point of view of
> the wider Python ecosystem are much less clear.

I mostly agree that this is more useful for the people maintaining
setuptools and distutils than it is for consumers of these packages,
though that's not necessarily a bad thing. The downside is that we're
going to make a bunch of breaking changes over the next few years
(hopefully well-documented and with clear migration paths). The upside
is that it will be easier for people to reap the benefits of the work
we're doing to improve the packaging ecosystem (standardized build
artifacts, bugfixes, etc).

> Paul
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/QRCA2AHOGZMQS45DANW4UGA63WTMJVQ6/
> Code of Conduct: http://python.org/psf/codeofconduct/


signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/T7WR2LH4KSRN2PJE5C6Q35CVMLQ6MLFY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Function suggestion: itertools.one()

2020-07-27 Thread Paul Ganssle
I would think that you can use tuple unpacking for this?

    single_element, = some_iterator

This will attempt to unpack some_iterator into single_element and fail
if some_iterator doesn't yield exactly one element. It also has the
added benefit that it works for any number of elements:

one, two = some_iterator

Example:

>>> a = [1]
>>> one, = a
>>> one
1
>>> b = [1, 2]
>>> one, = b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack (expected 1)
>>> c = []
>>> one, = c
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not enough values to unpack (expected 1, got 0)

Best,
Paul

On 7/27/20 3:19 PM, Rollo Konig-Brock wrote:
> I second this as being useful.
>
> However the “pythonic” way (whatever that means nowadays) is to do a
> for break else loop, which I think is kinda difficult to read as you
> need to make a few assumptions.
>
> Rollo
>
>> On 27 Jul 2020, at 20:06, Noam Yorav-Raphael  wrote:
>>
>> 
>> Hi,
>>
>> There's a simple function that I use many times, and I think may be a
>> good fit to be added to itertools. A function that gets an iterator,
>> and if it has exactly one element returns it, and otherwise raises an
>> exception. This is very useful for cases where I do some sort of
>> query that I expect to get exactly one result, and I want an
>> exception to be raised if I'm wrong. For example:
>>
>> jack = one(p for p in people if p.id == '1234')
>>
>> sqlalchemy already has such a function for
>> queries: 
>> https://docs.sqlalchemy.org/en/13/orm/query.html#sqlalchemy.orm.query.Query.one
>>
>> This is my implementation:
>>
>> def one(iterable):
>>     it = iter(iterable)
>>     try:
>>         r = next(it)
>>     except StopIteration:
>>         raise ValueError("Iterator is empty")
>>     try:
>>         next(it)
>>     except StopIteration:
>>         return r
>>     else:
>>         raise ValueError("Iterator has more than one item")
>>
>> What do you think?
>>
>> Thanks,
>> Noam
>> ___
>> Python-Dev mailing list -- python-dev@python.org
>> To unsubscribe send an email to python-dev-le...@python.org
>> https://mail.python.org/mailman3/lists/python-dev.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-dev@python.org/message/D52MPKLIN4VEXBOCKVMTWAK66MAOEINY/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/3NA4E3QCTSLGIQXAMVWUL66TK6O7ZHLS/
> Code of Conduct: http://python.org/psf/codeofconduct/


signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZLCCVBRXWAUDYQ3K42QOMLDERWMZMHI2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] datetime module refactoring: folder vs parallel private modules

2020-07-20 Thread Paul Ganssle
Hi all,

I was hoping to get some feedback on a proposed refactoring of the
datetime module that should dramatically improve import performance.

The datetime module is implemented more or less in full both in pure
Python and in C. The way this is currently achieved is that the pure
Python implementation is defined in datetime.py, and the C
implementation is in _datetime; /after/ the full Python version is
defined, the C version is star-imported, and thus any symbols defined in
both versions are taken from the C version. If the C version is used,
any private symbols used only in the pure Python implementation are
manually deleted (see the end of datetime.py).

This adds a lot of unnecessary overhead, both to define a bunch of
unused classes and functions and to import modules that are required for
the pure Python implementation but not for the C implementation. In the
issue he created about this, Victor
Stinner demonstrated that moving the pure Python implementation to its
own module would speed up the import of datetime by a factor of 4.

I think that we should indeed move the pure Python implementation into
its own module, despite the fact that this is almost guaranteed to break
some people either relying on implementation details or doing something
funky with the import system — I don't think it should break anyone
relying on the guaranteed public interface. The issue at hand is that we
have two options available for the refactoring: either move the pure
Python implementation to its own private top-level module (single file)
such as `_pydatetime`, or make `datetime` a folder with an `__init__.py`
and move the pure Python implementation to `datetime._pydatetime` or
something of that nature.

The decimal and zoneinfo modules both have this same issue; the decimal
module uses the first strategy with _pydecimal and decimal, the zoneinfo
module uses a folder with a zoneinfo._zoneinfo submodule. Assuming we go
forward with this, we need to decide which strategy to adopt for datetime.

In favor of using a datetime/ folder, I'd say it's cleaner to put the
pure Python implementation of datetime under the datetime namespace, and
also it gives us more freedom to play with the module's structure in the
future, since we could have lazily-imported sub-components, or we could
implement some logic common to both implementations in Python and import
it from a `datetime._common` module without requiring the C version to
import the entire Python version, similar to the way zoneinfo has the
zoneinfo._common module.

The downside of the folder method is that it complicates the way
datetime is imported — /especially/ if we add additional structure to
the module, or add any logic into the __init__.py. Two single-file
modules side-by-side, one imported by the other doesn't change anything
about the nature of how the datetime module is imported, and is much
less likely to break anything.

Anyone have thoughts or strong preferences here? Anyone have use cases
where one or the other approaches is likely to cause a bunch of undue
hardship? I'd like to avoid moving this more than once.

Best,
Paul

P.S. Victor's PR moving this code to _pydatetime is currently done in such
a way that the ability to backport changes from post-refactoring to
pre-refactoring branches is preserved; I have not checked but I /think/
we should be able to do the same thing with the other strategy as well.



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CCI7PDAL6G67XVVRKPP2FAYJ5YZYHTK3/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Flexible assignment targets

2020-07-02 Thread Paul Ganssle
I think that creating a "matching assignment" operator is unnecessary at
this point. I think the original point of bringing this up as part of
PEP 622 is to try to suggest that the syntax for binding a value not be
incompatible with a future version of Python where that same syntax can
be used for any kind of assignment. That goal is not satisfied for all
cases if `case self.x` means anything except "bind the value to `self.x`".

I think that you /could/ probably still use $, ? or <> to mark a
variable to be bound, but it would /not/ be worth the effort to make it
mandatory for lvalues in general, and if you make it optional I imagine
it would be rarely used, and you'd get effectively no benefit from
supporting that (since people would just be confused whenever they saw it).

I think that leaves as /realistic/ options here to either abandon the
idea of marking read vs. store, put the marker on variables to be read
(and have it be something other than "there is a . anywhere in the
expression"), or abandon the goal of allowing for perfect symmetry
between lvalues in case statements and lvalues in assignments.

I tend to think "mark all reads" is the best course of action here, and
stuff like `case self.x` would be a `SyntaxError` (like it is with
assignment expressions).

On 7/2/20 12:26 PM, MRAB wrote:
> On 2020-07-02 15:48, Jim J. Jewett wrote:
>> Guido van Rossum wrote:
>>> On Wed, Jul 1, 2020 at 5:50 PM Nick Coghlan ncogh...@gmail.com wrote:
>>> > The key thing I'm hoping for in PEP 622 itself is
>>> > that "Syntactic compatibility with a possible future
>>> > enhancement to assignment statements" be considered
>>> > as a constraint on the syntax for case patterns.
>>
>>> That would certainly rule out ideas like writing stores as $x or x?
>>> or 
>>> etc., since it would be syntactically incompatible with current
>>> assignment statements.
>>
>> No; it would be unfortunate that it creates a second way to
>> do things, but it wouldn't rule them out.  The problem Nick
>> pointed out is for syntax that is already meaningful, but
>> means something different.
>>
>>  self.y = 15
>>
>> already has a meaning, but that meaning is NOT "don't really
>> assign to X, I am using it as a constant defined elsewhere."
>>
>>  ?x = 14
>>  ?self.y = 15
>>
>> do not yet mean anything, and if they end up being a more
>> explicit (but also more verbose) variant of
>>
>>  x = 14
>>  self.y = 15
>>
>> that is probably sub-optimal, but it isn't any worse than :=
>>
>> The slight variation triggered by the "?" of ?var would be
>> shorthand for "and if you can't make the entire assignment
>> work, pretend I never even asked", so that
>>
>>  ?x, 0 = (4,5)
>>
>> would not lose or shadow a previous binding of x.
>>   
> IMHO, the assignment statement should remain as it is, not sometimes
> assign and sometimes not.
>
> There could be another form that does matching:
>
>     try ?x, 0 = (4,5)
>
> or:
>
>     ?x, 0 ?= (4,5)
>
> Perhaps it could also be used as an expression, having the value True
> if it matches and False if it doesn't.
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/MQV7WBASYRI7PJT5M2VUCPHKBZLXDMY2/
> Code of Conduct: http://python.org/psf/codeofconduct/


signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XCHCK7ZXS3OEKOLRGJGS7USFXGBFSIBT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 618: Add Optional Length-Checking To zip

2020-05-15 Thread Paul Ganssle
I'm on the fence about using a separate function vs. a keyword argument
(I think there is merit to both), but one thing to note about the
separate function suggestion is that it makes it easier to write
backwards compatible code that doesn't rely on version checking. With
`itertools.zip_strict`, you can do some graceful degradation like so:

try:
    from itertools import zip_strict
except ImportError:
    zip_strict = zip

Or provide a fallback easily:

    try:
        from itertools import zip_strict
    except ImportError:
        _SENTINEL = object()

        def zip_strict(*args):
            # Advance all inputs in lockstep via explicit iterators, and
            # compare against a sentinel rather than testing truthiness,
            # so falsy leftover values like 0 are detected correctly.
            iterators = [iter(arg) for arg in args]
            while True:
                items = [next(it, _SENTINEL) for it in iterators]
                if all(item is _SENTINEL for item in items):
                    return
                if any(item is _SENTINEL for item in items):
                    raise ValueError("At least one input terminated early.")
                yield tuple(items)

There's an alternate pattern for the kwarg-only approach, which is to
just try it and see:

try:
    zip(strict=True)
    HAS_ZIP_STRICT = True
except TypeError:
    HAS_ZIP_STRICT = False

But I would say it's considerably less idiomatic.

Just food for thought here. In the long run this doesn't matter, because
eventually 3.9 will fall out of everyone's support matrices and these
workarounds will become obsolete anyway.

Best,
Paul

On 5/15/20 5:20 AM, Stephen J. Turnbull wrote:
> Brandt Bucher writes:
>
>  > Still agreed. But I think they would be *better* served by the
>  > proposed keyword argument.
>  > 
>  > This whole sub-thread of discussion has left me very confused. Was
>  > anything unclear in the PEP's phrasing here?
>
> I thought it was quite clear.  Those of us who disagree simply
> disagree.  We prefer to provide it as a separate function.  Just move
> on, please; you're not going to convince us, and we're not going to
> convince you.  Leave it to the PEP Delegate or Steering Council.
>
>  > I wouldn't confuse "can" and "should" here.
>
> You do exactly that in arguing for your preferred design, though.
>
> We could implement the strictness test with an argument to the zip
> builtin function, but I don't think we should.  I still can't think of
> a concrete use case for it from my own experience.  Of course I
> believe concrete use cases exist, but that introspection makes me
> suspicious of the claim that this should be a builtin feature, with
> what is to my taste an ugly API.
>
> Again, I don't expect to convince you, and you shouldn't expect to
> convince me, at least not without more concrete and persuasive use
> cases than I've seen so far.
>
> Steve
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/6NQZIDVMGPXA5QJWTKWJFZUUUAYQAOH4/
> Code of Conduct: http://python.org/psf/codeofconduct/



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/A4UGQRMKUZDBHEE4AFJ4PL6AUUTAPF7N/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] PEP 615 (zoneinfo) implementation ready for review

2020-05-08 Thread Paul Ganssle
Hey all,

The feature freeze is coming up on us fast, and the PEP 615
implementation is more or less ready to be integrated into the standard
library (may need one or two little tweaks, but it's well past the
"minimum viable product" stage).

Normally I'd wait longer for someone to volunteer for the task of
reviewing, but given the somewhat tight timeline and the fact that the
code and tests alone (not including the documentation) are 6000 lines, I
figured it's better to give people a head's up that I'm looking for
reviewers. I've already had a few reviews on this code when it was first
merged to the reference implementation, but there's also a decent chunk
of otherwise unreviewed code.

- The implementation and tests are in PR #19909:
https://github.com/python/cpython/pull/19909
- The documentation and What's New entry is in #20006:
https://github.com/python/cpython/pull/20006

There's also one other feature that I did not originally include in the
PEP but which I think is a reasonable feature request that we are likely
to get, which is a way to list all the time zones available on the
system. The reference implementation includes an implementation for that
feature in the property test suite, and it would be easy for me to port
it over; I'll do that if there are no objections before the feature
freeze. I've opened this BPO issue to track the discussion:
https://bugs.python.org/issue40536

Thanks!

Paul




signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5PBRWEDMKTVR2G7GBXL7HKRLQ73RVNFG/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Issues with import_fresh_module

2020-05-06 Thread Paul Ganssle
No worries, I actually seem to have solved the immediately pressing
problem that was blocking PEP 615 by doing this:

@functools.lru_cache(1)
def get_modules():
    import zoneinfo as c_module
    py_module = import_fresh_module("zoneinfo", blocked=["_czoneinfo"])

    return py_module, c_module

I'll have to dig in to figure out exactly /why/ that works, and why it
/doesn't/ work in the reference implementation (which has the C
implementation living at `zoneinfo._czoneinfo` instead of at
`_czoneinfo`), and hopefully that will shed some light on the other
issues. For the moment I've got something that appears to work and a
suggestive pattern of behavior as to why it wasn't working, so that
actually seems like it will help me solve my short term goal of getting
zoneinfo merged ASAP and my long term goal of ensuring that the tests
are robust.

Thanks!
Paul

On 5/6/20 3:55 PM, Brett Cannon wrote:
> I'm drowning in work this month, so if you need me to look at something then 
> I unfortunately need a point-blank link of what you want me to look at with a 
> targeted question.
>
> As for import_fresh_module() not being robust: of course it isn't because 
> it's mucking with import stuff in a very non-standard way.  All it's doing 
> is an import and clearing the module from sys.modules. The extras it provides 
> is to shove None into sys.modules to trigger an ImportError and so you can 
> block any acceleration module from being imported and to forcibly use the 
> Python code. That's it.
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/VNQJBFHIEZLY6C5HNV5A6TNIWI7VAMOW/
> Code of Conduct: http://python.org/psf/codeofconduct/


signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2RRWGNT2WRFG5OYLXLDKQGJDDZT456KE/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Issues with import_fresh_module

2020-05-06 Thread Paul Ganssle
Thanks for the suggestion.

I think I tried something similar for tests that involved an environment
variable and found that it doesn't play nicely with coverage.py /at all/.

Also, I will have to solve this problem at some point anyway because the
property tests for the module (not currently included in the PR) include
tests that have the C and pure Python version running side-by-side,
which would be hard to achieve with subinterpreters.

On 5/6/20 4:51 PM, Nathaniel Smith wrote:
> On Wed, May 6, 2020 at 7:52 AM Paul Ganssle  wrote:
>> As part of PEP 399, an idiom for testing both C and pure Python versions of 
>> a library is suggested making use of import_fresh_module.
>>
>> Unfortunately, I'm finding that this is not amazingly robust. We have this 
>> issue: https://bugs.python.org/issue40058, where the tester for datetime 
>> needs to do some funky manipulations to the state of sys.modules for reasons 
>> that are now somewhat unclear, and still sys.modules is apparently left in a 
>> bad state.
>>
>> When implementing PEP 615, I ran into similar issues and found it very 
>> difficult to get two independent instances of the same module – one with the 
>> C extension blocked and one with it intact. I ended up manually importing 
>> the C and Python extensions and grafting them onto two "fresh" imports with 
>> nothing blocked.
> When I've had to deal with similar issues in the past, I've given up
> on messing with sys.modules and just had one test spawn a subprocess
> to do the import+run the actual tests. It's a big hammer, but the nice
> thing about big hammers is that there's no subtle issues, either they
> smash the thing or they don't.
>
> But, I don't know how awkward that would be to fit into Python's
> unittest system, if you have lots of tests you need to run this way.
>
> -n
>


signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/H4TWK574BEUDVY4MGTSFJ5OKD4OVOWZZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Issues with import_fresh_module

2020-05-06 Thread Paul Ganssle
As part of PEP 399 , an idiom
for testing both C and pure Python versions of a library is suggested
making use of import_fresh_module.
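For reference, the PEP 399 idiom looks roughly like this (a sketch; the
helper moved to test.support.import_helper in 3.10, so both locations
are tried):

```python
try:
    from test.support.import_helper import import_fresh_module  # 3.10+
except ImportError:
    from test.support import import_fresh_module  # earlier layouts

# Two "fresh" copies of json: one with the _json accelerator blocked
# (forcing the pure Python code paths), one with it freshly imported.
py_json = import_fresh_module("json", blocked=["_json"])
c_json = import_fresh_module("json", fresh=["_json"])

print(py_json.loads('{"a": 1}') == c_json.loads('{"a": 1}'))
```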

Unfortunately, I'm finding that this is not amazingly robust. We have
this issue: https://bugs.python.org/issue40058, where the tester for
datetime needs to do some funky manipulations
to
the state of sys.modules for reasons that are now somewhat unclear, and
still sys.modules is apparently left in a bad state.

When implementing PEP 615, I ran into similar issues and found it very
difficult to get two independent instances of the same module – one with
the C extension blocked and one with it intact. I ended up manually
importing the C and Python extensions and grafting them onto two "fresh"
imports with nothing blocked.

This seems to work most of the time in my repo, but when I import it
into CPython, I'm now seeing failures due to this issue. The immediate
symptom is that assertRaises is seeing a mismatch between the exception
raised by the module and the exception *on* the module. Here's the
Travis error

(ignore the part about `tzdata`, that needs to be removed as
misleading), and here's the test
.
Evidently calling module.ZoneInfo("Bad_Zone") is raising a different
module's ZoneInfoNotFoundError in some cases and I have no idea why.

Is anyone familiar more familiar with the import system willing to take
a look at these issues?

Thanks,
Paul


signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6H4E3XDPU4YU4HZEEOBB4RL6ZQMC57YG/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Moving tzdata into the python organization

2020-05-03 Thread Paul Ganssle
All right, I've got one yay and zero nays, so I've moved this over to:
https://github.com/python/tzdata

On 4/29/20 12:48 PM, Barry Warsaw wrote:
> No objections here.
>
> -Barry
>
>> On Apr 29, 2020, at 06:05, Paul Ganssle  wrote:
>>
>> Signed PGP part
>> Hi all,
>>
>> PEP 615 specifies that we will add a first party tzdata package. I created 
>> this package at pganssle/tzdata, but I believe the final home for it should 
>> be python/tzdata. Are there any objections to me moving it into the python 
>> org as is?
>>
>> Thanks!
>> Paul
>>
>>
>>



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MJL4GASHQSLERVICMUGOHYGSORJ2SVZ7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Adding a "call_once" decorator to functools

2020-04-30 Thread Paul Ganssle
On 4/30/20 4:47 PM, raymond.hettin...@gmail.com wrote:
> Would either of the existing solutions work for you?
>
> class X:
> def __init__(self, name):
> self.name = name
>
> @cached_property
> def title(self):
>   print("compute title once")
>   return self.name.title()
>
> @property
> @lru_cache
> def upper(self):
>   print("compute uppper once")
>   return self.name.upper()

The second one seems a bit dangerous in that it will erroneously keep
objects alive until they are either ejected from the cache or until the
class itself is collected (plus only 128 objects would be in the cache
at one time): https://bugs.python.org/issue19859
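The keep-alive behavior is easy to demonstrate (a minimal sketch of the
issue, with a hypothetical class name):

```python
import weakref
from functools import lru_cache


class Widget:
    @property
    @lru_cache(maxsize=None)
    def upper(self):
        return "WIDGET"


w = Widget()
assert w.upper == "WIDGET"  # the cache now holds a strong reference to w

ref = weakref.ref(w)
del w
# The instance is still alive: it is part of a key in the lru_cache,
# which lives on the class and thus outlives any one instance.
print(ref() is not None)
```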

> Thanks for the concrete example.  AFAICT, it doesn't require (and probably 
> shouldn't have) a lock to be held for the duration of the call.  Would it be 
> fair to say the 100% of your needs would be met if we just added this to the 
> functools module?
>
>   call_once = lru_cache(maxsize=None)
I am -0 on adding `call_once = lru_cache(maxsize=None)` here. I feel
like it could be misleading in that people might think that it ensures
that the function is called exactly once (it reminds me of the FnOnce
 trait in Rust),
and all it buys us is a nice way to advertise "here's a use case for
lru_cache".

That said, in any of the times I've had one of these "call exactly one
time" situations, the biggest constraint I've had is that I always
wanted the return value to be the same object so that `f(x) is f(x)`,
but I've never had a situation where it was /required/ that the function
be called exactly once, so I rarely if ever have bothered to get that
property.
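That identity property falls out of lru_cache for free (a small sketch
with an illustrative function name):

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def make_config(name):
    # The dict is built once per distinct argument; subsequent calls
    # return the very same cached object, so f(x) is f(x).
    return {"name": name}


a = make_config("x")
b = make_config("x")
print(a is b)
```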

I suppose I could imagine a situation where calling the function mutates
or consumes an object as part of the call, like:

class LazyList:
    def __init__(self, some_iterator):
    self._iter = some_iterator
    self._list = None

    @call_once
    def as_list(self):
    self._list = list(self._iter)
    return self._list

But I think it's just speculation to imagine anyone needs that or would
find it useful, so I'm in favor of waiting for someone to chime in with
a concrete use case where this property would be valuable.

Best,
Paul



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BXOKFCVUNFJZBLMOMUYNLH4K4HFYBSYX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Moving tzdata into the python organization

2020-04-29 Thread Paul Ganssle
Hi all,

PEP 615 specifies that we will add a first-party tzdata package. I
created this package at pganssle/tzdata, but I believe the final home for
it should be python/tzdata. Are there any objections to me moving it
into the python org as is?

Thanks!
Paul



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UGCDORJB4662XZ7U7CW2K7KLLC3NXYW6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: killing static types (for sub-interpreters?)

2020-04-28 Thread Paul Ganssle
I don't know the answer to this, but what are some examples of objects
where you never change the refcount? Are these Python objects? If so,
wouldn't doing something like adding the object to a list necessarily
change its refcount, since the list implementation only knows, "I have a
reference to this object, I must increase the reference count", and it
doesn't know that the object doesn't need its reference count changed?
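
To the point about containers: in CPython (and this is very much an
implementation detail of CPython's reference counting), storing an
object in a list does take a new reference, which is observable via
sys.getrefcount:

```python
import sys

x = object()
before = sys.getrefcount(x)  # includes the temporary reference made by the call itself

container = [x]              # the list takes its own reference to x
after = sys.getrefcount(x)

assert after == before + 1   # storing x in the list bumped its refcount
```

So any scheme that freezes a refcount has to either special-case such
objects in every container operation or make the count writes harmless.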

Best,
Paul

On 4/28/20 2:38 PM, Jim J. Jewett wrote:
> Why do sub-interpreters require (separate and) heap-allocated types?  
>
> It seems types that are statically allocated are a pretty good use for 
> immortal objects, where you never change the refcount ... and then I don't 
> see why you need more than one copy.
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/S674C2BJ7NHKB3SOJF4VFRXVNQDNSCHP/
> Code of Conduct: http://python.org/psf/codeofconduct/



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/J64VIJPXBCR7DQPDFSZWTLRVIYGCXYPF/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Further PEP 615 Discussion: Equality and hash of ZoneInfo

2020-04-20 Thread Paul Ganssle
> In every use-case that I've ever had, and every one that I can
> imagine, I've not cared about the difference between "US/Eastern" and
> "America/New_York". In fact, if ZoneInfo("US/Eastern") returned
> something that had a name of "America/New_York", I would be fine with
> that. Similarly, Australia/Melbourne and Australia/Sydney are, to my
> knowledge, equivalent. (If I'm wrong on my own country's history of
> timezones, then I apologize and withdraw the example, but I'm talking
> about cases where you absolutely cannot tell the difference based on
> the displayed time.) Having those compare equal would be convenient.

I tend to agree, but there's a minor complication in that there is not,
as far as I can tell, an easy cross-platform way to determine the
"canonical" zone name, and normalizing America/New_York to the
deprecated US/Eastern would be bad, so we really don't want to do that
(in fact, this happens with the way that dateutil.zoneinfo stores its
time zones, and has been rather irksome to me). The key is exposed as
part of the public API, because it's useful for serializing the zone
between languages, e.g. if you want to send an aware datetime as JSON,
you probably want something that looks something like: {"datetime":
"2020-05-01T03:04:01", "zone": "America/New_York"}.
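
A sketch of that serialization (the to_json helper is hypothetical; the
key is passed in explicitly here, though in practice it would come from
`dt.tzinfo.key` on a ZoneInfo instance, so that this sketch needs no tz
database):

```python
import json
from datetime import datetime

def to_json(dt, key):
    # Serialize an aware datetime as wall time plus the IANA key,
    # which is enough for another language to reconstruct the zone.
    return json.dumps({"datetime": dt.isoformat(), "zone": key})

payload = to_json(datetime(2020, 5, 1, 3, 4, 1), "America/New_York")
assert payload == '{"datetime": "2020-05-01T03:04:01", "zone": "America/New_York"}'
```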

One reason this may be a problem is that something like Asia/Vientiane
is, at the moment, a symlink to Asia/Bangkok, but Vientiane is in Laos
and Bangkok is in Thailand - if time in Laos changes relative to
Asia/Bangkok, Asia/Vientiane will stop being a link, but if we normalize
"Asia/Vientiane" to "Asia/Bangkok" on systems with sufficiently old time
zone data, we may lose that information on deserialization.

Of course, I do not consider this to be a major problem (any more than
the whole idea of stable keys over time is a somewhat fragile
abstraction), because if, for example, Massachusetts were to go to
"permanent daylight saving time" (i.e. year-round Atlantic Standard
Time), a new America/Boston zone would be created, and all the
Bostonians who have been using America/New_York would be in much the
same situation, but it's just one thing that gives me pause about
efforts to normalize links.

> I don't think it's a problem to have equivalent objects unable to
> coexist in a set. That's just the way sets work - len({5, 5.0}) is 1,
> not 2.

I mostly agree with this, it's just that I don't have a good idea why
you'd want to put a time zone in a set in the first place, and the
notion of equivalent is relative to what you're using the object for. In
some ways two zones are not equivalent unless they are the same object,
e.g.:

dt0 = datetime(2020, 4, 1, tzinfo=zi0)
dt1 = datetime(2020, 1, 1, tzinfo=zi1)
dt1 - dt0

If we assume that zi0 and zi1 are both "America/New_York" zones, the
result depends on whether or not they are the same object. If both zi0
and zi1 are ZoneInfo("America/New_York"), then the result is one thing;
if one or more of them was constructed with
ZoneInfo.no_cache("America/New_York"), it's a different one. The results
of the `.tzname()`, `.utcoffset()` and `.dst()` calls are the same no
matter what, though.
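
The identity dependence comes from how aware datetime subtraction works:
if both operands share the *same* tzinfo object, the difference is pure
wall-clock arithmetic; otherwise each side's UTC offset is applied. A
toy tzinfo (a simplified stand-in for a DST-observing zone, not ZoneInfo
itself) makes this observable:

```python
from datetime import datetime, timedelta, tzinfo

class ToyZone(tzinfo):
    # Hypothetical zone: UTC-5 before March, UTC-4 from March onward.
    def utcoffset(self, dt):
        return timedelta(hours=-5) if dt.month < 3 else timedelta(hours=-4)
    def dst(self, dt):
        return timedelta(0)
    def tzname(self, dt):
        return "TOY"

zi_a = ToyZone()
zi_b = ToyZone()  # equivalent rules, but a distinct object

dt0 = datetime(2020, 4, 1, tzinfo=zi_a)
dt1 = datetime(2020, 1, 1, tzinfo=zi_a)        # shares dt0's tzinfo object
dt1_other = datetime(2020, 1, 1, tzinfo=zi_b)  # equivalent, different object

# Same tzinfo object: plain wall-clock difference, offsets ignored.
assert dt1 - dt0 == timedelta(days=-91)
# Different tzinfo objects: each side's UTC offset is applied.
assert dt1_other - dt0 == timedelta(days=-91, hours=1)
```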

> Since options 3 and 4 are the most expensive, I'm fine with the idea
> of a future method that would test for equivalence, rather than having
> them actually compare equal; but I'd also be fine with having
> ZoneInfo("US/Eastern") actually return the same object that
> ZoneInfo("America/New_York") returns. For the equality comparison, I
> would be happy with proposal 2.

Do you have any actual use cases for the equality comparison? I think
proposal 2 is /reasonable/, but to the extent that anyone ever notices
the difference between proposal 1 and proposal 2, it's more likely to
cause confusion - you can always do `zi0.key == zi1.key`, but most
people will naively look at `zi0 == zi1` when debugging, without
realizing that `zi0 == zi1` isn't actually the relevant comparison
when talking about inter-zone vs. same-zone comparisons.

> ChrisA
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/JPUGSSXX2MWF3ABH3QNHXSMNVDWMRVJS/
> Code of Conduct: http://python.org/psf/codeofconduct/
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JVNW32COCOLBAKPREUSW7Z3TGFOMTJIB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Further PEP 615 Discussion: Equality and hash of ZoneInfo

2020-04-18 Thread Paul Ganssle
A few weeks ago, I submitted PEP 615 to the steering council
 for a decision.
There's been a decent amount of discussion there with some very good
questions. I think they are mostly resolved (though I'm happy to have
other people look over the logic of some of my responses there), but
Victor brought up the question of the equality and hash behavior of
ZoneInfo, and while I'm leaning one way, I could easily be convinced -
particularly if people have /real/ use cases that they are planning to
implement that would be affected by this.

I've summed up the options on the discourse thread
,
and would appreciate if anyone is able to give some feedback.

Thanks!

Paul



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JSA6LNJH6NP7JN32NMPWYNYUDAIAKFFX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 615: Support for IANA Time Zones in the Standard Library

2020-03-29 Thread Paul Ganssle
Hi all,

It seems like discussion on PEP 615 has mostly petered off. The last
remaining unresolved issue didn't get any comments, which was whether
the module should be called "zoneinfo" or put somewhere in the
"datetime" hierarchy, so I've gone ahead and moved that into "rejected
ideas" in this PR <https://github.com/python/peps/pull/1347>.

I have been sort of hoping to get this accepted on Sunday, April 5th -
specifically between 2 AM and 4 AM or between 13:00 and 17:30 UTC -
since those times represent ambiguous or imaginary times somewhere on
earth (mostly in Australia/New Zealand/Pacific Islands), so I am
planning on submitting this to the steering council as soon as possible,
in the hope that I've given them enough notice to look at it.

If anyone else was planning on commenting, please do head over to the
discourse ASAP.

Thanks!
Paul

On 3/1/20 9:25 AM, Paul Ganssle wrote:
>
> Greetings!
>
> Last year at the Language Summit, I proposed to add additional
> concrete time zones to the standard library
> <https://pyfound.blogspot.com/2019/05/paul-ganssle-time-zones-in-standard.html>
> . After much work and much more procrastination, I now have now put
> together my first proposal: support for the IANA time zone database
> <https://www.iana.org/time-zones> (also called tz, zoneinfo or the
> Olson database; Wikipedia <https://en.wikipedia.org/wiki/Tz_database>
> ). Last week, I submitted it for consideration as PEP 615
> <https://www.python.org/dev/peps/pep-0615/>.
>
> I originally posted it on the discourse last week, and advertised the
> discussion on some interest-group-specific fora (tz mailing list,
> datetime-SIG mailing list), but I think it is ready to be advertised
> to a wider forum, so I am posting it here for your consideration.
> Please direct comments to the discourse thread, so that the discussion
> can be as centralized as possible: https://discuss.python.org/t/3468
> <https://discuss.python.org/t/pep-615-support-for-the-iana-time-zone-database-in-the-standard-library/3468>.
>
> Links for easy access:
> PEP 615: https://www.python.org/dev/peps/pep-0615/
> Reference implementation: https://github.com/pganssle/zoneinfo
> tzdata repo: https://github.com/pganssle/tzdata
>
> Thanks!
> Paul
>


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UVLDWFEQAOECZ7HZTZ3CPJWAD5OTYTRQ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Paul Ganssle
I imagine that which ones depend on __getitem__ is an implementation detail.

The only methods that would be reasonably amenable to a guarantee like
"always returns the same thing as __getitem__" would be (l|r|)strip(),
split(), splitlines(), and .partition(), because they only work with
subsets of the input string.

Most of the other stuff involves constructing new strings and it's
harder to cast them in terms of other "primitive operations" since
strings are immutable.

I suspect that to the extent that the ones that /could/ be implemented
in terms of __getitem__ are returning base strings, it's either because
no one thought about doing it at the time and they used another
mechanism or it was a deliberate choice to be consistent with the other
methods.

I don't see removeprefix and removesuffix explicitly being implemented
in terms of slicing operations as a huge win - you've demonstrated that
someone who wants a persistent string subclass still would need to
override a /lot/ of methods, so two more shouldn't hurt much - I just
think that "consistent with most of the other methods" is a
/particularly/ good reason to avoid explicitly defining these operations
in terms of __getitem__. The /default/ semantics are the same (i.e. if
you don't explicitly change the return type of __getitem__, it won't
change the return type of the remove* methods), and the only difference
is that for all the /other/ methods, it's an implementation detail
whether they call __getitem__, whereas for the remove methods it would
be explicitly documented.

In my ideal world, a lot of these methods would be redefined in terms of
a small set of primitives that people writing subclasses could implement
as a protocol, which would allow methods called on subclass instances to
retain their class, but I think the time for that has passed. Still, I
don't think it would /hurt/ for new methods to be defined in terms of
what primitive operations exist where possible.

Best,
Paul

On 3/25/20 3:09 PM, Dennis Sweeney wrote:
> I was surprised by the following behavior:
>
> class MyStr(str):
> def __getitem__(self, key):
> if isinstance(key, slice) and key.start is key.stop is key.step is None:
> return self
> return type(self)(super().__getitem__(key))
>
>
> my_foo = MyStr("foo")
> MY_FOO = MyStr("FOO")
> My_Foo = MyStr("Foo")
> empty = MyStr("")
>
> assert type(my_foo.casefold()) is str
> assert type(MY_FOO.capitalize()) is str
> assert type(my_foo.center(3)) is str
> assert type(my_foo.expandtabs()) is str
> assert type(my_foo.join(())) is str
> assert type(my_foo.ljust(3)) is str
> assert type(my_foo.lower()) is str
> assert type(my_foo.lstrip()) is str
> assert type(my_foo.replace("x", "y")) is str
> assert type(my_foo.split()[0]) is str
> assert type(my_foo.splitlines()[0]) is str
> assert type(my_foo.strip()) is str
> assert type(empty.swapcase()) is str
> assert type(My_Foo.title()) is str
> assert type(MY_FOO.upper()) is str
> assert type(my_foo.zfill(3)) is str
>
> assert type(my_foo.partition("z")[0]) is MyStr
> assert type(my_foo.format()) is MyStr
>
> I was under the impression that all of the ``str`` methods exclusively 
> returned base ``str`` objects. Is there any reason why those two are 
> different, and is there a reason that would apply to ``removeprefix`` and 
> ``removesuffix`` as well?
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/TVDATHMCK25GT4OTBUBDWG3TBJN6DOKK/
> Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/L3ZQLTUWWTNKCWTTSJSOX3ME4EDSS4FR/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Paul Ganssle
I've said a few times that I think it would be good if the behavior were
defined /in terms of __getitem__/'s behavior. If the rough behavior is this:

def removeprefix(self, prefix):
    if self.startswith(prefix):
        return self[len(prefix):]
    else:
        return self[:]

Then you can shift all the guarantees about whether the subtype is str
and whether it might return `self` when the prefix is missing onto the
implementation of __getitem__.

For CPython's implementation of str, `self[:]` returns `self`, so it's
clearly true that __getitem__ is allowed to return `self` in some
situations. Subclasses that do not override __getitem__ will return the
str base class, and subclasses that /do/ overwrite __getitem__ can
choose what they want to do. So someone could make their subclass do this:

class MyStr(str):
    def __getitem__(self, key):
        if isinstance(key, slice) and key.start is key.stop is key.step is None:
            return self
        return type(self)(super().__getitem__(key))

They would then get "removeprefix" and "removesuffix" for free, with the
desired semantics and optimizations.
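
The mechanics of that can be sketched end-to-end; note this is the
/proposed/ __getitem__-based definition, not necessarily how the
eventual CPython implementation behaves:

```python
class MyStr(str):
    def __getitem__(self, key):
        # Full slice returns self unchanged; everything else is
        # re-wrapped in the subclass.
        if isinstance(key, slice) and key.start is key.stop is key.step is None:
            return self
        return type(self)(super().__getitem__(key))

def removeprefix(s, prefix):
    # Sketch of the proposed semantics, defined purely via slicing.
    if s.startswith(prefix):
        return s[len(prefix):]
    return s[:]

m = MyStr("foo.txt")
assert removeprefix(m, "foo") == ".txt"
assert type(removeprefix(m, "foo")) is MyStr  # subclass retained via __getitem__
assert removeprefix(m, "bar") is m            # no match: full slice returns self
```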

If we go with this approach (which again I think is much friendlier to
subclassers), that obviates the problem of whether `self[:]` is a good
summary of something that can return `self`: since "does the same thing
as self[:]" /is/ the behavior it's trying to describe, there's no ambiguity.

Best,
Paul

On 3/25/20 1:36 PM, Dennis Sweeney wrote:
> I'm removing the tuple feature from this PEP. So now, if I understand
> correctly, I don't think there's disagreement about behavior, just about
> how that behavior should be summarized in Python code. 
>
> Ethan Furman wrote:
>>> It appears that in CPython, self[:] is self is true for base
>>> str
>>>  objects, so I think return self[:] is consistent with (1) the premise
>>>  that returning self is an implementation detail that is neither mandated
>>>  nor forbidden, and (2) the premise that the methods should return base
>>>  str objects even when called on str subclasses.
>> The Python interpreter in my head sees self[:] and returns a copy. 
>> A
>> note that says a str is returned would be more useful than trying to
>> exactly mirror internal details in the Python "roughly equivalent" code.
> I think I'm still in the camp that ``return self[:]`` more precisely 
> prescribes
> the desired behavior. It would feel strange to me to write ``return self``
> and then say "but you don't actually have to return self, and in fact
> you shouldn't when working with subclasses". To me, it feels like
>
> return (the original object unchanged, or a copy of the object, 
> depending on implementation details, 
> but always make a copy when working with subclasses)
>
> is well-summarized by
>
>return self[:]
>
> especially if followed by the text
>
> Note that ``self[:]`` might not actually make a copy -- if the affix
> is empty or not found, and if ``type(self) is str``, then these methods
> may, but are not required to, make the optimization of returning ``self``.
> However, when called on instances of subclasses of ``str``, these
> methods should return base ``str`` objects, not ``self``.
>
> ...which is a necessary explanation regardless. Granted, ``return self[:]``
> isn't perfect if ``__getitem__`` is overridden, but at the cost of three
> characters, the Python gains accuracy over both the optional nature of
> returning ``self`` in all cases and the impossibility (assuming no dunders
> are overridden) of returning self for subclasses. It also dissuades readers
> from relying on the behavior of returning self, which we're specifying is
> an implementation detail.
>
> Is that text explanation satisfactory?
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/4E77QD52JCMHSP7O62C57XILLQN6SPCT/
> Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GG5BOKPQCP7J5RRWABEYOZDNDTH3UC6T/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Paul Ganssle
> And we *have* to decide that it returns a plain str instance if called
> on a subclass instance (unless overridden, of course) since the base
> class (str) won't know the signature of the subclass constructor.
> That's also why all other str methods return an instance of plain str
> when called on a subclass instance.

My suggestion is to rely on __getitem__ here (for subclasses), in which
case we don't actually need to know the subclass constructor. The rough
implementation in the PEP shows how to do it without needing to know the
subclass constructor:

def redbikeshed(self, pre):
    if self.startswith(pre):
        return self[len(pre):]
    return self[:]

The actual implementation doesn't need to be implemented that way, as
long as the result is always the result of slicing the original
string, it's safe to do so* and more convenient for subclass
implementers (who now only have to implement __getitem__ to get the
affix-trimming functions for free).

One downside to this scheme is that I think it makes getting the type
hinting right more complicated, since the return type of these functions
is basically, "Whatever the return type of self.__getitem__ is", but I
don't think anyone will complain if you write -> str with the
understanding that __getitem__ should return a str or a subtype thereof.

Best,
Paul

*Assuming they haven't messed with __getitem__ to do something
non-standard, but if they've done that I think they've tossed Liskov
substitution out the window and will have to re-implement these methods
if they want them to work.

On 3/22/20 2:03 PM, Guido van Rossum wrote:
> On Sun, Mar 22, 2020 at 4:20 AM Eric V. Smith  > wrote:
>
> Agreed. I think the PEP should say that a str will be returned (in
> the
> event of a subclass, assuming that's what we decide), but if the
> argument is exactly a str, that it may or may not return the original
> object.
>
>
> Yes. Returning self if the class is exactly str is *just* an
> optimization -- it must not be mandated nor ruled out.
>
> -- 
> --Guido van Rossum (python.org/~guido )
> /Pronouns: he/him //(why is my pronoun here?)/
> 
>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/ZZTY3OCJFZTZM74MVWRYL23LFJGNKICU/
> Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Y3O7CBHJB4R34TYL7RDEU2TB5OPSNI3H/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Paul Ganssle
Sorry, I think I accidentally left out a clause here - I meant that the
rationale for /always returning a 'str'/ (as opposed to returning a
subclass) is missing; the PEP just says:

> The only difference between the real implementation and the above is
> that, as with other string methods like replace, the methods will
> raise a TypeError if any of self, pre or suf is not an instace of str,
> and will cast subclasses of str to builtin str objects.

I think the rationale for these differences is not made entirely clear,
specifically the "and will cast subclasses of str to builtin str
objects" part.

I think it would be best to define the truncation in terms of
__getitem__ - possibly with the caveat that implementations are allowed
(but not required) to return `self` unchanged if no match is found.

Best,
Paul

P.S. Dennis - just noticed in this reply that there is a typo in the PEP
- s/instace/instance

On 3/22/20 12:15 PM, Victor Stinner wrote:
> tl; dr A method implemented in C is more efficient than hand-written
> pure-Python code, and it's less error-prone
>
> I don't think if it has already been said previously, but I hate
> having to compute manually the string length when writing:
>
> if line.startswith("prefix"): line = line[6:]
>
> Usually what I do is to open a Python REPL and I type: len("prefix")
> and copy-paste the result :-)
>
> Passing directly the length is a risk of mistake. What if I write
> line[7:] and it works most of the time because of a space, but
> sometimes the space is omitted randomly and the application fails?
>
> --
>
> The lazy approach is:
>
> if line.startswith("prefix"): line = line[len("prefix"):]
>
> Such code makes my "micro-optimizer heart" bleed, since I know that
> Python is stupid and calls len() at runtime, and the compiler is unable
> to optimize it (sadly for good reasons, since the len name can be
> overridden)  :-(
>
> => line.cutprefix("prefix") is more efficient! ;-) It's also also shorter.
>
> Victor
>
> Le dim. 22 mars 2020 à 17:02, Paul Ganssle  a écrit :
>> I don't see any rationale in the PEP or in the python-ideas thread
>> (admittedly I didn't read the whole thing, I just Ctrl + F-ed "subclass"
>> there). Is this just for consistency with other methods like .casefold?
>>
>> I can understand why you'd want it to be consistent, but I think it's
>> misguided in this case. It adds unnecessary complexity for subclass
>> implementers to need to re-implement these two additional methods, and I
>> can see no obvious reason why this behavior would be necessary, since
>> these methods can be implemented in terms of string slicing.
>>
>> Even if you wanted to use `str`-specific optimizations in C that aren't
>> available if you are constrained to use the subclass's __getitem__, it's
>> inexpensive to add a "PyUnicode_CheckExact(self)" check to hit a "fast
>> path" that doesn't use slice.
>>
>> I think defining this in terms of string slicing makes the most sense
>> (and, notably, slice itself returns `str` unless explicitly overridden,
>> the default is for it to return `str` anyway...).
>>
>> Either way, it would be nice to see the rationale included in the PEP
>> somewhere.
>>
>> Best,
>> Paul
>>
>> On 3/22/20 7:16 AM, Eric V. Smith wrote:
>>> On 3/22/2020 1:42 AM, Nick Coghlan wrote:
>>>> On Sun, 22 Mar 2020 at 15:13, Cameron Simpson  wrote:
>>>>> On 21Mar2020 12:45, Eric V. Smith  wrote:
>>>>>> On 3/21/2020 12:39 PM, Victor Stinner wrote:
>>>>>>> Well, if CPython is modified to implement tagged pointers and
>>>>>>> supports
>>>>>>> storing a short strings (a few latin1 characters) as a pointer, it
>>>>>>> may
>>>>>>> become harder to keep the same behavior for "x is y" where x and y
>>>>>>> are
>>>>>>> strings.
>>>>> Are you suggesting that it could become impossible to write this
>>>>> function:
>>>>>
>>>>>  def myself(o):
>>>>>  return o
>>>>>
>>>>> and not be able to rely on "o is myself(o)"? That seems... a pretty
>>>>> nasty breaking change for the language.
>>>> Other way around - because strings are immutable, their identity isn't
>>>> supposed to matter, so it's possible that functions that currently
>>>> return the exact same object in some cases may in the future start
>>>> returning a different object with the same value.

[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Paul Ganssle
I don't see any rationale in the PEP or in the python-ideas thread
(admittedly I didn't read the whole thing, I just Ctrl + F-ed "subclass"
there). Is this just for consistency with other methods like .casefold?

I can understand why you'd want it to be consistent, but I think it's
misguided in this case. It adds unnecessary complexity for subclass
implementers to need to re-implement these two additional methods, and I
can see no obvious reason why this behavior would be necessary, since
these methods can be implemented in terms of string slicing.

Even if you wanted to use `str`-specific optimizations in C that aren't
available if you are constrained to use the subclass's __getitem__, it's
inexpensive to add a "PyUnicode_CheckExact(self)" check to hit a "fast
path" that doesn't use slice.

I think defining this in terms of string slicing makes the most sense
(and, notably, slicing returns `str` unless explicitly overridden, so
the default would be to return `str` anyway).
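As a rough pure-Python sketch (my own illustration, not the PEP's reference implementation), the slicing-based semantics could look like:

```python
def removeprefix(self: str, prefix: str) -> str:
    # Built on slicing: slicing a str subclass yields plain str
    # unless __getitem__ is overridden, so the return type falls
    # out of the slicing behavior rather than being special-cased.
    if self.startswith(prefix):
        return self[len(prefix):]
    return self[:]

def removesuffix(self: str, suffix: str) -> str:
    # Guard against an empty suffix: self[:-0] would be "".
    if suffix and self.endswith(suffix):
        return self[:-len(suffix)]
    return self[:]
```

Whether `self[:]` returns the original object or a copy is then left
entirely to the slicing implementation.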

Either way, it would be nice to see the rationale included in the PEP
somewhere.

Best,
Paul

On 3/22/20 7:16 AM, Eric V. Smith wrote:
> On 3/22/2020 1:42 AM, Nick Coghlan wrote:
>> On Sun, 22 Mar 2020 at 15:13, Cameron Simpson  wrote:
>>> On 21Mar2020 12:45, Eric V. Smith  wrote:
 On 3/21/2020 12:39 PM, Victor Stinner wrote:
> Well, if CPython is modified to implement tagged pointers and
> supports
> storing short strings (a few latin1 characters) as a pointer, it
> may
> become harder to keep the same behavior for "x is y" where x and y
> are
> strings.
>>> Are you suggesting that it could become impossible to write this
>>> function:
>>>
>>>  def myself(o):
>>>  return o
>>>
>>> and not be able to rely on "o is myself(o)"? That seems... a pretty
>>> nasty breaking change for the language.
>> Other way around - because strings are immutable, their identity isn't
>> supposed to matter, so it's possible that functions that currently
>> return the exact same object in some cases may in the future start
>> returning a different object with the same value.
>>
>> Right now, in CPython, with no tagged pointers, we return the full
>> existing pointer wherever we can, as that saves us a data copy. With
>> tagged pointers, the pointer storage effectively *is* the instance, so
>> you can't really replicate that existing "copy the reference not the
>> storage" behaviour any more.
>>
>> That said, it's also possible that identity for tagged pointers would
>> be value based (similar to the effect of the small integer cache and
>> string interning), in which case the entire question would become
>> moot.
>>
>> Either way, the PEP shouldn't be specifying that a new object *must*
>> be returned, and it also shouldn't be specifying that the same object
>> *can't* be returned.
>
> Agreed. I think the PEP should say that a str will be returned (in the
> event of a subclass, assuming that's what we decide), but if the
> argument is exactly a str, that it may or may not return the original
> object.
>
> Eric
>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/JHM7T6JZU56PWYRJDG45HMRBXE3CBXMX/
> Code of Conduct: http://python.org/psf/codeofconduct/



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RTQWEE4KZYIIXL3HK3C6IJ2ATQ6CM7PG/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Requesting review on PEP 615 C Extension Reference Implementation

2020-03-20 Thread Paul Ganssle
Hi all,

The past few weeks I've been working on adding a C extension to the
reference implementation for PEP 615 (Support for the IANA Time Zone
Database in the Standard Library), but I've had some trouble
inducing anyone to review the code. There's
about 2200 lines of C code there that's passing all the tests for the
pure Python implementation (which has 100% code coverage) - would anyone
be interested in taking a look before I merge it into the reference
implementation repo? Any little bit of review (including incremental
reviews) would help:

https://github.com/pganssle/zoneinfo/pull/15

Note: You'll need to click "Load diff" on the zoneinfo_module.c file to
see the actual code, because the diff is so large. The stuff that's
immediately visible in the diff is the more straightforward code.

I'll note that I am also happy to accept review comments about other
parts of the repo, not just the C code, but the C code is a priority
since errors there tend to be less forgiving.

Thanks!
Paul



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MWL4Q2KRYEZ2GYK63LCBLOPJBD2FO7XY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] PEP 615: Support for IANA Time Zones in the Standard Library

2020-03-01 Thread Paul Ganssle
Greetings!

Last year at the Language Summit, I proposed to add additional concrete
time zones to the standard library
<https://pyfound.blogspot.com/2019/05/paul-ganssle-time-zones-in-standard.html>
. After much work and much more procrastination, I have now put
together my first proposal: support for the IANA time zone database
<https://www.iana.org/time-zones> (also called tz, zoneinfo or the Olson
database; Wikipedia <https://en.wikipedia.org/wiki/Tz_database> ). Last
week, I submitted it for consideration as PEP 615
<https://www.python.org/dev/peps/pep-0615/>.

I originally posted it on the discourse last week, and advertised the
discussion on some interest-group-specific fora (tz mailing list,
datetime-SIG mailing list), but I think it is ready to be advertised to
a wider forum, so I am posting it here for your consideration. Please
direct comments to the discourse thread, so that the discussion can be
as centralized as possible: https://discuss.python.org/t/3468
<https://discuss.python.org/t/pep-615-support-for-the-iana-time-zone-database-in-the-standard-library/3468>.

Links for easy access:
PEP 615: https://www.python.org/dev/peps/pep-0615/
Reference implementation: https://github.com/pganssle/zoneinfo
tzdata repo: https://github.com/pganssle/tzdata

Thanks!
Paul



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5WVVDHBWN23A7WAOAUIQDRIR2DGF233S/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 584: Add Union Operators To dict

2020-02-06 Thread Paul Ganssle
On 2/6/20 4:23 PM, Serhiy Storchaka wrote:
> It would create an exception of two rules:
>
> 1. Operators on subclasses of builtin classes do not depend on
> overridden methods of arguments (except the corresponding dunder
> method). `list.__add__` and `set.__or__` do not call copy() and
> extend()/update(). You should override the corresponding dunder method
> to change the behavior of the operator.
>
> 2. Operators do not depend on non-dunder methods.
>
> This looks to me as a direct violation of the principle "Special cases
> aren't special enough to break the rules."
>
I may not fully understand the implications of #1, but I would think you
could implement the semantics Brandt wants using only dunder methods and
copy.copy (which itself dispatches to one of a number of dunder methods
- __copy__, __reduce__, __setstate__, depending on which ones are
defined - we could presumably avoid the `copy` import by partially
porting that logic into `__or__`):

def __or__(self, other):
    new_value = copy.copy(self)
    for key in iter(other):  # via __iter__; there is no __keys__ dunder
        new_value.__setitem__(key, other.__getitem__(key))
    return new_value

Obviously the actual implementation would be in C and handle more edge
cases and whatnot, but I think the desired semantics can all be achieved
using only magic methods on the objects themselves (though we'd probably
want to bypass all that stuff in favor of a "fast path" in the case of
`dict` | `dict`).
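Stripped of the dunder machinery, the intended merge semantics (right
operand wins on duplicate keys) can be illustrated with plain dicts --
`union` here is just an illustrative name:

```python
def union(lhs, rhs):
    # Shallow-copy the left operand, then merge in the right
    # operand, whose values win on any conflicting keys.
    new_value = dict(lhs)
    new_value.update(rhs)
    return new_value

print(union({1: 2, 3: 0}, {3: 4}))  # {1: 2, 3: 4}
```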

Best,
Paul




___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QXLOQ6QZLF35FGCTTARFSUBNGK4U22MB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 584: Add Union Operators To dict

2020-02-06 Thread Paul Ganssle
Hi Brandt, very nice PEP. I have two questions here.

First:

> - The proposal in its current form is very easy to wrap your head around: "|" 
> takes dicts, "|=" takes anything dict.update does.

I see this asymmetry between the | and |= mentioned a few times in the
PEP, but I don't see any rationale other than "the authors have
decided". I am not saying that this is the wrong decision, but the
reasoning behind this choice is not obvious, and I think it might be a
good idea to include the rationale in the PEP. I'd say the asymmetry
between list's `__add__` and `__iadd__` semantics is actually fairly
confusing for anyone who hasn't encountered it before:

>>> a = []
>>> a = a + "one"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate list (not "str") to list
>>> a += "one"
>>> a
['o', 'n', 'e']

I think most people would be surprised at the difference in semantics
here, and this is also an example of a situation where it's not obvious that
the "call `list.extend`" behavior is the right thing to do. It would be
nice to see why you rejected:

1. Giving |= `.update` semantics rather than the semantics you chose for |.

2. Giving `|` the same semantics as `|=`.

Second question:

The specification mentions "Dict union will return a new dict containing
the left operand merged with the right operand, which must be a dict (or
an instance of a dict subclass)." Can you clarify if it is part of the
spec that it will always return a `dict` even if one or both of the
operands is a dict subclass? You mentioned in another post that this is
deliberately intended to be subclass-friendly, but if it always returns
`dict`, then I would expect this:

    >>> class MyDict(dict):
    ...    pass
    ...
    >>> MyDict({1: 2}) | MyDict({3: 4})
    {1: 2, 3: 4}

I realize that there's a lot of precedent for this with other builtin
types (int subclasses reverting to int with +, etc), though generally
the justifications for this are two-fold:

1. For symmetrical operations like addition it's not obvious which
operand's type should prevail, particularly if the two are both
different dict subclasses. I think in this case | is already
asymmetrical and it would be relatively natural to think of the right
thing to do as "make a copy of the LHS then update it with the RHS"
(thus retaining the subtype of the LHS).

2. Subclasses may override the constructor in unpredictable ways, so we
don't know how to construct arbitrary subtypes. I /think/ this objection
could be satisfied by using logic equivalent to `copy.copy` for the LHS
when it is a dict subclass, and then using the mapping protocol on the RHS.

Unless there is compelling reason to do otherwise, I am in favor of
trying to retain subclass identity after operations, but either way it
would be good to be explicit about it in the specification and maybe
include a bit of the rationale one way or the other.

Best,
Paul



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/L44V6WTFI4OWM7TPYIMCACJHKBRJQMIU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Announcing the new Python triage team on GitHub

2019-08-22 Thread Paul Ganssle
I think it's fine for triagers to have the close permission, and there's
actually a good reason to give it to them, which is that often the
easiest way to trigger a new CI run is to close and re-open a PR. It
will be very helpful for triagers to be able to do this to fix
intermittent build problems and the like.

Even without the "trigger a new CI run" justification, I think the risk
we're running by giving people close privileges is pretty minimal. If a
PR is erroneously closed, it can just be re-opened. If a triager is
consistently misusing /any/ of the permissions they are given, their
triage permissions can be removed.

Best,
Paul

On 8/22/19 6:02 AM, Victor Stinner wrote:
> Hi,
>
> Oh, I just wrote a similar email to python-committers, I didn't notice
> that Mariatta wrote to python-dev and python-committers.
>
> https://mail.python.org/archives/list/python-committ...@python.org/message/53K5MJAKLRGY2F34ZCYGL3WPWSJ4C5M2/
>
>
> My worry is more about closing pull requests.
>
>
> On 21/08/2019 at 22:13, Raymond Hettinger wrote:
>> The capabilities of a triager mostly look good except for "closing
>> PRs and issues".  This is a superpower that has traditionally been
>> reserved for more senior developers because it grants the ability to
>> shut-down the work of another aspiring contributor.  Marking someone
>> else's suggestion as rejected is the most perilous and least fun
>> aspect of core development.  Submitters tend to expect their idea
>> won't be rejected without a good deal of thought and expert
>> consideration.   Our bar for becoming a triager is somewhat low, so I
>> don't think it makes sense to give the authority to reject a PR or
>> close an issue.
>
> Closing an issue (on GitHub) is not new compared to the previous
> "Developer" role on bugs.python.org. When I gave the bug triage
> permission to a contributor, I always warned them that closing an
> issue is the most risky operation. I asked them to ask me before doing
> that.
>
> In practice, I don't recall a triager who closed an issue, but
> someone else complained that the issue should be reopened. In a few
> specific cases, the original reporter was in disagreement with
> everybody else and didn't understand why their issue was not a bug and
> will not be fixed, but it wasn't an issue about triagers ;-)
>
> The risk of closing an issue by mistake is quite low, since the bug
> remains in the database, it's trivial to reopen. Closed bugs can be
> found using Google for example (which doesn't care of the bug status),
> or using bugs.python.org search engine if you opt-in for closed issues
> (or ignore the open/close status in a search). The topic has been
> discussed previously (sorry, I don't recall where), and it was said
> that it's ok to give this permission (close issues) to new triagers.
>
> Now there is the question of giving the "close pull requests"
> permission to new triagers ;-)
>
> Victor


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HVLB2VIGOWFFTGVKEYYHEL7PQ337Q53K/


[Python-Dev] Re: What is a public API?

2019-07-23 Thread Paul Ganssle
FWIW, I actually like the idea - though not strongly enough to really
campaign for it.

My reasoning is that I think that both the current "consenting adults"
policy and possibly more importantly the fact that we are implicitly
supporting private interfaces by our reluctance to changing them has
harmed the ecosystem of Python interpreters. Because the lines between
implementation details and deliberate functionality are very fuzzy,
alternate implementations need to go out of their way to be something
like "bug compatible" with CPython.

Of course, there are all kinds of other psychological and practical
reasons that are preventing a flourishing ecosystem of alternative
Python implementations, but I do think that we could stand to be more
strict about reliance on implementation details as a way of standing up
for people who don't have the resources or market position to push
people to write their code in a way that's compatible with multiple
implementations.

I'll note that I am basically neutral on the idea of consistency across
the codebase as a goal - it would be nice but there are too many
inconsistencies even in the public portion of the API for us to ever
actually achieve it, so I don't think it's critical. The main reason I
like the idea is that I /do/ think that there are a lot of people who
use "does it start with an underscore" as their only heuristic for
whether or not something is private (particularly since that is obvious
to assess no matter how you access the function/method/attribute/class,
whereas `__all__` is extra work and many people don't know its
significance). Yes, they are just as wrong as people who we would be
breaking by sweeping changes to the private interface, but the rename
would prevent more /accidental/ reliance on implementation details.

On 7/23/19 3:27 AM, Paul Moore wrote:
> On Tue, 23 Jul 2019 at 04:58, Kyle Stanley  wrote:
>> My primary motivation was to provide more explicit declaration of public vs 
>> private, not only for the purpose of shifting the responsibility to the 
>> authors, but also to shift the liability of using private members to the 
>> user.
> My view is that the current somewhat adhoc, "consenting adults"
> approach has served us well for many years now. There have been a few
> cases where we've needed to address specific points of confusion, but
> mostly things have worked fine.
>
> With Python's increased popularity, there has been an influx of new
> users with less familiarity with Python's easy-going attitude, and
> consequently an increase in pressure for more "definite", or
> "explicit" rules. While it's great to see newcomers arrive with new
> ideas, and it's important to make their learning experience as
> pleasant as possible, we should also make sure that we don't lose the
> aspects of Python that *made* it popular in the process. And to my
> mind, that easy-going, "assume the users know what they are doing"
> attitude is a key part of Python's appeal.
>
> So I'm -1 on any global change of this nature, particularly if it is
> motivated by broad, general ideas of tightening up rules or making
> contracts more explicit rather than a specific issue.
>
> The key point about making changes on a "case by case" basis is *not*
> about doing bits of the fix when needed, but about having clear,
> practical issues that need addressing, to guide the decision on what
> particular fix is appropriate in any given situation.
>
> Paul
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/QB436KAE4WGF66LNFJICR3P3BFZNP5BR/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/THWGSQZUK6JZEEMRYYR7CKREHF2I5KYX/


[Python-Dev] Re: Removing dead bytecode vs reporting syntax errors

2019-07-05 Thread Paul Ganssle
It seems that the issue that originally caused compatibility issues was
that `__debug__` statements were being optimized away, which was
apparently desirable from coverage's point of view. It's not clear to
me, but it seems that this may also impact what bytecode is generated
when Python is run in optimized mode, because statements of the form `if
__debug__:` will no longer be completely optimized out under `-O` (note
that from what I can tell, `assert` statements are still optimized out
under `-O`).

Does anyone have performance sensitive code that relies on `if
__debug__` so that we can look at a benchmark? The issues with code
coverage aside, if it's a significant issue, maybe it is worth
considering a special case for `if __debug__` (I don't know enough about
the implementation details to know how difficult or annoying this would
be to maintain).

Best,
Paul

On 7/5/19 5:51 PM, Ivan Pozdeev via Python-Dev wrote:
>
> "Correctness over speed" is Python's core value, so any syntax errors
> must be reported.
>
> Is this optimization so important anyway? `if 0:` seems a niche use
> case (yes, it's in site.py which is in every installation but the gain
> there is pretty small)
>



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OK3JFHXPYM4HPGPPGKE6F7QTABUCCE43/


[Python-Dev] Re: Expected stability of PyCode_New() and types.CodeType() signatures

2019-06-12 Thread Paul Ganssle
I'm not one of the people who ever shipped pre-cythonized code, but I
think at this point it should be pretty safe to ship just your .pyx
files and not the generated C files - at least if your reason for
shipping the C code was "it will be hard for people to pip install things".

For /other/ reasons, it's best to ship wheels anyway, and pip will
prefer wheels over source distributions, so if you ship a manylinux
wheel and a Windows wheel for each of the most popular Python versions,
the majority of your users won't even ever hit the source distribution
anyway. PEP 518 has been supported in pip since I think version 18.0,
and PEP 517/518 support in pip landed in 19.0 (though it's been a rocky
few months).

I do tend to live a bit ahead of the curve in terms of packaging because
of my intimate familiarity with the topic, but I do think that we're at
the point where the build-related problems you'll see from just shipping
`.pyx` files are going to be pretty rare, if you do it right.
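Concretely, PEP 518 is what lets a project declare Cython as a
build-time dependency, so the sdist can ship only `.pyx` sources (a
sketch; the exact version pins are illustrative):

```toml
[build-system]
# Cython runs on the installing machine at build time, so the
# generated C files no longer need to be shipped or checked in.
requires = ["setuptools>=40.8.0", "wheel", "Cython"]
build-backend = "setuptools.build_meta"
```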

On 6/12/19 4:11 AM, Petr Viktorin wrote:
> I hope this is something that improvements in Python's packaging story
> (specifically, PEP 518) should help with.
> I see the current practice of including Cython's output in releases as
> a workaround for the fact that you can't (reasonably) specify Cython
> as a build dependency. Cython is a much lighter dependency than a C
> compiler -- though a less common one. When there's a reliable way to
> specify build-time dependencies, running Cython on each build will
> hopefully become the obvious way to do it.


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SMJ6WQD7Y3DRDU6OHDXUOW42MANHP2LC/


Re: [Python-Dev] Adding a toml module to the standard lib?

2019-05-15 Thread Paul Ganssle
As someone involved in the packaging side of this, I think while we'd
eventually /appreciate/ a TOML parser in the standard library, I agree
with Victor that there's no rush, for two reasons:

1. setuptools and pip have a decent number of dependencies that we
vendor /anyway/, so vendoring one more is not a big deal.
2. We will be supporting older versions of Python for some time to come,
so we'd need to vendor a TOML-parser backport for several years before
we could actually use the one in the standard library.

I think /if/ a 1.0 version of the spec is going to be forthcoming (not
clear from the issue requesting it:
https://github.com/toml-lang/toml/issues/515 ), then it's worth waiting
at least a bit longer for it.

I also think it's probably worth asking the current maintainers of
TOML-parsing libraries if /they/ think it's time to adopt/adapt one of
their libraries for use in the standard library. They probably have a
better perspective on the stability and maturity of their codebases.

Best,
Paul


On 5/15/19 8:39 AM, Victor Stinner wrote:
> Hi Bastian,
>
> IMHO we should wait until the format reach version 1.0, since the
> stdlib has a slow release cycle (one release every 18 months). Too
> slow for a "fast moving" standard.
>
> In the meanwhile, I'm sure setuptools and pip will manage to install a
> toml parser/generator for their needs, as they already do :-)
>
> Victor
>
> On Wed, 15 May 2019 at 12:50, Bastian Venthur wrote:
>> On 15.05.19 11:33, Antoine Pitrou wrote:
>>> How stable is the TOML format?  Is it bound to change significantly in
>>> the coming years?
>>>
>>> If the format is stable enough, then I think it's a good idea.
>>
>> The last update to the spec [1] was 10 months ago and added a few
>> features. The version before that was stable for more than 3 years. It
>> is also worth noting that he writes about the current version [2]:
>>
>> "As of version 0.5.0, TOML should be considered extremely stable. The
>> goal is for version 1.0.0 to be backwards compatible (as much as humanly
>> possible) with version 0.5.0. All implementations are strongly
>> encouraged to become 0.5.0 compatible so that the transition to 1.0.0
>> will be simple when that happens."
>>
>> That is of course no guarantee, but maybe the best we can hope for.
>>
>>
>> Cheers,
>>
>> Bastian
>>
>>
>> [1]: https://github.com/toml-lang/toml/releases
>> [2]: https://github.com/toml-lang/toml
>>
>>
>> --
>> Dr. Bastian Venthur  http://venthur.de
>> Debian Developer venthur at debian org
>>
>> ___
>> Python-Dev mailing list
>> Python-Dev@python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: 
>> https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com
>
>


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Adding a tzidx cache to datetime

2019-05-13 Thread Paul Ganssle
> From Marc-Andre Lemburg, I understand that Paul's PR is a good
> compromise and that other datetime implementations which cannot use
> tzidx() cache (because it's limited to an integer in [0; 254]) can
> subclass datetime or use a cache outside datetime.

One idea that we can put out there (though I'm hesitant to suggest it,
because generally Python avoids this sort of language lawyering anyway),
is that I think it's actually fine to allow the situations under which
`tzidx()` will cache a value could be implementation-dependent, and to
document that in CPython it's only integers  in [0; 254].

The reason to mention this is that I suspect that PyPy, which has a
pure-python implementation of datetime, will likely either choose to
forgo the cache entirely and always fall through to the underlying
function call or cache /any/ Python object returned, since with a pure
Python implementation, they do not have the advantage of storing the
tzidx cache in an unused padding byte.

Other than the speed concerns, because of the fallback nature of
datetime.tzidx, whether or not the cache is hit will not be visible to
the end user, so I think it's fair to allow interpreter implementations
to choose when a value is or is not cached according to what works best
for their users.

On 5/13/19 7:52 PM, Victor Stinner wrote:
> On Fri, 10 May 2019 at 09:22, M.-A. Lemburg wrote:
>> Given that many datetime objects in practice don't use timezones
>> (e.g. in large data stores you typically use UTC and naive datetime
>> objects), I think that making the object itself larger to accommodate
>> for a cache, which will only be used a smaller percentage of the use
>> cases, isn't warranted. Going from 64 bytes to 72 bytes also sounds
>> like this could have negative effects on cache lines.
>>
>> If you need a per object cache, you can either use weakref
>> objects or maintain a separate dictionary in dateutil or other
>> timezone helpers which indexes objects by id(obj).
>>
>> That said, if you only add a byte field which doesn't make the object
>> larger in practice (you merely use space that alignments would
>> use anyway), this shouldn't be a problem. The use of that field
>> should be documented, though, so that other implementations can
>> use/provide it as well.
> From Marc-Andre Lemburg, I understand that Paul's PR is a good
> compromise and that other datetime implementations which cannot use
> tzidx() cache (because it's limited to an integer in [0; 254]) can
> subclass datetime or use a cache outside datetime.
>
> Note: right now, creating a weakref to a datetime fails.
>
> Victor


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Adding a tzidx cache to datetime

2019-05-09 Thread Paul Ganssle
This is "only" for dateutil in the sense that no one other than
dateutil implements tzinfo with the interface provided. If dateutil were
/not/ already implemented with a list of offsets and their indexes, I
would still propose this, and just re-write dateutil to take advantage
of it. From a cursory glance at pendulum, it seems that they could take
advantage of it as well (though they use their own datetime subclass, so
they have always had the ability to add this).

> What do you think of adding a private "_cache" attribute which would
> be an arbitrary Python object? (None by default)

We cannot use a private attribute (other than to do the actual storage,
since the thing that gets stored is not directly accessible anyway and
is instead mediated by a layer that manages the cache) because this is a
feature explicitly being added for use by tzinfo, /not/ by datetime. If
it's private then it's not safe for implementations of tzinfo to
actually use it, which defeats the purpose.

Regarding the use of an arbitrary Python object: What I'm proposing is
that we offer a bit of the "free" storage space in the alignment bits to
tzinfo objects to use as a cache. In /most/ cases this will be very
useful to someone implementing a tzinfo, because there are really only
so many ways to accomplish this task, and most time zones are
expressible as a very short list of offset/name/dst combinations, plus
some rule for which applies when, which is why a small integer cache is
sufficient and more or less universal (i.e. not specific to dateutil's
implementation).

I will also note that in my design, it is still possible for `tzinfo` to
return something other than [0, 254], it's just that that information
will not be cached, so it won't get the benefit of any optimization, but
the same interface / implementation can be used.

In my test with gcc, adding an additional PyObject* to the end of the
PyDateTime_DateTime struct increased the size of the `datetime.datetime`
object from 64 to 72 bytes, whereas adding an `unsigned char` after the
`fold` leaves it unchanged. Given that the expansion to arbitrary Python
objects is speculative and doesn't have any particular use case, I would
prefer to leave the feature as is, and reconsider the possibility of
storing arbitrary Python objects on the datetime if there's some
compelling reason to do so (it would be a backwards-compatible change at
that point anyway).
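The alignment effect is easy to reproduce with a rough ctypes model of the struct layout (the field names and sizes below are stand-ins, not the actual PyDateTime_DateTime definition):

```python
import ctypes

# A pointer-aligned struct whose last scalar fields are single bytes
# leaves padding before the next pointer member, so one more unsigned
# char is "free", while one more pointer grows the struct.
class Without(ctypes.Structure):
    _fields_ = [("hashcode", ctypes.c_ssize_t),
                ("data", ctypes.c_ubyte * 10),  # packed date/time bytes
                ("fold", ctypes.c_ubyte),
                ("tzinfo", ctypes.c_void_p)]

class WithByte(ctypes.Structure):
    _fields_ = [("hashcode", ctypes.c_ssize_t),
                ("data", ctypes.c_ubyte * 10),
                ("fold", ctypes.c_ubyte),
                ("tzidx", ctypes.c_ubyte),      # rides in the padding
                ("tzinfo", ctypes.c_void_p)]

class WithPointer(ctypes.Structure):
    _fields_ = [("hashcode", ctypes.c_ssize_t),
                ("data", ctypes.c_ubyte * 10),
                ("fold", ctypes.c_ubyte),
                ("tzinfo", ctypes.c_void_p),
                ("cache", ctypes.c_void_p)]     # forces the struct to grow

assert ctypes.sizeof(WithByte) == ctypes.sizeof(Without)
assert ctypes.sizeof(WithPointer) > ctypes.sizeof(Without)
```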

On 5/9/19 8:14 PM, Victor Stinner wrote:
> Hi Paul,
>
> The change is basically an optimization. I'm uncomfortable to design
> it "only" for dateutil. What if tomorrow someone has to store an
> arbitrary Python object, rather than just an integer (in range [0;
> 254]), into a datetime for a different optimization?
>
> Moreover, I dislike adding a *public* method for an *internal* cache.
>
> Right now, it is not possible to create a weak reference to a
> datetime. If we make it possible, it would be possible to have an
> external cache implemented with weakref.WeakSet to clear old entries
> when a datetime object is destroyed.
>
> What do you think of adding a private "_cache" attribute which would
> be an arbitrary Python object? (None by default)
>
> Victor
>
> On Tue, 7 May 2019 at 21:46, Paul Ganssle wrote:
>> Greetings all,
>>
>> I have one last feature request that I'd like added to datetime for Python 
>> 3.8, and this one I think could use some more discussion, the addition of a 
>> "time zone index cache" to the datetime object. The rationale is laid out in 
>> detail in bpo-35723. The general problem is that currently, every invocation 
>> of utcoffset, tzname and dst needs to do full, independent calculations of 
>> the time zone offsets, even for time zones where the mapping is guaranteed 
>> to be stable because datetimes are immutable. I have a proof of concept 
>> implementation: PR #11529.
>>
>> I'm envisioning that the `datetime` class will add a private `_tzidx` 
>> single-byte member (it seems that this does not increase the size of the 
>> datetime object, because it's just using an unused alignment byte). 
>> `datetime` will also add a `tzidx()` method, which will return `_tzidx` if 
>> it's been set and otherwise it will call `self.tzinfo.tzidx()`.  If 
>> `self.tzinfo.tzidx()` returns a number between 0 and 254 (inclusive), it 
>> sets `_tzidx` to this value. tzidx() then returns whatever 
>> self.tzinfo.tzidx() returned.
>>
>> The value of this is that as far as I can tell, nearly all non-trivial 
>> tzinfo implementations construct a list of possible offsets, and implement 
>> utcoffset(), tzname() and dst() by calculating an index into that list and 
>> returning it. There are almost always less than 255 distinct offsets. By 
>> adding this cache /on the datetime/, we're using a small amount of 
>> currently-unused memory to prevent unnecessary calculations about a 
>> given datetime.

[Python-Dev] Adding a tzidx cache to datetime

2019-05-07 Thread Paul Ganssle
Greetings all,

I have one last feature request that I'd like added to datetime for
Python 3.8, and this one I think could use some more discussion, the
addition of a "time zone index cache" to the /datetime/ object. The
rationale is laid out in detail in bpo-35723
. The general problem is that
currently, /every/ invocation of utcoffset, tzname and dst needs to do
full, independent calculations of the time zone offsets, even for time
zones where the mapping is guaranteed to be stable because datetimes are
immutable. I have a proof of concept implementation: PR #11529
.

I'm envisioning that the `datetime` class will add a private `_tzidx`
single-byte member (it seems that this does not increase the size of the
datetime object, because it's just using an unused alignment byte).
`datetime` will also add a `tzidx()` method, which will return `_tzidx`
if it's been set and otherwise it will call `self.tzinfo.tzidx()`.  If
`self.tzinfo.tzidx()` returns a number between 0 and 254 (inclusive), it
sets `_tzidx` to this value. tzidx() then returns whatever
self.tzinfo.tzidx() returned.

The value of this is that as far as I can tell, nearly all non-trivial
tzinfo implementations construct a list of possible offsets, and
implement utcoffset(), tzname() and dst() by calculating an index into
that list and returning it. There are almost always less than 255
distinct offsets. By adding this cache /on the datetime/, we're using a
small amount of currently-unused memory to prevent unnecessary
calculations about a given datetime. The feature is entirely opt-in, and
has no downsides if it goes unused, and it makes it possible to write
tzinfo implementations that are both lazy and as fast as the "eager
calculation" mode that pytz uses (and that causes many problems for
pytz's users).

I have explored the idea of using an lru cache of some sort on the
tzinfo object itself, but there are two problems with this:

1. Calculating the hash of a datetime calls .utcoffset(), which means
that it is necessary to, at minimum, do a `replace` on the datetime (and
constructing a new datetime is a pretty considerable speed hit)

2. It will be a much bigger memory cost, since my current proposal uses
approximately zero additional memory (not sure if the alignment stuff is
platform-dependent or something, but it doesn't use additional memory on
my linux computer).

I realize this proposal is somewhat difficult to wrap your head around,
so if anyone would like to chat with me about it in person, I'll be at
PyCon sprints until Thursday morning.

Best,
Paul



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] datetime.fromisocalendar

2019-04-27 Thread Paul Ganssle
Greetings,

Some time ago, I proposed adding a `.fromisocalendar` alternate
constructor to `datetime` (bpo-36004
), with a corresponding
implementation (PR #11888
). I advertised it on
datetime-SIG some time ago but haven't seen much discussion there, so
I'd like to bring it to python-dev's attention as we near the cut-off
for new Python 3.8 features.

Other than the fact that I've needed this functionality in the past, I
also think a good general principle for the datetime module is that when
a class (time, date, datetime) has a "serialization" method (.strftime,
.timestamp, .isoformat, .isocalendar, etc), there should be a
corresponding /deserialization/ method (.strptime, .fromtimestamp,
.fromisoformat) that constructs a datetime from the output. Now that
`fromisoformat` was introduced in Python 3.7, I think `isocalendar` is
the only remaining method without an inverse. Do people agree with this
principle? Should we add the `fromisocalendar` method?
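The proposed round trip looks like this (using the `fromisocalendar` name from PR #11888; the method ultimately shipped in Python 3.8):

```python
from datetime import date, datetime

d = date(2019, 4, 27)
iso_year, iso_week, iso_day = d.isocalendar()
assert (iso_year, iso_week, iso_day) == (2019, 17, 6)  # a Saturday in week 17

# fromisocalendar() inverts isocalendar() exactly:
assert date.fromisocalendar(iso_year, iso_week, iso_day) == d
assert datetime.fromisocalendar(iso_year, iso_week, iso_day) == datetime(2019, 4, 27)
```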

Thanks,
Paul





[Python-Dev] Adding shlex.join?

2019-04-17 Thread Paul Ganssle
Hey all,

I've been reviewing old "awaiting review" PRs recently, and about a week
ago I found PR #7605 ,
adding shlex.join(), with a corresponding bug at bpo-22454
. The PR's implementation is simple
and seems reasonable and decently well-tested, but it has been
unreviewed for ~10 months.

The reason I'm bringing it up here is that I believe the major blocker
here is getting agreement to actually add the function. There doesn't
seem to be much /opposition/ in the BPO issue, but given how
infrequently the shlex module is changed I'm worried that there may be
no one around who feels confident to judge how the interface should evolve.
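For concreteness, the function under review (it was ultimately merged for Python 3.8) is the inverse of shlex.split(), quoting arguments as needed:

```python
import shlex

args = ["echo", "hello world", "don't panic"]
cmd = shlex.join(args)
assert "'hello world'" in cmd      # whitespace argument gets quoted
assert shlex.split(cmd) == args    # lossless round trip
```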

Does anyone feel strongly about this issue? Is there anyone who wants to
make a yes/no decision on this feature?

Best,
Paul

P.S. The PR's submitter seems responsive. I made a comment on the
documentation and it was addressed in something like 5 minutes.





Re: [Python-Dev] bpo-36558: Change time.mktime() return type from float to int?

2019-04-16 Thread Paul Ganssle
I already chimed in on the issue, but for the list, I'll boil my
comments down to two questions:

1. For anyone who knows: when the documentation refers to "compatibility
with `.time`", is that just saying it was designed that way because
.time returns a float (i.e. for /consistency/ with `.time()`), or is
there some practical reason that you would want `.time()` and
`.mktime()` to return the same type?

2. Mainly for Victor, but anyone can answer: I agree that the natural
output of `mktime()` would be `int` if I were designing it today, but
would there be any /practical/ benefits for making this change? Are
there problems cropping up because it's returning a float? Is it faster
to return an integer?
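For context, the current behavior under discussion: mktime() returns a float, but that float always carries a whole number of seconds, so truncating it loses nothing.

```python
import time

result = time.mktime(time.localtime())
assert isinstance(result, float)   # float return type, per the docs
assert result == int(result)       # but never any sub-second part
```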

Best,

Paul

On 4/16/19 10:24 AM, Victor Stinner wrote:
> Hi,
>
> time.mktime() looks "inconsistent" to me and I would like to change
> it, but I'm not sure how it impacts backward compatibility.
> https://bugs.python.org/issue36558
>
> time.mktime() returns a floating point number:
>
> >>> type(time.mktime(time.localtime()))
> <class 'float'>
>
> The documentation says:
>
> "It returns a floating point number, for compatibility with :func:`.time`."
>
> time.time() returns a float because it has sub-second resolution, but
> the C function mktime() returns an integer number of seconds.
>
> Would it make sense to change mktime() return type from float to int?
>
> I would like to change mktime() return type to make the function more
> consistent: all inputs are integers, it sounds wrong to me to return
> float. The result should be integer as well.
>
> How much code would it break? I guess that the main impact are unit
> tests relying on repr(time.mktime(t)) exact value. But it's easy to
> fix the tests: use int(time.mktime(t)) or "%.0f" % time.mktime(t) to
> never get ".0", or use float(time.mktime(t))) to explicitly cast for a
> float (that which be a bad but quick fix).
>
> Note: I wrote and implemented the PEP 564 to avoid any precision loss.
> mktime() will not start losing precision before year 285,422,891
> (which is quite far in the future ;-)).
>
> Victor




Re: [Python-Dev] Remove tempfile.mktemp()

2019-03-19 Thread Paul Ganssle
I'm not sure the relationship with mkdir and mktemp here. I don't see
any uses of tempfile.mktemp in pip or setuptools, though they do use
os.mkdir (which is not deprecated).

Both pip and setuptools use pytest's tmpdir_factory.mktemp() in their
test suites, but I believe that is not the same thing.
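For anyone replacing tempfile.mktemp() in their own code, the race-free alternatives already in the module look like this:

```python
import os
import tempfile

# mkstemp() creates and opens the file atomically, closing the
# name-then-create race that makes mktemp() unsafe:
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "w") as f:
        f.write("data")
finally:
    os.remove(path)

# TemporaryDirectory() covers the "scratch directory" use case:
with tempfile.TemporaryDirectory() as tmpdir:
    target = os.path.join(tmpdir, "scratch.txt")
    with open(target, "w") as f:
        f.write("data")

# Both are cleaned up by this point.
assert not os.path.exists(path) and not os.path.exists(tmpdir)
```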

On 3/19/19 9:39 AM, Antoine Pitrou wrote:
> On Tue, 19 Mar 2019 15:32:25 +0200
> Serhiy Storchaka  wrote:
>> 19.03.19 15:03, Stéphane Wirtel пише:
>>> Suggestion and timeline:
>>>
>>> 3.8, we raise a PendingDeprecationWarning
>>>  * update the code
>>>  * update the documentation
>>>  * update the tests
>>>(check a PendingDeprecationWarning if sys.version_info == 3.8)
>>>
>>> 3.9, we change PendingDeprecationWarning to DeprecationWarning
>>>(check DeprecationWarning if sys.version_info == 3.9)
>>>
>>> 3.9+, we drop tempfile.mktemp()  
>> This plan LGTM.
>>
>> Currently mkdir() is widely used in distutils, Sphinx, pip, setuptools, 
>> virtualenv, and many other third-party projects, so it will take time to 
>> fix all these places. But we should do this, because all this code 
>> likely contains security flaws.
> The fact that many projects, including well-maintained ones such Sphinx
> or pip, use mktemp(), may be a hint that replacing it is not as easy as
> the people writing the Python documentation seem to think.
>
> Regards
>
> Antoine.
>
>





Re: [Python-Dev] datetime.timedelta total_microseconds

2019-02-27 Thread Paul Ganssle

On 2/26/19 7:03 PM, Chris Barker via Python-Dev wrote:
> This thread petered out, seemingly with a consensus that we should
> update the docs -- is anyone doing that?
>
I don't think anyone is, I've filed a BPO bug for it:
https://bugs.python.org/issue3613

>
> -- I am a physical scientist, I work with unitted quantities all the
> time (both in code and in other contexts). It never dawned on me to
> use this approach to convert to seconds or milliseconds, or ...
> Granted, I still rely on python2 for a fair bit of my work, but still,
> I had to scratch my head when it was proposed on this thread.
>
As another data point, I also have a background in the physical
sciences, and I actually do find it quite intuitive. The first time I
saw this idiom I took it to heart immediately and only stopped using it
because many of the libraries I maintain still support Python 2.

It seemed pretty obvious that I had a `timedelta` object that represents
times, and dividing it by a base value would give me the number of times
the "unit" timedelta fits into the "value" timedelta. Seeing the code
`timedelta(days=7) / timedelta(days=1)`, I think most people could
confidently say that that should return 7, there's really no ambiguity
about it.
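The idiom reads as "how many of the divisor fit in the dividend":

```python
from datetime import timedelta

assert timedelta(days=7) / timedelta(days=1) == 7
assert timedelta(days=1) / timedelta(hours=1) == 24
assert timedelta(minutes=1, seconds=30) / timedelta(minutes=1) == 1.5
```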


> -- There are a number of physical unit libraries in Python, and as far
> as I know, none of them let you do this to create a unitless value in
> a particular unit. "pint" for example:
>
> https://pint.readthedocs.io/en/latest/
>
> ...
>
> And you can reduce it to a dimensionless object:
>
> In [57]: unitless.to_reduced_units()
> Out[57]: 172800.0
>
I think the analogy with pint's unit-handling behavior is not completely
appropriate, because `timedelta` values are not in /specific/ units at
all, they are just in abstract duration units. It makes sense to
consider "seconds / day" as a specific dimensionless unit in the same
sense that "percent" and "ppb" make sense as specific dimensionless
units, and so that would be the natural behavior I would expect for
unit-handling code.

For timedelta, we don't have it as a value in specific units, so it's
not clear what the "ratio unit" would be. What we're looking at with
timedelta division is really more "how many <unit>s are there in this
duration".


> So no -- dividing a datetime by another datetime with the value you
> want is not intuitive: not to a physical scientist, not to a user of
> other physical quantities libraries -- is it intuitive to anyone other
> than someone that was involved in python datetime development??
>

Just to clarify, I am involved in Python datetime development now, but I
have only been involved in Python OSS for the last 4-5 years. I remember
finding it intuitive when I (likely working as a physicist at the time)
first saw it used.

> Discoverable:
> ==
>
I agree that it is not discoverable (which is unfortunate), but you
could say the same thing of /all/ operators. There's no tab-completion
that will tell you that `3.4 / 1` is a valid operation or that (3,) +
(4,) will work, but we don't generally recommend adding methods for
those things.

I do think the discoverability is hindered by the existence of the
total_seconds method, because the fact that total_seconds exists makes
you think that it is the correct way to get the number of seconds that a
timedelta represents, and that you should be looking for other analogous
methods as the "correct" way to do this, when in fact we have a simpler,
less ambiguous (for example, it's not obvious whether the methods would
truncate or not, whereas __truediv__ and __floordiv__ gives the division
operation pretty clear semantics) and more general way to do things.
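A quick illustration of those unambiguous operator semantics:

```python
from datetime import timedelta

td = timedelta(minutes=5, seconds=30)
unit = timedelta(minutes=1)

assert td / unit == 5.5                                # truediv keeps the fraction
assert td // unit == 5                                 # floordiv truncates
assert divmod(td, unit) == (5, timedelta(seconds=30))  # quotient and remainder
```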

I think it's too late to /remove/ `total_seconds()`, but I don't think
we should be compounding the problem by bloating the API with a bunch of
other methods, per my earlier arguments.




Re: [Python-Dev] datetime.timedelta total_microseconds

2019-02-16 Thread Paul Ganssle
I am definitely sympathetic to the idea of it being more readable, but I
feel like this adds some unnecessary bloat to the interface when "divide
the value by the units" is not at all uncommon. Plus, if you add a
total_duration that by default does the same thing as total_seconds, you
now have three functions that do exactly the same thing:

- td / timedelta(seconds=1)
- td.total_seconds()
- total_duration(td)

If it's just for the purposes of readability, you can also do this:

    from operator import truediv as total_duration   # (timedelta, interval)
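The alias reads reasonably naturally at the call site:

```python
from datetime import timedelta
from operator import truediv as total_duration  # readability alias, as above

td = timedelta(hours=2)
assert total_duration(td, timedelta(seconds=1)) == 7200.0
assert total_duration(td, timedelta(minutes=1)) == 120.0
```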

I think if we add such a function, it will essentially be just a slower
version of something that already exists. I suspect the main reason the
"divide the timedelta by the interval" thing isn't a common enough idiom
that people see it all the time is that it's only supported in Python 3.
As more code drops Python 2, I think the "td / interval" idiom will
hopefully become common enough that it will obviate the need for a
total_duration function.

That said, if people feel very strongly that a total_duration function
would be useful, maybe the best thing to do would be for me to add it to
dateutil.utils? In that case it would at least be available in Python 2,
so people who find it more readable /and/ people still writing polyglot
code would be able to use it, without the standard library unnecessarily
providing two ways to do the exact same thing.

On 2/16/19 11:59 AM, Nick Coghlan wrote:
> On Fri, 15 Feb 2019 at 04:15, Alexander Belopolsky
>  wrote:
>>
>>
>> On Thu, Feb 14, 2019 at 9:07 AM Paul Ganssle  wrote:
>>> I don't think it's totally unreasonable to have other total_X() methods, 
>>> where X would be days, hours, minutes and microseconds
>> I do.  I was against adding the total_seconds() method to begin with because 
>> the same effect can be achieved with
>>
>> delta / timedelta(seconds=1)
>>
>> this is easily generalized to
>>
>> delta / timedelta(X=1)
>>
>> where X can be days, hours, minutes or microseconds.
> As someone who reads date/time manipulation code far more often then
> he writes it, it's immediately obvious to me what
> "delta.total_seconds()" is doing, while "some_var / some_other_var"
> could be doing anything.
>
> So for the sake of those us that aren't as well versed in how time
> delta division works, it seems to me that adding:
>
> def total_duration(td, interval=timedelta(seconds=1)):
> return td / interval
>
> as a module level helper function would make a lot of sense. (This is
> a variant on Paul's helper function that accepts the divisor as a
> specifically named argument with a default value, rather than creating
> it on every call)
>
> Cheers,
> Nick.
>
> P.S. Why a function rather than a method? Mostly because this feels
> like "len() for timedelta objects" to me, but also because as a helper
> function, the docs can easily describe how to add it as a utility
> function for older versions.
>




Re: [Python-Dev] datetime.timedelta total_microseconds

2019-02-15 Thread Paul Ganssle
I'm still with Alexander on this. I see functions like total_X as
basically putting one of the arguments directly in the function name -
it should be `total_duration(units)`, not `total_units()`, because all
of those functions do the same thing and only differ in the units they use.

But Alexander's approach of "divide it by the base unit" is /even more
general/ than this, because it allows you to use non-traditional units
like weeks (timedelta(days=7)) or "two-day periods" or whatever you
want. If you use this idiom a lot and want a simple "calculate the
total" function, this should suffice:

def total_duration(td, *args, **kwargs):
    return td / timedelta(*args, **kwargs)

Then you can spell "x.total_microseconds()" as:

total_duration(x, microseconds=1)

Or you can write it like this:

def total_duration(td, units='seconds'):
    return td / timedelta(**{units: 1})

In which case it would be spelled:

total_duration(x, units='microseconds')

I don't see there being any compelling reason to add a bunch of methods
for a marginal (and I'd say arguable) gain in aesthetics.
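One more practical point in the division idiom's favor: timedelta division is exact integer arithmetic on the internal microseconds, while total_seconds() goes through a float, which runs out of microsecond precision past 2**53 microseconds (roughly 285 years):

```python
from datetime import timedelta

td = timedelta(days=105000, microseconds=1)      # ~287 years, plus 1 us
exact = td // timedelta(microseconds=1)          # exact integer arithmetic
approx = int(td.total_seconds() * 10**6)         # float round trip
assert exact == 9_072_000_000_000_001
assert approx != exact                           # the trailing microsecond is lost
```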

On 2/15/19 4:48 PM, Chris Barker via Python-Dev wrote:
> On Fri, Feb 15, 2019 at 11:58 AM Rob Cliffe via Python-Dev
> mailto:python-dev@python.org>> wrote:
>
> A function with "microseconds" in the name IMO misleadingly
> suggests that it has something closer to microsecond accuracy than
> a 1-second granularity.
>
>
> it sure does, but `delta.total_seconds()` is a float, so ms accuracy
> is preserved.
>
> However, if you DO want a "timedelta_to_microseconds" function, it
> really should use the microseconds field in the timedelta object. I
> haven't thought it through, but it makes me nervous to convert to
> floating point, and then back again -- for some large values of
> timedelta some precision may be lost.
>
> Also:
>
>> _MICROSECONDS_PER_SECOND = 1000000
>
> really? why in the world would you define a constant for something
> that simple that can never change? (and probably isn't used in more
> than one place anyway
>  
> As Alexander pointed out the canonical way to spell this would be:
>
> delta / timedelta(microseconds=1)
>
> but I think that is less than obvious to the usual user, so I think a:
>
> delta.total_microseconds()
>
> would be a reasonable addition.
>
> I know I use .total_seconds() quite a bit, and would not want to have
> to spell it:
>
> delta / timedelta(seconds=1)
>
> (and can't do that in py2 anyway)
>
> -CHB
>
> -- 
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> chris.bar...@noaa.gov 
>




Re: [Python-Dev] datetime.timedelta total_microseconds

2019-02-14 Thread Paul Ganssle
Ah yes, good point, I forgot about this because IIRC it's not supported
in Python 2.7, so it's not a particularly common idiom in polyglot
library code.

Obviously any new methods would be Python 3-only, so there's no benefit
to adding them.

Best,

Paul

On 2/14/19 1:12 PM, Alexander Belopolsky wrote:
>
>
> On Thu, Feb 14, 2019 at 9:07 AM Paul Ganssle  <mailto:p...@ganssle.io>> wrote:
>
> I don't think it's totally unreasonable to have other total_X()
> methods, where X would be days, hours, minutes and microseconds
>
> I do.  I was against adding the total_seconds() method to begin with
> because the same effect can be achieved with
>
> delta / timedelta(seconds=1)
>
> this is easily generalized to
>
> delta / timedelta(X=1)
>
> where X can be days, hours, minutes or microseconds.
>  




Re: [Python-Dev] datetime.timedelta total_microseconds

2019-02-14 Thread Paul Ganssle
I don't think it's totally unreasonable to have other total_X() methods,
where X would be days, hours, minutes and microseconds, but it also
doesn't seem like a pressing need to me.

I think the biggest argument against it is that they are all trivial to
implement as necessary, because they're just unit conversions that
involve multiplication or division by constants, which is nowhere near
as complicated to implement as the original `total_seconds` method.
Here's the issue where total_seconds() was implemented, it doesn't seem
like there was any discussion of other total methods until after the
issue was closed: https://bugs.python.org/issue5788

I think the main issue is how "thick" we want the timedelta class to
be.  With separate methods for every unit, we have to maintain and
document 5 methods instead of 1, though the methods are trivial and the
documentation could maybe be shared.

If I had a time machine, I'd probably recommend an interface something
like this:

def total_duration(self, units='seconds'):
    return self._total_seconds() / _SECONDS_PER_UNIT[units]

I suppose it would be possible to move to that interface today, though I
think it would be mildly confusing to have two functions that do the
same thing (total_seconds and total_duration), which may not be worth it
considering that these functions are a pretty minor convenience.
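A runnable version of that hypothetical interface; the unit table, the use of the public total_seconds(), and the division converting seconds into the requested unit are my additions for illustration:

```python
from datetime import timedelta

_SECONDS_PER_UNIT = {
    "days": 86400.0,
    "hours": 3600.0,
    "minutes": 60.0,
    "seconds": 1.0,
    "microseconds": 1e-6,
}

def total_duration(td, units="seconds"):
    # Divide by seconds-per-unit to express the delta in that unit.
    return td.total_seconds() / _SECONDS_PER_UNIT[units]

td = timedelta(days=1, hours=12)
assert total_duration(td) == 129600.0
assert total_duration(td, "hours") == 36.0
assert total_duration(td, "days") == 1.5
```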

Best,

Paul

On 2/14/19 12:05 AM, Richard Belleville via Python-Dev wrote:
> In a recent code review, the following snippet was called out as
> reinventing the
> wheel:
>
> _MICROSECONDS_PER_SECOND = 1000000
>
>
> def _timedelta_to_microseconds(delta):
>   return int(delta.total_seconds() * _MICROSECONDS_PER_SECOND)
>
>
> The reviewer thought that there must already exist a standard library
> function
> that fulfills this functionality. After we had both satisfied
> ourselves that we
> hadn't simply missed something in the documentation, we decided that
> we had
> better raise the issue with a wider audience.
>
> Does this functionality already exist within the standard library? If
> not, would
> a datetime.timedelta.total_microseconds function be a reasonable
> addition? I
> would be happy to submit a patch for such a thing.
>
> Richard Belleville
>




Re: [Python-Dev] Return type of datetime subclasses added to timedelta

2019-02-04 Thread Paul Ganssle
There's already a PR, actually, #10902:
https://github.com/python/cpython/pull/10902

Victor reviewed and approved it, I think before I started this thread,
so now it's just waiting on merge.
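Once that PR is merged (the change ultimately shipped in Python 3.8), subclass-preserving arithmetic looks like this:

```python
from datetime import datetime, timedelta

class MyDatetime(datetime):
    pass

result = MyDatetime(2019, 2, 4) + timedelta(days=1)
assert type(result) is MyDatetime   # before 3.8 this degraded to plain datetime
assert result == MyDatetime(2019, 2, 5)
```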

On 2/4/19 11:38 AM, Guido van Rossum wrote:
> I recommend that you submit a PR so we can get it into 3.8 alpha 2.
>
> On Mon, Feb 4, 2019 at 5:50 AM Paul Ganssle  <mailto:p...@ganssle.io>> wrote:
>
> Hey all,
>
> This thread about the return type of datetime operations seems to
> have stopped without any explicit decision - I think I responded
> to everyone who had objections, but I think only Guido has given a
> +1 to whether or not we should go ahead.
>
> Have we got agreement to go ahead with this change? Are we still
> targeting Python 3.8 here?
>
> For those who don't want to dig through your old e-mails, here's
> the archive link for this thread:
> https://mail.python.org/pipermail/python-dev/2019-January/155984.html
>
> If you want to start commenting on the actual implementation, it's
> available here (though it's pretty simple):
> https://github.com/python/cpython/pull/10902
>
> Best,
>
> Paul
>
>
> On 1/6/19 7:17 PM, Guido van Rossum wrote:
>> OK, I concede your point (and indeed I only tested this on 3.6).
>> If we could break the backward compatibility for now() we
>> presumably can break it for this purpose.
>>
>> On Sun, Jan 6, 2019 at 11:02 AM Paul Ganssle > <mailto:p...@ganssle.io>> wrote:
>>
>> I did address this in the original post - the assumption that
>> the subclass constructor will have the same arguments as the
>> base constructor is baked into many alternate constructors of
>> datetime. I acknowledge that this is a breaking change, but
>> it is a small one - anyone creating such a subclass that
>> /cannot/ handled the class being created this way would be
>> broken in myriad ways.
>>
>> We have also in recent years changed several alternate
>> constructors (including `replace`) to retain the original
>> subclass, which by your same standard would be a breaking
>> change. I believe there have been no complaints. In fact,
>> between Python 3.6 and 3.7, the very example you showed broke:
>>
>> Python 3.6.6:
>>
>> >>> class D(datetime.datetime):
>> ... def __new__(cls):
>> ... return cls.now()
>> ...
>> >>> D()
>> D(2019, 1, 6, 13, 49, 38, 842033)
>>
>> Python 3.7.2:
>>
>> >>> class D(datetime.datetime):
>> ... def __new__(cls):
>> ... return cls.now()
>> ...
>> >>> D()
>> Traceback (most recent call last):
>>   File "", line 1, in 
>>   File "", line 3, in __new__
>> TypeError: __new__() takes 1 positional argument but 9 were given
>>
>>
>> We haven't seen any bug reports about this sort of thing;
>> what we /have/ been getting is bug reports that subclassing
>> datetime doesn't retain the subclass in various ways (because
>> people /are/ using datetime subclasses). This is likely to
>> cause very little in the way of problems, but it will improve
>> convenience for people making datetime subclasses and almost
>> certainly performance for people using them (e.g. pendulum
>> and arrow, which now need to take a slow pure python route in
>> many situations to work around this problem).
>>
>> If we're /really/ concerned with this backward compatibility
>> breaking, we could do the equivalent of:
>>
>> try:
>>     return new_behavior(...)
>> except TypeError:
>>     warnings.warn("The semantics of timedelta addition have "
>>   "changed in a way that raises an error in "
>>   "this subclass. Please implement __add__ "
>>   "if you need the old behavior.",
>> DeprecationWarning)
>>
>> Then after a suitable notice period drop the warning and turn
>> it to a hard error.
>>
>> Best,
>>
>> Paul
>>
>> On 1/6/19 1:43 PM, Guido van Rossum wrote:
>>> I don't think datetime and builtins like int necessarily need to be aligned.

Re: [Python-Dev] Return type of datetime subclasses added to timedelta

2019-02-04 Thread Paul Ganssle
Hey all,

This thread about the return type of datetime operations seems to have
stopped without any explicit decision - I think I responded to everyone
who had objections, but I think only Guido has given a +1 to whether or
not we should go ahead.

Have we got agreement to go ahead with this change? Are we still
targeting Python 3.8 here?

For those who don't want to dig through your old e-mails, here's the
archive link for this thread:
https://mail.python.org/pipermail/python-dev/2019-January/155984.html

If you want to start commenting on the actual implementation, it's
available here (though it's pretty simple):
https://github.com/python/cpython/pull/10902

Best,

Paul


On 1/6/19 7:17 PM, Guido van Rossum wrote:
> OK, I concede your point (and indeed I only tested this on 3.6). If we
> could break the backward compatibility for now() we presumably can
> break it for this purpose.
>
> On Sun, Jan 6, 2019 at 11:02 AM Paul Ganssle  <mailto:p...@ganssle.io>> wrote:
>
> I did address this in the original post - the assumption that the
> subclass constructor will have the same arguments as the base
> constructor is baked into many alternate constructors of datetime.
> I acknowledge that this is a breaking change, but it is a small
> one - anyone creating such a subclass that /cannot/ handled the
> class being created this way would be broken in myriad ways.
>
> We have also in recent years changed several alternate
> constructors (including `replace`) to retain the original
> subclass, which by your same standard would be a breaking change.
> I believe there have been no complaints. In fact, between Python
> 3.6 and 3.7, the very example you showed broke:
>
> Python 3.6.6:
>
> >>> class D(datetime.datetime):
> ... def __new__(cls):
> ... return cls.now()
> ...
> >>> D()
> D(2019, 1, 6, 13, 49, 38, 842033)
>
> Python 3.7.2:
>
> >>> class D(datetime.datetime):
> ... def __new__(cls):
> ... return cls.now()
> ...
> >>> D()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 3, in __new__
> TypeError: __new__() takes 1 positional argument but 9 were given
>
>
> We haven't seen any bug reports about this sort of thing; what we
> /have/ been getting is bug reports that subclassing datetime
> doesn't retain the subclass in various ways (because people /are/
> using datetime subclasses). This is likely to cause very little in
> the way of problems, but it will improve convenience for people
> making datetime subclasses and almost certainly performance for
> people using them (e.g. pendulum and arrow, which now need to take
> a slow pure python route in many situations to work around this
> problem).
>
> If we're /really/ concerned with this backward compatibility
> breaking, we could do the equivalent of:
>
> try:
>     return new_behavior(...)
> except TypeError:
>     warnings.warn("The semantics of timedelta addition have "
>   "changed in a way that raises an error in "
>   "this subclass. Please implement __add__ "
>   "if you need the old behavior.", DeprecationWarning)
>
> Then after a suitable notice period drop the warning and turn it
> to a hard error.
>
> Best,
>
> Paul
>
> On 1/6/19 1:43 PM, Guido van Rossum wrote:
>> I don't think datetime and builtins like int necessarily need to
>> be aligned. But I do see a problem -- the __new__ and __init__
>> methods defined in the subclass (if any) should allow for being
>> called with the same signature as the base datetime class.
>> Currently you can have a subclass of datetime whose __new__ has
>> no arguments (or, more realistically, interprets its arguments
>> differently). Instances of such a class can still be added to a
>> timedelta. The proposal would cause this to break (since such an
>> addition has to create a new instance, which calls __new__ and
>> __init__). Since this is a backwards incompatibility, I don't see
>> how it can be done -- and I also don't see many use cases, so I
>> think it's not worth pursuing further.
>>
>> Note that the same problem already happens with the
>> .fromordinal() class method, though it doesn't happen with
>> .fromdatetime() or .now():
>>
>> >>> class D(datetime.datetime):
>> ...   def __new__(cls): return cls.now()
>&

Re: [Python-Dev] Return type of datetime subclasses added to timedelta

2019-01-06 Thread Paul Ganssle
Brett,

Thank you for bringing this up, but I think you /may/ have misunderstood
my position - though maybe you understood the thrust and wanted to
clarify for people coming in halfway, which I applaud.

I proposed this change /knowing/ that it was a breaking change - it's
why I brought it to the attention of datetime-SIG and now python-dev -
and I believe that there are several factors that lead this to being a
smaller compatibility problem than it seems.

One such factor is the fact that /many/ other features of `datetime`,
including the implementation of `datetime.now()` are /already broken/ in
the current implementation for anyone who would be broken by this
particular aspect of the semantic change. That is not saying that it's
impossible that there is code out there that will break if this change
goes through, it's just saying that the scope of the breakage is
necessarily very limited.

The reason I brought up the bug tracker is because between Python 3.6
and Python 3.7, we in fact made a similar breaking change to the one I'm
proposing here without thinking that anyone might be relying on the fact
that they could do something like:

class D(datetime.datetime):
    def __new__(cls):
    return cls.now()

My point was that there have been no bug reports about the /existing
change/ that Guido was bringing up (his example itself does not work on
Python 3.7!), which leads me to believe that few if any people are
relying on the fact that it is possible to define a datetime subclass
with a different default constructor.

As I mentioned, it is likely possible to have a transition period where
this would still work even if the subclassers have not created their own
__add__ method.

There is no way to create a similar deprecation/transition period for
people relying on the fact that `type(datetime_obj + timedelta_obj) ==
datetime.datetime`, but I think this is honestly a sufficiently minor
breakage that the good outweighs the harm. I will note that we have
already made several such changes with respect to alternate constructors
even though technically someone could have been relying on the fact that
`MyDateTime(*args).replace(month=3)` returns a `datetime` object.
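For anyone who wants to check this on their own interpreter, here is a quick
sketch (the `MyDateTime` name is invented for illustration; assumes a
reasonably modern CPython, 3.7+) of the `replace` behavior described above:

```python
import datetime

class MyDateTime(datetime.datetime):
    """A trivial datetime subclass that keeps the base constructor."""
    pass

# replace() now retains the original subclass rather than
# returning a plain datetime:
d = MyDateTime(2019, 1, 5).replace(month=3)
print(type(d).__name__)         # MyDateTime
print(d.year, d.month, d.day)   # 2019 3 5
```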

This is not to say that we should lightly make the change (hence my
canvassing for opinions), it is just that there is a good amount of
evidence that, practically speaking, no one is relying on this, and in
fact it is likely that people are writing code that assumes that adding
`timedelta` to a datetime subclass returns the original subclass, either
directly or indirectly - I think we're likely to fix more people than we
break if we make this change.

Best,
Paul


On 1/6/19 3:24 PM, Brett Cannon wrote:
>
>
> On Sun, 6 Jan 2019 at 11:00, Paul Ganssle <p...@ganssle.io> wrote:
>
> I did address this in the original post - the assumption that the
> subclass constructor will have the same arguments as the base
> constructor is baked into many alternate constructors of datetime.
> I acknowledge that this is a breaking change, but it is a small
> one - anyone creating such a subclass that /cannot/ handle the
> class being created this way would be broken in myriad ways.
>
> We have also in recent years changed several alternate
> constructors (including `replace`) to retain the original
> subclass, which by your same standard would be a breaking change.
> I believe there have been no complaints. In fact, between Python
> 3.6 and 3.7, the very example you showed broke:
>
> Python 3.6.6:
>
> >>> class D(datetime.datetime):
> ... def __new__(cls):
> ... return cls.now()
> ...
> >>> D()
> D(2019, 1, 6, 13, 49, 38, 842033)
>
> Python 3.7.2:
>
> >>> class D(datetime.datetime):
> ... def __new__(cls):
> ... return cls.now()
> ...
> >>> D()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 3, in __new__
> TypeError: __new__() takes 1 positional argument but 9 were given
>
>
> We haven't seen any bug reports about this sort of thing; what we
> /have/ been getting is bug reports that subclassing datetime
> doesn't retain the subclass in various ways (because people /are/
> using datetime subclasses).
>
>
> To help set expectations, the current semantics are not a bug and so
> the proposal isn't fixing a bug but proposing a change in semantics.
>  
>
> This is likely to cause very little in the way of problems, but it
> will improve convenience for people making datetime subclasses and
> almost certainly performance for people using them (e.g. pendulum
> and arrow, which now need to take a slow pure python route in many
> situations to work around this problem).

Re: [Python-Dev] Return type of datetime subclasses added to timedelta

2019-01-06 Thread Paul Ganssle
I did address this in the original post - the assumption that the
subclass constructor will have the same arguments as the base
constructor is baked into many alternate constructors of datetime. I
acknowledge that this is a breaking change, but it is a small one -
anyone creating such a subclass that /cannot/ handle the class being
created this way would be broken in myriad ways.

We have also in recent years changed several alternate constructors
(including `replace`) to retain the original subclass, which by your
same standard would be a breaking change. I believe there have been no
complaints. In fact, between Python 3.6 and 3.7, the very example you
showed broke:

Python 3.6.6:

>>> class D(datetime.datetime):
... def __new__(cls):
... return cls.now()
...
>>> D()
D(2019, 1, 6, 13, 49, 38, 842033)

Python 3.7.2:

>>> class D(datetime.datetime):
... def __new__(cls):
... return cls.now()
...
>>> D()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __new__
TypeError: __new__() takes 1 positional argument but 9 were given


We haven't seen any bug reports about this sort of thing; what we /have/
been getting is bug reports that subclassing datetime doesn't retain the
subclass in various ways (because people /are/ using datetime
subclasses). This is likely to cause very little in the way of problems,
but it will improve convenience for people making datetime subclasses
and almost certainly performance for people using them (e.g. pendulum
and arrow, which now need to take a slow pure python route in many
situations to work around this problem).

If we're /really/ concerned with this backward compatibility breaking,
we could do the equivalent of:

try:
    return new_behavior(...)
except TypeError:
    warnings.warn("The semantics of timedelta addition have "
  "changed in a way that raises an error in "
  "this subclass. Please implement __add__ "
  "if you need the old behavior.", DeprecationWarning)

Then after a suitable notice period drop the warning and turn it to a
hard error.

Best,

Paul

On 1/6/19 1:43 PM, Guido van Rossum wrote:
> I don't think datetime and builtins like int necessarily need to be
> aligned. But I do see a problem -- the __new__ and __init__ methods
> defined in the subclass (if any) should allow for being called with
> the same signature as the base datetime class. Currently you can have
> a subclass of datetime whose __new__ has no arguments (or, more
> realistically, interprets its arguments differently). Instances of
> such a class can still be added to a timedelta. The proposal would
> cause this to break (since such an addition has to create a new
> instance, which calls __new__ and __init__). Since this is a backwards
> incompatibility, I don't see how it can be done -- and I also don't
> see many use cases, so I think it's not worth pursuing further.
>
> Note that the same problem already happens with the .fromordinal()
> class method, though it doesn't happen with .fromdatetime() or .now():
>
> >>> class D(datetime.datetime):
> ...   def __new__(cls): return cls.now()
> ...
> >>> D()
> D(2019, 1, 6, 10, 33, 37, 161606)
> >>> D.fromordinal(100)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: __new__() takes 1 positional argument but 4 were given
> >>> D.fromtimestamp(123456789)
> D(1973, 11, 29, 13, 33, 9)
> >>>
>
> On Sun, Jan 6, 2019 at 9:05 AM Paul Ganssle <p...@ganssle.io> wrote:
>
> I can think of many reasons why datetime is different from
> builtins, though to be honest I'm not sure that consistency for
> its own sake is really a strong argument for keeping a
> counter-intuitive behavior - and to be honest I'm open to the idea
> that /all/ arithmetic types /should/ have some form of this change.
>
> That said, I would say that the biggest difference between
> datetime and builtins (other than the fact that datetime is /not/
> a builtin, and as such doesn't necessarily need to be categorized
> in this group), is that unlike almost all other arithmetic types,
> /datetime/ has a special, dedicated type for describing
> differences in datetimes. Using your example of a float subclass,
> consider that without the behavior of "addition of floats returns
> floats", it would be hard to predict what would happen in this
> situation:
>
> >>> F(1.2) + 3.4
>
> Would that always return a float, even though F(1.2) + F(3.4)
> returns an F? Would that return an F because F is the left-hand
> operand? Would it return a float because float is the 

Re: [Python-Dev] Return type of datetime subclasses added to timedelta

2019-01-06 Thread Paul Ganssle

On 1/6/19 1:29 PM, Andrew Svetlov wrote:
> From my perspective datetime classes are even more complex than int/float.
> Let's assume we have
>
> class DT(datetime.datetime): ...
> class TD(datetime.timedelta): ...
>
> What is the result type for the following expressions?
> DT - datetime
> DT - DT
> DT + TD
> DT + timedelta
>
It is not really complicated: the default "difference between two
datetimes" returns a `timedelta`, and you can change that by overriding
`__sub__` or `__rsub__` as desired, but there's no reason to think that
just because DT is a subclass of datetime it would be coupled to a
specific timedelta subclass *by default*.

Similarly, DT + TD by default will do whatever "datetime" and
"timedelta" do unless you specifically override them. In my proposal,
adding some time to a datetime subclass would return an object of the
datetime subclass, so unless __radd__ or __rsub__ were overriden in
`timedelta`, that's what would happen, the defaults would be (sensibly):

DT - datetime -> timedelta
DT - DT -> timedelta
DT + TD -> DT
DT + timedelta -> DT
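For contrast, here is a small sketch of what *current* CPython does with
Andrew's DT/TD classes, i.e. the behavior the proposal would change (at the
time of this thread the additions hand back a plain datetime):

```python
import datetime

class DT(datetime.datetime):
    pass

class TD(datetime.timedelta):
    pass

a = DT(2020, 1, 1)
b = DT(2020, 1, 2)

# Subtraction of two datetimes (or subclasses) yields a plain timedelta:
print(type(b - a))
# At the time of this thread, adding a timedelta (or a timedelta
# subclass) to DT loses the subclass and returns plain datetime:
print(type(a + TD(days=1)))
print(type(a + datetime.timedelta(days=1)))
```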

The only time it would be more complicated is if datetime were defined
like this:

class datetime:
    TIMEDELTA_CLASS = datetime.timedelta
    ...

In which case you'd have the same problem you have with float/int/etc
(not a particularly more complicated one). But that's not the case, and
there /is/ one obviously right answer. This is not the case with float
subclasses, because the intuitive rule is "adding together two objects
of the same class gives the same class", which fails when you have two
different subclasses. With datetime, you have "adding a delta type to a
value type returns an object of the value type", which makes perfect
sense, as opposed to "adding a delta type to a value type returns the
base value type, even if the base value type was never used".


> I have a feeling that the question has no generic answer.
> For *particular* implementation you can override all __add__, __sub__
> and other arithmetic operations, and you can do it right now with the
> current datetime module implementation.
> P.S.
> I think inheritance from datetime classes is a very rare thing, 99.99%
> of users don't need it.
>
Both of these points are addressed in my original post, IIRC, but both
of these arguments cut both ways. Assuming it's true that this is very
rare - the 0.01% of people who /are/ subclassing datetime either don't
care about this behavior or want timedelta arithmetic to return their
subclass. It's rare enough that there should be no problem giving them
what they want.

Similarly, the rarest group - people who are creating datetime
subclasses /and/ want the original behavior - can simply implement
__add__ and __sub__ to get what they want, so there's no real conflict,
it's just a matter of setting a sane default that also solves the
problem that datetime alternate constructors tend to leak their
implementation details because of the arithmetic return type issue.
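For the record, here is a sketch of that workaround, roughly in the spirit of
what libraries like pendulum do today (a minimal invented example, not their
actual implementation):

```python
import datetime

class MyDateTime(datetime.datetime):
    """Re-wrap arithmetic results so the subclass survives timedelta math."""

    def _rewrap(self, result):
        # Plain-datetime results get reconstructed as the subclass;
        # timedeltas and NotImplemented pass through untouched.
        if isinstance(result, datetime.datetime) and type(result) is not type(self):
            return type(self)(result.year, result.month, result.day,
                              result.hour, result.minute, result.second,
                              result.microsecond, tzinfo=result.tzinfo)
        return result

    def __add__(self, other):
        return self._rewrap(datetime.datetime.__add__(self, other))

    __radd__ = __add__

    def __sub__(self, other):
        return self._rewrap(datetime.datetime.__sub__(self, other))


d = MyDateTime(2020, 1, 1)
print(type(d + datetime.timedelta(days=1)).__name__)   # MyDateTime
print(type(d - datetime.timedelta(days=1)).__name__)   # MyDateTime
print(type(d - d).__name__)                            # timedelta
```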


Best, Paul



signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Return type of datetime subclasses added to timedelta

2019-01-06 Thread Paul Ganssle
I can think of many reasons why datetime is different from builtins,
though to be honest I'm not sure that consistency for its own sake is
really a strong argument for keeping a counter-intuitive behavior - and
to be honest I'm open to the idea that /all/ arithmetic types /should/
have some form of this change.

That said, I would say that the biggest difference between datetime and
builtins (other than the fact that datetime is /not/ a builtin, and as
such doesn't necessarily need to be categorized in this group), is that
unlike almost all other arithmetic types, /datetime/ has a special,
dedicated type for describing differences in datetimes. Using your
example of a float subclass, consider that without the behavior of
"addition of floats returns floats", it would be hard to predict what
would happen in this situation:

>>> F(1.2) + 3.4

Would that always return a float, even though F(1.2) + F(3.4) returns an
F? Would that return an F because F is the left-hand operand? Would it
return a float because float is the right-hand operand? Would you walk
the MROs and find the lowest type in common between the operands and
return that? It's not entirely clear which subtype predominates. With
datetime, you have:

datetime - datetime -> timedelta
datetime ± timedelta -> datetime
timedelta ± timedelta -> timedelta

There's no operation between two datetime objects that would return a
datetime object, so it's always clear: operations between datetime
subclasses return timedelta, operations between a datetime object and a
timedelta return the subclass of the datetime that it was added to or
subtracted from.
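Those three rules are easy to verify with the base classes:

```python
import datetime as dt

d1 = dt.datetime(2020, 1, 1)
d2 = dt.datetime(2020, 1, 2)
td = dt.timedelta(hours=6)

assert type(d2 - d1) is dt.timedelta   # datetime - datetime -> timedelta
assert type(d1 + td) is dt.datetime    # datetime ± timedelta -> datetime
assert type(d1 - td) is dt.datetime
assert type(td + td) is dt.timedelta   # timedelta ± timedelta -> timedelta
assert type(td - td) is dt.timedelta
print("arithmetic is closed over the expected types")
```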

Of course, the real way to resolve whether datetime should be different
from int/float/string/etc is to look at why this choice was actually
made for those types in the first place, and decide whether datetime is
like them /in this respect/. The heterogeneous operations problem may be
a reasonable justification for leaving the other builtins alone but
changing datetime, but if someone knows of other fundamental reasons why
the decision to have arithmetic operations always create the base class
was chosen, please let me know.

Best,
Paul

On 1/5/19 3:55 AM, Alexander Belopolsky wrote:
>
>
> On Wed, Jan 2, 2019 at 10:18 PM Paul Ganssle <p...@ganssle.io> wrote:
>
> .. the original objection was that this implementation assumes
> that the datetime subclass has a constructor with the same (or a
> sufficiently similar) signature as datetime.
>
> While this was used as a possible rationale for the way standard types
> behave, the main objection to changing datetime classes is that it
> will make them behave differently from builtins.  For example:
>
> >>> class F(float):
> ...     pass
> ...
> >>> type(F.fromhex('AA'))
> <class '__main__.F'>
> >>> type(F(1) + F(2))
> <class 'float'>
>
> This may be a legitimate gripe, but unfortunately that ship has
> sailed long ago. All of datetime's alternate constructors make
> this assumption. Any subclass that does not meet this requirement
> must have worked around it long ago (or they don't care about
> alternate constructors).
>
>
> This is right, but the same argument is equally applicable to int,
> float, etc. subclasses.  If you want to limit your change to datetime
> types you should explain what makes these types special.  


Re: [Python-Dev] Compilation of "except FooExc as var" adds useless store

2019-01-06 Thread Paul Ganssle
On 1/6/19 9:14 AM, Steven D'Aprano wrote:
> [...]
> But a better question is, why would you (generic, not you personally) 
> imagine that, alone out of all flow control statements, ONLY "except" 
> clauses introduce a new scope? Every other flow control statement (for, 
> while, if, elif, else, try, with) runs in the current scope. The only 
> statements which create a new scope are def and class. (Did I miss any?)

To be fair, except is already unique in that it /does/ "pseudo-scope" the
binding to the variable. The other obvious comparisons are to for loops
and context managers, both of which bind a value to a name that survives
after the exit of the control flow statement.

Given the reference counting reasons for exceptions *not* to outlive
their control flow statement, they are the "odd man out" in that they
delete the exception after the control statement's body exits. To me,
the natural and intuitive way to do this would be to have the exception
live in its own scope where it shadows existing variables, rather than
replacing and then completely removing them. The way it works now is
halfway between the behavior of existing control flow statements and
having a proper nested scope.

Not saying that anything has to change - in the end this is one of the
more minor "gotchas" about Python, and there may be practical reasons
for leaving it as it is - but I do think it's worth noting that for many
people this /will/ be surprising behavior, even if you know that
exceptions don't survive outside of the "except" block.
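A small demonstration of the contrast being drawn here (this behavior has
been in place since Python 3.0):

```python
# for loops leak their binding past the end of the block:
for i in range(3):
    pass
print(i)  # 2 -- the loop variable survives

# except clauses do not: the name is deleted on exit, and a
# pre-existing binding of the same name is *not* restored.
e = "pre-existing"
try:
    raise ValueError("boom")
except ValueError as e:
    print(e)  # boom
print("e" in globals())  # False -- the old binding is gone too
```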



[Python-Dev] Return type of datetime subclasses added to timedelta

2019-01-02 Thread Paul Ganssle
Happy New Year everyone!

I would like to start a thread here for wider feedback on my proposal to
change the return type of the addition operation between a datetime
subclass and a timedelta. Currently, adding a timedelta to a subclass of
datetime /always/ returns a datetime rather than an instance of the
datetime subclass.

I have an open PR implementing this, PR #10902, but I know it's a major
change so I did not want to move forward without more discussion. I
first brought this up on datetime-SIG [1], and we decided to move the
discussion over here because the people
most likely to object to the change would be on this list and not on
datetime-SIG.

In addition to the datetime-SIG thread, you may find a detailed
rationale for the change in bpo-35364 [2], and a rationale for
why we would want to (and arguably already /do/) support subclassing
datetime in bpo-32417 [3].

A short version of the strongest rationale for changing how this works
is that it is causing inconsistencies in how subclassing is handled in
alternate constructors of datetime. For a given subclass of datetime
(which I will call DateTimeSub), nearly all alternate constructors
already support subclasses correctly - DateTimeSub.fromtimestamp(x) will
return a DateTimeSub, for example. However, because DateTimeSub +
timedelta returns datetime, any alternate constructor implemented in
terms of timedelta additions will leak that implementation detail by
returning a datetime object instead of the subclass. The biggest problem
is that datetime.fromutc is defined in terms of timedelta addition, so
DateTimeSub.now() returns a DateTimeSub object, but
DateTimeSub.now(timezone.utc) returns a datetime object! This is one of
the most annoying things to work around when building a datetime
subclass, and I don't know of any situation where someone /wants/ their
subclass to be lost on addition with a timedelta.
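A sketch of that inconsistency as it stood when this was written (the naive
case is asserted; the aware case is only printed, since it is exactly what
the linked PR would change):

```python
import datetime

class DateTimeSub(datetime.datetime):
    pass

# The naive constructor path round-trips through the subclass:
now = DateTimeSub.now()
print(type(now).__name__)  # DateTimeSub

# The aware path goes through fromutc(), which is implemented via
# timedelta addition -- at the time of this thread that loses the
# subclass and hands back a plain datetime:
aware = DateTimeSub.now(datetime.timezone.utc)
print(type(aware).__name__)
```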

From my understanding, this has been discussed before and the original
objection was that this implementation assumes that the datetime
subclass has a constructor with the same (or a sufficiently similar)
signature as datetime. This may be a legitimate gripe, but unfortunately
that ship has sailed long ago. All of datetime's alternate constructors
make this assumption. Any subclass that does not meet this requirement
must have worked around it long ago (or they don't care about alternate
constructors).

Thanks for your attention, I look forward to your replies.

Best,

Paul

[1]
https://mail.python.org/archives/list/datetime-...@python.org/thread/TGB3VZS5EKM4R2VFUA44323FZFRN2DSJ/

[2] https://bugs.python.org/issue35364#msg331065

[3] https://bugs.python.org/issue32417#msg331353




Re: [Python-Dev] Get a running instance of the doc for a PR.

2018-11-04 Thread Paul Ganssle

On 11/4/18 5:38 PM, Steven D'Aprano wrote:
> On Sun, Nov 04, 2018 at 12:16:14PM -0500, Ned Deily wrote:
>
>> On Nov 4, 2018, at 12:04, Paul Ganssle  wrote:
>>
>>> Some of the concerns about increasing the surface area I think are a 
>>> bit overblown. I haven't seen any problems yet in the projects that 
>>> do this,
> You may or may not be right, but have you looked for problems or just 
> assumed that because nobody has brought any to your attention, they 
> don't exist?
>
> "I have seen nothing" != "there is nothing to see".
>
I can only speak from my experience with setuptools, but I do look at
every setuptools PR and I've never seen anything even close to this.
That said, I have also never seen anyone using my Travis or Appveyor
instances to mine cryptocurrency, but I've been told that that happens.

In any case, I think the standard should not be "this never happens"
(otherwise you also can't run CI), but that it happens rarely enough
that it's not a major problem and that you can deal with it when it does
come up. Frankly, I think the much more likely target for these sorts of
attacks is small, mostly abandoned projects with very few followers. If
you post a spam site on some ephemeral domain via the CPython CI, it's
likely that hundreds of people will notice it just because it's a very
active project. You will be banned from the project for life and
probably reported to github nearly instantly. You would likely get much
more value for your time by targeting some 1-star repo that set this up
2 years ago and is maintained by someone who hasn't committed to github in
over a year.

That said, big projects like CPython are probably more likely to attract
the troll version of this, where the point isn't to get away with
hosting some content or using the CI, but to annoy and disrupt the
project itself by wasting our resources chasing down spam or whatever. I
think if that isn't already happening with comment floods on the issue
tracker, GH threads and mailing lists, it's not especially /more/ likely
to happen because people can spin up a website with a PR.

>>> and I don't think it lends itself to abuse particularly 
>>> well. Considering that the rest of the CI suite lets you run 
>>> arbitrary code on many platforms, I don't think it's particularly 
>>> more dangerous to allow people to generate ephemeral static hosted 
>>> web sites as well.
>> The rest of the CI suite does not let you publish things on the 
>> python.org domain, unless I'm forgetting something; they're clearly 
>> under a CI environment like Travis or AppVeyor or Azure.  That's 
>> really my main concern.
> Sorry Ned, I don't follow you here. It sounds like you're saying that 
> you're fine with spam or abusive content being hosted in our name, so 
> long as it's hosted by somebody else, rather than by us (python.org) 
> ourselves.
>
> I trust I'm missing something, but I don't know what it is.

I think there are two concerns - one is that the python.org domain is
generally (currently) used for official content. If people can put
arbitrary websites on there, presumably they can exploit whatever trust
people have put into this fact.

Another is that - and I am not a web expert here - I think that the
domain where content is hosted is used as a marker of trust between
different pages, and many applications will consider anything on
*.python.org to be first-party content from other *.python.org domains. 
I believe this is the reason why readthedocs moved all hosted
documentation from *.readthedocs.org to *.readthedocs.io. Similarly
user-submitted content on PyPI is usually hosted under the
pythonhosted.org domain, not pypi.org or pypi.python.org. You'll notice
that GH also hosts user content under a githubusercontent.com domain.



Re: [Python-Dev] Get a running instance of the doc for a PR.

2018-11-04 Thread Paul Ganssle
Oh, sorry if I misunderstood the concern. Yes, I agree that putting this
under python.org would not be a good idea.

Either hosting it on a hosting provider like netlify (or azure if that's
possible) or a dedicated domain that could be created for the purpose
(e.g. python-doc-ci.org) would be best. Alternatively, the domain could
be skipped entirely and the github hooks could link directly to
documentation by machine IP (though I suspect buying a domain for this
purpose would be a lot easier than coordinating what's necessary to make
direct links to a machine IP reasonable).


Best,
Paul

On 11/4/18 12:16 PM, Ned Deily wrote:
> On Nov 4, 2018, at 12:04, Paul Ganssle  wrote:
>> Some of the concerns about increasing the surface area I think are a bit 
>> overblown. I haven't seen any problems yet in the projects that do this, and 
>> I don't think it lends itself to abuse particularly well. Considering that 
>> the rest of the CI suite lets you run arbitrary code on many platforms, I 
>> don't think it's particularly more dangerous to allow people to generate 
>> ephemeral static hosted web sites as well.
> The rest of the CI suite does not let you publish things on the python.org 
> domain, unless I'm forgetting something; they're clearly under a CI 
> environment like Travis or AppVeyor or Azure.  That's really my main concern.
>
>
> --
>   Ned Deily
>   n...@python.org -- []
>



Re: [Python-Dev] Get a running instance of the doc for a PR.

2018-11-04 Thread Paul Ganssle
There is an open request for this on GH, but it's not currently done:
https://github.com/rtfd/readthedocs.org/issues/1340

At the PyCon US sprints this year, we added documentation previews via
netlify, and they have been super useful:
https://github.com/pypa/setuptools/pull/1367 My understanding is that
other projects do something similar with CircleCI.

It's not amazingly /difficult/ for reviewers to fetch the submitter's
branch, build the documentation and review it locally, but it's a decent
number of extra steps for what /should/ be a very simple review. I think
we all know that reviewer time and effort is one of the biggest
bottlenecks in the CPython development workflow, and this could make it
/much/ easier to do reviews.

Some of the concerns about increasing the surface area I think are a bit
overblown. I haven't seen any problems yet in the projects that do this,
and I don't think it lends itself to abuse particularly
well. Considering that the rest of the CI suite lets you run arbitrary
code on many platforms, I don't think it's particularly more dangerous
to allow people to generate ephemeral static hosted web sites as well.


Best.

Paul

On 11/4/18 11:28 AM, Alex Walters wrote:
> Doesn't read the docs already do this for pull requests?  Even if it doesn't, 
> don't the core maintainers of read the docs go to pycon?  I wouldn't suggest 
> read the docs for primary docs hosting for python, but they are perfectly 
> fine for live testing pull request documentation without having to roll our 
> own.
>
>> -Original Message-
>> From: Python-Dev On Behalf Of Stephane Wirtel
>> Sent: Sunday, November 4, 2018 8:38 AM
>> To: python-dev@python.org
>> Subject: [Python-Dev] Get a running instance of the doc for a PR.
>>
>> Hi all,
>>
>> When we receive a PR about the documentation, I think that could be
>> interesting if we could have a running instance of the doc on a sub
>> domain of python.org.
>>
>> For example, pr-1-doc.python.org or whatever, but by this way the
>> reviewers could see the result online.
>>
>> The workflow would be like that:
>>
>> New PR -> build the doc (done by Travis) -> publish it to a server ->
>> once published, the PR is notified by "doc is available at URL".
>>
>> Once merged -> we remove the doc and the link (hello bedevere).
>>
>> I am interested in this feature, and if you are also interested, tell me.
>> I would like to discuss a solution with Julien Palard and Ernest W. Durbin
>> III as soon as possible.
>>
>> Have a nice day,
>>
>> Stéphane
>>
>> --
>> Stéphane Wirtel - https://wirtel.be - @matrixise
>> ___
>> Python-Dev mailing list
>> Python-Dev@python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: https://mail.python.org/mailman/options/python-dev/tritium-
>> list%40sdamon.com
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/paul%40ganssle.io


signature.asc
Description: OpenPGP digital signature
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Official citation for Python

2018-09-16 Thread Paul Ganssle
I think the "why" in this case should be a bit deeper than that, because
until recently, it's been somewhat unusual to cite the /tools you use/
to create a paper.

I see three major reasons why people cite software packages, and the
form of the citation would have different requirements for each one:

1. *Academic credit / Academic use metrics*

The weird way that academia has evolved, academics are largely judged by
their publications and how influential those publications are. A lot of
the people who work on statistical and scientific python libraries are
doing excellent and incredibly influential work, but that's largely
invisible to the metrics used by funding and tenure committees, so
there's been an effort to do things like getting DOIs for libraries or
publishing articles in journals like the journal of open source
software: https://joss.theoj.org

Then you cite the libraries if you use them, and the people who
contribute to the work can say, "Look I'm a regular contributor to this
core library that is cited in 90% of papers". This seems less important
to CPython, where the majority of core contributors (as far as I can
tell) are not academics and have little use for high h-index papers.
That said, even if no one involved cares about the academic credit, if
every paper that used Python cited the language, it probably /would/
provide useful metrics to the PSF and others interested in this.

If all you want is a formal way to say "I used Python for this" as a
citation so that it can be tracked, then a single DOI for the entire
language should be sufficient.

2. *As a primary source or example for some claims*

If you are writing an article about language design and you are
referencing how Python handles async or scoping or unicode or something,
you want to make it easy for your readers to see the context of your
statement, to verify that it's true and to get more details than you
might want to include as part of what may be a tangential mention in
your paper. I have a sense that this is closer to the original reason
people cited things in papers and books before citations became a metric
for measuring influence - and subsequently a way to give credit for the
source of ideas.

If this is why you are citing Python, you should probably be citing a
specific sub-section of the language reference and/or documentation, and
that citation should probably be versioned, since new features are added
in every minor version, and the way some of these things are handled may
change over time. In this case, a separate DOI for each minor version
that points to the documentation as built by a specific commit or git
tag or whatever would probably be ideal.

3. *To aid reproducibility*

It won't go all the way towards reproducing your research, but given
that Python is a living language that is always changing - both in
implementation and the spec itself - to the extent that you have a
"methods" section, it should probably include things like operating
system version, CPython version and the versions of all libraries you
used so that if someone is failing to replicate your results, they know
how to build an environment where it /should work/.

If you want to include this information in the form of a citation, then
I would think that you would want to be both /more/ granular - citing
the specific interpreter you used (CPython, Jython, Pypy), the full
version (3.6.6 rather than 3.6) and possibly even other factors like
operating system, etc, and /less/ granular in that you don't need to
cite a specific subset of the interpreter (e.g. async), but just the
interpreter as a whole.

--

My thoughts on the matter are that I think the CPython core dev team
probably cares a lot less about #1 than, say, the R dev team, which is
one reason why there's no clear way to cite "CPython" as a whole.

I think that #3 is a very laudable goal, but probably should be in some
sort of "methods" section of the document being prepared rather than
overloading citations for it, though having a standardized way to
describe your Python setup (similar to, say, the pandas debugging
feature `pandas.show_versions()`) that is optimized for publication
would probably be super helpful.
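
A standardized dump along those lines could be as small as the following
sketch (a hypothetical helper modeled loosely on `pandas.show_versions()`,
not an existing stdlib API):

```python
import platform


def show_versions():
    """Print the environment details a 'methods' section would need."""
    print("Interpreter:", platform.python_implementation())  # CPython, PyPy, ...
    print("Python version:", platform.python_version())      # e.g. 3.6.6, not just 3.6
    print("OS:", platform.platform())
    # A real helper would also enumerate installed package versions.


show_versions()
```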

While #2 is probably only a small fraction of all the times where people
would want to "cite CPython", I think it's probably the most important
one, since it's performing a very specific function useful to the reader
of the paper. It also seems not terribly difficult to come up with some
guidance for unambiguously referencing sections of the documentation
and/or language reference, and having "get a DOI for the documentation"
be part of the release cycle.

Best,
Paul

P.S. I will also be at the NumFocus summit. It's been some time since
I've been an academic, but hopefully there will be an interesting
discussion about this there!

On 9/16/18 6:22 PM, Jacqueline Kazil wrote:

>
> RE: Why cite Python….
>
> I would say that in this paper —
> 

Re: [Python-Dev] iso8601 parsing

2017-12-06 Thread Paul Ganssle
Here is the PR I've submitted:

https://github.com/python/cpython/pull/4699

The contract that I'm supporting (and, I think it can be argued, the only 
reasonable contract in the initial implementation) is the following:

dtstr = dt.isoformat(*args, **kwargs)
dt_rt = datetime.fromisoformat(dtstr)
assert dt_rt == dt  # The two points represent the same absolute time
assert dt_rt.replace(tzinfo=None) == dt.replace(tzinfo=None)  # And the same wall time

For all valid values of `dt`, `args` and `kwargs`.

A corollary of the `dt_rt == dt` invariant is that you can perfectly recreate 
the original `datetime` with the following additional step:

dt_rt = dt_rt if dt.tzinfo is None else dt_rt.astimezone(dt.tzinfo)

There is no way for us to guarantee that `dt_rt.tzinfo == dt.tzinfo` or that 
`dt_rt.tzinfo is dt.tzinfo`, because `isoformat()` is slightly lossy (it loses 
the political zone), but this is not an issue because lossless round trips just 
require you to serialize the political zone, which is generally simple enough.
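
Concretely, the round trip looks like this with the `fromisoformat` from that
PR (shipped as `datetime.fromisoformat` in Python 3.7+); the "EST" zone below
is just an illustrative fixed-offset stand-in for a political zone:

```python
from datetime import datetime, timedelta, timezone

tz = timezone(timedelta(hours=-5), "EST")     # stand-in for a political zone
dt = datetime(2017, 12, 6, 19, 54, tzinfo=tz)

dt_rt = datetime.fromisoformat(dt.isoformat())  # '2017-12-06T19:54:00-05:00'
assert dt_rt == dt                                            # same absolute time
assert dt_rt.replace(tzinfo=None) == dt.replace(tzinfo=None)  # same wall time
assert dt_rt.tzinfo is not tz       # the zone object itself is not recovered...
assert dt_rt.astimezone(tz) == dt   # ...but astimezone() restores the original
```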


On 12/06/2017 07:54 PM, Barry Scott wrote:
> 
> 
>> On 26 Oct 2017, at 17:45, Chris Barker  wrote:
>>
>> This is a key point that I hope is obvious:
>>
>> If an ISO string has NO offset or timezone indicator, then a naive datetime 
>> should be created.
>>
>> (I say, I "hope" it's obvious, because the numpy datetime64 implementation 
>> initially (and for years) would apply the machine local timezone to a bare 
>> iso string -- which was a f-ing nightmare!)
> 
> 
> I hope the other obvious thing is that if there is a offset then a datetime 
> that is *not* naive can be created
> as it describes an unambiguous point in time. We just cannot know what 
> political timezone to choose.
> I'd guess that it should use the UTC timezone in that case.
> 
> Barry
> 
> 
> 
> 
> 
> 
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/paul%40ganssle.io
> 





Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-05 Thread Paul Ganssle
> If it turns out that there's a dict implementation that's faster by not
> preserving order, collections.UnorderedDict could be added.
> There could also be specialized implementations that pre-size the dict (cf:
> C++ unordered_map::reserve), etc., etc.
> But these are all future things, which might not be necessary.

I think that the problem with this is that for the most part, people will use 
`dict`, and for most uses of `dict`, order doesn't matter (and it never has 
before). Given that "arbitrary order" includes *any* fixed ordering (insertion 
ordered, reverse insertion ordered, etc), the common case should keep the 
existing "no order guarantee" specification. This gives interpreter authors 
maximum freedom with a fundamental, widely used data type.

> Isn't ordered dict also useful for **kwargs?

If this is useful (and it seems like it would be), I think again a syntax 
modification that allows users to indicate that they want a particular 
implementation of **kwargs would be better than modifying dict semantics. It 
could possibly be handled with a type-hinting like syntax:

def f(*args, **kwargs : OrderedKwargs):

Or a riff on the existing syntax:

def f(*args, ***kwargs):
def f(*args, ^^kwargs):
def f(*args, .**kwargs):

In this case, the only guarantee you'd need (which is relatively minor compared to 
a change in the dict semantics) would be that keyword argument order passed to 
a function would be preserved as the order that it is passed into the `kwargs` 
constructor. The old **kwargs syntax would give you a `dict` as normal, and the 
new ^^kwargs would give you an OrderedDict or some other dict subclass with 
guaranteed order.
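
(For reference, keyword-argument order is in fact already a language
guarantee as of Python 3.6 via PEP 468, without any new syntax; the behavior
under discussion can be checked directly:)

```python
def f(**kwargs):
    # PEP 468: keyword arguments are collected in call-site order
    return list(kwargs)


assert f(b=1, a=2, c=3) == ["b", "a", "c"]
```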

On 11/05/2017 03:50 PM, Peter Ludemann via Python-Dev wrote:

> 
> 
> On 5 November 2017 at 12:44, Sven R. Kunze  wrote:
> 
>> +1 from me too.
>>
>> On 04.11.2017 21:55, Jim Baker wrote:
>>
>> +1, as Guido correctly recalls, this language guarantee will work well
>> with Jython when we get to the point of implementing 3.7+.
>>
>> On Sat, Nov 4, 2017 at 12:35 PM, Guido van Rossum 
>> wrote:
>>
>>> This sounds reasonable -- I think when we introduced this in 3.6 we were
>>> worried that other implementations (e.g. Jython) would have a problem with
>>> this, but AFAIK they've reported back that they can do this just fine. So
>>> let's just document this as a language guarantee.
>>>
>>> On Sat, Nov 4, 2017 at 10:30 AM, Stefan Krah  wrote:
>>>

 Hello,

 would it be possible to guarantee that dict literals are ordered in v3.7?


 The issue is well-known and the workarounds are tedious, example:

https://mail.python.org/pipermail/python-ideas/2015-December/037423.html


 If the feature is guaranteed now, people can rely on it around v3.9.



 Stefan Krah



 ___
 Python-Dev mailing list
 Python-Dev@python.org
 https://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: https://mail.python.org/mailma
 n/options/python-dev/guido%40python.org

>>>
>>>
>>>
>>> --
>>> --Guido van Rossum (python.org/~guido )
>>>
>>> ___
>>> Python-Dev mailing list
>>> Python-Dev@python.org
>>> https://mail.python.org/mailman/listinfo/python-dev
>>> Unsubscribe: https://mail.python.org/mailman/options/python-dev/jbaker%
>>> 40zyasoft.com
>>>
>>>
>>
>>
>> ___
>> Python-Dev mailing 
>> listPython-Dev@python.orghttps://mail.python.org/mailman/listinfo/python-dev
>>
>> Unsubscribe: 
>> https://mail.python.org/mailman/options/python-dev/srkunze%40mail.de
>>
>>
>>
>> ___
>> Python-Dev mailing list
>> Python-Dev@python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
>> pludemann%40google.com
>>
>>
> 
> 
> 
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/paul%40ganssle.io
> 





Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-05 Thread Paul Ganssle
I think the question of whether any specific implementation of dict could be 
made faster for a given architecture or even that the trade-offs made by 
CPython are generally the right ones is kinda beside the point. It's certainly 
feasible that an implementation that does not preserve ordering could be better 
for some implementation of Python, and the question is really how much is 
gained by changing the language semantics in such a way as to cut off that 
possibility.

On 11/05/2017 02:54 PM, Serhiy Storchaka wrote:
> 05.11.17 21:30, Stefan Krah wrote:
>> On Sun, Nov 05, 2017 at 09:09:37PM +0200, Serhiy Storchaka wrote:
>>> 05.11.17 20:39, Stefan Krah wrote:
 On Sun, Nov 05, 2017 at 01:14:54PM -0500, Paul G wrote:
> 2. Someone invents a new arbitrary-ordered container that would improve 
> on the memory and/or CPU performance of the current dict implementation

 I would think this is very unlikely, given that the previous dict 
 implementation
 has always been very fast. The new one is very fast, too.
>>>
>>> The modification of the current implementation that doesn't preserve
>>> the initial order after deletion would be more compact and faster.
>>
>> How much faster?
> 
> I didn't try to implement this. But the current implementation requires 
> periodical reallocating if add and remove items. The following loop 
> reallocates the dict every len(d) iterations, while the size of the dict is 
> not changed, and the half of its storage is empty.
> 
> while True:
>     v = d.pop(k)
>     ...
>     d[k] = v
> 
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/paul%40ganssle.io
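
To make the quoted loop concrete: under the insertion-ordered dict CPython
has shipped since 3.6, each pop/reinsert moves the key to the end of the
table, which is what eventually forces the periodic reallocation (a minimal
sketch of the semantics, not of the internals):

```python
d = {"a": 1, "b": 2, "c": 3}

v = d.pop("a")   # removing "a" leaves a hole at the front of the table
d["a"] = v       # reinserting appends it at the end

# Insertion order now reflects the move; repeating this for every key
# exhausts the preallocated entry slots and triggers a resize.
assert list(d) == ["b", "c", "a"]
```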


