Re: [Python-Dev] Some PRs to merge?

2018-10-19 Thread Victor Stinner
On Fri, Oct 19, 2018 at 19:01, Stephane Wirtel wrote:
> total: 49 PRs
> is:open is:pr review:approved status:success label:"awaiting merge"
> -label:"DO-NOT-MERGE" label:"CLA signed"

I merged many PRs and closed a few (2, if I recall correctly). Your
query now matches 24 PRs.

Victor


Re: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals

2018-10-19 Thread Michael Selik
On Fri, Oct 19, 2018 at 5:01 AM Sean Harrington 
wrote:

> I like the idea to extend the Pool class [to optimize the case when only
> one function is passed to the workers].
>

Why would this keep the same interface as the Pool class? If its workers
are restricted to calling only one function, that should be passed into the
Pool constructor. The map and apply methods would then only receive that
function's args and not the function itself. You're also trying to avoid
the initializer/globals pattern, so you could eliminate that parameter from
the Pool constructor. In fact, it sounds more like you'd want a function
than a class. You can call it "procmap" or similar. That's code I've
written more than once.

results = procmap(func, iterable, processes=cpu_count())

The nuance is that, since there's no explicit context manager, you'll want
to ensure the pool is shut down after all the tasks are finished, even if
the results generator hasn't been fully consumed.
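
For concreteness, a minimal sketch of such a function built on
multiprocessing.Pool (the name procmap, the processes default, and the
teardown details here are assumptions, not a settled design):

    from multiprocessing import Pool, cpu_count

    def procmap(func, iterable, processes=None):
        # Yield results lazily; the finally clause tears the pool down
        # even if the caller abandons the generator before exhausting it.
        pool = Pool(processes=processes or cpu_count())
        try:
            for result in pool.imap(func, iterable):
                yield result
        finally:
            pool.terminate()
            pool.join()

Because the body is a generator, the finally block runs both on normal
exhaustion and when the generator is closed or garbage collected early,
which covers the shutdown concern above.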


Re: [Python-Dev] Some PRs to merge?

2018-10-19 Thread Brett Cannon
Of those 49 PRs, 18 are by core developers themselves, so 31 PRs are by
external contributors that seem ready to be merged.

There was a discussion at one point on core-workflow about changing the
default "needs" label for PRs by core devs, which in this instance would
help provide a search for PRs that are very close to being ready to
merge.

On Fri, 19 Oct 2018 at 09:58, Stephane Wirtel  wrote:

> Hi all,
>
> How are you? I am fine ;-) and you?
>
> So, this morning I was playing with the GitHub interface and the
> pull requests of CPython, and I discovered GitHub's advanced search.
> I think it is really useful for us, and certainly for the core devs.
>
> I was interested in some PRs.
>
> PRs with this status:
> * open
> * review is approved
> * status of the CI is 'success'
> * has labels "awaiting merge" and "CLA signed", and not "DO-NOT-MERGE"
>
> total: 49 PRs
>
> In the GitHub interface, here is the query:
>
> is:open is:pr review:approved status:success label:"awaiting merge"
> -label:"DO-NOT-MERGE" label:"CLA signed"
>
> But if you want to see the results in your browser, just click this link.
>
> https://github.com/python/cpython/pulls?utf8=%E2%9C%93&q=is%3Aopen+is%3Apr+review%3Aapproved+status%3Asuccess+label%3A%22awaiting+merge%22+-label%3A%22DO-NOT-MERGE%22+label%3A%22CLA+signed%22
>
> Here are the numbers:
>
> * just open: 959
> is:open
> * and with label "CLA signed": 900
> label:"CLA signed"
> * and with label "awaiting merge": 169
> label:"awaiting merge"
> * and without label "DO-NOT-MERGE": 152
> -label:"DO-NOT-MERGE"
> * and with a happy CI ;-): 112
> status:success
> * and with an approved review: 49
> review:approved
>
> total: 49 PRs could be merged.
>
> I was really surprised by this tool (docs:
> https://help.github.com/articles/searching-issues-and-pull-requests/).
>
> But I was thinking about one thing: how can I help the core devs
> merge these PRs?
>
> Each week, I can send a report to this ML with the "mergeable" PRs.
> Would this kind of report be useful to you?
>
>
> Have a nice day,
>
> Stéphane
>
> --
> Stéphane Wirtel - https://wirtel.be - @matrixise


[Python-Dev] Some PRs to merge?

2018-10-19 Thread Stephane Wirtel

Hi all,

How are you? I am fine ;-) and you?

So, this morning I was playing with the GitHub interface and the
pull requests of CPython, and I discovered GitHub's advanced search.
I think it is really useful for us, and certainly for the core devs.

I was interested in some PRs.

PRs with this status:
* open
* review is approved
* status of the CI is 'success'
* has labels "awaiting merge" and "CLA signed", and not "DO-NOT-MERGE"

total: 49 PRs

In the GitHub interface, here is the query:

is:open is:pr review:approved status:success label:"awaiting merge" -label:"DO-NOT-MERGE"
label:"CLA signed"

But if you want to see the results in your browser, just click this link.
https://github.com/python/cpython/pulls?utf8=%E2%9C%93&q=is%3Aopen+is%3Apr+review%3Aapproved+status%3Asuccess+label%3A%22awaiting+merge%22+-label%3A%22DO-NOT-MERGE%22+label%3A%22CLA+signed%22

Here are the numbers:

* just open: 959
   is:open
* and with label "CLA signed": 900
   label:"CLA signed"
* and with label "awaiting merge": 169
   label:"awaiting merge"
* and without label "DO-NOT-MERGE": 152
   -label:"DO-NOT-MERGE"
* and with a happy CI ;-): 112
   status:success
* and with an approved review: 49
   review:approved

total: 49 PRs could be merged.

I was really surprised by this tool (docs:
https://help.github.com/articles/searching-issues-and-pull-requests/).

But I was thinking about one thing: how can I help the core devs
merge these PRs?

Each week, I can send a report to this ML with the "mergeable" PRs.
Would this kind of report be useful to you?


Have a nice day,

Stéphane

--
Stéphane Wirtel - https://wirtel.be - @matrixise


[Python-Dev] Summary of Python tracker Issues

2018-10-19 Thread Python tracker

ACTIVITY SUMMARY (2018-10-12 - 2018-10-19)
Python tracker at https://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    6835 (+11)
  closed 39943 (+50)
  total  46778 (+61)

Open issues with patches: 2737 


Issues opened (41)
==================

#34783: [3.7] segmentation-fault/core dump when try to run non-existin
https://bugs.python.org/issue34783  reopened by ned.deily

#34909: StrEnum subclasses cannot be created
https://bugs.python.org/issue34909  reopened by ned.deily

#34968: loop.call_soon_threadsafe should be documented to be re-entran
https://bugs.python.org/issue34968  opened by njs

#34969: Add --fast, --best to the gzip CLI
https://bugs.python.org/issue34969  opened by matrixise

#34970: Protect tasks weak set manipulation in asyncio.all_tasks()
https://bugs.python.org/issue34970  opened by asvetlov

#34971: add support for tls/ssl sessions in asyncio
https://bugs.python.org/issue34971  opened by RemiCardona

#34973: Crash in bytes constructor with mutating list
https://bugs.python.org/issue34973  opened by serhiy.storchaka

#34975: start_tls() difficult when using asyncio.start_server()
https://bugs.python.org/issue34975  opened by icgood

#34976: IDLE: Replace the search dialog with a search bar
https://bugs.python.org/issue34976  opened by taleinat

#34977: Release Windows Store app containing Python
https://bugs.python.org/issue34977  opened by steve.dower

#34978: check type of object in fix_dict.py in 2to3
https://bugs.python.org/issue34978  opened by devarakondapranav

#34979: Python throws “SyntaxError: Non-UTF-8 code start with \xe8..
https://bugs.python.org/issue34979  opened by ausaki

#34980: KillPython target doesn't detect 64-bit processes
https://bugs.python.org/issue34980  opened by jkloth

#34981: Unable to install Python from web-based installer and executab
https://bugs.python.org/issue34981  opened by skycraper

#34983: expose symtable.Symbol.is_nonlocal()
https://bugs.python.org/issue34983  opened by pablogsal

#34984: Improve error messages in bytes and bytearray constructors
https://bugs.python.org/issue34984  opened by serhiy.storchaka

#34985: python finds test modules from the wrong directory during PGO 
https://bugs.python.org/issue34985  opened by Kal Sze2

#34987: A possible null pointer dereference in _pickle.c's save_reduce
https://bugs.python.org/issue34987  opened by ZackerySpytz

#34990: year 2038 problem in compileall.py
https://bugs.python.org/issue34990  opened by bmwiedemann

#34991: variable type list [] referential integrity data loss
https://bugs.python.org/issue34991  opened by alan.pan

#34993: asyncio.streams.FlowControlMixin should be part of the API
https://bugs.python.org/issue34993  opened by xitop

#34995: functools.cached_property does not maintain the wrapped method
https://bugs.python.org/issue34995  opened by mwilbz

#34996: Add name to process and thread pool
https://bugs.python.org/issue34996  opened by Raz Manor

#35000: aexit called after loop close
https://bugs.python.org/issue35000  opened by pdxjohnny

#35003: Provide an option to venv to put files in a bin/ directory on 
https://bugs.python.org/issue35003  opened by brett.cannon

#35004: Odd behavior when using datetime.timedelta under cProfile
https://bugs.python.org/issue35004  opened by beaugunderson

#35005: argparse should accept json and yaml argument types
https://bugs.python.org/issue35005  opened by derelbenkoenig

#35007: Minor change to weakref docs
https://bugs.python.org/issue35007  opened by frankmillman

#35009: argparse throws UnicodeEncodeError for printing help with unic
https://bugs.python.org/issue35009  opened by xtreak

#35012: [3.7] test_multiprocessing_spawn hangs randomly on AppVeyor
https://bugs.python.org/issue35012  opened by vstinner

#35015: availability directive breaks po files
https://bugs.python.org/issue35015  opened by mdk

#35017: socketserver accept a last request after shutdown
https://bugs.python.org/issue35017  opened by beledouxdenis

#35018: Sax parser provides no user access to lexical handlers
https://bugs.python.org/issue35018  opened by Jonathan.Gossage

#35019: Minor Bug found in asyncio - Python 3.5.3
https://bugs.python.org/issue35019  opened by bassford

#35020: Add multisort recipe to sorting docs
https://bugs.python.org/issue35020  opened by xtreak

#35021: Assertion failures in datetimemodule.c.
https://bugs.python.org/issue35021  opened by twouters

#35022: MagicMock should support `__fspath__`
https://bugs.python.org/issue35022  opened by Maxime Belanger

#35024: Incorrect logging in importlib when '.pyc' file creation fails
https://bugs.python.org/issue35024  opened by qagren

#35025: Compiling `timemodule.c` can fail on macOS due to availability
https://bugs.python.org/issue35025  opened by Maxime Belanger

#35026: Winreg's documentation lacks mentioning required permission at
https://bugs.python.org/issue35026  

Re: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals

2018-10-19 Thread Sean Harrington
On Fri, Oct 19, 2018 at 7:32 AM Joni Orponen  wrote:

> On Fri, Oct 19, 2018 at 9:09 AM Thomas Moreau <
> thomas.moreau.2...@gmail.com> wrote:
>
>> Hello,
>>
>> I have been working on the concurrent.futures module lately and I think
>> this optimization should be avoided in the context of Python Pools.
>>
>> This is an interesting idea, however its implementation will bring many
>> complicated issues as it breaks the basic paradigm of a Pool: the tasks are
>> independent and you don't know which worker is going to run which task.
>>
>> The function is serialized with each task because of this paradigm. This
>> ensures that any worker picking up the task will be able to perform it
>> independently from the tasks it has run before, given that it has been
>> initialized correctly at the beginning. This makes it simple to run each
>> task.
>>
>
> I would not mind if there were a subtype of Pool to which you can only
> apply one kind of task. This is a very common use mode.
>
>
> Though the question there is 'should this live in Python itself'? I'd be
> fine with a package on PyPI.
>

Thomas makes a good point: despite the common user mode of calling
Pool.map() once, blocking, and returning, the need for serialization of
functions within tasks arises when calling Pool.map() (and friends) while
workers are still running (i.e. there was a previous call to
Pool.map_async()).

However this is an uncommon user mode, as Joni points out. The most common
user mode is that your Pool workers are only ever executing one type of
task at a given time.  This opens up optimization opportunities, so long as
we store state on the subclassed Pool object of whether or not a SingleTask
is running or has been completed (SingleTaskPool?), to prevent the user
from getting into the funky state above.

I would rather see this included in the multiprocessing stdlib, as it
seemingly will not introduce a lot of new code and would benefit from
existing tests. Most importantly, it optimizes (in my opinion) the most
common user mode of Pool.
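
To make this concrete, here is a rough sketch of such a subclass, assuming
the function is fixed at construction time and delivered once per worker
via the initializer (the name SingleTaskPool and the helpers are
hypothetical; under the hood this still leans on the initializer/globals
pattern discussed elsewhere in the thread):

    import multiprocessing

    _worker_func = None

    def _set_worker_func(func):
        # Runs once in each worker process: cache the task function.
        global _worker_func
        _worker_func = func

    def _invoke(arg):
        # Trampoline executed in the worker; only `arg` is pickled per task.
        return _worker_func(arg)

    class SingleTaskPool:
        """Pool whose workers only ever execute one function."""

        def __init__(self, func, processes=None):
            self._pool = multiprocessing.Pool(
                processes=processes,
                initializer=_set_worker_func,
                initargs=(func,),
            )

        def map(self, iterable, chunksize=None):
            # The function was shipped at worker start-up, so each task
            # here serializes only its arguments.
            return self._pool.map(_invoke, iterable, chunksize)

        def close(self):
            self._pool.close()
            self._pool.join()

One nice property of routing the function through the initializer is that
replacement workers (e.g. when maxtasksperchild is set) rerun the
initializer and therefore also receive the function, which answers one of
the lifecycle questions quoted below.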


> As the Pool comes with no scheduler, with your idea, you would need a
>> synchronization step to send the function to all workers before you can
>> launch your task. But if there is already one worker performing a long
>> running task, does the Pool wait for it to be done before it sends the
>> function? If the Pool doesn't wait, how does it ensure that this worker
>> will be able to get the definition of the function before running it?
>> Also, the multiprocessing.Pool has some features where a worker can shut
>> itself down after a given number of tasks or a timeout. How does it ensure
>> that the new worker will have the definition of the function?
>> It is unsafe to try such a feature (sending an object only once) anywhere
>> other than in the initializer, which is guaranteed to run once per worker.
>>
>> On the other hand, you mentioned an interesting point: making globals
>> available in the workers could be made simpler. A possible solution
>> would be to add a "globals" argument to the Pool which would instantiate
>> global variables in the workers. I have no specific idea about the
>> implementation of such a feature, but it would be safer as it would be an
>> initialization feature.
>>
>
> Would this also mean one could use a Pool in a context where threading is
> used? Currently, using threading leaks unpicklable objects into the
> globals as a side effect.
>
> Also being able to pass in globals=None would be optimal for a lot of use
> cases.
>

We could do this - but we can easily get the same behavior by declaring a
"global" in "initializer" (albeit a pattern I do not like). I like the
idea of extending the Pool class a bit more, but that is just my opinion.


>
> --
> Joni Orponen


Re: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals

2018-10-19 Thread Joni Orponen
On Fri, Oct 19, 2018 at 9:09 AM Thomas Moreau 
wrote:

> Hello,
>
> I have been working on the concurrent.futures module lately and I think
> this optimization should be avoided in the context of Python Pools.
>
> This is an interesting idea, however its implementation will bring many
> complicated issues as it breaks the basic paradigm of a Pool: the tasks are
> independent and you don't know which worker is going to run which task.
>
> The function is serialized with each task because of this paradigm. This
> ensures that any worker picking up the task will be able to perform it
> independently from the tasks it has run before, given that it has been
> initialized correctly at the beginning. This makes it simple to run each
> task.
>

I would not mind if there were a subtype of Pool to which you can only
apply one kind of task. This is a very common use mode.

Though the question there is 'should this live in Python itself'? I'd be
fine with a package on PyPI.

As the Pool comes with no scheduler, with your idea, you would need a
> synchronization step to send the function to all workers before you can
> launch your task. But if there is already one worker performing a long
> running task, does the Pool wait for it to be done before it sends the
> function? If the Pool doesn't wait, how does it ensure that this worker
> will be able to get the definition of the function before running it?
> Also, the multiprocessing.Pool has some features where a worker can shut
> itself down after a given number of tasks or a timeout. How does it ensure
> that the new worker will have the definition of the function?
> It is unsafe to try such a feature (sending an object only once) anywhere
> other than in the initializer, which is guaranteed to run once per worker.
>
> On the other hand, you mentioned an interesting point: making globals
> available in the workers could be made simpler. A possible solution
> would be to add a "globals" argument to the Pool which would instantiate
> global variables in the workers. I have no specific idea about the
> implementation of such a feature, but it would be safer as it would be an
> initialization feature.
>

Would this also mean one could use a Pool in a context where threading is
used? Currently, using threading leaks unpicklable objects into the
globals as a side effect.

Also being able to pass in globals=None would be optimal for a lot of use
cases.

-- 
Joni Orponen


Re: [Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals

2018-10-19 Thread Thomas Moreau
Hello,

I have been working on the concurrent.futures module lately and I think this
optimization should be avoided in the context of Python Pools.

This is an interesting idea, however its implementation will bring many
complicated issues as it breaks the basic paradigm of a Pool: the tasks are
independent and you don't know which worker is going to run which task.

The function is serialized with each task because of this paradigm. This
ensures that any worker picking up the task will be able to perform it
independently from the tasks it has run before, given that it has been
initialized correctly at the beginning. This makes it simple to run each
task.

As the Pool comes with no scheduler, with your idea, you would need a
synchronization step to send the function to all workers before you can
launch your task. But if there is already one worker performing a long
running task, does the Pool wait for it to be done before it sends the
function? If the Pool doesn't wait, how does it ensure that this worker
will be able to get the definition of the function before running it?
Also, the multiprocessing.Pool has some features where a worker can shut
itself down after a given number of tasks or a timeout. How does it ensure
that the new worker will have the definition of the function?
It is unsafe to try such a feature (sending an object only once) anywhere
other than in the initializer, which is guaranteed to run once per worker.

On the other hand, you mentioned an interesting point: making globals
available in the workers could be made simpler. A possible solution
would be to add a "globals" argument to the Pool which would instantiate
global variables in the workers. I have no specific idea about the
implementation of such a feature, but it would be safer as it would be an
initialization feature.
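
For illustration, a minimal sketch of how such a globals argument could be
emulated today on top of initializer/initargs (Pool has no globals=
parameter; the helper below is hypothetical):

    import multiprocessing

    def _install_globals(mapping):
        # Runs once per worker: publish each name as a module-level global.
        globals().update(mapping)

    def pool_with_globals(worker_globals, processes=None):
        # Hypothetical stand-in for the proposed Pool(globals=...) API.
        # worker_globals must be picklable, as it is sent to each worker.
        return multiprocessing.Pool(
            processes=processes,
            initializer=_install_globals,
            initargs=(worker_globals,),
        )

    # Usage sketch: every worker then sees CONFIG as a global.
    # pool = pool_with_globals({"CONFIG": {"retries": 3}})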

Regards,
Thomas Moreau

On Thu, Oct 18, 2018, 22:20 Chris Jerdonek  wrote:

> On Thu, Oct 18, 2018 at 9:11 AM Michael Selik 
> wrote:
> > On Thu, Oct 18, 2018 at 8:35 AM Sean Harrington 
> wrote:
> >> Further, let me pivot on my idea of __qualname__...we can use the `id`
> of `func` as the cache key to address your concern, and store this `id` on
> the `task` tuple (i.e. an integer in lieu of the `func` previously stored
> there).
> >
> >
> > Possible. Does the Pool keep a reference to the passed function in the
> main process? If not, couldn't the garbage collector free that memory
> location and a new function could replace it? Then it could have the same
> qualname and id in CPython. Edge case, for sure. Worse, it'd be hard to
> reproduce as it'd be dependent on the vagaries of memory allocation.
>
> I'm not following this thread closely, but I just wanted to point out
> that __qualname__ won't necessarily be an attribute of the object if
> the API accepts any callable. (I happen to be following an issue on
> the tracker where this came up.)
>
> --Chris
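
To illustrate the id-reuse point: on CPython, once the first function is
collected, a new object can land at the same address, so id() alone is not
a reliable cache key (the result below depends on allocator behavior and
is not guaranteed):

    def make_and_id():
        def f():
            pass
        return id(f)  # f becomes garbage as soon as make_and_id returns

    first = make_and_id()
    second = make_and_id()
    # Frequently True on CPython: the freed slot is reused by the new
    # function object.
    print(first == second)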