[issue17005] Add a topological sort algorithm

2020-01-19 Thread Zahari Dim


Zahari Dim  added the comment:

I would like to suggest a `dependency_resolver` API that I have been using and 
that is in line with what Tim Peters proposes in 
https://bugs.python.org/issue17005#msg359702

A DAG would be an object that can be iterated in topological order with 
__iter__ (for simple sequential usage) or that provides a way of managing all 
the tasks that can be run in parallel. The latter is done with a generator 
function:

```
def dependency_resolver(self):
    """Yield the set of nodes that have all dependencies satisfied (which
    could be an empty set). Send the next completed task."""
```

which is used with something like:

```
deps = dag.dependency_resolver()
pending_tasks = deps.send(None)
if not pending_tasks:
    # Graph empty
    return
# Note: this can be done in parallel/async
while True:
    some_task = pending_tasks.pop()
    complete_task_somehow(some_task)
    try:
        more_tasks = deps.send(some_task)
    except StopIteration:
        # Exit when we have sent in all the nodes in the graph
        break
    else:
        pending_tasks |= more_tasks
```
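To make the protocol concrete, here is a minimal sketch of a generator that behaves as described. It is not the reportengine implementation: the `graph` argument (a dict mapping every node to the set of nodes it depends on, assumed acyclic and with every dependency present as a key) and the `run_all` driver are assumptions made for illustration only.

```python
def dependency_resolver(graph):
    """Yield sets of newly ready nodes; receive each completed node via send().

    `graph` is a hypothetical input format: {node: set_of_dependencies}.
    """
    remaining = {node: set(deps) for node, deps in graph.items()}
    dependents = {node: set() for node in graph}
    for node, deps in graph.items():
        for dep in deps:
            dependents[dep].add(node)

    pending = len(remaining)
    # The first yield hands out the nodes without dependencies (empty for an empty graph).
    finished = yield {node for node, deps in remaining.items() if not deps}
    while True:
        pending -= 1
        newly_ready = set()
        for node in dependents[finished]:
            remaining[node].discard(finished)
            if not remaining[node]:
                newly_ready.add(node)
        if pending == 0:
            return  # every node has been sent in: the caller sees StopIteration
        finished = yield newly_ready


def run_all(graph):
    """Sequential driver following the usage pattern shown above."""
    deps = dependency_resolver(graph)
    pending_tasks = deps.send(None)
    order = []
    while pending_tasks:
        some_task = pending_tasks.pop()
        order.append(some_task)  # stand-in for actually completing the task
        try:
            pending_tasks |= deps.send(some_task)
        except StopIteration:
            break
    return order


graph = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
print(run_all(graph))
```

The order of "b" and "c" depends on set iteration, but "a" always comes first and "d" last.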


An implementation I have used for some time is here:


https://github.com/NNPDF/reportengine/blob/master/src/reportengine/dag.py

although I'd make it simpler now. In practice I have found that the function I 
use most of the time to build the graph is:

dag.add_or_update_node(node=something_hashable, inputs={set of existing nodes}, 
outputs={set of existing nodes}).

which adds the node to the graph if it was not there and updates the 
dependencies to add inputs and outputs, which in my experience matches the way 
one discovers dependencies for things like packages.

--
nosy: +Zahari.Dim

___
Python tracker 
<https://bugs.python.org/issue17005>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34586] collections.ChainMap should have a get_where method

2018-09-11 Thread Zahari Dim


Zahari Dim  added the comment:

>
> I've discussed this with other core devs and spent a good deal of time 
> evaluating this proposal.  I'm going to pass on this one but do think it 
> was an inspired suggestion.  Thank you for the proposal.

Thank you for taking the time to consider it. I understand that there
are many proposals.

>
> --
>
> Note, the original get_where() recipe has an issue.  Upon successful lookup, 
> it returns a 2-tuple but on failure it calls __missing__ which typically 
> returns a scalar (if it doesn't raise an exception).

FWIW this was intended to work when `__missing__` was subclassed to
raise a more specific exception. The case where it is made to return a
value clearly doesn't play well with the proposed method, and would
likely need to be subclassed as well. I consider this an acceptable
trade off because I find this use case rather esoteric: the same
functionality could be achieved in an arguably clearer way by passing
a mapping with the desired missing semantics as the outermost scope.

--

___
Python tracker 
<https://bugs.python.org/issue34586>
___



[issue34586] collections.ChainMap should have a get_where method

2018-09-08 Thread Zahari Dim


Zahari Dim  added the comment:

On Sat, Sep 8, 2018 at 1:15 PM Serhiy Storchaka  wrote:
>
>
> Serhiy Storchaka  added the comment:
>
> I concur with Raymond. The purpose of ChainMap is providing a mapping that 
> hides the implementation detail of using several mappings as fallbacks. If 
> you don't want to hide the implementation detail, you don't need to use 
> ChainMap.
>
> ChainMap exposes underlying mappings as the maps attribute, so you can use 
> this implementation detail if you know that it is a ChainMap and not a general 
> mapping. It is easy to write code for searching what mapping contains the 
> specified key.

I don't know where the idea that the underlying mappings are an
implementation detail comes from. It certainly isn't from the
documentation, which mentions uses such as nested scopes and
templates, which cannot be attained with a single mapping. It also
doesn't match my personal usage, where as discussed, even the simpler
cases benefit from information on the underlying mappings. It is a
surprising claim to make given that the entirety of the public
interface specific to ChainMap (maps, new_child and parents) deals
with the fact that there is more structure than one mapping. I also
have a hard time discerning this idea from Raymond's messages.

>
> for m in cm.maps:
>     if key in m:
>         found = m
>         break
> else:
>     # raise an error or set a default,
>     # what is appropriate for your concrete case

This "trivial snatch of code" contains at least two issues that make
it fail in situations where the actual implementation of `__getitem__`
would work, opening the door for hard to diagnose corner cases. If
anything, in my opinion, the fact that this code is being proposed as
an alternative reinforces the idea that the implementation of the
searching method should be in the standard library.
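The message does not spell out the two issues. One of them is presumably the case noted in the comments of the `get_where` recipe from the original submission: a membership test and an item lookup can disagree, for instance with a `defaultdict`. A small illustration, with made-up keys and values:

```python
from collections import ChainMap, defaultdict

cm = ChainMap({"a": 1}, defaultdict(int))

# Membership-based search, as in the quoted loop: no mapping "contains" the key.
found = next((m for m in cm.maps if "missing" in m), None)
print(found)          # None

# Item lookup goes through each mapping's __getitem__, so the defaultdict answers.
print(cm["missing"])  # 0
```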

--

___
Python tracker 
<https://bugs.python.org/issue34586>
___



[issue34586] collections.ChainMap should have a get_where method

2018-09-07 Thread Zahari Dim


Zahari Dim  added the comment:

> ISTM that this is the wrong stage to perform validation of allowable values.  
> That should occur upstream when the underlying mappings are first created.  
> At that earlier stage it is possible to give a more timely response to erroneous 
> input and there is access to more information (such as the line and row 
> number of an error in a configuration file).
>
> It doesn't make sense to me to defer value validation downstream after a 
> ChainMap instance has been formed and after a successful lookup has occurred. 
> That just complicates the task of tracing back to the root cause.

This is certainly the case in the situation where the validation only
depends on the value of the corresponding configuration entry, as it
admittedly does in the example above. However, the example was
oversimplified insofar as nontrivial validation depends on the whole
ensemble of configuration settings. For example, taking the example
described at the top of
<http://rahmonov.me/posts/python-chainmap/>
I think it would be useful to have an error message of the form:
f"User '{db_username}', defined in {configsettings[user_index]} is
not found in database '{database}', defined in
{configsettings[database_index]}"

>
> > Maybe the method could be called ChainMap.search?
>
> That would be better than get_where().
>
> --
>
> ___
> Python tracker 
> <https://bugs.python.org/issue34586>
> ___

--

___
Python tracker 
<https://bugs.python.org/issue34586>
___



[issue34586] collections.ChainMap should have a get_where method

2018-09-06 Thread Zahari Dim


Zahari Dim  added the comment:

I believe an argument for including this functionality in the standard library
is that it facilitates writing better error messages and thus better code. Some
results that are returned when one searches for *python ChainMap* are:

  - 
<https://stackoverflow.com/questions/23392976/what-is-the-purpose-of-collections-chainmap>
  - 
<http://www.blog.pythonlibrary.org/2016/03/29/python-201-what-is-a-chainmap/>
  - <http://rahmonov.me/posts/python-chainmap/>

All of these mention prominently a layered configuration of some kind. I would
argue that all of the examples would benefit from error checking done along the
lines of the snippet above.

An additional consideration is that the method is best implemented by copying
the `__getitem__` method, which, while short, contains a couple of non trivial
details.

One analog could be `re.search`, which returns an object with information on
both the value that is found and its location, through the `span` method of
the Match object. Maybe the method could be called ChainMap.search?
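For reference, this is what the analogy looks like on the `re` side; the pattern and string are made up for illustration:

```python
import re

match = re.search(r"Map", "python ChainMap")
# The Match object carries both what was found and where it was found.
print(match.group(), match.span())  # Map (12, 15)
```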

On Thu, Sep 6, 2018 at 6:07 AM Raymond Hettinger  wrote:
>
>
> Raymond Hettinger  added the comment:
>
> I haven't run across this requirement before but it does seem plausible that 
> a person might want to know which underlying mapping found a match (compare 
> with the "which" utility in Bash). On the other hand, we haven't had requests 
> for anything like this for other lookup chains such as determining where a 
> variable appears in the sequence 
> locals-to-nested-scopes-to-globals-to-builtins.
>
> Also, I'm not sure I like the proposed API (the method name and signature).  
> Perhaps, this should be a docs recipe for a ChainMap subclass or be an 
> example of a standalone search function that takes the *maps* attribute 
> as one of its arguments.  Will discuss this with the other core devs to get 
> their thoughts.
>
> --
>
> ___
> Python tracker 
> <https://bugs.python.org/issue34586>
> ___

--

___
Python tracker 
<https://bugs.python.org/issue34586>
___



[issue34586] collections.ChainMap should have a get_where method

2018-09-05 Thread Zahari Dim


New submission from Zahari Dim :

When using ChainMap I have frequently needed to know the mapping inside the list
that contains the effective instance of a particular key. I have needed this
when using ChainMap to contain a piece of configuration with multiple sources,
like for example

```
from mycollections import ChainMap
configsources = ["Command line", "Config file", "Defaults"]
config = ChainMap(config_from_commandline(), config_from_file(),
                  default_config())

class BadConfigError(Exception): pass

def get_key(key):
    try:
        index, value = config.get_where(key)
    except KeyError as e:
        raise BadConfigError(f"No such key: '{key}'") from e
    try:
        result = validate(key, value)
    except ValidationError as e:
        raise BadConfigError(f"Key '{key}' defined in {configsources[index]} "
                             f"is invalid: {e}") from e
    return result
```

I have also needed this when implementing custom DSLs (e.g. specifying which
context a particular construct is allowed to see).

I think this method would be generally useful for the ChainMap class and
moreover the best way of implementing it I can think of is  by copying the
`__getitem__` method and retaining the index:

```
import collections

class ChainMap(collections.ChainMap):
    def get_where(self, key):
        for i, mapping in enumerate(self.maps):
            try:
                return i, mapping[key]  # can't use 'key in mapping' with defaultdict
            except KeyError:
                pass
        return self.__missing__(key)  # support subclasses that define __missing__
```
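A self-contained usage sketch of the recipe follows. The three dictionaries are hypothetical stand-ins for the command line, config file and default layers of the earlier example.

```python
import collections


class ChainMap(collections.ChainMap):
    def get_where(self, key):
        # Same recipe as above: look the key up layer by layer, keeping the index.
        for i, mapping in enumerate(self.maps):
            try:
                return i, mapping[key]
            except KeyError:
                pass
        return self.__missing__(key)


# Hypothetical layers standing in for command line, config file and defaults.
configsources = ["Command line", "Config file", "Defaults"]
config = ChainMap({"verbose": True}, {"port": 8080}, {"port": 80, "host": "localhost"})

index, value = config.get_where("port")
print(f"port={value} (from {configsources[index]})")  # port=8080 (from Config file)
```

A key that is absent from every layer still raises `KeyError`, because the stock `ChainMap.__missing__` does so.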

I'd be happy to write a patch that does just this.

--
components: Library (Lib)
messages: 324632
nosy: Zahari.Dim
priority: normal
severity: normal
status: open
title: collections.ChainMap should have a get_where method
type: enhancement
versions: Python 3.8

___
Python tracker 
<https://bugs.python.org/issue34586>
___



[issue13349] Non-informative error message in index() and remove() functions

2017-04-26 Thread Zahari Dim

Changes by Zahari Dim <zaha...@gmail.com>:


--
pull_requests: +1418

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13349>
___



[issue27399] ChainMap.keys() is broken

2016-06-27 Thread Zahari Dim

Changes by Zahari Dim <zaha...@gmail.com>:


--
resolution:  -> not a bug
status: open -> closed

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27399>
___



[issue27399] ChainMap.keys() is broken

2016-06-27 Thread Zahari Dim

New submission from Zahari Dim:

When trying to see if the keys() of a collections.ChainMap object are empty, it 
tries to compute the hash of the dicts that compose the ChainMap, giving rise 
to an error:

In [1]: from collections import ChainMap

In [2]: m = ChainMap([{'a':1}, {'b':2}])

In [3]: bool(m.keys())
---
TypeError Traceback (most recent call last)
 in ()
> 1 bool(m.keys())

/home/zah/anaconda3/lib/python3.5/_collections_abc.py in __len__(self)
    633 
    634     def __len__(self):
--> 635         return len(self._mapping)
    636 
    637     def __repr__(self):

/home/zah/anaconda3/lib/python3.5/collections/__init__.py in __len__(self)
    865 
    866     def __len__(self):
--> 867         return len(set().union(*self.maps))     # reuses stored hash values if possible
    868 
    869     def __iter__(self):

TypeError: unhashable type: 'dict'

Also, I can't ask if 'a' is in keys:

In [6]: m.keys()
Out[6]: KeysView(ChainMap([{'a': 1}, {'b': 2}]))
In [9]: ks = m.keys()
In [17]: 'a' in ks
Out[17]: False
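For comparison, `ChainMap` takes its mappings as separate positional arguments. In the session above a single list is passed, so that list becomes the only entry in `maps`. A short sketch of the same checks with the mappings passed individually:

```python
from collections import ChainMap

m = ChainMap({'a': 1}, {'b': 2})  # mappings as separate positional arguments

print(bool(m.keys()))   # True
print('a' in m.keys())  # True
print(sorted(m))        # ['a', 'b']
```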

--
components: Library (Lib)
messages: 269370
nosy: Zahari.Dim
priority: normal
severity: normal
status: open
title: ChainMap.keys() is broken
versions: Python 3.5

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27399>
___



[issue24500] provide context manager to redirect C output

2016-06-02 Thread Zahari Dim

Zahari Dim added the comment:

Considering that Python is often used to interact with lower level
languages, it seems interesting to have the ability to control the
"real" standard output and error that those languages use. Note that
redirecting to /dev/null is only one possible application of this
feature. Others would be, for example, linking the stdout to the logging
module.

Specifically regarding redirecting to /dev/null, in my experience this
would be fairly useful in scientific software, where low level code
tends to be chosen on scientific merits rather than on how much control
it offers over verbosity.

On Sun, May 8, 2016 at 12:04 AM, Martin Panter <rep...@bugs.python.org> wrote:
>
> Martin Panter added the comment:
>
> Is it really common to have a C wrapper with undesirable output? I suspect 
> there is not much demand for this feature. Maybe this would be better outside 
> of Python’s standard library.
>
> --
> nosy: +martin.panter
> status: open -> languishing
>
> ___
> Python tracker <rep...@bugs.python.org>
> <http://bugs.python.org/issue24500>
> ___

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24500>
___



[issue24519] multiprocessing.Pool with maxtasksperchild starts too many processes

2015-06-27 Thread Zahari Dim

New submission from Zahari Dim:

The following example should start two processes, but instead it starts three, 
even though only two of them run do_work(). A third process is incorrectly 
started after the first one finishes.

import os
import time
from multiprocessing import Pool

def initprocess():
    print("Starting PID: %d" % os.getpid())

def do_work(x):
    print("Doing work in %d" % os.getpid())
    time.sleep(x**2)

if __name__ == '__main__':
    p = Pool(2, initializer=initprocess, maxtasksperchild=1)
    results = p.map(do_work, (1, 2), chunksize=1)

--
components: Library (Lib)
messages: 245878
nosy: Zahari.Dim
priority: normal
severity: normal
status: open
title: multiprocessing.Pool with maxtasksperchild starts too many processes
type: resource usage
versions: Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24519
___



[issue24519] multiprocessing.Pool with maxtasksperchild starts too many processes

2015-06-27 Thread Zahari Dim

Changes by Zahari Dim zaha...@gmail.com:


--
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24519
___



[issue24500] provide context manager to redirect C output

2015-06-26 Thread Zahari Dim

Zahari Dim added the comment:

Well, the simple minded example I posted has so many bugs (many of which I 
don't understand, for example why it destroys the stdout of an interpreter 
permanently) that I really think this feature is necessary.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24500
___



[issue24500] contextlib.redirect_stdout should redirect C output

2015-06-24 Thread Zahari Dim

Changes by Zahari Dim zaha...@gmail.com:


--
title: xontextlib.redirect_stdout should redirect C output -> 
contextlib.redirect_stdout should redirect C output

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24500
___



[issue24500] xontextlib.redirect_stdout should redirect C output

2015-06-24 Thread Zahari Dim

New submission from Zahari Dim:

It is common to have an inflexible C wrapper with lots of undesired output. 
However, it is not so trivial to suppress (or redirect) that output from Python 
in a selective way. contextlib.redirect_stdout doesn't help, since it only 
changes sys.stdout, without touching the actual file descriptor. The following 
worked for my use case, which I adapted from here 
http://eli.thegreenplace.net/2015/redirecting-all-kinds-of-stdout-in-python/:

import sys
import os
from contextlib import contextmanager, redirect_stdout

@contextmanager
def supress_stdout():
    devnull = open(os.devnull, 'wb')
    try:
        stdout_fileno = sys.stdout.fileno()
    except ValueError:
        redirect = False
    else:
        redirect = True
        sys.stdout.flush()
        #sys.stdout.close()
        devnull_fileno = devnull.fileno()
        saved_stdout_fd = os.dup(stdout_fileno)
        os.dup2(devnull_fileno, stdout_fileno)

    with redirect_stdout(devnull):
        yield
    if redirect:
        os.dup2(stdout_fileno, saved_stdout_fd)
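
The follow-up comment on this issue notes that the snippet above leaves stdout permanently broken. `os.dup2(fd, fd2)` makes `fd2` a copy of `fd`, so the final call in the snippet appears to have its arguments in the opposite order from what restoring stdout would need. The following is a minimal sketch of a descriptor-level variant with the restore step ordered accordingly. It is an illustration only, assumes a POSIX-like system where stdout is file descriptor 1, and uses the hypothetical name `suppress_fd_stdout`.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def suppress_fd_stdout():
    """Temporarily point file descriptor 1 at os.devnull (hypothetical helper)."""
    sys.stdout.flush()             # do not lose output buffered at the Python level
    saved_fd = os.dup(1)           # keep a handle on the real stdout
    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull_fd, 1)     # fd 1 now refers to devnull
        yield
    finally:
        sys.stdout.flush()
        os.dup2(saved_fd, 1)       # restore: the saved copy is the source, fd 1 the target
        os.close(saved_fd)
        os.close(devnull_fd)


with suppress_fd_stdout():
    os.write(1, b"this goes to devnull\n")
print("stdout works again")
```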

--
components: Extension Modules, Library (Lib)
messages: 245760
nosy: Zahari.Dim
priority: normal
severity: normal
status: open
title: xontextlib.redirect_stdout should redirect C output
type: enhancement
versions: Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24500
___



[issue24475] The docs never define what a pool task is

2015-06-19 Thread Zahari Dim

New submission from Zahari Dim:

See:

http://stackoverflow.com/questions/30943161/multiprocessing-pool-with-maxtasksperchild-produces-equal-pids

The documentation never makes clear what a task is in the context of Pool.map. 
At best, it says:

This method chops the iterable into a number of chunks which it submits to the 
process pool as separate tasks. The (approximate) size of these chunks can be 
specified by setting chunksize to a positive integer.

in the map documentation. However, it does not say how these chunks are 
calculated by default, making the maxtasksperchild argument not very useful. 
The fact that a function evaluated by map is not a task should be much 
clearer in the documentation.

Also, in the examples, such as:

    with multiprocessing.Pool(PROCESSES) as pool:
        #
        # Tests
        #

        TASKS = [(mul, (i, 7)) for i in range(10)] + \
                [(plus, (i, 8)) for i in range(10)]

        results = [pool.apply_async(calculate, t) for t in TASKS]
        imap_it = pool.imap(calculatestar, TASKS)
        imap_unordered_it = pool.imap_unordered(calculatestar, TASKS)

TASKS are not actually tasks but rather task groups.

--
assignee: docs@python
components: Documentation
messages: 245509
nosy: Zahari.Dim, docs@python
priority: normal
severity: normal
status: open
title: The docs never define what a pool task is
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24475
___



[issue19737] Documentation of globals() and locals() should be improved

2013-11-23 Thread Zahari Dim

New submission from Zahari Dim:

The globals() documentation states:

Return a dictionary representing the current global symbol table.[...]

This doc and the fact that globals() is called as a function made me think that 
globals() returns a copy of the global namespace dict, rather than an object 
that could be used to actually modify the namespace. I don't find the meaning 
of "representing" obvious in this context.

This of course led to a very nasty and sneaky bug in my code.

The docs of locals() don't seem clear to me either, though at least they seem 
to imply that it is actually modifying the namespace.
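
A short illustration of the behaviour described for globals(): the returned dictionary is the module namespace itself, so writing to it creates or rebinds module-level names. The name `answer` is made up for the example.

```python
g = globals()
g["answer"] = 42      # writes straight into the module namespace

print(answer)         # 42: the name now exists at module level
print(globals() is g) # True: every call returns the same dictionary, not a copy
```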

--
assignee: docs@python
components: Documentation
messages: 204052
nosy: Zahari.Dim, docs@python
priority: normal
severity: normal
status: open
title: Documentation of globals() and locals() should be improved

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19737
___



[issue19737] Documentation of globals() and locals() should be improved

2013-11-23 Thread Zahari Dim

Zahari Dim added the comment:

I am looking at the docs of the built-in functions:

http://docs.python.org/2/library/functions.html

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19737
___