[issue17005] Add a topological sort algorithm

2021-12-17 Thread Éric Araujo

Éric Araujo  added the comment:

See https://bugs.python.org/issue46071 for request to change the API
(I preferred adding a note here than adding all people to nosy there)

--
nosy: +eric.araujo

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2021-11-29 Thread Adrian Garcia Badaracco


Adrian Garcia Badaracco  added the comment:

As part of working on a tool that deals with dependencies, I was building my 
own topological sort. I iterated through various APIs (iterable of tasks, 
iterable of parallelizable groups of tasks, etc.) until I found the (now 
stdlib) version which ended up being exactly the API I needed to most 
efficiently execute dependencies. So, kudos on the design!

I actually ended up re-writing it in Rust, partly because I wanted a good 
project to learn Rust, partly because I wanted to be able to modify the API a 
bit. Namely:
1. I needed the ability to re-execute the same DAG multiple times without 
re-checking for cycles and re-adding all nodes (so basically copying 
`npredecessors` before executing).
2. I needed the ability to remove nodes from the graph. The real-world 
application is removing pruning subgraphs corresponding to cached dependencies. 
Again, I wanted to do this without rebuilding the entire thing (removing nodes 
can never lead to a cycle, and it is possible to keep track of new leaf nodes 
as you remove them instead of iterating over the entire graph again to find 
leaf nodes).

Here's the implementation in case anyone is interested: 
https://github.com/adriangb/graphlib2

The algorithm is the same, but I had to change the data structures somewhat to 
cope w/ Rusts' borrowing rules (namely I can't hold a mutable reference to two 
values in `node2nodeinfo` at the same time, which the current implementation 
does here 
https://github.com/python/cpython/blob/32f1491a9770b7f2989507ecf8f13ef35dd95b0b/Lib/graphlib.py#L190,
 so I split them out into two separate mappings).

--
nosy: +adriangb

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2021-09-16 Thread Éric Araujo

Change by Éric Araujo :


--
versions: +Python 3.9 -Python 3.8

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-12-04 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
assignee: rhettinger -> 
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-08-26 Thread Paddy McCarthy


Paddy McCarthy  added the comment:

Please ignore my earlier Message-id 
<1598493715.04.0.06462575371.issue17...@roundup.psfhosted.org>.

I missed a dependency in cutting down a larger example. Sorry.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-08-26 Thread Paddy McCarthy

Paddy McCarthy  added the comment:

I've been playing with Python 3.9.0rc1 and was looking at a particular graph to 
see when it released tasks for processing.
I ran the following code:

from functools import reduce
from pprint import pprint as pp
from collections import defaultdict
from graphlib import TopologicalSorter
from heapq import heapify, heappush, heappop, heappushpop

print("""
###
### TASK DEPENDENCY 
###

A -> B -> C
↓↓↓
D -> E -> F
""")
graph3 = {"A": {"B", "D"},
  "B": {"C", "E"},
  "C": {"F"},
  "D": {"E"},
  "E": {"F"},
  }
tasktime = {task: 2 for task in 'ABCDEF'}


def simulate(graph, tasktime):
print("\n## NEW SIMULATION")
t = 0
heap = []
heapify(heap)
print("\n# Task runtimes:")
for node, tm in tasktime.items():
print(f"  {node}: {tm}")

print("\n# OUTPUT. (:> for task start times, <: for stop times).\n")
ts = TopologicalSorter(graph)
ts.prepare()
while ts.is_active():
for node in ts.get_ready():
finish = t + tasktime[node]
heappush(heap, (finish, node))
print(f"{'  ' * t}{node}:> @{t}")
t, node = heappop(heap)
print(f"{'  ' * t}{node}<: @{t}")
ts.done(node)

simulate(graph3, tasktime)
tasktime['C'] = 1
simulate(graph3, tasktime)


I got the following output:


###
### TASK DEPENDENCY 
###

A -> B -> C
↓↓↓
D -> E -> F


## NEW SIMULATION

# Task runtimes:
  A: 2
  B: 2
  C: 2
  D: 2
  E: 2
  F: 2

# OUTPUT. (:> for task start times, <: for stop times).

F:> @0
F<: @2
C:> @2
E:> @2
C<: @4
E<: @4
B:> @4
D:> @4
B<: @6
D<: @6
A:> @6
A<: @8

## NEW SIMULATION

# Task runtimes:
  A: 2
  B: 2
  C: 1
  D: 2
  E: 2
  F: 2

# OUTPUT. (:> for task start times, <: for stop times).

F:> @0
F<: @2
C:> @2
E:> @2
  C<: @3
E<: @4
B:> @4
D:> @4
B<: @6
D<: @6
A:> @6
A<: @8
>>> 


Note that in the second simulation, C finish, but B isn't then immediately 
started. I have my own code that also works like this but it isn't optimal.

Thanks guys.

--
nosy: +Paddy McCarthy

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread miss-islington


miss-islington  added the comment:


New changeset 0a674638a3de14dc86b5294a5db067e0c2177a51 by Miss Islington (bot) 
in branch '3.9':
bpo-17005: Move topological sort functionality to its own module (GH-20558)
https://github.com/python/cpython/commit/0a674638a3de14dc86b5294a5db067e0c2177a51


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:


New changeset 2f172d8f1525defe9bba4d49e967fdfc69151731 by Pablo Galindo in 
branch 'master':
bpo-17005: Move topological sort functionality to its own module (GH-20558)
https://github.com/python/cpython/commit/2f172d8f1525defe9bba4d49e967fdfc69151731


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread miss-islington


Change by miss-islington :


--
nosy: +miss-islington
nosy_count: 19.0 -> 20.0
pull_requests: +19802
pull_request: https://github.com/python/cpython/pull/20561

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Opened https://github.com/python/cpython/pull/20558. I have *initially* 
proposed "graphlib" as it feels more in-line with other stdlib module names and 
as Jim pointed it does not collide with anything major. Also, I have proposed 
this initial name to not keep the scope too short in case we want to add new 
things in the future (although we can always say "no" and just keep this 
functionality).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Pablo Galindo Salgado


Change by Pablo Galindo Salgado :


--
pull_requests: +19801
pull_request: https://github.com/python/cpython/pull/20558

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Jim Fasarakis-Hilliard


Jim Fasarakis-Hilliard  added the comment:

Another option, `graphlib`[1], does exist on PyPI but is not maintained and 
currently read-only by the author. Other flavors[2][3] of the same name also 
don't seem to have much adoption so they shouldn't confuse if a name like 
`graphlib` was chosen.

[1] https://github.com/bruth/graphlib/
[2] https://github.com/MengLiuPurdue/graph_lib
[3] https://github.com/EmileTrotignon/GraphLib

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Eric V. Smith


Eric V. Smith  added the comment:

I'd prefer a new module just for this. As for names, I like toposort over 
topsort.

I already have a library on PyPI named toposort. Personally, I don't have a 
problem with the stdlib taking over that name, and I'd just deprecate mine at 
3.9, or whatever. I'm not sure if there are any ramifications of doing that, 
however.

Or, pick another unused name, like toposortlib or something.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

I checked and from the list of proposed names with "graph" prefixes, only 
"graphutils" is not colliding with something on Pypi.

For the record "cslib" and "misclib" are also available but the scope of those 
feel much much bigger.

I am going to open a PR to move this to "graphutils" for now. We can discuss 
better in the PR for better names given that seems that there is an agreement 
on moving this out of functools.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

I vote for it being in its own module (like difflib).  That would also be a 
good place to put my proposed tsort() function with its simpler API.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Tim Peters


Tim Peters  added the comment:

`functools` is clearly a poor place for this. `imath` would also be. 
`graph_stuff_probably_limited_to_a_topsort` is the only accurate name ;-) 
Off-the-wall possibilities include `misclib` (stuff that just doesn't fit 
anywhere else - yet) and `cslib` (Computer Science-y stuff).

Whichever name wins, we should probably look to ensure the name isn't also of a 
package already in (non-trivial) use.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Ben Mares


Ben Mares  added the comment:

dependencytools?

But I don't think it qualifies yet for the plural...

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

I would like to hear Raymond and Tim advise on what the best name for the new 
module should be :)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

> The downside I see with any graph prefixed names is the fact that it implies 
> a larger collection of graph operations.

I don't see that an issue. In the general term is better to have a general name 
and discuss further improvements than just name the module "topological_sort" 
and now being cornered if we ever want to add new graph-related stuff.

We can always say "no" to new functionality but changing the name of the module 
is not possible once is released.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Jim Fasarakis-Hilliard


Jim Fasarakis-Hilliard  added the comment:

The downside I see with any graph prefixed names is the fact that it implies a 
larger collection of graph operations.

Add that to the fact that people might be more tempted to propose many  graph 
related algorithms/utilities to a module with the same name.

A more localized name solves that.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread gaborbernat


gaborbernat  added the comment:

I like graphutils for what it's worth.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

If we move it, I would prefer a new module that somehow makes clear the scope, 
something like graphutils or graphtheory or graph (although this las one will 
probably collide with many existing things).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Jim Fasarakis-Hilliard


Jim Fasarakis-Hilliard  added the comment:

It does seem out of place in functools, intensified by it's odd interjection 
among the other functools objects.

Considering heapq and bisect exist as standalone modules, the idea that 
topological sorting could go in its own module wouldn't be without precedent.

--
nosy: +Jim Fasarakis-Hilliard

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Rémi Lapeyre

Rémi Lapeyre  added the comment:

Could it make sense to have this in the often proposed imath module? 

It's integers per se but Both number theory and graph theory are part of 
discrete mathematics so it may feel more at home there?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Any suggestions for the new module? I assume we can move this to a new 
topsort/topological_sort or similar...

What do people think?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-05-31 Thread Ben Mares


Ben Mares  added the comment:

It's great to have this feature in the standard library, but it really seems to 
clutter the functools documentation. Everything else in functools applies 
directly to functions and methods. Suddenly reading graph theory terminology 
was disorienting for me. (Due to context, I expected "node" to mean some new 
type of Python function.) It makes the documentation significantly longer, and 
IMHO abstruse. Would it be sensible to move this to a separate module?

--
nosy: +maresb

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-04-02 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

How about I post a PR so we can talk about something concrete.  Then you two 
can either fight it to its death or you can join me in making it is good as 
possible, hopefully the latter :-)

I am not happy with the current API but do accept that both of you are in 
satisfied with it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-04-02 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
Removed message: https://bugs.python.org/msg365640

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-04-02 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

How about I post a PR so we can talk about something concrete.  Then you two 
can either fight it to its death or you can join me in making it is good as 
possible, hopefully the latter :-)

I am not happy with the current API but do accept that both of you are in love 
with it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-04-02 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Is also notable to mention that you can also provide the graph as a dictionary 
to the constructor:

>>> graph = {D: {B, C}, C: {A}, B: {A}, A:{object}}
>>> ts = TopologicalSorter(graph)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-04-02 Thread Tim Peters


Tim Peters  added the comment:

Possibly, sure.  But I believe it's hard to beat

add(node, *predecessors)

for usability as a way to build the dependency graph.  For example, a list of 
pairs is a comparative PITA for most use cases I've had.  Whether it's 
following a recipe to bake a cake, or tracing a maze of C include files, it 
seems _most_ natural to get input in the form "this thing depends on these 
other things".  Not the other way around, and neither a sequence of pairs.

_If_ you buy that, then .add() is screamingly natural, and trying to squash a 
pile of .add()s into a single sequence-of-sequences argument seems strained.

Typically I don't get input in one big, single gulp.  It's instead discovered 
one item at a time.  Fine - .add() it and then move on to the next item.  It's 
certainly possible to append the item and its predecessors to a persistent 
(across items) list, and call a function once at the end with that list.

But what does that buy?  I'm building the list solely to meet the function's 
input requirement - the list serves no other purpose.  Instead of calling 
.add() N times, I call .append() N times.  "add" is 3 letters shorter ;-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-04-02 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

> We may need two versions then, a full-featured TopologicalSorter() class and 
> a simple tsort() function that doesn't aspire to be all things to all people.

How this other version would differ from using .add() + .static_order() as Tim 
mentions? Originally we designed static_order() so it will satisfy the simpler 
use cases so I would suggest aspiring to simplify that interface if needed 
instead of adding an extra function.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-04-02 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> If your alternative isn't equally easy to use in a parallelized
> context, I'll be at best +0.

We may need two versions then, a full-featured TopologicalSorter() class and a 
simple tsort() function that doesn't aspire to be all things to all people.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-04-02 Thread STINNER Victor


Change by STINNER Victor :


--
nosy: +vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-04-02 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

I just want to echo what Tim mentioned with the extra data point that some of 
the maintainers of some popular and wide-used open-source libraries that indeed 
have to deal with this problem or the parallel version of the problem (like 
gaborbernat in this thread, the maintainer of "tox" and "virtualenv") do indeed 
find the current API desirable.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-04-02 Thread Tim Peters


Tim Peters  added the comment:

Raymond, what application do you have that wouldn't be completely addressed by 
sticking to just .add() (to record dependencies) and .static_order() (to 
retrieve a linear order)?

Larry Hastings and I originally worked out the fancier bits of the interface to 
deal with problems he actually had, and for which no existing Python topsort 
implementation we could find was of any use:  extract maximal parallelism.  If 
you don't want that, fine, stick to the two simple bits.

The bits to support parallelism are very easy to use to write correct 
parallelized code, but of course can seem baffling if you don't give a rip 
about parallelism.  But in that case you have no need to learn about them 
either.

If your alternative isn't equally easy to use in a parallelized context, I'll 
be at best +0.

About "fast", this is linear time, in the sum of the number of items and 
dependencies.  Including the part checking for a cycle, which is by far the 
"hardest" part.  So it's asymptotically optimal, although I've never seen a 
real context in which topsort speed made a lick of difference.

In the real world, in a parallelized context it can be important to check for a 
cycle _before_ running a topsort:  actions are performed ASAP based on 
order-deduced-so-far, and it can be no good to find out "oh! I can't finish 
this" at the end.  There's actually nothing gratuitous here.  If it seems 
"over-engineered", that's because it's addressing problems you haven't had yet 
;-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-04-02 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

At some point in the next two or three weeks, I'll have a chance to work on 
this more and to offer a competing patch.  IMO, the current checkin is 
over-engineered, both in its API and implementation.  This could have been a 
simple, fast tool written as one or two short Python functions.

Also, I would like to try to out the API alternatives on some groups of 
engineers to get some user feedback.

For me, as the API currently stands, I would have to write a wrapper to make it 
usable for my applications.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-02-20 Thread wim glenn


Change by wim glenn :


--
nosy: +wim.glenn

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-23 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:


New changeset 65ecc390c1fa5acdd6348ae3f9843bbdcd8870d1 by Pablo Galindo in 
branch 'master':
bpo-17005: Minor improvements to the documentation of TopologicalSorter 
(GH-18155)
https://github.com/python/cpython/commit/65ecc390c1fa5acdd6348ae3f9843bbdcd8870d1


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-23 Thread Pablo Galindo Salgado


Change by Pablo Galindo Salgado :


--
pull_requests: +17541
pull_request: https://github.com/python/cpython/pull/18155

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-23 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:


New changeset 99e6c260d60655f3d2885af545cbc220b808d492 by Pablo Galindo in 
branch 'master':
bpo-17005: Add a class to perform topological sorting to the standard library 
(GH-11583)
https://github.com/python/cpython/commit/99e6c260d60655f3d2885af545cbc220b808d492


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-19 Thread Zahari Dim


Zahari Dim  added the comment:

I would like to suggest a `dependency_resolver` API that I have been using that 
goes in line with what Tim Peters proposes in 
https://bugs.python.org/issue17005#msg359702

A DAG would be an object that can be iterated in topological order with 
__iter__ (for simple sequential usage) or have a way of managing all the tasks 
that can be run in parallel. The later is done with a generator function:

```
def dependency_resolver(self):
"""Yield the set of nodes that have all dependencies satisfied (which 
could be an empty set). Send the next
completed task."""
```

which is used with something like:

```
deps = dag.dependency_resolver()
pending_tasks = deps.send(None)
if not pending_tasks:
#Graph empty
return
#Note this is a can be done in parallel/async
while True:
some_task = pending_tasks.pop()
complete_task_somehow(some_task)
try:
   more_tasks = deps.send(some_task)
except StopIteration:
   #Exit when we have sent in all the nodes in the graph
   break
else:
pending_tasks |= more_tasks

```


An implementation I have used for some time is here:


https://github.com/NNPDF/reportengine/blob/master/src/reportengine/dag.py

although I'd make simpler now. In practice I have found that the function I use 
most of the time to build the graph is:

dag.add_or_update_node(node=something_hashable, inputs={set of existing nodes}, 
outputs={set of existing nodes}).

which adds the node to the graph if it was not there and maps updates the 
dependencies to add inputs and outputs, which in my experience matches the way 
one discovers dependencies for things like packages.

--
nosy: +Zahari.Dim

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-17 Thread gaborbernat

gaborbernat  added the comment:

I think the new interface feeds better for usage in both sequential or parallel 
workflows, which means we can use a single UI for both, so  from myself

--
nosy: +gaborbernat

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-16 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

I have been playing with the API for a while and I have to say that I have 
fallen in love with it: I tried to reimplement many snippets of parallel and 
non-parallel topological-order processing of a graph with this and it always 
fits very nicely.

I have shown this API also to some library maintainers that have to deal with 
topological sorting (like the tox maintainer) and they are very enthusiastic 
about the API as well.

Tim, I have updated PR 11583 to use the proposed API with test and docs. Could 
you make a review when you have some time so we can polish it for the 3.9 
release?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-14 Thread Tim Peters


Tim Peters  added the comment:

Oh, it's fine!  Kahn's algorithm is what I meant when I wrote the "bog-standard 
implementation" before.

I don't believe I've ever seen a context in real life where topsort speed made 
a lick of real difference, so I expect any linear-time (in the sum of the 
number of nodes and edges) would be fine.  Nevertheless, for recording a node's 
successors ("children" in your code), I've always used a list rather than a 
set.  Lists run faster and require less memory than sets, and - unlike sets - 
in Python inherently preserve insertion order.  Iteration order can become 
visible (e.g., if B, C, and D depend on A, what's "the" topsort order?  it 
depends on the order A's children appear when iterating over them - predictable 
with a list, "it depends" with a set).

Note:  "but we have to guard against redundant edges!" would be a red herring.  
Kahn's algorithm couldn't care less, provided that predecessor counts 
accurately reflect the number of edges (redundant or not) entering a node.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-14 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Disregarding the API, what do you think about the approach of 
https://github.com/python/cpython/pull/11583 for the implementation? Under my 
benchmarks (check previous comments) it seems to perform very good with regards 
to memory and time.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-14 Thread Tim Peters


Tim Peters  added the comment:

I'll add ts.py, which was a work-in-progress that implements a minor variation 
of most everything I typed about.  If nothing else, its _find_cycle is useful 
as a non-recursive linear-time cycle finder (recursion is deadly here because 
recursive depth-first search can easily "blow the stack" on larger graphs).

There's also "if 1:"/"else:" blocks that set up parallel cases, using threads 
or processes, and two ways of managing the parallelism (the one I showed 
before, and a more elaborate one that puts an upper bound on how large the 
queues can grow - which is sometimes "a problem" for multiprocessing.queue).

--
Added file: https://bugs.python.org/file48842/ts.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-14 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Fair enough, I will read this carefully again and try to sketch a prototype 
soon :)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-14 Thread Tim Peters


Tim Peters  added the comment:

> I am slightly confused about what .prepare() should do. Why
> is this step necessary?

To say "I'm done adding edges".  Any call to add() after prepare() should raise 
an exception.  Likewise, e.g., any call to get_ready() before prepare() should 
raise an exception.  In a bog-standard topsort implementation, saving for each 
node a sequence of successors and a count of predecessors, this is also the 
time to collect all the nodes that have no predecessors (the first batch 
returned by get_ready()).

Much the same could be done without prepare() by get_ready() making a special 
case out of the first time it's called.  That's more magical, though.  "I'm 
done adding edges" is utterly non-magical.

> - Why we need the .done() method here? Why not instead make get_ready()
> simply a generator so you can just write
>
>for node in self.get_ready():

The point of done() is to enable maximum exploitation of parallelism.  As 
already sketched, if a user doesn't care about that, fine, a different method 
(like static_order()) can generate all the nodes in _some_ static topsort 
order, with no need for done().

But suppose a user does care about parallelism.  Consider graph

A -> B
A -> C
A -> D
B -> D

Output A B C D is a topsort, but useless unless the user is content to "do" one 
node at a time.

Instead get_ready() first returns [A] (or a tuple, or a generator, or a set ... 
something iterable).  A is handed out to worker processes/threads, but 
get_ready() will return an empty iterable until done(A) is called.  Indeed, if 
"doing" A fails, it's NOT the case that anything else can ever be started.

If/when "doing A" succeeds, then done(A) is called, and the next get_ready() 
returns [B, C].  Those can be done in parallel, but D can't be started until 
done(B) is called.  done(B) may or may not be called before done(C) is called - 
the topsort itself has no way to know in advance, nor _should_ it impose its 
own view of that.  Note that D does not depend on C, so there's no need to wait 
for _both_ in [B, C] to finish.  It's necessary and sufficient that B be marked 
done() for D to be ready to go.

> It seems that the .done() is very tight to use this API as a "task
> scheduler" but maybe I am doing something here in my understanding
> of the API.

done() says nothing about how the user "should" schedule work items, but 
instead allows get_ready() to return all the work items whose predecessors have 
been marked done() (but haven't already been passed out by get_ready()).  
That's the maximum set of nodes that _can_ be worked on at the time.  The 
topsort class itself has no policy about how or when they "should" be worked 
on, get_ready() is just identifying all the possibilities that exist.  Which is 
impossible to know unless the class is also told which nodes it already passed 
out have finished - the purpose of done().

is_active() eventually returns False when all the nodes passed out by 
get_ready() have been marked done(), _and_ there are no more nodes ready to 
pass out.  At that point, there's a cycle in the input relations if and only if 
there's some node get_ready() never passed out.

In my prototype implementation, that's another thing prepare() does:  checks 
for a cycle, and raises CycleError if there is one.  The user can catch & 
ignore that if they like, and continue calling get_ready() and done() until no 
more progress can be made.  I think it's more likely, though, that the user 
would stare at the cycle attached to the CycleError instance, do whatever it 
takes to break the cycle, and start over again.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-14 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

> Am I seriously suggesting this for Python?  Sure.  It's fun to advance the 
> practical state of the art :-)


I think this API looks very interesting! I have some questions before start 
implementing it to play a bit with it:

- I am slightly confused about what .prepare() should do. Why is this step 
necessary?

- Why we need the .done() method here? Why not instead make get_ready() simply 
a generator so you can just write

for node in self.get_ready():

It seems that the .done() is very tight to use this API as a "task scheduler" 
but maybe I am doing something here in
my understanding of the API.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-09 Thread Senthil Kumaran


Change by Senthil Kumaran :


--
nosy: +orsenthil

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-09 Thread Tim Peters


Tim Peters  added the comment:

Let's stir this up a bit ;-)  I recently had occasion to exchange ideas with 
Larry Hastings about topsorts for use in a package manager he's writing.  I 
thought his API for adding edges was ... perfect:

add(node, *dependson)

So, e.g., add(A, B, C) says A depends on B, and on C, but says nothing else 
about B and C.  This is almost always the way topsorts show up in real life:  
you know what a thing depends *on* directly, but have scant idea how things may 
be in the opposite direction.  For example, you know that baking a cake 
requires (among other things) flour, but have no real idea of the universe of 
other things that require flour.  Likewise Larry knows which packages each 
package requires, but not the reverse.  Etc.

Nodes with no edges are trivial to add then:  add(A).

If you're building input to a topsort from a graph, also trivial:

for n, p in node2predecessors.items():
topsort_instance.add(n, *p)

and it doesn't matter whether the predecessors in the original graph were 
stored in a list, set, tuple, or any other iterable container.  Nothing special 
about an empty collection of predecessors either.

The other big thing that came up is that most topsort programs were useless for 
his goal:  downloading and installing packages takes significant wall clock 
time, and there's huge opportunity for exploiting parallelism.  But a flat 
sequence in topsort order gives no clue about what _can_ be done in parallel.  
Instead you really want several methods, like

prepare()

to say that you're done building the graph; and,

get_ready()

to get all nodes ready to go, which haven't already been returned by 
get_ready() calls (initially, this is the collection of nodes with no 
predecessors, which prepare() can know); and,

done(node)

to say that `node` (returned by a previous call to get_ready()) is finished 
now, so that the next call to get_ready() can return all (if any) nodes for 
which `node` was the last non-done predecessor; and,

is_active()

to say whether the topsort can make more progress (is_active() returns True iff 
there are still nodes ready to go that haven't yet been passed out by 
get_ready(), or if the number of nodes marked done() is less than the number 
that have been passed out by get_ready()).

These are all easy to code, and allow the user to extract all the opportunities 
for parallelism that theoretically exist.  There is no static order that can do 
so, since the opportunities that exist at any time depend on the times and 
order in which nodes are marked done() in real life - and that may vary from 
one run to the next.

Of course a deterministic static order can be derived from those, like, e.g.,

def static_order(self):
self.prepare()
while self.is_active():
for node in self.get_ready():
yield node
self.done(node)

For parallel use, e.g.,

self.prepare()
while instance.is_active():
for node in instance.get_ready():
inq.put(node)
node = outq.get()
instance.done(node)

where worker threads or processes take nodes to work on off of queue `inq`, 
then, when the work for a node is done, put the node on queue `outq`.

Am I seriously suggesting this for Python?  Sure.  It's fun to advance the 
practical state of the art :-)

--
nosy: +tim.peters

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2020-01-09 Thread Gareth Rees


Gareth Rees  added the comment:

I'd like to push back on the idea that graphs with isolated vertices are 
"unusual cases" as suggested by Raymond.

A very common use case (possibly the most common) for topological sorting is 
job scheduling. In this use case you have a collection of jobs, some of which 
have dependencies on other jobs, and you want to output a schedule according to 
which the jobs can be executed so that each job is executed after all its 
dependencies.

In this use case, any job that has no dependencies, and is not itself a 
dependency of any other job, is an isolated vertex in the dependency graph. 
This means that the proposed interface (that is, the interface taking only 
pairs of vertices) will not be suitable for this use case. Any any programmer 
who tries to use it for this use case will be setting themselves up for failure.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-05-31 Thread Raymond Hettinger

Raymond Hettinger  added the comment:

Unless Łukasz gives us a nod to work out this API for the second beta, we'll 
need to defer this to 3.9.   IMO, the API in the patch is not user friendly.  
It started on the right path but became contorted (for both inputs and outputs) 
to serve unusual cases.

--
nosy: +lukasz.langa

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-05-26 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

> * allow input as ordered pairs like the Unix tsort command
> * allow more convenient input as dependency sequences (like graphviz):

This is how my first proposal started (and I still like it a bit more than the 
dictionary input), but there are some concerns (check other comments) regarding 
this API, like representing isolated nodes or disjoint graphs.

> return both the sorted sequence and cycles

Regarding the output, I like returning a collection of sets, where every set 
represents all possible elements of the same order in the result. This also 
helps if the user has some expectation regarding the ordering. For example, in:

['ABDGI', 'BEG', 'CEH', 'KCFHJ']

the results starting with

['A', 'B', 'D'

and

['A', 'B', 'K'

are both valid.

With the current implementation, this is the equivalent of C3 linearization:

  from itertools import tee
  from collections import defaultdict
  def c3_linearization(inheritance_seqs):
 graph = defaultdict(set)
 for seq in inheritance_seqs:
 a, b = tee(seq)
 next(b, None)
 for child, parent in zip(a,b):
 graph[child].add(parent)
 retun ((list(group) for group in functools.toposort(graph)), [])
 return tuple(reversed(order))

  >>> class A: pass
  >>> class B(A): pass
  >>> class C(A): pass
  >>> class D(B, C): pass

   >> D.__mro__
  (__main__.D, __main__.B, __main__.C, __main__.A, object)

   >> c3_linearization([(D, B, A, object), (D, C, A, object)])
  [{__main__.D}, {__main__.B, __main__.C}, {__main__.A}, {object}]

What do you think?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-05-26 Thread Raymond Hettinger


Change by Raymond Hettinger :


Added file: https://bugs.python.org/file48362/tsort.1.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-05-26 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Attaching some of my 2013 work on this.  Here are some API notes:

* spell-out the name topological_sort()

* allow input as ordered pairs like the Unix tsort command
  https://en.wikipedia.org/wiki/Tsort#Examples

* allow more convenient input as dependency sequences (like graphviz):
 [['a', 'b', 'c', 'x], ['b', 'd', 'e', 'y']]
  is short for and equivalent to:
 [(a,b), (b,c), (c,x), (b,d), (d, e), (e, y)]

* return both the sorted sequence and cycles
  (both are individually useful and the latter
   is helpful in debugging in only the former is wanted)

* desire to match the C3 MRO computation


* would like the ordering to be stable and deterministic
  (this precludes picking arbitrary elements from sets).

--
Added file: https://bugs.python.org/file48361/topological_sort.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-20 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

(A, B, C, D, E, F) would not be correct as B is order 2 and "C" and "A" is 
order 1. The correct orders are the inner-group permutations of:

[{'A', 'F'}, {'B', 'D', 'E'}, {'C'}]

(Assuming that the lower diagram goes from top to bottom)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-20 Thread Rémi Lapeyre

Rémi Lapeyre  added the comment:

> 2) Topological sorting usually is well-defined on totally connected graphs, 
> so I do not know what exactly it means to topologically sort two disjoint 
> graphs. This was one of the main drawbacks of the tuple-based approach, but I 
> think it may be a good property.

To give a use-case, I'm currently using topological sort to order a list of 
tasks where edges represent dependencies between tasks. Sometime a group of 
tasks does not share a dependency with another group any relative order between 
those two groups is correct:


A -> B

   C
  / \
 D   E
  \ /
   F

The order (A, B, C, D, E, F) would be correct in this example as would (C, A, 
E, B, D, F).

I think the general topological sort in Python should be able to handle such 
inputs.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-20 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Although I think the API with the dictionary may be cleaner, I still do not 
like it completely.

1) One of the reasons is that implementing C3 directly with the topological 
sort is not straightforward as the different nodes that are at the same level 
are unordered and therefore any permutation of all these is a valid topological 
sort, while C3 comes from a stable sort. The version with pairs is stable on 
the other hand.

2) Topological sorting usually is well-defined on totally connected graphs, so 
I do not know what exactly it means to topologically sort two disjoint graphs. 
This was one of the main drawbacks of the tuple-based approach, but I think it 
may be a good property.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-20 Thread Pablo Galindo Salgado

Pablo Galindo Salgado  added the comment:

I have updated the PR to receive a dictionary of sets as in Eric V. Smith's 
package. I have maintained the guts of the current algorithm as it scales much 
better:

>>> test_data = {x:{x+n for n in range(100)} for x in range(1000)}

>>> %timeit list(toposort.toposort(test_data)) # The one in PyPi
910 ms ± 2.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

>>> %timeit list(functools.toposort(test_data)) # In this PR

69.3 ms ± 280 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> list(functools.toposort(l)) == list(toposort.toposort(l))
True

--
Removed message: https://bugs.python.org/msg334099

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-20 Thread Pablo Galindo Salgado

Pablo Galindo Salgado  added the comment:

I have updated the PR to receive a dictionary of sets as in Eric V. Smith's 
package. I have maintained the guts of the current algorithm as it scales much 
better:

>>> test_data = {x:{x+n for n in range(100)} for x in range(1000)}

>>> %timeit list(toposort.toposort(l)) # The one in PyPi
910 ms ± 2.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

>>> %timeit list(functools.toposort(l)) # In this PR

69.3 ms ± 280 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> list(functools.toposort(l)) == list(toposort.toposort(l))
True

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-18 Thread Eric V. Smith


Eric V. Smith  added the comment:

This is why I prefer the API exposed by https://pypi.org/project/toposort/

list(toposort({2: {11},
   9: {11, 8, 10},
   10: {11, 3},
   11: {7, 5},
   8: {7, 3},
  }))

returns [{3, 5, 7}, {8, 11}, {2, 10}, {9}]

For an node with no edges, use an empty set:

list(toposort({100: set(),
   2: {11},
   9: {11, 8, 10},
   10: {11, 3},
   11: {7, 5},
   8: {7, 3},
  }))
[{3, 100, 5, 7}, {8, 11}, {2, 10}, {9}]

I also don't think we should provide multiple APIs. Let's just provide one, and 
recipes for any helpers, if needed. For example, to flatten the result into a 
list. Or to take a list of edges as the input.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-18 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

I think 'toposort' is a good name for a function that implements 'topological 
sorting'.
https://en.wikipedia.org/wiki/Topological_sorting

--
versions: +Python 3.8 -Python 3.4

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-18 Thread Gareth Rees


Gareth Rees  added the comment:

Just to elaborate on what I mean by "bug magnet". (I'm sure Pablo understands 
this, but there may be other readers who would like to see it spelled out.)

Suppose that you have a directed graph represented as a mapping from a vertex 
to an iterable of its out-neighbours. Then the "obvious" way to get a total 
order on the vertices in the graph would be to generate the edges and pass them 
to topsort:

def edges(graph):
return ((v, w) for v, ww in graph.items() for w in ww)
order = topsort(edges(graph))

This will appear to work fine if it is never tested with a graph that has 
isolated vertices (which would be an all too easy omission).

To handle isolated vertices you have to remember to write something like this:

reversed_graph = {v: [] for v in graph}
for v, ww in graph.items():
for w in ww:
reversed_graph[w].append(v)
order = topsort(edges(graph)) + [
  v for v, ww in graph.items() if not ww and not reversed_graph[v]]

I think it likely that beginner programmers will forget to do this and be 
surprised later on when their total order is missing some of the vertices.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-18 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

> 1. The name "topsort" is most naturally parsed as "top sort" which could be 
> misinterpreted (as a sort that puts items on top in some way). If the name 
> must be abbreviated then "toposort" would be better.


I totally agree that `topsort` is a bad name, I used it more or less as a dummy 
for starting the discussion about the implementation.

> 2. "Topological sort" is a terrible name: the analogy with topological graph 
> theory is (i) unlikely to be helpful to anyone; and (ii) not quite right. I 
> know that the name is widely used in computing, but a name incorporating 
> "linearize" or "linear order" or "total order" would be much clearer.


Topological sort (not as the function name) but as an operation is a very well 
known concept and is well defined. If you are referring to not use "Topological 
Sort" in the docstrings or the documentation, I strongly oppose.


Regarding the interface, I am more happy to change it once there is an 
agreement. I am still awaiting Raymond's comments regarding this so we can 
start discussing.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-18 Thread Gareth Rees


Gareth Rees  added the comment:

I approve in general with the principle of including a topological sort 
algorithm in the standard library. However, I have three problems with the 
approach in PR 11583:

1. The name "topsort" is most naturally parsed as "top sort" which could be 
misinterpreted (as a sort that puts items on top in some way). If the name must 
be abbreviated then "toposort" would be better.

2. "Topological sort" is a terrible name: the analogy with topological graph 
theory is (i) unlikely to be helpful to anyone; and (ii) not quite right. I 
know that the name is widely used in computing, but a name incorporating 
"linearize" or "linear order" or "total order" would be much clearer.

3. The proposed interface is not suitable for all cases! The function topsort 
takes a list of directed edges and returns a linear order on the vertices in 
those edges (if any linear order exists). But this means that if there are any 
isolated vertices (that is, vertices with no edges) in the dependency graph, 
then there is no way of passing those vertices to the function. This means that 
(i) it is inconvenient to use the proposed interface because you have to find 
the isolated vertices in your graph and add them to the linear order after 
calling the function; (ii) it is a bug magnet because many programmers will 
omit this step, meaning that their code will unexpectedly fail when their graph 
has an isolated vertex. The interface needs to be redesigned to take the graph 
in some other representation.

--
nosy: +g...@garethrees.org

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-17 Thread Rémi Lapeyre

Rémi Lapeyre  added the comment:

As the name been already discussed ? 


I fear that topsort might only be clear to people already knowing what it does. 
topoligical_sort would be more discoverable and explicit and one can always do

from functools import topological_sort as tsort

if he wants to save some typing later.

--
nosy: +remi.lapeyre

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-16 Thread Pablo Galindo Salgado

Pablo Galindo Salgado  added the comment:

The one in PR 11583 is twice as fast:

>timeit for -> 
>topsort([(2,11),(9,11),(9,8),(9,10),(10,11),(10,3),(11,7),(11,5),(8,7),(8,3)])
12.4 µs ± 59.1 ns per loop (mean ± std. dev. of 7 runs, 10 loops each)

>timeit for -> 
>tsort([(2,11),(9,11),(9,8),(9,10),(10,11),(10,3),(11,7),(11,5),(8,7),(8,3)])
29.1 µs ± 147 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)

--
Removed message: https://bugs.python.org/msg333815

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-16 Thread Pablo Galindo Salgado

Pablo Galindo Salgado  added the comment:

The one in PR 11583 is twice as faster:

>timeit for -> 
>topsort([(2,11),(9,11),(9,8),(9,10),(10,11),(10,3),(11,7),(11,5),(8,7),(8,3)])
12.4 µs ± 59.1 ns per loop (mean ± std. dev. of 7 runs, 10 loops each)

>timeit for -> 
>tsort([(2,11),(9,11),(9,8),(9,10),(10,11),(10,3),(11,7),(11,5),(8,7),(8,3)])
29.1 µs ± 147 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)

--
nosy: +pablogsal

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-16 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Thanks. I have some API nits to work to make this more broadly useful.  I'll 
all those on the PR comments soonish :-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-16 Thread Pablo Galindo Salgado


Change by Pablo Galindo Salgado :


--
pull_requests:  -11268

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-16 Thread Pablo Galindo Salgado


Change by Pablo Galindo Salgado :


--
pull_requests:  -11267, 11268

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-16 Thread Alexander Belopolsky


Alexander Belopolsky  added the comment:

Here is an implementation that I've used for years. It is somewhat shorter than 
the one in PR 11583:

class CycleError(LogicalError, ValueError):
"""dependencies cycle detected

"""


def tsort(pairs):
"""topological sort

Just like unix tsort(1)

>>> tsort([(1, 2), (7, 8), (8, 10), (7, 4), (2, 3), (4, 10)])
[1, 7, 2, 8, 4, 3, 10]
>>> try:
... tsort([(1,2), (2,1)])
... except CycleError as e:
... print(e)
([], Counter({1: 1, 2: 1}), {1: [2], 2: [1]})
"""
# inspired by 
http://mail.python.org/pipermail/python-list/1999-July/002831.html
successors = {}
predecessor_counts = collections.Counter()
for x, y in pairs:
successors.setdefault(x, []).append(y)
predecessor_counts.setdefault(x, 0)
predecessor_counts[y] += 1
ordered = [x for x in predecessor_counts
   if predecessor_counts[x] == 0]
for x in ordered:
del predecessor_counts[x]
for y in successors.get(x, ()):
predecessor_counts[y] -= 1
if predecessor_counts[y] == 0:
ordered.append(y)
if predecessor_counts:
raise CycleError(ordered, predecessor_counts, successors)
return ordered

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-16 Thread Pablo Galindo Salgado


Change by Pablo Galindo Salgado :


--
keywords: +patch, patch
pull_requests: +11266, 11267
stage:  -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-16 Thread Pablo Galindo Salgado


Change by Pablo Galindo Salgado :


--
keywords: +patch, patch, patch
pull_requests: +11266, 11267, 11268
stage:  -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2019-01-16 Thread Pablo Galindo Salgado


Change by Pablo Galindo Salgado :


--
keywords: +patch
pull_requests: +11266
stage:  -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2013-06-17 Thread Martin Panter

Changes by Martin Panter vadmium...@gmail.com:


--
nosy: +vadmium

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17005
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2013-05-31 Thread Alexander Belopolsky

Alexander Belopolsky added the comment:

+1

I had tsort in my own utilities library for so long that I thought it was in 
stdlib already.  I found this issue after unsuccessfully searching 
docs.python.org. :-)

I think functools will be a fine place for it.  It is somewhat related to total 
ordering and solves the problem which is common when implementing functional 
mini-languages.

Another possibility is shutil given that tsort is a standard POSIX command, 
http://pubs.opengroup.org/onlinepubs/009695299/utilities/tsort.html, but I 
think this will be too obscure.

--
nosy: +belopolsky

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17005
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2013-01-26 Thread Tshepang Lekhonkhobe

Changes by Tshepang Lekhonkhobe tshep...@gmail.com:


--
nosy: +tshepang

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17005
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2013-01-26 Thread Terry J. Reedy

Changes by Terry J. Reedy tjre...@udel.edu:


--
nosy: +terry.reedy

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17005
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2013-01-20 Thread Raymond Hettinger

New submission from Raymond Hettinger:

I suggest adding a topological sort algorithm to the standard library.

In addition to being a fundamental algorithm, it is immediately useful in 
demonstrating how the MRO computation works and for pure Python implementations 
of MRO logic.   IIRC, the pgen code was also expressed in pure Python for the 
same reason.

I've attached a first-draft of the algorithm and an alternative that only 
implements a topological merge.  This is just an early draft and there are a 
number of open points:

* which module to put it in
* a better implementation may be possible (perhaps using fewer dictionaries and 
sets).

--
files: mro_merge.py
messages: 180319
nosy: rhettinger
priority: low
severity: normal
status: open
title: Add a topological sort algorithm
type: enhancement
versions: Python 3.4
Added file: http://bugs.python.org/file28800/mro_merge.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17005
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2013-01-20 Thread Christian Heimes

Christian Heimes added the comment:

Good idea!

I'm using http://pypi.python.org/pypi/topsort/0.9 for a couple of years. The 
initial code was written by Tim Peters.

--
nosy: +christian.heimes

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17005
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2013-01-20 Thread Raymond Hettinger

Changes by Raymond Hettinger raymond.hettin...@gmail.com:


--
assignee:  - rhettinger

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17005
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17005] Add a topological sort algorithm

2013-01-20 Thread Eric V. Smith

Eric V. Smith added the comment:

+1

I'll note (by inspection only) your example code doesn't work under Python 3.x! 
:)

(print as a statement)

--
nosy: +eric.smith

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17005
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com