Re: list of unique non-subset sets

bearophileHUGS Fri, 18 Mar 2005 08:22:50 -0800

Looking at all the hyperedges in the connected component is a big
waste... You can look at just the hyperedges that share one or more
nodes.
(Nodes are the original letters contained in the sets, and they must be
hashable).


If nodes aren't integers in [0, len(l)) then you can use this simpler
code:

. l = [set(['a','b','c']), set(['a','c']), set(['a','d','e','1']),
.      set(['r','k','l']), set(['b','e','1']), set(['a','c']) ]
. from graph import Graph
. # http://www.fantascienza.net/animalia/graph.zip
. # you can something similar with Boost
.
. debug = True
. g = Graph()
. for n1,s in enumerate(l):
.     g.addNode(n1, nodeData=1)
.     for n2 in s:
.         g.addBiArc(n1, n2)
. if debug: print g # ****
.
. result = []
. for i,s in enumerate(l):
.     ss = set()
.     for n1 in s:
.         ss.update(g.xoutNodes(n1))
.     ss.remove(i)
.     if debug: print i, ss # ****
.
.     for sj in ss:
.         if s <= l[sj]:
.             break
.     else:
.         result.append(s)
. print result



If nodes are hashable, but they can cointain integers in [0, len(l)),
then you can use this other code with nodeID translations (in a
different graph implementation, like Gato one, such translations can be
automatic):


. l = [set(['a','b','c']), set(['a','c']), set(['a','d','e','1']),
.      set(['r','k','l']), set(['b','e','1']), set(['a','c']) ]
. from graph import Graph
.
. debug = True
. nodes = set()
. for s in l:
.     nodes.update(s)
. lenl = len(l)
. nodes = dict( (i+lenl, n) for i,n in enumerate(nodes) )
. if debug: print "nodes:", nodes # ****
. nodesid = dict( (n,i) for i,n in nodes.iteritems() )
. l2 = [ set( nodesid[n] for n in s ) for s in l ]
. if debug: print "l2:", l2 # ****
.
. g = Graph()
. for n1,s in enumerate(l2):
.     g.addNode(n1)
.     for n2 in s:
.         g.addBiArc(n1, n2)
. if debug: print g # ****
.
. result = []
. for i,s in enumerate(l2):
.     ss = set()
.     for n1 in s:
.         ss.update(g.xoutNodes(n1))
.     ss.remove(i)
.     if debug: print "i, ss:", i, ss # ****
.
.     for sj in ss:
.         if s <= l2[sj]:
.             break
.     else:
.         result.append(s)
. if debug: print "result:", result # ****
. result2 = [ set( nodes[n] for n in s ) for s in result ]
. print result2

If the hyperedges define a sparse hypergraph, then this code can be
quite faster than the original quadratic one (if len(l) is big enough).

Bye,
Bearophile

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: list of unique non-subset sets

Reply via email to