Thanks, thinking of it as a map-reduce is helpful. However, the graphs are small (<10,000 nodes), so I expect that running serially on a PC would be fast enough.

I am still interested in feedback on the modules, partly because I am in an "exploratory" mode and don't know exactly what I will need later. E.g. Graph::Simple can create SVG and other "pictures" of a graph. Graph supports weighted edges, which is relevant. Both modules 1 & 2 can detect a cycle, and that seems like a good sanity check. If one is detected it would mean there is a problem in the data, or more likely in my code, or in the module itself (which is the bug I cited).

Steve

-----Original Message-----
From: Ben Tilly [mailto:[email protected]]
Sent: Tuesday, June 25, 2013 8:16 PM
To: Steve Tolkin
Cc: Boston Perl Mongers
Subject: Re: [Boston.pm] directed Graph modules in perl and CPAN

For a one-off task like this, I would assume that it would take more work to find and evaluate a module that solves my problem than it would take to roll my own solution.

Heck, in this case you can arrange it as a map-reduce. Your initial map takes each file, and spits out key/value pairs where the key is the meaningful identifier of a node, and the value is the meaningful identifier of a predecessor. Your reduce takes the key and dedupes the values to get a node and its unique predecessors. With Hadoop you can now distribute this calculation across multiple machines and parallelize the work.

On Tue, Jun 25, 2013 at 4:47 PM, Steve Tolkin <[email protected]> wrote:
> Summary: I went looking for a CPAN graph module so I could merge multiple
> directed graphs. I found two that looked good, by "famous" Perl authors.
> Unfortunately both have "issues".
>
> 1. Graph::Easy looks good, but it has not changed in years and has a bug
> list
> https://rt.cpan.org/Public/Dist/Display.html?Status=Active;Name=Graph-Easy
> with a bunch of Open and Important bugs. See also
> http://search.cpan.org/~shlomif/Graph-Easy-0.73/lib/Graph/Easy.pm and
> http://bloodgate.com/perl/graph/index.html
> This was originally by tels and is now maintained by Shlomi Fish.
>
> 2. Graph now says: <q>
> UNSUPPORTED
> Unfortunately, as of release 0.95, this module is unsupported, and will no
> more be maintained. Sorry about that. </q>
> Its bug list at https://rt.cpan.org/Public/Dist/Display.html?Name=Graph is
> short but it includes this Important one: "find_a_cycle and has_cycle are
> broken" https://rt.cpan.org/Public/Bug/Display.html?id=78465
> See also http://search.cpan.org/~jhi/Graph-0.96/lib/Graph.pod
> This is by Jarkko Hietaniemi.
>
> 3. Graph::Simple is just v0.03
>
> Are there other good modules?
>
> A summary of what I MIGHT want to do:
> Merge separate directed graphs into one, by combining equivalent nodes and
> creating the union of their predecessor sets.
>
> In more detail: There are several existing directed graphs, each in its own
> file. Sometimes a node in one file is equivalent to a node in another file.
> Nodes have associated attributes. The "meaningful" identifier for a node is
> a three part key. However, in each file each node is assigned an arbitrary
> integer ID starting with 1, so the same integers appear in many files,
> referring to different nodes. In each file a node's predecessors are
> identified just by a set of those integers.
>
> --
> Thanks,
> Steve Tolkin
>
> _______________________________________________
> Boston-pm mailing list
> [email protected]
> http://mail.pm.org/mailman/listinfo/boston-pm
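
The map and reduce steps Ben describes, run serially as Steve suggests, might look roughly like the Python sketch below. The thread does not give the file format, so the in-memory layout (local integer ID mapped to a three-part key and a set of predecessor IDs), the example keys, and the function names map_file, reduce_pairs, merge_graphs and has_cycle are all assumptions made for illustration. The cycle check is a hand-rolled Kahn's algorithm, not a call into either CPAN module.

```python
from collections import defaultdict

# ASSUMED layout of one parsed input file (the thread does not specify it):
#   local integer ID -> (three-part key, set of predecessor local IDs)


def map_file(nodes):
    """Map step: yield (node key, predecessor key) pairs for one file."""
    for key, pred_ids in nodes.values():
        if not pred_ids:
            yield key, None  # keep nodes that have no predecessors
        for pred_id in pred_ids:
            # Translate the file-local integer into the meaningful key.
            yield key, nodes[pred_id][0]


def reduce_pairs(pairs):
    """Reduce step: collect the unique predecessors of each node key."""
    merged = defaultdict(set)
    for key, pred_key in pairs:
        preds = merged[key]  # creates the entry even with no predecessors
        if pred_key is not None:
            preds.add(pred_key)  # the set dedupes repeated predecessors
    return dict(merged)


def merge_graphs(files):
    """Run the map and reduce steps serially over every file."""
    return reduce_pairs(pair for nodes in files for pair in map_file(nodes))


def has_cycle(graph):
    """Sanity check via Kahn's algorithm on a {node: set(preds)} graph."""
    remaining = {node: len(preds) for node, preds in graph.items()}
    successors = defaultdict(list)
    for node, preds in graph.items():
        for pred in preds:
            successors[pred].append(node)
    ready = [node for node, count in remaining.items() if count == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for succ in successors[node]:
            remaining[succ] -= 1
            if remaining[succ] == 0:
                ready.append(succ)
    # Any node never removed sits on, or downstream of, a cycle.
    return removed != len(graph)


# Hypothetical example data: local ID 1 means a different node in each file.
file_a = {
    1: (("db", "sales", "orders"), set()),
    2: (("db", "sales", "report"), {1}),
}
file_b = {
    1: (("db", "sales", "report"), {2}),
    2: (("db", "crm", "customers"), set()),
}

merged = merge_graphs([file_a, file_b])
```

Because the reduce groups on the three-part key rather than on the per-file integer, the "report" node from both files collapses into one entry whose predecessor set is the union of the two. The same shape should carry over to a Perl hash of hashes keyed on the joined three-part identifier.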

