2016-07-25 23:26 GMT+02:00 Ludovic Courtès <l...@gnu.org>:

> Hello!
>
> Jelle Licht <jli...@fsfe.org> skribis:
>
> > On Ludo's advice, I snarfed Ricardo's recursive importer and bolted it
> > on my npm importer. After leaving the importer running for a quite
> > some hours (and making it more robust in the face of inconsistent npm
> > information), it turns out that jQuery has a direct or indirect
> > dependcy on about everything. We are talking pretty much all of the
> > build systems, all of the testing frameworks and all of the test
> > runners. Literally thousands of packages, and multiple (conflicting)
> > versions of most.
>
> I’m really impressed that your importer can already grovel this much!
> In itself, that’s already a significant achievement, despite the
> frustration of not getting “guix package -i jquery” right away.
>
> Do you have figures on the number of vertices and edges on this graph?
> Could it be that the recursive importer keeps revisiting the same nodes
> over and over again?  :-)
>



> I would suggest publishing the code somewhere so others can try to
> import their favorite JS library and give feedback.
>
>
I'd like to indicate that in the Guix code base there's a function visiting
the graph of the dependencies of a package with the concern of covering it
all AND of not considering the same node more than one time

It's in guix/packages.scm on line 552, it's called "transitive-inputs" and
takes as an arguments the inputs of a single package.

It could be used as a blueprint for such task of dependencies graphs
covering.

The main observation that I can do is that both versions do check whether a
package being considered is already been seen in a previous passage.

So, it doesn't seem to me that Jelle's version could revisit the same
package (node) over and over

Also, the "official" version clearly distinguishes between the current
depth level in the graph and the next one, using 2 different variables for
containing the packages (nodes) at the current depth and those in the next
level ("inputs" and "propagated") as it does a breadth first visit.

Instead the Jelle's version uses the same variable for the current AND the
next level and it's a list and it conses the nodes for the next level on
the current list

So it seems to me that it does a depth first visit (I might be wrong, I
didn't do a complete trace)

If anything, the "official" version uses a VHash to check if a package has
already been "seen" while Jelle's uses a plain list (again)

So the cost of this check is constant in the official version and linear in
Jelle's version.

Further, in Jelle's version, every package gets checked against the list of
already imported packages (as a plain list) AND against the packages
already present in the store.

Both checks at every iteration. It seems to me that there's space for
improving here ;-)

In fact, in Jelle's guix/import/npm.scm, on line 462 the "and" could be
switched to an "or". It would work anyway and it could save a handful of
checks on a complete run of the algorithm

Anyway, I tried to "import" a random package and it has a ton of
dependencies.

It seems to me that a more systematic approach (like that of a real data
scientist ;-) ) could help here

The whole graph should be imported in some database and then graph
algorithms should be run on it

For example: which are the packages with less or no dependencies (and a lot
of dependants) ?
Because those should be imported first, in my opinion.

Jelle came to the notion that testing frameworks are a dependencies for
almost all the packages but as far as I understand that's a quite empirical
knowledge.


> > I am currently hovering between version 0.6 and 0.7, which can
> > properly recompile itself and slightly more contemporary
> > version. Getting to version 1.6 from June 2013 should be doable using
> > this exact same approach.  This will allow us to package a 2014
> > version of the Mocha testing framework.
>
> Impressive.
>

Indeed. Impressive.

Ok, so these are the first impressions that I came up with. I'd like my
observations on the algorithms to be vetted by more experienced people.

Just to help me verify if I can actually understand someone's else scheme
code

Thanks

Reply via email to