On May 16, 2015 11:22 AM, "David Cournapeau" <courn...@gmail.com> wrote: > > > > On Sat, May 16, 2015 at 11:36 PM, Justin Cappos <jcap...@nyu.edu> wrote: >>> >>> I am no expert, but I don't understand why backtracking algorithms would to be faster than SAT, since they both potentially need to walk over the full set of possible solutions. It is hard to reason about the cost because the worst case is in theory growing exponentially in both cases. >> >> >> This is talked about a bit in this thread: https://github.com/pypa/pip/issues/988 >> >> Each algorithm could be computationally more efficient. Basically, *if there are no conflicts* backtracking will certainly win. If there are a huge number of conflicts a SAT solver will certainly win. It's not clear where the tipping point is between the two schemes. >> >> However, a better question is does the computational difference matter? If one is a microsecond faster than the other, I don't think anyone cares. However, from the OPIUM paper (listed off of that thread), it is clear that SAT solver resolution can be slow without optimizations to make them work more like backtracking resolvers. From my experience backtracking resolvers are also slow when the conflict rate is high. > > > Pure SAT is fast enough in practice in my experience (concretely: solving thousand of rules takes < 1 sec). It becomes more complicated as you need to optimize the solution, especially when you have already installed packages. This is unfortunately not as well discussed in the literature. Pseudo-boolean SAT for optimization was argued to be too slow by the 0 install people, but OTOH, this seems to be what's used in conda, so who knows :)
Where optimizing means something like "find a solution with the newest possible releases of the required packages", not execution speed. > If you SAT solver is in pure python, you can choose a "direction" of the search which is more meaningful. I believe this is what 0install does from reading http://0install.net/solver.html, and what we have in our own SAT solver code. I unfortunately cannot look at the 0install code myself as it is under the GPL and am working on a BSD solver implementation. I also do not know how they handle updates and already installed packages. > >> >> This only considers computation cost though. Other factors can become more expensive than computation. For example, SAT solvers need all the rules to consider. So a SAT solution needs to effectively download the full dependency graph before starting. A backtracking dependency resolver can just download packages or dependency information as it considers them. The bandwidth cost for SAT solvers should be higher. > > > With a reasonable representation, I think you can make it small enough. To give an idea, our index @ Enthought containing around 20k packages takes ~340 kb compressed w/ bz2 if you only keep the data required for dependency handling (name, version and runtime dependencies), and that's using json, an inefficient encoding, so I suspect encoding all of pypi may be a few MB only fetch, which is generally faster that doing tens of http requests. > > The libsvolv people worked on a binary representation that may also be worth looking at. > >> >> P.S. If you'd like to talk off list, possibly over Skype, I'd be happy to talk more with you and/or Robert about minutiae that others may not care about. > > > Sure, I would be happy too. As I mentioned before, we have some code around a SAT-based solver, but it is not ready yet, which is why we kept it private (https://github.com/enthought/sat-solver). It handles well (== both speed and quality-wise) the case where nothing is installed, but behaves poorly when packages are already installed, and does not handle the update case yet. The code is also very prototype-ish, but is not too complicated to experimente with it. > > David > > > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG@python.org > https://mail.python.org/mailman/listinfo/distutils-sig >
_______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig