Thanks Evan! Yes, Sunday's Advanced Hangout would be a great time to talk about this implementation, and for anyone interested in machine learning for miniKanren to share notes.
--Will

On Tue, Jun 13, 2017 at 4:02 PM, Apocalypse Mystic <[email protected]> wrote:
> Hello,
>
> Since it seems that there is growing activity around applying machine
> learning to minikanren at various levels, Will asked if I would share the
> approach I have been taking. To the extent that there is broader interest
> in mk machine learning, it probably makes sense for interested parties to
> be in touch.
>
> Background:
>
> Currently, I have what seems like a complete implementation of at least
> one approach to integrating neural nets with mk search. However, some
> open questions remain, such as what to train it on and how to evaluate it
> once trained. Moreover, there are likely entirely different approaches
> that may or may not be related to this one. In any case, this approach
> should be fairly easy to implement with minor changes to any standard
> minikanren implementation, so if anyone is interested in integrating any
> part of it with an existing mk program, such as the reference
> implementation of Barliman, I can describe it in more detail. This email
> is just a basic overview.
>
> The goal of this approach is to derive a general way to turn a minikanren
> program into a machine learning problem, so that any application
> developed in minikanren can also implicitly be trained. Although I had
> Barliman/program synthesis in mind as a prime example, I did not use any
> Barliman-specific knowledge, so that the system would be applicable to
> Barliman as well as to other programs. For the sake of explanation,
> however, I will describe the particular setup in terms of how it works in
> Barliman.
>
> Overview:
>
> As a simplification, I decided to set up the problem in three stages:
>
> 1) Barliman would be run as normal, and each returned package would
> contain, in addition to the substitution and constraint store, an
> "execution trace" representing the choices made in constructing that
> particular package. Similar programs should have similar execution
> traces, which forms the basis of learning. Given a corpus of programs,
> this takes the form of running Barliman once on each fully ground
> program, so that we can recover the trace representing that program.
> Given no corpus, we can use Barliman to just generate a batch of programs
> and use those for training, although in that case we need to make sure we
> have a way to distinguish good programs from bad ones so that we can set
> the rewards appropriately.
>
> 2) Outside of minikanren, a machine learning model (in my case, a simple
> RNN) is trained on the program traces. The traces are tree structures,
> although I flatten them to sequences for training.
>
> 3) The model is fed back into subsequent runs of the minikanren program,
> and its predictions are used to guide the search.
>
> The overall effect is that the model represents a bias over the search
> space, but does not update live during the search. There are a number of
> reasons for this, all of which come down to my inability to arrive at a
> satisfactory interpretation of what success and failure mean in the
> middle of a given search. I suspect that, depending on the problem, a
> model that updates live during the search would be a valid approach, and
> possibly easily reachable from this implementation, although for the
> moment I am not pursuing it. I will note, however, that this "bias" can
> be a fairly flexible concept per search.
> For example, one could imagine a bias over all Scheme programs in
> Barliman that down-weighted lambdas full of unused arguments and other
> frequently vestigial components of synthesized programs, but one could
> also imagine a bias tuned to the specific input/output constraints, so
> that it weighted cons more highly if the output was a list rather than a
> number, etc.
>
> Implementation:
>
> Since I implemented this in Smalltalk, a few implementation details will
> probably differ in the Lisp/Scheme variants. I will try to give a general
> sketch of what I think such an implementation would look like. Roughly,
> it requires two main changes to the basic mk implementation:
>
> 1) Forms that delay the search, like fresh and conde, should have unique
> global ids, and should add them to the trace of the packages they
> process.
>
> This may be easier in the Scheme/Lisp implementations, where you have a
> clear macro-expansion time and a module boundary that can contain a
> global counter, although maybe not. The goal is that each package, as it
> passes through the goals, accumulates a history of all the goals it has
> seen. For example, an answer that represents a variable lookup on the
> third variable in the environment might look something like a less
> human-readable version of:
>
>   '(eval-conde-2 lookup-fresh lookup-fail-conde lookup-fail-conde
>     lookup-success-conde)
>
> It is important that the same symbol be given by the "same" goal within a
> recursive step, so taking the second clause of the main eval conde should
> always add eval-conde-2 to the trace.
>
> One extra potential complication here is the handling of conjunctions of
> more than one conde. This may not be necessary for a basic
> implementation.
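The trace-accumulation idea in (1) can be sketched outside of Scheme or Smalltalk. Below is a minimal, hypothetical microKanren-style toy in Python: packages carry a tuple of goal ids, and a `traced` wrapper tags every package flowing through a goal. All names (`Package`, `traced`, `lookup-conde-1`, etc.) are illustrative, not from the email, and real interleaving streams and fresh-variable handling are omitted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    n: int

@dataclass(frozen=True)
class Package:
    subst: tuple      # immutable assoc "list" of (Var, value) pairs
    trace: tuple = () # goal ids this package has passed through

def walk(u, s):
    d = dict(s)
    while isinstance(u, Var) and u in d:
        u = d[u]
    return u

def unify(u, v, s):
    u, v = walk(u, s), walk(v, s)
    if u == v:
        return s
    if isinstance(u, Var):
        return s + ((u, v),)
    if isinstance(v, Var):
        return s + ((v, u),)
    if isinstance(u, tuple) and isinstance(v, tuple) and len(u) == len(v):
        for a, b in zip(u, v):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

def eq(u, v):
    def goal(pkg):
        s = unify(u, v, pkg.subst)
        return [Package(s, pkg.trace)] if s is not None else []
    return goal

def traced(goal_id, goal):
    # Tag every package processed by `goal` with goal_id, so answers
    # accumulate a history of the branches taken.
    def wrapped(pkg):
        return goal(Package(pkg.subst, pkg.trace + (goal_id,)))
    return wrapped

def disj(*goals):
    # Plain list-append disjunction; the real thing would interleave.
    def goal(pkg):
        out = []
        for g in goals:
            out.extend(g(pkg))
        return out
    return goal

# Each conde branch gets a unique id, mirroring "the same symbol for
# the same goal within a recursive step":
x = Var(0)
g = disj(traced("lookup-conde-1", eq(x, "a")),
         traced("lookup-conde-2", eq(x, "b")))
for p in g(Package(())):
    print(walk(x, p.subst), p.trace)
```

Here each answer comes back with the branch id that produced it, which is the per-package history the email describes, just in miniature.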
> If we think of the interpreter as defining an implicit grammar over the
> execution traces of the program, e.g.:
>
>   Eval => Quote
>   Quote => quoted-value
>   Eval => Lookup
>   Lookup => lookup-failure Lookup
>   Lookup => lookup-success
>
> then we can recognize that the trace may not, in general, be a list. For
> an interpreter with only lookup and quote it will be, because these rules
> are all regular, and so a sequence of trace ids uniquely defines the
> "parse" of grammar rules used by the interpreter to arrive at its final
> package.
>
> Once we add in rules like list:
>
>   ProperList => empty-list
>   ProperList => destructure-list Eval ProperList
>
> we can see that the "grammar" may potentially be context-free. Of course,
> in reality, even calling it context-free is a modeling approximation. In
> some minikanren programs this will never come up (either because all the
> "rules" are regular, or because the "CFG" is unambiguous given a
> sequential trace anyway), and for some machine learning models it may not
> really matter, if the model only uses information from the sequence and
> not from the full hierarchy. Still, all that is needed to capture the
> extra information is that fresh/conde forms containing multiple condes
> should create bind forms that take the sequential traces from each conde
> and merge them together up to the end of their common prefix, creating a
> tree-structured trace. E.g.:
>
>   '(eval-conde-3 proper-list-fresh proper-list-nonempty-conde
>     ((eval-conde-2 ...) (proper-list-fresh ...)))
>
> 2) The interleaving streams of closures should be reified as structs so
> that the order of the search can be controlled by the trained model.
>
> This is a straightforward translation of the stream logic into a reified
> form of the same stream logic. Every time the package is extended with a
> new trace id, the model is run on the current trace and the package is
> tagged with a score.
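The score-guided search in (2) can also be sketched in miniature. The hypothetical Python below reifies streams as explicit `Leaf`/`MPlus` nodes and, at each merge point, interleaves the higher-scoring branch first; the scores stand in for the trained model's predictions on each package's trace, and all names are illustrative rather than taken from the email's implementation.

```python
from dataclasses import dataclass
from itertools import zip_longest

@dataclass
class Leaf:
    answers: list  # (value, score) pairs; score = model's prediction

    @property
    def score(self):
        return max((s for _, s in self.answers), default=0.0)

@dataclass
class MPlus:
    left: object
    right: object

    @property
    def score(self):
        # The email caches this in the struct; recomputing keeps the
        # sketch short.
        return max(self.left.score, self.right.score)

def stream(node):
    """Enumerate answers, starting each interleave from the branch with
    the higher score. Equal scores fall through to ordinary left-first
    interleaving, so a dummy all-zero model gives the default search."""
    if isinstance(node, Leaf):
        yield from node.answers
        return
    a, b = node.left, node.right
    if b.score > a.score:
        a, b = b, a
    for pair in zip_longest(stream(a), stream(b)):
        for item in pair:
            if item is not None:
                yield item

# A toy search tree where the model favors one branch:
tree = MPlus(Leaf([("cdr-branch", 0.2)]),
             MPlus(Leaf([("cons-branch", 0.9)]),
                   Leaf([("quote-branch", 0.1)])))
print([v for v, _ in stream(tree)])  # highest-scoring branch leads
```

Choosing the max is only one policy; as the email notes, sampling branches in proportion to their scores would drop into the same `stream` function.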
> When packages are mplused together, mplus checks which branch has the
> max score and interleaves that branch first, caching the score in the
> new mplus struct for further checks higher up the tree. If the scores
> are equal, it interleaves as normal, so the initial run should use a
> dummy model that just scores everything as 0. It is also possible to
> choose probabilistically, etc., instead of taking the max.
>
> I've definitely left out some implementation details that would be
> useful for actually making this work, but I think I've said enough
> already, so I can go into more detail if anyone wants to hear it. Just
> let me know. Also, I think there is a plan to talk about this more in
> the advanced hangout on Sunday? Will can correct me if I'm wrong. That
> would be a good time to talk about it if this is interesting to anyone.
>
> Currently, I'm refactoring the above code a bit, and then I'm going to
> work on figuring out what kind of training makes sense (corpus-based,
> self-training via unbound synthesis, etc.) and how to evaluate a
> trained search.
>
> Cheers,
> Evan
>
> --
> You received this message because you are subscribed to the Google
> Groups "Barliman Editor" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to [email protected].
> To post to this group, send email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/barliman-editor/CAEE2%2B5AYrW-m2%2BvO6QZ6Uu2EJkyPOtePEXto9g2NSaxL-8g1Zg%40mail.gmail.com.
> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups
"minikanren" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/minikanren.
For more options, visit https://groups.google.com/d/optout.
