Sure, so here is the trace for the program that generates a list of two
quoted values:

'(eval 3 list proper 2 nonempty 1 eval ((1 quote) (2 proper 2 nonempty ((1
eval 1 quote) (2 proper 1)))))

I am using human-readable tags for the functions and numbers for the conde
branches, which I think is the same way you are. You will notice the tree has
3 leaves: one for each eval'd element of the list and one for the final nil.
Because I'm feeding it to a simple rnn right now, I flatten this to a
simple sequence:

"((eval) (eval 3) (list) (proper) (proper 2) (nonempty) (nonempty 1) (eval)
(eval 1) (quote) (nonempty 1) (eval) (eval 1) (quote) (nonempty 1) (eval)
(eval 1) (quote))"

Each inner list should be taken together as an opaque token. I'm also
working in part from that grammar autoencoder paper, although I am
currently building these traces on the fly and without a full reference
grammar, which makes following their precise method somewhat tricky. I may
bite the bullet and extract the grammar as a pre-step before starting the
process, but we'll see.
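In case a concrete encoding step is useful: here is a rough Python sketch (my actual code is in Smalltalk, and these names are made up for illustration) of treating each inner list as an opaque token and mapping the flattened trace to integer ids for the rnn:

```python
import re

def tokenize(trace):
    """Split a flattened trace string into its inner-list tokens,
    treating each parenthesized group as one opaque token."""
    return re.findall(r'\(([^()]+)\)', trace)

def build_vocab(tokens):
    """Assign each distinct token a stable integer id."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab

trace = ("((eval) (eval 3) (list) (proper) (proper 2) (nonempty) "
         "(nonempty 1) (eval) (eval 1) (quote))")
tokens = tokenize(trace)
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]   # integer sequence for the rnn
```

The point is just that "(eval 3)" and "(eval)" are distinct tokens, while every occurrence of "(eval)" maps to the same id.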

I'll be interested to hear if you think up good ways to encode the data,
since I'd like to be able to feed that information to the rnn and use it to
further bias the search. I haven't thought too much about it yet, but one
thing that has occurred to me is that it might be possible to repurpose the
traces generated by the interpreter to encode the input/output data. E.g.,
given input '(1 2) and output '(2), generate the canonical program that
produces this data, such as (cons (list (quote 1) (quote 2)) (list (quote
2))), and run the interpreter on it to generate a trace that can be encoded
by whatever you are using to encode normal program traces. This would still
leave the actual atomic elements (numbers, symbols...) unspecified, but
maybe you could work them in with a vector representation that reserved,
say, 3 bits for a one-hot encoding of the type (number, symbol, boolean)
and the rest for a type-specific encoding (binary encoding for numbers,
one-hot encodings for distinct symbols within a given execution of the
program, etc.). Depending on what one was doing, I'm also interested in
some of the vectorized 'unification' ideas that have gone out to the
mailing list: if symbols are meaningful knowledge within a domain rather
than just opaque gensyms for specifying a scheme function, maybe actually
learning embeddings for the symbols could be part of the execution. Lots of
partial thoughts on this, but nothing too concrete yet.
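To make the hybrid atom encoding slightly more concrete, here's a rough Python sketch; the 3 type bits are as described above, while the 8-bit payload width and the per-execution symbol numbering are placeholders I'm inventing for illustration:

```python
PAYLOAD_BITS = 8  # placeholder width; would be tuned in practice

def encode_atom(atom, symbol_table):
    """Encode an atom as 3 one-hot type bits (number, symbol, boolean)
    followed by a type-specific payload."""
    if isinstance(atom, bool):  # must test bool before int in Python
        tag = [0, 0, 1]
        payload = [1 if atom else 0] + [0] * (PAYLOAD_BITS - 1)
    elif isinstance(atom, int):
        tag = [1, 0, 0]
        payload = [(atom >> i) & 1 for i in range(PAYLOAD_BITS)]  # binary, LSB first
    else:  # symbol: one-hot id, assigned fresh per execution of the program
        tag = [0, 1, 0]
        idx = symbol_table.setdefault(atom, len(symbol_table))
        payload = [1 if i == idx else 0 for i in range(PAYLOAD_BITS)]
    return tag + payload

syms = {}
v5 = encode_atom(5, syms)     # number: binary payload
vx = encode_atom('x', syms)   # first symbol seen in this execution
vx2 = encode_atom('x', syms)  # same symbol, same vector
```

The symbol table is scoped to one program execution, so the one-hot symbol ids are only meaningful within that run, which is the "opaque gensym" case; learned embeddings would replace that branch in the domain-knowledge case.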

I'm intrigued by where you're going with lookup, because that is also
something I'd love to have a good answer for. Ideally it would be nice to
be able to have a large library of functions in the environment without
totally sacrificing our ability to use them. I've been wondering if some of
the finite domain constraint stuff could be repurposed to do a more direct
lookup computation, possibly along the lines you are suggesting, but I
haven't made too much progress in fleshing that out so far.

Cheers,
Evan

On Wed, Jun 14, 2017 at 9:54 AM, Smock Jonathan <[email protected]> wrote:

> Do you have an example trace that you can paste? That's really
> interesting. My trace/path is just a sequence, not a tree (shown at the
> very bottom).
>
> Btw, for those interested, I've been trying to figure out how to take
> trees of data and turn them into a feature vector/embedding. Does anyone
> have good resources for this? The best I can find so far is this, which
> uses both a context-free grammar and an autoencoder to generate molecules:
> https://www.youtube.com/watch?v=ar4Fm1V65Fw. My goal was to take data
> structures and turn them into feature vectors and then use those to make
> feature vectors for functions. Hopefully having feature vectors for
> functions gets us out of the scalability problem for `lookup` - i.e.
> instead of trying all functions, we have an intuition about what kind of
> function we need/want and then do a lookup in all existing functions by
> feature vector for nearest neighbors (or use feature vector to create a
> heuristic through the interpreter).  I think my first step is to try and
> replicate what he's doing in that video though, which will take me some
> time to figure out.
>
> Anyway, here's my substitution store (Clojure):
> {:st/path []}
>
> I only add to the path for disjunctions (btw, I realized yesterday that
> Greg said he only reorders conjunctions in Barliman, which I have to think
> about more).
>
> My implementation very much follows the microKanren approach (b/c that's
> what I understand ...). Here's the simplified version of my disjunction:
>
> (defn path-conj [st direction]
>   (update st :st/path conj direction))
>
> (defn disjunction [& goals]
>   (fn [st]
>     (->> goals
>          (map-indexed (fn [idx g]
>                         (g (path-conj st idx))))
>          vec
>          (apply interleave-all))))
>
> `map-indexed` gives each goal in the disjunction a number/index, and we
> record that in the path as we descend into that branch. So, when we're
> done, our path might look like [1 1 0], meaning we took the 1th branch,
> then the 1th branch of the next disjunction, and finally the 0th branch of
> the last disjunction. I considered labeling conde's or-branches, though,
> so it's neat that you're doing something like that.
>
> My initial goal in adding the path was to be able to stop the computation
> at some arbitrary depth, serialize the substitution store, deserialize it
> somewhere else, and then resume where it left off. After I did that, I
> realized I could simply serialize the path alone, as long as I'm willing
> to do the work of rebuilding the substitution store as I navigate through
> to that path to resume.
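The serialize-just-the-path idea seems right to me. In toy Python form, with plain nested lists standing in for the real tree of disjunctions, the navigation skeleton is just this (the real version would re-run the goals along the way to rebuild the substitution store):

```python
def navigate(tree, path):
    """Follow recorded branch indices back down a tree of choices."""
    node = tree
    for branch in path:
        node = node[branch]
    return node

# Toy tree of nested disjunction branches.
tree = [["a", "b"],
        [["c"], ["d", ["e", "f"]]]]
leaf = navigate(tree, [1, 1, 0])   # the [1 1 0] path from above
```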
>
>
> On Tue, Jun 13, 2017 at 6:37 PM, William Byrd <[email protected]> wrote:
>
>> Thanks Evan!
>>
>> Yes, Sunday's Advanced Hangout would be a great time to talk about
>> this implementation, and for anyone interested in learning for
>> miniKanren to share notes.
>>
>> --Will
>>
>>
>> On Tue, Jun 13, 2017 at 4:02 PM, Apocalypse Mystic
>> <[email protected]> wrote:
>> > Hello,
>> >
>> > Since it seems that there is growing activity around applying machine
>> > learning to minikanren at various levels, Will asked if I would share
>> > the approach I have been taking. To the extent that there is broader
>> > interest in mk machine learning, it probably makes sense for interested
>> > parties to be in touch.
>> >
>> > Background:
>> >
>> > Currently, I have what seems like a complete implementation for at
>> > least one approach to integrating neural nets with mk search. However,
>> > there remain some open questions, such as what to train it on and how
>> > to evaluate it once trained. Moreover, it seems likely that there are
>> > entirely different approaches that may or may not be related to this
>> > one. In any case, this approach should be fairly easy to implement
>> > with minor changes to any standard minikanren implementation, so if
>> > anyone is interested in using any part of it to integrate with an
>> > existing implementation of a given mk program, such as the reference
>> > implementation of Barliman, I can describe it in more detail. This
>> > email will just be a basic overview.
>> >
>> > The goal of this approach is to derive a general way to turn a
>> > minikanren program into a machine learning problem, so that any
>> > application developed in minikanren can also implicitly be trained.
>> > Although I had Barliman/program synthesis in mind as a prime example,
>> > I did not use any Barliman-specific knowledge, so the system should be
>> > applicable to Barliman as well as to other programs. For the sake of
>> > explanation, however, I will describe the particular setup in terms of
>> > how it works in Barliman.
>> >
>> > Overview:
>> >
>> > As a simplification, I decided to set up the problem in three stages:
>> > 1) Barliman would be run like normal, and each returned package would
>> > contain, in addition to the substitution and constraint store, an
>> > "execution trace" representing the choices made in constructing this
>> > particular package. Similar programs should have similar execution
>> > traces, which forms the basis of learning. Given a corpus of programs,
>> > this would take the form of running Barliman once on each fully ground
>> > program, so that we can recover the trace representing that program.
>> > Given no corpus, we can use Barliman to just generate a bunch of
>> > programs and use those for training, although in this case we need to
>> > make sure we have a way to distinguish good and bad programs so that
>> > we can set the rewards appropriately.
>> > 2) Outside of minikanren, a machine learning model (in my case, a
>> > simple RNN) is trained on the program traces, which are tree
>> > structures, although I flatten them to sequences for training.
>> > 3) The model is fed back into subsequent runs of the minikanren
>> > program, and its predictions are used to guide the search.
>> >
>> > The overall effect is that the model represents a bias over the
>> > search space, but does not update live during the search. There are a
>> > number of reasons for this, all of which have to do with my inability
>> > to arrive at a satisfactory interpretation of what success and failure
>> > mean in the middle of a given search. I suspect that, depending on the
>> > problem, a model that updates live during the search would be a valid
>> > approach, and possibly easily reachable from this implementation,
>> > although for the moment I am not pursuing it. I will note, however,
>> > that this "bias" can be a fairly flexible concept per-search. E.g.,
>> > one could imagine a bias over all scheme programs in Barliman that
>> > down-weighted lambdas full of unused arguments and other frequently
>> > vestigial components of synthesized programs, but one could also
>> > imagine a bias tuned to the specific input/output constraints, so that
>> > it weighted cons more highly if the output was a list vs a number, etc.
>> >
>> > Implementation:
>> >
>> > Since I implemented this in Smalltalk, there are probably a few
>> > implementation details that will differ in the lisp/scheme variants. I
>> > will try to give a general sketch of what I think such an
>> > implementation would look like. Roughly, it will require two main
>> > changes to the basic mk implementation:
>> >
>> > 1) Forms that delay the search, like fresh and conde, should have
>> > unique global id's, and add them to the trace of the packages that
>> > they process.
>> >
>> > This may be easier in the scheme/lisp implementations, where you have
>> > a clear macro expansion time and a module boundary that can contain a
>> > global counter, although maybe not. The goal is that each package, as
>> > it passes through the goals, will accumulate a history of all the
>> > goals it has seen. E.g., an answer that represents a variable lookup
>> > on the third variable in the environment might look something like a
>> > less human-readable version of: '(eval-conde-2 lookup-fresh
>> > lookup-fail-conde lookup-fail-conde lookup-success-conde). It is
>> > important that the same symbol be given by the "same" goal within a
>> > recursive step, so taking the second clause of the main eval conde
>> > should always add eval-conde-2 to the trace.
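To make the accumulation concrete, a toy Python sketch (a dict stands in for the package, a goal is just a function from package to package, and the literal ids here are illustrative; the real ids would be assigned at definition/expansion time):

```python
def traced(goal_id, goal):
    """Wrap a goal so every package passing through it records goal_id."""
    def wrapped(pkg):
        extended = dict(pkg, trace=pkg["trace"] + [goal_id])
        return goal(extended)
    return wrapped

succeed = lambda pkg: pkg   # trivial goal that just returns its package

# Nesting wrappers mirrors descending through delaying forms.
g = traced("eval-conde-2", traced("lookup-fresh", succeed))
result = g({"subst": {}, "trace": []})
# result["trace"] is now ["eval-conde-2", "lookup-fresh"]
```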
>> >
>> > One extra potential complication here is the handling of conjunctions
>> > of more than one conde. This may not be necessary for a basic
>> > implementation. If we think of the interpreter as defining an implicit
>> > grammar over the execution traces of the program, e.g.:
>> >
>> > Eval => Quote
>> > Quote => quoted-value
>> > Eval => Lookup
>> > Lookup => lookup-failure Lookup
>> > Lookup => lookup-success
>> >
>> > Then we can recognize that the trace may not, in general, be a list.
>> > For an interpreter with only lookup and quote, it will be, because
>> > these rules are both regular, and so a sequence of trace id's uniquely
>> > defines the "parse" of grammar rules used by the interpreter to arrive
>> > at its final package.
>> >
>> > Once we add in rules like list:
>> >
>> > ProperList => empty-list
>> > ProperList => destructure-list Eval ProperList
>> >
>> > We can see that the "grammar" may potentially be context free. Of
>> > course, in reality, even calling it context free is a modeling
>> > approximation. In some minikanren programs, this will never come up
>> > (either because all the "rules" are regular, or the "cfg" is
>> > unambiguous given a sequential trace anyway), and in some machine
>> > learning models, this may not really matter, if the model is only
>> > using information from the sequence and not from the full hierarchy.
>> > Still, all that is needed to capture the extra information is that
>> > fresh/conde forms that contain multiple condes should create bind
>> > forms that essentially take the sequential traces from each conde and
>> > merge them together up to the end of their common prefix to create a
>> > tree-structured trace. E.g.: '(eval-conde-3 proper-list-fresh
>> > proper-list-nonempty-conde ((eval-conde-2 ...) (proper-list-fresh
>> > ...)))
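The merge-up-to-common-prefix step itself is simple; a rough Python sketch, using made-up trace tags in the spirit of the example above:

```python
def merge_traces(t1, t2):
    """Merge two sequential traces into a tree: shared prefix, then a
    branch node holding the two distinct suffixes."""
    i = 0
    while i < min(len(t1), len(t2)) and t1[i] == t2[i]:
        i += 1
    return t1[:i] + [[t1[i:], t2[i:]]]

a = ["eval-conde-3", "proper-list-fresh", "eval-conde-2", "quote"]
b = ["eval-conde-3", "proper-list-fresh", "proper-list-nonempty-conde"]
merged = merge_traces(a, b)
```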
>> >
>> > 2) The interleaving streams of closures should be reified as structs
>> > so that the order of the search can be controlled by the trained
>> > model.
>> >
>> > This is a straightforward translation of the stream logic into a
>> > reified form of the same stream logic. Every time the package is
>> > extended with a new trace id, the model is run on the current trace
>> > and the package is tagged with a score. When packages are mplused
>> > together, mplus checks which branch has the max score, and interleaves
>> > that branch first, caching the score in the new mplus struct for
>> > further checking higher up the tree. If the scores are the same, it
>> > interleaves like normal, so the initial run should use a dummy model
>> > that just scores everything with a 0. It is also possible to choose
>> > probabilistically, etc., instead of using the max.
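A toy Python sketch of that scoring rule, with eager lists standing in for the lazy streams and a (score, answers) pair standing in for the reified mplus struct:

```python
def mplus(branch_a, branch_b):
    """Interleave two scored branches, giving the higher-scoring branch
    the first slot at each step; ties interleave like normal."""
    (score_a, xs), (score_b, ys) = branch_a, branch_b
    if score_b > score_a:
        xs, ys = ys, xs
    out = []
    for i in range(max(len(xs), len(ys))):
        if i < len(xs):
            out.append(xs[i])
        if i < len(ys):
            out.append(ys[i])
    return out

preferred = mplus((0.2, ["p", "q"]), (0.9, ["r", "s"]))  # "r" comes first
tied = mplus((0.0, ["a"]), (0.0, ["b"]))                 # plain interleave
```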
>> >
>> > I've definitely left out some implementation details that would be
>> > useful for actually making this work, but I think I've said enough
>> > already, so I can go into more detail if anyone wants to hear it. Just
>> > let me know. Also, I think there is a plan to talk about this more in
>> > the advanced hangout on Sunday? Will can correct me if I'm wrong. So
>> > that would be a good time to talk about it if this is interesting to
>> > anyone.
>> >
>> > Currently, I'm refactoring the above code a bit, and then I'm going
>> > to work on figuring out what kind of training makes sense
>> > (corpus-based, self-training via unbound synthesis, etc.), and how to
>> > evaluate a trained search.
>> >
>> > Cheers,
>> > Evan
>> >
>> > --
>> > You received this message because you are subscribed to the Google
>> > Groups "Barliman Editor" group.
>> > To unsubscribe from this group and stop receiving emails from it, send
>> > an email to [email protected].
>> > To post to this group, send email to [email protected].
>> > To view this discussion on the web visit
>> > https://groups.google.com/d/msgid/barliman-editor/CAEE2%2B5AYrW-m2%2BvO6QZ6Uu2EJkyPOtePEXto9g2NSaxL-8g1Zg%40mail.gmail.com.
>> > For more options, visit https://groups.google.com/d/optout.
>>
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"minikanren" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/minikanren.
For more options, visit https://groups.google.com/d/optout.
