For the first stages of the language learning process, as I understand it, what we really need to do is just
***
1. Split the input into sentences (which can be done with lots of sentence-splitters, including ours)

2. For each sentence S:

-- do some stemming (again, that can be done with lots of stemmers, including our own), so that each word is associated with a stem

-- identify each pair of words (W1, W2) such that W1 occurs before W2 in S

-- create and update some Atoms based on the pair (W1, W2)
***

The Atoms to be created are something like

-- an EvaluationLink indicating a link-parser AnyLink between W1 and W2

-- updates to a couple of counts, based on Linas's recent code

I'm in too much of a hurry to look up the format of these links right now, but Ruiting probably remembers, or I could look it up this afternoon.

That would do it, right?

The current approach replaces the "identify each pair of words" step with "do a random link parse", but it is not clear that doing a random link parse is actually better; my own feeling is that it probably isn't, but we debated this extensively before without resolution....

To replicate the current pipeline better, one would replace

-- identify each pair of words (W1, W2) such that W1 occurs before W2 in S

with

-- do a random link parse, then identify each pair of words (W1, W2) that are linked in the random link parse

Am I missing something?

The above could all be done in C++ perfectly well; it doesn't require Guile because it doesn't require any of the fancy stuff in the current NLP pipeline...

-- Ben

On Tue, Jun 6, 2017 at 11:07 AM, Ben Goertzel <[email protected]> wrote:
> On Tue, Jun 6, 2017 at 3:10 AM, Linas Vepstas <[email protected]> wrote:
>> Re: running LG in the same address space as the atomspace: this has already
>> been done; the surreal code does this. In a day or 2 or 3 you could write
>> the needed wrapper code to have LG live directly inside of opencog,
>> generating the correct atoms, thus totally bypassing guile and garbage
>> collection. And this would be a very easy way to get a 3x speedup, if
>> that's really your end-goal.
>> It's a lot easier than all the other crazy
>> schemes discussed.
>
>
> yeah, we were discussing this yesterday... I think we may do something
> like this... we will discuss again this afternoon...
>
>
>
> --
> Ben Goertzel, PhD
> http://goertzel.org
>
> "I am God! I am nothing, I'm play, I am freedom, I am life. I am the
> boundary, I am the peak." -- Alexander Scriabin
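
P.S. A rough sketch of steps 1 and 2 from the message above, written in Python only for brevity (the same logic would carry over to C++). Everything here is a stand-in: the sentence splitter and stemmer are toy placeholders for real ones, the function names are hypothetical, and a plain counter takes the place of creating and updating Atoms, since the actual link format is not given here.

```python
import re
from collections import Counter
from itertools import combinations


def split_sentences(text):
    """Toy sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def toy_stem(word):
    """Toy stemmer (placeholder for a real one): lowercase, strip a few suffixes."""
    w = word.lower()
    for suffix in ("ing", "ed", "s"):
        if w.endswith(suffix) and len(w) > len(suffix) + 2:
            return w[: -len(suffix)]
    return w


def count_ordered_pairs(text):
    """Count every stem pair (W1, W2) where W1 occurs before W2 in a sentence."""
    pair_counts = Counter()
    for sentence in split_sentences(text):
        stems = [toy_stem(w) for w in re.findall(r"[A-Za-z']+", sentence)]
        # combinations() preserves input order, so W1 always precedes W2 in S.
        for w1, w2 in combinations(stems, 2):
            # Stand-in for "create and update some Atoms based on (W1, W2)".
            pair_counts[(w1, w2)] += 1
    return pair_counts


counts = count_ordered_pairs("The cat sat. The cats sat down!")
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Note that a sentence of n words yields n(n-1)/2 pairs, so long sentences dominate the counts. The random-link-parse variant would swap the combinations() loop for the pairs linked by the parse.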
