[opencog-dev] Re: Language Learning - Progress and Performance

Ben Goertzel Mon, 05 Jun 2017 20:33:07 -0700

For the first stages of the language learning process, as I understand
it, what we really need to do is just



***
1. Split input into sentences (which can be done with lots of
sentence-splitters, including ours)

2. For each sentence S:

-- do some stemming (again that can be done with lots of stemmers,
including our own), so that each word is associated with a stem
-- identify each pair of words (W1, W2) so that W1 occurs before W2 in S
-- create and update some Atoms based on the pair (W1, W2)
***
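
To make the steps between the asterisks concrete, here is a minimal Python sketch. The splitter and stemmer are naive placeholders (any real sentence-splitter or stemmer, including the OpenCog ones, could be swapped in), and the function names `split_sentences`, `toy_stem` and `ordered_pairs` are made up for illustration.

```python
import re
from itertools import combinations


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def toy_stem(word):
    """Placeholder stemmer: lowercase and strip a few common suffixes."""
    w = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Only strip when a reasonably long stem would remain.
        if w.endswith(suffix) and len(w) > len(suffix) + 2:
            return w[: -len(suffix)]
    return w


def ordered_pairs(sentence):
    """Return every pair (W1, W2) of stems with W1 occurring before W2."""
    words = re.findall(r"[A-Za-z']+", sentence)
    stems = [toy_stem(w) for w in words]
    # combinations() keeps the input order, so W1 always precedes W2.
    return list(combinations(stems, 2))


for sentence in split_sentences("The cat sat. Dogs barked loudly!"):
    for w1, w2 in ordered_pairs(sentence):
        pass  # this is where the Atoms for (W1, W2) would be created/updated
```

A sentence of n words yields n(n-1)/2 pairs, so the cost per sentence is quadratic in its length.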

The Atoms to be created are something like

-- an EvaluationLink indicating a link-parser AnyLink between W1 and W2
-- updates to a couple of counts, based on Linas's recent code

I'm in too much of a hurry to look up the format of these links right
now, but Ruiting probably remembers, or I could look it up this
afternoon.
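
Since the exact Atom format is not given here, the following Python sketch uses plain in-memory counters as a stand-in for the AtomSpace. Which counts Linas's code actually keeps is not stated above; the sketch assumes a count per pair plus left-word and right-word totals, and the class name `PairCounts` is hypothetical.

```python
from collections import Counter


class PairCounts:
    """Stand-in for the Atoms: a count per (W1, W2) pair plus marginals.

    Assumption: in the real pipeline each key would be an Atom (e.g. the
    EvaluationLink for the pair) and the number would live on that Atom.
    """

    def __init__(self):
        self.pair = Counter()   # times (W1, W2) was observed
        self.left = Counter()   # times W1 appeared as the left word
        self.right = Counter()  # times W2 appeared as the right word
        self.total = 0          # total number of pair observations

    def observe(self, w1, w2):
        """Create the entries if needed and increment all counts."""
        self.pair[(w1, w2)] += 1
        self.left[w1] += 1
        self.right[w2] += 1
        self.total += 1


counts = PairCounts()
for w1, w2 in [("the", "cat"), ("the", "sat"), ("cat", "sat")]:
    counts.observe(w1, w2)
```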

That would do it, right?

The current approach replaces the "identify each pair of words" step
with "do a random link parse", but it is not clear that doing a random
link parse is actually better; my own feeling is that it probably
isn't, but we debated this extensively before without resolution....
To replicate the current pipeline better one would replace

-- identify each pair of words (W1, W2) so that W1 occurs before W2 in S

with

-- do a random link parse, then identify each pair of words (W1,W2)
that are linked in the random link parse
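
As a rough illustration of that variant, the Python sketch below draws a random tree of non-crossing links over the word positions of a sentence. This is a simplified stand-in, not the link-parser's own random parse: the function name `random_planar_links` is made up, and the sampling is not claimed to match what the link-parser does.

```python
import random


def random_planar_links(n, rng=None):
    """Return n-1 non-crossing links (i, j), i < j, over positions 0..n-1.

    Recursive scheme: pick a random head in the span, build a sub-tree on
    each side of it, and link each side's head to the chosen head.
    """
    if rng is None:
        rng = random.Random()
    links = []

    def build(lo, hi):
        # Link up positions lo..hi (inclusive) and return the chosen head.
        head = rng.randint(lo, hi)
        if lo < head:
            left_head = build(lo, head - 1)
            links.append((left_head, head))
        if head < hi:
            right_head = build(head + 1, hi)
            links.append((head, right_head))
        return head

    if n > 0:
        build(0, n - 1)
    return sorted(links)


words = ["the", "cat", "sat", "on", "the", "mat"]
linked_pairs = [(words[i], words[j]) for i, j in random_planar_links(len(words))]
```

This gives n-1 pairs per sentence instead of the n(n-1)/2 pairs produced by taking every ordered pair.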

...

Am I missing something?

The above could all be done in C++ perfectly well; it doesn't require
Guile because it doesn't require any of the fancy stuff in the current
NLP pipeline...

-- Ben






On Tue, Jun 6, 2017 at 11:07 AM, Ben Goertzel <[email protected]> wrote:
> On Tue, Jun 6, 2017 at 3:10 AM, Linas Vepstas <[email protected]> wrote:
>> Re: running LG in the same address space as the atomspace: this has already
>> been done; the surreal code does this. In a day or 2 or 3 you could write
>> the needed wrapper code to have LG live directly inside of opencog,
>> generating the correct atoms, thus totally bypassing guile and garbage
>> collection.  And this would be a very easy way to get a 3x speedup, if
>> that's really your end-goal.  It's a lot easier than all the other crazy
>> schemes discussed.
>
>
> yeah, we were discussing this yesterday... I think we may do something
> like this... we will discuss again this afternoon...
>
>
>
> --
> Ben Goertzel, PhD
> http://goertzel.org
>
> "I am God! I am nothing, I'm play, I am freedom, I am life. I am the
> boundary, I am the peak." -- Alexander Scriabin


