On Thu, Jun 1, 2017 at 11:27 PM, Curtis Faith <[email protected]> wrote:
> One further thing, with respect to the sql-fetch and store: one relatively
> simple change might be a new explicit call to save and fetch only a single
> atom's values, i.e. not recursively save atoms and all attached values.
> Something like: sql-store-values and sql-fetch-values and
> sql-store-one-atom and sql-fetch-one-atom, perhaps.

I think you are misreading the code, because that is not what it does. So:

1) The first time ever, it MUST fetch/store the entire atom, because you
can't put it in the atomspace otherwise; or, in the other direction, you
don't know what the UUIDs are, and thus cannot access it.

2) After this "first time ever" step, only the values are updated. They are
updated only for the current atom, and not for any other. If other atoms
were updated, this would corrupt existing truth values in the atomspace.
That would be very wrong, since atomspace users should have the right to
assume that no one is messing with the values attached to their atoms.
Users have the right to total control over values, and so the SQL backend
is careful to NOT update values behind their backs! If it does, that's a
bug.

--linas

> Then we could loop over newly created atoms at the end of the process and
> save them, while saving only the final values for atoms that have already
> been persisted and whose values have been updated. This could be done in
> C++ as an AtomSpace call, once at the end of processing a chunk of text,
> so there would be some ability to restart long processes in the event of
> errors, but the overhead would be greatly reduced while running large
> batches.
>
> When starting the CogServer, there could be a mark on the AtomSpace that
> gets cleared when sql-load is called once, but this wouldn't have to be
> done more than once per initialisation of an instance of the CogServer.
> Subsequent SQL need only be writes batched together at the end, as files
> get processed by observe-text or other similar steps in the pipeline.
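[Editor's note: the two-phase behavior described above can be modeled in a
few lines. This is an illustrative sketch only, not the actual C++ SQL
backend; every name in it (SketchBackend, store_atom, and so on) is
invented for the example.]

```python
# Illustrative model of the backend behavior described above: the first
# store of an atom writes the whole atom; every later store touches only
# that one atom's values, and no other atom is affected.

class SketchBackend:
    def __init__(self):
        self.atoms = {}          # uuid -> atom structure, written once
        self.values = {}         # uuid -> values, rewritten on each store
        self.full_stores = 0
        self.value_updates = 0

    def store_atom(self, uuid, atom, values):
        if uuid not in self.atoms:
            # "First time ever": the entire atom must be written, since
            # otherwise there is no row (and no UUID) to hang values on.
            self.atoms[uuid] = atom
            self.full_stores += 1
        else:
            # Every subsequent store updates values only.
            self.value_updates += 1
        # Only THIS atom's values are updated; no other atom is touched.
        self.values[uuid] = values

backend = SketchBackend()
backend.store_atom(42, ("WordNode", "test"), {"count": 1})
backend.store_atom(42, ("WordNode", "test"), {"count": 2})
backend.store_atom(42, ("WordNode", "test"), {"count": 3})
print(backend.full_stores, backend.value_updates)  # -> 1 2
```

The point of the model is the invariant: full stores happen once per atom,
while repeated stores of the same atom only rewrite its values.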
> On Fri, Jun 2, 2017 at 12:09 PM, Curtis Faith <[email protected]>
> wrote:
>
>> So my intuition about the Guile GC was right after all.
>>
>> I came into the office this morning, after leaving last night with a
>> test running on the 6,000 sentence file (the one that took two days
>> last week and started me on this path, before I began working on AWS).
>>
>> For the new test last night, I reintroduced the (gc) call in
>> observe-text but did not put back the fetch-atom and store-atom calls.
>>
>> I figured it would either finish quickly last night (if the SQL issue
>> was the only major problem) or run into the same problem I encountered
>> when I ran it last week (if something else, like the GC, was causing
>> the slowdown). Either way, I would be able to separate the GC
>> performance issues I suspected from the SQL performance issues I had
>> already noted.
>>
>> When I arrived, the cogserver was at 100% CPU but was still running,
>> and was processing sentence #590. I suspected the Guile GC, since one
>> process was pegged at 100% on a 6-CPU machine where the cogserver
>> usually runs at 1150%.
>> This matches the profile of one thread blocking all the others, which
>> seemed like the GC. I fired up gdb, attached to the cogserver, and did
>> a backtrace:
>>
>> (gdb) bt
>> #0  0x00007f562938cfb2 in do_sigsuspend (set=0x7f56244ea2c0)
>>     at ../sysdeps/unix/sysv/linux/sigsuspend.c:31
>> #1  __GI___sigsuspend (set=0x7f56244ea2c0)
>>     at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
>> #2  0x00007f56242ce832 in GC_suspend_handler_inner ()
>>     from /usr/lib/x86_64-linux-gnu/libgc.so.1
>> #3  0x00007f56242ce8a5 in GC_suspend_handler ()
>>     from /usr/lib/x86_64-linux-gnu/libgc.so.1
>> #4  <signal handler called>
>> #5  0x00007f562941a93d in nanosleep ()
>>     at ../sysdeps/unix/syscall-template.S:81
>> #6  0x00007f562944bbc4 in usleep (useconds=<optimized out>)
>>     at ../sysdeps/unix/sysv/linux/usleep.c:32
>> #7  0x00007f562a2d7fb7 in opencog::CogServer::serverLoop (this=0x19727f0)
>>     at /home/inflector/opencog/opencog/opencog/cogserver/server/CogServer.cc:211
>> #8  0x0000000000406b05 in main (argc=3, argv=0x7ffec4633ae8)
>>     at /home/inflector/opencog/opencog/opencog/cogserver/server/CogServerMain.cc:232
>> (gdb)
>>
>> This is the code that the garbage collector invokes to suspend threads
>> while it runs. I've been digging into the collector's code and have
>> found several interactions that would produce what I've been seeing.
>> Essentially, as the number of objects in the cogserver heap grows, the
>> conservative GC has to search a larger and larger space, until
>> eventually the search takes a single core's entire processing capacity.
>> As the search space grows, this blocking begins to dominate all
>> execution.
>>
>> You can see this in my notes in the attached observe-text.log file,
>> where I compare GC time with total time: it is 320 seconds out of 463
>> seconds for my 5-chapter test file, i.e. 69% of the time spent
>> processing only 303 sentences.
>> And this percentage grows with successive processing of the same file.
>> These successive runs create no new atoms, yet they show a leak of
>> 0.130 GB per execution with the (gc) call in observe-text, and 0.400 GB
>> per processing of the 5 chapters without the (gc) call inside
>> observe-text.
>>
>> Currently the job has been sitting on one single sentence for 12 hours
>> while the garbage collector runs repeatedly, attempting to complete its
>> job. Since I encountered this last week, let it run over the weekend,
>> and found it finished when I came back on Monday, I suspect the GC is
>> still churning away under repeated invocation, and at some point it
>> will finish enough work to process the sentence. Further, that
>> completion will trigger another internal heap expansion, which the
>> collector always makes at 200% of the previous size; this will cause
>> the GC to be invoked much less often for a while, making the rest of
>> the sentences process much faster. But at some heap size the GC stops
>> keeping up, as the processing time required dominates a single CPU
>> core.
>>
>> All of the above means Guile's current GC won't work for the AtomSpace,
>> not without some significant work. That's the bad news.
>>
>> The good news is that I believe we can implement a fix for our special
>> case (Guile running in a separate process in the CogServer) in a few
>> days. My idea is a bit of a hack, but it will give us extremely fast
>> allocations, zero leaks, and a speedup of at least 2X over the
>> small-file case, where we're already getting sub-second processing of
>> lines. I'd not be surprised if we can hit 0.5 seconds in the
>> no-SQL-saving case.
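[Editor's note: the scaling effect described above, a full GC pass growing
with the live heap until it saturates a core, can be illustrated with a toy
measurement. The sketch below uses CPython's collector, which is precise
and generational rather than conservative like the Boehm GC that Guile
uses, so the constants differ wildly, but the proportionality of a full
collection pass to live-heap size is the same basic phenomenon.]

```python
import gc
import time

# Toy illustration: the time for a full collection pass grows with the
# number of live objects the collector must scan, independent of how much
# actual garbage there is.
def time_full_collect(n_objects):
    heap = [[i] for i in range(n_objects)]  # keep n_objects containers live
    gc.collect()                            # settle any pending garbage
    t0 = time.perf_counter()
    gc.collect()                            # timed full pass over live heap
    elapsed = time.perf_counter() - t0
    del heap
    return elapsed

small = time_full_collect(10_000)
large = time_full_collect(1_000_000)
print(f"full collect, 10k live objects: {small:.5f} s")
print(f"full collect, 1M live objects:  {large:.5f} s")
```

With 100x more live objects, the second collection is dramatically slower
even though neither heap contains garbage; a heap holding tens of millions
of atoms sits far further out on the same curve.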
>> I have attached a screenshot of top, where you can see 100% CPU for the
>> cogserver, and Terminal windows where you can see that there were no
>> errors starting the cogserver and that it is on sentence 590, which
>> took 1049 seconds to reach, with an average per-sentence time at that
>> point of 1.78 seconds. You can also see that there are no atoms in the
>> atomspace (except for the default one that gets added to every
>> atomspace).
>>
>> So my plan is: as a first step, I'm going to implement a
>> quick-and-dirty replacement for the GC which just uses malloc, to see
>> if that gets through the file. If this approach works, it will confirm
>> the analysis above, though it would not be suitable for the long term.
>>
>> Assuming the malloc-based allocations complete in reasonable time, I've
>> got some ideas for altering the GC code to use a separate per-thread
>> heap for new allocations, which we can then dispose of en masse when
>> the thread terminates. Assuming we can get a decent bound on the usage
>> of guile-internal objects during statement execution, this will be
>> much, much faster and won't leak. Everyone wins.
>>
>> Let me know if you have any questions or suggestions.
>>
>> - Curtis
>>
>> On Thu, Jun 1, 2017 at 1:42 AM, Ben Goertzel <[email protected]> wrote:
>>
>>> Thanks a lot Curtis!
>>>
>>> I totally grok the desire to complete an analysis before announcing
>>> results... OTOH if we were all sitting in the same office we'd
>>> probably be communicating lots of interim results. So generally I'm
>>> biased toward fostering communication! ...
>>>
>>> See you Monday...
>>>
>>> ben
>>>
>>> On Wed, May 31, 2017 at 7:58 AM, Curtis Faith <[email protected]>
>>> wrote:
>>> > I should have been more explicit and said that removing the store
>>> > and fetch was just a quick and dirty way of measuring the
>>> > performance impact of this issue, not a suggested solution to it.
>>> >
>>> > On Wed, May 31, 2017 at 10:13 PM, Curtis Faith
>>> > <[email protected]> wrote:
>>> >>
>>> >> I am sending this a bit earlier than I normally would, as I am
>>> >> still trying to figure out the contribution of the Guile garbage
>>> >> collector to performance and am not quite done. I have also not yet
>>> >> measured a test over significant data for a before-and-after
>>> >> comparison, for example.
>>> >>
>>> >> But since Ben sent Linas an email about this problem, I'll send my
>>> >> work in progress, so Linas has a bit more information with which to
>>> >> answer Ben's question.
>>> >>
>>> >> So, the following represents a partial analysis of the SQL issues I
>>> >> found. I removed the code for fetch-atom and store-atom, and while
>>> >> this is considerably faster, there still seem to be many-second
>>> >> pauses in processing that are not related to the number of words in
>>> >> the sentence in any way I can easily correlate. The Guile garbage
>>> >> collector? This might be due to a memory leak that forces the
>>> >> collector to search an ever-increasing memory space, looking for
>>> >> object-ish references using its conservative scheme.
>>> >>
>>> >> - Curtis
>>> >>
>>> >> -----------------------
>>> >>
>>> >> My 6,000 sentence test file finished over the weekend. It took
>>> >> about 48 hours.
>>> >>
>>> >> For some context, the input corpus, Pride and Prejudice, has
>>> >> approximately:
>>> >>
>>> >>     120 K words
>>> >>     700 K bytes
>>> >>       6 K sentences
>>> >>       2 K paragraphs
>>> >>      25   average words per sentence
>>> >>
>>> >> So it is a fair-sized novel, but only a tiny, tiny fraction of what
>>> >> we'd like to feed the language learning system.
>>> >>
>>> >> In looking at the Postgres statistics for updates on a freshly
>>> >> initialized database, I found:
>>> >>
>>> >> SELECT relname, idx_scan, idx_tup_fetch, n_tup_ins, n_tup_upd, n_tup_del
>>> >> FROM pg_stat_user_tables;
>>> >>
>>> >>   relname    | idx_scan  | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
>>> >>  ------------+-----------+---------------+-----------+-----------+-----------
>>> >>   atoms      |  95789119 |      90072278 |   3288742 |         0 |         0
>>> >>   values     |         4 |             0 |         0 |         0 |         0
>>> >>   typecodes  |         0 |             0 |       228 |         0 |         0
>>> >>   valuations | 131581719 |      84520666 |  44519326 |         0 |  41926158
>>> >>   spaces     |   3288742 |       3288742 |         2 |         0 |         0
>>> >>  (5 rows)
>>> >>
>>> >> So roughly 90M fetches for atoms and 85M for valuations while
>>> >> parsing a single file of 6K sentences.
>>> >>
>>> >> We have 3.3M atoms inserted and 90.0M index tuple fetches for the
>>> >> atoms table.
>>> >>
>>> >> In digging into the code, I found that we issue an explicit delete
>>> >> and insert for every single increment of every atom count
>>> >> maintained by the system. The count increment code is also
>>> >> recursive over the tree defined by each pair, which in practice
>>> >> means many different atoms: even though we only increment the count
>>> >> for one atom in the tree, the store-atom and fetch-atom calls
>>> >> update the truth values for every atom in the tree that has a
>>> >> truth value. The pair counting code has this comment:
>>> >>
>>> >> ; ---------------------------------------------------------------------
>>> >> ; update-pair-counts -- count occurrences of word-pairs in a parse.
>>> >> ;
>>> >> ; Note that this might throw an exception...
>>> >> ;
>>> >> ; The structures that get created and incremented are of the form
>>> >> ;
>>> >> ;     EvaluationLink
>>> >> ;         PredicateNode "*-Sentence Word Pair-*"
>>> >> ;         ListLink
>>> >> ;             WordNode "lefty"    -- or whatever words these are.
>>> >> ;             WordNode "righty"
>>> >> ;
>>> >> ;     ExecutionLink
>>> >> ;         SchemaNode "*-Pair Distance-*"
>>> >> ;         ListLink
>>> >> ;             WordNode "lefty"
>>> >> ;             WordNode "righty"
>>> >> ;         NumberNode 3
>>> >>
>>> >> So every pair has 11 atoms.
>>> >>
>>> >> The truth value gets set only at the top level, but the fetch and
>>> >> store end up retrieving and storing the valuations for the entire
>>> >> tree, wherever there are truth values.
>>> >>
>>> >> Further, the counting code is storing and fetching the truth values
>>> >> for each unique count value each time it gets incremented. So we'll
>>> >> have (1: an insert for 1), then (2: a delete of the 1, and an
>>> >> insert for 2), then (3: a delete of the 2, and an insert for 3),
>>> >> then (4: a delete of the 3, and an insert for 4), etc. And this is
>>> >> done for the EvaluationLink, the ExecutionLink, and the four
>>> >> WordNodes in the above example.
>>> >>
>>> >> I wanted to verify this assumption, so I created a fresh database
>>> >> and turned on statement logging in Postgres. It resulted in the
>>> >> attached log file for a test of a four-word sentence:
>>> >>
>>> >>     (observe-text "This is a test.")
>>> >>
>>> >> We got 5419 lines in the log file (not counting the 333 lines of
>>> >> startup creations of types etc.), each of which represents a single
>>> >> SQL statement. There are:
>>> >>
>>> >>      813 BEGIN,
>>> >>      813 COMMIT,
>>> >>     2127 SELECT,
>>> >>      720 DELETE, and
>>> >>      946 INSERT
>>> >>
>>> >> statements for a four-word sentence. (See the attached text log for
>>> >> details.)
>>> >>
>>> >> There are 133 new atoms created and 93 valuations after the call to
>>> >> observe-text is finished.
>>>
>>> --
>>> Ben Goertzel, PhD
>>> http://goertzel.org
>>>
>>> "I am God! I am nothing, I'm play, I am freedom, I am life. I am the
>>> boundary, I am the peak."
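[Editor's note: the per-increment delete-plus-insert pattern measured above
could in principle be collapsed into a single atomic upsert per increment.
Below is a sketch using SQLite for convenience; Postgres 9.5+ supports the
same INSERT ... ON CONFLICT ... DO UPDATE syntax. The table and column
names are invented for illustration, not those of the actual backend.]

```python
import sqlite3

# Sketch: one upsert per count increment, instead of a SELECT followed by
# a DELETE and an INSERT.  Requires SQLite >= 3.24 for ON CONFLICT upserts.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE pair_counts (
        atom_uuid INTEGER PRIMARY KEY,
        count     INTEGER NOT NULL
    )
""")

def increment(uuid):
    # Insert a fresh row with count 1, or bump the existing row's count;
    # a single statement either way.
    conn.execute("""
        INSERT INTO pair_counts (atom_uuid, count) VALUES (?, 1)
        ON CONFLICT(atom_uuid) DO UPDATE SET count = count + 1
    """, (uuid,))

for _ in range(4):
    increment(7)

(count,) = conn.execute(
    "SELECT count FROM pair_counts WHERE atom_uuid = 7").fetchone()
print(count)  # -> 4
```

Four increments produce four statements and one row, rather than the
insert/delete/insert/delete churn counted in the log above.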
>>> -- Alexander Scriabin

--
You received this message because you are subscribed to the Google Groups
"opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/opencog.
To view this discussion on the web visit
https://groups.google.com/d/msgid/opencog/CAHrUA36vLhCAd0Noq403E4cANiF8ZV0-aShhmaQdT5XFKsrFSQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
