[opencog-dev] Re: Language Learning - Progress and Performance

Curtis Faith Tue, 06 Jun 2017 02:13:27 -0700

> 1) I fixed the hang. :-) Or figured out how to avoid it. It turns out
> that, due to a stoopid bug/oversight, quotation marks in the input text
> were not being escaped. Thus, guile would see a begin-quote, end-quote,
> followed by garbage. A backtrace would be generated, and passed back to the
> witless user, who is just netcat, and couldn't give a damn, and so was
> silently ignored.   Escape the quotation marks correctly, the problem goes
> away.



Hmmm, I looked and this is NOT the problem for my testing of Pride and
Prejudice.

There is not a single plain ASCII double quote (") in all the text. All quotes
in Pride and Prejudice are actual left-right pairs, i.e. Unicode “” not ASCII "".
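
For anyone who wants to repeat the check on their own input text, here is a minimal sketch in Python. The helper names are hypothetical and not part of OpenCog or Guile. It counts ASCII versus curly double quotes, and shows the kind of escaping a line would need before being embedded in a Scheme string literal (backslash and double quote are the characters that matter there).

```python
def count_quotes(text):
    """Count ASCII double quotes and Unicode curly double quotes in text."""
    return {
        "ascii": text.count('"'),
        "curly": text.count("\u201c") + text.count("\u201d"),
    }


def escape_for_scheme_string(text):
    """Escape text so it can sit inside a double-quoted Scheme string.

    Backslashes are escaped first so that the backslashes added for the
    quotes are not escaped a second time.
    """
    return text.replace("\\", "\\\\").replace('"', '\\"')


if __name__ == "__main__":
    curly_sample = "\u201cSome sentence\u201d"
    ascii_sample = 'He said "Some sentence" aloud'

    print(count_quotes(curly_sample))
    print(count_quotes(ascii_sample))
    print(escape_for_scheme_string(ascii_sample))
```

A text that reports zero ASCII quotes, as mine does, passes through the escaping step unchanged, so unescaped quotes cannot be what triggers the problem on that input.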

On Tue, Jun 6, 2017 at 3:02 PM, Curtis Faith <[email protected]>
wrote:

> Linas wrote:
>
>
>> 1) I fixed the hang. :-) Or figured out how to avoid it. It turns out
>> that, due to a stoopid bug/oversight, quotation marks in the input text
>> were not being escaped. Thus, guile would see a begin-quote, end-quote,
>> followed by garbage. A backtrace would be generated, and passed back to the
>> witless user, who is just netcat, and couldn't give a damn, and so was
>> silently ignored.   Escape the quotation marks correctly, the problem goes
>> away.
>
>
> Cool.
>
> Puzzling how this bug might result in the fp in the garbage collector
> getting screwed up. Did you find the specific bug in the GC or Guile's use
> of it that causes the fp to get screwed up, so we can help the developers
> get that fixed? A problem in scheme source syntax or variables not being
> spelled correctly shouldn't result in an infinite loop in the GC in any
> case. Seems like they must be missing some sort of exception handler
> somewhere.
>
>> 2) how are you measuring GC time? I also get 70% but I also get 500% cpu
>> time for the other 181 active threads, so 70/500 seems acceptable to me.
>> Again, GC halts only guile, it does not halt the atomspace.
>
>
> I was simply using the overall duration of the test as measured by the
> perl script and the result of (gc-run-time) as measured via a telnet
> session running in another bash terminal.  My measurements may be in error
> if my assumption about the GC is incorrect, as I've hinted a couple of times
> in prior emails. If anything, however, I'm overcounting the total time and
> undercounting the percentage, since I'm not getting CPU measurements for the
> duration on the CogServer; I'm counting the time until the test is finished
> as determined by the perl script.
>
> My base assumption is that 'gc-run-time' is time when the other Guile
> threads are blocked. I make this assumption because of the way the
> 'gc_time_taken' statistic in Guile is generated: via the GC's before_gc
> and after_gc hooks, which get called before and after each stop-the-world
> collection. So my assumption is that, since the stop-the-world code
> suspends all the threads that can be garbage collected, and since all the
> CogServer threads created in response to an
> "scm hush (observe-text 'Some sentence')" are stopped because they
> allocate objects in the GC via Guile, the GC time reported is
> non-overlapping with the other processing time.
>
> If a test takes 100 seconds, and 50 seconds are spent in the GC, then during
> those 50 seconds there is no work going on in any non-GC threads, because
> they are suspended for the duration of the stop-the-world collection. I have
> looked at the code to verify these assumptions but it is certainly possible
> that I am missing something.
>
> How many CPUs are on your test machine? Does it have hyper-threaded Intel
> chips? If so, you don't tend to get accurate measures of real-time
> CPU performance on those chips.
>
> You have to scale the multi-threaded process CPU time by a factor of 0.6
> to 0.65 to more accurately reflect the 120% to 130% of a CPU core that is
> available for a hyper-threaded core pair; it's not 200% even though it
> reports that way. So if you've got 70 units of time in a single thread
> blocking all the others and a reported 500 units total, the non-blocking
> multi-threaded time is 500-70 or 430 units. Multiply by 0.6 to 0.65 to
> account for over-reporting on hyper-threaded chips and you get 258 to 280.
> Now divide 258 to 280 by the number of CPUs as reported by the OS
> (all hyper-threading units) and you should be close to 30% of the elapsed
> time. NOTE: even the 70% reported for the gc time may be overstated by some
> factor, though since there are plenty of empty cores for OS and other tasks
> while the GC is running, it is likely that the GC gets full use of a core
> for most of its duration on a machine that isn't already taxed with other
> processes (like Postgres asynchronous dumps into the AtomSpace). My tests
> include no such additional work as I've turned off the store-atom
> and fetch-atom tests.
>
> See:
> http://perfdynamics.blogspot.hk/2014/01/monitoring-cpu-utilization-under-hyper.html
> for a bit more on why the CPU times are wrong on Intel chips and Linux.
>
>
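
The hyper-threading adjustment in the quoted message can be written out as a few lines of Python. This is only a sketch of that arithmetic: the function names are made up for illustration, and the logical CPU count is left as a parameter because the thread does not say how many CPUs the test machine reports.

```python
def adjusted_parallel_units(total_units, gc_units, ht_factor):
    """Non-GC CPU time, scaled down for hyper-threading over-reporting.

    total_units: CPU time reported across all threads (500 in the example)
    gc_units:    time spent in the single blocking GC thread (70)
    ht_factor:   correction factor, 0.6 to 0.65 per the message above
    """
    parallel_units = total_units - gc_units
    return parallel_units * ht_factor


def per_cpu_share(adjusted_units, logical_cpus):
    """Spread the adjusted time over the CPUs the OS reports."""
    return adjusted_units / logical_cpus


if __name__ == "__main__":
    low = adjusted_parallel_units(500, 70, 0.60)
    high = adjusted_parallel_units(500, 70, 0.65)
    print(low, high)  # roughly 258 and 279.5, the "258 to 280" range

    # logical_cpus must come from the actual machine; 8 here is an
    # assumed placeholder, not a figure from this thread.
    print(per_cpu_share(low, 8), per_cpu_share(high, 8))
```

Whether the per-CPU figure lands near 30% depends entirely on the real logical CPU count, which is why I asked about it.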
