On Mon, Jun 5, 2017 at 10:22 PM, Curtis Faith <[email protected]> wrote:
> Linas wrote:
>
>> OK. So let's think this through. The only place where GC is being used is
>> in guile; GC is not being used in the atomspace itself. So you could
>> accomplish exactly the same thing by periodically shutting down guile,
>> completely. This would release all that memory, and then you are done.
>
> <snip>
>
>> But I don't see any way of implementing a pool, without fully shutting
>> down guile; and if one fully shuts down guile, then one doesn't need a pool.
>
> Okay. So here's what I think is happening now in the CogServer, please
> correct me if this is wrong:
>
> CogServer receives 3 requests, then CogServer creates three new threads:
>
> 'scm (observe-text "Test sentence one.")'   -> Thread one
> 'scm (observe-text "Test sentence two.")'   -> Thread two
> 'scm (observe-text "Test sentence three.")' -> Thread three
>
> If Guile's GC, the bdwgc, were altered to check a thread-local flag, and,
> if it was set, allocate new objects out of a thread-local pool, then when
> these types of threads completed there would be no need to garbage-collect
> any allocations made out of that pool. One could just reset the pool.

But you'd totally corrupt memory, because all three threads are sharing
zillions of unknown, opaque SCM structures, in utterly unknown ways. For
example, all three share the code for "observe-text". And observe-text is
updating god-knows-what closures, where ... I'm pretty sure I'm using
closures to collect and report progress statistics. And then there are all
the SCMs that are embedded directly in the C++ code, holding things like
the current output port, the current return value, etc.

> But this is way more work than just restarting Guile every once in a
> while, I agree. There may also be other problems with this approach. It is
> a very general solution to what is likely to be a very common issue in the
> future, so it may be worth looking at in the future.
> So on to your objection to restarting Guile.
>
>> The problem with this proposal is that pretty much everything runs through
>> guile. All the atoms go in through it, and come out through it. So
>> shutting down guile and restarting it is tantamount to shutting down the
>> system, and restarting it. Which is OK, if you saved all the atoms you
>> care about to the database.
>
> I don't see why shutting down Guile is tantamount to shutting down the
> system and restarting it. Right now, the atom space is created first,

Ehh? Where? Right now, guile is started first, then the opencog module is
loaded, and this module creates the atomspace. If you leave guile, you get
the bash prompt; there is no atomspace any more; there's no running
executable any more, after you leave guile.

> then the SchemeEval object is created in CogServer::CogServer.

That's incorrect. The SchemeEval object is created when you
(use-modules (opencog exec))

I typically create this before running the cogserver. In many of my
scripts, I don't bother to start the cogserver, as there is no real need
for it. It's kind of a legacy. I've been vaguely planning on getting rid of
it, since it mostly doesn't do anything.

> So I don't see why we couldn't have a function that shut down Guile,
> releasing all its memory back to the OS,

You can shut down the cogserver -- there's a function for that; it's called
(stop-cogserver) but you execute that function inside of guile. It leaves
the atomspace intact. It also leaves SchemeEval intact.

> and then restarted a new Guile interpreter without destroying the
> AtomSpace; upon restart you'd pass in the same atom space that existed
> before the shutdown.
>
> Now, due to some of the quirks of the GC, and perhaps Guile's interaction
> with it, shutting down and restarting is a bit tricky.
> You've got the issue with the infinite sleeping initialization thread
> noted in
> https://github.com/opencog/atomspace/blob/master/opencog/guile/SchemeEval.cc#L241-L285
>
> You've also got a few SCM static globals scattered about that would have
> to be cleared and reloaded. It's a bit of work, sure, but doable.

Ask Ben about this. I spent a miserable week in NYC doing this "doable"
thing, and I hope I never have to go there again. That there is some truly
nasty, complicated code.

> The cleanest way would be to call the appropriate destructors to free up
> the memory allocated by Guile and the GC. Then unload both libguile.so and
> libgc.so and reload them, and then redo the initialization.
>
> Still, I'm not sure any of this is what we should be doing now, since the
> issue at hand can be most easily handled through other means.
>
> On Tue, Jun 6, 2017 at 3:10 AM, Linas Vepstas <[email protected]>
> wrote:
>
>> Long message...
>>
>> On Sat, Jun 3, 2017 at 10:07 PM, Curtis Faith <[email protected]>
>> wrote:
>>
>>> Linas wrote:
>>>
>>>> The idea that you are going to use a special pool for guile which you
>>>> then clear out every so often is just ... a proposal to take a
>>>> sophisticated GC algorithm and replace it with a truly sophomoric ...
>>>> ahem, freshman concept of GC. It's a waste of time.
>>>
>>> GC is needed when you have long-running processes or threads that can't
>>> just leak with impunity. It is wholly unnecessary for short-duration
>>> tasks with moderate memory requirements. In the special case of
>>> processing batch requests, with web servers like nginx or Apache or the
>>> CogServer running observe-text, there are not many objects that can't
>>> immediately be destroyed when the request (or sets of those requests)
>>> finish. That makes the overhead of GC unnecessary in these batch
>>> processing cycles.
>>> It also makes the overhead of finalizing anything unnecessary, if there
>>> is enough RAM to service the requests without any cleanup during a
>>> single request's processing. You need to release system resources, and
>>> that's about it.
>>>
>>> A pool-based, cleanup-once-at-the-end approach will make things much
>>> faster, whether the problem I am seeing ends up being from a bug or not.
>>
>> OK. So let's think this through. The only place where GC is being used is
>> in guile; GC is not being used in the atomspace itself. So you could
>> accomplish exactly the same thing by periodically shutting down guile,
>> completely. This would release all that memory, and then you are done.
>>
>> The problem with this proposal is that pretty much everything runs
>> through guile. All the atoms go in through it, and come out through it.
>> So shutting down guile and restarting it is tantamount to shutting down
>> the system, and restarting it. Which is OK, if you saved all the atoms
>> you care about to the database.
>>
>> But I don't see any way of implementing a pool, without fully shutting
>> down guile; and if one fully shuts down guile, then one doesn't need a
>> pool.
>>
>> The only alternatives are to use python, but python is single-threaded,
>> so this is a non-starter. The 3rd alternative is haskell, but it's also
>> garbage-collected. Can't use C++, because it has no interpreted
>> command-line. (Using C++ is tantamount to shutting everything down,
>> recompiling, and restarting everything, which is clearly the
>> worst-possible scenario.) A 5th alternative is to invent a custom
>> vocabulary of words to control C++ objects from an interactive command
>> line, but this is clearly a design disaster. It was 1988 when the folly
>> of this was realized, which caused Tcl to be integrated into C apps. The
>> inadequacy of Tcl led to the invention of guile ... and so here we are,
>> full-circle. We could switch to swig+perl, or to javascript ...
>> but javascript is garbage-collected, and I think perl is too, not sure.
>> I just don't see any way of implementing what you are talking about.
>>
>>> Performance is clearly an issue for unsupervised learning
>>
>> Really? Sorry, but in what way? What's the problem?
>>
>>> and general AI in general. It is also an issue for OpenCog right now,
>>> since getting the data into the AtomSpace in the right form is taking
>>> far too long right now.
>>
>> Really? In what way? What is the problem?
>>
>>> No matter what we do, performance will always be an issue, because of
>>> the sheer size of the datasets researchers want to work with.
>>
>> Well, if those folks at Intel and AMD weren't so lazy, we'd have great
>> performance by now.
>>
>>> That is why Ben first asked me to look into using AWS to spawn parallel
>>> processes to cut down on the calendar time required to input large
>>> corpora.
>>
>> Well, we know Ben is crazy. This is not where the problem lies. It's easy
>> to get large corpora pumped through. I can give you a dozen dumps of
>> datasets so large they won't fit in the RAM of your computer. Do you want
>> large datasets? Cause I got them.
>>
>> The problem is that I don't have tools to analyze those datasets. That's
>> where 95% of my personal bottleneck lies. Simply crunching a lot of data
>> is just so totally not at all the hard part.
>>
>>> I'm seeing 50% to 70% of the time spent in the GC.
>>
>> Are you using the tool I sent you? Because I am seeing less than 20%.
>>
>> Sorry for some of the sarcasm. Sure, more performance would always be
>> nice, but GC time is a complete red herring. Also, technically, I think
>> GC is not a solvable problem. The alternative is reference counting, and
>> that is also a total CPU hog.
>>
>> I do have some proposals, but first: 1) I have large datasets, 2)
>> creating large datasets is totally not an issue,
>> 3) creating tools to analyze them is almost 100% of the issue.
>>
>> But if you wanted to get atoms into the atomspace faster, like 10x or
>> 20x faster: you could run the link-grammar parser in the same address
>> space as the atomspace. Just take what it spits out, convert it into
>> atoms, and shove the atoms into the atomspace. This would completely
>> bypass guile, and bypass all GC. So GC would totally not be an issue in
>> this case.
>>
>> To be clear: currently, LG parses text, then bloody java code turns it
>> into strings, which are sent over a socket to guile, which evaluates the
>> strings, and creates atoms. About 80% or more of this process is the
>> cost of having guile evaluate strings that specify atoms, in string
>> format. Eliminate this, and you get an instant 3x, 4x speedup.
>>
>> Once you did this, you'd discover two other bottlenecks: shoving atoms
>> into the atomspace is slowwwww. And pushing atoms out to the database is
>> slowwww. These are much harder, but more important, bottlenecks to
>> overcome.
>>
>> Re: running LG in the same address space as the atomspace: this has
>> already been done; the surreal code does this. In a day or 2 or 3 you
>> could write the needed wrapper code to have LG live directly inside of
>> opencog, generating the correct atoms, thus totally bypassing guile and
>> garbage collection. And this would be a very easy way to get a 3x
>> speedup, if that's really your end-goal. It's a lot easier than all the
>> other crazy schemes discussed.
>>
>> In the very long term, I plan to do this anyway, because I want to apply
>> the LG algorithms to generic atomspace data, not just to natural
>> language. However, currently LG is totally focused only on language, and
>> it's too much work to re-implement it as a generic data parser. Baby
>> steps, for now.
>>
>> --linas
--
You received this message because you are subscribed to the Google Groups
"opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/opencog.
To view this discussion on the web visit
https://groups.google.com/d/msgid/opencog/CAHrUA34%3DY9HzsfA9r8iH1z6NtMW1Bp0tLrUf_tZhko2wx24kVA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
