Hi Linas, thanks for the detailed thoughts and responses... I will write an equally careful reply when I get time; or else I will return to all this in detail when/if resources are obtained to pursue these sorts of directions... (which I hope will be soon...)

About Jupyter Notebooks, I note that Google is now using them to supply new TensorFlow tutorials... I do think it would be great if we could get our OpenCog tutorials working with Jupyter. Perhaps making Jupyter support Guile is indeed the easiest way... it seems this "shouldn't be hard" ... I agree Jupyter is not perfect, but it's a lot nicer than embedding code in the wiki pages like we do now, tutorial-wise...

I agree that if we wanted to go with CL, we would want to test Christian's CL approach thoroughly versus others before making any sort of decision to undertake a lot of work...

About using OpenCog in a data analysis context -- for applications like bio-data-analysis, or analyzing network data for a networking company, it's clear there are non-OpenCog tools that will be used for data preprocessing, for results evaluation, etc. Putting all these other tools in Atomspace/OpenCog isn't viable. There are going to be scripts that pass data through various other things, then through OpenCog/Atomspace, then through various other things.... For the next 5 years at least, probably more.... So finding a great way to support this kind of workflow is going to be important. Maybe the Guile shell is a great way to support this kind of workflow.

In the bioinformatics case it happens that the preponderance of needed tools are coded in R, not scipy ... we need e.g. the 100 or so scripts in Bioconductor (an R package collection). It is true that the heavy lifting behind R packages is usually done in C++, but we need more than just the heavy lifting; we need the bio-specific data-munging code that is in the R code also....

So that is part of what I'm wrestling with: how to build a framework that makes sense in an OpenCog-universe context, and also makes sense in terms of the existing ecosystem of bio-AI/informatics tools that we need OpenCog to be used together with...

Schafmeister is using a whole other set of tools for his nanotech/cheminformatics work... C++ and FORTRAN, not R....
But genomics is in R these days...

-- Ben

On Sun, Jun 18, 2017 at 10:48 AM, Linas Vepstas <[email protected]> wrote:
> Hi Ben,
>
> I like the general idea, but many particulars I don't like. Yes, we need
> something like this, but maybe there's an alternate route.
>
> TL;DR: Maybe we should pick the science tools first, and then ask which
> languages we should target.
>
> You met a really great salesman who convinced you that his CL product is the
> best, but I think you need to step back and get some bearings:
>
> -- why is his CL better than bigger, established CL projects?
> -- why is some other CL project better than some other scheme project?
> -- why not consider scala or haskell?
> -- are there science libraries for scala or haskell, or some other scheme or
> lisp?
> -- maybe it's easier to fix cython, and just use scipy with cython? assuming
> scipy is adequate for the science needs?
>
> The "autogenerating c++ bindings" is total sales bullhockey. Recall that
> cython, guile, swig, and two dozen other things were invented to
> "autogenerate C++ bindings". Even haskell has this ability, and you've seen
> how it took Roman just one afternoon to do this!
>
> --linas
>
>
> On Sat, Jun 10, 2017 at 11:11 PM, Ben Goertzel <[email protected]> wrote:
>>
>> TL;DR -- particularly for use of OpenCog in bioinformatics and other
>> scientific-computing applications -- but also for other applications like
>> corpus linguistics where one deals with large datasets and wants to
>> combine OpenCog with other command-line tools --
>
> "with other sparse-data, big-data and graph processing tools" would be more
> appropriate.
>
>> it might be a good
>> idea to replace or supplement our current Guile shell with a wrapping-up
>> of OpenCog in Common Lisp …
>
> Here and elsewhere, you confuse scheme, guile and lisp. So
>
> 1) scheme is a dialect of lisp, so is common lisp.
>
> 2) guile is a specific implementation of scheme that allows scheme to be
> used from c++. There are very few implementations of scheme or lisp that
> allow this to take place.
>
> 3) Creating language bindings for opencog is a lot of work. Creating common
> lisp bindings would be roughly as much work as it is for guile, python or
> haskell.
>
>> and in particular in the CLASP Common
>> Lisp framework that Christian Schafmeister has developed.
>
> Personally, I am nervous about one-man shows. The one man has a historical
> tendency to get bored and move on to something else, and then everything
> stops and falls apart.
>
>> This would
>> also have some other smaller side-benefits like letting us exploit the CL
>> bindings for Jupyter Notebooks which would be cool for OpenCog
>> tutorials;
>
> I played with jupyter a fair amount, hoping to use it as a personal diary
> of the data analysis I do in opencog. Upshot: it's way cool, and very
> immature. It struggles to do even basic stuff, and you've got to load and
> configure all sorts of add-ons to get it to behave properly. So it's a cool
> idea, but still very immature and half-baked.
>
>> and making it easy to generate LISP bindings for ROS,
>
> those already exist but we fall into the age-old trap: exactly which
> variant of common lisp do you propose? I count seven major, 20 minor
> variants:
>
> http://www.cliki.net/Common+Lisp+Implementation
>
> it's like picking racket-scheme over guile-scheme over mzscheme over
> chicken-scheme.
>
> the point behind scheme is that it's "modern", fixing many of the "bugs" in
> lisp.
>
>> thus
>> avoiding the need to deal with ros-py for OpenCog robotics applications…
>>
>> Basic reasons are: CLASP is efficient at handling large datasets and
>> piping them from one place to another. It can also automatically
>> generate bindings for C++ code, which could be used to auto-generate
>> LISP bindings for ROS….
>
> Yeah, like total bullshit. Everyone has something called "FFI" (foreign
> function interface) and everyone always says that FFI makes it trivial to
> interface their programming language to c++ code. Which is half-way
> correct, for small and simple C/C++ libraries.
>
> This falls apart for something more complex, which is why things like SWIG
> and Cython and Guile get invented. There's got to be a dozen-or-two more
> of these "automatically generate bindings for c++ code" tools out there.
> They all work sort-of-ish OK. Up to a point, and then hell breaks loose.
>
> Opencog is clearly on the other side of that: otherwise Roman would have
> finished the haskell bindings in an afternoon, just by autogenerating them
> with FFI.
>
> "It's easy" is the program manager's swan song. What it really means is "i
> see a way of doing something"
>
>> He has some strong arguments as to why this is a better way than R or
>> python to get scientific computing done on large datasets….
>
> That is probably true. I don't get the impression that R or scipy are ready
> for large datasets.
>
>> One thing he found is that in many of his computational chemistry
>> analysis scripts, the bulk of compute time was getting taken up
>> passing around datasets and results between different C++ programs,
>> in R or python or whatever other glue language was being used…
>
> That must surely be very true.
>
> On the other hand, the atomspace is a kind-of database for holding datasets,
> so what is really needed is a way to have these tools talk directly to the
> data that is already there, in the atomspace. The point is: bring the tools
> to where the data is, don't try to cart the data around between the tools.
>
>> Via using Common LISP compiled into LLVM, he found this could be
>> worked around, because one could then script stuff in CL but have
>> efficient garbage collection done via LLVM …
>
> one can script stuff in scheme, and have efficient garbage collection with
> bdwgc ... oh wait, we already have that. Don't let curtis confuse you: if
> you do garbage collection once every sentence, then, yes, garbage collection
> will take up a large fraction of the time. Don't do garbage collection on
> every sentence, and the problem goes away.
>
>> My speculative line of thinking now is that it might be interesting to
>>
>> -- supplement or replace Guile with Clasp as a shell for working with
>> OpenCog
>
> Non-starter, for the above reasons. First, it will take just as much or more
> time than haskell, cython or guile took, and clasp is far more obscure, and
> has a far smaller user base.
>
>> -- Integrate C++ bio-analytics tools into Clasp,
>
> Or pick some other, more popular variant of lisp or scheme.
>
>> in a similar way to
>> what Schafmeister has been doing for chem-analytics tools
>>
>> -- Integrate R bio-analytics tools into Clasp as well, using the following
>> LLVM compiler for R
>
> Or integrate lisp into java, like clojure, or integrate Groovy or Scala or
> Jython or ....
>
> Besides Haskell, I think scala is the other interesting one to consider.
>
>> -- Use the Jupyter Notebooks wrapping for CL to make Jupyter tutorials
>> for OpenCog
>
> Jupyter + scheme would be the way to go. A lot easier, I'm guessing maybe
> 50 times less work.
>
>> (although, as a side note, integrating Guile with Jupyter Notebooks
>> would also not be impossible ... it would just be a bunch of work,
>> like any of this…)
>
> It would be like several orders of magnitude less work. Like maybe weeks,
> instead of half-year++ or year++
>
>
> That's it.
> I think that basically, you met a really great salesman who
> convinced you that his CL product is the best, but I think you need to step
> back and get some bearings:
>
> -- why is his CL better than bigger, established CL projects?
> -- why is some other CL project better than some other scheme project?
> -- why not consider scala or haskell?
> -- are there science libraries for scala or haskell, or some other scheme?
> -- maybe it's easier to fix cython, and just use scipy with cython? assuming
> scipy is adequate for the science needs?
>
> Are we putting the cart before the horse? Maybe we should pick the science
> tools first, and then ask which languages we should target?
>
> --
> You received this message because you are subscribed to the Google Groups
> "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/opencog.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/opencog/CAHrUA37qymACH%3DYyVtAxuu0adk224_yL7dpsUb5An5KyD0v76Q%40mail.gmail.com.
>
> For more options, visit https://groups.google.com/d/optout.

--
Ben Goertzel, PhD
http://goertzel.org

"I am God! I am nothing, I'm play, I am freedom, I am life.
I am the boundary, I am the peak." -- Alexander Scriabin
