I mean for R -- seriously, this cannot possibly be hard. I used R for a few days, almost a week, and got the impression it was mostly vectors and arrays. I'm pretty sure that in R, there is a C/C++ wrapper, where you can create a "special" R vector, or R array, and whenever you access item k in the vector, or item i,j,k in the array, that wrapper calls your C++ code, and the only thing your C++ code has to do is to return a float or an int or a string.
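
To make the idea concrete, here is a minimal sketch of that kind of wrapper, written in Python rather than R or C++ purely for illustration. It is not R's actual API and not the atomspace API: `LazyAtomVector`, `fetch`, `store` and `fake_atomspace` are all hypothetical names, and a plain dict stands in for the atomspace. The only thing it shows is that element k is looked up in the backing store at the moment it is accessed, instead of being copied out ahead of time.

```python
class LazyAtomVector:
    """Vector-like view whose elements are fetched on access.

    `fetch` and `store` are hypothetical callbacks standing in for the
    C++ code that would read or write a value attached to an atom.
    Only integer indices are handled in this sketch.
    """

    def __init__(self, keys, fetch, store):
        self._keys = list(keys)   # which atom backs each position
        self._fetch = fetch
        self._store = store

    def __len__(self):
        return len(self._keys)

    def __getitem__(self, k):
        # Every read is forwarded to the backing store; nothing is cached.
        return self._fetch(self._keys[k])

    def __setitem__(self, k, value):
        # Every write goes straight through to the backing store.
        self._store(self._keys[k], value)


# Stand-in for the atomspace (assumption): atom name -> float value.
fake_atomspace = {"cat": 0.9, "dog": 0.7, "fish": 0.2}

vec = LazyAtomVector(
    keys=["cat", "dog", "fish"],
    fetch=fake_atomspace.__getitem__,
    store=fake_atomspace.__setitem__,
)

vec[1] = 0.5                      # lands in fake_atomspace["dog"]
print(vec[0], vec[1], len(vec))
```

A real shim would swap the dict lookups for calls into the code that reads or writes a value on an atom; the vector-like interface seen by the caller would stay the same.
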
For us, under the covers, that C++ code just gets or sets some truth-value (or other value) on some pile of atoms in the atomspace. This can't possibly be more than a few days' worth of work. And bingo: you can now move data between R and the atomspace. I'm really pretty sure I saw this kind of API in R, somewhere.

--linas

On Sat, Jun 17, 2017 at 10:43 PM, Linas Vepstas wrote:

> 1) jupyter is probably great for tutorials, and is nice for collaboration,
> but is immature and inadequate if you want to use it as an actual diary for
> actual results (which is my personal use-case). So sure, install it now on
> the opencog website, and have at it. It's clearly some kind of wave of some
> kind of future. I doubt that sticking guile in it would be hard. Maybe we
> should ask Nala Ginrut about this.
>
> 2) I most emphatically did NOT say "put other tools inside of opencog". I
> said the exact opposite. Our data is in opencog. Put a pipeline between
> where the data is (in opencog) and these other tools.
>
> 3) the "guile shell" is wayyyyy too low-level for a data API. Using
> scheme and using atomese are like programming in assembly code. Don't do
> that. Instead I want to say things like "I have a sparse matrix, please do
> PCA or SVD or factor analysis or blah blah algo on my sparse matrix." I am
> willing to write a small shim (in guile or in c++ or maybe even ...
> shiver... python) that translates between their sparse-matrix format, and
> mine (which are EvaluationLinks). The key is that this shim must be
> small, simple, easy-to-write --- an afternoon, a few days, a week at most,
> or it's just not worth it.
>
> 4) I have no clue at all on how R accesses data. Can't you write some
> small shim, that allows R to reach into opencog to get the data it needs?
> How hard can this be? Surely lots of people do something like this, right?
>
> The point is that you are NOT creating a programming language API for
> opencog, which is hard to do.
> The point is that you are creating a pipeline
> to move data to and from opencog, which is a lot easier. Orders of
> magnitude easier. And is much closer to what you want, in the end, anyway.
>
> --linas
>
> On Sat, Jun 17, 2017 at 10:12 PM, Ben Goertzel wrote:
>
>> Hi Linas, thanks for the detailed thoughts and responses... I will
>> write an equally careful reply when I get time; or else I will return
>> to all this in detail when/if resources are obtained to pursue these
>> sorts of directions... (which I hope will be soon...)
>>
>> About Jupyter Notebooks, I note that Google is now using it to supply
>> new Tensorflow tutorials... I do think it would be great if we could
>> get our OpenCog tutorials working with Jupyter. Perhaps making
>> Jupyter support Guile is indeed the easiest way... it seems this
>> "shouldn't be hard" ... I agree Jupyter is not perfect, but it's a
>> lot nicer than embedding code in the wiki pages like we are now,
>> tutorial-wise...
>>
>> I agree that if we wanted to go with CL, we would want to test
>> Christian's CL approach thoroughly versus others before making any
>> sort of decision to undertake a lot of work...
>>
>> About using OpenCog in a data analysis context -- so for applications
>> like bio-data-analysis, or analyzing network data for a networking
>> company, it's clear there are non-OpenCog tools that will be used for
>> data preprocessing, for results evaluation, etc. Putting all these
>> other tools in Atomspace/OpenCog isn't viable. There are going to be
>> scripts that pass data thru various other things, then through
>> OpenCog/Atomspace, then through various other things.... For the
>> next 5 years at least, probably more.... So finding a great way to
>> support this kind of workflow is going to be important. Maybe the
>> Guile shell is a great way to support this kind of workflow.
>>
>> In the bioinformatics case it happens that the preponderance of needed
>> tools are coded in R, not sci-py ... we need e.g. the 100 or so
>> scripts in bioconductor (an R package). It is true that the heavy
>> lifting behind R packages is usually done in C++, but we need more
>> than just the heavy lifting, we need the bio-specific data-munging
>> code that is in the R code also.... So that is part of what I'm
>> wrestling with. How to build a framework that makes sense in an
>> OpenCog-universe context, and also makes sense in terms of the
>> existing ecosystem of bio-AI/informatics tools that we need OpenCog to
>> be used together with...
>>
>> Schafmeister is using a whole other set of tools for his
>> nanotech/cheminformatics work... C++ and FORTRAN not R.... But
>> genomics is in R these days...
>>
>> -- Ben
>>
>> On Sun, Jun 18, 2017 at 10:48 AM, Linas Vepstas wrote:
>> > Hi Ben,
>> >
>> > I like the general idea, but many particulars I don't like. Yes, we need
>> > something like this, but maybe there's an alternate route.
>> >
>> > TL;DR: Maybe we should pick the science tools first, and then ask which
>> > languages we should target.
>> >
>> > You met a really great salesman who convinced you that his CL product is the
>> > best, but I think you need to step back and get some bearings:
>> >
>> > -- why is his CL better than bigger, established CL projects?
>> > -- why is some other CL project better than some other scheme project?
>> > -- why not consider scala or haskell?
>> > -- are there science libraries for scala or haskell, or some other scheme or
>> > lisp?
>> > -- maybe it's easier to fix cython, and just use scipy with cython? assuming
>> > scipy is adequate for the science needs?
>> >
>> > The "autogenerating c++ bindings" is total sales bullhockey. Recall that
>> > cython, guile, swig, and two dozen other things were invented to
>> > "autogenerate C++ bindings".
>> > Even haskell has this ability, and you've seen
>> > how it took Roman just one afternoon to do this!
>> >
>> > --linas
>> >
>> > On Sat, Jun 10, 2017 at 11:11 PM, Ben Goertzel wrote:
>> >>
>> >> TL;DR -- particularly for use of OpenCog in bioinformatics and other
>> >> scientific-computing applications -- but also for other applications like
>> >> corpus linguistics where one deals with large datasets and wants to
>> >> combine OpenCog with other command-line tools --
>> >
>> > "with other sparse-data, big-data and graph processing tools" would be more
>> > appropriate.
>> >
>> >> it might be a good
>> >> idea to replace or supplement our current Guile shell with a wrapping-up
>> >> of OpenCog in Common Lisp …
>> >
>> > Here and elsewhere, you confuse scheme, guile and lisp. So
>> >
>> > 1) scheme is a dialect of lisp, so is common lisp.
>> >
>> > 2) guile is a specific implementation of scheme that allows scheme to be
>> > used from c++. There are very few implementations of scheme or lisp that
>> > allow this to take place.
>> >
>> > 3) Creating language bindings for opencog is a lot of work. Creating common
>> > lisp bindings would be roughly as much work as it is for guile, python or
>> > haskell.
>> >
>> >> and in particular in the CLASP Common
>> >> Lisp framework that Christian Schafmeister has developed.
>> >
>> > Personally, I am nervous about one-man-shows. The one-man has a historical
>> > tendency to get bored and move on to something else, and then everything
>> > stops and falls apart.
>> >
>> >> This would
>> >> also have some other smaller side-benefits like letting us exploit the CL
>> >> bindings for Jupyter Notebooks which would be cool for OpenCog
>> >> tutorials;
>> >
>> > I played with jupyter a fair amount, hoping to use it as a personal diary
>> > of the data analysis I do in opencog.
>> > Upshot: it's way cool, and very
>> > immature. It struggles to do even basic stuff, and you got to load on and
>> > configure all sorts of add-ons to get it to behave properly. So it's a cool
>> > idea, but still very immature and half-baked.
>> >
>> >> and making it easy to generate LISP bindings for ROS,
>> >
>> > those already exist but we fall into the age-old trap: exactly which
>> > variant of common lisp do you propose? I count seven major, 20 minor
>> > variants:
>> >
>> > http://www.cliki.net/Common+Lisp+Implementation
>> >
>> > it's like picking racket-scheme over guile-scheme over mzscheme over
>> > chicken-scheme.
>> >
>> > the point behind scheme is that it's "modern", fixing many of the "bugs" in
>> > lisp.
>> >
>> >> thus
>> >> avoiding the need to deal with ros-py for OpenCog robotics applications…
>> >>
>> >> Basic reasons are: CLASP is efficient at handling large datasets and
>> >> piping them from one place to another. It can also automatically
>> >> generate bindings for C++ code, which could be used to auto-generate
>> >> LISP bindings for ROS….
>> >
>> > Yeah, like total bullshit. Everyone has something called "FFI" (foreign
>> > function interface) and everyone always says that FFI makes it trivial to
>> > interface their programming language to c++ code. Which is half-way
>> > correct, for small and simple C/C++ libraries.
>> >
>> > This falls apart for something more complex, which is why things like SWIG
>> > and Cython and Guile get invented. There's got to be a dozen-or-two more
>> > of these "automatically generate bindings for c++ code" tools out there.
>> > They all work sort-of-ish OK. Up to a point, and then hell breaks loose.
>> >
>> > Opencog is clearly on the other side of that: otherwise Roman would have
>> > finished the haskell bindings in an afternoon, just by autogenerating them
>> > with FFI.
>> >
>> > "It's easy" is the program manager's swan song. What it really means is "I
>> > see a way of doing something"
>> >
>> >> He has some strong arguments as to why this is a better way than R or
>> >> python to get scientific computing done on large datasets….
>> >
>> > That is probably true. I don't get the impression that R or scipy are ready
>> > for large datasets.
>> >
>> >> One thing he found is that in many of his computational chemistry
>> >> analysis scripts, the bulk of compute time was getting taken up
>> >> passing around datasets and results between different C++ programs,
>> >> in R or python or whatever other glue language was being used…
>> >
>> > That must surely be very true.
>> >
>> > On the other hand, the atomspace is a kind-of database for holding datasets,
>> > so what is really needed is a way to have these tools talk directly to the
>> > data that is already there, in the atomspace. The point is: bring the tools
>> > to where the data is, don't try to cart the data around between the tools.
>> >
>> >> Via using Common LISP compiled into LLVM, he found this could be
>> >> worked around, because one could then script stuff in CL but have
>> >> efficient garbage collection done via LLVM …
>> >
>> > one can script stuff in scheme, and have efficient garbage collection with
>> > bdwgc ... oh wait, we already have that. Don't let curtis confuse you: if
>> > you do garbage collection once every sentence, then, yes, garbage collection
>> > will take up a large fraction of the time. Don't do garbage collection on
>> > every sentence, and the problem goes away.
>> >
>> >> My speculative line of thinking now is that it might be interesting to
>> >>
>> >> -- supplement or replace Guile with Clasp as a shell for working with
>> >> OpenCog
>> >
>> > Non-starter, for the above reasons.
>> > First, it will take just as much or more
>> > time than haskell, cython or guile took, and clasp is far more obscure, and
>> > has a far smaller user base.
>> >
>> >> -- Integrate C++ bio-analytics tools into Clasp,
>> >
>> > Or pick some other, more popular variant of lisp or scheme.
>> >
>> >> in a similar way to
>> >> what Schafmeister has been doing for chem-analytics tools
>> >>
>> >> -- Integrate R bio-analytics tools into Clasp as well, using the following
>> >> LLVM compiler for R
>> >
>> > Or integrate lisp into java, like clojure, or integrate Groovy or Scala or
>> > Jython or ....
>> >
>> > Besides Haskell, I think scala is the other interesting one to consider.
>> >
>> >> -- Use the Jupyter Notebooks wrapping for CL to make Jupyter tutorials
>> >> for OpenCog
>> >
>> > Jupyter + scheme would be the way to go. A lot easier, I'm guessing maybe
>> > 50 times less work.
>> >
>> >> (although, as a side note, integrating Guile with Jupyter Notebooks
>> >> would also not be impossible ... it would just be a bunch of work,
>> >> like any of this…)
>> >
>> > It would be like several orders of magnitude less work. Like maybe weeks,
>> > instead of half-year++ or year++
>> >
>> > That's it. I think that basically, you met a really great salesman who
>> > convinced you that his CL product is the best, but I think you need to step
>> > back and get some bearings:
>> >
>> > -- why is his CL better than bigger, established CL projects?
>> > -- why is some other CL project better than some other scheme project?
>> > -- why not consider scala or haskell?
>> > -- are there science libraries for scala or haskell, or some other scheme?
>> > -- maybe it's easier to fix cython, and just use scipy with cython? assuming
>> > scipy is adequate for the science needs?
>> >
>> > Are we putting the cart before the horse?
>> > Maybe we should pick the science
>> > tools first, and then ask which languages we should target?
>> >
>> > --
>> > You received this message because you are subscribed to the Google Groups
>> > "opencog" group.
>> > To unsubscribe from this group and stop receiving emails from it, send an
>> > email to [email protected].
>> > To post to this group, send email to [email protected].
>> > Visit this group at https://groups.google.com/group/opencog.
>> > To view this discussion on the web visit
>> > https://groups.google.com/d/msgid/opencog/CAHrUA37qymACH%3DYyVtAxuu0adk224_yL7dpsUb5An5KyD0v76Q%40mail.gmail.com.
>> >
>> > For more options, visit https://groups.google.com/d/optout.
>>
>> --
>> Ben Goertzel, PhD
>> http://goertzel.org
>>
>> "I am God! I am nothing, I'm play, I am freedom, I am life. I am the
>> boundary, I am the peak." -- Alexander Scriabin
