I mean for R -- seriously, this cannot possibly be hard.  I used R for a
few days, almost a week, and got the impression it was mostly vectors and
arrays.   I'm pretty sure that in R, there is a C/C++ wrapper, where you
can create a "special" R vector, or R array, and whenever you access item k
in the vector, or item i,j,k in the array, that wrapper calls your C++
code, and the only thing your C++ code has to do is to return a float or an
int or a string.

For us, under the covers that C++ code just gets or sets some truth-value
(or other value) on some pile of atoms in the atomspace.  This can't
possibly be more than a few days' worth of work.  And bingo: you can now
move data between R and the atomspace.
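
To make the shape of that shim concrete, here is a minimal sketch. It is in
Python, not R or C++, and it does not use any real R or atomspace API:
FakeAtomStore, LazyVector, get_value and set_value are all made-up names for
illustration. The only point is that the "vector" holds no data of its own;
every read or write of item k is forwarded to the store.

```python
# Hypothetical sketch: none of these names are real R, Rcpp or AtomSpace APIs.

class FakeAtomStore:
    """Stand-in for the atomspace: maps an atom name to a numeric value."""

    def __init__(self):
        self._values = {}

    def get_value(self, atom):
        # Atoms with no stored value read back as 0.0, like a sparse vector.
        return self._values.get(atom, 0.0)

    def set_value(self, atom, value):
        self._values[atom] = float(value)


class LazyVector:
    """Looks like a vector, but every element access goes to the store."""

    def __init__(self, store, atoms):
        self._store = store
        self._atoms = list(atoms)  # position k -> atom name

    def __len__(self):
        return len(self._atoms)

    def __getitem__(self, k):
        # An out-of-range k raises IndexError, which also ends iteration.
        return self._store.get_value(self._atoms[k])

    def __setitem__(self, k, value):
        self._store.set_value(self._atoms[k], value)


store = FakeAtomStore()
vec = LazyVector(store, ["cat", "dog", "fish"])
vec[1] = 0.75      # writes through to the store
print(list(vec))   # reads each element back from the store
```

A real version would swap FakeAtomStore for calls into the atomspace and
expose the same get/set pair through whatever wrapper R provides.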

I'm really pretty sure I saw this kind of API in R, somewhere.

--linas

On Sat, Jun 17, 2017 at 10:43 PM, Linas Vepstas <[email protected]>
wrote:

>
> 1) jupyter is probably great for tutorials, and is nice for collaboration,
> but is immature and inadequate if you want to use it as an actual diary for
> actual results (which is my personal use-case).  So sure, install it now on
> the opencog website, and have at it.  It's clearly some kind of wave of some
> kind of future.  I doubt that sticking guile in it would be hard. Maybe we
> should ask Nala Ginrut about this.
>
> 2)  I most emphatically did NOT say "put other tools inside of opencog". I
> said the exact opposite.  Our data is in opencog. Put a pipeline between
> where the data is: in opencog, and these other tools.
>
> 3) the "guile shell" is wayyyyy too low-level for a data API. Using
> scheme and using atomese are like programming in assembly code.  Don't do
> that.  Instead I want to say things like "I have a sparse matrix, please do
> PCA or SVD or factor analysis or blah blah algo on my sparse matrix."  I am
> willing to write a small shim (in guile or in c++ or maybe even ...
> shiver... python) that translates between their sparse-matrix format, and
> mine (which are EvaluationLinks).  The key is that this shim must be
> small, simple, easy-to-write --- an afternoon, a few days, a week at most,
> or it's just not worth it.
>
> 4) I have no clue at all on how R accesses data. Can't you write some
> small shim, that allows R to reach into opencog to get the data it needs?
> How hard can this be?  Surely lots of people do something like this, right?
>
> The point is that you are NOT creating a programming language API for
> opencog, which is hard to do. The point is that you are creating a pipeline
> to move data to and from opencog, which is a lot easier: orders of
> magnitude easier.  And is much closer to what you want, in the end, anyway.
>
> --linas
>
>
>
>
> On Sat, Jun 17, 2017 at 10:12 PM, Ben Goertzel <[email protected]> wrote:
>
>> Hi Linas, thanks for the detailed thoughts and responses...  I will
>> write an equally careful reply when I get time; or else I will return
>> to all this in detail when/if resources are obtained to pursue these
>> sorts of directions... (which I hope will be soon...)
>>
>> About Jupyter Notebooks, I note that Google is now using it to supply
>> new Tensorflow tutorials...   I do think it would be great if we could
>> get our OpenCog tutorials working with Jupyter.  Perhaps making
>> Jupyter support Guile is indeed the easiest way... it seems this
>> "shouldn't be hard" ...   I agree Jupyter is not perfect, but it's a
>> lot nicer than embedding code in the wiki pages like we are now,
>> tutorial-wise...
>>
>> I agree that if we wanted to go with CL, we would want to test
>> Christian's CL approach thoroughly versus others before making any
>> sort of decision to undertake a lot of work...
>>
>> About using OpenCog in a data analysis context -- so for applications
>> like bio-data-analysis, or analyzing network data for a networking
>> company, it's clear there are non-OpenCog tools that will be used for
>> data preprocessing, for results evaluation, etc.   Putting all these
>> other tools in Atomspace/OpenCog isn't viable.   There are going to be
>> scripts that pass data thru various other things, then through
>> OpenCog/Atomspace, then through various other things....   For the
>> next 5 years at least, probably more....  So finding a great way to
>> support this kind of workflow is going to be important.   Maybe the
>> Guile shell is a great way to support this kind of workflow.
>>
>> In the bioinformatics case it happens that the preponderance of needed
>> tools are coded in R, not sci-py ... we need e.g. the 100 or so
>> scripts in Bioconductor (a collection of R packages).   It is true that
>> the heavy lifting behind R packages is usually done in C++, but we need more
>> than just the heavy lifting, we need the bio-specific data-munging
>> code that is in the R code also....  So that is part of what I'm
>> wrestling with.  How to build a framework that makes sense in an
>> OpenCog-universe context, and also makes sense in terms of the
>> existing ecosystem of bio-AI/informatics tools that we need OpenCog to
>> be used together with...
>>
>> Schafmeister is using a whole other set of tools for his
>> nanotech/cheminformatics work... C++ and FORTRAN not R....   But
>> genomics is in R these days...
>>
>> -- Ben
>>
>>
>>
>> On Sun, Jun 18, 2017 at 10:48 AM, Linas Vepstas <[email protected]>
>> wrote:
>> > Hi Ben,
>> >
>> > I like the general idea, but many particulars I don't like.  Yes, we need
>> > something like this, but maybe there's an alternate route.
>> >
>> > TL;DR:  Maybe we should pick the science tools first, and then ask which
>> > languages we should target.
>> >
>> > You met a really great salesman who convinced you that his CL product is
>> > the best, but I think you need to step back and get some bearings:
>> >
>> > -- why is his CL better than bigger, established CL projects?
>> > -- why is some other CL project better than some other scheme project?
>> > -- why not consider scala or haskell?
>> > -- are there science libraries for scala or haskell, or some other scheme
>> > or lisp?
>> > -- maybe it's easier to fix cython, and just use scipy with cython?
>> > assuming scipy is adequate for the science needs?
>> >
>> > The "autogenerating c++ bindings" is total sales bullhockey.  Recall that
>> > cython, guile, swig, and two dozen other things were invented to
>> > "autogenerate C++ bindings".  Even haskell has this ability, and you've
>> > seen how it took Roman just one afternoon to do this!
>> >
>> > --linas
>> >
>> >
>> > On Sat, Jun 10, 2017 at 11:11 PM, Ben Goertzel <[email protected]> wrote:
>> >>
>> >>
>> >>
>> >> TL;DR — particularly for use of OpenCog in bioinformatics and other
>> >>
>> >> scientific-computing applications — but also for other applications
>> like
>> >>
>> >> corpus linguistics where one deals with large datasets and wants to
>> >>
>> >> combine OpenCog with other command-line tools —
>> >
>> >
>> > "with other sparse-data, big-data and graph processing tools" would be
>> > more appropriate.
>> >
>> >>
>> >> it might be a good
>> >>
>> >> idea to replace or supplement our current Guile shell with a wrapping-up
>> >>
>> >> of OpenCog in Common Lisp …
>> >
>> >
>> > Here and elsewhere, you confuse scheme, guile and lisp.  So
>> >
>> > 1) scheme is a dialect of lisp, so is common lisp.
>> >
>> > 2) guile is a specific implementation of scheme that allows scheme to be
>> > used from c++.  There are very few implementations of scheme or lisp that
>> > allow this to take place.
>> >
>> > 3) Creating language bindings for opencog is a lot of work. Creating
>> > common lisp bindings would be roughly as much work as it is for guile,
>> > python or haskell.
>> >
>> >>
>> >> and in particular in the CLASP Common
>> >>
>> >> Lisp framework that Christian Schafmeister has developed.
>> >
>> >
>> > Personally, I am nervous about one-man shows.  The one man has a
>> > historical tendency to get bored and move on to something else, and then
>> > everything stops and falls apart.
>> >
>> >>
>> >>  This would
>> >>
>> >> also have some other smaller side-benefits like letting us exploit the CL
>> >>
>> >> bindings for Jupyter Notebooks which would be cool for OpenCog
>> >>
>> >> tutorials;
>> >
>> >
>> > I played with jupyter a fair amount, hoping to use it as a personal diary
>> > of the data analysis I do in opencog.  Upshot: it's way cool, and very
>> > immature.  It struggles to do even basic stuff, and you've got to load up
>> > and configure all sorts of add-ons to get it to behave properly. So it's a
>> > cool idea, but still very immature and half-baked.
>> >
>> >>
>> >> and making it easy to generate LISP bindings for ROS,
>> >
>> >
>> > those already exist, but we fall into the age-old trap: exactly which
>> > variant of common lisp do you propose?  I count seven major, 20 minor
>> > variants:
>> >
>> > http://www.cliki.net/Common+Lisp+Implementation
>> >
>> > it's like picking racket-scheme over guile-scheme over mzscheme over
>> > chicken-scheme.
>> >
>> > the point behind scheme is that it's "modern", fixing many of the "bugs"
>> > in lisp.
>> >
>> >
>> >>
>> >> thus
>> >>
>> >> avoiding the need to deal with ros-py for OpenCog robotics applications…
>> >>
>> >>
>> >> Basic reasons are: CLASP is efficient at handling large datasets and
>> >>
>> >> piping them from one place to another.   It can also automatically
>> >>
>> >> generate bindings for C++ code, which could be used to auto-generate
>> >>
>> >> LISP bindings for ROS….
>> >
>> >
>> > Yeah, like total bullshit.  Everyone has something called "FFI" (foreign
>> > function interface) and everyone always says that FFI makes it trivial to
>> > interface their programming language to c++ code.  Which is half-way
>> > correct, for small and simple C/C++ libraries.
>> >
>> > This falls apart for something more complex, which is why things like SWIG
>> > and Cython and Guile get invented.   There's got to be a dozen-or-two more
>> > of these "automatically generate bindings for c++ code" tools out there.
>> > They all work sort-of-ish OK.  Up to a point, and then hell breaks loose.
>> >
>> > Opencog is clearly on the other side of that: otherwise Roman would have
>> > finished the haskell bindings in an afternoon, just by autogenerating them
>> > with FFI.
>> >
>> > "It's easy" is the program manager's swan song. What it really means is
>> > "I see a way of doing something".
>> >
>> >>
>> >>
>> >>
>> >> He has some strong arguments as to why this is a better way than R or
>> >>
>> >> python to get scientific computing done on large datasets….
>> >
>> >
>> > That is probably true. I don't get the impression that R or scipy are
>> > ready for large datasets.
>> >>
>> >>
>> >>
>> >> One thing he found is that in many of his computational chemistry
>> >>
>> >> analysis scripts, the bulk of compute time was getting taken up
>> >>
>> >> passing around datasets and results between different C++ programs,
>> >>
>> >> in R or python or whatever other glue language was being used…
>> >
>> >
>> > That must surely be very true.
>> >
>> > On the other hand, the atomspace is a kind-of database for holding
>> > datasets, so what is really needed is a way to have these tools talk
>> > directly to the data that is already there, in the atomspace.  The point
>> > is: bring the tools to where the data is, don't try to cart the data around
>> > between the tools.
>> >>
>> >>
>> >>
>> >> Via using Common LISP compiled into LLVM, he found this could be
>> >>
>> >> worked around, because one could then script stuff in CL but have
>> >>
>> >> efficient garbage collection done via LLVM …
>> >
>> >
>> > one can script stuff in scheme, and have efficient garbage collection with
>> > bdwgc ... oh wait, we already have that.  Don't let Curtis confuse you: if
>> > you do garbage collection once every sentence, then, yes, garbage
>> > collection will take up a large fraction of the time.  Don't do garbage
>> > collection on every sentence, and the problem goes away.
>> >>
>> >>
>> >>
>> >> My speculative line of thinking now is that it might be interesting to
>> >>
>> >>
>> >> -- supplement or replace Guile with Clasp as a shell for working with
>> >> OpenCog
>> >
>> >
>> > Non-starter, for the above reasons. First, it will take just as much or
>> > more time than haskell, cython or guile took, and clasp is far more
>> > obscure, and has a far smaller user base.
>> >>
>> >>
>> >>
>> >> -- Integrate C++ bio-analytics tools into Clasp,
>> >
>> >
>> > Or pick some other, more popular variant of lisp or scheme.
>> >
>> >>
>> >> in a similar way to
>> >>
>> >> what Schafmeister has been doing for chem-analytics tools
>> >>
>> >>
>> >> — Integrate R bio-analytics tools into Clasp as well, using the
>> following
>> >>
>> >> LLVM compiler for R
>> >
>> >
>> > Or integrate lisp into java, like clojure, or integrate Groovy or Scala or
>> > Jython or ....
>> >
>> > Besides Haskell, I think scala is the other interesting one to consider.
>> >>
>> >>
>> >>
>> >> — Use the Jupyter Notebooks wrapping for CL to make Jupyter tutorials
>> >>
>> >> for OpenCog
>> >
>> >
>> >
>> > Jupyter + scheme would be the way to go. A lot easier, I'm guessing maybe
>> > 50 times less work.
>> >
>> >>
>> >>
>> >>
>> >> (although, as a side note, integrating Guile with Jupyter Notebooks
>> >>
>> >> would also not be impossible ... it would just be a bunch of work,
>> >>
>> >> like any of this…)
>> >
>> >
>> > It would be like several orders of magnitude less work.  Like maybe weeks,
>> > instead of half-year++ or year++
>> >
>> >
>> > That's it. I think that basically, you met a really great salesman who
>> > convinced you that his CL product is the best, but I think you need to
>> > step back and get some bearings:
>> >
>> > -- why is his CL better than bigger, established CL projects?
>> > -- why is some other CL project better than some other scheme project?
>> > -- why not consider scala or haskell?
>> > -- are there science libraries for scala or haskell, or some other scheme?
>> > -- maybe it's easier to fix cython, and just use scipy with cython?
>> > assuming scipy is adequate for the science needs?
>> >
>> > Are we putting the cart before the horse? Maybe we should pick the science
>> > tools first, and then ask which languages we should target?
>> >
>> >
>> >
>> >>
>> >>
>> >
>> >
>> > --
>> > You received this message because you are subscribed to the Google
>> Groups
>> > "opencog" group.
>> > To unsubscribe from this group and stop receiving emails from it, send
>> an
>> > email to [email protected].
>> > To post to this group, send email to [email protected].
>> > Visit this group at https://groups.google.com/group/opencog.
>> > To view this discussion on the web visit
>> > https://groups.google.com/d/msgid/opencog/CAHrUA37qymACH%3DY
>> yVtAxuu0adk224_yL7dpsUb5An5KyD0v76Q%40mail.gmail.com.
>> >
>> > For more options, visit https://groups.google.com/d/optout.
>>
>>
>>
>> --
>> Ben Goertzel, PhD
>> http://goertzel.org
>>
>> "I am God! I am nothing, I'm play, I am freedom, I am life. I am the
>> boundary, I am the peak." -- Alexander Scriabin
>>
>>
>
>

