Meant to CC the opencog mailing list in the reply.

On Thu, Jun 25, 2020 at 2:14 AM Linas Vepstas <[email protected]> wrote:
> Hi Vignav,
>
> On Thu, Jun 25, 2020 at 12:57 AM <[email protected]> wrote:
>
>> Hi Linas,
>>
>> Thanks for your feedback. Quick question - are there any particular
>> files or links in opencog/generate that would be most helpful as a
>> reference/general code structure when building the language production
>> engine in Java?
>>
>
> I don't know how to answer that question, other than to say "all of it"
> which is not what you want to hear....
>
> The README explains in detail what the code does: it walks over a
> collection of "jigsaw puzzle pieces" and attempts to join them together --
> it does this in a breadth-first search -- that is, adding one piece at a
> time, always trying to attach to one of the existing unconnected
> connectors. This sounds easy, but is remarkably hard -- I had to create
> something I call an "odometer", to track the state of what has been tried
> and what has not been tried. It took me a while to get it to work right
> ... there were some tricky bits.. but now it works.
>
> So .. basically -- I'm saying -- gee, well, here is this complicated code
> .. do you really want to port that to java?
>
> There is also some interesting interplay with something called "the
> pattern engine" inside of the atomspace. Let me give a very short
> explanation. So .. say you have a function f(x) and you want to plug a
> value "b" into it, to get "f(b)". What you are "really doing" is connecting
> "f(x) <==> b" where f(x) is one "jigsaw puzzle piece connector" (say, the
> hole) and "b" is the jigsaw tab that plugs in. Painfully obvious, right?
>
> So, the pattern engine is currently a database search engine that has a
> bunch of "holes" aka "variables x,y,z..." and it searches for all
> "patterns" which can match those "holes" (and so the name "pattern
> matcher"). It vaguely resembles a "perl regex" and an "SQL query" and a
> "prolog inference step" all mashed up into one, and generalized for graph
> search.
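
A note for anyone sketching this in Java: the breadth-first joining of puzzle pieces described above might look roughly like the code below. It is a minimal sketch under stated assumptions, not the opencog/generate implementation. All names (JigsawSketch, Connector, Piece, State, generate, demo) are hypothetical. The sketch only ever attaches a new piece to the first open connector, never joins two already-open connectors to each other, ignores word order, and has no "odometer": a crude piece-count cap stands in for real state tracking.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; none of these names come from opencog/generate.
class JigsawSketch {

    // A connector mates with another of the same type and opposite direction.
    static final class Connector {
        final String type;
        final char dir; // '+' or '-'
        Connector(String type, char dir) { this.type = type; this.dir = dir; }
        boolean matesWith(Connector o) { return type.equals(o.type) && dir != o.dir; }
    }

    // A puzzle piece: a word plus connectors that must all end up joined.
    static final class Piece {
        final String word;
        final List<Connector> connectors;
        Piece(String word, Connector... cs) { this.word = word; this.connectors = Arrays.asList(cs); }
    }

    // A partial assembly: words placed so far and connectors still open.
    static final class State {
        final List<String> words;
        final List<Connector> open;
        State(List<String> words, List<Connector> open) { this.words = words; this.open = open; }
    }

    // Breadth-first: each step attaches one new piece to the first open connector.
    static List<List<String>> generate(Piece start, List<Piece> dict, int maxPieces) {
        List<List<String>> complete = new ArrayList<>();
        Deque<State> frontier = new ArrayDeque<>();
        frontier.add(new State(Arrays.asList(start.word), start.connectors));
        while (!frontier.isEmpty()) {
            State s = frontier.poll(); // FIFO order makes the search breadth-first
            if (s.open.isEmpty()) { complete.add(s.words); continue; }
            if (s.words.size() >= maxPieces) continue; // crude cap instead of an odometer
            Connector target = s.open.get(0);
            List<Connector> rest = s.open.subList(1, s.open.size());
            for (Piece p : dict) {
                for (int i = 0; i < p.connectors.size(); i++) {
                    if (!target.matesWith(p.connectors.get(i))) continue;
                    List<String> words = new ArrayList<>(s.words);
                    words.add(p.word);
                    List<Connector> open = new ArrayList<>(rest);
                    for (int j = 0; j < p.connectors.size(); j++) {
                        if (j != i) open.add(p.connectors.get(j)); // the new piece's other connectors
                    }
                    frontier.add(new State(words, open));
                }
            }
        }
        return complete;
    }

    // Toy dictionary in link-grammar style: saw{S-, O+}, John{S+}, Mary{O-}, Alice{S+}.
    static List<List<String>> demo() {
        Piece saw = new Piece("saw", new Connector("S", '-'), new Connector("O", '+'));
        List<Piece> dict = Arrays.asList(
            new Piece("John", new Connector("S", '+')),
            new Piece("Mary", new Connector("O", '-')),
            new Piece("Alice", new Connector("S", '+')));
        return generate(saw, dict, 5);
    }

    public static void main(String[] args) {
        // Words are listed in attachment order, not sentence order.
        System.out.println(demo()); // prints [[saw, John, Mary], [saw, Alice, Mary]]
    }
}
```

Starting from "saw" with open connectors S- and O+, the search first fills S- with each subject piece, then fills O+ with the only object piece, and reports an assembly once no connectors remain open. A real port would also need word ordering and joins between two open connectors, which this sketch leaves out.
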
>
> I have a plan to extend the pattern engine "real soon now", turning
> the "variables x,y,z" and "things that match them" into
> generalized "jigsaw puzzle piece connectors". This is a natural
> generalization, because the pattern engine already applies type-checking
> (so, for example, you can only plug an int into an int variable, a float
> into a float variable, an instance of "class foo" into something that takes
> "class foo" as an argument.... think of function signatures as you already
> know them in java.) So the pattern engine already knows how to plug
> "instances of things" into "fill-in-the-blanks" search queries, with
> type-checking (with a full type-theoretical type hierarchy and type
> constructors) ... I want to generalize that into arbitrary user-defined
> "connectors" that can "connect together" instead of "instances of objects"
> and "things that accept instances of objects".
>
> I want to do the above in parallel with the graph generator work, so that
> the two systems are inter-compatible. I think it's neat.
>
> Instead of thinking "jigsaw puzzle-pieces" you can think of biology
> analogs: bits of proteins that fit into other proteins. Immunoglobulin
> parts that can mate with, or stick to, other parts. RNA that sticks to
> DNA ... So, in a certain sense, I feel like I am re-inventing biochemistry:
> different "types" have different "affinities" for forming "chemical bonds".
>
> - Linas.
>
> p.s. I'm cc'ing the opencog mailing list as a way of saying "here's
> another way of explaining that sheaf thing I keep talking about".
>
> p.p.s. to avoid confusion between link-grammar links and atomspace links, I
> plan to rename the link-grammar links to "bonds". All the old
> language-learning code, and the generator code still calls them "links" but
> this is confusing. Of course, I cannot rename "link grammar" to "bond
> grammar", so that stays unchanged.
> But calling them "bonds" in analogy to "chemical bonds" gets across a
> better idea of the concept of attachment.
>
> I'm not the first: Eugene A Nida: "The Molecular Level of Lexical
> Semantics"
> https://www.academia.edu/36534355/The_Molecular_Level_of_Lexical_Semantics_by_EA_Nida
> *International Journal of Lexicography*, Volume 10, Issue 4, December
> 1997, Pages 265–274, https://doi.org/10.1093/ijl/10.4.265
>
>
>> Thanks,
>> Vignav
>>
>> On Sunday, June 7, 2020 at 1:48:34 AM UTC-7, Anton Kolonin @ Gmail wrote:
>>>
>>> Hi Linas, thank you for the guidance.
>>>
>>> >Again, I would like to remind you that there already is an existing
>>> project for language generation here:
>>> https://github.com/opencog/generate that is link-grammar compatible,
>>> and it already generates simple sentences from simple dictionaries.
>>>
>>> Cool, worth looking into that as a reference!
>>>
>>> >I would rather see work proceed on that, rather than all-new
>>> green-field development. There are major compatibility problems with coding
>>> in java. It's not a good language choice for these kinds of projects. --
>>> speaking from experience.
>>>
>>> In the given case, we are looking for a "Pure Java" solution so it can run
>>> under the Aigents framework on smartphones and JVM-enabled coffee machines
>>> ;-)
>>>
>>> We also need the grammar to be coupled with the ontology which is already
>>> present in the in-memory Java graph DB.
>>>
>>> Thanks,
>>>
>>> -Anton
>>>
>>>
>>> On 07/06/2020 11:04, Linas Vepstas wrote:
>>>
>>> Hi Vignav,
>>>
>>> Mailing lists work better if you actually subscribe to them. As it is,
>>> you will not receive any replies unless people explicitly CC you.
>>>
>>>
>>> On Sat, Jun 6, 2020 at 10:45 PM Vignav Ramesh <[email protected]> wrote:
>>>
>>>> Hi Linas & Amir,
>>>>
>>>> This is Vignav. I am working with Anton on the Java-based NL language
>>>> production for Aigents.
>>>>
>>>> I am taking a look at the LinkGrammar.java file in
>>>> bindings/java/org/linkgrammar but there does not seem to be much clear
>>>> documentation on how to use the LinkGrammar class and its methods after
>>>> downloading the file and importing it. Since it has native methods, I am
>>>> assuming it uses JNI and there is a process of integrating the C and
>>>> Java code. Is there any documentation on this that I can follow to sort
>>>> this all out and get the Java version working?
>>>>
>>>
>>> If you look at the java-jni directory, you will find the java jni
>>> bindings. You should spend some quality time reading the contents of
>>> java-jni/jni-client.h and java-jni/jni-client.c so that you can clearly
>>> understand what the jni bindings actually are. After all, this is open
>>> source, and part of what makes it open is that you can actually examine and
>>> explore it, instead of depending on a corporation to feed you one spoonful
>>> at a time.
>>>
>>> The jni bindings have corresponding java files. If you say `find
>>> bindings/java/org` you will see:
>>>
>>> org
>>> org/linkgrammar
>>> org/linkgrammar/Link.java
>>> org/linkgrammar/LGConfig.java
>>> org/linkgrammar/LGRemoteClient.java
>>> org/linkgrammar/ParseResult.java
>>> org/linkgrammar/LinkGrammar.java
>>> org/linkgrammar/JSONUtils.java
>>> org/linkgrammar/LGService.java
>>> org/linkgrammar/Linkage.java
>>> org/linkgrammar/JSONReader.java
>>>
>>> Hopefully it is painfully obvious what each file does, given its name: a
>>> configuration file, two json processing files, a file for working with
>>> parses, a file for working with linkages, a server file, a client file, and
>>> an API file.
>>>
>>> All of the documentation is in javadoc format. All you have to do is to
>>> run your favorite javadoc tool on it and you will have full and complete
>>> documentation for everything.
>>> Please keep in mind that many different kinds
>>> of documentation systems are compatible with javadoc, and so just about any
>>> system will produce documentation for you.
>>>
>>> Again, I would like to remind you that there already is an existing
>>> project for language generation here:
>>> https://github.com/opencog/generate that is link-grammar compatible,
>>> and it already generates simple sentences from simple dictionaries. I
>>> would rather see work proceed on that, rather than all-new green-field
>>> development. There are major compatibility problems with coding in java.
>>> It's not a good language choice for these kinds of projects. -- speaking
>>> from experience.
>>>
>>> -- Linas
>>>
>>>
>>>> Thanks,
>>>> Vignav
>>>>
>>>> On Sat, Jun 6, 2020 at 9:56 AM Linas Vepstas <[email protected]> wrote:
>>>>
>>>>> Hi Anton,
>>>>>
>>>>> On Fri, Jun 5, 2020 at 11:24 PM Anton Kolonin @ Gmail <[email protected]> wrote:
>>>>>
>>>>>> Hi Linas and Amir!
>>>>>>
>>>>>> We are going to try using the LG formalism for language production
>>>>>> in a Java project:
>>>>>> https://github.com/aigents/aigents-java/issues/22
>>>>>
>>>>>
>>>>> So, NL Generation is the goal of https://github.com/opencog/generate
>>>>> -- it already generates small sentences from simple vocabularies just
>>>>> fine. I have not attempted anything complex, maybe it will work and
>>>>> maybe it won't. It's alpha version 0.1.0 so many of the things I can
>>>>> think of that I want to have are absent.
>>>>>
>>>>> It's not Java.
>>>>>
>>>>> I think it would be awesome if you could work on that project, but I
>>>>> imagine that you would not want to, that you prefer green-field solutions
>>>>> written by your own people over which you have total control.
>>>>>
>>>>>
>>>>>> Note that we are not going to use LG to "parse" NL texts, we are
>>>>>> going to use it to "generate" NL texts (the opposite task but the
>>>>>> same formalism and the same dictionaries are to be used).
>>>>>>
>>>>>> Can one of you answer some questions?
>>>>>>
>>>>>> 1. What is the current location of the best-of-breed LG dictionaries
>>>>>> (for English and Russian in particular)?
>>>>>>
>>>>>
>>>>> Comes with LG
>>>>>
>>>>>> 2. What is the location of the most reliable code branch to read these
>>>>>> dictionaries?
>>>>>>
>>>>>
>>>>> Comes with LG
>>>>>
>>>>>> 3. Are there any known Java projects or pieces that can be re-used
>>>>>> under an OSS license?
>>>>>>
>>>>>
>>>>> Comes with LG
>>>>>
>>>>>> 4. Can we use this tutorial
>>>>>> (https://www.abisource.com/projects/link-grammar/api/index.html) to
>>>>>> make a Java implementation of the Link Grammar parser?
>>>>>>
>>>>>
>>>>> Yes, that is the official LG documentation.
>>>>>
>>>>>> 5. We rewrote the C++ code from the tutorial above in Java - but any
>>>>>> recommendation on what the Java substitute for #include
>>>>>> "link-includes.h" is? We know it has to do with the Java bindings in
>>>>>> the opencog repo but we are not totally sure how to use that.
>>>>>>
>>>>>
>>>>> LG already comes with java bindings that work in both local and remote
>>>>> mode, and also comes with two different java servers. One java server
>>>>> generates json and the other generates atomese.
>>>>>
>>>>> Depending on what API you want, include either
>>>>> bindings/java/org/linkgrammar/LinkGrammar.java or
>>>>> bindings/java/org/linkgrammar/LGRemoteClient.java. The README file
>>>>> explains how to use these.
>>>>>
>>>>> -- linas
>>>>>
>>>>> --
>>>>> cassette tapes - analog TV - film cameras - you
>>>>>
>>>>
>>>
>>> --
>>> cassette tapes - analog TV - film cameras - you
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "link-grammar" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/link-grammar/CAHrUA37h_f_9dLK2toQxyqXjEmDBcfst%2Bi-WqvpFUnCpD9d93w%40mail.gmail.com
>>>
>>> --
>>> -Anton Kolonin
>>> skype: akolonin
>>> cell:
>>> [email protected]
>>> https://aigents.com
>>> https://www.youtube.com/aigents
>>> https://www.facebook.com/aigents
>>> https://wt.social/wt/aigents
>>> https://medium.com/@aigents
>>> https://steemit.com/@aigents
>>> https://reddit.com/r/aigents
>>> https://twitter.com/aigents
>>> https://golos.in/@aigents
>>> https://vk.com/aigents
>>> https://aigents.com/en/slack.html
>>> https://www.messenger.com/t/aigents
>>> https://web.telegram.org/#/im?p=@AigentsBot
>>>

--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA370hfNmCL_fm0GSVzQhq-ShOGpzk%3DQU%3DX5GF%3Dv%2BLv6Hrw%40mail.gmail.com.
