Hi Xabush,

Performance-wise, I think it would be really nice to replace the LALR parser in GHOST with the nyacc parser, for the 5-6X performance gain plus better debugging and logging capabilities, if it's not too much work :)

On Sun, Jun 30, 2019 at 6:21 AM Linas Vepstas <[email protected]> wrote:

> Well, you are free to do whatever you want to do, but one of the points of
> having Atomese in the first place is to avoid having to go through such
> contortions. Why do you need the outputInteraction function? ... the only
> reason I can see for "(GroundedSchemaNode "scm: generate-result")" is
> because you are trying to wrap "node-info" -- what does node-info do? Can
> you do it directly in the atomspace? Why write code in scheme?
>
> I mean, yes, I write bucket-loads of scheme all the time, to get things
> done, but there's always the meta-question -- why, and how can it be made
> simpler? The long-run goal is to eventually replace all scheme and python
> code with declarative Atomese that "does the same thing" - this is
> impossible in the short-run, but, in the back of your mind, always think
> "how can this be coded in a declarative manner?" instead of thinking of
> "how can I code this in a functional manner" or "a procedural manner" or
> "an OO style"?
>
> --linas
>
> On Sat, Jun 29, 2019 at 4:29 PM Xabush Semrie <[email protected]> wrote:
>
>> I see your point. And for a simple use-case your method works. But in my
>> case, I have the following requirements:
>>
>> 1. I have to return both the atomese and the parsed JSON to the user.
>>
>> 2. I am running several different pattern matching functions, aggregating
>> their outputs, and parsing the result as a whole. That's why I am using
>> the parser.
>>
>> 3. I have to create some links based on discovered patterns instead of
>> directly returning a JSON string. For example, we have this kind of function:
>>
>> (define outputInteraction
>>   (lambda (gene)
>>     (cog-execute!
>>       (BindLink
>>         (VariableList
>>           (TypedVariable (VariableNode "$a") (Type 'GeneNode))
>>           (TypedVariable (VariableNode "$b") (Type 'GeneNode)))
>>
>>         (And
>>           (EvaluationLink
>>             (PredicateNode "interacts_with")
>>             (ListLink
>>               gene
>>               (VariableNode "$a")))
>>
>>           (EvaluationLink
>>             (PredicateNode "interacts_with")
>>             (ListLink
>>               (VariableNode "$a")
>>               (VariableNode "$b")))
>>
>>           (EvaluationLink
>>             (PredicateNode "interacts_with")
>>             (ListLink
>>               gene
>>               (VariableNode "$b"))))
>>         (ExecutionOutputLink
>>           (GroundedSchemaNode "scm: generate-result")
>>           (ListLink
>>             (VariableNode "$a")
>>             (VariableNode "$b")))))))
>>
>> And generate-result is something like
>>
>> (define (generate-result gene-a gene-b)
>>   (ListLink
>>     (EvaluationLink
>>       (PredicateNode "interacts_with")
>>       (ListLink gene-a gene-b))
>>     (node-info gene-b)
>>     (node-info gene-a)))
>>
>> So based on the above points, I decided to write a custom parser.
>>
>> On Sunday, June 30, 2019 at 12:16:24 AM UTC+3, linas wrote:
>>>
>>> I mean, one very low-brow, trivial way to do it would be to write:
>>>
>>> (BindLink
>>>   (VariableList
>>>     (TypedVariable (Variable "SRC") (Type 'GeneNode))
>>>     (TypedVariable (Variable "TGT") (Type 'MoleculeNode))
>>>     (TypedVariable (Variable "XPS") (Type 'PredicateNode)))
>>>   ; what you are looking for
>>>   (Evaluation (Variable "XPS") (List (Variable "SRC") (Variable "TGT")))
>>>   ; what to do when you find it
>>>   (ExecutionOutput
>>>     (GroundedSchema "scm:print-stuff")
>>>     (List (Variable "SRC") (Variable "TGT") (Variable "XPS"))))
>>>
>>> ; and then define the printer:
>>>
>>> (define (print-stuff src tgt xps)
>>>   (format #t
>>>     "{ \"data\": {\"source\": \"~A\", \"target\": \"~A\", \"name\": \"~A\", \"group\": \"edges\"}}"
>>>     (cog-name src) (cog-name tgt) (cog-name xps))
>>>   ; a return value
>>>   xps)
>>>
>>> I mean -- this is low-brow, simple, bordering on trite, but does what
>>> you want to do, for your example.
>>> There are other ways of doing this that
>>> are even simpler, but the above is a good demo. Maybe you need more
>>> sophisticated features, but the above is lots easier than trying to figure
>>> out LALR. I mean, knowing what LALR is and having experience with it is a
>>> "good thing", but it's overkill for this particular problem.
>>>
>>> --linas
>>>
>>> On Sat, Jun 29, 2019 at 3:58 PM Linas Vepstas <[email protected]>
>>> wrote:
>>>
>>>> On Sat, Jun 29, 2019 at 3:54 PM Xabush Semrie <[email protected]>
>>>> wrote:
>>>>
>>>>> why the heck would you need to "parse" atomese?
>>>>>
>>>>> What are you actually trying to do?
>>>>>
>>>>> I am converting it to JSON for graph visualization with Cytoscape.js
>>>>> for an annotation service. For example,
>>>>> (EvaluationLink
>>>>>   (PredicateNode "expresses")
>>>>>   (ListLink
>>>>>     (GeneNode "MAP2K4")
>>>>>     (MoleculeNode "Uniprot:Q5U0B8")))
>>>>>
>>>>> The above will be "parsed" into the following JSON
>>>>> {
>>>>>   "data": {"source": "MAP2K4", "target": "Uniprot:Q5U0B8", "name": "expresses", "group": "edges"}
>>>>> }
>>>>>
>>>> Why not just dump directly from the atomspace?
>>>>
>>>>> Especially since it already comes with a built-in parser?
>>>>>
>>>>> Maybe I am confusing something here, but I didn't know any parser
>>>>> existed for my use case.
>>>>>
>>>> ? Of course there is. It's called "the atomspace".
>>>>
>>>> --linas
>>>>
>>>>> On Saturday, June 29, 2019 at 11:43:55 PM UTC+3, linas wrote:
>>>>>>
>>>>>> Dumb question: why the heck would you need to "parse" atomese?
>>>>>> Especially since it already comes with a built-in parser? What are you
>>>>>> actually trying to do?
>>>>>> --linas
>>>>>>
>>>>>> On Sat, Jun 29, 2019 at 10:04 AM Xabush Semrie <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have recently been working on an LALR parser to parse atomese to
>>>>>>> JSON (code can be found here
>>>>>>> <https://github.com/Habush/annotation-scheme/blob/de66cd29c375321e5c7a14741a91c40ac40fb0b9/helpers/atomese-parser.scm#L98>).
>>>>>>> I initially used the same LALR parser generator used by GHOST, found in
>>>>>>> the *(system base lalr)* module, with a similar lexer generator (in my
>>>>>>> case I precompiled the regex patterns for a performance gain). However,
>>>>>>> I was getting very bad performance and it took way too long to parse
>>>>>>> moderately sized atomese files. It didn't help that the module didn't
>>>>>>> provide its own lexer generator and, in the case of the GHOST code, the
>>>>>>> regex patterns were not precompiled, which would further degrade the
>>>>>>> performance. As a result, I started looking at alternatives and found
>>>>>>> the nyacc project.
>>>>>>>
>>>>>>> After rewriting the code using nyacc, I found that the nyacc parser
>>>>>>> generator is on average 5-6X faster than the previous parser generator
>>>>>>> (which is used by GHOST) for the same file. In addition to the
>>>>>>> performance improvement, it removes the need to provide a manually
>>>>>>> written lexer generator, has support for mid-rule context actions for
>>>>>>> complicated production rules, has better debugging and "logging"
>>>>>>> capabilities and (although minor) doesn't require listing all the
>>>>>>> terminal symbols. The project is also being actively developed.
>>>>>>>
>>>>>>> Hence, I deduced the GHOST parser could also benefit from the same
>>>>>>> performance improvements and thought of sharing this here. I am happy
>>>>>>> to work on porting the LALR parser from the current one to nyacc if
>>>>>>> this gets traction.
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "opencog" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> To post to this group, send email to [email protected].
>>>>>>> Visit this group at https://groups.google.com/group/opencog.
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/opencog/f3d23857-71b2-40a8-b99d-86249f9bd71a%40googlegroups.com.
>>>>>>> For more options, visit https://groups.google.com/d/optout.
