I see your point, and for a simple use case your method works. But in my
case, I have the following requirements:
1. I have to return both the Atomese and the parsed JSON to the user.
2. I am running several different pattern-matching functions, aggregating
their outputs, and parsing the result as a whole. That's why I am using the
parser.
3. I have to create some links based on discovered patterns instead of
directly returning a JSON string. For example, we have this kind of function:
(define outputInteraction
  (lambda (gene)
    (cog-execute!
      (BindLink
        (VariableList
          (TypedVariable (VariableNode "$a") (Type 'GeneNode))
          (TypedVariable (VariableNode "$b") (Type 'GeneNode)))
        (And
          (EvaluationLink
            (PredicateNode "interacts_with")
            (ListLink gene (VariableNode "$a")))
          (EvaluationLink
            (PredicateNode "interacts_with")
            (ListLink (VariableNode "$a") (VariableNode "$b")))
          (EvaluationLink
            (PredicateNode "interacts_with")
            (ListLink gene (VariableNode "$b"))))
        (ExecutionOutputLink
          (GroundedSchemaNode "scm: generate-result")
          (ListLink (VariableNode "$a") (VariableNode "$b")))))))
And generate-result is something like:
(define (generate-result gene-a gene-b)
  (ListLink
    (EvaluationLink
      (PredicateNode "interacts_with")
      (ListLink gene-a gene-b))
    (node-info gene-b)
    (node-info gene-a)))
So based on the above points, I decided to write a custom parser.
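
For comparison, here is a minimal sketch in plain Guile of the edge format
discussed in this thread, with no AtomSpace dependency. The names edge->json
and edges->json are hypothetical helpers, not part of OpenCog; the inputs are
plain strings standing in for what (cog-name ...) would return, and no JSON
escaping is attempted.

```scheme
(use-modules (ice-9 format))

;; Hypothetical helper (not part of OpenCog): format one Cytoscape.js edge
;; from plain strings. Real code would pass (cog-name atom) for each field.
;; Assumption: the names contain no characters that need JSON escaping.
(define (edge->json source target name)
  (format #f
          "{\"data\": {\"source\": \"~A\", \"target\": \"~A\", \"name\": \"~A\", \"group\": \"edges\"}}"
          source target name))

;; Hypothetical helper: join several edges, e.g. the aggregated results of
;; more than one pattern-matching query, into a single JSON array.
;; Each edge is a list of the form (source target name).
(define (edges->json edges)
  (string-append
    "["
    (string-join (map (lambda (e) (apply edge->json e)) edges) ", ")
    "]"))

(display (edges->json '(("MAP2K4" "Uniprot:Q5U0B8" "expresses"))))
(newline)
```

Because edge->json returns a string instead of printing it, the caller can
keep the original atoms alongside the JSON, which is what requirement 1 asks
for.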
On Sunday, June 30, 2019 at 12:16:24 AM UTC+3, linas wrote:
>
> I mean, one very low-brow, trivial way to do it would be to write:
>
> (BindLink
>   (VariableList
>     (TypedVariable (Variable "SRC") (Type 'GeneNode))
>     (TypedVariable (Variable "TGT") (Type 'MoleculeNode))
>     (TypedVariable (Variable "XPS") (Type 'PredicateNode)))
>   ; what you are looking for
>   (Evaluation (Variable "XPS") (List (Variable "SRC") (Variable "TGT")))
>   ; what to do when you find it
>   (ExecutionOutput
>     (GroundedSchema "scm:print-stuff")
>     (List (Variable "SRC") (Variable "TGT") (Variable "XPS"))))
>
> ; and then define the printer:
>
> (define (print-stuff src tgt xps)
>   (format #t "{ \"data\": {\"source\": \"~A\", \"target\": \"~A\", \"name\": \"~A\", \"group\": \"edges\"}}"
>     (cog-name src) (cog-name tgt) (cog-name xps))
>   ; a return value
>   xps)
>
> I mean -- this is low-brow, simple, bordering on trite, but does what you
> want to do, for your example. There are other ways of doing this that are
> even simpler, but the above is a good demo. Maybe you need more
> sophisticated features, but the above is lots easier than trying to figure
> out LALR. I mean, knowing what LALR is and having experience with it is a
> "good thing", but it's overkill for this particular problem.
>
> --linas
>
> On Sat, Jun 29, 2019 at 3:58 PM Linas Vepstas <[email protected]> wrote:
>
>>
>>
>> On Sat, Jun 29, 2019 at 3:54 PM Xabush Semrie <[email protected]> wrote:
>>
>>>
>>>> why the heck would you need to "parse" atomese?
>>>>
>>>> What are you actually trying to do?
>>>
>>>
>>> I am converting it to JSON for graph visualization with Cytoscape.js for
>>> an annotation service. For example,
>>> (EvaluationLink
>>>   (PredicateNode "expresses")
>>>   (ListLink
>>>     (GeneNode "MAP2K4")
>>>     (MoleculeNode "Uniprot:Q5U0B8")))
>>>
>>> The above will be "parsed" into the following JSON:
>>> {
>>>   "data": {"source": "MAP2K4", "target": "Uniprot:Q5U0B8", "name": "expresses", "group": "edges"}
>>> }
>>>
>>>
>> Why not just dump directly from the atomspace?
>>
>>
>>>> Especially since it already comes with a built-in parser?
>>>
>>>
>>> Maybe I am confusing something here, but I didn't know any parser
>>> existed for my use case.
>>>
>>
>> ? Of course there is. It's called "the atomspace".
>>
>> --linas
>>
>>>
>>>
>>> On Saturday, June 29, 2019 at 11:43:55 PM UTC+3, linas wrote:
>>>>
>>>> Dumb question: why the heck would you need to "parse" atomese?
>>>> Especially since it already comes with a built-in parser? What are you
>>>> actually trying to do? --linas
>>>>
>>>> On Sat, Jun 29, 2019 at 10:04 AM Xabush Semrie <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have recently been working on an LALR parser to convert Atomese to
>>>>> JSON (code can be found here
>>>>> <https://github.com/Habush/annotation-scheme/blob/de66cd29c375321e5c7a14741a91c40ac40fb0b9/helpers/atomese-parser.scm#L98>).
>>>>>
>>>>> I initially used the same LALR parser generator that GHOST uses, found in
>>>>> the *(system base lalr)* module, with a similar lexer generator (in my
>>>>> case I precompiled the regex patterns for a performance gain). However, I
>>>>> was getting very bad performance, and it took far too long to parse
>>>>> moderately sized Atomese files. It didn't help that the module doesn't
>>>>> provide its own lexer generator, and in the GHOST code the regex patterns
>>>>> are not precompiled, which would further degrade performance. As a
>>>>> result, I started looking at alternatives and found the nyacc project.
>>>>>
>>>>> After rewriting the code using nyacc, I found that the nyacc parser
>>>>> generator is on average 5-6X faster than the previous parser generator
>>>>> (the one used by GHOST) for the same file. In addition to the performance
>>>>> improvement, it removes the need to provide a manually written lexer
>>>>> generator, supports mid-rule context actions for complicated
>>>>> production rules, has better debugging and "logging" capabilities, and
>>>>> (although minor) doesn't require listing all the terminal symbols. The
>>>>> project is also being actively developed.
>>>>>
>>>>> Hence, I deduced that the GHOST parser could benefit from the same
>>>>> performance improvements and thought I would share this here. I am happy
>>>>> to work on porting the LALR parser from the current one to nyacc if this
>>>>> gets traction.
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "opencog" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at https://groups.google.com/group/opencog.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/opencog/f3d23857-71b2-40a8-b99d-86249f9bd71a%40googlegroups.com
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>>>
>>>> --
>>>> cassette tapes - analog TV - film cameras - you
>>>>