[julia-users] Re: PEG Parser

Abe Schneider Thu, 05 Jun 2014 03:57:07 -0700

I also forgot to push the changes last night.


On Wednesday, June 4, 2014 11:01:33 PM UTC-4, Abe Schneider wrote:
>
> After playing around with a bunch of alternatives, I think I've come up 
> with decent action semantics:
>
> @transform <name> begin
>  <label> = <action>
> end
>
> For example, a simple graph grammar might look like:
>
> @grammar nodetest begin
>   start = +node_def
>   node_def = node_label + node_name + lbrace + data + rbrace
>   node_name = string_value + space
>
>   data = *(line + semicolon)
>   line = string_value + space
>   string_value = r"[_a-zA-Z][_a-zA-Z0-9]*"
>
>   lbrace = "{" + space
>   rbrace = "}" + space
>   semicolon = ";" + space
>   node_label = "node" + space
>   space = r"[ \t\n]*"
> end
>
> with its actions to create some data structure:
>
> type MyNode
>   name
>   values
>
>   function MyNode(name, values)
>     new(name, values)
>   end
> end
>
>
> with:
> @transform tograph begin
>   # ignore these
>   lbrace = nothing
>   rbrace = nothing
>   semicolon = nothing
>   node_label = nothing
>   space = nothing
>
>   # special action so we don't have to define every label
>   default = children
>
>   string_value = node.value
>   value = node.value
>   line = children
>   data = MyNode("", children)
>   node_def = begin
>     local name = children[1]
>     local cnode = children[2]
>     cnode.name = name
>     return cnode
>   end
> end
>
> and finally, to apply the transform:
>
> (ast, pos, error) = parse(nodetest, data)
> result = apply(tograph, ast)
> println(result)    # {MyNode("foo",{"a","b"}),MyNode("bar",{"c","d"})}
>
> The magic in '@transform' basically just creates the dictionary like 
> before, but automatically wraps the expression on the RHS as an anonymous 
> function (node, children) -> expr.
>
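To make the wrapping concrete, here is a minimal sketch of a macro that does roughly what is described above: it turns each `label = action` line into a dictionary entry whose value is `(node, children) -> action`. The name `@transform_sketch` and everything inside it are assumptions for illustration, written for current Julia; this is not the code in PEGParser.

```julia
# Hypothetical sketch, not PEGParser's actual @transform.
macro transform_sketch(name, block)
    entries = Expr[]
    for stmt in block.args
        # Skip line-number nodes and anything that is not `label = action`.
        (stmt isa Expr && stmt.head == :(=)) || continue
        label = string(stmt.args[1])
        action = stmt.args[2]
        # Wrap the right-hand side as an anonymous function of (node, children).
        push!(entries, :($label => ((node, children) -> $action)))
    end
    # Escape the whole expression so `node` and `children` stay visible to the action.
    return esc(:($name = Dict{String,Function}($(entries...))))
end

@transform_sketch demo begin
    ignored = nothing
    passthrough = children
    upper = uppercase(node)
end
```

With this, `demo["upper"]("abc", [])` calls the wrapped expression with `node` bound to `"abc"`.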
> I'm currently looking for a better name than 'children', as it's 
> potentially confusing and misleading. It's actually the values of the child 
> nodes (as opposed to node.children). Maybe cvalues?
>
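The distinction between child values and child nodes can be sketched as a bottom-up walk. The node type, its field names, and `apply_sketch` below are hypothetical, not the package's actual definitions. Dropping values that are `nothing` is an assumption inferred from the example above, where `children[1]` in `node_def` is the name even though `node_label` comes first in the rule.

```julia
# Hypothetical AST node; the field names are assumptions.
struct SketchNode
    name::String
    value::String
    children::Vector{SketchNode}
end

function apply_sketch(actions::Dict, node::SketchNode)
    # Transform the children first and collect their *values*,
    # skipping any child whose action returned `nothing`.
    cvalues = Any[]
    for child in node.children
        v = apply_sketch(actions, child)
        v === nothing || push!(cvalues, v)
    end
    # Fall back to the "default" action when the label has no entry.
    action = get(actions, node.name, actions["default"])
    return action(node, cvalues)
end

actions = Dict{String,Function}(
    "default" => (node, children) -> children,
    "space" => (node, children) -> nothing,
    "string_value" => (node, children) -> node.value,
)

leaf(name, value) = SketchNode(name, value, SketchNode[])
line = SketchNode("line", "", [leaf("string_value", "a"), leaf("space", " ")])
result = apply_sketch(actions, line)
```

Here the action for `line` receives the already-transformed values of its children, while `node.children` would still hold the two untransformed child nodes.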
> On Sunday, May 25, 2014 10:28:45 PM UTC-4, Abe Schneider wrote:
>>
>> I wrote a quick PEG Parser for Julia with Packrat capabilities:
>>
>> https://github.com/abeschneider/PEGParser
>>
>> It's a first draft and needs a ton of work, testing, etc., but if this is 
>> of interest to anyone else, here is a quick description.
>>
>> Grammars can be defined using most of the standard EBNF syntax. For 
>> example, a simple math grammar can be defined as:
>>
>> @grammar mathgrammar begin
>>
>>   start = expr
>>   number = r"([0-9]+)"
>>   expr = (term + op1 + expr) | term
>>   term = (factor + op2 + term) | factor
>>   factor = number | pfactor
>>   pfactor = ('(' + expr + ')')
>>   op1 = '+' | '-'
>>   op2 = '*' | '/'
>> end
>>
>>
>>
>> To parse a string with the grammar:
>>
>> (node, pos, error) = parse(mathgrammar, "5*(2-6)")
>>
>> This will create an AST which can then be transformed to a value. 
>> Currently this is accomplished by doing:
>>
>> math = Dict()
>>
>> math["number"] = (node, children) -> float(node.value)
>> math["expr"] = (node, children) ->
>>     length(children) == 1 ? children : eval(Expr(:call, children[2], children[1], children[3]))
>> math["factor"] = (node, children) -> children
>> math["pfactor"] = (node, children) -> children[2]
>> math["term"] = (node, children) ->
>>     length(children) == 1 ? children : eval(Expr(:call, children[2], children[1], children[3]))
>> math["op1"] = (node, children) -> symbol(node.value)
>> math["op2"] = (node, children) -> symbol(node.value)
>>
>>
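For readers unfamiliar with the `eval(Expr(:call, ...))` idiom used in the `expr` and `term` actions, here is a small standalone illustration in current Julia. The variable names are made up for the example, and the `getfield` variant is only a possible alternative that avoids `eval`, not something the package does.

```julia
# The operator action yields a Symbol; the operands are already numbers.
op, lhs, rhs = Symbol("*"), 5.0, -4.0

# Build the call expression with operator `*` and the two operands, then evaluate it.
call = Expr(:call, op, lhs, rhs)
value = eval(call)

# Possible alternative without eval: look the function up in Base by name.
value2 = getfield(Base, op)(lhs, rhs)
```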
>> Ideally, I would like to simplify this to using multi-dispatch on symbols 
>> (see previous post), but for now this is the easiest way to define actions 
>> based on node attributes.
>>
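One way dispatching on symbols might look in current Julia is to lift the rule name into the type domain with `Val`. This is only a sketch of the idea: `action` and `run_action` are hypothetical names, and nothing here reflects how PEGParser is or will be implemented.

```julia
# Hypothetical: one method per rule name, selected through Val{name}.
action(::Val{:number}, value, cvalues) = parse(Float64, value)
action(::Val{:op1}, value, cvalues) = Symbol(value)
action(::Val, value, cvalues) = cvalues   # fallback for rules without a method

# Convert the runtime rule name into a Val so dispatch can pick the method.
run_action(name::Symbol, value, cvalues) = action(Val(name), value, cvalues)

n = run_action(:number, "42", [])
o = run_action(:op1, "+", [])
d = run_action(:factor, "", [1.0])
```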
>> Finally, to transform the tree:
>>
>> result = transform(math, node)  # will give the value of -20
>>
>> Originally I was going to attach the transforms to the rules themselves 
>> (similar to boost::spirit). However, there were two reasons for not doing 
>> this:
>>
>>    1. To implement the packrat part of the parser, I needed to cache the 
>>    results which meant building an AST anyways
>>    2. It's nice to be able to apply different transforms to the same 
>>    grammar (e.g. you may want to transform the result into HTML, LaTeX, etc.)
>>
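The caching mentioned in the first point can be illustrated with a memo table keyed on rule name and input position, which is the general packrat idea. The function `match_cached!`, the table layout, and the attempt counter are assumptions made for this sketch; the package's real cache stores AST nodes and may be organized differently.

```julia
# Hypothetical memo table: (rule name, start position) => (matched text or nothing, next position)
memo = Dict{Tuple{Symbol,Int},Any}()
attempts = Ref(0)   # counts how often a real match is attempted

function match_cached!(memo, attempts, rule::Symbol, re::Regex, text::String, pos::Int)
    return get!(memo, (rule, pos)) do
        attempts[] += 1
        m = match(re, text, pos)
        # Only accept a match that starts exactly at `pos`.
        if m !== nothing && m.offset == pos
            (String(m.match), pos + ncodeunits(m.match))
        else
            (nothing, pos)
        end
    end
end

first_try = match_cached!(memo, attempts, :number, r"[0-9]+", "5*(2-6)", 1)
second_try = match_cached!(memo, attempts, :number, r"[0-9]+", "5*(2-6)", 1)
miss = match_cached!(memo, attempts, :number, r"[0-9]+", "5*(2-6)", 2)
```

The second call at the same position is answered from the table, so the regex runs only once per (rule, position) pair even when backtracking retries a rule.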
>> The downside of the separation is that it adds some more complexity to 
>> the process.
>>
>
