Re: [nodejs] With this utility, everyone can easily invent new languages

Mark Hahn Fri, 10 Jan 2014 17:42:03 -0800

Peasy looks interesting and it's on my list to look at, but this post was
too long to read.



On Fri, Jan 10, 2014 at 5:37 PM, Simeon Chaos <[email protected]>wrote:

> How is peasy (https://github.com/chaosim/peasy) simpler and more powerful
> than other parser tools?
>
> Simpler and more powerful? Maybe it is a bit contrary to common sense.
> True or false? To affirm, please give it a glimpse at first. Because of
> being simple, you can comprehend it in a little while; because of being
> powerful, it will deserve your every minute.
>
> All along, the design of programming languages is a complex task.
> Because we need to understand the esoteric compiler theory and technology,
> and one of the most critical and very difficult part is to define the rules
> of the new language and to parse with them.To solve this problem, there
> have been many theories , techniques and tools . These tools can be roughly
> divided into two categories: one is parser generator, another is to write
> the parser by hand with or without a parser library.
>
> The parser generator generally based on some type of formal languages ,
> by which the generator make the lexer and parser from some rules. Strong
> aspect of these tools is that they can produce the most efficient parser
> support for their own form of language types , according to the
> characteristics of formal language , generally ensures linear time
> complexity. Commonly divided into three types : LR technology , LL
> technology , PEG technology. LR technology language has shift/reduction ,
> from the bottom up , the most right derived and other technical features ,
> they can only support LR languages , more accurately, mainly LALR
> variants , plus some special treatment for the priority of operator. One of
> the most famous was undoubtedly the lex / yacc and its numerous subsequent
> variants ( such as bison, jison (javascript language ), ply (python lex /
> yacc) , etc. ) . LL technology has a top-down form of the language ,
> recursive descent , the most left-dereived and other technical features ,
> although the concept is easier to understand than the LR language , but
> because LL covers less languages, so not used as universal as lex / yacc.
> Representative products are antlr. PEG technology refers to parsing
> expression grammar parser generator. peg is a kind of formal grammar
> particularly suited to be parsed, currently there are already many tools
> appear. The most common method is packrat  algorithm based implementations.
> Such as rats, ometa (there are many versions , such as ometa / squeak,
> ometa-js, pyMeta etc. ), pyPEG under python, peg.js  under javascript,
> treetop and citrus under ruby, and so on.
>
> No matter what type of parser generator , the method to use them is to
> design rules for these tools , and then the tools generate parser. This is
> the difficult aspects. First, the object program is generated based on the
> state transfer and stack process, it is difficult to understand, debug and
> modify. Second, we must understand parser theory and technology of these
> tools is based on, the rules used by these tools is actually a
> domain-specific language , their expressivenes is very limited, while we
> must understand and become familiar with the new language in order to
> design rules. Third, we must embed certain processing ( abstract syntax
> tree structure , semantics, error reporting and handling , etc) in the form
> of object parser language  into lex/parser rules. Difficulties in these
> areas prevent the most programmers to easily use this type of tool.
>
> Meanwhile, people have also been pursuing a more relaxed and flexible
> approach to write parsers. Most of these methods produce tools can be
> classified as combinational parser library , or peg grammar-based peg
> library . Use of such libraries, programmers can use their own language in
> daily use to design a new universal language , parsing text, so they are
> used more freguently. Representative works is parsec under haskell language
>  which maybe is the most mature and powerful. However, because haskell is
> too academic, and parsec is not universally popular. c + + language has
> boost phoenix library, probably because it depends on the c++ template, an
> advanced language features, it has not been widely used too. Python has
> pyparsing, which is used by many users. For specific questions , there are
> many applications do not use any tools or libraries, but manually write the
> entire parser , for example: cpython implementation, esprima for
> javascript. However, unfortunately,  in their grammar and parsing of these
> tools, there are two obvious difficulties not been solved: the first is
> left recursive grammar, because left-recursive function will lead to an
> unconditional infinite recursion. The second is the parsing efficiency. In
> order to obtain a linear time complexity , we should only parse same syntax
> component only once at any position. To achieve the latter, the result of
> parsing is required. Meanwhile , the parser also need to be considered in
> conjunction to produce abstract syntax trees , semantic processing , error
> handling , etc., and these libraries maybe try to keep away from some of
> the problems (in the combined libraries left recursion are generally
> fobidden), or in order to solve these problems the program becomes very
> complex , difficult to understand , use, modify and extend.
>
> Can be seen from the above description , parsing is a very common
> programming needs , but also has a very high technical difficulty ,
> resulting in a large number of theories, technique, and tools you have seen
> above, in this regard a variety of technical papers, code and tools can be
> used simply to describe the voluminous , but still did not achieve the
> desired results .
>
>
> with peasy, a complete and elegant solution to all these problems emerge
> for the first time. On one hand you can say peasy the most simple and
> elegant , but also has the most powerful and adaptability , and does not
> need to sacrifice speed and efficiency.
>  Its characteristics are as follows:
>
> * for the first time, peasy provides a simple and complete solution to the
> problem of left recursion in the hand-written recursive descent parser
> manner, no matter direct or indirect left-recursive rule . To handle left
> recursive in peasy, only one function is needed (only 20 lines of code in
> coffeescript).
> * peasy allows programmers to write the parser with common programming
> language, and the parser and the input to the grammar rules are similar to
> the parser generator , both concise and readable , easy to modify and
> extend.
> * the simple and flexible caching mechanism in peasy can improve the time
> efficiency of parsing, we can maintain a linear time for liner time
> complexity grammar by caching. For complex grammar and data , we can also
> improve time efficiency by caching mechanism, and avoide the exponential,
> cube or square time complexity like some other common parsing algorithm.
> peasy's cache implementation is very simple, only one function (only ten
> lines of code under coffeescript  ) .
> * the entire peasy implementation is very simple. only two hundred lines
> of code under cofeescript. The concept is simple. Only two kind of
> components: some match functions which operate on the parsed data and
> parsing pointer and some combinational functions which generate match
> function. all of these functions are very simple , usually only a few lines
> of code , And they exists in the code of peasy for demonstration objective,
> can be deleted, modified and replaced, according to the specific needs.
>
> In fact, instead of to be considered as a library or a tool, maybe it's
> better to view peasy as a method, a way or some example to write parsers
> mannually. This will allow for a greater degree of power.
>
> peasy provide specific examples to support the above conclusion. The most
> obvious example is the common programming languages in order to adapt and
> modify their parser tool to prevent left- recursive grammar . For example
> python grammar (http://docs.python.org/2/reference/grammar.html),
> javascript grammar (
> http://www-archive.mozilla.org/js/language/grammar14.html), while by
> using peasy, all binary arithmetic expression can be condensed into a
> simple left -recursive rules , left recursion elimination is not needed any
> more. For expression parsing, in order to further improve time efficiency ,
> peasy also provides an example , has many different priorities for
> computing the call stack example can skip directly to the highest priority
> constant direct expression resolves to the final rather than, as is
> currently common practice, the rules of writing as follows : or -> (or | |
> and) | and; ...; add -> add + mul | mul; mul -> mul * atom | atom. Rare is
> , peasy while achieving enhanced capabilities of these expressions can also
> ensure that the linear time complexity of the parsing process. Want to know
> how peasy achieve these results? Please give a glimpse to peasy on github(
> https://github.com/chaosim/peasy).
>
> the first two of following links is the messages I posted when I started
> to get the inspiration to start writing peasy. Because there are some of
> other tasks to finish, intermediate interrupted for some time. Not long
> ago, I use peasy to parse something in another projects , and I further
> optimize the api of peasy, and which is now very simple and elegant, and I
> am satisfied :) As for the most critical left recursive , and caching
> algorithms , withstood the subsequent examples of the test , there is no
> any change. Together, these examples also continues to demonstrate the
> advantages of peasy .
>
> Finally, I need to mention the impact of language on thought: Without
> coffeescript language, maybe it's difficult for me to have these ideas. in
> coffeescript,  the -> function notation and "assignment is an expression"
> making special concise readable grammar. In python similar results can be
> achieved, need to go through some twists and turns, and that is not so
> natural as in coffeescript.There should exists no big obstacle in other
> dynamic languages, but they is not as elegant as coffeescript tool. This
> point is obvious with the difference between the javascript code of peasy
> and its coffeescript source.
>
> messages I posted in google groups before:
>
> https://groups.google.com/forum/#!search/left$20recursive/comp.lang.javascript/GN7b9Tr6j98/DoiHzi9i77oJ
>
> https://groups.google.com/forum/#!search/%E5%B7%A6%E9%80%92%E5%BD%92/python-cn/Uqr-Xf3I3LM/oUd3Sj4HvxYJ
>
>
> lr grammar and lex/yacc, bison, ply
> http://en.wikipedia.org/wiki/LR_parser
> http://dinosaur.compilertools.net/
> http://en.wikipedia.org/wiki/Yacc
> http://www.dabeaz.com/ply/
> http://www.gnu.org/software/bison/
> http://zaach.github.io/jison/
>
> LL grammar
> http://en.wikipedia.org/wiki/LL_parser
> http://www.antlr.org/
>
>
> some one asked for parser to handle left recursion on SO:
>
> http://stackoverflow.com/questions/4397039/any-peg-parser-capable-to-handle-left-recursion
>
>
> peg and packrat
> http://bford.info/packrat/
> http://en.wikipedia.org/wiki/Parsing_expression_grammar
> http://cs.nyu.edu/rgrimm/xtc/rats-intro.html
> http://java-source.net/open-source/parser-generators/rats!
>
> phoenix
>
> http://www.boost.org/doc/libs/1_55_0/libs/spirit/phoenix/doc/html/index.html
> http://www.ontolinux.com/community/phoenix/Phoenix_Manual.pdf
>
> ometa：
> Alessandro Warth describe a method to handle left recursion in peg grammar
> in his PhD paper firstly.
> After I have my idea about Peasy and algorith to handle left recursion, I
> searched and found his paper, and found my algorithm is actully similar to
> his method.
> http://tinlizzie.org/ometa/
> Memoization in Top-Down Parsing
> http://www.tinlizzie.org/~awarth/johnson.html
>
> https://github.com/alexwarth/ometa-js
>
> parsing under python
> https://wiki.python.org/moin/LanguageParsing
>
> parser for javascript or in javascript
> http://esprima.org/
> http://zaach.github.io/jison/
>
> --
> --
> Job Board: http://jobs.nodejs.org/
> Posting guidelines:
> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
>
> You received this message because you are subscribed to the Google
> Groups "nodejs" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/nodejs?hl=en?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "nodejs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
>
> For more options, visit https://groups.google.com/groups/opt_out.
>

-- 
-- 
Job Board: http://jobs.nodejs.org/
Posting guidelines: 
https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
You received this message because you are subscribed to the Google
Groups "nodejs" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/nodejs?hl=en?hl=en

--- 
You received this message because you are subscribed to the Google Groups 
"nodejs" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/groups/opt_out.

Re: [nodejs] With this utility, everyone can easily invent new languages

Reply via email to