aha, indeed. maybe I should blog it. 在 2014年1月11日星期六UTC+8上午9时40分31秒,Mark Hahn写道: > > Peasy looks interesting and it's on my list to look at, but this post was > too long to read. > > > On Fri, Jan 10, 2014 at 5:37 PM, Simeon Chaos > <[email protected]<javascript:> > > wrote: > >> How is peasy (https://github.com/chaosim/peasy) simpler and more >> powerful than other parser tools? >> >> Simpler and more powerful? Maybe it is a bit contrary to common sense. >> True or false? To affirm, please give it a glimpse at first. Because of >> being simple, you can comprehend it in a little while; because of being >> powerful, it will deserve your every minute. >> >> All along, the design of programming languages is a complex task. >> Because we need to understand the esoteric compiler theory and technology, >> and one of the most critical and very difficult part is to define the rules >> of the new language and to parse with them.To solve this problem, there >> have been many theories , techniques and tools . These tools can be roughly >> divided into two categories: one is parser generator, another is to write >> the parser by hand with or without a parser library. >> >> The parser generator generally based on some type of formal languages , >> by which the generator make the lexer and parser from some rules. Strong >> aspect of these tools is that they can produce the most efficient parser >> support for their own form of language types , according to the >> characteristics of formal language , generally ensures linear time >> complexity. Commonly divided into three types : LR technology , LL >> technology , PEG technology. LR technology language has shift/reduction , >> from the bottom up , the most right derived and other technical features , >> they can only support LR languages , more accurately, mainly LALR >> variants , plus some special treatment for the priority of operator. One of >> the most famous was undoubtedly the lex / yacc and its numerous subsequent >> variants ( such as bison, jison (javascript language ), ply (python lex / >> yacc) , etc. ) . LL technology has a top-down form of the language , >> recursive descent , the most left-dereived and other technical features , >> although the concept is easier to understand than the LR language , but >> because LL covers less languages, so not used as universal as lex / yacc. >> Representative products are antlr. PEG technology refers to parsing >> expression grammar parser generator. peg is a kind of formal grammar >> particularly suited to be parsed, currently there are already many tools >> appear. The most common method is packrat algorithm based implementations. >> Such as rats, ometa (there are many versions , such as ometa / squeak, >> ometa-js, pyMeta etc. ), pyPEG under python, peg.js under javascript, >> treetop and citrus under ruby, and so on. >> >> No matter what type of parser generator , the method to use them is to >> design rules for these tools , and then the tools generate parser. This is >> the difficult aspects. First, the object program is generated based on the >> state transfer and stack process, it is difficult to understand, debug and >> modify. Second, we must understand parser theory and technology of these >> tools is based on, the rules used by these tools is actually a >> domain-specific language , their expressivenes is very limited, while we >> must understand and become familiar with the new language in order to >> design rules. Third, we must embed certain processing ( abstract syntax >> tree structure , semantics, error reporting and handling , etc) in the form >> of object parser language into lex/parser rules. Difficulties in these >> areas prevent the most programmers to easily use this type of tool. >> >> Meanwhile, people have also been pursuing a more relaxed and flexible >> approach to write parsers. Most of these methods produce tools can be >> classified as combinational parser library , or peg grammar-based peg >> library . Use of such libraries, programmers can use their own language in >> daily use to design a new universal language , parsing text, so they are >> used more freguently. Representative works is parsec under haskell language >> which maybe is the most mature and powerful. However, because haskell is >> too academic, and parsec is not universally popular. c + + language has >> boost phoenix library, probably because it depends on the c++ template, an >> advanced language features, it has not been widely used too. Python has >> pyparsing, which is used by many users. For specific questions , there are >> many applications do not use any tools or libraries, but manually write the >> entire parser , for example: cpython implementation, esprima for >> javascript. However, unfortunately, in their grammar and parsing of these >> tools, there are two obvious difficulties not been solved: the first is >> left recursive grammar, because left-recursive function will lead to an >> unconditional infinite recursion. The second is the parsing efficiency. In >> order to obtain a linear time complexity , we should only parse same syntax >> component only once at any position. To achieve the latter, the result of >> parsing is required. Meanwhile , the parser also need to be considered in >> conjunction to produce abstract syntax trees , semantic processing , error >> handling , etc., and these libraries maybe try to keep away from some of >> the problems (in the combined libraries left recursion are generally >> fobidden), or in order to solve these problems the program becomes very >> complex , difficult to understand , use, modify and extend. >> >> Can be seen from the above description , parsing is a very common >> programming needs , but also has a very high technical difficulty , >> resulting in a large number of theories, technique, and tools you have seen >> above, in this regard a variety of technical papers, code and tools can be >> used simply to describe the voluminous , but still did not achieve the >> desired results . >> >> >> with peasy, a complete and elegant solution to all these problems emerge >> for the first time. On one hand you can say peasy the most simple and >> elegant , but also has the most powerful and adaptability , and does not >> need to sacrifice speed and efficiency. >> Its characteristics are as follows: >> >> * for the first time, peasy provides a simple and complete solution to >> the problem of left recursion in the hand-written recursive descent parser >> manner, no matter direct or indirect left-recursive rule . To handle left >> recursive in peasy, only one function is needed (only 20 lines of code in >> coffeescript). >> * peasy allows programmers to write the parser with common programming >> language, and the parser and the input to the grammar rules are similar to >> the parser generator , both concise and readable , easy to modify and >> extend. >> * the simple and flexible caching mechanism in peasy can improve the time >> efficiency of parsing, we can maintain a linear time for liner time >> complexity grammar by caching. For complex grammar and data , we can also >> improve time efficiency by caching mechanism, and avoide the exponential, >> cube or square time complexity like some other common parsing algorithm. >> peasy's cache implementation is very simple, only one function (only ten >> lines of code under coffeescript ) . >> * the entire peasy implementation is very simple. only two hundred lines >> of code under cofeescript. The concept is simple. Only two kind of >> components: some match functions which operate on the parsed data and >> parsing pointer and some combinational functions which generate match >> function. all of these functions are very simple , usually only a few lines >> of code , And they exists in the code of peasy for demonstration objective, >> can be deleted, modified and replaced, according to the specific needs. >> >> In fact, instead of to be considered as a library or a tool, maybe it's >> better to view peasy as a method, a way or some example to write parsers >> mannually. This will allow for a greater degree of power. >> >> peasy provide specific examples to support the above conclusion. The most >> obvious example is the common programming languages in order to adapt and >> modify their parser tool to prevent left- recursive grammar . For example >> python grammar (http://docs.python.org/2/reference/grammar.html), >> javascript grammar ( >> http://www-archive.mozilla.org/js/language/grammar14.html), while by >> using peasy, all binary arithmetic expression can be condensed into a >> simple left -recursive rules , left recursion elimination is not needed any >> more. For expression parsing, in order to further improve time efficiency , >> peasy also provides an example , has many different priorities for >> computing the call stack example can skip directly to the highest priority >> constant direct expression resolves to the final rather than, as is >> currently common practice, the rules of writing as follows : or -> (or | | >> and) | and; ...; add -> add + mul | mul; mul -> mul * atom | atom. Rare is >> , peasy while achieving enhanced capabilities of these expressions can also >> ensure that the linear time complexity of the parsing process. Want to know >> how peasy achieve these results? Please give a glimpse to peasy on github( >> https://github.com/chaosim/peasy). >> >> the first two of following links is the messages I posted when I started >> to get the inspiration to start writing peasy. Because there are some of >> other tasks to finish, intermediate interrupted for some time. Not long >> ago, I use peasy to parse something in another projects , and I further >> optimize the api of peasy, and which is now very simple and elegant, and I >> am satisfied :) As for the most critical left recursive , and caching >> algorithms , withstood the subsequent examples of the test , there is no >> any change. Together, these examples also continues to demonstrate the >> advantages of peasy . >> >> Finally, I need to mention the impact of language on thought: Without >> coffeescript language, maybe it's difficult for me to have these ideas. in >> coffeescript, the -> function notation and "assignment is an expression" >> making special concise readable grammar. In python similar results can be >> achieved, need to go through some twists and turns, and that is not so >> natural as in coffeescript.There should exists no big obstacle in other >> dynamic languages, but they is not as elegant as coffeescript tool. This >> point is obvious with the difference between the javascript code of peasy >> and its coffeescript source. >> >> messages I posted in google groups before: >> >> https://groups.google.com/forum/#!search/left$20recursive/comp.lang.javascript/GN7b9Tr6j98/DoiHzi9i77oJ >> >> https://groups.google.com/forum/#!search/%E5%B7%A6%E9%80%92%E5%BD%92/python-cn/Uqr-Xf3I3LM/oUd3Sj4HvxYJ >> >> >> lr grammar and lex/yacc, bison, ply >> http://en.wikipedia.org/wiki/LR_parser >> http://dinosaur.compilertools.net/ >> http://en.wikipedia.org/wiki/Yacc >> http://www.dabeaz.com/ply/ >> http://www.gnu.org/software/bison/ >> http://zaach.github.io/jison/ >> >> LL grammar >> http://en.wikipedia.org/wiki/LL_parser >> http://www.antlr.org/ >> >> >> some one asked for parser to handle left recursion on SO: >> >> http://stackoverflow.com/questions/4397039/any-peg-parser-capable-to-handle-left-recursion >> >> >> peg and packrat >> http://bford.info/packrat/ >> http://en.wikipedia.org/wiki/Parsing_expression_grammar >> http://cs.nyu.edu/rgrimm/xtc/rats-intro.html >> http://java-source.net/open-source/parser-generators/rats! >> >> phoenix >> >> http://www.boost.org/doc/libs/1_55_0/libs/spirit/phoenix/doc/html/index.html >> http://www.ontolinux.com/community/phoenix/Phoenix_Manual.pdf >> >> ometa: >> Alessandro Warth describe a method to handle left recursion in peg >> grammar in his PhD paper firstly. >> After I have my idea about Peasy and algorith to handle left recursion, I >> searched and found his paper, and found my algorithm is actully similar to >> his method. >> http://tinlizzie.org/ometa/ >> Memoization in Top-Down Parsing >> http://www.tinlizzie.org/~awarth/johnson.html >> >> https://github.com/alexwarth/ometa-js >> >> parsing under python >> https://wiki.python.org/moin/LanguageParsing >> >> parser for javascript or in javascript >> http://esprima.org/ >> http://zaach.github.io/jison/ >> >> -- >> -- >> Job Board: http://jobs.nodejs.org/ >> Posting guidelines: >> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines >> >> You received this message because you are subscribed to the Google >> Groups "nodejs" group. >> To post to this group, send email to [email protected]<javascript:> >> To unsubscribe from this group, send email to >> [email protected] <javascript:> >> For more options, visit this group at >> http://groups.google.com/group/nodejs?hl=en?hl=en >> >> --- >> You received this message because you are subscribed to the Google Groups >> "nodejs" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected] <javascript:>. >> >> For more options, visit https://groups.google.com/groups/opt_out. >> > >
-- -- Job Board: http://jobs.nodejs.org/ Posting guidelines: https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines You received this message because you are subscribed to the Google Groups "nodejs" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/nodejs?hl=en?hl=en --- You received this message because you are subscribed to the Google Groups "nodejs" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/groups/opt_out.
