On Thu, Mar 21, 2019 at 9:59 PM Alexander Korotkov <a.korot...@postgrespro.ru> wrote: > Attaches patches improving jsonpath parser. First one introduces > cosmetic changes, while second gets rid of backtracking. I'm also > planning to add high-level comment for both grammar and lexer.
The cosmetic changes look good to me. I just noticed a couple things about the comments. 0001: +/* Check if current scanstring value constitutes a keyword */ 'is a keyword' is better. 'Constitutes' implies parts of a whole. + * Resize scanstring for appending of given length. Reinitilize if required. s/Reinitilize/Reinitialize/ The first sentence is not entirely clear to me. 0002: These two rules are not strictly necessary: <xnq,xq,xvq,xsq>{unicode}+\\ { /* throw back the \\, and treat as unicode */ yyless(yyleng - 1); parseUnicode(yytext, yyleng); } <xnq,xq,xvq,xsq>{hex_char}+\\ { /* throw back the \\, and treat as hex */ yyless(yyleng - 1); parseHexChars(yytext, yyleng); } ...and only seem to be there because of how these are written: <xnq,xq,xvq,xsq>{unicode}+ { parseUnicode(yytext, yyleng); } <xnq,xq,xvq,xsq>{hex_char}+ { parseHexChars(yytext, yyleng); } <xnq,xq,xvq,xsq>{unicode}*{unicodefail} { yyerror(NULL, "Unicode sequence is invalid"); } <xnq,xq,xvq,xsq>{hex_char}*{hex_fail} { yyerror(NULL, "Hex character sequence is invalid"); } I don't understand the reasoning here -- is it a micro-optimization? The following is simpler, allow the rules I mentioned to be removed, and make check still passes. I would prefer it unless there is a performance penalty, in which case a comment to describe the additional complexity would be helpful. <xnq,xq,xvq,xsq>{unicode} { parseUnicode(yytext, yyleng); } <xnq,xq,xvq,xsq>{hex_char} { parseHexChars(yytext, yyleng); } <xnq,xq,xvq,xsq>{unicodefail} { yyerror(NULL, "Unicode sequence is invalid"); } <xnq,xq,xvq,xsq>{hex_fail} { yyerror(NULL, "Hex character sequence is invalid"); } -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services