Hi,

Yes, the antlr3.g file has the same detailed definitions. However, ANTLR v3 allows users to explicitly define the structure of the tree.
For example, given the rule

    setStorageGroup
      : KW_SET KW_STORAGE KW_GROUP KW_TO prefixPath
      -> ^(TOK_SET ^(TOK_STORAGEGROUP prefixPath))
      ;

the structure of the tree is like:

    'SET'
      |
    'STORAGEGROUP'
      |
    prefixPath

The prefixPath is another tree. Users can recursively analyse the AST node with a function like analyze(prefixPath). Data are accessed by reference.

However, in ANTLR v4 the '->' tree-rewrite operator is no longer available, so the statement for setting a storage group is defined as

    setStorageGroup
      : KW_SET KW_STORAGE KW_GROUP KW_TO prefixPath
      ;

If we need the string content of prefixPath, we can call prefixPath.getText(), which is actually clearer and more direct for developers. However, if prefixPath is not a leaf node, a StringBuilder is created to assemble the text instead of the data being accessed by reference. Although operations on a StringBuilder are faster than on a String, creating StringBuilders too frequently is a heavy overhead, which cancels out that benefit and can even reduce overall performance.

Currently, I think this is what leads to the problem.

Best,
---------------------
Yuyuan KANG

> -----Original Message-----
> From: "Xiangdong Huang" <saint...@gmail.com>
> Sent: 2019-09-09 00:08:00 (Monday)
> To: dev@iotdb.apache.org
> Cc:
> Subject: Re: [jira] [Created] (IOTDB-201) Query parsing runs slower when using ANTLR v4
>
> Hi,
>
> > There are some grammar definitions that are too detailed, such as decimal
> > numbers, which are categorized into many types. I think making the rules
> > more general may decrease the times of calling getText() method.
>
> One question, does the antlr3.g file have the same detailed definition,
> e.g., the decimal numbers?
>
> Best,
>
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
>
> 黄向东
> 清华大学 软件学院
>
>
> 康愈圆 <ky...@mails.tsinghua.edu.cn> wrote on Thu, Sep 5, 2019 at 11:11 PM:
>
> > Hi,
> >
> > I've been working on JIRA issue [IOTDB-190 switch to ANTLR v4] these days.
> >
> > I implemented the SQL parsing module.
> > However, it seems that the parsing efficiency reduces a lot when using
> > ANTLR v4.
> >
> > It turns out that RuleContext.getText() is frequently called, which takes
> > more than 90% of the CPU time.
> >
> > The grammar definition (.g4 file) here is a continuation of previous
> > version (ANTLR v3). There are some grammar definitions that are too
> > detailed, such as decimal numbers, which are categorized into many types. I
> > think making the rules more general may decrease the times of calling
> > getText() method.
> >
> > I plan to reconstruct the grammar definition to improve the parsing
> > efficiency.
> >
> > ----
> > Yuyuan KANG
> >
> > At 2019-09-06 13:30:00, Yuyuan KANG (Jira) <j...@apache.org> wrote:
> > > Yuyuan KANG created IOTDB-201:
> > > ---------------------------------
> > >
> > >              Summary: Query parsing runs slower when using ANTLR v4
> > >                  Key: IOTDB-201
> > >                  URL: https://issues.apache.org/jira/browse/IOTDB-201
> > >              Project: Apache IoTDB
> > >           Issue Type: Improvement
> > >             Reporter: Yuyuan KANG
> > >
> > >
> > > The system now uses ANTLR v3. When transformed to ANTLR v4 using
> > > previous grammar definition, experiment result shows that the efficiency of
> > > logical plan generation is negatively impacted.
> > >
> > >
> > >
> > > --
> > > This message was sent by Atlassian Jira
> > > (v8.3.2#803003)
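
P.S. To make the getText() cost described at the top of this message more concrete, here is a minimal sketch in plain Java. It does not use the ANTLR runtime: Node, rule(), samplePrefixPath() and buildersCreated are hypothetical names made up for illustration, and the shape of the prefixPath subtree (three one-token child rules separated by '.') is an assumption, not the real IoTDB grammar. It only models the behaviour discussed above, where every non-leaf node allocates a StringBuilder to assemble its text.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a parse tree; NOT the ANTLR runtime classes. */
class GetTextCostSketch {

    /** Counts how many StringBuilder objects this sketch allocates. */
    static int buildersCreated = 0;

    /** Hypothetical node: a leaf stores token text, a rule node stores children. */
    static class Node {
        final String tokenText;                       // non-null only for leaves
        final List<Node> children = new ArrayList<>();

        Node(String tokenText) {
            this.tokenText = tokenText;
        }

        /** Creates a non-leaf (rule) node with the given children. */
        static Node rule(Node... kids) {
            Node n = new Node(null);
            for (Node k : kids) {
                n.children.add(k);
            }
            return n;
        }

        /** Leaves return their stored string; rule nodes concatenate recursively. */
        String getText() {
            if (tokenText != null) {
                return tokenText;                     // no allocation for a leaf
            }
            buildersCreated++;                        // one new builder per rule node
            StringBuilder sb = new StringBuilder();
            for (Node c : children) {
                sb.append(c.getText());
            }
            return sb.toString();
        }
    }

    /** Builds an assumed prefixPath-like subtree for the text "root.sg1.d1". */
    static Node samplePrefixPath() {
        return Node.rule(
            Node.rule(new Node("root")),
            new Node("."),
            Node.rule(new Node("sg1")),
            new Node("."),
            Node.rule(new Node("d1")));
    }

    public static void main(String[] args) {
        buildersCreated = 0;
        Node prefixPath = samplePrefixPath();
        String text = prefixPath.getText();
        System.out.println(text + " -> builders created: " + buildersCreated);
    }
}
```

In this toy tree a single getText() call on prefixPath creates four builders (one for prefixPath itself and one for each of the three child rules), so running main prints "root.sg1.d1 -> builders created: 4". The more finely a grammar splits its rules, the more rule nodes there are per statement, and the more allocations each getText() call triggers.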