Hi,

> However, if prefixPath is not a leaf node, a StringBuilder will be created instead of reference access.

In your example, prefixPath is a leaf node, is that right? Maybe incorrect use of the API is what leads to the bad performance.

Can we do some unit tests? E.g., implement just 1 or 2 grammars using both ANTLR v3 and v4 and compare the performance?

By the way, I noticed that Calcite uses JavaCC...

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院

康愈圆 <[email protected]> wrote on Mon, Sep 9, 2019 at 11:43 AM:

> Hi,
>
> Yes, the antlr3.g file has the same detailed definitions. However, ANTLR v3
> allows users to explicitly define the structure of the tree.
>
> For example,
>
> setStorageGroup
>   : KW_SET KW_STORAGE KW_GROUP KW_TO prefixPath
>   -> ^(TOK_SET ^(TOK_STORAGEGROUP prefixPath))
>   ;
>
> the structure of the tree is like:
>
>   'SET'
>     |
>   'STORAGEGROUP'
>     |
>   prefixPath
>
> The prefixPath is another tree. Users can recursively analyze the AST node
> with a function like analyze(prefixPath). Data are accessed by reference.
>
> However, in ANTLR v4, the '->' operator has been removed. So the statement
> for setting a storage group is defined as
>
> setStorageGroup
>   : KW_SET KW_STORAGE KW_GROUP KW_TO prefixPath
>   ;
>
> If we need to get the string info of prefixPath, we can use
> prefixPath.getText(), which is actually clearer and more direct for
> developers. However, if prefixPath is not a leaf node, a StringBuilder
> will be created instead of reference access. Although operations on
> StringBuilder are faster than on String, creating StringBuilders too
> frequently is a heavy overhead, which cancels out the benefits and even
> reduces the overall performance.
>
> Currently, I think this is what leads to the problem.
>
> Best,
> ---------------------
> Yuyuan KANG
>
>
> > -----Original Message-----
> > From: "Xiangdong Huang" <[email protected]>
> > Sent: 2019-09-09 00:08:00 (Monday)
> > To: [email protected]
> > Cc:
> > Subject: Re: [jira] [Created] (IOTDB-201) Query parsing runs slower when using ANTLR v4
> >
> > Hi,
> >
> > > There are some grammar definitions that are too detailed, such as decimal
> > > numbers, which are categorized into many types. I think making the rules
> > > more general may decrease the number of calls to the getText() method.
> >
> > One question: does the antlr3.g file have the same detailed definitions,
> > e.g., for the decimal numbers?
> >
> > Best,
> >
> > -----------------------------------
> > Xiangdong Huang
> > School of Software, Tsinghua University
> >
> > 黄向东
> > 清华大学 软件学院
> >
> >
> > 康愈圆 <[email protected]> wrote on Thu, Sep 5, 2019 at 11:11 PM:
> >
> > > Hi,
> > >
> > > I've been working on JIRA issue [IOTDB-190 switch to ANTLR v4] these days.
> > >
> > > I implemented the SQL parsing module. However, it seems that the parsing
> > > efficiency drops a lot when using ANTLR v4.
> > >
> > > It turns out that RuleContext.getText() is called frequently, taking
> > > more than 90% of the CPU time.
> > >
> > > The grammar definition (.g4 file) here is a continuation of the previous
> > > version (ANTLR v3). There are some grammar definitions that are too
> > > detailed, such as decimal numbers, which are categorized into many types. I
> > > think making the rules more general may decrease the number of calls to the
> > > getText() method.
> > >
> > > I plan to restructure the grammar definition to improve the parsing
> > > efficiency.
> > >
> > > ----
> > > Yuyuan KANG
> > >
> > > On 2019-09-06 13:30:00, Yuyuan KANG (Jira) <[email protected]> wrote:
> > > > Yuyuan KANG created IOTDB-201:
> > > > ---------------------------------
> > > >
> > > >     Summary: Query parsing runs slower when using ANTLR v4
> > > >     Key: IOTDB-201
> > > >     URL: https://issues.apache.org/jira/browse/IOTDB-201
> > > >     Project: Apache IoTDB
> > > >     Issue Type: Improvement
> > > >     Reporter: Yuyuan KANG
> > > >
> > > >
> > > > The system now uses ANTLR v3. When migrated to ANTLR v4 using the
> > > > previous grammar definition, experimental results show that the
> > > > efficiency of logical plan generation is negatively impacted.
> > > >
> > > >
> > > >
> > > > --
> > > > This message was sent by Atlassian Jira
> > > > (v8.3.2#803003)
> > >
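
To make the getText() cost discussed in this thread concrete, here is a minimal sketch in plain Java that does not use the ANTLR runtime. It assumes a simplified model of what ANTLR v4's RuleContext.getText() does: a node without children returns an empty string, and any other rule node appends each child's text to a fresh StringBuilder, while a token (leaf) node just returns its stored string. The names GetTextSketch, Leaf, Rule, buildersCreated, and the shape of the prefixPath tree are hypothetical and are not taken from the IoTDB grammar.

```java
import java.util.ArrayList;
import java.util.List;

public class GetTextSketch {
    /** Counts how many StringBuilder instances this model has created. */
    static int buildersCreated = 0;

    /** Hypothetical stand-in for a parse tree node. */
    interface Node {
        String getText();
    }

    /** Leaf (token) node: the text is stored once and returned by reference. */
    static final class Leaf implements Node {
        private final String text;

        Leaf(String text) {
            this.text = text;
        }

        @Override
        public String getText() {
            return text;
        }
    }

    /** Inner (rule) node: the text is rebuilt from the children on every call. */
    static final class Rule implements Node {
        private final List<Node> children = new ArrayList<>();

        Rule add(Node child) {
            children.add(child);
            return this;
        }

        @Override
        public String getText() {
            if (children.isEmpty()) {
                return "";
            }
            buildersCreated++; // one new builder per non-leaf call
            StringBuilder sb = new StringBuilder();
            for (Node child : children) {
                sb.append(child.getText());
            }
            return sb.toString();
        }
    }

    /** Assumed tree shape for "root.sg1": prefixPath -> (nodeName, '.', nodeName). */
    static Node samplePrefixPath() {
        Rule first = new Rule().add(new Leaf("root"));
        Rule second = new Rule().add(new Leaf("sg1"));
        return new Rule().add(first).add(new Leaf(".")).add(second);
    }

    public static void main(String[] args) {
        Node path = samplePrefixPath();
        String text = path.getText();
        System.out.println(text + " builders=" + buildersCreated);
    }
}
```

Run as-is, this prints "root.sg1 builders=3": one call on the root allocates a builder for the root and for each of the two nested rule nodes, and each further call repeats that work. If the real grammar nests rules in a similar way, caching the result of getText() or reading token text directly would be one way to avoid the repeated allocation, though whether that explains the measured slowdown would still need the unit tests proposed above.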
