Re: Re: [jira] [Created] (IOTDB-201) Query parsing runs slower when using ANTLR v4
Hi,

Yes, the antlr3.g file has the same detailed definitions. However, ANTLR v3 allows users to explicitly define the structure of the tree. For example,

    setStorageGroup
      : KW_SET KW_STORAGE KW_GROUP KW_TO prefixPath
      -> ^(TOK_SET ^(TOK_STORAGEGROUP prefixPath))
      ;

produces a tree structured like:

    'SET'
      |
    'STORAGEGROUP'
      |
    prefixPath

The prefixPath is itself another tree. Users can recursively analyze an AST node with a function like analyze(prefixPath), and data are accessed by reference.

However, in ANTLR v4 the '->' operator has been removed, so the statement for setting a storage group is defined as

    setStorageGroup
      : KW_SET KW_STORAGE KW_GROUP KW_TO prefixPath

If we need the string content of prefixPath, we can use prefixPath.getText(), which is actually clearer and more direct for developers. However, if prefixPath is not a leaf node, a StringBuilder is created instead of the data being accessed by reference. Although operations on a StringBuilder are faster than on a String, creating StringBuilders too frequently is a heavy overhead, which offsets the benefit and can even reduce overall performance.

Currently, I think this is what leads to the problem.

Best,
-
Yuyuan KANG

> -----Original Message-----
> From: "Xiangdong Huang"
> Sent: 2019-09-09 00:08:00 (Monday)
> To: dev@iotdb.apache.org
> Cc:
> Subject: Re: [jira] [Created] (IOTDB-201) Query parsing runs slower when using ANTLR v4
>
> Hi,
>
> > There are some grammar definitions that are too detailed, such as decimal numbers, which are categorized into many types. I think making the rules more general may decrease the number of calls to the getText() method.
>
> One question: does the antlr3.g file have the same detailed definitions, e.g., for decimal numbers?
>
> Best,
>
> ---
> Xiangdong Huang
> School of Software, Tsinghua University
>
> 黄向东
> 清华大学 软件学院
>
> On Thu, Sep 5, 2019 at 11:11 PM, 康愈圆 wrote:
>
> > Hi,
> >
> > I've been working on JIRA issue [IOTDB-190 switch to ANTLR v4] these days.
> >
> > I implemented the SQL parsing module. However, parsing efficiency seems to drop considerably when using ANTLR v4.
> >
> > It turns out that RuleContext.getText() is called frequently and takes more than 90% of the CPU time.
> >
> > The grammar definition (.g4 file) here is carried over from the previous version (ANTLR v3). There are some grammar definitions that are too detailed, such as decimal numbers, which are categorized into many types. I think making the rules more general may decrease the number of calls to the getText() method.
> >
> > I plan to restructure the grammar definition to improve parsing efficiency.
> >
> > Yuyuan KANG
> >
> > On 2019-09-06 13:30:00, Yuyuan KANG (Jira) wrote:
> > > Yuyuan KANG created IOTDB-201:
> > >
> > > Summary: Query parsing runs slower when using ANTLR v4
> > > Key: IOTDB-201
> > > URL: https://issues.apache.org/jira/browse/IOTDB-201
> > > Project: Apache IoTDB
> > > Issue Type: Improvement
> > > Reporter: Yuyuan KANG
> > >
> > > The system now uses ANTLR v3. When migrated to ANTLR v4 with the previous grammar definition, experimental results show that the efficiency of logical plan generation is negatively impacted.
> > >
> > > --
> > > This message was sent by Atlassian Jira
> > > (v8.3.2#803003)
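To make the overhead described above concrete, here is a minimal sketch of the effect. It does not use ANTLR itself: Node, Leaf, and Rule are hypothetical stand-ins for a parse tree, and the only assumption carried over from the message is that getText() on a non-leaf node builds its result with a fresh StringBuilder on every call, while a leaf returns its text by reference.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a parse tree; these are not ANTLR's real classes. */
final class GetTextCostSketch {

  /** Number of StringBuilders allocated by Rule.getText() so far. */
  static int buildersCreated = 0;

  interface Node {
    String getText();
  }

  /** A leaf (token) already holds its text and hands it back by reference. */
  static final class Leaf implements Node {
    private final String text;

    Leaf(String text) {
      this.text = text;
    }

    @Override
    public String getText() {
      return text;
    }
  }

  /** An inner (rule) node rebuilds its text from its children on every call. */
  static final class Rule implements Node {
    private final List<Node> children = new ArrayList<>();

    Rule add(Node child) {
      children.add(child);
      return this;
    }

    @Override
    public String getText() {
      buildersCreated++;
      StringBuilder sb = new StringBuilder();
      for (Node child : children) {
        sb.append(child.getText());
      }
      return sb.toString();
    }
  }

  /** Builds a prefixPath-like subtree for root.sg1.d1, one rule node per path segment. */
  static Rule samplePrefixPath() {
    return new Rule()
        .add(new Rule().add(new Leaf("root")))
        .add(new Leaf("."))
        .add(new Rule().add(new Leaf("sg1")))
        .add(new Leaf("."))
        .add(new Rule().add(new Leaf("d1")));
  }

  public static void main(String[] args) {
    Rule prefixPath = samplePrefixPath();
    // Every call walks the subtree again and allocates one builder per rule node.
    for (int i = 0; i < 1000; i++) {
      prefixPath.getText();
    }
    System.out.println("text = " + prefixPath.getText());
    System.out.println("StringBuilders created = " + buildersCreated);
  }
}
```

In this sketch a single getText() call on the sample subtree allocates one builder per rule node, so the cost grows with both the depth of the grammar rules and the number of calls. That is consistent with the idea that more general (shallower) rules, or fewer getText() calls, should help.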
Re: Solving jira problem (IOTDB-180) Get rid of JSON format in "show timeseries"
The JIRA link for this issue is https://issues.apache.org/jira/projects/IOTDB/issues/IOTDB-180?filter=allopenissues

On 2019-09-09 11:21:23, "thss15_yit" wrote:
> Hi,
>
> I have been working on JIRA issue [IOTDB-180 get rid of JSON format in "show timeseries"] these days.
>
> My plan for this issue is to merge the execution of the statement "show timeseries" into "show timeseries ", use the functions of "show timeseries " to output the data in table format, and then remove some of the JSON-format functions that are no longer useful.
>
> Tao Yi
Solving jira problem (IOTDB-180) Get rid of JSON format in "show timeseries"
Hi,

I have been working on JIRA issue [IOTDB-180 get rid of JSON format in "show timeseries"] these days.

My plan for this issue is to merge the execution of the statement "show timeseries" into "show timeseries ", use the functions of "show timeseries " to output the data in table format, and then remove some of the JSON-format functions that are no longer useful.

Tao Yi
Re: A simple tool to visualize logs
Hi,

This feature is available at https://github.com/apache/incubator-iotdb/pull/370; please have a look if you have some time to spare.

Thanks a lot.

Tian Jiang
jt2594...@163.com

On 9/9/2019 00:41, Julian Feinauer wrote:

Hi,

I definitely need that (to adapt to our stuff!)

Julian

On 08.09.19, 09:13, "Xiangdong Huang" wrote:

Hi,

Sounds COOOL! Analyzing system logs is one of the most important things for a complex system (especially for a distributed system).

Will try this feature ASAP :D

Best,

---
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院

On Wed, Sep 4, 2019 at 7:59 PM, Tian Jiang wrote:

Greetings,

Usually, the very first thing we do on finding a bug is to search the logs. Logs play a vital role in debugging, especially in environments where attaching a debugger is impossible. In such circumstances, logs become the only source of information for the developers.

A single log entry, which is just a string, is easy to understand. But when it comes to mining information from thousands of log entries or more, getting lost is nearly unavoidable, since humans have a much more limited memory for exact details than computers. From time to time, I forget what I have read before and must go back to review the previous logs; as a result, progress is very slow. Reading several strings is easy, but when we have thousands, there must be a better way to present them than raw text.

So I keep thinking it would be much better if we could turn the logs into plots. Of course there are existing tools, but they are often powerful but too heavy (like Kibana), or specialized for web or other logs (like Logstalgia). Having a fantastic web interface is great, but something simple and handy suits us better. What I want is something lightweight, stand-alone, and highly customizable.

As a result, I developed a simple tool that can visualize (plot) logs generated by IoTDB (with some modification, it can be applied to other types of logs, too) and generate reports. I designed a simple GUI that provides the full functionality and a command-line tool to generate reports quickly. The attachment contains an example report I generated from one of my experiments, which reveals interesting things like how the size of memtables converges over time.

I may have missed some tools that are more powerful or easier to use. If you know of any, please let me know and I shall see what I can learn from them.

Tian Jiang
jt2594...@163.com
Re: Enable to choose storage in local file system or HDFS
Hi,

Thanks for this helpful suggestion! I'll try it by implementing TsFileFactory and SystemFileFactory. (They are factories that create files according to the file system in use, so I think the names don't need to be "...FileFSFactory"?)

BR,

--
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院

-----Original Message-----
From: "Xiangdong Huang";
Sent: Monday, September 9, 2019, 0:03
To: "dev";
Subject: Re: Enable to choose storage in local file system or HDFS

Hi,

> do we need the FileFactory for all Files?

A solution is to have two FileSystemUtil classes (or FileFactory classes): TsFileFSFactory and one for the rest.

Best,

---
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院

On Sun, Sep 8, 2019 at 5:24 AM, Zesong Sun wrote:

> Hi,
>
> I had intended to implement this requirement in the first way, but now I think the second and third are better because they require much less modification of the current code... Though the first way may support more than HDFS storage, it may still take a lot more time to modify the code in the TsFile module based on the current work.
>
> BR,
>
> --
> Zesong Sun
> School of Software, Tsinghua University
>
> 孙泽嵩
> 清华大学 软件学院
>
> -----Original Message-----
> From: "Jialin Qiao";
> Sent: Sunday, September 8, 2019, 2:08 PM
> To: "dev";
> Subject: Enable to choose storage in local file system or HDFS
>
> Hi,
>
> This issue is to let users directly use Spark to read data in IoTDB for analysis.
>
> This can be done in several ways in IoTDB:
>
> (1) Store all TsFiles (data files) and other files (system files, WALs) on HDFS, then use spark-tsfile to read TsFiles on HDFS.
> (2) Store only TsFiles on HDFS and other files on the local file system, then use spark-tsfile to read TsFiles on HDFS.
> (3) Store all files on the local file system and let users use spark-iotdb-connector to read data from IoTDB, regardless of where the TsFiles are stored.
>
> Personally, I prefer the second and the third. If we use the second way, do we need the FileFactory for all Files?
>
> Best,
>
> --
> Jialin Qiao
> School of Software, Tsinghua University
>
> 乔嘉林
> 清华大学 软件学院
>
> > -----Original Message-----
> > From: "Zesong Sun (Jira)"
> > Sent: 2019-08-29 19:34:00 (Thursday)
> > To: dev@iotdb.apache.org
> > Cc:
> > Subject: [jira] [Created] (IOTDB-187) Enable to choose storage in local file system or HDFS
> >
> > Zesong Sun created IOTDB-187:
> >
> > Summary: Enable to choose storage in local file system or HDFS
> > Key: IOTDB-187
> > URL: https://issues.apache.org/jira/browse/IOTDB-187
> > Project: Apache IoTDB
> > Issue Type: Improvement
> > Reporter: Zesong Sun
> >
> > Enable to choose storage in local file system or HDFS
> > "is_hdfs_storage=false" by default
> >
> > --
> > This message was sent by Atlassian Jira
> > (v8.3.2#803003)
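The two-factory idea discussed in this thread could look roughly like the sketch below. It is only an illustration of option (2), not IoTDB's actual code: the FileFactory interface, the locate method, the directory names, and the name node address are all hypothetical, and locations are plain strings so that the example runs without Hadoop on the classpath. The only names taken from the thread are the TsFile/system-file split and the "is_hdfs_storage" switch with its default of false.

```java
import java.util.Properties;

/** Hypothetical sketch of option (2) from the thread; not IoTDB's actual code. */
final class FileFactorySketch {

  /** Maps a file name to a location on one particular file system. */
  interface FileFactory {
    String locate(String fileName);
  }

  /** Places files under a directory on the local file system. */
  static final class LocalFileFactory implements FileFactory {
    private final String baseDir;

    LocalFileFactory(String baseDir) {
      this.baseDir = baseDir;
    }

    @Override
    public String locate(String fileName) {
      return "file://" + baseDir + "/" + fileName;
    }
  }

  /** Places files under a directory on HDFS; the name node address is made up. */
  static final class HdfsFileFactory implements FileFactory {
    private final String nameNode;
    private final String baseDir;

    HdfsFileFactory(String nameNode, String baseDir) {
      this.nameNode = nameNode;
      this.baseDir = baseDir;
    }

    @Override
    public String locate(String fileName) {
      return "hdfs://" + nameNode + baseDir + "/" + fileName;
    }
  }

  /** TsFiles follow the is_hdfs_storage switch, which defaults to false. */
  static FileFactory tsFileFactory(Properties config) {
    boolean useHdfs = Boolean.parseBoolean(config.getProperty("is_hdfs_storage", "false"));
    return useHdfs
        ? new HdfsFileFactory("namenode:9000", "/iotdb/data")
        : new LocalFileFactory("/iotdb/data");
  }

  /** In option (2), system files and WALs always stay on the local file system. */
  static FileFactory systemFileFactory() {
    return new LocalFileFactory("/iotdb/system");
  }

  public static void main(String[] args) {
    Properties config = new Properties();
    config.setProperty("is_hdfs_storage", "true");
    System.out.println(tsFileFactory(config).locate("example.tsfile"));
    System.out.println(systemFileFactory().locate("example.log"));
  }
}
```

Keeping the switch inside the TsFile factory means callers never test is_hdfs_storage themselves, which is one way to limit changes to the existing code, as the thread hopes.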
Re: A simple tool to visualize logs
Hi,

I definitely need that (to adapt to our stuff!)

Julian

On 08.09.19, 09:13, "Xiangdong Huang" wrote:

Hi,

Sounds COOOL! Analyzing system logs is one of the most important things for a complex system (especially for a distributed system).

Will try this feature ASAP :D

Best,

---
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院

On Wed, Sep 4, 2019 at 7:59 PM, Tian Jiang wrote:

> Greetings,
>
> Usually, the very first thing we do on finding a bug is to search the logs. Logs play a vital role in debugging, especially in environments where attaching a debugger is impossible. In such circumstances, logs become the only source of information for the developers.
>
> A single log entry, which is just a string, is easy to understand. But when it comes to mining information from thousands of log entries or more, getting lost is nearly unavoidable, since humans have a much more limited memory for exact details than computers. From time to time, I forget what I have read before and must go back to review the previous logs; as a result, progress is very slow. Reading several strings is easy, but when we have thousands, there must be a better way to present them than raw text.
>
> So I keep thinking it would be much better if we could turn the logs into plots. Of course there are existing tools, but they are often powerful but too heavy (like Kibana), or specialized for web or other logs (like Logstalgia). Having a fantastic web interface is great, but something simple and handy suits us better. What I want is something lightweight, stand-alone, and highly customizable.
>
> As a result, I developed a simple tool that can visualize (plot) logs generated by IoTDB (with some modification, it can be applied to other types of logs, too) and generate reports. I designed a simple GUI that provides the full functionality and a command-line tool to generate reports quickly. The attachment contains an example report I generated from one of my experiments, which reveals interesting things like how the size of memtables converges over time.
>
> I may have missed some tools that are more powerful or easier to use. If you know of any, please let me know and I shall see what I can learn from them.
>
> Tian Jiang
> jt2594...@163.com
Re: A simple tool to visualize logs
Hi,

Sounds COOOL! Analyzing system logs is one of the most important things for a complex system (especially for a distributed system).

Will try this feature ASAP :D

Best,

---
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院

On Wed, Sep 4, 2019 at 7:59 PM, Tian Jiang wrote:

> Greetings,
>
> Usually, the very first thing we do on finding a bug is to search the logs. Logs play a vital role in debugging, especially in environments where attaching a debugger is impossible. In such circumstances, logs become the only source of information for the developers.
>
> A single log entry, which is just a string, is easy to understand. But when it comes to mining information from thousands of log entries or more, getting lost is nearly unavoidable, since humans have a much more limited memory for exact details than computers. From time to time, I forget what I have read before and must go back to review the previous logs; as a result, progress is very slow. Reading several strings is easy, but when we have thousands, there must be a better way to present them than raw text.
>
> So I keep thinking it would be much better if we could turn the logs into plots. Of course there are existing tools, but they are often powerful but too heavy (like Kibana), or specialized for web or other logs (like Logstalgia). Having a fantastic web interface is great, but something simple and handy suits us better. What I want is something lightweight, stand-alone, and highly customizable.
>
> As a result, I developed a simple tool that can visualize (plot) logs generated by IoTDB (with some modification, it can be applied to other types of logs, too) and generate reports. I designed a simple GUI that provides the full functionality and a command-line tool to generate reports quickly. The attachment contains an example report I generated from one of my experiments, which reveals interesting things like how the size of memtables converges over time.
>
> I may have missed some tools that are more powerful or easier to use. If you know of any, please let me know and I shall see what I can learn from them.
>
> Tian Jiang
> jt2594...@163.com
Re: [jira] [Created] (IOTDB-201) Query parsing runs slower when using ANTLR v4
Hi,

> There are some grammar definitions that are too detailed, such as decimal numbers, which are categorized into many types. I think making the rules more general may decrease the number of calls to the getText() method.

One question: does the antlr3.g file have the same detailed definitions, e.g., for decimal numbers?

Best,

---
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院

On Thu, Sep 5, 2019 at 11:11 PM, 康愈圆 wrote:

> Hi,
>
> I've been working on JIRA issue [IOTDB-190 switch to ANTLR v4] these days.
>
> I implemented the SQL parsing module. However, parsing efficiency seems to drop considerably when using ANTLR v4.
>
> It turns out that RuleContext.getText() is called frequently and takes more than 90% of the CPU time.
>
> The grammar definition (.g4 file) here is carried over from the previous version (ANTLR v3). There are some grammar definitions that are too detailed, such as decimal numbers, which are categorized into many types. I think making the rules more general may decrease the number of calls to the getText() method.
>
> I plan to restructure the grammar definition to improve parsing efficiency.
>
> Yuyuan KANG
>
> On 2019-09-06 13:30:00, Yuyuan KANG (Jira) wrote:
> > Yuyuan KANG created IOTDB-201:
> >
> > Summary: Query parsing runs slower when using ANTLR v4
> > Key: IOTDB-201
> > URL: https://issues.apache.org/jira/browse/IOTDB-201
> > Project: Apache IoTDB
> > Issue Type: Improvement
> > Reporter: Yuyuan KANG
> >
> > The system now uses ANTLR v3. When migrated to ANTLR v4 with the previous grammar definition, experimental results show that the efficiency of logical plan generation is negatively impacted.
> >
> > --
> > This message was sent by Atlassian Jira
> > (v8.3.2#803003)
Re: Enable to choose storage in local file system or HDFS
Hi,

> do we need the FileFactory for all Files?

A solution is to have two FileSystemUtil classes (or FileFactory classes): TsFileFSFactory and one for the rest.

Best,

---
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院

On Sun, Sep 8, 2019 at 5:24 AM, Zesong Sun wrote:

> Hi,
>
> I had intended to implement this requirement in the first way, but now I think the second and third are better because they require much less modification of the current code... Though the first way may support more than HDFS storage, it may still take a lot more time to modify the code in the TsFile module based on the current work.
>
> BR,
>
> --
> Zesong Sun
> School of Software, Tsinghua University
>
> 孙泽嵩
> 清华大学 软件学院
>
> -----Original Message-----
> From: "Jialin Qiao";
> Sent: Sunday, September 8, 2019, 2:08 PM
> To: "dev";
> Subject: Enable to choose storage in local file system or HDFS
>
> Hi,
>
> This issue is to let users directly use Spark to read data in IoTDB for analysis.
>
> This can be done in several ways in IoTDB:
>
> (1) Store all TsFiles (data files) and other files (system files, WALs) on HDFS, then use spark-tsfile to read TsFiles on HDFS.
> (2) Store only TsFiles on HDFS and other files on the local file system, then use spark-tsfile to read TsFiles on HDFS.
> (3) Store all files on the local file system and let users use spark-iotdb-connector to read data from IoTDB, regardless of where the TsFiles are stored.
>
> Personally, I prefer the second and the third. If we use the second way, do we need the FileFactory for all Files?
>
> Best,
>
> --
> Jialin Qiao
> School of Software, Tsinghua University
>
> 乔嘉林
> 清华大学 软件学院
>
> > -----Original Message-----
> > From: "Zesong Sun (Jira)"
> > Sent: 2019-08-29 19:34:00 (Thursday)
> > To: dev@iotdb.apache.org
> > Cc:
> > Subject: [jira] [Created] (IOTDB-187) Enable to choose storage in local file system or HDFS
> >
> > Zesong Sun created IOTDB-187:
> >
> > Summary: Enable to choose storage in local file system or HDFS
> > Key: IOTDB-187
> > URL: https://issues.apache.org/jira/browse/IOTDB-187
> > Project: Apache IoTDB
> > Issue Type: Improvement
> > Reporter: Zesong Sun
> >
> > Enable to choose storage in local file system or HDFS
> > "is_hdfs_storage=false" by default
> >
> > --
> > This message was sent by Atlassian Jira
> > (v8.3.2#803003)
IoTDB Apache Con slides
Hi all,

The current version is almost done; only the performance evaluation section is still blank. You can get the slides from [1] (the URL only supports viewing and commenting).

Do not hesitate to leave comments to make it better (e.g., should we add more technical content? We have 50 minutes!) :D

By the way, our talk is on 10 September, 17:00-17:50 (11 September, 8:00-8:50 AM Beijing Time).

[1] https://docs.google.com/presentation/d/1EXi4UY1IXaAKW1Ybh3iLzgkKQIvA5B--7oL_rQ8aG0Y/edit?usp=sharing

---
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院
[jira] [Created] (IOTDB-204) spark-tsfile narrow table's new way to execute query
Lei Rui created IOTDB-204:

Summary: spark-tsfile narrow table's new way to execute query
Key: IOTDB-204
URL: https://issues.apache.org/jira/browse/IOTDB-204
Project: Apache IoTDB
Issue Type: Improvement
Reporter: Lei Rui

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
Re: Enable to choose storage in local file system or HDFS
Hi,

I had intended to implement this requirement in the first way, but now I think the second and third are better because they require much less modification of the current code... Though the first way may support more than HDFS storage, it may still take a lot more time to modify the code in the TsFile module based on the current work.

BR,

--
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院

-----Original Message-----
From: "Jialin Qiao";
Sent: Sunday, September 8, 2019, 2:08 PM
To: "dev";
Subject: Enable to choose storage in local file system or HDFS

Hi,

This issue is to let users directly use Spark to read data in IoTDB for analysis.

This can be done in several ways in IoTDB:

(1) Store all TsFiles (data files) and other files (system files, WALs) on HDFS, then use spark-tsfile to read TsFiles on HDFS.
(2) Store only TsFiles on HDFS and other files on the local file system, then use spark-tsfile to read TsFiles on HDFS.
(3) Store all files on the local file system and let users use spark-iotdb-connector to read data from IoTDB, regardless of where the TsFiles are stored.

Personally, I prefer the second and the third. If we use the second way, do we need the FileFactory for all Files?

Best,

--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Zesong Sun (Jira)"
> Sent: 2019-08-29 19:34:00 (Thursday)
> To: dev@iotdb.apache.org
> Cc:
> Subject: [jira] [Created] (IOTDB-187) Enable to choose storage in local file system or HDFS
>
> Zesong Sun created IOTDB-187:
>
> Summary: Enable to choose storage in local file system or HDFS
> Key: IOTDB-187
> URL: https://issues.apache.org/jira/browse/IOTDB-187
> Project: Apache IoTDB
> Issue Type: Improvement
> Reporter: Zesong Sun
>
> Enable to choose storage in local file system or HDFS
> "is_hdfs_storage=false" by default
>
> --
> This message was sent by Atlassian Jira
> (v8.3.2#803003)
Listed in Contributors in github
Hi,

I noticed that some of our contributors submitted PRs that got merged, but they are not listed on the "Contributors" page of our GitHub repository.

This may be because your GitHub account is not linked to the email address you used to commit the code. You can try adding that email address in your account settings to see whether it works...

Best,

--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院