Re: Re: [jira] [Created] (IOTDB-201) Query parsing runs slower when using ANTLR v4

2019-09-08 Thread 康愈圆
Hi,

Yes, the antlr3.g file has the same detailed definitions. However, ANTLR v3 allows 
users to explicitly define the structure of the tree.

For example,

setStorageGroup
  : KW_SET KW_STORAGE KW_GROUP KW_TO prefixPath
  -> ^(TOK_SET ^(TOK_STORAGEGROUP prefixPath))
  ;

the structure of the tree is like:

'SET'
  |
'STORAGEGROUP'
  |
 prefixPath

Here prefixPath is itself a subtree. Users can analyse the AST node recursively 
with a function such as analyze(prefixPath), and the data are accessed by reference.

However, ANTLR v4 no longer supports the '->' rewrite operator, so the statement 
for setting a storage group is defined as

setStorageGroup
  : KW_SET KW_STORAGE KW_GROUP KW_TO prefixPath
  ;

If we need the string content of prefixPath, we can call 
prefixPath.getText(), which is actually clearer and more direct for developers. 
However, if prefixPath is not a leaf node, a StringBuilder is created to 
assemble the text instead of accessing the data by reference. Although 
operations on a StringBuilder are faster than on a String, creating 
StringBuilders too frequently is a heavy overhead, which cancels out that 
benefit and can even reduce overall performance.
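
To make the cost concrete, here is a minimal Java sketch. It does not use the 
ANTLR runtime: Node, prefixPath and buildersCreated are hypothetical stand-ins, 
and the getText() below only imitates the behaviour described above (a leaf 
returns its token text by reference, a non-leaf node concatenates its children 
through a new StringBuilder).

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a parse tree; these are NOT the ANTLR runtime classes. */
final class GetTextCostSketch {

    /** Counts how many StringBuilder objects getText() has allocated. */
    static int buildersCreated = 0;

    /** Hypothetical tree node: a leaf holds token text, an inner node holds children. */
    static final class Node {
        final String tokenText;                       // non-null only for leaves
        final List<Node> children = new ArrayList<>();

        Node(String tokenText) {
            this.tokenText = tokenText;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }

        /** Leaves return their text by reference; inner nodes rebuild it from all descendants. */
        String getText() {
            if (tokenText != null) {
                return tokenText;
            }
            buildersCreated++;
            StringBuilder sb = new StringBuilder();
            for (Node child : children) {
                sb.append(child.getText());
            }
            return sb.toString();
        }
    }

    /** Builds a path such as root.sg.d1 in which every segment is wrapped in its own rule node. */
    static Node prefixPath(String... segments) {
        Node path = new Node(null);
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                path.add(new Node("."));
            }
            path.add(new Node(null).add(new Node(segments[i])));
        }
        return path;
    }

    public static void main(String[] args) {
        Node path = prefixPath("root", "sg", "d1");
        buildersCreated = 0;
        String text = path.getText();
        System.out.println(text + " needed " + buildersCreated + " StringBuilder(s)");
    }
}
```

For root.sg.d1 this sketch allocates four StringBuilders for a single call: one 
for the path node and one per segment rule. The more nested rules a grammar puts 
between the path and its tokens, the more allocations each getText() call costs.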

Currently, I think this is what leads to the problem.

Best,
-
Yuyuan KANG





Re: Solving jira problem (IOTDB-180) Get rid of JSON format in "show timeseries"

2019-09-08 Thread thss15_yit
The JIRA link of this issue is 
https://issues.apache.org/jira/projects/IOTDB/issues/IOTDB-180?filter=allopenissues










Solving jira problem (IOTDB-180) Get rid of JSON format in "show timeseries"

2019-09-08 Thread thss15_yit
Hi,
I have been working on JIRA issue [IOTDB-180 get rid of JSON format in 
"show timeseries"] these days.
   My plan for this issue is to merge the execution of the statement 
"show timeseries" into "show timeseries ", using the functions of "show 
timeseries " to output the data in table format, and then to remove some 
of the JSON-format functions that are no longer needed.


Tao Yi 

Re: A simple tool to visualize logs

2019-09-08 Thread Tian Jiang
Hi,


This feature is available at 
https://github.com/apache/incubator-iotdb/pull/370. Please have a look if you 
have some time to spare. Thanks a lot.


Tian Jiang


jt2594...@163.com




Re: Enable to choose storage in local file system or HDFS

2019-09-08 Thread Zesong Sun
Hi,


Thanks for this helpful suggestion! I'll try it by implementing TsFileFactory 
and SystemFileFactory.
(They are factories that create files for different file systems, so I 
think the names don't need to be "...FileFSFactory"?)
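
A rough Java sketch of how the two factories could be split is below. It is only 
an illustration under assumptions: the FileFactory interface, the scheme() 
method, the HDFS placeholder and reading the setting from java.util.Properties 
are all hypothetical; only the is_hdfs_storage name and its default of false 
come from the IOTDB-187 issue. System files always stay on the local file 
system here, and only TsFiles follow the setting.

```java
import java.io.File;
import java.util.Properties;

final class FileFactorySketch {

    /** Hypothetical abstraction: creates file handles for one kind of file system. */
    interface FileFactory {
        File getFile(String path);

        String scheme();
    }

    /** Plain local files. */
    static final class LocalFileFactory implements FileFactory {
        public File getFile(String path) {
            return new File(path);
        }

        public String scheme() {
            return "local";
        }
    }

    /** Placeholder only: a real version would need the Hadoop client, which this sketch omits. */
    static final class HdfsFileFactory implements FileFactory {
        public File getFile(String path) {
            throw new UnsupportedOperationException("HDFS access is not part of this sketch");
        }

        public String scheme() {
            return "hdfs";
        }
    }

    /** TsFiles follow the is_hdfs_storage setting, which defaults to false. */
    static FileFactory tsFileFactory(Properties config) {
        boolean hdfs = Boolean.parseBoolean(config.getProperty("is_hdfs_storage", "false"));
        return hdfs ? new HdfsFileFactory() : new LocalFileFactory();
    }

    /** System files and WALs stay on the local file system in this sketch. */
    static FileFactory systemFileFactory() {
        return new LocalFileFactory();
    }

    public static void main(String[] args) {
        Properties config = new Properties();          // empty, so the default "false" applies
        System.out.println(tsFileFactory(config).scheme());
        config.setProperty("is_hdfs_storage", "true");
        System.out.println(tsFileFactory(config).scheme());
        System.out.println(systemFileFactory().scheme());
    }
}
```

With this split, callers that handle TsFiles ask tsFileFactory for a file and 
never check the storage setting themselves, while the rest of the code keeps 
using local files unchanged.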




BR,

--
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院


 





Re: A simple tool to visualize logs

2019-09-08 Thread Julian Feinauer
Hi,

I definitely need that (to adapt to our stuff!)

Julian





Re: A simple tool to visualize logs

2019-09-08 Thread Xiangdong Huang
Hi,

Sounds COOOL!

Analyzing system logs is one of the most important things for a complex system
(especially for a distributed system).

Will try this feature ASAP :D

Best,
---
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


Tian Jiang wrote on Wed, Sep 4, 2019 at 7:59 PM:

> Greetings,
>
>
>
>
> Usually, the very first thing we do on finding a bug is to search the
> logs. Logs play a vital role in debugging, especially in environments
> where attaching a debugger is impossible. In such circumstances, logs
> may be the only source of information for the developers.
>
>
>
> A single log, which is just a string, is easy to understand. But
> when it comes to mining information from thousands of logs or more,
> getting lost is nearly unavoidable, since humans have a far more limited memory
> for exact details than computers. From time to time, I forget what I
> have read and must go back to review the previous logs; as a
> result, progress is very slow. Reading several strings is easy, but
> when we have thousands, there must be a better way to present them than
> raw text.
>
>
>
> So I keep thinking it would be much better if we could turn the logs into
> plots. Of course there are existing tools, but they are often
> powerful but too heavy (like Kibana), or specialized for web or other logs
> (like Logstalgia). Having a fantastic web interface is great, but a simple
> and handy tool suits us better. What I want is something lightweight,
> stand-alone and highly customizable.
>
>
>
> As a result, I developed a simple tool that can visualize (plot) logs
> generated by IoTDB (with some modification, it can be applied to other types
> of logs, too) and generate reports. I designed a simple GUI that provides
> the full functionality and a command-line tool for generating reports quickly. The
> attachment contains an example report I generated from one of my
> experiments, which reveals interesting things such as how the size of memtables
> converges over time.
>
>
>
> I may have missed some tools that are more powerful or easier to use. If
> you know any, please inform me and I shall see what I can learn from them.
>
>
>
> Tian Jiang
> jt2594...@163.com


Re: [jira] [Created] (IOTDB-201) Query parsing runs slower when using ANTLR v4

2019-09-08 Thread Xiangdong Huang
Hi,

> There are some grammar definitions that are too detailed, such as decimal
numbers, which are categorized into many types. I think making the rules
more general may decrease the times of calling getText() method.

One question: does the antlr3.g file have the same detailed definitions,
e.g., for decimal numbers?

Best,

---
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


康愈圆 wrote on Thu, Sep 5, 2019 at 11:11 PM:

> Hi,
>
> I've been working on JIRA issue [IOTDB-190 switch to ANTLR v4] these days.
>
> I implemented the SQL parsing module. However, it seems that the parsing
> efficiency reduces a lot when using ANTLR v4.
>
> It turns out that RuleContext.getText() is frequently called, which takes
> more than 90% of the CPU time.
>
> The grammar definition (.g4 file) here is a continuation of the previous
> version (ANTLR v3). There are some grammar definitions that are too
> detailed, such as decimal numbers, which are categorized into many types. I
> think making the rules more general may decrease the times of calling
> getText() method.
>
> I plan to restructure the grammar definition to improve the parsing
> efficiency.
>
> 
> Yuyuan KANG
>
> On 2019-09-06 13:30:00, Yuyuan KANG (Jira) wrote:
> > Yuyuan KANG created IOTDB-201:
> > -
> >
> >  Summary: Query parsing runs slower when using ANTLR v4
> >  Key: IOTDB-201
> >  URL: https://issues.apache.org/jira/browse/IOTDB-201
> >  Project: Apache IoTDB
> >   Issue Type: Improvement
> > Reporter: Yuyuan KANG
> >
> >
> > The system now uses ANTLR v3. When migrated to ANTLR v4 using the
> > previous grammar definition, experimental results show that the efficiency of
> > logical plan generation is negatively impacted.
> >
> >
> >
> > --
> > This message was sent by Atlassian Jira
> > (v8.3.2#803003)
>
>


Re: Enable to choose storage in local file system or HDFS

2019-09-08 Thread Xiangdong Huang
Hi,

> do we need the FileFactory for all Files?

One solution is to have two FileSystemUtil classes (or FileFactory classes):
TsFileFSFactory and one for the rest.

Best,
---
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


Zesong Sun wrote on Sun, Sep 8, 2019 at 5:24 AM:

> Hi,
>
>
> I had intended to implement this requirement in the first way, but now I
> think the second and third are better because they need much less modification of the current
> code. Though the first way may support more than HDFS storage, it would
> still take a lot more time to modify the code in the TsFile module based on
> current work.
>
>
>
>
> BR,
> --
> Zesong Sun
> School of Software, Tsinghua University
>
> 孙泽嵩
> 清华大学 软件学院
>
>
>
>
>
>
>
> -- Original Message --
> From: "Jialin Qiao";
> Sent: Sunday, September 8, 2019, 2:08 PM
> To: "dev";
>
> Subject: Enable to choose storage in local file system or HDFS
>
>
>
> Hi,
>
> This issue is to let users read data in IoTDB directly with Spark for
> analysis.
>
> This can be done in several ways in IoTDB:
>
> (1) Store all TsFiles (data files) and other files (system files, WALs)
> on HDFS, then use spark-tsfile to read the TsFiles on HDFS.
> (2) Store only TsFiles on HDFS and the other files on the local file system,
> then use spark-tsfile to read the TsFiles on HDFS.
> (3) Store all files on the local file system and let users use
> spark-iotdb-connector to read data from IoTDB, regardless of where the TsFiles
> are stored.
>
> Personally, I prefer the second and the third. If we use the second way,
> do we need the FileFactory for all Files?
>
> Best,
> --
> Jialin Qiao
> School of Software, Tsinghua University
>
> 乔嘉林
> 清华大学 软件学院
>
> > -Original Message-
> > From: "Zesong Sun (Jira)" 
> > Sent: 2019-08-29 19:34:00 (Thursday)
> > To: dev@iotdb.apache.org
> > Cc:
> > Subject: [jira] [Created] (IOTDB-187) Enable to choose storage in local file
> > system or HDFS
> >
> > Zesong Sun created IOTDB-187:
> > 
> >
> >  Summary: Enable to choose storage in local file system or HDFS
> >  Key: IOTDB-187
> >  URL: https://issues.apache.org/jira/browse/IOTDB-187
> >  Project: Apache IoTDB
> >   Issue Type: Improvement
> > Reporter: Zesong Sun
> >
> >
> > Enable to choose storage in local file system or HDFS
> > "is_hdfs_storage=false" by default
> >
> >
> >
> > --
> > This message was sent by Atlassian Jira
> > (v8.3.2#803003)


IoTDB Apache Con slides

2019-09-08 Thread Xiangdong Huang
Hi all,

The current version is almost done; only the performance evaluation
section is still blank.

You can get the slides from [1] (the URL only allows viewing and commenting).

Do not hesitate to leave comments to make them better (e.g., add more
technical content? We have 50 minutes! ) :D

By the way, our talk is on 10 September, 17:00-17:50 (Beijing Time: 11
September, 8:00-8:50 AM).

[1]
https://docs.google.com/presentation/d/1EXi4UY1IXaAKW1Ybh3iLzgkKQIvA5B--7oL_rQ8aG0Y/edit?usp=sharing



---
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


[jira] [Created] (IOTDB-204) spark-tsfile narrow table's new way to execute query

2019-09-08 Thread Lei Rui (Jira)
Lei Rui created IOTDB-204:
-

 Summary: spark-tsfile narrow table's new way to execute query
 Key: IOTDB-204
 URL: https://issues.apache.org/jira/browse/IOTDB-204
 Project: Apache IoTDB
  Issue Type: Improvement
Reporter: Lei Rui






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: Enable to choose storage in local file system or HDFS

2019-09-08 Thread Zesong Sun
Hi,


I had intended to implement this requirement in the first way, but now I think 
the second and third are better because they need much less modification of the 
current code. Though the first way may support more than HDFS storage, it would 
still take a lot more time to modify the code in the TsFile module based on 
current work.




BR,
--
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院


 





Listed in Contributors in github

2019-09-08 Thread Jialin Qiao
Hi,

I noticed that some of our contributors have submitted PRs that got merged, but 
they are not listed on the "Contributors" page of our GitHub repository. 

This may be because your GitHub account is not associated with the email address 
you used to commit the code. You can try adding that email in your account 
settings to see whether it works...

Best,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院