[jira] [Commented] (IOTDB-462) Failed to execute goal for `download-maven-plugin`.

2020-02-11 Thread Jialin Qiao (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034983#comment-17034983
 ] 

Jialin Qiao commented on IOTDB-462:
---

Hi, another user has also run into this problem. Downgrading 
download-maven-plugin to 1.3.0 works, so I'd like to fix it in 
[https://github.com/apache/incubator-iotdb/pull/794]

> Failed to execute goal for `download-maven-plugin`.
> ---
>
> Key: IOTDB-462
> URL: https://issues.apache.org/jira/browse/IOTDB-462
> Project: Apache IoTDB
>  Issue Type: Bug
>  Components: Others
>Reporter: sunjincheng
>Priority: Major
> Attachments: iotdb_mvn.gif
>
>
> I cannot build the source code locally; the error message is as follows:
> {code:java}
> [ERROR] Failed to execute goal 
> com.googlecode.maven-download-plugin:download-maven-plugin:1.4.0:wget 
> (get-thrift-executable) on project service-rpc: Execution 
> get-thrift-executable of goal 
> com.googlecode.maven-download-plugin:download-maven-plugin:1.4.0:wget failed: 
> java.lang.ClassNotFoundException: 
> com.googlecode.download.maven.plugin.internal.CachedFileEntry -> [Help 1]
> {code}
> Does anyone else have this problem?
> For now, I have downgraded the plugin to 1.3.0 to work around this problem. If 
> most people run into it, I suggest downgrading the version to 1.3.0, as 
> follows:
> {code:java}
> <plugin>
>   <groupId>com.googlecode.maven-download-plugin</groupId>
>   <artifactId>download-maven-plugin</artifactId>
>   <version>1.3.0</version> <!-- From 1.4.0 to 1.3.0 -->
>   ...
> </plugin>
> {code}
> What do you think?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] Restructure QueryPlan: RawDataQueryPlan and AlignByDevicePlan

2020-02-11 Thread Xiangwei Wei
Hi all,


I'm currently working on this issue: Separate AlignByDevicePlan from QueryPlan [1].


There are three aspects that I changed:


(1) Restructure QueryPlan


The general query that aligns by time and the AlignByDevice query (previously 
called GroupByDevice) were both stored in QueryPlan before. Because they store 
different parameters and invoke different query processes, I separated them 
out of QueryPlan. QueryPlan is now the base class, and RawDataQueryPlan and 
AlignByDevicePlan, which represent the two types of queries, are subclasses of 
QueryPlan.


(2) The query process of an AlignByDevicePlan


There were three subclasses of QueryPlan before: GroupByPlan, FillQueryPlan, and 
AggregationPlan. They are now subclasses of RawDataQueryPlan. To avoid 
redundancy, I didn't create the same subclasses for AlignByDevicePlan; instead, 
these three sub-plans are stored as parameters in it. For example, if a query 
is both an AlignByDevicePlan and a GroupByPlan, the GroupByPlan parameter will 
be assigned and processed later.
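
The structure described in (1) and (2) can be sketched roughly as follows. This is only an illustration under assumptions: the class names come from this mail, but the fields, setters, and the describe() helper are hypothetical and are not taken from the actual patch.

```java
// Hypothetical, simplified stand-ins for the plan classes named in this mail.
abstract class QueryPlan {
    // Parameters shared by every kind of query would live here.
}

// General query aligned by time.
class RawDataQueryPlan extends QueryPlan {}

// The three former QueryPlan subclasses now extend RawDataQueryPlan.
class GroupByPlan extends RawDataQueryPlan {}
class FillQueryPlan extends RawDataQueryPlan {}
class AggregationPlan extends RawDataQueryPlan {}

// AlignByDevicePlan has no subclasses of its own; it stores the sub-plan
// as a parameter instead (null means "not this kind of query").
class AlignByDevicePlan extends QueryPlan {
    private GroupByPlan groupByPlan;
    private FillQueryPlan fillQueryPlan;
    private AggregationPlan aggregationPlan;

    void setGroupByPlan(GroupByPlan plan) { this.groupByPlan = plan; }
    void setFillQueryPlan(FillQueryPlan plan) { this.fillQueryPlan = plan; }
    void setAggregationPlan(AggregationPlan plan) { this.aggregationPlan = plan; }

    // Illustrative helper (an assumption): report which sub-plan is assigned.
    String describe() {
        if (groupByPlan != null) return "align by device + group by";
        if (fillQueryPlan != null) return "align by device + fill";
        if (aggregationPlan != null) return "align by device + aggregation";
        return "align by device (raw data)";
    }
}
```

With this shape, the executor would check which sub-plan field is set on an AlignByDevicePlan rather than relying on a parallel subclass tree.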


(3) The name of AlignByDevicePlan


You may have noticed this already: because GroupByDevice is not a standard SQL 
statement, I renamed it to AlignByDevice.


Thanks to Xiangdong Huang for the advice on this change. I also renamed some 
of the identifiers that used GroupByDevice, but only part of them. I hope we can 
change the rest little by little later.


[1]https://issues.apache.org/jira/browse/IOTDB-468



Best,
Xiangwei Wei





[jira] [Commented] (IOTDB-462) Failed to execute goal for `download-maven-plugin`.

2020-02-11 Thread Jialin Qiao (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034985#comment-17034985
 ] 

Jialin Qiao commented on IOTDB-462:
---

[https://github.com/apache/incubator-iotdb/pull/794]

> Failed to execute goal for `download-maven-plugin`.
> ---
>
> Key: IOTDB-462
> URL: https://issues.apache.org/jira/browse/IOTDB-462
> Project: Apache IoTDB
>  Issue Type: Bug
>  Components: Others
>Affects Versions: 0.8.0, 0.9.0, 0.9.1, 0.8.1, 0.8.2
>Reporter: sunjincheng
>Priority: Major
> Attachments: iotdb_mvn.gif
>
>


[jira] [Commented] (IOTDB-298) Last time-value query

2020-02-11 Thread Shao Wei (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035005#comment-17035005
 ] 

Shao Wei commented on IOTDB-298:


Hi, a last query scan works just like a normal query scan. I've learned that we 
make a copy of the memtable, and this copy can be used for the query. So there 
should be no blocking, since multiple copies of the memtable can be used during 
reads and writes.

> Last time-value query
> -
>
> Key: IOTDB-298
> URL: https://issues.apache.org/jira/browse/IOTDB-298
> Project: Apache IoTDB
>  Issue Type: New Feature
>Reporter: Jialin Qiao
>Assignee: Shao Wei
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We have two aggregators:
> last_value -> return the value of the last data point
> max_time -> return the time of the last data point
> However, we do not have an aggregator that returns the last time-value pair. 
> This is very antihuman in a time-series database :(:(:(
>  
> I suggest adding a new last query:
>  
> last s1 from root.sg1.d1, root.sg1.d2 or other similar grammar.
>  
> The Result should be in the following format: 
>  
> Path, Time, Value
> root.sg1.d1.s1, 100, 100
> root.sg1.d2.s1, 10, 10
>  





[jira] [Closed] (IOTDB-462) Failed to execute goal for `download-maven-plugin`.

2020-02-11 Thread Jialin Qiao (Jira)


 [ 
https://issues.apache.org/jira/browse/IOTDB-462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jialin Qiao closed IOTDB-462.
-
Fix Version/s: 0.10.0-SNAPSHOT
   Resolution: Fixed

> Failed to execute goal for `download-maven-plugin`.
> ---
>
> Key: IOTDB-462
> URL: https://issues.apache.org/jira/browse/IOTDB-462
> Project: Apache IoTDB
>  Issue Type: Bug
>  Components: Others
>Affects Versions: 0.8.0, 0.9.0, 0.9.1, 0.8.1, 0.8.2
>Reporter: sunjincheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0-SNAPSHOT
>
> Attachments: iotdb_mvn.gif
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>


Re: About changing Github description topics

2020-02-11 Thread Jialin Qiao
Hi,

It looks good. Just one thing: all topics need to be lowercase.

Thanks,
—
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院


atoiLiu wrote on Tue, Feb 11, 2020 at 12:03 PM:

> Hi,
>
> Infra said it was a self-serve service and provided a link to the
> explanatory document [1].
>
> I have submitted PR [2] and hope you can help review it.
>
> [1]
> https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories
>
> [2] https://github.com/apache/incubator-iotdb/pull/789
>
> Thanks
>
> Dawei Liu
>
> > On Feb 10, 2020, at 2:54 PM, Xiangdong Huang wrote:
> >
> > Hi,
> >
> > Of course, you can.
> >
> > You can open a new ticket, choose the Project as INFRA, the Issue Type as
> > Wish, the second Project as Incubator, and then describe your wish.
> >
> > Do not forget to copy the link to this discussion into your description.
> >
> > Best,
> >
> > ---
> > Xiangdong Huang
> > School of Software, Tsinghua University
> >
> > 黄向东
> > 清华大学 软件学院
> >
> >
> > atoiLiu wrote on Sun, Feb 9, 2020 at 6:53 PM:
> >
> >> Hi,
> >> Thank you very much for your reply, here is our summary so far:
> >>
> >> TimeSeries, TSDB, database, IoT, NoSQL, big-data, Java
> >>
> >> Can I directly send email to infra, or send it to someone with authority
> >> to modify?
> >>
> >>
> >> Thanks
> >> Dawei Liu
> >>
> >>> On Feb 9, 2020, at 4:49 PM, Jialin Qiao wrote:
> >>>
> >>> Hi,
> >>>
> >>> +1 for TimeSeries, TSDB, database, IoT, NoSQL
> >>>
> >>> Thanks,
> >>> —
> >>> Jialin Qiao
> >>> School of Software, Tsinghua University
> >>>
> >>> 乔嘉林
> >>> 清华大学 软件学院
> >>>
> >>>
> >>> jincheng sun wrote on Sun, Feb 9, 2020 at 9:18 AM:
> >>>
> >>>> Thanks for bringing up this discussion!
> >>>>
> >>>> +1 for adding more GitHub topics for IoTDB.
> >>>>
> >>>> I think the examples that Xiangdong mentioned are pretty good. In
> >>>> addition, we can also add language information, such as: Python, Java,
> >>>> Scala
> >>>>
> >>>> Best,
> >>>> Jincheng
> >>>>
> >>>>
> >>>> Xiangdong Huang wrote on Sat, Feb 8, 2020 at 9:48 PM:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> +1.  Let's discuss what topics we need to add.
> >>>>>
> >>>>> E.g., TimeSeries, TSDB, TimeSeriesDatabase, database, IoT, NoSQL, etc.
> >>>>>
> >>>>> Best,
> >>>>> ---
> >>>>> Xiangdong Huang
> >>>>> School of Software, Tsinghua University
> >>>>>
> >>>>> 黄向东
> >>>>> 清华大学 软件学院
> >>>>>
> >>>>>
> >>>>> atoiLiu wrote on Sat, Feb 8, 2020 at 8:20 PM:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I noticed that IoTDB's GitHub topics were not set, which would greatly
> >>>>>> affect search.
> >>>>>>
> >>>>>> If I'm a person looking for a management solution for time series data,
> >>>>>> when I search for timeseries on GitHub, IoTDB will not be recommended as
> >>>>>> relevant content to users.
> >>>>>>
> >>>>>> I've looked at other Apache projects, and they all have topics set, like
> >>>>>> Flink, and what they set is:
> >>>>>> Scala Java big-data flink
> >>>>>>
> >>>>>> but we only set up IoTDB.
> >>>>>>
> >>>>>> Therefore, I suggest adding the following items:
> >>>>>>
> >>>>>> database timeseries iot iov big-data Java
> >>>>>>
> >>>>>> Any other suggestions?
>
>


Re: [DISCUSS] Remove PropertyPlan and PTree?

2020-02-11 Thread Jialin Qiao
Hi

The property tree is a half-finished feature whose use cases are not
clear. I suggest removing it.

Thanks,
—
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院


孙泽嵩 wrote on Tue, Feb 11, 2020 at 10:08 AM:

> Hi all,
>
> I'm currently working on refactoring MManager [1], and I found that the
> PTree code is a little confusing.
>
> It is used in PropertyPlan, but I did not find any related documents or user
> guides. The only usages are in test code, for example:
>
> "CREATE PROPERTY property1"
> "ADD LABEL label1 TO PROPERTY property1"
> "DELETE LABEL label1 FROM PROPERTY property1"
> "LINK root.m1.m2 TO property1.label1"
> "UNLINK root.m1.m2 FROM property1.label1"
>
> Do you think these statements will be useful in the future? Or do you think
> the PTree and PropertyPlan code could be removed?
>
>
> [1] https://issues.apache.org/jira/browse/IOTDB-274
>
>
> Best,
> ---
> Zesong Sun
> School of Software, Tsinghua University
>
> 孙泽嵩
> 清华大学 软件学院
>
>


Re: Suggestions for new TsFile

2020-02-11 Thread Haonan Hou
Hi Dawei,

Thank you so much for sharing your opinion about the new TsFile! 
I am very happy to take your suggestions.

You said we can remove TsOffsetArray and directly store the offsets of 
TimeseriesMetaData. I agree with you; it is better than my version. 
As for the optimization of TimeseriesMetaData, I would like to discuss it 
with other people to determine which way is better.

Best,

Haonan Hou




Suggestions for new TsFile

2020-02-11 Thread atoiLiu
Hi,

I'm studying the new TsFile in PR [1], and I think the design of TsFileMetaData 
can be improved.

TsFileMetaData has a TsOffsetArray, which records the offset of every 
TimeseriesMetaData, and a Map that records, for each device, the startIndex and 
endIndex into TsOffsetArray. It looks like this:

TsFileMetaData ---> { [0,1,2,3,4,5, ...], [ {deviceId(d0), [0,2]}, {deviceId(d1), 
[3,5]}, ... ] }

We can delete TsOffsetArray and store the offsets directly in the 
deviceIndexArray, so that TsFileMetaData has a Map from each device to its 
offsets. This change will save 4 bytes per device on disk, because every device 
only needs to record the number of offsets and the offsets themselves. It looks 
like this:

TsFileMetaData ---> [ {deviceId(d0), [0,1,2]}, {deviceId(d1), [3,4,5]}, ... ]


In addition, TimeseriesMetaData is an ordered structure on disk, and the 
TimeseriesMetaData entries of each device are stored contiguously, so 
TsFileMetaData does not need to store every offset. There are two possible 
optimizations:

1. Save startTime, endTime, and offset for each TimeseriesMetaData in 
TsFileMetaData. The nice thing about this is that once TsFileMetaData has been 
read from disk, you can directly filter out the TimeseriesMetaData that do not 
need to be read.


2. Only save the offset of the first TimeseriesMetaData of each device in 
TsFileMetaData, so that you can iterate from there with a single seek. It looks 
like this:

TsFileMetaData ---> [ {deviceId(d0), 0}, {deviceId(d1), 3}, ... ]
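
To make option 2 more concrete, here is a minimal sketch of how a reader could derive the range to read for one device when only start offsets are stored. Everything in it is an assumption for illustration: the class DeviceStartOffsetIndex, its methods, and the endOfMetadata parameter are hypothetical and are not part of the TsFile code in the PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical index that stores only the first TimeseriesMetaData offset per device.
class DeviceStartOffsetIndex {
    private final List<String> deviceIds = new ArrayList<>();
    private final List<Long> startOffsets = new ArrayList<>();
    // Offset just past the last TimeseriesMetaData entry (assumed to be known).
    private final long endOfMetadata;

    DeviceStartOffsetIndex(long endOfMetadata) {
        this.endOfMetadata = endOfMetadata;
    }

    // Devices must be added in on-disk order, i.e. with increasing offsets.
    void add(String deviceId, long startOffset) {
        if (!startOffsets.isEmpty()
                && startOffset <= startOffsets.get(startOffsets.size() - 1)) {
            throw new IllegalArgumentException("offsets must be strictly increasing");
        }
        deviceIds.add(deviceId);
        startOffsets.add(startOffset);
    }

    // Returns {start, endExclusive} for a device, or null if the device is unknown.
    // The end is the next device's start, so one seek plus a sequential read suffices.
    long[] rangeOf(String deviceId) {
        int i = deviceIds.indexOf(deviceId);
        if (i < 0) {
            return null;
        }
        long end = (i + 1 < startOffsets.size()) ? startOffsets.get(i + 1) : endOfMetadata;
        return new long[] {startOffsets.get(i), end};
    }
}
```

With the numbers from the example above (d0 starts at 0, d1 starts at 3, six entries in total), rangeOf("d0") gives [0, 3) and rangeOf("d1") gives [3, 6).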



[1] https://github.com/apache/incubator-iotdb/pull/736 


Thanks

Dawei Liu

Re: [DISCUSS] Table schema of group by device

2020-02-11 Thread Jialin Qiao
Hi,

If we use text when a column has multiple types, I'm ok with (3).
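
The rule discussed in this thread (keep the real type when every device agrees, fall back to text otherwise) could be sketched as below. This is only an illustration: the class AlignByDeviceColumnType, the DataType enum, and the resolve method are hypothetical names, not IoTDB APIs.

```java
import java.util.List;

class AlignByDeviceColumnType {
    // Hypothetical type names used only for this sketch.
    enum DataType { BOOLEAN, INT32, INT64, FLOAT, DOUBLE, TEXT }

    // Returns the shared type if every device agrees, otherwise TEXT.
    static DataType resolve(List<DataType> typesAcrossDevices) {
        if (typesAcrossDevices.isEmpty()) {
            throw new IllegalArgumentException("no device provides this measurement");
        }
        DataType first = typesAcrossDevices.get(0);
        for (DataType t : typesAcrossDevices) {
            if (t != first) {
                return DataType.TEXT;
            }
        }
        return first;
    }

    public static void main(String[] args) {
        // s1 is int32 on d1 but boolean on d2, so the column falls back to TEXT.
        System.out.println(resolve(List.of(DataType.INT32, DataType.BOOLEAN)));
        // s2 is int32 on both devices, so the column keeps INT32.
        System.out.println(resolve(List.of(DataType.INT32, DataType.INT32)));
    }
}
```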

Thanks,
—
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院


魏祥威 <526213...@qq.com> wrote on Sun, Feb 9, 2020 at 5:30 PM:

> Hi,
>
>
> I agree with the opinion of Xiangdong Huang.
>
>
> (3) is the friendliest option for users who are used to relational databases,
> and if they want a relational query (group by device query), their applications
> should guarantee the consistency of the data types.
>
> Best,
> Xiangwei Wei
>
>
>
> 
>
>
>
>
> -- Original Message --
> From: "Xiangdong Huang"
> Sent: Friday, February 7, 2020, 2:58 PM
> To: "dev"
> Subject: Re: [DISCUSS] Table schema of group by device
>
>
>
> One more suggestion: "align by device" is clearer than "group by
> device".
>
> ---
> Xiangdong Huang
> School of Software, Tsinghua University
>
> 黄向东
> 清华大学 软件学院
>
>
> Xiangdong Huang wrote:
>
> > -1 for (2), forever, and I think I will never vote +1 for it...
> >
> > If you do it like that, there is no chance to replace those applications
> > that use a relational DB to manage timeseries data.
> >
> > (3) is the friendliest for developers who use a relational DB, because
> > when they write SQL like "select c1, c2, c3 FROM", they take it for
> > granted that the result set has 3 columns...
> >
> > Of course, for users who use an RDB and want a table like "Time,
> > DeviceId, s1, s2", their applications can guarantee that the data type of
> > the data in s2 is constant.
> > If there are many data types in s2, the RDB users may use the "text" or
> > "varchar2" format directly.
> >
> > Considering that, I think the choice is: if all data in a column has the
> > same data type, use the correct data type. Otherwise use String.
> >
> > (1) Well, it can be an option. But my suggestion is: if all data in a
> > column has the same data type, do not change its column name.
> >
> > Best,
> > ---
> > Xiangdong Huang
> > School of Software, Tsinghua University
> >
> > 黄向东
> > 清华大学 软件学院
> >
> >
> > Jialin Qiao wrote:
> >
> >> Hi,
> >>
> >> In IOTDB-243 [1], we want to allow creating measurements with the same
> >> name but different types in the same storage group.
> >>
> >> For example,
> >> root.sg1.d1.s1, int32
> >> root.sg1.d1.s2, int32
> >> root.sg1.d2.s1, boolean
> >> root.sg1.d2.s2, int32
> >>
> >> This may cause trouble in group by device queries. How do we organize
> >> the result (table schema)? I thought of three ways:
> >>
> >> (1) Time, Device, s1_int32, s1_boolean, s2_int32
> >>
> >> * advantage:
> >> - No ambiguity
> >> - The number of columns is acceptable.
> >>
> >> * disadvantage:
> >> - In most cases, the datatype indicator is redundant and weird.
> >> - Difficult to use parallelization among devices in the query.
> >>
> >> (2) Time, d1, s1, s2 Time, d2, s1, s2
> >>
> >> * advantage:
> >> - No ambiguity
> >> - This could leverage the parallelization among devices in the query.
> >>
> >> * disadvantage:
> >> - The number of columns may be large.
> >>
> >> (3) Time, DeviceId, s1, s2
> >>
> >> This may require a lot of work in the QueryDataSet, and users need to get
> >> values carefully according to the measurement type of each device.
> >> Otherwise, it may cause a RuntimeException in the JDBC client.
> >>
> >> * advantage:
> >> - The number of columns is minimal.
> >>
> >> * disadvantage:
> >> - May cause ambiguity: a column of one table has more than one type, which
> >> also conflicts with the Spark connector or Hive connector.
> >> - Difficult to use parallelization in the query.
> >>
> >> ___
> >>
> >> From my perspective, I prefer (1) ≈ (2) > (3).
> >>
> >> What's your opinion?
> >>
> >> [1] https://issues.apache.org/jira/browse/IOTDB-243
> >>
> >> Thanks,
> >> --
> >> Jialin Qiao
> >> School of Software, Tsinghua University
> >>
> >> 乔嘉林
> >> 清华大学 软件学院
> 


Re: Suggestions for new TsFile

2020-02-11 Thread atoiLiu
Hi,

Thank you for your reply. 
I am very happy that you can take my suggestion.


Thanks

Dawei Liu





remove the log of "login" and "close session" or move them into a separate log file

2020-02-11 Thread Xiangdong Huang
Hi,

Is anyone else bothered by the following server log entries?

IoTDB: Login status: Login successfully. User : root
IoTDB: receive close operation
IoTDB: receive close session

When I checked one user's IoTDB log, I got stuck wading through so many "login"
and "close" log records...

I know auditing is important for a DB. If so, let's at least move this kind of
log to a separate log file.
Then log_all would be used just for checking whether an IoTDB instance has
exceptions.

What do you think?

Best,
---
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


Re: [DISCUSS] Remove PropertyPlan and PTree?

2020-02-11 Thread Xiangdong Huang
Hi,

+1. We can recover the removed code from the git history if it is ever really
needed in the future.

---
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院




Re: [DISCUSS] Table schema of group by device

2020-02-11 Thread Xiangdong Huang
Hi Jialin,

Very glad that you agree with that. :-D

---
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院




Re: Suggestions for new TsFile

2020-02-11 Thread Jialin Qiao
Hi,

If each device only stores the offset of each TimeseriesMetadata, like this:
TsFileMetaData ---> [ {deviceId(d0), [0,1,2]}, {deviceId(d1), [3,4,5]}, ... ]

it could be simplified to recording only the start offset and end offset:
TsFileMetaData ---> [ {deviceId(d0), [0,2]}, {deviceId(d1), [3,5]}, ... ]

And finally, it could be replaced by:
TsFileMetaData ---> [ {deviceId(d0), 0}, {deviceId(d1), 3}, ... ]
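
The two simplification steps above can be sketched as plain map transformations. This is an illustration only, assuming the offsets of one device are stored in increasing order; the class OffsetIndexCollapse and its method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OffsetIndexCollapse {
    // Step 1: full offset list per device -> {startOffset, endOffset}.
    // Assumes each list is non-empty and strictly increasing.
    static Map<String, long[]> toStartEnd(Map<String, List<Long>> full) {
        Map<String, long[]> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Long>> e : full.entrySet()) {
            List<Long> offsets = e.getValue();
            if (offsets.isEmpty()) {
                throw new IllegalArgumentException("no offsets for " + e.getKey());
            }
            for (int i = 1; i < offsets.size(); i++) {
                if (offsets.get(i) <= offsets.get(i - 1)) {
                    throw new IllegalArgumentException("offsets must be increasing");
                }
            }
            result.put(e.getKey(),
                    new long[] {offsets.get(0), offsets.get(offsets.size() - 1)});
        }
        return result;
    }

    // Step 2: {startOffset, endOffset} per device -> startOffset only.
    // The end is implied by the next device's start because entries are contiguous.
    static Map<String, Long> toStartOnly(Map<String, long[]> startEnd) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, long[]> e : startEnd.entrySet()) {
            result.put(e.getKey(), e.getValue()[0]);
        }
        return result;
    }
}
```

Applied to the example, {d0: [0,1,2], d1: [3,4,5]} becomes {d0: [0,2], d1: [3,5]} and then {d0: 0, d1: 3}.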

Thanks,
—
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院




[jira] [Commented] (IOTDB-298) Last time-value query

2020-02-11 Thread atoildw (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034666#comment-17034666
 ] 

atoildw commented on IOTDB-298:
---

I have a question: if I'm in step 2, scanning for the last result, and at the 
same time another thread is executing an insert, do you want to block the 
insert, or kill the raw data query?

> Last time-value query
> -
>
> Key: IOTDB-298
> URL: https://issues.apache.org/jira/browse/IOTDB-298
> Project: Apache IoTDB
>  Issue Type: New Feature
>Reporter: Jialin Qiao
>Assignee: Shao Wei
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We have two aggregators:
> last_value -> returns the value of the last data point
> max_time -> returns the time of the last data point
> However, we do not have an aggregator that returns the last time-value pair. 
> This is very unfriendly for users of a time-series database :(
>  
> I suggest adding a new last query:
>  
> last s1 from root.sg1.d1, root.sg1.d2 (or some similar grammar).
>  
> The result should be in the following format: 
>  
> Path, Time, Value
> root.sg1.d1.s1, 100, 100
> root.sg1.d2.s1, 10, 10
>  





[jira] [Commented] (IOTDB-298) Last time-value query

2020-02-11 Thread Shao Wei (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034640#comment-17034640
 ] 

Shao Wei commented on IOTDB-298:


Here is a brief design specification for the *Last* query:

1. "Last" query results will be cached in MTree nodes so that they can be 
accessed quickly. 
2. If no "Last" time-value result has been cached, perform a raw data query 
scan of the timeseries to find the last time-value result, and store it in an 
MTree node.
3. Upon each successful write, if the timestamp of the newly inserted tuple is 
larger than the one already cached, the write will update the cached 
time-value pair as well.
4. To persist the Last time-value result, create a snapshot of the current 
"Last" time-value in the memtable. When the memtable is flushed, this 
time-value result will be flushed to disk.
5. When the database restarts, recover the memtable with the Last time-value 
from disk and store this time-value result in the MTree.
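
Steps 1 to 3 above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the planned implementation: `LastCacheSketch`, `TimeValue`, `last` and `onWrite` are hypothetical names, a plain map stands in for the MTree nodes, the raw data query scan is passed in as a function, and concurrency between the scan and inserts is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical single-threaded sketch of the Last cache; not IoTDB code. */
final class LastCacheSketch {

  /** One cached time-value pair. */
  static final class TimeValue {
    final long time;
    final double value;

    TimeValue(long time, double value) {
      this.time = time;
      this.value = value;
    }
  }

  // Stand-in for the per-timeseries MTree nodes (step 1).
  private final Map<String, TimeValue> cache = new HashMap<>();
  // Stand-in for the raw data query scan (step 2); returns null if the series is empty.
  private final Function<String, TimeValue> rawScan;

  LastCacheSketch(Function<String, TimeValue> rawScan) {
    this.rawScan = rawScan;
  }

  /** Steps 1 and 2: serve from the cache, falling back to a scan on a miss. */
  TimeValue last(String path) {
    TimeValue cached = cache.get(path);
    if (cached == null) {
      cached = rawScan.apply(path);
      if (cached != null) {
        cache.put(path, cached);
      }
    }
    return cached;
  }

  /** Step 3: a successful write replaces the cached pair only if its timestamp is larger. */
  void onWrite(String path, long time, double value) {
    TimeValue cached = cache.get(path);
    if (cached != null && time > cached.time) {
      cache.put(path, new TimeValue(time, value));
    }
  }
}
```

In this sketch a write to a series with no cached entry leaves the cache empty, so the next `last` call falls back to the scan; whether the real design should populate the cache on such a write is left open here.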

My plan: 

Phase 1: finish items 1, 2 and 3 above. Persistence of the Last query result 
will not be implemented in this phase.
Phase 2: finish items 4 and 5 above. 






Re: [DISCUSS] Remove PropertyPlan and PTree?

2020-02-11 Thread jincheng sun
Hi Zesong,

Thanks for bringing up this discussion!

+1 for your proposal!

Best,
Jincheng



Xiangdong Huang wrote on Tue, Feb 11, 2020 at 9:39 PM:

> Hi,
>
> +1. We can recover the removed code from the git repo if it is really
> needed in the future.
>
> ---
> Xiangdong Huang
> School of Software, Tsinghua University
>
>  黄向东
> 清华大学 软件学院
>
>
> Jialin Qiao wrote on Tue, Feb 11, 2020 at 5:48 PM:
>
> > Hi
> >
> > The property tree is a half-finished feature whose use cases are not
> > clear. I suggest removing it.
> >
> > Thanks,
> > —
> > Jialin Qiao
> > School of Software, Tsinghua University
> >
> > 乔嘉林
> > 清华大学 软件学院
> >
> >
> > 孙泽嵩 wrote on Tue, Feb 11, 2020 at 10:08 AM:
> >
> > > Hi all,
> > >
> > > I'm currently refactoring MManager [1], and I found that the PTree
> > > code is a little confusing.
> > >
> > > It is used in PropertyPlan, but I did not find any related documents or
> > > user guides; the only usages I found are statements in test code such as:
> > >
> > > "CREATE PROPERTY property1"
> > > "ADD LABEL label1 TO PROPERTY property1"
> > > "DELETE LABEL label1 FROM PROPERTY property1"
> > > "LINK root.m1.m2 TO property1.label1"
> > > "UNLINK root.m1.m2 FROM property1.label1"
> > >
> > > Do you think these statements will be useful in the future? Or do you
> > > think the PTree and PropertyPlan code could be removed?
> > >
> > >
> > > [1] https://issues.apache.org/jira/browse/IOTDB-274#
> > >
> > >
> > > Best,
> > > ---
> > > Zesong Sun
> > > School of Software, Tsinghua University
> > >
> > > 孙泽嵩
> > > 清华大学 软件学院
> > >
> > >
> >
>