[jira] [Commented] (IOTDB-372) [Distributed] Support node deletion.

2020-02-03 Thread Tian Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17029556#comment-17029556
 ] 

Tian Jiang commented on IOTDB-372:
--

https://www.processon.com/diagraming/5e37c840e4b006a43aea52cb
Procedure design.

> [Distributed] Support node deletion.
> 
>
> Key: IOTDB-372
> URL: https://issues.apache.org/jira/browse/IOTDB-372
> Project: Apache IoTDB
> Issue Type: New Feature
> Reporter: Tian Jiang
> Priority: Major
> Labels: distributed
>
> Currently, only node addition is supported. To take a step toward scaling, 
> and even auto-scaling, node deletion is also needed. Node deletion is not a 
> simple reversal of node addition; it should be carefully designed, discussed, 
> and verified.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IOTDB-439) [Distributed] Incorrect Snapshot implementation and LogManager

2020-02-03 Thread Tian Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17029513#comment-17029513
 ] 

Tian Jiang commented on IOTDB-439:
--

3. Currently, the next log is blocked until the previous one is either timed 
out or committed, because high concurrency is mainly supported by partitioning. 
As operations in the same partition are basically serialized, this may not be a 
big issue, so the current method "replaceLastLog" is enough.
Still, holding multiple uncommitted logs remains a future optimization.
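
A minimal sketch of the behavior described in point 3, assuming a log that 
admits at most one uncommitted entry at its tail. The class SingleSlotLog and 
all of its methods are hypothetical names for illustration and are not the 
cluster_new implementation; only the method name "replaceLastLog" comes from 
the comment above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical log that allows at most one uncommitted entry at its tail. */
final class SingleSlotLog {
  private final List<String> entries = new ArrayList<>();
  private int commitIndex = -1; // index of the last committed entry, -1 if none

  /** Appends an entry; refused while the previous entry is still uncommitted. */
  synchronized boolean append(String entry) {
    if (hasUncommitted()) {
      return false; // the next log is blocked
    }
    entries.add(entry);
    return true;
  }

  /** Marks the tail entry as committed, which unblocks the next append. */
  synchronized void commitLast() {
    commitIndex = entries.size() - 1;
  }

  /** Overwrites the uncommitted tail entry, e.g. after it timed out. */
  synchronized boolean replaceLastLog(String entry) {
    if (!hasUncommitted()) {
      return false; // committed entries are never replaced
    }
    entries.set(entries.size() - 1, entry);
    return true;
  }

  synchronized boolean hasUncommitted() {
    return entries.size() - 1 > commitIndex;
  }

  synchronized List<String> copyOfEntries() {
    return new ArrayList<>(entries);
  }

  public static void main(String[] args) {
    SingleSlotLog log = new SingleSlotLog();
    log.append("a");
    System.out.println(log.append("b")); // false: "a" is still uncommitted
    log.replaceLastLog("a2");            // "a" timed out, so it is overwritten
    log.commitLast();
    System.out.println(log.append("b")); // true: the tail is committed now
    System.out.println(log.copyOfEntries()); // [a2, b]
  }
}
```

Holding several uncommitted logs would mean replacing the single tail slot 
with a window of pending entries, which is the future optimization mentioned 
above.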

> [Distributed] Incorrect Snapshot implementation and LogManager
> --
>
> Key: IOTDB-439
> URL: https://issues.apache.org/jira/browse/IOTDB-439
> Project: Apache IoTDB
> Issue Type: Sub-task
> Reporter: Xiangdong Huang
> Priority: Major
>
> I read the log/snapshot and manage packages in the current cluster_new branch, 
> and have some questions:
> 1. PartitionedSnapshotLogManager and FilePartitionedSnapshotLogManager are 
> incorrect as
>    a. they still store logs in memory while the JavaDoc says they do not 
> store data in memory.
>    b. When taking a snapshot, do they need to consider the part of the log in 
> memory?
>  
> 2. The current LogManager is not thread-safe. The caller (i.e., RaftMember) 
> uses the synchronized keyword to guarantee that for each call. 
>   a. Is there a better design?
>   b. Is there any performance problem, as all operations are serialized?
>  
> 3. Considering the Raft protocol, don't we need APIs like 
> `removeLogFrom(startIndex)` in LogManager? See the case of Figure 7 in the Raft 
> paper [1].
>  
> [1] [https://raft.github.io/raft.pdf]
>  
> [~jt2594838] may know the current implementation well.
>  
>  
>  
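
Question 3 refers to the Raft rule that a follower deletes a conflicting entry 
and everything after it before appending the leader's entries. A minimal sketch 
of such a truncation, assuming a plain in-memory list that stores only the term 
of each entry. TruncatableLog and everything in it other than the method name 
removeLogFrom are hypothetical and are not taken from the IoTDB code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory Raft log that stores only the term of each entry. */
final class TruncatableLog {
  private final List<Long> terms = new ArrayList<>(); // list position = log index
  private int commitIndex = -1;

  void append(long term) {
    terms.add(term);
  }

  void commitUpTo(int index) {
    commitIndex = Math.max(commitIndex, Math.min(index, terms.size() - 1));
  }

  /** Removes the entry at startIndex and every entry after it. */
  void removeLogFrom(int startIndex) {
    if (startIndex <= commitIndex) {
      throw new IllegalArgumentException("committed entries must not be removed");
    }
    if (startIndex < terms.size()) {
      terms.subList(startIndex, terms.size()).clear();
    }
  }

  /**
   * Follower side: the leader's entries start at prevIndex + 1. On the first
   * term mismatch the local suffix is dropped, then the leader's entries are
   * appended. The term check at prevIndex itself is omitted to keep this short.
   */
  boolean appendFromLeader(int prevIndex, List<Long> leaderTerms) {
    if (prevIndex >= terms.size()) {
      return false; // the follower is missing entries before prevIndex + 1
    }
    for (int i = 0; i < leaderTerms.size(); i++) {
      int index = prevIndex + 1 + i;
      if (index < terms.size() && !terms.get(index).equals(leaderTerms.get(i))) {
        removeLogFrom(index); // conflicting suffix goes away
      }
      if (index >= terms.size()) {
        terms.add(leaderTerms.get(i));
      }
    }
    return true;
  }

  List<Long> copyOfTerms() {
    return new ArrayList<>(terms);
  }

  public static void main(String[] args) {
    TruncatableLog follower = new TruncatableLog();
    for (long term : new long[] {1, 1, 2, 2}) {
      follower.append(term);
    }
    follower.commitUpTo(1);
    follower.appendFromLeader(1, Arrays.asList(3L, 3L));
    System.out.println(follower.copyOfTerms()); // [1, 1, 3, 3]
  }
}
```

With at most one uncommitted entry per log, the truncated suffix never holds 
more than one entry, which is why replacing only the last log can be enough in 
that design.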





[jira] [Commented] (IOTDB-439) [Distributed] Incorrect Snapshot implementation and LogManager

2020-02-03 Thread Tian Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17029508#comment-17029508
 ] 

Tian Jiang commented on IOTDB-439:
--

2.a. If you have concrete advice, it is welcome. Using "synchronized" can 
minimize the scope being locked, and as far as I can see, there is no problem 
with it.

2.b. Performance concerns are set aside for now, and there is no evidence of a 
performance problem. For correctness, synchronization is necessary. Tightening 
the scope that needs to be synchronized may be a good optimization, but it is 
too early for that now.
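
The pattern discussed in 2.a and 2.b can be sketched as follows: the log 
manager itself is not thread-safe, and the caller wraps each call in a 
synchronized block on it. SimpleLogManager and Member are hypothetical 
stand-ins for illustration, not the real LogManager or RaftMember classes.

```java
import java.util.ArrayList;
import java.util.List;

final class CallerSideLocking {
  /** Not thread-safe on its own: it only wraps a plain ArrayList. */
  static final class SimpleLogManager {
    private final List<String> logs = new ArrayList<>();

    void appendLog(String log) {
      logs.add(log);
    }

    int size() {
      return logs.size();
    }
  }

  /** The caller guards every access, keeping the locked scope to one call. */
  static final class Member {
    private final SimpleLogManager logManager = new SimpleLogManager();

    void append(String log) {
      synchronized (logManager) {
        logManager.appendLog(log);
      }
    }

    int logCount() {
      synchronized (logManager) {
        return logManager.size();
      }
    }
  }

  /** Appends from several threads and returns the final number of logs. */
  static int appendConcurrently(int threads, int perThread) {
    Member member = new Member();
    List<Thread> workers = new ArrayList<>();
    for (int t = 0; t < threads; t++) {
      Thread worker = new Thread(() -> {
        for (int i = 0; i < perThread; i++) {
          member.append("log");
        }
      });
      workers.add(worker);
      worker.start();
    }
    try {
      for (Thread worker : workers) {
        worker.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return member.logCount();
  }

  public static void main(String[] args) {
    System.out.println(appendConcurrently(4, 1000)); // 4000
  }
}
```

All appends are serialized by the lock, which is the correctness guarantee 
argued for in 2.b; whether this matters for throughput would have to be 
measured.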






[jira] [Commented] (IOTDB-439) [Distributed] Incorrect Snapshot implementation and LogManager

2020-02-03 Thread Tian Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17029502#comment-17029502
 ] 

Tian Jiang commented on IOTDB-439:
--

1.a. The JavaDoc says that committed logs do not have to be stored in memory. 
Storing them may bring some benefit for catch-up, but it is not necessary. 
Please compare it with PartitionedSnapshotLogManager carefully for a better 
understanding.

1.b. Committed logs are covered by snapshots, and uncommitted logs are not 
relevant to snapshots. Both cases are already considered.

So I do not know why you call it "incorrect".
 






[jira] [Commented] (IOTDB-448) Support "IN" operator in the WHERE clause

2020-02-03 Thread Tianan Li (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028953#comment-17028953
 ] 

Tianan Li commented on IOTDB-448:
-

Hi, I'm working on this issue.

> Support "IN" operator in the WHERE clause
> -
>
> Key: IOTDB-448
> URL: https://issues.apache.org/jira/browse/IOTDB-448
> Project: Apache IoTDB
> Issue Type: New Feature
> Components: Core/Engine, Planner/SQLParser
> Reporter: Xiangdong Huang
> Assignee: Tianan Li
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Hi, 
> while I am trying to migrate an application's database from MySQL to IoTDB, a 
> new operator in the SELECT statement is needed:
> MySQL:
> SELECT * FROM STATION_TABLE WHERE observationTime = '2016-11-30 10:00:00' AND 
> pressure IN (100,1000,150,200,250,300,400,500,700,850,925);
> With this SQL, if a station has data whose pressure is one of the listed 
> values, the station's data will be returned.
> Therefore, we need an "IN" operator in the WHERE clause, like:
> SELECT * FROM root.station.* where time =  2016-11-30 10:00:00 and pressure 
> IN (100,1000,150,200,250,300,400,500,700,850,925) group by device;
>  
>  





[jira] [Closed] (IOTDB-443) ReadOnlyMemChunk round float\double data incorrectly

2020-02-03 Thread Jialin Qiao (Jira)


 [ 
https://issues.apache.org/jira/browse/IOTDB-443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jialin Qiao closed IOTDB-443.
-
Fix Version/s: 0.10.0-SNAPSHOT
   Resolution: Fixed

> ReadOnlyMemChunk round float\double data incorrectly
> 
>
> Key: IOTDB-443
> URL: https://issues.apache.org/jira/browse/IOTDB-443
> Project: Apache IoTDB
> Issue Type: Bug
> Components: Core/Engine
> Affects Versions: 0.9.1
> Reporter: Xiangdong Huang
> Assignee: Zesong Sun
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.10.0-SNAPSHOT
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> In our design, only RLE and TS_2DIFF need float_precision to round the value.
> However, when we read data from memory, all float\double data will be 
> rounded, even if it uses Gorilla encoding.
> E.g., suppose root.sg1.d1.s1 has the double datatype and Gorilla encoding, and 
> float_precision=1.
> If we insert a value (t,s1) = (1, 1.123) and select the data, we will get 1.1.
> However, if we run `flush` after inserting the data and then select the data, 
> we will get 1.123.
>  
> How to fix: we need to modify the init() method in ReadOnlyMemChunk to check 
> the encoding method first. 
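
The proposed fix can be sketched as below. This is an illustration only: the 
enum Encoding, the class MemChunkRounding, and the method valueForRead are 
hypothetical and do not mirror the real ReadOnlyMemChunk code. The only facts 
taken from the issue are that RLE and TS_2DIFF round with float_precision and 
that Gorilla-encoded data must not be rounded.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class MemChunkRounding {
  /** Hypothetical subset of encodings, limited to those named in the issue. */
  enum Encoding { RLE, TS_2DIFF, GORILLA }

  /** Only RLE and TS_2DIFF use float_precision. */
  static boolean needsRounding(Encoding encoding) {
    return encoding == Encoding.RLE || encoding == Encoding.TS_2DIFF;
  }

  /** Checks the encoding first and rounds only when the encoding asks for it. */
  static double valueForRead(double raw, Encoding encoding, int floatPrecision) {
    if (!needsRounding(encoding)) {
      return raw;
    }
    return BigDecimal.valueOf(raw)
        .setScale(floatPrecision, RoundingMode.HALF_UP)
        .doubleValue();
  }

  public static void main(String[] args) {
    System.out.println(valueForRead(1.123, Encoding.GORILLA, 1)); // 1.123
    System.out.println(valueForRead(1.123, Encoding.RLE, 1));     // 1.1
  }
}
```

With such a check, reading from memory and reading after `flush` would return 
the same value for a Gorilla-encoded series.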





[jira] [Commented] (IOTDB-406) Refactor EnvironmentUtils for closing IoTDB timely in Integration Tests

2020-02-03 Thread Kaifeng Xue (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028869#comment-17028869
 ] 

Kaifeng Xue commented on IOTDB-406:
---

OK, I will close the mod file when closing the tsfileProcessor.

> Refactor EnvironmentUtils for closing IoTDB timely in Integration Tests
> ---
>
> Key: IOTDB-406
> URL: https://issues.apache.org/jira/browse/IOTDB-406
> Project: Apache IoTDB
> Issue Type: Task
> Reporter: Xiangdong Huang
> Assignee: Rui Liu
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.10.0-SNAPSHOT
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Currently, it often happens in some ITs that the IoTDB daemon was not closed 
> by previous tests.
> Besides, the IoTDB daemon does not need to be exposed to IT classes. 
> Therefore, I'd like to refactor the EnvironmentUtils class.
>  





[jira] [Closed] (IOTDB-445) Unify the keyword of "timestamp" and "time"

2020-02-03 Thread Jialin Qiao (Jira)


 [ 
https://issues.apache.org/jira/browse/IOTDB-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jialin Qiao closed IOTDB-445.
-
Fix Version/s: 0.10.0-SNAPSHOT
   Resolution: Fixed

> Unify the keyword of "timestamp" and "time"
> ---
>
> Key: IOTDB-445
> URL: https://issues.apache.org/jira/browse/IOTDB-445
> Project: Apache IoTDB
> Issue Type: Improvement
> Components: Planner/SQLParser
> Reporter: Xiangdong Huang
> Assignee: Boris Zhu
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.10.0-SNAPSHOT
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Hi, 
> currently, the "timestamp" keyword is confusing:
> In the insert statement, we use "insert into ... (TIMESTAMP, ) values 
> ()"
> while in the select statement, we are using "select ... where TIME > ...".
>  
> There are two choices, either unify them into one keyword or make them 
> equivalent.
>  





[jira] [Commented] (IOTDB-443) ReadOnlyMemChunk round float\double data incorrectly

2020-02-03 Thread Zesong Sun (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028840#comment-17028840
 ] 

Zesong Sun commented on IOTDB-443:
--

If you need this fix in your development branch, you can cherry-pick this 
commit: 
33a9604b0fde5a585ef7312687f707d1ef952ec6

> ReadOnlyMemChunk round float\double data incorrectly
> 
>
> Key: IOTDB-443
> URL: https://issues.apache.org/jira/browse/IOTDB-443
> Project: Apache IoTDB
> Issue Type: Bug
> Components: Core/Engine
> Affects Versions: 0.9.1
> Reporter: Xiangdong Huang
> Assignee: Zesong Sun
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>





[jira] [Commented] (IOTDB-447) Support query on time series which are not created

2020-02-03 Thread Xiangdong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028837#comment-17028837
 ] 

Xiangdong Huang commented on IOTDB-447:
---

Hi, 

One more requirement.

It would be good to support an "IS NOT NULL" clause if our strategy is "query 
as much as possible", because in some data analysis users want the data of 
measurements A, B, and C only if measurement A has data.

For example, meteorologists collect pressure, windSpeed, wind direction, etc. 
using meteorological balloons for meteorological data analysis.

However, sometimes the balloons do not return the pressure data (or return 
incorrect pressure data because of a network problem). In this case, the 
windSpeed and wind direction data are of no use for analyzing the relation 
between pressure and wind. So, when they use MySQL, they write the following SQL:

 

SELECT stationId,longitude,latitude,altitude,windDirection,windSpeed,pressure 
FROM AIR_STATION WHERE observationTime = ... AND pressure IS NOT NULL ORDER BY 
stationId ASC, pressure DESC

 

For IoTDB, we can also support "IS NOT NULL", e.g., where "pressure is not 
null", which means that for a device, if column measurementA has data at 
timestamp x but pressure has no data at timestamp x, the row is not returned.

Or, we can say the set of timestamps in the pressure column acts as one of the 
timeFilters.
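
The proposed semantics can be illustrated with a small in-memory sketch. The 
row representation (a map from measurement name to value per timestamp, where 
a missing key means no data) and the class NotNullFilter are assumptions made 
for illustration, not IoTDB internals.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class NotNullFilter {
  /** Keeps only the timestamps at which the given measurement has a value. */
  static SortedMap<Long, Map<String, Double>> whereNotNull(
      SortedMap<Long, Map<String, Double>> rows, String measurement) {
    SortedMap<Long, Map<String, Double>> result = new TreeMap<>();
    for (Map.Entry<Long, Map<String, Double>> row : rows.entrySet()) {
      if (row.getValue().get(measurement) != null) {
        result.put(row.getKey(), row.getValue());
      }
    }
    return result;
  }

  /** Two sample rows of one device: pressure is missing at timestamp 2. */
  static SortedMap<Long, Map<String, Double>> sampleRows() {
    Map<String, Double> complete = new HashMap<>();
    complete.put("windSpeed", 3.0);
    complete.put("pressure", 1000.0);
    Map<String, Double> withoutPressure = new HashMap<>();
    withoutPressure.put("windSpeed", 4.0);

    SortedMap<Long, Map<String, Double>> rows = new TreeMap<>();
    rows.put(1L, complete);
    rows.put(2L, withoutPressure);
    return rows;
  }

  public static void main(String[] args) {
    // Only timestamp 1 survives "pressure is not null".
    System.out.println(whereNotNull(sampleRows(), "pressure").keySet()); // [1]
  }
}
```

Seen this way, the timestamps of the pressure column act as a time filter on 
the other columns of the same device.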

 

 

> Support query on time series which are not created
> --
>
> Key: IOTDB-447
> URL: https://issues.apache.org/jira/browse/IOTDB-447
> Project: Apache IoTDB
> Issue Type: Bug
> Components: Core/Engine
> Affects Versions: 0.9.1
> Reporter: Xiangdong Huang
> Priority: Major
> Attachments: image-2020-02-01-21-21-04-037.png, keepQuery.patch
>
>
> Hi,
> I encountered a new case, which I think is also common.
> As we have supported auto-creating time series, users may give up creating 
> time series manually (like me!). Then the following scenario occurs:
> 1. I know I have some meteorological stations that can collect the 
> temperature, press, etc.
> 2. But they may not be sent to the server in one packet. Therefore, I have 
> to write code like:
>  
> {code:java}
> if (temperature != null) {
>   schema.add("temperature");
>   values.add(temperature + "");
> }
> if (press != null) {
>   schema.add("press");
>   values.add(press + "");
> }
> if (windSpeed10min != null) { // 10-minute wind speed
>   schema.add("windSpeed10min");
>   values.add(windSpeed10min + "");
> }
> session.insert(devicePath, observationTimeLong, schema, values);
> {code}
> Remember that I do not know when these stations will send the measurements to 
> the server. For example, maybe some stations have no anemoscopes, maybe the 
> anemoscopes of some stations are broken before I deploy my application.
> However, I do not know about that when I develop my query applications.
> 3. So, when I query data, I will write statements like:
> SELECT temperature, press,windDirection2min,windSpeed10min,... FROM 
> root.national.*.5.*.*.* WHERE time = '2020-01-19T04:00:00.000+08:00'
>  
> OK, now suppose there is one station that never sent "windSpeed10min" to the 
> server because of the reasons mentioned in 2; then the query will fail because: 
> Msg: Statement format is not right: Path: 
> "root.national.*.5.*.*.*.*.windSpeed10min" doesn't correspond to any known 
> time series.
>  
> What can I do? I can not query measurements and stations one by one...
>  
> If we do not allow auto-creating time series, the above scenario does not 
> happen because we will create ALL POSSIBLE time series at first. But, we know 
> it is not good, and that is one of the reasons that we support the 
> auto-creating time series.
>  
> So, now that we support auto-creating time series, it is better to fill the 
> result for nonexistent time series with NULL rather than return an error 
> message that blocks the query.
>  
>  
>  
>  
>  
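
The requested behavior can be sketched as follows, assuming the series known 
for one station at one timestamp are held in a map. NullFillQuery and 
selectRow are hypothetical names for illustration and say nothing about how 
the IoTDB query engine resolves paths.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NullFillQuery {
  /**
   * Returns one value per requested measurement. A measurement whose time
   * series was never created yields null instead of failing the whole query.
   */
  static List<Double> selectRow(Map<String, Double> existing, List<String> requested) {
    List<Double> row = new ArrayList<>();
    for (String measurement : requested) {
      row.add(existing.get(measurement)); // null when the series does not exist
    }
    return row;
  }

  public static void main(String[] args) {
    Map<String, Double> station = new HashMap<>();
    station.put("temperature", 21.5);
    station.put("press", 1002.0);
    // This station never sent windSpeed10min, so that column is null.
    System.out.println(
        selectRow(station, Arrays.asList("temperature", "press", "windSpeed10min")));
    // [21.5, 1002.0, null]
  }
}
```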





[jira] [Commented] (IOTDB-448) Support "IN" operator in the WHERE clause

2020-02-03 Thread Xiangdong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028827#comment-17028827
 ] 

Xiangdong Huang commented on IOTDB-448:
---

OK, I have found a workaround using the OR keyword: where pressure = 100 or 
pressure = 150 or ... 

At least it works.
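
The workaround can be generated programmatically. A minimal sketch, assuming 
the client builds the WHERE clause as a string and that the SQL dialect accepts 
a parenthesized condition; InAsOr.expand is a hypothetical helper, not part of 
any IoTDB API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class InAsOr {
  /** Rewrites "column IN (v1, v2, ...)" as a parenthesized chain of OR terms. */
  static String expand(String column, List<Integer> values) {
    if (values.isEmpty()) {
      throw new IllegalArgumentException("IN list must not be empty");
    }
    return values.stream()
        .map(value -> column + " = " + value)
        .collect(Collectors.joining(" or ", "(", ")"));
  }

  public static void main(String[] args) {
    System.out.println(expand("pressure", Arrays.asList(100, 150, 200)));
    // (pressure = 100 or pressure = 150 or pressure = 200)
  }
}
```

The parentheses keep the OR chain grouped when it is combined with other 
conditions, such as the time condition, by AND.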

> Support "IN" operator in the WHERE clause
> -
>
> Key: IOTDB-448
> URL: https://issues.apache.org/jira/browse/IOTDB-448
> Project: Apache IoTDB
> Issue Type: New Feature
> Components: Core/Engine, Planner/SQLParser
> Reporter: Xiangdong Huang
> Priority: Major
>





[BUILD-STABLE]: Job 'IoTDB Website [null] [109]'

2020-02-03 Thread Apache Jenkins Server
BUILD-STABLE: Job 'IoTDB Website [null] [109]':

Is back to normal.

[jira] [Commented] (IOTDB-450) Add system design documents

2020-02-03 Thread Jialin Qiao (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028759#comment-17028759
 ] 

Jialin Qiao commented on IOTDB-450:
---

Zesong and I have added the first version of the system design doc. Others are 
welcome to improve the documents.

> Add system design documents
> ---
>
> Key: IOTDB-450
> URL: https://issues.apache.org/jira/browse/IOTDB-450
> Project: Apache IoTDB
> Issue Type: Task
> Components: Document
> Reporter: Jialin Qiao
> Priority: Minor
>






[jira] [Commented] (IOTDB-450) Add system design documents

2020-02-03 Thread Jialin Qiao (Jira)


[ 
https://issues.apache.org/jira/browse/IOTDB-450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028758#comment-17028758
 ] 

Jialin Qiao commented on IOTDB-450:
---

[https://github.com/apache/incubator-iotdb/pull/757]







[jira] [Created] (IOTDB-450) Add system design documents

2020-02-03 Thread Jialin Qiao (Jira)
Jialin Qiao created IOTDB-450:
-

 Summary: Add system design documents
 Key: IOTDB-450
 URL: https://issues.apache.org/jira/browse/IOTDB-450
 Project: Apache IoTDB
  Issue Type: Task
  Components: Document
Reporter: Jialin Qiao





