Enable to choose storage in local file system or HDFS
Hi,

This issue is to let users directly use Spark to read data in IoTDB for analysis. This can be done in several ways in IoTDB:

(1) Store all TsFiles (data files) and other files (system files, WALs) on HDFS, then use spark-tsfile to read the TsFiles on HDFS.
(2) Store only the TsFiles on HDFS and the other files on the local file system, then use spark-tsfile to read the TsFiles on HDFS.
(3) Store all files on the local file system and let users read data from IoTDB through spark-iotdb-connector, regardless of where the TsFiles are stored.

Personally, I prefer the second and the third. If we use the second way, do we need the FileFactory for all files?

Best,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Zesong Sun (Jira)"
> Sent: 2019-08-29 19:34:00 (Thursday)
> To: dev@iotdb.apache.org
> Cc:
> Subject: [jira] [Created] (IOTDB-187) Enable to choose storage in local file system or HDFS
>
> Zesong Sun created IOTDB-187:
>
> Summary: Enable to choose storage in local file system or HDFS
> Key: IOTDB-187
> URL: https://issues.apache.org/jira/browse/IOTDB-187
> Project: Apache IoTDB
> Issue Type: Improvement
> Reporter: Zesong Sun
>
> Enable to choose storage in local file system or HDFS
> "is_hdfs_storage=false" by default
>
> --
> This message was sent by Atlassian Jira
> (v8.3.2#803003)
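To make option (2) more concrete, here is a minimal sketch of how a single switch could route only TsFiles to HDFS while everything else stays local. It is not IoTDB's FileFactory: `StorageConfig`, `resolve_path`, the two root locations and the `.tsfile` suffix check are all assumptions made for illustration; only the `is_hdfs_storage` name and its default of `false` come from the issue text.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class StorageConfig:
    # Mirrors the "is_hdfs_storage" switch from IOTDB-187; False is the stated default.
    is_hdfs_storage: bool = False
    # Hypothetical locations, chosen only for this example.
    hdfs_root: str = "hdfs://namenode:9000/iotdb"
    local_root: str = "/var/lib/iotdb"


def resolve_path(config: StorageConfig, relative_path: str) -> str:
    """Option (2): only TsFiles go to HDFS; system files and WALs stay local."""
    relative = relative_path.lstrip("/")
    # Assumption: data files can be recognised by a ".tsfile" suffix.
    if config.is_hdfs_storage and relative.endswith(".tsfile"):
        return config.hdfs_root.rstrip("/") + "/" + relative
    return str(PurePosixPath(config.local_root) / relative)
```

With the default configuration every path resolves locally; with `StorageConfig(is_hdfs_storage=True)` only paths ending in `.tsfile` resolve under the HDFS root, so a WAL path such as `wal/node1.wal` would still be local.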
Re: A new result set format
Hi,

I think this is one of the most important discussions for making IoTDB easily accessible to users.
As Xiangdong states, we have to make it comfortable for users to map to relational schemas.
And I agree that there are use cases for both formats, so we should probably just provide both.

Julian

On 07.09.19, 13:18, "Xiangdong Huang" wrote:

> Hi,
>
> Glad to see this discussion. I am traveling, so I do not have enough time these days to join you, but I will not miss this discussion.
>
> The discussion is important because querying is the function that users use most frequently in IoTDB to make their data valuable. Besides, it will impact how we map IoTDB's schema view to a relational schema view (e.g., when integrating with Calcite).
>
> I think Lei Rui got the key difference: the wide, the narrow and the narrowest table depend on "how to align the data".
>
> - If we want to align all timeseries according to the timestamp, then it is a wide table. (Minor issue: I think for the SQL "select * from root.sg_1", the printed result should be "d1.s1, d1.s2..." rather than "root.sg1.d1.s1, root.sg1.d1.s2", so that the table's header row can be concise.)
> - If we want to align all timeseries that belong to the same device (i.e., the data source) according to the timestamp, then it is a narrow table.
> - If we do not want to align data, then it is the narrowest table.
>
> I do not know which one users like. If we decide to support all three formats, maybe an ALIGN clause can be introduced in our SQL. (Well, Jialin called it "Group By"; I am not sure which one is better.)
>
> Best,
Re: A new result set format
Hi Jialin,

well, that's no hard requirement... it's totally fine if things come in that way from time to time :)
As I'm so new to IoTDB, I'm just still trying to understand what the user group currently is and who drives feature ideas and stuff :)
I hope that we start using IoTDB soon, and then we'll very likely also contribute some wishes :)

Julian

On 07.09.19, 09:26, "Jialin Qiao" wrote:

> Hi Julian,
>
> He is my friend and contacted me offline, because I advertise IoTDB on my WeChat (like Facebook or Twitter).
>
> Next time I will try to let him post the issue to the mailing list himself :)
>
> Best,
> --
> Jialin Qiao
> School of Software, Tsinghua University
>
> 乔嘉林
> 清华大学 软件学院
>
> > -----Original Message-----
> > From: "Julian Feinauer"
> > Sent: 2019-09-07 13:52:17 (Saturday)
> > To: "dev@iotdb.apache.org"
> > Cc:
> > Subject: Re: A new result set format
> >
> > Hi Jialin,
> >
> > perhaps one question about what "wanted by users" means (as I didn't see anything on the list).
> > How do these users get in contact with you?
> >
> > Julian
> >
> > On 07.09.19, 04:29, "Jialin Qiao" wrote:
> >
> > > Hi,
> > >
> > > As described in this issue, a new result set format is wanted by users. I'd like to open a discussion here.
> > >
> > > For simplicity, I refer to the format "time, root.sg1.d1.s1, root.sg1.d2.s1" as the wide table, and "time, deviceId, s1" as the narrow table.
> > >
> > > This issue is not only about how to organize the results, but also about the query process.
> > >
> > > The narrow table has some advantages:
> > >
> > > (1) For the wide table, we need to open a SeriesReader for each series at the same time, and each SeriesReader holds some ChunkMetadatas. For the narrow table, we only need to open SeriesReaders for one device at a time, then return the results and open SeriesReaders for the next device, which occupies less memory than the wide table.
> > > (2) Avoiding reading all series at once may also improve query latency.
> > >
> > > There are also some questions:
> > >
> > > (1) If we show results in the narrow table format to users, do we need to highlight the concepts of table and device?
> > > (2) If the answer to the first question is yes, do we need to support SQL like "select time, deviceId, s1, s2, s3 from root.sg1 where deviceId=d1"? This may involve a lot of work...
> > >
> > > From my side, I prefer that the answers to both questions be NO. Then we do not need to change the SQL grammar and only need a new query process to organize the result set.
> > >
> > > Best,
> > > --
> > > Jialin Qiao
> > > School of Software, Tsinghua University
> > >
> > > 乔嘉林
> > > 清华大学 软件学院
> > >
> > > > -----Original Message-----
> > > > From: "Jialin Qiao (Jira)"
> > > > Sent: 2019-09-07 09:40:00 (Saturday)
> > > > To: dev@iotdb.apache.org
> > > > Cc:
> > > > Subject: [jira] [Created] (IOTDB-203) A new result set format
> > > >
> > > > Jialin Qiao created IOTDB-203:
> > > >
> > > > Summary: A new result set format
> > > > Key: IOTDB-203
> > > > URL: https://issues.apache.org/jira/browse/IOTDB-203
> > > > Project: Apache IoTDB
> > > > Issue Type: New Feature
> > > > Reporter: Jialin Qiao
> > > >
> > > > When executing a SQL statement like "select d1.s1, d2.s1 from root.sg1", the default result set format in IoTDB is
> > > >
> > > > "time, root.sg1.d1.s1, root.sg1.d2.s1"
> > > > 1, 1, 1
> > > > 2, 2, 2
> > > >
> > > > However, some users want another format, in which the results are grouped by device, then sorted by time:
> > > >
> > > > "time, deviceId, s1"
> > > > 1, root.sg1.d1, 1
> > > > 2, root.sg1.d2, 2
> > > >
> > > > This can be done in the client, but it would be better if we supported this format in the server.
> > > >
> > > > --
> > > > This message was sent by Atlassian Jira
> > > > (v8.3.2#803003)
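The Jira issue notes that the narrow format "can be done in the client". A minimal sketch of such a client-side conversion follows. It assumes the wide result arrives as a header list plus rows, and that each column name is a full series path whose last node is the sensor; `wide_to_narrow` is a hypothetical helper, not part of any IoTDB client.

```python
def wide_to_narrow(header, rows):
    """Regroup wide rows (time, <device>.<sensor>, ...) by device, then by time."""
    # Split each full series path into [device, sensor] at the last dot.
    columns = [path.rsplit(".", 1) for path in header[1:]]

    # Collect devices and sensors in first-seen order.
    devices, sensors = [], []
    for device, sensor in columns:
        if device not in devices:
            devices.append(device)
        if sensor not in sensors:
            sensors.append(sensor)

    narrow_header = ["time", "deviceId"] + sensors
    narrow_rows = []
    for device in devices:
        for row in sorted(rows, key=lambda r: r[0]):
            values = {sensor: None for sensor in sensors}
            for (col_device, sensor), value in zip(columns, row[1:]):
                if col_device == device:
                    values[sensor] = value
            # Skip timestamps at which this device has no value at all.
            if any(v is not None for v in values.values()):
                narrow_rows.append([row[0], device] + [values[s] for s in sensors])
    return narrow_header, narrow_rows
```

Doing this in the client still requires the server to produce the full wide result first, which is why the thread also discusses a server-side query process.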
A new result set format
Hi,

Glad to see this discussion. I am traveling, so I do not have enough time these days to join you, but I will not miss this discussion.

The discussion is important because querying is the function that users use most frequently in IoTDB to make their data valuable. Besides, it will impact how we map IoTDB's schema view to a relational schema view (e.g., when integrating with Calcite).

I think Lei Rui got the key difference: the wide, the narrow and the narrowest table depend on "how to align the data".

- If we want to align all timeseries according to the timestamp, then it is a wide table. (Minor issue: I think for the SQL "select * from root.sg_1", the printed result should be "d1.s1, d1.s2..." rather than "root.sg1.d1.s1, root.sg1.d1.s2", so that the table's header row can be concise.)
- If we want to align all timeseries that belong to the same device (i.e., the data source) according to the timestamp, then it is a narrow table.
- If we do not want to align data, then it is the narrowest table.

I do not know which one users like. If we decide to support all three formats, maybe an ALIGN clause can be introduced in our SQL. (Well, Jialin called it "Group By"; I am not sure which one is better.)

Best,

On Saturday, September 7, 2019, Rui, Lei wrote:

> Sorry, pictures could not be attached to the last email I sent, so I supplement them here as text.
>
> The "wide" table is:
>
> | time | root.sg_1.device_1.sensor_1 | root.sg_1.device_1.sensor_2 | root.sg_1.device_2.sensor_1 | root.sg_1.device_2.sensor_2 |
> | 1 | 100 | 2.5 | 99 | 1.3 |
> | ... | ... | ... | ... | ... |
>
> The "narrow" table is:
>
> | time | device_Id | sensor_1 | sensor_2 |
> | 1 | root.sg_1.device_1 | 100 | 2.5 |
> | ... | ... | ... | ... |
> | 1 | root.sg_1.device_2 | 99 | 1.3 |
> | ... | ... | ... | ... |
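The three alignment choices described in Xiangdong's message can be sketched as one function with a mode switch. This is only an illustration of the idea: the ALIGN clause is a proposal in this thread, and the mode names `"timestamp"`, `"device"` and `"none"` as well as the `align` function are hypothetical.

```python
def align(series, mode):
    """series maps a full path such as 'root.sg_1.device_1.sensor_1' to [(time, value), ...]."""
    if mode == "none":
        # Narrowest table: no alignment, one (time, path, value) row per point.
        return [(t, path, v) for path in sorted(series) for t, v in sorted(series[path])]

    if mode == "timestamp":
        # Wide table: every series becomes a column, rows are aligned by timestamp.
        paths = sorted(series)
        lookup = {p: dict(series[p]) for p in paths}
        times = sorted({t for p in paths for t in lookup[p]})
        return [(t, *[lookup[p].get(t) for p in paths]) for t in times]

    if mode == "device":
        # Narrow table: align only the series of one device, device after device.
        by_device = {}
        for path, points in series.items():
            device, sensor = path.rsplit(".", 1)
            by_device.setdefault(device, {})[sensor] = dict(points)
        sensors = sorted({s for per_device in by_device.values() for s in per_device})
        rows = []
        for device in sorted(by_device):
            times = sorted({t for pts in by_device[device].values() for t in pts})
            for t in times:
                rows.append((t, device, *[by_device[device].get(s, {}).get(t) for s in sensors]))
        return rows

    raise ValueError(f"unknown alignment mode: {mode}")
```

Fed with the four sample values from Lei Rui's tables (100, 2.5, 99, 1.3 at time 1), the `"timestamp"` mode yields one five-column row and the `"device"` mode yields two rows, one per device.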
Re: A new result set format
Sorry, pictures could not be attached to the last email I sent, so I supplement them here as text.

The "wide" table is:

| time | root.sg_1.device_1.sensor_1 | root.sg_1.device_1.sensor_2 | root.sg_1.device_2.sensor_1 | root.sg_1.device_2.sensor_2 |
| 1 | 100 | 2.5 | 99 | 1.3 |
| ... | ... | ... | ... | ... |

The "narrow" table is:

| time | device_Id | sensor_1 | sensor_2 |
| 1 | root.sg_1.device_1 | 100 | 2.5 |
| ... | ... | ... | ... |
| 1 | root.sg_1.device_2 | 99 | 1.3 |
| ... | ... | ... | ... |

On 9/7/2019 15:51, Rui, Lei wrote:

> Hi,
>
> I will try to make this proposal more concrete from a semantic perspective.
>
> Consider the SQL "select * from root.sg_1".
>
> The following format is the "wide" table:
>
> The following format is the "narrow" table:
>
> The levels of data from low to high are:
> - sensor data, or series data, e.g., from root.sg_1.device_1.sensor_1
> - device data, e.g., from root.sg_1.device_1
> - storage group data, e.g., from root.sg_1
>
> So, the SQL "select * from root.sg_1" queries data at the storage group level. To present the results, the wide table aligns all series data across multiple devices in the storage group by timestamp, while the narrow table aligns series data in a single device by timestamp, and does the same for the other devices in the storage group.
>
> By the way, I guess the "narrowest" table is for a single sensor's data, without the need to align with any other series data.
>
> I have one question:
> Why not make full use of SQL and just use "select * from root.sg_1.device_1" to specify the device (or the data level) they care about?
> Why use "select * from root.sg_1" with a narrow table format?
>
> Lastly, I think the better query execution efficiency that a narrow table may sometimes have is not the driving purpose, because presenting the query result in a wide table and in a narrow table are two different tasks.
>
> Sincerely,
> Lei Rui
Re: A new result set format
Hi,

I will try to make this proposal more concrete from a semantic perspective.

Consider the SQL "select * from root.sg_1".

The following format is the "wide" table:

The following format is the "narrow" table:

The levels of data from low to high are:
- sensor data, or series data, e.g., from root.sg_1.device_1.sensor_1
- device data, e.g., from root.sg_1.device_1
- storage group data, e.g., from root.sg_1

So, the SQL "select * from root.sg_1" queries data at the storage group level. To present the results, the wide table aligns all series data across multiple devices in the storage group by timestamp, while the narrow table aligns series data in a single device by timestamp, and does the same for the other devices in the storage group.

By the way, I guess the "narrowest" table is for a single sensor's data, without the need to align with any other series data.

I have one question:
Why not make full use of SQL and just use "select * from root.sg_1.device_1" to specify the device (or the data level) they care about?
Why use "select * from root.sg_1" with a narrow table format?

Lastly, I think the better query execution efficiency that a narrow table may sometimes have is not the driving purpose, because presenting the query result in a wide table and in a narrow table are two different tasks.

Sincerely,
Lei Rui

From: Jialin Qiao
Date: 9/7/2019 15:26
To:
Subject: Re: A new result set format

> Hi Julian,
>
> He is my friend and contacted me offline, because I advertise IoTDB on my WeChat (like Facebook or Twitter).
>
> Next time I will try to let him post the issue to the mailing list himself :)
>
> Best,
> --
> Jialin Qiao
> School of Software, Tsinghua University
>
> 乔嘉林
> 清华大学 软件学院
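Lei Rui's point that the path in the FROM clause already selects a data level can be illustrated with a small sketch. It assumes the three-level layout used in the examples of this thread (root.<storage group>.<device>.<sensor>); `select_series` and `level_of` are hypothetical helpers, not IoTDB functions.

```python
def select_series(paths, prefix):
    """Return the series paths at or below the given prefix, matching whole path nodes."""
    nodes = prefix.split(".")
    return [p for p in paths if p.split(".")[:len(nodes)] == nodes]


def level_of(prefix):
    """Name the data level of a prefix, assuming root.<storage group>.<device>.<sensor>."""
    depth = len(prefix.split("."))
    return {2: "storage group", 3: "device", 4: "sensor"}.get(depth, "unknown")
```

Under these assumptions, "root.sg_1" selects every series in the storage group, while "root.sg_1.device_1" narrows the selection to a single device, which is the alternative Lei Rui asks about.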
Re: A new result set format
Hi Julian,

He is my friend and contacted me offline, because I advertise IoTDB on my WeChat (like Facebook or Twitter).

Next time I will try to let him post the issue to the mailing list himself :)

Best,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Julian Feinauer"
> Sent: 2019-09-07 13:52:17 (Saturday)
> To: "dev@iotdb.apache.org"
> Cc:
> Subject: Re: A new result set format
>
> Hi Jialin,
>
> perhaps one question about what "wanted by users" means (as I didn't see anything on the list).
> How do these users get in contact with you?
>
> Julian
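The memory argument from the opening message of this thread (open readers for one device at a time instead of one reader per series all at once) can be sketched with a generator. This is not IoTDB's SeriesReader: `read_device_by_device` is a hypothetical function, and `open_reader` is a caller-supplied stand-in that returns an iterator of (time, value) pairs.

```python
def read_device_by_device(series_by_device, open_reader):
    """Yield narrow rows while holding readers for only one device at a time."""
    for device in sorted(series_by_device):
        # Open readers for this device only; readers of the previous device
        # are no longer referenced once this dictionary is rebuilt.
        readers = {sensor: open_reader(device, sensor) for sensor in series_by_device[device]}

        # Align this device's series by timestamp.
        merged = {}
        for sensor, reader in readers.items():
            for time, value in reader:
                merged.setdefault(time, {})[sensor] = value

        for time in sorted(merged):
            yield time, device, merged[time]
```

The sketch still buffers one device's data before yielding, so it only illustrates the reader-lifetime aspect; a real implementation would presumably merge the readers incrementally.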