Hi,

If we use text when a column has multiple types, I'm OK with (3).

Thanks,
—————————————————
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院


Xiangwei Wei <526213...@qq.com> wrote on Sun, Feb 9, 2020 at 5:30 PM:

> Hi,
>
> I agree with the opinion of Xiangdong Huang.
>
> (3) is the most friendly for users who are using a relational DB, and if
> they want a relational query (group by device query), their applications
> should guarantee the consistency of the data type.
>
> Best,
> Xiangwei Wei
>
> ------------------ Original Message ------------------
> From: "Xiangdong Huang"<saint...@gmail.com>;
> Sent: Friday, February 7, 2020, 2:58 PM
> To: "dev"<dev@iotdb.apache.org>;
> Subject: Re: [DISCUSS] Table schema of group by device
>
> One more suggestion: using "align by device" is clearer than "group by
> device".
>
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
>
> 黄向东
> 清华大学 软件学院
>
>
> Xiangdong Huang <saint...@gmail.com> wrote on Fri, Feb 7, 2020 at 2:56 PM:
>
> > -1 for (2), forever, and I think I will never vote +1 for it...
> >
> > If you do it like that, there is no chance to replace those applications
> > which are using a relational DB to manage timeseries data.
> >
> > (3) is the most friendly for those developers who are using a relational
> > DB, because when they write a SQL statement like "select c1, c2, c3 FROM",
> > they naturally expect the result set to have 3 columns...
> >
> > Of course, for users who are using an RDB and want a table like "Time,
> > DeviceId, s1, s2", their applications can guarantee that the data type of
> > the data in s2 is constant.
> > If there are many data types in s2, the RDB users may use the "text" or
> > "varchar2" format directly.
> >
> > Considering that, I think the choice is: if all data has the same data
> > type in a column, use the correct data type. Otherwise use String.
> >
> > (1) Well, it can be an option. But my suggestion is, if all data has the
> > same data type in a column, do not change its column name.
> >
> > Best,
> > -----------------------------------
> > Xiangdong Huang
> > School of Software, Tsinghua University
> >
> > 黄向东
> > 清华大学 软件学院
> >
> >
> > Jialin Qiao <qiaojia...@apache.org> wrote on Fri, Feb 7, 2020 at 2:29 PM:
> >
> >> Hi,
> >>
> >> In IOTDB-243 [1], we want to allow creating measurements with the same
> >> name but with different types in the same storage group.
> >>
> >> For example,
> >> root.sg1.d1.s1 int32
> >> root.sg1.d1.s2 int32
> >> root.sg1.d2.s1 boolean
> >> root.sg1.d2.s2 int32
> >>
> >> This may cause trouble in the group by device query. How do we organize
> >> the result (table schema)? I thought of three ways:
> >>
> >> (1) Time, Device, s1_int, s1_boolean, s2_int32
> >>
> >> * advantage:
> >> - No ambiguity.
> >> - The number of columns is acceptable.
> >>
> >> * disadvantage:
> >> - In most cases, the datatype indicator is redundant and weird.
> >> - Difficult to use parallelization among devices in the query.
> >>
> >> (2) Time, d1, s1, s2 Time, d2, s1, s2
> >>
> >> * advantage:
> >> - No ambiguity.
> >> - This could leverage the parallelization among devices in the query.
> >>
> >> * disadvantage:
> >> - The number of columns may be large.
> >>
> >> (3) Time, DeviceId, s1, s2
> >>
> >> This may require much work in the QueryDataSet, and users need to read
> >> each value carefully according to the measurement type of that device.
> >> Otherwise, it may cause a RuntimeException in the JDBC client.
> >>
> >> * advantage:
> >> - The number of columns is minimal.
> >>
> >> * disadvantage:
> >> - May cause ambiguity: a column of one table has more than one type,
> >> which also conflicts with the Spark connector or Hive connector.
> >> - Difficult to use parallelization in the query.
> >>
> >> _______________
> >>
> >> From my perspective, I prefer (1) ≈ (2) > (3).
> >>
> >> What's your opinion?
> >>
> >> [1] https://issues.apache.org/jira/browse/IOTDB-243
> >>
> >> Thanks,
> >> —————————————————
> >> Jialin Qiao
> >> School of Software, Tsinghua University
> >>
> >> 乔嘉林
> >> 清华大学 软件学院
> >>
> >
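
For illustration only, the column-typing rule proposed in this thread for layout (3) (keep the shared type when every device agrees on a measurement's type, otherwise fall back to text) could be sketched roughly as follows. This is not IoTDB code: `SCHEMA`, `align_by_device_columns`, and the choice of `int64` for Time and `text` for Device are assumptions made for the sketch; only the four series and their types come from the example in the original message.

```python
from typing import Dict, List, Set, Tuple

# The four example series from the thread: full path -> data type.
SCHEMA: Dict[str, str] = {
    "root.sg1.d1.s1": "int32",
    "root.sg1.d1.s2": "int32",
    "root.sg1.d2.s1": "boolean",
    "root.sg1.d2.s2": "int32",
}


def align_by_device_columns(schema: Dict[str, str]) -> List[Tuple[str, str]]:
    """Build the (column name, column type) list for layout (3).

    Hypothetical helper: a measurement keeps its type if all devices
    agree on it; otherwise the column falls back to "text".
    """
    # Collect every type seen for each measurement name (last path segment).
    types_per_measurement: Dict[str, Set[str]] = {}
    for path, dtype in schema.items():
        measurement = path.rsplit(".", 1)[1]
        types_per_measurement.setdefault(measurement, set()).add(dtype)

    # Assumption: Time is reported as int64 and Device as text.
    columns: List[Tuple[str, str]] = [("Time", "int64"), ("Device", "text")]
    for measurement in sorted(types_per_measurement):
        types = types_per_measurement[measurement]
        column_type = next(iter(types)) if len(types) == 1 else "text"
        columns.append((measurement, column_type))
    return columns


if __name__ == "__main__":
    for name, column_type in align_by_device_columns(SCHEMA):
        print(name, column_type)
```

With the example series, s1 is int32 on d1 and boolean on d2, so its column becomes text, while s2 stays int32. Column names are never changed, which matches the suggestion not to rename a column whose data all has the same type.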