Hi,

Let's move it to JIRA as a new feature.

> something like "NaN"
Indeed, supporting "NaN" is important for real applications.

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


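For illustration, the "ON/OFF" string mapping and the "NaN"-style fallback mentioned in this thread could be sketched as a lookup table over stored integers. This is only a minimal sketch under assumptions: the class name `LookupLogicalType`, its methods, and the choice of "NaN" as the fallback label are hypothetical and not part of any IoTDB or TsFile API.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical logical type: integers are stored, labels are produced on read. */
final class LookupLogicalType {
    private final Map<Integer, String> table;
    private final String fallback;

    LookupLogicalType(Map<Integer, String> table, String fallback) {
        this.table = Map.copyOf(table);
        this.fallback = fallback;
    }

    /** Maps a stored primitive to its label, or to the fallback if the code is unknown. */
    String decode(int stored) {
        return table.getOrDefault(stored, fallback);
    }

    /** Maps a label back to its stored primitive; empty if the label is not in the table. */
    Optional<Integer> encode(String label) {
        return table.entrySet().stream()
                .filter(e -> e.getValue().equals(label))
                .map(Map.Entry::getKey)
                .findFirst();
    }

    public static void main(String[] args) {
        LookupLogicalType onOff = new LookupLogicalType(Map.of(0, "OFF", 1, "ON"), "NaN");
        System.out.println(onOff.decode(1));   // ON
        System.out.println(onOff.decode(0));   // OFF
        System.out.println(onOff.decode(255)); // NaN (fallback for an unknown code)
    }
}
```

The stored series stays a plain integer series; only the optional read path applies the table, which matches the "layer on top" idea discussed in the thread.
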
Julian Feinauer <[email protected]> wrote on Thu, Oct 31, 2019 at 8:40 PM:

> Hi,
>
> I agree with your interpretation. It's just another layer with a different
> interpretation.
> So the idea would be to provide a different API initially to experiment a
> bit and probably add it to the "core" API eventually, so that type
> resolution always checks whether the type is primitive or logical.
>
> I mainly wanted to get your ideas and feedback about that and if you could
> imagine use cases for that.
> We would need something like "NaN" quite often in our use cases and I
> would also like to use a "string" mapping for "ON/OFF" rather than
> true/false as it makes it easier to interpret the data later on.
>
> Julian
>
> On 31.10.19, 05:39, "Xiangdong Huang" <[email protected]> wrote:
>
>     Hi,
>
>     > You can look at how avro handles non primitive types (they call it
>     LogicalTypes) here:
>     https://avro.apache.org/docs/1.8.1/spec.html#Logical+Types
>
>     Yes, I read some materials about LogicalTypes. It looks like a nickname
>     for a data type, with some new interpretation. E.g., a byte array data
>     type can be called Decimal, while the interpretation relies on how the
>     user defines the precision and scale.
>
>     Using this kind of implementation is also OK, I think.
>
>     So, would you like to provide the interface to users in the IoTDB layer
>     (so using SQL to operate on data), or on top of the TsFile layer (so
>     using the TsFile API to operate on data)?
>
>     Best,
>     -----------------------------------
>     Xiangdong Huang
>     School of Software, Tsinghua University
>
>      黄向东
>     清华大学 软件学院
>
>
>     Julian Feinauer <[email protected]> wrote on Wed, Oct 30, 2019 at 5:59 PM:
>
>     > Hi,
>     >
>     > In fact, in the MDF spec it is mostly not for compression (that's a
>     > nice side effect) but rather for being able to really express the
>     > (physical) content of a signal.
>     > So my initial idea was to implement it as an optional layer on top of
>     > the current TsFile which does the "interpretation", because in the
>     > TsFile it's always just a "primitive" series that is stored.
>     >
>     > So the idea would be to store some metadata (like a formula, lookup
>     > table, ...) on creation and use that on reading, but only optionally.
>     > You can look at how avro handles non primitive types (they call it
>     > LogicalTypes) here:
>     > https://avro.apache.org/docs/1.8.1/spec.html#Logical+Types
>     > This is similar to my idea.
>     >
>     > Julian
>     >
>     > On 29.10.19, 14:40, "Xiangdong Huang" <[email protected]> wrote:
>     >
>     >     Hi,
>     >
>     >     > Then it's most efficient to store integers and a formula like
>     >     > a * x + b with e.g. b = 3 and a = 1/100.
>     >     > So 3V would be stored as x = 0, 3.01V -> x = 1, ... 4.2V as
>     >     > x = 120.
>     >     > So we only store 0 to 120 and no decimals and stuff, which would
>     >     > be very easily compressible, I think.
>     >
>     >     Good idea! Two thumbs up for that.
>     >
>     >     But for cases like the above, implementing a new encoding method
>     >     is better than a new data type.
>     >
>     >     E.g., create time series root.a.b.voltage with encoding =
>     >     linear_transformation and encoding_parameter = "describe the
>     >     function like y = a * x + b" and datatype = INT.
>     >
>     >     "linear_transformation" is the new encoding method.
>     >
>     >     Now I see two cases from the discussion: one is like Optional
>     >     data, and the other is data that can be transformed.
>     >     So, do we want to support these two, or find a more general data
>     >     type for "rich data types" (can the MDF format provide some
>     >     inspiration)?
>     >
>     >     Best,
>     >     -----------------------------------
>     >     Xiangdong Huang
>     >     School of Software, Tsinghua University
>     >
>     >      黄向东
>     >     清华大学 软件学院
>     >
>     >
>     >     Julian Feinauer <[email protected]> wrote on Tue, Oct 29, 2019 at 8:26 PM:
>     >
>     >     > Hi Xiangdong,
>     >     >
>     >     > Regarding your second question:
>     >     > The use case is the other way round.
>     >     > We know that we measure e.g. a voltage between 3V and 4.2V with
>     >     > a precision of 0.01 or something.
>     >     > Then it's most efficient to store integers and a formula like
>     >     > a * x + b with e.g. b = 3 and a = 1/100.
>     >     > So 3V would be stored as x = 0, 3.01V -> x = 1, ... 4.2V as
>     >     > x = 120.
>     >     > So we only store 0 to 120 and no decimals and stuff, which would
>     >     > be very easily compressible, I think.
>     >     >
>     >     > Julian
>     >     >
>     >     > On 29.10.19, 07:13, "Xiangdong Huang" <[email protected]> wrote:
>     >     >
>     >     >     Hi,
>     >     >
>     >     >     > In Java we could model it as a variable Optional<Boolean> x
>     >     >     > which could be null, Optional.empty(), Optional.of(true),
>     >     >     > Optional.of(false).
>     >     >
>     >     >     It makes sense. And using a new data type to achieve it in
>     >     >     IoTDB is OK.
>     >     >
>     >     >     > Or scale formulas like a*x+b which allow leveraging the
>     >     >     > precision even for “small” double values or even integers.
>     >     >
>     >     >     So, are you considering a use case like: the time series
>     >     >     values should be [1, 1, 0, 0, 1, 1, 1, 0, 0...] but actually
>     >     >     we get [0.99, 0.99, 0.01, 0, 1, 1, 0.999, 0, 0.01] (because
>     >     >     of the precision of sensors)?
>     >     >     And which values do you want to save?
>     >     >     (1) Save them as 1 and 0. Or,
>     >     >     (2) save them as 0.99, 0.01 indeed, but use a specific query
>     >     >     API to return data like 1 and 0?
>     >     >
>     >     >     My other question is: is there a general data type that can
>     >     >     support the above cases?
>     >     >
>     >     >     Best,
>     >     >     -----------------------------------
>     >     >     Xiangdong Huang
>     >     >     School of Software, Tsinghua University
>     >     >
>     >     >      黄向东
>     >     >     清华大学 软件学院
>     >     >
>     >     >
>     >     >     Julian Feinauer <[email protected]> wrote on Tue, Oct 29, 2019
>     >     >     at 3:58 AM:
>     >     >
>     >     >     > Hi all,
>     >     >     >
>     >     >     > I wanted to discuss a possible new feature, which I will
>     >     >     > call the Rich Datatypes (RDT) API in the following.
>     >     >     > I worked a lot in the automotive industry and there is a
>     >     >     > broadly adopted open standard called ASAM MDF
>     >     >     > (https://www.asam.net/standards/detail/mdf/).
>     >     >     > It is a format which is targeted at efficient storage but
>     >     >     > at the same time supports VERY complex types (which are
>     >     >     > often used in automotive controllers).
>     >     >     >
>     >     >     > Take something as simple as a boolean. We could store it as
>     >     >     > a boolean (a Java boolean) in 1 bit.
>     >     >     > BUT overall we have 4 possibilities:
>     >     >     >
>     >     >     >   *   No value is available for a timestamp (NULL / nothing
>     >     >     >       stored)
>     >     >     >   *   We had a successful request but the controller does
>     >     >     >       not know whether true or false (or had an internal
>     >     >     >       error); this is a bit like
>     >     >     >       Optional.isPresent() == false
>     >     >     >   *   True
>     >     >     >   *   False
>     >     >     >
>     >     >     > In Java we could model it as a variable Optional<Boolean> x
>     >     >     > which could be null, Optional.empty(), Optional.of(true),
>     >     >     > Optional.of(false).
>     >     >     >
>     >     >     > Other examples are discrete values like “ON”, “OFF” (which
>     >     >     > are handled internally as “lookup tables” on integer rows).
>     >     >     > Or scale formulas like a*x+b which allow leveraging the
>     >     >     > precision even for “small” double values or even integers.
>     >     >     > Or a formula, but also a “fallback” lookup value like “NV”.
>     >     >     >
>     >     >     > I think this could be a valuable extension to IoTDB as an
>     >     >     > additional API (not changing anything below but just
>     >     >     > providing an API on top to do the calculation).
>     >     >     >
>     >     >     > What do others think?
>     >     >     >
>     >     >     > Julian
>     >     >     >
>     >     >
>     >     >
>     >     >
>     >
>     >
>     >
>
>
>

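The scaling scheme discussed in the thread (store small integers, apply y = a * x + b on read) can be sketched roughly as follows. This is a minimal sketch, not an existing IoTDB or TsFile encoding: the class name `LinearScale` and its methods are hypothetical, and the parameters a = 1/100 and b = 3 are the ones from the voltage example.

```java
/** Hypothetical linear scaling: physical = a * raw + b. */
final class LinearScale {
    private final double a; // scale factor (resolution of one raw step)
    private final double b; // offset (physical value of raw == 0)

    LinearScale(double a, double b) {
        if (a == 0.0) {
            throw new IllegalArgumentException("scale factor a must be non-zero");
        }
        this.a = a;
        this.b = b;
    }

    /** Converts a physical value to the integer that would be stored. */
    long encode(double physical) {
        return Math.round((physical - b) / a);
    }

    /** Converts a stored integer back to the physical value. */
    double decode(long raw) {
        return a * raw + b;
    }

    public static void main(String[] args) {
        LinearScale voltage = new LinearScale(0.01, 3.0);
        System.out.println(voltage.encode(3.0));  // 0
        System.out.println(voltage.encode(3.01)); // 1
        System.out.println(voltage.encode(4.2));  // 120
    }
}
```

With these parameters the 3V to 4.2V range maps onto the integers 0 to 120, which is why the stored series needs no decimals. Rounding in `encode` means values are quantized to the 0.01 resolution, so the round trip is exact only up to that precision.
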