Hi,

I agree with your interpretation. It's just another layer with a different 
interpretation.
So the idea would be to provide a separate API initially to experiment a bit 
and probably add it to the "core" API eventually, so that the type resolution 
always checks whether the type is primitive or logical.

I mainly wanted to get your ideas and feedback on that, and to hear whether 
you can imagine use cases for it.
We would need something like "NaN" quite often in our use cases, and I would 
also like to use a "string" mapping for "ON/OFF" rather than true/false, as it 
makes it easier to interpret the data later on.
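
To make this a bit more concrete, here is a minimal sketch in Java of how such 
a layer could look. Everything in it is an assumption for illustration: the 
names LogicalTypeSketch, LogicalType, LinearScale and Lookup are hypothetical 
and exist neither in TsFile nor in IoTDB, and reserving one raw value to mean 
"no valid value" is just one possible convention. The stored series stays a 
plain int series; only the reader applies the interpretation.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: none of these types exist in TsFile or IoTDB.
final class LogicalTypeSketch {

    /** Interprets a stored primitive value (here: int) as a logical value of type T. */
    interface LogicalType<T> {
        T decode(int raw);
    }

    /** Linear scaling y = a * x + b, with one reserved raw value decoded as NaN. */
    static final class LinearScale implements LogicalType<Double> {
        private final double a;
        private final double b;
        private final int invalidRaw; // assumed convention for "no valid value"

        LinearScale(double a, double b, int invalidRaw) {
            this.a = a;
            this.b = b;
            this.invalidRaw = invalidRaw;
        }

        @Override
        public Double decode(int raw) {
            return raw == invalidRaw ? Double.NaN : a * raw + b;
        }
    }

    /** Lookup table from stored integers to labels such as "ON" and "OFF". */
    static final class Lookup implements LogicalType<Optional<String>> {
        private final Map<Integer, String> labels;

        Lookup(Map<Integer, String> labels) {
            this.labels = labels;
        }

        @Override
        public Optional<String> decode(int raw) {
            // Raw values without a label decode to an empty Optional.
            return Optional.ofNullable(labels.get(raw));
        }
    }

    public static void main(String[] args) {
        LinearScale voltage = new LinearScale(0.01, 3.0, -1);
        System.out.println(voltage.decode(1));   // about 3.01 (V)
        System.out.println(voltage.decode(-1));  // NaN

        Lookup onOff = new Lookup(Map.of(0, "OFF", 1, "ON"));
        System.out.println(onOff.decode(1).orElse("NV")); // ON
        System.out.println(onOff.decode(7).orElse("NV")); // NV
    }
}
```

With a = 0.01 and b = 3, the raw values 0 to 120 cover 3 V to 4.2 V, and the 
caller decides how to render a missing label (here with the "NV" fallback).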

Julian

On 31.10.19 at 05:39, "Xiangdong Huang" <[email protected]> wrote:

    Hi,
    
    > You can look at how avro handles non primitive types (they call it
    > LogicalTypes) here:
    > https://avro.apache.org/docs/1.8.1/spec.html#Logical+Types
    
    Yes, I read some material about LogicalTypes. It looks like a nickname for
    a data type, with a new interpretation. E.g., a byte array data type can
    be called Decimal, while the interpretation relies on how the user defines
    the precision and scale.
    
    Using this kind of implementation is also OK, I think.
    
    So, would you like to provide the interface to the user in the IoTDB layer
    (so using SQL to operate on data), or on top of the TsFile layer (so using
    the TsFile API to operate on data)?
    
    Best,
    -----------------------------------
    Xiangdong Huang
    School of Software, Tsinghua University
    
     黄向东
    清华大学 软件学院
    
    
    Julian Feinauer <[email protected]> wrote on Wed, Oct 30, 2019 at 5:59 PM:
    
    > Hi,
    >
    > In fact, in the MDF spec it is mostly not there for compression (that’s a
    > nice side effect) but rather for being able to really express the
    > (physical) content of a signal.
    > So my initial idea was to implement it as an optional layer on top of the
    > current TsFile which does the "interpretation", because in the TsFile it's
    > always just a "primitive" series that is stored.
    >
    > So the idea would be to store some metadata (like a formula, lookup table,
    > ...) on creation and use that on reading but only optionally.
    > You can look at how avro handles non primitive types (they call it
    > LogicalTypes) here:
    > https://avro.apache.org/docs/1.8.1/spec.html#Logical+Types
    > This is similar to my idea.
    >
    > Julian
    >
    > On 29.10.19 at 14:40, "Xiangdong Huang" <[email protected]> wrote:
    >
    >     Hi,
    >
    >     > Then it's most efficient to store integers and a formula like
    >     > a * x + b with e.g. b = 3 and a = 1/100.
    >     > So 3V would be stored as x = 0, 3.01V -> x = 1, ... 4.2V as x = 120.
    >     > So we only store 0 to 120 and no decimals and stuff, which would be
    >     > very easily compressible I think.
    >
    >     Good idea! Two thumbs up for that.
    >
    >     But for cases like the above, implementing a new encoding method is
    >     better than a new data type.
    >
    >     e.g., create time series root.a.b.voltage with encoding =
    >     linear_transformation and encoding_parameter = "describe the function
    >     like y=a * x + b" and datatype = INT.
    >
    >     "linear_transformation" is the new encoding method.
    >
    >     Now I get two cases from the discussion: one is like Optional data,
    >     and the other is data that can be transformed.
    >     So, do we want to support the above two, or find a more general data
    >     type for "rich data types" (can the MDF format provide some
    >     inspiration)?
    >
    >     Best,
    >     -----------------------------------
    >     Xiangdong Huang
    >     School of Software, Tsinghua University
    >
    >      黄向东
    >     清华大学 软件学院
    >
    >
    >     Julian Feinauer <[email protected]> wrote on Tue, Oct 29, 2019 at
    >     8:26 PM:
    >
    >     > Hi Xiangdong,
    >     >
    >     > to your second question:
    >     > The use case is the other way round.
    >     > We know that we measure e.g. a voltage between 3V and 4.2V with a
    >     > precision of 0.01 or something.
    >     > Then it's most efficient to store integers and a formula like
    >     > a * x + b with e.g. b = 3 and a = 1/100.
    >     > So 3V would be stored as x = 0, 3.01V -> x = 1, ... 4.2V as x = 120.
    >     > So we only store 0 to 120 and no decimals and stuff, which would be
    >     > very easily compressible I think.
    >     >
    >     > Julian
    >     >
    >     > On 29.10.19 at 07:13, "Xiangdong Huang" <[email protected]> wrote:
    >     >
    >     >     Hi,
    >     >
    >     >     > In Java we could model it as a variable Optional<Boolean> x
    >     >     > which could be null, Optional.empty(), Optional.of(true),
    >     >     > Optional.of(false).
    >     >
    >     >     It makes sense. And using a new data type to achieve it in
    >     >     IoTDB is OK.
    >     >
    >     >     > Or scale formulas like a*x+b which allows to leverage the
    >     >     > precision even for “small” double values or even integers.
    >     >
    >     >     So, are you considering a use case like: the time series values
    >     >     should be [1, 1, 0, 0, 1, 1, 1, 0, 0...] but actually we get
    >     >     [0.99, 0.99, 0.01, 0, 1, 1, 0.999, 0, 0.01] (because of the
    >     >     precision of the sensors)?
    >     >     And what values do you want to save?
    >     >     (1) Save them as 1 and 0, or
    >     >     (2) save them as 0.99, 0.01 as they are, but use a specific
    >     >     query API to return data like 1 and 0?
    >     >
    >     >     My other question is: is there a general data type that can
    >     >     support the above cases?
    >     >
    >     >     Best,
    >     >     -----------------------------------
    >     >     Xiangdong Huang
    >     >     School of Software, Tsinghua University
    >     >
    >     >      黄向东
    >     >     清华大学 软件学院
    >     >
    >     >
    >     >     Julian Feinauer <[email protected]> wrote on Tue, Oct 29, 2019
    >     >     at 3:58 AM:
    >     >
    >     >     > Hi all,
    >     >     >
    >     >     > I wanted to discuss a possible new feature, which I will call
    >     >     > the Rich Datatypes (RDT) API in the following.
    >     >     > I worked a lot in the automotive industry and there is a
    >     >     > broadly adopted open standard called ASAM MDF
    >     >     > (https://www.asam.net/standards/detail/mdf/).
    >     >     > It is a format which is targeted at efficient storage but at
    >     >     > the same time supports VERY complex types (which are often
    >     >     > used in automotive controllers).
    >     >     >
    >     >     > Take something as simple as a boolean. We could store it as a
    >     >     > boolean (as a Java boolean) in 1 bit.
    >     >     > BUT we have 4 possibilities overall:
    >     >     >
    >     >     >   *   No value is available for a timestamp (NULL / nothing
    >     >     >       stored)
    >     >     >   *   We had a successful request but the controller does not
    >     >     >       know whether true or false (or had an internal error);
    >     >     >       this is a bit like Optional.isPresent() == false
    >     >     >   *   True
    >     >     >   *   False
    >     >     >
    >     >     > In Java we could model it as a variable Optional<Boolean> x
    >     >     > which could be null, Optional.empty(), Optional.of(true),
    >     >     > Optional.of(false).
    >     >     >
    >     >     > Other examples are discrete values like “ON”, “OFF” (which are
    >     >     > handled internally as “lookup tables” on integer rows).
    >     >     > Or scale formulas like a*x+b, which allow leveraging the
    >     >     > precision even for “small” double values or even integers.
    >     >     > Or a formula together with a “fallback” lookup value like “NV”.
    >     >     >
    >     >     > I think this could be a valuable extension to IoTDB as an
    >     >     > additional API (not changing anything below but just providing
    >     >     > an API on top to do the calculation).
    >     >     >
    >     >     > What do others think?
    >     >     >
    >     >     > Julian
    >     >     >
    >     >
    >     >
    >     >
    >
    >
    >
    
