Re: Showcase your project at ApacheCON at a Podling's Shark Tank

2019-08-14 Thread Xiangdong Huang
Hi,

We already have a talk. Can we still sign up for this as well? (If so, even
better.)

Besides, we need to start preparing our slides now. I can prepare the first
version within a week.

Then let's discuss what should be removed and added.

Best,
---
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


Julian Feinauer wrote on Thu, Aug 15, 2019 at 12:55 PM:

> We should definitely participate here!


Fwd: Showcase your project at ApacheCON at a Podling's Shark Tank

2019-08-14 Thread Julian Feinauer
We should definitely participate here!

Sent from my mobile phone


Original message
Subject: Showcase your project at ApacheCON at a Podling's Shark Tank
From: Roman Shaposhnik
To: general, Roman Shaposhnik
Cc:



Re: Re: An easier way to create time series.

2019-08-14 Thread Xiangdong Huang
Hi,

This is a really interesting topic.

I just want to list some use cases and the terms that different users use,
as far as I know.

Indeed, different users use different terms to describe the same thing:

1. In many industrial cases, users say that a "machine/device" has some
"sensors", and a sensor may generate several "conditions" (I mean the state
of something; I am not sure whether the English word is correct; 工况 in
Chinese). For example, in an excavator management application, these
"conditions" are things like the speed, the mileage, and the malfunction
state of some part.

2. Some users say that a "machine/device" has some "variables" (变量 in
Chinese). For example, a subway train now has more than 3000 variables. A
"variable" here is the same as a "condition".

3. Some users say "metering data" (测点 in Chinese). For example, an
electricity meter has more than 100 "metering data" items, e.g., the A/B/C
voltage.

They do not say "source" or "data source". However, if you tell them that
their "machine/device" is a "source", they will agree.

In my understanding, a "source" or "data source" is a specific object that
can generate data.

Now let's review the concepts[1] of InfluxDB.

The schema of InfluxDB is: database - measurement - {tags}, {fields}.
There is also a logical concept, the series: "In InfluxDB, a series is the
collection of data that share a retention policy, measurement, and tag
set."[1]

- "Database" is for separating different applications, I think. IoTDB has
no such concept; we can introduce it at a proper time.

- "Measurement": we need to discuss it in detail.
Let's look at some examples. In [1], "census" is a measurement. In [2],
"CPU" (or "cpu_load_short") is a measurement.
There are many machines that have a CPU, e.g., "host=serverA,
region=us_west" and "host=serverB, region=us_east".
So, what is the data source here? The CPU?

In my opinion, CPU is just a kind of data source. "The CPU of serverA in
us_west" is an actual data source, i.e., the "device" in Tian Jiang's view,
and the "series" in InfluxDB's view [1].

Then let's look at the fields. There are many metrics in a CPU (OK, I used
another term, "metrics"; I cannot tell the difference between it and
"variable/condition" in English), e.g., the load utilization and the
temperature. That is to say, a physical CPU does not generate just one
value at a time; it can generate several values, each of which is a
variable/condition/measurement point (变量/工况/测点).

(As InfluxDB uses a columnar file format, it will store all "load
utilization" data that belongs to the same device together, and then store
all "temperature" data.)

According to the examples, I think a good practice in InfluxDB is: if a set
of "series" (the InfluxDB concept) have the same "fields", they belong to
one measurement. Put differently, within a measurement the same-named
"fields" on different "series" have the same meaning and should have the
same data type (e.g., double, int, bool).

Now let's look at the schema of IoTDB.

"Measurement": in IoTDB today, a "measurement" means a variable/condition.
It is similar to a "field" in InfluxDB.

Storage group: a storage group now plays two roles: (1) data in different
storage groups is stored in different files (in the future, we can also
introduce a replication factor and a retention/TTL policy per storage
group); (2) same-named "measurements" on different "devices" (in IoTDB)
that belong to the same storage group should have the same meaning and data
type (BTW, can this restriction be lifted?).

"Series"/"Path": in IoTDB, a complete "path" is a "series", e.g.,
"root.cpu.serverA.us_west.load_utilization" or
"root.cpu.serverB.us_west.temperature". That is to say, "series" in IoTDB =
"series" in InfluxDB + "field" in InfluxDB.

"Device": IoTDB does not actually introduce the concept of "device"
explicitly yet. But when we wrote the code, we treated a "path" without the
"measurement" (i.e., the "field" in InfluxDB) as a "device" (or we can call
it a machine). In my opinion, a "device" is a real data source, while
something like "CPU" is just a kind of data source.
(As an analogy: from a data analyst's view, MySQL is not a data source, but
username@MySQL_IP:port with a password is a valid data source.)
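
To make the mapping above concrete, here is a minimal Python sketch of how
one InfluxDB-style point (measurement + tags + fields) could be turned into
IoTDB-style paths. The function name, the tag ordering, and the field values
are assumptions made only for illustration; this is not an API of either
system.

```python
def influx_point_to_iotdb_paths(measurement, tags, fields, tag_order):
    """Map one InfluxDB-style point to IoTDB-style full paths.

    The device path is root.<measurement>.<tag values in tag_order>;
    every field then adds one series (device path + field name).
    """
    device = ".".join(["root", measurement] + [tags[key] for key in tag_order])
    series = {f"{device}.{name}": value for name, value in fields.items()}
    return device, series


# Hypothetical sample point; the numeric values are made up.
device, series = influx_point_to_iotdb_paths(
    measurement="cpu",
    tags={"host": "serverA", "region": "us_west"},
    fields={"load_utilization": 0.64, "temperature": 55.0},
    tag_order=["host", "region"],
)

print(device)
for path, value in series.items():
    print(path, value)
```

In this sketch the "device" corresponds to the InfluxDB series (measurement
plus tag set), and each field becomes its own IoTDB series.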

Device Template:
I think it is useful, because in the real world machines/devices are built
on product lines. A set of machines/devices that have the same hardware and
software collect the same variables. Actually, when transferring data from
the machine to the data center, manufacturers always have their own
protocols, e.g., the first 4 bytes refer to the machine ID, the next 4 bytes
refer to the speed, and the next 1 byte refers to whether a malfunction has
occurred. There are many devices/machines that share the same protocol, and
these machines should have the same measurements (in IoTDB terms; fields in
InfluxDB terms). So it is good to define the properties (e.g., the data type
and the encoding type) of these measurements just once.
That is what  
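
As a rough illustration of the protocol example above, the following Python
sketch decodes such a 9-byte frame and checks it against a template that is
declared once. The byte order, the numeric types (unsigned 32-bit ID, 32-bit
float speed), and the template contents are assumptions; the message only
fixes the field widths.

```python
import struct

# Assumed layout: big-endian, 4-byte unsigned machine ID,
# 4-byte float speed, 1-byte malfunction flag (9 bytes in total).
FRAME = struct.Struct(">IfB")

# Hypothetical template, declared once for every machine that speaks
# this protocol: measurement name -> (data type, encoding).
VEHICLE_TEMPLATE = {
    "speed": ("FLOAT", "PLAIN"),
    "malfunction": ("BOOLEAN", "PLAIN"),
}


def decode_frame(payload):
    """Split one frame into the machine ID and its measurement values."""
    machine_id, speed, malfunction = FRAME.unpack(payload)
    return machine_id, {"speed": speed, "malfunction": bool(malfunction)}


# Build a sample frame and decode it again.
payload = FRAME.pack(42, 12.5, 1)
machine_id, values = decode_frame(payload)

# Every decoded measurement is covered by the shared template.
assert set(values) == set(VEHICLE_TEMPLATE)
print(machine_id, values)
```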

Showcase your project at ApacheCON at a Podling's Shark Tank

2019-08-14 Thread Roman Shaposhnik
Hi Podlings!

In less than a month we're going to have our first
ApacheCON this year -- the one in Las Vegas. In
about two months there will be one more in Berlin.

These are not your regular ApacheCONs -- these are
the 20th Anniversary of ASF ApacheCONs! In other words,
these are not to be missed!

And even if your talk didn't get accepted -- you still
get an opportunity to highlight your project to what's
likely going to be the biggest audience attending.

Here's how: if you (or any community member who's
passionate about your project) are going to be at either
of those ApacheCONs, consider signing up for the
Podling's Shark Tank events:
https://www.apachecon.com/acna19/s/#/scheduledEvent/1038
https://aceu19.apachecon.com/session/podlings-shark-tank

Each project presenting will get ~10 min for the pitch and ~5 min
of panel grilling them on all sorts of things. Kind of like this ;-)
 https://www.youtube.com/watch?v=wmenN7NEdBc

You've got nothing to lose (in fact, the opposite: you're likely to get
a prize!) and you will get a chance to receive feedback that might
actually help you grow your community and ultimately graduate to the
TLP status. And! Given our awesome panel of judges:
 * Myrle Krantz
 * Justin Mclean
 * Craig Russell
 * Shane Curcuru
We guarantee this to be a fun and useful event for your community!

We will be tracking signups over here:
 https://wiki.apache.org/apachecon/ACNA19PodlingSharkTank
 https://wiki.apache.org/apachecon/ACEU19PodlingSharkTank
but for now:

SIMPLY REPLY TO THIS EMAIL if you're interested.

It is first come, first served -- so don't delay -- sign up today!

Thanks,
Roman.


Re: An easier way to create time series.

2019-08-14 Thread Julian Feinauer
Hi,

Agreed.
This was meant more as a first comment I wanted to throw in : )
And with such BIG changes I totally agree that a DISCUSSION thread has to be
opened, and in the end, if no consensus is found, even a VOTE should be held.

Julian

On 14.08.19, 12:18, "Tian Jiang" wrote:


Re:Re: An easier way to create time series.

2019-08-14 Thread Tian Jiang
Hi Julian,


Surely naming is important to users, but different users may have different
opinions on naming. I think it is hard to satisfy everyone, so maybe we can
hold a vote or something to discuss that later.


My starting point is to provide a way to create a bunch of time series with
fewer statements (as the title suggests), and this will not interfere with
existing functionality. It is lightweight, and I can add this feature within
a day or two.


Adding tags is cool and can definitely enhance the expressive power of
IoTDB. But the implementation may cause a lot of changes (and potential
trouble) in the whole system, which is beyond this discussion. Since you seem
interested, we may open another thread to discuss the tags (or whatever you
want to call them) in detail.


Tian Jiang



At 2019-08-14 17:12:41, "Julian Feinauer"  wrote:

Re: An easier way to create time series.

2019-08-14 Thread Julian Feinauer
Hi Tian,

I see naming as a minor issue to change but a bigger issue for users (nomen
est omen...).
Regarding your other comment, I don't get what you mean.

Think of situations like monitoring stuff from several machines of multiple
types in multiple plants.
Then I would like to say something like

"do that for all series in plant A" or "in all series for machine Type X".

Indeed it's quite a "huge" change which has implications, but it would rather
"widen" the API to do "multi-series querying" than change it, I guess?

Julian
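
The kind of "multi-series querying" described above could look roughly like
the following Python sketch. The catalog, the tag names ("plant",
"machine_type"), and the paths are hypothetical and only illustrate selecting
series by tags; IoTDB itself is not involved.

```python
# Hypothetical catalog: each series carries a path and a set of tags.
SERIES = [
    {"path": "root.plantA.press1.temperature",
     "tags": {"plant": "A", "machine_type": "X"}},
    {"path": "root.plantA.lathe7.speed",
     "tags": {"plant": "A", "machine_type": "Y"}},
    {"path": "root.plantB.press2.temperature",
     "tags": {"plant": "B", "machine_type": "X"}},
]


def select_series(catalog, **wanted):
    """Return the paths of all series whose tags match every wanted pair."""
    return [
        entry["path"]
        for entry in catalog
        if all(entry["tags"].get(key) == value for key, value in wanted.items())
    ]


in_plant_a = select_series(SERIES, plant="A")       # "all series in plant A"
of_type_x = select_series(SERIES, machine_type="X")  # "all series for machine Type X"
print(in_plant_a)
print(of_type_x)
```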

On 14.08.19, 11:06, "Tian Jiang" wrote:


Re:Re: An easier way to create time series.

2019-08-14 Thread Tian Jiang
The naming is not a big issue, but your schema proposals seem to be turning 
IoTDB into something else.





At 2019-08-14 16:55:33, "Jialin Qiao"  wrote:


Re: An easier way to create time series.

2019-08-14 Thread Jialin Qiao
Hi,

I think "source" or "datasource" is good, and it's better to use, or at least
add, tags and fields, because many TSDBs use this conceptual model.

Some feasible ways to organize the schema, with their "select * from the
table" results:

(1) Each type of datasource is a table, which has a time column, some tag
columns and some field columns.

Table: sourceType
time, tag1, field1, field2
1, device1, 1, 1
2, device1, 2, 2
2, device2, 2, 2

(2) Each datasource is a table with some tags. Each table has a time column
and some field columns. (The tags of one datasource may never change, so just
treat them as metadata.)

Table: source1(tag1=device1)
time, field1, field2
1, 1, 1
2, 2, 2

Table: source2(tag1=device2)
time, field1, field2
2, 2, 2
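
The relationship between the two layouts can be sketched in a few lines of
Python: splitting the rows of layout (1) by the tag column yields the
per-source tables of layout (2). The row values are the ones from the tables
above; the function name is made up for this illustration.

```python
from collections import defaultdict

# Layout (1): one table per source type, with the tag as a column.
source_type_rows = [
    {"time": 1, "tag1": "device1", "field1": 1, "field2": 1},
    {"time": 2, "tag1": "device1", "field1": 2, "field2": 2},
    {"time": 2, "tag1": "device2", "field1": 2, "field2": 2},
]


def split_by_tag(rows, tag):
    """Build layout (2): one table per tag value, with the tag column removed."""
    tables = defaultdict(list)
    for row in rows:
        tables[row[tag]].append({k: v for k, v in row.items() if k != tag})
    return dict(tables)


per_source = split_by_tag(source_type_rows, "tag1")
for tag_value, table in per_source.items():
    print(f"Table: source(tag1={tag_value})")
    for row in table:
        print(row)
```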


Best,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> >Tian Jiang wrote on Wed, Aug 14, 2019 at 10:37 AM:
> >
> >> Greetings,
> >>
> >>
> >> In the present version, it is a little troublesome to create a set of
> >> timeseries that have the same measurements. On the other hand, although
> >> we use the concept of "device" in the code, it is not properly abstracted.
> >>
> >> Expected usage:
> >>
> >> Using IoTDB in a more relational way:
> >>
> >> CREATE DEVICE TEMPLATE vehicle (speed DOUBLE PLAIN, direction DOUBLE
> >> PLAIN, temperature DOUBLE PLAIN, fuel DOUBLE PLAIN)
> >>
> >> If all datatypes (or encodings) are the same, you can write the
> >> equivalent form:
> >>
> >> CREATE DEVICE TEMPLATE vehicle MEASUREMENTS (speed, direction,
> >> temperature, fuel) DATATYPE DOUBLE ENCODING PLAIN
> >>
> >> Then you will be able to create time series in an easier way:
> >>
> >> CREATE DEVICE (vehicle) root.sg1.vehicle1
> >>
> >> Which equals:
> >>
> >> CREATE TIMESERIES root.sg1.vehicle1.speed WITH
> >> DATATYPE=DOUBLE,ENCODING=PLAIN
> >>
> >> CREATE TIMESERIES root.sg1.vehicle1.direction WITH
> >> DATATYPE=DOUBLE,ENCODING=PLAIN
> >>
> >> CREATE TIMESERIES root.sg1.vehicle1.fuel WITH
> >> DATATYPE=DOUBLE,ENCODING=PLAIN
> >>
> >> CREATE TIMESERIES root.sg1.vehicle1.temperature WITH
> >> DATATYPE=DOUBLE,ENCODING=PLAIN
> >>
> >> I hope this will narrow the gap between using IoTDB and traditional
> >> relational databases.
> >> Jira link:
> >> https://issues.apache.org/jira/projects/IOTDB/issues/IOTDB-163?filter=allopenissues
> >>
> >>
> >> Tian Jiang
> 
> 


Re: An easier way to create time series.

2019-08-14 Thread Julian Feinauer
Hi,

let me chime in here as well.
One of the things that felt a bit "unfamiliar" to me at first was this device
focus.
It's a bit too "one-dimensional" from my perspective.

Personally, I quite like how Influx does it: you have a name and can attach
tags and fields to it.
And even if we do not do it that way, I would prefer to name it a bit
differently, such as "series", "measurement", or "source".
"Device" is a bit specific and just sounds odd from a user's perspective.

I think it was good to keep it that way for 0.8.0.
But for the next release we are free to break things a bit.

What do others think?

Julian

On 14.08.19, 04:52, "Tian Jiang" wrote:

Maybe we can start with syntactic sugar and add improvements gradually.
For now, I think making timeseries creation easier should be enough. Please
share any ideas that could go along with the introduction of
"device".

Tian Jiang


At 2019-08-14 10:44:14, "Xiangdong Huang"  wrote:
>Hi,
>
>Looks fine to me.
>
>One question: is it just syntactic sugar, or can we also
>improve the schema management? Any ideas?
>
>Best,
>---
>Xiangdong Huang
>School of Software, Tsinghua University
>
> 黄向东
>清华大学 软件学院
>
>
>On Wed, Aug 14, 2019 at 10:37 AM, Tian Jiang wrote:
>
>> Greetings,
>>
>>
>> In the present version, it is a little troublesome to create a set of
>> timeseries that have the same measurements. On the other hand, although we
>> use the concept of "device" in the code, it is not properly abstracted.
>>
>> Expected usage:
>>
>> Using IoTDB in a more relational way:
>>
>> CREATE DEVICE TEMPLATE vehicle (speed DOUBLE PLAIN, direction DOUBLE
>> PLAIN, temperature DOUBLE PLAIN, fuel DOUBLE PLAIN)
>>
>> If all datatypes (or encodings) are the same, you can write the equivalent form:
>>
>> CREATE DEVICE TEMPLATE vehicle MEASUREMENTS (speed, direction,
>> temperature, fuel) DATATYPE DOUBLE ENCODING PLAIN
>>
>> Then you will be able to create time series in an easier way:
>>
>> CREATE DEVICE (vehicle) root.sg1.vehicle1
>>
>> Which equals:
>>
>> CREATE TIMESERIES root.sg1.vehicle1.speed WITH
>> DATATYPE=DOUBLE,ENCODING=PLAIN
>>
>> CREATE TIMESERIES root.sg1.vehicle1.direction WITH
>> DATATYPE=DOUBLE,ENCODING=PLAIN
>>
>> CREATE TIMESERIES root.sg1.vehicle1.fuel WITH
>> DATATYPE=DOUBLE,ENCODING=PLAIN
>>
>> CREATE TIMESERIES root.sg1.vehicle1.temperature WITH
>> DATATYPE=DOUBLE,ENCODING=PLAIN
>>
>> I hope this will narrow the gap between using IoTDB and traditional
>> relational databases.
>> Jira link:
>> https://issues.apache.org/jira/projects/IOTDB/issues/IOTDB-163?filter=allopenissues
>>
>>
>> Tian Jiang
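
The expansion described in the quoted proposal could be sketched in Java roughly as follows. This is only an illustration under assumptions: a template is modeled as an ordered map from measurement name to datatype and encoding, and the names DeviceTemplateSketch, MeasurementSchema, vehicleTemplate and expand are hypothetical, not part of IoTDB. The generated strings follow the CREATE TIMESERIES form quoted above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types exist in IoTDB.
class DeviceTemplateSketch {

    /** Datatype and encoding of one measurement in a template. */
    static final class MeasurementSchema {
        final String dataType;
        final String encoding;

        MeasurementSchema(String dataType, String encoding) {
            this.dataType = dataType;
            this.encoding = encoding;
        }
    }

    /** Builds the "vehicle" template from the proposal (insertion order is kept). */
    static Map<String, MeasurementSchema> vehicleTemplate() {
        Map<String, MeasurementSchema> template = new LinkedHashMap<>();
        for (String name : new String[] {"speed", "direction", "temperature", "fuel"}) {
            template.put(name, new MeasurementSchema("DOUBLE", "PLAIN"));
        }
        return template;
    }

    /** Expands a template into one CREATE TIMESERIES statement per measurement. */
    static List<String> expand(Map<String, MeasurementSchema> template, String devicePath) {
        List<String> statements = new ArrayList<>();
        for (Map.Entry<String, MeasurementSchema> entry : template.entrySet()) {
            MeasurementSchema schema = entry.getValue();
            statements.add("CREATE TIMESERIES " + devicePath + "." + entry.getKey()
                    + " WITH DATATYPE=" + schema.dataType
                    + ",ENCODING=" + schema.encoding);
        }
        return statements;
    }

    public static void main(String[] args) {
        // Corresponds to: CREATE DEVICE (vehicle) root.sg1.vehicle1
        for (String statement : expand(vehicleTemplate(), "root.sg1.vehicle1")) {
            System.out.println(statement);
        }
    }
}
```

If the feature stays pure syntactic sugar, an expansion like this would be all that is needed; improving schema management, as asked in the thread, would instead require storing the template itself.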




[jira] [Created] (IOTDB-164) Convert Path(String) to ID(Long)

2019-08-14 Thread Yanzhe An (JIRA)
Yanzhe An created IOTDB-164:
---

 Summary: Convert Path(String) to ID(Long)
 Key: IOTDB-164
 URL: https://issues.apache.org/jira/browse/IOTDB-164
 Project: Apache IoTDB
  Issue Type: Improvement
Reporter: Yanzhe An
 Attachments: path示例.png

Currently in IoTDB, paths of time series are represented as *String*. The path of a
time series contains its storage group, device, and
measurement.

!path示例.png!

However, paths often overlap. For example, every
time series belonging to the storage group _sg123_ must be prefixed with
the name _sg123_, so the shared prefix is stored repeatedly and space is wasted.
By contrast, a *Long* takes much less space than a *String*.

Worse, paths are usually used as keys in *Map* structures to carry out other
operations, which wastes time. By contrast, for the
operations _put_, _get_, and _remove_, a *Map* keyed by *Long* performs
better than a *Map* keyed by *String*.

Therefore, paths in the form of *String* should be replaced in IoTDB by IDs in
the form of *Long*.
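
One possible shape for such a mapping is a small dictionary that assigns each distinct path a sequential *Long* ID and can translate back. The Java sketch below is only an assumption about how this might look: the class PathIdDictionary and its methods idOf and pathOf are hypothetical and do not exist in IoTDB, and persistence and ID recovery after restart are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not an existing IoTDB class.
class PathIdDictionary {
    // Path string to assigned ID; consulted once when a path enters the system.
    private final Map<String, Long> pathToId = new HashMap<>();
    // ID to path string; the list index is the ID.
    private final List<String> idToPath = new ArrayList<>();

    /** Returns the ID already assigned to the path, or assigns the next sequential ID. */
    synchronized long idOf(String path) {
        Long id = pathToId.get(path);
        if (id == null) {
            id = (long) idToPath.size();
            pathToId.put(path, id);
            idToPath.add(path);
        }
        return id;
    }

    /** Returns the path for a previously assigned ID. */
    synchronized String pathOf(long id) {
        if (id < 0 || id >= idToPath.size()) {
            throw new IllegalArgumentException("unknown path id: " + id);
        }
        return idToPath.get((int) id);
    }

    public static void main(String[] args) {
        PathIdDictionary dictionary = new PathIdDictionary();
        long id = dictionary.idOf("root.sg123.d1.s1");
        // Internal structures can then be keyed by the long ID instead of the String.
        Map<Long, String> lastValue = new HashMap<>();
        lastValue.put(id, "42.0");
        System.out.println(dictionary.pathOf(id) + " -> " + lastValue.get(id));
    }
}
```

With this layout the full path string is hashed only once, when it is first resolved; later lookups in internal maps use the *Long* key. Whether that yields the expected savings would have to be measured.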



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)