On Friday, October 21, 2016 at 9:09:02 AM UTC-6, bruno binet wrote:
>
>
>
> On 21 October 2016 at 17:02, Jason Wilder <[email protected]> wrote:
>
>>
>>
>> On Friday, October 21, 2016 at 8:17:06 AM UTC-6, Guillaume Berthomé wrote:
>>>
>>> And are "0" values stored in the TSM files with the same scheme 
>>> (~3 bytes in the end)?
>>>
>>>
>> If they are integers in long runs of the same value (e.g. 0), they can be 
>> run-length encoded as well. The max block size is 1000 values, and we'd use 
>> 11 bytes for all of those values, which is much less than 3 bytes per value.
>>
>
> That is great! Does it also apply to float values? It is also common to 
> have long runs of the same float value (e.g. 0.0).
>  
>

We can't run length encode the floats as we do with integers, but the 
compression algorithm used for floats will use 1 bit per value in this case.
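
To illustrate why a repeated value is so cheap in both cases, here is a minimal Python sketch. It assumes a Gorilla-style XOR scheme for floats and a plain (value, count) run-length encoding for integers; the function names are hypothetical and this is not the actual TSM encoder, only the idea behind it.

```python
import struct


def float_bits(x: float) -> int:
    """Return the IEEE 754 double-precision bit pattern of x as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_with_previous(values):
    """XOR each value's bit pattern with its predecessor's.

    A result of 0 means the value repeats. A Gorilla-style encoder
    (assumed here) can record that case with a single control bit.
    """
    bits = [float_bits(v) for v in values]
    return [a ^ b for a, b in zip(bits, bits[1:])]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


# A full block of 1000 identical floats: every XOR after the first value is 0.
floats = [0.0] * 1000
xors = xor_with_previous(floats)
repeat_bits = sum(1 for x in xors if x == 0)  # one bit per repeated value

# A full block of 1000 identical integers collapses into a single run.
ints = [0] * 1000
runs = run_length_encode(ints)
```

The float case still costs something per value (one bit for each repeat), whereas the integer run collapses into a single pair no matter how long the run is, which is the difference described above.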

Keep in mind that the value types are encoded separately from the 
timestamps, so the sizes above only apply to the value part of the 
(timestamp, value) pair. The timestamp encoding for a block of points 
depends on how regular the timestamps are and the precision being stored.
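
A rough way to see how regularity and precision affect the timestamp side is to look at the deltas between consecutive timestamps in a block. The Python sketch below is an illustration under assumptions, not the real encoder: the helper names are hypothetical, the jitter is synthetic, and the power-of-ten scaling is only meant to show how coarser precision leaves a common factor that can be divided out.

```python
def deltas(timestamps):
    """Differences between consecutive timestamps."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def describe(timestamps):
    """Summarize how regular a block of timestamps is."""
    d = deltas(timestamps)
    distinct = len(set(d))
    return {
        "deltas": len(d),
        "distinct_deltas": distinct,
        # A single distinct delta means the block is one long run.
        "regular": distinct <= 1,
    }


def common_power_of_ten(ds):
    """Largest power of 10 (capped at 10**18) that divides every delta."""
    scale = 1
    while scale < 10**18 and all(d % (scale * 10) == 0 for d in ds):
        scale *= 10
    return scale


SECOND = 1_000_000_000  # nanoseconds

# 1000 points exactly 15 s apart, stored with nanosecond timestamps.
regular = [i * 15 * SECOND for i in range(1000)]

# The same points with a small synthetic jitter (under 1 microsecond).
jitter = [(i * 7919) % 1000 for i in range(1000)]
irregular = [t + j for t, j in zip(regular, jitter)]

regular_summary = describe(regular)
irregular_summary = describe(irregular)
regular_scale = common_power_of_ten(deltas(regular))
irregular_scale = common_power_of_ten(deltas(irregular))
```

With regular spacing every delta is identical and shares a large common factor, so there is very little information left to store. With jitter the deltas differ and no common factor remains, so each one has to be stored at full resolution.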
 

>  
>>
>>> Thank you
>>>
>>> On Friday, October 21, 2016 at 11:17:41 AM UTC+2, Guillaume Berthomé wrote:
>>>>
>>>> @Sean, 
>>>>
>>>> Can I use the rule you gave me for integers?
>>>>
>>>>> On Thursday, October 20, 2016 at 8:53:34 AM UTC+2, Guillaume Berthomé wrote:
>>>>>
>>>>> Thank you, everyone!
>>>>>
>>>>> On Wednesday, October 19, 2016 at 9:30:54 PM UTC+2, Sean Beckett wrote:
>>>>>>
>>>>>> Mathias, I agree that irregular timestamps do lead to more space 
>>>>>> taken on disk. However, that number still falls under 3 bytes per 
>>>>>> numeric value, even with irregular nanosecond timestamps. What Jason 
>>>>>> is saying and what I am saying are both true; they are not mutually 
>>>>>> exclusive.
>>>>>>
>>>>>> > The 2 or 3 bytes footprint is usually not achievable if you store 
>>>>>> > your timestamps with milli, micro or nanosecond precision as the 
>>>>>> > delta between consecutive timestamps will vary and will therefore 
>>>>>> > not compress as well.
>>>>>>
>>>>>> That statement is not true, at least not in any of our testing 
>>>>>> experience. It would be fair to say that 2-3 bytes per value is not 
>>>>>> always achievable, as some rare edge cases can lead to a larger 
>>>>>> footprint on disk, but to say that it is *usually* not achievable is 
>>>>>> simply incorrect.
>>>>>>
>>>>>> On Wed, Oct 19, 2016 at 1:10 PM, Mathias Herberts <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> It seems Jason just posted the exact same findings as mine; maybe 
>>>>>>> check with him directly. I stand by what I said: if your timestamps 
>>>>>>> are not regularly spaced, then compression efficiency will decrease.
>>>>>>>
>>>>>>> On Wednesday, October 19, 2016 at 8:33:56 PM UTC+2, Sean Beckett 
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Mathias, I question your findings on this. We've run extensive 
>>>>>>>> testing with nanosecond timestamps and routinely get < 3 bytes per 
>>>>>>>> recorded numeric value.
>>>>>>>>
>>>>>>>> Please note that this figure is only for fully compacted data, e.g. 
>>>>>>>> cold shards. Shards currently hot for writes will be much less 
>>>>>>>> compact. Over time the steady state of the system will approach 
>>>>>>>> ~2.5 bytes per numeric field.
>>>>>>>>
>>>>>>>> On Wed, Oct 19, 2016 at 5:45 AM, Mathias Herberts <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Efficiency of compression depends on different factors, including 
>>>>>>>>> interval between timestamps, resolution of said timestamps, 
>>>>>>>>> volatility and type of values.
>>>>>>>>>
>>>>>>>>> The 2 or 3 bytes footprint is usually not achievable if you store 
>>>>>>>>> your timestamps with milli, micro or nanosecond precision as the 
>>>>>>>>> delta between consecutive timestamps will vary and will therefore 
>>>>>>>>> not compress as well.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wednesday, October 19, 2016 at 1:04:44 PM UTC+2, Guillaume 
>>>>>>>>> Berthomé wrote:
>>>>>>>>>>
>>>>>>>>>> Thank you for your answer.
>>>>>>>>>>
>>>>>>>>>> So let's assume that I have 2 metrics collected every 15 seconds 
>>>>>>>>>> for 24 hours:
>>>>>>>>>>
>>>>>>>>>> - cpu with 20 values 
>>>>>>>>>> - mem with 10 values
>>>>>>>>>>
>>>>>>>>>> Will the total size of stored data be 3 bytes * 8 metrics per 
>>>>>>>>>> minute * 1440 minutes?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tuesday, October 18, 2016 at 6:57:53 PM UTC+2, Sean Beckett wrote:
>>>>>>>>>>>
>>>>>>>>>>> After final compactions, InfluxDB uses between 2 and 3 bytes per 
>>>>>>>>>>> field value stored. If your metrics are all numbers, then the total 
>>>>>>>>>>> steady-state size will be a bit more than 3 bytes * (metrics per 
>>>>>>>>>>> minute) * (minutes of data stored).
>>>>>>>>>>>
>>>>>>>>>>> See http://docs.influxdata.com/influxdb/v1.0/concepts/storage_engine/ 
>>>>>>>>>>> and 
>>>>>>>>>>> http://docs.influxdata.com/influxdb/v1.0/guides/hardware_sizing/#how-much-storage-do-i-need
>>>>>>>>>>> for details.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Oct 18, 2016 at 2:51 AM, <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi everybody,
>>>>>>>>>>>>
>>>>>>>>>>>> Is there any way to calculate the database's size from the 
>>>>>>>>>>>> number of measurements made per minute? I mean, to predict the 
>>>>>>>>>>>> database's behavior (WAL files, TSM files, ...).
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Remember to include the version number!
>>>>>>>>>>>> ---
>>>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>>>> Google Groups "InfluxData" group.
>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from 
>>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>>>>> Visit this group at https://groups.google.com/group/influxdb.
>>>>>>>>>>>> To view this discussion on the web visit 
>>>>>>>>>>>> https://groups.google.com/d/msgid/influxdb/ee32b8d0-f9bf-40df-baf1-e77fa93cddf5%40googlegroups.com
>>>>>>>>>>>> .
>>>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -- 
>>>>>>>>>>> Sean Beckett
>>>>>>>>>>> Director of Support and Professional Services
>>>>>>>>>>> InfluxDB
>>>>>>>>>>>
>
>

