RE: Var-Length-Numeric encoding?

2022-06-17 Thread Christofer Dutz
Hi Xiangdong,

I doubt you invented a new encoding form. So, in general, I was asking which 
form this actually is.
Julian already pointed out that bit of code.

So, as I can see it, the sign information is in the least significant bit. This 
would usually be an indicator for ZigZag encoding. The only part I don’t quite 
understand is the bit-flipping in case of negative values. In case of ZigZag 
encoding, the value would be shifted left by one and the last bit would be set as 
the new first bit (so effectively the last bit would just be rotated to become 
the first). In IoTDB it seems as if the left-shifted value is inverted. I don’t 
quite understand why that is happening. I could imagine that for small negative 
integers (small as in “close to 0”) the 2s complement notation has many 1s, 
therefore it would consume a lot of memory in serialized form. So, flipping the 
entire number would get rid of these 1s and hence reduce the size of the 
serialized form.
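
To make the size argument above concrete, here is a minimal sketch. The class and method names are hypothetical and not taken from IoTDB; it only assumes a 7-bits-per-byte variable-length encoding and the shift-then-invert step as described in this mail.

```java
// Hypothetical illustration, not IoTDB code.
class VarIntSizeSketch {
    // Number of bytes a 7-bits-per-byte encoding needs when the int is
    // treated as an unsigned 32-bit value.
    static int unsignedVarIntSize(int value) {
        int size = 1;
        while ((value & 0xFFFFFF80) != 0) { // more than 7 significant bits left
            value >>>= 7;
            size++;
        }
        return size;
    }

    // Shift left by one, then invert all bits if the input was negative.
    static int shiftAndInvert(int value) {
        int shifted = value << 1;
        return value < 0 ? ~shifted : shifted;
    }

    public static void main(String[] args) {
        // -3 in 2s complement has 1s in almost every bit position.
        System.out.println(Integer.toBinaryString(-3));
        System.out.println(unsignedVarIntSize(-3));                 // 5
        System.out.println(unsignedVarIntSize(shiftAndInvert(-3))); // 1
    }
}
```

Written out raw, -3 needs the maximum of five bytes, while the shifted and inverted value 5 fits into a single byte.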

But going through this document again: 
https://golb.hplar.ch/2019/06/variable-length-int-java.html

If the number is negative, it is x-ored with all bits set to 1 … so this is 
identical to flipping the bits … this is actually really cool and efficient.
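
A small sketch of why the two forms agree, assuming 32-bit signed integers. The class and method names are made up for illustration; this is not a copy of the IoTDB utils class.

```java
// Hypothetical illustration of ZigZag encoding for 32-bit ints.
class ZigZagSketch {
    // Branching form: shift left, then invert all bits for negative inputs.
    static int encodeWithInvert(int n) {
        int shifted = n << 1;
        return n < 0 ? ~shifted : shifted;
    }

    // Branch-free form: n >> 31 is all 1s for negative n and all 0s
    // otherwise, so the XOR flips the shifted value only for negatives.
    static int encodeWithXor(int n) {
        return (n << 1) ^ (n >> 31);
    }

    // Inverse mapping: the lowest bit decides whether to flip back.
    static int decode(int z) {
        return (z >>> 1) ^ -(z & 1);
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, -1, 1, -2, 2}) {
            // prints 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4
            System.out.println(n + " -> " + encodeWithXor(n));
        }
    }
}
```

Both encoders produce the same result for every int, and values close to 0 on either side map to small non-negative numbers, which is what keeps the variable-length form short.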

So, I would like to confirm that IoTDB uses ZigZag encoding for variable length 
signed integers. A comment in the utils class stating which encoding is 
actually used would be a great addition. I’ll probably add one asap.

Chris




From: Xiangdong Huang 
Sent: Friday, 17 June 2022 09:33
To: dev ; Yuan Tian 
Subject: Re: Var-Length-Numeric encoding?

Hi,

I think the encoding implementation is in 
src/main/java/org/apache/iotdb/tsfile/utils/ReadWriteForEncodingUtils.java
@Yuan Tian  implemented it.

Best,
---
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


Julian Feinauer <j.feina...@pragmaticminds.de> wrote on Monday, 13 June 2022 at 17:47:
Hi,

I can only comment on floating points: we don't.
Currently we also only have var-length encoding for u32 (not for u64).

Regarding ZigZag Encoding perhaps anybody else can jump in here?

Julian

Julian Feinauer
Geschäftsführer/CEO

j.feina...@pragmaticminds.de
+49 (0) 7021 87868-01 |
Jesinger Str. 57, 73230 Kirchheim unter Teck
www.pragmaticindustries.de

Mandatory information pursuant to Article 13 GDPR
From: Christofer Dutz <christofer.d...@c-ware.de>
Date: Monday, 13 June 2022 at 09:50
To: dev@iotdb.apache.org
Subject: Var-Length-Numeric encoding?
Hi all,

Just out of curiosity. Julian told me TSFiles make use of variable length 
encoding of numeric types.
I would expect the encoding for unsigned integers to be the "ordinary" one 
where 7 bits of a byte are being used for encoding the numeric value and new 
bytes are added as long as the first bit is 1.
However, I would be interested in which encoding is being used for signed 
integers. Julian posted a reply in the #iotdb slack channel, but I'm unsure 
which official encoding type this is.
It most likely looks like ZigZag Encoding, but I'm a bit unsure if it really is.
Could anyone here please shed a bit of light on this? And do we have var-length 
encoding for floating-point types too?
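
For reference, the "ordinary" unsigned encoding mentioned above could be sketched as follows. The names are hypothetical, and writing the lowest 7-bit group first is an assumption (it is the order used by LEB128-style formats); this is not taken from the TSFile code.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration of a 7-bits-per-byte unsigned encoding.
class UnsignedVarIntSketch {
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // While more than 7 bits remain, emit the low 7 bits with the
        // top bit set to 1, meaning "another byte follows".
        while ((value & 0xFFFFFF80) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value); // last byte, top bit 0
        return out.toByteArray();
    }

    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // top bit 0: this was the last byte
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] encoded = encode(300);
        // 300 is encoded as the two bytes 0xAC 0x02
        System.out.printf("%02X %02X%n", encoded[0] & 0xFF, encoded[1] & 0xFF);
        System.out.println(decode(encoded)); // 300
    }
}
```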

Chris


Re: Var-Length-Numeric encoding?

2022-06-17 Thread Xiangdong Huang
Hi,

I think the encoding implementation is in
src/main/java/org/apache/iotdb/tsfile/utils/ReadWriteForEncodingUtils.java
@Yuan Tian   implemented it.

Best,
---
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


Julian Feinauer wrote on Monday, 13 June 2022 at 17:47:

> Hi,
>
>
>
> I can only comment on floating points: we don't.
>
> Currently we also only have var-length encoding for u32 (not for u64).
>
>
>
> Regarding ZigZag Encoding perhaps anybody else can jump in here?
>
>
>
> Julian
>
>
>
> *Julian Feinauer*
> Geschäftsführer/CEO
> j.feina...@pragmaticminds.de
> +49 (0) 7021 87868-01 |
> Jesinger Str. 57, 73230 Kirchheim unter Teck
> www.pragmaticindustries.de
>
> Mandatory information pursuant to Article 13 GDPR
>
> *From: *Christofer Dutz
> *Date: *Monday, 13 June 2022 at 09:50
> *To: *dev@iotdb.apache.org
> *Subject: *Var-Length-Numeric encoding?
>
> Hi all,
>
> Just out of curiosity. Julian told me TSFiles make use of variable length
> encoding of numeric types.
> I would expect the encoding for unsigned integers to be the "ordinary" one
> where 7 bits of a byte are being used for encoding the numeric value and
> new bytes are added as long as the first bit is 1.
> However, I would be interested in which encoding is being used for
> signed integers. Julian posted a reply in the #iotdb slack channel, but
> I'm unsure which official encoding type this is.
> It most likely looks like ZigZag Encoding, but I'm a bit unsure if it
> really is.
> Could anyone here please shed a bit of light on this? And do we have
> var-length encoding for floating-point types too?
>
> Chris
>


Re: guanchu shen from yonyou request to join iotdb community

2022-06-17 Thread Xiangdong Huang
Hi Guanchu,

Jira and Confluence account added.

Best,
---
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


沈冠初 wrote on Monday, 13 June 2022 at 10:32:

> Hi, I'm Guanchu Shen, from Yonyou.
> jira id : Shenguanchu
> Confluence id : guanchushen