I went for a simpler approach (null mask only), and yes, the gain is high 
for small objects but low otherwise. I gain between 5% and 20% on my objects. 
To me, though, it is the stepping stone to easily implementing other 
optimisations such as varint and schemaless without using raw. I am currently 
fixing the latest unit tests to give you a better idea. If it is not worth it 
then let's not do it, but I think it is worth a try.


-----Original Message-----
From: Ilya Kasnacheev <ilya.kasnach...@gmail.com> 
Sent: Monday, May 25, 2020 11:48 AM
To: dev <dev@ignite.apache.org>
Subject: Re: IGNITE-6499 Compact NULL fields

Hello!

I can't help but wonder how large a benefit it will give. I have checked the 
ticket description; it looks like the proposed scheme is elaborate and the 
benefit for non-extreme binary objects rather tiny.

WDYT?

Regards,
--
Ilya Kasnacheev


Mon, May 18, 2020 at 22:54, steve.hostett...@gmail.com <
steve.hostett...@gmail.com>:

> Hello igniters,
>
> While I would like to help on Calcite, because the H2 optimiser (or the
> lack thereof) is really killing us, I think it would be wiser to start
> by contributing to something easier.
>
> Therefore I will tackle another problem that we have, which is memory
> consumption. I stumbled upon this IEP
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-2%3A+Binary+object+format+improvements
>
> that is about optimising the binary marshaller.
>
> The low-hanging fruit seemed to be null compaction, so I decided to
> start with it, though I am sure there is some hidden complexity.
>
> Here are a couple of questions:
> - Can I assign myself IGNITE-6499 and attach a patch?
> - Who can I contact to help with the review? On the following page
> https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute
> there is no one assigned for marshalling.
>
> On the details:
> The compression is disabled by default as it is not compatible with 
> objects previously marshalled.
>
> My approach was to go a bit beyond the JIRA. Not only do I remove the
> offsets of null fields from the footer, I also remove the 0x65 null
> markers from the objects. I did not remove them from collections and
> arrays because those use absolute positioning.
>
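
The footer compaction described above could be sketched roughly as follows. This is only an illustration under assumptions, not the patch itself: NullMaskSketch, CompactFooter and compact are hypothetical names, fields are passed as already-encoded byte arrays (null meaning a null field), and the mask is assumed to have one bit set per non-null field, lowest bit first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and mask layout are assumptions, not Ignite APIs.
final class NullMaskSketch {
    /** Offsets of the non-null fields plus a bit mask marking which fields are present. */
    static final class CompactFooter {
        final int[] offsets;
        final long presenceMask;

        CompactFooter(int[] offsets, long presenceMask) {
            this.offsets = offsets;
            this.presenceMask = presenceMask;
        }
    }

    /**
     * Lays out fields starting at bodyStart. A null entry stands for a null field:
     * it takes no space in the body and gets no offset, only a cleared mask bit.
     */
    static CompactFooter compact(int bodyStart, byte[][] fields) {
        if (fields.length > Long.SIZE)
            throw new IllegalArgumentException("sketch supports at most 64 fields");

        List<Integer> offsets = new ArrayList<>();
        long mask = 0L;
        int pos = bodyStart;

        for (int i = 0; i < fields.length; i++) {
            if (fields[i] == null)
                continue;               // null field: no marker byte, no offset

            mask |= 1L << i;            // bit i set means field i is present
            offsets.add(pos);
            pos += fields[i].length;
        }

        int[] result = new int[offsets.size()];
        for (int i = 0; i < result.length; i++)
            result[i] = offsets.get(i);

        return new CompactFooter(result, mask);
    }

    public static void main(String[] args) {
        byte[] intField = new byte[5];  // type byte + 4-byte value
        byte[] strField = new byte[8];  // type byte + 4-byte length + "abc"

        CompactFooter f = compact(24,
            new byte[][] {intField, intField, strField, null, null, null, strField});

        System.out.println(java.util.Arrays.toString(f.offsets)
            + " mask=0x" + Long.toHexString(f.presenceMask));
    }
}
```

With a 24-byte header, two ints, a string, three nulls and a string, this yields offsets 24, 29, 34, 42 and mask 0x47, which is consistent with the footer bytes of the compacted example further down, although reading the trailing 0x47 as such a mask is a guess.
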
> I gain between 5% and 20% depending on the test case. Obviously, the
> smaller the object and the higher the number of nulls, the higher the
> compression rate.
>
> Based on that I can quite easily add varint compression, which is
> IGNITE-6418 and should significantly increase the compression rate
> for objects with many ints and longs that hold only small numbers.
>
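
To illustrate why varint helps with small numbers, here is a minimal sketch assuming a LEB128-style encoding (7 payload bits per byte, high bit as a continuation flag). The encoding that IGNITE-6418 would actually use is not stated in this thread, and VarintSketch is a hypothetical name.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of an unsigned LEB128-style varint for 32-bit ints.
final class VarintSketch {
    /** Encodes value using 1 to 5 bytes, least significant 7-bit group first. */
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);

        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // more groups follow
            value >>>= 7;                     // unsigned shift, so negatives terminate
        }

        out.write(value);                     // last group, continuation bit clear
        return out.toByteArray();
    }

    /** Decodes a value produced by encode. */
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;

        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return result;
            shift += 7;
        }

        throw new IllegalArgumentException("truncated varint");
    }

    public static void main(String[] args) {
        for (int v : new int[] {1, 300, -1})
            System.out.println(v + " takes " + encode(v).length + " byte(s)");
    }
}
```

Under this scheme values below 128 take a single byte instead of four, while negative values take five, so a real implementation would probably need something like zigzag mapping if negative numbers are common.
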
> The next step is to add a JMH micro-benchmark to check the impact on
> performance.
>
>
> Example on a simple object w/ null compaction
>
> Length=55 FooterPosition=50
> 0x67 // ValueType
> 0x01 // FormatVersion
> 0x2b 0x00 // Flags userType=true hasSchema=true offset=1 compactFooter=true
> 0x78 0x66 0xbe 0x44 //TypeId
> 0xf9 0xcd 0x07 0x57 //Hashcode
> 0x37 0x00 0x00 0x00 //Length
> 0x3d 0xa8 0x15 0xe4 //SchemaId
> 0x32 0x00 0x00 0x00 //Footer position = 50
> 0x03 0x01 0x00 0x00 0x00 0x03 0x01 0x00 0x00 0x00 0x09 0x03 0x00 0x00 0x00
> 0x61 0x62 0x63 0x09 0x03 0x00 0x00 0x00 0x61 0x62 0x63
> Footer length=5
> 0x18 0x1d 0x22 0x2a 0x47
>
> and w/o null compaction
> Length=60 FooterPosition=53
> 0x67 // ValueType
> 0x01 // FormatVersion
> 0x2b 0x00 // Flags userType=true hasSchema=true offset=1 compactFooter=true
> 0x78 0x66 0xbe 0x44 //TypeId
> 0xa4 0x43 0x0e 0xf5 //Hashcode
> 0x3c 0x00 0x00 0x00 //Length
> 0x3d 0xa8 0x15 0xe4 //SchemaId
> 0x35 0x00 0x00 0x00 //Footer position = 53
> 0x03 0x01 0x00 0x00 0x00 0x03 0x01 0x00 0x00 0x00 0x09 0x03 0x00 0x00 0x00
> 0x61 0x62 0x63 0x65 0x65 0x65 0x09 0x03 0x00 0x00 0x00 0x61 0x62 0x63
> Footer length=7
> 0x18 0x1d 0x22 0x2a 0x2b 0x2c 0x2d
>
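
The lengths quoted in the two dumps can be checked against the header bytes with a small sketch like the one below. The field positions (length at byte 12, footer position at byte 20) are inferred from the order of the lines in the dumps, and HeaderSketch with its helpers is a hypothetical name, not an Ignite class.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: reads two little-endian ints out of the 24-byte header.
final class HeaderSketch {
    // type(1) + version(1) + flags(2) + typeId(4) + hashcode(4) = 12
    static final int LENGTH_OFFSET = 12;
    // length(4) + schemaId(4) follow, so the footer position starts at 20
    static final int FOOTER_POS_OFFSET = 20;

    /** Convenience helper so hex literals above 0x7f need no casts. */
    static byte[] bytes(int... values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++)
            out[i] = (byte) values[i];
        return out;
    }

    static int readIntLE(byte[] data, int offset) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getInt(offset);
    }

    static int totalLength(byte[] header) {
        return readIntLE(header, LENGTH_OFFSET);
    }

    static int footerPosition(byte[] header) {
        return readIntLE(header, FOOTER_POS_OFFSET);
    }

    static int footerLength(byte[] header) {
        return totalLength(header) - footerPosition(header);
    }

    public static void main(String[] args) {
        // Header of the compacted example from the message.
        byte[] compacted = bytes(0x67, 0x01, 0x2b, 0x00, 0x78, 0x66, 0xbe, 0x44,
            0xf9, 0xcd, 0x07, 0x57, 0x37, 0x00, 0x00, 0x00,
            0x3d, 0xa8, 0x15, 0xe4, 0x32, 0x00, 0x00, 0x00);

        System.out.println("length=" + totalLength(compacted)
            + " footerPosition=" + footerPosition(compacted)
            + " footerLength=" + footerLength(compacted));
    }
}
```

For this sample object the total drops from 60 to 55 bytes, i.e. 5 bytes or roughly 8%, which sits inside the 5% to 20% range mentioned above.
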
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
>
