I went for a simpler approach (null mask only), and yes, the gain is high for smaller objects but low otherwise: I gain between 5% and 20% on my objects. But to me it is the stepping stone to easily implementing other optimisations, like varint and schemaless, without using raw mode. I am trying to fix the latest unit tests to give you a better idea. If it is not worth it then let's not do it, but I think it is worth a try.
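
To make the varint idea concrete, here is a minimal sketch of a variable-length integer encoding (unsigned LEB128 style: 7 payload bits per byte, high bit set when more bytes follow). The class and method names are hypothetical, and the encoding that IGNITE-6418 ends up using may well differ; this only illustrates why small numbers shrink from 4 bytes to 1 or 2.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration, not Ignite code.
final class VarIntSketch {
    /** Encodes an int with 7 payload bits per byte; the high bit marks "more bytes follow". */
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // low 7 bits plus continuation flag
            value >>>= 7;                     // unsigned shift so negative values terminate
        }
        out.write(value);                     // last byte, continuation flag clear
        return out.toByteArray();
    }

    /** Decodes a value produced by encode(). */
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return result;
            shift += 7;
        }
        throw new IllegalArgumentException("Truncated varint");
    }

    public static void main(String[] args) {
        for (int v : new int[] {1, 127, 128, 300, Integer.MAX_VALUE}) {
            byte[] enc = encode(v);
            System.out.println(v + " -> " + enc.length + " byte(s), round-trips: " + (decode(enc) == v));
        }
    }
}
```

Values below 128 take one byte instead of four, while large or negative values take five, so the gain depends entirely on how small the stored numbers usually are.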
-----Original Message-----
From: Ilya Kasnacheev <ilya.kasnach...@gmail.com>
Sent: Monday, May 25, 2020 11:48 AM
To: dev <dev@ignite.apache.org>
Subject: Re: IGNITE-6499 Compact NULL fields

Hello!

I can't help but wonder how large a benefit it will give. I have checked the ticket description: the proposed scheme looks elaborate, and the benefit for non-extreme binary objects rather tiny.

WDYT?

Regards,
--
Ilya Kasnacheev

Mon, 18 May 2020 at 22:54, steve.hostett...@gmail.com <steve.hostett...@gmail.com>:

> Hello igniters,
>
> while I would like to help on Calcite, because the H2 optimiser (or the
> lack thereof) is really killing us, I think that it would be wiser to
> start by contributing to something easier.
>
> Therefore I will tackle another problem that we have, which is memory
> consumption. I stumbled upon this IEP
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-2%3A+Binary+object+format+improvements
>
> that is about optimising the binary marshaller.
>
> The low-hanging fruit seemed to be the null compaction, so I decided to
> start with it. Though I am sure I do see some hidden complexity.
>
> Here are a couple of questions:
> - Can I assign myself IGNITE-6499 and attach a patch?
> - Who can I contact to help with the review? On the following page
>   https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute
>   there is no one assigned for marshalling.
>
> On the details:
> The compression is disabled by default, as it is not compatible with
> objects previously marshalled.
>
> My approach was to go a bit beyond the JIRA. Not only do I remove the
> indexes to null fields in the footer, I also remove the 0x65 bytes in
> the objects. I did not remove them from the collections and arrays
> because they use absolute positioning.
>
> I gain between 5% and 20% depending on my test cases. Obviously, the
> smaller the object and the higher the number of nulls, the higher the
> compression rate.
>
> Based on that, I can quite easily add varint compression, which is
> IGNITE-6418 and should significantly increase the compression rate
> for objects with a lot of integers and longs that hold only small numbers.
>
> The next step is to add a JMH micro-benchmark to check the impact in
> terms of performance.
>
>
> Example on a simple object w/ null compaction
>
> Length=55 FooterPosition=50
> 0x67 // ValueType
> 0x01 // FormatVersion
> 0x2b 0x00 // Flags userType=true hasSchema=true offset=1 compactFooter=true
> 0x78 0x66 0xbe 0x44 // TypeId
> 0xf9 0xcd 0x07 0x57 // Hashcode
> 0x37 0x00 0x00 0x00 // Length
> 0x3d 0xa8 0x15 0xe4 // SchemaId
> 0x32 0x00 0x00 0x00 // Footer position = 50
> 0x03 0x01 0x00 0x00 0x00 0x03 0x01 0x00 0x00 0x00 0x09 0x03 0x00 0x00 0x00
> 0x61 0x62 0x63 0x09 0x03 0x00 0x00 0x00 0x61 0x62 0x63
> Footer length=5
> 0x18 0x1d 0x22 0x2a 0x47
>
> and w/o null compaction
>
> Length=60 FooterPosition=53
> 0x67 // ValueType
> 0x01 // FormatVersion
> 0x2b 0x00 // Flags userType=true hasSchema=true offset=1 compactFooter=true
> 0x78 0x66 0xbe 0x44 // TypeId
> 0xa4 0x43 0x0e 0xf5 // Hashcode
> 0x3c 0x00 0x00 0x00 // Length
> 0x3d 0xa8 0x15 0xe4 // SchemaId
> 0x35 0x00 0x00 0x00 // Footer position = 53
> 0x03 0x01 0x00 0x00 0x00 0x03 0x01 0x00 0x00 0x00 0x09 0x03 0x00 0x00 0x00
> 0x61 0x62 0x63 0x65 0x65 0x65 0x09 0x03 0x00 0x00 0x00 0x61 0x62 0x63
> Footer length=7
> 0x18 0x1d 0x22 0x2a 0x2b 0x2c 0x2d
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
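
The arithmetic behind the two dumps above can be cross-checked with a small sketch. It assumes a 24-byte header (the sum of the header fields listed in the dumps), one-byte footer offsets (offset=1 in the flags), and reads the trailing 0x47 of the compacted footer as a presence bitmask with one bit per field. That last point is an inference from the bytes, not something the message states, and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration, not Ignite code.
final class NullCompactionSketch {
    // Header size in both dumps: 1 + 1 + 2 + 4 + 4 + 4 + 4 + 4 bytes.
    static final int HEADER_LEN = 24;

    /**
     * fieldSizes[i] is the encoded size of field i, or 0 for a null field.
     * Returns one offset per non-null field followed by a presence mask
     * (ASSUMPTION: bit i set means field i is present).
     */
    static int[] compactFooter(int[] fieldSizes) {
        if (fieldSizes.length > 8)
            throw new IllegalArgumentException("sketch supports at most 8 fields (one mask byte)");
        int nonNull = 0;
        for (int size : fieldSizes)
            if (size > 0) nonNull++;
        int[] footer = new int[nonNull + 1];
        int offset = HEADER_LEN, mask = 0, k = 0;
        for (int i = 0; i < fieldSizes.length; i++) {
            if (fieldSizes[i] == 0) continue; // null: no data byte, no offset
            mask |= 1 << i;
            footer[k++] = offset;
            offset += fieldSizes[i];
        }
        footer[k] = mask;
        return footer;
    }

    /** Total length with null compaction: header + non-null data + compact footer. */
    static int compactLength(int[] fieldSizes) {
        int data = Arrays.stream(fieldSizes).sum();
        return HEADER_LEN + data + compactFooter(fieldSizes).length;
    }

    /** Total length without compaction: each null costs one 0x65 byte plus one footer offset. */
    static int plainLength(int[] fieldSizes) {
        int data = Arrays.stream(fieldSizes).map(s -> Math.max(s, 1)).sum();
        return HEADER_LEN + data + fieldSizes.length;
    }

    public static void main(String[] args) {
        // Two ints (5 bytes each), a 3-char string (8 bytes), three nulls, another string.
        int[] sizes = {5, 5, 8, 0, 0, 0, 8};
        StringBuilder hex = new StringBuilder();
        for (int b : compactFooter(sizes))
            hex.append(String.format("0x%02x ", b));
        System.out.println(hex.toString().trim());   // 0x18 0x1d 0x22 0x2a 0x47
        System.out.println(compactLength(sizes));    // 55
        System.out.println(plainLength(sizes));      // 60
    }
}
```

For this object the saving is 5 bytes out of 60, roughly 8%, which sits within the 5% to 20% range quoted in the thread.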