09.06.2022 16:18, Adriano dos Santos Fernandes wrote:

09.06.2022 15:16, Adriano dos Santos Fernandes wrote:

Yes, it should work. However, I'm not going to remove the limit until we
introduce denser compression. Also, we have a number of places where
records are stored unpacked in memory (rpb's, RecordBuffer, HashJoin,
etc.), so longer records could increase server memory usage. This should
be improved somehow.

Yes, but it's really frustrating when one needs to improve a schema and hits the limit.

And sometimes the arbitrary limit is exceeded by only a small margin, e.g. when switching a field's character set or making a small but necessary increase in length. Re-encoding a single-byte character set column to UTF8, for instance, multiplies its byte length by up to four.

Agreed. In the meantime, I wouldn't object to keeping the limit but raising it to, say, 256KB.

The same goes for record formats being limited to 255. It's awful, and it's
related, as the format number could also be variable-length encoded so it
doesn't always take 2 bytes.
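
As an illustration of the variable-length idea: a minimal LEB128-style sketch. It assumes nothing about Firebird's actual on-disk layout, and the function names are made up for the example:

#include <cstddef>
#include <cstdint>

// Encode a format number as a varint: 7 payload bits per byte, high bit set
// on every byte except the last. Formats 0..127 then need only one byte.
// Returns the number of bytes written; buf must hold at least 5 bytes.
size_t encodeFormatNumber(uint32_t format, uint8_t* buf)
{
    size_t len = 0;
    while (format >= 0x80)
    {
        buf[len++] = static_cast<uint8_t>(format) | 0x80;
        format >>= 7;
    }
    buf[len++] = static_cast<uint8_t>(format);
    return len;
}

// Decode the varint back, advancing *pos past the consumed bytes.
uint32_t decodeFormatNumber(const uint8_t* buf, size_t* pos)
{
    uint32_t format = 0;
    int shift = 0;
    uint8_t byte;
    do
    {
        byte = buf[(*pos)++];
        format |= static_cast<uint32_t>(byte & 0x7F) << shift;
        shift += 7;
    } while (byte & 0x80);
    return format;
}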

True. Another approach is also possible: (optionally) extend sweeping to upgrade the record format of the committed records on the data pages being swept, garbage collect unused formats and re-use them when the counter wraps around.
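
A rough sketch of how such a sweep extension could look; all the names below are hypothetical and do not match Firebird's real internals:

#include <set>
#include <vector>

// While sweeping a data page, committed records still carrying an old format
// are re-packed into the latest one; format numbers no longer referenced
// anywhere on disk become reusable once the counter wraps around.
struct RecordStub
{
    unsigned format;
    // ... packed payload would live here ...
};

struct DataPageStub
{
    std::vector<RecordStub> records;
};

struct RelationStub
{
    unsigned latestFormat = 0;
    std::set<unsigned> formatsInUse;    // rebuilt during the sweep

    void sweepPage(DataPageStub& page)
    {
        for (RecordStub& rec : page.records)
        {
            if (rec.format != latestFormat)
            {
                upgradeRecord(rec);     // re-pack against the newest format descriptor
                rec.format = latestFormat;
            }
            formatsInUse.insert(rec.format);
        }
    }

    // After a full sweep, any format number never seen on disk can be garbage
    // collected and handed out again when the counter wraps.
    int reusableFormatNumber(unsigned limit) const
    {
        for (unsigned n = 0; n < limit; ++n)
            if (formatsInUse.count(n) == 0)
                return static_cast<int>(n);
        return -1;                      // nothing free: the limit itself must grow
    }

    void upgradeRecord(RecordStub&)
    {
        // Fill defaults for added fields, drop removed ones, convert types, etc.
    }
};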

The problem, however, is that format-aware processing was found to be
slower. The dumb scheme presented above (with no real compression)
provided almost the same record size as RLE compression for mixed
"real-world" fields and was even denser for records with longish UTF8
fields, but it was also ~20% slower.
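
For context, the RLE baseline referred to here works along the lines of classic byte-run compression. The sketch below only illustrates that general approach with one common control-byte convention; it is not the engine's actual implementation:

#include <cstddef>
#include <cstdint>
#include <vector>

// Byte-run RLE: each chunk starts with a signed control byte. A negative
// value -n means "the next byte repeats n times"; a positive value n means
// "n literal bytes follow". Runs of equal bytes (trailing spaces, NULL
// padding) collapse to two bytes, everything else is copied through.
std::vector<int8_t> rlePack(const uint8_t* data, size_t length)
{
    std::vector<int8_t> out;
    size_t i = 0;
    while (i < length)
    {
        // Measure the run of identical bytes starting at i (capped at 127).
        size_t run = 1;
        while (i + run < length && data[i + run] == data[i] && run < 127)
            ++run;

        if (run >= 3)
        {
            out.push_back(static_cast<int8_t>(-static_cast<int>(run)));
            out.push_back(static_cast<int8_t>(data[i]));
            i += run;
        }
        else
        {
            // Copy literals until the next compressible run (or ~127 bytes).
            const size_t start = i;
            while (i < length && i - start < 126)
            {
                size_t r = 1;
                while (i + r < length && data[i + r] == data[i] && r < 3)
                    ++r;
                if (r >= 3)
                    break;              // a worthwhile run starts here
                i += r;
            }
            out.push_back(static_cast<int8_t>(i - start));
            for (size_t j = start; j < i; ++j)
                out.push_back(static_cast<int8_t>(data[j]));
        }
    }
    return out;
}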

If the storage takes less space, does this slowdown estimate also take into
account the smaller number of pages read (when pages are not cached)?

Surely reading from disk is way slower than decompressing in memory, so fewer data pages to read easily outweighs the increased decompression penalty. Things are not so cool when the working set is cached, though.
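
To put rough numbers on that trade-off, a back-of-envelope comparison; every figure below is an assumption chosen for illustration (spinning-disk latency, records per page, decode cost), not a measurement of Firebird:

#include <cstdio>

int main()
{
    // Assumed figures, for illustration only.
    const double pageReadUs     = 5000.0;   // one uncached random page read (spinning disk)
    const double recordsPerPage = 40.0;     // records per 8K data page with the current encoding
    const double extraDecodeUs  = 2.0;      // extra decode cost per record of a denser encoding
    const double sizeReduction  = 0.25;     // denser records => 25% fewer data pages to read
    const double pages          = 1000.0;   // data pages a scan touches today

    // Uncached working set: 25% fewer pages to read vs. extra CPU on every record.
    const double ioSavedUs  = pages * sizeReduction * pageReadUs;       // ~1.25 s saved
    const double cpuExtraUs = pages * recordsPerPage * extraDecodeUs;   // ~80 ms added
    std::printf("uncached: I/O saved %.0f us, extra CPU %.0f us\n", ioSavedUs, cpuExtraUs);

    // Fully cached working set: no I/O left to save, only the CPU penalty remains.
    std::printf("cached:   I/O saved 0 us, extra CPU %.0f us\n", cpuExtraUs);
    return 0;
}

With disk-bound assumptions the saved I/O dwarfs the added CPU time; with everything cached only the penalty is left, which matches the caveat above.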


Dmitry

