On 09.04.2019 17:09, Konstantin Knizhnik wrote:
Hi,

On 09.04.2019 3:27, Ashwin Agrawal wrote:
Heikki and I have been hacking for the past few weeks to implement
in-core columnar storage for PostgreSQL. Here's the design and initial
implementation of Zedstore, compressed in-core columnar storage (table
access method). Attaching the patch and a link to the GitHub branch [1] to
follow along.

Thank you for publishing this patch. IMHO Postgres really lacks proper support for columnar storage, and the table access method
API is the best way to integrate it.

I wanted to compare the memory footprint and performance of zedstore with the standard Postgres heap and my VOPS extension. As test data I used the TPC-H benchmark (actually only the lineitem table, generated with the tpch-dbgen utility at scale factor 10, a ~8GB database). I attached the script that I used to populate the data (you have to download, build and run tpch-dbgen yourself; you can also comment out the code related to VOPS).
Unfortunately, I failed to load the data into zedstore:

postgres=# insert into zedstore_lineitem_projection (select l_shipdate,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag::"char",l_linestatus::"char" from lineitem);
psql: ERROR:  compression failed. what now?
Time: 237804.775 ms (03:57.805)


Then I tried to check whether there is anything in the zedstore_lineitem_projection table:

postgres=# select count(*) from zedstore_lineitem_projection;
psql: WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
psql: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Time: 145710.828 ms (02:25.711)


The backend consumed 16GB of RAM and 16GB of swap and was killed by the OOM killer (undo log?). A subsequent attempt to run the same command failed with the following error:

postgres=# select count(*) from zedstore_lineitem_projection;
psql: ERROR:  unexpected level encountered when descending tree


So the only thing I can do at this moment is report the size of the tables on disk:

postgres=# select pg_relation_size('lineitem');
 pg_relation_size
------------------
      10455441408
(1 row)


postgres=# select pg_relation_size('lineitem_projection');
 pg_relation_size
------------------
       3129974784
(1 row)

postgres=# select pg_relation_size('vops_lineitem_projection');
 pg_relation_size
------------------
       1535647744
(1 row)

postgres=# select pg_relation_size('zedstore_lineitem_projection');
 pg_relation_size
------------------
       2303688704
(1 row)


But I do not know how much data was actually loaded into the zedstore table...
Actually, the main question is why this table is not empty if the INSERT statement failed.

Please let me know if I can somehow help you to reproduce and investigate the problem.


It looks like the original problem was caused by the internal Postgres compressor: I had not configured Postgres to use lz4. When I configured Postgres --with-lz4, the data was correctly inserted into the zedstore table, but it looks like it is not compressed at all:

postgres=# select pg_relation_size('zedstore_lineitem_projection');
 pg_relation_size
------------------
       9363010640
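
As a rough sketch of how the sizes reported above compare, the following Python snippet expresses each table's size as a multiple of the heap projection. The byte counts are copied from the pg_relation_size output in this message (the zedstore figure is the one measured with lz4 enabled); the names SIZES and relative_sizes are only illustrative, and the choice of baseline is an assumption made for presentation.

```python
# Sizes in bytes, copied from the pg_relation_size() output in this message.
# The zedstore figure is the one measured after building Postgres --with-lz4.
SIZES = {
    "lineitem": 10455441408,
    "lineitem_projection": 3129974784,
    "vops_lineitem_projection": 1535647744,
    "zedstore_lineitem_projection": 9363010640,
}


def relative_sizes(sizes, baseline="lineitem_projection"):
    """Return each table's size as a multiple of the baseline table's size."""
    base = sizes[baseline]
    return {name: size / base for name, size in sizes.items()}


if __name__ == "__main__":
    for name, ratio in relative_sizes(SIZES).items():
        gib = SIZES[name] / 2**30
        print(f"{name:<30} {gib:6.2f} GiB  {ratio:5.2f}x")
```

With these numbers the zedstore table comes out at roughly three times the size of the heap projection, while the VOPS table is about half of it.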

No wonder that zedstore shows the worst results:

lineitem                        6240.261 ms
lineitem_projection             5390.446 ms
zedstore_lineitem_projection   23310.341 ms
vops_lineitem_projection         439.731 ms
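
To make the gap between these timings easier to read, here is a minimal Python sketch that divides each elapsed time by the time for the plain lineitem heap table. The values are the millisecond figures listed above; the names TIMINGS_MS and relative_times are illustrative, and using lineitem as the baseline is an assumption made for presentation.

```python
# Elapsed times in milliseconds, copied from the results listed above.
TIMINGS_MS = {
    "lineitem": 6240.261,
    "lineitem_projection": 5390.446,
    "zedstore_lineitem_projection": 23310.341,
    "vops_lineitem_projection": 439.731,
}


def relative_times(timings, baseline="lineitem"):
    """Return each table's elapsed time divided by the baseline's time.

    Values above 1.0 are slower than the baseline, values below 1.0 faster.
    """
    base = timings[baseline]
    return {name: ms / base for name, ms in timings.items()}


if __name__ == "__main__":
    ranked = sorted(relative_times(TIMINGS_MS).items(), key=lambda kv: kv[1])
    for name, factor in ranked:
        print(f"{name:<30} {factor:5.2f}x of the lineitem time")
```

By this measure zedstore takes about 3.7 times as long as the plain heap table, while VOPS needs about 0.07 of the heap time, i.e. it is roughly 14 times faster.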


An updated version of vstore_bench.sql is attached (sorry, there were some errors in the previous version of this script).

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment: vstore_bench.sql
Description: application/sql
