Hi,

Thanks for the updated patch. It's a quite massive amount of code - I I
don't think we had many 2MB patches in the past, so this is by no means
a full review.

1) the psql_1.out is missing a bit of expected output (due to 098fb0079)

2) I'm getting crashes in intarray contrib, due to hitting this error in
lwlock.c (backtrace attached):

        /* Ensure we will have room to remember the lock */
        if (num_held_lwlocks >= MAX_SIMUL_LWLOCKS)
                elog(ERROR, "too many LWLocks taken");

I haven't investigates this too much, but it's regular build with
asserts and TAP tests, so it should be simple to reproduce using "make
check-world" I guess.


3) I did a very simple benchmark, loading a TPC-H data (for 75GB),
followed by pg_dump, and the duration (in seconds) looks like this:

           master    zedstore/pglz    zedstore/lz4
  -------------------------------------------------
   copy      1855            68092            2131
   dump       751              905             811

And the size of the lineitem table (as shown by \d+) is:

  master: 64GB
  zedstore/pglz: 51GB
  zedstore/lz4: 20GB

It's mostly expected lz4 beats pglz in performance and compression
ratio, but this seems a bit too extreme I guess. Per past benchmarks
(e.g. [1] and [2]) the difference in compression/decompression time
should be maybe 1-2x or something like that, not 35x like here.

[1]
https://www.postgresql.org/message-id/20130621000900.GA12425%40alap2.anarazel.de

[2]
https://www.postgresql.org/message-id/20130605150144.GD28067%40alap2.anarazel.de

Furthermore, the pglz compression is not consuming the most CPU, at
least that's what perf says:

    24.82%  postgres          [.] encode_chunk_varlen
    20.49%  postgres          [.] decode_chunk
    13.01%  postgres          [.] merge_attstream_guts.isra.0
    12.68%  libc-2.32.so      [.] __memmove_avx_unaligned_erms
     8.72%  postgres          [.] encode_chunk_fixed
     6.16%  postgres          [.] pglz_compress
     4.36%  postgres          [.] decode_attstream_cont
     2.27%  postgres          [.] 0x00000000000baff0
     1.84%  postgres          [.] AllocSetAlloc
     0.79%  postgres          [.] append_attstream
     0.70%  postgres          [.] palloc

So I wonder if this is a sign of a deeper issue - maybe the lower
compression ratio (for pglz) triggers some sort of feedback loop in
zedstore, or something like that? Not sure, but this seems strange.


4) I looked at some of the code, like merge_attstream etc. and I wonder
if this might be related to some of the FIXME comments. For example this
bit in merge_attstream seems interesting:

    * FIXME: we don't actually pay attention to the compression anymore.
    * We never repack.
    * FIXME: this is backwords, the normal fast path is if (firsttid1 >
lasttid2)

But I suppose that should affect both pglz and lz4, and I'm not sure how
up to date those comments actually are.

BTW the comments in general need updating and tidying up, to make
reviews easier. For example the merge_attstream comment references
attstream1 and attstream2, but those are not the current parameters of
the function.


5) IHMO there should be a #define specifying the maximum number of items
per chunk (60). Currently there are literal constants used in various
places, sometimes 60, sometimes 59 etc. which makes it harder to
understand the code. FWIW 60 seems a bit low, but maybe it's OK.


6) I do think ZSAttStream should track which compression is used by the
stream, for two main reasons. Firstly, there's another patch to support
"custom compression" methods, which (also) allows multiple compression
methods per column. It'd be a bit strange to support that for varlena
columns in heap table, and not here, I guess. Secondly, I think one of
the interesting columnstore features down the road will be execution on
compressed data, which however requires compression method designed for
that purpose, and it's often datatype-specific (delta encoding, ...).

I don't think we need to go as far as supporting "custom" compression
methods here, but I think we should allow different built-in compression
methods for different attstreams.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Core was generated by `postgres: user contrib_regression [local] CREATE INDEX   
                     '.
Program terminated with signal SIGABRT, Aborted.
#0  0x000074f9452239e5 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.31-4.fc32.x86_64
(gdb) bt
#0  0x000074f9452239e5 in raise () from /lib64/libc.so.6
#1  0x000074f94520c895 in abort () from /lib64/libc.so.6
#2  0x0000000000936246 in errfinish (filename=<optimized out>, 
lineno=<optimized out>, funcname=0xab4660 <__func__.5> "LWLockAcquire") at 
elog.c:592
#3  0x00000000007fa895 in LWLockAcquire (lock=0x74f93bcb0400, 
mode=LW_EXCLUSIVE) at lwlock.c:1240
#4  0x000000000053c850 in AdvanceXLInsertBuffer (upto=28450816, 
opportunistic=false) at xlog.c:2203
#5  0x000000000053cbf7 in GetXLogBuffer (ptr=28450816) at xlog.c:1962
#6  0x000000000053d622 in CopyXLogRecordToWAL (EndPos=28557992, 
StartPos=27456680, rdata=0x74f93bb55f60, isLogSwitch=false, write_len=1098071)
    at xlog.c:1561
#7  XLogInsertRecord (rdata=<optimized out>, fpw_lsn=<optimized out>, 
flags=<optimized out>, num_fpi=199) at xlog.c:1125
#8  0x0000000000548dec in XLogInsert (rmid=rmid@entry=0 '\000', 
info=info@entry=176 '\260') at xloginsert.c:818
#9  0x0000000000549716 in log_newpage_range (rel=0x74f93bbaa508, 
forkNum=MAIN_FORKNUM, startblk=<optimized out>, endblk=4261, 
page_std=<optimized out>)
    at xloginsert.c:1182
#10 0x00000000004b2b9e in gistbuild (heap=<optimized out>, 
index=0x74f93bbaa508, indexInfo=<optimized out>) at gistbuild.c:330
#11 0x0000000000581d52 in index_build 
(heapRelation=heapRelation@entry=0x74f93bba4208, 
indexRelation=indexRelation@entry=0x74f93bbaa508, 
    indexInfo=indexInfo@entry=0x21748a0, isreindex=isreindex@entry=false, 
parallel=parallel@entry=true) at index.c:3067
#12 0x00000000005831d6 in index_create (heapRelation=<optimized out>, 
indexRelationName=<optimized out>, indexRelationId=<optimized out>, 
    parentIndexRelid=<optimized out>, parentConstraintId=<optimized out>, 
relFileNode=<optimized out>, indexInfo=<optimized out>, 
    indexColNames=<optimized out>, accessMethodObjectId=<optimized out>, 
tableSpaceId=<optimized out>, collationObjectId=<optimized out>, 
    classObjectId=<optimized out>, coloptions=<optimized out>, 
reloptions=<optimized out>, flags=<optimized out>, constr_flags=<optimized 
out>, 
    allow_system_table_mods=<optimized out>, is_internal=<optimized out>, 
constraintId=<optimized out>) at index.c:1256
#13 0x0000000000626d10 in DefineIndex (relationId=16507, stmt=<optimized out>, 
indexRelationId=0, parentIndexId=0, parentConstraintId=0, 
    is_alter_table=false, check_rights=true, check_not_in_use=true, 
skip_build=false, quiet=false) at indexcmds.c:1118
#14 0x00000000008135c9 in ProcessUtilitySlow (pstate=pstate@entry=0x21b55a0, 
pstmt=pstmt@entry=0x2092a70, 
    queryString=queryString@entry=0x2091920 "CREATE INDEX text_idx on test__int 
using gist (a gist__intbig_ops(siglen = 2024));", 
    context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0, 
queryEnv=queryEnv@entry=0x0, qc=0x7fffc1132580, dest=0x2092b60)
    at utility.c:1521
#15 0x00000000008125b9 in standard_ProcessUtility (pstmt=0x2092a70, 
    queryString=0x2091920 "CREATE INDEX text_idx on test__int using gist (a 
gist__intbig_ops(siglen = 2024));", context=PROCESS_UTILITY_TOPLEVEL, 
    params=0x0, queryEnv=0x0, dest=0x2092b60, qc=0x7fffc1132580) at 
utility.c:1071
#16 0x000000000080fb36 in PortalRunUtility (portal=0x20f9a50, pstmt=0x2092a70, 
isTopLevel=<optimized out>, setHoldSnapshot=<optimized out>, dest=0x2092b60, 
    qc=0x7fffc1132580) at pquery.c:1159
#17 0x00000000008105e4 in PortalRunMulti (portal=portal@entry=0x20f9a50, 
isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, 
    dest=dest@entry=0x2092b60, altdest=altdest@entry=0x2092b60, 
qc=qc@entry=0x7fffc1132580) at pquery.c:1305
#18 0x00000000008114d6 in PortalRun (portal=0x20f9a50, 
count=9223372036854775807, isTopLevel=<optimized out>, run_once=<optimized 
out>, dest=0x2092b60, 
    altdest=0x2092b60, qc=0x7fffc1132580) at pquery.c:779
#19 0x000000000080cd16 in exec_simple_query (query_string=0x2091920 "CREATE 
INDEX text_idx on test__int using gist (a gist__intbig_ops(siglen = 2024));")
    at postgres.c:1239
#20 0x000000000080ea1f in PostgresMain (argc=<optimized out>, argv=<optimized 
out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:4305
#21 0x00000000007804ab in BackendRun (port=<optimized out>, port=<optimized 
out>) at postmaster.c:4488
#22 BackendStartup (port=<optimized out>) at postmaster.c:4210
#23 ServerLoop () at postmaster.c:1727
#24 0x000000000078142c in PostmasterMain (argc=<optimized out>, argv=0x208b5f0) 
at postmaster.c:1400
#25 0x00000000004838f0 in main (argc=8, argv=0x208b5f0) at main.c:209
    98.96%     0.00%  postgres  [unknown]           [.] 0x495641002713333d
            |
            ---0x495641002713333d
               __libc_start_main
               startup_hacks
               PostmasterMain
               ServerLoop
               BackendStartup
               ExitPostmaster
               PostgresMain
               exec_simple_query
               PortalRun
               PortalRunMulti
               PortalRunUtility
               ProcessUtility
               standard_ProcessUtility
               DoCopy
               |          
                --98.95%--CopyFrom
                          |          
                          |--96.09%--CopyMultiInsertInfoFlush
                          |          |          
                          |           --96.09%--CopyMultiInsertBufferFlush
                          |                     |          
                          |                      --96.05%--table_multi_insert
                          |                                |          
                          |                                 
--96.05%--zedstoream_multi_insert
                          |                                           |         
 
                          |                                            
--96.03%--zsbt_tuplebuffer_spool_slots
                          |                                                     
 |          
                          |                                                     
  --95.82%--zsbt_attbuffer_spool
                          |                                                     
            |          
                          |                                                     
             --95.51%--zsbt_attbuffer_flush
                          |                                                     
                       |          
                          |                                                     
                        --95.19%--zsbt_attr_add
                          |                                                     
                                  |          
                          |                                                     
                                  |--86.14%--merge_attstream
                          |                                                     
                                  |          |          
                          |                                                     
                                  |           --85.70%--merge_attstream_guts
                          |                                                     
                                  |                     |          
                          |                                                     
                                  |                     
|--40.57%--append_attstream
                          |                                                     
                                  |                     |          |          
                          |                                                     
                                  |                     |           
--39.48%--encode_chunk
                          |                                                     
                                  |                     |                     | 
         
                          |                                                     
                                  |                     |                     
|--25.36%--encode_chunk_varlen
                          |                                                     
                                  |                     |                     | 
         
                          |                                                     
                                  |                     |                     
|--11.59%--encode_chunk_fixed
                          |                                                     
                                  |                     |                     | 
         
                          |                                                     
                                  |                     |                      
--2.10%--__memmove_avx_unaligned_erms
                          |                                                     
                                  |                     |          
                          |                                                     
                                  |                     
|--27.24%--decode_attstream_cont
                          |                                                     
                                  |                     |          |          
                          |                                                     
                                  |                     |           
--23.14%--decode_chunk
                          |                                                     
                                  |                     |                     | 
         
                          |                                                     
                                  |                     |                     
|--12.71%--decode_chunk_varlen
                          |                                                     
                                  |                     |                     | 
         |          
                          |                                                     
                                  |                     |                     | 
          --2.34%--palloc
                          |                                                     
                                  |                     |                     | 
                    |          
                          |                                                     
                                  |                     |                     | 
                     --1.86%--AllocSetAlloc
                          |                                                     
                                  |                     |                     | 
         
                          |                                                     
                                  |                     |                     
|--6.43%--decode_chunk_fixed
                          |                                                     
                                  |                     |                     | 
         
                          |                                                     
                                  |                     |                      
--3.44%--__memmove_avx_unaligned_erms
                          |                                                     
                                  |                     |          
                          |                                                     
                                  |                      
--1.36%--__memmove_avx_unaligned_erms
                          |                                                     
                                  |          
                          |                                                     
                                   --8.72%--zsbt_attr_pack_attstream
                          |                                                     
                                             |          
                          |                                                     
                                              --8.44%--zs_compress_destSize
                          |                                                     
                                                        |          
                          |                                                     
                                                         --8.41%--pglz_compress
                          |                                                     
                                                                   |          
                          |                                                     
                                                                    
--6.16%--pglz_find_match
                          |          
                           --2.60%--NextCopyFrom
                                     |          
                                     |--1.57%--InputFunctionCall
                                     |          |          
                                     |          |--0.60%--date_in
                                     |          |          
                                     |           --0.56%--numeric_in
                                     |          
                                      --0.83%--NextCopyFromRawFields
                                                |          
                                                 --0.59%--CopyReadLine

Reply via email to