Joe McDonnell created IMPALA-12076:
--------------------------------------

             Summary: Potential performance improvement using ZSTD's ZSTD_decompressDCtx interface
                 Key: IMPALA-12076
                 URL: https://issues.apache.org/jira/browse/IMPALA-12076
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 4.3.0
            Reporter: Joe McDonnell


ORC-639 notes that ZSTD's simple interface initializes the decompression 
context on each call to ZSTD_decompress(). When calling ZSTD_decompress() many 
times, it is better to allocate the context once and pass it to the 
ZSTD_decompressDCtx() interface, which avoids the repeated initialization.

The ZSTD code makes the same recommendation in a comment:

{noformat}
/*= Decompression context
 *  When decompressing many times,
 *  it is recommended to allocate a context only once,
 *  and re-use it for each successive compression operation.
 *  This will make workload friendlier for system's memory.
 *  Use one context per thread for parallel execution. */
typedef struct ZSTD_DCtx_s ZSTD_DCtx;
{noformat}

We should investigate using this for the ZstandardDecompressor in 
decompress.h/.cc. We already reuse a context in the streaming decompression 
mode, but the same approach should also apply to block decompression. 
Something similar is possible for compression as well.

 



