Hi,

On 2022-03-23 18:31:12 -0400, Robert Haas wrote:
> On Wed, Mar 23, 2022 at 5:14 PM Andres Freund <and...@anarazel.de> wrote:
> > The most likely source of problems would be errors thrown while zstd
> > threads are alive. Should make sure that that can't happen.
> >
> > What is the lifetime of the threads zstd spawns? Are they tied to a
> > single compression call? A single ZSTD_createCCtx()? If the latter, how
> > bulletproof is our code ensuring that we don't leak such contexts?
>
> I haven't found any real documentation explaining how libzstd manages
> its threads. I am assuming that it is tied to the ZSTD_CCtx, but I
> don't know. I guess I could try to figure it out from the source code.

I found the following section in the manual [1]:

    ZSTD_c_nbWorkers=400,    /* Select how many threads will be spawned to compress in parallel.
                              * When nbWorkers >= 1, triggers asynchronous mode when invoking ZSTD_compressStream*() :
                              * ZSTD_compressStream*() consumes input and flush output if possible, but immediately gives back control to caller,
                              * while compression is performed in parallel, within worker thread(s).
                              * (note : a strong exception to this rule is when first invocation of ZSTD_compressStream2() sets ZSTD_e_end :
                              *  in which case, ZSTD_compressStream2() delegates to ZSTD_compress2(), which is always a blocking call).
                              * More workers improve speed, but also increase memory usage.
                              * Default value is `0`, aka "single-threaded mode" : no worker is spawned,
                              * compression is performed inside Caller's thread, and all invocations are blocking */

"ZSTD_compressStream*() consumes input ... immediately gives back
control" pretty much confirms that.
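For reference, a minimal sketch of what that async mode looks like from
the caller's side (buffer sizes made up, and real code of course needs
ZSTD_isError() checks on every call):

#include <zstd.h>

int
main(void)
{
	static char src[64 * 1024];	/* stand-in for a chunk of backup data */
	static char dst[ZSTD_COMPRESSBOUND(64 * 1024)];
	ZSTD_CCtx  *cctx = ZSTD_createCCtx();
	ZSTD_inBuffer in = {src, sizeof(src), 0};
	ZSTD_outBuffer out = {dst, sizeof(dst), 0};
	size_t		remaining;

	ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 3);
	/* nbWorkers >= 1 switches ZSTD_compressStream2() to asynchronous mode */
	ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, 4);

	/*
	 * Consumes the input and returns more or less immediately, while the
	 * compression itself runs in the worker threads. (Note it's not
	 * ZSTD_e_end on the first call, which would degrade to a blocking
	 * ZSTD_compress2(), per the manual excerpt above.)
	 */
	ZSTD_compressStream2(cctx, &out, &in, ZSTD_e_continue);

	/* Finish the frame, waiting for the workers to drain. */
	do
	{
		remaining = ZSTD_compressStream2(cctx, &out, &in, ZSTD_e_end);
	} while (remaining != 0);

	ZSTD_freeCCtx(cctx);		/* presumably this also reaps the workers */
	return 0;
}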
Do we care about zstd's memory usage here? I think it's OK to mostly
ignore work_mem/maintenance_work_mem here, but I could also see limiting
concurrency so that the estimated memory usage would fit into
work_mem/maintenance_work_mem.

> It's probably also worth mentioning here that even if, contrary to
> expectations, the compression threads hang around to the end of time
> and chill, in practice nobody is likely to run BASE_BACKUP and then
> keep the connection open for a long time afterward. So it probably
> wouldn't really affect resource utilization in real-world scenarios
> even if the threads never exited, as long as they didn't, you know,
> busy-loop in the background. And I assume the actual library behavior
> can't be nearly that bad. This is a pretty mainstream piece of
> software.

I'm not really worried about resource utilization, more about the mere
existence of threads moving us into undefined-behaviour territory. I
don't think that's possible here, but IIRC it is UB to fork() while
threads are present and then do pretty much *anything* other than
immediately exec*().

> > > but that's not to say that there couldn't be problems. I worry a bit that
> > > the mere presence of threads could in some way mess things up, but I don't
> > > know what the mechanism for that would be, and I don't want to postpone
> > > shipping useful features based on nebulous fears.
> >
> > One thing that'd be good to test for is cancelling in-progress
> > server-side compression. And perhaps a few assertions that ensure that
> > we don't escape with some threads still running. That'd have to be
> > platform dependent, but I don't see a problem with that in this case.
>
> More specific suggestions, please?

I was thinking of something like calling pthread_is_threaded_np() before
and after the zstd section and erroring out if the two results differ.
But I had forgotten that that's a macOS-ism.
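I.e. something vaguely like this around the zstd section (entirely
untested, function names made up; in the backend the assert() would of
course be an Assert() or elog(ERROR)):

#include <assert.h>
#include <pthread.h>			/* pthread_is_threaded_np(), macOS only */

/* stand-in for the server-side ZSTD_compressStream2() loop */
static void
run_zstd_compression(void)
{
}

static void
compress_backup_data(void)
{
	int			was_threaded = pthread_is_threaded_np();

	run_zstd_compression();

	/*
	 * Assumes the flag drops back to 0 once libzstd's worker threads have
	 * exited - I haven't checked whether it actually does.
	 */
	assert(pthread_is_threaded_np() == was_threaded);
}

int
main(void)
{
	compress_backup_data();
	return 0;
}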
> > > For both parallel and non-parallel zstd compression, I see differences
> > > between the compressed size depending on where the compression is
> > > done. I don't know whether this is an expected behavior of the zstd
> > > library or a bug. Both files uncompress OK and pass pg_verifybackup,
> > > but that doesn't mean we're not, for example, selecting different
> > > compression levels where we shouldn't be. I'll try to figure out
> > > what's going on here.
> > >
> > > zstd, client-side: 1.7GB, 17 seconds
> > > zstd, server-side: 1.3GB, 25 seconds
> > > parallel zstd, 4 workers, client-side: 1.7GB, 7.5 seconds
> > > parallel zstd, 4 workers, server-side: 1.3GB, 7.2 seconds
> >
> > What causes this fairly massive client-side/server-side size difference?
>
> You seem not to have read what I wrote about this exact point in the
> text which you quoted.

Somehow not...

Perhaps it's related to the amount of memory fed to ZSTD_compressStream2()
in one invocation? I recall that there are some differences between the
basebackup client side and server side around buffer sizes - but that was
from before all the recent-ish changes...

Greetings,

Andres Freund

[1] http://facebook.github.io/zstd/zstd_manual.html