theosib-amazon commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1194259084
I did some poking around. It looks like if you call release() on a codec, it (a) resets the codec (freeing resources, I think) and (b) returns it to a pool of codecs without actually destroying the codec. Later, when release() is called on the factory, it just calls release() again on each of the codecs, returning them to the pool. The only other effect is that references are removed from a container in the factory.

The only question, then, is what happens if release() is called twice on a codec. It looks like nothing happens, because CodecPool.payback() will return false when the codec is already in the pool. Moreover, I'm pretty sure the original implementation already did this.

So I think the solution is to literally do nothing. The new usage pattern is now:

- Create the codec factory
- Create worker threads
- Threads create codecs
- Threads finish using codecs
- Threads *optionally* call release() on their codecs if they want to free resources right away
- Threads terminate
- The thread that created the worker threads waits until those threads are done
- release() is called on the factory, cleaning up any codecs that were not released already
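
For illustration, the usage pattern above can be sketched with a toy model. This is not the parquet-mr or Hadoop code: `ToyCodec`, `ToyPool`, `ToyFactory`, and `runWorkers` are hypothetical stand-ins invented for this sketch. The only behavior borrowed from the discussion is that `payback()` returns false when the codec is already in the pool, which is what would make a second release harmless.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class CodecReleaseSketch {

    /** Hypothetical stand-in for a pooled codec resource. */
    static final class ToyCodec {
        void reset() { /* would free per-use state; a no-op in this toy */ }
    }

    /** Hypothetical pool: payback() returns false if the codec is already pooled. */
    static final class ToyPool {
        private final Set<ToyCodec> idle = ConcurrentHashMap.newKeySet();

        boolean payback(ToyCodec codec) {
            return idle.add(codec); // Set.add is false when already present
        }

        int size() {
            return idle.size();
        }
    }

    /** Hypothetical factory that remembers every codec it handed out. */
    static final class ToyFactory {
        final ToyPool pool = new ToyPool();
        private final Set<ToyCodec> created = ConcurrentHashMap.newKeySet();

        ToyCodec createCodec() {
            ToyCodec codec = new ToyCodec();
            created.add(codec);
            return codec;
        }

        /** Optional early release: reset, then return to the pool. */
        boolean releaseCodec(ToyCodec codec) {
            codec.reset();
            return pool.payback(codec);
        }

        /** Final cleanup: releases every codec again, then drops the references. */
        void release() {
            for (ToyCodec codec : created) {
                releaseCodec(codec); // no effect on the pool if released earlier
            }
            created.clear();
        }
    }

    /** Runs the pattern and returns how many codecs ended up in the pool. */
    static int runWorkers(int workerCount) {
        ToyFactory factory = new ToyFactory();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            final boolean releaseEarly = (i % 2 == 0); // only some threads release
            Thread t = new Thread(() -> {
                ToyCodec codec = factory.createCodec();
                // ... the worker would use the codec here ...
                if (releaseEarly) {
                    factory.releaseCodec(codec);
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // the creating thread waits for all workers
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        factory.release(); // cleans up whatever was not released already
        return factory.pool.size();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4)); // prints 4
    }
}
```

In this toy, each codec lands in the pool exactly once whether or not its thread released it early, which mirrors the claim that a double release does nothing.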
