[GitHub] [parquet-mr] theosib-amazon commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-07-26 Thread GitBox
theosib-amazon commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1195676003 I just thought of something that makes me nervous about this PR that requires further investigation. Consider the following scenario:
- Thread A allocates a codec
- Thread A

[GitHub] [parquet-mr] theosib-amazon commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-07-25 Thread GitBox
theosib-amazon commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1194259084 I did some poking around. It looks like if you call release() on a codec, it (a) resets the codec (freeing resources, I think) and (b) returns it to a pool of codecs without

[GitHub] [parquet-mr] theosib-amazon commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-07-25 Thread GitBox
theosib-amazon commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1194158751 One option is to provide another API call that releases the cached instance for only the current thread. What should we call it? I forget whether close or release is used more,

[GitHub] [parquet-mr] theosib-amazon commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-07-20 Thread GitBox
theosib-amazon commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1190765848
> @theosib-amazon Do you still have time for addressing the feedback? I think we are very close to merge.
I'm not really sure which feedback to address. Are you concerned

[GitHub] [parquet-mr] theosib-amazon commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-05-17 Thread GitBox
theosib-amazon commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1129005310 I added a test.

[GitHub] [parquet-mr] theosib-amazon commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-05-16 Thread GitBox
theosib-amazon commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1127885617 > My question is when a thread exits, we don't have a corresponding evict operation on the map. Using thread pool might be OK if the thread object is not changed, but not sure if

[GitHub] [parquet-mr] theosib-amazon commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-05-16 Thread GitBox
theosib-amazon commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1127839048 > If we change it to be per thread, then would it be a problem in the scenario where short living threads come and go? When the thread stopped, we might not know and leak here.

[GitHub] [parquet-mr] theosib-amazon commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-04-22 Thread GitBox
theosib-amazon commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1106632106 Alright. You have a point. If the maintainers want me to delete that stuff, they can let me know, and I'll go ahead and do it.