SAY-5 commented on issue #794: URL: https://github.com/apache/arrow-go/issues/794#issuecomment-4386135478
Quick verification on a fresh `main` checkout (post #786, commit 23c1ed3): the reproducer's `Leaked` count is now 0, while v18.6.0 still leaks 4_718_592 bytes from the 36 row-group closes. The dictionary-fallback rework in #786 (Apr 30) added a per-page `defer p.Release()` inside the renamed `drainBufferedDataPages`, which incidentally drains the dict-encoded `pages[]` even when the underlying writer fails. That closes the surface this report exercises.

The structural anti-patterns in the report ([`rowGroupWriter.Close` early-returns](https://github.com/apache/arrow-go/blob/main/parquet/file/row_group_writer.go#L240-L248) on the first column-close error, and [`columnWriter.Close` registers its cleanup defer after fallible calls](https://github.com/apache/arrow-go/blob/main/parquet/file/column_writer.go#L606-L630)) are still on `main`, but they're no longer observably reachable via the buffered `WriteBuffered` path.

If anyone has another reproducer that still leaks on `main` (e.g. a `SerialRowGroupWriter` path where the failing column has accumulated post-fallback pages in `inMemSink`), that would be useful to share; happy to take a regression-test PR if so. Otherwise this is probably best closed as fixed-by-#786.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
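
For readers who have not followed the report, the two structural anti-patterns mentioned in the comment can be sketched in isolation. This is a toy model, not arrow-go code: `tracker`, `page`, `column`, `closeLeaky`, `closeSafe`, `closeAllEarlyReturn`, `closeAllJoined`, and `leakedPages` are all hypothetical names invented for illustration, and `errors.Join` assumes Go 1.20 or later. It only shows why a cleanup `defer` registered after a fallible call, combined with an early return in the outer close loop, can leave buffers unreleased.

```go
package main

import (
	"errors"
	"fmt"
)

// tracker counts pages that have been allocated but not yet released.
type tracker struct{ live int }

// page is a hypothetical stand-in for a ref-counted page buffer.
type page struct{ t *tracker }

func newPage(t *tracker) *page { t.live++; return &page{t: t} }
func (p *page) Release()       { p.t.live-- }

// column is a hypothetical column writer holding buffered pages.
type column struct {
	pages []*page
	fail  bool // when true, flush simulates a failing underlying writer
}

func (c *column) flush() error {
	if c.fail {
		return errors.New("sink write failed")
	}
	return nil
}

func (c *column) releasePages() {
	for _, p := range c.pages {
		p.Release()
	}
	c.pages = nil
}

// closeLeaky registers cleanup only after the fallible call, so a flush
// error returns before the defer statement is ever reached.
func (c *column) closeLeaky() error {
	if err := c.flush(); err != nil {
		return err
	}
	defer c.releasePages()
	return nil
}

// closeSafe registers cleanup first, so pages are released on every path.
func (c *column) closeSafe() error {
	defer c.releasePages()
	return c.flush()
}

// closeAllEarlyReturn stops at the first error; later columns are never closed.
func closeAllEarlyReturn(cols []*column, closeFn func(*column) error) error {
	for _, c := range cols {
		if err := closeFn(c); err != nil {
			return err
		}
	}
	return nil
}

// closeAllJoined closes every column and reports all errors together.
func closeAllJoined(cols []*column, closeFn func(*column) error) error {
	var errs []error
	for _, c := range cols {
		errs = append(errs, closeFn(c))
	}
	return errors.Join(errs...) // nil entries are dropped; nil if none failed
}

// leakedPages builds two columns (the first one fails to flush), closes
// them with the given strategy, and returns how many pages remain live.
func leakedPages(closeAll func([]*column, func(*column) error) error, closeFn func(*column) error) int {
	t := &tracker{}
	cols := []*column{
		{pages: []*page{newPage(t), newPage(t)}, fail: true},
		{pages: []*page{newPage(t)}},
	}
	_ = closeAll(cols, closeFn)
	return t.live
}

func main() {
	fmt.Println(leakedPages(closeAllEarlyReturn, (*column).closeLeaky)) // 3
	fmt.Println(leakedPages(closeAllEarlyReturn, (*column).closeSafe))  // 1
	fmt.Println(leakedPages(closeAllJoined, (*column).closeLeaky))      // 2
	fmt.Println(leakedPages(closeAllJoined, (*column).closeSafe))       // 0
}
```

In this model only the combination of defer-first cleanup and close-everything iteration reaches zero live pages, which is why fixing one reachable path can hide the pattern without removing it.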
