zeroshade commented on code in PR #864:
URL: https://github.com/apache/arrow-go/pull/864#discussion_r3475598270
##########
parquet/metadata/bloom_filter_leak_test.go:
##########
@@ -0,0 +1,162 @@
+package metadata
Review Comment:
this is missing the apache license header that is required.
##########
parquet/metadata/bloom_filter.go:
##########
@@ -56,6 +56,11 @@ var (
defaultCompression = format.BloomFilterCompression{UNCOMPRESSED:
&format.Uncompressed{}}
)
+var (
+ headerPool = make(chan []byte, 512)
+ filterPool = make(chan *memory.Buffer, 512)
+)
Review Comment:
you're replacing the per-reader, allocator-aware `sync.Pool` with package
level globals. You're bypassing the user-configured allocator (which means that
if users are using the `mallocator` i.e. CGO/non-go memory, unreleased buffers
will leak native memory rather than Go memory that can be reclaimed).
##########
parquet/metadata/bloom_filter.go:
##########
@@ -292,6 +297,13 @@ func (b *blockSplitBloomFilter) WriteTo(w io.Writer, enc
encryption.Encryptor) (
return w.Write(b.data.Bytes())
}
+func (b *blockSplitBloomFilter) Close() {
+ if b.cancelCleanup != nil {
+ b.cancelCleanup()
+ b.cancelCleanup = nil
+ }
+}
Review Comment:
this isn't actually called anywhere in the code except for in your new test,
so this PR doesn't actually fix anything
##########
parquet/metadata/bloom_filter.go:
##########
@@ -360,6 +372,7 @@ type BloomFilter interface {
Hasher() Hasher
CheckHash(hash uint64) bool
Size() int64
+ Close()
Review Comment:
this is a breaking change that we shouldn't introduce and shouldn't be
needed.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]