ssirovica commented on issue #14007:
URL: https://github.com/apache/arrow/issues/14007#issuecomment-1232211982

   Looking forward to any insights you have!
   
   Here are a few key notes from my debugging:
   1. I believe the leak is specifically on the write path. If we don't write to the file, e.g.:
   ```
   if i%2000000 == 0 {
       // Build a record (one row group) every 2M iterations
       rec := builder.NewRecord()
       // w.Write(rec) // write commented out for this test
       rec.Release()
   }
   ```
   there is no memory leak.
   2. Using a different schema, such as one built from `arrow.PrimitiveTypes`, doesn't have this issue. It only happens with `BinaryTypes`.
   3. At one point I used the checked allocator (https://github.com/apache/arrow/blob/master/go/arrow/memory/checked_allocator.go#L35), and the reference counts after each record release were 0. However, that's at the lowest level, so memory could still be held elsewhere.
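
A rough way to double-check note 1 without full heap profiles is to compare the live heap before and after a batch of iterations. The sketch below uses only the Go runtime. `step`, `retained`, and `growth` are hypothetical stand-ins, not Arrow APIs, and the 1 MiB buffer only simulates a record that something keeps holding on to.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapAlloc forces a garbage collection and returns the bytes of
// heap objects that are still reachable afterwards.
func heapAlloc() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// retained is a hypothetical stand-in for whatever keeps buffers
// alive on the write path.
var retained [][]byte

// step simulates one loop iteration. When hold is true the buffer
// stays referenced, which mimics a leak; otherwise it becomes garbage.
func step(hold bool) {
	buf := make([]byte, 1<<20) // 1 MiB, illustrative size only
	buf[0] = 1
	if hold {
		retained = append(retained, buf)
	}
}

// growth runs n iterations and returns how much the live heap
// changed, in bytes, measured after a forced GC on both sides.
func growth(n int, hold bool) int64 {
	retained = nil
	before := heapAlloc()
	for i := 0; i < n; i++ {
		step(hold)
	}
	after := heapAlloc()
	return int64(after) - int64(before)
}

func main() {
	fmt.Printf("growth without retention: %d bytes\n", growth(16, false))
	fmt.Printf("growth with retention:    %d bytes\n", growth(16, true))
}
```

In the real reproduction, the same before/after comparison could be wrapped around the loop with and without the `w.Write(rec)` call. Growth that scales with the iteration count only when the write is enabled would point at the write path.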
   
   PS: I can share all my heap profiles, but since you can reproduce the issue I'll let you grab dumps yourself unless you'd like me to attach them.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
