eerhardt commented on pull request #10562:
URL: https://github.com/apache/arrow/pull/10562#issuecomment-875768799


   Take a look at all the changes we've been making in the dotnet/runtime 
libraries that reduce allocations: 
https://github.com/dotnet/runtime/pulls?q=is%3Apr+is%3Aclosed+allocation+. Even 
the article I linked says:
   
   > Lots of effort goes into reducing allocation, not because the act of 
allocating is itself particularly expensive, but because of the follow-on costs 
in cleaning up after those allocations via the garbage collector (GC).
   
   If you allocate fewer objects, the GC has less work to do.
   
   > I still value API "unsurpriseness"
   
   I agree. Looking at the rest of the APIs that copy, they all take 
`IEnumerable<T>` for the parameter, and then call `ToList` or `ToArray`. Here 
are some examples:
   
   
https://github.com/apache/arrow/blob/8e43f23dcc6a9e630516228f110c48b64d13cec6/csharp/src/Apache.Arrow/Arrays/ArrayData.cs#L34-L44
   
   
https://github.com/apache/arrow/blob/8e43f23dcc6a9e630516228f110c48b64d13cec6/csharp/src/Apache.Arrow/RecordBatch.cs#L63-L70
   
   
https://github.com/apache/arrow/blob/8e43f23dcc6a9e630516228f110c48b64d13cec6/csharp/src/Apache.Arrow/Schema.cs#L39-L48
   
   But then we also have `internal` APIs that take `List<T>` and the code takes 
ownership of that list without copying. This reduces allocations internally, 
while keeping the public API "unsurprising".
   
   So how about following that same pattern here?
   
   * Change these APIs to take `IEnumerable<T>` instead of `IList<T>`
   * Internally when we create a Table, Column, ChunkedArray, we call the 
internal API that takes ownership of the list.
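   
   As a rough sketch of what I mean (the `ChunkList` type, its members, and the 
   `takeOwnership` parameter are hypothetical names for illustration, not the 
   actual Apache.Arrow types or signatures):
   
   ```csharp
   using System;
   using System.Collections.Generic;
   using System.Linq;

   // Hypothetical type for illustration only; not the real Apache.Arrow API.
   public sealed class ChunkList
   {
       private readonly List<int[]> _chunks;

       // Public API: accepts any sequence and copies it with ToList(), so later
       // changes to the caller's collection cannot affect this instance.
       public ChunkList(IEnumerable<int[]> chunks)
           : this((chunks ?? throw new ArgumentNullException(nameof(chunks))).ToList(),
                  takeOwnership: true)
       {
       }

       // Internal API: stores the list as-is, with no copy. Callers inside the
       // library must not touch the list after handing it over. The extra
       // parameter (an assumption of this sketch) only disambiguates the overload.
       internal ChunkList(List<int[]> chunks, bool takeOwnership)
       {
           _chunks = chunks;
       }

       public int Count => _chunks.Count;

       // Internal code paths build a fresh list and pass it straight through,
       // so there is one list allocation instead of two.
       public ChunkList Slice(int offset, int count)
       {
           List<int[]> slice = _chunks.GetRange(offset, count);
           return new ChunkList(slice, takeOwnership: true);
       }
   }
   ```
   
   With this shape, a caller who mutates their own list after constructing a 
   `ChunkList` sees no change in the instance, while `Slice` avoids the second 
   copy that going through the public constructor would incur.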
   
   Thoughts?


