CurtHagenlocher commented on code in PR #3683:
URL: https://github.com/apache/arrow-adbc/pull/3683#discussion_r2500523683
##########
csharp/src/Drivers/Databricks/Lz4Utilities.cs:
##########
@@ -77,13 +87,13 @@ public static ReadOnlyMemory<byte> DecompressLz4(byte[]
compressedData, int buff
/// <summary>
/// Asynchronously decompresses LZ4 compressed data into memory.
- /// Returns the buffer and length as a tuple for efficient wrapping in
a MemoryStream.
+ /// Returns a RecyclableMemoryStream that must be disposed by the
caller.
/// </summary>
/// <param name="compressedData">The compressed data bytes.</param>
/// <param name="cancellationToken">Cancellation token for the async
operation.</param>
- /// <returns>A tuple containing the decompressed buffer and its valid
length.</returns>
+ /// <returns>A MemoryStream containing the decompressed data. Caller
must dispose.</returns>
/// <exception cref="AdbcException">Thrown when decompression
fails.</exception>
- public static Task<(byte[] buffer, int length)> DecompressLz4Async(
+ public static Task<MemoryStream> DecompressLz4Async(
Review Comment:
If these are always returning a `RecyclableMemoryStream` instead of a
`MemoryStream`, perhaps the method should be typed to do that? One important
distinction is that a lot of people don't bother disposing a `MemoryStream`
because it's a NOP, but failure to `Dispose` a `RecyclableMemoryStream` means
that the blocks of memory aren't returned to the manager until the finalizer
runs -- which might be considerably later.
##########
csharp/src/Drivers/Databricks/Lz4Utilities.cs:
##########
@@ -33,6 +34,11 @@ internal static class Lz4Utilities
/// </summary>
private const int DefaultBufferSize = 81920;
+ /// <summary>
+ /// Shared RecyclableMemoryStreamManager for pooled memory stream
allocation.
+ /// </summary>
+ private static readonly RecyclableMemoryStreamManager
_memoryStreamManager = new RecyclableMemoryStreamManager();
Review Comment:
Is this the right scoping and options for the
`RecyclableMemoryStreamManager`? The drawback of making it static is that it
will live for the lifetime of the process. Without a cap on the size -- the
default is 0 --, this memory will be held for the rest of the life of the
process. Notably, the documentation says "If you set these to 0, you can have
unbounded pool growth, which is essentially indistinguishable from a memory
leak."
Perhaps the lifetime could instead be tied to either the Driver or the
Database.
##########
csharp/src/Drivers/Databricks/Lz4Utilities.cs:
##########
@@ -77,13 +87,13 @@ public static ReadOnlyMemory<byte> DecompressLz4(byte[]
compressedData, int buff
/// <summary>
/// Asynchronously decompresses LZ4 compressed data into memory.
- /// Returns the buffer and length as a tuple for efficient wrapping in
a MemoryStream.
+ /// Returns a RecyclableMemoryStream that must be disposed by the
caller.
/// </summary>
/// <param name="compressedData">The compressed data bytes.</param>
/// <param name="cancellationToken">Cancellation token for the async
operation.</param>
- /// <returns>A tuple containing the decompressed buffer and its valid
length.</returns>
+ /// <returns>A MemoryStream containing the decompressed data. Caller
must dispose.</returns>
/// <exception cref="AdbcException">Thrown when decompression
fails.</exception>
- public static Task<(byte[] buffer, int length)> DecompressLz4Async(
+ public static Task<MemoryStream> DecompressLz4Async(
Review Comment:
(It's not a major consideration if only because this isn't a public API.)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]