jvanstraten commented on a change in pull request #12116: URL: https://github.com/apache/arrow/pull/12116#discussion_r807160841
########## File path: cpp/src/arrow/memory_pool.cc ########## @@ -603,14 +643,116 @@ class BaseMemoryPoolImpl : public MemoryPool { stats_.UpdateAllocatedBytes(-size); } - void ReleaseUnused() override { Allocator::ReleaseUnused(); } + protected: + virtual Status AllocateImmutableZeros(int64_t size, uint8_t** out) { +#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS + if (size > 0) { + *out = static_cast<uint8_t*>(mmap( + nullptr, size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0)); + if (*out == MAP_FAILED) { + auto err = errno; + return Status::OutOfMemory("Failed to allocate zero buffer of size ", size, ": ", + strerror(err)); + } + return Status::OK(); + } +#endif + // TODO: jemalloc and mimalloc support zero-initialized allocations as + // well, which might be faster than allocate + memset. + RETURN_NOT_OK(Allocate(size, out)); + std::memset(*out, 0, size); + return Status::OK(); + } + + void FreeImmutableZeros(uint8_t* buffer, int64_t size) override { +#ifdef USE_MMAP_FOR_IMMUTABLE_ZEROS + if (size > 0) { + munmap(buffer, size); + return; + } +#endif + Free(buffer, size); + } + + public: + Result<std::shared_ptr<Buffer>> GetImmutableZeros(int64_t size) override { + // Thread-safely get the current largest buffer of zeros. + auto current_buffer = atomic_load(&immutable_zeros_cache_); + + // If this buffer satisfies the requirements, return it. + if (current_buffer && current_buffer->size() >= size) { + return std::move(current_buffer); + } + + // Acquire the lock for allocating a new buffer. + std::lock_guard<std::mutex> gg(immutable_zeros_mutex_); + + // Between our previous atomic load and acquisition of the lock, another + // thread may have allocated a buffer. So we need to check again. + current_buffer = atomic_load(&immutable_zeros_cache_); + if (current_buffer && current_buffer->size() >= size) { + return std::move(current_buffer); + } + + // Let's now figure out a good size to allocate. This is done + // heuristically, with the following rules: + // - allocate at least the requested size (obviously); + // - allocate at least 2x the previous size; + // - allocate at least kMinAllocSize bytes (to avoid lots of small + // allocations). + static const int64_t kMinAllocSize = 4096; + int64_t alloc_size = + std::max(size, current_buffer ? (current_buffer->size() * 2) : kMinAllocSize); + + // Attempt to allocate the block. + uint8_t* data = nullptr; + auto result = AllocateImmutableZeros(alloc_size, &data); + + // If we fail to do so, fall back to trying to allocate the requested size + // exactly as a last-ditch effort. + if (!result.ok() || data == nullptr) { Review comment: You shouldn't be able to in the default implementation, but `AllocateImmutableZeros` is virtual, so someone could override it in a custom implementation. Still, I suppose I should have gotten rid of that. As a first pass I had it throw an error properly when data is set to null or not modified, then changed my mind and did a `DCHECK` instead. Now the null check there indeed doesn't really do anything useful anymore. 3dda474 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org