niyue commented on a change in pull request #11588:
URL: https://github.com/apache/arrow/pull/11588#discussion_r740770892
##########
File path: cpp/src/arrow/io/file.cc
##########
@@ -719,27 +719,38 @@ Future<std::shared_ptr<Buffer>> MemoryMappedFile::ReadAsync(const IOContext&,
return Future<std::shared_ptr<Buffer>>::MakeFinished(ReadAt(position, nbytes));
}
-Status MemoryMappedFile::WillNeed(const std::vector<ReadRange>& ranges) {
- using ::arrow::internal::MemoryRegion;
-
- RETURN_NOT_OK(memory_map_->CheckClosed());
- auto guard_resize = memory_map_->writable()
- ? std::unique_lock<std::mutex>(memory_map_->resize_lock())
+Status MemoryMappedFile::ReadRangesToMemoryRegions(
+ const std::vector<ReadRange>& ranges,
+ std::shared_ptr<MemoryMappedFile::MemoryMap>& memory_map,
+ std::vector<MemoryRegion>& regions) {
+ RETURN_NOT_OK(memory_map->CheckClosed());
+ auto guard_resize = memory_map->writable()
+ ? std::unique_lock<std::mutex>(memory_map->resize_lock())
: std::unique_lock<std::mutex>();
- std::vector<MemoryRegion> regions(ranges.size());
for (size_t i = 0; i < ranges.size(); ++i) {
const auto& range = ranges[i];
- ARROW_ASSIGN_OR_RAISE(
- auto size,
- internal::ValidateReadRange(range.offset, range.length, memory_map_->size()));
- DCHECK_NE(memory_map_->data(), nullptr);
- regions[i] = {const_cast<uint8_t*>(memory_map_->data() + range.offset),
+ ARROW_ASSIGN_OR_RAISE(auto size, internal::ValidateReadRange(
+ range.offset, range.length, memory_map->size()));
+ DCHECK_NE(memory_map->data(), nullptr);
+ regions[i] = {const_cast<uint8_t*>(memory_map->data() + range.offset),
static_cast<size_t>(size)};
}
+ return Status::OK();
+}
+
+Status MemoryMappedFile::WillNeed(const std::vector<ReadRange>& ranges) {
+ std::vector<MemoryRegion> regions(ranges.size());
+ RETURN_NOT_OK(ReadRangesToMemoryRegions(ranges, memory_map_, regions));
return ::arrow::internal::MemoryAdviseWillNeed(regions);
Review comment:
`MemoryMappedFile` already has a `WillNeed` API to advise the OS about the
ranges that will be needed, so I added a similar `AdviseRandom` API to
indicate a random access pattern. Initially I wanted to keep the name
consistent with `WillNeed` and simply call it `Random`, but I think that could
be slightly confusing, so I have named it `AdviseRandom` for now. Let me know
if you have other naming suggestions for this API.
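
For readers following the hunk above, here is a rough, standalone sketch of the range-to-region conversion that `WillNeed` and `AdviseRandom` would share. `ReadRangeSketch`, `MemoryRegionSketch` and `RangesToRegionsSketch` are hypothetical stand-ins, not the types or signatures in this PR, and the validation rule (reject negative values and offsets past the end, clamp the length to the end of the mapping) is my assumption about what `ValidateReadRange` does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for Arrow's ReadRange and MemoryRegion.
struct ReadRangeSketch {
  std::int64_t offset;
  std::int64_t length;
};
struct MemoryRegionSketch {
  std::uint8_t* addr;
  std::size_t size;
};

// Converts file ranges into address ranges inside a mapping that starts at
// `base` and is `map_size` bytes long. Returns false on an invalid range.
// Assumed rule: negative values and offsets past the end are invalid, and a
// range running past the end is clamped to the end of the mapping.
bool RangesToRegionsSketch(const std::vector<ReadRangeSketch>& ranges,
                           std::uint8_t* base, std::int64_t map_size,
                           std::vector<MemoryRegionSketch>* regions) {
  assert(base != nullptr);  // mirrors the DCHECK_NE in the diff
  regions->clear();
  regions->reserve(ranges.size());
  for (const auto& range : ranges) {
    if (range.offset < 0 || range.length < 0 || range.offset > map_size) {
      return false;
    }
    const std::int64_t size = std::min(range.length, map_size - range.offset);
    regions->push_back({base + range.offset, static_cast<std::size_t>(size)});
  }
  return true;
}
```

Both advise calls could then build their regions through one such helper and differ only in the advice they pass to the OS.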
##########
File path: cpp/src/arrow/util/io_util.cc
##########
@@ -1090,19 +1106,15 @@ Status MemoryAdviseWillNeed(const std::vector<MemoryRegion>& regions) {
}
return Status::OK();
#elif defined(POSIX_MADV_WILLNEED)
- for (const auto& region : regions) {
- if (region.size != 0) {
- const auto aligned = align_region(region);
- int err = posix_madvise(aligned.addr, aligned.size, POSIX_MADV_WILLNEED);
- // EBADF can be returned on Linux in the following cases:
- // - the kernel version is older than 3.9
- // - the kernel was compiled with CONFIG_SWAP disabled (ARROW-9577)
- if (err != 0 && err != EBADF) {
- return IOErrorFromErrno(err, "posix_madvise failed");
- }
- }
- }
Review comment:
I extracted this piece of code from the `MemoryAdviseWillNeed` API so that it
can be reused to give the OS other kinds of advice.
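
To illustrate the idea, the extracted loop could look roughly like the sketch below, with the advice value passed in as a parameter. This is a POSIX-only approximation: `RegionSketch` and `AdviseRegionsSketch` are hypothetical names, the page alignment is a simplified stand-in for `align_region`, and errors are returned as plain errno values rather than a `Status`.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Arrow's MemoryRegion.
struct RegionSketch {
  void* addr;
  std::size_t size;
};

// Applies `advice` (e.g. POSIX_MADV_WILLNEED or POSIX_MADV_RANDOM) to every
// non-empty region. Returns 0 on success or the first errno-style error code.
int AdviseRegionsSketch(const std::vector<RegionSketch>& regions, int advice) {
  const auto page = static_cast<std::uintptr_t>(sysconf(_SC_PAGESIZE));
  assert((page & (page - 1)) == 0);  // assumes a power-of-two page size
  for (const auto& region : regions) {
    if (region.size == 0) continue;
    // Round the start address down to a page boundary and grow the length
    // by the same amount, so the advised span still covers the region.
    const auto begin = reinterpret_cast<std::uintptr_t>(region.addr);
    const auto aligned = begin & ~(page - 1);
    const std::size_t length = (begin - aligned) + region.size;
    const int err =
        posix_madvise(reinterpret_cast<void*>(aligned), length, advice);
    // EBADF is ignored, as in the existing WILLNEED code path.
    if (err != 0 && err != EBADF) return err;
  }
  return 0;
}
```

With a helper of this shape, `MemoryAdviseWillNeed` and a random-access counterpart would each reduce to a single call with a different advice constant.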