Wes McKinney created ARROW-9633:
-----------------------------------
Summary: [C++] Do not toggle memory mapping globally in LocalFileSystem
Key: ARROW-9633
URL: https://issues.apache.org/jira/browse/ARROW-9633
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Wes McKinney
Fix For: 2.0.0
In the context of the Datasets API, some file formats (such as Arrow IPC files)
benefit greatly from memory mapping, while others benefit less. Additionally,
in some scenarios memory mapping can fail when used on network-attached storage
devices. A single filesystem may be used to read different kinds of files, some
with memory mapping and some without, and the Datasets API should be able to
fall back on non-memory-mapped reads if an attempt to memory map fails. It
would therefore make sense to have a non-global option for this:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/filesystem/localfs.h
I would suggest adding a new filesystem API, something like
{{OpenMappedInputFile}}, with options to control the behavior when memory
mapping is not possible. These options could include:
* Falling back on a normal RandomAccessFile
* Reading the entire file into memory (or even tmpfs?) and then wrapping it in
a BufferReader
* Failing
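
As a rough illustration of how such options might be dispatched, here is a
minimal, self-contained sketch. It does not use Arrow, and every name in it
other than {{OpenMappedInputFile}} ({{MmapFallback}}, {{OpenedAs}},
{{MappedOpenOptions}}, the {{try_mmap}} callback) is hypothetical. The mapping
attempt is passed in as a callback so the policy can be exercised without a
real file, and an exception stands in for the Status/Result style error
reporting that Arrow C++ would normally use.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical names throughout; none of these are part of the Arrow API.

// What to do when the memory-mapping attempt does not succeed.
enum class MmapFallback {
  kFail,              // report an error to the caller
  kRandomAccessFile,  // open the file as an ordinary random-access file
  kReadIntoMemory     // read the whole file and wrap it in a buffer reader
};

// Which kind of reader the caller ended up with.
enum class OpenedAs { kMemoryMapped, kRandomAccessFile, kBufferReader };

// Per-call options, so nothing is toggled globally on the filesystem.
struct MappedOpenOptions {
  MmapFallback fallback = MmapFallback::kRandomAccessFile;
};

// `try_mmap` stands in for the real mapping attempt; it returns false when
// mapping is not possible (for example on some network-attached storage).
OpenedAs OpenMappedInputFile(
    const std::string& path, const MappedOpenOptions& options,
    const std::function<bool(const std::string&)>& try_mmap) {
  assert(try_mmap && "a mapping attempt must be supplied");
  if (try_mmap(path)) {
    return OpenedAs::kMemoryMapped;
  }
  switch (options.fallback) {
    case MmapFallback::kRandomAccessFile:
      return OpenedAs::kRandomAccessFile;
    case MmapFallback::kReadIntoMemory:
      return OpenedAs::kBufferReader;
    case MmapFallback::kFail:
      break;
  }
  throw std::runtime_error("memory mapping failed for " + path);
}
```

With something of this shape, a caller such as the Datasets API could choose a
fallback per file format instead of flipping a global flag on the filesystem.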