Wes McKinney created ARROW-9633:
-----------------------------------

             Summary: [C++] Do not toggle memory mapping globally in LocalFileSystem
                 Key: ARROW-9633
                 URL: https://issues.apache.org/jira/browse/ARROW-9633
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Wes McKinney
             Fix For: 2.0.0


In the context of the Datasets API, some file formats benefit greatly from 
memory mapping (like Arrow IPC files) while others benefit less. Additionally, 
in some scenarios, memory mapping can fail when used on network-attached 
storage devices. A single filesystem may be used to read different kinds of 
files, some with memory mapping and some without, and the Datasets API should 
be able to fall back on non-memory-mapped reads if the attempt to memory map 
fails. It would therefore make sense to have a non-global option for this:

https://github.com/apache/arrow/blob/master/cpp/src/arrow/filesystem/localfs.h

I would suggest adding a new filesystem API, something like 
{{OpenMappedInputFile}}, with options to control the behavior when memory 
mapping is not possible. These options might include:

* Falling back on a normal RandomAccessFile
* Reading the entire file into memory (or even tmpfs?) and then wrapping it in 
a BufferReader
* Failing
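
To make the proposal concrete, here is a rough, self-contained sketch of how 
a per-call fallback option might look. Everything in it is hypothetical: the 
{{MmapFallback}} and {{OpenedAs}} enums, the {{OpenResult}} struct, and this 
signature for {{OpenMappedInputFile}} do not exist in Arrow. The {{try_mmap}} 
callback stands in for the real mapping attempt, and an exception stands in 
for the Status/Result error reporting that Arrow would actually use.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical: what to do when memory mapping is not possible.
enum class MmapFallback { kRandomAccessFile, kReadIntoMemory, kFail };

// Hypothetical: which kind of reader the caller ended up with.
enum class OpenedAs { kMemoryMapped, kRandomAccessFile, kInMemoryBuffer };

struct OpenResult {
  OpenedAs kind;
  std::string path;
};

// Sketch of a per-call (non-global) open. `try_mmap` is a placeholder for
// the real mapping attempt and returns false when mapping fails, e.g. on
// some network-attached storage.
OpenResult OpenMappedInputFile(
    const std::string& path, MmapFallback fallback,
    const std::function<bool(const std::string&)>& try_mmap) {
  if (try_mmap(path)) {
    return {OpenedAs::kMemoryMapped, path};
  }
  switch (fallback) {
    case MmapFallback::kRandomAccessFile:
      // Would open a normal RandomAccessFile here.
      return {OpenedAs::kRandomAccessFile, path};
    case MmapFallback::kReadIntoMemory:
      // Would read the whole file and wrap it in a BufferReader here.
      return {OpenedAs::kInMemoryBuffer, path};
    case MmapFallback::kFail:
      break;
  }
  throw std::runtime_error("memory mapping failed for " + path);
}
```

Because the option is passed per call, the same filesystem instance could map 
Arrow IPC files while reading other formats through the normal path.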


