Hi Simba, is it possible the file has zero length? Memory mapping an empty file fails with errno 22 (EINVAL), which I can reproduce like this:

$ touch foo
$ ipython

In [1]: import pyarrow

In [2]: pyarrow.memory_map('foo')
---------------------------------------------------------------------------
ArrowIOError                              Traceback (most recent call last)
<ipython-input-2-1111f1c5d786> in <module>()
----> 1 pyarrow.memory_map('foo')

/home/wesm/code/arrow/python/pyarrow/io.pxi in pyarrow.lib.memory_map (/home/wesm/code/arrow/python/build/temp.linux-x86_64-3.5/lib.cxx:55830)()

/home/wesm/code/arrow/python/pyarrow/io.pxi in pyarrow.lib.MemoryMappedFile._open (/home/wesm/code/arrow/python/build/temp.linux-x86_64-3.5/lib.cxx:55609)()

/home/wesm/code/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status (/home/wesm/code/arrow/python/build/temp.linux-x86_64-3.5/lib.cxx:8379)()

ArrowIOError: /home/wesm/code/arrow/cpp/src/arrow/io/file.cc:690 code: result->memory_map_->Open(path, mode)
Memory mapping file failed, errno: 22

In [3]: import pyarrow.parquet as pq

In [4]: pq.read_table('foo')
<SNIP>
ArrowIOError: /home/wesm/code/arrow/cpp/src/arrow/io/file.cc:690 code: result->memory_map_->Open(path, mode)
Memory mapping file failed, errno: 22

That's admittedly not the best error message, opening a JIRA to improve that:

https://issues.apache.org/jira/browse/ARROW-2118

- Wes

On Thu, Feb 8, 2018 at 4:20 PM, simba nyatsanga <simnyatsa...@gmail.com> wrote:
> Hi Everyone,
>
> I've encountered a memory mapping error when attempting to read a parquet
> file to a Pandas DataFrame. It seems to be happening intermittently though,
> I've so far encountered it once. In my case the pq.read_table code is being
> invoked in a Linux docker container. I had a look at the docs for the
> PyArrow memory and IO management here:
> https://arrow.apache.org/docs/python/memory.html
>
> What could give rise to the stacktrace below?
>
> File "read_file.py", line 173, in load_chunked_data
>     return pq.read_table(data_obj_path, columns=columns).to_pandas()
> File "/opt/anaconda-python-5.0.1/lib/python2.7/site-packages/pyarrow/parquet.py", line 890, in read_table
>     pf = ParquetFile(source, metadata=metadata)
> File "/opt/anaconda-python-5.0.1/lib/python2.7/site-packages/pyarrow/parquet.py", line 56, in __init__
>     self.reader.open(source, metadata=metadata)
> File "pyarrow/_parquet.pyx", line 624, in pyarrow._parquet.ParquetReader.open (/arrow/python/build/temp.linux-x86_64-2.7/_parquet.cxx:11558)
>     get_reader(source, &rd_handle)
> File "pyarrow/io.pxi", line 798, in pyarrow.lib.get_reader (/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:58504)
>     source = memory_map(source, mode='r')
> File "pyarrow/io.pxi", line 473, in pyarrow.lib.memory_map (/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:54834)
>     mmap._open(path, mode)
> File "pyarrow/io.pxi", line 452, in pyarrow.lib.MemoryMappedFile._open (/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:54613)
>     check_status(CMemoryMappedFile.Open(c_path, c_mode, &handle))
> File "pyarrow/error.pxi", line 79, in pyarrow.lib.check_status (/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:8345)
>     raise ArrowIOError(message)
> ArrowIOError: Memory mapping file failed, errno: 22
>
> Thanks for the help.
>
> Kind Regards
> Simba
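
If a zero-length file does turn out to be the cause, one possible workaround on the calling side is to check the file size before handing the path to pq.read_table. The following is only a sketch under that assumption: the helper names ensure_nonempty_file and read_parquet_checked are hypothetical and not part of pyarrow, and the check does nothing for failures that have a different cause.

```python
import os


def ensure_nonempty_file(path):
    """Return the file size in bytes, or raise ValueError for a 0-byte file.

    Hypothetical helper, not part of pyarrow.
    """
    size = os.path.getsize(path)  # raises OSError if the path does not exist
    if size == 0:
        raise ValueError("%s is empty (0 bytes) and cannot be memory mapped" % path)
    return size


def read_parquet_checked(path, columns=None):
    """Hypothetical wrapper: run the size check, then defer to pyarrow."""
    # Imported lazily so the size check above is usable without pyarrow.
    import pyarrow.parquet as pq

    ensure_nonempty_file(path)
    return pq.read_table(path, columns=columns).to_pandas()
```

With this in place, the call in load_chunked_data could become read_parquet_checked(data_obj_path, columns=columns), and an empty file would then be reported as a ValueError naming the path rather than as errno 22.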