[ 
https://issues.apache.org/jira/browse/ARROW-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670225#comment-16670225
 ] 

Antoine Pitrou commented on ARROW-3662:
---------------------------------------

> We do however call seek on the fd in some cases, (io-util.cc:372  => 
> io-util.cc:218) and I'm not sure about the reason here

The problem here is: if you get st_size = 0, does it mean the file is empty, or 
does it mean the file doesn't have a well-defined size? By trying to call 
tell(), we disambiguate between those two cases. Example in Python:
{code:python}
>>> import os, socket
>>> r, w = os.pipe()
>>> os.fstat(r)
os.stat_result(st_mode=4480, st_ino=261470, st_dev=12, st_nlink=1, st_uid=1000, 
st_gid=1000, st_size=0, st_atime=1540998512, st_mtime=1540998512, 
st_ctime=1540998512)
>>> os.lseek(r, 0, os.SEEK_CUR)
Traceback (most recent call last):
  File "<ipython-input-9-462a3a2bd3e7>", line 1, in <module>
    os.lseek(r, 0, os.SEEK_CUR)
OSError: [Errno 29] Illegal seek
{code}
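
The same check can be wrapped in a small helper. This is only a sketch of the idea in Python, not the actual code in io-util.cc; the name fd_size is made up for illustration, and it assumes a POSIX platform where lseek() on a pipe fails with an OSError:

```python
import os
import tempfile


def fd_size(fd):
    """Return the size in bytes of the file behind fd, or None if the
    descriptor has no well-defined size (e.g. a pipe).

    Hypothetical helper for illustration only.
    """
    st = os.fstat(fd)
    if st.st_size == 0:
        # st_size == 0 is ambiguous: probe with a no-op seek ("tell").
        try:
            os.lseek(fd, 0, os.SEEK_CUR)
        except OSError:
            # Not seekable, so the size is not well-defined.
            return None
    return st.st_size


if __name__ == "__main__":
    r, w = os.pipe()
    print(fd_size(r))  # pipe: not seekable
    os.close(r)
    os.close(w)

    with tempfile.TemporaryFile() as f:
        print(fd_size(f.fileno()))  # regular file that is really empty
```

Note that the probe moves nothing (offset 0 relative to the current position), but it is still a call on the fd, which is why such a GetSize cannot trivially be const in general.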


> [C++] Add a const overload to MemoryMappedFile::GetSize
> -------------------------------------------------------
>
>                 Key: ARROW-3662
>                 URL: https://issues.apache.org/jira/browse/ARROW-3662
>             Project: Apache Arrow
>          Issue Type: New Feature
>    Affects Versions: 0.11.1
>            Reporter: Dimitri Vorona
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
>  
> While GetSize in general is not a const function, it can be on a 
> MemoryMappedFile. I propose to add a const overload directly to 
> MemoryMappedFile.
> Alternatively, we could add a const version at the RandomAccessFile level, 
> which would fail if getting the size in a const way (e.g. without a seek) 
> isn't possible, but that seems to me a potential source of hard-to-debug 
> bugs and spurious failures. It would at least require a careful analysis of 
> the platform support for the different ways of getting the size.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
