[ https://issues.apache.org/jira/browse/ARROW-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933223#comment-16933223 ]
Antoine Pitrou commented on ARROW-6618:
---------------------------------------

This deserves fixing indeed.

> [Python] Reading a zero-size buffer can segfault
> ------------------------------------------------
>
>                 Key: ARROW-6618
>                 URL: https://issues.apache.org/jira/browse/ARROW-6618
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Joris Van den Bossche
>            Priority: Major
>             Fix For: 1.0.0
>
>
> The simplest code to reproduce this is:
> {code}
> pa.read_message(b'')
> {code}
> which gives a segfault.
> You can easily run into this interactively, e.g. when accidentally passing an
> already-read buffer to it, like:
> {code}
> serialized = pa.schema([('a', pa.int64())]).serialize().to_pybytes()
> buffer = pa.BufferReader(serialized)
> pa.read_message(buffer)
> pa.read_message(buffer)
> {code}
> For comparison, {{read_schema}} raises an error on the second call, when the
> buffer is empty:
> {code}
> >>> pa.read_schema(buffer)
> >>> pa.read_schema(buffer)
> ...
> ArrowInvalid: Tried reading schema message, was null or length 0
> {code}
> I know this is not proper usage of Buffer(Reader), but since it is easy to do
> this accidentally, I think we should try to protect users from it.
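
Until this is fixed in the library, a caller-side guard is one possible workaround. The following is only a minimal sketch: the helper name `read_if_nonempty` is hypothetical, `io.BytesIO` stands in for `pa.BufferReader`, and it assumes the source is either a bytes object or a file-like object exposing `tell()` and `seek()`. The actual reader (for instance `pa.read_message`) is passed in as a callable, so the sketch does not depend on pyarrow itself.

```python
import io


def read_if_nonempty(source, reader):
    """Call reader(source) only if source still has unread bytes.

    Hypothetical helper, not part of pyarrow. `reader` is whatever
    parsing function the caller wants to protect.
    """
    if isinstance(source, (bytes, bytearray)):
        # Plain bytes: an empty object is exactly the zero-size case.
        if len(source) == 0:
            raise ValueError("cannot read a message from an empty buffer")
        return reader(source)

    # File-like source: compare the current position with the end position,
    # then restore the position so the reader sees the stream unchanged.
    pos = source.tell()
    source.seek(0, io.SEEK_END)
    end = source.tell()
    source.seek(pos)
    if end == pos:
        raise ValueError("cannot read a message from an exhausted buffer")
    return reader(source)


if __name__ == "__main__":
    stream = io.BytesIO(b"payload")           # stand-in for pa.BufferReader
    consume = lambda s: s.read()              # stand-in for pa.read_message
    read_if_nonempty(stream, consume)         # first read succeeds
    try:
        read_if_nonempty(stream, consume)     # second read hits the guard
    except ValueError as exc:
        print("guarded:", exc)
```

The guard turns the accidental second read into an ordinary Python exception, which mirrors what {{read_schema}} already does in the report.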