[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365698#comment-16365698 ] ASF GitHub Bot commented on ARROW-2155: --- wesm closed pull request #1608: ARROW-2155: [Python] frombuffer() should respect mutability of argument URL: https://github.com/apache/arrow/pull/1608 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/cpp/src/arrow/python/common.cc b/cpp/src/arrow/python/common.cc index 14a8ae6fd..1ded88071 100644 --- a/cpp/src/arrow/python/common.cc +++ b/cpp/src/arrow/python/common.cc @@ -23,6 +23,7 @@ #include "arrow/memory_pool.h" #include "arrow/status.h" +#include "arrow/util/logging.h" namespace arrow { namespace py { @@ -47,22 +48,39 @@ MemoryPool* get_memory_pool() { // -- // PyBuffer -PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0), obj_(nullptr) { - if (PyObject_CheckBuffer(obj)) { -obj_ = PyMemoryView_FromObject(obj); -Py_buffer* buffer = PyMemoryView_GET_BUFFER(obj_); -data_ = reinterpret_cast(buffer->buf); -size_ = buffer->len; -capacity_ = buffer->len; -is_mutable_ = false; +PyBuffer::PyBuffer() : Buffer(nullptr, 0) {} + +Status PyBuffer::Init(PyObject* obj) { + if (!PyObject_GetBuffer(obj, &py_buf_, PyBUF_ANY_CONTIGUOUS)) { +data_ = reinterpret_cast(py_buf_.buf); +DCHECK(data_ != nullptr); +size_ = py_buf_.len; +capacity_ = py_buf_.len; +is_mutable_ = !py_buf_.readonly; +return Status::OK(); + } else { +return Status(StatusCode::PythonError, ""); } } +Status PyBuffer::FromPyObject(PyObject* obj, std::shared_ptr* out) { + PyBuffer* buf = new PyBuffer(); + std::shared_ptr res(buf); + RETURN_NOT_OK(buf->Init(obj)); + *out = res; + return Status::OK(); +} + PyBuffer::~PyBuffer() { - PyAcquireGIL lock; - Py_XDECREF(obj_); + if (data_ != nullptr) { +PyAcquireGIL lock; +PyBuffer_Release(&py_buf_); + } } +// -- +// Python exception -> Status + Status CheckPyError(StatusCode code) { if (PyErr_Occurred()) { PyObject* exc_type = nullptr; diff --git a/cpp/src/arrow/python/common.h b/cpp/src/arrow/python/common.h index b1e0888af..269385c1a 100644 --- a/cpp/src/arrow/python/common.h +++ b/cpp/src/arrow/python/common.h @@ -18,6 +18,7 @@ #ifndef ARROW_PYTHON_COMMON_H #define ARROW_PYTHON_COMMON_H +#include #include #include "arrow/python/config.h" @@ -140,15 +141,17 @@ ARROW_EXPORT MemoryPool* get_memory_pool(); class ARROW_EXPORT PyBuffer : public Buffer { public: - /// Note that the GIL must be held when calling the PyBuffer constructor. - /// - /// While memoryview objects support multi-demensional buffers, PyBuffer only supports + /// While memoryview objects support multi-dimensional buffers, PyBuffer only supports /// one-dimensional byte buffers. - explicit PyBuffer(PyObject* obj); ~PyBuffer(); + static Status FromPyObject(PyObject* obj, std::shared_ptr* out); + private: - PyObject* obj_; + PyBuffer(); + Status Init(PyObject*); + + Py_buffer py_buf_; }; } // namespace py diff --git a/cpp/src/arrow/python/io.cc b/cpp/src/arrow/python/io.cc index 2cff04608..801a32574 100644 --- a/cpp/src/arrow/python/io.cc +++ b/cpp/src/arrow/python/io.cc @@ -149,14 +149,11 @@ Status PyReadableFile::Read(int64_t nbytes, int64_t* bytes_read, void* out) { Status PyReadableFile::Read(int64_t nbytes, std::shared_ptr* out) { PyAcquireGIL lock; - PyObject* bytes_obj = NULL; - ARROW_RETURN_NOT_OK(file_->Read(nbytes, &bytes_obj)); - DCHECK(bytes_obj != NULL); - - *out = std::make_shared(bytes_obj); - Py_XDECREF(bytes_obj); + OwnedRef bytes_obj; + ARROW_RETURN_NOT_OK(file_->Read(nbytes, bytes_obj.ref())); + DCHECK(bytes_obj.obj() != NULL); - return Status::OK(); + return PyBuffer::FromPyObject(bytes_obj.obj(), out); } Status PyReadableFile::ReadAt(int64_t position, int64_t nbytes, int64_t* bytes_read, @@ -219,13 +216,5 @@ Status PyOutputStream::Write(const void* data, int64_t nbytes) { return file_->Write(data, nbytes); } -// -- -// A readable file that is backed by a PyBuffer - -PyBytesReader::PyBytesReader(PyObject* obj) -: io::BufferReader(std::make_shared(obj)) {} - -PyBytesReader::~PyBytesReader() {} - } // namespace py } // namespace arrow diff --git a/cpp/src/arrow/python/io.h b/cpp/src/arrow/python/io.h index 0632d28fa..648f6de06 100644 --- a/cpp/src/arrow/python/io.h +++ b/cpp/src/arrow/python/io.h @@ -79,13 +79,6 @@ class ARROW_EXPORT PyOutputStream : public
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365696#comment-16365696 ] ASF GitHub Bot commented on ARROW-2155: --- wesm commented on issue #1608: ARROW-2155: [Python] frombuffer() should respect mutability of argument URL: https://github.com/apache/arrow/pull/1608#issuecomment-365955604 +1, thanks @pitrou! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Python] pa.frombuffer(bytearray) returns immutable Buffer > -- > > Key: ARROW-2155 > URL: https://issues.apache.org/jira/browse/ARROW-2155 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.8.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 0.9.0 > > > I'd expect it to return a mutable buffer: > {code:python} > >>> pa.frombuffer(bytearray(10)).is_mutable > False > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365695#comment-16365695 ] ASF GitHub Bot commented on ARROW-2155: --- wesm commented on a change in pull request #1608: ARROW-2155: [Python] frombuffer() should respect mutability of argument URL: https://github.com/apache/arrow/pull/1608#discussion_r168503098 ## File path: cpp/src/arrow/python/common.cc ## @@ -47,22 +48,37 @@ MemoryPool* get_memory_pool() { // -- // PyBuffer -PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0), obj_(nullptr) { - if (PyObject_CheckBuffer(obj)) { -obj_ = PyMemoryView_FromObject(obj); -Py_buffer* buffer = PyMemoryView_GET_BUFFER(obj_); -data_ = reinterpret_cast(buffer->buf); -size_ = buffer->len; -capacity_ = buffer->len; -is_mutable_ = false; +PyBuffer::PyBuffer() : Buffer(nullptr, 0) {} + +PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0) { + // Cannot propagate error upwards + (void)Init(obj); Review comment: SGTM This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Python] pa.frombuffer(bytearray) returns immutable Buffer > -- > > Key: ARROW-2155 > URL: https://issues.apache.org/jira/browse/ARROW-2155 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.8.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 0.9.0 > > > I'd expect it to return a mutable buffer: > {code:python} > >>> pa.frombuffer(bytearray(10)).is_mutable > False > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365516#comment-16365516 ] ASF GitHub Bot commented on ARROW-2155: --- pitrou commented on issue #1608: ARROW-2155: [Python] frombuffer() should respect mutability of argument URL: https://github.com/apache/arrow/pull/1608#issuecomment-365926963 AppVeyor build at https://ci.appveyor.com/project/pitrou/arrow/build/1.0.86 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Python] pa.frombuffer(bytearray) returns immutable Buffer > -- > > Key: ARROW-2155 > URL: https://issues.apache.org/jira/browse/ARROW-2155 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.8.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 0.9.0 > > > I'd expect it to return a mutable buffer: > {code:python} > >>> pa.frombuffer(bytearray(10)).is_mutable > False > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365430#comment-16365430 ] ASF GitHub Bot commented on ARROW-2155: --- pitrou closed pull request #1607: ARROW-2155: [Python] frombuffer() should respect mutability of argument URL: https://github.com/apache/arrow/pull/1607 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/cpp/src/arrow/python/common.cc b/cpp/src/arrow/python/common.cc index 14a8ae6fd..2699d51ba 100644 --- a/cpp/src/arrow/python/common.cc +++ b/cpp/src/arrow/python/common.cc @@ -23,6 +23,7 @@ #include "arrow/memory_pool.h" #include "arrow/status.h" +#include "arrow/util/logging.h" namespace arrow { namespace py { @@ -47,20 +48,25 @@ MemoryPool* get_memory_pool() { // -- // PyBuffer -PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0), obj_(nullptr) { - if (PyObject_CheckBuffer(obj)) { -obj_ = PyMemoryView_FromObject(obj); -Py_buffer* buffer = PyMemoryView_GET_BUFFER(obj_); -data_ = reinterpret_cast(buffer->buf); -size_ = buffer->len; -capacity_ = buffer->len; -is_mutable_ = false; +PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0) { + if (!PyObject_GetBuffer(obj, &py_buf_, PyBUF_ANY_CONTIGUOUS)) { +data_ = reinterpret_cast(py_buf_.buf); +DCHECK(data_ != nullptr); +size_ = py_buf_.len; +capacity_ = py_buf_.len; +is_mutable_ = !py_buf_.readonly; + } else { +// Cannot propagate error directly, caller should check the data_ +// field for null +PyErr_Clear(); } } PyBuffer::~PyBuffer() { - PyAcquireGIL lock; - Py_XDECREF(obj_); + if (data_ != nullptr) { +PyAcquireGIL lock; +PyBuffer_Release(&py_buf_); + } } Status CheckPyError(StatusCode code) { diff --git a/cpp/src/arrow/python/common.h b/cpp/src/arrow/python/common.h index b1e0888af..59d2c4808 100644 --- a/cpp/src/arrow/python/common.h +++ b/cpp/src/arrow/python/common.h @@ -142,13 +142,13 @@ class ARROW_EXPORT PyBuffer : public Buffer { public: /// Note that the GIL must be held when calling the PyBuffer constructor. /// - /// While memoryview objects support multi-demensional buffers, PyBuffer only supports + /// While memoryview objects support multi-dimensional buffers, PyBuffer only supports /// one-dimensional byte buffers. explicit PyBuffer(PyObject* obj); ~PyBuffer(); private: - PyObject* obj_; + Py_buffer py_buf_; }; } // namespace py diff --git a/python/pyarrow/io.pxi b/python/pyarrow/io.pxi index bd508cf57..2d9c7f413 100644 --- a/python/pyarrow/io.pxi +++ b/python/pyarrow/io.pxi @@ -802,12 +802,12 @@ def frombuffer(object obj): Construct an Arrow buffer from a Python bytes object """ cdef shared_ptr[CBuffer] buf -try: -memoryview(obj) -buf.reset(new PyBuffer(obj)) -return pyarrow_wrap_buffer(buf) -except TypeError: +buf.reset(new PyBuffer(obj)) +if buf.get().data() == NULL: +# XXX should be TypeError? Can we find a way to propagate the +# original error instead? raise ValueError('Must pass object that implements buffer protocol') +return pyarrow_wrap_buffer(buf) cdef get_reader(object source, shared_ptr[RandomAccessFile]* reader): diff --git a/python/pyarrow/tests/test_io.py b/python/pyarrow/tests/test_io.py index da26b101d..fd3744c18 100644 --- a/python/pyarrow/tests/test_io.py +++ b/python/pyarrow/tests/test_io.py @@ -132,6 +132,7 @@ def test_buffer_bytes(): buf = pa.frombuffer(val) assert isinstance(buf, pa.Buffer) +assert not buf.is_mutable result = buf.to_pybytes() @@ -143,6 +144,7 @@ def test_buffer_memoryview(): buf = pa.frombuffer(val) assert isinstance(buf, pa.Buffer) +assert not buf.is_mutable result = memoryview(buf) @@ -154,13 +156,19 @@ def test_buffer_bytearray(): buf = pa.frombuffer(val) assert isinstance(buf, pa.Buffer) +assert buf.is_mutable result = bytearray(buf) assert result == val -def test_buffer_numpy(): +def test_buffer_invalid(): +with pytest.raises(ValueError, match="buffer protocol"): +pa.frombuffer(None) + + +def test_buffer_to_numpy(): # Make sure creating a numpy array from an arrow buffer works byte_array = bytearray(20) byte_array[0] = 42 @@ -170,6 +178,19 @@ def test_buffer_numpy(): assert array.base == buf +def test_buffer_from_numpy(): +# C-contiguous +arr = np.arange(12, dtype=np.int8).reshape((3, 4)) +buf = pa.frombuffer(arr) +assert buf.to_pybytes() == arr.tobytes() +# F-contiguous; note strides informa
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365413#comment-16365413 ] ASF GitHub Bot commented on ARROW-2155: --- pitrou commented on a change in pull request #1608: ARROW-2155: [Python] frombuffer() should respect mutability of argument (alternative solution) URL: https://github.com/apache/arrow/pull/1608#discussion_r168447869 ## File path: cpp/src/arrow/python/common.cc ## @@ -47,22 +48,37 @@ MemoryPool* get_memory_pool() { // -- // PyBuffer -PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0), obj_(nullptr) { - if (PyObject_CheckBuffer(obj)) { -obj_ = PyMemoryView_FromObject(obj); -Py_buffer* buffer = PyMemoryView_GET_BUFFER(obj_); -data_ = reinterpret_cast(buffer->buf); -size_ = buffer->len; -capacity_ = buffer->len; -is_mutable_ = false; +PyBuffer::PyBuffer() : Buffer(nullptr, 0) {} + +PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0) { + // Cannot propagate error upwards + (void)Init(obj); Review comment: Ok, done. As a side effect, this removes the `PyBytesReader` class which is not needed anymore (you can simply construct a `BufferReader` from the given `PyBuffer` pointer). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Python] pa.frombuffer(bytearray) returns immutable Buffer > -- > > Key: ARROW-2155 > URL: https://issues.apache.org/jira/browse/ARROW-2155 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.8.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 0.9.0 > > > I'd expect it to return a mutable buffer: > {code:python} > >>> pa.frombuffer(bytearray(10)).is_mutable > False > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364915#comment-16364915 ] ASF GitHub Bot commented on ARROW-2155: --- wesm commented on a change in pull request #1608: ARROW-2155: [Python] frombuffer() should respect mutability of argument (alternative solution) URL: https://github.com/apache/arrow/pull/1608#discussion_r168339620 ## File path: cpp/src/arrow/python/common.cc ## @@ -47,22 +48,37 @@ MemoryPool* get_memory_pool() { // -- // PyBuffer -PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0), obj_(nullptr) { - if (PyObject_CheckBuffer(obj)) { -obj_ = PyMemoryView_FromObject(obj); -Py_buffer* buffer = PyMemoryView_GET_BUFFER(obj_); -data_ = reinterpret_cast(buffer->buf); -size_ = buffer->len; -capacity_ = buffer->len; -is_mutable_ = false; +PyBuffer::PyBuffer() : Buffer(nullptr, 0) {} + +PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0) { + // Cannot propagate error upwards + (void)Init(obj); Review comment: I see. Then `PyBytesReader` will need to change, too (since making a `PyBuffer` can fail). Let me know if I can help This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Python] pa.frombuffer(bytearray) returns immutable Buffer > -- > > Key: ARROW-2155 > URL: https://issues.apache.org/jira/browse/ARROW-2155 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.8.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 0.9.0 > > > I'd expect it to return a mutable buffer: > {code:python} > >>> pa.frombuffer(bytearray(10)).is_mutable > False > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364883#comment-16364883 ] ASF GitHub Bot commented on ARROW-2155: --- pitrou commented on a change in pull request #1608: ARROW-2155: [Python] frombuffer() should respect mutability of argument (alternative solution) URL: https://github.com/apache/arrow/pull/1608#discussion_r168334217 ## File path: cpp/src/arrow/python/common.cc ## @@ -47,22 +48,37 @@ MemoryPool* get_memory_pool() { // -- // PyBuffer -PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0), obj_(nullptr) { - if (PyObject_CheckBuffer(obj)) { -obj_ = PyMemoryView_FromObject(obj); -Py_buffer* buffer = PyMemoryView_GET_BUFFER(obj_); -data_ = reinterpret_cast(buffer->buf); -size_ = buffer->len; -capacity_ = buffer->len; -is_mutable_ = false; +PyBuffer::PyBuffer() : Buffer(nullptr, 0) {} + +PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0) { + // Cannot propagate error upwards + (void)Init(obj); Review comment: From what I can see the explicit constructor is required by the PyBytesReader constructor implementation: ``` PyBytesReader::PyBytesReader(PyObject* obj) : io::BufferReader(std::make_shared(obj)) {} ``` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Python] pa.frombuffer(bytearray) returns immutable Buffer > -- > > Key: ARROW-2155 > URL: https://issues.apache.org/jira/browse/ARROW-2155 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.8.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 0.9.0 > > > I'd expect it to return a mutable buffer: > {code:python} > >>> pa.frombuffer(bytearray(10)).is_mutable > False > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364750#comment-16364750 ] ASF GitHub Bot commented on ARROW-2155: --- wesm commented on a change in pull request #1608: ARROW-2155: [Python] frombuffer() should respect mutability of argument (alternative solution) URL: https://github.com/apache/arrow/pull/1608#discussion_r168303091 ## File path: cpp/src/arrow/python/common.cc ## @@ -47,22 +48,37 @@ MemoryPool* get_memory_pool() { // -- // PyBuffer -PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0), obj_(nullptr) { - if (PyObject_CheckBuffer(obj)) { -obj_ = PyMemoryView_FromObject(obj); -Py_buffer* buffer = PyMemoryView_GET_BUFFER(obj_); -data_ = reinterpret_cast(buffer->buf); -size_ = buffer->len; -capacity_ = buffer->len; -is_mutable_ = false; +PyBuffer::PyBuffer() : Buffer(nullptr, 0) {} + +PyBuffer::PyBuffer(PyObject* obj) : Buffer(nullptr, 0) { + // Cannot propagate error upwards + (void)Init(obj); Review comment: This suggests we should make the ctors private and add a static alternative constructor `Status PyBuffer::FromPyObject(PyObject* obj, std::shared_ptr* out)` or something. This can be declared as `@staticmethod` in Cython This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Python] pa.frombuffer(bytearray) returns immutable Buffer > -- > > Key: ARROW-2155 > URL: https://issues.apache.org/jira/browse/ARROW-2155 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.8.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 0.9.0 > > > I'd expect it to return a mutable buffer: > {code:python} > >>> pa.frombuffer(bytearray(10)).is_mutable > False > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364417#comment-16364417 ] ASF GitHub Bot commented on ARROW-2155: --- pitrou commented on a change in pull request #1607: ARROW-2155: [Python] frombuffer() should respect mutability of argument URL: https://github.com/apache/arrow/pull/1607#discussion_r168240705 ## File path: python/pyarrow/io.pxi ## @@ -802,12 +802,12 @@ def frombuffer(object obj): Construct an Arrow buffer from a Python bytes object """ cdef shared_ptr[CBuffer] buf -try: -memoryview(obj) -buf.reset(new PyBuffer(obj)) -return pyarrow_wrap_buffer(buf) -except TypeError: +buf.reset(new PyBuffer(obj)) +if buf.get().data() == NULL: +# XXX should be TypeError? Can we find a way to propagate the Review comment: I opened PR #1607 for a TypeError-based solution. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Python] pa.frombuffer(bytearray) returns immutable Buffer > -- > > Key: ARROW-2155 > URL: https://issues.apache.org/jira/browse/ARROW-2155 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.8.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 0.9.0 > > > I'd expect it to return a mutable buffer: > {code:python} > >>> pa.frombuffer(bytearray(10)).is_mutable > False > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364416#comment-16364416 ] ASF GitHub Bot commented on ARROW-2155: --- pitrou opened a new pull request #1608: ARROW-2155: [Python] frombuffer() should respect mutability of argument (alternative solution) URL: https://github.com/apache/arrow/pull/1608 Also raise TypeError instead of ValueError for invalid arguments to frombuffer() This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Python] pa.frombuffer(bytearray) returns immutable Buffer > -- > > Key: ARROW-2155 > URL: https://issues.apache.org/jira/browse/ARROW-2155 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.8.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 0.9.0 > > > I'd expect it to return a mutable buffer: > {code:python} > >>> pa.frombuffer(bytearray(10)).is_mutable > False > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364349#comment-16364349 ] ASF GitHub Bot commented on ARROW-2155: --- pitrou commented on a change in pull request #1607: ARROW-2155: [Python] frombuffer() should respect mutability of argument URL: https://github.com/apache/arrow/pull/1607#discussion_r168226882 ## File path: python/pyarrow/io.pxi ## @@ -802,12 +802,12 @@ def frombuffer(object obj): Construct an Arrow buffer from a Python bytes object """ cdef shared_ptr[CBuffer] buf -try: -memoryview(obj) -buf.reset(new PyBuffer(obj)) -return pyarrow_wrap_buffer(buf) -except TypeError: +buf.reset(new PyBuffer(obj)) +if buf.get().data() == NULL: +# XXX should be TypeError? Can we find a way to propagate the Review comment: @wesm What do you think about this? ValueError doesn't sound right here. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Python] pa.frombuffer(bytearray) returns immutable Buffer > -- > > Key: ARROW-2155 > URL: https://issues.apache.org/jira/browse/ARROW-2155 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.8.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 0.9.0 > > > I'd expect it to return a mutable buffer: > {code:python} > >>> pa.frombuffer(bytearray(10)).is_mutable > False > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364340#comment-16364340 ] ASF GitHub Bot commented on ARROW-2155: --- pitrou opened a new pull request #1607: ARROW-2155: [Python] frombuffer() should respect mutability of argument URL: https://github.com/apache/arrow/pull/1607 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [Python] pa.frombuffer(bytearray) returns immutable Buffer > -- > > Key: ARROW-2155 > URL: https://issues.apache.org/jira/browse/ARROW-2155 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.8.0 >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 0.9.0 > > > I'd expect it to return a mutable buffer: > {code:python} > >>> pa.frombuffer(bytearray(10)).is_mutable > False > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-2155) [Python] pa.frombuffer(bytearray) returns immutable Buffer
[ https://issues.apache.org/jira/browse/ARROW-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364211#comment-16364211 ] Wes McKinney commented on ARROW-2155: - Agreed. I would expect this to respect the {{PyBuf_WRITABLE}} flag > [Python] pa.frombuffer(bytearray) returns immutable Buffer > -- > > Key: ARROW-2155 > URL: https://issues.apache.org/jira/browse/ARROW-2155 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.8.0 >Reporter: Antoine Pitrou >Priority: Minor > Fix For: 0.9.0 > > > I'd expect it to return a mutable buffer: > {code:python} > >>> pa.frombuffer(bytearray(10)).is_mutable > False > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)