Hi,

I have a short program and I'm not sure whether the approach is sensible. Could 
anyone let me know if this is reasonable or not:

>>> import pyarrow as pa, third_party_library
>>> memory_views = third_party_library.get_strings()
>>> memory_views
[<memory at 0x7f1745cc0870>, <memory at 0x7f1745cc0940>, <memory at 0x7f1745cc0a10>, <memory at 0x7f1745cc0ae0>]
>>> pa.array(memory_views,pa.string())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/array.pxi", line 269, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 38, in pyarrow.lib._sequence_to_array
  File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: Expected a string or bytes object, got a 'memoryview' object
>>> pa.array(map(bytes,memory_views),pa.string())
<pyarrow.lib.StringArray object at 0x7f1745cbdd00>
[
  "this",
  "is",
  "a",
  "sample"
]

A third party library hands me a large list of byte sequences as memoryviews, 
and I'd like to create an Arrow StringArray from them as efficiently as 
possible. Mapping each one through the bytes constructor, and so copying it 
into a temporary object, seems wasteful (and memoryview.tobytes() appears to 
just call the bytes constructor, afaict).
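
A small check of the copying behaviour described above, using only the standard 
library. The bytearray and the variable names are made up for illustration; the 
point is only that both bytes(view) and view.tobytes() return independent 
copies rather than sharing memory with the view:

```python
# Illustrative stand-in for one buffer owned by the third party library.
source = bytearray(b"sample")
view = memoryview(source)

# Both calls allocate a new bytes object and copy the view's contents into it.
via_constructor = bytes(view)
via_tobytes = view.tobytes()

# Mutate the underlying buffer in place: the view sees the change,
# the two copies made earlier do not.
source[0] = ord("S")

print(bytes(view))       # reflects the mutation
print(via_constructor)   # still the original contents
print(via_tobytes)       # still the original contents
```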

I expected pa.array to be able to read from the memoryview objects directly 
when constructing the StringArray, but Arrow seems to want them copied into 
fresh bytes objects first. I don't understand why that is, and I'd like to know 
whether accepting memoryviews directly is a reasonable thing to ask for.
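
One possible workaround, sketched under assumptions rather than offered as the 
way pyarrow intends this to be done: a StringArray is backed by one contiguous 
data buffer plus a buffer of int32 offsets, so the memoryviews can be 
concatenated once into a shared buffer instead of being copied into one bytes 
object each. The helper names pack_memoryviews and to_string_array are 
hypothetical, and the sketch assumes a pyarrow version that provides 
pa.py_buffer and pa.StringArray.from_buffers, no null entries, and input that 
is already valid UTF-8.

```python
import struct
from itertools import accumulate


def pack_memoryviews(views):
    """Return (offsets, data) laid out like an Arrow string array.

    offsets holds len(views) + 1 little-endian int32 values; string i
    occupies data[offsets[i]:offsets[i + 1]].
    """
    views = list(views)
    # Running total of byte lengths, with a leading 0 for the first string.
    offsets = [0, *accumulate(view.nbytes for view in views)]
    # bytes.join accepts any bytes-like object, so every view is copied
    # exactly once, straight into the combined buffer.
    data = b"".join(views)
    # struct.pack raises struct.error if an offset does not fit in int32.
    packed_offsets = struct.pack(f"<{len(offsets)}i", *offsets)
    return packed_offsets, data


def to_string_array(views):
    """Wrap the packed buffers in a StringArray (assumes pyarrow is installed)."""
    import pyarrow as pa  # imported lazily so pack_memoryviews has no dependency

    views = list(views)
    packed_offsets, data = pack_memoryviews(views)
    # pa.py_buffer wraps the Python bytes objects without copying them again.
    return pa.StringArray.from_buffers(
        len(views), pa.py_buffer(packed_offsets), pa.py_buffer(data)
    )
```

This still copies every byte once, since Arrow needs the string data to be 
contiguous, but it avoids creating a temporary bytes object per element. As far 
as I know, from_buffers does not check that the data is valid UTF-8, so this is 
only appropriate when the source is trusted.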

Thanks in advance,
-Dan Nugent


