The one you just opened seems like a good first issue:

https://issues.apache.org/jira/browse/ARROW-8070

If you follow the instructions in
https://github.com/apache/arrow/blob/master/docs/source/developers/python.rst
and can't get things to build, please let us know the details so we can
help you.
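
For anyone following along, here is a minimal, standard-library-only sketch of the copy that the original question (quoted below) is concerned about. The bytearray buffers are hypothetical stand-ins for whatever third_party_library.get_strings() returns, and no pyarrow call is made; it only illustrates that a memoryview shares its underlying buffer while bytes(...) produces an independent copy.

```python
# Hypothetical stand-in for the third-party library: memoryviews over
# buffers that are owned by someone else.
backing = [bytearray(b"this"), bytearray(b"is"), bytearray(b"a"), bytearray(b"sample")]
memory_views = [memoryview(buf) for buf in backing]

# bytes(mv) copies the viewed data into a new, independent bytes object.
copied = bytes(memory_views[0])

# Mutate the underlying buffer in place (same length, so the export stays valid).
backing[0][0] = ord("T")

# The memoryview sees the change because it shares the buffer...
seen_through_view = memory_views[0].tobytes()

# ...while the earlier copy does not.
unchanged_copy = copied

# The workaround from the thread: materialize every view as bytes (one copy each).
as_bytes = [bytes(mv) for mv in memory_views]
```

The as_bytes list is what the workaround in the thread hands to pa.array(..., pa.string()), so each string is copied once before Arrow sees it.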

On Wed, Mar 11, 2020 at 6:06 PM Nugent, Daniel <[email protected]> wrote:
>
> Thanks for closing this out!
>
> Sorry I didn't get around to working on this before you ended up putting it 
> in. I had some difficulty getting the dev environment set up and limited time 
> to work on it.
>
> Is there a list of good first issues to take a crack at? I've really 
> appreciated the project overall and would like to help out in the time I can.
>
> -Dan Nugent
>
> -----Original Message-----
> From: Wes McKinney <[email protected]>
> Sent: Saturday, March 7, 2020 10:55 AM
> To: [email protected]
> Subject: [EXTERNAL] Re: Question about memoryviews and array construction
>
> There are a couple of places to start:
>
> * Add PyMemoryView type check to internal::IsPyBinary 
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/helpers.h#L80.
> I think this is all that's needed to take care of type inference.
> * Make sure PyMemoryView is handled in the PyBytesView helper in
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/common.h#L193
>
> On Sat, Mar 7, 2020 at 9:35 AM Daniel Nugent <[email protected]> wrote:
> >
> > Great!
> >
> > If you could provide a smidgen of guidance about where to start making this 
> > change, I would be happy to give it a shot.
> >
> > Thanks,
> >
> > -Dan Nugent
> > On Mar 7, 2020, 09:18 -0500, Wes McKinney <[email protected]>, wrote:
> >
> > hi Dan,
> >
> > Yes, we should support constructing StringArray directly from
> > memoryview as we do with bytes and unicode -- you're the first person
> > to ask about this so far. I opened
> > https://issues.apache.org/jira/browse/ARROW-8026. This should not be a
> > huge amount of work, so it would be a good first contribution to the
> > project.
> >
> > Thanks
> >
> > Wes
> >
> > On Fri, Mar 6, 2020 at 8:29 PM Nugent, Daniel <[email protected]> wrote:
> >
> > Hi,
> >
> > I have a short program and I’m wondering whether it is sensible. Could
> > anyone let me know if this is reasonable or not:
> >
> > >>> import pyarrow as pa, third_party_library
> > >>> memory_views = third_party_library.get_strings()
> > >>> memory_views
> > [<memory at 0x7f1745cc0870>, <memory at 0x7f1745cc0940>, <memory at 0x7f1745cc0a10>, <memory at 0x7f1745cc0ae0>]
> > >>> pa.array(memory_views, pa.string())
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "pyarrow/array.pxi", line 269, in pyarrow.lib.array
> >   File "pyarrow/array.pxi", line 38, in pyarrow.lib._sequence_to_array
> >   File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status
> > pyarrow.lib.ArrowTypeError: Expected a string or bytes object, got a 'memoryview' object
> > >>> pa.array(map(bytes, memory_views), pa.string())
> > <pyarrow.lib.StringArray object at 0x7f1745cbdd00>
> > [
> >   "this",
> >   "is",
> >   "a",
> >   "sample"
> > ]
> >
> > I have a big list of byte sequences being provided to me as memoryviews 
> > from a third party library. I’d like to create an Arrow StringArray from 
> > them as efficiently as possible. Having to map and consequently copy them 
> > through a bytes constructor seems not great (and the memoryview tobytes 
> > function appears to just call the bytes constructor, afaict).
> >
> >
> >
> > To me, it seemed like pa.array should be able to use the memoryview objects
> > directly to construct the StringArray, but it seems like Arrow
> > wants them copied into fresh bytes objects first. I don’t know if I
> > understand why, and was ultimately wondering if it’s a reasonable thing to
> > desire.
> >
> >
> >
> > Thanks in advance,
> >
> > -Dan Nugent
> >
> >
> >
> >
> > ######################################################################
> >
> > The information contained in this communication is confidential and
> >
> > may contain information that is privileged or exempt from disclosure
> >
> > under applicable law. If you are not a named addressee, please notify
> >
> > the sender immediately and delete this email from your system.
> >
> > If you have received this communication, and are not a named
> >
> > recipient, you are hereby notified that any dissemination,
> >
> > distribution or copying of this communication is strictly prohibited.
> >
> > ######################################################################
>
