Re: R/arrow update

Wes McKinney Wed, 21 Mar 2018 06:16:01 -0700

Cool. For JIRA, the only issue fields we really use are:

Project: This is always Apache Arrow
Issue Type: Select one (in this case it will mostly be New Feature or
Improvement)
Summary: Describe what the issue is. Preferably add "[R]" in the title
to help with inbox filters
Priority: Fine to leave as Major for all issues
Components: add "R" as a component
Fix version: The project version that this issue is planned to be
completed for. For example, the next major release is 0.10.0
Assignee: New issues can be left as Unassigned, or you can assign to yourself


The other fields can be left blank.
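
If it helps to see those fields as data, here is a rough sketch of them as a Python dict. The key names follow the JSON shape of JIRA's REST "create issue" call as I understand it. The project key "ARROW" and the helper name build_issue_fields are assumptions for illustration, not something stated above. Nothing is sent anywhere; the function only builds the payload.

```python
import json


def build_issue_fields(summary, issue_type="Improvement",
                       fix_version=None, assignee=None):
    """Build a create-issue payload using only the fields listed above.

    Hypothetical helper: the "ARROW" project key is an assumption.
    """
    # Prefix the summary with "[R]" so inbox filters can pick it up.
    if not summary.startswith("[R]"):
        summary = "[R] " + summary

    fields = {
        "project": {"key": "ARROW"},        # assumed key for Apache Arrow
        "issuetype": {"name": issue_type},  # mostly New Feature or Improvement
        "summary": summary,
        "priority": {"name": "Major"},      # fine to leave as Major
        "components": [{"name": "R"}],
    }
    # Fix version and assignee are optional; omit them when not given.
    if fix_version is not None:
        fields["fixVersions"] = [{"name": fix_version}]
    if assignee is not None:
        fields["assignee"] = {"name": assignee}
    return {"fields": fields}


if __name__ == "__main__":
    payload = build_issue_fields("Add schema() constructor",
                                 issue_type="New Feature",
                                 fix_version="0.10.0")
    print(json.dumps(payload, indent=2))
```

Leaving out the assignee key corresponds to leaving the issue Unassigned.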

On Wed, Mar 21, 2018 at 2:47 AM, Romain François <rom...@purrple.cat> wrote:
> That sounds good. I’ll make a pull request of what I have once I have 
> something useful in the readme.
>
> Things like build are not dealt with at the moment so it might be that this 
> only works on macOS or even (don’t think so) only on my 💻.
>
> As long as it’s clearly established that this is wip and that it might 
> entirely change, for sure let’s merge patches to master.
>
> JIRA is new to me, I usually work with github issues, so I’ll probably need 
> some guidance.
>
> Romain
>
>> On 20 March 2018 at 23:30, Wes McKinney <wesmck...@gmail.com> wrote:
>>
>> hi Romain,
>>
>> Cool! I would suggest that we proceed in one of two ways:
>>
>> * Start merging R patches to master (what I would prefer)
>> * Merge patches into an r-devel branch while the R bindings initiative
>> is in early stages
>>
>> I don't really see any benefits to hiding early-stage code in a
>> branch; the README for R should clearly indicate that the API is
>> experimental. I think it would be better for the code to start going
>> into the Arrow project (rather than staying in your personal branch)
>> for a few reasons:
>>
>> * More opportunities for the community to participate
>> * More visible progress / transparency into what is going on
>> * You will earn karma in the Apache project and be on your way to
>> becoming a committer
>> * Opportunities for code review from other C++ developers on use of
>> the Arrow APIs, and opportunities for improvement
>> * Incremental IP / licensing oversight (this gets harder when the
>> patches get bigger)
>> * Help with roadmapping / enumerating work to be done
>>
>> On that last note, I would recommend beginning to liberally create
>> JIRAs as you think of things that need to be done to build first class
>> R support for Arrow. JIRA is the simplest way to develop the roadmap
>> organically; it doesn't need to be anything formal.
>>
>> Thanks!
>> Wes
>>
>>> On Tue, Mar 20, 2018 at 12:04 PM, Romain Francois <rom...@purrple.cat> 
>>> wrote:
>>> Hello,
>>>
>>> Today is Tuesday, so that's the day I work on porting arrow to R. This 
>>> week, I've continued some of the work from last week, still following the 
>>> steps of the python front end as documented here: 
>>> https://arrow.apache.org/docs/python/data.html#type-metadata
>>>
>>> Things are starting to materialize, and I try to give it an R feel.
>>>
>>>> int32()
>>> DataType(int32)
>>>>
>>>> float64()
>>> DataType(double)
>>>>
>>>> struct( x = int32(), y = float64(), d1 = date32() )
>>> StructType(struct<x: int32, y: double, d1: date32[day]>)
>>>>
>>>> schema( x = int32(), y = float64(), d1 = date32() )
>>> x: int32
>>> y: double
>>> d1: date32[day]
>>>
>>>
>>> This is not that interesting, but it sets a nice premise for the future.
>>>
>>> Quick ones:
>>> - Are there examples of uses of pyarrow.union?
>>> - How does pyarrow.array dispatch to the right array type? And perhaps
>>> more generally, how do I know what's inside the function?
>>>
>>>>>> pa.array([1, 2, None, 3])
>>> <pyarrow.lib.Int64Array object at 0x10db246d8>
>>> [
>>>  1,
>>>  2,
>>>  NA,
>>>  3
>>> ]
>>>>>>
>>>>>> pa.array
>>> <function pyarrow.lib.array>
>>>
>>>
>>> Romain
>>>
>>>
>
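
On the quoted question about how pyarrow.array ends up with an Int64Array for [1, 2, None, 3]: one way to picture it is as type inference followed by a lookup. The sketch below is a toy model in plain Python, not pyarrow's actual implementation. The function names, the int-to-double promotion rule, and the type-to-class table are assumptions made for illustration.

```python
# Toy model of inference-based dispatch; NOT pyarrow's real code.
# Maps an inferred type name to an array class name (illustrative table).
ARRAY_CLASSES = {
    "null": "NullArray",
    "bool": "BooleanArray",
    "int64": "Int64Array",
    "double": "DoubleArray",
    "string": "StringArray",
}


def infer_type_name(values):
    """Scan the values once, skipping None, and return one type name."""
    seen = None
    for v in values:
        if v is None:
            continue  # missing values do not influence the inferred type
        # bool must be checked before int: bool is a subclass of int.
        if isinstance(v, bool):
            kind = "bool"
        elif isinstance(v, int):
            kind = "int64"
        elif isinstance(v, float):
            kind = "double"
        elif isinstance(v, str):
            kind = "string"
        else:
            raise TypeError("unsupported value: %r" % (v,))

        if seen is None or seen == kind:
            seen = kind
        elif {seen, kind} == {"int64", "double"}:
            seen = "double"  # assumed rule: mixing ints and floats widens
        else:
            raise TypeError("cannot mix %s and %s" % (seen, kind))
    # Nothing but None (or an empty input) leaves the type undetermined.
    return seen if seen is not None else "null"


def array_class_name(values):
    """Pick the array class name from the inferred type."""
    return ARRAY_CLASSES[infer_type_name(values)]


if __name__ == "__main__":
    print(array_class_name([1, 2, None, 3]))
```

The point of the sketch is only that the caller never names the array class; it falls out of the element types, with None treated as a missing value rather than as a type of its own.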
