Re: R/arrow update

Romain François Tue, 20 Mar 2018 23:47:50 -0700

That sounds good. I’ll open a pull request with what I have once the README 
has something useful in it.


The build setup is not dealt with yet, so this might only work on macOS, or 
even (I don’t think so) only on my 💻. 

As long as it’s clearly established that this is WIP and might change 
entirely, for sure let’s merge patches to master.

JIRA is new to me. I usually work with GitHub issues, so I’ll probably need 
some guidance. 

Romain

> On 20 March 2018 at 23:30, Wes McKinney <wesmck...@gmail.com> wrote:
> 
> hi Romain,
> 
> Cool! I would suggest that we proceed in one of two ways:
> 
> * Start merging R patches to master (what I would prefer)
> * Merge patches into an r-devel branch while the R bindings initiative
> is in early stages
> 
> I don't really see any benefits to hiding early-stage code in a
> branch; the README for R should clearly indicate that the API is
> experimental. I think it would be better for the code to start going
> into the Arrow project (rather than staying in your personal branch)
> for a few reasons:
> 
> * More opportunities for the community to participate
> * More visible progress / transparency into what is going on
> * You will earn karma in the Apache project and be on your way to
> becoming a committer
> * Opportunities for code review from other C++ developers on use of
> the Arrow APIs, and opportunities for improvement
> * Incremental IP / licensing oversight (this gets harder when the
> patches get bigger)
> * Help with roadmapping / enumerating work to be done
> 
> On that last note, I would recommend beginning to liberally create
> JIRAs as you think of things that need to be done to build first class
> R support for Arrow. JIRA is the simplest way to develop the roadmap
> organically; it doesn't need to be anything formal.
> 
> Thanks!
> Wes
> 
>> On Tue, Mar 20, 2018 at 12:04 PM, Romain Francois <rom...@purrple.cat> wrote:
>> Hello,
>> 
>> Today is Tuesday, so that's the day I work on porting Arrow to R. This week, 
>> I've continued some of the work from last week, still following the steps of 
>> the Python front end as documented here: 
>> https://arrow.apache.org/docs/python/data.html#type-metadata 
>> 
>> Things are starting to materialize, and I try to give it an R feel.
>> 
>> > int32()
>> DataType(int32)
>> > 
>> > float64()
>> DataType(double)
>> > 
>> > struct( x = int32(), y = float64(), d1 = date32() )
>> StructType(struct<x: int32, y: double, d1: date32[day]>)
>> > 
>> > schema( x = int32(), y = float64(), d1 = date32() )
>> x: int32
>> y: double
>> d1: date32[day]
>> 
>> 
>> This is not that interesting, but it sets a nice premise for the future.
>> 
>> Quick ones:
>> - are there examples of uses of pyarrow.union?
>> - how does pyarrow.array dispatch to the right array type? And perhaps 
>> more generally, how do I know what's inside the function?
>> 
>> >>> pa.array([1, 2, None, 3])
>> <pyarrow.lib.Int64Array object at 0x10db246d8>
>> [
>>  1,
>>  2,
>>  NA,
>>  3
>> ]
>> >>> 
>> >>> pa.array
>> <function pyarrow.lib.array>
>> 
>> 
>> Romain
>> 
>>
