[ 
https://issues.apache.org/jira/browse/ARROW-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alessandro Molina updated ARROW-7798:
-------------------------------------
    Fix Version/s:     (was: 5.0.0)
                   6.0.0

> [R] Refactor R <-> Array conversion
> -----------------------------------
>
>                 Key: ARROW-7798
>                 URL: https://issues.apache.org/jira/browse/ARROW-7798
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: Francois Saint-Jacques
>            Assignee: Romain Francois
>            Priority: Major
>             Fix For: 6.0.0
>
>
> There's a bit of technical debt accumulated in array_to_vector and 
> vector_to_array:
>  * Mix of conversion *and* casting; ideally we'd move casting out of there 
> (at the cost of an extra memory copy). The rationale is that the conversion 
> logic will differ from the CastKernels, e.g. in when to raise errors, or in 
> benefiting from complex conversions such as timezone handling. The current 
> implementation is fast, i.e. it fuses the conversion and casting in a single 
> loop, but at the cost of code clarity and divergence.
>  * There should be two paths: zero-copy and non-zero-copy. The non-zero-copy 
> path should use the newly introduced VectorToArrayConverter, which will work 
> with complex nested types.
>  * In array_to_vector, the Converter should work primarily with Array and not 
> ArrayVector.
>  * vector_to_array should not use builders: sizes are known, and the null 
> bitmap should be constructed separately. There's probably a chance that we 
> can re-use R's memory with zero-copy for the raw data.
>  * There seem to be multiple paths that do the same conversion: 
> [https://github.com/apache/arrow/pull/7514#discussion_r446706140]
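
To make the "no builders, null bitmap constructed separately" point above concrete, here is a minimal standalone sketch. It is not code from the arrow R package and uses no Arrow API: ConvertedInts and ConvertIntegerVector are hypothetical names invented for this illustration. It relies only on two facts: R stores NA_integer_ as INT_MIN, and an Arrow validity bitmap uses one bit per element, least-significant bit first, with 1 meaning valid. The value pointer is borrowed rather than copied, which is roughly what a zero-copy path for the raw data might look like.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <vector>

// R encodes a missing integer (NA_integer_) as INT_MIN.
constexpr int kRNaInteger = INT_MIN;

// Hypothetical result type for this sketch (not an Arrow class).
struct ConvertedInts {
  const int* values;              // borrowed from the R vector, not copied
  std::vector<uint8_t> validity;  // 1 bit per element, LSB first, 1 = valid
  std::size_t null_count;
};

// The length is known up front, so the bitmap is allocated once and filled
// in a single pass; no incremental builder is involved.
ConvertedInts ConvertIntegerVector(const int* data, std::size_t n) {
  ConvertedInts out{data, std::vector<uint8_t>((n + 7) / 8, 0), 0};
  for (std::size_t i = 0; i < n; ++i) {
    if (data[i] == kRNaInteger) {
      ++out.null_count;  // leave the bit at 0 to mark the slot as null
    } else {
      out.validity[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return out;
}
```

Because the values buffer is only borrowed, the caller has to keep the R vector alive for as long as the result is in use; a real implementation would need some ownership mechanism for that, which this sketch leaves out.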



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
