[ https://issues.apache.org/jira/browse/ARROW-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alessandro Molina updated ARROW-7798:
-------------------------------------
Fix Version/s: 6.0.0 (was: 5.0.0)
> [R] Refactor R <-> Array conversion
> -----------------------------------
>
> Key: ARROW-7798
> URL: https://issues.apache.org/jira/browse/ARROW-7798
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Reporter: Francois Saint-Jacques
> Assignee: Romain Francois
> Priority: Major
> Fix For: 6.0.0
>
>
> There's a bit of technical debt accumulated in array_to_vector and
> vector_to_array:
> * Conversion *and* casting are mixed; ideally we'd move casting out of there
> (at the cost of more memory copies). The rationale is that the conversion logic
> will differ from the CastKernels, e.g. when to raise errors, benefits from
> complex conversions like timezone... The current implementation is fast, i.e.
> it fuses the conversion and casting in a single loop, at the cost of code
> clarity and divergence.
> * There should be 2 paths: zero-copy and non-zero-copy. The non-zero-copy path
> should use the newly introduced VectorToArrayConverter, which will work with
> complex nested types.
> * In array_to_vector, the Converter should work primarily with Array and not
> ArrayVector.
> * vector_to_array should not use builders: sizes are known, and the null
> bitmap should be constructed separately. There's probably a chance that we
> can re-use R's memory with zero-copy for the raw data.
> * There seem to be multiple paths that do the same conversion:
> [https://github.com/apache/arrow/pull/7514#discussion_r446706140]
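
To illustrate the builder-free idea from the vector_to_array item, here is a minimal, self-contained sketch. It is not the arrow R package's code: Int32Column, FromRIntegers and IsValid are hypothetical names, and plain standard-library types stand in for Arrow buffers. It relies only on R storing NA_integer_ as INT_MIN and on the Arrow format's validity bitmap (bit set = valid, least-significant bit first).

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// R encodes NA_integer_ as INT_MIN in an integer vector.
constexpr int32_t kRNaInteger = INT_MIN;

// Hypothetical stand-in for an int32 array: a non-owning view of the values
// plus an owned validity bitmap (bit set = valid, least-significant bit first).
struct Int32Column {
  const int32_t* values;          // borrowed from the R vector, not copied
  std::vector<uint8_t> validity;  // empty when there are no nulls
  std::size_t length;
  std::size_t null_count;
};

// Build the column in one pass over a vector of known length, without a
// builder: the values are reused as-is and only the bitmap is allocated.
Int32Column FromRIntegers(const int32_t* data, std::size_t n) {
  std::vector<uint8_t> bitmap((n + 7) / 8, 0xFF);  // start as all-valid
  std::size_t null_count = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (data[i] == kRNaInteger) {
      bitmap[i / 8] &= static_cast<uint8_t>(~(1u << (i % 8)));  // clear bit i
      ++null_count;
    }
  }
  if (null_count == 0) bitmap.clear();  // no bitmap needed without nulls
  return Int32Column{data, std::move(bitmap), n, null_count};
}

// Returns true when slot i holds a non-null value.
bool IsValid(const Int32Column& col, std::size_t i) {
  assert(i < col.length);
  if (col.validity.empty()) return true;
  return ((col.validity[i / 8] >> (i % 8)) & 1) != 0;
}
```

In this sketch the NA sentinel stays in the borrowed values buffer; a reader is expected to consult the bitmap first, so the slot's payload is never interpreted. Whether R's memory can really be shared this way in the package depends on lifetime handling that the sketch does not model.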
--
This message was sent by Atlassian Jira
(v8.3.4#803005)