[
https://issues.apache.org/jira/browse/ARROW-252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418062#comment-15418062
]
Wes McKinney commented on ARROW-252:
------------------------------------
+1 I agree with these sentiments
> Add implementation guidelines to the documentation
> --------------------------------------------------
>
> Key: ARROW-252
> URL: https://issues.apache.org/jira/browse/ARROW-252
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Format
> Reporter: Julien Le Dem
> Assignee: Julien Le Dem
>
> I'd like to add a paragraph to the documentation providing implementation
> guidelines:
> An execution engine (or framework, or UDF executor, or storage engine, etc.)
> can use only a subset of the Arrow spec or extend it given the
> following constraints:
> Implementing a subset:
> 1) If it is only producing (and not consuming) Arrow vectors
> - any subset of the vector spec and the corresponding metadata can be
> implemented
> 2) If it is consuming *and* producing vectors
> - there is a minimal subset of vectors to be supported (To Be Defined)
> - production of a subset of vectors and their corresponding metadata is
> always fine
> - consumption of vectors should at least convert the unsupported input
> vectors to the supported subset (for example timestamp.millis to
> timestamp.micros or int32 to int64)
> An execution engine implementor can also extend their memory representation
> with their own vectors internally as long as they are never exposed. Before
> sending data to another system expecting Arrow data, these custom vectors
> should be converted to a type that exists in the Arrow spec.
> An example of this is operating on compressed data.
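
A rough sketch of the two conversions described in the quoted guidelines, in plain Python so it runs without any Arrow library. Everything here is hypothetical and for illustration only: `Vector`, `RunLengthVector`, `SUPPORTED`, `normalize`, and `export` are not part of the Arrow spec or any Arrow implementation, and the choice of int64 and timestamp.micros as the supported subset is an assumption, since the minimal subset is still to be defined.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Vector:
    """Hypothetical stand-in for an Arrow vector: a type name plus values."""
    type_name: str
    values: List[Optional[int]]


# Types this imaginary engine handles natively (an assumption for the sketch).
SUPPORTED = {"int64", "timestamp.micros"}


def normalize(vec: Vector) -> Vector:
    """Convert an unsupported input vector to the supported subset."""
    if vec.type_name in SUPPORTED:
        return vec
    if vec.type_name == "int32":
        # Widening is lossless: every int32 value fits in int64.
        return Vector("int64", list(vec.values))
    if vec.type_name == "timestamp.millis":
        # 1 millisecond = 1000 microseconds; nulls (None) pass through.
        return Vector(
            "timestamp.micros",
            [None if v is None else v * 1000 for v in vec.values],
        )
    raise TypeError(f"no conversion registered for {vec.type_name}")


@dataclass
class RunLengthVector:
    """Hypothetical internal-only compressed vector: (value, run length) pairs."""
    runs: List[Tuple[int, int]]


def export(vec: RunLengthVector) -> Vector:
    """Expand the internal representation to a plain int64 vector."""
    values: List[Optional[int]] = []
    for value, count in vec.runs:
        values.extend([value] * count)
    return Vector("int64", values)
```

With these definitions, `normalize(Vector("int32", [1, None, 3]))` yields an int64 vector with the same values, and `export(RunLengthVector([(7, 3), (2, 1)]))` yields the values `[7, 7, 7, 2]`. A real implementation would also need to guard against overflow when scaling timestamps stored in fixed-width integers, which Python's unbounded integers hide here.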