[ https://issues.apache.org/jira/browse/ARROW-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16719205#comment-16719205 ]
Benjamin Kietzman edited comment on ARROW-47 at 12/12/18 5:06 PM:
------------------------------------------------------------------

One alternative to using `vector<shared_ptr<Scalar>>` would be a flatbuffer:

{code:title=struct_scalar.h}
struct StructScalar {
  template <typename T>
  T GetFieldAs(int field_index);

  flatbuffers::Table *root_;
  std::vector<flatbuffers::reflection::Field*> fields_;
  std::shared_ptr<Buffer> storage_;
};
{code}

In any case, the main challenge I see is the amount of fragile unboxing boilerplate that StructScalar would require to be user friendly. That can be mitigated with good metaprogramming, but it's still a bit verbose:

{code:title=unbox_styles.cc}
StructScalar* obj = get();

Status s = Unbox1<int, string, ignore, bool>(obj, [](vector<bool> is_valid, int id, string_view name, ignore, bool admin) {
  // ...
});

vector<bool> is_valid;
tuple<int, string_view, ignore, bool> employee;
RETURN_NOT_OK(Unbox2(obj, &employee, &is_valid));

pair<bool, int> id;
pair<bool, string_view> name;
pair<bool, bool> admin;
RETURN_NOT_OK(Unbox3<int>(obj, 0, &id));
RETURN_NOT_OK(Unbox3<int>(obj, 1, &name));
RETURN_NOT_OK(Unbox3<int>(obj, 3, &admin));

// more options available beyond c++11
{code}

was (Author: bkietz):
One alternative to using `vector<shared_ptr<Scalar>>` would be a flatbuffer:

```
struct StructScalar {
  template <typename T>
  T GetFieldAs(int field_index);

  flatbuffers::Table *root_;
  std::vector<flatbuffers::reflection::Field*> fields_;
  std::shared_ptr<Buffer> storage_;
};
```

In any case, the main challenge I see is the amount of fragile unboxing boilerplate that StructScalar would require to be user friendly. That can be mitigated with good metaprogramming, but it's still a bit verbose:

{{
StructScalar* obj = get();

Status s = Unbox1<int, string, ignore, bool>(obj, [](vector<bool> is_valid, int id, string_view name, ignore, bool admin) {
  // ...
});

vector<bool> is_valid;
tuple<int, string_view, ignore, bool> employee;
RETURN_NOT_OK(Unbox2(obj, &employee, &is_valid));

pair<bool, int> id;
pair<bool, string_view> name;
pair<bool, bool> admin;
RETURN_NOT_OK(Unbox3<int>(obj, 0, &id));
RETURN_NOT_OK(Unbox3<int>(obj, 1, &name));
RETURN_NOT_OK(Unbox3<int>(obj, 3, &admin));

// more options available beyond c++11
}}

> [C++] Consider adding a scalar type object model
> ------------------------------------------------
>
>                 Key: ARROW-47
>                 URL: https://issues.apache.org/jira/browse/ARROW-47
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Wes McKinney
>            Assignee: Uwe L. Korn
>            Priority: Major
>              Labels: Analytics
>             Fix For: 0.13.0
>
>
> Just did this on the Python side. In later analytics routines, passing in
> scalar values (example: Array + Scalar) requires some kind of container. Some
> systems, like the R language, solve this problem with length-1 arrays, but we
> should do some analysis of use cases and figure out what will work best for
> Arrow.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
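
For readers following the comment above, the tuple-style unboxing (the `Unbox2` variant) could look roughly like the C++11 sketch below. Everything in it is an assumption made for illustration: `MockStructScalar`, `MockField`, `Ignore`, `GetAs` and `UnboxTuple` are hypothetical names, the fields are stored in a plain vector rather than a flatbuffer, and `bool` stands in for `Status`. It is not the Arrow API, only one way the metaprogramming might be arranged.

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-ins for illustration only; none of these are Arrow types.
struct Ignore {};  // placeholder for a field the caller does not want to read

struct MockField {
  enum Kind { kInt, kString, kBool } kind;
  bool is_valid;             // false means the field is null
  int int_value;
  std::string string_value;
  bool bool_value;
};

struct MockStructScalar {
  std::vector<MockField> fields;
};

// Per-type extraction; each overload returns false on a type mismatch.
inline bool GetAs(const MockField& f, int* out) {
  if (f.kind != MockField::kInt) return false;
  *out = f.int_value;
  return true;
}
inline bool GetAs(const MockField& f, std::string* out) {
  if (f.kind != MockField::kString) return false;
  *out = f.string_value;
  return true;
}
inline bool GetAs(const MockField& f, bool* out) {
  if (f.kind != MockField::kBool) return false;
  *out = f.bool_value;
  return true;
}
inline bool GetAs(const MockField&, Ignore*) { return true; }  // skipped field

// Walks the tuple elements recursively (C++11 has no std::index_sequence).
template <std::size_t I, typename Tuple,
          bool Done = (I == std::tuple_size<Tuple>::value)>
struct UnboxImpl {
  static bool Run(const MockStructScalar& s, Tuple* out,
                  std::vector<bool>* is_valid) {
    const MockField& f = s.fields[I];
    (*is_valid)[I] = f.is_valid;
    // Null fields leave the corresponding tuple element untouched.
    if (f.is_valid && !GetAs(f, &std::get<I>(*out))) return false;
    return UnboxImpl<I + 1, Tuple>::Run(s, out, is_valid);
  }
};

// Base case: past the last element, nothing left to unbox.
template <std::size_t I, typename Tuple>
struct UnboxImpl<I, Tuple, true> {
  static bool Run(const MockStructScalar&, Tuple*, std::vector<bool>*) {
    return true;
  }
};

// Unboxes every field into `out` and records validity per field.
// Returns false if the field count or any non-null field's type mismatches.
template <typename... Ts>
bool UnboxTuple(const MockStructScalar& s, std::tuple<Ts...>* out,
                std::vector<bool>* is_valid) {
  if (s.fields.size() != sizeof...(Ts)) return false;
  is_valid->assign(sizeof...(Ts), false);
  return UnboxImpl<0, std::tuple<Ts...> >::Run(s, out, is_valid);
}
```

A caller would declare `std::tuple<int, std::string, Ignore, bool> employee;` and a `std::vector<bool> is_valid;`, then check the result of `UnboxTuple(scalar, &employee, &is_valid)`. The sketch copies strings instead of returning a `string_view`, which keeps it C++11-only at the cost of an allocation per string field.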