[ 
https://issues.apache.org/jira/browse/ARROW-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16719205#comment-16719205
 ] 

Benjamin Kietzman edited comment on ARROW-47 at 12/12/18 5:06 PM:
------------------------------------------------------------------

One alternative to using `vector<shared_ptr<Scalar>>` would be a flatbuffer:

{code:title=struct_scalar.h}
struct StructScalar {
  template <typename T>
  T GetFieldAs(int field_index);

  flatbuffers::Table *root_;
  // FlatBuffers' reflection types live in namespace `reflection`.
  std::vector<const reflection::Field*> fields_;
  std::shared_ptr<Buffer> storage_;
};
{code}
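
To make the layout idea concrete without pulling in FlatBuffers, here is a minimal sketch of a struct scalar whose fields live in one contiguous buffer and are read back by index. It is not FlatBuffers and not an Arrow API: `PackedStructScalar`, `Append`, and the offset/size bookkeeping are hypothetical stand-ins, and only trivially copyable field types are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a buffer-backed StructScalar: every field is
// packed into one contiguous byte buffer instead of one allocation per field.
class PackedStructScalar {
 public:
  // Copies the bytes of `value` to the end of the buffer and returns the
  // index of the new field.
  template <typename T>
  int Append(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable field types are supported");
    offsets_.push_back(storage_.size());
    sizes_.push_back(sizeof(T));
    const std::uint8_t* bytes = reinterpret_cast<const std::uint8_t*>(&value);
    storage_.insert(storage_.end(), bytes, bytes + sizeof(T));
    return static_cast<int>(offsets_.size()) - 1;
  }

  // Reads field `field_index` back as T. The only check is that the stored
  // size matches sizeof(T); a real implementation would consult a schema.
  template <typename T>
  T GetFieldAs(int field_index) const {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable field types are supported");
    const std::size_t i = static_cast<std::size_t>(field_index);
    assert(i < offsets_.size() && sizes_[i] == sizeof(T));
    T value;
    // memcpy avoids unaligned or type-punned reads from the byte buffer.
    std::memcpy(&value, storage_.data() + offsets_[i], sizeof(T));
    return value;
  }

 private:
  std::vector<std::uint8_t> storage_;  // all field bytes, back to back
  std::vector<std::size_t> offsets_;   // start of each field in storage_
  std::vector<std::size_t> sizes_;     // byte width of each field
};
```

The trade-off versus `vector<shared_ptr<Scalar>>` is one allocation and no virtual dispatch, at the cost of needing schema information to interpret the bytes, which is the role the reflection `Field` pointers play in the struct above.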

In any case, the main challenge I see is the amount of fragile unboxing
boilerplate that StructScalar would need in order to be user-friendly. Good
metaprogramming can mitigate that, but the result is still a bit verbose:

{code:title=unbox_styles.cc}
StructScalar* obj = get();
Status s = Unbox1<int, string, ignore, bool>(
    obj, [](vector<bool> is_valid, int id, string_view name, ignore,
            bool admin) {
      // ...
    });

vector<bool> is_valid;
tuple<int, string_view, ignore, bool> employee;
RETURN_NOT_OK(Unbox2(obj, &employee, &is_valid));

pair<bool, int> id;
pair<bool, string_view> name;
pair<bool, bool> admin;
RETURN_NOT_OK(Unbox3<int>(obj, 0, &id));
RETURN_NOT_OK(Unbox3<string_view>(obj, 1, &name));
RETURN_NOT_OK(Unbox3<bool>(obj, 3, &admin));

// more options available beyond c++11
{code}
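
For illustration, the tuple-based style (`Unbox2` above) could be implemented roughly as follows using only C++11 variadic templates. This is a sketch under stated assumptions, not an existing Arrow API: `Scalar`, `ValueScalar`, `StructScalar`, `Ignore`, `UnboxField`, `UnboxFrom`, and `MakeEmployee` are hypothetical stand-ins, a null pointer stands for a null field, `std::string` replaces `string_view`, and `bool` replaces `Status`.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical boxed scalar hierarchy; the virtual destructor enables
// dynamic_cast for the type check below.
struct Scalar {
  virtual ~Scalar() {}
};

template <typename T>
struct ValueScalar : Scalar {
  explicit ValueScalar(T v) : value(std::move(v)) {}
  T value;
};

struct StructScalar {
  std::vector<std::shared_ptr<Scalar>> fields;  // nullptr marks a null field
};

struct Ignore {};  // placeholder for fields the caller does not want

// Unboxes field i into *out. Returns false on a type mismatch; a null field
// succeeds but is recorded as invalid.
template <typename T>
bool UnboxField(const StructScalar& obj, std::size_t i, T* out,
                std::vector<bool>* is_valid) {
  const Scalar* field = obj.fields[i].get();
  (*is_valid)[i] = (field != nullptr);
  if (field == nullptr) return true;
  const ValueScalar<T>* typed = dynamic_cast<const ValueScalar<T>*>(field);
  if (typed == nullptr) return false;
  *out = typed->value;
  return true;
}

// Ignored fields skip the type check and only report validity.
inline bool UnboxField(const StructScalar& obj, std::size_t i, Ignore*,
                       std::vector<bool>* is_valid) {
  (*is_valid)[i] = (obj.fields[i] != nullptr);
  return true;
}

// Recursion terminator: every tuple element has been visited.
template <std::size_t I, typename... Ts>
typename std::enable_if<I == sizeof...(Ts), bool>::type UnboxFrom(
    const StructScalar&, std::tuple<Ts...>*, std::vector<bool>*) {
  return true;
}

// Unboxes element I, then recurses on I + 1; stops at the first failure.
template <std::size_t I, typename... Ts>
typename std::enable_if<(I < sizeof...(Ts)), bool>::type UnboxFrom(
    const StructScalar& obj, std::tuple<Ts...>* out,
    std::vector<bool>* is_valid) {
  return UnboxField(obj, I, &std::get<I>(*out), is_valid) &&
         UnboxFrom<I + 1>(obj, out, is_valid);
}

template <typename... Ts>
bool Unbox2(const StructScalar& obj, std::tuple<Ts...>* out,
            std::vector<bool>* is_valid) {
  if (obj.fields.size() != sizeof...(Ts)) return false;  // arity mismatch
  is_valid->assign(sizeof...(Ts), false);
  return UnboxFrom<0>(obj, out, is_valid);
}

// Builds a sample (id, name, score, admin) struct whose admin flag is null.
inline StructScalar MakeEmployee() {
  StructScalar s;
  s.fields.push_back(std::make_shared<ValueScalar<int>>(7));
  s.fields.push_back(std::make_shared<ValueScalar<std::string>>("ada"));
  s.fields.push_back(std::make_shared<ValueScalar<double>>(1.5));
  s.fields.push_back(nullptr);
  return s;
}
```

With `MakeEmployee()`, unboxing into `std::tuple<int, std::string, Ignore, bool>` succeeds and marks only the last field invalid, while asking for `int` in place of the name makes `Unbox2` return false. Beyond C++11, `std::index_sequence` or fold expressions would shorten the recursion.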


> [C++] Consider adding a scalar type object model
> ------------------------------------------------
>
>                 Key: ARROW-47
>                 URL: https://issues.apache.org/jira/browse/ARROW-47
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Wes McKinney
>            Assignee: Uwe L. Korn
>            Priority: Major
>              Labels: Analytics
>             Fix For: 0.13.0
>
>
> Just did this on the Python side. In later analytics routines, passing in 
> scalar values (example: Array + Scalar) requires some kind of container. Some 
> systems, like the R language, solve this problem with length-1 arrays, but we 
> should do some analysis of use cases and figure out what will work best for 
> Arrow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
