[ 
https://issues.apache.org/jira/browse/AVRO-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hagen Weiße updated AVRO-3984:
------------------------------
    Release Note: Getters and setters generated by avrogencpp return references 
and allow passing r-values to avoid redundant copies. Also avrogencpp generated 
a enum for each union type that represents the union branches
          Status: Patch Available  (was: Open)

> Improve generated code for unions (avrogencpp)
> ----------------------------------------------
>
>                 Key: AVRO-3984
>                 URL: https://issues.apache.org/jira/browse/AVRO-3984
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: c++
>            Reporter: Hagen Weiße
>            Priority: Major
>              Labels: c++, pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> There are some issues in the code generated by avrogencpp for union types. 
> The issues i would like to address in this ticket are listed below
> I will try to provide a pull request in the near future.
> h2. Performance
> h3. Setter
> The generated setter for a field takes a const reference and performs a copy 
> for each set:
> {code:java}
> void set_BranchA(const BranchA& v); {code}
> This  leads to redundant copies and performance issues if BranchA contains a 
> large string or vector.
> A improvemnt could be to provide an additional setter with move semantics:
> {code:java}
> void set_BranchA(BranchA&& v); {code}
> h2. Getter
> The generated setter for a field returns a value:
> {code:java}
> BranchA get_BranchA() const;
> {code}
> This leads to a copy of BranchA, which might be a issue if BranchA is large 
> or contains a large string/vector.
> It would propably be better to provide a reference to the stored value like 
> this:
> {code:java}
> const BranchA& get_BranchA() const;
> BranchA& get_BranchA();
> {code}
> h2. std::any
> Generated unions store the value in a private std::any. This could probably 
> be replaced with a std::variant which should prevent unnecessary heap 
> allocations.
> h1. API issues
> h2. accessing union branches
> There are currently two ways to read a union branch:
> First using a get_[BRANCH] method. This will either return the value from the 
> branch or throw an exception. This version is okay, if you know that this 
> branch is set.
> Alternative is to use the idx() function. This will return a size_t with the 
> index that is used. 
> !https://ticket.rsint.net/images/icons/emoticons/warning.png|width=16,height=16!
>   The issue here is that you need to manually implement a mapping between 
> branch index -> union branch. 
> A solution may look like this:
> {code:java}
> struct _Example_no_alias_avsc_Union__209__ {
> private:
>     size_t idx_;
>     std::any value_;
> public:
>     enum class Index: size_t {
>         BranchA = 0,
>         BranchB = 1,
>         BranchC = 2,
>     };
>     size_t idx() const { return idx_; }
>     Index idx_enum() const { return static_cast<Index>(idx_); }
>     bool is_BranchA() const;
>     const AttributeGroup& get_BranchA() const;
>     void set_BranchA(const BranchA& v);
> {code}
> Added stuff:
>  * a enum with all union branches
>  * a additional idx_enum() function that returns this Index enum
>  * a is_[BRANCH] method, to check if a branch is set



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to