[ 
https://issues.apache.org/jira/browse/AVRO-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875366#comment-17875366
 ] 

ASF subversion and git services commented on AVRO-3984:
-------------------------------------------------------

Commit f350a8fbdd2e1f6797e46900fe905292ae55d7c9 in avro's branch 
refs/heads/main from hwse
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=f350a8fbd ]

AVRO-3984 [C++] Improve code generated for unions (#3047)

* AVRO-3984 [C++] Getters created by avrogencpp return a reference instead of a 
value to avoid calling copy constructor of large classes

* AVRO-3984 [C++] Add getter for generated unions that returns a mutable 
reference. This allows the user to modify values in union branches after 
creation (#3047)

* AVRO-3984 [C++] Add move setters for generated unions to provide a more 
efficient way to set a value (#3047)

* AVRO-3984 [C++] Use std::move in decode implementation of codec_traits for 
unions to avoid a copy (#3047)

* AVRO-3984 [C++] Generate an enum for each union type that maps the branch 
names to the corresponding index. This allows the user to avoid checks against 
"magic numbers" (#3047)

* AVRO-3984 [C++] Add additional checks for the union branch in 
testUnionMethods test (#3047)

* AVRO-3984 [C++] Add additional branch() method that returns the Branch enum 
directly, this avoids a manual static_cast (#3047)

---------

Co-authored-by: hwse <[email protected]>

> Improve generated code for unions (avrogencpp)
> ----------------------------------------------
>
>                 Key: AVRO-3984
>                 URL: https://issues.apache.org/jira/browse/AVRO-3984
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: c++
>            Reporter: Hagen Weiße
>            Priority: Major
>              Labels: c++, pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> There are some issues in the code generated by avrogencpp for union types. 
> The issues i would like to address in this ticket are listed below
> I will try to provide a pull request in the near future.
> h2. Performance
> h3. Setter
> The generated setter for a field takes a const reference and performs a copy 
> for each set:
> {code:java}
> void set_BranchA(const BranchA& v); {code}
> This  leads to redundant copies and performance issues if BranchA contains a 
> large string or vector.
> A improvemnt could be to provide an additional setter with move semantics:
> {code:java}
> void set_BranchA(BranchA&& v); {code}
> h2. Getter
> The generated setter for a field returns a value:
> {code:java}
> BranchA get_BranchA() const;
> {code}
> This leads to a copy of BranchA, which might be a issue if BranchA is large 
> or contains a large string/vector.
> It would propably be better to provide a reference to the stored value like 
> this:
> {code:java}
> const BranchA& get_BranchA() const;
> BranchA& get_BranchA();
> {code}
> h2. std::any
> Generated unions store the value in a private std::any. This could probably 
> be replaced with a std::variant which should prevent unnecessary heap 
> allocations.
> h1. API issues
> h2. accessing union branches
> There are currently two ways to read a union branch:
> First using a get_[BRANCH] method. This will either return the value from the 
> branch or throw an exception. This version is okay, if you know that this 
> branch is set.
> Alternative is to use the idx() function. This will return a size_t with the 
> index that is used. 
> !https://ticket.rsint.net/images/icons/emoticons/warning.png|width=16,height=16!
>   The issue here is that you need to manually implement a mapping between 
> branch index -> union branch. 
> A solution may look like this:
> {code:java}
> struct _Example_no_alias_avsc_Union__209__ {
> private:
>     size_t idx_;
>     std::any value_;
> public:
>     enum class Index: size_t {
>         BranchA = 0,
>         BranchB = 1,
>         BranchC = 2,
>     };
>     size_t idx() const { return idx_; }
>     Index idx_enum() const { return static_cast<Index>(idx_); }
>     bool is_BranchA() const;
>     const AttributeGroup& get_BranchA() const;
>     void set_BranchA(const BranchA& v);
> {code}
> Added stuff:
>  * a enum with all union branches
>  * a additional idx_enum() function that returns this Index enum
>  * a is_[BRANCH] method, to check if a branch is set



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to