[ 
https://issues.apache.org/jira/browse/AVRO-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hagen Weiße updated AVRO-3984:
------------------------------
    Description: 
There are some issues in the code generated by avrogencpp for union types. The 
issues i would like to address in this ticket are listed below

I will try to provide a pull request in the near future.
h2. Performance
h3. Setter

The generated setter for a field takes a const reference and performs a copy 
for each set:
{code:java}
void set_BranchA(const BranchA& v); {code}
This  leads to redundant copies and performance issues if BranchA contains a 
large string or vector.

A improvemnt could be to provide an additional setter with move semantics:
{code:java}
void set_BranchA(BranchA&& v); {code}
h2. Getter

The generated setter for a field returns a value:
{code:java}
BranchA get_BranchA() const;
{code}
This leads to a copy of BranchA, which might be a issue if BranchA is large or 
contains a large string/vector.

It would propably be better to provide a reference to the stored value like 
this:
{code:java}
const BranchA& get_BranchA() const;
BranchA& get_BranchA();
{code}
h2. std::any

Generated unions store the value in a private std::any. This could probably be 
replaced with a std::variant which should prevent unnecessary heap allocations.
h1. API issues
h2. accessing union branches

There are currently two ways to read a union branch:

First using a get_[BRANCH] method. This will either return the value from the 
branch or throw an exception. This version is okay, if you know that this 
branch is set.

Alternative is to use the idx() function. This will return a size_t with the 
index that is used. 

!https://ticket.rsint.net/images/icons/emoticons/warning.png|width=16,height=16!
  The issue here is that you need to manually implement a mapping between 
branch index -> union branch. 

A solution may look like this:
{code:java}
struct _Example_no_alias_avsc_Union__209__ {
private:
    size_t idx_;
    std::any value_;
public:
    enum class Index: size_t {
        BranchA = 0,
        BranchB = 1,
        BranchC = 2,
    };
    size_t idx() const { return idx_; }
    Index idx_enum() const { return static_cast<Index>(idx_); }
    bool is_BranchA() const;
    const AttributeGroup& get_BranchA() const;
    void set_BranchA(const BranchA& v);
{code}
Added stuff:
 * a enum with all union branches
 * a additional idx_enum() function that returns this Index enum
 * a is_[BRANCH] method, to check if a branch is set

  was:
There are some issues in the code generated by avrogencpp for union types. The 
issues i would like to address in this ticket are listed below

I will try to provide a pull request in the near future.
h2. Performance
h3. Setter

The generated setter for a field takes a const reference and performs a copy 
for each set:
{code:java}
void set_BranchA(const BranchA& v); {code}
This  leads to redundant copies and performance issues if BranchA contains a 
large string or vector.

A improvemnt could be to provide an additional setter with move semantics:
{code:java}
void set_BranchA(BranchA&& v); {code}
h2. Getter

The generated setter for a field returns a value:
{code:java}
BranchA get_BranchA() const;
{code}
This leads to a copy of BranchA, which might be a issue if BranchA is large or 
contains a large string/vector.

It would propably be better to provide a reference to the stored value like 
this:
{code:java}
const BranchA& get_BranchA() const;
BranchA& get_BranchA();
{code}
h2. std::any

Generated unions store the value in a private std::any. This could probably be 
replaced with a std::variant.
h1. API issues
h2. accessing union branches

There are currently two ways to read a union branch:

First using a get_[BRANCH] method. This will either return the value from the 
branch or throw an exception. This version is okay, if you know that this 
branch is set.

Alternative is to use the idx() function. This will return a size_t with the 
index that is used. 

!https://ticket.rsint.net/images/icons/emoticons/warning.png|width=16,height=16!
  The issue here is that you need to manually implement a mapping between 
branch index -> union branch. 

A solution may look like this:
{code:java}
struct _Example_no_alias_avsc_Union__209__ {
private:
    size_t idx_;
    std::any value_;
public:
    enum class Index: size_t {
        BranchA = 0,
        BranchB = 1,
        BranchC = 2,
    };
    size_t idx() const { return idx_; }
    Index idx_enum() const { return static_cast<Index>(idx_); }
    bool is_BranchA() const;
    const AttributeGroup& get_BranchA() const;
    void set_BranchA(const BranchA& v);
{code}
Added stuff:
 * a enum with all union branches
 * a additional idx_enum() function that returns this Index enum
 * a is_[BRANCH] method, to check if a branch is set


> Improve generated code for unions (avrogencpp)
> ----------------------------------------------
>
>                 Key: AVRO-3984
>                 URL: https://issues.apache.org/jira/browse/AVRO-3984
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: c++
>            Reporter: Hagen Weiße
>            Priority: Major
>
> There are some issues in the code generated by avrogencpp for union types. 
> The issues i would like to address in this ticket are listed below
> I will try to provide a pull request in the near future.
> h2. Performance
> h3. Setter
> The generated setter for a field takes a const reference and performs a copy 
> for each set:
> {code:java}
> void set_BranchA(const BranchA& v); {code}
> This  leads to redundant copies and performance issues if BranchA contains a 
> large string or vector.
> A improvemnt could be to provide an additional setter with move semantics:
> {code:java}
> void set_BranchA(BranchA&& v); {code}
> h2. Getter
> The generated setter for a field returns a value:
> {code:java}
> BranchA get_BranchA() const;
> {code}
> This leads to a copy of BranchA, which might be a issue if BranchA is large 
> or contains a large string/vector.
> It would propably be better to provide a reference to the stored value like 
> this:
> {code:java}
> const BranchA& get_BranchA() const;
> BranchA& get_BranchA();
> {code}
> h2. std::any
> Generated unions store the value in a private std::any. This could probably 
> be replaced with a std::variant which should prevent unnecessary heap 
> allocations.
> h1. API issues
> h2. accessing union branches
> There are currently two ways to read a union branch:
> First using a get_[BRANCH] method. This will either return the value from the 
> branch or throw an exception. This version is okay, if you know that this 
> branch is set.
> Alternative is to use the idx() function. This will return a size_t with the 
> index that is used. 
> !https://ticket.rsint.net/images/icons/emoticons/warning.png|width=16,height=16!
>   The issue here is that you need to manually implement a mapping between 
> branch index -> union branch. 
> A solution may look like this:
> {code:java}
> struct _Example_no_alias_avsc_Union__209__ {
> private:
>     size_t idx_;
>     std::any value_;
> public:
>     enum class Index: size_t {
>         BranchA = 0,
>         BranchB = 1,
>         BranchC = 2,
>     };
>     size_t idx() const { return idx_; }
>     Index idx_enum() const { return static_cast<Index>(idx_); }
>     bool is_BranchA() const;
>     const AttributeGroup& get_BranchA() const;
>     void set_BranchA(const BranchA& v);
> {code}
> Added stuff:
>  * a enum with all union branches
>  * a additional idx_enum() function that returns this Index enum
>  * a is_[BRANCH] method, to check if a branch is set



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to