[
https://issues.apache.org/jira/browse/MESOS-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360011#comment-14360011
]
Michael Park edited comment on MESOS-2487 at 3/13/15 6:33 AM:
--------------------------------------------------------------
h3. TL;DR
* Note that we don't always want full member-wise equality.
* I don't think we can solve this at the protobuf level.
* Ideal solution I think is to a tool like {{clang-tidy}}.
* Considered adding a default {{operator==}} for {{google::protobuf::Message}}
which does full member-wise equality by default, but I think we're worse off.
* I don't have a good short-term recommendation, so if you're looking for a
solution for right now... stop reading.
h3. Important Note
An important thing to remember is that full member-wise equality is not always
what we want. Some objects include *inessential* parts that should be ignored.
Here are a few examples in Mesos:
1. {{DiskInfo}}: This one's fairly new so I remember it off the top of my head.
Here's [~jieyu]'s comment:
{code}
// NOTE: We ignore 'volume' inside DiskInfo when doing comparison
// because it describes how this resource will be used which has
// nothing to do with the Resource object itself. A framework can
// use this resource and specify different 'volume' every time it
// uses it
{code}
In our context this means: {{volume}} is an inessential component of
{{DiskInfo}} as far as equality is concerned.
2. {{FrameworkInfo}}: It has many fields, but the {{operator==}} implementation
only compares for {{user}} and {{name}}. These 2 fields are what's considered
*essential* for {{FrameworkInfo}}.
Only the author of the class knows what parts are essential. Therefore if we
decide to provide a full member-wise equality as the default implementation of
{{operator==}}, we must ensure that the author can *override* this behavior. If
protobuf generated the {{operator==}} for all messages, the wouldn't be
possible. This I think is the more important point than [overhead in generated
code size|https://code.google.com/p/protobuf/issues/detail?id=154].
h3. Problem
The problem then is that when we add a new essential member, we end up with
unexpected behavior and it doesn't result in a compile-time error. If we add a
non-essential member, things work fine... but ideally we want an explicit
acknowledgement from the programmer that it really is non-essential.
h3. Ideal Solution
First off, I don't think we can solve the problem at the protobuf level since
we can't auto-generate the correct implementations. Even if we had a way to add
attributes to a field to mark it as *non-essential*, we still don't know what
exact equality comparisons the programmer wants for the *essential* parts. For
example, {{repeated Resource resources;}} fields need to construct into our C++
{{Resources}} classes so that we can collapse them and perform {{contains}}
both ways, another example is a list of environment variables which we perform
set equality on rather than list equality.
I think the ideal solution would be to: extract the protobuf message schema and
feed it into a tool like {{clang-tidy}} to check something like: "for
{{operator==}} implementations that compares 2 protobuf messages, enforce that
all of the members are mentioned".
For a full member-wise equality type such as {{FrameworkID}}, we would have:
{code}
bool operator == (const FrameworkID& lhs, const FrameworkID& rhs)
{
return lhs.value() == rhs.value(); // All members mentioned.
}
{code}
For a partial member-wise equality such as {{DiskInfo}}, we would need:
{code}
bool operator == (const DiskInfo& lhs, const DiskInfo& rhs)
{
// Explicit acknowledgement that this is non-essential... plus the existing
note about why.
(void)lhs.volume(); (void)rhs.volume();
if (left.has_persistence() != rhs.has_persistence()) {
return false;
}
if (lhs.has_persistence()) {
return lhs.persistence().id() == rhs.persistence().id();
}
}
{code}
When a new member is added, we'll be notified to handle the case whether the
new member is essential or non-essential. If we added a new member to a message
that doesn't have an {{operator==}} implemented, well then it doesn't matter to
us.
I think this would be nice, but it's probably far out.
h3. Considered Solution
What if we provide a default {{operator==}} and {{operator!=}} implementation
which compares all the fields for all protobuf messages? If there's no specific
{{operator==}} defined, adding a new essential member just works, but adding a
non-essential member now gives unexpected behavior with no errors or warnings.
If there *is* a specific {{operator==}} defined, adding a new essential member
still breaks without warning, and adding a non-essential member still works.
Now we end up with 4 cases instead 2. I considered this solution but I think
this might be actually be worse.
The implementation is pretty simple if we wanted to pursue it as a stopgap
solution however.
As suggested by Kenton Varda (protobuf author)
[here|https://code.google.com/p/protobuf/issues/detail?id=154], and his quote
[here|https://groups.google.com/forum/#!topic/protobuf/BPIL4jQ7D3U].
{quote}
Another thing you could do which might be faster is serialize the message and
compare the raw bytes. As long as the message contains no unknown fields, it
will be serialized in canonical order. You can use DiscardUnknownFields() in
C++ to remove all unknown fields from the message.
{quote}
*NOTE*: I actually don't know what unknown fields are and I can't seem to get
anything useful from googling "protobuf unknown fields" but my guess is that we
don't use it. If we do... well, I don't like this solution anyway.
{code}
inline bool operator == (
const google::protobuf::Message& left,
const google::protobuf::Message& right)
{
return left.SerializeAsString() == right.SerializeAsString();
}
inline bool operator != (
const google::protobuf::Message& left,
const google::protobuf::Message& right)
{
return !(left == right);
}
{code}
I did a little test by adding the above function and removing the following
{{operator==}} implementations that checked for full member-wise equality.
That's not all of them, I just picked out a few for testing purposes.
* {{ContainerID}}
* {{ExecutorID}}
* {{FrameworkID}}
* {{OfferID}}
* {{SlaveID}}
* {{TaskID}}
* {{MasterInfo}}
I accidentally removed {{FrameworkInfo}} as well and {{make check}} broke. Then
realized that {{FrameworkInfo}} only compares {{user}} and {{name}}, so I added
it back and {{make check}} worked. So... I think it works.
h3. Conclusion
I don't have a good short-term recommendation for a solution. Just wanted to
document my ideas and thoughts around this.
was (Author: mcypark):
h3. TL;DR
* Note that we don't always want full member-wise equality.
* I don't think we can solve this at the protobuf level.
* Ideal solution I think is to a tool like {{clang-tidy}}.
* Considered adding a default {{operator==}} for {{google::protobuf::Message}}
which does full member-wise equality by default, but I think we're worse off.
* I don't have a good short-term recommendation, so if you're looking for a
solution for right now... stop reading.
h3. Important Note
An important thing to remember is that full member-wise equality is not always
what we want. Some objects include *inessential* parts that should be ignored.
Here are a few examples in Mesos:
1. {{DiskInfo}}: This one's fairly new so I remember it off the top of my head.
Here's [~jieyu]'s comment:
{code}
// NOTE: We ignore 'volume' inside DiskInfo when doing comparison
// because it describes how this resource will be used which has
// nothing to do with the Resource object itself. A framework can
// use this resource and specify different 'volume' every time it
// uses it
{code}
In our context this means: {{volume}} is an inessential component of
{{DiskInfo}} as far as equality is concerned.
2. {{FrameworkInfo}}: It has many fields, but the {{operator==}} implementation
only compares for {{user}} and {{name}}. These 2 fields are what's considered
*essential* for {{FrameworkInfo}}.
Only the author of the class knows what parts are essential. Therefore if we
decide to provide a full member-wise equality as the default implementation of
{{operator==}}, we must ensure that the author can *override* this behavior. If
protobuf generated the {{operator==}} for all messages, the wouldn't be
possible. This I think is the more important point than [overhead in generated
code size|https://code.google.com/p/protobuf/issues/detail?id=154].
h3. Problem
The problem then is that when we add a new essential member, we end up with
unexpected behavior and it doesn't result in a compile-time error. If we add a
non-essential member, things work fine... but ideally we want an explicit
acknowledgement from the programmer that it really is non-essential.
h3. Ideal Solution
First off, I don't think we can solve the problem at the protobuf level since
we can't auto-generate the correct implementations. Even if we had a way to add
attributes to a field to mark it as *non-essential*, we still don't know what
exact equality comparisons the programmer wants for the *essential* parts. For
example, {{repeated Resource resources;}} fields need to construct into our C++
{{Resources}} classes so that we can collapse them and perform {{contains}}
both ways, another example is a list of environment variables which we perform
set equality on rather than list equality.
I think the ideal solution would be to: extract the protobuf message schema and
feed it into a tool like {{clang-tidy}} to check something like: "for
{{operator==}} implementations that compares 2 protobuf messages, enforce that
all of the members are mentioned".
For a full member-wise equality type such as {{FrameworkID}}, we would have:
{code}
bool operator == (const FrameworkID& lhs, const FrameworkID& rhs)
{
return lhs.value() == rhs.value(); // All members mentioned.
}
{code}
For a partial member-wise equality such as {{DiskInfo}}, we would need:
{code}
bool operator == (const DiskInfo& lhs, const DiskInfo& rhs)
{
// Explicit acknowledgement that this is non-essential... plus the existing
note about why.
(void)lhs.volume(); (void)rhs.volume();
if (left.has_persistence() != rhs.has_persistence()) {
return false;
}
if (lhs.has_persistence()) {
return lhs.persistence().id() == rhs.persistence().id();
}
}
{code}
When a new member is added, we'll be notified to handle the case whether the
new member is essential or non-essential. If we added a new member to a message
that doesn't have an {{operator==}} implemented, well then it doesn't matter to
us.
I think this would be nice, but it's probably far out.
h3. Considered Solution
What if we provide a default {{operator==}} and {{operator!=}} implementation
which compares all the fields for all protobuf messages? If there's no specific
{{operator==}} defined, adding a new essential member just works, but adding a
non-essential member now gives unexpected behavior with no errors or warnings.
If there *is* a specific {{operator==}} defined, adding a new essential member
still breaks without warning, and adding a non-essential member still works.
Now we end up with 4 cases instead 2. I considered this solution but I think
this might be actually be worse.
The implementation is pretty simple if we wanted to pursue it as a stopgap
solution however.
As suggested by Kenton Varda (protobuf author)
[here|https://code.google.com/p/protobuf/issues/detail?id=154], and his quote
[here|https://groups.google.com/forum/#!topic/protobuf/BPIL4jQ7D3U].
{quote}
Another thing you could do which might be faster is serialize the message and
compare the raw bytes. As long as the message contains no unknown fields, it
will be serialized in canonical order. You can use DiscardUnknownFields() in
C++ to remove all unknown fields from the message.
{quote}
*NOTE*: I actually don't know what unknown fields are and I can't seem to get
anything useful from googling "protobuf unknown fields" but my guess is that we
don't use it. If we do... well, I don't like this solution anyway.
{code}
inline bool operator == (
const google::protobuf::Message& left,
const google::protobuf::Message& right)
{
return left.SerializeAsString() == right.SerializeAsString();
}
inline bool operator != (
const google::protobuf::Message& left,
const google::protobuf::Message& right)
{
return !(left == right);
}
{code}
I did a little test by adding the above function and removing the following
{{operator==}} implementations that checked for full member-wise equality.
That's not all of them, I just picked out a few for testing purposes.
* {{ContainerID}}
* {{ExecutorID}}
* {{FrameworkID}}
* {{OfferID}}
* {{SlaveID}}
* {{TaskID}}
* {{MasterInfo}}
I accidentally removed {{FrameworkInfo}} as well and {{make check}} broke. Then
realized that {{FrameworkInfo}} only compares {{user}} and {{name}}, so I added
it back and {{make check}} worked. So in conclusion, I think it works.
h3. Conclusion
I don't have a good short-term recommendation for a solution. Just wanted to
document my ideas and thoughts around this.
> Ensure protobuf "==" operator does not go out of sync with new protobuf fields
> ------------------------------------------------------------------------------
>
> Key: MESOS-2487
> URL: https://issues.apache.org/jira/browse/MESOS-2487
> Project: Mesos
> Issue Type: Task
> Reporter: Vinod Kone
>
> Currently when a new field is added to a protobuf that has a custom "=="
> operator defined, we don't make sure that the field is accounted for in the
> comparison. Ideally we should catch such errors at build time or 'make check'
> time.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)