KalleOlaviNiemitalo commented on code in PR #3284:
URL: https://github.com/apache/avro/pull/3284#discussion_r1905056392
##########
lang/c++/impl/Compiler.cc:
##########
@@ -136,7 +136,11 @@ int64_t getLongField(const Entity &e, const Object &m,
// Unescape double quotes (") for de-serialization. This method complements
the
// method NodeImpl::escape() which is used for serialization.
Review Comment:
A side note. The NodeImpl::escape function referenced from here calls
NodeImpl::intToHex:
<https://github.com/apache/avro/blob/dd01f97869e374d6427bec999afa089f760791ab/lang/c%2B%2B/impl/NodeImpl.cc#L59-L65>
<https://github.com/apache/avro/blob/dd01f97869e374d6427bec999afa089f760791ab/lang/c%2B%2B/include/avro/NodeImpl.hh#L543-L550>
In NodeImpl::intToHex, the `std::setw(sizeof(T))` call looks wrong for two
reasons:
* If bytes are 8-bit, then each byte makes two hexadecimal digits, not 1.
* According to <https://datatracker.ietf.org/doc/html/rfc7159#section-7>,
`\u` in JSON must always be followed by four hexadecimal digits; this does not
depend on which C++ type was used.
But I suppose it works in practice because `T` is either `int` or `unsigned
int` and `sizeof(T)` is 4 on usual architectures and the actual value of `T i`
is cast from `char` or `uint8_t` so it is 0xFF or less. Although this could
break if `char` is a signed type and `std::iscntrl` returns true for a negative
value.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]