szaszm commented on a change in pull request #1208: URL: https://github.com/apache/nifi-minifi-cpp/pull/1208#discussion_r777601496
########## File path: extensions/standard-processors/tests/unit/PutUDPTests.cpp ########## @@ -0,0 +1,112 @@ +/** + * + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include <memory> +#include <new> +#include <random> +#include <string> +#include "SingleInputTestController.h" +#include "PutUDP.h" +#include "utils/net/DNS.h" +#include "utils/net/Socket.h" +#include "utils/expected.h" +#include "utils/StringUtils.h" + +namespace org::apache::nifi::minifi::processors { + +namespace { +struct DatagramListener { + DatagramListener(const char* const hostname, const char* const port) + :resolved_names_{utils::net::resolveHost(hostname, port, utils::net::IpProtocol::Udp).value()}, + open_socket_{utils::net::open_socket(resolved_names_.get()) + | utils::valueOrElse([=]() -> utils::net::OpenSocketResult { throw std::runtime_error{utils::StringUtils::join_pack("Failed to connect to ", hostname, " on port ", port)}; })} + { + const auto bind_result = bind(open_socket_.socket_.get(), open_socket_.selected_name->ai_addr, open_socket_.selected_name->ai_addrlen); + if (bind_result == utils::net::SocketError) { + throw std::runtime_error{utils::StringUtils::join_pack("bind: ", utils::net::get_last_socket_error().message())}; + } + } + + struct ReceiveResult { + std::string remote_address; + std::string message; + }; + + [[nodiscard]] ReceiveResult receive(const size_t max_message_size = 8192) const { + ReceiveResult result; + result.message.resize(max_message_size); + sockaddr_storage remote_address{}; + socklen_t addrlen = sizeof(remote_address); + const auto recv_result = recvfrom(open_socket_.socket_.get(), result.message.data(), result.message.size(), 0, std::launder(reinterpret_cast<sockaddr*>(&remote_address)), &addrlen); Review comment: It shouldn't be needed, but since the code most likely violates the aliasing rules, I wanted to reduce the possibility of a future compiler messing it up with some optimization. The Sockets API was made in the 90s with no regards to aliasing rules in C and C++, and back then it didn't really matter, because compilers were much simpler and rarely if ever took advantage of optimization opportunities that arose as a consequence of having differently typed pointers in the same context, and those can not point to the same memory location. In the intended usage of the sockets API, one is supposed to initialize a memory location as `struct sockaddr_storage`, then fill it in with a function that works with `struct sockaddr_in` or `struct sockaddr_in6` (for IPv4 and IPv6 respectively) while passing it around as `struct sockaddr*`. These structures have compatible layouts, but this doesn't make the practice legal C or C++. You're correct that `std::launder` is not related, but I was hoping that the optimization barrier it creates overlaps with those that are required to make this aliasing violation work in that hypothetical future compiler that takes advantage of this UB. There was a similar usage of `std::launder` in libc++ "just to be safe": https://reviews.llvm.org/D47607. I can't give a good explanation of the proper usage of `std::launder`, because I'm not too familiar with the object model changes of C++17 that introduced it. I think it's used to obtain a valid pointer from an invalidated pointer to the same memory location, or something like this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
