cshannon commented on PR #1284: URL: https://github.com/apache/activemq/pull/1284#issuecomment-2311244331
I haven't had a change to look at all the changes yet but wanted to respond to a few comments: > Yes, this is tricky! If the old/existing logic is used for both writing and reading, then it works even for 4-byte characters. In fact, 3-bytes logic serializes 4-bytes data incorrectly, but since it reads data back (and expects incorrect bytes), it works. > This is pretty much what I was assuming, that the current code works because the decoding handles the incorrect format, otherwise I didn't see how it could be working. > This is a very great point! I think the current changes are not backward-compatible. I'll give a few rounds of testing to backward-compatibility to make sure there is a problem. Most probably, it will require to develop some additional logic for that. Could you please point me to some code where there is some compatibility-related logic. I'd like to understand how this type of problems are handled currently. I'm not sure if we really have anything to work for compatibility for something like this because it's a bug and not designed. For OpenWire, the version is negotiated so old versions of clients/brokers can communicate by using the same version. To handle it I thought of 3 quick ways but all seem pretty terrible but figured I'd list them anyways: 1. We just fix it but only apply to a major version and say it's incompatible but this is not really practical because one of the core parts of OpenWire is old clients should be able to communicate. So I don't think this is a good idea. 2. We provide a way to make the fix optional, maybe a flag to fall back to the old logic. This also seems like a bad idea though because we'd either have to make the default the old buggy way (and no one would change it)...or if the default is the new fixed way then there's a risk of breaking if someone sends data that needs 4 byte encoding which may or may not be rare depending on the use case. 3. We fix it and then just try and handle errors automatically and fall back. For example, if on deserialization it blows up maybe we try using the old logic. I feel like this might work only sometimes because it requires an exception to be thrown but I assume it could be possible for things to fail in such a way where it doesn't error and just ends up with the wrong result which is not detectable and much worse. It also requires keeping around all the old logic which isn't great. About the only thing I can think of that might work and be reliable would be to generate a new OpenWire version and for the new version the marshaller will just use the fixed UTF-8 encoding/decoding. This works for the datastore as well as KahaDB also tracks the openwrie version used to store. This method still requires keeping around the old broken logic as well but at least if we negotiated the new version we would know the client could support it. Although as I type this it makes me wonder about other non-java clients. I'm wondering if the C++ client or .NET clients (for example) are currently broken if they properly handle UTF-8 encoding already. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@activemq.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: gitbox-unsubscr...@activemq.apache.org For additional commands, e-mail: gitbox-h...@activemq.apache.org For further information, visit: https://activemq.apache.org/contact