Re: [PR] [AMQ-8398] Fix conversion of 4-byte unicode characters between JMS and STOMP [activemq]

via GitHub Mon, 26 Aug 2024 15:51:55 -0700


cshannon commented on PR #1284:
URL: https://github.com/apache/activemq/pull/1284#issuecomment-2311244331


   I haven't had a change to look at all the changes yet but wanted to respond 
to a few comments:
   
   > Yes, this is tricky! If the old/existing logic is used for both writing 
and reading, then it works even for 4-byte characters. In fact, 3-bytes logic 
serializes 4-bytes data incorrectly, but since it reads data back (and expects 
incorrect bytes), it works.
   > 
   
   This is pretty much what I was assuming, that the current code works because 
the decoding handles the incorrect format, otherwise I didn't see how it could 
be working.
   
   > This is a very great point! I think the current changes are not 
backward-compatible. I'll give a few rounds of testing to 
backward-compatibility to make sure there is a problem. Most probably, it will 
require to develop some additional logic for that. Could you please point me to 
some code where there is some compatibility-related logic. I'd like to 
understand how this type of problems are handled currently.
   
   I'm not sure if we really have anything to work for compatibility for 
something like this because it's a bug and not designed. For OpenWire, the 
version is negotiated so old versions of clients/brokers can communicate by 
using the same version. 
   
   To handle it I thought of 3 quick ways but all seem pretty terrible but 
figured I'd list them anyways:
   1. We just fix it but only apply to a major version and say it's 
incompatible but this is not really practical because one of the core parts of 
OpenWire is old clients should be able to communicate. So I don't think this is 
a good idea.
   2. We provide a way to make the fix optional, maybe a flag to fall back to 
the old logic. This also seems like a bad idea though because we'd either have 
to make the default the old buggy way (and no one would change it)...or if the 
default is the new fixed way then there's a risk of breaking if someone sends 
data that needs 4 byte encoding which may or may not be rare depending on the 
use case. 
   3. We fix it and then just try and handle errors automatically and fall 
back. For example, if on deserialization it blows up maybe we try using the old 
logic. I feel like this might work only sometimes because it requires an 
exception to be thrown but I assume it could be possible for things to fail in 
such a way where it doesn't error and just ends up with the wrong result which 
is not detectable and much worse. It also requires keeping around all the old 
logic which isn't great.
   
   About the only thing I can think of that might work and be reliable would be 
to generate a new OpenWire version and for the new version the marshaller will 
just use the fixed UTF-8 encoding/decoding. This works for the datastore as 
well as KahaDB also tracks the openwrie version used to store. This method 
still requires keeping around the old broken logic as well but at least if we 
negotiated the new version we would know the client could support it. Although 
as I type this it makes me wonder about other non-java clients. I'm wondering 
if the C++ client or .NET clients (for example) are currently broken if they 
properly handle UTF-8 encoding already.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@activemq.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscr...@activemq.apache.org
For additional commands, e-mail: gitbox-h...@activemq.apache.org
For further information, visit: https://activemq.apache.org/contact

Re: [PR] [AMQ-8398] Fix conversion of 4-byte unicode characters between JMS and STOMP [activemq]

Reply via email to