fishy commented on code in PR #3105: URL: https://github.com/apache/thrift/pull/3105#discussion_r1962372345
##########
lib/py/src/protocol/TBinaryProtocol.py:
##########
@@ -146,7 +146,7 @@ def readMessageBegin(self):
             if self.strictRead:
                 raise TProtocolException(type=TProtocolException.BAD_VERSION,
                                          message='No protocol version header')
-            name = binary_to_str(self.trans.readAll(sz))
+            name = self.trans.readAll(sz).decode('utf8')

Review Comment:
   It looks like `utf8` is a compatibility alias for `utf-8` defined in Python:
   
   ```
   Python 3.13.2 (main, Feb 5 2025, 01:23:35) [GCC 14.2.0] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   >>> "foobar".encode('utf8')
   b'foobar'
   >>> "foobar".encode('foobar')
   Traceback (most recent call last):
     File "<python-input-1>", line 1, in <module>
       "foobar".encode('foobar')
       ~~~~~~~~~~~~~~~^^^^^^^^^^
   LookupError: unknown encoding: foobar
   >>> "foobar".encode('utf-8')
   b'foobar'
   ```
   
   So this still works, but I think we probably want to use `utf-8` instead, as that's the correct term (it's also the spelling used in the examples at https://docs.python.org/3/howto/unicode.html).
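
A quick way to confirm that the two spellings name the same codec is to ask the codec registry directly. This is only a sketch for illustration: `same_codec` is a hypothetical helper, not part of Thrift or this PR, and it assumes CPython's standard `codecs` module.

```python
import codecs


def same_codec(a: str, b: str) -> bool:
    """Return True if both encoding names resolve to the same registered codec.

    Hypothetical helper for illustration; not part of Thrift.
    """
    return codecs.lookup(a).name == codecs.lookup(b).name


payload = "foobar".encode("utf-8")

# Both spellings decode the same bytes to the same string.
decoded_alias = payload.decode("utf8")
decoded_canonical = payload.decode("utf-8")

# The registry reports one canonical name regardless of the alias used.
canonical_name = codecs.lookup("utf8").name
```

Since both names resolve to the same codec, switching the call to `.decode('utf-8')` should not change behavior; it only makes the spelling match the documentation.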