[ https://issues.apache.org/jira/browse/THRIFT-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16223556#comment-16223556 ]
ASF GitHub Bot commented on THRIFT-4207:
----------------------------------------
Github user nsuke commented on a diff in the pull request:
https://github.com/apache/thrift/pull/1274#discussion_r147556105
--- Diff: lib/py/src/ext/protocol.tcc ---
@@ -419,18 +419,30 @@ bool ProtocolBase<Impl>::encodeValue(PyObject* value,
TType type, PyObject* type
   case T_STRING: {
     ScopedPyObject nval;
+    Py_ssize_t len;
+    char *str;
     if (PyUnicode_Check(value)) {
       nval.reset(PyUnicode_AsUTF8String(value));
       if (!nval) {
         return false;
       }
     } else {
+      if (isUtf8(typeargs)) {
+        if (PyBytes_AsStringAndSize(value, &str, &len) < 0) {
+          return false;
+        }
+        // Check that input is a valid UTF-8 string.
+        nval.reset(PyUnicode_DecodeUTF8(str, len, 0));
+        if (!nval) {
+          return false;
+        }
+      }
--- End diff --
Doesn't this hurt performance for every user who passes relatively
large UTF-8-encoded `bytes`?
The underlying problem might be that we don't reject `bytes` in the first
place, although "fixing" that wouldn't be backward compatible.
What do you think?
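
For reference on the cost question: the check in the diff validates the input by calling `PyUnicode_DecodeUTF8`, which builds a new Unicode object from the bytes. One possible alternative, sketched here purely as an illustration, is to scan the buffer in place without allocating anything. The helper name `isValidUtf8` is hypothetical and not part of Thrift or CPython, and its strictness (rejecting overlong forms, surrogates, and code points above U+10FFFF) is an assumption that may not match the CPython decoder on every Python version. Whether this would actually be faster in practice has not been measured.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Thrift or CPython): returns true if the
// `len` bytes at `data` are well-formed UTF-8. It only scans the buffer;
// no memory is allocated and no Python object is created.
bool isValidUtf8(const char* data, std::size_t len) {
  const unsigned char* p = reinterpret_cast<const unsigned char*>(data);
  std::size_t i = 0;
  while (i < len) {
    unsigned char c = p[i];
    if (c < 0x80) {  // ASCII byte, always valid
      ++i;
      continue;
    }
    std::size_t extra;  // continuation bytes expected after the lead byte
    std::uint32_t cp;   // code point being assembled
    std::uint32_t min;  // smallest code point allowed for this length
    if ((c & 0xE0) == 0xC0) {
      extra = 1; cp = c & 0x1F; min = 0x80;
    } else if ((c & 0xF0) == 0xE0) {
      extra = 2; cp = c & 0x0F; min = 0x800;
    } else if ((c & 0xF8) == 0xF0) {
      extra = 3; cp = c & 0x07; min = 0x10000;
    } else {
      return false;  // stray continuation byte or invalid lead byte
    }
    if (len - i <= extra) {
      return false;  // sequence is truncated at the end of the buffer
    }
    for (std::size_t k = 1; k <= extra; ++k) {
      unsigned char cc = p[i + k];
      if ((cc & 0xC0) != 0x80) {
        return false;  // not a continuation byte
      }
      cp = (cp << 6) | (cc & 0x3F);
    }
    if (cp < min) return false;                      // overlong encoding
    if (cp > 0x10FFFF) return false;                 // beyond Unicode range
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
    i += extra + 1;
  }
  return true;
}
```

In the `else` branch of the diff, such a helper could in principle be called on the `str`/`len` pair obtained from `PyBytes_AsStringAndSize`, but it would also have to set a Python exception itself on failure, which `PyUnicode_DecodeUTF8` does automatically.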
> Accelerated version of TBinaryProtocol allows invalid input to string fields.
> -----------------------------------------------------------------------------
>
> Key: THRIFT-4207
> URL: https://issues.apache.org/jira/browse/THRIFT-4207
> Project: Thrift
> Issue Type: Bug
> Components: Python - Library
> Affects Versions: 0.10.0
> Reporter: Elvis Pranskevichus
> Assignee: James E. King, III
> Fix For: 0.11.0
>
>
> {{TBinaryProtocolAccelerated}} and {{TCompactProtocolAccelerated}} currently
> accept arbitrary bytes as input to string fields even when {{py:utf8strings}}
> is on.