[
https://issues.apache.org/jira/browse/THRIFT-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904382#comment-16904382
]
Jarry Shaw edited comment on THRIFT-4677 at 8/10/19 9:14 AM:
-------------------------------------------------------------
Sorry for the late reply. I reported this quite a long time ago and only
recently tried to reproduce the bug.
Here is the exception traceback:
{code:python}
Traceback (most recent call last):
  File "C:\Users\fakepath\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\fakepath\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "C:\Users\fakepath\Desktop\osquery_all_mp.py", line 54, in query
    query = instance.client.query(f'SELECT * FROM {table};')
  File "C:\Users\fakepath\AppData\Local\Programs\Python\Python37\lib\site-packages\osquery\extensions\ExtensionManager.py", line 182, in query
    return self.recv_query()
  File "C:\Users\fakepath\AppData\Local\Programs\Python\Python37\lib\site-packages\osquery\extensions\ExtensionManager.py", line 201, in recv_query
    result.read(iprot)
  File "C:\Users\fakepath\AppData\Local\Programs\Python\Python37\lib\site-packages\osquery\extensions\ExtensionManager.py", line 981, in read
    self.success.read(iprot)
  File "C:\Users\fakepath\AppData\Local\Programs\Python\Python37\lib\site-packages\osquery\extensions\ttypes.py", line 339, in read
    _val12 = iprot.readString().decode('utf-8') if sys.version_info[0] == 2 else iprot.readString()
  File "C:\Users\fakepath\AppData\Local\Programs\Python\Python37\lib\site-packages\thrift\protocol\TProtocol.py", line 184, in readString
    return binary_to_str(self.readBinary())
  File "C:\Users\fakepath\AppData\Local\Programs\Python\Python37\lib\site-packages\thrift\compat.py", line 37, in binary_to_str
    return bin_val.decode('utf8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 0: invalid continuation byte
{code}
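The failure in the last frame can probably be reproduced without osquery or Thrift. The following is a minimal sketch, assuming the bytes returned by the query are GBK-encoded text; the sample string and the helper name {{decodes_as_utf8}} are illustrative only and do not come from the attached script.

```python
# Assumption: the server returns text encoded in GBK (cp936), not UTF-8.
# "\u4e2d\u6587" is the two-character string meaning "Chinese (language)".
raw = "\u4e2d\u6587".encode("gbk")


def decodes_as_utf8(data):
    """Return True if `data` is valid UTF-8, mirroring bin_val.decode('utf8')."""
    try:
        data.decode("utf8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(raw))   # GBK bytes are rejected by the UTF-8 codec
print(raw.decode("gbk"))      # the same bytes decode cleanly as GBK
```

If this assumption holds, the exception is not caused by corrupted data but by decoding with the wrong codec.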
Environment:
* Windows 10 Pro (Simplified Chinese)
* osquery v3.3.0
* osquery-python v3.0.6 (Python binding)
* thrift v0.11.0
And the Python system locale information:
{code:python}
>>> import locale
>>> locale.getpreferredencoding()
'cp936'
{code}
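For reference, {{cp936}} is the Windows code page name for GBK, which matches the "{{gbk}}" encoding named in the issue description. A small sketch using only the standard library to check how Python resolves the name (the output of the first {{print}} depends on the machine):

```python
import codecs
import locale

# Preferred encoding of the current system; 'cp936' on the reporter's machine.
preferred = locale.getpreferredencoding(False)
print(preferred)

# Look up which codec Python uses for the name 'cp936'.
print(codecs.lookup("cp936").name)
```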
Sorry, I'm not familiar with Thrift's implementation, so I don't really know
how this bug should be fixed.
However, you can find the source code I'm using in the attachment.
[^osquery_all_mp.py]
> UnicodeDecodeError in Python3
> -----------------------------
>
> Key: THRIFT-4677
> URL: https://issues.apache.org/jira/browse/THRIFT-4677
> Project: Thrift
> Issue Type: Bug
> Components: Python - Library
> Environment: Operating System: Windows 10 Pro (Simplified Chinese)
> Python Interpreter: Python 3.6.6
> {{osquery}} Version: 3.3.0
> {{osquery-python}} Version: 3.0.5
>
> Reporter: Jarry Shaw
> Priority: Major
> Attachments: compat.py, osquery_all_mp.py
>
> Original Estimate: 0.5h
> Remaining Estimate: 0.5h
>
> This issue occurred when using
> [osquery-python|https://github.com/osquery/osquery-python] (Python binding of
> [osquery|https://osquery.io/] by Facebook).
> When querying, a {{UnicodeDecodeError}} is raised with the error message
> "{{'utf-8' codec can't decode byte 0xc3 in position 0: invalid continuation
> byte}}" from {{thrift.compat.binary_to_str}}, because the {{bin_val}}
> argument is actually encoded in "{{gbk}}".
> Possible approaches are:
> * add a parameter that lets the user specify the encoding
> * get the system encoding through {{locale.getpreferredencoding()}}
> * call {{bin_val.decode}} with the {{errors='replace'}} or {{errors='ignore'}}
> parameter
> * introduce {{chardet}} to detect the encoding
> The attachment is my hack solution to this issue (though not perfect).
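The first three approaches listed above could be combined roughly as follows. This is only a sketch under stated assumptions: {{binary_to_str_lenient}} is a hypothetical name, it is not part of {{thrift.compat}}, and it is not the code in the attached {{compat.py}}.

```python
import locale


def binary_to_str_lenient(bin_val, encodings=None, errors="replace"):
    """Hypothetical replacement for a strict bin_val.decode('utf8').

    Tries each candidate encoding strictly, then falls back to UTF-8
    with a lossy error handler so that decoding never raises.
    """
    if encodings is None:
        # Approach 2: fall back to the system's preferred encoding.
        encodings = ["utf-8", locale.getpreferredencoding(False)]
    for enc in encodings:  # Approach 1: caller may pass its own list.
        try:
            return bin_val.decode(enc)
        except (UnicodeDecodeError, LookupError):
            continue
    # Approach 3: last resort, replace (or ignore) undecodable bytes.
    return bin_val.decode("utf-8", errors=errors)


gbk_bytes = "\u4e2d\u6587".encode("gbk")
print(binary_to_str_lenient(gbk_bytes, encodings=["utf-8", "gbk"]))
```

Trying UTF-8 first keeps the current behaviour for well-formed input. The fallback is a guess, though: bytes in another encoding can occasionally also be valid in an earlier candidate and decode to the wrong text, so an explicit encoding parameter is likely the safer option where the caller knows the source encoding.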
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)