[ https://issues.apache.org/jira/browse/ARROW-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036395#comment-16036395 ]
Jeff Knupp commented on ARROW-1088:
-----------------------------------
Well, the issue is that functions like `os.stat` and `os.remove` are called
indirectly on the file, and its name cannot be encoded with the 'ascii' codec
in this environment (within Python itself, of course, the name can be any
bytes). Below is the traceback output:
{code}
root@183bbfdb623a:/home/ubuntu/arrow/arrow/python# py.test pyarrow
==================================================================== test session starts ====================================================================
platform linux -- Python 3.5.3, pytest-3.1.1, py-1.4.33, pluggy-0.4.0
rootdir: /home/ubuntu/arrow/arrow/python, inifile: setup.cfg
collected 185 items / 2 skipped
pyarrow/tests/test_array.py ...........
pyarrow/tests/test_convert_builtin.py ......................
pyarrow/tests/test_convert_pandas.py ............................x....
pyarrow/tests/test_deprecations.py ..
pyarrow/tests/test_feather.py .......................x..FE.
pyarrow/tests/test_io.py ..................
pyarrow/tests/test_ipc.py .............x
pyarrow/tests/test_jemalloc.py ..
pyarrow/tests/test_scalars.py ..........
pyarrow/tests/test_schema.py ..............
pyarrow/tests/test_table.py ...............
pyarrow/tests/test_tensor.py ................
========================================================================== ERRORS ===========================================================================
_______________________________________________ ERROR at teardown of TestFeatherReader.test_unicode_filename ________________________________________________

self = <pyarrow.tests.test_feather.TestFeatherReader testMethod=test_unicode_filename>

    def tearDown(self):
        for path in self.test_files:
            try:
>               os.remove(path)
E               UnicodeEncodeError: 'ascii' codec can't encode character '\xeb' in position 10: ordinal not in range(128)

pyarrow/tests/test_feather.py:45: UnicodeEncodeError
========================================================================= FAILURES ==========================================================================
__________________________________________________________ TestFeatherReader.test_unicode_filename __________________________________________________________

self = <pyarrow.tests.test_feather.TestFeatherReader testMethod=test_unicode_filename>

    def test_unicode_filename(self):
        # GH #209
        name = (b'Besa_Kavaj\xc3\xab.feather').decode('utf-8')
        df = pd.DataFrame({'foo': [1, 2, 3, 4]})
>       self._check_pandas_roundtrip(df, path=name)

pyarrow/tests/test_feather.py:362:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyarrow/tests/test_feather.py:71: in _check_pandas_roundtrip
    if not os.path.exists(path):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = 'Besa_Kavaj\xeb.feather'

    def exists(path):
        """Test whether a path exists. Returns False for broken symbolic links"""
        try:
>           os.stat(path)
E           UnicodeEncodeError: 'ascii' codec can't encode character '\xeb' in position 10: ordinal not in range(128)

/usr/lib/python3.5/genericpath.py:19: UnicodeEncodeError
{code}
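The encoding failure in the traceback can be reproduced without touching the filesystem. The following is only a minimal sketch: `ascii_encode_error` is a hypothetical helper written for this illustration, not part of pyarrow, and it assumes the 'ascii' codec stands in for the filesystem encoding that Python picked up in the Docker container.

```python
# The same filename the test builds: 'Besa_Kavaj' followed by U+00EB.
name = b'Besa_Kavaj\xc3\xab.feather'.decode('utf-8')


def ascii_encode_error(path):
    """Return the UnicodeEncodeError raised when encoding path as ASCII.

    Returns None when the path is pure ASCII and encodes cleanly.
    Hypothetical helper for illustration only.
    """
    try:
        path.encode('ascii')
    except UnicodeEncodeError as exc:
        return exc
    return None


err = ascii_encode_error(name)
# err.start is the index of the first character that could not be encoded.
print(err.start, repr(err.object[err.start]))
print(err)
```

The offending index is 10 because `Besa_Kavaj` is ten ASCII characters long, so the non-ASCII character is the eleventh one in the name.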
> [Python] test_unicode_filename test fails when unicode filenames aren't supported by system
> -------------------------------------------------------------------------------------------
>
> Key: ARROW-1088
> URL: https://issues.apache.org/jira/browse/ARROW-1088
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: Jeff Knupp
>
> Building and running pyarrow in Docker, using Ubuntu 17.04 as the base image
> with no other modification, fails {{test_unicode_filename}} because unicode
> filenames are apparently not supported by default in this setup. This is
> further confirmed by {{os.path.supports_unicode_filenames}} being {{False}}.
> This test should be skipped in such situations.
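One possible way to skip the test in such situations is sketched below. This is an assumption about how a skip could be written, not the fix that was applied to pyarrow: `can_encode_filename` and `ExampleFeatherTest` are hypothetical names, and the skip condition checks whether the encoding reported by `sys.getfilesystemencoding()` can represent the test filename.

```python
import sys
import unittest

# The same filename the failing test uses.
UNICODE_NAME = b'Besa_Kavaj\xc3\xab.feather'.decode('utf-8')


def can_encode_filename(name, encoding=None):
    """Return True if name can be encoded with the given encoding.

    Hypothetical helper: when encoding is None, the interpreter's
    filesystem encoding is used.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


class ExampleFeatherTest(unittest.TestCase):
    # Stand-in for the real test class; only the skip logic is shown.

    @unittest.skipUnless(can_encode_filename(UNICODE_NAME),
                         "filesystem encoding cannot represent the filename")
    def test_unicode_filename(self):
        # The real round-trip check would go here.
        self.assertTrue(can_encode_filename(UNICODE_NAME))
```

Checking the concrete filename against the active encoding keeps the test running wherever the name can actually be written, and skips it only where calls such as `os.stat` would raise `UnicodeEncodeError`.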