What are LC_ALL, LC_CTYPE, and LANG set to in your environment? On my system LC_CTYPE=en_US.UTF-8 and LANG=en_US.UTF-8. My understanding is that Python on Linux reads these environment variables to determine what `sys.getfilesystemencoding()` returns. The point of that setting is to have Python decode filenames and similar strings consistently with the way the OS presents them. If none of the variables is set, the locale falls back to C/POSIX and Python 2 reports an ASCII encoding, which would explain the 'ascii' codec error in your traceback.
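If it helps, here is a small sketch for checking this from the same shell you build in. It only uses the standard library; `locale_report` is a name I made up for illustration and is not part of GYB. It should run under both Python 2.7 and Python 3. Note that newer Python 3 releases may report `utf-8` even under the C locale, so run it with the interpreter the build actually uses.

```python
import locale
import os
import sys


def locale_report(environ=None):
    """Collect the locale-related settings that Python sees.

    `environ` defaults to os.environ; it is a parameter only so the
    function can be exercised with a fake environment.
    """
    environ = os.environ if environ is None else environ
    # The three variables discussed above; None means "not set".
    report = {name: environ.get(name) for name in ("LC_ALL", "LC_CTYPE", "LANG")}
    # What GYBUnicodeDataUtils.py currently passes as `encoding=`.
    report["filesystem_encoding"] = sys.getfilesystemencoding()
    # The locale's preferred text encoding, for comparison.
    report["preferred_encoding"] = locale.getpreferredencoding(False)
    return report


if __name__ == "__main__":
    for key, value in sorted(locale_report().items()):
        print("{0}={1!r}".format(key, value))
```

If `filesystem_encoding` comes back as something ASCII-flavoured rather than UTF-8, that would match the failure you are seeing.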
https://docs.python.org/2/library/sys.html#sys.getfilesystemencoding

On Mon, Jan 4, 2016, at 05:12 PM, Ryan Lovelett via swift-dev wrote:
> On Mon, Jan 4, 2016, at 03:40 PM, Tom Gall via swift-dev wrote:
> > Building with: ./swift/utils/build-script -R -t --foundation
> >
> > on Linux (gentoo amd64) fails with
> >
> > + /usr/bin/cmake --build
> > /home/tgall/swift/build/Ninja-ReleaseAssert/swift-linux-x86_64 -- -j4
> > SwiftUnitTests
> >
> > [6/29] Generating UnicodeGraphemeBreakTest.cpp from
> > UnicodeGraphemeBreakTest.cpp.gyb with ptr size = 8
> >
> > FAILED: cd /home/tgall/swift/swift/unittests/Basic && /usr/bin/cmake
> > -E make_directory
> > /home/tgall/swift/build/Ninja-ReleaseAssert/swift-linux-x86_64/unittests/Basic/8
> > && /home/tgall/swift/swift/utils/gyb --test
> > -DunicodeGraphemeBreakPropertyFile=/home/tgall/swift/swift/utils/UnicodeData/GraphemeBreakProperty.txt
> > -DunicodeGraphemeBreakTestFile=/home/tgall/swift/swift/utils/UnicodeData/GraphemeBreakTest.txt
> > -DCMAKE_SIZEOF_VOID_P=8 -o
> > /home/tgall/swift/build/Ninja-ReleaseAssert/swift-linux-x86_64/unittests/Basic/8/UnicodeGraphemeBreakTest.cpp.tmp
> > UnicodeGraphemeBreakTest.cpp.gyb && /usr/bin/cmake -E
> > copy_if_different
> > /home/tgall/swift/build/Ninja-ReleaseAssert/swift-linux-x86_64/unittests/Basic/8/UnicodeGraphemeBreakTest.cpp.tmp
> > /home/tgall/swift/build/Ninja-ReleaseAssert/swift-linux-x86_64/unittests/Basic/8/UnicodeGraphemeBreakTest.cpp
> > && /usr/bin/cmake -E remove
> > /home/tgall/swift/build/Ninja-ReleaseAssert/swift-linux-x86_64/unittests/Basic/8/UnicodeGraphemeBreakTest.cpp.tmp
> >
> > Traceback (most recent call last):
> >
> >   File "/home/tgall/swift/swift/utils/gyb", line 3, in <module>
> >     gyb.main()
> >   File "/home/tgall/swift/swift/utils/gyb.py", line 1071, in main
> >     args.target.write(executeTemplate(ast, args.line_directive,
> > **bindings))
> >   File "/home/tgall/swift/swift/utils/gyb.py", line 974, in
> > executeTemplate
> >     ast.execute(executionContext)
> >   File "/home/tgall/swift/swift/utils/gyb.py", line 591, in execute
> >     x.execute(context)
> >   File "/home/tgall/swift/swift/utils/gyb.py", line 667, in execute
> >     result = eval(self.code, context.localBindings)
> >   File
> > "/home/tgall/swift/swift/unittests/Basic/UnicodeGraphemeBreakTest.cpp.gyb",
> > line 23, in <module>
> >     get_grapheme_cluster_break_tests_as_UTF8(unicodeGraphemeBreakTestFile)
> >   File "/home/tgall/swift/swift/utils/GYBUnicodeDataUtils.py", line
> > 553, in get_grapheme_cluster_break_tests_as_UTF8
> >     for line in f:
> >   File "/usr/lib64/python2.7/codecs.py", line 687, in next
> >     return self.reader.next()
> >   File "/usr/lib64/python2.7/codecs.py", line 618, in next
> >     line = self.readline()
> >   File "/usr/lib64/python2.7/codecs.py", line 533, in readline
> >     data = self.read(readsize, firstline=True)
> >   File "/usr/lib64/python2.7/codecs.py", line 480, in read
> >     newchars, decodedbytes = self.decode(data, self.errors)
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
> > 0: ordinal not in range(128)
> > [6/29] Building CXX object
> > unittests/Parse/CMakeFiles/SwiftParseTests.dir/LexerTests.cpp.o
> > ninja: build stopped: subcommand failed.
> >
> > Ah yes ... the joys of python stack dumps... anyway, tracing this a bit:
> >
> > in swift/utils/GYBUnicodeDataUtils.py there is:
> >
> > with codecs.open(grapheme_break_test_file_name,
> > encoding=sys.getfilesystemencoding(), errors='strict') as f:
> >
>
> I wrote that code and patch (see:
> https://github.com/apple/swift/commit/7dbb4127f55022bca7b191d448652b5decf8626e).
> The change was in service of adding Python 3 support to GYB. So first of
> all let me say: I'm sorry. 😏
>
> Open up your python interpreter and figure out what your filesystem is
> reporting its encoding to be (e.g., `sys.getfilesystemencoding()`). On
> OS X and my copy of Arch linux it reports `'utf-8'` which is why it
> doesn't have an issue. Worst case scenario we can just force it to be
> `with codecs.open(grapheme_break_test_file_name, encoding='utf-8',
> errors='strict') as f:` but I went with the filesystem encoding because
> hopefully it is always UTF-8.
>
> > It appears to be our offending bit of python code. Now my unicode &
> > python foo isn't the strongest, but if I change what is passed as
> > encoding to : encoding='utf-8', the swift testcases seem to run quite
> > a bit better and end up reporting :
> >
> > Testing Time: 65.82s
> > Expected Passes : 1748
> > Expected Failures : 83
> > Unsupported Tests : 585
> > -- check-swift-linux-x86_64 finished --
> > --- Finished tests for swift ---
> >
> > Question is, is that little fix the 'right thing' (TM) ? If so happy
> > to submit this as my first 'lame' patch.
> >
> > Thanks
> >
> > --
> > Regards,
> > Tom
> >
> > "Where's the kaboom!? There was supposed to be an earth-shattering
> > kaboom!" Marvin Martian
> > Director, Linaro Mobile Group
> > Tech Lead, GPGPU
> > Linaro.org │ Open source software for ARM SoCs
> > irc: tgall_foo | skype : tom_gall

_______________________________________________
swift-dev mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-dev
