[ https://issues.apache.org/jira/browse/ARROW-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352679#comment-16352679 ]
Antoine Pitrou commented on ARROW-2071:
---------------------------------------

Ok, I managed to demystify the serialization tests issue. It occurs when pytorch is installed. Why? Because it seems the pytorch build is buggy and fails importing:
{code}
$ python -c "import torch"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/antoine/miniconda3/envs/ttt/lib/python3.6/site-packages/torch/__init__.py", line 53, in <module>
    from torch._C import *
ImportError: /home/antoine/miniconda3/envs/ttt/lib/python3.6/site-packages/torch/lib/libTHC.so.1: undefined symbol: THLongStorage_inferSizeN
{code}
You'll see this too in the Travis-CI test output:
{code}
SKIP [1] /home/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_serialization.py:349: could not import 'torch'
{code}
But why does it take so much CPU time? Because {{import torch}} is tried at each call to {{assert_equal()}} in {{test_serialization.py}} -- that function calls itself recursively for each element in a container... And failing to import is actually costly, because it invokes the dynamic loader every time.

There is a simple solution which is to blacklist the module once it has failed importing.

> [Python] Reduce runtime of builds in Travis CI
> ----------------------------------------------
>
>                 Key: ARROW-2071
>                 URL: https://issues.apache.org/jira/browse/ARROW-2071
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Wes McKinney
>            Assignee: Antoine Pitrou
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
> For some reason, recently each Python build has been taking about 15 minutes
> to run. I speculate this is due to VM thrashing caused by reduced resources
> on the Travis CI workers, related to the problem I fixed in ARROW-2062.
> We should experiment, but it seems like perhaps this can be fixed either by:
> * Reducing the size of the Plasma store on Travis CI
> * Disabling valgrind in Plasma tests
> The slowness could be caused by something else, though, so we should
> investigate (and have pytest report slow tests in the logs)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)