[ https://issues.apache.org/jira/browse/ARROW-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352679#comment-16352679 ]
Antoine Pitrou commented on ARROW-2071:
---------------------------------------
OK, I managed to track down the serialization tests issue. It occurs when
PyTorch is installed, because the PyTorch build appears to be broken and
fails to import:
{code}
$ python -c "import torch"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/antoine/miniconda3/envs/ttt/lib/python3.6/site-packages/torch/__init__.py", line 53, in <module>
    from torch._C import *
ImportError: /home/antoine/miniconda3/envs/ttt/lib/python3.6/site-packages/torch/lib/libTHC.so.1: undefined symbol: THLongStorage_inferSizeN
{code}
You'll see this too in the Travis-CI test output:
{code}
SKIP [1] /home/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages/pyarrow/tests/test_serialization.py:349: could not import 'torch'
{code}
But why does it take so much CPU time?
Because {{import torch}} is attempted on every call to {{assert_equal()}} in
{{test_serialization.py}}, and that function calls itself recursively for each
element in a container. A failed import is costly because the failure is not
cached: every attempt invokes the dynamic loader again.
A simple solution is to blacklist the module once it has failed to import.
> [Python] Reduce runtime of builds in Travis CI
> ----------------------------------------------
>
> Key: ARROW-2071
> URL: https://issues.apache.org/jira/browse/ARROW-2071
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Wes McKinney
> Assignee: Antoine Pitrou
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
>
> For some reason, recently each Python build has been taking about 15 minutes
> to run. I speculate this is due to VM thrashing caused by reduced resources
> on the Travis CI workers, related to the problem I fixed in ARROW-2062.
> We should experiment, but it seems like perhaps this can be fixed either by:
> * Reducing the size of the Plasma store on Travis CI
> * Disabling valgrind in Plasma tests
> The slowness could be caused by something else, though, so we should
> investigate (and have pytest report slow tests in the logs)