[
https://issues.apache.org/jira/browse/THRIFT-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791157#comment-14791157
]
ASF GitHub Bot commented on THRIFT-2918:
----------------------------------------
GitHub user bufferoverflow opened a pull request:
https://github.com/apache/thrift/pull/606
THRIFT-2918 Race condition in Python TProcessPoolServer test
This makes python test pass on my machine.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/bufferoverflow/thrift THRIFT-2918
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/thrift/pull/606.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #606
----
commit 66192cf5394c71d499db179515b89820858fe402
Author: Roger Meier <[email protected]>
Date: 2015-09-16T17:53:07Z
THRIFT-2918 Race condition in Python TProcessPoolServer test
----
> Race condition in Python TProcessPoolServer test
> ------------------------------------------------
>
> Key: THRIFT-2918
> URL: https://issues.apache.org/jira/browse/THRIFT-2918
> Project: Thrift
> Issue Type: Bug
> Components: Python - Library
> Environment: openSUSE 13.2
> Python 2.7.8 (default, Sep 30 2014, 15:34:38) [GCC] on linux2
> Reporter: Jens Geyer
> Assignee: Jens Geyer
> Fix For: 0.9.3
>
> Attachments: THRIFT-2918-python-test-timeouts.patch
>
>
> {{make check}} gets stuck very reproducible in my VM at Test run #218:
> {code}
> Test run #217: (includes gen-py-default) Server=TProcessPoolServer,
> Proto=accel, zlib=False, SSL=False
> Testing server TProcessPoolServer: /usr/bin/python ./TestServer.py
> --genpydir=gen-py-default --protocol=accel --port=9090 TProcessPoolServer
> Testing client: /usr/bin/python ./TestClient.py --genpydir=gen-py-default
> --protocol=accel --port=9090 --transport=buffered
> ...testException(Safe)
> testException(Xception)
> testException(throw_undeclared)
> ...............
> ----------------------------------------------------------------------
> Ran 18 tests in 0.563s
> OK
> Giving TProcessPoolServer (proto=accel,zlib=False,ssl=False) an extra 3
> seconds for childprocesses to terminate via alarm
> Terminating worker: <Process(Process-1, started daemon)>
> Terminating worker: <Process(Process-2, started daemon)>
> Terminating worker: <Process(Process-3, started daemon)>
> Terminating worker: <Process(Process-4, started daemon)>
> Terminating worker: <Process(Process-5, started daemon)>
> Requesting server to stop()
> OK: Finished (includes gen-py-default) TProcessPoolServer / accel proto /
> zlib=False / SSL=False. 217 combinations tested.
> Test run #218: (includes gen-py-default) Server=TProcessPoolServer,
> Proto=accel, zlib=False, SSL=True
> Testing server TProcessPoolServer: /usr/bin/python ./TestServer.py
> --genpydir=gen-py-default --protocol=accel --port=9090 --ssl
> TProcessPoolServer
> Testing client: /usr/bin/python ./TestClient.py --genpydir=gen-py-default
> --protocol=accel --port=9090 --ssl --transport=buffered
> ...testException(Safe)
> testException(Xception)
> testException(throw_undeclared)
> ..........Terminating worker: <Process(Process-1, started daemon)>
> Terminating worker: <Process(Process-2, started daemon)>
> Terminating worker: <Process(Process-3, started daemon)>
> Terminating worker: <Process(Process-4, started daemon)>
> Terminating worker: <Process(Process-5, started daemon)>
> Requesting server to stop()
> {code}
> After fiddling a bit around with it I got it to work by increasing the
> alarm() timeout from 2 seconds to 4 seconds.
> I'm not a Python expert, but the code looks somewhat interesting to me:
> - The server code starts the workers, but some piece of code outside of the
> server is responsible for terminating them. Is that really idiomatic in
> Python or just bad design?
> - The Condition() object used in TProcessPoolsServer.py should probably be
> replaced by an Event() object. Especially, as Condition.wait() [seems to have
> it's own
> perils|http://stackoverflow.com/questions/24137480/threading-condition-waittimeout-ignores-threading-condition-notify]
> and Event is much easier to use.
> - Calling Condition.aquire() without a matching release() within a {{while
> True:}} loop looks also not very convincing to me. AFAIK the second call to
> aquire() will block, if that ever happens (it did not in my tests).
> The bad news is, that neither of the proposed changes above had any effect on
> the race conditions, except increasing the timeout - but that is merely a
> workaround, not a solution.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)