[ https://issues.apache.org/jira/browse/AMBARI-22473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258924#comment-16258924 ]
Jonathan Matthew commented on AMBARI-22473:
-------------------------------------------
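Where are the tests for the ambari-commons shell module?
It'd be fairly easy to write a test for this: just use process_executor to run
something like python -c 'import os; import time; os.close(1); time.sleep(1)'
and expect the return code to be 0. The child closes its stdout a second before
it exits, so the read from the pipe finishes while the process is still running.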
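
A rough sketch of such a test follows. It does not call Ambari's
process_executor, whose exact signature is not shown here; instead
read_then_collect is a hypothetical stand-in built on the standard subprocess
module that reads stdout to EOF and then collects the return code with either
cmd.wait() or cmd.poll(). The function and constant names are assumptions made
for illustration only.

```python
import subprocess
import sys

# The child from the comment above: close stdout, then stay alive for a second.
CHILD_CODE = "import os; import time; os.close(1); time.sleep(1)"


def read_then_collect(use_wait):
    """Hypothetical stand-in for shell.process_executor (not Ambari code).

    Reads the child's stdout to EOF, then collects the return code with
    either cmd.wait() or cmd.poll().
    """
    cmd = subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                           stdout=subprocess.PIPE)
    try:
        # read() returns at EOF, which happens as soon as the child closes
        # fd 1, while the child itself is still sleeping.
        cmd.stdout.read()
        if use_wait:
            return cmd.wait()   # blocks until the child has exited
        return cmd.poll()       # None if the child is still running
    finally:
        cmd.stdout.close()
        cmd.wait()              # always reap the child


def test_return_code_after_stdout_closes():
    # The check proposed in the comment: the return code should be 0.
    assert read_then_collect(use_wait=True) == 0


if __name__ == "__main__":
    test_return_code_after_stdout_closes()
    print("poll() after EOF:", read_then_collect(use_wait=False))
```

With use_wait=False the stand-in would be expected to report None rather than
0, because the child is still sleeping when the read hits EOF; that is the
race this issue describes.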

I ran into this trying to install an Ambari cluster on an Ubuntu image running
on Joyent Triton (see
https://docs.joyent.com/public-cloud/instances/infrastructure/images). This bug
causes the watchdog thread to try to kill the process, and that kill was
hanging forever, likely due to bad interactions between fork/exec and threads.
Fixing process_executor to wait for the process to exit allowed the install to
complete successfully.

> shell.process_executor races process exit
> -----------------------------------------
>
> Key: AMBARI-22473
> URL: https://issues.apache.org/jira/browse/AMBARI-22473
> Project: Ambari
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: Jonathan Matthew
> Attachments:
> 0001-AMBARI-22473-shell.process_executor-races-process-ex.patch
>
>
> By calling cmd.poll(), shell.process_executor assumes that the subcommand
> will have exited by the time the executor has finished reading from the
> subcommand's stdout pipe. This isn't necessarily the case, since a process
> can close stdout and keep running, so it should call cmd.wait() instead.