Oh, I have figured out the problem. It has to do with my ~/.profile: at
some point I added a line there that switches to zsh (and thus sources my
.zshrc), so the login shell always ends up in zsh.
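
Judging from the `-x` trace below, the line was probably something like
this (a sketch reconstructed from the trace; the exact wording in my
~/.profile may have differed slightly):
```
# ~/.profile -- the problematic line: unconditionally replaces the
# current shell with zsh, even for non-interactive login shells
# such as the one Flink starts with `bash -l`:
[ -f /bin/zsh ] && exec /bin/zsh -l

# A safer variant: only switch to zsh when the shell is interactive,
# so scripts run via `bash -l` are not hijacked:
case $- in
  *i*) [ -f /bin/zsh ] && exec /bin/zsh -l ;;
esac
```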

On Wed, Mar 7, 2018 at 2:13 AM, Yesheng Ma <kimi.y...@gmail.com> wrote:

> Related source code: https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/bin/start-cluster.sh#L40
>
> On Wed, Mar 7, 2018 at 2:11 AM, Yesheng Ma <kimi.y...@gmail.com> wrote:
>
>> Hi Nico,
>>
>> Thanks for your reply. My main concern is actually the `-l` argument.
>> I executed `nohup /bin/bash -x -l
>> "/state/partition1/ysma/flink-1.4.1/bin/jobmanager.sh" start cluster
>> dell-01.epcc 8091` both with and without the `-l` argument (the script
>> in Flink's bin directory uses `-l`).
>>
>> 1) With the `-l` argument: the log is quite messy, but there is one
>> clear clue; the last executed command starts a zsh shell:
>> ```
>> + . /home/ysma/.bashrc
>> ++ case $- in
>> ++ return
>> + PATH=/home/ysma/bin:/home/ysma/.local/bin:/state/partition1/ysma/redis-4.0.8/../bin:/home/ysma/env/jdk1.8.0_151/bin:/home/ysma/env/maven/bin:/home/ysma/bin:/home/ysma/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
>> + '[' -f /bin/zsh ']'
>> + exec /bin/zsh -l
>> ```
>> I guess the `-l` argument makes bash pick up the user's login
>> configuration, which then switches into a zsh shell (my current login
>> shell) and never comes back.
>>
>> 2) Without the `-l` argument, everything works fine.
>>
>> Therefore I suspect there is something wrong either with the `-l`
>> argument or with my bash configuration. Any ideas? Thanks!
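>>
>> As a minimal sketch of the effect outside of Flink (assuming the
>> problem is in the login files, not in bash itself):
>> ```
>> # A login shell reads ~/.profile even when non-interactive, so if that
>> # file exec's zsh, the echo below never runs:
>> /bin/bash -l -c 'echo still in bash'
>> # Without -l no login files are read and the echo prints as expected:
>> /bin/bash -c 'echo still in bash'
>> ```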
>>
>>
>> On Wed, Mar 7, 2018 at 12:20 AM, Nico Kruber <n...@data-artisans.com>
>> wrote:
>>
>>> Hi Yesheng,
>>> `nohup /bin/bash -l bin/jobmanager.sh start cluster ...` looks a bit
>>> strange, since that should (imho) be an absolute path to the Flink script.
>>>
>>> To diagnose further, you could run the ssh command manually, i.e.
>>> figure out what is being executed by calling
>>> bash -x ./bin/start-cluster.sh
>>> and then run that ssh command without "-n" and not in the background
>>> ("&"). Then you should also see the JobManager's stdout.
>>>
>>> If that does not help yet, please log into the master manually and
>>> execute the "nohup /bin/bash..." command there to see what is going on.
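>>>
>>> Concretely, something along these lines (the host name and Flink path
>>> are just placeholders, adjust to your setup):
>>> ```
>>> # Trace the start script to see the exact ssh command it executes:
>>> bash -x ./bin/start-cluster.sh 2>&1 | grep ssh
>>> # Then re-run that ssh command in the foreground (no -n, no &) so the
>>> # JobManager's stdout/stderr appear in your terminal:
>>> ssh <master-host> "nohup /bin/bash -l /path/to/flink/bin/jobmanager.sh start cluster ..."
>>> ```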
>>>
>>> Depending on where the failure was, there may even be logs on the master
>>> machine.
>>>
>>>
>>> Nico
>>>
>>> On 04/03/18 15:52, Yesheng Ma wrote:
>>> > Hi all,
>>> >
>>> > When I execute bin/start-cluster.sh on the master machine, the
>>> > command `nohup /bin/bash -l bin/jobmanager.sh start cluster ...` is
>>> > actually executed, which does not start the JobManager properly.
>>> >
>>> > I think there might be something wrong with the `-l` argument, since
>>> > when I use the `bin/jobmanager.sh start` command, everything is fine.
>>> > Please point out if I have misconfigured anything. Thanks!
>>> >
>>> > Best,
>>> > Yesheng
>>> >
>>> >
>>>
>>>
>>
>
