Ming LI created HAWQ-901:
----------------------------
Summary: hawq init failed: hawqstandbywatch.py:test5:gpadmin-[WARNING]:-syncmaster not running
Key: HAWQ-901
URL: https://issues.apache.org/jira/browse/HAWQ-901
Project: Apache HAWQ
Issue Type: Bug
Components: Command Line Tools
Reporter: Ming LI
Assignee: Lei Chang
Error messages in ~/hawqAdminLogs/hawq_init_XXXXXXXX.log:
------------------------------------------------------------------------------------
20160706:06:45:53:006218 hawq_start:test1:gpadmin-[INFO]:-Start hawq with args: ['start', 'standby']
20160706:06:45:53:006218 hawq_start:test1:gpadmin-[INFO]:-Gathering information and validating the environment...
20160706:06:45:53:006218 hawq_start:test1:gpadmin-[INFO]:-Start standby master service
20160706:06:46:02:006218 hawq_start:test1:gpadmin-[INFO]:-Checking standby master status
20160706:06:45:55:004418 hawqstandbywatch.py:test5:gpadmin-[INFO]:-Monitoring logs
20160706:06:46:00:004418 hawqstandbywatch.py:test5:gpadmin-[INFO]:-checking if syncmaster is running
20160706:06:46:02:004418 hawqstandbywatch.py:test5:gpadmin-[WARNING]:-syncmaster not running
20160706:06:46:02:006218 hawq_start:test1:gpadmin-[ERROR]:-Standby master start failed, exit
20160706:06:46:02:003999 hawqinit.sh:test5:gpadmin-[ERROR]:-Start HAWQ standby failed
------------------------------------------------------------------------------
(1) I suspect the root cause is that we wait only 5 seconds before checking
whether the standby is running, which is too short. Could you first replace
the fixed 5-second wait with a polling loop, like the one used for the
recovery status check on the master?
(2) If the 'syncmaster not running' condition causes init to fail, its log
level should be changed from [WARNING] to [ERROR].
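
To illustrate suggestion (1), here is a minimal sketch of a polling loop with a
deadline, as an alternative to a single fixed wait. It is not taken from the
HAWQ source: the function name wait_until, its parameters, and the default
timeout and interval values are all assumptions chosen for illustration, and
the real syncmaster check would have to be supplied as the check callable.

```python
import time


def wait_until(check, timeout=60.0, interval=1.0):
    """Poll check() until it returns True or `timeout` seconds elapse.

    Hypothetical helper, not part of HAWQ. `check` is any zero-argument
    callable that returns True once the service is up.
    Returns True on success and False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in for the real status check: fails twice, then succeeds.
    attempts = iter([False, False, True])
    print(wait_until(lambda: next(attempts), timeout=5.0, interval=0.1))
```

With a loop like this, a standby that needs more than 5 seconds to start is
still detected as running, while a standby that never starts is reported only
after the full timeout, at which point logging at [ERROR] would be appropriate.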