Hi! 

After I ran "hawq stop cluster -a", I found that there are still gpadmin 
processes running: 

gpadmin 61866 0.4 5.2 811448 419620 ? S 17:29 0:00 /usr/local/apache-hawq/bin/gpsyncmaster -D /data/hawq/masterdd -i -p 1809 
gpadmin 61882 0.0 0.0 302688 7200 ? Ss 17:29 0:00 postgres: port 1809, logger process 
gpadmin 61883 0.0 0.0 812000 7384 ? S 17:29 0:00 postgres: port 1809, WAL Redo Server process 
gpadmin 61907 0.0 0.1 812300 8128 ? Ss 17:29 0:00 postgres: port 1809, gpsyncagent process con2 idle 

Then I ran "hawq start cluster -a", and it failed: 

20181030:17:29:05:2789929 hawq_start:dx-computing:gpadmin-[INFO]:-Starting standby master '192.168.10.18' 
20181030:17:29:05:2789929 hawq_start:dx-computing:gpadmin-[INFO]:-Start standby master service 
20181030:17:29:04:061879 hawqstandbywatch.py:dx-computing2:gpadmin-[INFO]:-Checking standby master status 
20181030:17:29:04:061879 hawqstandbywatch.py:dx-computing2:gpadmin-[INFO]:-Monitoring logs 
20181030:17:29:08:061879 hawqstandbywatch.py:dx-computing2:gpadmin-[INFO]:-checking if syncmaster is running 
20181030:17:29:08:061879 hawqstandbywatch.py:dx-computing2:gpadmin-[INFO]:-syncmaster appears ok, pid 61866 
20181030:17:29:09:2789929 hawq_start:dx-computing:gpadmin-[INFO]:-Standby master started successfully 
20181030:17:29:09:2789929 hawq_start:dx-computing:gpadmin-[INFO]:-Starting master node '192.168.10.17' 
20181030:17:29:09:2789929 hawq_start:dx-computing:gpadmin-[INFO]:-Start master service 
20181030:17:29:10:2789929 hawq_start:dx-computing:gpadmin-[INFO]:-Checking if standby is synced with master 
20181030:17:29:10:2789929 hawq_start:dx-computing:gpadmin-[ERROR]:-Failed to connect to database, this script can only be run when the database is up 
Traceback (most recent call last): 
  File "/usr/local/apache-hawq/bin/hawq_ctl", line 1459, in <module> 
    start_hawq(opts, hawq_dict) 
  File "/usr/local/apache-hawq/bin/hawq_ctl", line 1233, in start_hawq 
    instance.run() 
  File "/usr/local/apache-hawq/bin/hawq_ctl", line 765, in run 
    check_return_code(self._start_all_nodes()) 
  File "/usr/local/apache-hawq/bin/hawq_ctl", line 701, in _start_all_nodes 
    check_return_code(self.start_master(), logger, "Master start failed, exit", \ 
  File "/usr/local/apache-hawq/bin/hawq_ctl", line 618, in start_master 
    sync_result = self._check_standby_sync() 
  File "/usr/local/apache-hawq/bin/hawq_ctl", line 671, in _check_standby_sync 
    for row in rows: 
UnboundLocalError: local variable 'rows' referenced before assignment 
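
As far as I can tell, the final UnboundLocalError is a follow-on symptom rather than the root cause. It looks like the usual Python pattern where a variable is assigned only inside a try block, the database connection fails, the error is logged, and the code then carries on to use the variable. I have not read the hawq_ctl source, so the sketch below is only my guess at the shape of the problem. Every name in it (fetch_rows, check_standby_sync_buggy, check_standby_sync_safe, failing_connect) is hypothetical and is not taken from HAWQ.

```python
import logging

logger = logging.getLogger("sketch")


def fetch_rows(connect):
    """Hypothetical query helper: 'connect' returns rows or raises."""
    return connect()


def failing_connect():
    """Simulates a master that is not accepting connections."""
    raise ConnectionError("database is down")


def check_standby_sync_buggy(connect):
    """Assumed shape of the failing code path (not the real hawq_ctl code)."""
    try:
        rows = fetch_rows(connect)  # 'rows' is bound only if this succeeds
    except Exception:
        logger.error("Failed to connect to database")
    for row in rows:  # raises UnboundLocalError if the try block failed
        if row[0] != "synced":
            return False
    return True


def check_standby_sync_safe(connect):
    """Same check, but it stops as soon as the connection fails."""
    try:
        rows = fetch_rows(connect)
    except Exception:
        logger.error("Failed to connect to database")
        return False
    return all(row[0] == "synced" for row in rows)
```

If this guess is right, the "Failed to connect to database" error logged just before the traceback is the real failure, and the traceback only hides it.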

So why does "hawq stop cluster" not stop gpsyncmaster on the standby node? 

I am using HAWQ 2.2. Would upgrading solve this? 
