On Sat, 2018-04-28 at 08:46 +0000, 范国腾 wrote:
> Hi,
> There are three nodes: node1, node2, node3. node1 is the master; node2
> and node3 are slaves.
> We executed "truncate table" at 14:25:36 and killed the WAL process
> on db2. Then Pacemaker on db2 went down and db1 rebooted.
> But I could not find any information in /var/log/messages. The
> following is the log; could you help find any clue?
>
>
> Current DC: db3 (version 1.1.15-11.el7-e174ec8) - partition with quorum

The DC is the node that schedules all actions, so its logs will be
helpful too. If there was any fencing, it should be mentioned there.

I don't see anything obvious in the logs you've posted here. But if the
nodes were fenced or crashed, the most recent logs may not have been
written to disk yet.

> Last updated: Sat Apr 28 16:37:53 2018
> Last change: Sat Apr 28 16:02:25 2018 by hacluster via crmd on db3
>
> 3 nodes and 19 resources configured
>
> Node db2: pending
> Online: [ db3 ]
> OFFLINE: [ db1 ]
>
> Full list of resources:
>
>  ipmi_node1 (stonith:fence_ipmilan): Started db3
>  ipmi_node2 (stonith:fence_ipmilan): Started db3
>  ipmi_node3 (stonith:fence_ipmilan): Stopped
>  Clone Set: dlm-clone [dlm]
>      Started: [ db3 ]
>      Stopped: [ db1 db2 ]
>  Clone Set: clvmd-clone [clvmd]
>      Started: [ db3 ]
>      Stopped: [ db1 db2 ]
>  Clone Set: clusterfs-clone [clusterfs]
>      Started: [ db3 ]
>      Stopped: [ db1 db2 ]
>  Master/Slave Set: pgsql-ha [pgsqld]
>      Masters: [ db3 ]
>      Stopped: [ db1 db2 ]
>  Resource Group: mastergroup
>      master-vip (ocf::heartbeat:IPaddr2): Started db3
>      rep-vip (ocf::heartbeat:IPaddr2): Started db3
>  slave1-vip (ocf::heartbeat:IPaddr2): Stopped
>  slave2-vip (ocf::heartbeat:IPaddr2): Stopped
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled
>
> DB1 /var/log/messages
>
>
> DB2 /var/log/messages
>
>
>
> DB1 postgres log
>
> DB2 postgres log
>
>
> From: 徐晓菲
> Sent: 2018-04-27 09:57
> To: 邵大明 <shaodam...@highgo.com>; 范国腾 <fanguot...@highgo.com>; 王亮 <wangli...@highgo.com>
> Subject: Re: Re: message+pglog
>
> OK, got it.
>
> As for the problem from yesterday's email with the logs, I'm not sure
> whether it is related to truncating the table, because in both that
> case and the one below a truncate was performed.
>
> Here is another case:
> Steps:
> (1) One master and two standbys (db1 master, db2 standby, db3 standby);
>     psql -h master-vip
> (2) create tb1; insert into tb1 in progress
> (3) Kill the streaming replication process on one standby (db3)
> (4) That standby restarts its streaming replication process; pcs status
>     still shows the original one master and two standbys
> (5) truncate tb1
> (6) Kill the streaming replication process on the standby (db3) again
>     (no insert running)
> (7) The original master db1 is shut down
> (8) Run pcs status and check the processes on db2
> [root@sds2 ~]# pcs status
> Cluster name: hgpurog
> Stack: corosync
> Current DC: db2 (version 1.1.15-11.el7-e174ec8) - partition with quorum
> Last updated: Fri Apr 27 09:47:15 2018
> Last change: Fri Apr 27 09:28:01 2018 by root via crm_attribute on db1
>
> 3 nodes and 19 resources configured
>
> Node db3: pending
> Online: [ db2 ]
> OFFLINE: [ db1 ]
>
> Full list of resources:
>
>  ipmi_node1 (stonith:fence_ipmilan): Started db2
>  ipmi_node2 (stonith:fence_ipmilan): Stopped
>  ipmi_node3 (stonith:fence_ipmilan): Started db2
>  Clone Set: dlm-clone [dlm]
>      Started: [ db2 ]
>      Stopped: [ db1 db3 ]
>  Clone Set: clvmd-clone [clvmd]
>      Started: [ db2 ]
>      Stopped: [ db1 db3 ]
>  Clone Set: clusterfs-clone [clusterfs]
>      Started: [ db2 ]
>      Stopped: [ db1 db3 ]
>  Master/Slave Set: pgsql-ha [pgsqld]
>      Slaves: [ db2 ]
>      Stopped: [ db1 db3 ]
>  Resource Group: mastergroup
>      master-vip (ocf::heartbeat:IPaddr2): Stopped
>      rep-vip (ocf::heartbeat:IPaddr2): Stopped
>  slave1-vip (ocf::heartbeat:IPaddr2): Stopped
>  slave2-vip (ocf::heartbeat:IPaddr2): Stopped
>
> Failed Actions:
> * pgsqld_promote_0 on db2 'unknown error' (1): call=94, status=Timed Out, exitreason='none',
>     last-rc-change='Fri Apr 27 09:36:06 2018', queued=0ms, exec=300002ms
>
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled
> [root@sds2 ~]#
>
> [highgo@sds2 data]$ ps -ef|grep postgres
> highgo   29499 28255  0 09:51 pts/1    00:00:00 grep --color=auto postgres
>
> pcs status and process list on db3:
> [root@sds3 ~]# pcs status
> Error: cluster is not currently running on this node
>
> [root@sds3 ~]# ps -ef |grep postgres
> highgo    4388     1  0 09:19 ?        00:00:13 /home/highgo/hgdb/bin/postgres -D /home/highgo/hgdb/data
> highgo    4449  4388  0 09:19 ?        00:00:00 postgres: logger process
> highgo   10723  4388  2 09:28 ?        00:00:35 postgres: startup process recovering 0000000900000000000000EF
> highgo   10732  4388  0 09:28 ?        00:00:00 postgres: checkpointer process
> highgo   10733  4388  0 09:28 ?        00:00:00 postgres: writer process
> highgo   11261  4388  0 09:28 ?        00:00:00 postgres: stats collector process
> highgo   12229  4388  0 09:50 ?        00:00:00 postgres: wal receiver process
> root     12231 17313  0 09:50 pts/0    00:00:00 grep --color=auto postgres
> [root@sds3 ~]#
>
>
> Best wishes for your work!
> ----------------------------------
> 徐晓菲, Product Testing Department
> 瀚高基础软件股份有限公司
> Website: www.highgo.com
> Address: 20F, Mingsheng Building, No. 2117 Xinluo Street, High-tech Zone, Jinan
> Mobile: 183-6307-3951  Email: xuxiao...@highgo.com
>
>
> From: shaodam...@highgo.com
> Sent: 2018-04-27 09:37
> To: xuxiao...@highgo.com; fanguoteng; 王亮
> Subject: Re: Re: message+pglog
> hi, xiaofei
>
> "Crossover" means the following. Suppose two machines act as client servers.
> One machine opens 400 clients that access database 1 on standby 1.
> The other machine opens 400 clients that access database 2 on standby 2.
> 10% crossover means 360 clients access database 1 on standby 1 and 40
> access database 2 on standby 1;
> likewise, 360 access database 2 on standby 2 and 40 access database 1
> on standby 2.
> Other cases change proportionally in the same way.
>
> thanks.
> Br.
> Bret
> shaodam...@highgo.com
>
> From: xuxiao...@highgo.com
> Sent: 2018-04-27 09:18
> To: 范国腾; wangliang; shaodaming
> Subject: Re: message+pglog
> Hello,
> Does "crossover" here mean that, for example, 100% crossover is sending
> selects at the same time, and 10% crossover is standby 1 reading for a
> while and then standby 2 reading?
>
>
> From: xuxiao...@highgo.com
> Sent: 2018-04-26 16:33
> To: 范国腾; wangliang; shaodaming
> Subject: message+pglog
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
-- 
Ken Gaillot <kgail...@redhat.com>