Re: [Linux-ha-dev] pgsql and streaming replcation

Serge Dubrouski Sun, 05 Feb 2012 09:13:49 -0800

Takatoshi -

Please consider apllying attached patch to your version of pgsql RA. It's
all cosmetics: replacing tabs with spaces and improving English. Thanks for
implementing recommendations I expressed earlier.



On Tue, Dec 13, 2011 at 3:37 AM, Takatoshi MATSUO <[email protected]>wrote:

> Hello Serge
>
> 2011/12/13 Serge Dubrouski <[email protected]>:
> >
> >
> > On Mon, Dec 12, 2011 at 4:28 AM, Takatoshi MATSUO <[email protected]>
> > wrote:
> >>
> >> Hello Serge
> >>
> >> 2011/12/12 Serge Dubrouski <[email protected]>:
> >> >
> >> >
> >> > On Thu, Dec 8, 2011 at 10:04 PM, Takatoshi MATSUO <
> [email protected]>
> >> > wrote:
> >> >>
> >> >> Hello Serge
> >> >>
> >> >> 2011/12/8 Serge Dubrouski <[email protected]>:
> >> >> >
> >> >> >
> >> >> > On Mon, Dec 5, 2011 at 9:15 PM, Takatoshi MATSUO
> >> >> > <[email protected]>
> >> >> > wrote:
> >> >> >>
> >> >> >> Hello Serge
> >> >> >>
> >> >> >> Serge Dubrouski <[email protected]>:
> >> >> >> > Hello -
> >> >> >> >
> >> >> >> > Takatoshi MATSUO did a tremendous job on implementing support
> for
> >> >> >> > streaming
> >> >> >> > replication feature in pgsql RA. Also it looks like PostgeSQL
> 9.1
> >> >> >> > has
> >> >> >> > all
> >> >> >> > necessary interfaces to successfully implement  Pacemaker's M/S
> >> >> >> > concept.
> >> >> >> > So
> >> >> >> > I think it's time to start discussion on how to merge
> Takatoshi's
> >> >> >> > work
> >> >> >> > into
> >> >> >> > pgsql RA baseline. Here is the link to Takatoshi's GitHUB if
> >> >> >> > somebody
> >> >> >> > wants
> >> >> >> > to test his RA:
> >> >> >> >
> >> >> >> > https://github.com/t-matsuo/
> >> >> >> >
> >> >> >> > So far I tested it for backward compatibility in a standard
> >> >> >> > non-replication
> >> >> >> > mode  and also tested M/S model and found no real issues. Though
> >> >> >> > it
> >> >> >> > definitely requires some more polishing and testing.
> >> >> >> >
> >> >> >> > Takatoshi, here are some changes that I want to discuss with
> you:
> >> >> >> >
> >> >> >> > 1. Is it possible to add a check for PostgreSQL version and fail
> >> >> >> > with
> >> >> >> > OCF_ERR_INSTALLED when one tries to start replication on version
> >> >> >> > less
> >> >> >> > than
> >> >> >> > 9.1? A simple cat on PG_VERSION with some analysis would
> probably
> >> >> >> > do.
> >> >> >>
> >> >> >> I'll add a check.
> >> >>
> >> >> I added a check.
> >> >>
> >> >>
> >> >>
> https://github.com/t-matsuo/resource-agents/commit/3ab7cfdcce118043cd149b348740e50e7a946eb3
> >> >>
> >> >> >>
> >> >> >> > 2. I think that following lines should be moved from pgsql_start
> >> >> >> > to
> >> >> >> > pgsql_validate_all
> >> >> >> >
> >> >> >> >  535     # Check whether tmpdir is readable by pgdba user
> >> >> >> >  536     if ! runasowner "test -r $OCF_RESKEY_tmpdir"; then
> >> >> >> >  537         ocf_log err "Directory $OCF_RESKEY_tmpdir is not
> >> >> >> > readable
> >> >> >> > by
> >> >> >> > $OCF_RESKEY_pgdba"
> >> >> >> >  538         return $OCF_ERR_PERM
> >> >> >> >  539     fi
> >> >> >>
> >> >> >> Thanks. I think so too.
> >> >> >> I'll fix it.
> >> >> >>
> >> >>
> >> >> I fixed it and I deleted a check for tmpdir existence
> >> >> because the checking for permittion fills the role.
> >> >>
> >> >>
> >> >>
> https://github.com/t-matsuo/resource-agents/commit/82d4939486bcca429e2deb804d7faf756099bb59
> >> >>
> >> >>
> >> >> > On a second thought I'm not sure why we need that parameter and
> >> >> > directory at
> >> >> > all. Why not to create rep_mode.conf, PGSQL.lock and xlog.note in
> >> >> > $OCF_RESKEY_pgdata ? What problems it can create?
> >> >> >
> >> >> > One more advantage to do
> >> >> > it in $OCF_RESKEY_pgdata is an ability to handle more than
> PostgreSQL
> >> >> > instance on the same server without a need for additional temp
> >> >> > directories.
> >> >>
> >> >> When backup is needed, customers may backup these files and restore
> it.
> >> >> It may cause problems.
> >> >> Specially PGSQL.lock causes an error on start.
> >> >>
> >> >> I think that they should be treated as a separate thing.
> >> >> because they are independent from PostgreSQL's data.
> >> >
> >> >
> >> > Ok. Then may be it should default to something like
> >> > ${OCF_RESKEY_pgdata}/temp and RA should create it and set the right
> >> > ownership and permissions if it doesn't exist? And again may be in
> this
> >> > case
> >> > we don't need that parameter?
> >> >
> >>
> >> I agree that RA creates it and sets the right ownership and permissions.
> >> But considering backup using tar command, I think it's not better to
> >> create it
> >> under $OCF_RESKEY_pgdata.
> >>
> >> According to "Filesystem Hierarchy Standard"
> >> it's better to use /var/lib/somewhere.
> >>
> >>
> >>
> http://www.pathname.com/fhs/pub/fhs-2.3.html#VARLIBVARIABLESTATEINFORMATION
> ,
> >>
> >> Then I designed using /var/lig/[RA Name] (=/var/lib/pgsql).
> >> What do you think?
> >
> >
> > Ahh, I see now where confusion comes from. /var/lib/pgsql is actually
> taken
> > :-) It's used as default home directory in RedHat (at least) for postgres
> > user in PostgreSQL 8.XX. When they'll start start shipping version 9
> they'll
> > probably use it as well. OCF_RESKEY_pg_data  defaults to
> > /var/lib/pgsql/data.
> >
> > Then, /var/lib is usually used for non-temporary data, For temporary
> files
> > it's probably better to use /var/run or /var/tmp. If you still want to
> use
> > /var/lib/pgsql you probably need to use /var/lib/pgsql/temp or so.
>
> Files under /var/run or /var/tmp are cleared by tmpwatch or at the
> beginning
> of the boot process, aren't they?
>
> Clearing PGSQL.lock file causes problem.
>
>
> >
> >>
> >> >>
> >> >>
> >> >> Incidentally I considered to handle more than PostgreSQL instance.
> >> >> Initially I added port number to these filenames, but I deleted it
> >> >> to simplify filenames.
> >> >>
> >> >>
> >> >>
> https://github.com/t-matsuo/resource-agents/commit/b16faf2d797200048dc0fc07a45b6751cf5be190
> >> >>
> >> >>
> >> >> > Also I think it would be good if RA was able to take care of adding
> >> >> > "include
> >> >> > $WHATEVER_DIR/rep_mode.conf" in postgresql.conf. It will make the
> RA
> >> >> > self
> >> >> > sustainable. In a current situation admin has add that directive
> >> >> > manually.
> >> >> > RA though can something like this in a start function for
> replication
> >> >> > mode:
> >> >> >
> >> >> > if ! grep -i "include $WHATEVER_DIR/rep_mode.conf"
> $OCF_RESKEY_config
> >> >> > then
> >> >> >      echo "include $WHATEVER_DIR/rep_mode.conf" >>
> $OCF_RESKEY_config
> >> >> > fi
> >> >>
> >> >> Sounds good.
> >> >>
> >> >> > Don't know if it makes sense to remove it on stop.
> >> >>
> >> >> I think it doesn't make sense to remove it,
> >> >> because rep_mode.conf becomes empty on stop.
> >> >
> >> >
> >> > The only problem here if admin changes temp_dir parameter but doesn't
> >> > delete
> >> > records from postgres.conf. We could end up with several include
> records
> >> > then or with records pointing to the non existing files.
> >>
> >> Should we consider this situation?
> >>
> >> If same parameters exist, PostgreSQL uses last parameter.
> >> So I will implement it to add several include if temp_dir is changed.
> >
> >
> > The problem happens when file gets deleted but reference to it still
> exists
> > in postgres.conf. They DB fails to start.
>
>
> Start manually?
> The RA manages some statuses using an attribute and PGSQL.lock file.
> I don't recommend starting it manually in the first place.
>
>
> How about just describing it in a document? meta-data?
>
>
> >
> >>
> >>
> >>
> >> >>
> >> >> >> > 3. I don't really like this part of pgsql_real_monitor:
> >> >> >> >
> >> >> >> >  775     if ! is_replication; then
> >> >> >> >  776         OCF_RESKEY_monitor_sql=`escape_string >
> >> >> >> > "$OCF_RESKEY_monitor_sql"`
> >> >> >> >  777         runasowner -q $loglevel "$OCF_RESKEY_psql
> >> >> >> > $psql_options
> >> >> >> > -c
> >> >> >> > > '$OCF_RESKEY_monitor_sql'"
> >> >> >> >  778         rc=$?
> >> >> >> >  779     else
> >> >> >> >  780         output=`su $OCF_RESKEY_pgdba -c "cd
> >> >> >> > $OCF_RESKEY_pgdata;
> >> >> >> > >
> >> >> >> > $OCF_RESKEY_psql $psql_options -Atc \"${CHECK_MS_SQL}\""`
> >> >> >> >  781         rc=$?
> >> >> >> >  782     fi
> >> >> >> >
> >> >> >> > I think that functional monitor (the one that uses monitor_sql)
> >> >> >> > should
> >> >> >> > run
> >> >> >> > always independently of DB mode since its primary role is to
> check
> >> >> >> > data
> >> >> >> > and
> >> >> >> > fail if it's not correct or corrupted. In replication mode there
> >> >> >> > should
> >> >> >> > be
> >> >> >> > additional monitoring. Other way it misleads customer on a usage
> >> >> >> > of
> >> >> >> > monitor_sql.
> >> >> >>
> >> >> >> All right.
> >> >> >>
> >> >> >> Does it need to execute "select now();" if monitor_sql parameter
> is
> >> >> >> empty?
> >> >> >> I think it's unnecessary.
> >> >> >
> >> >> >
> >> >> > For a case of an empty parameter you are right and running "select
> >> >> > now()"
> >> >> > probably unnecessary, Non-empty monitro_sql shall be executed in my
> >> >> > opinion.
> >> >> >
> >> >> >>
> >> >> >>
> >> >> >> > 4. You already populate several attributes with crm_attribute.
> >> >> >> > Does
> >> >> >> > it
> >> >> >> > make
> >> >> >> > sense to populate a name of a Master node in promote function
> and
> >> >> >> > use
> >> >> >> > it
> >> >> >> > later on instead of running crm_mon on each monitor command?
> >> >> >> >
> >> >> >>
> >> >> >> Do you mean that you don't want to use crm_mon in monitor?
> >> >> >
> >> >> >
> >> >> > I prefer to have as few dependencies  on external programs as
> >> >> > possible.
> >> >>
> >> >> Agree.
> >> >>
> >> >> > Using crm_attribute to communicate the name of a master node would
> be
> >> >> > consistent with the rest of the script since you already use it for
> >> >> > communicating state of the nodes. If you prefer using crm_mon than
> >> >> > you
> >> >> > have
> >> >> > to add a check that that binary exists on the server. In 99.99% of
> >> >> > the
> >> >> > case
> >> >> > it will but still the check is necessary I think.
> >> >>
> >> >> I use crm_mon to check for node existence and node online too.
> >> >> It is impossible to get it from attribute, so I prefer to using
> >> >> crm_mon.
> >> >>
> >> >> Do I have to add a check for crm_master, crm_attribute, crm_failcount
> >> >> too?
> >> >>
> >> >
> >> > That's a different  interesting topic. What is the reason for doing
> >> > this?
> >> >
> >> > my_fail_count=`$CRM_FAILCOUNT -r $OCF_RESOURCE_INSTANCE -N $HOSTNAME
> -G
> >> > -Q |
> >> > sed "s/INFINITY/1000000/g"`
> >> > if [ "$my_fail_count" != "0" ]; then
> >> >       ocf_log info "I don't start PostgreSQL on post-demote because of
> >> > my
> >> > fail-count=$my_fail_count."
> >> >       return $OCF_SUCCESS
> >> > fi
> >> >
> >> > Why not to leave pacemaker to do its job on managing failcounts?
> >>
> >> Because I want to make fail-over faster.
> >> At any rate Pacemaker stops it?
> >
> >
> > I'm probably wrong here. Looks like you use this trick in case if slave's
> > data is newer than new master's data. In this case it makes sense to
> disable
> > such slave.
>
> Why do you think so ?
> The check whether slave's data is newer than new master's one  is here.
>
> in pgsql_post_demote()
> -------------------------------------------------------
> 886 is_slave_right || return $OCF_ERR_GENERIC
> -------------------------------------------------------
>
> This line is executed after this trick.
>
>
>
> Regards,
> Takatoshi MATSUO
> _______________________________________________________
> Linux-HA-Dev: [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
>



-- 
Serge Dubrouski.

--- pgsql.org	2012-02-05 09:26:09.000000000 -0700
+++ pgsql	2012-02-05 10:10:38.000000000 -0700
@@ -470,7 +470,7 @@
     if ! check_log_file $OCF_RESKEY_logfile
     then
         ocf_log err "PostgreSQL can't write to the log file: $OCF_RESKEY_logfile"
-	return $OCF_ERR_PERM
+        return $OCF_ERR_PERM
     fi
 
     # Check socket directory
@@ -485,18 +485,18 @@
     # Set options passed to the PostgreSQL server process
     postgres_options=""
     if [ -n "$OCF_RESKEY_config" ]; then
-	postgres_options="$postgres_options -c config_file=${OCF_RESKEY_config}"
+        postgres_options="$postgres_options -c config_file=${OCF_RESKEY_config}"
     fi
     if [ -n "$OCF_RESKEY_pghost" ]; then
-	postgres_options="$postgres_options -h $OCF_RESKEY_pghost"
+        postgres_options="$postgres_options -h $OCF_RESKEY_pghost"
     fi
     if [ -n "$OCF_RESKEY_start_opt" ]; then
-	postgres_options="$postgres_options $OCF_RESKEY_start_opt"
+        postgres_options="$postgres_options $OCF_RESKEY_start_opt"
     fi
 
     # Tack pass-through options onto pg_ctl options
     if [ -n "$postgres_options" ]; then
-	pgctl_options="$pgctl_options -o '$postgres_options'"
+        pgctl_options="$pgctl_options -o '$postgres_options'"
     fi
 
     # Invoke pg_ctl
@@ -506,7 +506,7 @@
 	# Probably started.....
         ocf_log info "PostgreSQL start command sent."
     else
-	ocf_log err "Can't start PostgreSQL."
+        ocf_log err "Can't start PostgreSQL."
         return $OCF_ERR_GENERIC
     fi
 
@@ -518,7 +518,7 @@
             break;
         fi
         sleep 1
-	ocf_log debug "PostgreSQL still hasn't started yet. Waiting..."
+        ocf_log debug "PostgreSQL still hasn't started yet. Waiting..."
     done
 
     ocf_log info "PostgreSQL is started."
@@ -544,7 +544,7 @@
     fi
 
     if [ -f $PGSQL_LOCK ]; then
-        ocf_log err "My data may be inconsistent. You have to remove $PGSQL_LOCK file to force to start."
+        ocf_log err "My data may be inconsistent. You have to remove $PGSQL_LOCK file to force start."
         return $OCF_ERR_GENERIC
     fi
     is_slave_right || return $OCF_ERR_GENERIC
@@ -552,7 +552,7 @@
     # start
     pgsql_real_start
     if [ $? -ne $OCF_SUCCESS ]; then
-            return $OCF_ERR_GENERIC
+        return $OCF_ERR_GENERIC
     fi
     change_pgsql_status "$HOSTNAME" "HS:alone"
     return $OCF_SUCCESS
@@ -564,7 +564,7 @@
     local rc
 
     if ! is_replication; then
-        ocf_log err "It's not replication mode."
+        ocf_log err "Not in a replication mode."
         return $OCF_ERR_CONFIGURED
     fi
     rm -f ${XLOG_NOTE_FILE}.*
@@ -613,7 +613,7 @@
     local rc
 
     if ! is_replication; then
-        ocf_log err "It's not replication mode."
+        ocf_log err "Not a replication mode."
         return $OCF_ERR_CONFIGURED
     fi
 
@@ -693,8 +693,8 @@
             # An unnecessary debug log is prevented.
             break;
         fi
-	sleep 1
-	ocf_log debug "PostgreSQL still hasn't stopped yet. Waiting..."
+        sleep 1
+        ocf_log debug "PostgreSQL still hasn't stopped yet. Waiting..."
     done
 
     # Remove postmaster.pid if it exists
@@ -765,11 +765,12 @@
 
     if ! pgsql_status
     then
-	ocf_log info "PostgreSQL is down"
-	return $OCF_NOT_RUNNING
+        ocf_log info "PostgreSQL is down"
+        return $OCF_NOT_RUNNING
     fi
 
     if is_replication; then
+        #Check replication state
         output=`su $OCF_RESKEY_pgdba -c "cd $OCF_RESKEY_pgdata; $OCF_RESKEY_psql $psql_options -U $OCF_RESKEY_pgdba -Atc \"${CHECK_MS_SQL}\""`
         rc=$?
         if [ $rc -ne  0 ]; then
@@ -852,7 +853,7 @@
 
     if [ ! -n "$is_master" ]; then
         # If I am Slave and Master is not exist
-        ocf_log info "Master is not exist."
+        ocf_log info "Master does not exist."
         change_pgsql_status "$HOSTNAME" "HS:alone"
         is_master_right
         if [ $? -eq 0 ]; then
@@ -915,7 +916,7 @@
     local cmp_location
     local number_of_nodes
 
-    # If my data is newer than new master's one, I fails my resource.
+    # If my data is newer than new master's one, I fail my resource.
     PROMOTE_NODE=`echo $OCF_RESKEY_CRM_meta_notify_promote_uname | sed "s/ /\n/g" | head -1`
     number_of_nodes=`echo $OCF_RESKEY_node_list | sed -e "s/  */ /g" | sed -e "s/^ \| $//g" | sed -e "s/ /\n/g" | wc -l`
     if [ $number_of_nodes -ge 3 -a "$OCF_RESKEY_rep_mode" = "sync" -a "$PROMOTE_NODE" != "$HOSTNAME" ]; then
@@ -1146,7 +1147,7 @@
     local oldfile
     local newfile
 
-    ocf_log debug "Checking right of master."
+    ocf_log debug "Checking if master is right."
 
     data_status=`$CRM_ATTR_FOREVER -N "$HOSTNAME" -n "$PGSQL_DATA_STATUS_ATTR" -G -q`
     if [ "$OCF_RESKEY_rep_mode" = "sync" ]; then
@@ -1164,7 +1165,7 @@
 
     show_xlog_location
     if [ $? -ne 0 ]; then
-        ocf_log err "Failed showing my xlog location."
+        ocf_log err "Failed to show my xlog location."
         exit $OCF_ERR_GENERIC
     fi
 
@@ -1210,7 +1211,7 @@
             return 0
         fi
         change_data_status "$HOSTNAME" "DISCONNECT"
-        ocf_log info "I don't have a master right."
+        ocf_log info "I don't have correct master data."
         # reset counter
         rm -f ${XLOG_NOTE_FILE}.*
         echo -e "$newfile" > ${XLOG_NOTE_FILE}.0
@@ -1322,7 +1323,7 @@
     fi
     location=`get_my_location`
     timelineid=`get_my_timeline_id`
-    ocf_log info "my master baseline : $timelineid:$location."
+    ocf_log info "My master baseline : $timelineid:$location."
     $CRM_ATTR_REBOOT -N "$HOSTNAME" -n "$PGSQL_MASTER_BASELINE" -v "$timelineid:$location"
     return $?
 }
@@ -1334,10 +1335,10 @@
 
 set_async_mode_all() {
     [ "$OCF_RESKEY_rep_mode" = "sync" ] || return 0
-    ocf_log info "Setup all nodes as an async."
+    ocf_log info "Set all nodes into async mode."
     runasowner -q err "echo "" > \"$REP_MODE_CONF\""
     if [ $? -ne 0 ]; then
-        ocf_log err "Can't setup all nodes as an async mode."
+        ocf_log err "Can't set all nodes into async mode."
         return 1
     fi
     return 0
@@ -1349,10 +1350,10 @@
     sync_node_in_conf=`cat $REP_MODE_CONF | cut -d "'" -f 2`
     if [ -n "$sync_node_in_conf" ]; then
         if ! echo $sync_node_in_conf | grep "$1"; then
-            ocf_log info "$1 is already an async mode."
+            ocf_log info "$1 is already in async mode."
             return 0
         else
-            ocf_log info "Setup $1 as an async mode."
+            ocf_log info "Setup $1 into async mode."
             sync_node_in_conf=`echo $sync_node_in_conf | sed "s/$1//g" |  sed "s/^,//g" | sed "s/,,/,/g" | sed "s/,$//g"`
             if [ -n $sync_node_in_conf ]; then
                 echo "synchronous_standby_names = '$sync_node_in_conf'" > "$REP_MODE_CONF"
@@ -1361,7 +1362,7 @@
             fi
         fi
     else
-        ocf_log info "$1 is already an async mode."
+        ocf_log info "$1 is already in async mode."
         return 0
     fi
 
@@ -1376,14 +1377,14 @@
     sync_node_in_conf=`cat $REP_MODE_CONF | cut -d "'" -f 2`
     if [ -n "$sync_node_in_conf" ]; then
         if echo "$sync_node_in_conf" | grep "$1"; then
-            ocf_log info "$1 is already a sync mode."
+            ocf_log info "$1 is already in sync mode."
             return 0
         else
-            ocf_log info "Setup $1 as a sync mode."
+            ocf_log info "Setup $1 into sync mode."
             echo "synchronous_standby_names = '$sync_node_in_conf,$1'" > "$REP_MODE_CONF"
         fi
     else
-        ocf_log info "Setup $1 as a sync mode."
+        ocf_log info "Setup $1 into sync mode."
         echo "synchronous_standby_names = '$1'" > "$REP_MODE_CONF"
     fi
 
@@ -1450,8 +1451,8 @@
     output=`$CRM_ATTR_REBOOT -N "$1" -n "$PGSQL_STATUS_ATTR" -G -q`
     if [ "$output" != "$2" ]; then
         # If slave's disk is broken, RA cannot read PID file
-        # and misjudges the PostgreSQL is down whether it is running.
-        # It causes overwriting pgsql-status by Master because replication is still connected.
+        # and misjudges the PostgreSQL as down while it is running.
+        # It causes overwriting of pgsql-status by Master because replication is still connected.
         if [ "$output" = "STOP" -o "$output" = "UNKNOWN" ]; then
             if [ "$1" != "$HOSTNAME" ]; then
                 ocf_log warn "Changing $PGSQL_STATUS_ATTR on $1 : $output->$2 by $HOSTNAME is prohibited."
@@ -1473,7 +1474,7 @@
 change_data_status() {
     local output
 
-    if ! is_node_exist $1; then
+    if ! node_exist $1; then
         return 0
     fi
 
@@ -1582,7 +1583,7 @@
     return $?
 }
 
-is_node_exist() {
+node_exist() {
     crm_mon -1 -n | grep -q "^Node $1"
     return $?
 }
@@ -1600,7 +1601,7 @@
 
     if [ ! -f "$1" ]; then
         if ocf_is_probe; then
-           ocf_log info "Configuration file $1 not readable during probe."
+           ocf_log info "Configuration file is $1 not readable during probe."
            rc=1
         else
            ocf_log err "Configuration file $1 doesn't exist"

_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/

Re: [Linux-ha-dev] pgsql and streaming replcation

Reply via email to