Hi Serge I think about removing code of restarting PostgreSQL on demote. Because almost restarts are failed to avoid data inconsistency. In addition this check is complex. PostgreSQL developer says that this occurrence is specific and can't be fix immediately.
What do you think ? Regards, Takatoshi MATSUO 2012/2/6 Takatoshi MATSUO <[email protected]>: > Hi Serge > > Thank you. > I applied it to my repository. > > > 2012/2/6 Serge Dubrouski <[email protected]>: >> Takatoshi - >> >> Please consider apllying attached patch to your version of pgsql RA. It's >> all cosmetics: replacing tabs with spaces and improving English. Thanks for >> implementing recommendations I expressed earlier. >> >> >> On Tue, Dec 13, 2011 at 3:37 AM, Takatoshi MATSUO <[email protected]> >> wrote: >>> >>> Hello Serge >>> >>> 2011/12/13 Serge Dubrouski <[email protected]>: >>> > >>> > >>> > On Mon, Dec 12, 2011 at 4:28 AM, Takatoshi MATSUO <[email protected]> >>> > wrote: >>> >> >>> >> Hello Serge >>> >> >>> >> 2011/12/12 Serge Dubrouski <[email protected]>: >>> >> > >>> >> > >>> >> > On Thu, Dec 8, 2011 at 10:04 PM, Takatoshi MATSUO >>> >> > <[email protected]> >>> >> > wrote: >>> >> >> >>> >> >> Hello Serge >>> >> >> >>> >> >> 2011/12/8 Serge Dubrouski <[email protected]>: >>> >> >> > >>> >> >> > >>> >> >> > On Mon, Dec 5, 2011 at 9:15 PM, Takatoshi MATSUO >>> >> >> > <[email protected]> >>> >> >> > wrote: >>> >> >> >> >>> >> >> >> Hello Serge >>> >> >> >> >>> >> >> >> Serge Dubrouski <[email protected]>: >>> >> >> >> > Hello - >>> >> >> >> > >>> >> >> >> > Takatoshi MATSUO did a tremendous job on implementing support >>> >> >> >> > for >>> >> >> >> > streaming >>> >> >> >> > replication feature in pgsql RA. Also it looks like PostgeSQL >>> >> >> >> > 9.1 >>> >> >> >> > has >>> >> >> >> > all >>> >> >> >> > necessary interfaces to successfully implement Pacemaker's M/S >>> >> >> >> > concept. >>> >> >> >> > So >>> >> >> >> > I think it's time to start discussion on how to merge >>> >> >> >> > Takatoshi's >>> >> >> >> > work >>> >> >> >> > into >>> >> >> >> > pgsql RA baseline. Here is the link to Takatoshi's GitHUB if >>> >> >> >> > somebody >>> >> >> >> > wants >>> >> >> >> > to test his RA: >>> >> >> >> > >>> >> >> >> > https://github.com/t-matsuo/ >>> >> >> >> > >>> >> >> >> > So far I tested it for backward compatibility in a standard >>> >> >> >> > non-replication >>> >> >> >> > mode and also tested M/S model and found no real issues. >>> >> >> >> > Though >>> >> >> >> > it >>> >> >> >> > definitely requires some more polishing and testing. >>> >> >> >> > >>> >> >> >> > Takatoshi, here are some changes that I want to discuss with >>> >> >> >> > you: >>> >> >> >> > >>> >> >> >> > 1. Is it possible to add a check for PostgreSQL version and >>> >> >> >> > fail >>> >> >> >> > with >>> >> >> >> > OCF_ERR_INSTALLED when one tries to start replication on >>> >> >> >> > version >>> >> >> >> > less >>> >> >> >> > than >>> >> >> >> > 9.1? A simple cat on PG_VERSION with some analysis would >>> >> >> >> > probably >>> >> >> >> > do. >>> >> >> >> >>> >> >> >> I'll add a check. >>> >> >> >>> >> >> I added a check. >>> >> >> >>> >> >> >>> >> >> >>> >> >> https://github.com/t-matsuo/resource-agents/commit/3ab7cfdcce118043cd149b348740e50e7a946eb3 >>> >> >> >>> >> >> >> >>> >> >> >> > 2. I think that following lines should be moved from >>> >> >> >> > pgsql_start >>> >> >> >> > to >>> >> >> >> > pgsql_validate_all >>> >> >> >> > >>> >> >> >> > 535 # Check whether tmpdir is readable by pgdba user >>> >> >> >> > 536 if ! runasowner "test -r $OCF_RESKEY_tmpdir"; then >>> >> >> >> > 537 ocf_log err "Directory $OCF_RESKEY_tmpdir is not >>> >> >> >> > readable >>> >> >> >> > by >>> >> >> >> > $OCF_RESKEY_pgdba" >>> >> >> >> > 538 return $OCF_ERR_PERM >>> >> >> >> > 539 fi >>> >> >> >> >>> >> >> >> Thanks. I think so too. >>> >> >> >> I'll fix it. >>> >> >> >> >>> >> >> >>> >> >> I fixed it and I deleted a check for tmpdir existence >>> >> >> because the checking for permittion fills the role. >>> >> >> >>> >> >> >>> >> >> >>> >> >> https://github.com/t-matsuo/resource-agents/commit/82d4939486bcca429e2deb804d7faf756099bb59 >>> >> >> >>> >> >> >>> >> >> > On a second thought I'm not sure why we need that parameter and >>> >> >> > directory at >>> >> >> > all. Why not to create rep_mode.conf, PGSQL.lock and xlog.note in >>> >> >> > $OCF_RESKEY_pgdata ? What problems it can create? >>> >> >> > >>> >> >> > One more advantage to do >>> >> >> > it in $OCF_RESKEY_pgdata is an ability to handle more than >>> >> >> > PostgreSQL >>> >> >> > instance on the same server without a need for additional temp >>> >> >> > directories. >>> >> >> >>> >> >> When backup is needed, customers may backup these files and restore >>> >> >> it. >>> >> >> It may cause problems. >>> >> >> Specially PGSQL.lock causes an error on start. >>> >> >> >>> >> >> I think that they should be treated as a separate thing. >>> >> >> because they are independent from PostgreSQL's data. >>> >> > >>> >> > >>> >> > Ok. Then may be it should default to something like >>> >> > ${OCF_RESKEY_pgdata}/temp and RA should create it and set the right >>> >> > ownership and permissions if it doesn't exist? And again may be in >>> >> > this >>> >> > case >>> >> > we don't need that parameter? >>> >> > >>> >> >>> >> I agree that RA creates it and sets the right ownership and >>> >> permissions. >>> >> But considering backup using tar command, I think it's not better to >>> >> create it >>> >> under $OCF_RESKEY_pgdata. >>> >> >>> >> According to "Filesystem Hierarchy Standard" >>> >> it's better to use /var/lib/somewhere. >>> >> >>> >> >>> >> >>> >> http://www.pathname.com/fhs/pub/fhs-2.3.html#VARLIBVARIABLESTATEINFORMATION, >>> >> >>> >> Then I designed using /var/lig/[RA Name] (=/var/lib/pgsql). >>> >> What do you think? >>> > >>> > >>> > Ahh, I see now where confusion comes from. /var/lib/pgsql is actually >>> > taken >>> > :-) It's used as default home directory in RedHat (at least) for >>> > postgres >>> > user in PostgreSQL 8.XX. When they'll start start shipping version 9 >>> > they'll >>> > probably use it as well. OCF_RESKEY_pg_data defaults to >>> > /var/lib/pgsql/data. >>> > >>> > Then, /var/lib is usually used for non-temporary data, For temporary >>> > files >>> > it's probably better to use /var/run or /var/tmp. If you still want to >>> > use >>> > /var/lib/pgsql you probably need to use /var/lib/pgsql/temp or so. >>> >>> Files under /var/run or /var/tmp are cleared by tmpwatch or at the >>> beginning >>> of the boot process, aren't they? >>> >>> Clearing PGSQL.lock file causes problem. >>> >>> >>> > >>> >> >>> >> >> >>> >> >> >>> >> >> Incidentally I considered to handle more than PostgreSQL instance. >>> >> >> Initially I added port number to these filenames, but I deleted it >>> >> >> to simplify filenames. >>> >> >> >>> >> >> >>> >> >> >>> >> >> https://github.com/t-matsuo/resource-agents/commit/b16faf2d797200048dc0fc07a45b6751cf5be190 >>> >> >> >>> >> >> >>> >> >> > Also I think it would be good if RA was able to take care of >>> >> >> > adding >>> >> >> > "include >>> >> >> > $WHATEVER_DIR/rep_mode.conf" in postgresql.conf. It will make the >>> >> >> > RA >>> >> >> > self >>> >> >> > sustainable. In a current situation admin has add that directive >>> >> >> > manually. >>> >> >> > RA though can something like this in a start function for >>> >> >> > replication >>> >> >> > mode: >>> >> >> > >>> >> >> > if ! grep -i "include $WHATEVER_DIR/rep_mode.conf" >>> >> >> > $OCF_RESKEY_config >>> >> >> > then >>> >> >> > echo "include $WHATEVER_DIR/rep_mode.conf" >> >>> >> >> > $OCF_RESKEY_config >>> >> >> > fi >>> >> >> >>> >> >> Sounds good. >>> >> >> >>> >> >> > Don't know if it makes sense to remove it on stop. >>> >> >> >>> >> >> I think it doesn't make sense to remove it, >>> >> >> because rep_mode.conf becomes empty on stop. >>> >> > >>> >> > >>> >> > The only problem here if admin changes temp_dir parameter but doesn't >>> >> > delete >>> >> > records from postgres.conf. We could end up with several include >>> >> > records >>> >> > then or with records pointing to the non existing files. >>> >> >>> >> Should we consider this situation? >>> >> >>> >> If same parameters exist, PostgreSQL uses last parameter. >>> >> So I will implement it to add several include if temp_dir is changed. >>> > >>> > >>> > The problem happens when file gets deleted but reference to it still >>> > exists >>> > in postgres.conf. They DB fails to start. >>> >>> >>> Start manually? >>> The RA manages some statuses using an attribute and PGSQL.lock file. >>> I don't recommend starting it manually in the first place. >>> >>> >>> How about just describing it in a document? meta-data? >>> >>> >>> > >>> >> >>> >> >>> >> >>> >> >> >>> >> >> >> > 3. I don't really like this part of pgsql_real_monitor: >>> >> >> >> > >>> >> >> >> > 775 if ! is_replication; then >>> >> >> >> > 776 OCF_RESKEY_monitor_sql=`escape_string > >>> >> >> >> > "$OCF_RESKEY_monitor_sql"` >>> >> >> >> > 777 runasowner -q $loglevel "$OCF_RESKEY_psql >>> >> >> >> > $psql_options >>> >> >> >> > -c >>> >> >> >> > > '$OCF_RESKEY_monitor_sql'" >>> >> >> >> > 778 rc=$? >>> >> >> >> > 779 else >>> >> >> >> > 780 output=`su $OCF_RESKEY_pgdba -c "cd >>> >> >> >> > $OCF_RESKEY_pgdata; >>> >> >> >> > > >>> >> >> >> > $OCF_RESKEY_psql $psql_options -Atc \"${CHECK_MS_SQL}\""` >>> >> >> >> > 781 rc=$? >>> >> >> >> > 782 fi >>> >> >> >> > >>> >> >> >> > I think that functional monitor (the one that uses monitor_sql) >>> >> >> >> > should >>> >> >> >> > run >>> >> >> >> > always independently of DB mode since its primary role is to >>> >> >> >> > check >>> >> >> >> > data >>> >> >> >> > and >>> >> >> >> > fail if it's not correct or corrupted. In replication mode >>> >> >> >> > there >>> >> >> >> > should >>> >> >> >> > be >>> >> >> >> > additional monitoring. Other way it misleads customer on a >>> >> >> >> > usage >>> >> >> >> > of >>> >> >> >> > monitor_sql. >>> >> >> >> >>> >> >> >> All right. >>> >> >> >> >>> >> >> >> Does it need to execute "select now();" if monitor_sql parameter >>> >> >> >> is >>> >> >> >> empty? >>> >> >> >> I think it's unnecessary. >>> >> >> > >>> >> >> > >>> >> >> > For a case of an empty parameter you are right and running "select >>> >> >> > now()" >>> >> >> > probably unnecessary, Non-empty monitro_sql shall be executed in >>> >> >> > my >>> >> >> > opinion. >>> >> >> > >>> >> >> >> >>> >> >> >> >>> >> >> >> > 4. You already populate several attributes with crm_attribute. >>> >> >> >> > Does >>> >> >> >> > it >>> >> >> >> > make >>> >> >> >> > sense to populate a name of a Master node in promote function >>> >> >> >> > and >>> >> >> >> > use >>> >> >> >> > it >>> >> >> >> > later on instead of running crm_mon on each monitor command? >>> >> >> >> > >>> >> >> >> >>> >> >> >> Do you mean that you don't want to use crm_mon in monitor? >>> >> >> > >>> >> >> > >>> >> >> > I prefer to have as few dependencies on external programs as >>> >> >> > possible. >>> >> >> >>> >> >> Agree. >>> >> >> >>> >> >> > Using crm_attribute to communicate the name of a master node would >>> >> >> > be >>> >> >> > consistent with the rest of the script since you already use it >>> >> >> > for >>> >> >> > communicating state of the nodes. If you prefer using crm_mon than >>> >> >> > you >>> >> >> > have >>> >> >> > to add a check that that binary exists on the server. In 99.99% of >>> >> >> > the >>> >> >> > case >>> >> >> > it will but still the check is necessary I think. >>> >> >> >>> >> >> I use crm_mon to check for node existence and node online too. >>> >> >> It is impossible to get it from attribute, so I prefer to using >>> >> >> crm_mon. >>> >> >> >>> >> >> Do I have to add a check for crm_master, crm_attribute, >>> >> >> crm_failcount >>> >> >> too? >>> >> >> >>> >> > >>> >> > That's a different interesting topic. What is the reason for doing >>> >> > this? >>> >> > >>> >> > my_fail_count=`$CRM_FAILCOUNT -r $OCF_RESOURCE_INSTANCE -N $HOSTNAME >>> >> > -G >>> >> > -Q | >>> >> > sed "s/INFINITY/1000000/g"` >>> >> > if [ "$my_fail_count" != "0" ]; then >>> >> > ocf_log info "I don't start PostgreSQL on post-demote because >>> >> > of >>> >> > my >>> >> > fail-count=$my_fail_count." >>> >> > return $OCF_SUCCESS >>> >> > fi >>> >> > >>> >> > Why not to leave pacemaker to do its job on managing failcounts? >>> >> >>> >> Because I want to make fail-over faster. >>> >> At any rate Pacemaker stops it? >>> > >>> > >>> > I'm probably wrong here. Looks like you use this trick in case if >>> > slave's >>> > data is newer than new master's data. In this case it makes sense to >>> > disable >>> > such slave. >>> >>> Why do you think so ? >>> The check whether slave's data is newer than new master's one is here. >>> >>> in pgsql_post_demote() >>> ------------------------------------------------------- >>> 886 is_slave_right || return $OCF_ERR_GENERIC >>> ------------------------------------------------------- >>> >>> This line is executed after this trick. >>> >>> >>> >>> Regards, >>> Takatoshi MATSUO >>> >>> _______________________________________________________ >>> Linux-HA-Dev: [email protected] >>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev >>> Home Page: http://linux-ha.org/ >> >> >> >> >> -- >> Serge Dubrouski. >> > > -- > Takatoshi MATSUO _______________________________________________________ Linux-HA-Dev: [email protected] http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev Home Page: http://linux-ha.org/
