Hi Kyle,

As for your question about how to configure the standby host: when the standby
node (which is configured in hawq-site.xml) is started, it automatically
registers its info in the system table gp_segment_configuration (see
http://hdb.docs.pivotal.io/20/reference/catalog/gp_segment_configuration.html),
so that HAWQ can use this info internally in the catalog. If you need more
details about it, @wen lin can help you.
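
For example, a quick way to double-check that the standby actually registered
itself after startup is to query that catalog table from the master. This is
just an illustrative sketch: the role value 's' for the standby and the column
names are assumptions based on the catalog page linked above, so please verify
them against your build:

import subprocess

def standby_registered(master_host="localhost", port=5432, dbname="template1"):
    # role = 's' is assumed to mark the standby master in gp_segment_configuration;
    # check the catalog page above for the exact column names in your release.
    sql = "SELECT hostname, status FROM gp_segment_configuration WHERE role = 's';"
    proc = subprocess.Popen(
        ["psql", "-h", master_host, "-p", str(port), "-d", dbname, "-At", "-c", sql],
        stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    rows = [line for line in out.decode("utf-8", "replace").splitlines() if line]
    return rows  # empty list => no standby row found

if __name__ == "__main__":
    print(standby_registered() or "no standby registered")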

The standby then reports to the master the LSN of the WAL it has synced. Based
on this LSN, the master checks whether the gap between master and standby is
still present in the xlog files or has already been overwritten (because xlog
files are recycled). If the gap is no longer in the xlog files, we cannot do
anything further and just report "out of sync", which requires manually running
hawq init standby to recreate the standby node; otherwise we just push the WAL
after this LSN to the standby node and redo it there. For any problems with the
standby-related scripts, @radar can help.
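
To make that decision a bit more concrete, here is a minimal illustrative
sketch of the check; this is not the actual walsendserver.c logic, and the
function and variable names are made up for the example:

def resync_action(standby_lsn, oldest_retained_lsn, current_lsn):
    """Decide how to resync once the standby reports the last LSN it synced.

    standby_lsn         -- last WAL position the standby confirmed writing
    oldest_retained_lsn -- start of the oldest xlog segment not yet recycled
    current_lsn         -- master's current WAL insert position
    """
    if standby_lsn < oldest_retained_lsn:
        # The WAL the standby still needs has been recycled, so the gap cannot
        # be replayed; the operator must run "hawq init standby" to recreate it.
        return ("out_of_sync", None)
    # Otherwise push everything from the standby's LSN up to current and redo it.
    return ("stream", (standby_lsn, current_lsn))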

In most cases the standby carries less workload than the master, so I suggest
we could implement it as follows:
(1) The master pushes WAL to the standby node; when the standby receives it, it
first writes it to file, then reports success to the master, so transaction
commit is not blocked.
(2) The standby node redoes the WAL locally, and at the same time it needs to
guarantee that the WAL is transferred to the remote DR node. We can support
different sync policies (whether a commit must wait until the WAL has been
transferred to the remote node), trading off transaction commit latency against
the data loss that is acceptable at the remote node (a rough sketch of this
policy choice follows below).
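
Here is a rough sketch of how the commit-time policy in (2) could be expressed.
The policy names and the ack flags are hypothetical, just to illustrate the
latency vs. data-loss trade-off; they are not an existing HAWQ setting:

LOCAL_STANDBY_ONLY = "local_standby"   # ack commit once the standby wrote the WAL
REMOTE_DR_SYNC     = "remote_dr_sync"  # also wait for the remote DR node

def can_ack_commit(policy, standby_wrote_wal, dr_node_received_wal):
    """Return True when the commit may be acknowledged to the client."""
    if not standby_wrote_wal:
        # Step (1): the standby must at least have written the WAL to file.
        return False
    if policy == REMOTE_DR_SYNC:
        # Stricter policy: wait for the remote DR node too -- higher commit
        # latency, but less data loss if the whole primary site is lost.
        return dr_node_received_wal
    # Looser policy: ack as soon as the local standby has the WAL; the transfer
    # to the DR node continues asynchronously.
    return True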

More to discuss:
(1) If the standby reports "out of sync" and the gap is no longer available on
the master node, we need to re-init the standby manually, which requires
shutting down the master node. We should think about a stronger policy for this
scenario, e.g. also push the WAL to other nodes and keep a duplicate copy, or
even write it into HDFS directly?
(2) If the multi-master feature is implemented, the design may need to change.
I haven't spent time on that yet.

Any comments or suggestions are welcome. Thanks.


On Fri, Sep 9, 2016 at 1:22 AM, Kyle Dunn <kd...@pivotal.io> wrote:

> Ming -
>
> Thank you for the info, this is very helpful in understanding how WAL
> shipment happens.
>
> One question I have is: if/where the destination host is configured in
> walsendserver.c? Alternatively, does a standby master client initiate the
> request rather than the active master pushing out WALs as they become
> available? I ask because for a more robust DR solution than what I'm
> currently working on would allow multiple standby targets (i.e. one
> traditional standby, one DR mirror, etc.)
>
> At the moment I've opted for an approach that stops the active HAWQ master,
> creates a tarball of the entire MASTER_DATA_DIRECTORY, archives it on HDFS,
> then invokes distcp via Apache Falcon to mirror /hawq_default in HDFS to
> the DR site. After a DR event there would be some manual process to restore
> said archive and update the hostname / DFS references to reflect the actual
> DR environment.
>
> This approach is a step in the right direction but the act of creating the
> tarball necessitates a brief HAWQ master outage (currently ~1 minute when
> excluding pg_log contents and not compressing), whereas extending the
> walserver code could avoid any outage by allowing WAL replication to have
> multiple destinations.
>
> The top-level code for orchestrating this process is currently written in
> Python 2.6 compatible code - I'd like to have some review of it by the DEV
> team, if possible, as a first step to a future PR for "HAWQ DR" via Falcon.
>
> Thoughts?
>
>
> -Kyle
>
> On Mon, Sep 5, 2016 at 9:41 AM Ming Li <m...@pivotal.io> wrote:
>
> > Hi,
> >
> > The general idea please refer to PostgreSQL:
> >
> > https://www.pgcon.org/2008/schedule/attachments/61_Synchronous%20Log%20Shipping%20Replication.pdf
> >
> >
> > Here just share some info about standby code.
> >
> > The standby related code is here:
> > src/backend/postmaster/walredoserver.c
> > src/backend/postmaster/walsendserver.c
> >
> > Global pic:
> > - Backend generate WAL and pass it to the forked process "WAL Sender", the
> > calling stack is: XLogQDMirrorWrite() => WalSendServerClientSendRequest()
> >
> > - "WAL sender" process will be forked up and loop for processing request
> > and response, the calling stack is:
> > walsendserver_forkexec() -> walsendserver_start() -> ServiceMain() ->
> > ServiceListenLoop() -> ServiceProcessRequest() ->
> > serviceConfig->ServiceRequest()
> > -> WalSendServer_ServiceRequest()
> >
> > - "WAL Sender" send WAL to "WAL Receiver" which is on the standby node,
> the
> > calling stack is:
> > WalSendServer_ServiceRequest() => WalSendServerDoRequest() =>
> > disconnectMirrorQD_SendClose() => write_qd_sync() => PQsendQuery()
> >
> > - On the standby side, all API are similar, e.g. walredoserver_forkexec()
> > vs walsendserver_forkexec()
> >
> > Hope it helps you! ~_~
> >
> >
> >
> > On Thu, Aug 11, 2016 at 1:09 AM, Kyle Dunn <kd...@pivotal.io> wrote:
> >
> > > Hello,
> > >
> > > I'm investigating DR options for HAWQ and was curious about the existing
> > > master catalog synchronization process. My question is mainly around what
> > > this process does at a high level and where I might look in the code base
> > > or management tools to see about extending it for additional standby
> > > masters (e.g. one in a geographically distant data center and/or different
> > > logical HAWQ cluster). The assumption is the HDFS blocks would be
> > > replicated by something like distcp via Falcon.
> > >
> > > I believe there are obvious things to address like DFS / namenode URI
> > > parameters, FQDNs, and certainly failure scenarios / edge cases, but I'm
> > > mainly trying to get a dialog started to see what input, ideas, and
> > > considerations others have. One thing I'm specifically interested in is
> > > whether / how WAL can be used (@Keaton).
> > >
> > >
> > > Thanks,
> > > Kyle
> > > --
> > > *Kyle Dunn | Data Engineering | Pivotal*
> > > Direct: 303.905.3171 | Email: kd...@pivotal.io
> > >
> >
> --
> *Kyle Dunn | Data Engineering | Pivotal*
> Direct: 303.905.3171 | Email: kd...@pivotal.io
>
