On 02-11-2017 16:43, Jeremy Evans wrote:
On Thursday, November 2, 2017 at 6:57:57 PM UTC+1, Rodrigo Rosenfeld Rosas wrote:

    Currently we have quite a few clients depending on the
    application. Coordinating a scheduled maintenance window is a
    complicated and lengthy process, as I have to check with both my
    employer (which would certainly prefer to avoid any upgrades for
    years, or for as long as possible) and with its clients, some of
    whom have very rigid rules for maintenance involving downtime.
    I'd also usually have to work on weekends or at night, whenever
    they accept a maintenance window. As you can probably guess, I
    much prefer spending more time figuring out how to reduce the
    downtime as much as possible than trying to orchestrate a
    planned migration downtime window.

    For regular tasks, I can deploy many times a day without any
    downtime by performing blue-green (canary) deployments. However,
    upgrading the database is one of those tasks that I can't yet do
    without downtime. This time I got so close that I decided to
    perform the upgrade without requesting a maintenance window,
    because it would take just a few seconds of downtime and I could
    afford dealing with any errors that might happen in that small
    window. Nothing critical would result; users would just have to
    retry their operation, as if their Internet connection had had a
    temporary issue.

    But I'd prefer to keep the window as short as possible. By using
    pglogical, I was able to get PG10 in sync with PG96 in such a
    way that when I stop PG96, PG10 has the latest data and I can
    then restart PG10 on the production port. All of that takes only
    about 1s in our production environment. So far so good; however,
    due to this problem with OIDs I also have to restart the
    applications dealing with the database (the main app, the admin
    app and the Sidekiq cluster). Restarting those apps adds a few
    extra seconds to the downtime window, and if a user tried to
    load the application while it was being restarted, they would
    get an error page from nginx, which I'd really prefer to avoid
    so I don't have to explain things to someone e-mailing me a
    screenshot of that nginx error page.


You should be able to figure out the PG10 OIDs well before the switchover.  OIDs do not change after the schema is created.  What takes time is copying the data from PG96 to PG10; once the data is copied, you can switch.

Note that even if you want to wait until all the data is copied, that should be fine. It should be possible to use pglogical to replicate from PG96 to PG10 for an arbitrarily long timeframe, certainly far longer than it would take to determine the PG96-to-PG10 OID mapping.  You can restart your Ruby processes running Sequel on a rolling basis so that no connections are ever dropped. Once all processes have been replaced by new processes that are aware of the PG96-to-PG10 OID mapping, you can switch the database so that PG10 runs on the production port.  This should result in zero downtime (at least no downtime caused by Sequel).
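For illustration, here is a hypothetical sketch (names and sample rows are invented, not from this thread) of computing that mapping for user-defined types, assuming `typname`/`oid` rows have already been fetched from each server with something like `SELECT typname, oid FROM pg_type WHERE oid >= 16384` (16384 is the first OID assigned to user-defined objects):

```ruby
# Build a PG96-OID => PG10-OID mapping by joining the two servers'
# pg_type rows on type name.  The literal arrays below stand in for
# real query results.
def oid_mapping(pg96_rows, pg10_rows)
  pg10_by_name = pg10_rows.map { |r| [r[:typname], r[:oid]] }.to_h
  pg96_rows.each_with_object({}) do |row, map|
    new_oid = pg10_by_name[row[:typname]]
    map[row[:oid]] = new_oid if new_oid
  end
end

PG96_ROWS = [{typname: 'my_enum', oid: 16400}, {typname: 'my_row', oid: 16410}]
PG10_ROWS = [{typname: 'my_enum', oid: 16425}, {typname: 'my_row', oid: 16430}]
oid_mapping(PG96_ROWS, PG10_ROWS)  # => {16400 => 16425, 16410 => 16430}
```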

For an example of how Sequel gets the type OIDs, here's the specific related code:

https://github.com/jeremyevans/sequel/blob/master/lib/sequel/adapters/shared/postgres.rb#L205
https://github.com/jeremyevans/sequel/blob/master/lib/sequel/extensions/pg_array.rb#L130
https://github.com/jeremyevans/sequel/blob/master/lib/sequel/extensions/pg_row.rb#L424

One solution may be to use Sequel's sharding support: add a shard for PG10 (with the default shard using PG96), and at startup register all used types twice, the second time inside a with_server block for the PG10 shard.

    Sometimes I need to restart PostgreSQL, and that's why we set up
    Sequel to always verify that the connection is valid before
    using it, to avoid errors in production when the database is
    restarted due to some change to postgresql.conf, for example. If
    I could set up the application so that it would fix itself when
    the database is upgraded, that would be awesome, even if it
    added some overhead to the application. Server-side performance
    is barely a concern in our application, after all. Keeping the
    app working all the time is much more of a concern, as is
    keeping the dependencies as up to date as possible.


That's unexpected downtime, and the connection_validator can handle that well.  This PG96 to PG10 database switchover is not unexpected and can be adequately planned for using the process I explained above.

    I know there are some PG load balancer solutions that allow one
    to start enqueueing new requests so that the database can be
    restarted, after which the load balancer resumes its operation
    and returns the connections from the queue without the user
    noticing that the database has restarted or been upgraded;
    except, that is, for this issue with OIDs changing across
    upgrades. Another solution would be to keep the same OIDs when
    upgrading, if that's possible. Is it possible to keep the same
    OIDs when upgrading?

I'm not sure if it is possible to keep the same type OIDs. You would have to ask on a PostgreSQL specific list.

I don't think you need a different load balancer if you follow the process I laid out above.

I read the rest of your message.  I realize you have it in your mind that this is something that should be done on the fly, but I don't think that will work well for reasons I have already explained.  I advise you to consider the process I described above.


I think I'm pretty lost with this OID mapping thing. I performed the following query on both PG 9.6 and PG 10 and got the same result:

select typname, oid, typarray from pg_type where typname = 'json';
 typname | oid | typarray
---------+-----+----------
 json    | 114 |      199

It seems like the OID for built-in types such as JSON, assuming I'm looking at the right columns, should remain the same when I upgrade the database. However, if I remember correctly, the application was interpreting a JSON column as if it were TEXT, which means I'm missing something here, because it doesn't seem to be simply a matter of ensuring those type OIDs are mapped...

I mean, I have just a handful of custom types, and it would probably be easy to add those mappings to the initialization code and deploy a new version before switching to the new DB version. However, I'm not sure whether this would be enough.

But I was also wondering: can't this approach lead to conflicts? I mean, maybe the same OID is present in both databases but requires different conversion procs.

For example, if I understood correctly, in one of your proposed solutions I'd add PG 10 as a read-only replica, keep 9.6 as the master, and deploy the new version. I'd register the custom types in both databases using a with_server block. Then I'd do something like this:

Inside each with_server block I'd assign variables such as:

pg96_conversion_procs = DB.conversion_procs # in the PG 9.6 with_server block
pg10_conversion_procs = DB.conversion_procs # in the PG 10 with_server block

(pg10_conversion_procs.keys - pg96_conversion_procs.keys).each do |k|
  pg96_conversion_procs[k] = pg10_conversion_procs[k]
end

Assuming there's no conflicting OIDs between both databases, it should be safe to switch the databases without restarting the application.
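To make that conflict concern concrete, here's a hypothetical sketch (the helper name is invented) that merges the PG10 procs into the PG96 set while flagging any OID that is registered in both databases with a different proc:

```ruby
# Merge PG10 conversion procs into the PG96 ones, preferring the
# PG96 proc for shared OIDs, and report OIDs whose procs differ.
def merge_conversion_procs(pg96_procs, pg10_procs)
  conflicts = pg10_procs.keys.select do |oid|
    pg96_procs.key?(oid) && pg96_procs[oid] != pg10_procs[oid]
  end
  [pg10_procs.merge(pg96_procs), conflicts]
end

to_int = ->(s) { s.to_i }
to_str = ->(s) { s }
pg96 = {16400 => to_int}                   # custom type on PG96
pg10 = {16400 => to_str, 16425 => to_int}  # same OID, different proc
merged, conflicts = merge_conversion_procs(pg96, pg10)
# conflicts == [16400]; merged keeps the PG96 proc for 16400
```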

Did I understand your proposal correctly this time? I still have this feeling that it wouldn't be enough, because of those errors related to JSON columns being interpreted as TEXT when I switched database versions... After all, the JSON OID didn't change between those servers.

I'll try to find some time to experiment with this approach and will let you know if I run into any issues.

Thanks,
Rodrigo.

--
You received this message because you are subscribed to the Google Groups 
"sequel-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sequel-talk.
For more options, visit https://groups.google.com/d/optout.
