So, I have a wonderful setup right now with JRuby, torquebox, Roda and
Sequel connecting to PostgreSQL RDS instance in AWS. It is the most
productive stack I have worked on, it is truly beautiful to work on these
JSON/API micro services I have built. :)
One minor issue I am having is with multi-AZ failover. I am running some
tests with my multi-AZ RDS, and I am trying to minimize my downtime when an
outage occurs, or if I were to "upgrade" and resize the RDS database
instance (thus requiring a failover/reboot of the multi-AZ). Right now when
I initiate the reboot/failover of RDS - my API service takes requests and
just blocks as it is trying to connect to the DB. I have tried playing
around with different configuration options in Sequel - but none seem to
really affect anything. Basically the DB status says "rebooting" and each
incoming request blocks, eventually responding with 504 after about 60
seconds. At this point (or shortly after), the RDS database status changes
to "available", yet it still takes another 45 seconds before requests start
responding normally again.
Here is a sample of my top level config for the Sequel.connect:
DB = Sequel.connect(:adapter => 'postgres', :host => contacts_db_host,:user
=> contacts_db_user, :password => contacts_db_pwd, :database =>
contacts_db_name,
:servers => {:node_1 => DB_NODE_LIST[:node_1], :node_2
=> DB_NODE_LIST[:node_2], :read_only =>{}}, :servers_hash=>Hash.new{|h,v|
raise Exception.new("Unknown server: #{v}")},
:loggers => [SimpleLogger.logger], :max_connections =>
25, :connect_timeout => 5, :pool_timeout => 5)
I should note that I am running in jRuby 9.1.2.0 and with pg_jruby driver.
Adding/changing :connect_timeout => 5, :pool_timeout => 5 seems to have no
effect
Surprisingly I also added this line right after my config setup to enable
connection_validation in hopes that once it tried to fetch a connection
that was down it would quickly try again, but this made zero difference as
well.
DB.extension :connection_validator #use only for faster DB related
scheduled outages
DB.pool.connection_validation_timeout = -1
I have also adjusted the DNS cache TTL according to this doc to be 15
seconds =>
http://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-jvm-ttl.html
Here are the errors I see in the log:
16:45:05 +0000 severity=ERROR, error is PG::ConnectionBad: Connection timed
out, error backtrace => org/jruby/pg/Connection.java:366:in `initialize'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/adapters/postgres.rb:244:in
`connect'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/extensions/server_logging.rb:40:in
`connect'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool.rb:116:in
`make_new'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:286:in
`make_new'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:241:in
`available'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:181:in
`_acquire'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:195:in
`block in acquire'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/threaded.rb:282:in
`block in sync'
[NUQSO2] org/jruby/ext/thread/Mutex.java:151:in `synchronize'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/threaded.rb:282:in
`sync'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:194:in
`acquire'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/extensions/connection_validator.rb:98:in
`acquire'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:132:in
`hold'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/database/connecting.rb:285:in
`synchronize'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/adapters/postgres.rb:838:in
`literal_string_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:79:in
`literal_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:491:in
`complex_expression_sql_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/adapters/shared/postgres.rb:1302:in
`complex_expression_sql_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/model/associations.rb:2587:in
`complex_expression_sql_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/sql.rb:120:in
`to_s_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:1238:in
`literal_expression_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:86:in
`literal_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:499:in
`block in complex_expression_sql_append'
[NUQSO2] org/jruby/RubyArray.java:1593:in `each'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:497:in
`complex_expression_sql_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/adapters/shared/postgres.rb:1302:in
`complex_expression_sql_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/model/associations.rb:2587:in
`complex_expression_sql_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/sql.rb:120:in
`to_s_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:1238:in
`literal_expression_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:86:in
`literal_append'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:1483:in
`select_where_sql'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:240:in
`select_sql'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/actions.rb:762:in
`single_value!'
[NUQSO2]
/opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/actions.rb:106:in
`count'
[NUQSO2] /opt/api-contacts/models/contact.rb:137:in `count_by_query_params'
This error is logged after AWS says that the Multi-AZ instance failover
completed. Any help on how to better handle this scenario? For unplanned
downtime (unexpected database outage) I am ok with ~2 minutes downtime, but
would love to shorten this when we are dealing with a known outage if
possible. Is there a way to flush and reload the pool completely on some
error or event? Would that help? Any production grade connection settings,
tips, etc you could provide would be really helpful.
Thanks!
--
You received this message because you are subscribed to the Google Groups
"sequel-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sequel-talk.
For more options, visit https://groups.google.com/d/optout.