I have a wonderful setup right now with JRuby, TorqueBox, Roda, and 
Sequel connecting to a PostgreSQL RDS instance in AWS. It is the most 
productive stack I have worked on, and the JSON/API microservices I have 
built on it are a pleasure to work with. :) 

One minor issue I am having is with Multi-AZ failover. I am running some 
tests against my Multi-AZ RDS instance, trying to minimize downtime when an 
outage occurs or when I resize the RDS instance (which requires a 
failover/reboot of the Multi-AZ deployment). Right now, when I initiate the 
reboot/failover of RDS, my API service keeps accepting requests, and each 
one blocks while it tries to connect to the DB. I have tried different 
configuration options in Sequel, but none seem to affect anything. While 
the DB status says "rebooting", each incoming request blocks and eventually 
responds with a 504 after about 60 seconds. At that point (or shortly 
after), the RDS status changes to "available", yet it takes another 45 
seconds before requests start responding normally again.
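
To put numbers on these two windows, a small probe could be run alongside the failover test. This is only a sketch: record_outages is a hypothetical helper (not part of Sequel), and the probe is assumed to be any callable that raises while the database is unreachable.

```ruby
# Hypothetical helper: calls `probe` repeatedly and records outage windows.
# `probe` is any callable that raises when the database is unreachable,
# for example -> { DB.get(1) } in the real service (assumption).
# `clock` is injectable so the timing logic can be exercised without waiting.
def record_outages(probe, checks:, interval: 1.0, clock: -> { Time.now })
  outages = []       # list of [started_at, ended_at] pairs
  down_since = nil   # time of the first failure in the current outage

  checks.times do
    begin
      probe.call
      if down_since
        # First success after a run of failures closes the window.
        outages << [down_since, clock.call]
        down_since = nil
      end
    rescue StandardError
      # Remember only the first failure of the current outage.
      down_since ||= clock.call
    end
    sleep(interval) if interval > 0
  end

  # Still down when the checks ran out: leave the end time open.
  outages << [down_since, nil] if down_since
  outages
end
```

Each returned pair is the time of the first failed probe and the time of the first successful probe after it, so the difference approximates the downtime seen by the application rather than the status shown in the RDS console.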

Here is a sample of my top level config for the Sequel.connect:

DB = Sequel.connect(
  :adapter => 'postgres',
  :host => contacts_db_host,
  :user => contacts_db_user,
  :password => contacts_db_pwd,
  :database => contacts_db_name,
  :servers => {:node_1 => DB_NODE_LIST[:node_1],
               :node_2 => DB_NODE_LIST[:node_2],
               :read_only => {}},
  :servers_hash => Hash.new{|h,v| raise Exception.new("Unknown server: #{v}")},
  :loggers => [SimpleLogger.logger],
  :max_connections => 25,
  :connect_timeout => 5,
  :pool_timeout => 5)


I should note that I am running JRuby 9.1.2.0 with the pg_jruby driver. 
Adding or changing :connect_timeout => 5 and :pool_timeout => 5 seems to 
have no effect.

I also added the following lines right after my config setup to enable the 
connection_validator extension, hoping that once it fetched a connection 
that was down it would quickly try again. Surprisingly, this made no 
difference either.

# use only for faster DB-related scheduled outages
DB.extension :connection_validator
DB.pool.connection_validation_timeout = -1

I have also set the JVM DNS cache TTL to 15 seconds, following this doc:
http://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-jvm-ttl.html
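
For reference, the doc above sets the networkaddress.cache.ttl security property. A minimal sketch of doing the same from Ruby under JRuby follows. The helper name set_jvm_dns_ttl is made up for illustration, and as far as I understand the property needs to be set early at boot, before the JVM performs hostname lookups, to take effect.

```ruby
# Sketch: set the JVM DNS cache TTL from Ruby when running on JRuby.
# `set_jvm_dns_ttl` is a hypothetical helper name, not a JRuby or Sequel API.
# Returns true if the property was set, false on non-JVM Rubies.
def set_jvm_dns_ttl(seconds)
  return false unless RUBY_PLATFORM == "java"

  require "java"
  # Same security property the AWS doc describes, set programmatically.
  Java::JavaSecurity::Security.setProperty("networkaddress.cache.ttl", seconds.to_s)
  true
end

set_jvm_dns_ttl(15)
```

The platform guard only exists so the same file loads on MRI; on JRuby the call is equivalent to setting the property in the JVM's java.security file.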

Here are the errors I see in the log:
16:45:05 +0000 severity=ERROR, error is PG::ConnectionBad: Connection timed out, error backtrace => org/jruby/pg/Connection.java:366:in `initialize'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/adapters/postgres.rb:244:in `connect'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/extensions/server_logging.rb:40:in `connect'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool.rb:116:in `make_new'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:286:in `make_new'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:241:in `available'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:181:in `_acquire'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:195:in `block in acquire'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/threaded.rb:282:in `block in sync'
[NUQSO2] org/jruby/ext/thread/Mutex.java:151:in `synchronize'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/threaded.rb:282:in `sync'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:194:in `acquire'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/extensions/connection_validator.rb:98:in `acquire'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:132:in `hold'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/database/connecting.rb:285:in `synchronize'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/adapters/postgres.rb:838:in `literal_string_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:79:in `literal_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:491:in `complex_expression_sql_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/adapters/shared/postgres.rb:1302:in `complex_expression_sql_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/model/associations.rb:2587:in `complex_expression_sql_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/sql.rb:120:in `to_s_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:1238:in `literal_expression_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:86:in `literal_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:499:in `block in complex_expression_sql_append'
[NUQSO2] org/jruby/RubyArray.java:1593:in `each'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:497:in `complex_expression_sql_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/adapters/shared/postgres.rb:1302:in `complex_expression_sql_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/model/associations.rb:2587:in `complex_expression_sql_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/sql.rb:120:in `to_s_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:1238:in `literal_expression_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:86:in `literal_append'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:1483:in `select_where_sql'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:240:in `select_sql'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/actions.rb:762:in `single_value!'
[NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/actions.rb:106:in `count'
[NUQSO2] /opt/api-contacts/models/contact.rb:137:in `count_by_query_params'


This error is logged after AWS reports that the Multi-AZ failover has 
completed. Any help on how to handle this scenario better? For unplanned 
downtime (an unexpected database outage) I am OK with ~2 minutes of 
downtime, but I would love to shorten this for a known, planned outage if 
possible. Is there a way to flush and reload the pool completely on some 
error or event? Would that help? Any production-grade connection settings, 
tips, etc. you could provide would be really helpful.
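
To make the "flush and reload" idea concrete, here is a rough sketch of what I have in mind. with_pool_flush is a hypothetical helper, not something Sequel provides. It assumes that calling DB.disconnect (Sequel::Database#disconnect) drops the pooled connections so that the next query opens fresh ones, and the error classes to rescue are passed in by the caller.

```ruby
# Hypothetical helper: run the block; if it raises one of `errors`, drop all
# pooled connections with db.disconnect and run the block again, up to
# `retries` extra times. `db` is anything that responds to #disconnect
# (a Sequel::Database in the real service).
def with_pool_flush(db, errors:, retries: 1, wait: 0)
  attempts = 0
  begin
    yield
  rescue *errors
    # Out of retries: let the original error propagate to the caller.
    raise if attempts >= retries

    attempts += 1
    db.disconnect            # assumption: forces fresh connections on next use
    sleep(wait) if wait > 0  # optional pause before trying again
    retry
  end
end
```

In the service this might wrap a query such as the count in contact.rb, with errors set to something like [Sequel::DatabaseConnectionError, Sequel::DatabaseDisconnectError]. Whether this would actually shorten the recovery window is exactly what I am unsure about.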

Thanks!

