Nick, the 'db gone away' thing is a real problem. I ended up with a RobustWorker that all our other workers subclass; it provides a max-tries mechanism and emails me when that try count is exceeded. I think it would be great to see the library handle this more transparently - before I learned how to handle all this, I was beginning to think that all the naysayers may have been right about backgroundrb.
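
Roughly, the shape of it is below - a simplified sketch of what we run; FailureMailer and the retry count are stand-ins for our real ones, not anything backgroundrb provides.

class RobustWorker < BackgrounDRb::MetaWorker
  MAX_TRIES = 3

  # Subclasses run their work through this so transient errors (like the
  # MySQL connection dropping) get retried, and I get an email once the
  # retry budget is used up. FailureMailer is a stand-in for our mailer.
  def with_retries(task_name)
    tries = 0
    begin
      tries += 1
      yield
    rescue ActiveRecord::StatementInvalid => e
      # Rails 2.x: drop and re-establish any dead connections before retrying
      ActiveRecord::Base.verify_active_connections!
      retry if tries < MAX_TRIES
      FailureMailer.deliver_worker_failure(task_name, e)
      raise
    end
  end
end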

Adam Williams

On Apr 24, 2009, at 12:06 PM, [email protected] wrote:

My name's Nick, and I work for a company called Bytemark - we use backgroundrb in a range of internal projects (all our internal apps are Ruby on Rails, so
it makes sense).

Basically, and as I mentioned on IRC, I've been tasked with making
backgroundrb "better" over the next month or so, and I'd like to push as much upstream as I can while I'm at it (although some stuff I can guarantee you won't want - I'll my making my own local branch use a login plugin with a
different syntax, for instance).

First on the agenda is this backtrace:
/home/bmrack/bmrack/releases/20090404011130/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract_adapter.rb:147:in `log': Mysql::Error: MySQL server has gone away: SELECT * FROM `bdrb_job_queues` WHERE ( worker_name = 'dhshell_export_worker' AND taken = 0 AND scheduled_at <= '2009-04-24 10:41:44' )  LIMIT 1 FOR UPDATE (ActiveRecord::StatementInvalid)
from /home/bmrack/bmrack/releases/20090404011130/vendor/rails/activerecord/lib/active_record/connection_adapters/mysql_adapter.rb:302:in `execute'
from /home/bmrack/bmrack/releases/20090404011130/vendor/rails/activerecord/lib/active_record/connection_adapters/mysql_adapter.rb:537:in `select'
from /home/bmrack/bmrack/releases/20090404011130/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:7:in `select_all_without_query_cache'
from /home/bmrack/bmrack/releases/20090404011130/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/query_cache.rb:61:in `select_all'
from /home/bmrack/bmrack/releases/20090404011130/vendor/rails/activerecord/lib/active_record/base.rb:586:in `find_by_sql'
from /home/bmrack/bmrack/releases/20090404011130/vendor/rails/activerecord/lib/active_record/base.rb:1345:in `find_every'
from /home/bmrack/bmrack/releases/20090404011130/vendor/rails/activerecord/lib/active_record/base.rb:1307:in `find_initial'
from /home/bmrack/bmrack/releases/20090404011130/vendor/rails/activerecord/lib/active_record/base.rb:538:in `find'
from /home/bmrack/bmrack/releases/20090404011130/vendor/plugins/backgroundrb/lib/backgroundrb/bdrb_job_queue.rb:11:in `find_next'
from /home/bmrack/bmrack/releases/20090404011130/vendor/rails/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:66:in `transaction'
from /home/bmrack/bmrack/releases/20090404011130/vendor/rails/activerecord/lib/active_record/transactions.rb:79:in `transaction'
from /home/bmrack/bmrack/releases/20090404011130/vendor/plugins/backgroundrb/lib/backgroundrb/bdrb_job_queue.rb:8:in `find_next'
from /home/bmrack/bmrack/releases/20090404011130/vendor/plugins/backgroundrb/server/lib/meta_worker.rb:271:in `check_for_enqueued_tasks'
from /home/bmrack/bmrack/releases/20090404011130/vendor/plugins/backgroundrb/server/lib/meta_worker.rb:133:in `worker_init'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/../lib/packet/packet_periodic_event.rb:23:in `call'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/../lib/packet/packet_periodic_event.rb:23:in `run'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/../lib/packet/packet_core.rb:301:in `check_for_timer_events'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/../lib/packet/packet_core.rb:301:in `each'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/../lib/packet/packet_core.rb:301:in `check_for_timer_events'
from /home/bmrack/bmrack/releases/20090404011130/vendor/plugins/backgroundrb/server/lib/meta_worker.rb:296:in `check_for_timer_events'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/../lib/packet/packet_core.rb:140:in `start_reactor'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/../lib/packet/packet_core.rb:139:in `loop'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/../lib/packet/packet_core.rb:139:in `start_reactor'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/../lib/packet/packet_worker.rb:20:in `start_worker'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/packet_worker_runner:33:in `load_worker'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/packet_worker_runner:26:in `initialize'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/packet_worker_runner:47:in `new'
from /usr/lib/ruby/gems/1.8/gems/packet-0.1.14/bin/packet_worker_runner:47
from /usr/bin/packet_worker_runner:19:in `load'
from /usr/bin/packet_worker_runner:19

(This isn't the latest git, but it's definitely git - we're just updating to packet 0.1.15, but I doubt that'll affect this particular error.) This kills the server for us from time to time (we have our MySQL server set to time out connections after half a day). It's not the full story, since for that to happen, requests to the worker must have stopped too (or something else is going on), but fixing this is number 1 on my list. I've not delved much into the source yet (5pm on a Friday is *not* the time to start with that!); I was wondering if you guys have a preferred / global strategy for dealing with errors. My approach to the above would be to catch the error and respond by attempting to re-establish the connection. If that succeeds, I'd re-run the scheduled item (so the database going away would be transparent to the client); if not... hmm. Do we have a method of reporting a failed task back to the worker?
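
Something like this is what I have in mind - a minimal sketch, assuming Rails 2.x where the MySQL adapter responds to reconnect!; the wrapper itself is my own invention, not existing backgroundrb code:

module DbRetry
  # Run the block; if MySQL has gone away, reconnect once and re-run it, so
  # the scheduled item is handled transparently. A second failure is
  # re-raised - which is where reporting the failed task back to the worker
  # would have to fit in.
  def self.with_reconnect
    retried = false
    begin
      yield
    rescue ActiveRecord::StatementInvalid, Mysql::Error
      raise if retried
      retried = true
      ActiveRecord::Base.connection.reconnect!
      retry
    end
  end
end

# e.g. wrapping the call that blew up in the backtrace above:
# job = DbRetry.with_reconnect { BdrbJobQueue.find_next(worker_name) }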

Another problem, and the one that led me to stop using the backgroundrb cache object, was that requests into a worker via the MiddleMan API would just freeze up from time to time, leaving the Rails controller method hanging. That's another one for me to investigate (we weren't using memcached, mind, but it annoys me when the Hash - a far simpler system - is the less reliable one).
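
For reference, this is the kind of call that hangs - worker and result key names are made up for illustration:

# In a Rails controller action; ask_result fetches a value the worker has
# stored in its cache. This is the call that occasionally never returns.
def show
  @totals = MiddleMan.worker(:stats_worker).ask_result(:daily_totals)
end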

I'm also tasked with redeploying our current backgroundrb setup. Right now we have one backgroundrb server per application that uses it (about 4 or 5 servers in total), with the backgroundrb table stored in each application's database. I'm leaning towards a scheme where a pair of backgroundrb servers (transparently load-balanced) is shared by all the applications, with each backgroundrb server keeping its own separate *SQLite* database. Requests would go to one server or the other; if one of the servers died, all the requests would go to the other, and vice versa. If possible (I know this is a -devel list), I'd like comments on whether that's a good setup or not, and suggestions for improvement. If not, well, just consider it a bit of context ;)
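
By "its own SQLite database" I mean something like the following on each backgroundrb host - as far as I know there's no built-in backgroundrb setting for this, so the initializer, path and adapter details are all assumptions on my part:

require 'socket'

# Hand-rolled initializer loaded by each backgroundrb server (not an existing
# backgroundrb option): point only the job-queue model at a local SQLite file,
# leaving each application's own models on their usual database.
BdrbJobQueue.establish_connection(
  :adapter  => 'sqlite3',
  :database => "/var/lib/backgroundrb/#{Socket.gethostname}-queue.sqlite3"
)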

Regards,
--
Nick Thomas
Bytemark Hosting Limited
_______________________________________________
Backgroundrb-devel mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/backgroundrb-devel
