New Relic causing crash on deploy

2011-09-22 Thread mattsly
I just made a trivial text change and deployed to both staging and
production.  Staging was fine (I'm not using New Relic), but
production crashed during bounce, w/ a stack trace that points to
something awry w/ New Relic.  I had to rollback.  I haven't made any
changes

Pasting the stack trace from a log tail below.

For kicks, I tried adding New Relic to staging (heroku addons:add
newrelic:standard --app staging), and it crashed immediately.  I
removed it and it was fine.

Anyone else?

Matt




2011-09-23T03:33:05+00:00 heroku[web.3]: State changed from up to
bouncing
2011-09-23T03:33:05+00:00 heroku[web.3]: State changed from bouncing
to created
2011-09-23T03:33:05+00:00 heroku[web.3]: State changed from created to
starting
2011-09-23T03:33:05+00:00 heroku[web.2]: State changed from up to
bouncing
2011-09-23T03:33:05+00:00 heroku[web.2]: State changed from bouncing
to created
2011-09-23T03:33:05+00:00 heroku[web.2]: State changed from created to
starting
2011-09-23T03:33:07+00:00 heroku[web.3]: Stopping process with SIGTERM
2011-09-23T03:33:07+00:00 app[web.3]:  Stopping ...
2011-09-23T03:33:07+00:00 heroku[web.2]: Stopping process with SIGTERM
2011-09-23T03:33:07+00:00 app[web.2]:  Stopping ...
2011-09-23T03:33:07+00:00 heroku[web.3]: Process exited
2011-09-23T03:33:07+00:00 heroku[web.2]: Process exited
2011-09-23T03:33:08+00:00 heroku[web.2]: Starting process with command
`thin -p 20025 -e production -R /home/heroku_rack/heroku.ru start`
2011-09-23T03:33:08+00:00 heroku[web.3]: Starting process with command
`thin -p 53644 -e production -R /home/heroku_rack/heroku.ru start`
2011-09-23T03:33:10+00:00 app[web.2]: /app/vendor/plugins/rpm/lib/
new_relic/agent/stats_engine/metric_stats.rb:13: uninitialized
constant
NewRelic::Agent::StatsEngine::MetricStats::SynchronizedHash::Sync_m
(NameError)
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/vendor/plugins/rpm/
lib/new_relic/agent/stats_engine.rb:1
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/vendor/plugins/rpm/
lib/new_relic/agent.rb:82
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/vendor/plugins/rpm/
lib/new_relic/control.rb:20
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/vendor/plugins/rpm/
init.rb:3
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/railties-3.0.10/lib/rails/plugin.rb:81
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/railties-3.0.10/lib/rails/initializable.rb:25:in `instance_exec'
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/railties-3.0.10/lib/rails/initializable.rb:25:in `run'
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/railties-3.0.10/lib/rails/initializable.rb:50:in
`run_initializers'
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/railties-3.0.10/lib/rails/initializable.rb:49:in `each'
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/railties-3.0.10/lib/rails/initializable.rb:49:in
`run_initializers'
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/railties-3.0.10/lib/rails/application.rb:134:in `initialize!'
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/railties-3.0.10/lib/rails/application.rb:77:in `send'
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/railties-3.0.10/lib/rails/application.rb:77:in `method_missing'
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/config/environment.rb:
5
2011-09-23T03:33:10+00:00 app[web.2]:   from /usr/ruby1.8.7/lib/ruby/
site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
2011-09-23T03:33:10+00:00 app[web.2]:   from /usr/ruby1.8.7/lib/ruby/
site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
2011-09-23T03:33:10+00:00 app[web.2]:   from config.ru:3
2011-09-23T03:33:10+00:00 app[web.2]:   from /home/heroku_rack/
heroku.ru:23
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/rack-1.2.3/lib/rack/builder.rb:46:in `instance_eval'
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/rack-1.2.3/lib/rack/builder.rb:46:in `initialize'
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/rack-1.2.3/lib/rack/builder.rb:63:in `new'
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/rack-1.2.3/lib/rack/builder.rb:63:in `map'
2011-09-23T03:33:10+00:00 app[web.2]:   from /home/heroku_rack/
heroku.ru:18
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/rack-1.2.3/lib/rack/builder.rb:46:in `instance_eval'
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/
gems/rack-1.2.3/lib/rack/builder.rb:46:in `initialize'
2011-09-23T03:33:10+00:00 app[web.2]:   from /home/heroku_rack/
heroku.ru:11:in `new'
2011-09-23T03:33:10+00:00 app[web.2]:   from /home/heroku_rack/
heroku.ru:11
2011-09-23T03:33:10+00:00 app[web.2]:   from /app/.bundle/gems/ruby/1.8/

blocked in china

2011-03-10 Thread mattsly
As of early last week, I lost all of my traffic from China, and
started getting tweets from users that they couldn't hit my site.

Indeed, it looks like all three Heroku proxy IPs are blocked by the
GFW (Great Firewall):
75.101.145.87
75.101.163.44
174.129.212.2

Not sure what we can do about this...just kind of an FYI for
everyone.  And a bummer.  Pesky geopolitics.

Here's the service I used to test btw...pretty cool:
http://www.watchmouse.com/en/ping.php

Matt

-- 
You received this message because you are subscribed to the Google Groups 
Heroku group.
To post to this group, send email to heroku@googlegroups.com.
To unsubscribe from this group, send email to 
heroku+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/heroku?hl=en.



amazon simple email service + heroku

2011-01-26 Thread mattsly
Assume others on this list are mighty intrigued by this announcement:
http://aws.amazon.com/ses/

I've always thought sendgrid looked great, but way overpriced.
Amazon's offering looks far more small-site friendly.

A few points of discussion:
1) Does calling it from an app running on Heroku count as an EC2
instance?
The pricing page here: http://aws.amazon.com/ses/pricing/ includes the
following:
If you are an Amazon EC2 user, you can get started with Amazon SES
for free. You can send 2,000 messages for free each day when you call
Amazon SES from an Amazon EC2 instance directly or through AWS Elastic
Beanstalk. Many applications are able to operate entirely within this
free tier limit.

My assumption is that this is based on IP so we'll be fine, but
curious if there's confirmation.

2) Any sign of a Ruby library?
Currently only .NET, Java and PHP wrappers.  It all looks easy enough to
port...but just thought I'd check if anyone's already done so.




Re: App Timeouts

2010-11-30 Thread mattsly
Yeah unfortunately I missed the log entries where the crash first
occurred, and I didn't have a tail -f going (dumb) so I don't know the
request that caused it.  And nothing was logged to exceptional so I
don't have a stacktrace - maybe the Heroku guys can dig one up? (I
also have a thread open w/ support about the issue)

In terms of being unrecoverable, once the original crash happened,
every subsequent request resulted in an App Crashed response from the
dyno - here's a sample from the logs (just after the first crash event -
really just missed by seconds, sadly...)

2010-11-29T20:14:15-08:00 heroku[nginx]: GET /letters/recently_written?
offset=48 HTTP/1.1 | [ip address] | 3310 | http | 500
2010-11-29T20:14:20-08:00 heroku[router]: Error H10 (App crashed) -
GET www.futureme.org/letters/recently_written dyno=none queue=0
wait=0ms service=0ms bytes=0
2010-11-29T20:14:20-08:00 heroku[nginx]: GET /letters/recently_written?
offset=48 HTTP/1.1 | [ip address] | 3310 | http | 500
2010-11-29T20:14:25-08:00 heroku[router]: Error H10 (App crashed) -
GET www.futureme.org/letter/218665-no-subject dyno=none queue=0
wait=0ms service=0ms bytes=0
2010-11-29T20:14:31-08:00 heroku[router]: Error H10 (App crashed) -
GET www.futureme.org/letter/218665-no-subject dyno=none queue=0
wait=0ms service=0ms bytes=0
2010-11-29T20:14:31-08:00 heroku[router]: Error H10 (App crashed) -
GET www.futureme.org/ dyno=none queue=0 wait=0ms service=0ms bytes=0
2010-11-29T20:14:31-08:00 heroku[nginx]: GET / HTTP/1.1 | [ip address]
| 3310 | http | 500
2010-11-29T20:14:36-08:00 heroku[router]: Error H10 (App crashed) -
GET futureme.org/ dyno=none queue=0 wait=0ms service=0ms bytes=0
2010-11-29T20:14:36-08:00 heroku[nginx]: GET / HTTP/1.1 | [ip address]
| 3310 | http | 500
2010-11-29T20:14:37-08:00 heroku[nginx]: GET /letters/our_favs?
offset=483 HTTP/1.1 | [ip address] | 6194 | http | 200
2010-11-29T20:14:37-08:00 heroku[router]: Error H10 (App crashed) -
GET www.futureme.org/ dyno=none queue=0 wait=0ms service=0ms bytes=0

I got back on track by commenting out the two gems (system-timer and
rack-timeout) and then redeploying.  But until I did that, every
request was hitting the Application crashed page.

Feel free to ping me offline to discuss more.  I tried to repro in my
staging environment for a while last night but couldn't but can keep
giving it a shot.

Matt



On Nov 30, 1:07 pm, Caio Chassot caio.chas...@gmail.com wrote:
 On Nov 30, 2:55 am, mattsly matt...@gmail.com wrote:

  So I've been trying to get rack-timeout installed, but it's causing
  consistent, though non-deterministic, application crashes from which
  my app won't recover.  I've had to rollback.

 Can you tell me more about the nature of these crashes?

 What do you see, what do you mean by won't recover?

 How were you getting your app back on track? By restarting it?

 Mostly, I'd love to see a stack trace if that's possible.




Re: App Timeouts

2010-11-29 Thread mattsly
So I've been trying to get rack-timeout installed, but it's causing
consistent, though non-deterministic, application crashes from which
my app won't recover.  I've had to rollback.

I'm on Rails 3.0.1 on Bamboo (Ruby 1.8.7)

Basically, per README here:
https://github.com/kch/rack-timeout

...I add two lines to my Gemfile
gem 'system_timer' if RUBY_VERSION < '1.9'
gem 'rack-timeout'

...and that causes the crash (after running peachy keen for a few
minutes, even up to an hour or so...) Comment them out, and we're all
good.

Have others had success using rack-timeout?



On Nov 7, 9:34 pm, Subramanya Sastry sss.li...@gmail.com wrote:
 Aha!  Thanks for the explanation.  That is very helpful.  So, it could
 just be a single bad request that pretty much times out all additional
 requests down the pipe.  We've added rack-timeout already, and next
 time we hit one such bad request, we'll know with an exceptional
 report!

 Subbu.







 On Sun, Nov 7, 2010 at 8:27 PM, daniel hoey danielho...@gmail.com wrote:
  Just to follow up on my original post: We had one action that we knew
  had a timeout problem but we hadn't prioritized fixing it. We
  eventually discovered that this action caused other requests to
  timeout. The understanding that I got from talking to Heroku support
  is when a request comes in it gets assigned to a dyno immediately. For
  the purposes of Heroku timeouts the request 'start time' is now. But
  if that dyno is currently processing some other request then the new
  request will just wait. If 30 seconds passes and the first request has
  not finished processing, then both requests timeout. Note also that if
  the first request takes 29s and the second request takes 2s then the
  second request will timeout.
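
The queueing behaviour described above can be sketched as a tiny simulation. This is only a hypothetical model of one dyno serving requests one at a time; the name simulate_dyno, the 30-second constant and the request tuples are illustrative assumptions, not Heroku internals.

```ruby
# Hypothetical model: one dyno, requests served strictly one at a time.
# The timeout clock starts when a request arrives, not when work begins.
TIMEOUT_S = 30.0

# requests: array of [arrival_time_s, service_time_s], sorted by arrival.
# Returns one hash per request with its elapsed time since arrival and
# whether that elapsed time exceeded the timeout.
def simulate_dyno(requests)
  dyno_free_at = 0.0
  requests.map do |arrival, service|
    start = [arrival, dyno_free_at].max # wait while the dyno is busy
    finish = start + service
    dyno_free_at = finish
    elapsed = finish - arrival
    { elapsed: elapsed, timed_out: elapsed > TIMEOUT_S }
  end
end

# A 29s request followed immediately by a 2s request on the same dyno.
results = simulate_dyno([[0.0, 29.0], [0.0, 2.0]])
results.each { |r| puts r.inspect }
```

In this model the second request only needs 2s of work, but it spends 29s waiting behind the first, so its total exceeds the limit.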

  We ended up putting SystemTimer (http://systemtimer.rubyforge.org/)
  timeouts around some of our actions and filters so an exception gets
  raised when something times out; rack-timeout looks like a better way
  of doing this. We also used New Relic Silver to find the actions that
  were the root cause of the problem.

  Basically the moral of the story is that you have to make sure that
  none of your actions ever timeout.

  On Nov 6, 4:31 am, Oren Teich o...@heroku.com wrote:
  I've seen a few people with weird timeouts where the app owner was
  able to find out that it was a bug in their code.  Anything from a
  weird SQL query locking a table that was hanging their process to API
  requests to other hard to track stuff.

  This gem (http://github.com/kch/rack-timeout) will timeout your
  requests after a period you specify.  The advantage of this is you can
  set it to a short time, and exceptional/hoptoad should catch the
  timeout giving you some indication in the backtrace of what's going
  on.
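
A rough sketch of the idea, for anyone curious how a request-level timeout can work. This is not the rack-timeout gem's code: it swaps in Ruby's standard Timeout module, and the RequestTimeout class and the two sample apps are hypothetical names used only for illustration.

```ruby
require "timeout"

# Hypothetical Rack-style middleware (not the rack-timeout gem itself).
# It wraps the downstream app and raises if the app takes too long.
class RequestTimeout
  def initialize(app, seconds)
    @app = app
    @seconds = seconds
  end

  def call(env)
    # Timeout.timeout raises Timeout::Error once the limit has passed,
    # so the error carries a backtrace from wherever the app was stuck.
    Timeout.timeout(@seconds) { @app.call(env) }
  end
end

# Two stand-in apps: one that answers at once, one that hangs for a second.
fast_app = lambda { |env| [200, { "Content-Type" => "text/plain" }, ["ok"]] }
slow_app = lambda { |env| sleep 1; [200, { "Content-Type" => "text/plain" }, ["done"]] }

guarded_fast = RequestTimeout.new(fast_app, 0.1)
guarded_slow = RequestTimeout.new(slow_app, 0.1)
```

Because the error is left to propagate, an exception tracker further up the stack gets a chance to record it, which is the point of setting the limit shorter than the platform's own cutoff.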

  Oren

  On Fri, Nov 5, 2010 at 9:06 AM, Subbu Sastry sss.li...@gmail.com wrote:
   Has anyone found a reasonable solution to this problem yet?  On our
   app as well, we notice totally random timeout errors that couldn't
   possibly be associated with db lookup -- sometimes requests time out on
   pages that look up a row by primary key on a table with 15 records.
   Favicon.ico timed out as well.  The timeouts seem arbitrary, and
   *always* get fixed on server restart (heroku restart).  This has
   happened to us a few times over the last week.  And yes, as several of
   you have noted, there are no exceptions raised (neither exceptional nor
   NewRelic).

   I think given that we experienced timeout with favicon.ico and an
   about page with a single db lookup and newrelic doesn't see this at
   all, I suspect this is something higher up the heroku stack that is
   timing out .. It almost smells like a memory leak somewhere which is
   how app restart seems to fix the problem.  Now, the question is
   whether the memory leak is in our app or somewhere else (plugins,
   gems, interaction with heroku stack) ... I will debug this, but wanted
   to see if someone else has found a reasonable solution to this.

   Subbu.

   On Oct 6, 9:37 pm, mattsly matt...@gmail.com wrote:
   In just manual testing my app, I've seen a fair number of timeouts
   (maybe a dozen) but have not received any communication.  I am pretty
   sure I'd have no idea they occurred had I not personally witnessed the
   error page.  I find this a borderline ship blocker for a migration
   to Heroku as I consider migrating a ~500K monthly page view app to
   Heroku, and get very anxious thinking about lots of users seeing funky
   error page and having no way of being alerted or knowing how prevalent
   the issue is.

   WRT the timeouts, it's maybe 1% of requests that time out...and I
   still can't pin down why they're happening.  I'm on a single dyno,
   with Koi, and < 5 alpha testers on it concurrently (and timeout
   errors are related to response...not concurrency...) and these are
   extremely simple paging requests, that according to New Relic, return
   in ~100MS on average...and then all of a sudden...bam! - a request
   timeout

rails 3.0.3 bundle - tread carefully!

2010-11-24 Thread mattsly
I just had an exciting afternoon-before-thanksgiving...I read all the
goodness at:
http://weblog.rubyonrails.org/2010/11/15/rails-3-0-3-faster-active-record-plus-plenty-of-fixes

...about better perf and low-risk changes with Rails 3.0.3, so I did a
local upgrade from 3.0.1, ran some tests, everything seemed fine,
pushed to staging, ditto, pushed to prod, ditto...went out to do some
errands. And then got a crash alert:

2010-11-24T12:35:55-08:00 app[worker.1]: Could not find
columnize-0.3.1 in any of the sources

boom!

Rushed home, rolled back to 3.0.1...

I'm still not sure exactly what happened - trying to diagnose now.
But clearly some funkiness w/ dependency management btwn versions, as
Rails 3.0.3 doesn't seem to install columnize (or linecache-0.43,
possibly more) whereas they do get installed when Rails = 3.0.1.

If anyone else has thoughts, please jump in. But, at the very least,
consider this a public service announcement that the 3.0.1 -> 3.0.3
upgrade doesn't appear to be vanilla.




HTTP caching best practice: caches_page or response.headers?

2010-11-01 Thread mattsly
It seems to me (though I'm not positive?) that caches_page is
overwritten to simply write response.headers['Cache-Control'] =
'public, max-age=300', as evidenced by the git post-push action
detecting and installing caches_page_via_http.

Does anyone know if this is indeed the case?

If so, I wonder why http://docs.heroku.com/http-caching doesn't
explain as much and recommend using caches_page rather than explicitly
writing cache-control headers?  Seems like a more elegant approach?
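
For reference, "explicitly writing cache-control headers" boils down to something like the sketch below. It is a framework-free, Rack-style stand-in rather than Rails or Heroku code; the app lambda, the CACHE_SECONDS constant and the sample path are hypothetical.

```ruby
# Hypothetical Rack-style app that marks its response as publicly
# cacheable for five minutes, mirroring the header value quoted above.
CACHE_SECONDS = 300

app = lambda do |env|
  headers = {
    "Content-Type"  => "text/html",
    "Cache-Control" => "public, max-age=#{CACHE_SECONDS}"
  }
  [200, headers, ["<p>cached page</p>"]]
end

# Call the app the way a Rack server would, with a minimal env hash.
status, headers, body = app.call("PATH_INFO" => "/letters", "QUERY_STRING" => "offset=48")
puts headers["Cache-Control"]
```

Whether caches_page ends up emitting exactly this header on Heroku is the open question in this post; the sketch only shows the explicit variant.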

Also - cache key name space is the full URL (including the query
string), for either approach, right?  If so, which I assume is the
case, I saw definite weirdness yesterday where:
a) Cached content for /controller/id?foo=bar was being returned for
/controller/id (i.e. no query string).
b) Shift-reloading, restarting, even repushing the app was still
serving the same cached content (it had a session specific flash)

Continued for a good two hours (10 - 12 EST)...and now I can't repro
it. I know it was Halloween and all...but I swear I was not making
things up.  Anyone ever seen this kind of behavior?




Re: Heroku down time today, about 30 minutes total

2010-10-27 Thread mattsly

You can browse the incident archive here:
http://status.heroku.com/past

Most of the issues seem to be related to tooling, but there was at
least one other case of app outage earlier in the month (Oct. 4).

Does Heroku publish uptime numbers?  Does anyone running an app in
production (I'm still not yet...) have app uptime numbers they're
willing to share?


On Oct 26, 7:03 pm, Shane Becker veganstraighte...@gmail.com wrote:
  How can we be sure this won't happen again?

 No




Re: Koi (how variable is variable?) vs. Ronin (expensive!) vs. Amazon RDS (winner?)

2010-10-12 Thread mattsly
Thanks Chris - that's great to know that you're handling that much
traffic w/ only 2 dynos and Koi. I have highly cachable content as
well, and plan on aggressively using varnish, as well as caches_page
behind that, and then rendering user specific partials (welcome back
username) w/ ajax (sounds like you're doing the same).  I haven't
turned all that on yet, b/c I want to really exercise the full end-to-
end under some load first.

With Koi, are your New Relic DB throughput and response time numbers
pretty consistent?  What is the variance that you're seeing in terms
of database response time for the same queries?  Any sign of timeouts
caused by a slow DB response?

And I wonder if anyone else out there has had success w/ Amazon RDS?
The long-term pricing to get a dedicated database is very attractive,
and I'm more familiar w/ MySQL than PostgreSQL.

m

p.s. As an aside, I am surprised by how little activity this group
gets?  Latest home page on heroku said there are 90,000+ apps running
on the platform...but like 3 messages posted a day.  Kinda weird?
Implies that either a) people are just using the free hello world
version, or b) it's just that damn easy that nobody is hitting
issues :)



On Oct 7, 11:39 pm, chris mcclellan...@gmail.com wrote:
 It really depends a lot on how well you leverage caching. We run a
 ~1.5mil page views / month site on heroku with just two (2!) dynos and
 koi. But our pages are cached for ~3 hours via varnish, which heroku
 provides for free. All users see (essentially) the same version of the
 site, barring JS-driven login/logout links, and certain admin
 functionality.

 One thing you should keep in mind is that amazon RDS is mysql based
 and heroku runs postgres DBs.

 On Oct 7, 10:42 pm, mattsly matt...@gmail.com wrote:



  I really want to pull the trigger with Heroku.  I love so much of it.
  I'm looking to move over a ~500K page views/month site that is
  decently data intensive, and still weighing my options wrt database,
  which may make or break my decision to use Heroku vs. EY, etc, given
  the price differences.

  Koi seems like a great deal. 20 GB is plenty for my app.  My
  benchmarks so far seem promising. But variable performance has me
  concerned a bit...does anyone have more concrete numbers on just what
  that means?  Anyone running decently high traffic sites on just Koi?

  The jump to Ronin is obviously dramatic in terms of price.  Is there
  any more info on just what a compute unit is?  Like RAM and I/O
  specs? It seems to me, given EC2 prices (small ec2 is ~85/month
  variable, and large = ~910/year term < $100/month), that there should
  be a dedicated option for less than $200/month.  Any hints of a Koi-
  like price drop in the near future here?  My 2 cents to Heroku's
  pricing team would be treat the data layer as a break-even loss
  leader, and make up the revenue on the dyno/worker side...

  Amazon RDS seems like quite possibly the way to go.  3 year term for a
  small instance (1.7 GB) is $350 < 2 months of the cost of a Ronin
  instance! Has anyone gone this route and had success? Should I be
  worried about latency between Heroku and RDS? (it's all EC2, right?)
  Which zone should I have a DB placed in? (Virginia vs. California?)

  m




Koi (how variable is variable?) vs. Ronin (expensive!) vs. Amazon RDS (winner?)

2010-10-07 Thread mattsly
I really want to pull the trigger with Heroku.  I love so much of it.
I'm looking to move over a ~500K page views/month site that is
decently data intensive, and still weighing my options wrt database,
which may make or break my decision to use Heroku vs. EY, etc, given
the price differences.

Koi seems like a great deal. 20 GB is plenty for my app.  My
benchmarks so far seem promising. But variable performance has me
concerned a bit...does anyone have more concrete numbers on just what
that means?  Anyone running decently high traffic sites on just Koi?

The jump to Ronin is obviously dramatic in terms of price.  Is there
any more info on just what a compute unit is?  Like RAM and I/O
specs? It seems to me, given EC2 prices (small ec2 is ~85/month
variable, and large = ~910/year term < $100/month), that there should
be a dedicated option for less than $200/month.  Any hints of a Koi-
like price drop in the near future here?  My 2 cents to Heroku's
pricing team would be treat the data layer as a break-even loss
leader, and make up the revenue on the dyno/worker side...

Amazon RDS seems like quite possibly the way to go.  3 year term for a
small instance (1.7 GB) is $350 < 2 months of the cost of a Ronin
instance! Has anyone gone this route and had success? Should I be
worried about latency between Heroku and RDS? (it's all EC2, right?)
Which zone should I have a DB placed in? (Virginia vs. California?)

m




Re: App Timeouts

2010-10-06 Thread mattsly
In just manual testing my app, I've seen a fair number of timeouts
(maybe a dozen) but have not received any communication.  I am pretty
sure I'd have no idea they occurred had I not personally witnessed the
error page.  I find this a borderline ship blocker for a migration
to Heroku as I consider migrating a ~500K monthly page view app to
Heroku, and get very anxious thinking about lots of users seeing funky
error page and having no way of being alerted or knowing how prevalent
the issue is.

WRT the timeouts, it's maybe 1% of requests that time out...and I
still can't pin down why they're happening.  I'm on a single dyno,
with Koi, and < 5 alpha testers on it concurrently (and timeout
errors are related to response...not concurrency...) and these are
extremely simple paging requests, that according to New Relic, return
in ~100MS on average...and then all of a sudden...bam! - a request
timeout.  And we're talking about essentially the exact same code
path, except a different :offset in the ActiveRecord find call.  The
complexity is nothing along the lines of suggested timeout causes
here: http://docs.heroku.com/performance#request-timeout

Strangely, I just tried turning off all varnish level caching (which I
hope to rely on heavily) to try and isolate the issue and now perf
seems *more* consistent and faster (haven't seen a timeout yet). Could
it be that the timeouts are being caused during lookup at the Varnish
layer? My understanding is this wouldn't be a possible explanation, as
I think the dyno doesn't even catch a request if a Varnish cache
hit is found.  So maybe Varnish caching is a red herring...but does
seem curious.

Matt



On Sep 24, 7:56 pm, John Norman j...@7fff.com wrote:
 Well, you should get an e-mail if your app is generating backlogs.

 I have one app that did generate 2 in a whole week, and I received at least
 two e-mails from Heroku suggesting that I up the number of dynos.



 On Fri, Sep 24, 2010 at 11:42 AM, mattsly matt...@gmail.com wrote:
  How are you finding the timeouts? Just manually?  I was having timeout
  issues (that I now think I've solved - see below) but am concerned
  that, once I flip my site public, that:

  a) There's no apparent native reporting/alerting for timeouts or
  backlog too deep errors if they do occur
  b) No ability to render a custom (static) error page in that case

  Re: reporting. When timeouts occur, am I mistaken in not seeing them
  reported anywhere?  They don't seem to throw exceptional or new relic
  exceptions with the free version?  It's unclear to me that they would
  be with the (expensive - $0.05/hr = $36/month for alerting?) Silver
  - can anyone confirm that they in fact do?

  It seems like timeout/backlog too deep reporting/alerting should
  really be a built-in feature of Heroku, since they are core elements
  in the architecture, and such alerting (especially backlog) helps you
  make a quick call about cranking dyno count up/down and/or restarting
  an app to minimize adverse user effects...i.e. really what this cloud
  and hosting-as-a-service thing is all about.

  I'm about to (I think) migrate a high traffic site to Heroku. I *love*
  the idea of being able to focus on development and not sysadmin...but
  have to say that I am getting a little anxious about quirks like this
  and what it might mean for my users.

  Matt

  (On a slightly related note - I've learned the hard way that
  Table.count is a great way to cause a timeout - looks like MySQL and
  PostgreSQL handle counts *way* differently...something to keep in mind
  if you're migrating from MySQL:
  http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL#COUNT.28.2A.29)

  On Sep 10, 3:45 am, daniel hoey danielho...@gmail.com wrote:
   We go through short periods where we get frequent app timeouts. The
   pages that time out are often very simple and do not rely on
   external services or perform any demanding database queries. We
   don't get any information in our New Relic transaction traces for
   these queries (we have for other timeouts in the past). Basically we
   can't get any information about what is going on, and only know about
   the problem if our users tell us. Has anyone else experienced similar
   problems or have anything to suggest in terms of investigating the
   root cause?

   The last time that we are aware of this happening was between 06:30
   and 07:00 GMT on Sept 10.


Re: App Timeouts

2010-09-24 Thread mattsly
How are you finding the timeouts? Just manually?  I was having timeout
issues (that I now think I've solved - see below) but am concerned
that, once I flip my site public, that:

a) There's no apparent native reporting/alerting for timeouts or
backlog too deep errors if they do occur
b) No ability to render a custom (static) error page in that case

Re: reporting. When timeouts occur, am I mistaken in not seeing them
reported anywhere?  They don't seem to throw exceptional or new relic
exceptions with the free version?  It's unclear to me that they would
be with the (expensive - $0.05/hr = $36/month for alerting?) Silver
- can anyone confirm that they in fact do?

It seems like timeout/backlog too deep reporting/alerting should
really be a built-in feature of Heroku, since they are core elements
in the architecture, and such alerting (especially backlog) helps you
make a quick call about cranking dyno count up/down and/or restarting
an app to minimize adverse user effects...i.e. really what this cloud
and hosting-as-a-service thing is all about.

I'm about to (I think) migrate a high traffic site to Heroku. I *love*
the idea of being able to focus on development and not sysadmin...but
have to say that I am getting a little anxious about quirks like this
and what it might mean for my users.

Matt

(On a slightly related note - I've learned the hard way that
Table.count is a great way to cause a timeout - looks like MySQL and
PostgreSQL handle counts *way* differently...something to keep in mind
if you're migrating from MySQL:
http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL#COUNT.28.2A.29)



On Sep 10, 3:45 am, daniel hoey danielho...@gmail.com wrote:
 We go through short periods where we get frequent app timeouts. The
 pages that time out are often very simple and do not rely on
 external services or perform any demanding database queries. We
 don't get any information in our New Relic transaction traces for
 these queries (we have for other timeouts in the past). Basically we
 can't get any information about what is going on, and only know about
 the problem if our users tell us. Has anyone else experienced similar
 problems or have anything to suggest in terms of investigating the
 root cause?

 The last time that we are aware of this happening was between 06:30
 and 07:00 GMT on Sept 10.

