Re: How do I track down a painfully long pause in a small web app?

2014-09-24 Thread dennis zhuang
You can use

jstat -gcutil pid 2000

to print the GC statistics every 2 seconds (see
http://docs.oracle.com/javase/1.5.0/docs/tooldocs/share/jstat.html).

If the long pause is from GC, the FGC and FGCT columns (full GC count and
cumulative full GC time) will show large values.
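A minimal sketch of how captured jstat output could be checked for a jump in full GC time, rather than reading the columns by eye. The sample rows are invented for illustration, and the function name and one-second threshold are assumptions. The column set differs between JDK versions, so the columns are looked up by header name rather than by position.

```python
def full_gc_jumps(jstat_text, min_seconds=1.0):
    """Return (sample_index, new_full_gcs, seconds) for each interval in which
    the cumulative full GC time (FGCT) grew by at least min_seconds."""
    lines = [ln.split() for ln in jstat_text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    fgc, fgct = header.index("FGC"), header.index("FGCT")
    jumps = []
    for i in range(1, len(rows)):
        d_count = int(rows[i][fgc]) - int(rows[i - 1][fgc])
        d_time = float(rows[i][fgct]) - float(rows[i - 1][fgct])
        if d_time >= min_seconds:
            jumps.append((i, d_count, round(d_time, 3)))
    return jumps


# Hypothetical output of `jstat -gcutil <pid> 2000`; the values are invented.
SAMPLE = """
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00  45.10  12.30  60.20  99.10     40    0.512     2    0.800    1.312
  0.00  45.10  80.90  60.20  99.10     40    0.512     2    0.800    1.312
  0.00   0.00   1.20  35.00  99.10     41    0.530     3   28.300   28.830
"""

if __name__ == "__main__":
    for index, count, seconds in full_gc_jumps(SAMPLE):
        print(f"sample {index}: {count} full GC(s) took {seconds}s")
```

A jump of many seconds between two consecutive samples would point at a full collection as the cause of the pause; flat FGC/FGCT values during a pause would point elsewhere.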

If you think it's a swap issue, you may want to use

   vmstat 1 1

and watch the si/so columns (memory swapped in from and out to disk).
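A minimal sketch of pulling the si/so values out of captured vmstat output. The sample text is invented and the function name is an assumption. Note that the first report vmstat prints gives averages since boot, so several samples (for example `vmstat 1 5`) say more about what is happening right now than a single one.

```python
def swap_activity(vmstat_text):
    """Return a list of (si, so) pairs, one per data row of vmstat output."""
    lines = [ln.split() for ln in vmstat_text.strip().splitlines()]
    # The column names sit on the line that contains both "si" and "so".
    header_at = next(i for i, cols in enumerate(lines)
                     if "si" in cols and "so" in cols)
    header = lines[header_at]
    si, so = header.index("si"), header.index("so")
    return [(int(row[si]), int(row[so])) for row in lines[header_at + 1:]]


# Hypothetical vmstat output; the values are invented.
SAMPLE = """
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0 204800  51200  10240 302400  512  768   120    40  300  500  5  2 90  3
 0  0 204800  51000  10240 302500    0    0     0    10  250  400  1  1 98  0
"""

if __name__ == "__main__":
    for si, so in swap_activity(SAMPLE):
        flag = "swapping" if si or so else "quiet"
        print(f"si={si} so={so} ({flag})")
```

Non-zero si/so values while the app is pausing would suggest that the JVM heap is being paged in from swap.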

What are your JVM arguments? A heap that is too small may be the issue.


Re: How do I track down a painfully long pause in a small web app?

2014-09-23 Thread larry google groups
I am intrigued by this article, as the problem sounds the same as mine:

http://corner.squareup.com/2014/09/logging-can-be-tricky.html

No significant amount of resources appeared to be in use — disk I/O, 
network I/O, CPU, and memory all looked fairly tame. Furthermore, the bulk 
of queries being served were all performing as expected. 

So I tried to follow their example regarding strace, which I have never 
used before. I used grep to find the PID and then ran:

 sudo strace -c -f -p 19363

and I got:

Process 19363 attached with 45 threads

Then I ran our health check, which is a series of functional tests 
that ping our actual app (a live environment rather than a test 
environment). strace printed nothing except these 2 lines:

Process 20973 attached
Process 20974 attached

What does this mean? I had the impression that the JVM ran in 1 process? 
Does strace show me userland threads (like htop does) or are these child 
processes? 





On Monday, September 15, 2014 12:15:14 AM UTC-4, larry google groups wrote:


 I have an embarrassing problem. I convinced my boss that I could use 
 Clojure to build a RESTful API. I was successful insofar as that went, 
 but now every once in a while the program pauses for a painfully long 
 time, sometimes 30 seconds, which causes some requests to the API to 
 time out. We are still in testing, so there is no real load on the app, 
 just the frontenders writing JavaScript and making Ajax calls to the 
 service. 

 The service seems like a basic Clojure web app. I use Jetty as the 
 webserver, and the libraries in use are: 

 Ring
 Compojure
 Liberator
 Monger
 Timbre
 Lamina
 Dire

 When someone complains about the pauses, I go and test the service, and I 
 can hit it with 40 requests in 10 seconds and it performs well. The 
 pauses actually seem to come after periods of inactivity, which made me 
 think that this had something to do with garbage collection, except that 
 the pauses are so extreme: as I said, sometimes as much as 30 seconds, 
 causing requests to time out. When the app finally starts to respond 
 again, it is very fast, and it answers the pending requests quickly. 

 But I have to find a way to fix these pauses. 

 Right now I packaged the app as an Uberjar and put it on the server, spun 
 it up on port 24000 and proxied it through Apache. I put a script in 
 /etc/init.d to start the app using  start-stop-daemon.

 Possible things that could be going wrong:

 Maybe Jetty needs more threads, or maybe fewer threads? How would I test 
 that?

 Maybe the link to MongoDB sometimes dies? (Mongo is on another server at 
 Amazon) How would I test that? 

 Maybe it is garbage collection? How would I test that? 

 Maybe I have some code that somehow blocks the whole app? Seems unlikely 
 but I'm trying to keep an open mind. 

 Maybe the thread pool managed by Lamina sometimes gets overwhelmed? How 
 would I test that? 

 Maybe when Timbre writes to the log file it causes things to pause? (But I 
 believe Timbre does this in its own thread?) How do I test that? 
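Whichever of these turns out to be the cause, it helps to know exactly when each pause happens so it can be lined up with GC logs, swap activity or thread dumps. A minimal sketch of such a probe follows, assuming the app exposes some cheap URL to request; the helper names are hypothetical and only the Python standard library is used.

```python
import time
from datetime import datetime
from urllib.request import urlopen


def timed(call):
    """Run call() and return how long it took, in seconds."""
    start = time.monotonic()
    call()
    return time.monotonic() - start


def probe(call, samples, interval_s, slow_s):
    """Call `call` repeatedly; return (timestamp, seconds) for the slow ones."""
    slow = []
    for _ in range(samples):
        stamp = datetime.now().isoformat(timespec="seconds")
        elapsed = timed(call)
        if elapsed >= slow_s:
            slow.append((stamp, elapsed))
        time.sleep(interval_s)
    return slow


def hit_url(url, timeout_s=60):
    """Fetch url once; errors and timeouts are swallowed so probing goes on."""
    try:
        with urlopen(url, timeout=timeout_s) as response:
            response.read()
    except OSError:
        pass
```

For example, `probe(lambda: hit_url("http://127.0.0.1:24000/health"), samples=720, interval_s=5, slow_s=2.0)` would poll for about an hour and return the timestamps of every request that took two seconds or longer. The port is the one mentioned above; the `/health` path is made up and would need to be replaced with a real endpoint.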

 This is a small app: only about 1,100 lines of code.

 I don't have much experience debugging problems on the JVM, so I welcome 
 any suggestions. 









-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: How do I track down a painfully long pause in a small web app?

2014-09-23 Thread larry google groups
I'm guessing that strace is showing me userland threads? When I quit strace 
I see:

^CProcess 19363 detached
Process 19364 detached
Process 19365 detached
Process 19366 detached
Process 19367 detached
Process 19368 detached
Process 19369 detached
Process 19370 detached
Process 19371 detached
Process 19372 detached
Process 19377 detached
Process 19378 detached
Process 19379 detached
Process 19380 detached
Process 19381 detached
Process 19382 detached
Process 19383 detached
Process 19384 detached
Process 19385 detached
Process 19386 detached
Process 19387 detached
Process 19388 detached
Process 19389 detached
Process 19390 detached
Process 19391 detached
Process 19392 detached
Process 19393 detached
Process 19394 detached
Process 19395 detached
Process 19396 detached
Process 19397 detached
Process 19398 detached
Process 19399 detached
Process 19400 detached
Process 19401 detached
Process 19402 detached
Process 19403 detached
Process 19404 detached
Process 19405 detached
Process 19406 detached
Process 19407 detached
Process 19408 detached
Process 19409 detached
Process 19410 detached
Process 19606 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 90.06   40.973072        1363     30059     10449 futex
  4.23    1.926411         819      2353           epoll_wait
  3.02    1.373282      114440        12         6 restart_syscall
  1.24    0.563107       93851         6           accept
  0.99    0.449988          12     36909           gettimeofday
  0.35    0.156992           5     29410           clock_gettime
  0.05    0.021064          67       316           recvfrom
  0.02    0.010338          30       347           write
  0.01    0.005117          24       209           sendto
  0.01    0.004369          24       180           poll
  0.01    0.002683          24       112        22 read
  0.01    0.002563          24       108         6 epoll_ctl
  0.00    0.001618          14       112           open
  0.00    0.001189           5       230           fcntl
  0.00    0.001132           8       142           mprotect
  0.00    0.000969           8       118           close
  0.00    0.000806          38        21           writev
  0.00    0.000685           6       109           ioctl
  0.00    0.000655           6       110           fstat
  0.00    0.000229          13        17           mmap
  0.00    0.000216          36         6           shutdown
  0.00    0.000197          33         6           dup2
  0.00    0.000092          46         2           madvise
  0.00    0.000061           5        12           setsockopt
  0.00    0.000057          14         4           munmap
  0.00    0.000056           5        12           getsockname
  0.00    0.000035           4         8           rt_sigprocmask
  0.00    0.000018           5         4           sched_getaffinity
  0.00    0.000010           5         2           clone
  0.00    0.000009           9         1           rt_sigreturn
  0.00    0.000009           5         2           uname
  0.00    0.000009           5         2           set_robust_list
  0.00    0.000008           4         2           gettid
------ ----------- ----------- --------- --------- ----------------
100.00   45.497046                100943     10483 total
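On the threads-versus-processes guess: on Linux the JVM runs each Java thread as a native thread with its own kernel ID, and strace labels each of those IDs "Process". The detach list above has 45 entries, the same number of threads strace reported on attach, which is consistent with one JVM process and its threads rather than child processes. A minimal sketch of checking this without strace, by listing the thread IDs the kernel exposes under /proc; it uses Python's own threads as a stand-in for the JVM's, and the function name is an assumption.

```python
import os
import threading


def thread_ids(pid):
    """Return the kernel thread IDs of a process, read from /proc (Linux only)."""
    task_dir = f"/proc/{pid}/task"
    if not os.path.isdir(task_dir):
        return []  # not Linux, or no such process
    return sorted(int(name) for name in os.listdir(task_dir))


if __name__ == "__main__":
    me = os.getpid()
    print("before:", thread_ids(me))
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait)
    worker.start()
    # The new thread shows up as an extra ID under the same process.
    print("after starting a thread:", thread_ids(me))
    stop.set()
    worker.join()
```

Calling `thread_ids(19363)` on the server should list the same IDs that strace printed, with the process ID itself as the first thread.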







Re: How do I track down a painfully long pause in a small web app?

2014-09-15 Thread Linus Ericsson
If you turn on verbose gc for the JVM you could at least rule out GC pauses.
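On the HotSpot JVMs current in 2014 (Java 7 and 8), verbose GC is turned on with `-verbose:gc`, and `-Xloggc:<file>` sends the output to a file. A minimal sketch of scanning such a log for long collections follows. It assumes the plain `-verbose:gc` line format shown in the invented sample and does not handle the nested output of `-XX:+PrintGCDetails`; the function name and threshold are assumptions.

```python
import re

# Matches lines such as "[Full GC 267628K->83769K(776768K), 1.8479984 secs]".
PAUSE = re.compile(r"\[(Full GC|GC)\b[^\]]*?(\d+\.\d+) secs\]")


def long_pauses(log_text, min_seconds=1.0):
    """Return (kind, seconds) for every collection of at least min_seconds."""
    found = []
    for line in log_text.splitlines():
        match = PAUSE.search(line)
        if match and float(match.group(2)) >= min_seconds:
            found.append((match.group(1), float(match.group(2))))
    return found


# Hypothetical log lines; the values are invented.
SAMPLE = """
[GC 325407K->83000K(776768K), 0.2300771 secs]
[Full GC 267628K->83769K(776768K), 1.8479984 secs]
[GC (Allocation Failure) 100000K->50000K(776768K), 0.0123456 secs]
"""

if __name__ == "__main__":
    for kind, seconds in long_pauses(SAMPLE):
        print(f"{kind}: {seconds:.2f}s")
```

If no collection in the log comes anywhere near 30 seconds, GC can be ruled out as the direct cause of the pauses.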

Hmm, exactly how do you route the requests through the Apache server? It
almost sounds like your application is restarted every now and then. IIRC,
Apache only serves a limited number of requests per server thread.

If this somehow started a new JVM per Apache thread, things would get
strange. What does $ ps ax --forest say?

/Linus
 On 15 Sep 2014 06:44, Shantanu Kumar kumar.shant...@gmail.com wrote:

 A few things to consider:
 1. Which API calls pause? If only certain calls pause, then probably you
 have something specific to suspect. Try adding a dummy REST call - see if
 that call pauses while others do.
 2. Is any of your services running on a t1.micro or a burst-oriented EC2
 instance on AWS? Try changing the instance type in that case.
 3. Can you mock out the components that you suspect could be a problem?
 Begin by mocking out everything you suspect, then replace the mock with
 actual impl one component at a time until you isolate the problematic
 component.
 4. Have you tried running a profiler?
 5. Have you tried printing GC info? Maybe this could be useful:
 http://blog.ragozin.info/2011/09/hotspot-jvm-garbage-collection-options.html

 Shantanu



Re: How do I track down a painfully long pause in a small web app?

2014-09-15 Thread François Rey
GC would be the first suspect, but then it could also be combined with a 
swap issue, or a JVM bug.
Have a look at this article, which ends with a concrete list of things 
to do:

https://blogs.oracle.com/poonam/entry/troubleshooting_long_gc_pauses



Re: How do I track down a painfully long pause in a small web app?

2014-09-15 Thread David Powell
Use the jvisualvm tool that comes with the JDK. You should be able to
connect to the Clojure process.

Look at the memory usage graphs: if the heap size is banging against
the max heap size, then you might just be using too small a heap. Try
increasing it.

You can also install the visualgc plugin for jvisualvm to get more info on
timings.


Alternatively go to the Threads pane, and click Thread Dump during the 30
second pause - you should be able to confirm what actual code is running at
this point, which might give a clue to what is going on.
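The same dump can be taken from the command line with `jstack <pid>`. A minimal sketch of summarising a saved dump by thread state follows, so that a pile of BLOCKED threads stands out. The sample dump, including its thread names and stack frames, is invented, and the function name is an assumption.

```python
import re
from collections import Counter

THREAD = re.compile(r'^"(?P<name>[^"]+)"')
STATE = re.compile(r"java\.lang\.Thread\.State: (?P<state>[A-Z_]+)")


def thread_states(dump_text):
    """Map each thread name in a jstack-style dump to its Thread.State."""
    states, current = {}, None
    for line in dump_text.splitlines():
        header = THREAD.match(line)
        if header:
            current = header.group("name")
            continue
        state = STATE.search(line)
        if state and current is not None:
            states[current] = state.group("state")
            current = None
    return states


# Hypothetical excerpt of a thread dump; names and frames are invented.
SAMPLE = '''
"qtp-worker-1" #21 prio=5 os_prio=0 tid=0x00007f0001 nid=0x4ba3 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
        at example.Logger.append(Logger.java:42)

"qtp-worker-2" #22 prio=5 os_prio=0 tid=0x00007f0002 nid=0x4ba4 runnable
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)

"VM Thread" os_prio=0 tid=0x00007f0003 nid=0x4ba5 runnable
'''

if __name__ == "__main__":
    print(Counter(thread_states(SAMPLE).values()))
```

Threads that are all BLOCKED on the same monitor during the pause would point at application or library code, whereas a dump that cannot be taken at all until the pause ends would point back at the JVM or the operating system.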


If you have a memory leak, the Heap Dump button on the Monitor tab lets you
interactively explore all memory in the jvm.  If there is a lot of
something, that might be the thing that is leaking.





Re: How do I track down a painfully long pause in a small web app?

2014-09-15 Thread larry google groups
 1. Which API calls pause? If only certain calls pause, then probably you 
 have something specific to suspect. Try adding a dummy REST call - see if 
 that call pauses while others do.

I will add a dummy REST call, although this pause does not seem specific to 
a particular API call.


 2. Is any of your services running on a t1.micro or a burst-oriented EC2 
 instance on AWS? Try changing the instance type in that case.

We started on a small instance but recently we moved up to a reasonably 
powered machine with 4 gigs of RAM. 


 Have you tried printing GC info? 

No, but I will. Thank you.





Re: How do I track down a painfully long pause in a small web app?

2014-09-15 Thread larry google groups
 Hmm, exactly how do you route the requests through the Apache server? It 
 almost sounds like your application is restarted every now and then, iirc 
 Apache only serves a limited number of requests per server thread.

Interesting if true, but I assume there would be an error if 2 instances of 
the app both tried to grab the same port. In:

/etc/apache2/sites-available/000-default.conf

which is the only config file we are using right now, I have this:

ProxyPass /user/ http://127.0.0.1:34002/
ProxyPassReverse /user/ http://127.0.0.1:34002/

ProxyPass /user http://127.0.0.1:34002/
ProxyPassReverse /user http://127.0.0.1:34002/

This works fine. The requests proxy through Apache and back. 

I repeated the line to deal with the trailing /




Re: How do I track down a painfully long pause in a small web app?

2014-09-15 Thread larry google groups
 If this somehow started a new JVM per apache thread things would go 
 strange. What does $ps ax --forest say?


That is a good thought, but I only see it once. 



On Monday, September 15, 2014 2:45:13 AM UTC-4, Linus Ericsson wrote:

 If you turn on verbose gc for the JVM you could at least rule out GC 
 pauses.

 Hmm, exactly how do you route the requests through the apache server? It 
 almost sounds like your applikation is restarted every now and then, iirc 
 Apache only servers a limited amount of requests per server thread.

 If this somehow started a new JVM per apache thread things would go 
 strange. What does $ps ax --forest say?

 /Linus

Re: How do I track down a painfully long pause in a small web app?

2014-09-15 Thread larry google groups
Okay, I will dig into jvisualvm. Thanks. 


On Monday, September 15, 2014 5:53:34 AM UTC-4, David Powell wrote:

 Use the jvisualvm tool that comes with the JDK; you should be able to 
 connect to the Clojure process.

 Look at the memory usage graphs: if the heap size is banging against the 
 max heap size, then you might just be using too small a heap; try 
 upping it.

 You can also install the Visual GC plugin for jvisualvm to get more info on 
 timings.


 Alternatively, go to the Threads pane and click Thread Dump during the 30 
 second pause. You should be able to confirm what code is actually running at 
 that point, which might give a clue to what is going on.


 If you have a memory leak, the Heap Dump button on the Monitor tab lets 
 you interactively explore all memory in the jvm.  If there is a lot of 
 something, that might be the thing that is leaking.
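
Catching the pause by hand in jvisualvm can be hard when it only shows up now and then. As a rough sketch of automating the "thread dump during the pause" idea, the Python script below sends one request in a worker thread and, if the request is still hanging after a few seconds, runs a dump command while it waits. It swaps the jvisualvm button for the jstack command-line tool that ships with the JDK. The URL path, the pid and the thresholds are placeholders, not values from this thread, and if the whole JVM is stalled the dump itself may only return once the stall ends.

```python
import concurrent.futures
import subprocess
import time
import urllib.request


def fetch(url, timeout=60):
    """Return a zero-argument function that requests url and discards the body."""
    def run():
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    return run


def jstack(pid):
    """Return a zero-argument function that runs `jstack <pid>` and returns its output."""
    def run():
        done = subprocess.run(["jstack", str(pid)], capture_output=True, text=True)
        return done.stdout
    return run


def dump_if_stalled(probe, stall_after, dump):
    """Run probe() in a worker thread. If it has not finished after
    stall_after seconds, call dump() while the probe is still waiting.

    Returns (probe_elapsed_seconds, dump_result_or_None).
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        start = time.monotonic()
        future = pool.submit(probe)
        done, _ = concurrent.futures.wait([future], timeout=stall_after)
        dumped = None
        if not done:
            dumped = dump()                    # taken while the request hangs
            concurrent.futures.wait([future])  # then let the probe finish
        return time.monotonic() - start, dumped


# Example call (hypothetical path and pid; port 24000 is from the original post):
# elapsed, dump = dump_if_stalled(fetch("http://localhost:24000/ping"), 5, jstack(12345))
```

Running the example call in a loop with a few idle minutes between iterations, and writing any non-empty dump to a file, should show which threads were doing what when a request stalled.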



 On Mon, Sep 15, 2014 at 7:55 AM, François Rey fmj...@gmail.com wrote:

 GC would be the first suspect, but then it could also be combined with a 
 swap issue, or a JVM bug.
 Have a look at this article, which ends with a concrete list of things to 
 do:
 https://blogs.oracle.com/poonam/entry/troubleshooting_long_gc_pauses






-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
Clojure group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: How do I track down a painfully long pause in a small web app?

2014-09-15 Thread Nando Breiter
I don't have any experience configuring Clojure apps on the JVM, yet, but
it may be that increasing the RAM on the server does not increase the RAM
allocated to the JVM instance Clojure is running on.



Aria Media Sagl
Via Rompada 40
6987 Caslano
Switzerland

+41 (0)91 600 9601
+41 (0)76 303 4477 cell
skype: ariamedia

On Mon, Sep 15, 2014 at 4:04 PM, larry google groups 
lawrencecloj...@gmail.com wrote:

  1. Which API calls pause? If only certain calls pause, then probably you
  have something specific to suspect. Try adding a dummy REST call - see if
  that call pauses while others do.

 I will add a dummy REST call, although this pause does not seem specific
 to a particular API call.


  2. Is any of your services running on a t1.micro or a burst-oriented EC2
  instance on AWS? Try changing the instance type in that case.

 We started on a small instance but recently we moved up to a reasonably
 powered machine with 4 gigs of RAM.


  Have you tried printing GC info?

 No, but I will. Thank you.




How do I track down a painfully long pause in a small web app?

2014-09-14 Thread larry google groups

I have an embarrassing problem. I convinced my boss that I could use 
Clojure to build a RESTful API. I was successful insofar as that went, 
but now I face the issue that every once in a while the program pauses 
for a painfully long time, sometimes 30 seconds, which causes some 
requests to the API to time out. We are still in testing, so there is no 
real load on the app, just the frontenders writing JavaScript and making 
Ajax calls to the service. 

The service seems like a basic Clojure web app. I use Jetty as the 
webserver, and the libraries in use are: 

Ring

Compojure

Liberator

Monger 

Timbre

Lamina

Dire

When someone complains about the pauses, I go test the service, and I 
can hit it with 40 requests in 10 seconds and it has great performance. The 
pauses actually seem to come after periods of inactivity, which made me 
think that this had something to do with garbage collection, except that 
the pauses are so extreme -- like I said, sometimes as much as 30 seconds, 
causing requests to time out. When the app does finally start to respond 
again, it goes very fast, and responds to those pending requests very fast. 

But I have to find a way to fix these pauses. 

Right now I packaged the app as an Uberjar and put it on the server, spun 
it up on port 24000 and proxied it through Apache. I put a script in 
/etc/init.d to start the app using  start-stop-daemon.

Possible things that could be going wrong:

Maybe Jetty needs more threads, or maybe fewer threads? How would I test 
that?

Maybe the link to MongoDB sometimes dies? (Mongo is on another server at 
Amazon) How would I test that? 

Maybe it is garbage collection? How would I test that? 

Maybe I have some code that somehow blocks the whole app? Seems unlikely 
but I'm trying to keep an open mind. 

Maybe the thread pool managed by Lamina sometimes gets overwhelmed? How 
would I test that? 

Maybe when Timbre writes to the log file it causes things to pause? (But I 
believe Timbre does this in its own thread?) How do I test that? 

This is a small app: only about 1,100 lines of code.

I don't have much experience debugging problems on the JVM, so I welcome 
any suggestions. 
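
For the "maybe the link to MongoDB sometimes dies" question, one cheap first check is to time plain TCP connections from the app server to the database host over a long period. The Python sketch below does only that. It says nothing about query latency or about the driver's connection pool, and the host name in the example call is a placeholder (27017 is MongoDB's default port).

```python
import socket
import time


def connect_time(host, port, timeout=10.0):
    """Open and close one TCP connection; return (seconds, error_or_None)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
        return time.monotonic() - start, None
    except OSError as exc:  # refused, unreachable, timed out, DNS failure
        return time.monotonic() - start, exc


def watch(host, port, samples, interval, slow_after=0.5):
    """Sample connect_time repeatedly; return the samples that failed or were slow."""
    bad = []
    for i in range(samples):
        elapsed, err = connect_time(host, port)
        if err is not None or elapsed >= slow_after:
            bad.append((time.strftime("%H:%M:%S"), elapsed, err))
        if i < samples - 1:
            time.sleep(interval)
    return bad


# Example call (hypothetical host name): one sample a minute for a day.
# for row in watch("mongo.example.internal", 27017, samples=1440, interval=60):
#     print(row)
```

If the timestamps of slow or failed connections line up with the reported pauses, the network path is worth a closer look; if the list stays empty while pauses still happen, the link is a less likely suspect.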









Re: How do I track down a painfully long pause in a small web app?

2014-09-14 Thread Shantanu Kumar
A few things to consider:
1. Which API calls pause? If only certain calls pause, then you probably 
have something specific to suspect. Try adding a dummy REST call and see if 
that call pauses while others do.
2. Are any of your services running on a t1.micro or a burst-oriented EC2 
instance on AWS? Try changing the instance type in that case.
3. Can you mock out the components that you suspect could be a problem? 
Begin by mocking out everything you suspect, then replace the mocks with 
the actual implementation one component at a time until you isolate the 
problematic component.
4. Have you tried running a profiler?
5. Have you tried printing GC info? Maybe this could be useful: 
http://blog.ragozin.info/2011/09/hotspot-jvm-garbage-collection-options.html

Shantanu
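
Point 1 can be checked from outside the app. The Python sketch below calls a dummy endpoint and a real endpoint back to back, then stays idle before the next round (the pauses were reported after periods of inactivity), and records every call that failed or was slow. The URL paths are made up for illustration, and the dummy route itself would still have to be added to the app.

```python
import time
import urllib.request


def time_call(fn):
    """Run fn() and return (elapsed_seconds, error_or_None)."""
    start = time.monotonic()
    try:
        fn()
        return time.monotonic() - start, None
    except Exception as exc:  # a timeout or refused connection is a data point too
        return time.monotonic() - start, exc


def http_get(url, timeout=60):
    """Return a zero-argument function that fetches url and discards the body."""
    def fetch():
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    return fetch


def compare(probes, rounds, idle_seconds, slow_after=1.0):
    """Call every probe once per round, then stay idle before the next round.

    probes maps a label to a zero-argument function. Returns a list of
    (round, label, elapsed, error) for calls that failed or took at least
    slow_after seconds.
    """
    slow = []
    for i in range(rounds):
        for label, fn in probes.items():
            elapsed, err = time_call(fn)
            if err is not None or elapsed >= slow_after:
                slow.append((i, label, elapsed, err))
        if i < rounds - 1:
            time.sleep(idle_seconds)
    return slow


def main():
    # Hypothetical URLs: port 24000 is from the original post, the paths are not.
    probes = {
        "dummy": http_get("http://localhost:24000/ping"),
        "real": http_get("http://localhost:24000/api/some-resource"),
    }
    for row in compare(probes, rounds=20, idle_seconds=300):
        print(row)
```

Calling main() on a machine that can reach the service gives a list to read: if the dummy call stalls together with the real one, the problem is probably below the handlers (JVM, proxy, machine); if only the real call stalls, the handler's own dependencies become the suspects.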
