[ 
https://issues.apache.org/jira/browse/AURORA-458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014305#comment-14014305
 ] 

David McLaughlin edited comment on AURORA-458 at 5/30/14 10:31 PM:
-------------------------------------------------------------------

So these profile runs show conclusively that GzipStream is the cause. 

This is timed output from a local run with no network latency:

{code}
$ time curl -s 'http://localhost:8081/api' \
    -H 'Accept-Encoding: gzip,deflate,sdch' \
    --data-binary '[1,"getTasksStatus",1,0,{"1":{"rec":{"8":{"rec":{"1":{"str":"mesos"}}},"9":{"str":"test"},"2":{"str":"bigJob"}}}}]' \
    --compressed > /tmp/results
real  0m1.530s
user  0m0.014s
sys 0m0.011s


$ time curl -s 'http://localhost:8081/api' \
    -H 'Origin: http://localhost:8081' \
    --data-binary '[1,"getTasksStatus",1,0,{"1":{"rec":{"8":{"rec":{"1":{"str":"mesos"}}},"9":{"str":"test"},"2":{"str":"bigJob"}}}}]' \
    > /tmp/blah

real  0m0.297s
user  0m0.007s
sys 0m0.015s
{code}

As you can see, the request is about 5x faster without compression (0.297s vs 1.530s).

With actual network latency (and a real production job with a much bigger 
payload, 10MB vs 3MB locally):

{code}
$ time curl 'https://internal-scheduler/api' \
    -H 'Accept-Encoding: gzip,deflate,sdch' \
    --data-binary '[1,"getTasksStatus",1,0,{"1":{"rec":{"8":{"rec":{"1":{"str":"test"}}},"9":{"str":"prod"},"2":{"str":"bigJob"}}}}]' \
    --compressed > /tmp/results
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  305k  100  305k  100   124  63172     25  0:00:04  0:00:04 --:--:-- 81652

real  0m4.957s
user  0m0.038s
sys 0m0.024s


$ time curl 'https://internal-scheduler/api' \
    --data-binary '[1,"getTasksStatus",1,0,{"1":{"rec":{"8":{"rec":{"1":{"str":"test"}}},"9":{"str":"prod"},"2":{"str":"bigJob"}}}}]' \
    > /tmp/results
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10.3M  100 10.3M  100   124  3670k     42  0:00:02  0:00:02 --:--:-- 3684k

real  0m2.904s
user  0m0.192s
sys 0m0.083s
{code}

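Even with network latency, the uncompressed request is still about 1.7x faster 
(2.904s vs 4.957s), despite transferring 10.3M instead of 305k. So we should 
remove on-the-fly gzip compression for dynamic content. 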
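
As a quick sanity check on the figures quoted above, the ratios can be recomputed from the reported wall-clock times and transfer sizes. This is only a sketch of the arithmetic: the variable names are made up for illustration, and it assumes curl's progress meter reports sizes in 1024-based units.

```python
# Wall-clock ("real") times in seconds, copied from the curl runs above.
local_gzip_s, local_plain_s = 1.530, 0.297
prod_gzip_s, prod_plain_s = 4.957, 2.904


def speedup(slow_s, fast_s):
    """How many times faster the fast run is than the slow run."""
    return slow_s / fast_s


local_speedup = speedup(local_gzip_s, local_plain_s)
prod_speedup = speedup(prod_gzip_s, prod_plain_s)

# Transfer sizes from curl's progress meter for the production runs.
# Assumption: "k" and "M" are 1024-based units.
gzip_kib = 305.0
plain_kib = 10.3 * 1024
compression_ratio = plain_kib / gzip_kib

print(f"local: {local_speedup:.2f}x faster without gzip")  # 5.15x
print(f"prod:  {prod_speedup:.2f}x faster without gzip")   # 1.71x
print(f"gzip shrinks the prod payload about {compression_ratio:.1f}x")  # 34.6x
```

So gzip cuts the bytes on the wire by roughly 35x for the production job, yet the compressed request still takes longer end to end, which is consistent with the time going into compression rather than transfer.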



> Web interface has become slow, especially the job page
> ------------------------------------------------------
>
>                 Key: AURORA-458
>                 URL: https://issues.apache.org/jira/browse/AURORA-458
>             Project: Aurora
>          Issue Type: Bug
>          Components: UI
>            Reporter: Bill Farner
>            Assignee: David McLaughlin
>            Priority: Critical
>         Attachments: Screen Shot 2014-05-22 at 11.42.24 AM.png, Screen Shot 
> 2014-05-22 at 11.44.27 AM.png, scheduler-profile-curl.csv, 
> scheduler-profile-curl.png, scheduler-profile.csv, scheduler-profile.png
>
>
> The web interface is noticeably more sluggish since the revamp.  This is most 
> noticeable for large jobs, where the job page may display a blank page for 
> several seconds before showing anything useful.  We need to adapt the API to 
> reduce the amount of data fetched to render these pages.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
