Re: [webkit-dev] Adding archive.org-based page loading time performance tests

2012-07-27 Thread Ryosuke Niwa
Hi,

I have made some progress in the last couple of months, and you can now
play with it on your machine using the Mac port, or the Chromium port on
Mac or Linux. Follow the instructions at
http://trac.webkit.org/wiki/Writing%20Performance%20Tests

Note: it has been found that *it doesn't run on a Google-issued Mac* due to
network setting issues (DRT simply ignores the proxy settings you set,
regardless of whether you're on the corporate network or not).

- Ryosuke

On Mon, Apr 16, 2012 at 1:42 AM, Ryosuke Niwa rn...@webkit.org wrote:

 *Summary*
 I propose to add a new page load time performance test suite that loads
 pages on archive.org.

 *Problem*
 Google's page cycler and Apple's PLT test suites are both private due to
 copyright restrictions, and people outside of these two organizations
 cannot see the contents. This severely limits the utility and the
 effectiveness of these test suites because not all contributors can run
 them locally. We need a publicly distributable version of these two test
 suites.

 *Proposal*
 We can measure performance metrics on snapshots of popular websites taken
 at a specific date and time on archive.org. Because archive.org will give
 us the same snapshot each time for a given URL, our test suite only needs
 to store a list of URLs. This eliminates the need to distribute the page
 contents with the suite and still allows all contributors to obtain the
 same test pages when running the test suite. In order to avoid DoS'ing
 archive.org, we can use web-page-replay
 (http://code.google.com/p/web-page-replay/) to create a persistent cache.
 Credit for this novel idea: Greg Simon.
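 The key property is that a Wayback Machine snapshot is pinned by a 14-digit
 timestamp embedded in the URL, so the suite can be distributed as nothing
 more than (timestamp, URL) pairs. A minimal sketch, assuming the snapshot
 URL scheme shown in the examples later in this thread (the page list itself
 is hypothetical):

```python
def snapshot_url(timestamp, original_url):
    """Build a Wayback Machine URL that pins a fixed snapshot of a page."""
    return "http://web.archive.org/web/%s/%s" % (timestamp, original_url)

# A hypothetical test list: only timestamps and URLs need to be checked in,
# not the page contents themselves.
PAGES = [
    ("20110111083848", "http://techcrunch.com/"),
    ("20110222032916", "http://www.nytimes.com/"),
]

for ts, url in PAGES:
    print(snapshot_url(ts, url))
```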

 I have posted a work-in-progress patch at
 https://bugs.webkit.org/show_bug.cgi?id=84008. In addition, I notified
 the Internet Archive of the proposed plan on April 12th, but I haven't
 received any response yet.

 Best,
 Ryosuke Niwa
 Software Engineer
 Google Inc.


___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo/webkit-dev


Re: [webkit-dev] Adding archive.org-based page loading time performance tests

2012-04-30 Thread Darin Fisher
On Sun, Apr 29, 2012 at 3:44 PM, Ryosuke Niwa rn...@webkit.org wrote:

 On Fri, Apr 27, 2012 at 1:49 AM, Nat Duca nd...@chromium.org wrote:

 I'm concerned about how well this would work for graphics performance tests.

 Consider:
 http://web.archive.org/web/20110111083848/http://techcrunch.com/

 http://web.archive.org/web/20110222032916/http://www.nytimes.com/


 http://web.archive.org/web/20110429194113/http://www.thewildernessdowntown.com/

 What do we do for the cases where archive.org is getting bad/incomplete
 ... erm, archives?

 There's no fix for it. If archive.org doesn't work, then we need to pull
 data directly from the website. We can do that. The infrastructure I'm
 developing is agnostic to whether we use archive.org or not. However,
 pulling data directly from websites makes the test suite behave
 differently depending on when you run it, so the test suite can't be
 open that way.


Does it matter if the page contents are bad/incomplete?  It seems like all
that matters is that they are consistent from pull-to-pull and somewhat
representative of pages we'd care to optimize.  Is the concern that those
URLs are missing too much content to be useful?

Note: The page cyclers used by Chromium all have data sets that are
bad/incomplete. This was intentional. For example, if a subresource was
not available for whatever reason, then the request to fetch it was
neutered (e.g., all "http" substrings were replaced with "httpdisabled").

-Darin


Re: [webkit-dev] Adding archive.org-based page loading time performance tests

2012-04-30 Thread Tony Gentilcore
 Does it matter if the page contents are bad/incomplete?

Good point. It seems fine for any given page to be incomplete in a
specific way. The only thing that would concern me is if we always
missed a certain class of resources. For instance, if we never recorded
resources fetched via XHR, it could lead us to miss a class of
optimizations or, worse yet, to make a bad tradeoff.

-Tony


Re: [webkit-dev] Adding archive.org-based page loading time performance tests

2012-04-29 Thread Ryosuke Niwa
On Fri, Apr 27, 2012 at 1:49 AM, Nat Duca nd...@chromium.org wrote:

 I'm concerned about how well this would work for graphics performance tests.

 Consider:
 http://web.archive.org/web/20110111083848/http://techcrunch.com/

 http://web.archive.org/web/20110222032916/http://www.nytimes.com/


 http://web.archive.org/web/20110429194113/http://www.thewildernessdowntown.com/

 What do we do for the cases where archive.org is getting bad/incomplete
 ... erm, archives?

There's no fix for it. If archive.org doesn't work, then we need to pull
data directly from the website. We can do that. The infrastructure I'm
developing is agnostic to whether we use archive.org or not. However,
pulling data directly from websites makes the test suite behave
differently depending on when you run it, so the test suite can't be
open that way.

- Ryosuke