On Wed, 14 Dec 2011 08:36:44 +0100, Boris Zbarsky <bzbar...@mit.edu> wrote:
> John Jensen here at Mozilla has been doing some web crawling trying to
> find what barewords are used in on* attributes.
Awesome!
> What I have so far as a result is a list of about 1.7 million barewords
> used across several tens of thousands of pages.
Do you have a more accurate figure for the number of pages?
> If people are interested in the exact methodology, I can probably get a
> description.
I'm interested. It's hard to draw conclusions from data without knowing
what the data is, how it is biased, what false positives it might have,
etc.
> I'm working on making sure that it's ok for me to post the data in its
> entirety so you can all look as well. Assuming it is (very likely),
> where's a good place to stick a 7MB compressed file?
> In any case, for this particular data set there are no hits on "findAll"
> or "matches" (good!), but there are two hits on "find" as a bareword in
> an on* attribute. Specifically:
> 1) http://otc-pif.rbc.ru/pif_calculator/calculator.jsp has
> onclick="find(document.getElementById(current + 'List').children,
> searchString.value)"
> 2) http://bookmark.people.com.cn/index.html has onclick="find()"
> These would both obviously get broken by the proposed find() API, unless
> we actually do some sort of workaround for this problem...
> -Boris
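For anyone following along, here is a rough sketch of why those two handlers would break. It relies on the fact that a bareword in an on* attribute is looked up on the element first, then on the document, and only then on the global object (the form owner, which also sits in that chain for form controls, is left out). Everything below is a simulation with plain objects: resolveBareword, fakeElement, fakeDocument and fakeWindow are hypothetical names, and the find() added to the element only stands in for the proposed API under the assumption that it lands on elements.

```javascript
// Hypothetical model of bareword lookup in an on* attribute.
// Plain objects stand in for the element, the document and the window;
// nothing here is real DOM or part of any proposed API.
function resolveBareword(name, scopeChain) {
  // Walk the chain from innermost to outermost scope.
  for (const scope of scopeChain) {
    if (name in scope) {
      return scope[name];
    }
  }
  throw new ReferenceError(name + " is not defined");
}

// The page defines its own global find(), as the two sites above appear to do.
const fakeWindow = { find: () => "page-defined find" };
const fakeDocument = {};
const fakeElement = {};

// Lookup order for an event handler attribute: element, document, window.
const chain = [fakeElement, fakeDocument, fakeWindow];

// Today: nothing on the element or document is called "find",
// so the bareword reaches the page's global function.
const before = resolveBareword("find", chain)();

// Assumption: the proposed API adds a find() method to elements.
fakeElement.find = () => "element find";

// Now the same bareword stops at the element and never reaches the page's function.
const after = resolveBareword("find", chain)();

console.log(before); // page-defined find
console.log(after);  // element find
```

The same shadowing would happen if find() were added to the document instead of the element, since both sit ahead of the global object in the lookup order.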
--
Simon Pieters
Opera Software