On Wed, 14 Dec 2011 08:36:44 +0100, Boris Zbarsky <bzbar...@mit.edu> wrote:

> John Jensen here at Mozilla has been doing some web crawling trying to find what barewords are used in on* attributes.

Awesome!

> What I have so far as a result is a list of about 1.7 million barewords used across several tens of thousands of pages.

Do you have a more accurate figure for the number of pages?

> If people are interested in the exact methodology, I can probably get a description.

I'm interested. It's hard to make conclusions from data without knowing what the data is, how it is biased, what false positives it might have, etc.

> I'm working on making sure that it's ok for me to post the data in its entirety so you can all look as well. Assuming it is (very likely), where's a good place to stick a 7MB compressed file?
>
> In any case, for this particular data set there are no hits on "findAll" or "matches" (good!), but there are two hits on "find" as a bareword in an on* attribute. Specifically:
>
> 1) http://otc-pif.rbc.ru/pif_calculator/calculator.jsp has onclick="find(document.getElementById(current + 'List').children, searchString.value)"
>
> 2) http://bookmark.people.com.cn/index.html has onclick="find()"
>
> These would both obviously get broken by the proposed find() API, unless we actually do some sort of workaround for this problem...
>
> -Boris
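
For anyone following along who wants to see why those two handlers would break: as far as I understand it, code in an on* attribute runs with the element itself (and the document) ahead of the global object in its scope chain, so a bareword find would hit a new find member on the element first. A rough sketch in plain JavaScript, using made-up stand-in objects rather than real DOM nodes. The names resolveBareword, fakeWindow, fakeDocument, elementToday and elementWithProposedApi are all hypothetical, and the form owner is left out of the chain for brevity:

```javascript
// Hypothetical sketch: resolve a bareword roughly the way an inline event
// handler would, by walking a scope chain in order (element, then document,
// then the global object). These are plain stand-in objects, not DOM nodes.
function resolveBareword(name, scopeChain) {
  for (const scope of scopeChain) {
    // `in` also consults the prototype chain, like a real property lookup.
    if (name in scope) {
      return scope[name];
    }
  }
  throw new ReferenceError(name + " is not defined");
}

// Stand-in for whatever global find() the page expects to call.
const fakeWindow = {
  find: function () { return "page-defined find"; },
};
const fakeDocument = {};

// An element with no find() member, as is the case today.
const elementToday = {};

// An element whose prototype has gained a find() member (assumed shape of
// the proposed API; only the name matters for this illustration).
const elementProto = {
  find: function () { return "element find"; },
};
const elementWithProposedApi = Object.create(elementProto);

const before = resolveBareword("find", [elementToday, fakeDocument, fakeWindow])();
const after = resolveBareword("find", [elementWithProposedApi, fakeDocument, fakeWindow])();

console.log(before); // page-defined find
console.log(after);  // element find
```

The page's handler source is identical in both cases; only the object that answers the bareword lookup changes.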



--
Simon Pieters
Opera Software
