We find that proper nouns constitute 40% of query terms, and proper nouns and nouns together constitute over 70% of query terms. We also show that the majority of queries are noun phrases, not unstructured collections of terms.
Way back when, we had a web-authoring environment that proposed likely hyperlinks by picking noun phrases out of a draft webpage and matching them against a full-text index of other content on the same server. At the time NCSA's "what's new" page was state-of-the-art in finding third-party web content; it might be interesting were someone else to take a run at that fence using modern search engine queries...
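For anyone who wants to try it, here is a rough sketch of the idea in Python. It is not the original code. The capitalized-word regex is only a crude stand-in for a real noun-phrase chunker, the "index" is a plain in-memory dict, and every name here (`candidate_phrases`, `build_index`, `propose_links`, the page paths) is made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: runs of two or more capitalized words approximate noun phrases.
# A real system would use a part-of-speech tagger or chunker instead.
PHRASE_RE = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def candidate_phrases(text):
    """Return the distinct capitalized phrases in text, in order of appearance."""
    seen = []
    for match in PHRASE_RE.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


def build_index(pages):
    """Map each lowercased word to the set of page paths whose body contains it."""
    index = defaultdict(set)
    for path, body in pages.items():
        for word in re.findall(r"[a-z0-9]+", body.lower()):
            index[word].add(path)
    return index


def propose_links(draft, index):
    """For each phrase in the draft, list the pages containing all of its words.

    This checks word co-occurrence on a page, not exact phrase adjacency.
    """
    proposals = {}
    for phrase in candidate_phrases(draft):
        words = phrase.lower().split()
        hits = set.intersection(*(index.get(w, set()) for w in words))
        if hits:
            proposals[phrase] = sorted(hits)
    return proposals


if __name__ == "__main__":
    # Hypothetical site content, keyed by path on the same server.
    pages = {
        "/library.html": "The digital library holds scanned reports.",
        "/staff.html": "Staff directory and library hours.",
    }
    draft = "Our group maintains the Digital Library archive."
    print(propose_links(draft, build_index(pages)))
```

Swapping the dict for a search engine query per phrase would be the obvious next step, at the cost of a network round trip for each candidate.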
-Dave
