I've wondered about a distributed version of this... If those of us who want to sift through the "entire" stream were to pool our API usage, in theory we could do it without knocking over Twitter, right?
My particular use case is mining for geo content, using either explicit lat/lng or NLP-based feature extraction.
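
To make the lat/lng half of that concrete, here is a rough sketch of pulling decimal coordinate pairs out of free text. The regex and the `extract_lat_lng` name are just my assumptions for illustration, not anything from the Twitter API, and it only handles the plain "lat, lng" decimal form (no degrees/minutes/seconds, no place names).

```python
import re

# Assumed pattern: two signed decimals separated by a comma,
# e.g. "37.7749, -122.4194". The lookbehind avoids starting a match
# in the middle of a longer number.
COORD_RE = re.compile(r'(?<![\d.])(-?\d{1,3}\.\d+)\s*,\s*(-?\d{1,3}\.\d+)')


def extract_lat_lng(text):
    """Return (lat, lng) float pairs found in text that fall within valid ranges."""
    pairs = []
    for lat_s, lng_s in COORD_RE.findall(text):
        lat, lng = float(lat_s), float(lng_s)
        # Drop number pairs that cannot be real coordinates.
        if -90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0:
            pairs.append((lat, lng))
    return pairs


if __name__ == "__main__":
    print(extract_lat_lng("Coffee at 37.7749, -122.4194 right now"))
```

Anything that needs actual place-name recognition would have to go through the NLP route instead; this only catches coordinates people type out literally.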
