I think you would be better off consuming the firehose, geocoding the
tweets yourself, and throwing away any that aren’t in regions you care
about, caching the rest for a period of time.

The thing to remember about "geocoding" of tweets is that until very
recently it was based solely on the <location> field in a user’s
profile.  True geocoding of individual tweets is very recent and
depends on the user enabling geocoding, and on the user agent posting
the lat/lon with the tweet.  The firehose *does* contain the <geo>
field; it's just mostly empty because most clients don’t populate it
yet.  If the <geo> field is empty you’d have to geocode based on
the <location> field, which is a bit of a hairball and may contain any
data up to 30 bytes.
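
Roughly, the fallback could look like the sketch below.  It assumes each
tweet has already been parsed into a dict with a hypothetical "geo" key
(a (lat, lon) pair, or None when the client didn't send one) and a
"location" key holding the free-text profile field; the real payload
layout may differ.  It only recognises a location that is a bare
"lat, lon" pair; anything else would still need a real geocoder.

```python
import re

# A location string that is nothing but a "lat, lon" pair, e.g. "40.71, -74.00".
_LATLON = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*$")


def tweet_coordinates(tweet):
    """Return (lat, lon, source) for a tweet dict, or None if nothing usable.

    The "geo" and "location" keys are assumptions made for this sketch.
    """
    geo = tweet.get("geo")
    if geo:
        # Per-tweet coordinates posted by the client: the best case.
        lat, lon = geo
        return float(lat), float(lon), "geo"

    # Fall back to the profile field, which may hold anything at all.
    match = _LATLON.match(tweet.get("location") or "")
    if match:
        lat, lon = float(match.group(1)), float(match.group(2))
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            return lat, lon, "location"

    # Free text ("NYC", "the moon", ...) would need a geocoder; give up here.
    return None


def in_region(lat, lon, box):
    """True if the point lies inside box = (south, west, north, east)."""
    south, west, north, east = box
    return south <= lat <= north and west <= lon <= east
```

Tweets for which tweet_coordinates() returns None, or whose point fails
in_region() for every box you care about, are the ones to throw away.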

Alternatively, do the cron job thing but enlarge the regions you’re
searching on (search on the top N cities or metros, for example, not
200,000 coordinates).  Cache the data, and accept that it won’t be
absolutely up to date.  It has already lost a lot of precision anyway,
since the <location> field is completely arbitrary and, even if it is
a city or lat/lon pair, does not necessarily represent where the
Twitter user was at that moment in time.
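
The caching side of the cron approach might be sketched like this.
RegionCache and the fetch callable are made-up names for illustration;
fetch stands in for whatever search call you run per city or metro, and
the cache simply reuses the last result until it is older than the TTL.

```python
import time


class RegionCache:
    """Keep the last search result per region for ttl_seconds.

    `fetch` is a hypothetical callable taking a region name and returning
    that region's tweets; `clock` is injectable so the expiry is testable.
    """

    def __init__(self, ttl_seconds, fetch, clock=time.time):
        self._ttl = ttl_seconds
        self._fetch = fetch
        self._clock = clock
        self._entries = {}  # region -> (time fetched, results)

    def get(self, region):
        now = self._clock()
        entry = self._entries.get(region)
        if entry is not None and now - entry[0] < self._ttl:
            # Still fresh enough: serve slightly stale data, skip the search.
            return entry[1]
        results = self._fetch(region)
        self._entries[region] = (now, results)
        return results
```

With a few hundred metros and a TTL of a few minutes, the number of
searches per cron run is bounded by the number of regions, not by the
number of coordinates your users ask about.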

-ed costello

