Doug -
So the point is to attempt "early detection" of an outbreak of
something based on what people are tweeting?
( "influenza", "flu", "cold", "fever", "H1N1", "H3N2", "sneezing",
"aching", "ache", "achy", "congested" )
It certainly sounds like there might be some utility to it, but I'm
wondering what kinds of reasoning went into this? Is it based on any
models of who tweets or what they are likely to tweet about?
Was it more of a demonstration or team-building exercise, or does
someone expect to actually put it to use?
So, the data was pre-archived, but I presume a more useful version would
work from more real-time data and would probably use a sliding time
window (exponential moving average?)?
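To make the sliding-window idea concrete, here is a minimal Python sketch
(mine, not anything from your code) that keeps an exponential moving
average of daily keyword counts and flags days that jump well above it.
The function name, the alpha and threshold values, and the sample counts
are all made up for illustration.

```python
def ema_alerts(daily_counts, alpha=0.3, threshold=2.0):
    """Return (day, count, baseline) for days whose count exceeds
    threshold * the exponential moving average of earlier days.

    daily_counts: iterable of (day_label, count) pairs in time order.
    alpha: weight given to the newest count (assumed value, tune to taste).
    threshold: multiple of the baseline that counts as a spike (assumed).
    """
    alerts = []
    baseline = None
    for day, count in daily_counts:
        # Compare today's count against the baseline built from earlier days.
        if baseline is not None and baseline > 0 and count > threshold * baseline:
            alerts.append((day, count, baseline))
        # Fold today's count into the baseline only after testing it.
        if baseline is None:
            baseline = float(count)
        else:
            baseline = alpha * count + (1 - alpha) * baseline
    return alerts


if __name__ == "__main__":
    # Invented counts of keyword-matching tweets per day.
    counts = [("day 1", 10), ("day 2", 12), ("day 3", 11), ("day 4", 40)]
    for day, count, baseline in ema_alerts(counts):
        print(f"{day}: {count} tweets vs. baseline {baseline:.2f}")
```

With real-time data the same update could run once per incoming time
bucket, so nothing older than the current baseline needs to be stored.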
Do you know about Norm Packard's (of Eudaimonic, Prediction, ProtoLife
fame <http://en.wikipedia.org/wiki/Norman_Packard>) latest venture
called LuckySort <http://luckysort.com/>? Its R interface is called
TopicWatchr <http://luckysort.com/products/r-package> and seems to be
doing something roughly similar (but without specific geolocation?).
Their examples suggest that they are aiming this at the Investment sector.
Our own Mick Thompson (well, SFX if not FRIAM) was working on related
things before the startup Collecta went dark... I'm not sure if he's
still in this game (or on this list?). I used Collecta when it was
alive... it aggregated Twitter as well as some subset of blogs and maybe
news feeds? For example, stuck in northbound traffic on I-25 near La
Cienega one time, I was able to discover within seconds of stopping my
vehicle that 3 people also stuck in traffic had mentioned that they too
were stuck. One of them was close enough to the front of the line to
see that a fuel truck had been involved in an accident, so traffic
wasn't being let past it until the HazMat or Fire folks had determined
there was no risk. On the other hand, a CB Radio and/or a Police
Scanner (oldschool) would have told me all that and more in time to
take the La Cienega exit and the frontage road on into town with only
a minor delay.
One of my projects is funded by NIH, and it sponsored (read: paid for)
a group of 15 of us software developer types from 10 different
organizations across the country who are working on the project to get
together last week in Las Vegas, NV to conduct a two-day hackathon. We
split into three groups, and my group produced some rough, ugly, but
working Python and R code.
The Python code conducts keyword searches on archived 1% Twitter API
data, filtered to search only those tweets that have valid
geolocation data. The short piece of R code calls a Google map API and
plots the data on a Google map in a browser, allowing the user to
click on the geolocated map points to view the originator's tweet text.
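For illustration only (this is not the hackathon code), the keyword and
geolocation filter described above might be sketched in Python roughly
as follows. It assumes one JSON tweet per line in the classic Twitter
API layout, with a GeoJSON-style "coordinates" field in [longitude,
latitude] order; the function name and the output fields are
hypothetical.

```python
import json
import re

# Keyword list taken from this thread.
KEYWORDS = ["influenza", "flu", "cold", "fever", "H1N1", "H3N2",
            "sneezing", "aching", "ache", "achy", "congested"]

# Whole-word, case-insensitive match on any keyword.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def geolocated_matches(lines):
    """Return one dict per tweet that has coordinates and mentions a keyword."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            tweet = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        if not isinstance(tweet, dict):
            continue
        # Assumed layout: {"coordinates": {"type": "Point",
        #                                  "coordinates": [lon, lat]}}
        point = tweet.get("coordinates") or {}
        coords = point.get("coordinates") if isinstance(point, dict) else None
        if not coords or len(coords) != 2:
            continue  # no usable geolocation
        text = tweet.get("text") or ""
        match = PATTERN.search(text)
        if match:
            lon, lat = coords
            hits.append({"lat": lat, "lon": lon,
                         "keyword": match.group(1).lower(), "text": text})
    return hits
```

The resulting list of latitude, longitude, and text records is the kind
of input a map-plotting step would need, whatever mapping API is used.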
Our next step will be to replace the R code with Python for calling
the Google map API.
Here, it's ugly, but it's free. Don't say I never gave you anything.
--Doug
--
/Doug Roberts
[email protected] <mailto:[email protected]>/
/http://parrot-farm.net/Second-Cousins/
/
505-455-7333 - Office
505-672-8213 - Mobile/
============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com
============================================================