On Thu, Apr 9, 2009 at 7:13 AM, kanny <[email protected]> wrote:
>
>
> Caching is something i will definitely be doing, but as i said, to do
> something complex like semantic model generation, i need access to a
> user's last, at least 100,000 friends_timeline tweets. For a typical
> user following 100 reasonably active persons, this would take 2-3
> months to build, which is not practical to wait for the application to
> be usable.


I have about 2.3 million cached statuses for more than 10,000 users,
gathered over the last couple of months for the analysis I do for TwURLed
News (http://TwURLedNews.com).  There's a sampling bias in favor of people
who have tended to cite URLs that became popular.

I'm quite interested in the kind of analysis you're doing, so I'd be happy
to share the data with you or anyone else who might want it for this sort
of purpose.  It wouldn't be hard for me to export it in the format you want
and make it available for download, though if a lot of people want it, that
would become a problem... but then we can figure out somewhere other than my
servers to put it on.

So... would this be useful as a one-time offer?  Do you intend to share the
results of your analysis?

Nick
