Hi Jay, very interesting project. I run a hyperlocal wiki in Boston:
http://boston.povo.com. How are you pulling these? Are you going
after specific users who set their location to Boston (or whichever
city)?
On Apr 17, 4:20 pm, jayb wrote:
> I've been collecting tweets for about a week for a project (http://www.happn.in).
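For illustration, one plausible way to do the kind of location filtering
that question suggests (jayb's actual method isn't stated in this thread,
and the function and field here are hypothetical):

    # Hypothetical filter: keep users whose free-text profile
    # location mentions the target city. Twitter's "location"
    # field was free text, so matching is necessarily fuzzy.
    def in_city(profile_location, city="boston"):
        loc = (profile_location or "").strip().lower()
        return city in loc

    print(in_city("Boston, MA"))   # True
    print(in_city("Cambridge"))    # False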
Anything you can do to help people better determine the language of tweets
would make search more usable for international users. ;))
I am a bit curious about the mentioned 'costs of publishing in journals
and conferences'. I don't know about the journals, but none of the
conferences I know of in tech...
I've been collecting tweets for about a week for a project (http://www.happn.in).
Some characteristics of my current dataset:
* Begins around April 10th, 2009
* Collected from users who are located near 26 US cities
* ~5,000,000 tweets
* Growing at ~800,000 per day
* ~900MB in MySQL
* ~375,000 users
Part 1: http://drop.io/gmx85rd (tweetsgzaa)
Part 2: http://drop.io/f5itrsx (tweetsgzab)
Password (for the download): twitter
The two parts need to be concatenated and then un-gzipped (naming the
concatenated file tweets.gz would be appropriate).
Nick
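In Python, the reassembly step Nick describes looks roughly like this,
assuming both parts have already been downloaded into the current
directory (the tweets.tsv output name is just an assumption):

    # Concatenate the two downloaded parts into tweets.gz,
    # then decompress to a plain tab-delimited file.
    import gzip
    import shutil

    with open("tweets.gz", "wb") as out:
        for part in ("tweetsgzaa", "tweetsgzab"):
            with open(part, "rb") as f:
                shutil.copyfileobj(f, out)

    with gzip.open("tweets.gz", "rb") as gz, open("tweets.tsv", "wb") as out:
        shutil.copyfileobj(gz, out)

On a Unix machine, `cat tweetsgzaa tweetsgzab > tweets.gz` followed by
`gunzip tweets.gz` does the same thing.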
The format is a tab-delimited text file.
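The column order isn't spelled out anywhere in this thread, so the first
step is to inspect a few rows; a minimal sketch (the tweets.tsv filename
carries over from the reassembly step above):

    # Peek at the tab-delimited dump. The real column layout
    # isn't documented here, so look before parsing in earnest.
    import csv

    with open("tweets.tsv", newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t")
        for i, row in enumerate(reader):
            print(row)      # one tweet per line, fields tab-separated
            if i >= 4:      # show only the first five rows
                break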
I'm splitting it and putting it on drop.io.
Will take a little while to upload... I'll post when it's available.
Nick
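The part names (tweetsgzaa, tweetsgzab) match what the standard Unix
split(1) tool produces, e.g. `split -b 95m tweets.gz tweetsgz`. A Python
equivalent that keeps each piece under drop.io's 100MB free limit (the
95MB chunk size is an assumption, not Nick's stated choice):

    # Split tweets.gz into pieces named tweetsgzaa, tweetsgzab, ...
    CHUNK = 95 * 1024 * 1024
    with open("tweets.gz", "rb") as f:
        i = 0
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            suffix = "a" + chr(ord("a") + i)   # aa, ab, ac, ...
            with open("tweetsgz" + suffix, "wb") as out:
                out.write(chunk)
            i += 1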
On Fri, Apr 17, 2009 at 9:17 AM, djMax wrote:
>
> http://drop.io
>
The free version is limited to 100MB... I could split it, I guess. Any
others with a higher limit?
Nick
http://drop.io
On Apr 17, 12:07 pm, Nick Arnett wrote:
> Michele, djMax and anybody else interested... It is a 128MB file after
> gzipping (291MB uncompressed). Any thoughts on a place to put it for
> download? I'm reluctant to sacrifice a lot of my own bandwidth for this and
> off the top of my head, I can't think of a good place to share it.
Michele, djMax and anybody else interested... It is a 128MB file after
gzipping (291MB uncompressed). Any thoughts on a place to put it for
download? I'm reluctant to sacrifice a lot of my own bandwidth for this and
off the top of my head, I can't think of a good place to share it.
Nick
I've wondered about a distributed version of this... If those of us
who want to sift through the "entire" stream were to pool our API
usage, in theory we could do it without knocking over Twitter, right?
My particular usage is mining for geo content, either explicit lat/lng
or NLP-based feature extraction.
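As a toy version of the lat/lng side of that mining (the NLP-based
place-name extraction is a much harder problem and not attempted here;
the function name and regex are illustrative only):

    # Pull explicit coordinate pairs like "42.3601,-71.0589"
    # out of raw tweet text.
    import re

    COORD = re.compile(r"(-?\d{1,2}\.\d+)\s*,\s*(-?\d{1,3}\.\d+)")

    def extract_coords(text):
        return [(float(lat), float(lng)) for lat, lng in COORD.findall(text)]

    print(extract_coords("Stuck on I-93 at 42.3601,-71.0589 #boston"))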
Hi Nick,
I am a linguist currently working on Twitter. I would be very interested
in using the corpus that you mention you have created.
I work in the area of Systemic Functional Linguistics and am looking
at how people use language to affiliate on Twitter. At the moment I am
working with a corpus...
On Thu, Apr 9, 2009 at 2:04 PM, kanny wrote:
> ... It could change the twitter
> client game completely as we dive deeper into the meanings of the
> tweets instead of the keyword based or author based groupings.
>
That's what TwURLed News is about, but using a much simpler clue: cited
URLs...
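A toy version of that URL clue: pull links out of tweet text so tweets
can be grouped by what they cite (the helper below is hypothetical; in
practice most 2009 tweet links were shortened via tinyurl or bit.ly, so
they would need expanding before grouping):

    # Extract cited URLs from tweet text with a simple regex.
    import re

    URL = re.compile(r"https?://\S+")

    def cited_urls(text):
        return URL.findall(text)

    print(cited_urls("Neat dataset: http://drop.io/gmx85rd via @nick"))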
Thanks, Nick, for your gesture. I will certainly be interested in trying
out your cached tweets, but their usefulness will be limited to those
who follow the cached tweets' authors.
About sharing, I don't intend to publish in journals or conferences as
I can't afford the costs, but will definitely share...
On Thu, Apr 9, 2009 at 7:13 AM, kanny wrote:
>
>
> Caching is something I will definitely be doing, but as I said, to do
> something complex like semantic model generation, I need access to at
> least a user's last 100,000 friends_timeline tweets. For a typical
> user following 100 reasonably active...
Thank you, Doug, for the reply.
Currently, I am able to get only about 1,000 tweets from a user's
timeline, though the limit is supposed to be about 3,000. I also
requested whitelisting and am glad that it was accepted, but I don't
know where to request the datamining feed.
Caching is something I will definitely be doing, but as I said, to do
something complex like semantic model generation, I need access to at
least a user's last 100,000 friends_timeline tweets...
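For reference, the 2009-era REST call behind that: statuses/user_timeline
took count (max 200) and page parameters, and stopped returning data
around the 3,200 most recent statuses no matter how far back you paged.
A sketch against that long-gone endpoint (the function name is mine):

    # Page through a user's timeline the way the 2009 API allowed.
    # This endpoint no longer exists; shown for illustration only.
    import json
    import urllib.request

    def fetch_timeline(screen_name, max_pages=16):
        tweets = []
        for page in range(1, max_pages + 1):
            url = ("http://twitter.com/statuses/user_timeline.json"
                   "?screen_name=%s&count=200&page=%d" % (screen_name, page))
            batch = json.load(urllib.request.urlopen(url))
            if not batch:        # API returned [] past the ~3200 cap
                break
            tweets.extend(batch)
        return tweets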
We don't have a method to download the entire friends_timeline for a user.
If you search the boards or documentation you will find there is an
artificial limit on the number of tweets you can download [1].
People doing datamining often request access to the datamining feed and
cache tweets as they arrive.
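Caching as tweets arrive can be as simple as an upsert keyed on status
id, so repeated fetches and overlapping pages are harmless. A sketch,
with a deliberately simplified tweet shape (the real API returned much
richer objects):

    # Store tweets in a local SQLite cache keyed by status id.
    import sqlite3

    db = sqlite3.connect("tweet_cache.db")
    db.execute("CREATE TABLE IF NOT EXISTS tweets"
               " (id INTEGER PRIMARY KEY, screen_name TEXT, text TEXT)")

    def cache(tweet):
        # INSERT OR REPLACE makes re-fetching the same tweet a no-op.
        db.execute("INSERT OR REPLACE INTO tweets VALUES (?, ?, ?)",
                   (tweet["id"], tweet["user"], tweet["text"]))
        db.commit()

    cache({"id": 1, "user": "jayb", "text": "testing the cache"})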