On Sat, Feb 20, 2016 at 5:50 PM, Kevin Leduc <[email protected]> wrote:
> I came across a data visualizer that looks a lot like the pageview analysis
> tool [1].  It shows the frequency of words in comments on reddit.com: the
> n-gram visualizer [2].  If only that dataset was public ;-)
>
>
> [1]
> https://tools.wmflabs.org/pageviews/#start=2016-01-31&end=2016-02-19&project=en.wikipedia.org&platform=all-access&agent=user&pages=Cat|Dog
>
> [2]
> http://projects.fivethirtyeight.com/reddit-ngram/?keyword=global_warming.climate_change&start=20071014&end=20150831&smoothing=30

The raw data is public [3][4] and available in BigQuery [5].

[3]: 
https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment
[4]: 
https://www.reddit.com/r/datasets/comments/3mg812/full_reddit_submission_corpus_now_available_2006/
[5]: 
https://www.reddit.com/r/bigquery/comments/3cej2b/17_billion_reddit_comments_loaded_on_bigquery/

Bryan
-- 
Bryan Davis              Wikimedia Foundation    <[email protected]>
[[m:User:BDavis_(WMF)]]  Sr Software Engineer            Boise, ID USA
irc: bd808                                        v:415.839.6885 x6855

_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
