On Sat, Feb 20, 2016 at 5:50 PM, Kevin Leduc <[email protected]> wrote:
> I came across a data visualizer that looks a lot like the pageview analysis
> tool [1]. It shows the frequency of words in comments on reddit.com: the
> n-gram visualizer [2]. If only that dataset was public ;-)
>
> [1]
> https://tools.wmflabs.org/pageviews/#start=2016-01-31&end=2016-02-19&project=en.wikipedia.org&platform=all-access&agent=user&pages=Cat|Dog
>
> [2]
> http://projects.fivethirtyeight.com/reddit-ngram/?keyword=global_warming.climate_change&start=20071014&end=20150831&smoothing=30

The raw data is public [3][4] and available in BigQuery [5].

[3]: https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment
[4]: https://www.reddit.com/r/datasets/comments/3mg812/full_reddit_submission_corpus_now_available_2006/
[5]: https://www.reddit.com/r/bigquery/comments/3cej2b/17_billion_reddit_comments_loaded_on_bigquery/

Bryan
--
Bryan Davis              Wikimedia Foundation    <[email protected]>
[[m:User:BDavis_(WMF)]]  Sr Software Engineer    Boise, ID USA
irc: bd808               v:415.839.6885 x6855
