Ottomata has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/395917 )
Change subject: Add and link readme pages for analytics datasets
..
Add and link readme pages for analytics datasets
Bug: T167033
Change-Id: Icf47d81cde07bee94b54fdb4241fbc912c84966e
---
M modules/dumps/files/web/html/analytics_index.html
A modules/dumps/files/web/html/clickstream_readme.html
A modules/dumps/files/web/html/mediacounts_readme.html
A modules/dumps/files/web/html/pageviews_readme.html
A modules/dumps/files/web/html/unique_devices_readme.html
M modules/dumps/manifests/web/html.pp
6 files changed, 207 insertions(+), 8 deletions(-)
Approvals:
Ottomata: Verified; Looks good to me, approved
diff --git a/modules/dumps/files/web/html/analytics_index.html b/modules/dumps/files/web/html/analytics_index.html
index 7f9ba98..74fc0aa 100644
--- a/modules/dumps/files/web/html/analytics_index.html
+++ b/modules/dumps/files/web/html/analytics_index.html
@@ -16,11 +16,11 @@
Pageviews: statistics compiled using the
current <a href="https://meta.wikimedia.org/wiki/Research:Page_view" target="_blank">Pageview Definition</a>. Available as:
-Pageview/projectview data filtered to what we believe is only human traffic. Available since May 2015.
+Pageview/projectview data filtered to what we believe is only human traffic. Available since May 2015.
Pageview/projectview
data, highly compressed and corrected for outages. This dataset was
historically computed using the best source available at the time:
-Dec 2015 - now: compressing and correcting the pageviews dataset
+Dec 2015 - now: compressing and correcting the pageviews dataset
2007 - Dec 2015: compressing and correcting
the pagecounts-raw dataset
@@ -31,27 +31,27 @@
Mediacounts: statistics from all projects on
media file access. Available as:
-Request counts for the upload domain (pictures, movies, audio files)
+Request counts for the upload domain (pictures, movies, audio files)
Unique Devices: statistics from all projects
on unique devices. Available as:
-Estimate of unique devices based on a privacy-sensitive last access cookie.
+Estimate of unique devices based on a privacy-sensitive last access cookie.
Clickstream: (referer, resource) pairs
extracted from the request logs of Wikipedia. Please visit
- the https://meta.wikimedia.org/wiki/Research:Wikipedia_clickstream;>Clickstream
mediawiki page for detailed
- information, and the https://figshare.com/articles/Wikipedia_Clickstream/1305770;>Clickstream
figshare page for
- correctly reference this dataset. Available as:
+the https://meta.wikimedia.org/wiki/Research:Wikipedia_clickstream;>Clickstream
research page for detailed
+information. Available as:
-Monthly generated clickstream for wikipedia in English, Russian, German, Spanish and Japanese.
+Monthly generated clickstream for wikipedia in English, Russian, German, Spanish, and Japanese.
+
Deprecated datasets (no longer maintained or updated)
Pagecounts: simple pageview definition.
Available from 2007 to 2016. Some of the data does not include counts from the
mobile site and no filtering of automata is performed. Available as:
diff --git a/modules/dumps/files/web/html/clickstream_readme.html b/modules/dumps/files/web/html/clickstream_readme.html
new file mode 100644
index 0000000..41bae67
--- /dev/null
+++ b/modules/dumps/files/web/html/clickstream_readme.html
@@ -0,0 +1,37 @@
+http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd;>
+http://www.w3.org/1999/xhtml; xml:lang="en" lang="en" dir="ltr">
+
+
+
+
+Analytics: Clickstream
+
+
+
+
+
+Analytics Datasets: Clickstream
+
+
+(referer, resource) pairs extracted from the request logs of Wikipedia.
+The data shows how people get to a Wikipedia article and what links they click on, in aggregate.
+
+
+In depth documentation is available
+<a href="https://meta.wikimedia.org/wiki/Research:Wikipedia_clickstream" target="_blank">on our wiki</a>
+and the https://figshare.com/articles/Wikipedia_Clickstream/1305770;>Clickstream