[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: small fix of survival plots

2017-11-21 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/392761 )

Change subject: small fix of survival plots
..


small fix of survival plots

Change-Id: I7167c7ea7c628aa3cf63d1d9171eaf922e8be6c5
---
M modules/interleaved_test/page_dwelltime.R
M modules/stat_test/serp_from_autocomplete.R
M modules/stat_test/visited_page.R
M modules/test_summary/browser_os.R
4 files changed, 4 insertions(+), 5 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/modules/interleaved_test/page_dwelltime.R b/modules/interleaved_test/page_dwelltime.R
index a01926c..4f57fc7 100644
--- a/modules/interleaved_test/page_dwelltime.R
+++ b/modules/interleaved_test/page_dwelltime.R
@@ -25,7 +25,7 @@
 ggtheme = wmf::theme_facet()
   )
   p <- ggsurv$plot +
-facet_wrap(~ group, scales = "free_y") +
+facet_wrap(~ group) +
 labs(
   title = "How long users stay on each team's results",
   subtitle = "With 95% confidence intervals."
@@ -50,7 +50,7 @@
 ggtheme = wmf::theme_facet()
   )
   p <- ggsurv$plot +
-facet_wrap(~ wiki, ncol = 3, scales = "free_y") +
+facet_wrap(~ wiki, ncol = 3) +
 labs(
   title = paste0("How long users stay on each team's results, by wiki (Group = ", this_group, ")"),
   subtitle = "With 95% confidence intervals."
diff --git a/modules/stat_test/serp_from_autocomplete.R b/modules/stat_test/serp_from_autocomplete.R
index d89fa58..c982e9f 100644
--- a/modules/stat_test/serp_from_autocomplete.R
+++ b/modules/stat_test/serp_from_autocomplete.R
@@ -67,7 +67,7 @@
   ggtheme = wmf::theme_facet()
 )
 p <- ggsurv$plot +
-  facet_wrap(~ wiki, ncol = 3, scales = "free_y") +
+  facet_wrap(~ wiki, ncol = 3) +
   labs(
 title = "Proportion of search results pages from autocomplete last longer than T, by test group and wiki",
 subtitle = "With 95% confidence intervals."
diff --git a/modules/stat_test/visited_page.R b/modules/stat_test/visited_page.R
index e8383b6..e608606 100644
--- a/modules/stat_test/visited_page.R
+++ b/modules/stat_test/visited_page.R
@@ -56,7 +56,7 @@
   ggtheme = wmf::theme_facet()
 )
 p <- ggsurv$plot +
-  facet_wrap(~ wiki, ncol = 3, scales = "free_y") +
+  facet_wrap(~ wiki, ncol = 3) +
   labs(
 title = "Proportion of visited search results last longer than T, by test group and wiki",
 subtitle = "With 95% confidence intervals."
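
For context on the three `facet_wrap()` hunks above: dropping `scales = "free_y"` makes the call fall back to ggplot2's default, `scales = "fixed"`, so every panel shares one y-axis. The commit message does not state the motivation; the sketch below only illustrates the difference. It uses made-up data (`toy`, `t` and `p` are hypothetical names, not taken from the autoreporter code) and plain `geom_step()` instead of survminer.

```r
library(ggplot2)

# Hypothetical survival-style curves for two wikis with very different ranges.
toy <- data.frame(
  wiki = rep(c("enwiki", "dewiki"), each = 4),
  t    = rep(c(0, 10, 20, 30), times = 2),
  p    = c(1.00, 0.60, 0.30, 0.10,   # drops to 10%
           1.00, 0.95, 0.92, 0.90)   # stays at or above 90%
)

base_plot <- ggplot(toy, aes(x = t, y = p)) + geom_step()

# Before the change: each panel gets its own y-axis range.
p_free <- base_plot + facet_wrap(~ wiki, ncol = 3, scales = "free_y")

# After the change: default scales = "fixed", one shared y-axis.
p_fixed <- base_plot + facet_wrap(~ wiki, ncol = 3)
```

With the shared axis, curves drawn at the same height in different panels represent the same proportion, which is usually what a reader assumes when comparing survival curves across facets.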
diff --git a/modules/test_summary/browser_os.R b/modules/test_summary/browser_os.R
index b73886d..741e639 100644
--- a/modules/test_summary/browser_os.R
+++ b/modules/test_summary/browser_os.R
@@ -1,7 +1,6 @@
 if ("user_agent" %in% names(events)) {
 
   user_agents <- dplyr::distinct(events, wiki, session_id, group, user_agent)
-  user_agents$user_agent <- gsub('(Kindle Fire HD[X]? [0-9\\.]{1,3})"', '\\1', user_agents$user_agent, fixed = FALSE) # remove double quote in kindle name
   user_agents <- user_agents %>%
 cbind(., purrr::map_df(.$user_agent, ~ wmf::null2na(jsonlite::fromJSON(.x, simplifyVector = FALSE)))) %>%
 mutate(

-- 
To view, visit https://gerrit.wikimedia.org/r/392761
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7167c7ea7c628aa3cf63d1d9171eaf922e8be6c5
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/autoreporter
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 
Gerrit-Reviewer: Chelsyx 

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits


[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: small fix of survival plots

2017-11-21 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/392761 )

Change subject: small fix of survival plots
..

small fix of survival plots

Change-Id: I7167c7ea7c628aa3cf63d1d9171eaf922e8be6c5
---
M modules/interleaved_test/page_dwelltime.R
M modules/stat_test/serp_from_autocomplete.R
M modules/stat_test/visited_page.R
M modules/test_summary/browser_os.R
4 files changed, 4 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/autoreporter 
refs/changes/61/392761/1

-- 
To view, visit https://gerrit.wikimedia.org/r/392761
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7167c7ea7c628aa3cf63d1d9171eaf922e8be6c5
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/autoreporter
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Change grouping color in survival plots

2017-11-21 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/392699 )

Change subject: Change grouping color in survival plots
..


Change grouping color in survival plots

Change-Id: Iacb7d2c295e820b1ac802d72fe7001a4b28b7f95
---
M modules/interleaved_test/page_dwelltime.R
M modules/stat_test/serp_from_autocomplete.R
M modules/stat_test/visited_page.R
3 files changed, 16 insertions(+), 10 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/modules/interleaved_test/page_dwelltime.R b/modules/interleaved_test/page_dwelltime.R
index 99c7aae..a01926c 100644
--- a/modules/interleaved_test/page_dwelltime.R
+++ b/modules/interleaved_test/page_dwelltime.R
@@ -18,9 +18,10 @@
 xlab = "T (Dwell Time in seconds)",
 ylab = "Proportion of visits longer than T (P%)",
 surv.scale = "percent",
-palette = colorRampPalette(RColorBrewer::brewer.pal(9, "Dark2"))(2 * length(report_params$interleaved_group_names)),
+color = "team",
+palette = "Dark2",
 legend = "bottom",
-legend.title = "",
+legend.title = "Team",
 ggtheme = wmf::theme_facet()
   )
   p <- ggsurv$plot +
@@ -42,9 +43,10 @@
 xlab = "T (Dwell Time in seconds)",
 ylab = "Proportion of visits longer than T (P%)",
 surv.scale = "percent",
-palette = colorRampPalette(RColorBrewer::brewer.pal(9, "Dark2"))(2 * n_wiki),
+color = "team",
+palette = "Dark2",
 legend = "bottom",
-legend.title = "",
+legend.title = "Team",
 ggtheme = wmf::theme_facet()
   )
   p <- ggsurv$plot +
diff --git a/modules/stat_test/serp_from_autocomplete.R b/modules/stat_test/serp_from_autocomplete.R
index ca7cde1..d89fa58 100644
--- a/modules/stat_test/serp_from_autocomplete.R
+++ b/modules/stat_test/serp_from_autocomplete.R
@@ -34,9 +34,10 @@
 xlab = "T (Dwell Time in seconds)",
 ylab = "Proportion of SERPs longer than T (P%)",
 surv.scale = "percent",
+color = "group",
 palette = "Set1",
 legend = "bottom",
-legend.title = "",
+legend.title = "Group",
 legend.labs = traditional_test_groups,
 ggtheme = wmf::theme_min()
   )
@@ -59,9 +60,10 @@
   xlab = "T (Dwell Time in seconds)",
   ylab = "Proportion of SERPs longer than T (P%)",
   surv.scale = "percent",
-  palette = colorRampPalette(RColorBrewer::brewer.pal(9, "Set1"))(n_wiki * length(traditional_test_groups)),
+  color = "group",
+  palette = "Set1",
   legend = "bottom",
-  legend.title = "",
+  legend.title = "Group",
   ggtheme = wmf::theme_facet()
 )
 p <- ggsurv$plot +
diff --git a/modules/stat_test/visited_page.R b/modules/stat_test/visited_page.R
index ca0df0f..e8383b6 100644
--- a/modules/stat_test/visited_page.R
+++ b/modules/stat_test/visited_page.R
@@ -9,9 +9,10 @@
 xlab = "T (Dwell Time in seconds)",
 ylab = "Proportion of visits longer than T (P%)",
 surv.scale = "percent",
+color = "group",
 palette = "Set1",
 legend = "bottom",
-legend.title = "",
+legend.title = "Group",
 legend.labs = traditional_test_groups,
 ggtheme = wmf::theme_min()
   )
@@ -48,9 +49,10 @@
   xlab = "T (Dwell Time in seconds)",
   ylab = "Proportion of visits longer than T (P%)",
   surv.scale = "percent",
-  palette = colorRampPalette(RColorBrewer::brewer.pal(9, "Set1"))(n_wiki * length(traditional_test_groups)),
+  color = "group",
+  palette = "Set1",
   legend = "bottom",
-  legend.title = "",
+  legend.title = "Group",
   ggtheme = wmf::theme_facet()
 )
 p <- ggsurv$plot +
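
A rough illustration of what the palette hunks above change, assuming the intent was to colour curves by a single variable rather than by every stratum (the commit message does not say so explicitly). The removed calls used `colorRampPalette()` to stretch a ColorBrewer palette to one colour per group-by-wiki combination; once the colour is mapped to `"group"` or `"team"`, a short named palette is enough. The hex codes below are the ColorBrewer Set1 colours written out by hand, and `n_wiki` and `groups` are hypothetical values.

```r
# The nine ColorBrewer "Set1" colours, written out so the sketch needs only base R.
set1 <- c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00",
          "#FFFF33", "#A65628", "#F781BF", "#999999")

n_wiki <- 6                        # hypothetical number of wikis
groups <- c("control", "test")     # hypothetical test groups

# Old approach: interpolate the palette to one colour per (wiki, group) stratum.
per_stratum <- grDevices::colorRampPalette(set1)(n_wiki * length(groups))

# New approach: colour by group only, so the first length(groups) colours suffice.
per_group <- set1[seq_along(groups)]

length(per_stratum)  # 12 colours, most of them interpolated
length(per_group)    # 2 colours, reused in every facet
```

Reusing the same two colours in every facet keeps the legend short and lets the facet titles, rather than colour, identify the wiki.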

-- 
To view, visit https://gerrit.wikimedia.org/r/392699
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Iacb7d2c295e820b1ac802d72fe7001a4b28b7f95
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/autoreporter
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 
Gerrit-Reviewer: Bearloga 
Gerrit-Reviewer: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Change grouping color in survival plots

2017-11-21 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/392699 )

Change subject: Change grouping color in survival plots
..

Change grouping color in survival plots

Change-Id: Iacb7d2c295e820b1ac802d72fe7001a4b28b7f95
---
M modules/interleaved_test/page_dwelltime.R
M modules/stat_test/serp_from_autocomplete.R
M modules/stat_test/visited_page.R
3 files changed, 16 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/autoreporter 
refs/changes/99/392699/1

-- 
To view, visit https://gerrit.wikimedia.org/r/392699
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iacb7d2c295e820b1ac802d72fe7001a4b28b7f95
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/autoreporter
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Small fixes

2017-11-20 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/392506 )

Change subject: Small fixes
..


Small fixes

Change-Id: Ic5fb7e72a97ef532d5c84b2b07ccf655e401e1b2
---
M modules/test_summary/browser_os.R
M run.R
2 files changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/modules/test_summary/browser_os.R b/modules/test_summary/browser_os.R
index 741e639..b73886d 100644
--- a/modules/test_summary/browser_os.R
+++ b/modules/test_summary/browser_os.R
@@ -1,6 +1,7 @@
 if ("user_agent" %in% names(events)) {
 
   user_agents <- dplyr::distinct(events, wiki, session_id, group, user_agent)
+  user_agents$user_agent <- gsub('(Kindle Fire HD[X]? [0-9\\.]{1,3})"', '\\1', user_agents$user_agent, fixed = FALSE) # remove double quote in kindle name
   user_agents <- user_agents %>%
 cbind(., purrr::map_df(.$user_agent, ~ wmf::null2na(jsonlite::fromJSON(.x, simplifyVector = FALSE)))) %>%
 mutate(
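
To show what the added `gsub()` line does, here is a minimal sketch on a made-up input. The JSON string and its field names (`device_family`, `os_family`) are assumptions for illustration; the real user-agent payload may look different. The pattern and replacement are copied from the patch.

```r
# Hypothetical user-agent JSON: the inch mark after "7" is an unescaped double
# quote, which would make the string invalid JSON.
raw_ua <- '{"device_family": "Kindle Fire HDX 7"", "os_family": "Android"}'

# Same pattern and replacement as in the patch: keep the captured device name
# and drop the double quote that immediately follows it.
fixed_ua <- gsub('(Kindle Fire HD[X]? [0-9\\.]{1,3})"', '\\1', raw_ua, fixed = FALSE)

print(fixed_ua)
# {"device_family": "Kindle Fire HDX 7", "os_family": "Android"}
```

Only the first quote after the size is removed, so the quote that closes the JSON string value is preserved.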
diff --git a/run.R b/run.R
index 0d86df4..cd78fa1 100644
--- a/run.R
+++ b/run.R
@@ -33,7 +33,6 @@
 
 # Set up
 report_params <- yaml::yaml.load_file(opt$yaml_file)
-report_params <- yaml::yaml.load_file("reports/ltr_test_18lang.yaml")
 if (!dir.exists("reports")) {
   dir.create("reports")
 }

-- 
To view, visit https://gerrit.wikimedia.org/r/392506
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic5fb7e72a97ef532d5c84b2b07ccf655e401e1b2
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/autoreporter
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 
Gerrit-Reviewer: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Small fixes

2017-11-20 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/392506 )

Change subject: Small fixes
..

Small fixes

Change-Id: Ic5fb7e72a97ef532d5c84b2b07ccf655e401e1b2
---
M modules/test_summary/browser_os.R
M run.R
2 files changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/autoreporter 
refs/changes/06/392506/1

-- 
To view, visit https://gerrit.wikimedia.org/r/392506
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic5fb7e72a97ef532d5c84b2b07ccf655e401e1b2
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/autoreporter
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Add interleaved test analysis

2017-11-19 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/392102 )

Change subject: Add interleaved test analysis
..


Add interleaved test analysis

Bug: T176493
Change-Id: I795023856963030e67e85e1cde7352842aa3a7a8
---
M README.md
M functions.R
M modules/data/data_aggregation.R
M modules/data/data_cleansing.R
M modules/data/fetch_data.R
M modules/explore_similar/esclicks.R
M modules/explore_similar/hover_over.R
A modules/interleaved_test/data_processing.R
A modules/interleaved_test/interleaved_preference.R
A modules/interleaved_test/page_dwelltime.R
M modules/setup.R
M modules/sister_search/iwclicks.R
M modules/sister_search/sidebar_results.R
M modules/sister_search/ssclicks.R
M modules/stat_test/engagement.R
M modules/stat_test/first_clicked.R
M modules/stat_test/max_clicked.R
M modules/stat_test/paulscore.R
A modules/stat_test/remove_interleaved_data.R
M modules/stat_test/return_rate.R
M modules/stat_test/search_abandon_rate.R
M modules/stat_test/serp_from_autocomplete.R
M modules/stat_test/serp_load_time.R
M modules/stat_test/serp_offset.R
M modules/stat_test/visited_page.R
M modules/stat_test/zrr.R
M modules/test_summary/browser_os.R
M modules/test_summary/events.R
M modules/test_summary/searches.R
M report.Rmd
M run.R
31 files changed, 483 insertions(+), 148 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/README.md b/README.md
index 11f34a5..b0ae2f8 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,8 @@
 
 ```R
 install.packages(c("tidyverse", "toOrdinal", "jsonlite", "yaml", "rmarkdown", "tools",
-  "knitr", "RMySQL", "data.table", "lubridate", "binom", "survival", "survminer", "import"))
+  "knitr", "RMySQL", "data.table", "lubridate", "binom", "survival", "survminer", "import",
+  "BayesFactor", "formattable", "DT", "htmltools", "scales", "Rcpp", "urltools", "rlang", "RColorBrewer"))
 devtools::install_git("https://gerrit.wikimedia.org/r/p/wikimedia/discovery/wmf.git")
 devtools::install_git("https://gerrit.wikimedia.org/r/p/wikimedia/discovery/polloi.git")
 devtools::install_github("bearloga/BCDA")
diff --git a/functions.R b/functions.R
index a4c6367..8be26c1 100644
--- a/functions.R
+++ b/functions.R
@@ -1,5 +1,6 @@
 # PaulScore Calculation
-query_score <- function(positions, F) { # 0-based positions
+# 0-based positions
+query_score <- function(positions, F) {
   if (length(positions) == 1 || all(is.na(positions))) {
 # no clicks were made
 return(0)
@@ -66,7 +67,7 @@
 ggplot2::geom_bar(stat = "identity", position = "dodge") +
 ggplot2::scale_fill_brewer("Group", palette = "Set1") +
 ggplot2::scale_y_continuous(labels = polloi::compress) +
-ggplot2::geom_text(aes_string(label = y, vjust = -0.5), position = position_dodge(width = 1), size = geom_text_size) +
+ggplot2::geom_text(aes_string(label = y, vjust = -0.05), position = position_dodge(width = 1), size = geom_text_size) +
 ggplot2::labs(y = y_lab, x = x_lab, title = title, subtitle = subtitle, caption = caption)
 }
 
@@ -79,3 +80,59 @@
 ggplot2::scale_color_brewer(palette = "Set1") +
 ggplot2::labs(x = NULL, color = "Group", y = y_lab, title = title, subtitle = subtitle)
 }
+
+cppFunction('CharacterVector fill_in(CharacterVector ids) {
+  CharacterVector new_ids(ids.size());
+  String current_id = ids[0];
+  new_ids[0] = current_id;
+  for (int i = 1; i < ids.size(); i++) {
+if (ids[i] != NA_STRING) {
+  current_id = ids[i];
+}
+new_ids[i] = current_id;
+  }
+  return new_ids;
+}')
+
+cppFunction('NumericVector cumunique(CharacterVector ids) {
+  NumericVector count(ids.size());
+  String current_id = ids[0];
+  count[0] = 1;
+  for (int i = 1; i < ids.size(); i++) {
+if (ids[i] == current_id) {
+  count[i] = count[i-1];
+} else {
+  count[i] = count[i-1] + 1;
+  current_id = ids[i];
+}
+  }
+  return count;
+}')
+
+# Process interleaved team draft
+process_session <- function(df) {
+  processed_session <- unsplit(lapply(split(df, df$serp_id), function(df) {
+if (is.na(df$event_extraParams[1]) || df$event_extraParams[1] == "") {
+  visited_pages <- rep(as.character(NA), times = nrow(df))
+} else {
+  from_json <- jsonlite::fromJSON(df$event_extraParams[1], simplifyVector = FALSE)
+  if (!("teamDraft" %in% names(from_json)) || all(is.na(df$article_id))) {
+visited_pages <- rep(as.character(NA), times = nrow(df))
+  } else {
+team_a <- unlist(from_json$teamDraft$a)
+team_b <- unlist(from_json$teamDraft$b)
+visited_pages <- vapply(df$article_id, function(article_id) {
+  if (article_id %in% team_a) {
+return("A")
+  } else if (article_id %in% team_b) {
+return("B")
+  } else {
+return(as.character(NA))
+  }
+}, "")
+  }
+}
+return(visited_pages)
+  }), df$serp_id)
+  

[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Add interleaved test analysis

2017-11-17 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/392102 )

Change subject: Add interleaved test analysis
..

Add interleaved test analysis

Bug: T176493
Change-Id: I795023856963030e67e85e1cde7352842aa3a7a8
---
M functions.R
M modules/data/data_aggregation.R
M modules/data/data_cleansing.R
M modules/data/fetch_data.R
A modules/interleaved_test/data_processing.R
A modules/interleaved_test/interleaved_preference.R
A modules/interleaved_test/page_dwelltime.R
M modules/setup.R
M modules/sister_search/sidebar_results.R
M modules/stat_test/engagement.R
A modules/stat_test/remove_interleaved_data.R
M modules/stat_test/return_rate.R
M modules/stat_test/serp_from_autocomplete.R
M modules/stat_test/serp_offset.R
M modules/stat_test/visited_page.R
M modules/test_summary/browser_os.R
M modules/test_summary/events.R
M modules/test_summary/searches.R
M report.Rmd
M run.R
20 files changed, 388 insertions(+), 65 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/autoreporter 
refs/changes/02/392102/1

diff --git a/functions.R b/functions.R
index a4c6367..a7dff7e 100644
--- a/functions.R
+++ b/functions.R
@@ -79,3 +79,59 @@
 ggplot2::scale_color_brewer(palette = "Set1") +
 ggplot2::labs(x = NULL, color = "Group", y = y_lab, title = title, subtitle = subtitle)
 }
+
+cppFunction('CharacterVector fill_in(CharacterVector ids) {
+  CharacterVector new_ids(ids.size());
+  String current_id = ids[0];
+  new_ids[0] = current_id;
+  for (int i = 1; i < ids.size(); i++) {
+if (ids[i] != NA_STRING) {
+  current_id = ids[i];
+}
+new_ids[i] = current_id;
+  }
+  return new_ids;
+}')
+
+cppFunction('NumericVector cumunique(CharacterVector ids) {
+  NumericVector count(ids.size());
+  String current_id = ids[0];
+  count[0] = 1;
+  for (int i = 1; i < ids.size(); i++) {
+if (ids[i] == current_id) {
+  count[i] = count[i-1];
+} else {
+  count[i] = count[i-1] + 1;
+  current_id = ids[i];
+}
+  }
+  return count;
+}')
+
+# Process interleaved team draft
+process_session <- function(df) {
+  processed_session <- unsplit(lapply(split(df, df$serp_id), function(df) {
+if (is.na(df$event_extraParams[1]) || df$event_extraParams[1] == "") {
+  visited_pages <- rep(as.character(NA), times = nrow(df))
+} else {
+  from_json <- jsonlite::fromJSON(df$event_extraParams[1], simplifyVector = FALSE)
+  if (!("teamDraft" %in% names(from_json)) || all(is.na(df$article_id))) {
+visited_pages <- rep(as.character(NA), times = nrow(df))
+  } else {
+team_a <- unlist(from_json$teamDraft$a)
+team_b <- unlist(from_json$teamDraft$b)
+visited_pages <- vapply(df$article_id, function(article_id) {
+  if (article_id %in% team_a) {
+return("A")
+  } else if (article_id %in% team_b) {
+return("B")
+  } else {
+return(as.character(NA))
+  }
+}, "")
+  }
+}
+return(visited_pages)
+  }), df$serp_id)
+  return(processed_session)
+}
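
As a reading aid for the two Rcpp helpers added above, the following pure-R sketch mirrors what `fill_in()` and `cumunique()` appear to compute. The names `fill_in_r` and `cumunique_r` are hypothetical and are not part of the patch; the sketch also assumes `cumunique()` is only called on vectors without `NA` (for example after `fill_in()`), which the patch does not state.

```r
# Pure-R sketch of fill_in(): carry the last non-NA id forward.
fill_in_r <- function(ids) {
  out <- ids
  current <- ids[1]
  for (i in seq_along(ids)) {
    if (!is.na(ids[i])) current <- ids[i]
    out[i] <- current
  }
  out
}

# Pure-R sketch of cumunique(): running count of runs of identical ids.
# Assumes ids contains no NA (for example, after fill_in_r()).
cumunique_r <- function(ids) {
  if (length(ids) == 0) return(numeric(0))
  changed <- c(TRUE, ids[-1] != ids[-length(ids)])
  cumsum(changed)
}

ids <- c("a", NA, NA, "b", NA, "a")
filled <- fill_in_r(ids)      # "a" "a" "a" "b" "b" "a"
runs <- cumunique_r(filled)   # 1 1 1 2 2 3
```

Note that the counter increases whenever the id differs from the previous one, so a value that reappears later (the final "a" here) starts a new run rather than reusing its earlier number.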
diff --git a/modules/data/data_aggregation.R b/modules/data/data_aggregation.R
index db5a178..6115703 100644
--- a/modules/data/data_aggregation.R
+++ b/modules/data/data_aggregation.R
@@ -8,9 +8,9 @@
 
 message("Aggregating by search...")
 searches <- events %>%
-  keep_where(!(is.na(serp_id))) %>% # remove visitPage and checkin events
-  arrange(date, session_id, serp_id, timestamp) %>%
-  group_by(group, wiki, session_id, serp_id) %>%
+  keep_where(!(is.na(search_id))) %>% # remove visitPage and checkin events
+  arrange(date, session_id, search_id, timestamp) %>%
+  group_by(group, wiki, session_id, search_id) %>%
   summarize(
 date = date[1],
 timestamp = timestamp[1],
@@ -56,19 +56,19 @@
 keep_where(event == "searchResultPage", `some same-wiki results` == "TRUE") %>%
 # SERPs with 0 results will not have an offset in extraParams ^
 mutate(offset = purrr::map_int(event_extraParams, ~ parse_extraParams(.x, action = "searchResultPage")$offset)) %>%
-select(session_id, event_id, serp_id, offset)
+select(session_id, event_id, search_id, offset)
 
   message("Processing SERP interwiki data...")
-  extract_iw <- function(session_id, event_id, serp_id, event_extraParams) {
+  extract_iw <- function(session_id, event_id, search_id, event_extraParams) {
 return(data.frame(
-  session_id, event_id, serp_id,
+  session_id, event_id, search_id,
   parse_extraParams(event_extraParams, action = "searchResultPage")$iw,
   stringsAsFactors = FALSE
 ))
   }
   serp_iw <- events %>%
 keep_where(event == "searchResultPage") %>%
-select(session_id, event_id, serp_id, event_extraParams) %>%
+select(session_id, event_id, search_id, event_extraParams) %>%
 purrr::pmap_df(extract_iw) %>%
 mutate(source = case_when(
   source == "wikt" ~ "Wiktionary",
@@ -104,12 +104,12 @@
 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: db1047 => db1108

2017-11-13 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/391062 )

Change subject: db1047 => db1108
..


db1047 => db1108

Bug: T156844
Change-Id: I91270ef00fcb698e686e536162ab4d330ba7cb2b
---
M CHANGELOG.md
M README.md
M modules/metrics/maps/config.yaml
M modules/metrics/portal/config.yaml
M modules/metrics/search/config.yaml
5 files changed, 22 insertions(+), 9 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/CHANGELOG.md b/CHANGELOG.md
index 3c37dae..6c428e2 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,10 @@
 # Change Log (Patch Notes)
 All notable changes to this project will be documented in this file.
 
+## 2017/11/13
+- Switched host name from db1047.eqiad.wmnet to db1108.eqiad.wmnet per [T156844](https://phabricator.wikimedia.org/T156844)
+- Updated documentation
+
 ## 2017/11/02
 - Disabled forecasting (per [T112170#3724472](https://phabricator.wikimedia.org/T112170#3724472))
 
diff --git a/README.md b/README.md
index f8dfca6..b4f6b70 100644
--- a/README.md
+++ b/README.md
@@ -119,12 +119,15 @@
 - [x] Search on Mobile Web
 - [x] [Event counts](modules/metrics/search/mobile_event_counts.sql)
 - [x] [Load times](modules/metrics/search/mobile_load_times) (invokes [load_times.R](modules/metrics/search/load_times.R))
+- [x] [Session counts](modules/metrics/search/mobile_session_counts) (invokes [mobile_session_counts.R](modules/metrics/search/mobile_session_counts.R))
 - [x] Search on Desktop
 - [x] [Event counts](modules/metrics/search/desktop_event_counts.sql)
 - [x] [Load times](modules/metrics/search/desktop_load_times) (invokes [load_times.R](modules/metrics/search/load_times.R))
 - [x] [Survival/LDN: Retention of users on visited pages](modules/metrics/search/sample_page_visit_ld) ([T113297](https://phabricator.wikimedia.org/T113297))
 - [x] [Dwell-time: % of users visiting results for more than 10s](modules/metrics/search/search_threshold_pass_rate) ([T113297](https://phabricator.wikimedia.org/T113297), [T113513](https://phabricator.wikimedia.org/T113513), [Change 240593](https://gerrit.wikimedia.org/r/#/c/240593/))
+- [x] [Time spent on search result pages (SRPs)](modules/metrics/search/srp_survtime) (invokes [srp_survtime.R](modules/metrics/search/srp_survtime.R))
 - [x] [PaulScore](modules/metrics/search/paulscore_approximations) ([T144424](https://phabricator.wikimedia.org/T144424))
+- [x] [Bounce rate](modules/metrics/search/desktop_return_rate) (invokes [desktop_return_rate.R](modules/metrics/search/desktop_return_rate.R))
 - Dwell-time, PaulScore, event counts, etc. broken down by language-project (planned, [T150410](https://phabricator.wikimedia.org/T150410))
 - [x] Zero results rate (all invoke [cirrus_aggregates.R](modules/metrics/search/cirrus_aggregates.R))
 - [x] Overall
@@ -140,9 +143,15 @@
 - [x] [No automata](modules/metrics/search/cirrus_langproj_breakdown_no_automata)
 - [x] [With automata](modules/metrics/search/cirrus_langproj_breakdown_with_automata)
 - Well-behaved searchers (planned, [T150901](https://phabricator.wikimedia.org/T150901))
-- Probable non-bots, as detected by ML (planned, [T149440](https://phabricator.wikimedia.org/T149440)
+- Probable non-bots, as detected by ML (abandoned, [T149440](https://phabricator.wikimedia.org/T149440)
+- [x] Sister search
+  - [x] [Prevalence on SRPs](modules/metrics/search/sister_search_prevalence.sql)
+  - [x] [Traffic to sister projects from Wikipedia SRPs](modules/metrics/search/sister_search_traffic)
+- [x] [Article pageviews from full-text search](modules/metrics/search/pageviews_from_fulltext_search)
+- [x] [Full-text SRP views by device and agent type](modules/metrics/search/search_result_pages)
   - [x] [Wikipedia.org Portal](https://www.mediawiki.org/wiki/Wikipedia.org_Portal) ([configuration](modules/metrics/portal/config.yaml), [T118994](https://phabricator.wikimedia.org/T118994))
 - [x] [Pageviews](modules/metrics/portal/pageviews) ([T125737](https://phabricator.wikimedia.org/T125737), [T143064](https://phabricator.wikimedia.org/T143064), [T143605](https://phabricator.wikimedia.org/T143605))
+- [x] [Pageviews by device (mobile vs desktop)](modules/metrics/portal/pageviews_by_device)
 - [x] [Referers](modules/metrics/portal/referer_data)
 - [x] [User Agent breakdown](modules/metrics/portal/user_agent_data)
 - [x] Languages
@@ -162,6 +171,8 @@
 - [x] [Last performed 
action](modules/metrics/portal/last_action_country)
 - [x] [Most commonly clicked section per 
visit](modules/metrics/portal/most_common_country)
 - [x] [Clickthrough on first 
visit](modules/metrics/portal/first_visits_country)
+- [x] 

[MediaWiki-commits] [Gerrit] wikimedia...wmf[master]: db1047 => db1108

2017-11-13 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/391063 )

Change subject: db1047 => db1108
..


db1047 => db1108

Bug: T156844
Change-Id: I81f0f93a97f7467e1fcf30e20c252fc044bbbd31
---
M DESCRIPTION
M NEWS.md
M R/mysql.R
M man/mysql.Rd
4 files changed, 8 insertions(+), 4 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/DESCRIPTION b/DESCRIPTION
index d0d3314..57e4d2f 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,8 +1,8 @@
 Package: wmf
 Type: Package
 Title: R Code for Wikimedia Foundation Internal Usage
-Version: 0.3.0
-Date: 2017-11-01
+Version: 0.3.1
+Date: 2017-11-13
 Authors@R: c(
 person("Mikhail", "Popov", email = "mikh...@wikimedia.org", role = 
c("aut", "cre")),
 person("Oliver", "Keyes", role = "aut", comment = "No longer employed at 
the Foundation"),
diff --git a/NEWS.md b/NEWS.md
index c3927ce..dfd22b8 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,3 +1,7 @@
+wmf 0.3.1
+=
+* Switched host name from db1047.eqiad.wmnet to db1108.eqiad.wmnet per 
[T156844](https://phabricator.wikimedia.org/T156844)
+
 wmf 0.3.0
 =
 * C++-based `exact_binomial()` to quickly estimate sample size for exact 
binomial tests
diff --git a/R/mysql.R b/R/mysql.R
index 4594a36..725b8c4 100644
--- a/R/mysql.R
+++ b/R/mysql.R
@@ -33,7 +33,7 @@
 #' @export
 mysql_connect <- function(
   database, default_file = NULL,
-  hostname = ifelse(database == "log", "db1047.eqiad.wmnet", 
"analytics-store.eqiad.wmnet")
+  hostname = ifelse(database == "log", "db1108.eqiad.wmnet", 
"analytics-store.eqiad.wmnet")
 ) {
   # Begin Exclude Linting
   if (is.null(default_file)) {
diff --git a/man/mysql.Rd b/man/mysql.Rd
index 09c9606..2032aa8 100644
--- a/man/mysql.Rd
+++ b/man/mysql.Rd
@@ -11,7 +11,7 @@
 \title{Work with MySQL databases}
 \usage{
 mysql_connect(database, default_file = NULL, hostname = ifelse(database ==
-  "log", "db1047.eqiad.wmnet", "analytics-store.eqiad.wmnet"))
+  "log", "db1108.eqiad.wmnet", "analytics-store.eqiad.wmnet"))
 
 mysql_read(query, database, con = NULL)
 

-- 
To view, visit https://gerrit.wikimedia.org/r/391063
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I81f0f93a97f7467e1fcf30e20c252fc044bbbd31
Gerrit-PatchSet: 2
Gerrit-Project: wikimedia/discovery/wmf
Gerrit-Branch: master
Gerrit-Owner: Bearloga 
Gerrit-Reviewer: Chelsyx 

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits


[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Disable forecasting

2017-11-02 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/388117 )

Change subject: Disable forecasting
..


Disable forecasting

Bug: T112170
Change-Id: Ie985c774b83e961b526bd86d1ec17754a0f03c66
---
M CHANGELOG.md
M README.md
M docs/README.Rmd
M docs/README.md
M main.sh
M test.R
6 files changed, 30 insertions(+), 80 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9767a68..3c37dae 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,9 @@
 # Change Log (Patch Notes)
 All notable changes to this project will be documented in this file.
 
+## 2017/11/02
+- Disabled forecasting (per 
[T112170#3724472](https://phabricator.wikimedia.org/T112170#3724472))
+
 ## 2017/10/05
 - Changed which hostname the SQL queries are run on 
([T176639](https://phabricator.wikimedia.org/T176639))
 
diff --git a/README.md b/README.md
index e93ed81..f8dfca6 100644
--- a/README.md
+++ b/README.md
@@ -183,7 +183,7 @@
 - KPIs (planned)
   - [x] External Traffic 
([configuration](modules/metrics/external_traffic/config.yaml))
 - [x] [Referer data](modules/metrics/external_traffic/referer_data) 
([T116295](https://phabricator.wikimedia.org/T116295), [Change 
247601](https://gerrit.wikimedia.org/r/#/c/247601/))
-- [x] **Forecasts** 
([modules/forecasts/forecast.R](modules/forecasts/forecast.R), see 
[T112170](https://phabricator.wikimedia.org/T112170) for more details)
+- [x] **Forecasts** 
([modules/forecasts/forecast.R](modules/forecasts/forecast.R), see 
[T112170](https://phabricator.wikimedia.org/T112170) for more details) 
(DISABLED)
   - [x] Search ([configuration](modules/forecasts/search/config.yaml))
 - [x] Cirrus API usage
 - [x] [ARIMA-modelled 
forecasts](modules/forecasts/search/api_cirrus_arima)
diff --git a/docs/README.Rmd b/docs/README.Rmd
index c7d27c0..ff37748 100644
--- a/docs/README.Rmd
+++ b/docs/README.Rmd
@@ -39,20 +39,3 @@
 ```{r results='asis'}
 print_reports(metrics)
 ```
-
-## Daily Forecasts
-
-```{r yamls_forecasts}
-config_yamls <- list.files(path = "../modules/forecasts", pattern = 
"^config\\.yaml$", recursive = TRUE, full.names = TRUE)
-names(config_yamls) <- sub("../modules/forecasts/", "", dirname(config_yamls), 
fixed = TRUE)
-forecasts <- dplyr::bind_rows(lapply(config_yamls, function(path) {
-  config_yaml <- 
suppressMessages(suppressWarnings(data.tree::as.Node(yaml::yaml.load_file(path))))
-  reports <- data.tree::ToDataFrameTable(config_yaml[["reports"]], "report" = 
"name", "description")
-  reports$path = paste0(file.path(dirname(path), reports$report), 
ifelse(reports$type == "sql", ".sql", ""))
-  return(reports)
-}), .id = "module")
-```
-
-```{r results='asis'}
-print_reports(forecasts)
-```
diff --git a/docs/README.md b/docs/README.md
index 2053bcf..ef4ef60 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,14 +1,14 @@
 Discovery Datasets
 ==
 
-These files are generated by Discovery's
+These files are generated by Discovery’s
 [Golden](https://github.com/wikimedia/wikimedia-discovery-golden/) data
 retrieval codebase that executes daily and uses
 
[Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Systems/Reportupdater)
 infrastructure. These datasets provide the metrics that are used by
-[Discovery's Dashboards](https://discovery.wmflabs.org/)
+[Discovery’s Dashboards](https://discovery.wmflabs.org/)
 
-Last updated on 27 September 2017
+Last updated on 02 November 2017
 
 Daily Metrics
 -
@@ -16,18 +16,18 @@
 external\_traffic/
 --
 
--   **referer\_data.tsv**: Pageviews broken down by referrer class (e.g.
-internal vs external) and search engine
+-   **referer\_data.tsv**: Pageviews broken down by referrer class
+(e.g. internal vs external) and search engine
 -   **referer\_nonbot\_data.tsv**: User-made pageviews broken down by
-referrer class (e.g. internal vs external) and search engine
+referrer class (e.g. internal vs external) and search engine
 
 maps/
 -
 
--   **actions\_per\_tool.tsv**: Actions broken down by feature (e.g.
-GeoHack)
+-   **actions\_per\_tool.tsv**: Actions broken down by feature
+(e.g. GeoHack)
 -   **users\_per\_feature.tsv**: Counts of users broken down by feature
-(e.g. GeoHack)
+(e.g. GeoHack)
 -   **users\_by\_country.tsv**: Counts of users broken down by top 10
 countries
 -   **tile\_aggregates\_with\_automata.tsv**: Tile counts by style, zoom
@@ -43,11 +43,11 @@
 ---
 
 -   **pageviews.tsv**: Wikipedia.org Portal pageviews, broken down by
-high-volume clients vs. low-volume clients
--   **referer\_data.tsv**: Pageviews broken down by referrer class (e.g.
-internal vs external)
--   **user\_agent\_data.tsv**: Wikipedia.org Portal visitors' browsers
--   **dwell\_metrics.tsv**: Wikipedia.org Portal visitors' dwell-time
+high-volume clients vs. 

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Annotation of mw.track bug fix

2017-10-18 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/385014 )

Change subject: Annotation of mw.track bug fix
..

Annotation of mw.track bug fix

Bug: T178097
Change-Id: I720a8ec47c0a2d86480f3b7c97880ee890a53c03
---
M modules/mobile_web/events.R
M tab_documentation/mobile_events.md
2 files changed, 5 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow 
refs/changes/14/385014/1

diff --git a/modules/mobile_web/events.R b/modules/mobile_web/events.R
index 3e0125a..8e044cc 100644
--- a/modules/mobile_web/events.R
+++ b/modules/mobile_web/events.R
@@ -40,12 +40,14 @@
 polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Mobile 
search events, by day") %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
-dyEvent(as.Date("2017-03-29"), "H (new header)", labelLoc = "bottom")
+dyEvent(as.Date("2017-03-29"), "H (new header)", labelLoc = "bottom") %>%
+dyEvent(as.Date("2017-09-28"), "B (mw.track bug)", labelLoc = "bottom")
 })
 
 output$mobile_session_plot <- renderDygraph({
   mobile_session %>%
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_mobile_event)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Mobile user 
sessions, by volume") %>%
-dyRangeSelector
+dyRangeSelector %>%
+dyEvent(as.Date("2017-09-28"), "B (mw.track bug)", labelLoc = "bottom")
 })
diff --git a/tab_documentation/mobile_events.md 
b/tab_documentation/mobile_events.md
index c8a029a..fc8db23 100644
--- a/tab_documentation/mobile_events.md
+++ b/tab_documentation/mobile_events.md
@@ -26,6 +26,7 @@
 * Between 5 May and 6 May 2015, approximately 40% of incoming EventLogging 
data was lost due to a wider EventLogging outage. You can read more about the 
outage 
[here](https://wikitech.wikimedia.org/wiki/Incident_documentation/20150506-EventLogging).
 * '__R__': on 2017-01-01 we started calculating all of Discovery's metrics 
using a new version of [our data retrieval and processing 
codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated 
to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' 
[Reportupdater 
infrastructure](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater). 
See [T150915](https://phabricator.wikimedia.org/T150915) for more details.
 * '__H__': on 2017-03-29 we deployed the new mobile header treatment 
(including the search box) which may result in the decrease of search. See 
[T176464](https://phabricator.wikimedia.org/T176464) for more information.
+* '__B__': on 2017-09-28 a bug in mw.track was fixed. Before 2017-09-28, if 
events are logged via mw.track, only events tracked during the first pageview 
of a user's session were logged. See 
[T175918](https://phabricator.wikimedia.org/T175918) for more details.
 
 Questions, bug reports, and feature suggestions
 --

-- 
To view, visit https://gerrit.wikimedia.org/r/385014

Gerrit-MessageType: newchange
Gerrit-Change-Id: I720a8ec47c0a2d86480f3b7c97880ee890a53c03
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/rainbow
Gerrit-Branch: develop
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Fix UI stuff

2017-10-18 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/384935 )

Change subject: Fix UI stuff
..


Fix UI stuff

Change-Id: Ie2402b37b124e4a2fdab2cb7697674d65342fd79
---
M parameters.yaml
M report.Rmd
2 files changed, 175 insertions(+), 118 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/parameters.yaml b/parameters.yaml
index 71eccbc..345068f 100644
--- a/parameters.yaml
+++ b/parameters.yaml
@@ -17,4 +17,5 @@
 event_action: NULL # if not NULL, only specified actions are selected
 event_source: fulltext # autocomplete not yet supported
 other_filter: "event_searchSessionId <> 'explore_similar_test'" # if not NULL, 
these filters will be appended to WHERE clause
+serp_dwell_time: false # If true, dwell time of fulltext search result pages 
from autocomplete will be included
 debug: false # setting to false hides messages and warnings
diff --git a/report.Rmd b/report.Rmd
index 50c315b..82d1981 100644
--- a/report.Rmd
+++ b/report.Rmd
@@ -19,6 +19,7 @@
   event_action: [searchResultPage, click, ssclick, visitPage, checkin, 
hover-on, hover-off, esclick]
   event_source: "fulltext"
   other_filter: "event_subTest IS NOT NULL"
+  serp_dwell_time: false # If true, dwell time of fulltext search result pages 
from autocomplete will be included
   debug: false # setting to false hides messages and warnings
 title: '`r params$report_title`'
 author: '`r paste("Generated by", ifelse(params$debug, Sys.info()["user"], 
"the automated A/B test reporting tool"))`'
@@ -97,10 +98,6 @@
 # Take all R colors from graphical devices (with grey removed)
 large_color_palette = grDevices::colors()[grep('gr(a|e)y', 
grDevices::colors(), invert = T)]
 ```
-
-`r if (!is.null(params$test_description)) { params$test_description }`
-
-This test ran from `r format(lubridate::ymd(params$start_date), "%d %B %Y")` 
to `r format(lubridate::ymd(params$end_date) - 1, "%d %B %Y")` on `r 
ifelse(is.null(params$wiki), "all wikis", paste(params$wiki, collapse = ", 
"))`. There were `r length(params$test_group_names)` test groups: `r 
paste(params$test_group_names, collapse = ", ")`. This report includes `r 
paste(params$event_source, collapse = ", ")` searches. Refer to Phabricator 
ticket [`r params$phab_ticket`](`r paste0("https://phabricator.wikimedia.org/", 
params$phab_ticket)`) for more details.
 
 ```{r sql_setup, echo=FALSE}
 is_stat_machine <- grepl("^stat1", Sys.info()["nodename"])
@@ -200,11 +197,15 @@
   if (is_stat_machine) {
 message("(Running on a stat machine.)")
 events_raw <- wmf::mysql_read(query, "log")
-fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log")
+if (params$serp_dwell_time) {
+  fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log")
+}
   } else {
 message("Using SSH tunnel & connection to Analytics-Store...")
 events_raw <- wmf::mysql_read(query, "log", con = con)
-fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log", con = con)
+if (params$serp_dwell_time) {
+  fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log", con = 
con)
+}
 message("Closing connection...")
 wmf::mysql_close(con)
   }
@@ -216,11 +217,16 @@
 
   message("Saving raw events data...")
   save(events_raw, file = file.path("data", gsub(.Platform$file.sep, "", 
params$report_title), paste0("events_raw_", gsub("[^0-9]", "", Sys.time()), 
".RData")))
-  message("Saving SERP data that are from autocomplete...")
-  save(fulltext_from_auto, file = file.path("data", gsub(.Platform$file.sep, 
"", params$report_title), paste0("fulltext_from_auto_", gsub("[^0-9]", "", 
Sys.time()), ".RData")))
-
+  
+  if (params$serp_dwell_time) {
+message("Saving SERP data that are from autocomplete...")
+save(fulltext_from_auto, file = file.path("data", gsub(.Platform$file.sep, 
"", params$report_title), paste0("fulltext_from_auto_", gsub("[^0-9]", "", 
Sys.time()), ".RData")))
+  }
+ 
   cat("**Query for full-text events**:\n\n```SQL\n", query, "\n```\n")
-  cat("**Query for SERP from autocomplete**:\n\n```SQL\n", query_autocomplete, 
"\n```\n")
+  if (params$serp_dwell_time) {
+cat("**Query for SERP from autocomplete**:\n\n```SQL\n", 
query_autocomplete, "\n```\n")
+  }
 
 } else {
 
@@ -232,6 +238,10 @@
 
 }
 ```
+
+`r if (!is.null(params$test_description)) { params$test_description }`
+
+This test ran from `r format(lubridate::ymd_hms(min(events_raw$timestamp)), 
"%d %B %Y")` to `r format(lubridate::ymd_hms(max(events_raw$timestamp)), "%d %B 
%Y")` on `r ifelse(is.null(params$wiki), "all wikis", paste(params$wiki, 
collapse = ", "))`. There were `r length(params$test_group_names)` test groups: 
`r paste(params$test_group_names, collapse = ", ")`. This report includes `r 
paste(params$event_source, collapse = ", ")` searches. Refer to Phabricator 
ticket [`r params$phab_ticket`](`r 

[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Fix UI stuff

2017-10-18 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/384935 )

Change subject: Fix UI stuff
..

Fix UI stuff

Change-Id: Ie2402b37b124e4a2fdab2cb7697674d65342fd79
---
M parameters.yaml
M report.Rmd
2 files changed, 179 insertions(+), 122 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/autoreporter 
refs/changes/35/384935/1

diff --git a/parameters.yaml b/parameters.yaml
index 71eccbc..345068f 100644
--- a/parameters.yaml
+++ b/parameters.yaml
@@ -17,4 +17,5 @@
 event_action: NULL # if not NULL, only specified actions are selected
 event_source: fulltext # autocomplete not yet supported
 other_filter: "event_searchSessionId <> 'explore_similar_test'" # if not NULL, 
these filters will be appended to WHERE clause
+serp_dwell_time: false # If true, dwell time of fulltext search result pages 
from autocomplete will be included
 debug: false # setting to false hides messages and warnings
diff --git a/report.Rmd b/report.Rmd
index 50c315b..d70f2d5 100644
--- a/report.Rmd
+++ b/report.Rmd
@@ -19,6 +19,7 @@
   event_action: [searchResultPage, click, ssclick, visitPage, checkin, 
hover-on, hover-off, esclick]
   event_source: "fulltext"
   other_filter: "event_subTest IS NOT NULL"
+  serp_dwell_time: false # If true, dwell time of fulltext search result pages 
from autocomplete will be included
   debug: false # setting to false hides messages and warnings
 title: '`r params$report_title`'
 author: '`r paste("Generated by", ifelse(params$debug, Sys.info()["user"], 
"the automated A/B test reporting tool"))`'
@@ -97,10 +98,6 @@
 # Take all R colors from graphical devices (with grey removed)
 large_color_palette = grDevices::colors()[grep('gr(a|e)y', 
grDevices::colors(), invert = T)]
 ```
-
-`r if (!is.null(params$test_description)) { params$test_description }`
-
-This test ran from `r format(lubridate::ymd(params$start_date), "%d %B %Y")` 
to `r format(lubridate::ymd(params$end_date) - 1, "%d %B %Y")` on `r 
ifelse(is.null(params$wiki), "all wikis", paste(params$wiki, collapse = ", 
"))`. There were `r length(params$test_group_names)` test groups: `r 
paste(params$test_group_names, collapse = ", ")`. This report includes `r 
paste(params$event_source, collapse = ", ")` searches. Refer to Phabricator 
ticket [`r params$phab_ticket`](`r paste0("https://phabricator.wikimedia.org/", 
params$phab_ticket)`) for more details.
 
 ```{r sql_setup, echo=FALSE}
 is_stat_machine <- grepl("^stat1", Sys.info()["nodename"])
@@ -200,11 +197,15 @@
   if (is_stat_machine) {
 message("(Running on a stat machine.)")
 events_raw <- wmf::mysql_read(query, "log")
-fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log")
+if (params$serp_dwell_time) {
+  fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log")
+}
   } else {
 message("Using SSH tunnel & connection to Analytics-Store...")
 events_raw <- wmf::mysql_read(query, "log", con = con)
-fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log", con = con)
+if (params$serp_dwell_time) {
+  fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log", con = 
con)
+}
 message("Closing connection...")
 wmf::mysql_close(con)
   }
@@ -216,11 +217,16 @@
 
   message("Saving raw events data...")
   save(events_raw, file = file.path("data", gsub(.Platform$file.sep, "", 
params$report_title), paste0("events_raw_", gsub("[^0-9]", "", Sys.time()), 
".RData")))
-  message("Saving SERP data that are from autocomplete...")
-  save(fulltext_from_auto, file = file.path("data", gsub(.Platform$file.sep, 
"", params$report_title), paste0("fulltext_from_auto_", gsub("[^0-9]", "", 
Sys.time()), ".RData")))
-
+  
+  if (params$serp_dwell_time) {
+message("Saving SERP data that are from autocomplete...")
+save(fulltext_from_auto, file = file.path("data", gsub(.Platform$file.sep, 
"", params$report_title), paste0("fulltext_from_auto_", gsub("[^0-9]", "", 
Sys.time()), ".RData")))
+  }
+ 
   cat("**Query for full-text events**:\n\n```SQL\n", query, "\n```\n")
-  cat("**Query for SERP from autocomplete**:\n\n```SQL\n", query_autocomplete, 
"\n```\n")
+  if (params$serp_dwell_time) {
+cat("**Query for SERP from autocomplete**:\n\n```SQL\n", 
query_autocomplete, "\n```\n")
+  }
 
 } else {
 
@@ -232,6 +238,10 @@
 
 }
 ```
+
+`r if (!is.null(params$test_description)) { params$test_description }`
+
+This test ran from `r format(lubridate::ymd_hms(min(events_raw$timestamp)), 
"%d %B %Y")` to `r format(lubridate::ymd_hms(max(events_raw$timestamp)), "%d %B 
%Y")` on `r ifelse(is.null(params$wiki), "all wikis", paste(params$wiki, 
collapse = ", "))`. There were `r length(params$test_group_names)` test groups: 
`r paste(params$test_group_names, collapse = ", ")`. This report includes `r 
paste(params$event_source, collapse = ", ")` searches. Refer to Phabricator 
ticket [`r params$phab_ticket`](`r 

[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Bug fixes

2017-10-12 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/383960 )

Change subject: Bug fixes
..


Bug fixes

Change-Id: Ifa99d8f6796a091124a0c902b8d2e370a9ec5b13
---
M report.Rmd
1 file changed, 21 insertions(+), 19 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/report.Rmd b/report.Rmd
index ba84ad6..50c315b 100644
--- a/report.Rmd
+++ b/report.Rmd
@@ -94,6 +94,8 @@
   )
 })
 source("functions.R")
+# Take all R colors from graphical devices (with grey removed)
+large_color_palette = grDevices::colors()[grep('gr(a|e)y', 
grDevices::colors(), invert = T)]
 ```
 
 `r if (!is.null(params$test_description)) { params$test_description }`
@@ -514,7 +516,7 @@
 ```{r event_count_wiki, echo=FALSE, eval=(n_wiki > 1), fig.height=(5 * n_wiki)}
 event_count_function(by_wiki = TRUE) + 
   theme_facet() +
-  facet_wrap(~ wiki, nrow = n_wiki, scales = "free_y")
+  facet_wrap(~ wiki, ncol = 1, scales = "free_y")
 ```
 
 ```{r event_after_click_all, echo=FALSE}
@@ -529,10 +531,10 @@
 event_after_click_function() + theme_min()
 ```
 
-```{r event_after_click_wiki, echo=FALSE, eval=(n_wiki > 1), fig.height=(5 * 
n_wiki)}
+```{r event_after_click_wiki, echo=FALSE, eval=(n_wiki > 1), fig.height=(5 * 
ceiling(n_wiki / 2))}
 event_after_click_function(by_wiki = TRUE) +
   theme_facet() +
-  facet_wrap(~ wiki, nrow = n_wiki, scales = "free_y")
+  facet_wrap(~ wiki, ncol = 2, scales = "free_y")
 ```
 
  Searches
@@ -559,7 +561,7 @@
   knitr::kable()
 ```
 
-```{r daily_searches, echo=FALSE}
+```{r daily_searches, echo=FALSE, fig.height=(4 * n_wiki)}
 searches %>%
   group_by(group, wiki, date) %>%
   summarize(`All Searches` = n(), `Searches with Results` = sum(`got same-wiki 
results`), `Searches with Clicks` = sum(`same-wiki clickthrough`)) %>%
@@ -583,7 +585,7 @@
 group_by(!!! rlang::syms(c("group", "results", switch(by_wiki, "wiki", 
NULL)))) %>%
 summarize(searches = length(unique(serp_id[!is.na(serp_id)]))) %>%
 bar_chart(x = "results", y = "searches", x_lab = "Number of same-wiki 
results returned", 
-  y_lab = "Number of searches", title = expression(paste("Number 
of searches with ", italic("n"), " same-wiki result returned, by test group", 
switch(by_wiki, "and wiki", NULL))))
+  y_lab = "Number of searches", title = paste("Number of searches 
with n same-wiki result returned, by test group", switch(by_wiki, "and wiki", 
NULL)))
 }
 n_results_summary_function() + theme_min()
 ```
@@ -609,7 +611,7 @@
 group_by(!!! rlang::syms(c("group", "offset", switch(by_wiki, "wiki", 
NULL)))) %>%
 tally %>%
 bar_chart(x = "offset", y = "n", x_lab = "Offset", y_lab = "Number of 
SERPs", 
-  title = expression(paste("Number of SERPs with ", italic("n"), " 
offset results, by test group", switch(by_wiki, "and wiki", NULL))),
+  title = paste("Number of SERPs with n offset results, by test 
group", switch(by_wiki, "and wiki", NULL)),
   caption = "This can be regarded as a proxy for users visiting 
additional pages of their search results.") +
 scale_x_discrete(limits = c("No offset (page 1)", Pluralize(c(20, 40, 60, 
80), "result"), "100+ results"))
 }
@@ -643,14 +645,15 @@
 tally %>%
 mutate(prop = paste0(scales::percent_format()(n/sum(n)), " (", n, ")")) %>%
 select(-n) %>%
-tidyr::spread(group, prop)
+tidyr::spread(group, prop) %>%
+ungroup
 }
 get_bayes_factor <- function(data) {
   BF <- data %>%
 tally %>%
 tidyr::spread(group, n) %>%
 ungroup %>%
-select(params$test_group_names) %>%
+select(dplyr::one_of(params$test_group_names)) %>%
 as.matrix() %>%
 # see http://bayesfactorpcl.r-forge.r-project.org/#ctables for more info
 BayesFactor::contingencyTableBF(sampleType = "indepMulti", fixedMargin = 
"cols")
@@ -808,7 +811,7 @@
 iwclick_position_function() + theme_min()
 ```
 
-```{r iwclick_position_wiki, eval=("iwclick" %in% events$event & n_wiki > 1), 
echo=FALSE, fig.height=(5 * n_wiki)}
+```{r iwclick_position_wiki, eval=("iwclick" %in% events$event & n_wiki > 1), 
echo=FALSE, fig.height=(4 * n_wiki)}
 iwclick_position_function(by_wiki = TRUE) + 
   facet_wrap(~ wiki, nrow = n_wiki, scales = "free_y") +
   theme_facet()
@@ -1044,7 +1047,7 @@
   theme_facet()
 ```
 
-```{r engagement_OR_wiki, eval=(n_wiki > 1), echo=FALSE, results='asis', 
include=TRUE}
+```{r engagement_OR_wiki, eval=(n_wiki > 1), echo=FALSE, fig.width=11, 
fig.height=10, results='asis', include=TRUE}
 # TODO: duplicated code engagement_OR_all
 control_group <- grep("control", params$`test_group_names`, value = TRUE)
 test_group <- setdiff(params$`test_group_names`, control_group)
@@ -1063,17 +1066,16 @@
   labels = c("Pr[Control Engaging]", "Pr[Test Engaging]", "Pr[Test] - 
Pr[Control]", "Relative Risk", "Odds Ratio")
 )) %>%
 ggplot(aes(x = 1, y = estimate, ymin = 

[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Bug fixes

2017-10-12 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/383960 )

Change subject: Bug fixes
..

Bug fixes

Change-Id: Ifa99d8f6796a091124a0c902b8d2e370a9ec5b13
---
M report.Rmd
1 file changed, 21 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/autoreporter 
refs/changes/60/383960/1

diff --git a/report.Rmd b/report.Rmd
index ba84ad6..50c315b 100644
--- a/report.Rmd
+++ b/report.Rmd
@@ -94,6 +94,8 @@
   )
 })
 source("functions.R")
+# Take all R colors from graphical devices (with grey removed)
+large_color_palette = grDevices::colors()[grep('gr(a|e)y', 
grDevices::colors(), invert = T)]
 ```
 
 `r if (!is.null(params$test_description)) { params$test_description }`
@@ -514,7 +516,7 @@
 ```{r event_count_wiki, echo=FALSE, eval=(n_wiki > 1), fig.height=(5 * n_wiki)}
 event_count_function(by_wiki = TRUE) + 
   theme_facet() +
-  facet_wrap(~ wiki, nrow = n_wiki, scales = "free_y")
+  facet_wrap(~ wiki, ncol = 1, scales = "free_y")
 ```
 
 ```{r event_after_click_all, echo=FALSE}
@@ -529,10 +531,10 @@
 event_after_click_function() + theme_min()
 ```
 
-```{r event_after_click_wiki, echo=FALSE, eval=(n_wiki > 1), fig.height=(5 * 
n_wiki)}
+```{r event_after_click_wiki, echo=FALSE, eval=(n_wiki > 1), fig.height=(5 * 
ceiling(n_wiki / 2))}
 event_after_click_function(by_wiki = TRUE) +
   theme_facet() +
-  facet_wrap(~ wiki, nrow = n_wiki, scales = "free_y")
+  facet_wrap(~ wiki, ncol = 2, scales = "free_y")
 ```
 
  Searches
@@ -559,7 +561,7 @@
   knitr::kable()
 ```
 
-```{r daily_searches, echo=FALSE}
+```{r daily_searches, echo=FALSE, fig.height=(4 * n_wiki)}
 searches %>%
   group_by(group, wiki, date) %>%
   summarize(`All Searches` = n(), `Searches with Results` = sum(`got same-wiki 
results`), `Searches with Clicks` = sum(`same-wiki clickthrough`)) %>%
@@ -583,7 +585,7 @@
 group_by(!!! rlang::syms(c("group", "results", switch(by_wiki, "wiki", 
NULL)))) %>%
 summarize(searches = length(unique(serp_id[!is.na(serp_id)]))) %>%
 bar_chart(x = "results", y = "searches", x_lab = "Number of same-wiki 
results returned", 
-  y_lab = "Number of searches", title = expression(paste("Number 
of searches with ", italic("n"), " same-wiki result returned, by test group", 
switch(by_wiki, "and wiki", NULL))))
+  y_lab = "Number of searches", title = paste("Number of searches 
with n same-wiki result returned, by test group", switch(by_wiki, "and wiki", 
NULL)))
 }
 n_results_summary_function() + theme_min()
 ```
@@ -609,7 +611,7 @@
 group_by(!!! rlang::syms(c("group", "offset", switch(by_wiki, "wiki", 
NULL)))) %>%
 tally %>%
 bar_chart(x = "offset", y = "n", x_lab = "Offset", y_lab = "Number of 
SERPs", 
-  title = expression(paste("Number of SERPs with ", italic("n"), " 
offset results, by test group", switch(by_wiki, "and wiki", NULL))),
+  title = paste("Number of SERPs with n offset results, by test 
group", switch(by_wiki, "and wiki", NULL)),
   caption = "This can be regarded as a proxy for users visiting 
additional pages of their search results.") +
 scale_x_discrete(limits = c("No offset (page 1)", Pluralize(c(20, 40, 60, 
80), "result"), "100+ results"))
 }
@@ -643,14 +645,15 @@
 tally %>%
 mutate(prop = paste0(scales::percent_format()(n/sum(n)), " (", n, ")")) %>%
 select(-n) %>%
-tidyr::spread(group, prop)
+tidyr::spread(group, prop) %>%
+ungroup
 }
 get_bayes_factor <- function(data) {
   BF <- data %>%
 tally %>%
 tidyr::spread(group, n) %>%
 ungroup %>%
-select(params$test_group_names) %>%
+select(dplyr::one_of(params$test_group_names)) %>%
 as.matrix() %>%
 # see http://bayesfactorpcl.r-forge.r-project.org/#ctables for more info
 BayesFactor::contingencyTableBF(sampleType = "indepMulti", fixedMargin = 
"cols")
@@ -808,7 +811,7 @@
 iwclick_position_function() + theme_min()
 ```
 
-```{r iwclick_position_wiki, eval=("iwclick" %in% events$event & n_wiki > 1), 
echo=FALSE, fig.height=(5 * n_wiki)}
+```{r iwclick_position_wiki, eval=("iwclick" %in% events$event & n_wiki > 1), 
echo=FALSE, fig.height=(4 * n_wiki)}
 iwclick_position_function(by_wiki = TRUE) + 
   facet_wrap(~ wiki, nrow = n_wiki, scales = "free_y") +
   theme_facet()
@@ -1044,7 +1047,7 @@
   theme_facet()
 ```
 
-```{r engagement_OR_wiki, eval=(n_wiki > 1), echo=FALSE, results='asis', 
include=TRUE}
+```{r engagement_OR_wiki, eval=(n_wiki > 1), echo=FALSE, fig.width=11, 
fig.height=10, results='asis', include=TRUE}
 # TODO: duplicated code engagement_OR_all
 control_group <- grep("control", params$`test_group_names`, value = TRUE)
 test_group <- setdiff(params$`test_group_names`, control_group)
@@ -1063,17 +1066,16 @@
   labels = c("Pr[Control Engaging]", "Pr[Test Engaging]", "Pr[Test] - 
Pr[Control]", "Relative Risk", "Odds Ratio")
 )) %>%
 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: pageviews that are search results pages

2017-10-04 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/382320 )

Change subject: pageviews that are search results pages
..

pageviews that are search results pages

In T176464#3636190, @Jdlrobson mentioned that on some browsers, clicking the 
search icon takes you to a blank Special:Search page and lets you start from 
there. Therefore, we should exclude these blank SRPs from our counts.

Change-Id: If4aef7521a3268da85e7a3498cce1b33a2ee43a4
---
M modules/metrics/search/search_result_pages
M modules/metrics/search/sister_search_traffic
2 files changed, 10 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden 
refs/changes/20/382320/1

diff --git a/modules/metrics/search/search_result_pages 
b/modules/metrics/search/search_result_pages
index 348de5f..5908f4e 100755
--- a/modules/metrics/search/search_result_pages
+++ b/modules/metrics/search/search_result_pages
@@ -26,13 +26,11 @@
   AND page_id IS NULL
   AND (
 uri_path = '/wiki/Special:Search'
-OR (
-  uri_path = '/w/index.php'
-  AND (
-LENGTH(PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'search')) > 0
-OR LENGTH(PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'searchToken')) > 0
-  )
-)
+OR uri_path = '/w/index.php'
+  )
+  AND (
+LENGTH(PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'search')) > 0
+OR LENGTH(PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'searchToken')) > 0
   )
   ) AS serp
   GROUP BY date, access_method, agent_type;
diff --git a/modules/metrics/search/sister_search_traffic 
b/modules/metrics/search/sister_search_traffic
index 0e5b7c6..3e40bc0 100755
--- a/modules/metrics/search/sister_search_traffic
+++ b/modules/metrics/search/sister_search_traffic
@@ -23,13 +23,11 @@
   page_id IS NULL
   AND (
 uri_path = '/wiki/Special:Search'
-OR (
-  uri_path = '/w/index.php'
-  AND (
-PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'search') IS NOT NULL
-OR PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'searchToken') IS NOT NULL
-  )
-)
+OR uri_path = '/w/index.php'
+  )
+  AND (
+LENGTH(PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'search')) > 0
+OR LENGTH(PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'searchToken')) > 0
   )
 ) AS is_serp
   FROM webrequest

-- 
To view, visit https://gerrit.wikimedia.org/r/382320
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If4aef7521a3268da85e7a3498cce1b33a2ee43a4
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
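
The revised filter in the change above can be read as: the request hits one of the two search paths, and at least one of the `search` or `searchToken` query parameters is present and non-empty. Below is a minimal R sketch of that predicate, not the Hive query itself. `query_param()` and `is_serp()` are hypothetical helper names that do not exist in golden, the `page_id IS NULL` condition is left out, and Hive's `PARSE_URL(..., 'QUERY', key)` is only approximated with plain string splitting.

```r
# Hypothetical helper: value of one key in a raw query string such as
# "?search=cats&title=X". Returns NA if the key is absent and "" if it is
# present but empty (roughly what PARSE_URL(..., 'QUERY', key) distinguishes).
query_param <- function(uri_query, key) {
  pairs <- strsplit(sub("^\\?", "", uri_query), "&", fixed = TRUE)[[1]]
  hit <- pairs[startsWith(pairs, paste0(key, "="))]
  if (length(hit) == 0) {
    return(NA_character_)
  }
  substring(hit[1], nchar(key) + 2)
}

# Hypothetical predicate mirroring the new WHERE clause: a search path AND
# a non-empty `search` or `searchToken` parameter.
is_serp <- function(uri_path, uri_query) {
  on_search_path <- uri_path %in% c("/wiki/Special:Search", "/w/index.php")
  search <- query_param(uri_query, "search")
  token <- query_param(uri_query, "searchToken")
  has_query <- (!is.na(search) && nchar(search) > 0) ||
    (!is.na(token) && nchar(token) > 0)
  on_search_path && has_query
}

# A blank Special:Search page (no query at all) no longer counts as a SERP:
is_serp("/wiki/Special:Search", "")
# A request with an actual search term still does:
is_serp("/wiki/Special:Search", "?search=cats")
```

Applying the length check to both paths, rather than only to `/w/index.php`, is what drops the blank Special:Search pages mentioned in the commit message.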


[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Count the number of user session tokens by volume for mobile...

2017-09-29 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/381508 )

Change subject: Count the number of user session tokens by volume for mobile 
web search
..

Count the number of user session tokens by volume for mobile web search

Bug: T176811
Change-Id: I9ce01d5c6ffcce6ddb6e4fe35281d41c39f9f9d6
---
M modules/mobile_web/events.R
M tab_documentation/mobile_events.md
M ui.R
M utils.R
4 files changed, 45 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow 
refs/changes/08/381508/1

diff --git a/modules/mobile_web/events.R b/modules/mobile_web/events.R
index 6f326c6..3e0125a 100644
--- a/modules/mobile_web/events.R
+++ b/modules/mobile_web/events.R
@@ -1,6 +1,15 @@
+output$mobile_event_user_session <- renderValueBox(
+  valueBox(
+value = mobile_session_mean["Total user sessions"],
+subtitle = "User sessions per day*",
+icon = icon("search"),
+color = "green"
+  )
+)
+
 output$mobile_event_searches <- renderValueBox(
   valueBox(
-value = mobile_dygraph_means["search sessions"],
+value = mobile_dygraph_means["search start"],
 subtitle = "Search sessions per day*",
 icon = icon("search"),
 color = "green"
@@ -30,5 +39,13 @@
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_mobile_event)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Mobile 
search events, by day") %>%
 dyRangeSelector %>%
-dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-03-29"), "H (new header)", labelLoc = "bottom")
+})
+
+output$mobile_session_plot <- renderDygraph({
+  mobile_session %>%
+polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_mobile_event)) %>%
+polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Mobile user 
sessions, by volume") %>%
+dyRangeSelector
 })
diff --git a/tab_documentation/mobile_events.md 
b/tab_documentation/mobile_events.md
index e6859b9..c8a029a 100644
--- a/tab_documentation/mobile_events.md
+++ b/tab_documentation/mobile_events.md
@@ -1,13 +1,15 @@
-Mobile search
+Mobile web search
 ===
 
-User actions that we track around search on the mobile website generally fall 
into three categories:
+User actions that we track around prefix search on the mobile website 
generally fall into three categories:
 
-1. The start of a user's search session;
-2. The presentation of the user with a results page, and;
-3. A user clicking through to an article in the results page.
+1. **search start (aka search session)**: An API request is made to retrieve search results whenever the user types enough characters to perform a search (3 or more). A search session is identified by searchSessionToken. For example, if a user types "Bara", then a new search session is started; if they then type "ck" (Barack), then another new search session is started;
+2. **Result pages opened**: The API request has finished and the results have been rendered;
+3. **clickthroughs**: A user clicking through to an article in the results page.
 
-These three things are tracked via the [EventLogging 'MobileWebSearch' 
schema](https://meta.wikimedia.org/wiki/Schema:MobileWebSearch), and stored to 
a database. The results are then aggregated and anonymised, and presented on 
this page. For performance/privacy reasons we randomly sample what we store, so 
the actual numbers are a vast understatement of how many user actions our 
servers receive - what's more interesting is how they change over time. In the 
case of Mobile Web search, this sampling rate is *going* to be **0.1%**: it's 
currently turned off entirely but should be enabled soon.
+When a user opens the search overlay, a **user session** starts. We use a randomly generated userSessionToken to identify this search funnel. A user session can have multiple search sessions. We split user sessions into "low-volume", "medium-volume" and "high-volume" sessions. A "high-volume" session is a user session whose number of search sessions is equal to or greater than the 90th percentile for the whole population on any particular day. A "low-volume" session is a user session whose number of search sessions is equal to or less than the 5th percentile. The rest are categorized as "medium-volume".
+
+We use the [EventLogging 'MobileWebSearch' schema](https://meta.wikimedia.org/wiki/Schema:MobileWebSearch) to track these activities and store them in a database. Currently the schema tracks prefix search only. The results are then aggregated and anonymised, and presented on this page. For performance/privacy reasons we randomly sample what we store, so the actual numbers are a vast understatement of how many user actions our servers receive - what's more 

[MediaWiki-commits] [Gerrit] wikimedia...wetzel[develop]: Add maplink & mapframe prevalence graphs and modularize

2017-09-27 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/379150 )

Change subject: Add maplink & mapframe prevalence graphs and modularize
..


Add maplink & mapframe prevalence graphs and modularize

- Splits up server.R into modules (like Search & Portal dashboards)
- Adds maplink & mapframe prevalence graphs
  - Overall prevalence
  - Language-project breakdown of prevalence

Bug: T170022
Change-Id: If1f1efa619037ce8adea873c148f9a1f78376506
---
M CHANGELOG.md
A modules/feature_usage.R
A modules/geographic_breakdown.R
A modules/kartographer/language-project_breakdown.R
A modules/kartographer/overall_prevalence.R
A modules/kartotherian.R
M server.R
A tab_documentation/overall_prevalence.md
A tab_documentation/prevalence_langproj.md
M tab_documentation/tiles_summary.md
M ui.R
M utils.R
12 files changed, 586 insertions(+), 160 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/CHANGELOG.md b/CHANGELOG.md
index 208e2ab..f3e77ee 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,10 @@
 # Change Log (Patch Notes)
 All notable changes to this project will be documented in this file.
 
+## 2017/09/18
+- Modularized the dashboard source code
+- Added maplink & mapframe prevalence graphs 
([T170022](https://phabricator.wikimedia.org/T170022))
+
 ## 2017/06/20
 - Added licensing info ([T167930](https://phabricator.wikimedia.org/T167930))
 
diff --git a/modules/feature_usage.R b/modules/feature_usage.R
new file mode 100644
index 000..ec2460e
--- /dev/null
+++ b/modules/feature_usage.R
@@ -0,0 +1,55 @@
+output$users_per_platform <- renderDygraph({
+  user_data %>%
+polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_users_per_platform)) %>%
+polloi::make_dygraph("Date", "Events", "Unique users by platform, by day") 
%>%
+dyAxis("y", logscale = input$users_per_platform_logscale) %>%
+dyLegend(labelsDiv = "users_per_platform_legend", show = "always") %>%
+dyRangeSelector %>%
+dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>%
+dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+})
+
+output$geohack_feature_usage <- renderDygraph({
+  usage_data$GeoHack %>%
+polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_geohack_feature_usage)) %>%
+polloi::make_dygraph("Date", "Events", "Feature usage for GeoHack") %>%
+dyRangeSelector %>%
+dyAxis("y", logscale = input$geohack_feature_usage_logscale) %>%
+dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>%
+dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+})
+
+output$wikiminiatlas_feature_usage <- renderDygraph({
+  usage_data$WikiMiniAtlas %>%
+polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_wikiminiatlas_feature_usage)) %>%
+polloi::make_dygraph("Date", "Events", "Feature usage for WikiMiniAtlas") 
%>%
+dyRangeSelector %>%
+dyAxis("y", logscale = input$wikiminiatlas_feature_usage_logscale) %>%
+dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>%
+dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+})
+
+output$wikivoyage_feature_usage <- renderDygraph({
+  usage_data$Wikivoyage %>%
+polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_wikivoyage_feature_usage)) %>%
+polloi::make_dygraph("Date", "Events", "Feature usage for Wikivoyage") %>%
+dyRangeSelector %>%
+dyAxis("y", logscale = input$wikivoyage_feature_usage_logscale) %>%
+dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>%
+dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+})
+
+output$wiwosm_feature_usage <- renderDygraph({
+  usage_data$WIWOSM %>%
+polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_wiwosm_feature_usage)) %>%
+polloi::make_dygraph("Date", "Events", "Feature usage for WIWOSM") %>%
+dyRangeSelector %>%
+dyAxis("y", logscale = input$wiwosm_feature_usage_logscale) %>%
+dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>%
+dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+})
diff --git a/modules/geographic_breakdown.R b/modules/geographic_breakdown.R
new file mode 100644
index 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Session counts by volume for mobile web search

2017-09-27 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/381126 )

Change subject: Session counts by volume for mobile web search
..

Session counts by volume for mobile web search

Bug: T176811
Change-Id: I545a80a5f4214e3f170d6a104a48e6d30dddecc9
---
M docs/README.md
M modules/metrics/search/config.yaml
A modules/metrics/search/mobile_session_counts
A modules/metrics/search/mobile_session_counts.R
4 files changed, 80 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden 
refs/changes/26/381126/1

diff --git a/docs/README.md b/docs/README.md
index e5cc336..2053bcf 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -8,7 +8,7 @@
 infrastructure. These datasets provide the metrics that are used by
 [Discovery's Dashboards](https://discovery.wmflabs.org/)
 
-Last updated on 22 September 2017
+Last updated on 27 September 2017
 
 Daily Metrics
 -
@@ -204,6 +204,8 @@
 after clickthrough; Number of sessions with at least a click and the
 number of sessions that return to search for different things after
 clickthrough.
+-   **mobile\_session\_counts.tsv**: Number of user sessions on mobile
+web, broken down by high, medium and low volume.
 
 wdqs/
 -
diff --git a/modules/metrics/search/config.yaml 
b/modules/metrics/search/config.yaml
index 2181168..4b3f099 100644
--- a/modules/metrics/search/config.yaml
+++ b/modules/metrics/search/config.yaml
@@ -233,3 +233,8 @@
 granularity: days
 starts: 2017-04-01
 type: script
+mobile_session_counts:
+description: Number of user sessions on mobile web, broken down by 
high, medium and low volume.
+granularity: days
+starts: 2017-04-01
+type: script
diff --git a/modules/metrics/search/mobile_session_counts 
b/modules/metrics/search/mobile_session_counts
new file mode 100755
index 000..e88dc7e
--- /dev/null
+++ b/modules/metrics/search/mobile_session_counts
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+Rscript modules/metrics/search/mobile_session_counts.R -d $1
diff --git a/modules/metrics/search/mobile_session_counts.R 
b/modules/metrics/search/mobile_session_counts.R
new file mode 100644
index 000..89a3d10
--- /dev/null
+++ b/modules/metrics/search/mobile_session_counts.R
@@ -0,0 +1,69 @@
+#!/usr/bin/env Rscript
+
+source("config.R")
+.libPaths(r_library)
+suppressPackageStartupMessages(library("optparse"))
+
+option_list <- list(
+  make_option(c("-d", "--date"), default = NA, action = "store", type = "character")
+)
+
+# Get command line options, if help option encountered print help and exit,
+# otherwise if options not found on command line then set defaults:
+opt <- parse_args(OptionParser(option_list = option_list))
+
+if (is.na(opt$date)) {
+  quit(save = "no", status = 1)
+}
+
+# Build query:
+date_clause <- as.character(as.Date(opt$date), format = "LEFT(timestamp, 8) = '%Y%m%d'")
+
+query <- paste0("SELECT
+  DATE('", opt$date, "') AS date,
+  event_userSessionToken AS userSessionToken,
+  COUNT(DISTINCT event_searchSessionToken) AS n_search_session
+  FROM MobileWebSearch_12054448
+  WHERE ", date_clause, "
+  GROUP BY date, event_userSessionToken;")
+
+# Fetch data from MySQL database:
+results <- tryCatch(
+  suppressMessages(data.table::as.data.table(wmf::mysql_read(query, "log"))),
+  error = function(e) {
+return(data.frame())
+  }
+)
+
+if (nrow(results) == 0) {
+  # Here we make the script output tab-separated
+  # column names, as required by Reportupdater:
+  output <- data.frame(
+date = character(),
+user_sessions = numeric(),
+search_sessions = numeric(),
+high_volume = numeric(),
+medium_volume = numeric(),
+low_volume = numeric(),
+threshold_high = numeric(),
+threshold_low = numeric()
+  )
+} else {
+  # Split session counts:
+  `90th percentile` <- floor(quantile(results$n_search_session, 0.9))
+  `10th percentile` <- ceiling(quantile(results$n_search_session, 0.1))
+  results$session_type <- dplyr::case_when(
+results$n_search_session > `90th percentile` ~ "high_volume",
+results$n_search_session < `10th percentile` ~ "low_volume",
+TRUE ~ "medium_volume"
+  )
+  output <- cbind(date = opt$date,
+  user_sessions = nrow(results),
+  search_sessions = sum(results$n_search_session, na.rm = TRUE),
+  tidyr::spread(results[, list(userSession = length(userSessionToken)), by = "session_type"],
+session_type, userSession),
+  threshold_high = `90th percentile`,
+  threshold_low = `10th percentile`)
+}
+
+write.table(output, file = "", append = FALSE, sep = "\t", row.names = FALSE, quote = FALSE)

-- 
To view, visit https://gerrit.wikimedia.org/r/381126

Gerrit-MessageType: newchange
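
The volume split in `mobile_session_counts.R` above can be sketched in base R as follows. This is an illustration under assumptions rather than the script itself: `classify_sessions()` is a hypothetical helper name, the input vector is made-up toy data, and the cut-offs follow what the script does (floor of the 90th percentile, ceiling of the 10th percentile, strict inequalities), which may differ from other descriptions of the thresholds.

```r
# Hypothetical helper: label each user session by how many search sessions
# it contains, relative to that day's distribution.
classify_sessions <- function(n_search_session, low = 0.1, high = 0.9) {
  threshold_high <- floor(quantile(n_search_session, high, names = FALSE))
  threshold_low <- ceiling(quantile(n_search_session, low, names = FALSE))
  session_type <- ifelse(
    n_search_session > threshold_high, "high_volume",
    ifelse(n_search_session < threshold_low, "low_volume", "medium_volume")
  )
  list(
    session_type = session_type,
    threshold_high = threshold_high,
    threshold_low = threshold_low
  )
}

# Toy data (invented): search sessions per user session on one day.
counts <- c(1, 2, 2, 2, 3, 3, 3, 4, 5, 6, 20)
res <- classify_sessions(counts)

# Number of user sessions in each volume class:
table(res$session_type)
```

Because the comparisons are strict, sessions that sit exactly on a threshold fall into the medium class, so on days with many tied counts the low and high classes can be small or empty.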

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Sister search prevalence by language

2017-09-22 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/379939 )

Change subject: Sister search prevalence by language
..


Sister search prevalence by language

Adds the percentage of searches where the sister project search
results were shown to the user.

Change-Id: I4c59f2e693570b92d63d66826ca23400fc90be61
---
M CHANGELOG.md
A modules/sister_search/prevalence.R
M server.R
A tab_documentation/sister_search_prevalence.md
M ui.R
M utils.R
6 files changed, 111 insertions(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/CHANGELOG.md b/CHANGELOG.md
index 099e8a1..7ecd033 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,9 @@
 
 All notable changes to this project will be documented in this file.
 
+## 2017/09/25
+- Added sister project search result prevalence
+
 ## 2017/08/30
 - Added SRP visit times ([T170468](https://phabricator.wikimedia.org/T170468))
 - Added [dygraph-based rolling 
periods](https://rstudio.github.io/dygraphs/gallery-roll-periods.html) to page 
visit times modules
diff --git a/modules/sister_search/prevalence.R 
b/modules/sister_search/prevalence.R
new file mode 100644
index 000..0bedfa8
--- /dev/null
+++ b/modules/sister_search/prevalence.R
@@ -0,0 +1,37 @@
+output$sister_search_prevalence_lang_container <- renderUI({
+  languages_to_display <- sister_search_averages$language
+  names(languages_to_display) <- sprintf("%s (%.1f%%)", 
sister_search_averages$language, sister_search_averages$avg)
+  if (input$sister_search_prevalence_lang_order != "alphabet") {
+languages_to_display <- languages_to_display[order(
+  sister_search_averages$avg,
+  decreasing = input$sister_search_prevalence_lang_order == "high2low"
+)]
+  }
+  if (!is.null(input$language_selector)) {
+selected_language <- input$language_selector
+  } else {
+selected_language <- languages_to_display[1]
+  }
+  return(selectInput(
+"sister_search_prevalence_lang_selector", "Language",
+multiple = TRUE, selectize = FALSE, size = 19,
+choices = languages_to_display, selected = selected_language
+  ))
+})
+
+output$sister_search_prevalence_plot <- renderDygraph({
+  req(input$sister_search_prevalence_lang_selector)
+  sister_search_prevalence %>%
+dplyr::filter(language %in% input$sister_search_prevalence_lang_selector) 
%>%
+tidyr::spread(language, prevalence, fill = 0) %>%
+polloi::reorder_columns() %>%
+polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_sister_search_prevalence_plot)) %>%
+polloi::make_dygraph("Date", "Prevalence (%)", "Wikipedia searches that 
showed sister project search results") %>%
+dyLegend(show = "always", width = 400, labelsDiv = 
"sister_search_prevalence_plot_legend") %>%
+dyAxis("y",
+  axisLabelFormatter = "function(x) { return x + '%'; }",
+  valueFormatter = "function(x) { return Math.round(x * 100)/100 + '%'; }"
+) %>%
+dyAxis("x", axisLabelFormatter = polloi::custom_axis_formatter) %>%
+dyRangeSelector(fillColor = "", strokeColor = "")
+})
diff --git a/server.R b/server.R
index b91bcf9..21d45a2 100644
--- a/server.R
+++ b/server.R
@@ -66,6 +66,7 @@
   source("modules/zero_results.R", local = TRUE)
   # Sister Search
   source("modules/sister_search/traffic.R", local = TRUE)
+  source("modules/sister_search/prevalence.R", local = TRUE)
   # Survival
   source("modules/page_visit_times.R", local = TRUE)
   # Language/Project Breakdown
diff --git a/tab_documentation/sister_search_prevalence.md 
b/tab_documentation/sister_search_prevalence.md
new file mode 100644
index 000..84b58fc
--- /dev/null
+++ b/tab_documentation/sister_search_prevalence.md
@@ -0,0 +1,26 @@
+Sister project search results prevalence
+===
+Sister project (cross-wiki) snippets is a feature that adds search results 
from sister projects of Wikipedia to a sidebar on the search engine results 
page (SERP). If a query results in matches from the sister projects, users will 
be shown snippets from Wiktionary, Wikisource, Wikiquote and/or other projects. 
See [T162276](https://phabricator.wikimedia.org/T162276) for more details.
+
+General trends
+-
+* English Wikipedia has the highest prevalence with 75% of searches showing 
sister project results on average, followed by Chinese (73%) and French (70%) 
Wikipedias.
+* 38% of languages show the sister project results in at least 50% of the 
searches made.
+
+Notes, outages, and inaccuracies
+-
+* English Wikipedia has a different display than all the other languages due 
to community feedback. Specifically, it does not show results from 
Commons/multimedia, Wikinews, and Wikiversity. Refer to 
[T162276#3278689](https://phabricator.wikimedia.org/T162276#3278689) for more 
details.
+* Languages without a lot of traffic also yield less (sampled) event logging 
data. In order to show 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Track sister search prevalence

2017-09-22 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/379834 )

Change subject: Track sister search prevalence
..


Track sister search prevalence

Number of searches that have sister project results vs
number of searches that do not, by language.

Change-Id: I413d37930d959a212fa8fd7c1dfb35898a5f793f
---
M CHANGELOG.md
M docs/README.md
M modules/metrics/search/config.yaml
A modules/metrics/search/sister_search_prevalence.sql
4 files changed, 49 insertions(+), 3 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8610ba9..f983f48 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,18 @@
 # Change Log (Patch Notes)
 All notable changes to this project will be documented in this file.
 
+## 2017/09/22
+- Added sister project search results prevalence
+
+## 2017/09/21
+- Added new datasets in search and portal 
([T172453](https://phabricator.wikimedia.org/T172453)):
+  - wikipedia portal pageview by device (desktop vs mobile)
+  - wikipedia portal clickthrough rate by device (desktop vs mobile)
+  - proportion of wikipedia portal visitors on mobile devices in US vs 
elsewhere
+  - pageviews from full-text search (desktop vs mobile)
+  - search return rate on desktop
+  - SERPs by access method
+
 ## 2017/08/29
 - Switched Hive queries to use the "nice" queue 
([T156841](https://phabricator.wikimedia.org/T156841)). See [this 
section](https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Queries#Run_long_queries_in_a_screen_session_and_in_the_nice_queue)
 for additional details.
 
diff --git a/docs/README.md b/docs/README.md
index 88bec45..e5cc336 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -8,7 +8,7 @@
 infrastructure. These datasets provide the metrics that are used by
 [Discovery's Dashboards](https://discovery.wmflabs.org/)
 
-Last updated on 14 September 2017
+Last updated on 22 September 2017
 
 Daily Metrics
 -
@@ -186,6 +186,8 @@
 Wikipedia search results pages; broken up by language, destination
 type (SERP vs not), and access method (desktop vs mobile web);
 excludes known automata
+-   **sister\_search\_prevalence.tsv**: Prevalence of sister search
+results on Wikipedia search result pages; broken up by language
 -   **srp\_survtime.tsv**: Estimates (via survival analysis) of how long
 Wikipedia searchers stay on full-text search results page after
 getting there from autocomplete search, split by English vs French
@@ -193,10 +195,10 @@
 -   **pageviews\_from\_fulltext\_search.tsv**: Number of searches,
 pageviews and users to articles from full-text search, broken down
 by access method (desktop vs mobile web) and agent type (user vs
-spider)
+spider).
 -   **search\_result\_pages.tsv**: Number of searches, search result
 pages and users, broken down by access method (desktop vs mobile
-web) and agent type (user vs spider)
+web) and agent type (user vs spider).
 -   **desktop\_return\_rate.tsv**: Number of searches with at least a
 click and the number of searches that return to the same search page
 after clickthrough; Number of sessions with at least a click and the
diff --git a/modules/metrics/search/config.yaml 
b/modules/metrics/search/config.yaml
index 00c4524..2181168 100644
--- a/modules/metrics/search/config.yaml
+++ b/modules/metrics/search/config.yaml
@@ -204,6 +204,12 @@
 starts: 2017-06-01
 funnel: true
 type: script
+sister_search_prevalence:
+description: Prevalence of sister search results on Wikipedia search 
result pages; broken up by language
+granularity: days
+starts: 2017-07-01
+funnel: true
+type: sql
 srp_survtime:
 description: Estimates (via survival analysis) of how long Wikipedia 
searchers stay on full-text search results page after getting there from 
autocomplete search, split by English vs French and Catalan vs other languages.
 granularity: days
diff --git a/modules/metrics/search/sister_search_prevalence.sql 
b/modules/metrics/search/sister_search_prevalence.sql
new file mode 100644
index 000..40ae915
--- /dev/null
+++ b/modules/metrics/search/sister_search_prevalence.sql
@@ -0,0 +1,26 @@
+SELECT
+  DATE('{from_timestamp}') AS date, wiki_id,
+  SUM(has_iw) AS has_sister_results,
+  SUM(IF(has_iw, 0, 1)) AS no_sister_results
+FROM (
+  SELECT DISTINCT
+wiki_id, session_id, query_hash, has_iw
+  FROM (
+SELECT DISTINCT
+  wiki AS wiki_id,
+  event_uniqueId AS event_id,
+  event_searchSessionId AS session_id,
+  MD5(LOWER(TRIM(event_query))) AS query_hash,
+  INSTR(event_extraParams, '"iw":') > 0 AS has_iw -- sister project 
results shown
+FROM TestSearchSatisfaction2_16909631
+WHERE timestamp >= '{from_timestamp}' AND timestamp < '{to_timestamp}'
+  AND event_source = 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Fix maplink/mapframe query

2017-09-14 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/377807 )

Change subject: Fix maplink/mapframe query
..


Fix maplink/mapframe query

Bug: T170022
Change-Id: I1d70b09a54b47002a948f29f21e1ad843b87af55
---
M modules/metrics/maps/config.yaml
M modules/metrics/maps/prevalence.R
M modules/metrics/maps/prevalence.yaml
3 files changed, 10 insertions(+), 4 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/modules/metrics/maps/config.yaml b/modules/metrics/maps/config.yaml
index f74d21a..f5b3b46 100644
--- a/modules/metrics/maps/config.yaml
+++ b/modules/metrics/maps/config.yaml
@@ -40,12 +40,12 @@
 mapframe_prevalence:
 description: Proportion of articles on a wiki that have a mapframe
 granularity: days
-starts: 2017-09-01 # this will need to be set to when patch goes live, 
we can't backfill this data
+starts: 2017-09-14 # this will need to be set to when patch goes live, 
we can't backfill this data
 funnel: true
 type: script
 maplink_prevalence:
 description: Proportion of articles on a wiki that have a maplink
 granularity: days
-starts: 2017-09-01 # this will need to be set to when patch goes live, 
we can't backfill this data
+starts: 2017-09-14 # this will need to be set to when patch goes live, 
we can't backfill this data
 funnel: true
 type: script
diff --git a/modules/metrics/maps/prevalence.R 
b/modules/metrics/maps/prevalence.R
index 2d23ad9..c9fd596 100644
--- a/modules/metrics/maps/prevalence.R
+++ b/modules/metrics/maps/prevalence.R
@@ -35,14 +35,19 @@
   SUM(COALESCE({type}s, 0)) AS total_{type}s
 FROM (
   SELECT
-page.page_id,
+p.page_id,
 pp_value AS {type}s
   FROM (
 SELECT pp_page, pp_value
 FROM page_props
 WHERE pp_propname = '{prop_name}' AND pp_value > 0
   ) AS filtered_props
-  RIGHT JOIN page ON page.page_id = filtered_props.pp_page AND 
page.page_namespace = {ns}
+  RIGHT JOIN (
+SELECT page_id
+FROM page
+WHERE page_namespace = {ns} AND page_is_redirect = 0
+  ) p
+  ON p.page_id = filtered_props.pp_page
 ) joined_tables;")
   return(query)
 }
diff --git a/modules/metrics/maps/prevalence.yaml 
b/modules/metrics/maps/prevalence.yaml
index c67d65a..f66d14e 100644
--- a/modules/metrics/maps/prevalence.yaml
+++ b/modules/metrics/maps/prevalence.yaml
@@ -16,6 +16,7 @@
 - mediawikiwiki
 - metawiki
 - commonswiki
+- uawikimedia # as of August 2017
   wikivoyages: # enabled for all *except* the following:
 - hewikivoyage # https://phabricator.wikimedia.org/T170976#3471701
 maplink:

-- 
To view, visit https://gerrit.wikimedia.org/r/377807

Gerrit-MessageType: merged
Gerrit-Change-Id: I1d70b09a54b47002a948f29f21e1ad843b87af55
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Bearloga 
Gerrit-Reviewer: Chelsyx 
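
The join in the fixed `prevalence.R` query can be illustrated with base R's `merge()`. This is a sketch on invented toy tables, not the production query: the property name `kartographer_frames` and namespace `0` are assumed stand-ins for the `{prop_name}` and `{ns}` placeholders, and `all.y = TRUE` plays the role of the `RIGHT JOIN`. The point it shows is that pages are filtered to the target namespace and to non-redirects before the join, so only those pages appear in the joined result.

```r
# Toy stand-ins (invented data) for the MediaWiki `page` and `page_props` tables.
page <- data.frame(
  page_id = 1:5,
  page_namespace = c(0, 0, 0, 1, 0),
  page_is_redirect = c(0, 0, 1, 0, 0)
)
page_props <- data.frame(
  pp_page = c(1, 3, 4),
  pp_propname = "kartographer_frames", # assumed value for {prop_name}
  pp_value = c(2, 1, 3)
)

# Inner subquery: keep only the property of interest with a positive value.
filtered_props <- subset(
  page_props,
  pp_propname == "kartographer_frames" & pp_value > 0,
  select = c(pp_page, pp_value)
)

# New subquery `p`: pages in the target namespace that are not redirects.
p <- subset(page, page_namespace == 0 & page_is_redirect == 0, select = page_id)

# RIGHT JOIN equivalent: keep every row of `p`, matched props or not.
joined <- merge(filtered_props, p, by.x = "pp_page", by.y = "page_id", all.y = TRUE)

# SUM(COALESCE(value, 0)) equivalent: pages without the property count as 0.
total_mapframes <- sum(ifelse(is.na(joined$pp_value), 0, joined$pp_value))
n_pages <- nrow(joined)
```

In this toy data the redirect (page 3) and the page outside namespace 0 (page 4) drop out before the join, so neither their rows nor their property values contribute to the result.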



[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Interpretation and general findings for API dashboards

2017-09-14 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/378067 )

Change subject: Interpretation and general findings for API dashboards
..

Interpretation and general findings for API dashboards

Bug: T172452
Change-Id: If97bb9cd23ae93117d106012d69b8f6250a19ce9
---
M modules/api.R
M modules/key_performance_metrics/api_usage.R
M tab_documentation/fulltext_basic.md
M tab_documentation/geo_basic.md
M tab_documentation/kpi_api_usage.md
M tab_documentation/language_basic.md
M tab_documentation/morelike_basic.md
M tab_documentation/open_basic.md
M tab_documentation/prefix_basic.md
M tab_documentation/referer_breakdown.md
M ui.R
11 files changed, 322 insertions(+), 105 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow 
refs/changes/67/378067/1

diff --git a/modules/api.R b/modules/api.R
index 790b29e..6cae3ad 100644
--- a/modules/api.R
+++ b/modules/api.R
@@ -1,9 +1,22 @@
 output$cirrus_aggregate <- renderDygraph({
-  split_dataset$`full-text via API` %>%
+  temp <- split_dataset$`full-text via API` %>%
 tidyr::spread(referrer, calls) %>%
-polloi::reorder_columns() %>%
+polloi::reorder_columns()
+  if (input$fulltext_search_prop) {
+temp <- cbind(temp[, "date"], purrr::map_df(temp[, -c(1, 2)], function(x) round(100 * x / temp$All, 2))) %>%
+  dplyr::filter(date >= "2017-06-29")
+  }
+  temp %>%
 polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_fulltext_search)) %>%
-polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Full-text search API usage by referrer", legend_name = "Searches") %>%
+polloi::make_dygraph(xlab = "Date",
+ ylab = dplyr::case_when(
+   input$fulltext_search_prop ~ "API Calls Share (%)",
+   input$fulltext_search_log_scale ~ "Calls (log10 scale)",
+   TRUE ~ "API Calls"
+ ),
+ title = "Daily Full-text search via API usage by referrer",
+ legend_name = "API Calls",
+ logscale = input$fulltext_search_log_scale) %>%
 dyLegend(labelsDiv = "cirrus_aggregate_legend", width = 600) %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>%
@@ -11,21 +24,47 @@
 })
 
 output$morelike_aggregate <- renderDygraph({
-  split_dataset$`morelike via API` %>%
+  temp <- split_dataset$`morelike via API` %>%
 tidyr::spread(referrer, calls) %>%
-polloi::reorder_columns() %>%
+polloi::reorder_columns()
+  if (input$morelike_search_prop) {
+temp <- cbind(temp[, "date"], purrr::map_df(temp[, -c(1, 2)], function(x) round(100 * x / temp$All, 2))) %>%
+  dplyr::filter(date >= "2017-06-29")
+  }
+  temp %>%
 polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_morelike_search)) %>%
-polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Morelike search API usage by referrer", legend_name = "Searches") %>%
+polloi::make_dygraph(xlab = "Date",
+ ylab = dplyr::case_when(
+   input$morelike_search_prop ~ "API Calls Share (%)",
+   input$morelike_search_log_scale ~ "Calls (log10 scale)",
+   TRUE ~ "API Calls"
+ ),
+ title = "Daily Morelike search API usage by referrer",
+ legend_name = "API Calls",
+ logscale = input$morelike_search_log_scale) %>%
 dyLegend(labelsDiv = "morelike_aggregate_legend", width = 600) %>%
 dyRangeSelector
 })
 
 output$open_aggregate <- renderDygraph({
-  split_dataset$open %>%
+  temp <- split_dataset$open %>%
 tidyr::spread(referrer, calls) %>%
-polloi::reorder_columns() %>%
+polloi::reorder_columns()
+  if (input$open_search_prop) {
+temp <- cbind(temp[, "date"], purrr::map_df(temp[, -c(1, 2)], function(x) round(100 * x / temp$All, 2))) %>%
+  dplyr::filter(date >= "2017-06-29")
+  }
+  temp %>%
 polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_open_search)) %>%
-polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily OpenSearch API usage by referrer", legend_name = "Searches") %>%
+polloi::make_dygraph(xlab = "Date",
+ ylab = dplyr::case_when(
+   input$open_search_prop ~ "API Calls Share (%)",
+   input$open_search_log_scale ~ "Calls (log10 scale)",
+   TRUE ~ "API Calls"
+ ),
+ title = "Daily OpenSearch API usage by referrer",
+ legend_name = "API Calls",
+ 
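
Note on the "share" toggle in the change above: when the *_prop input is set, each referrer column is divided by the All column and expressed as a percentage. The following is a minimal base-R sketch of that arithmetic only, not the dashboard code itself. The data frame, its column names other than date and All, and the helper name to_share are made up for illustration, and lapply() stands in for purrr::map_df() so that the example has no package dependencies.

```r
# Hypothetical wide data: one row per day, an "All" total, then one
# column per referrer class (the referrer names here are invented).
temp <- data.frame(
  date = as.Date(c("2017-06-29", "2017-06-30")),
  All = c(200, 400),
  Internal = c(150, 100),
  External = c(50, 300)
)

# Convert every column after date and All into a percentage of All,
# rounded to two decimals, and keep the date column alongside.
to_share <- function(df) {
  shares <- lapply(df[, -c(1, 2), drop = FALSE], function(x) {
    round(100 * x / df$All, 2)
  })
  cbind(df[, "date", drop = FALSE], as.data.frame(shares))
}

shares <- to_share(temp)
print(shares)
```

The All column is dropped from the result because it would always be 100; the diff does the same by excluding columns 1 and 2 before mapping.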

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Add new datasets in search and portal

2017-09-01 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/375408 )

Change subject: Add new datasets in search and portal
..

Add new datasets in search and portal

* wikipedia portal pageview by device (desktop vs mobile)
* wikipedia portal clickthrough rate by device (desktop vs mobile)
* proportion of wikipedia portal visitors on mobile devices in US vs elsewhere
* pageviews from full-text search (desktop vs mobile)
* search return rate on desktop
* SERPs by access method

Bug: T172453
Change-Id: I4615f4070ced26ce886b49be7393115953320cfe
---
M docs/README.md
A modules/metrics/portal/clickthrough_by_device
M modules/metrics/portal/config.yaml
M modules/metrics/portal/engagement.R
A modules/metrics/portal/mobile_use_us_elsewhere
A modules/metrics/portal/pageviews_by_device
M modules/metrics/search/config.yaml
A modules/metrics/search/desktop_return_rate
A modules/metrics/search/desktop_return_rate.R
A modules/metrics/search/pageviews_from_fulltext_search
A modules/metrics/search/search_result_pages
11 files changed, 378 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/08/375408/1

diff --git a/docs/README.md b/docs/README.md
index 1b2abe6..bcd72e1 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -8,7 +8,7 @@
 infrastructure. These datasets provide the metrics that are used by
 [Discovery's Dashboards](https://discovery.wmflabs.org/)
 
-Last updated on 30 August 2017
+Last updated on 01 September 2017
 
 Daily Metrics
 -
@@ -57,14 +57,24 @@
 top 10 countries
 -   **all\_country\_data.tsv**: Sampled traffic to Wikipedia.org Portal,
 broken down by country
+-   **all\_country\_data\_history.tsv**: Sampled traffic to
+Wikipedia.org Portal, broken down by country. Historical data store.
 -   **app\_link\_clicks.tsv**: Clicks to Wikipedia mobile apps and list
 of apps
 -   **last\_action\_country.tsv**: Last action performed on
 Wikipedia.org Portal per user session
+-   **last\_action\_country\_history.tsv**: Last action performed on
+Wikipedia.org Portal per user session. Historical data store.
 -   **most\_common\_country.tsv**: Most common action performed on
 Wikipedia.org Portal per user session, broken down by country
+-   **most\_common\_country\_history.tsv**: Most common action performed
+on Wikipedia.org Portal per user session, broken down by country.
+Historical data store.
 -   **first\_visits\_country.tsv**: Action performed on Wikipedia.org
 Portal on each user's initial visit, broken down by country
+-   **first\_visits\_country\_history.tsv**: Action performed on
+Wikipedia.org Portal on each user's initial visit, broken down by
+country. Historical data store.
 -   **clickthrough\_rate.tsv**: Last action (no action vs clickthrough)
 by Wikipedia.org Portal visitors
 -   **clickthrough\_sisterprojects.tsv**: Clicks to Wikimedia projects
@@ -76,6 +86,12 @@
 section
 -   **most\_common\_per\_visit.tsv**: Most common action performed on
 Wikipedia.org Portal per user session
+-   **pageviews\_by\_device.tsv**: Pageviews broken down by device
+(desktop vs mobile)
+-   **clickthrough\_by\_device.tsv**: Clickthroughs from Wikipedia.org
+Portal, broken down by device (desktop vs mobile)
+-   **mobile\_use\_us\_elsewhere.tsv**: Number of Wikipedia.org Portal
+visitors on mobile devices in U.S. vs everywhere else
 
 search/
 ---
@@ -85,6 +101,9 @@
 -   **app\_event\_counts\_langproj\_breakdown.tsv**: Clicks and other
 events by users searching on Android and iOS apps broken down by
 language
+-   **app\_event\_counts\_langproj\_breakdown\_history.tsv**: Clicks and
+other events by users searching on Android and iOS apps broken down
+by language. Historical data store.
 -   **app\_load\_times.tsv**: User-perceived load times when searching
 on Android and iOS apps
 -   **invoke\_source\_counts.tsv**: How the user initiated their search
@@ -96,6 +115,9 @@
 -   **mobile\_event\_counts\_langproj\_breakdown.tsv**: Clicks and other
 events by users searching on mobile web broken down by
 language-project pairs
+-   **mobile\_event\_counts\_langproj\_breakdown\_history.tsv**: Clicks
+and other events by users searching on mobile web broken down by
+language-project pairs. Historical data store.
 -   **mobile\_load\_times.tsv**: User-perceived load times when
 searching on mobile web
 -   **desktop\_event\_counts.tsv**: Clicks and other events by users
@@ -103,6 +125,9 @@
 -   **desktop\_event\_counts\_langproj\_breakdown.tsv**: Clicks and
 other events by users searching on desktop broken down by
 language-project pairs
+-   **desktop\_event\_counts\_langproj\_breakdown\_history.tsv**: Clicks
+and other events by users searching on desktop broken down by
+language-project pairs. Historical data store.
 -   

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: SRP visit times label fixes

2017-08-31 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/375091 )

Change subject: SRP visit times label fixes
..


SRP visit times label fixes

Also added data checks & fixed a bug introduced with a new version
of tidyr (at least I think that's how the issue started)

Change-Id: Ia3f4e6b030858b382c0a7c336d6759d022ebf14e
---
M modules/page_visit_times.R
M server.R
M tab_documentation/survival.md
M ui.R
M utils.R
5 files changed, 56 insertions(+), 38 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/modules/page_visit_times.R b/modules/page_visit_times.R
index 1321dd6..df1fbe9 100644
--- a/modules/page_visit_times.R
+++ b/modules/page_visit_times.R
@@ -22,7 +22,7 @@
 tidyr::spread(label, time) %>%
 polloi::reorder_columns() %>%
 polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_srp_ld_plot), rename = FALSE) %>%
-polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at N% users leave the search results page") %>%
+polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at which N% users leave the search results page") %>%
 dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = polloi::custom_axis_formatter,
axisLabelWidth = 100, pixelsPerLabel = 80) %>%
 dyRoller(rollPeriod = input$rolling_srp_ld_plot) %>%
diff --git a/server.R b/server.R
index 752f5ba..b91bcf9 100644
--- a/server.R
+++ b/server.R
@@ -80,18 +80,28 @@
   polloi::check_past_week(mobile_load_data, "Mobile Web load times"),
   polloi::check_yesterday(android_dygraph_set, "Android events"),
   polloi::check_past_week(android_load_data, "Android load times"),
+  polloi::check_yesterday(position_prop, "clicked result positions"),
+  polloi::check_past_week(position_prop, "clicked result positions"),
+  polloi::check_yesterday(source_prop, "source of search on Android"),
+  polloi::check_past_week(source_prop, "source of search on Android"),
   polloi::check_yesterday(ios_dygraph_set, "iOS events"),
   polloi::check_past_week(ios_load_data, "iOS load times"),
-  polloi::check_yesterday(dplyr::bind_rows(split_dataset), "API usage data"),
-  polloi::check_past_week(dplyr::bind_rows(split_dataset), "API usage data"),
+  polloi::check_yesterday(dplyr::bind_rows(split_dataset, .id = "api"), "API usage data"),
+  polloi::check_past_week(dplyr::bind_rows(split_dataset, .id = "api"), "API usage data"),
   polloi::check_yesterday(failure_data_with_automata, "zero results data"),
   polloi::check_past_week(failure_data_with_automata, "zero results data"),
   polloi::check_yesterday(suggestion_with_automata, "suggestions data"),
   polloi::check_past_week(suggestion_with_automata, "suggestions data"),
   polloi::check_yesterday(augmented_clickthroughs, "engagement % data"),
   polloi::check_past_week(augmented_clickthroughs, "engagement % data"),
-  polloi::check_yesterday(user_page_visit_dataset, "survival times"),
-  polloi::check_past_week(user_page_visit_dataset, "survival times"))
+  polloi::check_yesterday(paulscore_fulltext, "full-text PaulScores"),
+  polloi::check_past_week(paulscore_fulltext, "full-text PaulScores"),
+  polloi::check_yesterday(sister_search_traffic, "sister search traffic"),
+  polloi::check_past_week(sister_search_traffic, "sister search traffic"),
+  polloi::check_yesterday(user_page_visit_dataset, "page survival times"),
+  polloi::check_past_week(user_page_visit_dataset, "page survival times"),
+  polloi::check_yesterday(serp_page_visit_dataset, "serp survival times"),
+  polloi::check_past_week(serp_page_visit_dataset, "serp survival times"))
 notifications <- notifications[!vapply(notifications, is.null, FALSE)]
 return(dropdownMenu(type = "notifications", .list = notifications))
   })
diff --git a/tab_documentation/survival.md b/tab_documentation/survival.md
index e066ad5..ae7ab59 100644
--- a/tab_documentation/survival.md
+++ b/tab_documentation/survival.md
@@ -1,15 +1,15 @@
-Automated survival analysis: page visit times
+How long searchers stay on the visited search results
 ===
 
 When someone is randomly selected for search satisfaction tracking (using our [TSS2 schema](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2)), we use a check-in system and survival analysis to estimate how long users stay on visited pages. To summarize the results on a daily basis, we record a set of statistics based on a measure formally known as "[median lethal dose](https://en.wikipedia.org/wiki/Median_lethal_dose)".
 
-This graph shows the length of time that must pass before N% of the users leave the page they visited. When the number goes up, we can infer that users are staying on the pages longer. In general, it appears it takes 15s to

[MediaWiki-commits] [Gerrit] wikimedia...wmf[master]: Add function 'null2na'

2017-08-31 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/375078 )

Change subject: Add function 'null2na'
..

Add function 'null2na'

Change-Id: Id149be60ae41c9f09d81b91aaf6a1ee9ecc8db3b
---
M DESCRIPTION
M NAMESPACE
M R/wmf.R
M man/FiveThirtyNine.Rd
M man/build_query.Rd
M man/date_clause.Rd
M man/get_logfile.Rd
M man/global_query.Rd
M man/mysql.Rd
A man/null2na.Rd
M man/query_hive.Rd
M man/read_sampled_log.Rd
M man/rewrite_conditional.Rd
M man/sample_size_effect.Rd
M man/sample_size_odds.Rd
M man/set_proxies.Rd
M man/timeconverters.Rd
M man/write_conditional.Rd
18 files changed, 49 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/wmf refs/changes/78/375078/1

diff --git a/DESCRIPTION b/DESCRIPTION
index f739244..b8fe3d1 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -21,4 +21,4 @@
 projects=Discovery-Analysis
 Suggests:
 testthat
-RoxygenNote: 5.0.1
+RoxygenNote: 6.0.1
diff --git a/NAMESPACE b/NAMESPACE
index 4a137c4..6a737d2 100644
--- a/NAMESPACE
+++ b/NAMESPACE
@@ -12,6 +12,7 @@
 export(mysql_exists)
 export(mysql_read)
 export(mysql_write)
+export(null2na)
 export(query_hive)
 export(read_sampled_log)
 export(rewrite_conditional)
diff --git a/R/wmf.R b/R/wmf.R
index e69de29..91f3e7e 100644
--- a/R/wmf.R
+++ b/R/wmf.R
@@ -0,0 +1,18 @@
+#'@title Turns Null Into Character "NA"
+#'@description The function turns NULL in a list into character "NA".
+#'
+#'@param x A list
+#'
+#'@return If any element from the input list is NULL, they will be turned into character
+#'  "NA". Otherwise, return the original list.
+#'
+#'@export
+null2na <- function(x) {
+  return(lapply(x, function(y) {
+if (is.null(y)) {
+  return(as.character(NA))
+} else {
+  return(y)
+}
+  }))
+}
diff --git a/man/FiveThirtyNine.Rd b/man/FiveThirtyNine.Rd
index fb66530..7b129ce 100644
--- a/man/FiveThirtyNine.Rd
+++ b/man/FiveThirtyNine.Rd
@@ -20,4 +20,3 @@
  allow for long titles) back in and does a small amount of reduction of the
  overall plot size to avoid an absolute ton of extraneous spacing.
 }
-
diff --git a/man/build_query.Rd b/man/build_query.Rd
index 8f73d7a..649388b 100644
--- a/man/build_query.Rd
+++ b/man/build_query.Rd
@@ -19,4 +19,3 @@
 constructs a MySQL query with a conditional around date.
 This is aimed at eventlogging, where the date/time is always "timestamp".
 }
-
diff --git a/man/date_clause.Rd b/man/date_clause.Rd
index b2736b6..59a2faf 100644
--- a/man/date_clause.Rd
+++ b/man/date_clause.Rd
@@ -17,4 +17,3 @@
 what it says on the tin; generates a "WHERE year = foo AND month = bar" using lubridate
 that can then be combined with other elements to form a Hive query.
 }
-
diff --git a/man/get_logfile.Rd b/man/get_logfile.Rd
index 00e28d9..f8f88a2 100644
--- a/man/get_logfile.Rd
+++ b/man/get_logfile.Rd
@@ -24,4 +24,3 @@
 sampled log files; it can be used to retrieve a particular date range of
 files through the "earliest" and "latest" arguments.
 }
-
diff --git a/man/global_query.Rd b/man/global_query.Rd
index f07525c..277b903 100644
--- a/man/global_query.Rd
+++ b/man/global_query.Rd
@@ -15,12 +15,11 @@
 \code{global_query} is a simple wrapper around the mysql queries that allows a useR to send a query to all production
 dbs on analytics-store.eqiad.wmnet, joining the results from each query into a single object.
 }
-\author{
-Oliver Keyes 
-}
 \seealso{
 \code{\link{mysql_read}} for querying an individual db, \code{\link{mw_strptime}}
 for converting MediaWiki timestamps into POSIXlt timestamps, or \code{\link{hive_query}} for
 accessing the Hive datastore.
 }
-
+\author{
+Oliver Keyes 
+}
diff --git a/man/mysql.Rd b/man/mysql.Rd
index de36b60..a7913dc 100644
--- a/man/mysql.Rd
+++ b/man/mysql.Rd
@@ -2,12 +2,12 @@
 % Please edit documentation in R/mysql.R
 \name{mysql}
 \alias{mysql}
-\alias{mysql_close}
 \alias{mysql_connect}
-\alias{mysql_disconnect}
-\alias{mysql_exists}
 \alias{mysql_read}
+\alias{mysql_exists}
 \alias{mysql_write}
+\alias{mysql_close}
+\alias{mysql_disconnect}
 \title{Work with MySQL databases}
 \usage{
 mysql_connect(database, default_file = NULL)
@@ -43,4 +43,3 @@
 \seealso{
 \code{\link{hive_query}} or \code{\link{global_query}}
 }
-
diff --git a/man/null2na.Rd b/man/null2na.Rd
new file mode 100644
index 000..9584d89
--- /dev/null
+++ b/man/null2na.Rd
@@ -0,0 +1,18 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/wmf.R
+\name{null2na}
+\alias{null2na}
+\title{Turns Null Into Character "NA"}
+\usage{
+null2na(x)
+}
+\arguments{
+\item{x}{A list}
+}
+\value{
+If any element from the input list is NULL, they will be turned into character
+ "NA". Otherwise, return the original list.
+}
+\description{
+The function turns NULL in a list into character "NA".
+}
diff --git a/man/query_hive.Rd b/man/query_hive.Rd
index 8483052..fb7886f 100644
--- 
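
Note on null2na, added in the change above: a short usage sketch follows, with the function body condensed from the R/wmf.R hunk so that the example runs on its own. The event list and its field names are hypothetical. One point worth knowing is that as.character(NA) yields a missing character value (NA_character_), not the two-letter string "NA", so is.na() is TRUE for the replaced elements.

```r
# Condensed copy of the function from the diff above, so the sketch is
# self-contained; behaviour is intended to match the original.
null2na <- function(x) {
  lapply(x, function(y) if (is.null(y)) as.character(NA) else y)
}

# Hypothetical parsed event in which one field came back as NULL.
event <- list(wiki = "enwiki", query = NULL, hits = 3L)
cleaned <- null2na(event)

# unlist() drops NULL entries, so the raw list loses a field, while the
# cleaned list keeps one slot per field.
n_raw <- length(unlist(event))
n_cleaned <- length(unlist(cleaned))
print(c(raw = n_raw, cleaned = n_cleaned))
print(is.na(cleaned$query))
```

This is presumably why such a helper is handy before flattening nested lists into data frames, although the commit message itself does not state the motivation.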

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Fix legend positions and rename type of API calls

2017-08-31 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/375074 )

Change subject: Fix legend positions and rename type of API calls
..

Fix legend positions and rename type of API calls

Bug: T172452
Change-Id: Ie03a33551afe50df10f33bd8f6c35095097b91c8
---
M modules/api.R
M modules/key_performance_metrics/api_usage.R
M ui.R
M utils.R
4 files changed, 26 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/74/375074/1

diff --git a/modules/api.R b/modules/api.R
index 495065d..790b29e 100644
--- a/modules/api.R
+++ b/modules/api.R
@@ -1,22 +1,22 @@
 output$cirrus_aggregate <- renderDygraph({
-  split_dataset$cirrus %>%
+  split_dataset$`full-text via API` %>%
 tidyr::spread(referrer, calls) %>%
 polloi::reorder_columns() %>%
 polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_fulltext_search)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Full-text search API usage by referrer", legend_name = "Searches") %>%
-dyLegend(width = 1000, show = "always") %>%
+dyLegend(labelsDiv = "cirrus_aggregate_legend", width = 600) %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>%
 dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")
 })
 
 output$morelike_aggregate <- renderDygraph({
-  split_dataset$`cirrus (more like)` %>%
+  split_dataset$`morelike via API` %>%
 tidyr::spread(referrer, calls) %>%
 polloi::reorder_columns() %>%
 polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_morelike_search)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Morelike search API usage by referrer", legend_name = "Searches") %>%
-dyLegend(width = 1000, show = "always") %>%
+dyLegend(labelsDiv = "morelike_aggregate_legend", width = 600) %>%
 dyRangeSelector
 })
 
@@ -26,7 +26,7 @@
 polloi::reorder_columns() %>%
 polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_open_search)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily OpenSearch API usage by referrer", legend_name = "Searches") %>%
-dyLegend(width = 1000, show = "always") %>%
+dyLegend(labelsDiv = "open_aggregate_legend", width = 600) %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>%
 dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")
@@ -38,7 +38,7 @@
 polloi::reorder_columns() %>%
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_geo_search)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Geo 
Search API usage by referrer", legend_name = "Searches") %>%
-dyLegend(width = 1000, show = "always") %>%
+dyLegend(labelsDiv = "geo_aggregate_legend", width = 600) %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
 dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")
@@ -50,7 +50,7 @@
 polloi::reorder_columns() %>%
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_language_search)) 
%>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily 
Language search API usage by referrer", legend_name = "Searches") %>%
-dyLegend(width = 1000, show = "always") %>%
+dyLegend(labelsDiv = "language_aggregate_legend", width = 600) %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
 dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")
@@ -62,7 +62,7 @@
 polloi::reorder_columns() %>%
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_prefix_search)) 
%>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily 
Prefix search API usage by referrer", legend_name = "Searches") %>%
-dyLegend(width = 1000, show = "always") %>%
+dyLegend(labelsDiv = "prefix_aggregate_legend", width = 600) %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
 dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")
@@ -84,6 +84,6 @@
 polloi::make_dygraph(xlab = "Date",
  ylab = ifelse(input$referer_breakdown_prop, "API Calls Share (%)", "API Calls"),
  title = "Daily API usage by referrer", legend_name = "API Calls") %>%
-dyLegend(width = 1000, show = "always") %>%
+dyLegend(labelsDiv = "referer_breakdown_plot_legend", width = 600) %>%
 dyRangeSelector
 })
diff --git 

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: SRP visit times additional fixes

2017-08-31 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/375057 )

Change subject: SRP visit times additional fixes
..


SRP visit times additional fixes

Bug: T170468
Change-Id: I8758a3559e8ca6ad5713afec171bbdeca29f4dc3
---
M modules/page_visit_times.R
M tab_documentation/srp_surv.md
M ui.R
3 files changed, 25 insertions(+), 18 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/modules/page_visit_times.R b/modules/page_visit_times.R
index d312fcc..1321dd6 100644
--- a/modules/page_visit_times.R
+++ b/modules/page_visit_times.R
@@ -1,12 +1,13 @@
 output$lethal_dose_plot <- renderDygraph({
   req(length(input$filter_lethal_dose_plot) > 0)
   user_page_visit_dataset[, c("date", input$filter_lethal_dose_plot)] %>%
-polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_lethal_dose_plot)) %>%
+polloi::reorder_columns() %>%
+polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_lethal_dose_plot), rename = FALSE) %>%
 polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at which N% users leave the visited page") %>%
 dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = polloi::custom_axis_formatter,
axisLabelWidth = 100, pixelsPerLabel = 80) %>%
 dyRoller(rollPeriod = input$rolling_lethal_dose_plot) %>%
-dyLegend(labelsDiv = "lethal_dose_plot_legend") %>%
+dyLegend(labelsDiv = "lethal_dose_plot_legend", width = 600) %>%
 dyRangeSelector(fillColor = "", strokeColor = "") %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>%
 dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
@@ -17,17 +18,15 @@
   serp_page_visit_dataset[, c("date", "language", input$filter_srp_ld_plot)] %>%
 tidyr::gather(LD, time, -c(date, language)) %>%
 dplyr::filter(language %in% input$language_srp_ld_plot) %>%
-dplyr::transmute(
-  date = date, time = time,
-  label = paste0(LD, " (", language, ")")
-) %>%
+dplyr::transmute(date = date, time = time, label = paste0(LD, " (", language, ")")) %>%
 tidyr::spread(label, time) %>%
-polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_srp_ld_plot)) %>%
+polloi::reorder_columns() %>%
+polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_srp_ld_plot), rename = FALSE) %>%
 polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at N% users leave the search results page") %>%
 dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = polloi::custom_axis_formatter,
axisLabelWidth = 100, pixelsPerLabel = 80) %>%
 dyRoller(rollPeriod = input$rolling_srp_ld_plot) %>%
-dyLegend(labelsDiv = "srp_ld_plot_legend") %>%
+dyLegend(labelsDiv = "srp_ld_plot_legend", width = 600) %>%
 dyRangeSelector(fillColor = "", strokeColor = "") %>%
 dyEvent(as.Date("2017-04-25"), "S (sampling rates)", labelLoc = "bottom") %>%
 dyEvent(as.Date("2017-06-15"), "SS (sister search)", labelLoc = "bottom")
diff --git a/tab_documentation/srp_surv.md b/tab_documentation/srp_surv.md
index 254818f..c84fde7 100644
--- a/tab_documentation/srp_surv.md
+++ b/tab_documentation/srp_surv.md
@@ -1,11 +1,19 @@
 How long Wikipedia searchers stay on the search result pages
 ===
 
-When someone is randomly selected for search satisfaction tracking (using our [TSS2 schema](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2)), we use a check-in system and survival analysis to estimate how long users stay on visited pages. When a Wikipedia visitor searches using autocomplete and ends up on a full-text search results page (SRP), we can track how long that page is "alive" before the user either closes the tab, clicks on a result, or navigates elsewhere.
+When someone is randomly selected for search satisfaction tracking (using our [TSS2 schema](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2)), we use a check-in system and survival analysis to estimate how long users stay on visited pages. When a Wikipedia visitor searches using autocomplete and ends up on a **full-text _search results page_** (SRP), we can track how long that page is "alive" before the user either closes the tab, clicks on a result, or navigates elsewhere.
 
 To summarize the results on a daily basis, we record a set of statistics based on a measure formally known as "[median lethal dose](https://en.wikipedia.org/wiki/Median_lethal_dose)". This graph shows the length of time that must pass before N% of the users leave the search results page. When the number goes up, we can infer that users are staying on the pages longer.
 
-Outages and inaccuracies
+Notes
+-
+These summary statistics are the same between 
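
Note on the "time at which N% of users leave" statistic described in srp_surv.md above: the sketch below shows the idea in its simplest form, as the N-th percentile of dwell times. It assumes every session's leave time is observed exactly, which is a simplification; the documentation says the real pipeline uses a check-in system and survival analysis, and this sketch does not reproduce that. The dwell times and the helper name time_at_percent_left are invented for illustration.

```r
# Hypothetical dwell times in seconds, one per sampled session. Every
# session is assumed fully observed (no censoring), unlike production.
dwell <- c(5, 10, 15, 20, 25, 30, 40, 50, 60, 120)

# Time by which N% of sessions have ended: the N-th percentile of the
# dwell times. type = 1 uses the inverse empirical CDF, so the result
# is always one of the observed values.
time_at_percent_left <- function(times, percents) {
  stats::quantile(times, probs = percents / 100, type = 1, names = FALSE)
}

ld <- time_at_percent_left(dwell, c(10, 25, 50, 75, 90))
print(ld)
```

If these values rise from one day to the next, sessions are lasting longer, which matches the reading given in the documentation text.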

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Order legends according to the last observed values

2017-08-30 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/374924 )

Change subject: Order legends according to the last observed values
..

Order legends according to the last observed values

Bug: T172452
Change-Id: I1552b8f5adf8dde941b567154b08f9d9c674eb5d
---
M modules/api.R
M modules/key_performance_metrics/api_usage.R
2 files changed, 49 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/24/374924/1

diff --git a/modules/api.R b/modules/api.R
index 5fd6cd1..6ec9d1d 100644
--- a/modules/api.R
+++ b/modules/api.R
@@ -1,6 +1,11 @@
 output$cirrus_aggregate <- renderDygraph({
   split_dataset$cirrus %>%
 tidyr::spread(referrer, calls) %>%
+{
+  # Reorder columns according to the last observed values:
+  cols <- unlist(polloi::safe_tail(., 1)[, -1])
+  .[, c(1, order(cols, decreasing = TRUE) + 1)]
+} %>%
 polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_fulltext_search)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Full-text search API usage by referrer", legend_name = "Searches") %>%
 dyLegend(width = 1000, show = "always") %>%
@@ -12,6 +17,11 @@
 output$morelike_aggregate <- renderDygraph({
   split_dataset$`cirrus (more like)` %>%
 tidyr::spread(referrer, calls) %>%
+{
+  # Reorder columns according to the last observed values:
+  cols <- unlist(polloi::safe_tail(., 1)[, -1])
+  .[, c(1, order(cols, decreasing = TRUE) + 1)]
+} %>%
 polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_morelike_search)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Morelike search API usage by referrer", legend_name = "Searches") %>%
 dyLegend(width = 1000, show = "always") %>%
@@ -21,6 +31,11 @@
 output$open_aggregate <- renderDygraph({
   split_dataset$open %>%
 tidyr::spread(referrer, calls) %>%
+{
+  # Reorder columns according to the last observed values:
+  cols <- unlist(polloi::safe_tail(., 1)[, -1])
+  .[, c(1, order(cols, decreasing = TRUE) + 1)]
+} %>%
 polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_open_search)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily OpenSearch API usage by referrer", legend_name = "Searches") %>%
 dyLegend(width = 1000, show = "always") %>%
@@ -31,7 +46,13 @@
 
 output$geo_aggregate <- renderDygraph({
   split_dataset$geo %>%
-tidyr::spread(referrer, calls) %>%polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_geo_search)) %>%
+tidyr::spread(referrer, calls) %>%
+{
+  # Reorder columns according to the last observed values:
+  cols <- unlist(polloi::safe_tail(., 1)[, -1])
+  .[, c(1, order(cols, decreasing = TRUE) + 1)]
+} %>%
+polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_geo_search)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Geo Search API usage by referrer", legend_name = "Searches") %>%
 dyLegend(width = 1000, show = "always") %>%
 dyRangeSelector %>%
@@ -41,7 +62,13 @@
 
 output$language_aggregate <- renderDygraph({
   split_dataset$language %>%
-tidyr::spread(referrer, calls) %>%polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_language_search)) %>%
+tidyr::spread(referrer, calls) %>%
+{
+  # Reorder columns according to the last observed values:
+  cols <- unlist(polloi::safe_tail(., 1)[, -1])
+  .[, c(1, order(cols, decreasing = TRUE) + 1)]
+} %>%
+polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_language_search)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Language search API usage by referrer", legend_name = "Searches") %>%
 dyLegend(width = 1000, show = "always") %>%
 dyRangeSelector %>%
@@ -52,6 +79,11 @@
 output$prefix_aggregate <- renderDygraph({
   split_dataset$prefix %>%
 tidyr::spread(referrer, calls) %>%
+{
+  # Reorder columns according to the last observed values:
+  cols <- unlist(polloi::safe_tail(., 1)[, -1])
+  .[, c(1, order(cols, decreasing = TRUE) + 1)]
+} %>%
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_prefix_search)) 
%>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily 
Prefix search API usage by referrer", legend_name = "Searches") %>%
 dyLegend(width = 1000, show = "always") %>%
@@ -71,6 +103,11 @@
 temp <- cbind(temp$date, purrr::map_df(temp[, -c(1, 2)], function(x) 
round(100 * x / temp$All, 2)))
   }
   temp %>%
+{
+ 
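
Note on the hunks above: the block added to each `renderDygraph` call reorders the series columns so that the legend lists the largest series first. A minimal base-R sketch of the same idea follows. It assumes plain `tail()` in place of `polloi::safe_tail()` (whose exact behaviour is not shown in this diff), and the `calls` data and the `reorder_by_last_value` name are made up for illustration.

```r
# Hypothetical wide data: one row per date, one column per referrer class.
calls <- data.frame(
  date = as.Date(c("2017-08-28", "2017-08-29", "2017-08-30")),
  External = c(5, 6, 7),
  Internal = c(50, 60, 70),
  None = c(20, 25, 30)
)

# Illustrative helper (not part of the change): put the series with the
# largest most-recent value first, keeping the date column in position 1.
reorder_by_last_value <- function(df) {
  # Values in the last row, excluding the date column:
  last_values <- unlist(tail(df, 1)[, -1])
  # order() indexes the series columns, so shift by 1 to skip the date column:
  df[, c(1, order(last_values, decreasing = TRUE) + 1)]
}

reordered <- reorder_by_last_value(calls)
print(names(reordered))
```

The `+ 1` offset is needed because `order()` is computed on the series columns only, while the final index is applied to the full data frame.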

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: SRP visit times

2017-08-30 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/374920 )

Change subject: SRP visit times
..


SRP visit times

Functional but still has the following TODOs:
- reorder how the %-lang combos show up in the legend
- once more of the data has been backfilled, need to add some
  general comments on trends

Bug: T170468
Change-Id: I690230e3d3a7a41156f5878169577a62f52ddeb6
---
M CHANGELOG.md
M modules/page_visit_times.R
A tab_documentation/srp_surv.md
M tab_documentation/survival.md
M ui.R
M utils.R
6 files changed, 127 insertions(+), 7 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved
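
Note on `output$srp_ld_plot` in the diff below: the pipeline reshapes the per-language LD columns into one column per "LD (language)" combination before plotting. The following is a rough sketch of that reshaping only, assuming `tidyr`, `dplyr`, and `magrittr` are installed. The `visits` data frame and its values are invented for illustration and do not come from the dashboard's dataset.

```r
library(magrittr)

# Made-up input in the shape the diff appears to expect:
# one row per date and language, one column per LD statistic.
visits <- data.frame(
  date = as.Date(c("2017-08-29", "2017-08-29", "2017-08-30", "2017-08-30")),
  language = c("English", "Other", "English", "Other"),
  LD50 = c(10, 15, 12, 18),
  LD90 = c(40, 60, 45, 65),
  stringsAsFactors = FALSE
)

selected_languages <- c("English")  # stands in for input$language_srp_ld_plot

wide <- visits %>%
  # Long format: one row per (date, language, LD statistic)
  tidyr::gather(LD, time, -c(date, language)) %>%
  dplyr::filter(language %in% selected_languages) %>%
  # Build the legend label, e.g. "LD50 (English)"
  dplyr::transmute(
    date = date, time = time,
    label = paste0(LD, " (", language, ")")
  ) %>%
  # Back to wide format: one column per label, one row per date
  tidyr::spread(label, time)

print(wide)
```

Because `tidyr::spread()` orders the new columns by the sorted label values, the legend order follows the label text. That is consistent with the TODO in the commit message about reordering the legend entries.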



diff --git a/CHANGELOG.md b/CHANGELOG.md
index 7cb188e..099e8a1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,10 @@
 
 All notable changes to this project will be documented in this file.
 
+## 2017/08/30
+- Added SRP visit times ([T170468](https://phabricator.wikimedia.org/T170468))
+- Added [dygraph-based rolling 
periods](https://rstudio.github.io/dygraphs/gallery-roll-periods.html) to page 
visit times modules
+
 ## 2017/08/29
 - Added support for breakdown of API usage by referrer 
([T172452](https://phabricator.wikimedia.org/T172452))
 - Added morelike API usage (see [Gerrit change 
345863](https://gerrit.wikimedia.org/r/#/c/345863/)) for more details
diff --git a/modules/page_visit_times.R b/modules/page_visit_times.R
index 115cbb4..d312fcc 100644
--- a/modules/page_visit_times.R
+++ b/modules/page_visit_times.R
@@ -1,11 +1,34 @@
 output$lethal_dose_plot <- renderDygraph({
-  user_page_visit_dataset %>%
+  req(length(input$filter_lethal_dose_plot) > 0)
+  user_page_visit_dataset[, c("date", input$filter_lethal_dose_plot)] %>%
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_lethal_dose_plot)) %>%
-polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at which 
we have lost N% of the users") %>%
+polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at which 
N% users leave the visited page") %>%
 dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = 
polloi::custom_axis_formatter,
axisLabelWidth = 100, pixelsPerLabel = 80) %>%
+dyRoller(rollPeriod = input$rolling_lethal_dose_plot) %>%
 dyLegend(labelsDiv = "lethal_dose_plot_legend") %>%
 dyRangeSelector(fillColor = "", strokeColor = "") %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
 dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
 })
+
+output$srp_ld_plot <- renderDygraph({
+  req(length(input$filter_srp_ld_plot) > 0 && 
length(input$language_srp_ld_plot) > 0)
+  serp_page_visit_dataset[, c("date", "language", input$filter_srp_ld_plot)] 
%>%
+tidyr::gather(LD, time, -c(date, language)) %>%
+dplyr::filter(language %in% input$language_srp_ld_plot) %>%
+dplyr::transmute(
+  date = date, time = time,
+  label = paste0(LD, " (", language, ")")
+) %>%
+tidyr::spread(label, time) %>%
+polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_srp_ld_plot)) %>%
+polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at N% 
users leave the search results page") %>%
+dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = 
polloi::custom_axis_formatter,
+   axisLabelWidth = 100, pixelsPerLabel = 80) %>%
+dyRoller(rollPeriod = input$rolling_srp_ld_plot) %>%
+dyLegend(labelsDiv = "srp_ld_plot_legend") %>%
+dyRangeSelector(fillColor = "", strokeColor = "") %>%
+dyEvent(as.Date("2017-04-25"), "S (sampling rates)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-06-15"), "SS (sister search)", labelLoc = "bottom")
+})
diff --git a/tab_documentation/srp_surv.md b/tab_documentation/srp_surv.md
new file mode 100644
index 000..254818f
--- /dev/null
+++ b/tab_documentation/srp_surv.md
@@ -0,0 +1,23 @@
+How long Wikipedia searchers stay on the search result pages
+===
+
+When someone is randomly selected for search satisfaction tracking (using our 
[TSS2 schema](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2)), 
we use a check-in system and survival analysis to estimate how long users stay 
on visited pages. When a Wikipedia visitor searches using autocomplete and ends 
up on a full-text search results page (SRP), we can track how long that page is 
"alive" before the user either closes the tab, clicks on a result, or navigates 
elsewhere.
+
+To summarize the results on a daily basis, we record a set of statistics based 
on a measure formally known as "[median lethal 
dose](https://en.wikipedia.org/wiki/Median_lethal_dose)". This graph shows the 
length of time that must pass before N% of the users leave the search results 
page. When the number goes up, we can infer that users are staying on the pages 
longer.
+
+Outages and 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Duplicate reports without max data points limit to keep data...

2017-08-30 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/374900 )

Change subject: Duplicate reports without max data points limit to keep data 
longer
..

Duplicate reports without max data points limit to keep data longer

Bug: T172453
Change-Id: Iabadd6af646cf186aff811aef8f91d2d9106a3dd
---
A modules/metrics/portal/all_country_data_history
M modules/metrics/portal/config.yaml
A modules/metrics/portal/first_visits_country_history
A modules/metrics/portal/last_action_country_history
A modules/metrics/portal/most_common_country_history
A modules/metrics/search/app_event_counts_langproj_breakdown_history
A modules/metrics/search/cirrus_langproj_breakdown_no_automata_history
A modules/metrics/search/cirrus_langproj_breakdown_with_automata_history
M modules/metrics/search/config.yaml
A modules/metrics/search/desktop_event_counts_langproj_breakdown_history
A modules/metrics/search/mobile_event_counts_langproj_breakdown_history
A 
modules/metrics/search/paulscore_approximations_fulltext_langproj_breakdown_history
A modules/metrics/search/search_threshold_pass_rate_langproj_breakdown_history
13 files changed, 99 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden 
refs/changes/00/374900/1

diff --git a/modules/metrics/portal/all_country_data_history 
b/modules/metrics/portal/all_country_data_history
new file mode 100755
index 000..d295ff7
--- /dev/null
+++ b/modules/metrics/portal/all_country_data_history
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+Rscript modules/metrics/portal/geographic_breakdown.R -d $1 --include_all
diff --git a/modules/metrics/portal/config.yaml 
b/modules/metrics/portal/config.yaml
index b980a64..3a44813 100644
--- a/modules/metrics/portal/config.yaml
+++ b/modules/metrics/portal/config.yaml
@@ -53,6 +53,12 @@
 max_data_points: 60
 funnel: true
 type: script
+all_country_data_history:
+description: Sampled traffic to Wikipedia.org Portal, broken down by 
country. Historical data store.
+granularity: days
+starts: 2017-04-01
+funnel: true
+type: script
 app_link_clicks:
 description: Clicks to Wikipedia mobile apps and list of apps
 granularity: days
@@ -66,6 +72,12 @@
 max_data_points: 60
 funnel: true
 type: script
+last_action_country_history:
+description: Last action performed on Wikipedia.org Portal per user 
session. Historical data store.
+granularity: days
+starts: 2017-04-01
+funnel: true
+type: script
 most_common_country:
 description: Most common action performed on Wikipedia.org Portal per 
user session, broken down by country
 granularity: days
@@ -73,6 +85,12 @@
 max_data_points: 60
 funnel: true
 type: script
+most_common_country_history:
+description: Most common action performed on Wikipedia.org Portal per 
user session, broken down by country. Historical data store.
+granularity: days
+starts: 2017-04-01
+funnel: true
+type: script
 first_visits_country:
 description: Action performed on Wikipedia.org Portal on each user's 
initial visit, broken down by country
 granularity: days
@@ -80,6 +98,12 @@
 max_data_points: 60
 funnel: true
 type: script
+first_visits_country_history:
+description: Action performed on Wikipedia.org Portal on each user's 
initial visit, broken down by country. Historical data store.
+granularity: days
+starts: 2017-04-01
+funnel: true
+type: script
 clickthrough_rate:
 description: Last action (no action vs clickthrough) by Wikipedia.org 
Portal visitors
 granularity: days
diff --git a/modules/metrics/portal/first_visits_country_history 
b/modules/metrics/portal/first_visits_country_history
new file mode 100755
index 000..2312008
--- /dev/null
+++ b/modules/metrics/portal/first_visits_country_history
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+Rscript modules/metrics/portal/engagement.R -d $1 -o clickthrough_firstvisit 
--by_country
diff --git a/modules/metrics/portal/last_action_country_history 
b/modules/metrics/portal/last_action_country_history
new file mode 100755
index 000..dd3d177
--- /dev/null
+++ b/modules/metrics/portal/last_action_country_history
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+Rscript modules/metrics/portal/engagement.R -d $1 -o clickthrough_breakdown 
--by_country
diff --git a/modules/metrics/portal/most_common_country_history 
b/modules/metrics/portal/most_common_country_history
new file mode 100755
index 000..d17ce0b
--- /dev/null
+++ b/modules/metrics/portal/most_common_country_history
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+Rscript modules/metrics/portal/engagement.R -d $1 -o most_common_per_visit 
--by_country
diff --git 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: metrics::search::srp_survtime: Split by language

2017-08-30 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/374876 )

Change subject: metrics::search::srp_survtime: Split by language
..


metrics::search::srp_survtime: Split by language

Bug: T170468
Change-Id: I2b935be14eeb26350dea6d9c31a66977c531c052
---
M docs/README.md
M modules/metrics/search/config.yaml
M modules/metrics/search/sample_page_visit_ld.R
M modules/metrics/search/srp_survtime.R
4 files changed, 49 insertions(+), 29 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved
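
Note on the `sample_page_visit_ld.R` hunks below: the change factors the zero-row result into `empty_df()` so that the script still prints a tab-separated header when there is nothing to summarize. A small sketch of that pattern follows. It is an illustration only: plain `quantile()` on observed dwell times stands in for the interval-censored `survival::survfit()` fit used in the real script, the column set is shortened, and `summarize_visits` and `header_only` are hypothetical names.

```r
# Zero-row data frame carrying only the expected column names.
empty_df <- function() {
  data.frame(
    date = character(),
    LD10 = character(),
    LD50 = character(),
    stringsAsFactors = FALSE
  )
}

# Hypothetical summarizer: falls back to empty_df() when there is no data.
# NOTE: quantile() is a stand-in for the survival-analysis quantiles.
summarize_visits <- function(dwell_times, date) {
  if (length(dwell_times) == 0) {
    return(empty_df())
  }
  data.frame(
    date = date,
    LD10 = unname(quantile(dwell_times, 0.10)),
    LD50 = unname(quantile(dwell_times, 0.50)),
    stringsAsFactors = FALSE
  )
}

# With no data, write.table() still emits the header row:
header_only <- capture.output(
  write.table(summarize_visits(numeric(0), "2017-08-30"), file = "",
              sep = "\t", row.names = FALSE, quote = FALSE)
)
print(header_only)
```

Keeping the column names in one helper means the "no rows from the query" and "no usable page visits" branches cannot drift apart.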



diff --git a/docs/README.md b/docs/README.md
index 8af2aa3..1b2abe6 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -8,7 +8,7 @@
 infrastructure. These datasets provide the metrics that are used by
 [Discovery's Dashboards](https://discovery.wmflabs.org/)
 
-Last updated on 28 August 2017
+Last updated on 30 August 2017
 
 Daily Metrics
 -
@@ -147,7 +147,8 @@
 exlcudes known automata
 -   **srp\_survtime.tsv**: Estimates (via survival analysis) of how long
 Wikipedia searchers stay on full-text search results page after
-getting there from autocomplete search.
+getting there from autocomplete search, split by English vs French
+and Catalan vs other languages.
 
 wdqs/
 -
diff --git a/modules/metrics/search/config.yaml 
b/modules/metrics/search/config.yaml
index bfa78ab..7514982 100644
--- a/modules/metrics/search/config.yaml
+++ b/modules/metrics/search/config.yaml
@@ -163,8 +163,8 @@
 funnel: true
 type: script
 srp_survtime:
-description: Estimates (via survival analysis) of how long Wikipedia 
searchers stay on full-text search results page after getting there from 
autocomplete search.
+description: Estimates (via survival analysis) of how long Wikipedia 
searchers stay on full-text search results page after getting there from 
autocomplete search, split by English vs French and Catalan vs other languages.
 granularity: days
 starts: 2017-04-01
-funnel: false
+funnel: true
 type: script
diff --git a/modules/metrics/search/sample_page_visit_ld.R 
b/modules/metrics/search/sample_page_visit_ld.R
index ee425f6..dc04dac 100644
--- a/modules/metrics/search/sample_page_visit_ld.R
+++ b/modules/metrics/search/sample_page_visit_ld.R
@@ -40,10 +40,8 @@
   }
 )
 
-if (nrow(results) == 0) {
-  # Here we make the script output tab-separated
-  # column names, as required by Reportupdater:
-  page_visit_survivorship <- data.frame(
+empty_df <- function() {
+  data.frame(
 date = character(),
 LD10 = character(),
 LD25 = character(),
@@ -53,6 +51,12 @@
 LD95 = character(),
 LD99 = character()
   )
+}
+
+if (nrow(results) == 0) {
+  # Here we make the script output tab-separated
+  # column names, as required by Reportupdater:
+  page_visit_survivorship <- empty_df()
 } else {
   # De-duplicate, clean, and sort:
   results$timestamp <- as.POSIXct(results$timestamp, format = "%Y%m%d%H%M%S")
@@ -69,7 +73,7 @@
   # Treat each individual search session as its own thing, rather than 
belonging
   #   to a set of other search sessions by the same user.
   page_visits <- results[, {
-if (all(!is.na(.SD$checkin))) {
+if (any(.SD$event == "checkin")) {
   last_checkin <- max(.SD$checkin, na.rm = TRUE)
   idx <- which(checkins > last_checkin)
   if (length(idx) == 0) idx <- 16 # length(checkins) = 16
@@ -82,13 +86,19 @@
   )
 }
   }, by = c("session_id", "page_id")]
-  surv <- survival::Surv(time = page_visits$`last check-in`,
- time2 = page_visits$`next check-in`,
- event = page_visits$status,
- type = "interval")
-  fit <- survival::survfit(surv ~ 1)
-  page_visit_survivorship <- data.frame(date = opt$date, rbind(quantile(fit, 
probs = c(0.10, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99))$quantile))
-  colnames(page_visit_survivorship) <- c('date', 'LD10', 'LD25', 'LD50', 
'LD75', 'LD90', 'LD95', 'LD99')
+  if (nrow(page_visits) == 0) {
+page_visit_survivorship <- empty_df()
+  } else {
+surv <- survival::Surv(
+  time = page_visits$`last check-in`,
+  time2 = page_visits$`next check-in`,
+  event = page_visits$status,
+  type = "interval"
+)
+fit <- survival::survfit(surv ~ 1)
+page_visit_survivorship <- data.frame(date = opt$date, rbind(quantile(fit, 
probs = c(0.10, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99))$quantile))
+colnames(page_visit_survivorship) <- c('date', 'LD10', 'LD25', 'LD50', 
'LD75', 'LD90', 'LD95', 'LD99')
+  }
 }
 
 write.table(page_visit_survivorship, file = "", append = FALSE, sep = "\t", 
row.names = FALSE, quote = FALSE)
diff --git a/modules/metrics/search/srp_survtime.R 
b/modules/metrics/search/srp_survtime.R
index a4ca9cb..985a392 100644
--- a/modules/metrics/search/srp_survtime.R
+++ b/modules/metrics/search/srp_survtime.R
@@ -56,6 +56,7 @@
 

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Breakdown API calls by referer class

2017-08-29 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/374669 )

Change subject: Breakdown API calls by referer class
..

Breakdown API calls by referer class

Bug: T172452
Change-Id: Ic70d7054e02569eb8545dd347026c7f77321ab2c
---
M modules/api.R
A tab_documentation/referer_breakdown.md
M ui.R
M utils.R
4 files changed, 65 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow 
refs/changes/69/374669/1
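
Note on `output$referer_breakdown_plot` in the diff below: when the proportion option is on, each referrer-class column is divided by the `All` column and expressed as a percentage. A minimal base-R sketch of that calculation follows. The `calls` data frame and the `share_of_total` helper are hypothetical, and selecting columns by name rather than by position is a choice made here, not what the diff does.

```r
# Made-up wide data: total calls plus a column per referrer class.
calls <- data.frame(
  date = as.Date(c("2017-08-28", "2017-08-29")),
  All = c(200, 400),
  External = c(50, 100),
  Internal = c(150, 300)
)

# Illustrative helper: percentage share of each class, rounded to 2 decimals.
share_of_total <- function(df) {
  # Every column except the date and the total is treated as a class:
  parts <- df[, setdiff(names(df), c("date", "All")), drop = FALSE]
  shares <- lapply(parts, function(x) round(100 * x / df$All, 2))
  data.frame(date = df$date, shares, check.names = FALSE)
}

shares <- share_of_total(calls)
print(shares)
```

Naming the date column explicitly in `data.frame()` keeps it called `date` in the result, which downstream code that looks the column up by name may rely on.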

diff --git a/modules/api.R b/modules/api.R
index dc8e332..bfc2350 100644
--- a/modules/api.R
+++ b/modules/api.R
@@ -53,3 +53,22 @@
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
 dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")
 })
+
+output$referer_breakdown_plot <- renderDygraph({
+  temp <- split_dataset %>%
+dplyr::bind_rows(.id = "api") %>%
+dplyr::group_by(date, referrer) %>%
+dplyr::summarize(calls = sum(calls, na.rm = TRUE)) %>%
+tidyr::spread(referrer, calls)
+  if (input$referer_breakdown_prop) {
+temp <- cbind(temp$date, purrr::map_df(temp[, -c(1, 2)], function(x) 
round(100 * x / temp$All, 2)))
+  }
+  temp %>%
+polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_referer_breakdown)) %>%
+polloi::make_dygraph(xlab = "Date",
+ ylab = ifelse(input$referer_breakdown_prop, "API 
Calls Share (%)", "API Calls"),
+ title = "Daily API usage by referrer", legend_name = 
"API Calls") %>%
+dyRangeSelector %>%
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")
+})
diff --git a/tab_documentation/referer_breakdown.md 
b/tab_documentation/referer_breakdown.md
new file mode 100644
index 000..0b1f8d0
--- /dev/null
+++ b/tab_documentation/referer_breakdown.md
@@ -0,0 +1,24 @@
+API Calls by Referrer Class
+===
+
+All types of API calls are aggregated by date and referrer class.
+
+**Internal** is traffic referred by Wikimedia sites, specifically: 
mediawiki.org, wikibooks.org, wikidata.org, wikinews.org, wikimedia.org, 
wikimediafoundation.org, wikipedia.org, wikiquote.org, wikisource.org, 
wikiversity.org, wikivoyage.org, and wiktionary.org (See [Webrequest 
source](https://git.wikimedia.org/blob/analytics%2Frefinery%2Fsource.git/master/refinery-core%2Fsrc%2Fmain%2Fjava%2Forg%2Fwikimedia%2Fanalytics%2Frefinery%2Fcore%2FWebrequest.java#L203)
 for more information.)
+
+Outages and inaccuracies
+--
+
+* '__R__': on 2017-01-01 we started calculating all of Discovery's metrics 
using a new version of [our data retrieval and processing 
codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated 
to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' 
[Reportupdater 
infrastructure](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater). 
See [T150915](https://phabricator.wikimedia.org/T150915) for more details. 
Furthermore, we switched to an updated UDF for counting API calls -- the 
previous version was undercounting full-text and geo search API calls (see 
[Gerrit change 315503](https://gerrit.wikimedia.org/r/#/c/315503/) for more 
details).
+* '__U__': on 2017-06-29 we started to use a new UDF to get the type of search 
API (see [Gerrit change 345863](https://gerrit.wikimedia.org/r/#/c/345863/) for 
more details) and break down the API calls by referer class. 
+
+Questions, bug reports, and feature suggestions
+--
+For technical, non-bug questions, [email 
Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question) or 
[Chelsy](mailto:c...@wikimedia.org?subject=Dashboard%20Question). If you 
experience a bug or notice something wrong or have a suggestion, [open a ticket 
in 
Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery)
 in the Discovery board or [email 
Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question).
+
+
+
+  Link to this dashboard: <a href="https://discovery.wmflabs.org/metrics/#referer_breakdown">https://discovery.wmflabs.org/metrics/#referer_breakdown</a>
+  | Page is available under <a href="https://creativecommons.org/licenses/by-sa/3.0/" title="Creative Commons Attribution-ShareAlike License">CC-BY-SA 3.0</a>
+  | <a href="https://phabricator.wikimedia.org/diffusion/WDRN/" title="Search Metrics Dashboard source code repository">Code</a> is licensed under <a href="https://phabricator.wikimedia.org/diffusion/WDRN/browse/master/LICENSE.md" title="MIT License">MIT</a>
+  | Part of <a href="https://discovery.wmflabs.org/">Discovery Dashboards</a>
+
diff --git a/ui.R b/ui.R
index 8b20615..7131e7e 100644
--- a/ui.R
+++ b/ui.R
@@ -60,7 +60,8 @@
menuSubItem(text = "Open Search", tabName = 
"open_search"),
menuSubItem(text = "Geo Search", tabName = 
"geo_search"),

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Add a tab to track morelike search usage

2017-08-28 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/374442 )

Change subject: Add a tab to track morelike search usage
..

Add a tab to track morelike search usage

Bug: https://phabricator.wikimedia.org/T172452
Change-Id: I0d0b107df1f6b46a28b8e2c025d1acf5f0fec327
---
M modules/api.R
M modules/key_performance_metrics/api_usage.R
M tab_documentation/fulltext_basic.md
M tab_documentation/geo_basic.md
M tab_documentation/kpi_api_usage.md
M tab_documentation/language_basic.md
A tab_documentation/morelike_basic.md
M tab_documentation/open_basic.md
M tab_documentation/prefix_basic.md
M ui.R
10 files changed, 45 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow 
refs/changes/42/374442/1
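
Note on `output$morelike_aggregate` in the diff below: the pipeline adds a per-day total and then spreads the referer classes into columns. A short sketch of that step follows, assuming `dplyr`, `tidyr`, and `magrittr` are installed. The `morelike` data is invented, and the explicit `ungroup()` is an addition made here for clarity rather than something the diff contains.

```r
library(magrittr)

# Made-up long data: one row per date and referer class.
morelike <- data.frame(
  date = as.Date(c("2017-08-27", "2017-08-27", "2017-08-28", "2017-08-28")),
  referer_class = c("external", "internal", "external", "internal"),
  calls = c(10, 30, 20, 40),
  stringsAsFactors = FALSE
)

wide <- morelike %>%
  dplyr::group_by(date) %>%
  # Daily total across all referer classes, repeated on every row of the day
  dplyr::mutate(All = sum(calls, na.rm = TRUE)) %>%
  dplyr::ungroup() %>%
  # One column per referer class; date and All identify each row
  tidyr::spread(key = referer_class, value = calls)

print(wide)
```

Since `All` is constant within a date, it does not split the rows when the classes are spread, and the result has one row per day.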

diff --git a/modules/api.R b/modules/api.R
index 73368cd..8838a99 100644
--- a/modules/api.R
+++ b/modules/api.R
@@ -6,7 +6,17 @@
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Full-text 
via API usage by day", legend_name = "Searches") %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
-dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom")
+dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")
+})
+
+output$morelike_aggregate <- renderDygraph({
+  split_dataset$`cirrus (more like)` %>%
+dplyr::group_by(date) %>%
+dplyr::mutate(All = sum(calls, na.rm = TRUE)) %>%
+tidyr::spread(key = referer_class, value = calls) %>%
+polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_morelike_search)) 
%>%
+polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Morelike 
Search via API usage by day", legend_name = "Searches") %>%
+dyRangeSelector
 })
 
 output$open_aggregate <- renderDygraph({
@@ -17,7 +27,7 @@
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "OpenSearch 
API usage by day", legend_name = "Searches") %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
-dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom")
+dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")
 })
 
 output$geo_aggregate <- renderDygraph({
@@ -28,7 +38,7 @@
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Geo Search 
API usage by day", legend_name = "Searches") %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
-dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom")
+dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")
 })
 
 output$language_aggregate <- renderDygraph({
@@ -39,7 +49,7 @@
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Language 
Search API usage by day", legend_name = "Searches") %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
-dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom")
+dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")
 })
 
 output$prefix_aggregate <- renderDygraph({
@@ -50,5 +60,5 @@
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Prefix 
Search API usage by day", legend_name = "Searches") %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
-dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom")
+dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")
 })
diff --git a/modules/key_performance_metrics/api_usage.R 
b/modules/key_performance_metrics/api_usage.R
index 13a4c3a..b1ba34b 100644
--- a/modules/key_performance_metrics/api_usage.R
+++ b/modules/key_performance_metrics/api_usage.R
@@ -40,7 +40,7 @@
  dyCSS(css = system.file("custom.css", package = "polloi")) %>%
  dyRangeSelector %>%
  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = 
"bottom") %>%
- dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = 
"bottom"))
+ dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = 
"bottom"))
   }
   api_usage_change <- api_usage %>%
 dplyr::mutate(
@@ -63,5 +63,5 @@
dyCSS(css = system.file("custom.css", package = "polloi")) %>%
dyRangeSelector %>%
dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = 
"bottom") %>%
-   dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom"))
+   dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom"))
 })
diff --git a/tab_documentation/fulltext_basic.md 
b/tab_documentation/fulltext_basic.md
index c2a121a..76826cf 100644
--- a/tab_documentation/fulltext_basic.md
+++ b/tab_documentation/fulltext_basic.md
@@ -13,7 +13,7 @@
 --
 
 * '__R__': on 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: metrics::search::srp_survtime: Track search results page dwe...

2017-08-28 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/374396 )

Change subject: metrics::search::srp_survtime: Track search results page dwell 
time
..


metrics::search::srp_survtime: Track search results page dwell time

Bug: T170468
Change-Id: I694a2f24cd831428ad95872dea085f8307994b4a
---
M docs/README.Rmd
M docs/README.md
M modules/metrics/search/config.yaml
A modules/metrics/search/srp_survtime
A modules/metrics/search/srp_survtime.R
5 files changed, 136 insertions(+), 4 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved
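
Note on `revision_number` in `srp_survtime.R` below: the script picks a schema revision according to the date being processed. The same lookup can be sketched in base R as follows. The revision IDs and cut-off dates are copied from the diff, while the `schema_revision` function name is hypothetical.

```r
# Illustrative helper: map a date to the schema revision used in the diff.
schema_revision <- function(date) {
  date <- as.Date(date)
  if (date < as.Date("2017-02-10")) {
    "15922352"
  } else if (date < as.Date("2017-06-29")) {
    "16270835"
  } else {
    "16909631"
  }
}

print(schema_revision("2017-08-28"))
```

Converting both sides to `Date` explicitly avoids relying on the implicit character-to-Date coercion in the comparison.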



diff --git a/docs/README.Rmd b/docs/README.Rmd
index c03b43d..c7d27c0 100644
--- a/docs/README.Rmd
+++ b/docs/README.Rmd
@@ -1,12 +1,12 @@
 ---
 output: md_document
 note: >
-  Needs to be knit into Markdown and rsync'd to 
stat1002:/a/published-datasets/discovery/README.md
+  Needs to be knit into Markdown and rsync'd to 
stat1005:/srv/published-datasets/discovery/README.md
 ---
 
 # Discovery Datasets
 
-These files are generated by Discovery's 
[Golden](https://github.com/wikimedia/wikimedia-discovery-golden/) data 
retrieval codebase that executes daily and uses 
[Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater) 
infrastructure. These datasets provide the metrics that are used by 
[Discovery's Dashboards](https://discovery.wmflabs.org/)
+These files are generated by Discovery's 
[Golden](https://github.com/wikimedia/wikimedia-discovery-golden/) data 
retrieval codebase that executes daily and uses 
[Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Systems/Reportupdater)
 infrastructure. These datasets provide the metrics that are used by 
[Discovery's Dashboards](https://discovery.wmflabs.org/)
 
 Last updated on `r format(Sys.Date(), "%d %B %Y")`
 
diff --git a/docs/README.md b/docs/README.md
index f1ea48c..8af2aa3 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -4,11 +4,11 @@
 These files are generated by Discovery's
 [Golden](https://github.com/wikimedia/wikimedia-discovery-golden/) data
 retrieval codebase that executes daily and uses
-[Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater)
+[Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Systems/Reportupdater)
 infrastructure. These datasets provide the metrics that are used by
 [Discovery's Dashboards](https://discovery.wmflabs.org/)
 
-Last updated on 01 August 2017
+Last updated on 28 August 2017
 
 Daily Metrics
 -
@@ -145,6 +145,9 @@
 Wikipedia search results pages; broken up by language, destination
 type (SERP vs not), and access method (desktop vs mobile web);
 exlcudes known automata
+-   **srp\_survtime.tsv**: Estimates (via survival analysis) of how long
+Wikipedia searchers stay on full-text search results page after
+getting there from autocomplete search.
 
 wdqs/
 -
diff --git a/modules/metrics/search/config.yaml 
b/modules/metrics/search/config.yaml
index 82f1c3f..56ec39b 100644
--- a/modules/metrics/search/config.yaml
+++ b/modules/metrics/search/config.yaml
@@ -162,3 +162,9 @@
 starts: 2017-06-01
 funnel: true
 type: script
+srp_survtime:
+description: Estimates (via survival analysis) of how long Wikipedia 
searchers stay on full-text search results page after getting there from 
autocomplete search.
+granularity: days
+starts: 2017-04-01
+funnel: true
+type: script
diff --git a/modules/metrics/search/srp_survtime 
b/modules/metrics/search/srp_survtime
new file mode 100755
index 000..08e2682
--- /dev/null
+++ b/modules/metrics/search/srp_survtime
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+Rscript modules/metrics/search/srp_survtime.R -d $1
diff --git a/modules/metrics/search/srp_survtime.R 
b/modules/metrics/search/srp_survtime.R
new file mode 100644
index 000..e24f99d
--- /dev/null
+++ b/modules/metrics/search/srp_survtime.R
@@ -0,0 +1,120 @@
+#!/usr/bin/env Rscript
+
+source("config.R")
+.libPaths(r_library)
+suppressPackageStartupMessages({
+  library("optparse")
+  library("glue")
+  library("magrittr")
+})
+
+option_list <- list(
+  make_option(c("-d", "--date"), default = NA, action = "store", type = 
"character")
+)
+
+# Get command line options, if help option encountered print help and exit,
+# otherwise if options not found on command line then set defaults:
+opt <- parse_args(OptionParser(option_list = option_list))
+
+if (is.na(opt$date)) {
+  quit(save = "no", status = 1)
+}
+
+mmdd <- format(as.Date(opt$date), "%Y%m%d")
+revision_number <- dplyr::case_when(
+  as.Date(opt$date) < "2017-02-10" ~ "15922352",
+  as.Date(opt$date) < "2017-06-29" ~ "16270835",
+  TRUE ~ "16909631"
+)
+
+query <- glue("SELECT
+  timestamp AS ts, wiki,
+  event_uniqueId AS event_id,
+  event_searchSessionId AS session_id,
+  event_pageViewId AS page_id,
+  event_action AS event,
+  event_checkin AS checkin,
+  

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Use new UDF and break api calls down by referer class

2017-08-28 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/374387 )

Change subject: Use new UDF and break api calls down by referer class
..

Use new UDF and break api calls down by referer class

Bug: T172452
Change-Id: I0c3fad23abb3931223d0b6212c1f8a969a251f72
---
M modules/api.R
M modules/key_performance_metrics/api_usage.R
M tab_documentation/fulltext_basic.md
M tab_documentation/kpi_api_usage.md
M utils.R
5 files changed, 33 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow 
refs/changes/87/374387/1
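
Note on the `dplyr::mutate(All = ifelse(is.na(All), ...))` lines added below: where the `All` total is missing for a day, it is rebuilt from the sum of the per-class columns. A minimal base-R sketch of that backfill follows. The `calls` data and the `backfill_all` helper are hypothetical, and the class columns are selected by name here, whereas the diff drops the first two columns by position.

```r
# Made-up wide data where the total is missing on the second day.
calls <- data.frame(
  date = as.Date(c("2017-08-27", "2017-08-28")),
  All = c(100, NA),
  external = c(40, 5),
  internal = c(60, 7)
)

# Illustrative helper: fill a missing All with the sum of the class columns.
backfill_all <- function(df) {
  parts <- df[, setdiff(names(df), c("date", "All")), drop = FALSE]
  df$All <- ifelse(is.na(df$All), rowSums(parts, na.rm = TRUE), df$All)
  df
}

filled <- backfill_all(calls)
print(filled)
```

Existing totals are left untouched, so only days without a reported `All` value are recomputed.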

diff --git a/modules/api.R b/modules/api.R
index 7e8e7ff..affe6fa 100644
--- a/modules/api.R
+++ b/modules/api.R
@@ -1,13 +1,18 @@
 output$cirrus_aggregate <- renderDygraph({
   split_dataset$cirrus %>%
+tidyr::spread(key = referer_class, value = calls) %>%
+dplyr::mutate(All = ifelse(is.na(All), rowSums(.[, -c(1, 2)], na.rm = 
TRUE), All)) %>%
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_fulltext_search)) 
%>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Full-text 
via API usage by day", legend_name = "Searches") %>%
 dyRangeSelector %>%
-dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom")
 })
 
 output$open_aggregate <- renderDygraph({
   split_dataset$open %>%
+tidyr::spread(key = referer_class, value = calls) %>%
+dplyr::mutate(All = ifelse(is.na(All), rowSums(.[, -c(1, 2)], na.rm = 
TRUE), All)) %>%
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_open_search)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "OpenSearch 
API usage by day", legend_name = "Searches") %>%
 dyRangeSelector %>%
@@ -16,6 +21,8 @@
 
 output$geo_aggregate <- renderDygraph({
   split_dataset$geo %>%
+tidyr::spread(key = referer_class, value = calls) %>%
+dplyr::mutate(All = ifelse(is.na(All), rowSums(.[, -c(1, 2)], na.rm = 
TRUE), All)) %>%
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_geo_search)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Geo Search 
API usage by day", legend_name = "Searches") %>%
 dyRangeSelector %>%
@@ -24,6 +31,8 @@
 
 output$language_aggregate <- renderDygraph({
   split_dataset$language %>%
+tidyr::spread(key = referer_class, value = calls) %>%
+dplyr::mutate(All = ifelse(is.na(All), rowSums(.[, -c(1, 2)], na.rm = 
TRUE), All)) %>%
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_language_search)) 
%>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Language 
Search API usage by day", legend_name = "Searches") %>%
 dyRangeSelector %>%
@@ -32,6 +41,8 @@
 
 output$prefix_aggregate <- renderDygraph({
   split_dataset$prefix %>%
+tidyr::spread(key = referer_class, value = calls) %>%
+dplyr::mutate(All = ifelse(is.na(All), rowSums(.[, -c(1, 2)], na.rm = 
TRUE), All)) %>%
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_prefix_search)) 
%>%
 polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Prefix 
Search API usage by day", legend_name = "Searches") %>%
 dyRangeSelector %>%
diff --git a/modules/key_performance_metrics/api_usage.R 
b/modules/key_performance_metrics/api_usage.R
index 271b030..13a4c3a 100644
--- a/modules/key_performance_metrics/api_usage.R
+++ b/modules/key_performance_metrics/api_usage.R
@@ -2,6 +2,11 @@
   smooth_level <- input$smoothing_kpi_api_usage
   start_date <- Sys.Date() - switch(input$kpi_summary_date_range_selector, all 
= NA, daily = 1, weekly = 8, monthly = 31, quarterly = 91)
   api_usage <- split_dataset %>%
+  purrr::map(function(x) {
+dplyr::group_by(x, date) %>%
+dplyr::summarize(calls = sum(calls, na.rm = TRUE)) %>%
+dplyr::ungroup()
+  }) %>%
   {
 if (!is.na(start_date)) {
   lapply(., polloi::subset_by_date_range, from = start_date, to = 
Sys.Date() - 1)
@@ -12,33 +17,35 @@
 dplyr::bind_rows(.id = "api") %>%
 tidyr::spread("api", "calls")
   if ( input$kpi_api_usage_series_include_open ) {
-api_usage <- dplyr::mutate(api_usage, all = cirrus + geo + language + open 
+ prefix)
+api_usage <- dplyr::mutate(api_usage, all = cirrus + ifelse(is.na(`cirrus 
(more like)`), 0, `cirrus (more like)`) + geo + language + open + prefix)
   } else {
-api_usage <- dplyr::mutate(api_usage, all = cirrus + geo + language + 
prefix)
+api_usage <- dplyr::mutate(api_usage, all = cirrus + ifelse(is.na(`cirrus 
(more like)`), 0, `cirrus (more like)`) + geo + language + prefix)
   }
   if ( 
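
For illustration only: the `All` fallback added to each `renderDygraph` block in this change can be sketched in base R on a table that has already been widened by referer class. The class names and counts below are made up. The idea that older rows carry only `All` while newer rows carry only per-class counts is an assumption, not something the change states.

```r
# Hypothetical wide table: one row per date, one column per referer class.
wide <- data.frame(
  date = as.Date(c("2017-08-13", "2017-08-14")),
  All = c(100, NA),
  external = c(NA, 30),
  internal = c(NA, 50),
  none = c(NA, 20)
)

# Columns holding per-class counts (everything except date and All).
class_cols <- setdiff(names(wide), c("date", "All"))

# Keep an existing All value; otherwise rebuild it from the class columns.
wide$All <- ifelse(
  is.na(wide$All),
  rowSums(wide[, class_cols], na.rm = TRUE),
  wide$All
)
print(wide)
```

Selecting the class columns by name, rather than by position as `.[, -c(1, 2)]` does, is a choice made for this sketch so that it does not depend on column order.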

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Breakdown search API requests by referer class and use GetSe...

2017-08-14 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/371980 )

Change subject: Breakdown search API requests by referer class and use GetSearchRequestTypeUDF
..

Breakdown search API requests by referer class and use GetSearchRequestTypeUDF

Please do not merge this patch since the new UDF hasn’t been released to 
production.

Bug: T172452
Change-Id: Ia4aa5260fe243abeced91c67de8f44bdc9be859b
---
M modules/metrics/search/search_api_usage
1 file changed, 4 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/80/371980/1

diff --git a/modules/metrics/search/search_api_usage b/modules/metrics/search/search_api_usage
index a0b1b7c..f9de476 100755
--- a/modules/metrics/search/search_api_usage
+++ b/modules/metrics/search/search_api_usage
@@ -1,11 +1,12 @@
 #!/bin/bash
 
 hive -S -e "ADD JAR hdfs:///wmf/refinery/current/artifacts/refinery-hive.jar;
-CREATE TEMPORARY FUNCTION search_classify AS 'org.wikimedia.analytics.refinery.hive.SearchClassifierUDF';
+CREATE TEMPORARY FUNCTION search_classify AS 'org.wikimedia.analytics.refinery.hive.GetSearchRequestTypeUDF';
 USE wmf;
 SELECT
   '$1' AS date,
   search_classify(uri_path, uri_query) AS api,
+  referer_class,
   COUNT(*) AS calls
 FROM webrequest
 WHERE
@@ -13,6 +14,6 @@
   AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) >= '$1'
   AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) < '$2'
   AND http_status = '200'
-  AND search_classify(uri_path, uri_query) IN('language', 'cirrus', 'prefix', 'geo', 'open')
-GROUP BY '$1', search_classify(uri_path, uri_query);
+  AND search_classify(uri_path, uri_query) IN('language', 'cirrus', 'cirrus (more like)', 'prefix', 'geo', 'open')
+GROUP BY '$1', search_classify(uri_path, uri_query), referer_class;
 " 2> /dev/null | grep -v parquet.hadoop | grep -v WARN:
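
With `referer_class` added to the `SELECT` and `GROUP BY`, the query returns one row per date, API and referer class instead of one row per date and API. A consumer that still wants per-API totals would have to sum across classes. A minimal base R sketch of that step, with made-up rows and class names:

```r
# Illustrative rows in the shape the query returns (values are made up).
api_calls <- data.frame(
  date = as.Date("2017-08-14"),
  api = c("cirrus", "cirrus", "cirrus (more like)", "open", "open"),
  referer_class = c("internal", "external", "internal", "none", "external"),
  calls = c(120, 30, 15, 400, 60),
  stringsAsFactors = FALSE
)

# Total calls per date and API, collapsing the referer classes.
totals <- aggregate(calls ~ date + api, data = api_calls, FUN = sum)
print(totals)
```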

-- 
To view, visit https://gerrit.wikimedia.org/r/371980
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia4aa5260fe243abeced91c67de8f44bdc9be859b
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits


[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Remove duplicated clicks on the same position for each query...

2017-08-09 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/370977 )

Change subject: Remove duplicated clicks on the same position for each query when computing paulscore
..

Remove duplicated clicks on the same position for each query when computing paulscore

Bug: T172960
Change-Id: I972500c6150408a119f2c80dad9fe8a49f00845e
---
M modules/metrics/search/paulscore_approximations.R
1 file changed, 8 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/77/370977/1

diff --git a/modules/metrics/search/paulscore_approximations.R b/modules/metrics/search/paulscore_approximations.R
index 1f7fe9f..1a1ede0 100644
--- a/modules/metrics/search/paulscore_approximations.R
+++ b/modules/metrics/search/paulscore_approximations.R
@@ -35,11 +35,14 @@
   SUM(IF(event_action = 'click', POW(0.7, event_position), 0)) / SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_7,
   SUM(IF(event_action = 'click', POW(0.8, event_position), 0)) / SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_8,
   SUM(IF(event_action = 'click', POW(0.9, event_position), 0)) / SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_9
-FROM TestSearchSatisfaction2_", dplyr::if_else(as.Date(opt$date) < "2017-02-10", "15922352", dplyr::if_else(as.Date(opt$date) < "2017-06-29", "16270835", "16909631")), "
-WHERE ", date_clause, "
-  AND event_action IN ('searchResultPage', 'click')
-  AND IF(event_source = 'autocomplete', event_inputLocation = 'header', TRUE)
-  AND IF(event_source = 'autocomplete' AND event_action = 'click', event_position >= 0, TRUE)
+FROM
+  (SELECT event_searchSessionId, event_source, wiki, event_action, event_position, event_pageViewId, event_query
+   FROM TestSearchSatisfaction2_", dplyr::if_else(as.Date(opt$date) < "2017-02-10", "15922352", dplyr::if_else(as.Date(opt$date) < "2017-06-29", "16270835", "16909631")), "
+  WHERE ", date_clause, "
+AND event_action IN ('searchResultPage', 'click')
+AND IF(event_source = 'autocomplete', event_inputLocation = 'header', TRUE)
+AND IF(event_source = 'autocomplete' AND event_action = 'click', event_position >= 0, TRUE)
+  GROUP BY event_searchSessionId, event_source, wiki, event_action, event_position, event_pageViewId, event_query) AS deduplicate
 GROUP BY date, event_searchSessionId, event_source, wiki;")
 
 # Fetch data from MySQL database:
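
The effect of the inner `GROUP BY` can be illustrated outside SQL. The base R sketch below uses a made-up event log with hypothetical column names and assumes zero-based click positions. It computes the same ratio as the `pow_7` column, first on the raw events and then after dropping exact duplicates.

```r
# Hypothetical event log for one search session (values are made up).
events <- data.frame(
  session = "s1",
  page_view = "p1",
  query = "cats",
  action = c("searchResultPage", "click", "click", "click"),
  position = c(NA, 0, 0, 2),
  stringsAsFactors = FALSE
)

# Sum of discount^position over clicks, divided by the number of result pages.
paulscore <- function(ev, discount = 0.7) {
  clicks <- ev[ev$action == "click", ]
  serps <- sum(ev$action == "searchResultPage")
  sum(discount ^ clicks$position) / serps
}

raw_score <- paulscore(events)            # the repeated click at position 0 counts twice
dedup_score <- paulscore(unique(events))  # identical rows collapse to one
print(c(raw = raw_score, deduplicated = dedup_score))
```

Here `unique()` plays the role of the subquery's `GROUP BY` over all selected columns: rows that agree on every column are kept once.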

-- 
To view, visit https://gerrit.wikimedia.org/r/370977
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I972500c6150408a119f2c80dad9fe8a49f00845e
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Add 'na.rm = TRUE' to sum functions

2017-08-09 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/370922 )

Change subject: Add 'na.rm = TRUE' to sum functions
..

Add 'na.rm = TRUE' to sum functions

Bug: T170469
Change-Id: I065f732b94bc59c487885e59c618abb1319c72ca
---
M utils.R
1 file changed, 22 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/22/370922/1

diff --git a/utils.R b/utils.R
index 6d25af8..ab34131 100644
--- a/utils.R
+++ b/utils.R
@@ -20,14 +20,14 @@
 dplyr::summarize(volume = sum(as.numeric(`search sessions`), na.rm = 
TRUE)) %>%
 dplyr::filter(volume > 0) %>%
 dplyr::arrange(desc(volume)) %>%
-dplyr::mutate(prop = volume / sum(volume),
+dplyr::mutate(prop = volume / sum(volume, na.rm = TRUE),
   label = sprintf("%s (%.3f%%)", language, 100 * prop))
   available_projects_desktop <<- desktop_langproj_dygraph_set %>%
 dplyr::group_by(project) %>%
 dplyr::summarize(volume = sum(as.numeric(`search sessions`), na.rm = 
TRUE)) %>%
 dplyr::filter(volume > 0) %>%
 dplyr::arrange(desc(volume)) %>%
-dplyr::mutate(prop = volume / sum(volume),
+dplyr::mutate(prop = volume / sum(volume, na.rm = TRUE),
   label = sprintf("%s (%.3f%%)", project, 100 * prop))
 }
 
@@ -69,7 +69,7 @@
 dplyr::filter(!is.na(click_position), !is.na(events)) %>%
 dplyr::distinct(date, click_position, .keep_all = TRUE) %>%
 dplyr::group_by(date) %>%
-dplyr::mutate(prop = round(events / sum(events) * 100, 2)) %>%
+dplyr::mutate(prop = round(events / sum(events, na.rm = TRUE) * 100, 2)) 
%>%
 dplyr::ungroup() %>%
 dplyr::select(-events) %>%
 tidyr::spread(click_position, prop, fill = 0)
@@ -80,7 +80,7 @@
 dplyr::filter(!is.na(invoke_source), !is.na(events)) %>%
 dplyr::distinct(date, invoke_source, .keep_all = TRUE) %>%
 dplyr::group_by(date) %>%
-dplyr::mutate(prop = round(events / sum(events) * 100, 2)) %>%
+dplyr::mutate(prop = round(events / sum(events, na.rm = TRUE) * 100, 2)) 
%>%
 dplyr::ungroup() %>%
 dplyr::select(-events) %>%
 tidyr::spread(invoke_source, prop, fill = 0)
@@ -179,14 +179,14 @@
 dplyr::summarize(volume = sum(as.numeric(total), na.rm = TRUE)) %>%
 dplyr::filter(volume > 0) %>%
 dplyr::arrange(desc(volume)) %>%
-dplyr::mutate(prop = volume / sum(volume),
+dplyr::mutate(prop = volume / sum(volume, na.rm = TRUE),
   label = sprintf("%s (%.3f%%)", language, 100 * prop))
   available_projects <<- langproj_with_automata %>%
 dplyr::group_by(project) %>%
 dplyr::summarize(volume = sum(as.numeric(total), na.rm = TRUE)) %>%
 dplyr::filter(volume > 0) %>%
 dplyr::arrange(desc(volume)) %>%
-dplyr::mutate(prop = volume / sum(volume),
+dplyr::mutate(prop = volume / sum(volume, na.rm = TRUE),
   label = sprintf("%s (%.3f%%)", project, 100 * prop))
   projects_db <<- readr::read_csv(system.file("extdata/projects.csv", package 
= "polloi"), col_types = "cclc")[, c("project", "multilingual")]
 }
@@ -203,7 +203,7 @@
   ) %>%
 dplyr::bind_rows(.id = "platform") %>%
 dplyr::group_by(date) %>%
-dplyr::summarize(clickthroughs = sum(clickthroughs), serps = sum(`Result 
pages opened`)) %>%
+dplyr::summarize(clickthroughs = sum(clickthroughs, na.rm = TRUE), serps = 
sum(`Result pages opened`, na.rm = TRUE)) %>%
 dplyr::right_join(threshold_data, by = "date") %>%
 dplyr::transmute(
   date = date,
@@ -244,7 +244,7 @@
   ) %>%
 dplyr::bind_rows(.id = "platform") %>%
 dplyr::group_by(date, language, project) %>%
-dplyr::summarize(clickthroughs = sum(clickthroughs), serps = sum(`Result 
pages opened`)) %>%
+dplyr::summarize(clickthroughs = sum(clickthroughs, na.rm = TRUE), serps = 
sum(`Result pages opened`, na.rm = TRUE)) %>%
 dplyr::right_join(threshold_data, by = c("date", "language", "project")) 
%>%
 dplyr::ungroup() %>%
 dplyr::transmute(
@@ -263,14 +263,14 @@
 dplyr::summarize(volume = sum(as.numeric(`Result pages opened`), na.rm = 
TRUE)) %>%
 dplyr::filter(volume > 0) %>%
 dplyr::arrange(desc(volume)) %>%
-dplyr::mutate(prop = volume / sum(volume),
+dplyr::mutate(prop = volume / sum(volume, na.rm = TRUE),
   label = sprintf("%s (%.3f%%)", language, 100 * prop))
   available_projects_ctr <<- augmented_clickthroughs_langproj %>%
 dplyr::group_by(project) %>%
 dplyr::summarize(volume = sum(as.numeric(`Result pages opened`), na.rm = 
TRUE)) %>%
 dplyr::filter(volume > 0) %>%
 dplyr::arrange(desc(volume)) %>%
-dplyr::mutate(prop = volume / sum(volume),
+dplyr::mutate(prop = volume / sum(volume, na.rm = TRUE),
   label = sprintf("%s (%.3f%%)", project, 100 * prop))
 }
 
@@ -301,14 +301,14 @@
 dplyr::summarize(volume = sum(as.numeric(`search sessions`), na.rm = 
TRUE)) 
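
Why `na.rm = TRUE` matters in these denominators can be shown with a small base R sketch. The vector is made up; the point is only that a single missing value turns the whole total, and therefore every proportion, into `NA`.

```r
# Made-up volumes with one missing value.
volume <- c(120, NA, 80)

total_default <- sum(volume)              # NA, because one element is missing
total_narm <- sum(volume, na.rm = TRUE)   # missing element is skipped

# Proportions stay usable for the non-missing rows only with na.rm = TRUE.
prop_default <- volume / total_default
prop_narm <- volume / total_narm
print(prop_narm)
```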

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Annotate sample rate change

2017-08-09 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/370857 )

Change subject: Annotate sample rate change
..


Annotate sample rate change

On April 25th, 2017, we changed the sample rates for several projects, which 
changed our search metrics.

Bug: T172428
Change-Id: I709222fa4fad807762c23858e6c00c43d0747d9a
---
M modules/desktop/events.R
M modules/desktop/load_times.R
M modules/desktop/paulscore.R
M modules/page_visit_times.R
M tab_documentation/desktop_events.md
M tab_documentation/desktop_load.md
M tab_documentation/paulscore_approx.html
M tab_documentation/survival.md
8 files changed, 12 insertions(+), 6 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/modules/desktop/events.R b/modules/desktop/events.R
index bcfd686..4f94e44 100644
--- a/modules/desktop/events.R
+++ b/modules/desktop/events.R
@@ -32,5 +32,6 @@
 polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Desktop 
search events, by day") %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") 
%>%
-dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
 })
diff --git a/modules/desktop/load_times.R b/modules/desktop/load_times.R
index 50fb49a..a797c80 100644
--- a/modules/desktop/load_times.R
+++ b/modules/desktop/load_times.R
@@ -5,5 +5,6 @@
 dyRangeSelector %>%
 dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") 
%>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
-dyEvent(as.Date("2017-06-15"), "B (sister search)", labelLoc = "bottom")
+dyEvent(as.Date("2017-06-15"), "B (sister search)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
 })
diff --git a/modules/desktop/paulscore.R b/modules/desktop/paulscore.R
index 144569b..b0ffb14 100644
--- a/modules/desktop/paulscore.R
+++ b/modules/desktop/paulscore.R
@@ -10,7 +10,7 @@
 polloi::make_dygraph(xlab = "Date", ylab = "PaulScore", title = "PaulScore 
for fulltext searches, by day", use_si = FALSE, group = "paulscore_approx") %>%
 dyRangeSelector %>%
 dyLegend(labelsDiv = "paulscore_approx_legend", show = "always") %>%
-dyEvent(as.Date("2017-04-19"), "A (rates)", labelLoc = "bottom")
+dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
   if (input$paulscore_relative) {
 dyOut <- dyAxis(dyOut, "y", axisLabelFormatter = "function(x) { return 
Math.round(100 * x, 3) + '%'; }", valueFormatter = "function(x) { return 
Math.round(100 * x, 3) + '%'; }")
   }
@@ -29,7 +29,7 @@
 polloi::make_dygraph(xlab = "Date", ylab = "PaulScore", title = "PaulScore 
for autocomplete searches, by day", use_si = FALSE, group = "paulscore_approx") 
%>%
 dyRangeSelector %>%
 dyLegend(labelsDiv = "paulscore_approx_legend", show = "always") %>%
-dyEvent(as.Date("2017-04-19"), "A (rates)", labelLoc = "bottom")
+dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
   if (input$paulscore_relative) {
 dyOut <- dyAxis(dyOut, "y", axisLabelFormatter = "function(x) { return 
Math.round(100 * x, 3) + '%'; }", valueFormatter = "function(x) { return 
Math.round(100 * x, 3) + '%'; }")
   }
diff --git a/modules/page_visit_times.R b/modules/page_visit_times.R
index 4a51a78..115cbb4 100644
--- a/modules/page_visit_times.R
+++ b/modules/page_visit_times.R
@@ -6,5 +6,6 @@
axisLabelWidth = 100, pixelsPerLabel = 80) %>%
 dyLegend(labelsDiv = "lethal_dose_plot_legend") %>%
 dyRangeSelector(fillColor = "", strokeColor = "") %>%
-dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
 })
diff --git a/tab_documentation/desktop_events.md 
b/tab_documentation/desktop_events.md
index 94c5f95..be4d9d8 100644
--- a/tab_documentation/desktop_events.md
+++ b/tab_documentation/desktop_events.md
@@ -21,6 +21,7 @@
 * Data in late September/early October 2015 is unavailable due to another bug 
in EventLogging as a whole, which impacted data collection.
 * '__A__': we switched to using data from 
[Schema:TestSearchSatisfaction2](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2)
 instead of [Schema:Search](https://meta.wikimedia.org/wiki/Schema:Search) for 
Desktop event counts and load times on 12 July 2016.
 * '__R__': on 2017-01-01 we started calculating all of Discovery's metrics 
using a new version of [our data retrieval and processing 
codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated 
to [Wikimedia 

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Annotate sample rate change

2017-08-09 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/370857 )

Change subject: Annotate sample rate change
..

Annotate sample rate change

On April 25th, 2017, we changed the sample rates for several projects, which 
changed our search metrics.

Bug: T172428
Change-Id: I709222fa4fad807762c23858e6c00c43d0747d9a
---
M modules/desktop/events.R
M modules/desktop/load_times.R
M modules/desktop/paulscore.R
M modules/page_visit_times.R
M tab_documentation/desktop_events.md
M tab_documentation/desktop_load.md
M tab_documentation/paulscore_approx.html
M tab_documentation/survival.md
8 files changed, 12 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/57/370857/1

diff --git a/modules/desktop/events.R b/modules/desktop/events.R
index bcfd686..4f94e44 100644
--- a/modules/desktop/events.R
+++ b/modules/desktop/events.R
@@ -32,5 +32,6 @@
 polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Desktop 
search events, by day") %>%
 dyRangeSelector %>%
 dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") 
%>%
-dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
 })
diff --git a/modules/desktop/load_times.R b/modules/desktop/load_times.R
index 50fb49a..a797c80 100644
--- a/modules/desktop/load_times.R
+++ b/modules/desktop/load_times.R
@@ -5,5 +5,6 @@
 dyRangeSelector %>%
 dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") 
%>%
 dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
-dyEvent(as.Date("2017-06-15"), "B (sister search)", labelLoc = "bottom")
+dyEvent(as.Date("2017-06-15"), "B (sister search)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
 })
diff --git a/modules/desktop/paulscore.R b/modules/desktop/paulscore.R
index 144569b..b0ffb14 100644
--- a/modules/desktop/paulscore.R
+++ b/modules/desktop/paulscore.R
@@ -10,7 +10,7 @@
 polloi::make_dygraph(xlab = "Date", ylab = "PaulScore", title = "PaulScore 
for fulltext searches, by day", use_si = FALSE, group = "paulscore_approx") %>%
 dyRangeSelector %>%
 dyLegend(labelsDiv = "paulscore_approx_legend", show = "always") %>%
-dyEvent(as.Date("2017-04-19"), "A (rates)", labelLoc = "bottom")
+dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
   if (input$paulscore_relative) {
 dyOut <- dyAxis(dyOut, "y", axisLabelFormatter = "function(x) { return 
Math.round(100 * x, 3) + '%'; }", valueFormatter = "function(x) { return 
Math.round(100 * x, 3) + '%'; }")
   }
@@ -29,7 +29,7 @@
 polloi::make_dygraph(xlab = "Date", ylab = "PaulScore", title = "PaulScore 
for autocomplete searches, by day", use_si = FALSE, group = "paulscore_approx") 
%>%
 dyRangeSelector %>%
 dyLegend(labelsDiv = "paulscore_approx_legend", show = "always") %>%
-dyEvent(as.Date("2017-04-19"), "A (rates)", labelLoc = "bottom")
+dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
   if (input$paulscore_relative) {
 dyOut <- dyAxis(dyOut, "y", axisLabelFormatter = "function(x) { return 
Math.round(100 * x, 3) + '%'; }", valueFormatter = "function(x) { return 
Math.round(100 * x, 3) + '%'; }")
   }
diff --git a/modules/page_visit_times.R b/modules/page_visit_times.R
index 4a51a78..115cbb4 100644
--- a/modules/page_visit_times.R
+++ b/modules/page_visit_times.R
@@ -6,5 +6,6 @@
axisLabelWidth = 100, pixelsPerLabel = 80) %>%
 dyLegend(labelsDiv = "lethal_dose_plot_legend") %>%
 dyRangeSelector(fillColor = "", strokeColor = "") %>%
-dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
 })
diff --git a/tab_documentation/desktop_events.md 
b/tab_documentation/desktop_events.md
index 94c5f95..be4d9d8 100644
--- a/tab_documentation/desktop_events.md
+++ b/tab_documentation/desktop_events.md
@@ -21,6 +21,7 @@
 * Data in late September/early October 2015 is unavailable due to another bug 
in EventLogging as a whole, which impacted data collection.
 * '__A__': we switched to using data from 
[Schema:TestSearchSatisfaction2](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2)
 instead of [Schema:Search](https://meta.wikimedia.org/wiki/Schema:Search) for 
Desktop event counts and load times on 12 July 2016.
 * '__R__': on 2017-01-01 we started calculating all of Discovery's metrics 
using a new version of [our data retrieval and processing 
codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated 
to [Wikimedia 

[MediaWiki-commits] [Gerrit] wikimedia...polloi[master]: Fix a bug in function 'compress' when number is between 0 and 1

2017-08-08 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/370756 )

Change subject: Fix a bug in function 'compress' when number is between 0 and 1
..

Fix a bug in function 'compress' when number is between 0 and 1

Change-Id: I25ef7d1332d0bacafbd07d7ee64c6c22e3dd7bbd
---
M R/maths.R
M tests/testthat/test-maths.R
2 files changed, 3 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/polloi refs/changes/56/370756/1

diff --git a/R/maths.R b/R/maths.R
index aab5b7f..5cfd017 100644
--- a/R/maths.R
+++ b/R/maths.R
@@ -22,5 +22,6 @@
 #' @export
 compress <- function(x, round_by = 2) {
   div <- findInterval(x, c(1, 1e3, 1e6, 1e9, 1e12))
-  return(paste0(round( x / 10 ^ (3 * (div - 1)), round_by), c("", "", "K", "M", "B", "T")[div + 1]))
+  return(paste0(round( x / 10 ^ (3 * ifelse(div - 1 < 0, 0, div - 1)), round_by),
+c("", "", "K", "M", "B", "T")[div + 1]))
 }
diff --git a/tests/testthat/test-maths.R b/tests/testthat/test-maths.R
index 9a84d57..2455e10 100644
--- a/tests/testthat/test-maths.R
+++ b/tests/testthat/test-maths.R
@@ -7,7 +7,7 @@
 })
 
 test_that("suffixes", {
-  expect_equal(compress(c(0, 1, 10, 100)), c("0", "1", "10", "100"))
+  expect_equal(compress(c(0, 0.5, 1, 10, 100)), c("0", "0.5", "1", "10", "100"))
   expect_equal(compress(1.642e3, round_by = 1), "1.6K")
   expect_equal(compress(c(10, 1e6, 1e12, 1e9)), c("100K", "1M", "1T", "1B"))
   expect_equal(compress(c(0, 1, 1e6, 1e3)), c("0", "1", "1M", "1K"))
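
The bug fixed here is that `findInterval()` returns 0 for inputs below 1, so the old exponent `3 * (div - 1)` became -3 and 0.5 was rendered as "500". A standalone sketch of the corrected logic follows. The name `compress_sketch` and the use of `pmax()` in place of `ifelse()` are choices made for this illustration, not the patch itself.

```r
compress_sketch <- function(x, round_by = 2) {
  # findInterval() gives 0 for x < 1, 1 for [1, 1e3), 2 for [1e3, 1e6), and so on.
  div <- findInterval(x, c(1, 1e3, 1e6, 1e9, 1e12))
  # Clamp the exponent at 0 so values below 1 are not scaled up by 1000.
  exponent <- 3 * pmax(div - 1, 0)
  suffix <- c("", "", "K", "M", "B", "T")[div + 1]
  paste0(round(x / 10 ^ exponent, round_by), suffix)
}

print(compress_sketch(c(0, 0.5, 1, 10, 100)))
print(compress_sketch(1.642e3, round_by = 1))
```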

-- 
To view, visit https://gerrit.wikimedia.org/r/370756
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I25ef7d1332d0bacafbd07d7ee64c6c22e3dd7bbd
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/polloi
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...prince[develop]: Get all country names with portal traffic from polloi

2017-07-24 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/367459 )

Change subject: Get all country names with portal traffic from polloi
..

Get all country names with portal traffic from polloi

Bug: T167913
Change-Id: I781a1a11844df5599d8535df5e7ce440ea81428f
---
M extras.R
1 file changed, 2 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/prince refs/changes/59/367459/1

diff --git a/extras.R b/extras.R
index 69c502e..605a49d 100644
--- a/extras.R
+++ b/extras.R
@@ -29,7 +29,8 @@
 )
 
 # For selectizeInput in ui.R
-all_country_names <- c("Zimbabwe", "Zambia", "Yemen", "Virgin Islands, 
British", "Viet Nam", "Venezuela, Bolivarian Republic of", "Uzbekistan", "U.S. 
(West)", "U.S. (South)", "U.S. (Pacific)", "U.S. (Other)", "U.S. (Northeast)", 
"U.S. (Midwest)", "Uruguay", "United Kingdom", "United Arab Emirates", 
"Ukraine", "Uganda", "Turkmenistan", "Turkey", "Tunisia", "Trinidad and 
Tobago", "Timor-Leste", "Thailand", "Tanzania, United Republic of", 
"Tajikistan", "Taiwan, Province of China", "Syrian Arab Republic", 
"Switzerland", "Sweden", "Suriname", "Sudan", "Sri Lanka", "Spain", "South 
Africa", "Somalia", "Slovenia", "Slovakia", "Singapore", "Seychelles", 
"Serbia", "Senegal", "Saudi Arabia", "Rwanda", "Russian Federation", "Romania", 
"Qatar", "Portugal", "Poland", "Philippines", "Peru", "Paraguay", "Papua New 
Guinea", "Panama", "Palestine, State of", "Pakistan", "Other", "Oman", 
"Norway", "Nigeria", "Niger", "Nicaragua", "New Zealand", "Netherlands", 
"Nepal", "Namibia", "Myanmar", "Mozambique", "Morocco", "Montenegro", 
"Mongolia", "Moldova, Republic of", "Mexico", "Mauritius", "Mauritania", 
"Martinique", "Mali", "Malaysia", "Malawi", "Madagascar", "Macedonia, Republic 
of", "Macao", "Luxembourg", "Lithuania", "Libya", "Lebanon", "Latvia", "Lao 
People's Democratic Republic", "Kyrgyzstan", "Kuwait", "Korea, Republic of", 
"Kenya", "Kazakhstan", "Jordan", "Jersey", "Japan", "Jamaica", "Italy", 
"Israel", "Ireland", "Iraq", "Iran, Islamic Republic of", "Indonesia", "India", 
"Iceland", "Hungary", "Hong Kong", "Honduras", "Haiti", "Guernsey", 
"Guatemala", "Greenland", "Greece", "Ghana", "Germany", "Georgia", "French 
Polynesia", "France", "Finland", "Fiji", "Ethiopia", "Estonia", "El Salvador", 
"Egypt", "Ecuador", "Dominican Republic", "Dominica", "Djibouti", "Denmark", 
"Czechia", "Cyprus", "Curacao", "Cuba", "Croatia", "Cote d'Ivoire", "Costa 
Rica", "Congo, The Democratic Republic of the", "Colombia", "China", "Chile", 
"Canada", "Cameroon", "Cambodia", "Burkina Faso", "Bulgaria", "British Indian 
Ocean Territory", "Brazil", "Botswana", "Bolivia, Plurinational State of", 
"Bhutan", "Benin", "Belgium", "Belarus", "Barbados", "Bangladesh", "Bahrain", 
"Azerbaijan", "Austria", "Australia", "Aruba", "Armenia", "Argentina", 
"Angola", "Algeria", "Albania", "Afghanistan", "Togo", "Malta", "Guadeloupe", 
"Gibraltar", "Gabon", "Faroe Islands", "Congo", "Cayman Islands", "Brunei 
Darussalam", "Bosnia and Herzegovina", "Bahamas", "Reunion", "Maldives", 
"Guyana", "Guinea", "Cabo Verde", "Burundi", "Antigua and Barbuda", 
"Swaziland", "Saint Lucia", "Isle of Man", "Gambia", "Central African 
Republic", "Belize", "Vanuatu", "Sierra Leone", "Saint Kitts and Nevis", "New 
Caledonia", "Lesotho", "Solomon Islands", "French Guiana", "Chad", "Bermuda", 
"Turks and Caicos Islands", "Liberia", "Comoros", "Bonaire, Sint Eustatius and 
Saba", "Aland Islands", "Grenada", "Mayotte", "Liechtenstein", "Samoa", 
"Equatorial Guinea", "Andorra", "South Sudan", "Saint Martin (French part)", 
"Saint Vincent and the Grenadines", "Holy See (Vatican City State)", 
"Guinea-Bissau", "Eritrea", "Saint Barthelemy", "Cook Islands", "Sint Maarten 
(Dutch part)", "Sao Tome and Principe", "Anguilla", "Monaco", "Kiribati", 
"Micronesia, Federated States of", "San Marino", "United States")
+data(portal_regions, package = "polloi")
+all_country_names <- portal_regions
 
 fill_out <- function(x, start_date, end_date, fill = 0) {
   temp <- dplyr::data_frame(date = seq(start_date, end_date, "day"))

-- 
To view, visit https://gerrit.wikimedia.org/r/367459
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I781a1a11844df5599d8535df5e7ce440ea81428f
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/prince
Gerrit-Branch: develop
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Use new functions in polloi to get geo data

2017-07-24 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/367456 )

Change subject: Use new functions in polloi to get geo data
..

Use new functions in polloi to get geo data

Bug: T167913
Change-Id: I00cad391fa22399f583eb7791256c1eb25ba611b
---
M modules/metrics/portal/geographic_breakdown.R
1 file changed, 2 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/56/367456/1

diff --git a/modules/metrics/portal/geographic_breakdown.R b/modules/metrics/portal/geographic_breakdown.R
index 606e6a0..3456a45 100644
--- a/modules/metrics/portal/geographic_breakdown.R
+++ b/modules/metrics/portal/geographic_breakdown.R
@@ -66,23 +66,11 @@
 } else {
   results$ts <- as.POSIXct(results$ts, format = "%Y%m%d%H%M%S")
   # Geography data that is common to both outputs:
-  data("ISO_3166_1", package = "ISOcodes")
-  # Remove accents because Reportupdater requires ASCII:
-  ISO_3166_1$Name <- stringi::stri_trans_general(ISO_3166_1$Name, 
"Latin-ASCII")
-  us_other_abb <- c("AS", "GU", "MP", "PR", "VI")
-  us_other_mask <- match(us_other_abb, ISO_3166_1$Alpha_2)
-  regions <- data.frame(abb = c(paste0("US:", c(as.character(state.abb), 
"DC")), us_other_abb),
-region = paste0("U.S. (", 
c(as.character(state.region), "South", rep("Other",5)), ")"),
-state = c(state.name, "District of Columbia", 
ISO_3166_1$Name[us_other_mask]),
-stringsAsFactors = FALSE)
-  regions$region[regions$region == "U.S. (North Central)"] <- "U.S. (Midwest)"
-  regions$region[c(state.division == "Pacific", rep(FALSE, 5))] <- "U.S. 
(Pacific)" # see https://phabricator.wikimedia.org/T136257#2399411
+  regions <- polloi::get_us_state()
   library(magrittr) # Required for piping
   if (opt$include_all) {
 # Generate all countries breakdown
-all_countries <- data.frame(abb = c(regions$abb, 
ISO_3166_1$Alpha_2[-us_other_mask]),
-name = c(regions$region, 
ISO_3166_1$Name[-us_other_mask]),
-stringsAsFactors = FALSE)
+all_countries <- polloi::get_country_state()
 data_w_countryname <- results %>%
   dplyr::mutate(country = ifelse(country %in% all_countries$abb, country, 
"Other")) %>%
   dplyr::left_join(all_countries, by = c("country" = "abb")) %>%
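
For readers without the polloi package at hand, the state-to-region mapping that the removed lines built can be approximated with R's built-in `state` datasets. This sketch is not `polloi::get_us_state()`, whose exact output is not shown in this change. It also leaves out the five U.S. territories, which the removed code took from the ISOcodes package.

```r
# Built-in datasets: state.abb, state.name, state.region, state.division.
regions_sketch <- data.frame(
  abb = paste0("US:", c(state.abb, "DC")),
  region = paste0("U.S. (", c(as.character(state.region), "South"), ")"),
  state = c(state.name, "District of Columbia"),
  stringsAsFactors = FALSE
)

# Rename the Census "North Central" region to "Midwest".
regions_sketch$region[regions_sketch$region == "U.S. (North Central)"] <- "U.S. (Midwest)"

# Split the Pacific division out of the West; the trailing FALSE covers DC.
regions_sketch$region[c(state.division == "Pacific", FALSE)] <- "U.S. (Pacific)"

print(table(regions_sketch$region))
```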

-- 
To view, visit https://gerrit.wikimedia.org/r/367456
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I00cad391fa22399f583eb7791256c1eb25ba611b
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
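
The change above replaces an inline lookup table with a call into polloi, then recodes unknown codes to "Other" and joins the names on. As a rough base-R sketch of that recode-and-join step (the lookup rows and the `events` column below are made up for illustration, and the real data frame returned by polloi may have different contents; filling the joined name with "Other" is also an assumption of this sketch, not something the diff shows):

```r
# Hypothetical lookup table standing in for the data frame that the polloi
# helper returns; real rows and labels may differ.
all_countries <- data.frame(
  abb = c("US:CA", "US:NY", "FR", "DE"),
  name = c("U.S. (Pacific)", "U.S. (Northeast)", "France", "Germany"),
  stringsAsFactors = FALSE
)

# Made-up query results keyed by country or U.S. state code.
results <- data.frame(
  country = c("FR", "US:CA", "ZZ", "DE", "XX"),
  events = c(10, 25, 3, 7, 1),
  stringsAsFactors = FALSE
)

# Collapse every code that is missing from the lookup into "Other".
results$country <- ifelse(results$country %in% all_countries$abb, results$country, "Other")

# Left join by code: keep every result row and attach a name where one exists.
results$name <- all_countries$name[match(results$country, all_countries$abb)]
# Assumption of this sketch: unmatched rows are labelled "Other".
results$name[is.na(results$name)] <- "Other"

print(results)
```

Using `match()` keeps the row order of `results`, which is the behaviour a left join is expected to have here.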


[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Fix sister search traffic query

2017-07-13 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/365190 )

Change subject: Fix sister search traffic query
..


Fix sister search traffic query

- Creates a new category in the 'language' column for
  French and Catalan since they have their own sister
  search sidebar that shows up in addition to ours.
- Makes some adjustments for how SERPs are detected.

We'll need to clear out the current data and do a full
recount with this query. Since the backfill is only
from 2017-06-01, we should be OK as the original data
should still be present.

Bug: T164854, T170183
Change-Id: Ic21aeac43891ebb1b65696fe8e907bb959a4d7b7
---
M modules/metrics/search/sister_search_traffic
1 file changed, 11 insertions(+), 6 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/modules/metrics/search/sister_search_traffic 
b/modules/metrics/search/sister_search_traffic
index 3dc2852..76c091c 100755
--- a/modules/metrics/search/sister_search_traffic
+++ b/modules/metrics/search/sister_search_traffic
@@ -12,8 +12,12 @@
  WHEN 'species' THEN 'wikispecies'
  ELSE normalized_host.project_class
 END AS project,
-IF(normalized_host.project IN('commons', 'meta', 'simple', 'incubator', 
'species'), '',
-   IF(normalized_host.project = 'en', 'English', 'Other languages')) AS 
language,
+CASE WHEN normalized_host.project IN('commons', 'meta', 'simple', 
'incubator', 'species') THEN ''
+ WHEN normalized_host.project = 'en' THEN 'English'
+ -- frwiki and cawiki use homebrew sister search that shows up in 
addition to ours
+ WHEN normalized_host.project IN('ca', 'fr') THEN 'French and Catalan'
+ ELSE 'Other languages'
+END AS language,
 -- flag for pageviews that are search results pages (e.g. if user clicked 
to see more results from a sister project):
 (
   page_id IS NULL
@@ -22,8 +26,8 @@
 OR (
   uri_path = '/w/index.php'
   AND (
-uri_query RLIKE '^\?search\='
-OR INSTR(uri_query, '?title=Special:Search=') > 0
+PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 
'QUERY', 'search') IS NOT NULL
+OR PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 
'QUERY', 'searchToken') IS NOT NULL
   )
 )
   )
@@ -34,10 +38,11 @@
 AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) >= '$1'
 AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) < '$2'
 AND is_pageview
+-- only those that have been referred by a search results page on a 
wikipedia:
 AND referer_class = 'internal'
 AND (
-  INSTR(referer, '/w/index.php?search=') > 0
-  OR INSTR(referer, '/wiki/Special:Search?search=') > 0
+  PARSE_URL(referer, 'QUERY', 'search') IS NOT NULL
+  OR PARSE_URL(referer, 'QUERY', 'searchToken') IS NOT NULL
 )
 -- warning: comparing uri_host = PARSE_URL(referer, 'HOST') would mark 
'en.m.wikipedia.org' as a sister of 'en.wikipedia.org'
 AND normalize_host(PARSE_URL(referer, 'HOST')).project_class = 'wikipedia'

-- 
To view, visit https://gerrit.wikimedia.org/r/365190
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic21aeac43891ebb1b65696fe8e907bb959a4d7b7
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Bearloga 
Gerrit-Reviewer: Chelsyx 
Gerrit-Reviewer: EBernhardson 
Gerrit-Reviewer: HaeB 
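
The query change above moves from substring checks on the referer to looking up named query parameters (`search`, `searchToken`). The earlier checks only matched when `search` was the first parameter after the path. A minimal base-R sketch of the same idea follows; `has_query_param` is a hypothetical helper written for this illustration, not part of golden or polloi, and it only approximates what Hive's `PARSE_URL(url, 'QUERY', key)` does.

```r
# Hypothetical helper: TRUE when `url` carries a query parameter named `key`.
has_query_param <- function(url, key) {
  # Drop any "#fragment", then keep only what follows the first "?".
  no_fragment <- sub("#.*$", "", url)
  query <- ifelse(
    grepl("?", no_fragment, fixed = TRUE),
    sub("^[^?]*\\?", "", no_fragment),
    ""
  )
  # Split "a=1&b=2" into pairs and compare the names that precede "=".
  vapply(strsplit(query, "&", fixed = TRUE), function(pairs) {
    key %in% sub("=.*$", "", pairs)
  }, logical(1))
}

# Made-up referers for illustration.
referers <- c(
  "https://en.wikipedia.org/w/index.php?search=cats",
  "https://en.wikipedia.org/w/index.php?title=Special:Search&search=cats&fulltext=1",
  "https://en.wikipedia.org/w/index.php?title=Special:Search&searchToken=abc123",
  "https://en.wikipedia.org/wiki/Cat",
  "https://en.wikipedia.org/w/index.php?title=Research"
)

# Mirrors the OR of the two parameter checks in the updated WHERE clause.
from_serp <- has_query_param(referers, "search") | has_query_param(referers, "searchToken")
print(from_serp)
```

The second referer is the kind of URL the parameter-based check picks up and a check for `/w/index.php?search=` does not.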



[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Annotate PaulScore decrease as a result of sampling rate change

2017-07-10 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/364333 )

Change subject: Annotate PaulScore decrease as a result of sampling rate change
..


Annotate PaulScore decrease as a result of sampling rate change

Bug: T168466
Change-Id: I9bf40ce1804ee679c24f664900db55a48e88f5e2
---
M modules/desktop/paulscore.R
M tab_documentation/paulscore_approx.html
2 files changed, 5 insertions(+), 2 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/modules/desktop/paulscore.R b/modules/desktop/paulscore.R
index ecfc79e..144569b 100644
--- a/modules/desktop/paulscore.R
+++ b/modules/desktop/paulscore.R
@@ -9,7 +9,8 @@
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_paulscore_approx)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "PaulScore", title = "PaulScore 
for fulltext searches, by day", use_si = FALSE, group = "paulscore_approx") %>%
 dyRangeSelector %>%
-dyLegend(labelsDiv = "paulscore_approx_legend", show = "always")
+dyLegend(labelsDiv = "paulscore_approx_legend", show = "always") %>%
+dyEvent(as.Date("2017-04-19"), "A (rates)", labelLoc = "bottom")
   if (input$paulscore_relative) {
 dyOut <- dyAxis(dyOut, "y", axisLabelFormatter = "function(x) { return 
Math.round(100 * x, 3) + '%'; }", valueFormatter = "function(x) { return 
Math.round(100 * x, 3) + '%'; }")
   }
@@ -27,7 +28,8 @@
 polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_paulscore_approx)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "PaulScore", title = "PaulScore 
for autocomplete searches, by day", use_si = FALSE, group = "paulscore_approx") 
%>%
 dyRangeSelector %>%
-dyLegend(labelsDiv = "paulscore_approx_legend", show = "always")
+dyLegend(labelsDiv = "paulscore_approx_legend", show = "always") %>%
+dyEvent(as.Date("2017-04-19"), "A (rates)", labelLoc = "bottom")
   if (input$paulscore_relative) {
 dyOut <- dyAxis(dyOut, "y", axisLabelFormatter = "function(x) { return 
Math.round(100 * x, 3) + '%'; }", valueFormatter = "function(x) { return 
Math.round(100 * x, 3) + '%'; }")
   }
diff --git a/tab_documentation/paulscore_approx.html 
b/tab_documentation/paulscore_approx.html
index 0a5b441..b73d61b 100644
--- a/tab_documentation/paulscore_approx.html
+++ b/tab_documentation/paulscore_approx.html
@@ -25,6 +25,7 @@
 
 
   'R': on 2017-01-01 we started calculating all of 
Discovery's metrics using a new version of <a href="https://phabricator.wikimedia.org/diffusion/WDGO/">our data retrieval and 
processing codebase</a> that we migrated to <a href="https://www.mediawiki.org/wiki/Analytics">Wikimedia Analytics</a>' <a href="https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater">Reportupdater</a>
 infrastructure. See <a href="https://phabricator.wikimedia.org/T150915">T150915</a> for more 
details.
+  'A': on 2017-04-19 we changed the rates at which users 
are put into event logging (see <a href="https://phabricator.wikimedia.org/T163273" title="Phabricator ticket: 
Adjust search satisfaction sampling rate">T163273</a>. Specifically, we 
decreased the rate on English Wikipedia ("EnWiki") and increased it everywhere 
else, and since EnWiki generally has higher PaulScore than other projects, we 
effectively lowered the overall PaulScore by lessening EnWiki's contribution. 
See <a href="https://phabricator.wikimedia.org/T168466" title="Phabricator 
ticket: Investigate PaulScores for late April and May for full-text 
searches">T168466</a> for more details.
 
 
 Questions, bug reports, and feature suggestions

-- 
To view, visit https://gerrit.wikimedia.org/r/364333
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I9bf40ce1804ee679c24f664900db55a48e88f5e2
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/rainbow
Gerrit-Branch: develop
Gerrit-Owner: Bearloga 
Gerrit-Reviewer: Chelsyx 
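
The 'A' annotation above describes a mix-shift effect: the per-wiki scores need not change for the overall score to drop when the sampling rates change. A small worked example in R makes the arithmetic concrete. Every number below is invented for illustration and none of it comes from the dashboards.

```r
# Made-up average PaulScores; EnWiki is assumed to score higher than the rest.
paulscore <- c(enwiki = 0.30, other = 0.20)

# Hypothetical numbers of sampled sessions before and after the rate change:
# EnWiki's rate goes down, everyone else's goes up.
sessions_before <- c(enwiki = 8000, other = 2000)
sessions_after <- c(enwiki = 4000, other = 6000)

# The overall score is a session-weighted mean of the per-wiki scores.
overall_before <- weighted.mean(paulscore, sessions_before)
overall_after <- weighted.mean(paulscore, sessions_after)

print(c(before = overall_before, after = overall_after))
```

With these made-up inputs the overall value moves from 0.28 to 0.24 even though neither per-wiki score changed, which is the effect the annotation attributes to the lower EnWiki contribution.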



[MediaWiki-commits] [Gerrit] wikimedia...polloi[master]: Add capitalization function

2017-07-07 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/363867 )

Change subject: Add capitalization function
..


Add capitalization function

Also adds correct licensing info for code from Stack Overflow

Change-Id: I936b59b728e61fad07102c7e79a94d0754784607
---
M DESCRIPTION
M NAMESPACE
M NEWS.md
M R/manipulate.R
M R/maths.R
A man/capitalize_first_letter.Rd
M man/compress.Rd
M tests/testthat/test-manipulation.R
8 files changed, 60 insertions(+), 3 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/DESCRIPTION b/DESCRIPTION
index 94d9e10..5b3828b 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,12 +1,13 @@
 Package: polloi
 Type: Package
 Title: Common Functionality for Wikimedia Dashboards
-Version: 0.2.0
-Date: 2017-06-28
+Version: 0.2.1
+Date: 2017-07-07
 Authors@R: c(
 person("Mikhail", "Popov", email = "mikh...@wikimedia.org", role = 
c("aut", "cre")),
 person("Chelsy", "Xie", email = "c...@wikimedia.org", role = "aut"),
-person("Oliver", "Keyes", role = "aut", comment = "No longer employed at 
the Foundation")
+person("Oliver", "Keyes", role = "aut", comment = "No longer employed at 
the Foundation"),
+person("Andrie", "de Vries", role = "ctb", comment = "Capitalization code 
from StackOverflow")
 )
 Description: This package contains common functionality for all of the
 Wikimedia Foundation's Shiny Dashboards.
diff --git a/NAMESPACE b/NAMESPACE
index 1b06a9b..3542a1d 100644
--- a/NAMESPACE
+++ b/NAMESPACE
@@ -1,6 +1,7 @@
 # Generated by roxygen2: do not edit by hand
 
 export(automata_select)
+export(capitalize_first_letter)
 export(cbind_fill)
 export(check_past_week)
 export(check_yesterday)
diff --git a/NEWS.md b/NEWS.md
index 713fb16..4a41b2b 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,3 +1,7 @@
+polloi 0.2.1
+
+- Adds `capitalize_first_letter`
+
 polloi 0.2.0
 
 - Adds unit tests and lint checking 
([T145445](https://phabricator.wikimedia.org/T145445)).
diff --git a/R/manipulate.R b/R/manipulate.R
index e9e9fb1..01294b0 100644
--- a/R/manipulate.R
+++ b/R/manipulate.R
@@ -117,3 +117,16 @@
   }
   return(no_set)
 }
+
+#' @title Capitalize First Letter Of Every Word
+#' @description Capitalizes the first letter of every word.
+#' @details This function is made available under CC-BY-SA 3.0
+#' @param x character vector
+#' @author [Andrie de Vries](https://stackoverflow.com/users/602276/andrie)
+#' @source 
\url{https://stackoverflow.com/questions/6364783/capitalize-the-first-letter-of-both-words-in-a-two-word-string}
+#' @export
+capitalize_first_letter <- function(x) {
+  return(vapply(strsplit(x, " "), function(s) {
+return(paste0(toupper(substring(s, 1, 1)), substring(s, 2), collapse = " 
"))
+  }, ""))
+}
diff --git a/R/maths.R b/R/maths.R
index aab5b7f..3308e91 100644
--- a/R/maths.R
+++ b/R/maths.R
@@ -16,8 +16,11 @@
 #' @title Convert Numeric Values to use SI suffixes
 #' @description takes a numeric vector (e.g. 1200, 130) and converts it to
 #'   use SI suffixes (e.g. 1.2K, 1.3M)
+#' @details This function is made available under CC-BY-SA 3.0
 #' @param x a vector of numeric or integer values
 #' @param round_by how many digits to round the resulting numbers by
+#' @author Original code: [42-](https://stackoverflow.com/users/1855677/42);
+#'   improvement: Mikhail
 #' @references 
\url{https://stackoverflow.com/questions/28159936/formatting-large-currency-or-dollar-values-to-millions-billions/}
 #' @export
 compress <- function(x, round_by = 2) {
diff --git a/man/capitalize_first_letter.Rd b/man/capitalize_first_letter.Rd
new file mode 100644
index 000..184da6e
--- /dev/null
+++ b/man/capitalize_first_letter.Rd
@@ -0,0 +1,23 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/manipulate.R
+\name{capitalize_first_letter}
+\alias{capitalize_first_letter}
+\title{Capitalize First Letter Of Every Word}
+\source{
+\url{https://stackoverflow.com/questions/6364783/capitalize-the-first-letter-of-both-words-in-a-two-word-string}
+}
+\usage{
+capitalize_first_letter(x)
+}
+\arguments{
+\item{x}{character vector}
+}
+\description{
+Capitalizes the first letter of every word.
+}
+\details{
+This function is made available under CC-BY-SA 3.0
+}
+\author{
+\href{https://stackoverflow.com/users/602276/andrie}{Andrie de Vries}
+}
diff --git a/man/compress.Rd b/man/compress.Rd
index fb16556..85826c6 100644
--- a/man/compress.Rd
+++ b/man/compress.Rd
@@ -15,6 +15,13 @@
 takes a numeric vector (e.g. 1200, 130) and converts it to
 use SI suffixes (e.g. 1.2K, 1.3M)
 }
+\details{
+This function is made available under CC-BY-SA 3.0
+}
 \references{
 
\url{https://stackoverflow.com/questions/28159936/formatting-large-currency-or-dollar-values-to-millions-billions/}
 }
+\author{
+Original code: \href{https://stackoverflow.com/users/1855677/42}{42-};
+improvement: Mikhail
+}
diff 
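
For reference, a short usage sketch of the function added in R/manipulate.R above. The definition is copied from the diff so the snippet runs without polloi installed; the input strings are made up.

```r
# Definition as it appears in the patch to R/manipulate.R.
capitalize_first_letter <- function(x) {
  return(vapply(strsplit(x, " "), function(s) {
    return(paste0(toupper(substring(s, 1, 1)), substring(s, 2), collapse = " "))
  }, ""))
}

# Made-up inputs: each element is split on spaces, each word gets an
# upper-cased first letter, and the words are pasted back together.
labels <- capitalize_first_letter(c("search results page", "wikimedia commons"))
print(labels)
```

Because the function only touches the first character of each word, letters that are already upper case elsewhere in a word are left as they are.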

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Sister search traffic changes per Deb's feedback

2017-07-07 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/363752 )

Change subject: Sister search traffic changes per Deb's feedback
..


Sister search traffic changes per Deb's feedback

- Corrects referenced ticket
- Changes title and summary
- Adds more notes and explanations
- Changes the UI/UX to a "choose-your-own-split-by-combo"
  adventure using checkboxes
- Adds annotations for sister search on KPI::LoadTimes and
  Desktop::LoadTimes dashboards because it's relevant

Bug: T164854
Change-Id: I5c1e4db0b2b92ad3b28d74b8113a511704946326
---
M server.R
M tab_documentation/desktop_load.md
M tab_documentation/kpi_load_time.md
M tab_documentation/sister_search_traffic.md
M ui.R
A www/js4checkbox.js
6 files changed, 67 insertions(+), 36 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/server.R b/server.R
index 49c429b..972cd74 100644
--- a/server.R
+++ b/server.R
@@ -83,7 +83,8 @@
   polloi::make_dygraph(xlab = "Date", ylab = "Load time (ms)", title = 
"Desktop load times, by day", use_si = FALSE) %>%
   dyRangeSelector %>%
   dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") 
%>%
-  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
+  dyEvent(as.Date("2017-06-15"), "B (sister search)", labelLoc = "bottom")
   })
 
   output$paulscore_approx_plot_fulltext <- renderDygraph({
@@ -363,34 +364,37 @@
 
   # Sister Search
   output$sister_search_traffic_plot <- renderDygraph({
-switch(
-  input$sister_search_traffic_split,
-  "none" = {
-sister_search_traffic %>%
+# Code that prepares a custom data.frame 'sst'
+# that will then be processed in a generic way:
+if (length(input$sister_search_traffic_split) == 0) {
+  sst <- sister_search_traffic %>%
   dplyr::mutate(split = "Sister search traffic")
-  },
-  "project" = {
-sister_search_traffic %>%
-  dplyr::rename(split = project)
-  },
-  "destination" = {
-sister_search_traffic %>%
-  dplyr::mutate(split = dplyr::if_else(is_serp, "Search results page", 
"Article"))
-  },
-  "language" = {
-sister_search_traffic %>%
-  dplyr::filter(project != "wikimedia commons", !is.na(language)) %>%
-  dplyr::mutate(split = language)
-  },
-  "access_method" = {
-sister_search_traffic %>%
-  dplyr::mutate(split = access_method)
+} else {
+  split_by <- head(input$sister_search_traffic_split, 2)
+  sst <- sister_search_traffic
+  if ("language" %in% split_by) {
+sst <- dplyr::filter(sst, !is.na(language))
   }
-) %>%
+  if ("destination" %in% split_by) {
+sst <- dplyr::mutate(sst, destination = dplyr::if_else(is_serp, 
"Search results page", "Article"))
+  }
+  if (length(split_by) == 1) {
+sst$split <- sst[[split_by[1]]]
+  } else {
+sst$split <- paste0(sst[[split_by[1]]], " (", sst[[split_by[2]]], ")")
+  }
+}
+# Code that works on the prepared dataset:
+sst %>%
   dplyr::group_by(date, split) %>%
   dplyr::summarize(pageviews = sum(pageviews)) %>%
   tidyr::spread(split, pageviews, fill = 0) %>%
-  polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_sister_search_traffic_plot)) %>%
+  {
+# Reorder columns according to the last observed values:
+cols <- unlist(polloi::safe_tail(., 1)[, -1])
+.[, c(1, order(cols, decreasing = TRUE) + 1)]
+  } %>%
+  polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_sister_search_traffic_plot), rename = FALSE) %>%
   polloi::make_dygraph(xlab = "Date", ylab = "Pageviews", title = "Traffic 
to sister projects from Wikipedia SERPs") %>%
   dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = 
polloi::custom_axis_formatter,
  axisLabelWidth = 100, pixelsPerLabel = 80) %>%
@@ -735,7 +739,8 @@
  dyCSS(css = system.file("custom.css", package = "polloi")) %>%
  dyRangeSelector %>%
  dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = 
"bottom") %>%
- dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = 
"bottom"))
+ dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = 
"bottom") %>%
+ dyEvent(as.Date("2017-06-15"), "B (sister search)", labelLoc = 
"bottom"))
   })
   output$kpi_zero_results_series <- renderDygraph({
 smooth_level <- input$smoothing_kpi_zero_results
diff --git a/tab_documentation/desktop_load.md 
b/tab_documentation/desktop_load.md
index be3d643..dcd55c0 100644
--- a/tab_documentation/desktop_load.md
+++ b/tab_documentation/desktop_load.md
@@ -10,7 
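
The server.R change above builds a combined `split` label from up to two selected columns and then reorders the series by their most recent values. The following base-R sketch walks through the same two steps on made-up data. It uses `tapply()` in place of the dplyr/tidyr pipeline in the patch, so it illustrates the idea and is not the dashboard code.

```r
# Made-up traffic counts for illustration.
sst <- data.frame(
  date = as.Date(c("2017-06-15", "2017-06-15", "2017-06-16", "2017-06-16")),
  project = c("wiktionary", "wikisource", "wiktionary", "wikisource"),
  access_method = c("desktop", "mobile web", "desktop", "mobile web"),
  pageviews = c(120, 40, 200, 150),
  stringsAsFactors = FALSE
)

# As in the patch, at most the first two selected split columns are used.
split_by <- head(c("project", "access_method", "language"), 2)
if (length(split_by) == 1) {
  sst$split <- sst[[split_by[1]]]
} else {
  sst$split <- paste0(sst[[split_by[1]]], " (", sst[[split_by[2]]], ")")
}

# Sum pageviews into a date-by-split matrix; absent combinations become 0.
wide <- tapply(sst$pageviews, list(sst$date, sst$split), sum)
wide[is.na(wide)] <- 0

# Reorder the columns by the values observed on the last date, largest first.
last_values <- wide[nrow(wide), ]
wide <- wide[, order(last_values, decreasing = TRUE), drop = FALSE]

print(wide)
```

Ordering by the last observed values puts the currently largest series first in the legend, which appears to be the intent of the reordering step in the patch.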

[MediaWiki-commits] [Gerrit] wikimedia...polloi[master]: Fix spline smoothing and add tests

2017-06-29 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/362107 )

Change subject: Fix spline smoothing and add tests
..


Fix spline smoothing and add tests

- This patch fixes a bug wherein spline smoothing was
  broken.
- This patch also adds a bunch of unit tests (via
  testthat) because if we had those earlier, we would
  have known that the previous patch actually broke
  spline smoothing.
- There is now an example dataset (wdqs_usage) that is
  used in examples and some of the unit tests.
- This patch also finally fixes the issue with
  compress() wherein it returned really weird results
  if the input vector contained a 0.
- Also! There is lint checking now! It is included as a
  unit test.
- While I was at it, I fixed a bunch of stylistic
  issues (spacing, line lengths, single vs double
  quotes) and documentation issues (e.g. missing
  descriptions that `R CMD check` would yell about).

Bug: T169125, T153856
Change-Id: I5752d0a528bffb2bee6186d49efd4a751551cb95
---
M .Rbuildignore
A .lintr
M DESCRIPTION
M NAMESPACE
M NEWS.md
M R/check_notify.R
M R/data.R
M R/dygraphs.R
M R/manipulate.R
M R/maths.R
M R/reading.R
M R/shiny.R
M R/smoothing.R
M R/utils.R
A data/wdqs_usage.rda
M man/automata_select.Rd
M man/cbind_fill.Rd
M man/compress.Rd
M man/cond_color.Rd
M man/cond_icon.Rd
A man/get_sample_data.Rd
M man/make_dygraph.Rd
M man/parse_wikiid.Rd
M man/percent_change.Rd
M man/portal_regions.Rd
M man/read_dataset.Rd
M man/smart_palette.Rd
M man/smooth_select.Rd
M man/smoother.Rd
M man/subset_by_date_range.Rd
M man/timeframe_daterange.Rd
M man/timeframe_select.Rd
M man/update_prefixes.Rd
M man/update_projects.Rd
A man/wdqs_usage.Rd
M polloi.Rproj
A tests/testthat.R
A tests/testthat/test-manipulation.R
A tests/testthat/test-maths.R
A tests/testthat/test-smoothing.R
A tests/testthat/test-syntax.R
41 files changed, 412 insertions(+), 163 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/.Rbuildignore b/.Rbuildignore
index 8ed3933..a7ed3c6 100644
--- a/.Rbuildignore
+++ b/.Rbuildignore
@@ -1,4 +1,5 @@
 ^.*\.Rproj$
 ^\.Rproj\.user$
-.gitreview
+^\.gitreview$
 ^CONDUCT\.md$
+^\.lintr
diff --git a/.lintr b/.lintr
new file mode 100644
index 000..0c6cdb9
--- /dev/null
+++ b/.lintr
@@ -0,0 +1,4 @@
+linters: with_defaults(line_length_linter(120), object_usage_linter = NULL, 
closed_curly_linter = NULL, open_curly_linter = NULL)
+exclude: "# Exclude Linting"
+exclude_start: "# Begin Exclude Linting"
+exclude_end: "# End Exclude Linting"
diff --git a/DESCRIPTION b/DESCRIPTION
index 933d209..94d9e10 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,8 +1,8 @@
 Package: polloi
 Type: Package
 Title: Common Functionality for Wikimedia Dashboards
-Version: 0.1.9
-Date: 2017-06-26
+Version: 0.2.0
+Date: 2017-06-28
 Authors@R: c(
 person("Mikhail", "Popov", email = "mikh...@wikimedia.org", role = 
c("aut", "cre")),
 person("Chelsy", "Xie", email = "c...@wikimedia.org", role = "aut"),
@@ -36,7 +36,9 @@
 zoo
 Suggests:
 datasets,
-ISOcodes
+ISOcodes,
+lintr,
+testthat
 LazyData: TRUE
 Roxygen: list(markdown = TRUE)
 RoxygenNote: 6.0.1
diff --git a/NAMESPACE b/NAMESPACE
index 80881aa..1b06a9b 100644
--- a/NAMESPACE
+++ b/NAMESPACE
@@ -44,7 +44,6 @@
 importFrom(lubridate,ymd)
 importFrom(magrittr,"%>%")
 importFrom(magrittr,set_names)
-importFrom(readr,read_delim)
 importFrom(rvest,html_nodes)
 importFrom(rvest,html_table)
 importFrom(shiny,icon)
diff --git a/NEWS.md b/NEWS.md
index 21243cb..713fb16 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,12 +1,20 @@
+polloi 0.2.0
+
+- Adds unit tests and lint checking 
([T145445](https://phabricator.wikimedia.org/T145445)).
+- Adds an example dataset (`wdqs_usage`) that is used for running examples and 
tests.
+- Fixes problem with spline smoothing 
([T169125](https://phabricator.wikimedia.org/T169125)).
+- Fixes a whole bunch of stylistic issues (removes lints).
+- Fixes a bug with `compress()` wherein it would yield weird results if the 
input vector included a 0.
+
 polloi 0.1.9
 
-- Adds geography datasets and functions 
([T167913](https://phabricator.wikimedia.org/T167913))
+- Adds geography datasets and functions 
([T167913](https://phabricator.wikimedia.org/T167913)).
 
 polloi 0.1.8
 
-- Updates dataset of prefixes
-- Changes path to download datasets from
-- Uses latest roxygen with markdown support
+- Updates dataset of prefixes.
+- Changes path to download datasets from.
+- Uses latest roxygen with markdown support.
 
 polloi 0.1.7
 
diff --git a/R/check_notify.R b/R/check_notify.R
index 1e5c840..c6ff3d0 100644
--- a/R/check_notify.R
+++ b/R/check_notify.R
@@ -16,7 +16,7 @@
   # e.g. label = "desktop events"
   yesterday_date <- Sys.Date() - 1
   if (!(yesterday_date %in% dataset$date)) {
-return(notificationItem(text = paste("No", label," from yesterday."),
+
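
The commit message above mentions that `compress()` misbehaved when the input contained a 0, and that unit tests would have caught it. One way such a bug can arise (an assumption made for this sketch, not a reading of polloi's source) is a suffix lookup by index: an index of 0 silently drops the element in R. The function `compress_sketch` below is hypothetical and is followed by the kind of check a unit test could make.

```r
# Hypothetical SI-suffix formatter; not polloi's compress().
compress_sketch <- function(x, round_by = 2) {
  suffixes <- c("", "K", "M", "B", "T")
  # findInterval() returns 0 for values below 1 (including 0). Clamping to 1
  # keeps the suffix lookup from dropping those elements.
  div <- pmax(findInterval(abs(x), c(1, 1e3, 1e6, 1e9, 1e12)), 1)
  paste0(round(x / 10^(3 * (div - 1)), round_by), suffixes[div])
}

formatted <- compress_sketch(c(0, 130, 1200, 1300000))
print(formatted)

# The kind of assertion a unit test could make: one output per input,
# even when a 0 is present.
stopifnot(length(formatted) == 4)
```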

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Add sister search traffic

2017-06-28 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/361902 )

Change subject: Add sister search traffic
..


Add sister search traffic

- Adds a "Sister Search" section with a "Traffic" subsection

Bug: T164854
Change-Id: Ic89b51f3b89b25b50387389ef84ba9496423be4b
---
M server.R
A tab_documentation/sister_search_traffic.md
M ui.R
M utils.R
4 files changed, 91 insertions(+), 13 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/server.R b/server.R
index 0de1586..92e94c1 100644
--- a/server.R
+++ b/server.R
@@ -18,20 +18,22 @@
 read_desktop()
 progress$set(message = "Downloading apps data", value = 0.1)
 read_apps()
-progress$set(message = "Downloading mobile web data", value = 0.3)
+progress$set(message = "Downloading mobile web data", value = 0.2)
 read_web()
-progress$set(message = "Downloading API usage data", value = 0.4)
+progress$set(message = "Downloading API usage data", value = 0.3)
 read_api()
-progress$set(message = "Downloading zero results data", value = 0.5)
+progress$set(message = "Downloading zero results data", value = 0.4)
 read_failures()
-progress$set(message = "Downloading engagement data", value = 0.6)
+progress$set(message = "Downloading engagement data", value = 0.5)
 read_augmented_clickthrough()
-progress$set(message = "Downloading language-project engagement data", 
value = 0.7)
+progress$set(message = "Downloading language-project engagement data", 
value = 0.6)
 read_augmented_clickthrough_langproj()
-progress$set(message = "Downloading survival data", value = 0.8)
+progress$set(message = "Downloading survival data", value = 0.7)
 read_lethal_dose()
-progress$set(message = "Downloading PaulScore data", value = 0.9)
+progress$set(message = "Downloading PaulScore data", value = 0.8)
 read_paul_score()
+progress$set(message = "Downloading sister search data", value = 0.9)
+read_sister_search()
 progress$set(message = "Finished downloading datasets", value = 1)
 existing_date <<- Sys.Date()
 progress$close()
@@ -359,6 +361,40 @@
   dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
+  # Sister Search
+  output$sister_search_traffic_plot <- renderDygraph({
+switch(
+  input$sister_search_traffic_split,
+  "project" = {
+sister_search_traffic %>%
+  dplyr::rename(split = project)
+  },
+  "destination" = {
+sister_search_traffic %>%
+  dplyr::mutate(split = dplyr::if_else(is_serp, "Search results page", 
"Article"))
+  },
+  "language" = {
+sister_search_traffic %>%
+  dplyr::filter(project != "wikimedia commons", !is.na(language)) %>%
+  dplyr::mutate(split = language)
+  },
+  "access_method" = {
+sister_search_traffic %>%
+  dplyr::mutate(split = access_method)
+  }
+) %>%
+  dplyr::group_by(date, split) %>%
+  dplyr::summarize(pageviews = sum(pageviews)) %>%
+  tidyr::spread(split, pageviews, fill = 0) %>%
+  polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_sister_search_traffic_plot)) %>%
+  polloi::make_dygraph(xlab = "Date", ylab = "Pageviews", title = "Traffic 
to sister projects from Wikipedia SERPs") %>%
+  dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = 
polloi::custom_axis_formatter,
+ axisLabelWidth = 100, pixelsPerLabel = 80) %>%
+  dyLegend(labelsDiv = "sister_search_traffic_plot_legend") %>%
+  dyRangeSelector(fillColor = "", strokeColor = "") %>%
+  dyEvent(as.Date("2017-06-15"), "A (deployed)", labelLoc = "bottom")
+  })
+
   # Survival
   output$lethal_dose_plot <- renderDygraph({
 user_page_visit_dataset %>%
diff --git a/tab_documentation/sister_search_traffic.md 
b/tab_documentation/sister_search_traffic.md
new file mode 100644
index 000..6258a6b
--- /dev/null
+++ b/tab_documentation/sister_search_traffic.md
@@ -0,0 +1,28 @@
+Sister search traffic
+===
+Sister (cross-wiki) search is a feature that adds results from other projects 
to a sidebar on the search engine results page (SERP). For example: if there 
are additional results found, users are shown images from Wikimedia Commons, 
definitions from Wiktionary, and results from works on Wikisource. See 
[T146667](https://phabricator.wikimedia.org/T146667) for more details.
+
+Notes
+-
+Some communities (e.g. Italian Wikipedia) developed their own cross-wiki 
search results sidebars, which is why we see some sister traffic before the 
deployment of the sister search feature across all Wikipedias.
+
+__\*__ Users can click on a cross-wiki result or view all the results at the 
sister project
+
+__†__ This excludes the language-less Wikimedia Commons
+
+Outages and inaccuracies
+--
+* 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Add traffic from sister search

2017-06-27 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/361592 )

Change subject: Add traffic from sister search
..


Add traffic from sister search

Bug: T164854
Change-Id: I7632d68b560049a145d1bccf54cf12abf9095582
---
M docs/README.md
M modules/metrics/search/config.yaml
A modules/metrics/search/sister_search_traffic
3 files changed, 67 insertions(+), 2 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/docs/README.md b/docs/README.md
index 40b1c28..4f595f0 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -8,7 +8,7 @@
 infrastructure. These datasets provide the metrics that are used by
 [Discovery's Dashboards](https://discovery.wmflabs.org/)
 
-Last updated on 31 May 2017
+Last updated on 26 June 2017
 
 Daily Metrics
 -
@@ -137,6 +137,10 @@
 -   **cirrus\_langproj\_breakdown\_with\_automata.tsv**: Zero results
 and total searches broken down by language-project pairs (e.g.
 German Wikiquote ZRR vs. French Wikibooks ZRR)
+-   **sister\_search\_traffic.tsv**: Traffic to various wikis from
+Wikipedia search results pages; broken up by language, destination
+type (SERP vs not), and access method (desktop vs mobile web);
+excludes known automata
 
 wdqs/
 -
diff --git a/modules/metrics/search/config.yaml 
b/modules/metrics/search/config.yaml
index cffd6fc..46d9768 100644
--- a/modules/metrics/search/config.yaml
+++ b/modules/metrics/search/config.yaml
@@ -155,4 +155,10 @@
 starts: 2016-11-01
 funnel: true
 max_data_points: 30
-type: script
\ No newline at end of file
+type: script
+sister_search_traffic:
+description: Traffic to various wikis from Wikipedia search results 
pages; broken up by language, destination type (SERP vs not), and access method 
(desktop vs mobile web); excludes known automata
+granularity: days
+starts: 2017-06-01
+funnel: true
+type: script
diff --git a/modules/metrics/search/sister_search_traffic 
b/modules/metrics/search/sister_search_traffic
new file mode 100755
index 000..541463a
--- /dev/null
+++ b/modules/metrics/search/sister_search_traffic
@@ -0,0 +1,55 @@
+#!/bin/bash
+
+hive -S -e "USE wmf;
+ADD JAR hdfs:///wmf/refinery/current/artifacts/refinery-hive.jar;
+CREATE TEMPORARY FUNCTION normalize_host AS 
'org.wikimedia.analytics.refinery.hive.GetHostPropertiesUDF';
+WITH sister_search_pvs AS (
+  SELECT
+TO_DATE(ts) AS `date`, access_method,
+CASE normalized_host.project
+ WHEN 'commons' THEN 'wikimedia commons'
+ WHEN 'simple' THEN CONCAT('simple ', normalized_host.project_class)
+ WHEN 'species' THEN 'wikispecies'
+ ELSE normalized_host.project_class
+END AS project,
+IF(normalized_host.project IN('commons', 'meta', 'simple', 'incubator', 
'species'), '',
+   IF(normalized_host.project = 'en', 'English', 'Other languages')) AS 
language,
+-- flag for pageviews that are search results pages (e.g. if user clicked 
to see more results from a sister project):
+(
+  page_id IS NULL
+  AND (
+uri_path = '/wiki/Special:Search'
+OR (
+  uri_path = '/w/index.php'
+  AND (
+uri_query RLIKE '^\?search\='
+OR INSTR(uri_query, '?title=Special:Search=') > 0
+  )
+)
+  )
+) AS is_serp
+  FROM webrequest
+  WHERE
+webrequest_source = 'text'
+AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) >= '$1'
+AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) < '$2'
+AND is_pageview
+AND referer_class = 'internal'
+AND (
+  INSTR(referer, '/w/index.php?search=') > 0
+  OR INSTR(referer, '/wiki/Special:Search?search=') > 0
+)
+-- warning: comparing uri_host = PARSE_URL(referer, 'HOST') would mark 
'en.m.wikipedia.org' as a sister of 'en.wikipedia.org'
+AND normalize_host(PARSE_URL(referer, 'HOST')).project_class = 'wikipedia'
+AND normalize_host(PARSE_URL(referer, 'HOST')).project_class != 
normalized_host.project_class
+AND NOT normalized_host.project_class IN('mediawiki', 
'wikimediafoundation', 'wikidata')
+AND NOT normalized_host.project IN('meta', 'incubator')
+-- keep commons.wikimedia.org and species.wikimedia.org:
+AND NOT (normalized_host.project_class = 'wikimedia' AND NOT 
(normalized_host.project IN('commons', 'species')))
+)
+SELECT `date`, access_method, project, language, IF(is_serp, 'TRUE', 'FALSE') 
AS is_serp, COUNT(1) AS pageviews
+FROM sister_search_pvs
+GROUP BY `date`, access_method, project, language, IF(is_serp, 'TRUE', 'FALSE')
+ORDER BY `date`, access_method, project, language, is_serp
+LIMIT 1;
+" 2> /dev/null | grep -v parquet.hadoop | grep -v WARN:

-- 
To view, visit https://gerrit.wikimedia.org/r/361592


[MediaWiki-commits] [Gerrit] wikimedia...polloi[master]: The following datasets are included:

2017-06-22 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/360797 )

Change subject: The following datasets are included:
..

The following datasets are included:

- Countries and Regions with Traffic to Wikipedia.org
- U.S. States and Regions
- All Countries and U.S. States
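
The CSV-backed datasets listed above are exposed through small accessor functions that read a file bundled with the package. A minimal sketch of that pattern follows, using base R's `read.csv()` and a temporary file in place of the packaged `inst/extdata` file so that it runs without polloi installed. The rows are the first few shown in this change, and `read_bundled_csv()` is a hypothetical name used only for illustration.

```r
# Hypothetical stand-in for a file that would live under inst/extdata/.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "abb,name",
  "US:AL,U.S. (South)",
  "US:AK,U.S. (Pacific)",
  "US:AZ,U.S. (West)"
), csv_path)

# Hypothetical accessor; the package itself calls readr::read_csv() on a
# path resolved with system.file().
read_bundled_csv <- function(path) {
  read.csv(path, stringsAsFactors = FALSE)
}

states <- read_bundled_csv(csv_path)
states$name[states$abb == "US:AK"]
```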

Bug: T167913
Change-Id: I3c111f75bb827bb4a296b4148bee16d608844d26
---
M NAMESPACE
M R/data.R
M R/utils.R
A inst/extdata/all_countries_us_states.csv
A inst/extdata/portal_geo_names.RData
A inst/extdata/us_state_region.csv
A man/get_ctr_state.Rd
A man/get_portal_geo.Rd
A man/get_us_state.Rd
A man/update_portal_geo.Rd
10 files changed, 504 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/polloi 
refs/changes/97/360797/1

diff --git a/NAMESPACE b/NAMESPACE
index 9f924d8..8ccbbf9 100644
--- a/NAMESPACE
+++ b/NAMESPACE
@@ -9,9 +9,12 @@
 export(cond_icon)
 export(custom_axis_formatter)
 export(data_select)
+export(get_ctr_state)
 export(get_langproj)
+export(get_portal_geo)
 export(get_prefixes)
 export(get_projects)
+export(get_us_state)
 export(half)
 export(make_dygraph)
 export(na_box)
@@ -27,6 +30,7 @@
 export(time_frame_range)
 export(timeframe_daterange)
 export(timeframe_select)
+export(update_portal_geo)
 export(update_prefixes)
 export(update_projects)
 import(httr)
diff --git a/R/data.R b/R/data.R
index 288de6e..d329cbb 100644
--- a/R/data.R
+++ b/R/data.R
@@ -71,3 +71,55 @@
 rbind(projects, .)
   return(result)
 }
+
+#' @title Countries and Regions with Traffic to Wikipedia.org
+#' @description Attach `portal_geo_names` to search path
+#'
+#' @format `portal_geo_names` is a character vector containing about 230
+#'   country/region names with traffic to Wikipedia.org (portal).
+#'
+#' @source 
\url{https://analytics.wikimedia.org/datasets/discovery/metrics/portal/all_country_data.tsv}
+#' @seealso [update_portal_geo]
+#' @export
+get_portal_geo <- function() {
+  attach(system.file("extdata/portal_geo_names.RData", package = "polloi"))
+}
+
+#' @title U.S. States and Regions
+#' @description Returns a dataset containing all U.S. states' and territories'
+#'   names, abbreviations and regions.
+#'
+#' @format A data frame with 56 rows and 3 variables:
+#' \describe{
+#'   \item{abb}{The abbreviations of U.S. states and territories.}
+#'   \item{region}{The regions of U.S. states and territories. See
+#'https://phabricator.wikimedia.org/T136257#2399411.}
+#'   \item{state}{The names of U.S. states and territories.}
+#' }
+#'
+#' @source `ISO_3166_1` from package `ISOcodes`; `state.name`, `state.abb` and
+#'  `state.region` from package `datasets`; see
+#'   \url{https://phabricator.wikimedia.org/T136257#2399411} for U.S. regions.
+#' @importFrom readr read_csv
+#' @export
+get_us_state <- function() {
+  return(readr::read_csv(system.file("extdata/us_state_region.csv", package = 
"polloi")))
+}
+
+#' @title All Countries and U.S. States
+#' @description Returns a dataset containing all countries' and U.S. states'
+#'   names and abbreviations.
+#'
+#' @format A data frame with 300 rows and 2 variables:
+#' \describe{
+#'   \item{abb}{The abbreviations of all countries and U.S. states.}
+#'   \item{name}{The names of all countries and U.S. states.}
+#' }
+#'
+#' @source `ISO_3166_1` from package `ISOcodes`; `state.name`, `state.abb` and
+#'  `state.region` from package `datasets`.
+#' @importFrom readr read_csv
+#' @export
+get_ctr_state <- function() {
+  return(readr::read_csv(system.file("extdata/all_countries_us_states.csv", 
package = "polloi")))
+}
diff --git a/R/utils.R b/R/utils.R
index 1a81423..9c84d02 100644
--- a/R/utils.R
+++ b/R/utils.R
@@ -61,3 +61,17 @@
   result <- left_join(data.frame(wikiid = x, stringsAsFactors = FALSE), temp, 
by = "wikiid")
   return(result[, c('language', 'project')])
 }
+
+#' @title Update Country and Region Names with Traffic to Wikipedia.org
+#' @description Get unique country and region names from the `country` column 
of
+#'   
\url{https://analytics.wikimedia.org/datasets/discovery/metrics/portal/all_country_data.tsv}.
+#' @export
+update_portal_geo <- function() {
+  file_location <- system.file("extdata/portal_geo_names.RData", package = 
"polloi")
+
+  portal_geo_names <- 
read_dataset("discovery/metrics/portal/all_country_data.tsv")
+  portal_geo_names <- sort(c(unique(portal_geo_names$country), "United 
States"))
+
+  save(portal_geo_names, file = file_location)
+  return(invisible())
+}
diff --git a/inst/extdata/all_countries_us_states.csv 
b/inst/extdata/all_countries_us_states.csv
new file mode 100644
index 000..85ceea1
--- /dev/null
+++ b/inst/extdata/all_countries_us_states.csv
@@ -0,0 +1,301 @@
+abb,name
+US:AL,U.S. (South)
+US:AK,U.S. (Pacific)
+US:AZ,U.S. (West)
+US:AR,U.S. (South)
+US:CA,U.S. (Pacific)
+US:CO,U.S. (West)
+US:CT,U.S. (Northeast)
+US:DE,U.S. (South)
+US:FL,U.S. (South)
+US:GA,U.S. (South)
+US:HI,U.S. (Pacific)

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Add licensing info

2017-06-21 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/360591 )

Change subject: Add licensing info
..


Add licensing info

Bug: T167930
Change-Id: Ib01224fd1a952eeaab9cc378ca0e16e7ea3845d3
---
M .gitreview
M CHANGELOG.md
A LICENSE.md
M README.md
M server.R
M tab_documentation/app_events.md
M tab_documentation/app_load.md
D tab_documentation/build_a_plot.md
M tab_documentation/click_position.md
M tab_documentation/desktop_events.md
M tab_documentation/desktop_load.md
M tab_documentation/failure_breakdown.md
M tab_documentation/failure_rate.md
M tab_documentation/failure_suggests.md
M tab_documentation/fulltext_basic.md
M tab_documentation/geo_basic.md
M tab_documentation/invoke_source.md
M tab_documentation/kpi_api_usage.md
M tab_documentation/kpi_augmented_clickthroughs.md
M tab_documentation/kpi_load_time.md
M tab_documentation/kpi_zero_results.md
M tab_documentation/kpis_summary.md
M tab_documentation/langproj_breakdown.md
M tab_documentation/language_basic.md
M tab_documentation/mobile_events.md
M tab_documentation/mobile_load.md
M tab_documentation/monthly_metrics.md
M tab_documentation/open_basic.md
M tab_documentation/paulscore_approx.html
M tab_documentation/prefix_basic.md
M tab_documentation/survival.md
M ui.R
32 files changed, 185 insertions(+), 179 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/.gitreview b/.gitreview
index 3f659b0..6ab16d0 100644
--- a/.gitreview
+++ b/.gitreview
@@ -2,5 +2,5 @@
 host=gerrit.wikimedia.org
 port=29418
 project=wikimedia/discovery/rainbow.git
-defaultbranch=master
+defaultbranch=develop
 defaultrebase=0
diff --git a/CHANGELOG.md b/CHANGELOG.md
index ab4260b..bda0aab 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,9 @@
 
 All notable changes to this project will be documented in this file.
 
+## 2017/06/20
+- Added licensing info ([T167930](https://phabricator.wikimedia.org/T167930))
+
 ## 2017/05/01
 - Added a language-project breakdown of additional metrics 
([T150410](https://phabricator.wikimedia.org/T150410))
 
diff --git a/LICENSE.md b/LICENSE.md
new file mode 100644
index 000..7355a50
--- /dev/null
+++ b/LICENSE.md
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2017 Wikimedia Foundation
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/README.md b/README.md
index d1de5d9..ff65401 100644
--- a/README.md
+++ b/README.md
@@ -17,4 +17,4 @@
 shiny::runApp(launch.browser = 0)
 ```
 
-Please note that this project is released with a [Contributor Code of 
Conduct](CONDUCT.md). By participating in this project you agree to abide by 
its terms.
+Please note that this project is licensed under [MIT License](LICENSE.md) and 
released with a [Contributor Code of Conduct](CONDUCT.md). By participating in 
this project you agree to abide by its terms. [Wikimedia technical spaces code 
of conduct](https://www.mediawiki.org/wiki/Special:MyLanguage/Code_of_Conduct) 
also applies.
diff --git a/server.R b/server.R
index 54c0886..0de1586 100644
--- a/server.R
+++ b/server.R
@@ -383,7 +383,7 @@
  temp <- dates %>%
as.character("%e") %>%
as.numeric %>%
-   sapply(toOrdinal::toOrdinal) %>%
+   vapply(toOrdinal::toOrdinal, "") %>%
sub("([a-z]{2})", "\\1", .) %>%
paste0(as.character(dates, "%A, %b "), .)
},
@@ -392,7 +392,7 @@
  temp <- dates %>%
as.character("%e") %>%
as.numeric %>%
-   sapply(toOrdinal::toOrdinal) %>%
+   vapply(toOrdinal::toOrdinal, "") %>%
sub("([a-z]{2})", "\\1", .) %>%
paste0(as.character(dates, "%b "), .) %>%
{
@@ -404,7 +404,7 @@
  temp <- dates %>%
as.character("%e") %>%
as.numeric %>%
-   sapply(toOrdinal::toOrdinal) %>%
+   
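
The hunks above swap `sapply()` for `vapply()` with a character template (`""`). A likely motivation, stated here as an assumption rather than the author's documented reasoning, is type stability: `vapply()` always returns a character vector, while `sapply()` falls back to a list on empty input. The sketch below uses a hypothetical `to_ordinal()` helper in place of `toOrdinal::toOrdinal` so that it runs without that package.

```r
# Hypothetical replacement for toOrdinal::toOrdinal, for illustration only.
to_ordinal <- function(n) {
  suffix <- if (n %% 100 %in% 11:13) {
    "th"
  } else {
    switch(as.character(n %% 10), "1" = "st", "2" = "nd", "3" = "rd", "th")
  }
  paste0(n, suffix)
}

days <- c(1, 22, 13)
ordinals <- vapply(days, to_ordinal, "")  # always a character vector

# On empty input the two functions disagree about the return type:
empty_sapply <- sapply(numeric(0), to_ordinal)      # an empty list
empty_vapply <- vapply(numeric(0), to_ordinal, "")  # an empty character vector
```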

[MediaWiki-commits] [Gerrit] wikimedia...wetzel[develop]: Add licensing info

2017-06-21 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/360590 )

Change subject: Add licensing info
..


Add licensing info

Bug: T167930
Change-Id: I36ba0e9e5395d87380efda3bcde0ff7d22542efd
---
M .gitreview
M CHANGELOG.md
A LICENSE.md
M README.md
M server.R
M tab_documentation/geo_breakdown.md
M tab_documentation/geohack_usage.md
M tab_documentation/tiles_summary.md
M tab_documentation/tiles_total_by_style.md
M tab_documentation/tiles_total_by_zoom.md
M tab_documentation/tiles_users_by_style.md
M tab_documentation/unique_users.md
M tab_documentation/wikiminiatlas_usage.md
M tab_documentation/wikivoyage_usage.md
M tab_documentation/wiwosm_usage.md
M ui.R
16 files changed, 88 insertions(+), 63 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/.gitreview b/.gitreview
index 42d7c49..45a84e3 100644
--- a/.gitreview
+++ b/.gitreview
@@ -2,4 +2,4 @@
 host=gerrit.wikimedia.org
 port=29418
 project=wikimedia/discovery/wetzel.git
-defaultbranch=master
+defaultbranch=develop
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 410f696..208e2ab 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,9 @@
 # Change Log (Patch Notes)
 All notable changes to this project will be documented in this file.
 
+## 2017/06/20
+- Added licensing info ([T167930](https://phabricator.wikimedia.org/T167930))
+
 ## 2017/02/02
 - Updated to work with new datasets generated by Reportupdater-based golden 
([T150915](https://phabricator.wikimedia.org/T150915))
 
diff --git a/LICENSE.md b/LICENSE.md
new file mode 100644
index 000..7355a50
--- /dev/null
+++ b/LICENSE.md
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2017 Wikimedia Foundation
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/README.md b/README.md
index 1cd09ef..eb30184 100644
--- a/README.md
+++ b/README.md
@@ -17,4 +17,4 @@
 shiny::runApp(launch.browser = 0)
 ```
 
-Please note that this project is released with a [Contributor Code of 
Conduct](CONDUCT.md). By participating in this project you agree to abide by 
its terms.
+Please note that this project is licensed under [MIT License](LICENSE.md) and 
released with a [Contributor Code of Conduct](CONDUCT.md). By participating in 
this project you agree to abide by its terms. [Wikimedia technical spaces code 
of conduct](https://www.mediawiki.org/wiki/Special:MyLanguage/Code_of_Conduct) 
also applies.
diff --git a/server.R b/server.R
index d24e621..a659724 100644
--- a/server.R
+++ b/server.R
@@ -91,6 +91,7 @@
   })
 
   output$tiles_zoom_series <- renderDygraph({
+req(input$zoom_level_selector)
 polloi::data_select(
   input$tile_zoom_automata_check,
   new_tiles_automata,
diff --git a/tab_documentation/geo_breakdown.md 
b/tab_documentation/geo_breakdown.md
index b3e4c4c..c2f6adf 100644
--- a/tab_documentation/geo_breakdown.md
+++ b/tab_documentation/geo_breakdown.md
@@ -10,12 +10,12 @@
 
 Questions, bug reports, and feature suggestions
 --
-For technical, non-bug questions, [email 
Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question). If you 
experience a bug or notice something wrong or have a suggestion, [open a ticket 
in 
Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery)
 in the Discovery board or [email 
Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question).
+For technical, non-bug questions, [email 
Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question) or 
[Chelsy](mailto:c...@wikimedia.org?subject=Dashboard%20Question). If you 
experience a bug or notice something wrong or have a suggestion, [open a ticket 
in 
Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery)
 in the Discovery board or [email 
Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question).
 
 
-
-  Link to this dashboard:
-  http://discovery.wmflabs.org/maps/#geo_breakdown;>
-

[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[develop]: Add licensing info

2017-06-21 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/360589 )

Change subject: Add licensing info
..


Add licensing info

Bug: T167930
Change-Id: I9db26c507f5825b780b7584c309afd07375d7920
---
M .gitreview
A LICENSE.md
M README.md
A tab_documentation/traffic_by_engine.md
D tab_documentation/traffic_byengine.md
M tab_documentation/traffic_summary.md
M ui.R
7 files changed, 58 insertions(+), 37 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/.gitreview b/.gitreview
index be475f0..1e92d5f 100644
--- a/.gitreview
+++ b/.gitreview
@@ -2,4 +2,4 @@
 host=gerrit.wikimedia.org
 port=29418
 project=wikimedia/discovery/wonderbolt.git
-defaultbranch=master
+defaultbranch=develop
diff --git a/LICENSE.md b/LICENSE.md
new file mode 100644
index 000..7355a50
--- /dev/null
+++ b/LICENSE.md
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2017 Wikimedia Foundation
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/README.md b/README.md
index 6687c11..d29419b 100644
--- a/README.md
+++ b/README.md
@@ -17,4 +17,4 @@
 shiny::runApp(launch.browser = 0)
 ```
 
-Please note that this project is released with a [Contributor Code of 
Conduct](CONDUCT.md). By participating in this project you agree to abide by 
its terms.
+Please note that this project is licensed under [MIT License](LICENSE.md) and 
released with a [Contributor Code of Conduct](CONDUCT.md). By participating in 
this project you agree to abide by its terms. [Wikimedia technical spaces code 
of conduct](https://www.mediawiki.org/wiki/Special:MyLanguage/Code_of_Conduct) 
also applies.
diff --git a/tab_documentation/traffic_by_engine.md 
b/tab_documentation/traffic_by_engine.md
new file mode 100644
index 000..8269810
--- /dev/null
+++ b/tab_documentation/traffic_by_engine.md
@@ -0,0 +1,27 @@
+Traffic from external search engines, broken down
+===
+
+A key metric in understanding the role external search engines play in 
Wikipedia's (and Wikimedia's) readership and content discovery processes is a 
very direct one - how many pageviews we get from them. This can be discovered 
very simply by looking at our request logs.
+
+This dashboard simply breaks down the [summary 
data](https://discovery.wmflabs.org/external/#traffic_summary) to investigate 
how much traffic is coming from each search engine, individually. As you can 
see, Google dominates, which is why we've included the option of log-scaling
+the traffic.
+
+General trends
+--
+
+Outages and notes
+--
+* '__A__': on 2016-08-25 we patched the UDF to also look for [Duck Duck 
Go](https://duckduckgo.com) when it processes referer data. That referral 
data was deleted and backfilled from 26 June 2016. See 
[T143287](https://phabricator.wikimedia.org/T143287) for more details.
+* '__R__': on 2017-01-01 we started calculating all of Discovery's metrics 
using a new version of [our data retrieval and processing 
codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated 
to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' 
[Reportupdater 
infrastructure](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater). 
See [T150915](https://phabricator.wikimedia.org/T150915) for more details.
+
+Questions, bug reports, and feature suggestions
+--
+For technical, non-bug questions, [email 
Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question) or 
[Chelsy](mailto:c...@wikimedia.org?subject=Dashboard%20Question). If you 
experience a bug or notice something wrong or have a suggestion, [open a ticket 
in 
Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery)
 in the Discovery board or [email 
Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question).
+
+
+
+  Link to this dashboard: 

[MediaWiki-commits] [Gerrit] wikimedia...prince[develop]: Add licensing info

2017-06-21 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/360588 )

Change subject: Add licensing info
..


Add licensing info

Bug: T167930
Change-Id: I5a42359c682de98dfa8d26231bd4d7cd43a25d9c
---
M .gitreview
A LICENSE.md
M README.md
M tab_documentation/action_breakdown.md
M tab_documentation/applinks.md
M tab_documentation/browsers.md
M tab_documentation/clickthrough_rate.md
M tab_documentation/dwelltime.md
M tab_documentation/first_visit.md
M tab_documentation/first_visit_geo.md
M tab_documentation/geography.md
M tab_documentation/languages_summary.md
M tab_documentation/languages_visited.md
M tab_documentation/last_action_geo.md
M tab_documentation/most_common.md
M tab_documentation/most_common_geo.md
M tab_documentation/pageviews.md
M tab_documentation/referers_byengine.md
M tab_documentation/referers_summary.md
M tab_documentation/sisproj.md
M tab_documentation/traffic_ctr_geo.md
M ui.R
M utils.R
23 files changed, 140 insertions(+), 121 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/.gitreview b/.gitreview
index dfa799f..02def08 100644
--- a/.gitreview
+++ b/.gitreview
@@ -2,4 +2,4 @@
 host=gerrit.wikimedia.org
 port=29418
 project=wikimedia/discovery/prince.git
-defaultbranch=master
+defaultbranch=develop
diff --git a/LICENSE.md b/LICENSE.md
new file mode 100644
index 000..7355a50
--- /dev/null
+++ b/LICENSE.md
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2017 Wikimedia Foundation
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/README.md b/README.md
index 5fd2f2b..a93e802 100644
--- a/README.md
+++ b/README.md
@@ -17,4 +17,4 @@
 shiny::runApp(launch.browser = 0)
 ```
 
-Please note that this project is released with a [Contributor Code of 
Conduct](CONDUCT.md). By participating in this project you agree to abide by 
its terms.
+Please note that this project is licensed under [MIT License](LICENSE.md) and 
released with a [Contributor Code of Conduct](CONDUCT.md). By participating in 
this project you agree to abide by its terms. [Wikimedia technical spaces code 
of conduct](https://www.mediawiki.org/wiki/Special:MyLanguage/Code_of_Conduct) 
also applies.
diff --git a/tab_documentation/action_breakdown.md 
b/tab_documentation/action_breakdown.md
index 6ec8197..77cc787 100644
--- a/tab_documentation/action_breakdown.md
+++ b/tab_documentation/action_breakdown.md
@@ -26,12 +26,12 @@
 
 Questions, bug reports, and feature suggestions
 --
-For technical, non-bug questions, [email 
Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question). If you 
experience a bug or notice something wrong or have a suggestion, [open a ticket 
in 
Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery)
 in the Discovery board or [email 
Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question).
+For technical, non-bug questions, [email 
Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question) or 
[Chelsy](mailto:c...@wikimedia.org?subject=Dashboard%20Question). If you 
experience a bug or notice something wrong or have a suggestion, [open a ticket 
in 
Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery)
 in the Discovery board or [email 
Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question).
 
 
-
-  Link to this dashboard:
-  http://discovery.wmflabs.org/portal/#action_breakdown;>
-http://discovery.wmflabs.org/portal/#action_breakdown
-  
+
+  Link to this dashboard: https://discovery.wmflabs.org/portal/#action_breakdown;>https://discovery.wmflabs.org/portal/#action_breakdown
+  | Page is available under https://creativecommons.org/licenses/by-sa/3.0/; title="Creative Commons 
Attribution-ShareAlike License">CC-BY-SA 3.0
+  | https://phabricator.wikimedia.org/diffusion/WDPR/; 
title="Wikipedia.org Portal Dashboard source code repository">Code is 
licensed under 

[MediaWiki-commits] [Gerrit] wikimedia...twilightsparql[develop]: Add licensing info

2017-06-21 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/360586 )

Change subject: Add licensing info
..


Add licensing info

Bug: T167930
Change-Id: Iee990ba2506eea10b1cda38b060c94802b6e48fe
---
M .gitreview
M CHANGELOG.md
A LICENSE.md
M README.md
M tab_documentation/wdqs_usage.md
M tab_documentation/wdqs_visits.md
M ui.R
7 files changed, 39 insertions(+), 15 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/.gitreview b/.gitreview
index 14640ea..a51a85c 100644
--- a/.gitreview
+++ b/.gitreview
@@ -2,5 +2,5 @@
 host=gerrit.wikimedia.org
 port=29418
 project=wikimedia/discovery/twilightsparql.git
-defaultbranch=master
+defaultbranch=develop
 defaultrebase=0
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 96ea811..84ae0e2 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,9 @@
 
 All notable changes to this project will be documented in this file.
 
+## 2017/06/20
+- Added licensing info ([T167930](https://phabricator.wikimedia.org/T167930))
+
 ## 2017/02/02
 - Updated to work with new datasets generated by Reportupdater-based golden 
([T150915](https://phabricator.wikimedia.org/T150915))
 - Added LDF endpoint usage 
([T153936](https://phabricator.wikimedia.org/T153936))
diff --git a/LICENSE.md b/LICENSE.md
new file mode 100644
index 000..7355a50
--- /dev/null
+++ b/LICENSE.md
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2017 Wikimedia Foundation
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/README.md b/README.md
index 5ce05f6..f1c36f4 100644
--- a/README.md
+++ b/README.md
@@ -17,4 +17,4 @@
 shiny::runApp(launch.browser = 0)
 ```
 
-Please note that this project is released with a [Contributor Code of 
Conduct](CONDUCT.md). By participating in this project you agree to abide by 
its terms.
+Please note that this project is licensed under [MIT License](LICENSE.md) and 
released with a [Contributor Code of Conduct](CONDUCT.md). By participating in 
this project you agree to abide by its terms. [Wikimedia technical spaces code 
of conduct](https://www.mediawiki.org/wiki/Special:MyLanguage/Code_of_Conduct) 
also applies.
diff --git a/tab_documentation/wdqs_usage.md b/tab_documentation/wdqs_usage.md
index 627ac1d..1dfb2da 100644
--- a/tab_documentation/wdqs_usage.md
+++ b/tab_documentation/wdqs_usage.md
@@ -15,12 +15,12 @@
 
 Questions, bug reports, and feature suggestions
 --
-For technical, non-bug questions, [email 
Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question). If you 
experience a bug or notice something wrong or have a suggestion, [open a ticket 
in 
Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery)
 in the Discovery board or [email 
Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question).
+For technical, non-bug questions, [email 
Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question) or 
[Chelsy](mailto:c...@wikimedia.org?subject=Dashboard%20Question). If you 
experience a bug or notice something wrong or have a suggestion, [open a ticket 
in 
Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery)
 in the Discovery board or [email 
Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question).
 
 
-
-  Link to this dashboard:
-  http://discovery.wmflabs.org/wdqs/#wdqs_usage;>
-http://discovery.wmflabs.org/wdqs/#wdqs_usage
-  
+
+  Link to this dashboard: https://discovery.wmflabs.org/wdqs/#endpoint_usage;>https://discovery.wmflabs.org/wdqs/#endpoint_usage
+  | Page is available under https://creativecommons.org/licenses/by-sa/3.0/; title="Creative Commons 
Attribution-ShareAlike License">CC-BY-SA 3.0
+  | https://phabricator.wikimedia.org/diffusion/WDTS/; title="WDQS 
Dashboard source code repository">Code is licensed under https://phabricator.wikimedia.org/diffusion/WDTS/browse/master/LICENSE.md;
 title="MIT License">MIT
+  | Part of 

[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Fix desktop/mobile web mix-up

2017-06-14 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/359026 )

Change subject: Fix desktop/mobile web mix-up
..


Fix desktop/mobile web mix-up

Previous version assumed a specific order of access method when
renaming the elements of the list. This uses relative naming.

Also changes the x-axis formatting so it displays day names.
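
The difference between positional and relative naming can be illustrated with a small sketch. The list `interim` and its contents below are hypothetical stand-ins, not the dashboard's real data; only the lookup idiom mirrors the change to `utils.R`. Indexing a named character vector by the list's current names maps each element to its label regardless of the order in which the elements arrive.

```r
# Hypothetical stand-in for the list built in utils.R; the elements are
# deliberately NOT in desktop / mobile web / total order.
interim <- list(
  "mobile web" = data.frame(pageviews = 20),
  "desktop" = data.frame(pageviews = 30),
  "total" = data.frame(pageviews = 50)
)

# Positional renaming silently mislabels elements when the order differs:
positional <- interim
names(positional) <- c("Desktop", "Mobile Web", "All")

# Relative renaming looks each current name up in a named vector:
labels <- c("desktop" = "Desktop", "mobile web" = "Mobile Web", "total" = "All")
relative <- interim
names(relative) <- labels[names(relative)]

positional$Desktop$pageviews  # 20, which is really the mobile web count
relative$Desktop$pageviews    # 30, the desktop count
```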

Bug: T167850
Change-Id: I818553b66e7be0e960da37477549d9ad60e9d58d
---
M server.R
M utils.R
2 files changed, 11 insertions(+), 10 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/server.R b/server.R
index 528fb21..f9b4162 100644
--- a/server.R
+++ b/server.R
@@ -37,6 +37,7 @@
   polloi::make_dygraph(xlab = "Date", ylab = 
ifelse(input$platform_traffic_summary_prop, "Pageview Share (%)", "Pageviews"),
title = "Sources of page views (e.g. search engines 
and internal referers)") %>%
   dyLegend(labelsDiv = "traffic_summary_legend", show = "always", 
showZeroValues = FALSE) %>%
+  dyAxis("x", axisLabelFormatter = polloi::custom_axis_formatter, 
axisLabelWidth = 70) %>%
   dyAxis("y", logscale = input$platform_traffic_summary_log) %>%
   dyRangeSelector(fillColor = ifelse(input$platform_traffic_summary_prop, 
"", "#A7B1C4"),
   strokeColor = 
ifelse(input$platform_traffic_summary_prop, "", "#808FAB"),
@@ -64,6 +65,7 @@
   polloi::make_dygraph(xlab = "Date", ylab = 
ifelse(input$platform_traffic_bysearch_prop, "Pageview Share (%)", "Pageviews"),
title = "Pageviews from external search engines, 
broken down by engine") %>%
   dyLegend(labelsDiv = "traffic_bysearch_legend", show = "always", 
showZeroValues = FALSE) %>%
+  dyAxis("x", axisLabelFormatter = polloi::custom_axis_formatter, 
axisLabelWidth = 70) %>%
   dyAxis("y", logscale = input$platform_traffic_bysearch_log) %>%
   dyRangeSelector(fillColor = ifelse(input$platform_traffic_bysearch_prop, 
"", "#A7B1C4"),
   strokeColor = 
ifelse(input$platform_traffic_bysearch_prop, "", "#808FAB"),
diff --git a/utils.R b/utils.R
index 2169808..4b3de60 100644
--- a/utils.R
+++ b/utils.R
@@ -26,13 +26,13 @@
 lapply(dplyr::select_, .dots = list(quote(-access_method))) # fixes 
smoothing
   interim$total <- data[, j = list(pageviews = sum(pageviews)),
 by = c("date", "referer_class")]
-  names(interim) <- c("Desktop", "Mobile Web", "All")
+  names(interim) <- c("desktop" = "Desktop", "mobile web" = "Mobile Web", 
"total" = "All")[names(interim)]
   summary_traffic_data <<- lapply(interim, tidyr::spread, key = 
"referer_class", value = "pageviews", fill = NA)
 
   # Proportion
   summary_traffic_data_prop <<- interim %>%
 lapply(dplyr::group_by, date) %>%
-lapply(dplyr::mutate, pageviews = 100*pageviews/sum(pageviews)) %>%
+lapply(dplyr::mutate, pageviews = 100 * pageviews / sum(pageviews)) %>%
 lapply(tidyr::spread, key = "referer_class", value = "pageviews", fill = 
NA)
 
   # Generate per-engine values
@@ -44,7 +44,7 @@
   interim$total <- data[is_search == TRUE,
 j = list(pageviews = sum(pageviews)),
 by = c("date", "search_engine")]
-  names(interim) <- c("Desktop", "Mobile Web", "All")
+  names(interim) <- c("desktop" = "Desktop", "mobile web" = "Mobile Web", 
"total" = "All")[names(interim)]
   bysearch_traffic_data <<- interim %>%
 lapply(dplyr::filter_, .dots = list(quote(search_engine != "Not referred 
by search"))) %>%
 lapply(tidyr::spread, key = "search_engine", value = "pageviews", fill = 
NA)
@@ -52,7 +52,7 @@
   # Proportion
   bysearch_traffic_data_prop <<- interim %>%
 lapply(dplyr::group_by, date) %>%
-lapply(dplyr::mutate, pageviews = 100*pageviews/sum(pageviews)) %>%
+lapply(dplyr::mutate, pageviews = 100 * pageviews / sum(pageviews)) %>%
 lapply(dplyr::filter_, .dots = list(quote(search_engine != "Not referred 
by search"))) %>%
 lapply(tidyr::spread, key = "search_engine", value = "pageviews", fill = 
NA)
 
@@ -72,8 +72,7 @@
 `None (direct)` = "none",
 `Search engine` = "external (search engine)",
 `External (but not search engine)` = "external",
-Internal = "internal",
-Unknown = "unknown"
+Internal = "internal"
   )
 ) %>%
 data.table::as.data.table()
@@ -85,13 +84,13 @@
 lapply(dplyr::select_, .dots = list(quote(-access_method))) # fixes 
smoothing
   interim$total <- data[, j = list(pageviews = sum(pageviews)),
 by = c("date", "referer_class")]
-  names(interim) <- c("Desktop", "Mobile Web", "All")
+  names(interim) <- c("desktop" = "Desktop", "mobile web" = "Mobile Web", 
"total" = "All")[names(interim)]
   summary_traffic_nonbot_data <<- lapply(interim, tidyr::spread, key = 
"referer_class", value = "pageviews", fill = NA)
 
   # 

[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Change the way ui.R get date range and country list

2017-06-13 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/358890 )

Change subject: Change the way ui.R get date range and country list
..

Change the way ui.R get date range and country list

Previously, ui.R downloaded a public dataset before rendering the 
dashboard, which is problematic. This change stores the country name list in 
extras.R for use by selectizeInput in ui.R.
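
A rough sketch of the pattern, under stated assumptions: the three-country list, the temporary file, and the `"countries"` input id are illustrative only, and the real extras.R holds the full list. The point is that a static vector can be loaded with `source()` and handed to `selectizeInput()` without any download at UI build time:

```r
# Write a tiny, hypothetical extras.R to a temporary file so the sketch is
# self-contained; the real file holds the full country list.
extras_path <- file.path(tempdir(), "extras.R")
writeLines('all_country_names <- c("Zimbabwe", "Zambia", "Yemen")', extras_path)

# Loading the static list needs no network access.
source(extras_path, local = TRUE)

# Guarded so the sketch still runs where shiny is not installed.
if (requireNamespace("shiny", quietly = TRUE)) {
  country_picker <- shiny::selectizeInput(
    inputId = "countries",  # hypothetical input id
    label = "Countries",
    choices = all_country_names,
    multiple = TRUE
  )
}

print(all_country_names)
```

The trade-off is that the list must be updated by hand when new countries appear in the data, in exchange for a UI that renders without waiting on a remote file.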

Change-Id: Id564a1f7371e14932ea72da03630463b6e9c348e
---
M extras.R
M ui.R
2 files changed, 16 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/prince 
refs/changes/90/358890/1

diff --git a/extras.R b/extras.R
index a3a96fa..69c502e 100644
--- a/extras.R
+++ b/extras.R
@@ -28,6 +28,9 @@
   "Modern") # Android 4+
 )
 
+# For selectizeInput in ui.R
+all_country_names <- c("Zimbabwe", "Zambia", "Yemen", "Virgin Islands, 
British", "Viet Nam", "Venezuela, Bolivarian Republic of", "Uzbekistan", "U.S. 
(West)", "U.S. (South)", "U.S. (Pacific)", "U.S. (Other)", "U.S. (Northeast)", 
"U.S. (Midwest)", "Uruguay", "United Kingdom", "United Arab Emirates", 
"Ukraine", "Uganda", "Turkmenistan", "Turkey", "Tunisia", "Trinidad and 
Tobago", "Timor-Leste", "Thailand", "Tanzania, United Republic of", 
"Tajikistan", "Taiwan, Province of China", "Syrian Arab Republic", 
"Switzerland", "Sweden", "Suriname", "Sudan", "Sri Lanka", "Spain", "South 
Africa", "Somalia", "Slovenia", "Slovakia", "Singapore", "Seychelles", 
"Serbia", "Senegal", "Saudi Arabia", "Rwanda", "Russian Federation", "Romania", 
"Qatar", "Portugal", "Poland", "Philippines", "Peru", "Paraguay", "Papua New 
Guinea", "Panama", "Palestine, State of", "Pakistan", "Other", "Oman", 
"Norway", "Nigeria", "Niger", "Nicaragua", "New Zealand", "Netherlands", 
"Nepal", "Namibia", "Myanmar", "Mozambique", "Morocco", "Montenegro", 
"Mongolia", "Moldova, Republic of", "Mexico", "Mauritius", "Mauritania", 
"Martinique", "Mali", "Malaysia", "Malawi", "Madagascar", "Macedonia, Republic 
of", "Macao", "Luxembourg", "Lithuania", "Libya", "Lebanon", "Latvia", "Lao 
People's Democratic Republic", "Kyrgyzstan", "Kuwait", "Korea, Republic of", 
"Kenya", "Kazakhstan", "Jordan", "Jersey", "Japan", "Jamaica", "Italy", 
"Israel", "Ireland", "Iraq", "Iran, Islamic Republic of", "Indonesia", "India", 
"Iceland", "Hungary", "Hong Kong", "Honduras", "Haiti", "Guernsey", 
"Guatemala", "Greenland", "Greece", "Ghana", "Germany", "Georgia", "French 
Polynesia", "France", "Finland", "Fiji", "Ethiopia", "Estonia", "El Salvador", 
"Egypt", "Ecuador", "Dominican Republic", "Dominica", "Djibouti", "Denmark", 
"Czechia", "Cyprus", "Curacao", "Cuba", "Croatia", "Cote d'Ivoire", "Costa 
Rica", "Congo, The Democratic Republic of the", "Colombia", "China", "Chile", 
"Canada", "Cameroon", "Cambodia", "Burkina Faso", "Bulgaria", "British Indian 
Ocean Territory", "Brazil", "Botswana", "Bolivia, Plurinational State of", 
"Bhutan", "Benin", "Belgium", "Belarus", "Barbados", "Bangladesh", "Bahrain", 
"Azerbaijan", "Austria", "Australia", "Aruba", "Armenia", "Argentina", 
"Angola", "Algeria", "Albania", "Afghanistan", "Togo", "Malta", "Guadeloupe", 
"Gibraltar", "Gabon", "Faroe Islands", "Congo", "Cayman Islands", "Brunei 
Darussalam", "Bosnia and Herzegovina", "Bahamas", "Reunion", "Maldives", 
"Guyana", "Guinea", "Cabo Verde", "Burundi", "Antigua and Barbuda", 
"Swaziland", "Saint Lucia", "Isle of Man", "Gambia", "Central African 
Republic", "Belize", "Vanuatu", "Sierra Leone", "Saint Kitts and Nevis", "New 
Caledonia", "Lesotho", "Solomon Islands", "French Guiana", "Chad", "Bermuda", 
"Turks and Caicos Islands", "Liberia", "Comoros", "Bonaire, Sint Eustatius and 
Saba", "Aland Islands", "Grenada", "Mayotte", "Liechtenstein", "Samoa", 
"Equatorial Guinea", "Andorra", "South Sudan", "Saint Martin (French part)", 
"Saint Vincent and the Grenadines", "Holy See (Vatican City State)", 
"Guinea-Bissau", "Eritrea", "Saint Barthelemy", "Cook Islands", "Sint Maarten 
(Dutch part)", "Sao Tome and Principe", "Anguilla", "Monaco", "Kiribati", 
"Micronesia, Federated States of", "San Marino", "United States")
+
 fill_out <- function(x, start_date, end_date, fill = 0) {
   temp <- dplyr::data_frame(date = seq(start_date, end_date, "day"))
   y <- dplyr::right_join(x, temp, by = "date")
diff --git a/ui.R b/ui.R
index 8b2ece9..c9d1ea6 100644
--- a/ui.R
+++ b/ui.R
@@ -1,7 +1,7 @@
 library(shiny)
 library(shinydashboard)
 
-all_country_data <- 
polloi::read_dataset("discovery/metrics/portal/all_country_data.tsv", col_types 
= "Dcididid")
+source("extras.R")
 
 function(request) {
   dashboardPage(
@@ -237,8 +237,8 @@
   fluidRow(
 column(width = 3,
dateRangeInput("date_all_country", "Date Range",
-  start = min(all_country_data$date),
-  end = max(all_country_data$date),
+

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[master]: Clarify event counts + switch to 90-day median

2017-06-13 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/358489 )

Change subject: Clarify event counts + switch to 90-day median
..


Clarify event counts + switch to 90-day median
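
For illustration only, a 90-day median of daily counts could be computed along these lines. The `daily` data frame and its values are made up for this sketch; the dashboard's actual computation lives in utils.R and is not reproduced here:

```r
# Hypothetical daily counts: 120 days, with the most recent 90 days holding
# the values 1..90 so the result is easy to check by hand.
daily <- data.frame(
  date = seq(as.Date("2017-01-01"), by = "day", length.out = 120),
  events = c(rep(1000, 30), 1:90)
)

# Keep only the rows that fall within the 90 days ending on the latest date.
cutoff <- max(daily$date) - 90
last_90 <- daily[daily$date > cutoff, ]

# The median is less sensitive to short spikes than the mean.
median_90 <- median(last_90$events)
print(median_90)
```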

Change-Id: I6b2f1b51f405e8acc003033cd20b2e27fc95ba3b
---
M server.R
M tab_documentation/app_events.md
M tab_documentation/desktop_events.md
M tab_documentation/mobile_events.md
M utils.R
5 files changed, 20 insertions(+), 14 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/server.R b/server.R
index 2225127..54c0886 100644
--- a/server.R
+++ b/server.R
@@ -41,7 +41,7 @@
   output$desktop_event_searches <- renderValueBox(
 valueBox(
   value = desktop_dygraph_means["search sessions"],
-  subtitle = "Search sessions per day",
+  subtitle = "Tracked search sessions per day*",
   icon = icon("search"),
   color = "green"
 )
@@ -50,7 +50,7 @@
   output$desktop_event_resultsets <- renderValueBox(
 valueBox(
   value = desktop_dygraph_means["Result pages opened"],
-  subtitle = "Result sets per day",
+  subtitle = "Result pages opened per day*",
   icon = icon("list", lib = "glyphicon"),
   color = "green"
 )
@@ -59,7 +59,7 @@
   output$desktop_event_clickthroughs <- renderValueBox(
 valueBox(
   value = desktop_dygraph_means["clickthroughs"],
-  subtitle = "Clickthroughs per day",
+  subtitle = "Clickthroughs per day*",
   icon = icon("hand-up", lib = "glyphicon"),
   color = "green"
 )
@@ -124,7 +124,7 @@
   output$mobile_event_searches <- renderValueBox(
 valueBox(
   value = mobile_dygraph_means["search sessions"],
-  subtitle = "Search sessions per day",
+  subtitle = "Search sessions per day*",
   icon = icon("search"),
   color = "green"
 )
@@ -133,7 +133,7 @@
   output$mobile_event_resultsets <- renderValueBox(
 valueBox(
   value = mobile_dygraph_means["Result pages opened"],
-  subtitle = "Result sets per day",
+  subtitle = "Result pages opened per day*",
   icon = icon("list", lib = "glyphicon"),
   color = "green"
 )
@@ -142,7 +142,7 @@
   output$mobile_event_clickthroughs <- renderValueBox(
 valueBox(
   value = mobile_dygraph_means["clickthroughs"],
-  subtitle = "Clickthroughs per day",
+  subtitle = "Clickthroughs per day*",
   icon = icon("hand-up", lib = "glyphicon"),
   color = "green"
 )
@@ -169,7 +169,7 @@
   output$app_event_searches <- renderValueBox(
 valueBox(
   value = ios_dygraph_means["search sessions"] + 
android_dygraph_means["search sessions"],
-  subtitle = "Search sessions per day",
+  subtitle = "Search sessions per day*",
   icon = icon("search"),
   color = "green"
 )
@@ -178,7 +178,7 @@
   output$app_event_resultsets <- renderValueBox(
 valueBox(
   value = ios_dygraph_means["Result pages opened"] + 
android_dygraph_means["Result pages opened"],
-  subtitle = "Result sets per day",
+  subtitle = "Result pages opened per day*",
   icon = icon("list", lib = "glyphicon"),
   color = "green"
 )
@@ -187,7 +187,7 @@
   output$app_event_clickthroughs <- renderValueBox(
 valueBox(
   value = ios_dygraph_means["clickthroughs"] + 
android_dygraph_means["clickthroughs"],
-  subtitle = "Clickthroughs per day",
+  subtitle = "Clickthroughs per day*",
   icon = icon("hand-up", lib = "glyphicon"),
   color = "green"
 )
diff --git a/tab_documentation/app_events.md b/tab_documentation/app_events.md
index 02b87ba..3f696d1 100644
--- a/tab_documentation/app_events.md
+++ b/tab_documentation/app_events.md
@@ -11,6 +11,8 @@
 
 Due to a bug in the iOS EventLogging system, iOS events are currently being 
tracked much more frequently than Android ones and so are displayed in a 
different graph to avoid confusion.
 
+\* This number represents the median of the last 90 days.
+
 Notes
 --
 * There is a spike in events on 2 June 2015 because of a release of the iOS 
app that added search logging. This has been 
[confirmed](https://phabricator.wikimedia.org/T102098) by a mobile apps 
software engineer.
diff --git a/tab_documentation/desktop_events.md 
b/tab_documentation/desktop_events.md
index 9c07536..044d3f7 100644
--- a/tab_documentation/desktop_events.md
+++ b/tab_documentation/desktop_events.md
@@ -8,7 +8,9 @@
 3. A user clicking through to an article in the results page.
 
 These three things are tracked via the [EventLogging 'TestSearchSatisfaction2' 
schema](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2) 
(previously '[Search](https://meta.wikimedia.org/wiki/Schema:Search)', see note 
"A"), and stored to
-a database. The results are then aggregated and anonymised, and presented on 
this page. For performance/privacy reasons we randomly sample what we store, so 
the actual numbers are a vast understatement of how 

[MediaWiki-commits] [Gerrit] wikimedia...dashboard[master]: Deploy fixes

2017-06-02 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/356982 )

Change subject: Deploy fixes
..


Deploy fixes

Change-Id: I92d0cbe388113e3b878c82f73f3970e64ebfbae1
---
M shiny-server/portal
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/shiny-server/portal b/shiny-server/portal
index 51df8cf..fa78f60 160000
--- a/shiny-server/portal
+++ b/shiny-server/portal
@@ -1 +1 @@
-Subproject commit 51df8cf55d3856c0277a55f15d43a780b477b8f8
+Subproject commit fa78f60f4734432e4fd3c5f8e61803f5a870a024

-- 
To view, visit https://gerrit.wikimedia.org/r/356982
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I92d0cbe388113e3b878c82f73f3970e64ebfbae1
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/dashboard
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 
Gerrit-Reviewer: Chelsyx 

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits


[MediaWiki-commits] [Gerrit] wikimedia...dashboard[master]: Deploy fixes

2017-06-02 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/356982 )

Change subject: Deploy fixes
..

Deploy fixes

Change-Id: I92d0cbe388113e3b878c82f73f3970e64ebfbae1
---
M shiny-server/portal
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/dashboard 
refs/changes/82/356982/1

diff --git a/shiny-server/portal b/shiny-server/portal
index 51df8cf..fa78f60 160000
--- a/shiny-server/portal
+++ b/shiny-server/portal
@@ -1 +1 @@
-Subproject commit 51df8cf55d3856c0277a55f15d43a780b477b8f8
+Subproject commit fa78f60f4734432e4fd3c5f8e61803f5a870a024

-- 
To view, visit https://gerrit.wikimedia.org/r/356982
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I92d0cbe388113e3b878c82f73f3970e64ebfbae1
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/dashboard
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Use new path in ui.R

2017-06-02 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/356980 )

Change subject: Use new path in ui.R
..


Use new path in ui.R

Change-Id: I981a31e3d8462c2609203653838b636cd7c5935a
---
M ui.R
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/ui.R b/ui.R
index 84f69c7..8b2ece9 100644
--- a/ui.R
+++ b/ui.R
@@ -1,7 +1,7 @@
 library(shiny)
 library(shinydashboard)
 
-all_country_data <- 
polloi::read_dataset("discovery/portal/all_country_data.tsv", col_types = 
"Dcididid")
+all_country_data <- 
polloi::read_dataset("discovery/metrics/portal/all_country_data.tsv", col_types 
= "Dcididid")
 
 function(request) {
   dashboardPage(

-- 
To view, visit https://gerrit.wikimedia.org/r/356980
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I981a31e3d8462c2609203653838b636cd7c5935a
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/prince
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 
Gerrit-Reviewer: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Use new path in ui.R

2017-06-02 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/356980 )

Change subject: Use new path in ui.R
..

Use new path in ui.R

Change-Id: I981a31e3d8462c2609203653838b636cd7c5935a
---
M ui.R
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/prince 
refs/changes/80/356980/1

diff --git a/ui.R b/ui.R
index 84f69c7..8b2ece9 100644
--- a/ui.R
+++ b/ui.R
@@ -1,7 +1,7 @@
 library(shiny)
 library(shinydashboard)
 
-all_country_data <- 
polloi::read_dataset("discovery/portal/all_country_data.tsv", col_types = 
"Dcididid")
+all_country_data <- 
polloi::read_dataset("discovery/metrics/portal/all_country_data.tsv", col_types 
= "Dcididid")
 
 function(request) {
   dashboardPage(

-- 
To view, visit https://gerrit.wikimedia.org/r/356980
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I981a31e3d8462c2609203653838b636cd7c5935a
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/prince
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Note that PVs are for Wikimedia in general

2017-04-21 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/349718 )

Change subject: Note that PVs are for Wikimedia in general
..


Note that PVs are for Wikimedia in general

Change-Id: Iee0800433fb1d936b0058595b903a10aafaa64f0
---
M tab_documentation/traffic_byengine.md
M tab_documentation/traffic_summary.md
2 files changed, 4 insertions(+), 4 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/tab_documentation/traffic_byengine.md 
b/tab_documentation/traffic_byengine.md
index 30d33ac..5db2e29 100644
--- a/tab_documentation/traffic_byengine.md
+++ b/tab_documentation/traffic_byengine.md
@@ -1,9 +1,9 @@
 Traffic from external search engines, broken down
 ===
 
-A key metric in understanding the role external search engines play in 
Wikipedia's readership and content discovery processes is a very direct one - 
how many pageviews we get from them. This can be discovered very simply by 
looking at our request logs.
+A key metric in understanding the role external search engines play in 
Wikipedia's (and Wikimedia's) readership and content discovery processes is a 
very direct one - how many pageviews we get from them. This can be discovered 
very simply by looking at our request logs.
 
-This dashboard simply breaks down the [summary 
data](http://discovery.wmflabs.org/external/#traffic_summary) to investigate 
how much traffic is coming from each search engine, individually. As you can 
see, Google dominates, which is why we've included the option of log-scaling
+This dashboard simply breaks down the [summary 
data](https://discovery.wmflabs.org/external/#traffic_summary) to investigate 
how much traffic is coming from each search engine, individually. As you can 
see, Google dominates, which is why we've included the option of log-scaling
 the traffic.
 
 General trends
diff --git a/tab_documentation/traffic_summary.md 
b/tab_documentation/traffic_summary.md
index e8f0919..8cfb982 100644
--- a/tab_documentation/traffic_summary.md
+++ b/tab_documentation/traffic_summary.md
@@ -1,9 +1,9 @@
 Traffic from external search engines - summary
 ===
 
-A key metric in understanding the role external search engines play in 
Wikipedia's readership and content discovery processes is a very direct one - 
how many pageviews we get from them. This can be discovered very simply by 
looking at our request logs.
+A key metric in understanding the role external search engines play in 
Wikipedia's (and Wikimedia's) readership and content discovery processes is a 
very direct one - how many pageviews we get from them. This can be discovered 
very simply by looking at our request logs.
 
-This dashboard simply looks at, very broadly, where our requests are coming 
from - search engines or something else? It is split up into
+This dashboard simply looks at, very broadly, where our pageviews (across all 
Wikimedia projects) are coming from - search engines or something else? It is 
split up into
 "all", "desktop" and "mobile web" platforms - but not apps, since the apps do 
not log referers.
 
 **Internal** is traffic referred by Wikimedia sites, specifically: 
mediawiki.org, wikibooks.org, wikidata.org, wikinews.org, wikimedia.org, 
wikimediafoundation.org, wikipedia.org, wikiquote.org, wikisource.org, 
wikiversity.org, wikivoyage.org, and wiktionary.org (See [Webrequest 
source](https://git.wikimedia.org/blob/analytics%2Frefinery%2Fsource.git/master/refinery-core%2Fsrc%2Fmain%2Fjava%2Forg%2Fwikimedia%2Fanalytics%2Frefinery%2Fcore%2FWebrequest.java#L203)
 for more information.)

-- 
To view, visit https://gerrit.wikimedia.org/r/349718
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Iee0800433fb1d936b0058595b903a10aafaa64f0
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/wonderbolt
Gerrit-Branch: master
Gerrit-Owner: Bearloga 
Gerrit-Reviewer: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Add secondSPARQL endpoint

2017-04-20 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/349290 )

Change subject: Add secondSPARQL endpoint
..


Add secondSPARQL endpoint
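
The idea behind the change can be sketched in R. This is a toy illustration with made-up request paths, not the Hive query itself: requests to the new `/sparql` path are relabelled with the existing endpoint path before counting, so both are reported under one label.

```r
# Made-up request paths for illustration.
uri_path <- c("/sparql", "/bigdata/namespace/wdq/sparql", "/", "/sparql", "/bigdata/ldf")

# Fold the second endpoint into the original endpoint's label.
path <- ifelse(uri_path == "/sparql", "/bigdata/namespace/wdq/sparql", uri_path)

# Count requests per (normalized) path.
counts <- table(path)
print(counts)
```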

Bug: T163501
Change-Id: Ifef4c91d66a87e0a8d33bf044d6d956b0e3b63e2
---
M modules/metrics/wdqs/basic_usage
1 file changed, 3 insertions(+), 3 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/modules/metrics/wdqs/basic_usage b/modules/metrics/wdqs/basic_usage
index 5fd7a10..b1c640a 100755
--- a/modules/metrics/wdqs/basic_usage
+++ b/modules/metrics/wdqs/basic_usage
@@ -3,7 +3,7 @@
 hive -S -e "USE wmf;
 SELECT
   '$1' AS date,
-  uri_path AS path,
+  IF(uri_path = '/sparql', '/bigdata/namespace/wdq/sparql', uri_path) AS path,
   UPPER(http_status IN('200','304')) as http_success,
   CASE
 WHEN (
@@ -27,10 +27,10 @@
   AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) >= '$1'
   AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) < '$2'
   AND uri_host = 'query.wikidata.org'
-  AND uri_path IN('/', '/bigdata/namespace/wdq/sparql', '/bigdata/ldf')
+  AND uri_path IN('/', '/bigdata/namespace/wdq/sparql', '/bigdata/ldf', 
'/sparql')
 GROUP BY
   '$1',
-  uri_path,
+  IF(uri_path = '/sparql', '/bigdata/namespace/wdq/sparql', uri_path),
   UPPER(http_status IN('200','304')),
   CASE
 WHEN (

-- 
To view, visit https://gerrit.wikimedia.org/r/349290
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ifef4c91d66a87e0a8d33bf044d6d956b0e3b63e2
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Bearloga 
Gerrit-Reviewer: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Add dataset READMEs to output dirs

2017-04-17 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/348487 )

Change subject: Add dataset READMEs to output dirs
..


Add dataset READMEs to output dirs

This adds two Rmarkdown files that must be re-knit after any new
reports are added. The Rmarkdown files read config.yaml info and
output Markdown documents that are rsync'd into the output
directories, so that users of datasets.wikimedia.org and browsers
of stat1002:/a/aggregate-datasets/ can find out what the TSVs are.
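
A simplified sketch of the config-to-README step follows. It assumes a nested list standing in for a parsed config.yaml, and `render_module` is a hypothetical helper; the actual Rmarkdown files use the yaml and data.tree packages, as the diff shows.

```r
# Hypothetical stand-in for what a parsed config.yaml might contain.
config <- list(
  reports = list(
    api_cirrus_arima = list(
      description = "ARIMA-modelled forecasts of Cirrus API usage by non-automata users"
    ),
    api_cirrus_bsts = list(
      description = "BSTS-modelled forecasts of Cirrus API usage by non-automata users"
    )
  )
)

# Turn one module's reports into Markdown lines: a heading plus one bullet
# per report, naming the TSV file and giving its description.
render_module <- function(module, reports) {
  bullets <- vapply(names(reports), function(name) {
    sprintf("- **%s.tsv**: %s", name, reports[[name]]$description)
  }, character(1), USE.NAMES = FALSE)
  c(sprintf("## %s/", module), "", bullets)
}

readme_lines <- render_module("search", config$reports)
cat(readme_lines, sep = "\n")
```

Because the README text is derived from the report descriptions, adding a report only requires re-knitting rather than editing the Markdown by hand.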

Change-Id: Iaaebdd605c53e74102379a452f70bd17a1aaf851
---
A docs/README.md
A docs/discovery-forecasts.Rmd
A docs/discovery-forecasts.md
A docs/discovery.Rmd
A docs/discovery.md
M main.sh
M modules/metrics/portal/config.yaml
7 files changed, 263 insertions(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 0000000..a5e21e6
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,4 @@
+# READMEs for generated datasets
+
+* [discovery.Rmd](discovery.Rmd) needs to be knit into Markdown 
([discovery.md](discovery.md)) and that is rsync'd to 
stat1002:/a/aggregate-datasets/discovery/README.md
+* [discovery-forecasts.Rmd](discovery-forecasts.Rmd) needs to be knit into 
Markdown ([discovery-forecasts.md](discovery-forecasts.md)) and that is rsync'd 
to stat1002:/a/aggregate-datasets/discovery-forecasts/README.md
diff --git a/docs/discovery-forecasts.Rmd b/docs/discovery-forecasts.Rmd
new file mode 100644
index 0000000..c5ea557
--- /dev/null
+++ b/docs/discovery-forecasts.Rmd
@@ -0,0 +1,34 @@
+---
+output: md_document
+---
+
+# Discovery Forecasts
+
+These files are generated by Discovery's 
[Golden](https://github.com/wikimedia/wikimedia-discovery-golden/) data 
retrieval codebase that executes daily and uses 
[Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater) 
infrastructure. These datasets provide the metrics that are used by 
[Discovery's Dashboards](https://discovery.wmflabs.org/)
+
+Last updated on `r format(Sys.Date(), "%d %B %Y")`
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo = FALSE)
+options(width = 1)
+```
+
+```{r yamls}
+config_yamls <- list.files(path = "../modules/forecasts", pattern = 
"^config\\.yaml$", recursive = TRUE, full.names = TRUE)
+names(config_yamls) <- sub("../modules/forecasts/", "", dirname(config_yamls), 
fixed = TRUE)
+reports <- dplyr::bind_rows(lapply(config_yamls, function(path) {
+  config_yaml <- 
suppressMessages(suppressWarnings(data.tree::as.Node(yaml::yaml.load_file(path))))
+  reports <- data.tree::ToDataFrameTable(config_yaml[["reports"]], "report" = 
"name", "description")
+  reports$path = paste0(file.path(dirname(path), reports$report), 
ifelse(reports$type == "sql", ".sql", ""))
+  return(reports)
+}), .id = "module")
+```
+
+```{r results='asis'}
+for (module in unique(reports$module)) {
+  cat(sprintf("\n## %s", module), "/\n\n", sep = "")
+  for (i in which(reports$module == module)) {
+cat("- **", reports$report[i], ".tsv**: ", reports$description[i], "\n", 
sep = "")
+  }
+}
+```
diff --git a/docs/discovery-forecasts.md b/docs/discovery-forecasts.md
new file mode 100644
index 0000000..69c0b2c
--- /dev/null
+++ b/docs/discovery-forecasts.md
@@ -0,0 +1,43 @@
+Discovery Forecasts
+===
+
+These files are generated by Discovery's
+[Golden](https://github.com/wikimedia/wikimedia-discovery-golden/) data
+retrieval codebase that executes daily and uses
+[Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater)
+infrastructure. These datasets provide the metrics that are used by
+[Discovery's Dashboards](https://discovery.wmflabs.org/)
+
+Last updated on 17 April 2017
+
+search/
+---
+
+-   **api\_cirrus\_arima.tsv**: ARIMA-modelled forecasts of Cirrus API
+usage by non-automata users
+-   **api\_cirrus\_bsts.tsv**: BSTS-modelled forecasts of Cirrus API
+usage by non-automata users
+-   **api\_cirrus\_prophet.tsv**: Prophet-modelled forecasts of Cirrus
+API usage by non-automata users
+-   **zrr\_overall\_arima.tsv**: ARIMA-modelled forecasts of zero
+results rate, excluding known bots/tools
+-   **zrr\_overall\_bsts.tsv**: BSTS-modelled forecasts of zero results
+rate, excluding known bots/tools
+-   **zrr\_overall\_prophet.tsv**: Prophet-modelled forecasts of zero
+results rate, excluding known bots/tools
+
+wdqs/
+-
+
+-   **homepage\_traffic\_arima.tsv**: ARIMA-modelled forecasts of WDQS
+homepage traffic by non-automata users
+-   **homepage\_traffic\_bsts.tsv**: BSTS-modelled forecasts of WDQS
+homepage traffic by non-automata users
+-   **homepage\_traffic\_prophet.tsv**: Prophet-modelled forecasts of
+WDQS homepage traffic by non-automata users
+-   **sparql\_usage\_arima.tsv**: ARIMA-modelled forecasts of WDQS
+SPARQL endpoint usage by non-automata
+-   **sparql\_usage\_bsts.tsv**: 

[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Let users view non-bot pageview traffic breakdown

2017-04-13 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/348018 )

Change subject: Let users view non-bot pageview traffic breakdown
..


Let users view non-bot pageview traffic breakdown
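
The nested selection this change introduces in server.R can be sketched as follows. `select_data` is a hypothetical stand-in written for this example; it only mirrors how `polloi::data_select` appears to be called in the diff (a condition followed by two alternatives), and the strings are placeholders for the actual datasets.

```r
# Hypothetical helper: return the first alternative when the condition is
# TRUE, otherwise the second.
select_data <- function(condition, if_true, if_false) {
  if (isTRUE(condition)) if_true else if_false
}

# Outer choice: include automata or not. Inner choice: shares or raw counts.
pick_dataset <- function(include_automata, as_proportion) {
  select_data(
    include_automata,
    select_data(as_proportion, "all traffic (share)", "all traffic (counts)"),
    select_data(as_proportion, "non-bot traffic (share)", "non-bot traffic (counts)")
  )
}

print(pick_dataset(include_automata = FALSE, as_proportion = TRUE))
```

Two independent checkboxes therefore map onto four datasets, with the "Include automata" checkbox deciding between the overall and non-bot pairs.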

Bug: T161932
Change-Id: I80c6c8a20ac0559e9ba4e4b2711acf505cadb547
---
M server.R
M ui.R
M utils.R
3 files changed, 113 insertions(+), 16 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/server.R b/server.R
index 42ff6bf..528fb21 100644
--- a/server.R
+++ b/server.R
@@ -9,14 +9,30 @@
 function(input, output, session) {
 
   if (Sys.Date() != existing_date) {
+progress <- shiny::Progress$new(session, min = 0, max = 1)
+on.exit(progress$close())
+progress$set(message = "Downloading overall pageview counts...", value = 0)
 read_traffic()
+progress$set(message = "Downloading non-bot pageview counts...", value = 
1/2)
+read_nonbot_traffic()
+progress$set(message = "Finished downloading datasets.", value = 1)
 existing_date <<- Sys.Date()
   }
 
   output$traffic_summary_dygraph <- renderDygraph({
-input$platform_traffic_summary_prop %>%
-  
polloi::data_select(summary_traffic_data_prop[[input$platform_traffic_summary]],
-  
summary_traffic_data[[input$platform_traffic_summary]]) %>%
+input$include_automata_traffic_summary %>%
+  polloi::data_select(
+polloi::data_select(
+  input$platform_traffic_summary_prop,
+  summary_traffic_data_prop[[input$platform_traffic_summary]],
+  summary_traffic_data[[input$platform_traffic_summary]]
+),
+polloi::data_select(
+  input$platform_traffic_summary_prop,
+  summary_traffic_nonbot_data_prop[[input$platform_traffic_summary]],
+  summary_traffic_nonbot_data[[input$platform_traffic_summary]]
+)
+  ) %>%
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_traffic_summary)) 
%>%
   polloi::make_dygraph(xlab = "Date", ylab = 
ifelse(input$platform_traffic_summary_prop, "Pageview Share (%)", "Pageviews"),
title = "Sources of page views (e.g. search engines 
and internal referers)") %>%
@@ -31,9 +47,19 @@
   })
 
   output$traffic_bysearch_dygraph <- renderDygraph({
-input$platform_traffic_bysearch_prop %>%
-  
polloi::data_select(bysearch_traffic_data_prop[[input$platform_traffic_bysearch]],
-  
bysearch_traffic_data[[input$platform_traffic_bysearch]]) %>%
+input$include_automata_traffic_bysearch %>%
+  polloi::data_select(
+polloi::data_select(
+  input$platform_traffic_bysearch_prop,
+  bysearch_traffic_data_prop[[input$platform_traffic_bysearch]],
+  bysearch_traffic_data[[input$platform_traffic_bysearch]]
+),
+polloi::data_select(
+  input$platform_traffic_bysearch_prop,
+  bysearch_traffic_nonbot_data_prop[[input$platform_traffic_bysearch]],
+  bysearch_traffic_nonbot_data[[input$platform_traffic_bysearch]]
+)
+  ) %>%
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_traffic_bysearch)) %>%
   polloi::make_dygraph(xlab = "Date", ylab = 
ifelse(input$platform_traffic_bysearch_prop, "Pageview Share (%)", "Pageviews"),
title = "Pageviews from external search engines, 
broken down by engine") %>%
diff --git a/ui.R b/ui.R
index 6baa5be..e4cf745 100644
--- a/ui.R
+++ b/ui.R
@@ -2,6 +2,10 @@
 library(shinydashboard)
 library(dygraphs)
 
+spider_checkbox <- function(input_id) {
+  shiny::checkboxInput(input_id, "Include automata", value = TRUE, width = 
NULL)
+}
+
 function(request) {
   dashboardPage(
 
@@ -29,26 +33,34 @@
   tabItems(
 tabItem(tabName = "traffic_summary",
 fluidRow(
-  column(selectizeInput(inputId = "platform_traffic_summary", 
label = "Platform", choices = c("All", "Desktop", "Mobile Web")), width = 2),
-  column(HTML("Scale"),
+  column(selectizeInput(inputId = "platform_traffic_summary", 
label = "Platform", choices = c("All", "Desktop", "Mobile Web")), width = 3),
+  column(HTML("Data"),
+ spider_checkbox("include_automata_traffic_summary"), 
width = 2),
+  
column(conditionalPanel("!input.platform_traffic_summary_prop", HTML("Scale")),
  
conditionalPanel("!input.platform_traffic_summary_prop", 
checkboxInput("platform_traffic_summary_log", label = "Use Log scale", value = 
FALSE)),
+ width = 2),
+  
column(conditionalPanel("!input.platform_traffic_summary_log", HTML("Type")),
  
conditionalPanel("!input.platform_traffic_summary_log", 
checkboxInput("platform_traffic_summary_prop", 

[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Fix data path in ui.R for full geo dashboard

2017-04-11 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/347798 )

Change subject: Fix data path in ui.R for full geo dashboard
..

Fix data path in ui.R for full geo dashboard

Bug: T161806
Change-Id: I4ba0952a4e522fb8febb2d53e2a0440763ec0787
---
M ui.R
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/prince 
refs/changes/98/347798/1

diff --git a/ui.R b/ui.R
index a39db01..84f69c7 100644
--- a/ui.R
+++ b/ui.R
@@ -1,7 +1,7 @@
 library(shiny)
 library(shinydashboard)
 
-all_country_data <- polloi::read_dataset("portal/all_country_data.tsv", 
col_types = "Dcididid")
+all_country_data <- 
polloi::read_dataset("discovery/portal/all_country_data.tsv", col_types = 
"Dcididid")
 
 function(request) {
   dashboardPage(

-- 
To view, visit https://gerrit.wikimedia.org/r/347798
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I4ba0952a4e522fb8febb2d53e2a0440763ec0787
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/prince
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits


[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Add relative option to referrer summary

2017-04-10 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/347040 )

Change subject: Add relative option to referrer summary
..


Add relative option to referrer summary

- Adds the option to view traffic breakdown as percentages
- Adds the option to view traffic breakdown on a log10 scale
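
As a rough illustration of the two views this change adds, the sketch below computes each referer class's share of a day's pageviews and the log10 of the raw counts. It assumes a toy long-format data frame; the values are invented, and base R's `ave()` stands in for the `dplyr::group_by()`/`dplyr::mutate()` calls used in utils.R.

```r
# Toy long-format data: column names mirror the diff, values are invented.
traffic <- data.frame(
  date = as.Date(c("2017-04-01", "2017-04-01", "2017-04-02", "2017-04-02")),
  referer_class = c("internal", "search engine", "internal", "search engine"),
  pageviews = c(300, 100, 150, 50)
)

# Percentage view: each row's pageviews as a share of that day's total.
daily_total <- ave(traffic$pageviews, as.character(traffic$date), FUN = sum)
traffic$share <- 100 * traffic$pageviews / daily_total

# Log view: raw counts on a log10 scale.
traffic$log10_pageviews <- log10(traffic$pageviews)

print(traffic)
```

Within each date the shares sum to 100, which is why the dashboard relabels the y-axis as "Pageview Share (%)" when the proportion option is on.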

Bug: T161771
Change-Id: I4516f7a6d1d7bc12bdd9c41d3983aa64bb3123d5
---
M server.R
M ui.R
M utils.R
3 files changed, 21 insertions(+), 7 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/server.R b/server.R
index 1c892de..42ff6bf 100644
--- a/server.R
+++ b/server.R
@@ -14,12 +14,17 @@
   }
 
   output$traffic_summary_dygraph <- renderDygraph({
-summary_traffic_data[[input$platform_traffic_summary]] %>%
+input$platform_traffic_summary_prop %>%
+  
polloi::data_select(summary_traffic_data_prop[[input$platform_traffic_summary]],
+  
summary_traffic_data[[input$platform_traffic_summary]]) %>%
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_traffic_summary)) 
%>%
-  polloi::make_dygraph(xlab = "Date", ylab = "Pageviews",
+  polloi::make_dygraph(xlab = "Date", ylab = 
ifelse(input$platform_traffic_summary_prop, "Pageview Share (%)", "Pageviews"),
title = "Sources of page views (e.g. search engines 
and internal referers)") %>%
   dyLegend(labelsDiv = "traffic_summary_legend", show = "always", 
showZeroValues = FALSE) %>%
-  dyRangeSelector(retainDateWindow = TRUE) %>%
+  dyAxis("y", logscale = input$platform_traffic_summary_log) %>%
+  dyRangeSelector(fillColor = ifelse(input$platform_traffic_summary_prop, 
"", "#A7B1C4"),
+  strokeColor = 
ifelse(input$platform_traffic_summary_prop, "", "#808FAB"),
+  retainDateWindow = TRUE) %>%
   dyEvent(as.Date("2016-03-07"), "A (new UDF)", labelLoc = "bottom") %>%
   dyEvent(as.Date("2016-06-26"), "B (DuckDuckGo)", labelLoc = "bottom") %>%
   dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
diff --git a/ui.R b/ui.R
index 1ac0e5a..6baa5be 100644
--- a/ui.R
+++ b/ui.R
@@ -30,8 +30,12 @@
 tabItem(tabName = "traffic_summary",
 fluidRow(
   column(selectizeInput(inputId = "platform_traffic_summary", 
label = "Platform", choices = c("All", "Desktop", "Mobile Web")), width = 2),
+  column(HTML("Scale"),
+ 
conditionalPanel("!input.platform_traffic_summary_prop", 
checkboxInput("platform_traffic_summary_log", label = "Use Log scale", value = 
FALSE)),
+ 
conditionalPanel("!input.platform_traffic_summary_log", 
checkboxInput("platform_traffic_summary_prop", label = "Use Proportion", value 
= FALSE)),
+ width = 2),
   column(polloi::smooth_select("smoothing_traffic_summary"), 
width = 3),
-  column(div(id = "traffic_summary_legend", style = 
"text-align: right;"), width = 7)),
+  column(div(id = "traffic_summary_legend", style = 
"text-align: right;"), width = 5)),
 dygraphOutput("traffic_summary_dygraph"),
 includeMarkdown("./tab_documentation/traffic_summary.md")
 ),
diff --git a/utils.R b/utils.R
index b3f338b..009a0c7 100644
--- a/utils.R
+++ b/utils.R
@@ -29,6 +29,12 @@
   names(interim) <- c("Desktop", "Mobile Web", "All")
   summary_traffic_data <<- lapply(interim, tidyr::spread, key = 
"referer_class", value = "pageviews", fill = NA)
 
+  # Proportion
+  summary_traffic_data_prop <<- interim %>%
+lapply(dplyr::group_by, date) %>%
+lapply(dplyr::mutate, pageviews = 100*pageviews/sum(pageviews)) %>%
+lapply(tidyr::spread, key = "referer_class", value = "pageviews", fill = 
NA)
+
   # Generate per-engine values
   interim <- data[is_search == TRUE,
   j = list(pageviews = sum(pageviews)),
@@ -44,10 +50,9 @@
 lapply(tidyr::spread, key = "search_engine", value = "pageviews", fill = 
NA)
 
   # Proportion
-  interim <- interim %>%
-lapply(dplyr::group_by, date) %>%
-lapply(dplyr::mutate, pageviews = 100*pageviews/sum(pageviews))
   bysearch_traffic_data_prop <<- interim %>%
+lapply(dplyr::group_by, date) %>%
+lapply(dplyr::mutate, pageviews = 100*pageviews/sum(pageviews)) %>%
 lapply(dplyr::filter_, .dots = list(quote(search_engine != "Not referred 
by search"))) %>%
 lapply(tidyr::spread, key = "search_engine", value = "pageviews", fill = 
NA)
 

-- 
To view, visit https://gerrit.wikimedia.org/r/347040

Gerrit-MessageType: merged
Gerrit-Change-Id: I4516f7a6d1d7bc12bdd9c41d3983aa64bb3123d5
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/wonderbolt
Gerrit-Branch: master
Gerrit-Owner: Bearloga 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Get browser info from new userAgent field

2017-04-05 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/346655 )

Change subject: Get browser info from new userAgent field
..

Get browser info from new userAgent field
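
A minimal sketch of the format check the diff below relies on: a `user_agent` value that starts with `{` is treated as already-parsed JSON, and anything else as a raw user agent string that still needs parsing. The helper name `looks_like_json` and the sample strings are hypothetical and used only for illustration.

```r
# Hypothetical helper: TRUE when a user_agent value looks like a JSON object.
looks_like_json <- function(x) grepl("^\\{", x)

# Invented sample values: one JSON-encoded, one raw user agent string.
user_agents <- c(
  '{"browser_family": "Firefox", "browser_major": "52"}',
  "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
)

# Label each value with the branch it would take.
ua_format <- ifelse(looks_like_json(user_agents), "json", "raw")
print(ua_format)
```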

Bug: T162178
Change-Id: Ib292a1b87338c596125618b29ad16b7a82e48141
---
M modules/metrics/portal/user_agents.R
1 file changed, 10 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden 
refs/changes/55/346655/1

diff --git a/modules/metrics/portal/user_agents.R 
b/modules/metrics/portal/user_agents.R
index 20097b1..6e45723 100644
--- a/modules/metrics/portal/user_agents.R
+++ b/modules/metrics/portal/user_agents.R
@@ -53,7 +53,16 @@
   # Get user agent data
   wmf::set_proxies() # To allow for the latest YAML to be retrieved.
   uaparser::update_regexes()
-  ua_data <- 
data.table::as.data.table(uaparser::parse_agents(results$user_agent, fields = 
c("browser", "browser_major")))
+  ua_data <- data.table::rbindlist(lapply(results$user_agent, function(x){
+if (grepl("^\\{", x)){
+  temp <- unlist(jsonlite::fromJSON(x)[c("browser_family", 
"browser_major")])
+  names(temp)[1] <- "browser"
+  temp <- as.data.frame(as.list(temp))
+  return(temp)
+} else {
+  uaparser::parse_agents(x, fields = c("browser", "browser_major"))
+}
+  }), fill = TRUE)
   ua_data <- ua_data[, j = list(amount = .N), by = c("browser", 
"browser_major")]
   ua_data$date <- results$date[1]
   ua_data$percent <- round((ua_data$amount/sum(ua_data$amount)) * 100, 2)

-- 
To view, visit https://gerrit.wikimedia.org/r/346655

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib292a1b87338c596125618b29ad16b7a82e48141
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...rainbow[master]: Implement the wiki/language selector in more search dashboards

2017-04-04 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/346461 )

Change subject: Implement the wiki/language selector in more search dashboards
..

Implement the wiki/language selector in more search dashboards

Three new dashboards are added:
- CTR by Language/Project
- Events by Language/Project
- PaulScore by Language/Project

Bug: T150410
Change-Id: Ie04762d747a9dcbec1564d8945f8949ed8c52adc
---
M server.R
A tab_documentation/desktop_events_langproj.md
A tab_documentation/kpi_ctr_langproj.md
A tab_documentation/paulscore_langproj.html
M ui.R
M utils.R
6 files changed, 520 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow 
refs/changes/61/346461/1

diff --git a/server.R b/server.R
index 5ec500e..79a7846 100644
--- a/server.R
+++ b/server.R
@@ -26,6 +26,7 @@
 read_failures(existing_date)
 progress$set(message = "Downloading engagement data", value = 0.7)
 read_augmented_clickthrough()
+read_augmented_clickthrough_langproj()
 progress$set(message = "Downloading survival data", value = 0.8)
 read_lethal_dose()
 progress$set(message = "Downloading PaulScore data", value = 0.9)
@@ -877,6 +878,191 @@
   dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
+  output$ctr_language_selector_container <- renderUI({
+if (input$ctr_language_order == "alphabet") {
+  languages_to_display <- as.list(sort(available_languages_ctr$language))
+  names(languages_to_display) <- 
available_languages_ctr$label[order(available_languages_ctr$language)]
+} else {
+  languages_to_display <- available_languages_ctr$language
+  names(languages_to_display) <- available_languages_ctr$label
+}
+
+# e.g. if user sorts projects alphabetically and the selected project is 
"10th Anniversary of Wikipedia"
+#  then automatically select the language "(None)" to avoid giving 
user an error. This also works if
+#  the user selects a project that is not multilingual, so this 
automatically chooses the "(None)"
+#  option for the user.
+if (any(input$ctr_project_selector %in% 
projects_db$project[!projects_db$multilingual])) {
+  if (any(input$ctr_project_selector %in% 
projects_db$project[projects_db$multilingual])) {
+if (!is.null(input$ctr_language_selector)) {
+  selected_language <- union("(None)", input$ctr_language_selector)
+} else {
+  selected_language <- c("(None)", languages_to_display[[1]])
+}
+  } else {
+selected_language <- "(None)"
+  }
+} else {
+  if (!is.null(input$ctr_language_selector)) {
+selected_language <- input$ctr_language_selector
+  } else {
+selected_language <- languages_to_display[[1]]
+  }
+}
+return(selectInput("ctr_language_selector", "Language", multiple = 
TRUE,selectize = FALSE, size = 19,
+   choices = languages_to_display, selected = 
selected_language))
+  })
+
+  output$ctr_project_selector_container <- renderUI({
+if (input$ctr_project_order == "alphabet") {
+  projects_to_display <- as.list(sort(available_projects_ctr$project))
+  names(projects_to_display) <- 
available_projects_ctr$label[order(available_projects_ctr$project)]
+} else {
+  projects_to_display <- available_projects_ctr$project
+  names(projects_to_display) <- available_projects_ctr$label
+}
+return(selectInput("ctr_project_selector", "Project", multiple = 
TRUE,selectize = FALSE, size = 19,
+   choices = projects_to_display, selected = 
projects_to_display[[1]]))
+  })
+
+  output$kpi_ctr_langproj_plot <- renderDygraph({
+augmented_clickthroughs_langproj %>%
+  kpi_ctr_aggregate_wikis(input$ctr_language_selector, 
input$ctr_project_selector) %>%
+  dplyr::select_(.dots=c("date", "wiki", 
paste0("`",input$ctr_metrics,"`"))) %>%
+  tidyr::spread_(., key_col="wiki", value_col=input$ctr_metrics, fill=0) 
%>%
+  polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_kpi_ctr_langproj)) %>%
+  polloi::make_dygraph(xlab = "Date", ylab = input$ctr_metrics, title = 
paste0(input$ctr_metrics, ", by day")) %>%
+  dyAxis("y", axisLabelFormatter = "function(x) { return x + '%'; }", 
valueFormatter = "function(x) { return x + '%'; }") %>%
+  dyLegend(show = "always", width = 400, labelsDiv = 
"kpi_ctr_langproj_legend") %>%
+  dyAxis("x", axisLabelFormatter = polloi::custom_axis_formatter) %>%
+  dyRangeSelector(fillColor = "")
+  })
+
+  output$desktop_events_language_selector_container <- renderUI({
+if (input$desktop_events_language_order == "alphabet") {
+  languages_to_display <- 
as.list(sort(available_languages_desktop$language))
+  names(languages_to_display) <- 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Fix execution permissions

2017-03-30 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/345470 )

Change subject: Fix execution permissions
..


Fix execution permissions

Change-Id: I682e0a6a6cae2f7d144e8150502d446251317877
---
M main.sh
M modules/forecasts/search/api_cirrus_arima
M modules/forecasts/search/api_cirrus_bsts
M modules/forecasts/search/zrr_overall_arima
M modules/forecasts/search/zrr_overall_bsts
M modules/forecasts/wdqs/homepage_traffic_arima
M modules/forecasts/wdqs/homepage_traffic_bsts
M modules/forecasts/wdqs/sparql_usage_arima
M modules/forecasts/wdqs/sparql_usage_bsts
M modules/metrics/external_traffic/referer_data
M modules/metrics/maps/tile_aggregates_no_automata
M modules/metrics/maps/tile_aggregates_with_automata
M modules/metrics/maps/users_by_country
M modules/metrics/portal/all_country_data
M modules/metrics/portal/clickthrough_breakdown
M modules/metrics/portal/clickthrough_firstvisit
M modules/metrics/portal/clickthrough_rate
M modules/metrics/portal/clickthrough_sisterprojects
M modules/metrics/portal/country_data
M modules/metrics/portal/dwell_metrics
M modules/metrics/portal/first_visits_country
M modules/metrics/portal/language_destination
M modules/metrics/portal/language_switching
M modules/metrics/portal/last_action_country
M modules/metrics/portal/most_common_country
M modules/metrics/portal/most_common_per_visit
M modules/metrics/portal/pageviews
M modules/metrics/portal/referer_data
M modules/metrics/portal/user_agent_data
M modules/metrics/search/app_load_times
M modules/metrics/search/cirrus_langproj_breakdown_no_automata
M modules/metrics/search/cirrus_langproj_breakdown_with_automata
M modules/metrics/search/cirrus_query_aggregates_no_automata
M modules/metrics/search/cirrus_query_aggregates_with_automata
M modules/metrics/search/cirrus_query_breakdowns_no_automata
M modules/metrics/search/cirrus_query_breakdowns_with_automata
M modules/metrics/search/cirrus_suggestion_breakdown_no_automata
M modules/metrics/search/cirrus_suggestion_breakdown_with_automata
M modules/metrics/search/desktop_load_times
M modules/metrics/search/mobile_load_times
M modules/metrics/search/sample_page_visit_ld
M modules/metrics/search/search_api_usage
M modules/metrics/wdqs/basic_usage
43 files changed, 0 insertions(+), 7 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/main.sh b/main.sh
index 19ad53b..cb5022d 100644
--- a/main.sh
+++ b/main.sh
@@ -1,12 +1,5 @@
 #!/bin/bash
 
-# Check if modules/forecasts/forecast.R has execution permission for 
Reportupdater
-# (If it doesn't, then other R and shell scripts in modules/ probably don't 
either.)
-if [ `ls -l modules/forecasts | grep -e forecast.R | grep -e "-rwxrwxr-x" | wc 
-l` == "0" ]; then
-  echo "Warning: modules do not have execution permission; granting now..."
-  chmod +x -R modules/
-fi
-
 # Check if Reportupdater git submodule is set up
 if [ ! -f reportupdater/update_reports.py ]; then
   echo "Warning: Reportupdater needs to be initialized and updated..."
diff --git a/modules/forecasts/search/api_cirrus_arima 
b/modules/forecasts/search/api_cirrus_arima
old mode 100644
new mode 100755
diff --git a/modules/forecasts/search/api_cirrus_bsts 
b/modules/forecasts/search/api_cirrus_bsts
old mode 100644
new mode 100755
diff --git a/modules/forecasts/search/zrr_overall_arima 
b/modules/forecasts/search/zrr_overall_arima
old mode 100644
new mode 100755
diff --git a/modules/forecasts/search/zrr_overall_bsts 
b/modules/forecasts/search/zrr_overall_bsts
old mode 100644
new mode 100755
diff --git a/modules/forecasts/wdqs/homepage_traffic_arima 
b/modules/forecasts/wdqs/homepage_traffic_arima
old mode 100644
new mode 100755
diff --git a/modules/forecasts/wdqs/homepage_traffic_bsts 
b/modules/forecasts/wdqs/homepage_traffic_bsts
old mode 100644
new mode 100755
diff --git a/modules/forecasts/wdqs/sparql_usage_arima 
b/modules/forecasts/wdqs/sparql_usage_arima
old mode 100644
new mode 100755
diff --git a/modules/forecasts/wdqs/sparql_usage_bsts 
b/modules/forecasts/wdqs/sparql_usage_bsts
old mode 100644
new mode 100755
diff --git a/modules/metrics/external_traffic/referer_data 
b/modules/metrics/external_traffic/referer_data
old mode 100644
new mode 100755
diff --git a/modules/metrics/maps/tile_aggregates_no_automata 
b/modules/metrics/maps/tile_aggregates_no_automata
old mode 100644
new mode 100755
diff --git a/modules/metrics/maps/tile_aggregates_with_automata 
b/modules/metrics/maps/tile_aggregates_with_automata
old mode 100644
new mode 100755
diff --git a/modules/metrics/maps/users_by_country 
b/modules/metrics/maps/users_by_country
old mode 100644
new mode 100755
diff --git a/modules/metrics/portal/all_country_data 
b/modules/metrics/portal/all_country_data
old mode 100644
new mode 100755
diff --git a/modules/metrics/portal/clickthrough_breakdown 
b/modules/metrics/portal/clickthrough_breakdown
old mode 100644
new mode 100755

[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Add relative option to External By Search Engine

2017-03-30 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/345611 )

Change subject: Add relative option to External By Search Engine
..

Add relative option to External By Search Engine

Bug: T161771
Change-Id: I833b6477e6375d2ee16da38dac5096e37eb6afb4
---
M server.R
M ui.R
M utils.R
3 files changed, 34 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/wonderbolt 
refs/changes/11/345611/1

diff --git a/server.R b/server.R
index b0863cf..f835b6e 100644
--- a/server.R
+++ b/server.R
@@ -26,15 +26,28 @@
   })
 
   output$traffic_bysearch_dygraph <- renderDygraph({
-bysearch_traffic_data[[input$platform_traffic_bysearch]] %>%
-  polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_traffic_bysearch)) %>%
-  polloi::make_dygraph(xlab = "Date", ylab = "Pageviews",
-   title = "Pageviews from external search engines, 
broken down by engine") %>%
-  dyLegend(labelsDiv = "traffic_bysearch_legend", show = "always", 
showZeroValues = FALSE) %>%
-  dyAxis("y", logscale = input$platform_traffic_bysearch_log) %>%
-  dyRangeSelector(fillColor = "", strokeColor = "") %>%
-  dyEvent(as.Date("2016-06-26"), "A (DuckDuckGo)", labelLoc = "bottom") %>%
-  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+if (input$platform_traffic_bysearch_prop == FALSE){
+  bysearch_traffic_data[[input$platform_traffic_bysearch]] %>%
+polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_traffic_bysearch)) %>%
+polloi::make_dygraph(xlab = "Date", ylab = "Pageviews",
+ title = "Pageviews from external search engines, 
broken down by engine") %>%
+dyLegend(labelsDiv = "traffic_bysearch_legend", show = "always", 
showZeroValues = FALSE) %>%
+dyAxis("y", logscale = input$platform_traffic_bysearch_log) %>%
+dyRangeSelector(fillColor = "", strokeColor = "") %>%
+dyEvent(as.Date("2016-06-26"), "A (DuckDuckGo)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = 
"bottom")
+} else{
+  bysearch_traffic_data_prop[[input$platform_traffic_bysearch]] %>%
+polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_traffic_bysearch)) %>%
+polloi::make_dygraph(xlab = "Date", ylab = "Pageview Share (%)",
+ title = "Pageview shares from external search 
engines, broken down by engine") %>%
+dyLegend(labelsDiv = "traffic_bysearch_legend", show = "always", 
showZeroValues = FALSE) %>%
+dyAxis("y", logscale = input$platform_traffic_bysearch_log) %>%
+dyRangeSelector(fillColor = "", strokeColor = "") %>%
+dyEvent(as.Date("2016-06-26"), "A (DuckDuckGo)", labelLoc = "bottom") 
%>%
+dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = 
"bottom")
+}
+
   })
 
   # Check datasets for missing data and notify user which datasets are missing 
data (if any)
diff --git a/ui.R b/ui.R
index 358e8a9..c0928f4 100644
--- a/ui.R
+++ b/ui.R
@@ -38,7 +38,10 @@
 tabItem(tabName = "traffic_by_engine",
 fluidRow(
   column(selectizeInput(inputId = "platform_traffic_bysearch", 
label = "Platform", choices = c("All", "Desktop", "Mobile Web")), width = 2),
-  column(HTML("Scale"), 
checkboxInput("platform_traffic_bysearch_log", label = "Use Log scale", value = 
FALSE), width = 2),
+  column(HTML("Scale"),
+ checkboxInput("platform_traffic_bysearch_log", label 
= "Use Log scale", value = FALSE),
+ checkboxInput("platform_traffic_bysearch_prop", label 
= "Use Proportion", value = FALSE),
+ width = 2),
   column(polloi::smooth_select("smoothing_traffic_bysearch"), 
width = 3),
   column(div(id = "traffic_bysearch_legend", style = 
"text-align: right;"), width = 5)),
 dygraphOutput("traffic_bysearch_dygraph"),
diff --git a/utils.R b/utils.R
index e2d7a1b..4a12d66 100644
--- a/utils.R
+++ b/utils.R
@@ -41,5 +41,13 @@
 lapply(dplyr::filter_, .dots = list(quote(search_engine != "Not referred 
by search"))) %>%
 lapply(tidyr::spread, key = "search_engine", value = "pageviews", fill = 
NA)
 
+  # Proportion
+  interim <- interim %>%
+lapply(dplyr::group_by, date) %>%
+lapply(dplyr::mutate, pageviews = 100*pageviews/sum(pageviews))
+  bysearch_traffic_data_prop <<- interim %>%
+lapply(dplyr::filter_, .dots = list(quote(search_engine != "Not referred 
by search"))) %>%
+lapply(tidyr::spread, key = "search_engine", value = "pageviews", fill = 
NA)
+
   return(invisible())
 }

-- 
To view, visit 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Add scripts to enable language/project breakdown for several...

2017-03-22 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/344207 )

Change subject: Add scripts to enable language/project breakdown for several 
search metrics
..

Add scripts to enable language/project breakdown for several search metrics

- App event counts
- Mobile event counts
- Desktop event counts
- PaulScore approximations
- Search threshold pass rate

Bug: T150410
Change-Id: I4fbe097a84362fc13cb4b2e44b46fdbddf385bc4
---
A modules/metrics/search/app_event_counts_wiki_breakdown.sql
M modules/metrics/search/config.yaml
M modules/metrics/search/desktop_event_counts
M modules/metrics/search/desktop_event_counts.R
A modules/metrics/search/desktop_event_counts_langproj_breakdown
A modules/metrics/search/mobile_event_counts_wiki_breakdown.sql
A modules/metrics/search/paulscore_approximations_fulltext_wiki_breakdown.sql
M modules/metrics/search/search_threshold_pass_rate
M modules/metrics/search/search_threshold_pass_rate.R
A modules/metrics/search/search_threshold_pass_rate_langproj_breakdown
10 files changed, 209 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden 
refs/changes/07/344207/1

diff --git a/modules/metrics/search/app_event_counts_wiki_breakdown.sql 
b/modules/metrics/search/app_event_counts_wiki_breakdown.sql
new file mode 100644
index 000..4a2b141
--- /dev/null
+++ b/modules/metrics/search/app_event_counts_wiki_breakdown.sql
@@ -0,0 +1,32 @@
+SELECT
+  date, wiki, action, platform, COUNT(*) AS events
+FROM (
+  SELECT
+DATE('{from_timestamp}') AS date,
+wiki,
+CASE event_action WHEN 'click' THEN 'clickthroughs'
+  WHEN 'start' THEN 'search sessions'
+  WHEN 'results' THEN 'Result pages opened'
+  END AS action,
+CASE WHEN INSTR(userAgent, 'Android') > 0 THEN 'Android'
+ ELSE 'iOS' END AS platform
+  FROM MobileWikiAppSearch_10641988
+  WHERE
+timestamp >= '{from_timestamp}' AND timestamp < '{to_timestamp}'
+AND event_action IN ('click', 'start', 'results')
+  UNION ALL
+  SELECT
+DATE('{from_timestamp}') AS date,
+wiki,
+CASE event_action WHEN 'click' THEN 'clickthroughs'
+  WHEN 'start' THEN 'search sessions'
+  WHEN 'results' THEN 'Result pages opened'
+  END AS action,
+CASE WHEN INSTR(userAgent, 'Android') > 0 THEN 'Android'
+ ELSE 'iOS' END AS platform
+  FROM MobileWikiAppSearch_15729321
+  WHERE
+timestamp >= '{from_timestamp}' AND timestamp < '{to_timestamp}'
+AND event_action IN ('click', 'start', 'results')
+) AS MobileWikiAppSearch
+GROUP BY date, wiki, action, platform;
diff --git a/modules/metrics/search/config.yaml 
b/modules/metrics/search/config.yaml
index b397e08..9730121 100644
--- a/modules/metrics/search/config.yaml
+++ b/modules/metrics/search/config.yaml
@@ -13,6 +13,12 @@
 starts: 2014-12-05
 funnel: true
 type: sql
+app_event_counts_wiki_breakdown:
+description: Clicks and other events by users searching on Android and 
iOS apps broken down by wiki ID
+granularity: days
+starts: 2014-12-05
+funnel: true
+type: sql
 app_load_times:
 description: User-perceived load times when searching on Android and 
iOS apps
 granularity: days
@@ -37,6 +43,12 @@
 starts: 2015-06-11
 funnel: true
 type: sql
+mobile_event_counts_wiki_breakdown:
+description: Clicks and other events by users searching on mobile web 
broken down by wiki ID
+granularity: days
+starts: 2015-06-11
+funnel: true
+type: sql
 mobile_load_times:
 description: User-perceived load times when searching on mobile web
 granularity: days
@@ -48,6 +60,12 @@
 starts: 2015-04-14
 funnel: true
 type: script
+desktop_event_counts_langproj_breakdown:
+description: Clicks and other events by users searching on desktop 
broken down by language-project pairs
+granularity: days
+starts: 2015-04-14
+funnel: true
+type: script
 desktop_load_times:
 description: User-perceived load times when searching on desktop
 granularity: days
@@ -55,6 +73,12 @@
 type: script
 paulscore_approximations:
 description: Relevancy of our desktop search as measured by 
[PaulScore](https://www.mediawiki.org/wiki/Wikimedia_Discovery/Search/Glossary#PaulScore)
+granularity: days
+starts: 2016-10-25
+funnel: true
+type: sql
+paulscore_approximations_fulltext_wiki_breakdown:
+description: Relevancy of our fulltext desktop search as measured by 
[PaulScore](https://www.mediawiki.org/wiki/Wikimedia_Discovery/Search/Glossary#PaulScore)
 broken down by wiki ID
 granularity: days
 starts: 

[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Adds 'x' button to remove selected languages, countries

2017-03-21 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/343931 )

Change subject: Adds 'x' button to remove selected languages, countries
..


Adds 'x' button to remove selected languages, countries

Change-Id: Ifcb0478f0b39273ec7aa86b8704e3a3bc25cf6eb
---
M server.R
M ui.R
2 files changed, 7 insertions(+), 7 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/server.R b/server.R
index 2d82d55..77dca0e 100644
--- a/server.R
+++ b/server.R
@@ -514,7 +514,7 @@
 if (input$lv_sort %in% c("top10", "bottom50")) {
   hidden(disabled(selectizeInput("lv_languages", "Wikipedia languages", 
lv_reactive$choices, lv_reactive$selected_langs, multiple = TRUE)))
 } else {
-  selectizeInput("lv_languages", "Wikipedia languages (12 max)", 
lv_reactive$choices, lv_reactive$selected_langs, multiple = TRUE, options = 
list(maxItems = 12))
+  selectizeInput("lv_languages", "Wikipedia languages (12 max)", 
lv_reactive$choices, lv_reactive$selected_langs, multiple = TRUE, options = 
list(maxItems = 12, plugins = list("remove_button")))
 }
   })
 
diff --git a/ui.R b/ui.R
index 84a55cc..a39db01 100644
--- a/ui.R
+++ b/ui.R
@@ -44,8 +44,8 @@
menuSubItem(text = "Most Common Section", tabName = 
"most_common_by_country"),
icon = icon("globe", lib = "glyphicon")),
   menuItem(text = "Global Settings",
-   selectInput(inputId = "smoothing_global", label = 
"Smoothing", selectize = TRUE, selected = "day",
-   choices = c("No Smoothing" = "day", 
"Weekly Median" = "week", "Monthly Median" = "month", "Splines" = "gam")),
+   selectizeInput(inputId = "smoothing_global", label 
= "Smoothing", selected = "day",
+  choices = c("No Smoothing" = "day", 
"Weekly Median" = "week", "Monthly Median" = "month", "Splines" = "gam")),
br(style = "line-height:25%;"), icon = icon("cog", 
lib = "glyphicon"))
   ),
   div(icon("info-sign", lib = "glyphicon"), HTML("Tip: 
you can drag on the graphs with your mouse to zoom in on a particular date 
range."), style = "padding: 10px; color: black;"),
@@ -270,7 +270,7 @@
conditionalPanel("(input.traffic_select=='events' || 
input.traffic_select=='visits' || input.traffic_select=='sessions') && 
!input.prop_a", checkboxInput("cntr_logscale_a", "Use Log scale", FALSE))),
 column(width = 5,
conditionalPanel("input.cntr_sort_a == 'custom_a'",
-selectizeInput("cntr_a", "Countries", 
choices = sort(c(unique(all_country_data$country), "United States")), selected 
= c("United Kingdom", "Germany", "India", "Canada", "U.S. (South)"), multiple = 
TRUE, width = "100%")))
+selectizeInput("cntr_a", "Countries", 
choices = sort(c(unique(all_country_data$country), "United States")), selected 
= c("United Kingdom", "Germany", "India", "Canada", "U.S. (South)"), multiple = 
TRUE, options = list(plugins = list("remove_button")), width = "100%")))
   ),
   fluidRow(
 highcharter::highchartOutput("traffic_pie_pl", height = "500px"),
@@ -320,7 +320,7 @@
conditionalPanel("!input.prop_f", 
checkboxInput("cntr_logscale_f", "Use Log scale", FALSE))),
 column(width = 5,
conditionalPanel("input.cntr_sort_f == 'custom_f'",
-selectizeInput("cntr_f", "Countries", 
choices = sort(c(unique(all_country_data$country),"United States")), selected = 
c("United Kingdom", "Germany", "India", "Canada", "U.S. (South)"), multiple = 
TRUE, width = "100%")))
+selectizeInput("cntr_f", "Countries", 
choices = sort(c(unique(all_country_data$country),"United States")), selected = 
c("United Kingdom", "Germany", "India", "Canada", "U.S. (South)"), multiple = 
TRUE, options = list(plugins = list("remove_button")), width = "100%")))
   ),
   fluidRow(
 highcharter::highchartOutput("first_visit_pie_pl", height = 
"500px"),
@@ -369,7 +369,7 @@
conditionalPanel("!input.prop_l", 
checkboxInput("cntr_logscale_l", "Use Log scale", FALSE))),
 column(width = 5,
conditionalPanel("input.cntr_sort_l == 'custom_l'",
-selectizeInput("cntr_l", "Countries", 
choices = sort(c(unique(all_country_data$country), "United States")), selected 
= c("United Kingdom", "Germany", "India", "Canada", "U.S. (South)"), multiple = 
TRUE, width = "100%")))
+selectizeInput("cntr_l", "Countries", 
choices = sort(c(unique(all_country_data$country), "United States")), selected 
= c("United 

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Enable forecasting modules

2017-03-17 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/343323 )

Change subject: Enable forecasting modules
..


Enable forecasting modules

- Fixes dormant forecasting modules
  - Search
- Cirrus API
- ZRR
  - Wikidata Query Service
- Homepage traffic
- SPARQL endpoint usage
- Wakes up forecasting in main.sh
- Augments test.R for forecasting

Testing command:

```
Rscript test.R --disable_metrics --start_date=2017-03-01 --end_date=2017-03-02 >> test_`date +%F_%T`.log.md 2>&1
```

Bug: T112170
Change-Id: I4a1d9591ab73ed45ef8f234bcdb0a528c120cf77
---
M README.md
M main.sh
M modules/forecasts/forecast.R
M modules/forecasts/search/api_cirrus_arima
M modules/forecasts/search/api_cirrus_bsts
M modules/forecasts/search/config.yaml
M modules/forecasts/search/zrr_overall_arima
M modules/forecasts/search/zrr_overall_bsts
M modules/forecasts/wdqs/config.yaml
M modules/forecasts/wdqs/homepage_traffic_arima
M modules/forecasts/wdqs/homepage_traffic_bsts
M modules/forecasts/wdqs/sparql_usage_arima
M modules/forecasts/wdqs/sparql_usage_bsts
M test.R
14 files changed, 99 insertions(+), 51 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/README.md b/README.md
index bcdbe08..f49c6e9 100644
--- a/README.md
+++ b/README.md
@@ -246,7 +246,7 @@
   DATE('{from_timestamp}') AS date,
   ...,
   COUNT(*) AS events
-FROM 
+FROM {Schema_Revision}
 WHERE timestamp >= '{from_timestamp}' AND timestamp < '{to_timestamp}'
 GROUP BY date, ...;
 ```
@@ -403,11 +403,13 @@
 ```bash
 #!/bin/bash
 
-Rscript modules/forecasts/forecast.R --date=$1 --metric=[your forecasted metric] --model=[ARIMA [--bootstrap_ci]|BSTS]
+Rscript modules/forecasts/forecast.R --date=$2 --metric=[your forecasted metric] --model=[ARIMA [--bootstrap_ci]|BSTS]
 ```
 
 Change the `--metric` and `--model` arguments accordingly. The actual data-reading and metric-forecasting calls are in a switch statement in [modules/forecasts/forecast.R](modules/forecasts/forecast.R). Don't forget to add the forecasted metric to the `--metric` option's help text at the top of **forecast.R** and don't forget to subset the data after reading it in (e.g. `dplyr::filter(data, date < as.Date(opt$date))`)
 
+**Note** the `--date=$2` in there instead of `--date=$1`. This is because Reportupdater passes a *start date* and an *end date* to every script it runs, with the goal of generating a report for *start date*. However, with forecasting modules we're actually interested in generating a report for *end date* after observing the latest metric for *start date*.
+
 ## Additional Information
 
 This repository can be browsed in [Phabricator/Diffusion](https://phabricator.wikimedia.org/diffusion/WDGO/), but is also (read-only) mirrored to [GitHub](https://github.com/wikimedia/wikimedia-discovery-golden/).
diff --git a/main.sh b/main.sh
index 150c2fe..19ad53b 100644
--- a/main.sh
+++ b/main.sh
@@ -21,8 +21,8 @@
 done
 
 # Forecasts (dependent on latest metrics)
-# for module in "search" "wdqs"
-# do
-#  echo "Running Reportupdater on ${module} forecasts..."
-#  reportupdater/update_reports.py "modules/forecasts/${module}" "/a/aggregate-datasets/discovery-forecasts/${module}"
-# done
+for module in "search" "wdqs"
+do
+ echo "Running Reportupdater on ${module} forecasts..."
+ reportupdater/update_reports.py "modules/forecasts/${module}" "/a/aggregate-datasets/discovery-forecasts/${module}"
+done
diff --git a/modules/forecasts/forecast.R b/modules/forecasts/forecast.R
index 9c30a2f..bcbc850 100644
--- a/modules/forecasts/forecast.R
+++ b/modules/forecasts/forecast.R
@@ -12,14 +12,20 @@
   * wdqs_sparql"),
   make_option("--model", default = NA, action = "store", type = "character",
   help = "Available: ARIMA, BSTS"),
-  make_option("--iters", default = 1000, action = "store", type = "numeric",
+  make_option("--iters", default = 1, action = "store", type = "numeric",
   help = "Number of MCMC iterations to keep in BSTS models [default %default]"),
-  make_option("--burnin", default = 500, action = "store", type = "numeric",
+  make_option("--burnin", default = 1000, action = "store", type = "numeric",
   help = "Number of iterations to use as burn-in in BSTS models [default %default]")
 )
 
 read_data <- function(path, ...) {
-  return(readr::read_tsv(file.path("/a/aggregate-datasets/discovery/", path), ...))
+  if (grepl("^stat[0-9]{4}$", Sys.info()["nodename"])) {
+# Use local datasets if run on stat1002
+return(readr::read_tsv(file.path("/a/aggregate-datasets", path), ...))
+  } else {
+# Download from datasets.wikimedia.org otherwise
+return(polloi::read_dataset(path, ...))
+  }
 }
 
 # Get command line options, if help option encountered print help and exit,
@@ -27,32 +33,45 @@
 opt <- parse_args(OptionParser(option_list = option_list))
 
 if 
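
The `--date=$2` note in the README hunk above can be illustrated with a small dry-run sketch. It assumes only what that note states, namely that Reportupdater calls each script with a start date as `$1` and an end date as `$2`. The helper name `build_forecast_command` is hypothetical, `wdqs_sparql` is taken from the `--metric` help text visible in the forecast.R hunk, and the command is printed rather than executed so the sketch does not depend on R or on the repository layout.

```shell
#!/bin/bash
# Sketch only: build_forecast_command is a hypothetical helper, not part of the repository.
# Per the README note, Reportupdater passes two positional arguments to every script:
#   $1 = start date, $2 = end date
build_forecast_command() {
  local start_date="$1"  # latest date for which the metric has been observed (unused here)
  local end_date="$2"    # date the forecast report is generated for
  # Print the command instead of running it, so the sketch has no R dependency.
  echo "Rscript modules/forecasts/forecast.R --date=${end_date} --metric=wdqs_sparql --model=ARIMA"
}

# Reportupdater-style invocation: start date first, end date second.
build_forecast_command 2017-03-01 2017-03-02
```

With these arguments the printed command carries `--date=2017-03-02`, the end date, which is the behaviour the README note describes for forecasting modules.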

[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Adds permission and submodule checking

2017-03-17 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/343322 )

Change subject: Adds permission and submodule checking
..


Adds permission and submodule checking

"You better check yo self before you wreck yo self"

Bug: T160772
Change-Id: Ia53e8826af757fa4435c71273d6d9eac86864c23
---
M main.sh
1 file changed, 13 insertions(+), 0 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/main.sh b/main.sh
index 17ad051..150c2fe 100644
--- a/main.sh
+++ b/main.sh
@@ -1,5 +1,18 @@
 #!/bin/bash
 
+# Check if modules/forecasts/forecast.R has execution permission for Reportupdater
+# (If it doesn't, then other R and shell scripts in modules/ probably don't either.)
+if [ `ls -l modules/forecasts | grep -e forecast.R | grep -e "-rwxrwxr-x" | wc -l` == "0" ]; then
+  echo "Warning: modules do not have execution permission; granting now..."
+  chmod +x -R modules/
+fi
+
+# Check if Reportupdater git submodule is set up
+if [ ! -f reportupdater/update_reports.py ]; then
+  echo "Warning: Reportupdater needs to be initialized and updated..."
+  git submodule init && git submodule update
+fi
+
 # Metrics
 for module in "external_traffic" "wdqs" "maps" "search" "portal"
 do
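
The permission check in the diff above matches the exact mode string `-rwxrwxr-x` in `ls -l` output, so a file with a different but still executable mode (for example `-rwxr-xr-x`) would also trigger the `chmod`. As a hedged alternative, not the merged implementation, the same intent can be sketched with the shell's built-in `-x` test. The function name `ensure_executable` and the temporary demo directory are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch only: ensure_executable is a hypothetical helper, not part of main.sh.
# $1 = probe file whose execute bit is checked, $2 = directory to fix recursively.
ensure_executable() {
  local probe="$1"
  local dir="$2"
  # -x is true when the current user may execute the file, whatever the full mode string is.
  if [ ! -x "$probe" ]; then
    echo "Warning: modules do not have execution permission; granting now..."
    chmod -R +x "$dir"
  fi
}

# Demo in a throwaway directory that mimics the modules/forecasts layout.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/modules/forecasts"
touch "$demo_dir/modules/forecasts/forecast.R"
chmod -x "$demo_dir/modules/forecasts/forecast.R"

ensure_executable "$demo_dir/modules/forecasts/forecast.R" "$demo_dir/modules"
```

The trade-off is that `-x` answers "can the current user run this file?", while the original check asks "does the file have exactly this mode?"; which question matters depends on which account Reportupdater runs under.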

-- 
To view, visit https://gerrit.wikimedia.org/r/343322
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia53e8826af757fa4435c71273d6d9eac86864c23
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Bearloga 
Gerrit-Reviewer: Chelsyx 

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits


[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Annotate Reportupdater migration on graphs

2017-03-08 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/341743 )

Change subject: Annotate Reportupdater migration on graphs
..


Annotate Reportupdater migration on graphs

Bug: T150915
Change-Id: Idd8b46e61db9e33788d2be63564c3dc40334dc5f
---
M server.R
M tab_documentation/action_breakdown.md
M tab_documentation/applinks.md
M tab_documentation/clickthrough_rate.md
M tab_documentation/dwelltime.md
M tab_documentation/first_visit.md
M tab_documentation/geography.md
M tab_documentation/languages_summary.md
M tab_documentation/languages_visited.md
M tab_documentation/most_common.md
M tab_documentation/pageviews.md
M tab_documentation/referers_byengine.md
M tab_documentation/referers_summary.md
M tab_documentation/sisproj.md
M utils.R
15 files changed, 95 insertions(+), 63 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/server.R b/server.R
index 9da5d5a..2d82d55 100644
--- a/server.R
+++ b/server.R
@@ -51,7 +51,8 @@
   dyEvent(as.Date("2016-05-18"), "Sister Links Updated", labelLoc = 
"bottom", color = "white") %>%
   dyEvent(as.Date("2016-06-02"), "Detect Language Deployed", labelLoc = 
"bottom", color = "white") %>%
   dyEvent(as.Date("2016-08-16"), "Secondary Links Collapsed", labelLoc = 
"bottom", color = "white") %>%
-  dyEvent(as.Date("2016-09-13"), "B (schema switch)", labelLoc = "bottom", 
color = "white")
+  dyEvent(as.Date("2016-09-13"), "B (schema switch)", labelLoc = "bottom", 
color = "white") %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom", 
color = "white")
   })
 
   output$action_breakdown_dygraph <- renderDygraph({
@@ -68,7 +69,8 @@
   dyEvent(as.Date("2016-05-18"), "Sister Links Updated", labelLoc = 
"bottom", color = "white") %>%
   dyEvent(as.Date("2016-06-02"), "Detect Language Deployed", labelLoc = 
"bottom", color = "white") %>%
   dyEvent(as.Date("2016-08-16"), "Secondary Links Collapsed", labelLoc = 
"bottom", color = "white") %>%
-  dyEvent(as.Date("2016-09-13"), "B (schema switch)", labelLoc = "bottom", 
color = "white")
+  dyEvent(as.Date("2016-09-13"), "B (schema switch)", labelLoc = "bottom", 
color = "white") %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom", 
color = "white")
   })
 
   output$most_common_dygraph <- renderDygraph({
@@ -83,7 +85,8 @@
   dyEvent(as.Date("2016-05-18"), "Sister Links Updated", labelLoc = 
"bottom", color = "white") %>%
   dyEvent(as.Date("2016-06-02"), "Detect Language Deployed", labelLoc = 
"bottom", color = "white") %>%
   dyEvent(as.Date("2016-08-16"), "Secondary Links Collapsed", labelLoc = 
"bottom", color = "white") %>%
-  dyEvent(as.Date("2016-09-13"), "A (schema switch)", labelLoc = "bottom", 
color = "white")
+  dyEvent(as.Date("2016-09-13"), "A (schema switch)", labelLoc = "bottom", 
color = "white") %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom", 
color = "white")
   })
 
   output$first_visit_dygraph <- renderDygraph({
@@ -99,7 +102,8 @@
   dyEvent(as.Date("2016-05-18"), "Sister Links Updated", labelLoc = 
"bottom", color = "white") %>%
   dyEvent(as.Date("2016-06-02"), "Detect Language Deployed", labelLoc = 
"bottom", color = "white") %>%
   dyEvent(as.Date("2016-08-16"), "Secondary Links Collapsed", labelLoc = 
"bottom", color = "white") %>%
-  dyEvent(as.Date("2016-09-13"), "A (schema switch)", labelLoc = "bottom", 
color = "white")
+  dyEvent(as.Date("2016-09-13"), "A (schema switch)", labelLoc = "bottom", 
color = "white") %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom", 
color = "white")
   })
 
   output$dwelltime_dygraph <- renderDygraph({
@@ -115,7 +119,8 @@
   dyEvent(as.Date("2016-05-18"), "Sister Links Updated", labelLoc = 
"bottom", color = "white") %>%
   dyEvent(as.Date("2016-06-02"), "Detect Language Deployed", labelLoc = 
"bottom", color = "white") %>%
   dyEvent(as.Date("2016-08-16"), "Secondary Links Collapsed", labelLoc = 
"bottom", color = "white") %>%
-  dyEvent(as.Date("2016-09-13"), "B (schema switch)", labelLoc = "bottom", 
color = "white")
+  dyEvent(as.Date("2016-09-13"), "B (schema switch)", labelLoc = "bottom", 
color = "white") %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom", 
color = "white")
   })
 
   output$sisproj_dygraph <- renderDygraph({
@@ -137,7 +142,8 @@
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_sisproj), rename 
= FALSE) %>%
   polloi::make_dygraph("Date", ifelse(input$sisproj_type == "prop", 
"Proportion (%)", input$sisproj_metric),
paste(ifelse(input$sisproj_metric == "Clicks", 
"Clicks", "Users who clicked"), "on links other Wikimedia Foundation 
projects")) %>%
-  dyLegend(labelsDiv = 

[MediaWiki-commits] [Gerrit] wikimedia...twilightsparql[master]: Annotate Reportupdater migration on graphs

2017-03-08 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/341746 )

Change subject: Annotate Reportupdater migration on graphs
..


Annotate Reportupdater migration on graphs

Bug: T150915
Change-Id: I51e717f7a0f9782e6d4d0261ab60264aa98a64b2
---
M server.R
M tab_documentation/wdqs_usage.md
M tab_documentation/wdqs_visits.md
3 files changed, 13 insertions(+), 8 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/server.R b/server.R
index 0a9ec5f..1974ddb 100644
--- a/server.R
+++ b/server.R
@@ -27,7 +27,8 @@
   dyAxis("y", logscale = input$usage_logscale) %>%
   dyLegend(labelsDiv = "usage_legend") %>%
   dyRangeSelector %>%
-  dyEvent(as.Date("2017-01-01"), "D (Started tracking LDF usage)", 
labelLoc = "bottom")
+  dyEvent(as.Date("2017-01-01"), "D (Started tracking LDF usage)", 
labelLoc = "bottom") %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   )
 
   output$sparql_usage_plot <- renderDygraph(
@@ -43,7 +44,8 @@
   dyRangeSelector %>%
   dyEvent(as.Date("2015-09-07"), "A (Announcement)", labelLoc = "bottom") 
%>%
   dyEvent(as.Date("2015-11-05"), "B (Labs bot)", labelLoc = "bottom") %>%
-  dyEvent(as.Date("2016-12-28"), "C (Bot ruleset)", labelLoc = "bottom")
+  dyEvent(as.Date("2016-12-28"), "C (Bot ruleset)", labelLoc = "bottom") 
%>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   )
 
   output$wdqs_visits_plot <- renderDygraph(
@@ -59,7 +61,8 @@
   # ...because we're using dygraphs' native log-scaling:
   dyAxis("y", logscale = input$visits_logscale) %>%
   dyLegend(labelsDiv = "wdqs_visits_legend") %>%
-  dyEvent(as.Date("2015-09-07"), "A (Announcement)", labelLoc = "bottom")
+  dyEvent(as.Date("2015-09-07"), "A (Announcement)", labelLoc = "bottom") 
%>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   )
 
   # Check datasets for missing data and notify user which datasets are missing 
data (if any)
diff --git a/tab_documentation/wdqs_usage.md b/tab_documentation/wdqs_usage.md
index 3d6cc3f..158a6c6 100644
--- a/tab_documentation/wdqs_usage.md
+++ b/tab_documentation/wdqs_usage.md
@@ -6,10 +6,11 @@
 Outages and inaccuracies
 --
 
-- **'A'**: We announced WDQS to the public.
-- **'B'**: From 2015-11-04 to 2015-11-06 there was what we believe to be a broken bot responsible for 21+ million requests.
-- **'C'**: As part of a refactoring to a new metric-generating framework (see [T150915](https://phabricator.wikimedia.org/T150915)), we revised the ruleset for determining when a request came from a bot/tool. For example, requests with URLs and email addresses in the UserAgent were classified as automata after 2016-12-28.
-- **'D'**: We started tracking LDF endpoint usage on 2017-01-01. See [T153936](https://phabricator.wikimedia.org/T153936) and [T136358](https://phabricator.wikimedia.org/T136358) for more details.
+* '__A__': We announced WDQS to the public.
+* '__B__': From 2015-11-04 to 2015-11-06 there was what we believe to be a broken bot responsible for 21+ million requests.
+* '__C__': As part of a refactoring to a new metric-generating framework (see [T150915](https://phabricator.wikimedia.org/T150915)), we revised the ruleset for determining when a request came from a bot/tool. For example, requests with URLs and email addresses in the UserAgent were classified as automata after 2016-12-28.
+* '__D__': We started tracking LDF endpoint usage on 2017-01-01. See [T153936](https://phabricator.wikimedia.org/T153936) and [T136358](https://phabricator.wikimedia.org/T136358) for more details.
+* '__R__': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of [our data retrieval and processing codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' [Reportupdater infrastructure](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater). See [T150915](https://phabricator.wikimedia.org/T150915) for more details.
 
 Questions, bug reports, and feature suggestions
 --
diff --git a/tab_documentation/wdqs_visits.md b/tab_documentation/wdqs_visits.md
index 02fffcf..192eb2e 100644
--- a/tab_documentation/wdqs_visits.md
+++ b/tab_documentation/wdqs_visits.md
@@ -6,7 +6,8 @@
 Outages and inaccuracies
 --
 
-- **'A'**: We announced WDQS to the public.
+* '__A__': We announced WDQS to the public.
+* '__R__': on 2017-01-01 we started calculating all of Discovery's metrics 
using a new version of [our data retrieval and processing 
codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated 
to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' 
[Reportupdater 

[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Annotate Reportupdater migration on graphs

2017-03-08 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/341744 )

Change subject: Annotate Reportupdater migration on graphs
..


Annotate Reportupdater migration on graphs

Bug: T150915
Change-Id: If916e90d5b11e6a2ee6f9582b0603d6d7b224b9e
---
M server.R
M tab_documentation/traffic_byengine.md
M tab_documentation/traffic_summary.md
3 files changed, 14 insertions(+), 10 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/server.R b/server.R
index 1888f89..b0863cf 100644
--- a/server.R
+++ b/server.R
@@ -7,12 +7,12 @@
 existing_date <- Sys.Date() - 1
 
 function(input, output, session) {
-  
+
   if (Sys.Date() != existing_date) {
 read_traffic()
 existing_date <<- Sys.Date()
   }
-  
+
   output$traffic_summary_dygraph <- renderDygraph({
 summary_traffic_data[[input$platform_traffic_summary]] %>%
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_traffic_summary)) 
%>%
@@ -21,9 +21,10 @@
   dyLegend(labelsDiv = "traffic_summary_legend", show = "always", 
showZeroValues = FALSE) %>%
   dyRangeSelector %>%
   dyEvent(as.Date("2016-03-07"), "A (new UDF)", labelLoc = "bottom") %>%
-  dyEvent(as.Date("2016-06-26"), "B (DuckDuckGo)", labelLoc = "bottom")
+  dyEvent(as.Date("2016-06-26"), "B (DuckDuckGo)", labelLoc = "bottom") %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
-  
+
   output$traffic_bysearch_dygraph <- renderDygraph({
 bysearch_traffic_data[[input$platform_traffic_bysearch]] %>%
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_traffic_bysearch)) %>%
@@ -32,9 +33,10 @@
   dyLegend(labelsDiv = "traffic_bysearch_legend", show = "always", 
showZeroValues = FALSE) %>%
   dyAxis("y", logscale = input$platform_traffic_bysearch_log) %>%
   dyRangeSelector(fillColor = "", strokeColor = "") %>%
-  dyEvent(as.Date("2016-06-26"), "A (DuckDuckGo)", labelLoc = "bottom")
+  dyEvent(as.Date("2016-06-26"), "A (DuckDuckGo)", labelLoc = "bottom") %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
-  
+
   # Check datasets for missing data and notify user which datasets are missing 
data (if any)
   output$message_menu <- renderMenu({
 notifications <- list(
@@ -43,5 +45,5 @@
 notifications <- notifications[!sapply(notifications, is.null)]
 return(dropdownMenu(type = "notifications", .list = notifications))
   })
-  
+
 }
diff --git a/tab_documentation/traffic_byengine.md b/tab_documentation/traffic_byengine.md
index b6e7571..30d33ac 100644
--- a/tab_documentation/traffic_byengine.md
+++ b/tab_documentation/traffic_byengine.md
@@ -11,7 +11,8 @@
 
 Outages and notes
 --
-- **A**: On 25 August 2016 we patched the UDF to also look for [Duck Duck Go](https://duckduckgo.com) when it processes referer data. That referreral data was deleted and backfilled from 26 June 2016. See [T143287](https://phabricator.wikimedia.org/T143287) for more details.
+* '__A__': on 2016-08-25 we patched the UDF to also look for [Duck Duck Go](https://duckduckgo.com) when it processes referer data. That referreral data was deleted and backfilled from 26 June 2016. See [T143287](https://phabricator.wikimedia.org/T143287) for more details.
+* '__R__': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of [our data retrieval and processing codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' [Reportupdater infrastructure](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater). See [T150915](https://phabricator.wikimedia.org/T150915) for more details.
 
 Questions, bug reports, and feature suggestions
 --
diff --git a/tab_documentation/traffic_summary.md b/tab_documentation/traffic_summary.md
index b1b7cf6..e8f0919 100644
--- a/tab_documentation/traffic_summary.md
+++ b/tab_documentation/traffic_summary.md
@@ -10,9 +10,10 @@
 
 Outages and notes
 --
-- **A**: We switched to a finalized version of the UDF that extracts internal traffic (see [T130083](https://phabricator.wikimedia.org/T130083))
-- **B**: On 25 August 2016 we patched the UDF to also look for [Duck Duck Go](https://duckduckgo.com) when it processes referer data. That referreral data was deleted and backfilled from 26 June 2016. See [T143287](https://phabricator.wikimedia.org/T143287) for more details.
+* '__A__': We switched to a finalized version of the UDF that extracts internal traffic (see [T130083](https://phabricator.wikimedia.org/T130083))
+* '__B__': on 25 August 2016 we patched the UDF to also look for [Duck Duck 
Go](https://duckduckgo.com) when it processes referer data. That referreral 
data was deleted and backfilled from 26 June 

[MediaWiki-commits] [Gerrit] wikimedia...wetzel[master]: Annotate Reportupdater migration on graphs

2017-03-08 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/341745 )

Change subject: Annotate Reportupdater migration on graphs
..


Annotate Reportupdater migration on graphs

Bug: T150915
Change-Id: I8bc4bd8ca8883e947ca77439edb0f5af47a6be8a
---
M server.R
M tab_documentation/geo_breakdown.md
M tab_documentation/geohack_usage.md
M tab_documentation/tiles_summary.md
M tab_documentation/tiles_total_by_zoom.md
M tab_documentation/tiles_users_by_style.md
M tab_documentation/unique_users.md
M tab_documentation/wikiminiatlas_usage.md
M tab_documentation/wikivoyage_usage.md
M tab_documentation/wiwosm_usage.md
10 files changed, 42 insertions(+), 21 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/server.R b/server.R
index d3353e7..d24e621 100644
--- a/server.R
+++ b/server.R
@@ -40,7 +40,8 @@
   dyEvent(as.Date("2015-09-17"), "A (announcement)", labelLoc = "bottom") 
%>%
   dyEvent(as.Date("2016-01-08"), "B (enwiki launch)", labelLoc = "bottom") 
%>%
   dyEvent(as.Date("2016-01-12"), "C (cache clear)", labelLoc = "bottom") 
%>%
-  dyEvent(as.Date("2016-11-09"), "D (pkget)", labelLoc = "bottom")
+  dyEvent(as.Date("2016-11-09"), "D (pkget)", labelLoc = "bottom") %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$tiles_style_series <- renderDygraph({
@@ -59,7 +60,8 @@
   dyEvent(as.Date("2015-09-17"), "A (announcement)", labelLoc = "bottom") 
%>%
   dyEvent(as.Date("2016-01-08"), "B (enwiki launch)", labelLoc = "bottom") 
%>%
   dyEvent(as.Date("2016-01-12"), "C (cache clear)", labelLoc = "bottom") 
%>%
-  dyEvent(as.Date("2016-11-09"), "D (pkget)", labelLoc = "top")
+  dyEvent(as.Date("2016-11-09"), "D (pkget)", labelLoc = "top") %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$tiles_users_series <- renderDygraph({
@@ -78,7 +80,8 @@
   dyEvent(as.Date("2015-09-17"), "A (announcement)", labelLoc = "bottom") 
%>%
   dyEvent(as.Date("2016-01-08"), "B (enwiki launch)", labelLoc = "bottom") 
%>%
   dyEvent(as.Date("2016-01-12"), "C (cache clear)", labelLoc = "bottom") 
%>%
-  dyEvent(as.Date("2016-11-08"), "D (pkget)", labelLoc = "top")
+  dyEvent(as.Date("2016-11-08"), "D (pkget)", labelLoc = "top") %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$zoom_level_selector_container <- renderUI({
@@ -99,7 +102,8 @@
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 
input$smoothing_tiles_zoom_series)) %>%
   polloi::make_dygraph("Date", "Tiles", "Total tiles by zoom level") %>%
   dyAxis("y", logscale = input$tiles_zoom_logscale) %>%
-  dyLegend(labelsDiv = "tiles_zoom_series_legend", show = "always")
+  dyLegend(labelsDiv = "tiles_zoom_series_legend", show = "always") %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$users_per_platform <- renderDygraph({
@@ -110,7 +114,8 @@
   dyLegend(labelsDiv = "users_per_platform_legend", show = "always") %>%
   dyRangeSelector %>%
   dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") 
%>%
-  dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom")
+  dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") 
%>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$geohack_feature_usage <- renderDygraph({
@@ -120,7 +125,8 @@
   dyRangeSelector %>%
   dyAxis("y", logscale = input$geohack_feature_usage_logscale) %>%
   dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") 
%>%
-  dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom")
+  dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") 
%>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$wikiminiatlas_feature_usage <- renderDygraph({
@@ -130,7 +136,8 @@
   dyRangeSelector %>%
   dyAxis("y", logscale = input$wikiminiatlas_feature_usage_logscale) %>%
   dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") 
%>%
-  dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom")
+  dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") 
%>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$wikivoyage_feature_usage <- renderDygraph({
@@ -140,7 +147,8 @@
   dyRangeSelector %>%
   dyAxis("y", logscale = input$wikivoyage_feature_usage_logscale) %>%
   dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") 
%>%
-  dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom")
+  

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[master]: Annotate Reportupdater migration on graphs

2017-03-08 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/341742 )

Change subject: Annotate Reportupdater migration on graphs
..


Annotate Reportupdater migration on graphs

Bug: T150915
Change-Id: Ie650b1eb0f5c9cc40e43a316b71f44e0b8b8cab7
---
M server.R
M tab_documentation/app_events.md
M tab_documentation/app_load.md
M tab_documentation/click_position.md
M tab_documentation/desktop_events.md
M tab_documentation/desktop_load.md
M tab_documentation/failure_breakdown.md
M tab_documentation/failure_langproj.md
M tab_documentation/failure_rate.md
M tab_documentation/failure_suggests.md
M tab_documentation/fulltext_basic.md
M tab_documentation/geo_basic.md
M tab_documentation/invoke_source.md
M tab_documentation/kpi_api_usage.md
M tab_documentation/kpi_augmented_clickthroughs.md
M tab_documentation/kpi_load_time.md
M tab_documentation/kpi_zero_results.md
M tab_documentation/language_basic.md
M tab_documentation/mobile_events.md
M tab_documentation/mobile_load.md
M tab_documentation/open_basic.md
M tab_documentation/paulscore_approx.html
M tab_documentation/prefix_basic.md
M tab_documentation/survival.md
24 files changed, 100 insertions(+), 57 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/server.R b/server.R
index 343dd5f..158ecf5 100644
--- a/server.R
+++ b/server.R
@@ -69,7 +69,8 @@
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_desktop_event)) 
%>%
   polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Desktop 
search events, by day") %>%
   dyRangeSelector %>%
-  dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom")
+  dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") 
%>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$desktop_load_plot <- renderDygraph({
@@ -77,7 +78,8 @@
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_desktop_load)) %>%
   polloi::make_dygraph(xlab = "Date", ylab = "Load time (ms)", title = 
"Desktop load times, by day", use_si = FALSE) %>%
   dyRangeSelector %>%
-  dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom")
+  dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") 
%>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$paulscore_approx_plot_fulltext <- renderDygraph({
@@ -149,14 +151,16 @@
 mobile_dygraph_set %>%
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_mobile_event)) %>%
   polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Mobile 
search events, by day") %>%
-  dyRangeSelector
+  dyRangeSelector %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$mobile_load_plot <- renderDygraph({
 mobile_load_data %>%
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_mobile_load)) %>%
   polloi::make_dygraph(xlab = "Date", ylab = "Load time (ms)", title = 
"Mobile search events, by day", use_si = FALSE) %>%
-  dyRangeSelector
+  dyRangeSelector %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   ## App value boxes
@@ -192,28 +196,32 @@
 android_dygraph_set %>%
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_app_event)) %>%
   polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Android 
mobile app search events, by day") %>%
-  dyRangeSelector
+  dyRangeSelector %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$android_load_plot <- renderDygraph({
 android_load_data %>%
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_app_load)) %>%
   polloi::make_dygraph(xlab = "Date", ylab = "Load time (ms)", title = 
"Android result load times, by day", use_si = FALSE) %>%
-  dyRangeSelector
+  dyRangeSelector %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$ios_event_plot <- renderDygraph({
 ios_dygraph_set %>%
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_app_event)) %>%
   polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "iOS mobile 
app search events, by day") %>%
-  dyRangeSelector
+  dyRangeSelector %>%
+  dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
   })
 
   output$ios_load_plot <- renderDygraph({
 ios_load_data %>%
   polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, 

[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Fixed bug in tab country_breakdown

2017-03-06 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/341373 )

Change subject: Fixed bug in tab country_breakdown
..

Fixed bug in tab country_breakdown

Bug: T150915
Change-Id: If0e96f2fc25edc8a094367ac61d7e21879687d2e
---
M utils.R
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/prince refs/changes/73/341373/1

diff --git a/utils.R b/utils.R
index 6548a0e..67a1594 100644
--- a/utils.R
+++ b/utils.R
@@ -49,7 +49,7 @@
   country_data <<- tidyr::spread(
 dplyr::distinct(interim, date, country, .keep_all = TRUE),
 country, events, fill = NA
-  )
+  ) %>% as.data.frame()
   return(invisible())
 }
 

-- 
To view, visit https://gerrit.wikimedia.org/r/341373
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If0e96f2fc25edc8a094367ac61d7e21879687d2e
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/prince
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits


[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Note about internally referred traffic being miscategorized

2017-02-16 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/338279 )

Change subject: Note about internally referred traffic being miscategorized
..

Note about internally referred traffic being miscategorized

Bug: T154722
Change-Id: I57c8878519efd943b476c28de3a50e2989c99307
---
M tab_documentation/traffic_summary.md
1 file changed, 1 insertion(+), 0 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/wonderbolt refs/changes/79/338279/1

diff --git a/tab_documentation/traffic_summary.md b/tab_documentation/traffic_summary.md
index 5cf797d..b1b7cf6 100644
--- a/tab_documentation/traffic_summary.md
+++ b/tab_documentation/traffic_summary.md
@@ -12,6 +12,7 @@
 --
 - **A**: We switched to a finalized version of the UDF that extracts internal traffic (see [T130083](https://phabricator.wikimedia.org/T130083))
 - **B**: On 25 August 2016 we patched the UDF to also look for [Duck Duck Go](https://duckduckgo.com) when it processes referer data. That referreral data was deleted and backfilled from 26 June 2016. See [T143287](https://phabricator.wikimedia.org/T143287) for more details.
+- On 22 February 2016, a bug was introduced and some of the internally referred traffic are miscategorized as none. See [T148780](https://phabricator.wikimedia.org/T148780) and [T154722](https://phabricator.wikimedia.org/T154722) for more details.
 
 Questions, bug reports, and feature suggestions
 --

-- 
To view, visit https://gerrit.wikimedia.org/r/338279
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I57c8878519efd943b476c28de3a50e2989c99307
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/wonderbolt
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...wmf[master]: Change MySQL config file

2017-01-19 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/333132 )

Change subject: Change MySQL config file
..

Change MySQL config file

Change-Id: I3d9cf2f9f6a8a48a55a7e00fa42eb3d38572ecf6
---
M R/mysql.R
1 file changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/wmf refs/changes/32/333132/1

diff --git a/R/mysql.R b/R/mysql.R
index 12e2fb7..510e4d5 100644
--- a/R/mysql.R
+++ b/R/mysql.R
@@ -38,8 +38,8 @@
 #'@export
 mysql_connect <- function(database, default_file = NULL) {
   if (is.null(default_file)) {
-default_file = "/etc/mysql/conf.d/stats-research-client.cnf"
-# there's also "/etc/mysql/conf.d/analytics-research-client.cnf"
+default_file = "/etc/mysql/conf.d/analytics-research-client.cnf"
+# there's also "/etc/mysql/conf.d/stats-research-client.cnf"
   }
   if (RMySQL_version() > 93) {
 con <- dbConnect(drv = RMySQL::MySQL(),

-- 
To view, visit https://gerrit.wikimedia.org/r/333132
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3d9cf2f9f6a8a48a55a7e00fa42eb3d38572ecf6
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/wmf
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...rainbow[master]: Highlight sparklines according to date range selection on KP...

2016-12-21 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/328593 )

Change subject: Highlight sparklines according to date range selection on KPI summary page
..

Highlight sparklines according to date range selection on KPI summary page

Bug: T150215
Change-Id: Ib0e06619b3c3e7069fcd227528bc87dd9a1c0bea
---
M server.R
1 file changed, 81 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/93/328593/1

diff --git a/server.R b/server.R
index d5d34a8..62678bd 100644
--- a/server.R
+++ b/server.R
@@ -572,10 +572,27 @@
   dplyr::select(Median) %>%
   unlist(use.names = FALSE) %>%
   round(2)
-sparkline::sparkline(values = output_sl, type = "line",
+sl1 <- sparkline::sparkline(values = output_sl, type = "line",
  height = 50, width = '100%',
  lineColor = 'black', fillColor = 'transparent',
+ chartRangeMin = min(output_sl), chartRangeMax = max(output_sl),
  highlightLineColor = 'orange', highlightSpotColor = 'orange')
+# highlight selected date range
+if (input$kpi_summary_date_range_selector == "weekly"){
+  output_highlight <- c(rep(NA, length(output_sl)-7), output_sl[(length(output_sl)-6):length(output_sl)])
+} else if (input$kpi_summary_date_range_selector == "monthly"){
+  output_highlight <- c(rep(NA, length(output_sl)-30), output_sl[(length(output_sl)-29):length(output_sl)])
+} else if (input$kpi_summary_date_range_selector == "quarterly"){
+  output_highlight <- output_sl
+} else {
+  return(sl1)
+}
+sl2 <- sparkline::sparkline(values = output_highlight, type = "line",
+height = 50, width = '100%', lineWidth = 2,
+lineColor = 'red', chartRangeMin = min(output_sl), chartRangeMax = max(output_sl),
+minSpotColor = F, maxSpotColor = F, disableInteraction = T,
+highlightLineColor = NULL, highlightSpotColor = NULL)
+return(sparkline::spk_composite(sl1, sl2))
   })
   output$sparkline_zero_results <- sparkline:::renderSparkline({
 if(input$kpi_summary_date_range_selector == "all"){
@@ -588,10 +605,27 @@
   dplyr::select(rate) %>%
   unlist(use.names = FALSE) %>%
   round(2)
-sparkline::sparkline(values = output_sl, type = "line",
- height = 50, width = '100%',
- lineColor = 'black', fillColor = 'transparent',
- highlightLineColor = 'orange', highlightSpotColor = 'orange')
+sl1 <- sparkline::sparkline(values = output_sl, type = "line",
+height = 50, width = '100%',
+lineColor = 'black', fillColor = 'transparent',
+chartRangeMin = min(output_sl), chartRangeMax = max(output_sl),
+highlightLineColor = 'orange', highlightSpotColor = 'orange')
+# highlight selected date range
+if (input$kpi_summary_date_range_selector == "weekly"){
+  output_highlight <- c(rep(NA, length(output_sl)-7), output_sl[(length(output_sl)-6):length(output_sl)])
+} else if (input$kpi_summary_date_range_selector == "monthly"){
+  output_highlight <- c(rep(NA, length(output_sl)-30), output_sl[(length(output_sl)-29):length(output_sl)])
+} else if (input$kpi_summary_date_range_selector == "quarterly"){
+  output_highlight <- output_sl
+} else {
+  return(sl1)
+}
+sl2 <- sparkline::sparkline(values = output_highlight, type = "line",
+height = 50, width = '100%', lineWidth = 2,
+lineColor = 'red', chartRangeMin = min(output_sl), chartRangeMax = max(output_sl),
+minSpotColor = F, maxSpotColor = F, disableInteraction = T,
+highlightLineColor = NULL, highlightSpotColor = NULL)
+return(sparkline::spk_composite(sl1, sl2))
   })
   output$sparkline_api_usage <- sparkline:::renderSparkline({
 if(input$kpi_summary_date_range_selector == "all"){
@@ -609,10 +643,27 @@
   dplyr::summarize(total = sum(events)) %>%
   dplyr::select(total) %>%
   unlist(use.names = FALSE)
-sparkline::sparkline(values = output_sl, type = "line",
- height = 50, width = '100%',
- lineColor = 'black', fillColor = 'transparent',
- highlightLineColor = 'orange', highlightSpotColor = 'orange')
+sl1 <- sparkline::sparkline(values = output_sl, type = "line",
+height = 50, width = '100%',
+lineColor = 'black', fillColor = 'transparent',
+chartRangeMin 

[MediaWiki-commits] [Gerrit] wikimedia...rainbow[master]: Add sparklines for KPIs: - KPI Summary Page - Monthly Metric...

2016-12-16 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/327877 )

Change subject: Add sparklines for KPIs: - KPI Summary Page - Monthly Metrics Page
..

Add sparklines for KPIs:
- KPI Summary Page
- Monthly Metrics Page

Bug: T150215
Change-Id: I4b64830a3db7f734977b19de695fdf7b0ae7ee12
---
M server.R
M ui.R
2 files changed, 142 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/77/327877/1

diff --git a/server.R b/server.R
index 3f980c6..c2481db 100644
--- a/server.R
+++ b/server.R
@@ -1,6 +1,9 @@
 library(shiny)
 library(shinydashboard)
 library(dygraphs)
+library(sparkline)
+library(DT)
+library(data.table)
 
 source("utils.R")
 
@@ -559,6 +562,84 @@
 return(polloi::na_box("User engagement (data problem)"))
   })
 
+  ## KPI Sparklines
+  output$sparkline_load_time <- sparkline:::renderSparkline({
+if(input$kpi_summary_date_range_selector == "all"){
+  output_sl <- list(desktop_load_data, mobile_load_data, android_load_data, ios_load_data)
+} else{
+  output_sl <- list(desktop_load_data, mobile_load_data, android_load_data, ios_load_data) %>%
+lapply(polloi::subset_by_date_range, from = Sys.Date() - 91, to = Sys.Date() - 1)
+}
+output_sl <- output_sl %>%
+  lapply(function(platform_load_data) {
+platform_load_data[, c("date", "Median")]
+  }) %>%
+  dplyr::bind_rows(.id = "platform") %>%
+  dplyr::group_by(date) %>%
+  dplyr::summarize(Median = median(Median)) %>%
+  dplyr::ungroup() %>%
+  dplyr::select(Median) %>%
+  unlist(use.names = FALSE) %>%
+  round(2)
+sparkline::sparkline(values = output_sl, type = "line",
+ height = 50, width = '100%',
+ lineColor = 'black', fillColor = '#ccc',
+ highlightLineColor = 'orange', highlightSpotColor = 'orange')
+  })
+  output$sparkline_zero_results <- sparkline:::renderSparkline({
+if(input$kpi_summary_date_range_selector == "all"){
+  output_sl <- failure_data_with_automata
+} else{
+  output_sl <- failure_data_with_automata %>%
+polloi::subset_by_date_range(from = Sys.Date() - 91, to = Sys.Date() - 1)
+}
+output_sl <- output_sl %>%
+  dplyr::select(rate) %>%
+  unlist(use.names = FALSE) %>%
+  round(2)
+sparkline::sparkline(values = output_sl, type = "line",
+ height = 50, width = '100%',
+ lineColor = 'black', fillColor = '#ccc',
+ highlightLineColor = 'orange', highlightSpotColor = 'orange')
+  })
+  output$sparkline_api_usage <- sparkline:::renderSparkline({
+if(input$kpi_summary_date_range_selector == "all"){
+  output_sl <- split_dataset
+} else{
+  output_sl <- split_dataset %>%
+lapply(polloi::subset_by_date_range, from = Sys.Date() - 91, to = Sys.Date() - 1)
+}
+output_sl <- output_sl %>%
+  lapply(function(platform_load_data) {
+platform_load_data[, c("date", "events")]
+  }) %>%
+  dplyr::bind_rows(.id = "api") %>%
+  dplyr::group_by(date) %>%
+  dplyr::summarize(total = sum(events)) %>%
+  dplyr::select(total) %>%
+  unlist(use.names = FALSE)
+sparkline::sparkline(values = output_sl, type = "line",
+ height = 50, width = '100%',
+ lineColor = 'black', fillColor = '#ccc',
+ highlightLineColor = 'orange', highlightSpotColor = 'orange')
+  })
+  output$sparkline_augmented_clickthroughs <- sparkline:::renderSparkline({
+if(input$kpi_summary_date_range_selector == "all"){
+  output_sl <- augmented_clickthroughs
+} else{
+  output_sl <- augmented_clickthroughs %>%
+polloi::subset_by_date_range(from = Sys.Date() - 91, to = Sys.Date() - 1)
+}
+output_sl <- output_sl %>%
+  dplyr::select(user_engagement) %>%
+  unlist(use.names = FALSE) %>%
+  round(2)
+sparkline::sparkline(values = output_sl, type = "line",
+ height = 50, width = '100%',
+ lineColor = 'black', fillColor = '#ccc',
+ highlightLineColor = 'orange', highlightSpotColor = 'orange')
+  })
+
   ## KPI Modules
   output$kpi_load_time_series <- renderDygraph({
 smooth_level <- input$smoothing_kpi_load_time
@@ -722,8 +803,9 @@
   dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom")
   })
 
-  output$monthly_metrics_tbl <- renderUI({
-temp <- data.frame(
+  output$monthly_metrics_tbl <- DT::renderDataTable(
+{
+  temp <- data.frame(
   KPI = c("Load time", "Zero results rate", "API Usage", "User engagement"),
   Units = c("ms", "%", "", "%")
 )
@@ -795,28 +877,64 @@
 # Sanitize:
 temp[temp == "NA%" | temp == "NANA%" | temp == "NANA"] <- "--"

[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: add geo breakdown

2016-12-13 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/327139 )

Change subject: add geo breakdown
..

add geo breakdown

Change-Id: I7a4d371f40fbb4ec822008ae3866870688621154
---
M functions.R
M server.R
A tab_documentation/first_visit_geo.md
A tab_documentation/last_action_geo.md
A tab_documentation/most_common_geo.md
A tab_documentation/traffic_ctr_geo.md
M ui.R
M www/stylesheet.css
8 files changed, 1,449 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/prince refs/changes/39/327139/1

diff --git a/functions.R b/functions.R
index 7ee1f65..0398f8c 100644
--- a/functions.R
+++ b/functions.R
@@ -1,9 +1,21 @@
 library(polloi)
+library(ggplot2)
 library(data.table)
 library(reshape2)
 library(magrittr)
+library(toOrdinal)
+library(xts)
+library(dplyr)
+library(tidyr)
 
 source("extras.R")
+
+# Capitalize the first letter
+simpleCap <- function(x) {
+  s <- strsplit(x, " ")[[1]]
+  paste(toupper(substring(s, 1,1)), substring(s, 2),
+sep="", collapse=" ")
+}
 
 # Read in the traffic data
 read_clickthrough <- function(){
@@ -129,6 +141,155 @@
 
 }
 
+read_geo <- function() {
+
+  all_country_data <- polloi::read_dataset("portal/all_country_data.tsv", col_types = "Dcididid")
+  first_visits_country <- polloi::read_dataset("portal/first_visits_country.tsv", col_types = "Dccid")
+  last_action_country <- polloi::read_dataset("portal/last_action_country.tsv", col_types = "Dccid")
+  most_common_country <- polloi::read_dataset("portal/most_common_country.tsv", col_types = "Dccid")
+  data("countrycode_data", package="countrycode")
+  countrycode_data$country.name[c(44,54,143)] <- c("Cape Verde", "Congo, The Democratic Republic of the", "Macedonia, Republic of" )
+  countrycode_data$continent[countrycode_data$country.name %in% c("British Indian Ocean Territory","Christmas Island","Taiwan, Province of China")] <- "Asia"
+  countrycode_data$continent[countrycode_data$country.name %in% c("Bermuda","Canada","Greenland","Saint Pierre and Miquelon","United States")] <- "Northern America"
+  countrycode_data$continent[countrycode_data$continent == "Americas"] <- "South America"
+
+
+  all_country_data <- all_country_data[!duplicated(all_country_data[,1:2],fromLast=T),]
+  all_country_data_prop <- all_country_data %>%
+group_by(date) %>%
+mutate(event_prop=round(events/sum(events),4)*100, visit_prop=round(n_visit/sum(n_visit),4)*100, session_prop=round(n_session/sum(n_session),4)*100) %>%
+select(date,country,event_prop,ctr,visit_prop,ctr_visit,session_prop,ctr_session) %>% ungroup()
+  us_mask <- grepl("^U\\.S\\.", all_country_data$country)
+  us_data <- all_country_data[us_mask,]
+  all_country_data <- us_data %>%
+mutate(clicks = events*ctr, click_v=n_visit*ctr_visit, click_s=n_session*ctr_session) %>%
+group_by(date) %>%
+summarise(country="United States", events=sum(events), ctr=round(sum(clicks)/sum(events),4),
+  n_visit=sum(n_visit), ctr_visit=round(sum(click_v)/sum(n_visit),4),
+  n_session=sum(n_session), ctr_session=round(sum(click_s)/sum(n_session),4)) %>%
+rbind(all_country_data[!us_mask,]) %>%
+arrange(date, country)
+  us_mask <- grepl("^U\\.S\\.", all_country_data_prop$country)
+  us_data_prop <- all_country_data_prop[us_mask,]
+  all_country_data_prop <- us_data_prop %>%
+group_by(date) %>%
+summarise(country="United States", event_prop=sum(event_prop),
+  visit_prop=sum(visit_prop), session_prop=sum(session_prop)) %>%
+left_join(all_country_data[, c("date","country","ctr","ctr_visit","ctr_session")], by=c("date","country")) %>%
+rbind(all_country_data_prop[!us_mask,]) %>%
+select(date,country,event_prop,ctr,visit_prop,ctr_visit,session_prop,ctr_session) %>%
+arrange(date, country)
+  colnames(all_country_data) <- c("Date", "Country", "No. Events",
+  "Overall Clickthrough Rate", "No. Visit", "Clickthrough Rate Per Visit",
+  "No. Session", "Clickthrough Rate Per Session")
+  colnames(all_country_data_prop) <- c("Date", "Country", "No. Events",
+   "Overall Clickthrough Rate", "No. Visit", "Clickthrough Rate Per Visit",
+   "No. Session", "Clickthrough Rate Per Session")
+  colnames(us_data) <- c("Date", "Country", "No. Events",
+ "Overall Clickthrough Rate", "No. Visit", "Clickthrough Rate Per Visit",
+ "No. Session", "Clickthrough Rate Per Session")
+  colnames(us_data_prop) <- c("Date", "Country", "No. Events",
+  "Overall Clickthrough Rate", "No. Visit", "Clickthrough Rate Per Visit",
+  "No. Session", "Clickthrough Rate Per Session")
+  region_mask <- 

[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: Fixed bugs in poultry

2016-12-12 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/326830 )

Change subject: Fixed bugs in poultry
..


Fixed bugs in poultry

Change-Id: Idd48ec7c25a197ffb79ca8c3b3520ee2b5e6f70e
---
M shiny-server/poultry
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/shiny-server/poultry b/shiny-server/poultry
index c4e1f7d..5fe5a0b 160000
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit c4e1f7d9bc6f46365151f67c0da28f0f72c595b5
+Subproject commit 5fe5a0ba6849ce9af881376e0bd2869ec2a25abe

-- 
To view, visit https://gerrit.wikimedia.org/r/326830
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Idd48ec7c25a197ffb79ca8c3b3520ee2b5e6f70e
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 
Gerrit-Reviewer: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: Fixed bugs in poultry

2016-12-12 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/326830 )

Change subject: Fixed bugs in poultry
..

Fixed bugs in poultry

Change-Id: Idd48ec7c25a197ffb79ca8c3b3520ee2b5e6f70e
---
M shiny-server/poultry
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/experimental refs/changes/30/326830/1

diff --git a/shiny-server/poultry b/shiny-server/poultry
index c4e1f7d..5fe5a0b 160000
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit c4e1f7d9bc6f46365151f67c0da28f0f72c595b5
+Subproject commit 5fe5a0ba6849ce9af881376e0bd2869ec2a25abe

-- 
To view, visit https://gerrit.wikimedia.org/r/326830
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Idd48ec7c25a197ffb79ca8c3b3520ee2b5e6f70e
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: Updating dashboard poultry

2016-12-08 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/326059 )

Change subject: Updating dashboard poultry
..


Updating dashboard poultry

Change-Id: I51da2775ec943f79b385c8044fdab4c3c06577cc
---
M shiny-server/poultry
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/shiny-server/poultry b/shiny-server/poultry
index 263ab79..c4e1f7d 160000
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit 263ab7937a776d1aa4b349a2bc35732117c5225c
+Subproject commit c4e1f7d9bc6f46365151f67c0da28f0f72c595b5

-- 
To view, visit https://gerrit.wikimedia.org/r/326059
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I51da2775ec943f79b385c8044fdab4c3c06577cc
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 
Gerrit-Reviewer: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: Updating dashboard poultry

2016-12-08 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/326059 )

Change subject: Updating dashboard poultry
..

Updating dashboard poultry

Change-Id: I51da2775ec943f79b385c8044fdab4c3c06577cc
---
M shiny-server/poultry
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/experimental refs/changes/59/326059/1

diff --git a/shiny-server/poultry b/shiny-server/poultry
index 263ab79..c4e1f7d 160000
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit 263ab7937a776d1aa4b349a2bc35732117c5225c
+Subproject commit c4e1f7d9bc6f46365151f67c0da28f0f72c595b5

-- 
To view, visit https://gerrit.wikimedia.org/r/326059
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I51da2775ec943f79b385c8044fdab4c3c06577cc
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: poultry fix

2016-11-08 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged.

Change subject: poultry fix
..


poultry fix

Change-Id: I6d5bb9d6ad4b639fc578d1ce6f0d9aad12c3fe82
---
M shiny-server/poultry
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/shiny-server/poultry b/shiny-server/poultry
index 8357194..263ab79 160000
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit 8357194edd41127c702f44184e9a0fcaf2b41acb
+Subproject commit 263ab7937a776d1aa4b349a2bc35732117c5225c

-- 
To view, visit https://gerrit.wikimedia.org/r/320514
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6d5bb9d6ad4b639fc578d1ce6f0d9aad12c3fe82
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 
Gerrit-Reviewer: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: poultry fix

2016-11-08 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/320514

Change subject: poultry fix
..

poultry fix

Change-Id: I6d5bb9d6ad4b639fc578d1ce6f0d9aad12c3fe82
---
M shiny-server/poultry
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/experimental refs/changes/14/320514/1

diff --git a/shiny-server/poultry b/shiny-server/poultry
index 8357194..263ab79 160000
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit 8357194edd41127c702f44184e9a0fcaf2b41acb
+Subproject commit 263ab7937a776d1aa4b349a2bc35732117c5225c

-- 
To view, visit https://gerrit.wikimedia.org/r/320514
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6d5bb9d6ad4b639fc578d1ce6f0d9aad12c3fe82
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: install highcharter packages

2016-11-08 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged.

Change subject: install highcharter packages
..


install highcharter packages

Change-Id: I55a5ddd535917642bf5ed07cab4e5ff3e909
---
M setup.sh
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/setup.sh b/setup.sh
index 75bc813..65906eb 100755
--- a/setup.sh
+++ b/setup.sh
@@ -151,7 +151,7 @@
   install_r_package shinydashboard
   install_r_package flexdashboard
   install_r_package shinyjs
-  github_install_r_package jcheng5/googleCharts
+  install_r_package highcharter
   git_install_r_package https://gerrit.wikimedia.org/r/wikimedia/discovery/polloi
   # Statistical modeling:
   install_r_package forecast

-- 
To view, visit https://gerrit.wikimedia.org/r/320440
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I55a5ddd535917642bf5ed07cab4e5ff3e909
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 
Gerrit-Reviewer: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: install highcharter packages

2016-11-08 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/320440

Change subject: install highcharter packages
..

install highcharter packages

Change-Id: I55a5ddd535917642bf5ed07cab4e5ff3e909
---
M setup.sh
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/experimental refs/changes/40/320440/1

diff --git a/setup.sh b/setup.sh
index 75bc813..65906eb 100755
--- a/setup.sh
+++ b/setup.sh
@@ -151,7 +151,7 @@
   install_r_package shinydashboard
   install_r_package flexdashboard
   install_r_package shinyjs
-  github_install_r_package jcheng5/googleCharts
+  install_r_package highcharter
   git_install_r_package https://gerrit.wikimedia.org/r/wikimedia/discovery/polloi
   # Statistical modeling:
   install_r_package forecast

-- 
To view, visit https://gerrit.wikimedia.org/r/320440
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I55a5ddd535917642bf5ed07cab4e5ff3e909
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: Updating dashboards...

2016-11-08 Thread Chelsyx (Code Review)
Chelsyx has submitted this change and it was merged.

Change subject: Updating dashboards...
..


Updating dashboards...

Change-Id: Ib3cb8cf4e3a7441d67df74e45437288862720884
---
M shiny-server/forecast
M shiny-server/poultry
2 files changed, 2 insertions(+), 2 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/shiny-server/forecast b/shiny-server/forecast
index 186bdaa..550e524 160000
--- a/shiny-server/forecast
+++ b/shiny-server/forecast
@@ -1 +1 @@
-Subproject commit 186bdaacef0ba1b5c27b4c43589ac408036c2877
+Subproject commit 550e524aa9c6266bfdc67df4539c83b19bb54141
diff --git a/shiny-server/poultry b/shiny-server/poultry
index eb36a59..8357194 160000
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit eb36a59e7f9fa4a53a57eba75410b2ec3a87908d
+Subproject commit 8357194edd41127c702f44184e9a0fcaf2b41acb

-- 
To view, visit https://gerrit.wikimedia.org/r/320437
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib3cb8cf4e3a7441d67df74e45437288862720884
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 
Gerrit-Reviewer: Chelsyx 



[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: Updating dashboards...

2016-11-08 Thread Chelsyx (Code Review)
Chelsyx has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/320437

Change subject: Updating dashboards...
..

Updating dashboards...

Change-Id: Ib3cb8cf4e3a7441d67df74e45437288862720884
---
M shiny-server/forecast
M shiny-server/poultry
2 files changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/experimental refs/changes/37/320437/1

diff --git a/shiny-server/forecast b/shiny-server/forecast
index 186bdaa..550e524 160000
--- a/shiny-server/forecast
+++ b/shiny-server/forecast
@@ -1 +1 @@
-Subproject commit 186bdaacef0ba1b5c27b4c43589ac408036c2877
+Subproject commit 550e524aa9c6266bfdc67df4539c83b19bb54141
diff --git a/shiny-server/poultry b/shiny-server/poultry
index eb36a59..8357194 160000
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit eb36a59e7f9fa4a53a57eba75410b2ec3a87908d
+Subproject commit 8357194edd41127c702f44184e9a0fcaf2b41acb

-- 
To view, visit https://gerrit.wikimedia.org/r/320437
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib3cb8cf4e3a7441d67df74e45437288862720884
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx 


