[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: small fix of survival plots
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/392761 )

Change subject: small fix of survival plots
..

small fix of survival plots

Change-Id: I7167c7ea7c628aa3cf63d1d9171eaf922e8be6c5
---
M modules/interleaved_test/page_dwelltime.R
M modules/stat_test/serp_from_autocomplete.R
M modules/stat_test/visited_page.R
M modules/test_summary/browser_os.R
4 files changed, 4 insertions(+), 5 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/modules/interleaved_test/page_dwelltime.R b/modules/interleaved_test/page_dwelltime.R
index a01926c..4f57fc7 100644
--- a/modules/interleaved_test/page_dwelltime.R
+++ b/modules/interleaved_test/page_dwelltime.R
@@ -25,7 +25,7 @@
 ggtheme = wmf::theme_facet()
 )
 p <- ggsurv$plot +
-facet_wrap(~ group, scales = "free_y") +
+facet_wrap(~ group) +
 labs(
 title = "How long users stay on each team's results",
 subtitle = "With 95% confidence intervals."
@@ -50,7 +50,7 @@
 ggtheme = wmf::theme_facet()
 )
 p <- ggsurv$plot +
-facet_wrap(~ wiki, ncol = 3, scales = "free_y") +
+facet_wrap(~ wiki, ncol = 3) +
 labs(
 title = paste0("How long users stay on each team's results, by wiki (Group = ", this_group, ")"),
 subtitle = "With 95% confidence intervals."
diff --git a/modules/stat_test/serp_from_autocomplete.R b/modules/stat_test/serp_from_autocomplete.R
index d89fa58..c982e9f 100644
--- a/modules/stat_test/serp_from_autocomplete.R
+++ b/modules/stat_test/serp_from_autocomplete.R
@@ -67,7 +67,7 @@
 ggtheme = wmf::theme_facet()
 )
 p <- ggsurv$plot +
- facet_wrap(~ wiki, ncol = 3, scales = "free_y") +
+ facet_wrap(~ wiki, ncol = 3) +
 labs(
 title = "Proportion of search results pages from autocomplete last longer than T, by test group and wiki",
 subtitle = "With 95% confidence intervals."
diff --git a/modules/stat_test/visited_page.R b/modules/stat_test/visited_page.R
index e8383b6..e608606 100644
--- a/modules/stat_test/visited_page.R
+++ b/modules/stat_test/visited_page.R
@@ -56,7 +56,7 @@
 ggtheme = wmf::theme_facet()
 )
 p <- ggsurv$plot +
- facet_wrap(~ wiki, ncol = 3, scales = "free_y") +
+ facet_wrap(~ wiki, ncol = 3) +
 labs(
 title = "Proportion of visited search results last longer than T, by test group and wiki",
 subtitle = "With 95% confidence intervals."
diff --git a/modules/test_summary/browser_os.R b/modules/test_summary/browser_os.R
index b73886d..741e639 100644
--- a/modules/test_summary/browser_os.R
+++ b/modules/test_summary/browser_os.R
@@ -1,7 +1,6 @@
 if ("user_agent" %in% names(events)) {
 user_agents <- dplyr::distinct(events, wiki, session_id, group, user_agent)
- user_agents$user_agent <- gsub('(Kindle Fire HD[X]? [0-9\\.]{1,3})"', '\\1', user_agents$user_agent, fixed = FALSE) # remove double quote in kindle name
 user_agents <- user_agents %>%
 cbind(., purrr::map_df(.$user_agent, ~ wmf::null2na(jsonlite::fromJSON(.x, simplifyVector = FALSE %>%
 mutate(

--
To view, visit https://gerrit.wikimedia.org/r/392761
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7167c7ea7c628aa3cf63d1d9171eaf922e8be6c5
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/autoreporter
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
Gerrit-Reviewer: Chelsyx

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Change grouping color in survival plots
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/392699 )

Change subject: Change grouping color in survival plots
..

Change grouping color in survival plots

Change-Id: Iacb7d2c295e820b1ac802d72fe7001a4b28b7f95
---
M modules/interleaved_test/page_dwelltime.R
M modules/stat_test/serp_from_autocomplete.R
M modules/stat_test/visited_page.R
3 files changed, 16 insertions(+), 10 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/modules/interleaved_test/page_dwelltime.R b/modules/interleaved_test/page_dwelltime.R
index 99c7aae..a01926c 100644
--- a/modules/interleaved_test/page_dwelltime.R
+++ b/modules/interleaved_test/page_dwelltime.R
@@ -18,9 +18,10 @@
 xlab = "T (Dwell Time in seconds)",
 ylab = "Proportion of visits longer than T (P%)",
 surv.scale = "percent",
-palette = colorRampPalette(RColorBrewer::brewer.pal(9, "Dark2"))(2 * length(report_params$interleaved_group_names)),
+color = "team",
+palette = "Dark2",
 legend = "bottom",
-legend.title = "",
+legend.title = "Team",
 ggtheme = wmf::theme_facet()
 )
 p <- ggsurv$plot +
@@ -42,9 +43,10 @@
 xlab = "T (Dwell Time in seconds)",
 ylab = "Proportion of visits longer than T (P%)",
 surv.scale = "percent",
-palette = colorRampPalette(RColorBrewer::brewer.pal(9, "Dark2"))(2 * n_wiki),
+color = "team",
+palette = "Dark2",
 legend = "bottom",
-legend.title = "",
+legend.title = "Team",
 ggtheme = wmf::theme_facet()
 )
 p <- ggsurv$plot +
diff --git a/modules/stat_test/serp_from_autocomplete.R b/modules/stat_test/serp_from_autocomplete.R
index ca7cde1..d89fa58 100644
--- a/modules/stat_test/serp_from_autocomplete.R
+++ b/modules/stat_test/serp_from_autocomplete.R
@@ -34,9 +34,10 @@
 xlab = "T (Dwell Time in seconds)",
 ylab = "Proportion of SERPs longer than T (P%)",
 surv.scale = "percent",
+color = "group",
 palette = "Set1",
 legend = "bottom",
-legend.title = "",
+legend.title = "Group",
 legend.labs = traditional_test_groups,
 ggtheme = wmf::theme_min()
 )
@@ -59,9 +60,10 @@
 xlab = "T (Dwell Time in seconds)",
 ylab = "Proportion of SERPs longer than T (P%)",
 surv.scale = "percent",
- palette = colorRampPalette(RColorBrewer::brewer.pal(9, "Set1"))(n_wiki * length(traditional_test_groups)),
+ color = "group",
+ palette = "Set1",
 legend = "bottom",
- legend.title = "",
+ legend.title = "Group",
 ggtheme = wmf::theme_facet()
 )
 p <- ggsurv$plot +
diff --git a/modules/stat_test/visited_page.R b/modules/stat_test/visited_page.R
index ca0df0f..e8383b6 100644
--- a/modules/stat_test/visited_page.R
+++ b/modules/stat_test/visited_page.R
@@ -9,9 +9,10 @@
 xlab = "T (Dwell Time in seconds)",
 ylab = "Proportion of visits longer than T (P%)",
 surv.scale = "percent",
+color = "group",
 palette = "Set1",
 legend = "bottom",
-legend.title = "",
+legend.title = "Group",
 legend.labs = traditional_test_groups,
 ggtheme = wmf::theme_min()
 )
@@ -48,9 +49,10 @@
 xlab = "T (Dwell Time in seconds)",
 ylab = "Proportion of visits longer than T (P%)",
 surv.scale = "percent",
- palette = colorRampPalette(RColorBrewer::brewer.pal(9, "Set1"))(n_wiki * length(traditional_test_groups)),
+ color = "group",
+ palette = "Set1",
 legend = "bottom",
- legend.title = "",
+ legend.title = "Group",
 ggtheme = wmf::theme_facet()
 )
 p <- ggsurv$plot +

--
To view, visit https://gerrit.wikimedia.org/r/392699

Gerrit-MessageType: merged
Gerrit-Change-Id: Iacb7d2c295e820b1ac802d72fe7001a4b28b7f95
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/autoreporter
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
Gerrit-Reviewer: Bearloga
Gerrit-Reviewer: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Small fixes
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/392506 )

Change subject: Small fixes
..

Small fixes

Change-Id: Ic5fb7e72a97ef532d5c84b2b07ccf655e401e1b2
---
M modules/test_summary/browser_os.R
M run.R
2 files changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/modules/test_summary/browser_os.R b/modules/test_summary/browser_os.R
index 741e639..b73886d 100644
--- a/modules/test_summary/browser_os.R
+++ b/modules/test_summary/browser_os.R
@@ -1,6 +1,7 @@
 if ("user_agent" %in% names(events)) {
 user_agents <- dplyr::distinct(events, wiki, session_id, group, user_agent)
+ user_agents$user_agent <- gsub('(Kindle Fire HD[X]? [0-9\\.]{1,3})"', '\\1', user_agents$user_agent, fixed = FALSE) # remove double quote in kindle name
 user_agents <- user_agents %>%
 cbind(., purrr::map_df(.$user_agent, ~ wmf::null2na(jsonlite::fromJSON(.x, simplifyVector = FALSE %>%
 mutate(
diff --git a/run.R b/run.R
index 0d86df4..cd78fa1 100644
--- a/run.R
+++ b/run.R
@@ -33,7 +33,6 @@
 # Set up
 report_params <- yaml::yaml.load_file(opt$yaml_file)
-report_params <- yaml::yaml.load_file("reports/ltr_test_18lang.yaml")
 if (!dir.exists("reports")) {
 dir.create("reports")
 }

--
To view, visit https://gerrit.wikimedia.org/r/392506

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic5fb7e72a97ef532d5c84b2b07ccf655e401e1b2
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/autoreporter
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
Gerrit-Reviewer: Chelsyx
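
The effect of the `gsub()` line added to browser_os.R in this change can be tried on a single string. This is only a sketch: the sample user-agent JSON is made up, and it assumes the stray double quote is an inch mark inside the Kindle device name, which would otherwise leave the string as invalid JSON for `jsonlite::fromJSON()`. The pattern and replacement are copied from the diff; `raw_ua` and `clean_ua` are hypothetical names.

```r
# Hypothetical user-agent JSON string (not taken from real data).
# The unescaped inch mark after "8.9" breaks the JSON quoting.
raw_ua <- '{"device_family": "Kindle Fire HD 8.9" WiFi", "os_family": "Android"}'

# Same pattern and replacement as the line in browser_os.R:
# keep the captured device name and drop the double quote that follows it.
clean_ua <- gsub('(Kindle Fire HD[X]? [0-9\\.]{1,3})"', '\\1', raw_ua, fixed = FALSE)

print(clean_ua)
```

Note that the pattern removes whatever double quote directly follows the size number, so it presumably relies on the device name continuing after the inch mark, as in the sample string.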
[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Add interleaved test analysis
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/392102 )

Change subject: Add interleaved test analysis
..

Add interleaved test analysis

Bug: T176493
Change-Id: I795023856963030e67e85e1cde7352842aa3a7a8
---
M README.md
M functions.R
M modules/data/data_aggregation.R
M modules/data/data_cleansing.R
M modules/data/fetch_data.R
M modules/explore_similar/esclicks.R
M modules/explore_similar/hover_over.R
A modules/interleaved_test/data_processing.R
A modules/interleaved_test/interleaved_preference.R
A modules/interleaved_test/page_dwelltime.R
M modules/setup.R
M modules/sister_search/iwclicks.R
M modules/sister_search/sidebar_results.R
M modules/sister_search/ssclicks.R
M modules/stat_test/engagement.R
M modules/stat_test/first_clicked.R
M modules/stat_test/max_clicked.R
M modules/stat_test/paulscore.R
A modules/stat_test/remove_interleaved_data.R
M modules/stat_test/return_rate.R
M modules/stat_test/search_abandon_rate.R
M modules/stat_test/serp_from_autocomplete.R
M modules/stat_test/serp_load_time.R
M modules/stat_test/serp_offset.R
M modules/stat_test/visited_page.R
M modules/stat_test/zrr.R
M modules/test_summary/browser_os.R
M modules/test_summary/events.R
M modules/test_summary/searches.R
M report.Rmd
M run.R
31 files changed, 483 insertions(+), 148 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/README.md b/README.md
index 11f34a5..b0ae2f8 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,8 @@
 ```R
 install.packages(c("tidyverse", "toOrdinal", "jsonlite", "yaml", "rmarkdown", "tools",
- "knitr", "RMySQL", "data.table", "lubridate", "binom", "survival", "survminer", "import"))
+ "knitr", "RMySQL", "data.table", "lubridate", "binom", "survival", "survminer", "import",
+ "BayesFactor", "formattable", "DT", "htmltools", "scales", "Rcpp", "urltools", "rlang", "RColorBrewer"))
 devtools::install_git("https://gerrit.wikimedia.org/r/p/wikimedia/discovery/wmf.git")
 devtools::install_git("https://gerrit.wikimedia.org/r/p/wikimedia/discovery/polloi.git")
 devtools::install_github("bearloga/BCDA")
diff --git a/functions.R b/functions.R
index a4c6367..8be26c1 100644
--- a/functions.R
+++ b/functions.R
@@ -1,5 +1,6 @@
 # PaulScore Calculation
-query_score <- function(positions, F) { # 0-based positions
+# 0-based positions
+query_score <- function(positions, F) {
 if (length(positions) == 1 || all(is.na(positions))) {
 # no clicks were made
 return(0)
@@ -66,7 +67,7 @@
 ggplot2::geom_bar(stat = "identity", position = "dodge") +
 ggplot2::scale_fill_brewer("Group", palette = "Set1") +
 ggplot2::scale_y_continuous(labels = polloi::compress) +
-ggplot2::geom_text(aes_string(label = y, vjust = -0.5), position = position_dodge(width = 1), size = geom_text_size) +
+ggplot2::geom_text(aes_string(label = y, vjust = -0.05), position = position_dodge(width = 1), size = geom_text_size) +
 ggplot2::labs(y = y_lab, x = x_lab, title = title, subtitle = subtitle, caption = caption)
 }
@@ -79,3 +80,59 @@
 ggplot2::scale_color_brewer(palette = "Set1") +
 ggplot2::labs(x = NULL, color = "Group", y = y_lab, title = title, subtitle = subtitle)
 }
+
+cppFunction('CharacterVector fill_in(CharacterVector ids) {
+ CharacterVector new_ids(ids.size());
+ String current_id = ids[0];
+ new_ids[0] = current_id;
+ for (int i = 1; i < ids.size(); i++) {
+if (ids[i] != NA_STRING) {
+ current_id = ids[i];
+}
+new_ids[i] = current_id;
+ }
+ return new_ids;
+}')
+
+cppFunction('NumericVector cumunique(CharacterVector ids) {
+ NumericVector count(ids.size());
+ String current_id = ids[0];
+ count[0] = 1;
+ for (int i = 1; i < ids.size(); i++) {
+if (ids[i] == current_id) {
+ count[i] = count[i-1];
+} else {
+ count[i] = count[i-1] + 1;
+ current_id = ids[i];
+}
+ }
+ return count;
+}')
+
+# Process interleaved team draft
+process_session <- function(df) {
+ processed_session <- unsplit(lapply(split(df, df$serp_id), function(df) {
+if (is.na(df$event_extraParams[1]) || df$event_extraParams[1] == "") {
+ visited_pages <- rep(as.character(NA), times = nrow(df))
+} else {
+ from_json <- jsonlite::fromJSON(df$event_extraParams[1], simplifyVector = FALSE)
+ if (!("teamDraft" %in% names(from_json)) || all(is.na(df$article_id))) {
+visited_pages <- rep(as.character(NA), times = nrow(df))
+ } else {
+team_a <- unlist(from_json$teamDraft$a)
+team_b <- unlist(from_json$teamDraft$b)
+visited_pages <- vapply(df$article_id, function(article_id) {
+ if (article_id %in% team_a) {
+return("A")
+ } else if (article_id %in% team_b) {
+return("B")
+ } else {
+return(as.character(NA))
+ }
+}, "")
+ }
+}
+return(visited_pages)
+ }), df$serp_id)
+
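
For readers without Rcpp at hand, the behaviour of the two `cppFunction()` helpers added to functions.R can be approximated in base R. This is a sketch rather than the change's own code: `fill_in_r` and `cumunique_r` are hypothetical names, and `cumunique_r` assumes the input contains no `NA` values.

```r
# Base-R sketch of fill_in(): carry the last non-NA id forward.
fill_in_r <- function(ids) {
  out <- character(length(ids))
  current_id <- ids[1]
  for (i in seq_along(ids)) {
    if (!is.na(ids[i])) {
      current_id <- ids[i]
    }
    out[i] <- current_id
  }
  out
}

# Base-R sketch of cumunique(): running count of runs of identical ids.
# Assumes ids contains no NA values.
cumunique_r <- function(ids) {
  if (length(ids) == 0) {
    return(numeric(0))
  }
  # TRUE wherever an element differs from the one before it
  changed <- ids[-1] != ids[-length(ids)]
  cumsum(c(1, changed))
}

filled <- fill_in_r(c("a", NA, NA, "b", NA))
runs <- cumunique_r(c("x", "x", "y", "y", "x"))
print(filled)
print(runs)
```

As in the C++ version, `cumunique_r` counts a repeated id as a new run when it reappears after a different id, so it counts runs rather than distinct values.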
[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Add interleaved test analysis
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/392102 )

Change subject: Add interleaved test analysis
..

Add interleaved test analysis

Bug: T176493
Change-Id: I795023856963030e67e85e1cde7352842aa3a7a8
---
M functions.R
M modules/data/data_aggregation.R
M modules/data/data_cleansing.R
M modules/data/fetch_data.R
A modules/interleaved_test/data_processing.R
A modules/interleaved_test/interleaved_preference.R
A modules/interleaved_test/page_dwelltime.R
M modules/setup.R
M modules/sister_search/sidebar_results.R
M modules/stat_test/engagement.R
A modules/stat_test/remove_interleaved_data.R
M modules/stat_test/return_rate.R
M modules/stat_test/serp_from_autocomplete.R
M modules/stat_test/serp_offset.R
M modules/stat_test/visited_page.R
M modules/test_summary/browser_os.R
M modules/test_summary/events.R
M modules/test_summary/searches.R
M report.Rmd
M run.R
20 files changed, 388 insertions(+), 65 deletions(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/autoreporter refs/changes/02/392102/1

diff --git a/functions.R b/functions.R
index a4c6367..a7dff7e 100644
--- a/functions.R
+++ b/functions.R
@@ -79,3 +79,59 @@
 ggplot2::scale_color_brewer(palette = "Set1") +
 ggplot2::labs(x = NULL, color = "Group", y = y_lab, title = title, subtitle = subtitle)
 }
+
+cppFunction('CharacterVector fill_in(CharacterVector ids) {
+ CharacterVector new_ids(ids.size());
+ String current_id = ids[0];
+ new_ids[0] = current_id;
+ for (int i = 1; i < ids.size(); i++) {
+if (ids[i] != NA_STRING) {
+ current_id = ids[i];
+}
+new_ids[i] = current_id;
+ }
+ return new_ids;
+}')
+
+cppFunction('NumericVector cumunique(CharacterVector ids) {
+ NumericVector count(ids.size());
+ String current_id = ids[0];
+ count[0] = 1;
+ for (int i = 1; i < ids.size(); i++) {
+if (ids[i] == current_id) {
+ count[i] = count[i-1];
+} else {
+ count[i] = count[i-1] + 1;
+ current_id = ids[i];
+}
+ }
+ return count;
+}')
+
+# Process interleaved team draft
+process_session <- function(df) {
+ processed_session <- unsplit(lapply(split(df, df$serp_id), function(df) {
+if (is.na(df$event_extraParams[1]) || df$event_extraParams[1] == "") {
+ visited_pages <- rep(as.character(NA), times = nrow(df))
+} else {
+ from_json <- jsonlite::fromJSON(df$event_extraParams[1], simplifyVector = FALSE)
+ if (!("teamDraft" %in% names(from_json)) || all(is.na(df$article_id))) {
+visited_pages <- rep(as.character(NA), times = nrow(df))
+ } else {
+team_a <- unlist(from_json$teamDraft$a)
+team_b <- unlist(from_json$teamDraft$b)
+visited_pages <- vapply(df$article_id, function(article_id) {
+ if (article_id %in% team_a) {
+return("A")
+ } else if (article_id %in% team_b) {
+return("B")
+ } else {
+return(as.character(NA))
+ }
+}, "")
+ }
+}
+return(visited_pages)
+ }), df$serp_id)
+ return(processed_session)
+}
diff --git a/modules/data/data_aggregation.R b/modules/data/data_aggregation.R
index db5a178..6115703 100644
--- a/modules/data/data_aggregation.R
+++ b/modules/data/data_aggregation.R
@@ -8,9 +8,9 @@
 message("Aggregating by search...")
 searches <- events %>%
- keep_where(!(is.na(serp_id))) %>% # remove visitPage and checkin events
- arrange(date, session_id, serp_id, timestamp) %>%
- group_by(group, wiki, session_id, serp_id) %>%
+ keep_where(!(is.na(search_id))) %>% # remove visitPage and checkin events
+ arrange(date, session_id, search_id, timestamp) %>%
+ group_by(group, wiki, session_id, search_id) %>%
 summarize(
 date = date[1],
 timestamp = timestamp[1],
@@ -56,19 +56,19 @@
 keep_where(event == "searchResultPage", `some same-wiki results` == "TRUE") %>%
 # SERPs with 0 results will not have an offset in extraParams ^
 mutate(offset = purrr::map_int(event_extraParams, ~ parse_extraParams(.x, action = "searchResultPage")$offset)) %>%
-select(session_id, event_id, serp_id, offset)
+select(session_id, event_id, search_id, offset)
 message("Processing SERP interwiki data...")
- extract_iw <- function(session_id, event_id, serp_id, event_extraParams) {
+ extract_iw <- function(session_id, event_id, search_id, event_extraParams) {
 return(data.frame(
- session_id, event_id, serp_id,
+ session_id, event_id, search_id,
 parse_extraParams(event_extraParams, action = "searchResultPage")$iw,
 stringsAsFactors = FALSE
 ))
 }
 serp_iw <- events %>%
 keep_where(event == "searchResultPage") %>%
-select(session_id, event_id, serp_id, event_extraParams) %>%
+select(session_id, event_id, search_id, event_extraParams) %>%
 purrr::pmap_df(extract_iw) %>%
 mutate(source = case_when(
 source == "wikt" ~ "Wiktionary",
@@ -104,12 +104,12 @@
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: db1047 => db1108
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/391062 )

Change subject: db1047 => db1108
..

db1047 => db1108

Bug: T156844
Change-Id: I91270ef00fcb698e686e536162ab4d330ba7cb2b
---
M CHANGELOG.md
M README.md
M modules/metrics/maps/config.yaml
M modules/metrics/portal/config.yaml
M modules/metrics/search/config.yaml
5 files changed, 22 insertions(+), 9 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 3c37dae..6c428e2 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,10 @@
 # Change Log (Patch Notes)
 All notable changes to this project will be documented in this file.
+## 2017/11/13
+- Switched host name from db1047.eqiad.wmnet to db1108.eqiad.wmnet per [T156844](https://phabricator.wikimedia.org/T156844)
+- Updated documentation
+
 ## 2017/11/02
 - Disabled forecasting (per [T112170#3724472](https://phabricator.wikimedia.org/T112170#3724472))
diff --git a/README.md b/README.md
index f8dfca6..b4f6b70 100644
--- a/README.md
+++ b/README.md
@@ -119,12 +119,15 @@
 - [x] Search on Mobile Web
 - [x] [Event counts](modules/metrics/search/mobile_event_counts.sql)
 - [x] [Load times](modules/metrics/search/mobile_load_times) (invokes [load_times.R](modules/metrics/search/load_times.R))
+- [x] [Session counts](modules/metrics/search/mobile_session_counts) (invokes [mobile_session_counts.R](modules/metrics/search/mobile_session_counts.R))
 - [x] Search on Desktop
 - [x] [Event counts](modules/metrics/search/desktop_event_counts.sql)
 - [x] [Load times](modules/metrics/search/desktop_load_times) (invokes [load_times.R](modules/metrics/search/load_times.R))
 - [x] [Survival/LDN: Retention of users on visited pages](modules/metrics/search/sample_page_visit_ld) ([T113297](https://phabricator.wikimedia.org/T113297))
 - [x] [Dwell-time: % of users visiting results for more than 10s](modules/metrics/search/search_threshold_pass_rate) ([T113297](https://phabricator.wikimedia.org/T113297), [T113513](https://phabricator.wikimedia.org/T113513), [Change 240593](https://gerrit.wikimedia.org/r/#/c/240593/))
+- [x] [Time spent on search result pages (SRPs)](modules/metrics/search/srp_survtime) (invokes [srp_survtime.R](modules/metrics/search/srp_survtime.R))
 - [x] [PaulScore](modules/metrics/search/paulscore_approximations) ([T144424](https://phabricator.wikimedia.org/T144424))
+- [x] [Bounce rate](modules/metrics/search/desktop_return_rate) (invokes [desktop_return_rate.R](modules/metrics/search/desktop_return_rate.R))
 - Dwell-time, PaulScore, event counts, etc. broken down by language-project (planned, [T150410](https://phabricator.wikimedia.org/T150410))
 - [x] Zero results rate (all invoke [cirrus_aggregates.R](modules/metrics/search/cirrus_aggregates.R))
 - [x] Overall
@@ -140,9 +143,15 @@
 - [x] [No automata](modules/metrics/search/cirrus_langproj_breakdown_no_automata)
 - [x] [With automata](modules/metrics/search/cirrus_langproj_breakdown_with_automata)
 - Well-behaved searchers (planned, [T150901](https://phabricator.wikimedia.org/T150901))
-- Probable non-bots, as detected by ML (planned, [T149440](https://phabricator.wikimedia.org/T149440)
+- Probable non-bots, as detected by ML (abandoned, [T149440](https://phabricator.wikimedia.org/T149440)
+- [x] Sister search
+ - [x] [Prevalence on SRPs](modules/metrics/search/sister_search_prevalence.sql)
+ - [x] [Traffic to sister projects from Wikipedia SRPs](modules/metrics/search/sister_search_traffic)
+- [x] [Article pageviews from full-text search](modules/metrics/search/pageviews_from_fulltext_search)
+- [x] [Full-text SRP views by device and agent type](modules/metrics/search/search_result_pages)
 - [x] [Wikipedia.org Portal](https://www.mediawiki.org/wiki/Wikipedia.org_Portal) ([configuration](modules/metrics/portal/config.yaml), [T118994](https://phabricator.wikimedia.org/T118994))
 - [x] [Pageviews](modules/metrics/portal/pageviews) ([T125737](https://phabricator.wikimedia.org/T125737), [T143064](https://phabricator.wikimedia.org/T143064), [T143605](https://phabricator.wikimedia.org/T143605))
+- [x] [Pageviews by device (mobile vs desktop)](modules/metrics/portal/pageviews_by_device)
 - [x] [Referers](modules/metrics/portal/referer_data)
 - [x] [User Agent breakdown](modules/metrics/portal/user_agent_data)
 - [x] Languages
@@ -162,6 +171,8 @@
 - [x] [Last performed action](modules/metrics/portal/last_action_country)
 - [x] [Most commonly clicked section per visit](modules/metrics/portal/most_common_country)
 - [x] [Clickthrough on first visit](modules/metrics/portal/first_visits_country)
+- [x]
[MediaWiki-commits] [Gerrit] wikimedia...wmf[master]: db1047 => db1108
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/391063 )

Change subject: db1047 => db1108

db1047 => db1108

Bug: T156844
Change-Id: I81f0f93a97f7467e1fcf30e20c252fc044bbbd31
---
M DESCRIPTION
M NEWS.md
M R/mysql.R
M man/mysql.Rd
4 files changed, 8 insertions(+), 4 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/DESCRIPTION b/DESCRIPTION
index d0d3314..57e4d2f 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,8 +1,8 @@
 Package: wmf
 Type: Package
 Title: R Code for Wikimedia Foundation Internal Usage
-Version: 0.3.0
-Date: 2017-11-01
+Version: 0.3.1
+Date: 2017-11-13
 Authors@R: c(
     person("Mikhail", "Popov", email = "mikh...@wikimedia.org", role = c("aut", "cre")),
     person("Oliver", "Keyes", role = "aut", comment = "No longer employed at the Foundation"),
diff --git a/NEWS.md b/NEWS.md
index c3927ce..dfd22b8 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,3 +1,7 @@
+wmf 0.3.1
+=
+* Switched host name from db1047.eqiad.wmnet to db1108.eqiad.wmnet per [T156844](https://phabricator.wikimedia.org/T156844)
+
 wmf 0.3.0
 =
 * C++-based `exact_binomial()` to quickly estimate sample size for exact binomial tests
diff --git a/R/mysql.R b/R/mysql.R
index 4594a36..725b8c4 100644
--- a/R/mysql.R
+++ b/R/mysql.R
@@ -33,7 +33,7 @@
 #' @export
 mysql_connect <- function(
   database, default_file = NULL,
-  hostname = ifelse(database == "log", "db1047.eqiad.wmnet", "analytics-store.eqiad.wmnet")
+  hostname = ifelse(database == "log", "db1108.eqiad.wmnet", "analytics-store.eqiad.wmnet")
 ) {
   # Begin Exclude Linting
   if (is.null(default_file)) {
diff --git a/man/mysql.Rd b/man/mysql.Rd
index 09c9606..2032aa8 100644
--- a/man/mysql.Rd
+++ b/man/mysql.Rd
@@ -11,7 +11,7 @@
 \title{Work with MySQL databases}
 \usage{
 mysql_connect(database, default_file = NULL, hostname = ifelse(database ==
-  "log", "db1047.eqiad.wmnet", "analytics-store.eqiad.wmnet"))
+  "log", "db1108.eqiad.wmnet", "analytics-store.eqiad.wmnet"))
 mysql_read(query, database, con = NULL)

-- 
To view, visit https://gerrit.wikimedia.org/r/391063
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I81f0f93a97f7467e1fcf30e20c252fc044bbbd31
Gerrit-PatchSet: 2
Gerrit-Project: wikimedia/discovery/wmf
Gerrit-Branch: master
Gerrit-Owner: Bearloga
Gerrit-Reviewer: Chelsyx
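
The default-argument change in R/mysql.R can be illustrated with a minimal sketch. The helper name `default_hostname` is hypothetical and is not part of the wmf package; only the `database == "log"` test and the two host names are taken from the diff above.

```r
# Hypothetical stand-in for the default `hostname` argument of mysql_connect():
# the "log" database is served from db1108, everything else from analytics-store.
default_hostname <- function(database) {
  ifelse(database == "log", "db1108.eqiad.wmnet", "analytics-store.eqiad.wmnet")
}

# ifelse() is vectorised, so several database names can be resolved at once.
print(default_hostname(c("log", "enwiki")))
```

Because the host is only a default, a caller that passes `hostname` explicitly would presumably be unaffected by this change.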
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Disable forecasting
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/388117 ) Change subject: Disable forecasting .. Disable forecasting Bug: T112170 Change-Id: Ie985c774b83e961b526bd86d1ec17754a0f03c66 --- M CHANGELOG.md M README.md M docs/README.Rmd M docs/README.md M main.sh M test.R 6 files changed, 30 insertions(+), 80 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/CHANGELOG.md b/CHANGELOG.md index 9767a68..3c37dae 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,6 +1,9 @@ # Change Log (Patch Notes) All notable changes to this project will be documented in this file. +## 2017/11/02 +- Disabled forecasting (per [T112170#3724472](https://phabricator.wikimedia.org/T112170#3724472)) + ## 2017/10/05 - Changed which hostname the SQL queries are run on ([T176639](https://phabricator.wikimedia.org/T176639)) diff --git a/README.md b/README.md index e93ed81..f8dfca6 100644 --- a/README.md +++ b/README.md @@ -183,7 +183,7 @@ - KPIs (planned) - [x] External Traffic ([configuration](modules/metrics/external_traffic/config.yaml)) - [x] [Referer data](modules/metrics/external_traffic/referer_data) ([T116295](https://phabricator.wikimedia.org/T116295), [Change 247601](https://gerrit.wikimedia.org/r/#/c/247601/)) -- [x] **Forecasts** ([modules/forecasts/forecast.R](modules/forecasts/forecast.R), see [T112170](https://phabricator.wikimedia.org/T112170) for more details) +- [x] **Forecasts** ([modules/forecasts/forecast.R](modules/forecasts/forecast.R), see [T112170](https://phabricator.wikimedia.org/T112170) for more details) (DISABLED) - [x] Search ([configuration](modules/forecasts/search/config.yaml)) - [x] Cirrus API usage - [x] [ARIMA-modelled forecasts](modules/forecasts/search/api_cirrus_arima) diff --git a/docs/README.Rmd b/docs/README.Rmd index c7d27c0..ff37748 100644 --- a/docs/README.Rmd +++ b/docs/README.Rmd @@ -39,20 +39,3 @@ ```{r results='asis'} print_reports(metrics) ``` - -## Daily Forecasts - -```{r 
yamls_forecasts} -config_yamls <- list.files(path = "../modules/forecasts", pattern = "^config\\.yaml$", recursive = TRUE, full.names = TRUE) -names(config_yamls) <- sub("../modules/forecasts/", "", dirname(config_yamls), fixed = TRUE) -forecasts <- dplyr::bind_rows(lapply(config_yamls, function(path) { - config_yaml <- suppressMessages(suppressWarnings(data.tree::as.Node(yaml::yaml.load_file(path)))) - reports <- data.tree::ToDataFrameTable(config_yaml[["reports"]], "report" = "name", "description") - reports$path = paste0(file.path(dirname(path), reports$report), ifelse(reports$type == "sql", ".sql", "")) - return(reports) -}), .id = "module") -``` - -```{r results='asis'} -print_reports(forecasts) -``` diff --git a/docs/README.md b/docs/README.md index 2053bcf..ef4ef60 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,14 +1,14 @@ Discovery Datasets == -These files are generated by Discovery's +These files are generated by Discovery’s [Golden](https://github.com/wikimedia/wikimedia-discovery-golden/) data retrieval codebase that executes daily and uses [Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Systems/Reportupdater) infrastructure. These datasets provide the metrics that are used by -[Discovery's Dashboards](https://discovery.wmflabs.org/) +[Discovery’s Dashboards](https://discovery.wmflabs.org/) -Last updated on 27 September 2017 +Last updated on 02 November 2017 Daily Metrics - @@ -16,18 +16,18 @@ external\_traffic/ -- -- **referer\_data.tsv**: Pageviews broken down by referrer class (e.g. -internal vs external) and search engine +- **referer\_data.tsv**: Pageviews broken down by referrer class +(e.g. internal vs external) and search engine - **referer\_nonbot\_data.tsv**: User-made pageviews broken down by -referrer class (e.g. internal vs external) and search engine +referrer class (e.g. internal vs external) and search engine maps/ - -- **actions\_per\_tool.tsv**: Actions broken down by feature (e.g. 
-GeoHack) +- **actions\_per\_tool.tsv**: Actions broken down by feature +(e.g. GeoHack) - **users\_per\_feature.tsv**: Counts of users broken down by feature -(e.g. GeoHack) +(e.g. GeoHack) - **users\_by\_country.tsv**: Counts of users broken down by top 10 countries - **tile\_aggregates\_with\_automata.tsv**: Tile counts by style, zoom @@ -43,11 +43,11 @@ --- - **pageviews.tsv**: Wikipedia.org Portal pageviews, broken down by -high-volume clients vs. low-volume clients -- **referer\_data.tsv**: Pageviews broken down by referrer class (e.g. -internal vs external) -- **user\_agent\_data.tsv**: Wikipedia.org Portal visitors' browsers -- **dwell\_metrics.tsv**: Wikipedia.org Portal visitors' dwell-time +high-volume clients vs.
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Annotation of mw.track bug fix
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/385014 )

Change subject: Annotation of mw.track bug fix

Annotation of mw.track bug fix

Bug: T178097
Change-Id: I720a8ec47c0a2d86480f3b7c97880ee890a53c03
---
M modules/mobile_web/events.R
M tab_documentation/mobile_events.md
2 files changed, 5 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/14/385014/1

diff --git a/modules/mobile_web/events.R b/modules/mobile_web/events.R
index 3e0125a..8e044cc 100644
--- a/modules/mobile_web/events.R
+++ b/modules/mobile_web/events.R
@@ -40,12 +40,14 @@
     polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Mobile search events, by day") %>%
     dyRangeSelector %>%
     dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>%
-    dyEvent(as.Date("2017-03-29"), "H (new header)", labelLoc = "bottom")
+    dyEvent(as.Date("2017-03-29"), "H (new header)", labelLoc = "bottom") %>%
+    dyEvent(as.Date("2017-09-28"), "B (mw.track bug)", labelLoc = "bottom")
 })
 
 output$mobile_session_plot <- renderDygraph({
   mobile_session %>%
     polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_mobile_event)) %>%
     polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Mobile user sessions, by volume") %>%
-    dyRangeSelector
+    dyRangeSelector %>%
+    dyEvent(as.Date("2017-09-28"), "B (mw.track bug)", labelLoc = "bottom")
 })
diff --git a/tab_documentation/mobile_events.md b/tab_documentation/mobile_events.md
index c8a029a..fc8db23 100644
--- a/tab_documentation/mobile_events.md
+++ b/tab_documentation/mobile_events.md
@@ -26,6 +26,7 @@
 * Between 5 May and 6 May 2015, approximately 40% of incoming EventLogging data was lost due to a wider EventLogging outage. You can read more about the outage [here](https://wikitech.wikimedia.org/wiki/Incident_documentation/20150506-EventLogging).
 * '__R__': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of [our data retrieval and processing codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' [Reportupdater infrastructure](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater). See [T150915](https://phabricator.wikimedia.org/T150915) for more details.
 * '__H__': on 2017-03-29 we deployed the new mobile header treatment (including the search box) which may result in the decrease of search. See [T176464](https://phabricator.wikimedia.org/T176464) for more information.
+* '__B__': on 2017-09-28 a bug in mw.track was fixed. Before 2017-09-28, if events are logged via mw.track, only events tracked during the first pageview of a user's session were logged. See [T175918](https://phabricator.wikimedia.org/T175918) for more details.
 
 Questions, bug reports, and feature suggestions
 --

-- 
To view, visit https://gerrit.wikimedia.org/r/385014
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I720a8ec47c0a2d86480f3b7c97880ee890a53c03
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/rainbow
Gerrit-Branch: develop
Gerrit-Owner: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Fix UI stuff
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/384935 ) Change subject: Fix UI stuff .. Fix UI stuff Change-Id: Ie2402b37b124e4a2fdab2cb7697674d65342fd79 --- M parameters.yaml M report.Rmd 2 files changed, 175 insertions(+), 118 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/parameters.yaml b/parameters.yaml index 71eccbc..345068f 100644 --- a/parameters.yaml +++ b/parameters.yaml @@ -17,4 +17,5 @@ event_action: NULL # if not NULL, only specified actions are selected event_source: fulltext # autocomplete not yet supported other_filter: "event_searchSessionId <> 'explore_similar_test'" # if not NULL, these filters will be appended to WHERE clause +serp_dwell_time: false # If true, dwell time of fulltext search result pages from autocomplete will be included debug: false # setting to false hides messages and warnings diff --git a/report.Rmd b/report.Rmd index 50c315b..82d1981 100644 --- a/report.Rmd +++ b/report.Rmd @@ -19,6 +19,7 @@ event_action: [searchResultPage, click, ssclick, visitPage, checkin, hover-on, hover-off, esclick] event_source: "fulltext" other_filter: "event_subTest IS NOT NULL" + serp_dwell_time: false # If true, dwell time of fulltext search result pages from autocomplete will be included debug: false # setting to false hides messages and warnings title: '`r params$report_title`' author: '`r paste("Generated by", ifelse(params$debug, Sys.info()["user"], "the automated A/B test reporting tool"))`' @@ -97,10 +98,6 @@ # Take all R colors from graphical devices (with grey removed) large_color_palette = grDevices::colors()[grep('gr(a|e)y', grDevices::colors(), invert = T)] ``` - -`r if (!is.null(params$test_description)) { params$test_description }` - -This test ran from `r format(lubridate::ymd(params$start_date), "%d %B %Y")` to `r format(lubridate::ymd(params$end_date) - 1, "%d %B %Y")` on `r ifelse(is.null(params$wiki), "all wikis", paste(params$wiki, collapse = ", "))`. 
There were `r length(params$test_group_names)` test groups: `r paste(params$test_group_names, collapse = ", ")`. This report includes `r paste(params$event_source, collapse = ", ")` searches. Refer to Phabricator ticket [`r params$phab_ticket`](`r paste0("https://phabricator.wikimedia.org/", params$phab_ticket)`) for more details. ```{r sql_setup, echo=FALSE} is_stat_machine <- grepl("^stat1", Sys.info()["nodename"]) @@ -200,11 +197,15 @@ if (is_stat_machine) { message("(Running on a stat machine.)") events_raw <- wmf::mysql_read(query, "log") -fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log") +if (params$serp_dwell_time) { + fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log") +} } else { message("Using SSH tunnel & connection to Analytics-Store...") events_raw <- wmf::mysql_read(query, "log", con = con) -fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log", con = con) +if (params$serp_dwell_time) { + fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log", con = con) +} message("Closing connection...") wmf::mysql_close(con) } @@ -216,11 +217,16 @@ message("Saving raw events data...") save(events_raw, file = file.path("data", gsub(.Platform$file.sep, "", params$report_title), paste0("events_raw_", gsub("[^0-9]", "", Sys.time()), ".RData"))) - message("Saving SERP data that are from autocomplete...") - save(fulltext_from_auto, file = file.path("data", gsub(.Platform$file.sep, "", params$report_title), paste0("fulltext_from_auto_", gsub("[^0-9]", "", Sys.time()), ".RData"))) - + + if (params$serp_dwell_time) { +message("Saving SERP data that are from autocomplete...") +save(fulltext_from_auto, file = file.path("data", gsub(.Platform$file.sep, "", params$report_title), paste0("fulltext_from_auto_", gsub("[^0-9]", "", Sys.time()), ".RData"))) + } + cat("**Query for full-text events**:\n\n```SQL\n", query, "\n```\n") - cat("**Query for SERP from autocomplete**:\n\n```SQL\n", query_autocomplete, "\n```\n") + if 
(params$serp_dwell_time) { +cat("**Query for SERP from autocomplete**:\n\n```SQL\n", query_autocomplete, "\n```\n") + } } else { @@ -232,6 +238,10 @@ } ``` + +`r if (!is.null(params$test_description)) { params$test_description }` + +This test ran from `r format(lubridate::ymd_hms(min(events_raw$timestamp)), "%d %B %Y")` to `r format(lubridate::ymd_hms(max(events_raw$timestamp)), "%d %B %Y")` on `r ifelse(is.null(params$wiki), "all wikis", paste(params$wiki, collapse = ", "))`. There were `r length(params$test_group_names)` test groups: `r paste(params$test_group_names, collapse = ", ")`. This report includes `r paste(params$event_source, collapse = ", ")` searches. Refer to Phabricator ticket [`r params$phab_ticket`](`r
[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Fix UI stuff
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/384935 ) Change subject: Fix UI stuff .. Fix UI stuff Change-Id: Ie2402b37b124e4a2fdab2cb7697674d65342fd79 --- M parameters.yaml M report.Rmd 2 files changed, 179 insertions(+), 122 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/autoreporter refs/changes/35/384935/1 diff --git a/parameters.yaml b/parameters.yaml index 71eccbc..345068f 100644 --- a/parameters.yaml +++ b/parameters.yaml @@ -17,4 +17,5 @@ event_action: NULL # if not NULL, only specified actions are selected event_source: fulltext # autocomplete not yet supported other_filter: "event_searchSessionId <> 'explore_similar_test'" # if not NULL, these filters will be appended to WHERE clause +serp_dwell_time: false # If true, dwell time of fulltext search result pages from autocomplete will be included debug: false # setting to false hides messages and warnings diff --git a/report.Rmd b/report.Rmd index 50c315b..d70f2d5 100644 --- a/report.Rmd +++ b/report.Rmd @@ -19,6 +19,7 @@ event_action: [searchResultPage, click, ssclick, visitPage, checkin, hover-on, hover-off, esclick] event_source: "fulltext" other_filter: "event_subTest IS NOT NULL" + serp_dwell_time: false # If true, dwell time of fulltext search result pages from autocomplete will be included debug: false # setting to false hides messages and warnings title: '`r params$report_title`' author: '`r paste("Generated by", ifelse(params$debug, Sys.info()["user"], "the automated A/B test reporting tool"))`' @@ -97,10 +98,6 @@ # Take all R colors from graphical devices (with grey removed) large_color_palette = grDevices::colors()[grep('gr(a|e)y', grDevices::colors(), invert = T)] ``` - -`r if (!is.null(params$test_description)) { params$test_description }` - -This test ran from `r format(lubridate::ymd(params$start_date), "%d %B %Y")` to `r format(lubridate::ymd(params$end_date) - 1, "%d %B %Y")` on `r ifelse(is.null(params$wiki), "all wikis", 
paste(params$wiki, collapse = ", "))`. There were `r length(params$test_group_names)` test groups: `r paste(params$test_group_names, collapse = ", ")`. This report includes `r paste(params$event_source, collapse = ", ")` searches. Refer to Phabricator ticket [`r params$phab_ticket`](`r paste0("https://phabricator.wikimedia.org/", params$phab_ticket)`) for more details. ```{r sql_setup, echo=FALSE} is_stat_machine <- grepl("^stat1", Sys.info()["nodename"]) @@ -200,11 +197,15 @@ if (is_stat_machine) { message("(Running on a stat machine.)") events_raw <- wmf::mysql_read(query, "log") -fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log") +if (params$serp_dwell_time) { + fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log") +} } else { message("Using SSH tunnel & connection to Analytics-Store...") events_raw <- wmf::mysql_read(query, "log", con = con) -fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log", con = con) +if (params$serp_dwell_time) { + fulltext_from_auto <- wmf::mysql_read(query_autocomplete, "log", con = con) +} message("Closing connection...") wmf::mysql_close(con) } @@ -216,11 +217,16 @@ message("Saving raw events data...") save(events_raw, file = file.path("data", gsub(.Platform$file.sep, "", params$report_title), paste0("events_raw_", gsub("[^0-9]", "", Sys.time()), ".RData"))) - message("Saving SERP data that are from autocomplete...") - save(fulltext_from_auto, file = file.path("data", gsub(.Platform$file.sep, "", params$report_title), paste0("fulltext_from_auto_", gsub("[^0-9]", "", Sys.time()), ".RData"))) - + + if (params$serp_dwell_time) { +message("Saving SERP data that are from autocomplete...") +save(fulltext_from_auto, file = file.path("data", gsub(.Platform$file.sep, "", params$report_title), paste0("fulltext_from_auto_", gsub("[^0-9]", "", Sys.time()), ".RData"))) + } + cat("**Query for full-text events**:\n\n```SQL\n", query, "\n```\n") - cat("**Query for SERP from autocomplete**:\n\n```SQL\n", 
query_autocomplete, "\n```\n") + if (params$serp_dwell_time) { +cat("**Query for SERP from autocomplete**:\n\n```SQL\n", query_autocomplete, "\n```\n") + } } else { @@ -232,6 +238,10 @@ } ``` + +`r if (!is.null(params$test_description)) { params$test_description }` + +This test ran from `r format(lubridate::ymd_hms(min(events_raw$timestamp)), "%d %B %Y")` to `r format(lubridate::ymd_hms(max(events_raw$timestamp)), "%d %B %Y")` on `r ifelse(is.null(params$wiki), "all wikis", paste(params$wiki, collapse = ", "))`. There were `r length(params$test_group_names)` test groups: `r paste(params$test_group_names, collapse = ", ")`. This report includes `r paste(params$event_source, collapse = ", ")` searches. Refer to Phabricator ticket [`r params$phab_ticket`](`r
[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Bug fixes
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/383960 ) Change subject: Bug fixes .. Bug fixes Change-Id: Ifa99d8f6796a091124a0c902b8d2e370a9ec5b13 --- M report.Rmd 1 file changed, 21 insertions(+), 19 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/report.Rmd b/report.Rmd index ba84ad6..50c315b 100644 --- a/report.Rmd +++ b/report.Rmd @@ -94,6 +94,8 @@ ) }) source("functions.R") +# Take all R colors from graphical devices (with grey removed) +large_color_palette = grDevices::colors()[grep('gr(a|e)y', grDevices::colors(), invert = T)] ``` `r if (!is.null(params$test_description)) { params$test_description }` @@ -514,7 +516,7 @@ ```{r event_count_wiki, echo=FALSE, eval=(n_wiki > 1), fig.height=(5 * n_wiki)} event_count_function(by_wiki = TRUE) + theme_facet() + - facet_wrap(~ wiki, nrow = n_wiki, scales = "free_y") + facet_wrap(~ wiki, ncol = 1, scales = "free_y") ``` ```{r event_after_click_all, echo=FALSE} @@ -529,10 +531,10 @@ event_after_click_function() + theme_min() ``` -```{r event_after_click_wiki, echo=FALSE, eval=(n_wiki > 1), fig.height=(5 * n_wiki)} +```{r event_after_click_wiki, echo=FALSE, eval=(n_wiki > 1), fig.height=(5 * ceiling(n_wiki / 2))} event_after_click_function(by_wiki = TRUE) + theme_facet() + - facet_wrap(~ wiki, nrow = n_wiki, scales = "free_y") + facet_wrap(~ wiki, ncol = 2, scales = "free_y") ``` Searches @@ -559,7 +561,7 @@ knitr::kable() ``` -```{r daily_searches, echo=FALSE} +```{r daily_searches, echo=FALSE, fig.height=(4 * n_wiki)} searches %>% group_by(group, wiki, date) %>% summarize(`All Searches` = n(), `Searches with Results` = sum(`got same-wiki results`), `Searches with Clicks` = sum(`same-wiki clickthrough`)) %>% @@ -583,7 +585,7 @@ group_by(!!! 
rlang::syms(c("group", "results", switch(by_wiki, "wiki", NULL)))) %>% summarize(searches = length(unique(serp_id[!is.na(serp_id)]))) %>% bar_chart(x = "results", y = "searches", x_lab = "Number of same-wiki results returned", - y_lab = "Number of searches", title = expression(paste("Number of searches with ", italic("n"), " same-wiki result returned, by test group", switch(by_wiki, "and wiki", NULL)))) + y_lab = "Number of searches", title = paste("Number of searches with n same-wiki result returned, by test group", switch(by_wiki, "and wiki", NULL))) } n_results_summary_function() + theme_min() ``` @@ -609,7 +611,7 @@ group_by(!!! rlang::syms(c("group", "offset", switch(by_wiki, "wiki", NULL)))) %>% tally %>% bar_chart(x = "offset", y = "n", x_lab = "Offset", y_lab = "Number of SERPs", - title = expression(paste("Number of SERPs with ", italic("n"), " offset results, by test group", switch(by_wiki, "and wiki", NULL))), + title = paste("Number of SERPs with n offset results, by test group", switch(by_wiki, "and wiki", NULL)), caption = "This can be regarded as a proxy for users visiting additional pages of their search results.") + scale_x_discrete(limits = c("No offset (page 1)", Pluralize(c(20, 40, 60, 80), "result"), "100+ results")) } @@ -643,14 +645,15 @@ tally %>% mutate(prop = paste0(scales::percent_format()(n/sum(n)), " (", n, ")")) %>% select(-n) %>% -tidyr::spread(group, prop) +tidyr::spread(group, prop) %>% +ungroup } get_bayes_factor <- function(data) { BF <- data %>% tally %>% tidyr::spread(group, n) %>% ungroup %>% -select(params$test_group_names) %>% +select(dplyr::one_of(params$test_group_names)) %>% as.matrix() %>% # see http://bayesfactorpcl.r-forge.r-project.org/#ctables for more info BayesFactor::contingencyTableBF(sampleType = "indepMulti", fixedMargin = "cols") @@ -808,7 +811,7 @@ iwclick_position_function() + theme_min() ``` -```{r iwclick_position_wiki, eval=("iwclick" %in% events$event & n_wiki > 1), echo=FALSE, fig.height=(5 * n_wiki)} +```{r 
iwclick_position_wiki, eval=("iwclick" %in% events$event & n_wiki > 1), echo=FALSE, fig.height=(4 * n_wiki)} iwclick_position_function(by_wiki = TRUE) + facet_wrap(~ wiki, nrow = n_wiki, scales = "free_y") + theme_facet() @@ -1044,7 +1047,7 @@ theme_facet() ``` -```{r engagement_OR_wiki, eval=(n_wiki > 1), echo=FALSE, results='asis', include=TRUE} +```{r engagement_OR_wiki, eval=(n_wiki > 1), echo=FALSE, fig.width=11, fig.height=10, results='asis', include=TRUE} # TODO: duplicated code engagement_OR_all control_group <- grep("control", params$`test_group_names`, value = TRUE) test_group <- setdiff(params$`test_group_names`, control_group) @@ -1063,17 +1066,16 @@ labels = c("Pr[Control Engaging]", "Pr[Test Engaging]", "Pr[Test] - Pr[Control]", "Relative Risk", "Odds Ratio") )) %>% ggplot(aes(x = 1, y = estimate, ymin =
[MediaWiki-commits] [Gerrit] wikimedia...autoreporter[master]: Bug fixes
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/383960 ) Change subject: Bug fixes .. Bug fixes Change-Id: Ifa99d8f6796a091124a0c902b8d2e370a9ec5b13 --- M report.Rmd 1 file changed, 21 insertions(+), 19 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/autoreporter refs/changes/60/383960/1 diff --git a/report.Rmd b/report.Rmd index ba84ad6..50c315b 100644 --- a/report.Rmd +++ b/report.Rmd @@ -94,6 +94,8 @@ ) }) source("functions.R") +# Take all R colors from graphical devices (with grey removed) +large_color_palette = grDevices::colors()[grep('gr(a|e)y', grDevices::colors(), invert = T)] ``` `r if (!is.null(params$test_description)) { params$test_description }` @@ -514,7 +516,7 @@ ```{r event_count_wiki, echo=FALSE, eval=(n_wiki > 1), fig.height=(5 * n_wiki)} event_count_function(by_wiki = TRUE) + theme_facet() + - facet_wrap(~ wiki, nrow = n_wiki, scales = "free_y") + facet_wrap(~ wiki, ncol = 1, scales = "free_y") ``` ```{r event_after_click_all, echo=FALSE} @@ -529,10 +531,10 @@ event_after_click_function() + theme_min() ``` -```{r event_after_click_wiki, echo=FALSE, eval=(n_wiki > 1), fig.height=(5 * n_wiki)} +```{r event_after_click_wiki, echo=FALSE, eval=(n_wiki > 1), fig.height=(5 * ceiling(n_wiki / 2))} event_after_click_function(by_wiki = TRUE) + theme_facet() + - facet_wrap(~ wiki, nrow = n_wiki, scales = "free_y") + facet_wrap(~ wiki, ncol = 2, scales = "free_y") ``` Searches @@ -559,7 +561,7 @@ knitr::kable() ``` -```{r daily_searches, echo=FALSE} +```{r daily_searches, echo=FALSE, fig.height=(4 * n_wiki)} searches %>% group_by(group, wiki, date) %>% summarize(`All Searches` = n(), `Searches with Results` = sum(`got same-wiki results`), `Searches with Clicks` = sum(`same-wiki clickthrough`)) %>% @@ -583,7 +585,7 @@ group_by(!!! 
rlang::syms(c("group", "results", switch(by_wiki, "wiki", NULL)))) %>% summarize(searches = length(unique(serp_id[!is.na(serp_id)]))) %>% bar_chart(x = "results", y = "searches", x_lab = "Number of same-wiki results returned", - y_lab = "Number of searches", title = expression(paste("Number of searches with ", italic("n"), " same-wiki result returned, by test group", switch(by_wiki, "and wiki", NULL)))) + y_lab = "Number of searches", title = paste("Number of searches with n same-wiki result returned, by test group", switch(by_wiki, "and wiki", NULL))) } n_results_summary_function() + theme_min() ``` @@ -609,7 +611,7 @@ group_by(!!! rlang::syms(c("group", "offset", switch(by_wiki, "wiki", NULL)))) %>% tally %>% bar_chart(x = "offset", y = "n", x_lab = "Offset", y_lab = "Number of SERPs", - title = expression(paste("Number of SERPs with ", italic("n"), " offset results, by test group", switch(by_wiki, "and wiki", NULL))), + title = paste("Number of SERPs with n offset results, by test group", switch(by_wiki, "and wiki", NULL)), caption = "This can be regarded as a proxy for users visiting additional pages of their search results.") + scale_x_discrete(limits = c("No offset (page 1)", Pluralize(c(20, 40, 60, 80), "result"), "100+ results")) } @@ -643,14 +645,15 @@ tally %>% mutate(prop = paste0(scales::percent_format()(n/sum(n)), " (", n, ")")) %>% select(-n) %>% -tidyr::spread(group, prop) +tidyr::spread(group, prop) %>% +ungroup } get_bayes_factor <- function(data) { BF <- data %>% tally %>% tidyr::spread(group, n) %>% ungroup %>% -select(params$test_group_names) %>% +select(dplyr::one_of(params$test_group_names)) %>% as.matrix() %>% # see http://bayesfactorpcl.r-forge.r-project.org/#ctables for more info BayesFactor::contingencyTableBF(sampleType = "indepMulti", fixedMargin = "cols") @@ -808,7 +811,7 @@ iwclick_position_function() + theme_min() ``` -```{r iwclick_position_wiki, eval=("iwclick" %in% events$event & n_wiki > 1), echo=FALSE, fig.height=(5 * n_wiki)} +```{r 
iwclick_position_wiki, eval=("iwclick" %in% events$event & n_wiki > 1), echo=FALSE, fig.height=(4 * n_wiki)} iwclick_position_function(by_wiki = TRUE) + facet_wrap(~ wiki, nrow = n_wiki, scales = "free_y") + theme_facet() @@ -1044,7 +1047,7 @@ theme_facet() ``` -```{r engagement_OR_wiki, eval=(n_wiki > 1), echo=FALSE, results='asis', include=TRUE} +```{r engagement_OR_wiki, eval=(n_wiki > 1), echo=FALSE, fig.width=11, fig.height=10, results='asis', include=TRUE} # TODO: duplicated code engagement_OR_all control_group <- grep("control", params$`test_group_names`, value = TRUE) test_group <- setdiff(params$`test_group_names`, control_group) @@ -1063,17 +1066,16 @@ labels = c("Pr[Control Engaging]", "Pr[Test Engaging]", "Pr[Test] - Pr[Control]", "Relative Risk", "Odds Ratio") )) %>%
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: pageviews that are search results pages
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/382320 )

Change subject: pageviews that are search results pages

pageviews that are search results pages

In T176464#3636190, @Jdlrobson mentioned that on some browsers, when you clicked on the search icon, it will take you to a blank Special:Search page and let you start from there. Therefore, we should exclude these blank SRP from our counts.

Change-Id: If4aef7521a3268da85e7a3498cce1b33a2ee43a4
---
M modules/metrics/search/search_result_pages
M modules/metrics/search/sister_search_traffic
2 files changed, 10 insertions(+), 14 deletions(-)

  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/20/382320/1

diff --git a/modules/metrics/search/search_result_pages b/modules/metrics/search/search_result_pages
index 348de5f..5908f4e 100755
--- a/modules/metrics/search/search_result_pages
+++ b/modules/metrics/search/search_result_pages
@@ -26,13 +26,11 @@
     AND page_id IS NULL
     AND (
       uri_path = '/wiki/Special:Search'
-      OR (
-        uri_path = '/w/index.php'
-        AND (
-          LENGTH(PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'search')) > 0
-          OR LENGTH(PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'searchToken')) > 0
-        )
-      )
+      OR uri_path = '/w/index.php'
+    )
+    AND (
+      LENGTH(PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'search')) > 0
+      OR LENGTH(PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'searchToken')) > 0
     )
   ) AS serp
 GROUP BY date, access_method, agent_type;
diff --git a/modules/metrics/search/sister_search_traffic b/modules/metrics/search/sister_search_traffic
index 0e5b7c6..3e40bc0 100755
--- a/modules/metrics/search/sister_search_traffic
+++ b/modules/metrics/search/sister_search_traffic
@@ -23,13 +23,11 @@
       page_id IS NULL
       AND (
         uri_path = '/wiki/Special:Search'
-        OR (
-          uri_path = '/w/index.php'
-          AND (
-            PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'search') IS NOT NULL
-            OR PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'searchToken') IS NOT NULL
-          )
-        )
+        OR uri_path = '/w/index.php'
+      )
+      AND (
+        LENGTH(PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'search')) > 0
+        OR LENGTH(PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'searchToken')) > 0
       )
     ) AS is_serp
   FROM webrequest

-- 
To view, visit https://gerrit.wikimedia.org/r/382320
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If4aef7521a3268da85e7a3498cce1b33a2ee43a4
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
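
The revised filter in the change above can be approximated in base R to show why blank Special:Search pages no longer count. This is only a sketch under stated assumptions: `get_query_param` and `is_serp` are hypothetical helpers, the query string is parsed naively (no URL decoding), and the `page_id IS NULL` condition from the SQL is left out.

```r
# Hypothetical helper: return the value of `key` in a raw query string,
# or NA when the parameter is absent (roughly what PARSE_URL(..., 'QUERY', key) yields).
get_query_param <- function(uri_query, key) {
  query <- sub("^\\?", "", uri_query)
  pairs <- strsplit(query, "&", fixed = TRUE)[[1]]
  hit <- pairs[startsWith(pairs, paste0(key, "="))]
  if (length(hit) == 0) return(NA_character_)
  substring(hit[1], nchar(key) + 2)
}

# TRUE only when the value exists and is not the empty string (LENGTH(...) > 0).
non_empty <- function(x) !is.na(x) && nzchar(x)

# After the change, both search paths require a non-empty `search` or `searchToken`.
is_serp <- function(uri_path, uri_query) {
  on_search_path <- uri_path %in% c("/wiki/Special:Search", "/w/index.php")
  has_query <- non_empty(get_query_param(uri_query, "search")) ||
    non_empty(get_query_param(uri_query, "searchToken"))
  on_search_path && has_query
}

print(is_serp("/wiki/Special:Search", ""))             # blank search page
print(is_serp("/wiki/Special:Search", "?search=cats")) # page with a query
```

Under the old condition, any request to `/wiki/Special:Search` counted regardless of its query string; in this sketch the path test and the parameter test are separate conjuncts, mirroring the new `AND ( ... )` block.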
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Count the number of user session tokens by volume for mobile...
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/381508 ) Change subject: Count the number of user session tokens by volume for mobile web search .. Count the number of user session tokens by volume for mobile web search Bug: T176811 Change-Id: I9ce01d5c6ffcce6ddb6e4fe35281d41c39f9f9d6 --- M modules/mobile_web/events.R M tab_documentation/mobile_events.md M ui.R M utils.R 4 files changed, 45 insertions(+), 13 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/08/381508/1 diff --git a/modules/mobile_web/events.R b/modules/mobile_web/events.R index 6f326c6..3e0125a 100644 --- a/modules/mobile_web/events.R +++ b/modules/mobile_web/events.R @@ -1,6 +1,15 @@ +output$mobile_event_user_session <- renderValueBox( + valueBox( +value = mobile_session_mean["Total user sessions"], +subtitle = "User sessions per day*", +icon = icon("search"), +color = "green" + ) +) + output$mobile_event_searches <- renderValueBox( valueBox( -value = mobile_dygraph_means["search sessions"], +value = mobile_dygraph_means["search start"], subtitle = "Search sessions per day*", icon = icon("search"), color = "green" @@ -30,5 +39,13 @@ polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_mobile_event)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Mobile search events, by day") %>% dyRangeSelector %>% -dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-03-29"), "H (new header)", labelLoc = "bottom") +}) + +output$mobile_session_plot <- renderDygraph({ + mobile_session %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_mobile_event)) %>% +polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Mobile user sessions, by volume") %>% +dyRangeSelector }) diff --git 
a/tab_documentation/mobile_events.md b/tab_documentation/mobile_events.md index e6859b9..c8a029a 100644 --- a/tab_documentation/mobile_events.md +++ b/tab_documentation/mobile_events.md @@ -1,13 +1,15 @@ -Mobile search +Mobile web search === -User actions that we track around search on the mobile website generally fall into three categories: +User actions that we track around prefix search on the mobile website generally fall into three categories: -1. The start of a user's search session; -2. The presentation of the user with a results page, and; -3. A user clicking through to an article in the results page. +1. **search start (aka search session)**: An API request is being made to retrieve search results whenever the user types enough characters to perform a search (3 or more). A search session is identified by searchSessionToken. For example, if a user types "Bara", then a new search session is started; if they then type "ck" (Barack), then a new search session is started; +2. **Result pages opened**: The API request has finished and the results have been rendered; +3. **clickthroughs**: A user clicking through to an article in the results page. -These three things are tracked via the [EventLogging 'MobileWebSearch' schema](https://meta.wikimedia.org/wiki/Schema:MobileWebSearch), and stored to a database. The results are then aggregated and anonymised, and presented on this page. For performance/privacy reasons we randomly sample what we store, so the actual numbers are a vast understatement of how many user actions our servers receive - what's more interesting is how they change over time. In the case of Mobile Web search, this sampling rate is *going* to be **0.1%**: it's currently turned off entirely but should be enabled soon. +When a user opens the search overlay, a **user session** start. We use a random generated userSessionToken to identify this search funnel. A user session can have multiple search sessions. 
We split user sessions into “low volume”, "medium volume" and “high-volume” sessions. A “high-volume” session is a user session whose search sessions are equal to or greater than the 90th percentile for the whole population on any particular day. A “low-volume” session is a user session whose search sessions are equal to or less than the 5th percentile. The rest are categorized as "medium-volume". + +We use the [EventLogging 'MobileWebSearch' schema](https://meta.wikimedia.org/wiki/Schema:MobileWebSearch) to track these activities, and stored to a database. Currently the schema tracks prefix search only. The results are then aggregated and anonymised, and presented on this page. For performance/privacy reasons we randomly sample what we store, so the actual numbers are a vast understatement of how many user actions our servers receive - what's more
[MediaWiki-commits] [Gerrit] wikimedia...wetzel[develop]: Add maplink & mapframe prevalence graphs and modularize
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/379150 ) Change subject: Add maplink & mapframe prevalence graphs and modularize .. Add maplink & mapframe prevalence graphs and modularize - Splits up server.R into modules (like Search & Portal dashboards) - Adds maplink & mapframe prevalence graphs - Overall prevalence - Language-project breakdown of prevalence Bug: T170022 Change-Id: If1f1efa619037ce8adea873c148f9a1f78376506 --- M CHANGELOG.md A modules/feature_usage.R A modules/geographic_breakdown.R A modules/kartographer/language-project_breakdown.R A modules/kartographer/overall_prevalence.R A modules/kartotherian.R M server.R A tab_documentation/overall_prevalence.md A tab_documentation/prevalence_langproj.md M tab_documentation/tiles_summary.md M ui.R M utils.R 12 files changed, 586 insertions(+), 160 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/CHANGELOG.md b/CHANGELOG.md index 208e2ab..f3e77ee 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,6 +1,10 @@ # Change Log (Patch Notes) All notable changes to this project will be documented in this file. 
+## 2017/09/18 +- Modularized the dashboard source code +- Added maplink & mapframe prevalence graphs ([T170022](https://phabricator.wikimedia.org/T170022)) + ## 2017/06/20 - Added licensing info ([T167930](https://phabricator.wikimedia.org/T167930)) diff --git a/modules/feature_usage.R b/modules/feature_usage.R new file mode 100644 index 000..ec2460e --- /dev/null +++ b/modules/feature_usage.R @@ -0,0 +1,55 @@ +output$users_per_platform <- renderDygraph({ + user_data %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_users_per_platform)) %>% +polloi::make_dygraph("Date", "Events", "Unique users by platform, by day") %>% +dyAxis("y", logscale = input$users_per_platform_logscale) %>% +dyLegend(labelsDiv = "users_per_platform_legend", show = "always") %>% +dyRangeSelector %>% +dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>% +dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +}) + +output$geohack_feature_usage <- renderDygraph({ + usage_data$GeoHack %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_geohack_feature_usage)) %>% +polloi::make_dygraph("Date", "Events", "Feature usage for GeoHack") %>% +dyRangeSelector %>% +dyAxis("y", logscale = input$geohack_feature_usage_logscale) %>% +dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>% +dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +}) + +output$wikiminiatlas_feature_usage <- renderDygraph({ + usage_data$WikiMiniAtlas %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_wikiminiatlas_feature_usage)) %>% +polloi::make_dygraph("Date", "Events", "Feature usage for WikiMiniAtlas") %>% +dyRangeSelector %>% +dyAxis("y", logscale = 
input$wikiminiatlas_feature_usage_logscale) %>% +dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>% +dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +}) + +output$wikivoyage_feature_usage <- renderDygraph({ + usage_data$Wikivoyage %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_wikivoyage_feature_usage)) %>% +polloi::make_dygraph("Date", "Events", "Feature usage for Wikivoyage") %>% +dyRangeSelector %>% +dyAxis("y", logscale = input$wikivoyage_feature_usage_logscale) %>% +dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>% +dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +}) + +output$wiwosm_feature_usage <- renderDygraph({ + usage_data$WIWOSM %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_wiwosm_feature_usage)) %>% +polloi::make_dygraph("Date", "Events", "Feature usage for WIWOSM") %>% +dyRangeSelector %>% +dyAxis("y", logscale = input$wiwosm_feature_usage_logscale) %>% +dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>% +dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +}) diff --git a/modules/geographic_breakdown.R b/modules/geographic_breakdown.R new file mode 100644 index
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Session counts by volume for mobile web search
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/381126 ) Change subject: Session counts by volume for mobile web search .. Session counts by volume for mobile web search Bug: T176811 Change-Id: I545a80a5f4214e3f170d6a104a48e6d30dddecc9 --- M docs/README.md M modules/metrics/search/config.yaml A modules/metrics/search/mobile_session_counts A modules/metrics/search/mobile_session_counts.R 4 files changed, 80 insertions(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/26/381126/1 diff --git a/docs/README.md b/docs/README.md index e5cc336..2053bcf 100644 --- a/docs/README.md +++ b/docs/README.md @@ -8,7 +8,7 @@ infrastructure. These datasets provide the metrics that are used by [Discovery's Dashboards](https://discovery.wmflabs.org/) -Last updated on 22 September 2017 +Last updated on 27 September 2017 Daily Metrics - @@ -204,6 +204,8 @@ after clickthrough; Number of sessions with at least a click and the number of sessions that return to search for different things after clickthrough. +- **mobile\_session\_counts.tsv**: Number of user sessions on mobile +web, broken down by high, medium and low volume. wdqs/ - diff --git a/modules/metrics/search/config.yaml b/modules/metrics/search/config.yaml index 2181168..4b3f099 100644 --- a/modules/metrics/search/config.yaml +++ b/modules/metrics/search/config.yaml @@ -233,3 +233,8 @@ granularity: days starts: 2017-04-01 type: script +mobile_session_counts: +description: Number of user sessions on mobile web, broken down by high, medium and low volume. 
+granularity: days +starts: 2017-04-01 +type: script diff --git a/modules/metrics/search/mobile_session_counts b/modules/metrics/search/mobile_session_counts new file mode 100755 index 000..e88dc7e --- /dev/null +++ b/modules/metrics/search/mobile_session_counts @@ -0,0 +1,3 @@ +#!/bin/bash + +Rscript modules/metrics/search/mobile_session_counts.R -d $1 diff --git a/modules/metrics/search/mobile_session_counts.R b/modules/metrics/search/mobile_session_counts.R new file mode 100644 index 000..89a3d10 --- /dev/null +++ b/modules/metrics/search/mobile_session_counts.R @@ -0,0 +1,69 @@ +#!/usr/bin/env Rscript + +source("config.R") +.libPaths(r_library) +suppressPackageStartupMessages(library("optparse")) + +option_list <- list( + make_option(c("-d", "--date"), default = NA, action = "store", type = "character") +) + +# Get command line options, if help option encountered print help and exit, +# otherwise if options not found on command line then set defaults: +opt <- parse_args(OptionParser(option_list = option_list)) + +if (is.na(opt$date)) { + quit(save = "no", status = 1) +} + +# Build query: +date_clause <- as.character(as.Date(opt$date), format = "LEFT(timestamp, 8) = '%Y%m%d'") + +query <-paste0("SELECT + DATE('", opt$date, "') AS date, + event_userSessionToken AS userSessionToken, + COUNT(DISTINCT event_searchSessionToken) AS n_search_session + FROM MobileWebSearch_12054448 + WHERE ", date_clause, " + GROUP BY date, event_userSessionToken;") + +# Fetch data from MySQL database: +results <- tryCatch( + suppressMessages(data.table::as.data.table(wmf::mysql_read(query, "log"))), + error = function(e) { +return(data.frame()) + } +) + +if (nrow(results) == 0) { + # Here we make the script output tab-separated + # column names, as required by Reportupdater: + output <- data.frame( +date = character(), +user_sessions = numeric(), +search_sessions = numeric(), +high_volume = numeric(), +medium_volume = numeric(), +low_volume = numeric(), +threshold_high = numeric(), 
+threshold_low = numeric() + ) +} else { + # Split session counts: + `90th percentile` <- floor(quantile(results$n_search_session, 0.9)) + `10th percentile` <- ceiling(quantile(results$n_search_session, 0.1)) + results$session_type <- dplyr::case_when( +results$n_search_session > `90th percentile` ~ "high_volume", +results$n_search_session < `10th percentile` ~ "low_volume", +TRUE ~ "medium_volume" + ) + output <- cbind(date = "20170901",#opt$date, + user_sessions = nrow(results), + search_sessions = sum(results$n_search_session, na.rm = TRUE), + tidyr::spread(results[, list(userSession = length(userSessionToken)), by = "session_type"], +session_type, userSession), + threshold_high = `90th percentile`, + threshold_low = `10th percentile`) +} + +write.table(output, file = "", append = FALSE, sep = "\t", row.names = FALSE, quote = FALSE) -- To view, visit https://gerrit.wikimedia.org/r/381126 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange
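The percentile split in `mobile_session_counts.R` can be tried on toy data with base R alone. The counts below are invented for illustration, and `ifelse()` stands in for `dplyr::case_when()`; the thresholds and strict inequalities follow the script as uploaded (floor of the 90th percentile, ceiling of the 10th). The dashboard documentation in the related rainbow change instead describes "equal to or greater than the 90th percentile" and "equal to or less than the 5th percentile", so the two may need reconciling.

```r
# Hypothetical number of distinct search sessions per user session.
n_search_session <- c(1, 1, 1, 2, 2, 3, 3, 4, 6, 12)

# Thresholds as computed in the script.
threshold_high <- floor(quantile(n_search_session, 0.9, names = FALSE))
threshold_low <- ceiling(quantile(n_search_session, 0.1, names = FALSE))

# Strict comparisons, as in the script's case_when().
session_type <- ifelse(
  n_search_session > threshold_high, "high_volume",
  ifelse(n_search_session < threshold_low, "low_volume", "medium_volume")
)

# Fixed factor levels keep an empty bucket visible as a zero count.
session_counts <- table(factor(
  session_type,
  levels = c("high_volume", "medium_volume", "low_volume")
))
```

With these ten values the thresholds are 6 and 1, so one session is high volume, nine are medium, and none are low: when many sessions tie at the minimum, a strict `<` against the lower threshold leaves the low-volume bucket empty. In the script, `tidyr::spread()` would then produce no `low_volume` column for that day, which may matter for the fixed column layout the script's own comment says Reportupdater requires. Separately, the uploaded patch set passes `date = "20170901"` to `cbind()` with `opt$date` commented out, which looks like debugging residue.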
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Sister search prevalence by language
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/379939 ) Change subject: Sister search prevalence by language .. Sister search prevalence by language Adds the percentage of searches where the sister project search results were shown to the user. Change-Id: I4c59f2e693570b92d63d66826ca23400fc90be61 --- M CHANGELOG.md A modules/sister_search/prevalence.R M server.R A tab_documentation/sister_search_prevalence.md M ui.R M utils.R 6 files changed, 111 insertions(+), 1 deletion(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/CHANGELOG.md b/CHANGELOG.md index 099e8a1..7ecd033 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,9 @@ All notable changes to this project will be documented in this file. +## 2017/09/25 +- Added sister project search result prevalence + ## 2017/08/30 - Added SRP visit times ([T170468](https://phabricator.wikimedia.org/T170468)) - Added [dygraph-based rolling periods](https://rstudio.github.io/dygraphs/gallery-roll-periods.html) to page visit times modules diff --git a/modules/sister_search/prevalence.R b/modules/sister_search/prevalence.R new file mode 100644 index 000..0bedfa8 --- /dev/null +++ b/modules/sister_search/prevalence.R @@ -0,0 +1,37 @@ +output$sister_search_prevalence_lang_container <- renderUI({ + languages_to_display <- sister_search_averages$language + names(languages_to_display) <- sprintf("%s (%.1f%%)", sister_search_averages$language, sister_search_averages$avg) + if (input$sister_search_prevalence_lang_order != "alphabet") { +languages_to_display <- languages_to_display[order( + sister_search_averages$avg, + decreasing = input$sister_search_prevalence_lang_order == "high2low" +)] + } + if (!is.null(input$language_selector)) { +selected_language <- input$language_selector + } else { +selected_language <- languages_to_display[1] + } + return(selectInput( +"sister_search_prevalence_lang_selector", "Language", +multiple = TRUE, selectize = FALSE, size = 19, 
+choices = languages_to_display, selected = selected_language + )) +}) + +output$sister_search_prevalence_plot <- renderDygraph({ + req(input$sister_search_prevalence_lang_selector) + sister_search_prevalence %>% +dplyr::filter(language %in% input$sister_search_prevalence_lang_selector) %>% +tidyr::spread(language, prevalence, fill = 0) %>% +polloi::reorder_columns() %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_sister_search_prevalence_plot)) %>% +polloi::make_dygraph("Date", "Prevalence (%)", "Wikipedia searches that showed sister project search results") %>% +dyLegend(show = "always", width = 400, labelsDiv = "sister_search_prevalence_plot_legend") %>% +dyAxis("y", + axisLabelFormatter = "function(x) { return x + '%'; }", + valueFormatter = "function(x) { return Math.round(x * 100)/100 + '%'; }" +) %>% +dyAxis("x", axisLabelFormatter = polloi::custom_axis_formatter) %>% +dyRangeSelector(fillColor = "", strokeColor = "") +}) diff --git a/server.R b/server.R index b91bcf9..21d45a2 100644 --- a/server.R +++ b/server.R @@ -66,6 +66,7 @@ source("modules/zero_results.R", local = TRUE) # Sister Search source("modules/sister_search/traffic.R", local = TRUE) + source("modules/sister_search/prevalence.R", local = TRUE) # Survival source("modules/page_visit_times.R", local = TRUE) # Language/Project Breakdown diff --git a/tab_documentation/sister_search_prevalence.md b/tab_documentation/sister_search_prevalence.md new file mode 100644 index 000..84b58fc --- /dev/null +++ b/tab_documentation/sister_search_prevalence.md @@ -0,0 +1,26 @@ +Sister project search results prevalence +=== +Sister project (cross-wiki) snippets is a feature that adds search results from sister projects of Wikipedia to a sidebar on the search engine results page (SERP). If a query results in matches from the sister projects, users will be shown snippets from Wiktionary, Wikisource, Wikiquote and/or other projects. 
See [T162276](https://phabricator.wikimedia.org/T162276) for more details. + +General trends +- +* English Wikipedia has the highest prevalence with 75% of searches showing sister project results on average, followed by Chinese (73%) and French (70%) Wikipedias. +* 38% of languages show the sister project results in at least 50% of the searches made. + +Notes, outages, and inaccuracies +- +* English Wikipedia has a different display than all the other languages due to community feedback. Specifically, it does not show results from Commons/multimedia, Wikinews, and Wikiversity. Refer to [T162276#3278689](https://phabricator.wikimedia.org/T162276#3278689) for more details. +* Languages without a lot of traffic also yield less (sampled) event logging data. In order to show
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Track sister search prevalence
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/379834 ) Change subject: Track sister search prevalence .. Track sister search prevalence Number of searches that have sister project results vs number of searches that do not, by language. Change-Id: I413d37930d959a212fa8fd7c1dfb35898a5f793f --- M CHANGELOG.md M docs/README.md M modules/metrics/search/config.yaml A modules/metrics/search/sister_search_prevalence.sql 4 files changed, 49 insertions(+), 3 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/CHANGELOG.md b/CHANGELOG.md index 8610ba9..f983f48 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,6 +1,18 @@ # Change Log (Patch Notes) All notable changes to this project will be documented in this file. +## 2017/09/22 +- Added sister project search results prevalence + +## 2017/09/21 +- Added new datasets in search and portal ([T172453](https://phabricator.wikimedia.org/T172453)): + - wikipedia portal pageview by device (desktop vs mobile) + - wikipedia portal clickthrough rate by device (desktop vs mobile) + - proportion of wikipedia portal visitors on mobile devices in US vs elsewhere + - pageviews from full-text search (desktop vs mobile) + - search return rate on desktop + - SERPs by access method + ## 2017/08/29 - Switched Hive queries to use the "nice" queue ([T156841](https://phabricator.wikimedia.org/T156841)). See [this section](https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Queries#Run_long_queries_in_a_screen_session_and_in_the_nice_queue) for additional details. diff --git a/docs/README.md b/docs/README.md index 88bec45..e5cc336 100644 --- a/docs/README.md +++ b/docs/README.md @@ -8,7 +8,7 @@ infrastructure. 
These datasets provide the metrics that are used by [Discovery's Dashboards](https://discovery.wmflabs.org/) -Last updated on 14 September 2017 +Last updated on 22 September 2017 Daily Metrics - @@ -186,6 +186,8 @@ Wikipedia search results pages; broken up by language, destination type (SERP vs not), and access method (desktop vs mobile web); exlcudes known automata +- **sister\_search\_prevalence.tsv**: Prevalence of sister search +results on Wikipedia search result pages; broken up by language - **srp\_survtime.tsv**: Estimates (via survival analysis) of how long Wikipedia searchers stay on full-text search results page after getting there from autocomplete search, split by English vs French @@ -193,10 +195,10 @@ - **pageviews\_from\_fulltext\_search.tsv**: Number of searches, pageviews and users to articles from full-text search, broken down by access method (desktop vs mobile web) and agent type (user vs -spider) +spider). - **search\_result\_pages.tsv**: Number of searches, search result pages and users, broken down by access method (desktop vs mobile -web) and agent type (user vs spider) +web) and agent type (user vs spider). 
- **desktop\_return\_rate.tsv**: Number of searches with at least a click and the number of searches that return to the same search page after clickthrough; Number of sessions with at least a click and the diff --git a/modules/metrics/search/config.yaml b/modules/metrics/search/config.yaml index 00c4524..2181168 100644 --- a/modules/metrics/search/config.yaml +++ b/modules/metrics/search/config.yaml @@ -204,6 +204,12 @@ starts: 2017-06-01 funnel: true type: script +sister_search_prevalence: +description: Prevalence of sister search results on Wikipedia search result pages; broken up by language +granularity: days +starts: 2017-07-01 +funnel: true +type: sql srp_survtime: description: Estimates (via survival analysis) of how long Wikipedia searchers stay on full-text search results page after getting there from autocomplete search, split by English vs French and Catalan vs other languages. granularity: days diff --git a/modules/metrics/search/sister_search_prevalence.sql b/modules/metrics/search/sister_search_prevalence.sql new file mode 100644 index 000..40ae915 --- /dev/null +++ b/modules/metrics/search/sister_search_prevalence.sql @@ -0,0 +1,26 @@ +SELECT + DATE('{from_timestamp}') AS date, wiki_id, + SUM(has_iw) AS has_sister_results, + SUM(IF(has_iw, 0, 1)) AS no_sister_results +FROM ( + SELECT DISTINCT +wiki_id, session_id, query_hash, has_iw + FROM ( +SELECT DISTINCT + wiki AS wiki_id, + event_uniqueId AS event_id, + event_searchSessionId AS session_id, + MD5(LOWER(TRIM(event_query))) AS query_hash, + INSTR(event_extraParams, '"iw":') > 0 AS has_iw -- sister project results shown +FROM TestSearchSatisfaction2_16909631 +WHERE timestamp >= '{from_timestamp}' AND timestamp < '{to_timestamp}' + AND event_source =
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Fix maplink/mapframe query
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/377807 ) Change subject: Fix maplink/mapframe query .. Fix maplink/mapframe query Bug: T170022 Change-Id: I1d70b09a54b47002a948f29f21e1ad843b87af55 --- M modules/metrics/maps/config.yaml M modules/metrics/maps/prevalence.R M modules/metrics/maps/prevalence.yaml 3 files changed, 10 insertions(+), 4 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/modules/metrics/maps/config.yaml b/modules/metrics/maps/config.yaml index f74d21a..f5b3b46 100644 --- a/modules/metrics/maps/config.yaml +++ b/modules/metrics/maps/config.yaml @@ -40,12 +40,12 @@ mapframe_prevalence: description: Proportion of articles on a wiki that have a mapframe granularity: days -starts: 2017-09-01 # this will need to be set to when patch goes live, we can't backfill this data +starts: 2017-09-14 # this will need to be set to when patch goes live, we can't backfill this data funnel: true type: script maplink_prevalence: description: Proportion of articles on a wiki that have a maplink granularity: days -starts: 2017-09-01 # this will need to be set to when patch goes live, we can't backfill this data +starts: 2017-09-14 # this will need to be set to when patch goes live, we can't backfill this data funnel: true type: script diff --git a/modules/metrics/maps/prevalence.R b/modules/metrics/maps/prevalence.R index 2d23ad9..c9fd596 100644 --- a/modules/metrics/maps/prevalence.R +++ b/modules/metrics/maps/prevalence.R @@ -35,14 +35,19 @@ SUM(COALESCE({type}s, 0)) AS total_{type}s FROM ( SELECT -page.page_id, +p.page_id, pp_value AS {type}s FROM ( SELECT pp_page, pp_value FROM page_props WHERE pp_propname = '{prop_name}' AND pp_value > 0 ) AS filtered_props - RIGHT JOIN page ON page.page_id = filtered_props.pp_page AND page.page_namespace = {ns} + RIGHT JOIN ( +SELECT page_id +FROM page +WHERE page_namespace = {ns} AND page_is_redirect = 0 + ) p + ON p.page_id = filtered_props.pp_page ) 
joined_tables;") return(query) } diff --git a/modules/metrics/maps/prevalence.yaml b/modules/metrics/maps/prevalence.yaml index c67d65a..f66d14e 100644 --- a/modules/metrics/maps/prevalence.yaml +++ b/modules/metrics/maps/prevalence.yaml @@ -16,6 +16,7 @@ - mediawikiwiki - metawiki - commonswiki +- uawikimedia # as of August 2017 wikivoyages: # enabled for all *except* the following: - hewikivoyage # https://phabricator.wikimedia.org/T170976#3471701 maplink: -- To view, visit https://gerrit.wikimedia.org/r/377807 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I1d70b09a54b47002a948f29f21e1ad843b87af55 Gerrit-PatchSet: 1 Gerrit-Project: wikimedia/discovery/golden Gerrit-Branch: master Gerrit-Owner: Bearloga Gerrit-Reviewer: Chelsyx
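The join rewrite in `prevalence.R` matters because of how an outer join treats conditions in its ON clause: with `RIGHT JOIN page ON ... AND page.page_namespace = {ns}`, rows of `page` that fail the namespace test are still kept, only without matching props. A minimal sketch in base R, with `merge(..., all.y = TRUE)` standing in for the RIGHT JOIN. The toy tables and the names `old`, `new`, and `articles` are illustrative assumptions, not part of the patch.

```r
# Hypothetical toy tables.
page <- data.frame(
  page_id = 1:5,
  page_namespace = c(0, 0, 0, 1, 0),
  page_is_redirect = c(0, 0, 1, 0, 0)
)
page_props <- data.frame(pp_page = c(1, 4), pp_value = c(2, 1))

# Old query: namespace test inside the ON clause of a RIGHT JOIN.
# Every page row survives; props are merely dropped for other namespaces.
old <- merge(page_props, page, by.x = "pp_page", by.y = "page_id", all.y = TRUE)
old$pp_value[old$page_namespace != 0] <- NA

# New query: restrict to non-redirect pages in the namespace first, then join.
articles <- page[
  page$page_namespace == 0 & page$page_is_redirect == 0,
  "page_id",
  drop = FALSE
]
new <- merge(page_props, articles, by.x = "pp_page", by.y = "page_id", all.y = TRUE)

old_total_pages <- nrow(old)
new_total_pages <- nrow(new)
pages_with_feature <- sum(!is.na(new$pp_value))
```

On this toy data the old form yields five rows (all pages, including the redirect and the page in namespace 1) and the new form yields three, with one page carrying the feature in both cases. If the outer query counts rows to form the denominator, which the visible part of the diff suggests but does not show, the old form would have understated prevalence.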
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Interpretation and general findings for API dashboards
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/378067 ) Change subject: Interpretation and general findings for API dashboards .. Interpretation and general findings for API dashboards Bug: T172452 Change-Id: If97bb9cd23ae93117d106012d69b8f6250a19ce9 --- M modules/api.R M modules/key_performance_metrics/api_usage.R M tab_documentation/fulltext_basic.md M tab_documentation/geo_basic.md M tab_documentation/kpi_api_usage.md M tab_documentation/language_basic.md M tab_documentation/morelike_basic.md M tab_documentation/open_basic.md M tab_documentation/prefix_basic.md M tab_documentation/referer_breakdown.md M ui.R 11 files changed, 322 insertions(+), 105 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/67/378067/1 diff --git a/modules/api.R b/modules/api.R index 790b29e..6cae3ad 100644 --- a/modules/api.R +++ b/modules/api.R @@ -1,9 +1,22 @@ output$cirrus_aggregate <- renderDygraph({ - split_dataset$`full-text via API` %>% + temp <- split_dataset$`full-text via API` %>% tidyr::spread(referrer, calls) %>% -polloi::reorder_columns() %>% +polloi::reorder_columns() + if (input$fulltext_search_prop) { +temp <- cbind(temp[, "date"], purrr::map_df(temp[, -c(1, 2)], function(x) round(100 * x / temp$All, 2))) %>% + dplyr::filter(date >= "2017-06-29") + } + temp %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_fulltext_search)) %>% -polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Full-text search API usage by referrer", legend_name = "Searches") %>% +polloi::make_dygraph(xlab = "Date", + ylab = dplyr::case_when( + input$fulltext_search_prop ~ "API Calls Share (%)", + input$fulltext_search_log_scale ~ "Calls (log10 scale)", + TRUE ~ "API Calls" + ), + title = "Daily Full-text search via API usage by referrer", + legend_name = "API Calls", + logscale = input$fulltext_search_log_scale) %>% dyLegend(labelsDiv = 
"cirrus_aggregate_legend", width = 600) %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% @@ -11,21 +24,47 @@ }) output$morelike_aggregate <- renderDygraph({ - split_dataset$`morelike via API` %>% + temp <- split_dataset$`morelike via API` %>% tidyr::spread(referrer, calls) %>% -polloi::reorder_columns() %>% +polloi::reorder_columns() + if (input$morelike_search_prop) { +temp <- cbind(temp[, "date"], purrr::map_df(temp[, -c(1, 2)], function(x) round(100 * x / temp$All, 2))) %>% + dplyr::filter(date >= "2017-06-29") + } + temp %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_morelike_search)) %>% -polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Morelike search API usage by referrer", legend_name = "Searches") %>% +polloi::make_dygraph(xlab = "Date", + ylab = dplyr::case_when( + input$morelike_search_prop ~ "API Calls Share (%)", + input$morelike_search_log_scale ~ "Calls (log10 scale)", + TRUE ~ "API Calls" + ), + title = "Daily Morelike search API usage by referrer", + legend_name = "API Calls", + logscale = input$morelike_search_log_scale) %>% dyLegend(labelsDiv = "morelike_aggregate_legend", width = 600) %>% dyRangeSelector }) output$open_aggregate <- renderDygraph({ - split_dataset$open %>% + temp <- split_dataset$open %>% tidyr::spread(referrer, calls) %>% -polloi::reorder_columns() %>% +polloi::reorder_columns() + if (input$open_search_prop) { +temp <- cbind(temp[, "date"], purrr::map_df(temp[, -c(1, 2)], function(x) round(100 * x / temp$All, 2))) %>% + dplyr::filter(date >= "2017-06-29") + } + temp %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_open_search)) %>% -polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily OpenSearch API usage by referrer", legend_name = "Searches") %>% +polloi::make_dygraph(xlab = "Date", + ylab = dplyr::case_when( + input$open_search_prop ~ "API 
Calls Share (%)", + input$open_search_log_scale ~ "Calls (log10 scale)", + TRUE ~ "API Calls" + ), + title = "Daily OpenSearch API usage by referrer", + legend_name = "API Calls", +
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Add new datasets in search and portal
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/375408 ) Change subject: Add new datasets in search and portal .. Add new datasets in search and portal * wikipedia portal pageview by device (desktop vs mobile) * wikipedia portal clickthrough rate by device (desktop vs mobile) * proportion of wikipedia portal visitors on mobile devices in US vs elsewhere * pageviews from full-text search (desktop vs mobile) * search return rate on desktop * SERPs by access method Bug: T172453 Change-Id: I4615f4070ced26ce886b49be7393115953320cfe --- M docs/README.md A modules/metrics/portal/clickthrough_by_device M modules/metrics/portal/config.yaml M modules/metrics/portal/engagement.R A modules/metrics/portal/mobile_use_us_elsewhere A modules/metrics/portal/pageviews_by_device M modules/metrics/search/config.yaml A modules/metrics/search/desktop_return_rate A modules/metrics/search/desktop_return_rate.R A modules/metrics/search/pageviews_from_fulltext_search A modules/metrics/search/search_result_pages 11 files changed, 378 insertions(+), 3 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/08/375408/1 diff --git a/docs/README.md b/docs/README.md index 1b2abe6..bcd72e1 100644 --- a/docs/README.md +++ b/docs/README.md @@ -8,7 +8,7 @@ infrastructure. These datasets provide the metrics that are used by [Discovery's Dashboards](https://discovery.wmflabs.org/) -Last updated on 30 August 2017 +Last updated on 01 September 2017 Daily Metrics - @@ -57,14 +57,24 @@ top 10 countries - **all\_country\_data.tsv**: Sampled traffic to Wikipedia.org Portal, broken down by country +- **all\_country\_data\_history.tsv**: Sampled traffic to +Wikipedia.org Portal, broken down by country. Historical data store. 
- **app\_link\_clicks.tsv**: Clicks to Wikipedia mobile apps and list of apps - **last\_action\_country.tsv**: Last action performed on Wikipedia.org Portal per user session +- **last\_action\_country\_history.tsv**: Last action performed on +Wikipedia.org Portal per user session. Historical data store. - **most\_common\_country.tsv**: Most common action performed on Wikipedia.org Portal per user session, broken down by country +- **most\_common\_country\_history.tsv**: Most common action performed +on Wikipedia.org Portal per user session, broken down by country. +Historical data store. - **first\_visits\_country.tsv**: Action performed on Wikipedia.org Portal on each user's initial visit, broken down by country +- **first\_visits\_country\_history.tsv**: Action performed on +Wikipedia.org Portal on each user's initial visit, broken down by +country. Historical data store. - **clickthrough\_rate.tsv**: Last action (no action vs clickthrough) by Wikipedia.org Portal visitors - **clickthrough\_sisterprojects.tsv**: Clicks to Wikimedia projects @@ -76,6 +86,12 @@ section - **most\_common\_per\_visit.tsv**: Most common action performed on Wikipedia.org Portal per user session +- **pageviews\_by\_device.tsv**: Pageviews broken down by device +(desktop vs mobile) +- **clickthrough\_by\_device.tsv**: Clickthroughs from Wikipedia.org +Portal, broken down by device (desktop vs mobile) +- **mobile\_use\_us\_elsewhere.tsv**: Number of Wikipedia.org Portal +visitors on mobile devices in U.S. vs everywhere else search/ --- @@ -85,6 +101,9 @@ - **app\_event\_counts\_langproj\_breakdown.tsv**: Clicks and other events by users searching on Android and iOS apps broken down by language +- **app\_event\_counts\_langproj\_breakdown\_history.tsv**: Clicks and +other events by users searching on Android and iOS apps broken down +by language. Historical data store. 
- **app\_load\_times.tsv**: User-perceived load times when searching on Android and iOS apps - **invoke\_source\_counts.tsv**: How the user initiated their search @@ -96,6 +115,9 @@ - **mobile\_event\_counts\_langproj\_breakdown.tsv**: Clicks and other events by users searching on mobile web broken down by language-project pairs +- **mobile\_event\_counts\_langproj\_breakdown\_history.tsv**: Clicks +and other events by users searching on mobile web broken down by +language-project pairs. Historical data store. - **mobile\_load\_times.tsv**: User-perceived load times when searching on mobile web - **desktop\_event\_counts.tsv**: Clicks and other events by users @@ -103,6 +125,9 @@ - **desktop\_event\_counts\_langproj\_breakdown.tsv**: Clicks and other events by users searching on desktop broken down by language-project pairs +- **desktop\_event\_counts\_langproj\_breakdown\_history.tsv**: Clicks +and other events by users searching on desktop broken down by +language-project pairs. Historical data store. -
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: SRP visit times label fixes
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/375091 ) Change subject: SRP visit times label fixes .. SRP visit times label fixes Also added data checks & fixed a bug introduced with a new version of tidyr (at least I think that's how the issue started) Change-Id: Ia3f4e6b030858b382c0a7c336d6759d022ebf14e --- M modules/page_visit_times.R M server.R M tab_documentation/survival.md M ui.R M utils.R 5 files changed, 56 insertions(+), 38 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/modules/page_visit_times.R b/modules/page_visit_times.R index 1321dd6..df1fbe9 100644 --- a/modules/page_visit_times.R +++ b/modules/page_visit_times.R @@ -22,7 +22,7 @@ tidyr::spread(label, time) %>% polloi::reorder_columns() %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_srp_ld_plot), rename = FALSE) %>% -polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at N% users leave the search results page") %>% +polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at which N% users leave the search results page") %>% dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = polloi::custom_axis_formatter, axisLabelWidth = 100, pixelsPerLabel = 80) %>% dyRoller(rollPeriod = input$rolling_srp_ld_plot) %>% diff --git a/server.R b/server.R index 752f5ba..b91bcf9 100644 --- a/server.R +++ b/server.R @@ -80,18 +80,28 @@ polloi::check_past_week(mobile_load_data, "Mobile Web load times"), polloi::check_yesterday(android_dygraph_set, "Android events"), polloi::check_past_week(android_load_data, "Android load times"), + polloi::check_yesterday(position_prop, "clicked result positions"), + polloi::check_past_week(position_prop, "clicked result positions"), + polloi::check_yesterday(source_prop, "source of search on Android"), + polloi::check_past_week(source_prop, "source of search on Android"), polloi::check_yesterday(ios_dygraph_set, "iOS events"), 
polloi::check_past_week(ios_load_data, "iOS load times"), - polloi::check_yesterday(dplyr::bind_rows(split_dataset), "API usage data"), - polloi::check_past_week(dplyr::bind_rows(split_dataset), "API usage data"), + polloi::check_yesterday(dplyr::bind_rows(split_dataset, .id = "api"), "API usage data"), + polloi::check_past_week(dplyr::bind_rows(split_dataset, .id = "api"), "API usage data"), polloi::check_yesterday(failure_data_with_automata, "zero results data"), polloi::check_past_week(failure_data_with_automata, "zero results data"), polloi::check_yesterday(suggestion_with_automata, "suggestions data"), polloi::check_past_week(suggestion_with_automata, "suggestions data"), polloi::check_yesterday(augmented_clickthroughs, "engagement % data"), polloi::check_past_week(augmented_clickthroughs, "engagement % data"), - polloi::check_yesterday(user_page_visit_dataset, "survival times"), - polloi::check_past_week(user_page_visit_dataset, "survival times")) + polloi::check_yesterday(paulscore_fulltext, "full-text PaulScores"), + polloi::check_past_week(paulscore_fulltext, "full-text PaulScores"), + polloi::check_yesterday(sister_search_traffic, "sister search traffic"), + polloi::check_past_week(sister_search_traffic, "sister search traffic"), + polloi::check_yesterday(user_page_visit_dataset, "page survival times"), + polloi::check_past_week(user_page_visit_dataset, "page survival times"), + polloi::check_yesterday(serp_page_visit_dataset, "serp survival times"), + polloi::check_past_week(serp_page_visit_dataset, "serp survival times")) notifications <- notifications[!vapply(notifications, is.null, FALSE)] return(dropdownMenu(type = "notifications", .list = notifications)) }) diff --git a/tab_documentation/survival.md b/tab_documentation/survival.md index e066ad5..ae7ab59 100644 --- a/tab_documentation/survival.md +++ b/tab_documentation/survival.md @@ -1,15 +1,15 @@ -Automated survival analysis: page visit times +How long searchers stay on the visited search results 
=== When someone is randomly selected for search satisfaction tracking (using our [TSS2 schema](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2)), we use a check-in system and survival analysis to estimate how long users stay on visited pages. To summarize the results on a daily basis, we record a set of statistics based on a measure formally known as "[median lethal dose](https://en.wikipedia.org/wiki/Median_lethal_dose)". -This graph shows the length of time that must pass before N% of the users leave the page they visited. When the number goes up, we can infer that users are staying on the pages longer. In general, it appears it takes 15s to
[MediaWiki-commits] [Gerrit] wikimedia...wmf[master]: Add function 'null2na'
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/375078 ) Change subject: Add function 'null2na' .. Add function 'null2na' Change-Id: Id149be60ae41c9f09d81b91aaf6a1ee9ecc8db3b --- M DESCRIPTION M NAMESPACE M R/wmf.R M man/FiveThirtyNine.Rd M man/build_query.Rd M man/date_clause.Rd M man/get_logfile.Rd M man/global_query.Rd M man/mysql.Rd A man/null2na.Rd M man/query_hive.Rd M man/read_sampled_log.Rd M man/rewrite_conditional.Rd M man/sample_size_effect.Rd M man/sample_size_odds.Rd M man/set_proxies.Rd M man/timeconverters.Rd M man/write_conditional.Rd 18 files changed, 49 insertions(+), 24 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/wmf refs/changes/78/375078/1 diff --git a/DESCRIPTION b/DESCRIPTION index f739244..b8fe3d1 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -21,4 +21,4 @@ projects=Discovery-Analysis Suggests: testthat -RoxygenNote: 5.0.1 +RoxygenNote: 6.0.1 diff --git a/NAMESPACE b/NAMESPACE index 4a137c4..6a737d2 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -12,6 +12,7 @@ export(mysql_exists) export(mysql_read) export(mysql_write) +export(null2na) export(query_hive) export(read_sampled_log) export(rewrite_conditional) diff --git a/R/wmf.R b/R/wmf.R index e69de29..91f3e7e 100644 --- a/R/wmf.R +++ b/R/wmf.R @@ -0,0 +1,18 @@ +#'@title Turns Null Into Character "NA" +#'@description The function turns NULL in a list into character "NA". +#' +#'@param x A list +#' +#'@return If any element from the input list is NULL, they will be turned into character +#' "NA". Otherwise, return the original list. 
+#' +#'@export +null2na <- function(x) { + return(lapply(x, function(y) { +if (is.null(y)) { + return(as.character(NA)) +} else { + return(y) +} + })) +} diff --git a/man/FiveThirtyNine.Rd b/man/FiveThirtyNine.Rd index fb66530..7b129ce 100644 --- a/man/FiveThirtyNine.Rd +++ b/man/FiveThirtyNine.Rd @@ -20,4 +20,3 @@ allow for long titles) back in and does a small amount of reduction of the overall plot size to avoid an absolute ton of extraneous spacing. } - diff --git a/man/build_query.Rd b/man/build_query.Rd index 8f73d7a..649388b 100644 --- a/man/build_query.Rd +++ b/man/build_query.Rd @@ -19,4 +19,3 @@ constructs a MySQL query with a conditional around date. This is aimed at eventlogging, where the date/time is always "timestamp". } - diff --git a/man/date_clause.Rd b/man/date_clause.Rd index b2736b6..59a2faf 100644 --- a/man/date_clause.Rd +++ b/man/date_clause.Rd @@ -17,4 +17,3 @@ what it says on the tin; generates a "WHERE year = foo AND month = bar" using lubridate that can then be combined with other elements to form a Hive query. } - diff --git a/man/get_logfile.Rd b/man/get_logfile.Rd index 00e28d9..f8f88a2 100644 --- a/man/get_logfile.Rd +++ b/man/get_logfile.Rd @@ -24,4 +24,3 @@ sampled log files; it can be used to retrieve a particular date range of files through the "earliest" and "latest" arguments. } - diff --git a/man/global_query.Rd b/man/global_query.Rd index f07525c..277b903 100644 --- a/man/global_query.Rd +++ b/man/global_query.Rd @@ -15,12 +15,11 @@ \code{global_query} is a simple wrapper around the mysql queries that allows a useR to send a query to all production dbs on analytics-store.eqiad.wmnet, joining the results from each query into a single object. } -\author{ -Oliver Keyes-} \seealso{ \code{\link{mysql_read}} for querying an individual db, \code{\link{mw_strptime}} for converting MediaWiki timestamps into POSIXlt timestamps, or \code{\link{hive_query}} for accessing the Hive datastore. 
} - +\author{ +Oliver Keyes +} diff --git a/man/mysql.Rd b/man/mysql.Rd index de36b60..a7913dc 100644 --- a/man/mysql.Rd +++ b/man/mysql.Rd @@ -2,12 +2,12 @@ % Please edit documentation in R/mysql.R \name{mysql} \alias{mysql} -\alias{mysql_close} \alias{mysql_connect} -\alias{mysql_disconnect} -\alias{mysql_exists} \alias{mysql_read} +\alias{mysql_exists} \alias{mysql_write} +\alias{mysql_close} +\alias{mysql_disconnect} \title{Work with MySQL databases} \usage{ mysql_connect(database, default_file = NULL) @@ -43,4 +43,3 @@ \seealso{ \code{\link{hive_query}} or \code{\link{global_query}} } - diff --git a/man/null2na.Rd b/man/null2na.Rd new file mode 100644 index 000..9584d89 --- /dev/null +++ b/man/null2na.Rd @@ -0,0 +1,18 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/wmf.R +\name{null2na} +\alias{null2na} +\title{Turns Null Into Character "NA"} +\usage{ +null2na(x) +} +\arguments{ +\item{x}{A list} +} +\value{ +If any element from the input list is NULL, they will be turned into character + "NA". Otherwise, return the original list. +} +\description{ +The function turns NULL in a list into character "NA". +} diff --git a/man/query_hive.Rd b/man/query_hive.Rd index 8483052..fb7886f 100644 ---
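Editor's note on the `null2na` change above: a minimal sketch of how the new helper behaves, assuming a hypothetical parsed user-agent record (`ua`) that is not part of the change. The function body is copied from the diff. Note that `as.character(NA)` yields the missing value `NA_character_`, not the literal string "NA" that the roxygen text suggests.

```r
# Same logic as the null2na added to R/wmf.R in this change.
null2na <- function(x) {
  lapply(x, function(y) {
    if (is.null(y)) as.character(NA) else y
  })
}

# Hypothetical record: list() keeps the NULL element, so the length is 3.
ua <- list(browser = "Firefox", os = NULL, device = "Other")
cleaned <- null2na(ua)

# The NULL is now a character missing value, not the string "NA".
is.na(cleaned$os)
identical(cleaned$os, NA_character_)

# With no NULLs left, the list converts to a one-row data frame.
ua_row <- as.data.frame(cleaned, stringsAsFactors = FALSE)
```

Replacing NULL before building a data frame matters because a NULL element has length zero and cannot fill a column, whereas a length-one `NA` can.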
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Fix legend positions and rename type of API calls
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/375074 ) Change subject: Fix legend positions and rename type of API calls .. Fix legend positions and rename type of API calls Bug: T172452 Change-Id: Ie03a33551afe50df10f33bd8f6c35095097b91c8 --- M modules/api.R M modules/key_performance_metrics/api_usage.R M ui.R M utils.R 4 files changed, 26 insertions(+), 14 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/74/375074/1 diff --git a/modules/api.R b/modules/api.R index 495065d..790b29e 100644 --- a/modules/api.R +++ b/modules/api.R @@ -1,22 +1,22 @@ output$cirrus_aggregate <- renderDygraph({ - split_dataset$cirrus %>% + split_dataset$`full-text via API` %>% tidyr::spread(referrer, calls) %>% polloi::reorder_columns() %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_fulltext_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Full-text search API usage by referrer", legend_name = "Searches") %>% -dyLegend(width = 1000, show = "always") %>% +dyLegend(labelsDiv = "cirrus_aggregate_legend", width = 600) %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom") }) output$morelike_aggregate <- renderDygraph({ - split_dataset$`cirrus (more like)` %>% + split_dataset$`morelike via API` %>% tidyr::spread(referrer, calls) %>% polloi::reorder_columns() %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_morelike_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Morelike search API usage by referrer", legend_name = "Searches") %>% -dyLegend(width = 1000, show = "always") %>% +dyLegend(labelsDiv = "morelike_aggregate_legend", width = 600) %>% dyRangeSelector }) @@ -26,7 +26,7 @@ polloi::reorder_columns() %>% polloi::smoother(smooth_level 
= polloi::smooth_switch(input$smoothing_global, input$smoothing_open_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily OpenSearch API usage by referrer", legend_name = "Searches") %>% -dyLegend(width = 1000, show = "always") %>% +dyLegend(labelsDiv = "open_aggregate_legend", width = 600) %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom") @@ -38,7 +38,7 @@ polloi::reorder_columns() %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_geo_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Geo Search API usage by referrer", legend_name = "Searches") %>% -dyLegend(width = 1000, show = "always") %>% +dyLegend(labelsDiv = "geo_aggregate_legend", width = 600) %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom") @@ -50,7 +50,7 @@ polloi::reorder_columns() %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_language_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Language search API usage by referrer", legend_name = "Searches") %>% -dyLegend(width = 1000, show = "always") %>% +dyLegend(labelsDiv = "language_aggregate_legend", width = 600) %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom") @@ -62,7 +62,7 @@ polloi::reorder_columns() %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_prefix_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Prefix search API usage by referrer", legend_name = "Searches") %>% -dyLegend(width = 1000, show = "always") %>% 
+dyLegend(labelsDiv = "prefix_aggregate_legend", width = 600) %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom") @@ -84,6 +84,6 @@ polloi::make_dygraph(xlab = "Date", ylab = ifelse(input$referer_breakdown_prop, "API Calls Share (%)", "API Calls"), title = "Daily API usage by referrer", legend_name = "API Calls") %>% -dyLegend(width = 1000, show = "always") %>% +dyLegend(labelsDiv = "referer_breakdown_plot_legend", width = 600) %>% dyRangeSelector }) diff --git
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: SRP visit times additional fixes
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/375057 ) Change subject: SRP visit times additional fixes .. SRP visit times additional fixes Bug: T170468 Change-Id: I8758a3559e8ca6ad5713afec171bbdeca29f4dc3 --- M modules/page_visit_times.R M tab_documentation/srp_surv.md M ui.R 3 files changed, 25 insertions(+), 18 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/modules/page_visit_times.R b/modules/page_visit_times.R index d312fcc..1321dd6 100644 --- a/modules/page_visit_times.R +++ b/modules/page_visit_times.R @@ -1,12 +1,13 @@ output$lethal_dose_plot <- renderDygraph({ req(length(input$filter_lethal_dose_plot) > 0) user_page_visit_dataset[, c("date", input$filter_lethal_dose_plot)] %>% -polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_lethal_dose_plot)) %>% +polloi::reorder_columns() %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_lethal_dose_plot), rename = FALSE) %>% polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at which N% users leave the visited page") %>% dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = polloi::custom_axis_formatter, axisLabelWidth = 100, pixelsPerLabel = 80) %>% dyRoller(rollPeriod = input$rolling_lethal_dose_plot) %>% -dyLegend(labelsDiv = "lethal_dose_plot_legend") %>% +dyLegend(labelsDiv = "lethal_dose_plot_legend", width = 600) %>% dyRangeSelector(fillColor = "", strokeColor = "") %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom") @@ -17,17 +18,15 @@ serp_page_visit_dataset[, c("date", "language", input$filter_srp_ld_plot)] %>% tidyr::gather(LD, time, -c(date, language)) %>% dplyr::filter(language %in% input$language_srp_ld_plot) %>% -dplyr::transmute( - date = date, time = time, - label = paste0(LD, " (", language, ")") -) %>% 
+dplyr::transmute(date = date, time = time, label = paste0(LD, " (", language, ")")) %>% tidyr::spread(label, time) %>% -polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_srp_ld_plot)) %>% +polloi::reorder_columns() %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_srp_ld_plot), rename = FALSE) %>% polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at N% users leave the search results page") %>% dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = polloi::custom_axis_formatter, axisLabelWidth = 100, pixelsPerLabel = 80) %>% dyRoller(rollPeriod = input$rolling_srp_ld_plot) %>% -dyLegend(labelsDiv = "srp_ld_plot_legend") %>% +dyLegend(labelsDiv = "srp_ld_plot_legend", width = 600) %>% dyRangeSelector(fillColor = "", strokeColor = "") %>% dyEvent(as.Date("2017-04-25"), "S (sampling rates)", labelLoc = "bottom") %>% dyEvent(as.Date("2017-06-15"), "SS (sister search)", labelLoc = "bottom") diff --git a/tab_documentation/srp_surv.md b/tab_documentation/srp_surv.md index 254818f..c84fde7 100644 --- a/tab_documentation/srp_surv.md +++ b/tab_documentation/srp_surv.md @@ -1,11 +1,19 @@ How long Wikipedia searchers stay on the search result pages === -When someone is randomly selected for search satisfaction tracking (using our [TSS2 schema](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2)), we use a check-in system and survival analysis to estimate how long users stay on visited pages. When a Wikipedia visitor searches using autocomplete and ends up on a full-text search results page (SRP), we can track how long that page is "alive" before the user either closes the tab, clicks on a result, or navigates elsewhere. 
+When someone is randomly selected for search satisfaction tracking (using our [TSS2 schema](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2)), we use a check-in system and survival analysis to estimate how long users stay on visited pages. When a Wikipedia visitor searches using autocomplete and ends up on a **full-text _search results page_** (SRP), we can track how long that page is "alive" before the user either closes the tab, clicks on a result, or navigates elsewhere. To summarize the results on a daily basis, we record a set of statistics based on a measure formally known as "[median lethal dose](https://en.wikipedia.org/wiki/Median_lethal_dose)". This graph shows the length of time that must pass before N% of the users leave the search results page. When the number goes up, we can infer that users are staying on the pages longer. -Outages and inaccuracies +Notes +- +These summary statistics are the same between
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Order legends according to the last observed values
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/374924 ) Change subject: Order legends according to the last observed values .. Order legends according to the last observed values Bug: T172452 Change-Id: I1552b8f5adf8dde941b567154b08f9d9c674eb5d --- M modules/api.R M modules/key_performance_metrics/api_usage.R 2 files changed, 49 insertions(+), 8 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/24/374924/1 diff --git a/modules/api.R b/modules/api.R index 5fd6cd1..6ec9d1d 100644 --- a/modules/api.R +++ b/modules/api.R @@ -1,6 +1,11 @@ output$cirrus_aggregate <- renderDygraph({ split_dataset$cirrus %>% tidyr::spread(referrer, calls) %>% +{ + # Reorder columns according to the last observed values: + cols <- unlist(polloi::safe_tail(., 1)[, -1]) + .[, c(1, order(cols, decreasing = TRUE) + 1)] +} %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_fulltext_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Full-text search API usage by referrer", legend_name = "Searches") %>% dyLegend(width = 1000, show = "always") %>% @@ -12,6 +17,11 @@ output$morelike_aggregate <- renderDygraph({ split_dataset$`cirrus (more like)` %>% tidyr::spread(referrer, calls) %>% +{ + # Reorder columns according to the last observed values: + cols <- unlist(polloi::safe_tail(., 1)[, -1]) + .[, c(1, order(cols, decreasing = TRUE) + 1)] +} %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_morelike_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Morelike search API usage by referrer", legend_name = "Searches") %>% dyLegend(width = 1000, show = "always") %>% @@ -21,6 +31,11 @@ output$open_aggregate <- renderDygraph({ split_dataset$open %>% tidyr::spread(referrer, calls) %>% +{ + # Reorder columns according to the last observed values: + cols <- 
unlist(polloi::safe_tail(., 1)[, -1]) + .[, c(1, order(cols, decreasing = TRUE) + 1)] +} %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_open_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily OpenSearch API usage by referrer", legend_name = "Searches") %>% dyLegend(width = 1000, show = "always") %>% @@ -31,7 +46,13 @@ output$geo_aggregate <- renderDygraph({ split_dataset$geo %>% -tidyr::spread(referrer, calls) %>%polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_geo_search)) %>% +tidyr::spread(referrer, calls) %>% +{ + # Reorder columns according to the last observed values: + cols <- unlist(polloi::safe_tail(., 1)[, -1]) + .[, c(1, order(cols, decreasing = TRUE) + 1)] +} %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_geo_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Geo Search API usage by referrer", legend_name = "Searches") %>% dyLegend(width = 1000, show = "always") %>% dyRangeSelector %>% @@ -41,7 +62,13 @@ output$language_aggregate <- renderDygraph({ split_dataset$language %>% -tidyr::spread(referrer, calls) %>%polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_language_search)) %>% +tidyr::spread(referrer, calls) %>% +{ + # Reorder columns according to the last observed values: + cols <- unlist(polloi::safe_tail(., 1)[, -1]) + .[, c(1, order(cols, decreasing = TRUE) + 1)] +} %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_language_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Language search API usage by referrer", legend_name = "Searches") %>% dyLegend(width = 1000, show = "always") %>% dyRangeSelector %>% @@ -52,6 +79,11 @@ output$prefix_aggregate <- renderDygraph({ split_dataset$prefix %>% tidyr::spread(referrer, 
calls) %>% +{ + # Reorder columns according to the last observed values: + cols <- unlist(polloi::safe_tail(., 1)[, -1]) + .[, c(1, order(cols, decreasing = TRUE) + 1)] +} %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_prefix_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Daily Prefix search API usage by referrer", legend_name = "Searches") %>% dyLegend(width = 1000, show = "always") %>% @@ -71,6 +103,11 @@ temp <- cbind(temp$date, purrr::map_df(temp[, -c(1, 2)], function(x) round(100 * x / temp$All, 2))) } temp %>% +{ +
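Editor's note on the column-reordering block repeated throughout the change above: a small sketch of the same indexing idiom on made-up data. The `calls` data frame is hypothetical, and base `tail()` stands in for `polloi::safe_tail()`, whose exact behaviour is not reproduced here.

```r
# Hypothetical wide data: a date column followed by one column per referrer.
calls <- data.frame(
  date     = as.Date("2017-08-27") + 0:2,
  Internal = c(10, 12, 11),
  External = c(50, 48, 55),
  Unknown  = c(30, 29, 31)
)

# Values in the last row, with the date column dropped.
last_values <- unlist(tail(calls, 1)[, -1])

# order() returns positions among the non-date columns,
# so add 1 to skip over "date", which stays first.
reordered <- calls[, c(1, order(last_values, decreasing = TRUE) + 1)]

names(reordered)
# "date" "External" "Unknown" "Internal"
```

Because dygraphs lists series in column order, sorting columns by their most recent value puts the currently largest series first in the legend.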
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: SRP visit times
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/374920 ) Change subject: SRP visit times .. SRP visit times Functional but still has the following TODOs: - reorder how the %-lang combos show up in the legend - once more of the data has been backfilled, need to add some general comments on trends Bug: T170468 Change-Id: I690230e3d3a7a41156f5878169577a62f52ddeb6 --- M CHANGELOG.md M modules/page_visit_times.R A tab_documentation/srp_surv.md M tab_documentation/survival.md M ui.R M utils.R 6 files changed, 127 insertions(+), 7 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/CHANGELOG.md b/CHANGELOG.md index 7cb188e..099e8a1 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,10 @@ All notable changes to this project will be documented in this file. +## 2017/08/30 +- Added SRP visit times ([T170468](https://phabricator.wikimedia.org/T170468)) +- Added [dygraph-based rolling periods](https://rstudio.github.io/dygraphs/gallery-roll-periods.html) to page visit times modules + ## 2017/08/29 - Added support for breakdown of API usage by referrer ([T172452](https://phabricator.wikimedia.org/T172452)) - Added morelike API usage (see [Gerrit change 345863](https://gerrit.wikimedia.org/r/#/c/345863/)) for more details diff --git a/modules/page_visit_times.R b/modules/page_visit_times.R index 115cbb4..d312fcc 100644 --- a/modules/page_visit_times.R +++ b/modules/page_visit_times.R @@ -1,11 +1,34 @@ output$lethal_dose_plot <- renderDygraph({ - user_page_visit_dataset %>% + req(length(input$filter_lethal_dose_plot) > 0) + user_page_visit_dataset[, c("date", input$filter_lethal_dose_plot)] %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_lethal_dose_plot)) %>% -polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at which we have lost N% of the users") %>% +polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at which N% users leave 
the visited page") %>% dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = polloi::custom_axis_formatter, axisLabelWidth = 100, pixelsPerLabel = 80) %>% +dyRoller(rollPeriod = input$rolling_lethal_dose_plot) %>% dyLegend(labelsDiv = "lethal_dose_plot_legend") %>% dyRangeSelector(fillColor = "", strokeColor = "") %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom") }) + +output$srp_ld_plot <- renderDygraph({ + req(length(input$filter_srp_ld_plot) > 0 && length(input$language_srp_ld_plot) > 0) + serp_page_visit_dataset[, c("date", "language", input$filter_srp_ld_plot)] %>% +tidyr::gather(LD, time, -c(date, language)) %>% +dplyr::filter(language %in% input$language_srp_ld_plot) %>% +dplyr::transmute( + date = date, time = time, + label = paste0(LD, " (", language, ")") +) %>% +tidyr::spread(label, time) %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_srp_ld_plot)) %>% +polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at N% users leave the search results page") %>% +dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = polloi::custom_axis_formatter, + axisLabelWidth = 100, pixelsPerLabel = 80) %>% +dyRoller(rollPeriod = input$rolling_srp_ld_plot) %>% +dyLegend(labelsDiv = "srp_ld_plot_legend") %>% +dyRangeSelector(fillColor = "", strokeColor = "") %>% +dyEvent(as.Date("2017-04-25"), "S (sampling rates)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-06-15"), "SS (sister search)", labelLoc = "bottom") +}) diff --git a/tab_documentation/srp_surv.md b/tab_documentation/srp_surv.md new file mode 100644 index 000..254818f --- /dev/null +++ b/tab_documentation/srp_surv.md @@ -0,0 +1,23 @@ +How long Wikipedia searchers stay on the search result pages +=== + +When someone is randomly selected for search satisfaction tracking (using our [TSS2 
schema](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2)), we use a check-in system and survival analysis to estimate how long users stay on visited pages. When a Wikipedia visitor searches using autocomplete and ends up on a full-text search results page (SRP), we can track how long that page is "alive" before the user either closes the tab, clicks on a result, or navigates elsewhere. + +To summarize the results on a daily basis, we record a set of statistics based on a measure formally known as "[median lethal dose](https://en.wikipedia.org/wiki/Median_lethal_dose)". This graph shows the length of time that must pass before N% of the users leave the search results page. When the number goes up, we can infer that users are staying on the pages longer. + +Outages and
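Editor's note on the "time at which N% of users leave" statistic described in the documentation above: a simplified sketch, assuming hypothetical dwell times in which every departure was observed. The real pipeline uses check-ins and survival analysis, which also handle sessions whose end is not observed; this example only illustrates what the LD-style numbers mean.

```r
# Hypothetical dwell times in seconds; assumes no censoring.
dwell_seconds <- c(5, 10, 15, 20, 25, 30, 45, 60, 90, 120)

# With fully observed data, the time by which N% of users have left
# is the N-th percentile of the dwell times.
probs <- c(0.10, 0.25, 0.50, 0.75, 0.90)
ld <- quantile(dwell_seconds, probs = probs, type = 1)
names(ld) <- paste0("LD", probs * 100)

ld
```

An increase in, say, LD50 from one day to the next means it took longer for half of the sampled users to leave, which is how the dashboard text reads a rising line as users staying longer.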
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Duplicate reports without max data points limit to keep data...
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/374900 ) Change subject: Duplicate reports without max data points limit to keep data longer .. Duplicate reports without max data points limit to keep data longer Bug: T172453 Change-Id: Iabadd6af646cf186aff811aef8f91d2d9106a3dd --- A modules/metrics/portal/all_country_data_history M modules/metrics/portal/config.yaml A modules/metrics/portal/first_visits_country_history A modules/metrics/portal/last_action_country_history A modules/metrics/portal/most_common_country_history A modules/metrics/search/app_event_counts_langproj_breakdown_history A modules/metrics/search/cirrus_langproj_breakdown_no_automata_history A modules/metrics/search/cirrus_langproj_breakdown_with_automata_history M modules/metrics/search/config.yaml A modules/metrics/search/desktop_event_counts_langproj_breakdown_history A modules/metrics/search/mobile_event_counts_langproj_breakdown_history A modules/metrics/search/paulscore_approximations_fulltext_langproj_breakdown_history A modules/metrics/search/search_threshold_pass_rate_langproj_breakdown_history 13 files changed, 99 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/00/374900/1 diff --git a/modules/metrics/portal/all_country_data_history b/modules/metrics/portal/all_country_data_history new file mode 100755 index 000..d295ff7 --- /dev/null +++ b/modules/metrics/portal/all_country_data_history @@ -0,0 +1,3 @@ +#!/bin/bash + +Rscript modules/metrics/portal/geographic_breakdown.R -d $1 --include_all diff --git a/modules/metrics/portal/config.yaml b/modules/metrics/portal/config.yaml index b980a64..3a44813 100644 --- a/modules/metrics/portal/config.yaml +++ b/modules/metrics/portal/config.yaml @@ -53,6 +53,12 @@ max_data_points: 60 funnel: true type: script +all_country_data_history: +description: Sampled traffic to Wikipedia.org Portal, broken down by country. Historical data store. 
+granularity: days +starts: 2017-04-01 +funnel: true +type: script app_link_clicks: description: Clicks to Wikipedia mobile apps and list of apps granularity: days @@ -66,6 +72,12 @@ max_data_points: 60 funnel: true type: script +last_action_country_history: +description: Last action performed on Wikipedia.org Portal per user session. Historical data store. +granularity: days +starts: 2017-04-01 +funnel: true +type: script most_common_country: description: Most common action performed on Wikipedia.org Portal per user session, broken down by country granularity: days @@ -73,6 +85,12 @@ max_data_points: 60 funnel: true type: script +most_common_country_history: +description: Most common action performed on Wikipedia.org Portal per user session, broken down by country. Historical data store. +granularity: days +starts: 2017-04-01 +funnel: true +type: script first_visits_country: description: Action performed on Wikipedia.org Portal on each user's initial visit, broken down by country granularity: days @@ -80,6 +98,12 @@ max_data_points: 60 funnel: true type: script +first_visits_country_history: +description: Action performed on Wikipedia.org Portal on each user's initial visit, broken down by country. Historical data store. 
+granularity: days +starts: 2017-04-01 +funnel: true +type: script clickthrough_rate: description: Last action (no action vs clickthrough) by Wikipedia.org Portal visitors granularity: days diff --git a/modules/metrics/portal/first_visits_country_history b/modules/metrics/portal/first_visits_country_history new file mode 100755 index 000..2312008 --- /dev/null +++ b/modules/metrics/portal/first_visits_country_history @@ -0,0 +1,3 @@ +#!/bin/bash + +Rscript modules/metrics/portal/engagement.R -d $1 -o clickthrough_firstvisit --by_country diff --git a/modules/metrics/portal/last_action_country_history b/modules/metrics/portal/last_action_country_history new file mode 100755 index 000..dd3d177 --- /dev/null +++ b/modules/metrics/portal/last_action_country_history @@ -0,0 +1,3 @@ +#!/bin/bash + +Rscript modules/metrics/portal/engagement.R -d $1 -o clickthrough_breakdown --by_country diff --git a/modules/metrics/portal/most_common_country_history b/modules/metrics/portal/most_common_country_history new file mode 100755 index 000..d17ce0b --- /dev/null +++ b/modules/metrics/portal/most_common_country_history @@ -0,0 +1,3 @@ +#!/bin/bash + +Rscript modules/metrics/portal/engagement.R -d $1 -o most_common_per_visit --by_country diff --git
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: metrics::search::srp_survtime: Split by language
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/374876 ) Change subject: metrics::search::srp_survtime: Split by language .. metrics::search::srp_survtime: Split by language Bug: T170468 Change-Id: I2b935be14eeb26350dea6d9c31a66977c531c052 --- M docs/README.md M modules/metrics/search/config.yaml M modules/metrics/search/sample_page_visit_ld.R M modules/metrics/search/srp_survtime.R 4 files changed, 49 insertions(+), 29 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/docs/README.md b/docs/README.md index 8af2aa3..1b2abe6 100644 --- a/docs/README.md +++ b/docs/README.md @@ -8,7 +8,7 @@ infrastructure. These datasets provide the metrics that are used by [Discovery's Dashboards](https://discovery.wmflabs.org/) -Last updated on 28 August 2017 +Last updated on 30 August 2017 Daily Metrics - @@ -147,7 +147,8 @@ exlcudes known automata - **srp\_survtime.tsv**: Estimates (via survival analysis) of how long Wikipedia searchers stay on full-text search results page after -getting there from autocomplete search. +getting there from autocomplete search, split by English vs French +and Catalan vs other languages. wdqs/ - diff --git a/modules/metrics/search/config.yaml b/modules/metrics/search/config.yaml index bfa78ab..7514982 100644 --- a/modules/metrics/search/config.yaml +++ b/modules/metrics/search/config.yaml @@ -163,8 +163,8 @@ funnel: true type: script srp_survtime: -description: Estimates (via survival analysis) of how long Wikipedia searchers stay on full-text search results page after getting there from autocomplete search. +description: Estimates (via survival analysis) of how long Wikipedia searchers stay on full-text search results page after getting there from autocomplete search, split by English vs French and Catalan vs other languages. 
granularity: days starts: 2017-04-01 -funnel: false +funnel: true type: script diff --git a/modules/metrics/search/sample_page_visit_ld.R b/modules/metrics/search/sample_page_visit_ld.R index ee425f6..dc04dac 100644 --- a/modules/metrics/search/sample_page_visit_ld.R +++ b/modules/metrics/search/sample_page_visit_ld.R @@ -40,10 +40,8 @@ } ) -if (nrow(results) == 0) { - # Here we make the script output tab-separated - # column names, as required by Reportupdater: - page_visit_survivorship <- data.frame( +empty_df <- function() { + data.frame( date = character(), LD10 = character(), LD25 = character(), @@ -53,6 +51,12 @@ LD95 = character(), LD99 = character() ) +} + +if (nrow(results) == 0) { + # Here we make the script output tab-separated + # column names, as required by Reportupdater: + page_visit_survivorship <- empty_df() } else { # De-duplicate, clean, and sort: results$timestamp <- as.POSIXct(results$timestamp, format = "%Y%m%d%H%M%S") @@ -69,7 +73,7 @@ # Treat each individual search session as its own thing, rather than belonging # to a set of other search sessions by the same user. 
page_visits <- results[, { -if (all(!is.na(.SD$checkin))) { +if (any(.SD$event == "checkin")) { last_checkin <- max(.SD$checkin, na.rm = TRUE) idx <- which(checkins > last_checkin) if (length(idx) == 0) idx <- 16 # length(checkins) = 16 @@ -82,13 +86,19 @@ ) } }, by = c("session_id", "page_id")] - surv <- survival::Surv(time = page_visits$`last check-in`, - time2 = page_visits$`next check-in`, - event = page_visits$status, - type = "interval") - fit <- survival::survfit(surv ~ 1) - page_visit_survivorship <- data.frame(date = opt$date, rbind(quantile(fit, probs = c(0.10, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99))$quantile)) - colnames(page_visit_survivorship) <- c('date', 'LD10', 'LD25', 'LD50', 'LD75', 'LD90', 'LD95', 'LD99') + if (nrow(page_visits) == 0) { +page_visit_survivorship <- empty_df() + } else { +surv <- survival::Surv( + time = page_visits$`last check-in`, + time2 = page_visits$`next check-in`, + event = page_visits$status, + type = "interval" +) +fit <- survival::survfit(surv ~ 1) +page_visit_survivorship <- data.frame(date = opt$date, rbind(quantile(fit, probs = c(0.10, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99))$quantile)) +colnames(page_visit_survivorship) <- c('date', 'LD10', 'LD25', 'LD50', 'LD75', 'LD90', 'LD95', 'LD99') + } } write.table(page_visit_survivorship, file = "", append = FALSE, sep = "\t", row.names = FALSE, quote = FALSE) diff --git a/modules/metrics/search/srp_survtime.R b/modules/metrics/search/srp_survtime.R index a4ca9cb..985a392 100644 --- a/modules/metrics/search/srp_survtime.R +++ b/modules/metrics/search/srp_survtime.R @@ -56,6 +56,7 @@
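Note on the empty_df() refactor in this change: it relies on write.table() still printing the header row when given a zero-row data frame, which is how the script emits the tab-separated column names that Reportupdater expects on days with no data. A minimal base-R sketch of that behaviour follows. The column list is shortened for illustration and is not the full set used in sample_page_visit_ld.R.

```r
# Zero-row data frame with the expected columns (shortened for illustration;
# the real script also declares LD25, LD75, LD90, LD95 and LD99).
empty_df <- function() {
  data.frame(
    date = character(),
    LD10 = character(),
    LD50 = character(),
    stringsAsFactors = FALSE
  )
}

# file = "" sends the table to standard output, as in the script;
# capture.output() is used here only so the result can be inspected.
header_only <- capture.output(
  write.table(empty_df(), file = "", append = FALSE, sep = "\t",
              row.names = FALSE, quote = FALSE)
)
print(header_only)
```

Because the same helper is reused in both empty cases (no raw events, and no page visits after filtering), the two branches cannot drift apart in their column names.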
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Breakdown API calls by referer class
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/374669 ) Change subject: Breakdown API calls by referer class .. Breakdown API calls by referer class Bug: T172452 Change-Id: Ic70d7054e02569eb8545dd347026c7f77321ab2c --- M modules/api.R A tab_documentation/referer_breakdown.md M ui.R M utils.R 4 files changed, 65 insertions(+), 2 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/69/374669/1 diff --git a/modules/api.R b/modules/api.R index dc8e332..bfc2350 100644 --- a/modules/api.R +++ b/modules/api.R @@ -53,3 +53,22 @@ dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom") }) + +output$referer_breakdown_plot <- renderDygraph({ + temp <- split_dataset %>% +dplyr::bind_rows(.id = "api") %>% +dplyr::group_by(date, referrer) %>% +dplyr::summarize(calls = sum(calls, na.rm = TRUE)) %>% +tidyr::spread(referrer, calls) + if (input$referer_breakdown_prop) { +temp <- cbind(temp$date, purrr::map_df(temp[, -c(1, 2)], function(x) round(100 * x / temp$All, 2))) + } + temp %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_referer_breakdown)) %>% +polloi::make_dygraph(xlab = "Date", + ylab = ifelse(input$referer_breakdown_prop, "API Calls Share (%)", "API Calls"), + title = "Daily API usage by referrer", legend_name = "API Calls") %>% +dyRangeSelector %>% +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom") +}) diff --git a/tab_documentation/referer_breakdown.md b/tab_documentation/referer_breakdown.md new file mode 100644 index 000..0b1f8d0 --- /dev/null +++ b/tab_documentation/referer_breakdown.md @@ -0,0 +1,24 @@ +API Calls by Referrer Class +=== + +All types of API calls are aggregated by date and referrer class. 
+ +**Internal** is traffic referred by Wikimedia sites, specifically: mediawiki.org, wikibooks.org, wikidata.org, wikinews.org, wikimedia.org, wikimediafoundation.org, wikipedia.org, wikiquote.org, wikisource.org, wikiversity.org, wikivoyage.org, and wiktionary.org (See [Webrequest source](https://git.wikimedia.org/blob/analytics%2Frefinery%2Fsource.git/master/refinery-core%2Fsrc%2Fmain%2Fjava%2Forg%2Fwikimedia%2Fanalytics%2Frefinery%2Fcore%2FWebrequest.java#L203) for more information.) + +Outages and inaccuracies +-- + +* '__R__': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of [our data retrieval and processing codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' [Reportupdater infrastructure](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater). See [T150915](https://phabricator.wikimedia.org/T150915) for more details. Furthermore, we switched to an updated UDF for counting API calls -- the previous version was undercounting full-text and geo search API calls (see [Gerrit change 315503](https://gerrit.wikimedia.org/r/#/c/315503/) for more details). +* '__U__': on 2017-06-29 we started to use a new UDF to get the type of search API (see [Gerrit change 345863](https://gerrit.wikimedia.org/r/#/c/345863/) for more details) and break down the API calls by referer class. + +Questions, bug reports, and feature suggestions +-- +For technical, non-bug questions, [email Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question) or [Chelsy](mailto:c...@wikimedia.org?subject=Dashboard%20Question). If you experience a bug or notice something wrong or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) in the Discovery board or [email Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question). 
+ + + + Link to this dashboard: https://discovery.wmflabs.org/metrics/#referer_breakdown;>https://discovery.wmflabs.org/metrics/#referer_breakdown + | Page is available under https://creativecommons.org/licenses/by-sa/3.0/; title="Creative Commons Attribution-ShareAlike License">CC-BY-SA 3.0 + | https://phabricator.wikimedia.org/diffusion/WDRN/; title="Search Metrics Dashboard source code repository">Code is licensed under https://phabricator.wikimedia.org/diffusion/WDRN/browse/master/LICENSE.md; title="MIT License">MIT + | Part of https://discovery.wmflabs.org/;>Discovery Dashboards + diff --git a/ui.R b/ui.R index 8b20615..7131e7e 100644 --- a/ui.R +++ b/ui.R @@ -60,7 +60,8 @@ menuSubItem(text = "Open Search", tabName = "open_search"), menuSubItem(text = "Geo Search", tabName = "geo_search"),
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Add a tab to track morelike search usage
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/374442 ) Change subject: Add a tab to track morelike search usage .. Add a tab to track morelike search usage Bug: https://phabricator.wikimedia.org/T172452 Change-Id: I0d0b107df1f6b46a28b8e2c025d1acf5f0fec327 --- M modules/api.R M modules/key_performance_metrics/api_usage.R M tab_documentation/fulltext_basic.md M tab_documentation/geo_basic.md M tab_documentation/kpi_api_usage.md M tab_documentation/language_basic.md A tab_documentation/morelike_basic.md M tab_documentation/open_basic.md M tab_documentation/prefix_basic.md M ui.R 10 files changed, 45 insertions(+), 13 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/42/374442/1 diff --git a/modules/api.R b/modules/api.R index 73368cd..8838a99 100644 --- a/modules/api.R +++ b/modules/api.R @@ -6,7 +6,17 @@ polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Full-text via API usage by day", legend_name = "Searches") %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% -dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom") +dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom") +}) + +output$morelike_aggregate <- renderDygraph({ + split_dataset$`cirrus (more like)` %>% +dplyr::group_by(date) %>% +dplyr::mutate(All = sum(calls, na.rm = TRUE)) %>% +tidyr::spread(key = referer_class, value = calls) %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_morelike_search)) %>% +polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Morelike Search via API usage by day", legend_name = "Searches") %>% +dyRangeSelector }) output$open_aggregate <- renderDygraph({ @@ -17,7 +27,7 @@ polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "OpenSearch API usage by day", legend_name = "Searches") %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R 
(reportupdater)", labelLoc = "bottom") %>% -dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom") +dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom") }) output$geo_aggregate <- renderDygraph({ @@ -28,7 +38,7 @@ polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Geo Search API usage by day", legend_name = "Searches") %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% -dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom") +dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom") }) output$language_aggregate <- renderDygraph({ @@ -39,7 +49,7 @@ polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Language Search API usage by day", legend_name = "Searches") %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% -dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom") +dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom") }) output$prefix_aggregate <- renderDygraph({ @@ -50,5 +60,5 @@ polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Prefix Search API usage by day", legend_name = "Searches") %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% -dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom") +dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom") }) diff --git a/modules/key_performance_metrics/api_usage.R b/modules/key_performance_metrics/api_usage.R index 13a4c3a..b1ba34b 100644 --- a/modules/key_performance_metrics/api_usage.R +++ b/modules/key_performance_metrics/api_usage.R @@ -40,7 +40,7 @@ dyCSS(css = system.file("custom.css", package = "polloi")) %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% - dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom")) + dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = 
"bottom")) } api_usage_change <- api_usage %>% dplyr::mutate( @@ -63,5 +63,5 @@ dyCSS(css = system.file("custom.css", package = "polloi")) %>% dyRangeSelector %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% - dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom")) + dyEvent(as.Date("2017-06-29"), "U (new UDF)", labelLoc = "bottom")) }) diff --git a/tab_documentation/fulltext_basic.md b/tab_documentation/fulltext_basic.md index c2a121a..76826cf 100644 --- a/tab_documentation/fulltext_basic.md +++ b/tab_documentation/fulltext_basic.md @@ -13,7 +13,7 @@ -- * '__R__': on
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: metrics::search::srp_survtime: Track search results page dwe...
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/374396 ) Change subject: metrics::search::srp_survtime: Track search results page dwell time .. metrics::search::srp_survtime: Track search results page dwell time Bug: T170468 Change-Id: I694a2f24cd831428ad95872dea085f8307994b4a --- M docs/README.Rmd M docs/README.md M modules/metrics/search/config.yaml A modules/metrics/search/srp_survtime A modules/metrics/search/srp_survtime.R 5 files changed, 136 insertions(+), 4 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/docs/README.Rmd b/docs/README.Rmd index c03b43d..c7d27c0 100644 --- a/docs/README.Rmd +++ b/docs/README.Rmd @@ -1,12 +1,12 @@ --- output: md_document note: > - Needs to be knit into Markdown and rsync'd to stat1002:/a/published-datasets/discovery/README.md + Needs to be knit into Markdown and rsync'd to stat1005:/srv/published-datasets/discovery/README.md --- # Discovery Datasets -These files are generated by Discovery's [Golden](https://github.com/wikimedia/wikimedia-discovery-golden/) data retrieval codebase that executes daily and uses [Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater) infrastructure. These datasets provide the metrics that are used by [Discovery's Dashboards](https://discovery.wmflabs.org/) +These files are generated by Discovery's [Golden](https://github.com/wikimedia/wikimedia-discovery-golden/) data retrieval codebase that executes daily and uses [Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Systems/Reportupdater) infrastructure. 
These datasets provide the metrics that are used by [Discovery's Dashboards](https://discovery.wmflabs.org/) Last updated on `r format(Sys.Date(), "%d %B %Y")` diff --git a/docs/README.md b/docs/README.md index f1ea48c..8af2aa3 100644 --- a/docs/README.md +++ b/docs/README.md @@ -4,11 +4,11 @@ These files are generated by Discovery's [Golden](https://github.com/wikimedia/wikimedia-discovery-golden/) data retrieval codebase that executes daily and uses -[Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater) +[Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Systems/Reportupdater) infrastructure. These datasets provide the metrics that are used by [Discovery's Dashboards](https://discovery.wmflabs.org/) -Last updated on 01 August 2017 +Last updated on 28 August 2017 Daily Metrics - @@ -145,6 +145,9 @@ Wikipedia search results pages; broken up by language, destination type (SERP vs not), and access method (desktop vs mobile web); exlcudes known automata +- **srp\_survtime.tsv**: Estimates (via survival analysis) of how long +Wikipedia searchers stay on full-text search results page after +getting there from autocomplete search. wdqs/ - diff --git a/modules/metrics/search/config.yaml b/modules/metrics/search/config.yaml index 82f1c3f..56ec39b 100644 --- a/modules/metrics/search/config.yaml +++ b/modules/metrics/search/config.yaml @@ -162,3 +162,9 @@ starts: 2017-06-01 funnel: true type: script +srp_survtime: +description: Estimates (via survival analysis) of how long Wikipedia searchers stay on full-text search results page after getting there from autocomplete search. 
+granularity: days +starts: 2017-04-01 +funnel: true +type: script diff --git a/modules/metrics/search/srp_survtime b/modules/metrics/search/srp_survtime new file mode 100755 index 000..08e2682 --- /dev/null +++ b/modules/metrics/search/srp_survtime @@ -0,0 +1,3 @@ +#!/bin/bash + +Rscript modules/metrics/search/srp_survtime.R -d $1 diff --git a/modules/metrics/search/srp_survtime.R b/modules/metrics/search/srp_survtime.R new file mode 100644 index 000..e24f99d --- /dev/null +++ b/modules/metrics/search/srp_survtime.R @@ -0,0 +1,120 @@ +#!/usr/bin/env Rscript + +source("config.R") +.libPaths(r_library) +suppressPackageStartupMessages({ + library("optparse") + library("glue") + library("magrittr") +}) + +option_list <- list( + make_option(c("-d", "--date"), default = NA, action = "store", type = "character") +) + +# Get command line options, if help option encountered print help and exit, +# otherwise if options not found on command line then set defaults: +opt <- parse_args(OptionParser(option_list = option_list)) + +if (is.na(opt$date)) { + quit(save = "no", status = 1) +} + +mmdd <- format(as.Date(opt$date), "%Y%m%d") +revision_number <- dplyr::case_when( + as.Date(opt$date) < "2017-02-10" ~ "15922352", + as.Date(opt$date) < "2017-06-29" ~ "16270835", + TRUE ~ "16909631" +) + +query <- glue("SELECT + timestamp AS ts, wiki, + event_uniqueId AS event_id, + event_searchSessionId AS session_id, + event_pageViewId AS page_id, + event_action AS event, + event_checkin AS checkin, +
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Use new UDF and break api calls down by referer class
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/374387 ) Change subject: Use new UDF and break api calls down by referer class .. Use new UDF and break api calls down by referer class Bug: T172452 Change-Id: I0c3fad23abb3931223d0b6212c1f8a969a251f72 --- M modules/api.R M modules/key_performance_metrics/api_usage.R M tab_documentation/fulltext_basic.md M tab_documentation/kpi_api_usage.md M utils.R 5 files changed, 33 insertions(+), 12 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/87/374387/1 diff --git a/modules/api.R b/modules/api.R index 7e8e7ff..affe6fa 100644 --- a/modules/api.R +++ b/modules/api.R @@ -1,13 +1,18 @@ output$cirrus_aggregate <- renderDygraph({ split_dataset$cirrus %>% +tidyr::spread(key = referer_class, value = calls) %>% +dplyr::mutate(All = ifelse(is.na(All), rowSums(.[, -c(1, 2)], na.rm = TRUE), All)) %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_fulltext_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Full-text via API usage by day", legend_name = "Searches") %>% dyRangeSelector %>% -dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-08-14"), "U (new UDF)", labelLoc = "bottom") }) output$open_aggregate <- renderDygraph({ split_dataset$open %>% +tidyr::spread(key = referer_class, value = calls) %>% +dplyr::mutate(All = ifelse(is.na(All), rowSums(.[, -c(1, 2)], na.rm = TRUE), All)) %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_open_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "OpenSearch API usage by day", legend_name = "Searches") %>% dyRangeSelector %>% @@ -16,6 +21,8 @@ output$geo_aggregate <- renderDygraph({ split_dataset$geo %>% +tidyr::spread(key = referer_class, value = calls) 
%>% +dplyr::mutate(All = ifelse(is.na(All), rowSums(.[, -c(1, 2)], na.rm = TRUE), All)) %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_geo_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Geo Search API usage by day", legend_name = "Searches") %>% dyRangeSelector %>% @@ -24,6 +31,8 @@ output$language_aggregate <- renderDygraph({ split_dataset$language %>% +tidyr::spread(key = referer_class, value = calls) %>% +dplyr::mutate(All = ifelse(is.na(All), rowSums(.[, -c(1, 2)], na.rm = TRUE), All)) %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_language_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Language Search API usage by day", legend_name = "Searches") %>% dyRangeSelector %>% @@ -32,6 +41,8 @@ output$prefix_aggregate <- renderDygraph({ split_dataset$prefix %>% +tidyr::spread(key = referer_class, value = calls) %>% +dplyr::mutate(All = ifelse(is.na(All), rowSums(.[, -c(1, 2)], na.rm = TRUE), All)) %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_prefix_search)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Searches", title = "Prefix Search API usage by day", legend_name = "Searches") %>% dyRangeSelector %>% diff --git a/modules/key_performance_metrics/api_usage.R b/modules/key_performance_metrics/api_usage.R index 271b030..13a4c3a 100644 --- a/modules/key_performance_metrics/api_usage.R +++ b/modules/key_performance_metrics/api_usage.R @@ -2,6 +2,11 @@ smooth_level <- input$smoothing_kpi_api_usage start_date <- Sys.Date() - switch(input$kpi_summary_date_range_selector, all = NA, daily = 1, weekly = 8, monthly = 31, quarterly = 91) api_usage <- split_dataset %>% + purrr::map(function(x) { +dplyr::group_by(x, date) %>% +dplyr::summarize(calls = sum(calls, na.rm = TRUE)) %>% +dplyr::ungroup() + }) %>% { if (!is.na(start_date)) { lapply(., 
polloi::subset_by_date_range, from = start_date, to = Sys.Date() - 1) @@ -12,33 +17,35 @@ dplyr::bind_rows(.id = "api") %>% tidyr::spread("api", "calls") if ( input$kpi_api_usage_series_include_open ) { -api_usage <- dplyr::mutate(api_usage, all = cirrus + geo + language + open + prefix) +api_usage <- dplyr::mutate(api_usage, all = cirrus + ifelse(is.na(`cirrus (more like)`), 0, `cirrus (more like)`) + geo + language + open + prefix) } else { -api_usage <- dplyr::mutate(api_usage, all = cirrus + geo + language + prefix) +api_usage <- dplyr::mutate(api_usage, all = cirrus + ifelse(is.na(`cirrus (more like)`), 0, `cirrus (more like)`) + geo + language + prefix) } if (
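Note on the ifelse(is.na(...), 0, ...) wrapper added around `cirrus (more like)` in this change: in R, adding a missing value to anything yields NA, so days before the new UDF existed would otherwise lose their totals. A small base-R sketch with made-up numbers (the data frame and its column names are hypothetical, not the dashboard's real dataset):

```r
# Hypothetical daily call counts; `more_like` is NA before the new UDF existed.
api_usage <- data.frame(
  date      = as.Date(c("2017-06-28", "2017-06-29")),
  cirrus    = c(100, 90),
  more_like = c(NA, 25),
  geo       = c(10, 12)
)

# Plain addition propagates NA, so the first day's total is lost.
naive_total <- with(api_usage, cirrus + more_like + geo)

# Replacing NA with 0 first (the approach taken in the patch) keeps the total.
safe_total <- with(
  api_usage,
  cirrus + ifelse(is.na(more_like), 0, more_like) + geo
)

# rowSums(..., na.rm = TRUE) is an equivalent way to skip missing values.
row_total <- rowSums(api_usage[, c("cirrus", "more_like", "geo")], na.rm = TRUE)
```

The same reasoning applies to sum(): without na.rm = TRUE a single missing value makes the whole sum NA.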
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Breakdown search API requests by referer class and use GetSe...
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/371980 )

Change subject: Breakdown search API requests by referer class and use GetSearchRequestTypeUDF
..

Breakdown search API requests by referer class and use GetSearchRequestTypeUDF

Please do not merge this patch since the new UDF hasn’t been released to production.

Bug: T172452
Change-Id: Ia4aa5260fe243abeced91c67de8f44bdc9be859b
---
M modules/metrics/search/search_api_usage
1 file changed, 4 insertions(+), 3 deletions(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/80/371980/1

diff --git a/modules/metrics/search/search_api_usage b/modules/metrics/search/search_api_usage
index a0b1b7c..f9de476 100755
--- a/modules/metrics/search/search_api_usage
+++ b/modules/metrics/search/search_api_usage
@@ -1,11 +1,12 @@
 #!/bin/bash

 hive -S -e "ADD JAR hdfs:///wmf/refinery/current/artifacts/refinery-hive.jar;
-CREATE TEMPORARY FUNCTION search_classify AS 'org.wikimedia.analytics.refinery.hive.SearchClassifierUDF';
+CREATE TEMPORARY FUNCTION search_classify AS 'org.wikimedia.analytics.refinery.hive.GetSearchRequestTypeUDF';
 USE wmf;
 SELECT
   '$1' AS date,
   search_classify(uri_path, uri_query) AS api,
+  referer_class,
   COUNT(*) AS calls
 FROM webrequest
 WHERE
@@ -13,6 +14,6 @@
   AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) >= '$1'
   AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) < '$2'
   AND http_status = '200'
-  AND search_classify(uri_path, uri_query) IN('language', 'cirrus', 'prefix', 'geo', 'open')
-GROUP BY '$1', search_classify(uri_path, uri_query);
+  AND search_classify(uri_path, uri_query) IN('language', 'cirrus', 'cirrus (more like)', 'prefix', 'geo', 'open')
+GROUP BY '$1', search_classify(uri_path, uri_query), referer_class;
 " 2> /dev/null | grep -v parquet.hadoop | grep -v WARN:

--
To view, visit https://gerrit.wikimedia.org/r/371980
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia4aa5260fe243abeced91c67de8f44bdc9be859b
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
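Note on the query change in this message: with referer_class added to the GROUP BY, the script emits one row per (date, api, referer_class) instead of one row per (date, api). Consumers that still want the old per-API totals can sum over the new column. A base-R sketch with invented counts (the data frame and the referer_class values shown are assumptions for illustration only):

```r
# Hypothetical query output: one row per (date, api, referer_class).
usage <- data.frame(
  date          = "2017-08-01",
  api           = c("cirrus", "cirrus", "cirrus", "geo", "geo"),
  referer_class = c("internal", "external", "none", "internal", "none"),
  calls         = c(50, 30, 20, 5, 7),
  stringsAsFactors = FALSE
)

# Summing over referer_class recovers one total per (date, api),
# i.e. the shape the query returned before this change.
totals <- aggregate(calls ~ date + api, data = usage, FUN = sum)
print(totals)
```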
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Remove duplicated clicks on the same position for each query...
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/370977 )

Change subject: Remove duplicated clicks on the same position for each query when computing paulscore
..

Remove duplicated clicks on the same position for each query when computing paulscore

Bug: T172960
Change-Id: I972500c6150408a119f2c80dad9fe8a49f00845e
---
M modules/metrics/search/paulscore_approximations.R
1 file changed, 8 insertions(+), 5 deletions(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/77/370977/1

diff --git a/modules/metrics/search/paulscore_approximations.R b/modules/metrics/search/paulscore_approximations.R
index 1f7fe9f..1a1ede0 100644
--- a/modules/metrics/search/paulscore_approximations.R
+++ b/modules/metrics/search/paulscore_approximations.R
@@ -35,11 +35,14 @@
   SUM(IF(event_action = 'click', POW(0.7, event_position), 0)) / SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_7,
   SUM(IF(event_action = 'click', POW(0.8, event_position), 0)) / SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_8,
   SUM(IF(event_action = 'click', POW(0.9, event_position), 0)) / SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_9
-FROM TestSearchSatisfaction2_", dplyr::if_else(as.Date(opt$date) < "2017-02-10", "15922352", dplyr::if_else(as.Date(opt$date) < "2017-06-29", "16270835", "16909631")), "
-WHERE ", date_clause, "
-  AND event_action IN ('searchResultPage', 'click')
-  AND IF(event_source = 'autocomplete', event_inputLocation = 'header', TRUE)
-  AND IF(event_source = 'autocomplete' AND event_action = 'click', event_position >= 0, TRUE)
+FROM
+  (SELECT event_searchSessionId, event_source, wiki, event_action, event_position, event_pageViewId, event_query
+   FROM TestSearchSatisfaction2_", dplyr::if_else(as.Date(opt$date) < "2017-02-10", "15922352", dplyr::if_else(as.Date(opt$date) < "2017-06-29", "16270835", "16909631")), "
+   WHERE ", date_clause, "
+     AND event_action IN ('searchResultPage', 'click')
+     AND IF(event_source = 'autocomplete', event_inputLocation = 'header', TRUE)
+     AND IF(event_source = 'autocomplete' AND event_action = 'click', event_position >= 0, TRUE)
+   GROUP BY event_searchSessionId, event_source, wiki, event_action, event_position, event_pageViewId, event_query) AS deduplicate
 GROUP BY date, event_searchSessionId, event_source, wiki;")

 # Fetch data from MySQL database:

--
To view, visit https://gerrit.wikimedia.org/r/370977
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I972500c6150408a119f2c80dad9fe8a49f00845e
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
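Note on the deduplication in this message: the inner GROUP BY over every selected column acts like a SELECT DISTINCT, so a click logged twice at the same position for the same query and page view contributes to the score only once. The effect can be sketched in base R on a made-up event log. The column names, the single scoring factor of 0.5 and the paulscore() helper are illustrative assumptions, not the production code.

```r
# Hypothetical event log for one search session: the click at position 0
# was logged twice for the same query and page view.
events <- data.frame(
  session_id = "s1",
  page_id    = "p1",
  query      = "cats",
  action     = c("searchResultPage", "click", "click", "click"),
  position   = c(NA, 0, 0, 2),
  stringsAsFactors = FALSE
)

# Score for one scoring factor f: sum of f^position over clicks,
# divided by the number of search result pages (as in the SQL above).
paulscore <- function(ev, f = 0.5) {
  clicks <- ev$action == "click"
  sum(f ^ ev$position[clicks]) / sum(ev$action == "searchResultPage")
}

with_dupes <- paulscore(events)          # the repeated click is counted twice
deduped    <- paulscore(unique(events))  # exact duplicate rows are dropped first
```

Without deduplication the repeated click inflates the session's score, which is the distortion the patch is meant to remove.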
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Add 'na.rm = TRUE' to sum functions
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/370922 )

Change subject: Add 'na.rm = TRUE' to sum functions
..

Add 'na.rm = TRUE' to sum functions

Bug: T170469
Change-Id: I065f732b94bc59c487885e59c618abb1319c72ca
---
M utils.R
1 file changed, 22 insertions(+), 22 deletions(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/22/370922/1

diff --git a/utils.R b/utils.R
index 6d25af8..ab34131 100644
--- a/utils.R
+++ b/utils.R
@@ -20,14 +20,14 @@
     dplyr::summarize(volume = sum(as.numeric(`search sessions`), na.rm = TRUE)) %>%
     dplyr::filter(volume > 0) %>%
     dplyr::arrange(desc(volume)) %>%
-    dplyr::mutate(prop = volume / sum(volume),
+    dplyr::mutate(prop = volume / sum(volume, na.rm = TRUE),
                   label = sprintf("%s (%.3f%%)", language, 100 * prop))
   available_projects_desktop <<- desktop_langproj_dygraph_set %>%
     dplyr::group_by(project) %>%
     dplyr::summarize(volume = sum(as.numeric(`search sessions`), na.rm = TRUE)) %>%
     dplyr::filter(volume > 0) %>%
     dplyr::arrange(desc(volume)) %>%
-    dplyr::mutate(prop = volume / sum(volume),
+    dplyr::mutate(prop = volume / sum(volume, na.rm = TRUE),
                   label = sprintf("%s (%.3f%%)", project, 100 * prop))
 }

@@ -69,7 +69,7 @@
     dplyr::filter(!is.na(click_position), !is.na(events)) %>%
     dplyr::distinct(date, click_position, .keep_all = TRUE) %>%
     dplyr::group_by(date) %>%
-    dplyr::mutate(prop = round(events / sum(events) * 100, 2)) %>%
+    dplyr::mutate(prop = round(events / sum(events, na.rm = TRUE) * 100, 2)) %>%
     dplyr::ungroup() %>%
     dplyr::select(-events) %>%
     tidyr::spread(click_position, prop, fill = 0)
@@ -80,7 +80,7 @@
     dplyr::filter(!is.na(invoke_source), !is.na(events)) %>%
     dplyr::distinct(date, invoke_source, .keep_all = TRUE) %>%
     dplyr::group_by(date) %>%
-    dplyr::mutate(prop = round(events / sum(events) * 100, 2)) %>%
+    dplyr::mutate(prop = round(events / sum(events, na.rm = TRUE) * 100, 2)) %>%
     dplyr::ungroup() %>%
     dplyr::select(-events) %>%
     tidyr::spread(invoke_source, prop, fill = 0)
@@ -179,14 +179,14 @@
     dplyr::summarize(volume = sum(as.numeric(total), na.rm = TRUE)) %>%
     dplyr::filter(volume > 0) %>%
     dplyr::arrange(desc(volume)) %>%
-    dplyr::mutate(prop = volume / sum(volume),
+    dplyr::mutate(prop = volume / sum(volume, na.rm = TRUE),
                   label = sprintf("%s (%.3f%%)", language, 100 * prop))
   available_projects <<- langproj_with_automata %>%
     dplyr::group_by(project) %>%
     dplyr::summarize(volume = sum(as.numeric(total), na.rm = TRUE)) %>%
     dplyr::filter(volume > 0) %>%
     dplyr::arrange(desc(volume)) %>%
-    dplyr::mutate(prop = volume / sum(volume),
+    dplyr::mutate(prop = volume / sum(volume, na.rm = TRUE),
                   label = sprintf("%s (%.3f%%)", project, 100 * prop))
   projects_db <<- readr::read_csv(system.file("extdata/projects.csv", package = "polloi"), col_types = "cclc")[, c("project", "multilingual")]
 }
@@ -203,7 +203,7 @@
   ) %>%
     dplyr::bind_rows(.id = "platform") %>%
     dplyr::group_by(date) %>%
-    dplyr::summarize(clickthroughs = sum(clickthroughs), serps = sum(`Result pages opened`)) %>%
+    dplyr::summarize(clickthroughs = sum(clickthroughs, na.rm = TRUE), serps = sum(`Result pages opened`, na.rm = TRUE)) %>%
     dplyr::right_join(threshold_data, by = "date") %>%
     dplyr::transmute(
       date = date,
@@ -244,7 +244,7 @@
   ) %>%
     dplyr::bind_rows(.id = "platform") %>%
     dplyr::group_by(date, language, project) %>%
-    dplyr::summarize(clickthroughs = sum(clickthroughs), serps = sum(`Result pages opened`)) %>%
+    dplyr::summarize(clickthroughs = sum(clickthroughs, na.rm = TRUE), serps = sum(`Result pages opened`, na.rm = TRUE)) %>%
     dplyr::right_join(threshold_data, by = c("date", "language", "project")) %>%
     dplyr::ungroup() %>%
     dplyr::transmute(
@@ -263,14 +263,14 @@
     dplyr::summarize(volume = sum(as.numeric(`Result pages opened`), na.rm = TRUE)) %>%
     dplyr::filter(volume > 0) %>%
     dplyr::arrange(desc(volume)) %>%
-    dplyr::mutate(prop = volume / sum(volume),
+    dplyr::mutate(prop = volume / sum(volume, na.rm = TRUE),
                   label = sprintf("%s (%.3f%%)", language, 100 * prop))
   available_projects_ctr <<-
augmented_clickthroughs_langproj %>% dplyr::group_by(project) %>% dplyr::summarize(volume = sum(as.numeric(`Result pages opened`), na.rm = TRUE)) %>% dplyr::filter(volume > 0) %>% dplyr::arrange(desc(volume)) %>% -dplyr::mutate(prop = volume / sum(volume), +dplyr::mutate(prop = volume / sum(volume, na.rm = TRUE), label = sprintf("%s (%.3f%%)", project, 100 * prop)) } @@ -301,14 +301,14 @@ dplyr::summarize(volume = sum(as.numeric(`search sessions`), na.rm = TRUE))
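The change above adds `na.rm = TRUE` to the `sum()` calls used as denominators. A minimal sketch of the effect, assuming a made-up `events` vector (hypothetical data, not taken from the dashboards): a single `NA` makes `sum()` return `NA`, which then propagates to every proportion.

```r
# Hypothetical event counts for one date; the NA stands in for a missing value.
events <- c(120, 30, NA, 50)

# Without na.rm the denominator is NA, so every proportion becomes NA.
prop_default <- round(events / sum(events) * 100, 2)

# With na.rm = TRUE the NA is skipped in the denominator only;
# the NA element itself still produces an NA proportion.
prop_na_rm <- round(events / sum(events, na.rm = TRUE) * 100, 2)

print(prop_default)
print(prop_na_rm)
```

In the modules touched by the diff, some pipelines already filter out `NA` events beforehand, so there the extra argument appears to act as a safeguard rather than a behaviour change.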
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Annotate sample rate change
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/370857 ) Change subject: Annotate sample rate change .. Annotate sample rate change On April 25th, we changed the sample rates for several projects, which results in changes in our search metrics. Bug: T172428 Change-Id: I709222fa4fad807762c23858e6c00c43d0747d9a --- M modules/desktop/events.R M modules/desktop/load_times.R M modules/desktop/paulscore.R M modules/page_visit_times.R M tab_documentation/desktop_events.md M tab_documentation/desktop_load.md M tab_documentation/paulscore_approx.html M tab_documentation/survival.md 8 files changed, 12 insertions(+), 6 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/modules/desktop/events.R b/modules/desktop/events.R index bcfd686..4f94e44 100644 --- a/modules/desktop/events.R +++ b/modules/desktop/events.R @@ -32,5 +32,6 @@ polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Desktop search events, by day") %>% dyRangeSelector %>% dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") %>% -dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom") }) diff --git a/modules/desktop/load_times.R b/modules/desktop/load_times.R index 50fb49a..a797c80 100644 --- a/modules/desktop/load_times.R +++ b/modules/desktop/load_times.R @@ -5,5 +5,6 @@ dyRangeSelector %>% dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% -dyEvent(as.Date("2017-06-15"), "B (sister search)", labelLoc = "bottom") +dyEvent(as.Date("2017-06-15"), "B (sister search)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom") }) diff --git a/modules/desktop/paulscore.R b/modules/desktop/paulscore.R index 144569b..b0ffb14 
100644 --- a/modules/desktop/paulscore.R +++ b/modules/desktop/paulscore.R @@ -10,7 +10,7 @@ polloi::make_dygraph(xlab = "Date", ylab = "PaulScore", title = "PaulScore for fulltext searches, by day", use_si = FALSE, group = "paulscore_approx") %>% dyRangeSelector %>% dyLegend(labelsDiv = "paulscore_approx_legend", show = "always") %>% -dyEvent(as.Date("2017-04-19"), "A (rates)", labelLoc = "bottom") +dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom") if (input$paulscore_relative) { dyOut <- dyAxis(dyOut, "y", axisLabelFormatter = "function(x) { return Math.round(100 * x, 3) + '%'; }", valueFormatter = "function(x) { return Math.round(100 * x, 3) + '%'; }") } @@ -29,7 +29,7 @@ polloi::make_dygraph(xlab = "Date", ylab = "PaulScore", title = "PaulScore for autocomplete searches, by day", use_si = FALSE, group = "paulscore_approx") %>% dyRangeSelector %>% dyLegend(labelsDiv = "paulscore_approx_legend", show = "always") %>% -dyEvent(as.Date("2017-04-19"), "A (rates)", labelLoc = "bottom") +dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom") if (input$paulscore_relative) { dyOut <- dyAxis(dyOut, "y", axisLabelFormatter = "function(x) { return Math.round(100 * x, 3) + '%'; }", valueFormatter = "function(x) { return Math.round(100 * x, 3) + '%'; }") } diff --git a/modules/page_visit_times.R b/modules/page_visit_times.R index 4a51a78..115cbb4 100644 --- a/modules/page_visit_times.R +++ b/modules/page_visit_times.R @@ -6,5 +6,6 @@ axisLabelWidth = 100, pixelsPerLabel = 80) %>% dyLegend(labelsDiv = "lethal_dose_plot_legend") %>% dyRangeSelector(fillColor = "", strokeColor = "") %>% -dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom") }) diff --git a/tab_documentation/desktop_events.md b/tab_documentation/desktop_events.md index 94c5f95..be4d9d8 100644 --- 
a/tab_documentation/desktop_events.md +++ b/tab_documentation/desktop_events.md @@ -21,6 +21,7 @@ * Data in late September/early October 2015 is unavailable due to another bug in EventLogging as a whole, which impacted data collection. * '__A__': we switched to using data from [Schema:TestSearchSatisfaction2](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2) instead of [Schema:Search](https://meta.wikimedia.org/wiki/Schema:Search) for Desktop event counts and load times on 12 July 2016. * '__R__': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of [our data retrieval and processing codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated to [Wikimedia
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Annotate sample rate change
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/370857 ) Change subject: Annotate sample rate change .. Annotate sample rate change On April 25th, we changed the sample rates for several projects, which results in changes in our search metrics. Bug: T172428 Change-Id: I709222fa4fad807762c23858e6c00c43d0747d9a --- M modules/desktop/events.R M modules/desktop/load_times.R M modules/desktop/paulscore.R M modules/page_visit_times.R M tab_documentation/desktop_events.md M tab_documentation/desktop_load.md M tab_documentation/paulscore_approx.html M tab_documentation/survival.md 8 files changed, 12 insertions(+), 6 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/57/370857/1 diff --git a/modules/desktop/events.R b/modules/desktop/events.R index bcfd686..4f94e44 100644 --- a/modules/desktop/events.R +++ b/modules/desktop/events.R @@ -32,5 +32,6 @@ polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Desktop search events, by day") %>% dyRangeSelector %>% dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") %>% -dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom") }) diff --git a/modules/desktop/load_times.R b/modules/desktop/load_times.R index 50fb49a..a797c80 100644 --- a/modules/desktop/load_times.R +++ b/modules/desktop/load_times.R @@ -5,5 +5,6 @@ dyRangeSelector %>% dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% -dyEvent(as.Date("2017-06-15"), "B (sister search)", labelLoc = "bottom") +dyEvent(as.Date("2017-06-15"), "B (sister search)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom") }) diff --git a/modules/desktop/paulscore.R 
b/modules/desktop/paulscore.R index 144569b..b0ffb14 100644 --- a/modules/desktop/paulscore.R +++ b/modules/desktop/paulscore.R @@ -10,7 +10,7 @@ polloi::make_dygraph(xlab = "Date", ylab = "PaulScore", title = "PaulScore for fulltext searches, by day", use_si = FALSE, group = "paulscore_approx") %>% dyRangeSelector %>% dyLegend(labelsDiv = "paulscore_approx_legend", show = "always") %>% -dyEvent(as.Date("2017-04-19"), "A (rates)", labelLoc = "bottom") +dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom") if (input$paulscore_relative) { dyOut <- dyAxis(dyOut, "y", axisLabelFormatter = "function(x) { return Math.round(100 * x, 3) + '%'; }", valueFormatter = "function(x) { return Math.round(100 * x, 3) + '%'; }") } @@ -29,7 +29,7 @@ polloi::make_dygraph(xlab = "Date", ylab = "PaulScore", title = "PaulScore for autocomplete searches, by day", use_si = FALSE, group = "paulscore_approx") %>% dyRangeSelector %>% dyLegend(labelsDiv = "paulscore_approx_legend", show = "always") %>% -dyEvent(as.Date("2017-04-19"), "A (rates)", labelLoc = "bottom") +dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom") if (input$paulscore_relative) { dyOut <- dyAxis(dyOut, "y", axisLabelFormatter = "function(x) { return Math.round(100 * x, 3) + '%'; }", valueFormatter = "function(x) { return Math.round(100 * x, 3) + '%'; }") } diff --git a/modules/page_visit_times.R b/modules/page_visit_times.R index 4a51a78..115cbb4 100644 --- a/modules/page_visit_times.R +++ b/modules/page_visit_times.R @@ -6,5 +6,6 @@ axisLabelWidth = 100, pixelsPerLabel = 80) %>% dyLegend(labelsDiv = "lethal_dose_plot_legend") %>% dyRangeSelector(fillColor = "", strokeColor = "") %>% -dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom") }) diff --git a/tab_documentation/desktop_events.md b/tab_documentation/desktop_events.md index 
94c5f95..be4d9d8 100644 --- a/tab_documentation/desktop_events.md +++ b/tab_documentation/desktop_events.md @@ -21,6 +21,7 @@ * Data in late September/early October 2015 is unavailable due to another bug in EventLogging as a whole, which impacted data collection. * '__A__': we switched to using data from [Schema:TestSearchSatisfaction2](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2) instead of [Schema:Search](https://meta.wikimedia.org/wiki/Schema:Search) for Desktop event counts and load times on 12 July 2016. * '__R__': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of [our data retrieval and processing codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated to [Wikimedia
[MediaWiki-commits] [Gerrit] wikimedia...polloi[master]: Fix a bug in function 'compress' when number is between 0 and 1
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/370756 )

Change subject: Fix a bug in function 'compress' when number is between 0 and 1
..

Fix a bug in function 'compress' when number is between 0 and 1

Change-Id: I25ef7d1332d0bacafbd07d7ee64c6c22e3dd7bbd
---
M R/maths.R
M tests/testthat/test-maths.R
2 files changed, 3 insertions(+), 2 deletions(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/polloi refs/changes/56/370756/1

diff --git a/R/maths.R b/R/maths.R
index aab5b7f..5cfd017 100644
--- a/R/maths.R
+++ b/R/maths.R
@@ -22,5 +22,6 @@
 #' @export
 compress <- function(x, round_by = 2) {
 div <- findInterval(x, c(1, 1e3, 1e6, 1e9, 1e12))
- return(paste0(round( x / 10 ^ (3 * (div - 1)), round_by), c("", "", "K", "M", "B", "T")[div + 1]))
+ return(paste0(round( x / 10 ^ (3 * ifelse(div - 1 < 0, 0, div - 1)), round_by),
+c("", "", "K", "M", "B", "T")[div + 1]))
 }
diff --git a/tests/testthat/test-maths.R b/tests/testthat/test-maths.R
index 9a84d57..2455e10 100644
--- a/tests/testthat/test-maths.R
+++ b/tests/testthat/test-maths.R
@@ -7,7 +7,7 @@
 })

 test_that("suffixes", {
- expect_equal(compress(c(0, 1, 10, 100)), c("0", "1", "10", "100"))
+ expect_equal(compress(c(0, 0.5, 1, 10, 100)), c("0", "0.5", "1", "10", "100"))
 expect_equal(compress(1.642e3, round_by = 1), "1.6K")
 expect_equal(compress(c(10, 1e6, 1e12, 1e9)), c("100K", "1M", "1T", "1B"))
 expect_equal(compress(c(0, 1, 1e6, 1e3)), c("0", "1", "1M", "1K"))

--
To view, visit https://gerrit.wikimedia.org/r/370756
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I25ef7d1332d0bacafbd07d7ee64c6c22e3dd7bbd
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/polloi
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
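To make the bug concrete, here is a small sketch that places the removed and the added version of `compress` side by side. The names `compress_old` and `compress_new` are introduced only for this illustration; the bodies follow the two sides of the diff above. For values in (0, 1), `findInterval` returns 0, so the old exponent is `3 * (0 - 1) = -3` and the value is multiplied by 1000.

```r
# Old behaviour, following the removed line of the diff.
compress_old <- function(x, round_by = 2) {
  div <- findInterval(x, c(1, 1e3, 1e6, 1e9, 1e12))
  paste0(round(x / 10 ^ (3 * (div - 1)), round_by),
         c("", "", "K", "M", "B", "T")[div + 1])
}

# New behaviour, following the added lines: the exponent is clamped at 0.
compress_new <- function(x, round_by = 2) {
  div <- findInterval(x, c(1, 1e3, 1e6, 1e9, 1e12))
  paste0(round(x / 10 ^ (3 * ifelse(div - 1 < 0, 0, div - 1)), round_by),
         c("", "", "K", "M", "B", "T")[div + 1])
}

x <- c(0, 0.5, 1, 10, 100)
print(findInterval(x, c(1, 1e3, 1e6, 1e9, 1e12)))  # interval index per value
print(compress_old(x))  # 0.5 is scaled up by 1000 here
print(compress_new(x))  # 0.5 stays 0.5
```

Zero is unaffected in both versions because 0 multiplied by 1000 is still 0, which may be why the original test did not catch the problem.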
[MediaWiki-commits] [Gerrit] wikimedia...prince[develop]: Get all country names with portal traffic from polloi
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/367459 ) Change subject: Get all country names with portal traffic from polloi .. Get all country names with portal traffic from polloi Bug: T167913 Change-Id: I781a1a11844df5599d8535df5e7ce440ea81428f --- M extras.R 1 file changed, 2 insertions(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/prince refs/changes/59/367459/1 diff --git a/extras.R b/extras.R index 69c502e..605a49d 100644 --- a/extras.R +++ b/extras.R @@ -29,7 +29,8 @@ ) # For selectizeInput in ui.R -all_country_names <- c("Zimbabwe", "Zambia", "Yemen", "Virgin Islands, British", "Viet Nam", "Venezuela, Bolivarian Republic of", "Uzbekistan", "U.S. (West)", "U.S. (South)", "U.S. (Pacific)", "U.S. (Other)", "U.S. (Northeast)", "U.S. (Midwest)", "Uruguay", "United Kingdom", "United Arab Emirates", "Ukraine", "Uganda", "Turkmenistan", "Turkey", "Tunisia", "Trinidad and Tobago", "Timor-Leste", "Thailand", "Tanzania, United Republic of", "Tajikistan", "Taiwan, Province of China", "Syrian Arab Republic", "Switzerland", "Sweden", "Suriname", "Sudan", "Sri Lanka", "Spain", "South Africa", "Somalia", "Slovenia", "Slovakia", "Singapore", "Seychelles", "Serbia", "Senegal", "Saudi Arabia", "Rwanda", "Russian Federation", "Romania", "Qatar", "Portugal", "Poland", "Philippines", "Peru", "Paraguay", "Papua New Guinea", "Panama", "Palestine, State of", "Pakistan", "Other", "Oman", "Norway", "Nigeria", "Niger", "Nicaragua", "New Zealand", "Netherlands", "Nepal", "Namibia", "Myanmar", "Mozambique", "Morocco", "Montenegro", "Mongolia", "Moldova, Republic of", "Mexico", "Mauritius", "Mauritania", "Martinique", "Mali", "Malaysia", "Malawi", "Madagascar", "Macedonia, Republic of", "Macao", "Luxembourg", "Lithuania", "Libya", "Lebanon", "Latvia", "Lao People's Democratic Republic", "Kyrgyzstan", "Kuwait", "Korea, Republic of", "Kenya", "Kazakhstan", "Jordan", "Jersey", "Japan", "Jamaica", "Italy", "Israel", 
"Ireland", "Iraq", "Iran, Islamic Republic of", "Indonesia", "India", "Iceland", "Hungary", "Hong Kong", "Honduras", "Haiti", "Guernsey", "Guatemala", "Greenland", "Greece", "Ghana", "Germany", "Georgia", "French Polynesia", "France", "Finland", "Fiji", "Ethiopia", "Estonia", "El Salvador", "Egypt", "Ecuador", "Dominican Republic", "Dominica", "Djibouti", "Denmark", "Czechia", "Cyprus", "Curacao", "Cuba", "Croatia", "Cote d'Ivoire", "Costa Rica", "Congo, The Democratic Republic of the", "Colombia", "China", "Chile", "Canada", "Cameroon", "Cambodia", "Burkina Faso", "Bulgaria", "British Indian Ocean Territory", "Brazil", "Botswana", "Bolivia, Plurinational State of", "Bhutan", "Benin", "Belgium", "Belarus", "Barbados", "Bangladesh", "Bahrain", "Azerbaijan", "Austria", "Australia", "Aruba", "Armenia", "Argentina", "Angola", "Algeria", "Albania", "Afghanistan", "Togo", "Malta", "Guadeloupe", "Gibraltar", "Gabon", "Faroe Islands", "Congo", "Cayman Islands", "Brunei Darussalam", "Bosnia and Herzegovina", "Bahamas", "Reunion", "Maldives", "Guyana", "Guinea", "Cabo Verde", "Burundi", "Antigua and Barbuda", "Swaziland", "Saint Lucia", "Isle of Man", "Gambia", "Central African Republic", "Belize", "Vanuatu", "Sierra Leone", "Saint Kitts and Nevis", "New Caledonia", "Lesotho", "Solomon Islands", "French Guiana", "Chad", "Bermuda", "Turks and Caicos Islands", "Liberia", "Comoros", "Bonaire, Sint Eustatius and Saba", "Aland Islands", "Grenada", "Mayotte", "Liechtenstein", "Samoa", "Equatorial Guinea", "Andorra", "South Sudan", "Saint Martin (French part)", "Saint Vincent and the Grenadines", "Holy See (Vatican City State)", "Guinea-Bissau", "Eritrea", "Saint Barthelemy", "Cook Islands", "Sint Maarten (Dutch part)", "Sao Tome and Principe", "Anguilla", "Monaco", "Kiribati", "Micronesia, Federated States of", "San Marino", "United States") +data(portal_regions, package = "polloi") +all_country_names <- portal_regions fill_out <- function(x, start_date, end_date, fill = 0) { temp 
<- dplyr::data_frame(date = seq(start_date, end_date, "day"))

--
To view, visit https://gerrit.wikimedia.org/r/367459
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I781a1a11844df5599d8535df5e7ce440ea81428f
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/prince
Gerrit-Branch: develop
Gerrit-Owner: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Use new functions in polloi to get geo data
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/367456 )

Change subject: Use new functions in polloi to get geo data
..

Use new functions in polloi to get geo data

Bug: T167913
Change-Id: I00cad391fa22399f583eb7791256c1eb25ba611b
---
M modules/metrics/portal/geographic_breakdown.R
1 file changed, 2 insertions(+), 14 deletions(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/56/367456/1

diff --git a/modules/metrics/portal/geographic_breakdown.R b/modules/metrics/portal/geographic_breakdown.R
index 606e6a0..3456a45 100644
--- a/modules/metrics/portal/geographic_breakdown.R
+++ b/modules/metrics/portal/geographic_breakdown.R
@@ -66,23 +66,11 @@
 } else {
 results$ts <- as.POSIXct(results$ts, format = "%Y%m%d%H%M%S")
 # Geography data that is common to both outputs:
- data("ISO_3166_1", package = "ISOcodes")
- # Remove accents because Reportupdater requires ASCII:
- ISO_3166_1$Name <- stringi::stri_trans_general(ISO_3166_1$Name, "Latin-ASCII")
- us_other_abb <- c("AS", "GU", "MP", "PR", "VI")
- us_other_mask <- match(us_other_abb, ISO_3166_1$Alpha_2)
- regions <- data.frame(abb = c(paste0("US:", c(as.character(state.abb), "DC")), us_other_abb),
-region = paste0("U.S. (", c(as.character(state.region), "South", rep("Other",5)), ")"),
-state = c(state.name, "District of Columbia", ISO_3166_1$Name[us_other_mask]),
-stringsAsFactors = FALSE)
- regions$region[regions$region == "U.S. (North Central)"] <- "U.S. (Midwest)"
- regions$region[c(state.division == "Pacific", rep(FALSE, 5))] <- "U.S. (Pacific)" # see https://phabricator.wikimedia.org/T136257#2399411
+ regions <- polloi::get_us_state()
 library(magrittr) # Required for piping
 if (opt$include_all) {
 # Generate all countries breakdown
-all_countries <- data.frame(abb = c(regions$abb, ISO_3166_1$Alpha_2[-us_other_mask]),
-name = c(regions$region, ISO_3166_1$Name[-us_other_mask]),
-stringsAsFactors = FALSE)
+all_countries <- polloi::get_country_state()
 data_w_countryname <- results %>%
 dplyr::mutate(country = ifelse(country %in% all_countries$abb, country, "Other")) %>%
 dplyr::left_join(all_countries, by = c("country" = "abb")) %>%

--
To view, visit https://gerrit.wikimedia.org/r/367456
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I00cad391fa22399f583eb7791256c1eb25ba611b
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Fix sister search traffic query
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/365190 )

Change subject: Fix sister search traffic query
..

Fix sister search traffic query

- Creates a new category in the 'language' column for French and Catalan
  since they have their own sister search sidebar that shows up in addition to ours.
- Makes some adjustments for how SERPs are detected.

We'll need to clear out the current data and do a full recount with this query.
Since the backfill is only from 2017-06-01, we should be OK as the original data
should still be present.

Bug: T164854, T170183
Change-Id: Ic21aeac43891ebb1b65696fe8e907bb959a4d7b7
---
M modules/metrics/search/sister_search_traffic
1 file changed, 11 insertions(+), 6 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/modules/metrics/search/sister_search_traffic b/modules/metrics/search/sister_search_traffic
index 3dc2852..76c091c 100755
--- a/modules/metrics/search/sister_search_traffic
+++ b/modules/metrics/search/sister_search_traffic
@@ -12,8 +12,12 @@
 WHEN 'species' THEN 'wikispecies'
 ELSE normalized_host.project_class
 END AS project,
-IF(normalized_host.project IN('commons', 'meta', 'simple', 'incubator', 'species'), '',
- IF(normalized_host.project = 'en', 'English', 'Other languages')) AS language,
+CASE WHEN normalized_host.project IN('commons', 'meta', 'simple', 'incubator', 'species') THEN ''
+ WHEN normalized_host.project = 'en' THEN 'English'
+ -- frwiki and cawiki use homebrew sister search that shows up in addition to ours
+ WHEN normalized_host.project IN('ca', 'fr') THEN 'French and Catalan'
+ ELSE 'Other languages'
+END AS language,
 -- flag for pageviews that are search results pages (e.g. if user clicked to see more results from a sister project):
 (
 page_id IS NULL
@@ -22,8 +26,8 @@
 OR (
 uri_path = '/w/index.php'
 AND (
-uri_query RLIKE '^\?search\='
-OR INSTR(uri_query, '?title=Special:Search=') > 0
+PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'search') IS NOT NULL
+OR PARSE_URL(CONCAT('http://', uri_host, uri_path, uri_query), 'QUERY', 'searchToken') IS NOT NULL
 )
 )
 )
@@ -34,10 +38,11 @@
 AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) >= '$1'
 AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) < '$2'
 AND is_pageview
+-- only those that have been referred by a search results page on a wikipedia:
 AND referer_class = 'internal'
 AND (
- INSTR(referer, '/w/index.php?search=') > 0
- OR INSTR(referer, '/wiki/Special:Search?search=') > 0
+ PARSE_URL(referer, 'QUERY', 'search') IS NOT NULL
+ OR PARSE_URL(referer, 'QUERY', 'searchToken') IS NOT NULL
 )
 -- warning: comparing uri_host = PARSE_URL(referer, 'HOST') would mark 'en.m.wikipedia.org' as a sister of 'en.wikipedia.org'
 AND normalize_host(PARSE_URL(referer, 'HOST')).project_class = 'wikipedia'

--
To view, visit https://gerrit.wikimedia.org/r/365190
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic21aeac43891ebb1b65696fe8e907bb959a4d7b7
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Bearloga
Gerrit-Reviewer: Chelsyx
Gerrit-Reviewer: EBernhardson
Gerrit-Reviewer: HaeB
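The query change above swaps fixed substring checks on the referer for checks on parsed query parameters. The following R sketch only approximates that idea outside Hive: `has_query_param` is a hypothetical helper written for this illustration (it is not part of golden or polloi, and it tests for the presence of a key rather than reproducing `PARSE_URL` exactly), and the referer URLs are made up.

```r
# Hypothetical helper: TRUE if `url` carries a query parameter named `key`.
has_query_param <- function(url, key) {
  # Keep only the text after the first "?" and drop any "#fragment".
  query <- sub("#.*$", "", sub("^[^?]*\\??", "", url))
  if (!nzchar(query)) return(FALSE)
  pairs <- strsplit(query, "&", fixed = TRUE)[[1]]
  keys <- sub("=.*$", "", pairs)
  key %in% keys
}

# Made-up referers covering a few URL shapes.
referers <- c(
  "https://en.wikipedia.org/w/index.php?search=cats",
  "https://en.wikipedia.org/w/index.php?title=Special:Search&search=cats",
  "https://en.wikipedia.org/wiki/Special:Search?searchToken=abc123",
  "https://en.wikipedia.org/wiki/Cat"
)

# Old style: fixed substrings, which depend on parameter order.
old_match <- grepl("/w/index.php?search=", referers, fixed = TRUE) |
  grepl("/wiki/Special:Search?search=", referers, fixed = TRUE)

# New style: look for the parameter wherever it appears in the query string.
new_match <- vapply(referers, function(u) {
  has_query_param(u, "search") || has_query_param(u, "searchToken")
}, logical(1), USE.NAMES = FALSE)

print(data.frame(old_match, new_match))
```

Under these assumptions the substring approach misses referers where `search` is not the first parameter or where only `searchToken` is present, which seems to be the kind of case the merged query is meant to pick up.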
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Annotate PaulScore decrease as a result of sampling rate change
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/364333 )

Change subject: Annotate PaulScore decrease as a result of sampling rate change
..

Annotate PaulScore decrease as a result of sampling rate change

Bug: T168466
Change-Id: I9bf40ce1804ee679c24f664900db55a48e88f5e2
---
M modules/desktop/paulscore.R
M tab_documentation/paulscore_approx.html
2 files changed, 5 insertions(+), 2 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/modules/desktop/paulscore.R b/modules/desktop/paulscore.R
index ecfc79e..144569b 100644
--- a/modules/desktop/paulscore.R
+++ b/modules/desktop/paulscore.R
@@ -9,7 +9,8 @@
 polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_paulscore_approx)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "PaulScore", title = "PaulScore for fulltext searches, by day", use_si = FALSE, group = "paulscore_approx") %>%
 dyRangeSelector %>%
-dyLegend(labelsDiv = "paulscore_approx_legend", show = "always")
+dyLegend(labelsDiv = "paulscore_approx_legend", show = "always") %>%
+dyEvent(as.Date("2017-04-19"), "A (rates)", labelLoc = "bottom")
 if (input$paulscore_relative) {
 dyOut <- dyAxis(dyOut, "y", axisLabelFormatter = "function(x) { return Math.round(100 * x, 3) + '%'; }", valueFormatter = "function(x) { return Math.round(100 * x, 3) + '%'; }")
 }
@@ -27,7 +28,8 @@
 polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_paulscore_approx)) %>%
 polloi::make_dygraph(xlab = "Date", ylab = "PaulScore", title = "PaulScore for autocomplete searches, by day", use_si = FALSE, group = "paulscore_approx") %>%
 dyRangeSelector %>%
-dyLegend(labelsDiv = "paulscore_approx_legend", show = "always")
+dyLegend(labelsDiv = "paulscore_approx_legend", show = "always") %>%
+dyEvent(as.Date("2017-04-19"), "A (rates)", labelLoc = "bottom")
 if (input$paulscore_relative) {
 dyOut <- dyAxis(dyOut, "y", axisLabelFormatter = "function(x) { return Math.round(100 * x, 3) + '%'; }", valueFormatter = "function(x) { return Math.round(100 * x, 3) + '%'; }")
 }
diff --git a/tab_documentation/paulscore_approx.html b/tab_documentation/paulscore_approx.html
index 0a5b441..b73d61b 100644
--- a/tab_documentation/paulscore_approx.html
+++ b/tab_documentation/paulscore_approx.html
@@ -25,6 +25,7 @@
 'R': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of <a href="https://phabricator.wikimedia.org/diffusion/WDGO/">our data retrieval and processing codebase</a> that we migrated to <a href="https://www.mediawiki.org/wiki/Analytics">Wikimedia Analytics</a>' <a href="https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater">Reportupdater</a> infrastructure. See <a href="https://phabricator.wikimedia.org/T150915">T150915</a> for more details.
+ 'A': on 2017-04-19 we changed the rates at which users are put into event logging (see <a href="https://phabricator.wikimedia.org/T163273" title="Phabricator ticket: Adjust search satisfaction sampling rate">T163273</a>. Specifically, we decreased the rate on English Wikipedia ("EnWiki") and increased it everywhere else, and since EnWiki generally has higher PaulScore than other projects, we effectively lowered the overall PaulScore by lessening EnWiki's contribution. See <a href="https://phabricator.wikimedia.org/T168466" title="Phabricator ticket: Investigate PaulScores for late April and May for full-text searches">T168466</a> for more details.
 Questions, bug reports, and feature suggestions

--
To view, visit https://gerrit.wikimedia.org/r/364333
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I9bf40ce1804ee679c24f664900db55a48e88f5e2
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/rainbow
Gerrit-Branch: develop
Gerrit-Owner: Bearloga
Gerrit-Reviewer: Chelsyx
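The 'A' annotation above attributes the PaulScore drop to a shift in the mix of sampled sessions rather than to a change in search quality. A minimal sketch of that weighted-average effect follows; all scores and session counts are hypothetical values chosen for illustration, not measurements from the dashboards.

```r
# Hypothetical per-group PaulScores; illustrative values only.
paulscore <- c(enwiki = 0.30, other = 0.20)

# Hypothetical numbers of sampled sessions before and after the rate change:
# EnWiki's share of the sample shrinks, everyone else's grows.
sessions_before <- c(enwiki = 8000, other = 2000)
sessions_after  <- c(enwiki = 2000, other = 8000)

# The overall score is a session-weighted mean of the group scores.
overall_before <- weighted.mean(paulscore, sessions_before)
overall_after  <- weighted.mean(paulscore, sessions_after)

print(c(before = overall_before, after = overall_after))
```

Neither group's score changes in this sketch, yet the overall figure falls because the higher-scoring group carries less weight after the change.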
[MediaWiki-commits] [Gerrit] wikimedia...polloi[master]: Add capitalization function
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/363867 ) Change subject: Add capitalization function .. Add capitalization function Also adds correct licensing info for code from Stack Overflow Change-Id: I936b59b728e61fad07102c7e79a94d0754784607 --- M DESCRIPTION M NAMESPACE M NEWS.md M R/manipulate.R M R/maths.R A man/capitalize_first_letter.Rd M man/compress.Rd M tests/testthat/test-manipulation.R 8 files changed, 60 insertions(+), 3 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/DESCRIPTION b/DESCRIPTION index 94d9e10..5b3828b 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,12 +1,13 @@ Package: polloi Type: Package Title: Common Functionality for Wikimedia Dashboards -Version: 0.2.0 -Date: 2017-06-28 +Version: 0.2.1 +Date: 2017-07-07 Authors@R: c( person("Mikhail", "Popov", email = "mikh...@wikimedia.org", role = c("aut", "cre")), person("Chelsy", "Xie", email = "c...@wikimedia.org", role = "aut"), -person("Oliver", "Keyes", role = "aut", comment = "No longer employed at the Foundation") +person("Oliver", "Keyes", role = "aut", comment = "No longer employed at the Foundation"), +person("Andrie", "de Vries", role = "ctb", comment = "Capitalization code from StackOverflow") ) Description: This package contains common functionality for all of the Wikimedia Foundation's Shiny Dashboards. diff --git a/NAMESPACE b/NAMESPACE index 1b06a9b..3542a1d 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -1,6 +1,7 @@ # Generated by roxygen2: do not edit by hand export(automata_select) +export(capitalize_first_letter) export(cbind_fill) export(check_past_week) export(check_yesterday) diff --git a/NEWS.md b/NEWS.md index 713fb16..4a41b2b 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,3 +1,7 @@ +polloi 0.2.1 + +- Adds `capitalize_first_letter` + polloi 0.2.0 - Adds unit tests and lint checking ([T145445](https://phabricator.wikimedia.org/T145445)). 
diff --git a/R/manipulate.R b/R/manipulate.R index e9e9fb1..01294b0 100644 --- a/R/manipulate.R +++ b/R/manipulate.R @@ -117,3 +117,16 @@ } return(no_set) } + +#' @title Capitalize First Letter Of Every Word +#' @description Capitalizes the first letter of every word. +#' @details This function is made available under CC-BY-SA 3.0 +#' @param x character vector +#' @author [Andrie de Vries](https://stackoverflow.com/users/602276/andrie) +#' @source \url{https://stackoverflow.com/questions/6364783/capitalize-the-first-letter-of-both-words-in-a-two-word-string} +#' @export +capitalize_first_letter <- function(x) { + return(vapply(strsplit(x, " "), function(s) { +return(paste0(toupper(substring(s, 1, 1)), substring(s, 2), collapse = " ")) + }, "")) +} diff --git a/R/maths.R b/R/maths.R index aab5b7f..3308e91 100644 --- a/R/maths.R +++ b/R/maths.R @@ -16,8 +16,11 @@ #' @title Convert Numeric Values to use SI suffixes #' @description takes a numeric vector (e.g. 1200, 130) and converts it to #' use SI suffixes (e.g. 
1.2K, 1.3M) +#' @details This function is made available under CC-BY-SA 3.0 #' @param x a vector of numeric or integer values #' @param round_by how many digits to round the resulting numbers by +#' @author Original code: [42-](https://stackoverflow.com/users/1855677/42); +#' improvement: Mikhail #' @references \url{https://stackoverflow.com/questions/28159936/formatting-large-currency-or-dollar-values-to-millions-billions/} #' @export compress <- function(x, round_by = 2) { diff --git a/man/capitalize_first_letter.Rd b/man/capitalize_first_letter.Rd new file mode 100644 index 000..184da6e --- /dev/null +++ b/man/capitalize_first_letter.Rd @@ -0,0 +1,23 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/manipulate.R +\name{capitalize_first_letter} +\alias{capitalize_first_letter} +\title{Capitalize First Letter Of Every Word} +\source{ +\url{https://stackoverflow.com/questions/6364783/capitalize-the-first-letter-of-both-words-in-a-two-word-string} +} +\usage{ +capitalize_first_letter(x) +} +\arguments{ +\item{x}{character vector} +} +\description{ +Capitalizes the first letter of every word. +} +\details{ +This function is made available under CC-BY-SA 3.0 +} +\author{ +\href{https://stackoverflow.com/users/602276/andrie}{Andrie de Vries} +} diff --git a/man/compress.Rd b/man/compress.Rd index fb16556..85826c6 100644 --- a/man/compress.Rd +++ b/man/compress.Rd @@ -15,6 +15,13 @@ takes a numeric vector (e.g. 1200, 130) and converts it to use SI suffixes (e.g. 1.2K, 1.3M) } +\details{ +This function is made available under CC-BY-SA 3.0 +} \references{ \url{https://stackoverflow.com/questions/28159936/formatting-large-currency-or-dollar-values-to-millions-billions/} } +\author{ +Original code: \href{https://stackoverflow.com/users/1855677/42}{42-}; +improvement: Mikhail +} diff
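For reference, the `capitalize_first_letter()` helper added in this change can be tried on its own. The function body below is copied from the diff above; the input strings are made up for illustration and are not part of the package's tests.

```r
# Function body as it appears in R/manipulate.R in the diff above.
capitalize_first_letter <- function(x) {
  return(vapply(strsplit(x, " "), function(s) {
    # Upper-case the first character of each word, then glue the words back.
    return(paste0(toupper(substring(s, 1, 1)), substring(s, 2), collapse = " "))
  }, ""))
}

# Illustrative input (assumed, not from the patch):
labels <- capitalize_first_letter(c("hello world", "sister search traffic"))
print(labels)
```

Because `vapply()` is given `""` as its template, the result should always be a character vector with one element per input string.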
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Sister search traffic changes per Deb's feedback
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/363752 ) Change subject: Sister search traffic changes per Deb's feedback .. Sister search traffic changes per Deb's feedback - Corrects referenced ticket - Changes title and summary - Adds more notes and explanations - Changes the UI/UX to a "choose-your-own-split-by-combo" adventure using checkboxes - Adds annotations for sister search on KPI::LoadTimes and Desktop::LoadTimes dashboards because it's relevant Bug: T164854 Change-Id: I5c1e4db0b2b92ad3b28d74b8113a511704946326 --- M server.R M tab_documentation/desktop_load.md M tab_documentation/kpi_load_time.md M tab_documentation/sister_search_traffic.md M ui.R A www/js4checkbox.js 6 files changed, 67 insertions(+), 36 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/server.R b/server.R index 49c429b..972cd74 100644 --- a/server.R +++ b/server.R @@ -83,7 +83,8 @@ polloi::make_dygraph(xlab = "Date", ylab = "Load time (ms)", title = "Desktop load times, by day", use_si = FALSE) %>% dyRangeSelector %>% dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") %>% - dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>% + dyEvent(as.Date("2017-06-15"), "B (sister search)", labelLoc = "bottom") }) output$paulscore_approx_plot_fulltext <- renderDygraph({ @@ -363,34 +364,37 @@ # Sister Search output$sister_search_traffic_plot <- renderDygraph({ -switch( - input$sister_search_traffic_split, - "none" = { -sister_search_traffic %>% +# Code that prepares a custom data.frame 'sst' +# that will then be processed in a generic way: +if (length(input$sister_search_traffic_split) == 0) { + sst <- sister_search_traffic %>% dplyr::mutate(split = "Sister search traffic") - }, - "project" = { -sister_search_traffic %>% - dplyr::rename(split = project) - }, - "destination" = { -sister_search_traffic %>% - 
dplyr::mutate(split = dplyr::if_else(is_serp, "Search results page", "Article")) - }, - "language" = { -sister_search_traffic %>% - dplyr::filter(project != "wikimedia commons", !is.na(language)) %>% - dplyr::mutate(split = language) - }, - "access_method" = { -sister_search_traffic %>% - dplyr::mutate(split = access_method) +} else { + split_by <- head(input$sister_search_traffic_split, 2) + sst <- sister_search_traffic + if ("language" %in% split_by) { +sst <- dplyr::filter(sst, !is.na(language)) } -) %>% + if ("destination" %in% split_by) { +sst <- dplyr::mutate(sst, destination = dplyr::if_else(is_serp, "Search results page", "Article")) + } + if (length(split_by) == 1) { +sst$split <- sst[[split_by[1]]] + } else { +sst$split <- paste0(sst[[split_by[1]]], " (", sst[[split_by[2]]], ")") + } +} +# Code that works on the prepared dataet: +sst %>% dplyr::group_by(date, split) %>% dplyr::summarize(pageviews = sum(pageviews)) %>% tidyr::spread(split, pageviews, fill = 0) %>% - polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_sister_search_traffic_plot)) %>% + { +# Reorder columns according to the last observed values: +cols <- unlist(polloi::safe_tail(., 1)[, -1]) +.[, c(1, order(cols, decreasing = TRUE) + 1)] + } %>% + polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_sister_search_traffic_plot), rename = FALSE) %>% polloi::make_dygraph(xlab = "Date", ylab = "Pageviews", title = "Traffic to sister projects from Wikipedia SERPs") %>% dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = polloi::custom_axis_formatter, axisLabelWidth = 100, pixelsPerLabel = 80) %>% @@ -735,7 +739,8 @@ dyCSS(css = system.file("custom.css", package = "polloi")) %>% dyRangeSelector %>% dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") %>% - dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")) + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", 
labelLoc = "bottom") %>% + dyEvent(as.Date("2017-06-15"), "B (sister search)", labelLoc = "bottom")) }) output$kpi_zero_results_series <- renderDygraph({ smooth_level <- input$smoothing_kpi_zero_results diff --git a/tab_documentation/desktop_load.md b/tab_documentation/desktop_load.md index be3d643..dcd55c0 100644 --- a/tab_documentation/desktop_load.md +++ b/tab_documentation/desktop_load.md @@ -10,7
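The "choose-your-own-split-by-combo" logic in the server.R hunk above can be sketched without Shiny or dplyr. This is a minimal base-R illustration, not the dashboard's code: `make_split_label()` and the `traffic` data frame are hypothetical names, and only the labelling rules (no split, one split, two splits combined as "first (second)") follow the diff.

```r
# Hypothetical helper mirroring the labelling rules in the diff above.
make_split_label <- function(df, split_by) {
  split_by <- head(split_by, 2)  # at most two checkboxes are honoured
  if (length(split_by) == 0) {
    return(rep("Sister search traffic", nrow(df)))
  }
  if (length(split_by) == 1) {
    return(as.character(df[[split_by[1]]]))
  }
  paste0(df[[split_by[1]]], " (", df[[split_by[2]]], ")")
}

# Made-up rows standing in for the sister_search_traffic dataset.
traffic <- data.frame(
  project = c("wiktionary", "wikisource"),
  access_method = c("desktop", "mobile web"),
  stringsAsFactors = FALSE
)

none <- make_split_label(traffic, character(0))
one <- make_split_label(traffic, "project")
two <- make_split_label(traffic, c("project", "access_method", "language"))
print(two)
```

Building a single `split` column first keeps the downstream group/summarize/spread steps generic, which appears to be the motivation for the refactor from `switch()`.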
[MediaWiki-commits] [Gerrit] wikimedia...polloi[master]: Fix spline smoothing and add tests
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/362107 ) Change subject: Fix spline smoothing and add tests .. Fix spline smoothing and add tests - This patch fixes a bug wherein spline smoothing was broken. - This patch also adds a bunch of unit tests (via testthat) because if we had those earlier, we would have known that the previous patch actually broke spline smoothing. - There is now an example dataset (wdqs_usage) that is used in examples and some of the unit tests. - This patch also finally fixes the issue with compress() wherein it returned really weird results if the input vector contained a 0. - Also! There is lint checking now! It is included as a unit test. - While I was at it, I fixed a bunch of stylistic issues (spacing, line lengths, single vs double quotes) and documentation issues (e.g. missing descriptions that `R CMD check` would yell about). Bug: T169125, T153856 Change-Id: I5752d0a528bffb2bee6186d49efd4a751551cb95 --- M .Rbuildignore A .lintr M DESCRIPTION M NAMESPACE M NEWS.md M R/check_notify.R M R/data.R M R/dygraphs.R M R/manipulate.R M R/maths.R M R/reading.R M R/shiny.R M R/smoothing.R M R/utils.R A data/wdqs_usage.rda M man/automata_select.Rd M man/cbind_fill.Rd M man/compress.Rd M man/cond_color.Rd M man/cond_icon.Rd A man/get_sample_data.Rd M man/make_dygraph.Rd M man/parse_wikiid.Rd M man/percent_change.Rd M man/portal_regions.Rd M man/read_dataset.Rd M man/smart_palette.Rd M man/smooth_select.Rd M man/smoother.Rd M man/subset_by_date_range.Rd M man/timeframe_daterange.Rd M man/timeframe_select.Rd M man/update_prefixes.Rd M man/update_projects.Rd A man/wdqs_usage.Rd M polloi.Rproj A tests/testthat.R A tests/testthat/test-manipulation.R A tests/testthat/test-maths.R A tests/testthat/test-smoothing.R A tests/testthat/test-syntax.R 41 files changed, 412 insertions(+), 163 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/.Rbuildignore b/.Rbuildignore index
8ed3933..a7ed3c6 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -1,4 +1,5 @@ ^.*\.Rproj$ ^\.Rproj\.user$ -.gitreview +^\.gitreview$ ^CONDUCT\.md$ +^\.lintr diff --git a/.lintr b/.lintr new file mode 100644 index 000..0c6cdb9 --- /dev/null +++ b/.lintr @@ -0,0 +1,4 @@ +linters: with_defaults(line_length_linter(120), object_usage_linter = NULL, closed_curly_linter = NULL, open_curly_linter = NULL) +exclude: "# Exclude Linting" +exclude_start: "# Begin Exclude Linting" +exclude_end: "# End Exclude Linting" diff --git a/DESCRIPTION b/DESCRIPTION index 933d209..94d9e10 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,8 +1,8 @@ Package: polloi Type: Package Title: Common Functionality for Wikimedia Dashboards -Version: 0.1.9 -Date: 2017-06-26 +Version: 0.2.0 +Date: 2017-06-28 Authors@R: c( person("Mikhail", "Popov", email = "mikh...@wikimedia.org", role = c("aut", "cre")), person("Chelsy", "Xie", email = "c...@wikimedia.org", role = "aut"), @@ -36,7 +36,9 @@ zoo Suggests: datasets, -ISOcodes +ISOcodes, +lintr, +testthat LazyData: TRUE Roxygen: list(markdown = TRUE) RoxygenNote: 6.0.1 diff --git a/NAMESPACE b/NAMESPACE index 80881aa..1b06a9b 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -44,7 +44,6 @@ importFrom(lubridate,ymd) importFrom(magrittr,"%>%") importFrom(magrittr,set_names) -importFrom(readr,read_delim) importFrom(rvest,html_nodes) importFrom(rvest,html_table) importFrom(shiny,icon) diff --git a/NEWS.md b/NEWS.md index 21243cb..713fb16 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,12 +1,20 @@ +polloi 0.2.0 + +- Adds unit tests and lint checking ([T145445](https://phabricator.wikimedia.org/T145445)). +- Adds an example dataset (`wdqs_usage`) that is used for running examples and tests. +- Fixes problem with spline smoothing ([T169125](https://phabricator.wikimedia.org/T169125)). +- Fixes a whole bunch of stylistic issues (removes lints). +- Fixes a bug with `compress()` wherein it would yield weird results if the input vector included a 0. 
+ polloi 0.1.9 -- Adds geography datasets and functions ([T167913](https://phabricator.wikimedia.org/T167913)) +- Adds geography datasets and functions ([T167913](https://phabricator.wikimedia.org/T167913)). polloi 0.1.8 -- Updates dataset of prefixes -- Changes path to download datasets from -- Uses latest roxygen with markdown support +- Updates dataset of prefixes. +- Changes path to download datasets from. +- Uses latest roxygen with markdown support. polloi 0.1.7 diff --git a/R/check_notify.R b/R/check_notify.R index 1e5c840..c6ff3d0 100644 --- a/R/check_notify.R +++ b/R/check_notify.R @@ -16,7 +16,7 @@ # e.g. label = "desktop events" yesterday_date <- Sys.Date() - 1 if (!(yesterday_date %in% dataset$date)) { -return(notificationItem(text = paste("No", label," from yesterday."), +
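The `.Rbuildignore` hunk in this change replaces the bare `.gitreview` entry with the anchored `^\.gitreview$`. A small sketch of why the anchoring matters, assuming (as documented in "Writing R Extensions") that these entries are treated as Perl-like, case-insensitive regular expressions matched against file paths. The second path is invented purely to show the over-match.

```r
# Candidate paths; "notes/my-gitreview.txt" is a made-up file name.
paths <- c(".gitreview", "notes/my-gitreview.txt")

# Old entry: the unescaped dot matches any character, and nothing is anchored.
loose <- grepl(".gitreview", paths, perl = TRUE, ignore.case = TRUE)

# New entry: literal dot, anchored at both ends.
strict <- grepl("^\\.gitreview$", paths, perl = TRUE, ignore.case = TRUE)

print(data.frame(path = paths, loose = loose, strict = strict))
```

By the same reasoning, the new `^\.lintr` entry is anchored only at the start, so it would presumably also exclude a file such as `.lintr.bak` at the package root.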
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Add sister search traffic
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/361902 ) Change subject: Add sister search traffic .. Add sister search traffic - Adds a "Sister Search" section with a "Traffic" subsection Bug: T164854 Change-Id: Ic89b51f3b89b25b50387389ef84ba9496423be4b --- M server.R A tab_documentation/sister_search_traffic.md M ui.R M utils.R 4 files changed, 91 insertions(+), 13 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/server.R b/server.R index 0de1586..92e94c1 100644 --- a/server.R +++ b/server.R @@ -18,20 +18,22 @@ read_desktop() progress$set(message = "Downloading apps data", value = 0.1) read_apps() -progress$set(message = "Downloading mobile web data", value = 0.3) +progress$set(message = "Downloading mobile web data", value = 0.2) read_web() -progress$set(message = "Downloading API usage data", value = 0.4) +progress$set(message = "Downloading API usage data", value = 0.3) read_api() -progress$set(message = "Downloading zero results data", value = 0.5) +progress$set(message = "Downloading zero results data", value = 0.4) read_failures() -progress$set(message = "Downloading engagement data", value = 0.6) +progress$set(message = "Downloading engagement data", value = 0.5) read_augmented_clickthrough() -progress$set(message = "Downloading language-project engagement data", value = 0.7) +progress$set(message = "Downloading language-project engagement data", value = 0.6) read_augmented_clickthrough_langproj() -progress$set(message = "Downloading survival data", value = 0.8) +progress$set(message = "Downloading survival data", value = 0.7) read_lethal_dose() -progress$set(message = "Downloading PaulScore data", value = 0.9) +progress$set(message = "Downloading PaulScore data", value = 0.8) read_paul_score() +progress$set(message = "Downloading sister search data", value = 0.9) +read_sister_search() progress$set(message = "Finished downloading datasets", value = 1) existing_date <<- Sys.Date() 
progress$close() @@ -359,6 +361,40 @@ dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) + # Sister Search + output$sister_search_traffic_plot <- renderDygraph({ +switch( + input$sister_search_traffic_split, + "project" = { +sister_search_traffic %>% + dplyr::rename(split = project) + }, + "destination" = { +sister_search_traffic %>% + dplyr::mutate(split = dplyr::if_else(is_serp, "Search results page", "Article")) + }, + "language" = { +sister_search_traffic %>% + dplyr::filter(project != "wikimedia commons", !is.na(language)) %>% + dplyr::mutate(split = language) + }, + "access_method" = { +sister_search_traffic %>% + dplyr::mutate(split = access_method) + } +) %>% + dplyr::group_by(date, split) %>% + dplyr::summarize(pageviews = sum(pageviews)) %>% + tidyr::spread(split, pageviews, fill = 0) %>% + polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_sister_search_traffic_plot)) %>% + polloi::make_dygraph(xlab = "Date", ylab = "Pageviews", title = "Traffic to sister projects from Wikipedia SERPs") %>% + dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = polloi::custom_axis_formatter, + axisLabelWidth = 100, pixelsPerLabel = 80) %>% + dyLegend(labelsDiv = "sister_search_traffic_plot_legend") %>% + dyRangeSelector(fillColor = "", strokeColor = "") %>% + dyEvent(as.Date("2017-06-15"), "A (deployed)", labelLoc = "bottom") + }) + # Survival output$lethal_dose_plot <- renderDygraph({ user_page_visit_dataset %>% diff --git a/tab_documentation/sister_search_traffic.md b/tab_documentation/sister_search_traffic.md new file mode 100644 index 000..6258a6b --- /dev/null +++ b/tab_documentation/sister_search_traffic.md @@ -0,0 +1,28 @@ +Sister search traffic +=== +Sister (cross-wiki) search is a feature that adds results from other projects to a sidebar on the search engine results page (SERP). 
For example: if there are additional results found, users are shown images from Wikimedia Commons, definitions from Wiktionary, and results from works on Wikisource. See [T146667](https://phabricator.wikimedia.org/T146667) for more details. + +Notes +- +Some communities (e.g. Italian Wikipedia) developed their own cross-wiki search results sidebars, which is why we see some sister traffic before the deployment of the sister search feature across all Wikipedias. + +__\*__ Users can click on a cross-wiki result or view all the results at the sister project + +__†__ This excludes the language-less Wikimedia Commons + +Outages and inaccuracies +-- +*
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Add traffic from sister search
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/361592 ) Change subject: Add traffic from sister search .. Add traffic from sister search Bug: T164854 Change-Id: I7632d68b560049a145d1bccf54cf12abf9095582 --- M docs/README.md M modules/metrics/search/config.yaml A modules/metrics/search/sister_search_traffic 3 files changed, 67 insertions(+), 2 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/docs/README.md b/docs/README.md index 40b1c28..4f595f0 100644 --- a/docs/README.md +++ b/docs/README.md @@ -8,7 +8,7 @@ infrastructure. These datasets provide the metrics that are used by [Discovery's Dashboards](https://discovery.wmflabs.org/) -Last updated on 31 May 2017 +Last updated on 26 June 2017 Daily Metrics - @@ -137,6 +137,10 @@ - **cirrus\_langproj\_breakdown\_with\_automata.tsv**: Zero results and total searches broken down by language-project pairs (e.g. German Wikiquote ZRR vs. French Wikibooks ZRR) +- **sister\_search\_traffic.tsv**: Traffic to various wikis from +Wikipedia search results pages; broken up by language, destination +type (SERP vs not), and access method (desktop vs mobile web); +exlcudes known automata wdqs/ - diff --git a/modules/metrics/search/config.yaml b/modules/metrics/search/config.yaml index cffd6fc..46d9768 100644 --- a/modules/metrics/search/config.yaml +++ b/modules/metrics/search/config.yaml @@ -155,4 +155,10 @@ starts: 2016-11-01 funnel: true max_data_points: 30 -type: script \ No newline at end of file +type: script +sister_search_traffic: +description: Traffic to various wikis from Wikipedia search results pages; broken up by language, destination type (SERP vs not), and access method (desktop vs mobile web); exlcudes known automata +granularity: days +starts: 2017-06-01 +funnel: true +type: script diff --git a/modules/metrics/search/sister_search_traffic b/modules/metrics/search/sister_search_traffic new file mode 100755 index 000..541463a --- /dev/null +++ 
b/modules/metrics/search/sister_search_traffic @@ -0,0 +1,55 @@ +#!/bin/bash + +hive -S -e "USE wmf; +ADD JAR hdfs:///wmf/refinery/current/artifacts/refinery-hive.jar; +CREATE TEMPORARY FUNCTION normalize_host AS 'org.wikimedia.analytics.refinery.hive.GetHostPropertiesUDF'; +WITH sister_search_pvs AS ( + SELECT +TO_DATE(ts) AS `date`, access_method, +CASE normalized_host.project + WHEN 'commons' THEN 'wikimedia commons' + WHEN 'simple' THEN CONCAT('simple ', normalized_host.project_class) + WHEN 'species' THEN 'wikispecies' + ELSE normalized_host.project_class +END AS project, +IF(normalized_host.project IN('commons', 'meta', 'simple', 'incubator', 'species'), '', + IF(normalized_host.project = 'en', 'English', 'Other languages')) AS language, +-- flag for pageviews that are search results pages (e.g. if user clicked to see more results from a sister project): +( + page_id IS NULL + AND ( +uri_path = '/wiki/Special:Search' +OR ( + uri_path = '/w/index.php' + AND ( +uri_query RLIKE '^\?search\=' +OR INSTR(uri_query, '?title=Special:Search=') > 0 + ) +) + ) +) AS is_serp + FROM webrequest + WHERE +webrequest_source = 'text' +AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) >= '$1' +AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) < '$2' +AND is_pageview +AND referer_class = 'internal' +AND ( + INSTR(referer, '/w/index.php?search=') > 0 + OR INSTR(referer, '/wiki/Special:Search?search=') > 0 +) +-- warning: comparing uri_host = PARSE_URL(referer, 'HOST') would mark 'en.m.wikipedia.org' as a sister of 'en.wikipedia.org' +AND normalize_host(PARSE_URL(referer, 'HOST')).project_class = 'wikipedia' +AND normalize_host(PARSE_URL(referer, 'HOST')).project_class != normalized_host.project_class +AND NOT normalized_host.project_class IN('mediawiki', 'wikimediafoundation', 'wikidata') +AND NOT normalized_host.project IN('meta', 'incubator') +-- keep commons.wikimedia.org and species.wikimedia.org: +AND NOT (normalized_host.project_class = 
'wikimedia' AND NOT (normalized_host.project IN('commons', 'species'))) +) +SELECT `date`, access_method, project, language, IF(is_serp, 'TRUE', 'FALSE') AS is_serp, COUNT(1) AS pageviews +FROM sister_search_pvs +GROUP BY `date`, access_method, project, language, IF(is_serp, 'TRUE', 'FALSE') +ORDER BY `date`, access_method, project, language, is_serp +LIMIT 1; +" 2> /dev/null | grep -v parquet.hadoop | grep -v WARN:
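The `is_serp` flag in the Hive query above combines three conditions: no `page_id`, and either the `Special:Search` path or an `index.php` request whose query string looks like a search. The following is a rough R transcription for experimenting with the rule offline. It is only a sketch: `is_serp()` here is a hypothetical R function, SQL `NULL` is assumed to map to `NA`, and the sample requests are invented.

```r
# Hypothetical R rendering of the is_serp expression from the query above.
is_serp <- function(page_id, uri_path, uri_query) {
  is.na(page_id) & (
    uri_path == "/wiki/Special:Search" |
      (uri_path == "/w/index.php" &
         # RLIKE '^\?search\=' becomes a regular-expression match
         (grepl("^\\?search=", uri_query) |
            # INSTR(...) > 0 becomes a literal substring test
            grepl("?title=Special:Search=", uri_query, fixed = TRUE)))
  )
}

# Invented sample requests (not real webrequest data).
requests <- data.frame(
  page_id = c(NA, NA, 42, NA),
  uri_path = c("/wiki/Special:Search", "/w/index.php", "/wiki/Cat", "/w/index.php"),
  uri_query = c("?search=cat", "?search=cat", "", "?action=history"),
  stringsAsFactors = FALSE
)

flags <- is_serp(requests$page_id, requests$uri_path, requests$uri_query)
print(cbind(requests, is_serp = flags))
```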
[MediaWiki-commits] [Gerrit] wikimedia...polloi[master]: The following datasets are included:
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/360797 ) Change subject: The following datasets are included: .. The following datasets are included: - Countries and Regions with Traffic to Wikipedia.org - U.S. States and Regions - All Countries and U.S. States Bug: T167913 Change-Id: I3c111f75bb827bb4a296b4148bee16d608844d26 --- M NAMESPACE M R/data.R M R/utils.R A inst/extdata/all_countries_us_states.csv A inst/extdata/portal_geo_names.RData A inst/extdata/us_state_region.csv A man/get_ctr_state.Rd A man/get_portal_geo.Rd A man/get_us_state.Rd A man/update_portal_geo.Rd 10 files changed, 504 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/polloi refs/changes/97/360797/1 diff --git a/NAMESPACE b/NAMESPACE index 9f924d8..8ccbbf9 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -9,9 +9,12 @@ export(cond_icon) export(custom_axis_formatter) export(data_select) +export(get_ctr_state) export(get_langproj) +export(get_portal_geo) export(get_prefixes) export(get_projects) +export(get_us_state) export(half) export(make_dygraph) export(na_box) @@ -27,6 +30,7 @@ export(time_frame_range) export(timeframe_daterange) export(timeframe_select) +export(update_portal_geo) export(update_prefixes) export(update_projects) import(httr) diff --git a/R/data.R b/R/data.R index 288de6e..d329cbb 100644 --- a/R/data.R +++ b/R/data.R @@ -71,3 +71,55 @@ rbind(projects, .) return(result) } + +#' @title Countries and Regions with Traffic to Wikipedia.org +#' @description Attach `portal_geo_names` to search path +#' +#' @format `portal_geo_names` is a character vector containing about 230 +#' country/region names with traffic to Wikipedia.org (portal). +#' +#' @source \url{https://analytics.wikimedia.org/datasets/discovery/metrics/portal/all_country_data.tsv} +#' @seealso [update_portal_geo] +#' @export +get_portal_geo <- function() { + attach(system.file("extdata/portal_geo_names.RData", package = "polloi")) +} + +#' @title U.S. 
States and Regions +#' @description Returns a dataset containing all U.S. states' and territories' +#' names, abbreviations and regions. +#' +#' @format A data frame with 56 rows and 3 variables: +#' \describe{ +#' \item{abb}{The abbreviations of U.S. states and territories.} +#' \item{region}{The regions of U.S. states and territories. See +#'https://phabricator.wikimedia.org/T136257#2399411.} +#' \item{state}{The names of U.S. states and territories.} +#' } +#' +#' @source `ISO_3166_1` from package `ISOcodes`; `state.name`, `state.abb` and +#' `state.region` from package `datasets`; see +#' \url{https://phabricator.wikimedia.org/T136257#2399411} for U.S. regions. +#' @importFrom readr read_csv +#' @export +get_us_state <- function() { + return(readr::read_csv(system.file("extdata/us_state_region.csv", package = "polloi"))) +} + +#' @title All Countries and U.S. States +#' @description Returns a dataset containing all countries' and U.S. states' +#' names and abbreviations. +#' +#' @format A data frame with 300 rows and 2 variables: +#' \describe{ +#' \item{abb}{The abbreviations of all countries and U.S. states.} +#' \item{name}{The names of all countries and U.S. states.} +#' } +#' +#' @source `ISO_3166_1` from package `ISOcodes`; `state.name`, `state.abb` and +#' `state.region` from package `datasets`. 
+#' @importFrom readr read_csv +#' @export +get_ctr_state <- function() { + return(readr::read_csv(system.file("extdata/all_countries_us_states.csv", package = "polloi"))) +} diff --git a/R/utils.R b/R/utils.R index 1a81423..9c84d02 100644 --- a/R/utils.R +++ b/R/utils.R @@ -61,3 +61,17 @@ result <- left_join(data.frame(wikiid = x, stringsAsFactors = FALSE), temp, by = "wikiid") return(result[, c('language', 'project')]) } + +#' @title Update Country and Region Names with Traffic to Wikipedia.org +#' @description Get unique country and region names from the `country` column of +#' \url{https://analytics.wikimedia.org/datasets/discovery/metrics/portal/all_country_data.tsv}. +#' @export +update_portal_geo <- function() { + file_location <- system.file("extdata/portal_geo_names.RData", package = "polloi") + + portal_geo_names <- read_dataset("discovery/metrics/portal/all_country_data.tsv") + portal_geo_names <- sort(c(unique(portal_geo_names$country), "United States")) + + save(portal_geo_names, file = file_location) + return(invisible()) +} diff --git a/inst/extdata/all_countries_us_states.csv b/inst/extdata/all_countries_us_states.csv new file mode 100644 index 000..85ceea1 --- /dev/null +++ b/inst/extdata/all_countries_us_states.csv @@ -0,0 +1,301 @@ +abb,name +US:AL,U.S. (South) +US:AK,U.S. (Pacific) +US:AZ,U.S. (West) +US:AR,U.S. (South) +US:CA,U.S. (Pacific) +US:CO,U.S. (West) +US:CT,U.S. (Northeast) +US:DE,U.S. (South) +US:FL,U.S. (South) +US:GA,U.S. (South) +US:HI,U.S. (Pacific)
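The new `get_ctr_state()` returns the bundled CSV as a two-column table (`abb`, `name`). A minimal sketch of how such a table might be used as a lookup, using base R and the first few rows visible in the diff above instead of the installed package. `name_for()` is a hypothetical helper, not part of polloi.

```r
# First rows of all_countries_us_states.csv as shown in the diff above.
csv_text <- "abb,name
US:AL,U.S. (South)
US:AK,U.S. (Pacific)
US:AZ,U.S. (West)"

lookup <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Hypothetical helper: translate abbreviations to names, NA when unknown.
name_for <- function(abb) {
  lookup$name[match(abb, lookup$abb)]
}

found <- name_for(c("US:AK", "US:AL", "XX"))
print(found)
```

With polloi installed, `lookup <- polloi::get_ctr_state()` would presumably provide the full table in place of the inline text.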
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[develop]: Add licensing info
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/360591 ) Change subject: Add licensing info .. Add licensing info Bug: T167930 Change-Id: Ib01224fd1a952eeaab9cc378ca0e16e7ea3845d3 --- M .gitreview M CHANGELOG.md A LICENSE.md M README.md M server.R M tab_documentation/app_events.md M tab_documentation/app_load.md D tab_documentation/build_a_plot.md M tab_documentation/click_position.md M tab_documentation/desktop_events.md M tab_documentation/desktop_load.md M tab_documentation/failure_breakdown.md M tab_documentation/failure_rate.md M tab_documentation/failure_suggests.md M tab_documentation/fulltext_basic.md M tab_documentation/geo_basic.md M tab_documentation/invoke_source.md M tab_documentation/kpi_api_usage.md M tab_documentation/kpi_augmented_clickthroughs.md M tab_documentation/kpi_load_time.md M tab_documentation/kpi_zero_results.md M tab_documentation/kpis_summary.md M tab_documentation/langproj_breakdown.md M tab_documentation/language_basic.md M tab_documentation/mobile_events.md M tab_documentation/mobile_load.md M tab_documentation/monthly_metrics.md M tab_documentation/open_basic.md M tab_documentation/paulscore_approx.html M tab_documentation/prefix_basic.md M tab_documentation/survival.md M ui.R 32 files changed, 185 insertions(+), 179 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/.gitreview b/.gitreview index 3f659b0..6ab16d0 100644 --- a/.gitreview +++ b/.gitreview @@ -2,5 +2,5 @@ host=gerrit.wikimedia.org port=29418 project=wikimedia/discovery/rainbow.git -defaultbranch=master +defaultbranch=develop defaultrebase=0 diff --git a/CHANGELOG.md b/CHANGELOG.md index ab4260b..bda0aab 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,9 @@ All notable changes to this project will be documented in this file. 
+## 2017/06/20 +- Added licensing info ([T167930](https://phabricator.wikimedia.org/T167930)) + ## 2017/05/01 - Added a language-project breakdown of additional metrics ([T150410](https://phabricator.wikimedia.org/T150410)) diff --git a/LICENSE.md b/LICENSE.md new file mode 100644 index 000..7355a50 --- /dev/null +++ b/LICENSE.md @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2017 Wikimedia Foundation + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/README.md b/README.md index d1de5d9..ff65401 100644 --- a/README.md +++ b/README.md @@ -17,4 +17,4 @@ shiny::runApp(launch.browser = 0) ``` -Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms. +Please note that this project is licensed under [MIT License](LICENSE.md) and released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms. 
[Wikimedia technical spaces code of conduct](https://www.mediawiki.org/wiki/Special:MyLanguage/Code_of_Conduct) also applies. diff --git a/server.R b/server.R index 54c0886..0de1586 100644 --- a/server.R +++ b/server.R @@ -383,7 +383,7 @@ temp <- dates %>% as.character("%e") %>% as.numeric %>% - sapply(toOrdinal::toOrdinal) %>% + vapply(toOrdinal::toOrdinal, "") %>% sub("([a-z]{2})", "\\1", .) %>% paste0(as.character(dates, "%A, %b "), .) }, @@ -392,7 +392,7 @@ temp <- dates %>% as.character("%e") %>% as.numeric %>% - sapply(toOrdinal::toOrdinal) %>% + vapply(toOrdinal::toOrdinal, "") %>% sub("([a-z]{2})", "\\1", .) %>% paste0(as.character(dates, "%b "), .) %>% { @@ -404,7 +404,7 @@ temp <- dates %>% as.character("%e") %>% as.numeric %>% - sapply(toOrdinal::toOrdinal) %>% +
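The server.R hunks in this change swap `sapply(toOrdinal::toOrdinal)` for `vapply(toOrdinal::toOrdinal, "")`. A short sketch of the difference, using a hypothetical stand-in `to_ordinal()` so the example does not depend on the toOrdinal package: `vapply()` checks every result against the `""` template and keeps the return type stable even for empty input.

```r
# Hypothetical stand-in for toOrdinal::toOrdinal (English suffixes only).
to_ordinal <- function(n) {
  suffix <- if (n %% 100 %in% 11:13) {
    "th"
  } else {
    switch(as.character(n %% 10), "1" = "st", "2" = "nd", "3" = "rd", "th")
  }
  paste0(n, suffix)
}

days <- c(1, 2, 3, 11, 22)
ordinals <- vapply(days, to_ordinal, "")
print(ordinals)

# With empty input, sapply() falls back to an empty list,
# while vapply() still returns a character vector.
empty_sapply <- sapply(numeric(0), to_ordinal)
empty_vapply <- vapply(numeric(0), to_ordinal, "")
```

The stable return type is likely what lint checks or reviewers were after here, since the result is piped straight into `sub()` and `paste0()`.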
[MediaWiki-commits] [Gerrit] wikimedia...wetzel[develop]: Add licensing info
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/360590 ) Change subject: Add licensing info .. Add licensing info Bug: T167930 Change-Id: I36ba0e9e5395d87380efda3bcde0ff7d22542efd --- M .gitreview M CHANGELOG.md A LICENSE.md M README.md M server.R M tab_documentation/geo_breakdown.md M tab_documentation/geohack_usage.md M tab_documentation/tiles_summary.md M tab_documentation/tiles_total_by_style.md M tab_documentation/tiles_total_by_zoom.md M tab_documentation/tiles_users_by_style.md M tab_documentation/unique_users.md M tab_documentation/wikiminiatlas_usage.md M tab_documentation/wikivoyage_usage.md M tab_documentation/wiwosm_usage.md M ui.R 16 files changed, 88 insertions(+), 63 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/.gitreview b/.gitreview index 42d7c49..45a84e3 100644 --- a/.gitreview +++ b/.gitreview @@ -2,4 +2,4 @@ host=gerrit.wikimedia.org port=29418 project=wikimedia/discovery/wetzel.git -defaultbranch=master +defaultbranch=develop diff --git a/CHANGELOG.md b/CHANGELOG.md index 410f696..208e2ab 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,6 +1,9 @@ # Change Log (Patch Notes) All notable changes to this project will be documented in this file. 
+## 2017/06/20 +- Added licensing info ([T167930](https://phabricator.wikimedia.org/T167930)) + ## 2017/02/02 - Updated to work with new datasets generated by Reportupdater-based golden ([T150915](https://phabricator.wikimedia.org/T150915)) diff --git a/LICENSE.md b/LICENSE.md new file mode 100644 index 000..7355a50 --- /dev/null +++ b/LICENSE.md @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2017 Wikimedia Foundation + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/README.md b/README.md index 1cd09ef..eb30184 100644 --- a/README.md +++ b/README.md @@ -17,4 +17,4 @@ shiny::runApp(launch.browser = 0) ``` -Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms. +Please note that this project is licensed under [MIT License](LICENSE.md) and released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms. 
[Wikimedia technical spaces code of conduct](https://www.mediawiki.org/wiki/Special:MyLanguage/Code_of_Conduct) also applies. diff --git a/server.R b/server.R index d24e621..a659724 100644 --- a/server.R +++ b/server.R @@ -91,6 +91,7 @@ }) output$tiles_zoom_series <- renderDygraph({ +req(input$zoom_level_selector) polloi::data_select( input$tile_zoom_automata_check, new_tiles_automata, diff --git a/tab_documentation/geo_breakdown.md b/tab_documentation/geo_breakdown.md index b3e4c4c..c2f6adf 100644 --- a/tab_documentation/geo_breakdown.md +++ b/tab_documentation/geo_breakdown.md @@ -10,12 +10,12 @@ Questions, bug reports, and feature suggestions -- -For technical, non-bug questions, [email Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question). If you experience a bug or notice something wrong or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) in the Discovery board or [email Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question). +For technical, non-bug questions, [email Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question) or [Chelsy](mailto:c...@wikimedia.org?subject=Dashboard%20Question). If you experience a bug or notice something wrong or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) in the Discovery board or [email Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question). - - Link to this dashboard: - http://discovery.wmflabs.org/maps/#geo_breakdown;> -
[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[develop]: Add licensing info
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/360589 ) Change subject: Add licensing info .. Add licensing info Bug: T167930 Change-Id: I9db26c507f5825b780b7584c309afd07375d7920 --- M .gitreview A LICENSE.md M README.md A tab_documentation/traffic_by_engine.md D tab_documentation/traffic_byengine.md M tab_documentation/traffic_summary.md M ui.R 7 files changed, 58 insertions(+), 37 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/.gitreview b/.gitreview index be475f0..1e92d5f 100644 --- a/.gitreview +++ b/.gitreview @@ -2,4 +2,4 @@ host=gerrit.wikimedia.org port=29418 project=wikimedia/discovery/wonderbolt.git -defaultbranch=master +defaultbranch=develop diff --git a/LICENSE.md b/LICENSE.md new file mode 100644 index 000..7355a50 --- /dev/null +++ b/LICENSE.md @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2017 Wikimedia Foundation + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
diff --git a/README.md b/README.md index 6687c11..d29419b 100644 --- a/README.md +++ b/README.md @@ -17,4 +17,4 @@ shiny::runApp(launch.browser = 0) ``` -Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms. +Please note that this project is licensed under [MIT License](LICENSE.md) and released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms. [Wikimedia technical spaces code of conduct](https://www.mediawiki.org/wiki/Special:MyLanguage/Code_of_Conduct) also applies. diff --git a/tab_documentation/traffic_by_engine.md b/tab_documentation/traffic_by_engine.md new file mode 100644 index 000..8269810 --- /dev/null +++ b/tab_documentation/traffic_by_engine.md @@ -0,0 +1,27 @@ +Traffic from external search engines, broken down +=== + +A key metric in understanding the role external search engines play in Wikipedia's (and Wikimedia's) readership and content discovery processes is a very direct one - how many pageviews we get from them. This can be discovered very simply by looking at our request logs. + +This dashboard simply breaks down the [summary data](https://discovery.wmflabs.org/external/#traffic_summary) to investigate how much traffic is coming from each search engine, individually. As you can see, Google dominates, which is why we've included the option of log-scaling +the traffic. + +General trends +-- + +Outages and notes +-- +* '__A__': on 2016-08-25 we patched the UDF to also look for [Duck Duck Go](https://duckduckgo.com) when it processes referer data. That referreral data was deleted and backfilled from 26 June 2016. See [T143287](https://phabricator.wikimedia.org/T143287) for more details. 
+* '__R__': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of [our data retrieval and processing codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' [Reportupdater infrastructure](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater). See [T150915](https://phabricator.wikimedia.org/T150915) for more details. + +Questions, bug reports, and feature suggestions +-- +For technical, non-bug questions, [email Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question) or [Chelsy](mailto:c...@wikimedia.org?subject=Dashboard%20Question). If you experience a bug or notice something wrong or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) in the Discovery board or [email Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question). + + + + Link to this dashboard:
[MediaWiki-commits] [Gerrit] wikimedia...prince[develop]: Add licensing info
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/360588 ) Change subject: Add licensing info .. Add licensing info Bug: T167930 Change-Id: I5a42359c682de98dfa8d26231bd4d7cd43a25d9c --- M .gitreview A LICENSE.md M README.md M tab_documentation/action_breakdown.md M tab_documentation/applinks.md M tab_documentation/browsers.md M tab_documentation/clickthrough_rate.md M tab_documentation/dwelltime.md M tab_documentation/first_visit.md M tab_documentation/first_visit_geo.md M tab_documentation/geography.md M tab_documentation/languages_summary.md M tab_documentation/languages_visited.md M tab_documentation/last_action_geo.md M tab_documentation/most_common.md M tab_documentation/most_common_geo.md M tab_documentation/pageviews.md M tab_documentation/referers_byengine.md M tab_documentation/referers_summary.md M tab_documentation/sisproj.md M tab_documentation/traffic_ctr_geo.md M ui.R M utils.R 23 files changed, 140 insertions(+), 121 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/.gitreview b/.gitreview index dfa799f..02def08 100644 --- a/.gitreview +++ b/.gitreview @@ -2,4 +2,4 @@ host=gerrit.wikimedia.org port=29418 project=wikimedia/discovery/prince.git -defaultbranch=master +defaultbranch=develop diff --git a/LICENSE.md b/LICENSE.md new file mode 100644 index 000..7355a50 --- /dev/null +++ b/LICENSE.md @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2017 Wikimedia Foundation + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all 
+copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/README.md b/README.md index 5fd2f2b..a93e802 100644 --- a/README.md +++ b/README.md @@ -17,4 +17,4 @@ shiny::runApp(launch.browser = 0) ``` -Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms. +Please note that this project is licensed under [MIT License](LICENSE.md) and released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms. [Wikimedia technical spaces code of conduct](https://www.mediawiki.org/wiki/Special:MyLanguage/Code_of_Conduct) also applies. diff --git a/tab_documentation/action_breakdown.md b/tab_documentation/action_breakdown.md index 6ec8197..77cc787 100644 --- a/tab_documentation/action_breakdown.md +++ b/tab_documentation/action_breakdown.md @@ -26,12 +26,12 @@ Questions, bug reports, and feature suggestions -- -For technical, non-bug questions, [email Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question). If you experience a bug or notice something wrong or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) in the Discovery board or [email Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question). 
+For technical, non-bug questions, [email Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question) or [Chelsy](mailto:c...@wikimedia.org?subject=Dashboard%20Question). If you experience a bug or notice something wrong or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) in the Discovery board or [email Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question). - - Link to this dashboard: - http://discovery.wmflabs.org/portal/#action_breakdown;> -http://discovery.wmflabs.org/portal/#action_breakdown - + + Link to this dashboard: https://discovery.wmflabs.org/portal/#action_breakdown;>https://discovery.wmflabs.org/portal/#action_breakdown + | Page is available under https://creativecommons.org/licenses/by-sa/3.0/; title="Creative Commons Attribution-ShareAlike License">CC-BY-SA 3.0 + | https://phabricator.wikimedia.org/diffusion/WDPR/; title="Wikipedia.org Portal Dashboard source code repository">Code is licensed under
[MediaWiki-commits] [Gerrit] wikimedia...twilightsparql[develop]: Add licensing info
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/360586 ) Change subject: Add licensing info .. Add licensing info Bug: T167930 Change-Id: Iee990ba2506eea10b1cda38b060c94802b6e48fe --- M .gitreview M CHANGELOG.md A LICENSE.md M README.md M tab_documentation/wdqs_usage.md M tab_documentation/wdqs_visits.md M ui.R 7 files changed, 39 insertions(+), 15 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/.gitreview b/.gitreview index 14640ea..a51a85c 100644 --- a/.gitreview +++ b/.gitreview @@ -2,5 +2,5 @@ host=gerrit.wikimedia.org port=29418 project=wikimedia/discovery/twilightsparql.git -defaultbranch=master +defaultbranch=develop defaultrebase=0 diff --git a/CHANGELOG.md b/CHANGELOG.md index 96ea811..84ae0e2 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,9 @@ All notable changes to this project will be documented in this file. +## 2017/06/20 +- Added licensing info ([T167930](https://phabricator.wikimedia.org/T167930)) + ## 2017/02/02 - Updated to work with new datasets generated by Reportupdater-based golden ([T150915](https://phabricator.wikimedia.org/T150915)) - Added LDF endpoint usage ([T153936](https://phabricator.wikimedia.org/T153936)) diff --git a/LICENSE.md b/LICENSE.md new file mode 100644 index 000..7355a50 --- /dev/null +++ b/LICENSE.md @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2017 Wikimedia Foundation + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/README.md b/README.md index 5ce05f6..f1c36f4 100644 --- a/README.md +++ b/README.md @@ -17,4 +17,4 @@ shiny::runApp(launch.browser = 0) ``` -Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms. +Please note that this project is licensed under [MIT License](LICENSE.md) and released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms. [Wikimedia technical spaces code of conduct](https://www.mediawiki.org/wiki/Special:MyLanguage/Code_of_Conduct) also applies. diff --git a/tab_documentation/wdqs_usage.md b/tab_documentation/wdqs_usage.md index 627ac1d..1dfb2da 100644 --- a/tab_documentation/wdqs_usage.md +++ b/tab_documentation/wdqs_usage.md @@ -15,12 +15,12 @@ Questions, bug reports, and feature suggestions -- -For technical, non-bug questions, [email Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question). If you experience a bug or notice something wrong or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) in the Discovery board or [email Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question). +For technical, non-bug questions, [email Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question) or [Chelsy](mailto:c...@wikimedia.org?subject=Dashboard%20Question). 
If you experience a bug or notice something wrong or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) in the Discovery board or [email Deb](mailto:d...@wikimedia.org?subject=Dashboard%20Question). - - Link to this dashboard: - http://discovery.wmflabs.org/wdqs/#wdqs_usage;> -http://discovery.wmflabs.org/wdqs/#wdqs_usage - + + Link to this dashboard: https://discovery.wmflabs.org/wdqs/#endpoint_usage;>https://discovery.wmflabs.org/wdqs/#endpoint_usage + | Page is available under https://creativecommons.org/licenses/by-sa/3.0/; title="Creative Commons Attribution-ShareAlike License">CC-BY-SA 3.0 + | https://phabricator.wikimedia.org/diffusion/WDTS/; title="WDQS Dashboard source code repository">Code is licensed under https://phabricator.wikimedia.org/diffusion/WDTS/browse/master/LICENSE.md; title="MIT License">MIT + | Part of
[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Fix desktop/mobile web mix-up
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/359026 ) Change subject: Fix desktop/mobile web mix-up .. Fix desktop/mobile web mix-up Previous version assumed a specific order of access method when renaming the elements of the list. This uses relative naming. Also changes the x-axis formatting so it displays day names. Bug: T167850 Change-Id: I818553b66e7be0e960da37477549d9ad60e9d58d --- M server.R M utils.R 2 files changed, 11 insertions(+), 10 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/server.R b/server.R index 528fb21..f9b4162 100644 --- a/server.R +++ b/server.R @@ -37,6 +37,7 @@ polloi::make_dygraph(xlab = "Date", ylab = ifelse(input$platform_traffic_summary_prop, "Pageview Share (%)", "Pageviews"), title = "Sources of page views (e.g. search engines and internal referers)") %>% dyLegend(labelsDiv = "traffic_summary_legend", show = "always", showZeroValues = FALSE) %>% + dyAxis("x", axisLabelFormatter = polloi::custom_axis_formatter, axisLabelWidth = 70) %>% dyAxis("y", logscale = input$platform_traffic_summary_log) %>% dyRangeSelector(fillColor = ifelse(input$platform_traffic_summary_prop, "", "#A7B1C4"), strokeColor = ifelse(input$platform_traffic_summary_prop, "", "#808FAB"), @@ -64,6 +65,7 @@ polloi::make_dygraph(xlab = "Date", ylab = ifelse(input$platform_traffic_bysearch_prop, "Pageview Share (%)", "Pageviews"), title = "Pageviews from external search engines, broken down by engine") %>% dyLegend(labelsDiv = "traffic_bysearch_legend", show = "always", showZeroValues = FALSE) %>% + dyAxis("x", axisLabelFormatter = polloi::custom_axis_formatter, axisLabelWidth = 70) %>% dyAxis("y", logscale = input$platform_traffic_bysearch_log) %>% dyRangeSelector(fillColor = ifelse(input$platform_traffic_bysearch_prop, "", "#A7B1C4"), strokeColor = ifelse(input$platform_traffic_bysearch_prop, "", "#808FAB"), diff --git a/utils.R b/utils.R index 2169808..4b3de60 100644 --- a/utils.R +++ 
b/utils.R @@ -26,13 +26,13 @@ lapply(dplyr::select_, .dots = list(quote(-access_method))) # fixes smoothing interim$total <- data[, j = list(pageviews = sum(pageviews)), by = c("date", "referer_class")] - names(interim) <- c("Desktop", "Mobile Web", "All") + names(interim) <- c("desktop" = "Desktop", "mobile web" = "Mobile Web", "total" = "All")[names(interim)] summary_traffic_data <<- lapply(interim, tidyr::spread, key = "referer_class", value = "pageviews", fill = NA) # Proportion summary_traffic_data_prop <<- interim %>% lapply(dplyr::group_by, date) %>% -lapply(dplyr::mutate, pageviews = 100*pageviews/sum(pageviews)) %>% +lapply(dplyr::mutate, pageviews = 100 * pageviews / sum(pageviews)) %>% lapply(tidyr::spread, key = "referer_class", value = "pageviews", fill = NA) # Generate per-engine values @@ -44,7 +44,7 @@ interim$total <- data[is_search == TRUE, j = list(pageviews = sum(pageviews)), by = c("date", "search_engine")] - names(interim) <- c("Desktop", "Mobile Web", "All") + names(interim) <- c("desktop" = "Desktop", "mobile web" = "Mobile Web", "total" = "All")[names(interim)] bysearch_traffic_data <<- interim %>% lapply(dplyr::filter_, .dots = list(quote(search_engine != "Not referred by search"))) %>% lapply(tidyr::spread, key = "search_engine", value = "pageviews", fill = NA) @@ -52,7 +52,7 @@ # Proportion bysearch_traffic_data_prop <<- interim %>% lapply(dplyr::group_by, date) %>% -lapply(dplyr::mutate, pageviews = 100*pageviews/sum(pageviews)) %>% +lapply(dplyr::mutate, pageviews = 100 * pageviews / sum(pageviews)) %>% lapply(dplyr::filter_, .dots = list(quote(search_engine != "Not referred by search"))) %>% lapply(tidyr::spread, key = "search_engine", value = "pageviews", fill = NA) @@ -72,8 +72,7 @@ `None (direct)` = "none", `Search engine` = "external (search engine)", `External (but not search engine)` = "external", -Internal = "internal", -Unknown = "unknown" +Internal = "internal" ) ) %>% data.table::as.data.table() @@ -85,13 +84,13 @@ 
lapply(dplyr::select_, .dots = list(quote(-access_method))) # fixes smoothing interim$total <- data[, j = list(pageviews = sum(pageviews)), by = c("date", "referer_class")] - names(interim) <- c("Desktop", "Mobile Web", "All") + names(interim) <- c("desktop" = "Desktop", "mobile web" = "Mobile Web", "total" = "All")[names(interim)] summary_traffic_nonbot_data <<- lapply(interim, tidyr::spread, key = "referer_class", value = "pageviews", fill = NA) #
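For readers following the `names(interim)` fix in the change above, here is a minimal, self-contained sketch of why lookup-based renaming is safer than positional renaming. The toy `interim` list, its element order, and the `display_names` helper are assumptions made for illustration; only the name mapping itself comes from the diff.

```r
# Hypothetical list whose element order differs from what positional
# renaming would assume (illustration only, not the dashboard's real data).
interim <- list(
  "mobile web" = data.frame(pageviews = 10),
  "desktop"    = data.frame(pageviews = 20),
  "total"      = data.frame(pageviews = 30)
)

# Positional renaming mislabels elements when the order is unexpected:
# the "mobile web" data ends up under the "Desktop" label.
positional <- setNames(interim, c("Desktop", "Mobile Web", "All"))

# Relative renaming: look up each display name by the element's current name.
display_names <- c("desktop" = "Desktop", "mobile web" = "Mobile Web", "total" = "All")
names(interim) <- unname(display_names[names(interim)])
```

Because each new name is looked up from the element's current name, the labels stay correct however the list happens to be ordered.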
[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Change the way ui.R get date range and country list
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/358890 ) Change subject: Change the way ui.R get date range and country list .. Change the way ui.R get date range and country list Previously, I asked ui.R to download a public dataset before rendering the dashboard, which is problematic. Change to storing the country name list in extras.R for selectizeInput in ui.R Change-Id: Id564a1f7371e14932ea72da03630463b6e9c348e --- M extras.R M ui.R 2 files changed, 16 insertions(+), 13 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/prince refs/changes/90/358890/1 diff --git a/extras.R b/extras.R index a3a96fa..69c502e 100644 --- a/extras.R +++ b/extras.R @@ -28,6 +28,9 @@ "Modern") # Android 4+ ) +# For selectizeInput in ui.R +all_country_names <- c("Zimbabwe", "Zambia", "Yemen", "Virgin Islands, British", "Viet Nam", "Venezuela, Bolivarian Republic of", "Uzbekistan", "U.S. (West)", "U.S. (South)", "U.S. (Pacific)", "U.S. (Other)", "U.S. (Northeast)", "U.S. 
(Midwest)", "Uruguay", "United Kingdom", "United Arab Emirates", "Ukraine", "Uganda", "Turkmenistan", "Turkey", "Tunisia", "Trinidad and Tobago", "Timor-Leste", "Thailand", "Tanzania, United Republic of", "Tajikistan", "Taiwan, Province of China", "Syrian Arab Republic", "Switzerland", "Sweden", "Suriname", "Sudan", "Sri Lanka", "Spain", "South Africa", "Somalia", "Slovenia", "Slovakia", "Singapore", "Seychelles", "Serbia", "Senegal", "Saudi Arabia", "Rwanda", "Russian Federation", "Romania", "Qatar", "Portugal", "Poland", "Philippines", "Peru", "Paraguay", "Papua New Guinea", "Panama", "Palestine, State of", "Pakistan", "Other", "Oman", "Norway", "Nigeria", "Niger", "Nicaragua", "New Zealand", "Netherlands", "Nepal", "Namibia", "Myanmar", "Mozambique", "Morocco", "Montenegro", "Mongolia", "Moldova, Republic of", "Mexico", "Mauritius", "Mauritania", "Martinique", "Mali", "Malaysia", "Malawi", "Madagascar", "Macedonia, Republic of", "Macao", "Luxembourg", "Lithuania", "Libya", "Lebanon", "Latvia", "Lao People's Democratic Republic", "Kyrgyzstan", "Kuwait", "Korea, Republic of", "Kenya", "Kazakhstan", "Jordan", "Jersey", "Japan", "Jamaica", "Italy", "Israel", "Ireland", "Iraq", "Iran, Islamic Republic of", "Indonesia", "India", "Iceland", "Hungary", "Hong Kong", "Honduras", "Haiti", "Guernsey", "Guatemala", "Greenland", "Greece", "Ghana", "Germany", "Georgia", "French Polynesia", "France", "Finland", "Fiji", "Ethiopia", "Estonia", "El Salvador", "Egypt", "Ecuador", "Dominican Republic", "Dominica", "Djibouti", "Denmark", "Czechia", "Cyprus", "Curacao", "Cuba", "Croatia", "Cote d'Ivoire", "Costa Rica", "Congo, The Democratic Republic of the", "Colombia", "China", "Chile", "Canada", "Cameroon", "Cambodia", "Burkina Faso", "Bulgaria", "British Indian Ocean Territory", "Brazil", "Botswana", "Bolivia, Plurinational State of", "Bhutan", "Benin", "Belgium", "Belarus", "Barbados", "Bangladesh", "Bahrain", "Azerbaijan", "Austria", "Australia", "Aruba", "Armenia", "Argentina", 
"Angola", "Algeria", "Albania", "Afghanistan", "Togo", "Malta", "Guadeloupe", "Gibraltar", "Gabon", "Faroe Islands", "Congo", "Cayman Islands", "Brunei Darussalam", "Bosnia and Herzegovina", "Bahamas", "Reunion", "Maldives", "Guyana", "Guinea", "Cabo Verde", "Burundi", "Antigua and Barbuda", "Swaziland", "Saint Lucia", "Isle of Man", "Gambia", "Central African Republic", "Belize", "Vanuatu", "Sierra Leone", "Saint Kitts and Nevis", "New Caledonia", "Lesotho", "Solomon Islands", "French Guiana", "Chad", "Bermuda", "Turks and Caicos Islands", "Liberia", "Comoros", "Bonaire, Sint Eustatius and Saba", "Aland Islands", "Grenada", "Mayotte", "Liechtenstein", "Samoa", "Equatorial Guinea", "Andorra", "South Sudan", "Saint Martin (French part)", "Saint Vincent and the Grenadines", "Holy See (Vatican City State)", "Guinea-Bissau", "Eritrea", "Saint Barthelemy", "Cook Islands", "Sint Maarten (Dutch part)", "Sao Tome and Principe", "Anguilla", "Monaco", "Kiribati", "Micronesia, Federated States of", "San Marino", "United States") + fill_out <- function(x, start_date, end_date, fill = 0) { temp <- dplyr::data_frame(date = seq(start_date, end_date, "day")) y <- dplyr::right_join(x, temp, by = "date") diff --git a/ui.R b/ui.R index 8b2ece9..c9d1ea6 100644 --- a/ui.R +++ b/ui.R @@ -1,7 +1,7 @@ library(shiny) library(shinydashboard) -all_country_data <- polloi::read_dataset("discovery/metrics/portal/all_country_data.tsv", col_types = "Dcididid") +source("extras.R") function(request) { dashboardPage( @@ -237,8 +237,8 @@ fluidRow( column(width = 3, dateRangeInput("date_all_country", "Date Range", - start = min(all_country_data$date), - end = max(all_country_data$date), +
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[master]: Clarify event counts + switch to 90-day median
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/358489 ) Change subject: Clarify event counts + switch to 90-day median .. Clarify event counts + switch to 90-day median Change-Id: I6b2f1b51f405e8acc003033cd20b2e27fc95ba3b --- M server.R M tab_documentation/app_events.md M tab_documentation/desktop_events.md M tab_documentation/mobile_events.md M utils.R 5 files changed, 20 insertions(+), 14 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/server.R b/server.R index 2225127..54c0886 100644 --- a/server.R +++ b/server.R @@ -41,7 +41,7 @@ output$desktop_event_searches <- renderValueBox( valueBox( value = desktop_dygraph_means["search sessions"], - subtitle = "Search sessions per day", + subtitle = "Tracked search sessions per day*", icon = icon("search"), color = "green" ) @@ -50,7 +50,7 @@ output$desktop_event_resultsets <- renderValueBox( valueBox( value = desktop_dygraph_means["Result pages opened"], - subtitle = "Result sets per day", + subtitle = "Result pages opened per day*", icon = icon("list", lib = "glyphicon"), color = "green" ) @@ -59,7 +59,7 @@ output$desktop_event_clickthroughs <- renderValueBox( valueBox( value = desktop_dygraph_means["clickthroughs"], - subtitle = "Clickthroughs per day", + subtitle = "Clickthroughs per day*", icon = icon("hand-up", lib = "glyphicon"), color = "green" ) @@ -124,7 +124,7 @@ output$mobile_event_searches <- renderValueBox( valueBox( value = mobile_dygraph_means["search sessions"], - subtitle = "Search sessions per day", + subtitle = "Search sessions per day*", icon = icon("search"), color = "green" ) @@ -133,7 +133,7 @@ output$mobile_event_resultsets <- renderValueBox( valueBox( value = mobile_dygraph_means["Result pages opened"], - subtitle = "Result sets per day", + subtitle = "Result pages opened per day*", icon = icon("list", lib = "glyphicon"), color = "green" ) @@ -142,7 +142,7 @@ output$mobile_event_clickthroughs <- renderValueBox( valueBox( 
value = mobile_dygraph_means["clickthroughs"], - subtitle = "Clickthroughs per day", + subtitle = "Clickthroughs per day*", icon = icon("hand-up", lib = "glyphicon"), color = "green" ) @@ -169,7 +169,7 @@ output$app_event_searches <- renderValueBox( valueBox( value = ios_dygraph_means["search sessions"] + android_dygraph_means["search sessions"], - subtitle = "Search sessions per day", + subtitle = "Search sessions per day*", icon = icon("search"), color = "green" ) @@ -178,7 +178,7 @@ output$app_event_resultsets <- renderValueBox( valueBox( value = ios_dygraph_means["Result pages opened"] + android_dygraph_means["Result pages opened"], - subtitle = "Result sets per day", + subtitle = "Result pages opened per day*", icon = icon("list", lib = "glyphicon"), color = "green" ) @@ -187,7 +187,7 @@ output$app_event_clickthroughs <- renderValueBox( valueBox( value = ios_dygraph_means["clickthroughs"] + android_dygraph_means["clickthroughs"], - subtitle = "Clickthroughs per day", + subtitle = "Clickthroughs per day*", icon = icon("hand-up", lib = "glyphicon"), color = "green" ) diff --git a/tab_documentation/app_events.md b/tab_documentation/app_events.md index 02b87ba..3f696d1 100644 --- a/tab_documentation/app_events.md +++ b/tab_documentation/app_events.md @@ -11,6 +11,8 @@ Due to a bug in the iOS EventLogging system, iOS events are currently being tracked much more frequently than Android ones and so are displayed in a different graph to avoid confusion. +\* This number represents the median of the last 90 days. + Notes -- * There is a spike in events on 2 June 2015 because of a release of the iOS app that added search logging. This has been [confirmed](https://phabricator.wikimedia.org/T102098) by a mobile apps software engineer. diff --git a/tab_documentation/desktop_events.md b/tab_documentation/desktop_events.md index 9c07536..044d3f7 100644 --- a/tab_documentation/desktop_events.md +++ b/tab_documentation/desktop_events.md @@ -8,7 +8,9 @@ 3. 
A user clicking through to an article in the results page. These three things are tracked via the [EventLogging 'TestSearchSatisfaction2' schema](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2) (previously '[Search](https://meta.wikimedia.org/wiki/Schema:Search)', see note "A"), and stored to -a database. The results are then aggregated and anonymised, and presented on this page. For performance/privacy reasons we randomly sample what we store, so the actual numbers are a vast understatement of how
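The documentation note in the change above says the starred value boxes show the median of the last 90 days. The corresponding `utils.R` hunk is not shown here, so the following is only a sketch of one way such a figure could be computed; the `daily` data frame, its column names, and the injected spike are assumptions for illustration.

```r
# Hypothetical daily counts: 120 days, with one artificial spike.
daily <- data.frame(
  date = seq(as.Date("2017-01-01"), by = "day", length.out = 120),
  clickthroughs = 1:120
)
daily$clickthroughs[100] <- 10000  # a one-day outlier

# Keep only the most recent 90 days (inclusive of the latest date).
recent <- daily[daily$date > max(daily$date) - 90, ]

# The median is barely affected by the spike, whereas the mean is pulled up.
median_recent <- median(recent$clickthroughs)
mean_recent <- mean(recent$clickthroughs)
```

In this toy data the spike leaves the median where it would have been without it, while the mean more than doubles, which is one plausible reason to prefer a median for a headline number.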
[MediaWiki-commits] [Gerrit] wikimedia...dashboard[master]: Deploy fixes
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/356982 )

Change subject: Deploy fixes
..

Deploy fixes

Change-Id: I92d0cbe388113e3b878c82f73f3970e64ebfbae1
---
M shiny-server/portal
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/shiny-server/portal b/shiny-server/portal
index 51df8cf..fa78f60 16
--- a/shiny-server/portal
+++ b/shiny-server/portal
@@ -1 +1 @@
-Subproject commit 51df8cf55d3856c0277a55f15d43a780b477b8f8
+Subproject commit fa78f60f4734432e4fd3c5f8e61803f5a870a024

--
To view, visit https://gerrit.wikimedia.org/r/356982
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I92d0cbe388113e3b878c82f73f3970e64ebfbae1
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/dashboard
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
Gerrit-Reviewer: Chelsyx

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] wikimedia...dashboard[master]: Deploy fixes
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/356982 )

Change subject: Deploy fixes
..

Deploy fixes

Change-Id: I92d0cbe388113e3b878c82f73f3970e64ebfbae1
---
M shiny-server/portal
1 file changed, 1 insertion(+), 1 deletion(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/dashboard refs/changes/82/356982/1

diff --git a/shiny-server/portal b/shiny-server/portal
index 51df8cf..fa78f60 16
--- a/shiny-server/portal
+++ b/shiny-server/portal
@@ -1 +1 @@
-Subproject commit 51df8cf55d3856c0277a55f15d43a780b477b8f8
+Subproject commit fa78f60f4734432e4fd3c5f8e61803f5a870a024

--
To view, visit https://gerrit.wikimedia.org/r/356982
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I92d0cbe388113e3b878c82f73f3970e64ebfbae1
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/dashboard
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Use new path in ui.R
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/356980 ) Change subject: Use new path in ui.R .. Use new path in ui.R Change-Id: I981a31e3d8462c2609203653838b636cd7c5935a --- M ui.R 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/ui.R b/ui.R index 84f69c7..8b2ece9 100644 --- a/ui.R +++ b/ui.R @@ -1,7 +1,7 @@ library(shiny) library(shinydashboard) -all_country_data <- polloi::read_dataset("discovery/portal/all_country_data.tsv", col_types = "Dcididid") +all_country_data <- polloi::read_dataset("discovery/metrics/portal/all_country_data.tsv", col_types = "Dcididid") function(request) { dashboardPage( -- To view, visit https://gerrit.wikimedia.org/r/356980 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I981a31e3d8462c2609203653838b636cd7c5935a Gerrit-PatchSet: 1 Gerrit-Project: wikimedia/discovery/prince Gerrit-Branch: master Gerrit-Owner: Chelsyx Gerrit-Reviewer: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Use new path in ui.R
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/356980 ) Change subject: Use new path in ui.R .. Use new path in ui.R Change-Id: I981a31e3d8462c2609203653838b636cd7c5935a --- M ui.R 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/prince refs/changes/80/356980/1 diff --git a/ui.R b/ui.R index 84f69c7..8b2ece9 100644 --- a/ui.R +++ b/ui.R @@ -1,7 +1,7 @@ library(shiny) library(shinydashboard) -all_country_data <- polloi::read_dataset("discovery/portal/all_country_data.tsv", col_types = "Dcididid") +all_country_data <- polloi::read_dataset("discovery/metrics/portal/all_country_data.tsv", col_types = "Dcididid") function(request) { dashboardPage( -- To view, visit https://gerrit.wikimedia.org/r/356980 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I981a31e3d8462c2609203653838b636cd7c5935a Gerrit-PatchSet: 1 Gerrit-Project: wikimedia/discovery/prince Gerrit-Branch: master Gerrit-Owner: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Note that PVs are for Wikimedia in general
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/349718 ) Change subject: Note that PVs are for Wikimedia in general .. Note that PVs are for Wikimedia in general Change-Id: Iee0800433fb1d936b0058595b903a10aafaa64f0 --- M tab_documentation/traffic_byengine.md M tab_documentation/traffic_summary.md 2 files changed, 4 insertions(+), 4 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/tab_documentation/traffic_byengine.md b/tab_documentation/traffic_byengine.md index 30d33ac..5db2e29 100644 --- a/tab_documentation/traffic_byengine.md +++ b/tab_documentation/traffic_byengine.md @@ -1,9 +1,9 @@ Traffic from external search engines, broken down === -A key metric in understanding the role external search engines play in Wikipedia's readership and content discovery processes is a very direct one - how many pageviews we get from them. This can be discovered very simply by looking at our request logs. +A key metric in understanding the role external search engines play in Wikipedia's (and Wikimedia's) readership and content discovery processes is a very direct one - how many pageviews we get from them. This can be discovered very simply by looking at our request logs. -This dashboard simply breaks down the [summary data](http://discovery.wmflabs.org/external/#traffic_summary) to investigate how much traffic is coming from each search engine, individually. As you can see, Google dominates, which is why we've included the option of log-scaling +This dashboard simply breaks down the [summary data](https://discovery.wmflabs.org/external/#traffic_summary) to investigate how much traffic is coming from each search engine, individually. As you can see, Google dominates, which is why we've included the option of log-scaling the traffic. 
General trends diff --git a/tab_documentation/traffic_summary.md b/tab_documentation/traffic_summary.md index e8f0919..8cfb982 100644 --- a/tab_documentation/traffic_summary.md +++ b/tab_documentation/traffic_summary.md @@ -1,9 +1,9 @@ Traffic from external search engines - summary === -A key metric in understanding the role external search engines play in Wikipedia's readership and content discovery processes is a very direct one - how many pageviews we get from them. This can be discovered very simply by looking at our request logs. +A key metric in understanding the role external search engines play in Wikipedia's (and Wikimedia's) readership and content discovery processes is a very direct one - how many pageviews we get from them. This can be discovered very simply by looking at our request logs. -This dashboard simply looks at, very broadly, where our requests are coming from - search engines or something else? It is split up into +This dashboard simply looks at, very broadly, where our pageviews (across all Wikimedia projects) are coming from - search engines or something else? It is split up into "all", "desktop" and "mobile web" platforms - but not apps, since the apps do not log referers. **Internal** is traffic referred by Wikimedia sites, specifically: mediawiki.org, wikibooks.org, wikidata.org, wikinews.org, wikimedia.org, wikimediafoundation.org, wikipedia.org, wikiquote.org, wikisource.org, wikiversity.org, wikivoyage.org, and wiktionary.org (See [Webrequest source](https://git.wikimedia.org/blob/analytics%2Frefinery%2Fsource.git/master/refinery-core%2Fsrc%2Fmain%2Fjava%2Forg%2Fwikimedia%2Fanalytics%2Frefinery%2Fcore%2FWebrequest.java#L203) for more information.) 
-- To view, visit https://gerrit.wikimedia.org/r/349718 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: Iee0800433fb1d936b0058595b903a10aafaa64f0 Gerrit-PatchSet: 1 Gerrit-Project: wikimedia/discovery/wonderbolt Gerrit-Branch: master Gerrit-Owner: Bearloga Gerrit-Reviewer: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Add secondSPARQL endpoint
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/349290 ) Change subject: Add secondSPARQL endpoint .. Add secondSPARQL endpoint Bug: T163501 Change-Id: Ifef4c91d66a87e0a8d33bf044d6d956b0e3b63e2 --- M modules/metrics/wdqs/basic_usage 1 file changed, 3 insertions(+), 3 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/modules/metrics/wdqs/basic_usage b/modules/metrics/wdqs/basic_usage index 5fd7a10..b1c640a 100755 --- a/modules/metrics/wdqs/basic_usage +++ b/modules/metrics/wdqs/basic_usage @@ -3,7 +3,7 @@ hive -S -e "USE wmf; SELECT '$1' AS date, - uri_path AS path, + IF(uri_path = '/sparql', '/bigdata/namespace/wdq/sparql', uri_path) AS path, UPPER(http_status IN('200','304')) as http_success, CASE WHEN ( @@ -27,10 +27,10 @@ AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) >= '$1' AND CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) < '$2' AND uri_host = 'query.wikidata.org' - AND uri_path IN('/', '/bigdata/namespace/wdq/sparql', '/bigdata/ldf') + AND uri_path IN('/', '/bigdata/namespace/wdq/sparql', '/bigdata/ldf', '/sparql') GROUP BY '$1', - uri_path, + IF(uri_path = '/sparql', '/bigdata/namespace/wdq/sparql', uri_path) AS path, UPPER(http_status IN('200','304')), CASE WHEN ( -- To view, visit https://gerrit.wikimedia.org/r/349290 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ifef4c91d66a87e0a8d33bf044d6d956b0e3b63e2 Gerrit-PatchSet: 1 Gerrit-Project: wikimedia/discovery/golden Gerrit-Branch: master Gerrit-Owner: Bearloga Gerrit-Reviewer: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Add dataset READMEs to output dirs
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/348487 ) Change subject: Add dataset READMEs to output dirs .. Add dataset READMEs to output dirs This adds to Rmarkdown files that must be re-knit after any new reports are added. The Rmarkdown files read config.yaml info and output Markdown documents that are rsync'd into the output directories, so that users of datasets.wikimedia.org and browsers of stat1002:/a/aggregate-datasets/ can find out what the TSVs are Change-Id: Iaaebdd605c53e74102379a452f70bd17a1aaf851 --- A docs/README.md A docs/discovery-forecasts.Rmd A docs/discovery-forecasts.md A docs/discovery.Rmd A docs/discovery.md M main.sh M modules/metrics/portal/config.yaml 7 files changed, 263 insertions(+), 1 deletion(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 0000000..a5e21e6 --- /dev/null +++ b/docs/README.md @@ -0,0 +1,4 @@ +# READMEs for generated datasets + +* [discovery.Rmd](disovery.Rmd) needs to be knit into Markdown ([discovery.md](disovery.md)) and that is rsync'd to stat1002:/a/aggregate-datasets/discovery/README.md +* [discovery-forecasts.Rmd](disovery-forecasts.Rmd) needs to be knit into Markdown ([discovery-forecasts.md](disovery-forecasts.md)) and that is rsync'd to stat1002:/a/aggregate-datasets/discovery-forecasts/README.md diff --git a/docs/discovery-forecasts.Rmd b/docs/discovery-forecasts.Rmd new file mode 100644 index 0000000..c5ea557 --- /dev/null +++ b/docs/discovery-forecasts.Rmd @@ -0,0 +1,34 @@ +--- +output: md_document +--- + +# Discovery Forecasts + +These files are generated by Discovery's [Golden](https://github.com/wikimedia/wikimedia-discovery-golden/) data retrieval codebase that executes daily and uses [Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater) infrastructure. 
These datasets provide the metrics that are used by [Discovery's Dashboards](https://discovery.wmflabs.org/) + +Last updated on `r format(Sys.Date(), "%d %B %Y")` + +```{r setup, include=FALSE} +knitr::opts_chunk$set(echo = FALSE) +options(width = 10000) +``` + +```{r yamls} +config_yamls <- list.files(path = "../modules/forecasts", pattern = "^config\\.yaml$", recursive = TRUE, full.names = TRUE) +names(config_yamls) <- sub("../modules/forecasts/", "", dirname(config_yamls), fixed = TRUE) +reports <- dplyr::bind_rows(lapply(config_yamls, function(path) { + config_yaml <- suppressMessages(suppressWarnings(data.tree::as.Node(yaml::yaml.load_file(path)))) + reports <- data.tree::ToDataFrameTable(config_yaml[["reports"]], "report" = "name", "description") + reports$path = paste0(file.path(dirname(path), reports$report), ifelse(reports$type == "sql", ".sql", "")) + return(reports) +}), .id = "module") +``` + +```{r results='asis'} +for (module in unique(reports$module)) { + cat(sprintf("\n## %s", module), "/\n\n", sep = "") + for (i in which(reports$module == module)) { +cat("- **", reports$report[i], ".tsv**: ", reports$description[i], "\n", sep = "") + } +} +``` diff --git a/docs/discovery-forecasts.md b/docs/discovery-forecasts.md new file mode 100644 index 0000000..69c0b2c --- /dev/null +++ b/docs/discovery-forecasts.md @@ -0,0 +1,43 @@ +Discovery Forecasts +=================== + +These files are generated by Discovery's +[Golden](https://github.com/wikimedia/wikimedia-discovery-golden/) data +retrieval codebase that executes daily and uses +[Reportupdater](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater) +infrastructure. 
These datasets provide the metrics that are used by +[Discovery's Dashboards](https://discovery.wmflabs.org/) + +Last updated on 17 April 2017 + +search/ +------- + +- **api\_cirrus\_arima.tsv**: ARIMA-modelled forecasts of Cirrus API +usage by non-automata users +- **api\_cirrus\_bsts.tsv**: BSTS-modelled forecasts of Cirrus API +usage by non-automata users +- **api\_cirrus\_prophet.tsv**: Prophet-modelled forecasts of Cirrus +API usage by non-automata users +- **zrr\_overall\_arima.tsv**: ARIMA-modelled forecasts of zero +results rate, excluding known bots/tools +- **zrr\_overall\_bsts.tsv**: BSTS-modelled forecasts of zero results +rate, excluding known bots/tools +- **zrr\_overall\_prophet.tsv**: Prophet-modelled forecasts of zero +results rate, excluding known bots/tools + +wdqs/ +----- + +- **homepage\_traffic\_arima.tsv**: ARIMA-modelled forecasts of WDQS +homepage traffic by non-automata users +- **homepage\_traffic\_bsts.tsv**: BSTS-modelled forecasts of WDQS +homepage traffic by non-automata users +- **homepage\_traffic\_prophet.tsv**: Prophet-modelled forecasts of +WDQS homepage traffic by non-automata users +- **sparql\_usage\_arima.tsv**: ARIMA-modelled forecasts of WDQS +SPARQL endpoint usage by non-automata +- **sparql\_usage\_bsts.tsv**:
[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Let users view non-bot pageview traffic breakdown
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/348018 ) Change subject: Let users view non-bot pageview traffic breakdown .. Let users view non-bot pageview traffic breakdown Bug: T161932 Change-Id: I80c6c8a20ac0559e9ba4e4b2711acf505cadb547 --- M server.R M ui.R M utils.R 3 files changed, 113 insertions(+), 16 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/server.R b/server.R index 42ff6bf..528fb21 100644 --- a/server.R +++ b/server.R @@ -9,14 +9,30 @@ function(input, output, session) { if (Sys.Date() != existing_date) { +progress <- shiny::Progress$new(session, min = 0, max = 1) +on.exit(progress$close()) +progress$set(message = "Downloading overall pageview counts...", value = 0) read_traffic() +progress$set(message = "Downloading non-bot pageview counts...", value = 1/2) +read_nonbot_traffic() +progress$set(message = "Finished downloading datasets.", value = 1) existing_date <<- Sys.Date() } output$traffic_summary_dygraph <- renderDygraph({ -input$platform_traffic_summary_prop %>% - polloi::data_select(summary_traffic_data_prop[[input$platform_traffic_summary]], - summary_traffic_data[[input$platform_traffic_summary]]) %>% +input$include_automata_traffic_summary %>% + polloi::data_select( +polloi::data_select( + input$platform_traffic_summary_prop, + summary_traffic_data_prop[[input$platform_traffic_summary]], + summary_traffic_data[[input$platform_traffic_summary]] +), +polloi::data_select( + input$platform_traffic_summary_prop, + summary_traffic_nonbot_data_prop[[input$platform_traffic_summary]], + summary_traffic_nonbot_data[[input$platform_traffic_summary]] +) + ) %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_traffic_summary)) %>% polloi::make_dygraph(xlab = "Date", ylab = ifelse(input$platform_traffic_summary_prop, "Pageview Share (%)", "Pageviews"), title = "Sources of page views (e.g. 
search engines and internal referers)") %>% @@ -31,9 +47,19 @@ }) output$traffic_bysearch_dygraph <- renderDygraph({ -input$platform_traffic_bysearch_prop %>% - polloi::data_select(bysearch_traffic_data_prop[[input$platform_traffic_bysearch]], - bysearch_traffic_data[[input$platform_traffic_bysearch]]) %>% +input$include_automata_traffic_bysearch %>% + polloi::data_select( +polloi::data_select( + input$platform_traffic_bysearch_prop, + bysearch_traffic_data_prop[[input$platform_traffic_bysearch]], + bysearch_traffic_data[[input$platform_traffic_bysearch]] +), +polloi::data_select( + input$platform_traffic_bysearch_prop, + bysearch_traffic_nonbot_data_prop[[input$platform_traffic_bysearch]], + bysearch_traffic_nonbot_data[[input$platform_traffic_bysearch]] +) + ) %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_traffic_bysearch)) %>% polloi::make_dygraph(xlab = "Date", ylab = ifelse(input$platform_traffic_bysearch_prop, "Pageview Share (%)", "Pageviews"), title = "Pageviews from external search engines, broken down by engine") %>% diff --git a/ui.R b/ui.R index 6baa5be..e4cf745 100644 --- a/ui.R +++ b/ui.R @@ -2,6 +2,10 @@ library(shinydashboard) library(dygraphs) +spider_checkbox <- function(input_id) { + shiny::checkboxInput(input_id, "Include automata", value = TRUE, width = NULL) +} + function(request) { dashboardPage( @@ -29,26 +33,34 @@ tabItems( tabItem(tabName = "traffic_summary", fluidRow( - column(selectizeInput(inputId = "platform_traffic_summary", label = "Platform", choices = c("All", "Desktop", "Mobile Web")), width = 2), - column(HTML("Scale"), + column(selectizeInput(inputId = "platform_traffic_summary", label = "Platform", choices = c("All", "Desktop", "Mobile Web")), width = 3), + column(HTML("Data"), + spider_checkbox("include_automata_traffic_summary"), width = 2), + column(conditionalPanel("!input.platform_traffic_summary_prop", HTML("Scale")), 
conditionalPanel("!input.platform_traffic_summary_prop", checkboxInput("platform_traffic_summary_log", label = "Use Log scale", value = FALSE)), + width = 2), + column(conditionalPanel("!input.platform_traffic_summary_log", HTML("Type")), conditionalPanel("!input.platform_traffic_summary_log", checkboxInput("platform_traffic_summary_prop",
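Note on the server.R change above: the nested polloi::data_select() calls pick one of four datasets from two checkboxes (include automata or not, proportions or raw counts). The sketch below illustrates that selection logic only. It assumes data_select(condition, a, b) returns a when the condition is TRUE and b otherwise; the local data_select stand-in and the choose_dataset helper are hypothetical and are not part of polloi or of this change.

```r
# Stand-in for polloi::data_select, under the ASSUMPTION that it returns
# the first dataset when the condition is TRUE and the second otherwise.
data_select <- function(condition, true_set, false_set) {
  if (condition) true_set else false_set
}

# Hypothetical helper mirroring the nesting in the diff: the outer call
# chooses between "all traffic" and "non-bot traffic", and each inner call
# chooses between proportions and raw counts.
choose_dataset <- function(include_automata, as_proportion,
                           all_prop, all_raw, nonbot_prop, nonbot_raw) {
  data_select(
    include_automata,
    data_select(as_proportion, all_prop, all_raw),
    data_select(as_proportion, nonbot_prop, nonbot_raw)
  )
}

# Labels stand in for the real data frames.
selected <- choose_dataset(
  include_automata = FALSE, as_proportion = TRUE,
  all_prop = "all %", all_raw = "all counts",
  nonbot_prop = "non-bot %", nonbot_raw = "non-bot counts"
)
print(selected)
```

Because R evaluates function arguments lazily, only the inner selection that is actually returned gets evaluated in this sketch.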
[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Fix data path in ui.R for full geo dashboard
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/347798 ) Change subject: Fix data path in ui.R for full geo dashboard .. Fix data path in ui.R for full geo dashboard Bug: T161806 Change-Id: I4ba0952a4e522fb8febb2d53e2a0440763ec0787 --- M ui.R 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/prince refs/changes/98/347798/1 diff --git a/ui.R b/ui.R index a39db01..84f69c7 100644 --- a/ui.R +++ b/ui.R @@ -1,7 +1,7 @@ library(shiny) library(shinydashboard) -all_country_data <- polloi::read_dataset("portal/all_country_data.tsv", col_types = "Dcididid") +all_country_data <- polloi::read_dataset("discovery/portal/all_country_data.tsv", col_types = "Dcididid") function(request) { dashboardPage( -- To view, visit https://gerrit.wikimedia.org/r/347798 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I4ba0952a4e522fb8febb2d53e2a0440763ec0787 Gerrit-PatchSet: 1 Gerrit-Project: wikimedia/discovery/prince Gerrit-Branch: master Gerrit-Owner: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Add relative option to referrer summary
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/347040 ) Change subject: Add relative option to referrer summary .. Add relative option to referrer summary - Adds the option to view traffic breakdown as percentages - Adds the option to view traffic breakdown on a log10 scale Bug: T161771 Change-Id: I4516f7a6d1d7bc12bdd9c41d3983aa64bb3123d5 --- M server.R M ui.R M utils.R 3 files changed, 21 insertions(+), 7 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/server.R b/server.R index 1c892de..42ff6bf 100644 --- a/server.R +++ b/server.R @@ -14,12 +14,17 @@ } output$traffic_summary_dygraph <- renderDygraph({ -summary_traffic_data[[input$platform_traffic_summary]] %>% +input$platform_traffic_summary_prop %>% + polloi::data_select(summary_traffic_data_prop[[input$platform_traffic_summary]], + summary_traffic_data[[input$platform_traffic_summary]]) %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_traffic_summary)) %>% - polloi::make_dygraph(xlab = "Date", ylab = "Pageviews", + polloi::make_dygraph(xlab = "Date", ylab = ifelse(input$platform_traffic_summary_prop, "Pageview Share (%)", "Pageviews"), title = "Sources of page views (e.g. 
search engines and internal referers)") %>% dyLegend(labelsDiv = "traffic_summary_legend", show = "always", showZeroValues = FALSE) %>% - dyRangeSelector(retainDateWindow = TRUE) %>% + dyAxis("y", logscale = input$platform_traffic_summary_log) %>% + dyRangeSelector(fillColor = ifelse(input$platform_traffic_summary_prop, "", "#A7B1C4"), + strokeColor = ifelse(input$platform_traffic_summary_prop, "", "#808FAB"), + retainDateWindow = TRUE) %>% dyEvent(as.Date("2016-03-07"), "A (new UDF)", labelLoc = "bottom") %>% dyEvent(as.Date("2016-06-26"), "B (DuckDuckGo)", labelLoc = "bottom") %>% dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") diff --git a/ui.R b/ui.R index 1ac0e5a..6baa5be 100644 --- a/ui.R +++ b/ui.R @@ -30,8 +30,12 @@ tabItem(tabName = "traffic_summary", fluidRow( column(selectizeInput(inputId = "platform_traffic_summary", label = "Platform", choices = c("All", "Desktop", "Mobile Web")), width = 2), + column(HTML("Scale"), + conditionalPanel("!input.platform_traffic_summary_prop", checkboxInput("platform_traffic_summary_log", label = "Use Log scale", value = FALSE)), + conditionalPanel("!input.platform_traffic_summary_log", checkboxInput("platform_traffic_summary_prop", label = "Use Proportion", value = FALSE)), + width = 2), column(polloi::smooth_select("smoothing_traffic_summary"), width = 3), - column(div(id = "traffic_summary_legend", style = "text-align: right;"), width = 7)), + column(div(id = "traffic_summary_legend", style = "text-align: right;"), width = 5)), dygraphOutput("traffic_summary_dygraph"), includeMarkdown("./tab_documentation/traffic_summary.md") ), diff --git a/utils.R b/utils.R index b3f338b..009a0c7 100644 --- a/utils.R +++ b/utils.R @@ -29,6 +29,12 @@ names(interim) <- c("Desktop", "Mobile Web", "All") summary_traffic_data <<- lapply(interim, tidyr::spread, key = "referer_class", value = "pageviews", fill = NA) + # Proportion + summary_traffic_data_prop <<- interim %>% +lapply(dplyr::group_by, date) %>% 
+lapply(dplyr::mutate, pageviews = 100*pageviews/sum(pageviews)) %>% +lapply(tidyr::spread, key = "referer_class", value = "pageviews", fill = NA) + # Generate per-engine values interim <- data[is_search == TRUE, j = list(pageviews = sum(pageviews)), @@ -44,10 +50,9 @@ lapply(tidyr::spread, key = "search_engine", value = "pageviews", fill = NA) # Proportion - interim <- interim %>% -lapply(dplyr::group_by, date) %>% -lapply(dplyr::mutate, pageviews = 100*pageviews/sum(pageviews)) bysearch_traffic_data_prop <<- interim %>% +lapply(dplyr::group_by, date) %>% +lapply(dplyr::mutate, pageviews = 100*pageviews/sum(pageviews)) %>% lapply(dplyr::filter_, .dots = list(quote(search_engine != "Not referred by search"))) %>% lapply(tidyr::spread, key = "search_engine", value = "pageviews", fill = NA) -- To view, visit https://gerrit.wikimedia.org/r/347040 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I4516f7a6d1d7bc12bdd9c41d3983aa64bb3123d5 Gerrit-PatchSet: 1 Gerrit-Project: wikimedia/discovery/wonderbolt Gerrit-Branch: master Gerrit-Owner: Bearloga
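Note on the utils.R change above: the "Proportion" step converts each day's pageview counts into percentage shares by grouping on date and dividing by that day's total, then spreads the result to one column per referer class. A minimal sketch of that computation follows. The toy `traffic` data frame and its class labels are made up for illustration; only the group_by / mutate / spread pattern comes from the diff.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format data: one row per date and referer class.
traffic <- data.frame(
  date = as.Date(c("2017-04-01", "2017-04-01", "2017-04-02", "2017-04-02")),
  referer_class = c("search", "internal", "search", "internal"),
  pageviews = c(30, 70, 50, 150),
  stringsAsFactors = FALSE
)

traffic_prop <- traffic %>%
  group_by(date) %>%                                        # shares are computed within each day
  mutate(pageviews = 100 * pageviews / sum(pageviews)) %>%  # percent of that day's total
  ungroup() %>%
  spread(key = referer_class, value = pageviews, fill = NA) # one column per referer class

print(traffic_prop)
```

Each row of the result sums to 100 across the referer-class columns, which is why the dashboard relabels the y-axis as "Pageview Share (%)" when this dataset is selected.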
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Get browser info from new userAgent field
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/346655 ) Change subject: Get browser info from new userAgent field .. Get browser info from new userAgent field Bug: T162178 Change-Id: Ib292a1b87338c596125618b29ad16b7a82e48141 --- M modules/metrics/portal/user_agents.R 1 file changed, 10 insertions(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/55/346655/1 diff --git a/modules/metrics/portal/user_agents.R b/modules/metrics/portal/user_agents.R index 20097b1..6e45723 100644 --- a/modules/metrics/portal/user_agents.R +++ b/modules/metrics/portal/user_agents.R @@ -53,7 +53,16 @@ # Get user agent data wmf::set_proxies() # To allow for the latest YAML to be retrieved. uaparser::update_regexes() - ua_data <- data.table::as.data.table(uaparser::parse_agents(results$user_agent, fields = c("browser", "browser_major"))) + ua_data <- data.table::rbindlist(lapply(results$user_agent, function(x){ +if (grepl("^\\{", x)){ + temp <- unlist(jsonlite::fromJSON(x)[c("browser_family", "browser_major")]) + names(temp)[1] <- "browser" + temp <- as.data.frame(as.list(temp)) + return(temp) +} else { + uaparser::parse_agents(x, fields = c("browser", "browser_major")) +} + }), fill = TRUE) ua_data <- ua_data[, j = list(amount = .N), by = c("browser", "browser_major")] ua_data$date <- results$date[1] ua_data$percent <- round((ua_data$amount/sum(ua_data$amount)) * 100, 2) -- To view, visit https://gerrit.wikimedia.org/r/346655 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib292a1b87338c596125618b29ad16b7a82e48141 Gerrit-PatchSet: 1 Gerrit-Project: wikimedia/discovery/golden Gerrit-Branch: master Gerrit-Owner: Chelsyx
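Note on the user_agents.R change above: the new code treats a user agent that starts with "{" as an already-parsed JSON object and reads the browser fields from it directly, falling back to the raw-string parser otherwise. The sketch below shows only the JSON branch. The helper name parse_json_agent and the sample strings are hypothetical, and it assumes both browser_family and browser_major are present in the JSON; the raw-string fallback is left to the caller.

```r
library(jsonlite)

# Hypothetical helper: extract browser fields from a JSON-encoded user agent.
# Returns NULL for anything that does not look like JSON, so the caller can
# fall back to a raw user-agent string parser instead.
parse_json_agent <- function(x) {
  if (!grepl("^\\{", x)) {
    return(NULL)
  }
  fields <- fromJSON(x)[c("browser_family", "browser_major")]
  data.frame(
    browser = fields[["browser_family"]],     # renamed to match the older column name
    browser_major = fields[["browser_major"]],
    stringsAsFactors = FALSE
  )
}

json_ua <- '{"browser_family": "Firefox", "browser_major": "52", "os_family": "Windows"}'
raw_ua <- "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"

parsed <- parse_json_agent(json_ua)
print(parsed)
print(is.null(parse_json_agent(raw_ua)))
```

Renaming browser_family to browser keeps the output columns aligned with what the raw-string branch produces, so rows from both branches can be bound together.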
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[master]: Implement the wiki/language selector in more search dashboards
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/346461 ) Change subject: Implement the wiki/language selector in more search dashboards .. Implement the wiki/language selector in more search dashboards Three new dashboards are added: - CTR by Language/Project - Events by Language/Project - PaulScore by Language/Project Bug: T150410 Change-Id: Ie04762d747a9dcbec1564d8945f8949ed8c52adc --- M server.R A tab_documentation/desktop_events_langproj.md A tab_documentation/kpi_ctr_langproj.md A tab_documentation/paulscore_langproj.html M ui.R M utils.R 6 files changed, 520 insertions(+), 5 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/61/346461/1 diff --git a/server.R b/server.R index 5ec500e..79a7846 100644 --- a/server.R +++ b/server.R @@ -26,6 +26,7 @@ read_failures(existing_date) progress$set(message = "Downloading engagement data", value = 0.7) read_augmented_clickthrough() +read_augmented_clickthrough_langproj() progress$set(message = "Downloading survival data", value = 0.8) read_lethal_dose() progress$set(message = "Downloading PaulScore data", value = 0.9) @@ -877,6 +878,191 @@ dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) + output$ctr_language_selector_container <- renderUI({ +if (input$ctr_language_order == "alphabet") { + languages_to_display <- as.list(sort(available_languages_ctr$language)) + names(languages_to_display) <- available_languages_ctr$label[order(available_languages_ctr$language)] +} else { + languages_to_display <- available_languages_ctr$language + names(languages_to_display) <- available_languages_ctr$label +} + +# e.g. if user sorts projects alphabetically and the selected project is "10th Anniversary of Wikipeda" +# then automatically select the language "(None)" to avoid giving user an error. This also works if +# the user selects a project that is not multilingual, so this automatically chooses the "(None)" +# option for the user. 
+if (any(input$ctr_project_selector %in% projects_db$project[!projects_db$multilingual])) { + if (any(input$ctr_project_selector %in% projects_db$project[projects_db$multilingual])) { +if (!is.null(input$ctr_language_selector)) { + selected_language <- union("(None)", input$ctr_language_selector) +} else { + selected_language <- c("(None)", languages_to_display[[1]]) +} + } else { +selected_language <- "(None)" + } +} else { + if (!is.null(input$ctr_language_selector)) { +selected_language <- input$ctr_language_selector + } else { +selected_language <- languages_to_display[[1]] + } +} +return(selectInput("ctr_language_selector", "Language", multiple = TRUE,selectize = FALSE, size = 19, + choices = languages_to_display, selected = selected_language)) + }) + + output$ctr_project_selector_container <- renderUI({ +if (input$ctr_project_order == "alphabet") { + projects_to_display <- as.list(sort(available_projects_ctr$project)) + names(projects_to_display) <- available_projects_ctr$label[order(available_projects_ctr$project)] +} else { + projects_to_display <- available_projects_ctr$project + names(projects_to_display) <- available_projects_ctr$label +} +return(selectInput("ctr_project_selector", "Project", multiple = TRUE,selectize = FALSE, size = 19, + choices = projects_to_display, selected = projects_to_display[[1]])) + }) + + output$kpi_ctr_langproj_plot <- renderDygraph({ +augmented_clickthroughs_langproj %>% + kpi_ctr_aggregate_wikis(input$ctr_language_selector, input$ctr_project_selector) %>% + dplyr::select_(.dots=c("date", "wiki", paste0("`",input$ctr_metrics,"`"))) %>% + tidyr::spread_(., key_col="wiki", value_col=input$ctr_metrics, fill=0) %>% + polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_kpi_ctr_langproj)) %>% + polloi::make_dygraph(xlab = "Date", ylab = input$ctr_metrics, title = paste0(input$ctr_metrics, ", by day")) %>% + dyAxis("y", axisLabelFormatter = "function(x) { return x + '%'; }", valueFormatter 
= "function(x) { return x + '%'; }") %>% + dyLegend(show = "always", width = 400, labelsDiv = "kpi_ctr_langproj_legend") %>% + dyAxis("x", axisLabelFormatter = polloi::custom_axis_formatter) %>% + dyRangeSelector(fillColor = "") + }) + + output$desktop_events_language_selector_container <- renderUI({ +if (input$desktop_events_language_order == "alphabet") { + languages_to_display <- as.list(sort(available_languages_desktop$language)) + names(languages_to_display) <-
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Fix execution permissions
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/345470 )

Change subject: Fix execution permissions

Fix execution permissions

Change-Id: I682e0a6a6cae2f7d144e8150502d446251317877
---
M main.sh
M modules/forecasts/search/api_cirrus_arima
M modules/forecasts/search/api_cirrus_bsts
M modules/forecasts/search/zrr_overall_arima
M modules/forecasts/search/zrr_overall_bsts
M modules/forecasts/wdqs/homepage_traffic_arima
M modules/forecasts/wdqs/homepage_traffic_bsts
M modules/forecasts/wdqs/sparql_usage_arima
M modules/forecasts/wdqs/sparql_usage_bsts
M modules/metrics/external_traffic/referer_data
M modules/metrics/maps/tile_aggregates_no_automata
M modules/metrics/maps/tile_aggregates_with_automata
M modules/metrics/maps/users_by_country
M modules/metrics/portal/all_country_data
M modules/metrics/portal/clickthrough_breakdown
M modules/metrics/portal/clickthrough_firstvisit
M modules/metrics/portal/clickthrough_rate
M modules/metrics/portal/clickthrough_sisterprojects
M modules/metrics/portal/country_data
M modules/metrics/portal/dwell_metrics
M modules/metrics/portal/first_visits_country
M modules/metrics/portal/language_destination
M modules/metrics/portal/language_switching
M modules/metrics/portal/last_action_country
M modules/metrics/portal/most_common_country
M modules/metrics/portal/most_common_per_visit
M modules/metrics/portal/pageviews
M modules/metrics/portal/referer_data
M modules/metrics/portal/user_agent_data
M modules/metrics/search/app_load_times
M modules/metrics/search/cirrus_langproj_breakdown_no_automata
M modules/metrics/search/cirrus_langproj_breakdown_with_automata
M modules/metrics/search/cirrus_query_aggregates_no_automata
M modules/metrics/search/cirrus_query_aggregates_with_automata
M modules/metrics/search/cirrus_query_breakdowns_no_automata
M modules/metrics/search/cirrus_query_breakdowns_with_automata
M modules/metrics/search/cirrus_suggestion_breakdown_no_automata
M modules/metrics/search/cirrus_suggestion_breakdown_with_automata
M modules/metrics/search/desktop_load_times
M modules/metrics/search/mobile_load_times
M modules/metrics/search/sample_page_visit_ld
M modules/metrics/search/search_api_usage
M modules/metrics/wdqs/basic_usage
43 files changed, 0 insertions(+), 7 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/main.sh b/main.sh
index 19ad53b..cb5022d 100644
--- a/main.sh
+++ b/main.sh
@@ -1,12 +1,5 @@
 #!/bin/bash
 
-# Check if modules/forecasts/forecast.R has execution permission for Reportupdater
-# (If it doesn't, then other R and shell scripts in modules/ probably don't either.)
-if [ `ls -l modules/forecasts | grep -e forecast.R | grep -e "-rwxrwxr-x" | wc -l` == "0" ]; then
-  echo "Warning: modules do not have execution permission; granting now..."
-  chmod +x -R modules/
-fi
-
 # Check if Reportupdater git submodule is set up
 if [ ! -f reportupdater/update_reports.py ]; then
   echo "Warning: Reportupdater needs to be initialized and updated..."
diff --git a/modules/forecasts/search/api_cirrus_arima b/modules/forecasts/search/api_cirrus_arima
old mode 100644
new mode 100755
diff --git a/modules/forecasts/search/api_cirrus_bsts b/modules/forecasts/search/api_cirrus_bsts
old mode 100644
new mode 100755
diff --git a/modules/forecasts/search/zrr_overall_arima b/modules/forecasts/search/zrr_overall_arima
old mode 100644
new mode 100755
diff --git a/modules/forecasts/search/zrr_overall_bsts b/modules/forecasts/search/zrr_overall_bsts
old mode 100644
new mode 100755
diff --git a/modules/forecasts/wdqs/homepage_traffic_arima b/modules/forecasts/wdqs/homepage_traffic_arima
old mode 100644
new mode 100755
diff --git a/modules/forecasts/wdqs/homepage_traffic_bsts b/modules/forecasts/wdqs/homepage_traffic_bsts
old mode 100644
new mode 100755
diff --git a/modules/forecasts/wdqs/sparql_usage_arima b/modules/forecasts/wdqs/sparql_usage_arima
old mode 100644
new mode 100755
diff --git a/modules/forecasts/wdqs/sparql_usage_bsts b/modules/forecasts/wdqs/sparql_usage_bsts
old mode 100644
new mode 100755
diff --git a/modules/metrics/external_traffic/referer_data b/modules/metrics/external_traffic/referer_data
old mode 100644
new mode 100755
diff --git a/modules/metrics/maps/tile_aggregates_no_automata b/modules/metrics/maps/tile_aggregates_no_automata
old mode 100644
new mode 100755
diff --git a/modules/metrics/maps/tile_aggregates_with_automata b/modules/metrics/maps/tile_aggregates_with_automata
old mode 100644
new mode 100755
diff --git a/modules/metrics/maps/users_by_country b/modules/metrics/maps/users_by_country
old mode 100644
new mode 100755
diff --git a/modules/metrics/portal/all_country_data b/modules/metrics/portal/all_country_data
old mode 100644
new mode 100755
diff --git a/modules/metrics/portal/clickthrough_breakdown b/modules/metrics/portal/clickthrough_breakdown
old mode 100644
new mode 100755
[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Add relative option to External By Search Engine
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/345611 ) Change subject: Add relative option to External By Search Engine .. Add relative option to External By Search Engine Bug: T161771 Change-Id: I833b6477e6375d2ee16da38dac5096e37eb6afb4 --- M server.R M ui.R M utils.R 3 files changed, 34 insertions(+), 10 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/wonderbolt refs/changes/11/345611/1 diff --git a/server.R b/server.R index b0863cf..f835b6e 100644 --- a/server.R +++ b/server.R @@ -26,15 +26,28 @@ }) output$traffic_bysearch_dygraph <- renderDygraph({ -bysearch_traffic_data[[input$platform_traffic_bysearch]] %>% - polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_traffic_bysearch)) %>% - polloi::make_dygraph(xlab = "Date", ylab = "Pageviews", - title = "Pageviews from external search engines, broken down by engine") %>% - dyLegend(labelsDiv = "traffic_bysearch_legend", show = "always", showZeroValues = FALSE) %>% - dyAxis("y", logscale = input$platform_traffic_bysearch_log) %>% - dyRangeSelector(fillColor = "", strokeColor = "") %>% - dyEvent(as.Date("2016-06-26"), "A (DuckDuckGo)", labelLoc = "bottom") %>% - dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +if (input$platform_traffic_bysearch_prop == FALSE){ + bysearch_traffic_data[[input$platform_traffic_bysearch]] %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_traffic_bysearch)) %>% +polloi::make_dygraph(xlab = "Date", ylab = "Pageviews", + title = "Pageviews from external search engines, broken down by engine") %>% +dyLegend(labelsDiv = "traffic_bysearch_legend", show = "always", showZeroValues = FALSE) %>% +dyAxis("y", logscale = input$platform_traffic_bysearch_log) %>% +dyRangeSelector(fillColor = "", strokeColor = "") %>% +dyEvent(as.Date("2016-06-26"), "A (DuckDuckGo)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-01-01"), "R 
(reportupdater)", labelLoc = "bottom") +} else{ + bysearch_traffic_data_prop[[input$platform_traffic_bysearch]] %>% +polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_traffic_bysearch)) %>% +polloi::make_dygraph(xlab = "Date", ylab = "Pageview Share (%)", + title = "Pageview shares from external search engines, broken down by engine") %>% +dyLegend(labelsDiv = "traffic_bysearch_legend", show = "always", showZeroValues = FALSE) %>% +dyAxis("y", logscale = input$platform_traffic_bysearch_log) %>% +dyRangeSelector(fillColor = "", strokeColor = "") %>% +dyEvent(as.Date("2016-06-26"), "A (DuckDuckGo)", labelLoc = "bottom") %>% +dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") +} + }) # Check datasets for missing data and notify user which datasets are missing data (if any) diff --git a/ui.R b/ui.R index 358e8a9..c0928f4 100644 --- a/ui.R +++ b/ui.R @@ -38,7 +38,10 @@ tabItem(tabName = "traffic_by_engine", fluidRow( column(selectizeInput(inputId = "platform_traffic_bysearch", label = "Platform", choices = c("All", "Desktop", "Mobile Web")), width = 2), - column(HTML("Scale"), checkboxInput("platform_traffic_bysearch_log", label = "Use Log scale", value = FALSE), width = 2), + column(HTML("Scale"), + checkboxInput("platform_traffic_bysearch_log", label = "Use Log scale", value = FALSE), + checkboxInput("platform_traffic_bysearch_prop", label = "Use Proportion", value = FALSE), + width = 2), column(polloi::smooth_select("smoothing_traffic_bysearch"), width = 3), column(div(id = "traffic_bysearch_legend", style = "text-align: right;"), width = 5)), dygraphOutput("traffic_bysearch_dygraph"), diff --git a/utils.R b/utils.R index e2d7a1b..4a12d66 100644 --- a/utils.R +++ b/utils.R @@ -41,5 +41,13 @@ lapply(dplyr::filter_, .dots = list(quote(search_engine != "Not referred by search"))) %>% lapply(tidyr::spread, key = "search_engine", value = "pageviews", fill = NA) + # Proportion + interim <- interim %>% 
+lapply(dplyr::group_by, date) %>% +lapply(dplyr::mutate, pageviews = 100*pageviews/sum(pageviews)) + bysearch_traffic_data_prop <<- interim %>% +lapply(dplyr::filter_, .dots = list(quote(search_engine != "Not referred by search"))) %>% +lapply(tidyr::spread, key = "search_engine", value = "pageviews", fill = NA) + return(invisible()) } -- To view, visit
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Add scripts to enable language/project breakdown for several...
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/344207 ) Change subject: Add scripts to enable language/project breakdown for several search metrics .. Add scripts to enable language/project breakdown for several search metrics - App event counts - Mobile event counts - Desktop event counts - Paulscore approximations - Search threshold pass rate Bug: T150410 Change-Id: I4fbe097a84362fc13cb4b2e44b46fdbddf385bc4 --- A modules/metrics/search/app_event_counts_wiki_breakdown.sql M modules/metrics/search/config.yaml M modules/metrics/search/desktop_event_counts M modules/metrics/search/desktop_event_counts.R A modules/metrics/search/desktop_event_counts_langproj_breakdown A modules/metrics/search/mobile_event_counts_wiki_breakdown.sql A modules/metrics/search/paulscore_approximations_fulltext_wiki_breakdown.sql M modules/metrics/search/search_threshold_pass_rate M modules/metrics/search/search_threshold_pass_rate.R A modules/metrics/search/search_threshold_pass_rate_langproj_breakdown 10 files changed, 209 insertions(+), 28 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden refs/changes/07/344207/1 diff --git a/modules/metrics/search/app_event_counts_wiki_breakdown.sql b/modules/metrics/search/app_event_counts_wiki_breakdown.sql new file mode 100644 index 000..4a2b141 --- /dev/null +++ b/modules/metrics/search/app_event_counts_wiki_breakdown.sql @@ -0,0 +1,32 @@ +SELECT + date, wiki, action, platform, COUNT(*) AS events +FROM ( + SELECT +DATE('{from_timestamp}') AS date, +wiki, +CASE event_action WHEN 'click' THEN 'clickthroughs' + WHEN 'start' THEN 'search sessions' + WHEN 'results' THEN 'Result pages opened' + END AS action, +CASE WHEN INSTR(userAgent, 'Android') > 0 THEN 'Android' + ELSE 'iOS' END AS platform + FROM MobileWikiAppSearch_10641988 + WHERE +timestamp >= '{from_timestamp}' AND timestamp < '{to_timestamp}' +AND event_action IN ('click', 'start', 'results') + UNION ALL + SELECT 
+DATE('{from_timestamp}') AS date, +wiki, +CASE event_action WHEN 'click' THEN 'clickthroughs' + WHEN 'start' THEN 'search sessions' + WHEN 'results' THEN 'Result pages opened' + END AS action, +CASE WHEN INSTR(userAgent, 'Android') > 0 THEN 'Android' + ELSE 'iOS' END AS platform + FROM MobileWikiAppSearch_15729321 + WHERE +timestamp >= '{from_timestamp}' AND timestamp < '{to_timestamp}' +AND event_action IN ('click', 'start', 'results') +) AS MobileWikiAppSearch +GROUP BY date, wiki, action, platform; diff --git a/modules/metrics/search/config.yaml b/modules/metrics/search/config.yaml index b397e08..9730121 100644 --- a/modules/metrics/search/config.yaml +++ b/modules/metrics/search/config.yaml @@ -13,6 +13,12 @@ starts: 2014-12-05 funnel: true type: sql +app_event_counts_wiki_breakdown: +description: Clicks and other events by users searching on Android and iOS apps broken down by wiki ID +granularity: days +starts: 2014-12-05 +funnel: true +type: sql app_load_times: description: User-perceived load times when searching on Android and iOS apps granularity: days @@ -37,6 +43,12 @@ starts: 2015-06-11 funnel: true type: sql +mobile_event_counts_wiki_breakdown: +description: Clicks and other events by users searching on mobile web broken down by wiki ID +granularity: days +starts: 2015-06-11 +funnel: true +type: sql mobile_load_times: description: User-perceived load times when searching on mobile web granularity: days @@ -48,6 +60,12 @@ starts: 2015-04-14 funnel: true type: script +desktop_event_counts_langproj_breakdown: +description: Clicks and other events by users searching on desktop broken down by language-project pairs +granularity: days +starts: 2015-04-14 +funnel: true +type: script desktop_load_times: description: User-perceived load times when searching on desktop granularity: days @@ -55,6 +73,12 @@ type: script paulscore_approximations: description: Relevancy of our desktop search as measured by 
[PaulScore](https://www.mediawiki.org/wiki/Wikimedia_Discovery/Search/Glossary#PaulScore) +granularity: days +starts: 2016-10-25 +funnel: true +type: sql +paulscore_approximations_fulltext_wiki_breakdown: +description: Relevancy of our fulltext desktop search as measured by [PaulScore](https://www.mediawiki.org/wiki/Wikimedia_Discovery/Search/Glossary#PaulScore) broken down by wiki ID granularity: days starts:
[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Adds 'x' button to remove selected languages, countries
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/343931 ) Change subject: Adds 'x' button to remove selected languages, countries .. Adds 'x' button to remove selected languages, countries Change-Id: Ifcb0478f0b39273ec7aa86b8704e3a3bc25cf6eb --- M server.R M ui.R 2 files changed, 7 insertions(+), 7 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/server.R b/server.R index 2d82d55..77dca0e 100644 --- a/server.R +++ b/server.R @@ -514,7 +514,7 @@ if (input$lv_sort %in% c("top10", "bottom50")) { hidden(disabled(selectizeInput("lv_languages", "Wikipedia languages", lv_reactive$choices, lv_reactive$selected_langs, multiple = TRUE))) } else { - selectizeInput("lv_languages", "Wikipedia languages (12 max)", lv_reactive$choices, lv_reactive$selected_langs, multiple = TRUE, options = list(maxItems = 12)) + selectizeInput("lv_languages", "Wikipedia languages (12 max)", lv_reactive$choices, lv_reactive$selected_langs, multiple = TRUE, options = list(maxItems = 12, plugins = list("remove_button"))) } }) diff --git a/ui.R b/ui.R index 84a55cc..a39db01 100644 --- a/ui.R +++ b/ui.R @@ -44,8 +44,8 @@ menuSubItem(text = "Most Common Section", tabName = "most_common_by_country"), icon = icon("globe", lib = "glyphicon")), menuItem(text = "Global Settings", - selectInput(inputId = "smoothing_global", label = "Smoothing", selectize = TRUE, selected = "day", - choices = c("No Smoothing" = "day", "Weekly Median" = "week", "Monthly Median" = "month", "Splines" = "gam")), + selectizeInput(inputId = "smoothing_global", label = "Smoothing", selected = "day", + choices = c("No Smoothing" = "day", "Weekly Median" = "week", "Monthly Median" = "month", "Splines" = "gam")), br(style = "line-height:25%;"), icon = icon("cog", lib = "glyphicon")) ), div(icon("info-sign", lib = "glyphicon"), HTML("Tip: you can drag on the graphs with your mouse to zoom in on a particular date range."), style = "padding: 10px; color: black;"), @@ 
-270,7 +270,7 @@ conditionalPanel("(input.traffic_select=='events' || input.traffic_select=='visits' || input.traffic_select=='sessions') && !input.prop_a", checkboxInput("cntr_logscale_a", "Use Log scale", FALSE))), column(width = 5, conditionalPanel("input.cntr_sort_a == 'custom_a'", -selectizeInput("cntr_a", "Countries", choices = sort(c(unique(all_country_data$country), "United States")), selected = c("United Kingdom", "Germany", "India", "Canada", "U.S. (South)"), multiple = TRUE, width = "100%"))) +selectizeInput("cntr_a", "Countries", choices = sort(c(unique(all_country_data$country), "United States")), selected = c("United Kingdom", "Germany", "India", "Canada", "U.S. (South)"), multiple = TRUE, options = list(plugins = list("remove_button")), width = "100%"))) ), fluidRow( highcharter::highchartOutput("traffic_pie_pl", height = "500px"), @@ -320,7 +320,7 @@ conditionalPanel("!input.prop_f", checkboxInput("cntr_logscale_f", "Use Log scale", FALSE))), column(width = 5, conditionalPanel("input.cntr_sort_f == 'custom_f'", -selectizeInput("cntr_f", "Countries", choices = sort(c(unique(all_country_data$country),"United States")), selected = c("United Kingdom", "Germany", "India", "Canada", "U.S. (South)"), multiple = TRUE, width = "100%"))) +selectizeInput("cntr_f", "Countries", choices = sort(c(unique(all_country_data$country),"United States")), selected = c("United Kingdom", "Germany", "India", "Canada", "U.S. (South)"), multiple = TRUE, options = list(plugins = list("remove_button")), width = "100%"))) ), fluidRow( highcharter::highchartOutput("first_visit_pie_pl", height = "500px"), @@ -369,7 +369,7 @@ conditionalPanel("!input.prop_l", checkboxInput("cntr_logscale_l", "Use Log scale", FALSE))), column(width = 5, conditionalPanel("input.cntr_sort_l == 'custom_l'", -selectizeInput("cntr_l", "Countries", choices = sort(c(unique(all_country_data$country), "United States")), selected = c("United Kingdom", "Germany", "India", "Canada", "U.S. 
(South)"), multiple = TRUE, width = "100%"))) +selectizeInput("cntr_l", "Countries", choices = sort(c(unique(all_country_data$country), "United States")), selected = c("United
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Enable forecasting modules
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/343323 )

Change subject: Enable forecasting modules

Enable forecasting modules

- Fixes dormant forecasting modules
  - Search
    - Cirrus API
    - ZRR
  - Wikidata Query Service
    - Homepage traffic
    - SPARQL endpoint usage
- Wakes up forecasting in main.sh
- Augments test.R for forecasting

Testing command:
```
Rscript test.R --disable_metrics --start_date=2017-03-01 --end_date=2017-03-02 >> test_`date +%F_%T`.log.md 2>&1
```

Bug: T112170
Change-Id: I4a1d9591ab73ed45ef8f234bcdb0a528c120cf77
---
M README.md
M main.sh
M modules/forecasts/forecast.R
M modules/forecasts/search/api_cirrus_arima
M modules/forecasts/search/api_cirrus_bsts
M modules/forecasts/search/config.yaml
M modules/forecasts/search/zrr_overall_arima
M modules/forecasts/search/zrr_overall_bsts
M modules/forecasts/wdqs/config.yaml
M modules/forecasts/wdqs/homepage_traffic_arima
M modules/forecasts/wdqs/homepage_traffic_bsts
M modules/forecasts/wdqs/sparql_usage_arima
M modules/forecasts/wdqs/sparql_usage_bsts
M test.R
14 files changed, 99 insertions(+), 51 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/README.md b/README.md
index bcdbe08..f49c6e9 100644
--- a/README.md
+++ b/README.md
@@ -246,7 +246,7 @@
 DATE('{from_timestamp}') AS date,
 ...,
 COUNT(*) AS events
-FROM
+FROM {Schema_Revision}
 WHERE timestamp >= '{from_timestamp}' AND timestamp < '{to_timestamp}'
 GROUP BY date, ...;
 ```
@@ -403,11 +403,13 @@
 ```bash
 #!/bin/bash
 
-Rscript modules/forecasts/forecast.R --date=$1 --metric=[your forecasted metric] --model=[ARIMA [--bootstrap_ci]|BSTS]
+Rscript modules/forecasts/forecast.R --date=$2 --metric=[your forecasted metric] --model=[ARIMA [--bootstrap_ci]|BSTS]
 ```
 
 Change the `--metric` and `--model` arguments accordingly. The actual data-reading and metric-forecasting calls are in a switch statement in [modules/forecasts/forecast.R](modules/forecasts/forecast.R). Don't forget to add the forecasted metric to the `--metric` option's help text at the top of **forecast.R** and don't forget to subset the data after reading it in (e.g. `dplyr::filter(data, date < as.Date(opt$date))`)
 
+**Note** the `--date=$2` in there instead of `--date=$1`. This is because Reportupdater passes a *start date* and an *end date* to every script it runs, with the goal of generating a report for *start date*. However, with forecasting modules we're actually interested in generating a report for *end date* after observing the latest metric for *start date*.
+
 ## Additional Information
 
 This repository can be browsed in [Phabricator/Diffusion](https://phabricator.wikimedia.org/diffusion/WDGO/), but is also (read-only) mirrored to [GitHub](https://github.com/wikimedia/wikimedia-discovery-golden/).
diff --git a/main.sh b/main.sh
index 150c2fe..19ad53b 100644
--- a/main.sh
+++ b/main.sh
@@ -21,8 +21,8 @@
 done
 
 # Forecasts (dependent on latest metrics)
-# for module in "search" "wdqs"
-# do
-#   echo "Running Reportupdater on ${module} forecasts..."
-#   reportupdater/update_reports.py "modules/forecasts/${module}" "/a/aggregate-datasets/discovery-forecasts/${module}"
-# done
+for module in "search" "wdqs"
+do
+  echo "Running Reportupdater on ${module} forecasts..."
+  reportupdater/update_reports.py "modules/forecasts/${module}" "/a/aggregate-datasets/discovery-forecasts/${module}"
+done
diff --git a/modules/forecasts/forecast.R b/modules/forecasts/forecast.R
index 9c30a2f..bcbc850 100644
--- a/modules/forecasts/forecast.R
+++ b/modules/forecasts/forecast.R
@@ -12,14 +12,20 @@
   * wdqs_sparql"),
   make_option("--model", default = NA, action = "store", type = "character",
               help = "Available: ARIMA, BSTS"),
-  make_option("--iters", default = 1000, action = "store", type = "numeric",
+  make_option("--iters", default = 1, action = "store", type = "numeric",
               help = "Number of MCMC iterations to keep in BSTS models [default %default]"),
-  make_option("--burnin", default = 500, action = "store", type = "numeric",
+  make_option("--burnin", default = 1000, action = "store", type = "numeric",
               help = "Number of iterations to use as burn-in in BSTS models [default %default]")
 )
 
 read_data <- function(path, ...) {
-  return(readr::read_tsv(file.path("/a/aggregate-datasets/discovery/", path), ...))
+  if (grepl("^stat[0-9]{4}$", Sys.info()["nodename"])) {
+    # Use local datasets if run on stat1002
+    return(readr::read_tsv(file.path("/a/aggregate-datasets", path), ...))
+  } else {
+    # Download from datasets.wikimedia.org otherwise
+    return(polloi::read_dataset(path, ...))
+  }
 }
 
 # Get command line options, if help option encountered print help and exit,
@@ -27,32 +33,45 @@
 opt <- parse_args(OptionParser(option_list = option_list))
 if
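
The `--date=$2` note in the README change above can be illustrated with a small wrapper. This is only a sketch: the function name `build_forecast_command` is hypothetical, and the metric (`wdqs_sparql`) and model (`ARIMA`) are picked from the option help text purely for illustration. It prints the command rather than running it, so it does not need `forecast.R` to be present.

```shell
#!/bin/bash
# Hypothetical wrapper modelled on the README template in this change.
# Reportupdater calls each script as: <script> <start_date> <end_date>

build_forecast_command() {
  # $1 = start date: the day whose metric has just been observed (not used)
  # $2 = end date: the day the forecast is generated for
  echo "Rscript modules/forecasts/forecast.R --date=$2 --metric=wdqs_sparql --model=ARIMA"
}

# Same dates as the testing command in the commit message
build_forecast_command 2017-03-01 2017-03-02
```

With these arguments the forecast is requested for 2017-03-02, the end date, which matches the behaviour the README note describes.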
[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Adds permission and submodule checking
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/343322 )

Change subject: Adds permission and submodule checking

Adds permission and submodule checking

"You better check yo self before you wreck yo self"

Bug: T160772
Change-Id: Ia53e8826af757fa4435c71273d6d9eac86864c23
---
M main.sh
1 file changed, 13 insertions(+), 0 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/main.sh b/main.sh
index 17ad051..150c2fe 100644
--- a/main.sh
+++ b/main.sh
@@ -1,5 +1,18 @@
 #!/bin/bash
 
+# Check if modules/forecasts/forecast.R has execution permission for Reportupdater
+# (If it doesn't, then other R and shell scripts in modules/ probably don't either.)
+if [ `ls -l modules/forecasts | grep -e forecast.R | grep -e "-rwxrwxr-x" | wc -l` == "0" ]; then
+  echo "Warning: modules do not have execution permission; granting now..."
+  chmod +x -R modules/
+fi
+
+# Check if Reportupdater git submodule is set up
+if [ ! -f reportupdater/update_reports.py ]; then
+  echo "Warning: Reportupdater needs to be initialized and updated..."
+  git submodule init && git submodule update
+fi
+
 # Metrics
 for module in "external_traffic" "wdqs" "maps" "search" "portal"
 do

--
To view, visit https://gerrit.wikimedia.org/r/343322
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia53e8826af757fa4435c71273d6d9eac86864c23
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Bearloga
Gerrit-Reviewer: Chelsyx

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
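
The permission check in this change matches the exact mode string `-rwxrwxr-x` in `ls -l` output, so a script with mode 755 (`-rwxr-xr-x`) would also be treated as lacking execute permission. A possible alternative, sketched below, tests the owner-execute bit with `find` instead. The function names `count_non_executable` and `ensure_executable` are hypothetical and are not part of `main.sh`.

```shell
#!/bin/bash
# Sketch only: helper names are illustrative, not taken from the repository.

# Count regular files under directory $1 that lack the owner-execute bit.
count_non_executable() {
  find "$1" -type f ! -perm -u+x | wc -l | tr -d ' '
}

# Grant execute permission recursively, but only if some file is missing it.
ensure_executable() {
  if [ "$(count_non_executable "$1")" != "0" ]; then
    echo "Warning: modules do not have execution permission; granting now..."
    chmod -R +x "$1"
  fi
}
```

In `main.sh` this would be called as `ensure_executable modules/`. Like the original `chmod +x -R modules/`, it also marks non-script files such as `config.yaml` as executable.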
[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Annotate Reportupdater migration on graphs
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/341743 ) Change subject: Annotate Reportupdater migration on graphs .. Annotate Reportupdater migration on graphs Bug: T150915 Change-Id: Idd8b46e61db9e33788d2be63564c3dc40334dc5f --- M server.R M tab_documentation/action_breakdown.md M tab_documentation/applinks.md M tab_documentation/clickthrough_rate.md M tab_documentation/dwelltime.md M tab_documentation/first_visit.md M tab_documentation/geography.md M tab_documentation/languages_summary.md M tab_documentation/languages_visited.md M tab_documentation/most_common.md M tab_documentation/pageviews.md M tab_documentation/referers_byengine.md M tab_documentation/referers_summary.md M tab_documentation/sisproj.md M utils.R 15 files changed, 95 insertions(+), 63 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/server.R b/server.R index 9da5d5a..2d82d55 100644 --- a/server.R +++ b/server.R @@ -51,7 +51,8 @@ dyEvent(as.Date("2016-05-18"), "Sister Links Updated", labelLoc = "bottom", color = "white") %>% dyEvent(as.Date("2016-06-02"), "Detect Language Deployed", labelLoc = "bottom", color = "white") %>% dyEvent(as.Date("2016-08-16"), "Secondary Links Collapsed", labelLoc = "bottom", color = "white") %>% - dyEvent(as.Date("2016-09-13"), "B (schema switch)", labelLoc = "bottom", color = "white") + dyEvent(as.Date("2016-09-13"), "B (schema switch)", labelLoc = "bottom", color = "white") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom", color = "white") }) output$action_breakdown_dygraph <- renderDygraph({ @@ -68,7 +69,8 @@ dyEvent(as.Date("2016-05-18"), "Sister Links Updated", labelLoc = "bottom", color = "white") %>% dyEvent(as.Date("2016-06-02"), "Detect Language Deployed", labelLoc = "bottom", color = "white") %>% dyEvent(as.Date("2016-08-16"), "Secondary Links Collapsed", labelLoc = "bottom", color = "white") %>% - dyEvent(as.Date("2016-09-13"), "B (schema switch)", 
labelLoc = "bottom", color = "white") + dyEvent(as.Date("2016-09-13"), "B (schema switch)", labelLoc = "bottom", color = "white") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom", color = "white") }) output$most_common_dygraph <- renderDygraph({ @@ -83,7 +85,8 @@ dyEvent(as.Date("2016-05-18"), "Sister Links Updated", labelLoc = "bottom", color = "white") %>% dyEvent(as.Date("2016-06-02"), "Detect Language Deployed", labelLoc = "bottom", color = "white") %>% dyEvent(as.Date("2016-08-16"), "Secondary Links Collapsed", labelLoc = "bottom", color = "white") %>% - dyEvent(as.Date("2016-09-13"), "A (schema switch)", labelLoc = "bottom", color = "white") + dyEvent(as.Date("2016-09-13"), "A (schema switch)", labelLoc = "bottom", color = "white") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom", color = "white") }) output$first_visit_dygraph <- renderDygraph({ @@ -99,7 +102,8 @@ dyEvent(as.Date("2016-05-18"), "Sister Links Updated", labelLoc = "bottom", color = "white") %>% dyEvent(as.Date("2016-06-02"), "Detect Language Deployed", labelLoc = "bottom", color = "white") %>% dyEvent(as.Date("2016-08-16"), "Secondary Links Collapsed", labelLoc = "bottom", color = "white") %>% - dyEvent(as.Date("2016-09-13"), "A (schema switch)", labelLoc = "bottom", color = "white") + dyEvent(as.Date("2016-09-13"), "A (schema switch)", labelLoc = "bottom", color = "white") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom", color = "white") }) output$dwelltime_dygraph <- renderDygraph({ @@ -115,7 +119,8 @@ dyEvent(as.Date("2016-05-18"), "Sister Links Updated", labelLoc = "bottom", color = "white") %>% dyEvent(as.Date("2016-06-02"), "Detect Language Deployed", labelLoc = "bottom", color = "white") %>% dyEvent(as.Date("2016-08-16"), "Secondary Links Collapsed", labelLoc = "bottom", color = "white") %>% - dyEvent(as.Date("2016-09-13"), "B (schema switch)", labelLoc = "bottom", color = "white") + 
dyEvent(as.Date("2016-09-13"), "B (schema switch)", labelLoc = "bottom", color = "white") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom", color = "white") }) output$sisproj_dygraph <- renderDygraph({ @@ -137,7 +142,8 @@ polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_sisproj), rename = FALSE) %>% polloi::make_dygraph("Date", ifelse(input$sisproj_type == "prop", "Proportion (%)", input$sisproj_metric), paste(ifelse(input$sisproj_metric == "Clicks", "Clicks", "Users who clicked"), "on links other Wikimedia Foundation projects")) %>% - dyLegend(labelsDiv =
[MediaWiki-commits] [Gerrit] wikimedia...twilightsparql[master]: Annotate Reportupdater migration on graphs
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/341746 ) Change subject: Annotate Reportupdater migration on graphs .. Annotate Reportupdater migration on graphs Bug: T150915 Change-Id: I51e717f7a0f9782e6d4d0261ab60264aa98a64b2 --- M server.R M tab_documentation/wdqs_usage.md M tab_documentation/wdqs_visits.md 3 files changed, 13 insertions(+), 8 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/server.R b/server.R index 0a9ec5f..1974ddb 100644 --- a/server.R +++ b/server.R @@ -27,7 +27,8 @@ dyAxis("y", logscale = input$usage_logscale) %>% dyLegend(labelsDiv = "usage_legend") %>% dyRangeSelector %>% - dyEvent(as.Date("2017-01-01"), "D (Started tracking LDF usage)", labelLoc = "bottom") + dyEvent(as.Date("2017-01-01"), "D (Started tracking LDF usage)", labelLoc = "bottom") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") ) output$sparql_usage_plot <- renderDygraph( @@ -43,7 +44,8 @@ dyRangeSelector %>% dyEvent(as.Date("2015-09-07"), "A (Announcement)", labelLoc = "bottom") %>% dyEvent(as.Date("2015-11-05"), "B (Labs bot)", labelLoc = "bottom") %>% - dyEvent(as.Date("2016-12-28"), "C (Bot ruleset)", labelLoc = "bottom") + dyEvent(as.Date("2016-12-28"), "C (Bot ruleset)", labelLoc = "bottom") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") ) output$wdqs_visits_plot <- renderDygraph( @@ -59,7 +61,8 @@ # ...because we're using dygraphs' native log-scaling: dyAxis("y", logscale = input$visits_logscale) %>% dyLegend(labelsDiv = "wdqs_visits_legend") %>% - dyEvent(as.Date("2015-09-07"), "A (Announcement)", labelLoc = "bottom") + dyEvent(as.Date("2015-09-07"), "A (Announcement)", labelLoc = "bottom") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") ) # Check datasets for missing data and notify user which datasets are missing data (if any) diff --git a/tab_documentation/wdqs_usage.md 
b/tab_documentation/wdqs_usage.md index 3d6cc3f..158a6c6 100644 --- a/tab_documentation/wdqs_usage.md +++ b/tab_documentation/wdqs_usage.md @@ -6,10 +6,11 @@ Outages and inaccuracies -- -- **'A'**: We announced WDQS to the public. -- **'B'**: From 2015-11-04 to 2015-11-06 there was what we believe to be a broken bot responsible for 21+ million requests. -- **'C'**: As part of a refactoring to a new metric-generating framework (see [T150915](https://phabricator.wikimedia.org/T150915)), we revised the ruleset for determining when a request came from a bot/tool. For example, requests with URLs and email addresses in the UserAgent were classified as automata after 2016-12-28. -- **'D'**: We started tracking LDF endpoint usage on 2017-01-01. See [T153936](https://phabricator.wikimedia.org/T153936) and [T136358](https://phabricator.wikimedia.org/T136358) for more details. +* '__A__': We announced WDQS to the public. +* '__B__': From 2015-11-04 to 2015-11-06 there was what we believe to be a broken bot responsible for 21+ million requests. +* '__C__': As part of a refactoring to a new metric-generating framework (see [T150915](https://phabricator.wikimedia.org/T150915)), we revised the ruleset for determining when a request came from a bot/tool. For example, requests with URLs and email addresses in the UserAgent were classified as automata after 2016-12-28. +* '__D__': We started tracking LDF endpoint usage on 2017-01-01. See [T153936](https://phabricator.wikimedia.org/T153936) and [T136358](https://phabricator.wikimedia.org/T136358) for more details. +* '__R__': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of [our data retrieval and processing codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' [Reportupdater infrastructure](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater). 
See [T150915](https://phabricator.wikimedia.org/T150915) for more details. Questions, bug reports, and feature suggestions -- diff --git a/tab_documentation/wdqs_visits.md b/tab_documentation/wdqs_visits.md index 02fffcf..192eb2e 100644 --- a/tab_documentation/wdqs_visits.md +++ b/tab_documentation/wdqs_visits.md @@ -6,7 +6,8 @@ Outages and inaccuracies -- -- **'A'**: We announced WDQS to the public. +* '__A__': We announced WDQS to the public. +* '__R__': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of [our data retrieval and processing codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' [Reportupdater
[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Annotate Reportupdater migration on graphs
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/341744 ) Change subject: Annotate Reportupdater migration on graphs .. Annotate Reportupdater migration on graphs Bug: T150915 Change-Id: If916e90d5b11e6a2ee6f9582b0603d6d7b224b9e --- M server.R M tab_documentation/traffic_byengine.md M tab_documentation/traffic_summary.md 3 files changed, 14 insertions(+), 10 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/server.R b/server.R index 1888f89..b0863cf 100644 --- a/server.R +++ b/server.R @@ -7,12 +7,12 @@ existing_date <- Sys.Date() - 1 function(input, output, session) { - + if (Sys.Date() != existing_date) { read_traffic() existing_date <<- Sys.Date() } - + output$traffic_summary_dygraph <- renderDygraph({ summary_traffic_data[[input$platform_traffic_summary]] %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_traffic_summary)) %>% @@ -21,9 +21,10 @@ dyLegend(labelsDiv = "traffic_summary_legend", show = "always", showZeroValues = FALSE) %>% dyRangeSelector %>% dyEvent(as.Date("2016-03-07"), "A (new UDF)", labelLoc = "bottom") %>% - dyEvent(as.Date("2016-06-26"), "B (DuckDuckGo)", labelLoc = "bottom") + dyEvent(as.Date("2016-06-26"), "B (DuckDuckGo)", labelLoc = "bottom") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) - + output$traffic_bysearch_dygraph <- renderDygraph({ bysearch_traffic_data[[input$platform_traffic_bysearch]] %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_traffic_bysearch)) %>% @@ -32,9 +33,10 @@ dyLegend(labelsDiv = "traffic_bysearch_legend", show = "always", showZeroValues = FALSE) %>% dyAxis("y", logscale = input$platform_traffic_bysearch_log) %>% dyRangeSelector(fillColor = "", strokeColor = "") %>% - dyEvent(as.Date("2016-06-26"), "A (DuckDuckGo)", labelLoc = "bottom") + dyEvent(as.Date("2016-06-26"), "A (DuckDuckGo)", labelLoc = "bottom") %>% 
+ dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) - + # Check datasets for missing data and notify user which datasets are missing data (if any) output$message_menu <- renderMenu({ notifications <- list( @@ -43,5 +45,5 @@ notifications <- notifications[!sapply(notifications, is.null)] return(dropdownMenu(type = "notifications", .list = notifications)) }) - + } diff --git a/tab_documentation/traffic_byengine.md b/tab_documentation/traffic_byengine.md index b6e7571..30d33ac 100644 --- a/tab_documentation/traffic_byengine.md +++ b/tab_documentation/traffic_byengine.md @@ -11,7 +11,8 @@ Outages and notes -- -- **A**: On 25 August 2016 we patched the UDF to also look for [Duck Duck Go](https://duckduckgo.com) when it processes referer data. That referreral data was deleted and backfilled from 26 June 2016. See [T143287](https://phabricator.wikimedia.org/T143287) for more details. +* '__A__': on 2016-08-25 we patched the UDF to also look for [Duck Duck Go](https://duckduckgo.com) when it processes referer data. That referreral data was deleted and backfilled from 26 June 2016. See [T143287](https://phabricator.wikimedia.org/T143287) for more details. +* '__R__': on 2017-01-01 we started calculating all of Discovery's metrics using a new version of [our data retrieval and processing codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' [Reportupdater infrastructure](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater). See [T150915](https://phabricator.wikimedia.org/T150915) for more details. 
Questions, bug reports, and feature suggestions -- diff --git a/tab_documentation/traffic_summary.md b/tab_documentation/traffic_summary.md index b1b7cf6..e8f0919 100644 --- a/tab_documentation/traffic_summary.md +++ b/tab_documentation/traffic_summary.md @@ -10,9 +10,10 @@ Outages and notes -- -- **A**: We switched to a finalized version of the UDF that extracts internal traffic (see [T130083](https://phabricator.wikimedia.org/T130083)) -- **B**: On 25 August 2016 we patched the UDF to also look for [Duck Duck Go](https://duckduckgo.com) when it processes referer data. That referreral data was deleted and backfilled from 26 June 2016. See [T143287](https://phabricator.wikimedia.org/T143287) for more details. +* '__A__': We switched to a finalized version of the UDF that extracts internal traffic (see [T130083](https://phabricator.wikimedia.org/T130083)) +* '__B__': on 25 August 2016 we patched the UDF to also look for [Duck Duck Go](https://duckduckgo.com) when it processes referer data. That referreral data was deleted and backfilled from 26 June
[MediaWiki-commits] [Gerrit] wikimedia...wetzel[master]: Annotate Reportupdater migration on graphs
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/341745 ) Change subject: Annotate Reportupdater migration on graphs .. Annotate Reportupdater migration on graphs Bug: T150915 Change-Id: I8bc4bd8ca8883e947ca77439edb0f5af47a6be8a --- M server.R M tab_documentation/geo_breakdown.md M tab_documentation/geohack_usage.md M tab_documentation/tiles_summary.md M tab_documentation/tiles_total_by_zoom.md M tab_documentation/tiles_users_by_style.md M tab_documentation/unique_users.md M tab_documentation/wikiminiatlas_usage.md M tab_documentation/wikivoyage_usage.md M tab_documentation/wiwosm_usage.md 10 files changed, 42 insertions(+), 21 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/server.R b/server.R index d3353e7..d24e621 100644 --- a/server.R +++ b/server.R @@ -40,7 +40,8 @@ dyEvent(as.Date("2015-09-17"), "A (announcement)", labelLoc = "bottom") %>% dyEvent(as.Date("2016-01-08"), "B (enwiki launch)", labelLoc = "bottom") %>% dyEvent(as.Date("2016-01-12"), "C (cache clear)", labelLoc = "bottom") %>% - dyEvent(as.Date("2016-11-09"), "D (pkget)", labelLoc = "bottom") + dyEvent(as.Date("2016-11-09"), "D (pkget)", labelLoc = "bottom") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) output$tiles_style_series <- renderDygraph({ @@ -59,7 +60,8 @@ dyEvent(as.Date("2015-09-17"), "A (announcement)", labelLoc = "bottom") %>% dyEvent(as.Date("2016-01-08"), "B (enwiki launch)", labelLoc = "bottom") %>% dyEvent(as.Date("2016-01-12"), "C (cache clear)", labelLoc = "bottom") %>% - dyEvent(as.Date("2016-11-09"), "D (pkget)", labelLoc = "top") + dyEvent(as.Date("2016-11-09"), "D (pkget)", labelLoc = "top") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) output$tiles_users_series <- renderDygraph({ @@ -78,7 +80,8 @@ dyEvent(as.Date("2015-09-17"), "A (announcement)", labelLoc = "bottom") %>% dyEvent(as.Date("2016-01-08"), "B (enwiki launch)", labelLoc 
= "bottom") %>% dyEvent(as.Date("2016-01-12"), "C (cache clear)", labelLoc = "bottom") %>% - dyEvent(as.Date("2016-11-08"), "D (pkget)", labelLoc = "top") + dyEvent(as.Date("2016-11-08"), "D (pkget)", labelLoc = "top") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) output$zoom_level_selector_container <- renderUI({ @@ -99,7 +102,8 @@ polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_tiles_zoom_series)) %>% polloi::make_dygraph("Date", "Tiles", "Total tiles by zoom level") %>% dyAxis("y", logscale = input$tiles_zoom_logscale) %>% - dyLegend(labelsDiv = "tiles_zoom_series_legend", show = "always") + dyLegend(labelsDiv = "tiles_zoom_series_legend", show = "always") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) output$users_per_platform <- renderDygraph({ @@ -110,7 +114,8 @@ dyLegend(labelsDiv = "users_per_platform_legend", show = "always") %>% dyRangeSelector %>% dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>% - dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") + dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) output$geohack_feature_usage <- renderDygraph({ @@ -120,7 +125,8 @@ dyRangeSelector %>% dyAxis("y", logscale = input$geohack_feature_usage_logscale) %>% dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>% - dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") + dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) output$wikiminiatlas_feature_usage <- renderDygraph({ @@ -130,7 +136,8 @@ dyRangeSelector %>% dyAxis("y", logscale = input$wikiminiatlas_feature_usage_logscale) %>% dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>% - 
dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") + dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) output$wikivoyage_feature_usage <- renderDygraph({ @@ -140,7 +147,8 @@ dyRangeSelector %>% dyAxis("y", logscale = input$wikivoyage_feature_usage_logscale) %>% dyEvent(as.Date("2016-04-15"), "A (Maps EL bug)", labelLoc = "bottom") %>% - dyEvent(as.Date("2016-06-17"), "A (Maps EL patch)", labelLoc = "bottom") +
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[master]: Annotate Reportupdater migration on graphs
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/341742 ) Change subject: Annotate Reportupdater migration on graphs .. Annotate Reportupdater migration on graphs Bug: T150915 Change-Id: Ie650b1eb0f5c9cc40e43a316b71f44e0b8b8cab7 --- M server.R M tab_documentation/app_events.md M tab_documentation/app_load.md M tab_documentation/click_position.md M tab_documentation/desktop_events.md M tab_documentation/desktop_load.md M tab_documentation/failure_breakdown.md M tab_documentation/failure_langproj.md M tab_documentation/failure_rate.md M tab_documentation/failure_suggests.md M tab_documentation/fulltext_basic.md M tab_documentation/geo_basic.md M tab_documentation/invoke_source.md M tab_documentation/kpi_api_usage.md M tab_documentation/kpi_augmented_clickthroughs.md M tab_documentation/kpi_load_time.md M tab_documentation/kpi_zero_results.md M tab_documentation/language_basic.md M tab_documentation/mobile_events.md M tab_documentation/mobile_load.md M tab_documentation/open_basic.md M tab_documentation/paulscore_approx.html M tab_documentation/prefix_basic.md M tab_documentation/survival.md 24 files changed, 100 insertions(+), 57 deletions(-) Approvals: Chelsyx: Verified; Looks good to me, approved diff --git a/server.R b/server.R index 343dd5f..158ecf5 100644 --- a/server.R +++ b/server.R @@ -69,7 +69,8 @@ polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_desktop_event)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Desktop search events, by day") %>% dyRangeSelector %>% - dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") + dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) output$desktop_load_plot <- renderDygraph({ @@ -77,7 +78,8 @@ polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_desktop_load)) 
%>% polloi::make_dygraph(xlab = "Date", ylab = "Load time (ms)", title = "Desktop load times, by day", use_si = FALSE) %>% dyRangeSelector %>% - dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") + dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) output$paulscore_approx_plot_fulltext <- renderDygraph({ @@ -149,14 +151,16 @@ mobile_dygraph_set %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_mobile_event)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Mobile search events, by day") %>% - dyRangeSelector + dyRangeSelector %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) output$mobile_load_plot <- renderDygraph({ mobile_load_data %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_mobile_load)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Load time (ms)", title = "Mobile search events, by day", use_si = FALSE) %>% - dyRangeSelector + dyRangeSelector %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) ## App value boxes @@ -192,28 +196,32 @@ android_dygraph_set %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_app_event)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Android mobile app search events, by day") %>% - dyRangeSelector + dyRangeSelector %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) output$android_load_plot <- renderDygraph({ android_load_data %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_app_load)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Load time (ms)", title = "Android result load times, by day", use_si = FALSE) %>% - dyRangeSelector + dyRangeSelector %>% + dyEvent(as.Date("2017-01-01"), "R 
(reportupdater)", labelLoc = "bottom") }) output$ios_event_plot <- renderDygraph({ ios_dygraph_set %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_app_event)) %>% polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "iOS mobile app search events, by day") %>% - dyRangeSelector + dyRangeSelector %>% + dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") }) output$ios_load_plot <- renderDygraph({ ios_load_data %>% polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global,
[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: Fixed bug in tab country_breakdown
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/341373 )

Change subject: Fixed bug in tab country_breakdown

Fixed bug in tab country_breakdown

Bug: T150915
Change-Id: If0e96f2fc25edc8a094367ac61d7e21879687d2e
---
M utils.R
1 file changed, 1 insertion(+), 1 deletion(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/prince refs/changes/73/341373/1

diff --git a/utils.R b/utils.R
index 6548a0e..67a1594 100644
--- a/utils.R
+++ b/utils.R
@@ -49,7 +49,7 @@
   country_data <<- tidyr::spread(
     dplyr::distinct(interim, date, country, .keep_all = TRUE),
     country, events, fill = NA
-  )
+  ) %>% as.data.frame()
   return(invisible())
 }

--
To view, visit https://gerrit.wikimedia.org/r/341373
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If0e96f2fc25edc8a094367ac61d7e21879687d2e
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/prince
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] wikimedia...wonderbolt[master]: Note about internally referred traffic being miscategorized
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/338279 )

Change subject: Note about internally referred traffic being miscategorized

Note about internally referred traffic being miscategorized

Bug: T154722
Change-Id: I57c8878519efd943b476c28de3a50e2989c99307
---
M tab_documentation/traffic_summary.md
1 file changed, 1 insertion(+), 0 deletions(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/wonderbolt refs/changes/79/338279/1

diff --git a/tab_documentation/traffic_summary.md b/tab_documentation/traffic_summary.md
index 5cf797d..b1b7cf6 100644
--- a/tab_documentation/traffic_summary.md
+++ b/tab_documentation/traffic_summary.md
@@ -12,6 +12,7 @@
 --
 - **A**: We switched to a finalized version of the UDF that extracts internal traffic (see [T130083](https://phabricator.wikimedia.org/T130083))
 - **B**: On 25 August 2016 we patched the UDF to also look for [Duck Duck Go](https://duckduckgo.com) when it processes referer data. That referreral data was deleted and backfilled from 26 June 2016. See [T143287](https://phabricator.wikimedia.org/T143287) for more details.
+- On 22 February 2016, a bug was introduced and some of the internally referred traffic are miscategorized as none. See [T148780](https://phabricator.wikimedia.org/T148780) and [T154722](https://phabricator.wikimedia.org/T154722) for more details.

 Questions, bug reports, and feature suggestions
 --

--
To view, visit https://gerrit.wikimedia.org/r/338279

Gerrit-MessageType: newchange
Gerrit-Change-Id: I57c8878519efd943b476c28de3a50e2989c99307
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/wonderbolt
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...wmf[master]: Change MySQL config file
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/333132 )

Change subject: Change MySQL config file

Change MySQL config file

Change-Id: I3d9cf2f9f6a8a48a55a7e00fa42eb3d38572ecf6
---
M R/mysql.R
1 file changed, 2 insertions(+), 2 deletions(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/wmf refs/changes/32/333132/1

diff --git a/R/mysql.R b/R/mysql.R
index 12e2fb7..510e4d5 100644
--- a/R/mysql.R
+++ b/R/mysql.R
@@ -38,8 +38,8 @@
 #'@export
 mysql_connect <- function(database, default_file = NULL) {
   if (is.null(default_file)) {
-    default_file = "/etc/mysql/conf.d/stats-research-client.cnf"
-    # there's also "/etc/mysql/conf.d/analytics-research-client.cnf"
+    default_file = "/etc/mysql/conf.d/analytics-research-client.cnf"
+    # there's also "/etc/mysql/conf.d/stats-research-client.cnf"
   }
   if (RMySQL_version() > 93) {
     con <- dbConnect(drv = RMySQL::MySQL(),

--
To view, visit https://gerrit.wikimedia.org/r/333132

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3d9cf2f9f6a8a48a55a7e00fa42eb3d38572ecf6
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/wmf
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
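The change above swaps which client config file `mysql_connect()` falls back to when `default_file` is `NULL`. As a rough illustration of that kind of fallback, here is a minimal base-R sketch. The helper name `pick_default_file()` is hypothetical and is not part of the wmf package; only the two paths come from the diff.

```r
# Hypothetical helper (not part of the wmf package): return the first
# candidate MySQL client config file that exists on this machine.
pick_default_file <- function(candidates = c(
    "/etc/mysql/conf.d/analytics-research-client.cnf",
    "/etc/mysql/conf.d/stats-research-client.cnf"
  )) {
  existing <- candidates[file.exists(candidates)]
  if (length(existing) == 0) {
    # Fail loudly instead of handing a missing path to the DB driver.
    stop("No MySQL client config file found among: ",
         paste(candidates, collapse = ", "))
  }
  existing[1]
}
```

A caller could then pass the result as `default_file`, so the same code would work on hosts that only ship one of the two files. Whether that is desirable for the real package is a design choice this sketch does not settle.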
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[master]: Highlight sparklines according to date range selection on KP...
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/328593 ) Change subject: Highlight sparklines according to date range selection on KPI summary page .. Highlight sparklines according to date range selection on KPI summary page Bug: T150215 Change-Id: Ib0e06619b3c3e7069fcd227528bc87dd9a1c0bea --- M server.R 1 file changed, 81 insertions(+), 13 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/93/328593/1 diff --git a/server.R b/server.R index d5d34a8..62678bd 100644 --- a/server.R +++ b/server.R @@ -572,10 +572,27 @@ dplyr::select(Median) %>% unlist(use.names = FALSE) %>% round(2) -sparkline::sparkline(values = output_sl, type = "line", +sl1 <- sparkline::sparkline(values = output_sl, type = "line", height = 50, width = '100%', lineColor = 'black', fillColor = 'transparent', + chartRangeMin = min(output_sl), chartRangeMax = max(output_sl), highlightLineColor = 'orange', highlightSpotColor = 'orange') +# highlight selected date range +if (input$kpi_summary_date_range_selector == "weekly"){ + output_highlight <- c(rep(NA, length(output_sl)-7), output_sl[(length(output_sl)-6):length(output_sl)]) +} else if (input$kpi_summary_date_range_selector == "monthly"){ + output_highlight <- c(rep(NA, length(output_sl)-30), output_sl[(length(output_sl)-29):length(output_sl)]) +} else if (input$kpi_summary_date_range_selector == "quarterly"){ + output_highlight <- output_sl +} else { + return(sl1) +} +sl2 <- sparkline::sparkline(values = output_highlight, type = "line", +height = 50, width = '100%', lineWidth = 2, +lineColor = 'red', chartRangeMin = min(output_sl), chartRangeMax = max(output_sl), +minSpotColor = F, maxSpotColor = F, disableInteraction = T, +highlightLineColor = NULL, highlightSpotColor = NULL) +return(sparkline::spk_composite(sl1, sl2)) }) output$sparkline_zero_results <- sparkline:::renderSparkline({ if(input$kpi_summary_date_range_selector == "all"){ @@ -588,10 +605,27 @@ 
dplyr::select(rate) %>% unlist(use.names = FALSE) %>% round(2) -sparkline::sparkline(values = output_sl, type = "line", - height = 50, width = '100%', - lineColor = 'black', fillColor = 'transparent', - highlightLineColor = 'orange', highlightSpotColor = 'orange') +sl1 <- sparkline::sparkline(values = output_sl, type = "line", +height = 50, width = '100%', +lineColor = 'black', fillColor = 'transparent', +chartRangeMin = min(output_sl), chartRangeMax = max(output_sl), +highlightLineColor = 'orange', highlightSpotColor = 'orange') +# highlight selected date range +if (input$kpi_summary_date_range_selector == "weekly"){ + output_highlight <- c(rep(NA, length(output_sl)-7), output_sl[(length(output_sl)-6):length(output_sl)]) +} else if (input$kpi_summary_date_range_selector == "monthly"){ + output_highlight <- c(rep(NA, length(output_sl)-30), output_sl[(length(output_sl)-29):length(output_sl)]) +} else if (input$kpi_summary_date_range_selector == "quarterly"){ + output_highlight <- output_sl +} else { + return(sl1) +} +sl2 <- sparkline::sparkline(values = output_highlight, type = "line", +height = 50, width = '100%', lineWidth = 2, +lineColor = 'red', chartRangeMin = min(output_sl), chartRangeMax = max(output_sl), +minSpotColor = F, maxSpotColor = F, disableInteraction = T, +highlightLineColor = NULL, highlightSpotColor = NULL) +return(sparkline::spk_composite(sl1, sl2)) }) output$sparkline_api_usage <- sparkline:::renderSparkline({ if(input$kpi_summary_date_range_selector == "all"){ @@ -609,10 +643,27 @@ dplyr::summarize(total = sum(events)) %>% dplyr::select(total) %>% unlist(use.names = FALSE) -sparkline::sparkline(values = output_sl, type = "line", - height = 50, width = '100%', - lineColor = 'black', fillColor = 'transparent', - highlightLineColor = 'orange', highlightSpotColor = 'orange') +sl1 <- sparkline::sparkline(values = output_sl, type = "line", +height = 50, width = '100%', +lineColor = 'black', fillColor = 'transparent', +chartRangeMin
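The highlighting logic in this change pads the series with `NA` and keeps only the last 7 (weekly) or 30 (monthly) values, so that a second, red sparkline overlays just the selected range. A minimal base-R sketch of that padding step follows. The helper name `highlight_tail()` is an assumption for illustration and does not appear in the change; the length guard is also an addition, since `rep()` errors on a negative count when the series is shorter than the window.

```r
# Hypothetical helper: keep the last `n` values of `x` and blank out the rest
# with NA, so an overlaid sparkline only draws the selected date range.
highlight_tail <- function(x, n) {
  len <- length(x)
  n <- min(n, len)  # guard: rep() fails on a negative count when len < n
  c(rep(NA, len - n), x[seq_len(n) + (len - n)])
}

# Example: a 10-point series with the last 7 points highlighted.
highlight_tail(1:10, 7)
```

With such a helper, the weekly and monthly branches would differ only in the value of `n`, which is one way the three near-identical blocks in the diff could share code.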
[MediaWiki-commits] [Gerrit] wikimedia...rainbow[master]: Add sparklines for KPIs: - KPI Summary Page - Monthly Metric...
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/327877 ) Change subject: Add sparklines for KPIs: - KPI Summary Page - Monthly Metrics Page .. Add sparklines for KPIs: - KPI Summary Page - Monthly Metrics Page Bug: T150215 Change-Id: I4b64830a3db7f734977b19de695fdf7b0ae7ee12 --- M server.R M ui.R 2 files changed, 142 insertions(+), 18 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/77/327877/1 diff --git a/server.R b/server.R index 3f980c6..c2481db 100644 --- a/server.R +++ b/server.R @@ -1,6 +1,9 @@ library(shiny) library(shinydashboard) library(dygraphs) +library(sparkline) +library(DT) +library(data.table) source("utils.R") @@ -559,6 +562,84 @@ return(polloi::na_box("User engagement (data problem)")) }) + ## KPI Sparklines + output$sparkline_load_time <- sparkline:::renderSparkline({ +if(input$kpi_summary_date_range_selector == "all"){ + output_sl <- list(desktop_load_data, mobile_load_data, android_load_data, ios_load_data) +} else{ + output_sl <- list(desktop_load_data, mobile_load_data, android_load_data, ios_load_data) %>% +lapply(polloi::subset_by_date_range, from = Sys.Date() - 91, to = Sys.Date() - 1) +} +output_sl <- output_sl %>% + lapply(function(platform_load_data) { +platform_load_data[, c("date", "Median")] + }) %>% + dplyr::bind_rows(.id = "platform") %>% + dplyr::group_by(date) %>% + dplyr::summarize(Median = median(Median)) %>% + dplyr::ungroup() %>% + dplyr::select(Median) %>% + unlist(use.names = FALSE) %>% + round(2) +sparkline::sparkline(values = output_sl, type = "line", + height = 50, width = '100%', + lineColor = 'black', fillColor = '#ccc', + highlightLineColor = 'orange', highlightSpotColor = 'orange') + }) + output$sparkline_zero_results <- sparkline:::renderSparkline({ +if(input$kpi_summary_date_range_selector == "all"){ + output_sl <- failure_data_with_automata +} else{ + output_sl <- failure_data_with_automata %>% +polloi::subset_by_date_range(from = 
Sys.Date() - 91, to = Sys.Date() - 1) +} +output_sl <- output_sl %>% + dplyr::select(rate) %>% + unlist(use.names = FALSE) %>% + round(2) +sparkline::sparkline(values = output_sl, type = "line", + height = 50, width = '100%', + lineColor = 'black', fillColor = '#ccc', + highlightLineColor = 'orange', highlightSpotColor = 'orange') + }) + output$sparkline_api_usage <- sparkline:::renderSparkline({ +if(input$kpi_summary_date_range_selector == "all"){ + output_sl <- split_dataset +} else{ + output_sl <- split_dataset %>% +lapply(polloi::subset_by_date_range, from = Sys.Date() - 91, to = Sys.Date() - 1) +} +output_sl <- output_sl %>% + lapply(function(platform_load_data) { +platform_load_data[, c("date", "events")] + }) %>% + dplyr::bind_rows(.id = "api") %>% + dplyr::group_by(date) %>% + dplyr::summarize(total = sum(events)) %>% + dplyr::select(total) %>% + unlist(use.names = FALSE) +sparkline::sparkline(values = output_sl, type = "line", + height = 50, width = '100%', + lineColor = 'black', fillColor = '#ccc', + highlightLineColor = 'orange', highlightSpotColor = 'orange') + }) + output$sparkline_augmented_clickthroughs <- sparkline:::renderSparkline({ +if(input$kpi_summary_date_range_selector == "all"){ + output_sl <- augmented_clickthroughs +} else{ + output_sl <- augmented_clickthroughs %>% +polloi::subset_by_date_range(from = Sys.Date() - 91, to = Sys.Date() - 1) +} +output_sl <- output_sl %>% + dplyr::select(user_engagement) %>% + unlist(use.names = FALSE) %>% + round(2) +sparkline::sparkline(values = output_sl, type = "line", + height = 50, width = '100%', + lineColor = 'black', fillColor = '#ccc', + highlightLineColor = 'orange', highlightSpotColor = 'orange') + }) + ## KPI Modules output$kpi_load_time_series <- renderDygraph({ smooth_level <- input$smoothing_kpi_load_time @@ -722,8 +803,9 @@ dyEvent(as.Date("2016-07-12"), "A (schema switch)", labelLoc = "bottom") }) - output$monthly_metrics_tbl <- renderUI({ -temp <- data.frame( + output$monthly_metrics_tbl 
<- DT::renderDataTable( +{ + temp <- data.frame( KPI = c("Load time", "Zero results rate", "API Usage", "User engagement"), Units = c("ms", "%", "", "%") ) @@ -795,28 +877,64 @@ # Sanitize: temp[temp == "NA%" | temp == "NANA%" | temp == "NANA"] <- "--"
[MediaWiki-commits] [Gerrit] wikimedia...prince[master]: add geo breakdown
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/327139 ) Change subject: add geo breakdown .. add geo breakdown Change-Id: I7a4d371f40fbb4ec822008ae3866870688621154 --- M functions.R M server.R A tab_documentation/first_visit_geo.md A tab_documentation/last_action_geo.md A tab_documentation/most_common_geo.md A tab_documentation/traffic_ctr_geo.md M ui.R M www/stylesheet.css 8 files changed, 1,449 insertions(+), 10 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/prince refs/changes/39/327139/1 diff --git a/functions.R b/functions.R index 7ee1f65..0398f8c 100644 --- a/functions.R +++ b/functions.R @@ -1,9 +1,21 @@ library(polloi) +library(ggplot2) library(data.table) library(reshape2) library(magrittr) +library(toOrdinal) +library(xts) +library(dplyr) +library(tidyr) source("extras.R") + +# Capitalize the first letter +simpleCap <- function(x) { + s <- strsplit(x, " ")[[1]] + paste(toupper(substring(s, 1,1)), substring(s, 2), +sep="", collapse=" ") +} # Read in the traffic data read_clickthrough <- function(){ @@ -129,6 +141,155 @@ } +read_geo <- function() { + + all_country_data <- polloi::read_dataset("portal/all_country_data.tsv", col_types = "Dcididid") + first_visits_country <- polloi::read_dataset("portal/first_visits_country.tsv", col_types = "Dccid") + last_action_country <- polloi::read_dataset("portal/last_action_country.tsv", col_types = "Dccid") + most_common_country <- polloi::read_dataset("portal/most_common_country.tsv", col_types = "Dccid") + data("countrycode_data", package="countrycode") + countrycode_data$country.name[c(44,54,143)] <- c("Cape Verde", "Congo, The Democratic Republic of the", "Macedonia, Republic of" ) + countrycode_data$continent[countrycode_data$country.name %in% c("British Indian Ocean Territory","Christmas Island","Taiwan, Province of China")] <- "Asia" + countrycode_data$continent[countrycode_data$country.name %in% c("Bermuda","Canada","Greenland","Saint Pierre and 
Miquelon","United States")] <- "Northern America" + countrycode_data$continent[countrycode_data$continent == "Americas"] <- "South America" + + + all_country_data <- all_country_data[!duplicated(all_country_data[,1:2],fromLast=T),] + all_country_data_prop <- all_country_data %>% +group_by(date) %>% +mutate(event_prop=round(events/sum(events),4)*100, visit_prop=round(n_visit/sum(n_visit),4)*100, session_prop=round(n_session/sum(n_session),4)*100) %>% + select(date,country,event_prop,ctr,visit_prop,ctr_visit,session_prop,ctr_session) %>% ungroup() + us_mask <- grepl("^U\\.S\\.", all_country_data$country) + us_data <- all_country_data[us_mask,] + all_country_data <- us_data %>% +mutate(clicks = events*ctr, click_v=n_visit*ctr_visit, click_s=n_session*ctr_session) %>% +group_by(date) %>% +summarise(country="United States", events=sum(events), ctr=round(sum(clicks)/sum(events),4), + n_visit=sum(n_visit), ctr_visit=round(sum(click_v)/sum(n_visit),4), + n_session=sum(n_session), ctr_session=round(sum(click_s)/sum(n_session),4)) %>% +rbind(all_country_data[!us_mask,]) %>% +arrange(date, country) + us_mask <- grepl("^U\\.S\\.", all_country_data_prop$country) + us_data_prop <- all_country_data_prop[us_mask,] + all_country_data_prop <- us_data_prop %>% +group_by(date) %>% +summarise(country="United States", event_prop=sum(event_prop), + visit_prop=sum(visit_prop), session_prop=sum(session_prop)) %>% +left_join(all_country_data[, c("date","country","ctr","ctr_visit","ctr_session")], by=c("date","country")) %>% +rbind(all_country_data_prop[!us_mask,]) %>% + select(date,country,event_prop,ctr,visit_prop,ctr_visit,session_prop,ctr_session) %>% +arrange(date, country) + colnames(all_country_data) <- c("Date", "Country", "No. Events", + "Overall Clickthrough Rate", "No. Visit", "Clickthrough Rate Per Visit", + "No. Session", "Clickthrough Rate Per Session") + colnames(all_country_data_prop) <- c("Date", "Country", "No. Events", + "Overall Clickthrough Rate", "No. 
Visit", "Clickthrough Rate Per Visit", + "No. Session", "Clickthrough Rate Per Session") + colnames(us_data) <- c("Date", "Country", "No. Events", + "Overall Clickthrough Rate", "No. Visit", "Clickthrough Rate Per Visit", + "No. Session", "Clickthrough Rate Per Session") + colnames(us_data_prop) <- c("Date", "Country", "No. Events", + "Overall Clickthrough Rate", "No. Visit", "Clickthrough Rate Per Visit", + "No. Session", "Clickthrough Rate Per Session") + region_mask <-
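In `read_geo()` above, the U.S. regional rows are collapsed into a single "United States" row by first recovering clicks (`events*ctr`) and then dividing summed clicks by summed events, which amounts to an event-weighted clickthrough rate rather than a plain mean of the regional rates. The sketch below shows the same arithmetic in base R with `aggregate()` as a stand-in for the dplyr pipeline; the numbers are made up for illustration.

```r
# Toy rows standing in for two U.S. regions on one date (made-up numbers).
us_data <- data.frame(
  date   = as.Date(c("2016-12-01", "2016-12-01")),
  events = c(100, 300),
  ctr    = c(0.50, 0.30)
)

# Recover clicks per row, then divide total clicks by total events per date.
us_data$clicks <- us_data$events * us_data$ctr
totals <- aggregate(cbind(events, clicks) ~ date, data = us_data, FUN = sum)
totals$ctr <- round(totals$clicks / totals$events, 4)

# Event-weighted rate: (50 + 90) / 400 = 0.35, whereas the unweighted
# mean of the two regional rates would be 0.40.
totals
```

The weighting matters whenever regions differ a lot in traffic, because a small region with an unusual rate would otherwise pull the national figure as hard as a large one.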
[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: Fixed bugs in poultry
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/326830 )

Change subject: Fixed bugs in poultry

Fixed bugs in poultry

Change-Id: Idd48ec7c25a197ffb79ca8c3b3520ee2b5e6f70e
---
M shiny-server/poultry
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/shiny-server/poultry b/shiny-server/poultry
index c4e1f7d..5fe5a0b 160000
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit c4e1f7d9bc6f46365151f67c0da28f0f72c595b5
+Subproject commit 5fe5a0ba6849ce9af881376e0bd2869ec2a25abe

--
To view, visit https://gerrit.wikimedia.org/r/326830

Gerrit-MessageType: merged
Gerrit-Change-Id: Idd48ec7c25a197ffb79ca8c3b3520ee2b5e6f70e
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
Gerrit-Reviewer: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: Fixed bugs in poultry
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/326830 )

Change subject: Fixed bugs in poultry
..

Fixed bugs in poultry

Change-Id: Idd48ec7c25a197ffb79ca8c3b3520ee2b5e6f70e
---
M shiny-server/poultry
1 file changed, 1 insertion(+), 1 deletion(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/experimental refs/changes/30/326830/1

diff --git a/shiny-server/poultry b/shiny-server/poultry
index c4e1f7d..5fe5a0b 16
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit c4e1f7d9bc6f46365151f67c0da28f0f72c595b5
+Subproject commit 5fe5a0ba6849ce9af881376e0bd2869ec2a25abe

--
To view, visit https://gerrit.wikimedia.org/r/326830
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Idd48ec7c25a197ffb79ca8c3b3520ee2b5e6f70e
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: Updating dashboard poultry
Chelsyx has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/326059 )

Change subject: Updating dashboard poultry
..

Updating dashboard poultry

Change-Id: I51da2775ec943f79b385c8044fdab4c3c06577cc
---
M shiny-server/poultry
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/shiny-server/poultry b/shiny-server/poultry
index 263ab79..c4e1f7d 16
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit 263ab7937a776d1aa4b349a2bc35732117c5225c
+Subproject commit c4e1f7d9bc6f46365151f67c0da28f0f72c595b5

--
To view, visit https://gerrit.wikimedia.org/r/326059
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I51da2775ec943f79b385c8044fdab4c3c06577cc
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
Gerrit-Reviewer: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: Updating dashboard poultry
Chelsyx has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/326059 )

Change subject: Updating dashboard poultry
..

Updating dashboard poultry

Change-Id: I51da2775ec943f79b385c8044fdab4c3c06577cc
---
M shiny-server/poultry
1 file changed, 1 insertion(+), 1 deletion(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/experimental refs/changes/59/326059/1

diff --git a/shiny-server/poultry b/shiny-server/poultry
index 263ab79..c4e1f7d 16
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit 263ab7937a776d1aa4b349a2bc35732117c5225c
+Subproject commit c4e1f7d9bc6f46365151f67c0da28f0f72c595b5

--
To view, visit https://gerrit.wikimedia.org/r/326059
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I51da2775ec943f79b385c8044fdab4c3c06577cc
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: poultry fix
Chelsyx has submitted this change and it was merged.

Change subject: poultry fix
..

poultry fix

Change-Id: I6d5bb9d6ad4b639fc578d1ce6f0d9aad12c3fe82
---
M shiny-server/poultry
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/shiny-server/poultry b/shiny-server/poultry
index 8357194..263ab79 16
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit 8357194edd41127c702f44184e9a0fcaf2b41acb
+Subproject commit 263ab7937a776d1aa4b349a2bc35732117c5225c

--
To view, visit https://gerrit.wikimedia.org/r/320514
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6d5bb9d6ad4b639fc578d1ce6f0d9aad12c3fe82
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
Gerrit-Reviewer: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: poultry fix
Chelsyx has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/320514

Change subject: poultry fix
..

poultry fix

Change-Id: I6d5bb9d6ad4b639fc578d1ce6f0d9aad12c3fe82
---
M shiny-server/poultry
1 file changed, 1 insertion(+), 1 deletion(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/experimental refs/changes/14/320514/1

diff --git a/shiny-server/poultry b/shiny-server/poultry
index 8357194..263ab79 16
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit 8357194edd41127c702f44184e9a0fcaf2b41acb
+Subproject commit 263ab7937a776d1aa4b349a2bc35732117c5225c

--
To view, visit https://gerrit.wikimedia.org/r/320514
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6d5bb9d6ad4b639fc578d1ce6f0d9aad12c3fe82
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: install highcharter packages
Chelsyx has submitted this change and it was merged.

Change subject: install highcharter packages
..

install highcharter packages

Change-Id: I55a5ddd535917642bf5ed07cab4e5ff3e909
---
M setup.sh
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/setup.sh b/setup.sh
index 75bc813..65906eb 100755
--- a/setup.sh
+++ b/setup.sh
@@ -151,7 +151,7 @@
 install_r_package shinydashboard
 install_r_package flexdashboard
 install_r_package shinyjs
-github_install_r_package jcheng5/googleCharts
+install_r_package highcharter
 git_install_r_package https://gerrit.wikimedia.org/r/wikimedia/discovery/polloi
 # Statistical modeling:
 install_r_package forecast

--
To view, visit https://gerrit.wikimedia.org/r/320440
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I55a5ddd535917642bf5ed07cab4e5ff3e909
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
Gerrit-Reviewer: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: install highcharter packages
Chelsyx has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/320440

Change subject: install highcharter packages
..

install highcharter packages

Change-Id: I55a5ddd535917642bf5ed07cab4e5ff3e909
---
M setup.sh
1 file changed, 1 insertion(+), 1 deletion(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/experimental refs/changes/40/320440/1

diff --git a/setup.sh b/setup.sh
index 75bc813..65906eb 100755
--- a/setup.sh
+++ b/setup.sh
@@ -151,7 +151,7 @@
 install_r_package shinydashboard
 install_r_package flexdashboard
 install_r_package shinyjs
-github_install_r_package jcheng5/googleCharts
+install_r_package highcharter
 git_install_r_package https://gerrit.wikimedia.org/r/wikimedia/discovery/polloi
 # Statistical modeling:
 install_r_package forecast

--
To view, visit https://gerrit.wikimedia.org/r/320440
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I55a5ddd535917642bf5ed07cab4e5ff3e909
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: Updating dashboards...
Chelsyx has submitted this change and it was merged.

Change subject: Updating dashboards...
..

Updating dashboards...

Change-Id: Ib3cb8cf4e3a7441d67df74e45437288862720884
---
M shiny-server/forecast
M shiny-server/poultry
2 files changed, 2 insertions(+), 2 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved

diff --git a/shiny-server/forecast b/shiny-server/forecast
index 186bdaa..550e524 16
--- a/shiny-server/forecast
+++ b/shiny-server/forecast
@@ -1 +1 @@
-Subproject commit 186bdaacef0ba1b5c27b4c43589ac408036c2877
+Subproject commit 550e524aa9c6266bfdc67df4539c83b19bb54141
diff --git a/shiny-server/poultry b/shiny-server/poultry
index eb36a59..8357194 16
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit eb36a59e7f9fa4a53a57eba75410b2ec3a87908d
+Subproject commit 8357194edd41127c702f44184e9a0fcaf2b41acb

--
To view, visit https://gerrit.wikimedia.org/r/320437
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib3cb8cf4e3a7441d67df74e45437288862720884
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx
Gerrit-Reviewer: Chelsyx
[MediaWiki-commits] [Gerrit] wikimedia...experimental[master]: Updating dashboards...
Chelsyx has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/320437

Change subject: Updating dashboards...
..

Updating dashboards...

Change-Id: Ib3cb8cf4e3a7441d67df74e45437288862720884
---
M shiny-server/forecast
M shiny-server/poultry
2 files changed, 2 insertions(+), 2 deletions(-)

git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/experimental refs/changes/37/320437/1

diff --git a/shiny-server/forecast b/shiny-server/forecast
index 186bdaa..550e524 16
--- a/shiny-server/forecast
+++ b/shiny-server/forecast
@@ -1 +1 @@
-Subproject commit 186bdaacef0ba1b5c27b4c43589ac408036c2877
+Subproject commit 550e524aa9c6266bfdc67df4539c83b19bb54141
diff --git a/shiny-server/poultry b/shiny-server/poultry
index eb36a59..8357194 16
--- a/shiny-server/poultry
+++ b/shiny-server/poultry
@@ -1 +1 @@
-Subproject commit eb36a59e7f9fa4a53a57eba75410b2ec3a87908d
+Subproject commit 8357194edd41127c702f44184e9a0fcaf2b41acb

--
To view, visit https://gerrit.wikimedia.org/r/320437
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib3cb8cf4e3a7441d67df74e45437288862720884
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/experimental
Gerrit-Branch: master
Gerrit-Owner: Chelsyx