Bearloga has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/374920 )

Change subject: [WIP] SRP visit times
......................................................................

[WIP] SRP visit times

Functional but still has the following TODOs:
- reorder how the %-lang combos show up in the legend
- once more of the data has been backfilled, need to add
  some general comments on trends

Bug: T170468
Change-Id: I690230e3d3a7a41156f5878169577a62f52ddeb6
---
M CHANGELOG.md
M modules/page_visit_times.R
A tab_documentation/srp_surv.md
M tab_documentation/survival.md
M ui.R
M utils.R
6 files changed, 127 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow refs/changes/20/374920/1

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 7cb188e..099e8a1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,10 @@

 All notable changes to this project will be documented in this file.

+## 2017/08/30
+- Added SRP visit times ([T170468](https://phabricator.wikimedia.org/T170468))
+- Added [dygraph-based rolling periods](https://rstudio.github.io/dygraphs/gallery-roll-periods.html) to page visit times modules
+
 ## 2017/08/29
 - Added support for breakdown of API usage by referrer ([T172452](https://phabricator.wikimedia.org/T172452))
 - Added morelike API usage (see [Gerrit change 345863](https://gerrit.wikimedia.org/r/#/c/345863/)) for more details
diff --git a/modules/page_visit_times.R b/modules/page_visit_times.R
index 115cbb4..d312fcc 100644
--- a/modules/page_visit_times.R
+++ b/modules/page_visit_times.R
@@ -1,11 +1,34 @@
 output$lethal_dose_plot <- renderDygraph({
-  user_page_visit_dataset %>%
+  req(length(input$filter_lethal_dose_plot) > 0)
+  user_page_visit_dataset[, c("date", input$filter_lethal_dose_plot)] %>%
     polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_lethal_dose_plot)) %>%
-    polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at which we have lost N% of the users") %>%
+    polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at which N% users leave the visited page") %>%
     dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = polloi::custom_axis_formatter,
            axisLabelWidth = 100, pixelsPerLabel = 80) %>%
+    dyRoller(rollPeriod = input$rolling_lethal_dose_plot) %>%
     dyLegend(labelsDiv = "lethal_dose_plot_legend") %>%
     dyRangeSelector(fillColor = "", strokeColor = "") %>%
     dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") %>%
     dyEvent(as.Date("2017-04-25"), "S (rates)", labelLoc = "bottom")
 })
+
+output$srp_ld_plot <- renderDygraph({
+  req(length(input$filter_srp_ld_plot) > 0 && length(input$language_srp_ld_plot) > 0)
+  serp_page_visit_dataset[, c("date", "language", input$filter_srp_ld_plot)] %>%
+    tidyr::gather(LD, time, -c(date, language)) %>%
+    dplyr::filter(language %in% input$language_srp_ld_plot) %>%
+    dplyr::transmute(
+      date = date, time = time,
+      label = paste0(LD, " (", language, ")")
+    ) %>%
+    tidyr::spread(label, time) %>%
+    polloi::smoother(smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_srp_ld_plot)) %>%
+    polloi::make_dygraph(xlab = "", ylab = "Time (s)", title = "Time at which N% users leave the search results page") %>%
+    dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = polloi::custom_axis_formatter,
+           axisLabelWidth = 100, pixelsPerLabel = 80) %>%
+    dyRoller(rollPeriod = input$rolling_srp_ld_plot) %>%
+    dyLegend(labelsDiv = "srp_ld_plot_legend") %>%
+    dyRangeSelector(fillColor = "", strokeColor = "") %>%
+    dyEvent(as.Date("2017-04-25"), "S (sampling rates)", labelLoc = "bottom") %>%
+    dyEvent(as.Date("2017-06-15"), "SS (sister search)", labelLoc = "bottom")
+})
diff --git a/tab_documentation/srp_surv.md b/tab_documentation/srp_surv.md
new file mode 100644
index 0000000..254818f
--- /dev/null
+++ b/tab_documentation/srp_surv.md
@@ -0,0 +1,23 @@
+How long Wikipedia searchers stay on the search result pages
+=======
+
+When someone is randomly selected for search satisfaction tracking (using our [TSS2 schema](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2)), we use a check-in system and survival analysis to estimate how long users stay on visited pages. When a Wikipedia visitor searches using autocomplete and ends up on a full-text search results page (SRP), we can track how long that page is "alive" before the user either closes the tab, clicks on a result, or navigates elsewhere.
+
+To summarize the results on a daily basis, we record a set of statistics based on a measure formally known as "[median lethal dose](https://en.wikipedia.org/wiki/Median_lethal_dose)". This graph shows the length of time that must pass before N% of the users leave the search results page. When the number goes up, we can infer that users are staying on the pages longer.
+
+Outages and inaccuracies
+------
+* '__S__': on 2017-04-25 we changed the rates at which users are put into event logging (see [T163273](https://phabricator.wikimedia.org/T163273)). Specifically, we decreased the rate on English Wikipedia ("EnWiki") and increased it everywhere else.
+* '__SS__': [on 2017-06-15](https://lists.wikimedia.org/pipermail/discovery/2017-June/001536.html) we deployed the sister search feature to Wikipedias in all languages. Sister project (cross-wiki) snippets is a feature that adds search results from sister projects of Wikipedia to a sidebar on the search engine results page (SERP). If a query results in matches from the sister projects, users will be shown snippets from Wiktionary, Wikisource, Wikiquote and/or other projects. See [T162276](https://phabricator.wikimedia.org/T162276) for more details.
+
+Questions, bug reports, and feature suggestions
+------
+For technical, non-bug questions, [email Mikhail](mailto:[email protected]?subject=Dashboard%20Question) or [Chelsy](mailto:[email protected]?subject=Dashboard%20Question). If you experience a bug or notice something wrong or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) in the Discovery board or [email Deb](mailto:[email protected]?subject=Dashboard%20Question).
+
+<hr style="border-color: gray;">
+<p style="font-size: small;">
+  <strong>Link to this dashboard:</strong> <a href="https://discovery.wmflabs.org/metrics/#srp_surv">https://discovery.wmflabs.org/metrics/#srp_surv</a>
+  | Page is available under <a href="https://creativecommons.org/licenses/by-sa/3.0/" title="Creative Commons Attribution-ShareAlike License">CC-BY-SA 3.0</a>
+  | <a href="https://phabricator.wikimedia.org/diffusion/WDRN/" title="Search Metrics Dashboard source code repository">Code</a> is licensed under <a href="https://phabricator.wikimedia.org/diffusion/WDRN/browse/master/LICENSE.md" title="MIT License">MIT</a>
+  | Part of <a href="https://discovery.wmflabs.org/">Discovery Dashboards</a>
+</p>
diff --git a/tab_documentation/survival.md b/tab_documentation/survival.md
index 711fffa..e066ad5 100644
--- a/tab_documentation/survival.md
+++ b/tab_documentation/survival.md
@@ -1,7 +1,9 @@
 Automated survival analysis: page visit times
 =======

-This shows the length of time that must pass before we lose N% of the test population. In general, it appears it takes 15s to lose 10%, 25-35s to lose 25%, and 55-75s to lose 50%. When the number goes up, we can infer that users are staying on the pages longer.
+When someone is randomly selected for search satisfaction tracking (using our [TSS2 schema](https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2)), we use a check-in system and survival analysis to estimate how long users stay on visited pages. To summarize the results on a daily basis, we record a set of statistics based on a measure formally known as "[median lethal dose](https://en.wikipedia.org/wiki/Median_lethal_dose)".
+
+This graph shows the length of time that must pass before N% of the users leave the page they visited. When the number goes up, we can infer that users are staying on the pages longer. In general, it appears it takes 15s to lose 10%, 25-35s to lose 25%, and 55-75s to lose 50%.

 On most days, we retain at least 20% of the test population past the 7 minute mark (the point at which the user's browser stops checking in), so on those days we cannot calculate the time it takes to lose 90/95/99% of the users.

diff --git a/ui.R b/ui.R
index 8b20615..97ad231 100644
--- a/ui.R
+++ b/ui.R
@@ -67,7 +67,9 @@
                menuSubItem(text = "Search Suggestions", tabName = "failure_suggestions")),
       menuItem(text = "Sister Search",
                menuSubItem(text = "Traffic", tabName = "sister_search_traffic")),
-      menuItem(text = "Page Visit Times", tabName = "survival"),
+      menuItem(text = "Page Visit Times",
+               menuSubItem(text = "Visited search results", tabName = "survival"),
+               menuSubItem(text = "Search result pages", tabName = "srp_surv")),
       menuItem(text = "Language/Project Breakdown", tabName = "langproj_breakdown"),
       menuItem(text = "Global Settings",
                selectInput(inputId = "smoothing_global", label = "Smoothing", selectize = TRUE, selected = "day",
@@ -97,7 +99,7 @@
                 box(sparkline:::sparklineOutput("sparkline_zero_results"), width = 3),
                 box(sparkline:::sparklineOutput("sparkline_api_usage"), width = 3),
                 box(sparkline:::sparklineOutput("sparkline_augmented_clickthroughs"), width = 3)
-              ),
+              ),
               includeMarkdown("./tab_documentation/kpis_summary.md")),
       tabItem(tabName = "monthly_metrics",
               fluidRow(
@@ -290,11 +292,74 @@
               div(id = "sister_search_traffic_plot_legend"),
               includeMarkdown("./tab_documentation/sister_search_traffic.md")),
       tabItem(tabName = "survival",
-              polloi::smooth_select("smoothing_lethal_dose_plot"),
-              div(id = "lethal_dose_plot_legend"),
+              fluidRow(
+                column(
+                  polloi::smooth_select("smoothing_lethal_dose_plot"),
+                  width = 3
+                ),
+                column(
+                  numericInput("rolling_lethal_dose_plot", "Roll Period", 1, min = 1, max = 30),
+                  helpText("Each point will represent an average of this many days."),
+                  width = 3
+                ),
+                column(
+                  checkboxGroupInput(
+                    "filter_lethal_dose_plot", "Time until",
+                    choices = c(
+                      "10% of users left SRP" = "10%",
+                      "25% of users left SRP" = "25%",
+                      "50% of users left SRP" = "50%",
+                      "75% of users left SRP" = "75%",
+                      "90% of users left SRP" = "90%",
+                      "95% of users left SRP" = "95%",
+                      "99% of users left SRP" = "99%"
+                    ),
+                    selected = c("25%", "50%"), inline = TRUE
+                  ),
+                  width = 6
+                )
+              ),
               dygraphOutput("lethal_dose_plot"),
+              div(id = "lethal_dose_plot_legend"),
               includeMarkdown("./tab_documentation/survival.md")
       ),
+      tabItem(tabName = "srp_surv",
+              fluidRow(
+                column(
+                  fluidRow(
+                    column(polloi::smooth_select("smoothing_srp_ld_plot"), width = 8),
+                    column(numericInput("rolling_srp_ld_plot", "Roll Period", 1, min = 1, max = 30), width = 4)
+                  ),
+                  helpText("Each point will represent an average of this many days."),
+                  width = 3
+                ),
+                column(
+                  checkboxGroupInput(
+                    "language_srp_ld_plot", "Language",
+                    choices = c("English", "French and Catalan", "Other languages"),
+                    selected = c("English", "Other languages"), inline = TRUE
+                  ),
+                  width = 4
+                ),
+                column(
+                  checkboxGroupInput(
+                    "filter_srp_ld_plot", "Time until",
+                    choices = c(
+                      "10% of users left SRP" = "10%",
+                      "25% of users left SRP" = "25%",
+                      "50% of users left SRP" = "50%",
+                      "75% of users left SRP" = "75%",
+                      "90% of users left SRP" = "90%",
+                      "95% of users left SRP" = "95%"
+                    ),
+                    selected = c("25%", "50%"), inline = TRUE),
+                  width = 5
+                )
+              ),
+              div(id = "srp_ld_plot_legend"),
+              dygraphOutput("srp_ld_plot"),
+              includeMarkdown("./tab_documentation/srp_surv.md")
+      ),
       tabItem(tabName = "langproj_breakdown",
               fluidRow(column(polloi::smooth_select("smoothing_langproj_breakdown"), width = 4),
                        column(selectInput("langproj_metrics", "Metrics",
diff --git a/utils.R b/utils.R
index 70db0ed..c753e23 100644
--- a/utils.R
+++ b/utils.R
@@ -282,6 +282,9 @@
   user_page_visit_dataset <<- polloi::read_dataset("discovery/metrics/search/sample_page_visit_ld.tsv", col_types = "Dddddddd") %>%
     dplyr::filter(!is.na(LD10)) %>%
     set_colnames(c("date", "10%", "25%", "50%", "75%", "90%", "95%", "99%"))
+  serp_page_visit_dataset <<- polloi::read_dataset("discovery/metrics/search/srp_survtime.tsv", col_types = "Dcddddddd") %>%
+    dplyr::filter(!is.na(LD10)) %>%
+    set_colnames(c("date", "language", "10%", "25%", "50%", "75%", "90%", "95%", "99%"))
 }

 read_paul_score <- function() {

-- 
To view, visit https://gerrit.wikimedia.org/r/374920
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I690230e3d3a7a41156f5878169577a62f52ddeb6
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/rainbow
Gerrit-Branch: develop
Gerrit-Owner: Bearloga <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
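
Note on the reshaping step in output$srp_ld_plot: the gather / filter / transmute / spread chain can be tried outside Shiny. The following is a minimal sketch, assuming a small made-up data frame in place of serp_page_visit_dataset and plain vectors in place of the two checkbox inputs; serp_example, selected_percentiles and selected_languages are hypothetical names and the numbers are invented. It shows how each percentile/language pair ends up as its own "N% (language)" column, which is the shape the dygraph legend is built from.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for serp_page_visit_dataset after set_colnames():
# one row per date and language group, one column per percentile.
serp_example <- data.frame(
  date = as.Date(c("2017-08-01", "2017-08-01", "2017-08-02", "2017-08-02")),
  language = c("English", "Other languages", "English", "Other languages"),
  `25%` = c(10, 12, 11, 13),   # invented values, in seconds
  `50%` = c(25, 30, 26, 31),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical stand-ins for input$filter_srp_ld_plot and input$language_srp_ld_plot.
selected_percentiles <- c("25%", "50%")
selected_languages <- c("English", "Other languages")

wide <- serp_example[, c("date", "language", selected_percentiles)] %>%
  # Long format: one row per date, language and percentile.
  gather(LD, time, -c(date, language)) %>%
  filter(language %in% selected_languages) %>%
  # Combine percentile and language into a single series label.
  transmute(date = date, time = time, label = paste0(LD, " (", language, ")")) %>%
  # Back to wide format: one column per series label, one row per date.
  spread(label, time)

print(wide)
```

With two percentiles and two language groups selected, the result has one row per date and four series columns. The column order is decided by spread() from the label values, so the legend-ordering TODO in the commit message would presumably be addressed at this step.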
