OliverKeyes has submitted this change and it was merged. Change subject: Adds augmented clickthroughs to dashboard + Adds 'user engagement' KPI summary box + Adds 'user engagement' KPI time series + Adds page visit time LD10/25/50/75/90/95/99 time series + Fixes dashboard titles (including in browser window) ......................................................................
Adds augmented clickthroughs to dashboard + Adds 'user engagement' KPI summary box + Adds 'user engagement' KPI time series + Adds page visit time LD10/25/50/75/90/95/99 time series + Fixes dashboard titles (including in browser window) Bug: T113637 Change-Id: I808213c595925ae7ad3269403637b77483a4116e --- A assets/content/kpi_augmented_clickthroughs.md M assets/content/kpis_summary.md A assets/content/survival.md M server.R M ui.R M utils.R 6 files changed, 263 insertions(+), 160 deletions(-) Approvals: OliverKeyes: Verified; Looks good to me, approved diff --git a/assets/content/kpi_augmented_clickthroughs.md b/assets/content/kpi_augmented_clickthroughs.md new file mode 100644 index 0000000..4a52a64 --- /dev/null +++ b/assets/content/kpi_augmented_clickthroughs.md @@ -0,0 +1,23 @@ +Key Performance Indicator: User Engagement (Augmented Clickthroughs) +======= + +We are still in the process of obtaining qualitative data from our users (their intent and satisfaction), so this metric is less akin to "user satisfaction" and more akin to the "user engagement" that we can observe. + +This metric combines the clickthrough rate and the proportion of users' session dwell times exceeding the threshold of 10s. + +Outages and inaccuracies +------ + +* None so far! + +Questions, bug reports, and feature suggestions +------ +For technical, non-bug questions, [email Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question). If you experience a bug or notice something wrong or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) in the Discovery board or [email Dan](mailto:dga...@wikimedia.org?subject=Dashboard%20Question). 
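[Editor's note] The combination described above is computed in this change's `read_augmented_clickthrough()` as the simple average of the clickthrough rate and the threshold-passing rate. A minimal Python sketch of that arithmetic (function and variable names here are illustrative, not from the codebase):

```python
def user_engagement(clickthroughs, result_pages_opened, threshold_passing_rate):
    """Augmented clickthrough: average of the clickthrough rate (%) and the
    share of sessions whose dwell time exceeded the 10 s threshold (%)."""
    clickthrough_rate = 100.0 * clickthroughs / result_pages_opened
    # read_augmented_clickthrough() likewise scales the passing rate by 100
    # before averaging the two percentages.
    return (clickthrough_rate + 100.0 * threshold_passing_rate) / 2

# e.g. 300 clickthroughs on 1000 result pages, 45% of sessions past 10 s
print(user_engagement(300, 1000, 0.45))  # 37.5
```

A 30% clickthrough rate and a 45% threshold-passing rate therefore report as 37.5% user engagement.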
+ +<hr style="border-color: gray;"> +<p style="font-size: small; color: gray;"> + <strong>Link to this dashboard:</strong> + <a href="http://searchdata.wmflabs.org/metrics/#kpi_augmented_clickthroughs"> + http://searchdata.wmflabs.org/metrics/#kpi_augmented_clickthroughs + </a> +</p> diff --git a/assets/content/kpis_summary.md b/assets/content/kpis_summary.md index 40d2ab6..ff55a60 100644 --- a/assets/content/kpis_summary.md +++ b/assets/content/kpis_summary.md @@ -1,10 +1,10 @@ Key Performance Indicators ======= -* **User-perceived load time** If our search is fast and snappy, then more people will use it! -* **Zero Results Rate** If a user gets zero results for their query, they’ve by definition not found what they’re looking for. -* **API usage** We want people, both within our movement and outside it, to be able to easily access our information. -* **[User Satisfaction](https://meta.wikimedia.org/wiki/Research:Measuring_User_Search_Satisfaction)** If a user searches for something and clicks on a result, then they found what they wanted. +- **User-perceived load time** If our search is fast and snappy, then more people will use it! +- **Zero Results Rate** If a user gets zero results for their query, they’ve by definition not found what they’re looking for. +- **API usage** We want people, both within our movement and outside it, to be able to easily access our information. +- **User Engagement** (not quite **[User Satisfaction](https://meta.wikimedia.org/wiki/Research:Measuring_User_Search_Satisfaction)**) This is an augmented version of the clickthrough rate: it also incorporates the proportion of users' session dwell times exceeding a pre-specified threshold (10s). **Note** that we deployed v2.0 of the satisfaction schema on 9/2/2015, so you may see **NA** if we do not yet have enough data available. 
Outages and inaccuracies ------ diff --git a/assets/content/survival.md b/assets/content/survival.md new file mode 100644 index 0000000..cd41a11 --- /dev/null +++ b/assets/content/survival.md @@ -0,0 +1,25 @@ +Automated survival analysis: page visit times +======= + +This shows the length of time that must pass before we lose N% of the test population. In general, it appears it takes 15s to lose 10%, 25-35s to lose 25%, and 55-75s to lose 50%. When the number goes up, we can infer that users are staying on the pages longer. + +On most days, we retain at least 20% of the test population past the 7 minute mark (the point at which the user's browser stops checking in), so on those days we cannot calculate the time it takes to lose 90/95/99% of the users. + +There are some days when we CAN calculate those times, and it can take anywhere between 270s (4m30s) and 390s (6m30s) for 90% of the users to have closed the page they clicked through from the search results page. + +Outages and inaccuracies +------ + +* None so far! + +Questions, bug reports, and feature suggestions +------ +For technical, non-bug questions, [email Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question). If you experience a bug or notice something wrong or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) in the Discovery board or [email Dan](mailto:dga...@wikimedia.org?subject=Dashboard%20Question). 
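[Editor's note] The LD10/25/50/… numbers charted here are, in effect, percentiles of page visit time, with sessions still open when the browser stops checking in treated as right-censored. A hedged Python sketch of that idea (names and the helper itself are illustrative; the dashboard reads precomputed values from `sample_page_visit_ld.tsv`):

```python
def lethal_dose(visit_times_s, pct, censor_at=420):
    """Time (s) by which `pct`% of sessions have ended, or None when
    right-censoring (sessions open past ~7 min) makes it unobservable."""
    ended = sorted(t for t in visit_times_s if t < censor_at)  # observed exits
    k = int(len(visit_times_s) * pct / 100)  # sessions that must have ended
    if k == 0 or k > len(ended):
        return None  # too few observed exits to place this percentile
    return ended[k - 1]

times = [5, 12, 18, 30, 47, 60, 90, 420, 420, 420]  # 420 = still open
print(lethal_dose(times, 50))  # 47 — half the sessions ended by 47 s
print(lethal_dose(times, 90))  # None — 30% censored, so LD90 is unobservable
```

This mirrors why LD90/95/99 are missing on days when at least 20% of the population is retained past the 7-minute mark.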
+ +<hr style="border-color: gray;"> +<p style="font-size: small; color: gray;"> + <strong>Link to this dashboard:</strong> + <a href="http://searchdata.wmflabs.org/metrics/#survival"> + http://searchdata.wmflabs.org/metrics/#survival + </a> +</p> diff --git a/server.R b/server.R index e8b40bb..d91a6ab 100644 --- a/server.R +++ b/server.R @@ -1,95 +1,18 @@ ## Version 0.2.0 source("utils.R") -existing_date <- (Sys.Date()-1) - -## Read in desktop data and generate means for the value boxes, along with a time-series appropriate form for -## dygraphs. -read_desktop <- function(){ - data <- polloi::read_dataset("search/desktop_event_counts.tsv") - interim <- reshape2::dcast(data, formula = timestamp ~ action, fun.aggregate = sum) - interim[is.na(interim)] <- 0 - desktop_dygraph_set <<- interim - desktop_dygraph_means <<- round(colMeans(desktop_dygraph_set[,2:5])) - - data <- polloi::read_dataset("search/desktop_load_times.tsv") - desktop_load_data <<- data - return(invisible()) -} - -read_web <- function(){ - data <- polloi::read_dataset("search/mobile_event_counts.tsv") - interim <- reshape2::dcast(data, formula = timestamp ~ action, fun.aggregate = sum) - interim[is.na(interim)] <- 0 - mobile_dygraph_set <<- interim - mobile_dygraph_means <<- round(colMeans(mobile_dygraph_set[,2:4])) - - mobile_load_data <<- polloi::read_dataset("search/mobile_load_times.tsv") - return(invisible()) -} - -read_apps <- function(){ - data <- polloi::read_dataset("search/app_event_counts.tsv") - - ios <- reshape2::dcast(data[data$platform == "iOS",], formula = timestamp ~ action, fun.aggregate = sum) - android <- reshape2::dcast(data[data$platform == "Android",], formula = timestamp ~ action, fun.aggregate = sum) - ios_dygraph_set <<- ios - ios_dygraph_means <<- round(colMeans(ios[,2:4])) - - android_dygraph_set <<- android - android_dygraph_means <<- round(colMeans(android[,2:4])) - - app_load_data <- polloi::read_dataset("search/app_load_times.tsv") - ios_load_data <<- 
app_load_data[app_load_data$platform == "iOS", names(app_load_data) != "platform"] - android_load_data <<- app_load_data[app_load_data$platform == "Android", names(app_load_data) != "platform"] - - return(invisible()) -} - -read_api <- function(){ - data <- polloi::read_dataset("search/search_api_aggregates.tsv") - data <- data[order(data$event_type),] - split_dataset <<- split(data, f = data$event_type) - return(invisible()) -} - -read_failures <- function(date){ - data <- polloi::read_dataset("search/cirrus_query_aggregates.tsv") - interim_data <- reshape2::dcast(data, formula = date ~ variable, fun.aggregate = sum) - failure_dygraph_set <<- interim_data - - interim_vector <- interim_data$`Zero Result Queries`/interim_data$`Search Queries` - output_vector <- (interim_vector[2:nrow(interim_data)] - interim_vector[1:(nrow(interim_data)-1)]) / interim_vector[1:(nrow(interim_data)-1)] - - failure_roc_dygraph_set <<- data.frame(date = interim_data$date[2:nrow(interim_data)], - variable = "failure ROC", - daily_change = output_vector*100, - stringsAsFactors = FALSE) - - interim_breakdown_data <- polloi::read_dataset("search/cirrus_query_breakdowns.tsv") - interim_breakdown_data$value <- interim_breakdown_data$value*100 - failure_breakdown_dygraph_set <<- reshape2::dcast(interim_breakdown_data, - formula = date ~ variable, fun.aggregate = sum) - - suggestion_data <- polloi::read_dataset("search/cirrus_suggestion_breakdown.tsv") - suggestion_data$variable <- "Full-Text with Suggestions" - suggestion_data$value <- suggestion_data$value*100 - suggestion_data <- rbind(suggestion_data, - interim_breakdown_data[interim_breakdown_data$date %in% suggestion_data$date - & interim_breakdown_data$variable == "Full-Text Search",]) - suggestion_dygraph_set <<- reshape2::dcast(suggestion_data, - formula = date ~ variable, fun.aggregate = sum) - return(invisible()) -} +existing_date <- Sys.Date() - 1 shinyServer(function(input, output) { - if(Sys.Date() != existing_date){ + if 
(Sys.Date() != existing_date) { read_desktop() read_apps() read_web() read_api() read_failures(existing_date) + read_augmented_clickthrough() + read_lethal_dose() existing_date <<- Sys.Date() } @@ -311,7 +234,18 @@ xlab = "Date", ylab = "Zero Results Rate (%)", "Zero Result Rates with Search Suggestions") }) - ## KPI module + output$lethal_dose_plot <- renderDygraph({ + polloi::make_dygraph(data = polloi::smoother(user_page_visit_dataset, + smooth_level = polloi::smooth_switch(input$smoothing_global, + input$smoothing_lethal_dose_plot)), + xlab = "", ylab = "Time (s)", + title = "Time at which we have lost N% of the users") %>% + dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = CustomAxisFormatter, + axisLabelWidth = 100, pixelsPerLabel = 80) %>% + dyLegend(labelsDiv = "lethal_dose_plot_legend") + }) + + ## KPI Summary Boxes output$kpi_summary_date_range <- renderUI({ date_range <- input$kpi_summary_date_range_selector switch(date_range, @@ -355,39 +289,15 @@ }) output$kpi_summary_box_load_time <- renderValueBox({ date_range <- input$kpi_summary_date_range_selector - switch(date_range, - daily = { - x <- lapply(list(desktop_load_data, mobile_load_data, - android_load_data, ios_load_data), - polloi::safe_tail, n = 2) %>% - lapply(function(data_tail) return(data_tail$Median)) %>% - do.call(cbind, .) %>% - apply(MARGIN = 1, FUN = median) - }, - weekly = { - x <- lapply(list(desktop_load_data, mobile_load_data, - android_load_data, ios_load_data), - polloi::safe_tail, n = 14) %>% - lapply(function(data_tail) return(data_tail$Median)) %>% - do.call(cbind, .) %>% - apply(MARGIN = 1, FUN = median) - }, - monthly = { - x <- lapply(list(desktop_load_data, mobile_load_data, - android_load_data, ios_load_data), - polloi::safe_tail, n = 60) %>% - lapply(function(data_tail) return(data_tail$Median)) %>% - do.call(cbind, .) 
%>% - apply(MARGIN = 1, FUN = median) - }, - quarterly = { - x <- lapply(list(desktop_load_data, mobile_load_data, - android_load_data, ios_load_data), - polloi::safe_tail, n = 90) %>% - lapply(function(data_tail) return(data_tail$Median)) - y <- median(apply(do.call(cbind, x), 1, median)) - return(valueBox(subtitle = "Load time", value = sprintf("%.0fms", y), color = "orange")) - }) + x <- lapply(list(desktop_load_data, mobile_load_data, + android_load_data, ios_load_data), + polloi::safe_tail, n = date_range_switch(date_range)) %>% + lapply(function(data_tail) return(data_tail$Median)) + if ( date_range == "quarterly" ) { + y <- median(apply(do.call(cbind, x), 1, median)) + return(valueBox(subtitle = "Load time", value = sprintf("%.0fms", y), color = "orange")) + } + x %<>% do.call(cbind, .) %>% apply(MARGIN = 1, FUN = median) y1 <- median(polloi::half(x)); y2 <- median(polloi::half(x, FALSE)); z <- 100 * (y2 - y1) / y1 if (abs(z) > 0) { return(valueBox(subtitle = sprintf("Load time (%.1f%%)", z), @@ -398,11 +308,7 @@ }) output$kpi_summary_box_zero_results <- renderValueBox({ date_range <- input$kpi_summary_date_range_selector - switch(date_range, - daily = {x <- polloi::safe_tail(failure_dygraph_set, 2)}, - weekly = {x <- polloi::safe_tail(failure_dygraph_set, 14)}, - monthly = {x <- polloi::safe_tail(failure_dygraph_set, 60)}, - quarterly = {x <- polloi::safe_tail(failure_dygraph_set, 90)}) + x <- polloi::safe_tail(failure_dygraph_set, date_range_switch(date_range)) x <- transform(x, Rate = `Zero Result Queries` / `Search Queries`)$Rate if (date_range == "quarterly") { return(valueBox(subtitle = "Zero results rate", color = "orange", @@ -422,18 +328,14 @@ output$kpi_summary_box_api_usage <- renderValueBox({ date_range <- input$kpi_summary_date_range_selector x <- lapply(split_dataset, function(x) { - switch(date_range, - daily = { polloi::safe_tail(x, 2)$events }, - weekly = { polloi::safe_tail(x, 14)$events }, - monthly = { polloi::safe_tail(x, 60)$events }, - 
quarterly = { polloi::safe_tail(x, 90)$events }) + polloi::safe_tail(x, date_range_switch(date_range))$events }) %>% do.call(cbind, .) %>% transform(total = cirrus + geo + language + open + prefix) %>% { .$total } if (date_range == "quarterly") { return(valueBox(subtitle = "API usage", value = polloi::compress(median(x), 0), color = "orange")) } - y1 <- median(polloi::half(x, FALSE)) + y1 <- median(polloi::half(x, TRUE)) y2 <- median(polloi::half(x, FALSE)) z <- 100 * (y2 - y1) / y1 # % change from t-1 to t if (abs(z) > 0) { @@ -442,13 +344,33 @@ } return(valueBox(subtitle = "API usage (no change)", value = polloi::compress(y2, 0), color = "orange")) }) + output$kpi_summary_box_augmented_clickthroughs <- renderValueBox({ + date_range <- input$kpi_summary_date_range_selector + #========= We can delete this block after we get 90 days of data ========= + if ( (date_range == "monthly" && (Sys.Date()-1)-60 < as.Date("2015-09-02")) || date_range == "quarterly" && (Sys.Date()-1)-90 < as.Date("2015-09-02") ) { + return(valueBox(subtitle = "User engagement", color = "black", value = "NA")) + } + #========================================================================= + x <- polloi::safe_tail(augmented_clickthroughs, date_range_switch(date_range)) + if (date_range == "quarterly") { + return(valueBox(subtitle = "User engagement", color = "orange", + value = sprintf("%.1f%%", median(x$user_engagement)))) + } + y1 <- median(polloi::half(x$user_engagement)) + y2 <- median(polloi::half(x$user_engagement, FALSE)) + z <- 100 * (y2 - y1)/y1 + if (abs(z) > 0) { + return(valueBox( + subtitle = sprintf("User engagement (%.1f%%)", z), + value = sprintf("%.1f%%", y2), + icon = cond_icon(z > 0), color = polloi::cond_color(z > 0, "green") + )) + } + return(valueBox(subtitle = "User engagement (no change)", + value = sprintf("%.1f%%", y2), color = "orange")) + }) output$kpi_summary_api_usage_proportions <- renderPlot({ - switch (input$kpi_summary_date_range_selector, - daily = { n <- 1 }, 
- weekly = { n <- 7 }, - monthly = { n <- 30 }, - quarterly = { n <- 90 } - ) + n <- date_range_switch(input$kpi_summary_date_range_selector, 1, 7, 30, 90) api_latest <- cbind("Full-text via API" = polloi::safe_tail(split_dataset$cirrus, n)$events, "Geo Search" = polloi::safe_tail(split_dataset$geo, n)$events, "OpenSearch" = polloi::safe_tail(split_dataset$open, n)$events, @@ -466,6 +388,8 @@ rm(i) gg_prop_bar(api_latest, cols = list(item = "API", prop = "Prop", label = "Label")) }) + + ## KPI Modules output$kpi_load_time_series <- renderDygraph({ smooth_level <- input$smoothing_kpi_load_time num_of_days_in_common <- min(sapply(list(desktop_load_data$Median, mobile_load_data$Median, android_load_data$Median, ios_load_data$Median), length)) @@ -489,10 +413,10 @@ dyAxis("y2", label = "Day-to-day % change in median load time", independentTicks = TRUE, drawGrid = FALSE) %>% dyLegend(width = 500, show = "always") %>% - dyOptions(strokeWidth = 2, colors = brewer.pal(5, "Set2")[5:1], + dyOptions(strokeWidth = 2, colors = RColorBrewer::brewer.pal(5, "Set2")[5:1], drawPoints = FALSE, pointSize = 3, labelsKMB = TRUE, includeZero = TRUE) %>% - dyCSS(css = "./assets/css/custom.css")) + dyCSS(css = system.file("custom.css", package = "polloi"))) }) output$kpi_zero_results_series <- renderDygraph({ smooth_level <- input$smoothing_kpi_zero_results @@ -507,21 +431,21 @@ ylab = "% of search queries that yield zero results") %>% dySeries("change", axis = 'y2', label = "day-to-day % change", strokeWidth = 1) %>% dyLimit(limit = 12.50, label = "Goal: 12.50% zero results rate", - color = brewer.pal(3, "Set2")[3]) %>% + color = RColorBrewer::brewer.pal(3, "Set2")[3]) %>% dyAxis("y2", label = "Day-to-day % change", valueRange = c(-1, 1) * max(max(abs(as.numeric(zrr$change))), 10), - axisLineColor = brewer.pal(3, "Set2")[2], - axisLabelColor = brewer.pal(3, "Set2")[2], + axisLineColor = RColorBrewer::brewer.pal(3, "Set2")[2], + axisLabelColor = RColorBrewer::brewer.pal(3, "Set2")[2], 
independentTicks = TRUE, drawGrid = FALSE) %>% dyAxis("y", drawGrid = FALSE, - axisLineColor = brewer.pal(3, "Set2")[1], - axisLabelColor = brewer.pal(3, "Set2")[1]) %>% - dyLimit(limit = 0, color = brewer.pal(3, "Set2")[2], strokePattern = "dashed") %>% + axisLineColor = RColorBrewer::brewer.pal(3, "Set2")[1], + axisLabelColor = RColorBrewer::brewer.pal(3, "Set2")[1]) %>% + dyLimit(limit = 0, color = RColorBrewer::brewer.pal(3, "Set2")[2], strokePattern = "dashed") %>% dyLegend(width = 400, show = "always") %>% - dyOptions(strokeWidth = 3, colors = brewer.pal(3, "Set2"), + dyOptions(strokeWidth = 3, colors = RColorBrewer::brewer.pal(3, "Set2"), drawPoints = FALSE, pointSize = 3, labelsKMB = TRUE, includeZero = TRUE) %>% - dyCSS(css = "./assets/css/custom.css")) + dyCSS(css = system.file("custom.css", package = "polloi"))) }) output$kpi_api_usage_series <- renderDygraph({ smooth_level <- input$smoothing_kpi_api_usage @@ -541,12 +465,12 @@ ylab = ifelse(input$kpi_api_usage_series_log_scale, "Calls (log10 scale)", "Calls")) %>% dySeries("cirrus", label = "full-text via API") %>% dyLegend(width = 400, show = "always") %>% - dyOptions(strokeWidth = 3, colors = brewer.pal(6, "Set2")[6:1], + dyOptions(strokeWidth = 3, colors = RColorBrewer::brewer.pal(6, "Set2")[6:1], drawPoints = FALSE, pointSize = 3, labelsKMB = TRUE, includeZero = input$kpi_api_usage_series_log_scale, logscale = input$kpi_api_usage_series_log_scale ) %>% - dyCSS(css = "./assets/css/custom.css")) + dyCSS(css = system.file("custom.css", package = "polloi"))) } api_usage_change <- transform(api_usage, cirrus = polloi::percent_change(cirrus), @@ -563,10 +487,18 @@ main = "Day-to-day % change over time", xlab = "Date", ylab = "% change") %>% dyLegend(width = 400, show = "always") %>% - dyOptions(strokeWidth = 3, colors = brewer.pal(6, "Set2"), + dyOptions(strokeWidth = 3, colors = RColorBrewer::brewer.pal(6, "Set2"), drawPoints = FALSE, pointSize = 3, labelsKMB = TRUE, includeZero = TRUE) %>% - dyCSS(css = 
"./assets/css/custom.css")) + dyCSS(css = system.file("custom.css", package = "polloi"))) + }) + output$kpi_augmented_clickthroughs_series <- renderDygraph({ + smoothed_data <- polloi::smoother(augmented_clickthroughs, + smooth_level = polloi::smooth_switch(input$smoothing_global, input$smoothing_augmented_clickthroughs)) + polloi::make_dygraph(data = smoothed_data, xlab = "Date", ylab = "Rates", "User engagement (augmented clickthroughs) by day") %>% + dySeries(name = colnames(smoothed_data)[2], strokeWidth = 1.5, strokePattern = "dashed") %>% + dySeries(name = colnames(smoothed_data)[3], strokeWidth = 1.5, strokePattern = "dashed") %>% + dyLegend(labelsDiv = "kpi_augmented_clickthroughs_series_legend") }) }) diff --git a/ui.R b/ui.R index df544c7..4b084ce 100644 --- a/ui.R +++ b/ui.R @@ -4,7 +4,7 @@ options(scipen = 500) #Header elements for the visualisation -header <- dashboardHeader(title = "Search & Discovery", disable = FALSE) +header <- dashboardHeader(title = "Search Metrics", disable = FALSE) #Sidebar elements for the search visualisations. 
sidebar <- dashboardSidebar( @@ -17,7 +17,8 @@ menuSubItem(text = "Summary", tabName = "kpis_summary"), menuSubItem(text = "Load times", tabName = "kpi_load_time"), menuSubItem(text = "Zero results", tabName = "kpi_zero_results"), - menuSubItem(text = "API usage", tabName = "kpi_api_usage")), + menuSubItem(text = "API usage", tabName = "kpi_api_usage"), + menuSubItem(text = "Augmented Clickthrough", tabName = "kpi_augmented_clickthroughs")), menuItem(text = "Desktop", menuSubItem(text = "Events", tabName = "desktop_events"), menuSubItem(text = "Load times", tabName = "desktop_load")), @@ -26,20 +27,19 @@ menuSubItem(text = "Load times", tabName = "mobile_load")), menuItem(text = "Mobile Apps", menuSubItem(text = "Events", tabName = "app_events"), - menuSubItem(text = "Load times", tabName = "app_load") - ), + menuSubItem(text = "Load times", tabName = "app_load")), menuItem(text = "API", menuSubItem(text = "Full-text via API", tabName = "fulltext_search"), menuSubItem(text = "Open Search", tabName = "open_search"), menuSubItem(text = "Geo Search", tabName = "geo_search"), menuSubItem(text = "Prefix Search", tabName = "prefix_search"), - menuSubItem(text = "Language Search", tabName = "language_search") - ), + menuSubItem(text = "Language Search", tabName = "language_search")), menuItem(text = "Zero Results", menuSubItem(text = "Summary", tabName = "failure_rate"), menuSubItem(text = "Search Type Breakdown", tabName = "failure_breakdown"), - menuSubItem(text = "Search Suggestions", tabName = "failure_suggestions") - ), + menuSubItem(text = "Search Suggestions", tabName = "failure_suggestions")), + menuItem(text = "Page Visit Times", tabName = "survival", + badgeLabel = "new", badgeColor = "fuchsia"), selectInput(inputId = "smoothing_global", label = "Smoothing (Global Setting)", selectize = TRUE, selected = "day", choices = c("No Smoothing" = "day", "Moving Average" = "moving_avg", "Weekly Median" = "week", "Monthly Median" = "month")) @@ -65,7 +65,7 @@ 
fluidRow(valueBoxOutput("kpi_summary_box_load_time", width = 3), valueBoxOutput("kpi_summary_box_zero_results", width = 3), valueBoxOutput("kpi_summary_box_api_usage", width = 3), - valueBox(subtitle = "User-satisfaction", value = "WIP", color = "black", width = 3)), + valueBoxOutput("kpi_summary_box_augmented_clickthroughs", width = 3)), plotOutput("kpi_summary_api_usage_proportions", height = "30px"), includeMarkdown("./assets/content/kpis_summary.md") ), @@ -95,6 +95,12 @@ column(smooth_select("smoothing_kpi_api_usage"), width = 3)), dygraphOutput("kpi_api_usage_series"), includeMarkdown("./assets/content/kpi_api_usage.md")), + tabItem(tabName = "kpi_augmented_clickthroughs", + fluidRow( + column(smooth_select("smoothing_augmented_clickthroughs"), width = 4), + column(div(id = "kpi_augmented_clickthroughs_series_legend"), width = 8)), + dygraphOutput("kpi_augmented_clickthroughs_series"), + includeMarkdown("./assets/content/kpi_augmented_clickthroughs.md")), tabItem(tabName = "desktop_events", fluidRow( valueBoxOutput("desktop_event_searches"), @@ -180,8 +186,17 @@ smooth_select("smoothing_failure_suggestions"), dygraphOutput("suggestion_dygraph_plot"), includeMarkdown("./assets/content/failure_suggests.md") + ), + tabItem(tabName = "survival", + fluidRow( + column(smooth_select("smoothing_lethal_dose_plot"), width = 4), + column(div(id = "lethal_dose_plot_legend"), width = 8) + ), + dygraphOutput("lethal_dose_plot"), + includeMarkdown("./assets/content/survival.md") ) ) ) -dashboardPage(header, sidebar, body, skin = "black") +dashboardPage(header, sidebar, body, skin = "black", + title = "Search Metrics Dashboard | Discovery | Engineering | Wikimedia Foundation") diff --git a/utils.R b/utils.R index 12d863a..3391e39 100644 --- a/utils.R +++ b/utils.R @@ -6,6 +6,98 @@ library(polloi) library(xts) +## Read in desktop data and generate means for the value boxes, along with a time-series appropriate form for +## dygraphs. 
+read_desktop <- function() { + data <- polloi::read_dataset("search/desktop_event_counts.tsv") + interim <- reshape2::dcast(data, formula = timestamp ~ action, fun.aggregate = sum) + interim[is.na(interim)] <- 0 + desktop_dygraph_set <<- interim + desktop_dygraph_means <<- round(colMeans(desktop_dygraph_set[,2:5])) + desktop_load_data <<- polloi::read_dataset("search/desktop_load_times.tsv") +} + +read_web <- function() { + data <- polloi::read_dataset("search/mobile_event_counts.tsv") + interim <- reshape2::dcast(data, formula = timestamp ~ action, fun.aggregate = sum) + interim[is.na(interim)] <- 0 + mobile_dygraph_set <<- interim + mobile_dygraph_means <<- round(colMeans(mobile_dygraph_set[,2:4])) + mobile_load_data <<- polloi::read_dataset("search/mobile_load_times.tsv") +} + +read_apps <- function() { + + data <- polloi::read_dataset("search/app_event_counts.tsv") + + ios <- reshape2::dcast(data[data$platform == "iOS",], formula = timestamp ~ action, fun.aggregate = sum) + android <- reshape2::dcast(data[data$platform == "Android",], formula = timestamp ~ action, fun.aggregate = sum) + ios_dygraph_set <<- ios + ios_dygraph_means <<- round(colMeans(ios[,2:4])) + + android_dygraph_set <<- android + android_dygraph_means <<- round(colMeans(android[,2:4])) + + app_load_data <- polloi::read_dataset("search/app_load_times.tsv") + ios_load_data <<- app_load_data[app_load_data$platform == "iOS", names(app_load_data) != "platform"] + android_load_data <<- app_load_data[app_load_data$platform == "Android", names(app_load_data) != "platform"] + +} + +read_api <- function(){ + data <- polloi::read_dataset("search/search_api_aggregates.tsv") + data <- data[order(data$event_type),] + split_dataset <<- split(data, f = data$event_type) +} + +read_failures <- function(date) { + + data <- polloi::read_dataset("search/cirrus_query_aggregates.tsv") + interim_data <- reshape2::dcast(data, formula = date ~ variable, fun.aggregate = sum) + failure_dygraph_set <<- interim_data + + 
interim_vector <- interim_data$`Zero Result Queries`/interim_data$`Search Queries` + output_vector <- (interim_vector[2:nrow(interim_data)] - interim_vector[1:(nrow(interim_data)-1)]) / interim_vector[1:(nrow(interim_data)-1)] + + failure_roc_dygraph_set <<- data.frame(date = interim_data$date[2:nrow(interim_data)], + variable = "failure ROC", + daily_change = output_vector*100, + stringsAsFactors = FALSE) + + interim_breakdown_data <- polloi::read_dataset("search/cirrus_query_breakdowns.tsv") + interim_breakdown_data$value <- interim_breakdown_data$value*100 + failure_breakdown_dygraph_set <<- reshape2::dcast(interim_breakdown_data, + formula = date ~ variable, fun.aggregate = sum) + + suggestion_data <- polloi::read_dataset("search/cirrus_suggestion_breakdown.tsv") + suggestion_data$variable <- "Full-Text with Suggestions" + suggestion_data$value <- suggestion_data$value*100 + suggestion_data <- rbind(suggestion_data, + interim_breakdown_data[interim_breakdown_data$date %in% suggestion_data$date + & interim_breakdown_data$variable == "Full-Text Search",]) + suggestion_dygraph_set <<- reshape2::dcast(suggestion_data, + formula = date ~ variable, fun.aggregate = sum) + +} + +read_augmented_clickthrough <- function() { + data <- polloi::read_dataset("search/search_threshold_pass_rate.tsv") + temp <- polloi::safe_tail(desktop_dygraph_set, nrow(data))[, c('clickthroughs', 'Result pages opened')] + + polloi::safe_tail(mobile_dygraph_set, nrow(data))[, c('clickthroughs', 'Result pages opened')] + + polloi::safe_tail(ios_dygraph_set, nrow(data))[, c('clickthroughs', 'Result pages opened')] + + polloi::safe_tail(android_dygraph_set, nrow(data))[, c('clickthroughs', 'Result pages opened')] + intermediary_dataset <- cbind(data, clickthrough_rate = 100 * temp$clickthroughs/temp$'Result pages opened') + colnames(intermediary_dataset) <- c("date", "threshold_passing_rate", "clickthrough_rate") + intermediary_dataset$threshold_passing_rate <- 100 * 
intermediary_dataset$threshold_passing_rate + augmented_clickthroughs <<- transform(intermediary_dataset, user_engagement = (threshold_passing_rate + clickthrough_rate)/2) +} + +read_lethal_dose <- function() { + intermediary_dataset <- polloi::read_dataset("search/sample_page_visit_ld.tsv") + colnames(intermediary_dataset) <- c("date", "10%", "25%", "50%", "75%", "90%", "95%", "99%") + user_page_visit_dataset <<- intermediary_dataset +} + # Uses ggplot2 to create a pie chart in bar form. (Will look up actual name) gg_prop_bar <- function(data, cols) { # `cols` = list(`item`, `prop`, `label`) @@ -27,3 +119,19 @@ y = "text_position", x = 1)) } + +date_range_switch <- function(date_range, daily = 2, weekly = 14, monthly = 60, quarterly = 90) { + return(switch(date_range, daily = daily, weekly = weekly, monthly = monthly, quarterly = quarterly)) +} + +CustomAxisFormatter <- 'function (d, gran) { + var weekday = new Array(7); + weekday[0]= "Sunday"; + weekday[1] = "Monday"; + weekday[2] = "Tuesday"; + weekday[3] = "Wednesday"; + weekday[4] = "Thursday"; + weekday[5] = "Friday"; + weekday[6] = "Saturday"; + return weekday[d.getDay()] + " (" + (d.getMonth()+1) + "/" + d.getDate() + ")"; +}' -- To view, visit https://gerrit.wikimedia.org/r/241115 Gerrit-MessageType: merged Gerrit-Change-Id: I808213c595925ae7ad3269403637b77483a4116e Gerrit-PatchSet: 2 Gerrit-Project: wikimedia/discovery/rainbow Gerrit-Branch: master Gerrit-Owner: Bearloga <mpo...@wikimedia.org> Gerrit-Reviewer: Bearloga <mpo...@wikimedia.org> Gerrit-Reviewer: OliverKeyes <oke...@wikimedia.org>
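[Editor's note] The KPI value boxes in server.R share one trend computation: split the windowed series into halves with `polloi::half`, take each half's median, and report the percent change `z = 100 * (y2 - y1) / y1`. A Python sketch of that pattern, assuming `polloi::half(x)` returns the first half and `polloi::half(x, FALSE)` the second (names here are illustrative):

```python
import statistics

def trend_pct_change(series):
    """Median of the second half vs. median of the first half, as % change.
    Mirrors the y1/y2/z computation in the value-box code."""
    mid = len(series) // 2
    y1 = statistics.median(series[:mid])   # earlier half of the window
    y2 = statistics.median(series[mid:])   # later half of the window
    return 100.0 * (y2 - y1) / y1

print(round(trend_pct_change([10, 12, 11, 13, 14, 16]), 2))  # 27.27
```

A positive result drives the green up-arrow styling, a negative one the red down-arrow, and zero the orange "no change" box.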