OliverKeyes has submitted this change and it was merged.

Change subject: Adds augmented clickthroughs to dashboard + Adds 'user 
engagement' KPI summary box + Adds 'user engagement' KPI time series + Adds 
page visit time LD10/25/50/75/90/95/99 time series + Fixes dashboard titles 
(including in browser window)
......................................................................


Adds augmented clickthroughs to dashboard
+ Adds 'user engagement' KPI summary box
+ Adds 'user engagement' KPI time series
+ Adds page visit time LD10/25/50/75/90/95/99 time series
+ Fixes dashboard titles (including in browser window)

Bug: T113637
Change-Id: I808213c595925ae7ad3269403637b77483a4116e
---
A assets/content/kpi_augmented_clickthroughs.md
M assets/content/kpis_summary.md
A assets/content/survival.md
M server.R
M ui.R
M utils.R
6 files changed, 263 insertions(+), 160 deletions(-)

Approvals:
  OliverKeyes: Verified; Looks good to me, approved



diff --git a/assets/content/kpi_augmented_clickthroughs.md 
b/assets/content/kpi_augmented_clickthroughs.md
new file mode 100644
index 0000000..4a52a64
--- /dev/null
+++ b/assets/content/kpi_augmented_clickthroughs.md
@@ -0,0 +1,23 @@
+Key Performance Indicator: User Engagement (Augmented Clickthroughs)
+=======
+
+We are in the process of obtaining qualitative data from our users (their intent and satisfaction), so this metric is less akin to "user satisfaction" and more akin to the "user engagement" we observe.
+
+This metric combines the clickthrough rate and the proportion of users' sessions whose dwell time exceeds the 10s threshold.
+
+Outages and inaccuracies
+------
+
+* None so far!
+
+Questions, bug reports, and feature suggestions
+------
+For technical, non-bug questions, [email Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question). If you experience a bug, notice something wrong, or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) on the Discovery board or [email Dan](mailto:dga...@wikimedia.org?subject=Dashboard%20Question).
+
+<hr style="border-color: gray;">
+<p style="font-size: small; color: gray;">
+  <strong>Link to this dashboard:</strong>
+  <a href="http://searchdata.wmflabs.org/metrics/#kpi_zero_results";>
+    http://searchdata.wmflabs.org/metrics/#kpi_zero_results
+  </a>
+</p>
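The combination described in this file can be sketched in R with made-up example values; the dashboard's actual computation lives in `read_augmented_clickthrough()` in utils.R below.

```r
# Sketch of the augmented-clickthrough ("user engagement") metric.
# All values here are illustrative, not real dashboard data.
clickthroughs       <- c(120, 150)    # daily clickthrough counts
result_pages_opened <- c(400, 500)    # daily result pages opened
threshold_pass_rate <- c(0.35, 0.40)  # proportion of sessions dwelling > 10s

clickthrough_rate <- 100 * clickthroughs / result_pages_opened
user_engagement   <- (100 * threshold_pass_rate + clickthrough_rate) / 2
user_engagement   # 32.5 35.0
```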
diff --git a/assets/content/kpis_summary.md b/assets/content/kpis_summary.md
index 40d2ab6..ff55a60 100644
--- a/assets/content/kpis_summary.md
+++ b/assets/content/kpis_summary.md
@@ -1,10 +1,10 @@
 Key Performance Indicators
 =======
 
-* **User-perceived load time** If our search is fast and snappy, then more 
people will use it! 
-* **Zero Results Rate** If a user gets zero results for their query, they’ve 
by definition not found what they’re looking for.
-* **API usage** We want people, both within our movement and outside it, to be 
able to easily access our information.
-* **[User 
Satisfaction](https://meta.wikimedia.org/wiki/Research:Measuring_User_Search_Satisfaction)**
 If a user searches for something and clicks on a result, then they found what 
they wanted.
+- **User-perceived load time** If our search is fast and snappy, then more 
people will use it! 
+- **Zero Results Rate** If a user gets zero results for their query, they’ve 
by definition not found what they’re looking for.
+- **API usage** We want people, both within our movement and outside it, to be 
able to easily access our information.
+- **User Engagement** (not quite **[User Satisfaction](https://meta.wikimedia.org/wiki/Research:Measuring_User_Search_Satisfaction)**) This is an augmented version of the clickthrough rate: it also includes the proportion of users' sessions whose dwell time exceeds a pre-specified threshold. **Note** that we deployed v2.0 of the satisfaction schema on 2015-09-02, so you may see **NA** when we do not yet have enough data for the selected date range.
 
 Outages and inaccuracies
 ------
diff --git a/assets/content/survival.md b/assets/content/survival.md
new file mode 100644
index 0000000..cd41a11
--- /dev/null
+++ b/assets/content/survival.md
@@ -0,0 +1,25 @@
+Automated survival analysis: page visit times
+=======
+
+This shows the length of time that must pass before we lose N% of the test population. In general, it appears to take 15s to lose 10%, 25-35s to lose 25%, and 55-75s to lose 50%. When these numbers go up, we can infer that users are staying on the pages longer.
+
+On most days, we retain at least 20% of the test population past the 7-minute mark (the point at which the user's browser stops checking in), so on those days we cannot calculate the time it takes to lose 90/95/99% of the users.
+
+There are some days when we CAN calculate those times; on those days it takes anywhere between 270s (4m30s) and 390s (6m30s) for 90% of the users to have closed the page they reached from the search results page.
+
+Outages and inaccuracies
+------
+
+* None so far!
+
+Questions, bug reports, and feature suggestions
+------
+For technical, non-bug questions, [email Mikhail](mailto:mpo...@wikimedia.org?subject=Dashboard%20Question). If you experience a bug, notice something wrong, or have a suggestion, [open a ticket in Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=Discovery) on the Discovery board or [email Dan](mailto:dga...@wikimedia.org?subject=Dashboard%20Question).
+
+<hr style="border-color: gray;">
+<p style="font-size: small; color: gray;">
+  <strong>Link to this dashboard:</strong>
+  <a href="http://searchdata.wmflabs.org/metrics/#survival";>
+    http://searchdata.wmflabs.org/metrics/#survival
+  </a>
+</p>
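Assuming every visit's end is observed, the "time to lose N%" figures described above are simply percentiles of the page-visit durations; a minimal R sketch with made-up data:

```r
# Made-up page-visit durations in seconds. In practice the browser stops
# checking in at 7 minutes, so the upper tail is right-censored and the
# 90/95/99% times cannot always be computed.
visit_seconds <- c(5, 12, 18, 30, 42, 60, 75, 120, 300, 410)

# Time at which we have lost 10/25/50/90% of the test population.
quantile(visit_seconds, probs = c(0.10, 0.25, 0.50, 0.90))
```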
diff --git a/server.R b/server.R
index e8b40bb..d91a6ab 100644
--- a/server.R
+++ b/server.R
@@ -1,95 +1,18 @@
 ## Version 0.2.0
 source("utils.R")
 
-existing_date <- (Sys.Date()-1)
-
-## Read in desktop data and generate means for the value boxes, along with a 
time-series appropriate form for
-## dygraphs.
-read_desktop <- function(){
-  data <- polloi::read_dataset("search/desktop_event_counts.tsv")
-  interim <- reshape2::dcast(data, formula = timestamp ~ action, fun.aggregate 
= sum)
-  interim[is.na(interim)] <- 0
-  desktop_dygraph_set <<- interim
-  desktop_dygraph_means <<- round(colMeans(desktop_dygraph_set[,2:5]))
-
-  data <- polloi::read_dataset("search/desktop_load_times.tsv")
-  desktop_load_data <<- data
-  return(invisible())
-}
-
-read_web <- function(){
-  data <- polloi::read_dataset("search/mobile_event_counts.tsv")
-  interim <- reshape2::dcast(data, formula = timestamp ~ action, fun.aggregate 
= sum)
-  interim[is.na(interim)] <- 0
-  mobile_dygraph_set <<- interim
-  mobile_dygraph_means <<- round(colMeans(mobile_dygraph_set[,2:4]))
-
-  mobile_load_data <<- polloi::read_dataset("search/mobile_load_times.tsv")
-  return(invisible())
-}
-
-read_apps <- function(){
-  data <- polloi::read_dataset("search/app_event_counts.tsv")
-
-  ios <- reshape2::dcast(data[data$platform == "iOS",], formula = timestamp ~ 
action, fun.aggregate = sum)
-  android <- reshape2::dcast(data[data$platform == "Android",], formula = 
timestamp ~ action, fun.aggregate = sum)
-  ios_dygraph_set <<- ios
-  ios_dygraph_means <<- round(colMeans(ios[,2:4]))
-
-  android_dygraph_set <<- android
-  android_dygraph_means <<- round(colMeans(android[,2:4]))
-
-  app_load_data <- polloi::read_dataset("search/app_load_times.tsv")
-  ios_load_data <<- app_load_data[app_load_data$platform == "iOS", 
names(app_load_data) != "platform"]
-  android_load_data <<- app_load_data[app_load_data$platform == "Android", 
names(app_load_data) != "platform"]
-
-  return(invisible())
-}
-
-read_api <- function(){
-  data <- polloi::read_dataset("search/search_api_aggregates.tsv")
-  data <- data[order(data$event_type),]
-  split_dataset <<- split(data, f = data$event_type)
-  return(invisible())
-}
-
-read_failures <- function(date){
-  data <- polloi::read_dataset("search/cirrus_query_aggregates.tsv")
-  interim_data <- reshape2::dcast(data, formula = date ~ variable, 
fun.aggregate = sum)
-  failure_dygraph_set <<- interim_data
-
-  interim_vector <- interim_data$`Zero Result Queries`/interim_data$`Search 
Queries`
-  output_vector <- (interim_vector[2:nrow(interim_data)] - 
interim_vector[1:(nrow(interim_data)-1)]) / 
interim_vector[1:(nrow(interim_data)-1)]
-
-  failure_roc_dygraph_set <<- data.frame(date = 
interim_data$date[2:nrow(interim_data)],
-                                         variable = "failure ROC",
-                                         daily_change = output_vector*100,
-                                         stringsAsFactors = FALSE)
-
-  interim_breakdown_data <- 
polloi::read_dataset("search/cirrus_query_breakdowns.tsv")
-  interim_breakdown_data$value <- interim_breakdown_data$value*100
-  failure_breakdown_dygraph_set <<- reshape2::dcast(interim_breakdown_data,
-                                                    formula = date ~ variable, 
fun.aggregate = sum)
-
-  suggestion_data <- 
polloi::read_dataset("search/cirrus_suggestion_breakdown.tsv")
-  suggestion_data$variable <- "Full-Text with Suggestions"
-  suggestion_data$value <- suggestion_data$value*100
-  suggestion_data <- rbind(suggestion_data,
-                           interim_breakdown_data[interim_breakdown_data$date 
%in% suggestion_data$date
-                                                  & 
interim_breakdown_data$variable == "Full-Text Search",])
-  suggestion_dygraph_set <<- reshape2::dcast(suggestion_data,
-                                             formula = date ~ variable, 
fun.aggregate = sum)
-  return(invisible())
-}
+existing_date <- Sys.Date() - 1
 
 shinyServer(function(input, output) {
 
-  if(Sys.Date() != existing_date){
+  if (Sys.Date() != existing_date) {
     read_desktop()
     read_apps()
     read_web()
     read_api()
     read_failures(existing_date)
+    read_augmented_clickthrough()
+    read_lethal_dose()
     existing_date <<- Sys.Date()
   }
 
@@ -311,7 +234,18 @@
                          xlab = "Date", ylab = "Zero Results Rate (%)", "Zero 
Result Rates with Search Suggestions")
   })
 
-  ## KPI module
+  output$lethal_dose_plot <- renderDygraph({
+    polloi::make_dygraph(data = polloi::smoother(user_page_visit_dataset,
+                                                 smooth_level = 
polloi::smooth_switch(input$smoothing_global,
+                                                                               
       input$smoothing_lethal_dose_plot)),
+                         xlab = "", ylab = "Time (s)",
+                         title = "Time at which we have lost N% of the users") 
%>%
+      dyAxis("x", ticker = "Dygraph.dateTicker", axisLabelFormatter = 
CustomAxisFormatter,
+             axisLabelWidth = 100, pixelsPerLabel = 80) %>%
+      dyLegend(labelsDiv = "lethal_dose_plot_legend")
+  })
+
+  ## KPI Summary Boxes
   output$kpi_summary_date_range <- renderUI({
     date_range <- input$kpi_summary_date_range_selector
     switch(date_range,
@@ -355,39 +289,15 @@
   })
   output$kpi_summary_box_load_time <- renderValueBox({
     date_range <- input$kpi_summary_date_range_selector
-    switch(date_range,
-           daily = {
-             x <- lapply(list(desktop_load_data, mobile_load_data,
-                              android_load_data, ios_load_data),
-                         polloi::safe_tail, n = 2) %>%
-               lapply(function(data_tail) return(data_tail$Median)) %>%
-               do.call(cbind, .) %>%
-               apply(MARGIN = 1, FUN = median)
-           },
-           weekly = {
-             x <- lapply(list(desktop_load_data, mobile_load_data,
-                              android_load_data, ios_load_data),
-                         polloi::safe_tail, n = 14) %>%
-               lapply(function(data_tail) return(data_tail$Median)) %>%
-               do.call(cbind, .) %>%
-               apply(MARGIN = 1, FUN = median)
-           },
-           monthly = {
-             x <- lapply(list(desktop_load_data, mobile_load_data,
-                              android_load_data, ios_load_data),
-                         polloi::safe_tail, n = 60) %>%
-               lapply(function(data_tail) return(data_tail$Median)) %>%
-               do.call(cbind, .) %>%
-               apply(MARGIN = 1, FUN = median)
-           },
-           quarterly = {
-             x <- lapply(list(desktop_load_data, mobile_load_data,
-                              android_load_data, ios_load_data),
-                         polloi::safe_tail, n = 90) %>%
-               lapply(function(data_tail) return(data_tail$Median))
-             y <- median(apply(do.call(cbind, x), 1, median))
-             return(valueBox(subtitle = "Load time", value = sprintf("%.0fms", 
y), color = "orange"))
-           })
+    x <- lapply(list(desktop_load_data, mobile_load_data,
+                     android_load_data, ios_load_data),
+                polloi::safe_tail, n = date_range_switch(date_range)) %>%
+      lapply(function(data_tail) return(data_tail$Median))
+    if (date_range == "quarterly") {
+      y <- median(apply(do.call(cbind, x), 1, median))
+      return(valueBox(subtitle = "Load time", value = sprintf("%.0fms", y), 
color = "orange"))
+    }
+    x %<>% do.call(cbind, .) %>% apply(MARGIN = 1, FUN = median)
     y1 <- median(polloi::half(x)); y2 <- median(polloi::half(x, FALSE)); z <- 
100 * (y2 - y1) / y1
     if (abs(z) > 0) {
       return(valueBox(subtitle = sprintf("Load time (%.1f%%)", z),
@@ -398,11 +308,7 @@
   })
   output$kpi_summary_box_zero_results <- renderValueBox({
     date_range <- input$kpi_summary_date_range_selector
-    switch(date_range,
-           daily = {x <- polloi::safe_tail(failure_dygraph_set, 2)},
-           weekly = {x <- polloi::safe_tail(failure_dygraph_set, 14)},
-           monthly = {x <- polloi::safe_tail(failure_dygraph_set, 60)},
-           quarterly = {x <- polloi::safe_tail(failure_dygraph_set, 90)})
+    x <- polloi::safe_tail(failure_dygraph_set, date_range_switch(date_range))
     x <- transform(x, Rate = `Zero Result Queries` / `Search Queries`)$Rate
     if (date_range == "quarterly") {
       return(valueBox(subtitle = "Zero results rate", color = "orange",
@@ -422,18 +328,14 @@
   output$kpi_summary_box_api_usage <- renderValueBox({
     date_range <- input$kpi_summary_date_range_selector
     x <- lapply(split_dataset, function(x) {
-      switch(date_range,
-             daily = { polloi::safe_tail(x, 2)$events },
-             weekly = { polloi::safe_tail(x, 14)$events },
-             monthly = { polloi::safe_tail(x, 60)$events },
-             quarterly = { polloi::safe_tail(x, 90)$events })
+      polloi::safe_tail(x, date_range_switch(date_range))$events
     }) %>% do.call(cbind, .) %>%
       transform(total = cirrus + geo + language + open + prefix) %>%
       { .$total }
     if (date_range == "quarterly") {
       return(valueBox(subtitle = "API usage", value = 
polloi::compress(median(x), 0), color = "orange"))
     }
-    y1 <- median(polloi::half(x, FALSE))
+    y1 <- median(polloi::half(x, TRUE))
     y2 <- median(polloi::half(x, FALSE))
     z <- 100 * (y2 - y1) / y1 # % change from t-1 to t
     if (abs(z) > 0) {
@@ -442,13 +344,33 @@
     }
     return(valueBox(subtitle = "API usage (no change)", value = 
polloi::compress(y2, 0), color = "orange"))
   })
+  output$kpi_summary_box_augmented_clickthroughs <- renderValueBox({
+    date_range <- input$kpi_summary_date_range_selector
+    #========= We can delete this block after we get 90 days of data =========
+    if ((date_range == "monthly" && (Sys.Date() - 1) - 60 < as.Date("2015-09-02")) || (date_range == "quarterly" && (Sys.Date() - 1) - 90 < as.Date("2015-09-02"))) {
+      return(valueBox(subtitle = "User engagement", color = "black", value = 
"NA"))
+    }
+    #=========================================================================
+    x <- polloi::safe_tail(augmented_clickthroughs, 
date_range_switch(date_range))
+    if (date_range == "quarterly") {
+      return(valueBox(subtitle = "User engagement", color = "orange",
+                      value = sprintf("%.1f%%", median(x$user_engagement))))
+    }
+    y1 <- median(polloi::half(x$user_engagement))
+    y2 <- median(polloi::half(x$user_engagement, FALSE))
+    z <- 100 * (y2 - y1)/y1
+    if (abs(z) > 0) {
+      return(valueBox(
+        subtitle = sprintf("User engagement (%.1f%%)", z),
+        value = sprintf("%.1f%%", y2),
+        icon = cond_icon(z > 0), color = polloi::cond_color(z > 0, "green")
+      ))
+    }
+    return(valueBox(subtitle = "User engagement (no change)",
+                    value = sprintf("%.1f%%", y2), color = "orange"))
+  })
   output$kpi_summary_api_usage_proportions <- renderPlot({
-    switch (input$kpi_summary_date_range_selector,
-            daily = { n <- 1 },
-            weekly = { n <- 7 },
-            monthly = { n <- 30 },
-            quarterly = { n <- 90 }
-    )
+    n <- date_range_switch(input$kpi_summary_date_range_selector, 1, 7, 30, 90)
     api_latest <- cbind("Full-text via API" = 
polloi::safe_tail(split_dataset$cirrus, n)$events,
                         "Geo Search" = polloi::safe_tail(split_dataset$geo, 
n)$events,
                         "OpenSearch" = polloi::safe_tail(split_dataset$open, 
n)$events,
@@ -466,6 +388,8 @@
     rm(i)
     gg_prop_bar(api_latest, cols = list(item = "API", prop = "Prop", label = 
"Label"))
   })
+
+  ## KPI Modules
   output$kpi_load_time_series <- renderDygraph({
     smooth_level <- input$smoothing_kpi_load_time
     num_of_days_in_common <- min(sapply(list(desktop_load_data$Median, 
mobile_load_data$Median, android_load_data$Median, ios_load_data$Median), 
length))
@@ -489,10 +413,10 @@
              dyAxis("y2", label = "Day-to-day % change in median load time",
                     independentTicks = TRUE, drawGrid = FALSE) %>%
              dyLegend(width = 500, show = "always") %>%
-             dyOptions(strokeWidth = 2, colors = brewer.pal(5, "Set2")[5:1],
+             dyOptions(strokeWidth = 2, colors = RColorBrewer::brewer.pal(5, 
"Set2")[5:1],
                        drawPoints = FALSE, pointSize = 3, labelsKMB = TRUE,
                        includeZero = TRUE) %>%
-             dyCSS(css = "./assets/css/custom.css"))
+             dyCSS(css = system.file("custom.css", package = "polloi")))
   })
   output$kpi_zero_results_series <- renderDygraph({
     smooth_level <- input$smoothing_kpi_zero_results
@@ -507,21 +431,21 @@
                    ylab = "% of search queries that yield zero results") %>%
              dySeries("change", axis = 'y2', label = "day-to-day % change", 
strokeWidth = 1) %>%
              dyLimit(limit = 12.50, label = "Goal: 12.50% zero results rate",
-                     color = brewer.pal(3, "Set2")[3]) %>%
+                     color = RColorBrewer::brewer.pal(3, "Set2")[3]) %>%
              dyAxis("y2", label = "Day-to-day % change",
                     valueRange = c(-1, 1) * 
max(max(abs(as.numeric(zrr$change))), 10),
-                    axisLineColor = brewer.pal(3, "Set2")[2],
-                    axisLabelColor = brewer.pal(3, "Set2")[2],
+                    axisLineColor = RColorBrewer::brewer.pal(3, "Set2")[2],
+                    axisLabelColor = RColorBrewer::brewer.pal(3, "Set2")[2],
                     independentTicks = TRUE, drawGrid = FALSE) %>%
              dyAxis("y", drawGrid = FALSE,
-                    axisLineColor = brewer.pal(3, "Set2")[1],
-                    axisLabelColor = brewer.pal(3, "Set2")[1]) %>%
-             dyLimit(limit = 0, color = brewer.pal(3, "Set2")[2], 
strokePattern = "dashed") %>%
+                    axisLineColor = RColorBrewer::brewer.pal(3, "Set2")[1],
+                    axisLabelColor = RColorBrewer::brewer.pal(3, "Set2")[1]) 
%>%
+             dyLimit(limit = 0, color = RColorBrewer::brewer.pal(3, 
"Set2")[2], strokePattern = "dashed") %>%
              dyLegend(width = 400, show = "always") %>%
-             dyOptions(strokeWidth = 3, colors = brewer.pal(3, "Set2"),
+             dyOptions(strokeWidth = 3, colors = RColorBrewer::brewer.pal(3, 
"Set2"),
                        drawPoints = FALSE, pointSize = 3, labelsKMB = TRUE,
                        includeZero = TRUE) %>%
-             dyCSS(css = "./assets/css/custom.css"))
+             dyCSS(css = system.file("custom.css", package = "polloi")))
   })
   output$kpi_api_usage_series <- renderDygraph({
     smooth_level <- input$smoothing_kpi_api_usage
@@ -541,12 +465,12 @@
                      ylab = ifelse(input$kpi_api_usage_series_log_scale, 
"Calls (log10 scale)", "Calls")) %>%
                dySeries("cirrus", label = "full-text via API") %>%
                dyLegend(width = 400, show = "always") %>%
-               dyOptions(strokeWidth = 3, colors = brewer.pal(6, "Set2")[6:1],
+               dyOptions(strokeWidth = 3, colors = RColorBrewer::brewer.pal(6, 
"Set2")[6:1],
                          drawPoints = FALSE, pointSize = 3, labelsKMB = TRUE,
                          includeZero = input$kpi_api_usage_series_log_scale,
                          logscale = input$kpi_api_usage_series_log_scale
                ) %>%
-               dyCSS(css = "./assets/css/custom.css"))
+               dyCSS(css = system.file("custom.css", package = "polloi")))
     }
     api_usage_change <- transform(api_usage,
                                   cirrus = polloi::percent_change(cirrus),
@@ -563,10 +487,18 @@
                    main = "Day-to-day % change over time",
                    xlab = "Date", ylab = "% change") %>%
              dyLegend(width = 400, show = "always") %>%
-             dyOptions(strokeWidth = 3, colors = brewer.pal(6, "Set2"),
+             dyOptions(strokeWidth = 3, colors = RColorBrewer::brewer.pal(6, 
"Set2"),
                        drawPoints = FALSE, pointSize = 3, labelsKMB = TRUE,
                        includeZero = TRUE) %>%
-             dyCSS(css = "./assets/css/custom.css"))
+             dyCSS(css = system.file("custom.css", package = "polloi")))
+  })
+  output$kpi_augmented_clickthroughs_series <- renderDygraph({
+    smoothed_data <- polloi::smoother(augmented_clickthroughs,
+      smooth_level = polloi::smooth_switch(input$smoothing_global, 
input$smoothing_augmented_clickthroughs))
+    polloi::make_dygraph(data = smoothed_data, xlab = "Date", ylab = "Rates", 
"User engagement (augmented clickthroughs) by day") %>%
+      dySeries(name = colnames(smoothed_data)[2], strokeWidth = 1.5, 
strokePattern = "dashed") %>%
+      dySeries(name = colnames(smoothed_data)[3], strokeWidth = 1.5, 
strokePattern = "dashed") %>%
+      dyLegend(labelsDiv = "kpi_augmented_clickthroughs_series_legend")
   })
 
 })
diff --git a/ui.R b/ui.R
index df544c7..4b084ce 100644
--- a/ui.R
+++ b/ui.R
@@ -4,7 +4,7 @@
 options(scipen = 500)
 
 #Header elements for the visualisation
-header <- dashboardHeader(title = "Search & Discovery", disable = FALSE)
+header <- dashboardHeader(title = "Search Metrics", disable = FALSE)
 
 #Sidebar elements for the search visualisations.
 sidebar <- dashboardSidebar(
@@ -17,7 +17,8 @@
              menuSubItem(text = "Summary", tabName = "kpis_summary"),
              menuSubItem(text = "Load times", tabName = "kpi_load_time"),
              menuSubItem(text = "Zero results", tabName = "kpi_zero_results"),
-             menuSubItem(text = "API usage", tabName = "kpi_api_usage")),
+             menuSubItem(text = "API usage", tabName = "kpi_api_usage"),
+             menuSubItem(text = "Augmented Clickthrough", tabName = 
"kpi_augmented_clickthroughs")),
     menuItem(text = "Desktop",
              menuSubItem(text = "Events", tabName = "desktop_events"),
              menuSubItem(text = "Load times", tabName = "desktop_load")),
@@ -26,20 +27,19 @@
              menuSubItem(text = "Load times", tabName = "mobile_load")),
     menuItem(text = "Mobile Apps",
              menuSubItem(text = "Events", tabName = "app_events"),
-             menuSubItem(text = "Load times", tabName = "app_load")
-    ),
+             menuSubItem(text = "Load times", tabName = "app_load")),
     menuItem(text = "API",
              menuSubItem(text = "Full-text via API", tabName = 
"fulltext_search"),
              menuSubItem(text = "Open Search", tabName = "open_search"),
              menuSubItem(text = "Geo Search", tabName = "geo_search"),
              menuSubItem(text = "Prefix Search", tabName = "prefix_search"),
-             menuSubItem(text = "Language Search", tabName = "language_search")
-    ),
+             menuSubItem(text = "Language Search", tabName = 
"language_search")),
     menuItem(text = "Zero Results",
              menuSubItem(text = "Summary", tabName = "failure_rate"),
              menuSubItem(text = "Search Type Breakdown", tabName = 
"failure_breakdown"),
-             menuSubItem(text = "Search Suggestions", tabName = 
"failure_suggestions")
-    ),
+             menuSubItem(text = "Search Suggestions", tabName = 
"failure_suggestions")),
+    menuItem(text = "Page Visit Times", tabName = "survival",
+             badgeLabel = "new", badgeColor = "fuchsia"),
     selectInput(inputId = "smoothing_global", label = "Smoothing (Global 
Setting)", selectize = TRUE, selected = "day",
                 choices = c("No Smoothing" = "day", "Moving Average" = 
"moving_avg",
                             "Weekly Median" = "week", "Monthly Median" = 
"month"))
@@ -65,7 +65,7 @@
             fluidRow(valueBoxOutput("kpi_summary_box_load_time", width = 3),
                      valueBoxOutput("kpi_summary_box_zero_results", width = 3),
                      valueBoxOutput("kpi_summary_box_api_usage", width = 3),
-                     valueBox(subtitle = "User-satisfaction", value = "WIP", 
color = "black", width = 3)),
+                     valueBoxOutput("kpi_summary_box_augmented_clickthroughs", 
width = 3)),
             plotOutput("kpi_summary_api_usage_proportions", height = "30px"),
             includeMarkdown("./assets/content/kpis_summary.md")
             ),
@@ -95,6 +95,12 @@
                      column(smooth_select("smoothing_kpi_api_usage"), width = 
3)),
             dygraphOutput("kpi_api_usage_series"),
             includeMarkdown("./assets/content/kpi_api_usage.md")),
+    tabItem(tabName = "kpi_augmented_clickthroughs",
+            fluidRow(
+              column(smooth_select("smoothing_augmented_clickthroughs"), width 
= 4),
+              column(div(id = "kpi_augmented_clickthroughs_series_legend"), 
width = 8)),
+            dygraphOutput("kpi_augmented_clickthroughs_series"),
+            
includeMarkdown("./assets/content/kpi_augmented_clickthroughs.md")),
     tabItem(tabName = "desktop_events",
             fluidRow(
               valueBoxOutput("desktop_event_searches"),
@@ -180,8 +186,17 @@
             smooth_select("smoothing_failure_suggestions"),
             dygraphOutput("suggestion_dygraph_plot"),
             includeMarkdown("./assets/content/failure_suggests.md")
+    ),
+    tabItem(tabName = "survival",
+            fluidRow(
+              column(smooth_select("smoothing_lethal_dose_plot"), width = 4),
+              column(div(id = "lethal_dose_plot_legend"), width = 8)
+            ),
+            dygraphOutput("lethal_dose_plot"),
+            includeMarkdown("./assets/content/survival.md")
     )
   )
 )
 
-dashboardPage(header, sidebar, body, skin = "black")
+dashboardPage(header, sidebar, body, skin = "black",
+              title = "Search Metrics Dashboard | Discovery | Engineering | 
Wikimedia Foundation")
diff --git a/utils.R b/utils.R
index 12d863a..3391e39 100644
--- a/utils.R
+++ b/utils.R
@@ -6,6 +6,98 @@
 library(polloi)
 library(xts)
 
+## Read in desktop data and generate means for the value boxes, along with a 
time-series appropriate form for
+## dygraphs.
+read_desktop <- function() {
+  data <- polloi::read_dataset("search/desktop_event_counts.tsv")
+  interim <- reshape2::dcast(data, formula = timestamp ~ action, fun.aggregate 
= sum)
+  interim[is.na(interim)] <- 0
+  desktop_dygraph_set <<- interim
+  desktop_dygraph_means <<- round(colMeans(desktop_dygraph_set[,2:5]))
+  desktop_load_data <<- polloi::read_dataset("search/desktop_load_times.tsv")
+}
+
+read_web <- function() {
+  data <- polloi::read_dataset("search/mobile_event_counts.tsv")
+  interim <- reshape2::dcast(data, formula = timestamp ~ action, fun.aggregate 
= sum)
+  interim[is.na(interim)] <- 0
+  mobile_dygraph_set <<- interim
+  mobile_dygraph_means <<- round(colMeans(mobile_dygraph_set[,2:4]))
+  mobile_load_data <<- polloi::read_dataset("search/mobile_load_times.tsv")
+}
+
+read_apps <- function() {
+
+  data <- polloi::read_dataset("search/app_event_counts.tsv")
+
+  ios <- reshape2::dcast(data[data$platform == "iOS",], formula = timestamp ~ 
action, fun.aggregate = sum)
+  android <- reshape2::dcast(data[data$platform == "Android",], formula = 
timestamp ~ action, fun.aggregate = sum)
+  ios_dygraph_set <<- ios
+  ios_dygraph_means <<- round(colMeans(ios[,2:4]))
+
+  android_dygraph_set <<- android
+  android_dygraph_means <<- round(colMeans(android[,2:4]))
+
+  app_load_data <- polloi::read_dataset("search/app_load_times.tsv")
+  ios_load_data <<- app_load_data[app_load_data$platform == "iOS", 
names(app_load_data) != "platform"]
+  android_load_data <<- app_load_data[app_load_data$platform == "Android", 
names(app_load_data) != "platform"]
+
+}
+
+read_api <- function() {
+  data <- polloi::read_dataset("search/search_api_aggregates.tsv")
+  data <- data[order(data$event_type),]
+  split_dataset <<- split(data, f = data$event_type)
+}
+
+read_failures <- function(date) {
+
+  data <- polloi::read_dataset("search/cirrus_query_aggregates.tsv")
+  interim_data <- reshape2::dcast(data, formula = date ~ variable, 
fun.aggregate = sum)
+  failure_dygraph_set <<- interim_data
+
+  interim_vector <- interim_data$`Zero Result Queries`/interim_data$`Search 
Queries`
+  output_vector <- (interim_vector[2:nrow(interim_data)] - 
interim_vector[1:(nrow(interim_data)-1)]) / 
interim_vector[1:(nrow(interim_data)-1)]
+
+  failure_roc_dygraph_set <<- data.frame(date = 
interim_data$date[2:nrow(interim_data)],
+                                         variable = "failure ROC",
+                                         daily_change = output_vector*100,
+                                         stringsAsFactors = FALSE)
+
+  interim_breakdown_data <- 
polloi::read_dataset("search/cirrus_query_breakdowns.tsv")
+  interim_breakdown_data$value <- interim_breakdown_data$value*100
+  failure_breakdown_dygraph_set <<- reshape2::dcast(interim_breakdown_data,
+                                                    formula = date ~ variable, 
fun.aggregate = sum)
+
+  suggestion_data <- 
polloi::read_dataset("search/cirrus_suggestion_breakdown.tsv")
+  suggestion_data$variable <- "Full-Text with Suggestions"
+  suggestion_data$value <- suggestion_data$value*100
+  suggestion_data <- rbind(suggestion_data,
+                           interim_breakdown_data[interim_breakdown_data$date 
%in% suggestion_data$date
+                                                  & 
interim_breakdown_data$variable == "Full-Text Search",])
+  suggestion_dygraph_set <<- reshape2::dcast(suggestion_data,
+                                             formula = date ~ variable, 
fun.aggregate = sum)
+
+}
+
+read_augmented_clickthrough <- function() {
+  data <- polloi::read_dataset("search/search_threshold_pass_rate.tsv")
+  temp <- polloi::safe_tail(desktop_dygraph_set, nrow(data))[, 
c('clickthroughs', 'Result pages opened')] +
+    polloi::safe_tail(mobile_dygraph_set, nrow(data))[, c('clickthroughs', 
'Result pages opened')] +
+    polloi::safe_tail(ios_dygraph_set, nrow(data))[, c('clickthroughs', 
'Result pages opened')] +
+    polloi::safe_tail(android_dygraph_set, nrow(data))[, c('clickthroughs', 
'Result pages opened')]
+  intermediary_dataset <- cbind(data, clickthrough_rate = 100 * 
temp$clickthroughs/temp$'Result pages opened')
+  colnames(intermediary_dataset) <- c("date", "threshold_passing_rate", 
"clickthrough_rate")
+  intermediary_dataset$threshold_passing_rate <- 100 * 
intermediary_dataset$threshold_passing_rate
+  augmented_clickthroughs <<- transform(intermediary_dataset, user_engagement 
= (threshold_passing_rate + clickthrough_rate)/2)
+}
+
+read_lethal_dose <- function() {
+  intermediary_dataset <- 
polloi::read_dataset("search/sample_page_visit_ld.tsv")
+  colnames(intermediary_dataset) <- c("date", "10%", "25%", "50%", "75%", 
"90%", "95%", "99%")
+  user_page_visit_dataset <<- intermediary_dataset
+}
+
 # Uses ggplot2 to create a pie chart in bar form. (Will look up actual name)
 gg_prop_bar <- function(data, cols) {
   # `cols` = list(`item`, `prop`, `label`)
@@ -27,3 +119,19 @@
                   y = "text_position",
                   x = 1))
 }
+
+date_range_switch <- function(date_range, daily = 2, weekly = 14, monthly = 
60, quarterly = 90) {
+  return(switch(date_range, daily = daily, weekly = weekly, monthly = monthly, 
quarterly = quarterly))
+}
+
+CustomAxisFormatter <- 'function (d, gran) {
+  var weekday = new Array(7);
+  weekday[0]=  "Sunday";
+  weekday[1] = "Monday";
+  weekday[2] = "Tuesday";
+  weekday[3] = "Wednesday";
+  weekday[4] = "Thursday";
+  weekday[5] = "Friday";
+  weekday[6] = "Saturday";
+  return weekday[d.getDay()] + " (" + (d.getMonth()+1) + "/" + d.getDate() + 
")";
+}'
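The `date_range_switch()` helper added above maps the summary-box date-range selector to a tail length; a quick usage sketch (values follow directly from its defaults):

```r
# Illustrative calls to date_range_switch() as defined in utils.R above.
date_range_switch("weekly")                  # default mapping: 14
date_range_switch("monthly", 1, 7, 30, 90)   # override used for the proportions plot: 30
```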

-- 
To view, visit https://gerrit.wikimedia.org/r/241115
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I808213c595925ae7ad3269403637b77483a4116e
Gerrit-PatchSet: 2
Gerrit-Project: wikimedia/discovery/rainbow
Gerrit-Branch: master
Gerrit-Owner: Bearloga <mpo...@wikimedia.org>
Gerrit-Reviewer: Bearloga <mpo...@wikimedia.org>
Gerrit-Reviewer: OliverKeyes <oke...@wikimedia.org>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
