Chelsyx has uploaded a new change for review. ( 
https://gerrit.wikimedia.org/r/381508 )

Change subject: Count the number of user session tokens by volume for mobile 
web search
......................................................................

Count the number of user session tokens by volume for mobile web search

Bug: T176811
Change-Id: I9ce01d5c6ffcce6ddb6e4fe35281d41c39f9f9d6
---
M modules/mobile_web/events.R
M tab_documentation/mobile_events.md
M ui.R
M utils.R
4 files changed, 45 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/rainbow 
refs/changes/08/381508/1

diff --git a/modules/mobile_web/events.R b/modules/mobile_web/events.R
index 6f326c6..3e0125a 100644
--- a/modules/mobile_web/events.R
+++ b/modules/mobile_web/events.R
@@ -1,6 +1,15 @@
+output$mobile_event_user_session <- renderValueBox(
+  valueBox(
+    value = mobile_session_mean["Total user sessions"],
+    subtitle = "User sessions per day*",
+    icon = icon("search"),
+    color = "green"
+  )
+)
+
 output$mobile_event_searches <- renderValueBox(
   valueBox(
-    value = mobile_dygraph_means["search sessions"],
+    value = mobile_dygraph_means["search start"],
     subtitle = "Search sessions per day*",
     icon = icon("search"),
     color = "green"
@@ -30,5 +39,13 @@
     polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_mobile_event)) %>%
     polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Mobile 
search events, by day") %>%
     dyRangeSelector %>%
-    dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom")
+    dyEvent(as.Date("2017-01-01"), "R (reportupdater)", labelLoc = "bottom") 
%>%
+    dyEvent(as.Date("2017-03-29"), "H (new header)", labelLoc = "bottom")
+})
+
+output$mobile_session_plot <- renderDygraph({
+  mobile_session %>%
+    polloi::smoother(smooth_level = 
polloi::smooth_switch(input$smoothing_global, input$smoothing_mobile_event)) %>%
+    polloi::make_dygraph(xlab = "Date", ylab = "Events", title = "Mobile user 
sessions, by volume") %>%
+    dyRangeSelector
 })
diff --git a/tab_documentation/mobile_events.md 
b/tab_documentation/mobile_events.md
index e6859b9..c8a029a 100644
--- a/tab_documentation/mobile_events.md
+++ b/tab_documentation/mobile_events.md
@@ -1,13 +1,15 @@
-Mobile search
+Mobile web search
 =======
 
-User actions that we track around search on the mobile website generally fall 
into three categories:
+User actions that we track around prefix search on the mobile website 
generally fall into three categories:
 
-1. The start of a user's search session;
-2. The presentation of the user with a results page, and;
-3. A user clicking through to an article in the results page.
+1. **search start (aka search session)**: An API request is made to retrieve 
search results whenever the user types enough characters to perform a search 
(3 or more). A search session is identified by searchSessionToken. For 
example, if a user types "Bara", then a new search session is started; if they 
then type "ck" (Barack), then another new search session is started;
+2. **Result pages opened**: The API request has finished and the results have 
been rendered;
+3. **clickthroughs**: The user clicks through to an article from the results 
page.
 
-These three things are tracked via the [EventLogging 'MobileWebSearch' 
schema](https://meta.wikimedia.org/wiki/Schema:MobileWebSearch), and stored to 
a database. The results are then aggregated and anonymised, and presented on 
this page. For performance/privacy reasons we randomly sample what we store, so 
the actual numbers are a vast understatement of how many user actions our 
servers receive - what's more interesting is how they change over time. In the 
case of Mobile Web search, this sampling rate is *going* to be **0.1%**: it's 
currently turned off entirely but should be enabled soon.
+When a user opens the search overlay, a **user session** starts. We use a 
randomly generated userSessionToken to identify this search funnel. A user 
session can have multiple search sessions. We split user sessions into 
"low-volume", "medium-volume" and "high-volume" sessions. A "high-volume" 
session is a user session whose number of search sessions is equal to or 
greater than the 90th percentile for the whole population on any particular 
day. A "low-volume" session is a user session whose number of search sessions 
is equal to or less than the 5th percentile. The rest are categorized as 
"medium-volume".
+
+We use the [EventLogging 'MobileWebSearch' 
schema](https://meta.wikimedia.org/wiki/Schema:MobileWebSearch) to track these 
activities and store them in a database. Currently the schema tracks prefix 
search only. The results are then aggregated and anonymised, and presented on 
this page. For performance/privacy reasons we randomly sample what we store, so 
the actual numbers are a vast understatement of how many user actions our 
servers receive - what's more interesting is how they change over time. In the 
case of Mobile Web search, this sampling rate is **0.1%**.
 
 \* This number represents the median of the last 90 days.
 
@@ -23,6 +25,7 @@
 
 * Between 5 May and 6 May 2015, approximately 40% of incoming EventLogging 
data was lost due to a wider EventLogging outage. You can read more about the 
outage 
[here](https://wikitech.wikimedia.org/wiki/Incident_documentation/20150506-EventLogging).
 * '__R__': on 2017-01-01 we started calculating all of Discovery's metrics 
using a new version of [our data retrieval and processing 
codebase](https://phabricator.wikimedia.org/diffusion/WDGO/) that we migrated 
to [Wikimedia Analytics](https://www.mediawiki.org/wiki/Analytics)' 
[Reportupdater 
infrastructure](https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater). 
See [T150915](https://phabricator.wikimedia.org/T150915) for more details.
+* '__H__': on 2017-03-29 we deployed the new mobile header treatment 
(including the search box), which may have resulted in a decrease in searches. 
See [T176464](https://phabricator.wikimedia.org/T176464) for more information.
 
 Questions, bug reports, and feature suggestions
 ------
diff --git a/ui.R b/ui.R
index d91732a..1fbe4de 100644
--- a/ui.R
+++ b/ui.R
@@ -189,11 +189,13 @@
                 includeHTML("./tab_documentation/paulscore_approx.html")),
         tabItem(tabName = "mobile_events",
                 fluidRow(
-                  valueBoxOutput("mobile_event_searches"),
-                  valueBoxOutput("mobile_event_resultsets"),
-                  valueBoxOutput("mobile_event_clickthroughs")),
+                  valueBoxOutput("mobile_event_user_session", width = 3),
+                  valueBoxOutput("mobile_event_searches", width = 3),
+                  valueBoxOutput("mobile_event_resultsets", width = 3),
+                  valueBoxOutput("mobile_event_clickthroughs", width = 3)),
                 polloi::smooth_select("smoothing_mobile_event"),
                 dygraphOutput("mobile_event_plot"),
+                dygraphOutput("mobile_session_plot"),
                 includeMarkdown("./tab_documentation/mobile_events.md")
         ),
         tabItem(tabName = "mobile_load",
diff --git a/utils.R b/utils.R
index 3ac5c5a..602a3ab 100644
--- a/utils.R
+++ b/utils.R
@@ -34,8 +34,18 @@
 read_web <- function() {
   mobile_dygraph_set <<- 
polloi::read_dataset("discovery/metrics/search/mobile_event_counts.tsv", 
col_types = "Dci") %>%
     dplyr::filter(!is.na(action), !is.na(events)) %>%
-    tidyr::spread(action, events, fill = 0)
-  mobile_dygraph_means <<- round(apply(mobile_dygraph_set[, 2:4], 2, median))
+    tidyr::spread(action, events, fill = 0) %>%
+    dplyr::rename(`search start` = `search sessions`)
+  mobile_dygraph_means <<- 
round(apply(mobile_dygraph_set[(nrow(mobile_dygraph_set) - 
89):nrow(mobile_dygraph_set), 2:4], 2, median))
+  mobile_session <<- 
polloi::read_dataset("discovery/metrics/search/mobile_session_counts.tsv", 
col_types = "Diiiiiii") %>%
+    dplyr::select(date, user_sessions, high_volume, medium_volume, low_volume) 
%>%
+    dplyr::rename(
+      `Total user sessions` = user_sessions,
+      `High volume` = high_volume,
+      `Medium volume` = medium_volume,
+      `Low volume` = low_volume
+    )
+  mobile_session_mean <<- round(apply(mobile_session[(nrow(mobile_session) - 
89):nrow(mobile_session), -1], 2, median))
   mobile_load_data <<- 
polloi::read_dataset("discovery/metrics/search/mobile_load_times.tsv", 
col_types = "Dddd") %>%
     dplyr::filter(!is.na(Median))
 }
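
The documentation added in this change splits user sessions into low-, medium- and high-volume groups using the 5th and 90th percentiles of search sessions per user session on a given day. The sketch below illustrates that rule in base R for a single day's data. It is not the code that produces mobile_session_counts.tsv (that pipeline is not part of this change). The function name `classify_session_volume`, its parameters, and the choice to check the high cutoff first when the two cutoffs coincide are assumptions made for illustration.

```r
# Hypothetical helper (not part of this change): label each user session
# from one day by its number of search sessions.
# Assumptions: percentiles come from stats::quantile with its default
# interpolation, and the "high" rule wins if both cutoffs are equal.
classify_session_volume <- function(search_sessions, low_p = 0.05, high_p = 0.90) {
  cutoffs <- stats::quantile(search_sessions, probs = c(low_p, high_p), names = FALSE)
  ifelse(
    search_sessions >= cutoffs[2], "high",
    ifelse(search_sessions <= cutoffs[1], "low", "medium")
  )
}

# Made-up example: search sessions per user session for a single day.
searches_per_user_session <- c(1, 1, 2, 2, 2, 3, 3, 4, 5, 12)
volume <- classify_session_volume(searches_per_user_session)

# Daily counts per group, analogous to the high_volume / medium_volume /
# low_volume columns read in utils.R.
print(table(volume))
```

To cover several days, the same function could be applied to each date's sessions separately (for example with `split()` or `ave()`), since the percentiles are defined per day.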

-- 
To view, visit https://gerrit.wikimedia.org/r/381508
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9ce01d5c6ffcce6ddb6e4fe35281d41c39f9f9d6
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/rainbow
Gerrit-Branch: develop
Gerrit-Owner: Chelsyx <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
