[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Add approximate PaulScore daily tracking

2016-09-12 Thread Bearloga (Code Review)
Bearloga has submitted this change and it was merged.

Change subject: Add approximate PaulScore daily tracking
..


Add approximate PaulScore daily tracking

The R script uses some shortcuts to produce a SQL query that computes
multiple PaulScores (0.1, ..., 0.9) using Erik B.'s original query as
a template. Here is the query it runs:

```
SELECT
  date, event_source,
  ROUND(SUM(pow_1)/COUNT(1), 3) AS pow_1,
  ROUND(SUM(pow_2)/COUNT(1), 3) AS pow_2,
  ROUND(SUM(pow_3)/COUNT(1), 3) AS pow_3,
  ROUND(SUM(pow_4)/COUNT(1), 3) AS pow_4,
  ROUND(SUM(pow_5)/COUNT(1), 3) AS pow_5,
  ROUND(SUM(pow_6)/COUNT(1), 3) AS pow_6,
  ROUND(SUM(pow_7)/COUNT(1), 3) AS pow_7,
  ROUND(SUM(pow_8)/COUNT(1), 3) AS pow_8,
  ROUND(SUM(pow_9)/COUNT(1), 3) AS pow_9
FROM (
  SELECT
LEFT(timestamp, 8) AS date,
event_source,
event_searchSessionId,
SUM(IF(event_action = 'click', POW(0.1, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_1,
SUM(IF(event_action = 'click', POW(0.2, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_2,
SUM(IF(event_action = 'click', POW(0.3, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_3,
SUM(IF(event_action = 'click', POW(0.4, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_4,
SUM(IF(event_action = 'click', POW(0.5, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_5,
SUM(IF(event_action = 'click', POW(0.6, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_6,
SUM(IF(event_action = 'click', POW(0.7, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_7,
SUM(IF(event_action = 'click', POW(0.8, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_8,
SUM(IF(event_action = 'click', POW(0.9, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_9
  FROM TestSearchSatisfaction2_15700292
  WHERE
LEFT(timestamp, 8) = '20160908'
AND event_action IN ('searchResultPage', 'click')
AND (event_subTest IS NULL OR event_subTest IN ('null','baseline'))
  GROUP BY date, event_source, event_searchSessionId
) AS pows
GROUP BY date, event_source;
```

Bug: T144424
Change-Id: Ibc7b8503b6be4b8c3cfeeba2cc757a4f184f5a14
---
A search/paulscore.R
1 file changed, 36 insertions(+), 0 deletions(-)

Approvals:
  Chelsyx: Verified; Looks good to me, approved



diff --git a/search/paulscore.R b/search/paulscore.R
new file mode 100644
index 000..553512c
--- /dev/null
+++ b/search/paulscore.R
@@ -0,0 +1,36 @@
+# Per-file config:
+base_path <- paste0(write_root, "search/")
+
+main <- function(date = NULL) {
+  
+  # Ensure we have a date and deconstruct it into a MW-friendly format
+  if (is.null(date)) {
+date <- Sys.Date() - 1
+  }
+  date <- gsub(x = date, pattern = "-", replacement = "")
+  
+  query <- paste0("SELECT
+  date, event_source,
+  ", paste0("ROUND(SUM(pow_", 1:9,")/COUNT(1), 3) AS pow_", 1:9, collapse = 
",\n  "), "
+FROM (
+  SELECT
+LEFT(timestamp, 8) AS date,
+event_source,
+event_searchSessionId,
+", paste0("SUM(IF(event_action = 'click', POW(0.", 1:9, ", 
event_position), 0)) / SUM(IF(event_action = 'searchResultPage', 1, 0)) AS 
pow_", 1:9, collapse = ",\n"), "
+  FROM TestSearchSatisfaction2_15700292
+  WHERE
+LEFT(timestamp, 8) = '", date, "'
+AND event_action IN ('searchResultPage', 'click')
+AND (event_subTest IS NULL OR event_subTest IN ('null','baseline'))
+  GROUP BY date, event_source, event_searchSessionId
+) AS pows
+GROUP BY date, event_source;") # cat(query) if you want to copy and paste into 
MySQL CLI
+  # See https://phabricator.wikimedia.org/T144424 for more details.
+  data <- wmf::mysql_read(query, "log")
+  
+  # Report
+  wmf::write_conditional(paul_scores, file.path(base_path, 
"paulscore_approximations.tsv"))
+  
+  return(invisible())
+}

-- 
To view, visit https://gerrit.wikimedia.org/r/309702
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ibc7b8503b6be4b8c3cfeeba2cc757a4f184f5a14
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Bearloga 
Gerrit-Reviewer: Bearloga 
Gerrit-Reviewer: Chelsyx 
Gerrit-Reviewer: EBernhardson 
Gerrit-Reviewer: Tjones 

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits


[MediaWiki-commits] [Gerrit] wikimedia...golden[master]: Add approximate PaulScore daily tracking

2016-09-09 Thread Bearloga (Code Review)
Bearloga has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/309702

Change subject: Add approximate PaulScore daily tracking
..

Add approximate PaulScore daily tracking

The R script uses some shortcuts to produce a SQL query that computes
multiple PaulScores (0.1, ..., 0.9) using Erik B.'s original query as
a template. Here is the query it runs:

```
SELECT
  date, event_source,
  ROUND(SUM(pow_1)/COUNT(1), 3) AS pow_1,
  ROUND(SUM(pow_2)/COUNT(1), 3) AS pow_2,
  ROUND(SUM(pow_3)/COUNT(1), 3) AS pow_3,
  ROUND(SUM(pow_4)/COUNT(1), 3) AS pow_4,
  ROUND(SUM(pow_5)/COUNT(1), 3) AS pow_5,
  ROUND(SUM(pow_6)/COUNT(1), 3) AS pow_6,
  ROUND(SUM(pow_7)/COUNT(1), 3) AS pow_7,
  ROUND(SUM(pow_8)/COUNT(1), 3) AS pow_8,
  ROUND(SUM(pow_9)/COUNT(1), 3) AS pow_9
FROM (
  SELECT
LEFT(timestamp, 8) AS date,
event_source,
event_searchSessionId,
SUM(IF(event_action = 'click', POW(0.1, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_1,
SUM(IF(event_action = 'click', POW(0.2, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_2,
SUM(IF(event_action = 'click', POW(0.3, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_3,
SUM(IF(event_action = 'click', POW(0.4, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_4,
SUM(IF(event_action = 'click', POW(0.5, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_5,
SUM(IF(event_action = 'click', POW(0.6, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_6,
SUM(IF(event_action = 'click', POW(0.7, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_7,
SUM(IF(event_action = 'click', POW(0.8, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_8,
SUM(IF(event_action = 'click', POW(0.9, event_position), 0)) / 
SUM(IF(event_action = 'searchResultPage', 1, 0)) AS pow_9
  FROM TestSearchSatisfaction2_15700292
  WHERE
LEFT(timestamp, 8) = '20160908'
AND event_action IN ('searchResultPage', 'click')
AND (event_subTest IS NULL OR event_subTest IN ('null','baseline'))
  GROUP BY date, event_source, event_searchSessionId
) AS pows
GROUP BY date, event_source;
```

Bug: T144424
Change-Id: Ibc7b8503b6be4b8c3cfeeba2cc757a4f184f5a14
---
A search/paulscore.R
1 file changed, 36 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/wikimedia/discovery/golden 
refs/changes/02/309702/1

diff --git a/search/paulscore.R b/search/paulscore.R
new file mode 100644
index 000..553512c
--- /dev/null
+++ b/search/paulscore.R
@@ -0,0 +1,36 @@
+# Per-file config:
+base_path <- paste0(write_root, "search/")
+
+main <- function(date = NULL) {
+  
+  # Ensure we have a date and deconstruct it into a MW-friendly format
+  if (is.null(date)) {
+date <- Sys.Date() - 1
+  }
+  date <- gsub(x = date, pattern = "-", replacement = "")
+  
+  query <- paste0("SELECT
+  date, event_source,
+  ", paste0("ROUND(SUM(pow_", 1:9,")/COUNT(1), 3) AS pow_", 1:9, collapse = 
",\n  "), "
+FROM (
+  SELECT
+LEFT(timestamp, 8) AS date,
+event_source,
+event_searchSessionId,
+", paste0("SUM(IF(event_action = 'click', POW(0.", 1:9, ", 
event_position), 0)) / SUM(IF(event_action = 'searchResultPage', 1, 0)) AS 
pow_", 1:9, collapse = ",\n"), "
+  FROM TestSearchSatisfaction2_15700292
+  WHERE
+LEFT(timestamp, 8) = '", date, "'
+AND event_action IN ('searchResultPage', 'click')
+AND (event_subTest IS NULL OR event_subTest IN ('null','baseline'))
+  GROUP BY date, event_source, event_searchSessionId
+) AS pows
+GROUP BY date, event_source;") # cat(query) if you want to copy and paste into 
MySQL CLI
+  # See https://phabricator.wikimedia.org/T144424 for more details.
+  data <- wmf::mysql_read(query, "log")
+  
+  # Report
+  wmf::write_conditional(paul_scores, file.path(base_path, 
"paulscore_approximations.tsv"))
+  
+  return(invisible())
+}

-- 
To view, visit https://gerrit.wikimedia.org/r/309702
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibc7b8503b6be4b8c3cfeeba2cc757a4f184f5a14
Gerrit-PatchSet: 1
Gerrit-Project: wikimedia/discovery/golden
Gerrit-Branch: master
Gerrit-Owner: Bearloga 

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits