Joal has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/362160 )
Change subject: Add project_family to webrequest normalized_host
......................................................................
Add project_family to webrequest normalized_host
In preparation for removing project_class from the normalized_host
field in webrequest, provide both project_class and project_family
for now. Both fields hold the same value.
Also, bump the webrequest record version and the refinery jar version.
Bug: T168874
Change-Id: Iaefae90d0ca3815836873e3af9294eefcacb223f
---
M hive/webrequest/create_webrequest_table.hql
M oozie/webrequest/load/bundle.properties
2 files changed, 3 insertions(+), 3 deletions(-)
Approvals:
Joal: Verified; Looks good to me, approved
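The transition described in the commit message (both fields present, same value) suggests that downstream readers can prefer project_family and fall back to project_class. The following is only a minimal Python sketch of that idea, not code from this change: the helper name project_family_of and the sample rows are hypothetical, and it assumes that records written before this change expose project_family as missing or null.

```python
from typing import Any, Mapping, Optional


def project_family_of(normalized_host: Mapping[str, Any]) -> Optional[str]:
    """Return the project family for one normalized_host record.

    Prefers the new project_family field and falls back to the legacy
    project_class field, which carries the same value during the transition.
    """
    family = normalized_host.get("project_family")
    if family is None:
        # Assumption: older records only carry project_class.
        family = normalized_host.get("project_class")
    return family


# Hypothetical rows shaped like the normalized_host struct in the diff.
old_row = {"project_class": "wikipedia", "project": "en",
           "qualifiers": ["m"], "tld": "org"}
new_row = {"project_class": "wikidata", "project_family": "wikidata",
           "project": "www", "qualifiers": [], "tld": "org"}

print(project_family_of(old_row))
print(project_family_of(new_row))
```

Once project_class is actually removed, the fallback branch would only matter for historical records and could be dropped by consumers that no longer read them.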
diff --git a/hive/webrequest/create_webrequest_table.hql b/hive/webrequest/create_webrequest_table.hql
index 0973eca..25d8ad5 100644
--- a/hive/webrequest/create_webrequest_table.hql
+++ b/hive/webrequest/create_webrequest_table.hql
@@ -51,7 +51,7 @@
`agent_type` string COMMENT 'Categorise the agent making the webrequest as either user or spider (automatas to be added).',
`is_zero` boolean COMMENT 'Indicates if the webrequest is accessed through a zero provider',
`referer_class` string COMMENT 'Indicates if a referer is internal, external or unknown.',
- `normalized_host` struct<project_class: string, project:string, qualifiers: array<string>, tld: String> COMMENT 'struct containing project_class (such as wikipedia or wikidata for instance), project (such as en or commons), qualifiers (a list of in-between values, such as m and/or zero) and tld (org most often)',
+ `normalized_host` struct<project_class: string, project_family: string, project:string, qualifiers: array<string>, tld: String> COMMENT 'struct containing project_family (such as wikipedia or wikidata for instance), project (such as en or commons), qualifiers (a list of in-between values, such as m and/or zero) and tld (org most often)',
`pageview_info` map<string, string> COMMENT 'map containing project, language_variant and page_title values only when is_pageview = TRUE.',
`page_id` bigint COMMENT 'MediaWiki page_id for this page title. For redirects this could be the page_id of the redirect or the page_id of the target. This may not always be set, even if the page is actually a pageview.',
`namespace_id` int COMMENT 'MediaWiki namespace_id for this page title. This may not always be set, even if the page is actually a pageview.',
diff --git a/oozie/webrequest/load/bundle.properties b/oozie/webrequest/load/bundle.properties
index 3593204..826581d 100644
--- a/oozie/webrequest/load/bundle.properties
+++ b/oozie/webrequest/load/bundle.properties
@@ -57,10 +57,10 @@
webrequest_table = wmf.webrequest
# Version of Hive UDF jar to import
-refinery_jar_version = 0.0.49
+refinery_jar_version = 0.0.50
# Record version to keep track of changes
-record_version = 0.0.18
+record_version = 0.0.19
# Hive table name.
statistics_table = wmf_raw.webrequest_sequence_stats
--
To view, visit https://gerrit.wikimedia.org/r/362160
Gerrit-MessageType: merged
Gerrit-Change-Id: Iaefae90d0ca3815836873e3af9294eefcacb223f
Gerrit-PatchSet: 3
Gerrit-Project: analytics/refinery
Gerrit-Branch: master
Gerrit-Owner: Joal <[email protected]>
Gerrit-Reviewer: Joal <[email protected]>
Gerrit-Reviewer: Nuria <[email protected]>
Gerrit-Reviewer: Ottomata <[email protected]>