[MediaWiki-commits] [Gerrit] Geocode glam_nara files - change (analytics/refinery)

2015-02-02 Thread Ottomata (Code Review)
Ottomata has submitted this change and it was merged.

Change subject: Geocode glam_nara files
..


Geocode glam_nara files

Geocoding happens directly on the client IP without resolving
X-Forwarded-For. While that is conceptually wrong, it is what the
TSVs on the udp2log pipeline do and always did. Since the objective
at this point is switching away from udp2log, fixing the conceptual
issues with the glam_nara TSVs is left for the future.

Change-Id: I54f5db3f61291b0d455e44fd20de090b40c1a3ef
---
M oozie/webrequest/legacy_tsvs/bundle.properties
M oozie/webrequest/legacy_tsvs/coordinator_misc_mobile_text.xml
M oozie/webrequest/legacy_tsvs/coordinator_mobile.xml
M oozie/webrequest/legacy_tsvs/coordinator_mobile_text.xml
M oozie/webrequest/legacy_tsvs/coordinator_mobile_text_upload.xml
M oozie/webrequest/legacy_tsvs/generate_glam_nara_tsv.hql
M oozie/webrequest/legacy_tsvs/workflow.xml
7 files changed, 57 insertions(+), 9 deletions(-)

Approvals:
  Ottomata: Verified; Looks good to me, approved
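
The X-Forwarded-For caveat in the commit message can be illustrated with a
minimal Python sketch. It is not code from this change or from refinery.
The function names, the resolve_xff flag, and the TRUSTED_PROXIES list are
hypothetical and exist only for illustration. With resolve_xff=False the
sketch mirrors what the commit message describes (geocode the client IP as
logged). With resolve_xff=True it shows one common way the "left for the
future" fix could look, under the assumption that only known proxies are
trusted to set the header.

```python
import ipaddress

# Hypothetical trusted proxy networks; NOT taken from the change.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def is_trusted_proxy(ip: str) -> bool:
    """Return True if ip lies inside one of the assumed trusted networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in TRUSTED_PROXIES)


def ip_for_geocoding(client_ip: str, x_forwarded_for: str, resolve_xff: bool) -> str:
    """Pick the IP address that would be handed to the geocoder.

    resolve_xff=False: use the client IP as logged (the legacy behavior
    described in the commit message).
    resolve_xff=True: if the client is a trusted proxy, walk the
    X-Forwarded-For chain from right to left and return the first hop
    that is not itself a trusted proxy.
    """
    if not resolve_xff or not x_forwarded_for or x_forwarded_for == "-":
        return client_ip
    if not is_trusted_proxy(client_ip):
        # An untrusted client can put anything into the header; ignore it.
        return client_ip
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):
        try:
            if not is_trusted_proxy(hop):
                return hop
        except ValueError:
            break  # malformed entry: stop and fall back to the client IP
    return client_ip


if __name__ == "__main__":
    xff = "203.0.113.7, 10.9.9.9"
    print(ip_for_geocoding("10.1.2.3", xff, resolve_xff=False))
    print(ip_for_geocoding("10.1.2.3", xff, resolve_xff=True))
```

When the request arrives through a proxy, the two modes geocode different
addresses, which is the conceptual issue the commit message defers.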



diff --git a/oozie/webrequest/legacy_tsvs/bundle.properties b/oozie/webrequest/legacy_tsvs/bundle.properties
index 900fbd3..85eef77 100644
--- a/oozie/webrequest/legacy_tsvs/bundle.properties
+++ b/oozie/webrequest/legacy_tsvs/bundle.properties
@@ -12,9 +12,20 @@
 job_tracker = resourcemanager.analytics.eqiad.wmnet:8032
 queue_name  = default
 
+# Base path in HDFS to refinery.
+# When submitting this job for production, you should
+# override this to point directly at a deployed
+# directory name, and not the 'symbolic' 'current' directory.
+# E.g.  /wmf/refinery/2015-01-05T17.59.18Z--7bb7f07
+refinery_directory  = ${name_node}/wmf/refinery/current
+
+# HDFS path to artifacts that will be used by this job.
+# E.g. refinery-hive.jar should exist here.
+artifacts_directory = ${refinery_directory}/artifacts
+
 # Base path in HDFS to oozie files.
 # Other files will be used relative to this path.
-oozie_directory = ${name_node}/wmf/refinery/current/oozie
+oozie_directory = ${refinery_directory}/oozie
 
 # HDFS paths to the coordinators to run.
 # All of them are essentially the same coordinator and differ only in the
diff --git a/oozie/webrequest/legacy_tsvs/coordinator_misc_mobile_text.xml b/oozie/webrequest/legacy_tsvs/coordinator_misc_mobile_text.xml
index 56fbb84..9f17521 100644
--- a/oozie/webrequest/legacy_tsvs/coordinator_misc_mobile_text.xml
+++ b/oozie/webrequest/legacy_tsvs/coordinator_misc_mobile_text.xml
@@ -21,6 +21,7 @@
 <property><name>webrequest_datasets_file</name></property>
 <property><name>webrequest_data_directory</name></property>
 <property><name>hive_site_xml</name></property>
+<property><name>artifacts_directory</name></property>
 <property><name>workflow_file</name></property>
 <property><name>webrequest_table</name></property>
 <property><name>mark_directory_done_workflow_file</name></property>
@@ -101,6 +102,10 @@
 <value>${hive_site_xml}</value>
 </property>
 <property>
+<name>artifacts_directory</name>
+<value>${artifacts_directory}</value>
+</property>
+<property>
 <name>webrequest_table</name>
 <value>${webrequest_table}</value>
 </property>
diff --git a/oozie/webrequest/legacy_tsvs/coordinator_mobile.xml b/oozie/webrequest/legacy_tsvs/coordinator_mobile.xml
index eb64da3..c5199aa 100644
--- a/oozie/webrequest/legacy_tsvs/coordinator_mobile.xml
+++ b/oozie/webrequest/legacy_tsvs/coordinator_mobile.xml
@@ -21,6 +21,7 @@
 <property><name>webrequest_datasets_file</name></property>
 <property><name>webrequest_data_directory</name></property>
 <property><name>hive_site_xml</name></property>
+<property><name>artifacts_directory</name></property>
 <property><name>workflow_file</name></property>
 <property><name>webrequest_table</name></property>
 <property><name>mark_directory_done_workflow_file</name></property>
@@ -91,6 +92,10 @@
 <value>${hive_site_xml}</value>
 </property>
 <property>
+<name>artifacts_directory</name>
+<value>${artifacts_directory}</value>
+</property>
+<property>
 <name>webrequest_table</name>
 <value>${webrequest_table}</value>
 </property>
diff --git a/oozie/webrequest/legacy_tsvs/coordinator_mobile_text.xml b/oozie/webrequest/legacy_tsvs/coordinator_mobile_text.xml
index 942ec66..5abf779 100644
--- a/oozie/webrequest/legacy_tsvs/coordinator_mobile_text.xml
+++ b/oozie/webrequest/legacy_tsvs/coordinator_mobile_text.xml
@@ -21,6 +21,7 @@
 <property><name>webrequest_datasets_file</name></property>
 <property><name>webrequest_data_directory</name></property>
 

[MediaWiki-commits] [Gerrit] Geocode glam_nara files - change (analytics/refinery)

2015-02-01 Thread QChris (Code Review)
Hello Ottomata,

I'd like you to do a code review.  Please visit

https://gerrit.wikimedia.org/r/188010

to review the following change.

Change subject: Geocode glam_nara files
..

Geocode glam_nara files

Geocoding happens directly on the client IP without resolving
X-Forwarded-For. While that is conceptually wrong, it is what the
TSVs on the udp2log pipeline do and always did. Since the objective
at this point is switching away from udp2log, fixing the conceptual
issues with the glam_nara TSVs is left for the future.

Change-Id: I54f5db3f61291b0d455e44fd20de090b40c1a3ef
---
M oozie/webrequest/legacy_tsvs/bundle.properties
M oozie/webrequest/legacy_tsvs/coordinator_misc_mobile_text.xml
M oozie/webrequest/legacy_tsvs/coordinator_mobile.xml
M oozie/webrequest/legacy_tsvs/coordinator_mobile_text.xml
M oozie/webrequest/legacy_tsvs/coordinator_mobile_text_upload.xml
M oozie/webrequest/legacy_tsvs/generate_glam_nara_tsv.hql
M oozie/webrequest/legacy_tsvs/workflow.xml
7 files changed, 57 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/analytics/refinery refs/changes/10/188010/1

diff --git a/oozie/webrequest/legacy_tsvs/bundle.properties b/oozie/webrequest/legacy_tsvs/bundle.properties
index 900fbd3..85eef77 100644
--- a/oozie/webrequest/legacy_tsvs/bundle.properties
+++ b/oozie/webrequest/legacy_tsvs/bundle.properties
@@ -12,9 +12,20 @@
 job_tracker = resourcemanager.analytics.eqiad.wmnet:8032
 queue_name  = default
 
+# Base path in HDFS to refinery.
+# When submitting this job for production, you should
+# override this to point directly at a deployed
+# directory name, and not the 'symbolic' 'current' directory.
+# E.g.  /wmf/refinery/2015-01-05T17.59.18Z--7bb7f07
+refinery_directory  = ${name_node}/wmf/refinery/current
+
+# HDFS path to artifacts that will be used by this job.
+# E.g. refinery-hive.jar should exist here.
+artifacts_directory = ${refinery_directory}/artifacts
+
 # Base path in HDFS to oozie files.
 # Other files will be used relative to this path.
-oozie_directory = ${name_node}/wmf/refinery/current/oozie
+oozie_directory = ${refinery_directory}/oozie
 
 # HDFS paths to the coordinators to run.
 # All of them are essentially the same coordinator and differ only in the
diff --git a/oozie/webrequest/legacy_tsvs/coordinator_misc_mobile_text.xml b/oozie/webrequest/legacy_tsvs/coordinator_misc_mobile_text.xml
index 56fbb84..9f17521 100644
--- a/oozie/webrequest/legacy_tsvs/coordinator_misc_mobile_text.xml
+++ b/oozie/webrequest/legacy_tsvs/coordinator_misc_mobile_text.xml
@@ -21,6 +21,7 @@
 <property><name>webrequest_datasets_file</name></property>
 <property><name>webrequest_data_directory</name></property>
 <property><name>hive_site_xml</name></property>
+<property><name>artifacts_directory</name></property>
 <property><name>workflow_file</name></property>
 <property><name>webrequest_table</name></property>
 <property><name>mark_directory_done_workflow_file</name></property>
@@ -101,6 +102,10 @@
 <value>${hive_site_xml}</value>
 </property>
 <property>
+<name>artifacts_directory</name>
+<value>${artifacts_directory}</value>
+</property>
+<property>
 <name>webrequest_table</name>
 <value>${webrequest_table}</value>
 </property>
diff --git a/oozie/webrequest/legacy_tsvs/coordinator_mobile.xml b/oozie/webrequest/legacy_tsvs/coordinator_mobile.xml
index eb64da3..c5199aa 100644
--- a/oozie/webrequest/legacy_tsvs/coordinator_mobile.xml
+++ b/oozie/webrequest/legacy_tsvs/coordinator_mobile.xml
@@ -21,6 +21,7 @@
 <property><name>webrequest_datasets_file</name></property>
 <property><name>webrequest_data_directory</name></property>
 <property><name>hive_site_xml</name></property>
+<property><name>artifacts_directory</name></property>
 <property><name>workflow_file</name></property>
 <property><name>webrequest_table</name></property>
 <property><name>mark_directory_done_workflow_file</name></property>
@@ -91,6 +92,10 @@
 <value>${hive_site_xml}</value>
 </property>
 <property>
+<name>artifacts_directory</name>
+<value>${artifacts_directory}</value>
+</property>
+<property>
 <name>webrequest_table</name>
 <value>${webrequest_table}</value>
 </property>
diff --git a/oozie/webrequest/legacy_tsvs/coordinator_mobile_text.xml b/oozie/webrequest/legacy_tsvs/coordinator_mobile_text.xml
index 942ec66..5abf779 100644
--- a/oozie/webrequest/legacy_tsvs/coordinator_mobile_text.xml
+++ b/oozie/webrequest/legacy_tsvs/coordinator_mobile_text.xml
@@ -21,6 +21,7 @@