Ottomata has submitted this change and it was merged.

Change subject: Put refined datasets definition in separate file
......................................................................


Put refined datasets definition in separate file

The addition of the variable $webrequest_refined_data_directory was causing
existing oozie properties to fail, since there is no default value
for this directory.
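
For illustration only, a minimal sketch of how a coordinator that needs the
refined data might include the new file, so that only such coordinators have
to define webrequest_refined_data_directory. The coordinator name and the
properties refined_datasets_file, stop_time and workflow_file are
hypothetical and do not appear in this change.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical coordinator; names other than the dataset and
     start_time are assumptions made for this sketch. -->
<coordinator-app name="example_refined_consumer"
                 frequency="${coord:hours(1)}"
                 start="${start_time}"
                 end="${stop_time}"
                 timezone="Universal"
                 xmlns="uri:oozie:coordinator:0.4">

    <datasets>
        <!-- Assumed to point at oozie/webrequest/datasets_refined.xml.
             Jobs that include only datasets.xml no longer need
             webrequest_refined_data_directory to be set. -->
        <include>${refined_datasets_file}</include>
    </datasets>

    <input-events>
        <data-in name="refined_input" dataset="webrequest_mobile_refined">
            <instance>${coord:current(0)}</instance>
        </data-in>
    </input-events>

    <action>
        <workflow>
            <app-path>${workflow_file}</app-path>
        </workflow>
    </action>
</coordinator-app>
```

A job submitted with such a coordinator would have to set
webrequest_refined_data_directory (for example /wmf/data/wmf/webrequest) in
its properties, while jobs that use only the raw datasets would not.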

Once the refined dataset is considered stable, I'd like to refactor this a
little.  In particular, the 'refined' dataset should be the main one,
and the raw ones should be referred to as 'raw'.

Change-Id: I9337490755355a97873509ff53e222a0f0db80c2
---
M oozie/webrequest/datasets.xml
A oozie/webrequest/datasets_refined.xml
2 files changed, 37 insertions(+), 25 deletions(-)

Approvals:
  Ottomata: Verified; Looks good to me, approved
  Nuria: Looks good to me, but someone else must approve



diff --git a/oozie/webrequest/datasets.xml b/oozie/webrequest/datasets.xml
index 5d12aff..7056ac5 100644
--- a/oozie/webrequest/datasets.xml
+++ b/oozie/webrequest/datasets.xml
@@ -7,8 +7,6 @@
                         Example: 2014-04-01T00:00Z
     ${webrequest_data_directory} - Path to directory where data is time bucketed.
                         Example: /wmf/data/raw/webrequest
-    ${webrequest_refined_data_directory} - Path to directory where refined data is time bucketed.
-                        Example: /wmf/data/wmf/webrequest
 -->
 
 <datasets>
@@ -81,29 +79,6 @@
              timezone="Universal">
         <uri-template>${webrequest_data_directory}/webrequest_upload/hourly/${YEAR}/${MONTH}/${DAY}/${HOUR}</uri-template>
         <done-flag>_SUCCESS</done-flag>
-    </dataset>
-
-    <!--
-    The webrequest_*_refined datasets contain the same data as the
-    above two 'raw' datasets, except that they use a more efficient
-    storage format, and contain extra information.
-
-    This dataset does not yet include upload or bits.
-
-    TODO: I would like to eventually name this data set 'webrequest_mobile',
-          etc.  and rename the above dataset to webrequest_mobile_raw, etc.
-    -->
-    <dataset name="webrequest_mobile_refined"
-             frequency="${coord:hours(1)}"
-             initial-instance="${start_time}"
-             timezone="Universal">
-        <uri-template>${webrequest_refined_data_directory}/webrequest_source=mobile/year=${YEAR}/month=${MONTH}/day=${DAY}/hour=${HOUR}</uri-template>
-    </dataset>
-    <dataset name="webrequest_text_refined"
-             frequency="${coord:hours(1)}"
-             initial-instance="${start_time}"
-             timezone="Universal">
-        <uri-template>${webrequest_refined_data_directory}/webrequest_source=upload/year=${YEAR}/month=${MONTH}/day=${DAY}/hour=${HOUR}</uri-template>
     </dataset>
 
 </datasets>
diff --git a/oozie/webrequest/datasets_refined.xml b/oozie/webrequest/datasets_refined.xml
new file mode 100644
index 0000000..df9b42a
--- /dev/null
+++ b/oozie/webrequest/datasets_refined.xml
@@ -0,0 +1,37 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+Defines reusable datasets for refined webrequest data.
+Use this dataset in your coordinator.xml files by setting:
+
+    ${start_time}     - the initial instance of your data.
+                        Example: 2014-04-01T00:00Z
+    ${webrequest_refined_data_directory} - Path to directory where refined data is time bucketed.
+                        Example: /wmf/data/wmf/webrequest
+-->
+
+<datasets>
+
+    <!--
+    The webrequest_*_refined datasets contain the same data as the
+    above two 'raw' datasets, except that they use a more efficient
+    storage format, and contain extra information.
+
+    This dataset does not yet include upload or bits.
+
+    TODO: I would like to eventually name this data set 'webrequest_mobile',
+          etc.  and rename the datasets.xml datasets to webrequest_mobile_raw, etc.
+    -->
+    <dataset name="webrequest_mobile_refined"
+             frequency="${coord:hours(1)}"
+             initial-instance="${start_time}"
+             timezone="Universal">
+        <uri-template>${webrequest_refined_data_directory}/webrequest_source=mobile/year=${YEAR}/month=${MONTH}/day=${DAY}/hour=${HOUR}</uri-template>
+    </dataset>
+    <dataset name="webrequest_text_refined"
+             frequency="${coord:hours(1)}"
+             initial-instance="${start_time}"
+             timezone="Universal">
+        <uri-template>${webrequest_refined_data_directory}/webrequest_source=upload/year=${YEAR}/month=${MONTH}/day=${DAY}/hour=${HOUR}</uri-template>
+    </dataset>
+
+</datasets>

-- 
To view, visit https://gerrit.wikimedia.org/r/183988
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I9337490755355a97873509ff53e222a0f0db80c2
Gerrit-PatchSet: 1
Gerrit-Project: analytics/refinery
Gerrit-Branch: master
Gerrit-Owner: Ottomata <[email protected]>
Gerrit-Reviewer: Nuria <[email protected]>
Gerrit-Reviewer: Ottomata <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
