Ottomata has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/180305

Change subject: Catch ValueError raised by hive.partition_datetime_from_path
......................................................................

Catch ValueError raised by hive.partition_datetime_from_path

We just saw a case where the data directory contained a subdirectory
whose name could not be parsed by dateutil.parser.parse. This change
catches the resulting ValueError, logs an error, and skips that path,
so the whole script no longer dies when that happens.

Change-Id: I4e53a25a0519826128982fe94427b2bba66fda6f
---
M bin/refinery-drop-webrequest-partitions
1 file changed, 13 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/analytics/refinery refs/changes/05/180305/1

diff --git a/bin/refinery-drop-webrequest-partitions b/bin/refinery-drop-webrequest-partitions
index ddb871a..685e204 100755
--- a/bin/refinery-drop-webrequest-partitions
+++ b/bin/refinery-drop-webrequest-partitions
@@ -119,12 +119,19 @@
     # Loop through all the partition directory paths for this table
     # and check if any of them are old enough for deletion.
     for partition_path in HdfsUtils.ls(partition_glob, include_children=False):
-        partition_datetime = hive.partition_datetime_from_path(
-            partition_path,
-            webrequest_date_regex
-        )
-        if partition_datetime < old_partition_datetime_threshold:
-            partition_paths_to_delete.append(partition_path)
+        try:
+            partition_datetime = hive.partition_datetime_from_path(
+                partition_path,
+                webrequest_date_regex
+            )
+            if partition_datetime < old_partition_datetime_threshold:
+                partition_paths_to_delete.append(partition_path)
+        except ValueError:
+            logging.error(
+                ' hive.partition_datetime_from_path could not parse date found in {0} using pattern {1}. Skipping.'
+                .format(partition_path, webrequest_date_regex.pattern)
+            )
+            continue
 
     # Drop any old Hive partitions
     if partition_specs_to_drop:
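
For readers who want to try the skip-on-ValueError pattern from this change outside of refinery, here is a minimal standalone sketch. It is not the refinery code: partition_datetime_from_path below is a hypothetical stand-in for hive.partition_datetime_from_path (whose implementation is not part of this diff), and the regex, the example paths, and the select_old_partitions helper are assumptions made only for illustration.

```python
import logging
import re
from datetime import datetime


def partition_datetime_from_path(path, regex):
    """Hypothetical stand-in for hive.partition_datetime_from_path.

    Assumption: the regex captures year, month, day and hour as groups.
    Raises ValueError when the path does not contain a parseable date.
    """
    match = regex.search(path)
    if match is None:
        raise ValueError('no date found in {0}'.format(path))
    # datetime() itself raises ValueError for out-of-range fields.
    return datetime(*(int(group) for group in match.groups()))


def select_old_partitions(paths, regex, threshold):
    """Return the paths older than threshold, skipping unparseable ones."""
    old_paths = []
    for path in paths:
        try:
            partition_datetime = partition_datetime_from_path(path, regex)
        except ValueError:
            # Log and move on instead of letting one bad path abort the run.
            logging.error(
                'Could not parse date found in %s using pattern %s. Skipping.',
                path, regex.pattern
            )
            continue
        if partition_datetime < threshold:
            old_paths.append(path)
    return old_paths


# Assumed partition layout, for illustration only.
date_regex = re.compile(r'year=(\d+)/month=(\d+)/day=(\d+)/hour=(\d+)')
paths = [
    '/data/webrequest/year=2014/month=1/day=1/hour=0',
    '/data/webrequest/_tmp',
    '/data/webrequest/year=2014/month=12/day=16/hour=5',
]
old_paths = select_old_partitions(paths, date_regex, datetime(2014, 6, 1))
```

In this sketch the try block wraps only the parsing call, so a ValueError raised anywhere else would still surface; the patch above wraps the comparison as well, which behaves the same for this loop.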

-- 
To view, visit https://gerrit.wikimedia.org/r/180305
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I4e53a25a0519826128982fe94427b2bba66fda6f
Gerrit-PatchSet: 1
Gerrit-Project: analytics/refinery
Gerrit-Branch: master
Gerrit-Owner: Ottomata <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
