Ottomata has uploaded a new change for review.
https://gerrit.wikimedia.org/r/180305
Change subject: Catch ValueError raised by hive.partition_datetime_from_path
......................................................................
Catch ValueError raised by hive.partition_datetime_from_path
We just saw a case where the data directory contained a subdirectory
whose name could not be parsed by dateutil.parser.parse. This change
catches the resulting ValueError, logs the offending path, and skips it,
so the whole script no longer dies when that happens.
Change-Id: I4e53a25a0519826128982fe94427b2bba66fda6f
---
M bin/refinery-drop-webrequest-partitions
1 file changed, 13 insertions(+), 6 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/analytics/refinery refs/changes/05/180305/1
diff --git a/bin/refinery-drop-webrequest-partitions b/bin/refinery-drop-webrequest-partitions
index ddb871a..685e204 100755
--- a/bin/refinery-drop-webrequest-partitions
+++ b/bin/refinery-drop-webrequest-partitions
@@ -119,12 +119,19 @@
     # Loop through all the partition directory paths for this table
     # and check if any of them are old enough for deletion.
     for partition_path in HdfsUtils.ls(partition_glob, include_children=False):
-        partition_datetime = hive.partition_datetime_from_path(
-            partition_path,
-            webrequest_date_regex
-        )
-        if partition_datetime < old_partition_datetime_threshold:
-            partition_paths_to_delete.append(partition_path)
+        try:
+            partition_datetime = hive.partition_datetime_from_path(
+                partition_path,
+                webrequest_date_regex
+            )
+            if partition_datetime < old_partition_datetime_threshold:
+                partition_paths_to_delete.append(partition_path)
+        except ValueError:
+            logging.error(
+                ' hive.partition_datetime_from_path could not parse date found in {0} using pattern {1}. Skipping.'
+                .format(partition_path, webrequest_date_regex.pattern)
+            )
+            continue
 
     # Drop any old Hive partitions
     if partition_specs_to_drop:
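
For reviewers who want to try the skip-on-ValueError behaviour outside the
script, here is a minimal standalone sketch. It is not the refinery code:
partition_datetime_from_path below is a hypothetical stand-in for
hive.partition_datetime_from_path built on the standard library only, and
the regex, paths, and threshold are made-up example values. Unlike the
patch, the sketch keeps only the parsing call inside the try block, which
is one possible way to avoid hiding unrelated ValueErrors.

```python
import logging
import re
from datetime import datetime


def partition_datetime_from_path(path, regex):
    """Hypothetical stand-in for hive.partition_datetime_from_path.

    Raises ValueError when the path holds no date or an invalid one.
    """
    match = regex.search(path)
    if match is None:
        raise ValueError('no date found in {0}'.format(path))
    # datetime() itself raises ValueError for out-of-range fields.
    return datetime(*[int(group) for group in match.groups()])


def select_old_partitions(paths, regex, threshold):
    """Return the paths whose partition datetime is older than threshold.

    Paths that cannot be parsed are logged and skipped instead of
    aborting the whole run.
    """
    old_paths = []
    for path in paths:
        try:
            partition_datetime = partition_datetime_from_path(path, regex)
        except ValueError:
            logging.error(
                'Could not parse date found in {0} using pattern {1}. '
                'Skipping.'.format(path, regex.pattern)
            )
            continue
        if partition_datetime < threshold:
            old_paths.append(path)
    return old_paths


# Example values, assumed for illustration only.
example_regex = re.compile(r'year=(\d+)/month=(\d+)/day=(\d+)/hour=(\d+)')
example_paths = [
    '/data/example/year=2014/month=10/day=1/hour=0',
    '/data/example/_temporary',                       # no date at all
    '/data/example/year=2014/month=13/day=1/hour=0',  # invalid month
    '/data/example/year=2014/month=12/day=15/hour=6',
]
example_threshold = datetime(2014, 12, 1)

if __name__ == '__main__':
    print(select_old_partitions(example_paths, example_regex, example_threshold))
```

With these example values, the two unparseable paths are logged and
skipped, and only the October partition is selected for deletion.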
--
To view, visit https://gerrit.wikimedia.org/r/180305
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4e53a25a0519826128982fe94427b2bba66fda6f
Gerrit-PatchSet: 1
Gerrit-Project: analytics/refinery
Gerrit-Branch: master
Gerrit-Owner: Ottomata <[email protected]>