Hello Ottomata, I'd like you to do a code review. Please visit
https://gerrit.wikimedia.org/r/194855 to review the following change. Change subject: Fail less hard for misrepresented urls in MediaFileUrlParser ...................................................................... Fail less hard for misrepresented urls in MediaFileUrlParser That way Hive queries do not abort if non-sensical Urls occur. Change-Id: I176f087d8b37a968f38e671e49e55b172aa992c4 --- M changelog.md M refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/MediaFileUrlParser.java M refinery-core/src/test/java/org/wikimedia/analytics/refinery/core/TestMediaFileUrlParser.java 3 files changed, 12 insertions(+), 3 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/analytics/refinery/source refs/changes/55/194855/1 diff --git a/changelog.md b/changelog.md index 6ebf6aa..3245942 100644 --- a/changelog.md +++ b/changelog.md @@ -4,6 +4,7 @@ * Start counting www.mediawiki.org hits * Consistently count search attempts * Make custom file ending optional for thumbnails in MediaFileUrlParser +* Fail less hard for misrepresented urls in MediaFileUrlParser ## v0.0.7 * Add Referer classifier diff --git a/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/MediaFileUrlParser.java b/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/MediaFileUrlParser.java index a039988..ad5de4a 100644 --- a/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/MediaFileUrlParser.java +++ b/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/MediaFileUrlParser.java @@ -229,9 +229,13 @@ Integer height = parseDigitString(heightStr); ret = MediaFileUrlInfo.createTranscodedToMovie(baseName, height); } else { - throw new AssertionError("Logic error due to" - + "transcodingSpec without a suffix specific handler '" - + transcodingSpec + "'"); + // We sometimes see urls as + // /wikipedia/commons/transcoded/b/bf/foo.ogv/foo.ogv + // which would match this branch. But such requests do + // not make much sense. Instead of failing hard for + // them, we just signal that we could not make sense + // of them. + return null; } } else { return null; diff --git a/refinery-core/src/test/java/org/wikimedia/analytics/refinery/core/TestMediaFileUrlParser.java b/refinery-core/src/test/java/org/wikimedia/analytics/refinery/core/TestMediaFileUrlParser.java index d6b7274..89348b5 100644 --- a/refinery-core/src/test/java/org/wikimedia/analytics/refinery/core/TestMediaFileUrlParser.java +++ b/refinery-core/src/test/java/org/wikimedia/analytics/refinery/core/TestMediaFileUrlParser.java @@ -562,6 +562,10 @@ 100); } + public void testMistranscodedUrl() { + assertUnidentified("/wikipedia/commons/transcoded/b/bf/foo.ogv/foo.ogv"); + } + // Test uploaded media files; Archive ------------------------------------- public void testMediaArchive() { -- To view, visit https://gerrit.wikimedia.org/r/194855 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I176f087d8b37a968f38e671e49e55b172aa992c4 Gerrit-PatchSet: 1 Gerrit-Project: analytics/refinery/source Gerrit-Branch: master Gerrit-Owner: QChris <christ...@quelltextlich.at> Gerrit-Reviewer: Ottomata <o...@wikimedia.org> _______________________________________________ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits