Hello Ottomata,

I'd like you to do a code review.  Please visit

    https://gerrit.wikimedia.org/r/194855

to review the following change.

Change subject: Fail less hard for misrepresented urls in MediaFileUrlParser
......................................................................

Fail less hard for misrepresented urls in MediaFileUrlParser

That way, Hive queries do not abort when nonsensical URLs occur.
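
To illustrate the idea, here is a minimal sketch of a parser that returns
null for paths it cannot make sense of, so a per-row caller keeps going
instead of aborting. It is not the actual MediaFileUrlParser code: the class
LenientParseSketch, the methods parseBaseName and countUnidentified, and the
assumed "<baseName>.<height>p.<ext>" shape are all hypothetical and only
serve the illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the real MediaFileUrlParser.
class LenientParseSketch {

    private static final String PREFIX = "/wikipedia/commons/transcoded/";

    /**
     * Returns the base name of a transcoded media path, or null when the
     * path cannot be made sense of. It never throws for malformed input.
     */
    static String parseBaseName(String path) {
        if (path == null || !path.startsWith(PREFIX)) {
            return null;
        }
        // Assumed shape: <x>/<xy>/<baseName>/<baseName>.<height>p.<ext>
        String[] parts = path.substring(PREFIX.length()).split("/");
        if (parts.length != 4) {
            return null;
        }
        String baseName = parts[2];
        String transcoded = parts[3];
        if (!transcoded.startsWith(baseName + ".")) {
            // E.g. ".../foo.ogv/foo.ogv": no transcoding spec at all.
            return null;
        }
        String spec = transcoded.substring(baseName.length() + 1);
        if (!spec.matches("\\d+p\\.(webm|ogv)")) {
            // Unknown spec: signal "unidentified" instead of throwing.
            return null;
        }
        return baseName;
    }

    /** Mimics a per-row caller: unparseable rows are counted, not fatal. */
    static int countUnidentified(List<String> paths) {
        int unidentified = 0;
        for (String path : paths) {
            if (parseBaseName(path) == null) {
                unidentified++;
            }
        }
        return unidentified;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
                "/wikipedia/commons/transcoded/b/bf/foo.ogv/foo.ogv.360p.webm",
                "/wikipedia/commons/transcoded/b/bf/foo.ogv/foo.ogv");
        System.out.println("unidentified: " + countUnidentified(paths));
    }
}
```

In this sketch, the second path in main is counted as unidentified rather
than raising an error, which is the behaviour the change aims for.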

Change-Id: I176f087d8b37a968f38e671e49e55b172aa992c4
---
M changelog.md
M refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/MediaFileUrlParser.java
M refinery-core/src/test/java/org/wikimedia/analytics/refinery/core/TestMediaFileUrlParser.java
3 files changed, 12 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/analytics/refinery/source refs/changes/55/194855/1

diff --git a/changelog.md b/changelog.md
index 6ebf6aa..3245942 100644
--- a/changelog.md
+++ b/changelog.md
@@ -4,6 +4,7 @@
 * Start counting www.mediawiki.org hits
 * Consistently count search attempts
 * Make custom file ending optional for thumbnails in MediaFileUrlParser
+* Fail less hard for misrepresented urls in MediaFileUrlParser
 
 ## v0.0.7
 * Add Referer classifier
diff --git a/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/MediaFileUrlParser.java b/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/MediaFileUrlParser.java
index a039988..ad5de4a 100644
--- a/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/MediaFileUrlParser.java
+++ b/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/MediaFileUrlParser.java
@@ -229,9 +229,13 @@
                         Integer height = parseDigitString(heightStr);
                         ret = MediaFileUrlInfo.createTranscodedToMovie(baseName, height);
                     } else {
-                        throw new AssertionError("Logic error due to"
-                                + "transcodingSpec without a suffix specific handler '"
-                                + transcodingSpec + "'");
+                        // We sometimes see urls as
+                        //   /wikipedia/commons/transcoded/b/bf/foo.ogv/foo.ogv
+                        // which would match this branch. But such requests do
+                        // not make much sense. Instead of failing hard for
+                        // them, we just signal that we could not make sense
+                        // of them.
+                        return null;
                     }
                 } else {
                     return null;
diff --git a/refinery-core/src/test/java/org/wikimedia/analytics/refinery/core/TestMediaFileUrlParser.java b/refinery-core/src/test/java/org/wikimedia/analytics/refinery/core/TestMediaFileUrlParser.java
index d6b7274..89348b5 100644
--- a/refinery-core/src/test/java/org/wikimedia/analytics/refinery/core/TestMediaFileUrlParser.java
+++ b/refinery-core/src/test/java/org/wikimedia/analytics/refinery/core/TestMediaFileUrlParser.java
@@ -562,6 +562,10 @@
                 100);
     }
 
+    public void testMistranscodedUrl() {
+        assertUnidentified("/wikipedia/commons/transcoded/b/bf/foo.ogv/foo.ogv");
+    }
+
     // Test uploaded media files; Archive -------------------------------------
 
     public void testMediaArchive() {

-- 
To view, visit https://gerrit.wikimedia.org/r/194855
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I176f087d8b37a968f38e671e49e55b172aa992c4
Gerrit-PatchSet: 1
Gerrit-Project: analytics/refinery/source
Gerrit-Branch: master
Gerrit-Owner: QChris <christ...@quelltextlich.at>
Gerrit-Reviewer: Ottomata <o...@wikimedia.org>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
