Brion VIBBER has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/376088 )

Change subject: Retain some WebM metadata for processing purposes
......................................................................

Retain some WebM metadata for processing purposes

All Matroska-specific metadata from getid3 was being wiped out
from MediaWiki-side processing because there was some extra
detailed binary junk in there.

Now retaining the 'comments' subsection, which includes tags
such as writingapp and muxingapp. These can help detect files
copied from YouTube, for instance, since such files list "Google"
in these tags, though that alone is not a sure sign of a
problematic file.
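
As a rough illustration of the detection idea above, the retained
comments could be scanned as sketched below. This is not part of the
change: the function name looksLikeGoogleMuxed() is hypothetical, and
it assumes each comment value is either a string or a list of strings.

```php
<?php
/**
 * Hypothetical helper (not part of this change): reports whether the
 * retained writingapp/muxingapp comments mention "Google".
 *
 * @param array $metadata Unpacked metadata, shaped like the trimmed getid3 output
 * @return bool
 */
function looksLikeGoogleMuxed( array $metadata ) {
	if ( !isset( $metadata['matroska']['comments'] ) ) {
		return false;
	}
	$comments = $metadata['matroska']['comments'];
	foreach ( [ 'writingapp', 'muxingapp' ] as $tag ) {
		if ( !isset( $comments[$tag] ) ) {
			continue;
		}
		// Assumption: a value may be a single string or a list of strings,
		// so normalize to an array before scanning.
		foreach ( (array)$comments[$tag] as $value ) {
			if ( stripos( (string)$value, 'google' ) !== false ) {
				return true;
			}
		}
	}
	return false;
}
```

A match is only a hint for human review, as noted above, not proof
that a file is problematic.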

Bug: T167000
Change-Id: I195b09ca8713e36477dab3327b8361db35e41dcf
---
M handlers/WebMHandler/WebMHandler.php
1 file changed, 56 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/extensions/TimedMediaHandler refs/changes/88/376088/1

diff --git a/handlers/WebMHandler/WebMHandler.php b/handlers/WebMHandler/WebMHandler.php
index 708a46a..15abfb8 100644
--- a/handlers/WebMHandler/WebMHandler.php
+++ b/handlers/WebMHandler/WebMHandler.php
@@ -11,7 +11,13 @@
        protected function getID3( $path ) {
                $id3 = parent::getID3( $path );
                // Unset some parts of id3 that are too detailed and matroska specific:
-               unset( $id3['matroska'] );
+               // @todo include the basic file codec and other metadata too?
+               if ( isset( $id3['matroska'] ) ) {
+                       $comments = $id3['matroska']['comments'];
+                       $id3['matroska'] = [
+                               'comments' => $comments
+                       ];
+               }
                return $id3;
        }
 
@@ -173,4 +179,53 @@
                                $file->getHeight()
                        )->text();
        }
+
+       /**
+        * Display metadata box on file description page.
+        *
+        * Very basic, cribbed from OggHandlerTMH for now.
+        * Only shows the top-level writing/muxing app comment.
+        *
+        * @param File $file
+        * @param bool|IContextSource $context Context to use (optional)
+        * @return array|bool
+        */
+       public function formatMetadata( $file, $context = false ) {
+               $metadata = $file->getMetadata();
+
+               if ( is_string( $metadata ) ) {
+                       $metadata = $this->unpackMetadata( $metadata );
+               }
+
+               if ( isset( $metadata['error'] ) ) {
+                       return false;
+               }
+
+               if ( !$metadata ) {
+                       return false;
+               }
+
+               $props = [];
+
+               if ( isset( $metadata['matroska'] ) && isset( $metadata['matroska']['comments'] ) ) {
+                       $comments = $metadata['matroska']['comments'];
+                       // Map comments from getid3's matroska handler to output format
+                       // Localization of labels by FormatMetadata...
+                       $map = [
+                               'muxingapp' => 'Software',
+                               'writingapp' => 'Software',
+                       ];
+                       foreach ( $map as $commentTag => $propTag ) {
+                               if ( isset( $comments[$commentTag] ) ) {
+                                       if ( !isset( $props[$propTag] ) ) {
+                                               $props[$propTag] = [];
+                                       }
+                                       $props[$propTag][] = $comments[$commentTag];
+                               }
+                       }
+               }
+
+               return $this->formatMetadataHelper( $props, $context );
+       }
+
 }
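
The two steps in the diff above (trimming the 'matroska' block down to
its 'comments', then collecting muxingapp and writingapp under a single
'Software' property) can be tried outside MediaWiki with the standalone
sketch below. The function names trimMatroska() and
collectSoftwareProps() are hypothetical and are not in the patch. Unlike
the patch, the trim step falls back to an empty array when 'comments' is
absent; that fallback is an assumption added here to avoid an
undefined-index notice.

```php
<?php
/**
 * Hypothetical stand-in for the getID3() override: keep only the
 * 'comments' subsection of the 'matroska' block.
 */
function trimMatroska( array $id3 ) {
	if ( isset( $id3['matroska'] ) ) {
		// Assumption (differs from the patch): default to [] if 'comments' is missing.
		$comments = isset( $id3['matroska']['comments'] )
			? $id3['matroska']['comments']
			: [];
		$id3['matroska'] = [ 'comments' => $comments ];
	}
	return $id3;
}

/**
 * Hypothetical stand-in for the mapping loop in formatMetadata():
 * both tags are appended to the same 'Software' list.
 */
function collectSoftwareProps( array $metadata ) {
	$props = [];
	if ( isset( $metadata['matroska']['comments'] ) ) {
		$comments = $metadata['matroska']['comments'];
		$map = [
			'muxingapp' => 'Software',
			'writingapp' => 'Software',
		];
		foreach ( $map as $commentTag => $propTag ) {
			if ( isset( $comments[$commentTag] ) ) {
				// Appending auto-creates $props[$propTag] on first use.
				$props[$propTag][] = $comments[$commentTag];
			}
		}
	}
	return $props;
}
```

Because both tags map to 'Software', a file carrying both comments
yields two entries under that one key, in map order (muxingapp first).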

-- 
To view, visit https://gerrit.wikimedia.org/r/376088
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I195b09ca8713e36477dab3327b8361db35e41dcf
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/extensions/TimedMediaHandler
Gerrit-Branch: master
Gerrit-Owner: Brion VIBBER <[email protected]>
