[ https://issues.apache.org/jira/browse/TIKA-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16659203#comment-16659203 ]
Hudson commented on TIKA-2761:
------------------------------

UNSTABLE: Integrated in Jenkins build tika-2.x-windows #337 (See [https://builds.apache.org/job/tika-2.x-windows/337/])
TIKA-2761 -- write as much metadata as possible before writing to xhtml. (tallison: rev f7c3ece80e2db7e060deb0be3746d7dfa003303b)
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/mp3/Mp3ParserTest.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/mp3/Mp3Parser.java


> XML Structured Text Is Missing Metadata Fields for mp3 files
> ------------------------------------------------------------
>
>                 Key: TIKA-2761
>                 URL: https://issues.apache.org/jira/browse/TIKA-2761
>             Project: Tika
>          Issue Type: Bug
>          Components: metadata
>    Affects Versions: 1.19.1
>         Environment: All
>            Reporter: Nick Sincaglia
>            Assignee: Tim Allison
>            Priority: Minor
>             Fix For: 2.0.0, 1.20
>
>
> I am using the Tika 1.19 GUI to extract metadata from an .mp3 file. The
> sample rate is available and I am able to access it, but only as a string or
> as part of a JSON document. I am working in XML and would like to use XML as
> a content handler. But when the metadata is returned as 'structured text'
> (XML), the sample rate is not returned. I have tried using Tika 1.19 in a
> Maven project and experimented with different contentHandlers, and the same
> issue occurs. I cannot seem to get the sample rate returned in an XML doc,
> but I am able to access the data from the metadata object itself. If the
> metadata is returned as a string, the sample rate is there; if it is returned
> as XML, it is not. I am wondering what I am doing wrong or misunderstanding.
> Perhaps there is an issue with the parser or contentHandler that is used?
>
> *_+Tika 1.19 'Metadata' view (sample rate is available):+_*
>
> Author: Glee Cast
> Content-Length: 8251946
> Content-Type: audio/mpeg
> X-Parsed-By: org.apache.tika.parser.DefaultParser
> X-Parsed-By: org.apache.tika.parser.mp3.Mp3Parser
> X-TIKA:digest:MD5: e0bdf3a0e171fca838604f9baad46612
> X-TIKA:digest:SHA256: ea1e4aa998f2c6e80139fa100c62fc1ee17652cf702cd484532b90183e7c5cc0
> channels: 2
> creator: Glee Cast
> dc:creator: Glee Cast
> dc:title: Rehab (Glee Cast Version)
> meta:author: Glee Cast
> resourceName: USQX90900223_A4_T7.mp3
> *+_samplerate: 44100_+*
> title: Rehab (Glee Cast Version)
> version: MPEG 3 Layer III Version 1
> xmpDM:album: Glee: The Music, The Complete Season One
> xmpDM:artist: Glee Cast
> xmpDM:audioChannelType: Stereo
> xmpDM:audioCompressor: MP3
> *_+xmpDM:audioSampleRate: 44100+_*
> xmpDM:duration: 206301.296875
> xmpDM:genre:
> xmpDM:logComment: XXX -
> (P) 2009 Twentieth Century Fox Television - USQX90900223
> xmpDM:releaseDate:
> xmpDM:trackNumber: 4
>
>
> *Tika 1.19 'Structured Text' view (no sample rate):*
>
> <?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <meta name="xmpDM:genre" content=""/>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser"/>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.mp3.Mp3Parser"/>
> <meta name="creator" content="Glee Cast"/>
> <meta name="xmpDM:album" content="Glee: The Music, The Complete Season One"/>
> <meta name="xmpDM:releaseDate" content=""/>
> <meta name="meta:author" content="Glee Cast"/>
> <meta name="xmpDM:artist" content="Glee Cast"/>
> <meta name="X-TIKA:digest:SHA256" content="ea1e4aa998f2c6e80139fa100c62fc1ee17652cf702cd484532b90183e7c5cc0"/>
> <meta name="dc:creator" content="Glee Cast"/>
> <meta name="xmpDM:audioCompressor" content="MP3"/>
> <meta name="resourceName" content="USQX90900223_A4_T7.mp3"/>
> <meta name="xmpDM:logComment" content="XXX - (P) 2009 Twentieth Century Fox Television - USQX90900223"/>
> <meta name="dc:title" content="Rehab (Glee Cast Version)"/>
> <meta name="Author" content="Glee Cast"/>
> <meta name="Content-Length" content="8251946"/>
> <meta name="X-TIKA:digest:MD5" content="e0bdf3a0e171fca838604f9baad46612"/>
> <meta name="Content-Type" content="audio/mpeg"/>
> <title>Rehab (Glee Cast Version)</title>
> </head>
> <body><h1>Rehab (Glee Cast Version)</h1>
> <p>Glee Cast</p>
> <p>Glee: The Music, The Complete Season One, track 4</p>
> <p>206301.3</p>
> <p>XXX - (P) 2009 Twentieth Century Fox Television - USQX90900223</p>
> </body></html>
>
> *_+Tika 1.19 Recursive JSON view (the sample rate is there):+_*
>
> [
>   {
>     "Author": "Glee Cast",
>     "Content-Type": "audio/mpeg",
>     "X-Parsed-By": [
>       "org.apache.tika.parser.DefaultParser",
>       "org.apache.tika.parser.mp3.Mp3Parser"
>     ],
>     "X-TIKA:content": "Rehab (Glee Cast Version)\nGlee Cast\nGlee: The Music, The Complete Season One, track 4\n206301.3\nXXX - \n(P) 2009 Twentieth Century Fox Television - USQX90900223\n",
>     "X-TIKA:digest:MD5": "e0bdf3a0e171fca838604f9baad46612",
>     "X-TIKA:digest:SHA256": "ea1e4aa998f2c6e80139fa100c62fc1ee17652cf702cd484532b90183e7c5cc0",
>     "X-TIKA:parse_time_millis": "86",
>     "channels": "2",
>     "creator": "Glee Cast",
>     "dc:creator": "Glee Cast",
>     "dc:title": "Rehab (Glee Cast Version)",
>     "meta:author": "Glee Cast",
>     *+_"samplerate": "44100",_+*
>     "title": "Rehab (Glee Cast Version)",
>     "version": "MPEG 3 Layer III Version 1",
>     "xmpDM:album": "Glee: The Music, The Complete Season One",
>     "xmpDM:artist": "Glee Cast",
>     "xmpDM:audioChannelType": "Stereo",
>     "xmpDM:audioCompressor": "MP3",
>     *_+"xmpDM:audioSampleRate": "44100",+_*
>     "xmpDM:duration": "206301.296875",
>     "xmpDM:genre": "",
>     "xmpDM:logComment": "XXX - \n(P) 2009 Twentieth Century Fox Television - USQX90900223",
>     "xmpDM:releaseDate": "",
>     "xmpDM:trackNumber": "4"
>   }
> ]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
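
The commit message in the comment above ("write as much metadata as possible before writing to xhtml") suggests that the missing fields were set on the metadata object only after the XHTML head had already been emitted. The sketch below models that ordering effect in plain Java, under that assumption. `XhtmlWriter`, `parse`, and `HeadSnapshotDemo` are hypothetical stand-ins and not Tika classes, and the actual change to `Mp3Parser` may differ in detail.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified model: the head is a one-time snapshot of the metadata. */
public class HeadSnapshotDemo {

    /** Hypothetical stand-in for an XHTML content handler (not Tika's class). */
    static final class XhtmlWriter {
        private final Map<String, String> metadata;
        private final StringBuilder out = new StringBuilder();

        XhtmlWriter(Map<String, String> metadata) {
            this.metadata = metadata;
        }

        /** Emits the head from whatever the metadata holds at this moment. */
        void startDocument() {
            out.append("<html><head>");
            for (Map.Entry<String, String> e : metadata.entrySet()) {
                // No XML escaping here; the demo values are known to be safe.
                out.append("<meta name=\"").append(e.getKey())
                   .append("\" content=\"").append(e.getValue()).append("\"/>");
            }
            out.append("</head><body>");
        }

        void paragraph(String text) {
            out.append("<p>").append(text).append("</p>");
        }

        String endDocument() {
            return out.append("</body></html>").toString();
        }
    }

    /** Runs a toy "parse", setting the sample rate before or after the head. */
    public static String parse(boolean metadataFirst) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("xmpDM:artist", "Glee Cast");
        XhtmlWriter xhtml = new XhtmlWriter(metadata);

        if (metadataFirst) {
            metadata.put("samplerate", "44100"); // set before the head is written
            xhtml.startDocument();
        } else {
            xhtml.startDocument();
            metadata.put("samplerate", "44100"); // too late: head already written
        }
        xhtml.paragraph("Glee Cast");
        return xhtml.endDocument();
    }

    public static void main(String[] args) {
        System.out.println(parse(false).contains("samplerate")); // prints false
        System.out.println(parse(true).contains("samplerate"));  // prints true
    }
}
```

In both orderings the map itself ends up holding `samplerate`, which matches the report: the value is visible through the metadata object and the JSON view, but not in the XML head.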