I will take a look over the next few days. I'm sorry for not having a ready answer.
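
In the meantime, the message quoted below notes that the metadata object itself does carry the sample rate, so one possible stopgap is to write the `<meta>` elements from that object yourself instead of relying on the head produced by the XML handler. Below is a minimal sketch that uses only the JDK. The `MetadataToXml` class, the `toMetaXml` helper, and the hard-coded map are hypothetical stand-ins for values copied out of Tika's `Metadata` after parsing, and none of this has been tested against Tika 1.19:

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class MetadataToXml {

    // Hypothetical helper (not part of Tika): serializes name/value pairs
    // as <meta name="..." content="..."/> elements inside a <head> element.
    public static String toMetaXml(Map<String, String> metadata) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("head");
            for (Map.Entry<String, String> entry : metadata.entrySet()) {
                xml.writeEmptyElement("meta");
                // The writer escapes attribute values, so quotes and
                // ampersands in metadata values are safe here.
                xml.writeAttribute("name", entry.getKey());
                xml.writeAttribute("content", entry.getValue());
            }
            xml.writeEndElement();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not serialize metadata", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: in real use these pairs would be copied from the Tika
        // Metadata object after parse(...) returns, not typed in by hand.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("samplerate", "44100");
        metadata.put("xmpDM:audioSampleRate", "44100");
        System.out.println(toMetaXml(metadata));
    }
}
```

With Tika on the classpath, the map would presumably be filled by looping over `metadata.names()` and reading `metadata.get(name)` once parsing has finished. Multi-valued entries such as `X-Parsed-By` would need `metadata.getValues(name)` and one `<meta>` element per value.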

On Sun, Oct 14, 2018 at 11:08 PM Nick Sincaglia <[email protected]> wrote:

> I was wondering if anyone might have some insights on why the XML output
> does not contain some of the technical file information that the JSON and
> text version does. Is this something that can be fixed? Could someone
> suggest a way to go about identifying the root cause and fixing it?
>
> Thanks,
>
> Nick
>
> On Oct 8, 2018, at 9:31 PM, Nick Sincaglia <[email protected]> wrote:
>
> I am using the Tika 1.19 as a GUI to extract metadata from an .mp3 file.
> The sample rate is available and I am able access it, but only as a string
> or as part of a JSON document. I am working in XML and would like to use
> XML as a content handler. But when the metadata is returned as ‘structured
> text’ (XML) the sample rate is not returned. I have tried using Tika 1.19
> in a Maven project and experimented with different contentHandlers and the
> same issue occurs. I cannot seem to get the sample rate returned in an XML
> doc, but I am able to access the data from the metadata object itself. If
> the metadata is returned as a string, the sample rate is there, if it is
> returned as XML, the sample rate is not returned. I am wondering what I am
> doing wrong or misunderstanding. Perhaps an issue with the parser or
> contentHandler that is used?
>
>
> *Tika 1.19 ‘Metadata’ view (sample rate is available):*
>
> Author: Glee Cast
> Content-Length: 8251946
> Content-Type: audio/mpeg
> X-Parsed-By: org.apache.tika.parser.DefaultParser
> X-Parsed-By: org.apache.tika.parser.mp3.Mp3Parser
> X-TIKA:digest:MD5: e0bdf3a0e171fca838604f9baad46612
> X-TIKA:digest:SHA256: ea1e4aa998f2c6e80139fa100c62fc1ee17652cf702cd484532b90183e7c5cc0
> channels: 2
> creator: Glee Cast
> dc:creator: Glee Cast
> dc:title: Rehab (Glee Cast Version)
> meta:author: Glee Cast
> resourceName: USQX90900223_A4_T7.mp3
> *samplerate: 44100*
> title: Rehab (Glee Cast Version)
> version: MPEG 3 Layer III Version 1
> xmpDM:album: Glee: The Music, The Complete Season One
> xmpDM:artist: Glee Cast
> xmpDM:audioChannelType: Stereo
> xmpDM:audioCompressor: MP3
> *xmpDM:audioSampleRate: 44100*
> xmpDM:duration: 206301.296875
> xmpDM:genre:
> xmpDM:logComment: XXX -
> (P) 2009 Twentieth Century Fox Television - USQX90900223
> xmpDM:releaseDate:
> xmpDM:trackNumber: 4
>
>
> *Tika 1.19 ‘Structured Text’ view (no sample rate):*
>
> <?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <meta name="xmpDM:genre" content=""/>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser"/>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.mp3.Mp3Parser"/>
> <meta name="creator" content="Glee Cast"/>
> <meta name="xmpDM:album" content="Glee: The Music, The Complete Season One"/>
> <meta name="xmpDM:releaseDate" content=""/>
> <meta name="meta:author" content="Glee Cast"/>
> <meta name="xmpDM:artist" content="Glee Cast"/>
> <meta name="X-TIKA:digest:SHA256" content="ea1e4aa998f2c6e80139fa100c62fc1ee17652cf702cd484532b90183e7c5cc0"/>
> <meta name="dc:creator" content="Glee Cast"/>
> <meta name="xmpDM:audioCompressor" content="MP3"/>
> <meta name="resourceName" content="USQX90900223_A4_T7.mp3"/>
> <meta name="xmpDM:logComment" content="XXX - (P) 2009 Twentieth Century Fox Television - USQX90900223"/>
> <meta name="dc:title" content="Rehab (Glee Cast Version)"/>
> <meta name="Author" content="Glee Cast"/>
> <meta name="Content-Length" content="8251946"/>
> <meta name="X-TIKA:digest:MD5" content="e0bdf3a0e171fca838604f9baad46612"/>
> <meta name="Content-Type" content="audio/mpeg"/>
> <title>Rehab (Glee Cast Version)</title>
> </head>
> <body><h1>Rehab (Glee Cast Version)</h1>
> <p>Glee Cast</p>
> <p>Glee: The Music, The Complete Season One, track 4</p>
> <p>206301.3</p>
> <p>XXX - (P) 2009 Twentieth Century Fox Television - USQX90900223</p>
> </body></html>
>
>
> *Tika 1.19 Recursive JSON view (the sample rate is there):*
>
> [
>   {
>     "Author": "Glee Cast",
>     "Content-Type": "audio/mpeg",
>     "X-Parsed-By": [
>       "org.apache.tika.parser.DefaultParser",
>       "org.apache.tika.parser.mp3.Mp3Parser"
>     ],
>     "X-TIKA:content": "Rehab (Glee Cast Version)\nGlee Cast\nGlee: The Music, The Complete Season One, track 4\n206301.3\nXXX - \n(P) 2009 Twentieth Century Fox Television - USQX90900223\n",
>     "X-TIKA:digest:MD5": "e0bdf3a0e171fca838604f9baad46612",
>     "X-TIKA:digest:SHA256": "ea1e4aa998f2c6e80139fa100c62fc1ee17652cf702cd484532b90183e7c5cc0",
>     "X-TIKA:parse_time_millis": "86",
>     "channels": "2",
>     "creator": "Glee Cast",
>     "dc:creator": "Glee Cast",
>     "dc:title": "Rehab (Glee Cast Version)",
>     "meta:author": "Glee Cast",
> *    "samplerate": "44100",*
>     "title": "Rehab (Glee Cast Version)",
>     "version": "MPEG 3 Layer III Version 1",
>     "xmpDM:album": "Glee: The Music, The Complete Season One",
>     "xmpDM:artist": "Glee Cast",
>     "xmpDM:audioChannelType": "Stereo",
>     "xmpDM:audioCompressor": "MP3",
> *    "xmpDM:audioSampleRate": "44100",*
>     "xmpDM:duration": "206301.296875",
>     "xmpDM:genre": "",
>     "xmpDM:logComment": "XXX - \n(P) 2009 Twentieth Century Fox Television - USQX90900223",
>     "xmpDM:releaseDate": "",
>     "xmpDM:trackNumber": "4"
>   }
> ]
