[
https://issues.apache.org/jira/browse/TIKA-887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tyler Palsulich resolved TIKA-887.
----------------------------------
Resolution: Fixed
There were no objections, and the linked file appears to have valid metadata,
so I'm marking this as fixed.
> Tika fails to parse some MP3 tags correctly and produces null characters in
> value
> ---------------------------------------------------------------------------------
>
> Key: TIKA-887
> URL: https://issues.apache.org/jira/browse/TIKA-887
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.0, 1.1
> Reporter: Jens Hübel
> Priority: Minor
>
> I have a problem when extracting the comment tag from an MP3 file. It
> contains an invalid prefix then a '\0' character and then the real value of
> the tag. This happens with files downloaded from www.jamendo.com, for
> example this one:
> http://storage.newjamendo.com/download/track/450545/mp32/Swansong.mp3
> It may be that the tags are not created properly on this site, but at least
> tools like mp3tag display them correctly.
> The extracted value looks like this: eng http://www.jamendo.com
> Attribution-Noncommercial-Share Alike 3.0
> At position 3 there is a null character. The tag value should start with
> http...
> Here is the byte sequence at the beginning of this file:
> 49 44 33 04 00 00 00 01 18 32 54 49 54 32 00 00
> 00 09 00 00 03 53 77 61 6E 73 6F 6E 67 54 50 45
> 31 00 00 00 0E 00 00 03 4A 6F 73 68 20 57 6F 6F
> 64 77 61 72 64 54 41 4C 42 00 00 00 0C 00 00 03
> 42 72 65 61 64 63 72 75 6D 62 73 54 44 52 4C 00
> 00 00 05 00 00 03 32 30 30 39 43 4F 4D 4D 00 00
> 00 22 00 00 03 65 6E 67 49 44 33 20 76 31 20 43
> 6F 6D 6D 65 6E 74 00 41 74 74 72 69 62 75 74 69
> 6F 6E 20 33 2E 30 54 43 4F 4E 00 00 00 06 00 00
> 03 28 32 35 35 29 54 50 55 42 00 00 00 08 00 00
> 03 4A 61 6D 65 6E 64 6F 43 4F 4D 4D 00 00 00 2C
> 00 00 03 65 6E 67 00 68 74 74 70 3A 2F 2F 77 77
> 77 2E 6A 61 6D 65 6E 64 6F 2E 63 6F 6D 20 41 74
> 74 72 69 62 75 74 69 6F 6E 20 33 2E 30 20 54 43
> 4F 50 00 00 01 1F 00 00 03 32 30 30 39 2D 31 30
> 2D 32 31 54 31 31 3A 31 31 3A 32 30 2B 30 31 3A
> 30 30 20 4A 6F 73 68 20 57 6F 6F 64 77 61 72 64
> 2E 20 4C 69 63 65 6E 73 65 64 20 74 6F 20 74 68
> ID3......2TIT2.......SwansongTPE1.......Josh
> WoodwardTALB.......BreadcrumbsTDRL.......2009COMM..."...engID3 v1
> Comment.Attribution
> 3.0TCON.......(255)TPUB.......JamendoCOMM...,...eng.http://www.jamendo.com
> Attribution 3.0 TCOP.......2009-10-21T11:11:20+01:00 Josh Woodward. Licensed
> to th
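
For reference, the reported value ("eng", a null character, then the URL)
matches the layout of an ID3v2.4 COMM frame body: one text-encoding byte, a
three-byte language code, a null-terminated short description, and only then
the comment text. The sketch below is not Tika's code. It is a minimal
illustration in Python, using the two COMM frames copied from the dump above;
the function names are hypothetical, and treating a missing terminator as
"everything is comment text" is an assumption.

```python
# Text-encoding byte values defined by ID3v2.4 for text-bearing frames.
ENCODINGS = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}


def find_terminator(data: bytes, width: int) -> int:
    """Return the offset of the first aligned null terminator, or -1."""
    for i in range(0, len(data) - width + 1, width):
        if data[i:i + width] == b"\x00" * width:
            return i
    return -1


def parse_comm_body(body: bytes) -> tuple:
    """Split a COMM frame body into (language, description, text)."""
    codec = ENCODINGS[body[0]]
    language = body[1:4].decode("latin-1")
    rest = body[4:]
    # UTF-16 variants terminate the description with two null bytes.
    width = 2 if body[0] in (1, 2) else 1
    end = find_terminator(rest, width)
    if end < 0:
        # Assumption: with no terminator, treat everything as comment text.
        return language, "", rest.decode(codec)
    return language, rest[:end].decode(codec), rest[end + width:].decode(codec)


def parse_comm_frame(frame: bytes) -> tuple:
    """Parse one complete ID3v2.4 COMM frame (10-byte header plus body)."""
    if frame[:4] != b"COMM":
        raise ValueError("not a COMM frame")
    s = frame[4:8]
    # ID3v2.4 frame sizes are syncsafe: 7 significant bits per byte.
    size = (s[0] << 21) | (s[1] << 14) | (s[2] << 7) | s[3]
    return parse_comm_body(frame[10:10 + size])


# The two COMM frames, copied byte for byte from the dump in this issue.
FIRST_COMM = bytes.fromhex(
    "43 4F 4D 4D 00 00 00 22 00 00 03 65 6E 67 49 44 33 20 76 31 20 43 "
    "6F 6D 6D 65 6E 74 00 41 74 74 72 69 62 75 74 69 6F 6E 20 33 2E 30"
)
SECOND_COMM = bytes.fromhex(
    "43 4F 4D 4D 00 00 00 2C 00 00 03 65 6E 67 00 68 74 74 70 3A 2F 2F "
    "77 77 77 2E 6A 61 6D 65 6E 64 6F 2E 63 6F 6D 20 41 74 74 72 69 62 "
    "75 74 69 6F 6E 20 33 2E 30 20"
)

if __name__ == "__main__":
    print(parse_comm_frame(FIRST_COMM))
    print(parse_comm_frame(SECOND_COMM))
    # Dropping only the header and encoding byte reproduces the reported value.
    print(repr(SECOND_COMM[11:].decode("utf-8")))
```

In the second frame the short description is empty, so the null character
sits directly after "eng". A reader that skips only the encoding byte and
decodes the rest would produce the "eng", null, "http..." value described in
the report.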