[ 
https://issues.apache.org/jira/browse/TIKA-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570705#comment-13570705
 ] 

Nick Burch commented on TIKA-887:
---------------------------------

I've just tried with the most recent build of Tika from SVN, and I'm not seeing 
any random control characters turn up. I therefore think that the work on the 
MP3 parser over the last year has solved it.

Any chance you could double check yourself, and close the ticket if it's now 
behaving?
                
> Tika fails to parse some MP3 tags correctly and produces null characters in 
> value
> ---------------------------------------------------------------------------------
>
>                 Key: TIKA-887
>                 URL: https://issues.apache.org/jira/browse/TIKA-887
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.0, 1.1
>            Reporter: Jens Hübel
>            Priority: Minor
>
> I have a problem when extracting the comment tag from an MP3 file. It 
> contains an invalid prefix then a '\0' character and then the real value of 
> the tag. This happens with files downloaded from www.jamendo.com, for 
> example this one:
> http://storage.newjamendo.com/download/track/450545/mp32/Swansong.mp3
> It may be that the tags are not created properly on this site, but at least 
> tools like mp3tag display them correctly.
> The extracted value looks like this: eng http://www.jamendo.com 
> Attribution-Noncommercial-Share Alike 3.0
> At position 3 there is a null character. The tag value should start with 
> http...
> Here is the byte sequence at the beginning of this file:
> 49 44 33 04 00 00 00 01 18 32 54 49 54 32 00 00 
> 00 09 00 00 03 53 77 61 6E 73 6F 6E 67 54 50 45 
> 31 00 00 00 0E 00 00 03 4A 6F 73 68 20 57 6F 6F 
> 64 77 61 72 64 54 41 4C 42 00 00 00 0C 00 00 03 
> 42 72 65 61 64 63 72 75 6D 62 73 54 44 52 4C 00 
> 00 00 05 00 00 03 32 30 30 39 43 4F 4D 4D 00 00 
> 00 22 00 00 03 65 6E 67 49 44 33 20 76 31 20 43 
> 6F 6D 6D 65 6E 74 00 41 74 74 72 69 62 75 74 69 
> 6F 6E 20 33 2E 30 54 43 4F 4E 00 00 00 06 00 00 
> 03 28 32 35 35 29 54 50 55 42 00 00 00 08 00 00 
> 03 4A 61 6D 65 6E 64 6F 43 4F 4D 4D 00 00 00 2C 
> 00 00 03 65 6E 67 00 68 74 74 70 3A 2F 2F 77 77 
> 77 2E 6A 61 6D 65 6E 64 6F 2E 63 6F 6D 20 41 74 
> 74 72 69 62 75 74 69 6F 6E 20 33 2E 30 20 54 43 
> 4F 50 00 00 01 1F 00 00 03 32 30 30 39 2D 31 30 
> 2D 32 31 54 31 31 3A 31 31 3A 32 30 2B 30 31 3A 
> 30 30 20 4A 6F 73 68 20 57 6F 6F 64 77 61 72 64 
> 2E 20 4C 69 63 65 6E 73 65 64 20 74 6F 20 74 68
> ID3......2TIT2.......SwansongTPE1.......Josh 
> WoodwardTALB.......BreadcrumbsTDRL.......2009COMM..."...engID3 v1 
> Comment.Attribution 
> 3.0TCON.......(255)TPUB.......JamendoCOMM...,...eng.http://www.jamendo.com 
> Attribution 3.0 TCOP.......2009-10-21T11:11:20+01:00 Josh Woodward. Licensed 
> to th
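
For reference, the second COMM frame in the dump above (43 4F 4D 4D 00 00 00 2C 
00 00 03 65 6E 67 00 68 74 ...) appears to consist of an encoding byte (03, 
which is UTF-8 in ID3v2.4), the three-byte language code "eng", an empty short 
description closed by a single 00 terminator, and then the comment text. 
Returning everything after the encoding byte as one string would give the 
"eng", null character, "http..." value reported here. Below is a minimal Java 
sketch of splitting a COMM frame body along those lines. The class and method 
names (CommFrameSketch, parse, Comment) are hypothetical and are not taken from 
Tika's MP3 parser; the input is assumed to be the frame body only, without the 
10-byte frame header.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Illustrative only: splits an ID3v2 COMM frame body into its parts. */
final class CommFrameSketch {

    /** The three logical fields of a COMM frame body. */
    static final class Comment {
        final String language;
        final String description;
        final String text;

        Comment(String language, String description, String text) {
            this.language = language;
            this.description = description;
            this.text = text;
        }
    }

    /**
     * Layout assumed here: [encoding byte][3-byte language]
     * [description][terminator][text].
     */
    static Comment parse(byte[] body) {
        if (body.length < 4) {
            throw new IllegalArgumentException("COMM body too short");
        }
        Charset charset;
        int terminatorWidth; // 1 byte for 8-bit encodings, 2 bytes for UTF-16
        switch (body[0] & 0xFF) {
            case 0: charset = StandardCharsets.ISO_8859_1; terminatorWidth = 1; break;
            case 1: charset = StandardCharsets.UTF_16;     terminatorWidth = 2; break;
            case 2: charset = StandardCharsets.UTF_16BE;   terminatorWidth = 2; break;
            case 3: charset = StandardCharsets.UTF_8;      terminatorWidth = 1; break;
            default:
                throw new IllegalArgumentException("Unknown text encoding byte");
        }

        // The language code is always three ISO-8859-1 bytes.
        String language = new String(body, 1, 3, StandardCharsets.ISO_8859_1);

        // The short description runs up to the first terminator after the language.
        int descStart = 4;
        int descEnd = findTerminator(body, descStart, terminatorWidth);
        String description = new String(body, descStart, descEnd - descStart, charset);

        // The comment text is whatever follows the terminator.
        // If no terminator was found, the text is left empty.
        int textStart = Math.min(descEnd + terminatorWidth, body.length);
        String text = new String(body, textStart, body.length - textStart, charset);

        return new Comment(language, description, text);
    }

    /** Returns the index of the first all-zero terminator, or data.length if none. */
    private static int findTerminator(byte[] data, int from, int width) {
        for (int i = from; i + width <= data.length; i += width) {
            boolean allZero = true;
            for (int j = 0; j < width; j++) {
                if (data[i + j] != 0) {
                    allZero = false;
                }
            }
            if (allZero) {
                return i;
            }
        }
        return data.length;
    }
}
```

Applied to the 44 bytes (0x2C) of the second COMM frame body, this should yield 
the language "eng", an empty description, and a text that starts with "http". 
For the first COMM frame it should yield the description "ID3 v1 Comment" and 
the text "Attribution 3.0".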

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
