[issue2104] ASS demuxer doesn't write UTF-8 BOM
Michael Niedermayer michae...@gmx.at added the comment:

On Sat, Jul 17, 2010 at 02:49:00PM +, fmm wrote:
Though this is probably a lost cause, of about 201 .ASS files found randomly in Google from different places, 193 had a BOM, and the rest,

sending a patch would probably increase your chances to see this changed

[...]

FFmpeg issue tracker iss...@roundup.ffmpeg.org https://roundup.ffmpeg.org/issue2104
[issue2104] ASS demuxer doesn't write UTF-8 BOM
Reimar Döffinger b...@reimardoeffinger.de added the comment:

On Fri, Jul 16, 2010 at 10:01:09PM +, fmm wrote:
When demuxing ASS subtitle tracks via ffmpeg -i file.mkv -scodec copy subtitles.ass the resulting file doesn't have a UTF-8 BOM at the beginning (0xEF 0xBB 0xBF).

There is no official UTF-8 BOM; after all, UTF-8 does not depend on byte order, and there are tools that have issues if it is there. This includes FFmpeg: FFmpeg will not strip it, in part because nobody implemented it, and in part because, since it is not necessary, someone might actually have wanted that character code there.

Also, UTF-8 can be detected quite reliably without such a thing, so there isn't much of a reason those other tools couldn't be fixed instead (and maybe it would finally be time for those tools to just _assume_ UTF-8 unless told otherwise?)...
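To illustrate the point about detecting UTF-8 without a BOM, here is a minimal Python sketch. It is not code from FFmpeg or from any of the tools mentioned in this thread; the function name sniff_utf8 is hypothetical. It assumes the whole file is available as bytes, and it treats "decodes strictly as UTF-8" as the detection heuristic.

```python
UTF8_BOM = b"\xef\xbb\xbf"  # the three bytes quoted in the report


def sniff_utf8(data):
    """Return (has_bom, is_valid_utf8) for a byte string.

    Hypothetical helper for illustration only. Text in a legacy 8-bit
    charset that contains non-ASCII characters rarely happens to form
    valid UTF-8 sequences, so a strict decode is a workable heuristic.
    """
    has_bom = data.startswith(UTF8_BOM)
    try:
        data.decode("utf-8")  # strict decoding raises on malformed input
        is_valid = True
    except UnicodeDecodeError:
        is_valid = False
    return has_bom, is_valid


if __name__ == "__main__":
    print(sniff_utf8("Dialogue: café".encode("utf-8")))    # no BOM, valid UTF-8
    print(sniff_utf8("Dialogue: café".encode("latin-1")))  # no BOM, not valid UTF-8
```

A reader that prefers to tolerate the BOM rather than reject it could decode with Python's "utf-8-sig" codec, which skips a leading BOM if one is present.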
[issue2104] ASS demuxer doesn't write UTF-8 BOM
Carl Eugen Hoyos ceho...@rainbow.studorg.tuwien.ac.at added the comment:

Does not sound like a valid issue.

--
status: new -> closed
substatus: new -> invalid
[issue2104] ASS demuxer doesn't write UTF-8 BOM
New submission from fmm fmm...@sogetthis.com:

When demuxing ASS subtitle tracks via ffmpeg -i file.mkv -scodec copy subtitles.ass the resulting file doesn't have a UTF-8 BOM at the beginning (0xEF 0xBB 0xBF).

Many tools assume or guess the charset incorrectly if they don't find the BOM (ASS subtitles are always stored as UTF-8 in Matroska). When this happens, non-ASCII characters break. mkvextract and Aegisub always write the BOM when extracting/writing ASS files to avoid these kinds of problems.

--
messages: 11240
priority: normal
status: new
substatus: new
title: ASS demuxer doesn't write UTF-8 BOM
type: bug
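As a workaround on the user side, the BOM can be added to the extracted file after the fact. The following is a minimal Python sketch under the assumption that the extracted subtitles are already UTF-8, as the report states for Matroska. The helper names ensure_bom and add_bom_to_file are hypothetical, and subtitles.ass is simply the output name from the command above.

```python
UTF8_BOM = b"\xef\xbb\xbf"  # 0xEF 0xBB 0xBF, as given in the report


def ensure_bom(data):
    """Return data with a UTF-8 BOM prepended, unless it already has one."""
    if data.startswith(UTF8_BOM):
        return data
    return UTF8_BOM + data


def add_bom_to_file(path):
    """Rewrite the file at path in place so that it starts with a BOM.

    Hypothetical helper: it reads the whole file into memory, which is
    fine for subtitle files but not meant for very large inputs.
    """
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(ensure_bom(data))


if __name__ == "__main__":
    import sys
    # Usage: python add_bom.py subtitles.ass
    add_bom_to_file(sys.argv[1])
```

Calling ensure_bom twice gives the same result as calling it once, so running the script repeatedly on the same file does not stack BOMs.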