[issue2104] ASS demuxer doesn't write UTF-8 BOM

2010-07-20 Thread Michael Niedermayer

Michael Niedermayer michae...@gmx.at added the comment:

On Sat, Jul 17, 2010 at 02:49:00PM +, fmm wrote:
> Though this is probably a lost cause, of about 201 .ASS files found
> randomly in Google from different places, 193 had a BOM, and the rest,

sending a patch would probably increase your chances to see this
changed

[...]


FFmpeg issue tracker iss...@roundup.ffmpeg.org
https://roundup.ffmpeg.org/issue2104



[issue2104] ASS demuxer doesn't write UTF-8 BOM

2010-07-17 Thread Reimar Döffinger

Reimar Döffinger b...@reimardoeffinger.de added the comment:

On Fri, Jul 16, 2010 at 10:01:09PM +, fmm wrote:
> When demuxing ASS subtitle tracks via
>
> ffmpeg -i file.mkv -scodec copy subtitles.ass
>
> the resulting file doesn't have a UTF-8 BOM at the beginning (0xEF 0xBB
> 0xBF).

There is no official UTF-8 BOM; after all, UTF-8 does not depend on byte order.
There are also tools that have issues if it is there. This includes FFmpeg:
FFmpeg will not strip it, in part because nobody implemented that and in part
because, since it is not necessary, someone might actually have wanted that
character code there.
Also, UTF-8 can be detected quite reliably without such a thing, so there isn't
much of a reason those other tools couldn't be fixed instead (and maybe it would
finally be time for those tools to just _assume_ UTF-8 unless told otherwise?)...
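
As a rough illustration of the detection argument above (a sketch only, not code from FFmpeg or any of the tools mentioned): a reader can treat the BOM as optional, try a strict UTF-8 decode first, and fall back to a legacy charset only when that fails. The function names and the cp1252 fallback are assumptions made for this example.

```python
import codecs


def looks_like_utf8(data: bytes) -> bool:
    """Return True if data is valid strict UTF-8; no BOM is required."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def decode_subtitles(data: bytes, fallback: str = "cp1252") -> str:
    """Decode subtitle bytes, assuming UTF-8 unless the bytes say otherwise.

    The fallback charset is a hypothetical default chosen for this sketch.
    """
    # Drop a leading UTF-8 BOM (0xEF 0xBB 0xBF) if one is present.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: only now guess a legacy encoding.
        return data.decode(fallback, errors="replace")
```

Non-ASCII text in a legacy 8-bit encoding rarely happens to form valid UTF-8 sequences, which is why the strict decode works as a detector without any BOM.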





[issue2104] ASS demuxer doesn't write UTF-8 BOM

2010-07-17 Thread Carl Eugen Hoyos

Carl Eugen Hoyos ceho...@rainbow.studorg.tuwien.ac.at added the comment:

Does not sound like a valid issue.

--
status: new -> closed
substatus: new -> invalid





[issue2104] ASS demuxer doesn't write UTF-8 BOM

2010-07-16 Thread fmm

New submission from fmm fmm...@sogetthis.com:

When demuxing ASS subtitle tracks via

ffmpeg -i file.mkv -scodec copy subtitles.ass

the resulting file doesn't have a UTF-8 BOM at the beginning (0xEF 0xBB
0xBF). Many tools assume or guess the charset incorrectly if they don't find the
BOM (ASS subtitles are always stored as UTF-8 in Matroska). When this happens,
non-ASCII characters break. mkvextract and Aegisub always write the BOM when
extracting/writing ASS files to avoid these kinds of problems.
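
Until this is changed, the BOM can be added after extraction. The following is a minimal sketch of that workaround, not part of FFmpeg; the helper names are made up for this example, and it assumes the extracted file is already UTF-8, as described above.

```python
import codecs
from pathlib import Path


def ensure_utf8_bom(data: bytes) -> bytes:
    """Return data with a leading UTF-8 BOM (0xEF 0xBB 0xBF), added only if missing."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


def add_bom_to_file(path: str) -> None:
    """Rewrite the file at path in place so that it starts with a UTF-8 BOM."""
    p = Path(path)
    p.write_bytes(ensure_utf8_bom(p.read_bytes()))
```

For example, calling add_bom_to_file("subtitles.ass") after the ffmpeg command above would leave the subtitle text unchanged and only prepend the three BOM bytes. Running it twice does not add a second BOM.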

--
messages: 11240
priority: normal
status: new
substatus: new
title: ASS demuxer doesn't write UTF-8 BOM
type: bug

