> -----Original Message----- > From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Soft > Works > Sent: Sunday, February 6, 2022 2:09 AM > To: FFmpeg development discussions and patches <ffmpeg- > de...@ffmpeg.org> > Subject: Re: [FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix > handling of backslashes > > > > > -----Original Message----- > > From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of > > Oneric > > Sent: Saturday, February 5, 2022 11:00 PM > > To: FFmpeg development discussions and patches <ffmpeg- > > de...@ffmpeg.org> > > Subject: Re: [FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: > fix > > handling of backslashes > > > > On Sat, Feb 05, 2022 at 02:08:48 +0000, Soft Works wrote: > > > Let's try to approach this from a different side. Which case is > > > your [1/2] commit actually supposed to fix? > > > > Escape backslashes when converting from WebVTT to not accidentally > > introduce active ASS sequences and replace the wrong backslash- > escape > > in ff_ass_bprint_text_event with an escape that actually works. > > > > > How did you test your patch? > > > Can we please go over an example? > > > > Take a look at the attached WebVTT file. > > Thanks a lot for the test file! > > > > We expect the second event to be rendered like this, > > as from WebVTT’s point of view it’s just normal text: > > > > our final \h approach \N into \ Coruscant. > > > > What we currently get after conversion to ASS is like this though > > (pay attention to the number of spaces): > > > > our final approach > > into \ Coruscant. > > Yup, no doubt that this is wrong. > > > > If instead the word-joiner is appended as in my patch, the > > visual output matches the expectation (mail does not contain > U+2060): > > > > our final \h approach \N into \ Coruscant. > > I can also confirm that your patch would "work" with regards > to ass output when trying with both "old" libass and new libass. > I haven't tried with other implementations yet, but when this > would turn out to be working with all usual implementations, > I might even be OK with the word-joiner. > > But this is where the agreement ends. > > - If at all, the word-joiner insertion would need to be > limited to ASS output ONLY > - it would need to be controllable through an option in the ASS > encoder > - The word joiners should not be used in internal processing and > only be (optionally) added when encoding to ass > - Unfortunately, the FATE tests for the other subtitle formats > do not include these sequences in the test source files, and > that means, before making such change that might potentially > affect all other text subtitle encoders, those sequences would > need to be added to make sure these conversion won't be affected > - Generally, those changes (also the BIDI mark insertion) should > be evaluated with regards to all text subtitle encoders, > making sure there's no side effect. > > You said: > > > I’m not interested in reworking ffmpeg’s internal subtitle handling. > > The proposed patch is a clear improvement over the status quo which > > is plain incorrect. Within reasonable effort and sound arguments for > > it adjustments to the patch can be made; reworking ffmpeg internals > is > > imo not “reasonable” effort to correct an uncontestedly wrong > escape. > > And that is a problem. The changes you are proposing are making > changes to ffmpeg’s internal subtitle handling, so you need to decide > whether you want to work on it or not. > > > > You have two options: > [..] > > Or go ahead and create your own patch. > > I will do this, but "only" on top of my subtitle filtering patchset > because that's my current focus area and just two weeks ago I had to > add a temporary hack for a related case which is about ASS dialog > lines > like: > > Dialogue: 0,0:00:00.00,0:00:05.00,Default,,0,0,0,,{comment text: > \....} > > Currently, ffmpeg does not recognize this as override code, even > though > it's valid in ASS that a backslash with the actual code doesn't appear > immediately after the opening curly brace. > What comes on top of this is that other subtitle decoders do NOT > escape > the curly braces like WebVTTdec and ass_bprint_text_event(). > When you look at SubRip_capability_tester.srt from the FATE suite, you > can see that it contains ASS codes that are expected to be recognized > and applied, but when there's normal text in curly braces without > a backslash, it should be treated at normal text. > > This is quite a mess that needs to be cleaned up with a plan and it's > all related. Like I said already: A central point to this is the > escaping > and what's needed is a solution that can cover all of those things. > > I had put this subject aside as I've been unsure about how to do it, > but this discussion has been very helpful to see the issues more > clearly. > > How about separating the BIDI part from your patch? I'd see only two > things remaining: > > - Go through all text subtitle encoders and think/research whether > those marks would be acceptable in those formats or whether they > would need to be removed (like now) > - Think about whether the Unicode bidi marks should be replaced back > to the html-style codes > It's not these wouldn't work, but it's again about visibility and > I tend to think that it would be preferable to have them visible > in the output > > softworkz
What I would also find acceptable is to just remove the escaping of backslashes in ff_ass_bprint_text_event() without adding a word-joiner, as this is clearly nonsense and it would still be an improvement. sw _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".