On date Monday 2025-04-28 20:40:37 +0000, softworkz . wrote:
[...]
> > > > >      /* validate replace string */
> > > > >      {
> > > > > -        const uint8_t *p = tctx->string_validation_replacement;
> > > > > -        const uint8_t *endp = p + strlen(p);
> > > > > +        const uint8_t *p = (uint8_t *)tctx-
> > >string_validation_replacement;
> > > > > +        const uint8_t *endp = p + strlen((const char *)p);
> > > > >          while (*p) {
> > > > >              const uint8_t *p0 = p;
> > > > >              int32_t code;
> > > > >              ret = av_utf8_decode(&code, &p, endp, tctx-
> > > > >string_validation_utf8_flags);
> > > > >              if (ret < 0) {
> > > >
> > > > >                  AVBPrint bp;
> > > > > -                av_bprint_init(&bp, 0, AV_BPRINT_SIZE_AUTOMATIC);
> > > > > +                av_bprint_init(&bp, 0, AV_BPRINT_SIZE_UNLIMITED);
> > > > >                  bprint_bytes(&bp, p0, p - p0),
> > > > >                      av_log(tctx, AV_LOG_ERROR,
> > > > >                             "Invalid UTF8 sequence %s found in string
> > > > validation replace '%s'\n",
> > > > >                             bp.str, tctx-
> > >string_validation_replacement);
> > > > > -                return ret;
> > > > > +                av_bprint_finalize(&bp, NULL);
> > > >
> > 
> > > > I see no advantage in this, since it's adding dynamic allocation and
> > > > the need to free which was previously not needed
> > >
> > > The dynamic allocation does not happen until the buffer content size
> > > does not fit anymore into the stack-alloced memory.
> > 
> > Yes, but is this ever happening? 
> 

> Are we sure that it's not? Because unless we are 100% sure, we need
> to check the return value and when that check indicates an overflow
> of the buffer - what then? How to remedy?
> 
> The check for the return value alone - even without any remediation -
> is more complicated code (more lines) than a single line with
> av_bprint_finalize().
> 
> That's my general rationale behind favoring _UNLIMITED.
> 
> There are more cases where I have that, and where it's really relevant
> but for this one, I'll just remove the change, it's not worth the 
> discussion.

So I checked the code, a valid UTF8 sequence should be no longer than
six bytes, and reading the implementation of av_utf8_decode I could
not quickly find out the maximum possible size of an invalid sequence,
so I guess your argument has a point in case of crafted sequences -
although possibly we might compute the maximum size and it might be
small (assuming there are no mistakes in the implementation and no
regressions are introduced). Even in that case, the worst it might
happen would be that the sequence is truncated, so this should have no
security implications anyway - while allocation might fail (in case of
a very looong crafted string), leading again to truncation.

That said, I'm not against this change but it should be moved to a
dedicated change.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Reply via email to