2019-04-09 4:54 GMT+02:00, Song, Ruiling <ruiling.s...@intel.com>:
>> > +kernel void vert_sum(__global uint4 *ii,
>> > +                     int width,
>> > +                     int height)
>> > +{
>> > +    int x = get_global_id(0);
>> > +    uint4 sum = 0;
>> > +    for (int i = 0; i < height; i++) {
>> > +        ii[i * width + x] += sum;
>> > +        sum = ii[i * width + x];
>>
>> This looks like it might be able to overflow in extreme cases?
>>
>> 3840 * 2160 * (1 - 0)^2 * 255 * 255 = 539,343,360,000, which is a
>> long way out of range for a 32-bit int.  That requires impossible
>> input (all pixels differing by the most extreme value), but
>> something like a chequerboard might be of the same order?

> Yes, this is a dilemma for me.  The filter is generally expensive to
> compute.  To fix the overflow, we would have to use 64-bit integers
> for the integral image, and I think most GPUs are not good at 64-bit
> integer calculation.  Maybe we can try that later; for now I would
> prefer to stay with 32-bit integers.
Can the overflow be detected at runtime?
Could the user choose between 32- and 64-bit calculation?

Carl Eugen
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
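A host-side check along the lines Carl Eugen suggests could be done up front from the frame dimensions alone, since the worst case is known before the kernel runs. The helper below is a hypothetical sketch (the name `integral_needs_64bit` and the assumption of 8-bit samples are not from the patch); the caller would use its result to select a 32-bit or 64-bit variant of the kernel:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side check: can the 32-bit integral image
 * overflow for this frame size?  Assumes 8-bit samples, so the
 * maximum squared difference per pixel is 255 * 255.  If this
 * returns true, the caller would dispatch a 64-bit (ulong-based)
 * variant of the kernel instead of the uint4 one. */
static bool integral_needs_64bit(int width, int height)
{
    uint64_t worst = (uint64_t)width * (uint64_t)height * 255 * 255;
    return worst > UINT32_MAX;
}
```

For example, `integral_needs_64bit(3840, 2160)` is true, while a small 256x256 frame stays within 32-bit range even in the worst case. Note this is conservative: it flags any frame that *could* overflow, not frames whose actual content does, so detecting overflow from real pixel data would still require extra work inside the kernel.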