On Mon, Jun 20, 2016 at 4:33 PM Michael Niedermayer <mich...@niedermayer.cc> wrote:
> On Mon, Jun 20, 2016 at 09:54:15AM +0000, Davinder Singh wrote:
> > On Sat, Jun 18, 2016 at 3:16 AM Michael Niedermayer
> > <mich...@niedermayer.cc> wrote:
> >
> > > On Fri, Jun 17, 2016 at 08:19:00AM +0000, Davinder Singh wrote:
> > > [...]
> > > > Yes, I did that, after understanding it completely. It now works
> > > > with the motion vectors generated by the mEstimate filter. Now I’m
> > > > trying to improve it based on this paper: Overlapped Block Motion
> > > > Compensation: An Estimation-Theoretic Approach
> > > > <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.112.8359&rep=rep1&type=pdf>
> > >
> > > this is 22 years old
> > >
> > > > and this one: Window Motion Compensation
> > > > <https://www.researchgate.net/publication/252182199>. Takes a lot
> > > > of time
> > >
> > > this is 25 years old
> > >
> > > not saying old papers are bad, just that this represents the knowledge
> > > of 20 years ago
> > >
> > > also it's important to keep in mind that blind block matching of any
> > > metric will not be enough. To find true motion the whole motion
> > > vector fields of multiple frames will need to be considered
> > >
> > > For example a ball thrown across the field of view entering and
> > > exiting the picture needs to move smoothly and at the ends (in time)
> > > there are frames without the ball then a frame with the ball
> > > these 2 are not enough to interpolate the frames between as we have
> > > just one location where the ball is. With the next frames though
> > > we can find the motion trajectory of the ball and interpolate it end
> > > to end
> > >
> > > I think papers which work on problems like this and also interpolation
> > > of all the areas that end up overlapping and covering each other
> > > like the background behind the ball in that example would be better
> > > starting points for implementing motion estimation because ultimately
> > > that is the kind of ME code we would like to have.
> > > Block matching with various windows, OBMC, ... are all good but
> > > if in our example the vectors for the ball or background are off that
> > > will look rather bad with any motion compensation
> > > So trying to move a bit toward this would make sense but first
> > > having some motion estimation even really basic and dumb with
> > > mc working in a testable filter (pair) should probably be done.
> > > I am just mentioning this as a bit of a preview of what i hope could
> > > eventually be implemented, maybe this would be after GSoC but it's
> > > the kind of code needed to have really usable frame interpolation
> > >
> > > > reading them. I think we need to add a new Raised Cosine window
> > > > (weights) along with the Linear Window (currently implemented).
> > > > What do you say?
> > >
> > > i don't know, the windows used in snow are already the best of several
> > > tried (for snow).
> > > no great gains will be found by changing the OBMC window from snow.
> > >
> > > > Also making mInterpolate work with variable macroblock size MC. The
> > > > current interpolation works without half pel accuracy, though.
> > >
> > > mcfps has fully working 1/4 pel OBMC code, that should be fine to be
> > > used as is i think unless i miss something
> > >
> > > half pel is 20 years old, it is not useful
> > > multiple block sizes on the MC side should not really matter ATM
> > > smaller blocks are a bit slower but first we should get the code
> > > working, then working with good quality and then working fast.
> > >
> > > multiple block sizes may be useful for the estimation side if it
> > > improves estimation somehow.
> > >
> > > Can i see your current "work in progress" ?
> > >
> > > [...]
> > > > I’m moving the estimation code to a new file, motion_est.c, and the
> > > > methods are shared by both the mEstimate and mInterpolate filters.
> > > > mEstimate stores the MVs in the frame’s side data for any other filter.
> > > > Moreover, if any other filter needs post-processing on the MVs, it
> > > > can directly use the shared methods. But mInterpolate uses them
> > > > internally, with no saving in side data, which avoids unnecessary
> > > > processing.
> > >
> > > This design sounds good
> > >
> > > > Also, Paper [1] doesn’t use a window with OBMC at all. It just finds
> > > > the normal average without weights. Perhaps to compare papers I either
> > > > need to add multiple options for each setting or need to name the
> > > > algorithm after the researcher in the filter options.
> >
> > Paper [1] and [2] use functions or do post-processing on motion vectors,
> > so they need fast ME algorithms, which I’m currently working on. [*M]
> >
> > Let me summarize the papers (from Email 1, this thread):
> >
> > Paper [1]: Zhai et al. (2005) A Low Complexity Motion Compensated Frame
> > Interpolation Method
> >
> > [Quote]
> > This paper proposes an MCFI method intended for real time processing. It
> > first examines the motion vectors in the bitstream [*1]. 8x8 block size
> > is used rather than 16x16 as in most cases; using a smaller block size
> > leads to a denser motion field, so neighboring MVs are more highly
> > correlated, so prediction is better. To reduce complexity, MVs in the
> > bitstream are utilized [*1]. But they need to be filtered as not all of
> > them represent true motion. They are grouped into “good vectors, can be
> > used directly” and “bad vectors, need to find true motion”. For
> > classification of MVs into groups, SAD and BAD are used. For an 8x8 block
> > in the to-be-interpolated frame F(in) we get the motion vector MV of the
> > block at the same location in the current frame. If F(in) is exactly in
> > the middle of F(prev) and F(cur), then MV/2 points to a block in the prev
> > frame & -MV/2 points to a block in the current frame from F(in). Then the
> > SAD & BAD of both of these blocks are compared to certain thresholds [*2],
> > based on which blocks are classified.
> > For bad ones, overlapped block bi-directional motion estimation (OBBME)
> > is carried out to find true motion. In OBBME, the size of the block in
> > F(in) is enlarged to 12x12, then bi-directional ME is performed to get
> > the MV that minimizes the diff. between the two blocks located at MV/2 &
> > -MV/2 in F(prev) & F(cur) wrt the current block in F(in). The diff. is
> > calculated by eq (1) in the paper. Like in BMA, we can use any fast ME
> > algo here [*M]. After this, there are still a few bad MVs left. For
> > those, post-processing is performed on MVs that break the continuity. We
> > calculate the variation of each motion vector and its neighboring MVs. If
> > the variation exceeds a certain threshold, the MV is regarded as a single
> > bad motion vector and then vector median filtering is applied. It finds
> > the one vector among 8 that minimizes eq (2). Finally, OBMC is applied.
> > No weights are used [*3]. Pixels are simple averages given by eq 4-6.
> > [/Quote]
> >
> > [*1] We can for now use motion vectors generated on the filter side. As
> > you suggested, later we can use the decoder’s vectors.
> > [*2] Threshold values are not given in the paper :(
> > [*3] Initially, we can test using the generated/refined motion vector
> > field with the currently implemented window based OBMC. Later, to reduce
> > complexity, we can use their method.
> >
> >
> > Paper [2]: Choi et al. (2007) Motion-Compensated Frame Interpolation
> > Using Bilateral Motion Estimation and Adaptive Overlapped Block Motion
> > Compensation
> >
> > [Quote]
> > This algorithm has four steps. First, we propose the bilateral ME scheme
> > to obtain the motion field of an interpolated frame. Then, we partition a
> > frame into several object regions by clustering MVs. We apply the
> > variable-size block MC (VS-BMC) to object boundaries in order to
> > reconstruct edge information with a higher quality.
> > Finally, we use the adaptive overlapped block MC, which adjusts the
> > coefficients of overlapped windows based on the reliabilities of
> > neighboring MVs. The adaptive OBMC (AOBMC) can overcome the limitations
> > of the conventional OBMC, such as over-smoothing and poor de-blocking.
> > I. We perform bilateral ME, which prevents the overlapping and hole
> > problem [*4] by estimating the motion vectors of the interpolated frame
> > directly. If the conventional BMA is used to find a block-wise motion
> > vector field between the previous frame and the current frame, the motion
> > trajectories may not cover all pixels in the interpolated frame,
> > consequently yielding hole regions. In addition, multiple trajectories
> > may pass through the same pixel, causing overlapping regions. Therefore,
> > we should estimate the motion vectors for the blocks in the interpolated
> > frame, instead of using the motion vectors between the previous frame and
> > the current frame. In the proposed bilateral ME we obtain the MV by
> > comparing a block at a shifted position in F(prev) and another block at
> > the opposite position in F(curr), by minimizing SAD [*5][*M]. Since there
> > can be multiple trajectories through the current block, we impose a
> > spatial smoothness constraint to improve the robustness of ME. The SMD is
> > calculated, which is the avg. between abs. px. values at the boundary of
> > the predicted and neighboring blocks. We find the best MV by minimizing
> > the weighted sum of SAD & SMD given by eq (6).
> > II. Then MBs are classified into clusters according to their MVs. [TL;DR]
> > First, all MBs are considered as a single object and the cluster center
> > is set to the avg. MV of the blocks. If the diff. b/w a block's MV and
> > the cluster center exceeds the threshold T (=8), the block belongs to a
> > new object. The avg. MV of the blocks in the new object is set as a new
> > cluster center. Each cluster center is updated to the avg. of the MVs in
> > the cluster.
> > Steps 2–4 are iteratively repeated until there is no change in the
> > cluster centers. [/TL;DR]
> > III. To express complex motions, we adopt VS-BMC to reconstruct boundary
> > blocks. We adopt a quadtree-based VS-BMC, which divides an 8x8 boundary
> > block into 4x4 or 2x2 sub-blocks. [*6] Then we find MVs for the
> > sub-blocks using SAD like before; if the new MV is less than 1/4MV of the
> > orig. block, accept the division and iterate by subdividing it further,
> > or terminate the procedure.
> > IV. Finally adaptive OBMC is used with a window such as raised cosine
> > [*7]. Conventional OBMC can yield blurring or over-smoothing artifacts.
> > AOBMC reconstructs the interpolated frame faithfully by controlling the
> > weights of overlapping windows according to the reliabilities of MVs. See
> > Fig. 6 & 7.
> > [/Quote]
> >
> > [*M] The shared methods from motion_est.c will allow this without
> > repetition of code.
>
> Just keep in mind the motion estimation we have is a bit mpeg centric
> so block sizes below 8x8 will not work with all the routines we have
>
> > [*5] It is very similar to OBBME used in Paper [1] except the block size
> > is not changed.
> > [*6] This requires the current mInterpolate code to support variable size
> > OBMC.
> > [*7] We could use the linear window instead of the raised cosine one. But
> > too late, I already implemented it.
>
> :)
>
> > Another interesting paper I found is 3D recursive search. It's a little
> > old but very popular.
> > See images here:
> > http://i65.tinypic.com/zkfgox.png
> > http://i67.tinypic.com/2dihmb7.png
> > http://i65.tinypic.com/rgw38n.png
>
> interesting
> one thing that is very noticeable on this though is that what they
> use as comparison (full search) in these 3 images is a lot worse than
> what modern encoders use (rate distortion based predictive zonal ME)
> this shouldn't matter much but i wanted to point out that it's not
> possible from this to conclude how these relate to what a modern
> video encoder would use as "full search"

All the videos they use in the papers are available here:
https://media.xiph.org/video/derf/
They can be used to compare 3DRS vs. the MVs generated by EPZS in a modern
codec. Can the +export_mvs vectors come from EPZS?

> > Paper [3]: de Haan et al. (1994) True Motion Estimation with 3D Recursive
> > Search Block Matching
> > (http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=246088)
> > Gonna read now. It has unusual notation.
> >
> > Once we implement these, then we can deal with objects entering or
> > exiting the screen. I think it is the hole or overlapping problem
> > addressed in paper [2]; several approaches have been proposed to handle
> > it, like median filtering, spatial interpolation
> > (http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=389461) or MC using
> > neighboring motion fields. Will look into it more. The hole or
> > overlapping problem is handled by the bilateral motion estimation used in
> > paper [2] (*4).
> >
> > Also have to handle scene changing issues. I read in some paper that they
>
> yes, scene changing will need to be handled too, it was a problem in
> mcfps too
> the quick solution is probably to just detect by some threshold that
> there is a scene change and then set all MVs to 0,0 that will look
> a lot better than random bits of images randomly moving and merging
> into each other
>
> > are too computationally expensive.
> >
> > Which one do you think we should start with? I think it should be 3DRS.
> > 3DRS is the fastest of these three. Paper 2 compares the results of all
> > three. 3DRS is around 16fps, [1] is ~7fps, [2] is ~3fps. Paper 2
> > outperforms both of them.
>
> is the full text of paper 2 available somewhere ?

http://www.mediafire.com/?nxmx358680k0d90
couldn't find the original link

> also motion trajectories should be interpolated through more than
> 2 frames, i don't know if the quoted papers do that but
> vf_mcfps already provides the framework for this (aka it's necessary
> to have 2 future and past frames available)
> a random paper which seems to compare linear vs cubic shows very
> significant gains
> http://www.ripublication.com/ijaer16/ijaerv11n10_42.pdf
> I don't know if that paper is good or not but for example a ball (to
> keep the example used previously) would move along the edges of a
> polygon if linear MV interpolation is used, That might work with a
> slow moving ball but a spinning wheel should be heavily distorted
> with interpolation along straight lines.
>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> it is not once nor twice but times without number that the same ideas make
> their appearance in the world. -- Aristotle
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
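
To make the linear vs cubic point about the spinning wheel concrete, here is a minimal sketch. It is not taken from the linked paper or from vf_mcfps. It assumes a uniform Catmull-Rom spline as the "cubic" case, and the helper names (lerp, catmull_rom, point_on_wheel) are hypothetical. A point on a wheel of radius 1 is sampled at 4 consecutive frames (2 past, 2 future relative to the gap), and the position halfway between the two middle frames is interpolated both ways.

```python
import math

def lerp(p1, p2, t):
    """Linear interpolation between two (x, y) positions."""
    return tuple(a + (b - a) * t for a, b in zip(p1, p2))

def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom spline between p1 and p2 (assumed cubic model).

    p0 and p3 are the neighbouring samples that give the curve its bend.
    """
    def axis(a, b, c, d):
        return 0.5 * (2 * b
                      + (c - a) * t
                      + (2 * a - 5 * b + 4 * c - d) * t ** 2
                      + (3 * b - a - 3 * c + d) * t ** 3)
    return tuple(axis(a, b, c, d) for a, b, c, d in zip(p0, p1, p2, p3))

def point_on_wheel(k, step_deg=45.0):
    """Position of a point on a unit wheel at frame k (45 degrees per frame)."""
    ang = math.radians(k * step_deg)
    return (math.cos(ang), math.sin(ang))

# Four consecutive frames; the frame to interpolate lies between the middle two.
frames = [point_on_wheel(k) for k in (-1, 0, 1, 2)]

mid_linear = lerp(frames[1], frames[2], 0.5)
mid_cubic = catmull_rom(*frames, 0.5)

# Distance from the wheel centre; the true value is 1.0.
radius_linear = math.hypot(*mid_linear)
radius_cubic = math.hypot(*mid_cubic)

print(f"linear radius: {radius_linear:.4f}")
print(f"cubic radius:  {radius_cubic:.4f}")
```

With 45 degrees of rotation per frame, the linearly interpolated point ends up at a radius of about 0.924 (it cuts the corner of the polygon), while the 4-frame cubic lands at about 0.992, much closer to the true circle.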