Thanks for the review.
I fixed most of the things you reported. I included my rationale for
some of the things I disagree with.
Regards,
Laurent
+// REVIEW. After reading the code multiple times, I am not a fan of this
syntax.
+// Having to interpret the save_flag when reading the code removes a
+// layer of clarity. I would prefer a function or a macro that calls
+// this function, adding the save_flag so that when you read the code,
+// you immediately know if you are saving to the stash, or loading from
+// it. The macro approach might best here.
%%% Done.
+ // REVIEW. Are we always going to assume 4:2:0 subsampling? We should
+ // probably add a FIXME tag here for further modifications.
%%% Done.
+ // REVIEW. This value is constant. Consider declaring it outside the
+ // loop, or directly after comp in the loop declaration.
+ // for (int comp = 0, ref_stride = t->me.ref_stride;
%%% Done.
+// REVIEW. Constants related to the stash flags needs to be define. Reading
+// stash_flags = 16|8; is not insightful. Instead, we should have
+// something like stash_flags = F265_STASH_CABAC|F265_STASH_TT;.
+// Please change this in the f265 analysis. I don't expect you to apply
+// the changes to the HM emulation analysis code.
%%% Done.
+ // REVIEW. Are we always going to assume 4:2:0 subsampling? Might want to
+ // add a FIXME tag for future modifications.
%%% Done.
+
+ // REVIEW. From compactness, simply use t->me.ref_stride in the function
+ // call below.
%%% Done.
+ // REVIEW. Lamba is in 256 fractions. Bits are in 32768 fractions. Why keep
+ // an extra 5 bits of precision? Please document.
%%% Done.
+ // REVIEW. Are we always going to assume 4:2:0 subsampling? Add a FIXME tag
+ // for later modifications.
%%% Done.
+ // REVIEW. Replace 1<<lg_bs by bs. It's a local variable.
%%% Done.
+ // REVIEW. ((*t->tt.tn++)>>3)&1 also gets the job done (in one less op).
%%% I don't think that's the case, but I fixed it anyway.
// Get the split flag. Skip the 4x4 subblocks if present.
int split_flag = !!((*t->tt.tn++)&8);
+// REVIEW. Please elaborate the strategy of signalling a max cost. We use this
+// to favour skip over merge, and cbf_root=0. The idea is to select the
+// cheapest, most explicit signalling.
%%% Done.
+ // REVIEW. Please add a quick comment to say that we avoid complex
+ // signalling when all we need is to signal no residual.
%%% Done.
+ // REVIEW. This function is only called from fenc_analyze_ctb. The
ambiguity
+ // flag could be computed there and simply passed along as a
+ // parameter. From the looks of things, this value will always be
+ // the same. In fact, you might want to consider two separate
+ // functions: one for the ambiguous case and one for the
unambiguous
+ // case. This would lead to 2 smaller recursive functions without
+ // branching and very little redundant code: the 2 conditions
above.
%%% Kept as-is for now, this is experimental code. Added a comment.
+ // REVIEW. Please clean up. If the branchy version is truly better,
+ // remove the MIN_MAX case. If there is little difference,
+ // I would opt for the MIN_MAX approach to stay consistent
+ // with the code block above.
%%% Done.
+ // REVIEW. These comment lines exceed 80 characters.
%%% Done.
+ // REVIEW. Guard against cur_mode < 2 or cur_mode > 34? In
practice,
+ // this may never happen.
+ // REVIEW. Guard against cur_mode < 2?
+ // REVIEW. Guard against cur_mode > 34?
+ // REVIEW. Guard against init_mode - i - 1 < 2?
+ // REVIEW. Guard against init_mode + i + 1 > 34?
%%% Cannot happen by design.
+// REVIEW. I'm not convinced of the gain we get from separating the case we
+// only want to keep 1 mode from the alternative case of keeping 3. We
+// could always run the second part of the loop, and simply change the
+// loop that transfers the best candidates to the intra block from
+// for (int i = 0; i < 3; i++) ib->cands[i] = best_modes[i];
+// to
+// for (int i = 0; i < nb_kept; i++) ib->cands[i] = best_modes[i];
+// To be on the safe side, we could run F265_MIN(3, nb_kept).
+// Unless the code really improves performance, I would prefer less
code
+// with the extra conditions.
%%% There are semantic differences. Notice there is no save operation in the
second version.
+ // REVIEW. I don't understand the need for this initialization block.
+ // rdm_base_algo is sure to be overwritten. rdm_angular_algo
+ // will also be overwritten. Please remove any unnecessary
+ // assignments to zero. It will make the function more compact.
%%% Experimental code, all of this will go away. Initializing helps removing
bugs.
+ // REVIEW. s/algorihtm/algorithm/.
%%% Done.
+ // REVIEW. See note above about naming the stash flags.
%%% Done.
+ // REVIEW. Consider caching lg_bs-1 here. It's used multiple times
+ // below.
%%% The compiler can do that. I don't believe adding a variable here helps
readability.
+ // REVIEW. Please use an->rdm_flag for consistency.
+ // REVIEW. Please use an->rdm_flag for consistency.
%%% Done.
+ // REVIEW. Why not simply cache t->inter_block.neighbours? We use it twice?
%%% Done.
+ // REVIEW. These two exact statements appear below in
+ // fenc_analyze_un_merge_rdo_all. I checked and this function is
+ // only called from there. Calling fenc_analyze_get_un_merge_cands
+ // does not change the flags, nor the inter partition type. You can
+ // safely remove these statements. Please move the comment above
the
+ // statements in fenc_analyze_un_merge_rdo_all.
%%% This code path is not the only one possible. It is legal to call
fenc_analyze_un_merge_rdo() from another function than
%%% fenc_analyze_un_merge_rdo_all(). It has to do the right thing without
assuming that the CB flags are correctly set.
+ // REVIEW. Please refine the comment. Merge + no residual = skip.
+ // Favour skip in such cases.
%%% Done.
+ // REVIEW. It took me a while to understand the "update".
+ // Reading through the code, I was lead to believe that
+ // merge_idx_ctx and contexts[F265_CO_MERGE_IDX] were
+ // identical at this point. It isn't the case because
+ // calling fenc_stash_reset reverts the CABAC contexts
+ // to their state prior to signalling the merge_idx.
+ // The comment above merge_idx_ctx's declaration as
well
+ // as the one atop this line need to be refined to help
+ // the reader catch on quickly.
%%% Done.
+ // REVIEW. Add a FIXME tag to the optimization comment.
%%% Done.
+
+ // REVIEW. Can you move this closer to the children loop to put emphasis on
+ // this being a bookmark rather than a local variable to write more
+ // compact expressions (like the "an" variable above).
%%% Done, added an explicit comment instead.
+ // REVIEW. Adding details to understand these costs will make it
+ // easier to understand. No need to add much details.
+ // Something like this would suffice:
+ // split_cu_flag=0, cu_skip_flag=1.
%%% Done.
+ // REVIEW. For code clarity, could you define a local best_mode variable
and
+ // avoid the ternary operator as conditional statements?
+ // Assuming a small type (see REVIEW note below), all 4 ternary
+ // operators could be replaced by it.
%%% Done, that was stupid logic there.
+ // REVIEW. I don't follow why rdm_inter_mode is being used here
+ // rather than rdm_best_mode. The call to
+ // fenc_analyze_import_inter_mode uses rdm_best_mode or
+ // rdo_best_mode.
%%% Done.
+ // REVIEW. These comment lines exceed 80 characters.
%%% Done.
+ // REVIEW. Even if this fits in 120 characters, it is not fun to read.
+ // Can you please change it to
+ // if (cond)
+ // {
+ // if (cond) statement;
+ // else statement;
+ // }
+ //
+ // Alternatively, you may want to simply add an
+ // fenc_analyze_inter_cb function and hide the conditional call.
+ // You can make it an inline function. I would prefer this as it
+ // would match the call to fenc_analyze_intra_cb.
%%% Temporary code until next patch, useful as it is because you can swap lines
in one shot to experiment.
+ // REVIEW. For compactness, please use gd->algo. gd is fetched above.
%%% Done.
+ // REVIEW. For readability, please define (1<<12) in bdi.h.
%%% I disagree. The semantics of the "algo" fields are brittle and change all
the time by design. I don't want to make them
%%% more permanent than required.
an->rdm_rec_flag = F265_GET_FLAG(t->enc->gd.algo, (1<<12));
+ // REVIEW. I understand that the following assignments are an infinitely
+ // small part of the analysis, but we should simply copy them to
the
+ // analysis structure during the encoding thread's initialization.
+ // That is unless we plan to play with the analysis depths to speed
+ // up the analysis.
%%% It is marked FIXME, this is going to change.
// Set the analyzed TB range to the actual parameter values.
// FIXME, precompute tb_depth.
// - intra[lg_bs-2][2];
+ // REVIEW. t->store = t->cba.
%%% Done.
+ // REVIEW. For consistency, please use an->rdm_flag instead of
+ // t->an.rdm_flag.
%%% Done.