Sender:Jan Hubicka <[email protected]> Sent at:2018 Nov 5 (Mon) 22:21 To:Richard Biener <[email protected]> Cc:bin.cheng <[email protected]>; GCC Patches <[email protected]> Subject:Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count > > > On Wed, Oct 31, 2018 at 7:30 AM bin.cheng <[email protected]> > > wrote: > > > > > > Hi, > > > In new profile probability/count infra, we have different precision > > > quality categories, > > > and probabilities/counts of different categories are not supposed to be > > > compared or > > > calculated. Though in general is an improvement, it introduces > > > unexpected behavior. > > > Specifically, class profile_probablity and profile_count themselves are > > > implemented > > > by comparing probabilities/counts against profile_count::zero(). while > > > zero() is of > > > profile_precision category, it's always compared different to zero of > > > other precision > > > categories including afdo. > > > > > > I can see two ways fixing this: 1) Treat zero as a common > > > probability/count regardless > > > of its category; 2) Provide an "is_zero" method rather than relying on > > > "==" comparison > > > against probability_count::zero(). 2) requires lots of code changes so I > > > went with 1) > > > in this patch set. This patch doesn't handle "always" but it might be. > > > > > > This patch also corrects a minor issue where we try to invert an > > > uninitialized value. > > > > > > Bootstrap and test on x86_64 in patch set. Is it OK? > > > > I'll defer on the emit_store_flag_force change, likewise for the zero > > handling in > > compares - I don't think zeros of different qualities should compare equal. > > Would compares against ::always() not have the very same issue? > > Likewise ::even(), > > ::likely(), etc.? Those always get guessed quality. > > > > The invert change looks OK to me. The related change to the always() API > > would > > suggest to replace guessed_always() with always (guessed) and also do > > similar > > changes throughout the whole API... > > > > Honza? > > The zeros are really differenct zeros. profile_count::zero makes us to > drop the basic block into cold section because we know that it won't be > executed in normal run of program (either we have accurate profile > feedback or by proving that the program is on way to crash or user > annotated cold section). Having guessed zero or auto-fdo zero won't > make us to do such agressive size optimization. > This is important since those zeros relatively commonly happens by > accident and thus if we dropped all the code to cold section the cold > section would be visited relativel often during execution of program > which would eliminate its need. > > Most comparsion in profile-count.h which goes agains profile_count==zero > are realy intended to pass only for this "aboslute zero". They bypass > the precision adjusmtents which normally happen when you merge values > of different precision. > > What kind of unexpected behaviour are you seeing? > We already have nonzero_p which is what we use when we want to know that > count is non-zero in some sense of precision. Hi Honza, Sorry for letting this slip away. So in case of AutoFDO, due to the nature of sampling, lots of funcs/bbs are annotated with zero profile_count in afdo precision, and we have checks against zero profile_count in precise precision All these checks end up with false and cause issues. Take the code in update_profiling_info as an example:
update_profiling_info (struct cgraph_node *orig_node,
struct cgraph_node *new_node)
{
struct cgraph_edge *cs;
struct caller_statistics stats;
profile_count new_sum, orig_sum;
profile_count remainder, orig_node_count = orig_node->count;
if (!(orig_node_count.ipa () > profile_count::zero ()))
return;
//...
for (cs = new_node->callees; cs; cs = cs->next_callee)
cs->count = cs->count.apply_scale (new_sum, orig_node_count);
Since we also have below code in profile_count::operator>,
if (other == profile_count::zero ())
return !(*this == profile_count::zero ());
If orig_node_count is afdo zero, the above zero check for orig_node_count
returns false, we end up with passing zero density to apply_scale issue and
asserting.
In this updated patch, I restrcited changes only to profile_count::operator
<, >, <= and >=. Plus, I think there is a latent typo in operator>= because
current code return TRUE if '*this' is precise zero and 'other' is precise
non-zero.
@@ -879,7 +879,7 @@ public:
if (other == profile_count::zero ())
return true;
if (*this == profile_count::zero ())
- return !(other == profile_count::zero ());
+ return !other.nonzero_p ();
Bootstrap and test on x86_64 along with other patches.
Thanks,
bin
2018-11-19 Bin Cheng <[email protected]>
* profile-count.h (profile_count::operator<, >, <=): Check ZERO count
using nonzero_p.
(profile_count::oeprator>=): Invert return condition when *this is
precise zero. Check ZERO count in that condition using nonzero_p.
0003-Check-ZERO-profile-count-regardless-of-precision.patch
Description: Binary data
