* Sean Christopherson <sean.j.christopher...@intel.com> wrote:

> On Fri, Oct 04, 2019 at 07:39:08AM -0700, Dave Hansen wrote:
> > On 10/4/19 6:45 AM, Changbin Du wrote:
> > > +static inline bool is_canonical_addr(u64 addr)
> > > +{
> > > +#ifdef CONFIG_X86_64
> > > + int shift = 64 - boot_cpu_data.x86_phys_bits;
> > 
> > I think you mean to check the virtual bits member, not "phys_bits".
> > 
> > BTW, I also prefer the IS_ENABLED(CONFIG_) checks to explicit #ifdefs.
> > Would one of those work in this case?
> > 
> > As for the error message:
> > 
> > >  {
> > > - WARN_ONCE(trapnr == X86_TRAP_GP, "General protection fault in user access. Non-canonical address?");
> > > + WARN_ONCE(trapnr == X86_TRAP_GP, "General protection fault at %s address in user access.",
> > > +           is_canonical_addr(fault_addr) ? "canonical" : "non-canonical");
> > 
> > I've always read that as "the GP might have been caused by a
> > non-canonical access".  The main nit I'd have with the change is that
> > when a user access function is given a non-canonical address and takes
> > a #GP, I don't think the address *necessarily* caused the #GP.
> > 
> > There are a billion ways you can get a #GP and I bet canonical
> > violations aren't the only way you can get one in a user copy function.
> 
> All the other reasons would require a fairly egregious kernel bug, hence
> the speculation that the #GP is due to a non-canonical address.  Something
> like the following would be more precise, though highly unlikely to ever
> be exercised, e.g. KVM had a fatal bug related to injecting a non-zero
> error code that went unnoticed for years.
> 
>       WARN_ONCE(trapnr == X86_TRAP_GP, "General protection fault in user access. %s?\n",
>                 (IS_ENABLED(CONFIG_X86_64) && !error_code) ? "Non-canonical address" :
>                                                              "Segmentation bug");

Instead of guessing the reason for the #GP (and the guess might be 
wrong), please state the cause declaratively only when we are sure it is 
a non-canonical address. Otherwise provide a best guess, but clearly 
signal that it's a guess.

I.e. if I understood all the cases correctly we'd have three types of 
messages generated:

 !error_code:
        "General protection fault in user access, due to non-canonical address."

 error_code && !is_canonical_addr(fault_addr):
        "General protection fault in user access. Non-canonical address?"

 error_code && is_canonical_addr(fault_addr):
        "General protection fault in user access. Segmentation bug?"

Only the first one is declarative, because we know we got a #GP with a 
zero error code which should denote a non-canonical address access.

The second and third ones are guesses with question marks to communicate 
the uncertainty.

This assumes that !error_code always means a non-canonical access. Is 
that correct?

And hopefully "!error_code && is_canonical_addr(fault_addr)" is not 
possible?
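
A rough sketch of the three cases, written as user-space C rather than 
kernel code so it can be compiled standalone. The names VIRT_BITS and 
gp_fault_message() are made up for illustration, and the fixed value of 48 
virtual address bits is an assumption (4-level paging); in the kernel the 
width would come from the virtual bits member of boot_cpu_data, as Dave 
pointed out, and the message would be passed to WARN_ONCE():

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual address bits (4-level paging). */
#define VIRT_BITS 48

/*
 * An address is canonical when bits 63..(VIRT_BITS - 1) are all equal,
 * i.e. the upper bits are a sign extension of bit (VIRT_BITS - 1).
 */
bool is_canonical_addr(uint64_t addr)
{
	uint64_t top = addr >> (VIRT_BITS - 1);

	return top == 0 || top == (UINT64_MAX >> (VIRT_BITS - 1));
}

/*
 * Hypothetical helper: pick the message for a #GP taken in a user access.
 * Only the zero-error-code case is stated as fact; the other two end in a
 * question mark because they are guesses.
 */
const char *gp_fault_message(unsigned long error_code, uint64_t fault_addr)
{
	if (!error_code)
		return "General protection fault in user access, due to non-canonical address.";

	if (!is_canonical_addr(fault_addr))
		return "General protection fault in user access. Non-canonical address?";

	return "General protection fault in user access. Segmentation bug?";
}
```

For example, gp_fault_message(0, addr) yields the declarative message 
regardless of addr, which is why the question about "!error_code" above 
matters: the sketch trusts the error code and never checks the address in 
that branch.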

Thanks,

        Ingo
