down)
>
> I believe the problem is this:
>
> #define PV_INDIRECT(addr) *addr(%rip)
>
> The displacement that the linker computes will be relative to the where
> this instruction is placed at the time of linking, which is in
> .pv_altinstructions (and not .text). So when we copy it into .text the
> displacement becomes bogus.
>
> Replacing the macro with
>
> #define PV_INDIRECT(addr) *addr // well, it's not so much indirect anymore
>
> makes things work. Or maybe it can be adjusted to be kept truly indirect.
That is still an indirect call, just using absolute addressing for the
pointer instead of RIP-relative. The alternatives mechanism has very
limited relocation capabilities: it will only handle a single call or
jmp replacement. Using absolute addressing is slightly less efficient
(it takes one extra byte to encode, and needs a relocation for KASLR),
but it works just as well. You could also relocate the instruction
manually, by adding the delta between the original and replacement code
to the displacement.
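That manual relocation is just an arithmetic fix-up of the displacement; a minimal sketch (the helper name and framing here are hypothetical, not actual kernel code):

```c
#include <stdint.h>

/* Fix up a RIP-relative displacement when an instruction is copied
 * from its link-time location (src) to its run-time location (dst).
 * The target address must stay the same, so the new displacement is
 * the old one plus the delta between source and destination. */
static int32_t relocate_disp(int32_t disp, uintptr_t src, uintptr_t dst)
{
	return disp + (int32_t)(src - dst);
}
```

Here src and dst are the addresses the displacement is measured from (the end of the instruction); the fix-up holds as long as the instruction length is unchanged.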
--
Brian Gerst
ning 'unsigned long', we end
> up with 0xffea in %rax, which is wrong.
>
> To work around this and maintain the 'int' behavior while using
> the SYSCALL_DEFINEx() macros, add a cast to 'unsigned int'
> in both implementations of sys_modify_ldt().
Reviewed-by: Brian Gerst
need to make sure that the return
> value is cast to int in all cases.
I don't think there will be a problem here. If 64-bit userspace
treats it as an int, it will truncate to 32-bit signed and all is
well. If it is treating it as a long, then it's already broken for
errors anyway.
--
Brian Gerst
On Sat, Oct 14, 2017 at 12:42 AM, Andy Lutomirski wrote:
> On Fri, Oct 13, 2017 at 4:49 PM, Brian Gerst wrote:
>> On Fri, Oct 13, 2017 at 5:03 PM, Andy Lutomirski wrote:
>>> On Fri, Oct 13, 2017 at 1:39 PM, Dave Hansen
>>> wrote:
>>>>
>> +
> + ebda_start = *(unsigned short *)0x40e << 4;
> + bios_start = *(unsigned short *)0x413 << 10;
> +
> + if (bios_start < BIOS_START_MIN || bios_start > BIOS_START_MAX)
> + bios_start = BIOS_START_MAX;
> +
> + if (ebda_start > BIOS_START_MIN && ebda_start < bios_start)
> + bios_start = ebda_start;
> +
> + /* Place trampoline one page below end of low memory, aligned to 4k */
> + return round_down(bios_start - PAGE_SIZE, PAGE_SIZE);
> +}
> +#endif
> --
> 2.14.2
>
>From what we've seen with the TLB flush rework, having potential
garbage in the page tables that speculative reads can see can cause
bad things like machine checks. It would be best to have a second
temporary page just for the page table (and properly cleared).
The trampoline also needs its own stack, in case the stack pointer was
above 4G. That could be at the end of the code page, since you only
need 8 bytes.
--
Brian Gerst
On Mon, Oct 16, 2017 at 4:55 AM, Kirill A. Shutemov
wrote:
> On Sat, Oct 14, 2017 at 01:19:08PM -0400, Brian Gerst wrote:
>> From what we've seen with the TLB flush rework, having potential
>> garbage in the page tables that speculative reads can see can cause
>> bad thing
affected at all. So it might be easier to just set
> the bit for older AMD processors as a boot quirk.
> Changing the TLB code so late might not be a good idea...
Looking at the AMD K10 revision guide
(http://support.amd.com/TechDocs/41322_10h_Rev_Gd.pdf), erratum #298
that this patch fixes should only apply to revisions DR-BA and DR-B2,
which include the original Phenom but not the Phenom II. The Phenom II
X6 is revision PH-E0, which does not have this erratum.
--
Brian Gerst
> -
> -/* Instantiate ptregs_stub for each ptregs-using syscall */
> -#define __SYSCALL_64_QUAL_(sym)
> -#define __SYSCALL_64_QUAL_ptregs(sym) ptregs_stub sym
> -#define __SYSCALL_64(nr, sym, qual) __SYSCALL_64_QUAL_##qual(sym)
> -#include
> -
You can't just blindly remove this. We need to make sure that
syscalls that modify registers take the slow path exit, because they
may change the registers to be incompatible with SYSRET.
--
Brian Gerst
On Wed, Jan 10, 2018 at 7:16 PM, Andi Kleen wrote:
> On Tue, Jan 09, 2018 at 09:46:16PM -0500, Brian Gerst wrote:
>> On Tue, Jan 9, 2018 at 8:03 PM, Andi Kleen wrote:
>> > From: Andi Kleen
>> >
>> > Remove the partial stack frame in the 64bit syscall fast
On Wed, Jan 10, 2018 at 7:55 PM, Andy Lutomirski wrote:
>
>
>> On Jan 10, 2018, at 4:16 PM, Andi Kleen wrote:
>>
>>> On Tue, Jan 09, 2018 at 09:46:16PM -0500, Brian Gerst wrote:
>>>> On Tue, Jan 9, 2018 at 8:03 PM, Andi Kleen wrote:
>>>> F
> + xor %ecx, %ecx
>
> UNWIND_HINT_REGS extra=0
>
> @@ -263,6 +271,7 @@ entry_SYSCALL_64_fastpath:
> #endif
> ja 1f /* return -ENOSYS (already in
> pt_regs->ax) */
> movq %r10, %rcx
> + xor %r10, %r10
RCX is already clear, so xchgq %r10, %rcx will be simpler.
--
Brian Gerst
I confess I don't actually know why that's a problem,
>
> You get a giant string of NOPs, a single jmp should be faster.
That could be handled in add_nops(): above a certain threshold,
switch to emitting a JMP instead.
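The idea could look roughly like this (a hypothetical sketch, not the actual add_nops() code; 0xeb encodes jmp rel8, 0x90 encodes nop):

```c
#include <string.h>

/* Pad 'len' bytes of patched-out code: above a small threshold, plant
 * a short jump over the remainder instead of executing a long NOP run.
 * jmp rel8 is 2 bytes and can skip up to 127 bytes ahead. */
static void pad_with_jmp(unsigned char *buf, unsigned int len)
{
	if (len > 8 && len <= 129) {
		buf[0] = 0xeb;			/* jmp rel8 */
		buf[1] = (unsigned char)(len - 2);
		memset(buf + 2, 0x90, len - 2);	/* skipped over, filled anyway */
	} else {
		memset(buf, 0x90, len);		/* short run: plain NOPs */
	}
}
```

The threshold (8 here) is a placeholder; the real cutoff would be whatever run length makes a jump cheaper than executing the NOPs.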
--
Brian Gerst
On Thu, Jan 11, 2018 at 10:23 AM, David Woodhouse wrote:
> On Thu, 2018-01-11 at 10:22 -0500, Brian Gerst wrote:
>> On Thu, Jan 11, 2018 at 9:32 AM, Peter Zijlstra wrote:
>> > On Thu, Jan 11, 2018 at 02:28:32PM +, David Woodhouse wrote:
>> >> On Thu, 2018-01-11
lled by gcc generated code, or with the asm macros
> + * in asm/jump-asm.h
> + */
> +
> +#include
> +#include
> +#include
> +
> + .section .text.__x86.indirect_thunk,"ax"
> +
> +ENTRY(__x86.indirect_thunk)
> + CFI_STARTPROC
> + call retpoline_call_target
> +2:
> + lfence /* stop speculation */
> + jmp 2b
> +retpoline_call_target:
> +#ifdef CONFIG_64BIT
> + lea 8(%rsp), %rsp
> +#else
> + lea 4(%esp), %esp
> +#endif
> + ret
> + CFI_ENDPROC
> +ENDPROC(__x86.indirect_thunk)
> +
> + EXPORT_SYMBOL(__x86.indirect_thunk)
> --
> 2.14.3
>
Can someone actually explain WTF this mess is trying to accomplish?
--
Brian Gerst
instruction is already serializing.
Does this require a microcode update?
--
Brian Gerst
MD: lfence+jmp)
Another way to do it is with two consecutive alternatives:
ALTERNATIVE(NOP, K8: lfence)
ALTERNATIVE(jmp indirect, RETPOLINE: jmp thunk)
This also avoids the issue with the relocation of the jmp target when
the replacement is more than one instruction.
--
Brian Gerst
nd CR3. Switch to kernel gsbase and CR3:
> */
> + SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
> SWAPGS
>
> /*
> @@ -1313,6 +1332,7 @@ ENTRY(nmi)
> UNWIND_HINT_REGS
> ENCODE_FRAME_POINTER
>
> + SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
> /*
> * At this point we no longer need to worry about stack damage
> * due to nesting -- we're on the normal thread stack and we're
> @@ -1328,6 +1348,7 @@ ENTRY(nmi)
> * work, because we don't want to enable interrupts.
> */
> SWAPGS
> + SWITCH_TO_USER_CR3 scratch_reg=%rdi
> jmp restore_regs_and_iret
>
> .Lnmi_from_kernel:
> @@ -1538,6 +1559,8 @@ end_repeat_nmi:
> movq $-1, %rsi
> call do_nmi
>
> + RESTORE_CR3 save_reg=%r14
> +
> testl %ebx, %ebx /* swapgs needed? */
> jnz nmi_restore
> nmi_swapgs:
This all needs to be conditional on a config option. Something with
this amount of performance impact needs to be 100% optional.
--
Brian Gerst
dy restored (see code above) */
> UNWIND_HINT_EMPTY
> POP_EXTRA_REGS
> +.Lpop_c_regs_except_rcx_r11_and_sysret:
> popq%rsi/* skip r11 */
> popq%r10
> popq%r9
Wouldn't it be more logical to keep the SYSRET path at the end of the
fast path (reverse of what you are doing here)? That way the fast
path falls through without jumping.
--
Brian Gerst
ermode)
Is adding swapgs to the label really necessary? It's redundant with
the usermode part. I'm noticing a trend towards absurdly verbose
labels lately.
--
Brian Gerst
_cs = __KERNEL_CS;
> load_sp0(tss, &tsk->thread);
> + refresh_sysenter_cs(&tsk->thread);
> vm86->saved_sp0 = 0;
> put_cpu();
>
> @@ -368,8 +370,10 @@ static long do_sys_vm86(struct vm86plus_struct __user
> *user_vm86, bool plus)
> /* make room for real-mode segments */
> tsk->thread.sp0 += 16;
>
> - if (static_cpu_has(X86_FEATURE_SEP))
> + if (static_cpu_has(X86_FEATURE_SEP)) {
> tsk->thread.sysenter_cs = 0;
> + refresh_sysenter_cs(&tsk->thread);
> + }
>
> load_sp0(tss, &tsk->thread);
> put_cpu();
The MSR update is missing in the new version.
--
Brian Gerst
E_MAX
>
> +#ifdef CONFIG_X86_32
> #define INIT_THREAD { \
> .sp0 = TOP_OF_INIT_STACK,\
> .addr_limit = KERNEL_DS,\
> }
> +#else
> +#define INIT_THREAD { \
> + .addr_limit = KERNEL_DS,\
> +}
> +#endif
There is already a separate INIT_THREAD for 32-bit. Just delete the sp0 member.
--
Brian Gerst
if it's applicable to interrupts or not.
No, it is not. The processor pushes 5 or 6 words of data on the stack
(the IRET frame plus an error code for certain exceptions) before the
interrupt handler gets control. So without using the IST for stack
switching on every interrupt, the redzone cannot be used in the kernel
as it will get smashed by the IRET frame. In addition, since the
kernel's stack is limited in size, skipping 128 bytes on every
interrupt would overrun the stack faster. The small gain from using
the redzone in the kernel is outweighed by these limitations.
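For reference, the arithmetic behind the "5 or 6 words": the CPU pushes SS, RSP, RFLAGS, CS, and RIP, plus an error code for certain exceptions (constants below are illustrative, not kernel definitions):

```c
/* Hardware-pushed exception frame sizes on x86-64 */
#define X86_64_WORD		8
#define IRET_FRAME_SIZE		(5 * X86_64_WORD)	/* SS, RSP, RFLAGS, CS, RIP */
#define IRET_FRAME_SIZE_EC	(6 * X86_64_WORD)	/* plus error code */
```

Either size lands well inside a 128-byte redzone, which is why the frame would smash it.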
--
Brian Gerst
pat sysenter target */
> ENTRY(xen_sysenter_target)
> - undo_xen_syscall
> + mov 0*8(%rsp), %rcx
> + mov 1*8(%rsp), %r11
> + mov 5*8(%rsp), %rsp
> jmp entry_SYSENTER_compat
> ENDPROC(xen_sysenter_target)
This patch causes the iopl_32 and ioperm_32 self-tests to fail on a
64-bit PV kernel. The 64-bit versions pass. It gets a seg fault after
"parent: write to 0x80 (should fail)", and the fault isn't caught by
the signal handler. It just dumps back to the shell. The tests pass
after reverting this.
--
Brian Gerst
On Mon, Aug 14, 2017 at 1:53 AM, Andy Lutomirski wrote:
> On Sun, Aug 13, 2017 at 7:44 PM, Brian Gerst wrote:
>> On Mon, Aug 7, 2017 at 11:59 PM, Andy Lutomirski wrote:
>>> /* Normal 64-bit system call target */
>>> ENTRY(xen_syscall_target)
>>> -
/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -247,6 +247,7 @@ GLOBAL(entry_SYSCALL_64_after_hwframe)
> TRACE_IRQS_OFF
>
> /* IRQs are off. */
> + CLEAR_EXTRA_REGS_NOSPEC
> movq %rsp, %rdi
> call do_syscall_64 /* returns with IRQs disabled */
>
>
Now that the fast syscall path is gone, all regs (except RSP
obviously) are dead after being saved to pt_regs.
--
Brian Gerst
bit code (.Lerror_bad_iret) for example.
For 32-bit, you could just expand that check to cover the whole exit
prologue after the CR3 switch, including the data segment loads.
I do wonder though, how expensive is a CR3 read? The SDM implies that
only writes are serializing. It may be simpler to just
unconditionally check it.
--
Brian Gerst
e the NMI handler could destroy the
> +* contents of the task-stack we are about to copy.
> +*/
> + movl %edi, %ebx
> +
> + /* Do the copy */
> + shrl $2, %ecx
> + cld
> + rep movsl
> +
> + /* Safe to switch to entry-stack now */
> + movl %ebx, %esp
> +
> +.Lend_\@:
> +.endm
> +/*
> * %eax: prev task
> * %edx: next task
> */
> @@ -765,6 +872,7 @@ restore_all:
>
> restore_all_kernel:
> TRACE_IRQS_IRET
> + PARANOID_EXIT_TO_KERNEL_MODE
> RESTORE_REGS 4
> jmp .Lirq_return
>
> --
> 2.7.4
>
--
Brian Gerst
On Mon, Mar 5, 2018 at 11:44 AM, Joerg Roedel wrote:
> On Mon, Mar 05, 2018 at 09:51:29AM -0500, Brian Gerst wrote:
>> For the IRET fault case you will still need to catch it in the
>> exception code. See the 64-bit code (.Lerror_bad_iret) for example.
>> For 32-bit,
On Mon, Mar 5, 2018 at 1:25 PM, Joerg Roedel wrote:
> Hi Brian,
>
> thanks for your review and helpful input.
>
> On Mon, Mar 05, 2018 at 11:41:01AM -0500, Brian Gerst wrote:
>> On Mon, Mar 5, 2018 at 5:25 AM, Joerg Roedel wrote:
>
be we should have a test for the "take DB on first instruction
> of sysenter".
>
>Linus
There already is a test: single_step_syscall.c
--
Brian Gerst
Pavel
KASLR leaks are a much lower threat than Meltdown. Given that no AMD
processor supports PCID, enabling PTI has a much more significant
performance impact for a much smaller benefit. Paranoid users still
have the option to enable PTI at boot, but it should not be on by
default.
--
Brian Gerst
. So how does it reload R10?
>
> In fact do_syscall_64() as a C function does not touch R10, R11, R12, R13,
> R14,
> R15 - it passes their values through.
>
> What am I missing?
The syscall ABI uses R10 for the 4th argument instead of RCX, because
RCX gets clobbered by the SYSCALL instruction for RIP.
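A raw wrapper makes the mapping concrete (illustrative only, x86-64 Linux; the point is that the 4th argument travels in r10, while rcx and r11 are clobbered by the SYSCALL instruction itself):

```c
/* 4-argument raw syscall: args in rdi, rsi, rdx, r10 (not rcx) */
static long raw_syscall4(long nr, long a1, long a2, long a3, long a4)
{
	register long r10 __asm__("r10") = a4;
	long ret;

	__asm__ volatile ("syscall"
			  : "=a" (ret)
			  : "a" (nr), "D" (a1), "S" (a2), "d" (a3), "r" (r10)
			  : "rcx", "r11", "memory");
	return ret;
}
```

Glibc's syscall(2) wrapper does exactly this shuffle from the C calling convention (where the 4th argument would be in rcx) into the kernel's convention.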
--
Brian Gerst
0 },
> { X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
> { X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
> + { X86_FEATURE_SME, CPUID_EAX, 0, 0x8000001f, 0 },
This should also be conditional. We don't want to set this feature on
32-bit, even if the processor has support.
> { 0, 0, 0, 0, 0 }
> };
--
Brian Gerst
_SHIFT;
>
Removing this also affects 32-bit, which is more likely to access
legacy devices in this range. Put in a check for SME instead
(provided you follow my recommendations to not set the SME feature bit
on 32-bit even when the processor supports it).
--
Brian Gerst
On Mon, Jul 10, 2017 at 3:50 PM, Tom Lendacky wrote:
> On 7/8/2017 7:57 AM, Brian Gerst wrote:
>>
>> On Fri, Jul 7, 2017 at 9:39 AM, Tom Lendacky
>> wrote:
>>>
>>> Currently there is a check if the address being mapped is in the ISA
>>> range (
On Mon, Jul 10, 2017 at 3:41 PM, Tom Lendacky wrote:
> On 7/8/2017 7:50 AM, Brian Gerst wrote:
>>
>> On Fri, Jul 7, 2017 at 9:38 AM, Tom Lendacky
>> wrote:
>>>
>>> Update the CPU features to include identifying and reporting on the
>>> Secure Mem
On Tue, Jul 11, 2017 at 4:35 AM, Arnd Bergmann wrote:
> On Tue, Jul 11, 2017 at 6:58 AM, Brian Gerst wrote:
>> On Mon, Jul 10, 2017 at 3:50 PM, Tom Lendacky
>> wrote:
>>> On 7/8/2017 7:57 AM, Brian Gerst wrote:
>>>> On Fri, Jul 7, 2017 at 9:39 AM, Tom Lendac
On Tue, Jul 11, 2017 at 11:02 AM, Tom Lendacky wrote:
> On 7/10/2017 11:58 PM, Brian Gerst wrote:
>>
>> On Mon, Jul 10, 2017 at 3:50 PM, Tom Lendacky
>> wrote:
>>>
>>> On 7/8/2017 7:57 AM, Brian Gerst wrote:
>>>>
>>>>
R +
>> emit a warning.
>
> OTOH, disabling NX is a simple way to verify that DEBUG_WX works correctly
> also on the shadow maps.
>
> But surely we can drop the PAGE_GLOBAL thing, as all 64bit systems have it.
I seem to recall that some virtualized environments (maybe Xen?) don't
support global pages.
--
Brian Gerst
t would need to be done
is to remove the zero-base of the percpu segment (which would simplify
a lot of other code).
--
Brian Gerst
define __KERNEL_PERCPU when either SMP or STACKPROTECTOR
are enabled.
--
Brian Gerst
i_is_native(void)
> {
> - if (!IS_ENABLED(CONFIG_X86_64))
> - return true;
> return efi_is_64bit();
> }
This would then return false for native 32-bit.
--
Brian Gerst
An alternative to the patch I proposed earlier would be to use aliases
with the __x32_ prefix for the common syscalls.
--
Brian Gerst
On Sat, Sep 19, 2020 at 1:14 PM wrote:
>
> On September 19, 2020 9:23:22 AM PDT, Andy Lutomirski wrote:
> >On Fri, Sep 18, 2020 at 10:35 PM Chris
args for native 32-bit.
Reported-by: Paweł Jasiak
Fixes: 121b32a58a3a ("x86/entry/32: Use IA32-specific wrappers for syscalls
taking 64-bit arguments")
Signed-off-by: Brian Gerst
---
arch/Kconfig | 6 ++
arch/x86/Kconfig | 1 +
fs/notif
On Tue, Dec 1, 2020 at 12:34 PM Andy Lutomirski wrote:
>
> On Tue, Dec 1, 2020 at 9:23 AM Andy Lutomirski wrote:
> >
> > On Mon, Nov 30, 2020 at 2:31 PM Brian Gerst wrote:
> > >
> > > Commit 121b32a58a3a converted native x86-32 which take 64-bit argume
On Tue, Aug 25, 2020 at 6:44 AM Thomas Gleixner wrote:
>
> On Fri, Aug 21 2020 at 11:35, Brian Gerst wrote:
> > On Fri, Aug 21, 2020 at 10:22 AM Sean Christopherson
> >> > .macro GET_PERCPU_BASE reg:req
> >> > - ALTERNATIVE \
> >> >
GS indirect=1
> movq ORIG_RAX(%rdi), %rsi /* get vector from stack */
> - movq $-1, ORIG_RAX(%rdi) /* no syscall to restart */
> call smp_spurious_interrupt /* rdi points to pt_regs */
> jmp ret_from_intr
> SYM_CODE_END(common_spurious)
> @@ -746,7 +745,6 @@ SYM_CODE_START_LOCAL(common_interrupt)
> call interrupt_entry
> UNWIND_HINT_REGS indirect=1
> movq ORIG_RAX(%rdi), %rsi /* get vector from stack */
> - movq $-1, ORIG_RAX(%rdi) /* no syscall to restart */
> call do_IRQ /* rdi points to pt_regs */
> /* 0(%rsp): old RSP */
> ret_from_intr:
> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
> index 67768e5443..5b6f74e 100644
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -934,7 +934,7 @@ static void __irq_complete_move(struct irq_cfg *cfg,
> unsigned vector)
>
> void irq_complete_move(struct irq_cfg *cfg)
> {
> - __irq_complete_move(cfg, ~get_irq_regs()->orig_ax);
> + __irq_complete_move(cfg, get_irq_regs()->orig_ax);
> }
I think you need to also truncate the vector to 8 bits, since it now
gets sign-extended when pushed into the orig_ax slot.
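The truncation itself is trivial; a sketch of the effect (hypothetical helper, not the actual patch):

```c
#include <stdint.h>

/* A vector >= 0x80 stored via a sign-extending push arrives in
 * orig_ax with all upper bits set; truncating to 8 bits recovers
 * the original vector number. */
static unsigned int vector_from_orig_ax(long orig_ax)
{
	return (uint8_t)orig_ax;
}
```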
--
Brian Gerst
ops.
> +*/
> + dr6 &= ~DR_STEP;
> + set_thread_flag(TIF_SINGLESTEP);
> + regs->flags &= ~X86_EFLAGS_TF;
> + }
> +
> handle_debug(regs, dr6, false);
>
> out:
Can this be removed or changed to a BUG()? The warning has been there
since 2016 and nobody has apparently complained about it.
--
Brian Gerst
PU_AND_NODE_SEG_LIMIT \reg
> andq $VDSO_CPUNODE_MASK, \reg
> movq __per_cpu_offset(, \reg, 8), \reg
> .endm
LOAD_CPU_AND_NODE_SEG_LIMIT can be merged into this, as its only
purpose was to work around using CPP macros in an alternative.
--
Brian Gerst
the kernel would consume a guest's TSC_AUX if an NMI arrives
> > + * while running KVM's run loop.
> > */
> > .macro GET_PERCPU_BASE reg:req
> > - ALTERNATIVE \
> > - "LOAD_CPU_AND_NODE_SEG_LIMIT \reg", \
> > - "RDPID \reg", \
>
> This was the only user of the RDPID macro, I assume we want to yank that out
> as well?
No. That one should be kept until the minimum binutils version is
raised to one that supports the RDPID opcode.
--
Brian Gerst
(IS_ENABLED(CONFIG_64_BIT) &&
> boot_cpu_has(X86_FEATURE_XENPV)))
> + mask |= X86_EFLAGS_AC;
Is the explicit Xen check necessary? IIRC the Xen hypervisor will
filter out the SMAP bit in the cpuid pvop.
--
Brian Gerst
On Wed, Sep 2, 2020 at 12:31 PM wrote:
>
> On Wed, Sep 02, 2020 at 06:24:27PM +0200, Jürgen Groß wrote:
> > On 02.09.20 17:58, Brian Gerst wrote:
> > > On Wed, Sep 2, 2020 at 9:38 AM Peter Zijlstra
> > > wrote:
> > > >
> > > > From: Pe
SIGUSR1
> [RUN] Step again
> [OK] pause(2) restarted correctly
Bisected to commit 0b085e68f407 ("x86/entry: Consolidate 32/64 bit
syscall entry").
It looks like it is because syscall_enter_from_user_mode() is called
before reading the 6th argument from the user stack.
--
Brian Gerst
dn't be all that costly. Famous last words, of course...
>
> Does anybody see fundamental problems with that?
I think this would be a good idea. I have been working on a patchset
to clean up the conditional syscall handling (sys_ni.c), and conflicts
with the prototypes in syscalls.h have been getting in the way.
Having the prototypes use SYSCALL_DECLAREx(...) would solve that
issue.
--
Brian Gerst
Linux. Linux is installed on
a broad range of hardware, new and old. In general, we don't drop
support for hardware unless there is nobody willing to maintain it.
--
Brian Gerst
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord..
On Thu, Aug 23, 2012 at 3:03 PM, wbrana wrote:
> On 8/23/12, Brian Gerst wrote:
>> The x86-32 arch is mature and well maintained, and shares so much in
>> common with x86-64, that there is little to be gained by dropping
>> kernel support.
> I would gain better chance,
On Thu, Aug 23, 2012 at 3:08 PM, wbrana wrote:
> On 8/23/12, Brian Gerst wrote:
>> Nobody here cares about closed source drivers.
> There are also open source software which don't support X32 like
> Oracle Java, VirtualBox, Mozilla Firefox, Google Chrome.
As I said before, X32
f the 64-bit virtual
address space and
all addresses are 32 bits in size. They should conform to small code model or
small position independent code model (PIC) described in Section 3.5.1.
10.5 Coding Examples
Although ILP32 binaries run in the 64-bit mode...
--
Brian Gerst
On Fri, Aug 24, 2012 at 5:53 AM, wbrana wrote:
> On 8/23/12, Brian Gerst wrote:
>> Windows mostly sells with new hardware, and by the time win9 is
>> released all new hardware designed for it will be 64-bit capable.
>> Therefore it is not *profitable* for Microsoft to contin
han there
is different, that x86-32-specific maintenance is almost nil. I
haven't seen any patches from you, so your opinions about the
maintenance burden don't carry much weight compared with those of
people who are contributors.
--
Brian Gerst
->ip = 0 (placeholder) */
> - pushl %eax/* pt_regs->orig_ax */
> + pushl (%eax) /* pt_regs->orig_ax */
Add an %ss: override here too.
> SAVE_ALL pt_regs_ax=$-ENOSYS/* save rest, stack already switched
> */
>
> /*
> --
> 2.19.1.6.gb485710b
>
--
Brian Gerst
On Mon, May 18, 2020 at 6:46 PM Nick Desaulniers
wrote:
>
> On Sun, May 17, 2020 at 8:29 AM Brian Gerst wrote:
> >
> > The core percpu macros already have a switch on the data size, so the switch
> > in the x86 code is redundant and produces more dead code.
> >
On Mon, May 18, 2020 at 5:15 PM Nick Desaulniers
wrote:
>
> On Sun, May 17, 2020 at 8:29 AM Brian Gerst wrote:
> >
> > The core percpu macros already have a switch on the data size, so the switch
> > in the x86 code is redundant and produces more dead code.
> >
-fno-stack-protector)
> +CFLAGS_head64.o := $(nostackp)
> +
> # If instrumentation of this dir is enabled, boot hangs during first second.
> # Probably could be more selective here, but note that files related to irqs,
> # boot, dumpstack/stacktrace, etc are either non-interesting
d, from the other thread [1] in case you missed it -- the plain
> hidden.h fails to build in-tree. We need something like
> KBUILD_CFLAGS += -include $(srctree)/$(src)/hidden.h
> instead.
>
> [1] https://lore.kernel.org/lkml/20200526153104.gc2190...@rani.riverdale.lan/
How about using -fvisibility=hidden instead of including this header?
--
Brian Gerst
.ld in the same directory.
If the compiler is making assumptions based on the function name
"main" wouldn't it be simpler just to rename the function?
--
Brian Gerst
On Mon, Jun 1, 2020 at 4:43 PM Nick Desaulniers wrote:
>
> On Sat, May 30, 2020 at 3:11 PM Brian Gerst wrote:
> >
> > Use __pcpu_size_call_return() to simplify this_cpu_read_stable().
>
> Clever! As in this_cpu_read() in include/linux/percpu-defs.h. Could
>
The core percpu macros already have a switch on the data size, so the switch
in the x86 code is redundant and produces more dead code.
Also use appropriate types for the width of the instructions. This avoids
errors when compiling with Clang.
Signed-off-by: Brian Gerst
---
arch/x86/include
The core percpu macros already have a switch on the data size, so the switch
in the x86 code is redundant and produces more dead code.
Also use appropriate types for the width of the instructions. This avoids
errors when compiling with Clang.
Signed-off-by: Brian Gerst
---
arch/x86/include
In preparation for cleaning up the percpu operations, define macros for
abstraction based on the width of the operation.
Signed-off-by: Brian Gerst
---
arch/x86/include/asm/percpu.h | 30 ++
1 file changed, 30 insertions(+)
diff --git a/arch/x86/include/asm/percpu.h
The core percpu macros already have a switch on the data size, so the switch
in the x86 code is redundant and produces more dead code.
Also use appropriate types for the width of the instructions. This avoids
errors when compiling with Clang.
Signed-off-by: Brian Gerst
Reviewed-by: Nick
operands, and to cast variables to the width used in
the assembly to make Clang happy.
Changes from v1:
- Add separate patch for XADD constraint fix
- Fixed sparse truncation warning
- Add cleanup of percpu_stable_op()
- Add patch to Remove PER_CPU()
Brian Gerst (10):
x86/percpu: Introduce size
The core percpu macros already have a switch on the data size, so the switch
in the x86 code is redundant and produces more dead code.
Also use appropriate types for the width of the instructions. This avoids
errors when compiling with Clang.
Signed-off-by: Brian Gerst
Reviewed-by: Nick
Use __pcpu_size_call_return() to simplify this_cpu_read_stable().
Also remove __bad_percpu_size() which is now unused.
Signed-off-by: Brian Gerst
---
arch/x86/include/asm/percpu.h | 41 ++-
1 file changed, 12 insertions(+), 29 deletions(-)
diff --git a/arch/x86
The "e" constraint represents a constant, but the XADD instruction doesn't
accept immediate operands.
Signed-off-by: Brian Gerst
---
arch/x86/include/asm/percpu.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/as
The core percpu macros already have a switch on the data size, so the switch
in the x86 code is redundant and produces more dead code.
Also use appropriate types for the width of the instructions. This avoids
errors when compiling with Clang.
Signed-off-by: Brian Gerst
Reviewed-by: Nick
The core percpu macros already have a switch on the data size, so the switch
in the x86 code is redundant and produces more dead code.
Also use appropriate types for the width of the instructions. This avoids
errors when compiling with Clang.
Signed-off-by: Brian Gerst
Reviewed-by: Nick
Also remove now unused __percpu_mov_op.
Signed-off-by: Brian Gerst
---
arch/x86/include/asm/percpu.h | 18 --
1 file changed, 18 deletions(-)
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index cf2b9c2a241e..a3c33b79fb86 100644
--- a/arch/x86
to do the 'popl %fs' before the
> cr3-switch. I'll fix it in the next version.
>
> I have no real idea on how to switch back to the entry stack without
> access to per_cpu variables. I also can't access the cpu_entry_area for
> the cpu yet, because for that we need to be on the entry stack already.
Switch to the trampoline stack before loading user segments.
--
Brian Gerst
On Wed, Jan 17, 2018 at 5:57 AM, Brian Gerst wrote:
> On Wed, Jan 17, 2018 at 1:24 AM, Joerg Roedel wrote:
>> On Tue, Jan 16, 2018 at 02:48:43PM -0800, Andy Lutomirski wrote:
>>> On Tue, Jan 16, 2018 at 8:36 AM, Joerg Roedel wrote:
>>> > + /*
ated by moving the check to
the places that do modify ptregs (ptrace, sigreturn, and exec) which
would set a flag to force return with IRET if the modified regs do not
satisfy the criteria for SYSRET.
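In C, such a flag-setting check might look like this (a sketch of the SYSRET-compatibility conditions tested on the 64-bit exit path; the struct layout and names here are simplified stand-ins, not the kernel's):

```c
#include <stdbool.h>
#include <stdint.h>

struct fake_pt_regs { uint64_t ip, cx, r11, flags; };

#define X86_EFLAGS_TF	(1UL << 8)
#define X86_EFLAGS_RF	(1UL << 16)

/* SYSRET reloads RIP from RCX and RFLAGS from R11, requires a
 * canonical RIP, and must not return with TF or RF set. */
static bool regs_ok_for_sysret(const struct fake_pt_regs *regs)
{
	if (regs->cx != regs->ip || regs->r11 != regs->flags)
		return false;
	/* canonical: bits 63:47 are a sign-extension of bit 47 */
	if (((int64_t)(regs->ip << 16) >> 16) != (int64_t)regs->ip)
		return false;
	return !(regs->flags & (X86_EFLAGS_TF | X86_EFLAGS_RF));
}
```

ptrace, sigreturn, and exec would run a check like this after modifying pt_regs and set the force-IRET flag when it fails.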
--
Brian Gerst
return flags;
> }
>
> static __always_inline void smap_restore(unsigned long flags)
> {
> - asm volatile (ALTERNATIVE("", "push %0; popf", X86_FEATURE_SMAP)
> + asm volatile ("# smap_restore\n\t"
> + "push %0; popf"
> : : "g" (flags) : "memory", "cc");
> }
POPF is an expensive instruction that should be avoided if possible.
A better solution would be to have the alternative jump over the
push/pop when SMAP is disabled.
--
Brian Gerst
s_inline void smap_restore(unsigned long flags)
> {
> - asm volatile (ALTERNATIVE("", "push %0; popf", X86_FEATURE_SMAP)
> + asm volatile ("# smap_restore\n\t"
> + ALTERNATIVE("jmp 1f", "", X86_FEATURE_SMAP)
> + "push %0; popf\n\t"
> + "1:"
> : : "g" (flags) : "memory", "cc");
> }
>
Looks good. Alternatively, you could use static_cpu_has(X86_FEATURE_SMAP).
--
Brian Gerst
er_from_user_mode(regs);
> - instrumentation_begin();
> + unsigned int nr = syscall_32_enter(regs);
>
> - local_irq_enable();
> - do_syscall_32_irqs_on(regs);
> -
> - instrumentation_end();
> - exit_to_user_mode();
> + do_syscall_32_irqs_on(regs, nr);
> + syscall_return_slowpath(regs);
> }
>
> -static bool __do_fast_syscall_32(struct pt_regs *regs)
> +static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
Can __do_fast_syscall_32() be merged back into do_fast_syscall_32()
now that both are marked noinstr?
--
Brian Gerst
On Tue, Jul 14, 2020 at 2:40 AM Christoph Hellwig wrote:
>
> On Tue, Jun 16, 2020 at 10:23:13AM -0400, Brian Gerst wrote:
> > Christoph Hellwig uncovered an issue with how we currently handle X32
> > syscalls. Currently, we can only use COMPAT_SYS_DEFINEx() for X32
>
Signed-off-by: Brian Gerst
---
arch/x86/entry/calling.h | 10 +
arch/x86/entry/common.c| 56 ++-
arch/x86/entry/entry_64.S | 71 ++
arch/x86/include/asm/syscall.h | 2 +-
4 files changed, 60 insertions(+), 79
This series cleans up the tests for using the SYSRET or SYSEXIT
instructions on exit from a syscall, moving some code from assembly to C
and merging native and compat tests.
Brian Gerst (3):
x86-64: Move SYSRET validation code to C
x86-32: Remove SEP test for SYSEXIT
x86: Clean up
SEP must be present in order for do_fast_syscall_32() to be called on native
32-bit. Therefore the check here is redundant.
Signed-off-by: Brian Gerst
---
arch/x86/entry/common.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/entry/common.c b/arch/x86/entry
Reorganize the tests for SYSEXITS/SYSRETL, cleaning up comments and merging
native and compat code.
Signed-off-by: Brian Gerst
---
arch/x86/entry/common.c | 85 ++--
arch/x86/entry/entry_32.S| 6 +--
arch/x86/entry/entry_64_compat.S | 13 ++---
arch
FLAGS_FTRACE) -fstack-protector
> -fstack-protector-strong
Is this necessary for syscall_*.o? They just contain the syscall
tables (i.e., data).
--
Brian Gerst
https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/20200504193524.ga221...@google.com/
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
>
ot; constraint only has meaning on -m32 otherwise is treated as
> > "r". Not all GPRs have low-8-bit aliases for -m32.
> >
>
> Looks very good!
> One question though, does it work with minimum supported version of gcc?
Yes. The operand width modifiers have been around a long time but not
well documented until more recently.
--
Brian Gerst
ptr, size, retval) \
> > do { \
> > + unsigned char x_u8__; \
> > retval = 0; \
> > __chk_user_ptr(ptr);\
> > switch (size) { \
> > case 1: \
> > - __get_user_asm(x, ptr, retval, "b", "=q"); \
> > + __get_user_asm(x_u8__, ptr, retval, "b", "=q"); \
> > + (x) = x_u8__; \
> > break; \
> > case 2: \
> > __get_user_asm(x, ptr, retval, "w", "=r"); \
> > --
> > 2.26.2.526.g744177e7f7-goog
> >
>
>
> --
> Thanks,
> ~Nick Desaulniers
This looks like the same issue that we just discussed for bitops.h.
Add the "b" operand size modifier to force it to use the 8-bit
register names (and probably also needs the "w" modifier in the 16-bit
case).
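A standalone illustration of the modifier (a hypothetical example, not the get_user code itself):

```c
/* "%b1" forces the 8-bit name of the register holding operand 1
 * (e.g. %al for %eax), so the move matches the byte-sized output
 * even though the C input variable is 32 bits wide. */
static unsigned char low_byte(unsigned int v)
{
	unsigned char out;

	__asm__ ("mov %b1, %0" : "=q" (out) : "q" (v));
	return out;
}
```

Without the "b" modifier, the assembler would see a 32-bit register name moved into an 8-bit destination and reject the operand-size mismatch.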
--
Brian Gerst
q" ((u8)(CONST_MASK(nr) & 0xff))
I think a better fix would be to make CONST_MASK() return a u8 value
rather than have to cast on every use.
Also I question the need for the "q" constraint. It was added in
commit 437a0a54 as a workaround for GCC 3.4.4. Now that the minimum
GCC version is 4.6, is this still necessary?
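The first suggestion would look something like this (a sketch using plain uint8_t in place of the kernel's u8):

```c
#include <stdint.h>

/* Mask for a bit number within its containing byte; producing a u8
 * directly makes the "& 0xff" cast at every call site unnecessary. */
#define CONST_MASK(nr) ((uint8_t)1 << ((nr) & 7))
```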
--
Brian Gerst
On Thu, May 7, 2020 at 3:02 AM wrote:
>
> On May 6, 2020 11:18:09 PM PDT, Brian Gerst wrote:
> >On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers
> > wrote:
> >>
> >> From: Sedat Dilek
> >>
> >> It turns out that if your config tickle
> Drop & 0xff and change ^ 0xff to ~.
>
> But then we're back to sparse being unhappy, no? The thing with ~ is
> that it will set high bits which will be truncated, which makes sparse
> sad.
This change will make sparse happy and allow these cleanups:
#define CONST_MASK(nr) ((u8)1 << ((nr) & 7))
Tested with GCC 9.3.1.
--
Brian Gerst
https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/20200504193524.ga221...@google.com/
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
>
On Mon, May 11, 2020 at 2:46 PM Nick Desaulniers
wrote:
>
> On Mon, May 11, 2020 at 11:09 AM Brian Gerst wrote:
> > This looks like the same issue that we just discussed for bitops.h.
> > Add the "b" operand size modifier to force it to use the 8-bit
> > regis
On Mon, May 11, 2020 at 3:34 PM Brian Gerst wrote:
>
> On Mon, May 11, 2020 at 2:46 PM Nick Desaulniers
> wrote:
> >
> > On Mon, May 11, 2020 at 11:09 AM Brian Gerst wrote:
> > > This looks like the same issue that we just discussed for bitops.h.
> > > Add
On Thu, May 7, 2020 at 6:29 PM Nick Desaulniers wrote:
>
> On Thu, May 7, 2020 at 12:19 PM Nick Desaulniers
> wrote:
> >
> > On Thu, May 7, 2020 at 7:00 AM Brian Gerst wrote:
> > >
> > > This change will make sparse happy and allow thes