>>> On 12.08.14 at 11:31, dvlas...@redhat.com wrote:
> Jan, Peter, does this look correct _and_ human-understandable?
>
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -652,10 +652,14 @@ END(interrupt)
> cmovzq PER_CPU_VAR(irq_stack_ptr),%rsp
> CFI_DEF_CFA_REGISTER rsi
On 08/11/2014 05:13 PM, H. Peter Anvin wrote:
> On 08/11/2014 08:08 AM, Jan Beulich wrote:
>>> No, in *human language*. What does the DW_CFA_def_cfa_expression
>>> actually aim to accomplish? If you don't know the innards of the DWARF
>>> spec, the whole thing might as well be Hungarian.
>>
>>
On 08/11/2014 05:13 PM, H. Peter Anvin wrote:
On 08/11/2014 08:08 AM, Jan Beulich wrote:
No, in *human language*. What does the DW_CFA_def_cfa_expression
actually aim to accomplish? If you don't know the innards of the DWARF
spec, the whole thing might as well be Hungarian.
Just like the
On 12.08.14 at 11:31, dvlas...@redhat.com wrote:
Jan, Peter, does this look correct _and_ human-understandable?
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -652,10 +652,14 @@ END(interrupt)
cmovzq PER_CPU_VAR(irq_stack_ptr),%rsp
On 08/11/2014 08:08 AM, Jan Beulich wrote:
>>
>> No, in *human language*. What does the DW_CFA_def_cfa_expression
>> actually aim to accomplish? If you don't know the innards of the DWARF
>> spec, the whole thing might as well be Hungarian.
>
> Just like the other DW_CFA_def_cfa_* ones it sets
>>> On 11.08.14 at 16:53, h...@zytor.com wrote:
> On 08/11/2014 07:17 AM, Jan Beulich wrote:
>>>
>>> The existing comments explain what every byte means.
>>> They are useful if CFI-literate reader wants to check correctness
>>> of the encoding of this annotation.
>>>
>>> There is no overall comment what this
On 08/11/2014 07:17 AM, Jan Beulich wrote:
>>
>> The existing comments explain what every byte means.
>> They are useful if CFI-literate reader wants to check correctness
>> of the encoding of this annotation.
>>
>> There is no overall comment what this CFI annotation
>> *achieves*. In human
>>> On 11.08.14 at 15:26, dvlas...@redhat.com wrote:
> On 08/11/2014 10:40 AM, Jan Beulich wrote:
>> CFI_ESCAPE 0x0f /* DW_CFA_def_cfa_expression */, 6, \
>> 0x77 /* DW_OP_breg7 */, 0, \
>> 0x06 /* DW_OP_deref */, \
>> - 0x08
On 08/11/2014 10:40 AM, Jan Beulich wrote:
> CFI_ESCAPE 0x0f /* DW_CFA_def_cfa_expression */, 6, \
> 0x77 /* DW_OP_breg7 */, 0, \
> 0x06 /* DW_OP_deref */, \
> - 0x08 /* DW_OP_const1u */, SS+8-RBP, \
> +
>>> On 11.08.14 at 11:07, l...@amacapital.net wrote:
> How does one test the entry CFI annotations? The best that I know of
> is to single-step through using gdb attached to qemu and see whether
> backtraces seem to work.
Or have the kernel generate a backtrace from a suitable location and
check that the backtrace
On Mon, Aug 11, 2014 at 5:40 PM, Jan Beulich wrote:
On 11.08.14 at 02:46, fweis...@gmail.com wrote:
>> On Mon, Aug 04, 2014 at 05:03:42AM +0200, Denys Vlasenko wrote:
>>> On Sat, Aug 2, 2014 at 1:19 AM, Frederic Weisbecker
>> wrote:
>>> >> CFI_ESCAPE 0x0f /* DW_CFA_def_cfa_expression */, 6, \
>>>
>>> On 11.08.14 at 02:46, fweis...@gmail.com wrote:
> On Mon, Aug 04, 2014 at 05:03:42AM +0200, Denys Vlasenko wrote:
>> On Sat, Aug 2, 2014 at 1:19 AM, Frederic Weisbecker
> wrote:
>> >> CFI_ESCAPE 0x0f /* DW_CFA_def_cfa_expression */, 6, \
>> >> 0x77 /* DW_OP_breg7 */, 0, \
>>
On 11.08.14 at 02:46, fweis...@gmail.com wrote:
On Mon, Aug 04, 2014 at 05:03:42AM +0200, Denys Vlasenko wrote:
On Sat, Aug 2, 2014 at 1:19 AM, Frederic Weisbecker fweis...@gmail.com
wrote:
CFI_ESCAPE 0x0f /* DW_CFA_def_cfa_expression */, 6, \
0x77 /*
On Mon, Aug 11, 2014 at 5:40 PM, Jan Beulich jbeul...@suse.com wrote:
On 11.08.14 at 02:46, fweis...@gmail.com wrote:
On Mon, Aug 04, 2014 at 05:03:42AM +0200, Denys Vlasenko wrote:
On Sat, Aug 2, 2014 at 1:19 AM, Frederic Weisbecker fweis...@gmail.com
wrote:
CFI_ESCAPE 0x0f /*
On 11.08.14 at 11:07, l...@amacapital.net wrote:
How does one test the entry CFI annotations? The best that I know of
is to single-step through using gdb attached to qemu and see whether
backtraces seem to work.
Or have the kernel generate a backtrace from a suitable location and
check that
On 08/11/2014 10:40 AM, Jan Beulich wrote:
CFI_ESCAPE 0x0f /* DW_CFA_def_cfa_expression */, 6, \
0x77 /* DW_OP_breg7 */, 0, \
0x06 /* DW_OP_deref */, \
- 0x08 /* DW_OP_const1u */, SS+8-RBP, \
+ 0x08
On 11.08.14 at 15:26, dvlas...@redhat.com wrote:
On 08/11/2014 10:40 AM, Jan Beulich wrote:
CFI_ESCAPE 0x0f /* DW_CFA_def_cfa_expression */, 6, \
0x77 /* DW_OP_breg7 */, 0, \
0x06 /* DW_OP_deref */, \
- 0x08 /*
On 08/11/2014 07:17 AM, Jan Beulich wrote:
The existing comments explain what every byte means.
They are useful if CFI-literate reader wants to check correctness
of the encoding of this annotation.
There is no overall comment what this CFI annotation
*achieves*. In human language, what do
On 11.08.14 at 16:53, h...@zytor.com wrote:
On 08/11/2014 07:17 AM, Jan Beulich wrote:
The existing comments explain what every byte means.
They are useful if CFI-literate reader wants to check correctness
of the encoding of this annotation.
There is no overall comment what this CFI
On 08/11/2014 08:08 AM, Jan Beulich wrote:
No, in *human language*. What does the DW_CFA_def_cfa_expression
actually aim to accomplish? If you don't know the innards of the DWARF
spec, the whole thing might as well be Hungarian.
Just like the other DW_CFA_def_cfa_* ones it sets the
On Mon, Aug 04, 2014 at 05:03:42AM +0200, Denys Vlasenko wrote:
> On Sat, Aug 2, 2014 at 1:19 AM, Frederic Weisbecker
> wrote:
> >> CFI_ESCAPE 0x0f /* DW_CFA_def_cfa_expression */, 6, \
> >> 0x77 /* DW_OP_breg7 */, 0, \
> >> 0x06 /*
On Mon, Aug 04, 2014 at 05:03:42AM +0200, Denys Vlasenko wrote:
On Sat, Aug 2, 2014 at 1:19 AM, Frederic Weisbecker fweis...@gmail.com
wrote:
CFI_ESCAPE 0x0f /* DW_CFA_def_cfa_expression */, 6, \
0x77 /* DW_OP_breg7 */, 0, \
0x06 /*
On Tue, Aug 5, 2014 at 4:53 PM, Andy Lutomirski wrote:
> On Aug 5, 2014 7:36 PM, "Denys Vlasenko" wrote:
>> Then old_rsp can be nuked everywhere else,
>> RESTORE_TOP_OF_STACK can be nuked, and
>> FIXUP_TOP_OF_STACK can be reduced to merely:
>>
>> movq $__USER_DS,SS(%rsp)
>> movq
On Tue, Aug 5, 2014 at 4:53 PM, Andy Lutomirski l...@amacapital.net wrote:
On Aug 5, 2014 7:36 PM, Denys Vlasenko vda.li...@googlemail.com wrote:
Then old_rsp can be nuked everywhere else,
RESTORE_TOP_OF_STACK can be nuked, and
FIXUP_TOP_OF_STACK can be reduced to merely:
movq
On Aug 6, 2014 12:17 AM, "Denys Vlasenko" wrote:
>
> On 08/05/2014 04:53 PM, Andy Lutomirski wrote:
> > On Aug 5, 2014 7:36 PM, "Denys Vlasenko" wrote:
> >>
> >> On Mon, Aug 4, 2014 at 11:03 PM, Andy Lutomirski
> >> wrote:
> > Next up: remove FIXUP/RESTORE_TOP_OF_STACK? :) Maybe I'll give
On 08/05/2014 04:53 PM, Andy Lutomirski wrote:
> On Aug 5, 2014 7:36 PM, "Denys Vlasenko" wrote:
>>
>> On Mon, Aug 4, 2014 at 11:03 PM, Andy Lutomirski wrote:
> Next up: remove FIXUP/RESTORE_TOP_OF_STACK? :) Maybe I'll give that a
> shot.
I'm yet at the stage "what that stuff
On Aug 5, 2014 7:36 PM, "Denys Vlasenko" wrote:
>
> On Mon, Aug 4, 2014 at 11:03 PM, Andy Lutomirski wrote:
> >>> Next up: remove FIXUP/RESTORE_TOP_OF_STACK? :) Maybe I'll give that a
> >>> shot.
> >>
> >> I'm yet at the stage "what that stuff does anyway?" and at
> >> "why do we need percpu
On Mon, Aug 4, 2014 at 11:03 PM, Andy Lutomirski wrote:
>>> Next up: remove FIXUP/RESTORE_TOP_OF_STACK? :) Maybe I'll give that a shot.
>>
>> I'm yet at the stage "what that stuff does anyway?" and at
>> "why do we need percpu old_rsp thingy?" in particular.
>
> On x86_64, the syscall
On Mon, Aug 4, 2014 at 11:03 PM, Andy Lutomirski l...@amacapital.net wrote:
Next up: remove FIXUP/RESTORE_TOP_OF_STACK? :) Maybe I'll give that a shot.
I'm yet at the stage "what that stuff does anyway?" and at
"why do we need percpu old_rsp thingy?" in particular.
On x86_64, the syscall
On Aug 5, 2014 7:36 PM, Denys Vlasenko vda.li...@googlemail.com wrote:
On Mon, Aug 4, 2014 at 11:03 PM, Andy Lutomirski l...@amacapital.net wrote:
Next up: remove FIXUP/RESTORE_TOP_OF_STACK? :) Maybe I'll give that a
shot.
I'm yet at the stage "what that stuff does anyway?" and at
why
On 08/05/2014 04:53 PM, Andy Lutomirski wrote:
On Aug 5, 2014 7:36 PM, Denys Vlasenko vda.li...@googlemail.com wrote:
On Mon, Aug 4, 2014 at 11:03 PM, Andy Lutomirski l...@amacapital.net wrote:
Next up: remove FIXUP/RESTORE_TOP_OF_STACK? :) Maybe I'll give that a
shot.
I'm yet at the
On Aug 6, 2014 12:17 AM, Denys Vlasenko dvlas...@redhat.com wrote:
On 08/05/2014 04:53 PM, Andy Lutomirski wrote:
On Aug 5, 2014 7:36 PM, Denys Vlasenko vda.li...@googlemail.com wrote:
On Mon, Aug 4, 2014 at 11:03 PM, Andy Lutomirski l...@amacapital.net
wrote:
Next up: remove
On Tue, Aug 05, 2014 at 06:03:24AM +0900, Andy Lutomirski wrote:
> (It's too bad that there's no unlocked xchg; this could be faster if
> we had one.
There is - just put both operands in registers. :-)
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
--
On Mon, Aug 4, 2014 at 11:28 PM, Denys Vlasenko
wrote:
> On Fri, Aug 1, 2014 at 7:04 PM, Andy Lutomirski wrote:
>> On Fri, Aug 1, 2014 at 7:48 AM, Denys Vlasenko wrote:
>>> 64-bit code was using six stack slots fewer by not saving/restoring
>>> registers which are callee-preserved according to C
On 08/04, Oleg Nesterov wrote:
>
> On 08/04, Denys Vlasenko wrote:
> >
> > "why do we need percpu old_rsp thingy?" in particular.
>
> See thread_struct->usersp, current_user_stack_pointer().
Btw, perhaps it makes sense to document that task_pt_regs(current)->sp
is not current_user_stack_pointer()
On 08/04, Denys Vlasenko wrote:
>
> "why do we need percpu old_rsp thingy?" in particular.
See thread_struct->usersp, current_user_stack_pointer().
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More
On Fri, Aug 1, 2014 at 7:04 PM, Andy Lutomirski wrote:
> On Fri, Aug 1, 2014 at 7:48 AM, Denys Vlasenko wrote:
>> 64-bit code was using six stack slots fewer by not saving/restoring
>> registers which are callee-preserved according to C ABI,
>> and not allocating space for them
>
> This is great.
On Mon, Aug 04, 2014 at 05:03:42AM +0200, Denys Vlasenko wrote:
> I changed this entire area in v2: basically, I will not change the
> logic, but will add comments explaining what are we doing here, and
> why.
Very good idea! This file needs some good commenting.
--
Regards/Gruss,
Boris.
On Mon, Aug 04, 2014 at 05:03:42AM +0200, Denys Vlasenko wrote:
I changed this entire area in v2: basically, I will not change the
logic, but will add comments explaining what are we doing here, and
why.
Very good idea! This file needs some good commenting.
--
Regards/Gruss,
Boris.
Sent
On Fri, Aug 1, 2014 at 7:04 PM, Andy Lutomirski l...@amacapital.net wrote:
On Fri, Aug 1, 2014 at 7:48 AM, Denys Vlasenko dvlas...@redhat.com wrote:
64-bit code was using six stack slots fewer by not saving/restoring
registers which are callee-preserved according to C ABI,
and not allocating
On 08/04, Denys Vlasenko wrote:
"why do we need percpu old_rsp thingy?" in particular.
See thread_struct->usersp, current_user_stack_pointer().
Oleg.
On 08/04, Oleg Nesterov wrote:
On 08/04, Denys Vlasenko wrote:
"why do we need percpu old_rsp thingy?" in particular.
See thread_struct->usersp, current_user_stack_pointer().
Btw, perhaps it makes sense to document that task_pt_regs(current)->sp
is not current_user_stack_pointer() inside
On Mon, Aug 4, 2014 at 11:28 PM, Denys Vlasenko
vda.li...@googlemail.com wrote:
On Fri, Aug 1, 2014 at 7:04 PM, Andy Lutomirski l...@amacapital.net wrote:
On Fri, Aug 1, 2014 at 7:48 AM, Denys Vlasenko dvlas...@redhat.com wrote:
64-bit code was using six stack slots fewer by not
On Tue, Aug 05, 2014 at 06:03:24AM +0900, Andy Lutomirski wrote:
(It's too bad that there's no unlocked xchg; this could be faster if
we had one.
There is - just put both operands in registers. :-)
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
--
On Sat, Aug 2, 2014 at 1:19 AM, Frederic Weisbecker wrote:
> On Fri, Aug 01, 2014 at 04:48:17PM +0200, Denys Vlasenko wrote:
>>
>> /* 0(%rsp): ~(interrupt number) */
>> .macro interrupt func
>> - /* reserve pt_regs for scratch regs and rbp */
>> - subq $ORIG_RAX-RBP, %rsp
>> -
On Sat, Aug 2, 2014 at 1:19 AM, Frederic Weisbecker fweis...@gmail.com wrote:
On Fri, Aug 01, 2014 at 04:48:17PM +0200, Denys Vlasenko wrote:
/* 0(%rsp): ~(interrupt number) */
.macro interrupt func
- /* reserve pt_regs for scratch regs and rbp */
- subq $ORIG_RAX-RBP, %rsp
-
On Sat, Aug 2, 2014 at 2:23 PM, H. Peter Anvin wrote:
> It would be nice to automate running a T-test on it.
I'll see if I can do something like that. Using the t statistic seems
like overkill here, though -- timing_test_64 runs millions of
iterations, so even if I batch them, I'll end up with
It would be nice to automate running a T-test on it.
On August 2, 2014 2:14:50 PM PDT, Andy Lutomirski wrote:
>On Fri, Aug 1, 2014 at 3:13 PM, H. Peter Anvin wrote:
>> On 08/01/2014 03:11 PM, Denys Vlasenko wrote:
Could you please try to see if there is a measurable change in the
On Fri, Aug 1, 2014 at 3:13 PM, H. Peter Anvin wrote:
> On 08/01/2014 03:11 PM, Denys Vlasenko wrote:
>>>
>>> Could you please try to see if there is a measurable change in the
>>> latency of a trivial syscall?
>>
>> Will do.
>> Something along the lines of "how long does it take to execute two
On Fri, Aug 1, 2014 at 3:13 PM, H. Peter Anvin h...@zytor.com wrote:
On 08/01/2014 03:11 PM, Denys Vlasenko wrote:
Could you please try to see if there is a measurable change in the
latency of a trivial syscall?
Will do.
Something along the lines of "how long does it take to execute two
It would be nice to automate running a T-test on it.
On August 2, 2014 2:14:50 PM PDT, Andy Lutomirski l...@amacapital.net wrote:
On Fri, Aug 1, 2014 at 3:13 PM, H. Peter Anvin h...@zytor.com wrote:
On 08/01/2014 03:11 PM, Denys Vlasenko wrote:
Could you please try to see if there is a
On Sat, Aug 2, 2014 at 2:23 PM, H. Peter Anvin h...@zytor.com wrote:
It would be nice to automate running a T-test on it.
I'll see if I can do something like that. Using the t statistic seems
like overkill here, though -- timing_test_64 runs millions of
iterations, so even if I batch them, I'll
On Fri, Aug 01, 2014 at 04:48:17PM +0200, Denys Vlasenko wrote:
>
> /* 0(%rsp): ~(interrupt number) */
> .macro interrupt func
> - /* reserve pt_regs for scratch regs and rbp */
> - subq $ORIG_RAX-RBP, %rsp
> - CFI_ADJUST_CFA_OFFSET ORIG_RAX-RBP
> - cld
> - /* start
On Fri, Aug 01, 2014 at 04:48:17PM +0200, Denys Vlasenko wrote:
> 64-bit code was using six stack slots fewer by not saving/restoring
> registers which are callee-preserved according to C ABI,
> and not allocating space for them.
>
> Only when syscall needed a complete "struct pt_regs",
> the
On 08/01/2014 03:11 PM, Denys Vlasenko wrote:
>>
>> Could you please try to see if there is a measurable change in the
>> latency of a trivial syscall?
>
> Will do.
> Something along the lines of "how long does it take to execute two
> gazillions of getppid()?"
>
Something like that, yes, but
On Fri, Aug 1, 2014 at 8:35 PM, H. Peter Anvin wrote:
> On 08/01/2014 07:48 AM, Denys Vlasenko wrote:
>>
>> Patch was run-tested: 64-bit executables, 32-bit executables,
>> strace works.
>>
>
> Could you please try to see if there is a measurable change in the
> latency of a trivial syscall?
On 08/01/2014 07:48 AM, Denys Vlasenko wrote:
>
> Patch was run-tested: 64-bit executables, 32-bit executables,
> strace works.
>
Could you please try to see if there is a measurable change in the
latency of a trivial syscall?
-hpa
On 08/01, Denys Vlasenko wrote:
>
> This patch changes code to always allocate a complete "struct pt_regs".
> The saving of registers is still done lazily.
I obviously like this change very much. Unfortunately I can only ack the
intent ;)
I really hope that maintainers will take a closer look.
On Fri, Aug 1, 2014 at 7:48 AM, Denys Vlasenko wrote:
> 64-bit code was using six stack slots fewer by not saving/restoring
> registers which are callee-preserved according to C ABI,
> and not allocating space for them.
>
> Only when syscall needed a complete "struct pt_regs",
> the complete area
On Fri, Aug 1, 2014 at 7:48 AM, Denys Vlasenko wrote:
> 64-bit code was using six stack slots fewer by not saving/restoring
> registers which are callee-preserved according to C ABI,
> and not allocating space for them
This is great.
Next up: remove FIXUP/RESTORE_TOP_OF_STACK? :) Maybe I'll give
64-bit code was using six stack slots fewer by not saving/restoring
registers which are callee-preserved according to C ABI,
and not allocating space for them.
Only when syscall needed a complete "struct pt_regs",
the complete area was allocated and filled in.
This proved to be a source of
64-bit code was using six stack slots fewer by not saving/restoring
registers which are callee-preserved according to C ABI,
and not allocating space for them.
Only when syscall needed a complete "struct pt_regs",
the complete area was allocated and filled in.
This proved to be a source of
On Fri, Aug 1, 2014 at 7:48 AM, Denys Vlasenko dvlas...@redhat.com wrote:
64-bit code was using six stack slots fewer by not saving/restoring
registers which are callee-preserved according to C ABI,
and not allocating space for them
This is great.
Next up: remove FIXUP/RESTORE_TOP_OF_STACK? :)
On Fri, Aug 1, 2014 at 7:48 AM, Denys Vlasenko dvlas...@redhat.com wrote:
64-bit code was using six stack slots fewer by not saving/restoring
registers which are callee-preserved according to C ABI,
and not allocating space for them.
Only when syscall needed a complete "struct pt_regs",
the
On 08/01, Denys Vlasenko wrote:
This patch changes code to always allocate a complete "struct pt_regs".
The saving of registers is still done lazily.
I obviously like this change very much. Unfortunately I can only ack the
intent ;)
I really hope that maintainers will take a closer look.
Oleg.
On 08/01/2014 07:48 AM, Denys Vlasenko wrote:
Patch was run-tested: 64-bit executables, 32-bit executables,
strace works.
Could you please try to see if there is a measurable change in the
latency of a trivial syscall?
-hpa
On Fri, Aug 1, 2014 at 8:35 PM, H. Peter Anvin h...@zytor.com wrote:
On 08/01/2014 07:48 AM, Denys Vlasenko wrote:
Patch was run-tested: 64-bit executables, 32-bit executables,
strace works.
Could you please try to see if there is a measurable change in the
latency of a trivial syscall?
On 08/01/2014 03:11 PM, Denys Vlasenko wrote:
Could you please try to see if there is a measurable change in the
latency of a trivial syscall?
Will do.
Something along the lines of "how long does it take to execute two
gazillions of getppid()?"
Something like that, yes, but you have to
On Fri, Aug 01, 2014 at 04:48:17PM +0200, Denys Vlasenko wrote:
64-bit code was using six stack slots fewer by not saving/restoring
registers which are callee-preserved according to C ABI,
and not allocating space for them.
Only when syscall needed a complete "struct pt_regs",
the complete area
On Fri, Aug 01, 2014 at 04:48:17PM +0200, Denys Vlasenko wrote:
/* 0(%rsp): ~(interrupt number) */
.macro interrupt func
- /* reserve pt_regs for scratch regs and rbp */
- subq $ORIG_RAX-RBP, %rsp
- CFI_ADJUST_CFA_OFFSET ORIG_RAX-RBP
- cld
- /* start from rbp