Re: amd64: kernel aslr support

2017-10-06 Thread Maxime Villard

On 06/10/2017 at 08:19, Martin Husemann wrote:

On Thu, Oct 05, 2017 at 12:56:02PM +0200, Maxime Villard wrote:

I don't think it is possible to compile some parts as relocatable and some
others as static. What we could do is compile both the kernel and the prekern
separately, and use objcopy to merge them.


I don't see why this would be a problem.


I didn't say objcopy is a problem, the problem is the bootloader.


Re: amd64: kernel aslr support

2017-10-06 Thread Mouse
>>>> Is live-kernel update more viable with this approach?
>>> Live kernel update is a much more complicated business, [...]
>> [I]t occurs to me that [the "live-kernel update" text] could have
>> been intended to refer not to updating the kernel without disturbing
>> a running system but rather to something more along the lines of
>> [making] the kernel [] play bootloader for a new kernel.
> I think the real point is updating a running system without
> interrupting the services it provides (apache etc).

Probably could be done, but it would need a lot more work than just
this.  As I understand the prekern, it would be of almost no use for
that; most of the pain will, I think, be in migrating a running
userland between kernels (especially if they are different in certain
important ways - if the kernel/user ABI differs, for example, I doubt
anything using the affected portions can migrate).

> There is, I think, little interest in pseudo-reboots via the prekern,
> except wanting a fast reboot - which is still not really useful.

It would be useful to me.  (Well, not in this form; I do not expect to
be using this prekern. I just mean I would really like to see reboots
cut out the time between the old kernel deciding to reboot and the
bootloader loading the new kernel.  The trip back through the BIOS can,
and on some of my machines does, turn what could be a fifteen- or
thirty-second reboot into a multi-minute reboot.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML   mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: amd64: kernel aslr support

2017-10-06 Thread Maxime Villard

On 06/10/2017 at 10:37, Martin Husemann wrote:

I don't get the issue with the bootloader - it just loads one binary
and doesn't care about ASLR, the "prekernel" entry point handles that
later (rearranging any mappings the bootloader might have previously
done afterwards)?


The bootloader loads *two* binaries: the prekern with loadfile_static(), and
the kernel with loadfile_dynamic().

The former is used by all the architectures we support, and it uses the memory
address referenced in the ELF segments. The latter chooses an address (see my
XXX in bootloader.diff), and maps the sections starting from there.

Basically, the problem is that if the prekern and the kernel are merged in one
binary, we're forced to have only one function, and it will be shared by all
architectures. Here, it becomes complicated to implement both the static and
dynamic loading, and it will certainly result in breakages or inconsistencies.

Beyond this bootloader issue, having both dynamic and static sections and
segments in a single binary is not structurally correct from an ELF point of
view.

Maxime
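The two loading policies being contrasted can be sketched in user space as
follows. This is only a model with made-up names, not the actual
loadfile_static()/loadfile_dynamic() code: the point is that static loading
honors p_vaddr directly, while dynamic loading applies a single chosen slide.

```c
#include <stdint.h>

/* Minimal stand-in for an ELF program header; only the fields we need. */
struct phdr {
	uint64_t p_vaddr;	/* link-time virtual address */
	uint64_t p_memsz;	/* size of the segment in memory */
};

/*
 * Static policy (what loadfile_static() amounts to): every PT_LOAD
 * segment lands exactly at the address recorded in the ELF headers.
 */
static uint64_t
load_addr_static(const struct phdr *ph)
{
	return ph->p_vaddr;
}

/*
 * Dynamic policy (what the prekern path needs): the loader picks a
 * base address -- randomly, for ASLR -- and slides every segment by
 * the same amount, preserving the layout relative to the lowest
 * p_vaddr in the binary.
 */
static uint64_t
load_addr_dynamic(const struct phdr *ph, uint64_t minva, uint64_t base)
{
	return base + (ph->p_vaddr - minva);
}
```

A merged prekern+kernel binary would require one loader function to apply
both policies to different segments of the same file, which is the
inconsistency described above.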


Re: amd64: kernel aslr support

2017-10-06 Thread Martin Husemann
On Fri, Oct 06, 2017 at 10:32:01AM +0200, Maxime Villard wrote:
> On 06/10/2017 at 08:19, Martin Husemann wrote:
> > On Thu, Oct 05, 2017 at 12:56:02PM +0200, Maxime Villard wrote:
> > > I don't think it is possible to compile some parts as relocatable and some
> > > others as static. What we could do is compile both the kernel and the
> > > prekern separately, and use objcopy to merge them.
> > 
> > I don't see why this would be a problem.
> 
> I didn't say objcopy is a problem, the problem is the bootloader.

I don't even think objcopy would be needed, but the devil is always in the
details, and I didn't try.

I don't get the issue with the bootloader - it just loads one binary
and doesn't care about ASLR, the "prekernel" entry point handles that
later (rearranging any mappings the bootloader might have previously
done afterwards)?

Martin


Re: amd64: kernel aslr support

2017-10-06 Thread Maxime Villard

On 06/10/2017 at 02:30, Thor Lancelot Simon wrote:

  * The RNG is not really strong. Help in this area would be greatly
appreciated.


This is tricky mostly because once you start probing for hardware
devices or even CPU features, you're going to find yourself wanting
more and more of the support you'd get from the "real kernel".

For example, to probe for RDRAND support on the CPU, you need a
whole pile of CPU feature decoding.  To probe for environmental
sensors or an audio device you may need to know a whole pile about
ACPI and/or PCI.  And so forth.

EFI has an RNG API, but I think it's usually just stubbed out and
besides, you can't rely on having EFI...

I think I'd suggest some combination of:

* Just enough CPU-feature support to find/use RDRAND
  (Intel's sample code is not that big and I think it's
   suitably-licensed)

* Hash the contents of the "CMOS RAM" and/or EFI boot variables

* Maybe poke around for an IPMI BMC (has environmental sensors),
  or a TPM (has a RNG) on the LPC bus

* Maybe poke around for on-die temperature/voltage sensors
  (will again require some CPU identification support).

* Rather than just using rdtsc once, consider using rdtsc to
  "time" multiple hardware oscillators against one another;
  at the very least, you've always got the hardware clock.

* Also, you can use rdtsc to time memory accesses.

For quick and dirty "entropy extraction", you can crunch as much of this
data as you're able to connect together using SHA512.


As you said, all of this requires some heavy identification code. Besides, it
will be complicated on systems that don't have these features (I don't have
RDRAND for example).

Initially, I was thinking more about extending the rndsave_t structure with
fields dedicated exclusively to the prekern. The bootloader gives this
structure to the kernel for early entropy (BI_MODULE_RND), generated from the
previous run; the kernel could easily add a random uint32_t to it, which the
prekern would use right away.
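That idea could look roughly like this. rndsave_t and BI_MODULE_RND are real
NetBSD names, but the layout below is a stub and the added field is purely
hypothetical:

```c
#include <stdint.h>

/*
 * Stub for the saved-entropy blob the bootloader passes to the kernel
 * via BI_MODULE_RND; the real rndsave_t layout is not reproduced here.
 */
typedef struct {
	uint8_t data[512];
} rndsave_t;

/*
 * Hypothetical extension: the previous kernel writes a fresh random
 * value here before saving, and the prekern consumes it right away to
 * choose the next kernel's load address.
 */
struct rndsave_prekern {
	rndsave_t rs;		/* existing structure, unchanged */
	uint32_t prekern_seed;	/* entropy dedicated to the prekern */
};
```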

Maxime


Re: RFC: PERCPU_LIST to fix PR kern/52515

2017-10-06 Thread Ryota Ozaki
On Fri, Oct 6, 2017 at 4:24 PM, Ryota Ozaki  wrote:
> On Fri, Oct 6, 2017 at 1:14 PM, Ryota Ozaki  wrote:
>> On Fri, Oct 6, 2017 at 11:58 AM, Taylor R Campbell
>>  wrote:
>>>> Date: Fri, 6 Oct 2017 11:26:40 +0900
>>>> From: Ryota Ozaki 
>>>>
>>>> On Mon, Oct 2, 2017 at 11:13 PM, Taylor R Campbell
>>>>  wrote:
>>>> > Quick summary of the problem:
>>>> >
>>>> > Possible solutions.  I'm leaning toward (6), to open-code the linked
>>>> > list operations for this special purpose, with compile-tested patch
>>>> > attached.  This changes the text of psref.h, but shouldn't change the
>>>> > ABI.  Comments?
>>>>
>>>> How about using SLIST instead of open-coding? The generated instructions
>>>> are very similar, but the SLIST version is slightly simpler.
>>>
>>> I avoided that because n psref_release operations take expected and
>>> worst-case O(n^2) time, and there's no constant bound on the latency of
>>> a single psref_release operation.  But maybe n is always small enough
>>> that it doesn't matter -- perhaps enough that the concrete cost of
>>> maintaining a doubly-linked list is higher.
>>
>> I also suppose that a target being released is at the head of the list,
>> because targets of psref are typically manipulated in a last-in-first-out
>> manner. But yes, psref_release of the SLIST version can theoretically be
>> O(n) in the worst case.
>>
>> Well, when I think about this kind of problem I tend to think of our
>> use cases, i.e., network processing as a router. In such cases, psref
>> is used in softint context and the last-in-first-out manner would hold
>> in many cases. But in other cases, such as uses in threads, the
>> assumption perhaps does not really hold.
>>
>>>
>>> (My desire to avoid thinking about bounds on n is also what motivated
>>> me to use a linked list instead of an array in the first place.)
>>>
>>> Note that your patch changes the ABI of struct psref!
>>
>> Let's bump the kernel version!
>>
>>>
>>> I wonder whether the open-coded version would do better if it
>>> unconditionally loaded the percpu:
>>>
>>> pcpu = percpu_getref(class->prc_percpu);
>>> KASSERTMSG(psref->psref_prevp == NULL || *psref->psref_prevp == psref,
>>>     "psref %p prevp %p points at %p",
>>>     psref, psref->psref_prevp, *psref->psref_prevp);
>>> KASSERTMSG(psref->psref_prevp != NULL || pcpu->pcpu_first == psref,
>>>     "psref %p marked as first but psref_cpu %p on %d first is %p",
>>>     psref, pcpu, cpu_index(curcpu()), pcpu->pcpu_first);
>>> *(psref->psref_prevp ? psref->psref_prevp : &pcpu->pcpu_first) =
>>>     psref->psref_next;
>>> percpu_putref(class->prc_percpu);
>>>
>>> With DIAGNOSTIC disabled, I get a conditional move instruction instead
>>> of branches this way:
>>>
>>>  4f9:   e8 00 00 00 00          callq  4fe
>>>                  4fa: R_X86_64_PC32     percpu_getref+0xfffc
>>>  4fe:   48 8b 53 08             mov    0x8(%rbx),%rdx
>>>  502:   48 85 d2                test   %rdx,%rdx
>>>  505:   48 0f 44 d0             cmove  %rax,%rdx
>>>  509:   48 8b 03                mov    (%rbx),%rax
>>>  50c:   48 89 02                mov    %rax,(%rdx)
>>>  50f:   49 8b 7c 24 20          mov    0x20(%r12),%rdi
>>>  514:   e8 00 00 00 00          callq  519
>>>                  515: R_X86_64_PC32     percpu_putref+0xfffc
>>>
>>> Also, my original patch was missing a percpu_putref.  I hope you put
>>> it back before you ran your test!
>>
>> I'll test with the patch in the other mail later.
>
> The above results were actually measured by knakahara@ and
> this time I've measured them myself (the measurement hardware
> is the same, and the kernel configs of his and mine should
> be almost the same, though).

I noticed a difference between his setup and mine: the check-out date of
the source code. Mine was checked out today, while his was probably from
some days ago.

>
> ...so the results show a slightly different trend:
>   - original:  137.9 138.0 (144.7) Mbps
>   - open-code: 135.2 134.6 (138.5) Mbps
>   - SLIST: 140.7 141.4 (140.1) Mbps

...though these results are still puzzling.

  ozaki-r

>
> I think knakahara@ (or someone else) needs to re-evaluate :-/
>
>   ozaki-r
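The trade-off discussed in this thread shows up in a small user-space model of
option (6)'s open-coded list: keeping a back-pointer to the incoming link
(prevp) makes removal O(1) regardless of position, which is exactly what
SLIST's O(n) removal cannot offer. The names are illustrative, not the real
psref(9) ones; the removal store mirrors the conditional-move-friendly form
quoted above.

```c
#include <stddef.h>

/*
 * Each node records the address of the pointer that points at it:
 * &list->first for the head (modeled as prevp == NULL, matching the
 * quoted code), or &predecessor->next otherwise.
 */
struct node {
	struct node *next;
	struct node **prevp;	/* NULL means "I am the head" */
};

struct list {
	struct node *first;
};

static void
list_insert_head(struct list *l, struct node *n)
{
	n->next = l->first;
	n->prevp = NULL;
	if (l->first != NULL)
		l->first->prevp = &n->next;
	l->first = n;
}

static void
list_remove(struct list *l, struct node *n)
{
	/* One store through a selected pointer; compiles to a cmove. */
	*(n->prevp != NULL ? n->prevp : &l->first) = n->next;
	if (n->next != NULL)
		n->next->prevp = n->prevp;
}
```

With SLIST, removing a node not at the head requires walking the list to find
its predecessor, which is where the expected and worst-case O(n^2) cost for n
releases comes from.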


Re: RFC: PERCPU_LIST to fix PR kern/52515

2017-10-06 Thread Manuel Bouyer
On Fri, Oct 06, 2017 at 04:21:48AM +, Taylor R Campbell wrote:
> > Date: Fri, 6 Oct 2017 13:14:14 +0900
> > From: Ryota Ozaki 
> > 
> > On Fri, Oct 6, 2017 at 11:58 AM, Taylor R Campbell
> >  wrote:
> > > Note that your patch changes the ABI of struct psref!
> > 
> > Let's bump the kernel version!
> 
> If we want to pull the fix up to netbsd-8 (which we almost certainly
> do), then we should really avoid changing the ABI.  I forget what the
> rules are for pullups to pre-release branches, but changing the ABI is
> a big no-no post-release.

I think it's OK until the first release is tagged.
We already had a userland ABI change in netbsd-8.

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--


Re: RFC: PERCPU_LIST to fix PR kern/52515

2017-10-06 Thread Ryota Ozaki
On Fri, Oct 6, 2017 at 1:14 PM, Ryota Ozaki  wrote:
> On Fri, Oct 6, 2017 at 11:58 AM, Taylor R Campbell
>  wrote:
>>> Date: Fri, 6 Oct 2017 11:26:40 +0900
>>> From: Ryota Ozaki 
>>>
>>> On Mon, Oct 2, 2017 at 11:13 PM, Taylor R Campbell
>>>  wrote:
>>> > Quick summary of the problem:
>>> >
>>> > Possible solutions.  I'm leaning toward (6), to open-code the linked
>>> > list operations for this special purpose, with compile-tested patch
>>> > attached.  This changes the text of psref.h, but shouldn't change the
>>> > ABI.  Comments?
>>>
>>> How about using SLIST instead of open-coding? The generated instructions
>>> are very similar, but the SLIST version is slightly simpler.
>>
>> I avoided that because n psref_release operations take expected and
>> worst-case O(n^2) time, and there's no constant bound on the latency of
>> a single psref_release operation.  But maybe n is always small enough
>> that it doesn't matter -- perhaps enough that the concrete cost of
>> maintaining a doubly-linked list is higher.
>
> I also suppose that a target being released is at the head of the list,
> because targets of psref are typically manipulated in a last-in-first-out
> manner. But yes, psref_release of the SLIST version can theoretically be
> O(n) in the worst case.
>
> Well, when I think about this kind of problem I tend to think of our
> use cases, i.e., network processing as a router. In such cases, psref
> is used in softint context and the last-in-first-out manner would hold
> in many cases. But in other cases, such as uses in threads, the
> assumption perhaps does not really hold.
>
>>
>> (My desire to avoid thinking about bounds on n is also what motivated
>> me to use a linked list instead of an array in the first place.)
>>
>> Note that your patch changes the ABI of struct psref!
>
> Let's bump the kernel version!
>
>>
>> I wonder whether the open-coded version would do better if it
>> unconditionally loaded the percpu:
>>
>> pcpu = percpu_getref(class->prc_percpu);
>> KASSERTMSG(psref->psref_prevp == NULL || *psref->psref_prevp == psref,
>>     "psref %p prevp %p points at %p",
>>     psref, psref->psref_prevp, *psref->psref_prevp);
>> KASSERTMSG(psref->psref_prevp != NULL || pcpu->pcpu_first == psref,
>>     "psref %p marked as first but psref_cpu %p on %d first is %p",
>>     psref, pcpu, cpu_index(curcpu()), pcpu->pcpu_first);
>> *(psref->psref_prevp ? psref->psref_prevp : &pcpu->pcpu_first) =
>>     psref->psref_next;
>> percpu_putref(class->prc_percpu);
>>
>> With DIAGNOSTIC disabled, I get a conditional move instruction instead
>> of branches this way:
>>
>>  4f9:   e8 00 00 00 00          callq  4fe
>>                  4fa: R_X86_64_PC32     percpu_getref+0xfffc
>>  4fe:   48 8b 53 08             mov    0x8(%rbx),%rdx
>>  502:   48 85 d2                test   %rdx,%rdx
>>  505:   48 0f 44 d0             cmove  %rax,%rdx
>>  509:   48 8b 03                mov    (%rbx),%rax
>>  50c:   48 89 02                mov    %rax,(%rdx)
>>  50f:   49 8b 7c 24 20          mov    0x20(%r12),%rdi
>>  514:   e8 00 00 00 00          callq  519
>>                  515: R_X86_64_PC32     percpu_putref+0xfffc
>>
>> Also, my original patch was missing a percpu_putref.  I hope you put
>> it back before you ran your test!
>
> I'll test with the patch in the other mail later.

The above results were actually measured by knakahara@ and
this time I've measured them myself (the measurement hardware
is the same, and the kernel configs of his and mine should
be almost the same, though).

...so the results show a slightly different trend:
  - original:  137.9 138.0 (144.7) Mbps
  - open-code: 135.2 134.6 (138.5) Mbps
  - SLIST: 140.7 141.4 (140.1) Mbps

I think knakahara@ (or someone else) needs to re-evaluate :-/

  ozaki-r


Re: amd64: kernel aslr support

2017-10-06 Thread Maxime Villard

On 05/10/2017 at 17:39, Mouse wrote:

Is live-kernel update more viable with this approach?

Live kernel update is a much more complicated business, [...]


I didn't write the bit about live-kernel updates, but it occurs to
me that it could have been intended to refer not to updating the kernel
without disturbing a running system but rather to something more along
the lines of...kexec(), I think it was?  Basically, make the kernel -
or, here, the prekern - play bootloader for a new kernel.  Reboot in
most senses without actually dropping back to the hardware's BIOS (or
moral equivalent, for non-peecees).


I think the real point is updating a running system without interrupting the
services it provides (apache etc). There is, I think, little interest in
pseudo-reboots via the prekern, except wanting a fast reboot - which is still
not really useful.

Maxime