subject:"CVS commit\: src\/sys\/arch"

Re: MAX_PAGE_SIZE for m68k (Re: CVS commit: src/sys/arch/arm/include/arm32)

2020-01-13 Thread Jason Thorpe

> On Jan 13, 2020, at 10:24 AM, Christos Zoulas  wrote:
> 
> Talking to myself:
> 
> The arm PAGE_SIZE_{MIN,MAX} should go away after nick eliminates the
> need for the 8K pages. This leaves us with m68k to deal with...
> Do modules work on m68k? Should modules be shared between kernels with
> different page sizes? Then perhaps we don't need a new constant?

On m68k, I think the following two statements are true:

a) The platform should use a constant PAGE_SIZE to the extent possible because 
it's a slow platform.

b) Modules should be built such that they can use a non-fixed PAGE_SIZE.

But (b) also requires that all of the OTHER non-same constants on m68k are 
avoided (take a closer look at  for example).  Those 
probably should be properly hidden from module builds so that we can at least 
*catch* such cases.

-- thorpej

Re: MAX_PAGE_SIZE for m68k (Re: CVS commit: src/sys/arch/arm/include/arm32)

2020-01-13 Thread Christos Zoulas

On Jan 14,  1:15am, tsut...@ceres.dti.ne.jp (Izumi Tsutsui) wrote:
-- Subject: Re: MAX_PAGE_SIZE for m68k (Re: CVS commit: src/sys/arch/arm/incl

| christos@ wrote:
| 
| > >Now I get the following erro during local tests of
| > >"build.sh -U -m hp300 release" on NetBSD/i386 9.0_RC1 host:
| > >
| > >---
| > >#create  compat_util/compat_exec.d
|  :
| > >In file included from /s/cvs/src/sys/sys/param.h:149:0,
| > > from /s/cvs/src/sys/compat/common/compat_exec.c:41:
| > >./m68k/pmap_motorola.h:165:5: error: operator '*' has no left operand
| > > #if PAGE_SIZE == 8192 /* NBPG / (SG4_LEV1SIZE * sizeof(st_entry_t)) */
| > > ^
| > >nbmkdep: compile failed.
| > >*** [compat_exec.d] Error code 1
| > 
| > try cc -E?
| 
| It turns out the problem is more complicated.
| 
|  has the following definitions:
| 
| https://nxr.netbsd.org/xref/src/sys/uvm/uvm_param.h?r=1.38#135
| ---
| 135  * If MIN_PAGE_SIZE and MAX_PAGE_SIZE are not equal, then we must use
| 136  * non-constant PAGE_SIZE, et al for LKMs.
| 137  */
| 138 #if (MIN_PAGE_SIZE != MAX_PAGE_SIZE)
| 139 #define   __uvmexp_pagesize
| 140 #if defined(_LKM) || defined(_MODULE)
| 141 #undef PAGE_SIZE
| 142 #undef PAGE_MASK
| 143 #undef PAGE_SHIFT
| 144 #endif
| 145 #endif
| 146 
| 147 /*
| 148  * Now provide PAGE_SIZE, PAGE_MASK, and PAGE_SHIFT if we do not
| 149  * have ones that are compile-time constants.
| 150  */
| 151 #if !defined(PAGE_SIZE)
| 152 extern const int *const uvmexp_pagesize;
| 153 extern const int *const uvmexp_pagemask;
| 154 extern const int *const uvmexp_pageshift;
| 155 #define   PAGE_SIZE   (*uvmexp_pagesize)  /* size of page 
*/
| 156 #define   PAGE_MASK   (*uvmexp_pagemask)  /* size of page 
- 1 */
| 157 #define   PAGE_SHIFT  (*uvmexp_pageshift) /* bits to 
shift for pages */
| 158 #endif /* PAGE_SIZE */
| ---
| 
| I.e.  assumes PAGE_SIZE is not compile time constant
| for MODULEs if (MIN_PAGE_SIZE != MAX_PAGE_SIZE).
| 
| Probably this is the same reason of recent arm build failures:
|  https://releng.netbsd.org/builds/HEAD/202001130720Z/
|  https://releng.netbsd.org/builds/HEAD/202001130720Z/evbarm-earm.build.failed
| ---
| /tmp/genassym.Lq8h9d/assym.c:57:1: error: asm operand 0 probably doesn't 
match constraints [-Werror]
|  __asm("XYZZY VM_MIN_ADDRESS %0" : : "n" (VM_MIN_ADDRESS));
|  ^
| /tmp/genassym.Lq8h9d/assym.c:58:1: error: asm operand 0 probably doesn't 
match constraints [-Werror]
|  __asm("XYZZY VM_MAXUSER_ADDRESS %0" : : "n" (VM_MAXUSER_ADDRESS));
|  ^
| /tmp/genassym.Lq8h9d/assym.c:97:1: error: asm operand 0 probably doesn't 
match constraints [-Werror]
|  __asm("XYZZY PAGE_MASK %0" : : "n" (PAGE_MASK));
|  ^
| /tmp/genassym.Lq8h9d/assym.c:98:1: error: asm operand 0 probably doesn't 
match constraints [-Werror]
|  __asm("XYZZY PAGE_SIZE %0" : : "n" (PAGE_SIZE));
|  ^
| ---
| 
| Should we prepare independent constant for
| "possible pagesize value among different MACHINE with the same MACHINE_ARCH"
| for jemalloc(3)?

Ouch, this hurts. Sure, perhaps MALLOC_PAGE_SIZE?

christos

Re: CVS commit: src/sys/arch/x86/x86

2018-07-10 Thread Kamil Rytarowski

On 08.07.2018 17:44, Kamil Rytarowski wrote:
> I will try to scratch a new header unaligned.h with the set of macros
> and submit it to evaluation.

I've prepared a scratch of unaligned.h with get_unaligned():

http://netbsd.org/~kamil/kubsan/unaligned.h

There are at least two problems to proceed:

1. GCC 8.x is required for no_sanitizer attributes

https://gcc.gnu.org/gcc-8/changes.html
This version will also ship with NetBSD code for sanitization.

The basesystem GCC version in HEAD (v. 6.4.0) is too old.

2. get_unaligned() is a fundamental type oriented (char, int, long etc)

A large part of the issues detected in the kernel are due to a
misaligned pointer to a struct passed (like disklabel or in6_addr).

I think that these cases shall to be addressed directly in the kernel
code and treated as buggy.

I'm deferring right now the work on unaligned.h and wait for a required
minimum version of GCC in the base. I will keep KUBSan reports as
non-fatals for now.

signature.asc
Description: OpenPGP digital signature

Re: CVS commit: src/sys/arch/x86/x86

2018-07-09 Thread Kamil Rytarowski

On 09.07.2018 16:58, Thor Lancelot Simon wrote:
> On Mon, Jul 09, 2018 at 12:24:15PM +0200, Kamil Rytarowski wrote:
>>
>> The C11 standard could indeed use consistent wording. In one place
>> "correctly aligned" in other alignment "restrictions" and
>> "requirements". None of these terms is marked as a keyword or term and I
>> read them in the context of the document as the same phenomenon (I
>> haven't found a different interpretation of this in the wild).
> 
> Right, but, architecturally, x86 doesn't have these "restrictions" or
> "requirements".  Not for correctness, not with the overwhelming majority
> of integer instructions.  Only (sometimes) for performance.
> 

I have an experience of boosting an application processing packets, that
moving members of a struct boosted the performance of an application by
20%. In my case it was not misalignment, but introduction of gaps into
structs and not fitting them into CPU cache.

> Which makes relying on the standard's language about unaligned access
> incorrect.  There's no undefined behavior here because there can't be;
> strict alignment is not a feature of the target architecture.
> 

I think that is interpretation of undefined behavior in C to what will
happen on x86. Kernel UBSan checks the former and treats it as a bug.
With some assumptions we can predict what will happen in the latter
case... without the assumptions we aren't safe on x86 either (like the
EFLAGS bit state)

> I understand we have some code that's on the borderline between being MI
> and MD, or that's "MD but for several different architectures".  ACPICA
> is an obvious example.  And that such code may legitimately be compiled for
> targets where unaligned access is not architecturally valid, and compilers
> are free to do odd things with code that tries to do it.  But the right
> thing to do, I think, is to acknowledge that such code is MI; not to, by
> misguided policy, insist that _all_ "MD" code should be written as if it
> were MI.
> 

put_misaligned()/get_misaligned() will handle these special cases,
without changing algorithms in the code (already used in DRMKMS).

In my test-case there are 7 subsystems or kernel features that trigger
KUBSan. Access will be marked with proper compiler decoration and the
problem will be gone.

If we will detect that in some scenario during fuzzing we can trigger
misaligned read in e.g. msdos fs, we will know why reading the same  the
code breaks on RPi.

> Because if you head down that road, we're going to be forced to
> write a bunch more stuff in assembler that we'd figured out over the decades
> how to write in a performant and readable way in C; and I don't think anyone
> benefits from having more asm in our kernel.
> 

There is no danger of any major rewrite of anything, mostly small
patches here and there. Also we already down to misaligned access reports.

> Thor
> 

signature.asc
Description: OpenPGP digital signature

Re: CVS commit: src/sys/arch/x86/x86

2018-07-09 Thread Christos Zoulas

In article <20180709145848.ga21...@panix.com>,
Thor Lancelot Simon   wrote:
>On Mon, Jul 09, 2018 at 12:24:15PM +0200, Kamil Rytarowski wrote:
>> 
>> The C11 standard could indeed use consistent wording. In one place
>> "correctly aligned" in other alignment "restrictions" and
>> "requirements". None of these terms is marked as a keyword or term and I
>> read them in the context of the document as the same phenomenon (I
>> haven't found a different interpretation of this in the wild).
>
>Right, but, architecturally, x86 doesn't have these "restrictions" or
>"requirements".  Not for correctness, not with the overwhelming majority
>of integer instructions.  Only (sometimes) for performance.

Unless you change one bit in the PSL and then the above is wrong:

#define PSL_AC  0x0004  /* alignment check flag */

christos

Re: CVS commit: src/sys/arch/x86/x86

2018-07-09 Thread Thor Lancelot Simon

On Mon, Jul 09, 2018 at 12:24:15PM +0200, Kamil Rytarowski wrote:
> 
> The C11 standard could indeed use consistent wording. In one place
> "correctly aligned" in other alignment "restrictions" and
> "requirements". None of these terms is marked as a keyword or term and I
> read them in the context of the document as the same phenomenon (I
> haven't found a different interpretation of this in the wild).

Right, but, architecturally, x86 doesn't have these "restrictions" or
"requirements".  Not for correctness, not with the overwhelming majority
of integer instructions.  Only (sometimes) for performance.

Which makes relying on the standard's language about unaligned access
incorrect.  There's no undefined behavior here because there can't be;
strict alignment is not a feature of the target architecture.

That means that a compiler which emits broken code if it encounters such
a pointer is the broken thing -- *not the source code* -- and whanging
around x86 MD code in our tree to accomodate such a compiler would be
broken too.

I understand we have some code that's on the borderline between being MI
and MD, or that's "MD but for several different architectures".  ACPICA
is an obvious example.  And that such code may legitimately be compiled for
targets where unaligned access is not architecturally valid, and compilers
are free to do odd things with code that tries to do it.  But the right
thing to do, I think, is to acknowledge that such code is MI; not to, by
misguided policy, insist that _all_ "MD" code should be written as if it
were MI.

Because if you head down that road, we're going to be forced to
write a bunch more stuff in assembler that we'd figured out over the decades
how to write in a performant and readable way in C; and I don't think anyone
benefits from having more asm in our kernel.

Thor

Re: CVS commit: src/sys/arch/x86/x86

2018-07-09 Thread Kamil Rytarowski

On 09.07.2018 11:32, Martin Husemann wrote:
> On Mon, Jul 09, 2018 at 11:00:15AM +0200, Kamil Rytarowski wrote:
>> According to my understanding, alignment requirement for a type/object
>> is implementation defined (6.2.8); however during the process of
>> converting types, if the returned pointer is not correctly aligned the
>> result is undefined behavior (6.3.2.3 p7).
> 
> My point is: I see no connection in the standard between "correctly aligned"
> (in 6.3.2.3 p7) and the implementation defined "Alignment of objects" (as
> measured by _Alginof). There are various types of alignments defined in
> 6.2.8 but no "correct" alignment.
> 
> Martin
> 

The C11 standard could indeed use consistent wording. In one place
"correctly aligned" in other alignment "restrictions" and
"requirements". None of these terms is marked as a keyword or term and I
read them in the context of the document as the same phenomenon (I
haven't found a different interpretation of this in the wild).

C++11 specification document uses more consistently "alignment
requirement" wording (3.11), however there is other wording in use
"proper alignment", "suitable aligned", "aligned appropriately" to
describe the same phenomenon in my interpretation.

signature.asc
Description: OpenPGP digital signature

Re: CVS commit: src/sys/arch/x86/x86

2018-07-09 Thread Martin Husemann

On Mon, Jul 09, 2018 at 11:00:15AM +0200, Kamil Rytarowski wrote:
> According to my understanding, alignment requirement for a type/object
> is implementation defined (6.2.8); however during the process of
> converting types, if the returned pointer is not correctly aligned the
> result is undefined behavior (6.3.2.3 p7).

My point is: I see no connection in the standard between "correctly aligned"
(in 6.3.2.3 p7) and the implementation defined "Alignment of objects" (as
measured by _Alginof). There are various types of alignments defined in
6.2.8 but no "correct" alignment.

Martin

Re: CVS commit: src/sys/arch/x86/x86

2018-07-09 Thread Kamil Rytarowski

On 09.07.2018 10:09, Martin Husemann wrote:
> On Sun, Jul 08, 2018 at 03:30:36PM +0200, Kamil Rytarowski wrote:
>> Misaligned pointer is explicitly documented as undefined behavior in the
>> C standard (C11 6.3.2.3 p7). (In C++ it's basically the same.)
> 
> Yes, but the standard dos not define what a misaligned pointer is
> (or "correctly aligned").
> 
> It talks about the prefered alignment that implementations select and
> that you can query with _Alignof - but there is no connection of that
> value to the section you refer. Some implementations require minimal
> alignement like their _Alignof returns, for others "correctly aligned" is
> always true.
> 
> I might be missing something though.
> 
> Martin
> 

According to my understanding, alignment requirement for a type/object
is implementation defined (6.2.8); however during the process of
converting types, if the returned pointer is not correctly aligned the
result is undefined behavior (6.3.2.3 p7).

int16_t (short for LP64) and int8_t (char) have different alignment
requirements on NetBSD/amd64. char has the weakest one (6.2.8.6).

$ cat test.c
#include 

int
main(int argc, char **argv)
{
printf("alignof(char)=%zu\n", __alignof__(char));
printf("alignof(short)=%zu\n", __alignof__(short));

return 0;
}

$ ./a.out
alignof(char)=1
alignof(short)=2

If one alignment is larger than the other one, it's called stricter one
(6.2.8.7).

This means that mpbios_page (int8_t*) has weaker alignment than memtop
(int16_t*) and this cast is undefined behavior, because the returned
pointer is not correctly aligned for the destination type (as
&mpbios_page[0x413] violates it).

signature.asc
Description: OpenPGP digital signature

Re: CVS commit: src/sys/arch/x86/x86

2018-07-09 Thread Martin Husemann

On Sun, Jul 08, 2018 at 03:30:36PM +0200, Kamil Rytarowski wrote:
> Misaligned pointer is explicitly documented as undefined behavior in the
> C standard (C11 6.3.2.3 p7). (In C++ it's basically the same.)

Yes, but the standard dos not define what a misaligned pointer is
(or "correctly aligned").

It talks about the prefered alignment that implementations select and
that you can query with _Alignof - but there is no connection of that
value to the section you refer. Some implementations require minimal
alignement like their _Alignof returns, for others "correctly aligned" is
always true.

I might be missing something though.

Martin

Re: CVS commit: src/sys/arch/x86/x86

2018-07-08 Thread Mouse

>>> Misaligned pointer is explicitly documented as undefined behavior
>>> in the C standard (C11 6.3.2.3 p7).
>> So what?  Do you have reason to think that sys/arch/x86 will at some
>> point be ported to a compiler [] that does something unexpected with
>> that code?
> Due to UB a compiler is free to perform optimization and treat x86
> like a RISC architecture...

Indeed.  But a compiler is also free to not do so.

Writing machine-dependent code in something other than assembler is
dependent on having a compiler that produces the desired result.  A
compiler that doesn't, whether because it's broken or because it takes
advantage of latitude in the languageg spec, simply is not suitable for
that use.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

Re: CVS commit: src/sys/arch/x86/x86

2018-07-08 Thread Kamil Rytarowski

On 08.07.2018 17:30, Mouse wrote:
> Caveat: this is all opinion.  I'm not the one doing the work here.
> 
> src/sys/arch/x86/x86: mpbios.c
> 
> Remove unaligned access to mpbios_page[]
> 
> sys/arch/x86/x86/mpbios.c:308:11, load of misaligned address 
> 0x800031c7a413 for type 'const __uint16_t' which requires 2 byte 
> alignment
> 
>>> Can we please do NOT do such stupid changes?
> 
>> Kernel Undefined Behavior Sanitizer detects various portability
>> issues including alignment.
> 
> Portability issues are, in general, not issues when they are in code
> that is inherently already nonportable - such as almost everything
> under sys/arch/.
> 
>> Misaligned pointer is explicitly documented as undefined behavior in
>> the C standard (C11 6.3.2.3 p7).
> 
> So what?  Do you have reason to think that sys/arch/x86 will at some
> point be ported to a compiler (I would say "and architecture", but it's
> already tightly bound to a very small set of CPU architectures) that
> does something unexpected with that code?  Expecting the MD code in the
> low levels of an OS to never produce formally implementation-defined or
> undefined behaviour is a fool's dream.
> 
> Programs such as undefined-behaviour detectors are tools to serve us,
> not shackles to bind us.  Intelligence should be applied when using
> their results, including not expecting portability from inherently
> nonportable code.
> 

Due to UB a compiler is free to perform optimization and treat x86 like
a RISC architecture... especially with more instruction sets like SSE
(as noted by Christos).

x86 sanitizers experienced alignment trouble too. MOVDQA used to break
sanitizers in the interceptor of __tls_get_addr().

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

I will try to propose a macro as noted in a reply to Jason Thorpe.



signature.asc
Description: OpenPGP digital signature

Re: CVS commit: src/sys/arch/x86/x86

2018-07-08 Thread Kamil Rytarowski

On 08.07.2018 17:16, Jason Thorpe wrote:
> 
> 
>> On Jul 8, 2018, at 6:30 AM, Kamil Rytarowski  wrote:
>>
>> In future __NO_STRICT_ALIGNMENT could be defined for aarch64, at least
>> for the use of acpica (which still contains a fallback for Itanium
>> without misaligned access, but not actively maintained).
>>
>> Linux uses a different approach and ships get_unaligned() and
>> set_unaligned() macros and implements it on per ABI and CPU-basis.
> 
> In general, I get the utility of UBSan, but for these unaligned access 
> issues, an ad hoc approach like what was done in mpbios is the wrong way to 
> go.
> 
> Here's my $0.02:
> 
> -- Define a set of standard unaligned-accessors/mutators.  I propose:
> 
>   uint16_t__unaligned_load16(const uint16_t *);
>   uint32_t__unaligned_load32(const uint32_t *);
>   uint64_t__unaligned_load64(const uint64_t *);
> 
>   void__unaligned_store16(uint16_t *, uint16_t);
>   void__unaligned_store32(uint32_t *, uint32_t);
>   void__unaligned_store64(uint64_t *, uint64_t);
> 
> ...and maybe you need to have another set for the signed value flavor, dunno. 
>  (I guess probably, because you want to preserve the type information as much 
> as possible ... no casting.)
> 
> -- Implement them as static inlines in a suitable system header 
> (, maybe, although these types are  types, yah?  So, 
> adjust the types to __uint16_t or whatever as necessary).  Decorate only 
> these static inlines with the "don't sanitize me" directives.  Implementing 
> them as inline functions rather than macros has 2 benefits: avoids unintended 
> multiple-evaluation of arguments, allows the smallest possible "don't 
> sanitize" footprint.
> 
> -- Implement 2 standard sets, one for __NO_STRICT_ALIGNMENT platforms, and 
> one for the strict-alignment case.
> 

I did something similar once upon a time when I was working on
libo(verflow).

https://github.com/krytarowski/libo/blob/nbsd/overflow.h

There is a compiler switch for all C types: char, signed char, short,
int, long, long long, unsigned char, unsigned short, unsigned int,
unsigned long, unsigned long long.

However at that point of time, C11 (_Generic) wasn't adopted so widely
and pcc was in more active development so I've deferred the libo library
for future. Today a compiler without C11 is not an issue.

I can reuse the same idea and template (no licensing issues) for
unaligned accessor and I will propose a patch.

BTW. we are using a subset of the get_unaligned() feature macro in
DRMKMS -- sys/external/bsd/drm2/include/asm/unaligned.h this file is
even (c) TNF.

I will try to scratch a new header unaligned.h with the set of macros
and submit it to evaluation.

> -- For changes like what would be needed in acpica, GET INPUT FROM THE 
> UPSTREAM as to how to integrate use of these macros, and preferably push the 
> changes to the upstream FIRST so that we can simply import them in a regular 
> update.
> 

The discussion about the ACPICA case is ongoing:

https://github.com/acpica/acpica/issues/393
"NetBSD Kernel Undefined Behavior Sanitizer: acpica reports"

It will probably end up as a new macro for decoration of a function.

Harry (luserx0) is working on it as a GSoC (sub)task.

I intend to import the fix for acpica together with a new version of
this software.

> -- thorpej
> 

signature.asc
Description: OpenPGP digital signature

Re: CVS commit: src/sys/arch/x86/x86

2018-07-08 Thread Mouse

Caveat: this is all opinion.  I'm not the one doing the work here.

 src/sys/arch/x86/x86: mpbios.c

 Remove unaligned access to mpbios_page[]

 sys/arch/x86/x86/mpbios.c:308:11, load of misaligned address 
 0x800031c7a413 for type 'const __uint16_t' which requires 2 byte 
 alignment

>> Can we please do NOT do such stupid changes?

> Kernel Undefined Behavior Sanitizer detects various portability
> issues including alignment.

Portability issues are, in general, not issues when they are in code
that is inherently already nonportable - such as almost everything
under sys/arch/.

> Misaligned pointer is explicitly documented as undefined behavior in
> the C standard (C11 6.3.2.3 p7).

So what?  Do you have reason to think that sys/arch/x86 will at some
point be ported to a compiler (I would say "and architecture", but it's
already tightly bound to a very small set of CPU architectures) that
does something unexpected with that code?  Expecting the MD code in the
low levels of an OS to never produce formally implementation-defined or
undefined behaviour is a fool's dream.

Programs such as undefined-behaviour detectors are tools to serve us,
not shackles to bind us.  Intelligence should be applied when using
their results, including not expecting portability from inherently
nonportable code.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

Re: CVS commit: src/sys/arch

2018-07-08 Thread Thor Lancelot Simon

On Sun, Jul 08, 2018 at 04:49:51PM +0200, Kamil Rytarowski wrote:
> On 08.07.2018 12:01, Martin Husemann wrote:
> > 
> > This is unecessary churn for no good reason, please stop it.
> > 
> > But worse are the other changes you are doing where kubsan insists on
> > natural alignement for {u,}int_{16,32,64} types, which is plain wrong,
> > these CPUs do not require that alignment (and it is not even clear
> > if kubsan propagates the alignment of structures correctly).
> > 
> > Martin
> > 
> 
> I've started a thread on it in tech-kern@. Please move the discussion there.

OK, let's talk about it here.  I would suggest that making changes to
MD code to conform to alignment constraints not actually present on the
"M" in question is not the right thing to do.  I'd further suggest that
there are in fact cases where it can break things or harm performance
(the trick conditionally used in some device drivers of intentionally
misaligning structures so their fields accessed by DMA meet DMA alignment
constraints comes to mind).

-- 
  Thor Lancelot Simont...@panix.com
 "The two most common variations translate as follows:
illegitimi non carborundum = the unlawful are not silicon carbide
illegitimis non carborundum = the unlawful don't have silicon carbide."

Re: CVS commit: src/sys/arch/x86/x86

2018-07-08 Thread Jason Thorpe




> On Jul 8, 2018, at 6:30 AM, Kamil Rytarowski  wrote:
> 
> In future __NO_STRICT_ALIGNMENT could be defined for aarch64, at least
> for the use of acpica (which still contains a fallback for Itanium
> without misaligned access, but not actively maintained).
> 
> Linux uses a different approach and ships get_unaligned() and
> set_unaligned() macros and implements it on per ABI and CPU-basis.

In general, I get the utility of UBSan, but for these unaligned access issues, 
an ad hoc approach like what was done in mpbios is the wrong way to go.

Here's my $0.02:

-- Define a set of standard unaligned-accessors/mutators.  I propose:

uint16_t__unaligned_load16(const uint16_t *);
uint32_t__unaligned_load32(const uint32_t *);
uint64_t__unaligned_load64(const uint64_t *);

void__unaligned_store16(uint16_t *, uint16_t);
void__unaligned_store32(uint32_t *, uint32_t);
void__unaligned_store64(uint64_t *, uint64_t);

...and maybe you need to have another set for the signed value flavor, dunno.  
(I guess probably, because you want to preserve the type information as much as 
possible ... no casting.)

-- Implement them as static inlines in a suitable system header (, 
maybe, although these types are  types, yah?  So, adjust the types 
to __uint16_t or whatever as necessary).  Decorate only these static inlines 
with the "don't sanitize me" directives.  Implementing them as inline functions 
rather than macros has 2 benefits: avoids unintended multiple-evaluation of 
arguments, allows the smallest possible "don't sanitize" footprint.

-- Implement 2 standard sets, one for __NO_STRICT_ALIGNMENT platforms, and one 
for the strict-alignment case.

-- For changes like what would be needed in acpica, GET INPUT FROM THE UPSTREAM 
as to how to integrate use of these macros, and preferably push the changes to 
the upstream FIRST so that we can simply import them in a regular update.

-- thorpej

Re: CVS commit: src/sys/arch

2018-07-08 Thread Kamil Rytarowski

On 08.07.2018 12:01, Martin Husemann wrote:
> On Sat, Jul 07, 2018 at 09:35:16PM +, Kamil Rytarowski wrote:
>> Module Name: src
>> Committed By:kamil
>> Date:Sat Jul  7 21:35:16 UTC 2018
>>
>> Modified Files:
>>  src/sys/arch/amd64/include: tss.h
>>  src/sys/arch/i386/include: tss.h
>>
>> Log Message:
>> Correct unportable signed integer left shift in i386/amd64 tss code
> 
> This change itself is OK, but your reasoning is broken. What do you
> want to port the MD i386/amd64 code to?
> 
> This is unecessary churn for no good reason, please stop it.
> 
> But worse are the other changes you are doing where kubsan insists on
> natural alignement for {u,}int_{16,32,64} types, which is plain wrong,
> these CPUs do not require that alignment (and it is not even clear
> if kubsan propagates the alignment of structures correctly).
> 
> Martin
> 

I've started a thread on it in tech-kern@. Please move the discussion there.



signature.asc
Description: OpenPGP digital signature

Re: CVS commit: src/sys/arch/x86/x86

2018-07-08 Thread Kamil Rytarowski

On 08.07.2018 15:53, Jaromír Doleček wrote:
> Le dim. 8 juil. 2018 à 15:29, Kamil Rytarowski  a écrit :
>> I've introduced the change to mpbios.c as it was small, selfcontained
>> and without the need to decorate the whole function.
> 
> Am I reading the code wrong or you actually introduced bug in mpbios.c?
> 
> Shouldn't this:
> 
> memtop |= (uint16_t)mpbios_page[0x414] << 8;
> 
> be actually << 16 to keep the same semantics?
> 
> Jaromir
> 

We are moving a 16-bit unaligned integer into a variable. The first byte
is 0 bits shifted, the second one 8.

According to my checks old and new versions produce the same result.



signature.asc
Description: OpenPGP digital signature

Re: CVS commit: src/sys/arch/x86/x86

2018-07-08 Thread Jaromír Doleček

Le dim. 8 juil. 2018 à 15:29, Kamil Rytarowski  a écrit :
> I've introduced the change to mpbios.c as it was small, selfcontained
> and without the need to decorate the whole function.

Am I reading the code wrong or you actually introduced bug in mpbios.c?

Shouldn't this:

memtop |= (uint16_t)mpbios_page[0x414] << 8;

be actually << 16 to keep the same semantics?

Jaromir

Re: CVS commit: src/sys/arch/x86/x86

2018-07-08 Thread Kamil Rytarowski

On 08.07.2018 11:24, Martin Husemann wrote:
> On Sun, Jul 08, 2018 at 10:49:53AM +0200, Jaromír Dole?ek wrote:
>>> Module Name:src
>>> Committed By:   kamil
>>> Date:   Sat Jul  7 23:05:50 UTC 2018
>>>
>>> Modified Files:
>>> src/sys/arch/x86/x86: mpbios.c
>>>
>>> Log Message:
>>> Remove unaligned access to mpbios_page[]
>>>
>>> Replace unaligned pointer dereference with a more portable construct that
>>> is free from Undefined Behavior semantics.
>>>
>>> sys/arch/x86/x86/mpbios.c:308:11, load of misaligned address 
>>> 0x800031c7a413 for type 'const __uint16_t' which requires 2 byte 
>>> alignment
> 
> 
> Can we please do NOT do such stupid changes?
> 
> This is a bogus error message, please restore the original code!
> 
> Martin
> 

On the request by Martin, I'm moving this discussion to tech-kern.

Kernel Undefined Behavior Sanitizer detects various portability issues
including alignment.

Misaligned pointer is explicitly documented as undefined behavior in the
C standard (C11 6.3.2.3 p7). (In C++ it's basically the same.)

Depending on the CPU, misaligned access can be either supported (with or
without performance hit) or unsupported (CPU exception, reinterpret
instruction to perform a different operation etc). For some CPUs it's
possible to define a dedicated exception handler to align the data
access on the performance cost.

There is a NetBSD header symbol that marks certain architectures as
capable to handle miaslignment: __NO_STRICT_ALIGNMENT. It's defined
right now for m68k, vax, amd64 and i386.

In future __NO_STRICT_ALIGNMENT could be defined for aarch64, at least
for the use of acpica (which still contains a fallback for Itanium
without misaligned access, but not actively maintained).

Linux uses a different approach and ships get_unaligned() and
set_unaligned() macros and implements it on per ABI and CPU-basis.

Almost all of the remaining issues detected by Kernel UBsan come from
unaligned memory access:

Report from a boot in qemu:
http://netbsd.org/~kamil/kubsan/0006-boot.txt

Report from a boot on a real hardware (with drmkms disabled):
http://netbsd.org/~kamil/kubsan/0007-boot-real-hardware.txt

The plan with Kernel Sanitizers is to mark the issues detected with them
as fatal, with a warning/fatal switch option for UBSan. Marking UBSan
issues as fatal makes this sanitizer useful for fuzzing and regular
checks for regressions (ATF execution), because it will detect the
issues immediately or just detect at all.

However in order to make Kernel UBSan more useful there is need to
address the reports, some of them might be cautious.

Handling cautious KUBSan reports can be treated similarly to handling
compiler warnings. We are already marking a lot of variables as
initialized, signed/unsigned, long/short or even properly intended. Some
of them are overcautious. UBSan detects a similar set of problems during
the runtime execution.

Regarding the unaligned access, there is an option to:
 - mark certain kernel functions with no-sanitize flags,
 - address the reported issues if possible.

I've introduced the change to mpbios.c as it was small, selfcontained
and without the need to decorate the whole function.

Various device drivers (vge(4), vr(4), sip(4), wdc(4) etc) contain
fixups for misalignment access under ifdef. Other ones like bwfm(4) [1]
contain fixups inlined and were detecting when a driver was in the
process of porting from x86 to a port with strict alignment.

Such issues as bwfm(4) could be detected with the sanitizer aid more easily.

[1]
commit 44bb30584312c2fa6bc0661178986c4965b67c0c
Author: jmcneill 
Date:   Fri Oct 20 23:38:21 2017 +

Fix an alignment problem with scan results within an escan event

signature.asc
Description: OpenPGP digital signature

Re: CVS commit: src/sys/arch

2018-07-08 Thread Martin Husemann

On Sat, Jul 07, 2018 at 09:35:16PM +, Kamil Rytarowski wrote:
> Module Name:  src
> Committed By: kamil
> Date: Sat Jul  7 21:35:16 UTC 2018
> 
> Modified Files:
>   src/sys/arch/amd64/include: tss.h
>   src/sys/arch/i386/include: tss.h
> 
> Log Message:
> Correct unportable signed integer left shift in i386/amd64 tss code

This change itself is OK, but your reasoning is broken. What do you
want to port the MD i386/amd64 code to?

This is unecessary churn for no good reason, please stop it.

But worse are the other changes you are doing where kubsan insists on
natural alignement for {u,}int_{16,32,64} types, which is plain wrong,
these CPUs do not require that alignment (and it is not even clear
if kubsan propagates the alignment of structures correctly).

Martin

Re: CVS commit: src/sys/arch

2016-09-23 Thread Manuel Bouyer

On Fri, Sep 23, 2016 at 09:33:36PM +0200, Jaromír Dole?ek wrote:
> Hey Maxime,
> 
> Seems the KASSERTs() are too aggressive, or there is some other bug.
> 
> I can trigger the kassert by simply attaching to rump_ffs, setting a
> breakpoint and continuing, i.e:
> 
> > rump_ffs -o log ./ffs ./mnt
> > gdb rump_ffs
> ...
> (gdb) attach RUMP_PID
> (gdb) break ffs_truncate
> Breakpoint 1 at 0xad0b951f: file
> /usr/home/dolecek/netbsd/sys/rump/fs/lib/libffs/../../../../ufs/ffs/ffs_inode.c,
> line 210.
> (gdb) cont
> panic: kernel diagnostic assetion "onfault == kcopy_fault || rcr2() <
> VM_MAXUSER_ADDRESS" failed: file "../../../../arch/i386/i386/trap.c",
> line 358
> 
> Could you please look at it?
> 
> I'll disable the KASSERT() in my local tree, so that I'll be able to
> develop. But would be good idea to check what so special that gdb is
> doing that it trips over.

This anita test run:
http://www-soc.lip6.fr/~bouyer/NetBSD-tests/xen/HEAD/i386/201609171110Z_anita.txt

also triggered the KASSERT(). As it didn't happen with newer builds I assumed
it has been fixed, but maybe it's just that it's not 100% reproductible in
this context.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--

Re: CVS commit: src/sys/arch

2016-09-23 Thread Jaromír Doleček

Hey Maxime,

Seems the KASSERTs() are too aggressive, or there is some other bug.

I can trigger the kassert by simply attaching to rump_ffs, setting a
breakpoint and continuing, i.e:

> rump_ffs -o log ./ffs ./mnt
> gdb rump_ffs
...
(gdb) attach RUMP_PID
(gdb) break ffs_truncate
Breakpoint 1 at 0xad0b951f: file
/usr/home/dolecek/netbsd/sys/rump/fs/lib/libffs/../../../../ufs/ffs/ffs_inode.c,
line 210.
(gdb) cont
panic: kernel diagnostic assetion "onfault == kcopy_fault || rcr2() <
VM_MAXUSER_ADDRESS" failed: file "../../../../arch/i386/i386/trap.c",
line 358

Could you please look at it?

I'll disable the KASSERT() in my local tree, so that I'll be able to
develop. But would be good idea to check what so special that gdb is
doing that it trips over.

Thank you.

Jaromir

2016-09-16 13:48 GMT+02:00 Maxime Villard :
> Module Name:src
> Committed By:   maxv
> Date:   Fri Sep 16 11:48:10 UTC 2016
>
> Modified Files:
> src/sys/arch/amd64/amd64: trap.c
> src/sys/arch/i386/i386: trap.c
>
> Log Message:
> Put two KASSERTs, to make sure the fault is happening in the correct
> half of the vm space when using special copy functions. It can detect
> bugs where the kernel would fault when copying a kernel buffer which
> it wrongly believes comes from userland.
>
>
> To generate a diff of this commit:
> cvs rdiff -u -r1.84 -r1.85 src/sys/arch/amd64/amd64/trap.c
> cvs rdiff -u -r1.278 -r1.279 src/sys/arch/i386/i386/trap.c
>
> Please note that diffs are not public domain; they are subject to the
> copyright notices on the relevant files.
>

Re: Module auto-unloading (was Re: CVS commit: src/sys/arch/x86/x86)

2011-10-18 Thread David Young

On Tue, Oct 18, 2011 at 12:50:58PM -0400, Jachym Holecek wrote:
> # David Young 2011-10-18:
> > On Tue, Oct 18, 2011 at 11:28:22AM -0400, Jared McNeill wrote:
> > > I played around with driver module autoloading a while back, and it
> > > worked pretty well but the implementation I came up with required
> > > duplicating match data in .plist.
> > 
> > This sounds like a step in the right direction (recording match data
> > in a plist, not duplicating it). What would it take to get rid of the
> > non-plist match data?
> 
> Funny enough that's exactly what my GSoC project did some years ago.
> There was a lot of manual work to abstract all accesses to match data
> behind an API, major rework of autoconf(4) (not sure how much of that
> was actually necessary), simple change to config(1) to emit match data
> as plist array -- that sort of thing. No big science, just a lot of
> typing... The match data array for builtin drivers is going to be
> fairly huge in XML encoding, it was much better with bplist.

Wonderful.  Can you generate a patch for review?

Dave

-- 
David Young OJC Technologies is now Pixo
dyo...@pixotech.com Urbana, IL   (217) 344-0444 x24

Re: Module auto-unloading (was Re: CVS commit: src/sys/arch/x86/x86)

2011-10-18 Thread Jachym Holecek

# David Young 2011-10-18:
> On Tue, Oct 18, 2011 at 11:28:22AM -0400, Jared McNeill wrote:
> > I played around with driver module autoloading a while back, and it
> > worked pretty well but the implementation I came up with required
> > duplicating match data in .plist.
> 
> This sounds like a step in the right direction (recording match data
> in a plist, not duplicating it). What would it take to get rid of the
> non-plist match data?

Funny enough that's exactly what my GSoC project did some years ago.
There was a lot of manual work to abstract all accesses to match data
behind an API, major rework of autoconf(4) (not sure how much of that
was actually necessary), simple change to config(1) to emit match data
as plist array -- that sort of thing. No big science, just a lot of
typing... The match data array for builtin drivers is going to be
fairly huge in XML encoding, it was much better with bplist.

BR,
-- Jachym

Re: Module auto-unloading (was Re: CVS commit: src/sys/arch/x86/x86)

2011-10-18 Thread Jared McNeill

I think you'd just need to find a way to supply that data for built-in 
modules too.


On Tue, 18 Oct 2011, David Young wrote:


On Tue, Oct 18, 2011 at 11:28:22AM -0400, Jared McNeill wrote:

I played around with driver module autoloading a while back, and it
worked pretty well but the implementation I came up with required
duplicating match data in .plist.


This sounds like a step in the right direction (recording match data
in a plist, not duplicating it). What would it take to get rid of the
non-plist match data?

Dave

--
David Young OJC Technologies is now Pixo
dyo...@pixotech.com Urbana, IL   (217) 344-0444 x24

Re: Module auto-unloading (was Re: CVS commit: src/sys/arch/x86/x86)

2011-10-18 Thread David Young

On Tue, Oct 18, 2011 at 11:28:22AM -0400, Jared McNeill wrote:
> I played around with driver module autoloading a while back, and it
> worked pretty well but the implementation I came up with required
> duplicating match data in .plist.

This sounds like a step in the right direction (recording match data
in a plist, not duplicating it). What would it take to get rid of the
non-plist match data?

Dave

-- 
David Young OJC Technologies is now Pixo
dyo...@pixotech.com Urbana, IL   (217) 344-0444 x24

Module auto-unloading (was Re: CVS commit: src/sys/arch/x86/x86)

2011-10-18 Thread Jared McNeill

We could use the reference counter in struct cfdriver to keep driver 
modules "busy", but I'm not sure if the system can figure out what's 
unneeded if a "needed" driver might be one for a hotpluggable device. 
Would you treat those drivers differently? What about drivers (if_ath_pci 
comes to mind) that can be used both for fixed or hotplug devices?


I played around with driver module autoloading a while back, and it worked 
pretty well but the implementation I came up with required duplicating 
match data in .plist.


On Tue, 18 Oct 2011, Warner Losh wrote:



On Oct 18, 2011, at 4:39 AM, Jared McNeill wrote:

I would argue that any manually loaded module shouldn't be autounloaded. What 
do you think about flagging modules as autoloaded and only autounloading the 
autoloaded ones?


If I "manually" load a dozen drivers at boot because I have a dozen different boards with 
different devices.  I'd kinda like the system to "automatically" figure out what isn't 
needed and unload those drivers.

Warner


On Tue, 18 Oct 2011, Jukka Ruohonen wrote:


On Tue, Oct 18, 2011 at 08:43:46AM +0200, Marc Balmer wrote:

Am 18.10.11 06:27, schrieb Jukka Ruohonen:

On Tue, Oct 18, 2011 at 12:07:45AM +, Jared D. McNeill wrote:

Module Name:src
Committed By:   jmcneill
Date:   Tue Oct 18 00:07:45 UTC 2011

Modified Files:
src/sys/arch/x86/x86: vmt.c

Log Message:
don't allow module autounload


I wonder should autounloading be prohibited for all driver-class modules?


Why?  When the parent goes away, why not autounload a driver?


I am not sure. But have we thought about all the consequences and corner-
cases? Unloading happens while modifying hardware state? Deferred calls
in the drivers? And so on? To me it also seems that if I manually load
a driver-module, I expect it to stay loaded until I unload it.

- Jukka.

Re: CVS commit: src/sys/arch/xen

2011-08-29 Thread Jean-Yves Migeon

On 29.08.2011 15:03, Cherry G. Mathew wrote:
> I'm a bit confused now - are we assuming that the pmap lock protects the
> (pte update op queue-push(es) + pmap_pte_flush()) as a single atomic
> operation (ie; no invlpg/tlbflush or out-of-band-read can occur between
> the update(s) and the pmap_pte_flush()) ?
> 
> If so, I think I've slightly misunderstood the scope of the mmu queue
> design - I assumed that the queue is longer-lived, and the sync points
> (for the queue flush) can span across pmap locking - a sort of lazy pte
> update, with the queue being flushed at out-of-band or in-band read
> time ( I guess that won't work though - how does one know when the
> hardware walks the page table ?) . It seems that the queue is meant for
> pte updates in loops, for eg:, quickly followed by a flush. Is this
> correct ? 

IMHO, this should be regarded this way, and nothing else. x86 and xen
pmap(9) share a lot in common, low level operations (like these: PT/PD
editing, TLB flushes, MMU updates...) should not leak through this
abstraction.

Said differently, the way Xen handles MMU must remain transparent to
pmap, except in a few places. Remember, although we are adding a level
of indirection through hypervisor, the calls should remain close to
native x86 semantic.

However, for convenience, Xen offers "multiseats" MMU hypercalls, where
you can schedule more than one op at a time to avoid unneeded context
switches, like in the pmap_alloc_level() function. This is our
problematic part.

> If so, there's just one hazard afaict - the synchronous TLB_FLUSH_MULTI
> could beat the race between the queue update and the queue flush via
> pmap_tlb_shootnow() (see pmap_tlb.c on the cherry-xenmp branch, and *if*
> other CPUs reload their TLBs before the flush, they'll have stale info.

What stale info? If a VCPU queue isn't empty while another VCPU has
scheduled a TLB_FLUSH_MULTI op, the stale content of the queue will
eventually be depleted later after a pmap_pte_flush() followed by a
local invalidation. This is the part that should be carefully reviewed.

For clarity, I would expect a queue to be empty when leaving pmap (e.g.
when releasing its lock). Assertions might be necessary to catch all
corner cases.

> So the important question (rmind@ ?) is, is pmap_tlb_shootnow()
> guaranteed to be always called with the pmap lock held ?
> 
> In real life, I removed the global xpq_queue_lock() and the pmap was
> falling apart. So a bit of debugging ahead.

Did you keep the same queue for all CPUs? If that was the case, yes,
this is a call for trouble.

Anyway, we can't put a giant lock around xpq_queue. This doesn't make
any sense in a MP system, especially for operations that are frequently
called and still may take a while to complete. Just imagine: all CPUs
waiting for one to finish its TLB flush before they can edit their PD/PT
again... ouch.

-- 
Jean-Yves Migeon
jeanyves.mig...@free.fr

Re: CVS commit: src/sys/arch/xen

2011-08-29 Thread Manuel Bouyer

On Mon, Aug 29, 2011 at 03:03:37PM +0200, Cherry G. Mathew wrote:
> Hi Manuel,
> 
> > "Manuel" == Manuel Bouyer  writes:
> 
> 
> [...]
> 
> >> 
> >> Xen's TLB_FLUSH operation is synchronous, and doesn't require an
> >> IPI (within the domain), which makes the queue ordering even more
> >> important (to make sure that stale ptes are not reloaded before
> >> the per-cpu queue has made progress). Yes, we can implement a
> >> roundabout ipi driven queueflush + tlbflush scheme(described
> >> below), but that would be performance sensitive, and the basic
> >> issue won't go away, imho.
> >> 
> >> Let's stick to the xpq ops for a second, ignoring "out-of-band"
> >> reads (for which I agree that your assertion, that locking needs
> >> to be done at a higher level, holds true).
> >> 
> >> The question here, really is, what are the global ordering
> >> requirements of per-cpu memory op queues, given the following
> >> basic "ops":
> >> 
> >> i) write memory (via MMU_NORMAL_PT_UPDATE, MMU_MACHPHYS_UPDATE)
> >> ii) read memory via: MMUEXT_PIN_L1_TABLE MMUEXT_PIN_L2_TABLE
> >> MMUEXT_PIN_L3_TABLE MMUEXT_PIN_L4_TABLE MMUEXT_UNPIN_TABLE
> 
> Manuel> This is when adding/removing a page table from a pmap. When
> Manuel> this occurs, the pmap is locked, isn't it ?
> 
> >> MMUEXT_NEW_BASEPTR MMUEXT_NEW_USER_BASEPTR
> 
> Manuel> This is a context switch
> 
> >> MMUEXT_TLB_FLUSH_LOCAL MMUEXT_INVLPG_LOCAL MMUEXT_TLB_FLUSH_MULTI
> >> MMUEXT_INVLPG_MULTI MMUEXT_TLB_FLUSH_ALL MMUEXT_INVLPG_ALL
> >> MMUEXT_FLUSH_CACHE
> 
> Manuel> This may, or may not, cause a read. This usually happens
> Manuel> after updating the pmap, and I guess this also happens with
> Manuel> the pmap locked (I have not carefully checked).
> 
> Manuel> So couldn't we just use the pmap lock for this ?  I suspect
> Manuel> the same lock will also be needed for out of band reads at
> Manuel> some point (right now it's protected by splvm()).
> 
> I'm a bit confused now - are we assuming that the pmap lock protects the
> (pte update op queue-push(es) + pmap_pte_flush()) as a single atomic
> operation (ie; no invlpg/tlbflush or out-of-band-read can occur between
> the update(s) and the pmap_pte_flush()) ?

out of band reads can always occurs, there's no lock which can protect against
this.

> 
> If so, I think I've slightly misunderstood the scope of the mmu queue
> design - I assumed that the queue is longer-lived, and the sync points
> (for the queue flush) can span across pmap locking - a sort of lazy pte
> update, with the queue being flushed at out-of-band or in-band read
> time ( I guess that won't work though - how does one know when the
> hardware walks the page table ?) . It seems that the queue is meant for
> pte updates in loops, for eg:, quickly followed by a flush. Is this
> correct ? 

it was not explicitely designed this way, but I think that's how things are
in practice, yes. Usage would need to be checked, though.
There may be some special case in the kernel pmap area ...

> 
> If so, there's just one hazard afaict - the synchronous TLB_FLUSH_MULTI
> could beat the race between the queue update and the queue flush via
> pmap_tlb_shootnow() (see pmap_tlb.c on the cherry-xenmp branch, and *if*
> other CPUs reload their TLBs before the flush, they'll have stale info.
> 
> So the important question (rmind@ ?) is, is pmap_tlb_shootnow()
> guaranteed to be always called with the pmap lock held ?

I don't think so; but I also don't think that's the problem.
There shouldn't be more race with this than on native hardware.

> 
> In real life, I removed the global xpq_queue_lock() and the pmap was
> falling apart. So a bit of debugging ahead.

Hum, on seocnd though, something more may be needed to protect the queue.
The pmap lock won't probably work for pmap_kernel ...

> 
> >> [...]
> >> 
> >> >>> I'm thinking that it might be easier and more justifiable to
> >> >>> nuke the current queue scheme and implement shadow page
> >> tables, >>> which would fit more naturally and efficiently with
> >> CAS pte >>> updates, etc.
> >> >> 
> >> >> I'm not sure this would completely fis the issue: with shadow
> >> >> page tables you can't use a CAS to assure atomic operation
> >> with >> the hardware TLB, as this is, precisely, a shadow PT and
> >> not the >> one used by hardware.
> >> 
> >> Definitely worth looking into, I imho. I'm not very comfortable
> >> with the queue based scheme for MP.
> >> 
> >> the CAS doesn't provide any guarantees with the TLB on native
> >> h/w, afaict.
> 
> Manuel> What makes you think so ? I think the hw TLB also does CAS
> Manuel> to update referenced and dirty bits in the PTE, otherwise we
> Manuel> couldn't rely on these bits; this would be bad especially
> Manuel> for the dirty bit.
> 
> Yes, I missed that one (which i

Re: CVS commit: src/sys/arch/xen

2011-08-29 Thread Cherry G . Mathew

Hi Manuel,

> "Manuel" == Manuel Bouyer  writes:

[...]

>> 
>> Xen's TLB_FLUSH operation is synchronous, and doesn't require an
>> IPI (within the domain), which makes the queue ordering even more
>> important (to make sure that stale ptes are not reloaded before
>> the per-cpu queue has made progress). Yes, we can implement a
>> roundabout ipi driven queueflush + tlbflush scheme(described
>> below), but that would be performance sensitive, and the basic
>> issue won't go away, imho.
>> 
>> Let's stick to the xpq ops for a second, ignoring "out-of-band"
>> reads (for which I agree that your assertion, that locking needs
>> to be done at a higher level, holds true).
>> 
>> The question here, really is, what are the global ordering
>> requirements of per-cpu memory op queues, given the following
>> basic "ops":
>> 
>> i) write memory (via MMU_NORMAL_PT_UPDATE, MMU_MACHPHYS_UPDATE)
>> ii) read memory via: MMUEXT_PIN_L1_TABLE MMUEXT_PIN_L2_TABLE
>> MMUEXT_PIN_L3_TABLE MMUEXT_PIN_L4_TABLE MMUEXT_UNPIN_TABLE

Manuel> This is when adding/removing a page table from a pmap. When
Manuel> this occurs, the pmap is locked, isn't it ?

>> MMUEXT_NEW_BASEPTR MMUEXT_NEW_USER_BASEPTR

Manuel> This is a context switch

>> MMUEXT_TLB_FLUSH_LOCAL MMUEXT_INVLPG_LOCAL MMUEXT_TLB_FLUSH_MULTI
>> MMUEXT_INVLPG_MULTI MMUEXT_TLB_FLUSH_ALL MMUEXT_INVLPG_ALL
>> MMUEXT_FLUSH_CACHE

Manuel> This may, or may not, cause a read. This usually happens
Manuel> after updating the pmap, and I guess this also happens with
Manuel> the pmap locked (I have not carefully checked).

Manuel> So couldn't we just use the pmap lock for this ?  I suspect
Manuel> the same lock will also be needed for out of band reads at
Manuel> some point (right now it's protected by splvm()).

I'm a bit confused now - are we assuming that the pmap lock protects the
(pte update op queue-push(es) + pmap_pte_flush()) as a single atomic
operation (ie; no invlpg/tlbflush or out-of-band-read can occur between
the update(s) and the pmap_pte_flush()) ?

If so, I think I've slightly misunderstood the scope of the mmu queue
design - I assumed that the queue is longer-lived, and the sync points
(for the queue flush) can span across pmap locking - a sort of lazy pte
update, with the queue being flushed at out-of-band or in-band read
time ( I guess that won't work though - how does one know when the
hardware walks the page table ?) . It seems that the queue is meant for
pte updates in loops, for eg:, quickly followed by a flush. Is this
correct ? 

If so, there's just one hazard afaict - the synchronous TLB_FLUSH_MULTI
could beat the race between the queue update and the queue flush via
pmap_tlb_shootnow() (see pmap_tlb.c on the cherry-xenmp branch, and *if*
other CPUs reload their TLBs before the flush, they'll have stale info.

So the important question (rmind@ ?) is, is pmap_tlb_shootnow()
guaranteed to be always called with the pmap lock held ?

In real life, I removed the global xpq_queue_lock() and the pmap was
falling apart. So a bit of debugging ahead.

>> [...]
>> 
>> >>> I'm thinking that it might be easier and more justifiable to
>> >>> nuke the current queue scheme and implement shadow page
>> tables, >>> which would fit more naturally and efficiently with
>> CAS pte >>> updates, etc.
>> >> 
>> >> I'm not sure this would completely fis the issue: with shadow
>> >> page tables you can't use a CAS to assure atomic operation
>> with >> the hardware TLB, as this is, precisely, a shadow PT and
>> not the >> one used by hardware.
>> 
>> Definitely worth looking into, I imho. I'm not very comfortable
>> with the queue based scheme for MP.
>> 
>> the CAS doesn't provide any guarantees with the TLB on native
>> h/w, afaict.

Manuel> What makes you think so ? I think the hw TLB also does CAS
Manuel> to update referenced and dirty bits in the PTE, otherwise we
Manuel> couldn't rely on these bits; this would be bad especially
Manuel> for the dirty bit.

Yes, I missed that one (which is much of the point of the CAS in the
first place!), you're right.

>> If you do a CAS pte update, and the update succeeded, it's a good
>> idea to invalidate + shootdown anyway (even on baremetal).

Manuel> Yes, of course inval is needed after updating the PTE. But
Manuel> using a true CAS is important to get the refereced and dirty
Manuel> bits right.

I haven't looked into this in detail, but I thought that xen disconnects
the shadow (presumably with correct update/dirty bits)  and flushes the
TLB when a write to the shadow occurs ? In which case these bits will be
in sync until the next h/w access ? I'm speculating, I haven't looked at
the xen code for this yet.

-- 
Cherry

Re: CVS commit: src/sys/arch/xen

2011-08-29 Thread Cherry G . Mathew

> "Cherry" == Cherry G Mathew  writes:


[...]

Cherry> Xen's TLB_FLUSH operation is synchronous, and doesn't

I mean, Xen's TLB_FLUSH_MULTI operations. my current implementation of
tlb shootdown is therefore synchronous wrt TLB flushes on all cpus.

Cheers,
-- 
Cherry

Re: CVS commit: src/sys/arch/xen

2011-08-29 Thread Manuel Bouyer

On Mon, Aug 29, 2011 at 12:07:05PM +0200, Cherry G. Mathew wrote:
> JM> On Mon, 22 Aug 2011 12:47:40 +0200, Manuel Bouyer wrote:
> >>> This is slightly more complicated than it appears. Some of the
> >>> "ops" in a per-cpu queue may have ordering dependencies with
> >>> other cpu queues, and I think this would be hard to express
> >>> trivially. (an example would be a pte update on one queue, and
> >>> reading the same pte read on another queue - these cases are
> >>> quite analogous (although completely unrelated)
> >> 
> 
> Hi,
> 
> So I had a better look at this - implemented per-cpu queues and messed
> with locking a bit:
> 
> 
> >> read don't go through the xpq queue, don't they ?
> 
> JM> Nope, PTE are directly obtained from the recursive mappings
> JM> (vtopte/kvtopte).
> 
> Let's call this "out of band" reads. But see below for "in-band" reads.
> 
> JM> Content is "obviously" only writable by hypervisor (so it can
> JM> keep control of his mapping alone).
> >> I think this is similar to a tlb flush but the other way round, I
> >> guess we could use a IPI for this too.
> 
> JM> IIRC that's what the current native x86 code does: it uses an
> JM> IPI to signal other processors that a shootdown is necessary.
> 
> Xen's TLB_FLUSH operation is synchronous, and doesn't require an IPI
> (within the domain), which makes the queue ordering even more important
> (to make sure that stale ptes are not reloaded before the per-cpu queue
> has made progress). Yes, we can implement a roundabout ipi driven
> queueflush + tlbflush scheme(described below), but that would be
> performance sensitive, and the basic issue won't go away, imho.
> 
> Let's stick to the xpq ops for a second, ignoring "out-of-band" reads
> (for which I agree that your assertion, that locking needs to be done at
> a higher level, holds true).
> 
> The question here, really is, what are the global ordering requirements
> of per-cpu memory op queues, given the following basic "ops":
> 
> i) write memory (via MMU_NORMAL_PT_UPDATE, MMU_MACHPHYS_UPDATE)
> ii) read memory
> via:
> MMUEXT_PIN_L1_TABLE
> MMUEXT_PIN_L2_TABLE
> MMUEXT_PIN_L3_TABLE
> MMUEXT_PIN_L4_TABLE
> MMUEXT_UNPIN_TABLE

This is when adding/removing a page table from a pmap. When this occurs,
the pmap is locked, isn't it ?

> MMUEXT_NEW_BASEPTR
> MMUEXT_NEW_USER_BASEPTR

This is a context switch

> MMUEXT_TLB_FLUSH_LOCAL
> MMUEXT_INVLPG_LOCAL
> MMUEXT_TLB_FLUSH_MULTI
> MMUEXT_INVLPG_MULTI
> MMUEXT_TLB_FLUSH_ALL
> MMUEXT_INVLPG_ALL
> MMUEXT_FLUSH_CACHE

This may, or may not, cause a read. This usually happens after updating
the pmap, and I guess this also happens with the pmap locked (I have not
carefully checked).

So couldn't we just use the pmap lock for this ?
I suspect the same lock will also be needed for out of band reads at some
point (right now it's protected by splvm()).

> [...]
> 
> >>> I'm thinking that it might be easier and more justifiable to
> >>> nuke the current queue scheme and implement shadow page tables,
> >>> which would fit more naturally and efficiently with CAS pte
> >>> updates, etc.
> >> 
> >> I'm not sure this would completely fis the issue: with shadow
> >> page tables you can't use a CAS to assure atomic operation with
> >> the hardware TLB, as this is, precisely, a shadow PT and not the
> >> one used by hardware.
> 
> Definitely worth looking into, I imho. I'm not very comfortable with
> the queue based scheme for MP.
> 
> the CAS doesn't provide any guarantees with the TLB on native h/w,
> afaict.

What makes you think so ? I think the hw TLB also does CAS to update
referenced and dirty bits in the PTE, otherwise we couldn't rely on these
bits; this would be bad especially for the dirty bit.

> If you do a CAS pte update, and the update succeeded, it's a
> good idea to invalidate + shootdown anyway (even on baremetal).

Yes, of course inval is needed after updating the PTE. But using a true CAS
is important to get the refereced and dirty bits right.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--

Re: CVS commit: src/sys/arch/xen

2011-08-29 Thread Cherry G . Mathew

Cc:ing tech-kern, to get wider feedback.
Thread started here: 
http://mail-index.netbsd.org/source-changes-d/2011/08/21/msg003897.html

> "JM" == Jean-Yves Migeon  writes:

JM> On Mon, 22 Aug 2011 12:47:40 +0200, Manuel Bouyer wrote:
>>> This is slightly more complicated than it appears. Some of the
>>> "ops" in a per-cpu queue may have ordering dependencies with
>>> other cpu queues, and I think this would be hard to express
>>> trivially. (an example would be a pte update on one queue, and
>>> reading the same pte read on another queue - these cases are
>>> quite analogous (although completely unrelated)
>> 

Hi,

So I had a better look at this - implemented per-cpu queues and messed
with locking a bit:

>> read don't go through the xpq queue, don't they ?

JM> Nope, PTE are directly obtained from the recursive mappings
JM> (vtopte/kvtopte).

Let's call this "out of band" reads. But see below for "in-band" reads.

JM> Content is "obviously" only writable by hypervisor (so it can
JM> keep control of his mapping alone).
>> I think this is similar to a tlb flush but the other way round, I
>> guess we could use a IPI for this too.

JM> IIRC that's what the current native x86 code does: it uses an
JM> IPI to signal other processors that a shootdown is necessary.

Xen's TLB_FLUSH operation is synchronous, and doesn't require an IPI
(within the domain), which makes the queue ordering even more important
(to make sure that stale ptes are not reloaded before the per-cpu queue
has made progress). Yes, we can implement a roundabout ipi driven
queueflush + tlbflush scheme(described below), but that would be
performance sensitive, and the basic issue won't go away, imho.

Let's stick to the xpq ops for a second, ignoring "out-of-band" reads
(for which I agree that your assertion, that locking needs to be done at
a higher level, holds true).

The question here, really is, what are the global ordering requirements
of per-cpu memory op queues, given the following basic "ops":

i) write memory (via MMU_NORMAL_PT_UPDATE, MMU_MACHPHYS_UPDATE)
ii) read memory
via:
MMUEXT_PIN_L1_TABLE
MMUEXT_PIN_L2_TABLE
MMUEXT_PIN_L3_TABLE
MMUEXT_PIN_L4_TABLE
MMUEXT_UNPIN_TABLE
MMUEXT_NEW_BASEPTR
MMUEXT_TLB_FLUSH_LOCAL
MMUEXT_INVLPG_LOCAL
MMUEXT_TLB_FLUSH_MULTI
MMUEXT_INVLPG_MULTI
MMUEXT_TLB_FLUSH_ALL
MMUEXT_INVLPG_ALL
MMUEXT_FLUSH_CACHE
MMUEXT_NEW_USER_BASEPTR

(ie; anything that will cause the processor to re-read data updated
via another cpu (via, for eg: pte update with i), above)

There's two ways I can think of fixing this:

a) *before* queueing a local read-op, synchronously flush queues on all
   other CPUs via ipis. This is slightly racy, but can be done, I think.

   An optimisation for invlpg could be to implement a scoreboard that
   watches mem. locations that have been queued for update on any
   cpu. Scan through the scoreboard for the memory range we're
   invlpg-ing. If it's not there, there's no need to flush any queues on
   other cpus.

b) read-ops wait on a global condvar. If it's set, a write-op that needs
   flushing is pending. Wait (with optional timeout and ipi-"nudge")
   until the remote queue is flushed. When flushing a queue, send a
   cv_broadcast to any waiters.

Option "b)" is slightly better than my current scheme which is to lock
any and all mmu-ops and operate the queues synchronously (via
XENDEBUG_SYNC).

I cannot think of anything else, other than ad-hoc locking + queue
flushing, which could be hard to maintain and debug in the long run.

>>> I'm thinking that it might be easier and more justifiable to
>>> nuke the current queue scheme and implement shadow page tables,
>>> which would fit more naturally and efficiently with CAS pte
>>> updates, etc.
>> 
>> I'm not sure this would completely fis the issue: with shadow
>> page tables you can't use a CAS to assure atomic operation with
>> the hardware TLB, as this is, precisely, a shadow PT and not the
>> one used by hardware.

Definitely worth looking into, I imho. I'm not very comfortable with
the queue based scheme for MP.

the CAS doesn't provide any guarantees with the TLB on native h/w,
afaict. If you do a CAS pte update, and the update succeeded, it's a
good idea to invalidate + shootdown anyway (even on baremetal).

Do let me know your thoughts,

Cheers,
-- 
Cherry

Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread David Holland

On Mon, Nov 15, 2010 at 03:47:32PM -0500, der Mouse wrote:
 > > [...] just forward declarations of the structs.
 > 
 > > (this is, btw, one of the reasons to avoid silly typedefs)
 > 
 > I'm not sure what typedefs have to do with it.  typedeffing a name to
 > an incomplete ("forward") struct type works just fine:
 > 
 > struct foo;
 > typedef struct foo FOO;
 > 
 > (You can't do anything with a FOO without completing the struct type,
 > but you can work with pointers to them)

But now there's no protection against divergence; that is, if I have

   typedef struct foo FOO;

in one header and a typo'd

   typedef struct tfoo FOO;

in another, assuming suitable ifdef guards as already mentioned, now
FOO can be two different things, and the inconsistency in the
cut-and-pasted-material might not be detected for some time.

However, if I just have

   struct foo;

in multiple headers there aren't very many ways this can be wrong that
will compile at all. The only common way for this to go bad is if
you've removed struct foo from your program completely; then you have
to hunt down all the forward declarations by hand and kill them off.
But that's more or less unavoidable.

The difference between these two cases is inherent in the fact that
the typedef form is declaring two things and the plain struct
declaration is declaring only one... there's no particular reason C
couldn't provide a way to create a forward declaration (without
definition) of a typedef name, but it doesn't.

-- 
David A. Holland
dholl...@netbsd.org

Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread David Holland

On Mon, Nov 15, 2010 at 10:41:55PM +, David Laight wrote:
 > > Indeed. Properly speaking though, headers that are exported to
 > > userland should define only the precise symbols that userland needs;
 > > kernel-only material should be kept elsewhere.
 > 
 > One start would be to add a sys/proc_internal.h so that sys/proc
 > can be reduced to only stuff that userspace and some kernel parts
 > are really expected to use.

The right way (TM) is to create src/sys/include and put kernel-only
headers in there, to be included as e.g. .

In the long term the user-visible parts would go in
src/sys/include/kern/proc.h, which would be included as .

(It has to be kern/ and not sys/ because a couple decades of standards
creep and poor API maintenance has led to half of sys/*.h properly
belonging to libc. In order to avoid repeating this problem in the
future, all APIs should be defined without direct reference to any
kern/*.h files; those should only be included from other libc or
kernel headers. So libc would grow its own  because that's
part of the libkvm API.)

When done completely the entire kern/ subtree is the same for both
userland and the kernel, including MD headers, no other random kernel
headers need to be installed, and there's no longer any need for
#ifdef _KERNEL.

As much as this probably sounds obvious, the first couple of times I
set out to do it myself I got it wrong. (And it's wrong in Linux too.)

-- 
David A. Holland
dholl...@netbsd.org

Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread Masao Uebayashi

On Mon, Nov 15, 2010 at 08:34:40PM +, David Holland wrote:
> (moving this to tech-kern because it's the right place and per request)
> 
> On Mon, Nov 15, 2010 at 11:24:21AM +0900, Masao Uebayashi wrote:
>  > > Every header file should include the things it requires to compile.
>  > > Therefore, there should in principle be no cases where a header file
>  > > (or source file) needs to #include something it doesn't itself use.
>  > 
>  > This clarifies my long-unanswered question, thanks!
> 
> *bow*
> 
>  > I've (re)built about 300 kernels in the last days.  I've found:
>  > 
>  > - sys/sysctl.h's struct kinfo_proc should be moved into sys/proc.h
>  >   (I've done this locally).  Otherwise all sysctl node providers
>  >   includes sys/proc.h and uvm/uvm_extern.h.
>  >   (This is where I started...)
> 
> I'm not sure this is a good plan in the long run. Shouldn't it at some
> point be unhooked fully from the "real" proc structure?
> 
>  > - sys/syscallargs.h should be split into pieces, otherwise all its
>  >   users have to know unrelated types (sys/mount.h, sys/cpu.h).
> 
> Since system calls don't in general pass structures by value, it
> shouldn't need most of those header files, just forward declarations
> of the structs.
> 
> (this is, btw, one of the reasons to avoid silly typedefs)

OK, you're absolutely right.

And I'm fully agree that forward decls are better in modularity POV.

> 
>  > - sys/proc.h's tsleep(), wakeup(), and friends should be moved into
>  >   some common header, because it's widely used API.  sys/proc.h will
>  >   be used only for "struct proc" related things.
> 
> Given that this is a deprecated API in the long term I'm not sure it's
> worthwhile.
> 
> -- 
> David A. Holland
> dholl...@netbsd.org

-- 
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635

Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread der Mouse

>> I've long felt this way: that, except for a very few examples like
>>  that are defined to depend on context, the order of
>> #includes should not matter.  In particular, if multiple files must
>> be included, any of them may come first - so any file that generates
>> errors if it's included first needs fixing.
> assert.h is the *only* header that is not supposed to be idempotent.
> Pretty much anything else should be classified as bug.

I'm not talking about idempotency.  Idempotency is the question of
whether

#include 
#include 

is operationally equivalent to

#include 

But what I'm talking about is whether

#include 
#include 

and

#include 
#include 

are operationally equivalent.

> Another item is that too many of our headers depend on non-standard
> compliant types polluting the namespace.

This is another issue I have with our include files, but it's
significantly more work to fix, so I've been working around it rather
than fixing it right.  (It's also not entirely clear what the right fix
is when more than two levels of software supplier are involved.)

> Nothing installed in /usr/include should depend on u_char for
> example.

Well...ideally.  But there's a big difference between (say) 
depending on u_char and  depending on u_char.
(I'm not happy about either, but substantially less happy about the
former.)

>> struct foo;
>> typedef struct foo FOO;

> Problem is that this requires a guard for the typedef if FOO is
> supposed to be defined by multiple files.

True.

Most of the cases I've seen for this push the incomplete struct
declaration and the typedef into a separate file, wrap it with an
idempotency guard, and then include that from files that need the
struct and/or type.  I've seen a very few cases where this hasn't been
done and I think none at all where it couldn't be done if the relevant
software authors cared to bother.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread Joerg Sonnenberger

On Mon, Nov 15, 2010 at 03:47:32PM -0500, der Mouse wrote:
> >>> Every header file should include the things it requires to compile.
> 
> I've long felt this way: that, except for a very few examples like
>  that are defined to depend on context, the order of
> #includes should not matter.  In particular, if multiple files must be
> included, any of them may come first - so any file that generates
> errors if it's included first needs fixing.  (Well, unless it's an
> internal file, one that shouldn't be included directly.)

assert.h is the *only* header that is not supposed to be idempotent.
Pretty much anything else should be classified as bug.

Another item is that too many of our headers depend on non-standard
compliant types polluting the namespace. Nothing installed in
/usr/include should depend on u_char for example.

> I've got numerous fixes to 4.0.1 for such issues, in case anyone thinks
> it's worth applying this stance to 4.x.

Doing it properly has a chance of creating some fallout and that's not
really acceptable for a release branch, especially one that doesn't get
many non-critical changes.

> > [...] just forward declarations of the structs.
> 
> > (this is, btw, one of the reasons to avoid silly typedefs)
> 
> I'm not sure what typedefs have to do with it.  typedeffing a name to
> an incomplete ("forward") struct type works just fine:
> 
> struct foo;
> typedef struct foo FOO;
> 
> (You can't do anything with a FOO without completing the struct type,
> but you can work with pointers to them)

Problem is that this requires a guard for the typedef if FOO is supposed
to be defined by multiple files.

Joerg

Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread der Mouse

>>> Every header file should include the things it requires to compile.

I've long felt this way: that, except for a very few examples like
 that are defined to depend on context, the order of
#includes should not matter.  In particular, if multiple files must be
included, any of them may come first - so any file that generates
errors if it's included first needs fixing.  (Well, unless it's an
internal file, one that shouldn't be included directly.)

I've got numerous fixes to 4.0.1 for such issues, in case anyone thinks
it's worth applying this stance to 4.x.

> [...] just forward declarations of the structs.

> (this is, btw, one of the reasons to avoid silly typedefs)

I'm not sure what typedefs have to do with it.  typedeffing a name to
an incomplete ("forward") struct type works just fine:

struct foo;
typedef struct foo FOO;

(You can't do anything with a FOO without completing the struct type,
but you can work with pointers to them)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread David Holland

(moving this to tech-kern because it's the right place and per request)

On Mon, Nov 15, 2010 at 11:24:21AM +0900, Masao Uebayashi wrote:
 > > Every header file should include the things it requires to compile.
 > > Therefore, there should in principle be no cases where a header file
 > > (or source file) needs to #include something it doesn't itself use.
 > 
 > This clarifies my long-unanswered question, thanks!

*bow*

 > I've (re)built about 300 kernels in the last days.  I've found:
 > 
 > - sys/sysctl.h's struct kinfo_proc should be moved into sys/proc.h
 >   (I've done this locally).  Otherwise all sysctl node providers
 >   includes sys/proc.h and uvm/uvm_extern.h.
 >   (This is where I started...)

I'm not sure this is a good plan in the long run. Shouldn't it at some
point be unhooked fully from the "real" proc structure?

 > - sys/syscallargs.h should be split into pieces, otherwise all its
 >   users have to know unrelated types (sys/mount.h, sys/cpu.h).

Since system calls don't in general pass structures by value, it
shouldn't need most of those header files, just forward declarations
of the structs.

(this is, btw, one of the reasons to avoid silly typedefs)

 > - sys/proc.h's tsleep(), wakeup(), and friends should be moved into
 >   some common header, because it's widely used API.  sys/proc.h will
 >   be used only for "struct proc" related things.

Given that this is a deprecated API in the long term I'm not sure it's
worthwhile.

-- 
David A. Holland
dholl...@netbsd.org

Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-14 Thread Adam Hamsik


On Nov,Monday 15 2010, at 7:16 AM, Bernd Ernesti wrote:

> On Mon, Nov 15, 2010 at 11:24:21AM +0900, Masao Uebayashi wrote:
>> On Sun, Nov 14, 2010 at 05:52:51AM +, David Holland wrote:
>>> On Sun, Nov 14, 2010 at 03:32:44AM +, Masao Uebayashi wrote:
 XXX What is the conclusion about direct vs. indirect #include from headers?
>>> 
>>> Every header file should include the things it requires to compile.
>>> Therefore, there should in principle be no cases where a header file
>>> (or source file) needs to #include something it doesn't itself use.
>> 
>> This clarifies my long-unanswered question, thanks!
>> 
>> I've (re)built about 300 kernels in the last days.  I've found:
>> 
>> - sys/sysctl.h's struct kinfo_proc should be moved into sys/proc.h
>>  (I've done this locally).  Otherwise all sysctl node providers
>>  includes sys/proc.h and uvm/uvm_extern.h.
>>  (This is where I started...)
>> 
>> - sys/syscallargs.h should be split into pieces, otherwise all its
>>  users have to know unrelated types (sys/mount.h, sys/cpu.h).
>> 
>> - sys/proc.h's tsleep(), wakeup(), and friends should be moved into
>>  some common header, because it's widely used API.  sys/proc.h will
>>  be used only for "struct proc" related things.
> 
> What are the issues here that we need to fix this right now?

I think that it's quite good time to fix, it would be much harder to do this 
after 6.0 branch.

Regards

Adam.

Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-14 Thread Bernd Ernesti

On Mon, Nov 15, 2010 at 11:24:21AM +0900, Masao Uebayashi wrote:
> On Sun, Nov 14, 2010 at 05:52:51AM +, David Holland wrote:
> > On Sun, Nov 14, 2010 at 03:32:44AM +, Masao Uebayashi wrote:
> >  > XXX What is the conclusion about direct vs. indirect #include from 
> > headers?
> > 
> > Every header file should include the things it requires to compile.
> > Therefore, there should in principle be no cases where a header file
> > (or source file) needs to #include something it doesn't itself use.
> 
> This clarifies my long-unanswered question, thanks!
> 
> I've (re)built about 300 kernels in the last days.  I've found:
> 
> - sys/sysctl.h's struct kinfo_proc should be moved into sys/proc.h
>   (I've done this locally).  Otherwise all sysctl node providers
>   includes sys/proc.h and uvm/uvm_extern.h.
>   (This is where I started...)
> 
> - sys/syscallargs.h should be split into pieces, otherwise all its
>   users have to know unrelated types (sys/mount.h, sys/cpu.h).
> 
> - sys/proc.h's tsleep(), wakeup(), and friends should be moved into
>   some common header, because it's widely used API.  sys/proc.h will
>   be used only for "struct proc" related things.

What are the issues here that we need to fix this right now?

Can you please post changes and/or start a thread about all this
before you do them on tech-kern?

Bernd

Re: CVS commit: src/sys/arch

2010-02-07 Thread Jukka Ruohonen

On Sat, Feb 06, 2010 at 01:07:08PM -0800, Paul Goyette wrote:
> If it matches a device, and there is also a "native" driver for the 
> underlying i2c controller, then there'll be two devices accessing the 
> same bus.  Bad things (tm) will happen.  This is noted in the BUGS 
> section of the acpismbus(4) man page.

On a related note, a similar warning should be probably added to aiboost(4).
At least on Linux it is known to cause weird problems and lockups if the
iic(4) is being accessed at the same time by a "native" driver (it87?).

It is also a reasonable assumption that things will get worse at this front.
The new ACPI 4.0 standard introduced a "sensor framework" of its own, and my
guess is that consumer PC manufacturers will jump on the bandwagon, trying
to hide these things in the abyss of ACPI.

- Jukka.

Re: CVS commit: src/sys/arch

2010-02-06 Thread Paul Goyette


On Sat, 6 Feb 2010, Joerg Sonnenberger wrote:


The ALL kernel is NOT intended to be usable. It is all about compile
coverage of as many code pathes possible (or those complemental to
GENERIC). As such: please do add it with a BIG warning before to
annotate that it should not just be copied to a custom config.


Done.  Thanks!


-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

Re: CVS commit: src/sys/arch

2010-02-06 Thread Joerg Sonnenberger

On Sat, Feb 06, 2010 at 01:07:08PM -0800, Paul Goyette wrote:
> >If it builds it should be enabled in the ALL kernel.
> 
> In this case, I would prefer not to enable it.
> 
> If it matches a device, and there is also a "native" driver for the
> underlying i2c controller, then there'll be two devices accessing
> the same bus.  Bad things (tm) will happen.  This is noted in the
> BUGS section of the acpismbus(4) man page.

The ALL kernel is NOT intended to be usable. It is all about compile
coverage of as many code pathes possible (or those complemental to
GENERIC). As such: please do add it with a BIG warning before to
annotate that it should not just be copied to a custom config.

Joerg

Re: CVS commit: src/sys/arch

2010-02-06 Thread Stephen Borrill


On Sat, 6 Feb 2010, Paul Goyette wrote:

On Sat, 6 Feb 2010, Tobias Nygren wrote:


On Sat, 6 Feb 2010 20:12:32 +
Paul Goyette  wrote:


Modified Files:
src/sys/arch/amd64/conf: GENERIC
src/sys/arch/i386/conf: ALL GENERIC

Log Message:
Add acpismbus enries - commented out!


If it builds it should be enabled in the ALL kernel.


In this case, I would prefer not to enable it.

If it matches a device, and there is also a "native" driver for the 
underlying i2c controller, then there'll be two devices accessing the same 
bus.  Bad things (tm) will happen.  This is noted in the BUGS section of the 
acpismbus(4) man page.


The point is that ALL is not intended to be bootable (except as a lucky 
fluke). It's a sanity test kernel, not a production one.


--
Stephen

Re: CVS commit: src/sys/arch

2010-02-06 Thread Paul Goyette


(Adding tech-kern@)

On Sat, 6 Feb 2010, Tobias Nygren wrote:


On Sat, 6 Feb 2010 20:12:32 +
Paul Goyette  wrote:


Modified Files:
src/sys/arch/amd64/conf: GENERIC
src/sys/arch/i386/conf: ALL GENERIC

Log Message:
Add acpismbus enries - commented out!


If it builds it should be enabled in the ALL kernel.


In this case, I would prefer not to enable it.

If it matches a device, and there is also a "native" driver for the 
underlying i2c controller, then there'll be two devices accessing the 
same bus.  Bad things (tm) will happen.  This is noted in the BUGS 
section of the acpismbus(4) man page.



-
|   Paul Goyette   | PGP DSS Key fingerprint: |  E-mail addresses:  |
| Customer Service | FA29 0E3B 35AF E8AE 6651 |  paul at whooppee.com   |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

48 matches

Mail list logo