Re: poll(): IN/OUT vs {RD,WR}NORM

2024-05-28 Thread Valery Ushakov
On Tue, May 28, 2024 at 02:33:48 +, David Holland wrote:

> anything other than the same set of vague descriptions we had in the
> older poll(2).

poll(2) is ... ok, I'm not even sure what adjective to use here.  I
had to write some async TCP poll code that needed to work on Linux,
Solaris and MacOS, and I also tested it on NetBSD - the behavior was
fairly noticeably different (half-close was half the fun).  Yet all
behaviors were conforming to what the vague descriptions in (various)
poll(2) manpages said.
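
One way to see the differences is to half-close a connection and log
the raw revents bits each platform reports.  A minimal sketch, assuming
a local AF_UNIX stream socket pair as a stand-in for TCP; the helper
names poll_read_events and revents_after_peer_half_close are
hypothetical and not taken from the code mentioned above:

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Poll one descriptor for readability and return the raw revents
 * mask: 0 on timeout, -1 if poll() itself failed.
 */
short
poll_read_events(int fd, int timeout_ms)
{
	struct pollfd pfd;

	assert(fd >= 0);
	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	if (poll(&pfd, 1, timeout_ms) < 0)
		return -1;
	return pfd.revents;
}

/*
 * Half-close one end of a socket pair (no more writes from that side)
 * and report what poll() says about the other end.  Returns -1 if the
 * setup fails.
 */
short
revents_after_peer_half_close(void)
{
	int sv[2];
	short revents;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if (shutdown(sv[0], SHUT_WR) == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	revents = poll_read_events(sv[1], 1000);
	close(sv[0]);
	close(sv[1]);
	return revents;
}
```

The exact mask (POLLIN alone, POLLHUP, or both) is deliberately not
asserted here, since that is the part that varies between systems.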

-uwe


Re: [RFC] new APIs to use wskbd(4) input on non-wsdisplay tty devices

2024-04-08 Thread Valery Ushakov
On Mon, Apr 08, 2024 at 23:27:11 +0900, Izumi Tsutsui wrote:

> macallan@ wrote:
[...]
> > Oh, so it's an entire terminal emulation, not just something that lets
> > you draw characters?
> 
> Ah, maybe I see misunderstandings among us.
> 
> In sgimips crmfb and newport cases, a putchar() function provided
> by the firmware just draws a character (glyph) at the cursor
> or specified position. All virtual terminal emulation ops are
> done in wsdisplay(9) and vcons(9) layer and MD drivers just draw
> characters (or whitespace) per vcons pseudo text VRAM attributes.
> 
> On the other hand, news68k (and sun) machines have putchar()
> that also handles virtual terminal ops like backspace, CR/LF,
> and even scrolling at the bottom of screen. In this case
> no VT emulation layer is necessary in the kernel side,
> so kernel's putc(9) just calls firmware's putchar(),
> and for userland processes we can simply pass translated
> wskbd inputs to line discipline of the tty device.
> 
> That's the reason why I proposed to add register/deregister
> APIs to pass wskbd data to romcons tty device.
> 
> What do you think about this case?

Add trivial wsemul_none (or wsemul_delegate, or whatever a suitable
name might be) that does even less than wsemul_dumb and only ever uses
putchar to pass chars to the firmware emulator?

-uwe


Re: [RFC] new APIs to use wskbd(4) input on non-wsdisplay tty devices

2024-04-06 Thread Valery Ushakov
On Sat, Apr 06, 2024 at 23:56:27 +0900, Izumi Tsutsui wrote:

> To support "text only" framebuffer console, we can use putchar
> functions provided by the firmware PROM.

Is that a console-typewriter--like device without addressable cursor
terminal emulation?  Can you use wsemul_dumb to avoid rasops?  It
still uses copy/erase, but with trivial argument values that can be
reversed back to putchar calls for tab/lf (from a very quick look).


> The attached patch provides new two APIs
> - wskbd_consdev_kbdinput_register()
> - wskbd_consdev_kbdinput_deregister()
> to allow a kernel to use wskbd(9) for non-wsdisplay tty device.

AFAIU, there's nothing console-specific in this (except that its
first use is going to be for a console), so maybe it would be better
to drop the "consdev" from the name?


> Index: sys/dev/wscons/wskbd.c
> ===
> RCS file: /cvsroot/src/sys/dev/wscons/wskbd.c,v
> retrieving revision 1.143
> diff -u -p -d -r1.143 wskbd.c
> --- sys/dev/wscons/wskbd.c   5 Feb 2019 10:04:49 -   1.143
> +++ sys/dev/wscons/wskbd.c   6 Apr 2024 06:59:50 -
[...]
> @@ -706,6 +709,24 @@ wskbd_input(device_t dev, u_int type, in
>   }
>  #endif
>  
> +#if NWSDISPLAY == 0
> + if (sc->sc_translating) {

The #endif above is for NWSDISPLAY > 0, so maybe get rid of the
ifdefs and use plain ifs?
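
A minimal sketch of what "plain ifs" could look like, assuming the
count macro is always defined (in a kernel build config(1) generates
NWSDISPLAY in "wsdisplay.h"; it is defined by hand here so the sketch
stands alone).  The enum and route_input are hypothetical stand-ins,
not code from the patch:

```c
#include <assert.h>

/* Defined by hand for the sketch; normally comes from "wsdisplay.h". */
#ifndef NWSDISPLAY
#define NWSDISPLAY 0
#endif

/* Hypothetical labels for where a translated key press ends up. */
enum input_path { PATH_WSDISPLAY, PATH_TTY_HOOK, PATH_DROPPED };

enum input_path
route_input(int translating, int have_tty_hook)
{
	assert(translating == 0 || translating == 1);

	/*
	 * Plain ifs on the constant: both branches are always parsed
	 * and type-checked, and the compiler can drop the branch whose
	 * condition is constant false.
	 */
	if (NWSDISPLAY > 0) {
		if (translating)
			return PATH_WSDISPLAY;
	}
	if (NWSDISPLAY == 0) {
		if (translating && have_tty_hook)
			return PATH_TTY_HOOK;
	}
	return PATH_DROPPED;
}
```

The trade-off is that everything named in the dead branch still has to
be declared, which #if would have hidden.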

Thanks.

-uwe


Re: Change max ttys from 8 to 12?

2023-12-19 Thread Valery Ushakov
On Tue, Dec 19, 2023 at 11:10:52 +0100, Dan-Simon Myrland wrote:

> 2) Make a custom kernel with the option WSDISPLAY_DEFAULTSCREENS=12

Why?  WSDISPLAY_DEFAULTSCREENS is the number of screens pre-created by
the kernel, but you can always create as many as you need (subject to
WSDISPLAY_MAXSCREEN), see /etc/wscons.conf and /etc/rc.d/wscons.  The
default for that default is actually 0.


> I don't mind that NetBSD has four active ttys by default, but the steps
> to enable all 12 seems unnecessarily tedious. I realize that not all
> architectures have 12 function keys, but laptops usually do. If the
> kernel supported a maximum of 12 ttys on popular architectures,
> enabling them would only require steps 1 and 2.

But where do you stop?  macs have like, what, 19?

I don't like the idea.  The only real limitation currently is the
switching commands/keysyms.  WSDISPLAY_MAXSCREEN is a static limit b/c
it was easier to hardcode it, but it may be made an option easily as
far as I can tell.

Switching from a fixed size array to a dynamic one is probably not too
much work either.  But then, overall, I think that trying to make the
kernel substitute for screen, tmux (in base), etc is kinda dead end,
so I'd rather we don't encourage it.

-uwe


Re: how do I preset ddb's LINES to zero

2023-12-15 Thread Valery Ushakov
On Fri, Dec 15, 2023 at 11:19:39 -0500, Andrew Cagney wrote:

> I've the stock 10.0 boot.iso booting within a KVM based test
> framework.  I'd like to set things up so that should there be a panic,
> it dumps registers et.al., without stopping half way waiting for
> someone to hit the space bar vis:
> 
> r9  af80b451d080
> r10 81d9a063
> r11 0
> r12 0
> --db_more--
> 
> Is there a way to stop this without having to rebuild the kernel?

If the kernel is already there, you can't avoid that prompt without
*some* interaction.  I don't think you can tweak this from boot.cfg.

There's probably no good default for db_more prompt, as there are
situations where someone wants it on and someone off.  Maybe we
should make that into a boot argument, so that if a script talks to
the console, it can issue the corresponding boot command at a well
defined time instead of doing expect-like things?  Or maybe force the
paging off for db_cmd_on_enter.

PS: xen console seems to forcibly override db_max_line to avoid paging
prompt.

-uwe


Re: [RFC] userconf(4) modification

2023-11-02 Thread Valery Ushakov
On Thu, Nov 02, 2023 at 16:29:42 +0100, tlaro...@kergis.com wrote:

> You will find attached the man page in order to be able to comment
> about the proposed new syntax---supplementary syntax: it does not
> replace the "legacy" one.

The man page is super-confusing.  Someone who needs to use userconf to
get their system to boot needs a clear reference, but the proposed
version tries to be overly formal and ends up a bit opaque.

I also don't understand why it is necessary to call the old syntax -
"legacy".  From the man page my impression is that the command can be
either

command dev

or

command property = value

both are in a sense a kind of device selector, why do you have to
declare one of them "legacy"?  The user probably doesn't care much
either way, they need to get the kernel booting and are not interested
in the lore.

Why is the thing after = called "expression"?  That position only
accepts two kinds of literals, one of which is a shorthand for the
other (but I had to re-read that paragraph several times and I'm still
not quite sure it actually clearly says that).

-uwe


Re: Testing Emulation Syscalls

2023-08-01 Thread Valery Ushakov
On Tue, Aug 01, 2023 at 12:39:46 +0200, Martin Husemann wrote:

> On Tue, Aug 01, 2023 at 01:34:54PM +0300, Valery Ushakov wrote:
> > As for testing emulated syscalls - can we solve this problem with a
> > bit of elf branding to convince the kernel to start the binary under
> > emulation directly?  Inventing a whole new backdoor API for that seems
> > kinda an overkill.
> 
> That is probably quite easy to do, but we have a toolchain problem then
> (solvable too).
> 
> We need build.sh to be able to produce the test binaries (including
> any needed libs, which don't have to be "native" libs of the emulated
> system).

For simple cases - can we get away with tiny'ish freestanding test
programs that invoke the tested syscalls so that we don't have to pull
a new cross-compilation setup out of thin air just for that?  Testing
something like lwp-related calls probably requires a real linux
libpthread anyway, not a gum-and-toothpicks effigy.

-uwe


Re: Testing Emulation Syscalls

2023-08-01 Thread Valery Ushakov
On Tue, Aug 01, 2023 at 08:16:50 +0200, Martin Husemann wrote:

> On Mon, Jul 31, 2023 at 05:03:48PM -0400, Theodore Preduta wrote:
> > One idea (mentioned in the original thread) would be to introduce a
> > syscall along the lines of
> > 
> > int emul_syscall(const char *emul_name, int number, ...)
> > 
> > which executes a single syscall.  The flaw with this idea is that state
> > may need to be stored across syscalls in struct linux_emuldata, but I
> > don't know how this interface could accommodate this.
> > 
> > Another idea would be to introduce a syscall along the lines of
> > 
> > int setemul(const char *emul_name)
> > 
> > which would switch the syscall table dynamically so that the test case
> > could be run under emulation (preserving emuldata state) and then switch
> > back to report the result.  (And then individual syscalls would be
> > called via __syscall(2).)
> 
> I think this would be quite tricky for the test code in userland.
> 
> But what about a variant of the initial suggestion:
> 
> // returns an integer descriptor
> int open_emul(const char *emulname);
> 
> // invokes a syscall under an open emulation
> int emul_syscall(int emul, int number, ...);
> 
> // frees all state for the emulation, returns 0 or -1
> int close_emul(int emul);
> 
> IMO this still is far better than exposing native syscalls that we do
> not really need/want.

I'd rather we don't expose epoll(2) as a native syscall, at least not
just yet, given what pkgsrc folks are telling us.  It's not like we
have pressing need to use it in our own code in base.  And for third
party software it creates confusion as was pointed out.

As for testing emulated syscalls - can we solve this problem with a
bit of elf branding to convince the kernel to start the binary under
emulation directly?  Inventing a whole new backdoor API for that seems
kinda an overkill.

-uwe


Re: [PATCH] style(5): No struct typedefs

2023-07-11 Thread Valery Ushakov
On Tue, Jul 11, 2023 at 05:56:27 -0700, Jason Thorpe wrote:

> > On Jul 11, 2023, at 3:17 AM, Taylor R Campbell  wrote:
> > 
> > If we used `struct bus_dma_tag *' instead, the forward declaration
> > could be `struct bus_dma_tag;' instead of having to pull in all of
> > sys/bus.h, _and_ the C compiler would actually check types.
> 
> In the original design, it's not always a struct.  That was the
> whole point of using a more abstract type.

The bus_dma_tag_t example from the original email is not the best one,
but I didn't want to open that can of worms in my reply, so I
mentioned the "not always struct" case without actually mentioning
names.

The style(5) specifically gives an example of a struct typedef, not of
an opaque typedef.


> If you want to hide the struct'ness in a machdep header file, fine,
> but I completely disagree with the notion of requiring the use of
> the "struct" keyword all over the place.

I used to lean both ways at different times and in different contexts.
I think that existence and usefulness of opaque typedefs is exactly
the strong argument against using "convenience struct typedefs", b/c
the latter dilute the message so to speak.  If someone wants to
program with "systems hungarian", they know where to find it...

-uwe


Re: [PATCH] style(5): No struct typedefs

2023-07-11 Thread Valery Ushakov
On Tue, Jul 11, 2023 at 10:17:24 +, Taylor R Campbell wrote:

> I propose the attached change to KNF style(5) to advise against
> typedefs for structs or unions, or pointers to them.
[...]
> (Typedefs for integer, function, and function pointer types are not
> covered by this advice.)

Yes, please.

Typedefs make sense when the type is *really* opaque and can, behind
the scenes, be an integer type, a pointer or a struct.  [Ab]using
typedefs to save 8 bytes of "struct " + "*" just adds cognitive load
(and whatever logistical complications that you have enumerated in the
elided part of the quote).
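
A minimal sketch of the distinction, with hypothetical names (struct
widget, cookie_t) that do not come from the tree: the struct pointer
needs only a forward declaration and is type-checked, while the
typedef is reserved for a handle whose representation really may
differ between ports.

```c
#include <assert.h>
#include <stddef.h>

/* A forward declaration is all a consumer needs; no header pulled in. */
struct widget;

/* The pointer type is checked even while the struct is incomplete. */
int widget_id(const struct widget *w);

/*
 * A genuinely opaque handle: callers must not assume it is a struct,
 * a pointer or an integer.  Here it happens to be an integer.
 */
typedef unsigned long cookie_t;

cookie_t widget_cookie(const struct widget *w);

/* The "implementation side", which alone sees the layout. */
struct widget {
	int id;
};

int
widget_id(const struct widget *w)
{
	assert(w != NULL);
	return w->id;
}

cookie_t
widget_cookie(const struct widget *w)
{
	assert(w != NULL);
	return (cookie_t)w->id;
}
```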

-uwe


Re: PROPOSAL: Split uiomove into uiopeek, uioskip

2023-05-09 Thread Valery Ushakov
On Tue, May 09, 2023 at 14:33:26 -0700, Jason Thorpe wrote:

> I'm not a fan of uioskip() as a name - I think uioadvance() would be
> better.  Skip implies, to my brain, that the data is being thrown
> away (even if you've already consumed it).

I agree.  "skip" seems to have the wrong connotations (cf. dd(1)).
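
To make the naming question concrete, here is a toy userland model of
the peek/advance split.  It is only a sketch under the assumption of
one flat buffer; struct toy_uio and both functions are hypothetical
and are not the proposed kernel API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct uio: one flat buffer plus a cursor. */
struct toy_uio {
	const char *base;	/* next unconsumed byte */
	size_t resid;		/* bytes remaining */
};

/* Copy up to n bytes out without consuming them; returns bytes copied. */
size_t
toy_uiopeek(void *dst, size_t n, const struct toy_uio *uio)
{
	if (n > uio->resid)
		n = uio->resid;
	memcpy(dst, uio->base, n);
	return n;
}

/* Commit: move the cursor past n bytes that were already peeked. */
void
toy_uioadvance(struct toy_uio *uio, size_t n)
{
	assert(n <= uio->resid);
	uio->base += n;
	uio->resid -= n;
}
```

Nothing is discarded by the second call, which is the argument for
"advance" over "skip".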

-uwe


Re: building 9.1 kernel with /usr/src elsewhere?

2023-03-08 Thread Valery Ushakov
On Wed, Mar 08, 2023 at 15:22:11 +1100, matthew green wrote:

> > This completed apparently normally, reporting the build directory and
> > telling me to remember to make depend.  I then went to ~/kbuild/GEN91
> > and ran make depend && make.  It failed fast - no more than a second or
> > two - with
> >
> > make[1]: don't know how to make absvdi2.c. Stop
> 
> what happens if you run "make USETOOLS=no"?

That's orthogonal.  The problem is that NETBSDSRCDIR cannot be
inferred for a randomly located kernel builddir and
sys/lib/libkern/Makefile.compiler-rt uses it.  I don't know enough
about sys Makefiles, but maybe using $S instead of
${NETBSDSRCDIR}/sys will just fix it.  But maybe it will also break
something else.  Our makefile spaghetti is a bit out of control.

-uwe


Re: The list of __HAVE macros

2023-03-05 Thread Valery Ushakov
On Sun, Mar 05, 2023 at 18:48:08 +, Taylor R Campbell wrote:

> > I think it might be a good idea to document the list in one place with
> > xrefs to the relevant section 9 pages (and to document the options
> > there too, e.g. __HAVE_SIMPLE_MUTEXES in mutex(9)).
> 
> I agree!  Want to draft a skeleton in share/man/man9, say
> portfeatures.9 or something, so we can fill them and xref as needed?

Done.  If people can document stuff there and in relevant section 9
pages that would be cool.  Don't get too concerned with the finer
points of mdoc markup, I'll try to clean that up if necessary.

-uwe


The list of __HAVE macros

2023-03-05 Thread Valery Ushakov
We don't seem to have(9) a man page that lists all __HAVE_* macros
that a port may provide.  E.g.

  $ apropos -M '"__HAVE_PREEMPTION"'
  cpu_need_resched (9)  context switch notification

but

  $ apropos -M '"__HAVE_SIMPLE_MUTEXES"'
  apropos: No relevant results obtained.
  Please make sure that you spelled all the terms correctly or try using different keywords.

I think it might be a good idea to document the list in one place with
xrefs to the relevant section 9 pages (and to document the options
there too, e.g. __HAVE_SIMPLE_MUTEXES in mutex(9)).

-uwe


Re: erlang -> asmjit -> mremap questions/bugs

2023-03-01 Thread Valery Ushakov
On Wed, Mar 01, 2023 at 15:29:27 +0100, Thomas Klausner wrote:

> It seems the problem is that mmap() in the mremap(2) man page example
> (which was used to implement the asmjit version) is not using
> MAP_SHARED.
> 
> - I'd like to add MAP_SHARED in the mmap() call in the mremap(2) man
>   page example. Is that fine?

Not really.  There are no other processes involved in the example, so
MAP_SHARED makes no sense.


> - Reading mmap(2) it seems that one of MAP_SHARED or MAP_PRIVATE is
>   required, but there is no error if none is provided. Should we change
>   mmap() to return an error in that case?

sys/kern/vfs_vnops.c:

/*
 * Old programs may not select a specific sharing type, so
 * default to an appropriate one.
 */


> - Why does MAP_PRIVATE (instead of MAP_SHARED) not work?

Are there multiple processes involved in the erlang case?
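
For the single-process case, a private anonymous mapping with the
sharing type spelled out looks like the sketch below.  It assumes
MAP_ANON is available and leaves mremap(2) out on purpose, because its
signature differs between systems; map_private_anon is a hypothetical
helper name:

```c
#define _DEFAULT_SOURCE 1	/* glibc: expose MAP_ANON */

#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map len bytes of zero-filled memory visible to this process only.
 * Returns NULL on failure.
 */
void *
map_private_anon(size_t len)
{
	void *p;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}
```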


-uwe


Re: Finding the slot in the ioconf table a module attaches to?

2023-02-01 Thread Valery Ushakov
On Wed, Feb 01, 2023 at 11:14:42 -0800, Brian Buhrow wrote:

>   hello.  Okay.  That is helpful.  Passing -1 in as the cmajor
> number to the devsw_attach() function does, in fact, assign a
> reasonable major number which seems to work.  I use the
> cdevsw_lookup_major() function to retrieve the assigned number and
> print it for the user.

devsw_attach updates cmajor with the assigned number if you passed
NODEVMAJOR (-1) in it, so you don't even need to look it up
separately.  We also have in-kernel convenience "MAKEDEV".

E.g., paraphrasing a bit, vbox guest additions module does:

bmajor = cmajor = NODEVMAJOR;
error = devsw_attach("vboxguest", NULL, ,
 _cdevsw, );
if (error)
...

error = do_sys_mknod(curlwp, "/dev/vboxguest",
 S_IFCHR | 0666, makedev(cmajor, 0),
 , UIO_SYSSPACE);
if (error == EEXIST) {
error = 0;
/*
 * Since NetBSD doesn't yet have a major reserved for
 * vboxguest, the (first free) major we get will
 * change when new devices are added, so an existing
 * /dev/vboxguest may now point to some other device,
 * creating confusion (tripped me up a few times).
 */
aprint_normal("vboxguest: major %d:"
  " check existing /dev/vboxguest\n", cmajor);
}

(The comment is no longer true, as we do have a reserved major for vbox now).

-uwe


Re: Finding the slot in the ioconf table a module attaches to?

2023-02-01 Thread Valery Ushakov
On Wed, Feb 01, 2023 at 08:27:57 -0500, Brad Spencer wrote:

> To add a bit...  generally I have just added an entry to one of the
> "major" files in sys/conf.  However, I have noticed that in order for
> the module to be able to use it, after the major file edit, I had to
> rebuild the kernel as well.  I have never been 100% sure that was proper
> behavior, but it seems to be the case.  That is, just editing the major
> file and building or rebuilding the module has not been enough.

.Xr devsw_attach 9

Major numbers (mapping "foo" -> 42, so that a program that opens a
node with major 42 gets to the device "foo") are a property of the
kernel config, so yes, you need to rebuild the kernel when you
introduce a new fixed major number.

When the module is loaded, the driver tells the kernel, "I'm `foo'".
It can also tell the kernel either "I'm ok with whatever major number
you give me", or it can tell "I want a specific major number N".  It's
an error to request a specific major N that is already taken (either
fixed in a major config file, or dynamically allocated to another
driver).
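
The rules above can be modelled in userland.  This is a toy sketch,
not the kernel's devsw_attach: the table size, the toy_ names and the
choice of error codes are assumptions, and the name-to-fixed-major
lookup from the majors files is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_NODEVMAJOR	(-1)
#define TOY_NMAJORS	8

/* One slot per major number; NULL means the major is free. */
static const char *toy_majors[TOY_NMAJORS];

/*
 * Attach driver "name".  If *major is TOY_NODEVMAJOR, pick the first
 * free major and store it in *major; otherwise claim exactly *major.
 * Returns 0 or an errno value.
 */
int
toy_devsw_attach(const char *name, int *major)
{
	int i;

	assert(name != NULL && major != NULL);

	if (*major == TOY_NODEVMAJOR) {
		for (i = 0; i < TOY_NMAJORS; i++) {
			if (toy_majors[i] == NULL) {
				toy_majors[i] = name;
				*major = i;
				return 0;
			}
		}
		return ENOMEM;		/* table full */
	}

	if (*major < 0 || *major >= TOY_NMAJORS)
		return EINVAL;
	if (toy_majors[*major] != NULL)
		return EEXIST;		/* requested major already taken */
	toy_majors[*major] = name;
	return 0;
}
```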

-uwe


Re: Finding the slot in the ioconf table a module attaches to?

2023-02-01 Thread Valery Ushakov
On Wed, Feb 01, 2023 at 08:28:35 +, RVP wrote:

> /usr/src/sys/modules/examples/readhappy/readhappy.c
> /usr/src/sys/conf/majors*

Hmm, lots of real modules seem to use config_init_component() that is
not documented at all in the section 9.  Can someone please write a
man page for that?  I'll help with mdoc if troff incantations make you
anxious :)

-uwe


Re: Add five new escape sequences to wscons

2023-01-16 Thread Valery Ushakov
On Mon, Jan 16, 2023 at 15:10:06 -0300, Crystal Kolipe wrote:

> On Mon, Jan 16, 2023 at 08:20:35PM +0300, Valery Ushakov wrote:
> > On Mon, Jan 16, 2023 at 09:18:53 -0300, Crystal Kolipe wrote:
> > 
> > > It's useful, because these sequences correspond to the terminfo
> > > capabilities rin, indn, vpa, hpa, and cbt as defined in the xterm
> > > terminfo entry.  With these sequences implemented, it becomes
> > > slightly more practical to set TERM=xterm when connecting to remote
> > > systems that don't have a comprehensive terminfo database.
> > 
> > Why is it desirable to set specifically TERM=xterm instead of, say,
> > vt220, or whichever vt entry describes wscons the closest?
> 
> The xterm entry supports colour, which vt220 does not.

As someone who routinely runs xterm with TERM=vt220 I'm probably not
qualified to comment further.


> The multi-line scroll commands, as far as I understand, are supposed to
> scroll the entire screen, (or the scrolling region).

It's the "or the scrolling region" part that I'm not sure about.  The
terminfo documentation seems to indicate that the scrolling
capabilities like "ind" are to operate on the whole screen.

E.g. X/Open Curses, Issue 7 (p.353):

  To scroll text up, a program goes to the bottom left CORNER OF THE
  SCREEN and sends the ind (index) string.  To scroll text down, a
  program goes to the top left CORNER OF THE SCREEN and sends the ri
  (reverse index) string.  The strings ind and ri are undefined when
  not on their respective corners of the screen.

On the other hand a few pages later the same document says (p.356):

  To determine whether a terminal has destructive scrolling regions or
  non-destructive scrolling regions, create a scrolling region in the
  middle of the screen, place data on the bottom line of the scrolling
  region, move the cursor to the top LINE OF THE SCROLLING REGION, and
  do a reverse index (ri) followed by a delete line (dl1) or index
  (ind).  If the data that was originally on the bottom line of the
  scrolling region was restored into the scrolling region by dl1 or
  ind, then the terminal has non-destructive scrolling regions.
  Otherwise, it has destructive scrolling regions.

I cannot find any passages that would explicitly say how ind/ri and
csr interact.  (Note, I'm not talking about the observed behaviour of
specific xterm/vt commands, but about the semantic of terminfo
capabilities as abstractly defined in the ETI).

Maybe it's so obvious to everyone involved that "ind" and "ri" are to
operate on the scrolling region that no-one even realizes that the
current wording does actually say something different and you need to
do exegetics on a tangential remark elsewhere in the document to be
kinda able to infer that it's "screen (or the scrolling region)".
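
The two readings of "ind" can be put side by side in a toy model.
This is only an illustration of the ambiguity, with a five-row screen
of one character per row; it is not wsemul_vt100 code and all names
are hypothetical:

```c
#include <assert.h>
#include <string.h>

#define ROWS 5

/* One character per row is enough to see which rows move. */
struct toy_screen {
	char row[ROWS];
	int top, bot;		/* scrolling region, inclusive */
};

/* Scroll rows [first, last] up by one line and blank the last one. */
static void
scroll_up(struct toy_screen *s, int first, int last)
{
	assert(0 <= first && first <= last && last < ROWS);
	memmove(&s->row[first], &s->row[first + 1], (size_t)(last - first));
	s->row[last] = ' ';
}

/* Reading 1: "ind" scrolls the whole screen. */
void
ind_whole_screen(struct toy_screen *s)
{
	scroll_up(s, 0, ROWS - 1);
}

/* Reading 2: "ind" scrolls only the scrolling region set by csr. */
void
ind_region(struct toy_screen *s)
{
	scroll_up(s, s->top, s->bot);
}
```

With rows A..E and a region covering rows 1 to 3, the first reading
loses row A while the second leaves A and E in place.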

-uwe


Re: Add five new escape sequences to wscons

2023-01-16 Thread Valery Ushakov
On Mon, Jan 16, 2023 at 09:18:53 -0300, Crystal Kolipe wrote:

> It's useful, because these sequences correspond to the terminfo
> capabilities rin, indn, vpa, hpa, and cbt as defined in the xterm
> terminfo entry.  With these sequences implemented, it becomes
> slightly more practical to set TERM=xterm when connecting to remote
> systems that don't have a comprehensive terminfo database.

Why is it desirable to set specifically TERM=xterm instead of, say,
vt220, or whichever vt entry describes wscons the closest?

For multi-line scroll the patch just calls scrollup/scrolldown, but
that's not what the single-line scroll commands do (see
wsemul_vt100.c)

I'm actually not entirely convinced that it's even correct to describe
vt220 as having sf/ind scrolling capabilities, b/c the vt220 scrolling
sequences take the scrolling region into account and the terminfo
capabilities for scrolling are defined to operate on the whole screen
as far as I can tell.

So in its current form I don't think this patch is suitable and I'm
not convinced it's needed at all.

-uwe


Re: KSYMS_CLOSEST

2022-12-25 Thread Valery Ushakov
On Sun, Dec 25, 2022 at 17:41:10 +0100, Anders Magnusson wrote:

> Den 2022-12-25 kl. 17:25, skrev Valery Ushakov:
> > On Sun, Dec 25, 2022 at 15:42:47 +0100, Anders Magnusson wrote:
> > 
> > > Den 2022-12-25 kl. 13:43, skrev Valery Ushakov:
> > > > On Sun, Dec 25, 2022 at 09:20:49 +0100, Anders Magnusson wrote:
> > > > 
> > > > > IIRC it was to match the ddb "sift" command.
> > > > I'm not sure I get how it might be used for sifting - a kind of "next"
> > > > for external iteration?  Since we never got around to do that do we
> > > > still want to keep it, or shall we deprecate/delete it?
> > >
> > > Ah! I had to look at the code - no, it has nothing to do with sift.
> > > I think it is implicit when asking for a name these days; it is used
> > > to get nearest lower address in debug output. (like
> > > tstile+0x18 )
> >
> > Right, right, but I wonder what could it possibly mean then, when the
> > flag is not specified - as opposed to the example above.  I.e. if
> > KSYMS_CLOSEST is foo+0x10, what KSYMS_EXTERN (i.e. no specific flags)
> > could be, other than foo+0x10, for the same address?  I mean,
> > technically, netbsd + 0xcaffe42 would also be a correct reply in that
> > case :)
>
> :-)  If you are not specifying KSYMS_EXACT, you may not get the exact
> address, yes.  That is true :-)
>
> > Also, checking the very first versions of ksyms code I don't see
> > KSYMS_CLOSEST ever actually handled (it's defined and specified in the
> > ddb strategy defines, but never tested in ksyms).  Maybe I missed
> > some later short-lived incarnation.
> > 
> > The existing call sites that supply the flag look like cargo-cult^W^W
> > common sense ("looks like you might need to specify that flag to get
> > foo+0x10, well, *shrug*, won't hurt").
> I assume that might be the case, yes.
> The ksyms code comes from another system for which I wrote it a long time
> ago, where the meaning may have had a significance (do not remember).
> But feel free to clean this up.  (IMHO KSYMS_EXACT should be the default,
> requiring KSYMS_CLOSEST to be defined if that is requested).

But KSYMS_EXACT has different meaning.  It means to look for exactly
"foo" (foo+0) and fail otherwise.

if ((f & KSYMS_EXACT) && (v != es->st_value))
return ENOENT;
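
A toy model of the two behaviours, as a sketch only: the table, the
TOY_EXACT flag and toy_lookup are hypothetical and merely mimic the
check quoted above.  Without the flag the lookup already returns the
nearest lower symbol plus an offset, which is why a separate "closest"
flag has nothing left to select.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_EXACT	0x1

struct toy_sym {
	const char *name;
	unsigned long value;	/* start address */
};

/*
 * Find the symbol with the highest start address that is <= v.
 * With TOY_EXACT, succeed only if v is exactly that start address.
 * Returns 0 or ENOENT.
 */
int
toy_lookup(const struct toy_sym *tab, size_t n, unsigned long v, int flags,
    const char **namep, unsigned long *offp)
{
	const struct toy_sym *best = NULL;
	size_t i;

	assert(namep != NULL && offp != NULL);

	for (i = 0; i < n; i++) {
		if (tab[i].value <= v &&
		    (best == NULL || tab[i].value > best->value))
			best = &tab[i];
	}
	if (best == NULL)
		return ENOENT;
	if ((flags & TOY_EXACT) && v != best->value)
		return ENOENT;

	*namep = best->name;
	*offp = v - best->value;
	return 0;
}
```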


-uwe


Re: KSYMS_CLOSEST

2022-12-25 Thread Valery Ushakov
On Sun, Dec 25, 2022 at 15:42:47 +0100, Anders Magnusson wrote:

> Den 2022-12-25 kl. 13:43, skrev Valery Ushakov:
> > On Sun, Dec 25, 2022 at 09:20:49 +0100, Anders Magnusson wrote:
> > 
> > > IIRC it was to match the ddb "sift" command.
> > I'm not sure I get how it might be used for sifting - a kind of "next"
> > for external iteration?  Since we never got around to do that do we
> > still want to keep it, or shall we deprecate/delete it?
>
> Ah! I had to look at the code - no, it has nothing to do with sift.
> I think it is implicit when asking for a name these days; it is used
> to get nearest lower address in debug output. (like
> tstile+0x18 )

Right, right, but I wonder what could it possibly mean then, when the
flag is not specified - as opposed to the example above.  I.e. if
KSYMS_CLOSEST is foo+0x10, what KSYMS_EXTERN (i.e. no specific flags)
could be, other than foo+0x10, for the same address?  I mean,
technically, netbsd + 0xcaffe42 would also be a correct reply in that
case :)

Also, checking the very first versions of ksyms code I don't see
KSYMS_CLOSEST ever actually handled (it's defined and specified in the
ddb strategy defines, but never tested in ksyms).  Maybe I missed
some later short-lived incarnation.

The existing call sites that supply the flag look like cargo-cult^W^W
common sense ("looks like you might need to specify that flag to get
foo+0x10, well, *shrug*, won't hurt").

-uwe


Re: KSYMS_CLOSEST

2022-12-25 Thread Valery Ushakov
On Sun, Dec 25, 2022 at 09:20:49 +0100, Anders Magnusson wrote:

> IIRC it was to match the ddb "sift" command.

I'm not sure I get how it might be used for sifting - a kind of "next"
for external iteration?  Since we never got around to do that do we
still want to keep it, or shall we deprecate/delete it?


> Den 2022-12-25 kl. 01:01, skrev Valery Ushakov:
> > KSYMS_CLOSEST flag is documented as "Nearest lower match".  However as
> > far as I can tell nothing in ksyms code ever pays attention to this
> > flag and it's not clear to me what meaning one can ascribe to the set
> > of flags that doesn't have KSYMS_CLOSEST set.
> > 
> > Ragge, do you remember what you had in mind for it when you
> > introduced it back in 2003?
> > 
> > I think we should g/c it.
> > 
> > -uwe

-uwe


KSYMS_CLOSEST

2022-12-24 Thread Valery Ushakov
KSYMS_CLOSEST flag is documented as "Nearest lower match".  However as
far as I can tell nothing in ksyms code ever pays attention to this
flag and it's not clear to me what meaning one can ascribe to the set
of flags that doesn't have KSYMS_CLOSEST set.

Ragge, do you remember what you had in mind for it when you
introduced it back in 2003?

I think we should g/c it.

-uwe


Re: symbol lookup in ddb - bad heuristic

2022-12-18 Thread Valery Ushakov
On Sun, Dec 18, 2022 at 09:38:14 -0800, Chuck Silvers wrote:

> > Maybe the hack needs to be applied only with a new special flag, say,
> > KSYMS_RET?  Then we can define separate DB_STGY_PROC (no heuristic)
> > and DB_STGY_RET (with the heuristic).
> > 
> > The downside is that all MD db_stack_trace_print functions need to be
> > adjusted, but it actually makes sense to use both strategies there,
> > b/c when we are traversing an interrupt/exception frame, the
> > DB_STACK_PROC (without the heuristic) is the right thing to use, but
> > unwinding a call needs DB_STACK_RET (with the new flag).
> 
> you're right, to print the right function name we do need to distinguish
> between addresses that are function return addresses and those that are not,
> and the DB_STGY_RET / KSYMS_RET flags that you suggest sound like a fine way
> of doing that.  would you like to implement this or do you want me to do it?

I probably won't get around to it until the next weekend at the
earliest and I'm not territorial about this :), so if you have time
and inclination, please do, otherwise I'll do it when I get around to
it.
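
For reference, the split could look roughly like the toy model below.
The strategy names are prefixed TOY_ because this is not ddb code, and
the "look at addr - 1" rule is an assumption made for the sketch (a
return address points just past the call, so it may already belong to
the next function); the real heuristic may be implemented differently.

```c
#include <assert.h>
#include <stddef.h>

enum toy_strategy {
	TOY_STGY_PROC,	/* address is a pc, e.g. from a trap frame */
	TOY_STGY_RET	/* address is a return address from a call */
};

struct toy_func {
	const char *name;
	unsigned long start;
};

/* Nearest function starting at or below addr, or NULL. */
static const struct toy_func *
nearest_lower(const struct toy_func *tab, size_t n, unsigned long addr)
{
	const struct toy_func *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (tab[i].start <= addr &&
		    (best == NULL || tab[i].start > best->start))
			best = &tab[i];
	}
	return best;
}

/* Name of the function an address should be attributed to. */
const char *
toy_frame_name(const struct toy_func *tab, size_t n, unsigned long addr,
    enum toy_strategy stgy)
{
	const struct toy_func *f;

	assert(tab != NULL);

	/* Assumed rule: step back into the caller for return addresses. */
	if (stgy == TOY_STGY_RET && addr > 0)
		addr--;
	f = nearest_lower(tab, n, addr);
	return f != NULL ? f->name : NULL;
}
```

With this split a breakpoint exactly on a function entry is looked up
with the PROC strategy and keeps its own name.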

Thanks.

-uwe


Re: i386: 9.99.108 traps booting on VirtualBox

2022-12-12 Thread Valery Ushakov
On Mon, Dec 12, 2022 at 23:31:06 +0300, Valery Ushakov wrote:

> > > With KDTRACE_HOOKS enabled (modulo clockintr hack) and the serial
> > > console (for debugging) I see the system stuck on console output when
> > > rc runs.  It gets unstuck on a com interrupt (e.g. pressing a key).
> > > 
> > > Seems to work fine with KDTRACE_HOOKS disabled.
> > 
> > Do you mean that:
> > 
> > - with KDTRACE_HOOKS enabled, clockintr hack applied, and console on
> >   serial, system gets stuck on console output until com interrupt
> 
> Yes, I get some of the early output from rc and then the system
> stalls.  There's no further rc output and I don't get a login prompt
> on the wscons.  When I type a key into the serial console, the output
> gets unstuck and I get the rest of the rc output and the login prompt
> on wscons.

PS: This is not on real hardware though, but under VirtualBox with
serial connected to a TCP port.  I have an ancient Dell laptop with a
real on-board serial that I can probably try to verify this with if
need be.

-uwe


Re: i386: 9.99.108 traps booting on VirtualBox

2022-12-12 Thread Valery Ushakov
On Mon, Dec 12, 2022 at 20:12:57 +, Taylor R Campbell wrote:

> Annoying...  We really shouldn't abuse function prototypes like this:
> according to the prototype, what I did with intr_kdtrace_wrapper is
> correct.

Right, we deceived the compiler and the compiler was like, ok,
boomer...


> I think it would be reasonable to add an exception like you did for
> now, maybe with an INTR_NOTRACE flag (perhaps someone can find a way
> to phrase this positively) instead of a magic number, until we can
> remove the abuse of calling convention for clockintr.

As I said, it was just a quick kludge to avoid recompiling a bunch of
files (and I didn't even get the number right...).


> > With KDTRACE_HOOKS enabled (modulo clockintr hack) and the serial
> > console (for debugging) I see the system stuck on console output when
> > rc runs.  It gets unstuck on a com interrupt (e.g. pressing a key).
> > 
> > Seems to work fine with KDTRACE_HOOKS disabled.
> 
> Do you mean that:
> 
> - with KDTRACE_HOOKS enabled, clockintr hack applied, and console on
>   serial, system gets stuck on console output until com interrupt

Yes, I get some of the early output from rc and then the system
stalls.  There's no further rc output and I don't get a login prompt
on the wscons.  When I type a key into the serial console, the output
gets unstuck and I get the rest of the rc output and the login prompt
on wscons.


> - with KDTRACE_HOOKS disabled, and console on serial, system proceeds
>   without getting stuck on console output?

Yes.


-uwe


Re: symbol lookup in ddb - bad heuristic

2022-12-09 Thread Valery Ushakov
On Sat, Dec 10, 2022 at 01:03:06 +0300, Valery Ushakov wrote:

> That causes breakpoints on a function entry to be misreported:

Actually it's more than that.  The corresponding MD change in i386
db_frame_info that applies the same heuristic causes another side
effect.

With the heuristic I get the following backtrace from the breakpoint
at clockintr for the real problem I've been debugging (see my earlier
mail):

  db{0}> bt
  sysbeepdetach(c2f50680,c1930d9c,0,0,0,0,0,0,0,0) at netbsd:clockintr
  --- switch to interrupt stack ---

but with the MD part of the heuristic also disabled (I missed it
originally), I get:

  db{0}> bt
  clockintr(0,0,0,0,0,0,0,0,c2d72000,c010322a) at netbsd:clockintr
  intr_kdtrace_wrapper(c2f50680,c1930d9c,0,0,0,0,0,0,0,0) at netbsd:intr_kdtrace_wrapper+0x21
  --- switch to interrupt stack ---

Yes, I should have realized I did see that intr_kdtrace_wrapper in
another backtrace, taken earlier, further down the call chain:

  db{0}> bt
  hardclock(0,0,da3eef6c,c04ac8f1,0,0,0,0,0,0) at netbsd:hardclock+0x23
  clockintr(0,0,0,0,0,0,0,0,c2d72000,c010322a) at netbsd:clockintr+0x2a
  intr_kdtrace_wrapper(c2f50680,c1930d9c,0,0,0,0,0,0,0,0) at netbsd:intr_kdtrace_wrapper+0x21
  --- switch to interrupt stack ---

but it kinda drifted out of focus...

-uwe


Re: symbol lookup in ddb - bad heuristic

2022-12-09 Thread Valery Ushakov
On Sat, Dec 10, 2022 at 01:03:06 +0300, Valery Ushakov wrote:

> KSYMS_RET?  Then we can define separate DB_STGY_PROC (no heuristic)
> and DB_STGY_RET (with the heuristic).
> 
> The downside is that all MD db_stack_trace_print functions need to be
> adjusted, but it actually makes sense to use both strategies there,
> b/c when we are traversing an interrupt/exception frame, the
> DB_STACK_PROC (without the heuristic) is the right thing to use, but
> unwinding a call needs DB_STACK_RET (with the new flag).

PS: Grr.  Obviously, I meant to say DB_STGY_PROC and DB_STGY_RET here.

-uwe


Re: i386: 9.99.108 traps booting on VirtualBox

2022-12-09 Thread Valery Ushakov
[ATTN: riastradh]

On Fri, Dec 09, 2022 at 02:59:12 +0300, Valery Ushakov wrote:

> [reposting from current-users]
> 
> On Wed, Nov 30, 2022 at 13:05:52 +0300, Valery Ushakov wrote:
> 
> > I tried to upgrade a 32-bit VBox VM from 9.99.99 to .107 and the
> > kernel from the yesterday's sources crashes on boot.  
> 
> Tried .108 and it crashes the same with:

> [   1.0091954] trap type 6 code 0 eip 0xc0d3d8f8 cs 0x8 eflags 0x10246 cr2 
> 0x3c ilevel 0x7 esp 0x6
> [   1.0091954] curlwp 0xc1657840 pid 0 lid 0 lowest kstack 0xc192e2c0
> kernel: supervisor trap page fault, code=0
> Stopped in pid 0.0 (system) at  netbsd:hardclock+0x23:  movl3c(%esi),%eax
> db{0}> bt
> hardclock(0,0,da3eef6c,c04ac8f1,0,0,0,0,0,0) at netbsd:hardclock+0x23
> clockintr(0,0,0,0,0,0,0,0,c2d72000,c010322a) at netbsd:clockintr+0x2a
> intr_kdtrace_wrapper(c2f50680,c1930d9c,0,0,0,0,0,0,0,0) at 
> netbsd:intr_kdtrace_wrapper+0x21
> --- switch to interrupt stack ---

So the culprit is KDTRACE_HOOKS in sys/arch/x86/x86/intr.c

  revision 1.163
  date: 2022-10-29 16:59:04 +0300;  author: riastradh;  state: Exp;  lines: +38 -2;  commitid: w28zVvYhMCIOsCZD;
  x86: Add dtrace probes for interrupt handler entry and return.

The problem is that clockintr has magic calling convention that
intr_kdtrace_wrapper doesn't know about.  As a quick hack I changed
i8254_initclocks to pass a magic argument (that is ignored by
clockintr anyway) and told the hook code to ignore such handlers:

#ifdef KDTRACE_HOOKS
if (arg != (void *)0x8042c10c) { /* clockintr is magic */
ih->ih_fun = intr_kdtrace_wrapper;
ih->ih_arg = ih;
}
#endif

and that kernel doesn't crash.

It's *almost* fine, but I see a problem with com(4) that I suspect
is related to the recent commits by Nakahara-san:

  
  revision 1.382
  date: 2022-12-09 03:35:58 +0300;  author: knakahara;  state: Exp;  lines: +7 -29;  commitid: 9zcguFpBLJvxHO4E;
  Revert com.c:r1.381 because i386/qemu cannot boot.  Pointed out by gson@n.o and martin@n.o.
  
  revision 1.381
  date: 2022-12-08 12:08:49 +0300;  author: knakahara;  state: Exp;  lines: +29 -7;  commitid: 0xs100bYdUbwzJ4E;
  Fix hang up writing /dev/console rarely in specific environments.

  Some BMC seems to require these syncronous operations.  If not,
  it does not send transmit completion interrupts for some reason.

With KDTRACE_HOOKS enabled (modulo clockintr hack) and the serial
console (for debugging) I see the system stuck on console output when
rc runs.  It gets unstuck on a com interrupt (e.g. pressing a key).

Seems to work fine with KDTRACE_HOOKS disabled.

-uwe


symbol lookup in ddb - bad heuristic

2022-12-09 Thread Valery Ushakov
db_printsym has the following heuristic:

  revision 1.68
  date: 2021-12-13 04:25:29 +0300;  author: chs;  state: Exp;  lines: +16 -2;  commitid: MT9cIBmUIZU1AqkD;
  ddb: fix function names of "noreturn" functions in stack traces.

  when looking up function names for stack traces (where the addresses
  are the return addresses of function calls), if the address is the
  first instruction in the function, assume that the function being
  called is marked "noreturn" and that the function containing the
  call is actually the function immediately before the address that we
  looked up.  to find the correct function name, do the lookup again
  with (address - 1) and then add one to the offset within the
  function that we find.


That causes breakpoints on a function entry to be misreported:

  Breakpoint in pid 0.0 (system) at netbsd:sysbeepdetach+0x21: pushl %ebp
  ...
  db{0}> show break
   Map      Count    Address
  *0x0          1    netbsd:sysbeepdetach+0x21
  db{0}> x/i sysbeepdetach+0x21   
  netbsd:clockintr:   pushl   %ebp

Maybe the hack needs to be applied only with a new special flag, say,
KSYMS_RET?  Then we can define separate DB_STGY_PROC (no heuristic)
and DB_STGY_RET (with the heuristic).

The downside is that all MD db_stack_trace_print functions need to be
adjusted, but it actually makes sense to use both strategies there,
b/c when we are traversing an interrupt/exception frame, the
DB_STACK_PROC (without the heuristic) is the right thing to use, but
unwinding a call needs DB_STACK_RET (with the new flag).
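
To make the two strategies concrete, here is a toy model of the
lookup.  It is not the ddb code: the symbol table, the addresses and
the STGY_* names are invented for the example, and the table is just
arranged so that clockintr starts 0x21 bytes after sysbeepdetach, as
in the output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative names: "exact" lookup vs. "return address" lookup. */
enum lookup_strategy { STGY_PROC, STGY_RET };

struct sym {
	uintptr_t	start;
	const char	*name;
};

/* Toy symbol table sorted by address; the addresses are made up. */
const struct sym symtab[] = {
	{ 0x1000, "sysbeepdetach" },
	{ 0x1021, "clockintr" },
	{ 0x1080, "hardclock" },
};

/* Find the last symbol that starts at or below addr. */
const struct sym *
nearest_sym(uintptr_t addr)
{
	const struct sym *best = NULL;
	size_t i;

	for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (symtab[i].start <= addr)
			best = &symtab[i];
	return best;
}

/*
 * Look up addr and store the offset into the symbol in *offp.  With
 * STGY_RET an address that is exactly a function entry is attributed
 * to the preceding function (the "noreturn" heuristic); with
 * STGY_PROC it is taken at face value.
 */
const struct sym *
lookup(uintptr_t addr, enum lookup_strategy stgy, uintptr_t *offp)
{
	const struct sym *s = nearest_sym(addr);

	assert(offp != NULL);
	if (s == NULL)
		return NULL;
	if (stgy == STGY_RET && addr == s->start) {
		const struct sym *prev = nearest_sym(addr - 1);

		if (prev != NULL)
			s = prev;
	}
	*offp = addr - s->start;
	return s;
}
```

Looking up 0x1021 with STGY_PROC gives clockintr+0, which is what a
breakpoint or a trap frame wants; the same address with STGY_RET gives
sysbeepdetach+0x21, which only makes sense for a return address.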

Thoughts?

-uwe


i386: 9.99.108 traps booting on VirtualBox

2022-12-08 Thread Valery Ushakov
[reposting from current-users]

On Wed, Nov 30, 2022 at 13:05:52 +0300, Valery Ushakov wrote:

> I tried to upgrade a 32-bit VBox VM from 9.99.99 to .107 and the
> kernel from the yesterday's sources crashes on boot.  

Tried .108 and it crashes the same with:

> boot netbsd.new
21926532+587532+743668 [994880+103+13802]=0x182cf08
[   1.000] cpu_rng: rdrand/rdseed
[   1.000] entropy: ready
[   1.000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 
2004, 2005,
[   1.000] 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 
2016, 2017,
[   1.000] 2018, 2019, 2020, 2021, 2022
[   1.000] The NetBSD Foundation, Inc.  All rights reserved.
[   1.000] Copyright (c) 1982, 1986, 1989, 1991, 1993
[   1.000] The Regents of the University of California.  All rights 
reserved.
[   1.000] NetBSD 9.99.108 (GENERIC) #0: Fri Dec  9 01:23:00 MSK 2022
[   1.000]
uwe@majava:/home/uwe/work/netbsd/cvs/src/sys/arch/i386/compile/GENERIC
[   1.000] total memory = 1023 MB
[   1.000] avail memory = 980 MB
[   1.040] mainbus0 (root)
[   1.040] ACPI: RSDP 0x000E 24 (v02 VBOX  )
[   1.040] ACPI: XSDT 0x3FFF0030 34 (v01 VBOX   VBOXXSDT 
0001 ASL  0061)
[   1.040] ACPI: FACP 0x3FFF00F0 F4 (v04 VBOX   VBOXFACP 
0001 ASL  0061)
[   1.040] ACPI: DSDT 0x3FFF05B0 002353 (v02 VBOX   VBOXBIOS 
0002 INTL 20200925)
[   1.040] ACPI: FACS 0x3FFF0200 40
[   1.040] ACPI: SSDT 0x3FFF0240 00036C (v01 VBOX   VBOXCPUT 
0002 INTL 20200925)
[   1.040] ACPI: 2 ACPI AML tables successfully acquired and loaded
[   1.040] cpu0 at mainbus0
[   1.040] cpu0: Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz, id 0x306d4
[   1.040] cpu0: node 0, package 0, core 0, smt 0
[   1.040] acpi0 at mainbus0: Intel ACPICA 20220331
[   1.040] acpi0: fixed power button present
[   1.040] acpi0: fixed sleep button present
[   1.0091954] pckbc1 at acpi0 (PS2K, PNP0303) (kbd port): io 0x60,0x64 irq 1
[   1.0091954] pckbc2 at acpi0 (PS2M, PNP0F03) (aux port): irq 12
[   1.0091954] attimer1 at acpi0 (TIMR, PNP0100): io 0x40-0x43,0x50-0x53
[   1.0091954] SRL0 (PNP0501) at acpi0 not configured
[   1.0091954] acpivga0 at acpi0 (GFX0): ACPI Display Adapter
[   1.0091954] acpiout0 at acpivga0 (VGA, 0x0100): ACPI Display Output Device
[   1.0091954] acpibat0 at acpi0 (BAT0, PNP0C0A-0): ACPI Battery
[   1.0091954] acpiacad0 at acpi0 (AC, ACPI0003-0): ACPI AC Adapter
[   1.0091954] apm0 at acpi0: Power Management spec V1.2
[   1.0091954] ACPI: Enabled 2 GPEs in block 00 to 07
[   1.0091954] pckbd0 at pckbc1 (kbd slot)
[   1.0091954] pckbc1: using irq 1 for kbd slot
[   1.0091954] wskbd0 at pckbd0 mux 1
[   1.0091954] pms0 at pckbc1 (aux slot)
[   1.0091954] pckbc1: using irq 12 for aux slot
[   1.0091954] wsmouse0 at pms0 mux 0
[   1.0091954] pci0 at mainbus0 bus 0: configuration mode 1
[   1.0091954] pchb0 at pci0 dev 0 function 0: Intel 82441FX (PMC) PCI and 
Memory Controller (rev. 0x02)
[   1.0091954] pcib0 at pci0 dev 1 function 0: Intel 82371SB (PIIX3) PCI-ISA 
Bridge (rev. 0x00)
[   1.0091954] piixide0 at pci0 dev 1 function 1: Intel 82371AB IDE controller 
(PIIX4) (rev. 0x01)
[   1.0091954] piixide0: primary channel interrupting at irq 14
[   1.0091954] atabus0 at piixide0 channel 0
[   1.0091954] piixide0: secondary channel interrupting at irq 15
[   1.0091954] atabus1 at piixide0 channel 1
[   1.0091954] vga0 at pci0 dev 2 function 0: VirtualBox Graphics (rev. 0x00)
[   1.0091954] wsdisplay0 at vga0 kbdmux 1
[   1.0091954] drm at vga0 not configured
[   1.0091954] wm0 at pci0 dev 3 function 0: Intel i82540EM 1000BASE-T Ethernet 
(rev. 0x02)
[   1.0091954] wm0: interrupting at irq 9
[   1.0091954] wm0: Ethernet address 08:00:27:d2:84:ac
[   1.0091954] makphy0 at wm0 phy 1: Marvell 88E1011 Gigabit PHY, rev. 4
[   1.0091954] makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
[   1.0091954] VirtualBox Guest Service (miscellaneous system) at pci0 dev 4 
function 0 not configured
[   1.0091954] auich0 at pci0 dev 5 function 0: i82801AA (ICH) AC-97 Audio
[   1.0091954] auich0: interrupting at irq 11
[   1.0091954] auich0: ac97: SigmaTel STAC9700 codec; no 3D stereo
[   1.0091954] auich0: ac97: ext id 0x809
[   1.0091954] ohci0 at pci0 dev 6 function 0: Apple Computer Intrepid USB 
Controller (rev. 0x00)
[   1.0091954] ohci0: interrupting at irq 10
[   1.0091954] ohci0: OHCI version 1.0
[   1.0091954] usb0 at ohci0: USB revision 1.0
[   1.0091954] piixpm0 at pci0 dev 7 function 0: Intel 82371AB (PIIX4) Power 
Management Controller (rev. 0x08)
[   1.0091954] piixpm0: interrupting at irq 9
[   1.0091954] iic0 at piixpm0 port 0: I2C bus
[   1.0091954] wm1 at pci0 dev 8 function 0: Intel i82540EM 1000BASE-T Ethernet 
(rev. 0x02)
[   1.0091954] wm1: interrupting at irq 11
[   1.0091954] wm1: Ethernet address 08:00:27:95:0b:c1
[   1.0091954

Re: Limiting malloc to the low 2GB?

2022-11-28 Thread Valery Ushakov
On Mon, Nov 28, 2022 at 23:22:08 +, RVP wrote:

> On Tue, 29 Nov 2022, Valery Ushakov wrote:
> 
> > Turns out you can use MALLOC_CONF="dss:primary" to make (the new)
> > jemalloc prefer sbrk(2).
> 
> Yes, I saw that, but, I wasn't sure if that setting meant a) always
> use sbrk() or b) prefer sbrk(), then, fall-back to mmap() (typically
> for very large allocations).

Me too :)  But that works for now.  Will need to RTFS.

-uwe


Re: Limiting malloc to the low 2GB?

2022-11-28 Thread Valery Ushakov
On Mon, Nov 28, 2022 at 21:45:40 +, RVP wrote:

> On Mon, 28 Nov 2022, Valery Ushakov wrote:
> 
> > Do we have a way to tell malloc on a 32-bit system to allocate memory
> > only below the 2GB boundary (on i386, including when run under amd64)?
> > I'm trying to port a(n old) program that wants to use the sign bit for
> > its internal purposes.  I guess one option would be to prevent malloc
> > from using mmap (and disable ASLR?) so that only sbrk (in the low 2GB)
> > is used.
> 
> The standard jemalloc in the system has a compile-time flag to do this
> `--with-lg-vaddr=31'. No run-time setting possible from what I can see.
> Or, you could compile the program against the old `src/lib/libbsdmalloc'
> which only uses sbrk().

Turns out you can use MALLOC_CONF="dss:primary" to make (the new)
jemalloc prefer sbrk(2).

The man page documents that you can also use

  const char *malloc_conf = "...";

in your program, but the variable you actually have to use is
__je_malloc_conf.

There is __weak_alias(malloc_conf, __je_malloc_conf) but that doesn't
work across DSO boundaries, I guess.
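
A minimal sketch of the in-program variant, assuming (as found above)
that __je_malloc_conf is the name our libc actually consults; on other
systems it is just an unused global.  The pointer tagging helpers are
hypothetical and only illustrate the kind of sign-bit use that needs
allocations below 2GB.  Since I'm not sure "dss:primary" never falls
back to mmap, the sketch asserts instead of trusting it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed name, per the note above; plain malloc_conf did not work. */
const char *__je_malloc_conf = "dss:primary";

/* True if the address fits in 31 bits, i.e. the sign bit is free. */
int
below_2gb(const void *p)
{
	return ((uintptr_t)p >> 31) == 0;
}

/* Pack a pointer and a one-bit flag into a 32-bit word (hypothetical). */
uint32_t
tag_pointer(const void *p, int flag)
{
	assert(below_2gb(p));	/* catch allocations that landed too high */
	return (uint32_t)(uintptr_t)p | (flag ? UINT32_C(0x80000000) : 0);
}

/* Recover the pointer part of a tagged word. */
void *
untag_pointer(uint32_t w)
{
	return (void *)(uintptr_t)(w & UINT32_C(0x7fffffff));
}

/* Recover the flag stored in the sign bit. */
int
pointer_flag(uint32_t w)
{
	return (w >> 31) != 0;
}
```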

-uwe


Limiting malloc to the low 2GB?

2022-11-28 Thread Valery Ushakov
Do we have a way to tell malloc on a 32-bit system to allocate memory
only below the 2GB boundary (on i386, including when run under amd64)?
I'm trying to port a(n old) program that wants to use the sign bit for
its internal purposes.  I guess one option would be to prevent malloc
from using mmap (and disable ASLR?) so that only sbrk (in the low 2GB)
is used.

Suggestions are appreciated.

-uwe


Re: Module autounload proposal: opt-in, not opt-out

2022-08-08 Thread Valery Ushakov
On Sun, Aug 07, 2022 at 23:08:47 +, Taylor R Campbell wrote:

> Currently there are many types of modules that are autoloaded from
> open-ended patterns:

Our current auto-load policy is a bit too enthusiastic, IMHO.  E.g. an
"unknown" ioctl used to (probably still does) trigger autoload of
compat module, so each time you run vi you have compat module loaded.
On one hand it might seem convenient, but on the other hand we are
getting into the "WAT?!" territory, like those dynamically typed
languages that go out of their way to interpret your program in *some*
way, using most vexing coercions, as if failure is not an option.
IMO, it most certainly is an option and often the most sensible one
too.

Auto-unload seems to me like a kludge to compensate for the blind
"let's try and see if this helps" auto-load policy.

-uwe


Re: userconf question

2022-08-05 Thread Valery Ushakov
On Fri, Aug 05, 2022 at 07:35:17 +, Emmanuel Dreyfus wrote:

> menu=Boot normally:rndseed /var/db/entropy-file;boot
> menu=Drop to boot prompt:prompt
> default=1
> timeout=3
> userconf=disable atabus0
> clear=1
> 
> But atabus0 is still configured, and wd0 still breaks the boot with
> timeouts. What is wrong with the above userconf syntax?

I have very vague memories of userconf, but does your kernel config
have explicit atabus0 or catch-all atabus*?  IIRC, userconf does not
operate on devices (e.g. atabus0 as instantiated from atabus*), but on
the configured attachments (i.e. you can only disable atabus0 if you
have that line in the config).

-uwe


Re: Proposal: Deprecate (or rename) extsrc/

2022-01-06 Thread Valery Ushakov
On Fri, Jan 07, 2022 at 12:47:53 +1100, Luke Mewburn wrote:

> The "extsrc/" tree was added in late 2009.
> Nothing in tree uses "extsrc/"; it's a placeholder for third-party
> vendors to hook into the build for their own extensions.
> There's no reason a vendor can't just integrate into the build
> with local changes - that's what Wasabi Systems did in the early 2000s.
> 
> (Also, if I recall correctly, "extsrc/" was added without much consultation).
> 
> If some people require the "extsrc/" functionality in tree, then I propose:
> 1) A good case to retain the functionality should be made by them.
> 2) A better name than "extsrc/" should be chosen, that's not
>going to cause completion rage. Maybe "3rdparty"?

Yeah, it's kinda there "for the furniture" only.

The build.sh usage still claims the default is /usr/extsrc and the
makefile do-extsrc target uses hardcoded "extsrc" instead of
EXTSRCSRCDIR (sic).

Also it's built separately, so one can't easily use it to add say a
library that some other 3rd party tool already in src can use, etc.

-uwe


Re: wsvt25 backspace key should match terminfo definition

2021-11-23 Thread Valery Ushakov
On Tue, Nov 23, 2021 at 18:37:19 -0500, Greg Troxel wrote:

> Valery Ushakov  writes:
> 
> > vt52 is different.  I never used a real vt52 or a clone, but the
> > manual at vt100.net gives the following picture:
> >
> >   https://vt100.net/docs/vt52-mm/figure3-1.html
> >
> > and the description
> >
> >   https://vt100.net/docs/vt52-mm/chapter3.html#S3.1.2.3
> >
> >   Key          Code  Action Taken if Codes Are Echoed
> >   BACK SPACE   010   Backspace (Cursor Left) function
> >   DELETE       177   Nothing
> 
> That is explaining what the terminal does when those codes are sent by
> the computer.  That is a different thing from how the computer
> interprets user input.

No. Or rather not only.  Please, read the sentence before that table.
The "code" column is the code that the terminal transmits when the key
is pressed:

  Table 3-4 lists the function keys, the code they transmit to the
  host, and the terminal action taken if the code is echoed back to
  the terminal.


> When using a VT52 on Seventh Edition, for example one pushed DEL to
> remove the previous character, and the computer would send
> "" to make it disappear and leave the cursor left.  One
> basically never pushed BS.

It dawned on me that the terminals I used on the pdp-11 clone were
(not surprisingly) vt clones and managed to find a picture of the
keyboard, which jogged my memory:

  http://www.leningrad.su/museum/show_big.php?n=1539

so yeah, you would use DEL key on those to correct your typing
mistakes.


> > But vt200 and later use a different keyboard, lk201 (and I did use a
> > real vt220 a lot)
> >
> >   https://vt100.net/docs/vt220-rm/figure3-1.html
> >
> > that picture is not very good, the one from the vt320 manual is better
> >
> >   https://vt100.net/docs/vt320-uu/chapter3.html
> >
> > vt220 does NOT have a configuration option that selects the code that
> > the  
> So that is the "DEL" key, not the BS key.

See, this is exactly why I said "

Re: wsvt25 backspace key should match terminfo definition

2021-11-23 Thread Valery Ushakov
On Tue, Nov 23, 2021 at 19:23:30 +0100, Johnny Billquist wrote:

> > But somehow the official terminfo database has kbs=^H for vt220!
> 
> Which is wrong.

Exactly.

> >kbs=^?,
> 
> Which I think it should be.

Amen!  (unironically)

:)

-uwe


Re: wsvt25 backspace key should match terminfo definition

2021-11-23 Thread Valery Ushakov
On Tue, Nov 23, 2021 at 09:22:43 -0500, Greg Troxel wrote:

> Valery Ushakov  writes:
> 
> > On Tue, Nov 23, 2021 at 00:01:40 +, RVP wrote:
> >
> >> On Tue, 23 Nov 2021, Johnny Billquist wrote:
> >> 
> >> > If something pretends to be a VT220, then the key that deletes
> >> > characters to the left should send DEL, not BS...
> >> > Just saying...
> >> 
> >> That's fine with me too. As long as things are consistent. I suggested the
> >> kernel change because both terminfo definitions (and the FreeBSD console)
> >> go for ^H.
> >
> > Note that the pckbd_keydesc_us keymap maps the scancode of the <- key to
> >
> > KC(14),  KS_Cmd_ResetEmul, KS_Delete,
> >
> > i.e. 0x7f (^?).
> >
> > terminfo is obviously incorrect here.  Amazingly, the bug is actually
> > in vt220 description!  wsvt25 just inherits from it:
> >
> > $ infocmp -1 vt220 | grep kbs
> > kbs=^H,
> >
> > I checked termcap.src from netbsd-4 and it's wrong there too.  I have
> > no idea htf that could have happened.
> 
> I think (memory is getting fuzzy) the problem is that the old terminals
> had a delete key, in the upper right, that users use to remove the
> previous character, and a BS key, upper left, that was actually a
> carriage control character.
[... snip ...]
> I see the same kbs=^H on vt52.

vt52 is different.  I never used a real vt52 or a clone, but the
manual at vt100.net gives the following picture:

  https://vt100.net/docs/vt52-mm/figure3-1.html

and the description

  https://vt100.net/docs/vt52-mm/chapter3.html#S3.1.2.3

  Key          Code  Action Taken if Codes Are Echoed
  BACK SPACE   010   Backspace (Cursor Left) function
  DELETE       177   Nothing


vt100 had a similar keyboard (again, never used a real one personally)

  https://vt100.net/docs/vt100-ug/chapter3.html#F3-2

  BACKSPACE    010   Backspace function
  DELETE       177   Ignored by the VT100


But vt200 and later use a different keyboard, lk201 (and I did use a
real vt220 a lot)

  https://vt100.net/docs/vt220-rm/figure3-1.html

that picture is not very good, the one from the vt320 manual is better

  https://vt100.net/docs/vt320-uu/chapter3.html

vt220 does NOT have a configuration option that selects the code that
the https://vt100.net/docs/vt320-uu/chapter4.html#S4.13

For vt320 (where it *is* configurable) terminfo has

  $ infocmp -1 vt320 | grep kbs
  kbs=^?,


> I think the first thing to answer is "what is kbs in terminfo supposed
> to mean".

X/Open Curses, Issue 7 doesn't explain, other than saying "backspace"
key, which is an unfortunate name, as it's loaded.  But it's
sufficiently clear from the context that it's the key that deletes
backwards, i.e.


> My other question is how kbs is used from terminfo.  Is it about
> generating output sequences to move the active cursor one left?  If so,
> it's right.  Is it about "what should the user type to delete left",
> then for a vt52/vt220, that's wrong.  If it is supposed to be both,
> that's an architectural bug as those aren't the same thing.

No, k* capabilities are sequences generated by the terminal when some
key is pressed.  The capability for the sequence sent to the
terminal to move the cursor left one position is cub1

  $ infocmp -1 vt220 | grep cub1
  cub1=^H,
  kcub1=\E[D,

(kcub1 is the sequence generated by the left arrow _k_ey).


-uwe


Re: wsvt25 backspace key should match terminfo definition

2021-11-22 Thread Valery Ushakov
On Tue, Nov 23, 2021 at 00:01:40 +, RVP wrote:

> On Tue, 23 Nov 2021, Johnny Billquist wrote:
> 
> > If something pretends to be a VT220, then the key that deletes
> > characters to the left should send DEL, not BS...
> > Just saying...
> 
> That's fine with me too. As long as things are consistent. I suggested the
> kernel change because both terminfo definitions (and the FreeBSD console)
> go for ^H.

Note that the pckbd_keydesc_us keymap maps the scancode of the <- key to

KC(14),  KS_Cmd_ResetEmul, KS_Delete,

i.e. 0x7f (^?).
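
In case the caret notation is not familiar: ^X denotes the character X
with bit 6 flipped, so ^H is 0x08 (octal 010, BS) and ^? is 0x7f
(octal 177, DEL).  A tiny helper, with a made-up name, that spells
this out:

```c
#include <assert.h>

/*
 * Byte denoted by the caret notation ^c.  Valid for '?', '@', 'A'..'Z'
 * and '[' .. '_', which are contiguous in ASCII (0x3f..0x5f).
 */
unsigned char
caret_to_byte(char c)
{
	assert(c >= '?' && c <= '_');
	return (unsigned char)((unsigned char)c ^ 0x40);
}
```

So kbs=^H claims that the key sends 0x08, while the keymap above makes
it send 0x7f, which would be kbs=^?.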

terminfo is obviously incorrect here.  Amazingly, the bug is actually
in vt220 description!  wsvt25 just inherits from it:

$ infocmp -1 vt220 | grep kbs
kbs=^H,

I checked termcap.src from netbsd-4 and it's wrong there too.  I have
no idea htf that could have happened.

-uwe


Re: Request for implementation of KERN_PROC_SIGTRAMP sysctl

2021-10-28 Thread Valery Ushakov
On Wed, Oct 27, 2021 at 20:59:12 -0700, Jason Thorpe wrote:

> > On Oct 27, 2021, at 4:01 PM, Jason Thorpe  wrote:
> > 
> > 
> >> On Oct 27, 2021, at 3:44 PM, Valery Ushakov  wrote:
> >> 
> >> On Wed, Oct 27, 2021 at 07:50:55 -0700, Jason Thorpe wrote:
> >> 
> >> I was wondering if it might be easier to not put the onus onto the
> >> caller and instead have a function that returns the interrupted
> >> ucontext (or NULL, if the pc is not in a trampoline).
> >> 
> >> ucontext_t *__unwind_sigtramp(return_pc, return_sp)
> > 
> > That would certainly be a nicer API.
> 
> Thought about it a little more.
> 
> To make this really work, we'd definitely have to version
> sigaction() so that it fully de-supported sigcontext handlers.
> Otherwise, it's a toss-up whether you have a sigcontext or a
> ucontext on the stack.

It is ucontext for the siginfo trampoline and sigcontext for the older
one, isn't it?


> I also see some value in the basic check (which you need to have to
> __sigtramp_unwind() anyway?).

You need it internally, obviously, but the only useful thing you can do
with it is get the interrupted context that the trampoline would
restore.  Or am I missing something here (which is entirely possible, as
I'm writing all these remarks from fading memories of implementing the
trampolines for sh3)?

-uwe


Re: Request for implementation of KERN_PROC_SIGTRAMP sysctl

2021-10-27 Thread Valery Ushakov
On Wed, Oct 27, 2021 at 07:50:55 -0700, Jason Thorpe wrote:

> > On Oct 18, 2021, at 9:41 AM, John Marino (NetBSD)  wrote:
> > 
> > yes, it sounds like a __in_signal_trampoline function would work for
> > the GCC unwind, and I would think it would work for GDB as well.
> 
> Ok, I have implemented a new function with this signature:
> 
> /*
>  * __sigtramp_check_np --
>  *
>  *  Non-portable function that checks if the specified program
>  *  counter value is within the signal return trampoline.  Returns
>  *  the trampoline version number corresponding to what style of
>  *  trampoline it matches, or -1 if the program value is not within
>  *  the signal return trampoline.
>  */
> int __sigtramp_check_np(void *pc);
> 
> Usage would be like:
[... lots of code ...]

I was wondering if it might be easier to not put the onus onto the
caller and instead have a function that returns the interrupted
ucontext (or NULL, if the pc is not in a trampoline).

ucontext_t *__unwind_sigtramp(return_pc, return_sp)
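
Very roughly, and purely as an illustration of the shape of such a
function (nothing like this exists; the descriptor struct, the name
and the offsets are invented, and it returns void * where the real
thing would return ucontext_t *):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Where the trampoline code lives and where the saved context sits
 * relative to the stack pointer.  Both are MD knowledge that libc
 * already has and the caller would no longer need.
 */
struct tramp_desc {
	uintptr_t	start;		/* first byte of the trampoline */
	uintptr_t	end;		/* one past the last byte */
	ptrdiff_t	ctx_offset;	/* context location relative to sp */
};

/*
 * If return_pc is inside the trampoline, return the address of the
 * interrupted context; otherwise return NULL.
 */
void *
unwind_sigtramp_sketch(const struct tramp_desc *td, uintptr_t return_pc,
    uintptr_t return_sp)
{
	assert(td != NULL && td->start < td->end);
	if (return_pc < td->start || return_pc >= td->end)
		return NULL;
	return (void *)(return_sp + (uintptr_t)td->ctx_offset);
}
```

The unwinder then only has to test the result for NULL and continue
from the registers in the returned context.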


-uwe


Re: Request for implementation of KERN_PROC_SIGTRAMP sysctl

2021-10-18 Thread Valery Ushakov
On Mon, Oct 18, 2021 at 10:41:48 -0500, John Marino (NetBSD) wrote:

> How we did it with libc before is shown in the netbsd-unwind.h link in
> the original post.  This technique looks for __sigtramp_siginfo_2
> assembly code but no longer works.  I don't know how to do this any
> other way.  GDB doesn't either, it uses the debug information to match
> the function name __sigtramp_siginfo_2 and I am not even sure that's
> valid for current NetBSD releases based on what we've learned here.

Didn't kamil@ fix this a while back?  E.g. for amd64:

revision 1.8
date: 2020-10-12 20:55:54 +0300;  author: kamil;  state: Exp;  lines: +29 -3;  commitid: sz57gQtWi3mGKDrC;
Decorate the x86_64 signal trampoline with CFI attributes easing unwinding

Combine the approach provided by Nikhil Benesch and Andrew Cagney.

Now, the unwinders (in gccgo, backtrace(3), etc) can unwind properly
the stack from a signal handler.

Fixes lib/55719 by Nikhil Benesch

-uwe


Re: Request for implementation of KERN_PROC_SIGTRAMP sysctl

2021-10-15 Thread Valery Ushakov
On Fri, Oct 15, 2021 at 23:14:39 +0300, Valery Ushakov wrote:

> On Fri, Oct 15, 2021 at 14:44:16 -0500, John Marino (NetBSD) wrote:
> 
> > Is it possible for NetBSD to implement KERN_PROC_SIGTRAMP sysctl?
> 
> It's been ages since I touched this area, but don't we have
> per-sigaction trampolines?  I mean, in practice they all use the same
> __sigtramp_siginfo_$version trampoline, that sigaction passes to the
> actual syscall, but in principle the process can have different
> trampolines for different signals, can't it?
> 
> struct sys___sigaction_sigtramp_args {
>   syscallarg(int) signum;
>   syscallarg(const struct sigaction *) nsa;
>   syscallarg(struct sigaction *) osa;
>   syscallarg(const void *) tramp; // <-
>   syscallarg(int) vers;
> };

PS: We used to have a trampoline that the kernel copied out into the
process address space (bottom of the stack, iirc) - and that would be
something for KERN_PROC_SIGTRAMP to return indeed.  But that was like
before netbsd 2.0, iirc.

-uwe


Re: Request for implementation of KERN_PROC_SIGTRAMP sysctl

2021-10-15 Thread Valery Ushakov
On Fri, Oct 15, 2021 at 14:44:16 -0500, John Marino (NetBSD) wrote:

> Is it possible for NetBSD to implement KERN_PROC_SIGTRAMP sysctl?
> 
> TLDR;
> For several years, the GNAT Ada compiler has not been able to unwind a
> stack containing a signal trampoline.  The unwinder I wrote for gcc
> several years ago just stopped working on newer NetBSD release even
> though the signal trampoline code itself did not change.  FreeBSD and
> DragonFly BSD are immune to sigtramp location changes because they've
> introduced the KERN_PROC_SIGTRAMP sysctl which provides the location
> of the signal tramp of the process.

It's been ages since I touched this area, but don't we have
per-sigaction trampolines?  I mean, in practice they all use the same
__sigtramp_siginfo_$version trampoline, that sigaction passes to the
actual syscall, but in principle the process can have different
trampolines for different signals, can't it?

struct sys___sigaction_sigtramp_args {
syscallarg(int) signum;
syscallarg(const struct sigaction *) nsa;
syscallarg(struct sigaction *) osa;
syscallarg(const void *) tramp; // <-
syscallarg(int) vers;
};


-uwe


Re: Level for Unix-domain socket options

2021-08-06 Thread Valery Ushakov
On Thu, Aug 05, 2021 at 22:55:12 +0200, Rhialto wrote:

> On Thu 05 Aug 2021 at 13:22:55 +, nia wrote:
> > The unix(4) man page incorrectly states:
> > 
> > "A UNIX-domain socket supports two socket-level options for use with
> > setsockopt(2) and getsockopt(2): [...]"
> > 
> > In reality, the protocol level when using these socket options
> > must be 0, which is a magic number not really documented anywhere
> > except the test suite.
> 
> and getsockopt(2) says
> 
> DESCRIPTION
>  getsockopt(), setsockopt() and getsockopt2() manipulate the options
>  associated with a socket.  Options may exist at multiple protocol levels;
>  they are always present at the uppermost "socket" level.
> 
> which I interpret to mean that even if you use SOL_SOCKET for these
> options, it should work. Do I read that as intended?

Was that perhaps an artifact of an old implementation?

POSIX says

  The getsockopt() function shall fail if:

  [EINVAL]
  The specified option is invalid at the specified socket level.
  [ENOPROTOOPT]
  The option is not supported by the protocol.

while our man page only has

 [ENOPROTOOPT]  The option is unknown at the level indicated.

which might actually be problematic, but I will leave the exegetic
exercise to someone more skilled.
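
For anyone who wants to experiment, a small probe along these lines
reports what a given system does for a given level/option pair.  The
helper names are made up; only SO_TYPE at SOL_SOCKET is exercised
here, since that one is uncontroversial, and no claim is made about
which errno the Unix-domain options produce at which level.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Query optname at level; return 0 on success or the errno on failure. */
int
probe_sockopt(int fd, int level, int optname, int *valp)
{
	socklen_t len = sizeof(*valp);

	assert(valp != NULL);
	if (getsockopt(fd, level, optname, valp, &len) == -1)
		return errno;
	return 0;
}

/*
 * Create a Unix-domain socket pair and read SO_TYPE at the socket
 * level.  Returns the socket type, or -1 if anything failed.
 */
int
unix_socket_type(void)
{
	int sv[2], type = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if (probe_sockopt(sv[0], SOL_SOCKET, SO_TYPE, &type) != 0)
		type = -1;
	close(sv[0]);
	close(sv[1]);
	return type;
}
```

Calling probe_sockopt() on the same descriptor with level 0 and with
SOL_SOCKET for the options in question shows directly whether the
system answers 0, EINVAL or ENOPROTOOPT.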


-uwe


Re: protect pmf from network drivers that don't provide if_stop

2021-07-01 Thread Valery Ushakov
On Thu, Jul 01, 2021 at 06:47:08 -0300, Jared McNeill wrote:

> Not really a fan of this as it doesn't protect other potential if_stop users
> (and "temporary fix" rarely is..). How about something like this instead?

I agree.  If for whatever reason we really insist on if-specific stop,
then just use a different stub that would complain about "pure
virtual" method called, kassert, or whatever.
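
Something like the following, as a sketch only: the struct is a
cut-down stand-in for struct ifnet, the function names are invented,
and in the kernel the stub would printf or KASSERT rather than write
to stderr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Cut-down stand-in for struct ifnet. */
struct ifnet_sketch {
	const char	*if_xname;
	void		(*if_stop)(struct ifnet_sketch *, int);
	int		if_stop_missing;	/* set when the stub is hit */
};

/* Loud default: the driver never supplied its own if_stop. */
void
if_stop_unimplemented(struct ifnet_sketch *ifp, int disable)
{
	(void)disable;
	ifp->if_stop_missing = 1;
	fprintf(stderr, "%s: if_stop called but not provided by driver\n",
	    ifp->if_xname);
}

/* Registration fills in the complaining stub instead of a silent no-op. */
void
if_register_sketch(struct ifnet_sketch *ifp)
{
	assert(ifp != NULL && ifp->if_xname != NULL);
	if (ifp->if_stop == NULL)
		ifp->if_stop = if_stop_unimplemented;
}
```

Callers stay safe from the NULL dereference either way; the difference
from a silent null stub is that the missing method gets noticed.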


> --- sys/net/if.c  29 Jun 2021 21:19:58 -  1.486
> +++ sys/net/if.c  1 Jul 2021 09:46:10 -
> @@ -761,11 +761,13 @@ void
>  if_register(ifnet_t *ifp)
>  {
>   /*
> -  * If the driver has not supplied its own if_ioctl, then
> -  * supply the default.
> +  * If the driver has not supplied its own if_ioctl or if_stop,
> +  * then supply the default.
>*/
>   if (ifp->if_ioctl == NULL)
>   ifp->if_ioctl = ifioctl_common;
> + if (ifp->if_stop == NULL)
> + ifp->if_stop = if_nullstop;
> 
>   sysctl_sndq_setup(&ifp->if_sysctl_log, ifp->if_xname, &ifp->if_snd);

-uwe


Re: Inconsistencies in usage of "locators" argument to config (*ca_rescan)() functions

2021-03-26 Thread Valery Ushakov
On Fri, Mar 26, 2021 at 13:18:16 -0700, Jason Thorpe wrote:

> I think it may have been the terminology used by Chris Torek in his
>  paper on the new 4.4BSD device auto configuration framework [...].
>  Sadly, that paper is somewhat hard to find, and I don't know if it
>  was ever actually published anywhere.

This one? :)

http://www.netbsd.org/docs/kernel/config-torek.ps

-uwe


Re: checking for a closed socket

2021-02-02 Thread Valery Ushakov
On Tue, Feb 02, 2021 at 19:20:22 +0100, Manuel Bouyer wrote:

> I've been debugging an issue with Xen, where xenstored loops at 100%
> CPU on poll(2).
> After code analysis it's looping on closed Unix socket descriptors.
> From what I understood the code expects poll(2) to return something
> different from POLLIN when the remote end of the socket is
> closed (it checks for (~(POLLOUT|POLLIN)) so it could be either
> POLLERR or POLLHUP I guess - or eventually POLLRDHUP which we don't have).
> 
> Who is right here, linux or NetBSD (linux claims to be posix, while
> our man page doesn't mention it) ?
> 
> Is there a way to check if a connection has been closed without a read() ?

You have to be careful what you read into "claim to be posix",
especially when connection creation and termination are concerned.
Termination is extra fun because there are half-closed sockets.

My experience is that the only thing you can rely on is that if
POLLFOO is reported for an fd then the "foo" action on that fd will
not block - which is, essentially, poll's principal raison d'etre.
The details can vary wildly from system to system, so you might need
some strategic planning and experimentation.

I don't have all my relevant notes handy, but as an example, consider
a failed connect(2) that you poll for POLLOUT (posix: "A file
descriptor for a socket that is connecting asynchronously shall
indicate that it is ready for writing, once a connection has been
established.").  On failed connect(2) you will get:

- NetBSD, Solaris: POLLOUT
- Linux: POLLERR | POLLHUP | POLLOUT
- MacOS: POLLHUP
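
Given that spread, one way to avoid depending on the exact revents is
to wait for any wakeup and then ask the socket itself via SO_ERROR.
A minimal sketch, assuming a non-blocking connect(2) has already been
started on fd; the helper name async_connect_status is made up for
this example:

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: returns 0 if the connection was established,
 * otherwise an errno value (ETIMEDOUT if poll timed out).
 */
int
async_connect_status(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };
	int soerr = 0;
	socklen_t len = sizeof(soerr);
	int n;

	assert(fd >= 0);

	do {
		n = poll(&pfd, 1, timeout_ms);
	} while (n == -1 && errno == EINTR);
	if (n == -1)
		return errno;
	if (n == 0)
		return ETIMEDOUT;

	/*
	 * Do not interpret revents here: POLLOUT, POLLERR and POLLHUP
	 * all just mean "something happened".  SO_ERROR holds the
	 * pending error of the connect attempt, or 0 on success.
	 */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &soerr, &len) == -1)
		return errno;
	return soerr;
}
```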

POLLHUP on "close" is even more fun because of half-closed
connections.  NetBSD and Solaris never report POLLHUP for sockets,
MacOS reports POLLHUP when remote closes, Linux reports POLLHUP when
both directions are closed.  Note that getting POLLHUP doesn't mean
that you can immediately "give up" on that socket, you still have to
read it b/c there may still be unread data.  E.g. consider sending a
request, half-closing your side, getting a reply from the server that
ends up in the kernel's socket buffer followed by the server
half-closing its end and thus completely closing the connection.  At
this point you haven't read anything yet in the application, but you
will get POLLHUP (and POLLIN for the data, iirc).  So that POLLHUP is
not really telling you much.
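
In code this boils down to: whatever poll(2) reports, let read(2)
decide what actually happened.  A rough sketch (poll_then_read is a
name invented for this example, not an existing API):

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until fd is readable or in some
 * hangup/error state, then read.  Returns bytes read, 0 on EOF,
 * -1 on error or timeout.
 */
ssize_t
poll_then_read(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int n;

	assert(fd >= 0 && buf != NULL);

	do {
		n = poll(&pfd, 1, timeout_ms);
	} while (n == -1 && errno == EINTR);
	if (n <= 0)
		return -1;	/* poll error or timeout */
	if (pfd.revents & POLLNVAL)
		return -1;	/* fd is not open */

	/*
	 * POLLIN, POLLHUP or POLLERR: in each case read() will not
	 * block.  It returns buffered data first, then 0 (EOF) or -1.
	 */
	return read(fd, buf, len);
}
```

The caller keeps calling it until it returns 0 or -1, so data that
arrived before the hangup is never dropped.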

All of the above is strictly "IIRC" and might have changed since the
last time I checked.

To reiterate, my point is that 1) you can assume very little about
specific events reported for boundary conditions - different systems
report them differently; 2) you have to remember that the main promise
of the poll(2) is that the corresponding operation will not block.

PS: Sorry if that was a bit on the rambling side.

-uwe


enet(4) problem? (Was: NFS client performance problems)

2020-12-30 Thread Valery Ushakov
TL;DR: looks like a problem in enet(4)

On Fri, Dec 25, 2020 at 02:20:13 +0300, Valery Ushakov wrote:

> I've stumbled into a weird performance problem with NFS client.  I
> have CompuLab's Utilite Pro (evbarm) machine running a very current
> -current.  It's connected to my Ubuntu laptop with Ethernet (direct
> cable link) and uses it as an NFS server to get e.g. pkgsrc tree and
> distfiles.  The performance is really bad, e.g. make extract of a
> package may take literal ages (I did a lot of the initial
> investigation for this mail while python distfile was being
> extracted).  Extracting 31M uncompressed bash distfile may take, from
> run to run:
> 
> real    2m21.110s
> user    0m0.635s
> sys     0m4.233s
> 
> or
> 
> real    4m52.010s
> user    0m0.769s
> sys     0m4.815s
> 
> or whatever.
> 
> Looking at the traffic with wireshark I can see a curious recurring
> pattern.  Looking at the time/sequence plot one immediately sees short
> bursts of activity separated by huge gaps of no traffic.
> 
> Here's one such instance abridged and wrapped for clarity.  Timestamps
> are shown as delta to the previous frame.  Window scale is 3 for the
> client (utilite) and 7 for the server (majava), i.e. the server has
> plenty of window open.
> 
> > 413 00.000351 IP utilite.1011 > majava.nfs: Flags [.],
>   seq 177121:178569, ack 79601, win 3225,
>   options [nop,nop,TS val 111 ecr 1941833655],
>   length 1448:
>   NFS request xid 1992059772 1444 write fh ... 5406 (5406) bytes
>   @ 16384
> > 414 00.48 IP utilite.1011 > majava.nfs: Flags [.],
>   seq 178569:180017, ack 79601, win 3225,
>   options [nop,nop,TS val 111 ecr 1941833655],
>   length 1448
>   415 00.09 IP majava.nfs > utilite.1011: Flags [.],
>   ack 180017, win 1834,
>   options [nop,nop,TS val 1941833656 ecr 111],
>   length 0
> > 416 00.51 IP utilite.1011 > majava.nfs: Flags [.],
>   seq 180017:181465, ack 79601, win 3225,
>   options [nop,nop,TS val 111 ecr 1941833655],
>   length 1448
>   417 00.043745 IP majava.nfs > utilite.1011: Flags [.],
>   ack 181465, win 1834,
>   options [nop,nop,TS val 1941833700 ecr 111],
>   length 0
> > 418 00.994813 IP utilite.1011 > majava.nfs: Flags [P.],
>   seq 181465:182645, ack 79601, win 3225,
>   options [nop,nop,TS val 111 ecr 1941833655],
>   length 1180
>   419 00.32 IP majava.nfs > utilite.1011: Flags [.],
>   ack 182645, win 1834,
>   options [nop,nop,TS val 1941834694 ecr 111],
>   length 0
> ! 420 00.07 IP utilite.1011 > majava.nfs: Flags [P.],
>   seq 181465:182645, ack 79601, win 3225,
>   options [nop,nop,TS val 113 ecr 1941833700],
>   length 1180
>   421 00.09 IP majava.nfs > utilite.1011: Flags [.],
>   ack 182645, win 1834,
>   options [nop,nop,TS val 1941834694 ecr 113,nop,nop,
>sack 1 {181465:182645}],
>   length 0
> 
> Here frames 413, 414, 416 and 418 comprise a single NFS write request.
> All but the last segment are sent very fast.  Then note that the
> last segment (418) is sent after a 1 second delay, and then
> immediately resent (420, marked as "spurious retransmission" by
> wireshark).
> 
> This pattern repeats through the trace.  From time to time a large
> write has its last segment delayed by 1 second, then there's an ACK
> from the server and then that last segment is immediately "spuriously"
> resent.
> 
> Does this ring any bells?  Either from the TCP point of view or from
> what NFS might be doing here that might trigger that.
> 
> Just copying a large file to the server seems to be ok, the
> time/sequence plot is nice and linear.

Looking at the same traffic from the client I see a different picture
that complements the server side story.
 
The NFS write request is sent in one batch of N chunks.  The server
acks chunks up to N-1, but not the last one.  After one second the
client retransmits the last chunk of the batch and gets two acks on
it.

So what seems to be happening is that enet(4) does not actually
transmit the packet with the last chunk of data (418) along with its
siblings (413, 414, 416).  The server does not see it and so does not
ack it.  Eventually the rexmit timer kicks in and the client's TCP
resends the last chunk in a new packet (420).  At this point the old
last packet (418) gets unstuck and so both the original last packet
and the new copy are sent.

So this seems to be some kind of enet(4) bug.

mlelstv@ pointed out that TXDESC_WRITEOUT and RXDESC_WRITEOUT use
PREWRITE, not POSTWRITE, which seems suspicious.  Unfortunately
changing them to POSTWRITE doesn't seem to help.  Since this problem
doesn't happen with all long WRITEs there must be something else at
play here too.  If I had to guess - ring buffer wraparound, maybe?


-uwe


NFS client performance problems

2020-12-24 Thread Valery Ushakov
I've stumbled into a weird performance problem with NFS client.  I
have CompuLab's Utilite Pro (evbarm) machine running a very current
-current.  It's connected to my Ubuntu laptop with Ethernet (direct
cable link) and uses it as an NFS server to get e.g. pkgsrc tree and
distfiles.  The performance is really bad, e.g. make extract of a
package may take literal ages (I did a lot of the initial
investigation for this mail while python distfile was being
extracted).  Extracting 31M uncompressed bash distfile may take, from
run to run:

real    2m21.110s
user    0m0.635s
sys     0m4.233s

or

real    4m52.010s
user    0m0.769s
sys     0m4.815s

or whatever.

Looking at the traffic with wireshark I can see a curious recurring
pattern.  Looking at the time/sequence plot one immediately sees short
bursts of activity separated by huge gaps of no traffic.

Here's one such instance abridged and wrapped for clarity.  Timestamps
are shown as delta to the previous frame.  Window scale is 3 for the
client (utilite) and 7 for the server (majava), i.e. the server has
plenty of window open.

> 413 00.000351 IP utilite.1011 > majava.nfs: Flags [.],
  seq 177121:178569, ack 79601, win 3225,
  options [nop,nop,TS val 111 ecr 1941833655],
  length 1448:
  NFS request xid 1992059772 1444 write fh ... 5406 (5406) bytes
  @ 16384
> 414 00.48 IP utilite.1011 > majava.nfs: Flags [.],
  seq 178569:180017, ack 79601, win 3225,
  options [nop,nop,TS val 111 ecr 1941833655],
  length 1448
  415 00.09 IP majava.nfs > utilite.1011: Flags [.],
  ack 180017, win 1834,
  options [nop,nop,TS val 1941833656 ecr 111],
  length 0
> 416 00.51 IP utilite.1011 > majava.nfs: Flags [.],
  seq 180017:181465, ack 79601, win 3225,
  options [nop,nop,TS val 111 ecr 1941833655],
  length 1448
  417 00.043745 IP majava.nfs > utilite.1011: Flags [.],
  ack 181465, win 1834,
  options [nop,nop,TS val 1941833700 ecr 111],
  length 0
> 418 00.994813 IP utilite.1011 > majava.nfs: Flags [P.],
  seq 181465:182645, ack 79601, win 3225,
  options [nop,nop,TS val 111 ecr 1941833655],
  length 1180
  419 00.32 IP majava.nfs > utilite.1011: Flags [.],
  ack 182645, win 1834,
  options [nop,nop,TS val 1941834694 ecr 111],
  length 0
! 420 00.07 IP utilite.1011 > majava.nfs: Flags [P.],
  seq 181465:182645, ack 79601, win 3225,
  options [nop,nop,TS val 113 ecr 1941833700],
  length 1180
  421 00.09 IP majava.nfs > utilite.1011: Flags [.],
  ack 182645, win 1834,
  options [nop,nop,TS val 1941834694 ecr 113,nop,nop,
   sack 1 {181465:182645}],
  length 0

Here frames 413, 414, 416 and 418 comprise a single NFS write request.
All but the last segment are sent very fast.  Then note that the
last segment (418) is sent after a 1 second delay, and then
immediately resent (420, marked as "spurious retransmission" by
wireshark).
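
The numbers in the trace can be cross-checked with a bit of
arithmetic.  The sketch below uses two helper names invented for this
example.  With the scales given above the client advertises
3225 << 3 = 25800 bytes and the server 1834 << 7 = 234752 bytes, and
the four segments (3 * 1448 + 1180 = 5524 bytes) cover exactly the
sequence range 177121:182645.

```c
#include <assert.h>
#include <stdint.h>

/* Effective receive window: advertised value shifted by the window scale. */
uint32_t
tcp_effective_window(uint16_t advertised, unsigned int wscale)
{
	assert(wscale <= 14);	/* RFC 7323 limits the shift count to 14 */
	return (uint32_t)advertised << wscale;
}

/* Payload bytes covered by the sequence range seq_start:seq_end. */
uint32_t
tcp_seq_span(uint32_t seq_start, uint32_t seq_end)
{
	/* Unsigned subtraction also gives the right answer across wraparound. */
	return seq_end - seq_start;
}
```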

This pattern repeats through the trace.  From time to time a large
write has its last segment delayed by 1 second, then there's an ACK
from the server and then that last segment is immediately "spuriously"
resent.

Does this ring any bells?  Either from the TCP point of view or from
what NFS might be doing here that might trigger that.

Just copying a large file to the server seems to be ok, the
time/sequence plot is nice and linear.

-uwe


Re: autoloading compat43 on tty ioctls

2020-10-10 Thread Valery Ushakov
On Sat, Oct 10, 2020 at 11:49:47 -0700, Paul Goyette wrote:

> True, but the way ioctl's are handled in kern/tty.c seems to auto-load
> the compat_43 and compat_60 modules for _any_ unhandled ioctl.  So if
> you have an illegal/invalid ioctl it will autoload the modules, and then
> unload them 10 seconds later.
> 
> I question whether we should do the autoloads...

I think I mentioned exactly this problem in Lillehammer - I noticed it
a few years ago b/c newer binutils started using some reloc type or
other that sh3 kobj was not prepared to handle, so you'd see a kernel
message on each (auto)load.

-uwe


Re: "Boot this kernel once" functionality? (amd64)

2020-09-16 Thread Valery Ushakov
On Wed, Sep 16, 2020 at 12:20:57 +0200, Anthony Mallet wrote:

> On Wednesday 16 Sep 2020, at 12:09, Martin Husemann wrote:
> > This works fine on e.g. sparc*; I can do: shutdown -b netbsd.t -r
> > now
> >
> > No state is modified on any disks, very convenient.
> 
> Right, not changing any state seems safer!
> 
> > I don't know if there is enough of a persistent environment for UEFI
> > boots (I would guess there is), and probably no easy way for BIOS
> > boot.
> 
> The machine in question is not UEFI, so I would be more interested in
> a pure BIOS solution.

As der Mouse mentioned upthread, kloader(4) would seem like a
promising candidate to implement this.  It doesn't support x86
currently, but existing kloader_machdep.c files are minuscule - the
non-boilerplate code is essentially just one function that is
little more than a fancy memcpy.  The really interesting
question is if NetBSD on a given platform leaves the machine in a
state that a newly booted kernel expects the machine to be in.  The
hpc* ports that support kloader do not expect anything much from the
initial state of the machine.

Of course that doesn't suit your immediate needs...

-uwe


-.su file in kernel compile dir

2020-07-05 Thread Valery Ushakov
Recent changes to record stack usage cause a file named -.su to be
created (it refers to assym.c).  It plays tricks with targets like
clean that refer to *.su.

-uwe


Re: Submitting a new module example

2020-06-02 Thread Valery Ushakov
On Tue, Jun 02, 2020 at 00:02:29 +, bmelo wrote:

> I have written a ddb_hello module example. You can find the patch here:
> https://pastebin.com/WCUpRc0J
> 
> Could it be imported in src even if there is a ddbping example, please?

It's the same skeleton code that ddbping already demoes (a bit less,
actually), so I don't think there's a reason to have a duplicate
example.  Sorry I've beaten you to it, I had no idea.

-uwe


Re: Rump makes the kernel problematically brittle

2020-04-02 Thread Valery Ushakov
On Thu, Apr 02, 2020 at 23:29:55 +0300, Valery Ushakov wrote:

> On Thu, Apr 02, 2020 at 16:15:30 -0400, Mouse wrote:
> 
> > > http://www.fixup.fi/misc/rumpkernel-book/
> > 
> > That page I can look at fine, but when I try to fetch the PDF, I get a
> > 403 Forbidden.  In case it helps anyone, the body says
> > 
> > Code: AccessDenied
> > Message: Access Denied
> > RequestId: CE223007341C4B9F
> > HostId: 
> > iIeEi7wGEkGET4V/Pw2ndjkjrChsKswcqoLJJpmExJOrqdRFFgHw6L6XWjB2ZSNqBTTXyPHJYMI=
> 
> Works with firefox.  It probably needs javascript or cookies or
> whatever.

It wants a referrer.  Go to that page with lynx and you can download
the pdf by following the link.

-uwe


Re: Rump makes the kernel problematically brittle

2020-04-02 Thread Valery Ushakov
On Thu, Apr 02, 2020 at 16:15:30 -0400, Mouse wrote:

> > http://www.fixup.fi/misc/rumpkernel-book/
> 
> That page I can look at fine, but when I try to fetch the PDF, I get a
> 403 Forbidden.  In case it helps anyone, the body says
> 
> Code: AccessDenied
> Message: Access Denied
> RequestId: CE223007341C4B9F
> HostId: 
> iIeEi7wGEkGET4V/Pw2ndjkjrChsKswcqoLJJpmExJOrqdRFFgHw6L6XWjB2ZSNqBTTXyPHJYMI=

Works with firefox.  It probably needs javascript or cookies or
whatever.


> (While I recognize you may not be the person to say this to, denying
> access like that without any indication of what the problem is or whom
> to ask for help is...singularly useless.)

This is so *richly* ironic coming from you :) I've given up trying to
send you personal mail years ago.  No, please, don't answer that, I'm
no longer interested.

Re whom to contact, there's a contact email literally right beneath
that link.  Not sure if your MX will agree to accept the reply
though. :)

PS: Sorry, I still can't stop giggling...
PPS: Sorry... :)
PPPS: *giggle*

-uwe


Re: Rump makes the kernel problematically brittle

2020-04-02 Thread Valery Ushakov
On Fri, Apr 03, 2020 at 02:23:31 +0700, Robert Elz wrote:

>   | Is this documented anywhere?
> 
> You're putting documented and rump into the same thought space?

http://www.fixup.fi/misc/rumpkernel-book/

-uwe


sys_ptrace_lwpstatus.c (Was: CVS commit: src/sys)

2019-12-26 Thread Valery Ushakov
On Thu, Dec 26, 2019 at 08:52:39 +, Kamil Rytarowski wrote:

> Module Name:  src
> Committed By: kamil
> Date: Thu Dec 26 08:52:39 UTC 2019
> 
> Modified Files:
>   src/sys/kern: files.kern sys_ptrace_common.c
>   src/sys/sys: ptrace.h
> Added Files:
>   src/sys/kern: sys_ptrace_lwpstatus.c
> 
> Log Message:
> Put ptrace_read_lwpstatus() and process_read_lwpstatus() to a new file
> 
> Fixes "no PTRACE" kernel build, in particular zaurus kernel=INSTALL_C700.

It is counterintuitive for a sys_ptrace* file with ptrace_* functions
not to depend on options ptrace.  That seems to be a strong indication
that the functions and the file are misnamed.

file    kern/sys_ptrace.c               ptrace
file    kern/sys_ptrace_common.c        ptrace
file    kern/sys_ptrace_lwpstatus.c     kern

-uwe


xc_barrier()

2019-10-06 Thread Valery Ushakov
gcc 8 -Wcast-function-type (enabled by -Wextra that we do turn on for
x86 ports and a few others) is not very happy about many function
casts for nullop and friends in the kernel.

A small portion of them is code that does xcall barrier with:

uint64_t where;
where = xc_broadcast(0, (xcfunc_t)nullop, NULL, NULL);
xc_wait(where);

The attached patch replaces all these with

xc_barrier(0);

with obvious implementation.

Suggestions for a better name and (especially) for the descriptive
comment and the man-page text are welcome.
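
For anyone not familiar with the warning, here is a toy userland
illustration of why a no-op with the exact callback type avoids the
cast.  Everything named toy_* is invented for this example; it only
mimics the shape of the xcall interface and does no real cross-CPU
work.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as a two-argument cross-call callback. */
typedef void (*toy_xcfunc_t)(void *, void *);

/* Toy "broadcast": calls func once per pretend CPU, returns the call count. */
static unsigned int
toy_broadcast(unsigned int ncpu, toy_xcfunc_t func, void *arg1, void *arg2)
{
	unsigned int i;

	assert(func != NULL);
	for (i = 0; i < ncpu; i++)
		func(arg1, arg2);
	return ncpu;
}

/* A no-op with exactly the callback's type, so no function cast is needed. */
static void
toy_nop(void *arg1, void *arg2)
{
	(void)arg1;
	(void)arg2;
}

/* Wrapper that hides the "broadcast a nop" idiom behind one call. */
unsigned int
toy_barrier(unsigned int ncpu)
{
	return toy_broadcast(ncpu, toy_nop, NULL, NULL);
}
```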

-uwe
Index: sys/xcall.h
===
RCS file: /cvsroot/src/sys/sys/xcall.h,v
retrieving revision 1.7
diff -u -p -r1.7 xcall.h
--- sys/xcall.h 27 Aug 2018 07:10:15 -  1.7
+++ sys/xcall.h 6 Oct 2019 12:28:38 -
@@ -53,6 +53,8 @@ uint64_t  xc_broadcast(u_int, xcfunc_t, v
 uint64_t   xc_unicast(u_int, xcfunc_t, void *, void *, struct cpu_info *);
 void   xc_wait(uint64_t);
 
+void   xc_barrier(u_int);
+
 unsigned int   xc_encode_ipl(int);
 
 #endif /* _KERNEL */
Index: kern/subr_xcall.c
===
RCS file: /cvsroot/src/sys/kern/subr_xcall.c,v
retrieving revision 1.26
diff -u -p -r1.26 subr_xcall.c
--- kern/subr_xcall.c   7 Feb 2018 04:25:09 -   1.26
+++ kern/subr_xcall.c   6 Oct 2019 12:28:37 -
@@ -247,6 +247,30 @@ xc_init_cpu(struct cpu_info *ci)
KASSERT(error == 0);
 }
 
+
+static void
+xc_nop(void *arg1, void *arg2)
+{
+
+return;
+}
+
+
+/*
+ * xc_barrier:
+ *
+ * Broadcast a nop to all CPUs in the system.
+ */
+void
+xc_barrier(unsigned int flags)
+{
+   uint64_t where;
+
+   where = xc_broadcast(flags, xc_nop, NULL, NULL);
+   xc_wait(where);
+}
+
+
 /*
  * xc_broadcast:
  *
Index: arch/x86/acpi/acpi_cpu_md.c
===
RCS file: /cvsroot/src/sys/arch/x86/acpi/acpi_cpu_md.c,v
retrieving revision 1.79
diff -u -p -r1.79 acpi_cpu_md.c
--- arch/x86/acpi/acpi_cpu_md.c 10 Nov 2018 09:42:42 -  1.79
+++ arch/x86/acpi/acpi_cpu_md.c 6 Oct 2019 12:28:35 -
@@ -378,7 +378,6 @@ acpicpu_md_cstate_stop(void)
 {
static char text[16];
void (*func)(void);
-   uint64_t xc;
bool ipi;
 
x86_cpu_idle_get(, text, sizeof(text));
@@ -393,8 +392,7 @@ acpicpu_md_cstate_stop(void)
 * Run a cross-call to ensure that all CPUs are
 * out from the ACPI idle-loop before detachment.
 */
-   xc = xc_broadcast(0, (xcfunc_t)nullop, NULL, NULL);
-   xc_wait(xc);
+   xc_barrier(0);
 
return 0;
 }
Index: kern/kern_lwp.c
===
RCS file: /cvsroot/src/sys/kern/kern_lwp.c,v
retrieving revision 1.204
diff -u -p -r1.204 kern_lwp.c
--- kern/kern_lwp.c 3 Oct 2019 22:48:44 -   1.204
+++ kern/kern_lwp.c 6 Oct 2019 12:28:37 -
@@ -367,7 +367,6 @@ static void
 lwp_dtor(void *arg, void *obj)
 {
lwp_t *l = obj;
-   uint64_t where;
(void)l;
 
/*
@@ -379,8 +378,7 @@ lwp_dtor(void *arg, void *obj)
 * the value of l->l_cpu must be still valid at this point.
 */
KASSERT(l->l_cpu != NULL);
-   where = xc_broadcast(0, (xcfunc_t)nullop, NULL, NULL);
-   xc_wait(where);
+   xc_barrier(0);
 }
 
 /*
Index: kern/kern_ras.c
===
RCS file: /cvsroot/src/sys/kern/kern_ras.c,v
retrieving revision 1.38
diff -u -p -r1.38 kern_ras.c
--- kern/kern_ras.c 4 Jul 2016 07:56:07 -   1.38
+++ kern/kern_ras.c 6 Oct 2019 12:28:37 -
@@ -66,9 +66,7 @@ ras_sync(void)
/* No need to sync if exiting or single threaded. */
if (curproc->p_nlwps > 1 && ncpu > 1) {
 #ifdef NO_SOFTWARE_PATENTS
-   uint64_t where;
-   where = xc_broadcast(0, (xcfunc_t)nullop, NULL, NULL);
-   xc_wait(where);
+   xc_barrier(0);
 #else
/*
 * Assumptions:
Index: kern/kern_softint.c
===
RCS file: /cvsroot/src/sys/kern/kern_softint.c,v
retrieving revision 1.47
diff -u -p -r1.47 kern_softint.c
--- kern/kern_softint.c 17 May 2019 03:34:26 -  1.47
+++ kern/kern_softint.c 6 Oct 2019 12:28:37 -
@@ -407,7 +407,6 @@ softint_disestablish(void *arg)
softcpu_t *sc;
softhand_t *sh;
uintptr_t offset;
-   uint64_t where;
u_int flags;
 
offset = (uintptr_t)arg;
@@ -432,8 +431,7 @@ softint_disestablish(void *arg)
 * SOFTINT_ACTIVE already set.
 */
if (__predict_true(mp_online)) {
-   where = xc_broadcast(0, (xcfunc_t)nullop, NULL, NULL);
-   xc_wait(where);
+   xc_barrier(0);
}
 
for (;;) {
Index: kern/kern_syscall.c

Re: Proposal, again: Disable autoload of compat_xyz modules

2019-09-27 Thread Valery Ushakov
On Fri, Sep 27, 2019 at 11:36:08 -, Christos Zoulas wrote:

> >} I propose something very slightly different that can preserve the current
> >} functionality with user action:
> >} 
> >} 1. Remove them from standard kernels in architectures where modules are
> >}supported. Users can add them back or just use modules.
> >} 2. Disable autoloading, but provide a sysctl to enable autoloading
> >}(1 global sysctl for all compat modules). Users can change the default
> >}in /etc/sysctl.conf (adds sysctl to the proposal)
> >
> > You mean this (first line):
> >
> >i386devel: {31} sysctl kern.module
> >kern.module.autoload = 0
> >kern.module.verbose = 0
> >kern.module.path = /stand/amd64-xen/8.99.26/modules
> >kern.module.autotime = 10
> 
> Perhaps:
> 
> kern.module.autoload.disable = linux,linux32

Maybe we should take a look at how SNMP did tables in MIB, b/c we are
trying to create just such a table indexed by module name.

Also, I'm not that sure about autoload of compat stuff especially
since iirc it currently implies auto-unload too.  I vaguely remember
when I was debugging something in sh3 kobj_machdep.c I had some
printfs there that made the autoloads visible and (iirc) each vi
invocation would trigger an autoload of compat ioctl code (which
wouldn't recognize the ioctl, and that would be auto-unloaded a few
seconds later).

-uwe


Re: Proposal, again: Disable autoload of compat_xyz modules

2019-09-27 Thread Valery Ushakov
On Fri, Sep 27, 2019 at 10:57:12 +0200, Jaromír Doležek wrote:

> On Thu, Sep 26, 2019 at 18:08, Manuel Bouyer wrote:
> >
> > On Thu, Sep 26, 2019 at 05:10:01PM +0200, Maxime Villard wrote:
> > > issues for a clearly marginal use case, and given the current general
> >  ^^^
> >
> > This is where we disagree. You guess it's marginal but there's no
> > evidence of that (and there's no evidence of the opposite either).
> 
> FYI - I've also put a lot of effort into fixing & enhancing
> compat_linux in the past. I also greatly appreciate all the work of
> other folks working on the layer, it's super useful in some situations
> - browser with flash support used to be important (thankfully not
> anymore), also vmware and matlab, I also used some Oracle dev tools.
> However, that is not the topic of the discussion.
> 
> Let's concentrate on whether it should be enabled by default.

Yes, please.  This discussion has veered way off topic.


> Given the history, to me it's completely clear compat_linux shouldn't
> be on by default. Any possible linux-specific exploits should only be
> problem for people actually explicitly enabling it. Let's just stop
> pretending that we'd setup any kind of reasonable testing suite for
> this - it has not been done in last >20 years, it's even less likely
> to happen now that most of the major use cases are actually moot.
> 
> As Maya suggested, let's keep this concentrated on COMPAT_LINUX only
> to avoid further bikeshed flogging, so basically I propose doing this:
> 1) Comment out COMPAT_LINUX from all kernels configs for all archs
> which support modular
> 2) Disable autoload for compat_linux, requiring the user to explicitly
> configure system to load it. No extra sysctl.
> 
> Any major and specific objections?

At some point it became very hard to follow the technical content of
this thread, but I don't think there were any.

Thanks!

-uwe


Re: mknod(2) and POSIX

2019-06-18 Thread Valery Ushakov
On Tue, Jun 18, 2019 at 17:22:14 +0200, Kamil Rytarowski wrote:

> I wrote a patch to add support for it, but untested as currently the
> kernel build is broken:
> 
> http://netbsd.org/~kamil/patch-00128-posix-mknod.txt
> 
> Independently, I have removed unused variable retval.
> 
> If this patch is fine and once the kernel will be unbroken, I can land
> it, document and add ATF tests.

Please, please, please, don't mix unrelated changes.  If retval is
unused already, g/c it first in a separate commit.

-uwe


Re: mknod(2) and POSIX

2019-06-18 Thread Valery Ushakov
On Tue, Jun 18, 2019 at 14:30:26 +0200, Jason Thorpe wrote:

> > On Jun 18, 2019, at 2:25 PM, Jason Thorpe  wrote:
> > 
> >> On Jun 18, 2019, at 2:01 PM, Greg Troxel  wrote:
> >> 
> >> I realize mkfifo is preferred in our world, and POSIX says it is
> >> preferred.  But I believe we have a failure to follow POSIX.
> >> 
> >> Other opinions?
> > 
> > Seems you are correct.
> 
> Sorry!  Hit "send" prematurely.
> 
> mknod(2) for the FIFO case should allow users under the same
> circumstances that mkfifo(2) does.

Since our mknod() is a wrapper, we can trivially dispatch to the mkfifo
syscall for mknod calls with S_IFIFO, can't we?  I don't think we
should make the mknod syscall itself to support this.

-uwe


Re: fork-the-syscall return semantics

2019-02-16 Thread Valery Ushakov
On Sat, Feb 16, 2019 at 20:14:35 -0500, Mouse wrote:

> In fork1(), in kern/kern_fork.c, there is code
> 
> /*
>  * Return child pid to parent process,
>  * marking us as parent via retval[1].
>  */
> if (retval != NULL) {
> retval[0] = p2->p_pid;
> retval[1] = 0;
> }
> 
> This is very old code; identical code appears as far back as 1.4T,
> quite likely even farther back.  It appears the return semantics of
> fork-the-syscall-trap (and related calls, like __vfork14) are a bit
> odd: the parent returns  and the child returns  (or
> at least so a comment in the SPARC libc wrapper claims; I haven't dug
> enough to find the kernel code where the child's return values are set
> up).  But I see no reason for this, as the libc wrapper immediately
> destroys the first return value in the child.
> 
> Does anyone happen to know why this was done?  So far I haven't found
> any reason to not simply return the abstract return value in retval[0]
> like most other syscalls that return a simple integer value, but for a
> special case like this to have survived this long, I can't help feeling
> there must be _something_ behind it.

I would look at 

  http://mail-index.netbsd.org/source-changes/1995/12/10/msg012114.html

  Modified Files:
  init_main.c 
  Log Message:
  Change the way we test whether or not we're in the child process.

except there seems to be no such commit actually recorded in
init_main.c log :)  Was reverted in the repo, I guess.

The code before that change looks like:

#ifdef cpu_set_init_frame   /* XXX should go away */
if (rval[1]) {
/*
 * Now in process 2.
 */
start_pagedaemon(curproc);
}
#else


Its counterpart is

  http://mail-index.netbsd.org/source-changes/1995/12/10/msg012115.html

  Modified Files:
  kern_fork.c 
  Log Message:
  If __FORK_BRAINDAMAGE, continue stuffing retval[1] for the benefit of main().


Other relevant commits are probably:

  http://mail-index.netbsd.org/source-changes/1995/12/09/.html
  http://mail-index.netbsd.org/source-changes/1995/12/09/msg012096.html
  http://mail-index.netbsd.org/source-changes/1995/12/09/msg012098.html


-uwe


Re: Help needed with understanding of config(1) debug output

2018-09-27 Thread Valery Ushakov
On Thu, Sep 27, 2018 at 16:20:50 +0800, Paul Goyette wrote:

> I've got a problem where something I've changed over the last six months
> (or more) on the [pgoyette-compat] branch has broken the release build
> for at least ``build.sh -m algor'' port.  For some unknown reason it is
> defining COMPAT_NETBSD32 in opt_compat_netbsd32.h even though the option
> is not selected in the kernel definition file.
> 
> I've tried to understand the debug output from ``config -d ...'' but
> I simply don't understand the output.  (The output looks more like it is
> intended to debug config(1) itself, and not for debugging issues with
> config's input files.)  I find the following snippet in the debug output
> 
>dependopts:326: debug: depend attr `COMPAT_NETBSD32'
>dependopts:326: debug: option selected `compat_netbsd32'
>dependopts:326: debug: depend `COMPAT_NETBSD32' searched
> 
> This seems to indicate that attribute COMPAT_NETBSD32 was previously
> "needed" and therefore we need to include option `compat_netbsd32'.  But
> there is no earlier mention of COMPAT_NETBSD32 in the debug output.

You made EXEC_ELF32 depend on COMPAT_NETBSD32 and since you enable
EXEC_ELF32, it pulls in COMPAT_NETBSD32 that it now depends on.

-uwe


Re: How to prevent a mutex being _enter()ed from being _destroy()ed?

2018-08-10 Thread Valery Ushakov
On Sat, Aug 11, 2018 at 00:46:26 +0700, Robert Elz wrote:

> Date:Fri, 10 Aug 2018 08:03:55 -0400
> From:Greg Troxel 
> Message-ID:  
> 
>   | Ancient BSD tradition is not to explain these things :-(
> 
> Older than that.   Don't you remember
>   you are not expected to understand this
> (or wording very similar) in ancient 4th/5th edition unix.

The explanation for that comment I've read somewhere was that it
really meant "it will not be in the exam" (which is a wonderful story
even if it's not true :).

-uwe


Old FFS triggers assertion in BUFRD()

2018-07-18 Thread Valery Ushakov
I have found an OpenWindows Version 3 CD in a drawer.  The label claims
"ISO 9660 format", but it's really an FFS image.

I was able to mount it with a little tweak - ffs_superblock_validate()
checked only fs_size, but this CD from 1991 only has fs_old_size (fix
committed).

The next hiccup I ran into was an assertion in ufs_readwrite.c:172
(rump is compiled with DIAGNOSTIC).

KASSERT(vp->v_type != VLNK || ump->um_maxsymlinklen != 0 ||
DIP(ip, blocks) == 0);

It was triggered by di_blocks being 2.

Maybe someone with enough FFS clue could take a look?  Disc image
available upon request.

-uwe


Leaking kernel stack data in struct padding

2018-06-13 Thread Valery Ushakov
On Wed, Jun 13, 2018 at 02:09:09 +, Valeriy E. Ushakov wrote:

> Module Name:  src
> Committed By: uwe
> Date: Wed Jun 13 02:09:09 UTC 2018
> 
> Modified Files:
>   src/sys/dev/wscons: wsevent.c
> 
> Log Message:
> wsevent_copyout_events50 - don't leak garbage from the kernel stack.
> 
> On 64-bit machines struct timespec50 has padding between 32-bit tv_sec
> and long tv_nsec that is not affected by normal assignment.  Scrub it
> before we uiomove struct owscons_event.

I was looking at mouse events on an amd64 VM with

  # hexdump -e '/4 " %2d" /4 " %5d" /8 "  %d" /8 ".%09d" "\n"' /dev/wsmouse

note: wscons event sources give you compat event structs unless you
request the current version with an ioctl (which is kinda hard to do
in hexdump :).

I noticed that the first reported event always had bogus timestamp.
Took me a bit of time to realize what was going on.  I fixed it in
wsevent.c (indentation reduced for readability):

+#if INT32_MAX < LONG_MAX   /* scrub padding */
+   memset(, 0, offsetof(struct timespec50, tv_nsec));
+#endif 
timespec_to_timespec50(>time, );

but I wonder if this scrubbing should be moved into
timespec_to_timespec50() - after all the most likely use of the compat
struct is to write it or copy it out in the compat code, so the same
problem probably happens elsewhere.
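
A userland sketch of what such a scrubbing conversion could look
like.  The toy_* names are invented for this example and the struct is
only a stand-in for struct timespec50.  Strictly speaking C leaves
padding bytes unspecified after a member store, so this relies on the
compiler not touching them, just like the fix above does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the compat struct: 32-bit seconds followed by a long. */
struct toy_timespec50 {
	int32_t	tv_sec;
	long	tv_nsec;
};

/* Fill the struct, zeroing any padding between tv_sec and tv_nsec. */
void
toy_fill_timespec50(struct toy_timespec50 *out, int32_t sec, long nsec)
{
	assert(out != NULL);

	/* Zero everything before tv_nsec: tv_sec plus padding, if any. */
	memset(out, 0, offsetof(struct toy_timespec50, tv_nsec));
	out->tv_sec = sec;
	out->tv_nsec = nsec;
}
```

On ILP32 there is no padding and the memset just pre-zeroes tv_sec.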

On amd64 the compiler is smart enough to convert memset() to a few
movq's.  The compiler is not smart enough to notice that tv_nsec is
written to in timespec_to_timespec50(), so

memset(, 0, sizeof(ev50.time));
timespec_to_timespec50(...);

would still emit two movq's immediately followed by another movq to
tv_nsec.  Hence these specific arguments in the call to memset().

Comments?


PS: The next logical question is if there's a tool that can help audit the
rest of the kernel for problems like that.  :)

-uwe


Re: I would like to contribute to NetBSD

2018-04-08 Thread Valery Ushakov
On Sat, Apr 07, 2018 at 18:43:29 -0700, Andy Ruhl wrote:

> On Fri, Apr 6, 2018 at 7:31 AM, Narendra Kangralkar
>  wrote:
> > Hello All,
> >
> > I found that the "NetBSD as a supported guest OS under VirtualBox"
> > project is only partially completed. I would like to work on it if it
> > is still available. Please let me know your thoughts regarding this.
> 
> I use NetBSD under Virtualbox so I'm guessing you're talking about
> making a supported set of guest additions?

Support for NetBSD Guest Additions has been committed to the
VirtualBox tree quite a while ago.  Though there's still no pkgsrc
package.

-uwe


Re: setting DDB_COMMANDONENTER="bt" by default

2018-02-16 Thread Valery Ushakov
On Sat, Feb 17, 2018 at 08:35:32 +1100, matthew green wrote:

> Valery Ushakov writes:
> > On Thu, Feb 15, 2018 at 01:19:31 +, Sevan Janiyan wrote:
> > 
> > > > I might/would suggest
> > > > 
> > > >OPTIONS DDB_ONPANIC=2
> > > 
> > > clear, any reason not to have this as a default? (I'm going to sleep on 
> > > it)
> > 
> > As someone has already mentioned upthread, because printing a
> > backtrace might cause another panic, so the default was selected to be
> > on the safe(r) side.  At least that's what I recall.
> 
> i don't think this is the case.
> 
> the builtin stack trace code is fault-tolerant.  if it
> faults, it will not re-try and you'll get a db> prompt.

My memory is hazy.  I do have (for more than a decade it seems) a
local change in db_trap() that adds db_recover around
db_print_loc_and_inst() call, but I think that was to protect from fat
fingers in ddb (hpcsh keyboard is tiny :).

-uwe


Re: setting DDB_COMMANDONENTER="bt" by default

2018-02-15 Thread Valery Ushakov
On Thu, Feb 15, 2018 at 02:11:07 +, Sevan Janiyan wrote:

> On 02/15/18 01:23, Valery Ushakov wrote:
> > As someone has already mentioned upthread, because printing a
> > backtrace might cause another panic, so the default was selected to be
> > on the safe(r) side.  At least that's what I recall.
> 
> On 02/15/18 01:33, Paul Goyette wrote:
> > Yes, that matches my recall as well.
> 
> Ah, ok, so leave this to rest? (is it worth testing in -current to see
> how things go?)

Well, "testing" here would be to throw random garbage in the stack for
"bt" to choke on (and that garbage might also need to point to just
the right other data).  You might be able to script this with
something like vbox snapshots I guess, by snapshotting a VM when it's
in ddb and then fuzzing the kernel stack before resuming it (I don't
remember if vbox vm debugger is scriptable, you might also need to
hack it a bit to be).

-uwe


Re: setting DDB_COMMANDONENTER="bt" by default

2018-02-14 Thread Valery Ushakov
On Thu, Feb 15, 2018 at 01:19:31 +, Sevan Janiyan wrote:

> > I might/would suggest
> > 
> >OPTIONS DDB_ONPANIC=2
> 
> clear, any reason not to have this as a default? (I'm going to sleep on it)

As someone has already mentioned upthread, because printing a
backtrace might cause another panic, so the default was selected to be
on the safe(r) side.  At least that's what I recall.

-uwe


Re: gcc: optimizations, and stack traces

2018-02-09 Thread Valery Ushakov
[Summoning Krister]

On Fri, Feb 09, 2018 at 11:23:17 +0100, Maxime Villard wrote:

> There are also several cases where functions in the call tree can disappear
> from the backtrace. In the following call tree:
> 
>   A -> B -> C -> D   (and D panics)
> 
> if, in B, GCC put the two instructions after the instruction that calls C,
> the backtrace will be:
> 
>   A -> C -> D
> 
> This can make a bug completely undebuggable.

Does gcc actually generate code like that?  I thought that it can
delay frame pointer creation, but only until it needs to make a nested
call, to C in your example, (as in the sample I showed in another mail
to this thread).

-uwe


Re: gcc: optimizations, and stack traces

2018-02-09 Thread Valery Ushakov
On Fri, Feb 09, 2018 at 11:38:47 +0100, Martin Husemann wrote:

> On Fri, Feb 09, 2018 at 11:23:17AM +0100, Maxime Villard wrote:
>
> > When I spotted this several months ago (while developing Live
> > Kernel ASLR), I tried to look for GCC options that say "optimize
> > with -O2, but keep the stack trace intact". I couldn't find one,
> > and the only thing I ended up doing was disabling -O2 in the
> > makefiles.
> 
> -fno-omit-frame-pointer?

That won't help.

 `-O' also turns on `-fomit-frame-pointer' on machines where doing
 so does not interfere with debugging.

so it's not turned off in the first place.  The problem is that some
of the later optimization passes may push frame pointer setup to some
place later in function.  E.g. on -7 

void
kernfs_get_rrootdev(void)
{
    static int tried = 0;

    if (tried) {
        /* Already did it once. */
        return;
    }
    tried = 1;

    if (rootdev == NODEV)
        return;
    rrootdev = devsw_blk2chr(rootdev);
    if (rrootdev != NODEV)
        return;
    rrootdev = NODEV;
    printf("kernfs_get_rrootdev: no raw root device\n");
}

is compiled to 

    c068f81b <kernfs_get_rrootdev>:
    c068f81b:   mov    0xc0fc6b40,%eax
    c068f820:   test   %eax,%eax
    c068f822:   jne    c068f867
    c068f824:   movl   $0x1,0xc0fc6b40
    c068f82e:   mov    0xc0fde0b8,%edx
    c068f834:   mov    0xc0fde0bc,%eax
    c068f839:   mov    %edx,%ecx
    c068f83b:   and    %eax,%ecx
    c068f83d:   cmp    $0x,%ecx
    c068f840:   je     c068f867
->  c068f842:   push   %ebp
->  c068f843:   mov    %esp,%ebp
    c068f845:   sub    $0x8,%esp
    c068f848:   mov    %edx,(%esp)
    c068f84b:   mov    %eax,0x4(%esp)
    c068f84f:   call   c091ce52

So the "tried" check and the first "rootdev" check happen before the
frame pointer is set up.

-uwe


Re: Proposal to obsolete SYS_pipe

2017-12-25 Thread Valery Ushakov
On Tue, Dec 26, 2017 at 01:29:42 +, Christos Zoulas wrote:

> In article ,
> Kamil Rytarowski   wrote:
> >-=-=-=-=-=-
> >-=-=-=-=-=-
> >
> >On 25.12.2017 17:43, Christos Zoulas wrote:
> >> On Dec 25,  4:42pm, n...@gmx.com (Kamil Rytarowski) wrote:
> >> -- Subject: Re: Proposal to obsolete SYS_pipe
> >> 
> >> | I've extracted two changes from the original mail:
> >> | 
> >> | https://mail-index.netbsd.org/tech-kern/2017/12/25/msg022836.html
> >> 
> >> Yes, the first patch is exactly what I had in mind; remove the
> >> assembly stubs from libc and make pipe() a wrapper for pipe2().
> >> The second patch sounds good too, but it is not in the email...
> >> 
> >> christos
> >> 
> >
> >I've included the missing patch in the subsequent mail:
> >
> >https://mail-index.netbsd.org/tech-kern/2017/12/25/msg022840.html
> >
> >Patch (pasted here for the reference):
> >
> >http://netbsd.org/~kamil/patch-00041-refactor-pipe1.txt
> 
> I am good with both since they eliminate the MD code and simplify
> the MI code. The only drawback is that sys_pipe (the system call)
> is not handled directly anymore by libc, but that's not an issue
> except for the slight performance loss (which does not really matter
> the moment you start doing I/O).

Why can't we just leave pipe() alone?  There are other syscalls that
return two values, e.g. fork.  The MD asm stubs are trivial and they
are already written.  They've been there for ages.  Why the sudden
desire to "create movement"?

The pipe1() change is a good thing, OTOH.

-uwe


Re: Proposal to obsolete SYS_pipe

2017-12-25 Thread Valery Ushakov
On Mon, Dec 25, 2017 at 16:37:43 +0100, Kamil Rytarowski wrote:

> On 24.12.2017 22:25, Kamil Rytarowski wrote:
>
> > http://netbsd.org/~kamil/patch-00039-obsolete-SYS_pipe.txt
> 
> I've extracted two patches from the above proposal.
> 
> In these patches SYS_pipe is not marked COMPAT_80 and not removed from
> rump. I've left it as it is.
> 
> 1. Implement pipe() with pipe2(2) in libc:
> 
> New source code is now Machine Independent.
> 
> http://netbsd.org/~kamil/patch-00040-implement-pipe-with-pipe2-in-libc.txt
> 
> The generated code in libc for x86_64 is also simpler and shorter:
> 
> 0008b2a2 <_pipe>:
>8b2a2:   31 f6   xor%esi,%esi
>8b2a4:   e9 b7 f5 fa ff  jmpq   3a860 

But you incur the price of pipe2's copyout().  I'm curious, does
anyone know how things like SMAP contribute to that price?
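
For reference, the MI wrapper under discussion amounts to roughly the
following sketch.  The name pipe_via_pipe2 is made up here so that it
does not clash with the real pipe(); _GNU_SOURCE is only there on the
assumption that the sketch may also be built against glibc.  The two
instructions quoted above match this shape: zero the flags argument
and jump to pipe2.

```c
#define _GNU_SOURCE	/* assumption: needed for pipe2() on glibc */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical name for a pipe() built on top of pipe2(2). */
static int
pipe_via_pipe2(int fildes[2])
{
	assert(fildes != NULL);
	/* flags == 0 asks for plain pipe() semantics. */
	return pipe2(fildes, 0);
}
```

The descriptors come back through the user-supplied array, so the
kernel has to copy them out instead of handing them back in the two
return registers the way the old stub did.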


> 2. Refactor pipe1() kernel-internal function to operate over int[2]
> rather than register_t[2].  Stop returning garbage through retval[2]
> from pipe2(2).

Please, can you be more specific with your characterizations.
"Returning garbage" is vague, and without further details (that you do
know yourself but don't disclose) makes every reader expend time and
mental effort to figure out what are you really talking about.

For the reference, sys_pipe2() overwrites retval[1] with the second
descriptor b/c it passes retval[] to pipe1(), like sys_pipe() does.
But what is the intended effect for pipe() causes the retval[1]
register to be clobbered in the case of pipe2().


-uwe


config vs. modules iconf files

2017-12-15 Thread Valery Ushakov
[torn off of the original thread]

On Fri, Dec 08, 2017 at 04:40:01 +0300, Valery Ushakov wrote:

> Date: Fri, 8 Dec 2017 04:40:01 +0300
> From: Valery Ushakov <u...@stderr.spb.ru>
> Subject: Re: Attaching to an attribute
> To: tech-kern@netbsd.org
> Mail-Followup-To: tech-kern@netbsd.org
> 
> On Fri, Dec 08, 2017 at 04:29:49 +0300, Valery Ushakov wrote:
> 
> > On Thu, Dec 07, 2017 at 23:07:47 +0300, Valery Ushakov wrote:
> > 
> > > However config(1) instead of providing single wildcard parent spec
> > > seems to instantiate parent specs for all parents it's seen that carry
> > > the attribute.
> > 
> > Bah, my emacs has too many buffers.  Apparently I was looking at the
> > kernel config from a different architecture.
> > 
> > Astonishingly, i386 and amd64 GENERIC do _not_ have
> > 
> >   wsmouse* at wsmousedev?
> > 
> > wildcard attachment and instead use separate attachments for each
> > parent.  I'm overcome with nostalgy, but this probably should be
> > fixed, it's not 1990s anymore.
> 
> This, however, still highlights a problem.  How can a module device
> driver attach wsmouse as a child regardless of how the kernel is
> configured.

I have filed http://gnats.netbsd.org/52821 for this so that it's not
lost in the proverbial cracks.

Since most people don't read all of netbsd-bugs@ I'm also duplicating
it here.  Separately, so that the PR is not spammed with every reply
(should there be any :).

8<8<

config(8) supports generating autoconf glue for modules with (still
undocumented!) "ioconf" keyword.  Multiple examples can be found under
sys/modules.  Unfortunately in certain circumstances it generates
ioconf.c structures that are not directly usable.

Consider the ioconf file for VirtualBox Guest Additions driver:

  ioconf vboxguest

  include "conf/files"

  include "dev/i2o/files.i2o" # XXX: pci needs device iop
  include "dev/pci/files.pci"

  device  vboxguest: wsmousedev
  attach  vboxguest at pci

  pseudo-root pci*
  vboxguest0  at pci? dev ? function ?

  wsmouse*at vboxguest?

wsmouse(4) attachment is necessary here because generally speaking we
cannot rely on the kernel that loads the module to have

  wsmouse*at wsmousedev?

and in fact until very recently i386 and amd64 kernels didn't, they
only had attachments to specific parents.

Unfortunately config(8) is overzealous and seeing that wsmouse
attachment causes it to emit

  CFDRIVER_DECL(wsmouse, ...)

and it also includes wsmouse into cfdriver_ioconf_vboxguest[] and
cfattach_ioconf_vboxguest[] arrays that are to be passed to
config_init_component(9).  That obviously causes the modload to fail
as the wsmouse driver is already registered with autoconf.

My guess is that config(8) emits these because it sees the
attachments.  This probably made sense for the in-tree modules, where
the actual "device" command comes from the relevant "files.*" file, so
the only way for config to infer what to emit is to look at the
attachments.  Also all in-tree modules only ever attach single driver,
so they never run into this problem with config (though I think uatp
module should fail to attach wsmouse when loaded).

We need a way to tell config which definitions it should emit.

Just off the top of my head, maybe we can just mark the attachments, e.g:

  module vboxguest* at pci? dev ? function ?

or even

  module vboxguest

where config can see that vboxguest has single possible parent and
infer the wildcard attachment.  While here, it can also infer
necessary pseudo-root so that the user doesn't have to specify it.

-uwe


Re: Attaching to an attribute

2017-12-07 Thread Valery Ushakov
On Fri, Dec 08, 2017 at 04:29:49 +0300, Valery Ushakov wrote:

> On Thu, Dec 07, 2017 at 23:07:47 +0300, Valery Ushakov wrote:
> 
> > However config(1) instead of providing single wildcard parent spec
> > seems to instantiate parent specs for all parents it's seen that carry
> > the attribute.
> 
> Bah, my emacs has too many buffers.  Apparently I was looking at the
> kernel config from a different architecture.
> 
> Astonishingly, i386 and amd64 GENERIC do _not_ have
> 
>   wsmouse* at wsmousedev?
> 
> wildcard attachment and instead use separate attachments for each
> parent.  I'm overcome with nostalgy, but this probably should be
> fixed, it's not 1990s anymore.

This, however, still highlights a problem.  How can a module device
driver attach wsmouse as a child regardless of how the kernel is
configured.

-uwe


Re: Attaching to an attribute

2017-12-07 Thread Valery Ushakov
On Thu, Dec 07, 2017 at 23:07:47 +0300, Valery Ushakov wrote:

> However config(1) instead of providing single wildcard parent spec
> seems to instantiate parent specs for all parents it's seen that carry
> the attribute.

Bah, my emacs has too many buffers.  Apparently I was looking at the
kernel config from a different architecture.

Astonishingly, i386 and amd64 GENERIC do _not_ have

  wsmouse* at wsmousedev?

wildcard attachment and instead use separate attachments for each
parent.  I'm overcome with nostalgia, but this probably should be
fixed, it's not 1990s anymore.

-uwe


Attaching to an attribute

2017-12-07 Thread Valery Ushakov
Devices can be attached to an attribute, e.g. 

wsmouse* at wsmousedev?

where potential parents declare to have that attribute, e.g.

device ums: hid, wsmousedev

and the autoconf code knows how to attach to the attribute only:

static int
cfparent_match(const device_t parent, const struct cfparent *cfp)
{
    /* ... */
    /*
     * If no specific parent device instance was specified (i.e.
     * we're attaching to the attribute only), we're done!
     */
    if (cfp->cfp_parent == NULL)
        return 1;

    /*
     * Check the parent device's name.
     */
    if (STREQ(pcd->cd_name, cfp->cfp_parent) == 0)
        return 0;    /* not the same parent */

    /*
     * Make sure the unit number matches.
     */
    if (cfp->cfp_unit == DVUNIT_ANY ||   /* wildcard */
        cfp->cfp_unit == parent->dv_unit)
        return 1;

    /* Unit numbers don't match. */
    return 0;
}


However config(1) instead of providing single wildcard parent spec
seems to instantiate parent specs for all parents it's seen that carry
the attribute.  Check ioconf.c of your kernel: instead of single

static const struct cfparent pspecXXX = {
"wsmousedev", NULL, DVUNIT_ANY
};

struct cfdata cfdata[] = {
...
   { "wsmouse", "wsmouse", 0, STAR, loc+XXX, 0, &pspecXXX },
...
};

it emits

static const struct cfparent pspec15 = {
"wsmousedev", "spic", DVUNIT_ANY
};
/* ... */

/*238: wsmouse* at spic? mux 0 */
{ "wsmouse","wsmouse", 0, STAR, loc+1423, 0,  },
/*239: wsmouse* at pms? mux 0 */
{ "wsmouse","wsmouse", 0, STAR, loc+1424, 0,  },
/*240: wsmouse* at ums? mux 0 */
{ "wsmouse","wsmouse", 0, STAR, loc+1425, 0,  },
/* ... */

for each device with wsmousedev attribute.

This wastes a bit of memory in the static config, but that's not much
of a problem.

However if you want to attach such device to an attribute on another
device you load as a module, you can't, at least naively, b/c there's
no wildcard pspec for the wsmouse.
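
To make the difference concrete, here is a cut-down model of the
matching.  It is only a sketch: parent_spec, parent_dev, spec_matches
and UNIT_ANY are stand-ins invented for this example, not the autoconf
structures, and a parent is assumed to carry exactly one attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UNIT_ANY	(-1)	/* stand-in for DVUNIT_ANY */

/* Simplified stand-in for struct cfparent. */
struct parent_spec {
	const char *attr;	/* interface attribute, e.g. "wsmousedev" */
	const char *parent;	/* parent driver name, or NULL for "any" */
	int unit;		/* parent unit, or UNIT_ANY */
};

/* Simplified stand-in for a candidate parent device. */
struct parent_dev {
	const char *name;	/* driver name, e.g. "ums" */
	int unit;
	const char *attr;	/* the single attribute it carries */
};

/* Return 1 if a child with spec *ps may attach to *pd, 0 otherwise. */
static int
spec_matches(const struct parent_dev *pd, const struct parent_spec *ps)
{
	assert(pd != NULL && ps != NULL);

	if (strcmp(pd->attr, ps->attr) != 0)
		return 0;		/* parent lacks the attribute */
	if (ps->parent == NULL)
		return 1;		/* attribute-only (wildcard) spec */
	if (strcmp(pd->name, ps->parent) != 0)
		return 0;		/* not the same parent driver */
	return ps->unit == UNIT_ANY || ps->unit == pd->unit;
}
```

With only per-parent specs such as { "wsmousedev", "spic", ... } in the
table, a parent the kernel config has never heard of cannot match any
of them; a single spec with a NULL parent would match it.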

In existing code only uatp(4) module attaches wsmouse(4).  I don't
have one, but my prediction is that it will fail with "device not
configured".  Can someone with the device try and verify that?

You can add 

wsmouse* at wsmousedev?

to the module's ioconf.  Surprisingly, that generates wildcard parent
spec for wsmouse!  But it also adds wsmouse to cfdriver and cfattach
arrays and loading the module will fail with EEXIST.

The workaround seems to be to manually hack the ioconf.c so that the
module has the wildcard pspec line for wsmouse in cfdata only.

Anyone with enough config clue to comment (or better yet, fix :)?

-uwe


Re: amd64: kernel aslr support

2017-10-07 Thread Valery Ushakov
On Sat, Oct 07, 2017 at 20:42:58 +0200, Maxime Villard wrote:

> On 04/10/2017 at 21:00, Maxime Villard wrote:
> > Here is a Kernel ASLR implementation for NetBSD-amd64.
> > [...]
> > Known issues:
> > [...]
> >  * There are several redefinitions in the prekern headers. The way to remove
> >them depends on where we put the prekern in the source tree.
> 
> Does someone have a preference on where to put the prekern? I guess I'll
> put it in src/sys/arch/amd64/prekern/.

I'd say src/sys/arch/amd64/stand/prekern to conform to existing
practice.

-uwe


Re: Patching wscons_keydesc at runtime

2017-08-07 Thread Valery Ushakov
On Fri, Aug 04, 2017 at 02:46:15 +0300, Valery Ushakov wrote:
> Date: Fri, 4 Aug 2017 02:46:15 +0300
> From: Valery Ushakov <u...@stderr.spb.ru>
> Subject: Re: Patching wscons_keydesc at runtime
> To: tech-kern@netbsd.org
> Mail-Followup-To: tech-kern@netbsd.org
> 
> On Fri, Aug 04, 2017 at 01:38:38 +0200, Emmanuel Dreyfus wrote:
> 
> > Emmanuel Dreyfus <m...@netbsd.org> wrote:
> > 
> > > > Unfortunately this breaks hpcsh which initializes console very early
> > > > when malloc is not available, so when you boot with wscons the machine
> > > > wedges.
> > > > 
> > > > I think your change should be reverted for now and a different fix
> > > > developed.
> > > 
> > > Or perhaps it could be just ifdef hpcarm?
> > 
> > What about this change?
> > 
> > Index: sys/dev/hpc/hpckbd.c
> > ===
> > RCS file: /cvsroot/src/sys/dev/hpc/hpckbd.c,v
> > retrieving revision 1.31
> > diff -U4 -r1.31 hpckbd.c
> > --- sys/dev/hpc/hpckbd.c12 Jun 2017 09:23:39 -  1.31
> > +++ sys/dev/hpc/hpckbd.c3 Aug 2017 23:36:47 -
> > @@ -265,15 +265,17 @@
> > const keysym_t *map, int mapsize)
> >  {
> > int i;
> > const struct wscons_keydesc *desc;
> > +#ifdef hpcarm
> > static struct wscons_keydesc *ndesc = NULL;
> >  
> > /* 
> >  * fix keydesc table. Since it is const data, we must 
> > -* copy it once before changingg it.
> > +* copy it once before changingg it. That does not work
> > +* on hpcsh which initialize console before malloc is 
> > +* available.
> >  */
> > -
> > if (ndesc == NULL) {
> > size_t sz;
> >  
> > for (sz = 0; hpckbd_keymapdata.keydesc[sz].name != 0; sz++);
> > @@ -282,14 +284,15 @@
> > memcpy(ndesc, hpckbd_keymapdata.keydesc, sz * 
> > sizeof(*ndesc));
> >  
> > hpckbd_keymapdata.keydesc = ndesc;
> > }
> > +#endif /* hpcarm */
> >  
> > desc = hpckbd_keymapdata.keydesc;
> > for (i = 0; desc[i].name != 0; i++) {
> > if ((desc[i].name & KB_MACHDEP) && desc[i].map == NULL) {
> > -   ndesc[i].map = map;
> > -   ndesc[i].map_size = mapsize;
> > +   desc[i].map = map;
> > +   desc[i].map_size = mapsize;
> > }
> > }
> >  
> > return;
> 
> I think it might be better to just have two copies of the function,
> old and new.  E.g. this patch doesn't restore the unconst hack.
> 
> PS: Also "changingg" has a typo.

Looking closer (it's been a while since I touched low-level sh3
stuff), I think I'll just drop the early consinit() call that hpcsh
does and let main() do it.  That avoids the ugly special case.

-uwe


Re: Patching wscons_keydesc at runtime

2017-08-03 Thread Valery Ushakov
On Fri, Aug 04, 2017 at 01:38:38 +0200, Emmanuel Dreyfus wrote:

> Emmanuel Dreyfus  wrote:
> 
> > > Unfortunately this breaks hpcsh which initializes console very early
> > > when malloc is not available, so when you boot with wscons the machine
> > > wedges.
> > > 
> > > I think your change should be reverted for now and a different fix
> > > developed.
> > 
> > Or perhaps it could be just ifdef hpcarm?
> 
> What about this change?
> 
> Index: sys/dev/hpc/hpckbd.c
> ===
> RCS file: /cvsroot/src/sys/dev/hpc/hpckbd.c,v
> retrieving revision 1.31
> diff -U4 -r1.31 hpckbd.c
> --- sys/dev/hpc/hpckbd.c12 Jun 2017 09:23:39 -  1.31
> +++ sys/dev/hpc/hpckbd.c3 Aug 2017 23:36:47 -
> @@ -265,15 +265,17 @@
> const keysym_t *map, int mapsize)
>  {
> int i;
> const struct wscons_keydesc *desc;
> +#ifdef hpcarm
> static struct wscons_keydesc *ndesc = NULL;
>  
> /* 
>  * fix keydesc table. Since it is const data, we must 
> -* copy it once before changingg it.
> +* copy it once before changingg it. That does not work
> +* on hpcsh which initialize console before malloc is 
> +* available.
>  */
> -
> if (ndesc == NULL) {
> size_t sz;
>  
> for (sz = 0; hpckbd_keymapdata.keydesc[sz].name != 0; sz++);
> @@ -282,14 +284,15 @@
> memcpy(ndesc, hpckbd_keymapdata.keydesc, sz * sizeof(*ndesc));
>  
> hpckbd_keymapdata.keydesc = ndesc;
> }
> +#endif /* hpcarm */
>  
> desc = hpckbd_keymapdata.keydesc;
> for (i = 0; desc[i].name != 0; i++) {
> if ((desc[i].name & KB_MACHDEP) && desc[i].map == NULL) {
> -   ndesc[i].map = map;
> -   ndesc[i].map_size = mapsize;
> +   desc[i].map = map;
> +   desc[i].map_size = mapsize;
> }
> }
>  
> return;

I think it might be better to just have two copies of the function,
old and new.  E.g. this patch doesn't restore the unconst hack.

PS: Also "changingg" has a typo.

-uwe


Re: Patching wscons_keydesc at runtime

2017-08-03 Thread Valery Ushakov
On Fri, Aug 04, 2017 at 01:30:14 +0200, Emmanuel Dreyfus wrote:

> Valery Ushakov <u...@stderr.spb.ru> wrote:
> 
> > Unfortunately this breaks hpcsh which initializes console very early
> > when malloc is not available, so when you boot with wscons the machine
> > wedges.
> > 
> > I think your change should be reverted for now and a different fix
> > developed.
> 
> Or perhaps it could be just ifdef hpcarm?

That will also do for now.  Please, don't forget to request pullups.

Please, can you also file a PR with the details on what is broken in
the original "unconst" version.  Do you need to boot with some
specific selection in hpcboot, what are the commands, what you expect
to happen and what actually happens, etc.

I'll try to look into it, but probably not immediately.  As I said in
an earlier reply, unfortunately layout handling is a mess, b/c the
data structure definitions contradict the intended purpose of machdep
entries, so some rototill might be necessary.

Thanks.

-uwe


Re: Patching wscons_keydesc at runtime

2017-08-03 Thread Valery Ushakov
On Sat, Jun 10, 2017 at 05:18:16 +0200, Emmanuel Dreyfus wrote:

> I managed to restore wscons keymaps by copying
> hpckbd_keymapdata.keydesc into a malloc() buffer and changing the
> hpckbd_keymapdata.keydesc to the new location, which is mapped
> read/write.

Unfortunately this breaks hpcsh which initializes console very early
when malloc is not available, so when you boot with wscons the machine
wedges.

I think your change should be reverted for now and a different fix
developed.

-uwe


Wskbd constness (Was: Patching wscons_keydesc at runtime)

2017-06-12 Thread Valery Ushakov
On Sat, Jun 10, 2017 at 05:18:16 +0200, Emmanuel Dreyfus wrote:

> I just upgraded an HP Jornada 720 from NetBSD 2.0 to NetBSD 7.1, and
> discovered the wscons keymaps were broken in the meantime: it is impossible to
> change the keymap using wsconsctl encoding or wsconsctl map. Both commands
> succeed but have no effect.
> 
> After poking a few printf in the kernel, I found this in
> src/sys/dev/hpc/hpckbd.c:
> 
> /* fix keydesc table */
> /* 
>  * XXX The way this is done is really wrong.  The __UNCONST()
>  * is a hint as to what is wrong.  This actually ends up modifying
>  * initialized data which is marked "const".
>  * The reason we get away with it here is apparently that text
>  * and read-only data gets mapped read/write on the platforms
>  * using this code.
>  */
> desc = (struct wscons_keydesc *)__UNCONST(hpckbd_keymapdata.keydesc);
> for (i = 0; desc[i].name != 0; i++) {
> if ((desc[i].name & KB_MACHDEP) && desc[i].map == NULL) {
> desc[i].map = map;
> desc[i].map_size = mapsize;
> }
> }
> 
> I managed to restore wscons keymaps by copying hpckbd_keymapdata.keydesc into
> a malloc() buffer and changing the hpckbd_keymapdata.keydesc to the new
> location, which is mapped read/write. 
> 
> The offending code did not change since NetBSD 2.0, except the XXX comment
> added in 2015. That suggests the compiler behavior changed about initalized
> const data, which was still mapped R/W in the ancient time and is now really
> read-only, altough it accepts nilpotent writes without raising an exception.

The constness in the MI wskbd code looks wrong:

/* KBD_NULLMAP generates a entry for machine native variant.
   the entry will be modified by machine dependent keyboard driver. */
#define KBD_NULLMAP() ...

const struct wscons_keydesc pckbd_keydesctab[] = {
...
/* placeholders */
KBD_NULLMAP(KB_US | KB_MACHDEP, KB_US),
...
};

Which is obviously self-contradictory.


This is probably induced by the const in:

struct wskbd_mapdata {
const struct wscons_keydesc *keydesc;
kbd_t layout;
};



> +   for (sz = 0; hpckbd_keymapdata.keydesc[sz].name != 0; sz++);

/usr/share/misc/style requires explicit no-op "continue" here.
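
A sketch of the same counting loop with the explicit no-op.  The
struct here is a made-up, cut-down keydesc and keydesc_count is a
hypothetical helper, not something from hpckbd.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down version of a keydesc entry. */
struct keydesc {
	int name;	/* 0 terminates the table */
};

/* Count entries up to, but not including, the terminator. */
static size_t
keydesc_count(const struct keydesc *tab)
{
	size_t sz;

	assert(tab != NULL);
	for (sz = 0; tab[sz].name != 0; sz++)
		continue;	/* explicit no-op instead of a bare ';' */
	return sz;
}
```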


-uwe


Re: Adding ruminit(4)

2017-05-24 Thread Valery Ushakov
On Wed, May 24, 2017 at 17:20:52 +, Christos Zoulas wrote:

> Why not move the all the code into a single "ubulkdisable" or
> something driver?

Finally a thumb to use in-kernel Lua on? :)

-uwe


Re: Cnmagic support for wscons

2017-01-16 Thread Valery Ushakov
On Tue, Jan 17, 2017 at 11:55:52 +1100, Nathanial Sloss wrote:

> On Tue, 17 Jan 2017 07:32:34 Valery Ushakov wrote:
> > On Tue, Jan 17, 2017 at 04:26:48 +1100, Nathanial Sloss wrote:
> > > On Mon, 16 Jan 2017 00:44:02 Valery Ushakov wrote:
> > > > On Sun, Jan 15, 2017 at 13:30:15 +0100, Martin Husemann wrote:
> > > > > On Sun, Jan 15, 2017 at 01:59:06PM +1100, Nathanial Sloss wrote:
> > > > > > Mapping KS_Cmd_Debugger would also work but I'm unsure as to how
> > > > > > to do this without using wskbd key sequences in the magic.
> > > > > 
> > > > > I don't understand - if you just assing KS_Cmd_Debugger somewhere,
> > > > > why would you need cnmagic?
> > > > 
> > > > Exactly.  You need cnmagic(9) for detecting debugger _sequence_
> > > > in-band like in serial console.
> > > > 
> > > > Your patch doesn't provide any documentation or an accompaning
> > > > description, so I'm not sure what exactly it does, e.g. what should
> > > > cnmagic value look like for wskbd?
> > > 
> > > I've put an example in the updated man page for both a wskbd command
> > > and regular text.
> > > 
> > | +sysctl variable must be prefixed by \\001 for normal characters and/or
> > | +control codes.
> > | +Alternatively it can be prefixed by \\002 for wskbd commands.
> > 
> > [...]
> > 
> > | +The default cnmagic is \\002\\040\\0364 which on most platforms is
> > | +--
> > | +the wskbd command for
> > | +.Xr ddb 4 .
> > 
> > - How does \040\0364 (0x20 0xf4) correspond to ?
> > 
> > - Does the fact that KS_Cmd_Debugger is defined as 0xf420 have
> >   anything to do with it?
> 
> Yes it does.
>
> > - If so, why the byte order is little endian?
> 
> Endianness could be an issue I'll test in a sparc emulator.
> 
> > - KS_Cmd_Debugger becomes a meaningless placeholder - you will have it
> >   mapped by default but you can set cnmagic to something else
> > 
> > > > But in general it's not even
> > > > entirely clear to me what a semantic of cnmagic(9) for wskbd could be.
> > > > Should it use individual key-presseses as the basic input stream it
> > > > parses?  If yes, you will lose the ability to use, e.g., *both*
> > > > C-A-Esc and A-C-Esc _chords_ to break into debuger, because with
> > > > wskbd(4) keyboard mapping they are the same _chords_, but with
> > > > cnmagic(9) they are different _sequences_.
> > > 
> > > That's why I have two different prefixes one for wskbd commands and
> > > another for regular text.
> > 
> > - What if one wants to use a mixture of them?
> 
> I can make it possibile to specify keycodes as well. So if a mixture
> was wanted one would have to use that.
>
> > - What happens if you specify in cnmaigc the sequence of individual
> >   keys that map (at wskbd level) to a KS_Cmd_*?
> 
> It would jump into ddb and on return run the KS_Cmd_*.

I would expect the proposal to actually describe and document all of
that and more.  Do we really need to play the game where I have to ask
very specific direct questions that you try to answer as literally as
possible?


> > > > I'd also say that the very fact that the patch doesn't use
> > > > cn_check_magic(9) indicates in some sense that it probably does not
> > > > implement "cnmagic support for wscons". :)
> > > 
> > > Please see:
> > > 
> > > ftp://ftp.netbsd.org/pub/NetBSD/misc/nat/cnmagic.v2.diff
> > > 
> > > It now uses cn_check magic instead of the custom ws_check_magic which was
> > > based on cn_check_magic.
> > 
> > I still intensly dislike the very idea behind this patch.  I don't
> > think it's meaningful or useful to bolt cnmagic onto wskbd.  Yes, I
> > can theoretically imagine the possibility that someone somewhere might
> > need a wskbd sequence to break into the debugger that for some reason
> > cannot be expressed with just mapping KS_Cmd_Debugger.  I'd estimate
> > the probablity of that be to just slighly more than that someone
> > somewhere will find it useful to be able to express morse code support
> > in wskbd mappings :).
> > 
> > You PR states the motivation as:
> > | wscons does not support cn_magic - setting hw.cnmagic has no effect.
> > 
> > So just drop hw.cnmagic sysctl node if the console doesn't support it.
> 
> There are three specific cases I have experienced.
> 
> An Older hp microserver with remote access c

Re: Cnmagic support for wscons

2017-01-16 Thread Valery Ushakov
On Tue, Jan 17, 2017 at 04:26:48 +1100, Nathanial Sloss wrote:

> On Mon, 16 Jan 2017 00:44:02 Valery Ushakov wrote:
> > On Sun, Jan 15, 2017 at 13:30:15 +0100, Martin Husemann wrote:
> > > On Sun, Jan 15, 2017 at 01:59:06PM +1100, Nathanial Sloss wrote:
> > > > Mapping KS_Cmd_Debugger would also work but I'm unsure as to how
> > > > to do this without using wskbd key sequences in the magic.
> > > 
> > > I don't understand - if you just assing KS_Cmd_Debugger somewhere,
> > > why would you need cnmagic?
> > 
> > Exactly.  You need cnmagic(9) for detecting debugger _sequence_
> > in-band like in serial console.
> > 
> > Your patch doesn't provide any documentation or an accompaning
> > description, so I'm not sure what exactly it does, e.g. what should
> > cnmagic value look like for wskbd? 
> 
> I've put an example in the updated man page for both a wskbd command
> and regular text.

| +sysctl variable must be prefixed by \\001 for normal characters and/or
| +control codes.
| +Alternatively it can be prefixed by \\002 for wskbd commands.
[...]
| +The default cnmagic is \\002\\040\\0364 which on most platforms is
| +--
| +the wskbd command for
| +.Xr ddb 4 .

- How does \040\0364 (0x20 0xf4) correspond to ?

- Does the fact that KS_Cmd_Debugger is defined as 0xf420 have
  anything to do with it?

- If so, why is the byte order little endian?

- KS_Cmd_Debugger becomes a meaningless placeholder - you will have it
  mapped by default but you can set cnmagic to something else

> > But in general it's not even
> > entirely clear to me what a semantic of cnmagic(9) for wskbd could be.
> > Should it use individual key-presseses as the basic input stream it
> > parses?  If yes, you will lose the ability to use, e.g., *both*
> > C-A-Esc and A-C-Esc _chords_ to break into debuger, because with
> > wskbd(4) keyboard mapping they are the same _chords_, but with
> > cnmagic(9) they are different _sequences_.
> 
> That's why I have two different prefixes one for wskbd commands and
> another for regular text.

- What if one wants to use a mixture of them?

- What happens if you specify in cnmagic the sequence of individual
  keys that map (at wskbd level) to a KS_Cmd_*?


> > I'd also say that the very fact that the patch doesn't use
> > cn_check_magic(9) indicates in some sense that it probably does not
> > implement "cnmagic support for wscons". :)
> 
> Please see:
> 
> ftp://ftp.netbsd.org/pub/NetBSD/misc/nat/cnmagic.v2.diff
> 
> It now uses cn_check magic instead of the custom ws_check_magic which was 
> based on cn_check_magic.

I still intensly dislike the very idea behind this patch.  I don't
think it's meaningful or useful to bolt cnmagic onto wskbd.  Yes, I
can theoretically imagine the possibility that someone somewhere might
need a wskbd sequence to break into the debugger that for some reason
cannot be expressed with just mapping KS_Cmd_Debugger.  I'd estimate
the probability of that to be just slightly more than that someone
somewhere will find it useful to be able to express morse code support
in wskbd mappings :).

Your PR states the motivation as:

| wscons does not support cn_magic - setting hw.cnmagic has no effect.

So just drop hw.cnmagic sysctl node if the console doesn't support it.

-uwe


Re: Cnmagic support for wscons

2017-01-15 Thread Valery Ushakov
On Sun, Jan 15, 2017 at 13:30:15 +0100, Martin Husemann wrote:

> On Sun, Jan 15, 2017 at 01:59:06PM +1100, Nathanial Sloss wrote:
>
> > Mapping KS_Cmd_Debugger would also work but I'm unsure as to how
> > to do this without using wskbd key sequences in the magic.
> 
> I don't understand - if you just assign KS_Cmd_Debugger somewhere,
> why would you need cnmagic?

Exactly.  You need cnmagic(9) for detecting debugger _sequence_
in-band like in serial console.

Your patch doesn't provide any documentation or an accompanying
description, so I'm not sure what exactly it does, e.g. what should
cnmagic value look like for wskbd?  But in general it's not even
entirely clear to me what a semantic of cnmagic(9) for wskbd could be.
Should it use individual key presses as the basic input stream it
parses?  If yes, you will lose the ability to use, e.g., *both*
C-A-Esc and A-C-Esc _chords_ to break into the debugger, because with
wskbd(4) keyboard mapping they are the same _chords_, but with
cnmagic(9) they are different _sequences_.
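
The chord/sequence distinction can be sketched in a few lines of C.
This is only an illustration under made-up assumptions: MOD_CTRL,
MOD_ALT, chord_mods() and same_sequence() are hypothetical names and
do not correspond to actual wskbd(4) or cnmagic(9) code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical modifier bits, for illustration only. */
#define MOD_CTRL 0x1u
#define MOD_ALT  0x2u

/* A chord is "which modifiers are down when the key is hit";
 * the order in which they went down is not recorded. */
static unsigned
chord_mods(const unsigned *presses, size_t n)
{
	unsigned mods = 0;

	for (size_t i = 0; i < n; i++)
		mods |= presses[i];
	return mods;
}

/* A sequence matcher compares the presses one by one, in order. */
static bool
same_sequence(const unsigned *a, const unsigned *b, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (a[i] != b[i])
			return false;
	return true;
}
```

With the presses { MOD_CTRL, MOD_ALT } and { MOD_ALT, MOD_CTRL },
chord_mods() returns the same mask for both, while same_sequence()
reports them as different.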

I'd also say that the very fact that the patch doesn't use
cn_check_magic(9) indicates in some sense that it probably does not
implement "cnmagic support for wscons". :)

-uwe


Re: Cnmagic support for wscons

2017-01-14 Thread Valery Ushakov
On Sun, Jan 15, 2017 at 09:08:43 +1100, Nathanial Sloss wrote:

> Please see:
> 
> ftp://ftp.netbsd.org/pub/NetBSD/misc/nat/wscons.cnmagic.diff
> 
> Also the original PR:
> 
> http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=48360
> 
> The diff adds cnmagic support for wscons consoles.
> 
> If there are no objections, I'd like to commit this by the
> 20th of January.

The PR doesn't really say why this is necessary.  What's wrong with
just mapping KS_Cmd_Debugger?

-uwe


Re: ptrace(2) interface for hardware watchpoints (breakpoints)

2016-12-15 Thread Valery Ushakov
On Thu, Dec 15, 2016 at 19:51:35 +0100, Kamil Rytarowski wrote:

> On 15.12.2016 16:42, Valery Ushakov wrote:
> > Again, you don't provide any details.  What extra logic?  Also, what
> > are these few dozens of instructions you are talking about?  I.e. what
> > is that extra work you have to do for a process-wide watchpoint that
> > you don't have to do for an lwp-specific watchpoint on each return to
> > userland?
> 
> 1. Complexity is adding extra case in ptrace_watchpoint structure,
> adding there a way to specify per-thread or per-process. If there
> someone wants to set per-thread watchpoints inside the process
> structure.. there would be need to have a list of available watchpoints,
> that would scale to number of watchpoints possible x number of threads list.
> 
> 2. Complexity on returning to userland - need to lock structure process
> in userret(9) and check every watchpoint if it's process-wide or
> dedicated for the thread.

Why would you need all this?  Consider the case when debug registers
are part of the mcontext, then the very act of restoring the context
enables corresponding watchpoints for the lwp.  When the debug
registers are not part of mcontext the only difference is that after
restoring the mcontext you also set debug registers from some other
structure.

E.g. sh3 uses User Break Controller to implement single-stepping, so
effectively a kind of watchpoint that is triggered after instruction,
not matching any address bits, asid, etc, etc.  The register in UBC
that enables the watchpoint is set from a field in trapframe, just
like any other register.

So at ptrace(2) time to set a process-wide watchpoint, you go over all
existing lwps and setup their trapframes accordingly.  For new lwps
created after the watchpoint is set you need to do that at lwp
creation time.  But when lwp returns to userland, there's no overhead.
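
A rough sketch of that scheme, as a toy model.  Every name below
(toy_proc, toy_lwp, tf_ubc, and the functions) is hypothetical and
chosen for illustration; none of it is NetBSD's struct lwp, the sh3
trapframe, or the proposed ptrace(2) interface.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAXLWP 8

/* Stand-in for a trapframe holding a debug/break control register. */
struct toy_trapframe {
	uint32_t tf_ubc;
};

struct toy_lwp {
	struct toy_trapframe tf;
};

struct toy_proc {
	struct toy_lwp lwps[TOY_MAXLWP];
	size_t nlwps;
	uint32_t watch;		/* process-wide setting, 0 if none */
};

/* ptrace(2) time: walk the existing lwps once. */
static void
toy_set_watchpoint(struct toy_proc *p, uint32_t ubc)
{
	p->watch = ubc;
	for (size_t i = 0; i < p->nlwps; i++)
		p->lwps[i].tf.tf_ubc = ubc;
}

/* lwp creation time: inherit whatever is currently set. */
static struct toy_lwp *
toy_lwp_create(struct toy_proc *p)
{
	struct toy_lwp *l;

	if (p->nlwps == TOY_MAXLWP)
		return NULL;
	l = &p->lwps[p->nlwps++];
	l->tf.tf_ubc = p->watch;
	return l;
}

/* Return to userland: the value comes from the lwp's own trapframe,
 * like any other register; no process-wide lookup or locking. */
static uint32_t
toy_userret(const struct toy_lwp *l)
{
	return l->tf.tf_ubc;
}
```

In this model the cost is paid once when the watchpoint is set and
once per lwp creation; toy_userret() touches only per-lwp data.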


> I implemented it originally per process and I finally decided to throw
> the per-process vs per-thread logic away, out of the kernel and expose
> watchpoints (or technically bitmasks of available debug registers) to
> userland.
> 
> It's easier to check perlwp local structure and end up with up to 4
> fields there, than lock a list and iterate over N elements. Every thread
> has also dedicated bit in its property indicating whether it has
> attached watchpoints.
> 
> From user-land point of view, and management it's equivalent. With the
> difference that debugger needs to catch thread creation and apply
> desired watchpoint to it.
> 
> Why bitmasks and not raw registers? On some level there is need to check
> if the composed combination is valid in the kernel - dividing
> user-settable bits from registers to bitmask is needed on some level
> anyway, and while it's possible to be done in kernel, why not to export
> it to userland?
> 
> I've found it easier to be reused in 3rd party software.

-uwe


Re: ptrace(2) interface for hardware watchpoints (breakpoints)

2016-12-15 Thread Valery Ushakov
On Tue, Dec 13, 2016 at 18:16:04 +0100, Kamil Rytarowski wrote:

> >> 4. Do not set watchpoints globally per process, limit them to
> >> threads (LWP). [...]  Adding process-wide management in the
> >> ptrace(2) interface calls adds extra complexity that should be
> >> pushed away to user-land code in debuggers.
> > 
> > I have no idea what amd64 debug registers do, but this smells like you
> > are exposing in the MI interface some of those details.  I don't think
> > this can be done in hardware on sh3, e.g.  

Ok, I was confused there for a moment.  The "debug state" is per-lwp
and is restored when lwp is switched to.  What was I thinking...


> > Also, you quite often have no idea which thread stomps on your data,
> > so I'd imagine most of the time you do want a global watchpoint.
> 
> This is true.
> 
> With the proposed interface per-thread a debugger can set the same
> hardware watchpoint for each LWP and achieve the same result. There are
> no performance or synchronization challenges as watchpoints can be set
> only when a process is stopped.
> 
> In my older code I had logic per-process to access watchpoints, but
> it required extra logic in thread-specific functions to access
> process specific data. I assumed that saving few dozens of CPU
> cycles before each thread entering user-space is precious. (I know
> it's a small optimization, however it's for free)

Again, you don't provide any details.  What extra logic?  Also, what
are these few dozens of instructions you are talking about?  I.e. what
is that extra work you have to do for a process-wide watchpoint that
you don't have to do for an lwp-specific watchpoint on each return to
userland?


> >> 5. Do not allow to mix PT_STEP and hardware watchpoint, in case of
> >> single-stepping the code, disable (it means: don't set) hardware
> >> watchpoints for threads. Some platforms might implement single-step with
> >> hardware watchpoints and managing both at the same time is generating
> >> extra pointless complexity.
> > 
> > I don't think I see how "extra pointless complexity" follows.
> 
> 1. At least in MD x86 specific code, watchpoint traps triggered with
> stepped code are reported differently to those reported with plain steps
> and also differently to plain hardware watchpoint traps. They are 3rd
> type of a trap.
>
> 2. Single stepping can be implemented with hardware assisted watchpoints
> (technically breakpoints) on the kernel side in MD. And if so, trying to
> apply watchpoints and singlestep will conflict and this will need
> additional handling on the kernel side.
> 
> To oppose extra complexity I propose to make stepping and watchpoints
> separable, one or the other, but not both.

And again you allude to MD details and don't provide any.  You cannot
just handwave this away.  You will have to provide enough information
for people to implement this for other arches eventually, including
MD specifics that affected the design, so that people can see how
their MD specific details affect their implementation.  Why not
provide this upfront?  I understand you might be eager to commit this
work and be done with it, but you are doing this fulltime.  Others
don't have this luxury.  So I don't want to come around to
implementing your design in a few months' time when I have some spare
cycles and discover that it's ill suited for the hardware I have to
deal with.

Maybe you are right, and it's hard to mix single-stepping and
watchpoints, but I don't have time to investigate this fully right now
for sh3 and you don't provide any details that will back your
conclusion for x86.  Has it occurred to you that you might be missing
some approach to solving this, but people that grok x86 can't tell you
unless they know the details?  And I don't think that committing
first, as you seem to have done already, and then let people figure it
out from RTFS is an acceptable approach, b/c, again, without
description you force people to RTFS and they might not have the time.


> > Also, you might want both, single-stepping and waiting for a
> > watchpoint.  Will debugger have to switch dynamically to software
> > watchpoints when single-stepping?  Can it even do that already?
> 
> My understanding of stepping the code is that we want to go one and only
> one instruction ahead (unless port restricts it and its 1 or more),
> followed with a break.
> 
> What's the use case of waiting for data access and stepping in the same
> time? Is it needed? Does it solve some issues that cannot be solved
> otherwise? Could it be implemented in software (in case of watch)?

Isn't it your job to tell us the answers?  So, let's say I set a
watchpoint and then I hit some other breakpoint and do some stepi.  If
one of those instructions I'm stepping will do the read/write I'm
watching for, how will it be detected by the debugger if you can't mix
hw-assisted watchpoints and single-stepping?


> My original intention was to make it friendly for ports, without too
> specific 

Re: ptrace(2) interface for hardware watchpoints (breakpoints)

2016-12-12 Thread Valery Ushakov
On Tue, Dec 13, 2016 at 02:04:36 +0100, Kamil Rytarowski wrote:

> The design is as follows:
> 
> 1. Accessors through:
>  - PT_WRITE_WATCHPOINT - write new watchpoint's state (set, unset, ...),
>  - PT_READ_WATCHPOINT - read watchpoints's state,
>  - PT_COUNT_WATCHPOINT - receive the number of available watchpoints.

Gdb supports hardware assisted watchpoints.  That implies that other
OSes have existing designs for them.  Have you studied those existing
designs?  Why do you think they are not suitable to be copied?


> 4. Do not set watchpoints globally per process, limit them to
> threads (LWP). [...]  Adding process-wide management in the
> ptrace(2) interface calls adds extra complexity that should be
> pushed away to user-land code in debuggers.


I have no idea what amd64 debug registers do, but this smells like you
are exposing in the MI interface some of those details.  I don't think
this can be done in hardware on sh3, e.g.  

Also, you quite often have no idea which thread stomps on your data,
so I'd imagine most of the time you do want a global watchpoint.
Note, that if you want to restrict your watchpoint to one thread, you
can probably (I don't know and I haven't checked) do this with gdb
"command" that "continue"s if it's on the wrong thread.


> 5. Do not allow to mix PT_STEP and hardware watchpoint, in case of
> single-stepping the code, disable (it means: don't set) hardware
> watchpoints for threads. Some platforms might implement single-step with
> hardware watchpoints and managing both at the same time is generating
> extra pointless complexity.

I don't think I see how "extra pointless complexity" follows.

Also, you might want both, single-stepping and waiting for a
watchpoint.  Will debugger have to switch dynamically to software
watchpoints when single-stepping?  Can it even do that already?


In general I'd appreciate if handwavy "this is pointless/extra
complexity" arguments were spelled out.  They might be obvious to you,
but most people reading this don't have relevant information swapped
in, or don't know enough details.

-uwe


Re: CVS commit: src/sys/dev/pci

2016-11-29 Thread Valery Ushakov
On Tue, Nov 29, 2016 at 21:54:11 +, Valeriy E. Ushakov wrote:

> Module Name:  src
> Committed By: uwe
> Date: Tue Nov 29 21:54:11 UTC 2016
> 
> Modified Files:
>   src/sys/dev/pci: if_vioif.c
> 
> Log Message:
> vioif_start() - do not call virtio_enqueue_abort() after error from
> virtio_enqueue_reserve(), as it's already done by the latter, so we
> ended up with a kind of "double free" that messed up our free list of
> vq_entry's.
> 
> This is even documented in a "typical usage" comment in virtio.c (and
> those quotes are not intended to be sarcastic).
> 
> PR 51132 - virtio net device stuck for UDP burst transmission
> 
> 
> To generate a diff of this commit:
> cvs rdiff -u -r1.26 -r1.27 src/sys/dev/pci/if_vioif.c

This seems to be a common problem, as both ld@virtio and viornd
drivers make the same mistake too.

I'd appreciate if people can fix and test (with simulated failure if
necessary).

I wonder if http://gnats.netbsd.org/50604 might be caused by this as
the first time you run out of vq_entry's, you will end up with a
messed up free list.
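
The shape of the mistake can be shown with a toy model.  This is a
sketch under loud assumptions: toy_prep(), toy_reserve() and
toy_abort() are hypothetical stand-ins, not the real virtio(4)
functions or their signatures, and unlike the real free list this toy
detects the second release instead of getting corrupted by it.

```c
#include <stdbool.h>

#define TOY_NSLOTS 4

/* Toy queue: one "free" flag per slot instead of a linked list. */
struct toy_vq {
	bool is_free[TOY_NSLOTS];
};

static void
toy_vq_init(struct toy_vq *vq)
{
	for (int i = 0; i < TOY_NSLOTS; i++)
		vq->is_free[i] = true;
}

/* Take a free slot; returns its index or -1 if none is left. */
static int
toy_prep(struct toy_vq *vq)
{
	for (int i = 0; i < TOY_NSLOTS; i++) {
		if (vq->is_free[i]) {
			vq->is_free[i] = false;
			return i;
		}
	}
	return -1;
}

/* Give a slot back.  Returns false if it was already free,
 * i.e. the caller attempted a double free. */
static bool
toy_abort(struct toy_vq *vq, int slot)
{
	if (vq->is_free[slot])
		return false;
	vq->is_free[slot] = true;
	return true;
}

/* Reserve resources for a slot.  On failure the slot is released
 * here, so the caller must NOT call toy_abort() again. */
static bool
toy_reserve(struct toy_vq *vq, int slot, bool simulate_failure)
{
	if (simulate_failure) {
		(void)toy_abort(vq, slot);
		return false;
	}
	return true;
}
```

After a failed toy_reserve() the slot is already back in the pool; a
further toy_abort() on it returns false, which is the toy equivalent
of the "double free" described in the log message.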

-uwe


Re: SOSEND_LOAN problems in MIPS

2016-06-19 Thread Valery Ushakov
On Sun, Jun 19, 2016 at 16:25:20 +0100, Robert Swindells wrote:

> co...@sdf.org wrote:
> >in emulating pmax with gxemul I had trouble using:
> >  cat somefile | command
> >
> >when somefile is bigger than 4096 bytes.
[...]
> It would probably also help to know where the file that you read
> using cat was stored, was it read over NFS?

More likely options PIPE_SOCKETPAIR, I guess.

-uwe


Re: Scripting DDB in Forth?

2016-05-08 Thread Valery Ushakov
On Mon, May 02, 2016 at 04:59:32 +0300, Valery Ushakov wrote:

> I'd say that someone familiar with the target ISA can port == write
> the asm core in an evening or two.

Just a quick follow up note.  I wanted to verify that claim, so I
ported it to powerpc, which also helped to make the MI part really (or
at least more) MI.  I had zero ppc knowledge before starting that
exercise.  It took me three evenings, four if you count reading up on
the ISA.

https://bitbucket.org/nbuwe/forth

-uwe

