re: zsh crash in recent -current

2019-03-13 Thread matthew green
> (while none I have ever seen actually do, the
> malloc implementation is free to return the memory to the kernel, and
> remove it from the process's address space).

FWIW, both old and new jemalloc are capable of doing this.  :-)


.mrg.


daily CVS update output

2019-03-13 Thread NetBSD source update


Updating src tree:
P src/crypto/external/bsd/openssl/dist/crypto/armcap.c
P src/crypto/external/bsd/openssl/lib/libcrypto/arch/powerpc/ppccpuid.S
P src/crypto/external/bsd/openssl/lib/libcrypto/arch/powerpc64/ppccpuid.S
P src/lib/libcurses/get_wch.c
P src/lib/libcurses/getch.c
P src/sys/arch/arm/rockchip/rk3399_cru.c
P src/sys/arch/arm/rockchip/rk_emmcphy.c
P src/sys/arch/mvme68k/dev/pcctwo_68k.c
P src/sys/arch/x86/include/specialreg.h
U src/sys/dev/fdt/arasan_sdhc_fdt.c
P src/sys/dev/fdt/files.fdt
P src/sys/dev/pci/ixgbe/ixgbe.c
P src/sys/dev/pci/ixgbe/ixv.c
P src/sys/dev/sbus/zx.c
P src/sys/dev/sdmmc/sdhc.c
P src/sys/dev/sdmmc/sdhcvar.h
P src/sys/kern/subr_pool.c
P src/tests/lib/libc/sys/t_mlock.c

Updating xsrc tree:


Killing core files:




Updating file list:
-rw-rw-r--  1 srcmastr  netbsd  41037841 Mar 14 03:03 ls-lRA.gz


Re: ATF t_mlock() babylon5 kernel panics

2019-03-13 Thread Robert Elz
Date:Wed, 13 Mar 2019 11:44:51 -0700
From:Jason Thorpe 
Message-ID:  <9a2a4a34-35b0-490e-9a92-aab44174f...@me.com>

  | Ok, well, I see some problematic code in sys_mlock() and sys_munlock(),
  | but I don't think it's affecting this case (and it may in fact have
  | the effect of making the test pass if a non-page-aligned buffer is passed):

Yes, I was expecting that might happen - and in some ways that's a convenient
thing, in (just one) recent run, the b5 tests actually passed (without
any apparent changes anywhere that could have affected anything.)   Tests
that pass don't save any logs to be looked at (they're not supposed to
be interesting!) so there's no way we will ever know for sure, but I was
kind of hoping that this might have been the explanation.   Usually we're
getting page aligned memory, but perhaps not, that one time.

  | I would suggest instrumenting-with-printf the "new_pageable"
  | case of uvm_map_pageable()

I will look at that later today (unless someone else solves the problem
first) - more unrelated tasks for the next few hours...

kre



Re: zsh crash in recent -current

2019-03-13 Thread Robert Elz
Date:Wed, 13 Mar 2019 10:37:41 -0700
From:Brian Buhrow 
Message-ID:  <201903131737.x2dhbfd8001...@lothlorien.nfbcal.org>

  | Given this code fragment and the discussion you raise
  | about it, allow me to ask what perhaps is a naive question.  If the sample
  | you quote is incorrect, what is the correct way to accomplish the same
  | task?
 
The basic principle is that once memory is freed it no longer belongs
to the application (while none I have ever seen actually do, the
malloc implementation is free to return the memory to the kernel, and
remove it from the process's address space).

So, everything one wants from the memory block needs to be done before
the free() happens.

In the case of multi-threaded applications, including the kernel,
the right answer depends upon just how locking gets done, what
gets locked when, and for how long (need to make sure that the
list isn't being changed by some other thread, or deal with the
possibility that it is, if we cannot reasonably prevent it - in
the worst case, exit the loop and restart from the beginning each
time something happens that requires releasing a lock on the
list itself).

But for a simple single threaded userland application, something
like:

    for (list_ptr = list_head; list_ptr != NULL; list_ptr = next) {
        next = list_ptr->nxt;
        /* do stuff on list */
        if (element_should_be_deleted) {
            /* with testing for NULLs added but not shown here */
            list_ptr->prev->nxt = list_ptr->nxt;
            list_ptr->nxt->prev = list_ptr->prev;
            free(list_ptr);
        }
    }

There are plenty of other (valid) ways to write it of course, but all of
them require that after that "free()" we never touch *list_ptr (aka list_ptr->)
until list_ptr has been changed to refer to something that has not been free'd.

kre



Re: UFS2 feature: inode space used for data

2019-03-13 Thread Michael van Elst
rhia...@falu.nl (Rhialto) writes:

>in the set is allocated and initialized. The set of blocks that may
>be allocated to inodes is held as part of the free-space reserve
>until all other space in the filesystem is allocated. Only then can
>it be used for file data.

>But the bit at the end: if you don't need so many inodes, but run out of
>data space, then unused inode blocks can be re-purposed as data blocks.

It says that blocks that might be used for inodes later aren't allocated
for data unless there is no other free block. It doesn't talk about
re-purposing inode blocks once they are used as such.

-- 
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


UFS2 feature: inode space used for data

2019-03-13 Thread Rhialto
I was reading about the history of the Unix File System. In
https://www.usenix.org/legacy/events/bsdcon03/tech/full_papers/mckusick/mckusick_html/
"Enhancements to the Fast Filesystem To Support Multi-Terabyte Storage
Systems"
there is this in section 

4.1. Dynamic Inodes

One of the common complaints about the UFS1 filesystem is that it
preallocates all its inodes at the time that the filesystem is
created.  For filesystems with millions of files, the initialization
of the filesystem can take several hours. Additionally, the
filesystem creation program, newfs, had to assume that every
filesystem would be filled with many small files and allocate a lot
more inodes than were likely to ever be used. If a UFS1 filesystem
uses up all its inodes, the only way to get more is to dump,
rebuild, and restore the filesystem. The UFS2 filesystem resolves
these problems by dynamically allocating its inodes.  [...]

To avoid these costs, UFS2 preallocates a range of inode numbers and
a set of blocks for each cylinder group. Initially each cylinder
group has a single block of inodes allocated (a typical block holds
32 or 64 inodes). When the block fills up, the next block of inodes
in the set is allocated and initialized. The set of blocks that may
be allocated to inodes is held as part of the free-space reserve
until all other space in the filesystem is allocated. Only then can
it be used for file data. 

I have indeed seen that we have code to delay initializing the inodes.

But the bit at the end: if you don't need so many inodes, but run out of
data space, then unused inode blocks can be re-purposed as data blocks.
It doesn't look like we have that? At least I can't find any comment to
that effect. Code would likely involve looking at and/or modifying
cgp->cg_initediblk, but that field is used very rarely and not for this
purpose. (On the other hand, I can't find such code in FreeBSD either,
and I would expect it to be there, if anywhere.)

Is my impression correct?

-Olaf.
-- 
___ Olaf 'Rhialto' Seibert  -- "What good is a Ring of Power
\X/ rhialto/at/falu.nl  -- if you're unable...to Speak." - Agent Elrond




Re: zsh crash in recent -current

2019-03-13 Thread Brett Lymn
On Wed, Mar 13, 2019 at 11:32:05AM +, Chavdar Ivanov wrote:
> Setting INSTALL_UNSTRIPPED=yes didn't work for me for zsh and ncurses,
> I still had to find all the '-s' flags in the install command
> invocations. Perhaps I am doing something wrong. Still, I've gor now
> zsh with the debug options and all modules plus ncurses unstripped.
> Now to build the OS debug sets.
> 

Oh, ncurses.  I was going to mention that some time ago I did link our
native curses against libefence which caught a few memory issues.

-- 
Brett Lymn
--
Sent from my NetBSD device.

"We are were wolves",
"You mean werewolves?",
"No we were wolves, now we are something else entirely",
"Oh"


Re: ATF t_mlock() babylon5 kernel panics

2019-03-13 Thread Jason Thorpe


> On Mar 13, 2019, at 10:27 AM, Robert Elz  wrote:
> 
> Some progress.
> 
> #1 touching the buffer that malloc() returns (which is page aligned)
> made no difference - it is likely that the malloc library would have
> done that in any case (the malloc is for just one page, so either it
> is resident (or paged) or it is a ZFoD page, and most likely not the
> latter - certainly not after this small mod.)
> 
> #2 changing the mlock_clip to avoid the sequence
> 
>   mlock(buf, 0);
>   munlock(buf, 0);
> 
> avoided all of the issues and problems, and the test passed.
> 
> What locking 0 bytes might achieve

Finding bugs :-)

> I have no idea, but that is
> what the test does:
> 
>buf = malloc(page);
> /*...*/
>for (size_t i = page; i >= 1; i = i - 1024) { 
>err1 = mlock(buf, page - i);
>if (err1 != 0)
>fprintf(stderr, "mlock_clip: page=%ld i=%zu,"
>" mlock(%p, %ld): %s\n", page, i, buf, page - i,
>strerror(errno));
>err2 = munlock(buf, page - i);
> /*...*/
> 
> That first munlock() is where the panic (with the extra KASSERTs added)
> occurs.

Ok, well, I see some problematic code in sys_mlock() and sys_munlock(), but I 
don't think it's affecting this case (and it may in fact have the effect of 
making the test pass if a non-page-aligned buffer is passed):

pageoff = (addr & PAGE_MASK);
addr -= pageoff;
size += pageoff;
size = (vsize_t)round_page(size);

If you pass in (0x1004, 0), for example, you end up with:

pageoff = 0x4;
addr = 0x1000;
size = 0x4 ROUND UP-> 0x1000;

It's not clear that's the actual intent ... seems like if you pass 0, you 
should get 0, regardless of alignment... but let's fix that problem LATER.

Since the logic for trunc / round is the same in both mlock and munlock, let's 
turn our attention to any logical differences in uvm_map_pageable().


It first looks up the entry that covers the range, and assigns the same value 
to "entry" and "start_entry".

The code paths for the "pageable" and "not pageable" are split, so we'll focus 
on "pageable" since that's the one that's panic'ing.

Assuming the region that's passed to munlock() is page-aligned (I'll use the 
example 0x1000 with size 0), we should see uvm_map_pageable() called with:

start = 0x1000;
end = 0x1000;

The first thing it does is call UVM_MAP_CLIP_START(), which ensures that 
"entry"'s start is the start address, splitting the entry as necessary 
(inserting the new split-off entry ahead of "entry").

Next it does a pass over all the entries in the range to ensure that every 
entry is actually wired and that there are no unmapped regions in the range.

The check it uses is:

- If the current entry is not wired
  OR
- If the current entry ends before the end of the range
  AND
    - This is the last entry
      OR
    - The next entry starts after the end of the current entry (hole)

...then return EINVAL

The logic there seems correct, and you're not seeing EINVAL.

Now, AFTER that first loop, it does this:

entry = start_entry;

...because it wants to pass over the range again.  This should be safe, because 
UVM_MAP_CLIP_START() is *supposed* to preserve the value of "entry".

The *second* loop uses the same termination logic as the first, and the 
is-entry-wired test is equivalent (look at the macro's expansion).

It's obvious from the panic that the bottom layer of this cake is freaking out 
because it's being asked to unwire something that's not actually wired, but I 
don't see how this can happen for a page-aligned zero-length region.

I would suggest instrumenting-with-printf the "new_pageable" case of 
uvm_map_pageable()

> The err1 = and err2 = are recent additions so that printf (and a
> similar one after the munlock) can indicate if anything goes wrong.
> 
> The mlock(buf, 0) (first time around the loop when page == i)
> call did not fail (err1 == 0).   Or at least it seemed to be OK
> (there was no output, but there might have been some coming when
> the munlock() caused the panic).  I will try again (much much
> later today) with a sleep between mlock() and munlock() so any
> output has time to drain.
> 
> Anyway, this might give someone enough of a clue as to what is happening
> that that someone (who understands UVM and the pmap code) can fix things.
> 
> There's no indication in the mlock() man page, or in the posix spec,
> that len == 0 is anything but a valid call (one would expect it to be
> a no-op though, rather than scrambling the kernel data structs).
> 
> With that loop modified so i starts at page - 1024 everything is fine,
> 
> kre
> 

-- thorpej



Re: zsh crash in recent -current

2019-03-13 Thread Brian Buhrow
hello Robert.  Given this code fragment and the discussion you raise
about it, allow me to ask what perhaps is a naive question.  If the sample
you quote is incorrect, what is the correct way to accomplish the same
task?
-thanks
-Brian

On Mar 13,  6:27pm, Robert Elz wrote:
} Subject: Re: zsh crash in recent -current
} Date:Wed, 13 Mar 2019 10:06:42 +
} From:Chavdar Ivanov 
} Message-ID:  

} 
}   | I saw the one with the trashed history as well.
}   |
}   | I don't think it is zsh's problem, though. As I mentioned above, I've
}   | used v5.7 since it came out without any problems until perhaps 3-4
}   | days ago.
} 
} I would guess that maybe there is code like this
} 
}   for (list_ptr = list_head; list_ptr != NULL; list_ptr = list_ptr->nxt) 
} {
}   /* do stuff on list */
}   if (element_should_be_deleted) {
}   /* with testing for NULLs added but not shown here */
}   list_ptr->prev->nxt = list_ptr->nxt;
}   list_ptr->nxt->prev = list_ptr->prev;
}   free(list_ptr);
}   }
}   }
} 
} which will "work" perfectly with most versions of malloc, as
} that free does not change anything in the memory that has been
} freed, but will collapse in a giant heap if free() scribbles
} over the memory as part of deleting things, which some of the
} dumps that various people have shown on this (and similar) issues
} looks to be what is happening (the scribbling - it is deliberate
} to expose bugs like this one).
} 
} Code like the above is easy to write, and most of the time works fine
} (and would have worked with the previous malloc) but will die
} big time when the arena is scrambled (not just zeroed, usually).
} 
} Someone should look for something like this in the areas of zsh
} that are crashing, and other programs.
} 
} This is far more likely than the new malloc being broken, and just
} only happening to hit a few programs, and is more likely than some
} random memory corruption that simply has never been noticed until
} now.
} 
} kre
} 
} 
>-- End of excerpt from Robert Elz




Re: ATF t_mlock() babylon5 kernel panics

2019-03-13 Thread Robert Elz
Some progress.

#1 touching the buffer that malloc() returns (which is page aligned)
made no difference - it is likely that the malloc library would have
done that in any case (the malloc is for just one page, so either it
is resident (or paged) or it is a ZFoD page, and most likely not the
latter - certainly not after this small mod.)

#2 changing the mlock_clip to avoid the sequence

mlock(buf, 0);
munlock(buf, 0);

avoided all of the issues and problems, and the test passed.

What locking 0 bytes might achieve I have no idea, but that is
what the test does:

    buf = malloc(page);
    /*...*/
    for (size_t i = page; i >= 1; i = i - 1024) {
        err1 = mlock(buf, page - i);
        if (err1 != 0)
            fprintf(stderr, "mlock_clip: page=%ld i=%zu,"
                " mlock(%p, %ld): %s\n", page, i, buf, page - i,
                strerror(errno));
        err2 = munlock(buf, page - i);
    /*...*/

That first munlock() is where the panic (with the extra KASSERTs added)
occurs.

The err1 = and err2 = are recent additions so that printf (and a
similar one after the munlock) can indicate if anything goes wrong.

The mlock(buf, 0) (first time around the loop when page == i)
call did not fail (err1 == 0).   Or at least it seemed to be OK
(there was no output, but there might have been some coming when
the munlock() caused the panic).  I will try again (much much
later today) with a sleep between mlock() and munlock() so any
output has time to drain.

Anyway, this might give someone enough of a clue as to what is happening
that that someone (who understands UVM and the pmap code) can fix things.

There's no indication in the mlock() man page, or in the posix spec,
that len == 0 is anything but a valid call (one would expect it to be
a no-op though, rather than scrambling the kernel data structs).

With that loop modified so i starts at page - 1024 everything is fine,

kre



Re: xdm receives no input

2019-03-13 Thread Chavdar Ivanov
Awesome, thanks! That's a thing which pops up from time to time, I've
had it myself.

On Wed, 13 Mar 2019 at 15:04, Patrick Welche  wrote:
>
> On Tue, Mar 12, 2019 at 05:17:10PM +, Chavdar Ivanov wrote:
> > /etc/ttys ?
>
> Absolutely right. This laptop's xdm worked before the update, so I doubted
> your suggestion.
>
> What happened is that the update replaced my /etc/X11/xdm/Xservers.ws
> in which I had carefully set vt08 as the terminal to use. The update
> changed it to vt05, which was inconsistent with my /etc/ttys.
>
> Editing /etc/X11/xdm/Xservers back to vt08 fixed things.
>
> mystery solved!
>
>
> Cheers,
>
> Patrick



-- 



Re: xdm receives no input

2019-03-13 Thread Patrick Welche
On Tue, Mar 12, 2019 at 05:17:10PM +, Chavdar Ivanov wrote:
> /etc/ttys ?

Absolutely right. This laptop's xdm worked before the update, so I doubted
your suggestion.

What happened is that the update replaced my /etc/X11/xdm/Xservers.ws
in which I had carefully set vt08 as the terminal to use. The update
changed it to vt05, which was inconsistent with my /etc/ttys.

Editing /etc/X11/xdm/Xservers back to vt08 fixed things.

mystery solved!


Cheers,

Patrick


Re: new jemalloc vs VirtualBox

2019-03-13 Thread Christos Zoulas
In article <20190313104136.gb58...@imac7.ub.uni-mainz.de>,
K. Schreiner  wrote:
>Hi,
>
>I've had "build.sh ... distribution" failing on an amd64-VM running in
>VirtualBox (newest test version) like so:
>
>...
>: /u/NetBSD/src/external/bsd/jemalloc/lib/../dist/src/arena.c:647:
>Failed assertion: "nstime_compa\ re(>epoch, ) <= 0"
>...
>[1]   Abort trap (core dumped) /u/NetBSD/arch/amd64/TOOLS/bin/nbmake 
>_THISDIR...
>--- obj-ln ---
>*** [obj-ln] Error code 134
>nbmake[3]: stopped in /u/NetBSD/src/bin
>1 error
>...
>
>
>Reason seems to be time "going backwards" what new jemalloc doesn't like.
>
>After adding
>  VBoxManage setextradata "VM name" "VBoxInternal/TM/TSCTiedToExecution" 1
>to the amd64-VM definition, the problem seems to be gone.
>
>
>Have others seen this perhaps?

Yes, I have and it seems to be indeed time going backwards. Perhaps I should
disable this test, and leave debugging on until we cleanup all the issues it
finds.

christos



Re: zsh crash in recent -current

2019-03-13 Thread Chavdar Ivanov
s/gor/got/.

On the other hand there were some jemalloc changes overnight, so I
will update the system first and then try to get zsh crashes.

On Wed, 13 Mar 2019 at 11:32, Chavdar Ivanov  wrote:
>
> Setting INSTALL_UNSTRIPPED=yes didn't work for me for zsh and ncurses,
> I still had to find all the '-s' flags in the install command
> invocations. Perhaps I am doing something wrong. Still, I've gor now
> zsh with the debug options and all modules plus ncurses unstripped.
> Now to build the OS debug sets.
>
> On Wed, 13 Mar 2019 at 10:23, Chavdar Ivanov  wrote:
> >
> > Thanks. One has to read the manuals from time to time...
> >
> > On Wed, 13 Mar 2019 at 10:21, Patrick Welche  wrote:
> > >
> > > On Wed, Mar 13, 2019 at 10:06:42AM +, Chavdar Ivanov wrote:
> > > > I saw the one with the trashed history as well.
> > > >
> > > > I don't think it is zsh's problem, though. As I mentioned above, I've
> > > > used v5.7 since it came out without any problems until perhaps 3-4
> > > > days ago.
> > > >
> > > > I tried to build zsh with debug (adding  --enable-zsh-debug
> > > > --enable-zsh-mem --enable-zsh-mem-debug --enable-zsh-mem-warning
> > > > --enable-zsh-secure-free --enable-zsh-heap-debug
> > > > --enable-zsh-hash-debug to the makefile), but I still get a stripped
> > > > executable, no doubt I miss some pkgsrc environment variable. If I try
> > > > to build it within the work folder with gmake, it refuses to build the
> > > > curses.so module, which is one of the failing ones.
> > >
> > > INSTALL_UNSTRIPPED=yes
> > >
> > > ?
> > >
> > > Cheers,
> > >
> > > Patrick
> >
> >
> >
> > --
> > 
>
>
>
> --
> 



-- 



Re: zsh crash in recent -current

2019-03-13 Thread Chavdar Ivanov
Setting INSTALL_UNSTRIPPED=yes didn't work for me for zsh and ncurses,
I still had to find all the '-s' flags in the install command
invocations. Perhaps I am doing something wrong. Still, I've gor now
zsh with the debug options and all modules plus ncurses unstripped.
Now to build the OS debug sets.

On Wed, 13 Mar 2019 at 10:23, Chavdar Ivanov  wrote:
>
> Thanks. One has to read the manuals from time to time...
>
> On Wed, 13 Mar 2019 at 10:21, Patrick Welche  wrote:
> >
> > On Wed, Mar 13, 2019 at 10:06:42AM +, Chavdar Ivanov wrote:
> > > I saw the one with the trashed history as well.
> > >
> > > I don't think it is zsh's problem, though. As I mentioned above, I've
> > > used v5.7 since it came out without any problems until perhaps 3-4
> > > days ago.
> > >
> > > I tried to build zsh with debug (adding  --enable-zsh-debug
> > > --enable-zsh-mem --enable-zsh-mem-debug --enable-zsh-mem-warning
> > > --enable-zsh-secure-free --enable-zsh-heap-debug
> > > --enable-zsh-hash-debug to the makefile), but I still get a stripped
> > > executable, no doubt I miss some pkgsrc environment variable. If I try
> > > to build it within the work folder with gmake, it refuses to build the
> > > curses.so module, which is one of the failing ones.
> >
> > INSTALL_UNSTRIPPED=yes
> >
> > ?
> >
> > Cheers,
> >
> > Patrick
>
>
>
> --
> 



-- 



Re: zsh crash in recent -current

2019-03-13 Thread Robert Elz
Date:Wed, 13 Mar 2019 10:06:42 +
From:Chavdar Ivanov 
Message-ID:  


  | I saw the one with the trashed history as well.
  |
  | I don't think it is zsh's problem, though. As I mentioned above, I've
  | used v5.7 since it came out without any problems until perhaps 3-4
  | days ago.

I would guess that maybe there is code like this

    for (list_ptr = list_head; list_ptr != NULL; list_ptr = list_ptr->nxt) {
        /* do stuff on list */
        if (element_should_be_deleted) {
            /* with testing for NULLs added but not shown here */
            list_ptr->prev->nxt = list_ptr->nxt;
            list_ptr->nxt->prev = list_ptr->prev;
            free(list_ptr);
        }
    }

which will "work" perfectly with most versions of malloc, as
that free does not change anything in the memory that has been
freed, but will collapse in a giant heap if free() scribbles
over the memory as part of deleting things, which some of the
dumps that various people have shown on this (and similar) issues
looks to be what is happening (the scribbling - it is deliberate
to expose bugs like this one).

Code like the above is easy to write, and most of the time works fine
(and would have worked with the previous malloc) but will die
big time when the arena is scrambled (not just zeroed, usually).

Someone should look for something like this in the areas of zsh
that are crashing, and other programs.

This is far more likely than the new malloc being broken, and just
only happening to hit a few programs, and is more likely than some
random memory corruption that simply has never been noticed until
now.

kre




Re: xdm receives no input

2019-03-13 Thread Patrick Welche
On Wed, Mar 13, 2019 at 11:54:11AM +0100, Martin Husemann wrote:
> On Wed, Mar 13, 2019 at 10:44:46AM +, Patrick Welche wrote:
> > I stared at /etc/X11/xdm/Xresources, and
> > https://gitlab.freedesktop.org/xorg/app/xdm/blob/master/config/Xresources.in
> > and didn't spot additional fields (just our fonts are smaller and we
> > fiddle with the NetBSD logo) - what should I be looking out for?
> 
> postinstall(8) checks for inpColor:

Hmmm...

$ grep inpColor /etc/X11/xdm/Xresources 
xlogin*inpColor: grey80

and I didn't edit that file... (so postinstall already fixed it for me?)

$ ls -l /etc/X11/xdm/Xresources 
-r--r--r--  1 root  wheel  3438 Mar 12 12:40 /etc/X11/xdm/Xresources


I have to say, it seems to smell more of Xkeyboard...


Cheers,

Patrick




Re: xdm receives no input

2019-03-13 Thread Martin Husemann
On Wed, Mar 13, 2019 at 10:44:46AM +, Patrick Welche wrote:
> I stared at /etc/X11/xdm/Xresources, and
> https://gitlab.freedesktop.org/xorg/app/xdm/blob/master/config/Xresources.in
> and didn't spot additional fields (just our fonts are smaller and we
> fiddle with the NetBSD logo) - what should I be looking out for?

postinstall(8) checks for inpColor:

Martin


Re: xdm receives no input

2019-03-13 Thread Patrick Welche
On Wed, Mar 13, 2019 at 08:10:23AM +1100, matthew green wrote:
> Patrick Welche writes:
> > Had a go with the shiny new X (thanks!) on the sandy bridge laptop
> > which no longer likes SNA but works with UX, and xdm seems to sit
> > at the prompt waiting for something:
> > ...

> is it a black input bar that doesn't appear to do anything?
> (it probably does work -- did you try typing blind?)

If you mean a text cursor input type thing inside the Login: text
field, yes? Blind typing doesn't do anything...

> the fix is to update /usr/X11/xdm/Xresources file -- it has
> new required entries.  i'm going to work on making this less
> awful when it isn't updated, but i've been busy working on
> the Mesa18 update :-)

I stared at /etc/X11/xdm/Xresources, and
https://gitlab.freedesktop.org/xorg/app/xdm/blob/master/config/Xresources.in
and didn't spot additional fields (just our fonts are smaller and we
fiddle with the NetBSD logo) - what should I be looking out for?

Cheers,

Patrick


new jemalloc vs VirtualBox

2019-03-13 Thread K. Schreiner
Hi,

I've had "build.sh ... distribution" failing on an amd64-VM running in
VirtualBox (newest test version) like so:

...
: /u/NetBSD/src/external/bsd/jemalloc/lib/../dist/src/arena.c:647:
Failed assertion: "nstime_compa\ re(>epoch, ) <= 0"
...
[1]   Abort trap (core dumped) /u/NetBSD/arch/amd64/TOOLS/bin/nbmake _THISDIR...
--- obj-ln ---
*** [obj-ln] Error code 134
nbmake[3]: stopped in /u/NetBSD/src/bin
1 error
...


The reason seems to be time "going backwards", which the new jemalloc doesn't like.

After adding
  VBoxManage setextradata "VM name" "VBoxInternal/TM/TSCTiedToExecution" 1
to the amd64-VM definition, the problem seems to be gone.


Have others seen this perhaps?


Kurt



Re: zsh crash in recent -current

2019-03-13 Thread Chavdar Ivanov
Thanks. One has to read the manuals from time to time...

On Wed, 13 Mar 2019 at 10:21, Patrick Welche  wrote:
>
> On Wed, Mar 13, 2019 at 10:06:42AM +, Chavdar Ivanov wrote:
> > I saw the one with the trashed history as well.
> >
> > I don't think it is zsh's problem, though. As I mentioned above, I've
> > used v5.7 since it came out without any problems until perhaps 3-4
> > days ago.
> >
> > I tried to build zsh with debug (adding  --enable-zsh-debug
> > --enable-zsh-mem --enable-zsh-mem-debug --enable-zsh-mem-warning
> > --enable-zsh-secure-free --enable-zsh-heap-debug
> > --enable-zsh-hash-debug to the makefile), but I still get a stripped
> > executable, no doubt I miss some pkgsrc environment variable. If I try
> > to build it within the work folder with gmake, it refuses to build the
> > curses.so module, which is one of the failing ones.
>
> INSTALL_UNSTRIPPED=yes
>
> ?
>
> Cheers,
>
> Patrick



-- 



Re: zsh crash in recent -current

2019-03-13 Thread Patrick Welche
On Wed, Mar 13, 2019 at 10:06:42AM +, Chavdar Ivanov wrote:
> I saw the one with the trashed history as well.
> 
> I don't think it is zsh's problem, though. As I mentioned above, I've
> used v5.7 since it came out without any problems until perhaps 3-4
> days ago.
> 
> I tried to build zsh with debug (adding  --enable-zsh-debug
> --enable-zsh-mem --enable-zsh-mem-debug --enable-zsh-mem-warning
> --enable-zsh-secure-free --enable-zsh-heap-debug
> --enable-zsh-hash-debug to the makefile), but I still get a stripped
> executable, no doubt I miss some pkgsrc environment variable. If I try
> to build it within the work folder with gmake, it refuses to build the
> curses.so module, which is one of the failing ones.

INSTALL_UNSTRIPPED=yes

?

Cheers,

Patrick


Re: zsh crash in recent -current

2019-03-13 Thread Chavdar Ivanov
OK, I understand. I should carry on using it to see if it will break
again and perhaps get something useful.

On Wed, 13 Mar 2019 at 10:09, Martin Husemann  wrote:
>
> On Wed, Mar 13, 2019 at 10:06:42AM +, Chavdar Ivanov wrote:
> > I saw the one with the trashed history as well.
> >
> > I don't think it is zsh's problem, though. As I mentioned above, I've
> > used v5.7 since it came out without any problems until perhaps 3-4
> > days ago.
>
> It still is very likely a zsh bug, only old jemalloc (and new one w/o
> debugging enabled) is more forgiving.
>
> Martin



-- 



Re: zsh crash in recent -current

2019-03-13 Thread Martin Husemann
On Wed, Mar 13, 2019 at 10:06:42AM +, Chavdar Ivanov wrote:
> I saw the one with the trashed history as well.
> 
> I don't think it is zsh's problem, though. As I mentioned above, I've
> used v5.7 since it came out without any problems until perhaps 3-4
> days ago.

It still is very likely a zsh bug, only old jemalloc (and new one w/o
debugging enabled) is more forgiving.

Martin


Re: zsh crash in recent -current

2019-03-13 Thread Chavdar Ivanov
I saw the one with the trashed history as well.

I don't think it is zsh's problem, though. As I mentioned above, I've
used v5.7 since it came out without any problems until perhaps 3-4
days ago.

I tried to build zsh with debug (adding  --enable-zsh-debug
--enable-zsh-mem --enable-zsh-mem-debug --enable-zsh-mem-warning
--enable-zsh-secure-free --enable-zsh-heap-debug
--enable-zsh-hash-debug to the makefile), but I still get a stripped
executable, no doubt I miss some pkgsrc environment variable. If I try
to build it within the work folder with gmake, it refuses to build the
curses.so module, which is one of the failing ones.

On Tue, 12 Mar 2019 at 21:58, Thomas Klausner  wrote:
>
> On Tue, Mar 12, 2019 at 03:33:26PM +, Chavdar Ivanov wrote:
> > On amd64 -current from yesterday (and a couple of days earlier) I
> > started to get zsh crashes when tab-completing  (files, directories,
> > packages), similar to
>
> I see lots of crashes with zsh too.
>
> Some happen in completion, sometimes when I press enter on a command
> line, sometimes the history gets trashed (lots of weird characters
> turn up when I press 'up') and the shell dies soon after.
>
> I think there are at least two different bugs in zsh here.
>  Thomas



-- 



Re: ATF t_mlock() babylon5 kernel panics

2019-03-13 Thread Robert Elz
OK, with DIAGNOSTIC enabled, and with this patch made:

--- uvm_page.c  19 May 2018 15:03:26 -  1.198
+++ uvm_page.c  13 Mar 2019 08:51:11 -
@@ -1605,9 +1605,11 @@
 uvm_pageunwire(struct vm_page *pg)
 {
        KASSERT(mutex_owned(_pageqlock));
+       KASSERT(pg->wire_count != 0);
        pg->wire_count--;
        if (pg->wire_count == 0) {
                uvm_pageactivate(pg);
+               KASSERT(uvmexp.wired != 0);
                uvmexp.wired--;
        }
 }

I now get a *very* quick panic...

t_mlock: pagesize 4096
tp-start: 1552467032.204169, t_mlock, 5
tc-start: 1552467032.204178, mlock_clip
[  47.4101095] pmap_unwire: wiring for pmap 0xaf80023f6b40 va 
0x7f7ff7ed6000 did not change!
[  47.4101095] panic: kernel diagnostic assertion "pg->wire_count != 0" failed: 
file "/readonly/release/testing/src/sys/uvm/uvm_page.c", line 1608 
[  47.4101095] cpu0: Begin traceback...
[  47.4101095] vpanic() at netbsd:vpanic+0x143
[  47.4101095] kern_assert() at netbsd:kern_assert+0x48
[  47.4101095] uvm_pageunwire() at netbsd:uvm_pageunwire+0x81
[  47.4101095] uvm_fault_unwire_locked() at netbsd:uvm_fault_unwire_locked+0x138
[  47.4101095] uvm_map_pageable() at netbsd:uvm_map_pageable+0x346
[  47.4101095] sys_munlock() at netbsd:sys_munlock+0x63
[  47.4101095] syscall() at netbsd:syscall+0x9c
[  47.4101095] --- syscall (number 204) ---
[  47.4101095] 7f7ff7042ffa:
[  47.4101095] cpu0: End traceback...
[  47.4101095] fatal breakpoint trap in supervisor mode
[  47.4101095] trap type 1 code 0 rip 0x802059b5 cs 0xe030 rflags 0x202 
cr2 0x7f7ff7ed6000 ilevel 0 rsp 0xaf804d94cd10
[  47.4101095] curlwp 0xaf80024f4b00 pid 364.1 lowest kstack 
0xaf804d9482c0
Stopped in pid 364.1 (t_mlock) at   netbsd:breakpoint+0x5:  leave
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x143
kern_assert() at netbsd:kern_assert+0x48
uvm_pageunwire() at netbsd:uvm_pageunwire+0x81
uvm_fault_unwire_locked() at netbsd:uvm_fault_unwire_locked+0x138
uvm_map_pageable() at netbsd:uvm_map_pageable+0x346
sys_munlock() at netbsd:sys_munlock+0x63
syscall() at netbsd:syscall+0x9c
--- syscall (number 204) ---
7f7ff7042ffa:
ds  cd20
es  ccd0
fs  cd10
gs  10
rdi 0
rsi af804d94cabc
rbp af804d94cd10
rbx 104
rdx 1
rcx 0
rax 0
r8  805b2780    cpu_info_primary
r9  3e93e7
r10 0
r11 0
r12 80502e70    ostype+0xab0
r13 af804d94cd58
r14 805202f0    ostype+0x1df30
r15 af80023f5730
rip 802059b5    breakpoint+0x5
cs  e030
rflags  202
rsp af804d94cd10
ss  e02b
netbsd:breakpoint+0x5:  leave

And the system is left sitting in ddb (which I also enabled for
this) so if anyone can think of anything I should be looking at
to help diagnose this, now would be the time to tell me!

The mlock_clip() test is the first one run.   It can be seen in
.../src/tests/lib/libc/sys/t_mlock.c

But I have non-computer work to do for a couple of hours, so this
will be the last you hear from me about this for a while.

kre



Re: ATF t_mlock() babylon5 kernel panics

2019-03-13 Thread Manuel Bouyer
On Wed, Mar 13, 2019 at 03:22:27PM +0700, Robert Elz wrote:
>   [...]
> 
> netbsd# df /tmp
> Filesystem    1K-blocks       Used      Avail %Cap Mounted on
> tmpfs                 4          4          0 100% /tmp
> 
> That's what it showed (it was still in my xterm scrollback buffer from
> the window I use as the DomU console).  So, no, df isn't showing space
> allocated, just nothing available.   Something had "stolen" all the available
> ram, and wasn't letting go.
> 
> It did not look as if there was no free memory however:
> 
> netbsd# vmstat m
>  procsmemory  page   disk faults  cpu
>  r b  avmfre  flt  re  pi   po   fr   sr x0   in   sy  cs us sy id
>  0 034868 945956  264   0   0000 15  428  426 119  0  0 100
> 
> but something as simple as attempting to read a man page (to check
> what other options I could try):
> 
> netbsd# man vmstat
> man: Formatting manual page...
> mandoc: stdout: No space left on device

I've seen this too. df showed my tmpfs as full (although there
were only a few MB in the 'used' column), and top was showing a few MB of
free RAM but lots of RAM allocated to files (I have 8GB RAM). UVM could have
freed some memory from the file cache for tmpfs ...

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--


Re: ATF t_mlock() babylon5 kernel panics

2019-03-13 Thread Robert Elz
Date:Tue, 12 Mar 2019 23:21:59 -0700
From:Jason Thorpe 
Message-ID:  

  | THAT is particularly special, because the code in question is:
  |
  | 
  | void
  | uvm_pagewire(struct vm_page *pg)
  | {
  | 	KASSERT(mutex_owned(&uvm_pageqlock));
  | #if defined(READAHEAD_STATS)
  | 	if ((pg->pqflags & PQ_READAHEAD) != 0) {
  | 		uvm_ra_hit.ev_count++;
  | 		pg->pqflags &= ~PQ_READAHEAD;
  | 	}
  | #endif /* defined(READAHEAD_STATS) */
  | 	if (pg->wire_count == 0) {
  | 		uvm_pagedequeue(pg);
  | 		uvmexp.wired++;
  | 	}
  | 	pg->wire_count++;
  | 	KASSERT(pg->wire_count > 0);	/* detect wraparound */
  | }
  | 

Actually, that probably also explains why my kernel is not crashing.
I don't have DIAGNOSTIC enabled, so that KASSERT() does not happen
in my kernel.   Realised that when I was about to add the KASSERT
you suggested...   I will change that while continuing to test this.
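
To illustrate why a kernel built without DIAGNOSTIC sails past this,
here is a minimal userland sketch.  SKETCH_DIAGNOSTIC, SKETCH_ASSERT,
struct sketch_page and the two functions are hypothetical names made up
for the example, and the plain unsigned counter is an assumption; this
is not the UVM code.

```c
#include <assert.h>

/*
 * Diagnostic-only assertion: it expands to nothing unless the
 * (hypothetical) SKETCH_DIAGNOSTIC option is defined at build time.
 */
#ifdef SKETCH_DIAGNOSTIC
#define SKETCH_ASSERT(e)	assert(e)
#else
#define SKETCH_ASSERT(e)	((void)0)
#endif

/* Stand-in for a page with a wiring reference count. */
struct sketch_page {
	unsigned int wire_count;
};

static void
sketch_wire(struct sketch_page *pg)
{
	pg->wire_count++;
	SKETCH_ASSERT(pg->wire_count > 0);	/* detect wraparound */
}

static void
sketch_unwire(struct sketch_page *pg)
{
	/* Without the option this check vanishes and the counter wraps. */
	SKETCH_ASSERT(pg->wire_count != 0);
	pg->wire_count--;
}
```

Built without -DSKETCH_DIAGNOSTIC, an unbalanced sketch_unwire() quietly
wraps the counter to its maximum value; built with it, the same call
aborts at the assertion.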

kre



Re: ATF t_mlock() babylon5 kernel panics

2019-03-13 Thread Robert Elz
Date:Tue, 12 Mar 2019 23:21:59 -0700
From:Jason Thorpe 
Message-ID:  

Thanks for the reply.   I have dropped tech-kern and tech-userlevel
from this reply though.

  | The test employs a bogus understanding of how malloc() is specified.

Yes, that is kind of obvious, but perhaps understandable given the man
page.

  | I've also seen the term "fundamental object" used.

I think that might be pushing understanding a bit far.   That could
refer to a quark, or qason or one of those truly fundamental objects,
or perhaps a proton/neutron/electron, or even atom or molecule...

I am not going to change the man page, as that should be done by someone
who knows what they're actually talking about - what NetBSD libc malloc()
is actually willing to promise, but perhaps something like

suitable for any C primitive or constructed data type

?

  | One has to remember that malloc() is specified by the C standard,
  | and C has no notion of "pages" or any other such silliness that
  | we Unix people assume are fundamental :-)

Yes, of course, but as long as we meet its requirements, there's no
reason that we can't provide "more" alignment than is required.  It
would not perhaps be completely unreasonable for a malloc of any power
of two size (perhaps up to the page size, or perhaps even more,
depending upon what works best for the architecture) to align the
result to that same size (or the minimum required alignment, whichever
is greater).
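
A rough userland sketch of that policy, layered over the standard
calls rather than built into an allocator.  alloc_size_aligned() and
is_pow2() are hypothetical names for illustration only, and nothing
here describes what jemalloc or NetBSD's malloc() actually does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Nonzero if n is a power of two. */
static int
is_pow2(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * For power-of-two sizes, align the block to its own size; for any
 * other size fall back to plain malloc().  Release with free().
 */
static void *
alloc_size_aligned(size_t size)
{
	void *p = NULL;

	/* posix_memalign() wants a power-of-two multiple of sizeof(void *). */
	if (is_pow2(size) && size >= sizeof(void *)) {
		if (posix_memalign(&p, size, size) != 0)
			return NULL;
		assert(((uintptr_t)p & (size - 1)) == 0);
		return p;
	}
	return malloc(size);
}
```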

An application should not assume that, but it might make some operations
magically work more efficiently if things just happened to be aligned
that way - eg: a read for a (one or more) full page into a page aligned
buffer could simply do page flipping (sharing with kernel on a CoW basis,
or if the kernel doesn't need to keep the page, simply a page swap) rather
than a byte by byte copy.   No-one need even know that such a thing was
happening...


  | POSIX specifically states that mlock() //may// require that the
  | address is page-aligned ... Our implementation does not require this:

Yes, I know - the test is specifically testing that the implementation
does not require it (run on a different system that test might fail).

  | And there are no files there?

No, there were no files, and unless init has taken to holding open
large unlinked files in /tmp (a /tmp that is not yet mounted when
init does most of its work) then there were no remaining processes
to be keeping any unlinked files.

  | Even an open-unlinked file should disappear when the offending process exits.

yes.   I don't think it was that, I think it is some anon mmap()'d object
that is no longer referenced by anyone but which is still hanging around.

  | Well, note that tmpfs also uses anonymous memory.

Yes, I'm aware.

  | Is it that "df" on the tmpfs is really showing a bunch of space
  | allocated to the tmpfs?

netbsd# df /tmp
Filesystem    1K-blocks       Used      Avail %Cap Mounted on
tmpfs                 4          4          0 100% /tmp

That's what it showed (it was still in my xterm scrollback buffer from
the window I use as the DomU console).  So, no, df isn't showing space
allocated, just nothing available.   Something had "stolen" all the available
ram, and wasn't letting go.

It did not look as if there was no free memory however:

netbsd# vmstat m
 procsmemory  page   disk faults  cpu
 r b  avmfre  flt  re  pi   po   fr   sr x0   in   sy  cs us sy id
 0 034868 945956  264   0   0000 15  428  426 119  0  0 100

but something as simple as attempting to read a man page (to check
what other options I could try):

netbsd# man vmstat
man: Formatting manual page...
mandoc: stdout: No space left on device

(attempting to write a temp file on /tmp I presume).

netbsd# ps ax
  PID TTY STATTIME COMMAND
0 ?   OKl  0:27.38 [system]
1 ?   Ss   0:00.02 init 
 1966 xencons O+   0:00.00 ps -ax 
 7241 xencons Ss   0:00.01 login 
23376 xencons S0:00.00 -sh 
netbsd# fstat
[...]

The fstat showed nothing interesting; in particular, init had
no open files at all (not surprising).   The ps was done by
the time the fstat started, of course; fstat, login and sh had
files open (also no surprise), except why is login there at
all (it is the parent of the sh)?   When did we start keeping
login around rather than simply having it exec sh, and why?

In any case I made that login and sh go away, by logging out
and in again, and confirming new pids...


A "normal" df of /tmp shows something more like ...

netbsd# df /tmp
Filesystem    1K-blocks       Used      Avail %Cap Mounted on
tmpfs            524664          4     524660   0% /tmp

  | We use basic 4K page size.

Yes, I saw that when I added diagnostics into the test (I printed
that value).  I also saw when I ran it on my laptop, where I do
most basic development work initially, that whatever malloc it is
using (userland from -current from just over a year 

Re: ATF t_mlock() babylon5 kernel panics

2019-03-13 Thread Jason Thorpe


> On Mar 12, 2019, at 9:09 PM, Robert Elz  wrote:
> 
> The first issue I noticed, is that t_mlock() apparently believes
> the malloc(3) man page, which states:
> 
> The malloc() function allocates size bytes of uninitialized memory.  The
> allocated space is suitably aligned (after possible pointer coercion) for
> storage of any type of object.
> 
> and in particular, those last few words.   The "any type of object" that
> t_mlock wants to store is a "page" - that is a hardware page.

The test employs a bogus understanding of how malloc() is specified.  On x86, 
malloc() should return memory that is 16-byte aligned because that is the 
maximum alignment requirement of the fundamental types used by the compiler.

>   It obtains
> the size of that using:
> 
>   page = sysconf(_SC_PAGESIZE);
> 
> and then does
> 
>buf = malloc(page);
> 
> and if buf is not NULL (which it does check) assumes that it now
> has a correctly page aligned page sized block of memory, in which
> it can run mlock() related tests.
> 
> Something tells me that the "any type of object" does not include this
> one, and that t_mlock should be using posix_memalign() instead to allocate
> its page, so it can specify that it needs a page aligned page.

Correct.  Or mmap() (which always returns page-aligned pointers).
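
For illustration, a minimal sketch of the posix_memalign() route.
alloc_page_aligned() and is_page_aligned() are hypothetical helpers
written for this example; they are not what t_mlock.c contains.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Nonzero if p sits on a pagesz boundary (pagesz is a power of two). */
static int
is_page_aligned(const void *p, size_t pagesz)
{
	return ((uintptr_t)p & (pagesz - 1)) == 0;
}

/*
 * Allocate one page-sized, page-aligned buffer and report the page
 * size through *pagesz.  Returns NULL on failure; release with free().
 */
static void *
alloc_page_aligned(size_t *pagesz)
{
	long page = sysconf(_SC_PAGESIZE);
	void *buf = NULL;

	if (page <= 0)
		return NULL;
	/* posix_memalign() returns an error number; it does not set errno. */
	if (posix_memalign(&buf, (size_t)page, (size_t)page) != 0)
		return NULL;
	assert(is_page_aligned(buf, (size_t)page));
	*pagesz = (size_t)page;
	return buf;
}
```

A test would then pass the returned buffer and page size to mlock() and
munlock() instead of relying on whatever alignment malloc(page) happens
to give.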

> Again, I am not proposing fixing the test until the kernel issues
> are corrected, but it would be good for someone who knows what alignment
> malloc() really promises to return (across all NetBSD architectures)
> to rewrite the man page to say something more specific than "any type of
> object" !

I've also seen the term "fundamental object" used.  One has to remember that 
malloc() is specified by the C standard, and C has no notion of "pages" or any 
other such silliness that we Unix people assume are fundamental :-)

> NetBSD's mlock() rounds down, so regardless of the alignment of the
> space allocated, the mlock() tests should be working (the locking might
> not be exactly what the test is expecting, but all it is doing is keeping
> pages locked in memory - which pages exactly this test does not really
> care).

POSIX specifically states that mlock() //may// require that the address is 
page-aligned ... Our implementation does not require this:


/*
 * align the address to a page boundary and adjust the size accordingly
 */

pageoff = (addr & PAGE_MASK);
addr -= pageoff;
size += pageoff;
size = (vsize_t)round_page(size);


That is to say, the intent of our implementation is to wire the page where the 
range begins through the page where the range ends.  Note that internally, UVM 
represents all ranges as [start, start+size) (assuming start and size are page 
aligned / rounded).
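
As a worked example of that arithmetic, here is a userland sketch of
the same adjustment.  struct page_range and page_span() are
hypothetical names for this example, and the page size is passed in
rather than taken from the kernel's PAGE_MASK and round_page().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A page-aligned range, half-open: [start, start + len). */
struct page_range {
	uintptr_t start;
	size_t len;
};

/*
 * Move addr down to its page boundary, grow size by the same amount,
 * then round size up to a whole number of pages.
 */
static struct page_range
page_span(uintptr_t addr, size_t size, size_t pagesz)
{
	struct page_range r;
	uintptr_t pageoff;

	/* pagesz must be a power of two for the masks below to work. */
	assert(pagesz != 0 && (pagesz & (pagesz - 1)) == 0);

	pageoff = addr & (pagesz - 1);			/* addr & PAGE_MASK */
	r.start = addr - pageoff;
	size += pageoff;
	r.len = (size + pagesz - 1) & ~(pagesz - 1);	/* round_page(size) */
	return r;
}
```

With 4K pages, a 4096-byte buffer at 0x1010 becomes the range
[0x1000, 0x3000), that is two wired pages, while the same buffer at
0x2000 stays a single page.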


> On my test setup, the kernel did not panic.   It does however experience
> some other weirdness, some of which is also apparent in the babylon5
> tests, and others which might be.
> 
> My test system is an amd64 XEN DomU - configured with no swap space, and
> just 1GB RAM.   It typically has a tmpfs mounted limited to 1/2 GB
> (actually slightly more than that - not sure where I got the number from,
> there may have been a typo... the -s param is -s=537255936 in fstab).
> That oddity should be irrelevant.
> 
> The first thing I noticed was that when I run the t_mlock test in this
> environment, it ends up failing when /tmp has run out of space.   And I
> mean really run out of space, in that it is gone forever, and nothing I
> have thought of so far to try gets any of that space back again.

And there are no files there?  Even an open-unlinked file should disappear when 
the offending process exits.

> I assume that jemalloc() (aka malloc() in the test) is doing some kind
> of mmap() that is backed by space on /tmp and continually grabbing more
> until it eventually runs out, and that the kernel never releases that
> space (even after the program that mapped it has exited).   That seems
> sub-optimal, and needs fixing in the kernel, anonymous mmap's (or whatever
> kind jemalloc() is doing) need to be released when there are no more
> processes that can possibly use them.

Well, note that tmpfs also uses anonymous memory.  Is it that "df" on the tmpfs 
is really showing a bunch of space allocated to the tmpfs?

> I did not try umount -f (easier to just reboot...) but a regular umount
> failed (EBUSY) even though there was nothing visibly using anything on
> /tmp (and I killed every possible program, leaving only init - and yes,
> that did include the console shell I use to test things).
> 
> Umounting the tmpfs before running the t_mlock test worked fine (which also
> illustrates that none of the very few daemon processes, nor the shell, etc,
> from my login, are just happening to be using /tmp - and that it is the
> results of the malloc() calls from t_mlock that must be the culprit).
> (While ATF is running, it would be using /tmp as both its working
> directory,